Accepted Manuscript Whole brain white matter connectivity analysis using machine learning: An application to autism Fan Zhang, Peter Savadjiev, Weidong Cai, Yang Song, Yogesh Rathi, Birkan Tunç, Drew Parker, Tina Kapur, Robert T. Schultz, Nikos Makris, Ragini Verma, Lauren J. O'Donnell PII:
S1053-8119(17)30856-X
DOI:
10.1016/j.neuroimage.2017.10.029
Reference:
YNIMG 14407
To appear in:
NeuroImage
Received Date: 19 May 2017 Revised Date:
26 September 2017
Accepted Date: 14 October 2017
Please cite this article as: Zhang, F., Savadjiev, P., Cai, W., Song, Y., Rathi, Y., Tunç, B., Parker, D., Kapur, T., Schultz, R.T., Makris, N., Verma, R., O'Donnell, L.J., Whole brain white matter connectivity analysis using machine learning: An application to autism, NeuroImage (2017), doi: 10.1016/ j.neuroimage.2017.10.029. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
RI PT
Whole brain white matter connectivity analysis using machine learning: an application to autism Fan Zhanga , Peter Savadjieva , Weidong Caib , Yang Songb , Yogesh Rathia , Birkan Tun¸cc , Drew Parkerc , Tina Kapura , Robert T. Schultzc,d , Nikos Makrisa , Ragini Vermac , Lauren J. O’Donnella a Harvard
Medical School, Boston MA, USA of Sydney, Sydney NSW, Australia c University of Pennsylvania, Philadelphia PA, USA d Department of Radiology, Children’s Hospital of Philadelphia, Philadelphia PA, USA
M AN U
SC
b University
Abstract
In this paper, we propose an automated white matter connectivity analysis method for machine learning classification and characterization of white matter abnormality via identification of discriminative fiber tracts. The proposed method uses diffusion MRI tractography and a data-driven approach to find fiber clusters corresponding to subdivisions of the white matter anatomy. Fea-
D
tures extracted from each fiber cluster describe its diffusion properties and are
TE
used for machine learning. The method is demonstrated by application to a pediatric neuroimaging dataset from 149 individuals, including 70 children with autism spectrum disorder (ASD) and 79 typically developing controls (TDC).
EP
A classification accuracy of 78.33% is achieved in this cross-validation study. We investigate the discriminative diffusion features based on a two-tensor fiber tracking model. We observe that the mean fractional anisotropy from the second
AC C
tensor (associated with crossing fibers) is most affected in ASD. We also find that local along-tract (central cores and endpoint regions) differences between ASD and TDC are helpful in differentiating the two groups. These altered diffusion properties in ASD are associated with multiple robustly discriminative fiber clusters, which belong to several major white matter tracts including the corpus callosum, arcuate fasciculus, uncinate fasciculus and aslant tract; and the white matter structures related to the cerebellum, brain stem, and ventral
Preprint submitted to NeuroImage
October 19, 2017
ACCEPTED MANUSCRIPT
diencephalon. These discriminative fiber clusters, a small part of the whole brain tractography, represent the white matter connections that could be most
RI PT
affected in ASD. Our results indicate the potential of a machine learning pipeline based on white matter fiber clustering.
Keywords: Autism Spectrum Disorder, White Matter Connectivity, Fiber
SC
Clustering, Machine Learning
1. Introduction
The advent of diffusion magnetic resonance imaging (dMRI), which can infer
M AN U
the underlying connection structure of brain white matter in vivo via a process called tractography (Basser et al., 2000), has allowed analysis of white matter structures in a non-invasive way. Models of brain white matter connectivity have been shown to be valuable for understanding neurological function, development and disease (Assaf and Pasternak, 2008; Horsfield and Jones, 2002; Ciccarelli et al., 2008). Traditional statistical analyses of tractography usually investigate
D
a predetermined hypothesis that certain fiber tracts of interest are affected in one group (Catani et al., 2008; Pugliese et al., 2009; Hong et al., 2011). However,
TE
there is increased interest in developing statistical and machine learning methods that can assess potential combined effects from multiple fiber tracts (Gong et al., 2009; Robinson et al., 2010; Wang et al., 2016; Ratnarajah et al., 2013; O’Donnell
EP
et al., 2013; Meskaldji et al., 2013). In this study, we investigate computational methods to analyze whole brain
AC C
connectivity patterns at multiple coarse-to-fine scales of white matter parcellation. We propose a machine learning methodology to learn underlying whole brain white matter connectivity patterns using dMRI and to provide insights into white matter structures that are discriminative between groups.
The
proposed computational method provides a novel whole white matter global tract assessment and end-to-end classification framework to study alterations in brain white matter connectivity. The method includes a study-specific datadriven tractography parcellation and a high-dimensional multivariate machine-
2
ACCEPTED MANUSCRIPT
learning-based group classification, which allows identification of tracts that may be most affected by pathology in the whole brain. Results are interpreted
RI PT
and visualized in terms of tracts with known cortical terminations.
We demonstrate our method with an application of classification between
autism spectrum disorder (ASD) and typically developing control (TDC). While the current diagnosis of ASD is based on behavioral assessment (Lord et al.,
SC
1994, 2000; Association, 2000), there has been high interest in investigating
dMRI-based white matter analysis for classification of ASD (Lange et al., 2010; Ingalhalikar et al., 2011; Mostapha et al., 2015). The literature has suggested
M AN U
broad white matter impairments in ASD compared to TDC, and altered white matter connectivity has been hypothesized to associate with symptoms of core and comorbid conditions across the autism spectrum, as described in recent review papers (Travers et al., 2012; Hoppenbrouwers et al., 2014; Ameis and Catani, 2015). Therefore, we hypothesized that a whole brain tractography analysis could predict ASD by inferring global white matter abnormality patterns.
D
To the best of our knowledge, this work represents the first investigation of
TE
machine-learning-based whole brain fiber clustering analysis for group classification. This investigation extends our preliminary work (Zhang et al., 2016) to employ a multi-fiber tractography model that is more sensitive in tracking
EP
through regions of crossing fibers, a hemisphere-based cluster separation to observe potentially lateralized white matter abnormalities, and a multiple feature selection to assess the most discriminative clusters that present abnormalities
AC C
in different diffusion properties such as anisotropy and diffusivity.
2. Methods and Materials Here we first give a brief overview of our proposed method, followed by
more detailed descriptions of the computational processing methods. Then, we introduce the dataset and the multi-fiber tractography method and experimental evaluations employed in this work.
3
ACCEPTED MANUSCRIPT
2.1. Method Overview
RI PT
Our approach had two main steps: data-driven fiber clustering white matter parcellation and high-dimensional multivariate diffusion feature classification.
Figure 1 illustrates the fiber clustering pipeline (Section 2.2), including a groupwise tractography registration, a groupwise fiber cluster atlas generation,
a subject-specific fiber clustering, and a hemisphere-based cluster separation.
SC
Overall, these steps enabled parcellation of corresponding fiber clusters in the whole brain of all subjects. We note that the whole pipeline worked in an automated data-driven fashion. A recently released implementation of these steps
M AN U
provides a publicly available software package for groupwise fiber-clusteringbased white matter parcellation in the whitematteranalysis 5 package. Our recent work has shown successful applications of the pipeline (Zhang et al., 2016; O’Donnell et al., 2017; Zhang et al., 2017a,c).
Figure 2 gives an overview of the machine-learning-based classification (Section 2.3). It included a standard machine learning classification procedure, consisting of feature extraction, feature ranking and selection, classifier training,
D
new data classification, and performance evaluation in a cross-validation. We
TE
extracted multiple diffusion features (see Section 2.3.1) from each fiber cluster per subject and conducted the classification using these features separately to investigate their individual discriminative ability between two groups (TDC vs
EP
ASD in this application). Fiber clusters that were robustly discriminative across the multiple features were identified as the output discriminative white matter regions between the groups.
AC C
The proposed method was demonstrated by application to a pediatric neu-
roimaging dataset from 149 individuals, including 70 children with autism spectrum disorder (ASD) and 79 typically developing controls (TDC) (see Section 2.6 for details of the dataset). The study was conducted based on open source software, including fiber tracking ukftractography, fiber clustering whitematteranalysis, feature extraction and tractography visualization in 3D Slicer 1 via 1 https://www.slicer.org
4
ACCEPTED MANUSCRIPT
(c)
TE
EP
(d)
D
M AN U
SC
(b)
RI PT
(a)
Figure 1: Outline of the fiber clustering white matter parcellation. (a) Groupwise whole brain tractography registration for simultaneous joint alignment of tractography across all subjects.
AC C
Overlap of all subjects’ tractography before and after registration is displayed, with fibers from different subjects colored differently. (b) Atlas generation using spectral clustering to discover common white matter structures in the population. Fiber colors are automatically generated from the spectral embedding, where each fiber cluster has a unique color, and similar clusters have similar colors. Example individual atlas clusters are displayed. (c) Subject-specific fiber clustering based on the obtained atlas. Fibers are colored accordingly to the atlas. Example subject-specific clusters corresponding to those in (b) are shown. (d) Hemisphere-based subject cluster separation to divide each subject cluster into hemispheric or commissural clusters.
5
ACCEPTED MANUSCRIPT
(a)
Leave-subjects-out Process Training Set
(b)
M AN U
min FAT1
New Representation & Class Prediction
SC
Top R Feature Ranking Clusters and Selection & SVM Model Classifier Training
RI PT
Left-out Subjects
(d)
EP
TE
D
(c)
MDTmax 2
Figure 2: Overview of the machine-learning-based classification. (a) Diagram of the classifica-
AC C
tion steps using an individual feature. Feature ranking and selection were conducted using the signal-to-noise (s2n) ratio coefficient. Support vector machine (SVM) was used as the classifier. A leave-subjects-out cross-validation was performed for performance evaluation. (b) Feature-specific discriminative clusters, which obtained the highest classification accuracy for each individual feature, were identified in the cross-validation. (c) Discriminative clusters that were robustly retained across multiple features were identified as the robustly discriminative clusters. (d) The anatomical connectivity of each robustly discriminative cluster was assessed by comparing its fiber endpoints to a brain parcellation into multiple cortical/subcortical regions (red: left postcentral gyrus; magenta: brain stem).
6
ACCEPTED MANUSCRIPT
2.2. Fiber Clustering White Matter Parcellation
RI PT
SlicerDMRI (Norton et al., 2017)2 .
This section illustrates the applications of the groupwise tractography reg-
2.2.3). 2.2.1. Groupwise Tractography Registration
SC
istration (Section 2.2.1) and the data-driven fiber clustering (Sections 2.2.2 and
We first computed an unbiased groupwise whole brain tractography registration (Figure 1(a)) to align all subjects’ tractography into a common space (the
M AN U
atlas space), where the fiber clustering was performed. The method performed an entropy-based registration in a multiscale manner based on the pairwise fiber trajectory similarity (mean closest point distance) across all subjects. We first applied an affine transform (O’Donnell et al., 2012). Then, a recently implemented b-spline transform was conducted in a coarse-to-fine fashion following the affine transform. This enabled nonrigid deformations of the fibers for im-
D
provements of parcellation consistency across subjects. The transforms were conducted in a symmetric registration manner, which registered all subjects
TE
including midsagittal plane reflected copies of the subjects. In this way, the registration can effectively perform tractography alignment across the midsagittal plane. This could benefit the bilateral clustering for atlas generation (Section
2.2.3).
EP
2.2.2) and also enabled the hemisphere-based subject cluster separation (Section
AC C
Detailed parameter settings used in our experiments were as follows. The groupwise tractography registration employed 20,000 fibers randomly sampled from each subject (a total of 149 subjects) for a total of about 3 million fibers. The affine registration was conducted with multiscale sigmas from 20mm down to 5mm, applied to the pairwise fiber similarity via Gaussian transform. Then, the coarse-to-fine nonrigid b-spline registration was calculated with the sigmas 2 http://dmri.slicer.org
7
ACCEPTED MANUSCRIPT
from 5mm to 2mm and b-spline grid sizes from 4 × 4 × 4 to 8 × 8 × 8. Each subject’s full tractography was transformed accordingly to the affine and nonrigid
RI PT
registrations. All parameters (also for the fiber clustering in Sections 2.2.2 and
2.2.3) were chosen for reasonable performance, computational time and memory usages. Other parameters were left as default values in the whitematteranaly-
sis 5 package, which have been tuned for optimal performance in general. The
computing (HPC) Linux Cluster. 2.2.2. Groupwise Fiber Cluster Atlas Generation
SC
computations were conducted using a 48 core machine in a high performance
M AN U
After the registration, spectral embedding was conducted on the pairwise fiber affinity to convert each fiber to a point in a high-dimensional spectral embedding space (O’Donnell and Westin, 2007). In this space, fibers were interpreted with respect to their similarities to all others, where neighborhoods in general grouped similar fibers. We applied the mean closest point distance, a popular fiber distance measure used in fiber clustering (Ding et al., 2003;
D
Moberts et al., 2005; Garyfallidis et al., 2012; O’Donnell and Westin, 2007; Wang et al., 2016). The fiber distance was converted to a fiber affinity using a
TE
Gaussian-like kernel with sigma of 60 mm as in (O’Donnell and Westin, 2007; O’Donnell et al., 2017; Zhang et al., 2017a,b,c). We used the Nystrom sampling method to represent the embedding space compactly and to reduce the com-
EP
putations considering the large number of fiber pairs across subjects. K-means clustering was then performed to create the cluster atlas in a data-driven way
AC C
(Figure 1(b)). Each atlas fiber cluster was obtained across subjects, representing common white matter structures in the population. Bilateral clustering that simultaneously clustered fibers in both hemispheres was applied to improve clustering robustness (O’Donnell and Westin, 2007). The total number of bilateral clusters in the atlas was defined as Kb . We further incorporated an outlier removal to remove the spurious fibers for cluster consistency in the atlas. For each cluster, we excluded the fibers that were present in only few subjects to reject uncommon tractography er-
8
ACCEPTED MANUSCRIPT
rors. Specifically, we considered a fiber as an outlier if it was distant from other fibers within its cluster. To quantitatively decide the outliers, we define a prob-
RI PT
ability for each fiber based on the pairwise fiber distances (the mean closest
point distance) mapped through a Gaussian kernel with sigma 20 (O’Donnell et al., 2017). The probability of each fiber given its cluster was computed using the fibers from all other subjects in this cluster. The fibers whose probability
SC
was more than two standard deviations away from the cluster mean probability
were regarded as outliers and rejected. The threshold of two standard deviations was chosen as a good value to remove uncommon tractography errors (spuri-
M AN U
ous fibers), based on our recent works (O’Donnell et al., 2017; Zhang et al., 2017a,b,c). The fiber clustering and outlier removal were performed iteratively to get a final fiber cluster atlas.
In the experiments, all 149 subjects were used for atlas generation, with 5,000 fibers randomly sampled from each subject for a total of 745,000 fibers. 3500 fibers were sampled for the Nystrom method and two rounds of outlier rejection with a Gaussian kernel of σ=20mm were applied to create a final atlas. We
D
generated multiple cluster atlases given different numbers of bilateral clusters,
TE
i.e., Kb from 400 to 4000 which provided reasonable parcellation with respect to the white matter anatomy, to explore the influence of white matter parcellations at different scales.
EP
2.2.3. Fiber Clustering of Subjects We then applied the obtained fiber cluster atlas to each individual subject
AC C
for a subject-specific tractography parcellation (Figure 1(c)) using spectral embedding (O’Donnell and Westin, 2007). In details, affinity values for a new fiber (already affine and nonrigid transformed) were calculated by comparison to each fiber stored in the atlas (the stored Nystrom sample of 3500 fibers), using the same fiber affinity computation as in the atlas generation (Section 2.2.2). Spectral embedding was then used to embed each new fiber in the same atlas embedding space in which the fiber clustering was performed originally. The subject-specific fiber clusters were then obtained by assigning the fibers
9
ACCEPTED MANUSCRIPT
to their closest atlas cluster (O’Donnell and Westin, 2007). We note that each subject’s fibers were clustered individually according to the atlas, thus any indi-
RI PT
vidual fiber tract property, e.g. the fractional anisotropy or the fiber count, was preserved. We then applied an outlier removal process for each subject cluster,
similar to that in the atlas creation. Any outlier fiber whose probability/affinity given its atlas cluster was over two standard deviations from the cluster’s mean
SC
fiber probability was removed.
A hemisphere-based cluster separation was then conducted to obtain intraan inter-hemispheric clusters (Figure 1(d)). Given the bilateral clustering of
M AN U
the atlas, the subject clusters were detected bilaterally, which presented white matter tracts in both hemispheres. Considering the potentially lateralized asymmetry of white matter microstructure in ASD (Herbert et al., 2005; Lange et al., 2010), we divided the bilateral clusters into left-hemispheric (L), righthemispheric (R) and commissural (C) clusters according to the brain midsagittal plane. We obtained a total of Ks = 3 × Kb clusters per subject after the separation, in which some presented empty tracts. For example, a before-separation
D
cluster of commissural tracts would produce a C cluster but empty L/R clusters.
TE
After the bilateral cluster separation, valid L/R/C clusters were identified as those that passed a nonparametric one-tailed sign test (Bonferroni corrected at a significance level of 0.05) in the population. This method has been applied
EP
in multiple studies to identify valid white matter connections (Gong et al., 2009; Ratnarajah et al., 2013). Specifically, for each L/R/C cluster, the sign test was performed across all the subjects to determine the existence of the cluster. The
AC C
sign test was used to find the ones most consistently present across subjects and resulted in a total number of clusters, K, retained for each subject to conduct ASD/TDC classification. 2.3. Machine Learning Classification After the fiber clustering white matter parcellation, multiple diffusion features were extracted from each valid fiber cluster (Section 2.3.1) and used separately to inspect their individual discriminative ability in ASD classification. 10
ACCEPTED MANUSCRIPT
For each feature, we performed a 10-fold cross-validation (Section 2.4), i.e., the feature ranking, the feature selection (Section 2.3.2) and the SVM model train-
RI PT
ing (Section 2.3.3) were conducted using all subjects from all but the left-out subjects, as illustrated in Figure 2(a). 2.3.1. Diffusion Feature Extraction
Diffusion feature extraction was then conducted to quantitatively describe
SC
the K valid fiber clusters for each subject. Studies in the literature (Travers et al., 2012; Hoppenbrouwers et al., 2014; Ameis and Catani, 2015) have observed
diffusion anisotropy and diffusivity differences, which are generally described by
M AN U
fractional anisotropy (FA) and mean diffusivity (MD) measures, in ASD relative to normal controls. Therefore, in this study we focused on these two measures. In addition to the typical mean FA and MD features along fiber tracts that have been studied in ASD (Catani et al., 2008; Pugliese et al., 2009; Ge et al., 2010; Hong et al., 2011; Langen et al., 2012), we also measured extreme values (minimum and maximum) of FA and MD for each fiber cluster. As recent studies
D
of autism have found local along-tract diffusion differences between ASD and TDC (Johnson et al., 2014; Libero et al., 2015), we extracted these extreme
TE
features for the potential to detect local along-tract diffusion changes. Specifically, for each fiber cluster, FA and MD were computed at each point for both tensors (T1 and T2), and the minimum, mean and maximum values for
EP
each measure were calculated. Thus, given the statistics s ∈ {min, mean, max} of the measure m ∈ {FA, MD} of the tensor t ∈ {T 1, T 2}, a total of 12 features,
AC C
f = mst , were extracted from each cluster. Then, for each feature, each subject was associated with a feature vector that involved K fiber clusters, FK . An
extension to the 3D Slicer module FiberTractScalarMeasurements was implemented to compute these features. 2.3.2. Feature Ranking and Selection Selection of a compact discriminatory subset of features is one important step, in particular for high-dimensional data, in machine learning classification
11
ACCEPTED MANUSCRIPT
tasks for accuracy improvement and better data understanding (Guyon and Elisseeff, 2003) (i.e., which white matter structures were more discriminative in
RI PT
ASD). Therefore, for our machine learning pipeline, we started with a feature ranking and selection to eliminate potential feature redundancy.
Specifically, we employed the signal-to-noise (s2n) ratio coefficient (Golub
et al., 1999) for the feature ranking and selection. The s2n ratio has been
SC
widely applied in machine learning tasks to improve classification performance, e.g., between genes (Golub et al., 1999; Mishra and Sahu, 2011). It has also been employed and proven efficient in finding discriminative features for ASD
M AN U
classification (Ingalhalikar et al., 2011). Therefore, in our work, we applied s2n to rank each fiber cluster according to its feature values (as introduced in Section 2.3.1) between the TDC and ASD groups and hypothesized that the top ranked clusters could be more discriminative to improve the group classification. Given a certain feature, the s2n ratio was computed for each cluster based on a set of training subjects annotated with ASD (class label 0) or TDC (class label 1). Specifically, for a feature f = mst , the s2n ranked the cluster k based
D
on the ratio of the absolute difference of the class means over the average class
TE
standard deviation, as:
s2n(k) = |
µk0 (mst ) − µk1 (mst ) | σ0k (mst )2 + σ1k (mst )2
(1)
EP
where µk0 (mst ) and σ0k (mst )2 are the mean and variance of the feature across the subjects annotated with ASD (label 0), similarly to µk1 (mst ) and σ1k (mst )2 for TDC (label 1). It is noted that there can be clusters where some subjects had
AC C
empty tracts due to tractography noise and/or individual anatomical variability. To avoid any ranking bias towards these empty clusters, the features were set to the mean from the non-empty corresponding tracts across all subjects before computing the s2n ratio. Feature selection was then conducted by retaining the top ranked clusters
that were considered most discriminative between the groups. Assuming R is the number of the selected clusters, each subject was represented as a new feature vector that involved R fiber clusters, FR . The new feature vectors were then 12
ACCEPTED MANUSCRIPT
used for the classification. In our experiments, the optimal R was defined as the number that generated the highest classification in a 10-fold cross-validation as
RI PT
discussed in Section 2.4. 2.3.3. Classification with SVM
After the feature selection that retained a small number of clusters (R),
we used support vector machine (SVM) to conduct the ASD classification.
SC
SVM is one of the most widely used machine learning techniques to classify
high-dimensional-feature data in neuroimage analysis (Plant et al., 2010; Dyrba et al., 2013) and has effectively captured the multivariate relationships among
M AN U
anatomical regions in ASD (Lange et al., 2010; Ingalhalikar et al., 2011; Deshpande et al., 2013; Goch et al., 2014; Mostapha et al., 2015). Briefly, SVM is a supervised classification method and treats each feature vector as a point in a high dimensional space. It classifies the input data into different groups (ASD vs TDC) by identifying a separating hyperplane or decision boundary in the feature space. Here, given the feature vector FRtrain and its class label (0
D
for ASD, 1 for TDC) of each subject in the training set, a SVM model was first trained. The feature vector of a new subject (each left-out subject) FRnew
TE
was then fed into the model to predict its class. In our experiments, we used the nu-SVC SVM with polynomial kernel and default parameters (gamma = 1/number of features, coef 0 = 0, degree = 3 and cost = 1) from the LIBSVM
EP
(Chang and Lin, 2011) toolbox3 .
AC C
2.4. Cross-validation and Classification Performance Evaluation We conducted 10-fold cross-validation, which has been suggested as a good
cross-validation strategy for accuracy estimation and model selection (Kohavi, 1995; Krstajic et al., 2014), for performance evaluation (Figure 2(a)). In each trial, subgroups (1/10 of the total subjects) of the ASD and TDC subjects were randomly pulled out as a test set, while the remaining were used as a training set 3 The
package was downloaded at https://www.csie.ntu.edu.tw/~cjlin/libsvm/
13
ACCEPTED MANUSCRIPT
for ranking and selecting features and learning the SVM model. The classifier was evaluated based on the predicted label of the left-out test subjects. By
RI PT
repeatedly leaving subjects out for testing and averaging the accuracies over all cross-validation trails, we obtained an average classification accuracy.
To determine an optimal R, we performed the cross-validation for each possible R ∈ [1, K] and obtained an accuracy curve given all R values. The threshold
SC
to select the optimal R was the value that generated the highest averaged accu-
racy in the cross-validation, which corresponded to the smallest cross-validation error (Guyon and Elisseeff, 2003). This analysis has been successfully applied
M AN U
in previous studies (Ingalhalikar et al., 2011; Zhang et al., 2016). Specifically, given a feature f = mst , in each cross-validation trial feature ranking and selection was conducted to keep the top R ranked clusters. An SVM model was then trained given these clusters on the training subjects and was used for predicting the test subjects using the corresponding clusters. We then obtained an average accuracy, acc, across all cross-validation trials for this particular R. Repeating the above process given different R values, we obtained an average accuracy
D
curve C = {acc(R)|R ∈ [1, K]}. The R0 leading to the highest average accuracy on the curve was set to as the optimal setting, and the highest average accuracy
TE
was defined as the classification accuracy obtained by the feature f , i.e., acc0 .
EP
2.5. Robustly Discriminative Clusters via Multiple Feature Selection We conducted a multiple-feature-based analysis to assess which fiber clusters were robustly discriminative for classification. The feature ranking and
AC C
selection (Section 2.3.2) was performed on each individual feature, which retained R0 feature-specific discriminative clusters (for best classification) given the cross-validation process (Figure 2(b)). Here, we selected the clusters that were robustly retained across the features (Figure 2(c)). In detail, for an individual feature f = mst , as the retained clusters may slightly vary between the cross-validation trials, we considered the ones that appeared in all crossvalidation trials as feature-specific discriminative clusters. Following this, the clusters that were retained for most of the features were selected as robustly 14
ACCEPTED MANUSCRIPT
discriminative clusters. In the experiments, we selected the clusters that were
robustly discriminative clusters (Section 3.3 and Figure 8).
RI PT
retained in at least 8 features, which generated a reasonable number (21) of
For each robustly discriminative cluster, we assessed which brain anatomical
regions it may connect to, by comparing its fiber endpoints to a Freesurfer brain
parcellation into multiple cortical/subcortical gray matter regions (Fischl, 2012)
SC
(Figure 2(d)). We developed a 3D Slicer module FiberEndPointFromLabelMap
to calculate the percentage of endpoints per cluster touching the Freesurfer regions. We considered that a cluster was connected to a segmented gray matter
M AN U
region if at least 20% of its endpoints touched it, as the fiber tracking may end in the white matter region near the gray matter parcels. In our experiments, we identified the top two connected gray matter regions across all subjects for most of the robustly discriminative clusters. However, considering there were clusters touching different segmented regions, for several discriminative clusters we also included the third and fourth connected gray matter regions (see Section 3.3
D
and Table 4).
2.6. Data Acquisition and Tractography
TE
We demonstrate the proposed method using a diffusion weighted imaging (DWI) dataset from 149 pediatric male children (70 ASD, age: 11.0±2.6 and 79
EP
TDC, age: 11.1±2.7). Detailed population demographics are listed in Table 1. This data was acquired at the Center for Autism Research, Children’s Hospital of Philadelphia, with approval of the local ethics board. The diagnoses were
AC C
made with gold standard research procedures, by consensus by two experienced clinicians. Age distributions did not differ between groups. Overall general conceptual ability (GCA) IQ scores (ASD: 106.1±20.1 and TDC: 110.4±14.9)
were not significantly different between the two groups. DWI data were acquired using a Siemens 3T VerioTM scanner with a 32 channel head coil. A monopolar Stejskal-Tanner diffusion weighted spin-echo was used to perform high angular resolution diffusion imaging (HARDI) acquisition (TR/TE = 14.8s/110ms, b = 3000s/mm2 , 2mm isotropic resolution, and 64 gradient directions). The DWI 15
ACCEPTED MANUSCRIPT
Table 1: Demographic and neuropsychological variables of the ASD and TDC groups
TDC
Number of subjects
70 (Male)
79 (Male)
Agea
11.0 ± 2.6
11.1 ± 2.7
GCA
106.1 ± 20.1
110.4± 14.9
SRSc
76.9 ± 10.8
41.9 ± 7.0
a
-
0.982
0.140
<0.001
SC
b
p-valued
RI PT
ASD
Age ranges of the two groups were matched, where the ASD group was from 6.1
to 17.9 years and the TDC group was from 6.3 to 17.7 years.
General conceptual ability (GCA) score evaluates the reasoning and conceptual
M AN U
b
abilities and gives an overall score for IQ (Kotz et al., 2008). c
Social responsiveness scale (SRS) score is a standard socio-psychological biomarker
indicating social impairments (Constantino et al., 2003). d
A two sample Student’s t-test was used to derive the p-value.
images were processed using a joint linear minimum mean squared error filter for Rician noise removal. Eddy current correction was performed by registering
D
each DWI volume to the baseline image.
TE
We conducted whole brain tractography using a two-tensor unscented Kalman filter (UKF) method (Malcolm et al., 2010), as implemented in the ukftractography 4 package. The UKF method fits a mixture model of two tensors to the
EP
diffusion data while tracking fibers, in a recursive estimation fashion (the current tracking estimate is guided by the previous one). The mixture model assumes there are two distinct tract directions within voxels, in which the first one rep-
AC C
resents the principal direction and the other one is from the second tract. The two-tensor UKF model was shown to be more sensitive than standard singletensor tractography, in particular in the presence of crossing fibers (Baumgartner et al., 2012; Chen et al., 2015, 2016). In our experiments, tractography was seeded with 10 seeds per voxel (for a more thorough tracking result as suggested 4 https://github.com/pnlbwh/ukftractography
16
ACCEPTED MANUSCRIPT
Table 2: The numbers of valid clusters of the different white matter parcellations under study. Kb is the number of bilateral clusters; Ks is the number of clusters obtained after hemisphere-
RI PT
based separation; K is the number of valid clusters that were retained after the sign test.
400
800
1200
1600
2000
2400
2800
3200
3600
4000
Ks
1200
2400
3600
4800
6000
7200
8400
9600
10800
12000
K
837
1666
2413
3161
3927
4697
5468
6233
6974
7718
SC
Kb
in the software), in all voxels within the brain mask where FA was greater than
M AN U
0.18 (default). Tracking stopped where the FA value fell below 0.22 or the generalized anisotropy (GA) (a normalized variance of the tensor diffusivities in all gradient directions) fell below 0.13. These two stopping thresholds were set slightly above the default values, which were defined for more traditional datasets with a b-value of 1000, while we used DWI data with a higher b-value of 3000. Other parameters were left as default values. This generated about
D
350,000 fibers for each subject. Visual and quantitative quality control of the
TE
tractography was performed using the whitematteranalysis 5 package.
3. Experimental Results
EP
3.1. White Matter Parcellation Consistency Figure 3 shows the mean intra- and inter-cluster fiber pair distances (i.e. the mean closest point distance) given the different white matter parcellation atlases
AC C
(Kb from 400 to 4000). Both intra- and inter-cluster distances were highly similar across the multiple atlases, with lower intra-cluster distances (around 20 mm) compared to the inter-cluster distances (around 70 mm). Under the statistical criterion of the sign test (p < 0.05, corrected), Table 2 shows the numbers of valid clusters (K) per white matter parcellation. About 67% of
the clusters were retained after the sign test across all the parcellations. We 5 https://github.com/SlicerDMRI/whitematteranalysis
17
ACCEPTED MANUSCRIPT
100
RI PT
80 70 60 50 40
Inter-cluster Distance Intra-cluster Distance
30 20 10 0
400
800
1200
1600
2000
2400
2800
3200
3600
4000
M AN U
Bilateral Clusters (K b )
SC
Mean Fiber Pair Distance (mm)
90
Figure 3: The mean intra- and inter-cluster fiber pair distances across the different white matter parcellation atlases under study. The fiber clustering method identified fiber clusters within which the fibers had similar geometric trajectories (i.e. similar white matter anatomy), thus a fiber cluster was considered as anatomically coherent if its intra-cluster fiber distances were low. Across the multiple atlases, the similar mean intra-cluster fiber pair distances suggested that the fiber clustering method could identify highly anatomically coherent white matter subdivisions at different parcellation scales.
D
note that, as expected, most of the excluded clusters were produced by the
TE
hemisphere-based separation. Figure 4 shows the percentage of the K valid clusters given the numbers of subjects presenting them. The retained fiber clusters were highly consistent across the subjects. 86.5% of the clusters were
EP
detected in at least 140 subjects, and only a small percentage were detected in 109 or fewer subjects. Figure 5 displays some example fiber clusters across subjects. The fiber clustering subdivided the thalamus-to-superior-frontal-gyrus
AC C
connections into several smaller tracts. The subject-specific clusters were highly similar across subjects indicating a corresponding division of the white matter anatomy.
3.2. Classification Accuracy Table 3 displays the classification accuracy given the fiber clustering white matter parcellations at different scales. For each parcellation, we performed ASD classification separately using each of the 12 features and thus obtained 18
ACCEPTED MANUSCRIPT
0.8% 1.4% 2.8%
RI PT
8.4%
SC
86.5%
M AN U
140-149 subjects 130-139 subjects 120-129 subjects 110-119 subjects 109 or fewer subjects
Figure 4: White matter parcellation consistency of the valid clusters across the population. 86.5% of the valid clusters were detected in at least 140 subjects, while only fewer than 1% were detected in 109 or fewer subjects. The displayed percentage per pie slice is the average value across the multiple parcellations under the different K values.
D
Table 3: Classification accuracies given the white matter parcellations at different scales. Average acc0A (± standard deviation) (across all features) and highest acc0H (feature-specific) 2.
1666
2413
3161
3927
4697
5468
6233
6974
7718
66.70
68.02
68.89
69.39
70.03
70.64 69.68
69.80
69.67
67.26
±4.58 ±4.00 ±4.39 ±4.62 ±4.50 ±4.23 ±5.26 ±4.74 ±4.98 ±5.29
AC C
acc0A
837
EP
K
TE
accuracies are shown. The number of valid clusters K corresponds to that displayed in Table
acc0H 74.74
75.21
75.88
76.50
77.02
78.33 77.54
76.40
76.50
75.52
12 classification accuracies (acc0 ). The average acc0A (across all features) and highest acc0H (feature-specific) accuracies were reported for each K. The best
classification performance was obtained when K = 4697, with the average and highest accuracies of 70.64% and 78.33%. Other classification performance mea-
19
ACCEPTED MANUSCRIPT
(b)
SC
RI PT
(a)
Figure 5: Visual presentation of the fiber clusters connecting the thalamus and the superior
M AN U
frontal gyri in a Freesurfer parcellation across multiple subjects. (a) shows the top six clusters connecting the two regions from the white matter parcellation under K=4697, according to the anatomical region assessment process introduced in Section 3.3. (b) shows the corresponding subject-specific clusters across multiple subjects.
surements, corresponding to the best classification accuracy, were computed, as sensitivity = 84.81%, specificity = 72.86%, positive predictive value = 77.91% and negative predictive value = 80.95%. Please see the video included in the
TE
this atlas (K = 4697).
D
supplementary material for a visualization of each individual fiber cluster within
The K values from 3927 to 5468 were considered better given the different parcellations under study. Dividing the whole brain tractography into more or
EP
fewer clusters tended to decrease the classification performance. Outlier removal was employed in creating the atlas for the purpose of improving cluster consistency by removing any anatomically incoherent fibers. Experiments demon-
AC C
strated insensitivity of the machine learning to the fiber outlier removal (under 2% change in average classification accuracy across the multiple parcellations under study).
Figure 6 shows the classification accuracies obtained from the multiple fea-
tures. FAmean generated overall best classification results, with the highest T2 mean and a relatively small standard deviation. The accuracy from FAmax T 1 was
the second best, which was close to that from FAmean . In addition, higher accuT2
20
ACCEPTED MANUSCRIPT
85
RI PT
75 70 65 60
SC
Classification Accuracy
80
50
min
mean
FAT1 FAT1
max
FAT1
min
M AN U
55
mean
FAT2 FAT2
max
FAT2
min
mean
MDT1 MDT1
max
MDT1
min
mean
MDT2 MDT2
max
MDT2
Figure 6: Classification accuracies given the different features. The bar and error bar indicate the mean and standard deviation across the different K values.
racies were obtained using the extreme features from FAT 1 , MD T 1 and MD T 2
D
when compared to their corresponding mean feature. 3.3. Robustly Discriminative Clusters
TE
To identify the most discriminative fiber clusters given the multiple feature selection, we used the white matter parcellation of K=4697, which generated
EP
the best accuracies (Table 3). Figure 7 gives the average accuracy curve C of the multiple features for this parcellation. For each feature, the feature-specific discriminative clusters were the top R0 ones that generated the highest aver-
AC C
age cross-validation accuracy acc0 (annotated in blue). To measure the feature weight stability, we computed the number of clusters that were consistently selected as the top ranked R clusters per feature across the 10-fold trials. For 0 example, for the F Amin T 1 , 654 clusters (91.7%) of the top R = 713 clusters across
the 10-fold trials were consistently selected. Across all the diffusion features, the average percentage of the clusters that were consistently identified was 92.6%, showing a high stability of the s2n feature weighting across the validation runs. We selected the clusters that were retained in at least 8 (out of the 12) features 21
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
Figure 7: Average classification accuracy given the number of selected clusters R. The blue
D
circle indicates the highest average cross-validation accuracy that was obtained using the
TE
optimal number of clusters R0 .
as the most discriminative ones. We chose the threshold of 8 to produce a rea-
EP
sonable number of robustly discriminative clusters, i.e., a total of 21 clusters as displayed in Figure 8. Table 4 lists the Freesurfer cortical/subcortical regions to which the clusters connect. A lower threshold of the number of features re-
AC C
tained a larger number of fiber clusters. For example, the number of retained clusters in at least 7 features was 116. On the other hand, a higher threshold value led to only a small number of retained clusters. As shown in Figure 7, min 0 several features, such as F Amin T 1 and F AT 2 , have relatively small optimal R
values, which reduced the total possible retained clusters across features. We found only 7 retained clusters in at least 9 features (annotated in Table 4) and did not obtain any clusters that were retained in 10 or more features. To evaluate which features from the robustly discriminative clusters were 22
ACCEPTED MANUSCRIPT
(d)
(e)
(c)
(f)
RI PT
(b)
M AN U
SC
(a)
Figure 8: Visual presentation of the robustly discriminative fiber clusters.
most informative between the two groups, we identified the 8 features for each cluster that were used to retain this cluster. Across the 21 clusters, the most the most informative features) included F Amax T1 ,
D
widely used features (i.e.
max mean , M DTmax M DTmin 2 and F AT 2 that were used by 16, 16, 14, 12 and 11 2 , F AT 2
TE
clusters respectively. These features in general corresponded to the ones that had the best classification accuracies (Figure 6). Among the 21 robustly discriminative fiber clusters, nearly half (10/21) be-
EP
longed to the corpus callosum, which were mainly from the anterior (Figure 8(a)) and posterior (Figure 8(b)) regions. Three clusters that were potentially
AC C
from the uncinate fasciculus (Figure 8(c), purple), the aslant tract (Figure 8(c), yellow) and the arcuate fasciculus/superior longitudinal fasciculus (Figure 8(c), red) were identified. We also found clusters related to the brain stem (Figure 8(d)) and the cerebellum (Figure 8(e)). There were also two clusters connecting to the ventral diencephalon (Figure 8(f)).
23
ACCEPTED MANUSCRIPT
Table 4: List of the Freesurfer regions to which the robustly discriminative clusters (Figure 8) connect. The endpoints of the clusters that mostly ended in white matter regions were †
indicates the clusters retained in at least 9 features. Abbreviations:
RI PT
displayed as empty.
hemispheres, L - left; R - right. Freesurfer regions, BS - brain stem; CAU - caudate; CBLM cerebellum; FronPol - frontal pole; LatOcci - lateral occipital gyrus; ParsOrb - pars orbitalis;
PCN - precuneus; SupPari - superior parietal gyrus; PreCen - precentral gyrus; RosMidFron - rostral middle frontal gyrus; SM - supramarginal gyrus; SupFron - superior frontal gyrus;
(d)
(e)
M AN U
Region B R-SupFron R-SupFron R-FronPol R-ParsOrb, R-SupFron R-RosMidFron R-PCN R-SupPari R-SupPari R-PCN, R-SupTemp L-SupFron L-SM L-LatOcci L-PosCen L-CAU L-CAU L-PreCen L-CAU L-CAU R-InfPari R-RosMidFron
AC C
(f)
D
(c)
TE
(b)
Region A L-SupFron L-SupFron L-FronPol L-ParsOrb, L-SupFron L-RosMidFron L-PCN L-SupPari L-SupPari L-PCN, L-SupTemp L-RosMidFron, L-ParsOrb L-RosMidFron, L-ParsOrb L-RosMidFron, L-ParsOrb BS BS BS L-CBLM L-CBLM L-CBLM R-VDC R-VDC
EP
(a)
Color P B Y R C† G P† Y† G B Y† R† P† G P R P Y G† G R
SC
SupTemp - superior temporal gyrus; VDC - ventral diencephalon.
4. Discussion
In this paper, we presented a machine learning method based on groupwise
data-driven fiber clustering to differentiate whole brain white matter connectivity patterns. We demonstrated the method’s performance with an application of ASD/TDC classification. We have several overall observations about the results. We observed that fine white matter parcellations (K from 3927 to
24
ACCEPTED MANUSCRIPT
5468) tended to provide higher classification accuracy, likely due to the fact that the extracted diffusion features could be more specific to smaller scale
RI PT
subdivisions of the white matter anatomy. For example, we found one discrim-
inative cluster (Figure 8(d), green) that comprised only a small portion of the whole corticospinal-tract pathway. We also observed that the 12 measured dif-
fusion features performed differently for the ASD classification. We found that
SC
the mean FA feature from the second tensor (F Amean ) was the most effective in T2
classifying the two groups. The two-tensor UKF model enables tracking through crossing fiber regions, where the crossing fibers are mainly represented by the
M AN U
second tensor (Malcolm et al., 2010). Thus the highest classification accuracy could suggest that the diffusion anisotropy in crossing fiber refrom F Amean T2 gions was affected in ASD. Furthermore, analyzing the extracted fiber clusters we noticed that the maximum FA and minimum MD features normally corresponded to a more middle/core fiber tract region, while the minimum FA and maximum MD features were more likely to appear at the endpoint regions of the fibers. This was in accordance with the relatively well-known along-tract FA and
D
MD patterns (Colby et al., 2012; O’Donnell et al., 2009). Among these extreme
TE
features, F Amin,max , M DTmin,max and M DTmin,max obtained higher accuracies T1 1 2 when compared to their corresponding mean features. Thus our results suggest potential alterations in diffusion not only in the overall fiber mean features, but
EP
also in the central cores of the fiber tracts and at the white matter/gray matter interface.
Most of the robustly discriminative clusters found here correspond to fiber
AC C
tracts that have been previously implicated in ASD, such as fiber tracts related to corpus callosum (Alexander et al., 2007; Hong et al., 2011), cerebellum (Catani et al., 2008), brain stem (Travers et al., 2015a), uncinate fasciculus (Pardini et al., 2012), arcuate fasciculus (Fletcher et al., 2010), superior longitudinal fasciculus (Nagae et al., 2012) and aslant tract (Lo et al., 2016). We note that the discriminative clusters may include or connect to only part of these structures, and that clusters can potentially combine fibers from more than one anatomical structure. For example the red colored cluster in Figure 25
ACCEPTED MANUSCRIPT
8(c) may include parts of arcuate fasciculus and superior longitudinal fasciculus, and thus may not be specific to either structure. Additional related findings in
RI PT
ASD can be found in (Travers et al., 2012; Hoppenbrouwers et al., 2014; Ameis
and Catani, 2015). We also found two discriminative clusters related to the ventral diencephalon (Figure 8(f)). A larger volume of diencephalon (including thalamus and ventral diencephalon) was reported in ASD when compared to
SC
controls (Herbert et al., 2003). However, we were not able to find any previous work on white matter fiber tracts related to the ventral diencephalon.
These white matter tracts were obtained using the multi-fiber UKF trac-
M AN U
tography, which has been shown to be more sensitive in tracking through the regions in the presence of crossing fibers (Baumgartner et al., 2012; Chen et al., 2015, 2016). This could reduce the well-known tractography problem of missing fiber error and enable investigation of more white matter connections. For example, we found discriminative corpus callosum clusters following the lateral paths to the pars orbitalis gyri (Figure 8(a)) and to the temporal lobe (Figure 8(b)); however, these white matter tracts can hardly be tracked in a basic single-
D
tensor streamline method (Malcolm et al., 2010). Given the diffusion features
TE
extracted from the multi-fiber tractography, we compared the average feature value across the identified discriminative clusters between the two groups. Feature change directions of the discriminative clusters in mean and maximum
EP
FA, as well as mean and minimum MD, were in line with the general findings of decreased anisotropy and increased diffusivity in many ASD studies using dMRI (Travers et al., 2012; Hoppenbrouwers et al., 2014; Ameis and Catani,
AC C
2015). On the other hand, higher minimum FA and lower maximum MD of the discriminate clusters suggested potentially increased anisotropy and decreased diffusivity at fiber endpoint regions in ASD. The aforementioned discriminative clusters may imply potential altered brain
functions in ASD. With reference to a 7-region corpus callosum subdivision (CC1 to CC7) (Makris et al., 1999), among the 6 anterior corpus callosum clusters, we found one from the anterior half of the body of the callosal commissure (CC3) while the others were from the rostrum (CC1) and genu (CC2). The CC3 clus26
ACCEPTED MANUSCRIPT
ter connected to the supplementary motor area that is related to planning and initialization of movement activities. In addition, we observed more left hemi-
RI PT
sphere lateralized white matter alterations in ASD. There were discriminative
clusters connecting to the left precentral and postcentral gyri, which suggest potential motor and somatosensory function alteration in ASD. We also found
several discriminative clusters connecting to the left pars orbitalis region that
SC
potentially indicate language processing abnormality in ASD. These potential altered functions in ASD have been reported in previous studies (Hong et al., 2011; D’Mello and Stoodley, 2015; Travers et al., 2015a; Herbert et al., 2005;
M AN U
Nagae et al., 2012; Pardini et al., 2012; Fletcher et al., 2010). Our results were consistent with these research outputs, but we demonstrated these clusters using a whole brain analysis, which suggested their related functions could be more affected than other altered white matter connections.
In related work, multiple methods have been applied to study dMRI in ASD. The first category has focused on identifying group differences between ASD and TDC groups using statistical analysis. Traditional hypothesis-driven
D
studies of dMRI have been conducted based on region-of-interest (ROI) methods
TE
to identify regional white matter structural abnormalities in ASD (Brito et al., 2009; Alexander et al., 2007; Pardini et al., 2012; Travers et al., 2015b). Other studies have applied voxel-based approaches, such as voxel-based morphometry
EP
(VBM) (Cheung et al., 2009; Keller et al., 2007) and tract-based spatial statistics (TBSS) (Billeci et al., 2012; Cheon et al., 2011), for whole brain white matter analysis. Tractography-based analysis has enabled measurement of macrostruc-
AC C
tural white matter tract properties for a more detailed investigation of specific subpopulations of fibers. Tract-of-interest studies (Catani et al., 2008; Pugliese et al., 2009; Ge et al., 2010; Hong et al., 2011; Langen et al., 2012; Wolff et al., 2012) using tractography have identified atypical white matter structures in ASD.
Another category of dMRI-based studies, to which our proposed method belongs, has focused on applying machine learning methodology to classify ASD/TDC for diagnostic prediction. Unlike the statistical analysis methods 27
ACCEPTED MANUSCRIPT
that aim to find white matter structures with statistical group difference, the machine-learning-based methods aim to offer predictive diagnostic relevance.
RI PT
Multiple approaches have been proposed to address this challenging problem. Several groups have employed a priori knowledge of affected regions specific to
autism pathology to define white matter regions of interest for classification
(Adluru et al., 2009; Lange et al., 2010; Deshpande et al., 2013), while other
SC
work has used volumetric regions of interest located throughout the whole white
matter (Ingalhalikar et al., 2011). Additional whole white matter approaches have studied fiber tracts by employing cortical-parcellation-based (CPB) seg-
M AN U
mentation of tractography. This has enabled measurement of graph theoretic measures derived from connectome matrices, e.g. average node degree that describes the number of tracts connecting cortex parcels (Goch et al., 2014), as well as extraction of diffusion features for description of the whole brain white matter connectivity (Mostapha et al., 2015). To characterize white matter connections for analysis, the most common method has adopted mean diffusion features within fiber tracts for an overall description. For example, mean FA
D
and/or MD were applied (Lange et al., 2010; Mostapha et al., 2015) to perform
TE
ASD classification. In contrast to the above approaches, our method has extracted multiple FA and MD features in a multi-fiber model, and we employed a study-specific fiber clustering tractography segmentation of the whole white
EP
matter, which has been suggested to be more consistent in finding corresponding white matter parcels across subjects compared to the CPB method (Zhang et al., 2017b). Our previous research work has also shown the fiber clustering
AC C
method’s high repeatability performance in terms of spectral embedding stability (O’Donnell and Westin, 2007), fiber cluster reproducibility at different parcellation scales (O’Donnell, 2006) and segmentation consistency across multiple populations (O’Donnell et al., 2017; Zhang et al., 2017b). In addition, our proposed multiple feature analysis provides a way to identify potential local along-tract abnormalities, such as in the central cores of fiber tracts and at the white matter/gray matter interface. One benefit of the data-driven fiber clustering method is that it allows a 28
ACCEPTED MANUSCRIPT
white matter parcellation into a large number of fiber clusters. Fine scale brain parcellations have been suggested to provide better description of locally spe-
RI PT
cific brain regions compared to coarse-grained parcellations (Hagmann et al., 2008; van den Heuvel et al., 2008; Liu et al., 2017). One goal of this study is to determine an optimal white matter parcellation (i.e. an optimal number of fiber clusters K) based on a certain real-world problem. Choosing the
SC
optimal K for anatomical fiber clustering is a challenge, because many white
matter structures are relatively continuous with no clear boundaries, while different structures have different size scales. Optimal white matter subdivision
M AN U
thus depends on the desired application (such as visualization, which can benefit from fewer clusters, or quantitative measurement, which can benefit from finer subdivisions thus more clusters). Our previous work has determined a reasonable value (K ≥ 200) by testing clustering and embedding reproducibility in single-tensor tractography (O’Donnell and Westin, 2007) and by empirically determining a value (K ≥ 800) that could separate structures considered to be anatomically different according to expert judgment using sensitive multi-
D
fiber tractography (O’Donnell et al., 2017). Many spectral clustering techniques
TE
have been proposed for automatic determination of an optimal K by analyzing eigenvectors of the embedding matrix (Sanguinetti et al., 2005; Zelnik-Manor and Perona, 2005). However, without including prior knowledge about brain
EP
anatomical subdivisions of interest, such an automated decision could potentially prevent a finer parcellation into more locally specific subdivisions. In this work, we investigated multiple parcellations at different scales and determined
AC C
an optimal K value in a machine learning pipeline by identifying local white matter differences that could differentiate the two groups to the largest extent. Potential future directions and limitations of the current work are as fol-
lows. First, we applied the feature statistics of minimum, mean and maximum to approximately locate along-tract abnormality. Advanced feature descriptions, which can characterize along-tract feature distributions and locate exact abnormality position, will potentially increase the discriminative ability of a fiber cluster in differentiating different populations. For example, our and 29
ACCEPTED MANUSCRIPT
other research groups have utilized features, such as percentile feature along fiber tract (Zhang et al., 2016), return-to-the-origin probability (RTOP) fea-
RI PT
ture from multiple-shell-dMRI-based tractography (Zhang et al., 2017c), and along-tract diffusion distribution feature (Colby et al., 2012), to study between-
group white matter alterations. Also, the number of fibers within fiber tracts
has been used to measure structural connectivity differences between ASD and
SC
other developmental disorders (Conti et al., 2017). These features and many oth-
ers provide diffusion property descriptions from different perspectives, thus are expected to extend and improve the proposed method to study other neurolog-
M AN U
ical diseases/disorders. In addition, the study ASD and TDC populations were matched on age, gender and IQ so that the two groups did not differ on these factors. Therefore, we did not conduct feature control on these variables that could be mediated by brain connectivity. However, controlling features could reduce their influence to focus on the neuropathology of autism to a greater extent. For example, diffusion parameters change rapidly during childhood and adolescence and such age-related changes could provide important longitudinal information
D
to study brain white matter. Third, in this study, we created one fiber clus-
TE
tering atlas from the whole population so that the same atlas was used in all cross-validation runs (which enabled identification of discriminative clusters in the population). A future investigation could include building an atlas from a
EP
separate cohort or applying a leave-testing-subjects-out atlas creation approach to further cross-validate the machine-learning-based classification. However, for our current study, we believe that the usage of the whole-population-based atlas
AC C
should have effects that were too minimal to generate any subject biases during the cross-validation, given the facts that the atlas was created from a large population (evenly distributed groups with matched ages and same genders), and the atlas only used the geometric information from a small subset of the tractography per subject (5,000 fibers per subject or on average 1.5% of the fibers from each subject). Fourth, to investigate focal diffusion measurements corresponding to small parcels of the white matter, we created fine scale white matter parcellations therefore higher dimensional features (e.g. K = 4697). 30
ACCEPTED MANUSCRIPT
These feature dimensions were relatively higher than the sample size (n = 149) thus potentially introduced model overfitting or subject bias. In our current
RI PT
study, we applied a 10-fold cross-validation approach with a feature selection process to reduce any potential overfitting or subject bias effects from the high
feature dimensions. However, a further study could include an investigation of the method using a larger cohort. Fifth, the more sensitive UKF tracking
SC
method may introduce more false positive or anatomically incorrect errors com-
pared to a standard fiber tracking method. While it is commonly considered impossible to assess the existence of such errors without ground truth, we be-
M AN U
lieve our outlier removals have ameliorated the false positive fibers to a certain extent. The groupwise fiber trajectory-based computation can naturally enable the rejection of the fibers that were improbable (trajectory-dissimilar) in the cluster. However, this method cannot remove apparent tractography errors that are present in all subjects while being inconsistent with expected anatomy. For instance, in Figure 8(e) we found fiber clusters connecting the cortex and
D
cerebellum in the same hemisphere.
TE
5. Conclusion
In this paper, we have demonstrated a fiber-clustering-based whole brain white matter connectivity analysis with the application of automated ASD clas-
EP
sification. Our results indicate the potential of a machine learning pipeline based on white matter fiber clustering. The method enables high white matter
AC C
parcellation consistency across the population and assessment of alterations in multiple diffusion features. The proposed method may provide a general framework for observing white matter connectivity alterations between disordered and control groups. In addition, the method works in data-driven manner to identify tracts that may be most affected by pathology in whole-brain tractography. This allows further hypothesis-driven research on certain white matter structures from the whole brain. We note the related software is publicly available, enabling its future application to other diffusion MRI datasets.
31
ACCEPTED MANUSCRIPT
6. Acknowledgements
RI PT
We gratefully acknowledge funding provided by the following National Institutes of Health (NIH) grants: R01 MH092862, U01 CA199459, R03 NS088301,
P41 EB015898 National Center for Image-Guided Therapy, R01 MH074794, R01 MH097979, P41 EB015902 Neuroimage Analysis Center, U54 HD86984,
RC1MH08879, a grant from the Pennsylvania Department of Health (SAP #
SC
4100047863), the Australian Research Council (ARC) grants, and a NARSAD Young Investigator Award from the Brain and Behavior Research Foundation,
M AN U
grant number 22591, to P.S.
7. References
Adluru, N., Hinrichs, C., Chung, M. K., Lee, J.-E., Singh, V., Bigler, E. D., Lange, N., Lainhart, J. E., Alexander, A. L., 2009. Classification in DTI using shapes of white matter tracts. In: Annual International Conference of
D
the IEEE Engineering in Medicine and Biology Society. IEEE, pp. 2719–2722. Alexander, A. L., Lee, J. E., Lazar, M., Boudos, R., DuBray, M. B., Oakes,
TE
T. R., Miller, J. N., Lu, J., Jeong, E.-K., McMahon, W. M., Bigler, E., JE., L., 2007. Diffusion tensor imaging of the corpus callosum in autism.
EP
Neuroimage 34 (1), 61–73.
Ameis, S., Catani, M., 2015. Altered white matter connectivity as a neural substrate for social impairment in autism spectrum disorder. Cortex 62, 158–
AC C
181.
Assaf, Y., Pasternak, O., 2008. Diffusion tensor imaging (DTI)-based white matter mapping in brain research: a review. Journal of molecular neuroscience 34 (1), 51–61.
Association, A. P., 2000. Diagnostic criteria from DSM-IV-TR. American Psychiatric Pub.
32
ACCEPTED MANUSCRIPT
Basser, P. J., Pajevic, S., Pierpaoli, C., Duda, J., Aldroubi, A., 2000. In vivo fiber tractography using DT-MRI data. Magnetic resonance in medicine
RI PT
44 (4), 625–632.
Baumgartner, C., Michailovich, O., Levitt, J., Pasternak, O., Bouix, S., Westin,
C.-F., Rathi, Y., 2012. A unified tractography framework for comparing diffusion models on clinical scans. In: Computational Diffusion MRI Workshop
SC
of International Conference on Medical Image Computing and Computer Assisted Intervention. pp. 27–32.
M AN U
Billeci, L., Calderoni, S., Tosetti, M., Catani, M., Muratori, F., 2012. White matter connectivity in children with autism spectrum disorders: a tract-based spatial statistics study. BMC neurology 12 (1), 1–16.
Brito, A. R., Vasconcelos, M. M., Domingues, R. C., da Cruz Jr, H., Celso, L., Rodrigues, L. d. S., Gasparetto, E. L., Cal¸cada, C. A. B. P., 2009. Diffusion tensor imaging findings in school-aged autistic children. Journal of
D
Neuroimaging 19 (4), 337–343.
Catani, M., Jones, D. K., Daly, E., Embiricos, N., Deeley, Q., Pugliese, L.,
TE
Curran, S., Robertson, D., Murphy, D. G., 2008. Altered cerebellar feedback projections in Asperger syndrome. Neuroimage 41 (4), 1184–1191.
EP
Chang, C.-C., Lin, C.-J., 2011. Libsvm: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2 (3), 27.
AC C
Chen, Z., Tie, Y., Olubiyi, O., Rigolo, L., Mehrtash, A., Norton, I., Pasternak, O., Rathi, Y., Golby, A. J., O’Donnell, L. J., 2015. Reconstruction of the arcuate fasciculus for surgical planning in the setting of peritumoral edema using two-tensor unscented Kalman filter tractography. NeuroImage: Clinical 7, 815–822.
Chen, Z., Tie, Y., Olubiyi, O., Zhang, F., Mehrtash, A., Rigolo, L., Kahali, P., Norton, I., Pasternak, O., Rathi, Y., Golby, A. J., O’Donnell, L. J., 2016. Corticospinal tract modeling for neurosurgical planning by tracking through 33
ACCEPTED MANUSCRIPT
regions of peritumoral edema and crossing fibers using two-tensor unscented
ology and surgery 11 (8), 1475–1486.
RI PT
Kalman filter tractography. International journal of computer assisted radi-
Cheon, K.-A., Kim, Y.-S., Oh, S.-H., Park, S.-Y., Yoon, H.-W., Herrington, J.,
Nair, A., Koh, Y.-J., Jang, D.-P., Kim, Y.-B., Leventhal, B., Cho, Z., Castellanos, F., Schultz, R., 2011. Involvement of the anterior thalamic radiation
imaging study. Brain research 1417, 77–86.
SC
in boys with high functioning autism spectrum disorders: a diffusion tensor
M AN U
Cheung, C., Chua, S., Cheung, V., Khong, P., Tai, K., Wong, T., Ho, T., McAlonan, G., 2009. White matter fractional anisotrophy differences and correlates of diagnostic symptoms in autism. Journal of Child Psychology and Psychiatry 50 (9), 1102–1112.
Ciccarelli, O., Catani, M., Johansen-Berg, H., Clark, C., Thompson, A., 2008. Diffusion-based tractography in neurological disorders: concepts, applica-
D
tions, and future developments. The Lancet Neurology 7 (8), 715–727. Colby, J. B., Soderberg, L., Lebel, C., Dinov, I. D., Thompson, P. M., Sowell,
TE
E. R., 2012. Along-tract statistics allow for enhanced tractography analysis. Neuroimage 59 (4), 3227–3242.
EP
Constantino, J. N., Davis, S. A., Todd, R. D., Schindler, M. K., Gross, M. M., Brophy, S. L., Metzger, L. M., Shoushtari, C. S., Splinter, R., Reich, W., 2003. Validation of a brief quantitative measure of autistic traits: comparison
AC C
of the social responsiveness scale with the autism diagnostic interview-revised. Journal of autism and developmental disorders 33 (4), 427–433.
Conti, E., Mitra, J., Calderoni, S., Pannek, K., Shen, K. K., Pagnozzi, A., Rose, S., Mazzotti, S., Scelfo, D., Tosetti, M., Muratori, F., Cioni, G., Guzzetta, A., 2017. Network over-connectivity differentiates autism spectrum disorder from other developmental disorders in toddlers: A diffusion MRI study. Human brain mapping 38 (5), 2333–2344.
34
ACCEPTED MANUSCRIPT
Deshpande, G., Libero, L. E., Sreenivasan, K. R., Deshpande, H. D., Kana, R. K., 2013. Identification of neural connectivity signatures of autism using
RI PT
machine learning. Frontiers in Human Neuroscience 7, 670.
Ding, Z., Gore, J. C., Anderson, A. W., 2003. Classification and quantification of neuronal fiber pathways using diffusion tensor MRI. Magnetic Resonance in Medicine 49 (4), 716–721.
spectrum disorder. Frontiers in neuroscience 9.
SC
D’Mello, A. M., Stoodley, C. J., 2015. Cerebro-cerebellar circuits in autism
M AN U
Dyrba, M., Ewers, M., Wegrzyn, M., Kilimann, I., Plant, C., Oswald, A., Meindl, T., Pievani, M., Bokde, A. L., Fellgiebel, A., Filippi, M., Hampel, H., Kloppel, S., Hauenstein, K., Kirste, T., Teipel, S. J., 2013. Robust automated detection of microstructural white matter degeneration in Alzheimer’s disease using machine learning classification of multicenter DTI data. PloS one 8 (5), e64925.
D
Fischl, B., 2012. FreeSurfer. Neuroimage 62 (2), 774–781.
Fletcher, P. T., Whitaker, R. T., Tao, R., DuBray, M. B., Froehlich, A.,
TE
Ravichandran, C., Alexander, A. L., Bigler, E. D., Lange, N., Lainhart, J. E., 2010. Microstructural connectivity of the arcuate fasciculus in adolescents
EP
with high-functioning autism. Neuroimage 51 (3), 1117–1125. Garyfallidis, E., Brett, M., Correia, M. M., Williams, G. B., Nimmo-Smith, I., 2012. Quickbundles, a method for tractography simplification. Frontiers in
AC C
neuroscience 6.
Ge, B., Guo, L., Li, K., Li, H., Faraco, C., Zhao, Q., Miller, S., Liu, T., 2010. Automatic clustering of white matter fibers based on symbolic sequence analysis. In: SPIE Medical Imaging. International Society for Optics and Photonics, pp. 762327–762327–8.
Goch, C. J., Oztan, B., Stieltjes, B., Henze, R., Hering, J., Poustka, L., Meinzer, H.-P., Yener, B., Maier-Hein, K. H., 2014. Global changes in the connectome 35
ACCEPTED MANUSCRIPT
in autism spectrum disorders. In: International Conference on Medical Image Computing and Computer Assisted Intervention - Computational Diffusion
RI PT
MRI and Brain Connectivity. pp. 239–247.
Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov,
J. P., Coller, H., Loh, M. L., Downing, J. R., Caligiuri, M. A., Bloomfield,
C. D., Lander, E. S., 1999. Molecular classification of cancer: class discovery
SC
and class prediction by gene expression monitoring. Science 286 (5439), 531– 537.
M AN U
Gong, G., He, Y., Concha, L., Lebel, C., Gross, D. W., Evans, A. C., Beaulieu, C., 2009. Mapping anatomical connectivity patterns of human cerebral cortex using in vivo diffusion tensor imaging tractography. Cerebral cortex 19 (3), 524–536.
Guyon, I., Elisseeff, A., 2003. An introduction to variable and feature selection. Journal of machine learning research 3 (Mar), 1157–1182.
D
Hagmann, P., Cammoun, L., Gigandet, X., Meuli, R., Honey, C. J., Wedeen, V. J., Sporns, O., 2008. Mapping the structural core of human cerebral cortex.
TE
PLoS biology 6 (7), e159.
Herbert, M., Ziegler, D., Deutsch, C., O’brien, L., Lange, N., Bakardjiev, A.,
EP
Hodgson, J., Adrien, K., Steele, S., Makris, N., Kennedy, D. N., Harris, G. J., Caviness, V. S., 2003. Dissociations of cerebral cortex, subcortical and
AC C
cerebral white matter volumes in autistic boys. Brain 126 (5), 1182–1192. Herbert, M. R., Ziegler, D., Deutsch, C., O’Brien, L., Kennedy, D., Filipek, P., Bakardjiev, A., Hodgson, J., Takeoka, M., Makris, N., Caviness, V. J., 2005. Brain asymmetries in autism and developmental language disorder: a nested whole-brain analysis. Brain 128 (1), 213–226.
Hong, S., Ke, X., Tang, T., Hang, Y., Chu, K., Huang, H., Ruan, Z., Lu, Z., Tao, G., Liu, Y., 2011. Detecting abnormalities of corpus callosum connectivity in
36
ACCEPTED MANUSCRIPT
autism using magnetic resonance imaging and diffusion tensor tractography.
RI PT
Psychiatry Research: Neuroimaging 194 (3), 333–339. Hoppenbrouwers, M., Vandermosten, M., Boets, B., 2014. Autism as a disconnection syndrome: A qualitative and quantitative review of diffusion tensor imaging studies. Research in Autism Spectrum Disorders 8 (4), 387–412.
Horsfield, M. A., Jones, D. K., 2002. Applications of diffusion-weighted and
SC
diffusion tensor MRI to white matter diseases–a review. NMR in Biomedicine 15 (7-8), 570–577.
M AN U
Ingalhalikar, M., Parker, D., Bloy, L., Roberts, T., Verma, R., 2011. Diffusion based abnormality markers of pathology: toward learned diagnostic prediction of ASD. Neuroimage 57 (3), 918–927.
Johnson, R. T., Yeatman, J. D., Wandell, B. A., Buonocore, M. H., Amaral, D. G., Nordahl, C. W., 2014. Diffusion properties of major white matter tracts in young, typically developing children. Neuroimage 88, 143–154.
D
Keller, T. A., Kana, R. K., Just, M. A., 2007. A developmental study of the
TE
structural integrity of white matter in autism. Neuroreport 18 (1), 23–27. Kohavi, R., 1995. A study of cross-validation and bootstrap for accuracy estimation and model selection. In: International Joint Conference on Artificial
EP
Intelligence (IJCAI). Vol. 14. pp. 1137–1145. Kotz, K. M., Watkins, M. W., McDermott, P. A., 2008. Validity of the General
AC C
Conceptual Ability score from the Differential Ability Scales as a function of significant and rare interfactor variability. School Psychology Review 37 (2), 261.
Krstajic, D., Buturovic, L. J., Leahy, D. E., Thomas, S., 2014. Cross-validation pitfalls when selecting and assessing regression and classification models. Journal of cheminformatics 6 (1), 10.
37
ACCEPTED MANUSCRIPT
Lange, N., DuBray, M. B., Lee, J. E., Froimowitz, M. P., Froehlich, A., Adluru, N., Wright, B., Ravichandran, C., Fletcher, P. T., Bigler, E. D., 2010. Atypical
RI PT
diffusion tensor hemispheric asymmetry in autism. Autism research 3 (6), 350–358.
Langen, M., Leemans, A., Johnston, P., Ecker, C., Daly, E., Murphy, C. M., dell’Acqua, F., Durston, S., Murphy, D. G., Consortium, A., Murphy, D. G.,
SC
2012. Fronto-striatal circuitry and inhibitory control in autism: findings from diffusion tensor imaging tractography. Cortex 48 (2), 183–193.
Libero, L. E., DeRamus, T. P., Lahti, A. C., Deshpande, G., Kana, R. K., 2015.
M AN U
Multimodal neuroimaging based classification of autism spectrum disorder using anatomical, neurochemical, and white matter correlates. Cortex 66, 46–59.
Liu, X., Lauer, K. K., Ward, D., Roberts, C., Liu, S., Suneeta, G., Rohloff, R., Gross, W., Xu, Z., Chen, G., Binder, J., Li, S.-J., Hudetz, A. G., 2017. Finegrained parcellation of brain connectivity improves differentiation of states of
D
consciousness during graded propofol sedation. Brain Connectivity.
TE
Lo, Y.-C., Chen, Y.-J., Hsu, Y.-C., Tseng, W.-Y. I., Gau, S. S.-F., 2016. Reduced tract integrity of the model for social communication is a neural substrate of social communication deficits in autism spectrum disorder. Journal
EP
of Child Psychology and Psychiatry(in press). Lord, C., Risi, S., Lambrecht, L., Cook Jr, E. H., Leventhal, B. L., DiLa-
AC C
vore, P. C., Pickles, A., Rutter, M., 2000. The autism diagnostic observation schedule–generic: A standard measure of social and communication deficits associated with the spectrum of autism. Journal of autism and developmental disorders 30 (3), 205–223.
Lord, C., Rutter, M., Le Couteur, A., 1994. Autism diagnostic interview-revised: a revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. Journal of autism and developmental disorders 24 (5), 659–685. 38
ACCEPTED MANUSCRIPT
Makris, N., Meyer, J. W., Bates, J. F., Yeterian, E. H., Kennedy, D. N., Caviness, V. S., 1999. MRI-based topographic parcellation of human cerebral white
connectivity. Neuroimage 9 (1), 18–45.
RI PT
matter and nuclei: II. Rationale and applications with systematics of cerebral
Malcolm, J. G., Shenton, M. E., Rathi, Y., 2010. Filtered multitensor tractography. IEEE Transactions on Medical Imaging 29 (9), 1664–1675.
SC
Meskaldji, D. E., Fischi-Gomez, E., Griffa, A., Hagmann, P., Morgenthaler, S., Thiran, J.-P., 2013. Comparing connectomes across subjects and populations
M AN U
at different scales. NeuroImage 80, 416–425.
Mishra, D., Sahu, B., 2011. Feature selection for cancer classification: a signalto-noise ratio approach. International Journal of Scientific & Engineering Research 2 (4), 1–7.
Moberts, B., Vilanova, A., Van Wijk, J. J., 2005. Evaluation of fiber clustering methods for diffusion tensor imaging. In: VIS 05. IEEE Visualization. IEEE,
D
pp. 65–72.
TE
Mostapha, M., Casanova, M., Gimel’farb, G., El-Baz, A., 2015. Towards noninvasive image-based early diagnosis of autism. In: International Conference on Medical Image Computing and Computer Assisted Intervention. pp. 160–
EP
168.
Nagae, L., Zarnow, D., Blaskey, L., Dell, J., Khan, S., Qasmieh, S., Levy, S.,
AC C
Roberts, T., 2012. Elevated mean diffusivity in the left hemisphere superior longitudinal fasciculus in autism spectrum disorders increases with more profound language impairment. American Journal of Neuroradiology 33 (9), 1720–1725.
Norton, I., Essayed, I., Zhang, F., Pujol, S., Yarmarkovich, A., Golby, A., Kindlmann, G., Wasserman, D., Johnson, J., Westin, C.-F., O’Donnell, L. J., 2017. SlicerDMRI: Open source diffusion mri software for brain cancer research. Cancer Research. 39
ACCEPTED MANUSCRIPT
O’Donnell, L. J., 2006. Cerebral white matter analysis using diffusion imaging.
RI PT
Ph.D. thesis, Massachusetts Institute of Technology. O’Donnell, L. J., Golby, A., Westin, C.-F., 2013. Fiber clustering versus the parcellation-based connectome. NeuroImage 80, 283–289.
O’Donnell, L. J., Suter, Y., Rigolo, L., Kahali, P., Zhang, F., Norton, I., Albi,
A., Olubiyi, O., Meola, A., Essayed, W. I., Unadkat, P., Ciris, P. A., Wells III,
SC
W. M., Rathi, Y., Westin, C.-F., Golby, A. J., 2017. Automated white matter fiber tract identification in patients with brain tumors. NeuroImage: Clinical
M AN U
13, 138–153.
O’Donnell, L. J., Wells III, W., Golby, A., Westin, C.-F., 2012. Unbiased groupwise registration of white matter tractography. In: International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI). pp. 123–130.
O’Donnell, L. J., Westin, C.-F., 2007. Automatic tractography segmentation
D
using a high-dimensional white matter atlas. IEEE Transactions on Medical
TE
Imaging 26 (11), 1562–1575.
O’Donnell, L. J., Westin, C.-F., Golby, A. J., 2009. Tract-based morphometry for white matter group analysis. Neuroimage 45 (3), 832–844.
EP
Pardini, M., Elia, M., Garaci, F. G., Guida, S., Coniglione, F., Krueger, F., Benassi, F., Gialloreti, L. E., 2012. Long-term cognitive and behavioral ther-
AC C
apies, combined with augmentative communication, are related to uncinate fasciculus integrity in autism. Journal of autism and developmental disorders 42 (4), 585–592.
Plant, C., Teipel, S. J., Oswald, A., Bohm, C., Meindl, T., Mourao-Miranda, J., Bokde, A. W., Hampel, H., Ewers, M., 2010. Automated detection of brain atrophy patterns based on MRI for the prediction of Alzheimer’s disease. Neuroimage 50 (1), 162–174.
40
ACCEPTED MANUSCRIPT
Pugliese, L., Catani, M., Ameis, S., Dell’Acqua, F., de Schotten, M. T., Murphy, C., Robertson, D., Deeley, Q., Daly, E., Murphy, D. G., 2009. The anatomy
RI PT
of extended limbic pathways in Asperger syndrome: a preliminary diffusion tensor imaging tractography study. Neuroimage 47 (2), 427–434.
Ratnarajah, N., Rifkin-Graboi, A., Fortier, M. V., Chong, Y. S., Kwek, K.,
Saw, S.-M., Godfrey, K. M., Gluckman, P. D., Meaney, M. J., Qiu, A., 2013.
SC
Structural connectivity asymmetry in the neonatal brain. Neuroimage 75, 187–194.
M AN U
Robinson, E. C., Hammers, A., Ericsson, A., Edwards, A. D., Rueckert, D., 2010. Identifying population differences in whole-brain structural networks: a machine learning approach. NeuroImage 50 (3), 910–919.
Sanguinetti, G., Laidler, J., Lawrence, N. D., 2005. Automatic determination of the number of clusters using spectral algorithms. In: IEEE Workshop on Machine Learning for Signal Processing. IEEE, pp. 55–60.
D
Travers, B. G., Adluru, N., Ennis, C., Tromp, D. P., Destiche, D., Doran, S., Bigler, E. D., Lange, N., Lainhart, J. E., Alexander, A. L., 2012. Diffusion
289–313.
TE
tensor imaging in autism spectrum disorder: a review. Autism Research 5 (5),
EP
Travers, B. G., Bigler, E. D., Tromp, D. P., Adluru, N., Destiche, D., Samsin, D., Froehlich, A., Prigge, M. D., Duffield, T. C., Lange, N., Alexander, A. L., Lainhart, J. E., 2015a. Brainstem white matter predicts individual dif-
AC C
ferences in manual motor difficulties and symptom severity in autism. Journal of autism and developmental disorders 45 (9), 3030–3040.
Travers, B. G., Tromp, D. P., Adluru, N., Lange, N., Destiche, D., Ennis, C., Nielsen, J. A., Froehlich, A. L., Prigge, M. B., Fletcher, P. T., Anderson, J. S., Zielinski, B. A., Bigler, E. D., Lainhart, J. E., Alexander, A. L., 2015b. Atypical development of white matter microstructure of the corpus callosum in males with autism: a longitudinal investigation. Molecular autism 6 (1), 1.
41
ACCEPTED MANUSCRIPT
van den Heuvel, M. P., Stam, C. J., Boersma, M., Pol, H. H., 2008. Small-world
in the human brain. Neuroimage 43 (3), 528–539.
RI PT
and scale-free organization of voxel-based resting-state functional connectivity
Wang, D., Luo, Y., Mok, V. C., Chu, W. C., Shi, L., 2016. Tractography atlas-
based spatial statistics: Statistical analysis of diffusion tensor image along fiber pathways. NeuroImage 125, 301–310.
SC
Wolff, J. J., Gu, H., Gerig, G., Elison, J. T., Styner, M., Gouttard, S., Bot-
teron, K. N., Dager, S. R., Dawson, G., Estes, A. M., Evans, A., Hazlett, H., Kostopoulos, P., McKinstry, R., Paterson, S., Schultz, R., Zwaigenbaum, L.,
M AN U
Piven, J., 2012. Differences in white matter fiber tract development present from 6 to 24 months in infants with autism. American Journal of Psychiatry 169 (6), 589–600.
Zelnik-Manor, L., Perona, P., 2005. Self-tuning spectral clustering. In: Advances in neural information processing systems. pp. 1601–1608.
Zhang, F., Kahali, P., Suter, Y., Norton, I., Rigolo, L., Savadjiev, P., Song, Y.,
D
Rathi, Y., Cai, W., Wells III, W. M., Golby, A. J., O’Donnell, L. J., 2017a.
TE
Automated connectivity-based groupwise cortical atlas generation: Application to data of neurosurgical patients with brain tumors for cortical parcellation prediction. In: IEEE International Symposium on Biomedical Imaging
EP
(ISBI). IEEE, pp. 774–777.
Zhang, F., Norton, I., Cai, W., Song, Y., Wells III, W. M., O’Donnell, L. J.,
AC C
2017b. Comparison between two white matter segmentation strategies: an investigation into white matter segmentation consistency. In: IEEE International Symposium on Biomedical Imaging (ISBI). IEEE, pp. 796–799.
Zhang, F., Savadjiev, P., Cai, W., Song, Y., Verma, R., Westin, C.-F., O’Donnell, L. J., 2016. Fiber clustering based white matter connectivity analysis for prediction of autism spectrum disorder using diffusion tensor imaging. In: IEEE International Symposium on Biomedical Imaging (ISBI). IEEE, pp. 564–567. 42
ACCEPTED MANUSCRIPT
Zhang, F., Wu, W., Ning, L., McAnulty, G., Waber, D., Gagoski, B., Sarill, K., Hamoda, H., Song, Y., Cai, W., Rathi, Y., O’Donnell, L. J., 2017c.
RI PT
Supra-threshold fiber cluster statistics for data-driven whole brain tractogra-
phy analysis. In: International Conference on Medical Image Computing and
AC C
EP
TE
D
M AN U
SC
Computer-Assisted Intervention (MICCAI). pp. 556–565.
43