Predicting intrinsic brain activity

NeuroImage 82 (2013) 127–136

R. Cameron Craddock a,b,c, Michael P. Milham b,c, Stephen M. LaConte a,d,e,f,⁎

a Virginia Tech Carilion Research Institute, Roanoke, VA, USA
b Center for the Developing Brain, Child Mind Institute, New York, NY, USA
c Nathan S. Kline Institute for Psychiatric Research, Orangeburg, NY, USA
d School of Biomedical Engineering and Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA
e Department of Emergency Medicine, Virginia Tech Carilion School of Medicine, Roanoke, VA, USA
f Department of Radiology, Virginia Tech Carilion School of Medicine, Roanoke, VA, USA

Article info

Article history: Accepted 15 May 2013; Available online 24 May 2013

Keywords: Resting state; Multivariate; Multi-voxel pattern analysis (MVPA); Regression; Functional connectivity; Effective connectivity; Functional magnetic resonance imaging (fMRI)

Abstract

Multivariate supervised learning methods exhibit a remarkable ability to decode externally driven sensory, behavioral, and cognitive states from functional neuroimaging data. Although they are typically applied to task-based analyses, supervised learning methods are equally applicable to intrinsic effective and functional connectivity analyses. The resulting connectivity models incorporate the multivariate interactions between all brain regions simultaneously, yielding a more accurate representation of the connectome than is available with standard bivariate methods. Additionally, the models can be applied to decode, or predict, the time series of intrinsic brain activity of a region from an independent dataset. The obtained prediction accuracy provides a measure of the integration between a brain region and other regions in its network, as well as a method for evaluating acquisition and preprocessing pipelines for resting state fMRI data. This article describes a method for learning multivariate models of connectivity. The method is applied within the nonparametric prediction accuracy, influence, and reproducibility–resampling (NPAIRS) framework to study the regional variation of prediction accuracy and reproducibility (Strother et al., 2002). The resulting spatial distribution of these metrics is consistent with the functional hierarchy proposed by Mesulam (1998). Additionally, we illustrate the utility of the multivariate regression connectivity modeling method for optimizing experimental parameters and assessing the quality of functional neuroimaging data. © 2013 Elsevier Inc. All rights reserved.

⁎ Corresponding author at: Virginia Tech Carilion Research Institute, 2 Riverside Circle, Roanoke, VA 24016, USA. Fax: +1 540 985 3373. E-mail address: [email protected] (S.M. LaConte).
http://dx.doi.org/10.1016/j.neuroimage.2013.05.072

Introduction

Multivariate supervised learning methods, commonly referred to as multi-voxel pattern analysis (MVPA), have shown a remarkable ability to decode externally driven sensory, behavioral, and cognitive states from functional neuroimaging data (Chu et al., 2011a; Cox and Savoy, 2003; Haxby et al., 2001; Haynes and Rees, 2006; Kamitani and Tong, 2005; LaConte et al., 2005; Mitchell et al., 2004; Mourão-Miranda et al., 2005; Polyn et al., 2005; Strother et al., 2002). Although these techniques are typically applied to task-based experimental paradigms, they can also be used to model intrinsic brain activity (Chu et al., 2011b; Friston, 1994; Friston and Frith, 1993). In this setting, a connectivity model is learned from distributed statistical relationships, and is thus capable of decoding (predicting) the intrinsic activity of a brain region.

A multivariate regression connectivity (MRC) model relates the activity measured in a brain region to a linear (or non-linear) combination of the activity measured in every other region of the brain (Friston, 1994; Friston and Frith, 1993). Typically, the number of brain regions to be modeled is much greater than the number of observations, resulting in an underdetermined system of linear equations that cannot be solved uniquely. Methods for dealing with this include dimensionality reduction (Friston, 1994; Friston and Frith, 1993), feature selection, regularization-based algorithms (Ryali et al., 2011; Varoquaux et al., 2010), and frameworks such as statistical learning theory (Chu et al., 2011b). The result is a network model of connectivity that accounts for the complex interactions between all modeled brain regions simultaneously, which is a more accurate representation of the connectome than is achieved with standard bivariate connectivity measures (Varoquaux et al., 2010). Further, the learned models can be applied to independently acquired data to decode a region's intrinsic activity from these new data (Chu et al., 2011b; Varoquaux et al., 2010).

Although the reproducibility of intrinsic functional connectivity (iFC) has been examined (Braun et al., 2012; Shehzad et al., 2009; Wang et al., 2011; Zuo et al., 2010a, 2010b), the ability of iFC models to predict intrinsic brain activity has been largely overlooked (notable exceptions include Chu et al. (2011b) and Varoquaux et al. (2010)). The ability to make accurate predictions is a fundamental criterion for scientific models. Poor prediction accuracy might indicate that a model is missing an important source of information about the phenomena being modeled (i.e. unmodeled variance), makes invalid assumptions, or is inconsistent over time. In the case of connectivity models of brain architecture, unmodeled variance would indicate that there is a component of local intrinsic activity that is not present in other regions of the brain, and that the modeled brain region exhibits a degree of segregation from the rest of the brain. As such, prediction accuracy provides a means to evaluate the amount of information that exists in the brain about a specific brain region, and quantifies the degree to which a brain region is integrated with its iFC network (Kjems et al., 2002; Marrelec et al., 2008; Tononi, 1998). This is in contrast to centrality measures, which assess the functional integration of a brain region based on its number of connections with other regions (Zuo et al., 2011). Similarly, prediction accuracy can be used as a quantitative metric to evaluate the degree to which iFC models differ over time, between subjects, and with experimental paradigms (i.e. task vs. rest). Poor prediction accuracy may also reflect the validity of the modeling framework, and as such, when combined with reproducibility, provides a data-driven approach that can be used to compare modeling frameworks as well as acquisition and analysis parameters (Chu et al., 2011b; Kjems et al., 2002; LaConte et al., 2003; Shaw et al., 2003; Strother et al., 2002). Additionally, the predictive quality of models of iFC provides a mechanism for real-time tracking of iFC for neurofeedback and brain–computer interface applications (LaConte, 2011; LaConte et al., 2007).

This article describes multivariate prediction analysis of intrinsic brain activity measured from functional magnetic resonance imaging (fMRI) data. We have used support vector regression (SVR) (Drucker et al., 1997; Müller et al., 1997; Vapnik et al., 1996), but other multivariate regression methods, such as PLS, ridge regression, Lasso, and elastic-net, could also be used.
We discuss the links between prediction accuracy and integration (Kjems et al., 2002; Marrelec et al., 2008; Tononi, 1998), as well as between reproducibility and the signal-to-noise ratio (SNR) of the regression models (Strother et al., 2002). Regional variations in prediction accuracy and reproducibility were investigated across anatomical and functional subdivisions of the brain. The impacts of subject age, sex, and head motion on these metrics were also evaluated, as was the impact of scan length on model performance. This paper is methodologically related to previous work (Chu et al., 2011a; Friston, 1994; Friston and Frith, 1993). Here, we provide an extensive characterization of regional variation in predictive modeling of intrinsic brain activity using the NPAIRS framework. In addition, we demonstrate the utility of parcellation for reducing the number of regression models required to perform these analyses and for facilitating the anatomical interpretability of results.

Methods

Subjects and scanning

Thirty-three healthy volunteers (age: 19–48, mean 27.2, std. dev. 7.9; 16 females) participated in accordance with Baylor College of Medicine Institutional Review Board policy. To qualify for inclusion, subjects were required to be between the ages of 18 and 65, have no contraindications for MRI, be medication free, and have no current or past neurological or psychiatric conditions. Subjects were scanned on a 3 T Siemens Magnetom TIM Trio scanner (Siemens Medical Solutions USA; Malvern, PA, USA) using a 12-channel head matrix coil. The scanning procedure began with a high-resolution anatomic scan followed by two separate 10-minute resting state fMRI runs (Rest 1 and Rest 2).
Anatomic images were acquired at 1 × 1 × 1 mm³ resolution with a 3D T1-weighted magnetization-prepared rapid acquisition gradient-echo (MPRAGE) sequence (Mugler and Brookman, 1990) using: field of view (FOV) 256 × 256 × 176 mm³, repetition time (TR) 2600 ms, echo time (TE) 3.02 ms, inversion recovery time (TI) 900 ms, flip angle (FA) 8°, phase partial Fourier 6/8, slice partial Fourier 7/8, GRAPPA factor of 2 with 24 reference lines, and bandwidth 130 Hz/pixel. Resting state fMRI data were acquired with a blood oxygenation level dependent (BOLD) contrast weighted gradient-recalled echo-planar imaging (EPI) sequence. Twenty-nine 3.6-mm thick interleaved oblique slices were acquired with a 10% slice gap and the parameters: TR 1750 ms, TE 30 ms, FA 90°, 64 × 64 matrix, 220-mm FOV, in-plane resolution 3.45 × 3.45 mm², anterior-to-posterior phase encoding, and bandwidth 2442 Hz/pixel. Each resting-state scan consisted of 343 functional volumes, lasting approximately 10 min. For the resting state functional scans, subjects were instructed to passively view a fixation cross while clearing their minds of any specific thoughts. This dataset has been made available for non-commercial use through the International Neuroimaging Data-Sharing Initiative (http://fcon_1000.projects.nitrc.org/).

Data preprocessing

Data preprocessing was accomplished using a combination of tools from AFNI (Cox, 1996) and FSL (Smith et al., 2004). Anatomical images were skull stripped, segmented, and then registered to MNI152 space using a two-step procedure, which first calculates a linear registration (FLIRT; Jenkinson et al., 2002) that is subsequently refined using a non-linear registration (FNIRT). Conservative white matter (WM) and cerebrospinal fluid (CSF) masks were derived from the results of segmentation (Zhang et al., 2001) by applying probability ≥ 0.99 thresholds to the WM and CSF probability maps. These maps were downsampled to 4 mm isotropic resolution and subsequently dilated by 1 voxel to further prevent overlap with gray matter. Functional image preprocessing began with slice timing correction and then motion correction. The mean image from each scan was calculated and used to co-register each functional scan to the corresponding anatomic image.
Next, nuisance variable regression was performed by regressing out the WM and CSF time series, the six motion parameters calculated from motion correction, and a 4th-order polynomial to account for baseline drifts (Friston et al., 1996; Lund et al., 2006). The functional-to-anatomical and anatomical-to-MNI152 transformations calculated for each dataset were concatenated to construct a functional-to-MNI152 transform. This transform was applied to the corresponding functional images to write them into MNI space at 4 mm isotropic resolution. The resulting images were smoothed using a 6-mm full width at half maximum (FWHM) Gaussian kernel.

ROI generation

This analysis used 179 ROIs specified by a 2-level temporal correlation based whole brain functional parcellation (Craddock et al., 2012) of resting state data from 198 subjects from the Cambridge dataset, publicly available from the 1000 Functional Connectomes Project (Biswal et al., 2010). This independent dataset was used to exclude any bias that might be introduced by generating ROIs from the same data that were analyzed. The Cambridge dataset was preprocessed identically to our experimental data, except that it was band-pass filtered (0.001 Hz < f < 0.08 Hz) after nuisance variable regression, as in Craddock et al. (2012). The parcellation procedure produced 196 ROIs, which were reduced to 179 ROIs after excluding regions in the cerebellum and brain stem (Fig. 1A).

Fig. 1. The NPAIRS framework for estimating reproducibility and prediction accuracy for whole brain effective connectivity models. The NPAIRS procedure is performed for each of the 179 ROIs obtained from the whole brain parcellation of the cerebrum (Craddock et al., 2012) (A). ROI colors were arbitrarily assigned to help differentiate regions; disconnected regions of similar color correspond to different ROIs (ROIs are spatially contiguous). The NPAIRS framework (B) consists of two phases. In the first phase, the first resting state scan (Rest 1) is used as training data and the second resting state scan (Rest 2) is used for testing. A time series for an ROI is extracted from the training data and submitted, along with the voxel-wise whole brain data with the ROI voxels excluded (features), to estimate a network model. Prediction is performed by applying the learned network model to the features of the testing data to decode a predicted time series for the ROI. An estimate of the model's prediction accuracy is obtained by comparing the predicted time series to the observed ROI time series extracted from the testing data. The second phase of NPAIRS is identical to the first except that the roles of Rest 1 and Rest 2 are swapped. Reproducibility is estimated by comparing the network models calculated from the two datasets. Black boxes correspond to algorithmic manipulations and boxes with rounded corners correspond to data and parameters. Boxes colored orange are dependent on, or derived from, Rest 1 and maroon boxes are derived from Rest 2; boxes that are colored orange and maroon are derived from both datasets.

Multivariate model of intrinsic brain function

The application of support vector regression to model intrinsic brain function follows the analysis of task-based fMRI experiments, in which the goal is to learn the relationship between a vector of features (from each brain image) and scalar labels (of the task conditions) (Cox and Savoy, 2003; LaConte et al., 2005; Mourão-Miranda et al., 2005). The crux of the approach used here is that the labels are a time series of intrinsic measures of brain activity and, for each model, the features contributing to that time series are excluded. The dataset X ∈ ℝ^(P × N) is comprised of N fMRI volumes, each of which consists of P voxels. A time series of interest (TOI), yr, is extracted from the rth (r ∈ R) brain region of X by averaging the time series of all in-region voxels. The R brain regions to be modeled correspond to the regions in the parcellation atlas (Fig. 1A). Since the TOI is extracted from X, it is important that the voxels used to derive the TOI be excluded from further analysis; X¬r is used to denote X with the voxels corresponding to region r removed. A linear model of connectivity treats the activity of a TOI as a linear combination of the activity measured at all other brain regions:

yr = wᵀX¬r + b1 + ξ, (1)

where w is a column vector of connectivity strengths from the voxels in X¬r to the rth region, b is a constant offset scaling 1 (an N-length vector of ones), and ξ is an error term (Friston, 1994; Friston and Frith, 1993). The system of linear equations is underdetermined if the number of voxels in X¬r is larger than N, the number of fMRI volumes. We addressed this issue using support vector regression.

Support vector regression (SVR) is a generalization of support vector classification (Boser et al., 1992; Drucker et al., 1997; Müller et al., 1997; Smola and Schölkopf, 2004; Vapnik et al., 1996) that provides a way to learn the model parameters w and b for a region r from a training dataset. Similar to other regularized regression approaches, SVR addresses over-fitting in underdetermined problems by incorporating a regularization term to select a unique solution with minimal complexity. The three parameters in SVR are 1) the type of kernel (which provides a computationally efficient mechanism for non-linear models), 2) a parameter C (which controls the cost of training errors), and 3) a parameter ε (which provides a mechanism for tolerating small fitting errors). Once w and b are determined, the model can be applied to an independent dataset X′¬r to decode a time series y′r using the prediction equation:

y′r = wᵀX′¬r + b. (2)

SVR was performed using the 3dsvm tool (LaConte et al., 2005), distributed with AFNI, which is based on the SVMlight implementation of SVM (Joachims, 1999). Smoothing (in this case at 6 mm FWHM) raises a potential concern about information leak to neighboring ROIs, which could affect training.
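To make Eqs. (1) and (2) concrete, the sketch below fits and applies a model of this form on synthetic data. It uses ridge regression, one of the alternatives to SVR noted in the Introduction, in its kernel (dual) form, since with far more voxels than volumes the N × N Gram matrix is cheap to work with; the dimensions, data, and regularization strength λ are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
n_vols, n_vox = 200, 5000      # N volumes, P voxels with the ROI removed (P >> N)
X_train = rng.standard_normal((n_vols, n_vox))   # stand-in for X_notr, volumes x voxels
w_true = rng.standard_normal(n_vox) / np.sqrt(n_vox)
y_train = X_train @ w_true + 0.1 * rng.standard_normal(n_vols)  # TOI y_r

# Kernel (dual) ridge regression: solve in the N x N sample space,
# then map back to a P-length voxel weight vector w, as in Eq. (1).
lam = 10.0                                        # assumed regularization strength
b = y_train.mean()                                # constant offset
G = X_train @ X_train.T                           # N x N Gram matrix
alpha = np.linalg.solve(G + lam * np.eye(n_vols), y_train - b)
w = X_train.T @ alpha                             # voxel-wise connectivity weights

# Eq. (2): decode the region's time series from an independent dataset X'.
X_test = rng.standard_normal((n_vols, n_vox))
y_decoded = X_test @ w + b
```

In the actual analyses the weights came from 3dsvm (linear-kernel SVR); the closed-form ridge solution here simply makes the underdetermined-system argument concrete.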


To avoid this leakage, a two-voxel border around the ROI was excluded from the training features. Prior to training, the TOI (yr) was normalized to zero mean and unit variance to provide a frame of reference for interpreting training error. Since using different SVR parameters for different brain regions would make it difficult to interpret differences in prediction performance between the brain regions, the SVR parameters were fixed at: linear kernel, ε = 0.1, and C = 100 for all regions. These were chosen based on the previous observation that SVM performance on fMRI data is insensitive to C (LaConte et al., 2005), and the desire to cap acceptable training error at 10% of the standard deviation. Alternatively, cross-validation could be used to choose values for ε and C that maximize prediction accuracy, model performance could be averaged across a range of parameters (Chu et al., 2011b), or the parameters could be chosen analytically (Cherkassky and Ma, 2004). The MRC method results in models of effective connectivity, since it provides "a precise mathematical specification of the nature of the proposed functional relationship in a way that functional connectivity does not" (Stone and Kötter, 2002). The feature-specific "effective connectivities" (Friston, 1994) derived from MRC training are similar to semi-partial correlations and measure the unique contribution of the feature's time series to the TOI.

Prediction accuracy and reproducibility

Prediction accuracy and reproducibility of the network models for each of the 179 cerebral brain regions (Fig. 1A), and each subject, were estimated from the two independently acquired resting state scans (Rest 1 and Rest 2) using the nonparametric prediction accuracy, influence, and reproducibility resampling (NPAIRS) framework (Kjems et al., 2002; LaConte et al., 2003; Shaw et al., 2003; Strother et al., 2002) illustrated in Fig. 1B. The NPAIRS framework proceeds in two phases.
In the first phase, Rest 1 was used as training data and Rest 2 was used as testing data. A TOI for the region under inquiry was extracted from the training data by averaging the within-ROI voxel time series. The TOI and the training data were submitted to 3dsvm along with a mask that excludes the in-ROI voxels from the input feature space. The resulting model was applied to the testing data to decode a time series for the ROI. This predicted time series was compared to an ROI time series extracted from the testing data (by averaging the time series of the within-ROI voxels) to estimate the prediction accuracy of the model. In the second phase of the NPAIRS framework, Rest 2 was used as the training data, Rest 1 was used as the testing data, and the procedure was repeated. Afterwards, the model learned from Rest 1 was compared to the model learned from Rest 2 to estimate reproducibility. The prediction accuracy estimates from phase 1 and phase 2 were averaged to generate a single estimate of prediction accuracy per region, per subject. Prediction accuracy provides a measure of how well a model learned from a training set generalizes to independently acquired data. In the context of task-based experiments, prediction accuracy has been interpreted as the amount of information present in the model about a task (Kjems et al., 2002; Kriegeskorte et al., 2006). In the case of intrinsic brain activity, this equates to the amount of information present in the brain about the activity of a region under inquiry. The link between prediction accuracy and mutual information is particularly relevant since mutual information has been proposed as a measure of functional integration (Marrelec et al., 2008; Tononi, 1998). To remain consistent with this interpretation of prediction accuracy, and with Chu et al. (2011b), prediction accuracy was quantified using Pearson's correlation coefficient.
This measure focuses on patterns of oscillations in the time courses and is insensitive to differences in scale and offset in the prediction. This is reasonable given that BOLD-based fMRI is not a quantitative technique, and the scale and baseline level of the signal are likely to vary substantially between scanning sessions. Alternatively, prediction accuracy could have been measured using mean squared error, which is sensitive to differences in scale and offset.
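In code, this choice of metric amounts to a single Pearson correlation between the observed and decoded time series; the toy series below (343 points, matching one run) are illustrative, not real data.

```python
import numpy as np

def prediction_accuracy(y_observed, y_predicted):
    """Pearson correlation between the observed and decoded TOI."""
    return np.corrcoef(y_observed, y_predicted)[0, 1]

rng = np.random.default_rng(0)
y_obs = rng.standard_normal(343)                      # one run: 343 volumes
y_hat = 0.8 * y_obs + 0.3 * rng.standard_normal(343)  # imperfect prediction

acc = prediction_accuracy(y_obs, y_hat)
# Pearson r is invariant to linear rescaling and offsets of the prediction,
# unlike mean squared error:
acc_shifted = prediction_accuracy(y_obs, 5.0 * y_hat + 2.0)
```

The invariance to scale and offset is exactly the property the text appeals to: a prediction that tracks the BOLD fluctuations but at a different baseline or gain still scores well.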

Reproducibility evaluates the similarity between models learned from independent datasets. Differences between models estimated from independent datasets can be interpreted as noise, and similarity between models as signal; hence reproducibility can be reformulated as a measure of signal-to-noise ratio (Strother et al., 2002). Any similarity metric can be used as a measure of reproducibility, and Pearson's correlation was used here to remain consistent with Strother et al. (2002).

Null distributions for prediction accuracy and reproducibility

Null distributions for prediction accuracy and reproducibility were generated using a non-parametric permutation technique. The NPAIRS analysis described in the Prediction accuracy and reproducibility section was repeated, except that prior to training the TOI was scrambled using wavestrapping (Bullmore et al., 2004), implemented using the Wavelet Toolbox in MATLAB R2010b (The MathWorks, Natick, MA). The TOI was decomposed using a 6-level discrete wavelet decomposition and the 4th-order Daubechies wavelet. The detail coefficients at each decomposition level were randomly permuted, leaving the approximation coefficients untouched, and the inverse discrete wavelet transform was performed to generate a surrogate TOI. Prediction accuracy and reproducibility estimated from the surrogate TOIs (across all regions and subjects) form the null distribution. The statistical significance of the prediction accuracy and reproducibility estimates obtained for the actual data was obtained by calculating the frequency of obtaining an equal or greater value from the corresponding null distribution. The resulting p values were FDR corrected for the number of ROIs and subjects (Benjamini and Hochberg, 1995).

Results

The voxel weights in the MRC model can be positive or negative, and their absolute value indicates the relative importance of each feature for decoding the activity of the ROI (see Fig. 2A for an illustration).
In this sense, the feature weights have a similar interpretation to voxel-specific correlation coefficients (or beta-values) from a standard seed-based functional connectivity analysis, except that they reflect only direct interactions. Models constructed using an ROI in the ventral posterior cingulate cortex (vPCC) from the Rest 1 datasets were combined across subjects using a one-sample t-test and are presented in Fig. 2A. The spatial pattern of the model is very similar to the canonical default mode network. Fig. 2B illustrates observed and predicted time courses for the vPCC for the subjects that exhibited the best and the worst prediction accuracy. The network model is able to follow high frequency components of the time series, as well as large deviations from the mean, quite well. Similar comparisons for other regions of the brain are available in the Supplementary information. The results of the NPAIRS procedure are summarized in Fig. 3. The prediction accuracy and reproducibility results are significantly better than the null distribution for these data. Note that prediction accuracy is high even for the worst performing ROI and subject combinations, 95% of which have prediction accuracy correlation coefficients greater than 0.72. Reproducibility, on the other hand, does not fare as well, with 95% of the values below 0.46. The 95th percentiles for the null distribution are 0.18 for prediction accuracy and 0.04 for reproducibility. We observe a non-linear relationship between prediction accuracy and reproducibility: models with lower reproducibility can still achieve high prediction accuracy. The elongated shape of the null distribution can be explained by the smaller number of time points used to calculate prediction accuracy compared to the number of voxels used to calculate reproducibility; the variance of Pearson's correlation coefficients is inversely proportional to the number of observations used to estimate the coefficient.
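The wavestrapping idea behind this null distribution, permuting the detail coefficients at each level of a wavelet decomposition while leaving the approximation coefficients untouched, can be sketched with an orthonormal Haar transform standing in for the 4th-order Daubechies wavelet used in the paper; the series length and content here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def haar_dwt(x):
    """One level of the orthonormal Haar transform: pairwise sums/differences."""
    x = x.reshape(-1, 2)
    approx = (x[:, 0] + x[:, 1]) / np.sqrt(2.0)
    detail = (x[:, 0] - x[:, 1]) / np.sqrt(2.0)
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt (perfect reconstruction)."""
    out = np.empty(approx.size * 2)
    out[0::2] = (approx + detail) / np.sqrt(2.0)
    out[1::2] = (approx - detail) / np.sqrt(2.0)
    return out

def wavestrap(x, levels=6):
    """Permute detail coefficients at each level; keep the approximation."""
    approx, details = x.astype(float), []
    for _ in range(levels):
        approx, d = haar_dwt(approx)
        details.append(rng.permutation(d))
    for d in reversed(details):
        approx = haar_idwt(approx, d)
    return approx

toi = np.sin(np.linspace(0.0, 20.0, 320)) + rng.standard_normal(320)
surrogate = wavestrap(toi)
```

Because the transform is orthonormal and permutation only reorders coefficients, the surrogate preserves the mean and variance of the original series while scrambling its fine temporal structure, which is what makes it a suitable null for the correlation-based metrics above.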
The spatial distribution of prediction accuracy and reproducibility is illustrated in Fig. 4. This information was summarized using 55 bilateral ROIs from the Harvard Oxford atlas and is plotted in Fig. 5.


Fig. 2. Network model (across individuals) for the ventral posterior cingulate cortex (vPCC) (A). Network models learned from the Rest 1 dataset were combined across subjects using a one-sample t-test and each voxel is colored according to its corresponding t-statistic; an FDR corrected p < 0.05 threshold has been applied to the images. The ROI used to generate the map is outlined in pink and colored white. The empty space between the ROI and significant model weights reflects the two-voxel boundary imposed on the features to prevent leakage. Predicted and observed time series for the vPCC ROI for the participant exhibiting the highest prediction accuracy (B, top; r = .96) and the participant exhibiting the lowest prediction accuracy (B, bottom; r = .86). Observed time series (maroon) were extracted from the vPCC ROI using the Rest 1 dataset. Predicted time series (orange) were decoded from the Rest 1 dataset using the network model derived from Rest 2. The plots are annotated with the prediction accuracy obtained by correlating the observed and predicted time series.


Fig. 3. Prediction accuracy and reproducibility of the effective connectivity network models from the NPAIRS procedure. SVM training/testing splits were performed for each subject across their two resting-state runs. Each maroon point represents a subject's model for a single ROI (given 179 ROIs × 33 subjects = 5907 points). The null data points were generated using the wavestrapping procedure described in the Null distributions for prediction accuracy and reproducibility section. Prediction accuracy and reproducibility are significantly better than the null distributions for all ROIs and subjects (p < 0.05 FDR corrected).

The patterns of reproducibility and prediction accuracy are consistent, with notable exceptions such as the medial and lateral prefrontal cortices and the right posterior insula. Posterior regions tend to perform better than anterior regions, and superior regions perform better than inferior regions. This was further analyzed by mapping each ROI to brain lobes, as well as to their functional classification as described by Mesulam (1998). Scatter plots of reproducibility versus prediction accuracy across brain lobes are illustrated in Fig. 6A and across the functional hierarchy in Fig. 6B. The parietal lobe has the highest prediction accuracy and reproducibility, followed by the occipital, frontal, and temporal lobes; subcortical regions were the lowest (Fig. 6A). Primary sensory motor cortex has the highest ranking of the functional hierarchy, followed by unimodal association cortex, heteromodal association cortex, paralimbic cortex, and subcortical regions, and the limbic system was last (Fig. 6B). The consistency of this ranking was tested across subjects by first calculating the geometric mean of median prediction accuracy and reproducibility (NPAIRS coefficient) for each categorical level and submitting the results to a Friedman rank sum test. The ranking was consistent across subjects for brain lobes (Kendall's concordance coefficient = 0.86, Friedman's chi-squared statistic = 113.28, Bonferroni corrected p < 4.4 × 10⁻¹⁶) and the functional hierarchy (Kendall's concordance coefficient = 0.79, Friedman's chi-squared statistic = 130.75, Bonferroni corrected p < 4.4 × 10⁻¹⁶). The median prediction accuracy and reproducibility were calculated for each subject (across ROIs) and are plotted in Fig. 7. The median prediction accuracy is high across subjects, and the subjects are tightly clustered. This illustrates the utility of prediction accuracy and reproducibility for identifying outlier data; there is no extreme outlier in this dataset, although the subject in the bottom left corner is suspect. The effects of subject sex, age, and mean head motion on prediction accuracy and reproducibility were explored separately using multiple regressions. Median prediction accuracy and reproducibility were transformed to Gaussian using the Fisher transform; these were entered as dependent variables in multiple regressions in which age, sex (coded as 0 for male and 1 for female), and mean head motion were the independent variables. Robust regression was used to exclude the influence of outliers.
There was no correlation between age and prediction accuracy (p ~ 0.90, FDR corrected) or reproducibility (p ~ 0.89, FDR corrected), nor was there a correlation between head motion and prediction accuracy (p ~ 0.14, FDR corrected) or reproducibility (p ~ 0.90, FDR corrected). There was also no significant relationship between subject sex and prediction accuracy (p ~ 0.11, FDR corrected) or reproducibility (p ~ 0.56, FDR corrected).
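A minimal numpy version of this analysis step, Fisher-transforming the per-subject medians and regressing them on age, sex, and motion, is sketched below with simulated covariates; the paper used robust regression, whereas this sketch uses ordinary least squares for brevity, and all values are fabricated for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sub = 33                                    # number of subjects
median_acc = rng.uniform(0.70, 0.95, n_sub)   # per-subject median prediction accuracy
age = rng.uniform(19.0, 48.0, n_sub)
sex = rng.integers(0, 2, n_sub).astype(float)  # 0 = male, 1 = female
motion = rng.uniform(0.02, 0.30, n_sub)        # mean head motion (illustrative units)

z = np.arctanh(median_acc)                     # Fisher r-to-z transform to Gaussianize
design = np.column_stack([np.ones(n_sub), age, sex, motion])
beta, *_ = np.linalg.lstsq(design, z, rcond=None)  # intercept + 3 slopes
residuals = z - design @ beta
```

Significance testing of the slopes (and the robust down-weighting of outliers) would sit on top of this, but the transform-then-regress structure is the part described in the text.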


Fig. 4. Spatial distribution of prediction accuracy (A) and reproducibility (B). ROI prediction accuracy and reproducibility were independently ranked and color-coded based on their quartiles. The results are shown on inflated cortical maps as well as axial slices. Axial slices are shown in radiological convention; the right side of the brain is shown on the left side of the slice.

Next we evaluated the impact of the amount of training data on model performance. This was done by repeating the NPAIRS procedure but truncating the training data to only the first 2.5, 5, or 7.5 min. These results are compared to using the full 10 min of training data in Fig. 8. Network models with better-than-chance prediction accuracy and reproducibility can be achieved with as little as 2.5 min of training data. Both prediction accuracy and reproducibility improve with the amount of training data, but the improvements in prediction accuracy are the most substantial. Prediction accuracy levels off in the 7.5 to 10 minute range, although there is a significant improvement between training with 7.5 and 10 min for every ROI (p < 0.05 FDR corrected, paired Wilcoxon rank-sum test, 10 min > 7.5 min). The amount of training data has less impact on reproducibility, which shows little improvement when using more than 5 min of training data: 52 of the 179 ROIs show improvement between 5 and 7.5 min (p < 0.05 FDR corrected, paired Wilcoxon rank-sum test, 7.5 min > 5 min) and only 3 show improvement between 7.5 and 10 min (p < 0.05 FDR corrected, paired Wilcoxon rank-sum test, 10 min > 7.5 min).

Discussion

The reproducibility and prediction accuracy of multivariate models of network connectivity provide insight into the functional architecture of the brain, as well as metrics for evaluating experimental parameters. Prediction accuracy was high for all of the evaluated brain regions and showed a spatial distribution consistent with the functional hierarchy proposed by Mesulam (1998). Good prediction accuracy was achieved using training datasets as short as 2.5 min, and accuracy continued to improve with longer datasets.

Voxel-specific weights in an MRC network model describe the mathematical relationship between a given voxel and the modeled brain region, and reflect the relative importance of the corresponding voxel to the model. The spatial distribution of model weights is consistent with what is observed in standard iFC analyses, and the spatial specificity of the network models suggests that model performance is driven by regional activity rather than some global phenomenon (Fig. 2A). The mathematical relationship defined by the network model can be exploited to predict a time series of intrinsic brain activity for a "hidden" brain region from activity measured at other locations in the brain. The predicted time series were a close match for the observed time series and faithfully reproduce high-frequency features and large deviations from the mean (Fig. 2A).

Reproducibility, when measured by Pearson's correlation, can be interpreted as a measure of a model's signal-to-noise ratio. This relies on the intuition that the aspects of the model that are reproducible represent signal and the non-reproducible parts represent noise. Poor reproducibility could reflect non-stationarity of the underlying biological phenomenon, or inadequate data acquisition or preprocessing. In the context of modeling intrinsic brain function, prediction accuracy can be interpreted as a measure of the amount of information that exists in the brain about the activity of a single brain region. In this interpretation, prediction accuracy is equivalent to mutual information, which is used as a measure of a brain region's functional integration within a network (Marrelec et al., 2008; Tononi, 1998). If a brain region were completely segregated, we should not be able to model that region's activity using measures obtained in other parts of the brain, and the prediction accuracy for that region would be very low. Other explanations for poor prediction accuracy are poor temporal SNR of the brain region, non-stationarity of the underlying biological processes, or poor modeling.
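The core of this scheme can be sketched with scikit-learn's `SVR` standing in for the paper's implementation: train a linear SVR to predict one region's time series from all other regions on one split of the data, then score prediction accuracy on the other split and reproducibility as the correlation of weight maps across splits. The data and names below are synthetic and illustrative; the C and epsilon values mirror those reported later in the text.

```python
# Sketch of multivariate regression connectivity with split-half NPAIRS-style
# metrics: prediction accuracy and weight-map reproducibility.
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
n_timepoints, n_regions = 400, 50
X = rng.standard_normal((n_timepoints, n_regions))
# make the "hidden" target region a linear mixture of a few regions plus noise
target = X[:, :5] @ np.array([0.5, 0.4, 0.3, 0.2, 0.1]) \
         + 0.1 * rng.standard_normal(n_timepoints)

# split-half training, as in the NPAIRS resampling scheme
half1, half2 = slice(0, 200), slice(200, 400)
m1 = SVR(kernel="linear", C=100.0, epsilon=0.1).fit(X[half1], target[half1])
m2 = SVR(kernel="linear", C=100.0, epsilon=0.1).fit(X[half2], target[half2])

# prediction accuracy: correlate one model's prediction with the observed
# time series of the half it never saw
acc = np.corrcoef(m1.predict(X[half2]), target[half2])[0, 1]

# reproducibility: correlate the two models' weight vectors
rep = np.corrcoef(m1.coef_.ravel(), m2.coef_.ravel())[0, 1]
print(f"prediction accuracy = {acc:.2f}, reproducibility = {rep:.2f}")
```

Because the synthetic target is a stable linear mixture, both metrics come out high; degrading the signal-to-noise ratio or making the mixture non-stationary lowers them, matching the interpretations above.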
Various factors that can potentially contribute to poor prediction accuracy and reproducibility may merit additional experiments in future work to more clearly determine their impact. These include SNR, non-stationarity of the underlying biological processes, poor modeling, and the level of integration of the brain areas. Although


Fig. 5. Prediction accuracy and reproducibility for anatomical regions defined by the Harvard–Oxford atlas. Subject-specific, ROI-based maps of prediction accuracy and reproducibility were written into voxel space by writing ROI prediction accuracy or reproducibility to every within-ROI voxel. The median prediction accuracy and reproducibility were then calculated for each of 55 bilateral brain regions defined by the Harvard–Oxford cortical and subcortical atlases. For each brain region, the distribution of prediction accuracy and reproducibility across participants is illustrated using Tukey box plots. sup div: superior division, ant div: anterior division, post div: posterior division, inf div: inferior division.
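The write-back step described in the caption (every within-ROI voxel receives that ROI's scalar value) can be sketched with plain NumPy indexing; the atlas and values here are synthetic stand-ins.

```python
# Sketch of writing ROI-level metrics back into voxel space, as in Fig. 5.
import numpy as np

rng = np.random.default_rng(2)
atlas = rng.integers(0, 4, size=(8, 8, 8))   # 0 = background, 1..3 = ROIs
roi_value = {1: 0.85, 2: 0.72, 3: 0.91}      # e.g., per-ROI prediction accuracy

voxel_map = np.zeros(atlas.shape)
for label, value in roi_value.items():
    voxel_map[atlas == label] = value        # broadcast ROI value to its voxels
```

In practice the atlas and output would be NIfTI volumes, but the masking logic is the same.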

there is no good measure of CNR or SNR for resting state data, measures such as spatial SNR, tSNR (Murphy et al., 2007) and fALFF (Zou et al., 2008), as well as field map information, can be combined with NPAIRS results to determine whether poor signal is the source of poor performance. Recently proposed methods for measuring temporal dynamics in functional connectivity (Smith et al., 2012) may be used to account for the impact of potential non-stationarity in the underlying biology on NPAIRS metrics. Additionally, the comparison of different analytical approaches can be used to evaluate the contributions of modeling and/or preprocessing decisions (LaConte et al., 2003; Lange et al., 1999; Strother et al., 2002).
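Of the signal-quality measures mentioned above, tSNR (Murphy et al., 2007) is the simplest to compute: the voxelwise temporal mean divided by the temporal standard deviation. A minimal sketch on synthetic data:

```python
# Temporal SNR (tSNR): mean of each voxel's time series divided by its
# temporal standard deviation. Synthetic data with a flat baseline.
import numpy as np

rng = np.random.default_rng(3)
n_voxels, n_timepoints = 1000, 200
baseline = rng.uniform(500, 1500, size=(n_voxels, 1))
noise = 10.0 * rng.standard_normal((n_voxels, n_timepoints))
data = baseline + noise

tsnr = data.mean(axis=1) / data.std(axis=1)
print(f"median tSNR = {np.median(tsnr):.1f}")
```

Regions with systematically low tSNR computed this way could then be checked against regions with poor NPAIRS metrics to separate signal problems from modeling problems.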

Since the prediction accuracy and reproducibility for every brain region are above chance, there is distributed information in the brain about each region. No ROI, therefore, is completely segregated (Figs. 3, 4 and 5). Both reproducibility and prediction accuracy are sensitive to dynamic changes in the brain networks. The high prediction accuracy for all regions suggests that the networks are stationary at time scales of 10 to 20 min. Although prediction accuracy is very high for most regions, the models obtain only moderate reproducibility. This can be explained by the large amount of redundant information in the brain. For example, if several brain regions were highly correlated with a TOI, then several different linear combinations of these regions

Fig. 7. Exploring inter-individual variation in median prediction accuracy and reproducibility. The shape of the symbol corresponds to sex, the size of the symbol corresponds to the subject age, and symbol color corresponds to the mean head motion observed in the participants' fMRI data.

Fig. 6. Comparison of prediction accuracy and reproducibility for ROIs based on cerebral lobes and functional hierarchy. 55 cortical and subcortical bilateral brain regions from the Harvard–Oxford atlas were hand labeled by cerebral brain lobe and functional hierarchy (Mesulam, 1998). Labels were transferred to the ROI masks (Fig. 1A) using a voting algorithm, with ties broken by visual inspection. The median prediction accuracy and reproducibility across subjects for each cerebral lobe (A) and functional subdivision (B) are displayed. The filled circle is located at the median of both prediction accuracy and reproducibility, vertical lines extend to the 25th and 75th percentiles of prediction accuracy, and horizontal lines extend to the 25th and 75th percentiles of reproducibility.

could achieve high prediction accuracy. Different model trainings would likely select different models from among those possible, resulting in high prediction accuracy with low reproducibility. Poor reproducibility is an important consideration when attempting to interpret MRC models. Reproducibility, as well as the interpretability of models, may be improved using feature selection or alternate modeling techniques that result in sparser models (Craddock et al., 2009; Varoquaux et al., 2010). Previous work by LaConte et al. (2003) and Strother et al. (2002) discusses using reproducibility maps, and Lange et al. (1999) recommend a pluralistic strategy for fMRI data analysis. From a neuroscience interpretation perspective, these approaches can help uncover invariant brain–behavior relationships that could remain hidden if only a single model is used.

Regional variations in prediction accuracy reflect a hierarchy of integration across brain regions. Regions were sorted using anatomy-based landmarks into the functional organization of brain regions proposed by Mesulam (1998) in order to better analyze their model performance (Fig. 5). This sorting was justified by the clear patterns of model performance seen in Fig. 4. The ranking of the functional subtypes (primary sensory–motor, unimodal, heteromodal, paralimbic and limbic) follows Mesulam's hierarchy of functional networks. The high consistency and information content of primary sensory–motor cortices support their high degree of specialization. Unimodal areas are categorized as being very closely linked with primary sensory–motor areas (one or two synapses), and respond "predominately, if not exclusively" to inputs from primary cortices (Mesulam, 1998). This would explain their high consistency across time, as well as the high information content of their network models. Heteromodal association as well as limbic and paralimbic cortices have highly dynamic connections and provide a context-driven integration of information from unimodal association areas. This would result in the observed behavior of less consistency across time. The lower information content of these regions presumably arises from a latent variable (unmodeled variance), which could be locally generated activity that is not shared throughout the network.

In addition to exploring the brain's functional architecture, the described method, when applied in the NPAIRS framework, provides objective metrics for optimizing experimental parameters as well as assessing the quality of datasets (Chu et al., 2011b; Kjems et al., 2002; LaConte et al., 2003; Shaw et al., 2003; Strother et al., 2002). Since these metrics are calculated for each brain region, experimental parameters can be optimized for specific regions of interest. The appropriate scan length is one methodological question that often arises when designing a resting state experiment; this was evaluated by repeating the NPAIRS framework using 2.5, 5, 7.5 and 10 min of training data. Training with 2.5 min of data achieved better-than-chance performance for this dataset. Performance improves with longer scan lengths and appears to reach an asymptote near 7.5 min. There is a statistically significant improvement for 10 min compared to 7.5 min for every brain region, but the amount of improvement is marginal. This analysis suggests that resting state scans should be no less than 5 min, and preferably 7.5 to 10 min. The greatest improvement from 7.5 to 10 min of training data occurs for regions that have the lowest performance. This underlines the importance of optimizing experimental parameters based on the specific regions to be analyzed. Without specific hypotheses about which brain regions are implicated in a study, longer scan lengths should be utilized.
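A scan-length evaluation of this kind reduces to truncating the training data to increasing durations, refitting, and scoring on held-out data. The sketch below uses synthetic series and ridge regression as a stand-in learner for brevity; durations are in samples rather than minutes, and all names are illustrative.

```python
# Sketch of evaluating training-duration effects on prediction accuracy.
import numpy as np
from sklearn.linear_model import Ridge   # stand-in for SVR, for brevity

rng = np.random.default_rng(4)
n_train, n_test, n_regions = 400, 100, 30
X = rng.standard_normal((n_train + n_test, n_regions))
y = X[:, :3] @ np.array([0.6, 0.3, 0.1]) \
    + 0.2 * rng.standard_normal(n_train + n_test)

X_test, y_test = X[n_train:], y[n_train:]
accuracies = {}
for n in (50, 100, 200, 400):            # analogous to 2.5/5/7.5/10 min
    model = Ridge(alpha=1.0).fit(X[:n], y[:n])
    accuracies[n] = np.corrcoef(model.predict(X_test), y_test)[0, 1]

print(accuracies)
```

With more training samples the fitted weights stabilize and held-out accuracy tends toward the noise ceiling, which is the qualitative pattern reported in Fig. 8.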
A scatter plot of subject median prediction accuracy and reproducibility was constructed to assess the quality of the data as well as the role of individual variation in these metrics (Fig. 7). Overall, the study participants are tightly clustered, suggesting that all of the data are of similar quality, with the exception of the subject in the lower left corner, which might be considered an outlier. Variance in subject performance is not explained by head motion, age, or sex. It is undeniable that a substantial amount of head motion will negatively impact both of these metrics due to the inability to accurately align the model with the data as well as other temporal artifacts that might be


Fig. 8. Impact of training sample duration on prediction accuracy (A) and reproducibility (B). The median prediction accuracy and reproducibility across subjects were determined for each ROI. Distributions of ROI prediction accuracy and reproducibility are illustrated using violin plots overlaid with Tukey box plots. The shape of the violin plot corresponds to kernel density estimates of the distributions. Outliers have been suppressed from the box plots. min: minutes.

induced by motion (Power et al., 2012; Van Dijk et al., 2011). There is very little motion in these subjects, and these results suggest that motion correction, as well as the nuisance variable regression, are doing a good job of removing motion effects. Similarly, the subjects' ages are constrained to a range in which significant differences in connectivity due to development or old age are minimized.

The described MRC method is illustrated in the context of single-subject analysis, where the model is trained and evaluated with data from the same subject. Similar to task-based multivariate modeling techniques, group models can be constructed by concatenating the data from several different subjects, although in our experience this requires substantial computational power. Additionally, single-subject models can be applied to separate subjects to explore inter-subject prediction accuracy, which would be a compelling method for assessing the heritability of the brain's functional architecture.

Although SVR was used to perform MRC, a similar prediction and reproducibility analysis could be performed using a variety of methods. One drawback of methods that utilize L2 regularization, such as SVR and ridge regression, is that they are not sparse in the feature space; i.e., the weights for features with no influence on the model will be set to non-zero (although very small) values (Zou and Hastie, 2005). Other methods obtain sparseness by using L1 regularization (Lasso, L1-SVM), but these methods may exclude relevant features if they are highly correlated with other features, and bound the number of non-zero weights by the number of observations (Zou and Hastie, 2005).
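The L2-versus-L1 sparsity contrast can be made concrete with scikit-learn's `Ridge` and `Lasso` as stand-ins for the methods discussed; the data are synthetic, with only one truly informative feature.

```python
# Sketch of the sparsity point: ridge (L2) keeps small non-zero weights on
# irrelevant features, while the lasso (L1) sets most of them exactly to zero.
import numpy as np
from sklearn.linear_model import Ridge, Lasso

rng = np.random.default_rng(6)
n, p = 200, 50
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] + 0.1 * rng.standard_normal(n)   # only feature 0 matters

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

n_zero_ridge = int(np.sum(np.isclose(ridge.coef_, 0.0)))
n_zero_lasso = int(np.sum(np.isclose(lasso.coef_, 0.0)))
print(f"zero weights: ridge={n_zero_ridge}, lasso={n_zero_lasso}")
```

The lasso zeroes essentially all of the uninformative weights, which is what makes L1-regularized models easier to interpret, at the cost of the feature-exclusion caveats noted above.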
Similarly, model validation methods such as structural equation modeling (McIntosh and Gonzalez-Lima, 1994) and dynamic causal modeling (Friston et al., 2003) can also be used in the NPAIRS framework, although their requirements for model pre-specification and limitations on model size make them unattractive in a data-driven setting (Lohmann et al., 2012). Regardless of the method chosen for model estimation, region-specific measures of model prediction accuracy and reproducibility not only are valuable for optimization of acquisition and analysis pipelines, but also provide useful neuroscientific information about the brain's functional architecture. Additionally, the NPAIRS framework provides a principled means for comparing these various methods for estimating models of connectivity. SVR parameters for each brain region model were set to C = 100 and ε = 0.1 in order to simplify the comparison of results between regions. In our previous experience applying support vector machines to task fMRI data, prediction accuracy tends to be insensitive to values of C larger than 1 (LaConte et al., 2005). Following the analytical approach described in Cherkassky and Ma (2004), we evaluated the

impact of C on the prediction accuracy of the obtained models. The maximum Lagrange multiplier observed across models trained on 10-minute data for both resting scans, 179 ROIs, and all 33 subjects was 5 × 10⁻⁵, with the majority being much smaller than this (Sup. Fig. 3). Since, in the dual representation of SVR, C bounds the Lagrange multipliers, the models presented here are insensitive to C values greater than 5 × 10⁻⁵. Values of C smaller than 5 × 10⁻⁵ will change the model, and can be evaluated with cross-validation or other model selection procedures. It is important to note that if cross-validation is used for both parameter selection and accuracy estimation, then a nested cross-validation approach must be used to avoid overly optimistic estimates of prediction accuracy (Cherkassky and Mulier, 2007).

Conclusion

In conclusion, the estimation of network models using multivariate regression connectivity is a promising technique for studying the functional architecture of the brain. Prediction accuracy estimated from these models is analogous to a measure of integration, and the reproducibility of the models measures model signal-to-noise ratio. These metrics can be used to evaluate regional as well as individual variation in network properties. Regional variations in prediction accuracy and reproducibility are consistent with Mesulam's functional hierarchy model, which was derived using invasive methods. When applied in the NPAIRS framework, the method provides objective metrics for optimizing experimental parameters inherent in data acquisition, preprocessing, and analysis, as well as a means to evaluate the quality of data.

Acknowledgments

This research was supported in part by the DOD Grant W81XWH08-2-0144 (to SML) and a 2010 NARSAD Young Investigator Grant from the Brain and Behavior Research Foundation (to RCC).
We would additionally like to acknowledge Jonathon Lisinski and Nina Lauharatanahirun for research support, Gaël Varoquaux for useful discussions, and the anonymous reviewers for their insightful comments.

Appendix A. Supplementary data

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.neuroimage.2013.05.072.

Conflict of interest

The authors declare that there are no conflicts of interest.

References

Benjamini, Y., Hochberg, Y., 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B Stat. Methodol. 57 (1), 289–300.
Biswal, B.B., Mennes, M., Zuo, X.-N., Gohel, S., Kelly, C., Smith, S.M., Beckmann, C.F., Adelstein, J.S., Buckner, R.L., Colcombe, S., et al., 2010. Toward discovery science of human brain function. Proc. Natl. Acad. Sci. 107, 4734.
Boser, B.E., Guyon, I.M., Vapnik, V.N., 1992. A training algorithm for optimal margin classifiers. Proceedings of the Fifth Annual Workshop on Computational Learning Theory (ACM), pp. 144–152.
Braun, U., Plichta, M.M., Esslinger, C., Sauer, C., Haddad, L., Grimm, O., Mier, D., Mohnke, S., Heinz, A., Erk, S., et al., 2012. Test–retest reliability of resting-state connectivity network characteristics using fMRI and graph theoretical measures. NeuroImage 59, 1404–1412.
Bullmore, E., Fadili, J., Maxim, V., Sendur, L., Whitcher, B., Suckling, J., Brammer, M., Breakspear, M., 2004. Wavelets and functional magnetic resonance imaging of the human brain. NeuroImage 23 (Suppl. 1), S234–S249.
Cherkassky, V., Ma, Y., 2004. Practical selection of SVM parameters and noise estimation for SVM regression. Neural Netw. 17 (1), 113–126.
Cherkassky, V.S., Mulier, F., 2007. Learning From Data: Concepts, Theory, and Methods, 2nd ed. IEEE Press: Wiley-Interscience, Hoboken, N.J.
Chu, C., Ni, Y., Tan, G., Saunders, C.J., Ashburner, J., 2011a. Kernel regression for fMRI pattern prediction. NeuroImage 56, 662–673.
Chu, C., Handwerker, D.A., Bandettini, P.A., Ashburner, J., 2011b. Measuring the consistency of global functional connectivity using kernel regression methods. IEEE International Workshop on Pattern Recognition in NeuroImaging, pp. 41–44.
Cox, R.W., 1996. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Comput. Biomed. Res. 29, 162–173.
Cox, D.D., Savoy, R.L., 2003. Functional magnetic resonance imaging (fMRI) "brain reading": detecting and classifying distributed patterns of fMRI activity in human visual cortex. NeuroImage 19, 261–270.
Craddock, R.C., Holtzheimer, P.E., Hu, X.P., Mayberg, H.S., 2009. Disease state prediction from resting state functional connectivity. Magn. Reson. Med. 62, 1619–1628.
Craddock, R., James, G., Holtzheimer III, P., Hu, X., Mayberg, H., 2012. A whole brain fMRI atlas generated via spatially constrained spectral clustering. Hum. Brain Mapp. 33, 1914. http://dx.doi.org/10.1002/hbm.21333.
Drucker, H., Burges, C., Kaufman, L., Smola, A.J., Vapnik, V., 1997. Support vector regression machines. Advances in Neural Information Processing Systems, 9. MIT Press, pp. 155–161.
Friston, K.J., 1994. Functional and effective connectivity in neuroimaging: a synthesis. Hum. Brain Mapp. 2, 56–78.
Friston, K.J., Frith, C., 1993. Time dependent changes in effective connectivity measured with PET. Hum. Brain Mapp. 1, 69–79.
Friston, K.J., Williams, S., Howard, R., Frackowiak, R.S., Turner, R., 1996. Movement-related effects in fMRI time-series. Magn. Reson. Med. 35, 346–355.
Friston, K.J., Harrison, L., Penny, W., 2003. Dynamic causal modelling. NeuroImage 19 (4), 1273–1302.
Haxby, J.V., Gobbini, M.I., Furey, M.L., Ishai, A., Schouten, J.L., Pietrini, P., 2001. Distributed and overlapping representations of faces and objects in ventral temporal cortex. Science 293, 2425–2430.
Haynes, J.D., Rees, G., 2006. Decoding mental states from brain activity in humans. Nat. Rev. Neurosci. 7, 523–534.
Jenkinson, M., Bannister, P., Brady, M., Smith, S., 2002. Improved optimization for the robust and accurate linear registration and motion correction of brain images. NeuroImage 17, 825–841.
Joachims, T., 1999. Making large-scale support vector machine learning practical. In: Schölkopf, B., Burges, C., Smola, A. (Eds.), Advances in Kernel Methods — Support Vector Learning. MIT Press.
Kamitani, Y., Tong, F., 2005. Decoding the visual and subjective contents of the human brain. Nat. Neurosci. 8, 679–685.
Kjems, U., Hansen, L.K., Anderson, J.S., Frutiger, S., Muley, S., Sidtis, J., Rottenberg, D., Strother, S.C., 2002. The quantitative evaluation of functional neuroimaging experiments: mutual information learning curves. NeuroImage 15, 772–786.
Kriegeskorte, N., Goebel, R., Bandettini, P., 2006. Information-based functional brain mapping. Proc. Natl. Acad. Sci. U. S. A. 103, 3863–3868.
LaConte, S.M., 2011. Decoding fMRI brain states in real-time. NeuroImage 56, 440–454.
LaConte, S.M., Anderson, J., Muley, S., Ashe, J., Frutiger, S., Rehm, K., Hansen, L.K., Yacoub, E., Hu, X., Rottenberg, D., et al., 2003. The evaluation of preprocessing choices in single-subject BOLD fMRI using NPAIRS performance metrics. NeuroImage 18, 10–27.
LaConte, S.M., Strother, S.C., Cherkassky, V., Anderson, J., Hu, X.P., 2005. Support vector machines for temporal classification of block design fMRI data. NeuroImage 26, 317–329.
LaConte, S.M., Peltier, S.J., Hu, X.P., 2007. Real-time fMRI using brain-state classification. Hum. Brain Mapp. 28, 1033–1044.
Lange, N., Strother, S.C., Anderson, J.R., Nielsen, F.A., Holmes, A.P., Kolenda, T., Savoy, R., Hansen, L.K., 1999. Plurality and resemblance in fMRI data analysis. NeuroImage 10, 282–303.
Lohmann, G., Erfurth, K., Müller, K., Turner, R., 2012. Critical comments on dynamic causal modelling. NeuroImage 59 (3), 2322–2329.
Lund, T.E., Madsen, K.H., Sidaros, K., Luo, W.-L., Nichols, T.E., 2006. Non-white noise in fMRI: does modelling have an impact? NeuroImage 29, 54–66.
Marrelec, G., Bellec, P., Krainik, A., Duffau, H., Pélégrini-Issac, M., Lehéricy, S., Benali, H., Doyon, J., 2008. Regions, systems, and the brain: hierarchical measures of functional integration in fMRI. Med. Image Anal. 12, 484–496.
McIntosh, A.R., Gonzalez-Lima, F., 1994. Structural equation modeling and its application to network analysis in functional brain imaging. Hum. Brain Mapp. 2 (1–2), 2–22.
Mesulam, M.M., 1998. From sensation to cognition. Brain 121 (Pt 6), 1013–1052.
Mitchell, T.M., Hutchinson, R., Niculescu, R.S., Pereira, F., Wang, X., Just, M., Newman, S., 2004. Learning to decode cognitive states from brain images. Mach. Learn. 57, 145–175.
Mourão-Miranda, J., Bokde, A.L.W., Born, C., Hampel, H., Stetter, M., 2005. Classifying brain states and determining the discriminating activation patterns: support vector machine on functional MRI data. NeuroImage 28, 980–995.
Mugler, J.P., Brookman, J.R., 1990. Three-dimensional magnetization-prepared rapid gradient-echo imaging (3D MP RAGE). Magn. Reson. Med. 15, 152–157.
Müller, K., Smola, A., Rätsch, G., Schölkopf, B., Kohlmorgen, J., Vapnik, V., 1997. Predicting time series with support vector machines. Proceedings of the 7th International Conference on Artificial Neural Networks. Springer-Verlag, pp. 999–1004.
Murphy, K., Bodurka, J., Bandettini, P.A., 2007. How long to scan? The relationship between fMRI temporal signal to noise ratio and necessary scan duration. NeuroImage 34, 565–574.
Polyn, S.M., Natu, V.S., Cohen, J.D., Norman, K.A., 2005. Category-specific cortical activity precedes retrieval during memory search. Science 310, 1963–1966.
Power, J.D., Barnes, K.A., Snyder, A.Z., Schlaggar, B.L., Petersen, S.E., 2012. Spurious but systematic correlations in functional connectivity MRI networks arise from subject motion. NeuroImage 59, 2142–2154.
Ryali, S., Chen, T., Supekar, K., Menon, V., 2011. Estimation of functional connectivity in fMRI data using stability selection-based sparse partial correlation with elastic net penalty. NeuroImage 59, 3852–3861.
Shaw, M.E., Strother, S.C., Gavrilescu, M., Podzebenko, K., Waites, A., Watson, J., Anderson, J., Jackson, G., Egan, G., 2003. Evaluating subject specific preprocessing choices in multisubject fMRI data sets using data-driven performance metrics. NeuroImage 19, 988–1001.
Shehzad, Z., Kelly, A.M.C., Reiss, P.T., Gee, D.G., Gotimer, K., Uddin, L.Q., Lee, S.H., Margulies, D.S., Roy, A.K., Biswal, B.B., et al., 2009. The resting brain: unconstrained yet reliable. Cereb. Cortex 19, 2209–2229.
Smith, S.M., Jenkinson, M., Woolrich, M.W., Beckmann, C.F., Behrens, T.E.J., Johansen-Berg, H., Bannister, P.R., De Luca, M., Drobnjak, I., Flitney, D.E., et al., 2004. Advances in functional and structural MR image analysis and implementation as FSL. NeuroImage 23 (Suppl. 1), S208–S219.
Smith, S.M., Miller, K.L., Moeller, S., Xu, J., Auerbach, E.J., Woolrich, M.W., Beckmann, C.F., et al., 2012. Temporally-independent functional modes of spontaneous brain activity. Proc. Natl. Acad. Sci. U.S.A. 109 (8), 3131–3136. http://dx.doi.org/10.1073/pnas.1121329109.
Smola, A.J., Schölkopf, B., 2004. A tutorial on support vector regression. Stat. Comput. 14, 199–222.
Stone, J.V., Kötter, R., 2002. Making connections about brain connectivity. Trends Cogn. Sci. 6, 327–328.
Strother, S.C., Anderson, J., Hansen, L.K., Kjems, U., Kustra, R., Sidtis, J., Frutiger, S., Muley, S., LaConte, S.M., Rottenberg, D., 2002. The quantitative evaluation of functional neuroimaging experiments: the NPAIRS data analysis framework. NeuroImage 15, 747–771.
Tononi, G., 1998. Complexity and coherency: integrating information in the brain. Trends Cogn. Sci. 2, 474–484.
Van Dijk, K.R.A., Sabuncu, M.R., Buckner, R.L., 2011. The influence of head motion on intrinsic functional connectivity MRI. NeuroImage 59, 431–438.
Vapnik, V., Golowich, S., Smola, A., 1996. Support vector method for function approximation, regression estimation and signal processing. Advances in Neural Information Processing Systems, 9. MIT Press.
Varoquaux, G., Gramfort, A., Poline, J.B., 2010. Brain covariance selection: better individual functional connectivity models using population prior. Advances in Neural Information Processing Systems. MIT Press.
Wang, J.-H., Zuo, X.-N., Gohel, S., Milham, M.P., Biswal, B.B., He, Y., 2011. Graph theoretical analysis of functional brain networks: test–retest evaluation on short- and long-term resting-state functional MRI data. PLoS One 6, e21976.
Zhang, Y., Brady, M., Smith, S., 2001. Segmentation of brain MR images through a hidden Markov random field model and the expectation–maximization algorithm. IEEE Trans. Med. Imaging 20, 45–57.
Zou, H., Hastie, T., 2005. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 67 (2), 301–320.
Zou, Q.-H., Zhu, C.-Z., Yang, Y., Zuo, X.-N., Long, X.-Y., Cao, Q.-J., Wang, Y.-F., Zang, Y.-F., 2008. An improved approach to detection of amplitude of low-frequency fluctuation (ALFF) for resting-state fMRI: fractional ALFF. J. Neurosci. Methods 172, 137–141.
Zuo, X.-N., Kelly, C., Adelstein, J.S., Klein, D.F., Castellanos, F.X., Milham, M.P., 2010a. Reliable intrinsic connectivity networks: test–retest evaluation using ICA and dual regression approach. NeuroImage 49, 2163–2177.
Zuo, X.-N., Di Martino, A., Kelly, C., Shehzad, Z.E., Gee, D.G., Klein, D.F., Castellanos, F.X., Biswal, B.B., Milham, M.P., 2010b. The oscillating brain: complex and reliable. NeuroImage 49, 1432–1445.
Zuo, X.-N., Ehmke, R., Mennes, M., Imperati, D., Castellanos, F.X., Sporns, O., Milham, M.P., 2011. Network centrality in the human functional connectome. Cereb. Cortex 22 (8), 1862–1875.