Accepted Manuscript Estimating dynamic brain functional networks using multi-subject fMRI data Suprateek Kundu, Jin Ming, Jordan Pierce, Jennifer McDowell, Ying Guo PII:
S1053-8119(18)30660-8
DOI:
10.1016/j.neuroimage.2018.07.045
Reference:
YNIMG 15136
To appear in:
NeuroImage
Received Date: 20 January 2018 Revised Date:
8 June 2018
Accepted Date: 17 July 2018
Please cite this article as: Kundu, S., Ming, J., Pierce, J., McDowell, J., Guo, Y., Estimating dynamic brain functional networks using multi-subject fMRI data, NeuroImage (2018), doi: 10.1016/ j.neuroimage.2018.07.045. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT
Estimating Dynamic Brain Functional Networks Using Multi-subject fMRI Data
RI PT
Suprateek Kundua1 , Jin Minga , Jordan Pierceb , Jennifer McDowellb , and Ying Guoa .
a. Department of Biostatistics, Emory University, 1518 Clifton Road NE, Atlanta, GA 30322, USA. b. Department of Psychology, University of Georgia, 125 Baldwin Street, Athens, GA 30602, USA.
SC
Abstract: A common assumption in the study of brain functional connectivity is that the brain network is stationary. However it is increasingly recognized that the brain organization is prone to
M AN U
variations across the scanning session, fueling the need for dynamic connectivity approaches. One of the main challenges in developing such approaches is that the frequency and change points for the brain organization are unknown, with these changes potentially occurring frequently during the scanning session. In order to provide greater power to detect rapid connectivity changes, we propose
TE D
a fully automated two-stage approach which pools information across multiple subjects to estimate change points in functional connectivity, and subsequently estimates the brain networks within each state phase lying between consecutive change points. The number and positioning of the change
EP
points are unknown and learned from the data in the first stage, by modeling a time-dependent connectivity metric under a fused lasso approach. In the second stage, the brain functional network
AC C
for each state phase is inferred via sparse inverse covariance matrices. We compare the performance of the method with existing dynamic connectivity approaches via extensive simulation studies, and apply the proposed approach to a saccade block task fMRI data. Keywords: Brain functional connectivity; change point models; dynamic networks; fused lasso; graphical models; precision matrix estimation.
1
Please email the corresponding author at
[email protected] for any clarifications.
1
ACCEPTED MANUSCRIPT
1. Introduction Neuroimaging studies have proved to be a pivotal tool for understanding the neurobiological basis of cognitive and behavioral outcomes in psychiatric studies, as well as for examining the patho-
RI PT
physiological mechanisms and the atypical brain development underlying mental disorders (Biswal et al., 1995; Greicius et al., 2003; Fox et al., 2005; Smith et al., 2009, among others). Traditional neuroimaging studies focus on identifying brain regions which are activated with respect to a cer-
SC
tain task or outcome; however, in recent years, there has been a shift from region-based analyses to models involving brain networks. There have been some prominent findings which point to-
M AN U
wards the promise of brain functional network connectivity as an imaging biomarker which can be predictive of mental illnesses (Wang et al., 2007; Woodward and Cascio, 2015; Plitt et al., 2015). Many functional connectivity modeling approaches have been proposed, including independent component analysis (ICA) (McKeown et al., 1998,; Hyvarinen and Oja, 2000; Calhoun et al.,
TE D
2001; Guo and Pagnoni, 2008; Beckmann and Smith, 2004, 2005; Guo, 2011), pairwise correlation analysis (Stam et al., 2007; Suprekar et al., 2008), partial correlation analysis (Salvador et al., 2005) and sparse inverse covariance or precision matrix estimation (Huang et al., 2009). Among
EP
those methods, precision matrix estimation was found to be one of the most successful approaches (Smith et al., 2011; Wang et al., 2016), which can distinguish a true, direct functional connection
AC C
from an apparent connection, caused via confounding by a common third party node. A key assumption which is used in the aforementioned approaches is that the functional connectivity remains stationary throughout the scanning session. However, an increasing number of recent findings provide evidence on the dynamic nature of brain functional organization (Hutchinson et al, 2012; Hellyer et al, 2014). In particular, task-related imaging studies have shown that the brain networks will re-organize when the subjects undergo different modulations of the experimental tasks during the scanning session (Chang and Glover, 2010). Accurate detection and characterization
2
ACCEPTED MANUSCRIPT
of temporal variations in connectivity due to underlying neurobiological causes are not necessarily straightforward using the observed fMRI BOLD signals. Many non-neural related factors, such as low signal-to-noise ratio in the fMRI data, changing levels of non-neural noise resulting from
RI PT
background cardiac and respiratory processes and hardware instability, as well as variations in the BOLD signal mean and variance over time, can induce temporal variations (Hutchinson et al., 2013; Lindquist et al, 2014). Recently, studies are beginning to identify potential correlates of
SC
temporal variations in connectivity in simultaneously recorded electrophysiological data (Allen et al., 2013; Chang et al., 2013), behavior (Thompson et al., 2013), and disease status (Jones et al.,
M AN U
2012; Sakoglu et al., 2010). This evidence points to an underlying neuronal basis for temporal variations in functional connectivity which is linked with changes in cognitive and disease states, and provides a strong motivation for detailed investigation of dynamic brain connectivity. Arguably, the most commonly used strategy for examining dynamic functional connectivity
TE D
has been a sliding window approach (Allen et al., 2013; Chang and Glover, 2010; Handwerker et al., 2012; Jones et al., 2012; Sakoglu et al., 2010; Li et al., 2013; Monti et al., 2014; Liam et al., 2015). While the sliding window technique is a valuable tool for investigating temporal
EP
dynamics of functional brain networks, there are some known limitations associated with this approach (Lindquist et al., 2014) such as the choice of window length. In particular, a narrow
AC C
window length will lead to a small number of time scans from which to estimate the brain networks in each state phase, potentially resulting in poor brain network estimates, whereas a large window length may overlook short term changes in the brain network. In addition, the sliding window techniques require secondary criteria to determine if variations in the edge structure are significant, which may not be straightforward to implement. For example, Hindriks et al. (2016) showed that dynamic functional connections were almost impossible to detect using sliding-window correlations. Dynamic functional connectivity is often conceptualized as a collection of state phases corre-
3
ACCEPTED MANUSCRIPT
sponding to various modulations in the brain (Hutchinson et al., 2013). Such state phases may either refer to re-occurring quasi-stable connectivity states detected via hidden Markov models (Baker et al., 2014; Vidaurre et al., 2016), or they may imply piecewise constant connectivity with
RI PT
jumps at different change points (Cribben et al., 2012, 2013; Lian et al, 2015; Xu and Lindquist, 2015; Cribben and Yu, 2017; Dai et. al, 2017). These latter class of approaches are known as change point models, which have also been recently proposed for EEG data (Kirch et al., 2015; Schroder
SC
and Ombao, 2015). Although existing change point methods for dynamic functional connectivity have been somewhat successful in describing the temporal changes in the brain network, there are
M AN U
some existing challenges which involve the detection of rapid changes in brain organization, that can evolve within as little as 30-60s (Sakoglu et al., 2010; Shirer et al., 2012; Jones et al., 2012). Such rapid fluctuations will divide the scanning period into a collection of narrow state phases, each containing only a few time scans from which to estimate the brain network, which is likely
TE D
to violate an inherent assumption in existing approaches requiring a minimum number of scans between consecutive change points. A potential remedy is to pool information across multiple subjects as suggested by Hindriks et al. (2016), who showed an increase in power for detecting brain
EP
connections by averaging over multi-subject data. We propose a novel and fundamentally different change point model to detect temporal varia-
AC C
tions in the group level brain functional connectivity by combining data across multiple subjects. The proposed two-stage data-driven approach automatically estimates the number and locations of the change points from the data, as well as the time-varying brain network. The first stage estimates a connectivity metric at each time point based on multi-subject fMRI data and detects change points as shifts in the time series of this connectivity metric under a fused lasso approach. In the second stage, we use sparse inverse covariance matrix estimation to infer a distinct brain network for each state defined by the estimated change points. In addition, we introduce a sub-
4
ACCEPTED MANUSCRIPT
sampling scheme designed to reliably infer change points in the presence of heterogeneity in the temporal dynamics across subjects, which can also be used to provide measures of uncertainty. The proposed approach is expected to work best for task-based experiments involving multiple subjects
RI PT
who are engaged in the same task for a fixed time duration, although we propose extensions to more general scenarios in the Conclusion section, and to single subject change point analysis in Section 2.4. The proposed group level approach is particularly relevant given existing evidence that
SC
the changes in dynamic functional connectivity in task-based experiments are often not completely determined by the boundaries of the task events/blocks (Di et al., 2015). A schematic illustration
M AN U
of the proposed approach is presented in Figure 1.
We evaluate the performance of the proposed method via extensive simulation studies involving various types of dynamic networks with rapid changes in functional connectivity, and compare the performance with several state of the art competing methods. We also apply our approach to a
TE D
saccade block task data with alternate blocks of fixation and task, which involves rapidly changes in the task as well as aberrant subjects. For example, each block has only 10 time scans which lead to rapid changes in functional connectivity, and presents inherent challenges for an analysis based
EP
on single subject data. Moreover, there are several subjects who exhibit behavioral errors for one or more task blocks, which potentially reflect differences with the group level connectivity. Among
AC C
other findings, our method is able to identify brain regions supplementary eye fields, lateral and medial frontal eye fields, and cuneus as the most stable regions across the time varying network, which is consistent with previous evidence (McDowell et al., 2008).
2. Materials and Methods Suppose we are interested in studying the connectivity between p regions of interest (ROI) in the brain. From each ROI, the time-series of the blood oxygen level dependent (BOLD) signals is extracted to represent the neuronal activity within the ROI. To set notation, denote yit = 5
RI PT
ACCEPTED MANUSCRIPT
SC
Figure 1: A visual depiction of the two stage approach. Stage I, which involves the estimation of change
M AN U
points, consists of first sub-sampling the multi-subject fMRI data (see panel (a) on the right hand side), and then use this sub-sample to compute the ensemble pairwise correlations at each time point (see panel (b) on the right hand side). Subsequently, a fused lasso approach is applied to approximate the time series of the connectivity metric using a piecewise constant function (see panel (b) on the right hand side), which automatically detects change points and divides the length of the scanning session into distinct state phases. This process is repeated for multiple sub-samples in order to diminish the influence of aberrant subjects on change point estimation, and those change points which show up most frequently across these subsamples are chosen as the time points where functional connectivity changes occur. In the second stage, a graphical lasso approach is applied independently to the time scans in each state phase to obtain the corresponding brain functional networks (see panel (c) on the right hand side). The collection of these brain networks across state phases represents the time varying brain functional connectivity.
TE D
(yit1 , . . . , yitp ) as the vector of BOLD fMRI measurements obtained from the i-th subject at the t-th time point over the p ROIs, where t = 1, . . . , T, i = 1, . . . , N . We propose a two-stage approach to investigate the dynamic connectivity between the ROIs, where the first stage identifies the change
EP
points for functional connectivity and the second stage estimates the brain networks appearing in various state phases which are time intervals separated by the change points.
AC C
2.1. Stage I: Change point estimation To identify the change point in functional connectivity during the fMRI scanning, we first characterize the dynamic pattern in functional connectivity with a time course of a connectivity metric. Specifically, we propose to evaluate the time-dependent group functional connectivity between two nodes in the brain network by evaluating the correlations between the fMRI BOLD signals of these two nodes at each time point in the scanning session, averaged over subjects. The time courses of these pairwise correlations would then be used to identify the group level 6
ACCEPTED MANUSCRIPT
change points in functional connectivity. Here, we use pair-wise correlations instead of partial correlations as the connectivity metric due to the following reasons: (a) pair-wise correlations can be computed in a model-free manner, and they are computationally easier to estimate than
RI PT
the partial correlations, which requires the estimation of the precision matrix or fitting regression models; and (b) there is one-to-one correspondence between the changes in partial and pairwise correlations, and the time courses for these two types of correlations will have change points at
SC
exactly the same locations. We illustrate this concept in a toy example in Figure 2, which depicts changes in the time series of pair-wise correlations and corresponding partial correlations for a three
M AN U
node system. Given these reasons, we adopt pair-wise correlation as the connectivity metric for our change point estimation purposes in stage I. Subsequently in the second stage, we estimate the functional connectivity for each state phase (defined under the estimated change points), in terms of partial correlations under a graphical modeling approach involving sparse precision matrices.
TE D
We note that the assumption of discrete transitions under a change point approach is meant to serve as an approximation to a more realistic scenario of gradual changes in the network consistent with the slow timescale of the hemodynamic response. Each change point can be interpreted as
EP
summarizing a particular transition period spanning multiple consecutive time points and incorporating gradual changes in the network. The assumption of discrete jumps is meant to speed up
AC C
computations and make the approach scalable to high dimensions. 2.1.1. Connectivity Metric Suppose there are unknown change points denoted as 1 = a0 < a1 < . . . < aK−1 < aK < aK+1 = T , which divides the scanning session into K + 1 distinct state phases, where the k-th state phase consists of time points between ak−1 and ak . The state phases represent time intervals between two consecutive change points where the brain network remains relatively stationary. Both the number and locations of change points are unknown and the change points are estimated as time
7
SC
RI PT
ACCEPTED MANUSCRIPT
Figure 2: Toy example involving 3 regions and 50 time points, showing changes in pair-wise correlations
M AN U
(top) and corresponding partial correlations (below) across time points when the data is generated under a mixture model with multiple change points. Each component of the mixture model follows a Gaussian distribution with mean zero and a distinct inverse covariance matrix quantifying the functional associations corresponding to time intervals 1 − 10, 11 − 20, 21 − 30, 31 − 40, and 41 − 50. The time varying network illustrated in the above diagram has 5 state phases, and is characterized by piece-wise constant pair-wise and partial correlations having the exact same change points.
scans corresponding to jumps in the connectivity metric which is taken to be pair-wise correlations. ˜ t = (ρt,rs )p Let us denote the sample covariance matrices for time t as Σ r,s=1 , and that for the PN
yt )(yit −¯ yt )T i=1 (yit −¯
TE D
ˆ k , where Σ ˜t = k-th state phase as Σ
N
ˆk = , Σ
P
ak−1 ≤t
nk
˜t Σ
, t = 1, . . . , T, k =
1, . . . , K +1, with nk denoting the number of time scans in the k-th state phase Bk = [ak−1 , ak ), and
EP
˜ 1, . . . , Σ ˜ T , represent ¯ t denoting the sample mean for time point t. The sample covariance matrices Σ y
AC C
the time series of sample pairwise correlation between the r-th and s-th ROIs which is averaged over q ˜ ˜ t (r, r)Σ ˜ t (s, s), r < s and r, s = 1, . . . , p. all subjects, as ρ˜1,rs , . . . , ρ˜T,rs , where ρ˜t,rs = Σt (r, s)/ Σ The sample pairwise correlations {˜ ρt,rs , t = 1, . . . , T } correspond to realizations for the time series of true pairwise correlations representing the time varying group level associations between pairs of regions/nodes in the brain. Let us denote the p(p − 1)/2 dimensional vector of sample pairwise correlations at the t-th time point as ˜ rt = {˜ ρt,rs : r < s, r, s = 1 . . . , p}. The time series of multivariate pair-wise correlations ˜ r1 , . . . , ˜ rT , characterize the dynamic connectivity patterns in the brain network. We detect temporal changes in functional connectivity by approximating the time
8
ACCEPTED MANUSCRIPT
series of multivariate pair-wise correlations via a piece-wise constant function under a fused lasso approach (Tibshirani et al., 2004), which divides the scanning period into distinct state phases, such that the functional connectivity is constant within a state phase but changes across these
RI PT
phases. We elucidate the approach below.
2.1.2. A Fused Lasso Approach for Detecting Change Points in Brain Connectivity The fused lasso (Tibshirani et al., 2004) is a generalization of the lasso (Tibshirani, 1996)
SC
approach which applies to settings where the observations have a natural ordering. The fused lasso
M AN U
can be used to detect changes in the mean of a naturally ordered set of observations by inferring spatial or temporal locations where jumps occur (Tibshirani and Wang, 2007). In our case, the set of ordered observations correspond to the time series of sample pairwise correlations, and one can examine the connectivity changes in the brain network by detecting shifts in this time series profile. To our knowledge, this is one of the first approaches to use the well established fused lasso
TE D
method for detecting connectivity changes, although there is another recent work (Monti et al., 2014) which uses a fused graphical lasso to estimate a distinct brain network at each time point. The approach by Monti et al. (2014) is based on sliding window correlations using single subject
EP
data, and is distinct from our approach which is focused on group level change point models. To understand the application of the fused lasso approach for change point detection, let us first
AC C
consider the special case when p = 2. In this case, the time series of univariate pairwise correlations can be approximated via a piece-wise constant function under a fused lasso as follows
minu∈<
T X
||˜ ρt − ut ||2 + λ
t=1
T −1 X
|ut+1 − ut |,
(1)
t=1
where ρ˜1 , . . . , ρ˜T , are sample pairwise correlations as computed in Section 2.1.1, and (u1 , . . . , uT ) represents the piecewise constant approximation to the time series of pairwise correlations, with
9
ACCEPTED MANUSCRIPT
ut denoting the assumed constant correlation during the interval [at − 1, at ). The optimization function in (1) is composed of two parts: (a) the first term captures the goodness of fit between the sample correlations ρ˜ and the assumed piecewise constant correlations u via a summed squared
RI PT
distance; and (b) the second term is the absolute difference between the piecewise constant function u between two consecutive time points, which equals zero if there is no jump (|ut − ut−1 | = 0), and is greater than zero if there is a jump (|ut−1 − ut | > 0). More generally, the second term
SC
aims to measure the number of jumps in the piecewise constant function, where the number and locations of the jumps are unknown and controlled by the sparsity parameter λ. When λ increases,
M AN U
this penalty will enforce many increments to become exactly zero, which will reduce the number of jumps and decrease the number of change points; on the other hand, a small value of λ will lead to a greater number of jumps. In summary, the optimization function in (1) aims to provide an adequate fit to the observed sample time series of correlations via a piecewise constant function
TE D
with an unknown number of jumps that is controlled by the parameter λ. In practical applications involving p > 2 ROIs in the brain, there are p(p − 1)/2 distinct time series profiles of pair-wise correlations which are co-dependent on one another, and act in
EP
coordination to impact temporal variations in the brain functional connectivity. To tackle this issue, we present the following generalization of the original fused lasso in (1) to the case of multivariate
AC C
time series of correlations, based on the approach in Vert and Bleakly (2010):
minu∈
where ||ut+1,· − ut,· || =
1 p(p−1)/2
T X
2
||˜ rt − ut || + λ
t=1
q Pp(p−1)/2 m=1
T −1 X
||ut+1 − ut ||,
(2)
t=1
(ut+1,m − ut,m )2 , ˜ rt is the vector of p(p − 1)/2 sample
pairwise correlations at time point t as computed in Section 2.1.1, and ut ∈
10
ACCEPTED MANUSCRIPT
term measures the error between the observed pair-wise correlations and the piece-wise constant function. The second term in (2) controls the increments between successive time points across the p(p − 1)/2 pair-wise correlations, which collapses to zero when the multivariate time series does
RI PT
not change significantly between successive time points, and takes non-zero values corresponding to significant changes in functional connectivity defined as non-trivial cumulative changes across all elements in the piecewise constant approximation. At these jumps, one or functional connections
SC
may change, while the others may remain constant depending on the data. The penalty parameter λ influences the number of change points, in a manner similar to model (1). We denote the above
M AN U
approach as connectivity change point detection which is abbreviated as CCPD. For a fixed value of λ, we can fit model (2) using the group lasso (Yuan and Lin, 2006), as described in the Appendix.
2.1.3. Choice of Parameters in CCPD
As pointed out in Section 2.1.2, the number of change points is determined by the penalty
TE D
parameter λ, with a smaller value yielding a greater number of change points and vice-versa. Tibshirani and Wang (2007) proposed an estimate of λ based on a pre-smoothed fit of a univariate time series using a lowess estimator (Becker et al., 1988). We adapt this approach for a multivariate
EP
time series to obtain an initial estimate for λ, and then propose post-processing steps to tune this estimate and obtain change points.
AC C
Generalizing the approach in Tibshirani and Wang (2007) to the multivariate case, we apply lowess independently to each time-series of pair-wise correlations as a first step. Denote the smoothed fit as ρ¯q = (¯ ρq1 , . . . , ρ¯qT ) for the q-th pair-wise correlation profile, q = 1, . . . , p(p − 1)/2. The fraction parameter in the lowess fit which controls the smoothness level, is chosen to be small so as to avoid oversmoothing which will cause difficulty in detecting potential change points. Then for each time series, we compute the first order differences δqt = ρ¯qt − ρ¯q,t+1 , t = 1, . . . , T −1, followed by the median of (δq1 , . . . , δq,T −1 ) denoted as µq , q = 1, . . . , p(p − 1)/2. Next, we compute the median 11
ACCEPTED MANUSCRIPT
of the absolute deviations {|δq1 − µq |, . . . , |δq,T −1 − µq |}, and denote it as ∆q , q = 1, . . . , p(p − 1)/2. Finally, we note that equation (2) can be expressed as minu∈
rt t=1 ||˜
− ut ||2 subject to
||ut+1 − ut || ≤ s2 , where λ ∝ 1/s2 , with a smaller value of s2 implying a lesser number of
increments. We choose the threshold parameter as
max
{2∆q +
q=1,...,p(p−1)/2
t=1
|δqt |1(|δqt | > 4∆q )},
(3)
SC
s2 =
T −1 X
RI PT
PT −1
PT
where the expression inside the parenthesis corresponds to the threshold for the q-th individual
M AN U
pairwise connectivity time series in (1) and is motivated by the choice in Tibshirani and Wang (2007), and where 1(·) is the indicator function. The above expression (3) specifies the threshold for the increment term in the fused lasso as the maximum of the thresholds for the individual pairwise connectivity time series, in order to ensure that no true change point is omitted in the first step. Equation (3) assumes that first order differences with absolute values greater than 4∆q
TE D
corresponds to a change in connectivity, and the 2∆q term ensures that the threshold is not very small and is able to capture all the true change points accurately with no omissions, even at the cost of detecting spurious change points. Using the choice of the threshold in (3), we then re-fit
EP
the fused lasso (2) to obtain an initial estimate of the number of change points (Kmax) and their
AC C
locations τ ∗ = (t∗1 , . . . , t∗Kmax ). The initial estimate for the number of change points is potentially inflated due to the choice of a small value of the lowess fraction parameter, and the manner in which λ was computed in (3). This is to ensure that we do not exclude true change points in the initial fit. In the next step we propose a screening criteria to exclude false change points in τ ∗ . In particular, the screening criteria involves a post-processing step as follows. For each given subset of k < Kmax change-points, we approximate the signals between the successive changepoints with the mean value of the points in that interval. Subsequently, we calculate the total sum
12
ACCEPTED MANUSCRIPT
of squared errors (SSE) between the set of real signals and these piecewise constant approximations to them. Though it may appear computationally intensive to do this for all subsets of k < Kmax change-points, a dynamic programming strategy (Picard et al., 2005) enables us to compute the
RI PT
subset of k change points having the minimum SSE from among all possible sets of k < Kmax change points in O(Kmax2 ) time. Denote the minimum SSE obtained from the subset of all possible k change points derived from the set τ ∗ as SSE(k), and denote the corresponding locations
SC
of the change points as τ ∗k . Clearly, SSE(k + 1), will be smaller than SSE(k); however, after a certain point, adding an extra change-point will have a negligible impact on the SSE reduction.
M AN U
We determine the optimal number of change points by examining the curvature of the SSE curve as follows. First we normalize the minimum SSE scores SSE(1), . . . , SSE(Kmax) as Jk = SSE(Kmax)−SSE(k) SSE(Kmax)−SSE(1) (Kmax
− 1) + 1, where J(1) = Kmax, and J(Kmax) = 1. Then we compute
the curvature of these normalized scores via discrete second derivatives as ∇k = Jk−1 − 2Jk + Jk+1 ,
TE D
and select the number of change points as K = max{1 < k < Kmax : ∇k > 0.5} such that the second derivative does not rise above a certain threshold on addition of further change points. The choice of the threshold 0.5 is recommended in earlier papers on change point estimation (Picard et
EP
al., 2005), and worked adequately well for our applications. The idea behind this approach is that if the curvature of the normalized SSE scores does not change beyond a certain value by adding an
AC C
extra change point, then that will imply that the SSE does not reduce significantly, suggesting that no additional change points are required. In our experience based on extensive numerical studies, this approach is able to eliminate any false change points included in the initial set of change points τ ∗ and produces reliable estimates. We provide a summary of the above steps for estimating the number of change points in the form of Algorithm I in the Appendix.
2.2. Stage II: Brain Network Estimation based on Graphical Models We note that although it is possible to threshold the pairwise correlations in equation (2) to 13
ACCEPTED MANUSCRIPT
estimate the brain network, we use a partial correlation based approach which is known to have some inherent advantages (Smith et al., 2011) and does not depend on thresholding to estimate the network. After estimating the change points in the stage I, the brain network within each state phase
RI PT
is estimated via sparse precision matrices under graphical models, which estimates the strength of connections via partial correlations. The fMRI data will be pre-processed using standard procedures (Lindquist et al., 2008) including de-trending and de-meaning. We also pre-whiten the observations
SC
to remove temporal correlations across scans, adopting the spectral decomposition approach in Shi and Guo (2016). The pre-processed fMRI data are then assumed to be independently distributed
M AN U
as zero mean multivariate Gaussian distribution. Suppose τ ∗K = {a1 , . . . , aK } are the estimated change points in Stage I. We propose the following Gaussian graphical model
yit ∼ N (0, Ω−1 k ), if ak−1 < t ≤ ak , k = 1, . . . , K + 1,
(4)
TE D
where 1 = a0 < a1 < . . . < aK < aK+1 = T , and Ωk is a sparse group level inverse covariance matrix for the k-th state phase (k = 1, . . . , K + 1), from which one can derive partial correlations to estimate the brain network. We note that given the estimated change points in the stage I,
EP
there are K + 1 distinct brain networks encapsulated by Ω1 , . . . , ΩK+1 . Model (4) specifies a group
AC C
level change point model, where it is assumed that all the subjects in the group have similar transition points, which is reasonable for task based fMRI experiments. In the Section 2.3, we relax this assumption to identify and exclude aberrant subjects who have significant differences with the group level brain network, and we extend this approach to the analysis of change points for individual subjects in Section 2.4. The Gaussian graphical model in (4) enables us to infer brain networks in the form of sparse inverse covariance matrices Ω1 , . . . , ΩK+1 which can be translated to graphs G1 , . . . , GK+1 . The
14
ACCEPTED MANUSCRIPT
graph Gk has the vertex set V = {1, . . . , p} corresponding to the p ROIs, and edge set Ek comprising all the edges in Gk corresponding to non-zero off-diagonal elements in Ωk , k = 1, . . . , K + 1. The edges in Ek correspond to significant associations between pairs of ROIs in terms of partial
RI PT
correlations which can be be directly derived from Ωk , k = 1, . . . , K + 1. In other words, one can obtain the graph Gk , and hence the brain network for the k-th state phase, by studying the pattern of zeros in the sparse inverse covariance matrix Ωk , k = 1, . . . , K + 1.
SC
A popular method for estimating sparse inverse covariance matrices is the graphical lasso (Friedman et al. 2007) which is a penalized likelihood approach and can be thought of as an extension of
M AN U
the Lasso approach in regression settings. The graphical lasso operates by imposing a L1 penalty on the off-diagonal elements of the inverse covariance matrix under a Gaussian likelihood, which results in zero off-diagonal entries corresponding to absent edges. The graphical lasso for brain network estimation estimates the sparse inverse covariance matrix for the k-th state phase as
TE D
ˆ k,γ = arg minΩ {− log(det(Ωk )) + tr(Sk Ωk ) + γ|Ωk |1 }, k = 1, . . . , K + 1, Ω k
where γ is the tuning parameter controlling for the sparsity of the graph, |Ωk |1 =
(5)
P
r,s |ωk,rs |
denotes the element-wise L1 norm of Ωk , Sk is the sample correlation matrix for all observations
EP
belonging to the k-th state phase which can be obtained directly from the sample covariance matrix ˆ k computed in Section 2.1.1. A higher value of the penalty parameter γ leads to sparser estimates Σ
AC C
of the inverse covariance matrix, and vice-versa. We obtain the solution in (5) corresponding to a grid of γ values, and choose an optimal value of γ which minimizes the BIC computed as
BICγ =
K X T X
k=1
ˆ ˆ tr(Sk Ωk,γ ) − log(det(Ωk,γ )) 1(ak−1 ≤ t < ak ) + ek,γ log(nk ) ,
(6)
t=1
ˆ k,γ . We use the efficient algorithm developed by Friedman where ek,γ is the number of edges under Ω et al. (2007) to implement the graphical lasso approach, which is available in the R package glasso available at the CRAN repository (https://cran.r-project.org/web/packages/glasso/index.html). We note 15
ACCEPTED MANUSCRIPT
that the graphical lasso is known to result in stable networks when the number of data points is larger than the number of nodes, which is certainly the case in our group level analysis. Even for short bins involving a small number of time scans t∗ , the total number of data points used to
RI PT
estimate the precision matrix is N t∗ which is expected to be at least moderate and larger than the number of ROIs in practical applications. Hence the proposed approach is expected to yield stable precision matrix estimates which are not overly sensitive to the state phase lengths.
SC
2.3. A robust procedure for estimating group level dynamic connectivity
M AN U
In designing a change point estimation approach, we would like to develop a method which can perform well even under variations in the location of change points across subjects, with the estimated group level change points essentially reflecting the most commonly occurring transitions across individuals in the group. Heterogeneity across subjects is expected to be limited in taskrelated fMRI studies where subjects engage in the same experimental paradigm. However, there
TE D
may be some aberrant subjects with large systematic differences whose neural activities do not properly align with the experimental tasks as expected. For example, in our study with saccade trails, a few aberrant subjects have slower response times and a lower number of correct responses
EP
due to behavioral errors, but the majority of subjects have negligible errors implying consistency in the behavioral response. Our aim is to ensure that the proposed change point approach is able to
AC C
identify these aberrant subjects and exclude them from subsequent analysis, while retaining other subjects with limited variations in the location of change points. We note that it may be of possible interest to look more closely at the aberrant subjects to identify the causes for systematic variations, which may be potentially indicative of disease severity or some other underlying physiological attributes. However, this will require a secondary analysis which is not the focus of this article. We propose an approach that can provide reliable estimates of dynamic connectivity patterns in the presence of aberrant subjects. The proposed approach is based on a sub-sampling scheme. We 16
ACCEPTED MANUSCRIPT
randomly select a sub-sample of pre-specified size (typically taken to be N −5) by excluding a small subset of subjects, and compute the change points based on this sub-sample using the methods in Section 2.1. This procedure is then repeated multiple times with distinct sub-samples, and a
RI PT
set of change points is computed for each sub-sample. Once this process is completed, the final change points are estimated as those which appear most frequently across all the sub-samples. The justification for this approach lies in the fact that change points which are supported by the majority
SC
of the subjects in the dataset, will be detected more frequently across multiple sub-samples, and any fallacious change points which may show up in a limited number of sub-samples due to the
M AN U
influence of aberrant subjects will be eliminated through this process. By choosing to work with multiple sub-samples having a reduced sample size, our goal is to ensure a robust change point estimation by reducing the impact of the aberrant subjects. Moreover, the proposed approach is equipped to provide a characterization of the uncertainty in the temporal locations of the change
TE D
points and accommodate limited heterogeneity in the location of change points. In particular, by repeatedly drawing subsamples and calculating the corresponding change points, the proposed method is able to provide a subsampling distribution of estimated change points.
EP
In order to identify aberrant subjects, we compute the sparse inverse covariance matrix as in Section 2.2 using the estimated change points, independently for each subject. This essentially
AC C
reduces to computing the estimate in (5) where the covariance matrices S1 , . . . , SK+1 , are computed based on single subject data. The individual BIC scores using single subject data are then computed using (6), and those who have BIC score above a certain threshold (greater than the 90-th percentile) are identified as aberrant subjects, since a higher BIC score implies the inadequacy of the model fit under the estimated change points. These identified aberrant subjects are then excluded, and the group level dynamic brain network is computed using the remaining typical subjects, under the methods in Section 2.2.
17
ACCEPTED MANUSCRIPT
2.4. Extension to Single Subject Change Point Analysis We now extend the proposed CCPD approach for inferring subject level change points, which does not assume a similar locations of change points across subjects. It uses the core idea that
RI PT
the change points for a individual ‘i’ will show up prominently under the sub-sampling scheme involving all pairs of subjects containing subject ‘i’, and additional change points which are not representative of the dynamics for subject ‘i’ will be downweighted across sub-samples and hence
SC
be automatically filtered out.
Given fMRI data on N subjects, we first construct all possible pairs of subjects containing
M AN U
the ith individual, i.e. consider the set Pi := {(i, i0 ), i0 = 1, . . . , N, i 6= i0 }. We then run our sub-sampling approach outlined in Section 2.3 on the subsamples corresponding to the pairs in Pi . Subsequently, we retain the set of all change points which show up at least once during the subsampling scheme, and denote this set by τ ∗i . It is possible that we obtain consecutive time points
TE D
as change points, but each having frequency one, which implies a strong possibility that the true change point belongs to the neighborhood represented by the consecutive change points. In order to eliminate any spurious change points which may have been initially included in τ ∗i , we propose a
EP
screening procedure which is described below. Since we want to retain the most promising change points in τ ∗i in the final output, we do not subject those change points with the highest frequencies
AC C
under the subsampling scheme (denoted by τ¯ i ) to the screening process. Denote the set of all change points in τ ∗i excluding those with the highest frequencies by τ ∗i , so that τ ∗i is a union of τ¯ i and τ ∗i . We use the BIC reduction idea incorporating bootstrapping developed by Cribben et. al (2012) in the DCR paper, to infer false positives in τ ∗i . First, we compute the BIC reduction resulting due to the addition of each of the change points in τ ∗i to the preliminary set of change points τ¯ ∗i , and then examine the change point in τ ∗i (denoted by τ 0 ) which yields the minimum BIC reduction. We note that the BIC criteria is computed using
18
ACCEPTED MANUSCRIPT
equation (6) in the manuscript, but using data from the ith subject. To test whether τ 0 is a false positive, we compute confidence bounds for the BIC reduction by including τ 0 in addition to τ¯ ∗i using a simple permutation test procedure. In particular, the data is repeatedly permuted across
RI PT
time, and for each permuted dataset, the data is split into several parts, with boundaries consisting of the change points (¯ τ ∗i , τ 0 ). The combined BIC score from each subset is subtracted from the BIC of the entire permuted data set and the results are combined across permutations to create a
SC
permutation distribution.
We then determine whether the value of the BIC reduction for the original data is extreme
M AN U
relative to the permutation distribution. If not, then τ 0 is labeled as a false positive and excluded from τ ∗i . Otherwise, τ 0 is identified as an important change point and included in the set τ¯ ∗i . Then the above procedure is repeated for the remaining elements in τ ∗i to obtain an updated set of important change points τ¯ ∗i for the ith individual. The above process is then repeated for all
TE D
individuals, to infer subject specific change points τ¯ ∗1 , . . . , τ¯ ∗N . Once these change points have been estimated, the dynamic network for subject i is computed in the second stage conditional on the
3. Results
EP
estimated change points τ ∗i , as in Section 2.2 of the manuscript, but using single subject data.
3.1. Simulation Studies
AC C
3.1.1. Simulation Set-up
We conducted a series of simulation studies to examine the performance of our approach in terms of change point detection, and brain network estimation. We generate data sets which include both typical subjects having limited variability in the location of change points, and also aberrant subjects with large systematic differences compared to the group level dynamics. The data was generated under a Gaussian graphical model (4) containing several state phases corresponding to unequally distributed change points. 19
ACCEPTED MANUSCRIPT
We evaluated the performance of the proposed method for a range of networks with different characteristics. In particular, we consider the brain networks derived from four different types of graphs: (a) Erdos-Renyi random graphs (Erdos and Renyi, 1959) was generated for the first state
RI PT
phase using the same probability for all connections (0.15), while the networks for the subsequent state phases are obtained by flipping a subset of edges (adding an edge if there was no edge, and deleting an edge if there was one) in the graph corresponding to the previous state phase; (b)
SC
Erdos-Renyi random graphs (Erdos and Renyi, 1959) were generated independently for each state phase; (c) scale-free graphs were generated independently for each state phase using the preferential
M AN U
attachment model of Barabasi and Albert (1999); and (d) small-world random graphs, obtained using the Watts and Strogatz (1998) model, were generated independently for each state phase. The random graphs in (a) mimic the time varying brain networks which retain certain connections between consecutive state phases, but discard others, while the random graphs in (b) represent
TE D
a random network setting that is commonly seen in various types of data and has been widely studied. On the other hand, the scale-free and small-world networks in (c) and (d), are motivated by the characteristics of real life brain networks derived from fMRI data.
EP
Given the graph, the corresponding precision matrix was computed by fixing off-diagonals to zero for absent edges, and randomly generating elements from U (−1, 1) corresponding to important
AC C
edges. The diagonal elements are generated by adding one to the sum of the absolute values of the off-diagonal elements of the corresponding rows, to yield a diagonally dominant structure which ensures positive definiteness. The simulated brain network and the corresponding precision matrix was constant for all time scans within a state phase but they were different across the state phases. The data was generated independently for all subjects using a Gaussian graphical model characterized by the precision matrices generated in the above manner. For each type of graph, we then considered four combinations, i.e. (p, T ) = (10, 200), (20, 200),
20
ACCEPTED MANUSCRIPT
with change points as 40,80,140, and (p, T ) = (10, 300), (20, 300), with change points as 40,115,175. For each case, we generated data for N = 60 subjects, out of which 50% subjects had limited variation in the location of the change points (+/-3 time scans) compared to the group level tran-
RI PT
sitions, and 5 subjects were aberrant with large systematic differences. The results are presented in Table 1. As a special case, we also examine the performance of different approaches under a full network in Table 2, where none of the edges are absent. Further, Table 3 reports the results under
SC
high dimensional scenarios with 200 nodes and T = 1200 time points, with N =20,60, subjects. For these cases, we consider 20 group level transition periods each consisting of 7 consecutive time
M AN U
points, and the distance between any two consecutive transition periods range from 20 to 200. The simulation setup implies slower transitions between state phases, which is more consistent with the timescale of the hemodynamic response. Finally, we also consider cases when the observations across time points are autocorrelated, with limited heterogeneity in the location of change points
TE D
and having 5 aberrant subjects, as in Tables 1-2. We generate data under the VAR model (Monti et. al, 2014), which is an econometric model generalizing the univariate AR model to multivariate responses, as well as a matrix variate Gaussian graphical model, where the observations are gener-
EP
ated under the four network structures (a)-(d) described above, in the presence of autocorrelations across time points. Tables 4 and 5 report the results under these autocorrelated cases with and
AC C
without prewhitening the observations. The prewhitening was performed by modifying the spectral decomposition approach in Shi and Guo (2016). Finally, we also generate data to evaluate the performance of the CCPD approach for single subject change point detection. In particular, we generate data for N = 20 subjects under different networks considered previously in the manuscript. For each network having dimension p = 20, we generated T = 600 time points with true change points generated as follows: (a) we randomly generated three change points for each subject, with consecutive change points spaced at least
21
ACCEPTED MANUSCRIPT
60 scans apart; and (b) we chose 15 true change points and each subject were randomly allocated three change points from among these 15 change points with the constraint that the selected change points were at least 60 scans apart. While the first scenario corresponds to no overlap and maximum
RI PT
heterogeneity across subjects, the second scenario implies a partial overlap in change points across subjects where one or at most two change points may be common between a small number of subjects while the remaining subjects have minimal or no similarity in change points.
SC
3.1.2. Performance Metrics
We evaluate the performance of the proposed method for change point detection as well as
M AN U
network estimation. To assess the performance in terms of change point estimation, we report the proportion of change points that were detected correctly and those which were falsely identified across replicates, depending on whether the difference between the true and the estimated change points were less than equal to, or greater than 2. Conditional on the estimated change points in the
TE D
first step, the graph estimation performance was assessed by comparing the estimated and the true networks at each time point in terms of sensitivity and specificity, and then averaging these metrics across time points. Sensitivity represents the power to detect true connections and is computed as
EP
the ratio of the true connections detected versus the total number of true connections in the network. Moreover, specificity measures the proportion of absent edges that are correctly identified as such
AC C
and can be loosely interpreted as 1-false positive rate (FPR). We also present receiver operating characteristic (ROC) curves for graph estimation for the first four simulation cases, which plots the sensitivity versus 1-specificity levels for a series of graphs estimated by varying the penalty parameter γ in (5). A higher area under the ROC curve implies more accurate graph estimates across all values of the penalty parameter, with higher sensitivity and specificity levels. Finally, we also examine computation times under our method. We compare our approach with dynamic connectivity regression (DCR) approach for multi-
22
ACCEPTED MANUSCRIPT
subject data proposed by Cribben et al. (2012), which is a change point estimation method that uses the graphical lasso for network estimation, the code for which is available on the author’s website. For implementing the DCR algorithm, the block size for the stationary bootstrap was
RI PT
chosen to be 30 (we found that the block size in DCR had minimal impact on performance), the significance level for permutation test was fixed at 0.05 and the number of bootstrap samples used was 500. The sensitivity and specificity under the estimated graphs for each segment are computed,
SC
and these are averaged to report a point estimate. We also compare with the SINGLE approach by Monti et al. (2014) which computes a network at each time point in a temporally homogeneous
M AN U
manner, as well as the generic sliding window technique. For both these approaches, we obtain local estimates of the covariance matrices by downweighting time points far away using a kernel function Kh (i, j) = I(|i − j| ≤ h), as in Monti et al. (2014). Here, I(·) is an indicator function, and Kh (i, j) represents the weight given to time point j when computing the local covariance matrix at
¯ i,SL = timepoint i is y PT
j=1 Kh (i, j)(yi
PT
TE D
time point i corresponding to window length h. Given the kernel function, the local mean at the j=1 Kh (i, j)yi /
PT
and the local covariance matrix is Si,SL =
PT
We then average these local covariance
¯ i,SL )(yi − y ¯ i,SL )T / −y
j=1 Kh (i, j), j=1 Kh (i, j).
EP
matrices for each time point across all subjects, and use them to obtain group level estimates for the graph at each time point under the graphical lasso method. We compute the sensitivity and
AC C
specificity under the estimated graphs at each time point, and average across them. We present the sliding window results under window lengths of 10 and 50, according to recommendations in Shakil et al. (2016), Leonardi and Van De Ville (2015) and Zalesky and Breakspear (2015). We also consider the tapered window approach of Change and Glover (2010) in our simulations, but did not present the results since it did not yield any decisive advantages. 3.1.3. Simulation Results From Table 1, we see that the proposed approach is able to detect the true group change points
23
ACCEPTED MANUSCRIPT
for all scenarios, with zero false positives, even in the presence of aberrant subjects. On the other hand, the DCR approach performs poorly, often missing change points altogether and reporting false positives, which are somewhat expected under the DCR approach depending on the number
RI PT
of bootstrap replicates or the width of the confidence bounds. In terms of graph estimation, the proposed approach often has a higher area under the curve for graph estimation compared to DCR and sliding window based approaches, with higher sensitivity levels and higher or comparable
SC
specificity levels. We note that the lower AUC values under DCR for network estimation may be related to the poor change point estimation performance. Further, although it was not possible
M AN U
to report change point estimation performance under the sliding window based approaches, these methods have a consistently lower AUC, sensitivity, and specificity compared to the proposed approach. Moreover, Table 2 also illustrates the superior change point estimation performance and higher sensitivity levels under the proposed approach, for the full correlation case.
TE D
The results under the high dimensional cases reported in Table 3, suggest that the proposed approach can be scaled to a large number of nodes, time points, and change points. On the other hand, neither the DCR or the SINGLE approach could be applied to settings involving p = 200
EP
nodes and T = 1200 time scans, due to an unrealistic computational burden. We observe that the change point performance of the approach essentially remains the same when the sample size
AC C
is reduced from 60 to 20, while keeping the number of nodes and time points fixed. Moreover, while there may be a drop in the AUC under all approaches when the sample size is reduced while keeping the number of nodes fixed, the difference is often lesser under the proposed approach in contrast to the competing approaches. For data generated under autocorrelations, Tables 4 and 5 present the results for both prewhitened and un-prewhitened data. The results clearly suggest that the proposed approach detects the change points accurately for all cases. For data generated under the matrix variate Gaussian graphical
24
ACCEPTED MANUSCRIPT
model, the DCR approach often reports a large number of spurious change points as evident from the false positive column, and it also fails to detect all the true change points in some cases. Moreover, the AUC under the proposed approach is comparable and often higher than the DCR
RI PT
approach, both for data without prewhitening, as well as pre-whitened data. For the VAR data, the DCR method is able to detect all the true change points, but consistently reports an inflated number of transitions with high false positives, as evident from the results in Table 5. The AUC
SC
under the proposed approach is seen to be comparable or slightly higher than the DCR approach for most cases under the VAR model. In all cases, the AUC under the sliding window approach
M AN U
with a window length 10 is much lower than both the approaches. Our simulation study adds insights into the performance of network modeling based on prewhitened data, for which there is a limited literature (Zhu and Cribben, 2018).
We also experimented by varying the number of aberrant subjects, and discovered that the
TE D
proposed approach is able to accurately detect all the change points with no false positives, even when 30 out of 60 subjects had large systematic differences, which is encouraging. From our numerical experiments, we expect that the proposed approach will work well in task based fMRI
EP
experiments, where the number of aberrant subjects is expected to be limited. Another useful feature of the proposed subsampling scheme in Section 2.3 is that it can be used to characterize
AC C
uncertainty. For example, we are able to construct histograms for the estimated change points (see Figure 3a) which clearly show peaks around the true change point locations, highlighting their ability to accurately detect change points. For the single subject change point detection analysis, the results are presented in Table 6 of this report. From the Table, it is clear that the proposed CCPD approach has a desirable accuracy for detecting true change points (80% or higher), and also reports negligible false positives. Moreover, a higher accuracy is observed under the partial overlap case containing 15 true change points
25
ACCEPTED MANUSCRIPT
Table 1: Simulation results for data having discrete group level change points with 50% of subjects having
P 10 20 10 20
T 200 200 300 300
CP 1 1 1 1
FP 0 0 0 0
CCPD SE 0.90 0.87 0.90 0.84
SP 0.91 0.92 0.95 0.95
AUC 0.93∗ 0.92∗ 0.95∗ 0.93
CP 1 1 1 1
FP 7.5 11.3 8.2 7.4
DCR SE 0.86 0.70 0.85 0.79
SP 0.85 0.93 0.94 0.91
AUC 0.89 0.89 0.92 0.91
B B B B
60 60 60 60
10 20 10 20
200 200 300 300
1 1 1 1
0 0 0 0
0.87 0.91 0.88 0.92
0.99 0.99 0.98 0.98
0.95∗ 0.96∗ 0.95 0.95
1 1 1 1
4.4 7.9 5.2 6.9
0.86 0.77 0.89 0.84
0.90 0.94 0.92 0.96
0.91 0.92 0.93 0.94
C C C C
60 60 60 60
10 20 10 20
200 200 300 300
1 1 1 1
0 0 0 0
0.90 0.84 0.91 0.87
0.99 0.98 0.98 0.98
0.97∗ 0.95 0.97∗ 0.97∗
1 0.93 1 0.89
6.5 7.5 5.5 6.0
0.87 0.88 0.89 0.83
0.86 0.87 0.89 0.95
0.91 0.93 0.91 0.94
D D D D
60 60 60 60
10 20 10 20
200 200 300 300
0.97 1 1 1
0 0 0 0
0.88 0.91 0.85 0.86
0.99 0.99 0.98 0.98
0.96∗ 0.97 0.97∗ 0.93
0.63 0.63 0.7 0.62
2.2 2.5 3.1 2.2
0.93 0.89 0.89 0.92
0.85 0.92 0.84 0.92
0.90 0.94 0.88 0.93
A A A A
N 60 60 60 60
P 10 20 10 20
T 200 200 300 300
CP NA NA NA NA
FP NA NA NA NA
SD(10) SE 0.90 0.79 0.86 0.79
SP 0.11 0.22 0.14 0.21
AUC 0.59 0.61 0.60 0.60
CP NA NA NA NA
FP NA NA NA NA
SINGLE SE 0.56 0.51 0.55 0.51
SP 0.64 0.70 0.63 0.70
AUC 0.74 0.62 0.69 0.69
B B B B
60 60 60 60
10 20 10 20
200 200 300 300
NA NA NA NA
NA NA NA NA
0.91 0.80 0.92 0.79
0.09 0.21 0.08 0.21
0.59 0.62 0.58 0.61
NA NA NA NA
NA NA NA NA
0.54 0.54 0.66 0.55
0.66 0.67 0.66 0.68
0.77 0.76 0.76 0.75
C C C C
60 60 60 60
10 20 10 20
200 200 300 300
NA NA NA NA
NA NA NA NA
0.92 0.79 0.92 0.79
0.08 0.21 0.09 0.21
0.59 0.59 0.62 0.61
NA NA NA NA
NA NA NA NA
0.50 0.49 0.48 0.48
0.62 0.69 0.66 0.66
0.76 0.76 0.75 0.77
D D D D
60 60 60 60
10 20 10 20
200 200 300 300
NA NA NA NA
NA NA NA NA
0.92 0.80 0.92 0.79
0.08 0.22 0.18 0.23
0.64 0.64 0.62 0.63
NA NA NA NA
NA NA NA NA
0.49 0.49 0.49 0.49
0.70 0.64 0.70 0.69
0.72 0.72 0.74 0.73
M AN U
TE D
EP
RI PT
A A A A
N 60 60 60 60
SC
a variation of +/-3 time scans around these group level change points. CP is percentage of estimated true change points; FP is the number of estimated false positive change points; SE is sensitivity. SP is specificity; AUC is Area under curve. The first column denotes different simulation scenarios - A:Random Network. B: Erdos Renyi. C: Small-World. D: Free-network. The approach having a significantly higher AUC compared to all other approaches is indicated via an asterisk sign.
AC C
shared across all subjects. In contrast, the DCR approach has very poor power to detect the true positives, and has significantly higher false positives. Finally, the AUC under CCPD is also significantly higher compared to DCR and SINGLE, with the latter two approaches having a similar performance. We note that the change point performance for individual subjects under both CCPD and DCR approaches are poorer than the group level results, but this is expected due to the large heterogeneity in the location of change points in the single subject analysis. Figure 3b-3c also provides histograms for change point detection under CCPD and DCR for single subject analysis. 26
ACCEPTED MANUSCRIPT
Table 2: Simulation results for full correlation case, where the graphs for all the four bins are fully connected, and only the strength of connections are different. SE denotes the sensitivity, and CP and FP are as defined in Table 1. Unlike the DCR, CCPD has zero falsely estimated change points and maintains high sensitivity. Note that it was not possible to compute AUC since the specificity is always zero for a full graph. The full correlation case represents a challenging scenario for the methods considered, since all of them rely on sparse precision matrix estimation using graphical models. T 200 200 300 300
CP 1 1 1 1
CCPD FP 0 0 0 0
SE 0.99 0.98 0.97 0.98
CP 1 1 1 0.83
DCR FP 3.5 11.4 3.7 13.7
SE 0.80 0.41 0.82 0.49
20 20 20 20
10 20 10 20
200 200 300 300
1 1 1 1
0 0 0 0
0.99 0.98 0.98 0.97
1 0.73 1 0.7
2.3 10.4 1.5 9.1
0.78 0.47 0.87 0.47
SD(10) SE 0.92 0.77 0.91 0.77
SD(50) SE 0.90 0.81 0.89 0.83
0.96 0.82 0.96 0.82
0.89 0.80 0.90 0.83
RI PT
P 10 20 10 20
SC
N 60 60 60 60
Table 3: Simulation results for T=1200 and P=200 cases. There are 20 group level transition periods each
M AN U
consisting of 7 consecutive time points, and the distance between any two consecutive transition periods range from 20 to 200. The simulation setup implies slower transitions between state phases. CP is percentage of estimated true change points; FP is the number of estimated false positive change points; SE is sensitivity. SP is specificity; AUC is Area under curve and TIME is the running time for change point estimation in minutes. The first column denotes different simulation scenarios - A:Random Network. B: Erdos Renyi. C: Small-World. D: Free-network. The results under DCR and SINGLE are not included for p = 200 because of infeasible computational burden. CCPD performs accurately in terms of change point estimation and has higher area under the curve compared to competing approaches.
T 1200 1200 1200 1200
CP 1 1 1 1
A B C D
20 20 20 20
200 200 200 200
1200 1200 1200 1200
1 1 1 1
FP 0 0 0 0
SE 0.75 0.80 0.77 0.78
CCPD SP 0.90 0.92 0.91 0.91
AUC 0.83 0.83 0.84 0.84
SP 0.56 0.55 0.68 0.45
SD(10) SE 0.88 0.90 0.87 0.92
AUC 0.77 0.72 0.75 0.70
SP 0.60 0.60 0.64 0.57
SD(50) SE 0.81 0.83 0.85 0.81
AUC 0.75 0.74 0.79 0.73
0 0 0 0
0.77 0.84 0.77 0.78
0.86 0.80 0.88 0.82
0.82 0.81 0.83 0.80
0.83 0.70 0.82 0.66
0.57 0.44 0.50 0.54
0.61 0.59 0.62 0.59
0.80 0.75 0.76 0.69
0.59 0.53 0.60 0.58
0.68 0.63 0.59 0.74
TE D
P 200 200 200 200
EP
A B C D
N 60 60 60 60
AC C
Extensive simulation studies illustrate that the proposed approach has a reasonable computation time and is much faster and more scalable than the DCR or SINGLE approaches. For example, it took the method about 67 secs to run for p = 20 nodes and T = 300 time points, and 215 secs to run for p = 50 nodes. Moreover, unlike the DCR and the SINGLE approaches, the proposed approach was scalable to p = 200 nodes and T = 1200 time points. From our experience in simulations in the presence of aberrant subjects, the proposed approach is readily scalable to several hundred nodes. Further, the approach can be scalable to even larger number of nodes in the absence of aberrant
27
N 60 60 60 60
60 60 60 60
60 60 60 60
60 60 60 60
20 20 20 20
20 20 20 20
20 20 20 20
20 20 20 20
A A A A
B B B B
C C C C
D D D D
28
A A A A
B B B B
C C C C
D D D D
10 20 10 20
10 20 10 20
10 20 10 20
10 20 10 20
10 20 10 20
10 20 10 20
10 20 10 20
P 10 20 10 20
200 200 300 300
200 200 300 300
200 200 300 300
200 200 300 300
200 200 300 300
200 200 300 300
200 200 300 300
T 200 200 300 300
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
CP 1 1 1 1
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0.83 0.91 0.84 0.91
0.90 0.94 0.88 0.93
0.90 0.92 0.91 0.92
0.93 0.92 0.89 0.81
0.89 0.94 0.91 0.94
0.91 0.94 0.92 0.94
0.92 0.93 0.92 0.94
TE D
1 0.94 0.83 0.72
1 0.67 1 0.5
1 0.8 1 0.6
1 0.67 1 0.83
1 1 1 0.8
1 1 1 0.8
1 1 1 0.9
0.7 1 0.5 1.6
0.5 4.8 1.1 5.3
0.5 5.6 1.4 5.7
0.8 7.3 0.7 7.1
0.6 1.7 1.1 1.7
0.6 6.3 0.9 5.2
0.4 4.5 1.4 3.8
0.55 0.55 0.53 0.55
SD(10) AUC 0.54 0.53 0.52 0.52 1 1 1 1
CP 1 1 1 1 0 0 0 0
0.84 0.88 0.85 0.90
0.90 0.91 0.89 0.91
0.91 0.90 0.92 0.90
0.90 0.88 0.89 0.90
0.87 0.89 0.93 0.91
0.90 0.90 0.92 0.91
0.54 0.55 0.55 0.57
0.51 0.51 0.51 0.51
0.52 0.51 0.51 0.53
0.50 0.51 0.47 0.52
0.57 0.57 0.56 0.54
0.52 0.53 0.52 0.52
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
1 1 1 1
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
0.87 0.94 0.90 0.94
0.92 0.96 0.92 0.96
0.93 0.94 0.93 0.95
0.82 0.91 0.83 0.90
0.89 0.94 0.87 0.93
0.89 0.92 0.92 0.91
0.92 0.91 0.88 0.90
0.8 7.3 0.7 7.1
0.6 1.7 1.1 1.7
0.6 6.3 0.9 5.2
0.4 4.5 1.4 3.8
0.90 0.77 0.88 0.78
0.80 0.83 0.87 0.90
0.89 0.93 0.88 0.83
0.88 0.88 0.90 0.90
AUC 0.94 0.83 0.91 0.88
1 0.94 0.83 0.72
1 0.67 1 0.5
1 0.8 1 0.6
0.7 1 0.5 1.6
0.5 4.8 1.1 5.3
0.5 5.6 1.4 5.7
0.78 0.81 0.83 0.83
0.86 0.81 0.85 0.79
0.87 0.79 0.88 0.82
RI PT
1 0.67 1 0.83
1 1 1 0.8
1 1 1 0.8
1 1 1 0.9
With Prewhitening DCR AUC CP FP 0.96 1 0.7 0.94 1 7.2 0.95 1 2.1 0.94 1 6.5
SC
CCPD FP 0 0 0 0
M AN U
0.91 0.90 0.93 0.91
Without Prewhitening DCR AUC CP FP AUC 0.93 1 0.7 0.91 0.94 1 7.2 0.91 0.93 1 2.1 0.93 0.94 1 6.5 0.91
EP
CCPD FP 0 0 0 0
AC C
0.50 0.50 0.51 0.50
0.50 0.5 0.50 0.50
0.50 0.51 0.51 0.50
0.50 0.50 0.50 0.50
0.52 0.51 0.55 0.53
0.52 0.52 0.53 0.52
0.53 0.54 0.53 0.55
SD(10) AUC 0.52 0.53 0.52 0.52
There were 3 true group level change points and 50% of the subjects had variations of +/3 time scans across these change points. The first column reports simulation scenarios for graph generation. A is Random graph, B is Erdos Renyi, C is Small World, and D is Scale Free. CP, FP, and AUC are defined as in Table 1. The first 10 columns represent the result of data generated by matrix variate model directly, whereas column 12-18 represent the result after the data has been pre-whitened. In both cases, note that CCPD uses the data without pre-whitening to estimate change points. All results are averaged over 100 replicates.
Table 4: Simulation results for matrix variate model incorporating correlations between nodes for each time point as well as temporal correlations.
ACCEPTED MANUSCRIPT
ACCEPTED MANUSCRIPT
Table 5: Simulation results for VAR model. The first 8 rows use the raw data directly without pre-whitening, which means the data are auto-correlated. The last 4 rows use pre-whitened data to estimate the graph. It is clear that CCPD has slightly higher AUC and more accurate change point estimation compared with DCR. Both CCPD and DCR perform better than sliding window with window size of 10.
T 200 200 300 300
CP 1 1 1 1
FP 0 0 0 0
CCPD SE 1.00 0.95 1.00 0.97
SP 1.00 0.99 1.00 0.99
AUC 1.00 0.99 1.00 0.99
CP 1 1 1 1
FP 2.6 1.6 1.2 2.5
DCR SE 0.98 0.92 0.98 0.92
60 60 60 60
10 20 10 20
200 200 300 300
1 1 1 1
0 0 0 0
0.71 0.68 0.98 0.72
0.87 0.86 0.86 0.86
0.86 0.82 0.93 0.83
1 1 1 1
2.6 1.6 1.2 2.5
0.58 0.49 0.58 0.47
SP 0.99 0.98 0.99 0.99
AUC 0.99 0.96 0.98 0.96
SE 0.66 0.73 0.65 0.72
SD(10) SP 0.38 0.31 0.39 0.30
AUC 0.58 0.58 0.58 0.58
0.88 0.97 0.96 0.94
0.85 0.81 0.87 0.80
0.59 0.81 0.57 0.81
0.42 0.19 0.44 0.19
0.51 0.52 0.52 0.53
RI PT
VAR P 10 20 10 20
SC
N 60 60 60 60
TP 2.755 2.540 2.775 2.415 2.715 2.430 2.670 2.410
CCPD FP 0.535 0.590 0.755 0.615 0.620 0.800 0.720 0.610
TE D
Change point Partial overlap Random Partial overlap Random Partial overlap Random Partial overlap Random
AUC 0.757 0.750 0.768 0.773 0.763 0.757 0.771 0.781
TP 0.180 0.120 0.120 0.130 0.100 0.110 0.120 0.100
DCR FP 2.370 2.390 2.370 2.310 2.350 2.415 1.950 1.970
AUC 0.657 0.653 0.688 0.679 0.681 0.673 0.688 0.672
SINGLE AUC 0.651 0.652 0.673 0.679 0.679 0.670 0.685 0.674
EP
Network A A B B C C D D
M AN U
Table 6: Simulation results for subject level change point estimation where data is generated for N = 20 subjects with T = 600 time points and p = 20 nodes, with three true underlying change points. The networks A, B, C, D, refer to the random, Erdos-Renyi, small-world, and scalefree networks. The Change point column indicates how the change points were generated, either randomly or with partial overlap. Columns TP and FP describes the number of true positives and false positives in the estimation of the true change points, with an estimated change point being a true positive if it lies within +/-2 scans of a true change point, and it is false positive otherwise. The AUC column summarizes the area under the curve for the network estimation. Results are reported under CCPD, DCR, and SINGLE methods.
AC C
subjects. In general the scalability of the approach is inversely proportional to the number of aberrant subjects assumed to be present in the sample. 3.2 Experimental Data for Saccade Trials 3.2.1. Saccade Trial fMRI data and pre-processing steps We look at a block task data involving a saccade fMRI trial at the University of Georgia, Athens. Data was collected from 35 right-handed, healthy participants (mean age = 19.5 years, SD = 3.7; 10 males), who experienced no current major psychiatric disorders or substance abuse, had no
29
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
TE D
Figure 3: Histogram of change point estimation involving 20 subjects, quantifying the frequency with which
AC C
EP
each time point is identified as a change point across all subjects, with the true change points being indicated on the X-axis. The top panel (a) corresponds to group level analysis under CCPD with 1200 time points and 200 nodes. The second panel (b) and the third panel (c) corresponds to single subject analysis under CCPD and DCR respectively, with 600 time points and 20 nodes. For panels (a) and (b), we observe high peaks near the true group level change points under CCPD, illustrating the effectiveness of CCPD in change point detection. Panel (c) shows the lack of power in change point detection under DCR, along with high false positives. Clearly, CCPD is successful in detecting all true change points for the group level analysis (panel (a)), and in detecting over 80% true change points in the single subject analysis (panel (b)), and has no or minimal false positives in panels (a) and (b) respectively. On the other hand, DCR (panel (c)) has a poor performance for the single subject analysis.
metal implants, and had normal or corrected-to-normal vision (via self-report), as described in Pierce and McDowell (2016). Participants completed a saccade blocked task which consisted of repeating 20 second blocks (10 scans) of a) fixation (9 blocks), b) pro-saccade trials (4 blocks), and c) anti-saccade trials (4 blocks). The blocks were arranged as fixation followed by prosaccade, then fixation followed by antisaccade, and this sequence was repeated several times such that the total number of scans combining all the task blocks was 170, and the total scan time was 5 minutes
30
ACCEPTED MANUSCRIPT
and 48 seconds. All stimuli consisted of a 1 degree gray shape presented on a black background (Pierce and McDowell, 2016). During fixation a cross appeared in the center of the screen; subjects were asked to look as quickly and accurately as possible towards the peripheral stimulus for pro-
anti-saccade task. Figure 4 illustrates the experimental design.
RI PT
saccade task and towards opposite location of the stimulus, same distance from the center, for the
Participants attended an initial screening session, where they completed demographic surveys
SC
and performed twenty practice trials of mixed prosaccades and antisaccades. Those who met inclusion criteria were scheduled for a subsequent MRI session. A high-resolution structural scan
M AN U
was obtained first for each participant, followed by several functional scans (only one of which is reported here). Stimuli were displayed using Presentation software (Neurobehavioral Systems, Albany, CA) and a dual mirror system attached to the head coil that allowed a participant to view a projection screen at his/her feet and researchers to monitor the participants eye. Right eye pupil
TE D
position was sampled at 60 Hz (IView X MRI-LR system, SensoMotoric Instruments, Germany) and recorded for off-line analysis. Before beginning the saccade tasks, eye position was calibrated using IViews 5-point calibration and an in-house horizontal calibration.
EP
MR images were collected on a 3T GE Signa Excite HDx system at the University of Georgia BioImaging Research Center. A high-resolution anatomical image was collected using a T1-weighted
AC C
3D FSPGR sequence (echo time = 3 ms, flip angle = 20 degrees, field of view (FOV) = 240 mm × 240 mm, matrix size 256 × 256, 150 axial slices, in-slice resolution = 0.94 × 0.94 mm, slice thickness = 1.2 mm, scan time=6 minutes and 32 seconds). The functional scans were collected using a T 2∗ -weighted gradient echo EPI sequence (echo time = 30 ms, repetition time = 2000 ms, flip angle = 90 degrees, FOV =220 mm × 220 mm, matrix size = 64 × 64, 33 interleaved oblique slices aligned with the AC-PC plane, in-slice resolution = 3.4 × 3.4 mm, slice thickness = 4 mm, and 4 dummy volumes for magnet stabilization).
31
ACCEPTED MANUSCRIPT
Eye position data were analyzed using custom scripts written in MATLAB (MathWorks, Natick, MA). Trials were manually scored for initial direction of response (eye movements with velocities surpassing 20 degree/sec were classified as saccades). Trials with no response, blinks at stimulus
RI PT
onset, anticipatory saccades (faster than 90 ms latency or during the gap window), or with insufficient data quality due to loss of pupil tracking were excluded from further analysis (about 6 out of 80 trials on average per subject). Pre-processing of functional MRI data was performed using the
SC
AFNI software package (Cox, 1996, 2012) and included: slice-timing correction, volume alignment, resampling to 4 mm3 voxel grid, spatial standardization to a Talairach template, spatial smoothing
M AN U
(4 mm full width-half maximum Gaussian kernel), and voxel-wise scaling to a mean of 100. Regions of interest (ROIs) were identified using 6 mm spheres centered on coordinates from a previous saccade study (Dyckman et al., 2007) that included supplementary eye fields (SEF), lateral and medial frontal eye fields (FEF), prefrontal cortex (PFC), inferior frontal cortex (IFC), precuneus, cuneus,
TE D
inferior parietal lobule (IPL), middle occipital gyrus (MOG), striatum, and thalamus. The average preprocessed time series then was extracted from each ROI for each participant, for estimating the dynamic brain functional connectivity.
EP
We deleted the first four task blocks (fixation-pro-fixation-anti) from our analysis, in order to exclude data from the high variability blocks where subjects were still familiarizing themselves with
AC C
the task. We also excluded data from the anti-saccade blocks due to high error rate and variability across subjects associated with this task, which is potentially indicative of varied brain states during these trials. Here, error rate is defined as the number of trials with an initial saccade in the incorrect direction divided by the total number of scoreable trials. We then applied the proposed approach to detect change points and estimate dynamic networks based on blocks associated with the pro-saccade task and fixation.
3.2.2. Analysis Results 32
SC
RI PT
ACCEPTED MANUSCRIPT
Figure 4: Saccade Task Design. The saccade task consisted of alternating blocks of fixation, prosaccade
M AN U
trials, and antisaccade trials. During the saccade trials, a central fixation cross appeared for 500 ms, followed by a 500 ms trial cue that indicated to the subjects whether they were to perform a prosaccade (towards the target) or an antisaccade (to the mirror location of the target), then the cue disappeared briefly (200 ms gap), and finally the target (circle) appeared to the left or right of center for 800 ms. Arrows indicate the correct direction of response and were not shown during the trials.
Our analysis detected changes in the saccade network at the beginning of the pro-saccade and fixation blocks which is consistent with the experiment design. In addition, we also detected change
TE D
points in the middle of blocks two fixation blocks and one pro-saccade block. This results implies that the brain networks tend to be stationary during the majority of the task and rest phases, however there could be occasional network changes within a task or rest block. We conjecture that
EP
these changes near the middle of the blocks are potentially due to anticipation of the fixation block at the end of a task block, and vice-versa.
AC C
To characterize how the brain network during a saccade task shifts during pro-saccade and fixation blocks, we examined the partial correlations between all pairs of ROIs during each brain state between the change points. Then, we identified those connections that were relatively stable (i.e., present in at least 50% of the brain states) across both pro-saccade and fixation blocks, exclusively for pro-saccade blocks, exclusively for fixation blocks, or neither block. These connections are depicted in Figure (5). The analysis revealed that there was a higher number of stable connections for pro-saccade (29) compared to fixation blocks (17). The SEF, medial FEF, and cuneus 33
ACCEPTED MANUSCRIPT
had the largest number of stable connections during task or both blocks, while the thalamus had a high number of connections exclusively during the task block. Moreover, there were only three connections (IFC-SEF, IFC-IPL, and IPL-MOG) that were present at fixation that did not also
RI PT
occur during the pro-saccade task. Striatum and MOG had the fewest stable connections across the scanning session.
This pattern of results implies that SEF, FEF, and cuneus are core regions in the saccade net-
SC
work and that they maintain functional connectivity with multiple nodes in the network throughout the task as well as during intervening fixation blocks. These regions are known to be critical for
M AN U
visual processing and saccade generation (McDowell et al., 2008) and the number of consistent connections identified here demonstrates their relevance to the saccade network over time. The thalamus, in contrast, was connected to the saccade network specifically during the pro-saccade task blocks, indicating a dynamic role in ocular motor control that may be interrupted during
TE D
non-active blocks. Striatum and MOG have also been reported during saccade activation studies, yet showed limited stable connections to other saccade ROIs here. Interestingly, the striatum was connected to the thalamus, while MOG was connected to the cuneus, during all the analyzed brain
EP
states. This suggests that these two regions are more functionally distant in the saccade network and their interactions may be mediated largely by other, more central nodes. Overall these find-
AC C
ings support activation studies of saccade tasks on the role of many of these ROIs in the ocular motor network, as well as providing new insight into how these regions functionally interact over the course of task and fixation blocks in a saccade paradigm. 4. Conclusion
We have proposed a fundamentally novel approach for detecting temporal changes in the brain functional connectivity, and for estimating a time varying brain network based on multi-subject fMRI data. By pooling information across subjects, the proposed approach has major advantages
34
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
Figure 5: A visual depiction of functional connectivity in the saccade trial. The Figures display connections which are present in both pro-saccade and fixation (Figure A), connections which are present in pro-saccade blocks only (Figure B), and connections which are present in fixation blocks only (Figure C). Figure D provides a color coded table which indicates a connection present in both pro-saccade and fixation in green, connection appearing only in pro-saccades in yellow, connections appearing only in fixation in pink, and connections which do not appear in either blocks in gray.
TE D
over existing methods. It provides more reliable estimation of group-level dynamic connectivity in task-related fMRI studies. In particular, our method provides greater power for detecting rapid changes in functional connectivity, that can otherwise be challenging to detect using a single subject
EP
analysis. We further propose a robust procedure for accurately detecting the true change points even in the presence of aberrant subjects, who demonstrated significantly differently dynamics
AC C
in connectivity from the group mostly because their neural activity was inappropriate registered to the experiment tasks. The proposed group level analysis involves a sub-sampling scheme which accounts for individual level differences by identifying and eliminating aberrant subjects who exhibit significant differences in the dynamic brain functional connectivity compared to the group, and is expected to work well in settings involving a limited number of aberrant subjects, such as in task based fMRI experiments. The CCPD approach assumes that the fMRI data is available on all subjects at the same set of time points, such that measurements for each scan can be aligned
35
ACCEPTED MANUSCRIPT
across subjects. Hence, the proposed approach is well-suited for estimating group-level dynamic connectivity in task-related fMRI studies where subjects are imaged under the same experimental paradigm, although it can accommodate heterogeneity in change points across samples in the group-
RI PT
level analysis. Moreover, we propose an extension of the method to single subject change point analysis, which does not assume similar locations of change points across subjects, and exhibits a desirable performance in terms of change point detection and network estimation, with significant
SC
improvements over competing approaches.
When it is of interest to perform a group level change point analysis in studies with larger
M AN U
differences across subjects, the proposed approach can still be applied after an additional clustering and registration step. In particular, when the number of change points vary across subjects, one can first cluster the multivariate fMRI time series using recent advances in clustering of multivariate functional curves (Park and Ahn, 2017), where each cluster represents a sub-group of subjects
TE D
with the same number of change points and similar temporal dynamics. Once the clustering is complete, a registration step can be implemented separately for each cluster to align the phases of the functional curves for each ROI across all subjects in the cluster, thereby reducing variability in
EP
the location of the change points across subjects within each cluster (Dai et al., 2017). Subsequently, the proposed two stage approach can be applied separately to each cluster, with each cluster
AC C
representing systematic differences in the number and location of change points. We plan to pursue these generalizations in future work. Finally, we note that the results of dynamic functional connectivity approaches may often depend upon the time-scale over which it is defined (Horwitz, 2003), and the proposed dynamic network modeling approach based on fMRI data is best suited for detecting discrete brain states spanning at least a few time scans. When the objective is to detect even faster transient jumps in the functional connectivity occurring at a sub-second rate, alternate approaches with higher sampling rates such as the simultaneous multislice imaging techniques
36
ACCEPTED MANUSCRIPT
(Todd et al., 2016) or magnetoencephalography (Baker et al., 2014) may be preferable.
Acknowledgements
RI PT
Research reported in this publication was partially supported by the National Institute Of Mental Health of the National Institutes of Health under Award Number ROI MH105561 and R01MH079448. The content is solely the responsibility of the authors and does not necessarily represent the views
SC
of the funding agencies involved.
Appendix
M AN U
Computational Details for Fitting CCPD
We fit model (2) using a group lasso approach, after suitable transformation of variables. Denote α0 = u1 , and the increments as αi = ui+1 − ui , i = 1, . . . , T − 1. Then we have, u1 = α0 , and ui = α 0 +
Pi−1 l=1
˜ denote the T × p matrix having ˜ αl for i = 2, . . . , T. Further let R rt as the t-th
TE D
row (t = 1, . . . , T ), and A denote the (T − 1) × p matrix having αk as the k-th row (k = 1, . . . , T ). Further denote X as the T ×(T −1) matrix having elements (k, l) as one if k > l, and zero otherwise. It can be shown that the equation (2) may be re-written as
EP
˜ − XA − 1T α0 ||2 + λ min A∈<(T −1)×p ,α0 ∈
A∈<(T −1)×p ,α0 ∈
||αt ||
t=1 T −1 X
˜ c − Xc A − 1T α0 ||2 + λ ||R
AC C
=
T −1 X
||αt ||,
(7)
t=1
˜ c are obtained from X and R ˜ respectively, after centering each column, and 1T where Xc and R is a vector of ones having length T . The equality in (7) can be obtained by substituting α0 = ˜ − XA)1T . Equation (7) can be solved using the group lasso approach in Yuan and Lin (2006), (R for which existing R packages are available.
Dynamic Programming Strategy to Compute SSE 37
ACCEPTED MANUSCRIPT
We use a dynamic programming strategy recursively to compute the optimal change points τ ∗k , where k = 1, . . . , Kmax , using results in Picard at. al (2005), which ensures that the subset enumeration is tractable even for a large number of nodes and change points. An exhaustive
into Kmax segments is
T −1 Kmax−1
RI PT
search becomes impossible for large Kmax since the number of partitions of a set with T elements . The dynamic programming strategy reduces the computation
time from O(T Kmax ) to O(T 2 ) using the additive property of SSE. Let βk+1 (i, j) denote the SSE
SC
corresponding to the best partition of the data between the i-th and j-th time points into k + 1
M AN U
segments, noting that βk+1 (0, n) = SSE(k + 1). The recursive algorithm is as follows
k = 0, ∀0 ≤ i < j ≤ T, β1 (i, j) =
j X
||˜rt − ¯rij ||2 ,
t=i+1
∀k
where ¯rij =
1 j−i
Pj
∈ [1, Kmax], βk+1 (i, j) = min{βk (1, h) + β1 (h + 1, j)},
rt t=i ˜
h
(8)
is the mean for the vector of sample pairwise correlations between time
TE D
points i and j. The above dynamic programming takes advantage of the additivity of the SSE, considering that a partition of the data into k + 1 segments is a union of a partition into k optimal segments and a set containing 1 segment, which enables us to efficiently compute the best partition
EP
of the data into k + 1 segments, k = 1, . . . , Kmax. Once these optimal partitions corresponding to
AC C
τ ∗k , k = 1, . . . , Kmax have been computed via the dynamic programming strategy, one can eliminate spurious change points from this set using the curvature approach described in Section 2.1.3.
38
ACCEPTED MANUSCRIPT
Algorithm I: Estimation of the Number of Change Points 1. Compute a value of penalty parameter λ for fused lasso model (2). 1.1 Apply lowess independently to all the time series of pairwise correlations, under
RI PT
a small value of the fraction parameters to avoid overfitting.
1.2 Compute the differences δkt = ρ¯kt − ρ¯k,t+1 , t = 1, . . . , T − 1, where ρ¯k = (¯ ρk1 , . . . , ρ¯kT ) denotes the lowess fit for the k-th time series.
SC
1.3 For each first order difference, compute the median of (δk1 , . . . , δkT ) as µk . 1.4 For each first order difference, compute {|δk1 − µk |, . . . , |δkT − µk |}
M AN U
1.5 Compute the median of the differences {|δk1 − µk |, . . . , |δkT − µk |} as ∆k 1.6 Compute the penalty parameter as λ = mink=1,...,p {2∆k +
PT −1 t=1
|δkt |1(|δkt | > 4∆k )}.
2. Fit the fused lasso model (2) under the chosen value of λ in step 1, and compute an initial set of Kmax change points denoted as τ ∗Kmax . 3. Screen out false change points detected in step 2
SSE(Kmax)−SSE(k) SSE(Kmax)−SSE(1) (Kmax
− 1) + 1, where J(1) =
TE D
3.1 Compute normalized scores Jk = Kmax, J(Kmax) = 1.
3.2 Compute discrete second derivatives as ∇k = Jk−1 − 2Jk + Jk+1 .
EP
3.3 Choose the final subset of K time points from the original set of change points τ ∗Kmax such that K = max{1 < k < Kmax : ∇k > 0.5}.
AC C
References
1. Allen, E., Damaraju, E., Plis, S.M., Erhardt, E., Eichele, T., Calhoun V.D., 2012. Tracking whole-brain connectivity dynamics in the resting state. Cereb. Cortex bhs352. 2. Baker, A.P., Brookes, M.J., et al. (2014). Fast transient networks in spontaneous human brain activity, In eLife 2014;3:e01867 DOI: 10.7554/eLife.01867. 3. Barabasi, A.L., Albert, R., 1999. Emergence of scaling in random networks. science 286(5439), 509-512. 4. Becker, R. A., Chambers, J. M. AND Wilks, A. R., 1988. The New S Language. Pacific Grove, CA: Wadsworth Brooks Cole. 39
ACCEPTED MANUSCRIPT
5. Beckmann, C. F. and Smith, S. M., 2004. Probabilistic independent component analysis for functional magnetic resonance imaging. IEEE Transactions on Medical Imaging, 23, 137-152. 6. Beckmann, C. F. and Smith, S. M.. 2005. Tensorial extensions of independent component analysis for multisubject FMRI analysis. Neuroimage, 25, 294-311.
RI PT
7. Biswal, B., Yetkin, F., Haughton, V.M., Hyde, J.S., 1995. Functional connectivity in the motor cortex of resting human brain using echo-planar MRI. Magnetic resonance in medicine 34(4), 537-541. 8. Calhoun, V., Adali, T., Pearlson, G. and Pekar, J., 2001. A method for making group inferences from functional MRI data using independent component analysis. Human brain mapping, 14, 140-151.
SC
9. Chang, C., Glover, G.H., 2010. Time-frequency dynamics of resting-state brain connectivity measured with fMRI. Neuroimage 50(1), 81-98.
M AN U
10. Chang, C., Liu, Z., Chen, M.C., Liu. X., Duyn, J.H., 2013, EEG correlates of time-varying BOLD functional connectivity. Neuroimage 72, 227-236. 11. Chiang, S., Cassese, A., Guindani, M., Vannucci, M., Yeh, H.J., Haneef, Z. and Stern, J.M., 2016. Time-dependence of Graph Theory Metrics in Functional Connectivity Analysis. NeuroImage 125, 601-615. 12. Cox, R.W., 1996. AFNI: software for analysis and visualization of functional magnetic resonance neuroimages. Computers and Biomedical research 29(3), 162-173.
TE D
13. Cox, R.W., 2012. AFNI: what a long strange trip it’s been. Neuroimage 62(2), 743-747. 14. Cribben, I., Haraldsdottir, R., Atlas, L.Y., Wager, T.D., Lindquist, M. A., 2012. Dynamic connectivity regression: determining state-related changes in brain connectivity. Neuroimage 61(4), 907-920.
EP
15. Cribben, I., Wager, T.D., Lindquist, M.A., 2013. Detecting functional connectivity change points for single-subject fmri data. Frontiers in Computational Neuroscience 7, 143. 16. Cribben, I. and Yu, Y., 2017. Estimating whole-brain dynamics by using spectral clustering. Journal of Royal Statistical Society C, 66: 607-627.
AC C
17. Dai, M., Zhang, Z., and Srivastava, A. (2017). Discovering Change-Point Patterns in Dynamic Functional Brain Connectivity of a Population. In International Conference on Information Processing in Medical Imaging (pp. 361-372). Springer, Cham. 18. Dyckman, K.A., Camchong, J., Clementz, B.A., McDowell, J.E., 2007. An effect of context on saccade-related behavior and brain activity. Neuroimage 36(3), 774-784. 19. Erdos, P., Renyi, A., 1960. On the evolution of random graphs. Publ. Math. Inst. Hung. Acad. Sci 5(1), 17-60. 20. Fox, M.D., Snyder, A.Z., Vincent, J.L., Corbetta, M., Van Essen, D.C., Raichle, M.E., 2005. The human brain is intrinsically organized into dynamic, anticorrelated functional networks. Proceedings of the National Academy of Sciences of the United States of America 102(27), 9673-9678. 40
ACCEPTED MANUSCRIPT
21. Friedman, J., Hastie, T., Tibshirani, R., 2008. Sparse inverse covariance estimation with the graphical lasso. Biostatistics 9(3), 432-441. 22. Greicius, M.D., Krasnow, B., Reiss, A. L., Menon, V., 2003. Functional connectivity in the resting brain: a network analysis of the default mode hypothesis. Proceedings of the National Academy of Sciences 100(1), 253-258.
RI PT
23. Guo, Y., 2011. A general probabilistic model for group independent component analysis and its estimation methods. Biometrics 67(4), 1532-1542. 24. Guo, Y., Pagnoni, G., 2008. A unified framework for group independent component analysis for multi-subject fMRI data. NeuroImage 42, 1078-1093.
SC
25. Handwerker D.A., Roopchansingh V., Gonzalez-Castillo J., Bandettini P.A., 2012. Periodic changes in fMRI connectivity. Neuroimage 63,1712-1719.
M AN U
26. Hellyer, P.J., Shanahan, M., Scott, G., Wise, R.J., Sharp, D.J., Leech, R., 2014. The control of global brain dynamics: opposing actions of frontoparietal control and default mode networks on attention. Journal of Neuroscience 34(2), 451-461. 27. Hindriks, R., Adhikari, M.H., Murayama, Y., Ganzetti, M., Mantini, D., Logothetis, N.K., Deco, G., 2016. Can sliding-window correlations reveal dynamic functional connectivity in resting-state fMRI?. Neuroimage 127, 242-256. 28. Huang, H., Li, J., Sun, L., Liu, J., Wu, T., Chen, K., Fleisher, A., Reiman, E., Ye, J., 2009. Learning brain connectivity of Alzheimers disease from neuroimaging data. In Advances in Neural Information Processing Systems, 808-816.
TE D
29. Hutchison, R.M., Womelsdorf, T., Gati, J.S., Everling, S., Menon, R.S., 2013. Resting-state networks show dynamic functional connectivity in awake humans and anesthetized macaques. Hum. Brain Mapp. 34(9), 2154-2177.
EP
30. Hutchison, R.M., Womelsdorf, T., Allen, E.A., Bandettini, P.A., Calhoun, V.D., Corbetta, M., Penna, S.D., Duyu, J., et al., 2013. Dynamic functional connectivity: promise, issues, and interpretations. Neuroimage 80, 360-378.
AC C
31. Hyvarinen, A. and Oja, E., 2000. Independent component analysis: algorithms and applications. Neural networks 13(4), 411-430. 32. Jones, D.T, Vemuri, P., Murphy, M.C., Gunter, J.L., Senjem, M.L, Machulda, M.M., Przybelski, S.A., Gregg, B.E., Kantarci, K., Knopman, D.S., Boeve, B.F., Petersen, R.C., Jack, C.R. Jr., 2012. Non-stationarity in the “resting brain’s” modular architecture. PLoS One. 7(6):e39731. 33. Kirch, C., Muhsal, B., and Ombao, H., 2015. Detection of changes in multivariate time series with application to EEG data. Journal of the American Statistical Association, 110(511):11971216. 34. Leonardi, N., and Van De Ville, D. (2015) On spurious and real fluctuations of dynamic functional connectivity during rest, In NeuroImage, Volume 104, 2015, Pages 430-436, ISSN 1053-8119.
41
ACCEPTED MANUSCRIPT
35. Li, X., Lim, C., Li, K., Guo, L., Liu, T., 2013. Detecting Brain State Changes via FiberCentered Functional Connectivity Analysis. Neuroinformatics 11(2), 193-210.
RI PT
36. Lian, Z., Li, X., Pan, y., Guo, X., Chen, L., Chen, G., Wei, Z., Liu, T., Zhang, J., 2015. Dynamic Bayesian brain network partition and connectivity change point detection, In Computational Advances in Bio and Medical Sciences (ICCABS), 2015 IEEE 5th International Conference on (pp. 1-6). IEEE. 37. Lindquist, M.A., Xu, Y., Nebel, M.B., Caffo, B.S., 2014. Evaluating dynamic bivariate correlations in resting-state fMRI: A comparison study and a new approach. Neuroimage 101, 531-546.
SC
38. McDowell, J.E., Dyckman, K.A., Austin, B.P., Clementz, B.A., 2008. Neurophysiology and neuroanatomy of reflexive and volitional saccades: evidence from studies of humans. Brain and cognition 68(3), 255-270.
M AN U
39. Mckeown, M. J., Makeig, S., Brown, G. G., Jung, T.-P., Kindermann, S. S., Kindermann, R. S., Bell, A. J. and Sejnowski, T. J. (1998). Analysis of fMRI Data by Blind Separation Into Independent Spatial Components. Human Brain Mapping, 6, 160-188. 40. Monti, R.P., Hellyer, P., Sharp, D., Leech, R., Anagnostopoulos, C., Montana, G., 2014. Estimating time-varying brain connectivity networks from functional MRI time series. NeuroImage 103, 427-443. 41. Picard, F., Robin, S., Lavielle, M., Vaisse, C., Daudin, J. J., 2005. A statistical approach for array CGH data analysis. BMC bioinformatics 6(1), 27.
TE D
42. Pierce. J.E. and McDowell, J.E., 2016. Modulation of cognitive control levels via manipulation of saccade trial-type probability assessed with event-related BOLD fMRI, Journal of Neurophysiology, 115:2, 763-772.
EP
43. Plitt, M., Barnes, K.A., Martin, A., 2015. Functional connectivity classification of autism identifies highly predictive brain features but falls short of biomarker standards. NeuroImage: Clinical 7, 359-366. 44. Politis, D.N., Romano, J.P., 1994. The stationary bootstrap. Journal of the American Statistical association 89(428), 1303-1313.
AC C
45. Sakoglu, U., Pearlson, G.D., Kiehl, K.A., Wang, Y.M., Michael, A.M., Calhoun, V.D., 2010. A method for evaluating dynamic functional network connectivity and task-modulation: application to schizophrenia. Magnetic Resonance Materials in Physics, Biology and Medicine 23(5-6), 351-366. 46. Salvador, R., Suckling, J., Schwarzbauer, C., and Bullmore, E., 2005. Undirected graphs of frequency-dependent functional connectivity in whole brain networks. Philosophical Transactions of the Royal Society B: Biological Sciences 360(1457), 937-946. 47. Schroder, A.L. and Ombao, H., 2015. FreSpeD: Frequency-specific change-point detection in epileptic seizure multi-channel EEG data. 48. Shi, R. and Guo, Y., 2016, Modeling Covariate Effects in Group Independent Component Analysis with Application to Functional Magnetic Resonance Imaging., Annals of Applied Statistics, 1930-1957. 42
ACCEPTED MANUSCRIPT
49. Shakil S, Lee C H, Keilholz S D. 2016. Evaluation of sliding window correlation performance for characterizing dynamic functional connectivity and brain states. NeuroImage, 133: 111128. 50. Shirer, W.R., Ryali, S., Rykhlevskaia, E., Menon, V., Greicius, M.D., 2012. Decoding subjectdriven cognitive states with whole-brain connectivity patterns. Cereb. Cortex 22,158 -165.
RI PT
51. Smith, S.M., Fox, P.T., Miller, K.L., Glahn, D.C., Fox, P.M., Mackay, C.E., Filippini, N.,Watkins, K.E., Toro, R., Laird, A.R., Beckmann, C.F., 2009. Correspondence of the brain’s functional architecture during activation and rest. Proc. Natl. Acad. Sci. U.S.A. 106, 13040-13045.
SC
52. Smith, S.M., Miller, K.L., Salimi-Khorshidi, G., Webster, M., Beckmann, C.F., Nichols, T.E., Ramsey, J.D., and Woolrich, M.W., 2011. Network modelling methods for fmri. Neuroimage 54(2), 875-891.
M AN U
53. Stam, C.J., Jones, B. F., Nolte, G., Breakspear, M., Scheltens, P., 2007. Small-world networks and functional connectivity in Alzheimer’s disease. Cerebral cortex 17(1), 92-99. 54. Supekar, K., Menon, V., Rubin, D., Musen, M., and Greicius, M.D., 2008. Network analysis of intrinsic functional brain connectivity in alzheimers disease. PLoS Computational Biology 4(6), e1000100. 55. Tagliazucchi, E., Von Wegner, F., Morzelewski, A., Brodbeck, V., Laufs, H., 2012. Dynamic BOLD functional connectivity in humans and its electrophysiological correlates. Frontiers in human neuroscience 6, 339.
TE D
56. Tibshirani, R., 1996. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B (Methodological), 267-288. 57. Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K., 2005. Sparsity and smoothness via the fused lasso. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 67(1), 91-108.
EP
58. Tibshirani, R., Wang, P., 2008. Spatial smoothing and hot spot detection for CGH data using the fused lasso. Biostatistics 9(1), 18-29.
AC C
59. Thompson, G., Magnuson, M., Merritt, M., Schwarb, H., Pan, W., McKinley, A., Tripp, L., Schumacher, E., Keilholz, S., 2013. Short time windows of correlation between large scale functional brain networks predict vigilance intra-individually and inter-individually. Hum Brain Mapping 34(12), 3280-3298. 60. Vert, J. and Bleakley, K. (2010), “Fast detection of multiple change-points shared by many signals using group LARS,” Advances in Neural Information Processing Systems, 23, 23432351. 61. Vidaurre, D., Quinn, A.J., Baker, A.P. et al. (2016). Spectrally resolved fast transient brain states in electrophysiological data, NeuroImage, Volume 126, 2016, Pages 81-95, ISSN 1053-8119 62. Wang, K., Liang, M., Wang, L., Tian, L., Zhang, X., Li, K., Jiang, T., 2007. Altered functional connectivity in early Alzheimer’s disease: a resting?state fMRI study. Human brain mapping, 28(10), 967-978. 43
ACCEPTED MANUSCRIPT
63. Wang, Y., Jian, K., Kemmer, P.B., and Guo, Y., 2016. An Efficient and Reliable Statistical Method for Estimating Functional Connectivity in Large Scale Brain Networks Using Partial Correlation. Frontiers in neuroscience, 10. 64. Watts, D.J., Strogatz, S.H., 1998. Collective dynamics of small-world networks. Nature 393(6684), 440-442.
RI PT
65. Woodward, N.D., Cascio, C.J., 2015. Resting-state functional connectivity in psychiatric disorders. JAMA Psychiatry 72(8), 743-744. 66. Xu, Y., Lindquist, M. A., 2015. Dynamic connectivity detection: an algorithm for determining functional connectivity change points in fMRI data. Frontiers in neuroscience, 9.
SC
67. Yuan, M. and Lin, Y., 2006. Model selection and estimation in regression with grouped variables. Journal of the Royal Statistical Society, Series B, 68(1):4967.
M AN U
68. Zalesky, A. and Breakspear, M. (2015). Towards a statistical test for functional connectivity dynamics, NeuroImage, 114 (2015), pp. 466-470.
AC C
EP
TE D
69. Zhu, Y., and Cribben, I. (2018). Sparse Graphical Models for Functional Connectivity Networks: Best Methods and the Autocorrelation Issue. Brain connectivity.
44