Blind source separation of fMRI data by means of factor analytic transformations


NeuroImage 47 (2009) 77–87


Dave R.M. Langers

Department of Otorhinolaryngology/Head and Neck Surgery, University Medical Center Groningen, The Netherlands
Eaton-Peabody Laboratory, Massachusetts Eye and Ear Infirmary, Boston, USA

Article history: Received 22 September 2008; Revised 30 March 2009; Accepted 1 April 2009; Available online 9 April 2009

Keywords: Functional magnetic resonance imaging; Blind source separation; Factor analysis; Independent component analysis

Abstract

In this study, the application of factor analytic (FA) rotation methods in the context of neuroimaging data analysis was explored. Three FA algorithms (ProMax, QuartiMax, and VariMax) were employed to carry out blind source separation in a functional magnetic resonance imaging (fMRI) experiment that involved a basic audiovisual stimulus paradigm. The outcomes were compared with those from three common independent component analysis (ICA) methods (FastICA, InfoMax, and Jade). When applied in the spatial domain (sFA), all three FA methods performed satisfactorily and comparably to the ICA methods. The QuartiMax and VariMax methods resulted in highly similar outcomes, while the ProMax results more closely resembled those from the FastICA and InfoMax ICA analyses. All methods were able to identify multiple distinct contributing factors of neural origin, including e.g. the central auditory system, the mediotemporal limbic lobe, the basal ganglia, and the motor system. In addition, various contributions from artifacts could be observed, but these constituted different factors that were well separated from those with neural effects. When applied in the temporal domain (tFA), the factor analytic methods performed drastically worse, in the sense that the spatial activation maps revealed activation much more diffusely throughout the brain and the corresponding time courses were less pronouncedly related to the employed stimulus paradigm. Temporal ICA performed better than tFA, with the possible exception of the Jade method, but still did worse than any of the spatial FA or ICA methods. In conclusion, the present findings suggest that sFA forms a viable and useful alternative to ICA in the context of fMRI data analyses, and indicate that sFA methods complement the range of blind source separation methods that are already in use in fMRI.

Introduction

Data analysis in functional neuroimaging has been dominated by the use of mass-univariate multiple linear regression models since its emergence. Particularly in functional magnetic resonance imaging (fMRI) studies, such general linear models have become the widespread standard to test prior hypotheses (Friston et al., 1995). However, regression requires response dynamics to be exactly specified beforehand, which limits its use to experiments where such responses are predictable. In the last decade, a broad range of useful novel analysis methods has been introduced into the field that may offer more flexibility with respect to model specification, especially for exploratory purposes. Some of these techniques are rapidly becoming more commonplace. For example, blind source separation techniques (like principal component analysis, PCA, in combination with independent component analysis, ICA) have been employed to decompose functional neuroimaging data into components with a meaningful neurophysio-


logical interpretation in the absence of any prior information about the experimental paradigm. This permits the brain to be studied during complex active states and even during rest (Malinen et al., 2007; McKeown et al., 1998; Bartels and Zeki, 2005; van de Ven et al., 2004). In this paper, the application to fMRI data of a related class of factor analytic methods will be explored. Factor analysis (FA) originated in psychometrics and has a longstanding tradition in the fields of e.g. behavioral and social sciences (Mulaik, 1986). It was pioneered by Spearman in the beginning of the 20th century (Spearman, 1904) and the term was already introduced by Thurstone in the 1930s (Thurstone, 1931). FA is a statistical method that attempts to explain variability among various observed variables in terms of a small set of unobserved variables, termed factors. Main applications of factor analytic techniques are to diminish the number of variables and to detect structure in the relationships between variables. In other words, FA is utilized as a data reduction and classification method. Conceptually, it is comparable to PCA and ICA. Surprisingly, whereas PCA and ICA have frequently been applied to acquired fMRI data, it appears that no studies have ever considered FA for this purpose. FA includes an initial data reduction step to extract principal factors that capture a maximal amount of meaningful variance from


the data. Several methods are available, e.g. those based on principal axes, image solution, or alpha reliability. Most of these methods aim to exclude the noise in the measurements from the underlying factors (Darton, 1980). If noise is not explicitly modeled, and all variance is captured, PCA is obtained. Because in fMRI data sets the number of observed variables is very high, and the various decomposition methods have been reported to perform similarly under those circumstances (Gorsuch, 1983), conventional PCA will be employed in this study. After the original fMRI data have been preprocessed and organized into an Nx × Nt matrix Ŷ that contains a total number of Nt scan volumes in its columns, and correspondingly the time courses of Nx voxels in its rows, PCA can be achieved by means of a singular value decomposition

\hat{Y} = \hat{U}_x \cdot \hat{S} \cdot \hat{U}_t^T = \sum_n \hat{s}_n \, \hat{u}_{x,n} \, \hat{u}_{t,n}^T . \qquad (1)

Ûx and Ût are unitary matrices that contain the left-singular and right-singular vectors in their columns, which correspond with the spatial and temporal dimension of the original data, respectively. Ŝ is a diagonal matrix that contains the singular values ŝn in decreasing order. Equivalently, this may be viewed as a decomposition of Ŷ into a sum of outer products. Except for a flipping of signs and possible degeneracies in the singular values, this decomposition is uniquely defined, and the series terms are called the principal components. Because these components are ordered to become weaker in magnitude and increasingly less relevant, a substantial reduction in data complexity may be achieved while still retaining a maximal amount of signal variance by preserving only a limited number, Nc, of the strongest components and discarding the weaker ones. This leads to an approximation Y of the original data Ŷ according to

Y = U_x \cdot S \cdot U_t^T . \qquad (2)

S is an Nc × Nc square matrix that contains the Nc highest singular values on its diagonal (Nc ≤ min{Nx,Nt}). Ux and Ut are Nx × Nc and Nt × Nc matrices that contain the first Nc columns of Ûx and Ût, respectively. Eq. (2) is equivalent to the series in Eq. (1) if the range of the summation is restricted to n = 1…Nc. The matrices Ux and Ut decrease the number of spatial or temporal dimensions that are necessary to approximate the data, as becomes clear from

Y = Y_x \cdot U_t^T , \qquad (3a)

and

Y^T = Y_t \cdot U_x^T . \qquad (3b)

Here, Yx = Ux ∙ S is a compressed data matrix that contains full spatial maps, but a reduced number of time points. Likewise, Yt = Ut ∙ S contains full time courses, but a reduced number of voxels. Although PCA is very suitable for the purpose of data compression and complexity reduction, by its very nature it tends to group as much signal variation into a single component as it possibly can. As a result, principal components typically contain mixtures of signals from all kinds of sources and therefore may be difficult to interpret. To overcome this, components should be ‘unmixed’; i.e., new linear combinations of components need to be formed that more directly represent the characteristics of the underlying data, but that together still span the same data space. Such a transformation may be viewed as a (possibly oblique) rotation of the component axes, and may be expressed by means of an (invertible) Nc × Nc transformation matrix R. The decompositions from Eqs. (3a) and (b) then transform into

Y = \left( Y_x \cdot R_t^T \right) \cdot \left( U_t \cdot R_t^{-1} \right)^T \equiv L_x \cdot F_t^T , \qquad (4a)

and

Y^T = \left( Y_t \cdot R_x^T \right) \cdot \left( U_x \cdot R_x^{-1} \right)^T \equiv L_t \cdot F_x^T . \qquad (4b)
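A minimal numerical sketch may help to make the bookkeeping of Eqs. (1)–(4a) concrete. It is written in Python/NumPy rather than in the MatLab environment that is used for the actual analyses below; the variable names mirror the symbols above, the toy dimensions are arbitrary, and the rotation matrix is a random placeholder rather than a genuine FA or ICA solution.

```python
import numpy as np

# Toy dimensions: Nx voxels, Nt scans, Nc retained components (illustrative only)
Nx, Nt, Nc = 500, 36, 5
rng = np.random.default_rng(0)
Y_hat = rng.standard_normal((Nx, Nt))        # preprocessed data matrix (Eq. 1)

# PCA via the singular value decomposition: Y_hat = Ux_hat * S_hat * Ut_hat^T
Ux_hat, s_hat, Ut_hatT = np.linalg.svd(Y_hat, full_matrices=False)

# Keep only the Nc strongest principal components (Eq. 2)
Ux = Ux_hat[:, :Nc]                          # Nx x Nc spatial singular vectors
S = np.diag(s_hat[:Nc])                      # Nc x Nc singular values
Ut = Ut_hatT[:Nc, :].T                       # Nt x Nc temporal singular vectors
Y = Ux @ S @ Ut.T                            # reduced-rank approximation of Y_hat

Yx = Ux @ S                                  # full maps, reduced time points (Eq. 3a)
Yt = Ut @ S                                  # full time courses, reduced voxels (Eq. 3b)

# Any invertible Nc x Nc transformation Rt leaves Y itself unchanged (Eq. 4a)
Rt = np.linalg.qr(rng.standard_normal((Nc, Nc)))[0]   # placeholder rotation
Lx = Yx @ Rt.T                               # spatial loadings
Ft = Ut @ np.linalg.inv(Rt)                  # factor time courses
assert np.allclose(Lx @ Ft.T, Y)             # Y = Lx . Ft^T
```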

Eq. (4a) expresses the definitional equation for factor analysis in the spatial domain (sFA), where Ft contains the normalized time courses (i.e., the factors), and Lx the degree to which these are present in individual voxels (i.e., the loadings). Conversely, Eq. (4b) expresses the definitional equation for factor analysis in the temporal domain (tFA), where Fx contains normalized maps as the factors, and Lt the degree to which these are present in individual scans as loadings. Because the transformations Rt and Rx need not be trivially related, these decompositions can be entirely different in character. So far, the followed approach is analogous to that in ICA, although in ICA it is more common to refer to the loadings as sources and to the factor matrix as a mixing matrix (Hyvärinen and Oja, 2000). Like in FA, a distinction between spatial ICA (sICA) and temporal ICA (tICA) can be made (Calhoun et al., 2001a). However, FA diverges from ICA in the criteria that are used to determine the transformations R. FA essentially attempts to transform the loadings in such a way that the observed variables can each be expressed in as few factors as possible, whereas ICA optimizes the mixing matrix to maximize the statistical independence of the distributions of the sources. The rows of the loading matrix L contain the coefficients of the various factors for a particular voxel (in sFA, Eq. (4a)) or for a particular scan (in tFA, Eq. (4b)). The goal in FA is to achieve a sparse structure for L, in which any row contains as many vanishingly small entries as possible. In factor analytic terms, this is commonly referred to as a simple or parsimonious structure (Carroll, 1953; Thurstone, 1954; Ferguson, 1954; Darton, 1980). This principle ensures that the original signals are a mixture of as few factors as possible, and the factors are maximally representative of the original data. Different voxels/scans can load strongly on different factors, and the total number of factors may therefore still be considerable, but each individual voxel/scan ideally loads on as few of these factors as possible. Algorithmically, a maximally simple structure may be achieved by penalizing rows according to the number of pairs of distinct entries that are both substantially different from zero. This approach is essentially followed in an FA method called QuartiMax (Carroll, 1953). In this method, the unique solution is found that minimizes the sum of the squared cross products of the loadings.

\min_{R \in O} \left[ \sum_i \left( \sum_{j \neq k} L_{ij}^2 \, L_{ik}^2 \right) \right] . \qquad (5)

Because the transformation R is restricted to the class O of real unitary matrices (i.e., orthonormal rotations), the above solution simultaneously maximizes the sum of the fourth powers of the loadings (hence the name of the method).

\max_{R \in O} \left[ \sum_i \left( \sum_j L_{ij}^4 \right) \right] . \qquad (5')

The latter equation expresses that the QuartiMax method attempts to maximize the fourth moment of all loadings, which is closely related to the kurtosis of the distribution of the loadings across all factors combined.
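The connection can be made explicit with a standard argument that is not spelled out above: under an orthonormal rotation each row of the loading matrix keeps its communality, so the pooled second moment of the loadings is fixed, and maximizing the pooled fourth powers is then the same as maximizing a normalized fourth moment, i.e., essentially the kurtosis of the loading distribution (assuming approximately zero-mean loadings).

```latex
% Assumption: R ranges over the orthonormal rotations in the class O only.
\[
  \sum_{j} L_{ij}^{2} = h_i^{2} \;\text{is invariant under } R \in O
  \quad\Longrightarrow\quad
  \sum_{i,j} L_{ij}^{2} = \text{const.},
\]
\[
  \text{so}\qquad
  \max_{R \in O} \sum_{i,j} L_{ij}^{4}
  \;\Longleftrightarrow\;
  \max_{R \in O}
  \frac{\tfrac{1}{N N_c}\sum_{i,j} L_{ij}^{4}}
       {\bigl(\tfrac{1}{N N_c}\sum_{i,j} L_{ij}^{2}\bigr)^{2}} ,
\]
% i.e., QuartiMax maximizes the normalized fourth moment of the pooled
% loadings, favoring many near-zero loadings and a few large ones.
```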


A QuartiMax solution may tend to include one general factor with all major loadings and few non-zero loadings in the rest of the matrix. To avoid this, the variance of the squared loadings may be maximized across the columns of the loading matrix. The following criterion is obtained, which forms the basis of the VariMax method (Kaiser, 1958).

\min_{R \in O} \left[ \sum_{j \neq k} \left( \frac{1}{N} \sum_i L_{ij}^2 L_{ik}^2 - \left( \frac{1}{N} \sum_i L_{ij}^2 \right) \left( \frac{1}{N} \sum_i L_{ik}^2 \right) \right) \right] , \qquad (6)

or, equivalently,

\max_{R \in O} \left[ \sum_j \left( \frac{1}{N} \sum_i L_{ij}^4 - \left( \frac{1}{N} \sum_i L_{ij}^2 \right)^2 \right) \right] , \qquad (6')

where N equals the number of rows of L: N = Nx for sFA; N = Nt for tFA. Some FA methods allow non-unitary matrices R, although care has to be taken to prevent factors from becoming co-linear. One such method is ProMax (Hendrickson and White, 1964). This method uses the loadings that are obtained from an orthogonal method (like VariMax) as a starting point. A target matrix is constructed from this loading matrix by taking some higher power of its entries; because this makes the smallest loadings even smaller, and the largest ones even larger, this target matrix represents a simpler and more parsimonious factor structure than the originally obtained loading matrix. Next, the obtained transformation matrix R is obliquely perturbed in an effort to let the loading matrix approach the target matrix as closely as possible. FA is a well established technique that has been described in numerous articles and textbooks (Child, 2006; Gorsuch, 1983; Kim and Mueller, 1979; Kline, 1993; Thompson, 2004; Darton, 1980). Also, software libraries that implement factor analytic methods are readily


available (Waller, 1993) and have been incorporated into many data processing packages as standard functionality or by means of plug-ins (e.g. MatLab, R, MS-Excel, SPSS, SAS, STATA, BMDP, AMOS, etc.). Although the aforementioned factor analytic rotation methods are probably known best, other algorithms abound. For instance: EquaMax and ParsiMax belong to the same family of methods as QuartiMax and VariMax (Crawford and Ferguson, 1970); ObliMin, MaxPlane, BiquartiMin, and Harris-Kaiser are alternative oblique rotation methods, but these appear to be less widely used than ProMax (Hakstian and Abell, 1974). Various authors have provided further extensions; for example, the use of overcomplete representations enables the extraction of linearly dependent components (Lewicki and Sejnowski, 2000). Some basic examples to illustrate the operation of FA are shown in Fig. 1. Three two-dimensional data sets were generated with distinctly different distributions, as illustrated in the leftmost panels. PCA, ICA, as well as an orthogonal and an oblique FA method were used to transform the data. PCA identifies the two orthogonal axes along which the data has the largest and the smallest amount of variance, respectively. Although this determines the general direction of ‘largest elongation’ of the data clouds, it does not relate very well to the visible internal structure of the data. In contrast, ICA manages to find oblique axes that are relevant with respect to the structural distribution of the data points. For this solution, the distribution of the data along one axis is maximally independent of the value of the coordinate along the other axis. The factor analytic methods attempt to place the axes such that the data points load on as few factors as possible. The orthogonal method reveals the data structure to some extent already, but the orthogonality constraint leads to a compromise that is suboptimal; the oblique method does not take this constraint into account and succeeds better in locating the axes along which most points of the data cloud are situated. The outcome is virtually equivalent to that for ICA in Fig. 1a; however, results are vastly different in Figs. 1b and c.
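As a concrete footnote to the QuartiMax and VariMax criteria above and to the examples of Fig. 1, the following self-contained Python/NumPy sketch may be useful. The orthomax() routine implements the classical iterative orthogonal rotation (gamma = 0 corresponds to the QuartiMax criterion of Eq. (5'), gamma = 1 to the VariMax criterion of Eq. (6')); it is an illustrative reimplementation, not the MatLab rotatefactors() function that is used for the actual analyses below. The toy data merely mimic the leptokurtic situation of Fig. 1a, scikit-learn's FastICA serves as a stand-in ICA method, and all names and parameter values are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import FastICA

def orthomax(L, gamma=1.0, max_iter=500, tol=1e-8):
    """Orthomax rotation of an N x Nc loading matrix L.

    gamma = 0 -> QuartiMax criterion (Eq. 5'); gamma = 1 -> VariMax (Eq. 6').
    Returns the rotated loadings and the orthonormal rotation matrix.
    """
    N, Nc = L.shape
    R = np.eye(Nc)
    crit = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        # Gradient of the orthomax criterion with respect to the rotation
        G = L.T @ (Lr**3 - (gamma / N) * Lr @ np.diag((Lr**2).sum(axis=0)))
        U, s, Vt = np.linalg.svd(G)
        R = U @ Vt                       # project back onto orthonormal matrices
        if s.sum() < crit * (1.0 + tol):
            break                        # criterion no longer increases
        crit = s.sum()
    return L @ R, R

# Toy data loosely mimicking Fig. 1a: two leptokurtic sources, obliquely mixed
rng = np.random.default_rng(1)
X = rng.laplace(size=(1000, 2)) @ np.array([[1.0, 0.2], [0.6, 1.0]])

# PCA by SVD (cf. Eq. 1), followed by a VariMax rotation of the component scores
U, s, Vt = np.linalg.svd(X, full_matrices=False)
L_pca = U * s                            # principal-component scores (rows = points)
L_rot, R = orthomax(L_pca, gamma=1.0)    # simple-structure solution

# An ICA solution for comparison
A_ica = FastICA(n_components=2, random_state=0).fit(X).mixing_

print("PCA axes (rows):\n", Vt)
print("VariMax-rotated axes (rows):\n", R.T @ Vt)
print("ICA mixing matrix (columns are directions):\n", A_ica)
```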

Fig. 1. Blind source separation techniques applied to various artificial data sets. Three data sets were generated, each consisting of 1000 2-D points, according to different multivariate statistical distributions. Four methods were applied to extract meaningful signal components. Principal component analysis identifies the orthogonal axes of largest and smallest variance (shown as the horizontal and vertical axis, respectively). Independent component analysis extracts oblique structural axes such that the distributions of the horizontal and vertical coordinates are maximally independent. The factor analytic methods determine axes such that the data points on average are located as closely to an axis as possible; an orthogonal method is illustrated that is restricted to orthonormal transformations (i.e., rotations), as well as an oblique method that has no such constraint and allows for skew transformations. The independent component analysis and oblique factor analysis both correctly identify the same structural axes in example (a) and differ in irrelevant scaling factors only, but their solutions diverge for the other two examples (b) and (c).


The examples in Fig. 1 show that both FA and ICA methods are able to determine data transformations that reveal inherent characteristics from data sets, without requiring any prior information about the structure of the data. Yet, obvious differences between both types of methods are apparent for at least some data sets. Given that ICA has already been proven useful in practical fMRI applications, this study was carried out to explore the performance of FA in such experiments.

Materials and methods

Data acquisition

16 healthy subjects participated in an fMRI study on the basis of written informed consent, in approved accordance with the requirements of the institutional committees on the use of human subjects at the Massachusetts Eye and Ear Infirmary, the Massachusetts General Hospital, and the Massachusetts Institute of Technology.

Subjects were placed in supine position in the bore of a 3.0-T MR system (Siemens Allegra), which was equipped with a standard 12-channel transmit/receive head coil. A sagittal T1-weighted MPRAGE scan was acquired for anatomical orientation (repetition time TR 2530 ms; echo time TE 3.39 ms; inversion time TI 1100 ms; flip angle 7°; matrix 256 × 256 × 128; voxel dimensions 1.0 × 1.0 × 1.3 mm). On these high-resolution images an oblique coronal imaging volume was positioned in which functional scans were acquired. The functional imaging runs consisted of a dynamic series of T2⁎-sensitive echo planar imaging (EPI) sequences (TR ∼8 s; TE 30 ms; flip angle 90°; matrix 64 × 64 × 10; voxel dimensions 3.1 × 3.1 × 8.0 mm, including 2.0-mm slice gap). Each run comprised 36 EPI acquisitions, excluding initial start-up scans that were rejected before the analysis. Between 5 and 11 runs were carried out per subject (median 9 runs).

A sparse clustered volume scanning paradigm was employed to reduce the influence of acoustic scanner noise (Hall et al., 1999; Edmister et al., 1999). The scanner coolant pump and fan were turned off during functional imaging to reduce ambient noise levels. Cardiac triggering on the basis of a pulse-oximeter signal was used to reduce the influence of cardiac motion (Guimaraes et al., 1998). In one subject, the cardiac signal was too weak to detect reliably, and a fixed TR of 7.5 s was set.

During the functional imaging sessions, a basic audiovisual task was performed by the subjects to control their attention levels. Subjects viewed a projection screen through a mirror. On a uniform gray background, a black cross was displayed that briefly flashed red at pseudo-random intervals. Subjects were instructed to press a button device each time they observed such a color change. Responses were monitored to ensure that subjects understood the task and stayed alert, and the achieved hit rate approximately equaled 80–90%. Behavioral results were not analyzed further for the purpose of this study.

At the same time, MR-compatible piezo-electric speakers that were mounted in insert ear plugs were used to deliver auditory stimuli. These stimuli consisted of four 32-s broadband noise bursts per run, separated by periods of silence. The stimulus intensities of all bursts were equal within a run, but differed across runs in a range of 50–80 dB SPL. Subjects were told not to respond to the auditory stimuli.

Data analysis

In the subsequent data analysis, the MatLab programming environment (The MathWorks Inc.)
was used in combination with various processing routines from the SPM2 software package (http:// www.fil.ion.ucl.ac.uk/spm/) (Friston et al., 1995) and the Group ICA of fMRI Toolbox v.1.3d (http://icatb.sourceforge.net/groupica.htm) (Calhoun et al., 2001b). The functional image volumes were corrected for motion effects using 3-D rigid body transformations. The anatomical images were coregistered to the functional volumes, and all images were normalized into MNI stereotaxic space using affine transformations. Images were

resampled at a 3-mm isotropic resolution and cropped in the anteriorto-posterior direction to a subvolume that comprised the acquired functional data (−45 ≤ y ≤ +45). To improve signal-to-noise ratio characteristics, spatial smoothing was performed on the functional volumes using an isotropic 5-mm Gaussian kernel. Various signal confounds were estimated by means of a linear regression approach. The model included [i] baseline and drift effects (three vectors modeling a second-order polynomial for each separate functional run), [ii] residual motion effects (six vectors containing the translation and rotation parameters in the x-, y- and z-direction), and [iii] T1-relaxation effects (a single vector containing the TRs corresponding to each functional acquisition). The model was evaluated voxelwise, and the fitted confounds were subtracted from the original data. As the timing of all stimuli was identical in all runs, corresponding images were averaged across all runs of a subject to selectively reduce non-repeatable signal contributions and noise levels. This resulted in a single set of 36 images per subject. These images were divided by the mean EPI image in order to express all signals in units of percentage signal change relative to baseline. Finally, the image sets of all subjects were concatenated to form one comprehensive data set. The individual subject brains were segmented into gray matter, white matter, and cerebrospinal fluid (CSF) probability maps. Discrete brain masks were constructed that included all voxels for which the sum of the three aforementioned compartmental probability maps exceeded 50%. A group mask was derived from these individual masks by means of logical conjunction. After masking, all Nt = 576 images comprised Nx = 22783 brain voxels, resulting in a single 22783 × 576 data matrix Ŷ. Its singular value decomposition was determined and the Nc = 40 strongest principal components were retained. Larger (e.g. 60) and smaller (e.g. 20) numbers of components were tried, but these were rejected because they either only appeared to result in additional components that contained noise only, or they were found to merge multiple components that were considered to contain separate brain systems (like auditory and motor areas). Spatial factor analysis (sFA) was performed on the basis of the reduced data matrix Yx (Eqs. (3a) and (4a)). The transformations Rt were determined that maximally simplified the loading matrix Lx = Yx ∙ RTt according to three different FA methods: ProMax, QuartiMax, and VariMax. These were available from the rotatefactors() function that is implemented in the MatLab Statistics Toolbox. By default, the ProMax solution was based on a VariMax solution, and the exponent for creating the target matrix equaled 4; optional row normalization was turned off for all methods. Subsequently, the 1 corresponding factors were derived according to Ft = Ut ∙ R− t . In addition, similar decompositions into independent components were determined according to three ICA methods: FastICA, InfoMax, and Jade (Cardoso and Souloumiac, 1993; Bell and Sejnowski, 1995; Hyvärinen and Oja, 2000; Cardoso, 1999). By default, the FastICA method employed signal deflation with a G(x) = log(cosh(x)) contrast function. Each of the six employed methods resulted in 40 factors or independent components that were expressed as the product of a spatial activation map and a corresponding time course. 
Their time courses were normalized to unit variance, and the corresponding maps were scaled in the opposite direction such that their product remained the same. The spatial maps were inspected visually, and 8 effects of interest were manually selected. Some included signal contributions that were interpreted as resulting from neural activation, while others appeared to contain fMRI artifacts from sources like physiological noise, head motion, susceptibility effects, and longitudinal relaxation. For most methods, corresponding maps could be subjectively identified. A set of 8 initial template maps was derived by averaging these manually selected corresponding maps. Next, for each method, and for each of the 8 templates, all 40 factor or independent


component maps were correlated with the template, and the pairing with the highest squared correlation coefficient was considered a match (Yang et al., 2008). In this way, 8 corresponding factors or independent components of interest could be objectively identified for all six methods. These matched factors and independent components were then averaged to form a final set of templates. An analogous procedure was followed to perform FA and ICA in the temporal domain (Eqs. (3b) and (4b)). The same set of 8 templates that was already derived and used in the sFA and sICA analyses was employed to identify matching factors and independent components among the outcomes of tFA and tICA as well. For each template, the maps of the matching factors and independent components according to the 12 methods were correlated pair-wise. The resulting Pearson correlation coefficients R were transformed into a distance metric d in accordance with

d = \sqrt{1 - R^2} , \qquad (7)

and the results were gathered in symmetrical 12 × 12 matrices, one for each template. The outcomes for all templates were averaged, and the resulting mean distance matrix was used to construct an optimal 2-D representation by means of multidimensional scaling (Himberg et al., 2004; Esposito et al., 2005). In analogous fashion, a representation was constructed on the basis of the correlations between the time courses of the matching factors and independent components.
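The matching and comparison steps described in this paragraph translate almost directly into code. The sketch below is written in Python/NumPy, with scikit-learn's MDS standing in for the multidimensional scaling step; it assumes that maps are stored as rows of arrays, and all variable and function names are illustrative rather than taken from the original analysis scripts.

```python
import numpy as np
from sklearn.manifold import MDS

def match_to_template(template, candidate_maps):
    """Index of the candidate map with the highest squared Pearson correlation
    to the template (the matching rule described above)."""
    r = np.array([np.corrcoef(template, m)[0, 1] for m in candidate_maps])
    return int(np.argmax(r**2))

def correlation_distances(matched_maps):
    """Pairwise distances d = sqrt(1 - R^2) between matched maps (Eq. 7)."""
    R = np.corrcoef(matched_maps)                       # methods x methods
    return np.sqrt(np.clip(1.0 - R**2, 0.0, None))

# Usage sketch: average the distance matrices over the eight templates and
# embed the mean matrix in two dimensions by multidimensional scaling.
# mean_d = np.mean([correlation_distances(m) for m in per_template_maps], axis=0)
# coords = MDS(n_components=2, dissimilarity="precomputed").fit_transform(mean_d)
```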


Results

The final set of template maps and associated time courses that were derived by averaging matching factors and independent components across six spatial analysis methods are displayed in Fig. 2. Four of these templates were interpreted to reflect neural effects; these included activation of: [1] the central auditory system, [2] the mediotemporal limbic lobe, [3] the basal ganglia, and [4] the motor and premotor cortex, respectively. Four other templates appeared to contain imaging artifacts; these were predominantly confined to: [5] the ventricles, [6] the inferior temporal lobe, [7] the sulcal CSF, and [8] the lower brainstem, respectively.

Fig. 3 shows the first three principal components of the 40 that were retained. These revealed major signal fluctuations in the brainstem, midbrain, and interbrain, particularly towards their anterior sides, as well as in various areas of the cerebral cortex, most notably including auditory, mediotemporal limbic, motor and premotor, and cingulate cortex. Tables 1 and 2 list, among other things, the squared correlations of the spatial maps and the time courses of these three principal components with those of the eight templates. Strong signal variations were found in the auditory processing centers, as may be derived from the high correlation coefficient between the first principal component and the auditory template (R2 = 0.5). The time course of this first principal component, and to a lesser extent the other components, showed four periods of

Fig. 2. Spatial maps and time courses of eight templates. Templates were derived by selecting eight effects of interest, comprising (a) neural signal contributions from [1] the central auditory system, [2] the mediotemporal limbic lobe, [3] the basal ganglia, and [4] the motor and premotor cortex, and (b) artifactual signal contributions in [5] the ventricles, [6] the inferior part of the temporal lobe, [7] the sulcal cerebrospinal fluid (CSF), and [8] the lower brainstem, respectively. The corresponding factors and independent components were averaged across all six sFA and sICA methods. The average time courses were normalized to unit variance, and the mean signal across a single run is shown. Error bars indicate the standard error across all 16 subjects. Each corresponding spatial map indicates the template's magnitude (i.e., its root-mean-square signal power) in all voxels by means of a color code. Maps were thresholded at a level of ±0.05% signal change relative to baseline.


Fig. 3. First three principal components. Using principal component analysis, 40 components were extracted, the strongest three of which are illustrated. The time courses were normalized to unit variance, and the mean signal across a single run is shown. Error bars indicate the standard error across all 16 subjects. Each corresponding spatial map indicates the component's magnitude (i.e., its root-mean-square signal power) in all voxels by means of a color code. Maps were thresholded at a level of ±0.05% signal change relative to baseline. The first principal component is dominated by auditory responses. However, the signal separation into neurophysiologically meaningful separate brain systems is imperfect.

oscillatory fluctuations, which reflected the epochs of the auditory stimulus paradigm. All sFA (Fig. 4a) and sICA (Fig. 4b) methods succeeded in extracting a factor or independent component that could unambiguously be interpreted as the auditory system on the basis of its map and time course. Although six different methods were employed, their outcomes were all found to have a strikingly similar appearance. All maps revealed strong responses in the bilateral primary and secondary auditory cortices, but also weaker responses in the sub-cortical auditory nuclei, in particular the medial geniculate bodies in the thalamus, the inferior colliculi in the midbrain, and the cochlear nuclei in the lower brainstem. Some local suprathreshold signals with positive and negative sign were detected in the anterior thalamus. The cingulate gyrus exceeded the threshold extensively for the VariMax method only. The corresponding time courses were also highly comparable for all methods. The time course mirrored the four sound presentation epochs that occur in a run: sustained responses were observed with additional peaks at stimulus onset and offset. For the analyses in the temporal domain, outcomes of all six methods are shown in Figs. 4c and d. The time courses show periodical oscillations corresponding with the sound presentation periods, but response dynamics appear less pronounced than for the spatial methods. In particular, onset and offset peaks were barely present. Also, variability across subjects was larger. The extracted maps were relatively diverse in comparison with the analyses in the spatial domain (i.e., Figs. 4a and b). All three tFA methods as well as the Jade method resulted in rather similar outcomes. In addition to extensive activation in the cortical and sub-cortical auditory processing centers, signals also exceeded the threshold in the cingulate gyrus, the basal ganglia, the mediotemporal limbic lobe, and the motor and premotor cortex. The FastICA method revealed relatively weak activation in the auditory cortex, and hardly any sub-cortical activation could be detected. Still, suprathreshold signals were mostly confined to auditory brain structures. The InfoMax method managed to extract slightly more activation in the auditory cortex than the FastICA method, and also revealed activity in the inferior colliculi. In the lower brainstem, a large region was detected with negative signal. For the seven other templates, the corresponding factors and independent components that were found by the individual methods in the spatial and temporal domain are depicted in a Supplementary Figure. Especially for the factors and independent components that

were interpreted to have a neural origin, sFA and sICA performed similarly; overall, the ProMax sFA method revealed the strongest activity and the Jade sICA method the weakest, but differences were limited. The six methods resulted in less consistent outcomes for some of the artifactual factors and independent components, particularly with regard to the sulcal CSF and lower brainstem.

Table 1. Correlations between spatial maps.

Method                Template:  1    2    3    4    5    6    7    8    Mean
PCA     1st PC                 .53  .00  .10  .08  .04  .02  .02  .00     .10
        2nd PC                 .00  .08  .00  .01  .08  .08  .02  .03     .04
        3rd PC                 .08  .05  .04  .10  .00  .01  .00  .05     .04
        Mean                   .20  .05  .05  .06  .04  .04  .01  .03     .06
sFA     ProMax                 .98  .93  .89  .91  .92  .88  .87  .32     .84
        QuartiMax              .94  .86  .94  .94  .82  .93  .71  .52     .83
        VariMax                .91  .90  .96  .94  .87  .89  .85  .52     .86
        Mean                   .94  .90  .93  .93  .87  .90  .81  .46     .84
sICA    FastICA                .96  .86  .86  .88  .96  .82  .80  .57     .84
        InfoMax                .98  .83  .91  .90  .95  .84  .89  .38     .84
        Jade                   .96  .87  .41  .42  .94  .79  .21  .63     .65
        Mean                   .96  .85  .73  .73  .95  .82  .64  .53     .78
tFA     ProMax                 .36  .20  .15  .19  .26  .22  .20  .21     .22
        QuartiMax              .30  .15  .20  .12  .24  .28  .54  .34     .27
        VariMax                .30  .15  .20  .12  .24  .28  .54  .34     .27
        Mean                   .32  .17  .18  .14  .25  .26  .43  .30     .26
tICA    FastICA                .44  .21  .22  .07  .25  .26  .51  .37     .29
        InfoMax                .60  .47  .42  .25  .34  .29  .45  .37     .40
        Jade                   .37  .17  .29  .10  .23  .27  .57  .35     .29
        Mean                   .47  .29  .31  .14  .27  .27  .51  .37     .33

Pearson's correlation coefficients were calculated between the spatial maps of the eight templates, and the principal components as well as the extracted factors and independent components according to the various methods. Results are listed for the four neural components and the four artifactual components that are displayed in Fig. 3 (1 = auditory; 2 = limbic; 3 = basal ganglia; 4 = [pre]motor; 5 = ventricles; 6 = inferior temporal; 7 = sulcal CSF; 8 = lower brainstem). For the PCA method, the correlation coefficients with each of the template maps are reported for the first three principal components; for the FA and ICA methods, the values concern the correlations between the various templates and the correspondingly matched factors and independent components. Average values across all methods within a group and across all templates are also reported.

Table 2. Correlations between time courses.

Method                Template:  1    2    3    4    5    6    7    8    Mean
PCA     1st PC                 .49  .00  .08  .11  .04  .02  .03  .01     .10
        2nd PC                 .00  .08  .00  .01  .08  .08  .01  .05     .04
        3rd PC                 .06  .06  .03  .08  .00  .00  .00  .00     .03
        Mean                   .18  .05  .04  .07  .04  .04  .02  .02     .06
sFA     ProMax                 .92  .76  .83  .67  .94  .92  .88  .08     .75
        QuartiMax              .86  .82  .69  .85  .78  .85  .89  .79     .82
        VariMax                .95  .81  .74  .86  .94  .84  .86  .80     .85
        Mean                   .91  .79  .75  .79  .89  .87  .88  .56     .81
sICA    FastICA                .98  .92  .86  .88  .93  .90  .93  .19     .82
        InfoMax                .98  .90  .93  .90  .96  .90  .94  .16     .83
        Jade                   .95  .89  .61  .42  .89  .84  .51  .40     .69
        Mean                   .97  .90  .80  .73  .93  .88  .79  .25     .78
tFA     ProMax                 .11  .06  .07  .02  .08  .07  .25  .20     .11
        QuartiMax              .17  .08  .12  .15  .08  .21  .22  .18     .15
        VariMax                .17  .08  .12  .15  .08  .21  .22  .18     .15
        Mean                   .15  .07  .10  .11  .08  .16  .23  .19     .14
tICA    FastICA                .24  .17  .19  .13  .18  .19  .20  .16     .18
        InfoMax                .37  .33  .28  .16  .19  .23  .24  .15     .24
        Jade                   .23  .09  .18  .15  .07  .20  .26  .20     .17
        Mean                   .53  .44  .47  .38  .38  .45  .48  .41     .45

Pearson's correlation coefficients were calculated between the time courses of the eight templates, and the principal components as well as the extracted factors and independent components according to the various methods. Results are listed for the four neural components and the four artifactual components that are displayed in Fig. 3 (1 = auditory; 2 = limbic; 3 = basal ganglia; 4 = [pre]motor; 5 = ventricles; 6 = inferior temporal; 7 = sulcal CSF; 8 = lower brainstem). For the PCA method, the correlation coefficients with each of the template maps are reported for the first three principal components; for the FA and ICA methods, the values concern the correlations between the various templates and the correspondingly matched factors and independent components. Average values across all methods within a group and across all templates are also reported.

Overall, the analyses that were performed in the temporal domain led to worse
signal separation than their counterparts in the spatial domain, in the sense that activation was less coherent and much more spread throughout the brain. Table 1 summarizes the squared correlation coefficients between the template maps and the corresponding extracted factors and independent components that were observed according to the individual methods. The observed correlations were very high for the sFA methods (typically R2 ≈ 0.9), particularly for the factors that reflected neural effects. The performance of these methods was comparable to the sICA analyses using the FastICA and InfoMax methods. The Jade method showed markedly lower correlations for at least some of the independent components. When these methods were applied in the temporal domain, performance significantly worsened. Then, correlations were low for all templates, both for tFA and tICA methods (typically, R2 ≈ 0.3). Possibly the only exception was tICA according to the InfoMax method, which achieved R2 = 0.6 for the auditory independent component, but even this performance was considerably worse than for any method when applied in the spatial domain. Similarly, Table 2 shows the correlations between the time courses. The same trends emerged: the performance of sFA was comparable to that of sICA, and all methods typically performed much worse when applied in the temporal domain compared to the spatial domain. Finally, the similarities between the matching spatial maps (Fig. 5a) and time courses (Fig. 5b) according to the various methods were quantified using a distance metric (Eq. (7)) and represented in a 2-D Euclidean space. The six sFA and sICA methods resulted in comparable outcomes. Particularly the QuartiMax and VariMax methods performed similarly, while the ProMax method was found to closely resemble the FastICA and InfoMax methods. For the analyses in the time domain (i.e., for tFA and tICA), vastly different outcomes were obtained. The


QuartiMax and VariMax methods resulted in virtually indistinguishable outcomes, which also closely resembled those according to the Jade method. The ProMax method resulted in somewhat different outcomes. The FastICA and InfoMax methods resembled each other, but performed differently from any of the other methods.

Discussion

The goal of the present study was to assess the feasibility of fMRI data analysis based on blind source separation by means of factor analytic rotation methods. FA has not been applied to fMRI data before, except in one methodological publication that employed a tensorial modification and primarily aimed to investigate an ICA method (Beckmann and Smith, 2005). Nonetheless, FA has a much longer tradition than ICA. Also, its transformation methods are relatively simple, both conceptually and computationally. As explained in the introduction, FA essentially attempts to decompose a data set into meaningful factors by first reducing the dimensionality of the data while still retaining a maximal amount of meaningful signal variation, and by next transforming the determined solution in such a way that the original observations can each be approximated by a linear combination of only a small subset of these factors (Gorsuch, 1983).

To test and compare the performance of various algorithms, three different FA methods were used. QuartiMax and VariMax are the two most popularly used factor rotation techniques that constrain the extracted factors to be uncorrelated; ProMax was added to assess the potential benefit of non-orthogonal factor transformations. Additionally, in order to make a comparison with some other blind source separation techniques that are already better established in the context of fMRI, the three most popular and reliable ICA methods were tried (Calhoun and Adali, 2006; Correa et al., 2007; Esposito et al., 2002). These were FastICA, InfoMax, and Jade, which are based on optimization on the basis of non-Gaussianity (Hyvärinen and Oja, 2000), information entropy (Bell and Sejnowski, 1995), and fourth-order cross-cumulants (Cardoso and Souloumiac, 1993), respectively. All six aforementioned methods operated on a reduced data set that was obtained by means of PCA.

When applied in the spatial domain (i.e., sFA), the factors constitute time courses, and the loading matrix contains maps that prescribe how strongly each factor is represented in individual voxels. Referring to the examples given in Fig. 1, the directions of the axes would correspond with different signal dynamics, and the data points would represent voxels. In this case, FA attempts to define the factors such that the signal in each voxel loads on as few factors as possible. In fMRI terms, sFA attempts to find a small set of exemplary time courses such that the dynamic behavior of each voxel can be well approximated by a linear combination of only a few such exemplars (ideally only one). An illustrative comparison with ICA can be made: sICA assumes spatial independence of all components; sFA assumes that the factor maps are spatially segregated. In other words, whereas sICA attempts to minimize the statistical dependence in space, sFA attempts to minimize the overlap. In that sense, sFA conceptually also bears a faint resemblance to clustering techniques (Filzmoser et al., 1999; Dimitriadou et al., 2004), although sFA does not result in discrete cluster assignments. No constraints are placed on the temporal dynamics. In particular, the extracted time courses may well be correlated.
Spatial independence of neural effects has been argued on the basis of the distributed nature of brain processing (McKeown et al., 1998). Likewise, the existence of various functional brain systems that are primarily localized in distinct specialized brain structures (Gazzaniga, 1989) would support the assumption that effects of interest are largely non-overlapping. In the time domain (i.e., tFA), factors constitute spatial maps, and the loading matrix prescribes how strongly each factor is represented in individual acquired fMRI-volumes. In Fig. 1, the axes then correspond with different spatial distributions of activation, and


Fig. 5. Representation of similarities between methods. The (a) spatial maps and (b) time courses of matching factors and components according to all twelve different methods were correlated pair-wise, corresponding distance metrics were calculated (Eq. (7)), results were averaged across the eight selected templates, and the mean distance matrix was used to construct a 2-D representation by means of multidimensional scaling. In these plots, each method is represented by a symbol, and the proximity of different symbols is a measure for the similarity of the methods. The spatial analysis methods all cluster closely together, whereas analyses in the temporal domain lead to more diverse outcomes. The QuartiMax and VariMax tFA methods virtually coincided and cannot be distinguished in these plots.

each data point represents one scan volume. In this case, FA attempts to find a small set of exemplary spatial maps such that the spatial distribution of activation at any moment in time can be well approximated by the weighted sum of only a few such exemplars (ideally only one). A close analogy with tICA again exists: whereas tICA assumes that the components are statistically independent across time, tFA assumes that the activity according to the various factors does not coincide. Neither method puts constraints on the corresponding spatial maps, which may therefore turn out to be correlated. Neurophysiologically, the assumption of temporal independence is more debatable than spatial independence, and will strongly depend upon the experimental paradigm (Calhoun et al., 2001a). Even more so, incoincidence suggests the existence of a limited number of reasonably well defined functional brain states that alternate each other. Although some experimental paradigms may conceivably give rise to such brain states, this seems harder to maintain in general. Conceptually, FA and ICA use different criteria (overlap or coincidence versus statistical dependence) and give rise to distinct methods (like ProMax in FA, or InfoMax in ICA). Nevertheless, in practice, some methods show remarkable similarities. For instance, the QuartiMax and VariMax methods employ fourth-order moments (Eqs. (5)–(6')); similarly, the Jade method employs fourth-order cumulants, and the FastICA method may also be used with symmetric orthonormalization and a fourth-order contrast function G(x) = x4. However, major differences remain. For instance, while FA attempts to find factors with leptokurtic loading distributions only, the ICA methods maximize non-gaussianity and may converge to platykurtic distributions as well. Moreover, in ICA, kurtosis is impervious to the

mean of the distribution, and so are fourth-order cumulants. In contrast, because FA methods strive towards loadings that equal zero and use fourth-order moments, they are sensitive to signal offsets or baselines. These differences may be highly relevant in the context of fMRI. The example in Fig. 1a shows that ICA and (oblique) FA methods can perform similarly. In this particular example, the underlying signals show positive kurtosis, and this may serve both as a measure of independence (in ICA) or parsimony (in FA). Yet, depending on the exact characteristics of the statistical distribution of the data, the outcomes of FA and ICA may differ considerably, like in the other two examples. Fig. 1b corresponds with a situation where many voxels are simultaneously part of multiple brain systems that are independently distributed in the brain. In this example, ICA extracts the optimal independent axes with platykurtic marginal distributions. In contrast, FA converges to a suboptimal leptokurtic solution, and extracts mixtures only. If this example is typical for fMRI data, ICA is expected to outperform FA. In Fig. 1c, however, voxels are typically part of only one system, but in contrast with Fig. 1a they tend to have relatively high magnitude non-zero loadings on the corresponding axis. In this case, FA is able to extract axes that pass through the clustered data clouds by converging to an optimal leptokurtic solution. Contrariwise, ICA gets stuck in a suboptimal platykurtic extremum, and unmixes the data poorly since all points continue to load on both axes. If this better represents fMRI data, FA is expected to outperform ICA. Together, these examples illustrate that it is not trivially clear which class of blind source separation methods may perform best in particular circumstances on the basis of theoretical considerations alone.

Fig. 4. Auditory factors and independent components according to various blind source separation methods. The figure shows (a) the extracted factors according to three spatial factor analysis (sFA) methods, and (b) the independent components according to three spatial independent component analysis (sICA) methods, that corresponded with the template containing the auditory system. In addition, the same analysis methods were applied in the temporal domain, giving rise to (c) temporal factor analysis (tFA) and (d) temporal independent component analysis (tICA). The average time courses were normalized to unit variance, and the mean signal across a single run is shown. Error bars indicate the standard error across all 16 subjects. Each corresponding spatial map indicates the factor's or independent component's magnitude (i.e., its root-mean-square signal power) in all voxels by means of a color code. Maps were thresholded at a level of ±0.05% signal change relative to baseline. All six spatial methods resulted in highly comparable outcomes. In comparison, the results of the temporal methods were more diverse. (Corresponding data for seven other templates are included in a Supplementary Figure.)


Interpretation of extracted factors or independent components is complicated by the fact that it can only be carried out retrospectively. In the case of neuroimaging data, it is often unclear a priori what the true neurophysiological components would look like (Formisano et al., 2002). This suggests that the performance of blind source separation methods might best be assessed on the basis of artificial data sets, for which the underlying components can exactly be specified. However, the various methods that were employed in this study rely on relatively subtle characteristics of the statistical distribution of the data. By prescribing such characteristics by means of an artificially generated data set, it is unclear whether the results remain representative for real fMRI data and whether the conclusions will be transferable. For example, it should be possible to favor either FA or ICA by modeling the components and the noise in such a data set in a suitable way, witness Fig. 1. Therefore, the only way to assess the performance of the various methods on behalf of fMRI data analyses seems to be the use of actual fMRI data. This approach was chosen in this study. To simplify the interpretation of the extracted factors and independent components, a relatively straightforward paradigm was employed with an audiovisual stimulus and task. The experiment was designed to result in at least one predictable form of response, both with regard to the location of the neural activation as well as its temporal dynamics, namely, the response to the presented sound stimuli. The central auditory system comprises extensive cortical regions in the superior surface of the temporal lobe, as well as a collection of small sub-cortical nuclei in the brainstem, midbrain, and thalamus (Ehret and Romand, 1996). Therefore, auditory activation should include regions with diffuse as well as focal activation in known locations. A broadband noise stimulus was employed because of its potential to excite neuronal populations with various types of spectrotemporal tuning; also, the response dynamics to noise epochs have been characterized in the past (Harms and Melcher, 2003; Sigalovsky and Melcher, 2006). PCA served to reduce the dimensionality of the data to the desired number of components. This was not expected to separate the data into neurophysiologically meaningful components, but the results were presented for completeness. The first principal component, which explains more variance in the data than any other component, reflected the audiovisual stimulus paradigm. It comprised both cortical and sub-cortical areas of the central auditory system, and its time course demonstrated the presence of four stimulus presentations per run. It correlated moderately with the auditory template both with regard to its spatial distribution and its temporal dynamics. This suggests that the employed sound stimuli are a major drive for the occurring brain responses in this fMRI experiment. However, the first principal component also included non-auditory brain areas, like the anterior parts of the brainstem, the basal ganglia, bilateral regions near the insula, and medial frontal cortex. Furthermore, periodical fluctuations that appeared to be related to the sound presentations were also present in at least some of the other components. 
This confirms that although principal component analysis may be an excellent tool to summarize large amounts of signal variation in only a few components, it is incapable of completely separating signal contributions from different sources into different components. Hence the need for additional transformation methods like FA and ICA. All three sFA methods and all three sICA methods performed highly similarly with respect to the extracted auditory activation. The spatial maps showed extensive activation in the bilateral primary auditory cortices, but also revealed focal activation in the medial geniculate bodies, inferior colliculi, and cochlear nuclei. The VariMax method resulted in more suprathreshold voxels than the other methods did, including activation in the cingulate cortex; this could be interpreted as a higher sensitivity to auditory activation, but also as an incomplete unmixing of different components. The corresponding

time courses were virtually identically shaped for all six methods, exhibiting the expected sustained as well as onset and offset responses. For the other three components that were interpreted to have a neural origin, all six methods again succeeded in extracting similar spatial maps and time courses. The ProMax method revealed stronger activity as the QuartiMax and VariMax methods, but none of the methods achieved consistently better results than any of the other methods on all components. These observations were further substantiated by the correlations with the templates. These quantify to what degree the results of the individual methods resemble the average result of all (spatial) methods. Insofar as the templates can be considered gold standards, these correlation coefficients provide an objective measure for the performance of the methods in terms of their discriminatory power. Although the VariMax method achieved the highest correlations both with respect to the spatial maps and time courses, differences were minute, and, overall, the attained correlation coefficients were comparable for all three methods. On average, the factors according to the sFA methods correlated slightly better with the templates than the independent components of the sICA methods, but differences were considered too small to prefer sFA over sICA on this basis. The multidimensional scaling analysis that was carried out on the basis of the correlations between pairs of methods resulted in a picture that was consistent with the mentioned findings. All six spatial methods were found to resemble each other closely. In particular, the QuartiMax and VariMax methods were very similar; the ProMax method was found to bear more resemblance to the FastICA and InfoMax sICA methods. The reliability of ICA algorithms has been previously assessed by means of methods like ICASSO or RAICAR that repeat the analysis multiple times with different random initializations and that subsequently quantify the consistency of the outcomes (Himberg et al., 2004; Yang et al., 2008). Because various FA methods give rise to unique solutions only (Carroll, 1953), and all FA algorithms that were employed in this study are deterministic, the reliability of the methods could not be assessed in this way. Nevertheless, the fact that all sFA and sICA methods performed comparably suggests that the obtained solutions are robust and reliable. Although the QuartiMax and VariMax methods were expected to perform similarly because they both are members of the same family of orthogonal rotation methods, it is somewhat surprising that their outcomes were also rather comparable to those of the ProMax method. ProMax is not restricted to orthogonal transformations, and thus has more freedom to fit the data better (e.g., see Fig. 1 for an illustration). The fact that all three methods performed well suggests that orthogonality forms only a weak constraint in the context of fMRI. This is likely due to the high-dimensional nature of the data. Findings were entirely different for the analyses in the temporal domain. All temporal methods were rather poor at disentangling the various sources of signal variations. Generally, results were less consistent across methods as compared to the analyses in the spatial domain. More specifically, with regard to the auditory system, outcomes only poorly corresponded with the expected brain responses, both with regard to their spatial location in the brain and their dynamics in time. 
Findings were entirely different for the analyses in the temporal domain. All temporal methods were rather poor at disentangling the various sources of signal variation, and results were generally less consistent across methods than for the analyses in the spatial domain. More specifically, with regard to the auditory system, the outcomes corresponded only poorly with the expected brain responses, both with regard to their spatial location in the brain and their dynamics in time. The tFA methods and the Jade tICA method in particular performed poorly. The FastICA method did a better job, and the InfoMax method better still, but both were clearly outperformed by all six spatial methods. The low correlations with the template maps and time courses further substantiate that the temporal methods are less consistent, and also suggest a lower overall sensitivity and specificity as compared to the spatial methods. The latter finding may to some extent be attributed to a bias that was introduced by not including the outcomes of the temporal methods in the construction of the templates. However, the large spread in the multidimensional scaling analysis corroborates that the temporal methods as a group are less reliable and less robust. Of course, due to the lack of a true gold standard, the possibility remains that for some purposes one of the temporal methods actually performs better than any of the other methods, but this does not seem to be supported by the appearance of the extracted spatial maps and time courses in this experiment.
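A simple way to quantify how pronouncedly an extracted time course relates to the stimulus paradigm is to correlate it with an expected response, obtained by convolving the stimulus on/off function with a haemodynamic response function. The sketch below assumes a hypothetical block design (30 s on/off blocks, a repetition time of 2 s) and a simple single-gamma HRF rather than the acquisition scheme and response model of this study.

```python
import numpy as np
from scipy.stats import gamma

TR = 2.0                                   # hypothetical repetition time (s)
n_scans = 200
t = np.arange(n_scans) * TR

# Hypothetical block paradigm: 30 s of stimulation alternating with 30 s of silence
boxcar = ((t // 30) % 2 == 0).astype(float)

# Simple canonical HRF (single gamma function, peaking a few seconds after onset)
hrf_t = np.arange(0, 32, TR)
hrf = gamma.pdf(hrf_t, a=6, scale=1.0)
hrf /= hrf.sum()

expected = np.convolve(boxcar, hrf)[:n_scans]

def paradigm_corr(time_course):
    """Correlation of an extracted component time course with the expected response."""
    x = (time_course - time_course.mean()) / time_course.std()
    y = (expected - expected.mean()) / expected.std()
    return float((x * y).mean())

# e.g. paradigm_corr(rotated_tcs[:, 0]) for a factor time course from the earlier sketch
```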

Although analyses in the temporal domain have been reported to be preferable in experiments where multiple stimuli are presented independently over time (Calhoun et al., 2001a), analyses in the spatial domain are more commonly used in fMRI studies. To the extent that the experiment in this study may be regarded as representative, this indicates that the assumption of spatial independence and segregation of brain systems (in sICA and sFA, respectively) is more suitable in the general context of neuroimaging than the assumption of temporal independence and incoincidence of brain states (in tICA and tFA).

In conclusion, all sFA methods performed satisfactorily and similarly, and no obvious preference for any of the methods tried emerged from these results. Their performance was very comparable to that of some well-established sICA methods. Therefore, sFA is a viable and useful technique in the context of fMRI analyses. In comparison, the tFA methods performed much worse and should be avoided.

Acknowledgments

The author was partly funded by VENI research grant 016.096.011 from the Netherlands Organisation for Scientific Research (NWO) and the Netherlands Organisation for Health Research and Development (ZonMw). Furthermore, he would like to express his gratitude for the fruitful cooperation with the Melcher research group at the Eaton-Peabody Laboratory in Boston (USA) and particularly acknowledge the generous contributions of Jennifer Melcher and Gianwen Gu, who were indispensable in the design of the neuroimaging experiment, the recruitment of the subjects, and the acquisition of the data.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.neuroimage.2009.04.017.

References

Bartels, A., Zeki, S., 2005. Brain dynamics during natural viewing conditions—a new guide for mapping connectivity in vivo. Neuroimage 24, 339–349.
Beckmann, C., Smith, S., 2005. Tensorial extensions of independent component analysis for multisubject FMRI analysis. Neuroimage 25, 294–311.
Bell, A.J., Sejnowski, T.J., 1995. An information-maximization approach to blind separation and blind deconvolution. Neural Comput. 7, 1129–1159.
Calhoun, V.D., Adali, T., 2006. Unmixing fMRI with independent component analysis. IEEE Eng. Med. Biol. Mag. 25, 79–90.
Calhoun, V.D., Adali, T., Pearlson, G.D., Pekar, J.J., 2001a. Spatial and temporal independent component analysis of functional MRI data containing a pair of task-related waveforms. Hum. Brain Mapp. 13, 43–53.
Calhoun, V.D., Adali, T., Pearlson, G.D., Pekar, J.J., 2001b. A method for making group inferences from functional MRI data using independent component analysis. Hum. Brain Mapp. 14, 140–151.
Cardoso, J.F., 1999. High-order contrasts for independent component analysis. Neural Comput. 11, 157–192.
Cardoso, J., Souloumiac, A., 1993. Blind beamforming for non-Gaussian signals. IEE Proceedings-F 140, 362–370.
Carroll, J., 1953. An analytical solution for approximating simple structure in factor analysis. Psychometrika 18, 23–38.
Child, D., 2006. The Essentials of Factor Analysis, 3rd ed. Continuum International Publishing Group Ltd.
Correa, N., Adali, T., Calhoun, V.D., 2007. Performance of blind source separation algorithms for fMRI analysis using a group ICA method. Magn. Reson. Imaging 25, 684–694.
Crawford, C., Ferguson, G., 1970. A general rotation criterion and its use in orthogonal rotation. Psychometrika 35, 321–332.
Darton, R.A., 1980. Rotation in factor analysis. The Statistician 29, 167–194.
Dimitriadou, E., Barth, M., Windischberger, C., Hornik, K., Moser, E., 2004. A quantitative comparison of functional MRI cluster analysis. Artif. Intell. Med. 31, 57–71.
Edmister, W.B., Talavage, T.M., Ledden, P.J., Weisskoff, R.M., 1999. Improved auditory cortex imaging using clustered volume acquisitions. Hum. Brain Mapp. 7, 89–97.
Ehret, G., Romand, R., 1996. The Central Auditory System. Oxford University Press, USA.
Esposito, F., Formisano, E., Seifritz, E., Goebel, R., Morrone, R., Tedeschi, G., Di Salle, F., 2002. Spatial independent component analysis of functional MRI time-series: to what extent do results depend on the algorithm used? Hum. Brain Mapp. 16, 146–157.
Esposito, F., Scarabino, T., Hyvarinen, A., Himberg, J., Formisano, E., Comani, S., Tedeschi, G., Goebel, R., Seifritz, E., Di Salle, F., 2005. Independent component analysis of fMRI group studies by self-organizing clustering. Neuroimage 25, 193–205.
Ferguson, G., 1954. The concept of parsimony in factor analysis. Psychometrika 19, 281–290.
Filzmoser, P., Baumgartner, R., Moser, E., 1999. A hierarchical clustering method for analyzing functional MR images. Magn. Reson. Imaging 17, 817–826.
Formisano, E., Esposito, F., Kriegeskorte, N., Tedeschi, G., Di Salle, F., Goebel, R., 2002. Spatial independent component analysis of functional magnetic resonance imaging time-series: characterization of the cortical components. Neurocomputing 49, 241–254.
Friston, K.J., Holmes, A.P., Poline, J.B., Grasby, P.J., Williams, S.C., Frackowiak, R.S., Turner, R., 1995. Analysis of fMRI time-series revisited. Neuroimage 2, 45–53.
Gazzaniga, M.S., 1989. Organization of the human brain. Science 245, 947–952.
Gorsuch, R.L., 1983. Factor Analysis, 2nd ed. Lawrence Erlbaum Associates Inc.
Guimaraes, A.R., Melcher, J.R., Talavage, T.M., Baker, J.R., Ledden, P., Rosen, B.R., Kiang, N.Y., Fullerton, B.C., Weisskoff, R.M., 1998. Imaging subcortical auditory activity in humans. Hum. Brain Mapp. 6, 33–41.
Hakstian, A., Abell, R., 1974. A further comparison of oblique factor transformation methods. Psychometrika 39, 429–444.
Hall, D.A., Haggard, M.P., Akeroyd, M.A., Palmer, A.R., Summerfield, A.Q., Elliott, M.R., Gurney, E.M., Bowtell, R.W., 1999. "Sparse" temporal sampling in auditory fMRI. Hum. Brain Mapp. 7, 213–223.
Harms, M.P., Melcher, J.R., 2003. Detection and quantification of a wide range of fMRI temporal responses using a physiologically-motivated basis set. Hum. Brain Mapp. 20, 168–183.
Hendrickson, A., White, P., 1964. Promax: a quick method for rotation to oblique simple structure. Brit. J. Stat. Psych. 17, 65–70.
Himberg, J., Hyvärinen, A., Esposito, F., 2004. Validating the independent components of neuroimaging time series via clustering and visualization. Neuroimage 22, 1214–1222.
Hyvärinen, A., Oja, E., 2000. Independent component analysis: algorithms and applications. Neural Netw. 13, 411–430.
Kaiser, H., 1958. The varimax criterion for analytic rotation in factor analysis. Psychometrika 23, 187–200.
Kim, J., Mueller, C.W., 1979. Introduction to Factor Analysis: What it is and How to Do it. Sage Publications, Inc.
Kline, P., 1993. An Easy Guide to Factor Analysis, 1st ed. Routledge.
Lewicki, M.S., Sejnowski, T.J., 2000. Learning overcomplete representations. Neural Comput. 12, 337–365.
Malinen, S., Hlushchuk, Y., Hari, R., 2007. Towards natural stimulation in fMRI—issues of data analysis. Neuroimage 35, 131–139.
McKeown, M.J., Makeig, S., Brown, G.G., Jung, T.P., Kindermann, S.S., Bell, A.J., Sejnowski, T.J., 1998. Analysis of fMRI data by blind separation into independent spatial components. Hum. Brain Mapp. 6, 160–188.
Mulaik, S., 1986. Factor analysis and Psychometrika: major developments. Psychometrika 51, 23–33.
Sigalovsky, I.S., Melcher, J.R., 2006. Effects of sound level on fMRI activation in human brainstem, thalamic and cortical centers. Hear. Res. 215, 67–76.
Spearman, C., 1904. "General intelligence," objectively determined and measured. Am. J. Psychol. 15, 201–292.
Thompson, B., 2004. Exploratory and Confirmatory Factor Analysis: Understanding Concepts and Applications. American Psychological Association.
Thurstone, L., 1954. An analytical method for simple structure. Psychometrika 19, 173–182.
Thurstone, L.L., 1931. Multiple factor analysis. Psychol. Rev. 38, 406–427.
van de Ven, V.G., Formisano, E., Prvulovic, D., Roeder, C.H., Linden, D.E.J., 2004. Functional connectivity as revealed by spatial independent component analysis of fMRI measurements during rest. Hum. Brain Mapp. 22, 165–178.
Waller, N.G., 1993. Software review: seven confirmatory factor analysis programs: EQS, EzPATH, LINCS, LISCOMP, LISREL 7, SIMPLIS, and CALIS. Appl. Psychol. Meas. 17, 73–100.
Yang, Z., LaConte, S., Weng, X., Hu, X., 2008. Ranking and averaging independent component analysis by reproducibility (RAICAR). Hum. Brain Mapp. 29, 711–725.