Meta-Analyses in Functional Neuroimaging
MA Lindquist, Johns Hopkins University, Baltimore, MD, USA
TD Wager, University of Colorado at Boulder, Boulder, CO, USA
© 2015 Elsevier Inc. All rights reserved.
Glossary

Coordinate-based meta-analysis A meta-analysis performed using the spatial locations of peaks of activation, reported in a standard coordinate system, across multiple neuroimaging studies.
Meta-analysis The statistical analysis of multiple separate but similar studies with the goal of testing the combined data for statistical significance.
Monte Carlo methods A class of computational algorithms that use repeated re-sampling of the available data to obtain numerical results.
Ontology A representation of a domain of knowledge.
Reverse inference Using observed brain activity to infer a particular cognitive process that was not directly tested, drawing on other research implicating that brain area in that cognitive process.
Introduction

In recent years, there has been an explosive growth in both the number and variety of neuroimaging studies being performed around the world. This is illustrated in Figure 1, which shows the rapid increase in the yearly number of publications related to functional magnetic resonance imaging (fMRI) since 1993. With this growing body of knowledge comes a need to both integrate and synthesize research findings. Meta-analysis has arguably become the primary research tool for accomplishing this goal (Wager, Lindquist, & Kaplan, 2007; Wager, Lindquist, Nichols, Kober, & Van Snellenberg, 2009). By pooling multiple separate but similar studies, meta-analyses can be used to evaluate the consistency of findings across labs, scanning procedures, and task variants, as well as the specificity of findings in different brain regions or networks to particular task types. They have already been used to study many types of psychological processes, including cognitive control, working memory, decision-making, language, pain, and emotion (Laird et al., 2005; Phan, Wager, Taylor, & Liberzon, 2002; Wager, Jonides, & Reading, 2004). They have also been used to summarize structural and functional brain correlates of disorders such as attention deficit disorder, schizophrenia, depression, chronic pain, anxiety disorders, and obsessive–compulsive disorder (Dickstein, Bannon, Xavier Castellanos, & Milham, 2006; Etkin & Wager, 2007; Glahn et al., 2005; Menzies et al., 2008).

To date, the primary goal of meta-analyses has been to summarize the consistency of regional brain activation across a set of studies of a particular task type, thereby providing a consensus regarding which regions are likely to be truly activated by a given task (see Figure 2 for an illustration). Evaluating consistency across studies is critical because false-positive rates in neuroimaging are likely to be significantly higher than in many other fields. In previous work, we estimated the false-positive rates to lie somewhere between 10% and 40% (Wager et al., 2009). These inflated false-positive rates are a by-product of poor control for multiple comparisons combined with the small sample sizes typically used in neuroimaging studies.
Though these rates may be decreasing over time as sample sizes increase and more rigorous statistical thresholding procedures become more common (Woo, Krishnan, & Wager, 2014), it remains important to assess which findings have been replicated and therefore have a higher probability of being true activations.

The goals of meta-analysis can be extended beyond regional activation to also identify groups of consistently coactivated regions that may form spatially distributed functional networks in the brain (Kober et al., 2008). If two regions are coactivated, then studies that activate one region are more likely to activate the other. The analysis of coactivation can thus be viewed as a meta-analytic analogue of functional connectivity: it provides a basis for treating coactivated sets of regions as units of analysis in individual studies and leads to testable hypotheses about functional connectivity in specific tasks.

Evaluating the specificity of activation patterns for particular psychological processes is also important in order to understand whether a particular brain region is unique to a certain psychological domain or whether it is shared by a larger set of cognitive processes. For example, this is critical for determining whether activity in some region implies the involvement of a given psychological process (so-called reverse inference; Poldrack, 2006; Yarkoni, Poldrack, Nichols, Van Essen, & Wager, 2011). Specificity can only be examined across a range of tested alternative tasks. However, different psychological domains are typically studied in isolation, and it is nearly impossible to compare a wide range of tasks within the scope of a single study. By contrast, meta-analytic activation patterns can be compared across the entire range of tasks studied using neuroimaging techniques, providing a unique way to evaluate activation specificity across functional domains.
Meta-Analytic Data

In an ideal setting, meta-analysis would be performed using the full statistical maps from each study, fit with a mixed-effects model that aggregates the effect size at each individual voxel. In practice, however, this information is not readily available from individual studies. There has been some movement toward creating such databases, and the increased availability of such data promises to fundamentally change the manner in which meta-analysis is performed. For now, though, individual imaging studies often use very different analyses, and effect sizes are only reported for a small number of activated locations, making combined effect-size maps across the brain impossible to reconstruct from published reports.

Instead, coordinate-based meta-analysis is typically performed using the spatial locations of peaks of activation (peak coordinates), reported in a standard coordinate system such as that of the Montreal Neurological Institute (MNI) and combined across studies. Figure 2 (left) shows the reported peaks from 163 studies of emotion. This information is provided in most neuroimaging papers, and in early work, peak coordinates were manually 'harvested' from articles. Today, there exist electronic databases, such as BrainMap (Fox & Lancaster, 2002) and Neurosynth (Yarkoni et al., 2011), of published functional and structural neuroimaging experiments with coordinate-based results reported in MNI space. Both databases are easily searchable by paradigm, cognitive domain, region of interest, and anatomical label. While papers are added manually to BrainMap, Neurosynth uses text-mining algorithms to automatically harvest peak coordinates from journal articles and provides a platform for performing large-scale, automated synthesis of neuroimaging data extracted from published articles.
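To make the structure of such data concrete, the sketch below shows one convenient way to hold coordinate-based meta-analytic data in memory: a table of peak locations nested within study-level contrasts. The study names, contrasts, sample sizes, and coordinates are hypothetical, and the pandas representation is only an illustrative choice, not a required format.

```python
import pandas as pd

# Hypothetical peak-coordinate table: one row per reported peak, with peaks
# nested within study-level contrasts; x, y, z are MNI coordinates in mm.
peaks = pd.DataFrame({
    "study":      ["Smith2010", "Smith2010", "Lee2012", "Lee2012", "Chen2015"],
    "contrast":   ["fear > neutral", "fear > neutral",
                   "pain > warm", "pain > warm", "fear > neutral"],
    "n_subjects": [18, 18, 24, 24, 30],
    "x": [-22, 40, 2, -38, 26],
    "y": [-4, 22, 10, -20, -2],
    "z": [-18, -6, 40, 52, -20],
})

# The nesting of peaks within studies is what multilevel methods such as MKDA
# and modALE respect when they treat the study map as the unit of analysis.
print(peaks.groupby("study").size())
```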
Figure 1 A graph showing the rapid increase in the number of publications that mention the term fMRI in either the title or abstract for the years 1993–2013, according to PubMed.
Meta-Analytic Methods
Figure 2 (Left) Peak activation coordinates from 163 studies on emotion. (Right) A summary of consistently activated regions computed using MKDA analysis.
A typical neuroimaging meta-analysis takes the locations of peak activations from a large number of studies and seeks to identify regions of consistent activation across studies that use a particular task or are related to the same psychological state (see Figure 2). Proposed meta-analytic methods include combining effect-size data (Van Snellenberg, Torres, & Thornton, 2006), analyzing the frequencies of reported peaks within anatomically defined regions of interest (Phan et al., 2002), performing cluster analyses (Northoff et al., 2006; Wager et al., 2004), and using various Bayesian methods (Kang, Johnson, Nichols, & Wager, 2011; Yue, Lindquist, & Loh, 2012). However, the most popular approaches to meta-analysis of functional neuroimaging data are the so-called kernel-based methods. These include activation likelihood estimation (ALE; Turkeltaub, Eden, Jones, & Zeffiro, 2002) and kernel density approximation (KDA; Wager et al., 2004), as well as their extensions, modified ALE (modALE; Eickhoff et al., 2009) and multilevel KDA (MKDA; Wager et al., 2007). Though the current standards in the field are MKDA and modALE, we begin by discussing their precursors for historical reasons.
Kernel-Based Methods

In ALE and KDA, the peak coordinates are the basic units of analysis. Both methods measure consistency by counting the number of peak activations in each voxel, convolving the results with a kernel, and comparing the number of observed peaks to a null-hypothesis distribution (see Figure 3 for an illustration). In KDA, the kernel is spherical with radius r, and the resulting maps are interpreted as the number of peaks within r mm of each voxel.
Figure 3 Example of meta-analysis using KDA or ALE on three studies. The three small maps on the left show peaks reported in each study for a representative axial brain slice. Peaks are combined across studies and the resulting map is smoothed with either a spherical kernel (KDA) or a Gaussian kernel (ALE). The resulting peak density map or ALE map is thresholded, resulting in a map of significant results.
In ALE, the kernel is Gaussian with a prespecified full width at half maximum. The smoothed values at each voxel are then treated as estimates of the probability that each peak lies within r mm of that voxel, and their union is computed to give the activation likelihood, interpreted as the probability that at least one of the peak activations lies within the voxel.

For both methods, Monte Carlo methods are used to find an appropriate threshold for testing the null hypothesis that the n reported peak coordinates are uniformly distributed throughout the gray matter. A permutation distribution is computed by repeatedly generating n peaks at random locations and performing the smoothing operation, yielding a series of statistical maps under the null hypothesis that can be used to compute voxel-wise p values. In KDA, the maximum density value from each permutation is saved, and a maximum density distribution is computed under the null hypothesis. This allows one to determine thresholds that ensure strong control over the family-wise error rate (FWER). In contrast, ALE identifies voxels where the union of probabilities exceeds that expected by chance, and the resulting p values are subjected to false discovery rate correction.
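As a concrete illustration of the kernel-based computation just described, the following sketch runs a KDA-style analysis on a toy 2-D grid rather than a 3-D brain volume. The grid size, kernel radius, peak locations, and number of Monte Carlo iterations are all illustrative assumptions; a real analysis works with maps in standard space and restricts peaks to a gray-matter mask.

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(0)

# Toy 2-D "brain": a 64 x 64 grid with a circular gray-matter mask.
shape = (64, 64)
yy, xx = np.mgrid[:shape[0], :shape[1]]
mask = (xx - 32) ** 2 + (yy - 32) ** 2 <= 28 ** 2

# Spherical kernel (a disc in 2-D) of radius r voxels.
r = 4
ky, kx = np.mgrid[-r:r + 1, -r:r + 1]
kernel = ((kx ** 2 + ky ** 2) <= r ** 2).astype(float)

def density_map(peaks):
    """Peak density: for each voxel, the number of peaks within r voxels."""
    counts = np.zeros(shape)
    for y, x in peaks:
        counts[y, x] += 1
    return convolve(counts, kernel, mode="constant") * mask

# Illustrative observed peaks, pooled over studies (KDA ignores study labels).
observed_peaks = [(20, 22), (21, 24), (22, 23), (45, 40), (46, 41), (30, 50)]
observed = density_map(observed_peaks)

# Monte Carlo null: the same number of peaks placed uniformly within the mask;
# the maximum density from each iteration builds an FWER-controlling threshold.
in_mask = np.argwhere(mask)
max_null = np.empty(1000)
for i in range(max_null.size):
    idx = rng.choice(len(in_mask), size=len(observed_peaks), replace=True)
    max_null[i] = density_map(in_mask[idx]).max()

threshold = np.percentile(max_null, 95)
print("density threshold:", threshold)
print("significant voxels:", int((observed > threshold).sum()))
```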
Both approaches have a number of shortcomings. For example, neither method takes into account which study the peaks came from; KDA and ALE therefore summarize consistency across peak coordinates rather than across studies, and a significant result can be driven primarily by a single study. In addition, they assume no interstudy differences and that the peaks are spatially independent within and across studies under the null hypothesis. Hence, both approaches perform fixed-effects analyses, and the results do not generalize beyond the studies under consideration.

MKDA and modALE were developed to circumvent these shortcomings. Both take into consideration the multilevel nature of the data and nest peak coordinates within study-specific contrast maps. This allows study-specific maps to be treated as random effects and ensures that no single map can disproportionately contribute to the results, allowing for a transition from fixed-effects to random-effects analysis. In addition, study-specific maps are weighted by measures of study quality (MKDA) and sample size (both MKDA and modALE), ensuring that larger and more rigorously performed studies exert more influence on the final results. For both methods, the contrast maps, rather than the peak coordinates, serve as the unit of analysis.

As in KDA, in MKDA the peaks are convolved with a spherical kernel of radius r mm. Here, however, the convolution occurs within each study-specific map rather than across all the included peaks. This creates new maps in which a voxel value of 1 represents the presence of a peak within r mm and 0 represents its absence. These maps are then weighted according to the criteria mentioned in the preceding text, giving rise to a map representing, at each voxel, the weighted proportion of study-specific maps with a peak within r mm. In modALE, the peaks are convolved with a Gaussian kernel. In contrast to ALE, the width of the kernel depends upon empirical estimates of the between-subject and between-template variabilities, which are used to model the spatial uncertainty associated with each coordinate. The modeled probabilities are then combined over all studies by taking the voxel-wise union of their probability values.

Similar to their predecessors, MKDA and modALE use Monte Carlo simulations to obtain a threshold and establish statistical significance, but with a few important differences. In MKDA, the null hypothesis is that the 'blobs,' or coherent regions of activation, within each convolved study-specific map are located at random within the gray matter. In each Monte Carlo iteration, the number and shape of the activation blobs are held constant within each study-specific map, while their locations are randomized throughout the gray matter. This preserves the spatial clustering of nearby peaks within each contrast and avoids the assumption of independent peak locations within contrasts. After each Monte Carlo iteration, the maximum density statistic across all studies is saved and used to choose an appropriate threshold that controls the FWER (see Figure 4 for an illustration). For modALE, the permutation testing is limited to regions of gray matter and modified to test for significant clustering between experiments.
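Continuing the toy example, the sketch below illustrates the MKDA-style computation: a binary indicator map per study, followed by a weighted average across studies. The study peak lists, sample sizes, and the square-root-of-sample-size weighting are illustrative assumptions; the actual method also incorporates study-quality weights, and significance would be assessed with the blob-randomization Monte Carlo procedure described above.

```python
import numpy as np
from scipy.ndimage import binary_dilation

# Toy 2-D grid; in practice this would be a 3-D volume in standard space.
shape = (64, 64)
r = 4
ky, kx = np.mgrid[-r:r + 1, -r:r + 1]
kernel = (kx ** 2 + ky ** 2) <= r ** 2  # disc-shaped structuring element

# Hypothetical studies: (list of peak coordinates, sample size).
studies = [
    ([(20, 22), (45, 40)], 15),
    ([(21, 24), (30, 50)], 40),
    ([(22, 23)], 25),
]

def indicator_map(peaks):
    """1 wherever a study reports a peak within r voxels, 0 elsewhere."""
    seed = np.zeros(shape, dtype=bool)
    for y, x in peaks:
        seed[y, x] = True
    return binary_dilation(seed, structure=kernel)

# One common weighting choice: the square root of the sample size, so larger
# studies count more without any single map dominating the result.
weights = np.sqrt([n for _, n in studies])
weights = weights / weights.sum()

maps = np.stack([indicator_map(p) for p, _ in studies]).astype(float)
mkda_map = np.tensordot(weights, maps, axes=1)  # weighted proportion of studies

# Thresholding would follow the Monte Carlo logic of the KDA sketch, except
# that whole activation blobs, not individual peaks, are relocated at random.
print("maximum weighted proportion of activated studies:", mkda_map.max())
```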
Figure 4 Example of meta-analysis using MKDA on three studies. The three small maps on the left show peaks reported in each study for a representative axial brain slice. Each map is separately convolved with a spherical kernel, generating indicator maps for each study contrast map. The weighted average of the indicator maps is computed and thresholded to produce a map of significant results.
Investigating Coactivation

Meta-analysis can also be used to reveal consistent patterns of coactivation. For example, in MKDA, the data can be organized as an n × v indicator matrix recording whether each of the n study-specific maps is activated in the neighborhood of each of the v voxels in the brain. The resulting coactivation profiles across voxels can be summarized over a smaller set of structurally or functionally defined regions. Hypothesis tests can then be performed on coactivation, and relationships among multiple regions can be summarized and visualized. Several measures of association for bivariate binomial data are applicable, including Kruskal's gamma, Kendall's tau, and Fisher's exact test.
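As an illustration, once the indicator matrix is reduced to a pair of regions, their coactivation is summarized by a 2 × 2 contingency table of study counts, to which these association measures apply directly. The region labels and simulated indicator data below are purely hypothetical.

```python
import numpy as np
from scipy.stats import fisher_exact, kendalltau

rng = np.random.default_rng(1)

# Hypothetical study-by-region indicators: entry i is 1 if study i reported a
# peak within the neighborhood of the region, 0 otherwise.
n_studies = 80
region_a = rng.integers(0, 2, n_studies)              # e.g., "amygdala"
coactive = rng.random(n_studies) < 0.7                 # induce some coactivation
region_b = np.where(coactive, region_a, rng.integers(0, 2, n_studies))

# 2 x 2 contingency table: rows = region A active/inactive,
#                          columns = region B active/inactive.
table = np.array([
    [np.sum((region_a == 1) & (region_b == 1)), np.sum((region_a == 1) & (region_b == 0))],
    [np.sum((region_a == 0) & (region_b == 1)), np.sum((region_a == 0) & (region_b == 0))],
])

odds_ratio, p = fisher_exact(table)            # exact test of association
tau, p_tau = kendalltau(region_a, region_b)    # rank-based alternative
print(f"odds ratio = {odds_ratio:.2f} (p = {p:.3g}); Kendall tau = {tau:.2f} (p = {p_tau:.3g})")
```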
Evaluating Specificity

Meta-analysis provides perhaps the only way to compare neuroimaging results across a wide variety of tasks. Within the kernel-based methods, for example, separate maps can be constructed for each of two task types and subtracted to yield difference maps. The same Monte Carlo randomization procedure can then be used to perform inference: locations of contiguous activation blobs (or peaks in ALE/KDA) are randomized, providing simulated null-hypothesis conditions that can be used to choose a threshold for determining significant differences. These difference maps allow one to test the relative frequency of activating a given region, compared with the overall frequencies in the rest of the brain. Thus, a reliable concentration of peaks in a certain brain area for a specific task type will increase the marginal activation frequency for that task, which in turn affects the null-hypothesis difference in the Monte Carlo simulations. A caveat is that, for task types with relatively few peaks, there need not be a greater absolute probability of activating a region to achieve a significant density for that region relative to other task types. To test the absolute difference in activation between one condition and another, one can alternatively perform a nonparametric chi-square test (Wager et al., 2007). This tests whether there is a systematic association between activation in a particular voxel and a set of tasks.
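At a single voxel or region, this chi-square test amounts to a test of independence on a task-by-activation contingency table, as in the sketch below; the study counts are made up for illustration.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical counts of study contrasts that did / did not report activation
# in a given region, broken down by task type.
#                    activated  not activated
counts = np.array([
    [30, 20],   # working memory
    [12, 45],   # emotion
    [ 8, 50],   # pain
])

chi2, p, dof, expected = chi2_contingency(counts)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.3g}")

# A significant result indicates that the frequency with which this region is
# activated depends on task type, i.e., some degree of task specificity.
```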
Software

In recent years, a number of user-friendly software solutions have been introduced that simplify the process of performing a meta-analysis. GingerALE, for example, is the BrainMap application used to perform an ALE meta-analysis on coordinates in Talairach or MNI space.

Neurosynth, introduced above, provides a Web interface through which simple but useful analyses of fMRI data can be run on a very large scale. Neurosynth currently contains coordinates from 6000 published studies, along with words from the full text of the published papers. Meta-analytic maps of the relationships between brain activation and use of 10 000 key terms in the papers are available online, as are online coactivation analyses, topic-based brain maps, and other features under development. Users can view lists of papers that activate a given location, create a map of regions that coactivate with a location, download activation and coactivation maps for use in studies, and download 'feature sets' of many commonly used maps (e.g., 'face,' 'place,' 'working memory,' and 'reward'). Maps for each term come in two forms: 'forward inference' maps, based on a chi-square test of the likelihood of activation at each brain location given that a study frequently uses the term, and 'reverse inference' maps, based on a chi-square test of independence between whether a study involves the term and whether it reports activation at each location. These maps can be used for exploration, hypothesis generation, and inference about activation across domains (e.g., Roy, Shohamy, & Wager, 2012). They can also be used as quantitative masks for the selection of a priori regions of interest (e.g., Wager et al., 2013). Finally, both KDA and MKDA are available in a Matlab toolbox.
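Conceptually, a reverse inference map converts such counts into the probability that a study involves a given term, given activation at a location, via Bayes' rule. The sketch below uses made-up counts and a uniform prior over the term; it is a conceptual illustration only, not the Neurosynth implementation.

```python
# Conceptual reverse inference at a single voxel, with made-up study counts.
# Forward quantity, P(activation | term): proportion of "pain" studies
# reporting activation near this voxel.
p_act_given_pain = 60 / 200       # 60 of 200 hypothetical pain studies
p_act_given_other = 150 / 3000    # 150 of 3000 hypothetical non-pain studies

# Uniform prior over "pain" vs. "not pain", used here purely for illustration
# (rather than the empirical base rate of pain studies in a database).
prior_pain = 0.5
p_act = prior_pain * p_act_given_pain + (1 - prior_pain) * p_act_given_other

# Bayes' rule: the reverse inference quantity P(term | activation).
p_pain_given_act = prior_pain * p_act_given_pain / p_act
print(f"P(pain | activation) = {p_pain_given_act:.2f}")  # ~0.86 with these counts
```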
Future Developments

Meta-analyses have become increasingly popular in recent years, and many interesting developments can be expected in the coming years.
One exciting direction is the use of analyses across study types to develop brain-based psychological ontologies that group different kinds of tasks and psychological functions based on the similarity of their brain patterns (Poldrack, 2008; Turner & Laird, 2012). Another promising direction is the development of meta-analysis-based classifier techniques that allow quantitative inferences to be made from brain activation to psychological states, permitting formal predictions about psychological states based on brain activation (Yarkoni et al., 2011).

A few Bayesian methods have recently appeared in the statistics literature. The first (Kang et al., 2011) proposes a Bayesian spatial hierarchical model using a marked independent cluster process; interestingly, this provides a generative model that allows peak locations in new studies to be predicted. The second (Yue et al., 2012) proposes a nonparametric binary regression method in which each location (or voxel) has a probability of being truly activated, with the corresponding probability function based on a spatially adaptive Gaussian Markov random field.

Finally, the results of meta-analysis can potentially be used to create priors for Bayesian analyses, as they summarize the current state of knowledge about a given psychological process. They also show great promise as tools for performing feature selection in multivoxel pattern analysis (Wager et al., 2013).
See also: INTRODUCTION TO METHODS AND MODELING: BrainMap Database as a Resource for Computational Modeling; Contrasts and Inferences; Databases; Reverse Inference; The General Linear Model.
References

Dickstein, S. G., Bannon, K., Xavier Castellanos, F., & Milham, M. P. (2006). The neural correlates of attention deficit hyperactivity disorder: An ALE meta-analysis. Journal of Child Psychology and Psychiatry, 47(10), 1051–1062.
Eickhoff, S. B., Laird, A. R., Grefkes, C., Wang, L. E., Zilles, K., & Fox, P. T. (2009). Coordinate-based activation likelihood estimation meta-analysis of neuroimaging data: A random-effects approach based on empirical estimates of spatial uncertainty. Human Brain Mapping, 30(9), 2907–2926.
Etkin, A., & Wager, T. (2007). Functional neuroimaging of anxiety: A meta-analysis of emotional processing in PTSD, social anxiety disorder, and specific phobia. American Journal of Psychiatry, 164(10), 1476–1488.
Fox, P. T., & Lancaster, J. L. (2002). Mapping context and content: The BrainMap model. Nature Reviews Neuroscience, 3(4), 319–321.
Glahn, D. C., Ragland, J. D., Abramoff, A., Barrett, J., Laird, A. R., Bearden, C. E., et al. (2005). Beyond hypofrontality: A quantitative meta-analysis of functional neuroimaging studies of working memory in schizophrenia. Human Brain Mapping, 25(1), 60–69.
Kang, J., Johnson, T. D., Nichols, T. E., & Wager, T. D. (2011). Meta analysis of functional neuroimaging data via Bayesian spatial point processes. Journal of the American Statistical Association, 106(493), 124–134.
Kober, H., Barrett, L. F., Joseph, J., Bliss-Moreau, E., Lindquist, K., & Wager, T. D. (2008). Functional grouping and cortical–subcortical interactions in emotion: A meta-analysis of neuroimaging studies. NeuroImage, 42, 998–1031.
Laird, A. R., McMillan, K. M., Lancaster, J. L., Kochunov, P., Turkeltaub, P. E., Pardo, J. V., et al. (2005). A comparison of label-based review and ALE meta-analysis in the Stroop task. Human Brain Mapping, 25(1), 6–21.
Menzies, L., Chamberlain, S. R., Laird, A. R., Thelen, S. M., Sahakian, B. J., & Bullmore, E. T. (2008). Integrating evidence from neuroimaging and neuropsychological studies of obsessive–compulsive disorder: The orbitofrontostriatal model revisited. Neuroscience & Biobehavioral Reviews, 32(3), 525–549.
Northoff, G., Heinzel, A., de Greck, M., Bermpohl, F., Dobrowolny, H., & Panksepp, J. (2006). Self-referential processing in our brain – A meta-analysis of imaging studies on the self. NeuroImage, 31(1), 440–457.
Phan, K. L., Wager, T., Taylor, S. F., & Liberzon, I. (2002). Functional neuroanatomy of emotion: A meta-analysis of emotion activation studies in PET and fMRI. NeuroImage, 16(2), 331–348.
Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences, 10(2), 59–63.
Poldrack, R. A. (2008). The role of fMRI in cognitive neuroscience: Where do we stand? Current Opinion in Neurobiology, 18(2), 223–227.
Roy, M., Shohamy, D., & Wager, T. D. (2012). Ventromedial prefrontal–subcortical systems and the generation of affective meaning. Trends in Cognitive Sciences, 16(3), 147–156.
Turkeltaub, P. E., Eden, G. F., Jones, K. M., & Zeffiro, T. A. (2002). Meta-analysis of the functional neuroanatomy of single-word reading: Method and validation. NeuroImage, 16(3), 765–780.
Turner, J. A., & Laird, A. R. (2012). The cognitive paradigm ontology: Design and application. Neuroinformatics, 10(1), 57–66.
Van Snellenberg, J. X., Torres, I. J., & Thornton, A. E. (2006). Functional neuroimaging of working memory in schizophrenia: Task performance as a moderating variable. Neuropsychology, 20(5), 497.
Wager, T. D., Atlas, L. Y., Lindquist, M. A., Roy, M., Woo, C. W., & Kross, E. (2013). An fMRI-based neurologic signature of physical pain. New England Journal of Medicine, 368(15), 1388–1397.
Wager, T. D., Jonides, J., & Reading, S. (2004). Neuroimaging studies of shifting attention: A meta-analysis. NeuroImage, 22(4), 1679–1693.
Wager, T. D., Lindquist, M., & Kaplan, L. (2007). Meta-analysis of functional neuroimaging data: Current and future directions. Social Cognitive and Affective Neuroscience, 2, 150–158.
Wager, T. D., Lindquist, M. A., Nichols, T. E., Kober, H., & Van Snellenberg, J. (2009). Evaluating the consistency and specificity of neuroimaging data using meta-analysis. NeuroImage, 45(1), S210–S221.
Woo, C. W., Krishnan, A., & Wager, T. D. (2014). Cluster-extent based thresholding in fMRI analyses: Pitfalls and recommendations. NeuroImage, 91, 412–419.
Yarkoni, T., Poldrack, R. A., Nichols, T. E., Van Essen, D. C., & Wager, T. D. (2011). Large-scale automated synthesis of human functional neuroimaging data. Nature Methods, 8(8), 665–670.
Yue, Y. R., Lindquist, M. A., & Loh, J. M. (2012). Meta-analysis of functional neuroimaging data using Bayesian nonparametric binary regression. Annals of Applied Statistics, 6(2), 697–718.
Relevant Websites

http://brainmap.org – BrainMap.
http://neurosynth.org – Neurosynth.
http://sumsdb.wustl.edu/sums/index.jsp – Surface Management System Database.
http://wagerlab.colorado.edu/tools – Matlab toolbox.