Accepted Manuscript
An MVPA Method Based on Sparse Representation for Pattern Localization in fMRI Data Analysis
Fangyi Wang, Yuanqing Li, Zhenghui Gu
PII: S0925-2312(17)30997-9
DOI: 10.1016/j.neucom.2016.12.099
Reference: NEUCOM 18527
To appear in: Neurocomputing
Received date: 13 September 2016
Revised date: 15 December 2016
Accepted date: 17 December 2016
Please cite this article as: Fangyi Wang, Yuanqing Li, Zhenghui Gu, An MVPA Method Based on Sparse Representation for Pattern Localization in fMRI Data Analysis, Neurocomputing (2017), doi: 10.1016/j.neucom.2016.12.099
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Highlights
• A new MVPA method for fMRI data analysis.
• The ability to detect subtle differences between experimental conditions.
• We localized two category-specific brain activation patterns corresponding to two experimental conditions.
• The two sets consisted of a maximal number of informative features.
• Wrongly selected features (noise) were removed by permutation tests.
An MVPA Method Based on Sparse Representation for Pattern Localization in fMRI Data Analysis
Fangyi Wang a,b, Yuanqing Li a,b,∗, Zhenghui Gu a,b
a Center for Brain Computer Interfaces and Brain Information Processing, South China University of Technology, Guangzhou 510640, China
b Guangzhou Key Laboratory of Brain Computer Interaction and Applications, Guangzhou 510640, China
Abstract
The multivariate pattern analysis (MVPA) approach applied to neuroimaging data, such as functional magnetic resonance imaging (fMRI) data, has received a great deal of attention because of its sensitivity in distinguishing patterns of neural activity associated with different stimuli or cognitive states. Generally, when the MVPA approach is used to decode mental states or stimuli, a set of discriminative variables (e.g., voxels) is first selected. However, in most existing MVPA methods, the selected variables do not contain all informative variables, because the selected variables alone are already sufficient for decoding. In this paper, we propose a multivariate pattern analysis method based on sparse representation for decoding brain states and localizing the category-specific brain activation areas corresponding to two experimental conditions/tasks at the same time. Unlike traditional MVPA approaches, this method is designed to find as many informative variables as possible. We applied the proposed method to two judgement experiments, a gender discrimination task and an emotion discrimination task; the data analysis results demonstrate its effectiveness and potential applications.
Keywords: Sparse representation, localizing, decoding, fMRI
∗ Corresponding author.
Email addresses: [email protected] (Fangyi Wang), [email protected] (Yuanqing Li), [email protected] (Zhenghui Gu)
1. Introduction

At any given moment, our brains are processing a vast amount of information about the surrounding environment. How the brain processes this flood of information within local and global networks is a fundamental question in neuroscience. Functional magnetic resonance imaging (fMRI) has become one of the most popular tools for imaging brain function [1]. However, fMRI yields very complex, high-dimensional data sets comprising up to hundreds of thousands of voxels. Traditionally, the data have been analyzed with a mass-univariate general linear model (GLM) approach, which reveals task-related brain areas by treating each voxel separately [2]. One limitation of the GLM approach is that the interrelationship among voxels of spatially distributed brain areas is not considered, because the model works on isolated voxels and ignores the joint information among them.
In recent years, multivariate pattern analysis (MVPA) approaches have shown promise for the analysis of fMRI data because of their ability to localize spatial patterns of activity that differ across experimental conditions/tasks [3]. These spatial patterns are generally too weak to be detected by the GLM [4, 5, 6, 7, 8]. Applications of MVPA have been rapidly developed in fMRI data analysis, such as stimulus reconstruction [9, 10], attention [11, 12, 13], decision making [14], and concept representation [15, 16, 17]. Recent MVPA approaches take three common forms: region-of-interest (ROI) based MVPA [8, 18, 14], whole-brain MVPA [19, 20], and local multivariate search approaches (e.g., searchlight) [14, 21]. Within MVPA, the last few years have witnessed a flurry of research activity on algorithms and theory aimed at feature selection and estimation involving sparse representation, because of its ability to handle high-dimensional data from a limited number of samples and to discover sparse spatial activity patterns, thus enhancing interpretability. Several methods, including Lasso [22], sparse logistic regression [23], Elastic Net [24], and sparse NMF [25], have been used for this purpose. Moreover, an alternative feature selection approach is to use linear or nonlinear dimensionality reduction methods, such as PCA [26] and LLE [27].
Although MVPA approaches have yielded remarkable insights into the types of stimulus attributes that might be represented in distributed spatial activity patterns, they are inherently limited in their ability to characterize the underlying feature space. Because the informative voxels/features selected by MVPA approaches are chosen on the basis of their predictive power, a subset of the informative voxels may already be sufficient for decoding, and the remaining redundant information may be useless for prediction; nevertheless, this redundant information is important for localizing category-specific brain activation areas [23]. In this paper, we propose a new MVPA method for fMRI data analysis. The proposed method combines a forward feature selection scheme with sparse regularization and permutation testing for feature selection in multivariate pattern classification settings, and we illustrate the application of our approach using an fMRI data set.

The remainder of this paper is organized as follows: Section 2 describes the experimental setting and the details of the proposed method, Section 3 reports and analyzes the experimental results, and Section 4 concludes the paper.
2. Materials and methods

2.1. Participants
Twelve healthy native Chinese males (aged 21 to 48 years, with normal or corrected-to-normal vision and normal hearing) participated in this study. All subjects provided written informed consent prior to the experiment. The experimental protocol was approved by the Ethics Committee of Guangdong General Hospital, China.
2.2. Experimental stimuli and Procedure
We selected 80 movie clips of faces with audio from internet sources. After image processing (Windows Movie Maker), each edited movie clip was in gray scale, lasted 1400 ms and subtended 10.7° × 8.7°. Semantically, these 80 movie clips could be partitioned orthogonally into two groups based on either
gender (40 male vs. 40 female Chinese faces) or facial emotion (40 crying vs.
40 laughing faces). The luminance levels of the videos were matched by adjusting the total power value of each video. Similarly, the audio power levels
were also matched by adjusting the total power value of each audio clip. During the experiment, stimulus presentation and response recording were controlled
with E-Prime software. The visual stimuli were projected onto a screen using an
LCD projector (SA-9900 fMRI Stimulation System, Shenzhen Sinorad Medical
Electronics, Inc.), and the subjects viewed the visual stimuli through a mirror mounted on a head coil. The auditory stimuli were delivered through a pneu-
matic headset (SA-9900 fMRI Stimulation System, Shenzhen Sinorad Medical Electronics, Inc.).
Each subject completed two runs, one for gender discrimination and the other for emotion discrimination; each run contained 10 blocks, and each block contained 8 trials. For the 10-fold cross-validation, the 80-trial data of each run were equally partitioned into 10 non-overlapping datasets, each corresponding to 1 of the 10 blocks. In the kth fold, the test data consisted of the kth block of each run, and the remaining blocks formed the training data. During each trial, which lasted 10
seconds or 5 volumes (TR=2 s), the subjects were asked to focus their attention on either the gender or emotion of faces in the movie clips and recognize whether each face was male/female or crying/laughing. At the beginning of each trial, a
stimulus was presented to the subject for 1400 ms, followed by a 600-ms blank period. This 2-s (one TR) cycle with the same stimulus was repeated 4 times to effectively elicit a brain activity pattern and was followed by a 6-s blank period. The mean responses of the third, fourth and fifth volumes in each trial were used, whereas the other volumes were discarded because of the delay of the BOLD response. More details were described in a previous study [21].
2.3. fMRI data collection and preprocessing

Functional images were collected using a 3 Tesla GE Signa Excite HD MR scanner at Guangdong General Hospital, China. A 3D anatomical T1-weighted scan (FOV: 280 mm; matrix: 256 × 256; 128 slices; slice thickness: 1.8 mm) was acquired before the functional scan for each subject. During the experiment, gradient-echo echo-planar (EPI) T2*-weighted images (25 slices acquired in an ascending non-interleaved order; TR = 2000 ms, TE = 35 ms, flip angle = 70°; FOV: 280 mm; matrix: 64 × 64; slice thickness: 5.5 mm, no gap) were acquired, covering the entire brain. Preprocessing was performed with the SPM8 software package (http://www.fil.ion.ucl.ac.uk/spm/). The first five volumes were discarded because the MRI signal had not yet stabilized. The preprocessing procedure included head motion correction, slice timing correction, co-registration between the functional scans and the structural scan, normalization to an MNI standard brain, data masking to exclude non-brain voxels, and detrending and normalization of the time series of each run to zero mean and unit variance, using custom functions in Matlab 2012a (MathWorks, Inc., Natick, MA). To reduce the computational burden and remove noise, the original data were filtered by the correlation computed voxelwise between each time series and the stimulus function, and the 6000 voxels with the highest absolute correlation values were selected for further processing.
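A minimal sketch of this correlation-based pre-selection is given below. The array names (bold_data, stimulus_regressor) and the use of NumPy are our own assumptions for illustration; the paper does not specify the implementation.

```python
import numpy as np

def preselect_voxels(bold_data, stimulus_regressor, n_keep=6000):
    """Keep the voxels whose time series correlate most strongly
    (in absolute value) with the stimulus regressor.

    bold_data          : (n_timepoints, n_voxels) array
    stimulus_regressor : (n_timepoints,) array
    """
    # Standardize both signals so the average product equals the Pearson correlation.
    x = (bold_data - bold_data.mean(axis=0)) / (bold_data.std(axis=0) + 1e-12)
    s = (stimulus_regressor - stimulus_regressor.mean()) / (stimulus_regressor.std() + 1e-12)
    corr = s @ x / len(s)                      # voxelwise Pearson correlation
    keep = np.argsort(-np.abs(corr))[:n_keep]  # indices of the most correlated voxels
    return keep, bold_data[:, keep]
```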
2.4. Feature selection and decoding

Feature selection is an important problem in machine learning, pattern recognition, and statistics. fMRI studies involve extremely high-dimensional features and a small number of samples, which is known as the curse-of-dimensionality problem. As a result, choosing a small subset of features is necessary to maximize model prediction accuracy. The feature selection ability of sparse representation can be used to select a subset of relevant features in the signal and, at the same time, to separate this subset into two sets corresponding to the two class labels, according to the signs of the sparse representation weights [19, 23, 28]. Sparse regularization is very important for MVPA because feature selection allows for functional localization of cognitive processes, with sparser feature selection providing more concise localization [29]. The sparse representation of a signal can be described with the following equation:
y = Aw,    (1)

where y ∈ ℝ^N denotes the class labels of the samples, with +1 indicating one class and −1 indicating the other class, A ∈ ℝ^(N×M) is the data matrix, where N and M are the numbers of samples and features respectively, and w ∈ ℝ^M is the weight vector. The sparsest solution of equation (1), i.e., the solution minimizing the ℓ0-norm of w, is obtained from

min ‖w‖0  subject to  y = Aw.    (2)

Here, we use a greedy algorithm, Orthogonal Matching Pursuit (OMP) [30], to solve this problem; it has the advantages of being computationally efficient and easy to implement [31].
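As an illustration, the following sketch shows how equation (2) can be solved greedily with OMP using scikit-learn. The choice of scikit-learn's OrthogonalMatchingPursuit and the parameter n_nonzero are our assumptions; the paper does not state which implementation or sparsity level was used.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def sparse_weights(A, y, n_nonzero=30):
    """Approximately solve min ||w||_0 subject to y = A w with OMP.

    A : (n_samples, n_features) data matrix
    y : (n_samples,) labels coded as +1 / -1
    """
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=n_nonzero, fit_intercept=False)
    omp.fit(A, y)
    return omp.coef_  # sparse weight vector w

# Positive weights point toward one class and negative weights toward the other,
# which is what the feature selection scheme described below exploits:
# w = sparse_weights(A_train, y_train)
# largest  = np.argsort(-w)[:15]  # candidate features for the +1 class
# smallest = np.argsort(w)[:15]   # candidate features for the -1 class
```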
The overall scheme for sparse-representation-based feature selection and decoding is shown in Fig. 1 and described in the following steps:
Step 1: A K-fold cross-validation is performed after data partitioning (K = 10). In each fold, we obtain two sets of informative features, IND+_k and IND−_k, corresponding to the gender and emotion recognition tasks respectively. By taking the union across folds, we obtain two sets of informative features, IND+ and IND− (see Fig. 1(A)), at the individual subject level.

Step 2: Each fold of the cross-validation contains n0 iterations. As an example, Fig. 1(B) illustrates the kth fold. In the nth iteration (n = 1, . . . , n0), two sets of informative features, Ind+(n) and Ind−(n), are obtained. The selected sets of this fold are IND+_k and IND−_k.

Step 3: In the nth iteration of this fold (see Fig. 1(B)), we first perform sparse representation on the training data to obtain a weight vector w(n). Second, we determine two sets of informative features, Ind+(n) and Ind−(n), using this weight vector. Specifically, Ind+(n) contains the N0 features corresponding to the largest elements (generally positive; the gender recognition task in this paper) of the weight vector w(n), while Ind−(n) contains the N0 features corresponding to the smallest
Figure 1: Scheme of feature selection by the sparse representation method and decoding of individual subject data. The algorithm contains K folds of cross-validation (A), with the iteration steps (including n0 iterations) of the kth fold listed in (B) as an example.

elements (generally negative; the emotion recognition task in this paper) of w(n)
[19, 23]. Third, we remove the features in Ind+(n) and Ind−(n) from the data set of this iteration; an updated data set with the remaining features is obtained for the next iteration. Finally, we perform another 9-fold cross-validation procedure on the updated training data set using a support vector machine (SVM); the prediction accuracy of the labels is denoted as r(n+1), with r(1) obtained from the initial training data set. Meanwhile, we also perform decoding of the test data of this fold based on all the features selected so far; the prediction model is trained on the training data with the same features, and the prediction accuracy of the labels is denoted as Gn. To assess the statistical significance of the decoding accuracy, we also
employed a nonparametric permutation test [32]. The null hypothesis assumes that the relationship between the data and the labels cannot be learned reliably by the family of classifiers used in the training step. In permutation testing, we randomly permuted the class labels of the training data (using all the selected features) 10000 times and calculated the corresponding pseudo decoding accuracies.

Remark: In this feature selection scheme, the number of features with the largest positive/smallest negative weights selected in each iteration, N0, was set to 15 according to previous studies [14, 28].
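To make the scheme concrete, the sketch below outlines one cross-validation fold of Steps 2–3 under our own simplifying assumptions: scikit-learn's OMP and a linear SVM are used, N0 = 15, and the number of iterations and the OMP sparsity level are fixed in advance. It illustrates the idea rather than reproducing the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def run_fold(A_train, y_train, A_test, y_test, n_iter=30, n0=15):
    """One fold: iteratively select features via sparse representation and
    track the inner-CV accuracy r and the test accuracy G."""
    remaining = np.arange(A_train.shape[1])   # features still available
    ind_pos, ind_neg = [], []                 # IND+_k and IND-_k of this fold
    r_curve, G_curve = [], []
    for _ in range(n_iter):
        X = A_train[:, remaining]
        # r: 9-fold cross-validation accuracy on the (shrinking) training data
        r_curve.append(cross_val_score(SVC(kernel="linear"), X, y_train, cv=9).mean())
        # sparse representation of the labels by the remaining features
        omp = OrthogonalMatchingPursuit(n_nonzero_coefs=4 * n0, fit_intercept=False)
        w = omp.fit(X, y_train).coef_
        pos = remaining[np.argsort(-w)[:n0]]  # largest (positive) weights
        neg = remaining[np.argsort(w)[:n0]]   # smallest (negative) weights
        ind_pos.extend(pos)
        ind_neg.extend(neg)
        # G: decode the test data with all features selected so far
        selected = np.concatenate([ind_pos, ind_neg])
        clf = SVC(kernel="linear").fit(A_train[:, selected], y_train)
        G_curve.append(clf.score(A_test[:, selected], y_test))
        # remove the newly selected features before the next iteration
        remaining = np.setdiff1d(remaining, np.concatenate([pos, neg]))
    return set(ind_pos), set(ind_neg), r_curve, G_curve
```

Per the paper, the fold-level sets IND+_k and IND−_k are then united across the K folds to give IND+ and IND− for each subject.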
2.5. Localization

After feature selection, we obtain two sets of selected features, IND+ and IND−, from the 10-fold cross-validation for each subject, corresponding to the gender and emotion discrimination tasks respectively. However, some of the selected features may represent noise. In order to remove such features, we perform a permutation test on the sets of selected features IND+ and IND− as follows.

Step 1 (probability maps): Two probability maps were constructed using the two sets of features selected across all K folds of cross-validation for each subject. For example, using the set IND+, we assign a score to each feature based on its selection frequency, i.e., by counting the number of folds in which the feature appears in IND+; features that are repeatedly selected across the folds of training data are likely to be important and therefore receive high values. If a feature does not appear in IND+, its frequency is set to zero. Thus, we obtain a probability map corresponding to the gender recognition task. Similarly, we obtain a probability map corresponding to the emotion recognition task using IND−. Finally, we averaged these probability maps across all subjects to obtain
AC
and repeated the above procedure of feature selection in each permutation, and obtained 300 pairs probability maps. Based on these probability distributions,
185
it is possible to test the null hypothesis at the voxel level. Step 3 (multiple comparison correction): For multiple comparison correction,
a null distribution for each class was constructed by pooling all probability values of the 300 average probability maps corresponding to this class, which were 9
Figure 2: Iterative decoding accuracy curves. MACr and MACG are abbreviations for mean accuracy curves of r(n+1) and Gn at the group level respectively.
obtained through the 300 permutations. The p value of each voxel is calculated 190
as the proportion of values in the null distribution that is greater or equal to
M
the value obtained by using the real label (i.e. non-permutated) data. A critical threshold was determined by False Discovery Rate (FDR)<0.05 for each class, we remove those features with their values greater than the critical threshold
195
ED
[19, 32]. The remaining features are those informative ones to localize spatial activity pattern with respect to the corresponding label (e.g., class).
PT
3. Results
CE
3.1. Decoding accuracy Each discrimination task (gender or emotion) contained 10 blocks and each
block contained 8 trials. For each subject, we applied 10-fold cross-validation scheme, the dataset of each discrimination task was divided into 10 equal sub-
AC
200
sets by blocks. Test data of each fold included 1 block of gender discrimination task and 1 block of emotion discrimination task. The remain data of two discrimination tasks was used as train data for this fold. There were two decoding results r(n+1) and Gn of each fold that correspond
205
to classification based on train data and test data respectively (see Materials 10
ACCEPTED MANUSCRIPT
2500 2000 1500 1000 500 0
0
10
20
30
40
50 60 Accuracy (%)
70
CR IP T
Permutation count
3000
80
90
100
AN US
Figure 3: The distribution of permutation test (10000 repetitions). The vertical red line indicates the real decoding accuracy without permutation.
and Methods). Increasing n (number of iterations) will result in increasing number of features for Gn calculation. By contrast, number of features for r(n+1) calculation was decreased(With the increasing of iterations, more and
210
M
more informative features were removed). The two average decoding accuracy curves across folds and subjects are shown in Fig. 2, where MACr and MACG
ED
are abbreviations for mean accuracy curves of r(n+1) and Gn at the group level respectively. We can see that after 15 iterations, the mean accuracy curve of Gn has reached 90% and keeps stable after then, because most of latter selected
215
PT
features are informative but highly related with earlier ones, which matches our expectation. Meanwhile, informative features were removed from the train data leads to decline of the mean accuracy curve of r(n+1) .
CE
With the decoding accuracy as the statistic, the distribution of permutation
is shown in Fig. 3. As demonstrated by Fig. 3, the classifier learned the rela-
AC
tionship between the data and the labels with a probability of being wrong of
220
<0.0001. 3.2. Localization of informative features Using two sets of selected features, we perform permutation test with p < 0.05 FDR-corrected and cluster size of 15 voxels to construct the corresponding
Figure 4: Voxels selected by our method with p < 0.05, FDR-corrected, and a cluster size of 15 voxels (axial slices from z = −60 to z = 60 mm). The blue clusters correspond to the gender discrimination task, and the red clusters correspond to the emotion discrimination task.
activation map (see Materials and Methods); the distribution of the informative patterns (clusters) is shown in Fig. 4. As we observe, the two informative patterns share some common brain areas, such as the left cuneus, left lingual gyrus, and bilateral inferior occipital gyrus. Meanwhile, each pattern also contains non-overlapping regions, which indicates that the task-specific patterns were separated successfully. For instance, brain regions including the left precentral gyrus, left middle frontal gyrus and bilateral postcentral gyrus were specific to one discrimination task, whereas the left insula, right hippocampus and right thalamus were specific to the other.
4. Conclusions

In fMRI data analysis, there are hundreds of thousands of voxels, a number much larger than the number of samples, which can result in overfitting. To address this issue, the number of features needs to be significantly reduced, and informative features have to be selected carefully in order to make the classification task efficient. In this paper, we proposed an MVPA method based on sparse representation for decoding brain states and localizing task-specific brain activation areas at the same time. Experimental results from two discrimination tasks confirmed that the method is capable of finding two sets of informative features corresponding to the two semantic categories (gender and emotion) and of decoding the two tasks with significantly high accuracy.
Acknowledgements
This work was supported by the National Key Basic Research Program of
China (973 Program) under Grant 2015CB351703, the National Natural Science Foundation of China under Grants 61633010, 91420302 and 61573150, and Guangdong Natural Science Foundation under Grant 2014A030312005.
References
[1] R. A. Poldrack, J. A. Mumford, T. E. Nichols, Handbook of functional
ED
250
MRI data analysis, Cambridge University Press, 2011.
PT
[2] K. J. Friston, A. P. Holmes, J. Poline, P. Grasby, S. Williams, R. S. Frackowiak, R. Turner, Analysis of fmri time-series revisited, Neuroimage 2 (1)
CE
(1995) 45–53. 255
[3] N. Kriegeskorte, Pattern-information analysis: from stimulus decoding to
AC
computational-model testing, Neuroimage 56 (2) (2011) 411–421.
[4] L. Reddy, N. Tsuchiya, T. Serre, Reading the mind’s eye: decoding category information during mental imagery, Neuroimage 50 (2) (2010) 818–825.
[5] F. Pereira, T. Mitchell, M. Botvinick, Machine learning classifiers and fmri:
260
a tutorial overview, Neuroimage 45 (2009) S199–S209.
13
ACCEPTED MANUSCRIPT
[6] A. J. O’Toole, F. Jiang, H. Abdi, N. Penard, J. P. Dunlop, M. A. Parent, Theoretical, statistical, and practical perspectives on pattern-based classification approaches to the analysis of functional neuroimaging data, J Cogn
265
CR IP T
Neurosci 19 (11) (2007) 1735–1752. [7] Y. Kamitani, F. Tong, Decoding the visual and subjective contents of the human brain, Nat Neurosci 8 (5) (2005) 679–685.
[8] J. V. Haxby, M. I. Gobbini, M. L. Furey, A. Ishai, J. L. Schouten,
P. Pietrini, Distributed and overlapping representations of faces and ob-
270
AN US
jects in ventral temporal cortex, Science 293 (5539) (2001) 2425–2430.
[9] K. N. Kay, T. Naselaris, R. J. Prenger, J. L. Gallant, Identifying natural images from human brain activity, Nature 452 (7185) (2008) 352–355. [10] Y. Miyawaki, H. Uchida, O. Yamashita, M. A. Sato, Y. Morito, H. C. Tanabe, N. Sadato, Y. Kamitani, Visual image reconstruction from hu-
M
man brain activity using a combination of multiscale local image decoders, Neuron 60 (5) (2008) 915–929.
275
ED
[11] J. A. Lewis-Peacock, B. R. Postle, Decoding the internal focus of attention, Neuropsychologia 50 (4) (2012) 470–478.
PT
[12] Y. Erez, J. Duncan, Discrimination of visual categories based on behavioral relevance in widespread regions of frontoparietal cortex, The Journal of Neuroscience 35 (36) (2015) 12383–12393.
CE
280
[13] J. C. Francken, E. L. Meijs, P. Hagoort, S. van Gaal, F. P. de Lange,
AC
Exploring the automaticity of language-perception interactions: Effects of attention and awareness, Scientific reports 5.
[14] K. Jimura, R. A. Poldrack, Analyses of regional-average activation and mul-
285
tivoxel pattern information tell complementary stories, Neuropsychologia 50 (4) (2012) 544–552.
14
ACCEPTED MANUSCRIPT
[15] S. M. Polyn, V. S. Natu, J. D. Cohen, K. A. Norman, Category-specific cortical activity precedes retrieval during memory search, Science 310 (5756) (2005) 1963–1966. [16] M. A. Just, V. L. Cherkassky, S. Aryal, T. M. Mitchell, A neurosemantic
CR IP T
290
theory of concrete noun representation based on the underlying brain codes, PloS one 5 (1) (2010) e8622.
[17] S. V. Shinkareva, V. L. Malave, R. A. Mason, T. M. Mitchell, M. A. Just,
Commonality of neural representations of words and pictures, Neuroimage
AN US
54 (3) (2011) 2418–2425.
295
[18] E. Formisano, F. De Martino, M. Bonte, R. Goebel, ”who” is saying ”what”? brain-based decoding of human voice and speech, Science 322 (5903) (2008) 970–973.
[19] J. Mourao-Miranda, A. L. Bokde, C. Born, H. Hampel, M. Stetter, Classi-
M
fying brain states and determining the discriminating activation patterns:
300
Support vector machine on functional mri data, Neuroimage 28 (4) (2005)
ED
980–995.
[20] Z. Wang, A. R. Childress, J. Wang, J. A. Detre, Support vector machine learning-based fmri data group analysis, Neuroimage 36 (4) (2007) 1139– 1151.
PT
305
[21] Y. Q. Li, J. Y. Long, B. Huang, T. Y. Yu, W. Wu, Y. J. Liu, C. H.
CE
Liang, P. Sun, Crossmodal integration enhances neural representation of task-relevant features in audiovisual face perception, Cerebral cortex 25 (2)
AC
(2015) 384–395.
310
[22] R. Tibshirani, Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society. Series B (Methodological) (1996) 267–288.
[23] O. Yamashita, M. Sato, T. Yoshioka, F. Tong, Y. Kamitani, Sparse estimation automatically selects voxels relevant for the decoding of fmri activity patterns, Neuroimage 42 (4) (2008) 1414–1429. 15
ACCEPTED MANUSCRIPT
315
[24] H. Zou, T. Hastie, Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society Series B-Statistical Methodology 67 (2005) 301–320.
CR IP T
[25] H. Kim, H. Park, Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis, Bioinformatics 23 (12) (2007) 1495–1502.
320
[26] L. L. Zeng, H. Shen, L. Liu, D. Hu, Unsupervised classification of major depression using functional connectivity mri, Human Brain Mapping 35 (4)
AN US
(2014) 1630–1641.
[27] H. Shen, L. Wang, Y. Liu, D. Hu, Discriminative analysis of resting-state functional connectivity patterns of schizophrenia using low dimensional em-
325
bedding of fmri., Neuroimage 49 (4) (2010) 3110–3121.
[28] Y. Q. Li, J. Y. Long, L. He, H. D. Lu, Z. H. Gu, P. Sun, A sparse
M
representation-based algorithm for pattern localization in brain imaging data analysis, Plos One 7 (12) (2012) e50332. [29] K. Kampa, S. Mehta, C. A. Chou, W. A. Chaovalitwongse, T. J. Grabowski,
ED
330
Sparse optimization in feature selection: application in neuroimaging, Jour-
PT
nal of Global Optimization 59 (2-3) (2014) 439–457. [30] T. Zhang, On the consistency of feature selection using greedy least squares
CE
regression, Journal of Machine Learning Research 10 (2009) 555–568. 335
[31] Y. Q. Li, Z. L. Yu, N. Bi, Y. Xu, Z. H. Gu, S. Amari, Sparse representation
AC
for brain signal processing [a tutorial on methods and applications], IEEE Signal Processing Magazine 31 (3) (2014) 96–106.
[32] T. E. Nichols, A. P. Holmes, Nonparametric permutation tests for func-
340
tional neuroimaging: A primer with examples, Human Brain Mapping 15 (1) (2002) 1–25.
16
Biography
Fangyi Wang received the M.S. degree in signal and information processing from Jiangxi Science and Technology Normal University, Nanchang, China, in 2012. He is currently working toward the Ph.D. degree in pattern recognition and intelligent systems at the South China University of Technology, Guangzhou, China. His current research interests include the fields of sparse representation, fMRI data analysis, pattern recognition and brain-computer interfaces.
Yuanqing Li was born in Hunan Province, China, in 1966. He received the B.S. degree in applied mathematics from Wuhan University, Wuhan, China, in 1988, the M.S. degree in applied mathematics from South China Normal University, Guangzhou, China, in 1994, and the Ph.D. degree in control theory and applications from South China University of Technology, Guangzhou, China, in 1997. Since 1997, he has been with South China University of Technology, where he became a full professor in 2004. In 2002–04, he worked at the Laboratory for Advanced Brain Signal Processing, RIKEN Brain Science Institute, Saitama, Japan, as a researcher. In 2004–08, he worked at the Laboratory for Neural Signal Processing, Institute for Infocomm Research, Singapore, as a research scientist. His research interests include blind signal processing, sparse representation, machine learning, brain-computer interfaces, and EEG and fMRI data analysis. He is the author or coauthor of more than 60
scientific papers in journals and conference proceedings.
Zhenghui Gu received the Ph.D. degree from Nanyang Technological University, Singapore, in 2003. From 2002 to 2008, she was with the Institute for Infocomm Research, Singapore. In 2008, she joined the College of Automation Science and Engineering, South China University of Technology, Guangzhou, as an associate professor. Her current research interests include the
fields of brain signal processing and pattern recognition.