Characterizing multivariate decoding models based on correlated EEG spectral features

Characterizing multivariate decoding models based on correlated EEG spectral features

Clinical Neurophysiology 124 (2013) 1297–1302 Contents lists available at SciVerse ScienceDirect Clinical Neurophysiology journal homepage: www.else...

904KB Sizes 0 Downloads 66 Views

Clinical Neurophysiology 124 (2013) 1297–1302

Contents lists available at SciVerse ScienceDirect

Clinical Neurophysiology journal homepage: www.elsevier.com/locate/clinph

Characterizing multivariate decoding models based on correlated EEG spectral features Dennis J. McFarland ⇑ Laboratory of Neural Injury and Repair, Wadsworth Center, New York State Department of Health, P.O. Box 509, Empire State Plaza, Albany, NY 12201-0509, USA

a r t i c l e

i n f o

Article history: Accepted 7 January 2013 Available online 7 March 2013 Keywords: Sensorimotor rhythm Multivariate decoding Multicollinearity Brain–computer interface

h i g h l i g h t s  Decoding algorithms are effective for classifying EEG data.  Decoding algorithm weights may be difficult to interpret.  Simple univariate analyses should accompany complex statistical models.

a b s t r a c t Objective: Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods: Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results: The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions: Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance: While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. Ó 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

1. Introduction Multivariate decoding algorithms are becoming popular methods for the analysis of brain signals (Tong and Pratte, 2012). Decoding can be used for practical applications such as communication and control (Bradberry et al., 2009). Multivariate methods can also be applied to understand basic neurophysiological processes (Naselaris et al., 2011). Multivariate decoding may offer several advantages over traditional methods of analysis. For example, it has been claimed that decoding techniques allow examination of the ‘‘mental content’’ of brain regions rather than simply the overall level of activation (Haynes, 2011). In addition, these decoding techniques may be more sensitive than traditional univariate statistical methods (Haynes, 2011).

⇑ Tel.: +1 518 473 4680; fax: +1 518 486 4910. E-mail address: [email protected]

Interpretation of multivariate models is not always straightforward however. McFarland and Krusienski (2012) discuss several issues that may arise when the predictor variables are correlated, a condition referred to as multicollinearity. A specific case was illustrated by McFarland et al. (2006) for a subject using a P300-based matrix speller. A linear prediction of the target versus standard based on samples from Cz at 0 and 240 ms produced univariate r2 values of 0.004 and 0.073, respectively, and 0.559 when the two were combined in a bivariate model. With the bivariate linear regression model, feature weights were 0.155 for time 0 and 0.171 for 240 ms. It is unlikely that the stimulus-related information contained in the EEG was nearly equivalent at 0 and 240 ms post-stimulus. Rather it is probable that the feature at time 0 served as a means of noise cancellation (i.e., it provides a baseline correction) and thus represents a suppressor variable (Friedman and Wall, 2005). Thus, weights from a multivariate model do not always represent the extent to which a feature provides independent predictive information.

1388-2457/$36.00 Ó 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved. http://dx.doi.org/10.1016/j.clinph.2013.01.015

1298

D.J. McFarland / Clinical Neurophysiology 124 (2013) 1297–1302

The author has noticed that the multivariate weights of adjacent spectral bins often show patterns of alternating sign, an effect not seen with univariate statistics. The present study is based on this observation and illustrates some potential problems with the interpretation of the weights of multivariate models. To this end the spectral patterns of weights from multivariate models were compared with the corresponding weights from univariate models in an offline analysis of data from EEG-based cursor control studies.

coded as 1 for top targets and 1 for bottom targets. All analyses used log10 values of spectral amplitudes to better approximate normality and each frequency bin was normalized to have zero mean and unit variance. Following the recommendations of Jensen (1971), r was used to express shared variance between two variables and r2 was used to express variance predicted by a model. Parameter estimates were determined using least-squares criteria and the normal equations:

ðX 0 XÞb ¼ X 0 Y

ð1Þ

2. Methods 2.1. Users For analysis of one-dimensional cursor control, the BCI users were 18 adults who participated in various EEG-based cursor control experiments at the Wadsworth laboratories. All gave informed consent for the study, which had been reviewed and approved by the New York State Department of Health Institutional Review Board. After an initial evaluation defined the frequencies and scalp locations of each person’s spontaneous mu and beta rhythm (i.e., sensorimotor rhythm) activity, he or she learned EEG-based cursor control over several months (2–3 sessions/week). The standard online protocol, which has been described in previous publications (e.g., McFarland et al., 2005), is summarized below. 2.2. Standard online protocol The user sat in a reclining chair facing a 51-cm video screen 3 m away, and was asked to remain motionless during performance. Scalp electrodes recorded 64 channels of EEG (Sharbrough et al., 1991), each referenced to an electrode on the right ear (amplification 20,000; bandpass 0.1–60 Hz; sampling rate 160 Hz). A daily session had eight 3-min runs separated by 1-min breaks, and each run had 20–30 trials. Each trial consisted of a 1-s period from target appearance to the beginning of cursor movement, a 2-s period of cursor movement, a 1.5-s post-movement reward period, and a 1-s inter-trial interval. Users participated in 2–3 sessions/week at a rate of one every 2–3 days. For online control of vertical cursor movement, one EEG channel over left sensorimotor cortex (i.e., electrode locations C3 or CP3) and/or one channel over right sensorimotor cortex (i.e., C4 or CP4) were derived from the digitized data according to a Laplacian transform (McFarland et al., 1997b). Every 50 ms, the most recent 400-ms segment from each channel was analyzed by a 16th order autoregressive (AR) model using the Berg algorithm (Press et al., 1997) to determine the amplitude (i.e., square root of power) in a 3-Hz-wide mu or beta frequency band, and the amplitudes of the one or two channels were used in a linear equation that specified a cursor movement as described above. Thus, cursor movement occurred 20 times per s. Complete EEG and cursor movement data were stored for later offline analyses.

where X is an m by n matrix formed from the n observations of m predictor variables (i.e., EEG amplitudes at specific frequencies) and Y is the vector of n values (i.e., target positions) to be predicted. Solving for b yields:

b ¼ ðX 0 XÞ1 X 0 Y

ð2Þ

Linear modeling has an advantage in this investigation by showing more clearly the operation of suppressor variables when predictors are correlated. 3. Results Spectra of r2 for each of the four AR model orders averaged over all 18 subjects are shown in Fig. 1. An analysis of variance with repeated measures on both variables indicated that the effects of model order (df = 3/51, F = 10.35, p < 0.0044 with GreenhouseGeisser correction), frequency (df = 23/391, F = 4.52, p < 0.0071 with Greenhouse-Geisser correction) and the model order by frequency interaction (df = 69/1173, F = 5.01, p < 0.0073 with Greenhouse-Geisser correction) were all significant. As can be seen in Fig. 1, the 8th order spectrum does not have the sharp peaks at 12 and 24 Hz that are present in the other spectra. The analysis was repeated without the 8th order data since this appeared to account for most of the effect seen in Fig. 1. The effects of model order (2/34, F = 19.54, p < 0.001 with Greenhouse-Geisser correction), frequency (df = 23/391, F = 5.62, p < 0.0012 Greenhouse-Geisser correction) and their interaction (df = 46/782, F = 3.87, p < 0.0138 Greenhouse-Geisser correction) were still significant, indicating that spectral sharpening occurred beyond that produced with the 16th order model. These results are consistent with those reported by McFarland and Wolpaw (2008) in that

2.3. Offline regression analysis The results presented here are from an offline analysis of data from the first 10 training sessions using channel C3 derived with the same Laplacian transform that was used online. Features for the regression analysis were the 24 1-Hz spectral bins from 6 to 29 Hz produced with either an 8th, 16th, 32nd or 64th order model. Average spectral amplitudes were computed for each trial based on 1 s windows (160 time points) with 7/8th overlap between adjacent windows. The correlations with target location of each spectral amplitude singly and in combination were computed using the multiple regression procedure from SAS (SAS Institute Inc.). Targets were

Fig. 1. Spectra of r2 for the prediction of target location with univariate autoregressive EEG features. Black represents the average of 8th order models, blue the 16th order models, green the 32nd order models and red the 64th order models. Note the sharpening of spectral peaks with higher order models. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

D.J. McFarland / Clinical Neurophysiology 124 (2013) 1297–1302

Fig. 2. Comparison of the largest univariate and the multivariate r2 values for target prediction as a function of model order averaged across all subjects. Note that the multivariate model resulted in much higher r2 values.

the 8th order model used with a sampling rate of 160 Hz is not sufficient to resolve mu and beta peaks. In contrast, the higher-order AR models produce distinct mu and beta peaks. r2 Values from the single largest univariate model and the multivariate model for each of the four AR model orders averaged over all 18 subjects are shown in Fig. 2. An analysis of variance indicated that the effects of model order (df = 1/17, F = 59.91, p < 0.0001 with Greenhouse-Geisser correction), method (df = 1/17, F = 8.73, p < 0.0065 with Greenhouse-Geisser correction) and the model order by method interaction (df = 1/17, F = 5.87, p < 0.0094 with Greenhouse-Geisser correction) were all significant. As can be seen in Fig. 2, the multivariate method always produced a greater r2, an effect that was larger at higher model orders. Thus the linear multivariate model is superior for ‘‘decoding’’ mu and beta rhythms (i.e., predicting target location). Univariate and multivariate regression weights for a single subject as a function of frequency and AR model order are presented in Fig. 3. As can be seen in Fig. 3, these two different summaries of the spectral data have markedly different shapes. The spectrum of weights for the univariate regression model based on 8th order AR features is smooth and lacks peaks. This is in marked contrast to the spectrum of weights for the multivariate regression model based on these same features. The multivariate spectrum has multiple peaks, many of these alternating between positive and negative values. As the AR model order is increased the spectrum of the univariate model weights becomes sharper and the spectrum of the multivariate weights more closely approximates those of the univariate models. The right column of Fig. 3 also includes the value of Pearson’s r for the correlation across frequency of univariate and multivariate weights. These values increase from 0.00 to 0.86 as AR model order increases, indicating a progressive increase in the similarity between the spectra of univariate and multivariate weights. In order to evaluate whether the results shown in Fig. 3 represent some peculiarity of the AR model this analysis was repeated using FFT-based spectral features using data from the same subject. Spectral resolution was varied in this analysis by varying the number of sampled data points (i.e., length of the signal) since the resolution of FFT-based spectral features in Hz is equal to the reciprocal of the length of the signal in s. A constant number of spectral bins was maintained by zero-padding so that the final length of the data was equal to the longest number of actual data points (i.e., 128). As can be seen in Fig. 4, the results were qualitatively similar those in Fig. 3. At the lowest resolution the univariate weights did not resolve the beta peak and the multivariate weights

1299

showed a pattern of alternating positive and negative values. As the resolution of the FFT-based spectral features increased the beta peak was resolved and the correlation between univariate and multivariate weights increased. These results illustrate an interesting parallel between the AR model order and the length of the FFT data window. In addition, they show that the effects of spectral resolution on the similarity between univariate and multivariate regression weights are not unique to the AR model. The remainder of our analyses will focus on this difference between the univariate and multivariate weights using the AR model. The pattern of alternating positive and negative feature weights for the multivariate models using lower order AR features seen in the subject presented in Fig. 3 was common to many of the 18 subjects in this study. This resulted in a marked difference between the spectrum of weights for univariate and multivariate models. This difference in the spectra is reflected by the value of Pearson’s r computed over the weights paired by frequency for each individual subject. The average correlation between univariate weights and multivariate weights over all subjects as a function of model order is shown in Fig. 5. Analysis of variance indicated that the effects of AR model order was significant (df = 3/51, F = 409.44, p < 0.0001 with Greenhouse-Geisser correction). As can be seen in Fig. 5, the correlation between univariate and multivariate weights increased as the AR model order increased. This increase in correlation reflects the fact that univariate and multivariate spectra became more similar as the AR model order increased. The correlations between the 12 Hz bin and neighboring bins as a function of AR model order are presented in Fig. 6. This spectral bin was selected since it was the peak r2 value with the three higher AR order univariate models. Analysis of variance (excluding 12 Hz) indicated that the effect of AR model order (df = 3/51, F = 146.46, p < 0.0001 with Greenhouse-Geisser correction), frequency (df = 11/187, F = 54.42, p < 0.0001 with Greenhouse-Geisser correction) and the interaction between model order and frequency (df = 33/561, F = 33.66, p < 0.0001 with GreenhouseGeisser correction) were all significant. As can be seen in Fig. 6, the 12 Hz bin was highly correlated with adjacent bins with the 8th order model. As AR model order increased, the 12 Hz bin became less correlated with adjacent bins. This result shows that the spectral features were more highly correlated with lower AR model orders. This effect is consistent with the results of simulations and analysis of empirical data by McFarland and Wolpaw (2008) that showed that spectral resolution depends on AR model order. Thus, the divergence of spectral shape at lower model orders is due to a greater correlation between the spectral features (i.e., multicollinearity), an effect that can be attenuated by improved signal extraction.

4. Discussion The use of multivariate regression models resulted in much better prediction of targets as compared to the simpler univariate regression models. Multivariate models produced better results using features extracted with all of the AR model orders examined in this study, a result that is not at all surprising. However with lower order AR features interpretation of the spectral patterns of the weights is difficult. This is likely due to the high degree of multicollinearity seen with AR features produced by lower model orders. These results thus show that there are potential interpretative problems with multivariate models employing correlated predictors although they may have value for purposes of prediction. An alternating pattern of positive and negative weights in the multivariate spectra was a common feature of the regression weights produced by lower AR model orders. This was in contrast

1300

D.J. McFarland / Clinical Neurophysiology 124 (2013) 1297–1302

Fig. 3. Univariate weights and multivariate weights as a function of AR model order in a single subject. Note the marked differences in the shape of the spectra for these two different models. The value of Pearson’s r for the correlation of univariate and multivariate regression weights paired by frequency for each model order is shown in the multivariate plots in the column on the right.

to the smooth nature of the spectra associated with univariate models, which generally had weights with the same sign in adjacent spectral bins. Thus, for many of the spectral bins, there was a reversal in the sign of the weights with the multivariate regression models. This reversal in the sign of the multivariate weights relative the simple correlation is diagnostic of negative suppressor variables (Tzelgov and Henik, 1991). However there is nothing negative in the suppressor itself. Suppressor variables can produce other relationships between univariate and multivariate coefficients. In the example mentioned in the introduction of target prediction from a time domain waveform at 0 and 240 ms, the suppressor effect was associated with a large increase in the 0 ms weight. In general, suppressor variables have a noise cancellation function in multivariate models and this role can be identified by their increased relevance to the prediction of the criterion variable in multivariate, as compared to univariate models. Suppressor variables operate in multivariate linear models to enhance the validity of some other variable in the model. In the present case these weights of alternating sign appear to be operat-

ing as a sort of filter that sharpens the spectrum of weights. If, for example, the greatest univariate weight is a positive value at 12 Hz, then the negative weights at 11 and 13 Hz in the multivariate regression model function to make the overall prediction based on the second derivative of the spectral function. That is, in combination these weights operate as a peak detector. An alternative way to view this relationship is that the adjacent weights suppress the variance in the 12 Hz bin that is due to the spread of variance from other frequencies. The spectral bins become less correlated and these suppressor effects tend to disappear with higher AR model order features. When all of the features in the multivariate model are uncorrelated the shape of the univariate and multivariate spectra should be identical. As noted by Kreutz et al. (1996), there is a tradeoff between the complexity of feature extraction and the complexity of the classifier. In the present case it appears that spectral resolution can be enhanced at either the feature extraction stage or at the stage of the prediction model. One might simply chose the combination that is optimal for overall system performance. However it is also

D.J. McFarland / Clinical Neurophysiology 124 (2013) 1297–1302

1301

Fig. 4. Univariate weights and multivariate weights as a function of actual FFT data points in a single subject. The final data length was zero-padded to 128 points for all cases. Note the marked differences in the shape of the spectra for these two different models. The value of Pearson’s r for the correlation of univariate and multivariate regression weights paired by frequency for each model order is shown in the multivariate plots in the column on the right.

important to consider the fact that this choice has important implications for the interpretation of the results. As illustrated with the present results, interpretation of the regression weights is difficult if the features are highly correlated. In the present study interpretive ambiguities in the decoding approach were illustrated in Fig. 3 where for a single subject, univariate and multivariate regression weights were compared. The decoding approach can be contrasted to the traditional statistical analysis of EEG features that takes the form:

EEGi ¼ f ðTargetj Þ

ð3Þ

where EEGi represents the amplitude of the EEG in the ith frequency bin. In contrast, the decoding model is:

Targetj ¼ f ðEEGi Þ

ð4Þ

The traditional model (Eq. (3)) is generally applied to determine whether there is a significant difference between EEG features as a function of the main effects of target, frequency, and their interac-

tion. In contrast, the decoding model (Eq. (4)) is applied to determine the extent to which EEG features can predict (decode) target identity. Both models use Eq. (2) to solve for regression weights. However most contemporary statistical programs do not routinely report the regression weights for applications of Eq. (3) (i.e., ANOVA models) while this is routine for applications of Eq. (4). The traditional analysis with Eq. (3) typically reports mean values for the EEG features as a function of the independent variables. In contrast, analysis with the decoding model typically report the estimated model parameters. As is apparent in Figs. 3 and 4, these two analyses provide much different indications of the effects of target identity on EEG spectral bins. The weight of a given spectral feature in the decoding model is influenced both by its covariance with target identity and by its covariance with other features in the model. This produces two effects (McFarland and Krusienski, 2012). EEG features that account for unique covariance with target identity (i.e., predictors that correlate less with other predictors in the model) tend to have larger

1302

D.J. McFarland / Clinical Neurophysiology 124 (2013) 1297–1302

Fig. 5. Average correlation of univariate and multivariate weights across frequencies as a function of AR model order. The error bars represent the standard deviations. Note that the similarity between these two measures increases as AR model order increases as indicated by the increase in correlation of weights across frequencies.

lated as illustrated in the P300 speller example discussed earlier (McFarland et al., 2006). Thus, the possible impact of multicollinearity should be considered when interpreting the weights of multivariate decoding algorithms. This can be evaluated by comparing multivariate weights with univariate statistics, as illustrated in the present study. The use of ‘‘Decoding’’ models (i.e., discriminant functions in the discrete case and random effects regression models in the continuous case) are currently very popular (Tong and Pratte, 2012). They have great utility for prediction but may present difficulties in interpretation. For example, multivariate prediction based on the lower AR model orders illustrated in Fig. 3 is superior to prediction based on any single feature, but inspection of the model coefficients does not clarify whether the BCI user is modulating narrow-band signals as would be expected if control was based on sensorimotor rhythms. In contrast, the coefficients from univariate models based on higher AR model orders clearly show distinct peaks in the mu and beta bands that are not reproducible by artifacts such as eye blinks or facial EMG (McFarland et al., 1997a). Similar considerations apply to other methods of neuroimaging. In some cases it may be possible to use signal processing methods to decorrelate predictors as was done by increasing AR model order in the present study. In other cases, such as with the correlation between closely spaced electrodes on the scalp, this may prove more difficult. In each of these cases, it is advisable to be aware of these issues and evaluate the relationship between predictors and their impact on multivariate coefficients. Acknowledgements This work was supported by grants from NIH (HD30146 (NCMRR and NICHD) and EB00856 (NIBIB and NINDS)) and the James S. McDonnell Foundation. References

Fig. 6. Average correlation of the 12 Hz bin with neighboring spectral bins for different AR model orders. Black represents the average of 8th order models, blue the 16th order models, green the 32nd order models and red the 64th order models. Note that the 12 Hz bin becomes less correlated with neighboring bins as model order increases. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

weights than EEG features that that are highly correlated with other EEG features in the model. In addition, some of the EEG features in the model may serve a role in noise cancellation (i.e., as suppressor variables). These suppressor variables may have large weights yet may not correlate individually with the criterion variable. As a result of these effects of multicollinearity, interpretation of the weights of a multivariate model may be difficult. These effects of multicollinearity could operate in either the traditional model (i.e., Eq. (3)) or the decoding model (i.e., Eq. (4)). However when the number of observations are proportional across the fixed factors on the right side of Eq. (3), the predictors are orthogonal. This is a property of fixed-effect models with equal or proportional observations that makes interpretation relatively straightforward. In contrast, the decoding model is a random effects model and the presence of correlated predictors can complicate interpretation. While the present study evaluated the effects of correlated spectral features the results are of general importance given the fact that other EEG feature dimensions are often correlated. For example, scalp recordings from adjacent electrodes are likely to produce correlated features due to volume conduction. Likewise adjacent points in time-domain waveforms will likely be corre-

Bradberry TJ, Rong F, Contras-Vidal JL. Decoding center-out hand velocity from MEG signals during visuomotor adaptation. Neuroimage 2009;47:1691–700. Friedman L, Wall M. Graphical views of suppression and multicollinearity in multiple linear regression. Am Stat 2005;59:127–36. Haynes J-D. Multivariate decoding and brain reading: introduction to the special issue. Neuroimage 2011;56:385–6. Jensen AR. Note on why genetic correlations are not squared. Psychol Bull 1971;75:223–4. Kreutz M, Volpel B, Janssen H. Scale-invariant image recognition based on higherorder autocorrelated features. Pattern Recogn 1996;29:19–26. McFarland DJ, Krusienski DJ. BCI signal processing: feature translation. In: Wolpaw JR, Wolpaw EW, editors. Brain–computer interfaces: principles and practice. New York: Oxford University Press; 2012. p. 147–63. McFarland DJ, Wolpaw JR. Sensorimotor rhythm-based brain–computer interface (BCI): model order selection for autoregressive spectral analysis. J Neural Eng 2008;5:155–62. McFarland DJ, Lefkowicz AT, Wolpaw JR. Design and operation of an EEG-based brain–computer interface (BCI) with digital signal processing technology. Behav Res Meth Ins Comput 1997a;29:337–45. McFarland DJ, McCane LM, David SV, Wolpaw JR. Spatial filter selection for EEGbased communication. EEG Clin Neurophysiol 1997b;103:386–94. McFarland DJ, Sarnacki WW, Vaughan TM, Wolpaw JR. Brain–computer interface (BCI) operation: signal and noise during early training sessions. Clin Neurophysiol 2005;116:56–62. McFarland DJ, Anderson CW, Muller KR, Schlogl A, Krusienski DJ. BCI meeting 2005 – workshop on BCI signal processing: feature extraction and translation. IEEE Trans Neural Syst Rehabil Eng 2006;14:135–8. Naselaris T, Kay KN, Nishimoto S, Gallant JL. Encoding and decoding in fMRI. Neuroimage 2011;56(400–41). Press WH, Teukolsky SA, Vetterling WT, Flannery BP. Numerical recipes in C. 2nd ed. Cambridge University Press; 1997. Sharbrough F, Chatrian CE, Lesser RP, Luders H, Nuwer M, Picton TW. American Electroencephalographic Society guidelines for standard electrode position nomenclature. J Clin Neurophysiol 1991;8:200–2. Tong F, Pratte MS. Decoding patterns of human brain activity. Annu Rev Psychol 2012;63:483–509. Tzelgov J, Henik A. Suppression situations in psychological research: definitions, implications, and applications. Psychol Bull 1991;109:524–36.