P300 brainwave extraction from EEG signals: An unsupervised approach

P300 brainwave extraction from EEG signals: An unsupervised approach

Expert Systems With Applications 74 (2017) 1–10 Contents lists available at ScienceDirect Expert Systems With Applications journal homepage: www.els...

3MB Sizes 0 Downloads 53 Views

Expert Systems With Applications 74 (2017) 1–10

Contents lists available at ScienceDirect

Expert Systems With Applications journal homepage: www.elsevier.com/locate/eswa

P300 brainwave extraction from EEG signals: An unsupervised approach Victor Lafuente, Juan M. Gorriz∗, Javier Ramirez, Eduardo Gonzalez Dept. of Signal Theory, Networking and Communications. University of Granada, Spain

a r t i c l e

i n f o

Article history: Received 28 October 2016 Revised 28 December 2016 Accepted 29 December 2016 Available online 3 January 2017 Keywords: EEG ERP Artifacts P300 classification Matched filters ICA

a b s t r a c t The P300 is an endogenous event-related potential (ERP) that is naturally elicited by rare and significant stimuli, arisen from the frontal, temporal and occipital lobe of the brain, although is usually measured in the parietal lobe. P300 signals are increasingly used in brain-computer interfaces (BCI) because the users of ERP-based BCIs need no special training. In order to detect the P300 signal, most studies in the field have been focused on a supervised approach, dealing with over-fitting filters and the need for later validation. In this paper we start bridging this gap by modeling an unsupervised classifier of the P300 presence based on a weighted score. This is carried out through the use of matched filters that weight events that are likely to represent the P300 wave. The optimal weights are determined through a study of the data’s features. The combination of different artifact cancelation methods and the P300 extraction techniques provides a marked, statistically significant, improvement in accuracy at the level of the top-performing algorithms for a supervised approach presented in the literature to date. This innovation brings a notable impact in ERP-based communicators, appointing to the development of a faster and more reliable BCI technology. © 2017 Elsevier Ltd. All rights reserved.

1. Introduction Brain-Computer interfaces (BCI) measure specific (intentionally and unintentionally evoked) brain activity signals, translating them into particular information (Allison, Wolpaw, & Wolpaw, 2007; Dornhege, del R. Millán, Hinterberger, McFarland, & Müller, 2007). Many factors limit the performance of these systems, like the natural noise in the brain signals measured, the limitations of recording devices and the processing methods that extract signal features and translate them into information, among others (Cinel, Poli, & Citi, 2004). Furthermore, the physical and mental state of the subject greatly influences the quality of the recorded signal. Among all, the event-related potential (ERP)-based BCIs (Farwell, 1988) are of great interest because they use waveforms with a well-known morphology and the subject needs no particular training (Hong, Guo, Liu, Gao, & Gao, 2009) in order to extract them. Furthermore, at least in principle, based on this prior-knowledge a high bit rate can be achieved (Allison et al., 2008). ERPs are composed of a response due to the primary processing of the external stimulus, and a later response evoked by the reflection of higher cognitive processing induced by the stimulus, which is defined as an endoge∗

Corresponding author. E-mail addresses: [email protected] (V. Lafuente), [email protected] (J.M. Gorriz), [email protected] (J. Ramirez), [email protected] (E. Gonzalez). http://dx.doi.org/10.1016/j.eswa.2016.12.038 0957-4174/© 2017 Elsevier Ltd. All rights reserved.

nous ERP (Epstein & Andriola, 1983). The P300 is an endogenous event-related potential (ERP) that is naturally elicited by rare and significant stimuli, arisen from the frontal, temporal and occipital lobe of the brain, although is usually measured in the parietal lobe (Polich, 20 07). P30 0 is contained in the 0.15 − 5 Hz frequency range whose peak often appears between 300 ms and 600 ms after related events happen. The smaller the probability of related event is, the more prominent the P300 will be (Citi, Poli, & Cinel, 2010). Traditionally, two approaches are admissible to deal with any detection problem, the supervised and the non-supervised approaches. In supervised learning the detection algorithm adjusts its parameters through a learning process based on a training dataset, that is, a set of input patterns with known outcomes. Some examples include ERP or epileptic seizure detection from the EEG signals (Cinel et al., 2004; Gao, Cai, Yang, Dang, & Zhang, 2016). Most studies have been focused on a supervised approach to the ERP detection problem (Cinel et al., 2004; Donchin, Spencer, & Wijesinghe, 20 0 0; Lotte, Congedo, Lecuyer, & Lamarche, 2007), that aims to identify the presence of P300s by a training stage. This approach adapts to the statistical profile of a particular user’s P300 in an incoming and unknown EEG generated by the same user. Thus, these methods impose the requirement of a previous training stage and the risk of obtaining an over-fitted algorithm. On the other hand, the goal for unsupervised learning is to model the underlying structure or distribution in the data in

2

V. Lafuente et al. / Expert Systems With Applications 74 (2017) 1–10

Fig. 2. General structure of the algorithm.

Fig. 1. Character matrix used in the Donchin Speller (Donchin, 1981).

order to extract relevant information. In EEG processing, priorknowledge of the processes involved in the data generation can be very useful to determine an effective time series model. In the aim of addressing the need of further data optimization and removing the hazard of over-fitted parameters during the design, we present a novel approach that avoids the traditional supervised model by looking for specific features in the data, instead of similar features previously found in a training stage. In this paper we theoretically model and design an algorithm that processes the raw signal and extracts the P300 features through a matched filter (Borjesson, Pahlm, & Sornmo, 1892). The structure of the algorithm is shown in Fig. 2. This well-known methodology constitutes an important building block in many detection schemes as a part of the field of time series analysis (Gao et al., 2015; Sörnmo & Laguna, 2005). In addition, an artifact-free signal is required to be obtained in order to avoid type I&II errors in the classifier decision. Therefore, an automatic artifact cancelation stage is designed and implemented as a preprocessing step. By achieving the overall system allows us to substantially reduce the computational load, as well as to discard noisy data recordings in the training stage, instead of those relevant for stimulus detection (Lotte et al., 2007). An unsupervised P300 detection algorithm is a significant step forward, easing its implementation in different fields such as BCI spellers for locked-in syndrome patients or ERP-based polygraphs (FazelRezai et al., 2012; Pfister & Foerster, 2014) among many others (Farwell, 1995). In fact, P300-based Guilty-Knowledge Tests are already been suggested as an alternative approach for conventional polygraphy (Abootalebi, Moradi, & Khalilzadeh, 2009; Farwell & Donchin, 1991), grounded in feature extraction algorithms (Wang, Chang, & Zhang, 2016). 1.1. Materials To evaluate the proposed method, data from a Donchin ERPbased speller is investigated (Donchin, 1981; Donchin et al., 20 0 0). In this speller, users are presented with a 6 by 6 character matrix whose rows and columns are randomly highlighted without replacement for a short period (0.1 s) and one at a time. Users focus their attention on the character they want to input, having the flash of the row and column containing the desired character as target stimuli (see Fig. 1). This task is performed approximately 20 times for every single letter, subjects are required to count how many times their desired character flashed. Each of the 12 subjects of this study spells 20 letters. During the recordings, subjects were seated with the neck supported by a C-shaped pillow to minimize muscular artifacts as shown in Goldberger et al. (20 0 0). The eyes were at approximately 80 cm form a 22-inch LCD screen with

60Hz refresh rate. The 12 participants had their brain and ocular activity recorded through a BioSemi ActiveTwo EEG system (Citi et al., 2010), with 64 electroencephalography (EEG) electrodes that are located in standardized positions following the 10–20 International System. Furthermore, 4 electrooculography (EOG) electrodes, positioned in vertical and horizontal pairs around the eyes, are also placed, as well as one last pair in the earlobes for referencing. 2. Data preprocessing Before applying the classification algorithm some preprocessing is performed including noise-artifact removal, filtering, segmentation and feature extraction in terms of relevant feature vectors for ERP detection. 2.1. Band-Pass Filtering, referencing and demean Band-Pass Filtering is applied to remove most of the undesired frequency components contained in the signal (Rakotomanonjy & Guige, 2008) that was sampled at Fs = 2048 Hz. A FIR Band-Pass Filter with N = 1800 coefficients and pass band frequencies at 0.15 Hz and 5 Hz is applied, keeping a trade-off between the transition band and delay, as suggested in Ghaderi, Kim, and Kirchner (2014). This filtering process reduces the power line interference (50 Hz), most of the background white noise and several artifacts, such as the muscular artifact (50–100 Hz) and the one created by the reaction of the sweat at the electrodes surface (> 0.4 Hz) (Sörnmo & Laguna, 2005). However, there are some noisy signals that are not possible to be removed with filtering, because their frequency spectrum overlaps the one of the P300. These signals are the ocular artifacts, composed by eye movements and blinks. We use the 4 EOG electrodes to collect data from these artifacts in order to cancel them in a further stage of preprocessing (Fortgens & Bruin, 1983). In order to enhance the SNR in the blinks, we subtract the mean of the EOG signals and then subtract one of the vertical EOG channels from the other. Furthermore, by adding the horizontal EOG channel, we obtain a channel with an improved eye movement signal. Finally, the addition of both new signals conform a high-SNR signal of the ocular artifacts that are used to perform an efficient blink cancelation. 2.2. Artifact cancelation Artifact cancelation is a key task to be performed to obtain artifact-free data composed of only brain activity sources (Barlow, 1986). The effect can be observed in Fig. 3. An automatic artifact cancelation method is applied to each one of the 64 EEG channels separately. It is worth mentioning that the purpose of our artifact cancelation stage is not suited to cancel the saccadic spike potential (SP) because their shape does not resemble the P300 waveform, thus their presence will not contaminate the results. In addition, the SP artifact is specially present in the frontal and temporal lobe (Keren, Yuval-Greenberg, & Deouell, 2010) and not in the parietal lobe, where our main analysis were performed.

V. Lafuente et al. / Expert Systems With Applications 74 (2017) 1–10

3

Fig. 3. Blink signal embedded in different EEG and EOG channels.

Firstly, we carry out an Independent Component Analysis (ICA) (Hyvärinen & Oja, 20 0 0) on the database, which is applied to separate the artifact component from the purely brain-source signal. ICA is a method for blind source separation of a multivariate dataset that transforms observations to new statistically independent linear forms (ICs). It effectively performs blind source separation (BSS) without any prior information by assuming that the observed signal is a linear combination of a number of independent original signals (Xiaoyan, Douzhe, & Youer, 2009). Previously, a whitening process is applied to the data in order to eliminate any signal correlation and to reduce the degrees of freedom of the mixing matrix that defines the relationship between the ICs and the observed signal (Hyvärinen & Oja, 20 0 0). The whitening operation is shown in Eq. (1), starting with a 2 × N array X composed by an EEG channel and its correspondent tetrapolar EOG channel:

Xw = E D−( 2 ) E T X 1

(1)

where E is the 2 × 2 orthogonal eigenvector matrix of the covariance matrix and D is a 2 × 2 diagonal matrix with the eigenvalues of the same matrix. Finally, the output matrix Xw is a 2 × N array that stands for the whitened signal. ICA is executed separately on the two independent components among the two observed signals, that is, the EEG channel and the tetrapolar-referenced EOG channel. Once the ICs are extracted, the artifact is identified through correlation and the artifact-free component amplitude is restored using the standard deviation of the original signal. Lastly, a Wiener filtering is applied to the ICA output signal in order to achieve a more efficient artifact cancelation (Gorriz et al., 2009). Wiener filtering is a statistical method that reduces noise in a signal by minimizing the round mean squared error (RMS) (Kim & Kim, 2011) between the input signal and a desired signal. The system output is the difference between the desired signal and the filtered signal. The Wiener filter is employed in a two-step form by successive filters with N = 250 coefficients having as input the bipolar horizontal and vertical EOG channels respectively (Kim & Kim, 2011). The desired signal is the original data in the first step and the output of this stage in the second one. The EOG channels have very little information about brain activity, thus the data is expected to stay uncorrupted through this method. The effective result of the artifact cancelation (EOG component) is shown in Fig. 4.

The preprocessing stage, specially the artifact cancelation algorithm, raises the SNR in our data, keeping the brain activity information intact. In order to compare results, another artifact cancelation that directly removes the events with the presence of blinks or eye movements from the signal (Salzberg, 1976) is carried out as shown in the experimental part. 2.3. Event segmentation and averaging An event is defined as the time interval from the stimuli reception by the subject to 1 second after this instant. The data recordings are segmented in order to process the events separately. Then, events within the same channel with same information are gathered (the same desired character, highlighted row or column or EEG electrode) and averaged. We reject events that contain the same information as the previous event, because in this case the evoked potential has a very low amplitude (Citi et al., 2010). This matrix containing all the segments is the input to our classifier. 3. Classification stage The proposed classifier receives information from six rows and six columns and determines the desired character by the use of a matched filter. In this novel scheme a model for the P300 signal, obtained by averaging all the events that contain a P300 in our dataset, is required as shown in the following. 3.1. Window selection and contextual analysis The classification stage is only applied to the data window more likely to contain the P300 wave under the model presented in following section. According to Sörnmo and Laguna (2005), this window is contained in the time lapse between 0.3 and 0.6 s after the reception of the event. On average the P300 signal should show not only a prominent peak in the time lapse formerly mentioned, but also a significant decrease outside the time lapse as shown in Fig. 5. We use this information to design a discriminative preclassifier (Lotte et al., 2007) that discards some of the input signals before it gets into the classifier, reducing the possibilities of error and computational load. A time lapse is set around the signal maximum and two equal time lapses outside the selected window where amplitude averages are evaluated.

4

V. Lafuente et al. / Expert Systems With Applications 74 (2017) 1–10

Fig. 4. EEG channels before and after artifact cancelation.

Fig. 5. Cardinal-like shape of the averaged P300. The figure on the right is obtained by averaging the ERPs with different number of non-targets preceding the current stimulus h.

The contextual analysis (Gorriz, Ramirez, Lang, & Puntonet, 2008; Ramirez, Gorriz, Segura, Puntonet, & Rubio, 2006) is based on the definition of specific regions, as shown in Fig. 7, to evaluate a decision rule. The aim of this decision rule is to adapt the input signal to the signal model by discarding segments that do not clearly meet the model assumptions, i.e. zero-mean noise. This is similar to the heuristic rule designed in Citi et al. (2010) by means of quartile intervals for rejecting ocular artifacts. The algorithm recognizes three regions that are defined beforehand; an area around the signal peak (C: center of the window), and two areas (AP, APT: a-priori and a-posteriori contextual areas) placed at the sides of the chosen data window, where the P300 wave is not physiologically present. In order to avoid evaluation errors due to occasional high-latency P300s, the last no-P300 region (the a posteriori area) is shifted 0.05 s. This methodology has been successfully applied to other fields (Gorriz et al., 2015) although, in this case, this analysis based on the time-domain provides a score for each observed signal x = [x(0 ), . . . , x(N − 1 )]. The detector decides

accepting the segment if the score exceeds a threshold η or:



(x ) =

1 1 x (n ) − D N−D C





x (n )



(2)

{AP}∪{APT }

where D is the number of samples in C and N is the number of samples in the operation window. The position and width of the regions is experimentally chosen by studying the main features of the P300 (Citi et al., 2010). The selection of these parameter is usually based on prior reviews of the basic physiology underlying these potentials (Pritchard, 1981), i.e. the width of the peak area is fixed to D/Fs = 0.1 s around the maximum and likewise, the width of the no-P300 areas are set to (N − D )/Fs = 0.25 s each, as shown in Fig. 7. In addition, the threshold η is selected to be conservative in order to accept the segment, i.e. the score  (x) should be at least the 75% of the one obtained by the P300 model shown in Fig. 5. As an example, given Fs = 2048, N = 1228 and D = 204, using the latter P300 model we obtain  (P300) ࣃ 0.5356, thus

V. Lafuente et al. / Expert Systems With Applications 74 (2017) 1–10

η ∼ 0.4017. Afterwards, by evaluating the scores obtained by the P300s of our dataset, we experimentally checked that segments that fulfilled Eq. (2) were more than the 99% of the P300 signals, as well as a reduced number of non-P300 signals that were discarded in the following steps of the matched filter-based classifier. If a signal passes through the discriminative pre-classifier, the moving average of the samples included in the operation window is subtracted (x = x − x¯ ). 3.2. Matched filter The use of a matched filter (Borjesson et al., 1892) for the extraction of the P300 wave is the main goal in this paper, allowing the proposal of an unsupervised approach to this problem. This well-known technique is based on assuming that the P300 signal, from now on s(n), is a known waveform, i.e. morphology quite similar to a cardinal sinus (see Fig. 5), disturbed by an additive zeromean, white, Gaussian noise v(n). A matched filter seeks for the known waveform adjusting automatically its occurrence time, amplitude and duration. As a result, the more similar the input signal is to s(n), the bigger the filter output will be. Therefore, this filtering accounts for both the amplitude and the morphology of the input signal. Let assume a stationary signal x(n) to be composed by a zero-mean white Gaussian stationary noise v(n) defined by its variance σv2 and the desired signal s(n) as:

⎧ for 0 ≤ n ≤ θ − 1 ⎨v(n ), n −θ x(n ) = as β + v(n ), for θ ≤ n ≤ θ + D − 1 ⎩ v ( n ), for θ + D ≤ n ≤ N − 1

The model presents three unknown parameters: the amplitude a, the occurrence time θ and the duration β > 0. We limit β to a discrete set of possible values, applying a linear uniform discretization:

βt (l ) = βmin + l β for l = 0, 1, . . . , lmax

(4)

where β is the duration step size and βmin is the shortest duration of physiological interest. Here a compromise has to be kept with both the need of keeping the computational load as low as possible and the minimum requirements of precision. Our goal is to choose the values of the parameters so that the log-likelihood function is maximized (ML).

[θˆ , lˆ, aˆ] = arg max ln pv (x; a, θ , l )

(5)

θ ,l,a

where x(n ) = [x(0 ), . . . , x(N − 1 )]T . In order to find the optimal solution for the three parameters, we first determine an estimate of the amplitude a as a function of θ and l. After that, we apply a maximization regarding θ and l. Under the Gaussian-white noise v(n) assumption in Eq. (3), the joint log-likelihood probability density function (pdf) of the observed signal p(x) is given by:

ln pv (x; a, θ , l ) = − +

1 2σv2

θ+ D+1  n=0

N−1 N 1  2 ln 2π σv2 − x (n ) 2 2σv2 n=0



n−θ 2ax(n )s l





2 2

−a s

n−θ l

time-reversed replica of s(n):

y (θ , l ) =

θ+ D−1 

(6)

where the statistical independence among the signal components in the observation window is assumed, that is, p(x ) = N−1 n=0 p(x (n )). The maximization of the log -pdf in Eq. (6) with respect to the set of parameters {θ , l, a}, only affects the cross-term and can be interpreted as a filtering operation. The matched filter is defined by the following equation, where h represents the array of the matched filter coefficients, whose impulsive response is a

x(n )h(θ − n, l )

n=θ

=

θ+ D−1 

x ( n )s ( n − θ , l )

(7)

n=θ

Since s(n) is completely contained in the observation interval its

+D−1 2 energy Es (l ) = θn= s (n − θ , l ) is only dependent on the signal θ duration, that is, the parameter l. Hence, Eq. (6) can be rewritten as follows:

ln ( pv (x; a, θ , l ) ) = C +



1 2 σv

2

ay(θ , l ) − a2 Es (l )



(8)

The log -pdf is maximized in two steps. The first step is performed by differentiation with respect to the continuous-valued parameter a, resulting in an estimator aˆ being a function of θ and l which is used to determine an estimator of θ . Deriving with respect to a yields:

δ ln ( pv (x; a, θ , l ) ) 1 = 2 (y(θ , l ) − aEs (l ) ) δa σv

(9)

that setting to zero, it becomes:

aˆ(θ , l ) = max l

(3)

5

y (θ , l ) = max(y˜(θ , l ) ) Es ( l ) l

(10)

The estimator of a is thus equal to the energy-normalized output of the matched filter. Inserting the result of Eq. (10) into Eq. (9), it yields the following ML estimator:



θˆ = arg max θ

1



max Es (l )yˆ2 (θ , l ) 2

2 σv



(11)

From Eq. (11) the optimal values of θ and l are determined, and then the amplitude is computed using Eq. (10). We also carry out a matched filter to our P300 model to eliminate possible undesired features in the model. This may happen because the model, obtained from our original data, is obtained by averaging a finite number of P300s. The resulting output signal will be used as input into a correlation and normalization block to appraise its morphology and for a maximum comparator block to find how prominent the P300 signal is. This task becomes easier for the comparator because, using this approach, the signal amplitude has a direct relationship with the presence (or absence) of the P300. 3.3. Correlation and normalization In order to evaluate the similarity between the matched filter’s output and the P300 morphology, a cross-correlation between them is computed. It is important to note that, since our approach is an unsupervised method, it is key to obtain a P300 model, as a reference, to be looked for in our EEG signal. The generic P300 waveform is obtained from averaging all the signals marked as EEG responses to a target stimulus from our recordings. To perform such comparison, the amplitude of both signals should be normalized in order to avoid the distortion created by the different energies in the twelve input signals (six rows and six columns). By applying the latter block to our data (see Fig. 6), we increase the detection accuracy of the system since the data containing a P300 is expected to be much more similar to the realistic P300 model. An example of the signal processing along the different blocks so far is shown in Fig. 8. In this particular example the target signal is displayed as the more prominent one in green font. The system correctly identifies the ERP after both the matched filter and the

6

V. Lafuente et al. / Expert Systems With Applications 74 (2017) 1–10

(rows and columns indexed by {r, c}) prior to character selection. The input signals to them are the output of the matched filter and the output of the cross-correlation operation with a realistic P300 model. In order to take into account the information of both outputs and their different relevance we investigate a weighted scorer. Thus, the selected character {rˆ, cˆ} among the set of characters C is chosen by maximizing the following equation:

{rˆ, cˆ} = arg max{α yrc (θˆ , lˆ) + rsrcˆ,y (θˆ , lˆ)}

(12)

C

where α is experimentally tuned and set after an optimization, sˆ stands for the realistic P300 signal model and rsrc (θˆ , lˆ) = ˆ,y

Fig. 6. Schematic representation of the classification algorithm.

cross-correlation operations. In the following section an improvement of this decision rule based on the combination of both processes is described. 3.4. Maximum comparator and weighted scorer Both comparator blocks used in our algorithm (see Fig. 6) carry out the same tasks: the peak detection of the twelve input signals

E{sˆ(n )y(n; θˆ , lˆ)} is the cross-correlation between the matchedfilter output signal and the P300 signal model. As shown in Eq. (12) the weighted scorer estimates the character by choosing the highest-scored row and column. Therefore, the only character that belongs to both row and column is the chosen one. After this process a score result of every signal (character) is provided in addition to the results provide by considering just the comparator stage (see Fig. 6). By applying this adjustment, we avoid both occasional N400 signals elicited by the experiment and low amplitude signals with similar shape to the P300 to achieve a high score at the final stage. To rule the former out, an ERP potential similar in shape, occurrence and frequency, but with a negative peak (Kutas & Federmeier, 2011), is specially relevant to the accuracy improvement due to this stage. In order to balance the influence of the matched filter and the maximum comparator, the former is adjusted with a parameter α < 1 set to be high enough so the matched filter have the desired impact in performance and low enough to keep the role of the latter stage. Given that the tuning parameters of the overall system (the observation window, the pre-classifier threshold η and the weighted-scorer α ) are extracted using physiological information and averaged magnitudes from a set of subject, they are not expected to need fine adjustments. Furthermore, even though these parameters play a significant role, they have limited impact in the algorithm’s accuracy, therefore the risk of over-fitting is minimum.

Fig. 7. Contextual analysis applied to a signal with P300 presence. Green lines define the area considered by the classifier. Pink lines delimit surroundings of the signal’s peak. Red lines show areas where no P300-like peak is expected. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

V. Lafuente et al. / Expert Systems With Applications 74 (2017) 1–10

7

Fig. 8. Signal processing under six different highlighted columns across the classifier and pre-classifier stages. Note the green signal represents an event that contains a P300 while the others are NERPs. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.) Table 1 Classification results using all the proposed approaches and current baseline methods. Classification results Method

Description

Accuracy (%) (total/row/column)

Classifier 1 Classifier 2 Classifier 3 Final Classifier Proposed Algorithm Citi et al. (2010) Rakotomanonjy and Guige (2008)

ML matched filter Classifier 1 + correlation with P300 Classifier 2 + Eq. (12) Classifier 3 + Section 3.1 Final Classifier + Section 2.2 Weighted average of the responses produced by a SVM during multiple stimulus presentations. Ensemble of SVMs averaged in sequences

36.11/72.22/83.33 72.22/77.77/88.88 83.33/88.88/91.66 86.11/94.44/88.88 91.66/94.44/94.44 87.50/–/– 96.00/–/–

4. Experimental analysis

0.95

Once the classifier and the preprocessing stage have been designed, the system is evaluated on the proposed database in Goldberger et al. (20 0 0), by following different procedures and optimizing the required parameters consequently.

Table 1 shows the most relevant results of our study, applying a slight tuning of the window selection from the base position set by the theoretical P300’s occurrence time ([0.325, 0.344]) and width ([0.122, 0.161]), the pre-classifier threshold η ∈ [0.28, 0.68]] and the parameter α ∈ [0.02, 0.06] (the center value of this parameter depends on the normalization process previously performed on the signals). However, as the average of the brain activity using several subjects is considered for this selection, the assumption that the parameter setting achieved for our database remains valid for any similar EEG database is admissible. This assumption is strengthened by the observation of a small variation in accuracy when tuning the different parameters, which provides stability to the algorithm (see Fig. 9). Our approach is compared with previous works based on a supervised scheme, i.e. Citi et al. (2010) and Rakotomanonjy and Guige (2008), obtaining a similar performance on average but avoiding the use of the training stage of the latter methods. However, the latter methods achieve peak accuracies slightly higher than the proposed method (up to 95%) for a specific combina-

Accuracy

4.1. Result evaluation

0.9

0.85

0.8 Character

Column

Row

Sequence Fig. 9. BoxPlot of the proposed method for the different sequences (columns and rows) and the overall accuracy. The relevant model parameters are varied around the baseline setting to obtain the complete set of results.

tion of parameters, i.e. for a number of sequences equal to 15 in Rakotomanonjy and Guige (2008) or for some subjects of the database in Citi et al. (2010). Fig. 10 depicts the accuracy of several methods showing the total accuracy percentage and a “combined

8

V. Lafuente et al. / Expert Systems With Applications 74 (2017) 1–10

Fig. 10. Accuracy of each method shown in 1. The data window was defined from 0.33 s to 0.48 s according to physiological values and the relevant parameters were experimentally tuned to α = 0.04 and η = 0.4.

1

0.95

Total accuracy

accuracy”, obtained by multiplying both total and partial (row and column) accuracy percentages. All the results shown in the paper analyze the system performance applied to the typical Cz and the CPz electrodes, located in the parietal lobe, according to the 10 − 20 International system for electrode placement. The other channels have been used to test the presence of the P300 signal in different lobes of the brain and to demonstrate the suitability of the selected parameters. The same parameter values for the working electrodes provide the best possible accuracy values for most of the remaining electrodes. In addition to this analysis, we ran a paired t-test based on the final classifier described in Table 1 and the algorithm in Citi et al. (2010). For this analysis we used two reference electrodes Cz and Cpz and a variation of the relevant parameter α as shown in Eq. (12). The variation ran from 0.02 to 0.6 around the mean value used in the overall results α = 0.04. The accuracy values used for the baseline were the ones of the speller for the two algorithms, choosing the minimum number of repetitions for which either one reaches an accuracy of 80% and the optimum q value (Citi et al., 2010). The test confirmed that the accuracy improvement of the proposed algorithm w.r.t. the non-weighted one is statistically significant ( p = 0.0124, t = 2.7968) with a 95% confidence interval with a standard deviation of 5.18%. Unlike the paired t-test of the improved version of both approaches (our approach versus the weighted scoring algorithm) which reveal statistical similarity between their distributions ( p = 0.1192, t = −1.6407). The database used in this paper (Goldberger et al., 20 0 0) has been used to confirm the applicability and benefits of novel approaches on a larger and independent group of subjects. Other databases, intended for the BCI competition III (Blankertz et al., 2006), as the ones used in Rakotomanonjy and Guige (2008) or Citi et al. (2010), were not considered in this paper since they contain only two subjects although include several repetitions of the 12 stimuli to classify a character. Finally, we ran a one-way anova test between both effective methods for P300 detection in order to test the similarity of the means for both approaches. To this purpose, we carried out a complete statistical analysis of the proposed method, including the latter variation of the parameter α , the contextual-parameter η (ranging from 0.28 to 0.68) and the data window length (around ± 15% of the mean value set to D/Fs ∼ 150 ms) as defined in Section 3.1.

0.9

0.85

0.8

0.75 matched filter algor.

weighted scoring algor.

Methods Fig. 11. BoxPlot of the overall accuracy results using the unsupervised approach based on a matched filter with varying parameters and the optimum method proposed in Citi et al. (2010).

The results are conclusive and the null hypothesis cannot be rejected, thus the means are not statistically different. Mathematically, the result of the test was F = 0.59, with a p = 0.4445 and means (0.8761, 0.8834) with std = 0.0496. The box plot of both sample sets is shown in Fig. 11. In addition, it is worth mentioning that the contextual analysis based pre-classifier reduces the computation time by 27% of the total calculations, due to a significant reduction in the data to be processed by the classifier. The computation time is compared with the same method without the pre-classifier block. Our algorithm performs efficiently, achieving an accuracy of 92%. It is important to note that accuracy over 90% assures intelligibility in the text produced by the Donchin speller (Kubler, Mushahwar, Hochberg, & Donoghue, 2006). The results are benchmarked to other relevant studies in the field to assess the quality of the proposed algorithm, yielding a performance at the level of the best supervised

V. Lafuente et al. / Expert Systems With Applications 74 (2017) 1–10

9

Fig. 12. Number of available events in the database for each target character.

approaches found in the literature (Citi et al., 2010; Rakotomanonjy & Guige, 2008), but adding a key advantage of being an unsupervised procedure. In Citi et al. (2010) a higher peak accuracy through ERP detection is achieved, however it needs a training set of data and a later validation stage to relieve over-fitting. The cross-validation (CV) used in this paper is “leave-one out” CV for the error estimation that is unavoidably biased and optimistic (Varma & Simon, 2006). In Rakotomanonjy and Guige (2008) a 96% of accuracy is reached using the BCI Competition III dataset (Blankertz et al., 2006), however, the method is based on a supervised SVM algorithm on a dataset containing only two subjects. Therefore, this approach also needs a training set and a significant number of data sequences to improve performance. It is worth mentioning that our method could be updated to be a supervised one by training the model parameters using a CV scheme. 4.2. Discussion The ocular artifact, specifically the one provoked by a blink, is a pointed wave with some similarity with the P300, but higher in amplitude and away from the cardinal sinus morphology. This artifact must be canceled because of its randomness that could make the classifier fail, mixing both signal waveforms. However, it is important to note that an accuracy drop is obtained if we remove the events with the presence of artifacts, compared with the data processing without any artifact cancelation. This result points out that there is significant information embedded in the events with the presence of artifacts, so there could be numerous P300 waves behind these artifacts, lost after the removal of the data. Thus, this theoretically implies that there is a more frequent presence of P300 waves evoked at the same time as the ocular artifacts happen, what leads us to hypothesize that having the desired character highlighted generates not only a ERP in the brain, but also in many cases an ocular movement in this database. This hypothesis is also backed by the fact that the subjects are required to count how many times the target character gets illuminated. The internal count could provoke, intentionally or unintentionally, an eye movement to help this operation (Fatourechi, Bashashati, Ward, & Birch, 2007; Reyes et al., 2012). If this theory is accepted, it would explain the exceptionally good results of the classifier with a signal without its artifacts canceled. Thus, the presence of artifacts actually helps to achieve them, by making the classifier erroneously identify an artifact as a P300 signal in events that are actually likely to have a P300 embedded in them. Separately, another issue to be discussed is why the final accuracy is not 100%, or even the obtained one is a remarkable percentage. We note that the wrongly estimated characters are often the same ones, eliminating the hypothesis that those characters could have something that differs from the others. After the study of this specific feature we found

out, as shown in Fig. 12, that the available data for these problematic characters is way lower than for the rest. Target characters were randomly chosen before the beginning of the run, thus this situation could happen due to the limited sample size. Overall, this study has allowed us to evaluate the possibility of extracting the P300 wave without supervised learning and thus avoiding the use of number of test trials for calibration, demonstrating the potential of this novel (and complementary) approach in the field. In this sense, we did not perform a deep optimization or a supervised learning scheme in order to avoid over-fitting problems, which could have led us to an even higher performance. This feature suggests the utility of our method in these fields, using the calibration trials to optimize the parameters that should be evaluated in further studies. Furthermore, the link between ocular artifacts and the presence of P300s points out an interesting possibility of improving P300 extraction through the artifact presence, such as some already existing artifact-based BCIs (Fatourechi et al., 2007; Reyes et al., 2012). Finally, it is worth mentioning that recent research on BCIs based on P300s evoked through auditory (Schreuder, Blankertz, & Tangermann, 2010) or tactile (Brouwer & vanErp, 2010) stimuli, has been proposed. There is no reason to think that the method, previously cited to exploit visual ERPs, could not be applied also to ERPs generated through stimuli presented in a different modality (Fazel-Rezai & Ahmad, 2011). 5. Conclusion As a conclusion the ensemble of both the matched filter and contextual analysis is a tool of great utility for the development of unsupervised learning systems and yields an accuracy up to 91.66% for the P300 detection in a Donchin ERP-based speller experiment. In this way, our approach reaches a similar performance with the same statistical significance of some recently proposed classification algorithms, such as Citi et al. (2010) and Rakotomanonjy and Guige (2008). Unlike the latter methods, the need of a training stage is avoided (but could be included), as well as the later statistical validation stages, thus minimizing the over-fitting risk. Moreover, our method makes a meaningful contribution to the field by presenting a general, non-heuristic method based on several innovations that could be combined with other algorithms for feature extraction, such as Principal Component Analysis (PCA), Partial Least Squares (PLS) or other supervised methods (Bostanov, 2004). Acknowledgments This work was partly supported by the MINECO under the TEC2015-64718-R project and the Consejería de Economía, Innovación, Ciencia y Empleo (Junta de Andalucía, Spain) under the Excellence Project P11-TIC-7103. Finally, thanks to the reviewers for

10

V. Lafuente et al. / Expert Systems With Applications 74 (2017) 1–10

all the time you have taken to revise the article and for all your valuable comments. References Abootalebi, V., Moradi, M. H., & Khalilzadeh, M. A. (2009). A new approach for eeg feature extraction in p300-based lie detection. Computer Methods and Programs in Biomedicine, 94(1), 48–57. Allison, B. Z., McFarland, D. J., Schalk, G., Zheng, S. D., Jackson, M. M., & Wolpaw, J. R. (2008). Towards an independent brain–computer interface using steady state visual evoked potentials. Clinical neurophysiology, 119(2), 399–408. Allison, B. Z., Wolpaw, E. W., & Wolpaw, J. R. (2007). Brainâcomputer interface systems: Progress and prospects. Expert Review of Medical Devices, 4(4), 463–474. Barlow, J. S. (1986). Artifact processing rejection and minimization in eeg data processing. Handbook of electroencephalography and clinical neurophysiology, 2, 15–62. Blankertz, B., Muller, K. R., Krusienski, D. J., Schalk, G., Wolpaw, J. R., Schlogl, A., . . . Birbaumer, N. (2006). The bci competition iii: Validating alternative approaches to actual bci problems. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2), 153–159. doi:10.1109/TNSRE.2006.875642. Borjesson, O., Pahlm, L., & Sornmo, L. (1892). Adaptive qrs detection based on maximum a posteriori estimation. IEEE Transactions on Biomedical Engineering, BME-29i. Bostanov, V. (2004). Bci competition 2003-data sets ib and iib: Feature extraction from event-related brain potentials with the continuous wavelet transform and the t-value scalogram. IEEE Transactions on Biomedical Engineering, 51(6), 1057–1061. Brouwer, A. M., & vanErp, J. B. (2010). A tactile p300 brain-computer interface. Frontiers in Neuroscience, 4(19). Cinel, C., Poli, R., & Citi, L. (2004). Possible sources of perceptual errors in P300-based speller paradigm. In Biomedizinische technik. proceedings of 2nd international bci workshop and training course: 49 (pp. 39–40). Citi, L., Poli, R., & Cinel, C. (2010). Documenting, modeling and exploiting p300 amplitude changes due to variable target delays in donchin’s speller. Journal of Neural Engineering, 7, 1–21. Donchin, E. (1981). Surprise!... surprise? Psychophysiology, 18(5), 493–513. Donchin, E., Spencer, K. M., & Wijesinghe, R. (20 0 0). The mental prosthesis: Assessing the speed of a p300-based brain-computer interface. IEEE transactions on rehabilitation engineering, 8(2), 174–179. Dornhege, G., del R. Millán, J., Hinterberger, T., McFarland, D., & Müller, K.-R. (2007). Toward brain-computer interfacing. Cambridge, MA: MIT Press. Epstein, C. M., & Andriola, M. R. (1983). Introduction to EEG and evoked potentials. J. B. Lippincot Co. ISBN 0-397-50598-1. Farwell, L. (1988). Talking off the top of your head: Toward a mental prosthesis utilizing event-related brain potentials. Electroencephalography and Clinical Neurophysiology, 70(6), 510–523. Farwell, L. (1995). Method and apparatus for truth detection. US Patent, 5(406), 956. Farwell, L. A., & Donchin, E. (1991). The truth will out : Interrogative polygraphy (“lie detection”) with event-related brain potentials. Psychophysiology, 28(5), 531–547. Fatourechi, M., Bashashati, A., Ward, R. K., & Birch, G. E. (2007). Emg and eog artifacts in brain computer interface systems: A survey. Clinical Neurophysiology, 118(3), 480–494. Fazel-Rezai, R., & Ahmad, W. (2011). P300-based brain-computerinterface paradigm design. Recent Advances in Brain-Computer Interface Systems. Fazel-Rezai, R., Allison, B., Guger, C., Sellers, E., Kleih, S., & Kbler, A. (2012). P300 brain computer interface: Current challenges and emerging trends. Frontiers in Neuroengineering, 5, 14. Fortgens, C., & Bruin, M. D. (1983). Removal of eye movement and ecg artifacts from the non-cephalic reference eeg. Electroencephalography and Clinical Neurophysiology, 56(1), 90–96. Gao, Z.-K., Cai, Q., Yang, Y.-X., Dang, W.-D., & Zhang, S.-S. (2016). Multiscale limited penetrable horizontal visibility graph for analyzing nonlinear time series. Scientific Reports, 6, 35622. Gao, Z.-K., Yang, Y.-X., Fang, P.-C., Zou, Y., Xia, C.-Y., & Du, M. (2015). Multiscale complex network for analyzing experimental multivariate time series. EPL (Europhysics Letters), 109(3), 30 0 05.

Ghaderi, F., Kim, S. K., & Kirchner, E. A. (2014). Effects of eye artifact removal methods on single trial detection, a comparative study. Journal of Neuroscience Methods, 221, 41–47. Goldberger, A. L., Amaral, L. A. N., Glass, L., Hausdorff, J. M., Ivanov, P. C., Mark, R. G., . . . Stanley, H. E. (20 0 0). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23), 215–220. Gorriz, J. M., Ramirez, J., Cruces-Alvarez, S. A., Puntonet, C. G., Lang, E. W., & Erdogmus, D. (2009). A novel lms algorithm applied to adaptative noise cancellation. IEEE Signal Processing Letters, 16(1), 34–37. Gorriz, J. M., Ramirez, J., Lang, E. W., & Puntonet, C. G. (2008). Jointly gaussian pdf-based likelihood ratio test for voice activitity detection. IEEE Transactions Audio, Speech & Lenguage Processing, 16(8), 1565–1578. Gorriz, J. M., Ramirez, J., Olivares, A., Padilla, P., Puntonet, C. G., Canton, M., & Laguna, P. (2015). Real time qrs detection based on m-ary likelihood ratio test on the dft coefficients. Plos One, 9(10). Hong, B., Guo, F., Liu, T., Gao, X., & Gao, S. (2009). N200-speller using motion-onset visual response. Clinical Neurophysiology, 120(9), 1658–1666. Hyvärinen, A., & Oja, E. (20 0 0). Independent component analysis: Algorithms and applications. Neural Networks, 13(4–5), 411–430. Keren, A. S., Yuval-Greenberg, S., & Deouell, L. Y. (2010). Saccadic spike potentials in gamma-band eeg: Characterization, detection and suppression. NeuroImage, 49(3), 2248–2263. http://dx.doi.org/10.1016/j.neuroimage.2009.10.057. Kim, M. K., & Kim, S. P. (2011). Detection of p300 components using the wiener filter for bci-based spellers. In Control conference (ascc), 2011 8th asian (pp. 892–896). Kubler, A., Mushahwar, V. K., Hochberg, L. R., & Donoghue, J. P. (2006). Bci meeting 2005-workshop on clinical issues and applications. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 14(2), 131–134. Kutas, M., & Federmeier, K. D. (2011). Thirty years and counting: Finding meaning in the n400 component of the event-related brain potential (erp). Annu Rev Psychol, 62(3), 621–647. Lotte, F., Congedo, M., Lecuyer, A., & Lamarche, F. (2007). A review of classification algorithms for eeg-based brain computer interfaces. Journal of Neural Engineering, 4. Pfister, R., & Foerster, A. (2014). Pants on fire: The electrophysiological signature of telling a lie. Psychology Press Limited. Polich, J. (2007). Updating p300: An integrative theory of p3a and p3b. Clinical Neurophysiology, 118(10), 21282148. Pritchard, W. S. (1981). Psychophysiology of p300. Psychol Bull, 89(3), 506–540. Rakotomanonjy, A., & Guige, V. (2008). Ensemble of svms for bci p300 speller. IEEE Biomedical eng., 55, 1147–1154. Ramirez, J., Gorriz, J. M., Segura, J., Puntonet, C. G., & Rubio, A. J. (2006). Speech/non-speech discrimination based on contextual information integrated bispectrum lrt. IEEE Signal Processing Letters, 13(8), 497–500. Reyes, C. E., Rugayan, J. L. C., Jason, C., Rullan, G., Oppus, C. M., & Tangonan, G. L. (2012). A study on ocular and facial muscle artifacts in eeg signals for bci applications. In Tencon 2012 - 2012 ieee region 10 conference (pp. 1–6). doi:10.1109/TENCON.2012.6412241. Salzberg, B. (1976). The potential role of cepstral analysis in eeg research in epilepsy. Quantitative Analytic Studies in Epilepsy (P. Kellaway and I. Petersen, eds., New York, Raven Press), 559–563. Schreuder, M., Blankertz, B., & Tangermann, M. (2010). A new auditory multi-class brain-computer interface paradigm: Spatial hearing as an informative cue. PLoS ONE, 5(4), 1–14. doi:10.1371/journal.pone.0 0 09813. Sörnmo, L., & Laguna, P. (2005). Bioelectrical signal processing in cardiac and neurological applications. Biomedical Engineering. Burlington: Academic Pr Inc ISBN: 0124375529. Varma, S., & Simon, R. (2006). Bias in error estimation when using cross-validation for model selection. BMC Bioinformatics, 91(7). Wang, H., Chang, W., & Zhang, C. (2016). Functional brain network and multichannel analysis for the p300-based brain computer interface system of lying detection. Expert Syst. Appl., 53(C), 117–128. doi:10.1016/j.eswa.2016.01.024. Xiaoyan, Q., Douzhe, L., & Youer, D. (2009). P300 feature extraction based on parametric model and fastica algorithm. In 2009 fifth international conference on natural computation: 2 (pp. 585–589). doi:10.1109/ICNC.2009.160.