A new automatic sleep staging system based on statistical behavior of local extrema using single channel EEG signal


Accepted Manuscript

A New Automatic Sleep Staging System Based on Statistical Behavior of Local Extrema Using Single Channel EEG Signal

Saman Seifpour, Hamid Niknazar, Mohammad Mikaeili, Ali Motie Nasrabadi

PII: S0957-4174(18)30160-X
DOI: 10.1016/j.eswa.2018.03.020
Reference: ESWA 11867

To appear in: Expert Systems With Applications

Received date: 24 August 2017
Revised date: 12 March 2018
Accepted date: 13 March 2018

Please cite this article as: Saman Seifpour, Hamid Niknazar, Mohammad Mikaeili, Ali Motie Nasrabadi, A New Automatic Sleep Staging System Based on Statistical Behavior of Local Extrema Using Single Channel EEG Signal, Expert Systems With Applications (2018), doi: 10.1016/j.eswa.2018.03.020

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Highlights 

• A novel time-domain feature is proposed for designing an automatic sleep scoring system.

• The method employs a new symbolic idea to explore the hidden dynamics of EEG sleep stages.

• The effectiveness of the features is validated by statistical and graphical analysis.

• Compared with other existing methods, our method provides robust and superior results.


A New Automatic Sleep Staging System Based on Statistical Behavior of Local Extrema Using Single Channel EEG Signal

Saman Seifpour* (1), Hamid Niknazar (2), Mohammad Mikaeili (1), Ali Motie Nasrabadi (1)

(1) Department of Biomedical Engineering, Shahed University, Tehran, Iran.

(2) Department of Biomedical Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran.

Abstract: Over the past decade, converging evidence from diverse studies has demonstrated that sleep is closely associated with mental and physical health, quality of life, and safety. Visual sleep scoring provides an initial and tangible illustration of how brain waves change across different sleep stages. The main objective of the present study is to design an accurate and robust computer-assisted sleep stage scoring system using a single-channel EEG signal by proposing a novel time-domain feature named Statistical Behavior of Local Extrema (SBLE). SBLE provides insight into the hidden dynamics of EEG signals by quantifying and symbolizing their local extrema information, extracting and defining various patterns, and statistically analyzing the extracted patterns. First, each EEG segment was decomposed into 6 frequency sub-bands (i.e., low-delta, high-delta, theta, alpha, sigma, and beta). Next, SBLE features were computed separately from each sub-band. Then, an optimal feature set with the highest accuracy was selected using a supervised Multi-Cluster/Class Feature Selection (MCFS) algorithm. Finally, the selected features were fed to a multi-class Support Vector Machine (SVM) for classification. The benchmark Sleep-EDF database and the DREAMS Subjects database were employed to evaluate the performance of the proposed framework. The average accuracy rates were 90.6 ± 4.2%, 91.8 ± 5.0%, 92.8 ± 3.3%, 94.5 ± 3.4%, and 97.9 ± 1.4% for six-stage to two-stage sleep classification on the Sleep-EDF database, respectively. Its performance on the DREAMS Subjects database was also promising in terms of accuracy, sensitivity, specificity, and Cohen's Kappa coefficient. Experimental results suggest that the proposed methodology can precisely solve the multi-class sleep stage classification problem by presenting an innovative symbolic approach similar to the physician's point of view.

*Corresponding Author: Saman Seifpour; Phone: (+98) 9183170810; Address: Department of Biomedical Engineering, Shahed University, Tehran, Iran.

Keywords: Electroencephalography (EEG), Sleep stage classification, Symbolic analysis, Statistical Behavior of Local Extrema (SBLE), Multi-Cluster/Class Feature Selection (MCFS), Support Vector Machine (SVM).

1. Introduction

Sleep is a reversible state of mind and body, characterized by an increased level of unconsciousness and perceptual disengagement from the surrounding world. It is distinguished from wakefulness by a decreased ability to respond to environmental stimuli (Allada & Siegel, 2008). Since human beings spend approximately one-third of their lifetimes sleeping, sleep is essential to overall health and well-being, and a lack of sleep can directly influence mood and cognitive performance (Touchette et al., 2007; Walker & Stickgold, 2006).

A widely adopted technique in sleep studies, specifically in the diagnosis of sleep disorders, is polysomnography (PSG). PSG is a multi-parametric test that monitors many body functions by recording electrophysiological signals such as the electroencephalogram (EEG), electrooculogram (EOG), electromyogram (EMG), and electrocardiogram (ECG) (Álvarez-Estévez & Moret-Bonillo, 2011). Two well-known gold standards for interpreting sleep recordings are the Rechtschaffen and Kales (R&K) criteria and the guideline developed by the American Academy of Sleep Medicine (AASM). According to the R&K recommendations, each 20 or 30 second epoch of an overnight sleep recording belongs to one of seven discrete stages: Wake (wakefulness), S1 (drowsiness), S2 (light sleep), S3 (deep sleep), S4 (deep or slow wave sleep), REM (rapid eye movement), and movement time (MT) (Rechtschaffen & Kales, 1968). The AASM modified the R&K criteria and developed a new guideline for scoring sleep-related phenomena (Iber, 2007). The major distinctions between the AASM and R&K guidelines are: 1) the NREM stages (S1, S2, S3, and S4) in R&K are referred to as stages N1, N2, and N3 in the AASM guideline, where N3 (deep or delta-wave sleep) is obtained by merging stages S3 and S4; 2) stages Wake and REM in R&K are renamed W and R, respectively; 3) the movement time (MT) stage is abolished in the AASM guideline; 4) the EEG derivations of the R&K criteria, i.e. C3-A2 and O2-A1, are replaced with the F4-M1, C4-M1, and O2-M1 derivations by the AASM (Iber, 2007; Moser et al., 2009; Rechtschaffen & Kales, 1968). Furthermore, scoring PSG recordings according to the AASM criteria increases the amount of light and deep sleep in comparison to the R&K criteria (Moser et al., 2009).
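The correspondence between the two label sets described above can be written as a simple lookup. The following is a minimal sketch only; the names `RK_TO_AASM` and `rk_to_aasm`, and the convention of returning `None` for movement time, are illustrative assumptions and not part of either guideline.

```python
# R&K stage labels mapped to AASM labels, following the distinctions in the text.
# S3 and S4 merge into N3; movement time (MT) has no AASM counterpart.
RK_TO_AASM = {
    "Wake": "W",
    "S1": "N1",
    "S2": "N2",
    "S3": "N3",
    "S4": "N3",
    "REM": "R",
}


def rk_to_aasm(stage):
    """Return the AASM label for an R&K label, or None for MT (assumed convention)."""
    return RK_TO_AASM.get(stage)


hypnogram_rk = ["Wake", "S1", "S2", "S3", "S4", "REM", "MT"]
hypnogram_aasm = [rk_to_aasm(s) for s in hypnogram_rk]
```

Note that this relabels stages only; it does not reproduce the other differences between the guidelines, such as the change of EEG derivations.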

The process of visually scoring sleep stages is complex, time-consuming, and prone to human error. Due to inter- and intra-scorer variability, the level of agreement in visual sleep stage scoring differs greatly among scorers, and even by recording or by diagnosis (Collop, 2002; Norman, Pal, Stewart, Walsleben, & Rapoport, 2000). For these reasons, an accurate and robust automatic sleep stage scoring system can significantly reduce scoring time and generate reliable results. The first attempts to devise such systems began more than four decades ago (Fraiwan et al., 2010). Nowadays, with the development of innovative signal processing techniques and heuristic machine learning methods, a large number of methods have been proposed to automatically classify sleep stages.

Since EEG patterns exhibit different characteristics during the sleep stages (Fig. 1), numerous automatic systems for sleep stage classification have been constructed based on EEG signals (Doroshenkov, Konyshev, & Selishchev, 2007; Flexer, Gruber, & Dorffner, 2005; Shi et al., 2015). In addition, more complex approaches have utilized EOG and EMG signals in combination with EEG for extracting relevant features (S. F. Liang, Kuo, Hu, Pan, & Wang, 2012; S.-F. Liang et al., 2016; S.-F. Liang, Kuo, Hu, & Cheng, 2012; Tagluk, Sezgin, & Akin, 2010). Some studies have also focused on other physiologic signals such as ECG and respiratory signals to design an automatic sleep stage scoring system (Ebrahimi, Setarehdan, Ayala-Moyeda, & Nazeran, 2013; Ebrahimi, Setarehdan, & Nazeran, 2015; Virkkala, Hasan, Värri, Himanen, & Müller, 2007).

Generally, EEG signal processing techniques applied to human sleep staging proceed in three main steps: pre-processing, feature extraction, and classification. The two most popular approaches in the pre-processing step are artifact removal (to eliminate or reduce the effects of artifacts on EEG signals) and segmentation (to overcome the non-stationary nature of EEG signals) (Motamedi-Fakhr, Moshrefi-Torbati, Hill, Hill, & White, 2014).


Figure 1. Characteristics of EEG signals during various sleep stages.

Various methods have been developed to extract proper information from sleep EEG signals. These techniques can be classified into four main categories: temporal features, spectral features, time-frequency features, and nonlinear features (Aboalayon, Faezipour, Almuhammadi, & Moslehpour, 2016; Motamedi-Fakhr et al., 2014). Some of the ubiquitous temporal features, which represent characteristics of a signal in the time domain, are mean, mode, median, variance, standard deviation, skewness, kurtosis, zero-crossing, and Hjorth parameters (Diykh & Li, 2016; Diykh, Li, & Wen, 2016; Şen, Peker, Çavuşoğlu, & Çelebi, 2014). To obtain spectral or frequency-based features, the time-domain signals are converted into the frequency domain using the Fourier transform. The more prevalent spectral features in sleep EEG signal processing are parametric and nonparametric power spectral density (PSD) and higher-order spectral analysis (HOS) (Acharya et al., 2015; Acharya, Chua, Chua, Min, & Tamura, 2010; Radha, Garcia-Molina, Poel, & Tononi, 2014). Time-frequency methods such as the short time Fourier transform (STFT), wavelet transform (WT), and empirical mode decomposition (EMD) represent a signal jointly in the time and frequency domains (Hassan & Haque, 2016a; Hassan & Hassan Bhuiyan, 2016; Sanders, McCurry, & Clements, 2014; Şen et al., 2014). Nonlinear and entropy-based features such as fractal dimension (FD), correlation dimension (CD), entropy measures, and the Lyapunov exponent can also provide complementary information about the main characteristics of different sleep stages (Peker, 2016; Şen et al., 2014).
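As an illustration of the temporal features mentioned above, the following sketch computes the zero-crossing count and the three Hjorth parameters using their standard definitions (activity, mobility, complexity). It is a plain-Python illustration only; the function names are hypothetical and it is not taken from any of the cited implementations.

```python
from math import sqrt


def variance(x):
    """Population variance of a sequence."""
    mu = sum(x) / len(x)
    return sum((v - mu) ** 2 for v in x) / len(x)


def diff(x):
    """First difference, a discrete stand-in for the time derivative."""
    return [b - a for a, b in zip(x, x[1:])]


def zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def hjorth(x):
    """Return (activity, mobility, complexity) of a non-constant 1-D signal."""
    dx = diff(x)
    ddx = diff(dx)
    activity = variance(x)                       # signal power
    mobility = sqrt(variance(dx) / activity)     # proxy for mean frequency
    complexity = sqrt(variance(ddx) / variance(dx)) / mobility
    return activity, mobility, complexity


example = [1, -1, 1, -1, 1, -1]
n_crossings = zero_crossings(example)
activity, mobility, complexity = hjorth(example)
```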

So far, a wide variety of classification techniques have been utilized in sleep staging studies, but the ubiquitous classifiers are Artificial Neural Networks (ANN) (Ronzhina et al., 2012), k-means (Agarwal & Gotman, 2001), Self-Organizing Maps (SOM) (Ouanes et al., 2016), Linear Discriminant Analysis (LDA) (Fraiwan et al., 2010), Support Vector Machines (SVM) (Enshaeifar et al., 2016), Hidden Markov Models (HMM) (Doroshenkov et al., 2007), and ensemble classifiers such as Adaptive Boosting (AdaBoost) (Hassan, 2016), Linear Programming Boosting (LPBoost) (Hassan & Subasi, 2016), Bootstrap Aggregating (Bagging) (Hassan & Haque, 2016b; Hassan, Siuly, & Zhang, 2016), and Random Under Sampling Boosting (RUSBoost) (Hassan & Haque, 2016a, 2017).

Recently, there has been increasing interest in analyzing time series data with symbolic approaches. Symbolic analysis of time series can increase the efficiency of findings, reduce sensitivity to measurement noise, and discriminate both specific and general classes of proposed models. Symbolic techniques provide a general description of a dynamical system by transforming the system into a new representation space (so that most of the significant temporal information is retained), assigning a set of symbols (where each symbol corresponds to a given state of the system), and extracting valuable information from the new space (Amigo, Keller, & Unakafova, 2014; Daw, Finney, & Tracy, 2003; Lin, Keogh, Lonardi, & Chiu, 2003). As a mathematical tool, symbolic analysis presents a useful measure to assess the complexity or irregularity of biomedical recordings (Balakrishnan, Shoeb, & Syed, 2010; Canelas, Neves, & Horta, 2012). One of the basic methods in symbolic analysis of time series, which reduces the dimensionality of signals by discretization, is the Symbolic Aggregate approximation (SAX). The SAX distance measure lower-bounds the Euclidean distance (Lin, Keogh, Wei, & Lonardi, 2007). In recent years, several extensions have been presented to enhance the capability and efficacy of the SAX method. The extended symbolic aggregate approximation (ESAX) overcomes some of the limitations of SAX. In this method, the SAX dimensions are tripled by incorporating the minimum and maximum information of the time series (Lkhagva, Yu Suzuki, & Kawagoe, 2006). The ESAX Statistical Vector Space (ESSVS) reduces the complexity of the cluster computing process of SAX by replacing the Euclidean distance with the cosine distance (Jiang, Lan, & Zhang, 2009). The trend distance SAX (SAX-TD) computes the distance between trends using the starting and ending points of segments, proposing a modified distance measure that integrates the SAX distance with a weighted trend distance (Sun, Li, Liu, Sun, & Chow, 2014). The adaptive SAX (aSAX) is an adaptive symbolic approach that combines SAX with the k-means algorithm to boost the performance of classic SAX (Pham, Le, & Dang, 2010). The indexable SAX (iSAX) is a superset of classic SAX that turns SAX into a multiresolution representation, similar in spirit to wavelets. This approach also allows for both fast exact search and ultra-fast approximate search (Shieh & Keogh, 2008).

SAX and all its extensions only describe a specific signal in the time domain. However, in nonlinear and chaotic signals (like EEG), the dynamics of the system play a prominent role. In this study, a novel temporal feature named statistical behavior of local extrema (SBLE) is proposed based on the dynamical characteristics of EEG signals. In other words, SBLE is a symbolic technique to compare and track the dynamics of sleep EEG signals through various sleep stages. The principal advantage of SBLE is its ability to detect the hidden dynamical and behavioral changes of sleep EEG signals as one stage transforms into another. These alterations are specifically traceable in different frequency bands.
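For readers unfamiliar with SAX, the following minimal sketch shows the classic scheme in its commonly described form: z-normalization, piecewise aggregate averaging, and discretization with breakpoints that cut the standard normal distribution into equiprobable regions. The function name `sax`, the default alphabet, and the requirement that the series length be a multiple of the number of segments are assumptions made for illustration; none of the extensions listed above is reproduced.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a non-constant numeric series into a SAX word (illustrative sketch)."""
    # 1) z-normalize the series
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma for x in series]
    # 2) piecewise aggregate approximation: mean of each equal-length segment
    seg_len = len(z) // n_segments
    paa = [mean(z[i * seg_len:(i + 1) * seg_len]) for i in range(n_segments)]
    # 3) breakpoints that split N(0, 1) into len(alphabet) equiprobable regions
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(k / a) for k in range(1, a)]
    # 4) each segment mean becomes the symbol of the region it falls in
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


word_step = sax([0, 0, 0, 0, 10, 10, 10, 10], 2, alphabet="ab")
word_ramp = sax([1, 2, 3, 4, 5, 6, 7, 8], 4)
```

The sketch makes the limitation noted in the text visible: the resulting word summarizes amplitude levels over time and carries no explicit information about how the extrema of the signal behave.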

In general, physicians and sleep experts score the sleep stages according to the morphological characteristics of EEG signals. These morphological alterations therefore carry unique information; by quantifying and symbolizing it, an efficient feature set can be proposed to construct a precise sleep stage scoring system. In fact, the main hypothesis and contribution of this study is to propose a novel time-domain feature that can automatically detect various sleep stages based on morphological changes of sleep EEG signals, similar to the point of view of sleep experts.

The main idea of the novel time-domain feature (SBLE) proposed herein was originally introduced in our previous works, where its effectiveness was evaluated for predicting epileptic seizures and detecting the activation phase (A phase) of the sleep Cyclic Alternating Pattern (CAP) (Niknazar, Maghooli, & Motie Nasrabadi, 2015; Niknazar & Nasrabadi, 2016; Niknazar, Seifpour, Mikaili, Nasrabadi, & Banaraki, 2015). In the aforementioned studies, SBLE was mainly utilized as a similarity index to address our hypotheses. In the present study, to maximize the generalization ability of SBLE, a series of modifications and extensions were made. These improvements enhanced the performance of SBLE, relative to the previous version, as an effective time-domain feature for multi-class classification of sleep stages. These changes will be discussed in detail in Section 3.


2. Related works

Hassan et al. (Hassan & Bhuiyan, 2016b, 2017b; Hassan & Hassan Bhuiyan, 2016) decomposed EEG signals using the Empirical Mode Decomposition (EMD) technique and extracted statistical moment-based features and adaptive noise features. Then, by means of AdaBoost, Bagging, and RUSBoost classifiers, they reported 88.6%, 86.9%, and 88.1% average accuracy for six-stage classification, respectively. Hassan et al. (Hassan & Bhuiyan, 2016a, 2017a; Hassan & Subasi, 2017) also decomposed the sleep EEG segments into sub-bands using the tunable-Q factor wavelet transform (TQWT). Then, various spectral features and normal inverse Gaussian (NIG) pdf features were extracted from these sub-bands. Finally, by using ensemble classification methods (RF, AdaBoost, and Bagging), they achieved 90.4%, 90%, and 93.7% accuracy for six-stage classification, respectively.

Liang et al. (S. F. Liang et al., 2012; S.-F. Liang et al., 2016, 2012) designed different single-channel and multi-channel automated sleep staging models by applying contextual smoothing rules. In the first model, multiscale entropy (MSE) and autoregressive (AR) features were extracted, and an LDA classifier was used to obtain 88% overall agreement (Liang et al., 2012). In the second model, twelve features including temporal and spectral features were extracted from EEG, EOG, and EMG signals. Then, a hierarchical decision tree with fourteen rules was constructed for classification of sleep stages. The overall agreement was 86.7% (Liang et al., 2012b). In the last model, eight temporal and spectral features from EEG and EMG signals were extracted and fed into a genetic fuzzy inference system, achieving an overall accuracy of 86.44% (Liang et al., 2016).

Diykh et al. (Diykh et al., 2016; Diykh, Li, & Wen, 2016) identified six-class sleep stages with a k-means clustering algorithm as classifier by incorporating the characteristics of complex network concepts and time-domain features. The average classification accuracy was 92.2% and 95.9%, respectively. Bajaj et al. (2013) classified sleep stages based on time-frequency images (TFI) of EEG signals by utilizing a smoothed pseudo Wigner-Ville distribution and a multiclass least squares support vector machine (MC-LS-SVM) classifier. The average accuracy was 88.5% and 92.3% for six-class and five-class classification of sleep stages, respectively. Hsu et al. (2013) classified the sleep stages by using energy features and a recurrent neural classifier, with a classification rate of 87%. Zhu et al. (2014) proposed a system to classify the sleep stages based on graph domain features and an SVM classifier. The accuracy of six-stage and five-stage sleep classification was 87.5% and 88.9%, respectively. Ronzhina et al. (2012) used PSD features and an ANN classifier to obtain 76.6% accuracy for sleep stage classification. Vural et al. (2010) designed a sleep stage scoring system by employing a hybrid approach and principal component analysis (PCA). They reported an 83.5% classification rate. Doroshenkov et al. (2007) developed an automated system for classification of human sleep stages based on the R&K standard. Amplitude information of two-channel EEG signals in different frequency bands was used as features to train an HMM classifier. They reported a 61.8% success rate for their proposed system. Berthomier et al. (2007) presented an automatic sleep stage scoring method using a fuzzy logic iterative system and contextual rule smoothing based on spectral and temporal features. They reported a 72.1% epoch-by-epoch agreement for five-stage sleep classification. Ouanes et al. (2016) proposed a hybrid approach to classify sleep stages based on unsupervised learning. SOM clustering with a rule-based learning classifier system (LCS) was utilized to create an explicit model. The average accuracy of six-stage sleep classification was 87.5%.

Huang et al. (2014) designed a sleep stage scoring system based on spectral features and a relevance vector machine (RVM) classifier. They reported a 76.7% overall agreement. Ozsen et al. (2013) built an automatic sleep stage scoring scheme using time-domain and spectral-domain features, sequential feature selection, and an ANN classifier. The overall accuracy was 90.9% based on AASM criteria. Enshaeifar et al. (2016) employed a singular spectrum analysis (SSA) method to extract statistical descriptors for five-stage sleep classification. The level of agreement using an SVM classifier was 74%. Gunes et al. (2010) proposed a novel data preprocessing method called k-means clustering based feature weighting (KMCFW) to classify sleep stages into six classes. The success rate of the k-NN (k = 40) classifier was 82.2%. Acharya et al. (2010b) designed an automated sleep stage scoring system according to R&K criteria with an accuracy of 88.7% by extracting a number of HOS-based features from bispectrum and bicoherence plots and using a Gaussian mixture model (GMM) classifier. Ebrahimi et al. (2008) employed a sleep stage classification system to classify stages Wake, S1 + REM, S2, and slow wave sleep (SWS) using wavelet packet coefficient features and an ANN. The average accuracy of their work was around 93.0%. Farag et al. (2012) classified Wake, NREM, and REM stages based on Detrended Fluctuation Analysis (DFA) and a k-NN classifier with a 93.2% overall accuracy. Finally, it is worth mentioning that the application of symbolic methods to the analysis of sleep EEG signals is restricted to very few articles (R. J. Chang, Liu, Chen, & Wang, 2013; M. Gao, Wu, Li, & Wang, 2017; Klonowski, Olejarczyk, Stepienl, & Szelenberger, 2003). To the best of our knowledge, the present study is the first attempt to address the multi-class sleep stage classification problem by using symbolic analysis concepts to develop a new time-domain feature.

3. Materials and methods

The scheme of the proposed automatic sleep stage scoring system is depicted in Fig. 2. In the first step, all epochs of the dataset were randomly divided into two halves: the training set and the test set. Next, in the preprocessing block, the selected EEG channel was filtered and decomposed into the given frequency bands. Then, SBLE features were independently extracted from each frequency sub-band. The discriminative features were selected by employing an appropriate feature selection method on the training data. Finally, an optimal subset of features was fed into a classifier to perform two-stage to six-stage sleep classification.

Figure 2. Block diagram of the proposed automatic sleep stage scoring system.


3.1. Experimental data

3.1.1. Sleep-EDF dataset


The Physionet Sleep-EDF dataset was utilized to implement the experiments and assess the proposed framework (Goldberger et al., 2000; Kemp et al., 2000). The recordings were acquired from Caucasian males and females (21–35 years old) without any medication. The eight recordings were from subjects sc4002e0, sc4012e0, sc4102e0, sc4112e0, st7022j0, st7052j0, st7121j0, and st7132j0. The sc* files were obtained from ambulatory healthy volunteers during 24 hours of their normal daily life, and the st* files were acquired from subjects who had mild difficulty falling asleep but were otherwise healthy. The PSG recordings contain two EEG channels (Fpz-Cz and Pz-Oz), horizontal EOG, and submental EMG. The sampling rate of the EEG and EOG was 100 Hz, while EMG was sampled at 1 Hz (sc* recordings) and 100 Hz (st* recordings). All hypnograms were manually scored by well-trained technicians according to R&K criteria as one of the following eight sleep stages: Wake, S1, S2, S3, S4, REM, Movement Time (MT), and unknown sleep stages (Unscored) (Rechtschaffen & Kales, 1968). Epochs labeled "MT" or "Unscored" were excluded from further analysis. The EEG data were segmented into 30 second epochs (3000 data points) for off-line processing. A total of 15186 segments from the Fpz-Cz channel were collected for evaluating the proposed method.
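The segmentation described above (30 s epochs at 100 Hz, i.e. 3000 samples, with "MT" and "Unscored" epochs discarded) can be sketched as follows. This is an illustrative sketch under simple assumptions: the signal is an in-memory sequence, there is one label per consecutive epoch, and the names `segment_epochs`, `EXCLUDED`, and `EPOCH_LEN` are hypothetical.

```python
FS_HZ = 100                    # EEG sampling rate of Sleep-EDF (from the text)
EPOCH_SEC = 30                 # epoch length in seconds (from the text)
EPOCH_LEN = FS_HZ * EPOCH_SEC  # 3000 samples per epoch

EXCLUDED = {"MT", "Unscored"}  # labels dropped from further analysis


def segment_epochs(signal, labels):
    """Pair each kept label with its 3000-sample slice of the signal."""
    epochs = []
    for k, label in enumerate(labels):
        if label in EXCLUDED:
            continue
        start = k * EPOCH_LEN
        chunk = signal[start:start + EPOCH_LEN]
        if len(chunk) == EPOCH_LEN:  # skip a trailing partial epoch
            epochs.append((chunk, label))
    return epochs


# Toy data: four epochs' worth of samples and one label per epoch.
toy_signal = list(range(4 * EPOCH_LEN))
toy_labels = ["Wake", "MT", "S2", "Unscored"]
toy_epochs = segment_epochs(toy_signal, toy_labels)
```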

3.1.2. DREAMS Subjects Database

Another database used in this work to evaluate the generalization of the proposed computerized method was the DREAMS Subjects Database ("The DREAMS Subjects Database," n.d.). This database consists of 20 whole-night polysomnographic recordings from 20 healthy subjects (16 females and 4 males; 20-65 years old). The DREAMS Subjects database was collected in the sleep laboratory of a Belgian hospital using a digital 32-channel polygraph. The sleep data contained at least two EOG channels, one submental EMG channel, and three EEG channels (Cz-A1 or C3-A1, Fp1-A1, and O1-A1). The sampling frequency was 200 Hz. Sleep stages of the DREAMS database were visually annotated by an expert sleep technician according to both R&K (Wake, S1 to S4, REM, MT, and Unscored) and AASM (W, N1, N2, N3, and R) criteria. The "Unscored" and "MT" stages were excluded from our analysis. The total number of 30 second EEG epochs was 20257, and all analyses were performed on the Cz-A1 channel.

To perform further experiments, the entire Sleep-EDF database was randomly divided into two equal halves: the training set and the test set. Moreover, the epochs of each sleep stage were equally split to ensure that both the training and test sets contain epochs of all the sleep stages. The same procedure was used for the DREAMS Subjects Database. This approach also decreases the chance of over-fitting. It is essential to note that the process of dividing and intermixing the databases was repeated 20 times, and the mean and standard deviation of the performance metrics over these 20 runs are presented and discussed here.
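A stage-wise (stratified) half split repeated 20 times, as described above, might look like the following sketch. The random seed, the function names, and the handling of odd-sized classes (the extra epoch goes to the test set) are assumptions made for illustration; the source does not specify them.

```python
import random
from collections import defaultdict


def stratified_half_split(labels, rng):
    """Split epoch indices in half within each sleep stage."""
    by_stage = defaultdict(list)
    for idx, stage in enumerate(labels):
        by_stage[stage].append(idx)
    train, test = [], []
    for idxs in by_stage.values():
        rng.shuffle(idxs)
        half = len(idxs) // 2      # odd-sized classes: extra epoch goes to test
        train.extend(idxs[:half])
        test.extend(idxs[half:])
    return sorted(train), sorted(test)


def repeated_splits(labels, n_runs=20, seed=0):
    """Generate n_runs independent stratified half splits (20 in the text)."""
    rng = random.Random(seed)      # fixed seed only for reproducibility
    return [stratified_half_split(labels, rng) for _ in range(n_runs)]


toy_labels = ["Wake"] * 10 + ["S1"] * 4 + ["REM"] * 6
splits = repeated_splits(toy_labels)
```

Performance metrics would then be computed once per split and summarized by their mean and standard deviation across the runs.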


3.2. Signal preprocessing

Sleep EEG recordings are typically contaminated by various types of artifacts, so a preprocessing step is essential. This step enhances the quality of EEG signals by removing baseline drifts and linear trends, eliminating noise, and minimizing residual artifacts. The EEG signals were filtered with a 20th order Butterworth highpass filter at a frequency of 0.5 Hz, then with a 50th order Equiripple lowpass filter at a frequency of 30 Hz. These cutoff frequencies for bandpass filtering were selected because brain activity, specifically during sleep, carries significant information in the range from 0.5 Hz to 30-35 Hz (Nishida et al., 2009; Şen et al., 2014). Finally, the filtered signals were decomposed into the conventional frequency bands: low-delta (0.5-2 Hz), high-delta (2-4 Hz), theta (4-8 Hz), alpha (8-12 Hz), sigma (12-16 Hz), and beta (16-30 Hz) (Mariani et al., 2010). An Equiripple bandpass filter with a 50th order and a Hamming window was used for this purpose.

3.3. Statistical Behavior of Local Extrema (SBLE)


In this study, a novel symbolic technique based on the statistical characteristics of local extrema is proposed to extract relevant features from a single-channel EEG signal for sleep stage classification. The idea behind symbolic time series analysis is simple. In this approach, the values of a given time series (i.e. a numeric sequence) are transformed into a finite set of symbols, yielding a symbolic sequence. A symbolic string is a discrete-time sequence in which each element is a member of a predefined symbol set (Daw et al., 2003; Wei Wang et al., 2002).

CE

In contrast to many of symbolic methods like SAX (Canelas et al., 2012; Tayebi, Krishnaswamy, Waluyo, Sinha, & Gaber, 2011) that attempted to quantify the behavior of

AC

time series during the time, SBLE extract several features to describe the dynamics of time series. For this purpose, SBLE concentrates on the positional information of local extrema in some amplitude intervals and tries to present an optimized estimation of the local extrema amplitude distributions and their behavior. The procedure of creating SBLE feature vectors is designed in six steps: Step 1. Identification of Local extrema: In this step, the positions of local extrema are discovered from the EEG signals and the amplitude of each local extremum is specified (Fig. 3a). This sequence of amplitude values of local extrema constructs a sequential string named 12

ACCEPTED MANUSCRIPT the first symbolic string or SS1 (Fig. 4a). The local extremum of a given time series such as x(t) can be indicated by finding a point that have one of the two following features: 

Amplitude of x(t) is greater than x(t-1) and x(t+1), which is local maximum.



Amplitude of x(t) is smaller than x(t-1) and x(t+1), which is local minimum.
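The two conditions of Step 1 translate directly into a scan over interior samples. The sketch below is illustrative only; the function name `local_extrema` and the returned tuple layout are assumptions. Because the comparisons are strict, flat plateaus are not reported as extrema.

```python
def local_extrema(x):
    """Return positions, amplitudes (SS1) and kinds of strict local extrema of x."""
    positions, amplitudes, kinds = [], [], []
    for t in range(1, len(x) - 1):
        if x[t] > x[t - 1] and x[t] > x[t + 1]:
            kind = "max"   # greater than both neighbours
        elif x[t] < x[t - 1] and x[t] < x[t + 1]:
            kind = "min"   # smaller than both neighbours
        else:
            continue       # monotone run or plateau: not an extremum here
        positions.append(t)
        amplitudes.append(x[t])
        kinds.append(kind)
    return positions, amplitudes, kinds


toy_signal = [0, 2, 1, 3, -1, 0, -2, 4, 4, 1]
pos, ss1, kinds = local_extrema(toy_signal)
```

In this toy signal the plateau formed by the two samples equal to 4 is skipped, which is a consequence of the strict inequalities rather than a statement about how the authors treat plateaus.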

Step 2. Finding amplitude intervals: In this step, to obtain an estimate of the amplitude distribution, a histogram of the local extrema amplitudes obtained in the previous step is calculated (Fig. 3b). Then, the area under the histogram is divided into m intervals to specify the boundaries of the amplitude sub-areas (Fig. 3c). These boundaries should be located at positions where the maximum information is obtained from the amplitude distribution. For this purpose, according to Eq. 1, the probability of all sub-areas should be the same, because entropy is maximized when the probabilities of occurrence of the events are approximately equal (Fig. 3d).

$$H = -\sum_{i=1}^{m} p(S_i)\,\log\big(p(S_i)\big) \qquad (1)$$

where m is the number of amplitude intervals and p(S_i) is the occurrence probability of symbol S_i.

Figure 3. Dividing amplitude into some intervals. a) Extracting local extrema from the EEG signal. b) Plotting a histogram of local extrema’s amplitudes. c) Dividing the area under the histogram into a number of areas with equal probabilities. d) Depicting the boundaries of the amplitude intervals.
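One simple way to obtain intervals of equal probability, as Step 2 requires, is to place the boundaries at empirical quantiles of the extrema amplitudes. The sketch below takes that route as an assumption; the source works from the area under a histogram, and the midpoint rule, the function names, and the requirement of at least m amplitudes are illustrative choices. The entropy helper shows why equal probabilities are the target.

```python
from math import log


def equal_probability_boundaries(amplitudes, m):
    """Return m-1 inner boundaries splitting the amplitudes into m equal-sized groups."""
    s = sorted(amplitudes)
    n = len(s)                       # assumes n >= m
    bounds = []
    for k in range(1, m):
        i = k * n // m               # index of the first sample of the next group
        bounds.append((s[i - 1] + s[i]) / 2)  # midpoint between neighbouring samples
    return bounds


def shannon_entropy(probs):
    """Entropy of a discrete distribution, using the natural logarithm."""
    return -sum(p * log(p) for p in probs if p > 0)


toy_amplitudes = [2, 1, 3, -1, 0, -2, 5, -4]
toy_bounds = equal_probability_boundaries(toy_amplitudes, m=4)

h_equal = shannon_entropy([0.25] * 4)            # equal-probability intervals
h_skewed = shannon_entropy([0.7, 0.1, 0.1, 0.1])  # unequal intervals
```

With four equally likely symbols the entropy equals log(4), the largest value attainable with four symbols; any unequal assignment gives a smaller value.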

Step 3. Symbol assignment: In this step, to define the second symbolic string (SS2), the symbols R1 to Rm are assigned to the m intervals specified in Step 2. Each local extremum in the SS1 string is assigned the symbol (R1 to Rm) of the interval that contains its amplitude value. As can be seen in Fig. 4b, the SS2 string is composed by replacing the amplitude values in the SS1 string (Fig. 4a) with the corresponding symbols according to Eq. 2.

$$SS2(k) = R_i \quad \text{if } SS1(k) \text{ lies in the } i\text{-th amplitude interval}, \qquad i = 1, \dots, m \qquad (2)$$

Figure 4. Symbol assignment and symbolic string creation. a) Creating an SS1 string from amplitude values of local extrema. b) Generating SS2 string based on the SS1 string.
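Given the inner interval boundaries, the symbol assignment of Step 3 reduces to a sorted lookup. The following sketch represents symbol R_i by the integer i; the function name and the tie rule (a value exactly on a boundary goes to the upper interval) are assumptions, since the text does not state how ties are resolved.

```python
from bisect import bisect_right


def assign_symbols(ss1, boundaries):
    """Map each extremum amplitude to its interval index 1..m (symbol R1..Rm)."""
    # bisect_right counts how many boundaries are <= the amplitude,
    # so adding 1 gives the 1-based interval number.
    return [bisect_right(boundaries, a) + 1 for a in ss1]


toy_ss1 = [2, 1, 3, -1, 0, -2]       # amplitudes of local extrema
toy_boundaries = [-1.5, 0.5, 2.5]    # m = 4 intervals
toy_ss2 = assign_symbols(toy_ss1, toy_boundaries)
```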


Step 4. Definition of micro patterns: In this step, three micro patterns are defined. Each of two consecutive symbols in SS2 string can have three different relations according to their

Stable order (S): Sequence of Ri, Rj, where i = j, i.e. two symbols belong to same

CE



PT

orders as follows:

amplitude interval.



Falling order (F): Sequence of Ri, Rj, where i > j, i.e. prior symbol belongs to

AC

higher amplitude interval.



Rising order (R): Sequence of Ri, Rj, where i < j, i.e. prior symbol belongs to lower amplitude interval.

Therefore, the third symbolic string (SS3) is constructed by replacing each pair of consecutive elements of the SS2 string with the proper micro pattern.

Step 5. Definition of macro patterns: A macro pattern consists of N consecutive micro patterns, so that for N = x there are 3^x possible macro patterns. For example, when N = 2, there are 9 distinct macro patterns: "SS, SR, SF, RR, RS, RF, FR, FS, and FF". Nevertheless, some of these patterns, such as the RR and FF macro patterns, cannot occur in real-world problems (their counts are always zero), since local maxima and minima alternate. Figure 5 presents an example of all possible macro patterns for N = 1 to 3.

Figure 5. Samples of all possible micro and macro patterns for N = 1 to 3 (i.e. SS41, SS42 and SS43) when m = 4.

Thus, depending on the value of the parameter N (i.e. N = 1, 2, …, L), various macro pattern strings such as SS41, SS42, and SS4L are defined and extracted from the SS3 string.
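Steps 3 to 5 can be sketched in a few lines. The example below is only an illustration under stated assumptions: the interval boundaries are taken as given, symbols R1 to Rm are represented by the integers 1 to m, and macro patterns are counted with overlapping sliding windows (the text does not say whether windows overlap). The helper names `to_symbols`, `micro_patterns`, and `macro_pattern_counts` are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def to_symbols(extrema, boundaries):
    """SS2: 1-based interval index (R1..Rm) of every extremum amplitude."""
    return [bisect_right(boundaries, a) + 1 for a in extrema]


def micro_patterns(symbols):
    """SS3: 'S' (stable), 'F' (falling) or 'R' (rising) for each pair
    of consecutive symbols."""
    out = []
    for i, j in zip(symbols, symbols[1:]):
        out.append("S" if i == j else ("F" if i > j else "R"))
    return "".join(out)


def macro_pattern_counts(ss3, n):
    """Count every window of n consecutive micro patterns (overlapping)."""
    return Counter(ss3[k:k + n] for k in range(len(ss3) - n + 1))


# Toy alternating minima/maxima (hypothetical values) and m = 3 intervals.
extrema = [-5, 3, -2, 8, -7, 1, -1, 6, -4, 2, -9, 4]
boundaries = [-2, 3]

ss2 = to_symbols(extrema, boundaries)
ss3 = micro_patterns(ss2)
counts_n2 = macro_pattern_counts(ss3, 2)

print(ss2)
print(ss3)
print(dict(counts_n2))
```

In this toy sequence the RR and FF patterns never appear, which agrees with the remark in Step 5 that their counts are always zero.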

Step 6. Obtaining features: This step uses the defined strings to extract the features. The features that the method proposes are:

• Number of local maxima in each amplitude interval.

• Number of local minima in each amplitude interval.

• Number of occurrences of each macro pattern for N = 1 to L in the SSm1 to SSmL strings.

According to Eq. 3, the number of obtained features depends on L and m. The first part (2*m) is the number of features related to the local minima and maxima in the m amplitude intervals, and the second part ($\sum_{N=1}^{L} 3^{N}$) is the number of macro patterns. For instance, if m = 3 and L = 5, the number of features is equal to 369. The L parameter can take a value between 1 and the number of local extrema, and m can take any value from 2 upward. Increasing m and L increases the computation time of the system and leads to the extraction of redundant information. In this research, the values of m and L were determined empirically during the experiments and set to 6 and 5, respectively. This means that the feature vector of each EEG frequency sub-band was composed of 375 features.

$$\text{Number of features} = 2m + \sum_{N=1}^{L} 3^{N} \qquad (3)$$
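The two feature counts quoted in the text (369 and 375) can be checked directly from the counting rule, i.e. 2m extrema counts plus 3^N macro patterns for each N from 1 to L. The function name `feature_count` is an assumption made for this small check.

```python
def feature_count(m, L):
    """2*m extrema counts plus 3**N macro patterns for N = 1..L."""
    return 2 * m + sum(3 ** n for n in range(1, L + 1))


print(feature_count(3, 5))  # example given in the text
print(feature_count(6, 5))  # setting used in the study
```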

All features were fed as inputs to a classifier for decision making. Furthermore, because the proposed method extracts a large number of features, an appropriate and efficient feature selection algorithm is essential for eliminating redundant or irrelevant features.

It should be noted that in the initial usage of SBLE (as a similarity index), the amplitude interval locations were fixed by an equation depending on the average and variance of the entire signal (a linear combination of the average and variance). In the present study, we developed an adaptive approach for estimating the amplitude intervals in order to expand the information extracted from each segment of the signal. Furthermore, we improved and enriched the pattern extraction procedure by defining different micro and macro patterns. This approach provided a more detailed representation of local extrema behavior to track the dynamical changes of EEG in different sleep stages.

3.4. Feature scaling

Since SBLE features have a wide range of values, skewed distributions, and unequal variances, scaling of the selected features is essential. For this reason, a two-step feature scaling process, consisting of feature transformation and feature normalization, was implemented in this study. It is noteworthy that the scaling procedure was applied independently to the training and test groups.

Feature transformation is a process in which a new feature set is created by means of a mathematical function in order to alleviate the influence of extreme values. The main transformations are the logarithmic, square root, arcsine, reciprocal, and squared transformations.

The logarithmic and square root transformations are commonly used for positive data (Becq et al., 2005; Khalighi et al., 2013). In this work, we used a logarithmic transformation (Eq. 4) for this purpose.

$$X = \log(Y) \qquad (4)$$

where Y denotes the original feature matrix and X is the transformed feature matrix. The X matrix was organized according to Equation 5:

$$X = [x_{ij}]_{N \times M} \qquad (5)$$

where N and M indicate the number of epochs and features, respectively. Furthermore, feature normalization has been widely used to standardize the range of features, and is generally implemented during the data preprocessing step. This method eliminates individual differences (J. F. Gao, Yang, Lin, Wang, & Zheng, 2010), improves the convergence speed of the classifiers, and affects the classification results (Ioffe & Szegedy, 2015). The simplest feature normalization method is to rescale the values to the range [0, 1] or [−1, 1]. Here, the transformed feature matrix (X) was normalized between 0 and 1 according to Equation 6:

$$z_{ij} = \frac{x_{ij} - \min(x_j)}{\max(x_j) - \min(x_j)} \qquad (6)$$

where Z is the normalized matrix and has the same dimensions as X. In Eq. 6, xij is the value of feature j in feature vector i, and min(xj) and max(xj) are the minimum and maximum values of feature j.
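The two-step scaling can be sketched as below. This is a minimal illustration, not the authors' code: the `+1` offset inside the logarithm is an assumption added so that zero-valued counts stay finite, the normalization is applied per feature (per column), and the names `log_transform` and `min_max_normalize` as well as the toy matrix are hypothetical.

```python
import math


def log_transform(Y):
    """Element-wise log(1 + y); the +1 offset is an assumption that
    keeps zero counts finite."""
    return [[math.log1p(y) for y in row] for row in Y]


def min_max_normalize(X):
    """Rescale every column (feature) of X to the range [0, 1]."""
    columns = list(zip(*X))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    Z = []
    for row in X:
        Z.append([(x - lo) / (hi - lo) if hi > lo else 0.0
                  for x, lo, hi in zip(row, lows, highs)])
    return Z


# Toy feature matrix: 3 epochs (rows) x 2 features (columns).
Y = [[0, 10], [3, 100], [15, 1000]]
Z = min_max_normalize(log_transform(Y))
for row in Z:
    print(row)
```

Constant features are mapped to 0.0 here to avoid a division by zero; how such features were handled in the study is not stated.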


3.5. Feature selection

According to whether the training data is labeled or not, feature selection methods can be classified into supervised, unsupervised, and semi-supervised methods. Supervised feature selection works with labeled data and can be divided into filter-based, wrapper-based, and embedded models. The wrapper and embedded methods have higher computational complexity than the filter-based methods (Tang et al., 2014).

Since the overall performance of the classifier was deeply affected by the feature selection process, several unsupervised and supervised feature selection techniques (filter-based and wrapper-based) were tested to find an optimum algorithm with superior classification results (this is explained in more detail in Section 4.1). The supervised feature selection algorithms included Statistical Dependency (SD), Relief-F, Mutual Information (MI), minimum Redundancy Maximum Relevance (mRMR), Feature Selection via concave minimization (FSV), and supervised Multi-Class/Cluster Feature Selection (supervised MCFS). All of the supervised algorithms are filter-based methods except FSV and supervised MCFS, which are wrapper-based methods. In addition, the tested unsupervised feature selection algorithms were unsupervised Multi-Class/Cluster Feature Selection (unsupervised MCFS), Unsupervised Discriminative Feature Selection (UDFS), and Local Learning-Based Clustering (LLC) (Roffo, 2016).

In this study, the supervised MCFS method was applied to select a relevant subset of features by executing a 5-fold cross-validation on the training data. It is essential to note that MCFS is basically an unsupervised feature selection method, but it can also be used in supervised and semi-supervised problems (Cai et al., 2010).


3.6. Classification

A multi-class support vector machine (SVM) classifier was employed to determine the classification accuracy of the two-stage to six-stage sleep classification. SVM is a statistics-based classifier that is basically designed for solving binary classification problems. To extend SVM to multi-class problems, several methods such as One-Against-All (OAA) and One-Against-One (OAO) (Bishop, 2014), Directed Acyclic Graph SVM (DAGSVM) (Platt et al., 1999), and Error-Correcting Output Codes (ECOC) (Dietterich et al., 1994) have been proposed. Recently, a new multi-class SVM method was developed based on decision tree concepts. This method is called Dendrogram-based SVM or DSVM (Lajnef et al., 2015).

SVM classifiers have been used in a wide range of machine learning and pattern recognition problems. For the recognition of small samples of nonlinear and high-dimensional data, SVM has remarkable adaptability, robust classification ability, and computational efficiency (Ge, Wang, & Yu, 2014). All in all, it is a computationally efficient classifier with high accuracy, stability, and robustness, especially in the field of signal processing. The use of SVMs in sleep EEG studies has mainly focused on automatic sleep staging, arousal detection, and sleep spindle recognition (Motamedi-Fakhr et al., 2014).

The LIBSVM package was used in this work. LIBSVM is an integrated software package that supports both two-class and multi-class support vector classification problems (Chang et al., 2011). LIBSVM solves multi-class problems by using the OAO method (Chih-Wei Hsu & Chih-Jen Lin, 2002). Because of the nonlinear nature of EEG signals, the radial basis function (RBF) was selected as the SVM kernel. A grid-search procedure was also implemented to find optimal values for the RBF-kernel width (γ) and the soft margin (C). Finally, the values 8 and 0.5 were selected for C and γ, respectively. To prevent overfitting, a 5-fold cross-validation procedure was employed.

3.7. Performance assessment

In order to assess the efficacy and capability of the proposed method, a series of well-known and commonly used statistical measures were computed (Fraiwan et al., 2010). These measures are sensitivity, specificity, accuracy, overall agreement, and Cohen’s kappa coefficient.

Cohen’s kappa coefficient (κ) is a statistical measure used to evaluate the efficacy and robustness of a method. It is a more robust measure than simple percent agreement, since κ takes into account the possibility of the agreement occurring by chance. In other words, it is a measure of how well an algorithm performed compared to how well it would have performed simply by chance (Cohen, 1960). The kappa coefficient is interpreted as follows: a κ value of less than 0.00 represents no agreement; 0.00 to 0.20 represents slight agreement; 0.21 to 0.40 represents fair agreement; 0.41 to 0.60 represents moderate agreement; 0.61 to 0.80 represents substantial agreement; and more than 0.80 represents almost perfect agreement (Landis et al., 1977).
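As a worked illustration of these measures, the sketch below computes per-stage sensitivity and Cohen's kappa from a confusion matrix. The counts are those of the six-stage Sleep-EDF confusion matrix reported later (Table 4); treating rows as expert labels and columns as the output of the proposed method is an assumption of this example, and the function names `sensitivities` and `cohen_kappa` are hypothetical. The values printed from the pooled counts may differ slightly from the fold-averaged values reported in the text.

```python
def sensitivities(matrix):
    """Per-class sensitivity: diagonal count divided by the row total."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


def cohen_kappa(matrix):
    """Cohen's kappa: (observed - chance agreement) / (1 - chance)."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_sums = [sum(row) for row in matrix]
    col_sums = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (observed - expected) / (1 - expected)


# Stage order: Wake, S1, S2, S3, S4, REM (rows assumed to be expert labels).
table4 = [
    [3972, 32, 4, 2, 0, 17],
    [43, 124, 35, 2, 0, 98],
    [29, 5, 1651, 73, 5, 48],
    [9, 0, 78, 197, 52, 0],
    [6, 1, 5, 51, 251, 0],
    [22, 29, 72, 0, 0, 682],
]

sens = sensitivities(table4)
kappa = cohen_kappa(table4)
print([round(100 * s, 1) for s in sens])
print(round(kappa, 2))
```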


4. Experimental results

To evaluate the potency of the proposed framework, a set of comprehensive experiments was carefully designed and conducted. In the different parts of this section (i.e. the effect of different feature selection techniques on the classification results, statistical analysis of the selected features, the rationale of channel selection, feature evaluation, the classification results of different multi-class problems using the Sleep-EDF dataset, the classification results for the DREAMS Subject Database, and eventually a comparative study), we illustrate the objective of each experiment and explain the findings of the quantitative analysis in detail. It is important to mention that the dataset considered in Sections 4.1 to 4.7 is the Sleep-EDF dataset.

4.1. Performance for various feature selection algorithms

Figure 6 compares the classification accuracy of the different feature selection techniques as a function of the number of selected features. In all cases, the maximum classification accuracy of each feature selection method occurred within the first 150 features. As can be seen from Fig. 6b, the supervised MCFS algorithm provided the best overall performance among all methods for six-stage sleep classification, with a set of 29 features. Moreover, the filter-based methods had the highest performance in comparison to other methods (Fig. 6c). On the contrary, UDFS and LLC could not select a discriminative subset of features to yield satisfying results (Fig. 6a).

Figure 6. Number of selected features with highest accuracy rate using different feature selection methods in six-stage sleep classification using Sleep-EDF dataset. a) Unsupervised methods b) Wrapper-based supervised methods c) Filter-based supervised methods.

4.2. Statistical analysis

The selected features obtained from the supervised MCFS were tested by statistical analysis to assess the overall discriminatory capability of the feature set. In other words, this investigation provides insight into the nature of each feature and its variation across the different sleep stages. A one-way analysis of variance (ANOVA) was utilized to test whether or not the difference between the sleep stages is significant. ANOVA returns a p-value that indicates the significance of the results and tests the validity of a hypothesis. It is noteworthy that ANOVA compares the means of several groups to test the hypothesis that they are all equal, against the general alternative that they are not all equal. Sometimes this alternative is too general and we need information about which pairs of means are significantly different, and which are not. A multiple comparison test can provide this valuable information. In this study, to obtain a good understanding of the differentiation between paired-stages, we used one of the more conservative multiple comparison methods (i.e. Bonferroni correction) subsequent to ANOVA.

Table 1 indicates the results of the statistical analysis. The analysis reveals that all of the selected features not only pass the hypothesis test but also present low p-values, indicating remarkable efficiency in differentiating among the six sleep stages. All reported results are two-tailed and the test was conducted at the 99% confidence level. Thus, a difference is statistically significant if P ≤ 0.01. In addition, the paired-stages that each feature could not significantly differentiate are also provided. Even though only the results of six-stage classification are presented, approximately similar results were obtained for two-stage to five-stage sleep classification. According to Table 1, features No. 20 and No. 25 individually have a marginal efficacy in distinguishing among a number of paired-stages, but in combination with other features their performance is compelling. In contrast, feature No. 4 can separate all possible paired-stages of sleep. Furthermore, a more detailed analysis shows that the most problematic and challenging paired-stages for identification are S3-S4 and S1-REM, so that 11 and 10 features cannot significantly differentiate between these paired-stages, respectively.
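The testing procedure described above can be illustrated with a small, self-contained sketch: a one-way ANOVA F statistic for one feature across stages, the enumeration of the 15 stage pairs, and a Bonferroni-adjusted significance threshold. The toy samples, the function name `one_way_anova_f`, and the choice to apply the correction by dividing the 0.01 level by the number of pairs are assumptions of this example; converting F to a p-value requires the F distribution, which is left out here.

```python
from itertools import combinations


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA; `groups` holds one sample list
    per group (e.g. one feature's values in each sleep stage)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (mu - grand_mean) ** 2
                     for g, mu in zip(groups, means))
    ss_within = sum((x - mu) ** 2
                    for g, mu in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Toy samples for three groups (hypothetical values).
f_value = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])

# All paired-stages of the six-stage problem and the adjusted threshold.
stages = ["W", "S1", "S2", "S3", "S4", "REM"]
pairs = list(combinations(stages, 2))
bonferroni_alpha = 0.01 / len(pairs)

print(f_value)
print(len(pairs), bonferroni_alpha)
```

The pairs are generated in the same order as the paired-stage labels listed with Table 1, so that, for instance, the ninth pair is S1-REM and the thirteenth is S3-S4.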

Table 1. Statistical analysis to assess discrimination ability of the selected features in six-stage sleep classification using Sleep-EDF dataset. The feature values of each sleep stage are transformed and normalized, and presented as Mean ± STD. Nonsignificant p-values (P > 0.01) are shown in boldface.

| Features | Wake | S1 | S2 | S3 | S4 | REM | Overall p-value | Non-significant pairs¹ |
|---|---|---|---|---|---|---|---|---|
| No. 1 | 0.885 ± 0.094 | 0.761 ± 0.106 | 0.881 ± 0.071 | 0.877 ± 0.071 | 0.856 ± 0.073 | 0.704 ± 0.115 | 0 | 2; 3; 10; 13 |
| No. 2 | 0.807 ± 0.103 | 0.831 ± 0.068 | 0.860 ± 0.074 | 0.726 ± 0.099 | 0.549 ± 0.176 | 0.856 ± 0.058 | 0 | 12 |
| No. 3 | 0.581 ± 0.124 | 0.777 ± 0.088 | 0.633 ± 0.113 | 0.620 ± 0.112 | 0.625 ± 0.083 | 0.810 ± 0.069 | 0 | 10; 11; 13 |
| No. 4 | 0.832 ± 0.146 | 0.525 ± 0.285 | 0.788 ± 0.107 | 0.892 ± 0.057 | 0.935 ± 0.039 | 0.640 ± 0.197 | 0 | - |
| No. 5 | 0.890 ± 0.092 | 0.813 ± 0.104 | 0.663 ± 0.190 | 0.614 ± 0.239 | 0.692 ± 0.142 | 0.741 ± 0.155 | 0 | 11 |
| No. 6 | 0.803 ± 0.139 | 0.462 ± 0.299 | 0.685 ± 0.171 | 0.877 ± 0.044 | 0.920 ± 0.024 | 0.466 ± 0.268 | 0 | 9; 13 |
| No. 7 | 0.884 ± 0.091 | 0.809 ± 0.102 | 0.659 ± 0.189 | 0.617 ± 0.231 | 0.687 ± 0.139 | 0.738 ± 0.150 | 0 | 11 |
| No. 8 | 0.356 ± 0.310 | 0.031 ± 0.120 | 0.109 ± 0.212 | 0.197 ± 0.261 | 0.200 ± 0.261 | 0.040 ± 0.134 | 0 | 9; 13 |
| No. 9 | 0.859 ± 0.109 | 0.679 ± 0.168 | 0.820 ± 0.067 | 0.885 ± 0.057 | 0.907 ± 0.038 | 0.763 ± 0.084 | 0 | 13 |
| No. 10 | 0.636 ± 0.143 | 0.820 ± 0.100 | 0.740 ± 0.100 | 0.544 ± 0.152 | 0.423 ± 0.174 | 0.825 ± 0.076 | 0 | 9 |
| No. 11 | 0.915 ± 0.085 | 0.888 ± 0.067 | 0.824 ± 0.121 | 0.799 ± 0.148 | 0.835 ± 0.096 | 0.852 ± 0.096 | 1.65E-284 | 11; 15 |
| No. 12 | 0.131 ± 0.245 | 0.039 ± 0.145 | 0.056 ± 0.171 | 0.244 ± 0.296 | 0.448 ± 0.327 | 0.031 ± 0.131 | 5.46E-222 | 6; 9; 12 |
| No. 13 | 0.631 ± 0.144 | 0.822 ± 0.087 | 0.730 ± 0.114 | 0.532 ± 0.156 | 0.395 ± 0.199 | 0.824 ± 0.075 | 0 | 9 |
| No. 14 | 0.635 ± 0.168 | 0.800 ± 0.123 | 0.725 ± 0.115 | 0.618 ± 0.161 | 0.584 ± 0.194 | 0.804 ± 0.092 | 2.09E-288 | 3; 9; 13 |
| No. 15 | 0.502 ± 0.264 | 0.155 ± 0.241 | 0.268 ± 0.277 | 0.440 ± 0.266 | 0.428 ± 0.270 | 0.175 ± 0.252 | 0 | 9; 13 |
| No. 16 | 0.135 ± 0.248 | 0.038 ± 0.145 | 0.048 ± 0.155 | 0.200 ± 0.273 | 0.438 ± 0.301 | 0.027 ± 0.126 | 3.72E-220 | 6; 9; 12 |
| No. 17 | 0.684 ± 0.228 | 0.787 ± 0.169 | 0.738 ± 0.204 | 0.511 ± 0.319 | 0.377 ± 0.325 | 0.776 ± 0.181 | 1.07E-213 | 9 |
| No. 18 | 0.613 ± 0.142 | 0.816 ± 0.082 | 0.707 ± 0.093 | 0.638 ± 0.080 | 0.621 ± 0.065 | 0.775 ± 0.079 | 0 | 4; 13 |
| No. 19 | 0.426 ± 0.410 | 0.564 ± 0.396 | 0.506 ± 0.411 | 0.562 ± 0.404 | 0.478 ± 0.432 | 0.496 ± 0.425 | 5.96E-19 | 4; 6 to 15 |
| No. 20 | 0.179 ± 0.275 | 0.169 ± 0.278 | 0.173 ± 0.279 | 0.232 ± 0.291 | 0.163 ± 0.249 | 0.176 ± 0.281 | **0.011** | 1; 2; 4 to 15 |
| No. 21 | 0.716 ± 0.266 | 0.492 ± 0.372 | 0.729 ± 0.248 | 0.774 ± 0.197 | 0.803 ± 0.162 | 0.625 ± 0.322 | 1.39E-71 | 1; 10; 13 |
| No. 22 | 0.748 ± 0.161 | 0.742 ± 0.201 | 0.673 ± 0.312 | 0.624 ± 0.350 | 0.745 ± 0.252 | 0.686 ± 0.286 | 6.56E-43 | 1; 4; 8; 12 |
| No. 23 | 0.854 ± 0.091 | 0.793 ± 0.084 | 0.804 ± 0.109 | 0.893 ± 0.048 | 0.918 ± 0.037 | 0.768 ± 0.138 | 2.19E-221 | 6; 13 |
| No. 24 | 0.853 ± 0.079 | 0.853 ± 0.096 | 0.775 ± 0.253 | 0.733 ± 0.300 | 0.851 ± 0.162 | 0.808 ± 0.200 | 9.82E-81 | 1; 4; 8 |
| No. 25 | 0.626 ± 0.406 | 0.432 ± 0.463 | 0.533 ± 0.459 | 0.419 ± 0.465 | 0.444 ± 0.476 | 0.496 ± 0.465 | 6.91E-41 | 7 to 9; 11 to 15 |
| No. 26 | 0.537 ± 0.306 | 0.655 ± 0.251 | 0.617 ± 0.276 | 0.448 ± 0.321 | 0.407 ± 0.334 | 0.673 ± 0.243 | 1.99E-77 | 6; 9; 13 |
| No. 27 | 0.536 ± 0.153 | 0.581 ± 0.107 | 0.549 ± 0.128 | 0.445 ± 0.175 | 0.353 ± 0.205 | 0.595 ± 0.102 | 1.92E-164 | 2; 9 |
| No. 28 | 0.649 ± 0.225 | 0.401 ± 0.309 | 0.593 ± 0.246 | 0.684 ± 0.181 | 0.662 ± 0.209 | 0.521 ± 0.286 | 7.30E-103 | 3; 4; 13 |
| No. 29 | 0.576 ± 0.288 | 0.601 ± 0.280 | 0.642 ± 0.259 | 0.569 ± 0.286 | 0.390 ± 0.337 | 0.613 ± 0.269 | 1.24E-48 | 1; 3; 6; 7; 9; 12; 14 |

¹ Labels of the defined paired-stages: (1) W-S1; (2) W-S2; (3) W-S3; (4) W-S4; (5) W-REM; (6) S1-S2; (7) S1-S3; (8) S1-S4; (9) S1-REM; (10) S2-S3; (11) S2-S4; (12) S2-REM; (13) S3-S4; (14) S3-REM; (15) S4-REM.

Figure 7 depicts an error bar representation of all selected features in six-class sleep stage classification. This figure describes the uncertainty or variation of the features by using the average and standard deviation of each one. In other words, an error bar graph communicates important information about the nature of each feature, such as how spread the data are around the mean value and how accurately the mean value represents the data. In this case, most of the selected features had an appropriate distribution around the mean values.


Figure 7. Error bar representation of selected features in six-stage sleep classification on Sleep-EDF dataset.

4.3. Performance for Fpz-Cz and Pz-Oz channel

In single-channel based approaches, the selection of an efficient and appropriate EEG channel for extracting more informative features is a challenging problem. This has been a controversial issue: many of the prior studies (Diykh & Li, 2016; Diykh, Li, & Wen, 2016; Hassan & Hassan Bhuiyan, 2016; S. F. Liang et al., 2012; G. Zhu et al., 2014) yielded higher accuracy results for Pz-Oz. On the other hand, some studies (Hsu et al., 2013; Ouanes et al., 2016; Tsinalis et al., 2016a, 2016b) reported that Fpz-Cz is the most suitable channel for sleep stage classification. For this reason, we tested our automatic framework on the two available EEG channels of the Sleep-EDF dataset. It is evident from Table 2 that the classification performance of the Fpz-Cz derivation exceeded that of Pz-Oz in all the cases of interest.

Table 2. Average performance of Fpz-Cz and Pz-Oz channels for two-stage to six-stage sleep classification on Sleep-EDF dataset. The highest accuracy and kappa statistic values are marked in boldface.

| Channel | Fpz-Cz: Accuracy | Fpz-Cz: Kappa | Pz-Oz: Accuracy | Pz-Oz: Kappa |
|---|---|---|---|---|
| 6-state classification | **90.6%** | **0.85** | 88.6% | 0.82 |
| 5-state classification | **91.8%** | **0.87** | 90.2% | 0.85 |
| 4-state classification | **92.8%** | **0.88** | 91.0% | 0.85 |
| 3-state classification | **94.5%** | **0.91** | 93.6% | 0.89 |
| 2-state classification | **97.9%** | **0.96** | 97.5% | 0.95 |

4.4. Feature evaluation

The capability of the proposed algorithm was examined by assigning different numbers of epochs to the training and test datasets. For this purpose, two additional SVM models were tested to see if the proposed method was able to produce satisfactory results. First, 30% of the data was randomly selected as the training dataset, and the classifier was tested on the remaining 70%. In the next model, about 15% of the entire dataset was used for training and the other 85% was assigned to the test group. Table 3 provides a comparison between the accuracy and kappa values obtained from two-stage to six-stage sleep classification (the Q index indicates the amount of training data in percent). As can be seen from Table 3, the best classification results were achieved when about half of the data were selected for training, but it is important to note that all models were able to obtain promising classification rates for all multi-class problems.

Table 3. Comparison of two additional models with different number of epochs (Q) as the training group on Sleep-EDF dataset. The highest values are presented in boldface.

| Percent of training epochs | Q = 50%: Accuracy | Q = 50%: Kappa | Q = 30%: Accuracy | Q = 30%: Kappa | Q = 15%: Accuracy | Q = 15%: Kappa |
|---|---|---|---|---|---|---|
| 6-state classification | **90.6%** | **0.85** | 89.4% | 0.84 | 87.8% | 0.81 |
| 5-state classification | **91.8%** | **0.87** | 90.7% | 0.85 | 88.0% | 0.81 |
| 4-state classification | **92.8%** | **0.88** | 91.2% | 0.86 | 90.0% | 0.84 |
| 3-state classification | **94.5%** | **0.91** | 93.4% | 0.88 | 91.6% | 0.85 |
| 2-state classification | **97.9%** | **0.96** | 97.2% | 0.94 | 95.7% | 0.91 |

4.5. Six-state classification of sleep stages

Table 4 shows the confusion matrix of the six-state classification of sleep stages using the proposed method. The average overall agreement between the expert and the proposed method was 90.6 ± 4.2%, and the kappa coefficient exhibited almost perfect agreement (κ = 0.85 ± 0.02). The Wake stage had outstanding classification accuracy, with around 99% sensitivity. Stages S2 (around 91%), REM (around 85%), and S4 (around 80%) were in the next places. The specificity values for all stages were higher than 96%. In addition, the highest misclassification rates occurred in stages S1 and S3, with about 59% and 41%, respectively. In stage S1, the S1-REM and S1-Wake pairs had slightly low performance, with about 33% and 14% misclassification rates, respectively. In the same way, in stage S3, the misclassification rates of S3-S2 and S3-S4 were higher than those of other pairs, with about 23% and 16%, respectively.

Table 4. Confusion matrix of six-stage sleep classification on Sleep-EDF dataset.

| Expert \ Proposed method | Wake | S1 | S2 | S3 | S4 | REM |
|---|---|---|---|---|---|---|
| Wake | 3972 | 32 | 4 | 2 | 0 | 17 |
| S1 | 43 | 124 | 35 | 2 | 0 | 98 |
| S2 | 29 | 5 | 1651 | 73 | 5 | 48 |
| S3 | 9 | 0 | 78 | 197 | 52 | 0 |
| S4 | 6 | 1 | 5 | 51 | 251 | 0 |
| REM | 22 | 29 | 72 | 0 | 0 | 682 |
| Sensitivity | 98.6% | 41.1% | 91.2% | 58.6% | 79.9% | 84.7% |
| Specificity | 96.4% | 99.0% | 96.4% | 98.1% | 99.2% | 97.4% |
| Accuracy | 97.7% | 96.6% | 95.1% | 96.3% | 98.3% | 96.0% |

Table 5 presents the confusion matrix of six-stage sleep classification obtained by testing the proposed method on the Pz-Oz channel. A direct comparison between Table 4 and Table 5 indicates that the sensitivities of stages Wake, S2, S4, and REM were approximately equal for both derivations, whereas, in stages S1 and S3, these values were dramatically reduced for the Pz-Oz channel (with about 22% and 18% reduction, respectively). In this case, the average accuracy and kappa statistic were 88.6 ± 4.7% and 0.82 ± 0.03, respectively.

Table 5. Six-stage sleep classification for Pz-Oz channel on Sleep-EDF dataset.

| Expert \ Proposed method | Wake | S1 | S2 | S3 | S4 | REM |
|---|---|---|---|---|---|---|
| Wake | 3971 | 20 | 8 | 1 | 1 | 26 |
| S1 | 76 | 58 | 68 | 0 | 0 | 100 |
| S2 | 21 | 5 | 1646 | 61 | 20 | 58 |
| S3 | 4 | 0 | 153 | 138 | 41 | 0 |
| S4 | 3 | 0 | 13 | 60 | 238 | 0 |
| REM | 26 | 11 | 90 | 0 | 0 | 678 |
| Sensitivity | 98.6% | 19.2% | 90.9% | 41.1% | 75.8% | 84.2% |
| Specificity | 95.5% | 99.5% | 93.9% | 98.2% | 99.1% | 97.1% |
| Accuracy | 97.3% | 96.0% | 93.1% | 95.5% | 98.0% | 95.6% |

4.6. Five-state classification of sleep stages

The confusion matrix of the five-stage sleep classification is displayed in Table 6. The average classification performance and kappa value were 91.8 ± 5.2% and 0.87 ± 0.03, respectively. The sensitivities of all stages, except S1, were higher than 84 ± 3.9%. This method had an outstanding classification rate for stages Wake and S2. Furthermore, the specificities of all stages were higher than 96%.

As in the previous section, the S1-REM pair had the highest misclassification rate (about 26%), while the S1-Wake and S1-S2 pairs were in the next places with about 18% and 15% classification error, respectively. The misclassification rate of (S3+S4) - S2 was also about 13%. The other paired-stages had marginal misclassification rates.

Table 6. Confusion matrix of five-stage sleep classification on Sleep-EDF dataset.

| Expert \ Proposed method | Wake | S1 | S2 | S3+S4 | REM |
|---|---|---|---|---|---|
| Wake | 3977 | 29 | 6 | 3 | 12 |
| S1 | 54 | 121 | 46 | 3 | 78 |
| S2 | 30 | 13 | 1647 | 69 | 52 |
| S3+S4 | 12 | 0 | 85 | 553 | 0 |
| REM | 22 | 34 | 72 | 1 | 676 |
| Sensitivity | 98.8% | 40.1% | 90.9% | 85.1% | 84.0% |
| Specificity | 96.2% | 98.9% | 96.2% | 98.3% | 97.8% |
| Accuracy | 97.7% | 96.5% | 94.9% | 97.6% | 96.3% |

By eliminating the Wake stage from the calculations, another type of five-stage sleep classification is obtained. In this case, the confusion matrix contained only NREM (S1 to S4) and REM sleep (Table 7). The average accuracy and kappa coefficient of the new state were 81.3 ± 3.6% and 0.71 ± 0.04, respectively.

Table 7. Confusion matrix of new five-stage sleep classification (without considering Wake stage) on Sleep-EDF dataset.

| Expert \ Proposed method | S1 | S2 | S3 | S4 | REM |
|---|---|---|---|---|---|
| S1 | 126 | 48 | 1 | 0 | 127 |
| S2 | 12 | 1652 | 67 | 7 | 73 |
| S3 | 1 | 101 | 181 | 53 | 0 |
| S4 | 1 | 6 | 53 | 254 | 0 |
| REM | 36 | 83 | 0 | 0 | 686 |
| Sensitivity | 41.7% | 91.2% | 53.9% | 80.9% | 85.2% |
| Specificity | 98.2% | 84.0% | 95.7% | 97.8% | 91.7% |
| Accuracy | 92.8% | 88.0% | 91.3% | 96.0% | 90.1% |

4.7. Four-state, three-state, and two-state classification

The two-stage to four-stage sleep classifications are mainly useful for diagnosing sleep-related disorders such as REM behavior disorder (RBD), narcolepsy, etc. (Hassan & Bhuiyan, 2017a). In this work, these classification models were mainly examined to evaluate the capability of our method for addressing sleep health problems (Table 8).

The average sensitivities of the four-state (Wake, S1+S2, S3+S4, and REM) classification were around 98.7%, 88.7%, 87.1%, and 78.4%, respectively, while all of the specificity values were higher than 95%. The average kappa value showed almost perfect agreement (κ = 0.88 ± 0.02) and the overall accuracy was 92.8 ± 3.3%.

The average sensitivity values of the three-state classification (Wake, NREM, and REM) were around 98.3%, 93.6%, and 78.6%, respectively, while the specificity of each stage was higher than 95.5%. Moreover, the kappa statistic value and overall agreement were 0.91 ± 0.03 and 94.5 ± 4.4%, respectively. Finally, in two-state classification, the accuracy and kappa coefficient for differentiating between Wake and Sleep were 97.9 ± 1.4% and 0.96 ± 0.01, respectively.

Table 8. Average performance of the proposed system for 4-state, 3-state, and 2-state classification on Sleep-EDF dataset.

| Measure | 4-state: Wake | 4-state: S1+S2 | 4-state: S3+S4 | 4-state: REM | 3-state: Wake | 3-state: NREM | 3-state: REM | 2-state: Wake | 2-state: Sleep |
|---|---|---|---|---|---|---|---|---|---|
| Sensitivity | 98.7% | 88.7% | 87.1% | 78.4% | 98.3% | 93.6% | 78.6% | 98.1% | 97.7% |
| Specificity | 96.3% | 95.3% | 98.9% | 98.4% | 97.5% | 95.8% | 98.0% | 97.7% | 98.1% |
| Accuracy | 97.6% | 93.5% | 97.8% | 96.2% | 98.0% | 95.0% | 95.9% | 97.9% | 97.9% |

4.8. Performance on DREAMS Subject Database

Our results were additionally validated on a larger and more recent dataset (DREAMS Subjects Database), which strengthens the analysis and illustrates the potency and capability of the designed system.

Tables 9 and 10 show the confusion matrices of the classification results based on the R&K and AASM guidelines, respectively. The results indicate that the overall accuracy and kappa coefficient of the proposed system were 83.4 ± 4.3% and 0.77 ± 0.02, respectively, according to the R&K criteria. Moreover, these values were 83.3 ± 5.6% and 0.77 ± 0.03 based on the AASM standard.

Table 9. Confusion matrix of the proposed sleep stage classification system according to R&K standard on DREAMS Subject Database.

| Expert \ Proposed method | Wake | S1 | S2 | S3 | S4 | REM |
|---|---|---|---|---|---|---|
| Wake | 1729 | 21 | 70 | 1 | 4 | 40 |
| S1 | 67 | 224 | 183 | 0 | 0 | 117 |
| S2 | 59 | 18 | 3994 | 83 | 44 | 220 |
| S3 | 2 | 0 | 183 | 346 | 181 | 0 |
| S4 | 3 | 0 | 27 | 86 | 860 | 0 |
| REM | 48 | 20 | 192 | 0 | 0 | 1257 |
| Sensitivity | 92.7% | 37.9% | 90.4% | 48.6% | 88.1% | 82.9% |
| Specificity | 97.4% | 99.3% | 87.1% | 97.9% | 97.1% | 95.0% |
| Accuracy | 96.4% | 95.2% | 88.6% | 94.0% | 96.1% | 93.0% |

Table 10. Confusion matrix of the proposed sleep stage classification system according to AASM criteria on DREAMS Subject Database.

| Expert \ Proposed method | W | N1 | N2 | N3 | R |
|---|---|---|---|---|---|
| W | 1664 | 48 | 47 | 1 | 16 |
| N1 | 117 | 220 | 207 | 3 | 193 |
| N2 | 52 | 46 | 3648 | 187 | 193 |
| N3 | 13 | 0 | 303 | 1651 | 0 |
| R | 24 | 64 | 175 | 0 | 1247 |
| Sensitivity | 93.7% | 29.7% | 88.4% | 83.9% | 82.6% |
| Specificity | 97.1% | 98.1% | 86.7% | 97.3% | 94.7% |
| Accuracy | 96.4% | 92.6% | 87.5% | 94.3% | 92.7% |

4.9. Comparative study

The classification accuracy and kappa coefficient of the proposed scheme were compared with the existing literature. To ensure a fair comparison, we only reported the best results of the studies that examined their methods on the Sleep-EDF dataset. As Table 11 shows, the proposed method has performance similar to the best classification results reported in the literature for six-state and five-state classification of sleep stages.

Table 11. Performance comparison of the proposed method with the other existing methods on Sleep-EDF database. The highest values are marked in boldface.

| Authors | Number of epochs | Classifier | 6-state: Accuracy | 6-state: Kappa | 5-state: Accuracy | 5-state: Kappa |
|---|---|---|---|---|---|---|
| Doroshenkov et al. (2007) | - | HMM | 61.8% | - | - | - |
| Berthomier et al. (2007) | 8500 | Fuzzy logic | - | - | 71.2% | 0.61 |
| Vural & Yildiz (2010) | 1378 | - | 83.5% | 0.75 | - | - |
| S. F. Liang et al. (2012) | 8 Rec. | LDA | - | - | 83.6% | 0.75 |
| S.-F. Liang et al. (2012) | 4 Rec. | DT | - | - | 78.0% | 0.68 |
| Ronzhina et al. (2012) | - | NN | 76.7% | - | - | - |
| Guohun Zhu, Li, & Wen (2012) | 11120 | SVM | 82.6% | - | - | - |
| Hsu et al. (2013) | 2880 | NN | - | - | 87.2% | 0.80 |
| Bajaj & Pachori (2013) | 4700 | SVM | 88.5% | - | 92.9% | **0.90** |
| G. Zhu, Li, & Wen (2014) | 14963 | SVM | 87.5% | 0.81 | 88.9% | 0.83 |
| Hassan, Bashar, et al. (2015) | 15188 | Boosted DT | 80.3% | - | 82.0% | - |
| Hassan & Bhuiyan (2015) | 15188 | AdaBoost | 86.9% | - | 89.5% | - |
| Hassan & Bhuiyan (2016) | 15188 | AdaBoost | 88.6% | 0.79 | 90.1% | 0.84 |
| Diykh, Li, & Wen (2016) | 14963 | K means | **95.9%** | 0.82 | - | - |
| S.-F. Liang et al. (2016) | 4 Rec. | Genetic Fuzzy | - | - | 81.3% | 0.72 |
| Hassan & Bhuiyan (2016a) | 15188 | RF | 90.4% | 0.84 | 91.5% | 0.86 |
| Hassan & Bhuiyan (2016b) | 15188 | Bagging | 86.9% | 0.76 | 90.7% | 0.83 |
| Hassan & Subasi (2017) | 15188 | Bagging | 92.4% | 0.84 | **93.7%** | 0.85 |
| Hassan & Bhuiyan (2017a) | 15188 | AdaBoost | 90.0% | 0.84 | 91.4% | 0.86 |
| Hassan & Bhuiyan (2017b) | 15188 | RUSBoost | 88.1% | **0.88** | 83.5% | 0.84 |
| Proposed method | 15186 | SVM | 90.6% | 0.85 | 91.8% | 0.87 |

It is noteworthy that the overall accuracy of the proposed algorithm was slightly lower than that of the Diykh, Li, & Wen (2016) and Hassan et al. (2017a) studies, but our method was more robust than those in terms of the kappa statistic. Moreover, the accuracy and kappa coefficient reported by Bajaj et al. (2013) were higher than those of our method; however, they validated their method on a small number of EEG segments (about 4700 segments).
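Because the comparison above rests on two different agreement measures, a minimal sketch of how both can be derived from one confusion matrix may help. The function name `accuracy_and_kappa` and the toy matrix are illustrative assumptions, not taken from the paper; the kappa formula is the standard one from Cohen (1960).

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of epochs whose true class is i and
    whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no epochs")

    # Observed agreement: fraction of epochs on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total

    # Chance agreement: product of row and column marginals, summed over classes.
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    if expected == 1.0:
        # Degenerate case: all epochs fall in a single class.
        return observed, 0.0
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical, imbalanced two-class example (values invented for illustration).
toy_confusion = [[90, 10],
                 [5, 15]]
toy_accuracy, toy_kappa = accuracy_and_kappa(toy_confusion)
print(f"accuracy={toy_accuracy:.3f}, kappa={toy_kappa:.3f}")
```

In the toy example the accuracy is high while kappa is noticeably lower, because most of the agreement is explained by the dominant class. This is the reason a method can rank differently on the two measures.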

Moreover, Table 12 shows that the results obtained using the proposed method were comparable to those of studies using the same database (i.e., the DREAMS Subject Database). The average accuracy and kappa statistic were approximately the same for the two criteria (R&K and AASM).

Table 12. Performance comparison of the proposed method with the other existing methods on DREAMS Subject Database according to both R&K and AASM criteria. The highest values are marked in boldface.

| Authors | Number of epochs | Classifier | R&K accuracy | R&K kappa | AASM accuracy | AASM kappa |
|---|---|---|---|---|---|---|
| Hassan & Bhuiyan (2016a) | 20257 | RF | 68.7% | - | 72.3% | - |
| Hassan & Subasi (2017) | 20257 | Bagging | 76.4% | 0.78 | 78.95% | 0.82 |
| Hassan & Bhuiyan (2017b) | 20257 | RUSBoost | 70.7% | 0.70 | 74.6% | 0.74 |
| Proposed method | 20257 | SVM | 83.4% | 0.77 | 83.3% | 0.77 |

The ability of the proposed method to discriminate between other pairs of two-state sleep classifications (i.e., NREM - REM, S1+S2 - S3+S4, Wake - REM, S1 - REM, S1 - S2, S2 - S3, and S3 - S4) was also evaluated. The overall binary classification results are presented in Table 13. The accuracy of the proposed framework was higher than the results reported by Zhu et al. (2014) for the S1 - REM, S1 - S2, and S1+S2 - S3+S4 pairs. Moreover, Vatankhah et al. (2010) achieved 90.6% classification accuracy for S3 - S4, but they tested their method on a limited number of EEG epochs (about 1200 epochs). The proposed method could classify the S2 - S3 pair with 92.8% accuracy, which was higher than the result of the Vatankhah et al. (2010) study.

Table 13. Other pairs of two-state sleep stage classification on Sleep-EDF dataset and comparison with previous studies. The highest accuracy and kappa coefficient values are highlighted in bold.

| Other 2-state classification problem | Proposed method: accuracy | Proposed method: kappa | Zhu et al. (2014): accuracy | Zhu et al. (2014): kappa | Vatankhah et al. (2010): accuracy | Vatankhah et al. (2010): kappa |
|---|---|---|---|---|---|---|
| Wake - REM | 98.3% | 0.94 | 98.8% | 0.96 | 98.2% | - |
| S1 - REM | 83.8% | 0.57 | 78.8% | 0.34 | - | - |
| S1+S2 - S3+S4 | 94.1% | 0.83 | 91.2% | 0.74 | - | - |
| S1 - S2 | 93.2% | 0.81 | 91.6% | 0.61 | - | - |
| S3 - S4 | 83.4% | 0.67 | 83.0% | 0.66 | 90.6% | - |
| S2 - S3 | 92.8% | 0.72 | - | - | 91.5% | - |
| NREM - REM | 92.8% | 0.77 | - | - | - | - |
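Each two-state problem above (for example S1+S2 versus S3+S4) amounts to keeping only the epochs that belong to one of two stage groups and relabeling them as a binary target. The following is a minimal sketch of that data preparation step. The helper name `two_state_subset`, the stage label strings and the toy feature values are assumptions made for illustration; the paper's own implementation is not shown in this section.

```python
def two_state_subset(features, labels, group_a, group_b):
    """Select epochs from two stage groups and relabel them as 0 / 1.

    features: list of per-epoch feature vectors
    labels:   list of stage names, one per epoch
    group_a:  set of stage names mapped to class 0
    group_b:  set of stage names mapped to class 1
    Epochs whose stage is in neither group are dropped.
    """
    group_a, group_b = set(group_a), set(group_b)
    if group_a & group_b:
        raise ValueError("stage groups must not overlap")

    subset_features, subset_targets = [], []
    for feature_vector, stage in zip(features, labels):
        if stage in group_a:
            subset_features.append(feature_vector)
            subset_targets.append(0)
        elif stage in group_b:
            subset_features.append(feature_vector)
            subset_targets.append(1)
    return subset_features, subset_targets


# Hypothetical one-dimensional features for five epochs.
toy_features = [[0.1], [0.2], [0.3], [0.4], [0.5]]
toy_labels = ["S1", "S2", "S3", "REM", "S4"]
x, y = two_state_subset(toy_features, toy_labels, {"S1", "S2"}, {"S3", "S4"})
print(x, y)
```

The resulting `x` and `y` can then be passed to any binary classifier; the REM epoch is discarded because it belongs to neither group.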

Furthermore, one exceptional ability of the suggested algorithm is its correct classification of S1 epochs, a stage that is easily misclassified as any of the other stages except SWS. In addition, training a model with high sensitivity for S1 is demanding, because the number of S1 epochs is considerably lower than that of the other stages (S.-F. Liang et al., 2012). Table 14 suggests that the S1 detection performance of the proposed method is higher than that of the previous works listed: our method correctly detected S1 epochs with a sensitivity of 41.1% (six-state classification) and 40.1% (five-state classification).

Table 14. S1 detection performance of various studies for 5-stage and 6-stage sleep classification on Sleep-EDF dataset. The highest accuracy values are highlighted in bold.

| Authors | S1 detection accuracy (5-state classification) | S1 detection accuracy (6-state classification) |
|---|---|---|
| Doroshenkov et al. (2007) | - | 4.8% |
| Vural & Yildiz (2010) | - | 33.7% |
| Ronzhina et al. (2012) | - | 35.8% |
| S. F. Liang et al. (2012) | 18.8% | - |
| S.-F. Liang et al. (2012) | 30.0% | - |
| Hsu et al. (2013) | 36.7% | - |
| G. Zhu et al. (2014) | 15.8% | - |
| S.-F. Liang et al. (2016) | 39.6% | - |
| Hassan & Bhuiyan (2016) | 39.7% | 39.1% |
| Hassan & Bhuiyan (2016a) | 37.4% | 38.7% |
| Prakash & Roy (2017) | 11.9% | 11.0% |
| Hassan & Subasi (2017) | 38.7% | 37.4% |
| Hassan & Bhuiyan (2017a) | 39.7% | 40.7% |
| Proposed method | 40.1% | 41.1% |
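The S1 figures in Table 14 are per-class sensitivities, i.e. the share of true S1 epochs that a classifier labels as S1. As a minimal sketch, assuming a confusion matrix with true classes in rows and predictions in columns, per-stage sensitivity can be computed as follows. The function name and the toy counts are hypothetical and do not reproduce any result from the paper.

```python
def sensitivity_per_class(confusion, labels):
    """Return {label: sensitivity} from a square confusion matrix.

    confusion[i][j] counts epochs of true class labels[i] predicted as
    labels[j]. Sensitivity of a class is its diagonal count divided by
    its row total.
    """
    result = {}
    for i, label in enumerate(labels):
        row_total = sum(confusion[i])
        # A class with no true epochs has undefined sensitivity.
        result[label] = confusion[i][i] / row_total if row_total else float("nan")
    return result


# Hypothetical three-stage example in which S1 is the minority class.
toy_labels = ["Wake", "S1", "S2"]
toy_confusion = [[80, 15, 5],
                 [20, 40, 40],
                 [5, 10, 185]]
for stage, value in sensitivity_per_class(toy_confusion, toy_labels).items():
    print(f"{stage}: {value:.3f}")
```

In this toy matrix the overall accuracy is 76.25% even though only 40% of S1 epochs are detected, which illustrates why S1 sensitivity is reported separately from overall accuracy.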

5. Discussion

The aim of the present study was to propose a robust and reliable computer-aided sleep stage scoring system to overcome the common difficulties of manual sleep staging. The rationale for each step of the algorithm's development was discussed in detail, and the generalization of the proposed methodology was analyzed on a large number of sleep EEG epochs (more than 35000 epochs) from the Sleep-EDF dataset and the DREAMS Subject Database. The main advantages and contributions of this paper can be summarized as follows.

First, the proposed scheme does not require any manual or automatic artifact rejection algorithm to eliminate common physiological artifacts. An artifact processing step usually imposes an excessive computational cost and increases the complexity of the algorithm.

Second, a critical disadvantage of PSG is its long acquisition preparation time. The proposed single-channel system can considerably reduce the preparation time by using fewer recording sensors, and it facilitates the process of PSG recording. Hence, compared to conventional PSG recordings that require multiple physiological signals (at least EEG, EOG, and EMG), our method has the potential to be applied in a portable healthcare device.

Third, in the context of biomedical signal processing, feature extraction is a vital step, because sufficiently discriminative features require no complicated classification mechanism. In most existing studies (Khalighi et al., 2013; S.-F. Liang et al., 2016, 2012; Ouanes & Rejeb, 2016; Özşen, 2013; Şen et al., 2014; Vural & Yildiz, 2010), the process of feature extraction is highly complex, because different types of features are derived from the EEG signals. In this research, by contrast, an expert system was designed for multi-class classification of sleep stages using only one type of feature, i.e., SBLE features.

Fourth, the proposed automatic scheme obtained robust classification results without any post-processing step (i.e., contextual rule smoothing). Employing such rules after the classification step can correct a significant number of misclassifications and increase the overall performance of an automatic sleep stage classification system (Berthomier et al., 2007; Huang et al., 2014).

Fifth, one of the major advantages of the SBLE feature is its ability to yield high classification performance when only a small proportion of the segments is selected as the training data set. This property accelerates classifier training and thus reduces the total processing time and computational cost (Table 3).

Sixth, our results also outperformed those of previous studies that extracted features from more than one electrophysiological channel (EEG, EOG, or EMG) to design an accurate sleep stage scoring system (Charbonnier, Zoubek, Lesecq, & Chapotot, 2011; Huang et al., 2014; Krakovská & Mezeiová, 2011; S.-F. Liang et al., 2012; Tagluk, Sezgin, & Akin, 2010; Virkkala et al., 2007).
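The contextual rule smoothing named under the fourth point is a post-processing step that the proposed method deliberately does not use. To make concrete what such a step looks like, here is a minimal sketch of one very simple rule: an isolated epoch whose two neighbors agree with each other is relabeled to match them. This single rule and the name `smooth_isolated_epochs` are assumptions chosen for illustration; the rule sets used in the cited studies are more elaborate and are not reproduced here.

```python
def smooth_isolated_epochs(stages):
    """Relabel single epochs that disagree with two identical neighbors.

    stages: list of predicted stage names in temporal order.
    The rule is evaluated on the original predictions, so one
    correction never triggers another.
    """
    smoothed = list(stages)
    for i in range(1, len(stages) - 1):
        previous_stage = stages[i - 1]
        next_stage = stages[i + 1]
        if previous_stage == next_stage and stages[i] != previous_stage:
            smoothed[i] = previous_stage
    return smoothed


# Hypothetical hypnogram fragment: the lone S1 inside a run of S2 is corrected.
predicted = ["S2", "S2", "S1", "S2", "REM", "REM"]
print(smooth_isolated_epochs(predicted))
```

Because rules of this kind use the temporal context of neighboring epochs, they can raise accuracy independently of the features, which is why results obtained without them are a stricter test of the feature set.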

Since SBLE deals with the local extrema of time series, it is well-suited for analyzing oscillatory signals. The proposed framework can also be implemented for other EEG based signal analysis and classification problems, such as detecting transient EEG events during sleep and automatic detection of arousals in digital PSG recordings.
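As a rough illustration of what working with local extrema involves, the sketch below finds the strict local maxima and minima of a sampled signal, quantizes their amplitudes into equal-width intervals, and summarizes the resulting symbols as a histogram. It is only a simplified stand-in under stated assumptions: the number of intervals, the equal-width binning and the histogram summary are choices made here, and the micro and macro pattern statistics that define the actual SBLE features are not reproduced.

```python
def local_extrema(signal):
    """Return (index, value, kind) for each strict interior extremum."""
    extrema = []
    for i in range(1, len(signal) - 1):
        left, mid, right = signal[i - 1], signal[i], signal[i + 1]
        if mid > left and mid > right:
            extrema.append((i, mid, "max"))
        elif mid < left and mid < right:
            extrema.append((i, mid, "min"))
    return extrema


def quantize(values, n_levels):
    """Map each value to an interval index in 0..n_levels-1 (equal widths)."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        return [0] * len(values)
    width = (highest - lowest) / n_levels
    # The maximum value would fall one past the last interval, so clamp it.
    return [min(int((v - lowest) / width), n_levels - 1) for v in values]


def symbol_histogram(symbols, n_levels):
    """Relative frequency of each interval index."""
    counts = [0] * n_levels
    for symbol in symbols:
        counts[symbol] += 1
    total = len(symbols)
    return [count / total for count in counts] if total else counts


# Hypothetical short signal segment (arbitrary units).
toy_signal = [0, 2, 1, 3, -1, 4, 0]
amplitudes = [value for _, value, _ in local_extrema(toy_signal)]
symbols = quantize(amplitudes, n_levels=5)
print(amplitudes, symbols, symbol_histogram(symbols, n_levels=5))
```

Only comparisons and a single division per extremum are needed, which is consistent with the low signal processing complexity claimed for time domain features of this kind.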

For exploratory purposes, the effectiveness of the features selected by the MCFS algorithm was also validated by statistical analysis and graphical representation. The statistical analysis results in Table 1 and Figure 7 indicate that SBLE is a powerful feature for separating all possible pairs of sleep stages. In previous studies (Doroshenkov et al., 2007; Hassan & Bhuiyan, 2016a, 2017a; Hassan & Subasi, 2017; S. F. Liang et al., 2012; S.-F. Liang et al., 2016, 2012; Prakash & Roy, 2017; Ronzhina et al., 2012; G. Zhu et al., 2014), a substantial number of misclassifications occur in transitory stages such as S1 and S3. This can be explained in the context of brain physiology during sleep. From a neurophysiological point of view, S1 is considered a transition period between wakefulness and sleep, and its brain activity is therefore a combination of several waves with different frequencies. During stage S1, the brain waves transition from relatively unsynchronized beta and gamma waves (the normal brain activity for the Wake and REM stages) to more synchronized but slower alpha waves, and then to theta waves (the normal brain activity for stage S2) (Horne, 2013). Therefore, the similar EEG patterns of stages Wake, S1, and REM can be the primary cause of the poor performance of conventional methods (Iber, 2007; Tsinalis, Matthews, & Guo, 2016; Tsinalis, Matthews, Guo, et al., 2016). Moreover, in six-stage sleep classification, misclassification among S2, S3, and S4 is a major problem. This could have two reasons. First, according to the R&K guideline, stages S3 and S4 differ only marginally: delta waves occupy less than 50 percent of an epoch in stage S3 and more than 50 percent of an epoch in stage S4 (Rechtschaffen & Kales, 1968). Second, the misclassification between stages S2 and SWS can be attributed to the presence of sleep spindles and K-complexes in SWS (Iber, 2007).
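The S3/S4 boundary described above depends on a single threshold, which is one reason the two stages are hard to separate and are often merged. The sketch below encodes that threshold and the merging of S3 and S4 into SWS. It assumes that the epoch has already been identified as slow-wave sleep, that an epoch with exactly 50 percent delta activity is assigned to S3 (the text leaves this case open), and that the five-state problem is obtained from the six-state one by merging S3 and S4; the names are hypothetical.

```python
# Assumed mapping from the six R&K states to the five-state problem.
SIX_TO_FIVE_STATE = {
    "Wake": "Wake",
    "S1": "S1",
    "S2": "S2",
    "S3": "SWS",
    "S4": "SWS",
    "REM": "REM",
}


def rk_slow_wave_label(delta_fraction):
    """Label a slow-wave epoch as S3 or S4 from its delta-wave fraction.

    delta_fraction: share of the epoch occupied by delta waves, in [0, 1].
    The tie at exactly 0.5 is resolved toward S3 by assumption.
    """
    if not 0.0 <= delta_fraction <= 1.0:
        raise ValueError("delta_fraction must lie between 0 and 1")
    return "S4" if delta_fraction > 0.5 else "S3"


def to_five_state(labels):
    """Merge S3 and S4 into SWS, leaving the other stages unchanged."""
    return [SIX_TO_FIVE_STATE[label] for label in labels]


# Two hypothetical epochs on either side of the threshold become one class.
six_state = [rk_slow_wave_label(0.48), rk_slow_wave_label(0.52), "S2"]
print(six_state, to_five_state(six_state))
```

Two epochs with 48 and 52 percent delta activity receive different six-state labels but the same five-state label, which is consistent with the higher agreement generally observed for five-state classification.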

Table 2 demonstrates that Fpz-Cz is the most appropriate EEG montage for the different multi-class classification problems, yielding higher performance than Pz-Oz. A close look at Tables 4 and 5 reveals that the classification performance of the two EEG channels (Fpz-Cz and Pz-Oz) is roughly the same overall, although the misclassification rates of S1-Wake and S1-S2 were higher in the Pz-Oz derivation than in Fpz-Cz (by about 14% and 12%, respectively). In the same way, for stage S3, the S2-S3 misclassification rate in Pz-Oz was considerably higher than in Fpz-Cz (by about 23%). The other stage pairs prone to misclassification, i.e., the S1-REM and S3-S4 pairs, changed only marginally. Because delta waves, slow sleep spindles, and K-complexes are dominant over the frontal and mediofrontal regions, the number of epochs mistakenly assigned to stages S1 or S3 decreased when the Fpz-Cz derivation was employed instead of Pz-Oz (Finelli, Borbély, & Achermann, 2001; Happe et al., 2002; McCormick, Nielsen, Nicolas, Ptito, & Montplaisir, 1997). Furthermore, it can be hypothesized that the high performance of the Fpz-Cz derivation in discriminating between stages Wake and S1 is related to alpha waves. Although alpha activity predominantly originates from the occipital lobe, it can be more obvious in the frontal regions (Finelli et al., 2001; Tsinalis, Matthews, & Guo, 2016; Tsinalis, Matthews, Guo, et al., 2016).

Table 7 indicates that eliminating stage Wake from our analysis reduced the overall agreement of the proposed scheme. Nevertheless, compared with other studies, the proposed automated system classified the new five-state problem with higher overall accuracy than the works of Zhu et al. (2014) and Tagluk et al. (2010) (81.25% vs. 76.6% and 74.7%, respectively). In addition, a detailed investigation of how various feature selection algorithms affect the overall performance of the sleep staging system was also performed (Fig. 6). Based on the results, the supervised MCFS technique was the best algorithm in terms of accuracy, followed by the other filter-based feature selection approaches, which were also found to be effective in the classification of sleep stages.

The current study also shows that the results of testing our method on both PSG datasets (Sleep-EDF dataset and DREAMS Subject Database) are accurate and robust, which means that the novel single-channel automatic sleep stage classification system has a remarkable generalization capacity in healthy subjects (Tables 11 and 12). All in all, with regard to the Comparative study section, it is concluded that the proposed method is a more robust and accurate system for solving multi-class sleep stage classification problems and that its overall performance is considerably better.

6. Conclusions

The main idea of this paper was to implement an expert system for multi-class classification of sleep stages based on tracking the dynamical changes of EEG signals. For this purpose, a novel time domain feature named SBLE was proposed. In this method, the amplitude values of local extrema are quantized into a number of intervals to quantify the dynamical behavior of signals by considering a series of micro and macro patterns. Compared with other existing studies, the automatic sleep staging approach provided promising results with lower signal processing complexity. In addition, the proposed methodology yielded remarkable performance for different types of two-state sleep stage classification, especially for transitory stages such as S1-REM and quite similar stages such as S3-S4. This study also suggests that the symbolic approach can provide an effective quantitative measure for studying the dynamic changes of EEG signals across sleep stages. Since the primary application of sleep stage scoring is diagnostic, the effectiveness of the proposed approach needs to be evaluated on whole-night PSG recordings from sleep-disordered patients. This is the major limitation of the present study. Furthermore, how SBLE behaves on magnetoencephalography (MEG) and electrocorticography (ECoG) signals would be an interesting topic for further research. In conclusion, the obtained results demonstrate that the proposed system can effectively assist neuroscience researchers and sleep physicians by making the procedure of visual sleep scoring more accurate and robust.

References

Aboalayon, K. A. I., Faezipour, M., Almuhammadi, W. S., & Moslehpour, S. (2016). Sleep Stage Classification Using EEG Signal Analysis: A Comprehensive Survey and New Investigation. Entropy, 18(9), 272. https://doi.org/10.3390/e18090272

Acharya, U. R., Bhat, S., Faust, O., Adeli, H., Chua, E. C.-P., Lim, W. J. E., & Koh, J. E. W. (2015). Nonlinear Dynamics Measures for Automated EEG-Based Sleep Stage Detection. European Neurology, 74(5–6), 268–287. https://doi.org/10.1159/000441975

Acharya, U. R., Chua, E. C.-P., Chua, K. C., Min, L. C., & Tamura, T. (2010). Analysis and automatic identification of sleep stages using higher order spectra. International Journal of Neural Systems, 20(6), 509–521. https://doi.org/10.1142/S0129065710002589

Agarwal, R., & Gotman, J. (2001). Computer-assisted sleep staging. IEEE Transactions on Biomedical Engineering, 48(12), 1412–1423. https://doi.org/10.1109/10.966600

Allada, R., & Siegel, J. M. (2008). Unearthing the Phylogenetic Roots of Sleep. Current Biology, 18(15), R670–R679. https://doi.org/10.1016/j.cub.2008.06.033

Álvarez-Estévez, D., & Moret-Bonillo, V. (2011). Identification of Electroencephalographic Arousals in Multichannel Sleep Recordings. IEEE Transactions on Biomedical Engineering, 58(1), 54–63. https://doi.org/10.1109/TBME.2010.2075930

Amigo, J. M., Keller, K., & Unakafova, V. A. (2014). Ordinal symbolic analysis and its application to biomedical recordings. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 373(2034), 20140091–20140091. https://doi.org/10.1098/rsta.2014.0091

Bajaj, V., & Pachori, R. B. (2013). Automatic classification of sleep stages based on the time-frequency image of EEG signals. Computer Methods and Programs in Biomedicine, 112(3), 320–328. https://doi.org/10.1016/j.cmpb.2013.07.006

Balakrishnan, G., Shoeb, A., & Syed, Z. (2010). Creating symbolic representations of electroencephalographic signals: An investigation of alternate methodologies on intracranial data (pp. 4683–4686). IEEE. https://doi.org/10.1109/IEMBS.2010.5626414

Becq, G., Charbonnier, S., Chapotot, F., Buguet, A., Bourdon, L., & Baconnier, P. (2005). Comparison Between Five Classifiers for Automatic Scoring of Human Sleep Recordings. In D. S. K. Halgamuge & D. L. Wang (Eds.), Classification and Clustering for Knowledge Discovery (pp. 113–127). Springer Berlin Heidelberg. Retrieved from http://link.springer.com/chapter/10.1007/11011620_8

Berthomier, C., Drouot, X., Herman-Stoïca, M., Berthomier, P., Prado, J., Bokar-Thire, D., … d’Ortho, M.-P. (2007). Automatic Analysis of Single-Channel Sleep EEG: Validation in Healthy Individuals. Sleep, 30(11), 1587–1595.

Cai, D., Zhang, C., & He, X. (2010). Unsupervised Feature Selection for Multi-cluster Data. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 333–342). New York, NY, USA: ACM. https://doi.org/10.1145/1835804.1835848

Canelas, A., Neves, R., & Horta, N. (2012). A new SAX-GA methodology applied to investment strategies optimization (p. 1055). ACM Press. https://doi.org/10.1145/2330163.2330310

Chang, C.-C., & Lin, C.-J. (2011). LIBSVM: A Library for Support Vector Machines. ACM Trans. Intell. Syst. Technol., 2(3), 27:1–27:27. https://doi.org/10.1145/1961189.1961199

Chang, R. J., Liu, Y., Chen, Q. P., & Wang, J. (2013). Sleep Electroencephalogram Analysis Based on Symbolic Transfer Entropy. Advanced Materials Research, 765–767, 2678–2681. https://doi.org/10.4028/www.scientific.net/AMR.765-767.2678

Charbonnier, S., Zoubek, L., Lesecq, S., & Chapotot, F. (2011). Self-evaluated automatic classifier as a decision-support tool for sleep/wake staging. Computers in Biology and Medicine, 41(6), 380–389. https://doi.org/10.1016/j.compbiomed.2011.04.001

Chih-Wei Hsu, & Chih-Jen Lin. (2002). A comparison of methods for multiclass support vector machines. IEEE Transactions on Neural Networks, 13(2), 415–425. https://doi.org/10.1109/72.991427

Cohen, J. (1960). A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, 20(1), 37–46. https://doi.org/10.1177/001316446002000104

Collop, N. A. (2002). Scoring variability between polysomnography technologists in different sleep laboratories. Sleep Medicine, 3(1), 43–47. https://doi.org/10.1016/S1389-9457(01)00115-0

Daw, C. S., Finney, C. E. A., & Tracy, E. R. (2003). A review of symbolic analysis of experimental data. Review of Scientific Instruments, 74(2), 915–930. https://doi.org/10.1063/1.1531823

Diykh, M., & Li, Y. (2016). Complex networks approach for EEG signal sleep stages classification. Expert Systems with Applications, 63, 241–248. https://doi.org/10.1016/j.eswa.2016.07.004

Diykh, M., Li, Y., & Wen, P. (2016a). EEG Sleep Stages Classification Based on Time Domain Features and Structural Graph Similarity. IEEE Transactions on Neural Systems and Rehabilitation Engineering: A Publication of the IEEE Engineering in Medicine and Biology Society, 24(11), 1159–1168. https://doi.org/10.1109/TNSRE.2016.2552539

Diykh, M., Li, Y., & Wen, P. (2016b). EEG Sleep Stages Classification Based on Time Domain Features and Structural Graph Similarity. IEEE Transactions on Neural Systems and Rehabilitation Engineering: A Publication of the IEEE Engineering in Medicine and Biology Society, 24(11), 1159–1168. https://doi.org/10.1109/TNSRE.2016.2552539

Doroshenkov, L. G., Konyshev, V. A., & Selishchev, S. V. (2007). Classification of human sleep stages based on EEG processing using hidden Markov models. Meditsinskaia Tekhnika, (1), 24–28.

Ebrahimi, F., Mikaeili, M., Estrada, E., & Nazeran, H. (2008). Automatic sleep stage classification based on EEG signals by using neural networks and wavelet packet coefficients. Conference Proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference, 2008, 1151–1154. https://doi.org/10.1109/IEMBS.2008.4649365

Ebrahimi, F., Setarehdan, S.-K., Ayala-Moyeda, J., & Nazeran, H. (2013). Automatic sleep staging using empirical mode decomposition, discrete wavelet transform, time-domain, and nonlinear dynamics features of heart rate variability signals. Computer Methods and Programs in Biomedicine, 112(1), 47–57. https://doi.org/10.1016/j.cmpb.2013.06.007

Ebrahimi, F., Setarehdan, S.-K., & Nazeran, H. (2015). Automatic sleep staging by simultaneous analysis of ECG and respiratory signals in long epochs. Biomedical Signal Processing and Control, 18, 69–79. https://doi.org/10.1016/j.bspc.2014.12.003

Enshaeifar, S., Kouchaki, S., Took, C. C., & Sanei, S. (2016). Quaternion Singular Spectrum Analysis of Electroencephalogram With Application in Sleep Analysis. IEEE Transactions on Neural Systems and Rehabilitation Engineering: A Publication of the IEEE Engineering in Medicine and Biology Society, 24(1), 57–67. https://doi.org/10.1109/TNSRE.2015.2465177

Farag, A. F., & El-Metwally, S. M. (2012). Detrended Fluctuation Analysis Features for Automated Sleep Staging of Sleep EEG. International Journal of Systems Biology and Biomedical Technologies, 1(4), 47–59. https://doi.org/10.4018/ijsbbt.2012100104

Finelli, L. A., Borbély, A. A., & Achermann, P. (2001). Functional topography of the human nonREM sleep electroencephalogram. The European Journal of Neuroscience, 13(12), 2282–2290.

Flexer, A., Gruber, G., & Dorffner, G. (2005). A reliable probabilistic sleep stager based on a single EEG signal. Artificial Intelligence in Medicine, 33(3), 199–207. https://doi.org/10.1016/j.artmed.2004.04.004

Fraiwan, L., Lweesy, K., Khasawneh, N., Fraiwan, M., Wenz, H., & Dickhaus, H. (2010). Classification of sleep stages using multi-wavelet time frequency entropy and LDA. Methods of Information in Medicine, 49(3), 230–237. https://doi.org/10.3414/ME09-01-0054

Gao, J. F., Yang, Y., Lin, P., Wang, P., & Zheng, C. X. (2010). Automatic Removal of Eye-Movement and Blink Artifacts from EEG Signals. Brain Topography, 23(1), 105–114. https://doi.org/10.1007/s10548-009-0131-4

Gao, M., Wu, M., Li, X., & Wang, J. (2017). Analysis of Sleep Staging Based on Multivariate Symbolic Transfer Entropy. DEStech Transactions on Engineering and Technology Research, 0(iceeac). https://doi.org/10.12783/dtetr/iceeac2017/10752

Ge, S., Wang, R., & Yu, D. (2014). Classification of Four-Class Motor Imagery Employing Single-Channel Electroencephalography. PLoS ONE, 9(6), e98019. https://doi.org/10.1371/journal.pone.0098019

Goldberger, A. L., Amaral, L. A., Glass, L., Hausdorff, J. M., Ivanov, P. C., Mark, R. G., … Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation, 101(23), E215-220.

Güneş, S., Polat, K., & Yosunkaya, Ş. (2010). Efficient sleep stage recognition system based on EEG signal using k-means clustering based feature weighting. Expert Systems with Applications, 37(12), 7922–7928. https://doi.org/10.1016/j.eswa.2010.04.043

Happe, S., Anderer, P., Gruber, G., Klösch, G., Saletu, B., & Zeitlhofer, J. (2002). Scalp topography of the spontaneous K-complex and of delta-waves in human sleep. Brain Topography, 15(1), 43–49.

Hassan, A. R. (2016). Computer-aided obstructive sleep apnea detection using normal inverse Gaussian parameters and adaptive boosting. Biomedical Signal Processing and Control, 29, 22–30. https://doi.org/10.1016/j.bspc.2016.05.009

Hassan, A. R., Bashar, S. K., & Bhuiyan, M. I. H. (2015). Automatic classification of sleep stages from single-channel electroencephalogram (pp. 1–6). IEEE. https://doi.org/10.1109/INDICON.2015.7443756

Hassan, A. R., & Bhuiyan, M. I. H. (2015). Automatic sleep stage classification (pp. 211–216). IEEE. https://doi.org/10.1109/EICT.2015.7391948

Hassan, A. R., & Bhuiyan, M. I. H. (2016a). A decision support system for automatic sleep staging from EEG signals using tunable Q-factor wavelet transform and spectral features. Journal of Neuroscience Methods, 271, 107–118. https://doi.org/10.1016/j.jneumeth.2016.07.012

Hassan, A. R., & Bhuiyan, M. I. H. (2016b). Computer-aided sleep staging using Complete Ensemble Empirical Mode Decomposition with Adaptive Noise and bootstrap aggregating. Biomedical Signal Processing and Control, 24, 1–10. https://doi.org/10.1016/j.bspc.2015.09.002

Hassan, A. R., & Bhuiyan, M. I. H. (2017a). An automated method for sleep staging from EEG signals using normal inverse Gaussian parameters and adaptive boosting. Neurocomputing, 219, 76–87. https://doi.org/10.1016/j.neucom.2016.09.011

Hassan, A. R., & Bhuiyan, M. I. H. (2017b). Automated identification of sleep states from EEG signals by means of ensemble empirical mode decomposition and random under sampling boosting. Computer Methods and Programs in Biomedicine, 140, 201–210. https://doi.org/10.1016/j.cmpb.2016.12.015

Hassan, A. R., & Haque, M. A. (2016a). Computer-aided obstructive sleep apnea identification using statistical features in the EMD domain and extreme learning machine. Biomedical Physics & Engineering Express, 2(3), 035003. https://doi.org/10.1088/2057-1976/2/3/035003

Hassan, A. R., & Haque, M. A. (2016b). Computer-aided obstructive sleep apnea screening from single-lead electrocardiogram using statistical and spectral features and bootstrap aggregating. Biocybernetics and Biomedical Engineering, 36(1), 256–266. https://doi.org/10.1016/j.bbe.2015.11.003

Hassan, A. R., & Haque, M. A. (2017). An expert system for automated identification of obstructive sleep apnea from single-lead ECG using random under sampling boosting. Neurocomputing, 235, 122–130. https://doi.org/10.1016/j.neucom.2016.12.062

Hassan, A. R., & Hassan Bhuiyan, M. I. (2016). Automatic sleep scoring using statistical features in the EMD domain and ensemble methods. Biocybernetics and Biomedical Engineering, 36(1), 248–255. https://doi.org/10.1016/j.bbe.2015.11.001

Hassan, A. R., Siuly, S., & Zhang, Y. (2016). Epileptic seizure detection in EEG signals using tunable-Q factor wavelet transform and bootstrap aggregating. Computer Methods and Programs in Biomedicine, 137, 247–259. https://doi.org/10.1016/j.cmpb.2016.09.008

Hassan, A. R., & Subasi, A. (2016). Automatic identification of epileptic seizures from EEG signals using linear programming boosting. Computer Methods and Programs in Biomedicine, 136, 65–77. https://doi.org/10.1016/j.cmpb.2016.08.013

Hassan, A. R., & Subasi, A. (2017). A decision support system for automated identification of sleep stages from single-channel EEG signals. Knowledge-Based Systems, 128, 115–124. https://doi.org/10.1016/j.knosys.2017.05.005

Horne, J. (2013). Why REM sleep? Clues beyond the laboratory in a more challenging world. Biological Psychology, 92(2), 152–168. https://doi.org/10.1016/j.biopsycho.2012.10.010

Hsu, Y.-L., Yang, Y.-T., Wang, J.-S., & Hsu, C.-Y. (2013). Automatic sleep stage recurrent neural classifier using energy features of EEG signals. Neurocomputing, 104, 105–114. https://doi.org/10.1016/j.neucom.2012.11.003

Huang, C.-S., Lin, C.-L., Ko, L.-W., Liu, S.-Y., Su, T.-P., & Lin, C.-T. (2014). Knowledge-based identification of sleep stages based on two forehead electroencephalogram channels. Frontiers in Neuroscience, 8, 263. https://doi.org/10.3389/fnins.2014.00263

Iber, C. (2007). The AASM Manual for the Scoring of Sleep and Associated Events: Rules, Terminology and Technical Specifications. American Academy of Sleep Medicine.

Ioffe, S., & Szegedy, C. (2015). Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift. arXiv:1502.03167 [Cs]. Retrieved from http://arxiv.org/abs/1502.03167

Jiang, Y., Lan, T., & Zhang, D. (2009). A New Representation and Similarity Measure of Time Series on Data Mining (pp. 1–5). IEEE. https://doi.org/10.1109/CISE.2009.5364532

Kayikcioglu, T., Maleki, M., & Eroglu, K. (2015). Fast and accurate PLS-based classification of EEG sleep using single channel data. Expert Systems with Applications, 42(21), 7825–7830. https://doi.org/10.1016/j.eswa.2015.06.010

Kemp, B., Zwinderman, A. H., Tuk, B., Kamphuisen, H. A. C., & Oberye, J. J. L. (2000). Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the EEG. IEEE Transactions on Biomedical Engineering, 47(9), 1185–1194. https://doi.org/10.1109/10.867928

Khalighi, S., Sousa, T., Pires, G., & Nunes, U. (2013). Automatic sleep staging: A computer assisted approach for optimal combination of features and polysomnographic channels. Expert Systems with Applications, 40(17), 7046–7059. https://doi.org/10.1016/j.eswa.2013.06.023

ACCEPTED MANUSCRIPT Klonowski, W., Olejarczyk, E., Stepienl, R., & Szelenberger, W. (2003). New methods of nonlinear and symbolic dynamics in sleep EEG-signal analysis. IFAC Proceedings Volumes, 36(15), 241–244. https://doi.org/10.1016/S1474-6670 (17)33508-5 Krakovská, A., & Mezeiová, K. (2011). Automatic sleep scoring: A search for an optimal combination of measures.

Artificial

Intelligence

in

Medicine,

53(1),

25–33.

https://doi.org/10.1016/j.artmed.2011.06.004

Biometrics, 33(1), 159–174. https://doi.org/10.2307/2529310

CR IP T

Landis, J. R., & Koch, G. G. (1977). The Measurement of Observer Agreement for Categorical Data.

Liang, S. F., Kuo, C. E., Hu, Y. H., Pan, Y. H., & Wang, Y. H. (2012). Automatic Stage Scoring of SingleChannel Sleep EEG by Using Multiscale Entropy and Autoregressive Models. IEEE Transactions on Instrumentation and Measurement, 61(6), 1649–1657. https://doi.org/10.1109/TIM.2012.2187242

AN US

Liang, S.-F., Kuo, C.-E., Hu, Y.-H., & Cheng, Y.-S. (2012). A rule-based automatic sleep staging method. Journal of Neuroscience Methods, 205(1), 169–176. https://doi.org/10.1016/j.jneumeth.2011.12.022 Liang, S.-F., Kuo, C.-E., Shaw, F.-Z., Chen, Y.-H., Hsu, C.-H., & Chen, J.-Y. (2016). Combination of Expert Knowledge and a Genetic Fuzzy Inference System for Automatic Sleep Staging. IEEE Transactions on Bio-Medical Engineering, 63(10), 2108–2118. https://doi.org/10.1109/TBME.2015.2510365

M

Lin, J., Keogh, E., Lonardi, S., & Chiu, B. (2003). A symbolic representation of time series, with implications

ED

for streaming algorithms (p. 2). ACM Press. https://doi.org/10.1145/882082.882086 Lin, J., Keogh, E., Wei, L., & Lonardi, S. (2007). Experiencing SAX: a novel symbolic representation of time series. Data Mining and Knowledge Discovery, 15(2), 107–144. https://doi.org/10.1007/s10618-007-

PT

0064-z

Lkhagva, B., Yu Suzuki, & Kawagoe, K. (2006). New Time Series Data Representation ESAX for Financial

CE

Applications (pp. x115–x115). IEEE. https://doi.org/10.1109/ICDEW.2006.99 Mariani, S., Bianchi, A. M., Manfredini, E., Rosso, V., Mendez, M. O., Parrino, L., … Terzano, M. G. (2010).

AC

Automatic detection of A phases of the Cyclic Alternating Pattern during sleep (pp. 5085–5088). IEEE. https://doi.org/10.1109/IEMBS.2010.5626211

McCormick, L., Nielsen, T., Nicolas, A., Ptito, M., & Montplaisir, J. (1997). Topographical distribution of spindles and K-complexes in normal subjects. Sleep, 20(11), 939–941. Moser, D., Anderer, P., Gruber, G., Parapatics, S., Loretz, E., Boeck, M., … Dorffner, G. (2009). Sleep Classification According to AASM and Rechtschaffen & Kales: Effects on Sleep Scoring Parameters. Sleep, 32(2), 139–149.

Motamedi-Fakhr, S., Moshrefi-Torbati, M., Hill, M., Hill, C. M., & White, P. R. (2014). Signal processing techniques applied to human sleep EEG signals—A review. Biomedical Signal Processing and Control, 10, 21–33. https://doi.org/10.1016/j.bspc.2013.12.003

Niknazar, H., Maghooli, K., & Motie Nasrabadi, A. (2015). Epileptic Seizure Prediction using Statistical Behavior of Local Extrema and Fuzzy Logic System. International Journal of Computer Applications, 113(2), 24–30. https://doi.org/10.5120/19799-1578

Niknazar, H., & Nasrabadi, A. M. (2016). Epileptic Seizure Prediction Using a New Similarity Index for Chaotic Signals. International Journal of Bifurcation and Chaos, 26(11), 1650186. https://doi.org/10.1142/S0218127416501868

Niknazar, H., Seifpour, S., Mikaili, M., Nasrabadi, A. M., & Banaraki, A. K. (2015). A novel method to detect the A phases of Cyclic Alternating Pattern (CAP) using similarity index (pp. 67–71). IEEE. https://doi.org/10.1109/IranianCEE.2015.7146184

Nishida, M., Pearsall, J., Buckner, R. L., & Walker, M. P. (2009). REM Sleep, Prefrontal Theta, and the Consolidation of Human Emotional Memory. Cerebral Cortex, 19(5), 1158–1166. https://doi.org/10.1093/cercor/bhn155

Norman, R. G., Pal, I., Stewart, C., Walsleben, J. A., & Rapoport, D. M. (2000). Interobserver agreement among sleep scorers from different centers in a large dataset. Sleep, 23(7), 901–908.

Ouanes, A., & Rejeb, L. (2016). A Hybrid Approach for Sleep Stages Classification. In Proceedings of the Genetic and Evolutionary Computation Conference 2016 (pp. 493–500). New York, NY, USA: ACM. https://doi.org/10.1145/2908812.2908910

Özşen, S. (2013). Classification of sleep stages using class-dependent sequential feature selection and artificial neural network. Neural Computing and Applications, 23(5), 1239–1250. https://doi.org/10.1007/s00521-012-1065-4

Peker, M. (2016). An efficient sleep scoring system based on EEG signal using complex-valued machine learning algorithms. Neurocomputing, 207, 165–177. https://doi.org/10.1016/j.neucom.2016.04.049

Pham, N. D., Le, Q. L., & Dang, T. K. (2010). Two Novel Adaptive Symbolic Representations for Similarity Search in Time Series Databases (pp. 181–187). IEEE. https://doi.org/10.1109/APWeb.2010.23

Prakash, A., & Roy, V. (2017). An Automatic Detection of Sleep using Different Statistical Parameters of Single Channel EEG Signals. International Journal of Signal Processing, Image Processing and Pattern Recognition, 9(10), 365–374.

Radha, M., Garcia-Molina, G., Poel, M., & Tononi, G. (2014). Comparison of feature and classifier algorithms for online automatic sleep staging based on a single EEG signal. Conference Proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference, 2014, 1876–1880. https://doi.org/10.1109/EMBC.2014.6943976

Rechtschaffen, A., & Kales, A. (1968). A manual of standardized terminology, techniques and scoring system for sleep stages of human subjects. Allan Rechtschaffen and Anthony Kales, editors. (University of California, Los Angeles & NINDB Neurological Information Network (U.S.), Eds.). Bethesda, Md: U. S. National Institute of Neurological Diseases and Blindness, Neurological Information Network.

Ronzhina, M., Janoušek, O., Kolářová, J., Nováková, M., Honzík, P., & Provazník, I. (2012). Sleep scoring using artificial neural networks. Sleep Medicine Reviews, 16(3), 251–263. https://doi.org/10.1016/j.smrv.2011.06.003

Sanders, T. H., McCurry, M., & Clements, M. A. (2014). Sleep stage classification with cross frequency coupling. Conference Proceedings: ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Annual Conference, 2014, 4579–4582. https://doi.org/10.1109/EMBC.2014.6944643

Şen, B., Peker, M., Çavuşoğlu, A., & Çelebi, F. V. (2014). A Comparative Study on Classification of Sleep Stage Based on EEG Signals Using Feature Selection and Classification Algorithms. Journal of Medical Systems, 38(3), 18. https://doi.org/10.1007/s10916-014-0018-0

Shi, J., Liu, X., Li, Y., Zhang, Q., Li, Y., & Ying, S. (2015). Multi-channel EEG-based sleep stage classification with joint collaborative representation and multiple kernel learning. Journal of Neuroscience Methods, 254, 94–101. https://doi.org/10.1016/j.jneumeth.2015.07.006

Shieh, J., & Keogh, E. (2008). iSAX: indexing and mining terabyte sized time series (p. 623). ACM Press. https://doi.org/10.1145/1401890.1401966

Sun, Y., Li, J., Liu, J., Sun, B., & Chow, C. (2014). An improvement of symbolic aggregate approximation distance measure for time series. Neurocomputing, 138, 189–198. https://doi.org/10.1016/j.neucom.2014.01.045

Tagluk, M. E., Sezgin, N., & Akin, M. (2010). Estimation of Sleep Stages by an Artificial Neural Network Employing EEG, EMG and EOG. Journal of Medical Systems, 34(4), 717–725. https://doi.org/10.1007/s10916-009-9286-5

Tang, J., Alelyani, S., & Liu, H. (2014). Feature Selection for Classification: A Review. In Data Classification (Vols. 1–0, pp. 37–64). Chapman and Hall/CRC. https://doi.org/10.1201/b17320-3

Tayebi, H., Krishnaswamy, S., Waluyo, A. B., Sinha, A., & Gaber, M. M. (2011). RA-SAX: Resource-Aware Symbolic Aggregate Approximation for Mobile ECG Analysis (pp. 289–290). IEEE. https://doi.org/10.1109/MDM.2011.67

The DREAMS Subjects Database. (n.d.). Retrieved February 1, 2018, from http://www.tcts.fpms.ac.be/~devuyst/Databases/DatabaseSubjects/

Touchette, E., Petit, D., Séguin, J. R., Boivin, M., Tremblay, R. E., & Montplaisir, J. Y. (2007). Associations between sleep duration patterns and behavioral/cognitive functioning at school entry. Sleep, 30(9), 1213–1219.

Tsinalis, O., Matthews, P. M., & Guo, Y. (2016). Automatic Sleep Stage Scoring Using Time-Frequency Analysis and Stacked Sparse Autoencoders. Annals of Biomedical Engineering, 44(5), 1587–1597. https://doi.org/10.1007/s10439-015-1444-y

Tsinalis, O., Matthews, P. M., Guo, Y., & Zafeiriou, S. (2016). Automatic Sleep Stage Scoring with Single-Channel EEG Using Convolutional Neural Networks. arXiv:1610.01683 [cs, stat]. Retrieved from http://arxiv.org/abs/1610.01683

Vatankhah, M., Totonchi, M. R. A., Moghimi, A., & Asadpour, V. (2010). GA/SVM for Diagnosis Sleep Stages Using Non-linear and Spectral Features. In X.-Z. Gao, A. Gaspar-Cunha, M. Köppen, G. Schaefer, & J. Wang (Eds.), Soft Computing in Industrial Applications (pp. 175–184). Springer Berlin Heidelberg. Retrieved from http://link.springer.com/chapter/10.1007/978-3-642-11282-9_19

Virkkala, J., Hasan, J., Värri, A., Himanen, S.-L., & Müller, K. (2007). Automatic sleep stage classification using two-channel electro-oculography. Journal of Neuroscience Methods, 166(1), 109–115. https://doi.org/10.1016/j.jneumeth.2007.06.016

Vural, C., & Yildiz, M. (2010). Determination of Sleep Stage Separation Ability of Features Extracted from EEG Signals Using Principle Component Analysis. Journal of Medical Systems, 34(1), 83–89. https://doi.org/10.1007/s10916-008-9218-9

Walker, M. P., & Stickgold, R. (2006). Sleep, memory, and plasticity. Annual Review of Psychology, 57, 139–166. https://doi.org/10.1146/annurev.psych.56.091103.070307

Wang, W., & Johnson, D. H. (2002). Computing linear transforms of symbolic signals. IEEE Transactions on Signal Processing, 50(3), 628–634. https://doi.org/10.1109/78.984752

Zhu, G., Li, Y., & Wen, P. (2014a). Analysis and Classification of Sleep Stages Based on Difference Visibility Graphs From a Single-Channel EEG Signal. IEEE Journal of Biomedical and Health Informatics, 18(6), 1813–1821. https://doi.org/10.1109/JBHI.2014.2303991

Zhu, G., Li, Y., & Wen, P. P. (2012). An Efficient Visibility Graph Similarity Algorithm and Its Application on Sleep Stages Classification. In F. M. Zanzotto, S. Tsumoto, N. Taatgen, & Y. Yao (Eds.), Brain Informatics (Vol. 7670, pp. 185–195). Berlin, Heidelberg: Springer Berlin Heidelberg. Retrieved from http://link.springer.com/10.1007/978-3-642-35139-6_18
