Cyclic spectral analysis of electrocardiogram signals based on GARCH model

Biomedical Signal Processing and Control 31 (2017) 79–88 Contents lists available at ScienceDirect Biomedical Signal Processing and Control journal ...

Download PDF

2MB Sizes 0 Downloads 50 Views

Report

PDF Reader
Full Text

Biomedical Signal Processing and Control 31 (2017) 79–88

Contents lists available at ScienceDirect

Biomedical Signal Processing and Control journal homepage: www.elsevier.com/locate/bspc

Cyclic spectral analysis of electrocardiogram signals based on GARCH model Sara Mihandoost (IEEE member) ∗ , Mehdi Chehel Amirani (IEEE member) Department of Electrical Engineering, Urmia University, Urmia, Iran

a r t i c l e

i n f o

Article history: Received 27 October 2015 Received in revised form 5 July 2016 Accepted 22 July 2016 Keywords: ECG signals CSA GARCH model MRF SVM

a b s t r a c t In this paper, the capability of Cyclic Spectral Density (CSD) is evaluated for ECG signal analyzing and a new feature generation method for them is presented. Although, the CSD presents a second-order statistical description in the frequency domain and reveals the hidden periodicity in EEG signals, it needs an efﬁcient algorithm for calculating and also a suitable model for describing. By employing an efﬁcient computational algorithm which is called the FFT accumulation method (FAM), the CSD of ECG signals can be computed. In this study, In order to choose an efﬁcient statistical model for the Cyclic Spectral Analysis (CSA) coefﬁcients of ECG signals, their statistical features are investigated at various regions of bi-frequency plane. It is revealed that the CSA coefﬁcients are heteroscedastic and the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) is a suitable model for them. Hence, The GARCH parameters of CSA sub-bands are calculated and are employed to classify the ECG using a support vector machine (SVM) classiﬁer. Evidently, the results reveal that the performance of the new method in ECG classiﬁcation outperforms the former studies. © 2016 Elsevier Ltd. All rights reserved.

1. Introduction Biomedical signal processing systems are designed to help physicians in order to diagnosis diseases more effectively. Several biomedical signals such as electrocardiogram (ECG) are extremely complex to decipher. The changes of the heart’s electrical activity is recorded in the ECG waveform made by different electrical depolarization–repolarization patterns of the heart [1]. Any heart rate abnormality is appeared in the morphological pattern pathology that can be identiﬁed by the analysis of the recorded ECG signal [1–3]. Due to the fact that the ECG signals have long duration and their visual scanning is highly time consuming, their interpretations are highly subjective which cause disparity among cardiologist. Nowadays, to overcome this issue, automatic identiﬁcation systems have been introduced in many literatures. Generally, selection of a suitable representation method plays a major role in signal processing algorithms [4]. In fact, the quality of automatic identiﬁcation systems performance is determined by signal representation algorithm and feature extraction schemes.

∗ Corresponding author. E-mail addresses: [email protected] (S. Mihandoost), [email protected] (M. Chehel Amirani). http://dx.doi.org/10.1016/j.bspc.2016.07.012 1746-8094/© 2016 Elsevier Ltd. All rights reserved.

Some type of representations includes using orthonormal basis functions, optimally time-warped polynomial functions and so on [5,6]. In Ref. [7], the fast Fourier transform (FFT) analysis has been employed which provides only frequency content and cannot be localized in time domain. To overcome this problem, the short time Fourier transform (STFT) has been used extensively in order to presents both time and frequency information using uniform window [8]. Also, both time and frequency components can be localized by the wavelet transform (WT), in this regard, the window size is adopted based on the signal’s frequency content [9]. Although the algorithms based on the WT have successful performance in representation and description of the ECG signals, many surveys are continued concerning this issue. Several recent studies on ECG modeling have ﬁtted mathematical representations to ECG ﬁducially points. In Ref. [10], ECG beat shapes were modeled by employing Hermitic basis functions. In Ref. [11], ECG complexes were modeled utilizing lines or parabolas. The major of these methods require pre-processing to portray the ECG signals into meaningful components such as T-wave, Pwave, and QRS complex. In some studies, the Gaussian distribution has been employed to the heart rate estimation [12,13], denoise ECG signals [13,14], detect arrhythmias [15], and generate synthetic ECG signals for various types of arrhythmias [16]. This technique depends on a priori information and requires nonlinear equations

80

S. Mihandoost, M. Chehel Amirani / Biomedical Signal Processing and Control 31 (2017) 79–88

in order to estimate it. Various studies are perused concerning this issue to determine a more efﬁcient representation method; furthermore, a number of the researchers attempt to obtain an appropriate signal representation technique which can decipher ECG signals more effectively [2]. In this study, a novel technique for ECG representation based on Cyclic Spectral Analysis is recommended and the capability of this method for ECG classiﬁcation is evaluated. The conducted experiments in this study on classiﬁcation of ECG signals reveal that the extracted features from the cyclic spectral density coefﬁcients are more potentially separable compared to the extracted features of other techniques. This function appears hidden periodicity of a signal using the second-order statistical explanation and grants more comprehensive information than the ordinary power spectral density [4,17]. The theory of cyclostationary signals and two equivalent methods for analyzing of cyclic spectral density, time and frequency-smoothing methods, have been introduced in Refs. [18–24]. One of the proposed methods is the FFT accumulation method (FAM). It is an algorithm belonging to time smoothing class and employs the computational efﬁciency of the FFT [24]. In the present study, the utilizing of cyclic spectral analysis for feature extraction is explained comprehensively. One ought to note that the CSD produces many features which contain redundant information and cause error in ECG classiﬁcation. One can utilize a statistical model to describe the CSA coefﬁcients and to diminish the number of these features as one of the successful schemes. The model parameters will be trustable if an appropriate model is chosen. In this research, the statistical characteristics of the CSA coefﬁcients are investigated in order to choose the suitable statistical model. Next, the generalized autoregressive conditional heteroscedasticity (GARCH) model is used for theses coefﬁcients since they are heteroscedastic and the GARCH model is matchable with properties of the CSA coefﬁcients for instance heavy tail margin distribution [25,26]. As an application, the suggested procedure is employed to classify the ECG signals. To accomplish this goal, the GARCH parameters are extracted from Cyclic Spectral Analysis at various regions of bi-frequency plane. Due to the fact Markov random ﬁeld (MRF) is an efﬁcient approach to acquire more separable features; it is employed to provide the best features subset. On the other hand, since the support vector machine classiﬁer offers best margin between different classes, it provides a considerable backing of the medical diagnosis [10,27–29]. Therefore, the SVM classiﬁer is chosen amongst various classiﬁers for ECG classiﬁcation. The obtained features are applied to the SVM classiﬁer with ﬁve distinct outputs. The results of experiment reveal that the extracted features are appropriate for the ECG classiﬁcation. The remaining sections of this paper are organized as follows: In “Theoretical background” Section, cyclic spectral density, FAM algorithm and GARCH model are explained. The “Proposed Method” Section is allocated to illustrate the new proposed method and the experimental results are reported in “Experimental Results and Discussion” Section. Finally, last Section concludes the paper. 2. Theoretical background In this section, we brieﬂy explain Cyclic Spectral Density and GARCH model used in this work.

results indicated that the CSD based methods were robust to noise and other undesirable condition [31]. In this paper, a cyclostationary based method is suggested to analyze the ECG signals. These signals can be modeled as cyclostationary random processes that their statistical parameters are changing periodically. Cyclostationary analysis is based on the auto-correlation function of the signal [19,30]. Using auto-correlation, the hidden patterns in a time series such as the presence of sinusoids in a noisy signal can be detected. The cyclic autocorrelation function can be explained for a discrete time real value signal x(n), is deﬁned as [4]: 1 N→∞ (2N + 1)

Rx˛ (k) = lim

As a comparison to the other stationary based approaches, cyclostationary analysis is highly efﬁcient to separate signals having distinct cycle frequencies and hidden periodicities [23]. This method has been utilized for digital modulation detection [30], texture classiﬁcation [17,30], and EEG classiﬁcation which their

[x(n + k)e−j˛(n+k) ][x(n)ej˛n ]

∗

(1)

n=−N

where is the time lag between observations of the signal x(n). The discrete-time Fourier transform of the Rx˛ (k): Sx˛ (f ) =

k

Rx˛ (k)e−j2fk

(2)

is called CSD. In this description, ˛ presents cyclic frequency and depends on amount of frequency shift, and f shows spectral frequency variable [17]. The numerical value of ˛ for which Sx˛ (f ) = / 0 is countable if the signal has ﬁnite average power [23]. In addition, if ˛ = 0, the ordinary power spectrum is obtained as follows: ∞

Sx (f ) = Sx0 (f ) =

Rx0 (k)e−j2fk

(3)

k=−∞

It is evident that the CSD can provide more extensive features with respect to the ordinary power spectral density [4]. Two inherent characteristics of the CSD are concluded easily from the deﬁnition expression of the CSD due to the fact that the ECG is a real value signal [31]. The ﬁrst characteristic is periodicity corresponding with discrete time for each integer values of n and m, Sx˛+n (f + m + n/2) = Sx˛ (f ), and the second is symmetrical relation∗ ships Sx˛ (−f ) = Sx˛ (f ) and Sx−˛ (−f ) = Sx˛ (f ) . Hence, only a quarter of estimated points can describe entire CSD. A number of computational techniques have been proposed which generally fall into frequency smoothing algorithms and time smoothing algorithms classes. In this research, the FAM algorithm is utilized as an efﬁcient algorithm out of time class [22,23]. In this algorithm the cyclic cross periodgram is expressed as: ˛ SxT (n, f )t =

1 ˛ ˛ XT (n, f + ) × XT∗ (n, f − ) T 2 2 t

(4)

where:

XT (n, f ) =

N /2

a(r)x(n − r)e−j2f (n−r)Ts

(5)

n=−N /2

The Eqs. (4) and (5) explain how frequency contents of a narrowband signal x(n) are correlated over a speciﬁc t duration [4,24]. XT (n, f + ˛/2) and XT∗ (n, f − ˛/2) are complex demodulates of the x(n) and a(r) presents a data capturing window with length T = N Ts which Ts is the period of data sampling and N’ is the sliding window’s length [24]. Subsequently, with correlation of complex demodulates on t seconds, the cyclic cross periodgram, for speciﬁc f0 and ˛0 , can be calculated as follow: Sx˛T0 (n, f0 )t =

2.1. Cyclic spectral density

N

r

XT (r, f1 )XT∗ (r, f2 )g(n − r)

(6)

where g(n) is a window with length t = NTs ; f1 = f0 + ˛0 /2 and f2 = f0 − ˛0 /2 [24]. In the FAM, Fourier transform is used to smooth out the time. When frequency changes from ˛0 to ˛0 + ε, the output of the algorithm is expressed as [4,24]: Sx˛T0 +ε (n, f0 )t =

r

XT (r, f1 )XT∗ (r, f2 )g(n − r)e−j2εrTs

(7)

S. Mihandoost, M. Chehel Amirani / Biomedical Signal Processing and Control 31 (2017) 79–88

Locations of CSA estimation for N'=8

α

ht = ˛0 +

2 + ˛i yt−i

i=1

fs

Cyclic Frequency

q

f -fs/2

t−1 =

y0 , ..., yt−1 , ..., ht−1

t−1

=

Fig. 1. Tiling bi-frequency plane which indicates locations of CSD estimation in one cell typically.

By quantizing the value of ε to ε = q˛, the Eq. (8) is modiﬁed to: Sx˛T0 +q˛ (n, fj )t =

r

XT (r, f1 )XT∗ (r, f2 )g(n − r)e−j2rq/N

(8)

In this equation, the sum is calculated using an N-point FFT. The speed of the algorithm is enhanced by this substitution. To perfect coverage of the bi-frequency plane, we require a bank of band-pass ﬁlters to produce complex demodulates. This can be conducted by an efﬁcient method using a sliding FFT [32]. In this method the center frequencies of the ﬁlter banks and the positions of estimated points of CSD (fj , ˛i ) are expressed as: fs N N , k = − , · · ·, −1 N 2 2

(9)

fs N

(10)

fk + f k + 1 fs =( ) 2 2 N

(11)

˛i = fk − f1 = (k − 1)

Fig. 1 indicates the regions and locations of the estimated CSD using the FAM algorithm on bi-frequency plane (A page with two frequencies dimensions f and ˛). For an N -point window, (N )2 sub-bands (diamond regions) exist on bi-frequency plane. As a result of the aforementioned symmetry, calculation of the cyclic spectral density for a real signal needs only (N )2 /4 tiling regions. To survey further details, Ref. [24] is provided. 2.2. GARCH model Formerly, time series were considered assuming constant variance. It is clear that time-varying variance has a signiﬁcant task in many time series [31]. For the ﬁrst time, the time varying conditional variance as a function of previous errors on conditional variance constant was introduced as autoregressive conditional heteroscedasticity (time-varying variance, i.e., volatility) (ARCH) model in Refs. [33,34]. The GARCH model is considered as a part of a more general ARCH model as discussed in Ref. [35] which produces a more ﬂexible lag structure. The GARCH distribution is deﬁned as follows:

ht εt

(13)

(14)

1

−y2 t

e 2ht

(15)

2ht

and ht is the conditional variance of {yt }. The maximum likelihood estimation can be employed to compute the unknown parameters in the GARCH model [33,35].

Normalized Frequency

yt =

˛0 > 0, ˛i ≥ 0, ˇj ≥ 0

{yt } is conditionally distributed as follows:

fs/2

-fs

fj =

ˇj ht−j,

j=1

where {yt } is a time series. In addition p and q are GARCH model parameters which in this paper p = 1 and q = 1 are chosen, {εt } is a series of independent identically random variables which have standard normal distributed with zero mean and unit variance. t−1 indicates all the information until, t − 1:

f yt |

fk = k

p

81

(12)

3. Proposed method In this section, a new ECG classiﬁcation method is proposed using CSA and GARCH model. This scheme is indicated in Fig. 2. 3.1. Data base A public available data base is used as described in Refs. [36,37]. To have fair comparison with the results of the former researches [10,37–39], train and test data sets are exactly chosen same as other studies. To accomplish this goal, ﬁfteen recorded ECG signals from the MIT database (MITDB) are chosen. Their record name and details of corresponding information are reported in Table 1: where based on the MITDB annotation, Normal, LBBB, RBBB, PVC and APB, indicate Normal, left bundle branch block, right bundle branch block, premature ventricular contraction, atrial premature beat rhythms, respectively. 3.2. CSD estimation As a primary step in ECG analysis, the CSD coefﬁcients of each ECG signal are calculated. Although there are many options for data capturing window, there is no speciﬁc rule to choose a proper window. With considering the main-lobe width and the amplitude of side-lobes at the frequency domain, the Hamming window is an appropriate selection [4] which can be expressed as follows:

⎧ ⎨ 0.54 − 0.46 cos( 2n ) 0 ≤ n ≤ N N w[n] = ⎩ 0 otherwise

(16)

where N indicates the length of the selected window. Selection of window size (N ) in order to CSD estimation is a trade off between accuracy and processing time. We investigated the accuracy and the processing time of our method for different N . Fig. 3 indicates the summary of this investigation. Then we employed the suitable option (window with length N = 8) from the aspect of accuracy and processing time. Figs. 4 and 5 illustrate an example for the ﬁve ECG signals of each class and their corresponding CSA’s magnitudes for N = 8 as bar graph. Furthermore, to provide a more vivid visualization, in the third column, each CSA’s cell is extended in a square region. In this research, N = 8 has been chosen and as aforementioned in Part 2.1, due to existence of symmetry in CSD, 16 CSA (64/4) individual sub-bands for each ECG signal with N = 1024 samples are selected in which every sub-band contains a vector with N/N =

82

S. Mihandoost, M. Chehel Amirani / Biomedical Signal Processing and Control 31 (2017) 79–88

SCF Representation

ECG Signals

GARCH Modeling

Feature Selection

Classifier

Fig. 2. Block diagram of the proposed method for ECG classiﬁcation.

Table 1 The number of chosen MITDB records with information about their rhythm types. Normal

Record

100 105 106 109 111 114 116 118 119 124 200 207 209 212 214 Total

LBBB

RBBB

test

train

Test

train

test

train

test

train

test

146 165 100 0 0 119 150 0 101 0 114 0 171 60 0 1126

158 179 107 0 0 129 163 0 109 0 123 0 185 65 0 1218

0 0 0 185 158 0 0 0 0 0 0 108 0 0 149 600

0 0 0 201 171 0 0 0 0 0 0 117 0 0 161 650

0 0 0 0 0 0 0 193 0 136 0 8 0 163 0 500

0 0 0 0 0 0 0 212 0 150 0 8 0 180 0 550

0 12 150 11 0 12 31 5 127 13 236 30 0 0 73 700

0 12 150 11 0 12 31 5 127 13 236 30 0 0 73 700

18 0 0 0 0 5 0 53 0 0 16 58 208 0 0 358

15 0 0 0 0 5 0 43 0 0 14 49 174 0 0 300

for N'=16 98

for N'=8 for N'=32 CCR(%)

96

94

92

for N'=4 90

20

APB

Train

100

88

PVC

25

30

35 40 Processing Time (min)

45

50

55

Fig. 3. A comparison between accuracy and processing time for different values of N .

128 elements. In the following section, we evaluate various subbands by several tests in order to study statistical properties of ECG signal. 3.3. Survey of statistical properties for CSA’s sub-bands Next, we utilize various tests to investigate statistical properties of the CSA coefﬁcients to illustrate the ﬂexibility of GARCH model to describe them. Due to limited space, only the tests results of some CSA coefﬁcients of ECG signals will be reported; however, it is essential to know that the tests results are similar for various CSA coefﬁcients. Figs. 4 and 5 clearly indicates that the central subbands of bi-frequency plane, have higher content and heavier tail. It means that these sub-bands have stronger heteroscedastic effect. Tables 2 and 3 show results of ARCH and K-S tests which GARCHstat and P statistics of the central sub-bands (20th, 30th, and 40th) are bigger than other sub-bands. But this interpretation does not apply to Table 4 because in this Table the reported kurtosis values are not belong to one ECG signal. In this Table, the maximum and minimum of kurtosis values for a speciﬁc sub-band of different ECG signals from one class are reported.

3.3.1. ARCH test Primarily, in Ref. [33] a hypothesis test suggested to examine the existence of ARCH/GARCH effect in the signal [31]. In addition, this statistical test is asymptotically Chi-Square distributed [34]. The ARCH test for CSA coefﬁcients of ECG signals is applied and the tests results for several CSA sub-bands are shown in Table 2. The Boolean decision parameter is shown by H and the null hypothesis is rejected when H = 1. The null hypothesis shows that the GARCH effect does not exist. PValue indicates the signiﬁcance level of rejection the null hypothesis and in the present study it is set to be 0.05 same as other literatures [25]. The GARCHstat illustrates ARCH statistic and CriticalValue indicates the critical value of Chisquare model which is commensurate with order of GARCH models. If the GARCHstat is less than CriticalValue, there is no effect GARCH at signiﬁcance level equal to 5%. Considering the results of Table 2 and the above discussion, it is evident that there is GARCH effect in the CSA coefﬁcients of all tested ECG signals.

3.3.2. Kolmogorov-Smirnov test Kolmogorov-Smirnov (K-S) test [40] is considered to be a highly accurate test. In this study, K-S is employed to investigate the compatibility of GARCH model with the CSA coefﬁcients. Primarily, samples of GARCH data are produced using the extracted GARCH parameters of the CSA coefﬁcients [31]. Subsequently, we can determine whether GARCH data and CSA coefﬁcients have similar histogram or not. The K-S test can be used to investigate the null hypothesis that the two data have the same statistical model, i.e. the two data are highly identical. Alternatively, another hypothesis is that they are from dissimilar statistical models [31]. In this study, the Kolmogorov-Smirnov test for two data series are employed. Consider the ﬁrst series, x1 , ..., xN1 (CSA coefﬁcients of ECG signals), have size N1 and statistical distribution with the empirical CDF, F(x). And the second series y1 , ..., yN2 (GARCH data), have size N2 and statistical distribution with the empirical CDF, G(x). which is deﬁned as follows: 1 I(xi < x) N1 N1

FN1 (x) =

i=1

(17)

S. Mihandoost, M. Chehel Amirani / Biomedical Signal Processing and Control 31 (2017) 79–88

83

Fig. 4. Amplitude of CSAs: (a, b, c) are ECG signals from Normal, LBBB, and RBBB classes, (a’, b’, c’) are the amplitude of CSA’s coefﬁcients as bar graph and (a¨, b¨, c)¨ are other view of CSA’s amplitude, respectively.

1 I(yi < y) N2 N2

GN2 (x) =

(18)

k˛ is computed at the 5% signiﬁcance level (˛ = 0.05) which is expressed as follows [41]:

i=1

where I(·) is the indicator function which is equal to “1” if xi < x and is equal to “0” otherwise. Two hypotheses are considered: Null hypothesis h = 0: The CSA coefﬁcients of ECG signals can be modeled with the GARCH model, i.e. F = G. Other hypothesis h = 1: The CSA coefﬁcients of ECG signals cannot be modeled with the GARCH model, i.e. F = / G. In this test, the maximum difference between the produced GARCH data and CSA coefﬁcients are calculated as follows: DN1 , N2 = sup|FN1 (x) − GN2 (x)|

(19)

x

with deﬁnition of k˛ as the critical value, the null hypothesis is rejected if:

N1 N2 DN ,N > k˛ (N1 + N2 ) 1 2

(20)

Pr

N1 N2 DN ,N > k˛ (N1 + N2 ) 1 2

=1−˛

(21)

In the present study, K-S test is used for all CSA sub-bands of ECG signals. The size of produced GARCH data and CSA coefﬁcients are same, i.e. N1 = N2 . In Table 3, a number of the results are presented. In this Table, h = 0 if the K-S test accepts the null hypothesis and otherwise h = 1. p indicates the signiﬁcance level. In this test, it is set to be 0.05. Finally, the maximum difference between histograms of the CSA coefﬁcients and produced GARCH data is shown using k [31]. It is evident from Table 3 that the histograms are similar; which implies GARCH model is appropriate one to ﬁt to the CSA coefﬁcients. Fig. 6 indicates the graphical outcomes of the supplemmentry ﬁle of the K-S test, where the histograms of CSA coefﬁcients and the corresponding GARCH data are plotted for ﬁve ECG signals.

84

S. Mihandoost, M. Chehel Amirani / Biomedical Signal Processing and Control 31 (2017) 79–88

Fig. 5. Amplitude of CSAs: (d, f) are ECG signals from PVC, and APB classes, (d’, f’) are the amplitude of CSA’s coefﬁcients as bar graph and (d¨, f)¨ are other view of CSA’s amplitude, respectively. Table 2 Outcomes of using Engle test to indicate the ARCH/GARCH effects. Dataset

ARCH testparameters

CSA Coefﬁcients Sub-bands

H pValue GARCHstat CriticalValue H pValue GARCHstat CriticalValue H pValue GARCHstat CriticalValue H pValue GARCHstat CriticalValue H pValue GARCHstat CriticalValue

Normal (100)

LBBB (109)

RBBB (118)

PVC (200)

APB (213)

1st

10th

20th

30th

40th

50th

60th

1 0 136.2 3.84 1 0 228.6 3.84 1 0 248.3 3.84 1 0 245.6 3.84 1 0 238.6 3.84

1 0.01 105.8 3.84 1 0 241.22 3.84 1 0 206.5 3.84 1 0 231.7 3.84 1 0 187.6 3.84

1 0 162.5 3.84 1 0 246.4 3.84 1 0 148.1 3.84 1 0.01 248.9 3.84 1 0 212.2 3.84

1 0 108.3 3.84 1 0 242.3 3.84 1 0 225.6 3.84 1 0 212.3 3.84 1 0 194.6 3.84

1 0 155.63 3.84 1 0 206.1 3.84 1 0 201.5 3.84 1 0 232.6 3.84 1 0 206.5 3.84

1 0 141.1 3.84 1 0 219.5 3.84 1 0 110.3 3.84 1 0.04 195.3 3.84 1 0 187.5 3.84

1 0 178.7 3.84 1 0.01 179.82 3.84 1 0 209.7 3.84 1 0 111.94 3.84 1 0.04 153.7 3.84

Table 3 K-S test outcomes for CSA coefﬁcients of CSA sub-bands. K-S test parameters

100th Record of Normal

109th Record of LBBB

118th Record of RBBB

200th Record of PVC

215th Record of APB

CSA sub- bands

h p k

20th

40th

20th

40th

20th

40th

20th

40th

20th

40th

0 0.083 0.731

0 0.461 0.054

0 0.075 0.0825

0 0.341 0.304

0 0.094 0.396

0 0.401 0.018

0 0.102 0.302

0 0.873 0.080

0 0.112 0.320

0 0.645 0.030

S. Mihandoost, M. Chehel Amirani / Biomedical Signal Processing and Control 31 (2017) 79–88

85

Table 4 Kurtosis calculating for different of CSA sub-bands. Dataset

CSA Coefﬁcients Sub-bands

Normal LBBB RBBB PVC APB

Maximum Minimum Maximum Minimum Maximum Minimum Maximum Minimum Maximum Minimum

1st

10th

20th

30th

40th

50th

60th

36.83 5.72 15.10 5.3 33.97 5.85 98.27 5.51 105.3 8.22

39.21 5.60 25.13 8.22 29.86 9.20 73.53 9.33 111.2 11.35

24.82 4.02 42.34 5.15 57.66 8.55 108.55 8.61 159.21 10.81

29.36 6.76 22.17 5.48 68.25 5.97 86.14 5.66 222.21 18.62

50.31 11.35 61.97 5.93 26.86 7.66 204.10 8.63 179.32 8.63

72.53 8.16 55.52 5.21 63.50 6.24 122.8 6.98 93.55 7.31

87.32 7.10 48.58 4.44 52.57 4.96 222.2 11.09 173.2 6.01

15 20

RBBB

Normal 10

15

10 5 5

0 800

950 (a)

1100

30 25

0 850

1100

30

LBBB

PVC

25

20

20

15

15

10

10

5

5

0

950 (b)

900

1000

1100

(c)

0 750

850

900

1,000 (d)

1,100

1,200

25

20

APB

15

10

5

0 750

800

850

900

950

1000

1050

1100

(e)

Fig. 6. Comparison between CSA coefﬁcients’ histogram and GARCH data’s histogram using K-S test for 50th record of each class that have been shown in solid and dotted lines, respectively. (a) shows 50th CSA’s sub-bands of set Normal, (b) shows 50th CSA’s sub-bands of set LBBB, (c) shows 50th CSA’s sub-bands of set RBBB, (d) shows 50th CSA’s sub-bands of set PVC, and (e) indicates 50th CSA’s sub-bands of set APB.

3.3.3. Kurtosis Here, we examine the CSA coefﬁcients of ECG signals follow the normal distribution or not. To show this fact, kurtosis criterion is computed for each CSA sub-band [31]. The kurtosis measurement is used as follows:

kurtosis =

E(x − )4 4

(22)

where and 2 are the mean and the variance of x, respectively, and E(.) is the expectation operator. For each CSA sub-band of different classes, the maximum and minimum of kurtosis are calculated which some are randomly presented in Table 4. We can clearly see in Table 4, all the minimum values of the obtained kurtosis is larger than three that three is expected value for a Gaussian distribution [34].

S. Mihandoost, M. Chehel Amirani / Biomedical Signal Processing and Control 31 (2017) 79–88

The experimental results in Table 4 show the CSA coefﬁcients of ECG signals have a heavy tail characteristic as indicated by the calculated kurtosis parameters, and these coefﬁcients do not track the Gaussian distribution [31]. Hence, we recommend the GARCH model and demonstrate that it is a suitable representation of the CSA coefﬁcients. 3.4. Feature selection by Markov random ﬁeld In order to decrease complexity of automatic identiﬁcation system and to increase accuracy, we try to select best set of features for classiﬁcation [42]. As it is evident, in biomedical signals, the accuracy of diagnosing is highly important. To accomplish this objective, we employ MRF selection method to select the best set of features using the Markov optimization technique, which simultaneously satisﬁes the Fisher criterion, i.e., minimizing the within-class distance and maximizing between classes distance [30]. This technique uses the Fisher’s class separability as the optimized criterion to deﬁne two characteristics of discriminativeness and sparseness for the feature space [30]. 3.5. Support vector machine classiﬁer The obtained features are fed to a suitable classiﬁer. The SVM attempts to ﬁnd the best margin between different classes. At ﬁrst, the SVM classiﬁer introduced to classify two classes. In the binary SVM, nonlinear boundaries with radial basis function (RBF) kernel are employed to classify features [43]. To train the classiﬁer, training set are given to the SVM with the error function J:

1 T W W +C i 2 M

J(W ) =

(23)

i=1

where :

yi (W T ␸(Xi ) + b) ≥ 1 − i

and i ≥ 0,

i = 1, · · ·, M (24)

where Xi and yi are ith feature vector and ith label pairs. C > 0 is the penalty parameter of the error function [44]. W and b are weighting coefﬁcients vector and a constant. i parameters are considered for handling non-separable input data. This classiﬁer, by using the kernel function ϕ, maps the input features to a new space [30]. The new space is generally in high-dimensional spaces which can creates the opportunity for the algorithm to ﬁnd a linear solution [30]. The RBF kernel is exp(− Xi − Xj 2 ), where indicates the Gaussian kernel width [44]. Although SVM is originally designed as binary classiﬁers, there are various extensions to enable SVM to handle more than two classes [43]. In this study, One-against-One method is employed D to classify multiclass [43]. This method utilizes binary SVM 2 with RBF kernel for each pair of classes, where D is the number of

classes [43]. At ﬁrst, the feature vectors are applied to all D binary classiﬁers, and next the histogram of the outputs are 2 calculated and investigated. Finally, the maximum value of the histogram is selected as the target class. If there are two or more classes with the same histogram value, one of the classes is randomly selected [45]. 4. Experiments results and discussion In this section, at ﬁrst, in order to choose the ﬁnal number of features by MRF, an experiment on the aforementioned data base is done and repeated several times. In this experiment, the MRF is

99

Correct Classification Rate (CCR%)

86

98

97

96

95

94

93

8

10

12

14

16

18

Number of Features

Fig. 7. CCR for different number of ﬁnal features.

applied to the extracted GARCH parameters of the ECG’s CSA coefﬁcients. Then, the obtained features are fed to an SVM classiﬁer and the correct classiﬁcation rate (CCR) is calculated. As we know, CCR is the number of correct decisions divided by the all of decisions. The selection of the appropriate number of features is a tradeoff between computational complexity and the CCR. Fig. 7 shows the results of this experiment. Therefore, 12 features are chosen which is a suitable option from the aspect of CCR and computational complexity. We utilized MATLAB version 2014 to simulate the proposed method. Now, Table 5 illustrates the obtained confusion matrix and demonstrates the efﬁciency of the purposed method. In addition, due to assess the effectiveness of the proposed method, three standard statistical parameters from confusion matrix are used as in Refs. [2,46]. They are speciﬁcity, sensitivity and CCR which are calculated for the proposed method and are reported in Table 6. a) sensitivity (Se): Se =

TP × 100 TP + FN

(25)

b) speciﬁcity (Sp): Sp =

TN × 100 TN + FP

(26)

c) CCRwhere TP, FP, FN and TN are true positive, false positive, false negative and true negative, respectively [2]. For more explanation, consider the LBBB class of Table 5 that illustrates identiﬁcation system has 643 TP and 18 FP detections, i.e., 643 beat numbers from LBBB were correctly classiﬁed into LBBB and 11, 4, 3 and 0 beat numbers from the Normal, RBBB, PVC and APB classes, were falsely categorized as LBBB set. On the other side, 4, 0, 3 and 0 beat numbers from LBBB were falsely categorized into the Normal, RBBB, PVC and APB sets, respectively. Therefore, the number of FN recognitions for the LBBB set equals to 7 [37]. Finally, TN is all the remaining beat numbers were correctly classiﬁed as non-LBBB and equals to 2750 beat numbers. A comparison between experimental results of this study and other similar researches which utilized the MIT BIH database from the aspect of classiﬁcation accuracy are reported in Table 7. In Ref. [10], ECG signals are represented using Hermitic coefﬁcients; moreover Ref. [47] applied interacting multiple model (IMM) and sequential Markov chain Monte Carlo (SMCMC) algorithms to the ECG representation. Ref. [38] utilized the second, third and fourth order of cumulants and fuzzy hybrid neural network. Furthermore, Ref. [48] applied the new wavelet functions to estimate the shape of ECG signals. Although, the results for all methods are similar, we have slightly improved the accuracy for all of the classes by presenting the new feature generation based on CSD.

S. Mihandoost, M. Chehel Amirani / Biomedical Signal Processing and Control 31 (2017) 79–88

87

Table 5 Confusion matrix. Detected class

Normal LBBB RBBB PVC APB

Actual class

Normal

LBBB

RBBB

PVC

APB

1202 4 2 1 2

11 643 4 3 0

2 0 540 0 1

1 3 1 692 8

2 0 3 4 289

Table 6 Performance evaluation of the proposed method. Total Number of Test Data

Sp(%)

Se(%)

Correct Classiﬁcation

Misclassiﬁcation

Acc (%)

3418

99.62

98.47

3366

52

98.48

Table 7 Experimental results of this study and other similar researches. Methods

[38] [47] [10] [48] CSA-GARCH

Correct Classiﬁcation Rate (%) Normal

LBBB

RBBB

PVC

APB

98 98 98 99.44 98.68

97 99 97 100 98.92

94 98 99 92.62 98.18

* * * 95.7 98.85

92 * 94 * 96.33

(*)Not reported.

Table 8 Comparing different method based on the classiﬁcation accuracies and number of used features. Authors

Methods Features Extraction/Classiﬁer

Dataset

Input Feature Dimension

Average Classiﬁcation Rate

Osowski and et.al. [10] Osowski and et.al. [38]

Hermite Coefﬁcients/SVM

7279 beats from MIT-BIH; 3611 training-3668 testing; 7185 beats from MIT-BIH;4035 training-3150 testing [Normal: 2250, APB: 658, LBBB: 1200, PVC: 1500, RBBB: 1000, VF: 472, VE: 105] 7185 beats from MIT-BIH; 4035 training-3150 testing [Normal: 2250, APB: 658, LBBB: 1200, PVC: 1500, RBBB: 1000, VF: 472, VE: 105] 7279 beats from MIT-BIH; 3611 training—3668 testing [Normal: 2344, LBBB: 1250, RBBB: 1050, PVC:1400, APB:658,VE:105, VF: 472]

17

∼96.1

*

96.06

18

97.53

16

98.28

beats from MIT-BIH; [Normal, LBBB, RBBB, VE, Nodal Escape]

17

98

6702 beats from MIT-BIH; 3284 training—3418 testing [Normal: 2344, LBBB: 1250, RBBB: 1050, PVC:1400, APB:658]

12

98.48

Cumulants of the Second, Third and Fourth Order/Fuzzy Hybrid Neural Network

S. N. Yu and et.al. [39]

Higher Order Statistics of Subband Components/Feed forward Neural Network

Homaeinezhad and et.al. [37]

Geometrical Properties Obtained from Segmentation of Each Detected-Delineated QRS Complex Virtual Image as well as RR-Tachogram/a Fusion Structure Consisting of Three MLP and Three ANFIS Classiﬁers Interacting Multiple Model (IMM) and Sequential Markov Chain Monte Carlo (SMCMC)/Bayesian ML Classiﬁer CSA-GARCH model/SVM

S. Edla, and et.al. [47] This study

(*)Not reported.

Some ECG classiﬁcation methods, with an emphasis on feature generation step, try to present the discriminating features for the classiﬁer. Therefore, the efﬁciency of this methods are depended on how well the content of the signal is described by the obtained features; what is the dimensionality of obtained feature in the feature generation step [30]. The aforementioned common questions should be answered by anyone who is presenting a new feature generating method [30]. Table 8 answers above questions

and shows a comparison in number of utilized features and accuracy among several methods. Experimental results indicate that the cyclic stationary hypothesize for ECG signal gives rich and efﬁcient information. Furthermore, we know every study has several advantages and limitations. The proposed method revealed the hidden information and obtained good performance, but it is time consuming and not suitable for online processing.

88

S. Mihandoost, M. Chehel Amirani / Biomedical Signal Processing and Control 31 (2017) 79–88

5. Conclusion ECG analyzing has a long history of investigation. Although there are famous studies for this ﬁeld, the ECG analyzing, currently, is an active ﬁeld of study. In this paper, the Cyclic Spectral Analysis was applied to ECG classiﬁcation problem. In fact, the main goal of this paper is to present an effective feature generation method for ECG signal. We indicated that ECG signals are cyclostationarity. The CSD coefﬁcients of the signal using the FAM algorithm were calculated. We showed that these coefﬁcients have heteroscedastic property; hence, we employed GARCH model to describe them because it can accurately capture this property. Advantage: the proposed method can reveal hidden information content periodicity of order 2 and heteroscedastic properties of signal. These rich information cause generate efﬁcient features, and consequently increase accuracy of classiﬁcation. In addition, this method is implementable. The experimental results indicate high performance of proposed method for the ECG signal classiﬁcation in comparison with previous methods. Limitation: in biomedical signal processing accuracy is more important that computational complexity. Although proposed method have good performance, it is time consuming and not suitable for online processing. In addition, this method is implementable but it is difﬁcult because of its computational complexity. Suggestion: we try to decrease the processing time of this method in our future studies. To overcome these limitation, using the faster algorithm or matrix based algorithm can be a good suggestion. References [1] A.L. Goldberger, Clinical Electrocardiography: a Simpliﬁed Approach, 8th ed., Elsevier Health Sciences, 2012. [2] S. Banerjee, M. Mitra, Application of cross wavelet transform for ecg pattern analysis and classiﬁcation, IEEE Trans. Instrum. Meas. 63 (2) (2014) 326–333. [3] L. Schamroth, An introduction to electrocardiography, Acad. Med. 39 (10) (1964) 977. [4] M.C. Amirani, A.A. Shirazi, Evaluation of the texture analysis using spectral correlation function, Fundam. Inform. 3 (2009) 245–262. [5] L.G. Herrera-Bendezú, B.G. Denys, Feature identiﬁcation of ECG waveforms via orthonormal functions, Proc. Comput. Cardiol. (September) (1990) 641–644. [6] W. Philips, ‘ECG data compression with time-warped polynomials’, IEEE Trans. Biomed. Eng. 40 (11) (1993) 1095–1101. [7] C.-H. Lin, Frequency-domain features for ECG beat discrimination using grey relational analysis-based classiﬁer, Comput. Math. Appl. 55 (4) (2008) 680–690. [8] M. Tsipouras, D. Fotiadis, Automatic arrhythmia detection based on time and time-frequency analysis of heart rate variability, Comput. Methods Programs Biomed. 74 (2) (2004) 95–108. [9] C.-H. Lin, Y.-C. Du, T. Chen, Adaptive wavelet network for multiple cardiac arrhythmias recognition, Expert Syst. Appl. 34 (4) (2008) 2601–2611. [10] S. Osowski, L.T. Hoai, T. Markiewicz, Support vector machine-based expert system for reliable heartbeat recognition, IEEE Trans. Biomed. Eng. 51 (4) (2004) 582–589. [11] R. Dubois, P. Roussel, M. Vaglio, F. Extramiana, F. Badilini, P. Maison-Blanche, G. Dreyfus, Efﬁcient modeling of ECG waves for morphology tracking, Proc. Comput. Cardiol. (2009) 313–316. [12] P.E. McSharry, G.D. Clifford, L. Tarassenko, L. Smith, ‘A dynamical model for generating synthetic electrocardiogram signals’, IEEE Trans. Biomed. Eng. 50 (3) (2003) 289–294. [13] Q. Li, R.G. Mark, G.D. Clifford, Robust heart rate estimation from multiple asynchronous noisy sources using signal quality indices and a Kalman ﬁlter, Physiol. Meas. 29 (1) (2008) 15. [14] O. Sayadi, R. Sameni, M.B. Shamsollahi, ECG denoising using parameters of ECG dynamical model as the states of an extended Kalman ﬁlter, Proc. Int. Conf. IEEE Eng. Med. Biol. Soc. (2007) 2548–2551. [15] O. Sayadi, M.B. Shamsollahi, G.D. Clifford, Robust detection of premature ventricular contractions using a wave-based bayesian framework, IEEE Trans. Biomed. Eng. 57 (2) (2010) 353–362. [16] O. Sayadi, M.B. Shamsollahi, G.D. Clifford, Synthetic ECG generation and Bayesian ﬁltering using a Gaussian wave-based dynamical model, Physiol. Meas. 31 (10) (2010) 1309.

[17] M.C. Amirani, A.A.B. Shirazi, Texture classiﬁcation using cyclic spectral function. In 2008 congress on image and signal processing, IEEE, 2008, pp. 834–838. [18] W.A. Brown, ‘On the theory of cyclostationary signals’ (1989). [19] W. Gardner, Measurement of spectral correlation, IEEE Trans. Acoust. Speech Signal Process. 34 (5) (1986) 1111–1123. [20] W. Gardner, Spectral correlation of modulated signals: part I—analog modulation, IEEE Trans. Commun. 35 (6) (1987) 584–594. [21] W. Gardner, W. Brown, C.-K. Chen, Spectral correlation of modulated signals: part II—digital modulation, IEEE Trans. Commun. 35 (6) (1987) 595–601. [22] W.A. Gardner, The spectral correlation theory of cyclostationary time-series, Signal Process. 11 (1) (1986) 13–36. [23] W.A. Gardner, Statistical Spectral Analysis: A Nonprobabilistic Theory, Prentice-Hall, Inc., 1986. [24] R.S. Roberts, W. Brown, H.H. Loomis Jr., Computationally efﬁcient algorithms for cyclic spectral analysis, Signal Process. Mag. IEEE 8 (2) (1991) 38–49. [25] M. Amirmazlaghani, H. Amindavar, Two novel Bayesian multiscale approaches for speckle suppression in SAR images, IEEE Trans. Geosci. Remote Sens. 48 (7) (2010) 2980–2993. [26] M. Amirmazlaghani, H. Amindavar, A. Moghaddamjoo, Speckle suppression in SAR images using the 2-D GARCH model, IEEE Trans. Image Process. 18 (2) (2009) 250–259. [27] B.M. Asl, S.K. Setarehdan, M. Mohebbi, Support vector machine-based arrhythmia classiﬁcation using reduced features of heart rate variability signal, Artif. Intell. Med. 44 (1) (2008) 51–64. [28] K. Polat, S. Günes¸, Detection of ECG Arrhythmia using a differential expert system approach based on principal component analysis and least square support vector machine, Appl. Math. Comput. 186 (1) (2007) 898–906. [29] E.D. Übeyli, ECG beats classiﬁcation using multiclass support vector machines with error correcting output codes, Digit. Signal Process. 17 (3) (2007) 675–684. [30] S. Sahami, M.C. Amirani, Matrix based cyclic spectral estimator for fast and robust texture classiﬁcation, Vis. Comput. 29 (12) (2013) 1245–1257. [31] S. Mihandoost, M.C. Amirani, EEG signal analysis using spectral correlation function & GARCH model, Signal Image Video Process. (2014) 1–12. [32] L.R. Rabiner, Multirate Digital Signal Processing, Prentice Hall PTR, 1996. [33] R.F. Engle, Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inﬂation, Econom.: J. Econom. Soc. (1982) 987–1007. [34] M. Amirmazlaghani, H. Amindavar, EMG signal denoising via Bayesian wavelet shrinkage based on GARCH modeling, in: Acoustics, Speech and Signal Processing, ICASSP, IEEE International Conference, 2009, pp. 469–472. [35] T. Bollerslev, ‘Generalized autoregressive conditional heteroskedasticity’, J. Econom. 31 (3) (1986) 307–327. [36] M. Arif, Robust electrocardiogram (ECG) beat classiﬁcation using discrete wavelet transform, Physiol. Meas. 29 (5) (2008) 555. [37] M. Homaeinezhad, E. Tavakkoli, A. Afshar, S.A. Atyabi, A. Ghaffari, Neuro-ANFIS architecture for ECG rhythm-type recognition using different QRS geometrical-based features, Iran. J. Electr. Electron. Eng. 7 (2) (2011) 70–83. [38] S. Osowski, T.H. Linh, ECG beat recognition using fuzzy hybrid neural network, IEEE Trans. Biomed. Eng. 48 (11) (2001) 1265–1271. [39] S.-N. Yu, Y.-H. Chen, Noise-tolerant electrocardiogram beat classiﬁcation based on higher order statistics of subband components, Artif. Intell. Med. 46 (2) (2009) 165–178. [40] F.J. Massey Jr., The Kolmogorov-Smirnov test for goodness of ﬁt, J. Am. Stat. Assoc. 46 (253) (1951) 68–78. [41] V. Choqueuse, K. Yao, L. Collin, G. Burel, Blind detection of the number of communication signals under spatially correlated noise by ICA and KS tests, in: In Acoustics, Speech and Signal Processing, ICASSP, IEEE International Conference, 2008, pp. 2397–2400. [42] Q. Cheng, H. Zhou, J. Cheng, ‘The ﬁsher-markov selector: fast selecting maximally separable feature subset for multiclass classiﬁcation with applications to high-dimensional data’, IEEE Trans. Pattern Anal. Mach. Intell. 33 (6) (2011) 1217–1233. [43] C.J.C. Burges, A tutorial on support vector machines for pattern recognition, Data Min. Knowl. Discov. 2 (2) (1998) 121–167. [44] C.W. Hsu, C.C. Chang, C.J. Lin, A Practical Guide to Support Vector Classiﬁcation, Technical Report, Department of Computer Science, National Taiwan University, 2003, July. [45] C.W. Hsu, C.J. Lin, A comparison on methods for multi-class support vector machines, IEEE Trans. Neural Netw. 13 (2002) 415–425. [46] E.D. Übeyli, Statistics over features: EEG signals analysis, Comput. Biol. Med. 39 (8) (2009) 733–741. [47] S. Edla, N. Kovvali, A. Papandreou-Suppappola, Electrocardiogram signal modeling with adaptive parameter estimation using sequential bayesian methods, IEEE Trans. Signal Process. 62 (10) (2014) 2667–2680. [48] M. Nazarahari, S.G. Namin, A.H.D. Markazi, A.K. Anaraki, A multi-wavelet optimization approach using similarity measures for electrocardiogram signal classiﬁcation, Biomed. Signal Process. Control 20 (2015) 142–151.

Cyclic spectral analysis of electrocardiogram signals based on GARCH model

Cyclic spectral analysis of electrocardiogram signals based on GARCH model

Recommend Documents