Available online at www.sciencedirect.com
Medical Engineering & Physics 31 (2009) 92–100
Human heart beat analysis using a modified algorithm of detrended fluctuation analysis based on empirical mode decomposition Jia-Rong Yeh a , Shou-Zen Fan b , Jiann-Shing Shieh a,∗ a
Department of Mechanical Engineering, Yuan Ze University, 135 Yuan-Tung Road, Chung-Li, Taoyuan 320, Taiwan b Department of Anaesthesiology, College of Medicine, National Taiwan University, Taiwan Received 10 January 2008; received in revised form 30 April 2008; accepted 30 April 2008
Abstract How to quantify the complexity of a physiological signal is a crucial issue for verifying the underlying mechanism of a physiological system. The original algorithm of detrended fluctuation analysis (DFA) quantifies the complexity of signals using the DFA scaling exponent. However, the DFA scaling exponent is suitable only for an integrated time series but not the original signal. Moreover, the method of least squares line is a simple detrending operation. Thus, the analysis results of the original DFA are not sufficient to verify the underlying mechanism of physiological signals. In this study, we apply an innovative timescale-adaptive algorithm of empirical mode decomposition (EMD) as the detrending operation for the modified DFA algorithm. We also propose a two-parameter scale of randomness for DFA to replace the DFA scaling exponent. Finally, we apply this modified algorithm to the database of human heartbeat interval from Physiobank, and it performs well in identifying characteristics of heartbeat interval caused by the effects of aging and of illness. © 2008 IPEM. Published by Elsevier Ltd. All rights reserved. Keywords: Detrended fluctuation analysis; Self-similarity; Empirical mode decomposition
1. Introduction Time series of physiologic signals are usually complex, though if they can be analyzed they may reveal information on physiologic systems. Quantifying the complexity of physiologic signals is a way to evaluate the condition of a physiologic system that has not been resolved by medical engineers. According to the development of chaotic analysis, chaotic modeling techniques help to describe the evolution of the behavior of a nonlinear system [1]. These analysis results indicate the chaotic characteristics existing in a nonlinear system and they are helpful in analyzing and predicting the condition of the nonlinear system. Recently, many different analysis algorithms have been proposed for quantifying the complexity of physiologic signals, including multi-scale entropy [2], detrended fluctuation analysis (DFA) [3], measurement of similarity [4], and others. The original DFA proposed by Peng et al. [3] is a simple and efficient method for scaling the long-term correlation of ∗
Corresponding author. Tel.: +886 3 4638800x2470. E-mail address:
[email protected] (J.-S. Shieh).
non-stationary time series [5]. It is also one of the promising approaches for quantifying the complexity of physiologic signals using the fractal property of the integrated time series of physiological signals [6]. In the original DFA, the degree of long-term correlation is scaled by the DFA scaling exponent. However, there are two important limits for quantifying the complexity of signals by the original DFA. The first limitation is that the fractal property of a real-world signal is bounded. So, the DFA scaling exponent can be derived only from the integrated time series of the signal [7,8]. Secondly, the detrending operation via the least square line of the original DFA algorithm is a simplified method, so the derived trend is not accurate. More recently, higher order detrending [9] and detrending moving averages (DMAs) [10] have been proposed for detrending. Those modified detrendings clearly improve the performance of DFA, but the extracted trends were still derived with a given timescale of n. In this investigation, the timescale of detrending is not a fixed value but is adaptive to the nature of the time series. In the past, empirical mode decomposition (EMD) has been used as a data-driven signal processing algorithm [11–13]. However, Wu and Huang [15] and Flandrin et al.
1350-4533/$ – see front matter © 2008 IPEM. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.medengphy.2008.04.011
J.-R. Yeh et al. / Medical Engineering & Physics 31 (2009) 92–100
[16] recently conducted studies on the characteristics of white noise and fractal Gaussian noises with various Hurst exponents by EMD [16], confirming some important characteristics of white noise [15]. Those characteristics of white noise lead us to define a baseline of randomness for evaluating the complexity of signals. Furthermore, EMD is also the only timescale-adaptive detrending process to derive the trend with its corresponding intrinsic timescale [17]. Among the applications of EMD, it has been applied for deriving trends of financial markets and natural signals [17]. In contrast to the original detrending operation of the least square line, timescale during an EMD-based detrending process is not a fixed or pre-determined value but is adaptive to the nature of data. According to Huang et al. [14], a limited number of trends can be derived from a complex time series by EMD, and for each trend, the timescale varies with time. This is the main difference between the original and EMD-based detrending operations. In this modified DFA algorithm, EMD is used for the detrending operation to make the trend more accurate. However, the timescale for each trend extracted by EMD, denoted as the EMD-based trend, is a fluctuating time series and not a fixed value. We determine the timescale of an EMD-based trend using the average for a time series of timescale, and this averaged timescale is denoted by an intrinsic timescale because it adapts to the intrinsic characteristic of time series. Moreover, in contrast to the timescale, which is a predetermined value for each detrending process by the original detrending operation, the values of intrinsic timescales are adaptive to the nature of time series. Therefore, the signal complexity can be also evaluated on an extra dimension of intrinsic timescale distribution. Moreover, since the DFA scaling exponent is not used as the scale of complexity in the modified DFA algorithm, the integrated time series is no longer necessary. So, in contrast to the original DFA algorithm, we analyze the original time series directly. Furthermore, we propose that the scale of complexity can be evaluated by measuring the difference between the characteristics of detrended fluctuations for the original time series and a randomly shuffled time series of itself. For a stochastic signal, the difference in measurement is small, whereas the difference in measurement is large for an orderly time series. This is why we claim the scale of complexity is based on the concept of randomness. Moreover, the scale of complexity is evaluated both on dimensions of energy spectrum of detrended time series and on intrinsic timescale distribution in the modified DFA algorithm. Thus, the scale of randomness is a two-parameter scale and the modified DFA algorithm is a two-parameter analysis algorithm. Finally, this modified DFA algorithm is applied for human heartbeat interval analysis. The database of human heartbeat interval, including 92 subjects of four groups with different underlying physiologic mechanisms, was downloaded from Physiobank [18] and used for this investigation. Analysis results demonstrate the powerful analytical capability by the modified DFA algorithm.
93
2. Methodology 2.1. The original DFA algorithm In 1994, the original algorithm of detrended fluctuation analysis applied for fractal analysis was proposed to quantify the complexity of a non-stationary time series [3]. In the original DFA, a least squares line of data in a window with fixed timescale n is used to represent the data trend in the window and the fluctuation of detrended time series is defined using root mean square (RMS) energy. In addition, a logarithmic fluctuation against its corresponding logarithmic timescale plot is denoted as a DFA plot. For a fractal subject, the DFA plot is a straight line and the DFA scaling exponent is defined by the slope of the DFA plot. However, the time series of a real-world signal is not a real fractal subject because its fractal property is bounded [3]. In order to improve the fractal property of a real-world signal, the time series of a signal should be integrated [5,6] by the following equation: y(k) =
k
[B(i) − Bave ]
(1)
i=1
where B(i) is the ith heart beat interval, Bave is the average heart beat interval and y(k) is the value of the kth point of integrated time series. Then, the fluctuation of detrended integrated time series for a given window with timescale of n is calculated by N 1 F (n) = [y(k) − yn (k)]2 (2) N k=1
where F(n) is the fluctuation of an integrated time series for a given window with timescale of n, N is number of heart beats and yn (k) is the kth point on the trend with a given window with timescale of n. In the original DFA algorithm, the trend in a window is the least square line fitting the data points within the window, and the DFA scaling exponent α is calculated by the slope of DFA plot. 2.2. Empirical mode decomposition Empirical mode decomposition method is an innovative data processing algorithm for nonlinear and non-stationary time series [11–13]. It is applied to decompose a complex time series for a limited number of intrinsic mode functions (IMF). The decomposing operation called a sifting process extracts the intrinsic mode function from the time series using the mean of upper and lower envelopes, while an envelope connects the extrema by using a cubic spline. However, a limited number of intrinsic mode functions are extracted from a complex time series and the residue with a monotonic pattern remains. The original time series can be reconstructed by the
94
J.-R. Yeh et al. / Medical Engineering & Physics 31 (2009) 92–100
decomposed IMFs and the residue by following equation: X(t) =
N
Cn + r n
(3)
n=1
where X(t) is the original time series, Cn is the nth IMF, and rn is the residue after n decomposing operations. Recently, Wu and Huang studied the characteristics of white noise using EMD [15]. Two parameters, energy density and its corresponding averaged period, were defined to present the special period and energy for each IMF. These two parameters are calculated by the following equations: 2
En =
N 1 Cn (j) N
(4)
j=1
T¯ n =
Sln T,n d ln T
Sln T,n
d ln T T
−1 (5)
where En is the energy density of the nth IMF, SlnT,n is the Fourier spectrum of the nth IMF as a function of ln T, T is the period, and T¯ n is the averaged period of the nth IMF. For a white noise, the sum of the logarithmic averaged period and logarithmic energy density is a constant and is expressed as: ln En + ln Tn = c
(6)
where En is the energy density, Tn is the averaged period, and c is a constant that depends on the signal’s total energy density. Wu and Huang [15] showed that the EMD-based energy spectrum of a white noise is a straight line with slope of −1. Moreover, Flandrin et al. conducted a similar study with fractal Gaussian noise [16], also finding that the EMD-based energy spectrum of a fractal Gaussian noise is a straight line. But the slope of the EMD-based energy spectrum for a fractal Gaussian noise depends on the value of the Hurst exponent. Furthermore, EMD has been shown to be the only adaptive detrending operation for signal analysis [17]. In contrast to the original detrending operation, timescale is adaptive to the nature of signal and it is a time series but not a fixed value in an EMD-based detrending process. 2.3. Modified DFA algorithm As mentioned above, there are two limitations in the original DFA. First, to extend the fractal property for a real-world signal, the original DFA algorithm is applied on the integrated time series, and the property of self-similarity (i.e., the analysis result of the original DFA) is not sensitive enough. Secondly, the original detrending operation using the least squares line is not an accurate detrending method. Thus, we propose a modified DFA algorithm to overcome these limitations. As the first limitation in the original DFA, a real-world time series should be integrated before a detrending process
in order to enhance the fractal property of signal. However, an integrated time series would reduce the fluctuation under small timescales and alter the signal’s energy spectrum. Therefore, we conduct the detrending process and DFA analysis directly on the original time series to avoid altering the energy spectrum. Meanwhile, the scale of complexity using the fractal property disappears in the modified DFA analysis. The randomly shuffled time series has been previously used as a surrogate for the random time series [4]. Thus, we propose that the scale of complexity can be evaluated by measuring the difference between detrended fluctuations of the original time series and the randomly shuffled surrogate of itself. The randomly shuffled time series and the original time series have the same values of means and standard deviations but different characteristics of randomness. Therefore, the difference of detrended fluctuations depends on the randomness of the time series only. Thus, in the modified DFA algorithm, scale of complexity is based on the measurement of difference between the detrended fluctuations of the original signal and the baseline of randomness. Furthermore, the EMD method is applied for detrending to improve the accuracy of trends. According to the study of trends and detrending methods by Huang et al., an EMD-based trend is the residue of the decomposition process of EMD, and the timescale of an EMD-based trend is the averaged timescale of the decomposed IMF. Therefore, the detrended time series is expressed by the following equation: fn (k) =
n
ci (k)
(7)
i=1
where fn (k) is value of the kth point of the nth detrended time series and ci (k) is value of the kth point of the ith IMF. Comparing the DFA plot by detrending on EMD with the plot from the original detrending method for an example of complex time series, the fluctuation of detrended time series by EMD is larger than that by the least squares line under the same timescale, as shown in Fig. 1. Since an inaccurate trend cannot fit the signal well and it weakens the detrended fluctuation, the fluctuation of detrended time series is smaller than a more accurate detrending algorithm. Fig. 1 shows that the EMD-based detrending process yields a more accurate trend from a complex time series. Moreover, the randomly shuffled time series was used as a baseline of randomness for a complex time series. In Fig. 2, the DFA plots for the original time series and the baseline of randomness using two different detrending operations show the differences between detrended fluctuations of the original time series and the baseline of randomness under various timescales. Using the original detrending operation, only the difference of energy spectrum between the detrended fluctuations of the original time series and baseline of randomness can be determined. However, using the EMD-based detrending operation, the difference is shown in both dimensions of intrinsic timescale and energy spectrum of detrended time
J.-R. Yeh et al. / Medical Engineering & Physics 31 (2009) 92–100
95
and represents a high complexity of the original time series. These two parameters can be calculated by the following equations. N
1 IS = (FS,i − FO,i ) N
(8)
i=1 N
IT =
1 (TS,i − TO,i ) N
(9)
i=1
Fig. 1. Analysis results of DFA using a least-squares line and EMD. Dots show the fluctuation of detrended heartbeat time series under a timescale of n by the detrending operation of least squares line. Triangles show the fluctuation of a detrended time series under the first five intrinsic timescales by the detrending operation of EMD.
series. Therefore, we define a two-parameter scale of randomness using these differences in the two dimensions of EMD-based energy spectrum and intrinsic timescale distribution. These two parameters are denoted as the spectrum index (SI) and the timescale index (TI). In this study, the spectrum index represents the averaged difference of energy density between EMD-based DFA plots of the original and randomly shuffled time series, and the timescale index represents the averaged difference of logarithmic intrinsic timescales of these two time series. A small value of both indexes of TI and SI indicates a small difference between the original and the randomly shuffled time series
where IS is the spectrum index; IT is the timescale index. TS,i is the ith intrinsic timescale of randomly shuffled time series; TO,i is the ith intrinsic timescale of the original time series; FS,i is the detrended fluctuation of randomly shuffled time series under the ith intrinsic time series; FO,i is the detrended fluctuation of the original time series under the ith intrinsic timescale; and N is the number of intrinsic timescales. 2.4. Comparing the functions of different parameters by statistical analysis There are three different parameters (i.e., DFA scaling exponent, spectrum index, and timescale index) derived from the original and modified DFA algorithms. In order to estimate the function of these parameters to verify the underlying physiological mechanisms of heartbeat interval, the values of parameters are expressed as means ± S.D. Data were analyzed by one-way analysis of variance (ANOVA). The Student–Newman–Keuls (SNK) test was conducted for multiple comparisons when the null hypothesis was not applicable. P < 0.05 was considered statistically significant [19,20]. 2.5. Two-parameter classification model using a contour plot
Fig. 2. DFA plots of the original and the randomly shuffled time series using two different detrending operations. The gray dots illustrate the DFA plot of the original time series by DFA using detrending operation of least-squares line and the black dots illustrate that of the randomly shuffled time series. The inverse triangles illustrate the DFA plots of the original time series by DFA using EMD, and the triangles illustrate that of the randomly shuffled time series.
The concepts and definitions of fuzzy set theory [21] are briefly summarized to facilitate the classification. As a result, the degree of membership μA (x) of element x to a set A depends on their positions with respect to the center (or the central concept). An element’s position within a set is determined by the degree of belief. In this investigation, we also design a simple two-parameter classification model using a contour plot based on the similar concept but different modeling. At the beginning, each parameter applied in the classification model should be normalized by the range between maximum and minimum to create a domain space from 0 to 1. Every point in the domain space has a degree of belief that the subject on that point belongs to the group. Secondly, we define the membership function of group’s subject as an exponential decay function of distance, ranging from 1 to 0, and the membership function of a group is the summation of membership functions for all subjects in the group. Finally, membership functions of two groups can be merged together by taking the absolute value of the difference between membership functions of the two groups. Thus, we
96
J.-R. Yeh et al. / Medical Engineering & Physics 31 (2009) 92–100
Table 1 ANOVA statistical analysis of four groups with different underlying physiological mechanisms for three different parameters Parameter
Healthy young
Healthy elderly
CHF
DFA scaling exponent Spectrum index Timescale index
0.973 ± 0.129 ± 0.053b,c 0.391 ± 0.062c,d
0.987 ± 0.250 ± 0.096a,c,d 0.413 ± 0.067c,d
1.070 ± 0.367 ± 0.192a,b,d 0.170 ± 0.084a,b,d
0.075c,d
0.095c,d
AF 0.150a,b,d
P-value
0.582 ± 0.027 ± 0.013b,c 0.076 ± 0.031a,b,c 0.049a,b,c
<0.001 <0.001 <0.001
Values are expressed as mean ± S.D. P < 0.05 was considered statistically significant using ANOVA method. a P < 0.05 for comparing to healthy young group using SNK test. b P < 0.05 for comparing to healthy elderly group using SNK test. c P < 0.05 for comparing to CHF group using SNK test. d P < 0.05 for comparing to AF group using SNK test.
can find boundaries between groups on the contour plot. For any point on the boundary, there are equal degrees of belief for belonging to two or more adjacent groups.
3. Material For this study, the NN-interval database of Physiobank [18] was downloaded as the study material. This database includes, 40 healthy subjects in two subgroups of young and elderly, 43 subjects with severe congestive heart failure (CHF), and 9 subjects with atrial fibrillation (AF). There are 20 subjects in the young subgroup (10 females and 10 males, with an average age of 25.9 years) and 20 subjects in the elderly subgroup (10 females and 10 males, with an average age of 74.5 years) in the healthy group. There are 43 subjects (15 females and 28 males, with averaged age of 55.5 years) in the CHF group, and 9 subjects in the AF group.
4. Results For the modified DFA algorithm, the crucial issue is whether or not the surrogate of white noise is a real random signal and to determine this we use the EMD-based energy spectrum analysis. Based on the characteristics of white noise from EMD-based energy spectrum analysis, the EMD-based energy spectrum of a white noise should be a straight line with slope of −1. Thus, we conduct this test with a randomly shuffled time series of 92 subjects. This test shows that the average slope and standard deviation of EMD-based energy spectrum for these 92 randomly shuffled time series are −0.9879 ± 0.1406. We also conduct the same test for 1000 examples of simulated white noise gener-
ated by the MATLAB random number generator, with results of −0.9736 ± 0.0392. Comparing the test results of the randomly shuffled time series to those of simulated white noises, the randomly shuffled time series satisfy the characteristic of white noise. Furthermore, we compare the three parameters of DFA scaling exponent, spectrum index, and timescale index. In Table 1, the analysis results of three parameters are shown as means ± S.D. for four groups. Then, ANOVA analysis and SNK test are conducted to verify the statistical differences among four groups using different parameters. For the DFA scaling exponent and timescale index, only the comparison between groups of healthy young and healthy elderly is not statistically significant (P > 0.05). And for spectrum index, only the comparison between groups of healthy young and AF is not statistically significant (P > 0.05). Therefore, the spectrum index provides weak verification between the groups of healthy young and AF, whereas the timescale index provides weak verification between the groups of healthy young and healthy elderly. The analysis results of spectrum index and timescale index show different advantages and disadvantages. However, the two parameters of SI and TI represent the scale of complexity in two different dimensions for human heartbeat intervals. Spectrum index shows the scale of complexity in the dimension of energy spectrum, while timescale index shows the scale of complexity in the dimension of intrinsic timescale distribution. To explain that the heartbeat intervals with different underlying physiologic mechanisms have various degrees of complexity in different dimensions of energy spectrum and intrinsic timescale distribution, we compared four groups for the parameters of spectrum index and timescale index. In Table 2, the spectrum index and timescale index of each group are shown as means ± S.D. We also deter-
Table 2 Comparison results of four groups for spectrum index and timescale index Group
Spectrum index (SI)
Timescale index (TI)
Notation for SI
Notation for TI
Healthy young Healthy elderly CHF AF
0.129 ± 0.053 0.250 ± 0.096 0.367 ± 0.150 0.027 ± 0.049
0.391 ± 0.062 0.413 ± 0.067 0.170 ± 0.084 0.076 ± 0.031
Low High High Low
High High Low Low
Average
0.192
0.262
Spectrum index and timescale index for each group are shown as means ± S.D. The notation is determined as high when the parameter value is larger than the average value. Otherwise, the notation is determined as low.
J.-R. Yeh et al. / Medical Engineering & Physics 31 (2009) 92–100
97
Fig. 3. Membership functions for four groups: (a) healthy young group, (b) healthy elderly group, (c) CHF group, and (d) AF group.
mine the standards of comparisons using average of means for spectrum index and timescale index. The results of comparison are denoted as high when the mean of parameter for a group is larger than the average of these means, and otherwise the value is denoted as low. The average of means for spectrum indexes is 0.192 and 0.262 for timescale index. However, as mentioned above, the spectrum index and timescale index are defined as the measure of difference between the original and the surrogate of white noise in different dimensions. A larger value for either of the parameters indicates a lower randomness for the complex signal. According to the comparisons shown in Table 2, the AF group has low values for both parameters and shows higher degrees of complexity in both dimensions. The healthy elderly group has high values for both parameters and shows lower complexities in both dimensions. The groups of healthy young
and CHF show a higher complexity in one dimension but a lower complexity in the other one. Finally, a two-parameter classification model is applied to determine the boundaries among four groups. Membership functions of the four groups are shown in Fig. 3, indicating that subjects of different groups are located in different areas. For subjects of the AF group, parameters of both SI and TI are low and the heartbeat interval is more complicated in the dimensions of both energy spectrum and intrinsic timescale distribution. Moreover, for subjects of the CHF group, the heartbeat interval is more complex in the dimensions of intrinsic timescale distribution but not in the dimension of EMD-based energy spectrum. Membership functions of four groups are merged together and shown in Fig. 4. We also use Table 3 to show the classifying rate, with three values for each group located in the 3rd,
Table 3 Classifying results of the modified DFA algorithm Group
Number of objects
Healthy young Healthy elderly CHF AF
20 20 43 9
Classifying rate 85% 70% 93% 100%
Negative classifying
False positive classifying
3 6 3 0
3 0 3 0
Here, the classifying rate illustrates that the percentage of objects is classified correctly. A negative classification is the number of objects in a group that are sorted to a wrong group. False positive classification is the number of objects that are sorted to a group they do not belong to.
98
J.-R. Yeh et al. / Medical Engineering & Physics 31 (2009) 92–100
Fig. 4. Two-parameter classification model applied for human heartbeat interval analysis. The black dots illustrate healthy young objects; circles illustrate healthy elderly objects; rectangular marks illustrate CHF objects; cross marks illustrate AF objects. Each closed area illustrates the distribution of a group.
4th and 5th columns. The first value is the classifying rate, which indicates the percentage of subjects sorted correctly in a group. The second value, denoted as negative classification, indicates the number of subjects belonging to a group but not identified correctly. The third value, denoted as false positive classification, indicates the number of subjects that are sorted into a group but not really belonging to the group. According to the analysis results, subjects with different underlying physiologic mechanisms present different complexities in different dimensions. In summary, healthy subjects have lower complexity in the dimension of intrinsic timescale, but subjects with CHF or AF have higher complexity in the dimension of intrinsic timescale. Moreover, subjects with different types of heart disease (i.e., CHF and AF) can be determined by complexity in the dimension of energy spectrum. In Table 3, the negative classifying rate for group of healthy elderly is higher than that for the other groups because subjects in this group have diverse underlying physiologic situations. In addition, the receiver operating characteristic (ROC) curves are frequently used to evaluate classification models [22]. ROC analysis determines the accuracy of a model’s ability to distinguish positive from negative cases. According to the analysis results shown in Table 2, TI is a powerful index for distinguishing the healthy subjects from subjects with heart disease (i.e., AF and CHF). SI is effective for verifying young and elderly subjects in healthy groups. It is also useful for verifying subjects with CHF or AF in the groups with heart disease. Therefore, we conducted three comparisons to show the parameter sensitivities (i.e., SI, TI, and DFA scaling exponent) for classification. These comparisons are healthy subjects versus subjects with heart disease, healthy young versus elderly subjects, and subjects with CHF versus subjects with AF. The comparison results are shown in Fig. 5 and Fig. 5(a) shows the comparison between healthy subjects
Fig. 5. ROC curves for different classifications using various parameters. (a) Classification according to whether subjects have heart disease or not. (b) Classification into groups of healthy young and elderly. (c) Classification into groups of CHF and AF.
and subjects with heart disease. Clearly, TI performs is more sensitive and specific for the classification of healthy subjects versus subjects with heart disease. Fig. 5(b) shows that SI is more sensitive for the classification of healthy young versus elderly subjects. Moreover, Fig. 5(c) shows a similar ability for the classification of subjects with CHF versus AF
J.-R. Yeh et al. / Medical Engineering & Physics 31 (2009) 92–100
for both SI and the DFA scaling exponent. Therefore, the modified DFA associating parameters of TI and SI as a twoparameter analysis algorithm is more sensitive and specific than the original DFA does.
99
parameter scaling of complexity (i.e., TI and SI). Therefore, the HRV of patients with AF performs high complexity in both dimensions of long-term correlation and distribution of intrinsic timescales is a critical point to present the complexity of a relative random time series quantified by the modified DFA in this study.
5. Discussions and conclusions Although the EMD-based detrending operation is not suitable for the integrated time series in DFA, we apply the innovative signal processing algorithm of EMD to design a two-parameter scale for quantifying the signal complexity. These two parameters are spectrum index (SI) and timescale index (TI). We found that the complexity of a signal cannot be quantified by only a single dimension of long-term correlation. Thus, another dimension of intrinsic timescale distribution is also important for quantifying the complexity of signal. However, the technique of quantifying the complexity of signal using a baseline of white noise has previously been used [15], and deriving the surrogate of white noise from the original signal [4] can totally eliminate the effects caused by the magnitude of signal. This new technique helps us evaluate the complexity of signal dependent only on the signal pattern. In this investigation, human heartbeat intervals for 92 subjects with four types of various underlying physiologic mechanisms are re-investigated by the modified DFA algorithm. The complexity of time series was quantified using two dimensions of long-term correlation (SI) and intrinsic timescale distribution (TI). According to the analysis results shown in Table 2, the heartbeat interval for a subject with CHF or AF has significant complexity in the dimension of intrinsic timescale distribution (i.e., value of TI is small). However, the heartbeat interval of an AF subject is more complex than that of a CHF subject in long-term correlation (i.e., value of SI is relative small). Although the heartbeat interval of a healthy young subject presents similar complexity to that of a healthy elderly subject in the dimension of intrinsic timescale distribution (i.e., values of TI are similar), the difference between healthy young and elderly groups can be identified in long-term correlation (i.e., SI values are significantly different). According to the definitions of TI and SI in the modified DFA, a small value of both indexes of TI and SI shows only a small difference between the original and the randomly shuffled time series and represents a high complexity of the original time series. In this study, the HRV of patients with AF is significantly complex in both dimensions of longterm correlation and distribution of intrinsic timescales, as shown in Table 2. It is completely different from the HRV of patients with CHF, which is complex in distribution of intrinsic timescales but not in long-term correlation. In Table 2, we show the comparison results among four groups using both parameters of SI and TI. Those four groups with various physiological conditions can be distinguished using the two-
Acknowledgement The authors wish to thank the National Science Council (NSC) of Taiwan (Grant Number NSC95-2815-C-155-012E) for supporting this research.
Conflict of interest The authors want to disclose there are no potential conflicts of interest with other people or organizations in this study.
References [1] Shang PJ, Na X, Kamae S. Chaotic analysis of time series in sediment transport phenomenon. Chaos Solitons Fractals 2008, doi:10:1016/j.chaos.2008.01.014. [2] Costa M, Goldberger AL, Peng CK. Multiscale entropy analysis of complex physiologic time series. Phys Rev Lett 2002;89:068102. [3] Peng CK, Buldyrev SV, Havlin S, Simons M, Stanley HE, Goldberger AL. Mosaic organization of DNA nucleotides. Phys Rev E 1994;49:1685–9. [4] Yang ACC, Hseu SS, Yien HW, Goldberger AL, Peng CK. Linguistic analysis of the human heartbeat using frequency and rank order statistics. Phys Rev Lett 2003;90:108103. [5] Rodriguez E, Echeverria JC, Ramirez JA. Detrending fluctuation analysis based on high-pass filtering. Physica A 2007;375:699– 708. [6] Goldberger AL, Amaral LAN, Hausdorff JM, Ivanov PC, Peng CK. Fractal dynamics in physiology: alterations with disease and aging. Proc Natl Acad Sci 2002;99:2466–72. [7] Hurst HE. Long-term storage capacity of reservoirs. Trans Am Soc Civil Eng 1951;116:770–99. [8] Kolmogorov AN. Local structure of turbulence in fluid for very large Reynolds numbers. Translation in turbulence. New York: Interscience Publishers; 1961. p. 151–55. [9] Bunde A, Havlin S, Kantelhardt JW, Penzel T, Peter JH, Voigt K. Correlated and uncorrelated regions in heart-rate fluctuations during sleep. Phys Rev Lett 2000;85:3736–9. [10] Alessio E, Carbone A, Castelli G, Frappietro V. Second-order moving average and scaling of stochastic time series. Eur Phys J B 2002;27:197–200. [11] Huang NE, Shen Z, Long SR, Wu MC, Shih HH, Zheng Q, et al. The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis. Proc R Soc A 1998;454:903–95. [12] Flandrin P, Goncalves P. Empirical mode decompositions as data-driven wavelet-like expansions. Int J Wavelets Multiresolut Inform Process 2004;2:477–96. [13] Flandrin P, Concalves P, Rilling G. In: Huang NE, Shen SSP, editors. Hilbert-Huang transform: introduction and applications. Singapore: World Scientific; 2003. p. 57–74.
100
J.-R. Yeh et al. / Medical Engineering & Physics 31 (2009) 92–100
[14] Huang NE, Wu MLC, Long SR, Shen SSP, Qu WD, Gloersen P, et al. A confidence limit for the empirical mode decomposition and Hilbert spectral analysis. Proc R Soc A 2003;459:2317–45. [15] Wu ZH, Huang NE. A study of the characteristics of white noise using the empirical mode decomposition method. Proc R Soc A 2004;460:1597–611. [16] Flandrin P, Rilling G, Goncalces P. Empirical mode decomposition as a filter bank. IEEE Signal Process Lett 2004;11:112–4. [17] Huang NE, Wu ZH, Long ST, Peng CK. On the trend, detrending and variability of nonlinear and non-stationary time series. Proc Natl Acad Sci 2007;104(38):14889–94. [18] Database are available at http://www.physionet.org/. See Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PC, Mark RG, Mietus
[19] [20]
[21] [22]
JE, Moody GB, Peng CK, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet. Circulation 2000;101:E215. Glanz SA. Primer of biostatistics. 4th ed. Singapore: McGraw-Hill; 1997. Shieh JS, Kao MH, Liu CC. Genetic fuzzy modeling and control of bispectral index (BIS) for general intravenous anaesthesia. Med Eng Phys 2006;28:134–48. Dubolis D, Prade H. Fuzzy sets and systems: theory and applications. New York: Academic Press; 1998. p. 393. Lasko TA, Bhagwat JG, Zou KH, Lucila OM. The use of receiver operating characteristic curves in biomedical informatics. J Biomed Inform 2005;38:404–15.