COMPUTERS AND BIOMEDICAL RESEARCH 20, 543-562 (1987)
Influence of Noise on Wave Boundary Recognition by ECG Measurement Programs: Recommendations for Preprocessing1
J. L. WILLEMS,2 CHR. ZYWIETZ, P. ARNAUD, J. H. VAN BEMMEL, R. DEGANI, AND P. W. MACFARLANE
On behalf of the CSE Working Party3
Received January 26, 1987
In the international cooperative project entitled “Common Standards for Quantitative Electrocardiography” (CSE) systematic noise tests have been performed in order to compare measurement results of electrocardiographic computer programs under degraded operational conditions and to develop recommendations for preprocessing and measurement strategies. The influence of seven different high- and low-frequency noise types on the recognition of P, QRS, and T wave onsets and offsets was investigated. The analysis was performed on 160 electrocardiograms derived from two sets of 10 cases each, by eight electrocardiographic and six vectorcardiographic computer programs. The stability and precision of these programs were tested with respect to the results obtained (1) in the noise-free recordings and (2) by a group of five cardiologists who have analyzed the recordings previously in a Delphi reviewing process. Increasing levels of high-frequency noise shifted the onsets and offsets of most programs outward. Programs analyzing an averaged beat showed significantly less variability than programs which measure every complex or a selected beat. On the basis of the findings of the present study, a measurement strategy based on selective averaging is recommended for diagnostic ECG computer programs. However, averaging should be performed only if proper alignment and precise waveform comparison have been performed beforehand in order to exclude dissimilar complexes. © 1987 Academic Press, Inc.
1 Supported in part by the Commission of the European Communities within the frame of its Medical and Public Health Research Programme under Project No. 82/616/EEC X2.2. and by local and national research funding to different institutes in the EEC Member States.
2 To whom all correspondence should be addressed at University Hospital Gasthuisberg, 49 Herestraat, 3000 Leuven, Belgium.
3 See Appendix for academic affiliations.
0010-4809/87 $3.00 Copyright © 1987 by Academic Press, Inc. All rights of reproduction in any form reserved.
INTRODUCTION
Throughout the world over 30 different computer programs have been developed for automatic analysis of the routine electrocardiogram (ECG) or vectorcardiogram (VCG) (1-4). These programs are basically composed of a measurement section and a diagnostic interpretation routine. The measurement program can be viewed as consisting of the following steps: signal acquisition and conditioning, detection and typification of waves, wave boundary recognition, and feature extraction. As in other fields of pattern recognition, a unique solution to the analytical problem involved does not yet exist. Using a learning and a test set, every investigator has, on a trial-and-error basis, found suitable methods for each of these steps from a wide range of possible techniques, using standard and nonstandard pattern recognition and signal processing methods. The importance of developing quantitative test procedures and reference libraries for assessing the precision and accuracy of these methods has been emphasized on several occasions, as early as 1971 (1-4). In the late seventies it became obvious that the lack of agreed definitions of waves and of common measuring rules had created a situation whereby large differences in measurements hampered the exchange of diagnostic criteria and made evaluation studies on the utility and performance of computer programs difficult (5, 6). To overcome some of these problems a concerted action was started in the European Community, striving toward “Common Standards for Quantitative Electrocardiography” (CSE) (7-17). A reference library was developed and a testing scheme was devised for the assessment of ECG measurement programs (14-16). The measurements made by computer ECG programs are influenced to a large extent by the signal-to-noise ratio of the ECG recordings; yet, to date, no systematic effort has been made to evaluate in a quantitative way how the wave recognition of different programs behaves in the presence of specified noise.
To resolve this shortcoming, systematic noise tests (18-20) have been performed in CSE in order to compare program measurement results under degraded operational conditions and to develop recommendations for preprocessing as well as minimum performance requirements. The present paper, which is the first of two, reports on the influence of noise on the recognition of the P, QRS, and T wave onsets and offsets. As a result recommendations will be formulated with respect to ECG measurement strategies. In a second paper (21), the influence of noise on the stability of various wave measurements, intervals, and amplitudes will be examined.
METHODS
CSE Noise Set and Noise Types
The main study and the establishment of the CSE reference library have been previously described in detail (6-16). Out of the 3-lead CSE library, 10 original
ECGs with different QRS waveforms and low noise content and the 10 corresponding so-called artificial recordings were selected. As described previously (13, 14), the artificial library was created from the original recordings by selecting one beat from each of the lead groups and by creating strings of identical beats with stable RR intervals over 10 sec for the XYZ leads and over 5 sec for each 3-lead group of the conventional 12 leads. One ECG exhibited a normal QRS-T pattern and one had left and one had right ventricular hypertrophy. In addition there was one Wolff-Parkinson-White syndrome, one complete left and one complete right bundle branch block, one old anterior and one old inferior infarction, one combined old inferior and recent anterior infarction, and one high-lateral infarction with nonspecific intraventricular conduction delay and QRS duration greater than 120 msec. All cases were in normal sinus rhythm. Seven different types of noise were added to each of these 10 recordings. Thus, including the noise-free recording, the CSE noise set consisted of 160 recordings (2 x 10 ECGs times 8). The influence of the following noise types was investigated (18): high-frequency (1 to 250 Hz), line-frequency (50 Hz), low-frequency (0.1 to 1 Hz), and step function artifacts. The high-frequency (HF) noise has been generated by a random number generator with Gaussian distribution. Low-pass filtering of this signal was undertaken with a filter function recommended by IEC for electrocardiographs. Three levels of HF noise, namely 15, 25, and 35 µV rms, were added. They constitute noise types 1 to 3. The line-frequency noise had a peak amplitude of 25 µV (noise type 4). For the low-frequency (LF) noise a sine wave was generated with a frequency of 0.3 Hz and a peak amplitude of 500 µV (noise type 5). A sawtooth signal was added with the same frequency and amplitude range to create noise type 6.
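The six noise signals described so far can be sketched in a few lines. This is only an illustration under stated assumptions: the sampling rate `FS`, the function names, and the exact sawtooth shape (a linear ramp from -500 to +500 µV) are not taken from the study, and the IEC low-pass filtering of the Gaussian noise is omitted.

```python
import math
import random

FS = 500  # sampling rate in Hz (assumption; not stated in this section)


def gaussian_noise(n, rms_uv, seed=0):
    """White Gaussian noise rescaled to the requested rms amplitude (µV).

    The IEC low-pass filter applied in the study is NOT reproduced here.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    scale = rms_uv / math.sqrt(sum(v * v for v in x) / n)
    return [v * scale for v in x]


def sine(n, freq_hz, peak_uv, fs=FS):
    """Sine wave with the given frequency and peak amplitude (µV)."""
    return [peak_uv * math.sin(2 * math.pi * freq_hz * i / fs) for i in range(n)]


def sawtooth(n, freq_hz, peak_uv, fs=FS):
    """Linear ramp from -peak to +peak repeating at freq_hz (assumed shape)."""
    return [peak_uv * (2.0 * ((freq_hz * i / fs) % 1.0) - 1.0) for i in range(n)]


def cse_like_noise(noise_type, n, seed=0):
    """Return n samples of noise types 1 to 6 as described in the text."""
    if noise_type in (1, 2, 3):  # HF noise at 15, 25, 35 µV rms
        return gaussian_noise(n, {1: 15.0, 2: 25.0, 3: 35.0}[noise_type], seed)
    if noise_type == 4:  # 50 Hz line interference, 25 µV peak
        return sine(n, 50.0, 25.0)
    if noise_type == 5:  # 0.3 Hz sinusoidal baseline sway, 500 µV peak
        return sine(n, 0.3, 500.0)
    if noise_type == 6:  # 0.3 Hz sawtooth baseline shift
        return sawtooth(n, 0.3, 500.0)
    raise ValueError("noise type must be 1..6")
```

A noisy test record would then be obtained by adding the returned samples, sample by sample, to a noise-free lead.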
The seventh noise type consisted of a combination of 15 µV high-frequency noise, superimposed on a sine wave of 0.3 Hz and 500 µV amplitude. The line-frequency interference and the low-frequency baseline noise were coherent between the various leads of each lead group, but the high-frequency noise was not.
Data Analysis
For every participating program, the results for the seven different noise records were compared with those for the noise-free ECGs. Each program thereby served as its own control. An ECG program with ideal preprocessing should provide the same measurement results from the noisy and the no-noise recordings. Differences between the obtained results represent a measure of noise sensitivity. For the present study the results from the so-called artificial recordings were chosen for analysis in order to avoid the influence of different beat selection algorithms. The CSE noise data have been analyzed by eight ECG and six VCG programs (see Table 1). Each of the cooperating centers had to send its results on magnetic tape to the coordinating center in an agreed format. The parameters measured were those of basic intervals and amplitudes, i.e., P and QRS
TABLE 1
PROGRAMS EXAMINED IN THE PRESENT STUDY
CSE program No.  Program name   12-Lead  Frank XYZ
3                Louvain                 Yes
4                Hannover                Yes
5                HP             Yes
7                IBM            Yes
8                Nagoya         Yes
9                Lyon                    Yes
10               AVA                     Yes
12               Halifax        Yes      Yes
13               Padova         Yes
14               Telemed        Yes
15               Modular        Yes      Yes
16               Sicard-Riedl   Yes
Version (values in source order; assignment to programs not recoverable): 1979, 3.-l, 2-5890 Bi, 5.6, 4.0, 1980, 1980, 6H, 8101, 1980
duration, PR and QT interval, and the duration and amplitude of Q, R, S, R', S', and R'', as well as the amplitude of the J point and of the positive and negative components of the P and T waves. Time locations with respect to the beginning of the record or of the reference beat were requested, as well as a copy of the raw data for the modal or averaged beat when applicable. Alignment of the respective averaged beats of the noise recordings with the beat of the noise-free recordings was made in the coordinating center by means of a cross-correlation method using a 100-msec interval centered on QRS. As for previous studies (13, 16), the earliest onset and latest offset of QRS in any of the three corresponding leads were taken to represent the computer QRS onset and offset for that lead group. For the wave onsets and offsets, differences (algebraic and absolute) were computed between the noise and the no-noise recordings for each program separately and for all ECG and VCG programs combined. The same results have also been derived for all programs computing an average or modal beat (programs 3, 4, 8, 12, 15, and 16) as opposed to those making measurements for each beat separately (programs 5, 7, 9, 10, 13, and 14). The algorithms applied in the different programs for these two basic measurement strategies have been described in detail by various investigators (1-4, 13, 22-26). For each noise type mean differences and standard deviations were computed for the on- and offsets of P and QRS, as well as for the end of T. The variance was calculated first for the results of all cases and subsequently after deletion of the most extreme outlier, for each measurement and each noise type. For the 12-lead ECG, results were computed in each lead group separately and for all lead groups combined. Differences in variance obtained by the various programs were tested according to the method proposed by Hartley (27). Differences in
noise tolerance were also tested, after ranking of the variance results, by using nonparametric two-way analysis of variance according to Friedman (27). The stability and precision of the program results were also tested in relation to the original referee results, after correction for systematic program shifts using results from the 3-lead study (14-16). Deviations in either direction by more than the tolerance limits defined previously (13), i.e., respectively, 10, 12, 6, 12, and 30 msec for the on- and offsets of P, QRS, and T, were given one penalty point. The penalty was doubled if the deviation exceeded two times these tolerance limits. When a program could not make a particular measurement, e.g., when it could not determine the on- or offset of the P wave in the noisy record, it was also given one penalty. Ranking of the programs was performed on the basis of these penalty scores for each measurement and each noise type, as well as for all measurements and all noise types combined, giving equal weight to each wave fiducial. In addition to the specific noise tests, a comparison was made of program results with referee measurement results in low and high noise recordings for the total CSE library (N = 310 cases). Details on the calculation of the noise content and the applied ranking procedure for this set of recordings have been reported previously (13). The influence of noise on the wave amplitude and duration results will be presented in a second paper (21).
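The penalty rule described above can be written out as a short sketch. The tolerance limits are those quoted in the text; the function and variable names, the use of `None` for a measurement that could not be made, and the sign convention of the systematic shift correction are assumptions made for illustration only.

```python
# Tolerance limits in msec, as quoted in the text for each wave fiducial.
TOLERANCE_MS = {
    "P onset": 10,
    "P offset": 12,
    "QRS onset": 6,
    "QRS offset": 12,
    "T end": 30,
}


def penalty_points(fiducial, program_ms, referee_ms, systematic_shift_ms=0.0):
    """Penalty for one measurement of one fiducial.

    program_ms is None when the program made no measurement (1 point).
    systematic_shift_ms is subtracted from the program result before the
    comparison (assumed convention for the shift correction).
    """
    if program_ms is None:
        return 1
    deviation = abs(program_ms - systematic_shift_ms - referee_ms)
    limit = TOLERANCE_MS[fiducial]
    if deviation > 2 * limit:  # more than twice the tolerance: doubled
        return 2
    if deviation > limit:  # outside the tolerance in either direction
        return 1
    return 0


def total_penalty(fiducial, results, referee_ms, systematic_shift_ms=0.0):
    """Sum of penalty points over a list of program results."""
    return sum(
        penalty_points(fiducial, r, referee_ms, systematic_shift_ms)
        for r in results
    )
```

Programs would then be ranked on these totals per fiducial and noise type; the ranking step itself is not shown here.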
RESULTS
Influence of Noise on P Onset and Offset Recognition
Mean differences and corresponding standard deviations between P onset and P offset determinations in the noise-free and HF noise recordings are represented in Figs. 1 and 2 for the ECG and VCG programs involved in the study. Variability was slightly higher in the precordial lead groups, but because of space considerations data are only presented for all 12 standard leads combined. Line interference (noise type 4) and linear baseline shift (noise type 6) had a minor effect on most programs. Increasing levels of high-frequency noise, on the other hand, significantly moved the onsets and offsets of the P wave outward for the 12-lead programs 5, 7, 8, and 13 and to a similar extent for VCG programs 9 and 10 (Fig. 1). The standard deviation of the mean difference also increased significantly in several programs (Fig. 2). This is partially due to some outliers caused by gross program errors. For example, removal of one outlier for P end reduced the variance for program 9 by a factor of almost two. However, even after deletion of one outlier for each noise type and each lead group the basic results remained the same. For noise type 3, at a level of 35 µV rms high-frequency noise, several programs did not measure P waves in 30 to 40% of the cases. The differences for noise type 5 (low-frequency sinusoidal baseline sway) were intermediate, whereas those for noise type 7 (low-
[Figure 1 panels: "Mean deviation of P onset by HF noise" and "Mean deviation of P offset by HF noise". Vertical axis: difference (msec). Horizontal axis: 12-lead programs 5, 7, 8, 12, 13, 14, 15, 16 and VCG programs 3, 4, 9, 10, 12, 15.]
FIG. 1. Mean shift in P onset and P offset by increasing levels of high-frequency noise (denoted 1 to 3) obtained by eight ECG and six VCG programs, the numbering of which is explained in Table 1. For the standard 12-lead ECG the average results of the 4 lead groups are presented. In this and all subsequent figures one outlier has been eliminated for each lead group.
frequency baseline sway and superimposed high-frequency noise) were high, at least for the programs analyzing each QRS complex beat-by-beat. Programs analyzing an averaged beat generally showed significantly less variability than programs measuring each beat individually. This can be seen both from the variance figures with respect to the noise-free recording and the ranking based on absolute deviations in relation to the referee results (Table 2).
Influence of Noise on the Detection of the Onset and Offset of QRS
Average differences between the noise and noise-free results for QRS onset and offset in the standard ECG and the Frank XYZ leads are depicted in Fig. 3. Figures 4 and 5 demonstrate the standard deviations of these differences after removal of one outlier. High-frequency noise (noise types 2, 3, and 7) caused a net outward shift of QRS onset for programs 5 and 7 and to a minor extent for program 10 at the highest noise level. Average results for the other programs were relatively
TABLE 2
AVERAGE RANKING OF PROGRAMS ON THE BASIS OF THE VARIANCE WITH RESPECT TO THE NOISE-FREE ORIGINAL DATA AND THE DEVIATIONS TO THE REFEREE RESULTS
Measurement strategy, 12-lead programs: averaged beat, programs 8, 12, 15, 16; single beats, programs 5, 7, 13, 14.
Measurement strategy, VCG programs: averaged beat, programs 3, 4, 12, 15; single beats, programs 9, 10.
Rows: Onset P, Offset P, Onset QRS, Offset QRS, End T, Mean rank.
Rank values for the 12-lead programs, in source order (column alignment not recoverable): 4.5 6.5 / 2.5 3.5 4.5 5 7.5 4.6 / 8 / 1 / 2.5 / 4.5 / 6.5 / 6.5 / 3.5 4.5 7.5 3.5 5.4 / 5 1.5 6.5 3 3.4 / 1 1 1 / 2 4 2.5 / 8 7 4 / 7 8 7 / 2.5 1.6 / 2 3.0 / 7.5 6.6 / 6 6.9
Rank values for the VCG programs, in source order (column alignment not recoverable): 3 2 2 3 3.5 2.7 / 1 1 / 2.5 / 3.5 / 2 1.5 3 1.7 / 4.5 5 / 5 2.5 4 4.3 / 3.5 / 5 / 6 / 3.5 / 5 / 5.5 / 2 1.5 / 5 4 / 5.5 6 / 1 / 5 / 3.5 / 5 / 3.3 / 3.1 / 4.5 / 5.6
* Friedman test χ2 = 21.4; df = 7; P < 0.005.
** Friedman test χ2 = 17.4; df = 5; P < 0.005.
[Figure 2 panels: "Effect of HF noise on P onset" and "Effect of HF noise on P offset". Horizontal axis: 12-lead programs 5, 7, 8, 12, 13, 14, 15, 16 and VCG programs 3, 4, 9, 10, 12, 15.]
FIG. 2. Standard deviation of differences between P onsets and offsets derived from the recordings with added HF noise (types 1 to 3) and the corresponding noise-free recordings. For further explanation see Fig. 1.
[Figure 3 panels: "Mean deviation of QRS onset by HF noise" and "Mean deviation of QRS offset by HF noise". Vertical axis: difference (msec). Horizontal axis: 12-lead programs and VCG programs.]
FIG. 3. Mean shift in QRS onset and QRS offset by increasing levels of high-frequency noise.
stable. However, differences between the noise and no-noise QRS onsets in individual cases were sometimes large, as can be derived from the standard deviation data in Figs. 4 and 5. Standard deviations of the differences for noise type 3 varied from 5.4 to 11.5 msec (P < 0.001) for the 12-lead programs and even between 1 and 26 msec (P < 0.001) for the VCG programs. A different response was apparent for QRS offset in some programs. For example, whereas program 12 showed stable average results for QRS onset, it shifted QRS offset significantly outward in the presence of high-frequency noise (see Fig. 3). Program 7 showed few changes on the average but, together with programs 8, 12, and 14, showed a high variance for QRS offset determination. When QRS offset was shifted, the movement occurred outward for almost all programs, except for programs 7 and 9, which showed a significant inward shift at the highest HF noise levels. Table 2 demonstrates that all programs which apply coherent beat averaging had more stable results for QRS onset than programs analyzing single beats. For QRS offset the results were more divergent. Some 12-lead programs, which analyze each beat individually, scored better than some beat averaging 12-lead
programs. Three of the four VCG programs performing averaging (3, 4, and 15) had significantly better noise tolerance than the two VCG programs (9 and 10) which analyze a selected complex or every beat (see Table 2). As for the P wave, line-frequency noise and linear baseline shift caused no significant deviations in QRS onset or offset for the majority of programs.
Influence of Noise on the Recognition of T End
Average differences between the noise and noise-free results for T end and the corresponding standard deviations, obtained after removal of one outlier, are depicted in Fig. 6. As a result of high- and low-frequency noise, programs 5, 14, and 16 did not give a definite T wave boundary in up to 15% of the cases. The other programs rejected fewer cases. As for the other fiducials, high-frequency noise caused a significant increase in the variability of T end boundary recognition. Some programs (e.g., 7, 8, 9, 10, 13, and 15) showed a significant outward shift whereas others did not. Programs 4, 5, 7, 10, and 14 showed a significantly higher susceptibility to low-frequency sinusoidal baseline shift (noise types 5 and 7) than the other programs. The 12-lead programs
[Figure 4 panels: "Effect of HF noise on QRS onset" and "Effect of 50 Hz and LF noise on QRS onset". Vertical axis: STD (msec). Horizontal axis: 12-lead programs 5, 7, 8, 12, 13, 14, 15, 16 and VCG programs 3, 4, 9, 10, 12, 15.]
FIG. 4. Influence of high- and low-frequency noise on the stability of QRS onset determination. The influence of increasing high-frequency noise (types 1 to 3) is presented in the upper half and that of the low-frequency noise (types 4 to 6) in the bottom half.
[Figure 5 panels: "Effect of HF noise on QRS offset" and "Effect of 50 Hz and LF noise on QRS offset". Vertical axis: STD (msec).]
FIG. 5. Influence of high- and low-frequency noise on the stability of QRS offset determination.
proved to be less stable than the VCG programs. The removal of one outlier significantly decreased the variance figures for some programs, but the overall trend in the data was the same when all cases were included. Again, all programs which analyze an averaged beat showed, with one exception, a lower variability than programs which measure every complex or a selected beat (see Table 2).
Comparison of Program with Referee Results in Low versus High Noise Recordings
A bar graph depicting results from the whole annotated CSE library (N = 310), divided into the 50% cleanest and 50% noisiest recordings, is represented in Fig. 7 for the 12-lead programs involved in the study. These results demonstrate that in noisy recordings wave fiducials are displaced outward by most programs, as compared to the reference established by a group of five cardiologists in an iterative reviewing process. The average differences and the corresponding variance figures, illustrated by the width of the bars, proved that QRS onset was the most stable measurement.
DISCUSSION
In order to achieve an optimal performance it is essential that the data used by ECG or VCG processing systems are of good quality. In practice, however, ECG records are often distorted by powerline interference, baseline wander caused by electrode polarization or respiration, myographic noise, artifacts, spikes, sudden baseline shifts, and amplifier saturation. The first goal of the preprocessing and quality control module of an ECG processing system is to detect these disturbances (22). Based on this recognition, parts of the ECG or the whole recording can be rejected or corrected by using different techniques. Most programs apply digital filtering methods. Properly designed digital filters can improve signal quality without compromising signal integrity (28). However, certain digital filters can “ring” and produce exaggerated Q, R, or S waves, whereas others can smooth these waves. From the present report, which is focused on boundary recognition, it is apparent that most programs are able to remove line-frequency noise adequately. To this end notch filters, autoregressive and adaptive filtering (22), or other techniques such as the one developed by Mortara based on incremental
[Figure 6 panels: "Mean deviation of T end by HF noise" (vertical axis: difference, msec) and "Effect of HF noise on T end" (vertical axis: STD, msec). Horizontal axis: 12-lead programs 5, 7, 8, 12, 13, 14, 15, 16 and VCG programs 3, 4, 9, 10, 12, 15.]
FIG. 6. Average differences between noise and noise-free results and corresponding standard deviations for the end of the T wave. Only the effect of high-frequency noise (types 1 to 3) is illustrated.
[Figure 7: bar graph of program median differences from the referee standard for QRS onset and offset; scale (msec) from -10 to +10 per program; legend entries "RANKS 1 to 77" and "RANKS 78 to 156".]
FIG. 7. QRS onsets and offsets obtained by 10 different standard lead programs in comparison to the referee standard in the 50% lowest vs 50% highest noise recordings. The total number of ECGs analyzed is 310. Mean differences are depicted by small vertical lines and 99% confidence intervals after omitting 2% outliers by horizontal bars. The long vertical lines denote zero differences with respect to the referee results (2-3% outliers rejected).
estimation (28) are applied. Line-frequency interference was not entirely removed by some programs (Nos. 5, 7, and 14), possibly because they did not apply the most appropriate elimination procedure for 50 Hz. The baseline wander, linear or sinusoidal, as applied in the current study, also had no significant effect on the P, QRS, and T boundary detection. This does not imply that line interference and baseline wander have no effect on feature extraction, especially amplitude measurements, as will be reported in the second paper (21). High-frequency noise, on the contrary, produced a significant effect on the onset and offset determination of different waves by various programs. Random and periodic noise components of biological or technical origin will influence measurement precision in any given complex. All programs attempt to make use of the redundancy of complexes available in the sampled ECG to optimize the accuracy of their measurements. On the basis of the strategy applied, it is possible to divide the different ECG measurement programs into two basic sets. In the first set are programs which perform detailed wave recognition on every complex and subsequently average measurement results of similar dominant complexes. To this set also belong programs that choose one complex for analysis, ideally the one with the least noise and baseline
wander. In the second set are the programs which perform time coherent averaging of all complexes that are considered to be morphologically of the same type (29-34) or which calculate a so-called median or modal beat (35). Detailed wave recognition and measurement extraction is then made on this epitome, averaged complex. Good complex typification and proper beat alignment before averaging are essential for the selective averaging approach. The choice of a consistent fiducial point is critical, but the beats should first be characterized by some comparison technique. If the waveforms differ by a certain amount, then the beats should be considered of a different type and discarded from the averaging. Different algorithms are being used. These techniques have to some degree grown more sophisticated over time. However, limited attempts have been made to validate these approaches. There are often variations on the same theme (22). Wolf et al. (29) use a correlation technique for alignment of the beats. They perform complex averaging only when the random noise in any of the leads exceeds 20 µV rms. Otherwise, the complex closest to the center of the dominant QRS complexes is taken as the representative cycle. In the program developed by Zywietz et al. (30) coherent averaging is always performed. Similar to the Dalhousie program, synchronization is performed by a cross-correlation method. Mortara uses the coherence of two Chebychev vectors (22) and cross-correlation (13). In the Modular program developed by van Bemmel and co-workers (22, 32) alignment is based on a reference point within the QRS complex, the position of an extremum in the bandpass-filtered signal. The onset of individual QRS complexes is used as reference in the Louvain program (33). In the new Glasgow system (35) an averaged beat is routinely computed. Individual QRS complexes are then compared with the average.
If significant differences are noted, then the program returns to calculate a better median or modal beat. To date, the argument has not been settled as to which of the two basic averaging strategies is the best. Proponents of the beat-to-beat measurement strategy claim that jitter in the alignment may cause QRS widening, produce loss of small initial Q or R waves, and blur other high-frequency components such as QRS notches. In the case of significant jitter around the triggering point, major components of a high-frequency signal may indeed be canceled, and selective averaging might act as a low-pass filter. The proponents of selective averaging dispute that this is actually happening and point to the advantages of an increased signal-to-noise ratio by coherent signal averaging. It is indeed well known that signal averaging reduces the noise dispersion by a factor equal to the square root of the number of beats analyzed, assuming stationary Gaussian noise. This technique is widely applied in quantitative exercise electrocardiography (36, 37), and for a few years has also been used for detection of low-voltage ventricular late potentials (38). With advancements in signal processing techniques jitter can nowadays be limited to less than 1 or 2 msec depending on the sampling rate. Simson (38) obtained a reference jitter of only <0.5 msec with a sampling rate of 1000 Hz. With the aid of the CSE library,
Talmon (22) was not able to show any significant difference in QRS duration when wave recognition was performed on single complexes or on averaged complexes computed using proper beat alignment and after rejection of ectopic beats or grossly noisy signals by a template recognition program. From the present study it is evident that programs measuring an averaged beat show significantly less variability in wave boundary recognition than programs measuring each beat individually. These findings give strong support to those who favor selective signal averaging for routine ECG computer processing. High-frequency noise caused a significant outward shift of P onset and P offset for most programs which analyze each complex separately (5, 7, 9, 10, 13, and 14). The same was true for T end. QRS onset proved to be the most stable measurement. Noise affected QRS onset significantly for only a few programs, mainly program 7, and at the highest noise levels programs 5 and 10 were affected to some extent. Although “average-beat” programs proved to be more stable than “beat-by-beat” programs, some of the latter proved to have good noise tolerance as well. Conversely, for a few measurements, some “average-beat” programs even showed an unexpected variability. For example, although program 12 performs selective averaging, it shifted QRS end outward significantly at the highest noise level. These results are corroborated by the observed differences of this program with the referee results in the total CSE library. Instead of causing an outward shift, which was the general finding, program 9 produced a small inward shift of QRS end with increasing noise. This may be explained by the specific area-dependent boundary detection method used in that particular program (23). Programs may be less sensitive to noise not only on account of more elaborate preprocessing, but also as a result of more robust boundary detection.
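The selective averaging strategy discussed above (alignment of each beat, waveform comparison, exclusion of dissimilar complexes, then averaging) can be sketched as follows. This is a minimal illustration and not the algorithm of any of the programs studied: the use of the first beat as template, the integer-lag search, the correlation threshold `min_corr`, and all names are assumptions.

```python
import math


def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0


def selective_average(signal, fiducials, half_width, max_lag=5, min_corr=0.9):
    """Average the beats around the given fiducial sample indices.

    Each beat is shifted by the integer lag (within +/- max_lag samples)
    that maximizes its correlation with the template (here simply the
    first beat, an assumption). Beats whose best correlation stays below
    min_corr are treated as a different type and excluded.
    Returns (averaged_beat, number_of_beats_used).
    """
    def window(center):
        return signal[center - half_width:center + half_width + 1]

    template = window(fiducials[0])
    accepted = []
    for f in fiducials:
        best_lag, best_r = 0, -2.0
        for lag in range(-max_lag, max_lag + 1):
            lo, hi = f + lag - half_width, f + lag + half_width + 1
            if lo < 0 or hi > len(signal):
                continue  # shifted window would fall outside the record
            r = correlation(template, signal[lo:hi])
            if r > best_r:
                best_lag, best_r = lag, r
        if best_r >= min_corr:
            accepted.append(window(f + best_lag))
    if not accepted:
        raise ValueError("no beat matched the template")
    n = len(accepted)
    return [sum(col) / n for col in zip(*accepted)], n
```

With N accepted beats and stationary, uncorrelated noise, the residual noise rms of the averaged beat is expected to fall roughly as 1/sqrt(N), which is the benefit the proponents of averaging point to; residual misalignment beyond the integer-lag search is what the jitter argument is about.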
Different methods have been applied for boundary detection of ECG waveforms, e.g., straightforward threshold detection, matching with part of a signal, and cross-correlation with an amplitude-time template (32). A single level threshold detection algorithm based on spatial velocity was first proposed by Stallmann and Pipberger (39) in 1961. This technique has been used since then in different processing systems. However, the method proved to be rather sensitive to noise. Indeed, noise superimposed on the scalar leads enlarges spatial magnitude and velocity, which results in an outward shift of the onsets and offsets when absolute or relative thresholds are used for boundary detection. In some programs, such as feedback, noise adaptive, and multilevel detectors, difference or least-square methods were therefore built in. Other investigators have used spatial area or other time-derivative functions (22). Morlet (23) has demonstrated that the area method adapts itself well to a minor or moderate amount of noise. The detection by program 9 of the onset and offset of P, as well as QRS offset, agreed even better with the CSE referee results when small amounts of noise were introduced in the recordings (23). A new technique, the template method, was introduced by van Bemmel and co-workers (32, 40) in the late sixties. It takes into account the information on
the amplitude distribution of the spatial velocity or pseudovelocity function not in a localized area, but in the whole vicinity of the onset and end points of the waves. The templates have been constructed from a series of tracings where the inflectional points have been visually determined in a training set by one or more human observers. An inflectional point in a new tracing is then located at that sample where a certain function of the signal optimally correlates with the template within a certain amplitude-time window. Although the method is not immune to noise, it proved to work well in a noisy environment (40). The template level is shifted upward adaptively to the amount of noise. This explains why the onset and offset points of QRS by program 15 are shifted outward to a lesser extent in the presence of high-frequency noise. The same template matching method is being used in other programs such as the Louvain program (33) for the determination of P onset, QRS offset, and T end, but fixed threshold level crossing is used in that program for QRS onset and P offset determination. Other programs (programs 4, 10, and 12) also apply fitting to “standard waveform functions” for some boundaries but not for others. An inventory of the techniques applied has been summarized in the second CSE Progress Report (13). This explains why noise tolerance may vary from program to program, but also within the same program for different boundaries. As stated previously (17), the CSE Working Party at the present time does not wish to propose any formal algorithm as the standard for ECG wave recognition. Several formal mathematical algorithms may indeed lead to similar solutions in pattern recognition (16). However, on the basis of the results from the present study, a measurement strategy based on selective signal averaging is recommended for diagnostic ECG computer programs, similar to those for exercise electrocardiography (36) and late potential detection (38).
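The noise sensitivity of single-level threshold detection on spatial velocity, mentioned in the discussion of boundary detection methods, can be illustrated with a toy example. Everything here is an assumption made for illustration: the Gaussian-shaped test wave, the threshold of 5 µV per sample, the requirement of three consecutive samples above threshold, and the deterministic alternating-sign "noise" placed on a second lead. It is not the detector of any program in the study.

```python
import math


def spatial_velocity(x, y, z):
    """Sample-to-sample spatial velocity of three orthogonal leads."""
    return [
        math.sqrt((x[i + 1] - x[i]) ** 2
                  + (y[i + 1] - y[i]) ** 2
                  + (z[i + 1] - z[i]) ** 2)
        for i in range(len(x) - 1)
    ]


def threshold_bounds(sv, threshold, run=3):
    """Single-level threshold detector (illustrative).

    Onset is the first index that starts `run` consecutive samples above
    the threshold; offset is the last index that ends such a run.
    Returns None when the threshold is never exceeded for `run` samples.
    """
    above = [v > threshold for v in sv]
    starts = [i for i in range(len(sv) - run + 1) if all(above[i:i + run])]
    if not starts:
        return None
    return starts[0], starts[-1] + run - 1


def demo(noise_peak_uv=2.0, threshold=5.0):
    """Compare boundaries of a clean wave and the same wave plus noise."""
    n = 200
    # Smooth test wave of 1000 µV on lead X (assumed shape, not an ECG).
    x = [1000.0 * math.exp(-0.5 * ((i - 100) / 10.0) ** 2) for i in range(n)]
    zero = [0.0] * n
    # Alternating-sign noise on lead Y: adds a constant term to the velocity.
    noise = [noise_peak_uv * (-1) ** i for i in range(n)]
    clean = threshold_bounds(spatial_velocity(x, zero, zero), threshold)
    noisy = threshold_bounds(spatial_velocity(x, noise, zero), threshold)
    return clean, noisy
```

Because the added noise can only enlarge the spatial velocity, the fixed threshold is crossed earlier before the wave and later after it, so the detected onset moves to an earlier sample and the offset to a later one, which is the outward shift described in the text. Real high-frequency noise is random rather than alternating, so the size of the shift in practice depends on the detector and on the noise level.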
It is also strongly recommended that proper filtering techniques be applied to remove 50 or 60 Hz line interference. Results from the present study demonstrate that line interference can be removed completely or almost completely with efficient algorithms, but that this ideal is not met by some computer programs. Averaging should be performed only if waveform comparison (cross-correlation or other clustering techniques as described above) has been carried out beforehand in order to exclude dissimilar complexes. In any event, careful typification of P, QRS, and ST-T waves is required in every complex for rhythm analysis. Some investigators claim that the median beat may provide a better estimate of the “true” complex than the averaged beat, although its random noise may be slightly higher and the calculation requires more computer resources (37). Others disagree with this statement (42) and prefer a hybrid approach combining mean and median averaging. Still others (22) found that discontinuities may occur in the baseline of the median beat if an odd number of P-QRS-T complexes is present and that this disadvantage does not apply to the modal or averaged beat. It should be appreciated that for the artificial recordings analyzed in this study, programs analyzing an averaged beat operate in a more ideal context than in day-to-day practice. Indeed, some programs had to perform coherent
averaging of noisy complexes, which in these recordings showed no physiological beat-to-beat variation. It should also be appreciated that basically only 10 cases have been analyzed. Except for one patient, they all presented an abnormal QRS-T morphology. A better performance might be expected when all cases exhibit a normal QRS-T configuration. Also, in the noise studies performed by Wolf and Rautaharju (42), as well as by Zywietz and Borovsky (43), only 12 and 10 cases, respectively, were analyzed. In these studies only VCGs and the noise sensitivity of derived measurements were investigated, not that of the primary wave fiducials. Perz and Hermann (44) also studied the effect of noise on the amplitude and interval parameters of two VCG programs and one ECG program. They used 20 recordings with 15 added noise levels. All recordings had a normal QRS morphology in at least two of these studies. Van Bemmel et al. (40, 45), on the other hand, used 30 different VCG waveforms to which 10 levels of random noise (1-150 Hz) were added to study the stability of QRS onset and offset recognition. These authors demonstrated that the template method was rather insensitive to random noise. As a final remark, it should be noted that recordings with an rms noise value of 35 µV, such as those included in the present study, are not common in clinical practice and can be avoided by proper quality control. Such a noise level corresponds to a baseline thickness of close to 1 mm on an ECG writer at standard amplification. Results for noise types 3 and 7, presented in this study, therefore reflect program performance under poor operational conditions. Some programs normally reject such tracings as unacceptable.
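The relation between rms noise and baseline thickness quoted above can be checked with a short calculation. The sketch below assumes the conventional calibration of 10 mm per mV and approximates the visible thickness by the peak-to-peak excursion, taken as 2 × crest factor × rms. The crest factor of √2 is exact only for a sinusoid such as line interference; for random noise it is a rough assumption, and the function name and parameters are hypothetical.

```python
import math

STANDARD_GAIN_MM_PER_MV = 10.0  # conventional ECG calibration: 10 mm per mV


def baseline_thickness_mm(rms_uv, crest_factor=math.sqrt(2),
                          gain_mm_per_mv=STANDARD_GAIN_MM_PER_MV):
    """Approximate peak-to-peak trace thickness (mm) for a given rms noise (µV).

    crest_factor is the assumed peak/rms ratio: sqrt(2) holds for a sinusoid
    and is only a rough approximation for random noise.
    """
    peak_to_peak_uv = 2.0 * crest_factor * rms_uv
    # Convert µV to mV, then apply the writer gain in mm per mV.
    return peak_to_peak_uv / 1000.0 * gain_mm_per_mv


if __name__ == "__main__":
    for rms in (15.0, 35.0):
        print(f"{rms:.0f} µV rms -> about {baseline_thickness_mm(rms):.2f} mm")
```

Under these assumptions, 35 µV rms gives a peak-to-peak excursion of about 99 µV, that is, just under 1 mm at standard amplification, which is consistent with the figure stated in the text.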
In a previous study (26), the CSE project has demonstrated that, at noise levels below 15 µV rms, several of the programs which analyze complexes on a beat-by-beat basis had a very good performance when their results were compared with independent referee results in the annotated CSE library.
APPENDIX
CSE Steering Committee
P. Arnaud (France), R. Degani (Italy), P. W. Macfarlane (United Kingdom), J. H. van Bemmel (The Netherlands), J. L. Willems (Project Leader; Belgium), C. Zywietz (West Germany).
CSE European Working Party
Belgium: C. Brohet (University of Louvain), M. Demeester (University of Brussels), J. Pardaens and J. L. Willems (University of Leuven). West Germany: J. Dudeck (University of Giessen), J. Meyer and J. Michaelis (University of Mainz), S. J. Poppl (Institute Medical Data Processing, Munich), C. Zywietz (University of Hannover). Denmark: J. Damgaard Andersen (University of Copenhagen). France: P. Arnaud (INSERM U121 Lyon), B. Denis (University of Grenoble), P. Rubel (INSA, Lyon).
Greece: S. Moulopoulos (University of Athens), E. Skordalakis (Technical University of Athens). Italy: S. Dalla Volta (University of Padova), R. Degani (Ladseb CNR, Padova), G. Mazzocca (University of Pisa). Ireland: I. Graham and B. C. Reardon (University of Dublin). Portugal: C. Abreu Lima (University of Porto). The Netherlands: J. H. van Bemmel and J. L. Talmon (Free University, Amsterdam), F. M. A. Harms and E. O. Robles de Medina (University of Utrecht), H. J. Ritsema van Eck (Rotterdam), G. van Herpen (University of Leiden). United Kingdom: P. J. Bourdillon (University of London), P. W. Macfarlane (University of Glasgow).
Consultants
J. J. Bailey (NIH) and H. V. Pipberger (George Washington University, Washington, D.C.), P. M. Rautaharju (University of Dalhousie, Halifax, Canada).
Non-European Participants
U.S.A.: R. Bonner (IBM; former participant), C. Monroe/S. Charlesworth (Hewlett-Packard), K. Michler (Telemed), I. Rowlandson (Marquette). Canada: P. M. Rautaharju and P. Macinnis (University of Dalhousie, Halifax, Nova Scotia). Japan: M. Okajima, N. Okamoto, and M. Yokoi (University of Nagoya), M. Ohsawa (Fukuda Denshi).
EC COMAC Biomedical Engineering
J. Marquez-Montes (University of Madrid) and U. Faust (University of Stuttgart).
CSE Coordinating Center
Division of Medical Informatics, University of Leuven, Belgium.
ACKNOWLEDGMENTS
We gratefully acknowledge the secretarial assistance of D. Wolput, V. Dihemans, and I. Tassens, as well as the technical assistance of L. Ackermans and D. De Schreye.
REFERENCES
1. ZYWIETZ, C., AND SCHNEIDER, B. Computer Application on ECG and VCG Analysis. In “Proceedings of IFIP ECG Working Conference, Hannover, October 1971,” p. 583. North-Holland, Amsterdam, 1973.
2. VAN BEMMEL, J. H., AND WILLEMS, J. L. “Trends in Computer-Processed Electrocardiograms,” p. 437. North-Holland, Amsterdam, 1977.
3. WOLF, H. K., AND MACFARLANE, P. W. “Optimization of Computer ECG Processing,” p. 346. North-Holland, Amsterdam, 1980.
4. WILLEMS, J. L., VAN BEMMEL, J. H., AND ZYWIETZ, C. Computer ECG Analysis: Towards Standardization. In “Proceedings, International Working Conference Leuven, June 2-5, 1985,” p. 401. North-Holland, Amsterdam, 1986.
5. WILLEMS, J. L., AND PARDAENS, J. Differences in measurement results obtained by four different ECG computer programs. In “Computers in Cardiology 1977” (H. G. Ostrow and K. L. Ripley, Eds.), p. 115. IEEE Computer Society, Long Beach, CA, 1977.
6. WILLEMS, J. L. A plea for common standards in computer aided ECG analysis. Comput. Biomed. Res. 13, 120-131 (1980).
7. The CSE Working Party (WILLEMS, J. L. (Chairman), ARNAUD, P., VAN BEMMEL, J. H., et al.). An approach to measurement standards in computer ECG analysis. In “Optimization of Computer ECG Processing” (H. K. Wolf and P. W. Macfarlane, Eds.), p. 135. North-Holland, Amsterdam, 1980.
8. WILLEMS, J. L., ARNAUD, P., DEGANI, R., MACFARLANE, P. W., VAN BEMMEL, J. H., AND ZYWIETZ, C. Protocol for the concerted action project “Common Standards for Quantitative Electrocardiography,” 2nd R&D programme in the field of Medical and Public Health Research of the EEC (80/344/EEC), CSE Ref. 80-06-00, p. 152. ACCO Publ., Leuven, 1980.
9. The CSE European Working Party (WILLEMS, J. L., ARNAUD, P., VAN BEMMEL, J. H., et al.). Common Standards for Quantitative Electrocardiography. The CSE Pilot Study. In “Proceedings of Medical Informatics Europe 81” (F. Gremy, P. Degoulet, B. Barber, and R. Salamon, Eds.), p. 319. Springer-Verlag, Heidelberg, 1981.
10. BOURDILLON, P. J., DENIS, B., HARMS, F. M. A., MAZZOCCA, G., MEYER, J., ROBLES DE MEDINA, E. O., RITSEMA VAN ECK, H. J., AND WILLEMS, J. L. European experience in the standardization of measurements and of definitions of the electrocardiogram. In “Computerized Interpretation of Electrocardiograms VII” (M. Laks and S. S. Cole, Eds.), p. 9. Engineering Foundation, New York, 1982.
11. MACFARLANE, P. W., AND WILLEMS, J. L., on behalf of the CSE Working Party. The CSE Project: Progress as viewed by the cooperating centers. In “Computer Interpretation of Electrocardiograms VIII” (R. Selvester, Ed.), p. 293. Engineering Foundation, New York, 1983.
12. WILLEMS, J. L., ARNAUD, P., VAN BEMMEL, J. H., et al. Common Standards for Quantitative Electrocardiography: CSE Project Phase One. In “Computers in Cardiology 1982” (K. L. Ripley, Ed.), p. 69. IEEE Computer Society, Long Beach, CA, 1982.
13. WILLEMS, J. L. CSE Progress Reports: 1st (1981, p. 242); 2nd (1982, p. 246); 3rd (1983, p. 275); 4th (1984, p. 277); 5th (1985, p. 327). ACCO Publ., Leuven.
14. WILLEMS, J. L., ARNAUD, P., VAN BEMMEL, J. H., et al. Establishment of a reference library for evaluating computer ECG measurement programs. Comput. Biomed. Res. 18, 439 (1985).
15. WILLEMS, J. L. Common Standards for Quantitative Electrocardiography. In “CSE Atlas,” Ref. 83.05.13, p. 655. ACCO Publ., Leuven, 1983.
16. WILLEMS, J. L., ARNAUD, P., VAN BEMMEL, J. H., et al. Assessment of the performance of electrocardiographic computer programs with the use of a reference data base. Circulation 71, 523 (1985).
17. The CSE Working Party. Recommendations for measurement standards in quantitative electrocardiography. Eur. Heart J. 6, 815 (1985).
18. ALRAUN, W., ZYWIETZ, C., BOROVSKY, D., AND WILLEMS, J. L. Methods for noise testing of ECG analysis programs. In “Computers in Cardiology 1983” (K. L. Ripley, Ed.), p. 253. IEEE Computer Society, Long Beach, CA, 1983.
19. ZYWIETZ, C., ALRAUN, W., AND WILLEMS, J. L., on behalf of the CSE Working Party. Results of ECG program noise tests within the CSE Project. In “Computers in Cardiology 1984” (K. L. Ripley, Ed.), p. 377. IEEE Computer Society, Long Beach, CA, 1984.
20. ZYWIETZ, C. Noise tolerance in ECG computer programs and measurement stability of minimum waves. A summary of results from CSE noise tests and conclusions. In “Computer ECG Analysis: Towards Standardization” (J. L. Willems, J. H. van Bemmel, and Chr. Zywietz, Eds.), p. 87. North-Holland, Amsterdam, 1986.
21. ZYWIETZ, C., WILLEMS, J. L., ARNAUD, P., VAN BEMMEL, J. H., DEGANI, R., AND MACFARLANE, P. W., manuscript in preparation.
22. TALMON, J. L. “Pattern Recognition of the ECG,” p. 366. Ph.D. thesis, Free Univ., Amsterdam, 1983.
23. MORLET, D. “Contribution à l’analyse automatique des électrocardiogrammes: Algorithmes de localisation, classification et délimitation précise des ondes dans le système de Lyon,” pp. 1-320. Thèse de doctorat, INSA, Univ. de Lyon, 1986.
24. BORTOLAN, G., CAVAGGION, C., AND DEGANI, R. T. The European CSE project: Experiences in the Italian processing center. In “Computers in Cardiology 1983” (K. L. Ripley, Ed.), p. 257. IEEE Computer Society, Long Beach, CA, 1983.
25. TALMON, J., AND VAN BEMMEL, J. H. Template waveform recognition revisited. Results for the CSE data base. In “Computers in Cardiology 1983” (K. L. Ripley, Ed.), p. 249. IEEE Computer Society, Long Beach, CA, 1983.
26. OKAJIMA, M., IWATSUKA, T., MIZUNO, Y., OKAMOTO, N., YOKOI, M., OHSAWA, N., AND OHTA, N. Accuracy in computer measurements of electrocardiograms. Analysis of CSE Data Base. Japan. Med. Biol. Eng. 23, 23 (1984).
27. WINER, B. J. “Statistical Principles in Experimental Design,” 2nd ed., p. 206 and p. 851. McGraw-Hill, New York, 1971.
28. MORTARA, D. W. Digital filters for ECG signals. In “Computers in Cardiology 1977” (H. G. Ostrow and K. L. Ripley, Eds.), p. 511. IEEE Computer Society, Long Beach, CA, 1977.
29. WOLF, H. K., MACINNIS, P. J., STOCK, S., HELPPI, R. K., AND RAUTAHARJU, P. M. The Dalhousie program: A comprehensive analysis program for rest and exercise electrocardiograms. In “Computer Application on ECG and VCG Analysis” (C. Zywietz and B. Schneider, Eds.), p. 231. North-Holland, Amsterdam, 1973.
30. ZYWIETZ, C., BOROVSKY, D., FALTINAT, D., KLUSMEIER, S., AND SCHIEMAN, W. The Hannover EKG system HES. In “Trends in Computer-Processed Electrocardiograms” (J. H. van Bemmel and J. L. Willems, Eds.), p. 159. North-Holland, Amsterdam, 1977.
31. MORTARA, D. W. Reply to CSE inventory questionnaire. In “2nd CSE Progress Report” (J. L. Willems, Ed.), p. 224. ACCO Publ., Leuven, 1982.
32. VAN BEMMEL, J. H. Recognition of electrocardiographic patterns. In “Handbook of Statistics” (P. R. Krishnaiah and L. N. Kanal, Eds.), Vol. 2, p. 501. North-Holland, Amsterdam, 1982.
33. BROHET, C. R. Computer-assisted interpretation of electro- and vectorcardiograms. Thesis, University of Louvain. Rev. Inst. Hyg. Mines 35, 133 (1980).
34. HERRMANN, G. Optimum time adjustment of beats for construction of a modal beat. In “Computer ECG Analysis: Towards Standardization” (J. L. Willems, J. H. van Bemmel, and Chr. Zywietz, Eds.), p. 139. North-Holland, Amsterdam, 1986.
35. MACFARLANE, P. W., WATTS, M. P., PODOLSKI, M., SHOAT, D., AND LAWRIE, T. D. V. The new Glasgow system. In “Computer ECG Analysis: Towards Standardization” (J. L. Willems, J. H. van Bemmel, and Chr. Zywietz, Eds.), p. 31. North-Holland, Amsterdam, 1986.
36. SIMOONS, M. L., HUGENHOLTZ, P. G., ASCOOP, C. A., DISTELBRINK, C. A., DE LAND, P. A., AND VINKE, R. V. M. Quantitation of exercise electrocardiography. Circulation 63, 471 (1981).
37. WATANABE, K., BHARGAVA, V., AND FROELICHER, V. Computer analysis of the exercise ECG: A review. Prog. Cardiovasc. Dis. 32, 423 (1980).
38. SIMSON, M. B. Use of signals in the terminal QRS complex to identify patients with ventricular tachycardia after myocardial infarction. Circulation 64, 235 (1981).
39. STALLMANN, F. W., AND PIPBERGER, H. V. Automatic recognition of electrocardiographic waves by digital computer. Circ. Res. 9, 138 (1961).
40. VAN BEMMEL, J. H., TALMON, J. L., DUISTERHOUT, J. S., AND HENGEVELD, S. J. Template waveform recognition applied to ECG/VCG analysis. Comput. Biomed. Res. 6, 430 (1973).
41. MERTENS, J., AND MORTARA, D. A new algorithm for QRS averaging. In “Computers in Cardiology 1984” (K. L. Ripley, Ed.), p. 367. IEEE Computer Society, Long Beach, CA, 1984.
42. WOLF, H. K., AND RAUTAHARJU, P. M. Assessment of precision and accuracy of computer programs for analysis of electrocardiograms. In “Measurement in Exercise Electrocardiography. The Ernst Simonson Conference” (H. Blackburn, Ed.), p. 169. C. C. Thomas, Springfield, IL, 1969.
43. ZYWIETZ, C., AND BOROVSKY, D. The influence of noise on ECG computer measurements and interpretations. Abstract book, “Second International Symposium on Electrocardiology,” Yerevan, September 1973.
44. PERZ, S., AND HERMANN, G. Diagnostic accuracy in computerized analysis of non-optimum ECG signals. In “Medical Data Transmission by Public Telephone Systems” (H. J. von Mengden and H. Just, Eds.), pp. 64-73. Urban & Schwarzenberg, Munich, 1978.
45. VAN BEMMEL, J. H., DUISTERHOUT, J. S., VAN HERPEN, G., BIERWOLF, L. C., HENGEVELD, S. J., AND VERSTEEG, B. Statistical processing methods for recognition and classification of vectorcardiograms. In “Proceedings of the 11th International VCG Symposium” (I. Hoffman, Ed.), p. 207. North-Holland, Amsterdam, 1971.