Body surface maps and the conventional 12-lead ECG compared by studying their performances in classification of old myocardial infarction

J. ELECTROCARDIOLOGY 20(3), 1987, 193-202

Body Surface Maps and the Conventional 12-lead ECG Compared by Studying Their Performances in Classification of Old Myocardial Infarction

BY GÉRARD J. H. UIJEN, M.S.,* ANCO HERINGA, M.S.,* ADRIAAN VAN OOSTEROM, PH.D.† AND RUDOLF TH. VAN DAM, M.D.*

SUMMARY

The performance of body surface potential maps and the 12-lead ECG in the detection of old myocardial infarction has been compared in a two-group (54 normals; 52 infarctions) classification procedure (linear discriminant analysis). Three methods for data reduction of body surface maps were compared: 1) time integration, 2) one-step reduction in eigenvectors and 3) two-step reduction in spatial and temporal eigenvectors. Features were taken from the reduction variables by a stepwise selection procedure. From 90% to 93% correct classifications could be obtained using three features from the map data over the initial 30 ms (Q interval) of the QRS wave for all three methods considered. Using the 100 ms (QRS) interval, 86% correct classifications were obtained using method 1, and up to 90% and 87% for methods 2 and 3, respectively. In a further analysis the classification based on body surface maps was compared to the one based on the 12-lead ECG. The 12-lead ECG was treated as a restricted set of the body surface mapping leads, so the same methods of data reduction, feature extraction and classification could be applied to both sets of data. Applying method 1 (time integration), 89% correct classifications were obtained using data taken from the 30 ms interval of the 12-lead ECG and a subsequent reduction to three features. When using the 100 ms interval the result was 79%, also using three features. The results of method 2 applied to the 12-lead ECG were 89% (30 ms interval, three features) and 78% (100 ms interval, three features).

* Department of Cardiology, Radboud Hospital, University of Nijmegen, Nijmegen, The Netherlands.
† Laboratory of Medical Physics and Biophysics, University of Nijmegen, Nijmegen, The Netherlands.
Reprint requests to: G. J. H. Uijen, Department of Cardiology, St. Radboud Hospital, University of Nijmegen, P.O. Box 9101, 6500 HB Nijmegen, The Netherlands.

The standard 12-lead ECG provides a limited representation of the time varying potential distribution on the body surface generated by cardiac electrical activity. More information on the electrical state of the heart can be expected to be present in a recording of the full potential distribution over the body surface as a function of time. For this reason the procedure of body surface mapping (BSM) has evolved. This is the simultaneous recording of multiple (>32) ECG leads and the display of the recorded potential distribution at the body surface, projected on a map of this body surface. In the literature several applications of the BSM technique have been described, such as the detection of myocardial infarction,¹⁻³ the localization of abnormal conduction pathways⁴ and the detection of coronary artery disease.⁵ In some of these studies the diagnostic performance of BSM's has been compared directly to that of the corresponding standard 12-lead ECG.¹,⁶ Usually, entirely different features are selected from maps and from the standard 12-lead ECG. Consequently any increase in diagnostic performance, which might result from the extra leads recorded in the BSM data, cannot be evaluated properly. For example, BSM data may be evaluated from the course of potential maxima and minima or by the development of areas of negative potential, whereas the ECG is inspected using the usual clinical criteria, either by an experienced cardiologist or by some program for automatic interpretation of ECG data.

In this study we treat BSM and standard 12-lead ECG data as similar data, differing in the number of recorded signals only. In the 12-lead ECG, as is well known, the maximum number of independent signals is eight, whereas in a map based on 64 leads (as in this study) the maximum number of independent signals is 64. The application of a linear discriminant analysis in a classification procedure to the raw data (implying more than 3000 data points per subject) would require an impossibly large data base (≈100,000), since the sample population has to be large with respect to the number of features. Hence, a feature extraction procedure has to be included resulting in a number of features which bears a direct relation to the number of subjects in a data base. Our data base comprised 106 subjects and the number of features inspected was proportionally smaller. The extraction procedure should be applicable to both ECG and BSM data. In this paper possible feature sets are discussed, and their diagnostic performances are compared in a two-group classification procedure based on BSM data. Next, the performances of the feature sets are compared between applications to ECG data and to BSM data in the same classification procedure.
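The statement that the 12-lead ECG carries at most eight independent signals can be checked numerically. The sketch below is an illustration, not part of the original study: it assumes leads I, II and V1 to V6 as the independent set, derives the remaining four limb leads with the standard Einthoven and Goldberger relations, and uses synthetic random signals in place of real recordings.

```python
import numpy as np

def derive_limb_leads(lead_I, lead_II):
    """Derive the four dependent limb leads from leads I and II."""
    lead_III = lead_II - lead_I           # Einthoven's law: III = II - I
    aVR = -(lead_I + lead_II) / 2.0       # Goldberger augmented leads
    aVL = lead_I - lead_II / 2.0
    aVF = lead_II - lead_I / 2.0
    return lead_III, aVR, aVL, aVF

# Synthetic stand-in for I, II, V1..V6 over 50 samples (100 ms at 500 Hz).
rng = np.random.default_rng(0)
independent = rng.normal(size=(8, 50))

lead_III, aVR, aVL, aVF = derive_limb_leads(independent[0], independent[1])
twelve_lead = np.vstack(
    [independent[:2], lead_III, aVR, aVL, aVF, independent[2:]]
)

# Twelve rows, but only eight of them are linearly independent.
rank = np.linalg.matrix_rank(twelve_lead)
print(twelve_lead.shape, rank)
```

The rank of the 12-row matrix equals the number of independent input signals, which is why K = 8 is the natural dimension when the ECG is treated as a restricted lead set.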

MATERIALS AND METHODS

A. Data Recording

From a base of BSM signals comprising 390 subjects, two groups of subjects were selected: 54 normal subjects and 52 patients with old infarction. The 52 patients were selected from a group of 76 patients with old infarction. For these 52 patients a clinical diagnosis was available in which akinesis or dyskinesis in one or more ventricular wall regions was observed by means of ventriculography and/or echocardiography. No hypertrophy or conduction irregularities were observed. For each subject 64 leads (K = 64) were recorded using our BSM recording system as described previously.² The measuring sites are indicated in Fig. 1 by the position of their corresponding ECG complexes. The 64 measuring sites were selected from 180 points forming a regular grid (10 rows and 18 columns). For the display of maps the potential values at the 116 points which were not covered by the leads were computed from the 64 recorded voltages by using an interpolation procedure. The 64 time signals were sampled simultaneously with 500 samples per second and stored on disk in epochs of two seconds. Since the noise in the data was low (<7 µV rms) there was no need for signal averaging. The measuring sites of the 12-lead ECG system (three extremity electrodes and six precordial electrodes) are a subset of the BSM recording system, leading to at most eight signals (K = 8). From the stored data one complete heart cycle was selected and this heart beat was considered as representative for the subject. A linear base line shift correction was performed. For the analysis described here both the interval of 100 ms from onset QRS (QRS interval; resulting in L = 50 samples) and the interval of 30 ms onwards from onset QRS (Q interval; resulting in L = 15 samples) were used.

Fig. 1. Lead system. Sixty-four electrodes have been chosen from a regular grid of 180 points over the thorax. The grid consists of 10 rows of 18 equidistant points in the horizontal plane. The standard ECG leads are a subset.

B. Data Reduction and Feature Selection

For the classification of the subjects into two groups, using discriminant analysis, it is necessary that each subject be represented by a limited set of features. The K × L data set of each subject contains a certain amount of redundancy, and the same holds true for the complete set of data over the subjects. This redundancy can be formulated both in statistical terms and by modelling the underlying physiological process, which may impose constraints. Data reduction can be performed by using this redundancy. Three different methods of data reduction were employed, leading to three different sets of variables for each subject. These three reduction methods are described below.

B. 1--Time Integration (Method 1)

By integration over time of the recorded K electrocardiographic signals, K values result. This way of data reduction has been applied previously. The projection of the result on the body surface will be called an integral map.
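As a minimal sketch of this reduction step, the integral of each lead can be approximated by a sum over its samples. The function name `time_integral` and the synthetic input are assumptions made for illustration; the 500 Hz sampling rate and the array sizes follow the data-recording description. The factor 1/fs only rescales all K values and does not affect the later classification.

```python
import numpy as np

def time_integral(signals, fs=500.0):
    """Reduce a (K, L) array of lead signals to K integral values.

    The integral of each lead is approximated by the sum of its L
    samples multiplied by the sampling interval 1/fs.
    """
    return signals.sum(axis=1) / fs

K, L = 64, 50                      # 64 leads, 100 ms at 500 samples/s
rng = np.random.default_rng(1)
qrs = rng.normal(size=(K, L))      # synthetic stand-in for one subject

integral_map = time_integral(qrs)  # K x L numbers reduced to K numbers
print(integral_map.shape)
```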

The integral map, especially over the QRST interval, is related to the concept of the "ventricular gradient," which is based on the specific properties of the action potentials in the heart cells. Recently the special significance of integral maps over the depolarization period of the heart has been clarified.¹⁵ It has been shown that, under the assumption of a uniform dipole layer as the model for electrical sources during depolarization, the integral map contains all the information on the electrical activation sequence of the heart that can be obtained at the body surface. The time integration was performed by summation of the L time samples of each of the K leads. In this way the original K × L numbers are reduced to K numbers, which means that the amount of data is reduced by a factor L. For both the QRS data, represented by 64 × 50 = 3200 variables, and the Q data, represented by 64 × 15 = 960 variables, the integration results in 64 variables. Applying this reduction method to the ECG data (K = 8), the result for both the 100 ms and the 30 ms interval consists of eight variables.

B. 2--One-step Data Reduction (Method 2)

In this reduction method all K × L samples of the BSM of a subject (K electrodes, L time samples) are treated as a single vector of dimension KL. Each KL-vector is constructed by setting the ECG's of the K leads in sequence and by treating this as one signal. For the ensemble of KL-vectors, representing all data of the N subjects, common patterns can be found in the form of the eigenvectors of the estimated covariance matrix of this ensemble. Most of the eigenvectors are associated with very small eigenvalues. As a consequence each subject's map data can be represented by a small number of expansion coefficients corresponding to the set of eigenvectors with the largest eigenvalues. Instead of deriving the eigenvectors from the covariance matrix, which is impractical when KL is large, these vectors can be derived reliably by computing the singular value decomposition (SVD) of the (KL × N) matrix G formed by taking all KL-vectors $g_n$ as columns.¹⁷ The SVD of G is expressed as

$$G = U S V^{T}, \tag{1}$$

where

$g_n = (g_{1n}, g_{2n}, \ldots, g_{KLn})^{T}$ = a vector representing the KL samples of subject n,
$U$ = matrix (dimension KL × KL) of orthonormal column vectors, representing the common patterns in the N vectors,
$S$ = a matrix (dimension KL × N) of which only the elements $\sigma_{ii}$ may be non-zero: the non-negative singular values,
$V$ = matrix (dimension N × N) of orthonormal column vectors (the significance of which is shown below).

For subject n the expansion (1) is written as

$$g_n = U b_n, \tag{2}$$

with $b_n$ = n-th column of the matrix $(S V^{T})$. U is an orthonormal matrix of eigenvectors, so (2) is an expression of the Karhunen-Loève expansion of $g_n$. This expression states that the original map $g_n$ can be completely represented by the N coefficients $b_n$ (assuming $N \le KL$). Expression (2) results in data reduction when only a limited set of M column vectors in U associated with the M largest eigenvalues is used. The (relative) error in the representation performance, which results from using a limited number of M eigenvectors, can be found from

$$e_1 = \sum_{i=M+1}^{N} \sigma_{ii}^{2} \Big/ \sum_{i=1}^{N} \sigma_{ii}^{2}, \tag{3}$$

where the denominator is the complete signal power, and the numerator is the power in the coefficients not used for the reconstruction.¹⁷ This reduction method was applied to both the 64-lead BSM and to the 8-lead ECG. In both cases 40 eigenvectors were computed for the respective matrices G. For the QRS interval the number of data representing the BSM was reduced from 3200 to 40. Similarly, the ECG data were reduced from 400 to 40. For the Q interval the number of the BSM data was reduced from 960 to 40 and for the ECG from 120 to 40.

B. 3--Two-step Data Reduction (Method 3)

The sampled body surface data of a subject can either be interpreted as a series of body surface potentials at subsequent time instants, or alternatively, as a set of ECG's from the measuring sites at the body surface. Formally the data of subject n can be treated as a (K × L) matrix $Y_n$. This matrix $Y_n$ can be expanded into two sets of eigenvectors: the eigenvectors of the potentials over the thorax (eigenmaps) and the eigenvectors of time (intrinsic ECG's or eigen ECG's).¹⁷ These eigenvectors result from the decomposition of the covariance matrices. Using the data of N subjects these matrices can be estimated as

$$\Gamma_p = \sum_{n=1}^{N} Y_n Y_n^{T} \quad \text{(the covariance over position)} \tag{4a}$$

or

$$\Gamma_t = \sum_{n=1}^{N} Y_n^{T} Y_n \quad \text{(the covariance over time)} \tag{4b}$$

respectively. The resulting expansion can be expressed as $Y_n = E A_n F^{T}$, in which the column vectors of the matrix E are the K orthonormal eigenmaps computed from $\Gamma_p$ and F is a matrix whose columns are the L orthonormal eigen ECG's computed from $\Gamma_t$. Using only a limited number k of eigenvectors in E (associated with the largest eigenvalues of $\Gamma_p$) and a limited number $\ell$ of eigenvectors in F (associated with the largest eigenvalues of $\Gamma_t$) the original (K × L) data matrix $Y_n$ is represented by the $k \times \ell$ coefficients in $A_n$. The coefficient matrix $A_n$ can be shown to be¹⁷

$$A_n = E^{T} Y_n F. \tag{5}$$

The quality of representation of the data of subject n, which results from using a limited number of k eigenmaps and $\ell$ eigen ECG's, is computed as the ratio of the averaged sum of the squared elements of $A_n$ and of the averaged sum of the squared elements of $Y_n$:

$$P_n = \sum_{i=1}^{k} \sum_{j=1}^{\ell} a_{ij}^{2} \Big/ \sum_{i=1}^{K} \sum_{j=1}^{L} y_{ij}^{2}.$$

The average error in this representation is

$$e_2 = 1 - \frac{1}{N} \sum_{n=1}^{N} P_n. \tag{6}$$
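The two-step expansion of expressions (4a) to (6) can be sketched in a few lines of NumPy. This is an illustrative reading of the method, not the authors' program: the function name `two_step_reduction`, the array layout (N, K, L) and the synthetic data are assumptions, and the covariance estimates are taken as plain sums over subjects without mean removal or a 1/N factor (neither changes the eigenvectors).

```python
import numpy as np

def two_step_reduction(Y, k, ell):
    """Two-step expansion of N subjects' (K, L) data matrices.

    Y has shape (N, K, L). Returns the k eigenmaps E (K, k), the ell
    eigen ECGs F (L, ell), the coefficients A (N, k, ell) and the
    average representation error e2 of expression (6).
    """
    gamma_p = np.einsum('nkl,nml->km', Y, Y)   # sum_n Y_n Y_n^T, (K, K)
    gamma_t = np.einsum('nkl,nkm->lm', Y, Y)   # sum_n Y_n^T Y_n, (L, L)

    # eigh returns eigenvalues in ascending order; reverse the columns
    # so that the leading eigenvectors come first.
    _, vec_p = np.linalg.eigh(gamma_p)
    _, vec_t = np.linalg.eigh(gamma_t)
    E = vec_p[:, ::-1][:, :k]
    F = vec_t[:, ::-1][:, :ell]

    A = np.einsum('ki,nkl,lj->nij', E, Y, F)   # A_n = E^T Y_n F
    P = (A ** 2).sum(axis=(1, 2)) / (Y ** 2).sum(axis=(1, 2))
    e2 = 1.0 - P.mean()
    return E, F, A, e2

rng = np.random.default_rng(5)
Y = rng.normal(size=(20, 12, 10))              # synthetic: N=20, K=12, L=10
E, F, A, e2 = two_step_reduction(Y, k=4, ell=3)
print(A.shape, e2)
```

With k = K and ell = L the bases are complete and the error vanishes; truncating either basis makes e2 grow, which is the trade-off the choice of eight eigenmaps and five eigen ECG's has to balance.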

This method of data reduction is closely related to Method 2, but is less efficient in representing the data.¹⁷ For the QRS interval and the Q interval the eigenvectors have been determined separately. In both cases eight eigenmaps and five eigen ECG's were used, resulting in 40 expansion coefficients.

C. Feature Selection and Classification

After the data reduction the resulting number of variables is still far too large for a classification procedure. Since the number of subjects in the two groups is limited, no meaningful discrimination result can be expected when the number of features is not substantially smaller.¹⁸,¹⁹ To be on the safe side we restrict the number of features to five at most. The most discriminative features were found by the STEPDISC feature selection procedure (SAS Institute Inc.)²⁰ applied to the representation variables. In this procedure a subset of the representation variables is selected in steps, in such a way that the ratio of the determinant of the within-group covariance matrix and the determinant of the total covariance matrix is minimized. Since there is no general rule for feature selection the selected subset may be suboptimal. A linear discriminant analysis was performed for the three different sets of feature vectors selected from the representation variables which resulted after the three methods of data reduction had been applied. The result of the classification is expressed in the form of the number of correct classifications as well as by the sensitivity and specificity. With both the BSM features and the ECG features as input the result of the classification is used in order to compare the discriminative performance of the BSM to that of the ECG. Only Method 1 and Method 2 were used in this case.
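The determinant ratio minimized during selection is commonly known as Wilks' lambda. The sketch below shows a simplified greedy forward selection on that criterion. It is not the SAS STEPDISC procedure itself, which also applies significance tests for entering and removing variables; the function names and the synthetic data are hypothetical. Scatter matrices are used in place of covariance matrices, which changes the ratio only by a constant factor for subsets of equal size.

```python
import numpy as np

def wilks_lambda(X, y, cols):
    """det(within-group scatter) / det(total scatter) for given columns."""
    Xs = X[:, cols]
    total = Xs - Xs.mean(axis=0)
    T = total.T @ total
    W = np.zeros_like(T)
    for label in np.unique(y):
        grp = Xs[y == label]
        dev = grp - grp.mean(axis=0)
        W += dev.T @ dev
    return np.linalg.det(W) / np.linalg.det(T)

def forward_select(X, y, n_features):
    """Greedily add the variable that gives the smallest Wilks' lambda."""
    selected = []
    for _ in range(n_features):
        remaining = [j for j in range(X.shape[1]) if j not in selected]
        best = min(remaining,
                   key=lambda j: wilks_lambda(X, y, selected + [j]))
        selected.append(best)
    return selected

# Synthetic example: 54 + 52 subjects, 40 representation variables,
# of which only columns 3 and 7 differ between the two groups.
rng = np.random.default_rng(2)
y = np.array([0] * 54 + [1] * 52)
X = rng.normal(size=(106, 40))
X[y == 1, 3] += 3.0
X[y == 1, 7] += 2.0

selected = forward_select(X, y, n_features=2)
print(selected)
```

Because each step conditions on the variables already chosen, the result depends on the order of entry, which is one reason the selected subset may be suboptimal.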

For each group or class of patients a discriminant function was computed under the assumption of a multivariate normal distribution of the features with equal covariances for all classes involved and an equal a priori probability for each subject to belong to each of the two classes. This means that a priori each feature is equally important, since each feature is scaled by its variance. This makes this method of classification more powerful than using a simple distance to the class averages or using the correlation with the features of the class means as a criterion. The classification was performed for different numbers (2-5) of features. Because of the limited number of subjects it is not easy to divide the population into a design set, from which the discriminant function is derived, and a test set. This dilemma arises from the notion that for a representative discriminant function the number of subjects in the design set must not be too small, while for the reliability of the classification results the number of subjects in the test set has to be as large as possible. The best method for a limited total set is the jackknifing or "leave one out" evaluation procedure: all subjects except one are considered as the design set from which the discriminant functions are derived; subsequently, the subject left out is classified. This procedure is carried out for all subjects, leaving one out each time. In this way virtually all subjects are involved in designing discriminant functions while also all subjects are classified. The resulting test score is as reliable as possible for the underlying population.

RESULTS

A. Analysis of Body Surface Maps Using Different Methods of Data Reduction

A. 1--Time integration (Method 1)

Time integration of the map data results in 64 variables. In Fig. 2 the integral maps of a normal subject and of a member of the infarction group are shown for both the 30 ms (Q integral maps) and the 100 ms (QRS integral maps) data. Discriminant analysis was performed using two, three, four and five features selected from these 64 variables by the stepwise discriminant selection procedure. The outcomes of the discriminant analyses are listed in column 2 of Table IA (Q interval) and IB (QRS interval).

Fig. 2. Upper part: Q integral maps of two subjects, one representative of each group. N: normal subject, MI: patient with myocardial infarction. The isofunction (potential × time) lines have linear increments (arbitrary units). Dark background refers to negative values, white to positive. Grey covers a zone of near zero values. Lower part: QRS integral maps of the same subjects. Map region corresponds to the one in Fig. 1.

A. 2--One-step data reduction (Method 2)

Using the one-step reduction method all data of a subject from 64 electrodes over L time samples are represented as one single 64L vector. For the 30 ms interval this results in a 64 × 15 = 960-dimensional vector and for the 100 ms interval in a 64 × 50 = 3200-dimensional vector. For the 30 ms interval the complete set of eigenvectors consists of 106 960-dimensional vectors. For the 100 ms interval the complete set of eigenvectors consists of 106 3200-dimensional vectors. In Fig. 3 the representation error of the 30 ms interval and of the 100 ms interval map data is shown as a function of the number of eigenvectors. For each subject 40 coefficients associated with the 40 most prominent eigenvectors (indicated by the largest 40 singular values) have been identified. The 40 coefficients represent more than 99% of the power in both cases. The representation capability of the 40 coefficients is illustrated in Fig. 4. For this purpose six standard precordial ECG's, which form a subset of the BSM data, are used. In Fig. 4A the recorded QRS complexes of the six precordial ECG's of a subject are shown. The same precordial ECG's reconstructed using the 40 coefficients from Method 2 are shown in Fig. 4B. Using the stepwise procedure two to five features were selected from the 40. The five most discriminative features were all found within the first 15 variables (sorted by descending singular values). These sets of features have been used for the classification. The results of the discriminant analysis of this set of data are listed in column 4 of Table IA (30 ms interval) and Table IB (100 ms interval).

A. 3--Two-step method of reduction (Method 3)

The results of reduction Method 3 were obtained by using eight eigenmaps (position) and five eigen ECG's (time). This results in 40 coefficients for the representation of each subject in the different groups. In Fig. 5 the relative error of representation is shown separately as a function of the number of spatial eigenvectors (eigenmaps: Fig. 5A) and as a function of the number of temporal eigenvectors (Fig. 5B). The relative error e2 due to the expansion in eight eigenmaps and five eigen ECG's was computed using expression (6). This error can also be deduced from Fig. 5 by adding the relative error for eight spatial eigenvectors to the one for five temporal eigenvectors. The average representation power in these coefficients is 96% for the 30 ms interval and 97% for the 100 ms interval.

TABLE I
Sensitivity (SE), Specificity (SP) and Correct Classifications (NC) in Discriminating 54 Normals and 52 Patients with Myocardial Infarction

             Method 1 (integral)        Method 2 (one-step)        Method 3 (two-step)
Number of    BSM          ECG           BSM          ECG           BSM
features     SE  SP  NC   SE  SP  NC    SE  SP  NC   SE  SP  NC    SE  SP  NC

A: 30 ms interval
2            79  91  85   63  72  68    81  93  87   85  91  88    83  93  88
3            87  93  90   83  94  89    92  94  93   87  91  89    92  94  93
4            83  89  86   79  93  86    87  94  91   90  89  90    94  93  93
5            90  91  91    *   *   *    85  93  89   90  93  92    94  93  93

B: 100 ms interval
2            81  81  81   69  83  76    85  85  85   71  74  73    83  70  76
3            87  85  86   75  83  79    88  91  90   77  80  78    88  85  87
4            85  85  85   71  83  77    87  93  90   77  81  79    87  87  87
5            83  87  85   73  83  78    90  91  91   79  85  82    92  93  92

The features for the discriminant analysis were selected from compressed data sets of complete body surface potential maps (BSM) and of the standard electrocardiogram (ECG). Three types of data compression were applied to the BSM and two types of data compression to the ECG. Method 1: data compression by time integration; Method 2: data compression by expansion in eigenvectors, considering each measurement (BSM or ECG) as a single vector; Method 3: data compression by expansion in both spatial and temporal eigenvectors (applied to BSM only). A: analysis based on the 30 ms interval (Q interval). B: analysis based on the 100 ms interval (QRS interval). The figures of SE, SP and NC are percentages.
* No more than four discriminative features could be determined in this case.

Fig. 3. Representation error of the body surface map data as a function of the number of coefficients resulting from the expansion in eigenvectors (Method 2). The lower curve shows the representation error due to the reduction of body surface maps in the Q interval, the upper curve in the QRS interval. Twenty-one variables represent 99% of the power of the Q maps (960 values); the QRS maps (3200 values) need 35 variables for a similar representation.
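A curve such as the one in Fig. 3 follows directly from the singular values through expression (3). The sketch below illustrates the one-step reduction on synthetic data; the function names are hypothetical, the matrix G is used without mean removal (an assumption), and `numpy.linalg.svd` stands in for whatever SVD routine was used originally.

```python
import numpy as np

def one_step_reduction(G, M):
    """One-step reduction of a (KL, N) data matrix, one subject per column.

    Returns the M leading eigenvectors, the (M, N) expansion
    coefficients and the relative representation error of expression (3).
    """
    U, s, Vt = np.linalg.svd(G, full_matrices=False)
    B = np.diag(s) @ Vt                 # column n holds the coefficients b_n
    power = s ** 2
    error = power[M:].sum() / power.sum()
    return U[:, :M], B[:M, :], error

def coefficients_needed(s, target=0.99):
    """Smallest number of coefficients representing `target` of the power."""
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    return int(np.searchsorted(cum, target) + 1)

# Synthetic example: 30 subjects, 120-dimensional vectors (K=8, L=15),
# built from five common patterns plus a small amount of noise.
rng = np.random.default_rng(4)
G = (rng.normal(size=(120, 5)) @ rng.normal(size=(5, 30))
     + 0.01 * rng.normal(size=(120, 30)))

U_M, coeffs, err = one_step_reduction(G, M=5)
recon = U_M @ coeffs                    # reconstruction from 5 coefficients
print(coeffs.shape, err)
```

The relative squared reconstruction error equals the value given by expression (3), so the error curve can be drawn from the singular values alone without reconstructing any map.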

Fig. 4. Six precordial ECG's of a normal subject as measured and as computed from different sets of representation variables. A: original data. B: subset of a reconstructed body surface map; map was reconstructed from the 40 variables derived according to Method 2. C: subset of a reconstructed body surface map; map was reconstructed from the 8 × 5 variables derived according to Method 3. D: ECG reconstructed from the 40 coefficients derived from the original ECG according to Method 2.

Fig. 5. (A) Representation error of the body surface map data using expansion in spatial eigenvectors. The lower curve refers to the Q interval, the upper to the QRS interval. Using 8 eigenvectors the representation errors are 1.3% (Q interval) and 1.5% (QRS interval). (B) Representation error of body surface maps using expansion in temporal eigenvectors. The lower curve represents the Q interval and the upper curve the QRS interval. Using 5 eigenvectors the representation errors are 0.2% (Q interval) and 1% (QRS interval). Using both expansions the data are represented by 8 × 5 = 40 variables. These 40 variables represent the maps within an error of 1.5% (Q interval) and 2.5% (QRS interval).

The representation power of the 40 coefficients is depicted in the same way as used for Method 2. In Fig. 4C, six ECG's from the reconstructed BSM data are shown. Two to five features were selected by the stepwise selection procedure and used for the discriminant analysis. The five most discriminative features were all found within the set of coefficients obtained by using the four most prominent eigenmaps and four eigen ECG's. The outcome of this analysis is listed in column 6 of Tables IA and IB.

A. 4--Comparison of the methods

From the data in Table I it can be concluded that similar results are obtained by using Methods 2 and 3. Only Method 1 and Method 2 are used in the subsequent analysis of the ECG data.

B. Analysis of the ECG

B. 1--Time integration (Method 1)

The same method of analysis as described in subsections B.1 and B.2 of the section Materials and Methods has been applied to the eight independent leads of the standard ECG data of the 106 subjects. Time integration of the ECG data results in eight variables. The features of the integrated ECG were selected from the eight variables by the stepwise discriminant analysis. No more than four significant features were found in the 30 ms interval and no more than five in the 100 ms interval. For the Q interval the most discriminative sites are (in decreasing order): left arm, right arm, V1, V6. For the QRS interval the corresponding sites were found to be: left arm, V3, V2, V1, right arm. The results of this analysis are shown in column 3 of Table IA (30 ms interval) and Table IB (100 ms interval).

B. 2--One-step data reduction (Method 2)

Using the one-step reduction method all data of a subject from K electrodes over L time samples are represented as one single KL vector. For the ECG this results in sets of 120-dimensional vectors for the Q interval and 400-dimensional vectors for the QRS interval. Expansion coefficients were derived from the sets using the singular value decomposition. In Fig. 6 the representation error of the ECG data in the 30 ms interval and in the 100 ms interval is shown as a function of the number of eigenvectors. The 40 coefficients, associated with the 40 largest singular values, were entered in the stepwise selection procedure. The 40 coefficients represent more than 99% of the power both in the 30 ms interval and in the 100 ms interval. In Fig. 4D the six precordial ECG's reconstructed from the 40 expansion coefficients and eigenvectors are shown. The discriminant analysis was performed using two to five features selected in the stepwise discriminant procedure. The five most discriminative features were all found within the first 15 variables, sorted by descending eigenvalues. The numbers of correct classifications, specificity and sensitivity are also listed in Table I (A: 30 ms interval, B: 100 ms interval).

DISCUSSION

For the statistical evaluation of both the BSM and the ECG data sets, data reduction is required to obtain a feature space of manageable dimensionality. This goal can be reached by using signal-analytical methods or by using a priori information on the data. In this way a limited set of variables can represent the original data quite well. Subsequently, the variables (features) that are most discriminative can be selected.

As a form of data reduction one can use a physiological model of the electrical activity of the heart. Time integration can be used when the depolarization in the heart can be modelled as a propagating uniform dipole layer. As indicated in section II, this model implies a unique relationship between the time-integrated body surface potential and the time pattern of the depolarization at the heart wall surface.[15] Much of the information on the electrical activation of the heart is therefore preserved in the time-integrated body surface potential. This reasoning also holds true for integration within the QRS interval: integration over the Q wave refers to the activation sequence during the first 30 ms of depolarization.

The classification results using time integration should be compared to the results in which no integration is done and in which time information is thus retained. Reduction of data in the latter case is achieved by Methods 2 and 3. Method 2 is more efficient in representation than Method 3.[17] However, after feature selection from the two representations, comparable classification results were found. We conclude therefore that both methods of reduction are equally powerful.

Using feature selection, it appears that only a small number of features is required for a proper classification. As can be concluded from Table I, no substantial increase in the number of correct classifications is observed when the number of features is larger than three. From Table I it can be seen that, particularly for the 30 ms interval, the results obtained by time integration and by the one-step reduction are similar. This indeed suggests that diagnostic information is preserved in the integral map of the first part of the QRS complex (30 ms interval). When the integration interval (Method 1) spans the entire QRS complex, the results of classification are slightly worse than those of Methods 2 and 3. Considering the number of subjects, these differences in results may not be significant.

Fig. 6. Representation error of the standard ECG data as a function of the number of expansion coefficients (Method 2). The lower curve shows the representation error of the ECG analyzed in the Q interval, the upper curve in the QRS interval. Nine variables represent the Q interval of the ECG data (120 values) to 99%, and 19 variables represent the QRS interval of the ECG data (400 values) to 99%.

The results of the analysis over the 30 ms interval of the 12-lead ECG are as good as those based on the BSM data. Thus the extra diagnostic information in the remaining (64 - 8 = 56) body surface leads could not be shown in this analysis. Looking at the entire QRS interval, the classification scores of the standard ECG are lower than those of the BSM, while the latter are comparably high in both the 30 ms interval and the 100 ms interval. As can be deduced from Table IA, two features from the BSM data are able to separate the two groups of subjects with a performance of 87%. Two
features of the ECG lead to an almost equal score. From the fluctuations in the scores in Table I one may consider these two scores to be near the best result. Therefore, the separability of the two groups can be seen in the 2-dimensional feature spaces spanned by the two sets of two features. This is shown in Fig. 7, from which it can be concluded that the distance between the two groups is relatively small with respect to the dispersion within each group. As a consequence we feel that an increased sample size would not necessarily result in an improved classification, particularly since the quality of this classification is already good (93%).

Fig. 7. Plot of the 54 normal subjects and the 52 patients with old infarction in the 2-dimensional feature space. The most discriminative features of the map data (3 versus 1) are shown in the upper part of the figure, while the most discriminative features of the ECG (6 versus 1) are shown in the lower part. The sets of basis vectors from which the two features have been chosen are different.

The limited number of subjects causes two problems in multivariate analysis. Firstly, for statistical classification the ratio of the number of subjects to the number of features has to be substantially higher than three for each class,[18] taking into account the filling of the feature space by the subjects, or more than 20[19] when estimates for the parameters of the parent multivariate distribution are used. Since in this study the maximum results, apart from fluctuations, have been obtained using at most three features, this requirement has been met. Secondly, the discriminant function, derived from the
design set, must be determined from as many data as possible. For testing, a large number of subjects is also needed. The optimal method is the jackknifing procedure, by which virtually all subjects are used both in the design set and in the test set, as is done in this study.

The results of this study can be compared with the results of others. Kornreich et al. used discriminant analysis to discriminate between normals and patients with a history of myocardial infarction, using other methods of data representation. Taking into account the different data bases with respect to both size and composition, comparable maximal results were obtained. They used the whole QRST interval whereas we used the Q interval only. For the QRS interval their results and our results were 92% and 89%, respectively. In a recent publication of Kornreich et al.[21] similar results were obtained from the 12-lead ECG, the VCG and a combination of these two lead systems; their analysis was applied to the whole QT interval. Lux et al.,[22] using a slightly different version of the two-step reduction method,[23,24] obtained about equal results in the discrimination between normals, myocardial infarctions and other diseases from the complete QRS data, whereby the features were selected on their discriminative power.

Our best results were also obtained by using the data from the Q interval only. Within the limitations of this study this leads to the conclusion that the specific information on the classes considered is already present in this interval.

The accuracy of the classification, assuming ideal discriminant functions and a binomial distribution for the misclassification rate, can be computed from the error rate found and the number of subjects.[25] For a result of 90% correct classification the 95% confidence interval runs from 83% to 95%. From this and the results in Table I it may be concluded:

1. there is no significant difference between the best performance of BSM and ECG in the presented material, when looking at the 30 ms interval.
2. integral maps and integral ECG can separate the infarction and normal groups well (90% and 89%, respectively).
3. only three features are needed to achieve this separation.

All three methods of data reduction are effective for feature extraction. In Methods 2 and 3 the time information is retained: the time course of body surface maps can be recomputed from the features (coefficients) and the set of eigenvectors.

The fact that the performances of BSM and ECG in the presented material are comparable may be
due to the fact that no distinction with respect to the site and location of the infarction has been considered. The restricted size of our data base at the present time prevented such differentiation. Another possible explanation may be found in the fact that the same (advanced) methods of data reduction, feature extraction and statistical analysis have been applied to both the full (extensive) set of map data and the (restricted) set of data of the standard ECG. In the literature several studies have appeared in which the performance of advanced methods applied to BSM data has been compared to inferior classification methods applied to the standard ECG data. That a fair comparison may, in this respect, do more justice to the performance of the ECG was demonstrated in this paper.

REFERENCES

1. FLOWERS, N C, HORAN, L G, SOHI, G S, HAND, R C AND JOHNSON, J C: New evidence for inferoposterior myocardial infarction on potential maps. Am J Cardiol 36:576, 1976
2. VINCENT, G M, ABILDSKOV, J A, BURGESS, M, MILLAR, K, LUX, R L AND WYATT, R F: Diagnosis of old inferior myocardial infarction by body surface isopotential mapping. Am J Cardiol 39:510, 1977
3. OHTA, T, TOYAMA, J, OHSUGI, J, KINOSHITA, A, ISOMORA, S, TAKATSU, F, NAGAYA, T AND YAMADA, K: Correlation between body surface isopotential maps and left ventriculograms in patients with old anterior myocardial infarction. Jpn Heart J 22:747, 1982
4. DE AMBROGGI, L, TACCARDI, B AND MACCHI, E: Body-surface maps of heart potentials. Tentative localization of pre-excited areas in forty-two Wolff-Parkinson-White patients. Circulation 54:251, 1976
5. YASUI, S, KUBOTA, I, OHYAMA, T, WATANABE, Y AND TSUIKI, K: Diagnosis of coronary artery disease using isointegral mapping. In Advances in Body Surface Potential Mapping. K Yamada, K Harumi and T Musha, Eds, The University of Nagoya Press, Nagoya, 1983, p. 243
6. KORNREICH, F AND RAUTAHARJU, P: The missing waveform and diagnostic information in the standard 12 lead electrocardiogram. J Electrocardiol 14:341, 1981
7. TOYAMA, S, SUZUKI, K, YOSHINO, K AND FUJIMOTO, K: A comparative study of body surface isopotential mapping and the electrocardiogram in diagnosing of myocardial infarction. J Electrocardiol 17:7, 1984
8. OSUGI, J, OHTA, T, TOYAMA, J, TAKATSU, F, NAGAYA, T AND YAMADA, K: Body surface isopotential maps in old inferior myocardial infarction undetectable by 12 lead electrocardiogram. J Electrocardiol 17:55, 1984
9. HERINGA, A, UIJEN, G J H AND VAN DAM, R TH: A 64-channel system for body surface potential mapping. In Electrocardiology '81, Z Antalóczy and I Préda, Eds, Akademiai Kiado, Budapest, 1982, p. 297
10. MONTAGUE, T, SMITH, E, CAMERON, D, RAUTAHARJU, P, KLASSEN, G, FELMINGTON, C AND HORACEK, B: Isointegral analysis of body surface maps: surface distribution and temporal variability in normal subjects. Circulation 63:1166, 1981
11. ABILDSKOV, J A, GREEN, L S, EVANS, A K AND LUX, R L: The QRST deflection area of electrograms during global alterations of ventricular repolarization. J Electrocardiol 15:103, 1982
12. WILSON, F N, MACLEOD, A G, BARKER, P S AND JOHNSTON, F D: The determination and the significance of the areas of the ventricular deflections of the electrocardiogram. Am Heart J 10:46, 1931
13. ABILDSKOV, J A, BURGESS, M J, MILLAR, K, WYATT, R AND BAULE, G: The primary T-wave. A new electrocardiographic waveform. Am Heart J 81:242, 1971
14. PLONSEY, R: A contemporary view of the ventricular gradient of Wilson. J Electrocardiol 12:337, 1979
15. CUPPEN, J AND OOSTEROM, A VAN: Model studies with the inversely calculated isochrones of ventricular depolarization. IEEE Trans Biomed Eng 31:652, 1984
16. FORSYTHE, G E, MALCOLM, M A AND MOLER, C G: Computer methods for mathematical computations. Prentice Hall, Englewood Cliffs, NJ, 1977
17. UIJEN, G J H, HERINGA, A AND OOSTEROM, A VAN: Data reduction of body surface potential maps by means of orthogonal expansions. IEEE Trans Biomed Eng 31:706, 1984
18. DUDA, R O AND HART, P E: Pattern Classification and Scene Analysis. John Wiley and Sons, New York, 1973, p. 251
19. CORNFIELD, J: Statistical classification methods. In Computer Diagnosis and Diagnostic Methods. J A Jacquez, Ed, Ch. 6. Charles C Thomas Publ., Springfield, 1972
20. SAS User's Guide: Statistics, 1982 Edition. SAS Institute Inc, Cary, NC, 1982, p. 405
21. KORNREICH, F, RAUTAHARJU, P M, WARREN, J W, HORACEK, B M AND DRAMAIX, M: Effective extraction of diagnostic ECG waveform information using orthonormal basis functions derived from body surface potential maps. J Electrocardiol 18:341, 1985
22. LUX, R L, GREEN, L S AND ABILDSKOV, J A: Statistical representation and classification of electrocardiographic body surface potential maps. In Computers in Cardiology '84. IEEE Computer Society, Long Beach, CA, 1984, p. 251
23. EVANS, K, LUX, R, BURGESS, M, WYATT, R AND ABILDSKOV, J: Redundancy reduction for improved display and analysis of body surface potential maps. II. Temporal compression. Circ Res 49:197, 1981
24. LUX, R, EVANS, K, BURGESS, M, WYATT, R AND ABILDSKOV, J: Redundancy reduction for improved display and analysis of body surface potential maps. I. Spatial compression. Circ Res 49:186, 1981
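
Note on the confidence interval. The 95% confidence interval quoted in the Discussion (83% to 95% for a score of 90% correct classifications) can be checked approximately with a short calculation. The sketch below is only an illustration and not the authors' computation: the text does not state which interval formula was used, so the Wilson score interval is an assumption, as are the function name `wilson_interval` and the use of n = 106 subjects (54 normals plus 52 infarctions).

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(p_hat: float, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Assumed method for illustration; the paper does not name its formula.
    """
    # Two-sided critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    denom = 1.0 + z * z / n
    centre = (p_hat + z * z / (2.0 * n)) / denom
    half_width = z * sqrt(p_hat * (1.0 - p_hat) / n + z * z / (4.0 * n * n)) / denom
    return centre - half_width, centre + half_width


# 90% correct classifications among 106 subjects (54 normals + 52 infarctions).
low, high = wilson_interval(0.90, 106)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

Under these assumptions the interval comes out at roughly 0.83 to 0.94, close to the range quoted in the Discussion; the small difference at the upper end may reflect a different interval formula or rounding. Both the BSM score (90%) and the ECG score (89%) for the 30 ms interval fall well inside this range, which is consistent with the first conclusion listed in the Discussion.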
