J. ELECTROCARDIOLOGY 14 (3), 1981, 283-288
Electrocardiogram Computer Analysis. Practical Value of the IBM Bonner-2 (V2 MO) Program
BY REMIGIO GARCIA, M.D., GERALD M. BRENEMAN, M.D. AND SIDNEY GOLDSTEIN, M.D.
SUMMARY
One thousand ECGs were selected from those taken from an adult population and analyzed by three cardiologists to assess the performance of the IBM Bonner-2 (V2 MO) program. The sample included 200 ECGs with 248 myocardial infarctions, 200 with conduction abnormalities, 100 with ventricular hypertrophy, 300 normals and 200 with electronic pacemakers. In the MI group, there was a total sensitivity of the program with respect to the readers of 88% (218/248) with 19 (8%) program errors and 90% (248/275) specificity. In 53 ECGs with two MI statements, the sensitivity was 89% (47/53). The sensitivity in conduction abnormalities was 93.4% (183/196) with 98.7% specificity (800/810). The sensitivity in LVH was 90% (74/81) and in RVH 83% (5/6). Among the normals, the specificity was 98.6% (289/293). The sensitivity to PVCs was 84% (56/67), to atrial fibrillation 87% (48/55), to SVCs 58.6% (17/29) and to electronic pacemakers 65% (127/196). Recognizing the limitations of this type of analysis, this study indicates that the V2 MO version compares favorably with the earlier versions of the same program and is a valuable aid in ECG interpretation with rapid acceptance by physicians.
Numerous programs for the automated processing of ECGs have been developed and are now available for routine clinical use. In an attempt to evaluate the performance of the new version of the IBM Bonner-2 (V2 MO) computer analysis program, one thousand ECGs selected from those taken from an adult population were analyzed by three cardiologists and their analyses were compared to the program's interpretations. Although the shortcomings of this type of evaluation have been emphasized by several authors,1-4 this form of analysis will continue to have some value until a data bank library of ECGs with proven clinicopathological diagnosis becomes generally available.
MATERIALS AND METHODS
One thousand ECGs were selected from the Henry Ford Hospital's adult population ECGs collected on Marquette transmitting carts (Series 3000)* and processed by the IBM Bonner-2 (V2 MO) computer analysis program.* Each ECG was reviewed in detail by a panel of three staff cardiologists, none of whom was aware of the computer interpretation at the time of his review. Each made his own independent interpretation using criteria with which he was most familiar. In case of disagreement between the readers and the program, the patient's history and/or serial ECGs were reviewed by the panel and a consensus was reached against which the computer program results were compared. Diagnostic disagreements were studied to determine if they were due to a difference in diagnostic criteria between the readers and the computer program or to a true program error. Minor differences in ST-T changes were disregarded. Program errors were defined as computer inaccuracies in P-QRS-T recognition, errors in measurement and/or program logic. The relative sensitivity and specificity of the computer program were calculated using the following two equations:
(1) percent sensitivity = No. of true positive / (No. of true positive + No. of false negative) x 100
(2) percent specificity = No. of true negative / (No. of false positive + No. of true negative) x 100
The sample included 200 ECGs (20%) with 248 myocardial infarctions (MI), 200 with conduction abnormalities (BBB), 100 with ventricular hypertrophy, 300 normals and 200 with electronic pacemakers. Although not restricted by a fixed set of criteria, the cardiologists based their diagnosis mostly on the criteria
From the Division of Cardiovascular Diseases, Department of Medicine, Henry Ford Hospital, Detroit, Michigan. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. §1734 solely to indicate this fact. Reprint requests to: Remigio Garcia, M.D., Cardiovascular Medicine, Henry Ford Hospital, 2799 West Grand Blvd., Detroit, Michigan 48202.
*Marquette Electronics, Incorporated, 3712 West Elm Street, Milwaukee, WI 53209.
[Figs. 1 and 2: bar graphs comparing reader (R) and program (P) statements, vertical axis 0-100%, with false negative F(-) and false positive F(+) segments. Panels: Myocardial Infarct Statements (total 200 ECGs), Anterior MI, Anteroseptal MI, Lateral MI, Subendocardial MI.]
Fig. 2. Showing the sensitivity of the program with respect to the readers in anteroseptal MI (95%), in lateral MI (55%) and subendocardial MI (87%). The black segment of each bar indicates the percentage of program errors.
Fig. 1. Myocardial Infarct. In each bar graph, the number (N=) written under the letter R represents the total number of times the readers made the diagnostic statement listed. The number (N=) under the letter P represents the number of times the program made the diagnostic statement in question. In each bar the number under the letter A in the dotted-line section of the bar indicates the percent of agreements between the readers and the computer program. The black bar segment indicates the percentage of cases in which disagreements between the readers and computer program resulted from program error. The white middle bar segment indicates the percentage of disagreements resulting from the different diagnostic criteria used by the readers and the computer program. Twenty (8%) of the infarct statements were due to false positive program errors whereas 30 (12%) infarct statements were missed due to false negative program errors. In the anterior MI subgroup, there were three (8%) false positive disagreements and three (8%) false negative true program errors.
established by the Minnesota Code for myocardial infarction,5 the Romhilt-Estes6 point score system, and Sokolow-Lyon7 for LVH; for RVH, the R/S ratio of V1 greater than one, the R/S ratio in V6 less than one plus right axis deviation and/or right atrial enlargement (when present) in the absence of RBBB, in addition to the history and/or serial ECGs in doubtful cases.
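As an illustration of the two equations given in Materials and Methods, the following minimal Python sketch computes percent sensitivity and percent specificity from counts. The function names are hypothetical (they do not come from the Bonner-2 program), specificity is taken in its conventional form with true negatives in the numerator, and the example counts are the conduction-abnormality fractions quoted in the Summary (183/196 and 800/810).

```python
def percent_sensitivity(true_pos: int, false_neg: int) -> float:
    """Equation (1): TP / (TP + FN) x 100."""
    return 100.0 * true_pos / (true_pos + false_neg)


def percent_specificity(true_neg: int, false_pos: int) -> float:
    """Equation (2), conventional form: TN / (FP + TN) x 100."""
    return 100.0 * true_neg / (false_pos + true_neg)


# Conduction-abnormality counts as quoted in the Summary:
# 183 of 196 reader statements were matched (13 missed), and the
# specificity fraction is printed as 800/810 (10 false positives).
sens = percent_sensitivity(true_pos=183, false_neg=13)
spec = percent_specificity(true_neg=800, false_pos=10)
print(f"sensitivity = {sens:.1f}%, specificity = {spec:.1f}%")
```

Run as written, the sketch prints a sensitivity of 93.4% and a specificity of 98.8%. The article prints the latter as 98.7%, which would be consistent with truncating rather than rounding 98.77%.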
RESULTS
Myocardial Infarction
In the group of 200 myocardial infarction ECGs, there were a total of 248 infarct statements made by the readers with a sensitivity of the program with respect to the readers of 88% (218/248) (Fig. 1). There were a total of 30 false negative and 27 false positive statements by the program with 20 (8%) program errors. The specificity was 90.1% (248/275). In the different MI categories (Figs. 1, 2, 3), we observed a 95% (71/75) sensitivity in the anteroseptal MI group followed by the anterior MI with 92% (36/39), subendocardial MI 87% (33/38) and diaphragmatic MI 86% (69/80). The lowest sensitivity was in lateral MI with 55% (6/11) and true posterior with 60% (3/5). In the group with two MI statements, the sensitivity was 89% (47/53).
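The per-category fractions quoted above can be cross-checked against the overall figure. The sketch below, in Python, recomputes each category's sensitivity and then pools the counts. The dictionary and function names are illustrative assumptions, and the only data used are the fractions printed in this paragraph.

```python
# (agreements with readers, reader statements) for each MI category,
# copied from the fractions quoted in the Results text.
MI_COUNTS = {
    "anteroseptal": (71, 75),
    "anterior": (36, 39),
    "subendocardial": (33, 38),
    "diaphragmatic": (69, 80),
    "lateral": (6, 11),
    "true posterior": (3, 5),
}


def percent(agreed: int, total: int) -> float:
    """Sensitivity of the program relative to the readers, in percent."""
    return 100.0 * agreed / total


def pooled_counts(counts: dict) -> tuple:
    """Sum agreements and reader statements over all categories."""
    agreed = sum(a for a, _ in counts.values())
    total = sum(n for _, n in counts.values())
    return agreed, total


# Per-category sensitivities, then the pooled figure.
for name, (agreed, total) in MI_COUNTS.items():
    print(f"{name:15s} {agreed:3d}/{total:<3d} {percent(agreed, total):5.1f}%")

agreed, total = pooled_counts(MI_COUNTS)
print(f"{'all categories':15s} {agreed:3d}/{total:<3d} {percent(agreed, total):5.1f}%")
```

The pooled counts come to 218/248, the same fraction the text gives for overall MI sensitivity, so the six category fractions are internally consistent with the 88% figure.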
In the anterior MI category, three (8%) false positive statements and three (8%) false negative program errors were found (Fig. 1); the latter included three examples of anterior subendocardial infarction in association with LVH and intraventricular conduction delay diagnosed by the readers, while the computer statement was LBBB. In the anteroseptal MI group, there were one (1.3%) false negative and two (2.7%) false positive program errors (Fig. 2). In the diaphragmatic infarctions, there were 11 (14%) false negative and 13 (16%) false positive statements including four (5%) false negative and seven (8.5%) false positive program errors (Fig. 3). The four false negative errors included three statements of "no processing because of measuring inconsistency" although the recordings showed changes of inferior MI, while in another case diagnostic Q waves in leads II, III and aVF were missed by the computer. Among the seven false positive errors, six were due to the inability of the computer to recognize small initial r waves in leads II, III and aVF, and in another example the statement of inferior MI was made by the program in addition to LBBB while only LBBB was found by the reading panel.
In the true posterior group, there were six disagreements and three program errors (Fig. 3). The six disagreements were based on criteria differences resulting from the computer statement of "possible acute true posterior injury" whenever the J-point was depressed more than 0.09 mV in two of V1, V2, or V3 and the sum of J depression in V1 to V3 was more than the sum of J depression in V4 to V6. These criteria were not shared by the readers. Serial ECGs showed rapid return to the baseline of the J-point depressions without any other evidence of true posterior MI. The program errors included two (40%) false negative statements interpreted as true posterior MI by the readers due to the presence of tall R and T waves in leads V1 to V3 in association with an inferior MI; the false positive statement was an example of an inferior MI with poor R-wave progression in leads V1 to V3 without ST-T wave changes, misinterpreted as a true posterior MI by the program.
Conduction Abnormalities
In this group the overall sensitivity was 93.4% (183/196) with 13 (6.6%) false negatives, and the specificity was 98.7% (800/810) with a total of ten (5.2%) false positive program errors (Fig. 4). The sensitivity in LBBB was 95.9% (47/49) with 99.3% specificity and six program errors: one example of LBBB was interpreted as LVH with strain by the computer; a second false negative error was due to measurement inconsistency without further processing done by the computer. In four cases of LVH with intraventricular conduction delay, the false positive statement of "consider LBBB" was made by the computer. The sensitivity in isolated LAFB was 86% (25/29) with four (14%) false negative program errors. In RBBB (Fig. 5) the sensitivity was 95% (39/41) with 99.5% specificity and five program errors. The two false negative errors included one report of "no further processing" and one due to measurement inconsistency in examples of RBBB; the three false positive errors included one example with atrial flutter and IVCD misinterpreted by the computer as RBBB and two due to measurement error by the program. In combined RBBB plus left anterior fascicular block, the sensitivity was 94% (47/50) with 100% specificity. There were three false negative errors: in two ECGs the program did not identify the presence of RBBB, and in a third the LAFB was missed by the computer probably as a result of frequent PVCs. There were 15 examples of trifascicular block with a sensitivity of 73% (11/15), with four false negatives due to the inability of the program to identify the first degree A-V block. No false positive statements were made by the program in these categories. The sensitivity in incomplete RBBB was 94% (15/16) with one false negative (6%) and one false positive (Fig. 5). In the intraventricular conduction delay group (IVCD), of 12 statements made by the program, two (7%) were false positive due to mismeasurement of the QRS by the computer. A false negative error occurred when an example of IVCD was interpreted as normal by the computer.
[Fig. 3: bar graphs comparing reader (R) and program (P) statements, vertical axis 0-100%. Panels: Diaphragmatic MI, True Posterior, Two Infarct Statements.]
Fig. 3. Showing the sensitivity of the program in the diaphragmatic (86%), true posterior MI (60%) and two MI categories (89%). In the diaphragmatic group, there were four (5%) false negative and seven (8.5%) false positive program errors. In the posterior MI there were two (40%) false negative and one (11%) false positive program errors. In the two infarct groups, there were six (11%) false negative and three (6%) false positive program errors.
[Figs. 4 and 5: bar graphs comparing reader (R) and program (P) statements, vertical axis 0-100%. Panels: Conduction Abnormality (total 200 ECGs), LBBB, LAFB, RBBB, IRBBB, IVCD.]
RBBB = Right Bundle Branch Block; IRBBB = Incomplete RBBB Pattern; IVCD = Intraventricular Conduction Delay; LBBB = Left Bundle Branch Block; LAFB = Left Anterior Fascicular Block
Fig. 4. The overall sensitivity in conduction abnormality was 93.4%, in LBBB 96% with two (4%) false negative and four (8%) false positive program errors. In LAFB the sensitivity was 86% with four (14%) false negative errors.
Fig. 5. In RBBB the sensitivity was 95% with two (5%) false negative and three (7%) false positive program errors. The sensitivities in IRBBB and IVCD were 94% and 91%, respectively.
Normal
With the 300 normal ECGs, the sensitivity of the program with respect to the readers was 94.8% (274/289) and the specificity was 98.6% (289/293) with four (1.4%) false positive and 11 (3.8%) false negative program error statements (Fig. 6). One of the false positives was due to a program error of "no further processing" printed by the computer in an otherwise normal ECG; the other three false positive statements made by the program about ECGs interpreted as normal by the readers included one statement of possible obstructive lung disease, one of possible acute inferior injury and one of intraventricular conduction delay. The 11 false negative statements, interpreted as normal by the computer, included five examples with minor non-specific ST-T changes, three cases of possible anteroseptal myocardial infarction, two examples of left ventricular hypertrophy and one of probable old inferior myocardial infarction identified by the readers.
Hypertrophy
The sensitivity in LVH was 90% (73/81) with eight (10%) false negatives and eight (10%) false positives (Fig. 6). There were five program errors. Three (3.7%) false negatives were found to have LVH with strain by the readers, while the computer statements listed anteroseptal MI, possible anterior ischemia, and non-specific T wave changes plus left axis deviation, respectively. The other two (2.5%) false positives were overcalled LVH by the computer without the fulfillment of any of the voltage criteria. In RVH there were no true program errors, but there were six disagreements between the readers and the computer criteria for RVH. In RVH the overall sensitivity was 83% (5/6) with one false negative and six false positives.
[Fig. 6: bar graphs comparing reader (R) and program (P) statements, vertical axis 0-100%. Panels: Normal (total 300 ECGs); Hypertrophy (100 ECGs): L.V.H. and R.V.H.]
Fig. 6. In Normals the sensitivity was 94.8% with eleven (3.8%) false negative and four (1.4%) false positive program error statements. In LVH the sensitivity was 90% with eight (10%) false negative and eight (10%) false positive errors. In RVH the sensitivity was 83%.
Rhythm disturbances and electronic pacemakers
Of the 800 cases without electronic pacemakers, there were 55 cases of atrial fibrillation identified by the readers with seven false negative statements by the computer with a sensitivity of 87.2% (48/55) (Fig. 7). The program labeled 58 as showing atrial fibrillation with ten false positive statements with a specificity of 98.7% (745/755). Ventricular premature complexes (PVCs) were identified by the readers in 67 cases with a program sensitivity of 83.5% (56/67) and a specificity of 98.5% (733/744). The program was not able to identify couplets nor salvos of PVCs; whenever more than one PVC was detected the suffix complex(es) was printed. The sensitivity of the program to supraventricular premature complexes (SVCs) was 58.6% (17/29). The sensitivity to artificial pacemakers was 65% (127/196) with four (3%) false positives giving a program specificity of 99.5% (800/804). The sensitivity to the first degree A-V block was 74.2% (69/93) with 98.6% specificity (800/811). Complex arrhythmias and second or third degree A-V block were not detected by the program.
[Fig. 7: bar graphs comparing reader (R) and program (P) statements, vertical axis 0-100%. Panels under Rhythm Disturbance: PVCs, Atrial Fibrillation, Electronic Pacemakers.]
Fig. 7. Atrial fibrillation sensitivity was 87.2% with seven (13%) false negative and ten (17%) false positive error statements, the sensitivity to PVCs was 83.5% and to electronic pacemakers 65% with sixty-eight (35%) false negative and four (3%) false positive errors.
DISCUSSION
Recognizing its numerous limitations,1-4 the usual way of validating the accuracy of a computer ECG analysis system consisted of comparing the computer's reading reports against those made by the panel of cardiologists or supervised reader technicians using clearly defined code criteria.8,9 The need for an objective electrocardiographic library data bank of proven cases for computer ECG validation has been emphasized by most of the authorities in ECG computer technology, but such a bank is not generally available as yet. Short of autopsy correlation, the patient's clinical history combined with new noninvasive and invasive techniques in addition to direct examination at the time of heart surgery, have been suggested and utilized as alternatives by several authors10 to confirm the diagnostic accuracy of computer programs. The computer's utility in the community, regardless of its accuracy, at the present time relates to its acceptability by clinical cardiologists and such a comparison between the computer and the clinical cardiologist is important.
In an attempt to determine the practical value and sensitivity of the new IBM Bonner-2 (V2 MO) computer program in a large hospital and clinic setting, 1000 ECGs were selected from the Henry Ford Hospital adult population ECGs as described in the Materials and Methods section. In the total group with MI, the sensitivity was 88%, which compared favorably with the results reported by Bailey et al.11 using the early experimental version of the IBM program of 1971. In the diagnostic category of LVH and conduction abnormalities, as well as in the recognition of ventricular arrhythmias and atrial fibrillation, the results were an improvement over those reported by Bailey et al.11,12 In the LBBB category, there was an improvement in the sensitivity from the original version.13 The identification of artificial pacemakers was markedly improved from the first version.12 The accuracy in the group of normal ECGs was excellent with a specificity of 98.6%. In the category with two MIs, the sensitivity was good at 89%. The authors were favorably impressed by the rapid general acceptance of the program both by clinical cardiologists and practitioners, but the need for physician verification of the computer printouts including the normal reports must be emphasized. The program's time-saving features, not only for the readers, but for the personnel handling the ECGs, were also impressive.
Acknowledgment: The authors would like to express their gratitude to Ms. Karen Iwaniec for her assistance in typing the manuscript.
REFERENCES
1. PIPBERGER, H V AND CORNFIELD, J: What ECG computer program to choose for clinical application. The need for consumer protection. Circulation 47:918, 1973
2. RAUTAHARJU, P M AND SMETS, P H: Evaluation of computer-ECG programs. The strange case of the golden standard. Comput Biomed Res 12:39, 1979
3. RAUTAHARJU, P M, ARIET, M, PAYOR, T L, et al: Task Force III: Computers in diagnostic electrocardiography. Proceeding of the Tenth Bethesda Conference, Optimal Electrocardiography. Am J Cardiol 41:158, 1978
4. RAUTAHARJU, P M: Use and abuse of electrocardiographic classification systems in epidemiologic studies. Eur J Cardiol 8:155, 1978
5. BLACKBURN, H, KEYS, A, SIMONSON, E, RAUTAHARJU, P AND PUNSAR, S: The electrocardiogram in population studies. A classification system. Circulation 21:1160, 1960
6. ROMHILT, D W, BOVE, K E, NORRIS, R J, CONYERS, E, CORADI, S, ROLLANDS, D T AND SCOTT, R C: A critical appraisal of the electrocardiographic criteria for the diagnosis of left ventricular hypertrophy. Circulation 40:185, 1969
7. SOKOLOW, M AND LYON, T P: The ventricular complex in left ventricular hypertrophy as obtained by unipolar precordial and limb leads. Am Heart J 38:273, 1949
8. CACERES, C A: Present status of computer interpretation of the electrocardiogram: A 20 year overview. Am J Cardiol 41:11, 1978
9. MEYER, J, HEINRICH, K W, MERX, W AND EFFERT, S: Evaluation of Current Systems for Computer Analysis of the Electrocardiogram, L PORDY, ed. Futura Pub Co, Mount Kisco, 1977, pp 295-307
10. BROHET, C R AND RICHMAN, H G: Clinical evaluation of automated processing of electrocardiograms by the Veterans Administration program (AVA 3.4). Am J Cardiol 43:1167, 1979
11. BAILEY, J J, ITSCOITZ, S B, HIRSHFIELD, J W, GRAUER, L E AND HORTON, M R: A method for evaluating computer programs for electrocardiographic interpretation. I. Application to the experimental IBM program of 1971. Circulation 50:73, 1974
12. MIYAHARA, H, ENDOU, K, DOMAE, A AND SATO, T: Arrhythmia diagnosis by the IBM electrocardiogram analysis program. J Electrocardiol 13:17, 1980
13. BONNER, R E, CREVASSE, L, FEARER, M I AND GREENFIELD, J C, JR: A new computer program for analysis of scalar electrocardiograms. Comput Biomed Res 5:629, 1972