exercise and the heart
Predicting Severe Angiographic Coronary Artery Disease Using Computerization of Clinical and Exercise Test Data*
Dat Do, BS; Rachel Marcus, MD; Victor Froelicher, MD; Andras Janosi, MD, FCCP; Jeff West, MD; J. Edwin Atwood, MD; Jonathan Myers, PhD; Robert Chilton, MD; and Jeff Froning, MS
Currently the standard exercise test is shifting from being a tool for the cardiologist to utilization by the nonspecialist. This change could be facilitated by computerization similar to the interpretation programs available for the resting ECG. Therefore, we sought to determine if computerization of both exercise ECG measurements and prediction equations can substitute for visual analysis performed by cardiologists to predict which patients have severe angiographic coronary artery disease. We performed a retrospective analysis of consecutive patients referred for evaluation of possible or known coronary artery disease who underwent both exercise testing with digital recording of their exercise ECGs and coronary angiography at two university-affiliated Veterans Affairs medical centers and a Hungarian hospital. There were 2,385 consecutive male patients with complete data who had exercise tests between 1987 and 1997. Measurements included clinical and exercise test data, visual interpretation of the ECG paper tracings, > 100 computed measurements from the digitized ECG recordings, and compilation of angiographic data from clinical reports. The computer measurements had similar diagnostic power compared with visual interpretation. Computerized ECG measurements from maximal exercise or recovery were equivalent or superior to all other measurements. Prediction equations applied by computer were only able to correctly classify two or three more patients out of 100 tested than ECG measurements alone. β-Blockers had no effect on test characteristics while ST depression on the resting ECG decreased specificity. By setting probability limits using the scores from the equations, the population was divided into high-, intermediate-, and low-probability groups. A strategy using further testing in the intermediate group resulted in 86% sensitivity and 85% specificity for identifying patients with severe coronary disease.
We conclude that computerized exercise ST measurements are comparable to visual ST measurements by a cardiologist and computerized scores only minimally improved the discriminatory power of the test. However, using these scores in a stratification algorithm allows the nonspecialist physician to improve the discriminatory characteristics of the standard exercise test even when resting ST depression is present. Computerization permitted accurate identification of patients with severe coronary disease who require referral. (CHEST 1998; 114:1437-1445)
Key words: angiography; computerization; coronary artery disease; exercise testing; prediction equations
Abbreviations: BMI = body mass index; CAD = coronary artery disease; HR = heart rate; LVH = left ventricular hypertrophy; METs = metabolic equivalents; MI = myocardial infarction; ROC = receiver operating characteristic; V/Q = ventilation perfusion ratio
*From the Cardiology Divisions at the Veterans Affairs Palo Alto Health Care System (Drs. Marcus, Froelicher, Atwood, Myers, and West), Stanford University, Palo Alto, CA; University of Texas in San Antonio (Dr. Chilton and Mr. Do); Sunnyside Biomedical (Mr. Froning), Vista, CA; and St. Janos's Municipal Hospital (Dr. Janosi), Budapest, Hungary. Manuscript received April 20, 1998; revision accepted April 21, 1998. Correspondence to: Victor Froelicher, MD, Cardiology Division (111C), VA Palo Alto Health Care System, 3801 Miranda Ave, Palo Alto, CA 94304
CHEST / 114 / 5 / NOVEMBER, 1998
Cost-containment measures as part of managed care have expanded the role of primary care physicians in the treatment of patients with probable coronary disease. As this group of physicians at least partly assumes what was traditionally the cardiologist's role, it is critical that there be a clearly defined strategy for referral to a cardiologist. The principal tools that are available to physicians without advanced training in cardiology are the ECG and the exercise test. As these tools are noninvasive, they are relatively inexpensive, safe, and easy to use. However, their limited sensitivity and specificity as well as subjectivity of interpretation compromise the ability of primary care physicians to adequately risk-stratify their patients, and identify those who need referral. The diagnostic capacity of modifications to the interpretation of the standard exercise test relying on computerization has been thoroughly reviewed.1 Additionally, multivariable statistical scores considering clinical and exercise test data have demonstrated superior discriminating power compared with simple classification of the ST response.2 If computer analysis is indeed equivalent to visual ST interpretation by the cardiologist and scores are superior to ST measurements alone, computerization of analysis and scoring could supplement and facilitate exercise ECG interpretation similar to the widely used computer programs for the resting ECG.3 While nonspecialists can treat many patients with coronary disease, they would like to identify those patients with severe coronary disease who usually require referral to a specialist. In this study, we performed a retrospective analysis of patients presenting with signs and/or symptoms of possible or known coronary disease. All patients had exercise ECGs analyzed visually and by computer; the outcome variable was angiographically severe coronary disease. In addition, we utilized multivariable statistics to devise a score using clinical and exercise test data for estimating the probability for severe coronary disease. By setting probability prediction thresholds that divided our patients into low-, intermediate-, and high-risk groups, a strategy is provided for treating patients that we hypothesize is superior to simply considering whether a patient had an abnormal test result or not.
MATERIALS AND METHODS
Patients
The population included 2,385 consecutive male patients with complete data who had treadmill tests at two Veterans Affairs medical centers and bicycle tests at a Hungarian hospital between 1987 and 1997 to evaluate signs or symptoms of possible or known coronary disease. All patients had coronary angiography within 4 months of the exercise test. As is the case for clinical observational studies like this, there was no attempt to remove workup bias. Patients with previous cardiac surgery, valvular heart disease, left bundle branch block, or Wolff-Parkinson-White syndrome on their resting ECG were excluded from the study. Prior cardiac surgery was the predominant reason for exclusion of patients who underwent exercise testing and angiography during this time period. The clinical variables considered were obtained from the initial history. These included age, chest pain symptoms, body mass index (BMI), obesity, history of congestive heart failure, hypertension, noninsulin- or insulin-dependent diabetes, stroke, peripheral vascular disease, hypercholesterolemia by current values or by history, and COPD, as well as family history and current and past cigarette smoking status. Chest pain symptoms were coded as 1 for typical, 2 for atypical, 3 for nonanginal pain, and 4 for no chest pain. Resting ECGs were coded abnormal if they exhibited one or more of the following: ST depression > 0.1 mm, left ventricular hypertrophy (LVH), or T-wave inversion. All clinical variables except for age, chest pain, and BMI were coded 0 (absent) or 1 (present). Myocardial infarction (MI) score was classified according to Hubbard et al4 (3 for presence of Q waves on resting ECG and history of MI, 2 for presence of Q waves without history of MI, 1 for history of prior MI without Q waves, and 0 if no Q waves or history of MI). While much of these data were gathered prospectively using computerized forms,5,6 some of the patients initially studied had incomplete data requiring additional chart review.
Exercise Testing
At the Veterans Affairs medical centers, patients underwent treadmill testing using the United States Air Force School of Aerospace Medicine7 or an individualized ramp treadmill protocol.8 Before ramp testing, the patients were given a questionnaire consisting of a list of activities presented in an increasing order according to metabolic equivalents (METs). This questionnaire estimated the patient's exercise capacity before the test and thus allowed most patients to reach maximal exercise at approximately 10 min.9 Visual ST segment deviation was measured at the J junction and corrected for preexercise ST depression while standing; ST slope was measured over the following 60 ms and classified as upsloping, horizontal, or downsloping. An abnormal slope was coded as 1 for horizontal, 2 for downsloping, and 0 for normal slope (upsloping or ST depression of < 0.5 mm). The ST response considered was the most horizontal or downsloping ST depression in any lead except aVR during exercise or recovery. In addition, all of the following hemodynamic measurements were recorded: resting, change, and maximal values for heart rate (HR); systolic BP; double product as well as exercise-induced hypotension (a drop in exercise systolic BP below standing or a drop in systolic BP of 20 mm Hg after a rise); and exercise capacity estimated in METs. Angina during testing was classified according to the Duke Exercise Angina Index (2 if angina required stopping the test, 1 if angina occurred during or after exercise testing, and 0 for no angina).10 No test result was classified as indeterminate,11 treatment with medications was not withheld, and no maximal heart rate targets were applied.12
While all the exercise tests were performed, analyzed, and reported per standard protocol and utilizing a computerized database (EXTRA; Mosby Publishers; Chicago, IL), the cardiac catheterization was consistent with clinical practice at each institution and results were abstracted from clinical reports. All exercise ECG analyses and comparisons were performed blinded to clinical and angiographic results.
Computer Analysis
Microprocessor-based exercise ECG devices were used at the three sites to simultaneously record all 12 ECG leads through exercise and recovery at 500 samples per second (Mortara Electronics; Milwaukee, WI) on optical disks. Optical disk recordings were processed off-line using standard personal computers. Averaging of the raw data from three leads (II, V2, and V5) and determination of QRS onset and offset points were performed using developed software (Sunnyside Biomedical; Vista, CA). The computer-chosen isoelectric line and QRS onset and offset points were confirmed visually for their accuracy. During the last year of data collection at Palo Alto Health Care Systems, a treadmill system (QUEST; Burdick; Milton, WI) was used. This system collected data on memory cards and utilized a 12-lead on-line version of the software. The following measurements and calculations were evaluated: (1) ST0 (J-junction) and ST60 (60 ms after the J-junction) at rest, at 2 min prior to maximal exercise, at maximal exercise, and at 1, 3.5, and 5 min recovery; (2) ST slope, based on a least squares fit between ST0 and ST60, at the same times as the amplitude measurements; (3) ST integral; (4) ST index13; (5) the sum of and the most ST depression in II, V2, and V5 at maximal exercise and 3.5 min recovery; (6) ST0 and ST60/HR index and slope; (7) the treadmill exercise score of Hollenberg et al14 (which includes time-amplitude plots for the three leads in exercise and recovery [six separate areas]); and (8) ST60 in V5 during exercise at HRs of 100 and 110 beats/min. Several empiric composite adjustments were made in an attempt to simulate visual analysis by adjusting for baseline depression and using slope criteria changing with HR. R-wave amplitude was available at all of the time periods and results obtained adjusting the ST measurements by this amplitude are reported.
Coronary Angiography
Coronary artery narrowing was visually estimated and expressed as percent luminal diameter stenosis at each site blinded to the patient's history and exercise test results. Patients with a 50% narrowing in the following patterns were considered to have severe angiographic coronary artery disease: two major vessel disease if one lesion was proximal in the left anterior descending, three vessels and/or their major divisions, or a 50% narrowing in the left main coronary artery.
The 50% criterion and consideration of proximal left anterior descending lesions are consistent with the cooperative trialists' findings.15 The Duke angiographic jeopardy score, which grades the severity of disease by consideration of location on a scale from 0 to 12, was calculated.16
Statistical Methods
Using equations formulated in a spreadsheet (Excel; Microsoft Corp; Redmond, WA), > 100 computer variables were evaluated by receiver operating characteristic (ROC) analysis in the total sample and the means and SDs were calculated. In addition, sensitivity for specificity matching that of the visual analysis criterion was also calculated (True Epistat; Richardson, TX). This was done since it is at this point on the ROC curve that the clinician usually applies the exercise test. Any proposed improvement must be compared with the test's current performance. With the sample size, the 95% confidence interval for the ROC curves was ± 0.02 and so only measurements within the 95% confidence interval of visual analysis were chosen for presentation. Three logistic regression models were developed (True Epistat) using clinical, hemodynamic, and non-ECG variables, then one model added visual ST measurement, a second added the best ST measurement in recovery, and the third added the best computerized ST measurement at maximal exercise. In addition, in patients without an MI, a fourth logistic regression model was developed and the relationship of exercise-induced ST depression to angiographic disease severity was studied by linear regression. These two analyses were performed in both the patients with and without an MI since an obstruction in an artery supplying an infarcted area does not result in ischemia.17 The performance of visual and computerized exercise ECG measurements and the models were also assessed considering medication status and the resting ECG.
The resting ECG was classified by visual criteria (LVH, T-wave inversion, and ST depression) and also by the computer (ST depression).
The measurements and the models were then tested in the three subpopulations, each with a different prevalence of coronary disease and a different rate of abnormal results of exercise tests, to evaluate their performance and demonstrate how probability thresholds divided the populations.
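The two quantities used throughout the Statistical Methods, ROC area and sensitivity at a specificity matched to the visual criterion, can be illustrated with a minimal Python sketch. This is not the spreadsheet or True Epistat procedure the authors used; the function names and the toy data are hypothetical, and a larger score is assumed to indicate a more abnormal test.

```python
def roc_area(scores, labels):
    """ROC area as the Mann-Whitney statistic: the fraction of
    (severe, non-severe) pairs ranked correctly, ties counted as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(scores, labels, target_specificity):
    """Return (cut, sensitivity, specificity) for the lowest cut point
    whose specificity reaches the target; abnormal means score >= cut."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    for cut in sorted(set(scores)):
        specificity = sum(n < cut for n in neg) / len(neg)
        if specificity >= target_specificity:
            sensitivity = sum(p >= cut for p in pos) / len(pos)
            return cut, sensitivity, specificity
    return None  # no cut point reaches the requested specificity


# Hypothetical toy data: ST depression magnitude (mm) and disease status.
toy_scores = [0.0, 0.5, 1.0, 1.5, 2.0, 0.2, 1.2, 2.5]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]
print(roc_area(toy_scores, toy_labels))
print(sensitivity_at_specificity(toy_scores, toy_labels, 0.75))
```

Matching specificity before comparing sensitivity is what lets a continuous computer measurement be compared fairly with the fixed 1-mm and 2-mm visual criteria.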
RESULTS
Population Characteristics
The mean age of this male population was 60 (±10) years. Of the total, 42% had diagnostic Q waves and/or history of an MI. Thirty percent of the total population had severe angiographic coronary disease and 30% had no significant disease according to the stated definitions. Of the 30% with severe disease, 60% (429/719) had three-vessel disease and 17% (125/719) had left main disease. Age, presenting chest pain, diabetes, currently smoking, MI score, family history of coronary disease, and abnormal resting ECG were significantly different between those with and without severe angiographic coronary disease. Four and a half percent were receiving digoxin and 35% were receiving β-blockers with no differences between those with or without severe disease. Nine percent had peripheral vascular disease, 5% had a history of congestive heart failure, 4% had a history of a stroke, and 6% had COPD with no differences between the groups. Table 1 lists all of the other important clinical variables.
Postexercise Test Hemodynamic, Non-ECG, and Visual ECG Results
Table 2 compares the exercise test data between those with and without any severe angiographic coronary disease. The Duke treadmill angina score, the prevalence of abnormal ST depression, and all of the hemodynamic measurements were significantly different.

Table 1-Clinical Characteristics*
Variables | No Severe CAD (n = 1,666) | Severe CAD (n = 719) (30%) | p Value
Age, yr | 58 ± 10 | 62 ± 9 | < 0.0001
Typical angina | 115 (7%) | 357 (53%) | < 0.0001
MI score (0-3) | 0.8 ± 1.1 | 1.2 ± 1.2 | < 0.0001
Hypercholesterolemia | 572 (34%) | 278 (39%) | 0.04
Diabetes | 233 (14%) | 137 (19%) | 0.002
Abnormal resting ECG | 533 (32%) | 304 (42%) | < 0.0001
Resting ST depression, mm | 0.08 ± 0.22 | 0.12 ± 0.25 | 0.0003
Currently smoked | 598 (36%) | 208 (29%) | 0.001
BMI, kg/m2 | 27.8 ± 4.8 | 28.1 ± 14.3 | NS
Family history of CAD | 733 (44%) | 264 (37%) | 0.001
Hypertension | 857 (51%) | 390 (54%) | NS
*Data are presented as mean ± SD or No. (%) of subjects. NS = nonsignificant.
Table 2-Exercise Test Results*
Variables | No Severe CAD (n = 1,666) | Severe CAD (n = 719) (30%) | p Value
Maximal HR, beats/min | 128 ± 24 | 120 ± 22 | < 0.0001
Delta HR, beats/min | 50 ± 22 | 44 ± 19 | < 0.0001
Maximal systolic BP, mm Hg | 167 ± 29 | 157 ± 29 | < 0.0001
Delta systolic BP, mm Hg | 42 ± 25 | 30 ± 25 | < 0.0001
Maximal double product, ×1,000 | 21.6 ± 6.4 | 19.1 ± 5.6 | < 0.0001
Delta double product, ×1,000 | 12.0 ± 5.9 | 9.3 ± 3.2 | < 0.0001
METs | 7.4 ± 3.1 | 6.1 ± 2.5 | < 0.0001
Angina reason stopped test | 183 (11%) | 133 (21%) | < 0.0001
Abnormal ST depression | 489 (29%) | 443 (62%) | < 0.0001
*Data are presented as mean ± SD or No. (%) of subjects.
ST Criteria Performance and Validation
Visual analysis was found to have a ROC area of 0.70. The computerized ST measurements with the highest discriminating power, those whose ROC area fell within the 95% confidence interval associated with visual analysis, are listed in Table 3. They included the sum of the depression at ST60 in II, V5, and V2, the most ST60 depression in these three leads, HR index (ST60 or ST0 V5), and ST60 and slope in V5 at 3.5 min of recovery. Measurements made at 3.5 min of recovery and in V5 predominated compared with other leads or at other time points. Thus, only these 11 measurements out of the 100 that were calculated by the exercise ECG analysis program had ROC curve areas > 0.65. While several of the ST time areas that are part of the Hollenberg score had ROC curve areas comparable to visual analysis, the Hollenberg score itself had a ROC area of 0.68 (sensitivity of 54% at a specificity of 71% and a sensitivity of 32% at a specificity of 89%). The independent ST slope and amplitude areas are not listed since their complexity for computation exceeds that of the other measurements. In addition, the sensitivity of the measurements at specificity of 71% to match 1 mm criteria and specificity of 89% for 2 mm criteria, matching visual analysis, are also listed. R-wave adjustment failed to improve on the ROC areas.
Prediction Score Development
Four scores were developed using stepwise logistic regression to predict the presence of severe disease (Appendix). The scores were developed considering that some clinicians prefer to use a maximal exercise ST measurement rather than one from recovery. For the recovery ST measurement to have the same diagnostic characteristics as it did in this study, exercise must be stopped abruptly (no cool down walk performed) and the patient placed in supine position postexercise. The appropriate values are inserted into the following logistic regression formula to calculate an estimate of the probability for angiographic coronary disease:
$$\text{Probability (0 to 1)} = \frac{1}{1 + e^{-(a + bx + cy + \cdots)}}$$
where a = intercept, b and c are coefficients, and x and y are variable values.
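As a minimal sketch of how such a score could be applied by computer, the Python function below evaluates the logistic formula for an arbitrary set of variables. The variable names, intercept, and coefficients are hypothetical placeholders chosen only for illustration; the fitted values are those given in the Appendix and are not reproduced here.

```python
import math


def severe_cad_probability(intercept, coefficients, values):
    """Evaluate 1 / (1 + exp(-(a + b*x + c*y + ...))).

    coefficients and values are dicts keyed by variable name.
    """
    linear = intercept + sum(
        coefficients[name] * values[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients and patient, NOT the published equation.
example_coefficients = {"age_years": 0.03, "st_depression_mm": 0.6}
example_patient = {"age_years": 60, "st_depression_mm": 1.0}
example_probability = severe_cad_probability(
    -3.0, example_coefficients, example_patient
)
print(round(example_probability, 3))
```

With a linear term of zero the function returns 0.5, which is a quick check that the sign convention in the exponent is correct.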
Prediction Equation Performance and Validation
Probabilities of severe coronary artery disease (CAD) generated using the scores were plotted as ROC curves (Fig 1) and the areas under the curve calculated (Table 4). The areas under the ROC curves for three of the four prediction equations were significantly larger than those of visual analysis or the best computer measurements (p < 0.0001). Additionally, sensitivities were calculated for each prediction equation using the specificity comparable to that of visual analysis using a 1-mm (71%) and 2-mm (89%) criterion for a positive test. This required that probabilities of severe CAD be calculated for the prediction equations that match the specificity for visual criteria (see cut point column). Predictive accuracy, which is the percentage of patients correctly classified as having or not having severe disease, was also calculated and presented. Since the differences were only 2% or 3%, this means that only 2 or 3 more patients out of 100 were correctly classified by the equations. Thus, as can be seen in Table 4, all three models were minimally superior to solitary ST measurements either visual or by computer, and were equivalent to one another in accuracy. Even though the correlation between exercise-induced ST depression and the Duke angiographic score was greater with patients with MI excluded (-0.32 vs -0.41, p < 0.01), the equation developed in these patients separately did not discriminate significantly better (ROC = 0.78) as shown in the bottom row of Table 4.
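The sensitivity, specificity, and predictive accuracy reported for visual ST analysis can be checked from the counts in Table 2. The short Python sketch below does this arithmetic, assuming the "Abnormal ST depression" row of Table 2 reflects the 1-mm visual criterion; the function name is illustrative only.

```python
def diagnostic_characteristics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, predictive accuracy)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy


# Counts from Table 2: 443 of 719 patients with severe CAD and
# 489 of 1,666 patients without severe CAD had abnormal ST depression.
sens, spec, acc = diagnostic_characteristics(
    tp=443, fn=719 - 443, fp=489, tn=1666 - 489
)
print(round(sens * 100), round(spec * 100), round(acc, 2))
```

These counts reproduce the 62% sensitivity, 71% specificity, and 0.68 predictive accuracy quoted for the 1-mm visual criterion, so a gain of 0.02 to 0.03 in predictive accuracy corresponds to 2 or 3 additional correct classifications per 100 patients.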

Table 3-The Diagnostic Characteristics of the Computerized ST Measurements With Results Comparable to Visual Analysis With Sensitivity at a Cut Point Associated With a Specificity Matching the 1- and 2-mm Criteria for Visual Analysis (71% Specificity and 89% Specificity, Respectively)
ST Measurement | ROC Area (±1 SE) | Sensitivity, % (at 1-mm/2-mm specificity) | Cut Point To Match Visual Specificity at 1- and 2-mm Criteria
V5 slope 3.5 min recovery | 0.70 ± 0.01 | 59/33 | 0.076 mV/ms / -0.24 mV/ms
V5 slope 5 min recovery | 0.69 ± 0.01 | 58/34 | 0.024 mV/ms / -0.28 mV/ms
V5 ST60 max exercise | 0.66 ± 0.01 | 55/31 | -0.115 mV / -0.195 mV
V5 ST60 1 min recovery | 0.69 ± 0.01 | 59/34 | -0.055 mV / -0.126 mV
V5 ST60 3.5 min recovery | 0.71 ± 0.01 | 63/36 | -0.051 mV / -0.102 mV
V5 ST60 5 min recovery | 0.70 ± 0.01 | 60/34 | -0.046 mV / -0.097 mV
V5 ST integral 3.5 min recovery | 0.69 ± 0.01 | 61/34 | -0.058 mV / -0.103 mV
Sum ST60 3.5 min recovery | 0.70 ± 0.01 | 61/38 | -0.073 mV / -0.150 mV
Most ST60 3.5 min recovery | 0.71 ± 0.01 | 62/35 | -0.048 mV / -0.096 mV
ST/HR index | 0.67 ± 0.01 | 56/34 | -0.0021 mV/beats/min / -0.0036 mV/beats/min
Visual ST analysis | 0.70 ± 0.01 | 62/36 | 1 mm / 2 mm

Effect of Medications and Resting ECG Abnormalities
As seen in the second row of Table 5, β-blocker administration did not affect the characteristics of the standard visual criteria. However, digoxin (see third row) significantly lowered the predictive accuracy of the test for severe disease. The clinical indication for digoxin was not known and the condition for which it was prescribed could affect the ST response. The exclusion of all patients with resting ECG abnormalities (LVH, T-wave inversion, ST depression) as well as digoxin use restored the discriminating power of standard visual analysis of the test; however, the ROC areas of the prediction equations were not adversely affected. While the prevalence of severe disease was not affected by the medication status (column three, Table 5), patients with normal resting ECGs had a lower prevalence of severe disease (26 to 23% compared with 36 to 37%). The computer classification of resting ST depression confirmed the visual ECG classification results by obtaining the same dichotomy: degradation of the visual exercise ST analysis but maintenance of the discrimination using the equations. This is explained by a calibration shift in the subpopulation formed by the presence of ST depression and means that patients with these ECG abnormalities need not be referred directly for other stress testing modalities such as nuclear or echocardiography.
Strategy for Discrimination in Clinical Practice
Given the limited ability to classify patients correctly even using the equations, an attempt was made to divide the patients into three groups with different risk. In order to apply the prediction equations to stratify the population into groups with low, intermediate, and high risk, various combinations of probabilities of CAD were assessed to see
which would best allow for appropriate categorization. Using a probability of 0.20 or below to constitute low risk, 0.20 to 0.40 to constitute intermediate risk, and 0.40 or above to constitute high risk, the population was divided roughly into thirds. These thresholds could be set to other values for specific managed care objectives. However, applying them in our population and following a practice strategy that all intermediate-group patients would go on to further testing and be correctly classified resulted in a sensitivity of 86% and a specificity of 85% (see Appendix for sample calculation). This strategy gave the same results with the various equations as well as in the three subpopulations with different disease prevalence.
DISCUSSION
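The three-group strategy reported in the Results can be sketched in Python as follows. This is an illustration only, not the Appendix calculation: the probabilities and outcomes are hypothetical, boundary values are assigned to the low and high groups as an assumption, and further testing in the intermediate group is assumed to classify every patient correctly, as in the strategy described.

```python
def stratify(probability, low=0.20, high=0.40):
    """Assign a risk group from a predicted probability of severe CAD."""
    if probability <= low:
        return "low"
    if probability >= high:
        return "high"
    return "intermediate"


def strategy_performance(probabilities, has_severe_cad, low=0.20, high=0.40):
    """Sensitivity and specificity when low = negative, high = positive,
    and intermediate patients are resolved correctly by further testing.
    Both disease groups must be present or a division by zero occurs."""
    tp = fn = fp = tn = 0
    for p, severe in zip(probabilities, has_severe_cad):
        group = stratify(p, low, high)
        predicted = severe if group == "intermediate" else group == "high"
        if predicted and severe:
            tp += 1
        elif predicted:
            fp += 1
        elif severe:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical patients: predicted probability and angiographic outcome.
toy_probabilities = [0.10, 0.15, 0.25, 0.35, 0.45, 0.60, 0.18, 0.50]
toy_outcomes = [False, False, False, True, True, True, True, False]
toy_sensitivity, toy_specificity = strategy_performance(
    toy_probabilities, toy_outcomes
)
print(toy_sensitivity, toy_specificity)
```

Errors under this strategy arise only in the low and high groups, which is why widening the intermediate band raises both sensitivity and specificity at the cost of more referrals for further testing.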
ROC Analysis
The ROC curves in this article were plotted to describe the exercise ECG as a discriminatory tool for severe coronary disease. These plots are particularly useful as a means to compare measurements without defined cut points (that is, computerized ST measurements and scores). These measurements and scores were then compared on the basis of the area under their respective ROC curves. The ROC curves also enable adjusting the cut points of the measurements or scores to stratify patients as we found necessary. In order to compare the number or percentage of patients correctly classified, predictive accuracy is more appropriate.

FIGURE 1. ROC curves of two prediction equations, visual analysis, and the best computerized measurements. Because of the fewer ST points measured by physicians (usually rounding off to the millimeter) as compared with computer measurements, the area formed by visual analysis is always less than computer measurements, putting the visual analysis at a disadvantage. (Plot titled "Severe Disease Prediction"; y-axis: sensitivity; x-axis: 1 - specificity; curves: visual ST, predictive equation using visual ST, computer V5 ST60 recovery, and predictive equation using computer recovery V5 ST60.)

Meta-analysis of Studies Predicting Angiographic Severity
In order to assure that our results were comparable to those previously demonstrated in the literature, we reviewed available meta-analyses. One meta-analysis considered 60 consecutively published reports comparing exercise-induced ST depression with coronary angiographic findings.18 The 60 reports included 62 distinct study groups comprising 12,030 patients who underwent both tests. Wide variability in sensitivity and specificity was found (mean sensitivity 86% [range, 40 to 100%]; mean specificity 53% [range, 17 to 100%]) for left main or triple-vessel disease. Hartz et al19 compiled results from the literature on the use of the exercise test to identify patients with severe CAD. One millimeter criteria averaged a sensitivity of 75% and a specificity of 66% while 2 mm criteria averaged a sensitivity of 52% and a specificity of 86%. There was great variability among the studies examined in the estimated sensitivity and specificity of a given criterion for severe CAD and these reviewers could not explain the reported variation. Our results with exercise-induced ST depression alone (62% sensitivity and 71% specificity for 1 mm and 36% sensitivity and 89% specificity for 2 mm) are within the results reported by these reviews, but the sensitivity is relatively low. This is due both to our population and the criteria for severity utilized.

Table 4-Comparison of the Three Predictive Equations With Reference to Visual Analysis and the Single Best Computer Measurement (ST60 V5 Recovery)*
Measurement | Cut Point (1 mm/2 mm match) | Sensitivity, % | Predictive Accuracy | ROC Area
Visual ST | 1 mm/2 mm | 62/36 | 0.68/0.73 | 0.70
V5 ST60 3.5 min recovery | -0.051/-0.102 mV | 63/36 | 0.69/0.73 | 0.71
PE with visual ST | 0.31/0.48 | 67/41 | 0.70/0.75 | 0.76
PE with ST/HR index | 0.32/0.46 | 64/33 | 0.69/0.72 | 0.72
PE with V5 ST60 3.5 min recovery | 0.31/0.45 | 65/41 | 0.69/0.75 | 0.75
PE with V5 ST60 3.5 min recovery in no MI subgroup | 0.23/0.40 | 71/44 | 0.71/0.75 | 0.78
*PE = predictive equations or scores; note that the cut points for calculated probability of coronary disease average out to be 0.32 to match the specificity obtained with 1 mm using visual analysis and 0.46 to match the 2-mm criterion. Sensitivity is determined at cut points to match the specificity of 1 and 2 mm ST visual criteria. Predictive accuracy equals the percent of correct classifications ([TP + TN]/[TP + TN + FP + FN]). Since the scores only increased the predictive accuracy by 2% or 3%, that means they only result in 2 or 3 more patients out of 100 being correctly classified than simple ST measurements.

Table 5-The Effect of Medications and Resting ECG Abnormalities on Discriminatory Characteristics of Simple ST Analysis and Two of the Prediction Equations*
Group | No. | Prevalence Severe CAD, % | % Abnormal by 1 mm Visual ST | Sensitivity, % | Specificity, % | Predictive Accuracy | ROC Visual ST | ROC of PE With Visual ST | ROC of PE With Computer Recovery ST
Total population | 2,385 | 30 | 39 | 62 | 71 | 0.68 | 0.70 | 0.76 | 0.75
Receiving β-blockers | 813 | 31 | 37 | 61 | 73 | 0.69 | 0.71 | 0.75 | 0.74
Receiving digoxin | 108 | 34 | 58 | 62 | 44 | 0.50 | 0.56 | 0.67 | 0.69
Visual rest ECG: abnormal resting ECG | 837 | 36 | 53 | 71 | 55 | 0.61 | 0.68 | 0.75 | 0.74
Visual rest ECG: no digoxin or abnormal rest ECG | 1,517 | 26 | 31 | 57 | 75 | 0.70 | 0.69 | 0.76 | 0.76
Computer rest ECG: rest ST depression | 1,019 | 37 | 48 | 74 | 55 | 0.62 | 0.69 | 0.74 | 0.73
Computer rest ECG: no rest ST depression | 1,181 | 23 | 31 | 48 | 83 | 0.75 | 0.68 | 0.76 | 0.75
*Total population results are provided for comparison; visual ECG results are for standard interpretation of the resting ECG; abnormal resting ECG includes LVH, T-wave inversion, and resting ST depression; computer ECG separates the patients according to computer measurements of the ST segment at rest; PE = predictive equations using either a visual or a computerized exercise ST measurement; % abnormal = percent of abnormal exercise tests by visual ST criteria; Prev CAD = prevalence of severe disease on the angiograms; while the discriminating power of the visual analysis is degraded by the resting ECG abnormalities (predictive accuracy and ROC areas lessened), the equations still discriminate severe disease.

Multivariable Analysis
Other studies have shown that prediction equations are more accurate than ST measurements alone for identifying patients with severe coronary disease. A recent review has summarized all of the studies that included multivariable analysis to predict severe angiographic disease.2 Similar to other studies, we found that age, chest pain symptoms, history of hypercholesterolemia, the Duke treadmill angina score, and the MI score were good predictors for the presence of severe disease. Medication status did not affect the predictions, which is consistent with previous findings.20 Among the exercise hemodynamic and ECG variables, ST segment depression and double product have been good predictors in other studies. There has been controversy regarding whether other testing modalities should be used when ECG abnormalities such as LVH or resting ST depression are present.21 Our current results, consistent with our previous findings,22 suggest that the
standard exercise test is effective in patients with these ECG abnormalities for identifYing severe disease, particularly when scores are used. Otherwise, the scores were disappointing in their discriminating ability. They only improved predictive accuracy by l% or 2% compared with simple visual or computerized measurements. Severe Coronary Disease Ideally a generalist would like to treat most patients with coronary disease and refer only those who require interventions by a specialist to improve their quality or quantity of life. While quality of life issues must be a matter for the patient and physician to determine individually, there are rough guidelines about how to affect quantity. The coronary artery bypass graft surgery trialists carried out a systematic overview using individual patient data from the seven randomized trials that compared a strategy of initial coronary artery bypass surgery with one of initial medical therapy to assess the effects on mortality in patients with stable coronary heart disease. 15 The absolute benefits of bypass surgery were most pronounced in patients in the highest-risk categories, including those with proximal left anterior descending disease. Our classification of severe coronary disease matched the angiographic criteria that they determined. Sensitivity for Severe CAD Though low compared with the mean sensitivity for predicting severe angiographic coronary disease from the meta-analysis, our findings are within the CHEST I 114 I 5 I NOVEMBER, 1998
reported range and reflect how the standard exercise test performs in many clinical populations. It would be desirable to calculate the test performance of 0.5 mm ST depression determined visually, but measurements < 1 mm are not made reliably or consistently. When we reviewed our data, there were not sufficient 0.5-mm measurements made by our interpreters to demonstrate an increase in test sensitivity. This criterion certainly would yield a sensitivity that would be more acceptable to physicians for identifying severe coronary disease. The fact that criteria associated with acceptable test characteristics (ie, sensitivity of 85% or higher) can be chosen from ROC analysis and implemented for computer measurements or multivariable probability equations gives them a distinct advantage over routine visual analysis.

Application of Prediction Equations to Clinical Practice

Our ultimate goal was to find a way for generalists to adequately risk-stratify their patients according to whether they would require the services of a specialist. We hoped to find a noninvasive and inexpensive method that would be simple to implement. The exercise ECG, both with and without computer-aided analysis, was found to be limited as a discriminating tool. This is no surprise to those physicians who use the test frequently, although for patients in whom it either rules in or rules out severe disease, it is quite helpful. Traditionally, subsequent efforts to diagnose severe coronary disease have included nuclear imaging studies, stress echo, and catheterization, all of which are costly and require expert interpretation. While prediction equations do not supplant these studies, they can potentially narrow the group of patients who will ultimately need more complex evaluations. Exercise test data are critical parts of the prediction equations. This study suggests that computerized analysis, while not superior to visual analysis, is equivalent.
Therefore, computer programs that both analyze the exercise ECG and subsequently calculate prediction equation results can be a valuable tool to the general practitioner faced with caring for cardiac patients. Similar to the use of the ventilation/perfusion (V/Q) scan in the diagnosis of pulmonary embolism, exercise test data can be combined with clinical characteristics to allow for risk stratification of patients presenting with possible or probable coronary disease. A probability generated from a prediction equation of < 20% or > 40% allows for a definitive answer to the diagnostic question, much in the same way a normal V/Q scan and a high-probability V/Q scan do. However, a probability between 20% and
40% mandates further testing, similar to an intermediate-probability scan. These thresholds could be set to other numeric values for specific managed care objectives. If the generalist applies the strategies specified in the appendix utilizing computerized clinical and exercise test data, a sensitivity and specificity of approximately 85% can be expected even if the patient has resting ST depression.
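
The three-way split described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `stratify` and its parameter names are hypothetical, the 0.20 and 0.40 defaults are the thresholds quoted in the text, and the handling of a probability that falls exactly on a threshold (treated here as low or high rather than intermediate) is an assumption.

```python
def stratify(probability, low_cutoff=0.20, high_cutoff=0.40):
    """Assign a predicted probability of severe CAD to a risk group.

    Hypothetical helper: the cutoffs default to the 20% and 40%
    thresholds quoted in the text and can be changed for other
    objectives. Treatment of exact boundary values is an assumption.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability <= low_cutoff:
        return "low"           # severe disease considered unlikely
    if probability >= high_cutoff:
        return "high"          # candidate for angiography
    return "intermediate"      # further testing needed


if __name__ == "__main__":
    for p in (0.10, 0.30, 0.55):
        print(p, stratify(p))
```

Keeping the cutoffs as parameters mirrors the remark that the thresholds could be set to other numeric values.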
CONCLUSIONS
The major limitations of this study are the lack of women (because of their low prevalence in the populations studied), the retrospective design, and the failure to remove workup bias. Gender is always chosen in prediction equations developed in mixed populations, and so the equations we compiled are not valid in women. Many of our findings have clinical relevance, however. We did not confirm the findings of previous studies that heart rate and R-wave amplitude adjustment or computerized measurements and scores are superior to visual analysis. Instead, we found that computerized measurements of ST amplitude in V5 60 ms after QRS end at 3.5 min of recovery were able to match visual analysis performed by expert electrocardiographers. We confirmed that β-blockers do not affect the diagnostic characteristics of the exercise ECG and that resting ST depression decreases specificity. Nevertheless, the general practitioner need not stop treatment with β-blockers or exclude patients with resting ST depression from standard exercise testing. Finally, we demonstrated that setting probability limits using the scores, which divided the patients into high (≥ 40% probability), low (≤ 20% probability), and intermediate probability groups, enabled our management strategy to have a sensitivity and specificity approximating 85%.
APPENDIX

Intercept, variables, and their coefficients considering visual analysis:

-2.18 + 0.025 * Age + 0.30 * Typical angina + 0.21 * Hypercholesterolemia + 0.28 * Q wave score + 0.30 * Angina major reason test stopped - 0.07 * Maximal double product + 0.72 * ST depression (in millimeters)

Intercept, variables, and their coefficients considering computer analysis of recovery ECG data:

-2.43 + 0.03 * Age + 0.29 * Typical angina + 0.26 * Hypercholesterolemia + 0.26 * Q wave score + 0.44 * Angina major reason test stopped - 0.052 * Maximal double product - 7.37 * ST amplitude ST60 V5 at 3.5 min recovery (in millivolts)

Intercept, variables, and their coefficients considering computer analysis of exercise ECG data:

-3.02 + 0.04 * Age + 0.37 * Typical angina + 0.26 * Hypercholesterolemia + 0.26 * Q wave score + 0.51 * Angina major reason test stopped - 0.04 * Maximal double product - 81.28 * ST amplitude ST60 V5 at maximal exercise divided by delta HR

Intercept, variables, and their coefficients considering computer analysis of recovery ECG data in patients without a prior MI (1,384 patients with 24% severe CAD):

-2.03 + 0.02 * Age + 0.53 * Typical angina + 0.35 * Hypercholesterolemia + 0.49 * Diabetes - 0.08 * Maximal double product - 7.77 * ST amplitude ST60 V5 at 3.5 min recovery (in millivolts)

Variable definitions for calculations:

Visual ST: Maximal visual ST depression in exercise or during recovery. ST was recorded in absolute millimeters at the J junction if ST depression was at least 0.5 mm horizontal or downsloping or at least 2 mm upsloping.
ST60: ST60 amplitude in negative millivolts (60 ms after QRS end); ST60 amplitude in V5 at 3.5 min in recovery in negative millivolts.

Calculation of the sensitivity and specificity obtained using the predictive equation with visual analysis by applying the strategy that the indeterminate group are all referred for further testing and are assumed to be diagnosed correctly (all true results).

Stratification Group    True Positives    False Negatives    True Negatives    False Positives
Low risk (≤ 0.20)       -                 98                 803               -
Intermediate risk       257               -                  568               -
High risk (≥ 0.40)      363               -                  -                 290
Total                   620               98                 1,371             290

Sensitivity = 620/(620 + 98) = 86%; Specificity = 1,371/(1,371 + 290) = 85%

Strategy for Using Risk Categories for Severe Coronary Disease to Obtain Optimal Test Characteristics†

Low risk: May have disease but unlikely to have severe CAD; no need for procedures.
Intermediate risk: Require other tests, possibly including stress echo, nuclear, or angiography, to clarify severity of CAD.
High risk: If clinically appropriate, patient should be informed and undergo angiography in consideration for an intervention.

*These thresholds could be set to other values for specific managed care objectives.
†Severe coronary disease = 50% or greater occlusion in two vessels if one is proximal left anterior descending, three vessel, or left main disease.
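
As an illustration of how the visual-analysis equation could be applied by computer, the following Python sketch evaluates the first equation of the appendix. Several points are assumptions rather than statements from this appendix: the dictionary keys and function names are hypothetical, the coding and units of each variable (for example, how double product is scaled or how the Q wave score is computed) must follow the study definitions, and the conversion of the score to a probability assumes the coefficients are on a logistic (log-odds) scale.

```python
import math

# Intercept and coefficients copied from the visual-analysis equation.
VISUAL_INTERCEPT = -2.18
VISUAL_COEFFICIENTS = {
    "age": 0.025,
    "typical_angina": 0.30,
    "hypercholesterolemia": 0.21,
    "q_wave_score": 0.28,
    "angina_stopped_test": 0.30,
    "max_double_product": -0.07,
    "st_depression_mm": 0.72,
}


def visual_score(values):
    """Return the linear score: intercept plus coefficient * value.

    `values` maps each (hypothetical) key above to a number coded
    as in the study; any missing key raises KeyError.
    """
    missing = set(VISUAL_COEFFICIENTS) - set(values)
    if missing:
        raise KeyError(f"missing variables: {sorted(missing)}")
    return VISUAL_INTERCEPT + sum(
        coef * values[name] for name, coef in VISUAL_COEFFICIENTS.items()
    )


def score_to_probability(score):
    """Map a score to a probability, ASSUMING a logistic model."""
    return 1.0 / (1.0 + math.exp(-score))
```

The resulting probability would then be compared with the low-risk and high-risk thresholds listed in the stratification table.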
REFERENCES

1 Del Campo J, Do D, Umann T, et al. Comparison of computerized and standard visual criteria of exercise ECG for diagnosis of coronary artery disease. Ann Noninvasive Electrocardiogr 1996; 1:430-442
2 Yamada H, Do D, Morise A, et al. Review of studies utilizing multi-variable analysis of clinical and exercise test data to predict angiographic coronary artery disease. Prog Cardiovasc Dis 1997; 39:457-481
3 Willems JL, Abreu-Lima C, Arnaud P, et al. The diagnostic performance of computer programs for the interpretation of ECGs. N Engl J Med 1991; 325:1767-1773
4 Hubbard B, Gibbons R, Lapeyre A, et al. Identification of severe coronary artery disease using simple clinical parameters. Arch Intern Med 1992; 152:309-312
5 Ustin J, Umann T, Froelicher V. Data management: a better approach. Physicians Comput 1994; 12:30-33
6 Froelicher V, Shiu P. Exercise test interpretation system. Physicians Comput 1996; 14:40-44
7 Wolthuis R, Froelicher VF, Fischer J, et al. New practical treadmill protocol for clinical use. Am J Cardiol 1977; 39:697-700
8 Myers J, Buchanan N, Walsh D, et al. A comparison of the ramp versus standard exercise protocols. J Am Coll Cardiol 1991; 17:1334-1342
9 Myers J, Do D, Herbert W, et al. A nomogram to predict exercise capacity from a specific activity questionnaire and clinical data. Am J Cardiol 1994; 73:591-596
10 Mark D, Hlatky M, Harrell F, et al. Exercise treadmill score for predicting prognosis in coronary artery disease. Ann Intern Med 1987; 106:793-800
11 Reid M, Lachs M, Feinstein A. Use of methodological standards in diagnostic test research. JAMA 1995; 274:645-651
12 Fletcher GF, Balady G, Froelicher VF, et al. Exercise standards: a statement for healthcare professionals from the American Heart Association Writing Group. Circulation 1995; 91:580-615
13 Okin PM, Kligfield P. Heart rate adjustment of ST segment depression and performance of the exercise electrocardiogram: a critical evaluation. J Am Coll Cardiol 1995; 25:1726-1735
14 Hollenberg M, Budge WR, Wisneski JA, et al. Treadmill score quantifies the ECG response to exercise and improves test accuracy and reproducibility. Circulation 1980; 61:276-285
15 Yusuf S, Zucker D, Peduzzi P, et al. Effect of coronary artery bypass graft surgery on survival: overview of 10-year results from randomized trials by the Coronary Artery Bypass Graft Surgery Trialists Collaboration. Lancet 1994; 344:563-570
16 Califf RM, Phillips HR, Hindman MC, et al. Prognostic value of a coronary artery jeopardy score. J Am Coll Cardiol 1985; 5:1055-1063
17 Ribisl PM, Liu J, Mousa I, et al. A comparison of computer ST criteria for diagnosis of severe coronary artery disease. Am J Cardiol 1993; 71:546-551
18 Detrano R, Gianrossi R, Mulvihill D, et al. Exercise-induced ST segment depression in the diagnosis of multivessel coronary disease: a meta analysis. J Am Coll Cardiol 1989; 14:1501-1508
19 Hartz A, Gammaitoni C, Young M. Quantitative analysis of the exercise tolerance test for determining the severity of coronary artery disease. Int J Cardiol 1989; 24:63-71
20 Herbert WG, Lehmann KG, Dubach P, et al. Effect of beta blockade on the exercise ECG: ST level versus delta ST/HR index. Am Heart J 1991; 122:993-1000
21 Gibbons RJ, Balady GJ, Beasley JW, et al. ACC/AHA guidelines for exercise testing: a report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines (Committee on Exercise Testing). J Am Coll Cardiol 1997; 30:260-311
22 Miranda CP, Lehmann KG, Froelicher VF. The effect of resting ST-depression, digitalis, and left ventricular hypertrophy on the standard exercise test. Am Heart J 1991; 122:1617-1628
CHEST / 114 / 5 / NOVEMBER, 1998