Quality control in exercise testing


Quality Control in Exercise Testing BRUCE A. STAATS, M.D. STEPHEN F. GRINTON, M.D. CARL D. MOTTRAM, A.S. DAVID J. DRISCOLL, M.D. KEN C. BECK, Ph.D. Departments of Internal Medicine, Pediatrics, and Physiology Mayo Clinic, Mayo Foundation and Mayo Graduate School of Medicine Rochester, Minnesota and Jacksonville, Florida

Cardiopulmonary exercise testing is a technically demanding procedure. Most current testing systems are self-contained and can be configured to allow relatively little operator interface during the procedure. Internal quality control procedures are convenient but limited. In a busy laboratory, one can rely too heavily on the internal quality control software and not critically evaluate the actual data. In this paper, we will discuss potential errors in exercise testing and offer suggestions to avoid them. Because manufacturers are continually introducing new models and improving old ones, we have not compared specific exercise testing systems. We advise laboratory directors and technical personnel to schedule time for regular routine quality control checks and evaluation of the quality of exercise testing data.

STRIP CHART RECORDER Before automation, exercise laboratories often relied on Douglas bags or mixing boxes to collect expired gas. On-line data from these tests were displayed on a strip chart recorder and hand calculations were performed later. Our laboratory was developed during this era. Although we now have a modern exercise system with software to provide for lengthy computations and quality control checks, we routinely inspect the original strip chart signals before, during, and after testing. In this way the operator can recognize errors as the data are collected. For example, this monitoring allows for the detection of hyperventilation, and the patient can be coached immediately to develop a more normal ventilatory pattern. Gas leaks can be recognized and remedied, and erroneous heart rates can be corrected. We believe that a strip chart recorder is an important quality control device that should be incorporated into all diagnostic exercise testing systems. Some computerized systems have an analog display of digital signals, closely reproducing the output of a strip chart recorder. This may be sufficient, but a free-standing recorder is preferable to display original signals that are independent of the computer outputs.

Address correspondence to Bruce A. Staats, M.D., Mayo Clinic and Foundation, 200 First Street, SW, Rochester, MN 55905.

Prog Pediatr Cardiol 1993; 2(2):11-17. Copyright © 1993 by Andover Medical

CYCLE ERGOMETER AND TREADMILL Current cycle ergometers are braked mechanically or electronically. With a mechanically braked cycle, resistance is generated by friction. The faster a subject pedals, the greater the power output. A metronome is used to help maintain a constant pedaling frequency. With an electronically braked cycle, the operator sets a prescribed power level, and the control circuits of the ergometer adjust the resistance according to the pedaling frequency. If the patient pedals slowly, resistance increases to match the prescribed power. If the patient pedals rapidly, resistance decreases. The power outputs of ergometers are generally accurate at pedaling rates of 40 to 120 revolutions per minute. Pedaling outside this range produces more or less power than prescribed.

Cycle ergometers are calibrated statically or dynamically. With static calibration, weights are connected to the end of a rod attached to the pedal crank. Dynamic calibration is performed using an electronic torque generator (Quinton Instrument Company or a locally fabricated device¹) attached directly to the pedal crank. Wilmore et al.² have shown that cycle ergometers can be up to 10% in error. It is advisable to check the calibration of the ergometer every 3 to 6 months. Periodic measurement of the oxygen uptake of several laboratory subjects at given power outputs (a biologic quality control test described in the section on Overall Variability of Exercise Measurements) is a convenient way to check the stability of the ergometer calibrations.

Treadmills are used in many exercise laboratories. Power output is increased by changes in speed and elevation. With treadmill tests, it is not possible to determine absolute workloads or power outputs because of individual variations in body weight and muscular efficiency. To estimate work efficiency (external power output/oxygen equivalence of power), the patient must exercise at known power outputs. This requires a cycle ergometer. The mechanical accuracy of treadmills should be verified every 3 to 6 months by checking belt speeds and elevations. Biologic quality control testing can also be performed on the treadmill.

For pediatric and adult exercise testing, protocols should be selected to reflect individual body size and fitness.
If workload increments are too large, patients can experience leg fatigue early in the test before reaching a maximal cardiopulmonary effort. If the increments are too small, the test will take too long. The ideal duration of a diagnostic exercise test is from 8 to 12 minutes, including the warm-up time. The saddle of the cycle ergometer should be sufficiently high to allow for a small angle of flexion at the knee when the foot is at the low point of the pedaling arc with the sole of the shoe parallel to the floor. A low saddle height, often chosen by an individual who does not ride a bicycle, flexes the knee and hip excessively, and quadriceps muscle fatigue will then occur before a maximal cardiopulmonary effort is reached. In treadmill testing, the patient should not lean on the support rails because the observed power output and oxygen uptake will be less than the protocol would predict.
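The way an electronically braked ergometer trades resistance against pedaling rate, and the torque applied during a static calibration, can be sketched numerically. This is a minimal illustration that assumes only the physical relation power = torque × angular velocity; the function names, the 100 W example, and the 0.5 m rod length are hypothetical and do not describe any particular ergometer.

```python
import math

G = 9.80665  # standard gravity, m/s^2


def required_torque_nm(power_w: float, cadence_rpm: float) -> float:
    """Crank torque (N·m) needed to hold a prescribed power at a given cadence."""
    if cadence_rpm <= 0:
        raise ValueError("cadence must be positive")
    angular_velocity = 2.0 * math.pi * cadence_rpm / 60.0  # rad/s
    return power_w / angular_velocity


def static_calibration_torque_nm(mass_kg: float, rod_length_m: float) -> float:
    """Torque produced by a weight hung from a horizontal rod on the pedal crank."""
    return mass_kg * G * rod_length_m


# Slower pedaling requires more torque (resistance) for the same prescribed power.
for rpm in (40, 60, 90, 120):
    print(f"{rpm:3d} rpm: {required_torque_nm(100.0, rpm):.2f} N·m for 100 W")

# Hypothetical static check: a 2 kg weight on a 0.5 m rod.
print(f"static torque: {static_calibration_torque_nm(2.0, 0.5):.2f} N·m")
```

Comparing the torque the ergometer reports with the torque applied by known weights is one way to express a static calibration error as a percentage.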

VALVES AND TUBING Valves and mouthpieces should be appropriate for the size of the patients being tested. Most commercially available systems are designed for adults. With children and smaller adults, it is important that the system accurately measure small tidal volumes and ventilatory frequencies of up to 80 breaths per minute. Accuracy can be assessed by comparing computer and Douglas bag measurements of oxygen uptake, carbon dioxide output, and ventilation during rapid breathing. If an adult-sized breathing valve is used for testing small children, the ratio of deadspace to tidal volume may be 50% or more. This causes children to rebreathe expired gas and to use larger tidal volumes and higher breathing frequencies than would occur in older and larger patients. Conversely, if the breathing valve is too small for the patient, resistance to breathing will be increased at higher workloads. We use the Hans-Rudolph #2700 valve with 115 ml of deadspace for adults with large ventilatory capacities (maximal voluntary ventilation >160 L/min); the #2600 model with 59 ml of deadspace for adults and children with body surface areas greater than 1 m²; and the #1400 valve with 35 ml of deadspace for children with body surface areas less than 1 m².

Outboard or expiratory gas leaks can be difficult to detect; the most common causes are a loose noseclip or leaks around the mouthpiece. Most exercise test systems measure expiratory volume and use the Haldane transformation (nitrogen balance equation) to calculate inspiratory volume.³ An outboard gas leak should be suspected when ventilation and oxygen uptake are relatively low for a given power output. Inboard leaks are of no consequence in systems that measure only expiration. To test for leaks from tubing and valves, the system should be pressurized to 30 cm H2O. If the pressure decays quickly, a leak is present.
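The Haldane transformation and the effect of an outboard leak can be illustrated with a short calculation. The sketch below assumes dry gas fractions, an inspired oxygen fraction of 0.2093, and an inspired carbon dioxide fraction of 0.0004; the function names and the example ventilation and expired fractions are hypothetical and serve only to show the arithmetic.

```python
def haldane_gas_exchange(ve, feo2, feco2, fio2=0.2093, fico2=0.0004):
    """Return (VI, VO2, VCO2) in the units of ve, using nitrogen balance."""
    fen2 = 1.0 - feo2 - feco2    # expired nitrogen fraction (dry gas)
    fin2 = 1.0 - fio2 - fico2    # inspired nitrogen fraction (dry gas)
    vi = ve * fen2 / fin2        # Haldane transformation: inspired volume
    vo2 = vi * fio2 - ve * feo2  # oxygen uptake
    vco2 = ve * feco2 - vi * fico2  # carbon dioxide output
    return vi, vo2, vco2


def deadspace_fraction(valve_deadspace_ml, tidal_volume_ml):
    """Ratio of valve deadspace to tidal volume."""
    return valve_deadspace_ml / tidal_volume_ml


# Hypothetical exercise values: 40 L/min expired ventilation.
ve_true = 40.0
_, vo2_true, _ = haldane_gas_exchange(ve_true, feo2=0.165, feco2=0.042)

# An outboard leak that loses 10% of expired volume before the flow sensor
# lowers both measured ventilation and calculated oxygen uptake.
_, vo2_leak, _ = haldane_gas_exchange(0.9 * ve_true, feo2=0.165, feco2=0.042)
print(f"VO2 without leak: {vo2_true:.2f} L/min, with leak: {vo2_leak:.2f} L/min")

# A 115 ml adult valve used with a 230 ml tidal volume gives a ratio of 0.5.
print(f"deadspace/tidal volume: {deadspace_fraction(115, 230):.2f}")
```

Because inspired volume is derived from expired volume, any expired gas that escapes before measurement reduces ventilation and oxygen uptake in the same proportion, which is why low values for a given power output point to an outboard leak.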

PATIENT VARIABLES The goal of a cardiopulmonary exercise test is to evaluate overall cardiopulmonary responses to exercise. To estimate the level of physical fitness and identify limiting factors to exercise, a maximal or near-maximal test must be performed. If a child does not achieve a maximal effort because of fatigue or orthopedic discomfort, an accurate assessment cannot be made. Coaching by the testing physician and technician plays an important part in obtaining a maximal effort. If a patient performs several exercise tests, it is common for the last one to be more of a maximal effort. This is a demonstration of the “learning effect” or “test-retest phenomenon.” Davies et al.⁴ noted that with repeated cycle ergometric studies in normal deconditioned patients, heart rates and ventilation at a given submaximal workload diminished almost 5% from the first to the fourth test. Kraemer et al.⁵ found similar changes in patients with chronic atrial fibrillation, noting a progressive fall in oxygen uptake at a given treadmill speed. Although these changes are small and within usual errors of measurement, they are systematic (always in the same direction) and important when detection of small changes in exercise performance is relevant, as in the study of the effects of drugs on cardiac function. In a diagnostic exercise evaluation, usually only one exercise test is performed, and in this situation we have found that constant patient feedback is essential for a maximal effort.

GAS ANALYZERS There are several types of respiratory oxygen and carbon dioxide analyzers. The mass spectrometer is the fastest and most accurate instrument, but it is also the most expensive. Paramagnetic oxygen analyzers and infrared carbon dioxide analyzers are commonly used in exercise testing systems. They are less expensive than mass spectrometers, but they have slower response times. Fast and slow responses describe how quickly the electronic output of analyzers follows the rapidly changing concentrations of respiratory gases sampled at the mouthpiece. Analyzers are calibrated statically to agree with calibration gases and dynamically to respond appropriately to step changes in gas concentration. A two-point static calibration procedure is usually adequate, although we have found that a three-point calibration can detect subtle nonlinearities in analyzer outputs. Gas mixtures are chosen to bracket the extremes in respiratory gas concentrations that occur during exercise testing. Previously analyzed high- and low-calibration gases are passed through the system, and the outputs of the analyzers are adjusted to give the proper calibration values. If possible, the calibration gases should be checked by an independent means (Scholander technique). To detect analyzer drift, calibration gases should be passed through the analyzers before and after each exercise test. Chinn et al.⁶ found that 52% of oxygen analyzers were not accurate to within 1%. If expired gas concentrations are inaccurate, spurious values can be produced for derived variables, such as oxygen uptake and carbon dioxide output. Users will not recognize a systematic error in the analyzers unless oxygen uptake and carbon dioxide production are measured by a different method, and these results are compared to computer-derived values.
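The difference between a two-point and a three-point static calibration can be sketched as follows. This is an illustrative calculation that assumes a linear gain-and-offset correction; the function names, the analyzer voltages, and the 12%, 16%, and 21% oxygen calibration gases are hypothetical.

```python
def two_point_calibration(raw_low, raw_high, true_low, true_high):
    """Return (gain, offset) that map raw analyzer output onto the two gases."""
    gain = (true_high - true_low) / (raw_high - raw_low)
    offset = true_low - gain * raw_low
    return gain, offset


def apply_calibration(raw, gain, offset):
    """Convert a raw analyzer reading to a gas concentration."""
    return gain * raw + offset


def third_point_error(raw_mid, true_mid, gain, offset):
    """Residual at a third gas; a nonzero value suggests nonlinearity."""
    return apply_calibration(raw_mid, gain, offset) - true_mid


# Hypothetical oxygen analyzer outputs (volts) for 12% and 21% O2 gases.
gain, offset = two_point_calibration(1.25, 2.05, 12.0, 21.0)

# A third gas of 16% O2 reads 1.62 V; the residual exposes the nonlinearity
# that the two-point procedure alone cannot detect.
residual = third_point_error(1.62, 16.0, gain, offset)
print(f"gain={gain:.3f} %/V, offset={offset:.3f} %, mid-point error={residual:+.3f} % O2")
```

Running the same calibration gases before and after a test and comparing the two readings gives a simple measure of analyzer drift.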

MIXING BOX Most exercise systems use breath-by-breath measurements of tidal volume and oxygen and carbon dioxide concentrations with computer determinations of minute ventilation, oxygen uptake, and carbon dioxide output for each breath. However, several systems measure oxygen uptake, carbon dioxide output, and ventilation over several breaths using a mixing box. This is a 3 L to 6 L box in which expired gas is mixed. Fairly stable concentrations of oxygen and carbon dioxide are present at the outlet of the box, even with irregular breathing. Children have smaller tidal volumes than adults, and if the mixing box is too large, errors in the calculation of ventilation and gas exchange data can occur. We use a 6 L mixing box for adults and children with a body surface area >1 m² and a 4 L mixing box for children with body surface area <1 m². [...] breath-by-breath systems should be validated for use in children.

VENTILATION AND FLOW MEASUREMENTS Ventilation (volume/time) can be measured directly or calculated by integrating the recorded expired flow signal (the area under the flow curve). Pneumotachographic systems measure the pressure difference across a linear flow-resistive element; this pressure difference is proportional to flow. Once it is properly calibrated, the pneumotachographic signal usually does not drift. The relation between pressure difference and flow becomes nonlinear at very high flow rates, such as in maximal tests of adults, but computerized systems can make allowances for this. It is important to note that the pneumotachograph is calibrated in a predetermined position within the ventilatory apparatus of the system, and it must be recalibrated if the apparatus is changed substantially, such as with the use of shorter, longer, smaller, or larger respiratory tubing or with new bends or angles in tubes. Other methods for measuring volume include a turbine meter, cooling of a hot wire, and a rolling seal spirometer. Usually these methods are accurate once they have been properly calibrated. A “super syringe” is used to calibrate the volume signal. As a quality control check, a higher frequency of breathing can be simulated by rapid pumping of the syringe.
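Integration of a flow signal and a syringe check at slow and fast stroke rates can be sketched as below. This is a minimal illustration using trapezoidal integration of simulated data; the 3 L syringe volume, the 100 Hz sampling rate, the half-sine flow profile, and all function names are assumptions made for the example.

```python
import math


def integrate_flow(flow_l_s, dt_s):
    """Trapezoidal integral of equally spaced flow samples (L/s) to volume (L)."""
    return sum(0.5 * (a + b) * dt_s for a, b in zip(flow_l_s, flow_l_s[1:]))


def simulated_stroke(volume_l, duration_s, rate_hz=100):
    """Half-sine flow profile whose true integral equals volume_l."""
    n = int(round(duration_s * rate_hz))
    peak = volume_l * math.pi / (2.0 * duration_s)  # peak flow, L/s
    flow = [peak * math.sin(math.pi * i / n) for i in range(n + 1)]
    return flow, 1.0 / rate_hz


def percent_error(measured, known):
    """Signed error of a measured volume relative to the known syringe volume."""
    return 100.0 * (measured - known) / known


SYRINGE_L = 3.0  # assumed calibration syringe volume
for duration in (2.0, 0.5):  # slow stroke, then rapid pumping
    flow, dt = simulated_stroke(SYRINGE_L, duration)
    volume = integrate_flow(flow, dt)
    print(f"{duration:.1f} s stroke: {volume:.3f} L "
          f"({percent_error(volume, SYRINGE_L):+.2f}%)")
```

With a real flow sensor, a measured volume that departs from the syringe volume only at the faster stroke rate would point to a flow-dependent nonlinearity rather than a simple gain error.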

TIME (PHASE DELAY) As opposed to the instantaneously measured volume signal, there is a time delay between gas concentrations at the mouth and at the output of the gas analyzers. For proper measurements of oxygen uptake and carbon dioxide output, volumes and expired gas concentrations must be analyzed accurately and be aligned precisely in time.⁷ This time (phase) delay includes both the delay inherent in the gas analyzers and that due to gas passing through the tubing connecting the mouthpiece or mixing box to the analyzers. If expired gas concentrations and volume are not aligned properly, erroneously high or low values of oxygen uptake or carbon dioxide output will be obtained. Ventilation calibration is not affected by time delay. To align ventilation and gas concentration signals, most systems use a solenoid to present a square wave of calibration gas to the analyzers, with computer calculation of the time delay between the volume and gas analyzer signals. Usually the delay is measured before each test and is fairly constant from day to day. However, time delays can be erroneously long if the analyzers are not functioning properly or if the tubing is partially occluded.

Figure 1. Oxygen uptake data from our laboratory, plotted against date of test (June 1988 to July 1991). Points consist of measurements performed at 100 W on a laboratory technician (CDM). Lines are ± 5% of historical means. The variability (precision) is consistent with that reported by Garrard and Emmons.¹³ A systematic error occurred in June 1989. The source could not be found and the problem appeared to correct itself.
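The delay measurement described above can be sketched with simulated data. This is an illustration only: it assumes the delay is taken as the time from the solenoid switch to the point where the analyzer output has covered half of the step, which is one of several possible definitions, and the sampling rate, transport time, analyzer time constant, and function names are all hypothetical.

```python
import math


def simulate_step(rate_hz=100, switch_index=50, transport_s=0.30, tau_s=0.05,
                  start=0.2093, end=0.16, duration_s=2.0):
    """Analyzer output for a step in gas fraction: dead time, then first-order lag."""
    signal = []
    for i in range(int(duration_s * rate_hz)):
        t = (i - switch_index) / rate_hz - transport_s
        if t <= 0:
            signal.append(start)
        else:
            signal.append(start + (end - start) * (1.0 - math.exp(-t / tau_s)))
    return signal


def estimate_delay_s(signal, switch_index, rate_hz, fraction=0.5):
    """Time from the solenoid switch until the signal covers `fraction` of the step."""
    baseline, final = signal[switch_index], signal[-1]
    threshold = baseline + fraction * (final - baseline)
    rising = final > baseline
    for i in range(switch_index, len(signal)):
        if (signal[i] >= threshold) if rising else (signal[i] <= threshold):
            return (i - switch_index) / rate_hz
    raise ValueError("signal never crossed the threshold")


def align(flow, gas, delay_s, rate_hz):
    """Shift the gas signal earlier by the delay so it lines up with flow."""
    shift = int(round(delay_s * rate_hz))
    gas_shifted = gas[shift:]
    return flow[:len(gas_shifted)], gas_shifted


delay = estimate_delay_s(simulate_step(), switch_index=50, rate_hz=100)
print(f"estimated delay: {delay:.2f} s")
```

Logging the estimated delay before each test makes it easy to notice the abnormally long delays that accompany a failing analyzer or partially occluded sample tubing.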

OVERALL QUALITY CONTROL Ventilation, oxygen uptake, and carbon dioxide output should be measured in the same patient and periodically compared as a control check of time delays, volume measurements, gas analyzer calibrations, and the overall system integrity. These so-called biologic quality control data identify deviations from historical laboratory values (Figure 1). Quality is also assessed by comparisons of computer-derived and Douglas bag measurements (the bag measurement is considered the “gold” standard). Expired gas passes through the computer system and into the bag. The volume and gas concentrations in the bag are measured; ventilation, oxygen uptake, and carbon dioxide output are calculated and compared, as shown in Figure 2. With bag collections, it is important to be sure that there are no leaks, that valves are turned at the proper time, and that gas concentrations are analyzed rapidly and accurately.

Tests of biologic quality control determine the precision of laboratory measurements. Are repeated measurements in a given patient close to the historical average value for that individual? Comparisons of computer and bag measurements test the accuracy of laboratory measurements. Do computer measurements agree with bag measurements?

In an analysis of three automated exercise testing systems, Versteeg and Kippersluis⁸ found significant differences in computer and bag collection measurements. Others have reported similar results.⁹,¹⁰ Ideally, a manufacturer or an independent testing firm should provide results of bag collection measurements for the exercise testing system; but because this information is not usually available, it is important that individual laboratories have the capability to perform computer-bag comparisons. If an inaccurate system is purchased and bag verification tests are not performed, erroneous data can be reported indefinitely.

Figure 2. Computer-Douglas bag comparison data from our laboratory, plotted against date of test (June 1988 to July 1991). When computer and bag oxygen uptake measurements are identical, the ratio is 1.0. Note that for nearly 2 years, computer measurements of oxygen uptake were almost 5% higher than bag measurements. The source of error was not found. In 1990, we switched from a latex weather balloon to an impervious (Hans-Rudolph No. 6100) bag, and the ratio is now near 1.0. This was a systematic error caused by the escape of expired gas from the bag fittings or through the bag itself before analysis.

PULSE OXIMETRY AND BLOOD GAS ANALYSIS Pulse oximetry is performed routinely with exercise testing. Some oximeters are more accurate than others for measurements during exercise.¹¹ The pulse-wave amplitude must be sufficient to allow for an estimation of oxygen saturation. A low pulse signal alarm suggests a problem. During exercise measurements, more reliable pulse signals are obtained from the ear lobe than from the finger tip. Pulse oximetric saturations are reasonable estimations of directly measured arterial oxygen saturation, if carbon monoxide concentrations are less than 5% (nonsmokers) and if arterial oxygen saturations are greater than 85%.¹² In patients whose exercise performance is limited by gas exchange, direct arterial blood gas measurements should be obtained at rest and during exercise, rather than relying on pulse oximetric estimations. In small children, separate serial collections of arterial samples are difficult and often impossible. With a sufficiently large radial artery, a No. X&gauge indwelling catheter (such as an Angiocath, Deseret Medical, Sandy, UT) can be introduced to obtain arterial blood samples at rest and during exercise. In most patients, an indwelling catheter is more comfortable than repeated arterial punctures. Proper calibration of blood gas analyzers is not reviewed here. Recently enacted Clinical Laboratory Improvement Act regulations have established high-quality standards for blood gas measurements.

COMPUTER SOFTWARE ERRORS In our laboratory, several commercially available exercise testing systems have been evaluated, using comparisons with bag collection measurements as described above. Some systems have systematic errors as high as 40% in oxygen uptake measurements. Often it is difficult to find the source of these errors. Common causes include faulty gas analyzers, erroneous volume measurements, and inaccurate time delays. When the source of error cannot be detected, we discuss the problem with the manufacturers. Software experts are often helpful in debugging their work. Random errors, so-called computer glitches, occur occasionally, and if a quality control check suggests inaccuracy, the test should be repeated to confirm that an error actually exists.

OVERALL VARIABILITY OF EXERCISE MEASUREMENTS The overall variability in measurements of oxygen uptake, carbon dioxide output, and ventilation is determined by measurement error and within-subject variability. Quality control testing helps to separate these two sources. Repeated computer-bag comparisons with a ratio near 1.0 indicate that computer measurements are reasonably accurate. If computer-bag ratios are near 1.0 and individual measurements still vary widely, within-subject variability is the likely source. Garrard and Emmons¹³ evaluated overall variability of maximal exercise measurements in six normal subjects who were studied six times. They found coefficients of variation of 12% for ventilation, 3.8% for heart rate, and 8.4% for oxygen uptake. In studies of submaximal exercise performance, Jones and Kane¹⁴ reported coefficients of variation of 8% for ventilation, 3% for heart rate, and 5.1% for oxygen uptake in six normal subjects studied six times. This range of overall variability must be considered when interpreting exercise test results. Whether a change in oxygen uptake is significant depends both on the quality control of the laboratory procedure and on the ability of the patient to perform a reproducible test.
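A coefficient of variation for repeated measurements can be computed as in the sketch below. This assumes the usual definition (sample standard deviation divided by the mean, expressed as a percentage); the function name and the six oxygen uptake values are hypothetical and are not data from the studies cited.

```python
import statistics


def coefficient_of_variation_pct(values):
    """Sample standard deviation as a percentage of the mean."""
    if len(values) < 2:
        raise ValueError("at least two measurements are required")
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical maximal oxygen uptake (L/min) in one subject studied six times.
vo2_repeats = [2.95, 3.10, 2.80, 3.05, 2.90, 3.20]
cv = coefficient_of_variation_pct(vo2_repeats)
print(f"coefficient of variation: {cv:.1f}%")
```

Comparing a laboratory's own coefficient of variation with the published values quoted above gives a rough sense of whether its overall variability is in the expected range.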

REPORT FORM Most computerized exercise testing systems can generate a report form, which helps to eliminate transcription errors. However, certain data such as blood gas results may have to be entered by hand into the computer or report form. A transcription error is not uncommon in a busy laboratory and all data should be checked carefully.

INTERPRETATION Physicians who order exercise tests are more concerned with an overall interpretation than with specific numerical results. Those who perform and interpret the tests are responsible for both the overall quality control and the content of the interpretation. They should have sufficient experience in exercise physiology to interpret exercise performance accurately. During 15 years of teaching exercise physiology to residents and fellows, we have found that the most common error made by novices is placing an overemphasis on a single abnormal variable instead of interpreting the overall trend of the test data. Wording of the exercise test interpretation is important because critical decisions regarding disability, athletic participation, and lifestyle choices can depend upon it. Computer algorithms to simplify and standardize interpretations, in our opinion, should not be relied upon for the final interpretation. Testing physicians are encouraged to develop and consistently use their own methods of interpretation.

PROPOSED SCHEME FOR EXERCISE TESTING QUALITY CONTROL Several adaptable and user-friendly quality control procedures have been reviewed by Jones.¹⁴,¹⁵ The recommendations that conclude this presentation reflect procedures that have been refined and adopted in our laboratory over several years.

With receipt of a new exercise testing system, both the medical director and technical personnel should thoroughly read the equipment manual. Several weeks should be allowed for familiarization and testing of normal subjects before clinical or research testing is performed. Oxygen uptake, carbon dioxide output, and ventilation measurements should be compared to bag collection data using gas analyzers other than those included in the system. If bag collection measurements are not possible, the data should be compared to results of tests performed in the same subjects using another system. If systematic errors are found, the specific problem should be discussed with the manufacturer.

A 2- to 4-hour period should be reserved each week for quality control testing. One or two laboratory personnel, who consent to repeated testing, should be studied as test control subjects. Oxygen uptake, carbon dioxide output, ventilation, and heart rate should be recorded along with the ergometer workload or treadmill speed and elevation. We have found that an ideal exercise level is 100 W. For most fit individuals, this is a subanaerobic workload that allows subjects to reach a near-steady state of exercise after 3 to 5 minutes.

A computer software spreadsheet with graphics capabilities should be available for analysis of quality control data. Several months of data should be examined and analyzed on a weekly basis to evaluate a variety of laboratory equipment, such as cycle ergometers, treadmills, gas analyzers, and flow meters. Computer-bag comparisons should be performed routinely. This serves as an overall check of the system, and it maintains the skills of laboratory personnel in performing bag collections and making the associated exercise calculations.
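The weekly analysis of quality control data can be sketched in a few lines, as an alternative to a spreadsheet. This is a minimal illustration that assumes the ± 5% band around the historical mean used in Figure 1 as the flagging limit; the function names and all oxygen uptake values are hypothetical.

```python
import statistics


def deviation_from_history(history, new_value):
    """Fractional deviation of a new measurement from the historical mean."""
    mean = statistics.mean(history)
    return (new_value - mean) / mean


def outside_band(history, new_value, band=0.05):
    """True if the new measurement lies outside +/- band of the historical mean."""
    return abs(deviation_from_history(history, new_value)) > band


def computer_bag_ratio(computer_vo2, bag_vo2):
    """Ratio of computer-derived to bag-derived oxygen uptake (1.0 = agreement)."""
    return computer_vo2 / bag_vo2


# Hypothetical oxygen uptake (ml/min) of one control subject at 100 W.
history = [1400, 1420, 1385, 1410, 1395]
for weekly in (1430, 1500):
    flag = "CHECK SYSTEM" if outside_band(history, weekly) else "ok"
    print(f"{weekly} ml/min: {100 * deviation_from_history(history, weekly):+.1f}% {flag}")

# Hypothetical simultaneous computer and bag measurements.
print(f"computer/bag ratio: {computer_bag_ratio(1470, 1400):.2f}")
```

A single flagged value may be a random error, so a repeat test is a reasonable first step before searching for a systematic fault.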

REFERENCES
1. Russell JC, Dale JD. Dynamic torquemeter calibration of bicycle ergometers. J Appl Physiol. 1986;61:1217-1220.
2. Wilmore JH, Constable SH, Stanforth PR, et al. Mechanical and physiological calibration of four cycle ergometers. Med Sci Sports Exerc. 1982;14:322-325.
3. Wilmore JH, Costill DL. Adequacy of the Haldane transformation in the computation of exercise VO2 in man. J Appl Physiol. 1973;35:85-89.
4. Davies CTM, Tuxworth W, Young JM. Physiological effects of repeated exercise. Clin Sci. 1970;39:247-258.
5. Kraemer MD, Sullivan M, Atwood JE, Forbes S, Myers J, Froelicher V. Reproducibility of treadmill exercise data in patients with atrial fibrillation. Cardiology. 1989;76:234-242.
6. Chinn DJ, Naruse Y, Cotes JE. Accuracy of gas analysis in lung function laboratories. Thorax. 1986;41:133-137.
7. Hughson RL, Northey DR, Xing HC, Dietrich BH, Cochrane JE. Alignment of ventilation and gas fraction for breath-by-breath respiratory gas exchange calculations in exercise. Comput Biomed Res. 1991;24:118-128.
8. Versteeg PGA, Kippersluis GJ. Automated systems for measurement of oxygen uptake during exercise testing. Int J Sports Med. 1989;10:107-112.
9. Matthews JI, Bush BA, Morales FM. Microprocessor exercise physiology systems vs a nonautomated system. Chest. 1987;92:696-703.
10. Jones NL. Evaluation of a microprocessor-controlled exercise testing system. J Appl Physiol. 1984;57:1312-1318.
11. Barthelemy JC, Geyssant A, Riffat J, Antoniadis A, Berruyer J, Lacour JR. Accuracy of pulse oximetry during moderate exercise: a comparative study. Scand J Clin Lab Invest. 1990;50:533-539.
12. Powers SK, Dodd S, Freeman J, Ayers GD, Samson H, McKnight T. Accuracy of pulse oximetry to estimate HbO2 fraction of total Hb during exercise. J Appl Physiol. 1989;67:3000-3004.
13. Garrard CS, Emmons C. The reproducibility of the respiratory responses to maximum exercise. Respiration. 1986;49:94-100.
14. Jones NL, Kane JW. Quality control of exercise test measurements. Med Sci Sports. 1979;11:368-372.
15. Jones NL. Clinical Exercise Testing. Philadelphia, PA: WB Saunders; 1988:208-212.