Observer Variation in Spirometry*
STUART W. ROSNER, M.D.,** SIDNEY ABRAHAM† AND CESAR A. CACERES, M.D.††
Washington, D. C.
Despite the wealth of publications relating to clinical spirometry,¹ there is little comment² on the precision of the measurements. In this study, the reproducibility error associated with the measurements of the forced vital capacity (FVC), forced expiratory volumes at one, two and three seconds (FEV 1, 2 and 3), maximal expiratory flow rate (MEFR) and maximal mid-expiratory flow rate (MMF) is reported.

METHOD

The forced expiratory spirogram was recorded with a Collins spirometer at 30 mm./second. The portion of the curve which approaches a straight line was extrapolated back to establish a zero time in recordings where the onset of expiration was not definite.³ To eliminate one source of variance, one of the observers performed the extrapolation for both. The measurements were made in duplicate by two persons, using a ruled plastic overlay sheet.

RESULTS

The technical error§ of the measurement (Table 1) expresses the intra-observer variation and is calculated from the difference between duplicate determinations by the same individual. Both observers demonstrated a relatively small technical error for the FVC and FEV 1, 2 and 3 compared with the MEFR and MMF. The largest reproducibility variation was noted for the MEFR, with technical errors of 209 and 333 ml. for observers A and B respectively. Using observer B's technical error as an example, there is a 95 per cent probability that any of his MEFR measurements lies within 666 ml. (2 x T.E.) of the true value.

The statistical significance of differences in precision between two observers can be assessed by the F test.§§ If the calculated F value exceeds the table F value, then the observed differences in reproducibility probably represent true differences in observer ability.⁴ The table F value at the 2.5 per cent probability level for the six parameters computed by the two observers is 1.7. The calculated F values exceeded the table value in every instance; thus one observer may be significantly more precise in performing the measurements, as A was in these tests.
DISCUSSION
The magnitude of the technical error of the measurement is useful information in evaluating an individual value in the region of the arbitrary borderline between normal and abnormal. A large technical error reduces the diagnostic accuracy of a measurement.⁵ In Fig. 1A, one can see that the ability to discriminate between a normal and an abnormal population will be vitiated as the technical error of the measurement increases. If clinical experience demonstrates that the several parameters extracted from the spirogram have equal diagnostic potential, then the magnitude of the
*Presented in part at the 8th International Congress on Diseases of the Chest, Mexico City, October, 1964.
**Instrumentation Field Station, Heart Disease Control Program, Division of Chronic Diseases, U.S. Department of Health, Education and Welfare; Clinical Instructor in Medicine, George Washington University Hospital.
†Statistician, Instrumentation Field Station, Heart Disease Control Program, Division of Chronic Diseases, U.S. Department of Health, Education and Welfare.
††Chief, Instrumentation Field Station, Heart Disease Control Program, Division of Chronic Diseases, U.S. Department of Health, Education and Welfare; Assistant Professor of Medicine, George Washington University Hospital.
§$\mathrm{T.E.} = \sqrt{\sum d^{2} / 2n}$, where d = difference between duplicate determinations and n = number of subjects.
§§$F = (\mathrm{T.E.})_{1}^{2} / (\mathrm{T.E.})_{2}^{2}$, where the numerator is the larger variance.
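The two footnote formulas lend themselves to a short numerical illustration. The sketch below, in Python, computes the technical error from duplicate readings and forms the variance ratio with the larger variance in the numerator. The function names and the four pairs of readings per observer are hypothetical, chosen only for illustration; they are not data from this study.

```python
import math


def technical_error(first, second):
    """T.E. = sqrt(sum(d^2) / (2n)), d = difference between duplicate readings."""
    if len(first) != len(second) or not first:
        raise ValueError("need two equal-length, non-empty lists of readings")
    sum_sq = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(sum_sq / (2 * len(first)))


def f_ratio(te_1, te_2):
    """Ratio of squared technical errors, larger variance in the numerator."""
    larger, smaller = max(te_1, te_2), min(te_1, te_2)
    return (larger / smaller) ** 2


# Hypothetical duplicate FVC readings (ml.) on four subjects, NOT study data.
obs_a_first = [3150, 2400, 4100, 3600]
obs_a_second = [3160, 2390, 4110, 3590]
obs_b_first = [3140, 2420, 4080, 3620]
obs_b_second = [3170, 2400, 4120, 3590]

te_a = technical_error(obs_a_first, obs_a_second)
te_b = technical_error(obs_b_first, obs_b_second)
f_value = f_ratio(te_a, te_b)

# The text's "2 x T.E." band, using observer B's reported MEFR error of 333 ml.
band_95 = 2 * 333

print(f"T.E. observer A: {te_a:.1f} ml.")
print(f"T.E. observer B: {te_b:.1f} ml.")
print(f"F ratio: {f_value:.2f}")
print(f"2 x T.E. for a T.E. of 333 ml.: {band_95} ml.")
```

The resulting ratio would then be compared with the tabulated F value for the appropriate degrees of freedom, a lookup the sketch does not attempt.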
Diseases of the Chest

technical error will determine which measurement is best employed. Epidemiologic research often demands the combined efforts of two or more observers for the study of a large population. If the technical variability of one observer exceeds that of another, then the quality of the better effort is diluted when the results are pooled (Fig. 1B).

The smallest technical errors, hence greatest precision, were associated with determination of the total volume and the volumes at different time intervals. These errors are due, in part, to use of a scale with 2 mm. intervals between lines. Each observer measured to the nearest 1 mm., which required an interpolation by eye in many instances. Such interpolations between scale divisions are known to be untrustworthy.⁸ The provision of a scale with 1 mm. divisions could reduce this source of variability. Nevertheless, the error in measuring the total volume is relatively small compared with the known daily fluctuations in vital capacity.⁶,⁷

A large technical error was noted for the MEFR and MMF. These measurements require identification of more data points than the volume measurements, thereby increasing the reproducibility error. Furthermore, two points are sighted by a line crossing the vertical time lines (Fig.
[Panel A: region of false positive and false negative diagnoses. Panel B: first observer, second observer.]
FIGURE 1: The effect of technical error on the discrimination of two populations by a diagnostic measure is illustrated. (A) The region of false positive and false negative diagnoses is determined, in part, by the magnitude of the technical error. (B) Different levels of precision between observers will affect pooled data. The area of overlapping values between two populations will be determined by the technical error of the least precise worker.
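The widening of the overlap region with technical error can be made concrete with a small calculation. The Python sketch below rests on assumptions that do not come from the paper: both populations are taken as Gaussian with the same biological spread and equal size, the technical error is treated as independent of the biological variation (so the variances add), and the cutoff sits midway between the two means. All numerical values are hypothetical.

```python
import math
from statistics import NormalDist


def misclassified_fraction(mean_abnormal, mean_normal, biological_sd, technical_error):
    """Fraction of subjects on the wrong side of a midpoint cutoff.

    Assumes mean_abnormal < mean_normal, equal-sized Gaussian populations,
    and measurement error that adds in quadrature to the biological spread.
    """
    observed_sd = math.sqrt(biological_sd ** 2 + technical_error ** 2)
    cutoff = (mean_abnormal + mean_normal) / 2
    # Abnormal subjects reading above the cutoff are false negatives.
    false_negative = 1 - NormalDist(mean_abnormal, observed_sd).cdf(cutoff)
    # Normal subjects reading below the cutoff are false positives.
    false_positive = NormalDist(mean_normal, observed_sd).cdf(cutoff)
    return (false_negative + false_positive) / 2


# Hypothetical populations (ml.): abnormal mean 2000, normal mean 3000, SD 400.
technical_errors = (0, 200, 400)
fractions = [misclassified_fraction(2000, 3000, 400, te) for te in technical_errors]

for te, fraction in zip(technical_errors, fractions):
    print(f"technical error {te:3d} ml.: {100 * fraction:.1f} per cent misclassified")
```

Under these assumptions the misclassified fraction rises steadily as the technical error grows. For pooled data (panel B), the same function could be evaluated with the technical error of the less precise observer.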
Volume 48, September 1965
TABLE 1: SIX MEASUREMENTS FROM THE FORCED EXPIRATORY SPIROGRAMS OF 77 PATIENTS PERFORMED BY TWO OBSERVERS IN DUPLICATE

             Mean*             Standard Deviation*   Technical Error*   F Value
             Obs. A   Obs. B   Obs. A   Obs. B       Obs. A   Obs. B    Calculated   Table
FVC          3165     3160     879      873          9        16        2.9          1.7
FEV 1        2426     2427     779      776          15       21        2.1          1.7
FEV 2        2798     2797     826      824          12       23        3.7          1.7
FEV 3        2948     2947     842      840          10       19        3.9          1.7
MEFR         4?59     4538     2334     2463         209      333       2.5          1.7
MMF          2518     2534     1282     1307         188      271       4.8          1.7

*Means, standard deviations and technical errors are expressed in milliliters.
2). The angle made by this line with the horizontal, which is the complement of the angle made by this line with the vertical, is examined for its effect on the precision of the readings. Increasing the drum speed will reduce the angle made by the line with the horizontal and will increase the precision of the measurement. The larger technical error observed for the MEFR measurement relates to the acuteness of this angle. Typically, the velocity of the MEFR exceeds the MMF, i.e., the slope is steeper and the angle more cl

associated with the FVC and FEV 1, 2 and 3 measurements contribute to the overlap, the errors are small. The MEFR and MMF measurements were accompanied by large technical errors; therefore, they must be interpreted more cautiously than the volume measurements. While an individual value at either end of the scale is easily interpreted, a large technical error, of itself, will hamper interpretation of borderline values.

In this study, the observer differences were found to be statistically significant. This finding is of particular importance to groups engaged in epidemiologic and joint studies because significant inter-individual
[Figure labels: MEFR slope line; start of expiration.]
FIGURE 2: Identical expirations are drawn as they would appear with two different drum speeds. Increasing the drum speed permits more precise measurement of the slope. With a drum speed of 30 mm./second and an MEFR value of 9.0 liters/second, the angle made with the horizontal is 86°. An error of only ± ⅓° in this angle will result in corresponding errors in the tangent of the angle (i.e., the numerical value of the slope) of +9.1 per cent (819 ml.) and -7.7 per cent (693 ml.). If the flow rate remains unchanged but the drum speed is tripled, the angle will be reduced to 78½°. The same ± ⅓° error in the angle produces smaller tangent errors of +3.1 per cent (279 ml.) and -2.9 per cent (261 ml.).
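The arithmetic in the caption can be checked with a few lines of Python. The sketch below assumes an angular sighting error of one third of a degree, the value that reproduces the caption's percentages, and takes the two angles (86 and 78.5 degrees) and the flow rate of 9.0 liters/second directly from the caption. The function name is hypothetical.

```python
import math


def slope_error_percent(angle_deg, angle_error_deg):
    """Percentage change in tan(angle) for an error of +/- angle_error_deg."""
    base = math.tan(math.radians(angle_deg))
    high = math.tan(math.radians(angle_deg + angle_error_deg))
    low = math.tan(math.radians(angle_deg - angle_error_deg))
    return 100 * (high / base - 1), 100 * (low / base - 1)


ANGLE_ERROR_DEG = 1 / 3   # assumed sighting error, in degrees
MEFR_ML_PER_SEC = 9000    # 9.0 liters/second, as in the caption

# 86 degrees at 30 mm./second; 78.5 degrees at the tripled drum speed.
slow_plus, slow_minus = slope_error_percent(86.0, ANGLE_ERROR_DEG)
fast_plus, fast_minus = slope_error_percent(78.5, ANGLE_ERROR_DEG)

for label, plus, minus in (("30 mm./second", slow_plus, slow_minus),
                           ("tripled speed", fast_plus, fast_minus)):
    print(f"{label}: {plus:+.1f} per cent ({MEFR_ML_PER_SEC * plus / 100:.0f} ml.), "
          f"{minus:+.1f} per cent ({MEFR_ML_PER_SEC * minus / 100:.0f} ml.)")
```

Because the tangent grows without bound as the angle approaches 90 degrees, the same sighting error costs far more at the steeper angle, which is the reason a faster drum speed helps. The milliliter values printed here come from unrounded percentages and so may differ by a few milliliters from those in the caption.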
variation will affect the quality of the final results.

SUMMARY

The intra-observer and inter-observer variation associated with clinical spirometry was statistically analyzed. The technical errors for the MEFR and MMF measurements were large enough to call their clinical reliability into question, especially in borderline instances. The measurement differences between the two observers in this study were statistically significant. This observation has special meaning for groups engaged in epidemiologic studies. Significant inter-individual variation will diminish the quality of pooled results.
ACKNOWLEDGMENT: Miss Nancy J. Welton and Mr. Roger Sitterly of Antioch College gave valuable assistance, and Dr. Albert Roberts, Chief, Cardiopulmonary Diseases Section, Heart Disease Control Program, and Mr. David Winer, B.E., offered helpful criticisms.

RESUMEN

Las variaciones intra-observador e inter-observador en la espirometría clínica han sido analizadas estadísticamente. Los errores de técnica comprobados por las medidas del MEFR y MMF han sido lo bastante considerables como para poner en duda su valor en la clínica. Las diferencias entre las medidas tomadas por dos observadores distintos, como en el caso del presente estudio, son estadísticamente significativas. Esta comprobación es de particular interés para los investigadores dedicados a estudios epidemiológicos. Las discrepancias inter-individuales tienden a restar valor a las conclusiones de conjunto.

ZUSAMMENFASSUNG

Analyse der Varianten bei ein und dem gleichen Beobachter und zwischen zwei verschiedenen Beobachtern in Zusammenhang mit der klinischen Spirometrie auf statistischer Basis. Die technischen Irrtümer für die MEFR und für die Messung des Atemgrenzwertes waren doch so erheblich, um ihre klinische Zuverlässigkeit in Frage zu stellen, besonders bei Grenzfällen. Unterschiede in den Meßwerten zwischen zwei Beobachtern wie in der vorliegenden Untersuchung sind von statistischer Signifikanz. Diese Beobachtung ist von besonderer Bedeutung für Forschergruppen, die mit epidemiologischen Untersuchungen beschäftigt sind. Signifikante interindividuelle Varianten können die Qualität koordinierter Resultate herabsetzen.
REFERENCES

1 LEUALLEN, E. C. AND FOWLER, W. S.: "Maximal Midexpiratory Flow," Am. Rev. Tuberc., 72:783, 1955.
2 SHONFELD, E. M., RADEMACHER, C., WEIHRER, A. L., ABRAHAM, S., WIENER, J. AND CACERES, C. A.: "Methodology for Computer Measurement of Pulmonary Function Curves," Dis. Chest, 46:427, 1964.
3 D'SILVA, J. L. AND KAZANTZIS, G.: "Measurement of the Mechanical Function of the Lung in Normal Subjects," Thorax, 9:128, 1954.
4 THORNER, R. M. AND REMEIN, Q. R.: "Principles and Procedures in the Evaluation of Screening for Disease," Pub. Health Monograph, 67:24, 1961.
5 SIMONSON, E.: Differentiation Between Normal and Abnormal in Electrocardiography, C. V. Mosby Company, St. Louis, 1961.
6 RAHN, H., FENN, W. O. AND OTIS, A. B.: "Daily Variation of Vital Capacity, Residual Air, and Expiratory Reserve Including a Study of the Residual Air Method," J. Appl. Physiol., 1:725, 1949.
7 MILLS, J. N.: "Variability of the Vital Capacity of the Normal Human Subject," J. Physiol., 110:76, 1949.
8 MAINLAND, D.: Elementary Medical Statistics, 2nd ed., W. B. Saunders Company, Philadelphia, 1952.