The Journal of Emergency Medicine, Vol 13, No 5, pp 617-621, 1995 Copyright © 1995 Elsevier Science Ltd Printed in the USA. All rights reserved
0736-4679/95 $9.50 + .00
0736-4679(95)00065-8
Original Contributions
MEASURING THE ACCURACY OF THE INFRARED TYMPANIC THERMOMETER: CORRELATION DOES NOT SIGNIFY AGREEMENT
Michael Yaron, MD, FACEP, Steven R. Lowenstein, MD, MPH, FACEP, and Jane Koziol-McLain, RN, MS
Colorado Emergency Medicine Research Center, Division of Emergency Medicine, University of Colorado Health Sciences Center, Denver, Colorado
Reprint Address: Dr. Michael Yaron, Division of Emergency Medicine, University of Colorado Health Sciences Center, 4200 East Ninth Avenue, Box 8215, Denver, CO 80262
Abstract - This prospective study assessed the accuracy of the infrared tympanic thermometer (ITT) compared to the rectal thermometer (RT) using statistical measures of agreement. In a convenience sample of 100 adult emergency department patients, ear examinations to assess for cerumen or otitis were followed by temperature measurements using the First Temp 2000A thermometer in both ears and the IVAC 2000 rectally. Left and right ITT temperatures showed high correlation and agreement; therefore, only right ITT results are reported. Both the ITT and RT recorded similar mean temperatures, standard deviations, and ranges. The correlation (r = 0.70) and agreement (ICC = 0.74) between the ITT and RT were both below the 0.8 level that indicates excellent agreement. The mean temperature difference (RT - ITT) between the two devices was 0.1 ± 0.7°C; in 10% of patients, the temperature difference was ≥1°C. Among 10 patients identified as febrile by RT (RT ≥ 38.5°C), 6 were febrile by ITT. Significant differences occurred between the temperature measurements using the ITT and RT; these devices do not demonstrate excellent agreement.
Keywords - accuracy; agreement; correlation; temperature; thermometers; tympanic membrane
RECEIVED: 28 September 1994; FINAL SUBMISSION RECEIVED: 1 March 1995; ACCEPTED: 13 March 1995
INTRODUCTION
Modern emergency medicine practice depends, to a large extent, on making accurate measurements. From the moment of triage to the time of treatment and disposition, clinical signs and symptoms, pathophysiologic disturbances, chemical derangements, and disease states are often defined using measuring devices. In recent years, many new, highly technical devices have been introduced, including pulse oximeters, salivary tests for alcohol, and the infrared tympanic thermometer (ITT). The ITT is a striking example of the recent revolution in measuring devices in emergency medicine. The ITT records nondiscriminating infrared radiation from the auditory canal and converts this information into body temperature. The ITT offers rapid, convenient, clean, noninvasive, and painless temperature measurements. The introduction of this device has met with acceptance and excitement among patients and health professionals (1,2).
However, apart from questions of measurement precision, one wonders how accurate ITT instruments are. Two questions are particularly salient to clinicians. First, how well does the new device agree with the standard measuring technique; that is, can the new device substitute for the old? Second, how often does the new device misclassify a patient as normal or abnormal? These questions address agreement.
While numerous studies report correlations when comparing new devices to old, few examine whether the devices agree. Correlation statistics assess the relatedness of two different variables, often measured in different units. However, two measures may have a perfect linear relationship yet not yield coinciding results; that is, they may not agree. The Intraclass Correlation Coefficient (ICC) provides information that is similar to a correlation analysis, but corrects the statistic for differences in group means. The ICC, in regression terms, "assesses similarity of slopes as well as intercepts" (3). The ICC ranges from -1 to +1. Increasing values of the ICC indicate increasing agreement, with values >0.8 generally accepted as demonstrating "excellent" agreement (4).
The assessment of temperature measurement devices in the emergency setting is a relevant example for exploring correlation versus agreement. Body temperature is a vital sign. Abnormal temperature readings are used in the emergency setting to define disease states and to determine the need for treatment. In some cases, such as the neutropenic patient or the newborn, the temperature may be the sole criterion upon which treatment, testing, and admission decisions turn.
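The distinction the Introduction draws between correlation and agreement can be made concrete with a small sketch. The readings below are hypothetical, and the one-way random-effects form of the ICC is an assumption chosen for illustration; the paper does not state which ICC variant was computed. The second device is made to read exactly 3 degrees lower than the first, so the two track each other perfectly but never coincide.

```python
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between paired readings."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def icc_oneway(x, y):
    """One-way random-effects ICC for two readings per subject.

    Assumption: this ICC form is used for illustration only; the paper
    does not specify the variant it reports.
    """
    n, k = len(x), 2
    grand = mean(list(x) + list(y))
    subject_means = [(a + b) / 2 for a, b in zip(x, y)]
    # Between-subject and within-subject mean squares (one-way ANOVA).
    ms_between = k * sum((m - grand) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for a, b, m in zip(x, y, subject_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical readings (°C): the second device reads exactly 3 degrees lower.
rectal = [36.5, 37.0, 37.5, 38.0, 38.5, 39.0]
ear_offset = [t - 3.0 for t in rectal]

r = pearson_r(rectal, ear_offset)
icc = icc_oneway(rectal, ear_offset)
print(f"Pearson r = {r:.2f}, ICC = {icc:.2f}")  # Pearson r = 1.00, ICC = -0.44
```

With these made-up values the correlation is perfect while the ICC is negative, because the constant offset counts as disagreement within each patient.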
METHODS
Study Design and Population
We conducted a prospective study of patients visiting an urban Emergency Department (ED) to measure the correlation and agreement between the ear and the rectal thermometer. This study was conducted in the adult ED of a Level I Trauma Center during a three-week period in the summer. Each patient who entered the ED during scheduled research assistant shifts was approached for entry into the study. All days of the week and times of the day were sampled. Sampling continued until 100 patients were entered. The study was approved by the Colorado Multiple Institutional Review Board, and written informed consent was obtained from all patients except those whose temperature would normally be taken per rectum (i.e., patients who were comatose, tachypneic, or otherwise unable to have an oral temperature taken). Refusal to consent and contraindications to rectal temperature measurement (i.e., recent rectal surgery, injury or enema, rectal inflammation, painful hemorrhoids, and diarrhea) were the only exclusion criteria.
Instruments and Process
Trained research assistants (a nurse practitioner and a medical student) collected all data. Demographic and clinical data, including age, sex, presenting complaint, and final diagnosis, were recorded. Ear examinations were performed to assess for the presence of otitis media based on erythema, an abnormal light reflex, and bulging of the tympanic membrane. The degree of auditory canal obstruction by cerumen was recorded as: 0, <50%, >50%, or complete. Ear temperatures were measured first, immediately followed by the rectal measurement. All data were collected within three weeks of study instrument calibrations.
Ear temperature measurements were made using one of three First Temp® model 2000A thermometers (Intelligent Medical Systems, Carlsbad, CA). Each of these thermometers was calibrated by the manufacturer using a calibrator model 2000A-CL. These devices were set to display the "rectal equivalent" of the ear temperature. Following the manufacturer's guidelines, both right and left ear temperatures were measured by placing the probe in the external auditory canal and waiting for the instrument to indicate that it had completed its measurement.
Rectal temperature measurements were made using one of three IVAC model 2000 thermometers (San Diego, CA). For this study, calibration was performed by an independent laboratory using a water bath technique (at 98.07° and 103.33°F). All the IVAC instruments were found to record temperatures accurately within 0.2°F of the reference standards. Rectal temperatures were measured by placing the thermometer probe 4-5 cm into the rectum and waiting for the instrument to sound a completion tone (predictive mode).
Data Analysis
Data analysis was performed using the SAS statistical and graphics packages (SAS Institute Inc, Cary, NC). For temperature data, the mean, standard deviation, and range are reported. The correlation between ear and rectal temperature measurements was determined using Pearson's r. Agreement was measured by the Intraclass Correlation Coefficient (ICC). Agreement was also ascertained by determining the ability of the ITT to discriminate between febrile and afebrile patients. The rectal temperature was considered the gold standard for temperature, and a value ≥38.5°C was considered febrile. The sensitivity and specificity of the ITT in detecting febrile patients were calculated using standard 2 × 2 contingency tables.
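The 2 × 2 classification step described under Data Analysis can be sketched as follows. This is a minimal illustration rather than the authors' SAS code: the function names and the paired readings are hypothetical, and only the 38.5°C fever threshold and the use of the rectal reading as the reference come from the text.

```python
FEVER_THRESHOLD_C = 38.5  # fever definition from the text (rectal >= 38.5 °C)


def two_by_two(reference, test, threshold=FEVER_THRESHOLD_C):
    """Cross-classify paired readings against the reference device.

    Returns (true_pos, false_neg, false_pos, true_neg).
    """
    tp = fn = fp = tn = 0
    for ref, new in zip(reference, test):
        ref_febrile = ref >= threshold
        new_febrile = new >= threshold
        if ref_febrile and new_febrile:
            tp += 1
        elif ref_febrile:
            fn += 1
        elif new_febrile:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


def sensitivity_specificity(tp, fn, fp, tn):
    """Sensitivity and specificity of the test device versus the reference."""
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical paired readings (°C), not data from the study.
rectal = [37.0, 38.6, 39.1, 36.8, 38.5, 37.9]
ear = [37.2, 38.4, 39.0, 36.5, 38.7, 38.6]

table = two_by_two(rectal, ear)
sens, spec = sensitivity_specificity(*table)
print(table, f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Treating the rectal reading as the reference mirrors the gold-standard choice stated above; swapping the arguments would instead evaluate the rectal device against the ear device.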
Table 1. Temperature (°C) Measurements
                   Mean Temperature   Standard Deviation   Range
Rectal             37.38              0.88                 35.9 to 40.0
Ear                37.25              0.87                 33.8 to 39.7
Mean difference    0.13               0.68                 -1.0 to 3.2
N = 100.
RESULTS
One hundred subjects were enrolled. The average age of the sample population was 38.8 ± 18.1 years (range = 17 to 91 years). Fifty-four percent were female. The left and right ear temperatures showed a high correlation (r = 0.94) and agreement (ICC = 0.94); therefore, only the right ear temperature results are reported. The ear and rectal thermometers recorded similar mean temperatures, standard deviations, and ranges (Table 1). Figure 1 displays the distribution of our data about the regression line. The correlation between the ear and rectal temperatures was 0.70 (p < .0001), and the agreement (ICC) was 0.74. The mean temperature difference (rectal - ear) between the two devices was 0.1 ± 0.7°C (range = -1.0 to 3.2). In 10% of patients, the temperature difference was ≥1°C.
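One way to read the reported mean difference and its standard deviation is through 95% limits of agreement, in the spirit of Bland and Altman (14). The study itself does not report these limits. The sketch below applies the usual mean ± 1.96 SD approximation to the summary values in Table 1 and assumes the differences are roughly normally distributed.

```python
def limits_of_agreement(mean_diff, sd_diff, z=1.96):
    """Approximate 95% limits of agreement: mean difference ± z × SD.

    Assumes roughly normally distributed differences; illustrative only.
    """
    half_width = z * sd_diff
    return mean_diff - half_width, mean_diff + half_width


# Summary values from Table 1 (rectal minus ear, °C).
lower, upper = limits_of_agreement(mean_diff=0.13, sd_diff=0.68)
print(f"95% limits of agreement: {lower:.2f} to {upper:.2f} °C")  # -1.20 to 1.46
```

Under that assumption, a single ear reading could plausibly differ from the rectal reading by more than a degree in either direction, which is consistent with the 10% of patients whose readings differed by ≥1°C.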
Figure 1. Scatterplot of ear and rectal temperature measurements (°C) with the linear regression line (ear = 11.0 + 0.7 × rectal). Note: one observation falls outside the axes range; each asterisk may represent more than one observation. [Scatterplot not reproduced; horizontal axis: rectal temperature, 35 to 40°C.]
The sensitivity of the ear temperature in detecting a fever (≥38.5°C) was 60% (95% CI = .26, .88). Table 2 displays the 2 × 2 table and test statistics for the detection of febrile patients using the ear thermometer. The ITT temperatures among the four patients with false negative measurements were 37.4°, 38.0°, 38.3°, and 38.4°C. The mean temperature difference was significantly greater (0.9 ± 1.0°C) for the nine patients with complete cerumen impaction compared to the other 91 patients (0.05 ± 0.6°C, p = .04). There was no change in mean temperature difference for the five patients with otitis media.
Table 2. Detection of Fever*
                 Rectal Febrile   Rectal Afebrile
Ear Febrile      6                2
Ear Afebrile     4                88
Prevalence = 10%; Sensitivity = 60%; Specificity = 98%; Positive Predictive Value = 75%; Negative Predictive Value = 98%
*Fever = temperature ≥38.5°C. N = 100.
DISCUSSION
New devices, like drugs and procedures, should undergo rigorous evaluation before they are embraced by practitioners. This study shows that the infrared tympanic thermometer is inaccurate compared to the electronic rectal thermometer. There is only moderate agreement between these two devices; furthermore, if fever is present, it is detected by ear thermometry only 60% of the time.
Previous studies examining the ear thermometer in the ED and other settings have reported mixed results (5). Some studies have reported high correlations and reached encouraging conclusions (6,7). Others have yielded less promising results, indicated by a low sensitivity in detecting fever or a significant incidence of divergent readings between the rectal and ear temperature measurements (8-12).
One common error in previously published studies has been to rely on correlation coefficients. Many investigators have obtained high correlation coefficients (Pearson's r) and have inferred that the ear thermometer is accurate. Such conclusions are unwarranted, and this use of the r-statistic is misleading (13). Correlation statistics might be used to test the relatedness of the PaCO2 and the peak expiratory flow rate in a sample of asthmatics. Correlation statistics would also be appropriate to test for a relationship between temperature and respiratory rate. These examples are in marked contrast to the situation of measuring temperature with two devices. There cannot be any question that the ear and rectal measurements of temperature will correlate, for they are measuring the same parameter: body temperature. The critical issue in clinical practice is whether the rectal and ear temperature measurements agree. If, and only if, the agreement is high, one can substitute one measure for the other. Correlation coefficients, however, do not indicate agreement.
The high correlation statistics reported in the published studies of temperature measurement may mislead the clinician in several ways. First, systematic biases may be ignored. For example, one device could systematically record every patient's temperature three degrees higher than the other device, with perfect correlation, but no agreement, between the two (3,14). Second, misclassification of febrile patients is often overlooked. Third, data have seldom been examined for areas of over- or underestimation. Finally, correlation statistics invariably have been accompanied by p-values, and undue weight has been given to hypothesis testing rather than to the magnitude of the agreement or error. In this study, despite a highly statistically significant value for correlation (p < 0.0001), the ear thermometer must be judged inaccurate, based on a level of agreement less than 0.8 and substantial misclassification of febrile patients.
LIMITATIONS AND FUTURE RESEARCH
This study demonstrates that the agreement between the ear and rectal temperature measurements is unsatisfactory. However, the source of the problem remains unclear. Errors in technique may be responsible (15-17); for example, without an adequate seal, the ear temperature measurement may be influenced by the ambient air (18). The presence of otitis media or cerumen in the auditory canal may also contribute to errors in ear thermometry. In the current study, complete cerumen obstruction of the external auditory canal resulted in error. However, as a screening tool in the clinical setting, it is doubtful that ITT measurement would be preceded by an ear examination. Other factors that may introduce error in ear temperature measurements include lags between rectal and core temperatures during rapid temperature shifts (19) and differences in software algorithms programmed by various ITT manufacturers (4,20,21). In addition, due to a low prevalence rate of fever in our sample, confidence intervals around the ability to detect a fever are wide.
Future studies of new measuring devices will continue to be important to emergency clinicians. In such studies, correlations should not be confused with agreement. Measures of agreement may take many forms, including 2 × 2 contingency tables with measures of sensitivity and specificity to indicate whether the new test conforms to a gold standard. Alternatively, the ICC, a regression statistic, may be employed to measure agreement between two measuring devices. The ICC accounts simultaneously for correlation and agreement, without making any assumptions about a true gold standard. Studies of new measuring devices that report correlation coefficients should be scrutinized carefully. As stated by Kramer and Feinstein, "indexes of trend [correlation] are inadequate for describing concordance, for two measurements may be very closely related but never agree" (3).
CONCLUSION
Clinicians should be certain that agreement, rather than correlation, is reported when new devices are compared to standard ones. Until further engineering advances are made and data are accumulated for unadjusted normal ear temperature, clinicians should be wary in replacing the old rectal thermometer with the new ear thermometer.
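The wide confidence interval noted under Limitations can be illustrated with the sensitivity result itself, where 6 of 10 febrile patients were detected. The paper does not say how its interval was computed; the sketch below assumes an exact (Clopper-Pearson) binomial interval, which gives limits close to the reported .26 and .88. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_binomial_ci(successes, n, alpha=0.05, tol=1e-9):
    """Clopper-Pearson interval, found by bisection on the binomial CDF.

    Assumption: an exact interval is used here for illustration; the
    paper does not state its method.
    """

    def solve(k, target):
        # binom_cdf(k, n, p) decreases as p increases, so bisect on p.
        lo, hi = 0.0, 1.0
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if successes == 0 else solve(successes - 1, 1 - alpha / 2)
    upper = 1.0 if successes == n else solve(successes, alpha / 2)
    return lower, upper


# 6 of the 10 patients febrile by rectal thermometer were febrile by ITT.
lower, upper = exact_binomial_ci(6, 10)
print(f"sensitivity = 0.60, 95% CI = {lower:.2f} to {upper:.2f}")  # 0.26 to 0.88
```

With only 10 febrile patients, the interval spans most of the plausible range, so the data cannot pin down the true sensitivity at all closely.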
Acknowledgments - The authors would like to thank Janice Thomas, RN, ANPC, and Cheryl Jannett, Medical Student, for their efforts in data collection.
REFERENCES
1. Alexander D, Kelly B. Responses of children, parents and nurses to tympanic thermometry in the pediatric office. Clin Pediatr. 1991;30(4 Suppl):53-6.
2. Barber N, Kilmon CA. Reactions to tympanic temperature measurement in an ambulatory setting. Pediatr Nurs. 1989;15(5):477-81.
3. Kramer MS, Feinstein AR. Clinical biostatistics. The biostatistics of concordance. Clin Pharmacol Ther. 1981;29(1):111-23.
4. Bravo G, Potvin L. Estimating the reliability of continuous measures with Cronbach's alpha or the intraclass correlation coefficient: toward the integration of two traditions. J Clin Epidemiol. 1991;44:381-90.
5. Terndrup TE. An appraisal of temperature assessment by infrared emission detection tympanic thermometry. Ann Emerg Med. 1992;21:1483-92.
6. Green MM, Danzl DF, Praszkier H. Infrared tympanic thermography in the emergency department. J Emerg Med. 1989;7(5):437-40.
7. Ward L, Kaplan RM, Paris PM. A comparison of tympanic and rectal temperatures in the emergency department [abstr]. Ann Emerg Med. 1988;17(4):435.
8. Johnson KJ, Bhatia P, Bell EF. Infrared thermometry of newborn infants. Pediatrics. 1991;87(1):34-8.
9. Kenney RD, Fortenberry JD, Surratt SS, Ribbeck BM, Thomas WJ. Evaluation of an infrared tympanic membrane thermometer in pediatric patients. Pediatrics. 1990;85(5):854-8.
10. Donoto M, McHugh TP. Inaccuracy of infrared tympanic membrane temperatures in young children [abstr]. Ann Emerg Med. 1991;20(4):449.
11. Nierman DM. Core temperature measurement in the intensive care unit. Crit Care Med. 1991;19:818-23.
12. Freed GL, Fraley JK. Lack of agreement of tympanic membrane temperature assessments with conventional methods in a private practice setting. Pediatrics. 1992;89:384-6.
13. Lowenstein SR, Koziol-McLain J, Badgett RG. Concordance versus correlation [letter]. Ann Emerg Med. 1993;22(2):269.
14. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307-10.
15. Pransky SM. The impact of technique and conditions of the tympanic membrane upon infrared tympanic thermometry. Clin Pediatr. 1991;30(4 Suppl):50-2.
16. Shenep JL, Adair JR, Hughes WT, et al. Infrared, thermistor, and glass-mercury thermometry for measurement of body temperature in children with cancer. Clin Pediatr. 1991;30(4 Suppl):36-41.
17. Weiss ME. Tympanic infrared thermometry for fullterm and preterm neonates. Clin Pediatr. 1991;30(4 Suppl):42-5.
18. Doyle F, Zehner WJ, Terndrup TE. The effect of ambient temperature extremes on tympanic and oral temperatures. Am J Emerg Med. 1992;10(4):285-9.
19. Moorthy SS, Winn BA, Jallard MS, Edwards K, Smith ND. Monitoring urinary bladder temperature. Heart Lung. 1985;14:90-3.
20. Erickson RS, Meyer LT. Accuracy of infrared ear thermometry and other temperature methods in adults. Am J Crit Care. 1994;3(1):40-54.
21. Chamberlain JM, Terndrup TE, Alexander DT, et al. Determination of normal ear temperatures with an infrared emission detection thermometer. Ann Emerg Med. 1995;25:15-20.