Interobserver reliability in quantitative MRI assessment of temporomandibular joint disk status


B. Nebbe, BDS, MDent, FFD(SA)Ortho, PhD, a S. L. Brooks, DDS, MS, b D. Hatcher, DDS, MSc, MRCD(C), c L. G. Hollender, DDS, PhD, Odont Dr, d N. G. N. Prasad, PhD, e and P. W. Major, DDS, MSc, MRCD, f Edmonton, Alberta, Ann Arbor, Mich, San Francisco, Calif, and Seattle, Wash
UNIVERSITY OF ALBERTA, UNIVERSITY OF MICHIGAN, UNIVERSITY OF CALIFORNIA AT SAN FRANCISCO, AND UNIVERSITY OF WASHINGTON

Objective. The purpose of this study was to investigate interobserver reliability of a new technique for quantification of magnetic resonance images of temporomandibular joint disk status. Study design. Sixty magnetic resonance images of adolescent temporomandibular joints were randomly drawn for analysis. Four experienced observers traced the articular disk and osseous structures on sagittal magnetic resonance slice images. Quantitative measurements of disk length and disk displacement were recorded for each slice of 57 joints traced by each observer through use of a new quantification technique. Intraclass correlation coefficients were computed to assess interobserver agreement in the tracing of joint structures. Results. The calculated intraclass correlation coefficient was 0.681 for disk length and 0.830 for disk displacement. In addition, the mean variability among observers was 1.041 mm for measurement of disk length and 0.972 mm for measurement of disk displacement. Conclusions. Interobserver agreement is high when the new quantification technique is used to interpret magnetic resonance images. (Oral Surg Oral Med Oral Pathol Oral Radiol Endod 1998;86:746-50)

a Research Associate, TMD Investigation Unit, Faculty of Medicine and Oral Health Sciences, University of Alberta.
b Professor, Department of Oral Medicine, Pathology and Surgery, University of Michigan, School of Dentistry.
c Acting Associate Professor, Department of Restorative Dentistry, UCSF School of Dentistry.
d Director, Division of Oral Radiology, Department of Oral Medicine, University of Washington.
e Associate Professor, Department of Mathematical Sciences, University of Alberta.
f Professor, Division of Orthodontics, Director, TMD Investigation Unit, Faculty of Medicine and Oral Health Sciences, University of Alberta.
Received for publication Oct. 7, 1997; returned for revision Dec. 5, 1997; accepted for publication June 1, 1998.
Copyright © 1998 by Mosby, Inc. 1079-2104/98/$5.00 + 0 7/16/92819
ORAL SURGERY ORAL MEDICINE ORAL PATHOLOGY, Volume 86, Number 6, December 1998

Magnetic resonance imaging (MRI) has been used as an imaging modality for assessment of temporomandibular joint (TMJ) soft tissue structures. MRI offers direct visualization of the articular disk relative to the osseous articular surfaces, providing excellent spatial and contrast resolution of the TMJ components. 1-3 Interpretation of MRI-determined disk position has traditionally relied on subjective evaluation by experienced radiologists. More recently, specific subjective categories of disk position have been introduced for use in the classification of joints. 4 The classification techniques make use of both sagittal and coronal images to explain the 3-dimensional appearance and position of the articular disk. 1-3,5 These subjective categories presume a continuous progression of disk displacement from normal disk position to eventual disk displacement without reduction. Classification according to categoric data is highly descriptive and aids in the communication of findings between observers. In addition, the formalization of structured categories with specific criteria facilitates consistent interpretation and systematic appraisal, which may help determine the prevalence and incidence of a disorder. More importantly, reliability and reproducibility studies on these classification techniques have tested intraobserver and interobserver variability, which proved to be acceptable when tested with kappa statistics for categoric data. 5,6

Quantification of disk position on MRIs of the TMJ has been attempted 7,8 in preference to qualitative subjective description. Drace and Enzmann 7 measured the angular displacement of the posterior band of the disk relative to the 12 o'clock position on the condylar head. In their study, intraobserver and interobserver variation was cited to be less than ±5 degrees. More recently, projective geometry has been used to predict the relationship between the condyle, disk, and fossa of the temporomandibular joint. 8 This technique made use of ratios of areas defined by 3 points in a mathematic equation to demarcate disk position instead of conventional linear or angular measurements. Unfortunately, the study provided no interobserver reliability measures to assess the reliability of point identification in that technique.

A new technique has been developed that quantifies disk position and disk length, describing disk status in terms of continuous linear measurements. 9 Cross-validation and misclassification errors of the technique have been tested, and intraobserver reliability measures have been conducted. 9 However, interobserver reliability with observers unfamiliar with the technique has not been tested. Therefore, the purpose of our present study was to investigate the interobserver reliability (agreement) in the application of a new technique developed for the quantification of TMJ disk status through use of MR images.

Fig 1. Morphologic joint structures traced by observers.

MATERIAL AND METHODS

As part of a screening evaluation of TMJ status, 194 adolescent subjects (male and female) between the ages of 10 and 17 years underwent MRI of the TMJs. All subjects consented to inclusion in the study, which was approved by the Joint Dentistry/Pharmacy Human Ethics Committee. Bilateral MR images of the TMJs were obtained through use of a 1.0 Tesla magnet (Shimadzu Corporation, Tokyo, Japan) and a unilateral 3-inch surface receiver coil. MRI axial scout images were obtained at the level of the TMJs to identify the long axis of the condyles. Coronal images were obtained parallel to the condylar long axis to aid in identification of lateral or medial disk displacements. Bilateral closed-mouth sagittal sections were obtained perpendicular to the long axis of the condyle through use of a polyvinylsiloxane (President Jet-Bite, Coltene/Whaledent Inc, Mahwah, NJ) centric occlusion bite registration. The position of the disk was determined on open-mouth images obtained with a Burnett caliper (Medrad, Pittsburgh, PA) set at 10 mm below the maximal voluntary mouth opening. T1-weighted 500/20 (repetition time [in ms]/echo time [in ms]) pulse sequences were performed for all coronal and sagittal slices with the slice thickness set at 3 mm; the field of view was 140 mm, the NEX was 2, and the image matrix was 204 x 204.

Of the 388 joints imaged, 70 were randomly selected for evaluation by a panel of 4 observers experienced in the interpretation of MR images of the TMJ. The panel of observers consisted of 3 maxillofacial dental radiologists and 1 orthodontist. All observers had different training backgrounds, and they had not previously worked together. The orthodontist, who was the principal investigator, was included as an observer after an extensive period of training that included the theory of MR imaging and evaluation of MRIs of the TMJ.

Each observer was asked to independently examine the MR images and trace the structures of the joint. Clear acetate was affixed to the sagittal MR images, and each observer was supplied with a 0.3-mm lead pencil to trace the outline of the TMJ articular structures. Each observer traced the outline of the head of the mandibular condyle, the neck and ascending ramus, the posterior slope of the articular eminence, and the position of the articular disk on each sagittal MR slice image produced (Fig 1). The first 10 MRIs were used for calibration purposes by each observer and were excluded from the study, leaving 60 joints to be traced. Each observer traced the anatomic landmarks specified, with particular attention to disk morphology and position. MR imaging of the TMJs with a slice thickness of 3 mm generally produced 4 slices per joint. Each slice was traced by each observer.

Standardized reference planes were transferred to each of the sagittal tracings according to previously published procedures. 9 This required the determination of the long axis of the condylar neck on both the MRI sagittal images and the corresponding lateral cephalometric radiographs through use of a circle center technique (Fig 2). Matching the long axis of the condylar neck on the two types of image allowed for the transfer of the Frankfort Horizontal (FH) from the lateral cephalometric radiographs to the sagittal MR images. Once the FH was transferred to the MR images, an average Eminence Reference plane (ERP) was determined at 50 degrees to the FH plane (Fig 3). The ERP served as the reference plane onto which identified joint structures could be projected perpendicularly for measurement of disk displacement and disk length. Disk length was measured as the linear distance along the ERP from the projected posterior band of the disk to the projected anterior band of the disk. Disk displacement required the projection of the condylar load point (the closest point on the outline of the anterior superior surface of the condyle to the posterior surface of the articular eminence) onto the ERP. This facilitated the measurement of disk displacement as the linear measurement of the midpoint of the articular disk relative to the projected condylar load point (Fig 3). In this way, quantitative measurements were produced for each slice traced by each observer for 57 MRIs of the TMJs; 3 joints were judged to be of unsuitable quality for tracing of the joint structures and were excluded from the sample traced by each of the observers.

Fig 2. Determination of condylar long axis.

Fig 3. Establishment of reference planes for determination of disk length and disk displacement. A, Posterior band of disk. B, Condylar load point. C, Midpoint of articular disk. D, Anterior band of disk. (Planes labeled in the figure: Frankfort Horizontal; Eminence Reference plane.)
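The projection step described above can be illustrated with a short sketch. It is not the authors' implementation: it assumes that traced landmarks are available as (x, y) coordinates in millimeters, that the FH lies along the x-axis, that the disk midpoint is taken halfway between the projected posterior and anterior bands, and that a positive displacement means the midpoint lies further along the ERP than the load point. The function names are hypothetical.

```python
import math

ERP_ANGLE_DEG = 50.0  # angle between the ERP and the FH plane, as stated in the text


def erp_direction(angle_deg=ERP_ANGLE_DEG):
    """Unit vector along the ERP, assuming the FH plane lies along the x-axis."""
    theta = math.radians(angle_deg)
    return (math.cos(theta), math.sin(theta))


def project_onto_erp(point, direction):
    """Signed position along the ERP of the perpendicular projection of a point.

    This is the dot product with the unit direction vector; only differences
    between projected positions are used, so the origin is arbitrary.
    """
    return point[0] * direction[0] + point[1] * direction[1]


def disk_measurements(posterior_band, anterior_band, load_point,
                      angle_deg=ERP_ANGLE_DEG):
    """Return (disk_length, disk_displacement) in the units of the inputs."""
    d = erp_direction(angle_deg)
    post = project_onto_erp(posterior_band, d)
    ant = project_onto_erp(anterior_band, d)
    load = project_onto_erp(load_point, d)
    disk_length = abs(ant - post)
    # Assumption: disk midpoint is halfway between the projected bands.
    midpoint = (post + ant) / 2.0
    # Assumption: sign convention chosen for illustration only.
    disk_displacement = midpoint - load
    return disk_length, disk_displacement


if __name__ == "__main__":
    # Hypothetical landmark coordinates (mm), for illustration only.
    length, displacement = disk_measurements(
        posterior_band=(2.0, 1.0),
        anterior_band=(8.0, 9.0),
        load_point=(4.0, 3.0),
    )
    print(f"disk length = {length:.2f} mm, displacement = {displacement:.2f} mm")
```

Because every structure is reduced to a single position along one reference line, both measurements are differences of scalars, which is what allows them to be treated as continuous variables in the analysis.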

Analysis of data

Interobserver variability in identification and tracing of the articular structures was computed. A 3-fold nested error variance component model was used to analyze disk length (DL) and disk displacement (DD) data from 57 joints belonging to 47 subjects. Three joints were excluded from the original sample of 60 joints to be traced because they were of poor diagnostic quality and could not be traced by any of the examiners. All of the remaining 57 joints were included in the analysis whether all sagittal MRI slices were traced by each observer or not. This necessitated the design and use of an unbalanced analysis model. Estimates of intraclass correlation coefficients (ICC) were computed through use of the 3-fold nested error variance component model to assess interobserver agreement in the determination of DL and DD by the 4 observers when sagittal MRI slices of the TMJ were traced. The model considered here incorporated random effects resulting from all of the following:

• subject variability
• joint variability within subjects
• slice variability within the joints (as well as within subjects)
• observer variability.

Estimates of the variance components for each of the above mentioned random effects were obtained by means of a "REML" (restricted maximum likelihood) method incorporated into the function of "Varcomp" (variance component) in the S-Plus statistical package (S-Plus Package 1995: Statistical Science in S-Plus Guide to Statistical and Mathematical Analysis, Version 3.0; Mathsoft, Inc, Seattle). The aforementioned estimates were used in the following formula to compute the ICC for DL and DD measures:

$$\mathrm{ICC} = \frac{s_p^2 + s_j^2 + s_s^2}{s_p^2 + s_j^2 + s_s^2 + s_o^2 + s_e^2}$$

where
ICC = intraclass correlation coefficient
s_p^2 = estimate of variance component due to patient
s_j^2 = estimate of variance component due to joint within patient
s_s^2 = estimate of variance component due to slice within joint within patient
s_o^2 = estimate of variance component due to observers
s_e^2 = estimate of variance component due to sampling error.

Ideally, most of the variation should be explained by slice, joint, and subject (patient) differences, and the calculated ICC should thus be high (close to 1) if the variation due to observers and error (residual) is low.

In addition, standard deviations of measurements (in mm) recorded for disk length and disk displacement were computed for each joint from slices traced within each of the 57 joints. This gave a standard deviation score among observers for each joint. Thereafter, the mean and range of these standard deviations for all joints were computed to obtain inferences on interobserver variability. This provides a quantitative measure in millimeters of how much the observers varied in the measurement of DL and DD.

Table I. Estimates of variance values for disk displacement and disk length

Variable    Disk length    Disk displacement
s_p^2       1.492          5.104
s_j^2       1.323          2.120
s_s^2       0.673          0.432
s_o^2       0.101          0.017
s_e^2       1.530          1.569
ICC         0.681          0.830
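As a check on the ICC formula above, the ratio can be recomputed from the variance component estimates reported in Table I. The sketch below is illustrative only: the function and dictionary names are hypothetical, and the assignment of the flattened table values to rows follows their reading order in the source. Recomputing from the rounded components gives roughly 0.68 for disk length and roughly 0.83 for disk displacement, in line with the reported 0.681 and 0.830; small differences in the third decimal are to be expected from rounding.

```python
def icc_from_components(s_p2, s_j2, s_s2, s_o2, s_e2):
    """ICC as the share of total variance due to patient, joint, and slice."""
    between = s_p2 + s_j2 + s_s2
    total = between + s_o2 + s_e2
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between / total


# Variance component estimates as reported in Table I.
TABLE_I = {
    "disk_length": dict(s_p2=1.492, s_j2=1.323, s_s2=0.673,
                        s_o2=0.101, s_e2=1.530),
    "disk_displacement": dict(s_p2=5.104, s_j2=2.120, s_s2=0.432,
                              s_o2=0.017, s_e2=1.569),
}

if __name__ == "__main__":
    for name, components in TABLE_I.items():
        print(f"{name}: ICC = {icc_from_components(**components):.3f}")
```

The calculation also shows why the two ICCs differ: the observer and error components are of similar size for both variables, but the patient and joint components are much larger for disk displacement, so the same absolute observer disagreement accounts for a smaller share of the total variance.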

RESULTS

MR imaging of the TMJs with 3 mm selected as the slice width generally produced 4 slices per joint. Each of the 4 observers traced the slices produced by imaging the 60 joints. Three joints could not be traced, however; this left 57 joints available for analysis. In the absence of repeated ratings by each observer on each slice, the degree to which each observer departed from his or her usual rating tendencies could not be evaluated.

The calculated ICC for 4 observers tracing MRI slices of 57 joints was 0.681 for disk length and 0.830 for disk displacement. The values derived from the 3-fold nested error variance component model for calculation of ICC for disk length and disk displacement are provided in Table I.

Average variability among all observers in the determination of disk length measurements was calculated to be 1.041 mm, with a maximum of 3.110 mm. Average variability among all observers for measurement of disk displacement was calculated to be 0.972 mm, with a maximum of 5.061 mm. The mean variability among all observers for determination of either quantitative variable was therefore approximately 1.000 mm (Table II).

Table II. Interobserver variability in measurement of disk displacement and disk length

Variable             Mean SD    Minimum SD    Maximum SD
Disk length          1.041      0.253         3.110
Disk displacement    0.972      0.334         5.061

SD, Standard deviation taken over observers (in mm).
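The variability summary reported in Table II can be sketched as follows. The text does not spell out exactly how slice-level values were combined within a joint, so this is one plausible reading rather than the authors' procedure: the sample standard deviation across observers is computed for each slice, averaged over the slices of a joint, and then summarized across joints by its mean, minimum, and maximum. The function names and the example measurements are hypothetical.

```python
from statistics import mean, stdev


def joint_observer_sd(slices):
    """Average, over the slices of one joint, of the SD across observers.

    `slices` is a list with one entry per slice; each entry lists the
    measurements (mm) recorded by the observers who traced that slice.
    Slices traced by fewer than 2 observers are skipped (assumption).
    """
    per_slice = [stdev(obs) for obs in slices if len(obs) >= 2]
    if not per_slice:
        raise ValueError("joint has no slice traced by at least 2 observers")
    return mean(per_slice)


def summarize_variability(joints):
    """Mean, minimum, and maximum of the per-joint observer SDs."""
    sds = [joint_observer_sd(slices) for slices in joints.values()]
    return {"mean_sd": mean(sds), "min_sd": min(sds), "max_sd": max(sds)}


if __name__ == "__main__":
    # Hypothetical disk length measurements (mm): joint -> slices -> observers.
    example = {
        "joint_A": [[10.0, 11.0, 12.0, 11.0], [9.0, 9.5, 9.0, 10.0]],
        "joint_B": [[8.0, 10.0, 8.5, 10.5], [7.5, 8.0, 8.0]],
    }
    for key, value in summarize_variability(example).items():
        print(f"{key} = {value:.3f} mm")
```

Unlike the ICC, this summary is expressed in millimeters, so it can be compared directly with the size of the structures being measured.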

DISCUSSION

Three-fold nested error variance component analysis was used to test the hypothesis that the difference among observers in the measurement of disk status is equal to zero. Determination of interobserver agreement by means of this model was used to generate the component estimates of variance. ICCs showed almost perfect interobserver agreement (0.830) in identification and the tracing of points necessary to define disk displacement. This was not true for determination of disk length measurements, in which case only substantial agreement (0.681) between observers was evident.

The interpretation of ICC calculations in isolation, however, does not adequately indicate the magnitude of the difference among observers because no units of measurement are associated with these computed values. Mean variability among all observers for the determination of disk length was therefore calculated, and it was found to be 1.041 mm, with a maximum of 3.110 mm. The mean variability among the 4 observers in the determination of disk displacement was calculated to be 0.972 mm, with a maximum of 5.061 mm. Whereas the calculated ICC for disk length was only 0.681, the mean variability among all observers was only slightly more than 1 mm. Measurement of mean variability is relevant to determine whether such variability is large relative to the sagittal length of the disk, which is measured at only 10.00 mm. If mean variability were large, it could significantly alter the interpretation of joint findings when categoric classification of disk disorders is used. The ICC for disk displacement was higher at 0.830. This suggests that observers were relatively consistent in identifying landmarks essential for the determination of disk displacement.

Whether disk length or disk displacement is considered, this study clearly indicates that through use of the measurement technique previously described, 9 interobserver variability is relatively small, with between-observer variability averaging approximately 1 mm for either variable measured. This implies that this technique for quantification of disk status may be used by observers with varied training and backgrounds in interpretation of MR images and will produce reproducible results. From a clinical perspective, this is important with regard to providing consistent, reproducible, and meaningful information from MRIs of the TMJ. Measurements of DL and DD may aid in determining the severity and implications of disk displacement as it affects the TMJ. Improvements in interobserver variability may be achieved through

• having observers who have previously worked and trained together rate the slices
• allowing observers to become familiar with the technique and tracing procedure
• improving the quality, resolution, and contrast of MR images produced
• applying the quantitative technique to MR images obtained from an adult population where less motion artifact may be evident as a result of increased cooperation.

Acknowledgements: TMD Investigations Unit, University of Alberta; Edmonton Diagnostic Imaging; Magnetic Resonance Imaging Centre of Edmonton; Raymond McIntyre Fund for Dentistry, University of Alberta.

CONCLUSIONS

ICC calculations for determination of disk displacement and disk length on MR images of the TMJ by 4 observers showed substantial to almost perfect agreement among observers. This indicates that the new quantitative technique for assessment of disk status can be applied by multiple observers who have different training backgrounds and are unfamiliar with the technique; despite these limitations, a high degree of interobserver agreement in interpretation may be achieved.

REFERENCES

1. Katzberg RW, Westesson PL, Tallents RH, Anderson R, Kurita K, Manzione JV, et al. Temporomandibular joint: MR assessment of rotational and sideways disk displacements. Radiology 1988;169:741-8.
2. Schwaighofer BW, Tanaka TT, Klein MV, Sartoris DJ, Resnick D. MR imaging of the temporomandibular joint: a cadaver study of the value of coronal images. Am J Roentgenol 1990;154:1245-9.
3. Brooks SL, Westesson PL. Temporomandibular joint: value of coronal MR images. Radiology 1993;188:317-21.
4. Tasaki MM, Westesson PL, Isberg AM, Ren YF, Tallents RH. Classification and prevalence of temporomandibular joint disk displacement in patients and symptom-free volunteers. Am J Orthod Dentofac Orthop 1996;109:249-62.
5. Tasaki MM, Westesson PL. Temporomandibular joint: diagnostic accuracy with sagittal and coronal MR imaging. Radiology 1993;186:723-9.
6. Westesson PL. Reliability and validity of imaging diagnosis of temporomandibular joint disorder. Advances in Dental Research 1993;7:137-51.
7. Drace JE, Enzmann DR. Defining the normal temporomandibular joint: closed-, partial open-, and open mouth MR imaging of asymptomatic subjects. Radiology 1990;177:67-71.
8. Silverstein R, Dunn S, Binder R, Maganzini A. MRI assessment of the normal temporomandibular joint with the use of projective geometry. Oral Surg Oral Med Oral Pathol 1994;77:523-30.
9. Nebbe B, Prasad NGN, Hatcher D, Major PW. Quantitative assessment of temporomandibular joint disk status. Oral Surg Oral Med Oral Pathol Oral Radiol Endod 1998;85:598-607.

Reprint requests:
Brian Nebbe, BDS, MDent, FFD(SA)Ortho, PhD
TMD Investigation Unit, Faculty of Dentistry
4068 Dentistry/Pharmacy Center
Faculty of Medicine and Oral Health Sciences
University of Alberta, Edmonton
Canada T6G 2N8