ORIGINAL ARTICLE
Discriminatory ability of the skeletal maturation index and the cervical vertebrae maturation index in detecting peak pubertal growth in Indonesian and white subjects with receiver operating characteristics analysis

Benny M. Soegiharto,a David R. Moles,b and Susan J. Cunninghamc
Jakarta, Indonesia, and London, United Kingdom

Introduction: In this study, we aimed to determine the effectiveness of the skeletal maturation index (SMI) and the cervical vertebrae maturation (CVM) index in discriminating between patients who have yet to attain their peak pubertal growth, compared with those who have reached or passed it. An additional aim was to determine whether there was any significant difference in the ability of the 2 methods to predict peak pubertal growth.

Methods: The study included 2167 patients with hand-wrist and lateral cephalometric radiographs. There were 648 Indonesian boys and 303 white boys (age range, 10-17 years), and 774 Indonesian girls and 442 white girls (age range, 8-15 years). The SMI was used to evaluate the stages of skeletal maturity from hand-wrist radiographs, and the CVM index was used to evaluate skeletal maturity from lateral cephalograms. Several commonly used cephalometric parameters representing maxillary and mandibular dimensions were also measured to construct growth curves and calculate peak velocity.

Results: Receiver operating characteristic (ROC) analysis was performed for the craniofacial morphology parameters for both sex and ethnic groups. The percentages of correctly classified subjects into the appropriate maturational stages for the mandibular parameters, as well as the stages with high sensitivity values for the maxillary parameters, showed that both the CVM index and the SMI have good discriminatory ability. On average, the results of the area under the curve (AUC) for the SMI (AUC >0.9) were greater than for the CVM (AUC >0.8), and the differences between them were also statistically significant (P <0.05 for all parameters investigated). However, the curves for both CVM and SMI approached the top left corner of the ROC graph, suggesting that both tests have good discriminatory ability, and the differences between the methods were only between 1% and 7%.

Conclusions: Both the CVM index and the SMI are valid methods to discriminate between patients who have not yet attained peak pubertal growth and those who have reached or passed it. The differences in discriminatory ability between the SMI and the CVM index in detecting peak pubertal growth were small. These results question the necessity of taking hand-wrist radiographs and exposing a child to additional radiation when the discriminatory ability is similar with the CVM index, which is readily derived from most lateral cephalograms. (Am J Orthod Dentofacial Orthop 2008;134:227-37)
From the UCL-Eastman Dental Institute for Oral Health Care Sciences, London, United Kingdom.
a Lecturer, Faculty of Dentistry, University of Indonesia, Jakarta, Indonesia; and former research graduate, Unit of Orthodontics, UCL-Eastman Dental Institute for Oral Health Care Sciences, London, United Kingdom.
b Senior clinical lecturer, Health Services Research.
c Senior lecturer and honorary consultant, Unit of Orthodontics.
Reprint requests to: Benny M. Soegiharto, Unit of Orthodontics, UCL-Eastman Dental Institute for Oral Health Care Sciences, 256, Grays Inn Rd, London WC1X 8LD, United Kingdom; e-mail, [email protected].
Submitted, June 2006; revised and accepted, September 2006.
0889-5406/$34.00
Copyright © 2008 by the American Association of Orthodontists.
doi:10.1016/j.ajodo.2006.09.062

One objective when undertaking orthodontic treatment during adolescence is to use craniofacial growth to aid the treatment of patients with skeletal discrepancies. The main area of interest for the orthodontist is to know whether a patient has attained peak pubertal growth or passed that point. This, in turn, determines whether growth modification is still a viable treatment option. It has been proposed that growth modification treatment for the maxilla (eg, with protraction headgear) should be started before the peak pubertal growth of the maxilla, whereas growth modification for the mandible (eg, functional appliances for Class II) has been shown to be more effective during or slightly after peak mandibular growth.1-3 The use of hand-wrist radiographs for growth assessment in orthodontics has been widely reported.4,5 However, this method has disadvantages, including the complexity of landmark identification that can lead to
228 Soegiharto, Moles, and Cunningham
inaccurate classification and, therefore, limits its clinical value for the prediction of pubertal growth.6,7 To avoid this complex assessment, Fishman8 introduced the skeletal maturation index (SMI), which offers an organized and relatively simple method of observing skeletal maturity by using 11 anatomic sites on the phalanges, the adductor sesamoid, and the radius, excluding the carpal bones. Nevertheless, hand-wrist radiographs still expose the child to additional radiation, and their use must be questioned if other comparable methods of assessment are available.6,7,9 The cervical vertebral maturation (CVM) index was introduced by Lamparski10 for growth assessment. Many authors have found that the CVM method is effective and clinically reliable.11-16 Its main advantage is that the vertebrae are already recorded on most routine lateral cephalometric radiographs; this eliminates the need for additional radiographic exposure and optimizes the use of routine cephalograms. Baccetti et al14 further simplified the CVM index to include only the second to the fourth vertebrae, so it can be useful even with a protective collar. Several authors investigated the correlations between the hand-wrist and cervical vertebrae methods.12,17-20 They found good correlations between them in various ethnic groups and recommended the cervical vertebrae method over hand-wrist radiographs. However, because both methods are used to assess growing patients, the correlation values will tend to be high anyway. Furthermore, the correlation values cannot show that 1 method is better than the other. 
The objectives of this study were (1) to determine whether the SMI and the CVM index can be used to identify patients who have yet to attain peak pubertal growth, compared with those who have reached or passed it; and (2) to determine whether there is a significant difference between the SMI and the CVM index in the prediction of peak pubertal growth by using receiver operating characteristic (ROC) analysis. Our study will add to the knowledge regarding the use of the SMI and the CVM index in different ethnic groups and identify whether 1 method is appreciably better than the other for orthodontic diagnosis and treatment planning.

MATERIAL AND METHODS
This was a cross-sectional study with hand-wrist and lateral cephalometric radiographs from Indonesian children of Deutero-Malay ethnic origin. The data were derived from school children in Jakarta and its vicinity, and the radiographs were taken as part of routine orthodontic examinations. The data for the white patients were obtained from
American Journal of Orthodontics and Dentofacial Orthopedics August 2008
Table I. Patient distribution based on chronologic age

                    Indonesian                      White
Age group (y)   Patients (n)  Percentage (%)   Patients (n)  Percentage (%)
Boys
10                   73           11.3              35           11.6
11                  105           16.2              30            9.9
12                   53            8.2              25            8.3
12.5                 45            6.9              29            9.6
13                   36            5.6              20            6.6
13.5                 50            7.7              35           11.6
14                   60            9.3              19            6.3
14.5                 36            5.6              21            6.9
15                   96           14.8              44           14.5
16                   52            8.0              28            9.2
17                   42            6.5              17            5.6
Total               648          100               303          100
Girls
8                    81           10.5              17            3.8
9                    51            6.6               9            2.0
9.5                  46            5.9              10            2.3
10                   48            6.2              16            3.6
10.5                 45            5.8              17            3.8
11                   54            7.0              26            5.9
11.5                 56            7.2              40            9.0
12                   65            8.4              37            8.4
12.5                 54            7.0              44           10.0
13                   85           11.0              73           16.5
14                  108           14.0              65           14.7
15                   81           10.5              88           19.9
Total               774          100               442          100
preexisting radiographs taken as part of routine orthodontic assessment at the Department of Orthodontics, Eastman Dental Center, University of Rochester Medical Center, Rochester, NY. Ethical approval was obtained from the University of Rochester after the Ethical Principles in Research Program examination. This study included 2167 patients: 648 Indonesian and 303 white boys (age range, 10-17 years), and 774 Indonesian and 442 white girls (age range, 8-15 years) (Table I). These age ranges were selected based on the assumption that, on average, girls reach skeletal maturation earlier than boys.5 Only patients with a Class I molar relationship and normal overjet and overbite were included to restrict the study to patients with normal dentofacial characteristics. Patients were included if they fulfilled the following criteria: (1) Deutero-Malay or white ethnic origin, (2) physically healthy with no relevant medical conditions, (3) Angle Class I molar relationship with normal overjet and overbite, (4) no previous orthodontic treatment, and (5) all lateral cephalometric radiographs and hand-wrist radiographs taken at the same time, with good clarity and contrast.
American Journal of Orthodontics and Dentofacial Orthopedics Volume 134, Number 2
Table II. Patients classified according to the SMI method

                       Indonesian                          White
                                     Age (y)                            Age (y)
SMI stage      Patients (n)   %     Mean    SD     Patients (n)   %     Mean    SD
Boys
1 (PP2=)           126      19.4   11.22   1.06        21        6.9   10.66   0.79
2 (MP3=)            94      14.5   11.95   1.26        26        8.6   11.29   1.13
3 (MP5=)            95      14.7   12.53   1.07        33       10.9   12.23   1.08
4 (S)               54       8.3   13.35   1.21        22        7.3   12.20   1.24
5 (DP3Cap)           4       0.6   14.28   1.12        21        6.9   13.04   0.81
6 (MP3Cap)          31       4.8   14.61   0.95        32       10.6   13.17   0.98
7 (MP5Cap)         102      15.7   14.70   1.12        51       16.8   14.54   1.11
8 (DP3U)            45       6.9   15.28   0.83        27        8.9   14.96   0.88
9 (PP3U)             8       1.2   15.52   0.83         6        2.0   16.05   1.39
10 (MP3U)           72      11.1   16.00   0.93        41       13.5   15.72   1.08
11 (RU)             17       2.6   17.35   0.58        23        7.6   16.85   0.90
Girls
1 (PP2=)           102      13.2    8.97   0.84        18        4.1    8.77   0.85
2 (MP3=)           106      13.7    9.86   1.09        16        3.6    9.92   1.51
3 (MP5=)            76       9.8   10.76   1.34        13        2.9   10.86   1.19
4 (S)               70       9.0   11.04   1.22        39        8.8   11.05   0.94
5 (DP3Cap)           4       0.5   11.91   1.38        17        3.8   11.33   0.97
6 (MP3Cap)          51       6.6   11.77   0.99        55       12.4   12.28   1.03
7 (MP5Cap)         117      15.1   12.77   1.11        56       12.7   12.71   0.99
8 (DP3U)            96      12.4   13.36   1.01        48       10.9   13.25   0.99
9 (PP3U)            15       1.9   13.99   1.16         4        0.9   13.99   0.98
10 (MP3U)          116      15.0   14.42   1.01        91       20.6   14.33   1.11
11 (RU)             21       2.7   15.26   0.40        85       19.2   15.27   0.85
PP2, the epiphysis of the proximal phalanx of the second finger is equal in width to its diaphysis; MP3, the epiphysis of the middle phalanx of the third finger is equal in width to its diaphysis; MP5, the epiphysis of the middle phalanx of the fifth finger is equal in width to its diaphysis; S, ossification of the ulnar sesamoid bone; DP3Cap, the epiphysis of the distal phalanx of the third finger caps its diaphysis; MP3Cap, the epiphysis of the middle phalanx of the third finger caps its diaphysis; MP5Cap, the epiphysis of the middle phalanx of the fifth finger caps its diaphysis; DP3U, complete fusion of the epiphysis and diaphysis of the distal phalanx of the third finger; PP3U, complete fusion of the epiphysis and diaphysis of the proximal phalanx of the third finger; MP3U, complete fusion of the epiphysis and diaphysis of the middle phalanx of the third finger; RU, complete fusion of the epiphysis and diaphysis of the radius.
The hand-wrist radiographs were viewed in a darkened room, and a black surround was used on the light box to eliminate excess light and facilitate landmark identification. Each radiograph was then classified according to the SMI, as described by Fishman (Table II).8 The CVM index and craniofacial morphology were assessed with lateral cephalometric radiographs. Observation was similarly undertaken in a darkened room with a black surround. The outlines of the cervical vertebrae (C2-C4) and the craniofacial structures of interest were traced on 0.003-in thick matte acetate cephalometric tracing paper (GAC International, Bohemia, NY) with a 0.5-mm diameter (2B) mechanical lead pencil. The CVM readings were then classified into the stages described by Baccetti et al14 (Table III). The craniofacial landmarks were manually traced and then digitized by using a customized computer program specifically for this study. The magnification values of the lateral cephalometric radiographs for the Indonesian
and white subjects were 5% and 15%, respectively. All linear measurements used in this study were adjusted accordingly. The landmarks of the craniofacial morphology parameters were digitized, and the linear measurements calculated by the computer program are shown in Figure 1. The parameters ANS-PNS and ANS-Me have been widely used in previous studies and were selected to represent horizontal and vertical maxillary growth.13,21,22 The parameter Ar-Pog was selected to represent horizontal mandibular growth, and Ar-Go represented growth of the ramus. These linear measurements have been widely used in other studies for assessing mandibular growth.11,15,16,22,23 S-Pog was selected to represent the anteroposterior position of the mandible relative to the cranial base. To establish intraoperator repeatability, 300 subjects (100 Indonesian subjects of each sex and 50 white subjects of each sex) were randomly selected for
Table III. Patients classified according to the CVM index

                       Indonesian                          White
                                     Age (y)                            Age (y)
CVM stage      Patients (n)   %     Mean    SD     Patients (n)   %     Mean    SD
Boys
1                  253      39.0   11.73   1.24        72       23.8   11.38   1.15
2                  117      18.1   12.82   1.34        34       11.2   12.44   1.18
3                  159      24.5   14.76   1.09       114       37.6   14.07   1.34
4                  100      15.4   15.80   0.99        66       21.8   15.54   1.23
5                   19       2.9   17.03   0.94        17        5.6   16.84   1.00
Girls
1                  269      34.8    9.69   1.18        44       10.0    9.55   1.35
2                   95      12.3   11.19   1.39        51       11.5   11.11   0.89
3                  226      29.2   12.83   1.25       125       28.3   12.58   1.09
4                  177      22.9   14.15   1.19       187       42.3   14.26   1.29
5                    7       0.9   14.68   1.03        35        7.9   15.33   0.81
reobservation with both methods. All observations were repeated within a month, and no more than 20 radiographs were observed at any session to minimize errors due to operator fatigue.

Fig 1. Diagrammatic representation of the anatomic landmarks and linear craniofacial morphology parameters used in this study.

Statistical analysis
The data for the repeatability study for the SMI and the CVM index were analyzed with SPSS software (version 12.0.1, SPSS for Windows, SPSS UK, Woking, Surrey, United Kingdom). Cohen’s kappa statistic was calculated to assess agreement for the SMI and the CVM. The results of the repeatability study for the craniofacial parameters were analyzed by using the Stata package (version 8.2, Intercooled Stata, Stata, College Station, Tex). The Bland and Altman method and the Lin concordance correlation statistics were calculated to assess agreement between the 2 measurements.

The ROC analysis, undertaken with the Stata package, aimed to assess the discriminatory ability of both methods and to establish whether there was a significant difference between them. ROC analysis has been widely used in the health care sector to compare 2 or more competing methods of screening.24 The ROC curve is obtained by calculating sensitivity and specificity, and then plotting the true positive probability (sensitivity) on the vertical axis and the false positive probability (1-specificity) on the horizontal axis for the entire range of cut-off points.24,25 The closer the ROC curve is to the upper left corner of the graph, the greater the overall accuracy of the test. The closer the curve comes to the 45° diagonal of the ROC space, the less accurate the test. A test that is completely useless will give a straight line from the bottom left corner to the top right corner; this indicates that the test has equal true positive and false positive values for all cut-off points.24 The ROC curve allows the results of 2 or more screening tests to be shown in 1 graph, thus giving a visual comparison of the tests. In addition, it is also possible to quantify the accuracy of the tests by
Table IV. Repeatability test results of the craniofacial morphology parameters in this study

              Lin concordance correlation         Bland and Altman
Parameter    ρc      Cb     Slope   Intercept   Mean difference   SD (difference)   LOA               P
Indonesian
ANS-PNS     0.982   1.000   1.019    −0.975        −0.075             0.595         −1.241, 1.091    0.076
SPog        0.998   1.000   0.005    −0.561         0.015             0.542         −1.047, 1.077    0.696
ArGo        0.991   1.000   0.993     0.326         0.030             0.609         −1.164, 1.224    0.487
ArPog       0.997   1.000   1.005    −0.547        −0.023             0.557         −1.113, 1.068    0.568
ANSMe       0.993   1.000   1.010    −0.613         0.040             0.585         −1.407, 1.587    0.335
White
ANS-PNS     0.996   1.000   1.007    −0.411        −0.010             0.396         −0.785, 0.765    0.801
SPog        0.995   1.000   0.994     0.687        −0.005             0.783         −1.540, 1.530    0.949
ArGo        0.960   0.998   1.047    −1.766         0.215             1.380         −1.584, 2.074    0.123
ArPog       0.998   1.000   0.996     0.391        −0.15              0.566         −1.125, 1.095    0.792
ANSMe       0.998   1.000   0.996     0.255         0.010             0.355         −0.686, 0.706    0.779

ρc, Lin concordance correlation coefficient; Cb, accuracy measurement; LOA, limits of agreement.
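The two agreement statistics reported in Table IV can be illustrated with a short Python sketch. This is not the authors' Stata procedure: the function names `bland_altman` and `lin_ccc` are hypothetical, the limits of agreement are assumed to be the mean difference ± 1.96 SD, and the concordance coefficient is assumed to use population (1/n) moments as in Lin's definition.

```python
import math
from statistics import mean, stdev


def bland_altman(first, second):
    """Mean difference, SD of the differences, and 95% limits of agreement
    for paired repeat measurements (assumed: mean difference +/- 1.96 SD)."""
    diffs = [a - b for a, b in zip(first, second)]
    d_mean = mean(diffs)
    d_sd = stdev(diffs)  # sample SD of the paired differences
    return d_mean, d_sd, (d_mean - 1.96 * d_sd, d_mean + 1.96 * d_sd)


def lin_ccc(first, second):
    """Lin concordance correlation coefficient and its accuracy term Cb.
    Assumes neither series is constant (otherwise Pearson r is undefined)."""
    n = len(first)
    mx, my = mean(first), mean(second)
    # Population (1/n) variances and covariance.
    sxx = sum((x - mx) ** 2 for x in first) / n
    syy = sum((y - my) ** 2 for y in second) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(first, second)) / n
    ccc = 2 * sxy / (sxx + syy + (mx - my) ** 2)
    pearson = sxy / math.sqrt(sxx * syy)
    cb = ccc / pearson  # 1.0 when there is no shift in location or scale
    return ccc, cb


if __name__ == "__main__":
    # Hypothetical repeat readings (mm), for illustration only.
    reading_1 = [50.1, 52.3, 48.7, 55.0, 51.2]
    reading_2 = [50.4, 52.0, 48.9, 54.6, 51.5]
    print(bland_altman(reading_1, reading_2))
    print(lin_ccc(reading_1, reading_2))
```

As a check on the 1.96 SD assumption, the Indonesian ANS-PNS row of Table IV gives −0.075 ± 1.96 × 0.595, or about −1.241 to 1.091, which agrees with the limits of agreement listed for that parameter.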
evaluating the area under the curve (AUC). The AUC is a measure of the accuracy of a diagnostic procedure and is frequently used for comparisons between tests.26 An AUC of 0.5, or 50%, corresponds to a straight diagonal line (the green diagonal line in the graphs); this indicates that a test is not useful. The closer the AUC value is to 1.0 (or 100%), the more accurate the test.

Before the ROC analysis, growth curves were constructed by plotting each craniofacial parameter against the subject’s age in a scatter plot. A “running mean smoother line” was then calculated to smooth the plotted values and join the averaged values in a single best-fit line through the data. Next, a growth velocity curve was constructed by calculating the gradient of every part of the “running mean smoother line.” All gradients were then plotted against age to be used as a proxy measure for the actual growth velocity curve. The peak velocity of each parameter of interest was taken as the point at which this velocity curve (ie, the gradient of the smoothed growth curve) reached its maximum value. The ROC analysis was then performed based on this value to measure the discriminatory ability of the SMI and the CVM index by observing both the sensitivity and the percentage of correctly classified subjects. These can then be used to identify those who have yet to attain peak pubertal growth and those who have reached or passed it.

In this study, 2 clinical scenarios were of interest in relation to the discriminatory ability of the SMI and the CVM index. The first scenario was relevant for the mandibular parameters, because growth modification for the mandible has been suggested to be more effective if used during, or slightly after, peak mandibular growth.1-3 Therefore, the primary concern for this first scenario was to discriminate subjects
according to whether they had reached or passed peak pubertal growth. The discriminatory ability of the test is optimized at the cut-off point at which the most children are correctly classified. The second scenario was related to growth modification in the maxilla, which has been suggested to be more effective if started before peak pubertal growth.27-29 Thus, in this second scenario, it was desirable to ensure that the peak pubertal growth stage had not yet been attained. This was measured by the sensitivity of the test, where the observed cut-off point for the start of treatment had a high sensitivity value. The highest sensitivity is 100%, which by definition is when the cut-off point is stage 1, indicating that 100% of subjects are at the prepeak pubertal growth stage. However, it might not be practical to always begin treatment at this stage (eg, if the patient arrives later). Therefore, in this study, any sensitivity value greater than, or equal to, 90% was considered to be high sensitivity. This choice of 90% sensitivity was arbitrary, but pragmatic, ensuring that most subjects had not attained peak pubertal growth. If it is necessary to have a higher degree of certainty, an earlier stage (ie, stage 1) can be used.

RESULTS
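As a rough illustration of the growth-curve step described in the Material and Methods (parameter plotted against age, running mean smoother, gradients of the smoothed line, peak of those gradients), a minimal Python sketch follows. It is not the authors' Stata procedure: the function name `peak_velocity_age`, the averaging within each age group before smoothing, the default window of 3 age groups, and the truncated window at the ends of the age range are all assumptions made for illustration.

```python
from collections import defaultdict


def peak_velocity_age(ages, values, window=3):
    """Return (age, velocity) at the steepest part of a smoothed growth curve.

    ages   -- chronologic age of each subject (cross-sectional data)
    values -- the craniofacial measurement for the same subjects
    window -- odd number of neighbouring age groups averaged by the smoother
    Assumes at least 2 distinct ages are present.
    """
    # 1. Average the measurement within each age group.
    groups = defaultdict(list)
    for age, value in zip(ages, values):
        groups[age].append(value)
    xs = sorted(groups)
    ys = [sum(groups[a]) / len(groups[a]) for a in xs]

    # 2. Running mean smoother: centred moving average, truncated at the ends.
    half = window // 2
    smooth = []
    for i in range(len(ys)):
        chunk = ys[max(0, i - half): i + half + 1]
        smooth.append(sum(chunk) / len(chunk))

    # 3. Gradient of each segment of the smoothed line, used as a proxy
    #    for growth velocity and located at the midpoint of the segment.
    velocity = [
        ((xs[i] + xs[i + 1]) / 2,
         (smooth[i + 1] - smooth[i]) / (xs[i + 1] - xs[i]))
        for i in range(len(xs) - 1)
    ]

    # 4. Peak velocity: the segment with the largest gradient.
    return max(velocity, key=lambda point: point[1])


if __name__ == "__main__":
    # Hypothetical ages (y) and Ar-Pog lengths (mm), for illustration only.
    ages = [10, 10, 11, 11, 12, 12, 13, 13, 14, 14]
    lengths = [99.0, 101.0, 100.5, 101.5, 102.0, 102.0, 105.5, 106.5, 107.0, 107.0]
    print(peak_velocity_age(ages, lengths))
```

Under these assumptions, each subject could then be labelled as prepeak or as having reached or passed the peak by comparing chronologic age with the returned age, which gives the reference status needed for the ROC analysis.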
The kappa results for the SMI were 0.75 and 0.89 for the Indonesian boys and girls, respectively, and 0.86 and 0.82 for the white boys and girls, respectively. The kappa results for the CVM were 0.85 and 0.97 for the Indonesian boys and girls, respectively, and 0.94 and 0.95 for the white boys and girls, respectively. All results showed substantial or good intraoperator agreement for both sex and ethnic groups. For the analysis of the cephalometric data, the
Table V. Summary of ROC analysis for the SMI and the CVM index

Cut-off points, CVM14

ArPog (observed optimum cut-off point with highest correct classification)
  IB: 3 (88.89%) | IG: 3 (87.47%) | WB: 3 (87.46%) | WG: 4 (81.45%)
ArPog (previously reported optimum)
  IB: 2 (77.01%), 3 (88.89%) | IG: 2 (82.69%), 3 (87.47%) | WB: 2 (80.86%), 3 (87.46%) | WG: 2 (60.18%), 3 (71.27%)
AUC of ArPog
  IB: 0.918 (0.895; 0.939) | IG: 0.921 (0.902; 0.939) | WB: 0.917 (0.888; 0.945) | WG: 0.861 (0.829; 0.892)

ArGo (observed optimum cut-off point with highest correct classification)
  IB: 3 (88.12%) | IG: 3 (85.92%) | WB: 4 (77.89%) | WG: 4 (80.32%)
ArGo (previously reported optimum)
  IB: 2 (75.31%), 3 (88.12%) | IG: 2 (80.10%), 3 (85.92%) | WB: 2 (67.33%), 3 (76.57%) | WG: 2 (64.03%), 3 (74.66%)
AUC of ArGo
  IB: 0.918 (0.896; 0.939) | IG: 0.914 (0.895; 0.933) | WB: 0.880 (0.847; 0.914) | WG: 0.862 (0.829; 0.893)

SPog (observed optimum cut-off point with highest correct classification)
  IB: 3 (87.81%) | IG: 3 (87.60%) | WB: 3 (83.17%) | WG: 4 (81.45%)
SPog (previously reported optimum)
  IB: 2 (78.40%), 3 (87.81%) | IG: 2 (82.82%), 3 (87.60%) | WB: 2 (75.91%), 3 (83.17%) | WG: 2 (60.18%), 3 (71.27%)
AUC of SPog
  IB: 0.911 (0.889; 0.933) | IG: 0.912 (0.903; 0.939) | WB: 0.897 (0.866; 0.928) | WG: 0.861 (0.829; 0.892)

ANS-PNS (observed cut-off points with ‘high sensitivity’ value)
  IB: 1 (100%) | IG: 1 (100%), 2 (96.94%) | WB: 1 (100%), 2 (98.53%) | WG: 1 (100%), 2 (98.62%)
ANS-PNS (previously reported optimum)
  IB: 1 (100%), 2 (77.59%) | IG: 1 (100%), 2 (96.94%) | WB: 1 (100%), 2 (98.53%) | WG: 1 (100%), 2 (98.62%)
AUC of ANS-PNS
  IB: 0.853 (0.829; 0.878) | IG: 0.919 (0.901; 0.938) | WB: 0.880 (0.847; 0.914) | WG: 0.941 (0.916; 0.965)

ANS-Me (observed cut-off points with ‘high sensitivity’ value)
  IB: 1 (100%), 2 (92.28%) | IG: 1 (100%), 2 (96.73%) | WB: 1 (100%), 2 (98.77%) | WG: 1 (100%), 2 (99.12%)
ANS-Me (previously reported optimum)
  IB: 1 (100%), 2 (92.28%) | IG: 1 (100%), 2 (96.73%) | WB: 1 (100%), 2 (98.77%) | WG: 1 (100%), 2 (99.12%)
AUC of ANS-Me
  IB: 0.912 (0.889; 0.934) | IG: 0.921 (0.903; 0.939) | WB: 0.897 (0.866; 0.928) | WG: 0.861 (0.829; 0.892)
The observed optimum cut-off points for mandibular parameters maximize the proportion of children correctly classified as those who have yet to attain peak pubertal growth and those who have reached or passed it. The observed cut-off points for the maxilla with high sensitivity values ensure the maximum proportion of children who have not attained peak pubertal growth. The figures in parentheses show the percentages of subjects classified at these cut-off points. Results from this study are presented along with those previously reported in the literature as clinically relevant for intervention. The differences in AUC values between the CVM index and the SMI were statistically significant (P <0.05). The values in parentheses after each AUC are the 95% confidence intervals of the AUC. IB, Indonesian boys; IG, Indonesian girls; WB, white boys; WG, white girls.
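The kinds of quantities summarized in Table V (a percentage correctly classified and a sensitivity at each stage cut-off, plus an AUC) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' Stata analysis: the function names `cutoff_table` and `auc` are hypothetical, a subject is assumed to test positive (at or past peak growth) when the stage is at or above the cut-off, and the AUC is computed as the Mann-Whitney probability with ties counted as one half, which equals the trapezoidal area under the empirical ROC curve.

```python
def cutoff_table(stages, reached_peak):
    """Sensitivity, specificity, and proportion correctly classified
    for every observed stage used as a cut-off (positive = stage >= cut-off).
    Assumes both pre-peak and post-peak subjects are present."""
    n = len(stages)
    n_pos = sum(1 for r in reached_peak if r)
    n_neg = n - n_pos
    rows = []
    for cutoff in sorted(set(stages)):
        tp = sum(1 for s, r in zip(stages, reached_peak) if r and s >= cutoff)
        tn = sum(1 for s, r in zip(stages, reached_peak) if not r and s < cutoff)
        rows.append({
            "cutoff": cutoff,
            "sensitivity": tp / n_pos,
            "specificity": tn / n_neg,
            "correctly_classified": (tp + tn) / n,
        })
    return rows


def auc(stages, reached_peak):
    """Area under the ROC curve: the probability that a randomly chosen
    post-peak subject has a higher stage than a randomly chosen pre-peak
    subject, with ties counted as one half."""
    pos = [s for s, r in zip(stages, reached_peak) if r]
    neg = [s for s, r in zip(stages, reached_peak) if not r]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical CVM stages and reference status, for illustration only.
    stages = [1, 1, 2, 2, 3, 3, 4, 5]
    reached = [False, False, False, True, True, False, True, True]
    for row in cutoff_table(stages, reached):
        print(row)
    print("AUC:", auc(stages, reached))
```

With this convention the lowest stage always yields 100% sensitivity and 0% specificity, because every subject is at or above it; the clinically useful cut-offs are the later stages at which sensitivity or the proportion correctly classified remains high.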
paired t test results for both sex and ethnic groups showed no evidence of systematic error (bias). Based on clinical judgment, it was decided that the 95% limits of agreement were acceptable for all parameters (Table IV). The Lin concordance correlations also showed good correlations between the 2 measurements for both sex and ethnic groups. The ROC analysis for the CVM index mandibular parameters (ArPog, ArGo, SPog) is summarized in Table V. It was suggested that the peak pubertal growth spurt of the mandible occurs between stages 2 and 3 of the CVM index.14,15 For parameter ArPog in the Indonesian boys, if the cut-off point was stage 3,
88.89% of them were correctly classified as either having reached peak growth or not, whereas 77.01% were correctly classified as having reached peak growth or not, if the cut-off point was stage 2. In the Indonesian girls, 87.47% were correctly classified if the cut-off point was stage 3, and 82.69% were correctly classified when the cut-off point was stage 2. Similar results were also seen for parameters ArGo and SPog. For the white group, the findings were similar to those for the Indonesians; this suggested that the CVM index has good discriminatory ability in predicting whether a patient has yet to attain peak pubertal growth or has passed the peak. For parameter ArPog in white
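Table V. Continued

Cut-off points, SMI8

ArPog (observed optimum cut-off point with highest correct classification)
  IB: 5 or 6 (89.04%) | IG: 7 (89.15%) | WB: 6 (87.79%) | WG: 8 (85.07%)
ArPog (previously reported optimum)
  IB: 4 (84.41%), 5 or 6 (89.04%) | IG: 4 (82.82%), 5 (87.98%), 6 (87.73%) | WB: 4 (82.18%), 5 (86.80%), 6 (87.79%) | WG: 4 (60.86%), 5 (69.23%), 6 (73.08%)
AUC of ArPog
  IB: 0.928 (0.908; 0.949) | IG: 0.944 (0.929; 0.959) | WB: 0.951 (0.928; 0.973) | WG: 0.923 (0.899; 0.947)

ArGo (observed optimum cut-off point with highest correct classification)
  IB: 5 or 6 (88.58%) | IG: 7 (88.11%) | WB: 7 (88.12%) | WG: 8 (84.84%)
ArGo (previously reported optimum)
  IB: 4 (83.64%), 5 or 6 (88.58%) | IG: 4 (80.23%), 5 (85.66%), 6 (85.40%) | WB: 4 (69.31%), 5 (75.25%), 6 (81.52%) | WG: 4 (64.25%), 5 (72.62%), 6 (76.02%)
AUC of ArGo
  IB: 0.932 (0.912; 0.952) | IG: 0.942 (0.927; 0.957) | WB: 0.943 (0.919; 0.967) | WG: 0.918 (0.893; 0.943)

SPog (observed optimum cut-off point with highest correct classification)
  IB: 5 (88.27%) | IG: 7 (89.02%) | WB: 7 (86.80%) | WG: 8 (85.07%)
SPog (previously reported optimum)
  IB: 4 (86.11%), 5 (88.27%), 6 (87.96%) | IG: 4 (82.95%), 5 (88.11%), 6 (87.86%) | WB: 4 (77.89%), 5 (82.51%), 6 (84.82%) | WG: 4 (60.86%), 5 (69.23%), 6 (73.08%)
AUC of SPog
  IB: 0.928 (0.907; 0.948) | IG: 0.947 (0.929; 0.959) | WB: 0.941 (0.918; 0.965) | WG: 0.923 (0.899; 0.947)

ANS-PNS (observed cut-off points with ‘high sensitivity’ value)
  IB: 1 (100%), 2 (93.87%) | IG: 1 (100%), 2 (100%), 3 (98.47%) | WB: 1 (100%), 2 (100%), 3 (100%) | WG: 1 (100%), 2 (100%), 3 (99.17%)
ANS-PNS (previously reported optimum)
  IB: 1 (100%), 2 (93.87%), 3 (84.57%) | IG: 1 (100%), 2 (100%), 3 (98.47%) | WB: 1 (100%), 2 (100%), 3 (100%) | WG: 1 (100%), 2 (100%), 3 (99.17%)
AUC of ANS-PNS
  IB: 0.913 (0.892; 0.933) | IG: 0.946 (0.931; 0.961) | WB: 0.943 (0.919; 0.967) | WG: 0.951 (0.931; 0.972)

ANS-Me (observed cut-off points with ‘high sensitivity’ value)
  IB: 1 (100%), 2 (98.32%), 3 (93.96%) | IG: 1 (100%), 2 (99.75%), 3 (98.24%) | WB: 1 (100%), 2 (100%), 3 (100%) | WG: 1 (100%), 2 (100%), 3 (99.56%)
ANS-Me (previously reported optimum)
  IB: 1 (100%), 2 (98.32%), 3 (93.96%) | IG: 1 (100%), 2 (99.75%), 3 (98.24%) | WB: 1 (100%), 2 (100%), 3 (100%) | WG: 1 (100%), 2 (100%), 3 (99.56%)
AUC of ANS-Me
  IB: 0.928 (0.907; 0.948) | IG: 0.944 (0.929; 0.959) | WB: 0.941 (0.918; 0.965) | WG: 0.923 (0.899; 0.947)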
boys, if the cut-off point was stage 3, 87.46% of them were correctly classified as having attained peak growth or not, and 80.86% if the cut-off point was stage 2. In the white girls, 81.45% were correctly classified as having attained peak growth or not if the cut-off point was stage 4, 71.27% when the cut-off point was stage 3, and 60.18% for stage 2. For ArGo in white boys, 77.89% were correctly classified as having attained peak growth or not if the cut-off point was stage 4, 76.57% when the cut-off point was stage 3, and 67.33% when the cut-off point was stage 2. In the white girls, 80.32% were correctly classified if the cut-off point was stage 4, 74.66% for stage 3, and 64.03% when the cut-off point
was stage 2. Similar results were also seen for parameter SPog. Baccetti et al27 recommended that growth modification treatment for the maxilla should be undertaken before the peak pubertal growth of the maxilla—ie, stage 1 or 2 of the CVM index. For ANS-PNS in all the Indonesians, the test had 100% sensitivity in detecting all subjects in the prepeak pubertal growth period when the cut-off point was stage 1. However, high sensitivity could also be obtained in the Indonesian girls if the cut-off point was stage 2 (96.94%), indicating that this might also be an acceptable stage at which to start treatment.
For ANS-Me in all Indonesians, the test had 100% sensitivity in detecting them in the prepeak pubertal growth when the cut-off point was stage 1. In addition, high sensitivity could also be obtained if the cut-off point was stage 2, with 92.28% of Indonesian boys and 96.73% of the girls classified in the prepeak pubertal growth. In the white subjects, for ANS-PNS, the test had 100% sensitivity in detecting all in the prepeak pubertal growth period if the cut-off point was stage 1. Moreover, high sensitivity could also be obtained when the cut-off point was stage 2 (98.53% for boys, 98.62% for girls). For ANS-Me, which represents lower anterior face height, the results were similar to ANS-PNS, with 100% sensitivity for all white subjects if the cut-off point was stage 1. High sensitivity was also obtained if the cut-off point was stage 2 (98.77% for boys, 99.12% for girls). The ROC analysis for the SMI mandibular parameters (ArPog, ArGo, SPog) is summarized in Table V. It was suggested that peak mandibular growth occurs between stages 4 and 6 of the SMI.8 The results for ArPog in Indonesian boys showed that, if the cut-off point was stage 5 or 6, 89.04% were correctly classified as having attained peak growth or not, whereas 84.41% were correctly classified if the cut-off point was stage 4. In Indonesian girls, 89.15% were correctly classified as having attained peak growth or not if the cut-off point was stage 7, 87.73% when the cut-off point was stage 6, 87.98% for stage 5, and 82.82% for stage 4. The parameters ArGo and SPog for both sexes showed similar results. In the white boys, the results for ArPog showed that 87.79% were correctly classified as having attained peak growth or not if the cut-off point was stage 6, 86.80% for stage 5, and 82.18% for stage 4. In the white girls, the highest percentage of correct classification (85.07%) was achieved if the cut-off point was stage 8, with correct classifications of 60.86% to 73.08% for stages 4 to 6. 
For parameter ArGo in white boys, correct classifications ranged from 69.31% to 81.52% for stages 4 to 6; in white girls, the percentages of correct classification ranged from 64.25% to 76.02% for stages 4 to 6. Similar results were also seen for SPog in white boys, with correct classifications from 77.89% to 84.82% for stages 4 to 6; in the girls, 60.86% to 73.08% were correctly classified if the cut-off points were stages 4 to 6. Stages 1 to 3 of the SMI are generally considered to represent the prepeak pubertal growth period, when growth modification for the maxilla should be started.29 For ANS-PNS, 100% sensitivity was obtained in classifying all Indonesian boys at the prepeak pubertal
growth period if the cut-off point was stage 1. In addition, high sensitivity was also obtained if the cut-off point was stage 2 (93.87%). The test had 100% sensitivity in classifying all Indonesian girls into the prepeak pubertal growth period when the cut-off points were stages 1 and 2. High sensitivity, with 98.47% of Indonesian girls classified at the prepeak pubertal stage, could also be obtained when the cut-off point was stage 3. Similarly, for ANS-Me in Indonesian boys, there was 100% sensitivity in classifying them at the prepeak pubertal period when the cut-off point was stage 1. High sensitivity was also achieved if the cut-off point was stage 2 (98.32%) or stage 3 (93.96%). The test had 100% sensitivity, with all girls classified in the prepeak pubertal growth stage, if the cut-off point was stage 1 of the SMI. Moreover, high sensitivity could also be obtained if the cut-off points were stages 2 (99.75%) and 3 (98.24%). In white boys for ANS-PNS, 100% sensitivity was obtained if the cut-off points were stages 1 to 3, suggesting that all boys were classified at the prepeak pubertal growth period. In the girls, the test had 100% sensitivity when using stages 1 and 2 as the cut-off points. High sensitivity was also obtained if the cut-off point was stage 3 (99.17%). For parameter ANS-Me, 100% sensitivity was obtained in boys if the cut-off points were stages 1 to 3. In white girls, if the cut-off points were stages 1 and 2, 100% sensitivity was obtained. In addition, high sensitivity was also obtained if the cut-off point was stage 3 (99.56%). The ROC analysis for parameter ArPog in Indonesian boys showed that the AUC values were 0.918 for the CVM index and 0.928 for the SMI. In Indonesian girls, the AUC values were 0.944 for the SMI and 0.921 for the CVM index. 
Although the differences between both methods were statistically significant, the magnitudes of the differences were only 0.010, or 1.07%, for Indonesian boys and 0.023, or 2.37%, for Indonesian girls. Similarly, the ROC analysis for ANS-PNS in Indonesian boys showed that the AUC values were 0.853 for the CVM index and 0.913 for the SMI. In Indonesian girls, these values were 0.919 for the CVM index and 0.946 for the SMI. Again, the differences between the 2 methods were statistically significant, but with a magnitude of only 0.059, or 5.93%, in Indonesian boys and 0.025, or 2.5%, in Indonesian girls. Thus, these differences are unlikely to be clinically relevant. The remaining parameters for Indonesian subjects (ArGo, SPog, and ANS-Me) showed that the differences between the AUC values of the CVM index and the SMI were relatively low, only 0.014, or 1.43%, for
Soegiharto, Moles, and Cunningham 235
American Journal of Orthodontics and Dentofacial Orthopedics Volume 134, Number 2
ArGo in Indonesian boys and 0.028, or 2.8%, in the girls; 0.017, or 1.71%, for SPog in Indonesian boys and 0.034, or 3.44%, in the girls; 0.016, or 1.62%, for ANS-Me in Indonesian boys and 0.023, or 2.31%, in the girls. Figure 2 shows an example of the ROC curve results for parameter ArPog for the CVM index and the SMI for Indonesian boys. The plots were close to the top left corner of the graph for both methods. This suggests that both methods are valid clinical diagnostic indexes. The results for other parameters in both ethnic and sex groups were similar to this example. For ArPog in white boys, the AUC values were 0.917 for the CVM index and 0.951 for the SMI. In white girls, the AUC values were 0.861 for the CVM index and 0.923 for the SMI. The differences in discriminatory ability were 0.034, or 3.4%, in white boys and 0.062, or 6.22%, in white girls. Again, although these differences were statistically significant, they were relatively low in terms of clinical relevance. For ANS-PNS in white boys, the AUC values were 0.880 for the CVM index and 0.943 for the SMI. In the white girls, the AUC values were 0.941 for the CVM and 0.951 for the SMI. Again, the differences between both methods were low in the girls (1.08%), although slightly greater in the boys (6.29%). Other parameters for the white subjects (ArGo, SPog, and ANS-Me) also showed that the differences in the discriminatory abilities between the CVM and the SMI were relatively low: 6.29% for ArGo in boys and 5.63% in girls; 4.39% for SPog in boys and 6.22% in girls; and 4.39% for ANS-Me in boys and 6.22% in the girls. The AUC results for both ethnic and sex groups suggest that the 2 indexes are valid tools to predict whether a patient has yet to attain peak pubertal growth or has reached or passed it.
Fig 2. Example of the ROC curve analysis for ArPog between the SMI and the CVM index of Indonesian boys.
DISCUSSION
The kappa values for the SMI showed substantial to good intraoperator agreement for both sex and ethnic groups, suggesting acceptable repeatability for this method. The kappa values for the CVM index were generally higher compared with the SMI and also showed good intraoperator agreement for both sexes and ethnic groups. These findings are similar to those of Ballrick et al,30 who found good kappa scores (0.82) when using Baccetti’s CVM classification.14 The Bland and Altman test results, as well as the Lin concordance correlations, for the craniofacial parameters showed acceptable agreement between the 2 measurements and confirmed that the methods were repeatable. According to Baccetti et al14 and Grave and Townsend,15 the peak pubertal growth spurt of the mandible occurs between stages 2 and 3 of the CVM index. Therefore, the high percentage of correct classifications into stages 2 and 3 in this study suggests that the CVM index generally has good discriminatory ability for predicting whether a patient has attained peak mandibular growth for the parameters we tested. These results are relevant, especially if growth modification will be undertaken in Class II patients, for whom intervention has been suggested during or slightly after peak pubertal growth.1-3 The findings for the maxillary parameters for both ethnic and sex groups agreed with those of Baccetti et al,27 who suggested that growth modification treatment for the maxilla (eg, with protraction headgear) should start before the peak pubertal growth of the maxilla (around stage 1 or 2 of the CVM index). In addition, Westwood et al28 reported that, when patients were treated with rapid maxillary expansion and protraction headgear before peak pubertal growth (early stages of the CVM index), there was minimal change in lower anterior face height. Therefore, it is recommended that patients with increased vertical face height should be treated at this stage to prevent further increase in facial height. 
Our study confirmed that using stage 1 of the CVM index gave the highest degree of certainty, because the sensitivity was 100% for all cases. This means that every patient would be at the prepeak pubertal growth period. In addition, stage 2 also gave a high degree of certainty; the sensitivity value was generally greater than 90% (except in Indonesian boys, whose value was 78%). Therefore, when treatment cannot begin at an earlier stage, stage 2 can be used in
approximately 90% of patients to establish the prepeak pubertal period. The results showed that the percentages of correct classification into the pubertal stages (4 to 6) for all mandibular parameters for all Indonesians were greater than 80%. For the same pubertal stages in white boys, the percentages of correctly classified subjects ranged from 69% to 87%; in the girls, the percentages ranged from 61% to 76%. These results agree with those of Fishman,8 who suggested that peak mandibular growth occurs during stages 4 to 6 of the SMI. Thus, our results show that the SMI has good discriminatory ability when determining the timing of mandibular growth modification. For maxillary parameters ANS-PNS and ANS-Me in both ethnic and sex groups, the results showed high sensitivity if the cut-off points were stages 1 to 3, classifying subjects with high certainty into their prepeak pubertal growth stage. These results agree with those of Fishman,29 in which stages 1 to 3 of the SMI were considered to be part of prepubertal growth, when growth modification treatment for maxillary deficiency should begin. Thus, the SMI appeared to have good discriminatory ability when assessing the early stages of maxillary pubertal growth in both ethnic and sex groups; this might be useful if growth modification in the maxilla (eg, protraction headgear) is contemplated. There were differences for the observed optimum cut-off points between sex and ethnic groups as well as for various parameters investigated in this study. These differences might have been due to sampling variation and because the sample size was relatively smaller in each age group for the white subjects than the Indonesians. The ROC analysis for parameter ArPog in all Indonesians showed that the AUC values for the CVM and the SMI were all above 0.9. This suggests that both the CVM and the SMI are valid clinical diagnostic indexes. 
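The cut-off and AUC figures discussed above can be made concrete with a minimal sketch. It assumes, for illustration only, that a stage at or above the cut-off is read as "peak reached or passed", and it computes the AUC as the proportion of post-peak/prepeak subject pairs in which the post-peak subject has the higher stage (ties counted as one half). The function names and the toy data are hypothetical and are not taken from the study, whose software and exact conventions are not described in this section.

```python
from typing import Sequence, Tuple


def sens_spec(stages: Sequence[int], reached_peak: Sequence[bool],
              cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity when 'stage >= cutoff' is read as
    'peak reached or passed' (an assumed convention)."""
    pairs = list(zip(stages, reached_peak))
    tp = sum(1 for s, y in pairs if y and s >= cutoff)          # true positives
    fn = sum(1 for s, y in pairs if y and s < cutoff)           # false negatives
    tn = sum(1 for s, y in pairs if not y and s < cutoff)       # true negatives
    fp = sum(1 for s, y in pairs if not y and s >= cutoff)      # false positives
    return tp / (tp + fn), tn / (tn + fp)


def auc(stages: Sequence[int], reached_peak: Sequence[bool]) -> float:
    """AUC as the probability that a random post-peak subject has a higher
    stage than a random prepeak subject; ties count one half."""
    pos = [s for s, y in zip(stages, reached_peak) if y]
    neg = [s for s, y in zip(stages, reached_peak) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (maturation stage, whether peak growth was reached).
stages = [1, 1, 2, 2, 3, 3, 4, 4, 5, 6]
reached_peak = [False, False, False, False, False,
                True, True, True, True, True]

for cutoff in (1, 3, 4):
    sens, spec = sens_spec(stages, reached_peak, cutoff)
    print(f"cut-off stage {cutoff}: sensitivity={sens:.2f}, specificity={spec:.2f}")
print(f"AUC = {auc(stages, reached_peak):.2f}")
```

With these made-up data, a cut-off at stage 3 gives a sensitivity of 1.00 and a specificity of 0.80, and the AUC is 0.98. Under the assumed convention, the lowest possible cut-off always yields 100% sensitivity, so sensitivity is best read together with specificity.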
However, even though the differences in AUC values were statistically significant, the magnitudes of the differences between the SMI and the CVM index were only 0.010, or 1.07%, for Indonesian boys and 0.023, or 2.37%, for Indonesian girls. Therefore, the differences between the 2 methods were not clinically relevant. Similarly, the differences between the AUC values of the 2 indexes for ANS-PNS were 0.059, or 5.93%, in Indonesian boys and 0.025, or 2.5%, in the girls. Although these differences were statistically significant, they are unlikely to be clinically relevant. Other parameters for both ethnic and sex groups (ArGo, SPog, and ANS-Me) also showed AUC values greater than 0.8 for both methods. In addition,
the differences between the AUC for the CVM index and the SMI for these parameters for both ethnic and sex groups were relatively low, ranging from 1.08% to 6.29%. These differences are unlikely to be clinically relevant. These findings suggest that both the SMI and the CVM index are valid clinical diagnostic indexes for prediction of peak growth of the maxilla and the mandible in both ethnic and sex groups. However, the typical effective dose for a lateral cephalometric radiograph is 1 to 3 μSv compared with 10 to 100 μSv for a hand-wrist radiograph (up to 100 times greater than a lateral cephalometric radiograph).31 Therefore, this study strongly suggests that it is not beneficial to take a hand-wrist radiograph to predict peak pubertal growth of the maxilla and the mandible if the routine lateral cephalometric radiographs can be optimized, thereby avoiding unnecessary radiation exposure to a child. Previous studies investigated correlations between the SMI and the CVM index and reported high correlation values.12,17-20 However, correlation coefficients cannot show that 1 method is better than the other. One strength of this study was that, to the best of our knowledge, it was the first to use ROC analysis to investigate the discriminatory ability of both methods and to compare the differences between them. In this way, the study showed that both methods are valid radiographic diagnostic indexes to discriminate patients into those who have yet to reach peak pubertal growth and those who have reached or passed it. The study also showed that, even though the AUC differences between the CVM index and the SMI were statistically significant, they were relatively low in both ethnic and sex groups. Thus, the results question the necessity of taking hand-wrist radiographs when lateral cephalograms can be optimized.
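The percentages quoted alongside the AUC differences appear to be absolute differences expressed in percentage points (for example, 0.034, or 3.4%) rather than relative changes. That reading is an inference from the reported figures, not something the text states. A short sketch using the rounded ArPog AUC values given in the Results:

```python
# Rounded ArPog AUC values as reported in the Results section.
reported_auc = {
    "Indonesian boys": {"CVM": 0.918, "SMI": 0.928},
    "Indonesian girls": {"CVM": 0.921, "SMI": 0.944},
    "white boys": {"CVM": 0.917, "SMI": 0.951},
    "white girls": {"CVM": 0.861, "SMI": 0.923},
}


def auc_gap(values):
    """Return (absolute AUC difference, same difference in percentage points).

    Assumes the quoted percentage is simply 100 x (AUC_SMI - AUC_CVM).
    """
    diff = round(values["SMI"] - values["CVM"], 3)
    return diff, round(diff * 100, 1)


gaps = {group: auc_gap(values) for group, values in reported_auc.items()}
for group, (diff, points) in gaps.items():
    print(f"ArPog, {group}: difference = {diff:.3f} ({points:.1f} percentage points)")
```

From the rounded values this gives 1.0, 2.3, 3.4, and 6.2 percentage points, close to the reported 1.07%, 2.37%, 3.4%, and 6.22%; the small gaps would be expected if the reported percentages were computed from unrounded AUC values.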
However, caution must be exercised when interpreting the results because this study was based on cross-sectional data, which have inherent limitations when analyzing growth. Ideally, studies of this type should be longitudinal, but the difficulties of obtaining large sample sizes for a longitudinal study and the associated increase in the number of radiographic exposures tend to preclude this methodology. The other factor that must be considered is that the sample size was relatively smaller in each age group for the white subjects compared with the Indonesians, although the overall sample size was still good. In addition, our data included only subjects with Class I occlusions; therefore, inferences for patients with Class II or Class III malocclusions should be viewed with caution.
CONCLUSIONS
This study confirms that both the SMI and the CVM index are valid clinical diagnostic methods to discriminate patients into those who have not yet attained peak pubertal growth and those who have reached or passed it. These indexes are also valid in both ethnic groups. We also found that the differences in the discriminatory ability between the SMI and the CVM index in detecting the peak pubertal growth spurt were between 1% and 7% in both ethnic and sex groups. Thus, we strongly question the necessity of taking additional hand-wrist radiographs and exposing a child to unnecessary radiation, when the CVM index has similar discriminatory ability and is readily available from lateral cephalometric radiographs.
We sincerely thank all the staff from the Department of Orthodontics, Eastman Dental Center, University of Rochester Medical Center, Rochester, NY (in particular Stephanos Kyrkanides, Leonard Fishman, and Mary Therese Whelehan) for giving full access to their patient database to be used in this study.
REFERENCES
1. Pancherz H, Hägg U. Dentofacial orthopedics in relation to somatic maturation. Am J Orthod 1985;88:273-87.
2. Baccetti T, Franchi L, Toth LR, McNamara JA. Treatment timing for Twin-block therapy. Am J Orthod Dentofacial Orthop 2000;118:159-70.
3. Faltin K, Faltin RM, Baccetti T, Franchi L, Ghiozzi B, McNamara JA. Long-term effectiveness and treatment timing for bionator therapy. Angle Orthod 2003;73:221-30.
4. Grave KC, Brown T. Skeletal ossification and the adolescent growth spurt. Am J Orthod 1976;69:611-9.
5. Hägg U, Taranger J. Skeletal stages of the hand and wrist as indicators of the pubertal growth spurt. Acta Odontol Scand 1980;38:187-200.
6. Houston WJB, Miller JC, Tanner JM. Prediction of the timing of the adolescent growth spurt from ossification events in hand-wrist films. Br J Orthod 1979;6:145-52.
7. Houston WJB. Relationships between skeletal maturity estimated from hand-wrist radiographs and the timing of the adolescent growth spurt. Eur J Orthod 1980;2:81-93.
8. Fishman LS. Radiographic evaluation of skeletal maturation. A clinically oriented method based on hand-wrist films. Angle Orthod 1982;52:88-112.
9. Smith RJ. Misuse of hand-wrist radiographs. Am J Orthod 1980;77:75-8.
10. Lamparski DG. Skeletal age assessment utilizing cervical vertebrae [thesis]. Pittsburgh: University of Pittsburgh; 1972.
11. O’Reilly MT, Yanniello GJ. Mandibular growth changes and maturation of cervical vertebrae. A longitudinal cephalometric study. Angle Orthod 1988;58:179-84.
12. Hassel B, Farman AG. Skeletal maturation evaluation using cervical vertebrae. Am J Orthod Dentofacial Orthop 1995;107:58-66.
13. Franchi L, Baccetti T, McNamara JA. Mandibular growth as related to cervical vertebral maturation and body height. Am J Orthod Dentofacial Orthop 2000;118:335-40.
14. Baccetti T, Franchi L, McNamara JA. An improved version of the cervical vertebral maturation (CVM) method for the assessment of mandibular growth. Angle Orthod 2002;72:316-23.
15. Grave K, Townsend G. Cervical vertebral maturation as a predictor of the adolescent growth spurt. Aust J Orthod 2003;19:25-31.
16. Grave K, Townsend G. Hand-wrist and cervical vertebral maturation indicators: how can these events be used to time Class II treatment? Aust J Orthod 2003;19:33-45.
17. Kucukkeles N, Acar A, Biren S, Arun T. Comparisons between cervical vertebrae and hand-wrist maturation for the assessment of skeletal maturity. J Clin Pediatr Dent 1999;24:47-52.
18. Chang HP, Liao CH, Yang YH, Chang HF, Chen KC. Correlation of cervical vertebrae maturation with hand-wrist maturation in children. Kaohsiung J Med Sci 2001;17:29-35.
19. San Román P, Palma JC, Oteo MD, Nevado E. Skeletal maturation determined by cervical vertebrae development. Eur J Orthod 2002;24:303-11.
20. Flores-Mir C, Burgess CA, Champney M, Jensen RJ, Pitcher MR, Major PW. Correlation of skeletal maturation stages determined by cervical vertebrae and hand-wrist evaluations. Angle Orthod 2006;76:1-5.
21. Bhatia SN, Leighton BC. A manual of facial growth. A computer analysis of longitudinal cephalometric growth data. 1st ed. Oxford: Oxford University Press; 1993.
22. Ochoa BK, Nanda RS. Comparison of maxillary and mandibular growth. Am J Orthod Dentofacial Orthop 2004;125:148-59.
23. Fishman LS. Chronological versus skeletal age, an evaluation of craniofacial growth. Angle Orthod 1979;49:181-224.
24. Altman DG, Machin D, Bryant TN, Gardner MJ. Statistics with confidence. Confidence intervals and statistical guidelines. 2nd ed. London: BMJ Books; 2000.
25. Petrie A, Watson P. Statistics for veterinary and animal science. 1st ed. Oxford: Blackwell Science; 1999.
26. Petrie A, Bulman JS, Osborn JF. Further statistics in dentistry. Part 5: diagnostic tests for oral conditions. Br Dent J 2002;193:621-5.
27. Baccetti T, Franchi L, McNamara JA. The cervical vertebral maturation (CVM) method for the assessment of optimal treatment timing in dentofacial orthopedics. Semin Orthod 2005;11:119-29.
28. Westwood PV, McNamara JA, Baccetti T, Franchi L, Sarver DM. Long-term effects of Class III treatment with rapid maxillary expansion and facemask therapy followed by fixed appliances. Am J Orthod Dentofacial Orthop 2003;123:306-20.
29. Fishman LS. Maturational development and facial form relative to treatment timing. In: Subtelny DJ, editor. Early orthodontic treatment. 1st ed. Chicago: Quintessence; 2000. p. 265-85.
30. Ballrick J, Fields H, Vig KWL, Beck FM, Germak T, Baccetti T, et al. Reliability and validity of cervical vertebral maturation and hand-wrist radiographs. IADR 2005 Abstract in Baltimore (available at: http://iadr.confex.com/iadr/2005Balt/techprogram/abstract_62129.htm).
31. Isaacson K, Thom A. Orthodontic radiographic guidelines. 2nd ed. London: British Orthodontic Society; 2001.