Classifying Pubertal Development Using Child and Parent Report: Comparing the Pubertal Development Scales to Tanner Staging

Journal of Adolescent Health xxx (2019) 1–6

www.jahonline.org Original article

Maria Elisabeth Koopman-Verhoeff, M.Sc. a,b,*, Caroline Gredvig-Ardito a, David H. Barker, Ph.D. a,c, Jared M. Saletin, Ph.D. a,b, and Mary A. Carskadon, Ph.D. a,b

a EP Bradley Hospital Sleep Laboratory, Alpert Medical School of Brown University, Providence, Rhode Island
b Department of Psychiatry and Human Behavior, Alpert Medical School of Brown University, Providence, Rhode Island
c The Bradley Hasbro Children's Research Center, Providence, Rhode Island

Article history: Received August 9, 2019; Accepted November 19, 2019
Keywords: Puberty; Reliability; Validity; Adolescence; Pediatric population

ABSTRACT

Purpose: This project investigated internal consistency and test–retest reliability of the frequently used Pubertal Development Scale (PDS) and compared parent and child reports with clinician-rated Tanner staging.

Methods: Using a repository of data collected from 1995 to 2016, 252 participants (aged 7.8–17.7 years) provided self- and parent-reported PDS and received Tanner staging by a certified health care professional within 30 days. Internal consistency and test–retest reliability statistics were evaluated for 56 children across two assessments occurring within 6 months. Comparisons with Tanner staging involved examining concurrent validity and calibration analysis using data from 233 child and 252 parental ratings.

Results: Self- and parent-reported PDS demonstrated good internal consistency, with Cronbach's alpha .91–.96; high test–retest reliability was confirmed with intraclass correlation coefficient .81–.92. The association of Tanner stage with self- and parent-reported PDS was moderate to high; Kendall's tau ranged from .67 to .76, and intraclass correlation coefficient ranged from .73 to .83. The absolute agreement of Tanner stage with self- and parent-reported PDS was low; Cohen's kappa ranged from .20 to .37. However, combining pubertal scores into three stages of development (pre/early-, mid-, and late/post-pubertal) improved interrater agreement across measures (κ = .65, 95% confidence interval = .57–.73).

Conclusions: The present study shows that the PDS is reliable and generally tracks with Tanner staging (for both self and parent report). Low absolute agreement indicates that PDS categories do not map directly to specific Tanner stages, partly because premature adrenarche is often misinterpreted by parents and pediatricians alike. However, three broad categories showed better agreement and are generally adequate for most applications in child and adolescent research.

© 2019 Society for Adolescent Health and Medicine. All rights reserved.

Conflicts of interest: The authors have no conflicts of interest to disclose.
* Address correspondence to: Maria Elisabeth Koopman-Verhoeff, M.Sc., EP Bradley Hospital Sleep Laboratory, 300 Duncan Drive, Providence, RI 02906. E-mail address: [email protected] (M.E. Koopman-Verhoeff).
1054-139X/© 2019 Society for Adolescent Health and Medicine. All rights reserved. https://doi.org/10.1016/j.jadohealth.2019.11.308

IMPLICATIONS AND CONTRIBUTION

The self- and parent-reported Pubertal Development Scale is a reliable and valid instrument to assess pubertal development and can be used in research without the need for Tanner staging, increasing feasibility. The Pubertal Development Scale can reasonably be used to classify children as prepubertal, midpubertal, and postpubertal based on absolute agreement with Tanner staging.

Extensive evidence now indicates that pubertal status is an important factor in studies investigating adolescent mental and physical health [1–7]; however, measurement of pubertal status varies widely across studies. The medical standard has long been Tanner staging by a trained health care professional. Tanner staging is based on clinical evaluation of pubic hair growth in boys and girls, breast development in girls, and genital development in boys. Tanner staging provides a simple, ordinal scale that summarizes pubertal development ranging from 1 (prepubertal) to 5 (postpubertal) [8]. Although the result is a simple scale, Tanner staging requires clinical training and is burdensome to researchers and participants alike, spawning multiple attempts to derive a self-reported (or parent-reported) proxy measure useful for research [9].

Petersen's Pubertal Development Scale (PDS) [10], one commonly used measure, attempts to approximate Tanner staging by including questions about pubertal hair and skin changes. Sex-specific items include questions about menarche and breast development for girls and about voice development and facial hair for boys. The original PDS scoring algorithm provides a continuous mean score ranging from 1 to 4 and does not describe a categorical mapping to the 5-category scheme of Tanner staging. Our group previously described an algorithm to bridge these two scoring systems and has used the PDS and our algorithm in the widely used Sleep Habits Survey (SHS) [4]. Despite the wide use of this instrument, information about its reliability and validity is limited.

Previous studies have explored the reliability and validity of self-assessed pubertal development against professional Tanner staging [11]. A common impetus for these studies is that subjective ratings (self- and/or parent-reported) may be preferable to Tanner staging because of the expense of having a trained health care professional provide Tanner staging (particularly in large studies) and because of the discomfort that some children and families may have with the intrusive nature of the in-person evaluation or with the use of picture-based ratings. Studies examining the PDS have used child self-report versus Tanner staging, but without contemporaneous parent ratings [11–13].
Assessing the reliability and validity of parental assessment, in addition to self-report, is important because research and cultural contexts or circumstances may prevent child report. Moreover, it is not known whether the PDS can reliably estimate precise Tanner staging. Clear reliability and validity data for self- and parent-reported PDS against Tanner staging may help guide researchers in how to balance the use of the medical standard with the practical constraints often found in research and in how best to interpret the data.

The aim of the present study, therefore, was to address this need using a data repository of children with contemporaneous self-reported PDS, parent-reported PDS, and professional-rated Tanner stage. We sought to investigate three key outcomes: (1) test–retest reliability of the PDS across time, (2) criterion validity of the PDS as a measure of Tanner staging, and (3) the calibration of the self-reported PDS, the parent-reported PDS, and individual Tanner stages. Ultimately, our goal was to inform how the measures can work together to appropriately assess pubertal development in research.

Methods

Study sample

The repository consisted of deidentified data from participants across various studies run through the E.P. Bradley Hospital Sleep Research Laboratory [14–18]. Data were collected from 1995 to 2016. Many children had repeated visits to the laboratory as part of longitudinal research programs [17] (range 1–10 visits, median = 1). The institutional ethical committee for the protection of human subjects approved each study, and for each participant, we collected parental consent and child assent. Data were included from participants who received a clinician Tanner evaluation and completed a concurrent PDS (self-report, parent report, or both). Selected participants were, on average, aged 12.9 years (range 7.8–17.7); 48.6% were female, 86.3% were white/Caucasian, and 6% identified themselves as Hispanic or Latino. For this selection, maternal education level, defined as the highest attained educational level, was divided into three categories: low (high school education or less), 17.1%; middle (at least 1 year of college training), 25.1%; and high (a college or graduate degree), 55.8%.

Tanner staging

Across the two decades of data collection, 12 different trained health care professionals, each either a physician or a nurse, performed Tanner staging. Each had prior pediatric training and was trained on research-grade Tanner staging across standardization meetings in our group, and all remained unaware of self- and parent-reported PDS scores at the time of staging. Tanner staging derives two distinct “scores,” one based on pubic hair growth and a second derived from genital development in boys and breast development in girls, labeled “Other.” As these two scores, pubic hair and other, have historically differed in their reliability (pubic hair being the more reliable measure) [11,19], we treated them as separate metrics, with a separate set of analyses for each.

Pubertal Development Scale

Children completed the PDS as part of the SHS [4,10], whereas parents (in the majority of cases mothers, 89.5%) completed the PDS as part of the laboratory's Sleep, Medical, Education & Family History Form [10]. In 197 cases, the SHS and the Sleep, Medical, Education & Family History Form were sent by mail before the research visit; in the others, they were completed on site during an orientation visit.
For both children and parents, the PDS consists of five questions about pubertal status, with five answer categories (1 = has not yet started changing, 2 = has barely started changing, 3 = changes are definitely underway, 4 = changes seem complete, and 0 = I do not know). The continuous PDS score is converted to a 5-point ordinal scale (in keeping with the original Tanner categories) using an algorithm developed by Crockett (1988, unpublished) and described by Carskadon and Acebo (1993). Our scoring procedure allowed ~30% missing or “I do not know” responses (e.g., two of the three questions had been answered; for girls, a response to the menarche item must be present for a score) [20]. The scoring algorithm for self- and parent-reported PDS is provided in Figure 1.

Analytic approach

Professional-rated Tanner staging was considered the criterion standard in analyses comparing it with self- and parent-reported PDS. All statistical analyses were performed using SPSS version 19.0 for Mac (IBM Corp., Armonk, NY).

Test–retest reliability

To evaluate the consistency of self- and parent-reported PDS across time, we selected a subset of participants whose repeated assessments occurred within 6 months of one another, given that adolescents in the midst of puberty typically change stage in .5–1.5 years [21]. This restriction provided a sample of 56 children (33 girls), with a mean age of 11.9 years (range: 8.5–15.9), 87.5% white/Caucasian. From this sample, we examined the internal consistency of the PDS using Cronbach's alpha. We also examined test–retest reliability of these self- and parent-reports of the PDS from the first to the second visit using the intraclass correlation coefficient (ICC) of absolute agreement, a measure of correlation and agreement [22].

Introduction: The next questions are about changes that may be happening to your body. These changes normally happen to different young people at different ages. Since they may have something to do with your sleep patterns, do your best to answer carefully. If you do not understand a question or do not know the answer, just mark “I don’t know.”

Questions:
1. Would you say that your growth in height:
2. And how about the growth of your body hair? (“Body hair” means hair any place other than your head, such as under your arms.) Would you say that your body hair growth:
3. Have you noticed any skin changes, especially pimples?

FORM FOR BOYS:
4. Have you noticed a deepening of your voice?
5. Have you begun to grow hair on your face?

FORM FOR GIRLS:
4. Have you noticed your breasts have begun to grow?
5a. Have you begun to menstruate (started to have your period)?
5b. If yes, how old were you when you started to menstruate?

Scoring Algorithms: For Items 1 through 4 on the girls’ version and all items on the boys’ version, response options were: not yet started (1 point); barely started (2 points); definitely started (3 points); seems complete (4 points); I don’t know (missing). Yes on the menstruation item = 4 points; no = 1 point. Point values are averaged for all items to give a Pubertal Development Scale (PDS) score. Puberty Category Scores are computed using the criteria of Crockett (1988, unpublished) by totaling the scale values given above. An SPSS syntax of the scoring algorithm can be found on the Sleep for Science website: http://www.sleepforscience.org/research/

BOYS: sum body hair growth, voice change, and facial hair growth:
  Prepubertal: 3
  Early pubertal: 4 or 5 (no 3-point responses)
  Midpubertal: 6, 7, or 8 (no 4-point responses)
  Late pubertal: 9–11
  Postpubertal: 12

GIRLS: sum body hair growth, breast development, and menarche as follows:
  Prepubertal: 2 and no menarche
  Early pubertal: 3 and no menarche
  Midpubertal: > 3 and no menarche
  Late pubertal: <= 7 and menarche
  Postpubertal: 8 and menarche

Figure 1. Scoring algorithm for PDS derived from SHS.
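
The category rules in Figure 1 can be sketched in Python as follows. This is a minimal illustration, not the SPSS syntax referenced in the figure. The function names are hypothetical. For girls, it assumes the sum covers the body hair and breast items only, with menarche handled as a separate yes/no flag, which is how the thresholds in Figure 1 read. For boys, combinations the figure does not cover (for example, a sum of 5 that includes a 3-point response) are left unscored here; that choice is an assumption. The ~30% missing-response rule is not implemented.

```python
from typing import Optional, Sequence

# Category codes follow the 5-point ordinal scale described in the text.
CATEGORY_NAMES = {
    1: "prepubertal",
    2: "early pubertal",
    3: "midpubertal",
    4: "late pubertal",
    5: "postpubertal",
}


def pds_mean(points: Sequence[Optional[int]]) -> Optional[float]:
    """Average the answered items; None stands for an "I don't know" response."""
    answered = [p for p in points if p is not None]
    return sum(answered) / len(answered) if answered else None


def category_boys(body_hair: int, voice: int, facial_hair: int) -> Optional[int]:
    """Boys: total of body hair, voice change, and facial hair (each scored 1-4)."""
    items = (body_hair, voice, facial_hair)
    total = sum(items)
    if total == 3:
        return 1
    if total in (4, 5) and 3 not in items:   # "no 3-point responses"
        return 2
    if total in (6, 7, 8) and 4 not in items:  # "no 4-point responses"
        return 3
    if 9 <= total <= 11:
        return 4
    if total == 12:
        return 5
    # Assumption: Figure 1 does not say how to score the remaining
    # combinations, so this sketch leaves them unclassified.
    return None


def category_girls(body_hair: int, breast: int, menarche: bool) -> int:
    """Girls: total of body hair and breast items (2-8) plus menarche status."""
    total = body_hair + breast
    if not menarche:
        if total == 2:
            return 1
        if total == 3:
            return 2
        return 3  # > 3 and no menarche
    return 5 if total == 8 else 4  # 8 with menarche, otherwise <= 7 with menarche


if __name__ == "__main__":
    print(CATEGORY_NAMES[category_boys(2, 2, 2)])
    print(CATEGORY_NAMES[category_girls(3, 4, True)])
    print(pds_mean([2, 3, None, 3, 1]))
```

Returning `None` for uncovered combinations keeps the sketch from guessing at rules the figure does not state; a production scorer would follow the published syntax instead.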

Criterion validity

To examine the validity of the PDS as an estimate of Tanner staging, we used a larger sample, selected based on PDS reports where Tanner staging occurred within 30 days. In this sample, 233 children (114 girls) with a mean age of 12.9 years (range 7.8–17.7) had contemporaneous Tanner staging to accompany self-reported PDS; 252 children (123 girls) with a mean age of 12.9 years (range 8.6–17.7) had contemporaneous Tanner staging available for parent-reported PDS.

To assess criterion validity, we first computed Kendall's τ to estimate the association between Tanner staging and self- and parent-reported PDS. We also ran ICC of absolute agreement analyses between Tanner staging and self- and parent-reported PDS.
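
For readers who want to see how a rank association between two ordinal stagings is obtained, the following is a minimal pure-Python sketch of Kendall's τ. The text does not say which variant was computed in SPSS; the tie-adjusted τ-b used here is an assumption, chosen because 5-point scales produce many ties. The function name and the example ratings are hypothetical and are not study data.

```python
from itertools import combinations
from math import sqrt
from typing import Sequence


def kendall_tau_b(x: Sequence[int], y: Sequence[int]) -> float:
    """Tie-adjusted Kendall's tau (tau-b) for two paired ordinal ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired and contain at least two cases")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue              # tied on both ratings: drops out of tau-b
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    # Each factor counts the pairs that are not tied on one of the two ratings.
    denom = sqrt((concordant + discordant + tied_x_only)
                 * (concordant + discordant + tied_y_only))
    if denom == 0:
        raise ValueError("tau-b is undefined when a rating is constant")
    return (concordant - discordant) / denom


if __name__ == "__main__":
    # Hypothetical clinician Tanner stages and PDS-derived categories.
    tanner = [1, 1, 2, 3, 3, 4, 5, 5]
    pds = [1, 2, 2, 3, 4, 4, 4, 5]
    print(round(kendall_tau_b(tanner, pds), 2))
```

The pairwise loop is quadratic in the number of cases, which is unproblematic for samples of a few hundred children.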

Calibration

Finally, to determine whether the PDS provides a point-to-point mapping to the ordinal Tanner staging scale, we used the criterion validity sample to generate confusion matrices between professional Tanner staging and our PDS-derived categories and to calculate Cohen's kappa (κ) with bootstrapped 95% confidence intervals (CIs). We ran two sets of analyses for this determination, first using the complete range (1–5) for both measures. Second, guided by the results of this first analysis, we combined both Tanner stages and PDS scores into three categories: pre/early pubertal (stages 1–2), pubertal (stage 3), and late/postpubertal (stages 4–5).
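
The calibration steps above (confusion matrix, Cohen's κ, collapsing to three categories, bootstrapped CI) can be sketched as follows. The counts are transcribed from Table 3, under the assumption that rows are PDS categories and columns are clinician-rated Tanner (pubic hair) stages; with that reading the cells sum to the reported n of 233 and 252. The percentile bootstrap, the seed, and the function names are assumptions of this sketch; the text does not describe the bootstrap procedure used in SPSS.

```python
import random
from typing import List, Tuple

Matrix = List[List[int]]

# Rows: PDS category 1-5; columns: Tanner (pubic hair) stage 1-5.
SELF: Matrix = [
    [35, 6, 0, 1, 0],
    [25, 6, 3, 2, 0],
    [10, 12, 11, 15, 12],
    [1, 1, 3, 22, 46],
    [0, 0, 0, 3, 19],
]
PARENT: Matrix = [
    [51, 8, 2, 0, 0],
    [22, 7, 0, 3, 0],
    [7, 12, 10, 18, 6],
    [0, 1, 6, 20, 57],
    [0, 0, 0, 2, 20],
]

# Stage -> collapsed group: 0 = pre/early (1-2), 1 = pubertal (3), 2 = late/post (4-5).
GROUP = {1: 0, 2: 0, 3: 1, 4: 2, 5: 2}


def cohen_kappa(m: Matrix) -> float:
    """Unweighted Cohen's kappa from a square confusion matrix of counts."""
    k = len(m)
    n = sum(sum(row) for row in m)
    p_obs = sum(m[i][i] for i in range(k)) / n
    row_tot = [sum(row) for row in m]
    col_tot = [sum(m[i][j] for i in range(k)) for j in range(k)]
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


def collapse(m: Matrix) -> Matrix:
    """Merge a 5x5 matrix into the three broad categories used in the text."""
    out = [[0] * 3 for _ in range(3)]
    for i, row in enumerate(m, start=1):
        for j, count in enumerate(row, start=1):
            out[GROUP[i]][GROUP[j]] += count
    return out


def bootstrap_ci(m: Matrix, n_boot: int = 1000, seed: int = 0) -> Tuple[float, float]:
    """Percentile 95% CI for kappa, resampling rated children with replacement."""
    k = len(m)
    pairs = [(i, j) for i in range(k) for j in range(k) for _ in range(m[i][j])]
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        boot = [[0] * k for _ in range(k)]
        for i, j in rng.choices(pairs, k=len(pairs)):
            boot[i][j] += 1
        stats.append(cohen_kappa(boot))
    stats.sort()
    return stats[int(0.025 * n_boot)], stats[int(0.975 * n_boot) - 1]


if __name__ == "__main__":
    for label, m in (("self", SELF), ("parent", PARENT)):
        print(label, round(cohen_kappa(m), 3), round(cohen_kappa(collapse(m)), 3))
    print(bootstrap_ci(collapse(PARENT)))
```

On these transcribed counts the sketch gives five-category κ values of about .27 (self) and .29 (parent), close to the reported .26 and .28, and about .65 for the collapsed parent table. Small differences from the published figures may reflect rounding or the transcription of the flattened table.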

Table 1. Test–retest reliability for self- and parent-reported PDS

                 Cronbach's α    ICC    95% CI ICC
Both, self       .93             .87    .78–.92
Girls, self      .93             .88    .77–.94
Boys, self       .91             .84    .66–.93
Both, parent     .94             .88    .79–.93
Girls, parent    .91             .81    .62–.90
Boys, parent     .96             .93    .85–.97

n = 56 (33 girls and 23 boys). CI = confidence interval; ICC = intraclass correlation coefficient.
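
The two reliability statistics in Table 1 can be illustrated with a short pure-Python sketch. Cronbach's alpha follows the standard variance formula. For the ICC, the text specifies absolute agreement but not the exact model; the two-way, single-measurement form, ICC(2,1), is assumed here. Function names and the example scores are hypothetical and are not study data.

```python
from statistics import mean, variance
from typing import List, Sequence

Rows = List[Sequence[float]]  # one row per child, one column per item or visit


def cronbach_alpha(rows: Rows) -> float:
    """Internal consistency: columns are scale items."""
    k = len(rows[0])
    item_vars = [variance([r[j] for r in rows]) for j in range(k)]
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def icc_absolute_agreement(rows: Rows) -> float:
    """ICC(2,1): two-way model, absolute agreement, single measurement.

    Columns are repeated assessments (for example, visit 1 and visit 2).
    """
    n, k = len(rows), len(rows[0])
    grand = mean(v for r in rows for v in r)
    row_means = [mean(r) for r in rows]
    col_means = [mean(r[j] for r in rows) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ms_rows = ss_rows / (n - 1)                      # between-children mean square
    ms_cols = ss_cols / (k - 1)                      # between-visits mean square
    ms_err = (ss_total - ss_rows - ss_cols) / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


if __name__ == "__main__":
    # Hypothetical PDS categories for six children at two visits.
    visits = [(1, 1), (2, 2), (2, 3), (3, 3), (4, 4), (5, 4)]
    print(round(icc_absolute_agreement(visits), 2))
    # Hypothetical item scores (three items) for five children.
    items = [(1, 1, 2), (2, 2, 2), (3, 2, 3), (3, 4, 4), (4, 4, 4)]
    print(round(cronbach_alpha(items), 2))
```

Because absolute agreement penalizes systematic shifts between visits, this ICC is lower than a consistency-type ICC whenever scores drift upward from the first to the second assessment.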

Results

Reliability

As detailed in Table 1, Cronbach's alpha indicated high internal consistency (range .91–.96) for both PDS scales across genders. For self- and parent-reported PDS, the ICC yielded excellent levels of test–retest reliability; self-assessed PDS showed high reliability for both sexes combined (ICC = .87, 95% CI = .78–.92) and separately for girls (ICC = .88, 95% CI = .77–.94) and boys (ICC = .84, 95% CI = .66–.93). The same high degree of reliability was identified for parent-reported PDS (results in Table 1).

Criterion validity

Table 2 details the criterion validity for Tanner staging and self- and parent-reported PDS. These data show a moderate to high association of Tanner staging with self- and parent-reported PDS (τ range .68–.76, ICC range .72–.82).

Calibration

To investigate how the PDS mapped onto the five Tanner stages, we first examined the distribution of scores across the sample for each measure (Figure 2). These distributions show a U-shaped distribution for Tanner staging, with the preponderance of participants being rated as a 1 or a 5. In contrast, both the self- and parent-reported PDS scales showed a more even distribution across the five categories. Table 3 shows the absolute agreement between the Tanner staging categories based on pubic hair growth and self-reported PDS (κ = .26, 95% CI = .19–.33) and parent-reported PDS (κ = .28, 95% CI = .21–.35). Similar results were found for the absolute agreement between Tanner staging “other” characteristics and self- and parent-reported PDS (Supplemental Table 1). Examination of the distributions and the confusion matrix suggests that disagreement is primarily within the middle categories. When pubertal stages were combined as pre-, mid-, and post-pubertal, κ increased to .65 (95% CI = .57–.73), which is adequate interrater agreement (results separated by gender are given in Supplemental Tables 2 and 3).

Table 2. Criterion validity for clinician-rated Tanner staging and self- and parent-reported PDS

                 Clinician-rated Tanner (pubic hair)        Clinician-rated Tanner (other)
                 ICC   95% CI ICC   τ     95% CI τ          ICC   95% CI ICC   τ     95% CI τ
Both, self       .78   .72–.82      .71   .67–.75           .75   .68–.81      .67   .61–.72
Girls, self      .82   .75–.87      .75   .69–.80           .79   .72–.85      .70   .63–.76
Boys, self       .73   .64–.81      .68   .61–.74           .72   .55–.81      .70   .60–.74
Both, parent     .83   .77–.87      .76   .73–.80           .79   .69–.85      .72   .67–.76
Girls, parent    .82   .75–.87      .76   .70–.81           .82   .75–.87      .74   .67–.80
Boys, parent     .82   .71–.89      .79   .73–.84           .76   .42–.88      .74   .67–.81

For self-report, n = 233 (114 girls and 119 boys); for parent report, n = 252 (123 girls and 129 boys). ICC = intraclass correlation coefficient; τ = Kendall's tau.

Discussion

These analyses indicate that self- and parent-reported PDS are reliable and valid instruments to assess pubertal development in adolescents. The PDS showed strong internal consistency and test–retest reliability and generally tracked with professional-rated Tanner staging. In contrast, the two versions of the PDS failed to categorize participants into specific Tanner stages but could adequately differentiate between pre-, mid-, and post-pubertal categories. Thus, although researchers are encouraged to use the PDS to track general pubertal development, it is not recommended to use these scales to assign specific Tanner-like meaning to the derived categories, especially because premature adrenarche is often mistaken by parents and pediatricians for a sign of true puberty. Therefore, for accurate individual puberty staging, a pediatric endocrinologist should be consulted.

Our results indicated a higher degree of PDS reliability and validity than previous studies using smaller samples and only girls [12] while also adding to the literature by including both self- and parent-assessment [13,23]. Whereas prior studies compared self- or parent-ratings based on pictorial stimuli [19,24] or the original child-completed PDS scores [9,13], ours used a previously developed algorithm designed to directly approximate the Tanner scale.
The current data provide strong evidence that the widely used PDS [4,25], when coupled with this algorithm, provides a reliable, valid differentiation of participants into prepubertal, midpubertal, and postpubertal categories. Researchers aiming to categorize their participants in such a scheme may be reassured in eschewing physical examination in favor of this low-burden self-report method.

It is important to note that although the Crockett algorithm was intended to translate the PDS into Tanner-like categories [20], the PDS categories do not provide a one-to-one translation to the classical 5-point Tanner scale. When we compared the self- and parent-reported PDS with the professional Tanner staging, the widely used medical standard, the absolute agreement was poor. This result is not surprising, as the PDS is not a complete estimation of Tanner stages, and no direct questions about pubic hair are included; instead, questions are limited to body hair more broadly. In addition, on further inspection of this result, we found that health care professionals were more prone to rate children as either prepubertal or postpubertal, suggesting that they may have some difficulty discerning the subtle differences among the midpubertal categories [13].

Given that much of the disagreement was in the midpubertal categories, one possible solution for this low agreement was to combine each scale (PDS and Tanner) into three categories. Guided by a clear rationale to distinguish pre- from post-puberty and the data showing a lack of specificity for Tanner stage 3 in the PDS, we reduced the scales to three points: prepubertal (stages 1 and 2), pubertal (stage 3), and postpubertal (stages 4 and 5), and found that agreement among measures rose from low to substantial. We note that others have used similar categories, which supports the utility of the PDS in estimating this more broadly defined approximation of Tanner stages; however, such decisions appear arbitrary, and different studies divide the five Tanner stages differently [6,7,26,27]. Using a common definition is critically important for generalizing across studies.

Figure 2. (A) Distribution of scores across the sample: Tanner stages and parent-completed PDS (percentages). (B) Distribution of scores across the sample: Tanner stages and child-completed PDS (percentages).

Table 3. Absolute agreement of clinician-rated Tanner staging versus self- and parent-reported PDS

                 Clinician-rated Tanner (pubic hair)
PDS category     1     2     3     4     5
Self
  1              35    6     0     1     0
  2              25    6     3     2     0
  3              10    12    11    15    12
  4              1     1     3     22    46
  5              0     0     0     3     19
Parent
  1              51    8     2     0     0
  2              22    7     0     3     0
  3              7     12    10    18    6
  4              0     1     6     20    57
  5              0     0     0     2     20

Cohen's kappa (κ): κ for self = .26, 95% CI = .19–.33, p < .001; κ for parent = .28, 95% CI = .21–.35, p < .001. For self-report, n = 233 (114 girls and 119 boys); for parent report, n = 252 (123 girls and 129 boys).

Strengths and limitations

This study had numerous strengths. First, we assessed the reliability and validity of the PDS with the algorithm described by our group in the Sleep Habits Survey (1993) [4], and we examined both self- and parent-reported PDS ratings. Second, because of multiple assessments for the same child, we were able to evaluate test–retest reliability of the PDS, in addition to criterion validity and calibration agreement with Tanner staging. We also note several limitations. As pubertal status is a multidimensional concept, we are aware that the PDS and Tanner staging tap into different aspects of this concept. Our article does not attempt to operationalize pubertal status per se but to provide a measurement solution that combines scales into a more general measure
to capture pubertal development. Although Tanner staging is often considered a medical standard, it would have been ideal to also compare our results with additional hormone determinations. Previous studies indicate that different measures of pubertal development (e.g., gonadal and adrenal processes regulating hormones such as estrogen, testosterone, and dehydroepiandrosterone) track different aspects of development [9], and thus, further work comparing the PDS with such measures may be useful. Second, the present study provided reliability statistics for the PDS as scored via the Crockett algorithm and framed in the context of the SHS, an algorithm that has been in the literature for a long time. Researchers using other algorithms to convert the PDS should be cautious when generalizing our results to those algorithms. Third, the current sample was primarily white/Caucasian; future studies should replicate our findings across race and ethnicity.

In summary, the PDS, along with the published algorithm, provides a meaningful instrument for researchers measuring pubertal development. Both child and parent PDS provide valid assessments of the general progression of Tanner staging and are stable across time. Care should be taken in attributing PDS scores to specific Tanner stages.

Acknowledgments

The authors would like to thank the children and parents participating in the studies of the Bradley Sleep Laboratory. Tanner staging was done by J.A. Owens, MD; H. Sachs, MD; S. Labyak, RN, PhD; T. Raffray, MD; V. Dalzell, MD; K. Callahan, MD; S. Schreitmueller, RN; E. Forbes, MD; S. Laube, MD; S. Schmidhofer, MD; M. Forcier, MD; and J. Kole, MD. The authors thank them for this important contribution.

Funding Sources

This is not an industry-supported study. M.E.K.-V. was supported by the Royal Netherlands Academy of Arts and Sciences (KNAW Ter Meulen Grant), the VVAO (Jo Kolk), and a Fulbright Award. M.A.C.'s research from which the sample was drawn was supported by National Institutes of Health grants DK101046, MH52415, MH01358, MH58879, MH076969, MH45945-08, NR08381, NR04270, AA13252, HL092910, and the Periodic Breathing Foundation.

Supplementary Data

Supplementary data related to this article can be found at http://doi.org/10.1016/j.jadohealth.2019.11.308.

References

[1] Sadeh A, Dahl RE, Shahar G, et al. Sleep and the transition to adolescence: A longitudinal study. Sleep 2009;32:1602–9.
[2] Dahl RE, Lewin DS. Pathways to adolescent health sleep regulation and behavior. J Adolesc Health 2002;31(6 Suppl):175–84.
[3] Carskadon MA, Wolfson AR, Acebo C, et al. Adolescent sleep patterns, circadian timing, and sleepiness at a transition to early school days. Sleep 1998;21:871–81.
[4] Wolfson AR, Carskadon MA. Sleep schedules and daytime functioning in adolescents. Child Dev 1998;69:875–87.
[5] Carskadon MA, Harvey K, Duke P, et al. Pubertal changes in daytime sleepiness. Sleep 1980;2:453–60.
[6] Hubers M, Geisler C, Plachta-Danielzik S, et al. Association between individual fat depots and cardio-metabolic traits in normal- and overweight children, adolescents and adults. Nutr Diabetes 2017;7:e267.
[7] Feder A, Coplan JD, Goetz RR, et al. Twenty-four-hour cortisol secretion patterns in prepubertal children with anxiety or depressive disorders. Biol Psychiatry 2004;56:198–204.
[8] Tanner JM, Whitehouse RH. Clinical longitudinal standards for height, weight, height velocity, weight velocity, and stages of puberty. Arch Dis Child 1976;51:170–9.
[9] Blakemore SJ, Burnett S, Dahl RE. The role of puberty in the developing adolescent brain. Hum Brain Mapp 2010;31:926–33.
[10] Petersen AC, Crockett L, Richards M, et al. A self-report measure of pubertal status: Reliability, validity, and initial norms. J Youth Adolesc 1988;17:117–33.
[11] Rasmussen AR, Wohlfahrt-Veje C, Tefre de Renzy-Martin K, et al. Validity of self-assessment of pubertal maturation. Pediatrics 2015;135:86–93.
[12] Brooks-Gunn J, Warren MP, Rosso J, et al. Validity of self-report measures of girls’ pubertal status. Child Dev 1987;58:829–41.
[13] Shirtcliff EA, Dahl RE, Pollak SD. Pubertal development: Correspondence between hormonal and physical development. Child Dev 2009;80:327–37.
[14] Herz RS, Van Reen E, Barker DH, et al. The influence of circadian timing on olfactory sensitivity. Chem Senses 2017;43:45–51.
[15] Saletin JM, Coon WG, Carskadon MA. Stage 2 sleep EEG sigma activity and motor learning in childhood ADHD: A pilot study. J Clin Child Adolesc Psychol 2017;46:188–97.
[16] Crowley SJ, Acebo C, Fallone G, et al. Estimating dim light melatonin onset (DLMO) phase in adolescents using summer or school-year sleep/wake schedules. Sleep 2006;29:1632–41.
[17] Tarokh L, Carskadon MA, Achermann P. Dissipation of sleep pressure is stable across adolescence. Neuroscience 2012;216:167–77.
[18] Tarokh L, Van Reen E, Acebo C, et al. Adolescence and parental history of alcoholism: Insights from the sleep EEG. Alcohol Clin Exp Res 2012;36:1530–41.
[19] Chavarro JE, Watkins DJ, Afeiche MC, et al. Validity of self-assessed sexual maturation against physician assessments and hormone levels. J Pediatr 2017;186:172–178.e3.
[20] Carskadon MA, Acebo C. A self-administered rating scale for pubertal development. J Adolesc Health 1993;14:190–5.
[21] Emmanuel M, Bokor BR. Tanner stages. London: StatPearls Publishing; 2018.
[22] Koo TK, Li MY. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med 2016;15:155–63.
[23] Schmitz KE, Hovell MF, Nichols JF, et al. A validation study of early adolescents’ pubertal self-assessments. J Early Adolesc 2004;24:357–84.
[24] Rollof L, Elfving M. Evaluation of self-assessment of pubertal maturation in boys and girls using drawings and orchidometer. J Pediatr Endocrinol Metab 2012;25:125–9.
[25] Wolfson AR, Carskadon MA, Acebo C, et al. Evidence for the validity of a sleep habits survey for adolescents. Sleep 2003;26:213–6.
[26] Rico H, Revilla M, Villa LF, et al. Body composition in children and Tanner’s stages: A study with dual-energy x-ray absorptiometry. Metabolism 1993;42:967–70.
[27] Lazar L, Kalter-Leibovici O, Pertzelan A, et al. Thyrotoxicosis in prepubertal children compared with pubertal and postpubertal patients. J Clin Endocrinol Metab 2000;85:3678–82.