Computerized Adaptive Testing in Pediatric Brain Tumor Clinics

Accepted Manuscript

Why Computerized Adaptive Testing In Pediatric Brain Tumor Clinics

Jin-Shei Lai, PhD, OTR/L, Jennifer L. Beaumont, MS, Cindy J. Nowinski, MD, PhD, David Cella, PhD, William F. Hartsell, MD, John Han-Chih Chang, MD, Peter E. Manley, MD, Stewart Goldman, MD

PII: S0885-3924(17)30358-5
DOI: 10.1016/j.jpainsymman.2017.05.008
Reference: JPS 9504
To appear in: Journal of Pain and Symptom Management
Received Date: 14 February 2017
Revised Date: 20 April 2017
Accepted Date: 25 May 2017

Please cite this article as: Lai J-S, Beaumont JL, Nowinski CJ, Cella D, Hartsell WF, Han-Chih Chang J, Manley PE, Goldman S, Why Computerized Adaptive Testing In Pediatric Brain Tumor Clinics, Journal of Pain and Symptom Management (2017), doi: 10.1016/j.jpainsymman.2017.05.008. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT

Why Computerized Adaptive Testing In Pediatric Brain Tumor Clinics

Jin-Shei Lai, PhD, OTR/L
Medical Social Sciences and Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

Jennifer L. Beaumont, MS
Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

Cindy J. Nowinski, MD, PhD
Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

David Cella, PhD
Medical Social Sciences, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA

William F. Hartsell, MD
Northwestern Medicine Chicago Proton Center, Warrenville, Illinois, USA
Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois, USA

John Han-Chih Chang, MD
Northwestern Medicine Chicago Proton Center, Warrenville, Illinois, USA
Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois, USA

Peter E. Manley, MD
Children’s Hospital Boston and Dana-Farber Cancer Institute, Harvard Medical School, Boston, MA

Stewart Goldman, MD
Pediatrics, Northwestern University Feinberg School of Medicine, Chicago, Illinois, USA
Ann & Robert H. Lurie Children’s Hospital of Chicago, Chicago, Illinois, USA

Corresponding Author:
Jin-Shei Lai, PhD, OTR/L
633 N St Clair, 19th Floor
Chicago, IL, 60611
Phone: 312-503-3370
Fax: 312-503-9800
EMAIL: [email protected]

Running Head: CAT in Pediatric Brain Tumor Clinics

Funding Source: This study was funded by National Institutes of Health/National Cancer Institute (R01CA174452; PI: Jin-Shei Lai)

Conflict of Interests Disclosures: Jin-Shei Lai, Jennifer Beaumont, David Cella, Cindy Nowinski, William Hartsell, Peter Manley, and Stewart Goldman have no conflicts of interest to declare. John Han-Chih Chang is a shareholder of Illinois Cyberknife Investment, LLC, Chicago Proton Treatment Investment, LLC, and Elk Grove Radiosurgery Investment, LLC.

Author contributions: Study concept and design: Lai. Acquisition of data: Lai, Hartsell, Chang, Manley, Goldman. Analysis and interpretation of data: Lai and Beaumont. Drafting of the Manuscript: Lai and Beaumont. Critical revision of manuscript for important intellectual content: Cella, Nowinski, Hartsell, Chang, Manley and Goldman. Obtained funding: Lai. Administrative, technical and material support: Lai and Nowinski.

Abstract

CONTEXT: Monitoring of health-related quality of life (HRQOL) and symptoms of patients with brain tumors is needed yet not always feasible. This is partially due to lack of brief-yet-precise assessments with minimal administration burden that are easily incorporated into clinics. Dynamic computerized adaptive testing (CAT) or static fixed-length short-forms, derived from psychometrically-sound item banks, are designed to fill this void.

OBJECTIVE: This study evaluated the comparability of scores obtained from CATs and short-forms.

METHODS: Patients (ages 7-22) were recruited from brain tumor clinics and completed PROMIS CATs and short-forms (Fatigue, Mobility, Upper Extremity, Depressive Symptoms, Anxiety, and Peer Relationships). Pearson correlations, paired t-tests, and Cohen’s d were used to evaluate the relationship, significant differences and the magnitude of the difference between these two scores, respectively.

RESULTS: Data from 161 patients with brain tumors were analyzed. Patients completed each CAT within 2 minutes. Scores obtained from CATs and short-forms were highly correlated (r=0.95 – 0.98). Significantly different CAT versus short-form scores were found on 4 (of 6) domains yet with negligible effect sizes (|d| < 0.09). These relationships varied across patients with different levels of reported symptoms, with the strongest association at the worst or best symptom scores.

CONCLUSIONS: This study demonstrated the comparability of scores from CATs and short-forms. Yet the agreement between these two varied across degrees of symptom severity, which was a result of the ceiling effects of static short-forms. We recommend CATs to enable individualized assessment for longitudinal monitoring.

KEY WORDS: Children; Brain Tumor; Patient-centered outcomes; PROMIS; Computerized adaptive testing (CAT)

INTRODUCTION

Advances in medical treatment have enabled increasing numbers of children and adolescents with brain tumors to become survivors, such that the 5-year overall survival rate for pediatric brain tumor patients is now 75% (source: Surveillance, Epidemiology, and End Results [SEER] Program; seer.cancer.gov). However, these same lifesaving treatments can also have significant short- and long-term negative effects on brain tumor patients’ lives. As survival has increased, the scope of cancer treatment has broadened to include prevention and amelioration of both tumor- and treatment-related acute and chronic problems. Compared to both healthy controls and pediatric survivors with other types of cancer, brain tumor patients and survivors show worse social, emotional, cognitive, physical and school functioning and health-related quality of life (HRQOL) (1-10) along with greater likelihood of unemployment, financial challenges and legal difficulties (e.g., workplace discrimination, disability insurance denial, being a victim of theft, fraud or assault, etc.) during adulthood.(11, 12) We believe that monitoring health status and HRQOL of children with brain tumors during routine follow-up may at least partially ameliorate these consequences through provision of appropriate and timely intervention. For example, physical exercise training interventions may improve physical fitness for some types of cancer patients and survivors.(13) Mathematics abilities and visual working memory of childhood cancer survivors could be improved by an effective early intervention.(14)

Patients’ reports of their subjective experience (patient-reported outcomes) are increasingly accepted as valid means of ascertaining the varied effects of diseases and their treatments on their lives. Standardized assessment of patient-reported outcomes (PROs) has been shown to improve patient-clinician communication, influence clinical decision-making, result in better symptom management(15-20) and be acceptable to both patients and clinicians.(21) Yet such assessments are not widespread, and there are multiple challenges to implementing routine PRO assessment in the clinical setting (22, 23) including concerns about increased patient and clinician burden, uncertainty about interpreting and using PRO results, and lack of widely accepted measurement tools. The Patient-Reported Outcomes Measurement Information System (PROMIS)(24, 25) is a set of valid and reliable measures that can be used to assess symptoms, function and multiple aspects of health-related quality of life. Most PROMIS measures are calibrated using item response theory models, allowing them to be administered using a variety of formats, including computerized adaptive tests (CATs) and short, fixed-length forms (SFs).(24, 26, 27) CAT uses an iterative process to obtain brief-yet-precise estimates, with the following steps: all participants first complete a screening item; an initial score is estimated from that response using the pre-programmed algorithm; the algorithm then selects the item that is most informative around the estimated score on the measurement continuum for the participant to complete; and the score is re-estimated based on the participant’s response to that item. This iterative estimation process continues until the stopping rule is met. As a result, precise estimation can be achieved by using just a few items since only the most informative items are chosen.(28-30) SFs, on the other hand, consist of a fixed set of items that are the same for every participant. When SFs are created from a calibrated item bank, like PROMIS, the desired measurement properties of the SF can be taken into consideration. For example, a researcher or clinician may select items from the item bank that target precise estimation of more severe symptoms or a set of items that provide moderate precision across the full spectrum of severity.
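The iterative CAT steps described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the PROMIS algorithm: it swaps in a dichotomous two-parameter logistic (2PL) model instead of the polytomous models used for PROMIS items, uses a made-up six-item bank, scores with an expected a posteriori (EAP) estimate under a standard normal prior, and uses hypothetical names and default stopping values (`run_cat`, `first_item`, `min_items`, `max_items`, `se_stop`).

```python
import math

# Hypothetical 2PL item bank: (discrimination a, location b) for each item.
# These parameters are invented for illustration only.
ITEM_BANK = [(1.8, -1.5), (2.0, -0.5), (1.6, 0.0), (2.2, 0.5), (1.9, 1.5), (1.5, 2.0)]
GRID = [g / 10.0 for g in range(-40, 41)]  # latent trait grid from -4 to 4


def prob_endorse(theta, a, b):
    """2PL probability of endorsing an item at trait level theta."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta."""
    p = prob_endorse(theta, a, b)
    return a * a * p * (1.0 - p)


def eap_estimate(responses):
    """EAP estimate and posterior SD from a list of (item_index, 0 or 1)."""
    weights = []
    for theta in GRID:
        w = math.exp(-0.5 * theta * theta)  # unnormalized standard normal prior
        for idx, resp in responses:
            p = prob_endorse(theta, *ITEM_BANK[idx])
            w *= p if resp == 1 else (1.0 - p)
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(answer, first_item=2, min_items=2, max_items=5, se_stop=0.3):
    """Administer items adaptively; returns (T-score, SE in T units, items used)."""
    responses = [(first_item, answer(first_item))]  # screening item
    theta, se = eap_estimate(responses)             # initial score
    while len(responses) < max_items:
        if len(responses) >= min_items and se <= se_stop:
            break  # stopping rule met: precise enough
        used = {idx for idx, _ in responses}
        remaining = [i for i in range(len(ITEM_BANK)) if i not in used]
        if not remaining:
            break
        # Select the unused item that is most informative at the current estimate.
        nxt = max(remaining, key=lambda i: item_information(theta, *ITEM_BANK[i]))
        responses.append((nxt, answer(nxt)))
        theta, se = eap_estimate(responses)         # re-estimate the score
    return 50.0 + 10.0 * theta, 10.0 * se, [idx for idx, _ in responses]


def simulated_answer(item_index):
    """Deterministic toy respondent: endorses items located below theta = 1.0."""
    return 1 if ITEM_BANK[item_index][1] < 1.0 else 0


t_score, t_se, items_used = run_cat(simulated_answer)
print(f"T-score={t_score:.1f}, SE={t_se:.1f}, items administered={items_used}")
```

Because each respondent's next item depends on the running estimate, two respondents can receive different items yet still be scored on the same metric, which is the property that fixed-length forms lack.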

The brevity and ease of use of CAT and SFs can enhance implementation in clinic settings. Indeed, several studies have demonstrated that pediatric cancer patients and survivors find PROMIS measures to be acceptable and feasible to complete in both inpatient and outpatient settings.(31, 32) Varni et al(33) compared CAT and SF versions of pediatric PROMIS measures in a mixed sample from the general population and asthma clinic patients. They found that scores on the PROMIS pediatric measures are highly correlated regardless of the scoring or administration technique. However, the CAT stopping criterion used (minimum and maximum number of items allowed) affected the precision of the scores, especially for those who were at extreme ends of the measurement continuum (i.e., either ceiling or floor).

CAT administration makes individualized assessment possible as items are selected iteratively according to patients’ responses. As a result, CAT is ideal for monitoring patients’ HRQOL at different occasions since scores are comparable regardless of whether the same items are administered at each assessment.(24, 34) We believe that children with brain tumors can benefit from this feature as most of them require long-term follow-up throughout their lives. Yet since CAT administration relies on the use of an electronic device, it may not be feasible for some clinics with limited resources. In this situation, SFs are considered reliable alternatives to obtain patients’ HRQOL.(34-36) To our knowledge no studies have been conducted to evaluate implementation of CAT administration in pediatric neuro-oncology settings. Nor has anyone compared the properties of fixed-length short-forms versus CAT versions in a pediatric brain tumor population. The purpose of this study was to compare scores on pediatric PROMIS measures obtained by using CAT and SF. Understanding the potential commonalities among and differences between these two administration formats for PROMIS measures will allow investigators to select the most appropriate applications to meet their needs.

METHODS

All study procedures were approved by each institution’s review board (IRB).

Participants and Procedures

Patients with brain tumors between the ages of 5 and 22 and their parents were recruited from the Ann and Robert H. Lurie Children’s Hospital of Chicago, the Northwestern Medicine Chicago Proton Center, and Boston Children’s Hospital. Patients could be at any stage of the treatment continuum (including pre- and post-treatment and long-term survival) and undergoing or have undergone any type of cancer treatment (chemotherapy, radiation, and/or surgery). Patients and parents were excluded from the study if they lacked sufficient literacy to read and understand consent/assent forms in English or respond to the questions.

Once consented, patients (ages 7-22 years) completed the baseline study questionnaires using self-report of PROMIS CATs and short-forms measuring domains of Fatigue, Mobility, Upper Extremity Function, Depressive Symptoms, Anxiety, and Peer Relationships. While pediatric PROMIS was validated on children ages 8-17, we included children aged 7 (n=5) based on our previous experience showing that 7-year-olds are able to complete self-reported symptom and HRQOL measures.(37) We also included patients ages 18-22 (n=15) to capture patients who were transitioning into adult clinics. These items have been used on patients ages 18-25 years,(38) supporting the inclusion of these young adults. Parents completed demographic information and other symptom and HRQOL measures (data not reported in this manuscript). Parents of patients ages 5-7 completed proxy versions of the pediatric PROMIS measures. Measures were either completed in-clinic using a tablet computer or, if not convenient for the participant, from home or any other place with an internet connection. Each participant first took the CAT version of a given pediatric PROMIS measure and then any remaining items in the associated SF that were not presented during the CAT. The CAT stopping rule required a minimum of 5 items and ended administration at a maximum number of items (15 for Peer Relationships, 13 for all other CATs) and/or a standard error of 3, whichever came first. Short Form versions contain the following number of items: Fatigue (10), Mobility (8), Upper Extremity Function (8), Depressive Symptoms (8), Anxiety (8), and Peer Relationships (8). Clinical data were obtained via medical chart review. Only data from patient self-reported versions are presented in this manuscript.

Statistical Analyses

Descriptive statistics were calculated for time and number of items to complete PROMIS CATs. To avoid interrupting clinical flow and exhausting participants, participants were allowed to take a break during the assessment, thereby extending the time to complete an assessment. Thus, times to complete a CAT of greater than one hour were considered unrealistic and were coded as missing. All PROMIS measures were scored using IRTPRO and established item parameters and reported using a T-score metric, in which the general population based mean=50 and standard deviation=10. We used Pearson correlations (criterion: >0.7)(39) to describe the relationship between scores obtained from CAT and SF; paired t-tests to evaluate the significant differences (criterion: p<0.05); and effect size (ES; Cohen’s d) (criterion: >0.2) to describe the magnitude of the difference between these two scores. For each pediatric PROMIS measure, participants were classified into three groups based on their T-scores: <45 (more than 1/2 SD below the norm), 45-55 (within 1/2 SD of the norm), and >55 (more than 1/2 SD above the norm). Analyses were conducted with all patients together as well as within these severity groups. Higher scores represent worse Fatigue, Anxiety, and Depression. Conversely, higher scores represent better functioning on Mobility, Upper Extremity Function and Peer Relationship.
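The three comparison statistics and the T-score grouping can be illustrated with a short sketch. The paired scores below are invented, the function names are hypothetical, and Cohen's d is computed here as the mean paired difference divided by the standard deviation of the differences, which is one common paired-data definition and only an assumption about the formula used in this study. The p-value step is omitted because it needs a t-distribution; a library routine such as `scipy.stats.ttest_rel` could supply it.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def paired_t(x, y):
    """Paired t statistic and degrees of freedom (no p-value computed)."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t_stat = statistics.fmean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t_stat, n - 1


def cohens_d_paired(x, y):
    """Assumed definition: mean difference / SD of the paired differences."""
    diffs = [a - b for a, b in zip(x, y)]
    return statistics.fmean(diffs) / statistics.stdev(diffs)


def severity_group(t_score):
    """Classify a T-score (mean 50, SD 10) into the three analysis groups."""
    if t_score < 45:
        return "<45"
    if t_score <= 55:
        return "45-55"
    return ">55"


# Invented CAT and SF T-scores for six hypothetical patients.
cat_scores = [42.0, 48.5, 51.0, 55.5, 60.0, 63.5]
sf_scores = [41.0, 49.0, 50.0, 56.5, 59.0, 64.5]

r = pearson_r(cat_scores, sf_scores)
t_stat, df = paired_t(cat_scores, sf_scores)
d = cohens_d_paired(cat_scores, sf_scores)
groups = [severity_group(s) for s in cat_scores]
print(f"r={r:.3f}, t={t_stat:.3f} (df={df}), d={d:.3f}, groups={groups}")
```

Running the same three functions within each severity group, rather than on the full sample, mirrors the subgroup analyses described above.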

RESULTS

Data from 161 children with brain tumors were analyzed. The mean age of the children was 13.9 years (range = 7 to 22). The sample was gender-balanced (53.9%), primarily white (78.9%) and just over half (55.7%) attended regular classrooms. Mean time since tumor diagnosis was 5.2 years, with an average of 3.7 years since last treatment. Yet about a third of patients received their last treatment within one year. Almost all children (94.7%) had been treated for their tumor (chemotherapy, radiotherapy, and/or surgery) with 20.3% receiving all three forms of treatment. More demographic and clinical characteristics of the sample are shown in Table 1.

Selected features of CAT and SF administration are shown in Table 2. The average number (standard deviation; SD) of items to complete each CAT was 9.7 (2.9), 8.7 (2.8), 8.1 (3.3), 10.4 (2.7), 8.3 (3.4), and 8.1 (3.2) for Anxiety, Fatigue, Mobility, Upper Extremity Function, Depression, and Peer Relationships, respectively. Patients completed each CAT within 2 minutes, ranging from 1.3 to 2.0 minutes across measures. Strong correlations between CAT and SF scores were found across all domains, ranging from 0.95 (Depression) to 0.98 (Peer Relationships). However, as shown in Figure 1, in general, more skewed distributions were observed in short-form scores than in CAT scores.

Paired t-test results showed significant (p<0.05) differences between CAT and SF scores on Anxiety, Fatigue, Mobility, and Peer Relationship (Table 2). However, their effect sizes were all less than 0.2 (range between 0 and 0.08), indicating these differences were negligible. Distributions of the differences between CAT and SF scores are shown in Figure 2, in which the majority of differences were around zero. When comparing CAT and SF by T-score groups (i.e., <45, between 45 and 55 (inclusive), and >55), as shown in Table 3, most correlation coefficients remained at acceptable levels (>0.7) except for Depression. An unexpectedly low correlation of 0.18 was found for those whose T-scores were between 45 and 55. Significant differences between CAT and SF T-scores of non-negligible magnitude (ES>0.2) were identified at the better functioning end of the continuum on all measures, except for Anxiety T-scores, which showed larger differences at the worse functioning end (p=0.021, ES=0.60). However, all differences between CAT and SF were 2.5 T-score points or less and most were less than 1 T-score unit. An extremely high ES (39.19) was found for patients with Upper Extremity Function CAT T-scores greater than 55. This was because this group had low variance (standard deviation=0.01) of T-scores, resulting in a large ES even though the mean difference was minimal (mean=0.50).
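The arithmetic behind this inflated effect size can be made explicit. The worked step below assumes that Cohen's d was computed as the mean CAT-SF difference divided by a standard deviation; the manuscript does not state which standard deviation was used, and the symbols \bar{D} and s are introduced here only for illustration.

```latex
d = \frac{\bar{D}}{s}
\quad\Longrightarrow\quad
s = \frac{\bar{D}}{d} = \frac{0.50}{39.19} \approx 0.013
```

Under that assumption, the implied standard deviation of about 0.013 is consistent with the reported value of 0.01 after rounding. It also shows why the statistic is unstable in this subgroup: as s approaches zero, even a small mean difference yields an arbitrarily large d.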

DISCUSSION

Results of this study support the use of the pediatric PROMIS CATs in pediatric brain tumor clinics. If CAT administration is not feasible, SFs targeted to patients’ severity levels can be used. Inferior health-related quality of life due to disease and/or treatments has been a concern for childhood brain tumor patients and survivors throughout their lifespan and should be monitored closely.(11) HRQOL measures that can provide brief-yet-precise estimation with minimal administration burden should be included in routine clinical visits from on-therapy to long-term follow-ups. Both CATs and SFs derived from comprehensive and psychometrically-sound item banks serve as appropriate means to achieve this goal. The dynamic CATs enable individualized assessment by administering the most informative items based on patients’ previous responses. However, when using a computer is not feasible, fixed-length SFs may be more practical. As long as items from CATs and SFs are from the same IRT-calibrated item banks, scores from CATs and SFs are comparable.(28, 40) Our results, showing high correlations and negligible effect sizes between scores on these two forms, further support the score comparability of CAT and SF versions of PROMIS measures. This conclusion is consistent with previous findings.(33)

However, we also noted that concordance between CATs and SFs varied across the continuum. As shown in Table 3, higher correlation coefficients were found on the more symptomatic end of the measurement continuum with two exceptions, Mobility and Upper Extremity Function. For patients with better functioning of Mobility, CAT was able to capture various degrees of Mobility (scores ranging between 56.4 and 61.7) while these patients reported the same SF score (58.5) (see Fig 1.d). Therefore, no reliability coefficient could be estimated due to the lack of variance on the Mobility SF. For patients who reported Upper Extremity Function T-scores >55, a perfect correlation between these two forms was found. This low variation contributed to an effect size of 39 (mean=0.5 and SD=0.01).

More than one SF can be constructed from a given item bank. As demonstrated by Lai and her colleagues,(29) investigators can develop SFs based on their needs, such as targeting patients with worse symptoms only, patients with minimal symptoms only, or a SF covering various degrees of symptom severity across the whole continuum. As investigators typically do not know patients’ HRQOL/symptom levels prior to assessments, we recommend using CATs given their dynamic and individualized nature as described in the Introduction. The advantage of using CAT over a short-form was demonstrated by a low correlation coefficient (r=0.18) for patients with depression CAT T-scores between 45 and 55 (inclusive). As shown in Fig 1.c., these patients reported the same lowest possible SF score of 35 but various CAT scores, implying individualized CAT was able to capture a more varied degree of (lack of) depressive symptoms for these patients.

The diagnosis of childhood brain tumors is associated with significant life disruption due to both acute disease effects and the increasingly recognized adverse long-term treatment consequences. There is a need to better understand HRQOL of pediatric brain tumor patients and how it compares to that of children with other chronic conditions. However, progress in this area has been limited by a number of factors. For example, children with brain tumors were often excluded from many pediatric HRQOL studies because their experiences were considered atypical of those of the majority of pediatric survivors. Furthermore, pediatric brain tumor patients are a particularly challenging group to study because of the relatively small number of patients, the diversity of brain tumors, the varying functional impact of the tumors, and the range of surgical and treatment effects that can occur.(41) And, finally, there is a lack of accepted brief-yet-precise and psychometrically-sound measures. The PROMIS measurement system allows for tailored and brief-yet-precise assessments, was validated on children with cancer, including brain tumors,(31) and is available in electronic health record (EHR) software.(42) These characteristics make it a strong candidate for use as an HRQOL measure in clinical settings.

We note several limitations to our study. First, patients were allowed to pause assessments when needed and complete them at a later time. This was done to avoid interrupting clinical flow and to reflect the reality of patients’ daily lives. Yet this flexibility resulted in some participants taking an unusually long time to complete an assessment. Second, we did not include non-English speaking families in this study, which might have unintentionally excluded some economically disadvantaged families without internet access at home and those with low health literacy. Future studies should be conducted to evaluate the feasibility of implementing pediatric PROMIS in these disadvantaged populations.

In conclusion, this study demonstrated that PROMIS CATs and SFs produce comparable scores, at the level of group comparisons, for children with a brain tumor. The agreement between scores obtained by using CATs and SFs varied at the individual level across symptom severity levels, which resulted from ceiling or floor effects introduced by fixed-length short-forms. We thus recommend CATs to enable individualized assessment for longitudinal monitoring. If CAT administration is not feasible, multiple SFs which target different severity levels can be considered to minimize ceiling or floor effects. Low respondent burden was evidenced by the brief time required to complete a CAT. Including pediatric PROMIS CATs or SFs in routine clinical visits should be considered.

Acknowledgement

This study was funded by the National Cancer Institute (R01CA174452; PI: Jin-Shei Lai).

Conflict of Interest

Jin-Shei Lai, William Hartsell, Cindy Nowinski and Jennifer Beaumont have no conflicts of interest to declare. John Han-Chih Chang is a shareholder of Illinois Cyberknife Investment, LLC, Chicago Proton Treatment Investment, LLC, and Elk Grove Radiosurgery Investment, LLC.

References

AC C

EP

TE D

M AN U

SC

RI PT

1. Meeske K, Katz ER, Palmer SN, Burwinkle T, Varni JW. Parent proxy-reported health-related quality of life and fatigue in pediatric patients diagnosed with brain tumors and acute lymphoblastic leukemia. Cancer 2004;101:2116-2125. 2. Penn A, Shortman RI, Lowis SP, et al. Child-related determinants of health-related quality of life in children with brain tumours 1 year after diagnosis. Pediatric Blood and Cancer 2010;55:1377-1385. 3. de Ruiter MA, Schouten-van Meeteren AYN, van Vuurden DG, et al. Psychosocial profile of pediatric brain tumor survivors with neurocognitive complaints. Quality of Life Research 2016;25:435-446. 4. de Ruiter MA, Van Mourik R, Schouten‐Van Meeteren AY, Grootenhuis MA, Oosterlaan J. Neurocognitive consequences of a paediatric brain tumour and its treatment: a meta‐analysis. Developmental Medicine & Child Neurology 2013;55:408-417. 5. Schulte F, Barrera M. Social competence in childhood brain tumor survivors: a comprehensive review. Supportive Care in Cancer 2010;18:1499-1513. 6. Salley CG, Hewitt LL, Patenaude AF, et al. Temperament and social behavior in pediatric brain tumor survivors and comparison peers. Journal of pediatric psychology 2015;40:297308. 7. Duckworth J, Nayiager T, Pullenayegum E, et al. Health-related quality of life in long-term survivors of brain tumors in childhood and adolescence: a serial study spanning a decade. Journal of pediatric hematology/oncology 2015;37:362-367. 8. Klassen AF, Anthony SJ, Khan A, Sung L, Klaassen R. Identifying determinants of quality of life of children with cancer and childhood cancer survivors: a systematic review. Supportive Care in Cancer 2011;19:1275-1287. 9. Yagc-Küpeli B, Akyüz C, Küpeli S, Büyükpamukçu M. Health-related quality of life in pediatric cancer survivors: a multifactorial assessment including parental factors. Journal of pediatric hematology/oncology 2012;34:194-199. 10. Macartney G, VanDenKerkhof E, Harrison MB, Stacey D. 
Symptom experience and quality of life in pediatric brain tumor survivors: A cross-sectional study. Journal of pain and symptom management 2014;48:957–967. 11. Howard AF, Hasan H, Bobinski MA, et al. Parents’ perspectives of life challenges experienced by long-term paediatric brain tumour survivors: work and finances, daily and social functioning, and legal difficulties. Journal of Cancer Survivorship 2014;8:372-383. 12. Olson R, Hung G, Bobinski MA, Goddard K. Prospective evaluation of legal difficulties and quality of life in adult survivors of childhood cancer. Pediatric blood & cancer 2011;56:439443. 13. Braam KI, van der Torre P, Takken T, et al. Physical exercise training interventions for children and young adults during and after treatment for childhood cancer. Cochrane database of systematic reviews 2013;3:CD008796. 14. Moore IM, Hockenberry MJ, Anhalt C, McCarthy K, Krull KR. Mathematics intervention for prevention of neurocognitive deficits in childhood leukemia. Pediatric Blood and Cancer 2012;59:278-84. 15. Roter D, Hall JA, Katz N. Relations between physicians' behaviors and analogue patients' satisfaction, recall, and impressions. Medical Care 1987;25:437-451. 15

ACCEPTED MANUSCRIPT

AC C

EP

TE D

M AN U

SC

RI PT

16. Smith DM, Weinberger M, Katz BP, Moore PS. Postdischarge care and readmissions. Medical Care 1988;26:699-708. 17. Weinberger M, Smith DM, Katz BP, Moore PS. The cost-effectiveness of intensive postdischarge care. A randomized trial. Medical Care 1988;26:1092-1102. 18. Fitzgerald JF, Smith DM, Martin DK, Freedman JA, Katz BP. A case manager intervention to reduce readmissions. Archives of Internal Medicine 1994;154:1721-1729. 19. Kerr J, Engel J, Schlesinger-Raab A, Sauer H, Holzel D. Communication, quality of life and age: Results of a 5-year prospective study in breast cancer patients. Annals of Oncology 2003;14:421-427. 20. Velikova G, Booth L, Smith AB, et al. Measuring quality of life in routine oncology practice improves communication and patient well-being: A randomized controlled trial. Journal of Clinical Oncology 2004;22:714-724. 21. Stover A, Irwin DE, Chen RC, et al. Integrating patient-reported outcome measures into routine cancer care: cancer patients’ and clinicians’ perceptions of acceptability and value. eGEMs 2015;3. 22. Senders A, Hanes D, Bourdette D, Whitham R, Shinto L. Improving the Patient-Reported Outcome Experience for Participants and PI's: Feasibility and Validity of PROMIS. Journal of Alternative and Complementary Medicine 2014;20:A13-A13. 23. Gilbert A, Sebag-Montefiore D, Davidson S, Velikova G. Use of patient-reported outcomes to measure symptoms and health related quality of life in the clinic. Gynecologic oncology 2015;136:429-439. 24. Cella D, Riley W, Stone A, et al. The Patient-Reported Outcomes Measurement Information System (PROMIS) developed and tested its first wave of adult self-reported health outcome item banks: 2005-2008. Journal of Clinical Epidemiology 2010;63:1179-1194. 25. Cella D, Yount S, Gershon R, Rothrock N. The Patient-Reported Outcomes Measurement Information System (PROMIS): Four years in and four to go. In: International Society for Quality of Life Research (ISOQOL), Montevideo, Uruguay: 2008. 26. 
Lai JS, Nowinski C, Victorson D, et al. Quality-of-Life Measures in Children With Neurological Conditions: Pediatric Neuro-QOL. Neurorehabilitation and Neural Repair 2012;26:36-47. 27. Lai JS, Zelko F, Butt Z, et al. Parent-perceived child cognitive function: results from a sample drawn from the US general population. Child's Nervous System 2011;27:285-93. 28. Lai J-S, Butt Z, Zelko F, et al. Development of a Parent-Report Cognitive Function Item Bank Using Item Response Theory and Exploration of its Clinical Utility in Computerized Adaptive Testing. Journal of Pediatric Psychology 2011;36:766-79. 29. Lai JS, Cella D, Choi SW, et al. How Item Banks and Their Application Can Influence Measurement Practice in Rehabilitation Medicine: A PROMIS Fatigue Item Bank Example. Archives of Physical Medicine and Rehabilitation 2011;92:S20-S27. 30. Pilkonis PA, Choi SW, Reise SP, et al. Item banks for measuring emotional distress from the Patient-Reported Outcomes Measurement Information System (PROMIS):depression, anxiety, and anger. Assessment 2011;18:263-83. 31. Hinds PS, Nuss SL, Ruccione KS, et al. PROMIS pediatric measures in pediatric oncology: valid and clinically feasible indicators of patient-reported outcomes. Pediatric Blood and Cancer 2013;60:402-8.


32. Menard JC, Hinds PS, Jacobs SS, et al. Feasibility and acceptability of the patient-reported outcomes measurement information system measures in children and adolescents in active cancer treatment and survivorship. Cancer Nursing 2014;37:66-74.
33. Varni JW, Magnus B, Stucky BD, et al. Psychometric properties of the PROMIS® pediatric scales: precision, stability, and comparison of different scoring and administration options. Quality of Life Research 2014;23:1233-43.
34. Lai J-S, Zelko F, Krull K, et al. Parent-reported cognition of children with cancer and its potential clinical usefulness. Quality of Life Research 2014;23:1049-58.
35. Lai J-S, Stucky B, Thissen D, et al. Development and psychometric properties of the PROMIS® pediatric fatigue item banks. Quality of Life Research 2013;22:2417-2427.
36. Irwin DE, Stucky B, Langer MM, et al. An item response analysis of the pediatric PROMIS anxiety and depressive symptoms scales. Quality of Life Research 2010;19:595-607.
37. Lai JS, Cella D, Peterman A, Barocas J, Goldman S. Anorexia/cachexia related quality of life for children with cancer: Testing the psychometric properties of the Pediatric Functional Assessment of Anorexia/Cachexia Therapy (peds-FAACT). Cancer 2005;104:1531-1539.
38. Reeve B, Thissen D, DeWalt D, et al. Linkage between the PROMIS® pediatric and adult emotional distress measures. Quality of Life Research 2016;25:823-33.
39. Cohen J. Statistical power analysis for the behavioral sciences, 2nd ed. Hillsdale, N.J.: L. Erlbaum Associates, 1988.
40. Bjorner JB, Rose M, Gandek B, et al. Method of administration of PROMIS scales did not significantly impact score level, reliability, or validity. Journal of Clinical Epidemiology 2014;67:108-13.
41. Patenaude AF, Kupst MJ. Psychosocial functioning in pediatric cancer. Journal of Pediatric Psychology 2005;30:9-27.
42. Wagner LI, Schink J, Bass M, et al. Bringing PROMIS to practice: Brief and precise symptom screening in ambulatory cancer care. Cancer 2015;121:927-34.


Figure Legends

Figure 1. T-Score Distributions and Correlations between CAT and SF

Figure 2. Distribution of Differences and Agreement between T-Scores Obtained from CAT versus SF


Figure 1. T-Score Distributions and Correlations between CAT and SF

[Figure panels not reproduced. Axes in each panel are the domain's score from CAT and the same domain's score from the Short-Form.]

a. Fatigue: Fatigue Score from Short-Form; Fatigue Score from CAT
b. Anxiety: Anxiety Score from Short-Form; Anxiety Score from CAT
c. Depression: Depression Score from Short-Form; Depression Score from CAT
d. Mobility: Mobility Score from Short-Form; Mobility Score from CAT
e. Upper Extremity: Upper Extremity Function Score from Short-Form; Upper Extremity Function Score from CAT
f. Peer Relationships: Peer Relationships Score from Short-Form; Peer Relationships Score from CAT


Figure 2. Distribution of Differences and Agreement between T-Scores Obtained from CAT versus SF

[Figure panels not reproduced.]

a. Anxiety
b. Depression
c. Fatigue
d. Mobility
e. Upper Extremity
f. Peer Relationship


Table 1. Sample demographic and clinical information

| Variable | Category | Value |
|---|---|---|
| Age (in years) | | Mean=13.9 (SD=3.7) |
| Years since diagnosis | | Mean=5.2 (SD=4.6) |
| Years since last treatment | | Mean=3.7 (SD=3.4) |
| | <= 1 year; > 1 year | 36% (64%) |
| Years since last chemotherapy | | Mean=5.1 (SD=12.0) |
| | <= 1 year; > 1 year | 39.8% (60.2%) |
| Years since last radiation | | Mean=3.6 (SD=3.3) |
| | <= 1 year; > 1 year | 37.9% (62.1%) |
| Years since last surgery | | Mean=5.4 (SD=11.5) |
| | <= 1 year; > 1 year | 25.3% (74.7%) |
| Gender | Male | 53.9% |
| Tumor Type | Embryonal tumors: Medulloblastoma | 23.0% |
| | Ganglioma | 18.3% |
| | Pilocytic Astrocytoma | 13.5% |
| | Astrocytoma (diffuse, infiltrative, fibrillary) | 11.1% |
| Status of tumor | Initial diagnosis | 85.8% |
| | Recurrence | 14.2% |
| Treatment | Chemotherapy | 77.5% |
| | Radiotherapy | 54.2% |
| | Surgery | 69.9% |
| | Surgery, chemotherapy, and radiation | 20.3% |
| | No surgery, chemotherapy and radiation | 5.3% |
| Radiation type (c) | Limited field/localized | 7.4% |
| | Craniospinal | 20.6% |
| | Proton beam | 52.9% |
| Race | White | 78.9% |
| | African American | 5.5% |
| Attending School | Yes | 95.4% |
| Type of Classroom (a) | Regular classroom; no IEP | 55.7% |
| | Regular classroom; with IEP | 32.8% |
| | Special education | 7.4% |
| | Other | 4.1% |
| Karnofsky or Lansky performance status (d) | 100 | 57.0% |
| | 90 | 32.0% |
| | 70-80 | 9.4% |
| | 50-60 | 1.6% |
| Parent-rated child’s quality of life | Excellent | 28.4% |
| | Very good | 37.0% |
| | Good | 27.6% |
| | Fair or Poor | 7.1% |

a. Only those attending school were included. IEP: Individualized educational program
b. % was calculated using non-missing data (n=305). Disease severity was not documented in a consistent manner across recruitment sites as well as across cancer types, and thus not reported here.
c. % was calculated based on children who received radiotherapy (n=170)
d. Clinician rated


Table 2. Comparisons between CAT and Short-form

Computerized Adaptive Testing

| Item bank | N | Time (SD)^a | Items administered, Mean (SD)^b | Items administered, Min | Items administered, Max | T-Score, Mean (SD) | T-Score, Min | T-Score, Max |
|---|---|---|---|---|---|---|---|---|
| Anxiety | 140 | 1.38 (1.69) | 9.7 (2.9) | 5 | 13 | 42.7 (10.7) | 31.3 | 72.2 |
| Fatigue | 161 | 2.01 (3.96) | 8.7 (2.8) | 5 | 13 | 43.7 (12.9) | 25.4 | 73.8 |
| Mobility | 157 | 1.46 (0.98) | 8.1 (3.3) | 5 | 13 | 48.1 (9.4) | 21.2 | 61.7 |
| Upper Extremity | 148 | 1.3 (0.97) | 10.4 (2.7) | 5 | 13 | 48.6 (9.2) | 25.7 | 57.2 |
| Depression | 145 | 1.31 (2.46) | 8.3 (3.4) | 5 | 13 | 45.2 (11.2) | 31.8 | 72.0 |
| Peer relationship | 137 | 1.49 (1.95) | 8.1 (3.2) | 5 | 15 | 49.8 (10.5) | 17.0 | 66.0 |

Short-Form (T-Score)

| Item bank | N | Mean (SD) | Min | Max |
|---|---|---|---|---|
| Anxiety | 128 | 41.9 (10.5) | 32.3 | 72.9 |
| Fatigue | 147 | 44.8 (11.6) | 30.3 | 73.0 |
| Mobility | 144 | 47.6 (9.1) | 25.0 | 58.5 |
| Upper Extremity | 136 | 48.6 (9.7) | 21.6 | 56.7 |
| Depression | 130 | 45.0 (10.6) | 35.2 | 75.7 |
| Peer relationship | 125 | 49.4 (10.7) | 17.7 | 64.4 |

CAT vs. SF T-score

| Item bank | Change (CAT-SF)^c | Paired t-test | Effect size^d | Pearson r |
|---|---|---|---|---|
| Anxiety | 0.8 | p=0.035 | 0.07 | 0.976 |
| Fatigue | -1.1 | p<0.001 | 0.08 | 0.972 |
| Mobility | 0.5 | p=0.004 | 0.05 | 0.966 |
| Upper Extremity | 0 | p=0.974 | 0.00 | 0.962 |
| Depression | 0.2 | p=0.704 | 0.02 | 0.947 |
| Peer relationship | 0.4 | p=0.010 | 0.03 | 0.978 |

a. Time to complete CAT, in minutes
b. Number of CAT items administered
c. Mean differences between CAT and SF T-scores
d. Mean difference / standard deviation of differences


Table 3. Comparisons between CAT and Short-form by groups:

| Domain^a | Group (by T-score) | n | Corr^b | p^c | ES^d |
|---|---|---|---|---|---|
| Fatigue | <45 | 76 | 0.88 | <0.001 | -0.76 |
| Fatigue | 45-55 (inclusive) | 40 | 0.79 | 0.128 | -0.25 |
| Fatigue | >55 | 31 | 0.90 | 0.080 | -0.33 |
| Anxiety | <45 | 81 | 0.79 | 0.309 | 0.11 |
| Anxiety | 45-55 (inclusive) | 29 | 0.74 | 0.583 | 0.10 |
| Anxiety | >55 | 18 | 0.88 | 0.021 | 0.60 |
| Depression | <45 | 67 | 0.83 | <0.001 | -0.61 |
| Depression | 45-55 (inclusive) | 35 | 0.18 | 0.007 | 0.49 |
| Depression | >55 | 28 | 0.95 | 0.634 | -0.09 |
| Mobility | <45 | 58 | 0.92 | 0.175 | 0.18 |
| Mobility | 45-55 (inclusive) | 45 | 0.73 | 0.333 | -0.15 |
| Mobility | >55 | 41 | NA | <0.001 | 0.87 |
| Upper Extremity | <45 | 43 | 0.84 | 0.801 | 0.04 |
| Upper Extremity | 45-55 (inclusive) | 32 | 0.67 | 0.038 | -0.38 |
| Upper Extremity | >55 | 61 | 1 | <0.001 | 39.19 |
| Peer Relationship | <45 | 39 | 0.94 | 0.111 | 0.26 |
| Peer Relationship | 45-55 (inclusive) | 44 | 0.86 | 0.958 | 0.01 |
| Peer Relationship | >55 | 42 | 0.88 | 0.002 | 0.51 |

a. Scoring direction is reflected by the score names. For Fatigue, Anxiety, and Depression, higher scores mean more symptomatic (i.e., more fatigue, anxiety, and depressive symptoms). For Mobility, Upper Extremity Function and Peer Relationships, higher scores mean better functioning.
b. Pearson correlation
c. p-value for paired t-test
d. Effect size = mean difference / standard deviation of differences
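
The comparison statistics named in the footnotes to Tables 2 and 3 (Pearson correlation, the paired t-test, and effect size defined as mean difference divided by the standard deviation of differences) can be illustrated with a minimal Python sketch. This is not the study's analysis code: the function names `t_score_group` and `compare_paired_scores` and the example scores are hypothetical, the sketch reports only the paired t statistic rather than a p-value, and it assumes that a single T-score per child is used to assign the Table 3 group, which the footnotes do not specify.

```python
import math
from statistics import mean, stdev


def t_score_group(t_score):
    """Assign a T-score to one of the three groups used in Table 3."""
    if t_score < 45:
        return "<45"
    if t_score <= 55:  # 45-55, inclusive at both ends
        return "45-55"
    return ">55"


def compare_paired_scores(cat, sf):
    """Summarize agreement between paired CAT and short-form T-scores."""
    if len(cat) != len(sf) or len(cat) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(cat)
    diffs = [c - s for c, s in zip(cat, sf)]  # CAT minus SF, as in Table 2
    mean_diff = mean(diffs)
    sd_diff = stdev(diffs)  # sample SD (n - 1 denominator)

    # Effect size as defined in the footnotes: mean difference / SD of differences.
    effect_size = mean_diff / sd_diff if sd_diff > 0 else None
    # Paired t statistic (n - 1 degrees of freedom); p-value lookup is omitted.
    t_stat = effect_size * math.sqrt(n) if effect_size is not None else None

    # Pearson correlation, undefined when either score has no variance.
    mean_cat, mean_sf = mean(cat), mean(sf)
    cov = sum((c - mean_cat) * (s - mean_sf) for c, s in zip(cat, sf))
    ss_cat = sum((c - mean_cat) ** 2 for c in cat)
    ss_sf = sum((s - mean_sf) ** 2 for s in sf)
    denom = math.sqrt(ss_cat * ss_sf)
    pearson_r = cov / denom if denom > 0 else None

    return {
        "n": n,
        "mean_diff": mean_diff,
        "effect_size": effect_size,
        "t_stat": t_stat,
        "pearson_r": pearson_r,
    }


# Hypothetical paired T-scores (illustrative only, not study data).
cat_scores = [40.0, 45.0, 50.0, 55.0, 60.0]
sf_scores = [41.0, 44.0, 51.0, 54.0, 59.0]
print(compare_paired_scores(cat_scores, sf_scores))
print([t_score_group(t) for t in sf_scores])
```

In this sketch the correlation is returned as `None` when one of the two scores is constant within a group, and the effect size is returned as `None` when all differences are identical, because both quantities are then undefined.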
