Journal of Clinical Epidemiology 63 (2010) 1165–1166
EDITORIAL
Tailoring patient reported outcome measurement

Patient-reported outcomes are key variables in clinical research, but it is not always easy to select the best methods to measure them in specific situations. We therefore welcome, in this month’s issue of the Journal of Clinical Epidemiology, a series outlining the first wave of testing of the Patient Reported Outcome Measurement Information System (PROMIS) initiative. This is one of the most exciting recent initiatives in clinical epidemiology, funded by the US National Institutes of Health Roadmap Program. PROMIS has the ambitious goal of combining the best of all of the patient-reported outcomes questionnaires and tailoring them to the specific needs of the patients being studied. In doing so, it will create a national resource for precise and efficient measurement of patient-reported symptoms, functioning, and health-related quality of life, appropriate for patients with a wide variety of chronic disease conditions. Of special appeal to the research community (and no doubt, to the funders of research) is the claim that this will reduce by 20% to 50% the sample sizes that trials need to demonstrate minimal clinically important differences.

Following an insightful commentary by Maarten Boers, three original articles describe different aspects of the initial validation. The first article, by Liu et al, evaluates and compares PROMIS with national norms. The second, by Cella et al, demonstrates the reliability, precision, and construct validity of the item banks used in PROMIS. The final article in this series, by Rothrock et al, shows how PROMIS allows assessment of the impact of chronic conditions on health-related quality of life (HRQL) across diseases.

While David Sackett, in his classic article [1], identified 53 types of bias, Chavalarias and Ioannidis have expanded this to 235 biases and present these graphically in their fascinating color cluster maps.
Based on a computationally intensive text-mining approach, these maps offer a broad view of the different types of bias in biomedical research; their overlap, co-occurrence, and specificity; the terms they are associated with; and how they have evolved over time.

It is of great interest to observe how the quality of study reporting improves over time, and how it is associated with reporting standards such as CONSORT (Consolidated Standards of Reporting Trials) and with study registration. Reveiz et al reviewed 148 randomized controlled trials (RCTs) published in 2007 in top medical journals. Based on this study of the 55 top-ranked journals, they conclude that reporting of CONSORT items remains suboptimal; however, registration in a trial registry was associated with improved reporting. They echo the sentiments of the International Committee of Medical Journal Editors (ICMJE), which maintains that all journals publishing RCTs should require a prospective trial registration identification number to improve the quality of reporting.

‘Impact estimates’ such as number needed to treat for benefit (NNTB) and harm (NNTH), extended to non-randomized studies with measures such as the disease impact number and population impact number [2] and the number needed to be exposed [3], continue to attract attention. The latter measures require additional careful attention to bias, and thus more sophisticated statistical techniques to adjust for it. Gehrmann et al propose a novel approach to estimating both number needed to treat (NNT) and number needed to be exposed (NNE). They term their method ‘LR-ARD’, an average risk difference (ARD) approach based on logistic regression (LR), and suggest that it is the most appropriate method to estimate covariate-adjusted risk differences and NNEs. Binomial regression and ordinary logistic regression can also be used if the sample size is large, the exposure groups have similar covariate distributions, and one is interested only in the estimation of risk difference and number needed to be exposed, not in individual risk estimates.

This month we have published two studies based on data obtained from the Longitudinal Aging Study Amsterdam (LASA). In the first study, van Nispen et al tested the psychometric quality and feasibility of the vision-related quality of life core measure (VCM1) in a visually impaired, elderly, community-based sample. This study concluded that the psychometric quality of the VCM1 scale was satisfactory. In addition, the scale was feasible in the community setting; however, it was recommended that only one type of administration (i.e., self-report or interview) be used. In the second study, Peeters et al assessed the predictive validity of the fall risk profile in older persons seeking care after a fall. They found that the discriminative validity of this profile was moderate, while its predictive validity for identifying recurrent fallers was limited among older persons.

The Quebec Back Pain Disability Scale is one of the primary recommended questionnaires to assess functional status in patients with low back pain. Because its responsiveness and interpretability have been investigated in only a few studies, Demoulin et al have sought to assess them. This study proposes values for responsiveness and interpretability indicators for patients with chronic low back pain referred for multidisciplinary treatment.

0895-4356/$ - see front matter © 2010 Published by Elsevier Inc.
doi: 10.1016/j.jclinepi.2010.08.004
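The average risk difference idea mentioned above for covariate-adjusted NNE estimation can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the logistic model is taken as already fitted, and the coefficient values, the single covariate (age), and all function names are hypothetical. For each subject, the model-predicted risk is computed once as if exposed and once as if unexposed; the differences are averaged, and the NNE is the reciprocal of that average.

```python
import math

def predicted_risk(intercept, beta_exposure, beta_age, exposed, age):
    """Risk from a logistic model: 1 / (1 + exp(-linear predictor))."""
    lp = intercept + beta_exposure * exposed + beta_age * age
    return 1.0 / (1.0 + math.exp(-lp))

def average_risk_difference(coefs, ages):
    """Mean over subjects of (risk if exposed) minus (risk if unexposed)."""
    diffs = [
        predicted_risk(*coefs, exposed=1, age=a)
        - predicted_risk(*coefs, exposed=0, age=a)
        for a in ages
    ]
    return sum(diffs) / len(diffs)

def number_needed(risk_difference):
    """NNT or NNE as the reciprocal of the absolute risk difference."""
    if risk_difference == 0:
        return math.inf  # no difference in risk: the number needed is unbounded
    return 1.0 / abs(risk_difference)

# Hypothetical, already-fitted coefficients: (intercept, exposure, age per year).
coefs = (-4.0, 0.7, 0.04)
# Hypothetical covariate values for six subjects.
ages = [45, 52, 60, 67, 71, 78]

ard = average_risk_difference(coefs, ages)
nne = number_needed(ard)
print(f"ARD = {ard:.3f}, NNE = {nne:.1f}")

# Unadjusted case for comparison: risks of 0.30 versus 0.20.
print(f"NNT = {number_needed(0.30 - 0.20):.1f}")
```

Averaging the individual differences, rather than evaluating the model at the mean covariate values, is what makes this an average risk difference; confidence intervals, which any real analysis would need, are omitted here.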
The kappa statistic and its variants continue to evolve. In recent years, several 5-level emergency department (ED) triage systems have been developed. Because it is difficult to compare the reliability of triage systems with the kappa statistic, van der Wulp et al have proposed a method for comparing triage systems that accounts for the influence of the distribution of ratings on kappa. From previously conducted triage reliability studies, they conclude that calculating normal kappas results in substantial theoretical differences in inter-rater reliability of triage systems. They recommend that when comparing triage systems with different numbers of categories, one should report both the normal and quadratically weighted kappa.

Hewitt et al report on the impact of attrition in RCTs. A major theoretical weakness of many RCTs is the moderate to high attrition often experienced between randomization and follow-up, which results in a large amount of missing data. Although this study did not find sufficient evidence to support the often-made claim of selection bias, the authors remain cautious and recommend that trialists always consider the impact of attrition on baseline imbalances, and the impact of those imbalances on the outcomes reported, where possible.

There are two brief reports. PRECIS (A Pragmatic-Explanatory Continuum Indicator Summary) was designed to assist investigative teams in understanding the various design decisions that must be made regarding pragmatic versus explanatory trials. Riddle et al describe how the PRECIS instrument can be used to facilitate discussion, make revisions, and achieve consensus when designing a randomized trial. Finally, Hammink et al report their finding that adding pre-notification to follow-up in patient surveys had no additional effect on the response rate.

Peter Tugwell
J. Andre Knottnerus
Editors

Leanne Idzerda
Assistant Editor
E-mail address:
[email protected] (L. Idzerda).

References
[1] Sackett DL. Bias in analytic research. J Chronic Dis 1979;32:51–63.
[2] Heller R, Dobson A, Attia J, Page J. Impact numbers: measures of risk factor impact on the whole population from case-control and cohort studies. J Epidemiol Commun Health 2002;56:606–10.
[3] Bender R, Blettner M. Calculating the “number needed to be exposed” with adjustment for confounding variables in epidemiological studies. J Clin Epidemiol 2002;55:525–30.