Journal of Clinical Epidemiology 71 (2016) 1–2
EDITORIAL
Extrapolating results from adult trials to children

It should never be taken for granted that drugs or drug dosages that work in adults will also be appropriate and safe for use in children. Two Commentaries extend the discussion of the challenges of extrapolating results from adult trials to children, a topic raised by Janiaud et al. [1] in a recent issue; in the original article, the ratios of odds ratios (RORs) were not significantly different from 1 for 110 of 124 drugs, and for 36 of these drugs the treatment effect was confirmed in both populations. Oostenbroucke raises a number of issues, notably that these studies do not represent the common pediatric conditions. Janiaud responds by emphasising the importance of utilising and improving an extrapolation framework, both for drug development and for routine practice, by using all available evidence before undertaking any additional studies in children.

Reporting guidelines are a frequent focus of papers in JCE. There is some controversy as to whether these should follow or precede guidelines on the actual conduct of the studies. Consensus in the area of observational studies is all the more urgent as more large databases become available for conducting observational research. In this issue, Morton et al. provide a compelling argument for the need to establish global consensus among different stakeholders on appropriate methods for the conduct, reporting, and evaluation of observational studies.
They demonstrate the substantive differences among nine of the leading sets of guidelines on the conduct of observational studies: the Agency for Healthcare Research and Quality (AHRQ); the Comparative Effectiveness Research Collaborative; the European Network of Centres for Pharmacoepidemiology and Pharmacovigilance (ENCePP); the ENCePP Guide on Methodological Standards in Pharmacoepidemiology; the United States Food and Drug Administration (FDA); the Good ReseArch for Comparative Effectiveness (GRACE) Checklist; the GRACE Principles; ISPOR; and the Patient-Centered Outcomes Research Institute (PCORI). Interestingly, each of the nine standards/guidelines had an intended, self-identified audience: five targeted researchers, three decision makers, one stakeholders, and one industry. It is reassuring that there was uniformity on a number of items, although even for these their 'actionability', i.e., what is needed to meet the specific criteria, was absent. However, in a number of important areas, consensus was surprisingly often absent across guidelines (e.g., on how to handle missing data, the need for a systematic
review before conducting the research, or the adequacy of subgroup analyses to assess heterogeneity of treatment effects). The authors call for a consensus process to establish common guidelines.

Reporting guidelines are also the subject of a report from Chhapola et al., who examined compliance before [2003–2007] and after [2010–2014] the introduction of the first CONSORT guidelines for abstracts. Although their results showed a statistically significant improvement, the improvement was minimal, and extrapolation of the post-CONSORT slope showed that it would take approximately 50 years to achieve 100% reporting. Clearly, other implementation strategies are needed.

Single-case or case-only designs are important for investigating transient effects of accurately recorded preventive agents, for example, vaccines. They are the observational equivalents of randomized cross-over studies and of n-of-1 and other single-case experimental designs (SCEDs) that will be the focus of an upcoming JCE series. Their main strength is that they entail self-controlled analyses that eliminate confounding and selection bias by time-invariant characteristics not recorded in healthcare databases [2]. In this JCE issue, Pouwels et al. report on 53 studies that included both case-only and parallel-group designs and found moderate agreement overall in their effect sizes. Where there were clinically important differences, these were often due to inappropriate application of the case-only design.

In an article on the clinical sensibility of network meta-analysis, Linde et al. emphasize the importance of not assuming clinical 'transitivity' [sometimes also called similarity or exchangeability]; they call for a more systematic assessment of transitivity when implementing network meta-analyses, to check that it makes sense to combine the different types of patients in different settings.
They demonstrate this approach with their decision not to combine pharmacologic and behavioural interventions for depression into one network.

The measurement of mental health by patient report is attracting increasing attention, and a number of papers in this issue have this as their focus. The profusion of instruments based on classical test theory and, now, on item-response theory makes results increasingly difficult for users to interpret and to combine in meta-analyses. Liegl et al. applied a common-metrics approach based on item-response theory for measuring depression
http://dx.doi.org/10.1016/j.jclinepi.2016.01.027 0895-4356/© 2016 Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
to four German-speaking samples that completed the Patient Health Questionnaire (PHQ-9). They showed that it is possible to estimate latent depression scores by using the item parameters from a common metric instead of re-estimating and linking a model, and they provide a web application (http://www.common-metrics.org) that others can use.

Batterham et al. report on a new questionnaire, the Distress Questionnaire, for detecting seven common mental conditions. Psychological distress questionnaires have been widely used for screening but are not accurate for identifying specific common mental conditions; this instrument shows promise in doing just that.

Another article on mental health, by Meister et al., calls attention to the failure of papers reporting on 60 randomized trials in persistent depressive disorder in 2014 to conform with the CONSORT reporting guidelines for assessing harms, and calls for increased attention to this. They point out that psychotherapy trials rarely report harms, although there are strong reasons that they should.

Applied statistical analysis can take us forward in addressing important clinical epidemiological issues. Age adjustment is an important example, but even this has its pitfalls. Age is often used as a surrogate in risk analyses to assess potential effect modification by unmeasured factors correlated with age. Using the example of age at onset of radiation-related early menopause [due to presumed early ovarian function failure] in a cohort of female Japanese atomic bomb survivors, Izumi et al. demonstrate problems with two commonly used models for age-adjusting the hazard rate: the excess absolute rate (EAR) and excess relative risk (ERR) models. They show that these can produce strange results and call for such calculations to be scrutinized when these approaches are used.
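In generic notation (a sketch of the standard forms of these two models, not necessarily the authors' exact parameterisation), the distinction is whether the dose-related excess is added to, or multiplies, the baseline hazard:

```latex
% Generic EAR and ERR model forms (illustrative; notation assumed here):
% \lambda_0(a) is the baseline hazard at attained age a, d is radiation
% dose, and \beta scales the dose-related excess.
\begin{align*}
  \text{EAR model:}\quad \lambda(a, d) &= \lambda_0(a) + \beta\, d
    && \text{(excess added on the absolute scale)}\\
  \text{ERR model:}\quad \lambda(a, d) &= \lambda_0(a)\,\bigl[1 + \beta\, d\bigr]
    && \text{(excess multiplies the baseline hazard)}
\end{align*}
```

Because the baseline hazard varies strongly with age, the two forms imply very different age patterns for the excess, which is one reason age-adjusted estimates from them warrant the scrutiny the authors recommend.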
Using multinomial logistic regression modeling in diagnostic research, nomograms can be developed that offer a format much preferred by most clinicians over complex formulae requiring calculation. Bertens et al. provide a nice example for diagnosing patients with dyspnoea caused by heart failure, chronic obstructive lung disease, or both. A major strength is the clarity with which the large differences in the relative importance of the individual
predictors are incorporated into the nomogram; for example, cough provides a much smaller contribution than the FEV1 results.

Sergio et al. argue that we should stop restricting analyses to single risk factors when investigating risk factors for a clinical condition. Field-wide meta-analyses using multivariable regression, i.e., meta-analyses of observational data assessing the entire field of putative risk factors, can map the selective availability of risk factors as well as the patterns of modeling, adjustment, and reporting of risk factors across studies. They show, in 60 studies of pterygium, how preferential multivariable analyses produce different results.

van Walraven and Colman describe some new techniques for capitalizing on data-mining approaches to increase the accuracy of identifying patients with a condition of interest. Using the condition of migraine, they show how accuracy can be improved by (a) the use of diagnostic code scores to identify significant, and sometimes unpredictable, associations between migraine status and a vast array of diagnostic codes and (b) the explicit interrogation of how applying 'double thresholds' in the new migraine model influenced its operating characteristics for identifying migraineurs. A second example is provided by Bagherzadeh-Khiabania et al., who use a diabetes database to show the benefits of data mining with variable selection methods for clinical prediction models. They argue for the development of user-friendly statistical packages to bring these methods into wider use in diagnosis and prognosis.

Peter Tugwell
J. Andre Knottnerus
E-mail address:
[email protected] (P. Tugwell)

References

[1] Janiaud P, Lajoinie A, Cour-Andlauer F, Cornu C, Cochat P, Cucherat M, et al. Different treatment benefits were estimated by clinical trials performed in adults compared with those performed in children. J Clin Epidemiol 2015;68:1221–31.
[2] Maclure M, Fireman B, Nelson JC, Hua W, Shoaibi A, Paredes A, et al. When should case-only designs be used for safety monitoring of medical products? Pharmacoepidemiol Drug Saf 2012;21(Suppl 1):50–61.