Big data in nephrology: promises and pitfalls


nephrology digest

epidemiology and statistics

Girish N. Nadkarni1, Steven G. Coca1 and Christina M. Wyatt1

Data from electronic health records hold great promise for nephrology research. However, because these data have significant limitations, reporting guidelines have been formulated for analyses conducted using electronic health record data.

Refers to: Benchimol EI, Smeeth L, Guttmann A, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med. 2015;12:e1001885.

Kidney International (2016) 90, 240–241; http://dx.doi.org/10.1016/j.kint.2016.06.003

Copyright © 2016, International Society of Nephrology. Published by Elsevier Inc. All rights reserved.

1 Division of Nephrology, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, New York, USA. Correspondence: G.N. Nadkarni, Icahn School of Medicine at Mount Sinai, One Gustave Levy Place, Box 1243, New York, New York 10128, USA. E-mail: girish.nadkarni@mssm.edu

The term Big Data has entered the popular lexicon. Although definitions vary, Big Data can be defined as data so large and complex that they become difficult to manage and analyze using standard software and methods because of their unaligned, granular, and temporal nature.1 With an exponential increase in the data generated on a daily basis due to mobile technology and social media, business sectors including the financial, airline, and retail industries utilize Big Data to analyze consumer behavior patterns to improve their services and market share. In epidemiologic research, the term Big Data often refers to the vast amounts of clinical and laboratory data collected in the electronic health record (EHR).

One could argue that nephrology lends itself to the use of Big Data. We collect large amounts of vital sign, laboratory, and other data on patients with end-stage renal disease. Kidney function is assessed through changes in the estimated glomerular filtration rate, which is one of the most commonly ordered laboratory tests.2 Thus, a large amount of granular (finely detailed) and temporal data on chronic kidney disease prevalence, incidence, and progression is available. Perhaps most importantly, key unanswered questions in nephrology, in particular comparisons between dialysis modalities and between dialysis and transplantation, cannot be addressed in randomized clinical trials.

Much of the epidemiologic research in nephrology is currently conducted using large national databases, such as the US Renal Data System. Although this research has led to important insights into the epidemiology of acute and chronic kidney disease, these databases are limited by the lack of patient-level and granular data. In the EHR, a large amount of granular data is collected and can be used for epidemiologic research. Recent publications using EHR data have analyzed differences in outcomes by hypertension types and identified previously unknown associations between commonly used medications and chronic kidney disease risk.3,4

Recognizing both the potential and the limitations of using Big Data from the EHR for epidemiologic research, reporting guidelines have evolved to include the reporting of analyses conducted using EHR data.5 The new guidelines expand on existing guidelines for the reporting of observational studies, focusing on issues that are unique to the EHR and other large datasets that were not created for research purposes. The RECORD (REporting of studies Conducted using Observational Routinely-collected health Data) guidelines recommend including a complete list of the codes or algorithms used to identify eligible subjects and to define exposures and outcomes of interest, as well as reporting the methods used to link datasets, if relevant. In addition, the RECORD guidelines provide an outline of limitations that should be considered in studies using routinely available data, including the increased potential for misclassification bias, unmeasured confounding, and missing data. With appropriate recognition and acknowledgment of these limitations, EHR data provide several unique opportunities in nephrology research.
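To make the RECORD recommendation on code lists concrete, the following minimal Python sketch keeps the code list as explicit data, so that the same object both selects subjects and can be printed verbatim in a methods section. Everything in it is an assumption for illustration: the ICD-10-CM prefix, the record layout, and the function names are hypothetical and are not taken from the RECORD statement or from any study cited here.

```python
# Illustrative code list only; not drawn from the RECORD statement or any cited study.
CKD_CODE_LIST = {
    "coding_system": "ICD-10-CM",
    "prefixes": ["N18"],  # assumed for illustration: chronic kidney disease codes
    "description": "Chronic kidney disease",
}


def matches_code_list(code, code_list):
    """True when a diagnosis code starts with any prefix in the code list."""
    normalized = code.strip().upper()
    return any(normalized.startswith(prefix) for prefix in code_list["prefixes"])


def identify_subjects(diagnoses, code_list):
    """Return sorted patient IDs with at least one qualifying diagnosis code.

    `diagnoses` is an iterable of (patient_id, code) pairs, a hypothetical layout.
    """
    return sorted({pid for pid, code in diagnoses if matches_code_list(code, code_list)})


def describe_code_list(code_list):
    """One-line, reportable description of the selection rule."""
    prefixes = ", ".join(code_list["prefixes"])
    return f"{code_list['description']}: {code_list['coding_system']} codes beginning with {prefixes}"


# Hypothetical diagnosis records.
diagnoses = [("p1", "N18.3"), ("p2", "I10"), ("p3", "n18.6"), ("p1", "E11.9")]

eligible = identify_subjects(diagnoses, CKD_CODE_LIST)
print(eligible)
print(describe_code_list(CKD_CODE_LIST))
```

Keeping the rule in one data structure means the published code list cannot drift from the one actually applied. A real study would also need validated algorithms and the linkage details that RECORD asks authors to report.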


PREDICTIVE ANALYTICS

Because EHR data are collected longitudinally, it is possible to study the natural history of a disease process, as well as the response to treatment, in a “real-world” scenario. Patient-level data can be utilized, with the help of appropriate techniques, for predictive analytics. For example, researchers have generated equations to predict the risk of kidney failure by combining large datasets (including EHR data).6 By stratifying large groups of patients according to their risk of progression, these tools are useful for population health management, especially in this era of managed care and accountable care organizations. However, their use in routine clinical practice for individual patients will be valuable only if they provide meaningful improvements in clinical prediction. In a prospective cohort study that also used EHR data, the strongest predictor of 6-month mortality in patients with end-stage renal disease was the nephrologist's own judgment that he or she would not be surprised if the patient died within 6 months.7 Thus, Big Data predictive analytics, although useful on a macro level, do not replace clinical acumen in treating individual patients.

PRAGMATIC TRIAL DESIGN

Nephrology lags behind most medical subspecialties with respect to randomized clinical trials. In addition, patients with chronic kidney disease are frequently excluded from clinical trials in cardiology and other relevant fields, leading to poor-quality evidence in this patient population8 (see also our recent Kidney International Nephrology Digest article9). One reason for the low number of randomized clinical trials in nephrology is the low event rate and inadequate surrogate outcomes for end-stage renal disease, requiring prohibitively large sample sizes and long follow-up time for accrual.

Big Data from the EHR could be used to improve efficiency in randomized clinical trials in several ways. First, if laboratory information can be extracted and the change in estimated glomerular filtration rate per unit of time (slope) calculated before enrollment, investigators can limit enrollment to those who are more likely to progress, ultimately leading to shorter, more efficient clinical trials. Second, provider notes within the EHR can be extremely valuable. Currently, the process for determining whether a prospective trial participant meets stringent inclusion and exclusion criteria is labor-intensive and involves manual review of the EHR by trial investigators. The development of automated approaches, such as natural language processing, that extract specific medical concepts from textual medical documents presents an alternative. Utilizing this technology, the time and effort needed for screening prospective participants could be significantly reduced. Finally, with the development of biorepositories linked to EHR data,10 prognostic biomarkers and previous clinical data could be combined to enroll only the highest risk participants in clinical trials.

In summary, with appropriate consideration of the limitations, Big Data holds significant promise. Time will tell whether these promises are fulfilled, and we as nephrologists have the opportunity to make smarter, data-driven clinical decisions to improve outcomes in our vulnerable patient population.

DISCLOSURE

All the authors declared no competing interests.

REFERENCES

1. The Economist. Data, data everywhere. Available at: http://www.economist.com/node/15557443. Accessed April 16, 2016.
2. Zhi M, Ding EL, Theisen-Toupal J, et al. The landscape of inappropriate laboratory testing: a 15-year meta-analysis. PLoS One. 2013;8:e78962.
3. Sim JJ, Bhandari SK, Shi J, et al. Comparative risk of renal, cardiovascular, and mortality outcomes in controlled, uncontrolled resistant, and nonresistant hypertension. Kidney Int. 2015;88:622–632.
4. Lazarus B, Chen Y, Wilson FP, et al. Proton pump inhibitor use and the risk of chronic kidney disease. JAMA Intern Med. 2016;176:238–246.
5. Benchimol EI, Smeeth L, Guttmann A, et al. The REporting of studies Conducted using Observational Routinely-collected health Data (RECORD) statement. PLoS Med. 2015;12:e1001885.
6. Tangri N, Grams ME, Levey AS, et al. Multinational assessment of accuracy of equations for predicting risk of kidney failure: a meta-analysis. JAMA. 2016;315:164–174.
7. Cohen LM, Ruthazer R, Moss AH, et al. Predicting six-month mortality for patients who are on maintenance hemodialysis. Clin J Am Soc Nephrol. 2010;5:72–79.
8. Konstantinidis I, Nadkarni GN, Yacoub R, et al. Representation of patients with kidney disease in trials of cardiovascular interventions: an updated systematic review. JAMA Intern Med. 2016;176:121–124.
9. Wyatt CM, Shineski M, Chertow GM, Bangalore S. ISCHEMIA in chronic kidney disease: improving the representation of patients with chronic kidney disease in cardiovascular trials. Kidney Int. 2016;89:1178–1179.
10. Gottesman O, Kuivaniemi H, Tromp G, et al. The Electronic Medical Records and Genomics (eMERGE) Network: past, present, and future. Genet Med. 2013;15:761–771.
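As a supplementary illustration of the first efficiency gain described under PRAGMATIC TRIAL DESIGN, the Python sketch below estimates each patient's pre-enrollment eGFR slope by ordinary least squares and flags those declining faster than a cutoff. It is a minimal sketch under stated assumptions: the data layout, the function names, and the cutoff of 3 mL/min/1.73 m² per year are hypothetical choices made for this example, not values proposed in the article. Ordinary least squares is used here only as the simplest way to compute a slope.

```python
from datetime import date


def egfr_slope_per_year(measurements):
    """Least-squares slope of eGFR (mL/min/1.73 m^2) against time in years.

    `measurements` is a list of (date, egfr) pairs for one patient.
    Returns None when a slope cannot be estimated (fewer than two
    measurements, or all measurements taken on the same day).
    """
    if len(measurements) < 2:
        return None
    first_day = min(day for day, _ in measurements)
    # Time since the first measurement, expressed in 365.25-day years.
    xs = [(day - first_day).days / 365.25 for day, _ in measurements]
    ys = [float(value) for _, value in measurements]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical screening cutoff: loss of at least 3 mL/min/1.73 m^2 per year.
DECLINE_THRESHOLD = -3.0


def likely_progressors(cohort, threshold=DECLINE_THRESHOLD):
    """Return IDs of patients whose estimated slope is at or below the cutoff."""
    selected = []
    for patient_id, measurements in cohort.items():
        slope = egfr_slope_per_year(measurements)
        if slope is not None and slope <= threshold:
            selected.append(patient_id)
    return selected


# Hypothetical laboratory values extracted from an EHR.
cohort = {
    "patient_a": [(date(2014, 1, 1), 60.0), (date(2015, 1, 1), 55.0), (date(2016, 1, 1), 50.0)],
    "patient_b": [(date(2014, 1, 1), 60.0), (date(2015, 1, 1), 59.5), (date(2016, 1, 1), 59.0)],
    "patient_c": [(date(2015, 6, 1), 45.0)],
}

print(likely_progressors(cohort))
```

In this toy cohort only `patient_a`, who loses about 5 mL/min/1.73 m² per year, meets the assumed cutoff. `patient_c` has a single measurement and is skipped rather than guessed at. A real screening tool would also need to handle acute kidney injury episodes, measurement noise, and irregular testing intervals, which a plain regression line does not address.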
