Blood-Derived DNA Methylation Signatures of Crohn's Disease and Severity of Intestinal Inflammation

Blood-Derived DNA Methylation Signatures of Crohn's Disease and Severity of Intestinal Inflammation

Gastroenterology 2019;-:1–12 Hari K. Somineni,1,2 Suresh Venkateswaran,2 Varun Kilaru,3 Urko M. Marigorta,4 Angela Mo,4 David T. Okou,2 Richard Kelle...

2MB Sizes 0 Downloads 11 Views

Gastroenterology 2019;-:1–12

Hari K. Somineni,1,2 Suresh Venkateswaran,2 Varun Kilaru,3 Urko M. Marigorta,4 Angela Mo,4 David T. Okou,2 Richard Kellermayer,5 Kajari Mondal,2 Dawayland Cobb,3 Thomas D. Walters,6 Anne Griffiths,6 Joshua D. Noe,7 Wallace V. Crandall,8 Joel R. Rosh,9 David R. Mack,10 Melvin B. Heyman,11 Susan S. Baker,12 Michael C. Stephens,13 Robert N. Baldassano,14 James F. Markowitz,15 Marla C. Dubinsky,16 Judy Cho,16 Jeffrey S. Hyams,17 Lee A. Denson,18 Greg Gibson,4 David J. Cutler,19 Karen N. Conneely,1,19,§ Alicia K. Smith,1,3,20,§ and Subra Kugathasan1,2,19,§

Control plasma

Crohn’s disease onset Crohn’s disease course

C-reactive protein (inflammatory marker)

WBC

Blood

DNA methylation

Inflammation

1 Genetics and Molecular Biology Program, Emory University, Atlanta, Georgia; 2Division of Pediatric Gastroenterology, Department of Pediatrics, Emory University School of Medicine and Children’s Healthcare of Atlanta, Atlanta, Georgia; 3 Department of Gynecology and Obstetrics, Emory University School of Medicine, Atlanta, Georgia; 4Center for Integrative Genomics, Georgia Institute of Technology, Atlanta, Georgia; 5Section of Pediatric Gastroenterology, Baylor College of Medicine, Texas Children’s Hospital, Houston, Texas; 6Division of Pediatric Gastroenterology, Hepatology and Nutrition, Department of Pediatrics, The Hospital for Sick Children, University of Toronto, Toronto, Ontario, Canada; 7Department of Pediatric Gastroenterology, Hepatology and Nutrition, Medical College of Wisconsin, Milwaukee, Wisconsin; 8Division of Pediatric Gastroenterology, Nationwide Children’s Hospital, Ohio State University College of Medicine, Columbus, Ohio; 9 Department of Pediatrics, Goryeb Children’s Hospital, Morristown, New Jersey; 10Department of Pediatrics, Children’s Hospital of Eastern Ontario IBD Centre and University of Ottawa, Ottawa, Ontario, Canada; 11Department of Pediatrics, University of California, San Francisco, San Francisco, California; 12Department of Digestive Diseases and Nutrition Center, University at Buffalo, Buffalo, New York; 13Department of Pediatric Gastroenterology, Mayo Clinic, Rochester, Minnesota; 14 Department of Pediatrics, University of Pennsylvania, Philadelphia, Pennsylvania; 15Department of Pediatrics, Northwell Health, New York, New York; 16Department of Pediatrics, Mount Sinai Hospital, New York, New York; 17Division of Digestive Diseases, Hepatology, and Nutrition, Connecticut Children’s Medical Center, Hartford, Connecticut; 18Division of Pediatric Gastroenterology, Hepatology, and Nutrition, Cincinnati Children’s Hospital Medical Center, Cincinnati, Ohio; 19Department of Human Genetics, Emory University, Atlanta, Georgia; and 20Department of Psychiatry and Behavioral Sciences, Emory University, Atlanta, Georgia

(active disease)

Effect of DNA methylation on Crohn’s disease

BACKGROUND & AIMS: Crohn disease is a relapsing and remitting inflammatory disorder with a variable clinical course. Although most patients present with an inflammatory phenotype (B1), approximately 20% of patients rapidly progress to complicated disease, which includes stricture (B2), within 5 years. We analyzed DNA methylation patterns in blood samples of pediatric patients with Crohn disease at diagnosis and later

Control

Disease onset DNA methylation

Q5

Post treatment (inactive disease)

time points to identify changes that associate with and might contribute to disease development and progression. METHODS: We obtained blood samples from 164 pediatric patients (1–17 years old) with Crohn disease (B1 or B2) who participated in a North American study and were followed for 5 years. Participants without intestinal inflammation or symptoms served as controls (n ¼ 74). DNA methylation patterns

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120

BASIC AND TRANSLATIONAL AT

Blood-Derived DNA Methylation Signatures of Crohn Disease and Severity of Intestinal Inflammation

Effect of DNA methylation on inflammation

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60

2

BASIC AND TRANSLATIONAL AT

121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180

Somineni et al

Gastroenterology Vol.

were analyzed in samples collected at time of diagnosis and 1–3 years later at approximately 850,000 sites. We used genetic association and the concept of Mendelian randomization to identify changes in DNA methylation patterns that might contribute to the development of or result from Crohn disease. RESULTS: We identified 1189 50 -cytosine–phosphate–guanosine-30 (CpG) sites that were differentially methylated between patients with Crohn disease (at diagnosis) and controls. Methylation changes at these sites correlated with plasma levels of C-reactive protein. A comparison of methylation profiles of DNA collected at diagnosis of Crohn disease vs during the follow-up period showed that, during treatment, alterations identified in methylation profiles at the time of diagnosis of Crohn disease more closely resembled patterns observed in controls, irrespective of disease progression to B2. We identified methylation changes at 3 CpG sites that might contribute to the development of Crohn disease. Most CpG methylation changes associated with Crohn disease disappeared with treatment of inflammation and might be a result of Crohn disease. CONCLUSIONS: Methylation patterns observed in blood samples from patients with Crohn disease accompany acute inflammation; with treatment, these change to resemble methylation patterns observed in patients without intestinal inflammation. These findings indicate that Crohn disease– associated patterns of DNA methylation observed in blood samples are a result of the inflammatory features of the disease and are less likely to contribute to disease development or progression. Keywords: Inflammatory Bowel Disease; Children; Epigenetic Alteration; Risk Stratification and Identification of Immunogenetic and Microbial Markers of Rapid Disease Progression in Children with Crohn Disease (RISK) Study.

I

nflammatory bowel diseases (IBDs) encompassing Crohn disease and ulcerative colitis arise in the context of complex interactions between genetic and environmental factors. Although these diseases can manifest at any age, pediatric-onset Crohn disease has a higher incidence than ulcerative colitis,1,2 and patients diagnosed with Crohn disease in childhood are more likely to develop an aggressive and severe disease course.2 DNA methylation, occurring predominantly in the 50 (CpG) dinucleotide cytosine–phosphate–guanosine-30 context, is a key epigenetic mechanism that can regulate gene expression and thereby influence the development and progression of complex diseases. Cross-sectional studies of DNA methylation have begun to expose epigenetic associations with IBD in pediatric and adult populations across a range of cell and tissue types.3–11 For instance, site-specific DNA methylation differences in peripheral blood3 and blood-derived mononuclear cells5 of adult patients with IBD have been reported. Similarly, studies of mixed or purified cells from blood and intestinal mucosa of pediatric populations have shown distinct methylation profiles relevant to IBD.8,9 Howell et al12 recently reported a gut segmentspecific methylation signature in pediatric patients with IBD in purified intestinal epithelial cells and its persistence during the course of the disease. However, because of the

-,

No.

-

WHAT YOU NEED TO KNOW BACKGROUND AND CONTEXT Placeholder Placeholder Placeholder Placeholder

text  Placeholder text  Placeholder text  text  Placeholder text  Placeholder text  text  Placeholder text  Placeholder text  text 

NEW FINDINGS Placeholder Placeholder Placeholder Placeholder

text  Placeholder text  Placeholder text  text  Placeholder text  Placeholder text  text  Placeholder text  Placeholder text  text 

LIMITATIONS Placeholder Placeholder Placeholder Placeholder

text  Placeholder text  Placeholder text  text  Placeholder text  Placeholder text  text  Placeholder text  Placeholder text  text 

IMPACT Placeholder Placeholder Placeholder Placeholder

text  Placeholder text  Placeholder text  text  Placeholder text  Placeholder text  text  Placeholder text  Placeholder text  text 

relapsing–remitting behavior of Crohn disease and the dynamic nature of DNA methylation and its resulting vulnerability to confounding and reverse causation, delineating the causal role of methylation in Crohn disease requires longitudinal studies and the application of integrative analytical approaches. Understanding how the methylome changes during the course of the disease, as a result of varying clinical characteristics, and how disease complications evolve can aid in the identification of potentially causal epigenetic targets, which could subsequently be leveraged for therapeutic benefits. We performed an epigenome-wide association analysis of DNA methylation in peripheral blood at w850,000 sites and Crohn disease at diagnosis and at later stages (1–3 years after diagnosis) during which time w33% of patients progressed from an initial stage of B1 inflammatory behavior to B2 stricture behavior. Study participants (Table 1) were sampled from the Risk Stratification and Identification of Immunogenetic and Microbial Markers of Rapid Disease Progression in Children with Crohn’s Disease (RISK) cohort,13 a pediatric prospective inception Crohn disease cohort. Because current Crohn disease therapeutics

§

Authors share last co-authorship.

Abbreviations used in this paper: CpG, 50 -cytosine–phosphate–guanosine30 ; CRP, C-reactive protein; FDR, false discovery rate; GWAS, genomewide association study; IBD, inflammatory bowel disease; KEGG, Kyoto Encyclopedia of Genes and Genomes; mQTL, cis-methylation quantitative trait loci; OR, odds ratio; PCDAI, Pediatric Crohn’s Disease Activity Index; RISK, Risk Stratification and Identification of Immunogenetic and Microbial Markers of Rapid Disease Progression in Children with Crohn’s Disease; SNP, single-nucleotide polymorphism. © 2019 by the AGA Institute 0016-5085/$36.00 https://doi.org/10.1053/j.gastro.2019.01.270

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240

241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300

2019

Blood DNA Methylation Signatures of Crohn Disease

3

Q1

Table 1.Summary of Patient Characteristics P values

Samples, n Age (y), median (IQR) Girls, n (%) Disease state: B1/B2a Disease location: L1/L2/L3/missinga Samples with CRP data, n CRP (mg/L), median (IQR) Samples with CRP <1/>1, n Samples with PCDAI data, n PCDAI, median (IQR) Samples with PCDAI 10/>10, n Treatment-naïve subjects, n Biologics Immunomodulators Biologics þ immunomodulators Othersa Medication data missing

Controls without IBD

Crohn disease at diagnosis

Crohn disease at follow-up

74 12.6 (9.8–14.3) 34 (46) 0/0

164 12.6 (10.6–15.0) 68 (41) 150/14 39/39/69/17 132 4.4 (1.5–14.5) 25/107 159 30.0 (20.0–42.5) 19/140 164 NA NA NA NA NA

164 15.2 (13.2–17.3) 68 (41) 95/69 31/21/95/17 95 1.0 (0.5–3.0) 52/43 149 5.0 (0.0–15.0) 99/50 0 76 26 43 11 8

45 0.5 (0.1–3.4) 30/15 NA NA NA 74 NA NA NA NA NA

Crohn disease vs non-IBD

Crohn disease at diagnosis vs follow-up

.237 .523

2.5  1014

.037

.0001 2.2  1016

IQR, interquartile range; NA, ??. a In the Montreal classification of Crohn disease behavior, B1 corresponds to inflammatory behavior with no stricture or luminal penetrating complications and B2 to stricture behavior with no luminal penetrating complications. Similarly, for disease location, L1 corresponds to disease in the ileum, L2 corresponds to disease in the colon, and L3 corresponds to disease in the ileal colon. Others include patients who received 5-aminosalicylic acid, steroids, and/or antibiotics.

systematically targets the peripheral immune system and considerable genetic and cell biological evidence, including previous epigenetic studies, have implicated the immune system in the etiology of Crohn disease,3,14 we investigated methylation changes in peripheral blood with respect to their potential causal vs consequential roles in disease.

Methods Study Population We examined a subset of pediatric subjects recruited under the RISK study.13 The RISK inception cohort study thus far is the largest pediatric Crohn disease cohort recruited at 28 sites in the United States and Canada to identify genetic, clinical, microbial, and immunologic factors that predispose patients with Crohn disease (B1) to a complicated disease course (B2 or B3). Briefly, the RISK study recruited children 1–17 years old who presented to gastroenterology clinics with suspected IBD and followed them for 5 years at regular intervals to determine the incidence of IBDs or complications of an established disorder. The RISK study design, recruitment details, inclusion– exclusion criteria, disease behaviors, and data collection have been described in detail elsewhere.13

Study Design The initial recruitment and follow-up have been previously described.13 A subset of age-, sex-, and ethnicity-matched control subjects without IBD and patients with Crohn disease with B1 inflammatory behavior and B2 stricture behavior were drawn from the RISK cohort13 based on the availability

of patient samples at diagnosis and follow-up 1–3 years after diagnosis (Table 1). Subjects who were negative for gut inflammation and exhibited no bowel pathology at endoscopy and remained free of IBD symptoms during the course of the follow-up period served as non-IBD controls. Peripheral blood DNA samples from 164 treatment-naïve pediatric patients with newly diagnosed with Crohn disease (cases) and 74 nonIBD controls (controls) were considered for baseline analysis to identify Crohn disease–associated CpG sites. Of these, 150 cases presented purely with an inflammatory phenotype (noncomplicated Crohn disease; B1) and the remaining 14 presented with the stricture phenotype (B2) at diagnosis. However, sensitivity analysis comparing 150 B1 cases or 14 B2 cases with 74 non-IBD controls vs 164 cases (150 B1, 14 B2) with 74 non-IBD controls showed that our findings were robust to disease behavior states (B1 or B2), allowing the grouping of all cases (B1 and B2) at diagnosis into a single large cohort. Longitudinal analysis relied on follow-up samples taken from established cases (n ¼ 164) as part of a longitudinal follow-up in the RISK study, which was 1–3 years from diagnosis. Exact details are presented in Table 1. Of the 150 B1 cases at diagnosis, 55 progressed to B2 (progressors) during the course of the follow-up period, and the rest (n ¼ 95) remained at B1 at the time of the follow-up sampling (nonprogressors). To increase statistical power to define methylome changes (if any) involved in disease progression, we purposefully inflated the number of progressors by selecting more pediatric cases who developed a the B2 complication during the course of their prospective follow-up in the original RISK study.13 With the 14 patients with a diagnosis of B2 who remained at B2 at the time of follow-up sampling, we had 95 B1

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

Q3

301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360

BASIC AND TRANSLATIONAL AT

-

4

BASIC AND TRANSLATIONAL AT

361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420

Somineni et al

Gastroenterology Vol.

-,

No.

-

and 69 B2 cases at follow-up. More details about phenotype classification are available in the publication by Kugathasan et al.13

frequency >0.05. Unless stated otherwise, the first 3 genotypebased principal components were used to control for population stratification in all analyses (Supplementary Figure 1).

Quantification of Genome-wide DNA Methylation and Data Processing

Genetic Risk Scores

Peripheral blood genomic DNA was extracted using the AllPrep DNA/RNA Mini Kit (Qiagen, Valencia, CA). Extracted DNA 500 ng from each sample was subjected to bisulfite treatment using the EZ DNA Methylation-Gold Kit (Zymo Research, Irvine, CA). Genome-wide DNA methylation was quantified in bisulfite-converted genomic DNA at single-base resolution using the MethylationEPIC BeadChip (Illumina, San Diego, CA). The initial quality control for the dataset was performed with the R package CpGassoc.15 CpG sites called with low signal or low confidence (detection P > .05) or with data missing for more than 10% of samples were removed, and samples with data missing or called with low confidence for more than 10% of CpG sites were removed. In addition, probes mapping to multiple locations were removed.16 After these steps, 807,511 probes and 402 samples (74 non-IBD controls and 2 samples from each of the 164 cases) remained. The b values were calculated for each CpG site as the ratio of methylated (M) to methylated and unmethylated (U) signals: b ¼ M/(M þ U). Signal intensities were normalized using the beta-mixture quantile dilation module17 to account for the probe design bias in the EPIC array data. These normalized signal intensities were used to perform principal component analysis to further identify sample outliers (Supplementary Figure 1). Differential cell counts of the constituent cell types, CD4þ T cells, CD8þ T cells, natural killer cells, B cells, monocytes, and granulocytes, were estimated for each individual from the methylation data using the algorithm of Houseman et al.18

Genotyping and Data Processing Peripheral blood DNA samples from the 238 subjects with methylation data were genotyped using the Infinium MultiEthnic Global-8 Kit (Illumina) and genotypes were called using GenomeStudio software (Illumina). All these subjects had call rates >95% and inferred sex consistent with the clinical records. We tested for relatedness among subjects by calculating pairwise identity by descent based on 59,889 linkage disequilibrium–independent single-nucleotide polymorphisms (SNPs; r2 < 0.1), which confirmed no relatedness among subjects. The Multi-Ethnic array contained 1,762,905 variants before quality control. Removal of (1) SNPs with low call rate (<95%), (2) SNPs not in Hardy-Weinberg equilibrium (P < 1.0  103), and (3) SNPs with minor allele frequency <5% resulted in the retention of 1,751,369, 1,736,281, and 651,370 SNPs, respectively. We further removed non-autosomal SNPs and SNPs mapping to multiple locations. This resulted in a dataset consisting of 636,006 high-quality SNPs. All quality control procedures were performed in PLINK.19

Genotype-Based Principal Components Principal components were computed based on a pruned version of the dataset consisting of 59,889 linkage disequilibrium–independent SNPs (r2 < 0.1) and minor allele

We used the score function available in PLINK to compute weighted genetic risk scores. These scores were calculated based on the observed genotypes at 93 genotyped Crohn disease risk SNPs and their corresponding effect sizes reported for a Caucasian population in the study by Liu et al.20

Statistical and Bioinformatic Analyses A detailed description of the statistical and bioinformatic methods used in this study can be found in the Supplementary Methods. Briefly, case–control DNA methylation association analysis was performed using the R package CATE,21 which implements a state-of-the-art batch correction method to remove inflation and test-statistic bias in association tests. Linear regression was used to analyze the association of each CpG site in blood with Crohn disease at diagnosis. Linear mixed effects models were used to analyze longitudinal changes in DNA methylation of patients with Crohn disease and to identify methylation changes associated with C-reactive protein (CRP) and the Pediatric Crohn’s Disease Activity Index (PCDAI). Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis was performed using missMethyl,22 an R/Bioconductor package. Cis-methylation quantitative trait loci (mQTL) analysis was performed using a linear mixed model implemented in GEMMA.23 Genetic association and the concept of Mendelian randomization were used to clarify the causal vs consequential roles of methylation changes associated with Crohn disease. To ascertain whether peripheral blood methylation could distinguish patients with Crohn disease from nonIBD controls, we divided the baseline methylation dataset consisting of 238 subjects (164 cases and 74 controls) at random into equally weighted (cases and controls) training and testing datasets with 70% of samples going into the training dataset. The training dataset was fit with a logistic regression model using the R package glmnet,24 and the fitted model was used to predict the case status for the test dataset. Diagnostic accuracy was assessed by the area under the receiver operator characteristic curve.

Results Differentially Methylated CpG Sites Associated With Crohn Disease at Diagnosis Epigenome-wide association analysis of 164 treatmentnaïve pediatric patients newly diagnosed with Crohn disease (150 B1, 14 B2; cases) and 74 non-IBD controls identified 1189 CpG sites associated with Crohn disease in blood at diagnosis (false discovery rate [FDR] < 0.05; Figure 1 and Supplementary Table 1). Of these, 976 CpG sites (82%) had increased methylation in cases vs controls and 213 (18%) had decreased methylation. Because disease-associated inflammation can influence expression within a cell population and create differences in total cell composition (Supplementary Figure 2), our analysis included covariates to adjust for estimated proportions18 of

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480

Blood DNA Methylation Signatures of Crohn Disease

5

Figure 1. Crohn disease at diagnosis is associated with methylation changes at 1189 CpG sites in blood. The w850,000 CpG sites (dots) are ordered by genomic position per chromosome (x axis). P values (log10) of sitespecific association with Crohn disease are shown on the y axis. Dots above the blue line represent CpG sites reaching epigenome-wide significance (FDR < 0.05). Dots above the red line represent CpG sites reaching epigenome-wide significance after Bonferroni correction (n ¼ 114 CpG sites).

print & web 4C=FPO

481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540

2019

the 6 dominant cell types (CD4þ T cells, CD8þ T cells, B cells, natural killer cells, monocytes, and granulocytes) in blood (see Methods; Supplementary Figure 3). Sensitivity analyses demonstrated that our findings were robust to disease behavior states (B1 or B2; Supplementary Figure 4 and Supplementary Table 1) and manifestation in the bowel (L1, L2, or L3; Supplementary Figure 5), suggesting that baseline methylome contributions to Crohn disease do not vary strongly by disease behavior or location. The strongest association signals were found on chromosomes 16, 17, and 19 (Figure 1), with CpG sites in a long non-coding RNA, LOC100996291 (LINCO1993), showing the peak association with Crohn disease at diagnosis (Supplementary Table 1). Apart from identifying novel CpG sites, we replicated several findings previously associated with Crohn disease (including TMEM49 [VMP1], SBNO2, RPS6KA2, ITGB2, and TXK3; Supplementary Table 2). CpG sites annotated to prominent IBD therapeutic targets such as tumor necrosis factor (TNF), Janus kinase 3 (JAK3), interleukin 12B (IL12B), interleukin 23 subunit a (IL23A), and interleukin 1 receptor type 1 (IL1R1) were among the disease-associated CpG sites (Supplementary Table 1). Notably, enrichment analysis of our Crohn disease–associated CpG sites indicated they were more likely to occur in gene bodies (odds ratio [OR] ¼ 1.67, P < 2.2  1016) and CpG shelves (OR ¼ 1.42, P ¼ 5.0  104) and less likely to be in gene promoters (OR ¼ 0.35, P ¼ 8.2  1013 for <200 base pairs from transcription start site; OR ¼ 0.57, P ¼ 5.9  1008 for <1500 base pairs from transcription start site), and CpG islands (OR ¼ 0.14, P < 2.2  1016) and shores (OR ¼ 0.59, P ¼ 9.7  1010). Exact details of the distribution of Crohn disease–associated CpG sites in relation to gene regions and CpG islands are provided in Supplementary Tables 3 and 4.

Gene Expression Profiles of Differentially Methylated Genes in Crohn Disease The 1189 Crohn disease–associated CpG sites mapped to 717 unique genes. To better understand how these CpG sites might reflect functional processes that are perturbed during the diagnosis of Crohn disease, we examined the expression profiles of our differentially methylated genes in blood RNASeq data available from an independent dataset consisting of 60 pediatric patients newly diagnosed with Crohn disease and 12 non-IBD controls.25 Of the 585 (of 717) genes available for analysis after quality control, 162 (28%) were differentially expressed at FDR <0.05 and 233 (40%) were expressed at the less stringent threshold of P <.05 (Supplementary Figure 6 and Supplementary Table 5). Overlapped with these 233 differentially expressed genes were 295 of the Crohn disease–associated CpG sites, with an average of 1.3 CpG sites associated per gene (range ¼ 1–4). As shown in Supplementary Figure 7, the direction of effects between DNA methylation and gene expression changes in relation to Crohn disease appears to be dependent on context, with some CpG methylation-gene expression probes demonstrating negative association and others showing a positive relation, irrespective of the position of the CpG site in the associated gene (P > .05 by Fisher test; Supplementary Table 6). Collectively, these observations suggest the integrative involvement of methylome and transcriptomic processes underlying Crohn disease pathogenesis.

Biological Processes Enriched in Crohn Disease– Associated CpG Sites Next, we evaluated whether the disease-associated CpG sites in blood were enriched for biological processes

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600

BASIC AND TRANSLATIONAL AT

-

6

Gastroenterology Vol.

relevant to Crohn disease. Our pathway enrichment analysis identified 164 KEGG pathways that were more likely to occur in the Crohn disease–associated CpG sites than would be expected by chance (FDR < 0.05; Supplementary Table 7). Among these were pathways relevant to immune function, including tumor necrosis factor a, JAK-STAT, Rasrelated protein 1, and phosphoinositide 3-kinase–Akt signaling, and inflammation, such as the interleukin 17 signaling pathway, cytokine–cytokine receptor interaction, and chemokine signaling.

Relation Between DNA Methylation Signatures of Crohn Disease and Inflammation To further evaluate the relation between the diseaseassociated methylation signatures and inflammation, we tested the 1189 CpG sites for association with plasma CRP levels, a marker of inflammation, and compared the estimated effects of methylation changes on CRP vs Crohn disease at diagnosis. The relation was extremely strong (R ¼ 0.91, P < 2.2  1016), suggesting that the methylation signatures of Crohn disease cause the inflammatory status of the patient or directly result from it (Figure 2a). Of the 1189 Crohn disease CpG sites, 1155 (97%) exhibited directional consistency, and 872 (73%) showed a statistically significant association with CRP (P < .05; Figure 2a and Supplementary Table 8). Next, to assess the relevance of these methylation signatures to Crohn disease–related inflammation, we compared the effect sizes of Crohn disease–associated CpG sites on Crohn disease and CRP in our dataset with a recently published meta-analysis of the epigenome-wide association of CRP in subjects who were not selected for any particular disorder.26 These meta-analyses were composed of 8863 participants who were sampled from 9 different prospective cohort studies with a wide range of

-,

No.

-

focus, from cardiometabolic phenotypes to physical activity, intelligence, and aging. Surprisingly, we noted an extremely strong correlation between the estimated effects of Crohn disease–associated CpG sites on Crohn disease and chronic low-grade inflammation that is associated with a broad range of complex diseases, including diabetes and cardiovascular disease (Figure 2b and c). To validate this inference, we examined the overlap and directional consistency of previously reported Crohn disease CpG sites3 with the CRP meta-analysis,26 obtaining consistent results (Supplementary Table 2).

Longitudinal Dynamics of DNA Methylation in Crohn Disease To establish the direction of causality of this strong association, we next examined the longitudinal dynamics of inflammation and disease-associated methylation profiles during the course of the disease. Because frontline treatment of IBD attempts to lower inflammation in patients and, as expected, CRP levels in patients at follow-up 1–3 years after diagnosis were dramatically lower than at diagnosis (P ¼ 8.4  109; Supplementary Figure 8), the direction of causality seems obvious. The patients received treatment known to decrease inflammation, and the primary marker for inflammation was much lower. Next, to assess the dynamics of DNA methylation before and after treatment, we compared the methylation profiles at follow-up with the profiles at diagnosis. The effects at diagnosis reflect differences between newly diagnosed patients and controls, and the effects at follow-up reflect differences in patients before and after treatment. At 1179 (99.2%) of the 1189 sites associated with Crohn disease at diagnosis, the sign of the effect had reversed, whereas the magnitude of the change remained the same, generating a strong negative correlation (R ¼ 0.93, P < 2.2  1016; Figure 3a and Supplementary

print & web 4C=FPO

BASIC AND TRANSLATIONAL AT

601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660

Somineni et al

Figure 2. Methylation signatures of Crohn disease reflect inflammatory status of the patient. (a) For the 1189 Crohn disease– associated CpG sites, estimated effects (n ¼ 164 cases and 74 controls) on Crohn disease at diagnosis (x axis) were strongly correlated with the estimated effects (n ¼ 272: 45 non-IBD controls, 132 at diagnosis and 95 follow-up samples) on plasma CRP levels within the same subjects (y axis). Maroon dots represent Crohn disease CpG sites that showed significance with plasma CRP (P < .05). (b, c) At the 206 (of 218) CRP-associated CpG sites in the latest meta-analysis (n ¼ 8863) by Ligthart et al26 (y axis), (b) 199 had effects on Crohn disease at diagnosis and (c) 196 had effects on CRP in the same direction in our data. Maroon dots are CpG sites from Ligthart et al26 that showed significance with (b) Crohn disease and (c) CRP in our data (P < .05). CD, Crohn disease.

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720

Blood DNA Methylation Signatures of Crohn Disease

7

Figure 3. Disrupted methylation patterns during the diagnosis of Crohn disease revert back during the course of the disease. (a) Estimated methylation differences between diagnosis (n ¼ 164) and follow-up (n ¼ 164; y axis) were of similar magnitude to baseline differences between cases (n ¼ 164) and controls (n ¼ 164; x axis). The w850,000 CpG sites are shown; the correlation coefficient and P value are for the 1189 Crohn disease CpG sites that are maroon. (b) Same comparison for patients (n ¼ 55) who received an initial diagnosis of B1 at the time of diagnosis and progressed to B2 during the course of the followup period. (c) Same comparison for patients (n ¼ 95) who started and remained at B1 during diagnosis and follow-up. CD, Crohn disease.

Table 8). In fact, after treatment, methylation at these sites was largely indistinguishable in patients vs controls (Supplementary Figures 9 and 10). We noted similar results even after stratifying patients based on disease progression to B2 (R ¼ 0.91, P < 2.2  1016 for progressors; R ¼ 0.90, P < 2.2  1016 for non-progressors; Figure 3b and c). Collectively, our data establish that, during the course of the disease, methylation patterns that are disrupted at the diagnosis of Crohn disease revert to the levels seen in non-IBD controls, irrespective of the disease behavior states (B1 or B2). Only 10 CpG sites (0.8%) had the same sign of effect during diagnosis vs follow-up; these CpG sites corresponded to 8 unique genes (RORC, CXXC5, GMNN, GPR183, DIDO1, SMARCD3, ESPNL, and EPS8L3; Supplementary Table 8). Interestingly, genes such as RORC, SMARCD3, and EPS8L3 have been linked to IBD, including in genome-wide association studies (GWASs) and gene expression studies.27–31 For instance, RORC encodes a key transcription factor for the T-helper cell type 17 pathway involved in transcriptional regulation of the effector cytokines IL17A, IL17F, IL21, IL22, IL26, and CCL2032 and was previously reported to be differentially expressed in peripheral blood and intestinal Crohn disease samples compared with healthy controls.28

Relation Between DNA Methylation Signatures of Crohn Disease and Disease Activity Next, to test whether our finding of methylome and inflammatory reversion extends to other clinical and laboratory measurements, we examined the measures of the PCDAI, a multi-item index that incorporates clinical symptoms, laboratory parameters, and endoscopic findings,33 and noted higher PCDAI scores (median score ¼ 30; n ¼ 159) during diagnosis that were significantly lower during follow-up (median score ¼ 5; n ¼ 149; P < 2.2  1016;

Supplementary Figure 11). After association analysis of PCDAI with the 1189 sites (n ¼ 308 samples; see Methods), estimated effect sizes demonstrated a strong correlation with their estimated effects on Crohn disease (R ¼ 0.91, P < 2.2  1016), suggesting a potential relation between disease activity (based on PCDAI) and DNA methylation in blood (Supplementary Table 8 and Supplementary Figure 12).

Role of Medication in DNA Methylation Reversal To evaluate the potential impact of therapy on the reversal of Crohn disease–associated methylation signatures, we stratified patients based on the class of medications they were taking at the time of follow-up sampling (Table 1). Comparative analysis of methylation levels in blood at the time of follow-up did not show any genomewide significant differences between subsets of patients who received biologics, immunomodulators, biologics plus immunomodulators, or other drugs, except for 1 CpG, cg24052338 (in the 30 untranslated region of ZNF837), that showed a significant association with other drugs (FDR < 0.05; Supplementary Tables 9–12), indicating that the medication is probably not the primary contributor to the methylome reversion during the course of the disease. Boxplots depicting methylation b values of follow-up patients’ samples stratified based on the class of medications at the top 5 disease-associated CpG sites are shown in Figure 4. However, it is possible that the medicationinduced decreases in inflammation; thus, disease activity could account for the reversal of the disrupted methylome signatures in blood. Consistent with our interpretation, a study of site-specific methylation differences in peripheral blood mononuclear cells10 and a different study of 2 colonic mucosa samples9 showed methylome reversion in response to treatment and/or disease remission through modulation

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

781 782 783 784 785 786 787 788 789 790 791 792 793 794 795 796 797 798 799 800 801 802 803 804 805 806 807 808 809 810 811 812 813 814 815 816 817 818 819 820 821 822 823 824 825 826 827 828 829 830 831 832 833 834 835 836 837 838 839 840

BASIC AND TRANSLATIONAL AT

721 722 723 724 725 726 727 728 729 730 731 732 733 734 735 736 737 738 739 740 741 742 743 744 745 746 747 748 749 750 751 752 753 754 755 756 757 758 759 760 761 762 763 764 765 766 767 768 769 770 771 772 773 774 775 776 777 778 779 780

2019

print & web 4C=FPO

-

BASIC AND TRANSLATIONAL AT

841 842 843 844 845 846 847 848 849 850 851 852 853 854 855 856 857 858 859 860 861 862 863 864 865 866 867 868 869 870 871 872 873 874 875 876 877 878 879 880 881 882 883 884 885 886 887 888 889 890 891 892 893 894 895 896 897 898 899 900

Somineni et al

Gastroenterology Vol.

-,

No.

-

print & web 4C=FPO

8

Figure 4. Boxplots depicting the impact of the class of medications on methylation b values at the top 5 Crohn disease– associated CpG sites during follow-up. Methylation levels for the same CpG sites in non-IBD controls and patients with Crohn disease at diagnosis also are shown.

of the disease-specific inflammatory characteristics. In contrast, another study reported stable methylation differences in patients with newly diagnosed (treatment-naïve) vs established IBD (exposed to IBD medications).3 However, our finding that the reversion of disease-associated methylation patterns associates with clinical characteristics of the disease (CRP, PCDAI) rather than medication underscores the importance of having prospectively followed inception cohorts with well-documented disease measures.

Understanding Causal vs Consequential Roles of DNA Methylation in Crohn Disease Given the methylome reversion occurring during the course of the disease, and its strong relation with plasma CRP levels, Crohn disease–associated methylation signatures seem tightly linked to inflammation rather than to the development of disease. However, if methylation at specific sites plays a role in disease development, then their identification would provide valuable therapeutic targets. To distinguish sites that might have causal vs consequential roles in Crohn disease, we used the concept of Mendelian randomization as operationalized by Wahl et al.34 As shown in Supplementary Figure 13, CpG sites that emerge on the path between the instrumental variable and the outcome (Crohn disease; Model 1), where methylation appears to mediate genetic risk of Crohn disease, are interpreted to be causal rather than being the consequence (Model 2) of the

disease. Of the 1189 Crohn disease CpG sites at diagnosis, 194 were associated with DNA sequence variation in mQTL analysis (FDR < 0.05; Supplementary Table 13). Of these, 174 CpGs with genetic data available for the associated mQTL SNPs from a large meta-analysis of GWASs of Crohn disease20 were evaluated for potential causal relations between methylation in blood and Crohn disease. For each CpG, we identified the most significantly associated SNP (sentinel mQTL) and applied the concept of Mendelian randomization using the sentinel mQTL SNP as the instrumental variable, CpG as a mediator, and Crohn disease as the outcome for methylation cause of Crohn disease (Model 1, Supplementary Figure 13).

Causal Role of DNA Methylation in Crohn Disease Using this set of sentinel SNP-CpG pairs, we first investigated relations between SNP and DNA methylation (b coefficient; b1) and between DNA methylation and Crohn disease (b2) to obtain predicted effects (b1  b2) of the corresponding SNPs on Crohn disease through DNA methylation (Figure 5A). Subsequently, genetic effect sizes of SNPs on Crohn disease were obtained from large GWAS meta-analyses of Crohn disease20 to assess the observed effects of genotypes at these SNPs on Crohn disease (b3). If methylation contributes causally to Crohn disease, then one would expect the observed effect of SNP on phenotype to be consistent with, if not equivalent to, its predicted effect

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

901 902 903 904 905 906 907 908 909 910 911 912 913 914 915 916 917 918 919 920 921 922 923 924 925 926 927 928 929 930 931 932 933 934 935 936 937 938 939 940 941 942 943 944 945 946 947 948 949 950 951 952 953 954 955 956 957 958 959 960

2019

Blood DNA Methylation Signatures of Crohn Disease

9

Figure 5. Evaluation of directionality among Crohn disease–associated CpG sites. Schematic diagram and results of genetic association and the concept of Mendelian randomization framework implemented to clarify the causal vs consequential role of Crohn disease–associated methylation changes in blood. (a) Overall strength of causality of the 174 CpG sites tested for methylation cause of Crohn disease is inferred from the correlation between the observed (y axis) and predicted (x axis) effects of SNP on Crohn disease. To infer causality of individual CpG sites, the association of the sentinel mQTL with Crohn disease should be significant (FDR < 0.05). The 95% confidence interval error bars are shown for the 3 CpG sites with an associated mQTL that also associated with Crohn disease. Eighty-two of the 174 CpG sites that demonstrated directional consistency are shown in maroon; CpG sites that are directionally inconsistent with the observed vs predicted effects are shown in gray. (b) Observed effect of Crohn disease genetic risk score on methylation (y axis) is highly correlated with its predicted effect (x axis) through Crohn disease, suggesting a strong consequential signal at the 194 CpG sites investigated. The 95% confidence interval error bars are shown for 8 CpG sites demonstrating statistically significant consequential association (FDR < 0.05). One hundred forty-two of the 194 CpGs with directionally consistent effects are shown in maroon; CpGs that are directionally inconsistent are shown in gray. CD, Crohn disease.

mediated through methylation. Notably, methylation changes at 3 CpG sites (cg15706657, cg23216724: near GPR31; and cg20406979: near RNASET2) showed significant causal associations with Crohn disease at diagnosis (FDR < 0.05; Figure 5a and Supplementary Table 14). Consistent with the potentially causal effect, methylation levels at cg23216724 and cg20406979 became even more pronounced or remained about the same without exhibiting signs of reversion during follow-up (Supplementary Figure 14), supporting our inference regarding causality. Further support for their potentially causal influence is provided by the observation that all 3 CpG sites are influenced by the known IBD-associated SNP, rs1819333, identified through large GWASs.20,35 The IBD risk locus containing the SNP rs1819333 harbors (within 1-Mb flanking rs1819333) key genes RPS6KA2, RNASET2, and CCR6 that have previously been implicated in IBD pathology at genomic and/or molecular levels, including in transcriptomic and epigenomic studies.3,20,35–37 Although genetic variation at rs1819333 has been associated with significant risk for Crohn disease susceptibility, underlying causal gene(s) and molecular mechanisms of this strong GWAS association are yet to be elucidated. Remarkably, all 3 identified potentially causal CpGs that are associated with rs1819333 were recently shown to causally regulate transcriptional levels of RPS6KA2 in peripheral blood using a

summary data-based Mendelian randomization approach.38 Taken together, these findings suggest DNA methylation as a potential mediator of genetic effects of rs1819333 on Crohn disease, possibly through transcriptional regulation of RPS6KA2.

Consequential Role of DNA Methylation in Crohn Disease Conversely, to identify Crohn disease–associated sites where changes in methylation are more likely to be the consequence of the disease, we used a weighted Crohn disease genetic risk score (see Methods) as an instrumental variable, Crohn disease as the mediator, and methylation as the outcome (Model 2, Supplementary Figure 13). An extremely strong correlation (R ¼ 0.86; P < 2.2  1016) between the observed effect of the weighted genetic risk score on methylation and its predicted effect through Crohn disease was seen (Figure 5b). In particular, we identified 8 CpG sites corresponding to 7 genes that showed significant consequential associations with Crohn disease at diagnosis (FDR < 0.05; Figure 5b and Supplementary Table 15). In keeping with their consequential role, methylation levels at these CpG sites demonstrated drastic changes approaching levels seen in non-IBD controls during follow-up (Supplementary Figure 15). Differential methylation at

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

1021 1022 1023 1024 1025 1026 1027 1028 1029 1030 1031 1032 1033 1034 1035 1036 1037 1038 1039 1040 1041 1042 1043 1044 1045 1046 1047 1048 1049 1050 1051 1052 1053 1054 1055 1056 1057 1058 1059 1060 1061 1062 1063 1064 1065 1066 1067 1068 1069 1070 1071 1072 1073 1074 1075 1076 1077 1078 1079 1080

BASIC AND TRANSLATIONAL AT

961 962 963 964 965 966 967 968 969 970 971 972 973 974 975 976 977 978 979 980 981 982 983 984 985 986 987 988 989 990 991 992 993 994 995 996 997 998 999 1000 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 1011 1012 1013 1014 1015 1016 1017 1018 1019 1020

print & web 4C=FPO

-

10

BASIC AND TRANSLATIONAL AT

1081 1082 1083 1084 1085 1086 1087 1088 1089 1090 1091 1092 1093 1094 1095 1096 1097 1098 1099 1100 1101 1102 1103 1104 1105 1106 1107 1108 1109 1110 1111 1112 1113 1114 1115 1116 1117 1118 1119 1120 1121 1122 1123 1124 1125 1126 1127 1128 1129 1130 1131 1132 1133 1134 1135Q2 1136 1137 1138 1139 1140

Somineni et al

Gastroenterology Vol.

cg18942579: TMEM49 and cg17501210: RPS6KA2, CpG sites that have been consistently associated with Crohn disease,3,8 appears to be a consequence of the disease rather than exerting causal effects. For instance, cg17501210 was reported to be the top-most differentially methylated CpG site in peripheral blood of patients with IBD, whose effects were (1) even more pronounced in purified CD14þ monocytes; (2) strongly correlated with disease-relevant markers, including CRP, albumin, and hemoglobin; and (3) not influenced by treatment status. Overall, the strong correlation between the observed and predicted effects presented in Figure 5b suggests that most disease-associated methylation changes are triggered by the onset of Crohn disease. This finding is consistent with findings from other complex diseases,34,39,40 suggesting that only a minority of the traitassociated methylation changes are likely to exert causal effects.

Role of DNA Methylation in Diagnosis and Prognosis of Crohn Disease Biological data that enable accurate diagnosis and/or prognosis of IBD have always been of considerable interest from the viewpoint of clinical application. In line with previous studies,3 we noted that peripheral blood DNA methylation data could indeed distinguish patients with Crohn disease from non-IBD controls (area under the curve ¼ 0.91; Supplementary Figure 16). However, given the noninflammatory nature of the sampled control subjects, supplemented by our finding that the signatures of methylation observed at diagnosis of Crohn disease capture general inflammation rather than Crohn disease specifically, we are hesitant to propose peripheral methylation as a diagnostic biomarker for Crohn disease based on the prevailing evidence. Future studies of side-by-side evaluation of methylation data from patients with different immunemediated inflammatory diseases and the disease-relevant tissue-specific inflammatory characterization are required to definitively establish the diagnostic potential conferred by methylation signatures of complex diseases. Next, to assess the utility of methylation in prognosis, by stratifying patients with Crohn disease based on subsequent progression to complicated disease (see Methods), we investigated whether methylation signatures at diagnosis could predict who would in time progress to complicated Crohn disease. In keeping with our finding presented in Supplementary Figure 4 and Supplementary Table 1, we did not find any CpG sites showing significant differences when the baseline methylation profiles of subsequent progressors were compared with non-progressors. Taken together, our data suggest that peripheral blood methylation profiles do not predict or change in relation to the evolution or presence of Crohn disease complications.

Discussion We characterized temporal relations connecting methylation changes in blood with different inflammatory characteristics at diagnosis and during treatment for Crohn

-,

No.

-

disease in children. Systemic inflammation has long been understood to be a pathogenetic hallmark of Crohn disease, and medication to relieve the burden of inflammation has been part of all frontline treatment strategies for managing the disease. Our results provide convincing evidence that the signatures of methylation observed at diagnosis accompany acute inflammation that decreases with treatment, but reverts toward levels seen in non-IBD controls despite ongoing bowel disease, arguing that they are primarily a symptom of the disease rather than a cause. If so, then treatment of inflammation signatures might fundamentally be treating the symptoms of Crohn disease rather than the etiology, partly explaining why IBD often remains a lifelong remitting and relapsing disorder, despite effective treatment of the inflammation symptoms. A caveat to this interpretation is that we measured circulating immune cells, whereas the inflammation is manifest in the bowel. Data for gut-resident immune cells will be required to establish whether the methylome reversion we describe also is observed in the gut and affected by the diverse treatment regimens independent of the epithelial signature. By contrast, a recent study of methylation in intestinal epithelial cells described a distinct IBD profile more related to disease than to inflammation, which was stable over time in the 23 patients examined,12 further suggesting the need for new drugs that treat the cells in which persistent molecular changes underlie disease pathogenesis. Future epigenetic studies of IBD could profile circulating, gut-resident immune, and gut epithelial cells in parallel using our framework to definitively identify causal CpG sites and leverage the epigenome for the development of targeted therapeutics. This, in combination with the current armamentarium of IBD medications that hold promise for successfully managing the disease by targeting the immune system, might put us one step closer to sustained remission and mucosal healing. One of the long-term complications of IBD is inflammation-related cancers,41–44 and our finding that aberrant DNA methylation of Crohn disease is predominantly a consequence of inflammation provides a strong rationale for the molecular link between IBD and cancer, because chronic inflammation and aberrant DNA methylation have a known role in malignancy development. It remains to be seen whether these methylation signatures detected in blood as a consequence of inflammation could in part predict new-onset incident cancer, a major clinical consequence associated with IBD. Our study has certain limitations. Although it was well powered to detect CpG sites that differ between non-IBD controls and patients with newly diagnosed Crohn disease and examine how they change during the course of the disease, we were limited in terms of power to apply a Mendelian randomization framework to infer causal associations; hence, we might have missed detecting some CpG sites with a potential causal influence. Despite this limitation, we identified differential methylation of several CpG sites associated with an IBD-associated SNP, rs1819333, to be potentially causal to Crohn disease. However, this result should be interpreted with caution given the lack of

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

1141 1142 1143 1144 1145 1146 1147 1148 1149 1150 1151 1152 1153 1154 1155 1156 1157 1158 1159 1160 1161 1162 1163 1164 1165 1166 1167 1168 1169 1170 1171 1172 1173 1174 1175 1176 1177 1178 1179 1180 1181 1182 1183 1184 1185 1186 1187 1188 1189 1190 1191 1192 1193 1194 1195 1196 1197 1198 1199 1200

1201 1202 1203 1204 1205 1206 1207 1208 1209 1210 1211 1212 1213 1214 1215 1216 1217 1218 1219 1220 1221 1222 1223 1224 1225 1226 1227 1228 1229 1230 1231 1232 1233 1234 1235 1236 1237 1238 1239 1240 1241 1242 1243 1244 1245 1246 1247 1248 1249 1250 1251 1252 1253 1254 1255 1256 1257 1258 1259 1260

2019

Blood DNA Methylation Signatures of Crohn Disease

replication. Nevertheless, Mendelian randomization showed results consistent with findings from our longitudinal framework that the peripheral blood methylation changes associated with Crohn disease in children are predominantly a consequence of disease. Causal vs consequential analyses of adult cohorts should confirm the potential impact of blood-derived DNA methylation or the lack thereof in adult patients diagnosed with Crohn disease.

Supplementary Material Note: To access the supplementary material accompanying this article, visit the online version of Gastroenterology at www.gastrojournal.org, and at https://doi.org/10.1053/ j.gastro.2019.01.270.

References 1. Van Limbergen J, Russell RK, Drummond HE, et al. Definition of phenotypic characteristics of childhoodonset inflammatory bowel disease. Gastroenterology 2008;135:1114–1122. 2. Ruel J, Ruane D, Mehandru S, et al. IBD across the age spectrum: is it the same disease? Nat Rev Gastroenterol Hepatol 2014;11:88–98. 3. Ventham NT, Kennedy NA, Adams AT, et al. Integrative epigenome-wide analysis demonstrates that DNA methylation may mediate genetic risk in inflammatory bowel disease. Nat Commun 2016;7:13507. 4. Karatzas PS, Gazouli M, Safioleas M, et al. DNA methylation changes in inflammatory bowel disease. Ann Gastroenterol 2014;27:125–132. 5. McDermott E, Ryan EJ, Tosetto M, et al. DNA methylation profiling in inflammatory bowel disease provides new insights into disease pathogenesis. J Crohns Colitis 2016;10:77–86. 6. Li Yim AYF, Duijvis NW, Zhao J, et al. Peripheral blood methylation profiling of female Crohn’s disease patients. Clin Epigenetics 2016;8:65. 7. Nimmo ER, Prendergast JG, Aldhous MC, et al. Genome-wide methylation profiling in Crohn’s disease identifies altered epigenetic regulation of key host defense mechanisms including the Th17 pathway. Inflamm Bowel Dis 2012;18:889–899. 8. Adams AT, Kennedy NA, Hansen R, et al. Two-stage genome-wide methylation profiling in childhood-onset Crohn’s Disease implicates epigenetic alterations at the VMP1/MIR21 and HLA loci. Inflamm Bowel Dis 2014; 20:1784–1793. 9. Harris RA, Nagy-Szakal D, Mir SA, et al. DNA methylation-associated colonic mucosal immune and defense responses in treatment-naive pediatric ulcerative colitis. Epigenetics 2014;9:1131–1137. 10. Harris RA, Nagy-Szakal D, Pedersen N, et al. Genomewide peripheral blood leukocyte DNA methylation microarrays identified a single association with inflammatory bowel diseases. Inflamm Bowel Dis 2012; 18:2334–2341.

11

11. Taman H, Fenton CG, Hensel IV, et al. Genome-wide DNA methylation in treatment-naive ulcerative colitis. J Crohns Colitis 2018;12:1338–1347. 12. Howell KJ, Kraiczy J, Nayak KM, et al. DNA methylation and transcription patterns in intestinal epithelial cells from pediatric patients with inflammatory bowel diseases differentiate disease subtypes and associate with outcome. Gastroenterology 2018;154:585–598. 13. Kugathasan S, Denson LA, Walters TD, et al. Prediction of complicated disease course for children newly diagnosed with Crohn’s disease: a multicentre inception cohort study. Lancet 2017;389:1710–1718. 14. Farh KK, Marson A, Zhu J, et al. Genetic and epigenetic fine mapping of causal autoimmune disease variants. Nature 2015;518:337–343. 15. Barfield RT, Kilaru V, Smith AK, et al. CpGassoc: an R function for analysis of DNA methylation microarray data. Bioinformatics 2012;28:1280–1281. 16. McCartney DL, Walker RM, Morris SW, et al. Identification of polymorphic and off-target probe binding sites on the Illumina Infinium MethylationEPIC BeadChip. Genom Data 2016;9:22–24. 17. Teschendorff AE, Marabita F, Lechner M, et al. A betamixture quantile normalization method for correcting probe design bias in Illumina Infinium 450 k DNA methylation data. Bioinformatics 2013;29:189–196. 18. Houseman EA, Accomando WP, Koestler DC, et al. DNA methylation arrays as surrogate measures of cell mixture distribution. BMC Bioinformatics 2012;13:86. 19. Purcell S, Neale B, Todd-Brown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559–575. 20. Liu JZ, van Sommeren S, Huang H, et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat Genet 2015;47:979–986. 21. Wang J, Zhao Q, Hastie T, Owen AB. Confounder adjustment in multiple hypothesis testing. arXiv 2015;1508.04178. 22. Phipson B, Maksimovic J, Oshlack A. missMethyl: an R package for analyzing data from Illumina’s HumanMethylation450 platform. Bioinformatics 2016;32:286–288. 23. Zhou X, Stephens M. Genome-wide efficient mixedmodel analysis for association studies. Nat Genet 2012;44:821–824. 24. Friedman J, Hastie T, Tibshirani R. Regularization paths for generalized linear models via coordinate descent. J Stat Softw 2010;33:1–22. 25. Mo A, Marigorta UM, Arafat D, et al. Disease-specific regulation of gene expression in a comparative analysis of juvenile idiopathic arthritis and inflammatory bowel disease. Genome Med 2018;10:48. 26. Ligthart S, Marzi C, Aslibekyan S, et al. DNA methylation signatures of chronic low-grade inflammation are associated with complex diseases. Genome Biol 2016; 17:255. 27. Bogaert S, Laukens D, Peeters H, et al. Differential mucosal expression of Th17-related genes between the

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

1261 1262 1263 1264 1265 1266 1267 1268 1269 1270 1271 1272 1273 1274 1275 1276 1277 1278 1279 1280 1281 1282 1283 1284 1285 1286 1287 1288 1289 1290 1291 1292 1293 1294 1295 1296 1297 1298 1299 1300 1301 1302 1303 1304 1305 1306 1307 1308 1309 1310 1311 1312 1313 1314 1315 1316 1317 1318 1319 1320

BASIC AND TRANSLATIONAL AT

-

12

BASIC AND TRANSLATIONAL AT

1321 1322 1323 1324 1325 1326 1327 1328 1329 1330 1331 1332 1333 1334 1335 1336 1337 1338 1339 1340 1341 1342 1343 1344 1345 1346 1347 1348 1349 1350 1351 1352 1353 1354 1355 1356 1357 1358 1359 1360 1361 1362 1363 1364 1365 1366 1367 1368 1369 1370 1371 1372 1373 1374 1375 1376 1377 1378 1379 1380

28.

29.

30.

31.

32.

33.

34.

35.

36.

37.

38.

39.

40.

Somineni et al

Gastroenterology Vol.

inflamed colon and ileum of patients with inflammatory bowel disease. BMC Immunol 2010;11:61. Granlund A, Flatberg A, Ostvik AE, et al. Whole genome gene expression meta-analysis of inflammatory bowel disease colon mucosa demonstrates lack of major differences between Crohn’s disease and ulcerative colitis. PLoS One 2013;8:e56818. Sipos F, Galamb O, Wichmann B, et al. Peripheral blood based discrimination of ulcerative colitis and Crohn’s disease from non-IBD colitis by genome-wide gene expression profiling. Dis Markers 2011;30:1–17. Kang J, Kugathasan S, Georges M, et al. Improved risk prediction for Crohn’s disease with a multi-locus approach. Hum Mol Genet 2011;20:2435–2442. de Lange KM, Moutsianas L, Lee JC, et al. Genomewide association study implicates immune activation of multiple integrin genes in inflammatory bowel disease. Nat Genet 2017;49:256–261. Maddur MS, Miossec P, Kaveri SV, et al. Th17 cells: biology, pathogenesis of autoimmune and inflammatory diseases, and therapeutic strategies. Am J Pathol 2012; 181:8–18. Hyams J, Markowitz J, Otley A, et al. Evaluation of the pediatric Crohn Disease Activity Index: a prospective multicenter experience. J Pediatr Gastroenterol Nutr 2005;41:416–421. Wahl S, Drong A, Lehne B, et al. Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature 2017;541:81–86. Jostins L, Ripke S, Weersma RK, et al. Host-microbe interactions have shaped the genetic architecture of inflammatory bowel disease. Nature 2012;491:119–124. Marigorta UM, Denson LA, Hyams JS, et al. Transcriptional risk scores link GWAS to eQTLs and predict complications in Crohn’s disease. Nat Genet 2017; 49:1517–1521. Hannon E, Weedon M, Bray N, et al. Pleiotropic effects of trait-associated genetic variation on DNA methylation: utility for refining GWAS loci. Am J Hum Genet 2017; 100:954–959. Wu Y, Zeng J, Zhang F, et al. Integrative analysis of omics summary data reveals putative mechanisms underlying complex traits. Nat Commun 2018;9:918. Mendelson MM, Marioni RE, Joehanes R, et al. Association of body mass index with DNA methylation and gene expression in blood cells and relations to cardiometabolic disease: a Mendelian randomization approach. PLoS Med 2017;14:e1002215. Elliott HR, Shihab HA, Lockett GA, et al. Role of DNA methylation in type 2 diabetes etiology: using genotype as a causal anchor. Diabetes 2017;66:1713–1722.

-,

No.

-

41. Scharl S, Barthel C, Rossel JB, et al. Malignancies in inflammatory bowel disease: frequency, incidence and risk factors—results from the Swiss IBD Cohort Study. Am J Gastroenterol 2018;114:116–126. 42. Choi CR, Bakir IA, Hart AL, et al. Clonal evolution of colorectal cancer in IBD. Nat Rev Gastroenterol Hepatol 2017;14:218–229. 43. Peneau A, Savoye G, Turck D, et al. Mortality and cancer in pediatric-onset inflammatory bowel disease: a population-based study. Am J Gastroenterol 2013; 108:1647–1653. 44. Aardoom MA, Linda Joosse ME, de Vries ACH, et al. Malignancy and mortality in pediatric-onset inflammatory bowel disease: a systematic review. Inflamm Bowel Dis 2018;24:732–741.

Authors names in bold designate shared co-first authorship. Received September 10, 2018. Accepted January 28, 2019. Reprint requests Address requests for reprints to: Subra Kugathasan, MD: Division of Pediatric Gastroenterology, Emory University School of Medicine and Children’s Healthcare of Atlanta, 1760 Haygood Drive, W-427, Atlanta, Goergia 30322. e-mail: [email protected]. Acknowledgments We are grateful to Sarah Curtis, Anna Knight, Biao Zeng, Anne Dodd, Khuong Le, and Jarod Prince for their support and helpful comments. We also thank Neal Leleiko for comments on the report. The DNA methylation data for all 238 subjects (402 samples) included in this study have been deposited in the Gene Expression Omnibus and are accessible through series accession GSE112611. Author contributions: Hari K. Somineni, Karen N. Conneely, Alicia K. Smith, and Subra Kugathasan conceived and designed the study. Hari K. Somineni, Suresh Venkateswaran, and Varun Kilaru performed the analysis with input from Greg Gibson, David J. Cutler, Karen N. Conneely, Alicia K. Smith, and Subra Kugathasan. Angela Mo performed gene expression analysis with input from Urko M. Marigorta and Greg Gibson. Dawayland Cobb processed samples for methylation profiling. Kajari Mondal managed the RISK study. Jeffrey S. Hyams, Lee A. Denson, and Subra Kugathasan participated in the conception and design of the RISK study. David T. Okou, Richard Kellermayer, Kajari Mondal, Thomas D. Walters, Anne Griffiths, Joshua D. Noe, Wallace V. Crandall, Joel R. Rosh, David R. Mack, Melvin B. Heyman, Susan S. Baker, Michael C. Stephens, Robert N. Baldassano, James F. Markowitz, Marla C. Dubinsky, Judy Cho, Jeffrey S. Hyams, Lee A. Denson, and Subra Kugathasan recruited subjects, collected the data, and worked on its curation and analysis. Hari K. Somineni, Urko M. Marigorta, Greg Gibson, David J. Cutler, Karen N. Conneely, Alicia K. Smith, and Subra Kugathasan interpreted the results and wrote the manuscript. All authors reviewed and approved the manuscript before submission. Conflicts of interest Authors disclose no conflicts. Funding This work was supported by a research initiative grant from the Crohn’s and Colitis Foundation (New York, New York) to the individual study institutions participating in the RISK study. This research also was supported by the National Institute of Diabetes and Digestive and Kidney Q4 Diseases of the National Institutes of Health (grants R01-DK098231 and R01-DK087694 to Subra Kugathasan). The funders of the study had no role in the study design; data collection, analysis, or interpretation; or writing of the article.

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

1381 1382 1383 1384 1385 1386 1387 1388 1389 1390 1391 1392 1393 1394 1395 1396 1397 1398 1399 1400 1401 1402 1403 1404 1405 1406 1407 1408 1409 1410 1411 1412 1413 1414 1415 1416 1417 1418 1419 1420 1421 1422 1423 1424 1425 1426 1427 1428 1429 1430 1431 1432 1433 1434 1435 1436 1437 1438 1439 1440

-

1441 1442 1443 1444 1445 1446 1447 1448 1449 1450 1451 1452 1453 1454 1455 1456 1457 1458 1459 1460 1461 1462 1463 1464 1465 1466 1467 1468 1469 1470 1471 1472 1473 1474 1475 1476 1477 1478 1479 1480 1481 1482 1483 1484 1485 1486 1487 1488 1489 1490 1491 1492 1493 1494 1495 1496 1497 1498 1499 1500

2019

Blood DNA Methylation Signatures of Crohn Disease

Supplementary Methods Methylation Association With Crohn Disease at Diagnosis Crohn disease–associated methylation changes were profiled using the R package CATE,1 which implements a state-of-the-art batch correction method to remove inflation and test-statistic bias in association tests. We tested for an association between Crohn disease and methylation at the w807,000 sites that yielded results with an inflation in the association test of 1.16. Briefly, DNA methylation was regressed on disease status (0 for control, 1 for case) with age, sex, estimated cell proportions, and the first 3 genotype-based principal components as covariates in the model. A case–control epigenome-wide association was performed to identify CpG sites associated with Crohn disease at diagnosis in a set of 238 study participants composed of 164 cases and 74 non-IBD controls.

Methylation Association With Crohn Disease at Diagnosis vs Follow-up To analyze longitudinal changes in DNA methylation of patients with Crohn disease, beta-mixture quantile dilation -normalized signal intensities were further adjusted using ComBAT2 to account for chip and position effects. This was done as an alternative to adjustment within CATE, because CATE could not be used to perform the longitudinal analysis. This within-case longitudinal epigenome-wide association was performed to identify methylation changes during the course of the disease in 328 samples composed of 164 at-diagnosis samples and 164 follow-up samples from the same subjects. To account for the 2 time points from each patient, representing methylation levels at diagnosis and follow-up, we used a linear mixed-effects model to model the repeated measures in which adjusted b values were regressed on the time point (0 for at-diagnosis sample, 1 for follow-up sample) with age, sex, estimated cell proportions, 3 genotype-based principal components, and disease behavior states (0 for B1 sample, 1 for B2 sample) as fixedeffect covariates and the subject ID as a random effect. Similarly, within-progressor and within–non-progressor epigenome-wide association analyses were performed in 55 progressors and 95 non-progressors, respectively, by comparing their methylation profiles at diagnosis with the follow-up data.

Methylation Association With Plasma CRP A subset of subjects (45 non-IBD controls, 132 cases at diagnosis, and 95 cases at follow-up) who underwent methylation profiling had data on plasma CRP levels. To investigate the relation between DNA methylation in blood and CRP, we performed an epigenome-wide association of CRP in 272 samples using a linear mixed-effects model. Methylation b values were regressed on the log2-transformed plasma CRP (milligrams per liter) levels with age,

12.e1

sex, estimated cell proportions, 3 genotype-based principal components, and disease status (0 for control, 1 for B1, 2 for B2) as fixed-effect covariates and the subject ID as a random effect.

Methylation Association With PCDAI PCDAI scores were available for almost all patients at the time of diagnosis (n ¼ 159) and follow-up (n ¼ 149). Details on how PCDAI was computed can be found elsewhere.3 We performed epigenome-wide association of PCDAI in 308 samples by regressing methylation b values on PCDAI with age, sex, estimated cell proportions, 3 genotype-based principal components, and disease behavior states (0 for B1, 1 for B2) as fixed-effect covariates and the subject ID as a random effect.

KEGG Pathway Enrichment Analysis We used missMethyl,4 an R/Bioconductor package, to identify pathways that are more likely to occur in Crohn disease–associated CpG sites than would be expected by chance by referencing the KEGG database. Genes with more probes (more CpG sites probed) on the MethylationEPIC array are more likely to have differentially methylated CpG sites, which could introduce potential bias when performing pathway enrichment analysis. The gometh function implemented in missMethyl takes into account the varying number of differentially methylated CpGs by computing the prior probability for each gene based on the gene length and number of CpG sites probed per gene on the array.

mQTL Analysis After exclusion of CpG sites with a SNP(s) in their probe sequence, methylation proportions at each of the 625,464 CpG sites from baseline peripheral blood methylation data in 238 subjects were tested for associations with local genetic variants (±500 kb; mQTLs) using a linear mixed model implemented in GEMMA.5 This model allows for the adjustment of the population structure and relatedness among individuals as a random effect by providing a genetic relationship matrix using the linkage disequilibrium– pruned SNP dataset, which could then be used as a covariate in the mQTL analysis. In addition to the genetic relationship matrix, we included covariates for age, sex, disease status, estimated cell proportions, and 3 genotype-based principal components, and modeled methylation (CpG) as the outcome and SNP as an explanatory variable. For each CpG site, all SNPs residing within ±500 kb were individually tested for association for a total of 144,916,995 tests genome-wide. To adjust for multiple tests, statistically significant SNP–CpG pairs were inferred at FDR <5%.

Genetic Association and Concept of Mendelian Randomization To clarify the role of methylation changes that are associated with Crohn disease, we used genetic association and the concept of Mendelian randomization as described

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

1501 1502 1503 1504 1505 1506 1507 1508 1509 1510 1511 1512 1513 1514 1515 1516 1517 1518 1519 1520 1521 1522 1523 1524 1525 1526 1527 1528 1529 1530 1531 1532 1533 1534 1535 1536 1537 1538 1539 1540 1541 1542 1543 1544 1545 1546 1547 1548 1549 1550 1551 1552 1553 1554 1555 1556 1557 1558 1559 1560

12.e2

1561 1562 1563 1564 1565 1566 1567 1568 1569 1570 1571 1572 1573 1574 1575 1576 1577 1578 1579 1580 1581 1582 1583 1584 1585 1586 1587 1588 1589 1590 1591 1592 1593 1594 1595 1596 1597 1598 1599 1600 1601 1602 1603 1604 1605 1606 1607 1608 1609 1610 1611 1612 1613 1614 1615 1616 1617 1618 1619 1620

Somineni et al

Gastroenterology Vol.

by Wahl et al.6 The fundamental idea of Mendelian randomization is shown in Supplementary Figure 13. Briefly, Mendelian randomization makes the following assumptions: (1) an instrumental variable (individual SNP or a combination of SNPs, such as a genetic risk score) has an association with the intermediate phenotype, (2) the instrumental variable has no association with the outcome except through the intermediate phenotype, and (3) the instrumental variable is not influenced by any of the measured or unmeasured confounding factors. If the intermediate phenotype is causally associated with the outcome, then, in an adequately powered study, the instrumental variable (associated with the intermediate phenotype) also should be associated with the outcome. Hence, assignment of directionality to the intermediate phenotype–outcome relation through Mendelian randomization relies on the observed association between the instrumental variable and the outcome, which would typically require tens of thousands of subjects to achieve adequate power. For studies with limited sample size, as described by Wahl et al,6 if a potential causal relation exists between the intermediate phenotype and the outcome, then one would expect the estimated effect (b coefficient) of the instrumental variable on the outcome (b3 in Figure 5) to be consistent with (directional consistency), if not equivalent to, its predicted effect mediated through the intermediate phenotype (b1  b2).

DNA Methylation Cause of Crohn Disease To identify Crohn disease–associated CpG sites that are potentially causal, we used the most significantly associated mQTL (sentinel mQTL, defined as the mQTL with the smallest P value) as the instrumental variable, methylation as the intermediate phenotype, and Crohn disease as the outcome (Model 1; Supplementary Figure 13). The effect size between SNPs and corresponding CpG sites (sentinel mQTL–CpG pair from our mQTL analysis; b1) was estimated through simple linear regression models with methylation as the response and SNP as the explanatory variable. The effect size between CpG sites and disease status (b2) was estimated through simple linear regression models with disease status (0 for non-IBD control, 1 for Crohn disease) as the response and CpG as the explanatory variable. The effect size between sentinel mQTL SNP and disease status (b3) was obtained from a large meta-analysis of Crohn disease GWASs.7 For these, the ORs estimated by logistic regression models with disease status (0 for non-IBD control, 1 for Crohn disease) as the response and SNP as the explanatory variable in GWAS meta-analysis7 were logtransformed to make the effects linear. The reason behind fitting simple linear regression models despite the response variable being binary is that the relation between effect

-,

No.

-

sizes denoted in the equation in Figure 5 holds true when linear regression models are fit, but no analogous relation exists for logistic regression models. Because this relation between effect sizes was important for our assessment of consequence vs causality depicted in Figure 5, we chose to use linear rather than logistic regression. Although the normality assumption of linear regression is clearly violated by the use of a binary dependent variable, leading to incorrect estimates of the standard errors, the estimated effect sizes will be unbiased estimates of the expected change in outcome due to a 1-unit change in the predictor. From our mQTL analysis we identified mQTL associations (FDR < 0.05) for 194 of the 1189 Crohn disease–associated CpG sites. Of these, 174 CpG sites for which the associated mQTL SNP (or proxy SNP [n ¼ 6]: linkage disequilibrium r2  0.8; Supplementary Table 14) had been analyzed in a previously published GWAS7 were subsequently evaluated for their causal role. None of the sentinel mQTLs associated with the selected CpG sites showed deviance from the assumptions made to be a valid instrument. For example, no sentinel mQTL showed significant association with the outcome (Crohn disease) after conditioning on methylation levels at the corresponding CpG sites, allowing us to investigate the potential causal relations between DNA methylation in blood at all sentinel mQTL–CpG sites and Crohn disease. The predicted effect sizes and standard errors were estimated as bpred ¼ b1  b2 and SEpred ¼ (SE21  SE22 þ SE21  b22 þ SE22  b21)1/2 , respectively. FDR <0.05 was considered statistically significant for individual CpG sites.

DNA Methylation Consequence of Crohn Disease To identify Crohn disease–associated CpG sites where changes in methylation are a consequence of the disease, we used weighted Crohn disease genetic risk score as the instrumental variable, Crohn disease as the intermediate phenotype, and methylation as the outcome (Model 2; Supplementary Figure 13). The effect size between z-scored weighted genetic risk score and Crohn disease (b1) was estimated through simple linear regression models with Crohn disease as the response and risk score as the explanatory variable. The effect size between methylation and Crohn disease (b2) was estimated through simple linear regression with methylation as the response and Crohn disease as the explanatory variable. The effect size between methylation and weighted genetic risk score (b3) was estimated through a simple linear model with methylation as the response and genetic risk score as the explanatory variable. All 194 sentinel mQTL–CpG sites were tested for methylation consequence of Crohn disease. The predicted effect sizes and standard errors were computed as described earlier. FDR <0.05 was considered statistically significant for individual CpG sites.

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

1621 1622 1623 1624 1625 1626 1627 1628 1629 1630 1631 1632 1633 1634 1635 1636 1637 1638 1639 1640 1641 1642 1643 1644 1645 1646 1647 1648 1649 1650 1651 1652 1653 1654 1655 1656 1657 1658 1659 1660 1661 1662 1663 1664 1665 1666 1667 1668 1669 1670 1671 1672 1673 1674 1675 1676 1677 1678 1679 1680

-

1681 1682 1683 1684 1685 1686 1687 1688 1689 1690 1691 1692 1693 1694 1695 1696 1697 1698 1699 1700 1701 1702 1703 1704 1705 1706 1707 1708 1709 1710 1711 1712 1713 1714 1715 1716 1717 1718 1719 1720 1721 1722 1723 1724 1725 1726 1727 1728 1729 1730 1731 1732 1733 1734 1735 1736 1737 1738 1739 1740

2019

Blood DNA Methylation Signatures of Crohn Disease

References 1.

2.

3.

4.

Wang J, Zhao Q, Hastie T, Owen AB. Confounder adjustment in multiple hypothesis testing. arXiv 2015;1508.04178. Johnson WE, Li C, Rabinovic A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 2007;8:118–127. Kugathasan S, Denson LA, Walters TD, et al. Prediction of complicated disease course for children newly diagnosed with Crohn’s disease: a multicentre inception cohort study. Lancet 2017;389:1710–1718. Phipson B, Maksimovic J, Oshlack A. missMethyl: an R package for analyzing data from Illumina’s

5.

6.

7.

12.e3

HumanMethylation450 platform. Bioinformatics 2016; 32:286–288. Zhou X, Stephens M. Genome-wide efficient mixedmodel analysis for association studies. Nat Genet 2012;44:821–824. Wahl S, Drong A, Lehne B, et al. Epigenome-wide association study of body mass index, and the adverse outcomes of adiposity. Nature 2017;541:81–86. Liu JZ, van Sommeren S, Huang H, et al. Association analyses identify 38 susceptibility loci for inflammatory bowel disease and highlight shared genetic risk across populations. Nat Genet 2015;47:979–986.

Author names in bold designate shared co-first authorship.

FLA 5.5.0 DTD  YGAST62484_proof  17 April 2019  5:36 am  ce

1741 1742 1743 1744 1745 1746 1747 1748 1749 1750 1751 1752 1753 1754 1755 1756 1757 1758 1759 1760 1761 1762 1763 1764 1765 1766 1767 1768 1769 1770 1771 1772 1773 1774 1775 1776 1777 1778 1779 1780 1781 1782 1783 1784 1785 1786 1787 1788 1789 1790 1791 1792 1793 1794 1795 1796 1797 1798 1799 1800