Identification of Loci at 1q21 and 16q23 That Affect Susceptibility to Inflammatory Bowel Disease in Koreans


Accepted Manuscript

Suk-Kyun Yang, Myunghee Hong, Hyunjung Oh, Hui-Qi Low, Seulgi Jung, Seonjoo Ahn, Youngjin Kim, Jiwon Baek, Cue Hyunkyu Lee, Eunji Kim, Kyung Mo Kim, Byong Duk Ye, Kyung-Jo Kim, Sang Hyoung Park, Ho-Su Lee, Inchul Lee, Hyoung Doo Shin, Buhm Han, Dermot P.B. McGovern, Jianjun Liu, Kyuyoung Song

PII: S0016-5085(16)34967-8
DOI: 10.1053/j.gastro.2016.08.025
Reference: YGAST 60648

To appear in: Gastroenterology

Accepted Date: 18 August 2016

Please cite this article as: Yang S-K, Hong M, Oh H, Low H-Q, Jung S, Ahn S, Kim Y, Baek J, Lee CH, Kim E, Kim KM, Ye BD, Kim K-J, Park SH, Lee H-S, Lee I, Shin HD, Han B, McGovern DPB, Liu J, Song K, Identification of Loci at 1q21 and 16q23 That Affect Susceptibility to Inflammatory Bowel Disease in Koreans, Gastroenterology (2016), doi: 10.1053/j.gastro.2016.08.025.

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT


Suk-Kyun Yang,1# Myunghee Hong,2# Hyunjung Oh,2 Hui-Qi Low,2 Seulgi Jung,2 Seonjoo Ahn,2 Youngjin Kim,2 Jiwon Baek,2 Cue Hyunkyu Lee,3 Eunji Kim,3,4 Kyung Mo Kim,5 Byong Duk Ye,1 Kyung-Jo Kim,1 Sang Hyoung Park,1 Ho-Su Lee,1 Inchul Lee,6 Hyoung Doo Shin,7 Buhm Han,3 Dermot P B McGovern,8 Jianjun Liu,9 Kyuyoung Song2

1 Department of Gastroenterology, Asan Medical Center, University of Ulsan College of Medicine, Seoul 138-736, Korea

2 Department of Biochemistry and Molecular Biology, University of Ulsan College of Medicine, Seoul 138-736, Korea

3 Department of Convergence Medicine, University of Ulsan College of Medicine & Asan Institute for Life Sciences, Asan Medical Center, Seoul 138-736, Korea

4 Department of Chemistry, Seoul National University, Seoul 151-742, Korea

5 Department of Pediatrics, Asan Medical Center Children’s Hospital, University of Ulsan College of Medicine, Seoul 138-736, Korea

6 Department of Pathology, Asan Medical Center, University of Ulsan College of Medicine, Seoul 138-736, Korea

7 Department of Life Science, Sogang University, Seoul 121-742, Korea

8 The F. Widjaja Foundation Inflammatory Bowel and Immunobiology Research Institute; Cedars-Sinai Medical Center, Los Angeles, California, United States of America

9 Human Genetics Group, Genome Institute of Singapore, Singapore

Authors named in bold designate shared co-first authorship.

Correspondence should be addressed to:

Kyuyoung Song, Ph.D.
Dept. of Biochemistry and Molecular Biology
Univ. of Ulsan College of Medicine, 88, Olympic-ro, 43-gil Songpa-Gu
Seoul 138-736, Korea
Tel: 822-3010-4277 / Fax: 822-3010-4248
e-mail: [email protected]

Author contributions

K.S. obtained financial support. K.S. conceived of and designed the study and supervised the sample collection, genotyping, data analysis, and interpretation. J.L. and B.H. participated in study design and supervised the data analysis and interpretation. S.-K.Y. participated in study design and supervised the sample selection. S.-K.Y., K.M.K., B.D.Y., K.-J.K., S.H.P., and H.S.L. recruited subjects and participated in the diagnostic evaluation. H.D.S. provided additional control data. D.P.B.M. and I.L. provided advice and critical comments on the manuscript. M.H., H.O., E.K., C.H.L., S.A., S.J., and H.L. performed data analyses. M.H., H.O., S.J., and J.B. prepared DNA samples and performed genotyping. K.S., M.H., and B.H. drafted the manuscript. K.S., B.D.Y., D.P.B.M., and B.H. revised the manuscript.

Funding

This work was supported by a Korean Health Technology R&D Project grant through the Korea Health Industry Development Institute to S.-K. Yang (A120176), funded by the Ministry of Health & Welfare, as well as a Mid-career Researcher Program grant through the National Research Foundation of Korea to K. Song (2014R1A2A1A09005824), funded by the Ministry of Science, Information & Communication Technology and Future Planning, the Republic of Korea.

Conflicts of interest

The authors disclose no conflicts.

Abstract

Recent genome-wide association studies (GWAS) have identified more than 200 regions that affect susceptibility to inflammatory bowel disease (IBD). However, identified common variants account for only a fraction of IBD heritability and have largely been identified in populations of European ancestry. We performed a GWAS of susceptibility loci in Korean individuals, comprising a total of 1,505 IBD patients and 4,041 controls. We identified 2 new susceptibility loci for IBD at genome-wide significance: rs3766920 near PYGO2-SHC1 at 1q21 and rs16953946 in CDYL2 at 16q23. In addition, we confirmed associations, in Koreans, with 28 established IBD loci (P < 2.16 × 10⁻⁴). Our findings support the complementary value of genetic studies in different populations.

Keywords: Crohn’s disease; ulcerative colitis; genetics; risk factor

The incidence and prevalence of the inflammatory bowel diseases (IBD) are rapidly increasing throughout Asia, presumably due to environmental changes.1-4 Several studies have shown that demographic and phenotypic characteristics such as gender ratio, disease location, and clinical course of IBD differ between Asians and Europeans, emphasizing the necessity of studying both genetic and environmental influences in diverse populations. Despite a number of IBD genome-wide association studies (GWAS) in East Asians,5-8 large-scale European ancestry studies,9 and a trans-ethnic association study,10 it is clear that additional studies are needed to expand our understanding of the genetic architecture of IBD.

We performed a GWAS on IBD in Korean subjects, including 1,505 cases (Crohn’s disease [CD]: 922; ulcerative colitis [UC]: 583) and 4,041 controls for the discovery stage, and an additional 1,989 cases (CD: 993; UC: 996) and 3,491 controls for the replication cohorts (Supplementary Table 1). All patients were diagnosed with IBD according to standard criteria. After stringent quality control measures were applied (Supplementary Materials and Methods), we analyzed 522,285 SNPs in the discovery groups using a trend test. The quantile-quantile plots appeared normal (Figure 1). The genomic inflation factor λGC was 1.106 (1.006 when a linear mixed model was applied). The discovery GWAS identified 9 loci achieving stringent genome-wide significance (P < 5 × 10⁻⁸) (Figure 1), including 2 novel loci (PYGO2-SHC1, NCF4-CSF2RB) and 7 established IBD risk loci (TNFSF15-TNFSF8, MHC, TBC1D1-KLF3, STAT3-STAT5B-STAT5A, TNFRSF6B, SMNDC1-DUSP5, GPR35). Additionally, 58 loci showed suggestive associations (P < 1 × 10⁻⁴). To validate these 2 new and 58 suggestive loci, we selected the most significant SNP for each locus and examined its association with IBD in the replication cohort. In total, we identified 2 novel IBD loci with genome-wide significance in the combined discovery and replication analyses: rs3766920 in the PYGO2-SHC1 region at 1q21 (combined P = 1.96 × 10⁻⁹) and rs16953946 in the CDYL2 region at 16q23 (combined P = 2.48 × 10⁻⁸) (Table 1 and Figure 1). The detailed results of the combined analysis of all samples are shown in Supplementary Table 2. A likelihood-modeling approach showed that the newly identified loci were associated with both CD and UC. Conditional analysis within each locus in the discovery dataset did not show any additional independent signals. The two loci contain plausible candidate genes: using knock-out mice, Shc1 was shown to play an important role in later stages of thymic T cell development and in peripheral T cell-dependent events,11 and a common variant (rs4613079, r² < 0.2) in CDYL2 is associated with interleukin (IL)-10 levels in African Americans.12 We investigated why these 2 new loci were not identified as significant in European populations. rs3766920 is monomorphic in Europeans. rs16953946 is polymorphic in Europeans, but when we examined the local linkage disequilibrium (LD) structure for rs16953946, LD was not consistently lower in Europeans (r²EUR < r²ASN for 11 out of 17 SNPs, differential LD P > 0.3), indicating that factors other than frequency or LD differences may be causing the heterogeneity in effect size.

Using our Korean data, we examined the 200 previously established ‘European’ IBD-associated loci (231 independent SNPs).10 In our population, data from 178 SNPs in 161 loci were available. Of these, only 26 SNPs from 25 loci were replicated at the significance threshold of 2.16 × 10⁻⁴, probably because of the limited sample size: our cohort was estimated to have > 80% power at only 4 loci (Supplementary Table 3). An additional 53 SNPs in 46 loci did not reach the Bonferroni threshold but showed nominal P < 0.05. For all 178 SNPs, a comparison of their effect sizes between Asians and Europeans showed positive correlations in the direction of effects for both CD and UC (Supplementary Figure 1; r² = 0.40 and P = 3.56 × 10⁻²¹ for CD; r² = 0.53 and P = 1.10 × 10⁻³⁰ for UC), consistent with a previous study.10 Three loci first identified in Asians but not replicated in Europeans,10 suggesting they are Asian-specific, were replicated in the current study (rs11195128 at 10q25, rs11235604 at 11q13, and rs7335629 at 13q14).

To estimate the percentage of phenotypic variance explained by each of the CD or UC risk alleles, we used the 2 novel loci (2 SNPs) and 28 confirmed loci (29 SNPs) from this study as well as the previously reported CD or UC risk loci (4 and 6, respectively; Supplementary Materials and Methods), totaling 29 SNPs for CD and 29 SNPs for UC (18 SNPs included in both the CD and UC totals) (Supplementary Table 4). Considering a CD prevalence of 0.0112% in the Korean population, the 29 loci accounted for 6.65% of the total CD genetic risk. For UC, the 29 loci accounted for 5.47% of the total genetic risk based on a UC prevalence of 0.0309%. When the genetic relationship between CD and UC was estimated by fitting a bivariate linear mixed model, the genetic correlation of the risk between CD and UC was 0.47 (standard error, 0.08). This estimate was lower than that in Caucasian data (0.62; standard error, 0.042),14 perhaps due to differences in IBD phenotypes between Europeans and Asians.

We have identified two novel loci which potentially further implicate alterations in T cell biology and the IL10 pathway in IBD pathogenesis. In addition, we have replicated 29 SNPs from 28 previously reported loci. Collectively, our findings suggest distinct as well as common pathways associated with IBD in European ancestry and Asian populations.

Web resources

The URLs for data presented herein are as follows:

The 1000 Genomes Project, http://www.1000genomes.org/
UCSC Genome Browser, http://genome.ucsc.edu/
GCTA, cnsgenomics.com/software/gcta/
IIBDGC, www.ibdgenetics.org
Online Mendelian Inheritance in Man (OMIM), http://www.omim.org

Acknowledgements

We would like to thank all the participating patients and healthy donors who provided the DNA and clinical information necessary for this study. This study was provided with biospecimens and data from the Korean Genome Analysis Project (4845-301), the Korean Genome and Epidemiology Study (4845-302), and Korea Biobank Project (4851-307, KBP-2015-000), which were supported by the Korea Centers for Disease Control & Prevention, Republic of Korea. This work was supported by PLSI supercomputing resources of Korea Institute of Science and Technology Information. DPBM is the Joshua L. and Lisa Z. Greer Endowed Chair in IBD Genetics and was supported by DK062413, U54DE023789-01, grant 305479 from the European Union, and The Leona M. and Harry B. Helmsley Charitable Trust. JJ Liu is supported by the Agency for Science, Technology and Research (A*STAR), Singapore.

References

1. Thia KT et al. Am J Gastroenterol 2008;103:3167-3182.
2. Yang SK et al. Inflamm Bowel Dis 2008;14:542-549.
3. Ng SC et al. Gastroenterology 2013;145:158-162.e2.
4. Park SJ et al. World J Gastroenterol 2014;20:11525-11537.
5. Yang S-K, Hong M, Zhao W, et al. Inflamm Bowel Dis 2013;19:954-966.
6. Yang S-K, Hong M, Zhao W, et al. Gut 2014;63:80-87.
7. Yamazaki K et al. Gastroenterology 2013;144:781-788.
8. Fuyuno Y et al. J Gastroenterol DOI: 10.1007/s00535-015-1135-3.
9. Jostins L, Ripke S, et al. Nature 2012;491:119-124.
10. Liu JZ, van Sommeren S, et al. Nat Genet 2015;47:979-986.
11. Buckley MW et al. J Immunol 2015;194:1665-1676.
12. Ayele FT, Doumatey A, et al. Immunogenetics 2012;64:351-359.
13. Chen G-B et al. Hum Mol Genet 2014;23:4710-4720.

Authors named in bold designate shared co-first authorship.

Figure legends

Figure 1. The results of GWAS in IBD in the discovery stage. (A) Quantile-quantile plot: -log10 P values were plotted against the expected null distribution. Black dots indicate whole genome-wide statistics using the trend test (522,285 SNPs, λGC = 1.106); red dots indicate whole genome-wide statistics using the linear mixed model of GEMMA (522,285 SNPs, λGC = 1.006). (B) Manhattan plot: SNPs located in novel loci are colored blue and those reaching the genome-wide significance level (blue line, P < 5 × 10⁻⁸) are colored red. Three loci that did not retain genome-wide significance under the linear mixed model approach are underlined. (C-D) Regional association plots of (C) rs3766920 at 1q21 and (D) rs16953946 at 16q23. SNPs are plotted according to their chromosomal positions (NCBI Build 37) with -log10 P values from the GWAS in the region flanking 750 kb on either side of the marker SNP. Circles indicate genotyped SNPs and squares indicate imputed SNPs. The most strongly associated SNP in the discovery stage is shown as a small purple circle. LD (r² values) between the lead SNP and the other SNPs is indicated using colors. The relative location of the annotated genes and the direction of transcription are shown in the lower portion of the figure. The estimated recombination rates of the Asian samples from the 1000 Genomes Project (Nov 2014) are plotted to reflect the local LD structure. Plots were generated using LocusZoom (http://csg.sph.umich.edu/locuszoom/).

Table 1. Two novel risk loci associated with inflammatory bowel disease in Koreans.

Chr: 1q21; SNP: rs3766920; Position (hg19): 154,934,963; Candidate gene(s): PYGO2, SHC1; Risk allele: A

Study        No. of cases  No. of controls  RAF in cases  RAF in controls  OR (95% CI)       P               PGEMMA          P_BD∆  LR phenotype§
GWAS         1,505         4,039            0.11          0.08             1.48 (1.29-1.70)  3.56 × 10⁻⁸*    2.48 × 10⁻⁹†
Replication  1,977         3,485            0.09          0.08             1.24 (1.08-1.43)  0.002*
Combined     3,482         7,524            0.10          0.08             1.35 (1.22-1.49)  1.96 × 10⁻⁹**   3.06 × 10⁻¹⁰‡   0.084  IBD_U

Chr: 16q23; SNP: rs16953946; Position (hg19): 80,785,448; Candidate gene(s): CDYL2; Risk allele: C

Study        No. of cases  No. of controls  RAF in cases  RAF in controls  OR (95% CI)       P               PGEMMA          P_BD∆  LR phenotype§
GWAS         1,505         4,040            0.20          0.17             1.26 (1.13-1.40)  1.85 × 10⁻⁵*    4.68 × 10⁻⁶†
Replication  1,981         3,485            0.20          0.17             1.20 (1.09-1.33)  2.90 × 10⁻⁴*
Combined     3,486         7,525            0.20          0.17             1.23 (1.14-1.32)  2.48 × 10⁻⁸**   8.01 × 10⁻⁹‡    0.484  IBD_U

Chr, chromosome; CI, confidence interval; IBD, inflammatory bowel disease; OR, odds ratio; Position, chromosome position; RAF, risk allele frequency.
* P value was calculated using the Cochran-Armitage trend test.
** The combined P value was calculated using the Cochran-Mantel-Haenszel test.
† P value was calculated using the z-score from genome-wide efficient mixed-model association (GEMMA).
‡ The combined P value for GEMMA was calculated using the weighted z-score meta-analysis.
∆ P_BD was calculated using the Breslow-Day test.
§ The most strongly associated phenotype was assigned using likelihood ratio modeling. IBD unsaturated (IBD_U) loci were associated with both Crohn's disease and ulcerative colitis with no evidence of differences in effect size.
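The phenotype assignment in the last footnote follows the decision rule described in the Supplementary Materials and Methods (CD, UC, and IBD likelihood modeling). The sketch below illustrates only that rule, given already-maximized log likelihoods for the four models. The function names, the model labels, and the use of 1 degree of freedom for each likelihood-ratio test are assumptions made for illustration, and the log likelihood values in the usage lines are hypothetical.

```python
import math


def chi2_sf_df1(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(x, 0.0) / 2.0))


def classify_locus(loglik: dict, alpha: float = 0.05) -> str:
    """Assign a phenotype label from maximized log likelihoods.

    loglik maps 'CD', 'UC', 'IBD_U' (constrained models) and 'IBD_S'
    (saturated model) to their log likelihoods. Each constrained model
    is assumed to differ from the saturated model by one free parameter.
    """
    constrained = ("CD", "UC", "IBD_U")
    p_values = {
        model: chi2_sf_df1(2.0 * (loglik["IBD_S"] - loglik[model]))
        for model in constrained
    }
    # All three constrained models rejected: both diseases, different effects.
    if all(p < alpha for p in p_values.values()):
        return "IBD_S"
    # Otherwise pick the constrained model with the largest likelihood.
    return max(constrained, key=lambda model: loglik[model])


# Hypothetical log likelihoods, for illustration only.
same_effect = {"CD": -1010.0, "UC": -1012.0, "IBD_U": -1000.5, "IBD_S": -1000.0}
different_effect = {"CD": -1010.0, "UC": -1012.0, "IBD_U": -1005.0, "IBD_S": -1000.0}
print(classify_locus(same_effect))
print(classify_locus(different_effect))
```

In the first hypothetical case the unsaturated model is not rejected and has the largest constrained likelihood, which corresponds to the IBD_U label reported for both loci in Table 1.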


Supplementary Materials and Methods

The supplementary materials include 1 figure and 4 tables.

Materials and Methods

Study subjects

A total of 3,494 IBD patients (1,915 with CD and 1,579 with UC) and 7,532 unrelated healthy controls were included in the study. The clinical characteristics of the study subjects are shown in Supplementary Table 1. All study participants were of Korean descent. All the IBD patients were recruited from the IBD Clinic of Asan Medical Center. IBD was diagnosed based on standard clinical, radiologic, endoscopic, and histopathologic criteria.1 Patients with indeterminate colitis were excluded from the study. CD patients were sub-grouped according to the Montreal classification with minor modifications,2 that is, according to age at diagnosis (A1: ≤16 years; A2: 17-40 years; A3: ≥40 years), disease location (L1, terminal ileum; L2, colon; L3, ileocolon), and disease behavior (B1, inflammatory; B2, stricturing; B3, penetrating). Upper gastrointestinal and perianal disease modifiers of the Montreal classification system were not included in our classification scheme. UC patients were classified into 3 categories based on the Montreal classification,2 as ulcerative proctitis, left-sided colitis, and extensive colitis, according to the maximal endoscopic extent.

To identify risk loci associated with IBD in Koreans and to increase the power of discovery analysis by GWAS, we combined samples from 915 individuals with IBD (532 with CD, 383 with UC) and 745 control individuals used in our previously published GWAS, along with an additional set of 590 individuals with IBD (390 with CD, 200 with UC) and 3,296 controls from the population data of the Korea National Institute of Health. Details of the previous and current discovery GWAS panels, independent replication panel, and genotyping platforms used are provided in Supplementary Table 1. The replication cohort consisted of 1,989 individuals with IBD (993 with CD, 996 with UC) and 3,491 healthy controls (2,496 controls from the Korea National Institute of Health and 995 controls provided by H.-D. Shin).

Genotyping and quality control in expanded GWAS analyses

A total of 595 individuals with IBD were newly genotyped using Illumina OmniExpressExome arrays. For controls, we combined publicly available data of 800 individuals that were enrolled in our previously published GWAS with data from an additional 3,351 population control subjects. All the control subjects were genotyped using the same Illumina Omni1-Quad arrays. For the IBD cases, we combined samples from 538 individuals with CD and 397 individuals with UC that were enrolled in our previously published GWAS with samples from an additional 595 individuals with IBD (395 with CD and 200 with UC). The previously enrolled CD and UC cases were genotyped using Illumina OmniExpress arrays, whereas the IBD cases newly genotyped in this study were genotyped using Illumina OmniExpressExome arrays. In total, the discovery cohort consisted of 1,530 IBD cases and 4,151 controls.

Quality control was conducted on each dataset separately by using a common approach. All the SNPs on the X, Y, and mitochondrial chromosomes as well as copy number variation-related SNPs were excluded. As a part of quality control, SNPs were excluded if they had a call rate <98%, a minor allele frequency (MAF) <0.01, or significant deviation from Hardy-Weinberg equilibrium in controls (P < 1 × 10⁻⁵). Similarly, we removed all samples with a genotyping rate <96% from further analysis. We examined potential genetic relatedness of the 5,681 samples based on identity by descent analysis using PLINK 1.9 software (http://pngu.mgh.harvard.edu/~purcell/plink/). For each pair of first- or second-degree related samples (PI_HAT >0.2, IBS >0.8), the sample with the lower genotype call rate was removed. Thus, a total of 106 samples were removed due to sample duplications and/or genetic relatedness (15 cases and 91 controls).

We then combined the 2 sets of IBD cases and 2 sets of controls. Quality control was performed again in the combined set of samples. In the combined set, SNPs with a call rate <98%, a Hardy-Weinberg equilibrium P value <1.0 × 10⁻⁵ in the controls, a MAF <0.01, or differential missingness between cases and controls (P < 1.0 × 10⁻⁵) were excluded from the analysis. An additional 25 samples were removed due to sample duplications and/or genetic relatedness (9 cases and 16 controls). After these SNP and sample quality control analyses were conducted, the genotype data of 522,285 SNPs in 1,506 cases and 4,044 controls were available for GWAS analysis.
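The SNP-level exclusion criteria for the combined dataset can be expressed as a simple filter. The following is a minimal sketch, assuming that per-SNP summary statistics have already been computed (for example, exported from PLINK); the `SnpStats` container, its field names, and the example SNP identifiers and values are hypothetical and are not taken from the study data.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Per-SNP summary statistics (hypothetical container for illustration)."""
    snp_id: str
    call_rate: float        # fraction of samples with a genotype call
    maf: float              # minor allele frequency
    hwe_p_controls: float   # Hardy-Weinberg equilibrium P value in controls
    diff_missing_p: float   # case/control differential missingness P value


def passes_qc(snp: SnpStats,
              min_call_rate: float = 0.98,
              min_maf: float = 0.01,
              min_hwe_p: float = 1e-5,
              min_diff_missing_p: float = 1e-5) -> bool:
    """Return True if the SNP survives all four exclusion criteria in the text."""
    return (snp.call_rate >= min_call_rate
            and snp.maf >= min_maf
            and snp.hwe_p_controls >= min_hwe_p
            and snp.diff_missing_p >= min_diff_missing_p)


# Hypothetical SNPs, for illustration only.
snps = [
    SnpStats("snp_a", 0.995, 0.12, 0.40, 0.75),   # passes
    SnpStats("snp_b", 0.970, 0.30, 0.60, 0.90),   # fails call rate
    SnpStats("snp_c", 0.999, 0.005, 0.80, 0.50),  # fails MAF
    SnpStats("snp_d", 0.990, 0.25, 1e-7, 0.20),   # fails HWE in controls
]
kept = [s.snp_id for s in snps if passes_qc(s)]
print(kept)
```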

Subsequently, we used principal-component analysis (PCA)-based methods to detect population outliers and stratification with the software package EIGENSTRAT 3.0.3 First, 5,550 samples (1,506 individuals with IBD and 4,044 control individuals) were analyzed along with the 194 reference samples from the International HapMap Project. Four population outliers (1 case and 3 controls) were identified and removed. After these SNP and sample quality control analyses, we finally used the genotype data of 522,285 SNPs in 1,505 cases and 4,041 controls in the GWAS analysis.
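The sample counts reported across the quality control steps can be checked with simple bookkeeping. The sketch below only replays the numbers stated in the text above; the step labels are paraphrases chosen for illustration.

```python
# Starting counts from the text: previously genotyped plus newly genotyped samples.
cases = 538 + 397 + 595      # CD and UC cases from the earlier GWAS, plus new IBD cases
controls = 800 + 3351        # earlier GWAS controls plus population controls

# (step label, cases removed, controls removed), as reported in the text.
removals = [
    ("duplicates/relatedness, per-dataset QC", 15, 91),
    ("duplicates/relatedness, combined QC", 9, 16),
    ("PCA population outliers", 1, 3),
]

history = [("initial discovery cohort", cases, controls)]
for label, removed_cases, removed_controls in removals:
    cases -= removed_cases
    controls -= removed_controls
    history.append((label, cases, controls))

for label, n_cases, n_controls in history:
    print(f"{label:42s} cases={n_cases:5d} controls={n_controls:5d}")
```

The final line reproduces the 1,505 cases and 4,041 controls used in the GWAS analysis.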

Imputation

We imputed 5,546 samples (1,505 individuals with IBD and 4,041 control subjects) by using the genotyped SNPs that had passed the quality control criteria described above. Imputation was performed by using IMPUTE version 2.0 (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html)4 and the Asian reference data (JPT + CHB) from the 1000 Genomes Project database (Feb 2012 release) (http://www.1000genomes.org/). SNPs with low imputation quality (a posterior probability score of <80%), MAF <1%, or a missing rate of >10% were excluded from further analysis. A total of 4,148,401 imputed SNPs passed quality control and were combined with 522,285 genotyped SNPs for association analysis. Note that the imputed SNPs were not included in either the discovery or the replication analyses, to avoid a large multiple testing burden. The imputed data were only used for examining the association patterns in the regional plots of loci identified during the discovery phase and for conditional analysis.

SNP selection for replication

Following genome-wide discovery analysis, the SNPs with P < 10⁻⁴ were selected for replication in an independent cohort. With an aim to identify novel susceptibility loci, we chose regions with >1 SNP showing evidence of association and selected only SNPs in low linkage disequilibrium (LD) with any known IBD risk variants reported in previous GWAS. A total of 60 SNPs from the 60 loci with association P < 10⁻⁴ were selected for validation in an independent set of 1,989 IBD cases and 3,491 controls (the replication cohort).

Genotyping of the replication cohort was conducted by using either the Sequenom iPLEX system (Analytical Genetics Technology Centre, Princess Margaret Hospital/University Health Network in Toronto, Canada) or TaqMan genotyping technology (7900HT Fast Real-Time PCR System, Applied Biosystems) according to the manufacturer’s instructions. The validation of rs3766920 and rs4821558 was performed using TaqMan SNP genotyping assays. We excluded SNPs with a call rate of <97.5% or a deviation from the Hardy-Weinberg equilibrium (P < 0.05). For rs7335629, for which replication was not successful in our samples because of an assay failure, we performed a meta-analysis of the Japanese CD discovery analysis results and our discovery analysis results.

Statistical analysis

Genome-wide association analyses were conducted for IBD. Association tests were performed using the Cochran-Armitage trend test. The quantile-quantile plot was generated using R (3.2.0) (http://www.r-project.org/) to evaluate the overall significance of the genome-wide associations and the potential impact of population stratification. The impact of population stratification was also evaluated by calculating the genomic control inflation factor (λGC=1.106) and the genomic inflation factor for 1000 cases and 1000 controls (λGC1000=1.048). Since the polygenic architecture and LD to true causal variants can influence λGC, we also evaluated λGC after stringent LD pruning (r²<0.1, λGC=1.059 after pruning). In addition, we evaluated λGC after including 10 principal components in the logistic regression (λGC=1.102). The Manhattan plot of −log10 P was generated by using R (3.2.0) (http://www.r-project.org/). Conditional regression analysis was performed to identify the possible independent associations at genome-wide significant loci. The Cochran-Armitage trend tests were used to examine the genotype-phenotype associations in the replication samples. The final analysis of the combined GWAS and replication samples was performed using the Cochran-Mantel-Haenszel test. Moreover, Breslow-Day tests were performed to evaluate the significance of heterogeneity between the odds ratios (ORs) for samples from different cohorts.
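Two of the quantities in this section can be illustrated with a short sketch: the Cochran-Armitage trend statistic for a single SNP and the rescaling of λGC to 1000 cases and 1000 controls. The additive genotype weights (0, 1, 2), the helper names, and the genotype counts in the usage lines are assumptions made for illustration, and the rescaling uses the commonly applied formula rather than one stated in the text; it does, however, reproduce the reported λGC1000 from the reported λGC and sample sizes.

```python
import math


def trend_test(case_counts, control_counts, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x 3 genotype table.

    case_counts and control_counts hold the numbers of subjects carrying
    0, 1, and 2 copies of the tested allele. Returns (chi-square, P value),
    where the statistic has 1 degree of freedom.
    """
    n_cases, n_controls = sum(case_counts), sum(control_counts)
    n_total = n_cases + n_controls
    col = [r + s for r, s in zip(case_counts, control_counts)]
    t = sum(w * (r * n_controls - s * n_cases)
            for w, r, s in zip(weights, case_counts, control_counts))
    inner = sum(w * w * c * (n_total - c) for w, c in zip(weights, col))
    for i in range(len(col)):
        for j in range(i + 1, len(col)):
            inner -= 2 * weights[i] * weights[j] * col[i] * col[j]
    variance = n_cases * n_controls / n_total * inner
    chi2 = t * t / variance if variance > 0 else 0.0
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square upper tail, 1 df
    return chi2, p_value


def lambda_1000(lambda_gc, n_cases, n_controls):
    """Rescale the genomic inflation factor to 1000 cases and 1000 controls."""
    scale = (1 / n_cases + 1 / n_controls) / (1 / 1000 + 1 / 1000)
    return 1 + (lambda_gc - 1) * scale


# Hypothetical genotype counts, for illustration only.
chi2, p = trend_test([10, 20, 30], [30, 20, 10])
print(f"trend chi-square = {chi2:.2f}, P = {p:.2e}")

# Values reported in the text: lambda_GC = 1.106 with 1,505 cases and 4,041 controls.
print(f"lambda_1000 = {lambda_1000(1.106, 1505, 4041):.3f}")
```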

We used the genome-wide significance threshold defined as P < 5 × 10⁻⁸ throughout this study. To replicate the IBD loci previously reported in European populations, we applied the significance threshold of P < 2.16 × 10⁻⁴ (0.05/231, where 231 is the number of independent SNPs from 200 loci known to be involved in IBD in European populations) and examined whether any of the associations to CD, UC, or IBD exceeded this threshold in our dataset. Power analysis of our GWAS samples for detecting the 200 previously reported IBD susceptibility loci in the European populations was performed by using Quanto software version 1.2.4 (http://hydra.usc.edu/gxe/). For each reported SNP, the power for detection at a nominal P value of 2.16 × 10⁻⁴ (0.05/231) was calculated based on the reported OR and the allele frequency in the Korean population (from the current study). We also compared the LD structure of the 2 novel loci between Asian and European populations by using the 1000 Genomes database (Feb 2012 release) (http://www.1000genomes.org/).
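The replication threshold is a Bonferroni correction over the 231 independent SNPs, and the arithmetic can be verified directly. In the sketch below, the helper name `is_replicated` and the P values passed to it are hypothetical.

```python
N_INDEPENDENT_SNPS = 231   # independent SNPs from the 200 European IBD loci
ALPHA = 0.05

threshold = ALPHA / N_INDEPENDENT_SNPS
print(f"Bonferroni threshold = {threshold:.2e}")


def is_replicated(p_value: float) -> bool:
    """True if an association P value passes the Bonferroni threshold."""
    return p_value < threshold


# Hypothetical P values, for illustration only.
print(is_replicated(1.0e-5), is_replicated(3.0e-3))
```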

In addition, as a fine-tuning step to confirm the associations, we applied a linear mixed model approach (GEMMA)5 to the discovery dataset. Using the P values based on z-scores obtained from GEMMA, the resulting genomic inflation factor was 1.006 (compared with 1.106 for the trend test), suggesting that the original inflation may partly reflect cryptic relatedness between samples. We then recalculated combined P values using a weighted z-score strategy. This did not change our main results; the P values of the two novel loci became even more significant. After applying GEMMA, the significance of 24 SNPs at replicated European loci also changed. We added GEMMA results to all our tables except Supplementary Table 4.
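A weighted z-score combination of discovery and replication results can be sketched as follows. The text does not specify the weights, so the square root of the per-study sample size is used here as an assumption (a common choice); the function name and the z-scores and sample sizes in the usage line are hypothetical.

```python
import math


def combine_z(z_scores, sample_sizes):
    """Weighted z-score meta-analysis.

    Weights are the square roots of the per-study sample sizes
    (an assumption; the weighting scheme is not given in the text).
    Returns the combined z-score and its two-sided P value.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    z = (sum(w * zi for w, zi in zip(weights, z_scores))
         / math.sqrt(sum(w * w for w in weights)))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


# Hypothetical inputs: two studies of equal size, each with z = 2.
z, p = combine_z([2.0, 2.0], [100, 100])
print(f"combined z = {z:.4f}, P = {p:.3e}")
```

With equal weights, two concordant studies with z = 2 combine to z = 2√2, which shows how consistent evidence across cohorts strengthens the combined signal.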

CD, UC, and IBD likelihood modeling

We classified the association signals into 4 categories by using the same approach applied by Jostins et al.6 Four multinomial logistic regression models with parameters βCD and βUC were fitted with the following constraints: (1) CD-specific model: βUC = 0, βCD fitted by maximum likelihood, (2) UC-specific model: βCD = 0, βUC fitted by maximum likelihood, (3) IBD unsaturated (same effect size) model: βCD = βUC = βIBD, βIBD fitted by maximum likelihood, and (4) IBD saturated (different effect size) model: βCD and βUC both fitted by maximum likelihood. The log likelihoods were calculated for each model, and 3 likelihood-ratio tests were conducted by comparing models 1-3 against the IBD saturated model. If all the 3 tests showed a P value < 0.05, then the SNP was classified as associated with both CD and UC, but with evidence of different effect sizes. Otherwise, of the 3 constrained models, the SNP was classified based on the model with the largest likelihood. If IBD unsaturated was the best-fitting model, the locus could be interpreted as being associated with both CD and UC without evidence of different effect sizes.

Variance explained

The fraction of genetic variance explained by each of the CD or UC risk alleles was estimated by using the algorithm developed by So et al,7 under a liability threshold model.8 This model proposes a latent continuous liability, which is assumed to be normally distributed with a mean of 0 and variance of 1. We assumed a CD prevalence of 0.0112% and a UC prevalence of 0.0309% in the Korean population.9 To calculate the heritability explained by a single variant, we used the allele frequency, OR, and disease prevalence. To estimate the genetic variance, we included the previously reported Korean risk loci for CD or UC in addition to the novel and confirmed loci in the present study (Supplementary Table 4). The CD or UC risk loci included the loci classified as CD or UC through CD, UC, and IBD likelihood modeling, loci with association P < 2.16 × 10⁻⁴ from the separate association analysis of CD or UC by using common controls, and loci with association P < 2.16 × 10⁻⁴ from the association analysis of IBD.

Genetic relationship between CD and UC

The proportion of genetic variation shared between CD and UC was estimated by using the bivariate linear mixed-effects model implemented in the genome-wide complex trait analysis tool (GCTA).10 In the bivariate analysis, controls were split in proportion to the numbers of the CD and UC cases.

Ethics

This study was approved by the Institutional Review Board of the Asan Medical Center. All samples were obtained with written informed consent under Institutional Review Board-approved protocols in each center.

Supplementary References

1. Lennard-Jones JE. Scand J Gastroenterol Suppl 1989;170:2-6;discussion 16-19.
2. Silverberg MS et al. Can J Gastroenterol 2005;19 Suppl A:5-36.
3. Price AL et al. Nat Genet 2006;38:904-909.
4. Howie BN et al. PLoS Genet 2009;5:e1000529.
5. Zhou X et al. Nat Genet 2012;44:821-824.
6. Jostins L, Ripke S, et al. Nature 2012;491:119-124.
7. So HC et al. Genet Epidemiol 2011;35:310-317.
8. Falconer DS. Ann Hum Genet 1965;29:51-76.
9. Yang SK et al. Inflamm Bowel Dis 2008;14:542-549.
10. Lee SH et al. Bioinformatics 2012;28:2540-2542.


Supplementary Figure 1. Comparison of odds ratios for the 178 reported Crohn’s disease and ulcerative colitis risk variants in Europeans and Koreans (based on the minor allele in Europeans). Each dot represents the OR (on a log scale) of one SNP for CD or UC. Color denotes the range of the association P value for CD or UC in Koreans. The red line shows the linear regression fit, weighted by the inverse of the variance of the log(OR) in Koreans. The correlation coefficient and P value of the linear model are shown in the bottom-right corner.
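The inverse-variance weighting mentioned in the legend of Supplementary Figure 1 can be sketched as follows. This is an illustrative example, not the script used for the figure: it assumes that the European log(OR) is the predictor and the Korean log(OR) is the response, and the function name and inputs are hypothetical.

```python
import math


def weighted_linear_fit(x, y, variances):
    """Weighted least-squares line y = intercept + slope * x with weights 1/variance.

    x: European log(OR) per SNP (assumed predictor)
    y: Korean log(OR) per SNP (assumed response)
    variances: variance of each Korean log(OR); must be positive
    Returns (slope, intercept, weighted correlation coefficient).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)

    # Weighted means of predictor and response.
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total

    # Weighted sums of squares and cross-products.
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    syy = sum(w * (yi - y_bar) ** 2 for w, yi in zip(weights, y))
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r
```

With this weighting, SNPs whose Korean effect estimates are imprecise (large variance) pull the fitted line less than precisely estimated ones.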


Supplementary Table 1. Clinical characteristics of the study subjects and study cohort description.

GWAS
                              IBD            CD             UC             Control
No. of samples                1,505          922            583            4,041
Male (%)                      976 (64.9)     651 (70.6)     325 (55.7)     1,602 (39.6)
Mean age at sampling (yr)     31.1 ± 13.6    25.5 ± 9.1     40.1 ± 14.7    NA
Mean age at diagnosis (yr)    27.6 ± 12.6    22.3 ± 8.2     36.0 ± 13.8
Age group at diagnosis (%)
  ≤ 16                        302 (20.1)     249 (27.0)     53 (9.1)       NA
  17∼40                       917 (60.9)     634 (68.8)     283 (48.5)     NA
  ≥ 40                        286 (19.0)     39 (4.2)       247 (42.4)     NA

Replication
                              IBD            CD             UC             Control
No. of samples                1,989          993            996            3,491
Male (%)                      1,275 (64.1)   723 (72.8)     552 (55.4)     1,771 (50.7)
Mean age at sampling (yr)     34.9 ± 13.6    28.5 ± 9.4     41.3 ± 14.2    NA
Mean age at diagnosis (yr)    31.1 ± 12.8    25.4 ± 8.7     36.8 ± 13.7
Age group at diagnosis (%)
  ≤ 16                        114 (5.7)      80 (8.1)       34 (3.4)       NA
  17∼40                       1,419 (71.3)   841 (84.7)     578 (58.0)     NA
  ≥ 40                        456 (22.9)     72 (7.3)       384 (38.6)     NA

Total
                              IBD            CD             UC             Control
No. of samples                3,494          1,915          1,579          7,532
Male (%)                      2,251 (64.4)   1,374 (71.7)   877 (55.5)     3,373 (44.8)
Mean age at sampling (yr)     33.3 ± 13.7    27.0 ± 9.4     40.8 ± 14.4    NA
Mean age at diagnosis (yr)    29.6 ± 12.9    23.9 ± 8.6     36.5 ± 13.7
Age group at diagnosis (%)
  ≤ 16                        416 (11.9)     329 (17.2)     87 (5.5)       NA
  17∼40                       2,336 (66.9)   1,475 (77.0)   861 (54.5)     NA
  ≥ 40                        742 (21.2)     111 (5.8)      631 (40.0)     NA

Disease characteristics
                              GWAS           Replication    Total
Location (CD), no. (%)
  Ileum                       161 (19.4)     228 (23.0)     389 (21.4)
  Colon                       45 (5.4)       36 (3.6)       81 (4.5)
  Ileocolon                   623 (75.2)     726 (73.3)     1,349 (74.2)
  NA                          93             3              96
Extent (UC), no. (%)
  Proctitis                   160 (29.4)     260 (26.3)     420 (27.4)
  Left-sided colitis          176 (32.4)     321 (32.5)     497 (32.4)
  Extensive colitis           208 (38.2)     407 (41.2)     615 (40.1)
  NA                          39             8              47
Behavior (CD), no. (%)
  Inflammatory                325 (39.2)     453 (45.6)     778 (42.7)
  Stricturing                 164 (19.8)     186 (18.7)     350 (19.2)
  Penetrating                 341 (41.1)     354 (35.6)     695 (38.1)
  NA                          92                            92
Perianal fistula (CD), no. (%)
  No                          329 (39.6)     456 (45.9)     785 (43.1)
  Yes                         501 (60.4)     537 (54.1)     1,038 (56.9)
  NA                          92                            92

Study cohorts
                              IBD            CD             UC             Control
GWAS set 1
  Illumina chip               OmniExpress12v1, Omni1-Quad
  No. of samples              915            532            383            745
GWAS set 2
  Illumina chip               HumanOmniExpressExome8v1, Omni1-Quad
  No. of samples              590            390            200            3,296
Replication set
  Genotyping technology       Sequenom iPLEX, TaqMan SNP genotyping assay
  No. of samples              1,989 cases, 3,491 controls*

CD, Crohn's disease; IBD, inflammatory bowel disease; UC, ulcerative colitis; KNIH, Korea National Institute of Health.
All patients with IBD were recruited at the Asan Medical Center in Seoul, Korea. The samples from GWAS set 1 had been used in a previous GWAS on CD and UC separately (ref 5, 6).
*The 3,296 controls in GWAS set 2 and 2,496 controls in the replication set were obtained from the National Biobank of Korea, supported by the Ministry of Health, Welfare and Family Affairs. An additional 995 controls in the replication set were provided by Prof. H-D Shin.


Supplementary Table 2. Results from IBD GWAS, replication, and combined association analyses.

Position is given in hg19; Allele is the risk allele; Case and Ctrl are the risk allele frequencies (RAF) in cases and in controls; the OR columns under IBD GWAS and Replication cohort are logistic ORs; Meta P is the weighted z-score meta P. Sample sizes are given as cases/controls.

                                        IBD GWAS (1,505/4,041)                              Replication cohort (1,989/3,491)         Combined (3,494/7,532)
SNP           Chr  Position     Allele  Case  Ctrl  OR    95% CI       P trend*  P GEMMA**  Case  Ctrl  OR    95% CI       P trend*  OR    P CMH†    Meta P‡   P_BD∆
rs3766920¶    1    154,934,963  A       0.11  0.08  1.48  (1.29-1.70)  3.56E-08  2.48E-09   0.09  0.08  1.24  (1.08-1.43)  2.23E-03  1.35  1.96E-09  3.06E-10  8.35E-02
rs4821558     22   37,308,785   C       0.39  0.34  1.28  (1.17-1.39)  3.81E-08  2.48E-07   0.37  0.35  1.07  (0.99-1.16)  1.05E-01  1.16  7.87E-07  2.60E-06  3.66E-03
rs1357766§    12   103,164,639  A       0.90  0.86  1.43  (1.25-1.64)  1.09E-07  1.88E-06
rs691605      8    19,028,649   G       0.26  0.21  1.28  (1.16-1.41)  9.18E-07  4.52E-06   0.23  0.23  0.99  (0.90-1.08)  7.57E-01  1.11  1.85E-03  3.77E-03  1.63E-04
rs11788518    9    78,917,378   T       0.05  0.03  1.64  (1.34-2.00)  1.58E-06  4.01E-05   0.04  0.03  1.19  (0.97-1.47)  1.03E-01  1.39  7.35E-06  6.56E-05  3.02E-02
rs11105898    12   91,372,294   A       0.58  0.53  1.23  (1.13-1.34)  2.48E-06  2.55E-05   0.55  0.54  1.02  (0.94-1.11)  6.03E-01  1.11  3.01E-04  1.16E-03  2.26E-03
rs11716740    3    182,831,688  T       0.19  0.15  1.30  (1.16-1.45)  2.63E-06  3.16E-05   0.17  0.16  1.04  (0.94-1.15)  4.65E-01  1.16  1.81E-04  7.54E-04  3.84E-03
rs7335629§    13   41,579,868   C       0.27  0.23  1.26  (1.14-1.38)  3.48E-06  3.03E-05
rs4658504     1    243,041,047  T       0.22  0.18  1.27  (1.15-1.41)  4.43E-06  1.81E-05   0.18  0.18  1.03  (0.94-1.14)  5.22E-01  1.14  2.60E-04  7.00E-04  4.33E-03
rs4762274§    12   96,539,484   G       0.38  0.34  1.23  (1.12-1.34)  4.93E-06  3.47E-05
rs4686912     3    187,702,211  G       0.28  0.24  1.24  (1.13-1.37)  5.61E-06  9.98E-05   0.25  0.25  1.00  (0.91-1.09)  9.73E-01  1.11  2.05E-03  8.64E-03  8.98E-04
rs7811408     7    985,840      G       0.69  0.64  1.23  (1.12-1.35)  6.76E-06  4.63E-05   0.67  0.66  1.06  (0.97-1.15)  1.99E-01  1.13  6.37E-05  2.03E-04  1.57E-02
rs3731257     9    21,966,221   G       0.51  0.46  1.21  (1.11-1.31)  9.01E-06  1.26E-04   0.50  0.46  1.14  (1.05-1.23)  1.14E-03  1.17  5.45E-08  5.93E-07  3.09E-01
rs4894345     3    139,034,049  T       0.93  0.90  1.44  (1.22-1.69)  9.82E-06  1.46E-04   0.91  0.90  1.13  (0.99-1.29)  7.44E-02  1.25  1.80E-05  9.94E-05  2.94E-02
rs9820724     3    10,807,667   C       0.44  0.39  1.21  (1.11-1.32)  1.13E-05  1.75E-05   0.41  0.40  1.06  (0.98-1.15)  1.64E-01  1.13  5.67E-05  8.03E-05  2.35E-02
rs4456177     10   133,288,281  G       0.76  0.71  1.24  (1.13-1.37)  1.14E-05  8.65E-06   0.72  0.70  1.08  (0.99-1.17)  1.05E-01  1.15  3.37E-05  2.49E-05  2.81E-02
rs1321366     6    117,240,593  A       0.40  0.35  1.21  (1.11-1.32)  1.22E-05  1.89E-04   0.37  0.34  1.09  (1.01-1.18)  3.18E-02  1.15  4.64E-06  3.89E-05  9.30E-02
rs1608912     12   34,008,574   C       0.92  0.90  1.39  (1.20-1.62)  1.43E-05  9.98E-05   0.90  0.91  0.91  (0.80-1.04)  1.58E-01  1.11  4.29E-02  1.06E-01  2.19E-05
rs7033617§    9    118,287,508  A       0.31  0.27  1.22  (1.12-1.34)  1.47E-05  2.05E-05
rs1816854     12   44,205,018   T       0.96  0.94  1.60  (1.29-1.99)  1.53E-05  1.54E-04   0.96  0.95  1.23  (1.02-1.47)  2.92E-02  1.38  6.10E-06  3.02E-05  7.27E-02
rs11666282    19   35,807,845   T       0.41  0.36  1.20  (1.11-1.31)  1.66E-05  9.80E-05   0.37  0.37  1.04  (0.96-1.12)  3.90E-01  1.12  2.84E-04  1.03E-03  1.07E-02
rs16953946¶   16   80,785,448   C       0.20  0.17  1.26  (1.13-1.40)  1.85E-05  4.68E-06   0.20  0.17  1.20  (1.09-1.33)  2.90E-04  1.23  2.48E-08  8.01E-09  4.84E-01
rs10438722    17   4,991,694    T       0.83  0.80  1.27  (1.14-1.42)  1.99E-05  5.04E-05   0.82  0.81  1.05  (0.95-1.16)  3.81E-01  1.15  3.69E-04  6.67E-04  1.15E-02
rs765875§     6    143,185,683  T       0.60  0.55  1.20  (1.10-1.31)  2.30E-05  5.00E-05
rs7973572     12   24,289,366   C       0.09  0.07  1.39  (1.19-1.62)  2.36E-05  2.40E-04   0.08  0.07  1.20  (1.03-1.39)  1.62E-02  1.29  3.12E-06  2.06E-05  1.90E-01
rs713248      4    154,408,454  G       0.39  0.34  1.20  (1.10-1.31)  2.76E-05  1.54E-04   0.37  0.36  1.04  (0.96-1.13)  3.58E-01  1.11  4.07E-04  1.15E-03  1.52E-02
rs1294552     14   90,993,286   A       0.69  0.65  1.21  (1.11-1.33)  3.02E-05  9.96E-05   0.67  0.66  1.02  (0.94-1.11)  6.70E-01  1.10  1.76E-03  3.05E-03  6.01E-03
rs12944955§   17   55,356,341   C       0.03  0.01  1.81  (1.36-2.40)  3.22E-05  2.93E-06
rs6677524     1    208,535,774  T       0.91  0.88  1.34  (1.17-1.54)  3.25E-05  3.50E-05   0.89  0.88  1.17  (1.04-1.33)  1.11E-02  1.25  2.69E-06  2.92E-06  1.58E-01
rs13231772    7    20,104,224   T       0.23  0.19  1.24  (1.12-1.38)  3.29E-05  7.42E-05   0.20  0.20  1.02  (0.93-1.13)  6.84E-01  1.12  1.87E-03  2.72E-03  7.10E-03
rs3762986     5    95,770,862   T       0.59  0.55  1.20  (1.10-1.30)  3.49E-05  8.96E-05   0.57  0.55  1.09  (1.01-1.18)  2.84E-02  1.14  1.04E-05  1.93E-05  1.23E-01
rs194615      6    82,496,434   T       0.66  0.62  1.21  (1.10-1.32)  3.54E-05  3.36E-05   0.62  0.63  0.96  (0.88-1.04)  2.73E-01  1.06  4.98E-02  4.32E-02  1.70E-04
rs3798544     6    34,520,267   A       0.23  0.20  1.24  (1.12-1.37)  3.69E-05  1.28E-03   0.22  0.20  1.17  (1.07-1.29)  9.96E-04  1.20  1.47E-07  4.22E-06  4.72E-01
rs469047      21   27,655,708   T       0.59  0.54  1.19  (1.10-1.30)  3.82E-05  7.42E-05   0.56  0.56  0.98  (0.91-1.07)  7.03E-01  1.08  9.84E-03  1.55E-02  8.79E-04
rs1014866     4    126,412,575  G       0.89  0.86  1.32  (1.15-1.50)  3.85E-05  4.96E-05   0.87  0.88  0.92  (0.82-1.03)  1.57E-01  1.08  7.10E-02  8.38E-02  5.17E-05
rs3138055     14   35,870,454   T       0.45  0.40  1.19  (1.10-1.30)  4.02E-05  1.79E-04   0.41  0.41  1.01  (0.94-1.10)  7.39E-01  1.09  2.37E-03  5.17E-03  5.76E-03
rs4142987§    1    233,843,795  G       0.57  0.53  1.19  (1.10-1.30)  4.36E-05  7.81E-04
rs7725        5    179,727,957  T       0.35  0.31  1.21  (1.10-1.32)  4.51E-05  6.03E-04   0.34  0.33  1.04  (0.96-1.13)  3.93E-01  1.11  7.48E-04  3.06E-03  1.80E-02
rs17498867§   1    182,077,680  A       0.16  0.13  1.28  (1.13-1.43)  5.37E-05  5.39E-04
rs8008999     14   33,555,133   A       0.55  0.51  1.19  (1.09-1.29)  6.08E-05  1.75E-04   0.53  0.52  1.06  (0.98-1.14)  1.69E-01  1.12  1.70E-04  3.65E-04  4.44E-02
rs2637592     4    130,656,378  G       0.10  0.08  1.34  (1.16-1.54)  6.13E-05  1.49E-05   0.09  0.09  0.95  (0.82-1.08)  4.17E-01  1.11  3.45E-02  1.85E-02  5.92E-04
rs2633748     3    6,785,430    A       0.12  0.10  1.31  (1.15-1.50)  6.29E-05  3.86E-05   0.10  0.10  1.02  (0.90-1.16)  8.01E-01  1.14  3.95E-03  2.81E-03  7.72E-03
rs9411471     9    136,120,091  A       0.44  0.40  1.19  (1.09-1.30)  6.39E-05  9.01E-05   0.41  0.41  1.03  (0.95-1.11)  5.34E-01  1.10  1.48E-03  1.79E-03  1.37E-02
rs10208223    2    52,073,761   T       0.78  0.75  1.23  (1.11-1.36)  6.46E-05  9.67E-05   0.75  0.76  0.98  (0.89-1.08)  6.71E-01  1.09  1.55E-02  1.90E-02  1.18E-03
rs1199389     6    8,391,116    T       0.63  0.59  1.19  (1.09-1.30)  7.19E-05  2.29E-05   0.61  0.60  1.01  (0.93-1.09)  9.02E-01  1.09  5.44E-03  2.93E-03  5.05E-03
rs9464247     6    55,313,379   G       0.57  0.53  1.19  (1.09-1.29)  7.41E-05  9.55E-04   0.54  0.54  1.03  (0.95-1.11)  5.34E-01  1.10  1.57E-03  6.82E-03  1.29E-02
rs12880418    14   79,169,828   G       0.70  0.66  1.20  (1.10-1.32)  7.48E-05  2.05E-04   0.66  0.67  0.98  (0.91-1.07)  7.10E-01  1.08  1.63E-02  2.40E-02  1.61E-03
rs17164101    7    11,279,164   A       0.63  0.59  1.19  (1.09-1.30)  7.65E-05  1.05E-04   0.59  0.59  0.99  (0.92-1.07)  8.35E-01  1.08  1.09E-02  1.28E-02  2.32E-03
rs35498840    3    160,394,315  G       0.15  0.12  1.27  (1.13-1.44)  7.79E-05  9.30E-04   0.13  0.13  1.08  (0.96-1.21)  2.07E-01  1.17  2.82E-04  1.48E-03  5.10E-02
rs11079881    17   48,057,873   G       0.04  0.03  1.57  (1.25-1.97)  8.21E-05  4.94E-05   0.04  0.03  1.30  (1.05-1.61)  1.80E-02  1.41  1.27E-05  7.04E-06  2.41E-01

Locus Gene(s)# entries in table order, one locus per semicolon-separated group (loci without annotated genes have no entry): PMVK, PBXIP1, PYGO2, SHC1; NCF4, CSF2RB; LOC100128993; PCSK5; EPYC, KERA, LUM, DCN; MCCC1, LAMP3; FOXO1, MIR320D1, MRPS31, SLC25A15, MIR621, ELF1, WBP4, KBTBD6, KBTBD7; ADAP1, COX19, C7orf50, CYP2W1, GPR146, GPER1, ZFAND2A; CDKN2A-AS1, CDKN2A, CDKN2B-AS1, CDKN2B; FAM162B, GPRC6A, RFX6; SYT10, ALG10; DEC1; PUS7L, IRAK4, TWF1, TMEM117, NELL2; FAM187B, LSR, USF2, HAMP, MAG; CDYL2; ZFP3, ZNF232, USP6, ZNF594, SCIMP; HIVEP2; SOX5; KIAA0922; TTC7B; MSI2; ELL2, MIR583, PCSK1; PACSIN1, SPDEF; APP; FAT4; NFKBIA; RASGEF1C, MAPK9, GFPT2; NPAS3; GRM7-AS3; OBP2B, ABO; SLC35B3, LOC100506207, HULC; GFRAL, HMGCLL1; NRXN3; C3orf80, IFT80, SMC4, MIR15B, MIR16-2, TRIM59, KPNA4, SCARNA7, ARL14; DLX3.

Supplementary Table 2. Cont'd.

Position is given in hg19; Allele is the risk allele; Case and Ctrl are the risk allele frequencies (RAF) in cases and in controls; the OR columns under IBD GWAS and Replication cohort are logistic ORs; Meta P is the weighted z-score meta P. Sample sizes are given as cases/controls.

                                        IBD GWAS (1,505/4,041)                              Replication cohort (1,989/3,491)         Combined (3,494/7,532)
SNP           Chr  Position     Allele  Case  Ctrl  OR    95% CI       P trend*  P GEMMA**  Case  Ctrl  OR    95% CI       P trend*  OR    P CMH†    Meta P‡   P_BD∆
rs12648453    4    62,788,111   C       0.11  0.08  1.33  (1.15-1.53)  8.34E-05  3.16E-04   0.10  0.08  1.14  (1.00-1.31)  5.30E-02  1.23  4.16E-05  1.09E-04  1.36E-01
rs10901553    10   127,868,744  A       0.47  0.43  1.19  (1.09-1.29)  8.38E-05  5.11E-04   0.45  0.45  0.99  (0.91-1.07)  7.33E-01  1.07  1.57E-02  3.43E-02  1.89E-03
rs6065§       17   4,836,381    C       0.93  0.91  1.37  (1.17-1.61)  8.55E-05  3.27E-03
rs2771098§    9    93,058,963   G       0.60  0.56  1.19  (1.09-1.29)  8.67E-05  1.45E-04
rs1820707     19   28,692,266   T       0.14  0.12  1.28  (1.13-1.45)  9.14E-05  3.94E-04   0.13  0.13  0.96  (0.86-1.08)  5.47E-01  1.10  2.99E-02  4.85E-02  1.29E-03
rs6796933     3    150,685,372  T       0.56  0.52  1.18  (1.09-1.29)  9.33E-05  3.94E-04   0.52  0.54  0.95  (0.88-1.03)  2.37E-01  1.06  6.74E-02  1.21E-01  2.32E-04
rs2367252     17   69,101,244   T       0.81  0.77  1.23  (1.11-1.37)  9.54E-05  4.54E-04   0.78  0.78  0.97  (0.89-1.07)  5.43E-01  1.08  2.70E-02  5.20E-02  9.37E-04
rs805316§     2    54,133,744   C       0.15  0.12  1.27  (1.13-1.44)  9.90E-05  9.28E-04
rs7034016     9    107,505,262  G       0.87  0.84  1.27  (1.13-1.44)  1.17E-04  3.00E-05   0.84  0.85  0.94  (0.85-1.05)  2.89E-01  1.08  6.85E-02  3.87E-02  2.81E-04
rs7569308     2    1,987,338    A       0.59  0.55  1.18  (1.08-1.28)  1.62E-04  1.68E-03   0.57  0.56  1.04  (0.96-1.12)  3.79E-01  1.10  1.25E-03  5.36E-03  2.89E-02

Locus Gene(s)# by SNP: rs12648453, LPHN3; rs10901553, ADAM12; rs6065§, PLD2, MINK1, CHRNE, C17orf107, GP1BA, SLC25A11, RNF167, PFN1; rs6796933, CLRN1, CLRN1-AS1; rs2367252, CASC17; rs805316§, ASB3, GPR75-ASB3, CHAC2, ERLEC1, MIR3682, GPR75, PSME4; rs7034016, NIPSNAP3A, NIPSNAP3B, ABCA1; rs7569308, MYT1L.

Chr, chromosome; CI, confidence interval; OR, odds ratio; RAF, risk allele frequency.
# Locus was defined by SNPs with r² > 0.4 and the flanking recombination hotspots.
* P value was calculated using the Cochran-Armitage trend test.
** P value was calculated using the z-score from genome-wide efficient mixed-model association (GEMMA).
† The combined P value was calculated using the Cochran-Mantel-Haenszel test.
‡ The combined P value for GEMMA was calculated using the weighted z-score meta-analysis.
∆ P_BD was calculated using the Breslow-Day test.
Ŧ Replacement of the most strongly associated SNP (r² > 0.8).
§ These 11 SNPs failed in the replication assay.
¶ These SNPs exhibited a genome-wide significance level (P < 5.00 × 10⁻⁸).
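Footnote ‡ of Supplementary Table 2 refers to a weighted z-score meta-analysis of the GEMMA results. A minimal sketch of that kind of combination is given below. The weighting by the square root of each cohort's sample size is an assumption (a common convention), not a detail stated in this supplement, and the function name and inputs are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort z-scores into one z-score and a two-sided P value.

    z_scores: signed z-score of the same risk allele in each cohort
    sample_sizes: number of samples per cohort (assumed weights: sqrt(N))
    """
    weights = [sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    z_combined = numerator / denominator
    # Two-sided P value from the standard normal distribution.
    p_value = 2 * NormalDist().cdf(-abs(z_combined))
    return z_combined, p_value
```

Because the z-scores are signed, cohorts with opposite effect directions partly cancel, which is why a SNP with a strong GWAS signal but a reversed replication effect ends up with a weak combined P value.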




Supplementary Table 3. Association results in Koreans for the 231 independent SNPs previously reported for inflammatory bowel disease in Caucasians.

Inflammatory bowel disease

TNFRSF18, TNFRSF4 TNFRSF14 TNFRSF9, ERRFI1, PARK7 PLA2G2A PLA2G2A PLA2G2A

2 rs17229285

199,523,122

2 rs1405108

199,745,018

IL23R IL23R IL23R

EDG1 PTPN22 RORC SLAMF1, CD48 FCGR2A, FCGR2B, FCGR3B, FCGR3A SELP, SELE, SELL TNFSF18, FASLG PTGS2, PLA2G4A PTPRC

IL10, IL19, IL20, FAIM3, IL24, MAPKAPK2, PIGR UCN FOSL2, BRE REL SPRED2 IL1R1, IL1R2, IL1RL2 IL18RAP, IL18R1, IL1RL1, IL1RL2 LY75 IFIH1, DPP4 STAT1, STAT4

A A A A A G G G A C A A A A A A G A A A G A A A A A A A A G A A A G A C A G C A A A C

Combined P 3.28E-11 1.35E-06 8.84E-18 1.13E-07 1.08E-21 3.06E-36 3.64E-10 1.94E-04 1.38E-16 9.48E-134 2.21E-166 8.72E-01 1.27E-05 9.80E-09 1.92E-06 2.15E-04 4.43E-06 1.21E-17 2.62E-05 2.05E-11 8.50E-36 2.51E-08 1.10E-07 2.49E-07 1.26E-15 2.54E-09 3.64E-10 3.20E-44 2.99E-50 1.69E-20 1.03E-14 2.70E-14 2.61E-12 2.60E-36 8.59E-03 5.59E-12 1.63E-16 8.39E-14 2.09E-09 2.61E-08 4.60E-09 3.87E-14 2.33E-02

Caucasian

Ulcerative colitis

Korean

Caucasian


1,247,494 2,502,780 8,022,197 20,135,822 20,142,866 20,171,860 22,702,231 63,049,593 67,652,984 67,681,669 67,708,155 70,995,562 78,623,626 92,554,283 101,466,054 114,303,808 120,437,884 151,801,680 155,878,732 160,856,964 161,479,745 169,519,049 172,853,460 186,875,459 197,631,141 198,598,663 200,101,920 200,877,562 206,939,904 25,097,644 27,730,940 28,614,794 43,806,918 61,204,856 62,551,472 65,667,272 102,662,888 103,063,369 145,492,382 160,794,008 163,110,536 191,931,464 198,871,417

r2

Proxy SNP

Crohn's disease

Korean GWAS GEMMA Power** (0.05/231) P P* 5.97E-02 1.38E-01 0.0008 1.16E-02 3.21E-02 0.0053 5.11E-01 6.82E-01 0.0095 − − − 8.28E-03 3.44E-03 0.0676 1.87E-05 9.77E-05 0.1091 2.90E-03 5.26E-03 0.0172 8.00E-01 9.07E-01 0.0017 7.47E-01 5.26E-01 0.0339 2.69E-04 1.21E-03 0.9731 − − − 2.12E-01 3.83E-01 0.0002 − − − − − − 5.79E-01 5.79E-01 0.0028 − − − 4.39E-01 2.21E-01 0.0009 − − − 3.64E-01 1.15E-01 0.0015 4.62E-01 3.54E-01 0.0144 3.46E-03 2.06E-03 0.1165 − − − 6.13E-01 6.96E-01 0.0015 6.61E-01 8.57E-01 0.0061 2.38E-01 1.66E-01 0.0275 − − − − − − − − − 4.17E-02 5.34E-02 0.0215 1.50E-01 9.11E-02 0.0707 4.30E-01 4.99E-01 0.0283 3.94E-01 3.80E-01 0.0086 − − − 4.42E-01 6.43E-01 0.0036 8.76E-01 8.45E-01 0.0012 1.85E-06 1.15E-06 0.0100 9.51E-02 1.39E-01 0.0282 7.76E-03 3.29E-02 0.0461 − − − 8.79E-01 7.71E-01 0.0068 5.05E-01 6.02E-01 0.0036 1.27E-01 1.46E-01 0.0226 5.56E-01 3.23E-01 0.0010


rs12103 rs6667605 rs3766606 rs10799838 rs3806308‡ rs6426833‡ rs12568930 rs1748195 rs6588248 rs7517847 rs80174646 rs2651244 rs17391694 rs34856868 rs11583043 rs6679677 rs2641348 rs4845604 rs670523 rs4656958 rs1801274‡ rs6025 rs7517810 rs10798069 rs2488389 rs7555082 rs2816958 rs7554511 rs3024505 rs13407913 rs1260326 rs925255 rs10495903 rs7608910 rs10865331 rs6740462‡ rs10185424 rs6708413 rs11681525 rs4664304 rs2111485 rs1517352 rs1440088

A1 A1 Frequency Frequency G 0.18 0.98 G rs4648649‡ 0.93 0.49 0.48 C 0.17 0.07 G 0.23 − G 0.38 0.49 A 0.46 0.23 A rs12728589‡ 0.84 0.16 0.20 C 0.33 0.23 C 0.47 0.37 A 0.44 0.41 C 0.07 − G 0.39 0.11 G 0.12 − G 0.03 − G 0.27 0.13 C 0.10 − A 0.11 0.03 G 0.15 − G 0.33 0.88 G 0.32 0.28 A 0.50 0.24 G 0.03 − G 0.24 0.92 C 0.49 0.44 G 0.21 0.19 G 0.11 − G 0.11 − C 0.28 − G 0.16 0.03 A 0.43 0.46 G 0.41 0.54 G 0.45 0.19 G 0.13 − A 0.39 0.03 G 0.38 0.35 A 0.26 0.15 C 0.46 0.26 A 0.24 0.51 G 0.09 − G 0.44 0.71 G 0.39 0.83 C 0.40 0.48 A 0.19 0.25

A1 A2

Caucasian


GRAIL gene


Position (hg19)

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2

SNP


Chr.

Korean

Combined P 5.63E-07 2.21E-01 4.51E-09 8.34E-01 4.27E-03 2.80E-02 3.52E-02 7.13E-08 1.66E-25 1.38E-159 5.80E-149 8.52E-05 2.62E-09 2.35E-06 1.28E-02 4.67E-17 9.65E-10 1.39E-07 6.03E-09 5.21E-07 9.47E-11 8.14E-05 4.53E-22 4.25E-09 8.59E-15 1.47E-10 2.82E-02 2.09E-27 3.95E-25 9.64E-22 1.74E-21 1.07E-16 3.44E-13 2.95E-23 4.40E-10 1.74E-12 2.76E-09 2.82E-16 4.08E-11 1.11E-06 2.00E-04 1.31E-10 3.38E-01


Caucasian

GWAS P 9.65E-01 6.71E-01 5.63E-01 − 5.70E-01 3.04E-01 3.16E-01 7.48E-01 9.80E-01 2.70E-04 − 2.35E-01 − − 2.10E-01 − 9.97E-01 − 1.91E-01 5.54E-01 3.91E-01 − 1.72E-01 8.83E-01 4.02E-01 − − − 3.82E-01 5.16E-01 9.94E-01 5.46E-01 − 1.84E-01 3.67E-01 1.40E-04 9.03E-01 2.82E-02 − 2.94E-01 4.73E-01 5.16E-01 1.36E-01

GEMMA Power** (0.05/231) P* 6.39E-01 0.0005 8.46E-01 0.0004 5.65E-01 0.0036 − − 4.36E-01 0.0009 4.57E-01 0.0007 3.21E-01 0.0000 7.28E-01 0.0008 8.26E-01 0.0911 1.09E-03 0.9961 − − 3.27E-01 0.0010 − − − − 3.41E-01 0.0005 − − 8.27E-01 0.0016 − − 1.88E-01 0.0030 9.30E-01 0.0043 1.63E-01 0.0097 − − 2.87E-01 0.0090 9.90E-01 0.0105 3.29E-01 0.0262 − − − − − − 4.50E-01 0.0053 2.35E-01 0.0660 8.50E-01 0.0645 7.20E-01 0.0166 − − 1.57E-01 0.0021 4.95E-01 0.0117 6.32E-05 0.0078 7.04E-01 0.0056 7.72E-02 0.0659 − − 3.39E-01 0.0037 2.99E-01 0.0014 2.56E-01 0.0184 5.35E-02 0.0003

Combined P 9.96E-10 3.16E-10 5.50E-14 2.87E-19 5.56E-39 3.77E-76 4.40E-13 8.77E-01 2.66E-03 4.16E-34 4.34E-62 1.60E-06 3.35E-01 5.84E-05 6.05E-08 9.92E-03 1.32E-01 1.20E-18 1.03E-01 2.82E-09 1.43E-41 1.58E-05 4.39E-01 5.59E-03 9.46E-08 1.55E-05 1.14E-18 6.83E-31 4.71E-43 8.81E-08 2.90E-03 6.21E-04 7.46E-05 1.25E-23 1.17E-01 5.76E-05 1.47E-14 1.35E-03 5.75E-03 1.76E-05 2.11E-10 2.10E-09 5.68E-07

Korean GWAS GEMMA Power** LR (0.05/231) phenotype† P P* 1.25E-03 1.03E-03 0.0005 UC 9.43E-05 1.15E-04 0.0076 6.87E-01 7.92E-01 0.0046 − − − UC 9.41E-05 4.45E-05 0.1774 UC 6.85E-10 5.89E-09 0.2537 UC 7.84E-05 1.24E-04 0.0170 3.84E-01 6.36E-01 0.0002 5.94E-01 4.36E-01 0.0011 9.49E-02 9.88E-02 0.0878 − − − 5.13E-01 5.45E-01 0.0010 − − − − − − 5.19E-01 7.07E-01 0.0021 − − − 1.70E-01 1.23E-01 0.0003 − − − 9.19E-01 8.94E-01 0.0003 6.04E-01 5.67E-01 0.0059 UC 5.73E-05 2.43E-04 0.0820 − − − 3.85E-01 4.61E-01 0.0002 5.64E-01 6.21E-01 0.0007 3.21E-01 2.37E-01 0.0033 − − − − − − − − − 1.30E-02 1.22E-02 0.0082 9.09E-02 1.61E-01 0.0044 1.63E-01 2.08E-01 0.0011 4.78E-01 3.25E-01 0.0007 − − − 6.98E-01 5.84E-01 0.0016 3.64E-01 4.70E-01 0.0004 8.55E-04 1.08E-03 0.0013 IBD_U 1.96E-03 2.66E-03 0.0095 6.83E-02 1.50E-01 0.0017 − − − 2.81E-01 4.86E-01 0.0021 8.11E-01 9.57E-01 0.0036 6.56E-02 1.07E-01 0.0055 3.63E-01 3.92E-01 0.0052

G

A

0.50

0.76

7.51E-07

7.45E-01

9.41E-01

0.0032

4.60E-01

1.84E-01

2.38E-01

0.0003

3.38E-14

2.42E-01

1.96E-01

0.0078

C

A

0.34

0.16

3.35E-05

4.92E-01

9.60E-01

0.0013

8.42E-01

7.50E-01

9.86E-01

0.0002

5.86E-10

4.25E-01

6.94E-01

0.0038


Supplementary Table 3. Cont'd (1). Inflammatory bowel disease

2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5 5

rs3116494 rs2382817 rs111781203 rs6716753 rs12994997‡ rs3749171 rs6724516 rs35320439 rs4256159 rs113010081 rs9868809 rs3172494 rs4541435 rs3197999‡ rs9847710 rs616597 rs724016 rs2073505 rs4692386 rs6856616‡ rs7438704 rs2457996 rs13126505 rs3774937 rs2189234 rs7657746 rs11739663 rs2930047‡ rs395157 rs1842076 rs11742570‡ rs1505992 rs10065637 rs4703855 rs10061469 rs1363907 rs10051722 rs11743851 rs17622378 rs254560 rs6863411 rs11741861‡ rs6556412‡ rs9313808

204,592,021 219,151,218 228,660,112 231,097,129 234,173,503 241,569,692 241,586,810 242,737,341 18,767,404 46,457,412 48,681,053 48,731,487 49,228,501 49,721,532 53,062,661 101,569,726 141,105,570 3,444,503 26,132,361 38,325,036 48,363,245 74,856,535 102,865,304 103,434,253 106,075,498 123,161,619 594,083 10,695,526 38,867,732 40,237,018 40,410,584 40,498,577 55,438,851 71,693,899 72,518,148 96,252,803 130,104,076 130,613,600 131,778,452 134,443,606 141,513,204 150,277,909 158,787,385 158,820,844

ICOS, CD28, CTLA4 SLC11A1, IL8RA, IL8RB CCL20

158,827,769

GPR35 GPR35 PDCD1, ATG4B FLJ78302, LTF, CCR1, CCR3, CCR5 IHPK2, UCN2, PFKFB4 USP4 MST1R PRKCD NFKBIZ

TXK IL8, CXCL3, CXCL2, CXCL6, CXCL1, CXCL5, PF4 NFKB1 IL2 DAP OSMR, FYB

PTGER4 IL6ST, IL31RA

IRF1, IL4, IL13, IL5

TNIP1 IL12B IL12B IL12B

G A G G G A G G A G A A A A G A G A A G A G A G A G G G A G A T A A G A C G G A A G A A

A C A A A G A A G A G C G G A C A G G A G A G A C A A A G A G A G G A G A A A G T A G G

A

C

r2

Proxy SNP


5 rs56167332

A1 A2

A1 A1 Frequency Frequency 0.25 0.07 0.40 0.43 0.34 − 0.19 − 0.47 0.69 0.18 − 0.27 − 0.31 − 0.15 0.12 0.11 − 0.10 0.08 0.11 − 0.01 − 0.28 0.07 0.41 0.57 0.23 0.36 0.44 0.35 0.08 0.11 0.41 0.16 0.07 0.23 0.36 0.63 0.11 − 0.06 − 0.33 0.33 0.37 0.36 0.24 0.04 0.23 0.02 0.37 0.74 0.48 0.25 0.28 0.03 0.39 0.83 0.32 − 0.21 − 0.30 0.57 0.33 0.20 0.42 0.27 0.30 0.39 0.37 − 0.42 − 0.40 0.18 0.37 0.32 0.08 0.36 0.33 0.42 0.17 − 0.34 −

Combined P 5.11E-07 1.13E-13 2.16E-10 1.23E-10 2.58E-41 1.57E-16 8.47E-09 1.73E-05 2.06E-10 4.21E-08 8.17E-18 2.90E-08 4.24E-04 1.55E-52 4.85E-03 1.82E-05 5.15E-03 1.46E-07 1.21E-08 9.72E-07 8.92E-07 1.05E-05 1.00E-10 1.07E-05 3.50E-07 1.83E-13 2.04E-02 9.78E-14 2.22E-20 4.15E-15 8.13E-65 6.61E-32 9.56E-08 7.16E-11 1.45E-06 4.87E-15 8.89E-09 1.57E-23 1.08E-42 8.55E-10 2.41E-14 5.50E-31 1.24E-32 1.15E-25

Caucasian

7.17E-50


Ulcerative colitis

Korean

GWAS GEMMA Power** (0.05/231) P P* 6.74E-02 1.30E-01 0.0014 7.45E-03 2.55E-02 0.0281 − − − − − − 4.12E-04 9.52E-05 0.2611 − − − − − − − − − 4.82E-02 2.04E-01 0.0091 − − − 1.15E-02 1.26E-01 0.0312 − − − − − − 2.49E-03 4.11E-02 0.0506 3.78E-01 7.45E-01 0.0013 4.91E-02 1.53E-01 0.0054 1.02E-02 4.70E-02 0.0012 1.83E-03 1.05E-02 0.0115 1.08E-01 2.10E-01 0.0040 4.58E-10 1.24E-09 0.0365 5.72E-01 9.37E-01 0.0056 − − − − − − 9.31E-01 7.82E-01 0.0044 7.77E-04 3.78E-03 0.0047 1.19E-01 1.16E-01 0.0016 6.52E-01 3.29E-01 0.0003 2.97E-05 1.18E-04 0.0164 2.14E-02 1.96E-02 0.0409 7.48E-01 8.95E-01 0.0012 5.40E-05 6.98E-04 0.2973 − − − − − − 2.77E-01 3.71E-01 0.0221 6.34E-01 7.11E-01 0.0030 5.69E-01 2.39E-01 0.0186 9.63E-01 5.52E-01 0.0110 − − − − − − 1.88E-03 8.44E-03 0.0042 2.26E-01 3.20E-01 0.0291 2.12E-04 7.66E-03 0.7973 1.48E-04 4.49E-03 0.1946 − − − − − −

Combined P 4.24E-03 3.79E-09 5.93E-06 1.98E-17 1.37E-77 5.89E-06 3.76E-01 9.89E-10 7.01E-14 3.75E-03 9.90E-10 5.64E-05 8.69E-06 2.05E-33 5.81E-01 5.10E-03 3.36E-06 9.29E-07 2.53E-07 1.56E-05 3.42E-11 4.65E-02 1.57E-14 9.87E-01 2.05E-02 2.39E-09 1.06E-01 1.21E-12 2.36E-16 1.98E-16 3.66E-87 1.51E-49 1.19E-09 3.03E-08 1.08E-10 3.89E-16 3.99E-12 1.40E-31 7.17E-56 6.91E-03 1.87E-14 5.88E-44 5.53E-34 1.95E-18 2.29E-41

GWAS P 1.85E-01 8.25E-03 − − 6.49E-05 − − − 4.00E-02 − 2.29E-02 − − 4.10E-05 3.06E-01 1.56E-01 2.56E-03 5.30E-03 4.59E-01 4.19E-14 8.24E-01 − − 3.36E-01 8.80E-02 9.49E-02 5.26E-01 1.69E-02 8.62E-02 7.22E-01 1.71E-03 − − 3.81E-01 8.13E-01 7.74E-02 8.87E-01 − − 3.97E-02 3.06E-01 1.32E-02 2.15E-04 − −

Korean

Caucasian


GRAIL gene

Crohn's disease

Korean


Position (hg19)

Caucasian


SNP


Chr.

Korean


Caucasian

GEMMA Power** (0.05/231) P* 2.38E-01 0.0005 1.66E-02 0.0082 − − − − 2.89E-05 0.6523 − − − − − − 1.23E-01 0.0171 − − 1.75E-01 0.0067 − − − − 1.29E-03 0.0186 5.82E-01 0.0003 3.26E-01 0.0016 1.14E-02 0.0044 2.61E-02 0.0081 6.34E-01 0.0023 8.80E-14 0.0170 8.43E-01 0.0166 − − − − 5.34E-01 0.0002 1.03E-01 0.0008 8.26E-02 0.0010 3.39E-01 0.0002 3.38E-02 0.0121 1.06E-01 0.0189 8.06E-01 0.0013 5.96E-03 0.5367 − − − − 2.87E-01 0.0106 8.97E-01 0.0076 7.11E-02 0.0302 4.53E-01 0.0269 − − − − 9.88E-02 0.0006 2.75E-01 0.0221 1.53E-01 0.9560 3.26E-03 0.2487 − −

− −

Combined P 1.30E-07 1.54E-08 2.86E-08 1.16E-02 1.12E-02 4.22E-18 8.40E-18 3.33E-01 1.06E-03 9.02E-10 2.92E-13 8.61E-08 2.62E-01 2.24E-37 2.69E-07 9.34E-06 6.75E-01 3.21E-04 4.45E-05 5.51E-05 8.80E-02 3.60E-07 5.57E-04 4.61E-14 1.95E-10 5.27E-10 1.70E-06 1.20E-07 8.77E-12 4.84E-05 6.06E-11 1.44E-04 5.14E-03 2.07E-06 4.41E-01 4.02E-06 8.26E-03 1.28E-05 4.42E-11 3.53E-10 5.92E-07 6.62E-08 4.31E-12 5.32E-13 7.27E-27

GWAS GEMMA Power** LR (0.05/231) phenotype† P P* 1.28E-01 2.29E-01 0.0011 2.09E-01 2.53E-01 0.0043 − − − − − − 3.10E-01 1.84E-01 0.0006 CD − − − − − − − − − 4.19E-01 6.29E-01 0.0010 − − − 1.26E-01 1.84E-01 0.0091 − − − − − − 9.71E-01 6.90E-01 0.0134 CD 8.29E-01 9.44E-01 0.0042 1.06E-01 1.74E-01 0.0047 5.63E-01 8.18E-01 0.0003 5.78E-02 7.19E-02 0.0024 6.16E-02 9.44E-02 0.0010 2.48E-01 2.43E-01 0.0082 CD 4.81E-01 7.87E-01 0.0004 − − − − − − 2.65E-01 2.13E-01 0.0121 2.17E-04 2.08E-03 0.0060 5.79E-01 6.41E-01 0.0009 9.75E-01 6.31E-01 0.0004 IBD_U 2.74E-05 3.98E-05 0.0029 6.73E-02 6.77E-02 0.0063 9.21E-01 8.88E-01 0.0003 2.07E-03 5.32E-03 0.0037 IBD_U − − − − − − 4.39E-01 5.58E-01 0.0033 2.50E-01 3.08E-01 0.0002 1.94E-01 2.50E-01 0.0021 7.90E-01 9.62E-01 0.0006 − − − − − − 4.58E-03 1.12E-02 0.0031 4.28E-01 6.64E-01 0.0027 9.92E-04 4.37E-03 0.0353 IBD_U 6.12E-02 1.65E-01 0.0142 IBD_U − −

− −

− −


Supplementary Table 3. Cont'd (2). Inflammatory bowel disease

rs564349 rs72810983 rs4976646 rs7773324 rs13204048 rs17119 rs6908425 rs9358372 rs71559680 rs116392568 rs113653754 rs1847472 rs7746082 rs3851228 rs2503322 rs13204742 rs6920220 rs12199775 rs7758080 rs212388 rs1819333 rs1182188 rs1077773 rs10486483 rs4722672 rs864745 rs12718244 rs1456896 rs9297145 rs314313 rs7805114 rs4380874 rs38911 rs4728142 rs2538470 rs17057051 rs7011507 rs7015630 rs921720 rs6651252 rs13277237 rs75900472 rs4743820 rs4246905 rs11554257

172,324,978 173,318,254 176,788,570 382,559 3,420,406 14,719,496 20,728,731 20,812,588 21,430,728 31,274,380 32,626,272 90,973,159 106,435,269 111,848,191 127,457,260 128,245,765 138,006,504 143,898,894 149,577,079 159,490,436 167,373,547 2,869,985 17,442,679 26,892,440 27,231,762 28,180,556 50,175,654 50,304,461 98,759,117 100,423,365 107,450,033 107,480,315 116,895,163 128,573,967 148,220,448 27,227,554 49,129,242 90,875,918 126,534,671 129,567,181 130,604,563 4,981,602 93,928,416 117,553,249 117,605,070

DUSP1 DOK3 IRF4, DUSP22

TRAF3IP2, FYN

TNFAIP3 MAP3K7IP2 TAGAP CCR6, RPS6KA2 CARD11 AHR SKAP2

IKZF1 IKZF1 SMURF1 EPO

IRF5 PTK2B RIPK2, NBN TRIB1

JAK2 NFIL3 TNFSF15, TNFSF8 TNFSF15, TNFSF8

G G G G G G A G A G A A C A A A A G G G C G G A G A A G C G C A A A A G A G A G G C G A G

Proxy SNP

A A A A A A G A G A C C G T G C G A A A A rs394522‡ A A G A G G A A A A G G G G A G A G A A A A G A

A1 A1 Frequency Frequency 0.32 0.33 0.31 0.07 0.34 0.32 0.40 0.77 0.39 0.51 0.19 0.09 0.22 0.19 0.37 0.58 0.47 − 0.35 0.47 0.27 0.34 0.34 0.05 0.29 − 0.06 − 0.46 0.40 0.13 − 0.21 − 0.07 0.06 0.27 0.47 0.40 0.66 0.93 0.46 0.39 0.30 0.16 0.48 0.38 0.24 0.09 0.19 0.44 0.50 − 0.41 0.18 0.31 0.56 0.26 0.09 0.30 0.02 0.43 0.32 0.41 0.11 0.46 0.31 0.44 0.12 0.36 0.19 0.31 0.21 0.12 0.23 0.27 0.15 0.38 0.59 0.13 0.04 0.44 0.52 0.35 − 0.30 0.33 0.28 − 0.13 0.17 r2

Combined P 1.54E-07 1.57E-08 3.23E-12 5.84E-09 1.11E-06 2.18E-11 5.41E-13 6.48E-09 2.40E-10 1.55E-18 3.30E-58 6.63E-10 1.29E-20 1.50E-15 3.70E-05 5.39E-10 1.43E-14 3.53E-07 1.15E-07 1.93E-07 2.72E-15 1.08E-09 6.37E-07 2.11E-07 4.55E-03 4.53E-05 3.35E-14 3.48E-13 9.94E-11 3.92E-10 2.50E-11 1.53E-15 2.63E-04 5.51E-07 3.00E-11 5.50E-08 2.03E-08 2.90E-08 3.23E-16 9.08E-10 1.17E-07 4.70E-48 3.80E-09 2.86E-29 3.93E-21

Caucasian


Korean


5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 7 7 7 7 7 7 7 7 7 7 7 7 7 7 8 8 8 8 8 8 9 9 9 9

A1 A2

Crohn's disease

Korean GWAS GEMMA Power** (0.05/231) P P* 1.91E-02 2.99E-02 0.0080 2.00E-01 3.40E-01 0.0015 5.70E-02 2.92E-02 0.0225 3.88E-01 5.42E-01 0.0068 7.73E-01 8.22E-01 0.0062 4.54E-01 3.45E-01 0.0046 2.35E-02 4.43E-02 0.0143 4.47E-02 4.05E-02 0.0092 − − − 1.67E-01 5.55E-02 0.6065 1.43E-01 6.17E-01 0.9907 1.36E-01 2.92E-02 0.0015 − − − − − − 1.27E-01 3.65E-01 0.0029 − − − − − − 3.96E-03 5.90E-03 0.0053 2.77E-01 2.87E-01 0.0097 8.24E-01 8.65E-01 0.0079 1.28E-06 3.69E-06 0.0303 7.67E-01 5.54E-01 0.0040 9.72E-01 6.16E-01 0.0056 8.33E-03 2.30E-02 0.0018 4.86E-01 8.00E-01 0.0027 − − − 3.63E-02 2.29E-02 0.0109 1.85E-01 2.29E-01 0.0395 5.95E-01 5.40E-01 0.0027 7.41E-02 1.31E-01 0.0006 4.58E-01 6.28E-01 0.0092 1.05E-01 5.16E-02 0.0054 9.87E-01 4.79E-01 0.0024 2.32E-01 2.43E-01 0.0016 4.32E-01 3.36E-01 0.0073 4.91E-02 5.43E-02 0.0057 7.63E-03 3.93E-02 0.0189 5.60E-02 2.44E-02 0.0037 8.00E-01 8.66E-01 0.0383 5.88E-01 3.81E-01 0.0022 1.55E-02 1.21E-02 0.0097 − − − 5.88E-01 4.01E-01 0.0095 − − − 8.78E-01 7.58E-01 0.1198


GRAIL gene

Caucasian


Position (hg19)


SNP


Chr.

Korean

Combined P 4.60E-05 2.24E-11 3.47E-08 1.06E-09 2.89E-08 1.94E-08 4.81E-12 1.05E-10 1.78E-15 6.78E-32 2.18E-13 1.09E-10 3.18E-23 2.76E-08 5.55E-08 3.24E-15 4.12E-03 4.68E-06 7.27E-09 1.80E-16 1.62E-19 5.30E-02 1.23E-02 1.55E-11 5.43E-01 1.82E-09 4.62E-11 1.03E-13 4.05E-08 2.90E-08 1.37E-01 4.09E-02 1.23E-01 3.38E-01 1.05E-09 1.54E-06 1.34E-04 9.00E-10 1.12E-21 3.86E-16 3.52E-05 4.18E-34 3.47E-04 6.81E-23 1.23E-16


Caucasian

GWAS P 2.02E-02 7.52E-02 9.53E-02 2.69E-01 5.79E-02 6.27E-01 8.19E-02 1.72E-01 − 2.04E-01 4.31E-02 9.77E-02 − − 1.00E-01 − − 2.61E-02 1.23E-01 3.95E-01 2.41E-08 9.80E-01 9.96E-01 7.64E-04 1.99E-01 − 1.27E-02 8.00E-02 6.63E-01 1.81E-01 3.93E-02 5.12E-01 5.75E-01 4.36E-01 1.67E-01 3.03E-01 2.47E-01 1.15E-02 5.68E-01 3.63E-01 1.24E-02 − 3.37E-01 − 6.30E-01

GEMMA Power** (0.05/231) P* 3.63E-02 0.0025 1.58E-01 0.0026 8.64E-02 0.0068 4.00E-01 0.0101 4.28E-01 0.0109 4.44E-01 0.0026 2.03E-01 0.0166 1.78E-01 0.0128 − − 1.98E-02 0.8196 9.09E-02 0.4813 1.97E-02 0.0013 − − − − 5.59E-01 0.0057 − − − − 2.63E-02 0.0039 1.23E-01 0.0136 3.79E-01 0.0356 2.21E-07 0.0536 8.21E-01 0.0004 8.05E-01 0.0009 1.64E-03 0.0044 3.64E-01 0.0003 − − 2.78E-02 0.0056 2.03E-01 0.0301 8.02E-01 0.0023 1.09E-01 0.0005 6.21E-02 0.0005 3.20E-01 0.0004 2.48E-01 0.0004 5.61E-01 0.0002 8.44E-02 0.0060 3.37E-01 0.0031 4.63E-01 0.0057 3.27E-02 0.0051 5.24E-01 0.0713 2.64E-01 0.0046 4.11E-03 0.0029 − − 3.83E-01 0.0028 − − 4.60E-01 0.0700

Ulcerative colitis Caucasian Combined P 2.08E-06 1.77E-02 2.52E-09 2.10E-05 1.29E-02 1.41E-05 4.49E-06 3.60E-03 7.71E-02 3.10E-03 9.82E-87 7.30E-04 1.75E-07 1.13E-14 1.18E-01 3.18E-02 4.78E-22 1.23E-03 1.78E-02 4.13E-01 5.49E-04 5.03E-15 5.96E-09 4.90E-02 5.38E-06 7.57E-01 1.41E-08 1.01E-04 3.51E-07 3.77E-06 3.89E-21 6.43E-25 6.32E-06 1.92E-14 2.71E-07 8.32E-05 6.40E-08 2.00E-02 4.01E-04 1.67E-01 1.50E-05 1.04E-28 4.05E-09 1.04E-15 7.02E-12

Korean GWAS GEMMA Power** LR (0.05/231) phenotype† P P* 2.70E-01 2.76E-01 0.0024 9.53E-01 9.76E-01 0.0003 2.35E-01 1.16E-01 0.0055 9.37E-01 9.58E-01 0.0013 5.03E-02 1.01E-01 0.0007 4.97E-01 5.43E-01 0.0012 9.04E-02 6.95E-02 0.0025 7.89E-02 8.95E-02 0.0010 − − − 4.36E-01 7.81E-01 0.0959 UC 1.87E-07 5.72E-05 0.9926 6.51E-01 5.19E-01 0.0004 − − − − − − 5.88E-01 4.97E-01 0.0004 − − − − − − 3.40E-02 4.06E-02 0.0010 9.19E-01 8.29E-01 0.0006 4.73E-01 4.53E-01 0.0003 2.11E-01 1.62E-01 0.0013 CD 5.82E-01 6.12E-01 0.0063 9.56E-01 6.86E-01 0.0049 8.12E-01 9.79E-01 0.0003 6.57E-01 5.14E-01 0.0066 − − − 6.56E-01 4.14E-01 0.0021 9.61E-01 7.58E-01 0.0020 7.14E-01 7.49E-01 0.0011 1.46E-01 2.14E-01 0.0003 1.59E-01 1.49E-01 0.0234 4.69E-02 4.46E-02 0.0094 4.44E-01 7.69E-01 0.0016 2.15E-03 2.38E-03 0.0036 6.71E-01 5.95E-01 0.0023 3.50E-02 2.49E-02 0.0012 1.55E-03 4.15E-03 0.0105 9.56E-01 8.03E-01 0.0004 7.67E-01 6.05E-01 0.0011 8.24E-01 9.84E-01 0.0003 3.15E-01 4.05E-01 0.0028 − − − 7.60E-01 6.94E-01 0.0070 − − − 3.69E-01 5.50E-01 0.0140

Supplementary Table 3. Cont'd (3). Inflammatory bowel disease

rs13300483‡ rs10781499 rs13300218 rs12722515 rs1042058 rs11010067 rs1199103 rs10995235 rs10761659 rs224090‡ rs2227551 rs1250546 rs7097656 rs12778642 rs4409764‡ rs3740415 rs907611 rs11229555 rs11230563 rs174537 rs559928 rs568617 rs2155219 rs483905 rs561722 rs566416 rs7954567 rs11054935 rs12422544 rs4768236 rs11168249 rs7134472 rs653178 rs11064881 rs17085007‡ rs915286 rs17061048 rs941823 rs9525625 rs3764147 rs3742130 rs194749 rs1569328 rs8005161 rs16967103

117,643,362 139,266,405 139,399,641 6,081,230 30,728,101 35,295,431 59,947,231 64,369,749 64,445,564 64,541,319 75,669,190 81,032,532 82,250,831 94,464,307 101,284,237 104,232,716 1,874,072 58,408,687 60,776,209 61,552,680 64,150,370 65,653,242 76,299,194 96,023,427 114,386,830 118,759,610 6,491,125 12,648,843 40,528,432 40,756,472 48,208,368 68,499,986 112,007,756 120,146,925 27,531,267 40,695,992 40,833,012 41,013,977 43,018,030 44,457,925 99,907,341 69,273,905 75,741,751 88,472,595 38,899,190

TNFSF15, TNFSF8 CARD9 CARD9 IL2RA, IL15RA MAP3K8 CREM

PLAU

NFKB2 CNTF CD5, GPR44, CD6 RPS6KA4 RELA, FOSL1, SIPA1

CXCR5 CD27, TNFRSF1A, LTBR DUSP16

RAPGEF3, SENP1 IL22, IFNG, IL26 SH2B3

TNFSF11 EBI2 FOS GPR65, GALC RASGRP1, SPRED1

A A A A A G G A A A C G A A A G A A A A A A A A A C A G G C G A G A G G T A A G A G A A G

Proxy SNP

G G G C G C A G G G A A G C C A G C G C G G C G G A G A A A A G A G A A A G G A G A G rs1063169‡ G A

A1 A1 Frequency Frequency 0.24 0.34 0.43 0.28 0.10 − 0.16 0.10 0.40 0.58 0.35 0.27 0.22 0.40 0.17 0.22 0.46 0.22 0.41 0.57 0.27 0.53 0.42 0.51 0.20 0.02 0.43 0.74 0.48 0.48 0.47 0.75 0.32 − 0.25 0.22 0.35 0.19 0.33 0.32 0.19 0.13 0.19 0.43 0.50 0.52 0.29 0.26 0.34 0.63 0.24 0.07 0.33 − 0.28 0.09 0.02 0.04 0.33 0.58 0.47 0.06 0.39 − 0.49 − 0.07 − 0.18 0.18 0.45 0.77 0.05 0.01 0.25 0.12 0.47 0.81 0.23 0.35 0.22 0.06 0.23 0.31 0.99 0.17 0.27 0.09 0.13 0.19 0.03 r2

Combined P 6.33E-16 1.71E-53 6.58E-20 4.57E-12 1.29E-10 6.37E-25 3.22E-10 1.87E-09 4.97E-53 1.32E-21 1.28E-09 1.29E-14 1.27E-15 3.26E-07 1.16E-61 1.03E-07 1.59E-08 5.23E-12 1.71E-14 1.03E-08 3.33E-13 1.69E-05 8.71E-48 7.15E-08 1.62E-09 1.83E-03 4.11E-05 5.17E-06 2.13E-21 1.76E-15 1.57E-05 5.64E-27 1.11E-08 5.95E-08 7.52E-11 2.89E-09 4.66E-09 6.19E-13 7.38E-06 8.70E-16 9.19E-13 5.25E-07 3.21E-09 1.37E-16 3.59E-07

Caucasian

Korean

9 9 9 10 10 10 10 10 10 10 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 13 13 13 13 13 13 13 14 14 14 15

A1 A2

Crohn's disease

Korean GWAS GEMMA Power** (0.05/231) P P* 2.39E-27 1.54E-21 0.0585 1.89E-02 5.86E-02 0.3685 − − − 5.24E-01 2.25E-01 0.0081 3.58E-01 3.54E-01 0.0118 1.31E-01 1.32E-01 0.0668 1.15E-01 1.06E-01 0.0362 1.18E-01 8.34E-02 0.0143 3.89E-01 3.71E-01 0.1987 4.55E-05 1.16E-03 0.0669 8.94E-02 1.46E-01 0.0227 1.48E-01 2.50E-01 0.0403 6.99E-01 5.83E-01 0.0010 5.57E-01 5.14E-01 0.0034 1.61E-04 1.07E-03 0.5690 4.06E-01 6.06E-01 0.0040 − − − 9.65E-02 1.44E-01 0.0177 7.07E-01 6.10E-01 0.0143 8.00E-02 6.34E-02 0.0078 7.45E-02 1.39E-01 0.0124 8.52E-01 7.28E-01 0.0095 7.93E-02 1.27E-01 0.4061 8.46E-01 6.67E-01 0.0063 2.94E-01 1.41E-01 0.0109 1.86E-01 3.98E-01 0.0007 − − − 5.96E-01 5.10E-01 0.0012 6.47E-01 5.19E-01 0.2359 4.32E-01 4.78E-01 0.0270 1.86E-02 1.80E-02 0.0006 − − − − − − − − − 1.02E-03 1.17E-03 0.0168 2.20E-02 7.30E-02 0.0068 1.77E-01 2.21E-01 0.0011 1.96E-03 1.74E-03 0.0071 5.58E-01 4.24E-01 0.0025 1.76E-01 2.74E-01 0.0601 1.35E-02 1.63E-02 0.0026 8.82E-01 7.68E-01 0.0076 1.67E-05 3.28E-05 0.0211 2.13E-03 3.54E-03 0.0760 4.07E-01 3.82E-01 0.0008

GRAIL gene

Caucasian

Position (hg19)

SNP

Chr.

Korean

Combined P 1.70E-21 8.20E-43 7.32E-16 8.02E-13 1.60E-09 1.38E-26 5.32E-11 2.02E-03 1.40E-49 3.85E-26 4.72E-13 3.07E-19 3.89E-18 7.84E-04 6.06E-47 1.86E-04 2.59E-03 9.93E-07 4.23E-11 1.92E-12 3.75E-10 1.75E-08 6.47E-46 5.03E-02 2.16E-01 2.09E-05 1.30E-09 2.69E-02 3.78E-25 4.05E-21 1.71E-01 2.23E-06 7.18E-08 1.62E-05 5.65E-02 2.59E-08 1.02E-07 1.23E-05 1.41E-09 3.17E-24 1.16E-10 1.28E-07 6.47E-11 8.88E-14 2.98E-11

Caucasian

GWAS P 6.50E-43 3.91E-01 − 3.83E-01 7.25E-01 4.05E-01 2.63E-01 5.74E-02 2.00E-02 1.16E-06 1.20E-01 3.22E-02 4.18E-01 6.85E-01 1.40E-03 6.90E-01 − 4.12E-01 8.96E-01 4.66E-03 1.01E-01 2.27E-01 8.02E-01 5.24E-01 4.64E-01 3.33E-01 − 7.36E-01 1.97E-01 7.37E-01 8.38E-02 − − − 9.00E-01 9.50E-03 2.76E-01 1.79E-02 6.18E-01 6.18E-03 1.90E-01 3.19E-01 1.35E-03 5.00E-02 3.70E-01

GEMMA Power** (0.05/231) P* 5.78E-33 0.1028 6.43E-01 0.2237 − − 3.30E-01 0.0120 6.58E-01 0.0105 3.47E-01 0.0799 2.39E-01 0.0433 5.37E-02 0.0018 5.22E-02 0.2103 1.47E-05 0.1146 1.85E-01 0.0305 1.35E-01 0.0486 5.43E-01 0.0012 8.15E-01 0.0012 6.18E-03 0.3702 8.50E-01 0.0023 − − 3.33E-01 0.0054 4.71E-01 0.0071 1.25E-02 0.0167 2.58E-01 0.0063 2.40E-01 0.0206 9.91E-01 0.3649 6.77E-01 0.0007 4.59E-01 0.0005 8.25E-01 0.0010 − − 9.47E-01 0.0004 1.83E-01 0.3049 9.29E-01 0.0611 1.00E-01 0.0003 − − − − − − 9.29E-01 0.0006 7.26E-02 0.0037 2.80E-01 0.0008 8.71E-03 0.0017 5.44E-01 0.0054 8.13E-03 0.1387 2.27E-01 0.0022 4.93E-01 0.0105 3.78E-03 0.0324 4.64E-02 0.0569 2.07E-01 0.0014

Ulcerative colitis Caucasian Combined P 3.54E-04 4.03E-26 2.48E-10 6.47E-05 1.84E-05 2.05E-08 4.69E-04 2.72E-11 1.50E-20 9.22E-06 1.02E-02 1.01E-03 7.40E-05 1.99E-06 2.13E-36 3.03E-07 1.00E-08 1.21E-08 1.90E-08 2.95E-02 6.34E-08 1.45E-01 7.53E-21 3.16E-10 4.19E-20 1.41E-01 9.91E-01 5.95E-08 3.23E-06 8.85E-03 7.40E-07 5.76E-37 6.15E-05 1.61E-07 9.72E-17 1.09E-04 6.71E-05 1.39E-13 3.73E-01 3.13E-03 6.19E-07 9.02E-03 2.61E-04 2.80E-09 5.83E-02

Korean GWAS GEMMA Power** LR (0.05/231) phenotype† P P* 2.41E-01 1.53E-01 0.0016 CD 2.53E-03 6.33E-03 0.0368 − − − 9.84E-01 9.29E-01 0.0013 2.49E-01 2.84E-01 0.0019 1.17E-01 1.15E-01 0.0047 1.91E-01 2.07E-01 0.0019 7.83E-01 7.17E-01 0.0150 1.35E-01 5.26E-02 0.0143 4.00E-01 7.27E-01 0.0027 CD 3.38E-01 3.90E-01 0.0012 8.14E-01 5.28E-01 0.0012 7.00E-01 9.93E-01 0.0003 1.16E-01 1.31E-01 0.0019 1.43E-02 3.25E-02 0.1162 IBD_U 3.46E-01 3.65E-01 0.0023 − − − 6.51E-02 1.20E-01 0.0045 6.24E-01 8.44E-01 0.0025 5.45E-01 9.28E-01 0.0006 3.28E-01 2.93E-01 0.0024 2.13E-01 2.78E-01 0.0004 5.82E-03 6.65E-03 0.0396 6.26E-01 8.06E-01 0.0066 3.73E-01 2.69E-01 0.0436 2.93E-01 3.37E-01 0.0003 − − − 1.67E-01 1.87E-01 0.0014 3.97E-01 6.36E-01 0.0086 3.43E-01 3.40E-01 0.0006 5.18E-02 3.07E-02 0.0006 − − − − − − − − − UC 2.06E-08 1.15E-08 0.0202 5.01E-01 4.23E-01 0.0013 3.55E-01 4.66E-01 0.0004 2.10E-02 2.19E-02 0.0042 8.97E-02 4.07E-02 0.0002 2.30E-01 1.68E-01 0.0010 9.31E-03 9.66E-03 0.0008 2.98E-01 2.74E-01 0.0009 7.56E-04 5.82E-04 0.0024 IBD_U 3.79E-03 5.25E-03 0.0122 7.68E-01 9.02E-01 0.0003

Supplementary Table 3. Cont'd (4). Inflammatory bowel disease

SMAD3 CRTC3 SOCS1 LITAF IL27 ITGAL NOD2 NOD2 NOD2 IRF8 NOS2A, LGALS9 CCL13, CCL2, CCL11, CCL1, CCL7 IKZF3 STAT3, STAT5B, STAT5A

PTPN2 SMAD7 CD226, DOK6 NFATC1 TYK2, ICAM1, ICAM3 TYK2, ICAM1, ICAM3 CEBPG PTGIR SPHK2, DBP, IZUMO1 NLRP2, NLRP7 HCK PROCR HNF4A CD40, MMP9 CEBPB, PTPN1, TMEM189-UBE2V1 TNFRSF6B

G A C A A A G A A G D A C G G A G A G C G G G G G A A C A A A C A G A G C G G G A A A A A

Proxy SNP

A G A C C G A G G C I C G A A G A rs6503695‡ G A A A A A A A G G A G G C A G A C A A A A A G G G G G

A1 A1 Frequency Frequency 0.26 − 0.24 0.02 0.18 0.17 0.20 0.04 0.48 0.39 0.42 0.39 0.46 0.34 0.47 0.03 0.04 − 0.01 0.02 0.98 − 0.22 0.19 0.08 − 0.41 0.70 0.27 0.63 0.47 0.31 0.86 0.42 0.33 0.36 0.18 0.44 0.64 0.19 0.15 0.21 − 0.16 0.14 0.38 0.55 0.19 0.19 0.48 0.28 0.15 − 0.22 0.19 0.09 − 0.20 0.40 0.28 − 0.30 0.04 0.40 0.34 0.47 − 0.42 0.38 0.44 − 0.41 0.89 0.45 0.32 0.48 0.04 0.25 0.35 0.33 − 0.46 0.09 0.31 0.67 0.29 0.14 0.41 0.31 0.27 0.20 r2

IL10RB, IFNAR1, IFNGR2, IFNAR2

Combined P 2.10E-05 2.71E-20 3.54E-07 1.54E-10 1.09E-14 3.10E-08 1.23E-21 8.53E-07 − − − 1.22E-03 9.21E-14 2.77E-08 4.35E-20 2.16E-39 1.03E-21 7.70E-10 9.89E-13 3.19E-11 7.30E-05 1.37E-27 1.01E-10 5.47E-04 4.40E-08 1.45E-08 1.12E-18 4.13E-16 5.27E-20 1.21E-14 1.72E-08 2.10E-06 1.15E-13 5.53E-06 8.57E-07 2.49E-07 6.87E-04 9.04E-06 8.32E-11 5.35E-11 6.93E-12 2.12E-20 1.62E-27 4.82E-09 8.25E-48

GWAS P − 9.74E-01 2.47E-01 6.06E-01 6.16E-03 6.60E-02 3.37E-02 5.70E-01 − 5.64E-01 − 4.63E-01 − 2.62E-03 2.66E-02 7.73E-04 1.57E-09 4.62E-02 6.41E-03 7.35E-02 − 2.32E-02 1.72E-03 1.07E-01 4.13E-02 − 9.29E-03 − 3.20E-01 − 9.30E-01 2.52E-01 − 3.48E-01 − 2.82E-01 9.16E-01 7.34E-01 1.14E-03 − 1.90E-01 1.75E-01 2.14E-02 6.70E-01 2.14E-01

Caucasian

Korean

A1 A2

Crohn's disease

Korean GEMMA Power** (0.05/231) P* − − 9.45E-01 0.0014 3.94E-01 0.0044 9.57E-01 0.0016 1.36E-02 0.0266 9.16E-02 0.0057 2.03E-01 0.0585 3.77E-01 0.0005 − − 7.57E-01 0.2651 − − 4.51E-01 0.0016 − − 3.34E-03 0.0048 2.07E-02 0.0969 1.18E-03 0.2053 4.55E-08 0.0595 1.12E-02 0.0047 1.90E-02 0.0144 4.51E-02 0.0100 − − 3.05E-02 0.0865 3.73E-03 0.0121 1.40E-01 0.0015 1.76E-01 0.0069 − − 5.36E-03 0.0404 − − 2.29E-01 0.1479 − − 9.67E-01 0.0009 3.42E-01 0.0045 − − 2.48E-01 0.0049 − − 1.02E-01 0.0014 9.06E-01 0.0011 5.21E-01 0.0005 1.12E-02 0.0245 − − 1.60E-01 0.0027 3.07E-01 0.0875 3.19E-03 0.0321 5.73E-01 0.0089 1.68E-01 0.2290

41,563,950 67,442,596 91,181,489 11,373,405 11,704,651 23,864,590 28,517,709 30,482,494 50,745,926 50,756,540 50,763,781 68,591,230 86,009,686 25,843,643 32,593,665 37,912,377 40,527,544 54,880,993 57,963,537 70,642,923 76,737,118 12,809,340 46,395,022 56,879,827 67,530,439 77,220,616 1,124,031 10,469,975 10,512,911 33,731,551 46,849,806 47,119,910 49,206,172 55,380,214 30,849,517 31,349,908 33,799,280 43,068,996 44,740,196 48,955,424 57,824,309 62,348,907 16,817,938 34,776,695 40,465,534

GRAIL gene

Caucasian

rs28374715 rs17293632 rs7165170 rs423674 rs11641184 rs7404095 rs26528 rs11150589 rs2066844 rs2066845 rs5743293 rs1728785 rs2361755 rs2945412 rs3091315 rs12946510 rs12942547 rs3853824 rs1292053 rs17780256 rs17736589 rs1893217 rs7240004 rs9319943 rs727088 rs7236492 rs2024092 rs12720356 rs11879191 rs17694108 rs4802307 rs11083840 rs516246 rs17771967 rs4243971 rs6087990 rs6088765 rs4812833 rs6074022‡ rs913678 rs259964 rs6062504 rs2823286 rs2284553 rs2836878

Position (hg19)

15 15 15 16 16 16 16 16 16 16 16 16 16 17 17 17 17 17 17 17 17 18 18 18 18 18 19 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 21 21 21

SNP

Chr.

Korean

Combined P 1.70E-01 3.70E-20 6.83E-07 1.65E-14 3.08E-10 5.16E-04 1.29E-22 5.57E-02 − − − 8.82E-01 6.06E-14 4.51E-20 8.00E-25 1.97E-24 1.01E-16 1.17E-10 1.75E-14 4.46E-04 3.46E-01 5.50E-25 1.89E-05 9.05E-07 8.04E-05 9.09E-09 7.12E-25 3.39E-11 4.81E-20 3.29E-09 9.04E-12 1.34E-02 1.33E-20 4.85E-02 3.17E-06 9.59E-05 4.50E-01 2.88E-01 2.70E-12 1.49E-05 2.08E-09 7.51E-17 1.21E-24 5.63E-17 4.61E-15

Caucasian

GWAS P

− 2.73E-01 9.82E-01 1.55E-01 1.67E-03 4.39E-01 1.44E-01 8.26E-02 − 5.36E-01 − 8.44E-01 − 6.99E-03 1.52E-01 2.71E-02 2.29E-09 1.79E-01 4.25E-03 2.32E-01 − 1.50E-02 8.84E-03 1.72E-02 1.32E-01 − 3.09E-01 − 7.32E-01 − 6.39E-01 9.04E-01 − 9.67E-01 − 1.32E-01 7.12E-01 6.57E-01 1.04E-04 − 2.09E-01 1.49E-01 2.94E-02 6.23E-01 7.69E-01

GEMMA Power** (0.05/231) P* − − 2.23E-01 0.0015 4.80E-01 0.0039 3.40E-01 0.0029 7.79E-03 0.0126 4.20E-01 0.0017 6.67E-01 0.0762 6.38E-02 0.0002 − − 8.10E-01 0.9827 − − 9.29E-01 0.0002 − − 5.13E-03 0.0546 1.33E-01 0.1427 1.49E-02 0.0695 7.25E-08 0.0325 9.95E-02 0.0066 2.89E-02 0.0261 2.26E-01 0.0014 − − 1.61E-02 0.0817 2.26E-02 0.0034 1.61E-02 0.0060 3.81E-01 0.0022 − − 2.01E-01 0.0836 − − 3.29E-01 0.1399 − − 7.12E-01 0.0013 7.61E-01 0.0008 − − 6.06E-01 0.0005 − − 4.76E-02 0.0009 9.44E-01 0.0003 5.45E-01 0.0002 8.54E-04 0.0272 − − 2.47E-01 0.0017 2.78E-01 0.0392 5.64E-03 0.0298 3.63E-01 0.0333 6.55E-01 0.0180

Ulcerative colitis Caucasian Combined P 8.87E-08 1.23E-07 1.08E-03 2.30E-02 4.24E-10 1.52E-08 1.26E-06 3.28E-10 − − − 4.86E-06 2.90E-05 7.87E-01 4.83E-05 1.25E-25 6.72E-11 2.97E-04 3.07E-04 6.13E-13 4.34E-08 2.65E-13 2.50E-10 6.07E-01 1.12E-06 8.52E-03 1.67E-03 1.67E-11 1.12E-07 6.17E-12 4.90E-02 3.41E-08 3.98E-03 4.50E-08 4.29E-03 5.95E-06 1.07E-06 1.87E-16 8.83E-04 1.23E-08 4.46E-06 1.19E-08 2.40E-12 2.79E-01 7.35E-53

Korean GWAS P − 1.20E-01 4.37E-02 3.32E-01 4.70E-01 2.54E-02 6.69E-02 2.31E-01 − 8.34E-01 − 3.00E-01 − 6.99E-02 4.12E-02 2.32E-03 5.00E-03 8.07E-02 2.92E-01 1.12E-01 − 4.07E-01 3.59E-02 7.85E-01 1.02E-01 − 1.03E-03 − 1.92E-01 − 6.52E-01 6.45E-02 − 8.67E-02 − 9.50E-01 7.68E-01 9.85E-01 5.14E-01 − 4.90E-01 6.13E-01 2.35E-01 9.14E-01 7.33E-02

LR GEMMA Power** (0.05/231) phenotype† P* − − 1.40E-01 0.0004 1.59E-02 0.0010 3.18E-01 0.0003 6.56E-01 0.0063 4.31E-02 0.0050 1.18E-01 0.0025 2.58E-01 0.0006 − − 7.36E-01 0.0003 − − 1.81E-01 0.0025 − − 9.37E-02 0.0002 1.91E-02 0.0031 5.66E-03 0.0413 6.75E-03 0.0073 IBD_S 3.39E-02 0.0011 2.40E-01 0.0016 1.04E-01 0.0081 − − 4.64E-01 0.0105 4.54E-02 0.0088 5.96E-01 0.0002 1.73E-01 0.0021 − − 1.50E-03 0.0010 − − 3.16E-01 0.0081 − − 5.19E-01 0.0003 1.22E-01 0.0038 − − 9.92E-02 0.0041 − − 9.44E-01 0.0009 7.50E-01 0.0024 9.03E-01 0.0011 7.65E-01 0.0016 CD − − 5.68E-01 0.0008 6.70E-01 0.0074 1.49E-01 0.0037 9.53E-01 0.0003 6.95E-02 0.1657

Supplementary Table 3. Cont'd (5).

SNP information

| Chr. | SNP | Position (hg19) | GRAIL gene | A1 | A2 | Proxy SNP | r2 | A1 frequency, Caucasian | A1 frequency, Korean |
|---|---|---|---|---|---|---|---|---|---|
| 21 | rs7282490‡ | 45,615,741 | ICOSLG, AIRE | G | A | | | 0.39 | 0.58 |
| 22 | rs2256609‡ | 21,925,017 | MAPK1 | G | A | | | 0.20 | 0.34 |
| 22 | rs5763767 | 30,493,882 | OSM, LIF | A | G | | | 0.45 | 0.27 |
| 22 | rs2413583 | 39,659,773 | MAP3K7IP1 | A | G | | | 0.17 | 0.05 |
| 22 | rs12627970 | 39,721,745 | MAP3K7IP1, ATF4 | G | A | | | 0.21 | 0.81 |
| 22 | rs727563 | 41,867,377 | | G | A | | | 0.20 | 0.42 |

Inflammatory bowel disease

| SNP | Caucasian Combined P | Korean GWAS P | Korean GEMMA P* | Power** (0.05/231) |
|---|---|---|---|---|
| rs7282490‡ | 9.85E-30 | 1.23E-04 | 1.10E-03 | 0.1367 |
| rs2256609‡ | 2.73E-12 | 4.08E-05 | 4.02E-03 | 0.0381 |
| rs5763767 | 8.82E-15 | 2.63E-01 | 2.94E-01 | 0.0186 |
| rs2413583 | 5.06E-38 | 1.05E-02 | 2.28E-02 | 0.0228 |
| rs12627970 | 1.94E-18 | 9.96E-03 | 1.88E-02 | 0.0485 |
| rs727563 | 2.58E-06 | 3.59E-01 | 5.86E-01 | 0.0094 |

Crohn's disease

| SNP | Caucasian Combined P | Korean GWAS P | Korean GEMMA P* | Power** (0.05/231) |
|---|---|---|---|---|
| rs7282490‡ | 5.98E-23 | 3.04E-04 | 2.42E-03 | 0.0841 |
| rs2256609‡ | 6.14E-12 | 1.56E-03 | 4.71E-02 | 0.0388 |
| rs5763767 | 6.45E-13 | 8.80E-01 | 8.03E-01 | 0.0139 |
| rs2413583 | 7.72E-36 | 1.12E-01 | 2.41E-01 | 0.0217 |
| rs12627970 | 1.39E-13 | 1.33E-01 | 1.53E-01 | 0.0220 |
| rs727563 | 1.88E-10 | 2.33E-01 | 3.79E-01 | 0.0308 |

Ulcerative colitis

| SNP | Caucasian Combined P | Korean GWAS P | Korean GEMMA P* | Power** (0.05/231) | LR phenotype† |
|---|---|---|---|---|---|
| rs7282490‡ | 2.06E-16 | 4.19E-02 | 6.19E-02 | 0.0195 | IBD_U |
| rs2256609‡ | 1.18E-04 | 1.81E-03 | 1.01E-02 | 0.0025 | IBD_U |
| rs5763767 | 1.19E-06 | 7.63E-02 | 1.18E-01 | 0.0021 | |
| rs2413583 | 7.09E-15 | 1.82E-02 | 1.56E-02 | 0.0028 | |
| rs12627970 | 1.03E-10 | 1.08E-02 | 1.47E-02 | 0.0078 | |
| rs727563 | 2.47E-02 | 9.50E-01 | 9.46E-01 | 0.0006 | |

* P value was calculated using the z-score from genome-wide efficient mixed-model association (GEMMA).
** Power of published 231 independent SNPs in our GWAS discovery was analyzed using Quanto (http://hyda.usc.edu/gxel).
† The best associated phenotype was assigned using the likelihood ratio modeling (24 SNPs of 23 loci).
‡ These SNPs had passed the P value threshold (0.05/231 independent SNPs) after the Bonferroni correction.
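The Bonferroni threshold named in the ‡ footnote (0.05 divided by 231 independent SNPs) can be checked against the Korean GWAS P values for inflammatory bowel disease listed for the six SNPs in this part of the table. The sketch below is illustrative only: the variable names are hypothetical, and comparing the IBD GWAS P value (rather than the CD, UC, or GEMMA values) is an assumption about how the footnote was applied.

```python
# Korean GWAS P values for IBD, copied from the six rows of
# Supplementary Table 3, Cont'd (5).
ibd_gwas_p = {
    "rs7282490": 1.23e-04,
    "rs2256609": 4.08e-05,
    "rs5763767": 2.63e-01,
    "rs2413583": 1.05e-02,
    "rs12627970": 9.96e-03,
    "rs727563": 3.59e-01,
}

# Bonferroni-corrected significance threshold for 231 independent SNPs.
n_snps = 231
threshold = 0.05 / n_snps

# SNPs whose IBD GWAS P value falls below the corrected threshold
# (assumption: this is the comparison behind the double-dagger mark).
passing = [snp for snp, p in ibd_gwas_p.items() if p < threshold]

print(f"threshold = {threshold:.3e}")
print("passing SNPs:", passing)
```

Under that assumption, the two SNPs returned are the same two that carry the ‡ mark in this part of the table.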

Supplementary Table 4. The proportion of genetic variance explained by CD and UC risk loci in Koreans.

Inflammatory bowel disease risk loci

| Chr | SNP | Gene(s)§ | Risk allele | CD RAF Case | CD RAF Control | CD OR | CD 95% CI | CD P trend* | CD additive genetic variance explained (%) | UC RAF Case | UC RAF Control | UC OR | UC 95% CI | UC P trend* | UC additive genetic variance explained (%) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | rs12141431∆ | IL23R | C | 0.52 | 0.45 | 1.34 | (1.20-1.50) | 9.79E-08 | 0.276 | 0.47 | 0.45 | 1.11 | (0.97-1.26) | 1.40E-01 | 0.036 |
| 1 | rs3766920† | PYGO2, SHC1 | A | 0.12 | 0.08 | 1.56 | (1.32-1.83) | 7.57E-08 | 0.190 | 0.10 | 0.08 | 1.34 | (1.09-1.65) | 4.91E-03 | 0.095 |
| 2 | rs6740462† | SPRED2 | A | 0.88 | 0.85 | 1.36 | (1.16-1.59) | 1.40E-04 | 0.155 | 0.89 | 0.85 | 1.39 | (1.15-1.69) | 8.55E-04 | 0.205 |
| 2 | rs3749172† | GPR35 | A | 0.37 | 0.30 | 1.37 | (1.23-1.52) | 5.28E-09 | 0.263 | 0.33 | 0.30 | 1.16 | (1.02-1.32) | 2.72E-02 | 0.066 |
| 5 | rs2930047† | DAP | C | 0.77 | 0.74 | 1.16 | (1.03-1.31) | 1.69E-02 | 0.055 | 0.80 | 0.74 | 1.40 | (1.19-1.63) | 2.74E-05 | 0.309 |
| 5 | rs11742570† | (PTGER4) | C | 0.20 | 0.17 | 1.23 | (1.08-1.40) | 1.71E-03 | 0.078 | 0.21 | 0.17 | 1.27 | (1.09-1.49) | 2.07E-03 | 0.123 |
| 5 | rs11741861† | TNIP1 | G | 0.39 | 0.36 | 1.14 | (1.03-1.26) | 1.32E-02 | 0.051 | 0.41 | 0.36 | 1.23 | (1.09-1.40) | 9.92E-04 | 0.148 |
| 5 | rs6556412† | IL12B | A | 0.47 | 0.42 | 1.21 | (1.10-1.35) | 2.15E-04 | 0.119 | 0.45 | 0.42 | 1.13 | (0.99-1.28) | 6.12E-02 | 0.051 |
| 6 | rs7775228† | (MHC) | C | 0.34 | 0.26 | 1.45 | (1.30-1.62) | 1.44E-11 | 0.353 | 0.34 | 0.26 | 1.42 | (1.25-1.62) | 1.08E-07 | 0.355 |
| 9 | rs6478109† | TNFSF15, TNFSF8 | G | 0.71 | 0.52 | 2.21 | (1.98-2.48) | 3.00E-46 | 1.962 | 0.56 | 0.52 | 1.18 | (1.05-1.34) | 7.67E-03 | 0.105 |
| 10 | rs4409764† | (NKX2-3) | T | 0.52 | 0.48 | 1.18 | (1.07-1.30) | 1.40E-03 | 0.086 | 0.52 | 0.48 | 1.16 | (1.03-1.31) | 1.43E-02 | 0.084 |
| 13 | rs7335629†‡ | SLC25A15, ELF1, WBP4 | C | 0.28 | 0.23 | 1.27 | (1.13-1.42) | 6.04E-05 | 0.129 | 0.27 | 0.23 | 1.24 | (1.08-1.43) | 2.98E-03 | 0.121 |
| 14 | rs1063169† | FOS | G | 0.76 | 0.73 | 1.21 | (1.08-1.37) | 1.35E-03 | 0.096 | 0.77 | 0.73 | 1.29 | (1.11-1.49) | 7.56E-04 | 0.185 |
| 16 | rs16953946† | CDYL2 | C | 0.20 | 0.17 | 1.26 | (1.11-1.44) | 2.75E-04 | 0.100 | 0.20 | 0.17 | 1.25 | (1.07-1.46) | 4.06E-03 | 0.106 |
| 17 | rs6503695† | STAT3, STAT5B, STAT5A | T | 0.75 | 0.67 | 1.42 | (1.26-1.59) | 2.29E-09 | 0.340 | 0.72 | 0.67 | 1.21 | (1.06-1.39) | 5.00E-03 | 0.118 |
| 20 | rs2427537† | TNFRSF6B | T | 0.08 | 0.05 | 1.59 | (1.31-1.93) | 2.13E-06 | 0.148 | 0.09 | 0.05 | 1.71 | (1.37-2.14) | 1.72E-06 | 0.228 |
| 21 | rs7282490† | ICOSLG, AIRE | G | 0.62 | 0.58 | 1.21 | (1.09-1.35) | 3.04E-04 | 0.115 | 0.61 | 0.58 | 1.14 | (1.01-1.29) | 4.19E-02 | 0.061 |
| 22 | rs2256609† | MAPK1 | G | 0.38 | 0.34 | 1.19 | (1.07-1.32) | 1.56E-03 | 0.085 | 0.39 | 0.34 | 1.23 | (1.08-1.40) | 1.81E-03 | 0.139 |

Crohn's disease risk loci

| Chr | SNP | Gene(s)§ | Risk allele | CD RAF Case | CD RAF Control | CD OR | CD 95% CI | CD P trend* | CD additive genetic variance explained (%) |
|---|---|---|---|---|---|---|---|---|---|
| 2 | rs12994997 | (ATG16L1) | A | 0.36 | 0.31 | 1.24 | (1.12-1.39) | 6.49E-05 | 0.133 |
| 3 | rs3197999 | MST1R | A | 0.09 | 0.07 | 1.45 | (1.21-1.73) | 4.10E-05 | 0.112 |
| 4 | rs6856616 | (TBC1D1, KLF3) | C | 0.32 | 0.23 | 1.53 | (1.37-1.71) | 4.19E-14 | 0.422 |
| 6 | rs394522 | CCR6, RPS6KA2 | C | 0.46 | 0.39 | 1.33 | (1.20-1.48) | 2.41E-08 | 0.254 |
| 10 | rs224090 | (ZNF365) | T | 0.63 | 0.57 | 1.30 | (1.17-1.44) | 1.16E-06 | 0.215 |
| 10 | rs1892497∆ | ZMIZ1 | C | 0.64 | 0.59 | 1.22 | (1.09-1.35) | 2.65E-04 | 0.119 |
| 10 | rs11195128‡ | (SMNDC1, DUSP5) | T | 0.21 | 0.15 | 1.52 | (1.38-1.73) | 9.62E-11 | 0.291 |
| 11 | rs72981516∆‡ | ATG16L2, FCHSD2 | G | 0.14 | 0.11 | 1.36 | (1.17-1.58) | 6.46E-05 | 0.118 |
| 18 | rs534911∆ | PTPN2 | G | 0.40 | 0.35 | 1.24 | (1.12-1.38) | 3.35E-05 | 0.140 |
| 20 | rs6074022 | CD40, MMP9 | C | 0.40 | 0.35 | 1.23 | (1.11-1.36) | 1.04E-04 | 0.123 |
| 21 | rs55673812∆ | USP25 | T | 0.75 | 0.71 | 1.24 | (1.10-1.40) | 3.42E-04 | 0.122 |

Ulcerative colitis risk loci

| Chr | SNP | Gene(s)§ | Risk allele | UC RAF Case | UC RAF Control | UC OR | UC 95% CI | UC P trend* | UC additive genetic variance explained (%) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | rs4648649 | TNFRSF14 | T | 0.58 | 0.52 | 1.28 | (1.13-1.46) | 9.43E-05 | 0.229 |
| 1 | rs3806308 | PLA2G2A | C | 0.57 | 0.51 | 1.28 | (1.13-1.45) | 9.41E-05 | 0.228 |
| 1 | rs6426833 | PLA2G2A | A | 0.85 | 0.77 | 1.70 | (1.44-2.02) | 6.85E-10 | 0.720 |
| 1 | rs12728589 | (ZBTB40) | G | 0.85 | 0.80 | 1.42 | (1.19-1.69) | 7.84E-05 | 0.279 |
| 1 | rs1801274 | FCGR2A, FCGR2B, FCGR3B, FCGR3A | A | 0.82 | 0.76 | 1.37 | (1.18-1.60) | 5.73E-05 | 0.265 |
| 1 | rs4845141∆ | IL10, IL19 | T | 0.77 | 0.72 | 1.30 | (1.12-1.51) | 4.49E-04 | 0.208 |
| 2 | rs4851534∆ | IL1R2 | T | 0.26 | 0.21 | 1.33 | (1.15-1.53) | 1.00E-04 | 0.199 |
| 7 | rs4728142∆ | IRF5 | A | 0.16 | 0.12 | 1.30 | (1.10-1.55) | 2.15E-03 | 0.114 |
| 9 | rs16922779∆ | JAK2, INSL4 | A | 0.26 | 0.23 | 1.19 | (1.03-1.37) | 1.83E-02 | 0.078 |
| 13 | rs17085007 | (GPR12, USP12) | C | 0.24 | 0.18 | 1.52 | (1.31-1.76) | 2.06E-08 | 0.376 |
| 16 | rs16940186∆ | IRF8 | C | 0.27 | 0.21 | 1.36 | (1.19-1.57) | 1.02E-05 | 0.241 |

Chr, chromosome; CI, confidence interval; OR, odds ratio; Position, chromosome position (hg19); RAF, risk allele frequency.
§ () Nearby gene.
* P value was calculated using the Cochran-Armitage trend test.
∆ Previously reported SNP/proxy SNP (r2 > 0.6) associated with CD in Koreans.
† These SNPs were assigned to IBD based on the LR phenotype.
‡ These loci were first identified in Asian populations and validated in the present study and Japanese GWAS on IBD (ref. 8).
Total additive genetic variance explained: 6.65% for CD and 5.47% for UC.
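The totals stated in the footnote of Supplementary Table 4 (6.65% for CD and 5.47% for UC) can be reproduced by summing the per-locus "additive genetic variance explained (%)" values. The sketch below assumes, as the table layout suggests, that the CD total combines the 18 IBD loci with the 11 CD-specific loci and that the UC total combines the same 18 IBD loci with the 11 UC-specific loci. The variable names are hypothetical, and the code only sums the published values; it does not recompute the per-locus variance.

```python
# Per-locus additive genetic variance explained (%), copied from
# Supplementary Table 4 in row order.
cd_ibd_loci = [0.276, 0.190, 0.155, 0.263, 0.055, 0.078, 0.051, 0.119, 0.353,
               1.962, 0.086, 0.129, 0.096, 0.100, 0.340, 0.148, 0.115, 0.085]
cd_specific_loci = [0.133, 0.112, 0.422, 0.254, 0.215, 0.119, 0.291, 0.118,
                    0.140, 0.123, 0.122]

uc_ibd_loci = [0.036, 0.095, 0.205, 0.066, 0.309, 0.123, 0.148, 0.051, 0.355,
               0.105, 0.084, 0.121, 0.185, 0.106, 0.118, 0.228, 0.061, 0.139]
uc_specific_loci = [0.229, 0.228, 0.720, 0.279, 0.265, 0.208, 0.199, 0.114,
                    0.078, 0.376, 0.241]

# Assumption: each disease total is the sum over the shared IBD loci
# plus that disease's own loci.
cd_total = sum(cd_ibd_loci) + sum(cd_specific_loci)
uc_total = sum(uc_ibd_loci) + sum(uc_specific_loci)

print(f"CD total: {cd_total:.2f}%")
print(f"UC total: {uc_total:.2f}%")
```

Under this assumption the sums agree with the stated totals to two decimal places, which also serves as a consistency check on the column assignments in the table.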