Phylogenetic mapping of phased haplotype data for post-GWAS mapping

Developing Topics

[...] have a strong genetic component, we would expect some genes to act directly on one phenotype and, through the potential causal relationship between the two, to have indirect effects on the other, or to affect susceptibility to both diseases via independent mechanisms, i.e., to have pleiotropic effects. The aim of this study was to investigate whether a genotype score based on the top T2D candidates could predict LOAD risk. Methods: Seventeen single-nucleotide polymorphisms (SNPs) associated with T2D were genotyped in 3500 Caucasian LOAD patients and controls, 2500 of whom had diabetes information available. A genotype risk score (GRS) was created from the number of risk alleles, and logistic regression was used to investigate its association with LOAD. Interactions between the GRS and APOE e4 status were used to investigate whether any observed associations were e4 stratum specific. Analyses were repeated for the 2500 subjects with T2D information. Adjustment for T2D was used to investigate whether any associations acted through T2D status, and an interaction with T2D was used to examine T2D stratum specific effects. Results: Six SNPs were associated with LOAD status, but only two of these increased LOAD risk. The GRS was linearly associated with a decreased risk for AD (OR = 0.97; 95% CI: 0.94-0.99; p = 0.032 per allele). Also, the OR for LOAD in individuals in the top 5% of the allele count distribution against those in the bottom 5% was 0.25 (95% CI: 0.096-0.65; p = 0.005). The association of the GRS with AD risk was independent of T2D status. However, the GRS was associated with decreased LOAD risk in APOE e4 carriers (OR = 0.91; 95% CI: 0.87-0.96; p < 0.001 per risk allele) but not in non-e4 carriers (OR = 1.00; 95% CI: 0.96-1.04; p = 0.946) (p = 0.005 for the interaction term). Conclusions: A GRS based on the top T2D candidates was associated with a decreased risk for LOAD in APOE e4 carriers. Such an observation could partly explain why only some e4 carriers have LOAD.

P4-338
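The scoring and regression steps described in the Methods can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the GRS is taken to be the unweighted count of risk alleles, the model has a single predictor and no covariates or interaction terms, and the function names and toy data are hypothetical.

```python
import math

def genotype_risk_score(risk_allele_counts):
    """Unweighted GRS: total number of risk alleles (0, 1 or 2 per SNP)."""
    if any(c not in (0, 1, 2) for c in risk_allele_counts):
        raise ValueError("each SNP must contribute 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts)

def fit_logistic(x, y, iterations=25):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson; return (b0, b1)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # entries of the information matrix X'WX
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if abs(det) < 1e-12:
            break                # predictor has no variation; stop updating
        # Newton step: add (X'WX)^-1 times the gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Toy data invented for illustration: risk-allele counts at three SNPs per subject
genotypes = [[0, 0, 1], [1, 1, 0], [0, 2, 0], [1, 1, 1],
             [2, 0, 1], [2, 1, 1], [1, 2, 1], [2, 2, 1]]
load_status = [1, 1, 0, 1, 0, 0, 1, 0]   # 1 = LOAD case, 0 = control

scores = [genotype_risk_score(g) for g in genotypes]
_, beta = fit_logistic(scores, load_status)
print("per-allele odds ratio:", math.exp(beta))
```

The exponentiated slope is the odds ratio per additional risk allele, which is the quantity reported as "OR per allele" in the Results.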

PHYLOGENETIC MAPPING OF PHASED HAPLOTYPE DATA FOR POST-GWAS MAPPING

Allen Roses1, Michael Lutz1, Donna Crenshaw1, Ann Saunders1, Richard Caselli2, Iris Grossman3, Ornit Chiba-Falek1, Daniel Burns3, 1Duke University, Durham, North Carolina, United States; 2Mayo Clinic, Scottsdale, Arizona, United States; 3Zinfandel Pharmaceuticals, Durham, North Carolina, United States. Background: Recent genome-wide association studies (GWAS) of late onset Alzheimer’s disease (LOAD) have reported putative new susceptibility loci for LOAD. However, multiple mutations in an associated region, on each of the homologous chromosomes, may have a profound effect on gene expression, protein function, disease risk, and on endophenotypes such as age at disease onset. Phylogenetic analysis reveals the evolutionary history of haplotypes in these regions and can provide greater resolution to the association analyses of various phenotypes. Methods: We analyzed three APOE promoter SNPs (-219, -491 and -427) reported in a recent LOAD association study by another research group. These SNPs were mapped to phylogenetic trees previously generated for the TOMM40-APOE genomic interval. The phylogeny was constructed for a case/control cohort using neighbor-joining and Bayesian phylogenetic tree construction methods. Results: The three promoter SNPs are in tight linkage with different alleles of TOMM40 rs10524523 (‘523’, a polymorphic poly-T defined by the number of T nucleotides). The major findings are: 1) The T allele of the APOE promoter -219 SNP always maps to a major clade of haplotypes that is associated with higher risk for LOAD and an earlier age of LOAD onset. 2) The T allele of APOE -219 is always linked either to long 523 alleles (20-29 Ts), which are tightly linked to APOE E4, or to very long 523 alleles (>29 Ts), which are linked to APOE E3, but never to short 523 alleles (<20 Ts). 3) The APOE -219 G allele is linked most frequently to short 523 alleles, but this is not always the case. 4) In this analysis, the 523 poly-T is the variant that is most significantly associated with differences in age of onset distributions of LOAD. Conclusions: Our results demonstrate that phylogenetic analysis can reconcile results of independent studies of phased sequence data, using the specific example of the TOMM40-APOE region and age at onset of LOAD. Phylogenetic mapping may be useful for fine mapping putatively associated loci identified by GWAS with an independent method of genetic analysis.
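The 523 length classes quoted in the Results (short, fewer than 20 Ts; long, 20-29 Ts; very long, more than 29 Ts) can be written down as a small helper. The sketch below is illustrative only: the function names are hypothetical, and the consistency check simply encodes the reported observation that the -219 T allele was never found with a short 523 allele in this cohort.

```python
def classify_523(n_t):
    """Return the 523 length class for a poly-T run of n_t thymines."""
    if n_t < 0:
        raise ValueError("poly-T length cannot be negative")
    if n_t < 20:
        return "short"
    if n_t <= 29:
        return "long"
    return "very long"

def consistent_with_reported_linkage(apoe_219_allele, n_t):
    """Check a phased haplotype against the reported -219 / 523 linkage pattern.

    The abstract reports that the -219 T allele is never linked to short 523
    alleles; the G allele is usually, but not always, linked to short alleles,
    so no G haplotype is flagged here.
    """
    if apoe_219_allele not in ("T", "G"):
        raise ValueError("expected 'T' or 'G' at APOE -219")
    return not (apoe_219_allele == "T" and classify_523(n_t) == "short")

# Hypothetical phased haplotypes: (APOE -219 allele, number of Ts at 523)
for allele, length in [("T", 23), ("T", 34), ("G", 16), ("T", 15)]:
    print(allele, length, classify_523(length),
          consistent_with_reported_linkage(allele, length))
```

A haplotype flagged as inconsistent would not contradict the abstract by itself; it would only indicate a combination that was not observed in the cohort described.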

P4-339

INFLUENCE OF GENETIC VARIATION ON PLASMA PROTEOMICS IN AD, MCI AND CONTROLS: PAIRWISE GENE-PROTEIN ANALYSIS IN THE ADNI-1 COHORT

Sungeun Kim1, Shanker Swaminathan1, Mark Inlow1, Shannon Risacher1, Li Shen1, Tatiana Foroud2, Leslie Shaw3, John Trojanowski4, Holly Soares5, Michael Weiner6, Andrew Saykin1, the Alzheimer’s Disease Neuroimaging Initiative7, 1Center for Neuroimaging, Indiana University School of Medicine, Department of Radiology & Imaging Sciences, Indianapolis, Indiana, United States; 2Department of Medical and Molecular Genetics, Indiana University School of Medicine, Indianapolis, Indiana, United States; 3University of Pennsylvania Medical Center, Philadelphia, Pennsylvania, United States; 4Department of Pathology and Laboratory Medicine, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, United States; 5Bristol-Myers Squibb, Wallingford, Connecticut, United States; 6Departments of Radiology, Medicine and Psychiatry, University of California, San Francisco, San Francisco, California, United States; 7Indiana University, Indianapolis, Indiana, United States. Background: Blood-based biomarkers for early detection and monitoring of Alzheimer’s disease (AD) would be highly desirable. Sets of protein levels from plasma or serum have been effective in distinguishing AD patients from healthy controls (HC) [1,2]. Data from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) provide 190 analytes from plasma using a Rules Based Medicine (RBM) panel [3]. We hypothesized that genetic variation in single nucleotide polymorphisms (SNPs) within each protein-coding gene would influence plasma protein levels. Methods: Quality-controlled genotype data version 2 (Illumina 610Quad) and RBM analytes from 522 non-Hispanic Caucasian participants were analyzed. After preprocessing steps, 2191 SNPs within 131 protein-coding genes were assessed for associations with 129 quality-controlled analytes. Corresponding SNPs and analytes were analyzed using a dominant genetic model without (model 1) and with (model 2) controlling for baseline diagnosis. Sex, age, education and handedness were included in the model when significant. SNPs with uncorrected p < 5.0 × 10^-5 were considered significant. Results: Analyses using model 1 identified 105 significant associations between 28 analytes and 105 SNPs belonging to 28 genes. Figure 1 visualizes these associations and the linkage disequilibrium among significant SNPs within each gene. Analyses using model 2 found the same set of significant associations with slightly different p-values. For each of these associations, the fraction of R^2 accounted for by each SNP in model 1 (ΔR^2) was computed separately for all 522 participants, the 52 HC, the 361 patients with mild cognitive impairment, and the 108 AD patients. Table 1 lists the top SNPs (largest ΔR^2) in the full sample in decreasing order of ΔR^2. Conclusions: 28 of 129 analytes were significantly influenced by one or more SNPs within the corresponding protein-coding genes. These SNPs accounted for 3 to 56 percent of the total variation. For some analytes, the role of genetic variation appeared to differ by diagnostic group, a topic that warrants further investigation. The combination of paired genetic and proteomic data may be synergistic in studies of AD pathophysiology and early

Figure 1. Heatmap of significant associations at p < 5.0 × 10^-5 in the results of model 1 (upper) and linkage disequilibrium among significant SNPs within each gene (lower). The heatmap shows which associations are significant; it does not represent their p-values.
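The dominant genetic model and the per-SNP share of R^2 mentioned in the abstract above can be illustrated with a short sketch. It rests on assumptions not stated in the source: the SNP's contribution is taken to be the increase in R^2 when the dominant-coded SNP is added to a linear model that already contains the covariates, and the regression is solved with plain normal equations. All function names and the toy data are hypothetical.

```python
def dominant_code(minor_allele_count):
    """Dominant coding: 1 if at least one minor allele is carried, else 0."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor allele count must be 0, 1 or 2")
    return 1 if minor_allele_count >= 1 else 0

def ols_r_squared(columns, y):
    """R^2 of a least-squares fit of y on an intercept plus predictor columns."""
    n = len(y)
    rows = [[1.0] + [float(col[i]) for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y]
    m = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gaussian elimination with partial pivoting
    for c in range(k):
        pivot = max(range(c, k), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < 1e-12:
            raise ValueError("predictors are collinear")
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(c + 1, k):
            f = m[r][c] / m[c][c]
            for j in range(c, k + 1):
                m[r][j] -= f * m[c][j]
    # Back substitution for the coefficients
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (m[i][k] - sum(m[i][j] * b[j] for j in range(i + 1, k))) / m[i][i]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    if ss_tot == 0:
        raise ValueError("outcome has no variation")
    ss_res = sum((v - sum(bi * xi for bi, xi in zip(b, r))) ** 2
                 for v, r in zip(y, rows))
    return 1.0 - ss_res / ss_tot

def delta_r_squared(snp_dominant, covariates, y):
    """Increase in R^2 when the SNP is added to a covariates-only model."""
    return ols_r_squared(covariates + [snp_dominant], y) - ols_r_squared(covariates, y)

# Toy data invented for illustration
minor_counts = [0, 1, 2, 0, 1, 0, 2, 1]
age = [70, 72, 68, 75, 71, 69, 74, 73]
analyte = [1.1, 1.9, 2.2, 1.0, 2.1, 1.2, 2.0, 1.8]
snp = [dominant_code(c) for c in minor_counts]
print("delta R^2:", delta_r_squared(snp, [age], analyte))
```

Running the same computation on subsets of participants (for example, one diagnostic group at a time) would give group-specific values of the kind the abstract compares.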