Mutation Research 642 (2008) 86–89
Mutation Research/Fundamental and Molecular Mechanisms of Mutagenesis journal homepage: www.elsevier.com/locate/molmut Community address: www.elsevier.com/locate/mutres
Short communication
Evaluation of the flanking nucleotide sequences of sarcomeric hypertrophic cardiomyopathy substitution mutations
Kathryn M. Meurs ∗, Katrina L. Mealey
Department of Veterinary Clinical Sciences, Washington State University College of Veterinary Medicine, Grimes Street, Pullman, WA 99164, United States
Article info
Article history: Received 14 February 2008; received in revised form 15 April 2008; accepted 16 April 2008; available online 24 April 2008
Keywords: Hypertrophic cardiomyopathy; Mutation; Methylation; Deamination
Abstract

Hypertrophic cardiomyopathy (HCM) is a familial myocardial disease with a prevalence of 1 in 500. More than 400 causative mutations have been identified in 13 sarcomeric and myofilament related genes; 350 of these are substitution mutations within eight sarcomeric genes. Within a population, examples of recurring identical disease causing mutations that appear to have arisen independently have been noted, as well as those that appear to have been inherited from a common ancestor. The large number of novel HCM mutations could suggest a mechanism of increased mutability within the sarcomeric genes. The objective of this study was to evaluate the most commonly reported HCM genes, beta myosin heavy chain (MYH7), myosin binding protein C, troponin I, troponin T, cardiac regulatory myosin light chain, cardiac essential myosin light chain, alpha tropomyosin and cardiac alpha-actin, for sequence patterns surrounding the substitution mutations that may suggest a mechanism of increased mutability. The mutations as well as the 10 flanking nucleotides were evaluated for frequency of di-, tri- and tetranucleotides containing the mutation as well as for the presence of certain tri- and tetranucleotide motifs. The most common substitutions were guanine (G) to adenine (A) and cytosine (C) to thymine (T). The CG dinucleotide had a significantly higher relative mutability than any other dinucleotide (p < 0.05). The relative mutability of each possible trinucleotide and tetranucleotide sequence containing the mutation was calculated; none had a statistically higher relative mutability than the others. The large number of G to A and C to T mutations as well as the relative mutability of CG may suggest that deamination of methylated CpG is an important mechanism for mutation development in at least some of these cardiac genes. © 2008 Elsevier B.V. All rights reserved.
1. Introduction

Hypertrophic cardiomyopathy (HCM) is a primary myocardial disease characterized by increased left ventricular mass and wall thickness in the absence of a pressure overload or metabolic stimulus [1]. It has a prevalence of 1 in 500 and is believed to be familial in at least 60% of the cases, usually inherited as an autosomal dominant trait [2]. More than 400 causative mutations have been identified in 13 sarcomeric and myofilament related genes; however, the majority of the mutations are substitution mutations found in the coding regions of eight genes that encode sarcomeric proteins [2,3]. Within a population, examples of recurring identical disease causing mutations that appear to have arisen independently have been noted, as well as those that appear to have been inherited from a common ancestor [4–7].
∗ Corresponding author. Tel.: +1 509 335 0817; fax: +1 509 335 0880. E-mail address:
[email protected] (K.M. Meurs). 0027-5107/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.mrfmmm.2008.04.005
The large number of novel substitution mutations observed within these eight sarcomeric genes could suggest a mechanism of increased gene mutability. The most commonly cited mechanisms for development of endogenous mutations are chemical (deamination of CpG regions), physical (DNA slippage) or enzymatic (post-replication repair) and the efficiency of all of these processes appears to be dependent on the local DNA sequence environment [8]. Specifically, evidence from sequence analysis and mutation rates of human DNA sequences and from early studies of the Human Gene Mutation Database (http://www.hgmd.cf.ac.uk/ac/index.php) suggests that the identity of neighboring bases can have an influence on both the type and rate of mutation events that occur at specific positions within a gene [9,10]. The objective of this study was to evaluate the eight most commonly reported hypertrophic cardiomyopathy genes, beta myosin heavy chain (MYH7), myosin binding protein C (MYBPC3), troponin I (TNNI3), troponin T (TNNT2), cardiac regulatory myosin light chain (MYL2), cardiac essential myosin light chain (MYL3), alpha tropomyosin (TPM1) and cardiac alpha-actin (ACTC) for sequence
K.M. Meurs, K.L. Mealey / Mutation Research 642 (2008) 86–89
Table 1
Nucleotide altered within the coding region of each gene

Gene      Transcript size (bp)   Number of mutations    A    C    G    T
ACTC      1134                   8                      2    2    4    0
MYH7      5925                   190                    71   33   37   49
MYBPC3    4304                   71                     7    19   40   5
TNNI3     708                    24                     4    8    10   2
TNNT2     882                    31                     8    7    11   5
TPM1      1105                   11                     3    1    3    4
MYL2      811                    10                     3    2    4    1
MYL3      868                    5                      2    1    2    0
Table 2
Total number of each type of substitution mutation for the eight cardiac sarcomeric genes

Substitution   Number
G>A            113
C>T            56
A>G            37
G>C            30
T>C            23
G>T            21
C>G            18
A>T            12
C>A            13
A>C            12
T>G            8
T>A            7
patterns surrounding the reported substitution mutations to identify a pattern for increased mutability.

2. Materials and methods

The number of known substitution mutations and the coding sequences for the MYBPC3, MYH7, TNNI3, TNNT2, TPM1, ACTC, MYL2 and MYL3 genes were acquired from the CardioGenomics (http://cardiogenomics.med.harvard.edu/home) and UCSC Genome Bioinformatics (http://genome.ucsc.edu/) databases [11,12]. The mutations were sorted according to the nucleotide mutated, the type of substitution and the di-, tri- and tetranucleotides containing each mutation. The number of each type of di-, tri- and tetranucleotide containing the mutation was divided by the number of times that the same sequence appeared in the coding sequence of the gene, and multiplied by 100, to give the relative mutability of that sequence as a percentage. The mutability of each gene was determined by dividing the number of substitution mutations in the gene by the size of the transcript in nucleotides and multiplying by 100 to present it as a percentage.

The potential methylation status of each CG dinucleotide was assessed by scanning each gene for CpG islands, which suggest unmethylated sites, using the UCSC Genome Bioinformatics database (http://genome.ucsc.edu/) [12]. Finally, the 10 nucleotides immediately preceding and following the mutation were evaluated for the presence of previously reported trinucleotide and tetranucleotide motifs that are thought to predispose to mutation development, including TTT, CTT, TGA, TTG, CTTT, TCTT and TTTG [9]. The number of mutations associated with each motif was divided by the total number of mutations (350) and multiplied by 100 to determine the frequency (as a percentage) with which each motif was observed to be associated with a substitution mutation.

A one-way ANOVA was used to assess statistical significance of the relative mutability for each dinucleotide, trinucleotide and tetranucleotide.
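The relative mutability calculations described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' procedure: the function names and the toy sequence are hypothetical, and counting overlapping occurrences on a single strand is an assumption, since the text does not say how occurrences were counted. The TNNI3 counts are the ones reported in Table 3.

```python
def count_kmer(sequence: str, kmer: str) -> int:
    """Count occurrences of kmer in sequence, allowing overlaps (assumption)."""
    sequence = sequence.upper()
    kmer = kmer.upper()
    width = len(kmer)
    return sum(
        1
        for start in range(len(sequence) - width + 1)
        if sequence[start:start + width] == kmer
    )


def percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator expressed as a percentage."""
    if denominator == 0:
        raise ValueError("denominator must be non-zero")
    return 100.0 * numerator / denominator


# Hypothetical toy coding sequence, used only to show how sites are counted.
toy_cds = "ATGCGCGTACGA"
cg_sites = count_kmer(toy_cds, "CG")

# Counts reported for TNNI3 in Table 3.
tnni3_cg_mutability = percent(12, 41)     # mutated CG sites / CG sites in transcript
tnni3_gene_mutability = percent(24, 708)  # substitutions / transcript size (bp)

print(cg_sites, round(tnni3_cg_mutability, 1), round(tnni3_gene_mutability, 1))
```

The same `percent` helper covers both quantities because each is a simple ratio; only the choice of numerator and denominator differs.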
A Pearson's correlation was used to assess the correlation between the relative mutability of each gene and its transcript size, as well as between the relative mutability of each gene and the relative CG mutability. A chi-squared analysis was performed to determine whether the seven tri- and tetranucleotide motifs occurred within the flanking regions of each mutation at a greater frequency than expected. An alpha of p ≤ 0.05 was considered to be significant.

Fig. 1. The mutability of each gene was determined by dividing the number of substitution mutations in each gene by the size of the transcript in nucleotides. The relative mutability of the CG dinucleotide was calculated by dividing the number of CGs in each gene that contained a mutation by the number of times that dinucleotide appears in the coding region of each gene. A Pearson's correlation was performed to determine if the gene mutability correlated with CG mutability, and an alpha of < 0.05 was considered to be significant. The relative mutability of the individual genes was found to correlate with CG mutability (p < 0.05, r = 0.78).
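The two Pearson correlations can be reproduced approximately from the rounded per-gene values reported in Table 3. The sketch below is an illustration under that assumption; `pearson_r` is a hypothetical helper written out by hand so that the block needs only the standard library, and because the inputs are rounded the coefficients should land close to, but not necessarily exactly on, the reported r = 0.78 and r = 0.27.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Values as reported in Table 3, in the order
# TNNT2, TNNI3, MYH7, MYBPC3, MYL2, ACTC, TPM1, MYL3.
gene_mutability = [3.5, 3.4, 3.2, 1.6, 1.2, 0.7, 1.0, 0.6]
cg_mutability = [12.1, 29.3, 24.2, 16.9, 14.0, 5.1, 4.8, 4.3]
transcript_size = [882, 708, 5925, 4304, 811, 1134, 1105, 868]

r_cg = pearson_r(gene_mutability, cg_mutability)
r_size = pearson_r(gene_mutability, transcript_size)

print(f"gene vs CG mutability:    r = {r_cg:.2f}")
print(f"gene vs transcript size:  r = {r_size:.2f}")
```

The sketch does not compute p-values; testing significance would need the t distribution with n − 2 = 6 degrees of freedom.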
3. Results

A total of 350 substitution mutations were evaluated, including 190 in MYH7, 71 in MYBPC3, 31 in TNNT2, 24 in TNNI3, 11 in TPM1, 10 in MYL2, 8 in ACTC and 5 in MYL3 (Table 1). The most common substitutions were G > A (113) and C > T (56) (Table 2). The CG dinucleotide had a significantly higher relative mutability than any other dinucleotide (p < 0.05) and had the highest mutability in the TNNI3 gene (Table 3). Relative mutability of the individual genes correlated with CG mutability (p < 0.05, r = 0.78), but not with the size of the transcript (NS, r = 0.27) (Figs. 1 and 2). CpG islands suggesting unmethylated CG sites were identified at exon 27 of MYH7, exon 24 of MYBPC3 and exons 3–5 of TNNI3. A total of three C > T or G > A mutations occurred in these regions. The MYL3, MYL2, ACTC, TNNT2 and TPM1 genes did not have any CpG islands in the exonic regions. The relative mutability of the trinucleotides and tetranucleotides containing the mutation was calculated; none had a statistically higher relative mutability than the others. Of the previously reported trinucleotide or tetranucleotide motifs that have been associated with mutation development when
Table 3
Relative gene mutability and relative CG mutability for each gene

Gene      Transcript size (bp)   Number of mutations   Gene mutability (%)   Number of CG mutations   Number of CG sites in transcript   CG mutability (%)
TNNT2     882                    31                    3.5                   4                        33                                 12.1
TNNI3     708                    24                    3.4                   12                       41                                 29.3
MYH7      5925                   190                   3.2                   58                       240                                24.2
MYBPC3    4304                   71                    1.6                   33                       195                                16.9
MYL2      811                    10                    1.2                   4                        29                                 14.0
ACTC      1134                   8                     0.7                   2                        39                                 5.1
TPM1      1105                   11                    1.0                   2                        41                                 4.8
MYL3      868                    5                     0.6                   1                        23                                 4.3
Fig. 2. The mutability of each gene was also correlated to the size of the transcript in nucleotides with a Pearson’s correlation. The relative mutability of the individual genes did not correlate with the size of the transcript (NS, r = 0.27).
observed within the 10 nucleotides flanking a mutation, TGA had the highest prevalence and was observed to precede or follow a mutation 39% of the time. None of the nucleotide motifs were observed at a statistically higher frequency than expected (Table 4).

4. Discussion

The results of the study presented here indicate that only the CG dinucleotide had a significantly elevated relative mutability in this subset of genes. The five genes with the highest CG relative mutability, TNNI3, MYH7, MYBPC3, MYL2 and TNNT2, also had the highest relative gene mutability, although they were not always the largest genes (Figs. 1 and 2). One likely mechanism for this level of CG mutability is the deamination of 5-methylcytosine within CpG sites. Chemical deamination of a methylated cytosine in a CpG can result in a transition mutation from a cytosine to thymine, or a guanine to adenine as a result of a transition on the antisense strand followed by a miscorrection of guanine to adenine on the sense strand [8,9,13]. The methylation of cytosine and subsequent deamination and mutation to a thymine has been suggested to be responsible for 25–60% of the point mutations that result in human familial disease [8,9,14]. The observed frequency of cytosine to thymine and guanine to adenine transition mutations does vary for individual human genes; for example, they appear to be responsible for less than 10% of mutations in the hypoxanthine phosphoribosyltransferase gene in Lesch–Nyhan syndrome but 60% of the mutations in apolipoprotein E in ApoA deficiency [13]. A previous report of familial hypertrophic cardiomyopathy that evaluated the MYH7 gene, one of the genes studied here, noted that 7 of 9 (78%) HCM mutations in this gene were accounted for by cytosine to thymine transitions at CpG dinucleotides [7].

Although deamination may play an important role in the development of the observed mutations, it is important to note that other mechanisms for mutation development have been observed in methylated CpG sites as well. In fact, preferential binding of several carcinogens including benzo[a]pyrene diol and acrolein occurs at methylated CpG sites and results in the formation of a G to T transversion, another fairly commonly observed HCM mutation [15–18] (Table 2). The state of methylation of CpGs within the coding region of these genes was not specifically studied in this project; however, the majority of the mutations did not occur at the typically unmethylated CpG islands and thus are most likely to have occurred at methylated CpGs.

Not all of the substitution mutations studied were associated with a CG dinucleotide. An additional theory for the development of mutations is the "nearest neighbor theory", which implies that the closest nucleotides to the mutation are involved in the development of the base pair change [9]. The effect of regional base pairs on mutation development appears to be locally confined to the 10 nucleotides on each side of the mutation; however, the closest one to three nucleotides flanking the mutation are the most important [8,13]. We did not identify any other di-, tri- or tetranucleotide sequences containing the mutation that were overrepresented in frequency (Table 4). Previous studies have also suggested that some trinucleotide and tetranucleotide motifs are overrepresented within the 10 nucleotides flanking the mutation, including, among others, TTT, CTT, TGA, TTG, CTTT, TCTT, TTTG and TGGA [9]. The relationship of these motifs to the development of the mutations is unknown, but one of these sequences, TGGA, is an arrest site for DNA polymerase alpha, which is involved in DNA repair and proofreading mechanisms, and CTT is a topoisomerase-I cleavage site consensus sequence [9,13].

Table 4
Percentage of substitution mutations that contained the TTT, CTT, TGA, TTG, CTTT, TCTT and TTTG motif in the 10 nucleotides preceding or following the substitution

Motif   Percentage   Observed   Expected
TTT     6            23         27
CTT     16           56         52
TGA     39           136        125
TTG     14           50         49
CTTT    4            15         13
TCTT    2            8          11
TTTG    4            13         16
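The comparison of observed and expected motif counts in Table 4 can be illustrated with a short Python sketch. The article does not describe how the chi-squared analysis was set up, so the per-motif term (O − E)²/E used here is an assumption chosen for illustration, not a reconstruction of the authors' test. The variable names are hypothetical; the counts are those in Table 4, and 3.841 is the standard critical value for one degree of freedom at alpha = 0.05.

```python
# Observed and expected counts per motif, as reported in Table 4.
motif_counts = {
    "TTT": (23, 27),
    "CTT": (56, 52),
    "TGA": (136, 125),
    "TTG": (50, 49),
    "CTTT": (15, 13),
    "TCTT": (8, 11),
    "TTTG": (13, 16),
}

# Chi-squared critical value for 1 degree of freedom at alpha = 0.05.
CRITICAL_DF1_ALPHA_05 = 3.841


def chi_squared_term(observed: int, expected: int) -> float:
    """Single-cell chi-squared contribution, (O - E)^2 / E."""
    return (observed - expected) ** 2 / expected


terms = {
    motif: chi_squared_term(observed, expected)
    for motif, (observed, expected) in motif_counts.items()
}
largest = max(terms, key=terms.get)

for motif, term in terms.items():
    flag = "exceeds" if term > CRITICAL_DF1_ALPHA_05 else "below"
    print(f"{motif:5s} (O-E)^2/E = {term:.3f} ({flag} {CRITICAL_DF1_ALPHA_05})")
print("largest contribution:", largest)
```

Under this simplified reading, every motif falls well below the critical value, which is in line with the reported absence of overrepresented motifs.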
In this study, none of these motifs were consistently associated with the development of these substitution mutations or overrepresented within these regions. Further evaluation of mutation mechanisms for non-CG regions is warranted. The results of this study indicate that, other than the CG dinucleotide, specific sequences of DNA containing or flanking these substitution mutations do not appear to be a consistent factor in the development of the mutation. The large number of cytosine to thymine and guanine to adenine substitutions suggests that methylated CpGs may play an important role in the mechanism of mutagenesis of cardiac sarcomeric genes.
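As a rough check on how large the share of cytosine to thymine and guanine to adenine substitutions is, the counts in Table 2 can be tallied directly. The sketch below is illustrative only, and its variable names are hypothetical. It counts every G>A and C>T substitution, not only those at CG dinucleotides, because Table 2 does not give the sequence context of each mutation.

```python
# Substitution counts as reported in Table 2.
substitution_counts = {
    "G>A": 113, "C>T": 56, "A>G": 37, "G>C": 30,
    "T>C": 23, "G>T": 21, "C>G": 18, "A>T": 12,
    "C>A": 13, "A>C": 12, "T>G": 8, "T>A": 7,
}

# Transitions swap purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T).
TRANSITIONS = {"G>A", "A>G", "C>T", "T>C"}

total = sum(substitution_counts.values())
transition_total = sum(
    count for kind, count in substitution_counts.items() if kind in TRANSITIONS
)
transversion_total = total - transition_total

# G>A and C>T are the two changes expected from 5-methylcytosine deamination.
deamination_like = substitution_counts["G>A"] + substitution_counts["C>T"]
deamination_like_share = 100.0 * deamination_like / total

print(f"total substitutions: {total}")
print(f"transitions: {transition_total}, transversions: {transversion_total}")
print(f"G>A plus C>T: {deamination_like} ({deamination_like_share:.1f}%)")
```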
Conflict of interest

None.
References

[1] A.J. Marian, R. Roberts, The molecular genetic basis for hypertrophic cardiomyopathy, J. Mol. Cell Cardiol. 33 (2000) 655–670.
[2] R. Alcalai, J.G. Seidman, C.E. Seidman, Genetic basis of hypertrophic cardiomyopathy: from bench to the clinics, J. Cardiovasc. Electrophysiol. 19 (2007) 1–7.
[3] J.M. Bos, S.R. Ommen, M.J. Ackerman, Genetics of hypertrophic cardiomyopathy: one, two or more diseases? Curr. Opin. Cardiol. 22 (2007) 193–199.
[4] M. Alders, R. Jongbloed, W. Deelen, A. van den Wijngaard, P. Doevendans, F.T. Cate, V. Regitz-Zagrosek, H.P. Vosberg, I. Van Langen, A. Wilde, D. Dooijes, M. Mannens, The 2373insG mutation in the MYBPC3 gene is a founder mutation, which accounts for nearly one-fourth of the HCM cases in the Netherlands, Eur. Heart J. 24 (2003) 1848–1853.
[5] J. Erdmann, J. Raible, J. Maki-Abadi, M. Hummel, J. Hammann, B. Wollnik, E. Frantz, E. Fleck, R. Hetzer, V. Regitz-Zagrosek, Spectrum of clinical phenotypes and gene variants in cardiac myosin-binding protein C mutation carriers with hypertrophic cardiomyopathy, J. Am. Coll. Cardiol. 38 (2001) 322–330.
[6] H. Watkins, L. Thierfelder, R. Anan, J. Jarcho, A. Matsumori, W.J. McKenna, J.G. Seidman, C.E. Seidman, Independent origin of identical β cardiac myosin heavy chain mutations in hypertrophic cardiomyopathy, Am. J. Hum. Genet. 53 (1993) 1180–1185.
[7] J.C. Moolman, W.J. De Lange, E.C.D. Bruwer, P.A. Brink, V.A. Corfield, The origins of hypertrophic cardiomyopathy-causing mutations in two South African subpopulations: a unique profile of both independent and founder events, Am. J. Hum. Genet. 65 (1999) 1308–1320.
[8] M. Krawczak, E.V. Ball, D.N. Cooper, Neighboring-nucleotide effects on the rates of germ-line single-base-pair substitution in human genes, Am. J. Hum. Genet. 63 (1998) 474–488.
[9] S.E. Antonarakis, M. Krawczak, D.N. Cooper, The nature and mechanisms of human gene mutation, in: C.R. Scriver, W.S. Sly (Eds.), The Metabolic & Molecular Basis of Inherited Disease, McGraw-Hill, New York, 2001, pp. 343–377.
[10] P.F. Arndt, T. Hwa, Identification and measurement of neighbor-dependent nucleotide substitution processes, Bioinformatics 21 (2005) 2322–2328.
[11] Genomics of Cardiovascular Development, Adaptation, and Remodeling. NHLBI Program for Genomic Applications, Harvard Medical School (accessed October 2007).
[12] W.J. Kent, C.W. Sugnet, T.S. Furey, K.M. Roskin, T.H. Pringle, A.M. Zahler, D. Haussler, The human genome browser at UCSC, Genome Res. 12 (2002) 996–1006, http://www.genome.org/cgi/content/abstract/12/6/996.
[13] D.N. Cooper, M. Krawczak, Single base pair substitutions, in: Human Gene Mutations, BIOS Scientific Publishers Limited, Oxford, 1999, pp. 109–162.
[14] C.P. Walsh, G.L. Xu, Cytosine methylation and DNA repair, Curr. Top. Microbiol. Immunol. 301 (2006) 283–315.
[15] Z. Feng, W. Hu, Y. Hu, M. Tang, Acrolein is a major cigarette-related lung cancer agent: preferential binding at p53 mutational hotspots and inhibition of DNA repair, Proc. Natl. Acad. Sci. U.S.A. 103 (2006) 15404–15409.
[16] M.F. Denissenko, A. Pao, M. Tang, G.P. Pfeifer, Preferential formation of benzo[a]pyrene adducts at lung cancer mutational hotspots in P53, Science 274 (1996) 430–432.
[17] G.P. Pfeifer, M.F. Denissenko, M. Olivier, N. Tretyakova, S.S. Hecht, P. Hainaut, Tobacco smoke carcinogens, DNA damage and p53 mutations in smoking-associated cancers, Oncogene 21 (2002) 7435–7451.
[18] J.X. Chen, Y. Zheng, M. West, M. Tang, Carcinogens preferentially bind at methylated CpG in the p53 mutational hotspots, Cancer Res. 58 (1998) 2070–2075.