Heteroplasmic substitutions in the entire mitochondrial genomes of human colon cells detected by ultra-deep 454 sequencing


G Model FSIGEN-1260; No. of Pages 5 Forensic Science International: Genetics xxx (2014) xxx–xxx Contents lists available at ScienceDirect Forensic ...



Forensic Science International: Genetics journal homepage: www.elsevier.com/locate/fsig

Katarzyna Skonieczna a, Boris Malyarchuk b, Arkadiusz Jawień c, Andrzej Marszałek d, Zbigniew Banaszkiewicz c, Paweł Jarmocik c, Marcelina Borcz e, Piotr Bała e, Tomasz Grzybowski a,*

a Department of Molecular and Forensic Genetics, Institute of Forensic Medicine, Ludwik Rydygier Collegium Medicum, Nicolaus Copernicus University, 9 Sklodowskiej-Curie Street, 85-094 Bydgoszcz, Poland
b Institute of Biological Problems of the North, Far-East Branch of the Russian Academy of Sciences, 18 Portovaya Street, 685000 Magadan, Russia
c Department of Vascular Surgery and Angiology, Ludwik Rydygier Collegium Medicum, Nicolaus Copernicus University, 9 Sklodowskiej-Curie Street, 85-094 Bydgoszcz, Poland
d Department of Clinical Pathomorphology, Ludwik Rydygier Collegium Medicum, Nicolaus Copernicus University, 9 Sklodowskiej-Curie Street, 85-094 Bydgoszcz, Poland
e Interdisciplinary Centre for Mathematical and Computational Modelling, University of Warsaw, 5a Pawinskiego Street, 02-106 Warsaw, Poland

ARTICLE INFO

Article history:

Keywords: Mitochondrial genome; Phylogeny; Haplogroup; Heteroplasmy; 454 sequencing; Low-level variants

ABSTRACT

Mitochondrial DNA (mtDNA) heteroplasmy has been widely described from clinical, evolutionary and analytical points of view. Historically, the majority of studies have been based on Sanger sequencing. However, next-generation sequencing technologies are now being used for heteroplasmy analysis. Ultra-deep sequencing approaches provide increased sensitivity for detecting minority variants. However, a phylogenetic a posteriori analysis revealed that most of the next-generation sequencing data published to date suffers from shortcomings. Because implementation of new technologies in clinical, population, or forensic studies requires proper verification, in this paper we present a direct comparison of ultra-deep 454 and Sanger sequencing for the detection of heteroplasmy in complete mitochondrial genomes of normal colon cells. The spectrum of heteroplasmic mutations is discussed against the background of mitochondrial DNA variability in human populations. © 2014 Elsevier Ireland Ltd. All rights reserved.

* Corresponding author. Tel.: +48 52 585 35 49; fax: +48 52 585 35 53. E-mail address: [email protected] (T. Grzybowski).

http://dx.doi.org/10.1016/j.fsigen.2014.10.021 1872-4973/© 2014 Elsevier Ireland Ltd. All rights reserved.

Please cite this article in press as: K. Skonieczna, et al., Heteroplasmic substitutions in the entire mitochondrial genomes of human colon cells detected by ultra-deep 454 sequencing, Forensic Sci. Int. Genet. (2014), http://dx.doi.org/10.1016/j.fsigen.2014.10.021

1. Introduction

Mitochondrial DNA sequence heteroplasmy, defined as the presence of different sequences within a single individual, has been considered extensively from clinical, population, and analytical points of view [1–21]. For heteroplasmy at single nucleotide positions, standard mtDNA analysis based on dideoxy sequencing can detect only those sequence changes for which the minority variant level is above 10%, as judged by peak heights on an electropherogram [3,9]. However, recent developments in ultra-deep DNA sequencing technologies may aid in the identification of low-level heteroplasmic variants (i.e., those below 10%) [11,14,17]. Indeed, next-generation sequencing (NGS) approaches such as 454 (GS FLX, Roche Diagnostics), MiSeq (Illumina), or Ion Torrent PGM (Life Technologies) allow not only fast and robust sequence determination, but also efficient heteroplasmy detection, the extent of which depends on the particular technology and the depth of sequence coverage. Moreover, NGS approaches allow a more objective measurement of the minor component, expressed as a percentage of the number of reads. Nevertheless, the accuracy of NGS technologies should be carefully verified before their implementation in clinical, population, or forensic studies. Otherwise, low-quality mtDNA sequences may lead to false conclusions [22,23]. Thus far, only a few studies have described the application of 454 technology to forensic and clinical mitochondrial DNA analysis [8,10,13,14,17,20]. Furthermore, ultra-deep 454 mitogenome sequencing was performed in only one of these studies, and the data obtained were not subjected to phylogenetic scrutiny [13].

Heteroplasmy in human cells may be inherited or may be a consequence of somatic changes in the mitochondrial DNA of a progenitor cell. For example, previous studies have shown that replicative cells, such as cells lining the colon, accumulate mtDNA mutations [24,25]. Therefore, colon tissue appears to be a convenient model to investigate heteroplasmy. Ultra-deep mtDNA sequencing of normal colon tissues with the Illumina technology has shown at least two heteroplasmic substitutions in a single individual [11]. The heteroplasmy ranged from 1.6% to 43.6% [11]. Only 17.5% of those heteroplasmic substitutions were above the 10% level, and could thus be detected by conventional Sanger sequencing [11]. However, the results of the above-mentioned study [11] could not be reconciled with human mtDNA phylogeny, indicating the presence of errors in this data set [22,23].

Here, we apply the 454 sequencing approach to an analysis of heteroplasmic substitution occurrence in the complete mitochondrial genomes of normal colon cells obtained from 50 Polish individuals. The 454 sequencing of their entire mitochondrial genomes enabled us to detect heteroplasmic variants at levels ranging from 2% to 43% in 32% of the investigated individuals. The low-level variants, not detected by Sanger sequencing, accounted for approximately 57% of all heteroplasmic mutations.

2. Materials and methods

2.1. Biological material

The study was approved by the Bioethics Committee of the Ludwik Rydygier Collegium Medicum, Nicolaus Copernicus University in Bydgoszcz, Poland (statement no. KB 432/2008). Normal colon tissues (0.5 cm³ each) were collected from 50 individuals undergoing resection for colonic tumors at the Department of Vascular Surgery and Angiology CM NCU in Bydgoszcz. Each sample of normal colon tissue was collected at least 15 cm from the tumor edge. An experienced pathologist examined the tissue samples obtained for mtDNA analysis. In all cases the samples were found to be non-neoplastic. In addition, pathologists routinely examined the surgical margins of the bowel resection to assess the surgery protocol. In all cases the resection margins were described as being free from neoplastic disease (uninvolved by invasive carcinoma or dysplastic changes).
Thus, each sample was doubly verified for tissue normalcy.

2.2. 454 Sequencing

Total DNA was isolated from 50 normal colon tissue samples with a GeneMATRIX Bio-Trace DNA Purification Kit (EURX, Gdansk, Poland). The entire mitochondrial genomes were amplified according to the protocol described in Fendt et al. [26]. PCR products were purified with Microcon YM-100 (EMD Millipore, Billerica, MA, USA) and their concentrations were estimated densitometrically using Science Lab 99 Image Gauge v.3.46 (Fujifilm, Tokyo, Japan). Purified PCR products were mixed in equimolar ratios. Rapid Libraries of samples were labeled with different RLMID adaptors (Roche Diagnostics GmbH, Mannheim, Germany). Four libraries (labeled with different RLMIDs) were mixed in equimolar ratios and sequenced on a ¼ PTP plate using a GS FLX Sequencer (Roche Diagnostics GmbH, Mannheim, Germany). Rapid Library preparation, emPCR, and 454 sequencing were performed according to the manufacturer's instructions using Titanium reagents (Roche Diagnostics GmbH, Mannheim, Germany).

2.3. Resequencing

For comparative purposes, the entire mitochondrial genomes of all samples were sequenced by the dideoxy method. Briefly, the purified PCR products used for 454 sequencing were also sequenced with primers designed by Fendt et al. [26] and Torroni et al. [27] using BigDye Terminator v.3.1 chemistry and a 3130xl Genetic Analyzer (Applied Biosystems, Life Technologies, Grand Island, NY, USA).
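The equimolar mixing mentioned in Section 2.2 can be illustrated with a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes the common approximation of 660 g/mol per base pair of double-stranded DNA, and the fragment names, concentrations, and lengths are hypothetical values chosen only for illustration.

```python
# Approximate average mass of one base pair of double-stranded DNA (g/mol).
# This is a widely used rule of thumb, not a value given in the paper.
BP_MASS = 660.0


def picomoles(ng, length_bp):
    """Convert a mass of double-stranded DNA (ng) to picomoles."""
    return ng * 1000.0 / (length_bp * BP_MASS)


def equimolar_volumes(amplicons, target_pmol):
    """Return the volume (uL) of each amplicon that contains target_pmol.

    amplicons maps a name to (concentration in ng/uL, length in bp).
    """
    volumes = {}
    for name, (conc_ng_per_ul, length_bp) in amplicons.items():
        pmol_per_ul = picomoles(conc_ng_per_ul, length_bp)
        volumes[name] = target_pmol / pmol_per_ul
    return volumes


# Hypothetical amplicons; the names and numbers are not taken from the paper.
amplicons = {"fragment_A": (40.0, 8500), "fragment_B": (25.0, 8300)}
volumes = equimolar_volumes(amplicons, target_pmol=0.05)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} uL")
```

Because the molar amount depends on fragment length as well as mass, two products at the same ng/uL concentration generally need different volumes to contribute equal numbers of molecules.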

2.4. Data analysis

The raw data obtained by the 454 platform were analyzed with GS Run Processor v.2.5.3 and GS Reporter v.2.5.3 (Roche Diagnostics GmbH, Mannheim, Germany). The mtDNA sequences obtained by the 454 and dideoxy sequencing technologies were compared with the revised Cambridge Reference Sequence, rCRS [28], using GS Reference Mapper v.2.5.3 (Roche Diagnostics GmbH, Mannheim, Germany) or SeqScape v.2.5 software (Applied Biosystems, Life Technologies, Grand Island, NY, USA), respectively. For the GS Reference Mapper v.2.5.3 results, all three files (454AllDiffs.txt, 454HCDiffs.txt, and 454AlignmentInfo.tsv) were used to determine the sample haplotype.

Several quality control steps were applied to detect mtDNA point heteroplasmy using 454 sequencing technology. Heteroplasmy was confirmed when (i) the minority variant occurred in at least 20 unique reads with a high quality score, (ii) at least 35% of all reads with the minority variant were observed from both forward and reverse strands, and (iii) the ratio of forward to reverse reads for the minority variant was similar to that calculated for the majority variant. For the Sanger sequencing results, point heteroplasmy was scored when the electropherograms for the forward and reverse strands showed clear evidence of two different nucleotides above the sequence background (approximately a 10% threshold). We did not measure the percentage of the minority variant for heteroplasmic positions detected by the dideoxy method, because of the potential for bias resulting from a disproportionate rate of nucleotide incorporation during the sequencing reaction and a dissimilar signal intensity at different positions in the sequence [29]. Two scientists independently reviewed the raw data (the 454 and dideoxy sequencing results). The GenBank [30] accession numbers of the obtained sequences are JX128058, JX128059, and KM047188–KM047235.
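The three read-level criteria above can be expressed as a small filter. The sketch below is only one possible reading of them, under stated assumptions: the 35% criterion is interpreted as each strand carrying at least 35% of the minority-variant reads, "similar" forward/reverse ratios are operationalized with the Yates-corrected Chi-squared test that this section names for the F/R comparison, and the significance level of 0.05 is an assumption. Read uniqueness and base quality are not modeled. All names (`StrandCounts`, `passes_qc`, `yates_chi2_p`) are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class StrandCounts:
    """Read counts supporting one variant on each strand."""
    forward: int
    reverse: int

    @property
    def total(self):
        return self.forward + self.reverse


def yates_chi2_p(a, b, c, d):
    """P-value of a 2x2 Chi-squared test with Yates correction (1 d.f.)."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 1.0  # degenerate table: no evidence of a difference
    corrected = max(abs(a * d - b * c) - n / 2.0, 0.0)
    chi2 = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the Chi-squared distribution with 1 d.f.
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(minor, major, min_reads=20, min_strand_share=0.35, alpha=0.05):
    """Apply the three criteria to one candidate heteroplasmic position."""
    if minor.total < min_reads:
        return False
    # Assumed reading of criterion (ii): the weaker strand must still
    # carry at least 35% of the minority-variant reads.
    if min(minor.forward, minor.reverse) / minor.total < min_strand_share:
        return False
    # Criterion (iii): F/R ratios of minor and major variants do not differ.
    p = yates_chi2_p(minor.forward, minor.reverse, major.forward, major.reverse)
    return p >= alpha


print(passes_qc(StrandCounts(30, 25), StrandCounts(500, 480)))
print(passes_qc(StrandCounts(38, 2), StrandCounts(500, 480)))
```

The first call describes a balanced minority variant and passes; the second is rejected because almost all supporting reads come from one strand, a pattern more typical of a strand-specific artifact than of true heteroplasmy.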
To exclude the possibility of sample contamination, whole mitochondrial genome sequences were determined for all of the laboratory staff, including the person who handled the samples. Haplotypes were assigned to known mtDNA haplogroups according to the nomenclature of van Oven and Kayser [31]. The most parsimonious tree of the complete sequences obtained from normal cells was reconstructed with mtPhyl v.4.015 software [32]. The differences in the number of 454 reads from the forward and reverse strands for minority and majority variants (F/R ratio) were analyzed statistically with a Chi-squared test with Yates correction, using STATISTICA v.9.1 software (StatSoft Inc./Dell Software, Aliso Viejo, CA, USA). The evolutionary stability and possible pathogenicity of heteroplasmic mutations were analyzed according to the data presented by van Oven and Kayser [31], Ruiz-Pesini et al. [33], Pereira et al. [34], and Soares et al. [35].

3. Results

3.1. Consistency of 454 and Sanger sequencing results regarding homoplasmic substitutions

The 454 sequencing of the amplicons showed that mitochondrial DNA positions in the 50 normal colon tissue samples were covered a minimum of 94 times and a maximum of 16,522 times, with a mean of 1591 times and a median of 1270 times (Fig. 1). The accuracy of the data obtained by the 454 technology was confirmed by Sanger sequencing. Comparison of the files obtained from 454 sequencing with those obtained from the dideoxy method did not show any differences between homoplasmic positions outside homopolymer regions (Fig. S1). The results of 454 and dideoxy sequencing of homopolymer tracts containing up to four nucleotides were identical. However, for homopolymer tracts consisting of five or more identical bases, the 454 sequencing reads showed a tendency to be miscalled (mainly indicating deletion).
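Homopolymer tracts of five or more identical bases can be located in a reference sequence before variant calling, so that positions inside them are treated separately. The sketch below shows one way to do this with a regular expression; the function names are hypothetical, the input is a toy string rather than the rCRS, and the circularity of the mitochondrial genome is ignored.

```python
import re


def homopolymer_runs(sequence, min_len=5):
    """Return (start, end, base) for each run of at least min_len
    identical bases, using 1-based inclusive coordinates."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start() + 1, m.end(), m.group(1))
        for m in pattern.finditer(sequence.upper())
    ]


def masked_positions(sequence, min_len=5):
    """Return the set of 1-based positions that fall inside such runs."""
    positions = set()
    for start, end, _base in homopolymer_runs(sequence, min_len):
        positions.update(range(start, end + 1))
    return positions


# Toy sequence for illustration only: a C run of 5, a G run of 4, a T run of 7.
toy = "ACGTCCCCCTAGGGGATTTTTTTC"
print(homopolymer_runs(toy))
print(len(masked_positions(toy)))
```

In the toy sequence the run of four G bases is left unmasked, while the runs of five C and seven T bases are reported.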

Please cite this article in press as: K. Skonieczna, et al., Heteroplasmic substitutions in the entire mitochondrial genomes of human colon cells detected by ultra-deep 454 sequencing, Forensic Sci. Int. Genet. (2014), http://dx.doi.org/10.1016/j.fsigen.2014.10.021

G Model

FSIGEN-1260; No. of Pages 5 K. Skonieczna et al. / Forensic Science International: Genetics xxx (2014) xxx–xxx

Fig. 1. The average 454 sequencing coverage for all 50 colon samples, plotted at each mtDNA position (1 to 16569).
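Coverage matters because the requirement of at least 20 supporting reads (Section 2.4) sets a lower bound on the minority fraction that can be called at a given position. The sketch below computes that bound for the coverage values reported in Section 3.1. It considers the read-count criterion only; the strand and quality criteria, and sampling noise, are ignored, and the function name is hypothetical.

```python
def min_detectable_fraction(coverage, min_reads=20):
    """Smallest minority fraction that can reach min_reads at this coverage.

    Returns None when the coverage itself is below min_reads, in which
    case the read-count criterion cannot be met at all.
    """
    if coverage < min_reads:
        return None
    return min_reads / coverage


# Coverage values reported in Section 3.1.
reported = {"minimum": 94, "median": 1270, "mean": 1591, "maximum": 16522}
for label, coverage in reported.items():
    floor = min_detectable_fraction(coverage)
    print(f"{label:>8} coverage {coverage:>6}: floor {floor:.2%}")
```

Under this simplification the floor is about 1.3% at the mean coverage of 1591 reads but about 21% at the minimum coverage of 94 reads, so sensitivity varies considerably along the genome.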

The accuracies for homopolymer tracts of 5, 6, and 7 nucleotides were approximately 98%, 94%, and 91%, respectively. Accuracy decreased to approximately 35% for 8 identical bases. Thus, we excluded homopolymer mtDNA regions consisting of more than four identical bases from the analysis.

3.2. Verification of haplotypes with human phylogeny

The reliability of the obtained sequences was verified a posteriori using phylogenetic methods, by comparing the haplotypes against the current mtDNA tree [36,37]. All obtained haplotypes, irrespective of the sequencing technology used, were correctly assigned (according to haplogroup-diagnostic motifs) to known mtDNA clades (Fig. S1) [31]. However, Sanger sequencing only detected heteroplasmic substitutions in normal colon cells when the 454 reads of the minor component constituted at least 16% (Table S1). Among the samples investigated, 48% belonged to haplogroup HV, 32% to haplogroup U, 16% to haplogroup T, and 4% to haplogroup J (Fig. S1). External contamination of the investigated samples was excluded, because all heteroplasmic substitutions were observed at positions that were invariant in the complete mitochondrial genome of the person who handled the samples (Fig. S1) and in the mtDNA database of the laboratory staff.

3.3. The spectrum of heteroplasmy in mitochondrial genomes of normal colon cells

Twenty-three heteroplasmic substitutions were detected by 454 sequencing in 16 (32%) individuals and were located at different positions (Fig. S1, Table S1). The frequencies of heteroplasmic substitutions were not associated with any mitochondrial haplogroup (Fisher exact test; p > 0.05). The level of the minority variant detected by the 454 platform ranged from 2% to 43%. More than half (57%) of the heteroplasmic variants were not detected by the Sanger method. Indeed, the level of the minority variant at these heteroplasmic positions was less than 16%.
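The headline counts and percentages in this section can be cross-checked with simple arithmetic. In the sketch below, the figure of 13 variants missed by Sanger sequencing is the number stated in the Discussion, and the split of carriers by number of heteroplasmic positions is an inference from the rounded percentages given later in this section (about 63%, about 31%, and a single tissue with three), not a table taken from the paper.

```python
individuals = 50
carriers = 16          # individuals with at least one heteroplasmic position
substitutions = 23     # heteroplasmic substitutions in total
missed_by_sanger = 13  # "an additional 13 heteroplasmic positions" (Discussion)

# Inferred split (assumption): number of carriers with 1, 2 or 3 positions.
carriers_by_count = {1: 10, 2: 5, 3: 1}


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# The inferred split must reproduce both totals.
assert sum(carriers_by_count.values()) == carriers
assert sum(n * k for n, k in carriers_by_count.items()) == substitutions

print(f"carriers:         {percent(carriers, individuals):.1f}%")
print(f"missed by Sanger: {percent(missed_by_sanger, substitutions):.1f}%")
for n_positions, n_carriers in carriers_by_count.items():
    share = percent(n_carriers, carriers)
    print(f"{n_positions} position(s):    {share:.1f}% of carriers")
```

Given one individual with three positions, the split of ten individuals with one position and five with two is the only combination that yields 16 carriers and 23 substitutions, which agrees with the rounded values in the text.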
All of the 23 heteroplasmic mutations observed in the 16 samples were transitions. A G to A change was the most prevalent (35%), whereas an A to G transition occurred with the lowest frequency (17%, Table S1). A single heteroplasmic position in the mitochondrial genome was found in approximately 63% of individuals with heteroplasmy. Two heteroplasmic substitutions were observed in approximately 31% of the normal colon samples with detected heteroplasmy. Three heteroplasmic mutations were detected in only one tissue (Table S1). The majority of the detected heteroplasmic substitutions (78%) occurred at positions known to be variable in the human mtDNA phylogeny (Table S1).

Analysis of the overall spectrum of heteroplasmic mutations showed that more than half (57%) of the detected heteroplasmies were located in the non-coding region. Approximately 35% of the heteroplasmies fell within protein-coding genes (CYTB, ATP6, CO1, ND1, ND4, ND5 and ND6), whereas only 9% occurred in the 12S rRNA gene (Table S1). The most stable region comprised the tRNA genes, in which no heteroplasmy was detected. Among the point mutations detected in protein-coding genes, approximately 88% affected highly evolutionarily conserved positions in the protein sequence. Indeed, the conservation index for these amino acid positions exceeded 80%. Approximately 63% of the heteroplasmic substitutions located in protein-coding genes led to amino acid changes in the polypeptide chain. Moreover, the pathogenicity score of approximately 80% of the nonsynonymous mutations exceeded 0.68 ± 0.03, a value similar to that characteristic of disease-related variants (Table S1) [34]. However, none of the heteroplasmic positions were found in the database of mtDNA variants associated with mitochondrial diseases [39].

4. Discussion

The large number of mtDNA molecules in a single cell and the high mutation rate of mtDNA, caused by exposure to reactive oxygen species (ROS), lack of histones, and inefficient DNA repair [40], may lead to sequence variability within a single individual. The presence of heteroplasmic mutations has been widely documented in reports concerning the variability of the D-loop sequence in human populations [3,5,9]. The studies performed to date show that heteroplasmy is not incidental. Indeed, the analysis of more than 5000 people has allowed the detection of heteroplasmic substitutions in approximately 6% of healthy individuals [9].
However, to date heteroplasmy detection in mtDNA has been based primarily on dideoxy sequencing, a method of limited sensitivity for the visualization of low-level variants. Ultra-deep sequencing using NGS technologies reveals a much higher incidence of heteroplasmic mutations in mitochondrial genomes [11,13,17]. The higher the coverage, the greater the prevalence of point heteroplasmy that is detected [11,13,17]. For example, Payne et al. [17] showed that in two small mtDNA regions (encompassing 579 bp) there was an extreme incidence of very low-level variants in human tissues, indicating that point heteroplasmy was present in every investigated individual. In contrast, in our data set heteroplasmy in the entire mitochondrial genome was found in only 32% of individuals. However, one should take into account that each mtDNA position in the Payne et al. [17] study reached a mean coverage of more than 8000 times, which allowed the detection of nucleotide variants (below the level of 2%) that would pass undetected in our analysis.

Notably, subjecting some published NGS results to phylogenetic scrutiny reveals that they suffer from shortcomings. For example, reconstruction of the haplotypes from all patients examined by He et al. [11] and their comparison with the human mtDNA phylogeny demonstrated that all the haplotypes lack many haplogroup-diagnostic mutation motifs [22,23]. The recently published landscape of heteroplasmic mutations in 1085 individuals analyzed in the framework of the 1000 Genomes Project [21] also contained some obvious errors. The authors reported a high prevalence of heteroplasmy in the human population, showing up to 71 heteroplasmic positions in a single individual (sample number: HG00740) [21]. However, phylogenetic analysis of those 71 heteroplasmic variants clearly indicated an artificial mutation pattern in the HG00740 sample. Indeed, all of the majority variants (each greater than 86%) belonged to the diagnostic motif of the L1b1a7a haplogroup.
Twelve of 18 minority variants were characteristic of the B2b3a haplogroup. The example described above is not exceptional. Similar problems were observed in at least 40 other samples [21], indicating sample contamination or co-amplification of numts (nuclear copies of mitochondrial DNA). These findings highlight the need for accurate validation of NGS technologies before their implementation.

All of the homoplasmic mutations detected by the 454 sequencing technology in the 50 mitochondrial genomes tested were also observed using the Sanger method. Only for homopolymer tracts longer than four nucleotides did the 454 sequencing results show uncertainty with respect to the sequence length, which is consistent with previous observations [14,20]. This technical limitation probably results from errors in the pyrosequencing reaction, because incomplete extension within a homopolymer can occur as a consequence of insufficient nucleotides within a flow cell [41]. Importantly, all haplotypes determined using the 454 technology were unambiguously assigned to known mtDNA clades based on the haplogroup-diagnostic motifs, which confirmed the accuracy of the data. Moreover, we were able to reliably reconstruct the spectrum of heteroplasmic substitutions in the entire mitochondrial genomes of human colon cells. The 454 sequencing detected all of the heteroplasmic substitutions observed by Sanger sequencing. In addition, with the 454 technology we were able to uncover a further 13 heteroplasmic positions (about 57% of all heteroplasmic substitutions) for which dideoxy sequencing failed to detect changes (Fig. S2, Table S1).

Our results are consistent with data from studies on mtDNA heteroplasmy that have used phylogenetic methods to verify the accuracy of mitochondrial DNA sequences. In the present study, heteroplasmic substitutions ranging from 2% to 43% were detected in the entire mitochondrial genomes of 32% of the investigated individuals. This frequency is slightly higher than that observed by Ramos et al.
[18], who identified point heteroplasmies in approximately 24% of individuals. However, one should take into account that the mitochondrial genomes in the Ramos et al. [18] study were determined via Sanger sequencing. Thus, the detection limit of the minor variant was at least 10%, as judged by peak height [18]. Differences between the studies may also be due to the type of tissue analyzed (blood versus colon tissue), because the rate of heteroplasmy differs by tissue [17,19].

The frequency of heteroplasmic substitutions in the control region in our data set was 22%, three times higher than that observed in other population studies [9,18]. This difference may result from the differing sensitivities of the methods used by us and by other authors. Indeed, if we consider only the variants that were confirmed via Sanger sequencing, the frequency of heteroplasmic mutations in the control region in our data set was 8%, which is similar to previous observations for blood and saliva samples [9,18]. It is also worth noting that most of the heteroplasmic substitutions (67%) that we detected in the D-loop were found at sites for which heteroplasmy has previously been observed in two or more individuals [9]. The rate of occurrence of heteroplasmic substitutions in our data set for the HVS I region (16%) was more than three-fold lower than that previously described by Holland et al. [14]. However, one should take into account that the 454 sequencing experiments by Holland et al. [14] enabled the detection of heteroplasmic mutations at a level six-fold lower than that in our data set.

The rate of heteroplasmy at individual mtDNA regions in normal colon cells described above resembles the mutation rate that has been observed for mitochondrial DNA sequences in human populations [38,42].
Moreover, the analysis of the spectrum of heteroplasmic substitutions reported in this study showed that about half of the detected mutations were located in the highly variable D-loop region, and that they occurred with the lowest frequencies in the evolutionarily stable rRNA and tRNA genes. However, none of the observed heteroplasmic substitutions were located at positions previously associated with any human mitochondrial diseases [39,43]. Yet, one cannot exclude the possibility that some changes in protein-coding genes are potentially harmful. Indeed, at least half of the detected heteroplasmic mutations in protein-coding genes that lead to amino acid changes in the protein sequence have pathogenicity scores that are higher than those calculated for disease-related variants, indicating a possible pathogenic effect on cell metabolism. The potentially "harmful" character of the heteroplasmic mutations located in protein-coding genes (for which the pathogenicity score was above 0.68 ± 0.03) and the fact that 75% of the mutations were not observed in the human phylogeny suggest that purifying selection may be acting to eliminate them from human populations.

5. Conclusions

This study applied 454 sequencing technology to the analysis of intra-individual variability in complete mitochondrial genome sequences. A direct comparison of 454 and Sanger sequencing data highlighted the greater sensitivity of NGS technologies in detecting mtDNA heteroplasmy, which could present a forensic challenge for researchers interested in comparing results generated with different technologies. In addition, the spectrum of heteroplasmic mutations in the entire mitochondrial genomes of normal cells described here could have important implications for medical, population, and forensic genetics.

Acknowledgments

The study was partially supported by a Polish Ministry of Science and Higher Education grant (no. N N301 075839) and the European Social Fund (no. 20/9/POKL/4.1.1/2008 and 145/UMK/09). Analyses of 454 data with GS Run Processor v.2.5.3, GS Reporter v.2.5.3 and GS Reference Mapper v.2.5.3 were performed using PLGrid Infrastructure.

Appendix A.
Supplementary data Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.fsigen.2014.10.021. References [1] J. Hu¨hne, H. Pfeiffer, K. Waterkamp, K. Brinkmann, Mitochondrial DNA in human hair shafts—existence of intra-individual differences, Int. J. Legal Med. 112 (1999) 172–175. [2] C.D. Calloway, R.L. Reynolds, G.L. Herrin Jr., W.W. Anderson, The frequency of heteroplasmy in the HVII region of mtDNA differs across tissue types and increases with age, Am. J. Hum. Genet. 66 (2000) 1384–1397. [3] L.A. Tully, T.J. Parsons, R.J. Steighner, M.M. Holland, M.A. Marino, V.L. Prenger, A sensitive denaturing gradient-Gel electrophoresis assay reveals a high frequency of heteroplasmy in hypervariable region 1 of the human mtDNA control region, Am. J. Hum. Genet. 67 (2000) 432–443. [4] A. Alonso, A. Salas, C. Albarra´n, E. Arroyo, A. Castro, M. Crespillo, A.M. di Lonardo, M.V. Lareu, C.L. Cubrı´a, M.L. Soto, J.A. Lorente, M.M. Semper, A. Palacio, M. Paredes, L. Pereira, A.P. Lezaun, J.P. Brito, A. Sala, M.C. Vide, M. Whittle, J.J. Yunis, J. Go´mez, Results of the 1999–2000 collaborative exercise and proficiency testing program on mitochondrial DNA of the GEP-ISFG: an inter-laboratory study of the observed variability in the heteroplasmy level of hair from the same donor, Forensic Sci. Int. 125 (2002) 1–7. [5] G. Tully, S.M. Barritt, K. Bender, E. Brignon, C. Capelli, N. Dimo-Simonin, C. Eichmann, C.M. Ernst, C. Lambert, M.V. Lareu, Results of a collaborative study of the EDNAP group regarding mitochondrial DNA heteroplasmy and segregation in hair shafts, Forensic Sci. Int. 40 (2004) 1–11. [6] M. Meyer, U. Stenzel, S. Myles, K. Pru¨fer, M. Hofreiter, Targeted high-throughput sequencing of tagged nucleic acid samples, Nucleic Acids Res. 35 (2007) e97. [7] G.G. Paneto, J.A. Martins, L.V. Longo, G.A. Pereira, A. Freschi, V.L. Alvarenga, B. Chen, R.N. Oliveira, M.H. Hirata, R.M. Cicarelli, Heteroplasmy in hair: differences

Please cite this article in press as: K. Skonieczna, et al., Heteroplasmic substitutions in the entire mitochondrial genomes of human colon cells detected by ultra-deep 454 sequencing, Forensic Sci. Int. Genet. (2014), http://dx.doi.org/10.1016/j.fsigen.2014.10.021

G Model

FSIGEN-1260; No. of Pages 5 K. Skonieczna et al. / Forensic Science International: Genetics xxx (2014) xxx–xxx

[8]

[9]

[10]

[11]

[12]

[13]

[14]

[15]

[16]

[17]

[18]

[19]

[20]

[21]

[22] [23]

among hair and blood from the same individuals are still a matter of debate, Forensic Sci. Int. 173 (2007) 117–121. M. Meyer, A.W. Briggs, T. Maricic, B. Ho¨ber, B. Ho¨ffner, J. Krause, A. Weihmann, S. Pa¨a¨bo, M. Hofreiter, From micrograms to picograms: quantitative PCR reduces the material demands of high-throughput sequencing, Nucleic Acids Res. 36 (2008) e5. J.A. Irwin, J.L. Saunier, H. Niedersta¨tter, K.M. Strouss, K.A. Sturk, T.M. Diegoli, A. Brandsta¨tter, W. Parson, T.J. Parsons, Investigation of heteroplasmy in the human mitochondrial DNA control region: a synthesis of observations from more than 5000 global population samples, J. Mol. Evol. 68 (2009) 516–527. M. Mikkelsen, E. Rockenbauer, A. Wachter, L. Fendt, B. Zimmermann, W. Parson, S.A. Nielsen, T. Gilbert, E. Willerslev, N. Morling, Application of full mitochondrial genome sequencing using 454 GS FLX pyrosequencing, Forensic Sci. Int. Genet. 2 (2009) 518–519. Y. He, J. Wu, D.C. Dressman, C. Iacobuzio-Donahue, S.D. Markowitz, V.E. Velculescu, L.A. Diaz Jr., K.W. Kinzler, B. Vogelstein, N. Papadopoulos, Heteroplasmic mitochondrial DNA mutations in normal and tumour cells, Nature 464 (2010) 610–614. M. Li, A. Scho¨nberg, M. Schaefer, R. Schroeder, I. Nasidze, M. Stoneking, Detecting heteroplasmy from high-throughput sequencing of complete human mitochondrial DNA genomes, Am. J. Hum. Genet. 87 (2010) 237–249. M.V. Zaragoza, J. Fass, M. Diegoli, D. Lin, E. Arbustini, Mitochondrial DNA variant discovery and evaluation in human cardiomyopathies through next-generation sequencing, PLOS ONE 5 (2010) e12295. M.M. Holland, M.R. McQuillan, K.A. O’Hanlon, Second generation sequencing allows for mtDNA mixture deconvolution and high resolution detection of heteroplasmy, Croat. Med. J. 52 (2011) 299–313. M.X. Sosa, I.K. Sivakumar, S. Maragh, V. Veeramachaneni, R. Hariharan, M. Parulekar, K.M. Fredrikson, T.T. Harkins, J. Lin, A.B. Feldman, P. Tata, G.B. Ehret, A. 
Chakravarti, Next-generation sequencing of human mitochondrial reference genomes uncovers high heteroplasmy frequency, PLoS Comput. Biol. 8 (2012) e1002737. Y. Guo, C.I. Li, Q. Sheng, J.F. Winther, Q. Cai, J.D. Boice, Y. Shyr, Very low-level heteroplasmy mtDNA variations are inherited in humans, J. Genet. Genomics 40 (2013) 607–615. B.A. Payne, I.J. Wilson, P. Yu-Wai-Man, J. Coxhead, D. Deehan, R. Horvath, R.W. Taylor, D.C. Samuels, M. Santibanez-Koref, P.F. Chinnery, Universal heteroplasmy of human mitochondrial DNA, Hum. Mol. Genet. 22 (2013) 384–390. A. Ramos, C. Santos, L. Mateiu, M. Gonzalez Mdel, L. Alvarez, L. Azevedo, A. Amorim, M.P. Aluja, Frequency and pattern of heteroplasmy in the complete human mitochondrial genome, PLOS ONE 8 (2013) e74636. D.C. Samuels, C. Li, B. Li, Z. Song, E. Torstenson, H. Boyd Clay, A. Rokas, T.A. Thornton-Wells, J.H. Moore, T.M. Hughes, R.D. Hoffman, J.L. Haines, D.G. Murdock, D.P. Mortlock, S.M. Williams, Recurrent tissue-specific mtDNA mutations are common in humans, PLoS Genet. 9 (2013) e1003929. M. Mikkelsen, R. Frank-Hansen, A.J. Hansen, N. Morling, Massively parallel pyrosequencing of the mitochondrial genome with the 454 methodology in forensic genetics, Forensic Sci. Int. Genet. 12 (2014) 30–37. K. Ye, J. Lu, F. Ma, A. Keinan, Z. Gu, Extensive pathogenicity of mitochondrial heteroplasmy in healthy human individuals, Proc. Natl. Acad. Sci. U. S. A. 111 (2014) 10654–10659. H.J. Bandelt, A. Salas, Current Next Generation Sequencing technology may not meet forensic standards, Forensic Sci. Int. Genet. 6 (2012) 143–145. K. Skonieczna, B.A. Malyarchuk, T. Grzybowski, The landscape of mitochondrial DNA variation in human colorectal cancer on the background of phylogenetic knowledge, Biochim. Biophys. Acta 1825 (2012) 153–159.

[24] L.C. Greaves, S.L. Preston, P.J. Tadrous, R.W. Taylor, M.J. Barron, D. Oukrif, S.J. Leedham, M. Deheragoda, P. Sasieni, M.R. Novelli, J.A. Jankowski, D.M. Turnbull, N.A. Wright, S.A. McDonald, Mitochondrial DNA mutations are established in human colonic stem cells, and mutated clones expand by crypt fission, Proc. Natl. Acad. Sci. U. S. A. 103 (2006) 714–719.
[25] L.C. Greaves, J.L. Elson, M. Nooteboom, J.P. Grady, G.A. Taylor, R.W. Taylor, J.C. Mathers, T.B. Kirkwood, D.M. Turnbull, Comparison of mitochondrial mutation spectra in ageing human colonic epithelium and disease: absence of evidence for purifying selection in somatic mitochondrial DNA point mutations, PLoS Genet. 8 (2012) e1003082.
[26] L. Fendt, B. Zimmermann, M. Daniaux, W. Parson, Sequencing strategy for the whole mitochondrial genome resulting in high quality sequences, BMC Genomics 10 (2009) 139.
[27] A. Torroni, C. Rengo, V. Guida, F. Cruciani, D. Sellitto, A. Coppa, F.L. Calderon, B. Simionati, G. Valle, M. Richards, V. Macaulay, R. Scozzari, Do the four clades of the mtDNA haplogroup L2 evolve at different rates? Am. J. Hum. Genet. 69 (2001) 1348–1356.
[28] R.M. Andrews, I. Kubacka, P.F. Chinnery, R.N. Lightowlers, D.M. Turnbull, N. Howell, Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA, Nat. Genet. 23 (1999) 147.
[29] L.T. Parker, Q. Deng, H. Zakeri, C. Carlson, D.A. Nickerson, P.Y. Kwok, Peak height variations in automated sequencing of PCR products using Taq dye-terminator chemistry, Biotechniques 19 (1995) 116–121.
[30] http://www.ncbi.nlm.nih.gov/genbank/.
[31] M. van Oven, M. Kayser, Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation, Hum. Mutat. 30 (2009) E386–394, updated 19.02.2014.
[32] http://eltsov.org.
[33] E. Ruiz-Pesini, D. Mishmar, M. Brandon, V. Procaccio, D.C. Wallace, Effects of purifying and adaptive selection on regional variation in human mtDNA, Science 303 (2004) 223–226.
[34] L. Pereira, P. Soares, P. Radivojac, B. Li, D.C. Samuels, Comparing phylogeny and the predicted pathogenicity of protein variations reveals equal purifying selection across the global human mtDNA diversity, Am. J. Hum. Genet. 88 (2011) 433–439.
[35] P. Soares, L. Ermini, N. Thomson, M. Mormina, T. Rito, A. Röhl, A. Salas, S. Oppenheimer, V. Macaulay, M.B. Richards, Correcting for purifying selection: an improved human mitochondrial molecular clock, Am. J. Hum. Genet. 84 (2009) 740–759.
[36] H.J. Bandelt, P. Lahermo, M. Richards, V. Macaulay, Detecting errors in mtDNA data by phylogenetic analysis, Int. J. Legal Med. 115 (2001) 64–69.
[37] A. Salas, Y.G. Yao, V. Macaulay, A. Vega, A. Carracedo, H.J. Bandelt, A critical reassessment of the role of mitochondria in tumorigenesis, PLoS Med. 2 (2005) e296.
[38] L. Pereira, F. Freitas, V. Fernandes, J.B. Pereira, M.D. Costa, S. Costa, V. Máximo, V. Macaulay, R. Rocha, D.C. Samuels, The diversity present in 5140 human mitochondrial genomes, Am. J. Hum. Genet. 84 (2009) 628–640.
[39] https://www.mitomap.org/MITOMAP.
[40] C. Richter, J.W. Park, B.N. Ames, Normal oxidative damage to mitochondrial and nuclear DNA is extensive, Proc. Natl. Acad. Sci. U. S. A. 85 (1988) 6465–6467.
[41] S.M. Huse, J.A. Huber, H.G. Morrison, M.L. Sogin, D.M. Welch, Accuracy and quality of massively parallel DNA pyrosequencing, Genome Biol. 8 (2007) R143.
[42] B.A. Malyarchuk, Distribution of nucleotide substitutions in human mitochondrial DNA genes, Russ. J. Genet. 41 (2005) 93–99.
[43] D.C. Wallace, A mitochondrial paradigm of metabolic and degenerative diseases, aging, and cancer: a dawn for evolutionary medicine, Annu. Rev. Genet. 39 (2005) 359–407.

Please cite this article in press as: K. Skonieczna, et al., Heteroplasmic substitutions in the entire mitochondrial genomes of human colon cells detected by ultra-deep 454 sequencing, Forensic Sci. Int. Genet. (2014), http://dx.doi.org/10.1016/j.fsigen.2014.10.021