Molecular Phylogenetics and Evolution Vol. 12, No. 2, July, pp. 105–114, 1999
Article ID mpev.1998.0602, available online at http://www.idealibrary.com

Primers for a PCR-Based Approach to Mitochondrial Genome Sequencing in Birds and Other Vertebrates

Michael D. Sorenson,*,1 Jennifer C. Ast,† Derek E. Dimcheff,† Tamaki Yuri,† and David P. Mindell†

*Department of Biology, Boston University, Boston, Massachusetts 02215; and †Museum of Zoology and Department of Biology, University of Michigan, Ann Arbor, Michigan 48109-1079

Received June 24, 1998; revised September 24, 1998
A PCR-based approach to sequencing complete mitochondrial genomes is described along with a set of 86 primers designed primarily for avian mitochondrial DNA (mtDNA). This PCR-based approach allows an accurate determination of complete mtDNA sequences that is faster than sequencing cloned mtDNA. The primers are spaced at about 500-base intervals along both DNA strands. Many of the primers incorporate degenerate positions to accommodate variation in mtDNA sequence among avian taxa and to reduce the potential for preferential amplification of nuclear pseudogenes. Comparison with published vertebrate mtDNA sequences suggests that many of the primers will have broad taxonomic utility. In addition, these primers should make available a wider variety of mitochondrial genes for studies based on smaller data sets. © 1999 Academic Press
1 To whom correspondence should be addressed at Department of Biology, Boston University, 5 Cummington Street, Boston, MA 02215. Fax: (617) 353-6340. E-mail: [email protected].

Because of its maternal inheritance, haploidy, and rapid rate of evolution, mitochondrial DNA (mtDNA) has many advantages as a marker for phylogenetic analyses (Moore, 1995) and is one of the most frequently used markers in molecular systematics (e.g., Mindell, 1997). While advances in DNA sequencing technology have made it possible to determine longer sequences for larger samples of taxa, we believe that the sampling of mitochondrial genes for systematic studies continues to be limited by the availability of polymerase chain reaction (PCR) primers. In avian studies, the cytochrome b gene has been sequenced more often than all other mitochondrial genes combined (Table 1), presumably due to the early availability of “universal” primers for this gene (Kocher et al., 1989). This historical inertia should be replaced by a deliberate choice of the genes that are likely to be the most informative for a given study. In addition, it now
appears that adequate resolution of higher-level relationships in birds and other taxa will require larger data sets, including a number of different genes (Cummings et al., 1995; Mindell et al., 1997) or even the entire mitochondrial genome (e.g., Janke et al., 1994; Zardoya and Meyer, 1996).

We describe here a PCR-based approach to complete mitochondrial genome sequencing and provide sequences of the primers we have used for work on birds and other vertebrates. This PCR-based approach allows a more rapid determination of complete mtDNA sequences than a traditional approach using cloned mtDNA. In addition, these primers should make available a wider range of mitochondrial genes for smaller sequencing studies and, based on comparisons with published sequences, many of the primers will have utility for other vertebrates. Covering the entire mitochondrial genome with a focus on birds, this paper augments previous compilations of PCR primers for broader taxonomic categories (e.g., Palumbi, 1996; Simon et al., 1994) and more specialized primer sets reported in many studies of birds (e.g., Cooper, 1994; Edwards et al., 1991; Edwards, 1993; Lee et al., 1997; Quinn and Wilson, 1993; Tarr, 1995) and other reptiles (e.g., Kumazawa and Nishida, 1993; Quinn and Mindell, 1996).

Targeting highly conserved regions in published vertebrate mtDNA sequences and new sequences that we generated, we designed pairs of PCR primers that would amplify the entire mitochondrial genome in fragments of 3–5 kb (Fig. 1). We used hot-start, XL-PCR with rTth DNA polymerase (Perkin–Elmer) to amplify these large fragments: 50-µl reactions included 1.25 mM Mg(OAc)2, 0.2 mM each dNTP, 0.4 µM each primer, 1 unit rTth DNA polymerase-XL, and 100 ng of total DNA prepared with a QIAamp Tissue Kit (Qiagen). Annealing temperatures were from 55 to 60°C. PCR products were gel-purified in 1.5% low-melt agarose, excised from the gel, and recovered with a Gel Extraction Kit (Qiagen).
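As a worked illustration of the reaction setup just described, the following minimal sketch applies C1·V1 = C2·V2 to the final concentrations stated above for a 50-µl reaction. The stock concentrations (25 mM Mg(OAc)2, a dNTP premix at 10 mM each, 10 µM primers) are assumptions chosen only for illustration and are not given in the text. Buffer, enzyme, template, and water are lumped into a single remaining volume because their stock strengths are not stated.

```python
# Worked C1*V1 = C2*V2 arithmetic for one 50-µl XL-PCR reaction.
FINAL_VOLUME_UL = 50.0

# name: (final concentration from the text,
#        ASSUMED stock concentration in the same units,
#        number of separate additions of that reagent)
REAGENTS = {
    "Mg(OAc)2, mM": (1.25, 25.0, 1),
    "each dNTP (premixed), mM": (0.2, 10.0, 1),
    "each primer, uM": (0.4, 10.0, 2),  # two primers per reaction
}


def stock_volume_ul(final_conc, stock_conc, final_volume_ul=FINAL_VOLUME_UL):
    """Volume of stock needed to reach final_conc in final_volume_ul."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and more concentrated than the final mix")
    return final_conc * final_volume_ul / stock_conc


def reaction_plan():
    """Per-addition volumes, plus the volume left for everything not listed."""
    plan = {name: stock_volume_ul(final, stock)
            for name, (final, stock, _) in REAGENTS.items()}
    used = sum(plan[name] * REAGENTS[name][2] for name in plan)
    plan["remaining for buffer, enzyme, template, water"] = FINAL_VOLUME_UL - used
    return plan


if __name__ == "__main__":
    for name, volume in reaction_plan().items():
        print(f"{name}: {volume:.2f} µl")
```

With different stock solutions only the middle value of each tuple needs to change; the final concentrations are the ones reported in the text.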
1055-7903/99 $30.00. Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

TABLE 1
Approximate Representation of Mitochondrial Genes among 2023 Avian mtDNA Sequences in GenBank (as of 22 April 1998)

Gene                                    No.     Percentage of total
Cytochrome b                            1227    61
12S rDNA                                268     13
Control region (D-loop)                 217     11
NADH dehydrogenase subunit 2 (ND2)      82      4
Cytochrome oxidases (mostly COI)        72      4
16S rDNA                                61      3
NADH dehydrogenase subunit 6 (ND6)      35      2
Others                                  60      3

FIG. 1. Schematic diagram of avian mtDNA showing the locations of light-strand primers listed in Fig. 2. Heavy-strand primers are shown as shorter lines between light-strand primers, but are not labeled. See Fig. 2 for primer sequences. Large fragments typically amplified in our PCR-based approach to complete mtDNA sequencing are shown outside the mtDNA map. The small subunit (12S) rRNA gene was amplified with L1263 and H2294. The gene order shown is that of Gallus gallus (Desjardins and Morais, 1990), which is shared by many but not all birds (Mindell et al., 1998).

Double-stranded PCR products were sequenced directly in cycle sequencing reactions using Taq DNA Polymerase FS (Applied Biosystems). Reaction products were run on an Applied Biosystems 377 automated DNA sequencer. Working with several taxa simultaneously, we “walked” in on both strands of these large fragments and designed additional primers at about 500-base intervals such that we now have a complete set of primers for both strands of the avian mitochondrial genome. We sequenced the entire mtDNA of five birds (peregrine falcon, Falco peregrinus; redhead, Aythya americana; village indigobird, Vidua chalybeata; grey-headed broadbill, Smithornis sharpei; and greater rhea, Rhea americana), and a painted turtle (Chrysemys picta), and 65 and 95% of the genome, respectively, of an ostrich (Struthio camelus) and an American alligator (Alligator mississippiensis). Complete mtDNA sequences for the ostrich, rhea, and alligator were recently published (Härlid et al., 1997, 1998; Janke and Arnason, 1997).

A number of standard considerations went into our primer design process (e.g., Palumbi, 1996). We targeted primers to conserved portions of ribosomal or transfer RNAs or to strings of conserved, twofold degenerate amino acids in protein coding regions. Primer sequences were evaluated with MacVector (Kodak) to ensure adequate annealing temperature and base composition (generally >50% Gs plus Cs) and to avoid secondary structure and self-annealing. We designed most primers with relatively high annealing
temperatures (55–60°C or higher) to facilitate XL-PCR. Primers included a G or C on the 3′ end whenever possible to strengthen primer-template annealing at this important position. We also considered the relative strengths of unconventional base pairings (see Kwok et al., 1990) and, for example, used G rather than A in the primer when the template was variably C or T and used T rather than C when the template was variably A or G. Finally, we incorporated from one to six degenerate positions in many primers, thereby accommodating much of the variation in mtDNA sequence among taxa. We have not explored the use of inosine in lieu of degenerate sites (e.g., Christopherson et al., 1997).

One potential pitfall of “universal” primers is the preferential amplification of nuclear sequences of mitochondrial origin (Zhang and Hewitt, 1996). Because they evolve more slowly, these nuclear copies may be more similar to ancestral mtDNA sequences than is the contemporary mtDNA (Sorenson and Fleischer, 1996; Zischler et al., 1995) and “consensus” primers tend to approximate ancestral sequences. We suggest that primers with appropriate degenerate sites are less likely to preferentially amplify nuclear pseudogenes because they accommodate likely differences between nuclear and mtDNA sequences (e.g., third-position changes in the mtDNA copy). The high ratio of mitochondrial to nuclear genome copies in most tissue samples is then allowed to determine the PCR product. Initial amplification of long mitochondrial fragments and/or use of purified mtDNA will also reduce the risk of unintended amplification of nuclear pseudogenes (Sorenson and Quinn, 1998).

FIG. 2. Primers compared to homologous sequences of a variety of avian and other vertebrate mtDNAs. L and H numbers refer to the strand and position of the 3′ base in the published chicken sequence (Desjardins and Morais, 1990). Dots indicate match with the primer (or revised primer) and lowercase letters indicate sites accommodated by degenerate positions in primer. Codes for degenerate sites are as follows: B, CGT; D, AGT; H, ACT; K, GT; M, AC; N, ACGT; R, AG; S, CG; V, ACG; W, AT; Y, CT. Uppercase letters indicate mismatches with the primer. A line separates birds from other vertebrates. Where two primer sequences are listed, the first sequence is the primer used in our study and the second is a revised version (with changes underlined) that we would use in future work. Previously published primers are noted: (a) Mindell et al., 1991; (b) Kocher et al., 1989; (c) Miranda et al., 1997; (d) Sorenson and Quinn, 1998; (e) A. Cooper (in Edwards, 1993). Each primer sequence is followed by homologous sequences for the taxa sequenced in our study (AYAM, Aythya americana; RHAM, Rhea americana; STCA, Struthio camelus; FAPE, Falco peregrinus; VICH, Vidua chalybeata; SMSH, Smithornis sharpei; ALMI, Alligator mississippiensis; CHPI, Chrysemys picta; GenBank Accession Nos. AF069422–AF069431) plus Gallus gallus (GAGA, X52392; Desjardins and Morais, 1990), Homo sapiens (HOSA, J01415; Anderson et al., 1981), Mus musculus (MUMU, J01420; Bibb et al., 1981), Didelphis virginiana (DIVI, Z29573; Janke et al., 1994), Ornithorhynchus anatinus (ORAN, X83427; Janke et al., 1996), Xenopus laevis (XELA, M10217; Roe et al., 1985), Crossostoma lacustre (CRLA, M91245; Tzeng et al., 1992), and Cyprinus carpio (CYCA, X61010; Chang et al., 1994). Our sequences for Struthio and Alligator are supplemented by published sequences for these taxa (Y12025; Härlid et al., 1997 and Y13113; Janke and Arnason, 1997, respectively).

Figure 2 compares primer sequences used in our study with a variety of vertebrate mtDNA sequences. We show only primers that we consider useful for most birds, with the exception of some control region primers that probably have more limited utility. A number of additional taxon-specific primers, most in ND5, ND6,
and the control region, were required to complete each genome, particularly for the turtle and alligator samples (Sorenson et al., in prep). We also list previously published primers that we used and our suggestions for modifications to some of these. Note that the relative positions of a given pair of primers and the expected product size may differ in taxa with gene orders different from that of chicken (see Mindell et al., 1998). PCR primers can be effective with up to 4 or 5 internal mismatches (Christopherson et al., 1997), but are very sensitive to mismatches in the first few bases of the 3′ end (Kwok et al., 1990). A codon rearrangement near the beginning of ND6 (Moum et al., 1994) makes primers in this location (L16206, L16225, H16191) unsuitable for mammals.

We have used all of the listed primers in standard PCR with annealing temperatures of 50–55°C and with 50-µl reactions including 2.5 mM MgCl2, 0.25 mM each dNTP, 0.5 µM each primer, 1.25 units Taq DNA polymerase, and 100 ng of genomic DNA.

With these primers in hand, there are two approaches to sequencing large mtDNA fragments. First, long amplification products can be sequenced directly with a number of different internal primers. Using this approach with six additional birds (Dendrocygna arcuata, Scolopax minor, Otus asio, Mycteria americana, Buteo jamaicensis, Sayornis phoebe; Mindell et al., unpublished data), we sequenced a 4.4-kb PCR product (L2258–H6681, see Fig. 1) with the 2 PCR primers and 14 internal primers. On average, 14 of these 16 primers worked well for each taxon, with different primers failing for different taxa. A second approach, which yielded more consistent results, was to use the initial, long PCR product as a template for the amplification of 500- to 700-bp fragments and then to sequence each of these products with the PCR primers used in the second amplification. This second amplification provides for a perfect match between primer and template in the sequencing reaction.
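The mismatch tolerances cited above can be screened for computationally before a primer is tried on a new taxon. The following is a minimal sketch, not the procedure used in this study: it expands the degenerate codes listed in the Fig. 2 legend, counts mismatches between a primer and a homologous site written in the same 5′ to 3′ orientation (as in the Fig. 2 alignments), and flags any mismatch near the 3′ end. The function names, the example sequences, the 3-base window standing in for "the first few bases", and the limit of 4 internal mismatches are all assumptions made for illustration.

```python
# Degenerate codes as listed in the Fig. 2 legend, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "B": "CGT", "D": "AGT", "H": "ACT", "K": "GT", "M": "AC",
    "N": "ACGT", "R": "AG", "S": "CG", "V": "ACG", "W": "AT", "Y": "CT",
}


def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count


def mismatch_positions(primer, site):
    """0-based positions where the site base is not covered by the primer code.

    Both sequences are written 5'->3' in the primer's orientation.
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must be the same length")
    return [i for i, (p, t) in enumerate(zip(primer.upper(), site.upper()))
            if t not in IUPAC[p]]


def screen(primer, site, max_internal=4, three_prime_window=3):
    """Summarize mismatches; thresholds are illustrative assumptions."""
    positions = mismatch_positions(primer, site)
    cutoff = len(primer) - three_prime_window
    three_prime = sum(1 for i in positions if i >= cutoff)
    internal = len(positions) - three_prime
    return {
        "internal": internal,
        "three_prime": three_prime,
        "likely_ok": three_prime == 0 and internal <= max_internal,
    }


if __name__ == "__main__":
    primer = "ACGYTRGC"  # hypothetical primer, not one from Fig. 2
    for site in ("ACGCTAGC", "TCGCTAGC", "ACGCTAGT"):
        print(site, screen(primer, site))
```

A site that fails this screen is not necessarily unusable, since the thresholds are rough; the check is only meant to prioritize which primers to test first for a given taxon.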
We also found that shorter fragments that would not amplify from total DNA preparations for a particular taxon often could be amplified from longer products that included the target sequence. PCR was apparently less sensitive to primer mismatches than were the cycle-sequencing reactions.

Recent studies reporting complete mtDNA sequences have emphasized the use of “natural” clones of mtDNA rather than PCR-amplified DNA (e.g., Härlid et al., 1997, 1998; Xu and Arnason, 1996), implying that a PCR-based approach is in some way inferior. We explored the question of accuracy by comparing our ostrich, rhea, and alligator sequences with published sequences for these taxa. Our ostrich sequence (10714 bp) is 99.75% identical (n = 27 differences in 19 separate locations) to the sequence reported by Härlid et al. (1997), our rhea sequence (16704 bp) is 99.67% identical (n = 55 differences in 44 separate locations, excluding a 12-base indel in a repeat region) to the sequence reported by Härlid et al. (1998), and our alligator sequence (15899 bp) is 99.80% identical (n = 32 differences in 21 separate locations) to the sequence reported by Janke and Arnason (1997). Because samples from different individual animals were used in each study, many differences (e.g., those representing third position transitions) can be attributed to intraspecific variation. At other positions, however, the previously published sequences include unusual amino acid replacements that are inconsistent with patterns of sequence conservation among diverse avian taxa (e.g., Fig. 3), as well as insertions, deletions, or transversions that are inconsistent with conserved blocks of ribosomal or transfer RNAs. In contrast, our sequences in these conserved locations are almost always consistent with sequences of other vertebrates. Such comparisons suggest that errors in the previously published sequences account for at least 4 of 19, 11 of 44, and 8 of 21 locations that differ between the two ostrich, two rhea, and two alligator sequences, respectively. By the same standards, our alligator sequence includes one suspect location in the ND3 gene, in which the two alligator sequences are temporarily offset by one base, resulting in five consecutive mismatches. Our sequence in this location is consistent, however, with the ND3 amino acid sequence of another crocodilian, Crocodylus porosus (Mindell et al., 1998), although both differ from other vertebrates for two consecutive positions.

Based on these comparisons, we conclude that sequences derived from our PCR-based approach are more accurate than previously published sequences (see also Waddell et al., in press). While it is important to know whether there are any inherent differences in data quality between cloning-based and PCR-based approaches to mtDNA sequencing, the above comparisons do not represent a controlled experiment. We suspect that errors arising during the interpretation and transcription of raw sequence data are more likely than errors caused by any systematic difference in replication fidelity between cloning and PCR.

FIG. 3. Example of a discrepancy between the ostrich mtDNA sequence obtained in this study and a previously published sequence. DNA and amino acid sequences are shown for a small portion of the COI gene. Asterisks mark positions otherwise conserved among the nine “reptilian” taxa. STCA1, this study; STCA2, Härlid et al. (1997). Other taxa as in Fig. 2. STCA2 appears to be offset by one base due to the inclusion of an extra G, which is then compensated by a missing G seven bases downstream. If translated, STCA2 would include an amino acid substitution in the fourth codon at a site conserved among birds and most other vertebrates.

ACKNOWLEDGMENTS
We thank Christine E. Thacker and Laura J. Howard for laboratory assistance. NSF Grant BSR-9496343 to D.P.M. and the University of Michigan Museum of Zoology Bird Division Swales and Fargo funds supported this research. M.D.S. was supported by NSF Grant IBN-9412399 to Robert B. Payne.
REFERENCES

Anderson, S., Bankier, A. T., Barrell, B. G., de Bruijn, M. H. L., Coulson, A. R., Drouin, J., Eperon, I. C., Nierlich, D. P., Roe, B. A., Sanger, F., Schreier, P. H., Smith, A. J. H., Staden, R., and Young, I. G. (1981). Sequence and organization of the human mitochondrial genome. Nature 290: 457–465.
Bibb, M. J., Van Etten, R. A., Wright, C. T., Walberg, M. W., and Clayton, D. A. (1981). Sequence and gene organization of mouse mitochondrial DNA. Cell 26: 167–180.
Chang, Y. S., Huang, F. L., and Lo, T. B. (1994). The complete nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome. J. Mol. Evol. 38: 138–155.
Christopherson, C., Sninsky, J., and Kwok, S. (1997). The effects of internal primer-template mismatches on RT-PCR: HIV-1 model studies. Nucleic Acids Res. 25: 654–658.
Cooper, A. (1994). DNA from museum specimens. In “Ancient DNA: Recovery and Analysis of Genetic Material from Paleontological, Archaeological, Museum, Medical, and Forensic Specimens” (B. Herrmann and S. Herrmann, Eds.), pp. 149–165. Springer-Verlag, New York.
Cummings, M. P., Otto, S. P., and Wakeley, J. (1995). Sampling properties of DNA sequence data in phylogenetic analysis. Mol. Biol. Evol. 12: 814–822.
Desjardins, P., and Morais, R. (1990). Sequence and gene organization of the chicken mitochondrial genome: A novel gene order in higher vertebrates. J. Mol. Biol. 212: 599–634.
Edwards, S. V. (1993). Mitochondrial gene genealogy and gene flow among island and mainland populations of a sedentary songbird, the grey-crowned babbler (Pomatostomus temporalis). Evolution 47: 1118–1137.
Edwards, S. V., Arctander, P., and Wilson, A. C. (1991). Mitochondrial resolution of a deep branch in the genealogical tree for perching birds. Proc. R. Soc. Lond. B 243: 99–108.
Hedges, S. B. (1994). Molecular evidence for the origin of birds. Proc. Natl. Acad. Sci. USA 91: 2621–2624.
Härlid, A., Janke, A., and Arnason, U. (1997). The mtDNA sequence of the ostrich and the divergence between paleognathous and neognathous birds. Mol. Biol. Evol. 14: 754–761.
Härlid, A., Janke, A., and Arnason, U. (1998). The complete mitochondrial genome of Rhea americana and early avian divergences. J. Mol. Evol. 46: 669–679.
Janke, A., and Arnason, U. (1997). The complete mitochondrial genome of Alligator mississippiensis and the separation between recent archosauria (birds and crocodiles). Mol. Biol. Evol. 14: 1266–1272.
Janke, A., Feldmaier-Fuchs, G., Thomas, W. K., von Haeseler, A., and Pääbo, S. (1994). The marsupial mitochondrial genome and the evolution of placental mammals. Genetics 137: 243–256.
Janke, A., Gemmell, N., Feldmaier-Fuchs, G., von Haeseler, A., and Pääbo, S. (1996). The mitochondrial genome of a monotreme, Platypus (Ornithorhynchus anatinus). J. Mol. Evol. 42: 153–159.
Kocher, T. D., Thomas, W. K., Meyer, A., Edwards, S. V., Pääbo, S., Villablanca, F. X., and Wilson, A. C. (1989). Dynamics of mitochondrial DNA evolution in animals: Amplification and sequencing with conserved primers. Proc. Natl. Acad. Sci. USA 86: 6196–6200.
Kwok, S., Kellogg, D. E., McKinney, N., Spasic, D., Goda, L., Levenson, C., and Sninsky, J. J. (1990). Effects of primer-template mismatches on the polymerase chain reaction: Human immunodeficiency virus type 1 model studies. Nucleic Acids Res. 18: 999–1005.
Kumazawa, Y., and Nishida, M. (1993). Sequence evolution of mitochondrial tRNA genes and deep-branch animal phylogenetics. J. Mol. Evol. 37: 380–398.
Lee, K., Feinstein, J., and Cracraft, J. (1997). The phylogeny of ratite birds: Resolving conflicts between molecular and morphological data sets. In “Avian Molecular Evolution and Systematics” (D. P. Mindell, Ed.), pp. 213–247. Academic Press, San Diego.
Mindell, D. P., Ed. (1997). “Avian Molecular Evolution and Systematics.” Academic Press, San Diego.
Mindell, D. P., Dick, C. W., and Baker, R. J. (1991). Phylogenetic relationships among megabats, microbats, and primates. Proc. Natl. Acad. Sci. USA 88: 10322–10326.
Mindell, D. P., Sorenson, M. D., and Dimcheff, D. E. (1998). Multiple independent origins of mitochondrial gene order in birds. Proc. Natl. Acad. Sci. USA 95: 10693–10697.
Mindell, D. P., Sorenson, M. D., and Dimcheff, D. E. (1998). An extra nucleotide is not translated in mitochondrial ND3 of some birds and turtles. Mol. Biol. Evol. 15: 1568–1571.
Miranda, H. C., Kennedy, R. S., and Mindell, D. P. (1997). The phylogenetic placement of Mimizuku gurneyi (Aves: Strigidae) inferred from mitochondrial DNA. Auk 114: 315–323.
Moore, W. S. (1995). Inferring phylogenies from mtDNA variation: Mitochondrial-gene trees versus nuclear-gene trees. Evolution 49: 718–726.
Moum, T., Willassen, N. P., and Johansen, S. (1994). Intragenic rearrangements in the mitochondrial NADH dehydrogenase subunit 6 gene of vertebrates. Curr. Genet. 25: 554–557.
Palumbi, S. R. (1996). Nucleic acids II: The polymerase chain reaction. In “Molecular Systematics” (D. M. Hillis, C. Moritz, and B. K. Mable, Eds.), 2nd ed., pp. 205–247. Sinauer, Sunderland, MA.
Quinn, T. W., and Mindell, D. P. (1996). Mitochondrial gene order adjacent to the control region in crocodile, turtle, and tuatara. Mol. Phylogenet. Evol. 5: 344–351.
Quinn, T. W., and Wilson, A. C. (1993). Sequence evolution in and around the mitochondrial control region in birds. J. Mol. Evol. 37: 417–425.
Roe, B. A., Ma, D. P., Wilson, R. K., and Wong, J. F. (1985). The complete nucleotide sequence of the Xenopus laevis mitochondrial genome. J. Biol. Chem. 260: 9759–9774.
Simon, C., Frati, F., Beckenbach, A., Crespi, B., Liu, H., and Flook, P. (1994). Evolution, weighting, and phylogenetic utility of mitochondrial gene sequences and a compilation of conserved polymerase chain reaction primers. Ann. Entomol. Soc. Am. 87: 651–701.
Sorenson, M. D., and Fleischer, R. C. (1996). Multiple independent transpositions of mitochondrial DNA control region sequences to the nucleus. Proc. Natl. Acad. Sci. USA 93: 15239–15243.
Sorenson, M. D., and Quinn, T. W. (1998). Numts: A challenge for avian systematics and population biology. Auk 115: 214–221.
Tarr, C. L. (1995). Primers for amplification and determination of mitochondrial control-region sequences in oscine passerines. Mol. Ecol. 4: 527–529.
Tzeng, C. S., Hui, C. F., Shen, S. C., and Huang, P. C. (1992). The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: Conservation and variations among vertebrates. Nucleic Acids Res. 20: 4853–4858.
Waddell, P. J., Cao, Y., Hasegawa, M., and Mindell, D. P. (1999). Assessing the Cretaceous superordinal divergence times within birds and placental mammals using whole mitochondrial protein sequences and an extended statistical framework. Syst. Biol. (in press).
Xu, X., and Arnason, U. (1996). A complete sequence of the mitochondrial genome of the western lowland gorilla. Mol. Biol. Evol. 13: 691–698.
Zardoya, R., and Meyer, A. (1996). The complete nucleotide sequence of the mitochondrial genome of the lungfish (Protopterus dolloi) supports its phylogenetic position as a close relative of land vertebrates. Genetics 142: 1249–1263.
Zhang, D.-X., and Hewitt, G. M. (1996). Nuclear integrations: Challenges for mitochondrial DNA markers. Trends Ecol. Evol. 11: 247–251.
Zischler, H., Geisert, H., von Haeseler, A., and Pääbo, S. (1995). A nuclear “fossil” of the mitochondrial D-loop and the origin of modern humans. Nature 378: 489–492.