Utility of the dentin matrix protein 1 (DMP1) gene for resolving mammalian intraordinal phylogenetic relationships


MOLECULAR PHYLOGENETICS AND EVOLUTION Molecular Phylogenetics and Evolution 26 (2003) 89–101 www.elsevier.com/locate/ympev

Ronald A. Van Den Bussche,* Serena A. Reeder,1 Eric W. Hansen, and Steven R. Hoofer

Department of Zoology and Collection of Vertebrates, 430 Life Sciences West, Oklahoma State University, Stillwater, OK 74078, USA

Received 22 January 2002; received in revised form 14 June 2002

Abstract

We sequenced exon 6 of the nuclear dentin matrix protein 1 (DMP1) gene from 19 species of bats (order Chiroptera) to assess the utility of this gene for higher-level phylogenetic studies. Bayesian analysis revealed high support (posterior probabilities ≥ 0.95) for monophyly of Noctilionoidea (Phyllostomidae, Noctilionidae, and Mormoopidae), all genera, and most families examined. Comparison of the phylogenetic information present in DMP1 with mitochondrial rDNA and nuclear RAG2 genes indicated no significant heterogeneity. Thus, we concatenated these three data sets into a single “total evidence” phylogenetic analysis. Combined analysis was congruent with study of RAG2 and combined RAG2 and mtrDNA sequences, but improved support (Bayesian posterior probabilities) for many nodes. Our results indicate that exon 6 of DMP1 is rapidly evolving, able to tolerate non-frame shifting insertion and deletion events, is more variable than RAG2, and provides phylogenetic resolution from the interfamilial to infraclass levels in mammals. © 2002 Elsevier Science (USA). All rights reserved.

Keywords: Bats; Chiroptera; DMP1; Phylogenetic relationships; Molecular evolution

1. Introduction

Although mitochondrial DNA (mtDNA) has proven valuable for elucidating evolutionary relationships, its use has potential problems (e.g., maternal inheritance, lineage sorting, single linkage-group). Nuclear DNA (nDNA) potentially houses a wealth of genealogic information, with proteins that function in immunologic, metabolic, morphologic, and physiologic systems. However, the size and diversity of the nuclear genome have hindered efforts to identify specific markers that can be extracted efficiently across taxonomic boundaries (Baker et al., 1997). Even though recent advances in molecular technologies and several mammalian genome projects together have made available new regions of the nuclear genome, the phylogenetic utility of many of these regions has yet to be examined.

* Corresponding author. E-mail address: [email protected] (R.A. Van Den Bussche).
1 Present address: Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409.

The nuclear dentin matrix protein 1 (DMP1) gene was examined recently in a phylogenetic study (Toyosawa et al., 1999) of six mammals representing Primates (Homo sapiens), Artiodactyla (Bos taurus), Rodentia (Rattus), Monotremata (Ornithorhynchus anatinus), Diprotodontia (Macropus rufogriseus), and Didelphimorphia (Monodelphis domestica). Of the six DMP1 exons, only exon 6 (about 449 amino acid residues) was suitable for phylogenetic analysis. Toyosawa et al. (1999) detected considerable amino acid divergence between Prototheria, Metatheria, and Eutheria, although there were several conserved features suggesting the sequences were orthologous. They concluded that exon 6 of DMP1 appears rapidly evolving, tolerant of non-frame shifting insertions/deletions, and has remained intact throughout the evolution of proto-, meta-, and eutherians, an interval of about 200 million years. Although the exact function of DMP1 is uncertain (George et al., 1993, 1994), it is believed to be one of potentially several genes responsible for the mineralization of tooth dentin (Toyosawa et al., 1999). DNA sequence data of DMP1 may be a useful phylogenetic

1055-7903/02/$ - see front matter © 2002 Elsevier Science (USA). All rights reserved. PII: S1055-7903(02)00297-X


R.A. Van Den Bussche et al. / Molecular Phylogenetics and Evolution 26 (2003) 89–101

marker because: (1) dentin is thought to have a long evolutionary history, (2) it is considered one of the distinguishing characteristics of the vertebrate subphylum (Smith and Hall, 1990), and (3) the presence and characterization of different types of dentin have been used in the elucidation of phylogenetic relationships among extant and extinct vertebrate taxa (Ørvig, 1967). Therefore, further characterization of DMP1 is warranted in order to assess its utility in mammalian systematics. Furthermore, the function of DMP1 differs from those of other nuclear protein coding genes currently used in systematic studies of mammals [e.g., α-2B adrenergic receptor (A2B2), breast cancer susceptibility gene (BRCA1), globins, interphotoreceptor binding protein (IRBP), milk casein genes, proto-oncogene (c-myc), and recombination activator 1 and 2 (RAG1, RAG2)].

We generated sequences from exon 6 of DMP1 for 19 species of bats (order Chiroptera) representing a taxonomic hierarchy from the subordinal to species level. These taxa were chosen because they represent both chiropteran suborders and multiple representatives of two superfamilies. This study should provide further insight into the evolution and phylogenetic utility of this gene because all taxa examined have been included in previous molecular studies (Van Den Bussche and Hoofer, 2001; Hoofer et al., Submitted), thereby providing an interpretive background (Fig. 1). We examined congruence between DMP1 and other molecular markers and concatenated those data to develop a “total evidence” tree.

2. Materials and methods

Fig. 1. Redrawing of Hoofer et al. (Submitted) phylogenetic relationships within Yangochiroptera based on the concatenation of mitochondrial 12S rRNA, tRNA-Val, 16S rRNA and nuclear RAG2 DNA sequences. Taxa indicated by an asterisk (*) were examined in the present study. Numbers above nodes indicate Bayesian posterior probabilities. K. = Kerivoula, M. = Myotis, Na. = Natalus, No. = Noctilio, and T. = Thyroptera.

Genomic DNA was extracted following standard protocols (Longmire et al., 1997) and polymerase chain reaction (PCR) amplification of DMP1 exon 6 was carried out using primers Den1 and Den2 previously described by Toyosawa et al. (1999). These primers worked for most, but not all, taxa. Therefore, after DNA sequences were obtained from representatives of several chiropteran families, additional flanking and internal primers were designed, and a total of nine primers (Table 1) was used for amplification and DNA sequencing. PCR amplifications were carried out in 50 µl using the FailSafe PCR PreMix Selection Kit (Epicentre, Madison, WI). Each PCR contained 500 ng of genomic DNA, 25 µl of FailSafe PCR 2X PreMix, 1.25 U of FailSafe

Table 1. Primers used in amplification and sequencing of exon 6 of the dentin matrix protein 1 (DMP1) gene. Amplification of exon 6 was accomplished using Den12 or Den2a in combination with Den2 or Den10. Various combinations of the remaining primers were used to completely sequence both strands of DNA.

Primer name   Primer sequence                            Citation
Den12 (F)     5′-GATGAAGACGACAGTGGAGATGACACCTT-3′        Toyosawa et al. (1999)
Den2a (F)     5′-GACACCTTTGGTGATGA-3′                    This study
DenB (R)      5′-TGATTCTCTTGATTTGACACTGG-3′              This study
DenB2 (R)     5′-CCTTCAYYATCAAACTCAGAGCC-3′              This study
DenA (F)      5′-TGCARAGYGAYGATCCAGACAC-3′               This study
DenD (R)      5′-GGATNTGCTTTCWGAACTGRAGG-3′              This study
DenC (F)      5′-ACCTCCAGTCACTCAGAAG-3′                  This study
Den10 (R)     5′-GTTGCTCTCTTGTGATTTGCTGC-3′              Toyosawa et al. (1999)
Den2 (R)      5′-ATCTTGGCAATCATTGTCATC-3′                Toyosawa et al. (1999)

Letters in parentheses after primer name indicate forward or reverse direction.


PCR Enzyme Mix, and primers in a final concentration of 1.0 µM. The thermal profile consisted of initial denaturation at 95 °C for 2 min followed by 35 cycles of 95 °C for 60 s, 55 °C for 60 s, and 72 °C for 90 s, and a final extension at 72 °C for 30 min. Double-stranded amplicons were purified using the Wizard PCR Prep DNA Purification System (Promega, Madison, WI) and sequenced using Big-Dye chain terminators and a 377 automated DNA sequencer (Applied Biosystems, Foster City, CA). Amplicons were sequenced entirely in both directions using a combination of flanking and internal primers (Table 1). AssemblyLIGN 1.0.9 (Oxford Molecular Group PLC, 1998) was used to assemble contiguous fragments within taxa. Base calling ambiguities on single strands were resolved either by choosing the call on the cleanest strand or by using the appropriate standardized IUB ambiguity code if both strands showed the same ambiguity. DNA sequences were translated into deduced amino acid sequences using MacClade 4.0 (Maddison and Maddison, 2000), and CLUSTAL X (Thompson et al., 1997) was used to obtain a multiple amino acid sequence alignment. The alignment of deduced amino acid sequences was used to manually align DNA sequences in MacClade 4.0. Relative frequency of nonsynonymous (dN) and synonymous (dS) substitutions among all pairwise comparisons of taxa was computed using MEGA (Molecular Evolutionary Genetics Analysis version 1.01; Kumar et al., 1993). Nucleotides were coded as unordered, discrete characters, and ambiguously coded sites as polymorphic. Departures from base compositional homogeneity, or stationarity (Saccone et al., 1989), across taxa were examined using PAUP*4.0b8 (Swofford, 2000). To assess the degree of substitutional saturation (i.e., multiple hits) in DMP1 sequences, we plotted observed numbers of transition and transversion substitutions (each codon position separately and combined) against uncorrected “p” values of sequence divergence.
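The quantities behind the saturation plots described above (observed transitions, transversions, and uncorrected "p" for a pair of aligned sequences) can be sketched in a few lines of Python. This is a minimal illustration, not the PAUP* or MEGA implementation: the function names are hypothetical, gaps and ambiguity codes are simply skipped (an assumption; the text codes ambiguous sites as polymorphic), and the sequences are assumed to start at codon position 1.

```python
# Transitions stay within a base class (purine or pyrimidine);
# transversions cross between the two classes.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def pairwise_counts(seq_a, seq_b):
    """Return (transitions, transversions, compared_sites) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    ti = tv = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ti += 1
        else:
            tv += 1
    return ti, tv, compared


def p_distance(seq_a, seq_b):
    """Uncorrected 'p': proportion of compared sites that differ."""
    ti, tv, n = pairwise_counts(seq_a, seq_b)
    return (ti + tv) / n if n else float("nan")


def counts_by_codon_position(seq_a, seq_b):
    """Counts for codon positions 1, 2 and 3 (sequences assumed to be in frame)."""
    return {pos + 1: pairwise_counts(seq_a[pos::3], seq_b[pos::3]) for pos in range(3)}
```

Plotting the transition and transversion counts for every pair of taxa against `p_distance` for the same pair gives one point per comparison in a plot of this kind.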
To evaluate phylogenetic relationships, maximum likelihood and Bayesian analyses were performed with a yinpterochiropteran (Pteropus hypomelanus) designated as the outgroup. Maximum likelihood analyses were carried out using PAUP*, whereas Bayesian analyses were conducted using MRBAYES (Huelsenbeck and Ronquist, 2001). Prior to maximum likelihood analysis, Modeltest (Posada and Crandall, 1998) was used to determine which model of DNA sequence evolution and model parameters best fit our data. Nodal support of the maximum likelihood tree was determined via 100 bootstrap iterations with nearest neighbor interchange branch-swapping. For Bayesian phylogenetic analysis, we used the GTR + Γ model of sequence evolution with site-specific rate variation calculated for each of the three codon positions via the “ssgamma” option in MRBAYES.


Starting trees were random, four simultaneous Markov chains were run for 1,000,000 generations, and trees were sampled every 10 generations. Three independent Bayesian analyses were run to ensure that model parameters and our final trees converged upon similar results.

2.1. Combined analysis

DNA sequences have been generated from the mitochondrial 12S rRNA, tRNA-Val, 16S rRNA (Van Den Bussche and Hoofer, 2001) and nuclear RAG2 (Hoofer et al., Submitted) genes from these same taxa. The default parameters in CLUSTAL X (Thompson et al., 1997) were used to obtain multiple sequence alignments of all taxa for the mtrDNA and RAG2 data, respectively. Alignment of the mtrDNA data was refined based on published secondary structural models (Anderson et al., 1982; De Rijk et al., 1994; Springer and Douzery, 1996) and, with few exceptions, alignment gaps occurred within loop regions. Alignment of the RAG2 data was trivial as only a single 3 bp deletion event was detected in a single taxon. Bayesian phylogenetic analyses were performed on the mtrDNA and RAG2 data as described above, using the GTR + Γ model with site-specific rate variation via the “ssgamma” option in MRBAYES. The mtrDNA data were partitioned into 12S rRNA, tRNA-Val, and 16S rRNA, whereas the RAG2 data were partitioned into codon positions.

Although the incongruence-length-difference test (ILD test; Farris et al., 1994; Mickevich and Farris, 1981) has been popular for examining character congruence and combinability, this test has recently come under criticism as a poor measure of congruence, especially when data sets are of different sizes (Dowton and Austin, 2002; Yoder et al., 2001). Therefore, we used a more straightforward method to assess combinability (Wiens, 1998; see also Leache and Reeder, 2002). This method considers the tree or trees from the combined analysis to be the best estimate of species phylogeny and assumes that strongly supported conflicts between data sets may be due to different phylogenetic histories. When there are many such conflicts, combining data sets yields little phylogenetic resolution. We considered only those clades with Bayesian posterior probabilities ≥ 0.95 to be strongly supported. For the combined analysis, the three data sets (DMP1, RAG2, mtrDNA) were concatenated into a single file and the GTR + Γ model with site-specific rate variation was used. Data partitions for each of the independent analyses described above (the three genes for the mitochondrial data set and each codon position for the two nuclear gene sequences) were used for this combined analysis. Bayesian analyses were carried out as described above.
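The combinability criterion described above (look only for conflicts between clades that are strongly supported in each data set) can be illustrated with a small Python sketch. It is one possible reading of the criterion, not the authors' procedure: clades are represented as frozensets of taxon names, both trees are assumed to be rooted on the same taxon set, and the function names and the default threshold argument are illustrative.

```python
from itertools import product


def clades_conflict(clade_a, clade_b):
    """Two clades from rooted trees on the same taxa conflict when they
    overlap but neither is nested inside the other."""
    shared = clade_a & clade_b
    return bool(shared) and not (clade_a <= clade_b or clade_b <= clade_a)


def strongly_supported_conflicts(support_1, support_2, threshold=0.95):
    """List conflicting clade pairs in which both clades meet the support threshold.

    support_1, support_2: dicts mapping frozenset-of-taxa -> posterior probability.
    """
    strong_1 = [c for c, pp in support_1.items() if pp >= threshold]
    strong_2 = [c for c, pp in support_2.items() if pp >= threshold]
    return [(a, b) for a, b in product(strong_1, strong_2) if clades_conflict(a, b)]


# Hypothetical four-taxon example (taxa A-D and support values are made up).
gene_1 = {frozenset("AB"): 0.99, frozenset("ABC"): 0.60}
gene_2 = {frozenset("BC"): 0.97, frozenset("AB"): 0.50}
conflicts = strongly_supported_conflicts(gene_1, gene_2)
```

In this toy case the only strongly supported conflict is between clade {A, B} in the first data set and clade {B, C} in the second; weakly supported disagreements are ignored, which is the point of the threshold.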


Fig. 2. Amino acid alignment of the dentin matrix protein 1 (DMP1) gene encoded by exon 6 for representatives of Primates (Homo), Artiodactyla (Bos) and the 19 bats examined in this study. M. = Myotis, Na. = Natalus, No. = Noctilio, and T. = Thyroptera.

2.2. Specimens examined

All tissues used in this study are represented by voucher specimens deposited in the American Museum of Natural History (AMNH), Natural Sciences Research Laboratory of the Museum of Texas Tech University (TK), Oklahoma State University Collection of Vertebrates (OK), or the Royal Ontario Museum (ROM). Tissue for Myzopoda aurita (OK4246) was accessioned in the Oklahoma State University Collection of Vertebrates, but the voucher specimen is accessioned in the National Museum of Natural History (USNM448885). Following are the taxa examined in this study, museum accession number, and GenBank numbers for mitochondrial rDNA and nuclear RAG2 sequences, respectively. Suborder Megachiroptera: Pteropodidae: Pteropus hypomelanus (TK20225; 12S rRNA = U93073; 16S rRNA = AF069537; AY141025); Suborder Microchiroptera: Emballonuridae: Saccopteryx bilineata (AMNH267842; AF263213; AF316485); Molossidae: Tadarida brasiliensis (OK430; AF263219; AY141019); Mormoopidae: Mormoops megalophylla (TK78661; AF263220; AY141020), Pteronotus parnellii (AMNH269115; AF263221; AF330817); Myzopodidae: Myzopoda aurita (OK4246; AF345926; AY141022); Natalidae: Natalus micropus (TK9454; AF345925; AY141023), N. stramineus (TK15660; AF345924; AY141024); Noctilionidae: Noctilio albiventris (TK46004; AF263223; AF316476), N. leporinus (TK10224; AF263224; AF316477); Phyllostomidae: Centurio senex (TK13537; AF263227; AF316438), Desmodus rotundus (TK4764; AF263228; AF316444); Thyropteridae: Thyroptera discifera (TK17210; AF345923; AY141027), T. tricolor (AMNH268577; AF263233; AY141028); Vespertilionidae: Corynorhinus townsendii (TK83182; AF263236; AY141029), Harpiocephalus harpia (TK21258; AF263235; AY141031), Kerivoula hardwickei (ROM110829; AF345928; AY141034), Myotis riparius (AMNH268591; AF263236; AY141032), M. velifer (TK79170; AF263237; AY141033).

3. Results

3.1. Analysis of DMP1 exon 6

Because different primer combinations were used to amplify exon 6 of DMP1, the final region compared contained the sequence between Den2a and Den10 (Table 1), and these sequences have been deposited in GenBank (Accession Nos. AY141877–AY141895). The number of bases between these primers ranged from 960 (Tadarida) to 1047 (six taxa). Although several indels were required to align these sequences, most indels were phylogenetically informative and no stop codons were observed in the region sequenced for any taxon (Fig. 2). Thirty of the 19,059 compared nucleotides were scored as ambiguous. These sites were double-checked on the electropherograms corresponding to forward and reverse directions, and in no case were these polymorphisms recognized as sequencing artifacts. We also rejected the hypothesis that these ambiguities resulted from cross-contamination because each of the 30 polymorphic sites was detected in only a single taxon.

Therefore, we conclude that these polymorphisms represent the simultaneous amplification of different alleles within individuals. Each of these sites was assigned an IUB code corresponding to a two-base ambiguity (no three- or four-base ambiguities were scored). The total number of base differences between alleles within individuals ranged from 0 (eight taxa) to 7 (Myotis riparius) with a mean of 1.58. Of the 30 sites polymorphic within individuals, three were at first, 12 at second, and 15 at third codon positions. Assuming the two alleles within a taxon share a recent common ancestry, 25 polymorphisms were transition substitutions and five were transversion substitutions. We detected no significant departure from homogeneity in base composition across taxa (χ² = 34.489, df = 54, P = 0.982), although codon positions varied in degree of

Table 2. Range of percent base composition at each codon position and for the entire region of exon 6 of the dentin matrix protein 1 gene examined for 19 bats.

Position   A           C           G           T
First      31.8–34.7   13.2–15.6   38.2–41.0   12.2–13.4
Second     43.7–48.4   19.2–21.6   23.7–27.2   6.8–9.5
Third      22.7–26.3   25.5–31.8   19.1–21.8   24.4–29.5
All        33.6–35.5   20.2–22.6   27.4–29.0   14.9–16.8
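Percentages like those in Table 2, and a test of base compositional homogeneity across taxa, can be sketched as follows. This is an illustrative Python version only: it uses a plain Pearson chi-square on a taxa-by-bases table of counts, which is assumed here to resemble the test PAUP* applies but is not taken from its source, and the function names are hypothetical. Sequences are assumed to be in frame, and gaps or ambiguity codes are ignored.

```python
from collections import Counter


def base_percentages(seq, codon_position=None):
    """Percent A, C, G, T; codon_position is 1, 2, 3, or None for all sites."""
    seq = seq.upper()
    sites = seq if codon_position is None else seq[codon_position - 1::3]
    counts = Counter(b for b in sites if b in "ACGT")  # ignore gaps/ambiguities
    total = sum(counts.values())
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def chi_square_homogeneity(count_rows):
    """Pearson chi-square statistic and degrees of freedom for a
    taxa x bases contingency table of base counts."""
    col_totals = [sum(col) for col in zip(*count_rows)]
    grand_total = sum(col_totals)
    statistic = 0.0
    for row in count_rows:
        row_total = sum(row)
        for observed, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(count_rows) - 1) * (len(col_totals) - 1)
    return statistic, df
```

For 19 taxa and four bases the degrees of freedom are (19 − 1) × (4 − 1) = 54, which agrees with the df reported for the homogeneity test in the text.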

Table 3. Percent sequence divergence (uncorrected “p”) for all pairwise comparisons of taxa (above the diagonal) and number of amino acid differences between all pairwise comparisons of taxa (below the diagonal) for the dentin matrix protein 1 gene. Taxa compared: Pteropus, Saccopteryx, Myzopoda, Na. micropus, Na. stramineus, T. tricolor, T. discifera, No. leporinus, No. albiventris, Mormoops, Pteronotus, Centurio, Desmodus, M. riparius, M. velifer, Harpiocephalus, Kerivoula, Corynorhinus, Tadarida. M. = Myotis, Na. = Natalus, No. = Noctilio, and T. = Thyroptera.

base compositional bias (Table 2). First positions were biased for adenine and guanine, second positions for adenine, whereas third positions had a nearly even distribution of base composition (albeit with slightly lower levels of guanine). Overall, adenine outnumbered the other three nucleotides by about 1.7:1. Skewed base composition at first and second positions reflects the unevenness in translated amino acid frequencies for serine (AGY = 14.6%), aspartic acid (GAY = 11.5%), and glutamic acid (GAR = 16.8%).

Percent sequence divergence (uncorrected “p”) among taxa ranged from 0.51% (between M. riparius and M. velifer) to 14.3% (between Thyroptera discifera and Myzopoda aurita) with a mean percent sequence divergence among all taxa of 10.28% (Table 3). Average percent sequence divergence for all pairwise comparisons between yinptero- and yangochiropterans (n = 18), among yangochiropteran families (n = 133), among genera within families (n = 18), and among species within a genus (n = 8) were 12.15, 10.81, 5.4, and 0.64%, respectively.

Concordant with this divergence at the DNA level, the portion of DMP1 encoded by exon 6 revealed a high degree of amino acid substitutions among pairwise comparisons of taxa. Amino acid differences ranged from 2 (between M. riparius and M. velifer) to 76 (between Saccopteryx and both T. discifera and Centurio) with a mean of 50.4 (Table 3). Average number of amino acid differences for all pairwise comparisons between yinptero- and yangochiropterans (n = 18), among yangochiropteran families (n = 133), among genera within families (n = 18), and among species within a genus (n = 8) were 58.8, 53.0, 26.8, and 3.6, respectively.

To test for saturation, observed numbers of transition (ti) and transversion (tv) substitutions at each codon position were plotted against uncorrected (“p”) sequence divergence (Fig. 3). In this type of analysis, lack of a vertical increase in number of substitutions with increasing sequence divergence indicates that substitutions have reached saturation. No evidence of saturation was detected in any of the six data partitions.

Fig. 3. Pairwise comparisons (indicated by diamonds) for each codon position among all taxa (n = 171) of transition (ti) and transversion (tv) substitutions (ordinate) plotted against uncorrected (“p”) sequence divergence (abscissa) for all pairwise comparisons.

Of the 1047 aligned positions, 463 were variable. Distribution of variable sites, partitioned by codon position, revealed that 107, 131, and 225 variable positions occurred at first, second, and third positions, respectively. Modeltest chose the TrN + Γ model of sequence evolution as the best fit for our data. Model parameters used in the maximum likelihood analysis were: base frequencies = 0.3427, 0.2179, 0.2759; NST = 6; Revmat = (1.0000, 2.7785, 1.0000, 1.0000, 6.8984); shape parameter of the gamma distribution (α) = 0.8646; proportion of invariant sites = 0.00. This analysis resulted in a single tree (Fig. 4).

Bayesian analyses of the DMP1 data reached stationarity by 5000 generations, which reduced our data to 95,000 trees. Topology, posterior probabilities, and model parameters were in excellent agreement for all runs (Table 4; Fig. 5), and Bayesian analysis with the “ssgamma” option yielded the same topology as the maximum likelihood analysis using the TrN + Γ model of sequence evolution (Fig. 4). High posterior probabilities were detected for all clades uniting representatives of species within a genus (1.00) and a clade containing the New World Mormoopidae, Noctilionidae, Thyropteridae, and Phyllostomidae.
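The tally of variable sites by codon position reported above can be reproduced for any in-frame alignment with a short Python sketch. The function name is hypothetical, and the treatment of gaps and ambiguity codes (ignored when deciding whether a column varies) is an assumption, since the text does not say how such columns were scored.

```python
def variable_sites_by_codon_position(alignment):
    """Count alignment columns showing more than one unambiguous base,
    split by codon position (alignment assumed to start at position 1)."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    counts = {1: 0, 2: 0, 3: 0}
    for i in range(length):
        # Assumption: gaps and ambiguity codes do not count as extra states.
        states = {seq[i].upper() for seq in alignment} & set("ACGT")
        if len(states) > 1:
            counts[i % 3 + 1] += 1
    return counts


# Toy three-sequence alignment (made up for illustration).
toy_counts = variable_sites_by_codon_position(["ATGAAA", "ATGACA", "ATAAC-"])

# Consistency check on the values reported in the text.
reported_total = 107 + 131 + 225
reported_percent = round(100 * reported_total / 1047, 1)
```

The reported per-position counts sum to 463, and 463 of 1047 aligned positions corresponds to 44.2% variable sites.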

3.2. Re-analysis of mitochondrial and RAG2 sequences

For the mtrDNA data, 561 sites of the aligned sequence (149 from the 12S rRNA, 14 from tRNA-Val, 398 from 16S rRNA) were removed prior to phylogenetic analysis due to the potential violation of positional homology. Similar to the results for DMP1 exon 6 DNA sequences, model parameters and resultant phylogeny for both the mtrDNA and RAG2 data were in agreement among the three independent Bayesian analyses (Table 4; Fig. 5).

3.3. Combined analysis

Although slight differences exist among the trees generated from the three independent data sets, because we considered only those clades receiving Bayesian posterior probabilities ≥ 0.95 to be strongly supported, these differences did not preclude us from combining DNA sequences from all three genes into a single analysis. For this combined nuclear and mitochondrial gene analysis, model parameters and resulting phylogeny were in excellent agreement among the three independent runs. Fifteen of the 16 clades resulting from the combined analysis received high (P = 1.0) posterior probabilities (Fig. 5).

Fig. 4. Topology of a maximum likelihood analysis of DMP1 exon 6 sequences using the TrN + Γ model of sequence evolution. Numbers above branches indicate the percentage of 100 bootstrap iterations in which each node was detected. M. = Myotis, Na. = Natalus, No. = Noctilio, and T. = Thyroptera.

4. Discussion

Deduced amino acid sequences of the portion of DMP1 encoded by exon 6 from 19 bats aligned with those from human and cattle suggest these sequences are orthologous (Fig. 2). Also supporting this conclusion is that amino acid sequences from bats possess the same biochemical characteristics found in other mammalian DMP1 molecules. Toyosawa et al. (1999) suggested that in order for proper initiation and regulation of hydroxyapatite crystal growth during mineralization of the extracellular matrix, mammalian DMP1 sequences were highly acidic due to high levels of aspartic and glutamic acids. They also suggested that DMP1 possessed a high frequency of serine (for casein kinases I and II and for the target of phosphorylation catalyzed by these enzymes) and an RGD cell attachment peptide. These same characteristics are present in sequences examined in this study (Fig. 2). Serine residues accounted for approximately 15% of the translated amino acid residues (13–16%), and DMP1 in bats is characterized by high acidity due to high frequencies of aspartic acid (11–12%) and glutamic acid (16–18%). Finally, all 19 bats possess an RGD amino acid sequence motif near the middle of exon 6 at the same position as found in cattle and humans (Fig. 2). Taken together, these data suggest that the DMP1 sequences examined in this study are orthologous to previously described mammalian DMP1 exon 6 sequences (Toyosawa et al., 1999).

The ratio of mean number of nonsynonymous to synonymous substitutions can be used as an indicator for the level of selection acting on protein coding genes. For exon 6 of DMP1, synonymous substitutions (dS = 0.2897 ± 0.0217) occurred at a frequency greater


Table 4. Nucleotide substitution parameter estimates for the GTR plus site-specific gamma model of DNA sequence evolution from the Bayesian analysis for each individual data set and the combined data that produced the trees in Fig. 5.

Parameter   Mean            Variance      95% CI

DMP1
−ln         −5310.299247    27.338059     −5321.42; −5301.13
rct         6.832362        1.719558      4.718299; 9.747362
rcg         1.104792        0.067619      0.693621; 1.687383
rat         1.055295        0.063510      0.659748; 1.617117
rag         3.497401        0.459808      2.395035; 5.009216
rac         1.196052        0.075723      0.763304; 1.823389
pa          0.335191        0.000157      0.311338; 0.360091
pc          0.221458        0.000126      0.199940; 0.243894
pg          0.272312        0.000137      0.249832; 0.295999
pt          0.171038        0.000100      0.152384; 0.191394
alpha       1.246208        0.042452      0.915076; 1.684871
ss1         0.712941        0.003466      0.596456; 0.836077
ss2         0.823535        0.003699      0.710492; 0.948505
ss3         1.463524        0.005379      1.321150; 1.610228

mtrDNA
−ln         −13319.894971   27.781068     −13331.02; −13310.38
rct         48.467785       206.811295    28.703930; 79.979638
rcg         0.680743        0.169106      0.128734; 1.724544
rat         5.754842        3.265341      3.233090; 9.976612
rag         17.637750       27.585542     10.207991; 29.135045
rac         7.635042        5.764851      4.298348; 13.253724
pa          0.364667        0.000082      0.347077; 0.382317
pc          0.205686        0.000052      0.191473; 0.220111
pg          0.193824        0.000059      0.178919; 0.209101
pt          0.235824        0.000058      0.220829; 0.251118
alpha       0.252196        0.000127      0.230636; 0.274860
ss1         0.985422        0.002460      0.891338; 1.084418
ss2         0.800527        0.060311      0.477270; 1.394248
ss3         1.018787        0.001171      0.949782; 1.083455

RAG2
−ln         −5510.139055    31.177413     −5522.06; −5500.40
rct         12.991936       8.354923      8.135737; 18.906921
rcg         1.657148        0.252521      0.840218; 2.768912
rat         1.350240        0.140461      0.766463; 2.188853
rag         10.714900       5.712757      6.655989; 16.209625
rac         2.950848        0.489886      1.776472; 4.472631
pa          0.291752        0.000119      0.269344; 0.312816
pc          0.229552        0.000102      0.210806; 0.250570
pg          0.224772        0.000104      0.205403; 0.245673
pt          0.253924        0.000104      0.235045; 0.275009
alpha       1.535155        0.094310      1.067616; 2.224286
ss1         0.472322        0.002191      0.381880; 0.568194
ss2         0.331135        0.001409      0.260420; 0.408337
ss3         2.195391        0.003468      2.075552; 2.312860

Combined
−ln         −24458.493153   29.527230     −24470.33; −24449.22
rct         17.774306       6.141229      13.565759; 23.201433
rcg         1.237394        0.053957      0.843329; 1.774979
rat         2.369932        0.133304      1.716271; 3.154584
rag         7.998365        1.211090      6.207788; 10.391293
rac         3.127122        0.223487      2.322768; 4.169524
pa          0.337944        0.000039      0.325732; 0.350290
pc          0.216772        0.000029      0.205589; 0.227417
pg          0.219330        0.000029      0.208711; 0.230315
pt          0.225955        0.000029      0.215448; 0.236653
alpha       0.362522        0.000224      0.334058; 0.392469
ss1         1.545807        0.005553      1.375169; 1.687040
ss2         1.317642        0.188306      0.817727; 2.196787
ss3         1.547474        0.003447      1.437225; 1.664373
ss4         0.179721        0.000558      0.136437; 0.221570
ss5         0.793879        0.005680      0.693692; 0.986634
ss6         0.254755        0.001221      0.196123; 0.323552
ss7         0.579404        0.004969      0.461163; 0.743838
ss8         0.579167        0.002456      0.473641; 0.686472
ss9         0.777672        0.002335      0.687858; 0.865816
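To show how the estimates in Table 4 fit together, the Python sketch below assembles a GTR instantaneous rate matrix from the DMP1 posterior means. It rests on several assumptions that the table itself does not state: that rct, rcg, rat, rag, and rac are relative exchangeabilities for the corresponding base pairs, that the unlisted G↔T rate is the reference rate fixed at 1, and that pa, pc, pg, and pt are equilibrium base frequencies. The function name is hypothetical, and the site-specific rate multipliers (ss1 to ss3) are not applied.

```python
BASES = "ACGT"


def gtr_rate_matrix(rates, freqs):
    """Build a GTR rate matrix Q (rows = from-base, columns = to-base),
    scaled so the expected substitution rate at equilibrium is 1.

    rates: dict of relative exchangeabilities keyed by sorted base pair, e.g. "AC".
    freqs: dict of equilibrium base frequencies keyed by base.
    """
    q = [[0.0] * 4 for _ in range(4)]
    for i, x in enumerate(BASES):
        for j, y in enumerate(BASES):
            if i != j:
                pair = "".join(sorted(x + y))
                q[i][j] = rates[pair] * freqs[y]
        q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    mean_rate = -sum(freqs[b] * q[i][i] for i, b in enumerate(BASES))
    return [[value / mean_rate for value in row] for row in q]


# DMP1 posterior means from Table 4; "GT" = 1.0 is an assumed reference rate.
dmp1_rates = {"CT": 6.832362, "CG": 1.104792, "AT": 1.055295,
              "AG": 3.497401, "AC": 1.196052, "GT": 1.0}
dmp1_freqs = {"A": 0.335191, "C": 0.221458, "G": 0.272312, "T": 0.171038}

Q = gtr_rate_matrix(dmp1_rates, dmp1_freqs)
```

Under these assumptions the two transition rates (C↔T and A↔G) dominate the matrix, in line with the larger rct and rag estimates in the table.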

Fig. 5. Topologies resulting from Bayesian phylogenetic analysis of each of the three genes independently and the combined analysis. Numbers above nodes indicate Bayesian posterior probabilities. M. = Myotis, Na. = Natalus, No. = Noctilio, and T. = Thyroptera.


than nonsynonymous substitutions (dN = 0.0739 ± 0.0047), producing a dN/dS of 0.2551. For the RAG2 gene of these same 19 bats, dS = 0.2409 ± 0.0124, whereas dN = 0.272 ± 0.0022, resulting in dN/dS = 1.129. Thus, selective pressures acting on DMP1 are moderate but differ from those acting on RAG2 (for these bats). Examination of representatives of 10 families of bats supports the conclusion of Toyosawa et al. (1999) that DMP1 is a rapidly evolving nuclear gene that tolerates non-frame shifting indels (Fig. 2). Numerous indels were required to align our sequences. For example, alignment of Tadarida (our sole representative of Molossidae) required 30 indels, with 22 occurring in one continuous stretch (Fig. 2). Even so, our alignment was unambiguous because most indels were conserved within families and other closely related taxa.

Fig. 6. Bubble diagrams reflecting the frequency of character-state changes based on the topology of the combined analysis in Fig. 5 for exon 6 of DMP1 and RAG2 for each codon position separately and for all codon positions combined.

4.1. Phylogenetic utility of DMP1

Phylogenetic analysis of DMP1 sequences strongly supported monophyly of Noctilionoidea (Phyllostomidae, Noctilionidae, Mormoopidae, and Thyropteridae), and all families and genera examined, with the exception of Mormoopidae (Figs. 4 and 5a). Thus, our results agree with those of Hoofer et al. (Submitted) for the inclusion of Thyropteridae in Noctilionoidea (Fig. 1).

The affinities of Natalidae are less clear. Mitochondrial ribosomal sequences (Fig. 5) suggest, albeit not strongly, that Natalidae shares phylogenetic affinities with the Molossidae and Vespertilionidae, and similar relationships were detected when the mtrDNA and RAG2 data were analyzed separately (Fig. 5). Concatenation of all three data sets provides strong support (posterior probability = 1.0) for the association of Natalidae with Vespertilionidae and Molossidae (Fig. 5). Therefore, results of the combined mitochondrial and nuclear gene sequences support the conclusions of Hoofer et al. (Submitted) that Natalidae should be included in the superfamily Vespertilionoidea. Moreover, these combined data strongly support inclusion of Thyropteridae within the Noctilionoidea, as proposed by Hoofer et al. (Submitted).

This study illustrates that exon 6 sequences of DMP1 are useful for resolving phylogenetic relationships among families of bats. In comparison with RAG2, a gene used frequently in mammalian systematic studies (Baker et al., 2000; Madsen et al., 2001; Murphy et al., 2001; Teeling et al., 2000, 2002), DMP1 provides more variable characters (463/1047 = 44.2%) than RAG2 (428/1373 = 31.2%), and patterns of reconstructed character change within codon positions for DMP1 and RAG2 show similar results (Fig. 6). Although insertion-deletion events were detected in DMP1 (rare in RAG2 sequences), they did not cause problems in aligning sequences, even among distantly related taxa. This is an appealing aspect of DMP1 because indels appear to be conserved at various taxonomic levels (Fig. 1). Such a finding is in contrast to protamine P1 sequences.
Although protamine P1 appears to be a rapidly evolving nuclear gene that provides phylogenetic resolution from the specific to the interordinal taxonomic levels (Retief et al., 1993, 1995; Krajewski et al., 1997a, b; Van Den Bussche et al., 2001), the phylogenetic utility of protamine P1 is limited by its small size and the difficulty of aligning noncoding regions, even among closely related taxa (Van Den Bussche et al., 2001). Therefore, exon 6 of DMP1 appears to be rapidly evolving, tolerates non-frame shifting insertion and deletion events, and provides phylogenetic resolution from the interfamilial (this study) to the infraclass (Toyosawa et al., 1999) taxonomic levels. Although several alleles were detected within individuals, this level of polymorphism did not hinder attempts at directly sequencing PCR products. Finally, indels may be phylogenetically informative because they generally are conserved at certain taxonomic levels (e.g., family level). Denser taxonomic sampling is required, however, to fully evaluate the phylogenetic utility of indels in DMP1. Because there appears to be more variation present in DMP1 than in RAG2, and because the functional and evolutionary constraints on DMP1 differ from those on other nuclear protein-coding genes used in molecular systematic studies, DNA sequence variation in this gene may be useful for resolving other higher-level mammalian systematic relationships.
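The ratios and percentages quoted in the discussion above are simple arithmetic on the reported values, and a short Python sketch can make that explicit. The numbers are copied from the text; the function and variable names are illustrative assumptions and are not part of the authors' analysis, which relied on dedicated phylogenetic software.

```python
# Minimal sketch: re-derive the summary statistics quoted in the text.
# Function names are hypothetical; input values are those reported above.

def dn_ds_ratio(dn: float, ds: float) -> float:
    """Ratio of nonsynonymous (dN) to synonymous (dS) substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive")
    return dn / ds


def percent_variable(variable_sites: int, total_sites: int) -> float:
    """Percentage of aligned characters that are variable."""
    if total_sites <= 0:
        raise ValueError("total_sites must be positive")
    return 100.0 * variable_sites / total_sites


# RAG2 values as stated in the text for the same 19 bats.
rag2_ratio = dn_ds_ratio(dn=0.272, ds=0.2409)

# Variable characters: DMP1 exon 6 versus RAG2.
dmp1_pct = percent_variable(463, 1047)
rag2_pct = percent_variable(428, 1373)

print(f"RAG2 dN/dS: {rag2_ratio:.3f}")        # 1.129
print(f"DMP1 variable sites: {dmp1_pct:.1f}%")  # 44.2%
print(f"RAG2 variable sites: {rag2_pct:.1f}%")  # 31.2%
```

The sketch only confirms that the quoted ratio and percentages follow from the stated inputs; it does not estimate dN or dS from sequence data.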

Acknowledgments

We thank R.J. Baker of the Natural Sciences Research Laboratory, the Museum of Texas Tech University, N.B. Simmons of the American Museum of Natural History, M.D. Engstrom of the Royal Ontario Museum, J.L. Patton of the Museum of Vertebrate Zoology, Berkeley, and K. McBee of the Oklahoma State University Collection of Vertebrates for generously loaning tissues for this study. We also thank S. Janza for initial discussions on the possible usefulness of DMP1 for studies on mammalian phylogenetic relationships. We thank personnel of the Oklahoma State University Recombinant DNA/Protein Resource Facility for synthesis and purification of synthetic oligonucleotides. Finally, we extend our gratitude to two anonymous reviewers whose comments greatly improved the presentation of this manuscript. This study was supported by National Science Foundation grant DEB9873657 and an REU supplement to R.A. Van Den Bussche.

References

Anderson, S., de Bruijn, M.H.L., Coulson, A.R., Eperon, I.C., Sanger, F., Young, I.G., 1982. Complete sequence of bovine mitochondrial DNA: conserved features of the mammalian mitochondrial genome. J. Mol. Biol. 156, 683–717.
Baker, R.J., Longmire, J.L., Maltbie, M., Hamilton, M.J., Van Den Bussche, R.A., 1997. DNA synapomorphies for a variety of taxonomic levels from a cosmid library of the New World bat Macrotus waterhousii. Syst. Biol. 46, 579–589.
Baker, R.J., Porter, C.A., Patton, J.C., Van Den Bussche, R.A., 2000. Systematics of bats of the family Phyllostomidae based on RAG2 DNA sequences. Occas. Pap. Mus. Texas Tech Univ. 202, 1–16.
De Rijk, P., van de Peer, Y., Chapella, S., Wachter, R.D., 1994. Database on the structure of the large ribosomal subunit RNA. Nucleic Acids Res. 22, 3495–3501.
Dowton, M., Austin, A.D., 2002. Increased congruence does not necessarily indicate increased phylogenetic accuracy – the behavior of the incongruence length difference test in mixed-model analyses. Syst. Biol. 51, 19–31.
Farris, J.S., Kallersjo, M., Kluge, A.G., Bult, C., 1994. Testing significance of incongruence. Cladistics 10, 315–319.
George, A., Sabsay, B., Simonian, P.A.L., Veis, A., 1993. Characterization of a novel dentin matrix acidic phosphoprotein. Implications for induction of biomineralization. J. Biol. Chem. 268, 12624–12630.
George, A., Gui, J., Jenkins, N.A., Gilbert, D.J., Copeland, N.G., Veis, A., 1994. In situ localization and chromosomal mapping of the AG1 (Dmp1) gene. J. Histochem. Cytochem. 42, 1527–1531.
Hoofer, S.R., Reeder, S.A., Hansen, E.W., Van Den Bussche, R.A., Submitted. Molecular phylogenetics and taxonomic review of noctilionoid and vespertilionoid bats (Chiroptera: Yangochiroptera). J. Mammal.
Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYES: Bayesian inference of phylogeny. Bioinformatics 17, 754–755.
Krajewski, C., Buckley, L., Westerman, M., 1997a. DNA phylogeny of the marsupial wolf resolved. Proc. R. Soc. London B 264, 911–917.
Krajewski, C., Blacket, M., Buckley, L., Westerman, M., 1997b. A multigene assessment of phylogenetic relationships within the dasyurid marsupial subfamily Sminthopsinae. Mol. Phylogenet. Evol. 8, 236–248.
Kumar, S., Tamura, K., Nei, M., 1993. MEGA: Molecular evolutionary genetic analysis. Version 1.03. Institute of Molecular Evolutionary Genetics, Pennsylvania State Univ., University Park.
Leache, A.D., Reeder, T.W., 2002. Molecular systematics of the eastern fence lizard (Sceloporus undulatus): a comparison of parsimony, likelihood, and Bayesian approaches. Syst. Biol. 51, 44–68.
Longmire, J., Maltbie, M., Baker, R.J., 1997. Use of "lysis buffer" in DNA isolation and its implication for museum collections. Occas. Pap. Mus. Texas Tech Univ. 163, 1–3.
Maddison, W.P., Maddison, D.R., 2000. MacClade, Version 4: Analysis of Phylogeny and Character Evolution. Sinauer, Sunderland, MA.
Madsen, O., Scally, M., Douady, C.J., Kao, D.J., DeBry, R.W., Adkins, R., Amrine, H.M., Stanhope, M.J., de Jong, W.W., Springer, M.S., 2001. Parallel adaptive radiations in two major clades of placental mammals. Nature 409, 610–614.
Mickevich, M.F., Farris, J.S., 1981. The implications of incongruence in Menidia. Syst. Zool. 30, 351–370.
Murphy, W.J., Eizirik, E., Johnson, W.E., Zhang, Y.P., Ryder, O.A., O'Brien, S.J., 2001. Molecular phylogenetics and the origins of placental mammals. Nature 409, 614–618.
Ørvig, T., 1967. Phylogeny of tooth tissues in early vertebrates. In: Miles, A.E.W. (Ed.), Structural and chemical organization of teeth. Academic Press, London, pp. 45–110.
Oxford Molecular Group PLC, 1998. AssemblyLIGN 1.0.9. Oxford Molecular Group PLC., Oxford, UK.
Posada, D., Crandall, K.A., 1998. Modeltest: testing the model of DNA substitution. Bioinformatics 14, 817–818.
Retief, J.D., Winkfein, R.J., Dixon, G.H., Adroer, R., Queralt, R., Ballabriga, J., Oliva, R., 1993. Evolution of protamine P1 genes in Primates. J. Mol. Evol. 37, 426–434.
Retief, J.D., Krajewski, C., Westerman, M., Dixon, G.H., 1995. The evolution of protamine P1 genes in dasyurid marsupials. J. Mol. Evol. 41, 549–555.
Saccone, C., Pesole, G., Preparata, G., 1989. DNA microenvironments and the molecular clock. Mol. Biol. Evol. 15, 442–448.
Smith, M.M., Hall, B.K., 1990. Development and evolutionary aspects of vertebrate skeletogenic and odontogenic tissues. Biol. Rev. 65, 277–373.
Springer, M.S., Douzery, E., 1996. Secondary structure, conservation of functional sites, and rates of evolution among mammalian mitochondrial 12S rRNA genes based on sequences from placentals, marsupials, and a monotreme. J. Mol. Evol. 43, 357–373.
Swofford, D.L., 2000. PAUP*: phylogenetic analysis using parsimony (*and other methods) version 4.02b. Sinauer Associates, Inc., Publishers, Sunderland, MA.
Teeling, E.C., Scally, M., Kao, D.J., Romagnoli, M.L., Springer, M.S., Stanhope, M.J., 2000. Molecular evidence regarding the origin of echolocation and flight in bats. Nature 403, 188–192.
Teeling, E.C., Madsen, O., Van Den Bussche, R.A., de Jong, W.W., Stanhope, M.J., Springer, M.S., 2002. Microbat paraphyly and the convergent evolution of a key innovation in Old World rhinolophoid microbats. Proc. Natl. Acad. Sci. 99, 1431–1436.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 24, 4876–4882.
Toyosawa, S., O'hUigin, C., Klein, J., 1999. The dentin matrix protein 1 gene of prototherian and metatherian mammals. J. Mol. Evol. 48, 160–167.
Van Den Bussche, R.A., Hoofer, S.R., 2001. Evaluating monophyly of Nataloidea (Chiroptera) with mitochondrial DNA sequences. J. Mammal. 83, 320–327.
Van Den Bussche, R.A., Hoofer, S.R., Hansen, E.W., 2001. Characterization and phylogenetic utility of the mammalian protamine P1 gene. Mol. Phyl. Evol. 22, 333–341.
Wiens, J.J., 1998. Combining data sets with different phylogenetic histories. Syst. Biol. 47, 568–581.
Yoder, A.D., Irwin, J.A., Payseur, B.A., 2001. Failure of the ILD to determine data combinability for the slow loris phylogeny. Syst. Biol. 50, 408–424.