Gene 381 (2006) 92 – 101 www.elsevier.com/locate/gene
Mitochondrial genome of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa): A linear DNA molecule encoding a putative DNA-dependent DNA polymerase
Zhiyong Shao a, Shannon Graf b, Oleg Y. Chaga c, Dennis V. Lavrov a,d,⁎
a Interdepartmental Genetics Graduate Program, Iowa State University, Ames, Iowa 50011, United States
b Department of Microbiology and Immunology, Georgetown University Medical Center, Washington, DC 20007, United States
c Department of Cell and Molecular Biology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois, United States
d Department of Ecology, Evolution and Organismal Biology, Ames, Iowa 50011, United States
Received 18 April 2006; received in revised form 20 June 2006; accepted 23 June 2006 Available online 18 July 2006 Received by M. Di Giulio
Abstract
The 16,937-nucleotide sequence of the linear mitochondrial DNA (mtDNA) molecule of the moon jelly Aurelia aurita (Cnidaria, Scyphozoa), the first mtDNA sequence from the class Scyphozoa and the first sequence of a linear mtDNA from Metazoa, has been determined. This sequence contains genes for 13 energy pathway proteins, small and large subunit rRNAs, and methionine and tryptophan tRNAs. In addition, two open reading frames of 324 and 969 base pairs in length have been found. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the polymerase [but not the exonuclease] domain of family B DNA polymerases, and this ORF has been tentatively identified as dnab. This is the first report of dnab in animal mtDNA. The genes in A. aurita mtDNA are arranged in two clusters with opposite transcriptional polarities, with transcription proceeding toward the ends of the molecule. The determined sequences at the ends of the molecule are nearly identical but inverted and lack any obvious potential secondary structures or telomere-like repeat elements. The acquisition of mitochondrial genomic data for the second class of Cnidaria allows us to reconstruct characteristic features of mitochondrial evolution in this animal phylum.
© 2006 Elsevier B.V. All rights reserved.
Keywords: Linear mtDNA; Mitochondrial evolution; dnaB; Gene order
Abbreviations: SSU-rRNA and LSU-rRNA, small and large subunit ribosomal RNAs; mtDNA, mitochondrial DNA; mt, mitochondrial.
⁎ Corresponding author. 253 Bessey Hall, Department of Ecology, Evolution and Organismal Biology, Iowa State University, Ames, Iowa 50011, United States. E-mail address: [email protected] (D.V. Lavrov).
0378-1119/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.gene.2006.06.021
1. Introduction
Although mitochondrial genomes are known to vary extensively in size, structure, and gene content across diverse eukaryotic groups, those of multicellular animals (Metazoa) are considered to be largely uniform (Lang et al., 1999). A typical metazoan mtDNA is usually depicted as a small (∼16 kbp), circular molecule that carries a conserved set of 37 compactly arrayed genes coding for 13 proteins, two rRNAs and 22 tRNAs
(Boore, 1999). Other features commonly associated with animal mtDNA include the lack of introns, multiple deviations in the genetic code, unusual and/or highly reduced rRNA and tRNA primary and secondary structures, and the presence of a single large non-coding region containing the replication origins and transcriptional promoters (Wolstenholme, 1992). Although the idea of a “typical” animal mtDNA has its merits, a broader sampling of diverse animal phyla revealed multiple exceptions (Armstrong et al., 2000; Helfenbein et al., 2004). This is especially true for mtDNAs of nonbilaterian (“lower”) animals, which lack many features associated with the “typical” animal mtDNA and vary more in size and gene content (Beagley et al., 1995; Ender and Schierwater, 2003; Lavrov et al., 2005). Mitochondrial genomes in the phylum Cnidaria are among the most atypical among Metazoa. First, among the four traditionally
recognized cnidarian classes, Anthozoa, Cubozoa, Scyphozoa and Hydrozoa, only the Anthozoa have a circular mitochondrial genome, while the others have linear mtDNA (Bridge et al., 1992, 1995). Second, out of the 22–25 tRNA genes present in mtDNA of most other animals, only two, trnM and trnW, have been found in cnidarian mtDNA. Other tRNAs required for mitochondrial protein synthesis are assumed to be encoded in the nuclear genome and imported into the mitochondria. Third, foreign DNA is frequently found in cnidarian mtDNA, including a putative mismatch repair protein (MutS) in the soft coral Sarcophyton glaucum and the sea pen Renilla kolikeri (Pont-Kingdon et al., 1995, 1998) and group I introns with or without the homing endonucleases of the LAGLI-DADG type in the sea anemone Metridium senile (Beagley et al., 1996) and several hard corals (van Oppen et al., 2002; Fukami and Knowlton, 2005). Finally, cnidarians use a minimally modified genetic code for mitochondrial translation, with TGA=tryptophan as the only deviation, and encode well-conserved ribosomal and transfer RNAs. Our knowledge of cnidarian mtDNA is mostly limited to the class Anthozoa, for which several complete mitochondrial sequences are available: those from the soft coral S. glaucum (Beaton et al., 1998), the sea anemone M. senile (Beagley et al., 1998), and six hard corals Acropora tenuis (van Oppen et al., 2002), Montipora cactus, Anacropora matthai (GenBank Access number NC_006902; NC_006898), Montastraea annularis, Montastraea franksi, and Montastraea faveolata (Fukami and Knowlton, 2005). The other three classes are represented by a few, mostly partial gene sequences (Pont-Kingdon et al., 2000; Dawson and Jacobs, 2001; Schroth et al., 2002). It is unclear, therefore, whether the features described for anthozoan mtDNA are specific for this class or characteristic of the whole phylum Cnidaria. 
To resolve this question we determined a nearly complete sequence of the linear mitochondrial genome from the scyphozoan Aurelia aurita, which we describe here. Scyphozoans (jellyfish), are exclusively marine cnidarians with a predominant medusoid stage in the lifecycle. They live in all oceans, from polar to tropic. Some inhabit the deep sea, but most live close to the coastal waters. There are approximately 200 species in this class, which are traditionally divided into four orders (Brusca and Brusca, 2002), although one of these orders (Stauromedusae) may represent an independent lineage within the phylum (Collins, 2002; Marques and Collins, 2004; Collins and Daly, 2005). The moon jelly Aurelia is one of the most common and widely distributed species of jellyfish and a popular research organism (Arai, 1996). Traditionally only two species are recognized in the genus, A. aurita and A. limbata, the latter is restricted to north polar oceans. However, presence of several cryptic species within A. aurita is possible (Dawson and Jacobs, 2001; Schroth et al., 2002). 2. Materials and methods 2.1. DNA extraction, PCR, and sequencing Scyphistomae (scyphozoan polyps) of A. aurita were collected at the Marine Biological Station of the St. Petersburg University (Chupa Inlet, Kandalaksha Bay of the White Sea; 66°19′N,
033°40′E) and cultured in the lab. After strobilation (an asexual process of transverse fission), ephyrae (young medusae) were gathered and stored in 70% ethanol. Total DNA was extracted from ∼10 ephyrae according to the protocol described in Berntson et al. (1999). Primers designed to match generally conserved regions of animal mtDNA were used to amplify short fragments from cox1 (Folmer et al., 1994) and nad4 (nad4F: 5′-CCKAARGCYCAYGTKGARGCYCC-3′, nad4-R: 5′-GARGAWCAKAWWCCRTGAGCAATYAT-3′). Specific primers (Aurelia-cox1-F1: 5′-AACGTTGTAGTGACCGCTCATGC-3′, Aurelia-cox1-R1: 5′-CTTGGAAAGGCCATATCTGGAGC-3′, Aurelia-nad4-F1: 5′-TCATGGACACTTTTTCCTGTGGC-3′, Aurelia-nad4-R1: 5′-ATATATTACAGCAATTGACCCAAGC-3′, Aurelia-rnl-F1: 5′-GTAACTCTGACCGTGATGAAGTAGC-3′, Aurelia-rnl-R1: 5′-ATATCATAATTCAACATCGAGGTGGC-3′) were designed based on these sequences and on previously available partial rnl data (Bridge et al., 1995), and were used with Takara® LA PCR kits to amplify ∼15 kb of the mtDNA in two overlapping fragments, between nad4 and cox1 and between cox1 and rnl. The peripheral regions of the molecule were amplified using Step-Out PCR (Wesley and Wesley, 1997) and reamplified by conventional PCR with specific primers designed at the ends of the molecule. PCR products were purified by three serial passages through Ultrafree™ (30,000 NMWL) columns (Millipore) and used as templates in dye-terminator cycle-sequencing reactions according to the supplier's (Perkin Elmer®) instructions. Both strands of each amplification product were sequenced by primer walking, using an ABI 377 Automated DNA Sequencer. Sequences were assembled using the STADEN software suite (Staden, 1996). tRNA genes were identified by the tRNAscan-SE program (Lowe and Eddy, 1997); other genes were identified by similarity searches in local databases using the FASTA program (Pearson, 1994) and in GenBank at NCBI using the BLAST network service (Benson et al., 2003).
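The two nad4 primers above contain degenerate positions written in the standard IUPAC nucleotide codes (K, R, Y, W). As a minimal sketch, not part of the original methods, the following Python snippet counts how many distinct oligonucleotides each degenerate primer represents. The function name `degeneracy` and the constant names are illustrative assumptions.

```python
# Standard IUPAC nucleotide ambiguity codes: each symbol maps to the bases it allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Return the number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])  # multiply by the number of bases allowed here
    return count

# Degenerate nad4 primers as given in the text (5' to 3').
NAD4_F = "CCKAARGCYCAYGTKGARGCYCC"
NAD4_R = "GARGAWCAKAWWCCRTGAGCAATYAT"

if __name__ == "__main__":
    for name, seq in (("nad4F", NAD4_F), ("nad4-R", NAD4_R)):
        print(name, len(seq), "nt, degeneracy", degeneracy(seq))
```

Each primer has seven two-fold degenerate positions, so each corresponds to a pool of 2^7 = 128 sequences.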
The secondary structures of rRNA genes were derived by analogy to other published rRNA gene structures, and drawn using the RnaViz 2 program (De Rijk et al., 2003). A. aurita mtDNA sequence is available via Genbank accession number DQ787873. 2.2. Phylogenetic analyses of protein data Amino acid sequences of individual proteins were aligned three times using ClustalW 1.82 (Thompson et al., 1994) with different combinations of opening/extension gap penalties: 10/0.2 (default), 12/4 and 5/1. For the last alignment, no increased gap penalties near existing gaps, no reduced gap penalties in hydrophilic stretches and no residue-specific penalties were applied. The three alignments were compared using SOAP (Löytynoja and Milinkovitch, 2001), and the positions which were identical among them were included in phylogenetic analyses. The final concatenated data set was 2864 amino acids in length. We performed a maximum likelihood (ML) search for the best tree and estimated bootstrap support values as implemented in the PHYML (v. 2.4.4) program (Guindon and Gascuel, 2003) using a gamma + invariant model with 8 categories, estimated α-parameter, the mtREV matrix of amino-acid substitutions, and estimated frequencies of amino acids. Bayesian inferences (MB) used
MrBayes 3.1.1 (Ronquist and Huelsenbeck, 2003). We ran four Markov Chain Monte Carlo (MCMC) chains for 1,100,000 generations, using the mtREV model of amino acid substitutions with gamma + invariant distributed rates. Trees were sampled every 1,000th cycle after the first 100,000 burn-in cycles. 3. Results and discussion 3.1. Genome structure, organization and nucleotide composition The mitochondrial genome of A. aurita has been previously characterized as a monomeric linear molecule about 18 kbp in length (Bridge et al., 1992). We determined the sequence of a 16,937 bp segment of the genome that encodes 13 known mitochondrial proteins, 2 tRNAs, large and small subunit rRNAs, and two ORFs (Fig. 1). PCR amplification, cloning, and sequencing of the rest of the mtDNA were problematic. The sequence at one end of the determined segment is repeated in an inverted orientation at the other end. We view these repeated sequences as parts of long inverted terminal repeats that constitute a hallmark of linear plasmids and linear mitochondrial genomes, derived from integration of such plasmids into circular mtDNA (Meinhardt et al., 1990, 1997; Nosek and Tomaska, 2003). Linear mitochondrial genomes are known in diverse phylogenetic groups (Vahrenholz et al., 1993; Burger et al., 2000; Forget et al., 2002), but this is the first description of such a genome within the Metazoa. The genes in A. aurita mtDNA are arranged into two clusters, one of which is in the opposite transcriptional orientation to the other. Transcription proceeds in the directions towards the ends of the molecule, with 15 genes transcribed in one direction while two known genes and (tentatively) two ORFs in the other. The change in the transcriptional polarity occurs between cox1 and cox2, where a 93 bp non-coding region is located. Cox1 and cox2 are located in a cluster of six genes, which are arranged nearly identically in A. aurita and S. glaucum (Fig. 1). 
Two other arrangements, +nad6+nad3+nad4L and +nad2+nad5, are also conserved between these organisms. Furthermore the same three arrangements of protein-coding genes are largely conserved in
three demosponge mtDNAs [except that there are tRNA genes present in the intergenic regions (Lavrov et al., 2005)] and so may be ancestral for all animals. The A+T content of A. aurita mtDNA is 66.7%, slightly higher than in other cnidarian mtDNAs but well within the range reported for other metazoans. There is a slight AT-skew between the two strands (0.09), but almost no GC-skew (−0.03) for the whole genome. However, strong GC-skew (> 0.3 in absolute value) is found in the sequences encoding ORFs and in the terminal ends of the molecule (Table 1).
3.2. Protein-coding genes
Thirteen protein-coding genes found in most other animal mtDNAs (atp6, atp8, cob, cox1–3, nad1–6, nad4L) are also present in the A. aurita genome. These A. aurita genes, coding for protein subunits involved in respiration and oxidative phosphorylation, are similar in size to their homologues in other cnidarian mtDNAs and share with them on average 54% (range 21–81%) sequence identity (Table 2). All protein-coding genes are inferred to have complete termination codons: either TAA (12 genes) or TAG (cob). Twelve protein-coding genes start with ATG initiation codons while one (nad3) is inferred to begin with GTG. We inferred that nad3 overlaps with the adjacent genes by 10 (nad4L) and 47 (nad6) nucleotides, respectively (Fig. 1). However, it is possible that nad6 terminates with the CGT codon (see below), in which case the latter overlap would be reduced to 2 nt. In addition to the typical set of protein-coding genes described above, two ORFs of 324 and 969 nucleotides have been found downstream of rnl, close to the end of the linear molecule. The deduced amino-acid sequence of one of them, ORF969, displays extensive sequence similarity with the family B DNA polymerases (Fig. 2), and this ORF has been tentatively identified as dnab.
Genes encoding a family B DNA polymerase have been previously found in mtDNA from several other organisms, including the linear genome of the golden algae (chrysophyte) Ochromonas danica (Coleman et al., 1991), and the circular genomes of the red algae Porphyra purpurea (Burger et al.,
Fig. 1. Gene map of A. aurita mtDNA and gene order comparison with Sarcophyton glaucum mtDNA. Protein and ribosomal genes (large open boxes) are: atp6, 8 — subunits 6, and 8 of the F0 ATPase, cox1–3–cytochrome c oxidase subunits 1–3, cob — apocytochrome b (cob), nad1–6 and nad4L — NADH dehydrogenase subunits 1–6 and 4L, rns and rnl — small and large subunit rRNAs. tRNA genes (hatched boxes) are identified by the one-letter code for their corresponding amino acid. Orf324 and orf969 are two large open reading frames found in A. aurita mtDNA. Large intergenic regions are shown by shaded boxes; flanking sequences — by two large arrows. Transcriptional direction for each gene is indicated by an arrow. Three conserved gene clusters between A. aurita and S. glaucum are underlined and interconnected with arrows.
Table 1
Nucleotide composition data for different groups of genes, ORFs, and non-coding regions in Aurelia aurita mtDNA

Sequence group                     %G         %A         %T         %C         %A+T       AT-skew      GC-skew      Total (bp)
Coding sequences (total)           17.0       28.9       37.6       16.6       66.5       −0.13        0.01         11,931
Coding sequences (3rd positions)   11.8       36.9       37.2       14.1       74.1       −0.01        −0.09        3977
ORFs                               10.5       37.4       30.0       22.0       67.4       0.11         −0.35        1293
rRNA genes                         16.8       38.0       28.8       16.4       66.8       0.14         0.01         2763
tRNA genes                         19.1       32.6       29.1       19.1       61.7       0.06         0.00         141
Intergenic                         8.4        45.0       37.4       9.2        82.4       0.09         −0.04        131
Flanking repeats a                 21.1/10.3  37.6/30.8  30.2/38.5  11.0/20.5  67.9/69.2  0.11/−0.11   0.31/−0.33   417/351

a The nucleotide composition data is shown separately for the repeat units upstream of cob/downstream of orf969. The discrepancies in the nucleotide composition data for these repeats reflect a) the opposite orientation of repeats and b) different lengths of sequences determined at the two ends of the molecule.
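The skew values in Table 1 are consistent with the commonly used definitions AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C). The paper does not spell these formulas out, so treating them as the ones used is an assumption. A minimal Python sketch under that assumption, with illustrative function names, is:

```python
def skew(x, y):
    """Return (x - y) / (x + y); defined as 0.0 when both counts are zero."""
    total = x + y
    return (x - y) / total if total else 0.0

def composition_skews(seq):
    """Compute (AT-skew, GC-skew) for one strand of a nucleotide sequence."""
    seq = seq.upper()
    a, c, g, t = (seq.count(b) for b in "ACGT")
    return skew(a, t), skew(g, c)

if __name__ == "__main__":
    # Percentages for the ORFs row of Table 1: %G 10.5, %A 37.4, %T 30.0, %C 22.0
    print(round(skew(37.4, 30.0), 2))  # AT-skew, Table 1 reports 0.11
    print(round(skew(10.5, 22.0), 2))  # GC-skew, Table 1 reports -0.35
```

Because the skews are ratios, they can be computed either from raw base counts or from the rounded percentages in the table. Values recomputed from rounded percentages may differ from the published ones in the last digit.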
1999), carrot (Robison and Wolyn, 2005), and basidiomycete Agrocybe aegerita (Bois et al., 1999). Related genes are also found in linear mitochondrial plasmids of protists, plants and fungi (Robison et al., 1991; Robison and Horgen, 1996; Rousvoal et al., 1998; Robison and Wolyn, 2005). The evolutionary proximity between mitochondrial and plasmid polB genes within each of these groups of organisms indicates that the mitochondrial polB genes originated by the integration of linear plasmids into mtDNA (Mouhamadou et al., 2004). In comparison to its homologues in other organisms, the polB present in A. aurita mtDNA is truncated at the 5′ end and is inferred to code for a protein of 323 amino acids, approximately 200 aa smaller than any previously reported DNA polymerases. Although this may suggest that the ORF represents a pseudogene on its way to elimination as reported previously in some other species [e.g., Weber et al. (1995)], the analysis of its sequence (below) suggests a different interpretation. Most DNA-dependent DNA polymerases possess two distinct catalytic activities, a synthetic DNA polymerization activity and a
proofreading 3′–5′ exonuclease activity [reviewed in Joyce and Steitz (1994)]. These two enzymatic activities have been mapped to two structurally independent domains of the enzyme: the exonuclease activity to the N-terminal domain and the polymerase activity to the C-terminal domain (Braithwaite and Ito, 1993). A comparison of the amino acid sequence inferred from A. aurita ORF969 with those of other DNA-dependent DNA polymerases indicates that the exonuclease domain has been mostly lost in the A. aurita protein, while the polymerase domain remains mostly intact and contains at least four out of five evolutionarily conserved motifs (Dx2SLYP, Kx3NSxYG, Tx2A/GR, YxDTDS) characteristic of this domain in family B DNA polymerases (Truniger et al., 1998). This clearly non-random pattern of DNA loss and conservation suggests that the polymerase domain is maintained by selection pressure. However, its function remains unclear because the lack of the exonuclease domain is known to detrimentally affect the polymerization activity of the enzyme (Freemont et al., 1986; Truniger et al., 1998). Interestingly, the hypothetical A. aurita POLB lacks a particular insertion between motifs POLIIa and
Table 2
Comparison of mitochondrial protein genes of the jellyfish Aurelia aurita (A.a.), the hard coral Acropora tenuis (A.t.), the sea anemone Metridium senile (M.s.) and the soft coral Sarcophyton glaucum (S.g.)

Protein   Number of encoded amino acids a   Percent amino acid identity                   Predicted initiation and termination codons in Aurelia aurita b
genes     A.a.   A.t.   M.s.   S.g.         A.a./A.t.  A.a./M.s.  A.a./S.g.  M.s./S.g.    Initiation    Termination
atp6      234    232    229    237          64         63         62         57           ATG (−1)      TAA (−1)
atp8 c    67     72     72     72           21         22         22         26           ATG (5)       TAA (−1)
cob       379    384    393    386          60         63         64         68           ATG (2)       TAG (END)
cox1      526    533    530    531          79         81         80         81           ATG (93)      TAA (0)
cox2      241    247    248    253          63         62         59         71           ATG (93)      TAA (5)
cox3      261    262    262    261          70         75         76         77           ATG (−1)      TAA (11)
nad1      323    327    334    325          61         61         61         70           ATG (−1)      TAA (0)
nad2      439    365    385    457          39         39         30         41           ATG (1)       TAA (0)
nad3      119    118    118    117          54         56         60         70           GTG (−47)     TAA (−10)
nad4      480    491    491    495          52         54         49         58           ATG (0)       TAA (2)
nad4L     100    99     99     97           42         50         48         57           ATG (−10)     TAA (−1)
nad5      605    611    600    605          48         50         49         59           ATG (0)       TAA (−2)
nad6      190    197    202    185          32         35         34         38           ATG (14?)     TAG (−47)

a Data for A. tenuis are from van Oppen et al. (2002), for M. senile are from Beagley et al. (1998) and for S. glaucum are from Beaton et al. (1998).
b The numbers of non-coding nucleotides upstream and downstream of a gene are shown in parentheses after initiation and termination codons. The negative numbers indicate that the genes are overlapping. END indicates that the flanking sequence is adjacent to the gene.
c Sequence identities of atp8 are uncertain due to alignment ambiguities.
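The summary figure quoted in Section 3.2 (on average 54% identity, range 21–81%) can be checked against the three A. aurita comparison columns of Table 2. The short Python sketch below does that arithmetic. The variable names are illustrative, and the values are transcribed from the table in gene order (atp6 to nad6).

```python
# Percent amino acid identity between A. aurita and three anthozoans (Table 2),
# listed in the gene order atp6, atp8, cob, cox1-3, nad1-4, nad4L, nad5, nad6.
IDENTITY = {
    "A.a./A.t.": [64, 21, 60, 79, 63, 70, 61, 39, 54, 52, 42, 48, 32],
    "A.a./M.s.": [63, 22, 63, 81, 62, 75, 61, 39, 56, 54, 50, 50, 35],
    "A.a./S.g.": [62, 22, 64, 80, 59, 76, 61, 30, 60, 49, 48, 49, 34],
}

# Pool all 39 pairwise values and summarise them.
values = [v for column in IDENTITY.values() for v in column]
mean_identity = sum(values) / len(values)

if __name__ == "__main__":
    print(round(mean_identity), min(values), max(values))
```

The pooled mean rounds to 54, with a minimum of 21 (atp8) and a maximum of 81 (cox1), matching the figures given in the text.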
Fig. 2. Alignment of the ORF969 encoded by the Aurelia aurita mtDNA with DNA-dependent DNA polymerases from other organisms. The sequences are derived from: aa-orf969 — Aurelia aurita mtDNA; pa-pAL2-1 — pAL2-1 plasmid of Podospora anserina (X60707); aa-mt — Agrocybe aegerita mtDNA (AF061244); aiAI2 — plasmid AI2 of Ascobolus immerses (P22374); ni-Kal — kalilo plasmid of Neurospora intermedia (X52106); od-mt — Ochromonas danica mtDNA (NC002571); ppu-mt — Porphyra purpurea mtDNA (NC_002007); pp-mF — mF plasmid of Physarum polycephalum (D29637), bv-mt — Beta vulgaris mtDNA (NC_002511); dc-mt — Daucus carota mtDNA (AY521591); prd1 — bacteriophage PRD1 (NC001421); phi29 — bacteriophage phi29 (X53370). Five evolutionary conserved motifs characteristic of polymerase domain in family B DNA polymerases are those listed by Truniger et al. (1998). Numbers within each sequence indicate the number of amino-acid residues excluded from the alignment.
POLIIb that is specific for protein-primed DNA polymerases and, at least in phi29 DNA polymerase, also plays a role in strand displacement and processivity (Rodriguez et al., 2005). 3.3. Codon usage and genetic code The analyses of the codon usage among the 13 energy pathway protein genes and, separately, in the dnab are shown in Table 3. The table is compiled based on a minimally modified genetic code
(TGA=tryptophan as the only deviation) deduced for mitochondrial protein synthesis in A. aurita. However, the specificity of CGN codons, tentatively identified as arginine, remains suspect due to their rarity in the genome. Indeed, among the 13 energy pathway protein genes, two codons CGC and CGG are never used and two other codons in the CGN family are used only once each. The scarcity of CGN codons does not reflect the absence of arginine in mitochondrial proteins: two other codons specifying arginine (AGA and AGG) are found 73 times. Neither can it be
Table 3
Codon usage among the 13 energy pathway protein genes (A.a.) and ORF969 (ORF)

AA   Codon  A.a.  ORF    AA   Codon  A.a.  ORF    AA   Codon  A.a.  ORF    AA   Codon  A.a.  ORF
Phe  TTT    220   10     Ser  TCT    138   13     Tyr  TAT    123   12     Cys  TGT    26    1
Phe  TTC    78    9      Ser  TCC    57    4      Tyr  TAC    41    9      Cys  TGC    9     0
Leu  TTA    271   10     Ser  TCA    73    7      TER  TAA    11    1      Trp  TGA    62    0
Leu  TTG    95    4      Ser  TCG    9     0      TER  TAG    2     0      Trp  TGG    25    0
Leu  CTT    85    8      Pro  CCT    74    11     His  CAT    58    1      Arg  CGT    1     1
Leu  CTC    14    5      Pro  CCC    24    4      His  CAC    20    5      Arg  CGC    0     2
Leu  CTA    103   8      Pro  CCA    46    2      Gln  CAA    66    8      Arg  CGA    1     0
Leu  CTG    28    2      Pro  CCG    5     1      Gln  CAG    16    4      Arg  CGG    0     1
Ile  ATT    162   12     Thr  ACT    79    12     Asn  AAT    78    12     Ser  AGT    58    2
Ile  ATC    42    8      Thr  ACC    43    5      Asn  AAC    56    11     Ser  AGC    26    1
Ile  ATA    252   14     Thr  ACA    70    9      Lys  AAA    88    25     Arg  AGA    55    9
Met  ATG    123   5      Thr  ACG    6     0      Lys  AAG    34    4      Arg  AGG    18    3
Val  GTT    132   6      Ala  GCT    117   3      Asp  GAT    67    7      Gly  GGT    62    1
Val  GTC    27    0      Ala  GCC    64    2      Asp  GAC    28    3      Gly  GGC    31    1
Val  GTA    116   3      Ala  GCA    63    6      Glu  GAA    74    6      Gly  GGA    115   5
Val  GTG    16    0      Ala  GCG    4     0      Glu  GAG    26    2      Gly  GGG    61    3
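The codon counts discussed in Section 3.3 (the rarity of CGN arginine codons, the 73 AGA/AGG codons, and the preference for TGA over TGG) can be tallied directly from Table 3. The following Python sketch uses only the arginine, tryptophan and NCG entries of the table. The dictionary and variable names are illustrative.

```python
# Codon counts from Table 3: (13 energy pathway protein genes, ORF969).
COUNTS = {
    "CGT": (1, 1), "CGC": (0, 2), "CGA": (1, 0), "CGG": (0, 1),  # Arg (CGN)
    "AGA": (55, 9), "AGG": (18, 3),                              # Arg (AGR)
    "TGA": (62, 0), "TGG": (25, 0),                              # Trp
    "TCG": (9, 0), "CCG": (5, 1), "ACG": (6, 0), "GCG": (4, 0),  # NCG codons
}

def total(codons, column):
    """Sum the counts of the given codons in column 0 (genes) or 1 (ORF969)."""
    return sum(COUNTS[c][column] for c in codons)

CGN = ["CGT", "CGC", "CGA", "CGG"]
cgn_genes = total(CGN, 0)            # CGN codons in the 13 protein genes
cgn_orf = total(CGN, 1)              # CGN codons in ORF969
agr_genes = total(["AGA", "AGG"], 0)
tga_share = COUNTS["TGA"][0] / (COUNTS["TGA"][0] + COUNTS["TGG"][0])

if __name__ == "__main__":
    print(cgn_genes, cgn_orf, agr_genes, round(tga_share, 2))
```

Under this tally the 13 protein genes contain only two CGN codons against 73 AGA/AGG codons, ORF969 contains four CGN codons, and about 71% of tryptophan codons in the protein genes are TGA.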
explained by the shortage of CpG dinucleotides, typical for most genomes (Ohno, 1988) because NCG codons specifying serine, proline, threonine, and alanine are still present, although less frequently than other synonymous codons in their codon families. Instead, the paucity of CGN codons may be explained if some tRNAs for the CGN arginine codon family are not imported into the mitochondria. If this is the case then the only CGT codon found close to the 3′ end of nad6 may actually be used as a stop codon. This would explain a relatively large overlap between nad6 and nad3 (Fig. 1). By contrast, the only CGA codon is likely translated because it occurs 132 nts upstream of the 3′ end of nad1 in a well-conserved region. We also note that CGN codons occur 4 times in the tentative dnab. Whether this finding argues against its expression in A. aurita mitochondria is unclear. Another notable feature of codon usage is the preference of TGA over TGG to code for tryptophan. Although such preference is consistent with the general bias against codons ending in G or C, it is opposite to the codon usage in anthozoan mtDNA where the TGG codon is clearly preferred (Beaton et al., 1998). 3.4. tRNA genes Genes for only two tRNAs (methionine and tryptophan) were found in the A. aurita mtDNA, as in most other cnidarian mtDNAs studied to date. Although it is possible that some tRNA genes are located in the unsequenced portion of the genome, it is much more likely that the rest of tRNAs are nuclear encoded and imported into the mitochondria. Inferred secondary structures for tRNAs encoded in A. aurita mtDNA are presented in Fig. 3. A. aurita trnM(cau)f is identical in size (71 nt) to corresponding genes in M. senile, S. glaucum and Hydra attenuata (Pont-Kingdon et al., 1994, 1998, 2000) and shares 52–80% sequence identity with them. A. aurita trnW(uca) is identical in size (70 nt) to those in M. senile mtDNA and A. tenuis but is one nucleotide larger than in H. 
attenuata mtDNA (Beagley et al., 1998; Pont-Kingdon et al., 2000; van Oppen et al., 2002) and shares with these tRNAs 62–74% sequence identity.
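The minimally modified genetic code described in Section 3.3 differs from the standard code only in reading TGA as tryptophan, which is why a tRNA with the UCA anticodon has to decode both TGA and TGG. As a minimal sketch of that difference, the Python code below builds the standard codon table and derives the modified one from it. The names `STANDARD_CODE`, `CNIDARIAN_MT_CODE` and `translate`, and the test sequence, are illustrative assumptions and are not taken from the paper.

```python
from itertools import product

# Standard genetic code with codons ordered T, C, A, G at each position.
BASES = "TCAG"
STANDARD_AA = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODONS = ["".join(c) for c in product(BASES, repeat=3)]
STANDARD_CODE = dict(zip(CODONS, STANDARD_AA))

# Single deviation deduced for A. aurita mitochondria: TGA = tryptophan.
CNIDARIAN_MT_CODE = dict(STANDARD_CODE, TGA="W")

def translate(seq, code):
    """Translate in frame 1 and stop at the first termination codon."""
    seq = seq.upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = code[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

if __name__ == "__main__":
    toy = "ATGTGATGGTAA"  # hypothetical sequence: Met, TGA, TGG, stop
    print(translate(toy, STANDARD_CODE))      # TGA read as a stop codon
    print(translate(toy, CNIDARIAN_MT_CODE))  # TGA read as tryptophan
```

With the standard table the toy sequence terminates at TGA after the initial methionine, whereas the modified table reads through both tryptophan codons and stops at TAA.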
Similar to other cnidarian mt-tRNAs, those encoded by A. aurita mtDNA largely follow the pattern of conservation described for bacterial and nuclear tRNAs (Rich and RajBhandary, 1976; Sprinzl et al., 1989). Among the structural deviations in A. aurita mt-tRNAs is an atypical R11–Y24 pair in tRNA-Trp(UCA), which is generally characteristic of animal mt-tRNA-Trp(UCA) (Wolstenholme, 1992; Lavrov et al., 2005). The R11–Y24 pair is otherwise a distinctive feature of bacterial, archaeal, and organellar initiator tRNA-Met(CAU) that is strongly counterselected in elongator tRNAs (Marck and Grosjean, 2002). In addition, several characteristic features of bacterial and nuclear initiator tRNA-Met(CAU) [such as C3G70, G31-C39, and G15 (Marck and Grosjean, 2002)] are absent in A. aurita mt-tRNA-Met(CAU). The retention of genes for methionine and tryptophan tRNAs in A. aurita and most other cnidarian mtDNAs can be explained by the unique role of these tRNAs in mitochondrial translation: tRNA-Met(CAU) is used for the initiation of mitochondrial translation with formylmethionine (Smith and Marcker, 1968), while tRNA-Trp(UCA) must translate TGA in addition to TGG codons as tryptophan. Interestingly, trnW is absent from the mtDNA of S. glaucum, and either one or both of these tRNA genes are lost from chaetognath mtDNA (Helfenbein et al., 2004; Papillon et al., 2004).
3.5. rRNA genes
Genes for small and large subunit mitochondrial rRNAs (rns and rnl) are located more than 7 kbp apart in A. aurita mtDNA and have opposite transcriptional polarities (Fig. 1). Such an arrangement is relatively rare in animal mtDNA, where the two ribosomal genes are often clustered together and/or transcribed from the same strand, but it is not unprecedented (Boore, 1999). Other cnidarian mtDNAs studied to date have the two ribosomal genes transcribed from the same strand but separated by several other genes. Based on the secondary structure modeling (Supplementary Figs.
1 and 2) we deduced the length of rns as 960 nt, and the length of rnl as 1817 nt. This makes them a little shorter
Fig. 3. Cloverleaf representation of gene sequences for tRNA-Met(CAU) and tRNA-Trp(UCA) encoded in A. aurita mtDNA. Nucleotides discussed in the text are numbered; numbering is based on the convention used for yeast tRNA phenylalanine (Robertus et al., 1974).
than the homologous genes in two demosponges (Lavrov et al., 2005) and anthozoans, but longer than those in most bilaterian animals (Wolstenholme, 1992). 3.6. Non-coding regions and terminal repeats The mitochondrial genome of A. aurita is remarkably compact and contains few intergenic nucleotides. The longest intergenic region (93 bp) is found between cox1 and cox2 and coincides with the change in the transcriptional polarity of A. aurita genes. The second largest non-coding region is located upstream of trnM and is only 11 bp in size. All other intergenic regions are 5 bp or less and several genes appear to overlap. The scarcity of non-
coding nucleotides in the A. aurita mitochondrial genome represents a sharp contrast to anthozoan (and especially hexacorallian) mtDNA where multiple relatively large non-coding regions are often present. It also poses a question about the expression of genes in this genome: the presence of multiple intergenic regions along with complete stop codons in anthozoan mtDNA was suggested as possible evidence for different transcriptional mechanisms in the Cnidaria compared to bilaterian animals (Wolstenholme, 1992). In addition to non-coding intergenic regions, non-coding flanking regions are present in the A. aurita genome. The determined sequences at two ends of the molecule are nearly identical but inverted, lack any obvious potential secondary structures and
Fig. 4. Phylogenetic analysis of animal relationships based on maximum likelihood (ML) and Bayesian (MB) analyses of derived protein sequences. A. ML tree. The first number at each node indicates the percentage of ML bootstrap support; the second number shows the posterior probability in percent. Phylogenetic relationships not supported by the MB consensus tree are marked with minus signs. The protein sequences for Cantharellus cibarius, Hypocrea jecorina, and Rhizopus oryzae were downloaded from http://megasun.bch.umontreal.ca/People/lang/FMGP/proteins.html. Other protein sequences were inferred from GenBank files: Katharina tunicata U09810, Limulus polyphemus AF216203, Asterina pectinifera D16387, Mustelus manazo AF347015, Acropora tenuis AF338425, Metridium senile AF000023, Montastraea annularis AP008973, Sarcophyton glaucum AF064823, AF063191, Amoebidium parasiticum AF538042–AF538052, Monosiga brevicollis AF538053, Allomyces macrogynus U41288, Arabidopsis thaliana NC_001284, Marchantia polymorpha NC_001660, Nephroselmis olivacea AF110138, Prototheca wickerhamii NC_001613.
telomere-like repeat elements, and do not show any significant similarity to known sequences. Such long inverted terminal repeats are common in linear mtDNA and linear plasmids and can be associated i) with 3′ end overhangs, or ii) with covalently bound terminal proteins, or iii) with terminal loops (Nosek and Tomaska, 2002) that protect the ends of the molecule. The exact nature of the terminal ends in the A. aurita genome was not determined.
3.7. Phylogenetic analysis
Phylogenetic analyses based on concatenated amino acid sequences from twelve mitochondrial protein genes reveal an overall conventional tree of eukaryotic relationships, with strong support for most of the inferred clades (Fig. 4). However, the recovered relationships within the Metazoa are clearly unconventional. As reported in our previous study, analyses of mitochondrial protein sequences strongly support the division of animals into two sister groups, the Bilateria and the Diploblastica (Porifera + Cnidaria), a likely artifact of different rates of sequence evolution in these two groups (Lavrov et al., 2005). Furthermore, our present analysis does not recover Cnidaria as a monophyletic phylum. Instead, anthozoan species from the subclass Zoantharia (=Hexacorallia) are placed as a sister group to three demosponges, to the exclusion of A. aurita and S. glaucum. However, the support for the non-monophyletic Cnidaria is weak and is better interpreted as a lack of resolution for these relationships. The lack of resolution and conflicting phylogenetic signals at the base of the metazoan tree have been reported previously for rRNA (Medina et al., 2001) and nuclear protein gene sequences (Rokas et al., 2003, 2005). Thus other characters, such as gene orders and genome physical structures, may be more informative for understanding cnidarian relationships.
3.8. Implications for cnidarian mtDNA evolution
The availability of nearly complete mitochondrial genomes from two major lineages of Cnidaria (Anthozoa and Scyphozoa) allows us to make some inferences about mitochondrial evolution in this phylum. Comparisons of A. aurita mitochondrial sequences with those previously available from anthozoans reveal two features characteristic of the phylum. The first is the use of nuclear-encoded tRNAs for mitochondrial translation. All cnidarian mtDNA sequences determined to date lack all but one or two mt-tRNA genes. This implies that the loss of these genes from mtDNA and the acquisition of the mitochondrial tRNA import machinery likely occurred in the common ancestor of all cnidarians [as previously suggested by Beaton et al. (1998)]. Our unpublished data on Hydra mtDNA support this conclusion. The second feature that we perceive to be specific to cnidarian mtDNA can be characterized as a tendency to acquire and incorporate foreign DNA. This tendency is manifested in the presence of the mutS homologue in the mtDNA of S. glaucum and the sea pen R. kolikeri (Pont-Kingdon et al., 1995, 1998), a homing nuclease within mitochondrial group I introns of M. senile (Beagley et al., 1996), and dnaB (this study). We want to emphasize that the presence of these genes is unique among Metazoa:
no other group of animals has foreign (non-mitochondrial) genes in mtDNA. Clearly, there is more exchange between the nuclear and mitochondrial genomes in Cnidaria than in other animals. We can see two possible explanations for this observation. First, because both homologous and non-homologous recombination are needed to insert foreign DNA into the mitochondrial genome and to repair double-strand breaks produced by homing endonuclease activity (Chevalier and Stoddard, 2001), we suggest that recombination may be more frequent in cnidarian mitochondria than in those of other animals. Second, we suspect that the same mechanism used for tRNA import may help to transfer foreign DNA into mitochondria. Further studies are needed to investigate these intriguing possibilities.

Acknowledgements

We thank Karri Haen and Monica Medina for valuable comments on an earlier version of this manuscript. Part of the research reported in this paper was conducted by Shannon Graf and Dennis Lavrov in the laboratory of Dr. Wesley Brown at the University of Michigan.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.gene.2006.06.021.

References

Arai, M.N., 1996. Functional Biology of Scyphozoa. Chapman & Hall, New York, NY.
Armstrong, M.R., Blok, V.C., Phillips, M.S., 2000. A multipartite mitochondrial genome in the potato cyst nematode Globodera pallida. Genetics 154, 181–192.
Beagley, C.T., Macfarlane, J.L., Pont-Kingdon, G.A., Okimoto, R., Okada, N.A., Wolstenholme, D.R., 1995. Mitochondrial genomes of Anthozoa (Cnidaria). In: Palmieri, F., Pappa, S., Saccone, C., Gadaleta, N. (Eds.), Progress in Cell Research-Symposium on Thirty Years of Progress in Mitochondrial Bioenergetics and Molecular Biology. Elsevier, Amsterdam, pp. 149–153.
Beagley, C.T., Okada, N.A., Wolstenholme, D.R., 1996. Two mitochondrial group I introns in a metazoan, the sea anemone Metridium senile: one intron contains genes for subunits 1 and 3 of NADH dehydrogenase. Proc. Natl. Acad. Sci. U. S. A. 93, 5619–5623.
Beagley, C.T., Okimoto, R., Wolstenholme, D.R., 1998. The mitochondrial genome of the sea anemone Metridium senile (Cnidaria): introns, a paucity of tRNA genes, and a near-standard genetic code. Genetics 148, 1091–1108.
Beaton, M.J., Roger, A.J., Cavalier-Smith, T., 1998. Sequence analysis of the mitochondrial genome of Sarcophyton glaucum: conserved gene order among octocorals. J. Mol. Evol. 47, 697–708.
Benson, D.A., Karsch-Mizrachi, I., Lipman, D.J., Ostell, J., Wheeler, D.L., 2003. GenBank. Nucleic Acids Res. 31, 23–27.
Berntson, E.A., France, S.C., Mullineaux, L.S., 1999. Phylogenetic relationships within the class Anthozoa (phylum Cnidaria) based on nuclear 18S rDNA sequences. Mol. Phylogenet. Evol. 13, 417–433.
Bois, F., Barroso, G., Gonzalez, P., Labarere, J., 1999. Molecular cloning, sequence and expression of Aa-polB, a mitochondrial gene encoding a family B DNA polymerase from the edible basidiomycete Agrocybe aegerita. Mol. Gen. Genet. 261, 508–513.
Boore, J.L., 1999. Animal mitochondrial genomes. Nucleic Acids Res. 27, 1767–1780.
Braithwaite, D.K., Ito, J., 1993. Compilation, alignment, and phylogenetic relationships of DNA polymerases. Nucleic Acids Res. 21, 787–802.
Bridge, D., Cunningham, C.W., Schierwater, B., DeSalle, R., Buss, L.W., 1992. Class-level relationships in the phylum Cnidaria: evidence from mitochondrial genome structure. Proc. Natl. Acad. Sci. U. S. A. 89, 8750–8753.
Bridge, D., Cunningham, C.W., DeSalle, R., Buss, L.W., 1995. Class-level relationships in the phylum Cnidaria: molecular and morphological evidence. Mol. Biol. Evol. 12, 679–689.
Brusca, R.C., Brusca, G.J., 2002. Invertebrates. Sinauer Associates, Sunderland, MA.
Burger, G., Saint-Louis, D., Gray, M.W., Lang, B.F., 1999. Complete sequence of the mitochondrial DNA of the red alga Porphyra purpurea. Cyanobacterial introns and shared ancestry of red and green algae. Plant Cell 11, 1675–1694.
Burger, G., et al., 2000. Complete sequence of the mitochondrial genome of Tetrahymena pyriformis and comparison with Paramecium aurelia mitochondrial DNA. J. Mol. Biol. 297, 365–380.
Chevalier, B.S., Stoddard, B.L., 2001. Homing endonucleases: structural and functional insight into the catalysts of intron/intein mobility. Nucleic Acids Res. 29, 3757–3774.
Coleman, A.W., Thompson, W.F., Goff, L.J., 1991. Identification of the mitochondrial genome in the chrysophyte alga Ochromonas danica. J. Protozool. 38, 129–135.
Collins, A.G., 2002. Phylogeny of Medusozoa and the evolution of cnidarian life cycles. J. Evol. Biol. 15, 418–432.
Collins, A.G., Daly, M., 2005. A new deepwater species of Stauromedusae, Lucernaria janetae (Cnidaria, Staurozoa, Lucernariidae), and a preliminary investigation of stauromedusan phylogeny based on nuclear and mitochondrial rDNA data. Biol. Bull. 208, 221–230.
Dawson, M.N., Jacobs, D.K., 2001. Molecular evidence for cryptic species of Aurelia aurita (Cnidaria, Scyphozoa). Biol. Bull. 200, 92–96.
De Rijk, P., Wuyts, J., De Wachter, R., 2003. RnaViz 2: an improved representation of RNA secondary structure. Bioinformatics 19, 299–300.
Ender, A., Schierwater, B., 2003. Placozoa are not derived cnidarians: evidence from molecular morphology. Mol. Biol. Evol. 20, 130–134.
Folmer, O., Black, M., Hoeh, W., Lutz, R., Vrijenhoek, R., 1994. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Mol. Mar. Biol. Biotechnol. 3, 294–299.
Forget, L., Ustinova, J., Wang, Z., Huss, V.A., Lang, B.F., 2002. Hyaloraphidium curvatum: a linear mitochondrial genome, tRNA editing, and an evolutionary link to lower fungi. Mol. Biol. Evol. 19, 310–319.
Freemont, P.S., Ollis, D.L., Steitz, T.A., Joyce, C.M., 1986. A domain of the Klenow fragment of Escherichia coli DNA polymerase I has polymerase but no exonuclease activity. Proteins 1, 66–73.
Fukami, H., Knowlton, N., 2005. Analysis of complete mitochondrial DNA sequences of three members of the Montastraea annularis coral species complex (Cnidaria, Anthozoa, Scleractinia). Coral Reefs 24, 410–417.
Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704.
Helfenbein, K.G., Fourcade, H.M., Vanjani, R.G., Boore, J.L., 2004. The mitochondrial genome of Paraspadella gotoi is highly reduced and reveals that chaetognaths are a sister group to protostomes. Proc. Natl. Acad. Sci. U. S. A. 101, 10639–10643.
Joyce, C.M., Steitz, T.A., 1994. Function and structure relationships in DNA polymerases. Ann. Rev. Biochem. 63, 777–822.
Lang, B.F., Gray, M.W., Burger, G., 1999. Mitochondrial genome evolution and the origin of eukaryotes. Annu. Rev. Genet. 33, 351–397.
Lavrov, D.V., Forget, L., Kelly, M., Lang, B.F., 2005. Mitochondrial genomes of two demosponges provide insights into an early stage of animal evolution. Mol. Biol. Evol. 22, 1231–1239.
Lowe, T.M., Eddy, S.R., 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 25, 955–964.
Löytynoja, A., Milinkovitch, M.C., 2001. SOAP, cleaning multiple alignments from unstable blocks. Bioinformatics 17, 573–574.
Marck, C., Grosjean, H., 2002. tRNomics: analysis of tRNA genes from 50 genomes of Eukarya, Archaea, and Bacteria reveals anticodon-sparing strategies and domain-specific features. RNA 8, 1189–1232.
Marques, A.C., Collins, A.G., 2004. Cladistic analysis of Medusozoa and cnidarian evolution. Invertebr. Biol. 123, 23–42.
Medina, M., Collins, A.G., Silberman, J.D., Sogin, M.L., 2001. Evaluating hypotheses of basal animal phylogeny using complete sequences of large and small subunit rRNA. Proc. Natl. Acad. Sci. U. S. A. 98, 9707–9712.
Meinhardt, F., Kempken, F., Kamper, J., Esser, K., 1990. Linear plasmids among eukaryotes: fundamentals and application. Curr. Genet. 17, 89–95.
Meinhardt, F., Schaffrath, R., Larsen, M., 1997. Microbial linear plasmids. Appl. Microbiol. Biotechnol. 47, 329–336.
Mouhamadou, B., Barroso, G., Labarere, J., 2004. Molecular evolution of a mitochondrial polB gene, encoding a family B DNA polymerase, towards the elimination from Agrocybe mitochondrial genomes. Mol. Genet. Genomics 272, 257–263.
Nosek, J., Tomaska, L., 2002. Mitochondrial telomeres: alternative solutions to the end-replication problem. In: Krupp, G., Parwaresch, R. (Eds.), Telomeres, Telomerases and Cancer. Kluwer Academic/Plenum Publishers, New York, NY, pp. 396–417.
Nosek, J., Tomaska, L., 2003. Mitochondrial genome diversity: evolution of the molecular architecture and replication strategy. Curr. Genet. 44, 73–84.
Ohno, S., 1988. Universal rule for coding sequence construction: TA/CG deficiency–TG/CT excess. Proc. Natl. Acad. Sci. U. S. A. 85, 9630–9634.
Papillon, D., Perez, Y., Caubit, X., Le Parco, Y., 2004. Identification of chaetognaths as protostomes is supported by the analysis of their mitochondrial genome. Mol. Biol. Evol. 21, 2122–2129.
Pearson, W.R., 1994. Using the FASTA program to search protein and DNA sequence databases. Methods Mol. Biol. 25, 365–389.
Pont-Kingdon, G.A., Beagley, C.T., Okimoto, R., Wolstenholme, D.R., 1994. Mitochondrial DNA of the sea anemone, Metridium senile (Cnidaria): prokaryote-like genes for tRNA(f-Met) and small-subunit ribosomal RNA, and standard genetic code specificities for AGR and ATA codons. J. Mol. Evol. 39, 387–399.
Pont-Kingdon, G.A., et al., 1995. A coral mitochondrial mutS gene. Nature 375, 109–111.
Pont-Kingdon, G., et al., 1998. Mitochondrial DNA of the coral Sarcophyton glaucum contains a gene for a homologue of bacterial MutS: a possible case of gene transfer from the nucleus to the mitochondrion. J. Mol. Evol. 46, 419–431.
Pont-Kingdon, G., Vassort, C.G., Warrior, R., Okimoto, R., Beagley, C.T., Wolstenholme, D.R., 2000. Mitochondrial DNA of Hydra attenuata (Cnidaria): a sequence that includes an end of one linear molecule and the genes for l-rRNA, tRNA(f-Met), tRNA(Trp), COII, and ATPase8. J. Mol. Evol. 51, 404–415.
Rich, A., RajBhandary, U.L., 1976. Transfer RNA: molecular structure, sequence, and properties. Ann. Rev. Biochem. 45, 805–860.
Robertus, J.D., Ladner, J.E., Finch, J.T., Rhodes, D., Brown, R.S., Clark, B.F., Klug, A., 1974. Structure of yeast phenylalanine tRNA at 3 Å resolution. Nature 250, 546–551.
Robison, M.M., Horgen, P.A., 1996. Plasmid RNA polymerase-like mitochondrial sequences in Agaricus bitorquis. Curr. Genet. 29, 370–376.
Robison, M.M., Wolyn, D.J., 2005. A mitochondrial plasmid and plasmid-like RNA and DNA polymerases encoded within the mitochondrial genome of carrot (Daucus carota L.). Curr. Genet. 47, 57–66.
Robison, M.M., Royer, J.C., Horgen, P.A., 1991. Homology between mitochondrial DNA of Agaricus bisporus and an internal portion of a linear mitochondrial plasmid of Agaricus bitorquis. Curr. Genet. 19, 495–502.
Rodriguez, I., et al., 2005. A specific subdomain in phi29 DNA polymerase confers both processivity and strand-displacement capacity. Proc. Natl. Acad. Sci. U. S. A. 102, 6407–6412.
Rokas, A., King, N., Finnerty, J., Carroll, S.B., 2003. Conflicting phylogenetic signals at the base of the metazoan tree. Evolut. Develop. 5, 346–359.
Rokas, A., Kruger, D., Carroll, S.B., 2005. Animal evolution and the molecular signature of radiations compressed in time. Science 310, 1933–1938.
Ronquist, F., Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.
Rousvoal, S., Oudot, M., Fontaine, J., Kloareg, B., Goer, S.L., 1998. Witnessing the evolution of transcription in mitochondria: the mitochondrial genome of the primitive brown alga Pylaiella littoralis (L.) Kjellm. encodes a T7-like RNA polymerase. J. Mol. Biol. 277, 1047–1057.
Schroth, W., Jarms, G., Streit, B., Schierwater, B., 2002. Speciation and phylogeography in the cosmopolitan marine moon jelly, Aurelia sp. BMC Evol. Biol. 2, 1.
Smith, A.E., Marcker, K.A., 1968. N-formylmethionyl transfer RNA in mitochondria from yeast and rat liver. J. Mol. Biol. 38, 241–243.
Sprinzl, M., Hartmann, T., Weber, J., Blank, J., Zeidler, R., 1989. Compilation of tRNA sequences and sequences of tRNA genes. Nucleic Acids Res. 17, 1–172 (Suppl.).
Staden, R., 1996. The Staden sequence analysis package. Mol. Biotechnol. 5, 233–241.
Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
Truniger, V., Lazaro, J.M., Salas, M., Blanco, L., 1998. Phi 29 DNA polymerase requires the N-terminal domain to bind terminal protein and DNA primer substrates. J. Mol. Biol. 278, 741–755.
Vahrenholz, C., Riemen, G., Pratje, E., Dujon, B., Michaelis, G., 1993. Mitochondrial DNA of Chlamydomonas reinhardtii: the structure of the ends of the linear 15.8-kb genome suggests mechanisms for DNA replication. Curr. Genet. 24, 241–247.
van Oppen, M.J., Catmull, J., McDonald, B.J., Hislop, N.R., Hagerman, P.J., Miller, D.J., 2002. The mitochondrial genome of Acropora tenuis (Cnidaria; Scleractinia) contains a large group I intron and a candidate control region. J. Mol. Evol. 55, 1–13.
Weber, B., Borner, T., Weihe, A., 1995. Remnants of a DNA polymerase gene in the mitochondrial DNA of Marchantia polymorpha. Curr. Genet. 27, 488–490.
Wesley, U.V., Wesley, C.S., 1997. Rapid directional walk within DNA clones by step-out PCR. Methods Mol. Biol. 67, 279–285.
Wolstenholme, D.R., 1992. Animal mitochondrial DNA: structure and evolution. Int. Rev. Cyt. 141, 173–216.