The complete mitochondrial genome of the helmet catfish Cranoglanis bouderius (Siluriformes: Cranoglanididae) and the phylogeny of otophysan fishes



Gene 376 (2006) 290 – 297 www.elsevier.com/locate/gene

Zuogang Peng a, Jun Wang b, Shunping He a,⁎

a Laboratory of Fish Phylogenetics and Biogeography, Institute of Hydrobiology, Chinese Academy of Sciences, 7th Donghu South Road, Wuhan 430072, China
b Beijing Institute of Genomics of Chinese Academy of Sciences, Beijing Genomics Institute, Beijing Proteomics Institute, Beijing 101300, China

Received 22 March 2006; received in revised form 17 April 2006; accepted 19 April 2006
Available online 3 May 2006
Received by J.G. Zhang

Abstract

The complete sequence of the 16,539-nucleotide mitochondrial genome of the helmet catfish Cranoglanis bouderius, the single species of the catfish family Cranoglanididae, was determined using the long and accurate polymerase chain reaction (LA PCR) method. The nucleotide sequence of C. bouderius mitochondrial DNA was compared with those of three other catfish species of the same order. The C. bouderius mitochondrial genome contains 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, and a non-coding control region, in a gene order identical to that observed in most other vertebrates. Phylogenetic analyses of 13 otophysan fishes were performed using the Bayesian method, based on the concatenated mtDNA protein-coding gene sequences and on the individual protein-coding gene sequence data sets. The competing otophysan topologies were then tested using the approximately unbiased test, the Kishino–Hasegawa test, and the Shimodaira–Hasegawa test. The results show that the grouping ((((Characiformes, Gymnotiformes), Siluriformes), Cypriniformes), outgroup) is the most likely, but there is no significant difference between this grouping and the other alternative hypotheses. In addition, the phylogenetic placement of the family Cranoglanididae among siluriform families is discussed.

© 2006 Elsevier B.V. All rights reserved.

Keywords: Helmet catfish; Otophysi; Mitochondrial genome; Phylogeny

Abbreviations: ATP 6 and 8, ATPase subunits 6 and 8; bp, base pair(s); COI–III, cytochrome c oxidase subunits I–III; cyt b, cytochrome b; ND1–6, 4L, NADH dehydrogenase subunits 1–6, 4L; LA PCR, long and accurate polymerase chain reaction; tRNA, transfer RNA; 12S rRNA and 16S rRNA, 12S and 16S ribosomal RNA.
⁎ Corresponding author. Tel.: +86 27 68780430; fax: +86 27 68780071. E-mail address: [email protected] (S. He).
0378-1119/$ - see front matter © 2006 Elsevier B.V. All rights reserved. doi:10.1016/j.gene.2006.04.014

Z. Peng et al. / Gene 376 (2006) 290–297

1. Introduction

Mitochondrial DNA is commonly used in population and phylogenetic studies because of its maternal mode of inheritance and relative lack of recombination. The mitochondrial DNA (mtDNA) of most animals is a self-replicating, circular DNA molecule about 16 kb long that codes for 13 mitochondrial proteins, 22 mitochondrial tRNAs, and two mitochondrion-specific ribosomal RNAs, the 12S and 16S rRNAs. It also contains a region that controls its replication and transcription (the control region). From the evolutionary viewpoint, mtDNAs are “small genomes” that co-evolve at their own rate with the organism in which they are lodged. Complete mitochondrial genome sequences have been reported for numerous vertebrates, including many fishes (e.g., loach, Tzeng et al., 1992; carp, Chang et al., 1994; sea lamprey, Lee and Kocher, 1995; cod, Johansen and Bakke, 1996; bichir, Noack et al., 1996; lungfish, Zardoya and Meyer, 1996a; coelacanth, Zardoya and Meyer, 1997; dogfish, Delarbre et al., 1998; Miya et al., 2003; etc.). Although gene rearrangements have been described in some species (e.g., Miya and Nishida, 1999), the gene content (13 protein-coding genes, 22 transfer RNAs [tRNAs] and two ribosomal RNAs [rRNAs]) and organization of fish mitochondrial genomes are quite conserved. This conservation facilitates their alignment and identification.

The monophyletic group of teleost fishes known as the Otophysi (the ostariophysan groups excluding Gonorynchiformes) represents about two thirds of all freshwater fishes and constitutes one of the most diverse vertebrate taxa, including the Cypriniformes (carps and loaches), Characiformes (piranhas and relatives), Siluriformes (catfishes), and Gymnotiformes (knifefishes) (Fink and Fink, 1981, 1996). Morphological phylogenetic studies have so far resulted in two major hypotheses concerning otophysan relationships, using Gonorynchiformes as outgroups (Fig. 1A, B; Rosen and Greenwood, 1970; Fink and Fink, 1981, 1996). However, recent molecular cladistic studies have resulted in two other hypotheses regarding the phylogenetic relationships among the four otophysan orders (the third hypothesis, Fig. 1C: Dimmick and Larson, 1996; Orti, 1997; Saitoh et al., 2003; and the fourth hypothesis, Fig. 1D: Lavoue et al., 2005). Unlike the first hypothesis (Fig. 1A), the other three hypotheses hold that both the cypriniforms and the characiphysans (characiforms, siluriforms, and gymnotiforms) are monophyletic groups; the phylogenetic relationships within characiphysans, however, remain highly uncertain (Fig. 1B, C, D).

The helmet catfish, Cranoglanis bouderius (Siluriformes, Cranoglanididae), is an otophysan species endemic to the water systems of the Pearl River, the Red River, and drainages on Hainan Island. It was originally one of the main food fishes of the Pearl River basin, but its numbers have decreased markedly during the past several decades owing to rapid human population growth and overfishing; C. bouderius is now listed among the vulnerable species in the Chinese Red Data Book of Endangered Animals (Yue and Chen, 1998). To understand the mitochondrial genome structure of C. bouderius, we determined the complete mitochondrial genome sequence of this catfish, which is the only valid species of the family Cranoglanididae (Liu et al., 2005). Based on this new mitochondrial genome sequence, as well as on those of 13 other species retrieved from GenBank, including one gonorynchiform species as an outgroup, we reexamined the phylogenetic relationships of otophysan fishes. We hope that knowledge of the mitochondrial genome sequence of C. bouderius will also help to clarify the higher-level phylogeny of catfishes.

Fig. 1. Hypotheses of otophysan phylogeny. A) Hypothesis of Rosen and Greenwood (1970) based on morphological data. B) Hypothesis of Fink and Fink (1981, 1996) based on morphological characters. C) Hypothesis of Dimmick and Larson (1996) based on approximately 2000 bp of mitochondrial DNA sequence data, of Orti (1997) based on molecular data from the first and second codon positions of the ependymin gene, and of Saitoh et al. (2003) based on complete mitochondrial genome data. D) Hypothesis of Lavoue et al. (2005) based on complete mitochondrial genome data.

2. Materials and methods

2.1. Fish sample and genomic DNA extraction, LA PCR, and sequencing

The C. bouderius sample was obtained from the Hongshui River in Guangxi province, China (Table 1). Total genomic DNA was extracted from the muscle tissue using a QIAamp tissue kit (Qiagen, Germany) following the manufacturer's protocol. The mitochondrial genome of C. bouderius was amplified in its entirety using a long PCR technique (Miya and Nishida, 1999). The primers designed by Miya and Nishida (2000) and Inoue et al. (2001) were used to amplify the total mitochondrial genome in two reactions. Long PCR was done in a PTC-100 programmable thermal controller (MJ Research, USA); reactions were carried out in a 25 μl reaction volume containing 2.5 μl 10× LA PCR buffer II (Takara), 0.8 mM dNTPs, 2.5 mM MgCl2, 0.5 μM each primer, 0.625 U LA Taq polymerase (Takara) and approximately 20 ng template DNA. The thermal cycle profile was: pre-denaturation at 94 °C for 2 min, followed by 30 cycles of denaturation at 98 °C for 10 s and annealing and extension combined at the same temperature (68 °C) for 16 min. The products were electrophoresed on a 0.8% agarose gel (Promega, USA). The long PCR products were diluted in sterilised distilled water for subsequent use as PCR templates.

Table 1
Fish species used in this study

Species                          Source                                GenBank accession no.
Otophysi
  Cypriniformes
    Carpiodes carpio             Broughton et al., unpublished         AY366087
    Crossostoma lacustre         Tzeng et al. (1992)                   M91245
    Lefua echigonia              Saitoh et al. (2003)                  AB054126
    Danio rerio                  Broughton et al. (2001)               AC024175
    Cyprinus carpio              Chang et al. (1994)                   X61010
  Characiformes
    Chalceus macrolepidotus      Saitoh et al. (2003)                  AB054130
    Phenacogrammus interruptus   Saitoh et al. (2003)                  AB054129
  Gymnotiformes
    Apteronotus albifrons        Saitoh et al. (2003)                  AB054132
    Eigenmannia sp.              Saitoh et al. (2003)                  AB054131
  Siluriformes
    Pseudobagrus tokiensis       Saitoh et al. (2003)                  AB054127
    Cranoglanis bouderius        This study (IHB 03051801) a           AY898626
    Ictalurus punctatus          Waldbieser et al. (2003)              AF482987
    Pangasianodon gigas          Jondeung and Sangthong, unpublished   AY762971
Outgroup
  Gonorynchiformes
    Chanos chanos                Saitoh et al. (2003)                  AB054133

a Abbreviation for specimen depository; IHB, Institute of Hydrobiology, Chinese Academy of Sciences.

We used 30 different primers that amplify contiguous, overlapping segments to obtain the entire mitochondrial genome of the fish (Table 2). Some of these primers were versatile, designed from the complete mitochondrial genomes of six bony fish species according to Miya and Nishida (2000). The others were specific to C. bouderius and were designed from the sequences obtained with the versatile primers. PCR was then carried out in a 25 μl reaction volume containing 2.5 μl 10× PCR buffer (Takara), 0.4 mM dNTPs, 1.8 mM MgCl2, 0.2 μM each primer, 1 U Taq polymerase (Takara) and 1.0 μl of long PCR product as template. The thermal cycle profile was: pre-denaturation at 94 °C for 2 min, followed by 30 cycles of denaturation at 94 °C for 15 s, annealing at 52 °C for 15 s, and extension at 72 °C for 30 s, with a final extension at 72 °C for 5 min. PCR products were electrophoresed on a 1.0% agarose gel (Promega).
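As a rough cross-check on the two cycling profiles given above, the programmed hold times can simply be added up. The sketch below is illustrative only: the function name `program_seconds` is hypothetical, and it assumes no ramping time between temperature steps, so a real run would take somewhat longer.

```python
def program_seconds(pre_denaturation_s, cycles, step_seconds, final_extension_s=0):
    """Sum the programmed hold times of a PCR profile.

    Ramp times between temperatures are ignored (an assumption of this sketch).
    """
    return pre_denaturation_s + cycles * sum(step_seconds) + final_extension_s


# Long PCR: 94 °C for 2 min, then 30 cycles of 98 °C for 10 s and 68 °C for 16 min.
long_pcr_s = program_seconds(2 * 60, 30, [10, 16 * 60])

# Standard PCR: 94 °C for 2 min, 30 cycles of 15 s + 15 s + 30 s, final 72 °C for 5 min.
standard_pcr_s = program_seconds(2 * 60, 30, [15, 15, 30], final_extension_s=5 * 60)

print(f"long PCR:     {long_pcr_s / 3600:.1f} h of programmed holds")
print(f"standard PCR: {standard_pcr_s / 60:.0f} min of programmed holds")
```

Under these assumptions the long PCR programme holds for about 8.1 h in total, against 37 min for the standard programme, which reflects the 16 min combined annealing and extension step used for the long fragments.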
Purified double-stranded PCR products were subsequently used for direct cycle sequencing with dye-labelled terminators (ABI). The PCR primers were also used for sequencing. All sequencing reactions were performed according to the manufacturer's instructions. Labelled fragments were analyzed on a MegaBACE 1000 DNA sequencer (GE Healthcare Biosciences, USA).

2.2. Sequence analysis

DNA sequences were analyzed using the software Lasergene version 5.0 (DNASTAR). Contig assembly was performed with the program Seqman. Protein-coding, rRNA, and tRNA genes were identified by comparison with the corresponding known sequences of other catfish taxa, including Ictalurus punctatus (Waldbieser et al., 2003), Pseudobagrus tokiensis (Saitoh et al., 2003), and Pangasianodon gigas (Jondeung and Sangthong, unpublished data). More detailed species information is given in Table 1. The two rRNA genes (12S and 16S) and the 22 tRNA genes, whose alignments were ambiguous, were not used in our analyses. The mitochondrial control region and other ambiguously aligned regions, such as the 5′ and 3′ ends of several protein-coding genes, were also excluded from the analyses, leaving 11,100 nucleotide positions for the 13 protein-coding genes. Aligned sequence data in NEXUS format for each mtDNA protein-coding gene and for the concatenated mtDNA protein-coding genes are available from one of us (P.Z.) upon request.

We used a Bayesian approach to infer phylogenies from the sequences of the individual protein-coding genes and from the concatenated mtDNA protein-coding gene sequences, employing MrBayes version 3.1 (Ronquist and Huelsenbeck, 2003). For each individual protein-coding gene and for the concatenated mtDNA protein-coding gene sequences, the appropriate model of DNA substitution was specified in MrBayes using the general lset values (e.g., nst and rates), allowing the program to converge on the best estimates of the model parameters. The decision theoretic approach (DT) implemented in DT-ModSel (Minin et al., 2003) was used to select the appropriate models of evolution for our Bayesian analyses. Each Markov chain was started from a random tree and run for 10^6 generations, with every 100th cycle sampled from the chain to assure independence of the samples.
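The sampling scheme implies some simple bookkeeping that is worth making explicit. The sketch below is a back-of-the-envelope calculation, not part of the published analysis: it combines the run length and sampling interval stated here with the three independent analyses and the 2.7 million combined post-burn-in generations reported in the Fig. 2 caption. The per-run burn-in it derives rests on the assumption that all three runs discarded the same number of generations, which the text does not state. The function names are hypothetical.

```python
GENERATIONS_PER_RUN = 10**6        # stated run length
SAMPLE_EVERY = 100                 # every 100th cycle is sampled
N_RUNS = 3                         # three separate analyses were combined
COMBINED_POST_BURN_IN = 2_700_000  # from the Fig. 2 caption


def sampled_states(generations, sample_every):
    """Number of states kept when every `sample_every`-th generation is sampled."""
    return generations // sample_every


def implied_burn_in_per_run(generations, n_runs, combined_post_burn_in):
    """Burn-in per run, ASSUMING every run discarded the same number of generations."""
    return generations - combined_post_burn_in // n_runs


samples_per_run = sampled_states(GENERATIONS_PER_RUN, SAMPLE_EVERY)
burn_in = implied_burn_in_per_run(GENERATIONS_PER_RUN, N_RUNS, COMBINED_POST_BURN_IN)
kept_per_run = sampled_states(GENERATIONS_PER_RUN - burn_in, SAMPLE_EVERY)

print(samples_per_run, burn_in, kept_per_run * N_RUNS)
```

Under the equal-burn-in assumption, each run yields 10,000 sampled states, of which the first 1,000 (100,000 generations) are discarded, leaving 27,000 sampled trees across the three runs. Software that also records the starting state would report one more sample per run.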

Table 2
PCR and sequencing primers for C. bouderius, designed from the complete mitochondrial genomes of six bony fish species according to Miya and Nishida (2000)

Forward a      Sequence (5′ to 3′) b                      Reverse a      Sequence (5′ to 3′) b
L709-12S       TAC ACA TGC AAG TCT CCG CA                 H2009-16S      CCT AAG CAA CCA GCT ATA AC
L1969-16S      CGT CTC TGT GGC AAA AGA GTG G              H3058-16S      TCC GGT CTG AAC TCA GAT CAC GTA
L709-12S       TAC ACA TGC AAG TCT CCG CA                 H3934-ND1      GCG TAT TCT ACG TTG AAT CC
L2946-16S      GGG ATA ACA GCG CAA TC                     S5-6-R4        GAT GCC CAG CCT GAG CCC
L1969-16S      CGT CTC TGT GGC AAA AGA GTG G              H8319-Lys      CAC CWG TTT TTG GCT TAA AAG GC
L4633-ND2      CAC CGC CCW CGA GCA GTT GA                 H10035-Gly     CTT TCC TTG GGK TTT AAC CAA G
L8329-Lys      AGC GTT GGC CTT TTA AGC                    H10035-Gly     CTT TCC TTG GGK TTT AAC CAA G
S5-20-2F       CCC TAC ACA AGA CCT AAC CCC                H11618-ND4     TGG CTG ACK GAK GAG TAG GC
L9655-CO3      GTA ACW TGG GCT CAT CAC AG                 H12293-Leu     TTG CAC CAA GAG TTT TTG GTT CCT AAG ACC
L9220-CO3      AAC GTT TAA TGG CCC ACC AAG C              H5937-CO1      TGG GTG CCA ATG TCT TTG TG
L3074-16S      CGA TTA AAG TCC TAC GTG ATC TGA GTT CAG    S7-6-R4        AAA TGC GAT GGG CAG GGC
L9655-CO3      GTA ACW TGG GCT CAT CAC AG                 H13069-ND5     GTG CTG GAG TGK AGT AGG GC
L11424-ND4     TGA CTT CCW AAA GCC CAT GTA GA             H13727-ND5     GCG ATK ATG CTT CCT CAG GC
L12329-Leu     CTC TTG GTG CAA MTC CAA GT                 H14080-ND5     AGG TAK GTT TTG ATT AKK CC
L13562-ND5M    TCT TAC CTA AAC GCC TGA GCC CT             H1065-12S      GGC ATA GTG GGG TAT CTA ATC CCA GTT TGT

a L and H denote light and heavy strands, respectively.
b Positions with mixed bases are labelled with their IUB codes: R = A/G; Y = C/T; K = G/T; M = A/C; S = G/C; W = A/T.
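Several primers in Table 2 contain mixed-base positions, so each such entry stands for a small pool of concrete oligonucleotides. The sketch below shows one way to enumerate that pool from the IUB codes listed in the table footnote. It is an illustration only: the function name `expand_primer` is hypothetical, and only the six two-base codes given in the footnote are supported.

```python
from itertools import product

# Two-base IUB codes exactly as listed in the Table 2 footnote, plus the plain bases.
IUB = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC", "S": "GC", "W": "AT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a primer with mixed-base positions."""
    bases = primer.replace(" ", "").upper()
    choices = [IUB[base] for base in bases]  # raises KeyError on an unsupported code
    return ["".join(combo) for combo in product(*choices)]


# L4633-ND2 has one W (A/T) position, so it encodes two sequences.
l4633 = expand_primer("CAC CGC CCW CGA GCA GTT GA")

# H14080-ND5 has three K (G/T) positions, so it encodes 2**3 = 8 sequences.
h14080 = expand_primer("AGG TAK GTT TTG ATT AKK CC")

print(l4633)
print(len(h14080))
```

The pool size doubles with every two-base code, which is one reason degenerate positions are usually kept few in versatile primers.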


We ran four chains simultaneously, three heated (temperature = 0.5) and one cold, using Metropolis-coupled Markov chain Monte Carlo to enhance the mixing capabilities of the Markov chains. To check that stationarity had been reached, we monitored the fluctuating value of the likelihood and all the phylogenetic parameters graphically, and repeated each simulation two times starting from different random trees, comparing the means and variances of each model parameter. All sample points prior to reaching stationarity were discarded as “burn in”. Posterior probabilities for individual clades obtained from the three separate analyses were combined and summarized on a majority-rule consensus tree for each individual mtDNA protein-coding gene and for the concatenated mtDNA protein-coding gene sequences.

To compare competing otophysan topologies, site-wise log-likelihoods were calculated for each topology in PAUP (Swofford, 2002) and used as input for CONSEL (Shimodaira and Hasegawa, 2001). CONSEL was used to calculate the probability values according to the approximately unbiased test (AU) using the multiscale bootstrap technique (Shimodaira, 2002), the Kishino–Hasegawa test (KH; Kishino and Hasegawa, 1989), and the Shimodaira–Hasegawa test (SH; Shimodaira and Hasegawa, 1999; both weighted [w] and unweighted). CONSEL was also used to test incongruence between the trees reconstructed from the individual mtDNA protein-coding genes and from the concatenated protein-coding gene sequences.

3. Results and discussion

The complete nucleotide sequence of the L-strand of C. bouderius mtDNA was determined to be 16,539 bp long and has been deposited in GenBank (Accession No. AY898626). The structural organization of the mitochondrial genes and non-coding regions is identical to that of other fishes and higher vertebrates, consisting of two rRNAs, 22 tRNAs, and 13 protein-coding genes with a control region (also known as the D-loop). Most of the C. bouderius mitochondrial genes are encoded on the H-strand, although the ND6 gene and eight tRNA genes are encoded on the L-strand. Among the 13 protein-coding genes, three pairs (ATP8 and ATP6, ND4L and ND4, ND5 and ND6) partially overlap, the first two on the same strand and the last on opposite strands, as shown in Table 3. The extent of overlap differs among fish mitochondrial genomes, as the protein-coding genes have similar lengths but the lengths of the non-coding regions differ.

Table 3
Organization of the C. bouderius mitochondrial genome

Gene/element                         Abbreviation  Strand a  Position      Size  Start  Stop
tRNA-Phe                             F             H         1–70          70
12S ribosomal RNA                    12S           H         71–1017       947
tRNA-Val                             V             H         1027–1098     72
16S ribosomal RNA                    16S           H         1099–2734     1636
tRNA-Leu                             L             H         2780–2857     78
NADH dehydrogenase subunit 1         ND1           H         2859–3833     975   ATG    TAG
tRNA-Ile                             I             H         3838–3909     72
tRNA-Gln                             Q             L         3979–3909     71
tRNA-Met                             M             H         3981–4049     69
NADH dehydrogenase subunit 2         ND2           H         4050–5096     1047  ATG    TAG
tRNA-Trp                             W             H         5095–5165     71
tRNA-Ala                             A             L         5236–5168     69
tRNA-Asn                             N             L         5310–5238     73
tRNA-Cys                             C             L         5408–5342     67
tRNA-Tyr                             Y             L         5481–5411     71
Cytochrome c oxidase subunit 1       COI           H         5483–7033     1551  GTG    TAG
tRNA-Ser                             S             L         7104–7034     71
tRNA-Asp                             D             H         7109–7181     73
Cytochrome c oxidase subunit 2       COII          H         7196–7886     691   ATG    T–
tRNA-Lys                             K             H         7887–7960     74
ATP synthase F0 subunit 8            ATP8          H         7962–8129     168   ATG    TAA
ATP synthase F0 subunit 6            ATP6          H         8120–8803     684   ATG    TAA
Cytochrome c oxidase subunit 3       COIII         H         8803–9586     784   ATG    T–
tRNA-Gly                             G             H         9587–9659     73
NADH dehydrogenase subunit 3         ND3           H         9660–10010    351   ATG    TAG
tRNA-Arg                             R             H         10009–10080   72
NADH dehydrogenase subunit 4L        ND4L          H         10081–10377   297   ATG    TAA
NADH dehydrogenase subunit 4         ND4           H         10371–11751   1381  ATG    T–
tRNA-His                             H             H         11752–11821   70
tRNA-Ser                             S             H         11824–11887   64
tRNA-Leu                             L             L         11970–11898   73
NADH dehydrogenase subunit 5         ND5           H         11971–13797   1827  ATG    TAA
NADH dehydrogenase subunit 6         ND6           L         14312–13794   519   ATG    TAG
tRNA-Glu                             E             L         14381–14313   69
Cytochrome b                         cyt b         H         14383–15520   1138  ATG    T–
tRNA-Thr                             T             H         15521–15592   72
tRNA-Pro                             P             L         15661–15591   71
Displacement loop (control region)   D-loop        –         15662–16539   878

a L and H refer to light and heavy strands, respectively.
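The overlaps between neighbouring genes follow directly from the coordinates in Table 3, and the gene sizes give a quick consistency check on the incomplete stop codons listed there. The sketch below is a minimal illustration using only coordinates copied from Table 3; the helper names (`size`, `overlap`, `trailing_bases`) are hypothetical and do not come from any annotation package.

```python
# (start, end) positions copied from Table 3. Genes on the L-strand are
# listed end-first in the table, so coordinates are normalised before use.
GENES = {
    "COII": (7196, 7886),
    "ATP8": (7962, 8129),
    "ATP6": (8120, 8803),
    "ND4L": (10081, 10377),
    "ND4": (10371, 11751),
    "ND5": (11971, 13797),
    "ND6": (14312, 13794),  # encoded on the L-strand
}


def size(gene):
    """Length of a gene in nucleotides, regardless of strand."""
    start, end = GENES[gene]
    return abs(end - start) + 1


def overlap(gene_a, gene_b):
    """Number of positions shared by two genes (0 if they do not overlap)."""
    lo = max(min(GENES[gene_a]), min(GENES[gene_b]))
    hi = min(max(GENES[gene_a]), max(GENES[gene_b]))
    return max(0, hi - lo + 1)


def trailing_bases(gene):
    """Bases left over after whole codons; non-zero suggests an incomplete stop codon."""
    return size(gene) % 3


for pair in [("ATP8", "ATP6"), ("ND4L", "ND4"), ("ND5", "ND6")]:
    print(pair, overlap(*pair))
print("COII leftover bases:", trailing_bases("COII"))
```

With these coordinates the three pairs overlap by 10, 7 and 4 nucleotides. COII, whose stop codon is given as T– in Table 3, leaves one base beyond its last complete codon, whereas ND5, which ends in a complete TAA, leaves none.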

3.1. Non-coding sequences

The major non-coding region (control region) in mtDNA regulates replication and transcription (Clayton, 1982, 1991; Shadel and Clayton, 1997). The primary sequence of much of the control region does not appear to be particularly important for regulatory function, as this region shows extensive variability across taxonomic groups and even among closely related species. The 878 bp C. bouderius control region was much less similar to those of other fishes than were the coding sequences, with numerous nucleotide substitutions and indels. However, several important regulatory elements are present. Conserved sequence blocks (CSB-2 and CSB-3), found at the 3′ end of the control region, appear to be involved in positioning RNA polymerase both for transcription and for priming replication (Clayton, 1991; Shadel and Clayton, 1997). Both of these elements are identifiable in C. bouderius, and they show strong similarity to the CSBs of other catfish sequences (results not shown). These features reflect a strong conservation of vertebrate mitochondrial regulatory elements. CSB-1, however, could not be identified because of strong nucleotide variation.

3.2. Ribosomal and transfer RNA genes

The C. bouderius 12S and 16S ribosomal RNA genes contain 947 and 1636 bp, respectively. As in other vertebrates (e.g., Inoue et al., 2000), these genes are located between the genes for tRNA-Phe and tRNA-Leu (UUR), and are separated by the gene for tRNA-Val. The C. bouderius mitochondrial genome contains 22 tRNA genes (Table 3), which are interspersed along the genome, range in size from 64 to 78 nucleotides, and are predicted to fold into the expected cloverleaf secondary structures with normal base pairing. The rRNA genes are A + C-rich (58.7%) as in other bony fish (Zardoya and Meyer, 1997), and the C. bouderius tRNA genes are also slightly A + C-rich (56.8%), similar to other vertebrates.

3.3. Protein-coding genes

All 13 protein-coding open reading frames (ORFs) generally found in other vertebrates are also present in the C. bouderius mitochondrial genome, with the same organization. The codon usage in the 13 protein-coding genes of C. bouderius mtDNA is given in Table 4. For amino acids with fourfold degenerate third positions, codons ending in A are the most frequent in C. bouderius in every case except alanine (where GCC is the most frequent; Table 4), followed in frequency by codons ending in T or C. Among twofold degenerate codons, C appears to be used somewhat more than T. Consistent with the overall bias against G, G is the least common third position nucleotide in all categories except for arginine and glycine codons (where G is similar in frequency to C and T but still much less frequent than A). These patterns are generally similar across vertebrate groups (e.g., zebrafish, Broughton et al., 2001). The mtDNA of C. bouderius has a strong bias against the use of ‘G’ at the third codon position (38.7% A; 30.4% C; 7.7% G; 23.1% T), which is typical in vertebrates. At the second codon position pyrimidines are over-represented compared with purines (T + C = 67.8%).

All these genes begin with an ATG start codon except for COI, which initiates with GTG (Table 3). As in other vertebrates, the C. bouderius mtDNA also uses TAA and TAG as stop codons. Of these, TAG is the more prevalent. Five genes (ND1, ND2, COI, ND3, and ND6) use the TAG stop codon whereas four genes (ATP8, ATP6, ND4L, and ND5) use the TAA stop codon. The incomplete stop codon T– is observed in the COII, COIII, ND4, and cyt b genes. ND6, which is encoded by the L-strand of the mtDNA, ends with TAG. Two cases of reading-frame overlap on the same strand are found in C. bouderius mtDNA. The ATP8 and ATP6 genes share 10 nucleotides, as in birds and other fishes. The ND4 and ND4L genes overlap by 7 nucleotides, as in all other chordates. In addition, the ND5 and ND6 genes, which are located on opposite strands, have an overlap of 4 nucleotides, within the range found in most vertebrates (4–17 nucleotides).

Table 4
Codon usage in C. bouderius mitochondrial protein-coding genes

Amino acid  Codon  Number  Frequency  Codon usage (%)
Lys         AAA    69      0.018      86
            AAG    12      0.003      14
Asn         AAC    76      0.020      57
            AAU    57      0.015      43
Thr         ACA    132     0.035      43
            ACG    5       0.001      1
            ACC    121     0.032      40
            ACU    48      0.013      16
Stop        AGA    0       0.000      0
            AGG    0       0.000      0
Ser         AGC    30      0.008      67
            AGU    16      0.004      33
Met         AUA    124     0.033      73
            AUG    46      0.012      27
Ile         AUC    108     0.028      37
            AUU    180     0.047      63
Gln         CAA    86      0.023      88
            CAG    12      0.003      12
His         CAC    67      0.018      67
            CAU    36      0.009      33
Pro         CCA    93      0.025      44
            CCG    11      0.003      5
            CCC    67      0.018      32
            CCU    42      0.011      19
Arg         CGA    42      0.011      58
            CGG    8       0.002      10.5
            CGC    16      0.004      21
            CGU    7       0.002      10.5
Leu         CUA    238     0.063      51
            CUG    51      0.013      10
            CUC    87      0.023      19
            CUU    93      0.025      20
Glu         GAA    81      0.021      81
            GAG    18      0.005      19
Asp         GAC    59      0.016      76
            GAU    18      0.005      24
Ala         GCA    105     0.028      32
            GCG    16      0.004      5
            GCC    148     0.039      44
            GCU    66      0.017      19
Gly         GGA    97      0.026      40
            GGG    35      0.009      14
            GGC    74      0.020      31
            GGU    37      0.010      15
Val         GUA    85      0.022      39
            GUG    33      0.009      16
            GUC    50      0.013      22.5
            GUU    51      0.013      22.5
Stop        UAA    4       0.001      50
            UAG    5       0.001      50
Tyr         UAC    49      0.013      42
            UAU    69      0.018      58
Ser         UCA    72      0.019      40
            UCG    7       0.002      4
            UCC    68      0.018      38
            UCU    35      0.009      18
Trp         UGA    105     0.028      88
            UGG    15      0.004      12
Cys         UGC    21      0.006      75
            UGU    8       0.002      25
Leu         UUA    141     0.037      84
            UUG    25      0.007      16
Phe         UUC    113     0.030      50
            UUU    113     0.030      50

3.4. Phylogenetic analyses

The combined data set of all mitochondrial protein-coding genes, analyzed by Bayesian inference under the GTR + I + G model, yielded trees with strong posterior probability support that were similar to the hypothesis of Saitoh et al. (2003) (Fig. 2; compare with Fig. 1C). The Cypriniformes appear as the sister group to the other three otophysan orders, with the Siluriformes as the sister group of Gymnotiformes + Characiformes (100% posterior nodal probability, pp) and the Gymnotiformes as the sister group of the Characiformes (98% posterior nodal probability). Many ichthyologists continue to accept Fink and Fink's (1981, 1996) hypothesis (Fig. 1B) as the most likely scenario concerning the relationships among the four otophysan orders (see e.g., Nelson, 2006). However, it should be noted that the sister-group relationship between the Siluriformes and the clade Characiformes + Gymnotiformes was also supported by other analyses, such as Orti's (1997) analysis of the first and second codon positions of the nuclear ependymin gene. Moreover, when the third codon positions were excluded from the data set, Bayesian analysis recovered a topology similar to that shown in Fig. 2, with only some differences in nodal support values and a single polytomy within the gymnotiform and characiform species (results not shown).

With regard to the placement of the Cranoglanididae within siluriforms, C. bouderius was grouped with the pangasiid species P. gigas, and the clade formed by these two taxa was grouped with the ictalurid species I. punctatus (Fig. 2). This scenario is somewhat different from that obtained in recent morphological and molecular cladistic analyses, which indicated a sister-group relationship between cranoglanidids and ictalurids (Diogo, 2005; Hardman, 2005; Peng et al., 2005). Such an incongruence might be caused by the partial taxon sampling of the present study or by the number of morphological and molecular characters used in those analyses (e.g., 440 morphological characters in the analysis of Diogo, 2005, and only 1138 bp of cyt b gene sequence in the analyses of Hardman, 2005 and Peng et al., 2005).
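The ordinal relationships described at the start of Section 3.4 correspond to the grouping ((((Characiformes, Gymnotiformes), Siluriformes), Cypriniformes), outgroup) given in the Abstract. As an illustration of how such a parenthetical grouping encodes sister-group statements, the sketch below stores it as nested Python tuples and looks up the sister group of a given order. This is a toy representation under stated assumptions: the tree is strictly bifurcating, branch lengths and support values are ignored, and the helper names `leaves` and `sister` are hypothetical rather than taken from any phylogenetics library.

```python
# The recovered ordinal topology, written as nested pairs.
TREE = (((("Characiformes", "Gymnotiformes"), "Siluriformes"), "Cypriniformes"), "outgroup")


def leaves(node):
    """Set of taxon names below a node (a leaf is a plain string)."""
    if isinstance(node, str):
        return {node}
    names = set()
    for child in node:
        names |= leaves(child)
    return names


def sister(tree, taxon):
    """Taxa in the sister group of `taxon` in a bifurcating tree, or None if absent."""
    if isinstance(tree, str):
        return None
    left, right = tree
    if left == taxon:
        return leaves(right)
    if right == taxon:
        return leaves(left)
    return sister(left, taxon) or sister(right, taxon)


print(sister(TREE, "Siluriformes"))
print(sister(TREE, "Cypriniformes"))
```

Read this way, the Siluriformes are sister to Characiformes + Gymnotiformes, and the Cypriniformes are sister to the other three orders, matching the description in the text. The alternative hypotheses in Fig. 1 differ only in how the same names are nested.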
In order to test the competing otophysan topologies, we compared four alternative topologies using statistical tests: ((((Characiformes, Gymnotiformes), Cypriniformes), Siluriformes), outgroup), ((((Gymnotiformes, Siluriformes), Characiformes), Cypriniformes), outgroup), ((((Characiformes, Gymnotiformes), Siluriformes), Cypriniformes), outgroup), and ((((Characiformes, Siluriformes), Gymnotiformes), Cypriniformes), outgroup) (topologies A, B, C, and D, respectively, in Fig. 1). In all four tests, the phylogenetic scenario in which the siluriforms occupy a basal position within otophysans (Fig. 1A) was strongly rejected in favor of a basal position of cypriniforms (Table 5). However, within the characiphysan clade, none of the three competing groupings can be rejected by any of the four tests at the 5% significance level. Indeed, although the grouping of Characiformes + Gymnotiformes gives the best topology in all the tests (Fig. 1C, also see Fig. 2), the groupings of Characiformes + Siluriformes and of Siluriformes + Gymnotiformes cannot be rejected by any of the four tests (Table 5). Furthermore, the comparison of the competing otophysan topologies obtained by using the different protein-coding genes (results not shown) and the concatenated protein-coding genes (Fig. 2) revealed that the tree reconstructed from the concatenated protein-coding genes was the best tree, although the topologies obtained from the ND4 and COI genes were not significantly different from the best one in any of the four tests (Table 5). The tree obtained by using the ATP6 gene was rejected by the AU and KH tests but not by the more relaxed SH and wSH tests.

Concerning the phylogenetic performance of the individual mtDNA protein-coding genes, at least in this case, ND4 and COI can be considered the best individual genes for reconstructing higher-level relationships among major otophysan fish lineages, which is broadly consistent with the results obtained by authors such as Zardoya and Meyer (1996b) and Miya and Nishida (2000). Nevertheless, as suggested by many authors (e.g., Cummings et al., 1995; Russo et al., 1996), the different phylogenetic performance of the different mtDNA protein-coding genes might be partly due to gene lengths, and possibly to evolutionary rate differences between genes (Zardoya and Meyer, 1996b). Further work is needed to determine why different genes differ in phylogenetic performance.

Fig. 2. Majority-rule phylogram and posterior probabilities among the four otophysan orders resulting from Bayesian analysis of the concatenated 13 protein-coding genes, based on a combined 2.7 million post-burn-in generations under the GTR + I + G model of evolution.

Table 5
Tests of significance for the four competing otophysan phylogenetic hypotheses, and for Bayesian trees reconstructed using the individual mtDNA protein-coding genes and the combined data set

Tree compared                                       − LnL       AU      KH      SH      wSH
Otophysan topology tests
  C ((((Ch, Gy), Si), Cy), outgroup)                96,359.21   0.634   0.571   0.899   0.901   Best
  D ((((Ch, Si), Gy), Cy), outgroup)                96,361.26   0.496   0.429   0.685   0.677
  B ((((Gy, Si), Ch), Cy), outgroup)                96,367.11   0.204   0.223   0.511   0.451
  A ((((Ch, Gy), Cy), Si), outgroup)                96,427.58   ⁎⁎      0       0.001   0
Individual protein-coding gene
  Concatenated mtDNA protein-coding gene sequences  96,359.21   0.841   0.744   0.999   0.998   Best
  ND4                                               96,379.00   0.298   0.256   0.766   0.730
  COI                                               96,388.72   0.091   0.070   0.671   0.360
  ATP6                                              96,414.72   0.031   0.029   0.348   0.176
  ND6                                               96,436.70   ⁎⁎      0       0.159   ⁎⁎
  cyt b                                             96,457.20   ⁎⁎      0       0.088   0.001
  ND2                                               96,480.18   ⁎⁎      ⁎⁎      0.036   ⁎⁎
  ND5                                               96,483.19   0.006   0       0.035   0
  COII                                              96,519.26   ⁎⁎      0       0.011   0
  COIII                                             96,539.62   ⁎⁎      0       0.004   0
  ATP8                                              96,583.00   ⁎⁎      0       0.001   0
  ND3                                               96,633.20   ⁎⁎      0       0       0
  ND4L                                              96,726.25   ⁎⁎      0       0       0
  ND1                                               97,609.15   ⁎⁎      0       0       0

NOTE. − LnL = Log-likelihood scores; AU = approximately unbiased test; BP = bootstrap probability for the bootstrap test; KH = Kishino–Hasegawa test; SH = Shimodaira–Hasegawa test; wSH = weighted Shimodaira–Hasegawa test; Ch = Characiformes; Gy = Gymnotiformes; Cy = Cypriniformes; Si = Siluriformes. The four capital letters (A, B, C, and D) are consistent with the four capital letters in Fig. 1. Significance level is 0.05.
⁎⁎ 0 < P < 0.001.

4. Conclusions

The mitochondrial genome of C. bouderius contains the 37 genes that are most commonly found in vertebrate mtDNAs. Concerning the phylogenetic placement of C. bouderius, this species appears closely related to the pangasiid P. gigas, an hypothesis that is different from that proposed in recent works in which cranoglanidids appear as the sister groups of ictalurids. Concerning otophysan relationships, the cypriniforms appear as the most basal extant otophyans, the Siluriformes appearing as the sister group of Characiformes + Gymnotiformes. Such a scenario is consistent with the results of most molecular cladistic works undertook in the last years. However, it should be stressed that further description and comparison of complete mtDNA sequences of more teleosts is clearly needed in order to clarify the relationships within the diverse, and highly fascinating, teleostean fishes. Acknowledgments We are very grateful to Z. Abdo and R. Diogo for their critical reading of this manuscript and for their helpful comments and suggestions that greatly improved the manuscript. We are also very grateful to C. Liu and Q. Tang for their kind help in reading the manuscript at every stage. We sincerely thank the two anonymous referees and the Associate Editor for their insightful comments on the manuscript. The research was funded and supported by grants from National Science Fund for Distinguished Young Scholars (32005008) and the Knowledge Innovation Project of the Chinese Academy of Sciences (KSCX2-SW-101B) to S.H. References Broughton, R.E., Milam, J.E., Roe, B.A., 2001. The complete sequence of the zebrafish (Danio rerio) mitochondrial genome and evolutionary patterns in vertebrate mitochondrial DNA. Genome Res. 11, 1958–1967. Chang, Y.C., Hunag, F.L., Lo, T.B., 1994. The complete nucleotide sequence and gene organization of carp (Cyprinus carpio) mitochondrial genome. J. Mol. Evol. 38, 138–155. Clayton, D.A., 1982. Replication of animal mitochondrial DNA. Cell 28, 693–705. 
Clayton, D.A., 1991. Nuclear gadgets in mitochondrial DNA replication and transcription. Trends Biochem. Sci. 16, 107–111.
Cummings, M.P., Otto, S.P., Wakeley, J., 1995. Sampling properties of DNA sequence data in phylogenetic analysis. Mol. Biol. Evol. 12, 814–822.

Delarbre, C., et al., 1998. The complete nucleotide sequence of the mitochondrial DNA of the dogfish, Scyliorhinus canicula. Genetics 150, 331–344.
Dimmick, W.W., Larson, A., 1996. A molecular and morphological perspective on the phylogenetic relationships of the otophysan fish. Mol. Phylogenet. Evol. 6, 120–133.
Diogo, R., 2005. Adaptations, homoplasies, constraints, and evolutionary trends: catfish morphology, phylogeny and evolution, a case study on theoretical phylogeny and macroevolution. Science Publishers Inc., Enfield.
Fink, S.V., Fink, W.L., 1981. Interrelationships of the ostariophysan fish. Zool. J. Linn. Soc. 72, 297–353.
Fink, S.V., Fink, W.L., 1996. Interrelationships of the ostariophysan fishes (Teleostei). In: Stiassny, M.L.J., Parenti, L.R., Johnson, G.D. (Eds.), Interrelationships of Fishes. Academic Press, San Diego, pp. 209–249.
Hardman, M., 2005. The phylogenetic relationships among non-diplomystid catfishes as inferred from mitochondrial cytochrome b sequences; the search for the ictalurid sister taxon (Otophysi: Siluriformes). Mol. Phylogenet. Evol. 37, 700–720.
Inoue, J.G., Miya, M., Tsukamoto, K., Nishida, M., 2000. Complete mitochondrial DNA sequence of the Japanese sardine, Sardinops melanostictus. Fish. Sci. 66, 924–932.
Inoue, J.G., Miya, M., Tsukamoto, K., Nishida, M., 2001. A mitogenomic perspective on the basal Teleostean phylogeny: resolving higher-level relationships with longer DNA sequences. Mol. Phylogenet. Evol. 20, 275–285.
Johansen, S., Bakke, I., 1996. The complete mitochondrial DNA sequence of Atlantic cod, Gadus morhua: relevance to taxonomic studies among codfish. Mol. Mar. Biol. Biotechnol. 5, 203–214.
Kishino, H., Hasegawa, M., 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in the Hominoidea. J. Mol. Evol. 29, 170–179.
Lavoue, S., Miya, M., Inoue, J.G., Saitoh, K., Ishiguro, N.B., Nishida, M., 2005. Molecular systematics of the gonorynchiform fishes (Teleostei) based on whole mitogenome sequences: implications for higher-level relationships within the Otocephala. Mol. Phylogenet. Evol. 37, 165–177.
Lee, W.J., Kocher, T.D., 1995. Complete sequence of a sea lamprey (Petromyzon marinus) mitochondrial genome: early establishment of the vertebrate genome organization. Genetics 139, 873–887.
Liu, C., Peng, Z., He, S., 2005. Studies on species classification for genus Cranoglanis Peters with the method of morphometrics. Acta Hydrobiol. Sin. 29, 507–512.
Minin, V., Abdo, Z., Joyce, P., Sullivan, J., 2003. Performance-based selection of likelihood models for phylogeny estimation. Syst. Biol. 52, 674–683.
Miya, M., Nishida, M., 1999. Organization of the mitochondrial genome of a deep-sea fish, Gonostoma gracile (Teleostei: Stomiiformes): first example of transfer RNA gene rearrangements in bony fish. Mar. Biotechnol. 1, 416–426.
Miya, M., Nishida, M., 2000. Use of mitogenomic information in Teleostean molecular phylogenetics: a tree-based exploration under the maximum-parsimony optimality criterion. Mol. Phylogenet. Evol. 17, 437–455.
Miya, M., et al., 2003. Major patterns of higher teleostean phylogenies: a new perspective based on 100 complete mitochondrial DNA sequences. Mol. Phylogenet. Evol. 26, 121–138.
Nelson, J.S., 2006. Fishes of the World, 4th ed. John Wiley and Sons Inc., New York.

Noack, K., Zardoya, R., Meyer, A., 1996. The complete mitochondrial DNA sequence of the bichir (Polypterus ornatipinnis), a basal ray-finned fish: ancient establishment of the consensus vertebrate gene order. Genetics 144, 1165–1180.
Orti, G.S., 1997. Radiation of characiform fishes: evidence from mitochondrial and nuclear DNA sequences. In: Kocher, T.D., Stepien, C.A. (Eds.), Molecular Systematics of Fishes. Academic Press, San Diego, pp. 219–243.
Peng, Z., Zhang, Y., He, S., Chen, Y., 2005. Phylogeny of Chinese catfishes inferred from mitochondrial cytochrome b sequences. Acta Genet. Sin. 32, 145–154.
Ronquist, F., Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.
Rosen, D.E., Greenwood, P.H., 1970. Origin of the Weberian apparatus and the relationships of the ostariophysan and gonorynchiform fishes. Am. Mus. Novit. 2428, 1–25.
Russo, C.A.M., Takezaki, N., Nei, M., 1996. Efficiencies of different genes and different tree-building methods in recovering a known vertebrate phylogeny. Mol. Biol. Evol. 13, 525–536.
Saitoh, K., Miya, M., Inoue, J.G., Ishiguro, N.B., Nishida, M., 2003. Mitochondrial genomics of ostariophysan fishes: perspectives on phylogeny and biogeography. J. Mol. Evol. 56, 464–472.
Shadel, G.S., Clayton, D.A., 1997. Mitochondrial DNA maintenance in vertebrates. Ann. Rev. Biochem. 66, 409–435.
Shimodaira, H., 2002. An approximately unbiased test of phylogenetic tree selection. Syst. Biol. 51, 492–508.
Shimodaira, H., Hasegawa, M., 1999. Multiple comparisons of log-likelihoods with applications to phylogenetic inference. Mol. Biol. Evol. 16, 1114–1116.
Shimodaira, H., Hasegawa, M., 2001. CONSEL: for assessing the confidence of phylogenetic tree selection. Bioinformatics 17, 1246–1247.
Swofford, D.L., 2002. PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods), version 4. Sinauer Associates, Sunderland, Massachusetts.
Tzeng, C.S., Hui, C.F., Shen, S.C., Huang, P.C., 1992. The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: conservation and variations among vertebrates. Nucleic Acids Res. 20, 4853–4858.
Waldbieser, G.C., Bilodeau, A.L., Nonneman, D.J., 2003. Complete sequence and characterization of the channel catfish mitochondrial genome. DNA Seq. 14, 265–277.
Yue, P., Chen, Y., 1998. Pisces. In: Wang, S. (Ed.), China Red Data Book of Endangered Animals. Science Press, Beijing, pp. 222–223.
Zardoya, R., Meyer, A., 1996a. The complete nucleotide sequence of the mitochondrial genome of the lungfish (Protopterus dolloi) supports its phylogenetic position as a close relative of land vertebrates. Genetics 142, 1249–1263.
Zardoya, R., Meyer, A., 1996b. Phylogenetic performance of mitochondrial protein-coding genes in resolving relationships among vertebrates. Mol. Biol. Evol. 13, 933–942.
Zardoya, R., Meyer, A., 1997. The complete DNA sequence of the mitochondrial genome of a ‘living fossil’, the coelacanth (Latimeria chalumnae). Genetics 146, 995–1010.