Phylogenetic relationships of the Cobitoidea (Teleostei: Cypriniformes) inferred from mitochondrial and nuclear genes with analyses of gene evolution



Gene 508 (2012) 60–72

Contents lists available at SciVerse ScienceDirect

Gene journal homepage: www.elsevier.com/locate/gene

Si-qing Liu a,b, Richard L. Mayden c, Jia-bo Zhang d, Dan Yu a,b, Qiong-ying Tang a,⁎, Xin Deng e, Huan-zhang Liu a,⁎

a The Key Laboratory of Aquatic Biodiversity and Conservation of Chinese Academy of Sciences, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, Hubei, 430072, PR China
b Graduate University of the Chinese Academy of Sciences, Beijing, 100049, PR China
c Department of Biology, Saint Louis University, 3507 Laclede Avenue, St. Louis, MO 63103-2010, USA
d College of Fisheries, Huazhong Agricultural University, Wuhan, Hubei, 430070, PR China
e Department of Pesticide Regulation, California Environmental Protection Agency, Sacramento, CA 95812, USA

Article info

Article history: Accepted 23 July 2012; Available online 31 July 2012

Keywords: Cobitoidea; Phylogeny; Gene evolution; Positive selection; Relaxed selection

Abstract

The superfamily Cobitoidea of the order Cypriniformes is a diverse group of fishes inhabiting freshwater ecosystems across Eurasia and North Africa. The phylogenetic relationships of this well-corroborated natural group are critical not only for informing the scientific community about the phylogeny of the order Cypriniformes, the world's largest freshwater fish order, but also for every area of comparative biology, from the evolution of traits, functional structures, and breeding behaviors to biogeographic histories, speciation, anagenetic divergence, and divergence time estimates. In the present study, two mitochondrial gene sequences (COI, ND4 + 5) and four single-copy nuclear gene segments (RH1, RAG1, EGR2B, IRBP) were used to infer the phylogenetic relationships of the Cobitoidea as reconstructed from maximum likelihood (ML) and partitioned Bayesian analysis (BA). Analyses of the combined mitochondrial/nuclear gene datasets revealed five strongly supported monophyletic Cobitoidea families and their sister-group relationships: Botiidae + (Vaillantellidae + (Cobitidae + (Nemacheilidae + Balitoridae))). These recovered relationships are in agreement with previous systematic studies on the order Cypriniformes and/or those focusing on the superfamily Cobitoidea. Using these relationships, our analyses revealed patterns of lineage- or ecological-group-specific evolution of these genes for the Cobitoidea. These observations and results corroborate the hypothesis that these group-specific ancestral ecological characters have contributed to the diversification and/or adaptations within these groups. Positive selection was detected in RH1 of nemacheilids and in RAG1 of nemacheilids and the genus Vaillantella, which indicates that evolution of the RH1 (related to vision) and RAG1 (related to immunity) genes appears to have been important for the diversification of these groups. The balitorid lineage (those species inhabiting fast-flowing riverine habitats) had, as compared with other cobitoid lineages, significantly different dN/dS, dN and dS values for the ND4 and IRBP genes. These significant differences are usually indicative of weaker selection pressure and of lineage-specific evolution of genes along the balitorid lineage. Furthermore, within Cobitoidea, excluding balitorids, species living in the subtropics had significantly higher dN/dS values in the RAG1 and IRBP genes than those living in temperate and tropical zones. Among tropical cobitoids, the genes COI, ND5, EGR2B, IRBP and RH1 had a significantly higher mean dS value than in the subtropical and temperate groups. These findings suggest that the evolution of these genes could also be ecological-group-specific and may have played an important role in the adaptive evolution and diversification of these groups. Thus, we hypothesize that the genes included in the present study were actively involved in lineage- and/or ecological-group-specific evolutionary processes of the highly diverse Cobitoidea. These two evolutionary patterns, both subject to further testing, are hypothesized to be integral to the diversification within this major clade of the world's most diverse group of freshwater fishes. © 2012 Elsevier B.V. All rights reserved.

Abbreviations: COI, cytochrome c oxidase subunit I; ND4, NADH dehydrogenase subunit 4; ND5, NADH dehydrogenase subunit 5; Cyt b, cytochrome b; RAG1, recombination activating gene 1; RH1, rhodopsin; EGR2B, early growth response protein 2B; IRBP, interphotoreceptor retinoid binding protein; ML, maximum likelihood; BA, Bayesian approach; LRT, likelihood ratio test; CAI, codon adaptation index; ENC, effective number of codons; dN, nonsynonymous substitution; dS, synonymous substitution.

⁎ Corresponding authors at: 7 south Donghu Road, Wuchang District, Wuhan 430072, Hubei, PR China. Tel.: +86 27 68780776; fax: +86 27 68780065.

0378-1119/$ – see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.gene.2012.07.040


1. Introduction

The Cypriniformes is the most diverse order of freshwater fishes. At present, 20 families within three superfamilies (Paedocyprioidea, Cyprinoidea, Cobitoidea) are recognized (Mayden and Chen, 2010). The superfamily Paedocyprioidea comprises only one family, Paedocypridae. The Cyprinoidea includes the families Cyprinidae, Psilorhynchidae, Leptobarbidae, Danionidae, Cultridae, Xenocyprinidae, Tincidae, Tanichthyidae, Gobionidae, Acheilognathidae and Leuciscidae (all formerly recognized under the overwhelmingly diverse Cyprinidae). The Cobitoidea includes Gyrinocheilidae, Catostomidae, Cobitidae, Botiidae, Nemacheilidae, Vaillantellidae, Ellopostomatidae, and Balitoridae. While these former subfamilies within Cypriniformes are recognized by Nelson (2006), there is no phylogenetic basis for the previous classification. Depending on the author, the superfamily Cobitoidea may comprise a broad group including Gyrinocheilidae, Catostomidae, Cobitidae and Balitoridae (Nelson, 2006); only the loaches, including Cobitidae and Balitoridae (Sawada, 1982); or the families listed above, as recently derived from substantial molecular genetic data and analyses (Chen et al., 2008; Mayden et al., 2008; Šlechtová et al., 2007). Because of their high species richness and diversification, the cobitoids (26% of the species of the Cypriniformes) represent a critical element in resolving the phylogenetic relationships of the Cypriniformes.

The family Cobitidae, first proposed by Regan (1911), was supported in pre-phylogenetic observations by Hora (1932), who initially identified two subfamilies in Cobitidae, Cobitinae and Nemacheilinae. Later, Berg (1940), also on the basis of pre-phylogenetic observations, reclassified the family into three subfamilies, Cobitinae, Nemacheilinae, and Botiinae, which were widely accepted by later authors (Chen and Zhu, 1984; Nalbant, 1963; Ramaswami, 1953).
Using 52 osteological characters, Sawada (1982) proposed a phylogeny of the Cobitoidea (but only loaches) as (Botiinae+Cobitinae)+ (Nemacheilinae+Homalopterinae), and this classification was followed by other authors (Kottelat, 2001; Nelson, 2006; Siebert, 1987). Based on mtDNA control region sequences, Liu et al. (2002) suggested that the Nemacheilinae and Cobitinae were sister clades and clustered with Balitoridae (=Homalopteridae), with the subfamily Botiinae at the most basal position. Tang et al. (2006), using a detailed phylogenetic analysis of mitochondrial cytochrome b and control region sequences, corroborated the classification of Liu et al. (2002). These three subfamilies were thus elevated to the family level as to maintain consistency between phylogenetic relationships and a natural classification. Šlechtová et al. (2007) analyzed the phylogenetic relationships of the Cobitoidea using only one nuclear gene (RAG1) and proposed relationships as Botiidae+(Vaillantellidae+(Cobitidae+(Nemacheilidae+ Balitoridae))). Interestingly this study could not corroborate the monophyly of the Cobitidae (sensu Sawada, 1982) but validated with strong support that the Cobitidae and Botiidae were not closely related. Rather, this and later studies (Liu et al., 2010; Mayden et al., 2008, 2009) demonstrated with strong nodal supports that the cobitines were really sister to a clade of balitorines plus nemacheilines, and the botiines were sister to all other loach-like Cobitoidea. Multiple studies have now confirmed Vaillantella as sister to cobitines, balitorines, and nemacheilines, appearing on the phylogeny between botiines and remaining loaches (Mayden et al., 2009; Šlechtová et al., 2007). Based on four nuclear genes and complete mitochondrial genome, Mayden et al. 
(2009) analyzed phylogenetic relationships of the Cypriniformes (with 12 samples of the Cobitoidea besides one catostomid), in which the same phylogenetic relationships of the Cobitoidea, proposed by Šlechtová et al. (2007), were recognized. These recent studies have made significant advancements in our previous, only rudimentary, understanding of the systematics and natural classification of the Cypriniformes, especially the Cobitoidea. The evolutionary relationships of Vaillantella were controversial as “phylogenies” or statements of relationships were not based


on synapomorphies. Nalbant and Bǎnǎrescu (1977) considered Vaillantellinae, containing only Vaillantella, in an intermediate evolutionary position between Nemacheilinae and Botiinae. However, their study lacked supporting synapomorphic data and should not be considered equivalent to those supplying supporting evidence. Many authors (Kottelat, 1994; Roberts, 1989; Sawada, 1982) disagreed with this hypothesis, supporting the traditional morphological hypothesis that Vaillantella was a genus of the subfamily Nemacheilinae (from the pre-phylogenetic paper by Weber and de Beaufort, 1916). However, these studies were also not based on outgroup comparisons for derived characters. Whole mitogenome sequence data and phylogenetic analyses by Saitoh et al. (2006) and sequence data from multiple nuclear and mitochondrial gene loci by Mayden et al. (2009) strongly supported Vaillantella as sister to Cobitinae + (Balitorinae + Nemacheilinae). Nalbant (2002) proposed the sister-group relationship between Vaillantellinae and Botiinae under the family Botiidae, a relationship that has been refuted in multiple studies. Recently, studies by Šlechtová et al. (2007) and Mayden et al. (2009) argued for Vaillantellinae to be elevated to a separate family Vaillantellidae to be consistent with phylogenetic relationships. This family forms a lineage sister to and independent of Cobitidae + (Nemacheilidae + Balitoridae) based on molecular data. Mayden et al. (2009), using nuclear and mitochondrial genes, recovered a sister group relationship identical to that of Saitoh et al. (2006, see above). At about the same time, the phylogenetic position of Ellopostoma was examined by Chen et al. (2009) and this enigmatic genus was strongly supported as sister to the Nemacheilidae, like Vaillantella, that necessitated the formation of the new family Ellopostomatidae, to maintain a natural classification. 
These studies and others (Chen et al., 2008; Mayden and Chen, 2010; Mayden et al., 2008) provide critical, stable, and well supported evidence as to large-scale relationships between and within the Paedocypridoidea, Cyprinoidea and Cobitoidea. As such, we strongly recommend that the above referenced phylogenetic hypotheses, based on molecular and/or morphological data, should be tested with increased character/taxon sampling as these studies and others are clearly subject to sampling error. Thus, hypotheses put forth as to the sister-group relationships regarding groupings from species to families in the Cobitoidea require additional investigation with additional genes or more taxa or additional taxa for the same genes to evaluate pre-existing hypotheses and offer greater insight into this amazingly highly diverse and extremely popular group of fishes. These fishes have highly divergent life histories and biologies of interest to not only the hobbyist and aquarium trade but are of great interest in aquaculture, scientific studies that may be useful in aquaculture for human food sources, as well as systematic and evolutionary studies. Aside from the above referenced and logically sound reasons for deeper investigation into the evolutionary relationships of the Cobitoidea, it has been suggested that biodiversity is eventually attributed to gene evolution (Seehausen et al., 2008). As such, a detailed examination of gene evolution should thus provide insight into fundamental questions regarding organismal evolution and diversification as seen in the Cobitoidea. One such question is how gene evolution can contribute to diversification within lineages (Bromham, 2009). Variation of evolutionary rates within or between lineages might be one of the most common phenomena in gene evolution. 
With the accumulation of abundant molecular data and the requisite phylogenies for identifying lineages, appropriate and meaningful comparisons become possible; only with these essential data can the community better understand the causes and influences of variation, including variation in evolutionary rates via anagenesis along lineages and other types of genetic modification, within a phylogenetic context. The evolutionary rate of a gene could result from at least three intrinsic factors that have already been identified: 1) efficiency of DNA repair, 2) generation-time effect and/or 3) metabolic rate (Britten, 1986; Laird et al., 1969; Martin


and Palumbi, 1993). However, selective pressure (positive or purifying selection) from extrinsic ecological/environmental factors, to a large extent, drives the variation of evolutionary rates in both prokaryotes and eukaryotes (Baer et al., 2007; Bromham, 2009). Therefore, variation in evolutionary rates (e.g. dN/dS ratio), especially between sufficiently closely related taxa (for example within mammals), is considered a strong indicator of anagenesis driven by selection pressure over time (Czelusniak et al., 1982; Kawahara and Imanishi, 2007; Lynch and Conery, 2000; Toll-Riera et al., 2011). This variation, arising from these possible factors, is hypothesized to have contributed greatly to the functional differentiation of genes and the diversification of organisms that we observe today (Wang et al., 2009). As a highly diverse group, the superfamily Cobitoidea has never been examined within an evolutionary context to evaluate these factors in the evolution of different genes and how they have contributed to diversification in this natural group. In this study, our particular aims are to: (1) test the monophyly of each family in the Cobitoidea and clarify phylogenetic relationships based on multiple genes with thorough taxon sampling; (2) test the phylogenetic position of the problematic family Vaillantellidae and hypothesize causes for the unstable phylogenetic position of Vaillantella; (3) analyze the impacts of selection pressure on gene evolution within the Cobitoidea; and (4) investigate and hypothesize how gene evolution may have contributed to species diversification within the Cobitoidea.

2. Materials and methods

2.1. Sampling of taxa

A total of 64 specimens, representing 61 species in 33 genera from the Cobitoidea and three outgroup taxa (Danio rerio, Gyrinocheilus aymonieri, Myxocyprinus asiaticus), were used for all analyses.
Most nucleotide sequences were newly generated for phylogenetic inference and molecular evolutionary analyses, although some sequences were obtained from GenBank or from our previous studies (Table 1). All newly sequenced samples were identified following Chen and Tang (2000) and Zhu (1995). All materials were collected from China, preserved in 95% ethanol, and deposited in the Institute of Hydrobiology, Chinese Academy of Sciences.

2.2. DNA extraction, PCR protocols and sequencing

DNA was extracted from ethanol-preserved muscle/fin tissues following standard phenol–chloroform protocols described by Sambrook et al. (1989). DNA amplification was conducted using PCR (Mullis and Faloona, 1987; Saiki et al., 1988) for fragments from two mtDNA datasets (COI and ND4 + 5; the ND4 and ND5 protein-coding genes and the three intervening tRNAs are denoted as ND4 + 5) and four targeted nuclear loci (RH1, RAG1, EGR2B, IRBP). Primers and conditions for PCR amplification are given in the Appendices (Table A.1 and Table A.2). A different laboratory protocol was used to obtain the ND4 + 5 sequence data: the complete ND4 + 5 gene segment was amplified using five pairs of primers (LArgd1 + H11618d; L11427dA + H12632dB; L12328d + H13393d; L13058d + H13721d; L13559d + H14710d). Fragments were sequenced by Shanghai DNA Biotechnologies Company. All sequences used in the present study are available in GenBank (accession numbers provided in Table 1).

2.3. Sequence data and phylogenetic analyses

Average base frequencies and phylogenetic information for all positions in genes of the Cobitoidea were calculated in MEGA 4.0 (Tamura et al., 2007). Nucleotide saturation was analyzed by

plotting absolute numbers of transitions (Ts) and transversions (Tv) against GTR distance values (chosen by Modeltest 3.06; Posada and Crandall, 1998) for all pairwise comparisons among taxa in PAUP* 4.0b10. Compiled nucleotide sequences were initially aligned using Clustal X (Thompson et al., 1997) and then adjusted manually, based on the inferred amino acid translation (except for the three tRNA genes), with SEAVIEW (Galtier et al., 1996). Three operational datasets were used to infer phylogenetic trees: (1) combined mitochondrial DNA sequences only; (2) combined nuclear DNA sequences only; and (3) combined mitochondrial and nuclear DNA sequences (denoted as mtDNA + ncDNA). Phylogenetic analyses were based on maximum likelihood (ML) and partitioned Bayesian analysis (BA). The optimal ML tree was found with a heuristic search, as implemented in PAUP* 4.0b10, with TBR branch swapping and 10 random sequence additions. Support for clades was assessed using bootstrap analysis (Felsenstein, 1985) with 1000 pseudo-replicates in the program PhyML 3.0 (Guindon et al., 2009). BA analyses were implemented in MrBayes 3.1.1 (Huelsenbeck and Ronquist, 2001). Prior to the analyses, Modeltest 3.06 (Posada and Crandall, 1998) was used to determine the most appropriate nucleotide substitution model under the Akaike Information Criterion (AIC) for each codon position of each gene in the partitioned BA. In Bayesian analyses, two independent searches were conducted for each dataset. Four independent Markov chain Monte Carlo (MCMC) chains were run for 2,000,000 generations, sampling one tree per 100 generations in each run. The first 1000 trees, with non-stationary log likelihood values, were treated as “burn-in” and discarded. Posterior probabilities (PP) of phylogenetic inferences were determined from the remaining trees. Trees shown herein are 50% majority-rule consensus trees with the BA PP for each node.
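The Bayesian sampling scheme described above implies a simple piece of bookkeeping: how many trees remain after burn-in, and how clade posterior probabilities and a 50% majority-rule consensus follow from them. The following is a minimal sketch of that arithmetic, not the MrBayes implementation. The tree representation (each tree as a set of clades, each clade a frozenset of taxon names) and all function names are assumptions made for illustration, and the count ignores any starting tree recorded at generation 0.

```python
from collections import Counter


def retained_samples(generations, sample_every, burnin_trees):
    """Number of sampled trees left in one run after discarding the burn-in."""
    total = generations // sample_every
    return total - burnin_trees


def clade_support(trees, burnin_trees):
    """Posterior probability of a clade = its frequency among post-burn-in trees.

    Hypothetical representation: each tree is a set of clades,
    each clade a frozenset of taxon names.
    """
    kept = trees[burnin_trees:]
    counts = Counter(clade for tree in kept for clade in tree)
    return {clade: n / len(kept) for clade, n in counts.items()}


def majority_rule(support, threshold=0.5):
    """Clades that would appear in a 50% majority-rule consensus."""
    return {clade: pp for clade, pp in support.items() if pp > threshold}


# Settings stated in the text: 2,000,000 generations, one tree per 100
# generations, first 1000 trees discarded.
print(retained_samples(2_000_000, 100, 1000))

# Toy example with made-up samples (not data from the study).
nb = frozenset({"Nemacheilidae", "Balitoridae"})
bv = frozenset({"Botiidae", "Vaillantellidae"})
toy_trees = [{bv}, {nb, bv}, {nb}, {nb}, {nb}]
print(majority_rule(clade_support(toy_trees, burnin_trees=1)))
```

With the stated settings, each run would contribute 19,000 post-burn-in trees under these assumptions.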
With the great diversity of taxonomic coverage in this study, transitions at the third codon position are likely saturated and contribute only what appears as “noise” in analyses. However, apparent saturation in a broad-taxon sampling study may still carry synapomorphies at different levels, depending on the level of universality of the clade being analyzed. The problem is that we are essentially incapable of distinguishing true saturation (i.e. noisy data leading to spurious results) from stable synapomorphies at different levels that only appear to be saturated. Thus, following Saitoh et al. (2006), Chen et al. (2008), Mayden et al. (2008), Mayden and Chen (2010) and other recent studies, we employed RY-coding in BA analyses, thereby taking only transversions into account.

2.4. Tests of positive selection

The CODEML program of PAML was used to estimate parameters in models of sequence evolution and to test biological hypotheses of particular interest (Yang, 2007). First, the one-ratio model M0, which assumes a single average ratio of nonsynonymous to synonymous substitution rates (denoted as the dN/dS ratio or ω) for all sites and all branches of the phylogeny, was used to initially assess selective pressure for all genes. Tests using the pair of site models M7 (not permitting positive selection) and M8 (permitting sites under positive selection) were performed to detect positive selection (Yang et al., 2000). More rigorous branch-site models (Zhang et al., 2005) were also used to detect positive selection affecting only a few sites on particular lineages (Yang and Nielsen, 2002).
For these, likelihood ratio tests (LRTs) compared the fit to the data of two nested models, assuming that twice the log likelihood difference between the two models (2ΔlnL) follows a χ² distribution with a number of degrees of freedom equal to the difference in the number of free parameters (Whelan and Goldman,


Table 1. Species used in the present study, their GenBank accession numbers, and distribution information. COI and ND4 + 5 are mitochondrial genes; RH1, RAG1, EGR2B and IRBP are nuclear genes.

| Classification | Species | Distribution | COI | ND4 + 5 | RH1 | RAG1 | EGR2B | IRBP |
|---|---|---|---|---|---|---|---|---|
| Cobitoidea: Botiidae | Sinibotia superciliaris | ■ | JN177226 | GQ281575ᵃ | JN177208 | JN177190 | JN177239 | JN177264 |
| | Sinibotia robusta | ■ | JN177235 | GQ281574ᵃ | JN177209 | JN177191 | JN177240 | JN177265 |
| | Leptobotia taeniops 1 | ● | JN177224 | n/a | n/a | JN177194 | n/a | n/a |
| | Leptobotia taeniops 2 | ● | JN177236 | GQ281576ᵃ | n/a | JN177193 | n/a | n/a |
| | Leptobotia elongata | ■ | JN177225 | JN177164 | n/a | JN177196 | n/a | n/a |
| | Leptobotia pellegrini 1 | ■ | JN177223 | JN177163 | JN177212 | JN177195 | n/a | n/a |
| | Leptobotia pellegrini 2 | ■ | n/a | n/a | EU409640ᵇ | EU292683ᵇ | EU409736.1ᵇ | EU409672.1ᵇ |
| | Leptobotia mantschurica | ● | NC008677ᵇ | NC008677ᵇ | FJ197038ᵇ | EU711138ᵇ | n/a | n/a |
| | Parabotia lijiangensis | ■ | JN177222 | n/a | JN177213 | JN177199 | JN177260 | JN177289 |
| | Parabotia banarescui | ● | JN177221 | n/a | JN177210 | JN177198 | JN177261 | n/a |
| | Parabotia bimaculata | ● | JN177220 | n/a | JN177211 | JN177197 | JN177262 | n/a |
| | Chromobotia macracanthus 1 | ▲ | n/a | GQ281573ᵃ | n/a | JN177192 | JN177241 | JN177266 |
| | Chromobotia macracanthus 2 | ▲ | NC008671ᵇ | NC008671ᵇ | FJ197037ᵇ | EU711137ᵇ | n/a | n/a |
| | Botia dario | ▲ | n/a | n/a | EU409641ᵇ | n/a | EU409737.1ᵇ | EU409673.1ᵇ |
| | Syncrossus beauforti | ▲ | n/a | n/a | n/a | n/a | FJ650440.1ᵇ | FJ650482.1ᵇ |
| Cobitoidea: Cobitidae | Misgurnus anguillicaudatus 1 | ■ | JN177217 | JN177170 | n/a | JN177189 | JN177257 | JN177287 |
| | Misgurnus anguillicaudatus 2 | ■ | n/a | NC011209ᵇ | n/a | n/a | n/a | n/a |
| | Misgurnus nikolskyi | ● | NC008678ᵇ | NC008678ᵇ | FJ197040ᵇ | EU711140ᵇ | n/a | n/a |
| | Acantopsis choirorhynchos 1 | ▲ | JN177219 | n/a | n/a | n/a | JN177258 | JN177288 |
| | Acantopsis choirorhynchos 2 | ▲ | n/a | NC008669ᵇ | FJ197039ᵇ | EU711139ᵇ | n/a | n/a |
| | Cobitis sinensis 1 | ■ | JN177238 | GQ281579ᵃ | n/a | n/a | n/a | JN177285 |
| | Cobitis sinensis 2 | ■ | n/a | n/a | n/a | n/a | n/a | JN177286 |
| | Cobitis sinensis 3 | ■ | n/a | NC007229ᵇ | n/a | n/a | n/a | n/a |
| | Cobitis striata | ● | n/a | AB054125ᵇ | n/a | n/a | n/a | n/a |
| | Cobitis choii | ■ | n/a | EU656112ᵇ | n/a | n/a | n/a | n/a |
| | Cobitis takatsuensis | ● | n/a | n/a | EU409643ᵇ | n/a | EU409739.1ᵇ | EU409675.1ᵇ |
| | Paramisgurnus dabryanus | ■ | JN177218 | n/a | n/a | JN177188 | n/a | n/a |
| | Pangio anguillaris | ▲ | NC008675ᵇ | NC008675ᵇ | n/a | n/a | n/a | n/a |
| | Pangio oblonga | ▲ | n/a | n/a | FJ197041ᵇ | EU711141ᵇ | n/a | n/a |
| | Niwaella multifasciata | ● | n/a | n/a | n/a | n/a | EU409738.1ᵇ | EU409674.1ᵇ |
| | Canthophrys gongota | ▲ | n/a | n/a | n/a | n/a | FJ650444.1ᵇ | FJ650485.1ᵇ |
| Cobitoidea: Nemacheilidae | Schistura balteata | ▲ | NC008679ᵇ | NC008679ᵇ | FJ197029ᵇ | EU711131ᵇ | n/a | n/a |
| | Schistura dabryi | ■ | n/a | JN177172 | JN177207 | JN177185 | JN177242 | JN177267 |
| | Schistura fasciolata | ■ | JN177233 | GQ281577ᵃ | n/a | JN177184 | JN177244 | JN177268 |
| | Micronemacheilus pulcher | ■ | JN177214 | JN177171 | EU409637ᵇ | JN177187 | JN177246 | JN177271 |
| | Oreonectes platycephalus | ▲ | JN177215 | GQ281578ᵃ | n/a | JN177186 | JN177243 | JN177270 |
| | Paracobitis wujiangensis | ■ | JN177216 | n/a | n/a | JN177183 | JN177245 | JN177269 |
| | Barbatula toni | ● | NC008670ᵇ | NC008670ᵇ | FJ197030ᵇ | EU711133ᵇ | n/a | n/a |
| | Barbatula barbatula | ● | n/a | n/a | n/a | n/a | FJ650447.1ᵇ | FJ650488.1ᵇ |
| | Lefua echigonia | ● | n/a | AB054126ᵇ | FJ197028ᵇ | EF458305ᵇ | n/a | n/a |
| | Lefua costata | ● | n/a | n/a | EU409634ᵇ | n/a | EU409730.1ᵇ | EU409666.1ᵇ |
| | Triplophysa gundriseri | ● | n/a | n/a | n/a | n/a | FJ650451.1ᵇ | FJ650492.1ᵇ |
| | Tuberoschistura baenzigeri | ▲ | n/a | n/a | n/a | n/a | FJ650452.1ᵇ | FJ650493.1ᵇ |
| Cobitoidea: Balitoridae | Jinshaia abbreviata | ● | JN177228 | JN177169 | n/a | JN177180 | JN177249 | JN177274 |
| | Sinogastromyzon sichangensis | ■ | JN177227 | GQ281581ᵃ | n/a | JN177181 | JN177250 | JN177275 |
| | Sinogastromyzon tonkinensis | ■ | n/a | n/a | n/a | n/a | JN177247 | JN177277 |
| | Sinohomaloptera kwangsiensis | ■ | n/a | n/a | n/a | n/a | JN177248 | JN177276 |
| | Sewellia lineolata | ▲ | n/a | n/a | EU409635ᵇ | n/a | EU409731.1ᵇ | EU409667.1ᵇ |
| | Lepturichthys fimbriata 1 | ● | n/a | JN177168 | n/a | JN177182 | JN177251 | JN177272 |
| | Lepturichthys fimbriata 2 | ● | JN177229 | GQ281582ᵃ | n/a | n/a | n/a | JN177273 |
| | Formosania chenyiyui | ■ | n/a | JN177166 | JN177200 | JN177173 | JN177254 | JN177278 |
| | Formosania lacustre | ■ | NC001727ᵇ | NC001727ᵇ | n/a | n/a | n/a | n/a |
| | Vanmanenia pingchowensis | ■ | JN177230 | JN177165 | JN177201 | JN177174 | JN177255 | JN177279 |
| | Vanmanenia caldwelli | ● | JN177232 | n/a | JN177206 | JN177178 | JN177252 | JN177280 |
| | Beaufortia szechuanensis | ■ | JN177231 | n/a | JN177205 | JN177179 | JN177253 | JN177281 |
| | Pseudogastromyzon fangi | ■ | JN177234 | n/a | JN177204 | JN177177 | JN177256 | JN177284 |
| | Pseudogastromyzon fasciatus | ■ | JN177237 | GQ281580ᵃ | JN177203 | JN177175 | n/a | JN177282 |
| | Pseudogastromyzon cheni | ● | n/a | JN177167 | JN177202 | JN177176 | n/a | JN177283 |
| | Homaloptera leonardi | ▲ | NC008673ᵇ | NC008673ᵇ | FJ197027ᵇ | EU711130ᵇ | n/a | n/a |
| | Homaloptera parclitella | ▲ | n/a | n/a | EU409636ᵇ | n/a | EU409732.1ᵇ | EU409668.1ᵇ |
| Cobitoidea: Vaillantellidae | Vaillantella maassi | ▲ | NC008680ᵇ | NC008680ᵇ | FJ197031ᵇ | EU711132ᵇ | FJ650453.1ᵇ | FJ197080ᵇ |
| Outgroup: Gyrinocheilidae | Gyrinocheilus aymonieri | | NC008672.1ᵇ | NC008672ᵇ | FJ197071ᵇ | EU292682ᵇ | JN177259 | JN177290 |
| Outgroup: Catostomidae | Myxocyprinus asiaticus | | NC006401.1ᵇ | AY986503ᵇ | FJ197036.1ᵇ | EU711136ᵇ | JN177263 | FJ197085.1ᵇ |
| Outgroup: Cyprinidae | Danio rerio | | NC002333.2ᵇ | AC024175ᵇ | NM131084.1ᵇ | NM131389.1ᵇ | NM130997.2ᵇ | BC060944.1ᵇ |

Note: n/a: sequences not available. Without ‘a’ or ‘b’: sequences novel to this study. ● Distribution in temperate; ■ distribution in subtropical; ▲ distribution in tropical.
ᵃ Sequences obtained from our previous study.
ᵇ Sequences downloaded from GenBank.


1999). When LRTs indicated positive selection, we used the Bayes empirical Bayes (BEB) method (Yang et al., 2005) to calculate posterior probabilities of positively selected codons.
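The likelihood ratio test described above reduces to a short calculation, sketched here under stated assumptions. The function names and the log-likelihood values are hypothetical and are not results from this study. To stay within the standard library, the χ² tail probability is implemented only for 1 and 2 degrees of freedom, which covers the comparisons commonly made with these models (for example, 2 degrees of freedom for M7 versus M8, and 1 for a one-ratio versus a two-ratio branch model).

```python
import math


def lrt_statistic(lnl_null, lnl_alt):
    """Twice the log-likelihood difference (2*delta lnL) between nested models."""
    return 2.0 * (lnl_alt - lnl_null)


def chi2_sf(x, df):
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


# Hypothetical log-likelihoods for a null (M7) and alternative (M8) model.
lnl_m7, lnl_m8 = -12345.67, -12340.12
stat = lrt_statistic(lnl_m7, lnl_m8)
p_value = chi2_sf(stat, df=2)
print(f"2*delta lnL = {stat:.2f}, p = {p_value:.4f}")
```

In practice the two log-likelihoods would be read from the output of the fitted models; a p-value below the chosen threshold would then prompt the BEB step to identify candidate sites.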

3. Results

2.5. Synonymous codon usage bias

Among the aligned mitochondrial genes, no indels could be detected in either COI or ND4. Aligned ND5 sequences revealed indels for several sequences. ND4 and ND5 sequences in the aligned data matrix were 1384 and 1848 bp, respectively. The three tRNA genes between ND4 and ND5 were 70, 72, and 74 bp. The total length of the concatenated sequences for ND4 + 5 included 3448 bp. Of these, 1830 bp (53.1%) were at variable sites and 1538 (44.6%) contained variation inferred to as parsimony-informative. Nuclear genes, contained no indels for RH1 and RAG1. Some indels were found in both the EGR2B and IRBP genes. Additional details for each gene (length, variability number of parsimony information sites, mean base composition) are provided in Table A.3 in the Appendices. No evidence of saturation was detected at first or second codon positions. However, third position sites in mitochondrial genes appeared to contain areas of saturation. Pairwise Transition (Ts) and Transversion (Tv) differences at each codon of each gene increased with increasing genetic distance (GTR-distance), except for Ts differences at the third codon positions of COI, ND4 and ND5. Furthermore, a remarkable degree of saturation occurred at third codon sites that was revealed to have been reached with only about 5% genetic distance in ND4 and ND5 (Appendices, Fig. A.1).

Synonymous codon usage bias is usually attributed to translation selection, favoring efficient protein synthesis (Akashi and Eyre-Walker, 1998; Duret, 2002; Ikemura, 1985). Highly constrained proteins, however, are less tolerant of translational errors, and their genes may possess higher codon usage bias in order to maximize translation efficiency (Akashi, 1994; Li, 1997). Thus, synonymous codon usage bias is a proper indicator of translation efficiency across the evolutionary history of genes (Duret and Mouchiroud, 1999; Kawahara and Imanishi, 2007; Sharp and Li, 1987). We used the codon adaptation index (CAI) (Sharp and Li, 1987) and the effective number of codons (ENC) (Wright, 1990) to measure codon usage bias acting on synonymous sites. These two indices were calculated using the program CodonW 1.4 (written by John Peden, obtained from http://codonw. sourceforge.net/), and statistical differences were evaluated using SPSS 13.0. 2.6. Detection of variation in dN/dS The dN/dS evaluation results determine whether the genes involved in the present study have historically experienced different selective pressures along different, evolutionary independent lineages. Such tests demand hypothesized phylogenetic relationships of the taxa being evaluated. Our inferred evolutionary history of the Cobitoidea with mtDNA + ncDNA datasets (Fig. 1) reveals at a minimum five independent historical lineages corresponding to five families (Balitoridae, Nemacheilidae, Cobitidae, Vaillantellidae, Botiidae). 
Using branch models in PAML (Yang, 1997, 2007), the following four steps were performed: (1) dN/dS values were estimated for all single-gene datasets, where one-ratio (all lineages share a common dN/dS value) and two-ratio (foreground lineage and background lineages have different dN/dS values) models were conducted; (2) each lineage was treated as a foreground lineage as compared to remaining lineages (background lineages) separately; 35 pairs of LRTs were examined between the one- and two-ratio models for seven genes; (3) genes detected as having significant variation in dN/dS values in above step 2 were subsequently estimated separately for each branch of the phylogenetic tree using the free-ratio model; (4) the mean dN/dS, dN and dS values from step 3 were compared between so-called “foreground” and “background” lineages to assess possible changes in selection pressure. Because climatic factors are known to play important roles in shaping the evolutionary history of mitochondrial protein encoding genes of Teleostei (Sun et al., 2011), we investigated variations in dN/dS of genes of the Cobitoidea where relevant comparisons of taxa could be made across three distinct climate groups. The three climate groups included temperate, subtropical and tropical groups, identified from field sampling records, literature review and Fishbase (Froese and Pauly, 2011) (Table 1). Values for dN/dS, dN, and dS for the seven genes were estimated separately using the free-ratio model, and compared across the three climate groups. Given that the unique, fast-flowing environment inhabited by balitorids might confound analyses as to the impact of climatic factors on the evolution of genes, balitorid species were not included at present in any of the three climate groups. 
For the analyses of changes in dN/dS identified above, only dN/dS values for terminal branches were used, and branches with dS = 0 were excluded; this focuses the comparison on substitutions accumulated between extant species and their most recent reconstructed ancestors (Shen et al., 2009; Sun et al., 2011). All statistical analyses were performed in SPSS 13.0.
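The filtering rule described above (terminal branches only, no branches with dS = 0) can be illustrated with a minimal sketch. The branch records, field names, and numeric values below are hypothetical placeholders chosen for illustration, not data from this study, and the original analyses were run in PAML and SPSS rather than in Python.

```python
from statistics import mean

# Hypothetical free-ratio output: one record per branch of the tree.
# Labels, groups, and values are illustrative assumptions only.
branches = [
    {"label": "sp_A", "terminal": True, "group": "foreground", "dN": 0.004, "dS": 0.080},
    {"label": "sp_B", "terminal": True, "group": "foreground", "dN": 0.006, "dS": 0.000},
    {"label": "sp_C", "terminal": True, "group": "background", "dN": 0.010, "dS": 0.250},
    {"label": "sp_D", "terminal": True, "group": "background", "dN": 0.008, "dS": 0.400},
    {"label": "node_1", "terminal": False, "group": "background", "dN": 0.002, "dS": 0.050},
]


def usable(branch):
    """Keep terminal branches whose dS is non-zero (dN/dS is undefined otherwise)."""
    return branch["terminal"] and branch["dS"] > 0


def mean_omega_by_group(records):
    """Return the mean dN/dS of the usable branches in each group."""
    ratios = {}
    for b in filter(usable, records):
        ratios.setdefault(b["group"], []).append(b["dN"] / b["dS"])
    return {group: mean(values) for group, values in ratios.items()}


group_means = mean_omega_by_group(branches)
for group, value in group_means.items():
    print(f"{group}: mean dN/dS = {value:.4f}")
```

In this toy input, sp_B is dropped because its dS is zero and node_1 is dropped because it is an internal branch, so only three branches contribute to the group means.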

3.1. Characteristics of sequences

3.2. Phylogenetic relationships of the superfamily Cobitoidea

Phylogenetic relationships of the superfamily Cobitoidea were reconstructed separately from three datasets: combined mitochondrial DNA (mtDNA) sequences, combined nuclear DNA (ncDNA) sequences, and combined mitochondrial and nuclear gene sequences (mtDNA+ncDNA). Combined mtDNA sequences (4545 bp) and combined ncDNA sequences (3996 bp) yielded similar topologies except for the position of Vaillantellidae (Appendices, Fig. A.2–3). In both ML and BA of the combined mtDNA, Botiidae and Vaillantellidae were recovered as sister groups, and together they formed the basal clade within the Cobitoidea, resolving the superfamily as (Botiidae + Vaillantellidae) + (Cobitidae + (Nemacheilidae + Balitoridae)). In phylogenetic reconstructions using ncDNA sequences, however, Botiidae formed the basal sister group to all other loaches of the Cobitoidea, and Vaillantellidae formed an independent clade sister to the large clade consisting of Balitoridae, Nemacheilidae, and Cobitidae. As with the ncDNA analyses, analyses of the combined mtDNA and ncDNA sequences (8541 bp) from the 62 taxa examined clearly identified Vaillantellidae as an independent lineage in an intermediate position between the basal Botiidae lineage and the large clade of remaining loaches (Cobitidae + (Balitoridae + Nemacheilidae)). These relationships were strongly supported in both ML and BA analyses (Fig. 1). The species composition and lineage relationships within the Cobitoidea derived from recent detailed analyses and revised classifications of Cypriniformes (Chen et al., 2008; Mayden et al., 2008; Mayden and Chen, 2010) are in general agreement with the present study.
Clades within the families also received high support in both ML and BA analyses, as did the monophyly of each of the families. Phylogenetic relationships among families of loaches of the Cobitoidea clade were resolved as (Botiidae+(Vaillantellidae+(Cobitidae+(Nemacheilidae+ Balitoridae)))). Combined mtDNA and ncDNA sequences include more variable and phylogenetically informative sites, and the resulting evolutionary inferences received higher support values than those with single data set and ncDNA sequences analyses. As such, we hypothesize that


Fig. 1. Phylogenetic tree based on combined mitochondrial and nuclear gene sequences. The main complete topology is based on a BA tree. Numbers on nodes before/after slash represent posterior probabilities and bootstrap values, respectively. * represents 100% support; - represents no support.

these resulting topologies best reflect the evolutionary history of the major lineages of Cobitoidea; relationships described below were inferred from the combined dataset. Botiidae formed the basal sister group to the remaining cobitoids. Our samples of this family included six of the seven genera (Nelson, 2006): Chromobotia, Botia, Sinibotia, Syncrossus, Leptobotia, and Parabotia. BA and ML analyses consistently resolved these six genera into two main clades, both with high branch support. One clade included the genera Leptobotia and Parabotia, corresponding to the subfamily Leptobotiinae. The other clade, including Chromobotia, Botia, Sinibotia, and Syncrossus, represented the subfamily Botiinae. Within the Leptobotiinae, Leptobotia mantschurica was nested within the genus Parabotia, so that Leptobotia and Parabotia were not reciprocally monophyletic.

Within Cobitidae, Niwaella multifasciata was nested within Cobitis, rendering Cobitis paraphyletic in both trees. Misgurnus anguillicaudatus was resolved as more closely related to Paramisgurnus dabryanus than to Misgurnus nikolskyi, rendering Misgurnus as a paraphyletic grouping as well. For the remaining genera, the relative relationships differed between BA and ML analyses (Fig. 1). Within the Balitoridae, two subfamilies, Gastromyzoninae and Balitorinae, were both resolved as monophyletic and with high posterior probability (100%) and robust bootstrap values (100%). Within Gastromyzoninae, Formosania was paraphyletic with Vanmanenia pingchowensis nested within Formosania. Three species of the genus Pseudogastromyzon formed a monophyletic group closely related to the clade including Formosania chenyiyui, F. lacustre, and Vanmanenia pingchowensis. Sewellia was recovered as the basal-most lineage of


Gastromyzoninae with strong nodal support (PP = 100%, BP = 100%), followed by Vanmanenia caldwelli. Within the Balitorinae, Lepturichthys was resolved as an unnatural grouping: one individual of Lepturichthys fimbriata was sister to Sinogastromyzon sichangensis, while the other was sister to Jinshaia abbreviata. Schistura balteata formed the sister group to the remaining species of Nemacheilidae, rendering Schistura, as currently recognized, polyphyletic. Among the remaining Nemacheilidae, Paracobitis wujiangensis grouped with Schistura fasciolata, and together they formed the sister group to Tuberoschistura baenzigeri. Thus, the phylogenetic relationships within Schistura demand much more attention and evaluation using molecular and morphological character data. A notable pattern emerging from this larger comparative analysis of the superfamily is that significant cryptic diversity exists among already described taxa. Such patterns are not unusual once genealogical relationships among taxonomic units are evaluated in a phylogenetic context and more than one representative of each taxon is included in the analyses. Cobitoid species represented herein by more than one individual, and genera represented by more than one species, suggest cryptic divergence in Misgurnus, Vanmanenia, Formosania, Lepturichthys, Sinogastromyzon, and Schistura. BA and ML analyses yielded different trees as to the relationships of species of Homaloptera to the remaining balitorine species (Fig. 1).

3.3. Detection of positive selection and comparison of synonymous codon usage bias

Analyses based on the one-ratio model M0 revealed that the average ω values of all seven genes were less than 0.15, indicating that the average selective pressure acting on each gene has been negative (purifying). Purifying selection on the three mitochondrial genes COI, ND4, and ND5 (ω = 0.005, 0.0296, and 0.0478, respectively) was stronger than that on the four nuclear genes RH1, RAG1, EGR2B, and IRBP (ω = 0.0738, 0.0664, 0.0578, and 0.1490, respectively). In the comparisons of models M7 and M8, positive selection was detected only in RH1 sequences, for which model M8 fit significantly better than model M7 in the LRT (P = 0.0416). The BEB procedure identified amino acid sites AA92, AA134, and AA169 of the RH1 sequence segments as possibly being under positive selection, but only AA92 had a posterior probability (Pr) above 95% (Pr = 98.0%; the program prints an * if the posterior probability is > 95%; Yang, 2007).
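The comparison of gene-wide ω estimates reported above can be reproduced with a few lines of arithmetic. This is only an illustrative sketch: the ω values are copied from the text, while the `regime` helper and its thresholds simply encode the conventional reading of a gene-wide dN/dS ratio (below 1 purifying, equal to 1 neutral, above 1 positive) and are not part of the original analysis pipeline.

```python
from statistics import mean

# One-ratio (M0) omega (dN/dS) estimates quoted in the paragraph above.
omega_m0 = {
    "COI": 0.005, "ND4": 0.0296, "ND5": 0.0478,                      # mitochondrial
    "RH1": 0.0738, "RAG1": 0.0664, "EGR2B": 0.0578, "IRBP": 0.1490,  # nuclear
}
MITOCHONDRIAL = {"COI", "ND4", "ND5"}


def regime(omega):
    """Conventional reading of a gene-wide omega: <1 purifying, 1 neutral, >1 positive."""
    if omega < 1:
        return "purifying"
    if omega > 1:
        return "positive"
    return "neutral"


mito_mean = mean(v for g, v in omega_m0.items() if g in MITOCHONDRIAL)
nuclear_mean = mean(v for g, v in omega_m0.items() if g not in MITOCHONDRIAL)
regimes = {gene: regime(omega) for gene, omega in omega_m0.items()}

print(f"mean omega, mitochondrial genes: {mito_mean:.4f}")
print(f"mean omega, nuclear genes:       {nuclear_mean:.4f}")
```

A gene-wide average below 1 does not rule out positive selection at individual sites or along individual lineages, which is why the site and branch-site models are examined separately.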

Analyses with the more rigorous branch-site model also showed that RH1 was under positive selection in the lineage Nemacheilidae, although only a single site, AA92, was identified as being under positive selection by the BEB procedure (Pr = 98.4%, Table 2). Moreover, RAG1 was detected as being affected by positive selection in the major lineages Nemacheilidae and Vaillantella. Positive selection was identified at sites AA133, AA136, AA196, AA199, and AA200 along the Nemacheilidae lineage (Pr of AA136 > 95%), and at sites AA135, AA288, and AA324 along the Vaillantella lineage (Pr of AA324 > 95%; Table 2). Analyses of synonymous codon usage bias indicated that the codon adaptation index (CAI) of RAG1 in Nemacheilidae and Vaillantella was significantly different from that of other cobitoid species (0.261 ± 0.01 and 0.248 ± 0.02, respectively; P = 0.03, t-test). The mean CAI of RH1 in the nemacheilid lineage was higher than that of other species of Cobitoidea (0.280 ± 0.01 and 0.270 ± 0.03, respectively), but the difference was not significant (P = 0.436) and should be re-examined with greater taxon sampling. The effective number of codons (ENC), another index of codon usage bias, gave the same results as CAI for these two genes.

3.4. Variation in dN/dS among lineages of Cobitoidea

Comparisons using LRTs detected significant variation in dN/dS values in 7 of 35 pairs of foreground and background lineages, involving five genes (ND4, ND5, EGR2B, IRBP, RH1) and four lineages (Appendices, Table A.4–8). Values of dN/dS in five foreground lineages were significantly higher than those of the background lineages, and values in two foreground lineages were significantly lower (Fig. 2). These findings indicate the primary evolutionary trends of the genes and guided the subsequent analyses of patterns of variation in dN/dS.
Comparisons of mean dN/dS, dN, and dS values between foreground and background lineages were significant in ND4, ND5, and IRBP. No significant differences occurred in EGR2B or RH1 (Table 3). Thus, the following discussion is focused on the former three genes. Mean dN/dS of ND4 in the balitorid lineage was significantly greater than that of the other four lineages (denoted as non-balitorids) (P = 0.034). Mean dN in this same lineage was similar to that of the non-balitorids (P = 0.681), whereas mean dS was significantly lower than that of non-balitorids (P = 0.031) (Table 3). Mean dN/dS of ND5 in the balitorid lineage was also greater than that of non-balitorids, but the difference was not significant (P = 0.140). Mean dN in the balitorid lineage was similar to that of non-balitorids (P = 0.438), while mean dS was lower than that of non-balitorids, although only marginally so (P = 0.082) (Table 3). Mean dN/dS of ND5 in the botiid

Table 2 Parameter estimates and likelihood ratio tests for RH1 and RAG1 genes in the branch-site models.

Gene     Site class  Proportion  Background ω  Foreground ω  Δdf  2ΔlnL     P value   Positively selected sites
RH1 a    0           0.92512     0.02018       0.02018       1    5.491284  P = 0.02  92W b (Pr = 98.4%)
         1           0.07028     1             1
         2a          0.00428     0.02018       43.60333
         2b          0.00032     1             43.60333
RAG1 a   0           0.93355     0.03801       0.3801        1    3.878464  P = 0.04  133S, 136A b (Pr = 97.3%), 196V, 199L, 200L
         1           0.06133     1             1
         2a          0.00481     0.03801       18.97775
         2b          0.00032     1             18.97775
RAG1 c   0           0.92851     0.03651       0.03651       1    6.871678  P < 0.01  135R, 288P, 324S b (Pr = 98.9%)
         1           0.06569     1             1
         2a          0.00542     0.03651       30.07654
         2b          0.00038     1             30.07654

a Nemacheilidae is the foreground lineage.
b Posterior probability, Pr > 0.95; the posterior probabilities of positively selected sites shown here are all > 0.50.
c Vaillantellidae is the foreground lineage.
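As a rough check on how the likelihood ratio tests in Table 2 translate into P values, the sketch below converts each 2ΔlnL statistic into an upper-tail probability. It assumes a chi-square reference distribution with one degree of freedom, matching the Δdf = 1 listed in the table; whether the authors used exactly this reference distribution is not stated here, so the function name `chi2_sf_df1` and the choice of distribution should be read as assumptions rather than as the original procedure.

```python
import math


def chi2_sf_df1(statistic):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For df = 1, P(X > x) equals P(|Z| > sqrt(x)) for a standard normal Z,
    which is erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(statistic / 2.0))


# 2*delta(lnL) values as listed in Table 2 (df = 1 for each test).
lrt_statistics = {
    "RH1, Nemacheilidae foreground": 5.491284,
    "RAG1, Nemacheilidae foreground": 3.878464,
    "RAG1, Vaillantellidae foreground": 6.871678,
}

p_values = {test: chi2_sf_df1(stat) for test, stat in lrt_statistics.items()}
for test, p in p_values.items():
    print(f"{test}: P = {p:.4f}")
```

Under this assumption all three tests fall below the 0.05 threshold, and the RAG1 test with Vaillantellidae as the foreground lineage falls below 0.01, in line with the significance levels reported in the table.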


Fig. 2. The dN/dS values of all genes across five lineages of Cobitoidea (y-axis: dN/dS, 0.00 to 0.25; x-axis: COI, ND4, ND5, EGR2B, IRBP, RAG1, RH1). Rectangles represent dN/dS values of the foreground lineages as calculated in a branch-specific model in PAML. The upward arrow means that the likelihood ratio test (LRT) was significant and the dN/dS value of the foreground lineage was larger than that of the background lineage. The downward arrow means that the LRT was significant but the dN/dS value of the foreground lineage was smaller.

lineage was slightly lower than that of the remaining lineages (denoted as non-botiids) (P = 0.302). Mean dN and dS in the botiid lineage were significantly lower than in non-botiids (P = 0.001 and 0.003, respectively) (Table 3). This is a conservative result, as the smaller dS in the botiid lineage should result in a greater dN/dS. The pattern of variation seen in ND4 and ND5 was not observed in the nuclear IRBP. The average dN/dS of IRBP in the balitorid lineage was slightly higher than that of the remaining lineages (P = 0.181), while both mean dN and dS in the balitorid lineage were significantly lower than those of non-balitorids (P = 0.019 and 0.005, respectively) (Table 3). The lower dN in the balitorid lineage should lead to a lower dN/dS ratio, indicating that these results are conservative.

3.5. Lineage evolution, historical constraints and impact of climate on dN/dS of the Cobitoidea

Mean dN/dS was greater in the subtropical group, relative to the temperate and tropical groups, in all genes except EGR2B; while the trend in these differences is clear, statistical significance was only detected for IRBP and RAG1 (Fig. 3; Appendices, Table A.9). Mean dN/dS of IRBP in subtropical groups (0.2503) was significantly greater than

that of tropical groups (0.1256; P = 0.021; Fig. 3B), but was not significantly different from that of temperate groups (0.1287; P = 0.111; Fig. 3B). Mean dN/dS of RAG1 in subtropical groups (0.1998) was significantly higher than that in temperate groups (0.0509; P = 0.043; Fig. 3B), but not significantly different from that of the tropical groups (0.0856; P = 0.184; Fig. 3B). The subtropical groups generally had lower dN values for ND4, ND5, EGR2B, and IRBP relative to the other two groups, but a statistically significant difference was detected only between the subtropical and tropical groups in IRBP (P = 0.017; Fig. 4A; Appendices, Table A.9), very likely due to sample sizes. In contrast, significant differences in mean dS were more frequent, occurring in 5 of 7 genes (Fig. 4B; Appendices, Table A.9). All seven genes generally exhibited greater mean dS values in the tropical groups relative to the temperate and subtropical groups; the differences were statistically significant in COI, ND5, EGR2B, IRBP and RH1, whereas no significant differences in dS were observed for ND4 and RAG1 among the three climatic groups, possibly as a result of insufficient taxon sampling in this study. Furthermore, all genes except EGR2B had lower mean dS values in the subtropical groups than in the temperate and tropical groups (Appendices, Table A.9).

Table 3 Statistical analyses of mean dN/dS, dN and dS values between lineages for genes being detected to undergo significant heterogeneous selection in a branch-specific model. Balitorid ND4

ND5

EGR2B

IRBP

RH1

a b c d e

dN/dS dN dS dN/dS dN dS dN/dS dN dS dN/dS dN dS dN/dS dN dS

e

0.0591 0.0196 0.2528e 0.0640 0.0174 0.2300

Non-balitorida

Nemacheilid

Non-nemacheilidb

Cobitid

Botiid

Non-botiidd

0.0867 0.0057 0.0692

0.0436 0.0075e 0.1803e 0.0984 0.0088 0.0788

0.0623 0.0180 0.3595 0.0691 0.0039 0.0702

0.0214 0.0180 0.5036 0.0615 0.0183 0.4135 0.0174 0.0018 0.0896

0.3046 0.0056e 0.0243e

Non-cobitidc

0.1732 0.0116 0.0816 0.1332 0.0057 0.0630

a Lineages containing nemacheilid, cobitid, vaillantellid and botiid.
b Lineages containing balitorid, cobitid, vaillantellid and botiid.
c Lineages containing balitorid, nemacheilid, vaillantellid and botiid.
d Lineages containing balitorid, nemacheilid, cobitid and vaillantellid.
e Difference of mean values was significant between two lineages.

0.1674 0.0035 0.0499


Fig. 3. Comparison of mean dN/dS values among three climatic groups (temperate, subtropical, and tropical). A) Comparison of mean dN/dS values for three mitochondrial genes; B) comparison of mean dN/dS values for four nuclear genes. * represents P < 0.05.

4. Discussion

4.1. Phylogenetic relationships of the Cobitoidea

In the present study, the recovery of a monophyletic Cobitoidea was consistent with previous studies (Chen et al., 2008; Liu et al., 2010; Mayden and Chen, 2010; Mayden et al., 2008, 2009; Šlechtová et al., 2007; Tang et al., 2006). The superfamily Cobitoidea is herein hypothesized to consist of five monophyletic families: Balitoridae, Nemacheilidae, Cobitidae, Vaillantellidae, and Botiidae. Support values from analyses of the combined mtDNA sequences were lower than those from analyses of the combined ncDNA sequences at deeper-level nodes, suggesting that the ncDNA data performed better and provided more reliable results for recovering deeper-level relationships.

4.1.1. Relationships among families

In the present phylogenetic analysis, the monophyletic family Balitoridae, a group of fishes known as the sucker-body hillstream loaches, was most closely related to the monophyletic family Nemacheilidae, a group commonly referred to as the brook loaches. Previous studies (Liu et al., 2002; Tang et al., 2006) suggested that nemacheilids and cobitids formed sister groups (outside of balitorids); however, these earlier studies were based on more limited taxon sampling, mitochondrial gene sampling, and analyses. The Cobitidae–Nemacheilidae sister-group relationship hypothesized in these earlier studies was not supported in the present mitochondrial-gene analyses, in which we have greater character and taxon sampling (number of molecular characters and more appropriate gene sequences for higher-level phylogenetic inference, and greater number of species representative of diversity of

Fig. 4. Comparison of mean (A) dN and (B) dS values for all genes among three climate groups. * represents 0.01 < P < 0.05; ** represents 0.001 < P < 0.01; *** represents P < 0.001.


Cobitoidea). This observation is not unexpected, as many recent studies have shown that both phylogenetic hypotheses and evolutionary studies relying on these hypotheses can change with increasing taxon and character sampling (Hillis et al., 2003; Mayden et al., 2008; Tao et al., 2010). Sawada (1982) examined 52 osteological characters and found that nemacheilids and balitorids shared three synapomorphic features supporting their sister-group relationship: 1) Y-shaped tripus, 2) second descending process forming a part of the gas bladder capsule, and 3) osseous capsule laterally subdivided. Subsequent authors (e.g. Kottelat, 2001; Liu et al., 2010; Mayden et al., 2009; Saitoh et al., 2006; Siebert, 1987; Šlechtová et al., 2007) corroborated Sawada's (1982) hypothesized relationships, with the placement of a monophyletic Nemacheilidae sister to a monophyletic Balitoridae. Furthermore, nemacheilid and balitorid species share similar habitats, occurring in cool, well-oxygenated streams with sand-and-gravel bottoms in temperate, subtropical, and tropical regions (Bǎnǎrescu and Nalbant, 1995). Therefore, we support the sister-group relationship between Balitoridae and Nemacheilidae. We recognized Cobitidae as the sister group to the Balitoridae + Nemacheilidae clade in the present study, despite the presence of a movable bifurcated suborbital spine, which has been considered a synapomorphy of Cobitidae plus Botiidae (Sawada, 1982). Furthermore, many morphological and molecular studies rejected a sister-group relationship between Cobitidae and Botiidae (Liu et al., 2010; Mayden et al., 2009; Nalbant, 2002; Šlechtová et al., 2007, 2008; Tang et al., 2006). The present study supports Botiidae as the basal-most clade of the Cobitoidea, a finding consistent with previous studies (Liu et al., 2002; Mayden et al., 2009; Nalbant, 1963; Saitoh et al., 2006; Šlechtová et al., 2008; Tang et al., 2006).
Therefore, we hypothesize that the presence of the suborbital spine in the cobitid and botiid groups is the result of convergence. The phylogenetic position of Vaillantella has long been debated. Nalbant and Bǎnǎrescu (1977) considered the subfamilies Botiinae, Cobitinae, Nemacheilinae and Vaillantellinae as four members of the Cobitidae. Instead of elevating the genus Vaillantella to the subfamily Vaillantellinae, Sawada (1982) recognized this genus as a member of the subfamily Nemacheilinae. After a reorganization of families, Nalbant (2002) assigned Vaillantella to the subfamily Vaillantellinae within the family Botiidae. In recent years, Vaillantellinae was elevated to a separate family, Vaillantellidae, equal in rank to Balitoridae, Nemacheilidae, Cobitidae and Botiidae (Mayden et al., 2009; Šlechtová et al., 2007). In the present study, phylogenetic evidence supports the recognition of Vaillantellidae as a separate family. The phylogenetic trees based on the mtDNA and ncDNA datasets yielded incongruent results regarding the position of Vaillantellidae. Combined mtDNA sequences place Vaillantellidae as the sister group to Botiidae (Fig. A.2). However, both the combined ncDNA and the combined mtDNA + ncDNA datasets identify Vaillantellidae as sister to the clade (Cobitidae + (Nemacheilidae + Balitoridae)) (Fig. A.3). As the latter placement is supported by both of these datasets and is consistent with previous studies, we support this latter hypothesis as to the phylogenetic position of Vaillantellidae. Moreover, a previous morphological study evaluating 52 osteological characters offered no synapomorphies or other evidence for a sister-group relationship between Vaillantella and Botiidae (Sawada, 1982).
There are many possible explanations for the conflicting phylogenetic position of Vaillantellidae in the mitochondrial and nuclear gene trees, including the number of genes and base pairs sampled, taxon sampling, and the methods of analysis. Beyond these factors, the conflict could have resulted from introgression of the mitochondrial genome of ancestral Botiidae species into ancestral Vaillantella populations. As alluded to above, the selected mitochondrial genes were not as useful as the nuclear genes in resolving the more basal nodes in the evolution of the Cobitoidea.


Unfortunately, this analysis did not include the genus Ellopostoma. Inclusion of this taxon in future work will be necessary to better resolve the phylogenetic relationships of the Cobitoidea.

4.1.2. Relationships within families

Species of Balitoridae are well adapted to, and are usually found in, swift currents of rivers and streams (Hora, 1930; Roberts, 1982; Tan, 2006). Adaptations include depressed bodies and heads, subterminal mouths, and paired fins adapted as adhesive organs (Nelson, 2006). Balitoridae, as a whole, is a morphologically distinct group relative to other groups of Cobitoidea. In the present study we included 16 species representing nine genera of Balitoridae, a family that was split into two distinct monophyletic lineages, Gastromyzoninae and Balitorinae. In contrast to the opinion of Hora (1932) that members of the Balitorinae and Gastromyzoninae evolved from the Cyprinidae and Cobitidae, the present study and others (Liu et al., 2010; Tang et al., 2006) corroborate their unique common ancestry. However, due to incomplete species sampling, monophyly of the genera within this family could not be resolved. Within the Nemacheilidae, Schistura was resolved as clearly polyphyletic in all analyses, including those of separate nuclear genes, mtDNA data, and combined mtDNA + ncDNA data. In earlier studies (Bǎnǎrescu and Nalbant, 1995; Tang et al., 2006), Schistura was also recovered as polyphyletic and was impossible to delimit as a clade. This is likely because previous authors working with Schistura have not provided sufficient morphological data, in particular morphological synapomorphies, to warrant recognition of this genus as a natural grouping. It is also possible that the species and specimens sampled here to represent Schistura, from the many possible representatives of this genus, have all been involved in interspecific hybridization that could influence the resolved relationships.
While the latter explanation is possible, it is highly implausible and the real problems lie in all of the previous morphological studies not being rigorous enough or more inclusive of taxa to test the naturalness of the group. No one has examined Schistura morphologically as a whole to establish whether or not the genus is monophyletic on the basis of clearly identifiable synapomorphies. In fact, many authors have previously published on the genus or are now continuing to work on Schistura in “systematic” studies, however, no morphological study has been done on a phylogenetic basis—a problem inherent to not identifying synapomorphies. In our study, Schistura fasciolata and Paracobitis wujiangensis formed a clade based on all three types of data matrices. Given that the distributions of Paracobitis and Schistura partly overlap in Sichuan province in China, the possibility does exist that hybridization had occurred between these two sympatric species during their evolution. To resolve the problem of an unnatural Schistura, additional taxa should be included in broader surveys of the genus and any taxon (genus, species) that is remotely similar to Schistura. It will be equally useful when systematists advocate only the use of morphological data in systematics to evaluate this large group that continues to be recognized on the basis of so called morphological “diagnostic” traits. For the family Cobitidae, two subgroups could be recognized in phylogenetic trees derived from the combined mtDNA + ncDNA dataset: one including Cobitis, Niwaella, Misgurnus, and Paramisgurnus, a clade inhabiting Europe, western, northern and eastern Asia, Vietnam, and Laos. The second clade would include Acantopsis, Canthophrys, and Pangio, taxa recognized today within these genera, inhabiting South and Southeast Asia. These two subgroups were also well corroborated in the reconstruction of northern and southern lineages as defined by Šlechtová et al. (2008). 
In this study, the two clades of Botiidae, Leptobotiinae and Botiinae, were recognized as well-supported monophyletic lineages. Šlechtová et al. (2006) suggested that Leptobotiinae included the genera Leptobotia and Parabotia, and Botiinae contained the genera


Botia, Chromobotia, Sinibotia, Syncrossus, and Yasuhikotakia. Six of these seven genera occur in China (Kottelat, 2004; Zhu, 1995) and were included in the present study. Šlechtová et al. (2006) recovered the genera Leptobotia and Parabotia as monophyletic using only the mitochondrial cytochrome b and 12S rRNA genes. However, the present study clearly demonstrates that Leptobotia mantschurica, a taxon not included in Šlechtová et al. (2006), is nested within Parabotia, rendering Leptobotia and Parabotia unnatural groups; they are not reciprocally monophyletic as previously hypothesized.

4.2. Molecular evolution of genes in the superfamily Cobitoidea

4.2.1. Positive selection and translation efficiency for RH1 and RAG1

Normally, a gene with a dN/dS value greater than 1 is identified as being under positive selection. In the present study, one site (AA92) of RH1 in the nemacheilid lineage was detected as being putatively affected by positive selection; for RAG1, one site (AA136) in the nemacheilid lineage and one site (AA324) in the Vaillantella lineage were detected as having experienced positive selection. RH1 encodes a seven-transmembrane protein (Hargrave and McDowell, 1992) that is one of the important opsins mediating vision in dim light (Dann et al., 2004; Terai et al., 2002; Yokoyama, 2000), and its evolutionary dynamics could be closely related to light conditions (Larmuseau et al., 2009). Changes in the photic environment could drive amino acid (AA) substitutions in opsins and even lead to speciation in fishes (Larmuseau et al., 2009; Seehausen et al., 2008; Yokoyama et al., 2007). RAG1 encodes a protein that is essential for the assembly of immunoglobulin and T-cell receptor genes during their recombination reaction (Agrawal et al., 1998). The evolution of RAG1 may impact the diversification of immune systems of organisms (Agrawal et al., 1998; Schatz, 1999).
Previous studies have noted that genes involved in immunity tend to be over-represented in reports of positive selection (Vallender and Lahn, 2004; Yang, 2006). Synonymous codon usage bias has been proposed as a suitable indicator of translation efficiency over the evolutionary history of a gene (Akashi, 2003). In the present study, synonymous codon usage bias was significantly higher in positively selected lineages (i.e. both the nemacheilid lineage and the Vaillantella lineage for the RAG1 gene). This implies that positive selection might improve translation efficiency of RAG1 in positively selected lineages, although more rigorous evidence is needed to corroborate or falsify this hypothesis. We recognize that greater taxon diversity would aid in determining whether the positive selection on RAG1 has facilitated the evolution of the nemacheilid and Vaillantella lineages or has only been involved in their local adaptations.

4.2.2. Lineage-specific evolution of genes in balitorid fishes

Generally, estimating variation in the evolutionary rates of proteins (dN/dS) among different lineages is an effective way to explore changes in selection pressures acting on genes, and facilitates the interpretation of their contributions to the diversification of lineages (Kawahara and Imanishi, 2007). In the present study, dN/dS of mitochondrial ND4 was found to be significantly higher in the balitorid lineage than in the non-balitorid lineages of Cobitoidea. ND4 is one of the seven mitochondrially encoded subunits of complex I, the first enzyme complex of the respiratory chain (Piruat and Lopez-Barneo, 2005). It is essential for the assembly and activity of the enzymes in aerobic respiration (Chomyn, 2001). Sawada (1982) speculated that the common ancestor of balitorids moved from its original ecological zone to a different zone characterized by fast-flowing water systems.
The high-speed flow creates tremendous mixing of water and air, consequently resulting in high levels of dissolved oxygen (DO) in such streams and rivers. A clear, positive correlation between rate of energy consumption and strength of selection

pressure has already been observed in the mitochondrial genome (Shen et al., 2009). Therefore, the balitorid lineage might require lower energy consumption to obtain sufficient oxygen, and may have experienced more relaxed selection pressure than non-balitorid lineages. This interpretation is consistent with the significantly higher dN/dS value of ND4 found in the balitorid lineage relative to non-balitorid lineages of Cobitoidea. Of dN and dS, the former is considered to be related mainly to functional constraints (Kawahara and Imanishi, 2007), while the latter is considered to be affected by purifying selection (Smith et al., 2008; Toll-Riera et al., 2011). For this reason, the nearly constant dN and significantly lower dS values of ND4 in the balitorid lineage suggest that it was mostly affected by relaxed purifying selection in this lineage. Interestingly, dN and dS values of IRBP were also significantly lower along the balitorid lineage. Therefore, we hypothesize that IRBP has been simultaneously impacted in the evolution of this lineage by a strong functional constraint and relaxed purifying selection. In summary, this study supports the hypothesis that the mitochondrial ND4 gene and the nuclear IRBP gene were affected by somewhat different selective pressures. The faster evolutionary rate of ND4 and the different evolutionary pattern of IRBP in the balitorid lineage suggest that the evolution of these genes could be lineage specific, which may have contributed to the diversification of balitorid fishes.

4.2.3. Ecological-group-specific evolution of genes within the Cobitoidea
For the loach lineage of the Cobitoidea, excluding balitorids, our findings indicate that the COI, ND5, EGR2B, IRBP and RH1 genes have significantly higher mean dS values in the tropical group than in the other two groups. Loaches in the superfamily Cobitoidea are distributed mainly in Eurasia (Nelson, 2006). Based on climate information derived from FishBase (Froese and Pauly, 2011), temperate, subtropical, and tropical species account for 19.1%, 24.6% and 56.3% of the Cobitoidea (the five families reconstructed in Section 3.2), respectively. After excluding species of Balitoridae, the percentages become 23.3%, 24.8% and 51.9%, respectively. The significantly higher mean dS values of the five genes COI, ND5, EGR2B, IRBP and RH1 in the tropical group, relative to the temperate and subtropical groups, suggest that species of the tropical group may have experienced very strong purifying selection during their diversification. The great diversity of species in the tropical group (over half of the Cobitoidea) may also indicate that ancestrally this clade was highly adapted to a tropical climate and diversified in this zone. Conversely, the significantly lower mean dS values observed both in the subtropical group for five genes (COI, ND5, EGR2B, IRBP and RH1) and in the temperate group for four genes (COI, EGR2B, IRBP and RH1) suggest that these genes were affected by the relaxation of purifying selection in these two groups. Since it has already been hypothesized that rates of energy consumption are positively correlated with the strength of mitochondrial genome selection (Shen et al., 2009), the total energy consumption by species of the subtropical and temperate groups is likely less than that of species of the tropical group.
Therefore, it would be reasonable to conclude that fishes living in subtropical and temperate environments may have experienced more relaxed selection than species inhabiting the more rigorous conditions of higher water temperatures and/or other environmental factors. The different evolutionary rates of genes in these three climate groups suggest that the evolution of these genes could also be ecological-group-specific and may have played important roles in the adaptive evolution and diversification of these groups.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2012.07.040.


Acknowledgments
We thank Yuyu Xiong, Caiping Liao, Yu Zeng, Deqing Tan, Zhuocheng Zhou, and Zhifu Tian for help in collecting samples. Thanks are also due to Ming Zou, Baocheng Guo and Wenjin Tao for help in data analyses. We thank two anonymous reviewers for their critical comments, which significantly improved the manuscript. Funding support was provided by grants from the National Natural Science Foundation of China (NSFC 30700072 and 31061160185), and from the Knowledge Innovation Program of the Chinese Academy of Sciences (Y05E08). Our work was also a part of the Cypriniformes Tree of Life (CToL) project funded by the U.S. National Science Foundation (NSF EF-0431326 to R. L. Mayden).

References
Agrawal, A., Eastman, Q.M., Schatz, D.G., 1998. Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system. Nature 394, 744–751.
Akashi, H., 1994. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 136, 927–935.
Akashi, H., 2003. Translational selection and yeast proteome evolution. Genetics 164, 1291–1303.
Akashi, H., Eyre-Walker, A., 1998. Translational selection and molecular evolution. Curr. Opin. Genet. Dev. 8, 688–693.
Baer, C., Miyamoto, M.M., Denver, D.R., 2007. Mutation rate variation in multicellular eukaryotes: causes and consequences. Nat. Rev. Genet. 8, 619–631.
Bǎnǎrescu, P.M., Nalbant, T.T., 1995. A generical classification of Nemacheilinae with description of two new genera (Teleostei: Cypriniformes: Cobitidae). Trav. Mus. Hist. Nat. “Grigore Antipa” 35, 429–496.
Berg, L.S., 1940. Classification of fishes, both recent and fossil. Trav. Inst. Zool. Acad. Sci. USSR 5, 87–345.
Britten, R.J., 1986. Rates of DNA sequence evolution differ between taxonomic groups. Science 231, 1393–1398.
Bromham, L., 2009. Why do species vary in their rate of molecular evolution? Biol. Lett. 5, 401–404.
Chen, Y.Y., Tang, W.Q., 2000. Family Homalopteridae. In: Yue, P.Q. (Ed.), Fauna Sinica, Osteichthyes, Cypriniformes III. Science Press, Beijing, pp. 438–563.
Chen, J., Zhu, S., 1984. Phylogenetic relationships of subfamilies in the loach family Cobitidae (Pisces). Acta Zootaxon. Sin. 9, 201–207.
Chen, W.J., Miya, M., Saitoh, K., Mayden, R.L., 2008. Phylogenetic utility of two existing and four novel nuclear gene loci in reconstructing tree of life of ray-finned fishes: the order Cypriniformes (Ostariophysi) as a case study. Gene 423, 125–134.
Chen, W.J., Lheknim, L., Mayden, R.L., 2009. Molecular phylogeny of the Cobitoidea (Teleostei: Cypriniformes) revisited: position of enigmatic loach Ellopostoma resolved with six nuclear genes. J. Fish Biol. 75, 2197–2208.
Chomyn, A., 2001. Mitochondrial genetic control of assembly and function of complex I in mammalian cells. J. Bioenerg. Biomembr. 33, 251–257.
Czelusniak, J., Goodman, M., Hewett-Emmett, D., Weiss, M.L., Venta, P.J., Tashian, R.E., 1982. Phylogenetic origins and adaptive evolution of avian and mammalian haemoglobin genes. Nature 298, 297–300.
Dann, S.G., Allison, W.T., Levin, D.B., Taylor, J.S., Hawryshyn, C.W., 2004. Salmonid opsin sequences undergo positive selection and indicate an alternate evolutionary relationship in Oncorhynchus. J. Mol. Evol. 58, 400–412.
Duret, L., 2002. Evolution of synonymous codon usage in metazoans. Curr. Opin. Genet. Dev. 12, 640–649.
Duret, L., Mouchiroud, D., 1999. Expression pattern and, surprisingly, gene length shape codon usage in Caenorhabditis, Drosophila, Arabidopsis. Proc. Natl. Acad. Sci. U. S. A. 96, 4482–4487.
Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
Froese, R., Pauly, D. (Eds.), 2011. FishBase. World Wide Web electronic publication. <www.fishbase.org> (accessed 06. 2011).
Galtier, N., Gouy, M., Gautier, C., 1996. SEAVIEW and PHYLO-WIN: two graphic tools for sequence alignment and molecular phylogeny. Comput. Appl. Biosci. 12, 543–548.
Guindon, S., Delsuc, F., Dufayard, J.F., Gascuel, O., 2009. Estimating maximum likelihood phylogenies with PhyML. Methods Mol. Biol. 537, 113–137.
Hargrave, P.A., McDowell, J.H., 1992. Rhodopsin and phototransduction—a model system for G protein-linked receptors. FASEB J. 6, 2323–2331.
Hillis, D.M., Pollock, D.D., McGuire, J.A., Zwickl, D.J., 2003. Is sparse taxon sampling a problem for phylogenetic inference? Syst. Biol. 52, 124–126.
Hora, S.L., 1930. Ecology, biomass and evolution of the torrential fauna, with special reference to the organs of attachment. Philos. Trans. R. Soc. London Ser. B 218, 171–282.
Hora, S.L., 1932. Classification, bionomics and evolution of Homalopteridae fishes. Mem. Indian Mus. 12, 263–330.
Huelsenbeck, J.P., Ronquist, F., 2001. MRBAYES: Bayesian inference of phylogenetic trees. Bioinformatics 17, 754–755.
Ikemura, T., 1985. Codon usage and tRNA content in unicellular and multicellular organisms. Mol. Biol. Evol. 2, 13–34.


Kawahara, Y., Imanishi, T., 2007. A genome-wide survey of changes in protein evolutionary rates across four closely related species of Saccharomyces sensu stricto group. BMC Evol. Biol. 7, 1–13.
Kottelat, M., 1994. Vaillantella cinnamomea, a new species of balitorid loach from Eastern Borneo. Japan J. Ichthyol. 40, 427–431.
Kottelat, M., 2001. Freshwater Fishes of Northern Vietnam. The World Bank, Washington, DC.
Kottelat, M., 2004. Botia kubotai, a new species of loach (Teleostei: Cobitidae) from the Ataran River basin (Myanmar), with comments on botiine nomenclature and diagnosis of a new genus. Zootaxa 401, 1–18.
Laird, C.D., McConaughy, B.L., McCarthy, B.J., 1969. Rate of fixation of nucleotide substitutions in evolution. Nature 224, 149–154.
Larmuseau, M.H., Huyse, T., Vancampenhout, K., Van Houdt, J.K., Volckaert, F.A., 2009. High molecular diversity in the rhodopsin gene in closely related goby fishes: a role for visual pigments in adaptive speciation? Mol. Phylogenet. Evol. 55, 689–698.
Li, W.H., 1997. Molecular Evolution. Sinauer Associates, Sunderland, USA.
Liu, H., Tzeng, C.S., Teng, H.Y., 2002. Sequence variations in the mitochondrial DNA control region and their implications for the phylogeny of the Cypriniformes. Can. J. Zool. 80, 569–581.
Liu, S.Q., Zhang, J.B., Tang, Q.Y., Liu, H.Z., 2010. Phylogenetic relationships among Cobitoidea based on mitochondrial ND4 and ND5 gene sequences. Zool. Res. 31, 221–229.
Lynch, M., Conery, J.S., 2000. The evolutionary fate and consequences of duplicate genes. Science 290, 1151–1155.
Martin, A.P., Palumbi, S.P., 1993. Body size, metabolic rate, generation time, and the molecular clock. Proc. Natl. Acad. Sci. U. S. A. 90, 4087–4091.
Mayden, R.L., Chen, W.J., 2010. The world's smallest vertebrate species of the genus Paedocypris: a new family of freshwater fishes and the sister group to the world's most diverse clade of freshwater fishes (Teleostei: Cypriniformes). Mol. Phylogenet. Evol. 57, 152–175.
Mayden, R.L., et al., 2008. Inferring the tree of life of the order Cypriniformes, the Earth's most diverse clade of freshwater fishes: implications of varied taxon and character sampling. J. Syst. Evol. 46, 424–438.
Mayden, R.L., et al., 2009. Reconstructing the phylogenetic relationships of the earth's most diverse clade of freshwater fishes—order Cypriniformes (Actinopterygii: Ostariophysi): a case study using multiple nuclear loci and the mitochondrial genome. Mol. Phylogenet. Evol. 51, 500–514.
Mullis, K.B., Faloona, F.A., 1987. Specific synthesis of DNA in vitro via a polymerase-catalyzed chain reaction. Methods Enzymol. 155, 335–350.
Nalbant, T.T., 1963. A study of the genera of Botiinae and Cobitinae (Pisces, Ostariophysi, Cobitidae). Trav. Mus. Hist. Nat. “Grigore Antipa” 4, 343–375.
Nalbant, T.T., 2002. Sixty million years of evolution. Part one: family Botiidae (Pisces: Ostariophysi: Cobitoidea). Trav. Mus. Hist. Nat. “Grigore Antipa” 44, 309–344.
Nalbant, T.T., Bǎnǎrescu, P., 1977. Vaillantellinae, a new sub-family of Cobitidae (Pisces, Cypriniformes). Zool. Meded. 52, 99–105.
Nelson, J.S., 2006. Fishes of the World, 4th ed. John Wiley and Sons Inc., New York.
Piruat, J.I., Lopez-Barneo, J., 2005. Oxygen tension regulates mitochondrial DNA-encoded complex I gene expression. J. Biol. Chem. 280, 42676–42684.
Posada, D., Crandall, K.A., 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818.
Ramaswami, L.S., 1953. Skeleton of cyprinoid fishes in relation to phylogenetic studies. V. The skull and the gasbladder capsule of the Cobitidae. Proc. Natl. Inst. Sci. India 18, 519–538.
Regan, C.T., 1911. The classification of the teleostean fishes of the order Ostariophysi. I. Cyprinidae. Ann. Mag. Nat. Hist. 8, 13–32.
Roberts, T.R., 1982. The Bornean Gastromyzontine fish genera Gastromyzon and Glaniopsis (Cypriniformes, Homalopteridae), with descriptions of new species. Proc. Calif. Acad. Sci. 42, 497–524.
Roberts, T.R., 1989. The freshwater fishes of western Borneo (Kalimantan Barat, Indonesia). Mem. Calif. Acad. Sci. 14, 1–210.
Saiki, R.K., et al., 1988. Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase. Science 239, 487–491.
Saitoh, K., et al., 2006. Mitogenomic evolution and interrelationships of the Cypriniformes (Actinopterygii: Ostariophysi): the first evidence toward resolution of higher-level relationships of the world's largest freshwater fish clade based on 59 whole mitogenome sequences. J. Mol. Evol. 63, 826–841.
Sambrook, E., Fritsch, F., Maniatis, T., 1989. Molecular Cloning. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York.
Sawada, Y., 1982. Phylogeny and zoogeography of the superfamily Cobitoidea (Cyprinoidei, Cypriniformes). Mem. Fac. Fish., Hokkaido Univ. 28, 65–223.
Schatz, D.G., 1999. Transposition mediated by RAG1 and RAG2 and the evolution of the adaptive immune system. Immunol. Res. 19, 169–182.
Seehausen, O., et al., 2008. Speciation through sensory drive in cichlid fish. Nature 455, U620–U623.
Sharp, P.M., Li, W.H., 1987. The codon adaptation index — a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15, 1281–1295.
Shen, Y.Y., Shi, P., Sun, Y.B., Zhang, Y.P., 2009. Relaxation of selective constraint on avian mitochondrial DNA following the degeneration of flight ability. Genome Res. 19, 1760–1765.
Siebert, D.J., 1987. Interrelationships among families of the order Cypriniformes (Teleostei). New York, City University of New York. Ph.D.
Šlechtová, V., Bohlen, J., Freyhof, J., Rab, P., 2006. Molecular phylogeny of the Southeast Asian freshwater fish family Botiidae (Teleostei: Cobitoidea) and the origin of polyploidy in their evolution. Mol. Phylogenet. Evol. 39, 529–541.


Šlechtová, V., Bohlen, J., Tan, H.H., 2007. Families of Cobitoidea (Teleostei: Cypriniformes) as revealed from nuclear genetic data and the position of the mysterious genera Barbucca, Psilorhynchus, Serpenticobitis and Vaillantella. Mol. Phylogenet. Evol. 44, 1358–1365.
Šlechtová, V., Bohlen, J., Perdices, A., 2008. Molecular phylogeny of the freshwater fish family Cobitidae (Cypriniformes: Teleostei): delimitation of genera, mitochondrial introgression and evolution of sexual dimorphism. Mol. Phylogenet. Evol. 47, 812–831.
Smith, L.L., Fessler, J.L., Alfaro, M.E., Streelman, J.T., Westneat, M.W., 2008. Phylogenetic relationships and the evolution of regulatory gene sequences in the parrotfishes. Mol. Phylogenet. Evol. 49, 136–152.
Sun, Y.B., Shen, Y.Y., Irwin, D.M., Zhang, Y.P., 2011. Evaluating the roles of energetic functional constraints on teleost mitochondrial-encoded protein evolution. Mol. Biol. Evol. 28, 39–44.
Tamura, K., Dudley, J., Nei, M., Kumar, S., 2007. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24, 1596–1599.
Tan, H.H., 2006. The Borneo Suckers. Revision of the Torrent Loaches of Borneo (Balitoridae: Gastromyzon, Neogastromyzon). Natural History Publications (Borneo), Kota Kinabalu.
Tang, Q., Liu, H., Mayden, R., Xiong, B., 2006. Comparison of evolutionary rates in the mitochondrial DNA cytochrome b gene and control region and their implications for phylogeny of the Cobitoidea (Teleostei: Cypriniformes). Mol. Phylogenet. Evol. 39, 347–357.
Tao, W., Zou, M., Wang, X., Gan, X., Mayden, R.L., He, S., 2010. Phylogenomic analysis resolves the formerly intractable adaptive diversification of the endemic clade of East Asian Cyprinidae (Cypriniformes). PLoS One 5, e13508.
Terai, Y., Mayer, W.E., Klein, J., Tichy, H., Okada, N., 2002. The effect of selection on a long wavelength-sensitive (LWS) opsin gene of Lake Victoria cichlid fishes. Proc. Natl. Acad. Sci. U. S. A. 99, 15501–15506.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Toll-Riera, M., Laurie, S., Alba, M.M., 2011. Lineage-specific variation in intensity of natural selection in mammals. Mol. Biol. Evol. 28, 383–398.
Vallender, E.J., Lahn, B.T., 2004. Positive selection on the human genome. Hum. Mol. Genet. 13, 245–254.
Wang, M.H., Zhang, X.Z., Zhao, H.B., Wang, Q.S., Pan, Y.C., 2009. FoxO gene family evolution in vertebrates. BMC Evol. Biol. 9, 222.
Weber, M., de Beaufort, L.F., 1916. The Fishes of the Indo-Australian Archipelago. E.J. Brill, Leiden.
Whelan, S., Goldman, N., 1999. Distributions of statistics used for the comparison of models of sequence evolution in phylogenetics. Mol. Biol. Evol. 16, 1292–1299.
Wright, F., 1990. The ‘effective number of codons’ used in a gene. Gene 87, 23–29.
Yang, Z.H., 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13, 555–556.
Yang, Z.H., 2006. Computational Molecular Evolution. Oxford University Press, Oxford, UK.
Yang, Z.H., 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol. Biol. Evol. 24, 1586–1591.
Yang, Z.H., Nielsen, R., 2002. Codon-substitution models for detecting molecular adaptation at individual sites along specific lineages. Mol. Biol. Evol. 19, 908–917.
Yang, Z.H., Nielsen, R., Goldman, N., Pedersen, A.M.K., 2000. Codon-substitution models for heterogeneous selection pressure at amino acid sites. Genetics 155, 431–449.
Yang, Z.H., Wong, W.S.W., Nielsen, R., 2005. Bayes empirical Bayes inference of amino acid sites under positive selection. Mol. Biol. Evol. 22, 1107–1118.
Yokoyama, S., 2000. Molecular evolution of vertebrate visual pigments. Prog. Retin. Eye Res. 19, 385–419.
Yokoyama, S., Tada, T., Yamato, T., 2007. Modulation of the absorption maximum of rhodopsin by amino acids in the C-terminus. Photochem. Photobiol. 83, 236–241.
Zhang, J., Nielsen, R., Yang, Z., 2005. Evaluation of an improved branch-site likelihood method for detecting positive selection at the molecular level. Mol. Biol. Evol. 22, 2472–2479.
Zhu, S.Q., 1995. Synopsis of Freshwater Fishes of China. Jiangsu Science and Technology Press, Nanjing.