A multilocus assessment of nuclear and mitochondrial sequence data elucidates phylogenetic relationships among European spirlins (Alburnoides, Cyprinidae)


Molecular Phylogenetics and Evolution xxx (2015) xxx–xxx

Contents lists available at ScienceDirect

Molecular Phylogenetics and Evolution journal homepage: www.elsevier.com/locate/ympev

Soňa Stierandová a,⇑, Jasna Vukić b, Ekaterina D. Vasil'eva c, Stamatis Zogaris d, Spase Shumka e, Karel Halačka a, Lukáš Vetešník a, Miroslav Švátora b, Michal Nowak f, Tihomir Stefanov g, Ján Koščo h, Jan Mendel a

a Institute of Vertebrate Biology of the Academy of Sciences of the Czech Republic, v.v.i., Květná 8, 60365 Brno, Czech Republic
b Faculty of Science, Charles University, Viničná 7, 12844 Praha, Czech Republic
c Zoological Museum, Bolshaya Nikitskaya 6, 125009 Moscow, Russia
d Hellenic Centre for Marine Research, Institute of Marine Biological Resources and Inland Waters, 46, 7 Km Athens-Sounio Ave, 19013 Anavissos, Attiki, Greece
e Agricultural University of Tirana, Tirana, Albania
f University of Agriculture in Kraków, Department of Ichthyobiology and Fisheries, T. Spiczakowa 6, 30-199 Kraków, Poland
g National Museum of Natural History, Bulgarian Academy of Sciences, Tsar Osvoboditel Blvd. 1, 1000 Sofia, Bulgaria
h University of Prešov, Department of Ecology, 17. Novembra 1, 08116 Prešov, Slovakia

Article info

Article history: Received 28 April 2015; Revised 21 October 2015; Accepted 23 October 2015; Available online xxxx

Keywords: Alburnoides; Taxonomy; Phylogeography; Mitochondrial and nuclear markers

Abstract

The phylogenetic relationships and taxonomy of the spirlins in the genus Alburnoides are examined by comparative sequencing analysis of mitochondrial and nuclear markers. Molecular analyses revealed 17 Eurasian lineages divided into two main clades, termed the Ponto-Caspian and European in accordance with the lineage distribution. Indel diagnostics of the β-actin and S7 markers and translation of cyt b into the amino acid chain were evaluated as reliable identification tools for most of the recognised lineages. In most cases, lineage richness is closely connected with the existence of known glacial refugia. The underestimation of species richness in the genus Alburnoides is confirmed: the genetic analyses support the validity of 11 morphologically accepted species and, in addition, reveal four phylogenetic lineages that require description as separate species. The distribution area of the nominotypical species A. bipunctatus s. stricto is newly defined. Two diverging phylogenetic lineages, A. ohridanus and the A. prespensis complex, were observed in the Southeast Adriatic Freshwater Ecoregion, which is confirmed as a hotspot of endemic biodiversity. A. ohridanus demonstrates high divergence from the A. prespensis complex, which is represented by three similar mitochondrial lineages with the same nuclear haplotypes and sympatric occurrence. Range-restricted endemism was confirmed for at least seven species. The Albanian river systems, as well as the wider Ponto-Caspian basin, pose difficulties for definitive species delineation and reveal gaps in the understanding of microevolutionary processes; these areas require further investigation. © 2015 Published by Elsevier Inc.

1. Introduction

Spirlins are small cyprinid fishes of the genus Alburnoides Jeitteles, 1861, widespread in temperate Eurasian rivers and lakes. Different European populations of these similar-looking fishes have long been included in the same species, Alburnoides bipunctatus (Bloch, 1782), originally described from the Weser River near Minden, Germany. Among numerous described subspecies or

This paper was edited by the Associate Editor G. Orti.

⇑ Corresponding author. Fax: +420 543 211 346.

E-mail address: [email protected] (S. Stierandová).

varieties (Kottelat, 1997), Kottelat and Freyhof (2007) recognised three valid European species of Alburnoides, namely A. bipunctatus, A. ohridanus (Karaman, 1928), and A. prespensis (Karaman, 1924). However, they noted that this number was "much underestimated" and that, despite similar morphological characters, populations from Montenegro, Greece, Bulgaria, Crimea, and the Caucasus most likely represent distinct species, as indicated by molecular studies. Recently, numerous taxonomic studies on Alburnoides resulted in descriptions of several new species, as well as in the re-evaluation of some previously described intraspecific units as separate species (Bogutskaya and Coad, 2009; Coad and Bogutskaya, 2009, 2012; Bogutskaya et al., 2010; Šanda and

http://dx.doi.org/10.1016/j.ympev.2015.10.025 1055-7903/© 2015 Published by Elsevier Inc.

Please cite this article in press as: Stierandová, S., et al. A multilocus assessment of nuclear and mitochondrial sequence data elucidates phylogenetic relationships among European spirlins (Alburnoides, Cyprinidae). Mol. Phylogenet. Evol. (2015), http://dx.doi.org/10.1016/j.ympev.2015.10.025


Mlíkovský, 2012; Turan et al., 2013, 2014; Geiger et al., 2014). Consequently, the number of valid Alburnoides species increased from the three known to Berg (1949) to the 27 currently accepted (Eschmeyer, 2015).

In this study, we investigated all European populations. We tested the validity of the three originally described species and of the following taxa: A. devolli Bogutskaya, Zupancic and Naseka, 2010 and A. fangfangae Bogutskaya, Zupancic and Naseka, 2010 from the Albanian rivers Devolli and Osumi; A. thessalicus Stephanidis, 1950 from the Greek rivers Pinios and Vardar/Axios; A. strymonicus Chichkoff, 1940 from the Bulgarian rivers Mesta/Nestos, Toplitza and Struma/Strymon; A. maculatus (Kessler, 1859) from Crimea; A. rossicus Berg, 1924 from the rivers Dniester, Dnieper and Volga; and A. tzanevi Chichkoff, 1933 from the Bulgarian river Rezovska, western Black Sea drainage. We also examined some species from adjacent areas near the Europe–Asia frontier, including A. fasciatus (Nordmann, 1840) from the Western Caucasus and rivers of the Black Sea coast, A. kubanicus Bănărescu, 1964 from the Russian rivers Laba and Kuban, and A. eichwaldii (De Filippi, 1863) from rivers of the Lencoran Province in Azerbaijan. All newly described species are allopatric, range-restricted endemics that differ in some morphometric and meristic characters and are distinguished by combinations of characters (e.g. Bogutskaya et al., 2010).

To provide a robust framework for clarifying phylogenetic relations among local populations and for testing early taxonomic hypotheses (e.g. Perdices et al., 2003; Freyhof et al., 2005; Mendel et al., 2008; Šedivá et al., 2008), we analysed genetic variation using molecular data in combination with traditional morphological characteristics. Molecular methods of species delineation often result in findings of new cryptic fish species (e.g.
Mendel et al., 2008; Rylková et al., 2013; Buj et al., 2014) or testify to the validity (or invalidity) of previously described taxa (Sorokin et al., 2011; Perdices et al., 2012, 2015).

The present study is the first complete phylogenetic analysis of the European spirlins together with associated species from the eastern reaches of the European–Asian frontier. It is based on morphologically defined species units originating from their type localities and their surroundings; we implemented a comparative two-genome sequencing analysis of both coding and noncoding regions. The main aims of this study are: (a) to recognise genetic variability of the analysed populations and identify evolutionary lineages; (b) to evaluate presumed cryptic diversity and to map the extent of hybridization and species endemism in the genus; (c) to compare recent taxonomy with new findings and to propose potential taxonomic revisions on the basis of phylogeographic patterns; and (d) to offer reliable molecular diagnostics of spirlin lineages for future population research.

2. Materials and methods

2.1. Sample collection

Altogether, 343 spirlin specimens were collected from 2008 to 2013 at 117 localities in 18 countries (Table 1). The studied samples cover the main distribution areas of Alburnoides taxa, ranging from France to the Caspian Sea drainages. For ease of interpretation, an interactive map (Fig. S1) was prepared in the Google Earth application (free download: http://www.google.com/earth/download/ge/agree.html). Individual marks show collecting localities or type localities (note: tick "borders and marks" for better orientation; folders contain coloured pins indicating species). To provide as comprehensive an insight as possible, we added location marks for other related species from adjacent Asian localities (A. manyanensis, A. petrubanarescui, A. namaki, A. idignensis, A. nicolausi, A. varentsovi, A. holciki; Iran1-3, A. sp. and A. cf.
eichwaldii; see Seifali et al., 2012; coloured triangles and bubbles in the folder Eurasia). However, these are not subjects of the present study. For a more complete comparison, sequences from the GenBank database (Y10445, AF090740-42) and certain specimens from Iran (HQ658863-93) were included in this study. The species Rutilus rutilus (cyt b FJ025072, β-actin DQ061948, S7 AY325785) and Leuciscus idus (cyt b AY026397, β-actin DQ061947, S7 HM560493) were selected as outgroups (Perea et al., 2010; Table 1). Voucher specimens are deposited in the collections of the Institute of Vertebrate Biology, v.v.i. (Brno, Czech Republic), the National Museum in Prague and the Zoological Museum of the Moscow State University.

2.2. DNA extraction, PCR amplification and sequencing

Total genomic DNA was extracted from a small piece of the pectoral fin using a commercial kit (Geneaid). Sequences of cytochrome b (cyt b), β-actin, the first intron of the S7 r-protein (S7) and the recombination activating gene (RAG1) were amplified by polymerase chain reaction (PCR) with the primers specified in Table S1. PCRs were performed in a 50 µl volume containing 10 mM Tris–HCl, 50 mM KCl, 0.1% Triton X-100, 1.5 mM MgCl2, 0.2 mM dNTPs, 0.2 µM of each primer, 2.5 U Taq DNA polymerase and approximately 100–500 ng of genomic DNA. The nuclear gene for S7 was amplified using a commercial kit (PPP Master Mix; Top-Bio) according to the manufacturer's instructions. Reactions were performed in a Mastercycler pro (Eppendorf). The PCR products were visualised by gel electrophoresis using Gold View and 1.7% agarose gels. The amplicons were purified by PEG/Mg/NaAc precipitation (26% polyethylene glycol, 6.5 mM MgCl2·6H2O, 0.6 M NaAc·3H2O). Direct sequencing of purified PCR products, in a few cases with cyt b sequencing primers, was performed with the BigDye™ Terminator Cycle Sequencing Ready Reaction Kit version 1.1 (Applied Biosystems) according to the manufacturer's instructions, and the products were purified by EtOH/EDTA precipitation.
The sequencing was performed on an ABI PRISM 310 Genetic Analyser (Applied Biosystems) and by a commercial service (Macrogen, Korea). All PCR amplicons were sequenced repeatedly from both directions to ensure high-quality reads. The DNA sequences were edited and aligned using the ClustalW algorithm implemented in MEGA 5.2 (Tamura et al., 2011). The accuracy of the sequences was confirmed by comparison with the NCBI database.

2.3. Phylogenetic analyses

DNA polymorphism in the sequence dataset, namely haplotype diversity (Hd), nucleotide diversity (Pi) (Nei, 1987) and the number of nonsynonymous substitutions, was assessed using DnaSP v5.10.1 (Librado and Rozas, 2009). The multiple alignment was assembled and parsimony-informative positions were assessed in MEGA 5.2 (Tamura et al., 2011). The jModelTest 2 program was used to ascertain the best-fit model of nucleotide substitution for the separate nuclear and mitochondrial regions (Darriba et al., 2012; Guindon and Gascuel, 2003) under the Akaike information criterion. Phylogenetic relationships were examined using the neighbour joining (NJ) algorithm, maximum parsimony (MP), maximum likelihood (ML) and Bayesian inference (BI). The alignments were imported into the computer programs IVB cluster and PAUP 4.0B.10 (Swofford and Selander, 2002), MrBayes 3.1.2 (Ronquist and Huelsenbeck, 2005) and into the online program PhyML (Guindon et al., 2010) for phylogenetic analysis. The DNA distances and non-parametric bootstrap support with 1000 pseudo-replicates were determined by the NJ method. The MP phylogram was constructed by unweighted parsimony analysis using a heuristic search, with bootstrap support from 1000 replications. The ML search was performed under the best-fit model with the heuristic algorithm on 1000 bootstrap


Table 1
Species examined in the study, source of tissue samples used, number of mitochondrial and nuclear haplotypes.

Lineage/species: River, main drainage, country (H1 – H183/N, N1 – N36/N, N37 – N92/N, N93 – N137/N)†&

Outgroups
Leuciscus idus: Danube, SK
Rutilus rutilus: Rhona, FR

Ingroup taxa
Lineage_I, A. bipunctatus: Allier, Loire, FR (H1/1, H8/1, N1/2, N52/1, N127/1); Bečva, Danube, CZ (H11/1, N101/1); Bednarka, Nidzica, PL (H20/2); Bodva, Danube, SK (H20/6, H34/1, N7/1, N54/1, N98/1); Bourbince, Loire, FR (H8/3, H9/1, N1/1, N127/1); Charente, FR (H1/2, H8/3, N127/1, N128/1); Danube, DE (H46/1); Dyje, Danube, CZ (H1/1, H4/2, H40/1, H46/1, N2/1, N3/1, N8/1, N10/1, N56/1, N93/1); Gamlitzbach, Danube, AT (H5/1); Hornad, Danube, SK (H20/1, H24/1, H33/1, H44/1); Ida, Danube, SK (H20/5, H40/1, N101/2); Inn, Danube, DE (H46/1, N1/1); Ipel, Danube, SK (H1/1, H30/1); Isar, Danube, DE (H2/1); Jesiolka, Vistula, PL (H20/7, N9/1, N60/1, N93/1); Jevíčka, Danube, CZ (H1/1, H3/1, H7/1, H12/1, H45/1, N5/1, N7/1, N55/1, N93/1); Jevišovka, Danube, CZ (H42/4, N101/1); Jihlava, Danube, CZ (H1/2, H10/1, H46/1, H47/1, H49/1, N101/1); Kocába, Elbe, CZ (H42/2, H43/1, N101/2); Laborec, Danube, SK (H26/1, H27/1, H31/1, H38/1, H39/1); Lech, Danube, DE (H1/1, H41/1, N1/1, N52/1, N101/1); Levočský potok, Danube, SK (H24/1); Morava, Danube, CZ (H1/2, H13/1, H18/1, H19/1, N1/1, N93/1, N101/1, N102/1); Muráň, Danube, SK (H35/1); Nida, Vistula, PL (H20/2, H23/1); Olšava, Danube, SK (H20/5, H21/1, H22/2, H29/1, H36/1, N98/1, N93/1, N101/1); Olše, Odra, CZ (H1/3, N101/1); Rhona, Rhona, FR (H1/2, H8/1, N1/2, N57/1, N127/1); Rimava, Danube, SK (H32/1, H37/1); Rimavica, Danube, SK (H1/1, H20/1); Rudava, Danube, CZ (H1/2, H6/1, H40/1, H46/1, H50/1, N93/2, N101/1); San, Vistula, PL (H20/1); Sikenica, Danube, SK (H1/1, H17/1); Skawinka Wisla, Vistula, PL (H20/1, N100/1); Solinka, Vistula, PL (H15/1, N99/1); Stara Rieka, Danube, SK (H1/1, H30/1); Stara Rzeka, Vistula, PL (H20/1, N99/1); Strniez, Dniestr, PL (H1/1); Svratka, Danube, CZ (H1/2, H48, N101/1); Ublanka, Danube, SK (H20/3, H25/1, H28/1); Vlára, Danube, SK (H1/2, H16/1, N101/1); Vsetínská Bečva, Danube, CZ (H14/1, N1/1, N53/1, N101/1)
Lineage_II, A. tzanevi: Rezovska, BG (H51/5, N32/1, N59/1, N121/1); Veleka, BG (H51/2, H52/1, N31/2, N63/1, N120/1, N122/1)
Lineage_III, A. rossicus: Dniester, UA (H57/1, H58/1, H59/1, N16/3, N66/1, N67/1, N68/1, N115/2, N116/1, N117/1); Lasva, Volga, RU (H53/1, H56/1, N16/1, N69/1, N115/1); Lovat, Neva, RU (H55/1, N115/1); Protva, Volga, RU (H53/1, H54/1, N16/2, N70/1, N115/1)
Lineage_IV, A. sp. 1: Bosna, Danube, BA (H84/1, H86/1); Illova, Danube, HR (H65/1); Iscica, Danube, SI (H60/1, H61/1, H62/1, H70/1, H72/1, N1/2, N6/2, N71/1, N101/1); Korona, Danube, HR (H77/1, N62/1, N93/1); Kupa, Danube, HR (H60/2, 76/1, H79/1, H81/2, N1/1, N101/1); Petrncica, Danube, HR (H71/1, H81/1); Rijeka, Danube, HR (H66/1, H74/1); Rogačica, Danube, RS (H69/1); Satla, Danube, HR (H80/1, N97/1); Sava, Danube, HR (H60/1, H64/1, H78/1, N101/1); Subocka, Danube, HR (H60/1, H81/1, N6/1); Una, Danube, BA (H60/3, H63/1, H75/1); Usora, Danube, BA (H82/2, H83/1); Vrbanja, Danube, BA (H67/1, H73/2, H85/1, H87/1, N1/1, N50/1, N124/1); Vrbas, Danube, BA (H67/2, H68/1, H73/1, H82/1)
Lineage_V, A. sp. 2: Archar, Danube, BG (H88/2, H96/2, H98/1, H102/2, H108/3, N1/2, N61/1, N124/1); Biely Vit, Danube, BG (H97/1); Bosna, Danube, BA (H107/1, N1/1); Cerna, Danube, RO (H106/1); Crni Timok, Danube, RS (H88/1); Dičina, Danube, RS (H88/1, H90/1, H113/1, N51/1); Džepska, Danube, RS (H88/1, H89/1, H93/1, N1/1, N4/1, N124/1); Gilort, Danube, RO (H95/1, N1/1, N129/1); Jantra, Danube, BG (H109/1); Kolubara, Danube, RS (H88/2, H111/1, H112/1); Korona, Danube, HR (H105/1); Lom, Danube, BG (H103/1); Malki Iskar, Danube, BG (H101/1); Mehavica, Danube, RO (H104/1); Nejkovska, Kamca, BG (H99/1, H100/1); Nera, Danube, RO (H96/1); Ogosta, Danube, BG (H92/1); Olt, Danube, RO (H88/1, H110/1); Osam, Danube, BG (H88/1, H96/2, N6/1, N58/1, N119/1); Rogačica, Danube, RS (H108/1); Svrljiški Timok, Danube, RS (H88/1); Timok, Danube, RS (H114/1); Trešnjica, Danube, RS (H88/1, H91/1, N7/1, N49/1); Vidima, Danube, BG (H94/1, H109/1)
Lineage_VI, A. maculatus: Alma, UA (H115/1, H116/1, H118/1, H119/1, N14/3, N15/1, N64/1, N101/1, N118/1); Tchernaya, UA (H117/1, N14/1, N65/1, N114/1)
Lineage_VII, A. ohridanus: Erzen, AL (H123/1, N38/1, N123/1); Fani, Mat, AL (H121/2, N11/1, N43/1, N93/1); Gomsiqe, Ohrid-Drin-Skadar, AL (H120/1, H121/1, H124/1, N11/1, N48/1, N76/1, N77/1, N123/1); Ishëm, AL (H121/1); Ohrid Lake, Ohrid-Drin-Skadar, AL (H120/3, H125/1, N11/2, N30/1, N78/1, N103/1); Zezë, Ishëm, AL (H122/1)
Lineage_VIII, A. prespensis complex: Borshi, AL (H126/1, N12/1, N46/1, N93/1); Devolli, Semani, AL (H129/1, N12/1); Dukati, AL (H127/1, N12/1, N47/1, N93); Vjosa, AL (H128/1)
Lineage_IX, A. prespensis complex: Devolli, Semani, AL (H130/2, H131/1, H132/1, H133/1, H137/1, N12/3, N13/1, N37/2, N39/1, N42/1, N44/1, N45/1, N93/4, N94/1, N123/1); Osumi, Semani, AL (H130/1, N12, N45/1, N93/1); Prespa Lake, AL (H134/1, H135/1, H138/1, N12/4, N37/2, N40/1, N42/1, N93/2, N95/1); Shkumbini, AL (H136/1)
Lineage_X, A. sp. 3: Aliakmon, GR (H146/1); Pinios, GR (H149/1, H150/1, H151/1, H152/1, H153/1, N18/4, N84/1, N130/1, N131/1); Vardar, MK (H139/1, H140/1, H141/1, H142/1, H144/1, H145/1, H147/1, H148/1, N17/1, N18/3, N82/1, N83/1, N132/2, N134/1); Venetikos, Aliakmon, GR (H143/1, N18/1, N133/1)
Lineage_XI, A. strymonicus: Agios, Struma, GR (H154/1, N20/1); Dospat, Mesta, BG (H157/1, N21/1, N81/1, N135/1); Mesta, GR (H157/5, N21/3); Milopotamus, Struma, GR (H156/1); Struma, GR (H154/3, H155/1, H158/1, H159/1, N19/1, N20/2, N21/2, N85/1, N136/1, N137/1)
Lineage_XII, A. fasciatus: Ashe, RU (H161/1, N28/1, N79/1, N111/1); Shakhe, RU (H160/1, H162/1, N29/2, N80/2, N112/1, N113/1)
Lineage_XIII, A. kubanicus: Abin, Kuban, RU (H163/1, H164/1, H165/1, H167/1, N22/1, N23/2, N27/1, N73/3, N101/1, N104/3); Kuban, RU (H166/1, N23/1, N72/1, N89/1, N104/1)
Lineage_XIV, A. sp. 4: Sperchios, GR (H168/1, H169/1, H170/1, H171/1, H172/1, N33/2, N34/1, N35/1, N36/2, N90/2, N91/1, N92/2, N125/4, N126/1)
Lineage_XV, A. prespensis complex: Devolli, Semani, AL (H176/1, N12/1, N42/1, N93/1); Osumi, Semani, AL (H173/3, H174/1, H175/1, H176/1, N11/1, N12/3, N41/1, N42/3, N45/1, N93/3, N96/1, N123/2)
Lineage_XVI, A. eichwaldii: Borsula, Kura, AZ (H177/1, H179/2, N24/3, N87/1, N88/1, N105/1); Kurakcay, Kura, AZ (H178/1, N24/1, N86/1, N108/1)
Lineage_XVII, A. sp. 5: Agcay, AZ (H180/1, N26/1, N106/1); Tugcay, AZ (H182/1, H183, N23/2, N75/1, N109/1, N110/1); Xizi, Tugcay, AZ (H181/5, N23/1, N25/2, N26/1, N74/1, N75/2, N106/2, N107/1)

AL, Albania; AT, Austria; AZ, Azerbaijan; BA, Bosnia and Herzegovina; BG, Bulgaria; CZ, Czech Republic; DE, Germany; FR, France; GR, Greece; HR, Croatia; MK, Republic of Macedonia; PL, Poland; RO, Romania; RS, Serbia; RU, Russia; SI, Slovenia; SK, Slovakia; UA, Ukraine.
Underlined rivers = common rivers for L_IV and L_V lineages; rivers in bold correspond to the individuals in the trees.
† N, number of analysed individuals; H1 – H183 = cyt b haplotypes; N1 – N36 = β-actin nuclear haplotypes; N37 – N92 = S7 nuclear haplotypes; N93 – N137 = RAG1 nuclear haplotypes (to save space, underscore characters are not used).
& If the river is identical with the main drainage, the main drainage name is not repeated.

replicates. Bayesian analysis was performed using MrBayes 3.1.2. Starting from a random tree, Metropolis-coupled Markov chain Monte Carlo sampling was performed with four chains run for 1.5 × 10^6 generations for cyt b and 1 × 10^6 generations for the merged data set of cyt b-S7-β-actin, with a sampling frequency of 100. The convergence of the chains to stationarity was checked using Tracer 1.2


(Rambaut and Drummond, 2003), and the trees generated prior to reaching stationarity were discarded as burn-in. Bayesian posterior probabilities were obtained from the 50% majority rule consensus trees. The resulting BI topology was applied for both presented trees. The program ProtTest 2.4 (Abascal et al., 2005) was used to find the most suitable evolutionary model for the translated sequences of cyt b. The Bayesian analysis of the amino acid chain was done under the model MtREV + I, with 2.5 × 10^6 generations and a sampling frequency of 100. All constructed phylograms were displayed and edited in the programs FigTree v1.2.1 (Rambaut, 2009) and TreeGraph v2 (Stöver and Müller, 2010). Inter- and intraspecific variability was calculated from uncorrected p-distances (Table 2).

To assess the treatment of gaps as phylogenetic characters, three types of analyses were compared during phylogenetic inference from the sequences of the nuclear markers: (1) gaps as missing data, (2) gaps as a fifth character state (Barriel, 1994) and (3) gaps as a separate binary character (Simmons and Ochoterena, 2000). Ogden and Rosenberg (2007), Simmons and Ochoterena (2000) and Müller (2006) determined three ways of coding gaps (SIC, simple indel coding; CIC, complex indel coding; MCIC, modified complex indel coding). Instead of the CIC procedure, we used its newer, modified version, MCIC. The indels were coded by the SeqState program (Müller, 2005) with the implemented program IndelCoder.

All heterozygous genotypes were phased by comparison with homozygous nuclear haplotypes and confirmed by the coalescent-based Bayesian method of Phase 2.1 (Stephens and Scheet, 2005; Stephens et al., 2001) as implemented in DnaSP v5.10.1. The analyses were run five times with different seeds for the random-number generator, and the consistency of gametic phase estimation across the runs was checked according to goodness-of-fit values. Each run was conducted under the parent-independent mutation model with a burn-in period of 100 followed by 1000 iterations. A haplotype network was constructed by TCS 1.21 (Clement et al., 2000) using statistical parsimony. Indels were coded as fifth-state characters and 10 mutation steps (connecting limit 95%) were established as a fixed limit for disconnection of the haplotypes.
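The third gap treatment mentioned above, gaps as separate binary characters, can be illustrated with a minimal Python sketch of simple indel coding (SIC). This is a simplified reading of SIC and not the MCIC procedure run in SeqState for this study; the function names and the toy alignment are hypothetical, and terminal gaps are treated like internal ones for brevity.

```python
import re


def find_gaps(seq):
    """Return the set of (start, end) index pairs of maximal gap runs in one aligned sequence."""
    return {(m.start(), m.end()) for m in re.finditer(r"-+", seq)}


def simple_indel_coding(alignment):
    """Code every distinct gap as one presence/absence character (simplified SIC).

    '1' = the sequence has exactly this gap,
    '?' = the sequence has a longer gap that fully covers this one (inapplicable),
    '0' = the sequence has no gap covering this region.
    """
    gaps_per_seq = {name: find_gaps(seq) for name, seq in alignment.items()}
    all_gaps = sorted(set().union(*gaps_per_seq.values()))
    matrix = {}
    for name, gaps in gaps_per_seq.items():
        row = []
        for start, end in all_gaps:
            if (start, end) in gaps:
                row.append("1")
            elif any(s <= start and end <= e for s, e in gaps):
                row.append("?")
            else:
                row.append("0")
        matrix[name] = "".join(row)
    return all_gaps, matrix


# Hypothetical toy alignment, not data from the study.
alignment = {
    "seq_A": "ACGTACGTAC",
    "seq_B": "ACG--CGTAC",
    "seq_C": "AC----GTAC",
    "seq_D": "ACGTACG-AC",
}
gaps, coded = simple_indel_coding(alignment)
for name, row in coded.items():
    print(name, row)
```

Each column of the resulting binary matrix corresponds to one distinct indel, so it can be appended to the nucleotide matrix as an extra data partition.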

3. Results and discussion

3.1. Sequence and marker characteristics

The phylogenetic relationships were derived from sequences of the complete mitochondrial gene cytochrome b (1140 bp) and three partial nuclear markers: the first intron of the S7 r-protein (564–573 bp), an exon region of RAG1 (1111 bp) and a combined exon–intron region of β-actin (833–850 bp). The obtained sequences were deposited in the GenBank database under Accession Nos. (cyt b: KM874641 – KM874640; S7: KM874641 – KM874696; RAG1: KM874697 – KM874741; β-actin: KM874742 – KM874777). Variability in the cyt b sequences was spread across the whole length of the gene, whereas variability in the noncoding markers was associated with the intron regions. Five diagnostic indels were detected in the intron region of the S7 marker, and variability was concentrated in the middle part of the sequence. The analysed sequences of β-actin contained three exons and two introns, B and C (Robalo et al., 2006). The highest variability was observed in intron C, with nine indels in total (seven diagnostic). Intron B contained four diagnostic indels. Indel diagnostics are shown in Table 3 and in the phylogram (Fig. 1). Practically no variability was observed in the nuclear exons. The coding region of the RAG1 gene was highly conserved; this marker does not separate any phylogenetic lineages. Nucleotide sequences of cyt b were translated into a chain of 380 amino acids (AA).
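The translation of cyt b into amino acids can be sketched in a few lines of Python. The sketch assumes the vertebrate mitochondrial genetic code (NCBI translation table 2), which the text does not name explicitly; the function names and the short example sequences are hypothetical. It classifies differing codons between two sequences as synonymous or nonsynonymous, which is a simplification of counting individual substitutions.

```python
from itertools import product

BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial); codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(seq):
    """Translate an in-frame nucleotide sequence; unknown codons become 'X'."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def classify_codon_differences(seq_a, seq_b):
    """Count differing codons that keep (synonymous) or change (nonsynonymous) the amino acid."""
    synonymous = nonsynonymous = 0
    usable = min(len(seq_a), len(seq_b)) // 3 * 3
    for i in range(0, usable, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if CODON_TABLE.get(codon_a, "X") == CODON_TABLE.get(codon_b, "X"):
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


# Hypothetical 12 bp fragments, not sequences from the study.
fragment_a = "ATGACATGACTA"
fragment_b = "ATAGCATGGCTA"
print(translate(fragment_a), translate(fragment_b))
print(classify_codon_differences(fragment_a, fragment_b))
```

With this code table, a complete 1140 bp cyt b sequence yields 1140 / 3 = 380 codons, matching the chain length reported above.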

Synonymous substitutions (298) strongly prevailed among the total of 337 mutations, which resulted in low AA diversity and low lineage support. Nevertheless, it was possible to identify 1–3 specific AA substitutions for eight of the 17 phylogenetic lineages revealed from the combined data (cyt b, β-actin, and S7): L_IV (1x), L_VI (1x), L_VII (1x), L_XI (2x), L_XII (3x), L_XIII (2x), L_XIV (1x), L_XVI (1x) (Fig. 1). Uncorrected genetic distances for cyt b and β-actin for each lineage are given in Table 2. The percentage divergences between different lineages for cyt b varied from 0.83% to 9.97%. P-distances for β-actin and S7 reached up to 2.63% and 2.08%, respectively. The highly conserved marker RAG1 showed values of intraspecific character, up to 0.64%.
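For readers unfamiliar with the measure, an uncorrected p-distance is simply the proportion of compared alignment sites at which two sequences differ. The Python sketch below illustrates this; the function name is hypothetical, and skipping sites with gaps or ambiguous bases (pairwise deletion) is an assumption rather than a setting documented in the text.

```python
def p_distance(seq_a, seq_b, skip="-N?"):
    """Uncorrected p-distance: differing sites divided by compared sites.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (pairwise deletion; an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in skip or b in skip:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


# Hypothetical 10 bp fragments, not sequences from the study.
d_plain = p_distance("ACGTACGTAC", "ACGTTCGTAA")   # 2 differences over 10 sites
d_gapped = p_distance("ACG-ACGTAC", "ACGTACGTAA")  # 1 difference over 9 compared sites
print(f"{100 * d_plain:.2f}%  {100 * d_gapped:.2f}%")
```

Multiplying the result by 100 gives percentages on the scale used in Table 2.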

3.2. Phylogenetic analysis

The data obtained from the mitochondrial and three nuclear markers were first analysed separately and then in combination. The best-fit model GTR + Γ + I was chosen for the sequences of cyt b with the following parameters: base frequencies (A = 0.2594, C = 0.2920, G = 0.1551, T = 0.2936), gamma distribution Γ = 0.7910 and proportion of invariant sites p-inv = 0.5890. The same model was ascertained for the β-actin, S7 and RAG1 markers. The combined data set (cyt b-β-actin-S7) was subjected to BI and MP analyses (base frequencies: A = 0.2570, C = 0.2463, G = 0.1870, T = 0.3096; gamma distribution Γ = 0.1470; proportion of invariant sites p-inv = 0.4130) and, in addition, the binary matrix coding the presence of indels was applied. The sequences of RAG1 were not included in the analysis due to their low variability. Congruence among the tree topologies generated for the combined data (cyt b-β-actin-S7) was tested using the incongruence length difference test (ILD) as implemented in the partition homogeneity test in PAUP* (Farris et al., 1995). The P-value, calculated from 1000 replications, was 0.001 and confirmed the above-mentioned findings on the high diagnostic ability of the mtDNA marker and the low informativeness of the nuclear markers at the level of nucleotide substitutions. However, a strong phylogenetic signal was located in the sequences of the nuclear introns of β-actin and S7 in the form of many diagnostic indels (Fig. 1; Tables 3 and 4). For reconstruction of the phylogenetic relationships, 2–5 individuals representing each phylogenetic lineage, characterised in Table 1, were chosen. Phylogenetic analyses of both datasets (mtDNA separately and in combination, mtDNA + nDNA) confirmed the monophyly of the genus Alburnoides. The division of lineages into the main clades and the nodal supports were similar; support was always strong, except for the Albanian group. The Aegean groups formed basal groups within the European clade (Figs. 1 and 2).
For the combined dataset, the different statistical methods consistently split the tested individuals into 17 lineages. These lineages form two main geographical clades, which we term the Ponto-Caspian and European in accordance with the lineage distribution (Fig. 1). The first clade includes four lineages: L_XII, L_XIII, L_XVI and L_XVII. The second clade is less compact and can be divided into the Midwest European (lineages L_I, L_IV, L_V), the Albanian (L_VII, L_VIII, L_IX, L_XV), the South Bulgarian (L_II), the Russian–Ukrainian (L_III and L_VI) and two Aegean (combined L_X, L_XI and independent L_XIV) groups. The phylogenetic tree based on mtDNA data shows a similar topology (Fig. 2). The main clades and lineages are highly supported by bootstrap and posterior probability values. Lower supports are observed for some representatives of the Albanian group. In the European clade, the cyt b tree joins the lineages from the Russian–Ukrainian and the Midwest European groups. In addition, the Iranian sequences Iran I, II and III of Seifali et al. (2012) were integrated into the cyt b dataset and extend the Ponto-Caspian clade.


I II III IV V VI VII VIII IX X XI XII XIII XIV XV XVI XVII

Lineage I 0.52 ± 0.24 3.13 ± 0.60 3.98 ± 0.64 3.46 ± 0.62 2.87 ± 0.56 3.91 ± 0.61 4.89 ± 0.76 4.37 ± 0.71 4.09 ± 0.70 6.65 ± 0.87 6.98 ± 0.87 8.08 ± 0.93 7.82 ± 0.97 7.69 ± 1.00 4.30 ± 0.72 6.98 ± 0.91 6.98 ± 0.93 7.24 ± 0.90 7.04 ± 0.90 8.41 ± 0.97

Lineage II 2.11 ± 0.47 0.00 ± 0.00 3.23 ± 0.62 2.67 ± 0.57 2.48 ± 0.57 3.26 ± 0.61 5.22 ± 0.82 4.11 ± 0.74 4.04 ± 0.75 7.17 ± 0.90 6.98 ± 0.89 7.43 ± 0.95 7.04 ± 0.94 7.69 ± 1.02 4.26 ± 0.79 6.13 ± 0.86 6.32 ± 0.88 6.58 ± 0.86 6.65 ± 0.88 7.43 ± 0.94

Lineage III 2.23 ± 0.48 0.50 ± 0.21 0.72 ± 0.23 2.84 ± 0.54 2.51 ± 0.52 3.81 ± 0.63 5.44 ± 0.75 4.43 ± 0.73 4.50 ± 0.70 8.02 ± 0.94 8.38 ± 0.98 7.92 ± 0.91 8.15 ± 0.97 8.15 ± 1.06 4.45 ± 0.73 7.25 ± 0.93 7.24 ± 0.93 7.82 ± 0.92 7.50 ± 0.93 8.80 ± 1.00

Lineage IV 0.06 ± 0.06 2.05 ± 0.46 2.17 ± 0.48 0.65 ± 0.28 2.22 ± 0.49 3.46 ± 0.58 5.22 ± 0.79 3.78 ± 0.67 3.98 ± 0.69 6.84 ± 0.85 7.82 ± 0.93 8.02 ± 0.92 7.63 ± 0.94 8.15 ± 1.05 4.19 ± 0.72 7.17 ± 0.92 7.04 ± 0.92 7.95 ± 0.94 7.63 ± 0.93 8.28 ± 0.94

Lineage V 0.06 ± 0.06 2.05 ± 0.46 2.17 ± 0.48 0.00 ± 0.00 0.39 ± 0.22 3.72 ± 0.62 5.15 ± 0.77 4.04 ± 0.68 4.24 ± 0.70 7.24 ± 0.89 7.30 ± 0.89 8.02 ± 0.94 7.37 ± 0.93 7.76 ± 1.00 4.19 ± 0.71 6.58 ± 0.89 6.65 ± 0.90 7.17 ± 0.89 6.84 ± 0.87 7.89 ± 0.95

Lineage VI 2.23 ± 0.48 0.50 ± 0.21 0.12 ± 0.08 2.17 ± 0.48 2.17 ± 0.48 0.52 ± 0.25 5.41 ± 0.80 4.24 ± 0.71 3.91 ± 0.69 6.65 ± 0.80 7.24 ± 0.87 6.65 ± 0.85 7.56 ± 0.96 7.82 ± 0.99 4.09 ± 0.71 6.84 ± 0.90 6.98 ± 0.90 7.37 ± 0.92 7.30 ± 0.93 7.82 ± 0.92

Lineage VII 0.37 ± 0.19 1.86 ± 0.43 1.99 ± 0.45 0.31 ± 0.18 0.31 ± 0.18 1.99 ± 0.45 0.26 ± 0.18 4.82 ± 0.76 5.07 ± 0.78 8.67 ± 0.99 9.26 ± 1.02 8.54 ± 1.01 9.00 ± 1.06 9.00 ± 1.07 5.15 ± 0.78 7.50 ± 0.95 7.89 ± 1.00 8.15 ± 0.96 8.08 ± 0.99 9.39 ± 1.03

Lineage VIII 0.31 ± 0.18 1.80 ± 0.43 1.92 ± 0.45 0.25 ± 0.17 0.25 ± 0.17 1.92 ± 0.45 0.06 ± 0.06 0.26 ± 0.17 1.37 ± 0.39 7.24 ± 0.91 4.82 ± 0.92 8.02 ± 0.94 7.37 ± 0.78 8.54 ± 1.08 1.43 ± 0.39 6.71 ± 0.91 6.52 ± 0.89 7.17 ± 0.91 7.11 ± 0.93 8.08 ± 1.00

Lineage IX 0.31 ± 0.18 1.80 ± 0.43 1.92 ± 0.45 0.25 ± 0.17 0.25 ± 0.17 1.92 ± 0.45 0.06 ± 0.06 0.00 ± 0.00 0.21 ± 0.10 6.96 ± 0.87 7.08 ± 0.89 7.61 ± 0.93 7.22 ± 0.94 8.21 ± 1.07 0.83 ± 0.29 6.45 ± 0.91 6.38 ± 0.88 6.84 ± 0.92 6.78 ± 0.93 7.87 ± 0.98

Lineage X 2.17 ± 0.47 0.43 ± 0.21 0.31 ± 0.18 2.11 ± 0.47 2.11 ± 0.47 0.31 ± 0.18 1.92 ± 0.44 1.86 ± 0.43 1.86 ± 0.43 0.52 ± 0.25 3.06 ± 0.59 9.26 ± 0.97 8.74 ± 0.96 9.84 ± 1.09 7.30 ± 0.89 9.06 ± 1.01 8.54 ± 0.98 9.45 ± 0.99 9.39 ± 1.01 9.00 ± 0.98

Lineage XI 2.30 ± 0.49 0.56 ± 0.24 0.43 ± 0.21 2.23 ± 0.49 2.23 ± 0.49 0.43 ± 0.22 2.05 ± 0.46 1.99 ± 0.46 1.99 ± 0.46 0.12 ± 0.12 0.39 ± 0.21 9.32 ± 1.00 8.41 ± 0.96 9.19 ± 1.05 7.24 ± 0.91 8.08 ± 0.95 8.87 ± 1.00 9.00 ± 0.97 8.93 ± 0.98 9.19 ± 1.00

Lineage XII 2.11 ± 0.46 0.37 ± 0.18 0.25 ± 0.14 2.05 ± 0.46 2.05 ± 0.46 0.25 ± 0.15 1.86 ± 0.43 1.80 ± 0.43 1.80 ± 0.43 0.19 ± 0.13 0.31 ± 0.18 0.26 ± 0.17 5.22 ± 0.78 8.87 ± 1.01 7.74 ± 0.93 4.37 ± 0.71 4.63 ± 0.73 5.02 ± 0.74 4.95 ± 0.75 5.80 ± 0.80

Lineage XIII 2.05 ± 0.46 0.31 ± 0.17 0.19 ± 0.13 1.99 ± 0.46 1.99 ± 0.46 0.19 ± 0.13 1.80 ± 0.43 1.74 ± 0.42 1.74 ± 0.42 0.12 ± 0.12 0.25 ± 0.17 0.06 ± 0.06 0.26 ± 0.18 9.52 ± 1.08 7.34 ± 0.94 4.69 ± 0.68 5.02 ± 0.78 5.35 ± 0.73 4.95 ± 0.71 5.87 ± 0.81

Lineage XIV 2.36 ± 0.49 1.12 ± 0.34 0.99 ± 0.32 2.30 ± 0.49 2.30 ± 0.49 0.99 ± 0.33 2.61 ± 0.53 2.54 ± 0.52 2.54 ± 0.52 0.93 ± 0.32 1.05 ± 0.34 0.87 ± 0.30 0.81 ± 0.30 0.26 ± 0.19 7.87 ± 1.04 8.74 ± 1.06 8.93 ± 1.04 9.19 ± 1.08 9.32 ± 1.09 9.97 ± 1.06

Lineage XV 0.39 ± 0.20 1.88 ± 0.44 2.01 ± 0.46 0.33 ± 0.19 0.33 ± 0.19 2.01 ± 0.46 0.06 ± 0.06 0.08 ± 0.08 0.08 ± 0.08 1.94 ± 0.45 2.07 ± 0.47 1.88 ± 0.44 1.82 ± 0.43 2.63 ± 0.53 0.17 ± 0.12 6.63 ± 0.93 6.32 ± 0.90 7.02 ± 0.93 6.95 ± 0.94 8.00 ± 1.01

Lineage XVI 2.05 ± 0.46 0.31 ± 0.17 0.19 ± 0.13 1.99 ± 0.46 1.99 ± 0.46 0.19 ± 0.13 1.80 ± 0.43 1.74 ± 0.42 1.74 ± 0.42 0.12 ± 0.12 0.25 ± 0.17 0.06 ± 0.06 0.00 ± 0.00 0.81 ± 0.30 1.82 ± 0.43 1.04 ± 0.34 4.24 ± 0.70 2.74 ± 0.48 2.48 ± 0.50 5.12 ± 0.74

Lineage XVII 2.05 ± 0.46 0.43 ± 0.19 0.31 ± 0.15 1.99 ± 0.46 1.99 ± 0.46 0.31 ± 0.15 1.80 ± 0.42 1.74 ± 0.42 1.74 ± 0.42 0.25 ± 0.14 0.37 ± 0.19 0.19 ± 0.10 0.12 ± 0.08 0.93 ± 0.30 1.82 ± 0.43 0.12 ± 0.08 0.13 ± 0.13 5.35 ± 0.78 5.28 ± 0.81 4.76 ± 0.77

– – – – – – – – – – – – – – – – – 0.91 ± 0.33 1.76 ± 0.44 5.93 ± 0.81

Iran 1

The mean DNA distance in percents ± SD above the diagonal is for b-actin, under the diagonal there are the values for cyt b; values on the diagonal (in bold) indicate divergences within cyt b lineages.

Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Lineage Iran 1 Iran 2 Iran 3

Lineage I

Table 2 Mutual comparison of the representatives of the genus Alburnoides and their sequence divergences obtained by p-distance analysis of cyt b and b-actin.

– – – – – – – – – – – – – – – – – – 0.00 ± 0.00 5.61 ± 0.81

Iran 2

– – – – – – – – – – – – – – – – – – – 0.52 ± 0.25

Iran 3
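The p-distances reported in Table 2 are uncorrected proportions of differing sites between aligned sequences. As an illustration only, the calculation might be sketched in Python as follows. The function names, the pairwise-deletion treatment of gaps and ambiguous bases, and the toy sequences are assumptions of this sketch, not details taken from the study, and the ± values of Table 2 are not reproduced here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped (pairwise deletion; an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_between_group_distance(group_a, group_b) -> float:
    """Mean p-distance, in percent, over all pairs drawn from two groups."""
    distances = [p_distance(a, b) for a in group_a for b in group_b]
    return 100.0 * sum(distances) / len(distances)


# Hypothetical toy alignment, not data from the study.
lineage_a = ["AAAA", "AAAT"]
lineage_b = ["AATT"]
between = mean_between_group_distance(lineage_a, lineage_b)
```

Averaging over all between-lineage pairs gives one off-diagonal cell of such a matrix; averaging over pairs within a lineage would give the corresponding diagonal cell.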


Interesting results for phylogenetic relationships in the genus Alburnoides were provided by the intron diagnostics (Table 3), which yielded a few good diagnostic indels for differentiation at different levels: (a) inter-clade, e.g. the PC indel distinguishes the Ponto-Caspian clade from the European clade; (b) inter-group, e.g. indel 7 is specific to the Midwest European group, indel 9 is typical of the A. prespensis complex, indel 8 of the South Bulgarian group, and indel 6 of the Russian–Ukrainian group; (c) inter-lineage, e.g. indels 2, 4 and 5 identify lineage L_XIV as a separate Aegean lineage (Fig. 1). In general, the lineages of the Ponto-Caspian clade and the Aegean lineages show more indel positions in the S7 marker, whereas the lineages of the European clade have typical indels largely in the intron region of b-actin (Fig. 1, Tables 3 and 4).

3.3. Haplotype network and geography

The mitochondrial networks for cyt b were created from 329 sequences divided into 183 recognised mtDNA haplotypes (Figs. 3 and 4; Tables 1 and 5). The cyt b haplotype diversity was high (Hd = 1.000 ± 0.001), and nucleotide diversity (Pi = 0.043 ± 0.002) was proportional to the number of revealed substitutions; there were 268 variable positions and 245 parsimony informative sites (Table 6). The seventeen separate lineages were distributed in the haplotype networks (Figs. 3 and 4). The nuclear network for the b-actin marker was composed of 104 sequences, and 36 nuclear haplotypes were revealed (Fig. S2; Table 1). The nuclear haplotype network confirmed only three distinct groups. Group 1 (N_1 – N_13) included individuals from the Danube River drainage, Albanian–Macedonian systems, and European rivers from the Mediterranean and Northern West Atlantic. Of the two most common nuclear haplotypes, N_12 (with N_11 and N_13) is typical of individuals from Albania and the Republic of Macedonia. Five heterozygous individuals were found within this group in four localities: Dyje (N_8/N_3, N_10/N_2) and Dzepska (N_1/N_4) from the Danube River drainage, Rhona (N_7/N_1), and Ohrid Lake (N_11/N_30) in Albania. Most heterozygotes combined nuclear haplotypes from the same group; the only exception was one sample from Ohrid Lake, whose heterozygous pattern combined nuclear haplotypes from both main Groups 1 and 2. Nuclear haplotypes in Group 2 (N_14 – N_32) distinguish eight lineages inhabiting Bulgarian rivers (N_31 – N_32), Greek rivers flowing into the Aegean Sea (N_17 – N_21), rivers of the Crimean Peninsula (N_28 – N_29), the Dniester and Volga river drainages (N_14 – N_16), the Kuban (N_22, N_27) and rivers of Azerbaijan (N_24 – N_26). Group 3 (N_33 – N_36) comprised only one lineage from a single locality, Sperchios in Central Greece, and reflected the increased variability caused mainly by the appearance of unique indels (Table 4). The most distinct nuclear haplotypes (N_35 – N_36) always occurred in the heterozygous state with N_33 and N_34. For the S7 marker, 70 sequences in total formed the nuclear haplotype network, and 55 nuclear haplotypes (N_37 – N_92) were detected. These haplotypes are unique to the eight lineages highlighted in the supplement (Fig. S3). Three heterozygous individuals were revealed, and for each of them both the maternal and the paternal haplotype were identified: Kuban River in Russia (N_89/N_72) and Devolli in Albania (N_45/N_37 and N_44/N_42). The observed variability was insufficient for lineage separation; nevertheless, this marker, based primarily on indel composition, confirms the conclusions deduced from the nuclear b-actin marker. For the third nuclear marker, RAG1, the network contained 106 sequences; 44 nuclear haplotypes (N_93 – N_137) were defined and no heterozygous pattern was observed. Specific RAG1 haplotypes (differing by 1–3 nucleotide substitutions) distinguished, more or less clearly, lineages similar to those resolved by the two preceding nuclear markers (Fig. S4).
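For readers unfamiliar with the summary statistics Hd and Pi quoted above, a minimal Python sketch of their usual sample-based definitions is given below. It assumes that Hd is computed as n/(n − 1) · (1 − Σp_i²) from haplotype frequencies p_i, and that Pi is the mean number of pairwise differences per site; the software and settings actually used in the study are not stated in this section, and the function names and toy alignment are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    counts = Counter(sequences)  # identical strings count as one haplotype
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


def nucleotide_diversity(sequences):
    """Pi = mean number of pairwise differences per aligned site."""
    n = len(sequences)
    if n < 2:
        raise ValueError("at least two sequences are required")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    total_diffs = 0
    for a, b in combinations(sequences, 2):
        total_diffs += sum(1 for x, y in zip(a, b) if x != y)
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)


# Hypothetical toy alignment: three haplotypes among four sequences.
toy = ["AAAA", "AAAA", "AAAT", "AATT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
```

In this toy case Hd is 5/6 and Pi is 7/24; real data would additionally need a rule for gaps and ambiguous bases, which this sketch leaves out.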


Table 3 Parsimony informative and indel positions in sequences of b-actin and S7 for each lineage representative. The PC indel and indel 11 are displayed in both observed forms (for the heterozygous individuals of L_XIV and L_XIII, respectively).

, deletion; ⁄, insertion; PC, nucleotide insertion in sequences of the b-actin specific for Ponto-Caspian clade; 1–10, b-actin indels; 11–15, S7 intron indels.

Fig. 1. Bayesian consensus tree inferred from combined data (cyt b, b-actin and S7). Bayesian posterior probabilities and MP bootstrap values are listed near the nodes. Only values ≥75% are shown. The seventeen highlighted lineages are categorised into two major clades. Up and down arrows represent insertions and deletions, respectively. b-actin indels are marked by empty arrows. Full arrows represent S7 indels. Numbers on the arrows correspond to gap codes in Table 3. Asterisks indicate the presence of a typical nonsynonymous amino acid substitution in the lineage. PC, typical Ponto-Caspian indel. Numbers at the end of the branches are catalogue numbers from the Institute of Vertebrate Biology, Academy of Sciences of the Czech Republic.


Table 4 Recent taxonomy of investigated species, proposed changes and molecular diagnostics.

Lineage   Scientific name   Valid name (a)    Proposed name            Geographic distribution
L_I       A. bipunctatus    A. bipunctatus    A. bipunctatus           Midwest European
L_IV      x                 x                 A. sp. 1                 Midwest European
L_V       x                 x                 A. sp. 2                 Midwest European
L_II      A. tzanevi        A. bipunctatus    A. tzanevi               South Bulgarian
L_III     A. rossicus       A. rossicus       A. rossicus              Russian–Ukrainian
L_VI      A. maculatus      A. maculatus      A. maculatus             Russian–Ukrainian
L_VII     A. ohridanus      A. ohridanus      A. ohridanus             Albanian
L_VIII    A. devolli        A. devolli        A. prespensis complex    Albanian
L_IX      A. prespensis     A. prespensis     A. prespensis complex    Albanian
L_XV      A. fangfangae     A. fangfangae     A. prespensis complex    Albanian
L_X       A. thessalicus    A. thessalicus    A. sp. 3                 Aegean
L_XI      A. strymonicus    A. bipunctatus    A. strymonicus           Aegean
L_XIV     A. thessalicus    A. thessalicus    A. sp. 4                 Aegean
L_XII     A. fasciatus      A. fasciatus      A. fasciatus             Ponto-Caspian
L_XIII    A. kubanicus      A. kubanicus      A. kubanicus             Ponto-Caspian
L_XVI     A. eichwaldii     A. eichwaldii     A. eichwaldii            Ponto-Caspian
L_XVII    x                 x                 A. sp. 5                 Ponto-Caspian

Unique AA

Intron diagnostics Indels b-actin

V

D I

V V,M I G

3; 3; 3; 8 1; 1; 3 3; 3; 3; 1

Indels S7

7 7 7 6 6

11

9 9 9

2; 4; 5 PC PC PC; 10 PC

11; 11; 11; 11; 14 11; 12;

13 13 14 14 15 15

AA, amino acid; V, Valine; I, Isoleucine; M, Methionine; G, Glycine; D, Aspartic acid; PC, Ponto-Caspian clade. a FishBase 8/2015.

Fig. 2. Bayesian consensus tree resulting from the analysis of the cyt b sequences. Posterior probabilities and bootstrap values ≥75% are listed near the nodes. The twenty lineages are presented in the phylogram. Numbers at the end of the branches are catalogue numbers from the Institute of Vertebrate Biology, Academy of Sciences of the Czech Republic.

3.4. Phylogeny of Alburnoides and recent taxonomy

3.4.1. European clade and its lineages

In total, the European clade is represented by 10 monophyletic lineages and by populations which we define as the Alburnoides prespensis complex. This complex includes A. prespensis s. stricto, described from Lake Prespa and its tributaries in the Republic of Macedonia, and specimens of two newly described species, A. fangfangae and A. devolli (Bogutskaya et al., 2010), collected in their type localities: the Osumi [Osum] River and the Devolli [Devoll] River, respectively, both from the upper Semani [Seman] River drainage in Albania (Froese and Pauly, 2015; Eschmeyer, 2015).

3.4.1.1. Midwest European group. This group includes three monophyletic lineages. The first lineage (L_I) corresponds to the species A. bipunctatus, described from the middle Weser (near Minden), North Sea basin. Previously this species was believed to be widely distributed in Europe and Transcaucasia (Berg, 1949). Our results do


not confirm this wide range. Lineage L_I is confirmed in rivers flowing into the Baltic, North, Black and Mediterranean seas and the Bay of Biscay, in the territories of the Czech Republic, Slovakia, Germany, Austria (upper Danube drainage), Poland, and France (Table 1, Fig. S1). Within lineage L_I, 50 mtDNA and 23 nDNA haplotypes were distinguished (Tables 1 and 5; Fig. 3), with mitochondrial intraspecific variability up to 0.52% (Table 2). The mitochondrial network shows three larger haplogroups, which, however, show no clear geographical affiliation (Fig. 3). Haplotypes H_1 – H_18 were largely observed in France and in the upper Danube, whereas haplotypes H_20 – H_39 dominated in rivers from Poland and in the Tisa River drainage. Haplotypes H_1 and H_20 are common; relatively recent satellite haplotypes form star-like genealogical patterns around these predominant, oldest haplotypes. Haplotype H_40 from the downstream section of the Morava on the Czech-Slovakian border was chosen as ancestral. Together with the two other lineages of the Midwest European group, L_I has the typical insertion 7 and deletion 3 in the b-actin intron (Tables 3 and 4). The second lineage of this group, L_IV, includes individuals from the middle Danube section in Slovenia, Bosnia and Herzegovina, Croatia, and Serbia (Table 1). This lineage is

Fig. 3. The unrooted haplotype network for the lineage L_I (H_1 – H_50) based on sequences of the cyt b. The haplotype numbers refer to the numbers in Table 1. The node sizes are proportional to the haplotype frequencies. Haplotypes in bold were used in the trees. GenBank accession numbers are shown.

Fig. 4. The unrooted haplotype network for the lineages L_II – L_XVII (H_51 – H_183) constructed by analysis of sequences of the cyt b. The haplotype numbers refer to the numbers in Table 1. The node sizes are proportional to the haplotype frequencies. Haplotypes in bold were used in the trees. GenBank accession numbers are shown.


Table 5 Summary statistics of spirlin lineages within mtDNA and nDNA networks of markers cyt b, b-actin, S7 and RAG1.

Lineage         Ncyt b   NHcyt b    Nbact   NNbact   NS7   NNS7     NRAG1   NNRAG1
Lineage_I       131      50         15      8        8     7        31      8
Lineage_II      8        2          3       2        2     2        3       3
Lineage_III     8        7          6       1        5     5        7       3
Lineage_IV      44       28         6       2        3     3        6       4
Lineage_V       46       27         7       4        4     4        4       3
Lineage_VI      5        5          5       2        2     2        3       3
Lineage_VII     12       6          4       2        6     6        4       3
Lineage_VIII    4        4          3       1        2     2        2       1
Lineage_IX      11       9          9       2        9     6        10      4
Lineage_X       15       15         9       2        3     3        6       5
Lineage_XI      14       6          10      3        2     2        3       3
Lineage_XII     3        3          3       2        3     2        3       3
Lineage_XIII    5        5          5       3        4     3        5       2
Lineage_XIV     5        5          4       4        5     3        5       2
Lineage_XV      6        4          4       2        5     3        6       3
Lineage_XVI     4        3          4       1        3     3        2       2
Lineage_XVII    8        4          7       3        4     2        6       4
Overall         329      183/183*   104     44/36*   70    58/55*   106     56/44*

(1) (1) (3) (2) (2) (1)

(2) (2)

(2) (1) (2) (2) (2) (2) (1) (2) (2) (1)

(1)

(2) (2) (1) (2)

N, the number of specimens; H, the number of mitochondrial haplotypes; NH, the number of nuclear haplotypes; number of indel positions in brackets. * Total/real number of haplotypes in networks (lower number caused by presence of same nDNA haplotypes in different lineages).
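The distinction between the "total" and "real" numbers of haplotypes in Table 5 can be illustrated with a short Python sketch: the total sums the haplotypes counted within each lineage, while the real number counts each distinct haplotype once, so it is lower whenever a haplotype is shared between lineages. The lineage names and haplotype labels below are hypothetical placeholders, not values from Table 1.

```python
def total_and_real_haplotypes(haplotypes_by_lineage):
    """Return (total, real) haplotype counts.

    total: sum of the distinct haplotypes found in each lineage;
    real:  number of distinct haplotypes across all lineages, which is
           smaller than total when lineages share a haplotype.
    """
    per_lineage = [set(haps) for haps in haplotypes_by_lineage.values()]
    total = sum(len(haps) for haps in per_lineage)
    real = len(set().union(*per_lineage))
    return total, real


# Hypothetical example: the label "h2" occurs in two lineages.
example = {
    "lineage_A": ["h1", "h2"],
    "lineage_B": ["h2", "h3"],
    "lineage_C": ["h4"],
}
total, real = total_and_real_haplotypes(example)
```

Here the total is 5 and the real number is 4, analogous in form to an entry such as 44/36 in the table.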

Table 6 Analysed fragments of both genomes and their characteristics resulting from MP analysis.

Marker                  No. characters (pars. inf.)   TL    CI       RI
cyt b                   1140 (245)                    735   0.5265   0.7936
b-actin + S7 + cyt b    2573 (312)                    894   0.5749   0.8205
b-actin                 860 (30) + 10 indels
S7                      573 (37) + 5 indels
RAG1                    1111 (35)

CI, consistency index (excluding uninformative characters); pars. inf., number of parsimony informative characters in brackets; RI, retention index; TL, tree length.
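As a reminder of how the indices in Table 6 relate to tree length, the following Python sketch uses the standard ensemble definitions CI = M/S and RI = (G − S)/(G − M), where S is the observed tree length, M the minimum and G the maximum possible number of steps for the data. The function names and the example numbers are assumptions for illustration; the values of M and G for the present data sets are not given in this section.

```python
def consistency_index(min_steps: float, observed_steps: float) -> float:
    """CI = M / S: minimum possible steps divided by observed tree length."""
    if observed_steps <= 0:
        raise ValueError("observed_steps must be positive")
    return min_steps / observed_steps


def retention_index(min_steps: float, max_steps: float,
                    observed_steps: float) -> float:
    """RI = (G - S) / (G - M), with G the maximum possible number of steps."""
    if max_steps == min_steps:
        raise ValueError("RI is undefined when max_steps equals min_steps")
    return (max_steps - observed_steps) / (max_steps - min_steps)


# Hypothetical numbers: M = 10, G = 50, observed tree length S = 20.
ci = consistency_index(10, 20)
ri = retention_index(10, 50, 20)
```

With these placeholder values CI is 0.5 and RI is 0.75; both indices approach 1 as homoplasy decreases.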

designated as Alburnoides sp. 1, because there is no available name for these representatives. Most previous authors believed them to be conspecific with A. bipunctatus (Berg, 1949; Kottelat and Freyhof, 2007), and only some authors separated them as A. cf. bipunctatus or Alburnoides sp. (Bogutskaya et al., 2010; Coad and Bogutskaya, 2009). In this lineage, 28 mtDNA and 9 nDNA haplotypes were recognised (Tables 1 and 5). A unique AA substitution, as well as a cyt b distance of 2.22%, distinguishes this lineage from the third Midwest European lineage, L_V (Fig. 1; Tables 2 and 4). Individuals from lineage L_V inhabit the lower Danube River basin in Croatia, Serbia, Bosnia and Herzegovina, Romania, and Bulgaria (Table 1). In this lineage, 27 mtDNA and 11 nDNA haplotypes were identified (Tables 1 and 5), and the DNA distances from the other phylogenetic lineages ranged from 2.22% to 8.02% (Table 2). We designate this lineage as Alburnoides sp. 2 for the same reasons as for lineage L_IV. Phylogenetic lineages L_IV and L_V differ significantly (2.87–3.46%) from lineage L_I, which corresponds to A. bipunctatus s. stricto. At present, they should be regarded as cryptic species; their morphological identification and further description as new taxonomic units are complicated by their sympatric distribution in some parts of the system: according to the cyt b marker, both lineages were found in the rivers Bosna, Korona, and Rogačica (Table 1). More detailed studies in the area where lineages L_IV and L_V overlap are necessary for understanding their evolutionary history.

3.4.1.2. Albanian group. The diagnostic deletion 3 supports a sister relationship between the Albanian spirlins and the Midwest European group (Fig. 1; Tables 3 and 4). In the territory covered by the Albanian group, four species are accepted on the basis of morphological data: A. ohridanus (Karaman, 1928), A. prespensis (Karaman, 1924), A. devolli Bogutskaya, Zupancic and Naseka, 2010 and A. fangfangae Bogutskaya, Zupancic and Naseka, 2010. This species structure was established by morphological investigations, but the genetic results are less definite. P-distances for cyt b between lineages of the Albanian group ranged from 0.83% to 5.15% (Table 2). The highest DNA distances and a unique AA substitution were detected for the species from the type locality Ohrid Lake (Tables 2 and 4). The studied specimens from northern Albania (the Ohrid-Drin-Skadar drainage, rivers Ishem, Mat, and Erzen), with 6 unique mtDNA haplotypes (H_120 – H_125) and 11 nuclear haplotypes (Table 5; Figs. 4, S2–S4), constituted the monophyletic lineage L_VII, which corresponds to the accepted species A. ohridanus. Cyt b distances between this lineage and the other phylogenetic Alburnoides lineages ranged from 4.89% to 9.39% (Table 2). The species status of this lineage was previously also confirmed by species-specific microsatellite alleles revealed for A. ohridanus and A. prespensis (Urbánková et al., 2013). The absence of visible population structure in nDNA markers is typical of the Albanian taxa named here the A. prespensis complex. Based on intron diagnostics, this complex is distinguishable from other lineages by one 2-nt indel in the b-actin sequences (deletion 9; Table 4). Three morphologically described lineages constitute this complex, which occurs in a small water system and shows low mitochondrial distances from 0.83% to 1.43%, more likely corresponding to the intraspecific level (Table 2). Lineage L_VIII, with 4 mtDNA haplotypes and 4 nDNA haplotypes, was identified in the south Albanian rivers Devolli, Borshi, Dukati, and Vjosa [Vjosë/Aoos] (Tables 1 and 5; Figs. 4, S2–S4). Because mtDNA haplotypes of this lineage were found in the Devolli River, one may propose that it corresponds to the species A. devolli described from this river. However, the remaining specimens collected in the Devolli River (40–50 km from the type locality) were included in other lineages of the A. prespensis complex: L_XV and L_IX. Lineage L_XV, with 4 mtDNA and 8 nDNA haplotypes (Tables 1 and 5; Figs. 4, S2–S4), included individuals from the Devolli River and more numerous specimens collected in the Osumi River near the type locality of A. fangfangae. Lineage L_IX, with 9 mtDNA and 12 nDNA haplotypes, combined samples from both the Osumi and Devolli rivers with individuals from Prespa Lake and the Shkumbini River (Tables 1 and 5; Figs. 4, S2–S4). According to the available data, the populations of both the Osumi and Devolli rivers are not genetically homogeneous and include haplotypes assigned to different morphologically described lineages. This finding is supported by Geiger et al. (2014), where similar results have


been observed for the COI marker. Thus, phylogenetic analyses give ambiguous results and do not support the currently accepted taxonomy, which presumes the validity of three species: A. prespensis, A. fangfangae, and A. devolli. Our results suggest that the prevailing concept that the Prespa Lake basin hosts an outstanding level of local endemism should be revised. The evolutionary history of this region seems to be highly complicated; this is probably a sign of recent gene flow among adjacent river basins. A similar situation was observed in the barbels Barbus spp. (Marková et al., 2010), where it was discovered that the species Barbus prespensis Karaman, 1924 is not endemic to Prespa Lake but populates a large part of southern Albania, exactly the region where the A. prespensis complex is detected. The presence of two or three differentiated mitochondrial haplotype groups in the same river, revealed for lineages L_VIII, L_IX and L_XV in the Devolli and Osumi rivers, was not observed elsewhere in our study, except for the sympatry of L_IV and L_V in the Danube River drainage (see above). However, for lineages L_IV and L_V we can presume secondary contact after historical allopatry, because individuals from lineage L_IV are characterised by a unique AA substitution not found in lineage L_V. In contrast, the lineages L_VIII, L_IX and L_XV of the A. prespensis complex show only slight differences in mitochondrial haplotypes. This could be explained by the existence of three different mitochondrial gene pools that originated from different maternal ancestors but share the same nuclear genome. Considering the above, a detailed population study of the Albanian spirlins should be undertaken, especially to explore genetic variability, demographic history and morphological variation among the populations from the southern river basins.

3.4.1.3. South Bulgarian group. This group is represented by a single lineage, L_II, which includes individuals from the Bulgarian rivers Rezovska and Veleka. The Rezovska River is the type locality of Alburnoides bipunctatus tzanevi Chichkoff, 1933. The validity of this species is confirmed by strong nodal support in both phylogenetic trees (Figs. 1 and 2), by a cyt b distance of no less than 2.48% (Table 2) and by the presence of the unique 3-nt deletion 8 (Tables 3 and 4). The affiliation with a different nuclear-network group and a b-actin DNA distance of 2.11% support the differentiation of this lineage from A. bipunctatus. The mtDNA haplotypes H_51 – H_52 and the nuclear haplotypes N_31 – N_32 and N_120 – N_122 are specific to A. tzanevi (Figs. 4, S2, S4).

3.4.1.4. Russian–Ukrainian group. This group is represented by two monophyletic lineages, L_III and L_VI, which share two diagnostic b-actin indels, 1 and 6 (Tables 3 and 4); both show significant cyt b distances from A. bipunctatus s. stricto: 3.98% and 3.91%, respectively (Table 2). Lineage L_III, with 7 specific mtDNA and 9 nDNA haplotypes (Table 5; Figs. 4, S2–S4), includes specimens from the Dniester, Neva (Il’men Lake), and Volga River drainages, which belong to three different sea basins, namely the Black, Baltic, and Caspian seas. They share the identical insertion 11 (Tables 3 and 4), and the mtDNA distance from other phylogenetic lineages varies from 2.51% to 8.80% (Table 2). Previously, populations of spirlins from the Dnieper, Dniester, South Bug and Volga river drainages were combined and separated as a distinct subspecies, A. bipunctatus rossicus Berg, 1924 (type localities Dnieper and Volga). Berg (1924, 1949) distinguished the new subspecies by the predominance of 2.5–5.2 pharyngeal teeth, in contrast to the 2.5–4.2 characteristic of A. bipunctatus bipunctatus; the nominotypical subspecies was presumed to inhabit the North Sea and Baltic Sea basins. Recently, A. rossicus has been accepted as an independent species with a range covering both the Black and Caspian Sea basins (Coad and Bogutskaya, 2009; Bogutskaya and Coad, 2009; Bogutskaya et al., 2010; Turan et al., 2014). Our results add the populations from the northern Baltic basin (Gulf of Finland) to this gene pool. However, the tree topology and the haplotype network show increased divergence between localities from the Black Sea and the Caspian Sea basins; the individuals from the Baltic cluster with the Caspian populations (Figs. 1, 2 and 4). We consider the current variability to be intraspecific because of the low cyt b distance (0.72%) and the existence of identical indels. The second lineage, L_VI, with 5 mtDNA and 7 nuclear haplotypes (Table 5; Figs. 4, S2–S4), is represented by individuals from two Crimean rivers: Alma and Tchernaya. In the past, Crimean spirlins were described as a separate cyprinid species, Alburnus maculatus Kessler, 1859 (type locality Salgir River at Simferopol), and were later included in the synonymy of Alburnoides bipunctatus for a long time. Recently, Alburnoides maculatus has been considered a separate species (Bogutskaya and Coad, 2009; Bogutskaya et al., 2010; Turan et al., 2014). The phylogenetic results confirm independent species status for the Crimean population by strong nodal support in both trees (Figs. 1 and 2), high mtDNA distances from other lineages, ranging between 3.26% and 7.82% (Table 2), and a unique AA substitution (Table 4).

3.4.1.5. Aegean group. Among spirlin populations from the Aegean Sea basin in the Republic of Macedonia, Bulgaria and Greece, two taxa have been described, namely A. bipunctatus strymonicus Chichkoff, 1940 from the Struma/Strymon and Toplitza (a tributary of the South Morava, Danube River basin) rivers, and A. bipunctatus v. thessalicus Stephanidis, 1950 from the Pinios and Sperchios rivers in Greece. For a long time they were treated as not valid (Kottelat, 1997; Vassilev and Pehlivanov, 2005; Eschmeyer, 2015); only Geiger et al. (2014) accept them as valid species. Our study detected three distinct lineages in this area (Figs. 1 and 2). The first of them (L_X) combines individuals from the Aliakmon and Pinios river drainages in Greece and the Vardar/Axios River in the Republic of Macedonia; most of these rivers drain into the Thermaikos Gulf of the Aegean Sea. This lineage is characterised by great mitochondrial haplotype richness (15 haplotypes), and 10 specific nuclear haplotypes were recognised (Table 5; Figs. 4, S2–S4). The divergence from the other Alburnoides lineages ranged from 3.06% to 9.84% (Table 2), with the lowest value observed for L_XI (a sister lineage in the phylograms) and the highest for L_XIV (the third recognised lineage from this area). Intron and AA diagnostics provided further distinguishing characteristics (Table 4). We designate this lineage as Alburnoides sp. 3. In individuals from the Struma/Strymon and the Mesta/Nestos river basins, the haplotype networks revealed 6 mtDNA and 8 nDNA haplotypes (Table 5; Figs. 4, S2–S4), which constitute a second lineage, L_XI. All genetic data, including DNA distance values, nodal supports, and the haplotype networks, establish its separate species status. This lineage is characterised by two specific indels, 11 and 13 (Tables 3 and 4; shared with L_X), and two unique AA substitutions (one of them shared with L_X; Table 4). These representatives show high distances from A. bipunctatus for both cyt b and b-actin (6.98% and 2.30%; Table 2). Because this lineage was revealed in the Struma River, A. strymonicus is an available name for this species, and it has been published in local checklists (Barbieri et al., 2015). It should also be noted that A. strymonicus is divergent from the connected branches of A. prespensis, A. ohridanus, and A. bipunctatus, as previously demonstrated in the phylogenetic trees from cyt b data by Zardoya and Doadrio (1999) and Perea et al. (2010). The third lineage, L_XIV, from the Sperchios River drainage in Central Greece, is characterised by 5 mtDNA and 9 nDNA haplotypes (Table 5; Figs. 4, S2–S4). This lineage shows the highest divergence from all other lineages of the European clade (Figs. 1 and 2) on both markers and also forms a separate b-actin


Group 3 (Fig. S2); the cyt b DNA distances from the other Aegean lineages are 9.84% and 9.19%, and the b-actin distance from the A. bipunctatus lineage reaches 2.36% (Table 2). The intron diagnostics distinguished five indel positions, of which three were diagnostic (deletions 2, 4, 5). Its independent species status is also confirmed by a recent barcoding study (Geiger et al., 2014). We designate this lineage as Alburnoides sp. 4. Thus, the spirlin populations distributed within the type locality of A. bipunctatus v. thessalicus are represented by two distinct phylogenetic lineages: L_X, in rivers draining into the Thermaikos Gulf of the Aegean Sea (including the Pinios River), and L_XIV, from the Sperchios River drainage. High genetic divergence confirms independent species status for both of them. The name thessalicus is definitely available for one of these species, designated here as Alburnoides sp. 3 and Alburnoides sp. 4. Further nomenclatural actions should include: (1) the designation of a neotype for A. thessalicus s. stricto within one of these lineages (the syntypes are considered lost according to Kottelat (1997) and Eschmeyer (2015)), and (2) the description of a new species within the other.

3.4.2. Ponto-Caspian clade and its lineages The Ponto-Caspian clade is characterised by the presence of diagnostic PC indel not found in the European clade and 5 indels presenting in nDNA introns (Table 4). Four monophyletic lineages were revealed among studied Ponto-Caspian populations. Individuals from the Ashe and Shakhe rivers of the northeast Black Sea coast form the Transcaucasian lineage L_XII (Figs. 1 and 2; Table 1). Three mtDNA and seven nDNA haplotypes were revealed for this lineage (Table 5; Figs. 4, S2–S4). One unique AA substitution, S7 intron diagnostics and the cyt b distances (4.37– 5.22%) distinguish this lineage from the other three lineages of the same clade (Tables 2–4). These data confirm lineage L_XII as independent Alburnoides species. To date, A. fasciatus (Nordman, 1840), described (as a subspecies) from West Transcaucasia, should be accepted as its available name; however, the range of this species requires further investigation. The analysed individuals from the Kuban River drainage, the type locality of A. bipunctatus rossicus natio kubanicus recently accepted as a valid species A. kubanicus Ba˘na˘rescu, 1964 (Šanda and Mlíkovsky´, 2012; Eschmeyer, 2015), constitute the lineage L_XIII with the specific 5 mtDNA and 8 nDNA haplotypes (Table 5, Figs. 4, S2–S4). The data from this study confirm its validity by the presence of one unique AA substitution, S7 insertion 14 (Table 4) and strong supports in phylogenetic trees and also high p-distances (Figs. 1 and 2; Table 2). For a more detailed comparison, we included in our study the published data of mitochondrial analysis of Iranian spirlins (Seifali et al., 2012). Lineage L_XVI was identified from the Kura River drainage in Azerbaijanian and thus corresponds to A. eichwaldii (De Filippi, 1863) described from Kura River in Georgia. This lineage is characterised by 3 mtDNA and 6 nDNA haplotypes. 
According to the cyt b distances of 2.48–2.74% (Table 2), this lineage is close to the Iranian lineages Iran 1 (Sefidrud River) and Iran 2 (Talar River) and shows slightly higher interspecific divergence from the other Ponto-Caspian lineages (4.24–4.69%) and from the Iranian lineage Iran 3 (Gorganrud River) (5.12%). Intron diagnostics revealed the presence of the unique deletion 10 in individuals from lineage L_XVI (Table 4). Like Seifali et al. (2012), our study shows differentiation among the three Iranian populations Iran 1–3 (clades I–III according to Seifali et al., 2012). However, the cited authors concluded that populations from the Sefidrud River basin (clade I) “may be considered as Alburnoides eichwaldii” (p. 219). Our results do not support this conclusion and point to the absence of A. eichwaldii in Iran. We leave the lineages described by Seifali et al. (2012) as “species-in-waiting” (Iran 1–3; Fig. 2) pending further investigation.
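The percentage distances quoted above are pairwise sequence distances. As a minimal sketch of how an uncorrected p-distance is obtained from two aligned sequences, the following Python function counts the proportion of differing sites. It assumes pairwise deletion of gaps and ambiguous bases, which may differ from the settings of the dedicated software used for Table 2; the function name and the toy sequences are hypothetical and are not data from this study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned nucleotide sequences.

    The distance is the number of differing sites divided by the number of
    compared sites. Sites where either sequence has a gap or an ambiguous
    base are skipped (pairwise deletion, an assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # gap or ambiguity code: site not compared
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical 20-bp toy alignment with two substitutions.
toy_a = "ATGGCCAACCTACGAAAAAC"
toy_b = "ATGGCTAACCTTCGAAAAAC"
toy_percent = 100 * p_distance(toy_a, toy_b)
```

Two differences over 20 compared sites give a p-distance of 0.10, that is, 10%; the values reported in Table 2 are read in the same way as percentages of differing sites.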


The last Ponto-Caspian lineage, L_XVII, with 4 mtDNA and 9 nDNA haplotypes (Table 5), was detected in the Tugcay and Agcay River basins in Azerbaijan. This lineage has the specific deletion 12 (Tables 3 and 4) and shows high DNA distances from the other phylogenetic lineages (4.24–8.93%). We designated this lineage as Alburnoides sp. 5 and propose that it be studied in more detail.

3.5. Phylogeography, taxonomic and conservation implications

The ultimate goal of this study was to provide the first phylogenetic and phylogeographic review of the taxonomy of the genus Alburnoides in the European context, in order to supply clear and convincing genetic characteristics for species validation. We applied as many molecular approaches as possible: we used two genomic analyses (mitochondrial and nuclear) and the exon–intron mapping of genetic variation in nuclear markers. Based on current understanding and on the sequencing analyses of all collected material, not only from the type localities but also from the surrounding areas, we can conclude the following. For the mitochondrial marker, the genus Alburnoides yielded a wide range of identifying haplotype patterns (Table 1; Fig. S1), revealing a broad distributional spectrum from small, range-restricted areas (A. tzanevi, A. maculatus, A. kubanicus, Alburnoides sp. 4, etc.) to the wide range of A. bipunctatus s. stricto, distributed in temperate Europe from France to Slovakia and Poland. Five localities demonstrate sympatric distribution of different phylogenetic lineages: the cryptic Alburnoides sp. 1 and Alburnoides sp. 2 both populate the rivers Bosna, Rogačica, and Korona from the Sava River drainage (Danube R. basin), whereas individuals from different lineages of the A. prespensis complex coexist in the Devolli and Osumi rivers.
This was probably caused by the secondary contact of allopatric species in the Danube River drainage and by the existence of several refugia in Albania, which allowed the formation of slightly differentiated mitochondrial haplotype groups that recently came into contact. Interspecific hybrids, even at the level of cytonuclear disequilibrium, were not observed; heterozygotes always carried haplotypes from the same lineage. The molecular study documents the superior informative ability of the mitochondrial marker cyt b compared with the identifying capability of the nuclear markers at the substitution level. In combination with the intron identification and diagnostic indels, the present study ascertained the existence of two main spirlin clades: European and Ponto-Caspian. The first can be divided into five groups: Midwest European, Albanian, South Bulgarian, Russian–Ukrainian and, the most distinct, Aegean. From a biogeographical viewpoint, the locations of lineage richness in most cases correspond to confirmed glacial refugia (Seifertová et al., 2012; Costedoat and Gilles, 2009; Hewitt, 2004; Taberlet et al., 1998): (a) the Adriatic slope of the Balkans; (b) eastern Greece (Aegean rivers); (c) the southern tributaries of the Danube; (d) the peripheries of the Black and Caspian Seas. Furthermore, for most new taxa, their distributions are clearly demarcated by confirmed freshwater ecoregional boundaries (Abell et al., 2008). The genetic analyses showed that species richness in the genus Alburnoides has been underestimated and confirmed the validity of many morphologically accepted species (Table 4), namely A. bipunctatus s. stricto (but with a significantly reduced range), A. ohridanus, A. fasciatus, A. kubanicus, A. tzanevi, A. rossicus, A. maculatus, A. prespensis, and A. eichwaldii. In addition, separate species status was demonstrated both for A. strymonicus (from the Aegean basin only) and A.
thessalicus (from the Thermaikos Gulf system or from the Sperchios River drainage), which are not yet considered valid species (Eschmeyer, 2015). Apart from the 11 valid species mentioned above, the study revealed four phylogenetic lineages that require description as separate species. Two of them are Alburnoides sp. 1 and Alburnoides sp. 2, found in the Danube River

Please cite this article in press as: Stierandová, S., et al. A multilocus assessment of nuclear and mitochondrial sequence data elucidates phylogenetic relationships among European spirlins (Alburnoides, Cyprinidae). Mol. Phylogenet. Evol. (2015), http://dx.doi.org/10.1016/j.ympev.2015.10.025


S. Stierandová et al. / Molecular Phylogenetics and Evolution xxx (2015) xxx–xxx

drainage (Midwest European area). The distribution of the different phylogenetic lineages in this large and complicated river system demonstrates a certain geographic pattern: A. bipunctatus inhabits only the upper part of the basin, Alburnoides sp. 1 occurs in the middle part, and Alburnoides sp. 2 populates the lower river system. The latter two species occur sympatrically in some tributaries of the Sava and Korona rivers. Another new species should be described from the Thermaikos Gulf river systems (Aegean Sea basin in Greece and the Republic of Macedonia) or from the Sperchios River drainage in Central Greece, depending on the designation of the neotype for A. thessalicus. The fourth species should be described from the rivers of the eastern Middle Caspian basin (Tugcay, Agcay, and Xizi rivers). All of them populate somewhat geographically isolated river systems. In particular, the presumed new Caspian species inhabits territories geographically isolated from the species-rich area south of the Caucasus, which is occupied both by numerous newly described morphological species (see Bogutskaya and Coad, 2009; Coad and Bogutskaya, 2012) and by genetically identified monophyletic lineages (Seifali et al., 2012 and this study), including the species A. eichwaldii, genetically confirmed for the Kura River basin. The more or less diverging lineages (A. ohridanus and the A. prespensis complex) were observed in the Albanian rivers, in the so-called Southeast Adriatic Freshwater Ecoregion (Abell et al., 2008). The exceptionally high localised endemism in this region confirms that it is a hotspot of endemic biodiversity. This area includes both numerous small streams flowing directly to the sea and the ancient Ohrid-Prespa lake complex, which probably originated during the Pliocene (Wagner and Wilke, 2011) and was formerly connected to the Southern Adriatic rivers through the Korça graben (Fouache et al., 2001).
Also, the former Maliq Lake in the Korça-Devolli basin (Fouache et al., 2010), in today’s upper Devolli river system, provided a connection between Small Prespa Lake and the Devolli River during periods of high water levels (Tziritis, 2014). In the past, this situation could have had a substantial effect on the dispersal of spirlins in southern Albania. All these facts could explain the observed population complexity in this geographic area. In general, three morphologically described lineages with small mtDNA distances were revealed in the Albanian A. prespensis complex. All of them were found in the Devolli River (the main source of the Semani River), which harbours the highest observed genetic diversity of European spirlins in this study: the eight studied individuals belong to three different morphological lineages. However, A. ohridanus demonstrates high divergence from the A. prespensis complex, similar to the high genetic divergence observed for other cyprinid fishes from lakes Ohrid and Prespa. This high divergence is attributed to the mountainous terrain separating these basins, which probably acted as a physical barrier promoting genetic divergence (Tsoumani et al., 2014). The endemism of A. ohridanus and the lineages of the A. prespensis complex from Albania, as well as of A. tzanevi from Bulgaria, A. maculatus from Crimea, A. kubanicus and A. fasciatus from Russia, and A. thessalicus and one unnamed species (Alburnoides sp. 3 or sp. 4) from Greece, has been demonstrated with a high level of certainty. The Balkan populations are frequently restricted to distinct ecoregions delineated by notable biogeographical boundaries, as in the case of the Greek spirlins from the Sperchios River system and from the Thermaikos Gulf basin, which belong to different, long-isolated ecoregions (Zogaris et al., 2009). This wide-ranging phylogenetic study in concert with the recently published investigation by Urbánková et al.
(2013) concerning 11 polymorphic microsatellite loci in central Europe provides comprehensive molecular tools suitable for species identification of spirlins in the European context. We attempted an overview of the phylogeny and current taxonomy of these rather neglected small fishes, aiming to assist in defining good species units based on their

evolutionary relationships. As expected, several new questions have come to light, including, for example, the status of the newly recognised lineages, the evolutionary trends within the Albanian A. prespensis complex, the genetic diversity in the Ponto-Caspian area and the validity of the identified lineages. Finally, an important research area is the definition of recognised endemism and the conservation status assessment of the new taxa. Answers to all these questions will require detailed research.

Acknowledgments

This study was carried out within the framework of research project no. M200930901, supported by the Program of Internal Support for International Collaborative Projects of the Academy of Sciences of the Czech Republic, and through institutional support (RVO: 68081766). JV was funded by institutional resources of the Ministry of Education, Youth, and Sports of the Czech Republic. The studies of EV on museum materials were partially supported by project no. 14-50-00029 of the Russian Science Foundation, and the taxonomic investigations by the Russian Foundation for Basic Research, project no. 13-04-00279-a. We would like to thank our colleagues S. Lusk, A. Kohoutová-Šedivá, Z. Lajbner, L. Pekárik, H. Persat, D. Neumann, A.N. Pashkov, S.I. Reshetnikov, D. Kommatas, and A.N. Economou for their help with sample collection.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2015.10.025. These data include Google maps of the most important areas described in this article.

References

Abascal, F., Zardoya, R., Posada, D., 2005. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21 (9), 2104–2105. http://dx.doi.org/10.1093/bioinformatics/bti263.
Abell, R., Thieme, M.L., Revenga, C., Bryer, M., et al., 2008. Freshwater ecoregions of the world: a new map of biogeographic units for freshwater biodiversity conservation. BioScience 58 (5), 403–414. http://dx.doi.org/10.1641/B580507.
Barbieri, R., Zogaris, S., Kalogianni, E., et al., 2015. Freshwater fishes and lampreys of Greece. An annotated checklist. Hellenic Centre for Marine Research: Athens, Greece. Monographs Mar. Sci. 8, 130.
Barriel, V., 1994. Molecular phylogenies and how to code insertion/deletion events. Comptes rendus de l’Académie des Sciences III 317, 693–701.
Berg, L.S., 1924. Russian Spirlin (Alburnoides bipunctatus rossicus Berg, subsp. nova.). A collection on fisheries compiled by the department of applied ichthyology and scientific and commercial researches of the governmental institute of experimental agronomy. In: Berg, L.S. (Ed.), Bulletin of the department of ichthyology (former fisheries) and scientific and commercial researches, 2. Leningrad: Novaja Derevnja. 56. [In Russian, English summary.].
Berg, L.S., 1949. Freshwater Fishes of the U.S.S.R. and Adjacent Countries, fourth ed., vol. 2. U.S.S.R. Academy of Sciences, Moscow-Leningrad, pp. 929–1382 (in Russian).
Bogutskaya, N.G., Coad, B.W., 2009. A review of vertebral and fin-ray counts in the genus Alburnoides (Teleostei: Cyprinidae) with a description of six new species. Zoosyst. Rossica 18 (1), 126–173.
Bogutskaya, N.G., Zupančič, P., Naseka, A.M., 2010. Two new species of freshwater fishes of the genus Alburnoides, A. fangfangae and A. devolli (Actinopterygii: Cyprinidae), from the Adriatic Sea basin in Albania. Proc. Zool. Inst. RAS 314 (4), 448–468.
Buj, I., Šanda, R., Marčić, Z., Ćaleta, M., Mrakovčić, M., 2014. Combining morphology and genetics in resolving taxonomy – a systematic revision of spined loaches (Genus Cobitis; Cypriniformes, Actinopterygii) in the Adriatic Watershed. PLoS ONE 9 (6), e99833. http://dx.doi.org/10.1371/journal.pone.0099833.
Clement, M., Posada, D., Crandall, K., 2000. TCS: a computer program to estimate gene genealogies. Mol. Ecol. 9 (10), 1657–1660.
Coad, B.W., Bogutskaya, N.G., 2009. Alburnoides ganati, a new species of cyprinid fish from southern Iran (Actinopterygii, Cyprinidae). ZooKeys 13, 67–77.
Coad, B.W., Bogutskaya, N.G., 2012. A new species of riffle minnow, Alburnoides holciki, from the Hari River basin in Afghanistan and Iran (Actinopterygii: Cyprinidae). Zootaxa 3453, 43–55.
Costedoat, C., Gilles, A., 2009. Quaternary pattern of freshwater fishes in Europe: comparative phylogeography and conservation perspective. Open Conserv. Biol. J. 3, 36–48. http://dx.doi.org/10.2174/1874839200903010036.


Darriba, D., Taboada, G.L., Doallo, R., Posada, D., 2012. jModelTest 2: more models, new heuristics and parallel computing. Nat. Methods 9 (8), 772.
Eschmeyer, W.N. (Ed.), 2015. Catalog of Fishes: Genera, Species, References. Electronic version (accessed 06.04.15).
Farris, J.S., Källersjö, M., Kluge, A.G., Bult, C., 1995. Testing significance of incongruence. Cladistics 10, 315–319.
Fouache, E., Dufaure, J.J., Denèfle, M., et al., 2001. Man and environment around lake Maliq (southern Albania) during the Late Holocene. Veg. Hist. Archaeobot. 10, 79–86. http://dx.doi.org/10.1007/PL00006922.
Fouache, E., Desruelles, S., Magny, M., et al., 2010. Palaeogeographical reconstructions of Lake Maliq (Korça Basin, Albania) between 14,000 BP and 2000 BP. J. Archaeol. Sci. 37, 525–535. http://dx.doi.org/10.1016/j.jas.2009.10.017.
Freyhof, J., Lieckfeldt, D., Pitra, C., Ludwig, A., 2005. Molecules and morphology: evidence for introgression of mitochondrial DNA in Dalmatian cyprinids. Mol. Phylogenet. Evol. 37, 347–354.
Froese, R., Pauly, D. (Eds.), 2015. FishBase. World Wide Web Electronic Publication, version (02/2015).
Geiger, M.F., Herder, F., Monaghan, M.T., et al., 2014. Spatial heterogeneity in the Mediterranean Biodiversity Hotspot affects barcoding accuracy of its freshwater fishes. Mol. Ecol. Resour. http://dx.doi.org/10.1111/1755-0998.12257 (online in advance of print).
Guindon, S., Gascuel, O., 2003. A simple, fast and accurate method to estimate large phylogenies by maximum-likelihood. Syst. Biol. 52, 696–704. http://dx.doi.org/10.1080/10635150390235520.
Guindon, S., Dufayard, J.F., Lefort, V., et al., 2010. New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0. Syst. Biol. 59 (3), 307–321. http://dx.doi.org/10.1093/sysbio/syq010.
Hewitt, G.M., 2004. Genetic consequences of climatic oscillations in the Quaternary. Philos. Trans. Royal Soc. B 359, 183–195. http://dx.doi.org/10.1098/rstb.2003.1388.
Kottelat, M., 1997. European freshwater fishes. Biologia, Bratislava 52 (Suppl. 5), 1–271.
Kottelat, M., Freyhof, J., 2007. Handbook of European Freshwater Fishes. Kottelat, Cornol, Switzerland and Freyhof, Berlin, Germany.
Librado, P., Rozas, J., 2009. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics 25, 1451–1452. http://dx.doi.org/10.1093/bioinformatics/btp187.
Marková, S., Šanda, R., Crivelli, A., et al., 2010. Nuclear and mitochondrial DNA sequence data reveal the evolutionary history of Barbus (Cyprinidae) in the ancient lake systems of the Balkans. Mol. Phylogenet. Evol. 55, 488–500. http://dx.doi.org/10.1016/j.ympev.2010.01.030.
Mendel, J., Lusk, S., Vasileva, E.D., et al., 2008. Molecular phylogeny of the genus Gobio Cuvier, 1816 (Teleostei: Cyprinidae) and its contribution to taxonomy. Mol. Phylogenet. Evol. 47, 1061–1075. http://dx.doi.org/10.1016/j.ympev.2008.03.005.
Müller, K., 2005. SeqState – primer design and sequence statistics for phylogenetic DNA data sets. Appl. Bioinform. 4, 65–69.
Müller, K., 2006. Incorporating information from length-mutational events into phylogenetic analysis. Mol. Phylogenet. Evol. 38, 667–676.
Nei, M., 1987. Molecular Evolutionary Genetics. Columbia University Press, New York.
Ogden, T.H., Rosenberg, M.S., 2007. How should gaps be treated in parsimony? A comparison of approaches using simulation. Mol. Phylogenet. Evol. 42 (3), 817–826 (Erratum: 2008, 46 (2), 807–808).
Perdices, A., Doadrio, I., Economidis, P.S., Bohlen, J., Bănărescu, P., 2003. Pleistocene effects on the European freshwater fish fauna: double origin of the cobitid genus Sabanejewia in the Danube basin (Osteichthyes: Cobitidae). Mol. Phylogenet. Evol. 26 (2), 289–299. http://dx.doi.org/10.1016/S1055-7903(02)00334-2.
Perdices, A., Vasil’ev, V., Vasil’eva, E., 2012. Molecular phylogeny and intraspecific structure of loaches (genera Cobitis and Misgurnus) from the Far East region of Russia and some conclusions on their systematics. Ichthyol. Res. 59 (2), 113–123. http://dx.doi.org/10.1007/s10228-011-0259-6.
Perdices, A., Vasil’eva, E., Vasil’ev, V., 2015. From Asia to Europe across Siberia: phylogeography of the Siberian spined loach (Teleostei, Cobitidae). Zool. Scr. 44 (1), 29–40. http://dx.doi.org/10.1111/zsc.12085.
Perea, S., Böhme, M., Zupančič, P., et al., 2010. Phylogenetic relationships and biogeographical patterns in Circum-Mediterranean subfamily Leuciscinae (Teleostei, Cyprinidae) inferred from both mitochondrial and nuclear data. BMC Evol. Biol. 10, 265. http://dx.doi.org/10.1186/1471-2148-10-265.
Rambaut, A., Drummond, A., 2003. TRACER, version 1.2. Oxford, UK: University of Oxford.
Rambaut, A., 2009. FigTree: Tree Figure Drawing Tool, v1.2.1. University of Edinburgh, UK.
Robalo, J.I., Sousa-Santos, C., Levy, A., Almada, V.C., 2006. Molecular insights on the taxonomic position of the paternal ancestor of the Squalius alburnoides hybridogenetic complex. Mol. Phylogenet. Evol. 39 (1), 276–281.


Ronquist, F., Huelsenbeck, J.P., 2005. Bayesian analysis of molecular evolution using MrBayes. In: Nielsen, R. (Ed.), Statistical Methods in Molecular Evolution. Springer, New York.
Rylková, K., Kalous, L., Bohlen, J., Lamatsch, D.K., Petrtýl, M., 2013. Phylogeny and biogeographic history of the cyprinid fish genus Carassius (Teleostei: Cyprinidae) with focus on natural and anthropogenic arrivals in Europe. Aquaculture 380–383, 13–20. http://dx.doi.org/10.1016/j.aquaculture.2012.11.027.
Šanda, R., Mlíkovský, J., 2012. Authorship and type specimens of Alburnoides kubanicus (Teleostei: Cyprinidae). Zootaxa 3498, 87–88.
Šedivá, A., Janko, K., Šlechtová, V., et al., 2008. Around or across the Carpathians: colonization model of the Danube basin inferred from genetic diversification of stone loach (Barbatula barbatula) populations. Mol. Ecol. 17, 1277–1292. http://dx.doi.org/10.1111/j.1365-294X.2007.03656.x.
Seifali, M., Arshad, A., Moghaddam, F.Y., et al., 2012. Mitochondrial genetic differentiation of spirlin (Actinopterigii: Cyprinidae) in the South Caspian Sea basin of Iran. Evolut. Bioinform. Online. 2012 (8), 219–227. http://dx.doi.org/10.4137/EBO.S9207.
Seifertová, M., Bryja, J., Vyskočilová, M., Martínková, N., Šimková, A., 2012. Multiple Pleistocene refugia and post-glacial colonization in the European chub (Squalius cephalus) revealed by combined use of nuclear and mitochondrial markers. J. Biogeogr. 39, 1024–1040. http://dx.doi.org/10.1111/j.1365-2699.2011.02661.x.
Simmons, M.P., Ochoterena, H., 2000. Gaps as characters in sequence-based phylogenetic analysis. Syst. Biol. 49, 369–381.
Sorokin, P.A., Medvedev, D.A., Vasil’ev, V.P., Vasil’eva, E.D., 2011. Further studies of mitochondrial genome variability in Ponto-Caspian Proterorhinus species (Actinopterygii: Perciformes: Gobiidae) and their taxonomic implications. Acta Ichthyol. Piscatoria 41, 95–104.
Stephanidis, A., 1950. Contribution à l’étude des poissons d’eau douce de la Grèce. Praktika tes Akademias Athenon 18, 200–210 (in Greek, French summary).
Stephens, M., Smith, N.J., Donnelly, P., 2001. A new statistical method for haplotype reconstruction from population data. Am. J. Hum. Genet. 68, 978–989. http://dx.doi.org/10.1086/319501.
Stephens, M., Scheet, P., 2005. Accounting for decay of linkage disequilibrium in haplotype inference and missing data imputation. Am. J. Hum. Genet. 76, 449–462. http://dx.doi.org/10.1086/428594.
Stöver, B.C., Müller, K.F., 2010. TreeGraph 2: combining and visualizing evidence from different phylogenetic analyses. BMC Bioinformatics 11, 7. http://dx.doi.org/10.1186/1471-2105-11-7.
Swofford, D.L., Selander, R.B., 2002. PAUP*: Phylogenetic Analysis Using Parsimony (*and other Methods). Sinauer Associates, Sunderland, MA.
Taberlet, P., Fumagalli, L., Wust-Saucy, A.G., Cosson, J.F., 1998. Comparative phylogeography and postglacial colonization routes in Europe. Mol. Ecol. 7, 453–464. http://dx.doi.org/10.1046/j.1365-294x.1998.00289.
Tamura, K., Peterson, D., Peterson, N., et al., 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739. http://dx.doi.org/10.1093/molbev/msr121.
Tsoumani, M., Georgiadis, A., Giantsis, I.A., Leonardos, I., Apostolidis, A.P., 2014. Phylogenetic relationships among Southern Balkan Rutilus species inferred from cytochrome b sequence analysis: micro-geographic resolution and taxonomic implications. Biochem. Syst. Ecol. 54, 172–178. http://dx.doi.org/10.1016/j.bse.2014.02.006.
Turan, D., Kaya, C., Ekmekçi, F., Güçlü, S., 2013. Alburnoides manyasensis (Actinopterygii, Cyprinidae), a new species of cyprinid fish from Manyas Lake basin, Turkey. ZooKeys 276, 85–102. http://dx.doi.org/10.3897/zookeys.276.4107.
Turan, D., Kaya, C., Ekmekçi, F.G., Dogan, E., 2014. Three new species of Alburnoides (Teleostei: Cyprinidae) from Euphrates River, Eastern Anatolia, Turkey. Zootaxa 3754 (2), 101–116. http://dx.doi.org/10.11646/zootaxa.3754.2.1.
Tziritis, E.P., 2014. Environmental monitoring of Micro Prespa Lake basin (Western Macedonia, Greece): hydrogeochemical characteristics of water resources and quality trends. Environ. Monit. Assess. 186 (7), 4553–4568. http://dx.doi.org/10.1007/s10661-014-3719-4.
Urbánková, S., Mendel, J., Vyskočilová, M., 2013. Microsatellite loci for population studies of the genus Alburnoides and four other critically endangered and vulnerable cyprinids in the Czech Republic. Mol. Ecol. Resour. 13 (3), 546–549.
Vassilev, M.V., Pehlivanov, L.Z., 2005. Checklist of Bulgarian freshwater fishes. Acta Zool. Bulg. 57 (2), 161–190.
Wagner, B., Wilke, T., 2011. Evolutionary and geological history of the Balkan lakes Ohrid and Prespa. Biogeosciences 8, 995–998. http://dx.doi.org/10.5194/bg-8-995-2011.
Zardoya, R., Doadrio, I., 1999. Molecular evidence on the evolutionary and biogeographical patterns of European Cyprinids. J. Mol. Evol. 49 (2), 227–237.
Zogaris, S., Economou, A., Dimopoulos, P., 2009. Ecoregions in the Southern Balkans: should they be revised? Environ. Manage. 43, 682–697. http://dx.doi.org/10.1007/s00267-008-9243-y.
