A robust phylogeny among major lineages of the East African cichlids


Accepted Manuscript

A robust phylogeny among major lineages of the East African cichlids

Tetsumi Takahashi, Teiji Sota

PII: S1055-7903(16)30046-X
DOI: http://dx.doi.org/10.1016/j.ympev.2016.04.012
Reference: YMPEV 5484

To appear in: Molecular Phylogenetics and Evolution

Received Date: 10 December 2015
Revised Date: 16 March 2016
Accepted Date: 7 April 2016

Please cite this article as: Takahashi, T., Sota, T., A robust phylogeny among major lineages of the East African cichlids, Molecular Phylogenetics and Evolution (2016), doi: http://dx.doi.org/10.1016/j.ympev.2016.04.012

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

A robust phylogeny among major lineages of the East African cichlids

Tetsumi Takahashi a,b,1, Teiji Sota a

a Department of Zoology, Graduate School of Science, Kyoto University, Sakyo, Kyoto 606-8502, Japan

b National Institute of Genetics, Yata, Mishima, Shizuoka 411-8540, Japan

Running head: A PHYLOGENY OF TANGANYIKA CICHLID

*Corresponding author. Address: Institute of Natural and Environmental Sciences, University of Hyogo, Yayoigaoka, Sanda, Hyogo 669-1546, Japan. E-mail address: [email protected] (T. Takahashi).

1 Present address: Institute of Natural and Environmental Sciences, University of Hyogo, Yayoigaoka, Sanda, Hyogo 669-1546, Japan.


ABSTRACT

The huge monophyletic group of the East African cichlid radiations (EAR) consists of thousands of species belonging to 12–14 tribes; the number of tribes differs among studies. Many studies have inferred phylogenies of EAR tribes using various genetic markers. However, these phylogenies partly contradict one another and can have weak statistical support. In this study, we conducted maximum-likelihood (ML) phylogenetic analyses using restriction site-associated DNA (RAD) sequences and propose a new robust phylogenetic hypothesis among Lake Tanganyika cichlid fishes, which cover most EAR tribes. Data matrices can vary in size and contents depending on the strategies used to process RAD sequences. Therefore, we prepared 23 data matrices with various processing strategies. The ML phylogenies inferred from 15 large matrices (2.0×10^6 to 1.1×10^7 base pairs) resolved every tribe as a monophyletic group with 100% bootstrap support and shared the same topology regarding relationships among the tribes. Most nodes among the tribes were supported by 100% bootstrap values, and the bootstrap support for the one remaining node varied among the 15 ML trees from 70% to 100%. These robust ML trees differ partly in topology from those in earlier studies, and these phylogenetic relationships have important implications for the tribal classification of EAR.

Keywords: Cichlidae; Lake Tanganyika; maximum-likelihood analysis; next generation sequencer; restriction site-associated DNA sequencing; tribe


1. Introduction

The huge monophyletic group of the East African cichlid radiations (EAR, Schwarzer et al., 2009; Dunz and Schliewen, 2013) consists of thousands of species inhabiting lakes and rivers mainly in East Africa. Cichlid species of this group exhibit high morphological, ecological and genetic diversity, and large-scale adaptive radiations have taken place at least in Lake Tanganyika (Salzburger et al., 2002; but see Genner et al., 2007; Schwarzer et al., 2009), Lake Malawi (Sturmbauer et al., 2001), the Lake Victoria basin (Verheyen et al., 2003) and the extinct Lake Makgadikgadi basin (Joyce et al., 2005). Lake Tanganyika, which is likely the oldest lake in Africa (9–12 million years, Cohen et al., 1993), is an evolutionary reservoir of old lineages of EAR (Nishida, 1991); indeed, many EAR lineages consist entirely of species endemic to this lake. Therefore, Lake Tanganyika cichlid fishes are important for resolving the phylogeny of the cichlid fishes of the EAR. The large, young haplochromine species flocks of Lake Malawi and the Lake Victoria basin, which are well-known examples of rapid adaptive radiation, originated in the Lake Tanganyika radiation (Salzburger et al., 2005; Koblmüller et al., 2008a).

Poll (1986) first classified the Lake Tanganyika cichlid fishes into 12 tribes based on morphological features. Takahashi (2003a) revised Poll’s (1986) classification and recognised 16 tribes based on cladistic analysis of anatomical data, of which 14 tribes composed the EAR. Subsequently, further revisions were made to the framework of the EAR tribes (Fig. 1). Currently, there is at least one point of debate regarding the tribal classification of EAR cichlid fishes. That is, three alternative tribal classifications exist for the genera Bathybates Boulenger, 1898, Hemibates Regan, 1920 and Trematocara


Boulenger, 1899. Poll (1986) recognised the tribes Bathybatini Poll, 1986 (Bathybates and Hemibates) and Trematocarini Poll, 1986 (Trematocara; note that Poll, 1986 also included the genus Telotrematocara Poll, 1986 in this tribe, but this genus is currently treated as a synonym of Trematocara, see Takahashi, 2002); Takahashi (2003a) recognised only Bathybatini, consisting of all three genera; and Koblmüller et al. (2008b) suggested the presence of three monotypic tribes, Trematocarini, Bathybatini and Hemibatini Koblmüller, 2008. The phylogenetic relationships among these three genera have not been clearly resolved (Koblmüller et al., 2005; Day et al., 2008; Weiss et al., 2015).

Phylogenetic relationships among the EAR cichlid tribes have repeatedly been inferred based on different molecular markers, although statistical support for these topologies is often weak (Table 1). These phylogenies partly agree in topology, but differ in several points. Many studies have located the tribes Boulengerochromini Takahashi, 2003, Bathybatini (including Hemibates) and Trematocarini, which are endemic to Lake Tanganyika, close to the root of EAR (Fig. 2), but the relationships among these three tribes have not been clearly resolved (Salzburger et al., 2002; Clabaut et al., 2005; Day et al., 2008; Muschick et al., 2012; Meyer et al., 2015; Weiss et al., 2015). The remaining tribes form a monophyletic group called MVhL, which is an abbreviation for "the clade formed by cichlids of Lakes Malawi and Victoria, the H-lineage, and Lamprologini" (Takahashi et al., 2001). The clade of the species-rich tribe Lamprologini Poll, 1986 (more than 90 species belonging to eight genera are currently treated as valid) diverged close to the root of MVhL (Nishida, 1991; Kocher et al., 1995; Salzburger et al., 2002; Clabaut et al., 2005; Day et al., 2008; Meyer et al., 2015; Weiss et al., 2015; McGee et al., 2016). Most


Lamprologini species are endemic to Lake Tanganyika, while nine inhabit the Congo River and one the Malagarasi River (Schelly et al., 2003; Schelly and Stiassny, 2004; Tougas and Stiassny, 2014). The Congo River is connected to Lake Tanganyika via the Lukuga River, and the Malagarasi River flows into the lake. Lamprologini most likely started the radiation in Lake Tanganyika about 5.3 million years ago, and then some fishes probably colonised the Congo and Malagarasi rivers (Sturmbauer et al., 1994, 2010).

In MVhL, the phylogenetic position of the tribe Eretmodini Poll, 1986 differs between studies. Phylogenies based on nDNA markers (Nishida, 1991; Meyer et al., 2015; McGee et al., 2016) and sequences of the mtDNA cytb gene and control region (Sturmbauer and Meyer, 1993) resolve the tribe Eretmodini as a sister group of the tribe Haplochromini Trewavas, 1983 in a monophyletic group sister to Lamprologini, called the H-lineage (Nishida, 1991) (Fig. 2a). Another nDNA phylogeny based on AFLP data also supports the inclusion of Eretmodini in the H-lineage (Weiss et al., 2015). However, phylogenies inferred from data consisting of or partly including ND2 sequences of mtDNA place Eretmodini outside the H-lineage (Fig. 2b): Eretmodini is sister to Lamprologini (Kocher et al., 1995; Clabaut et al., 2005, 2007; Day et al., 2008) or to a clade consisting of the other MVhL tribes (Salzburger et al., 2002; Muschick et al., 2012). The monophyletic group of H-lineage minus Eretmodini is called the redefined H-lineage (Salzburger et al., 2002) or C-lineage (Clabaut et al., 2005). In the H-lineage (including Eretmodini) or C-lineage (excluding Eretmodini), tribes other than Haplochromini and Orthochromini Clabaut et al., 2005 consist only of species endemic to Lake Tanganyika. The most species-rich tribe Haplochromini includes the Lake Tanganyika tropheine species and haplochromine species flocks of Lake Malawi and the Lake Victoria


basin. The tribe Orthochromini consists of Orthochromis species from the Malagarasi River. Phylogenetic relationships of the H-lineage or C-lineage tribes are not consistent among phylogenetic studies.

Some recent studies have applied restriction site-associated DNA (RAD) sequences to phylogenetic analyses among variously related taxa and obtained better node support values, such as bootstrap percentages, than analyses using limited sequences of mtDNA or nDNA (e.g., Jones et al., 2013; Wagner et al., 2013; Takahashi et al., 2014; Longo and Bernardi, 2015; Takahashi and Moreno, 2015; Tariel et al., 2016). In the present study, we applied RAD sequence data to phylogenetic inference of Tanganyika cichlid fishes, which cover most EAR tribes, to obtain a robust tree.

2. Materials and Methods

2.1. Fishes

Fifty species representing all EAR cichlid tribes except Orthochromini were used as the ingroup, and Oreochromis tanganicae was used as the outgroup (Table 2). Genera whose tribal allocations are still debated were included in the analyses (i.e. Bathybates, Hemibates and Trematocara). Right pectoral fins or whole bodies of samples collected in Lake Tanganyika were stored at –30 ºC or room temperature in 100% ethanol. DNA extract solutions of three Lake Victoria haplochromine species were provided by Dr. Yohei Terai of the National Institute for Basic Biology, Japan. One individual was analysed from each species.


2.2. RAD sequencing

RNA-free total genomic DNA was extracted from the right pectoral fin using a Wizard Genomic DNA Purification Kit (Promega, Madison, WI, USA). The concentration of DNA was determined using a Qubit 2.0 Fluorometer (Invitrogen, Carlsbad, CA, USA) and adjusted to 25 ng/µL. RAD libraries were prepared as described by Etter et al. (2011). Each individual sample (40 µL containing 1 µg of genomic DNA) was digested with the high-fidelity restriction enzyme SbfI (New England Biolabs, Ipswich, MA, USA; 20 units for each sample) in CutSmart buffer for 90–180 min and barcoded with a unique five-nucleotide sequence, differing by at least two bases between samples. The library construction was conducted at Kyoto University until P1 adaptor ligation and subsequently at Hokkaido System Science (HSS), Sapporo, Japan. All samples were run in two lanes of single-end 101-bp sequencing on a HiSeq2000 sequencer (Illumina, San Diego, CA, USA) at HSS.
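The requirement that barcodes differ by at least two bases between samples can be checked with a pairwise Hamming distance. The following is a minimal sketch in Python; the function names and the example barcodes are hypothetical and are not the barcodes used in this study.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance over all pairs of barcodes."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Hypothetical five-nucleotide barcodes (illustration only).
example_barcodes = ["AAGGT", "ACCTG", "TTGCA", "GATAC"]

# The set is acceptable if every pair differs by at least two bases,
# so that a single sequencing error cannot turn one barcode into another.
barcodes_ok = min_pairwise_distance(example_barcodes) >= 2
```

With a minimum distance of two, a single miscalled base in a barcode yields a sequence that matches no sample exactly, so the read can be rejected rather than assigned to the wrong sample.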

2.3. RAD data processing

We used PyRAD ver. 2.0 (Eaton, 2014) for RAD sequence processing. This software package allows for indels when clustering sequence reads into orthologous loci and probably performs better in orthology identification among distantly related taxa than another commonly used programme, Stacks (Catchen et al., 2011), which does not consider indels. Stacks can align RAD sequences to a reference genome, while PyRAD does not support this method. However, we prefer de novo alignment to reduce bias between samples. The number of sequences correctly aligned to the reference genome decreases


with increasing genetic distance from the taxon of the reference genome; therefore, alignment to the reference genome may not be a good choice for the analysis of variously related taxa as in the present study. The key parameters for making a data matrix in PyRAD are the clustering threshold (Wclust), which is the minimum proportion of identical nucleotides required for identification of orthologous RAD loci between samples, and the minimum number of taxa (Mintaxa), in which a locus is included in the data matrix only when its allelic sequences are recovered in a number of samples ≥ Mintaxa (see Takahashi et al., 2014 for details). The values of Wclust and Mintaxa affect the size and contents of the data matrix, and such varied matrices can result in different topologies (Takahashi et al., 2014; Takahashi and Moreno, 2015). Therefore, we made 23 data matrices with four Wclust values (50%, 65%, 80%, 95%) and seven Mintaxa values (4, 12, 20, 28, 36, 44, 51) for phylogenetic analyses (five settings with Wclust = 50% and Mintaxa ≥ 20 were not considered because no sequences were recovered in all or part of the samples). The raw RAD sequence data obtained from the HiSeq sequencer were processed under given Wclust and Mintaxa values as follows. RAD sequences (101 base pairs in length) were sorted by their barcode sequences into individual samples. Bases with a Phred quality score lower than 20 were changed to “N”s, and sequence reads with more than four “N”s were discarded (quality filtering). Then, the barcode sequence (1st to 5th sites) and the restriction site (6th to 11th sites) were removed from each RAD sequence. In each sample, the processed sequences (12th to 101st sites; 90 bp in length) were clustered into orthologous loci under the given Wclust value (PyRAD uses the same clustering threshold to identify orthologous loci within samples and among


samples; otherwise, sequences that were identified as different loci within a sample could be combined, undesirably, into the same loci in the later orthologous search among samples). In each sample, loci with coverage depths of less than three, four or more undetermined sites, four or more heterozygous sites or three or more alleles were discarded. From these data, orthologous RAD loci were identified among samples under the given Wclust value and filtered under the given Mintaxa value. The resulting orthologous loci, including both variable and invariable loci, were concatenated into a data matrix.
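The per-read quality filtering and trimming steps described above can be sketched as follows. This is an illustration only, not the PyRAD implementation: the function names are hypothetical, and counting "N"s over the whole 101-bp read before trimming is an assumption, since the text does not state at which point the count is made.

```python
def mask_low_quality(seq, quals, min_q=20):
    """Replace every base whose Phred score is below min_q with 'N'."""
    return "".join(b if q >= min_q else "N" for b, q in zip(seq, quals))


def process_read(seq, quals, min_q=20, max_n=4, trim=11):
    """Quality-filter one read and strip the barcode and restriction site.

    Returns the trimmed read (sites 12 onwards), or None if the read
    contains more than max_n masked bases and is therefore discarded.
    Assumption: 'N's are counted over the whole read before trimming.
    """
    masked = mask_low_quality(seq, quals, min_q)
    if masked.count("N") > max_n:
        return None
    # Sites 1-5 are the barcode and sites 6-11 the restriction site.
    return masked[trim:]


# Hypothetical 101-bp read: 5-bp barcode + 6-bp restriction site + 90 bp.
read = "ACGTA" + "TGCAGG" + "A" * 90
good_quals = [30] * 101
kept = process_read(read, good_quals)  # 90-bp sequence is retained

bad_quals = list(good_quals)
bad_quals[20:25] = [10] * 5  # five low-quality bases -> five 'N's
dropped = process_read(read, bad_quals)  # None: read is discarded
```

A read with exactly four masked bases is retained, because only reads with more than four "N"s are discarded.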

2.4. Tree search and evaluation

Rapid bootstrapping analysis with 100 bootstrap replicates was conducted for each data matrix produced from the RAD analysis, and maximum-likelihood (ML) analysis was performed using RAxML ver. 8.0.19 (Stamatakis et al., 2008; Stamatakis, 2014). Throughout the ML analyses, we used a general time-reversible model with gamma-distributed rate variation (GTR+Γ). We chose this most inclusive model because simpler evolution models can greatly reduce the accuracy of phylogenetic inference when data matrices contain large proportions of missing data (Roure et al., 2013). All data processing and phylogenetic analyses were performed using the supercomputer at the National Institute of Genetics (Mishima, Shizuoka, Japan).
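For orientation, a RAxML call of this kind (rapid bootstrapping with 100 replicates combined with an ML search under GTR+Γ) might be assembled as in the sketch below. The exact command line used in this study is not given in the text, so the binary name, random seeds, thread count and file names are all assumptions; the helper only builds the argument list and does not run RAxML.

```python
def raxml_rapid_bootstrap_cmd(alignment, run_name, replicates=100,
                              seed_p=12345, seed_x=12345,
                              binary="raxmlHPC-PTHREADS", threads=4):
    """Build an argument list for a RAxML rapid-bootstrap + ML analysis.

    Assumed values (not from the paper): binary, seeds, threads.
    """
    return [
        binary,
        "-f", "a",               # rapid bootstrapping and best-ML-tree search
        "-m", "GTRGAMMA",        # GTR with gamma-distributed rate variation
        "-p", str(seed_p),       # random seed for parsimony starting trees
        "-x", str(seed_x),       # random seed for rapid bootstrapping
        "-#", str(replicates),   # number of bootstrap replicates
        "-s", alignment,         # concatenated data matrix (PHYLIP format)
        "-n", run_name,          # suffix for output file names
        "-T", str(threads),      # threads (PTHREADS build only)
    ]


# Hypothetical file and run names.
cmd = raxml_rapid_bootstrap_cmd("matrix_w95_m12.phy", "w95_m12")
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where the assumed binary is installed.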

3. Results

3.1. RAD sequencing and sequence data matrices

We obtained a total of 287 million RAD sequences from two lanes of a HiSeq


sequencer [DNA Data Bank of Japan (DDBJ): accession No. DRA004514]. The number of RAD sequences that passed quality filtering per sample ranged from 9.5×10^4 to 1.0×10^7 (233 million sequences in total; Table 2). The mean coverage depth of loci (the mean number of sequences per locus with three or more sequences) varied greatly among samples (Table 2), partly due to low quality of some DNA samples. From the RAD sequence data, we made 23 matrices with four Wclust and seven Mintaxa values. The matrices were much larger when Wclust ≥ 65% than at Wclust = 50%, and matrix size decreased as Mintaxa increased under a given Wclust value (Fig. 3a). The largest matrix was at Wclust = 95% and Mintaxa = 4 (corresponding to a sequence length of 11×10^6 base pairs), and the smallest matrix was at Wclust = 95% and Mintaxa = 51 (a sequence length of 812 base pairs). The proportion of missing data was much higher in matrices with Wclust = 50% than in the other matrices and decreased with increasing Mintaxa values under a given Wclust value (Fig. 3b). In each data matrix, the proportions of missing data tended to be higher with lower mean coverage depths when Wclust ≥ 65% and Mintaxa ≤ 44 (Fig. 4). The proportions of gaps in data matrices were much lower than the proportions of missing data (Fig. 3c). The numbers of single nucleotide polymorphisms (SNPs) were much higher when Wclust ≥ 65% than at Wclust = 50%, and the number decreased markedly as Mintaxa increased under a given Wclust value (Fig. 3d). These tendencies in matrix size, proportion of missing data, proportion of gaps and the number of SNPs were also observed in earlier studies analysing other animal groups (Takahashi et al., 2014; Takahashi and Moreno, 2015).


3.2. Phylogenetic analyses

The estimated rates of nucleotide substitution and base frequencies in the ML analyses with the GTR+Γ model did not vary substantially among large and moderately sized matrices with Wclust ≥ 65% and Mintaxa ≤ 44 (Supplementary Table S1). Of the 23 ML trees, 17 trees inferred from matrices with Wclust ≥ 65% and Mintaxa ≤ 44 (except for Wclust = 95% and Mintaxa = 44) resolved all tribes as monophyletic groups with 100% bootstrap support (Fig. 5). In the ML tree with Wclust = 95% and Mintaxa = 44, only the tribe Limnochromini Poll, 1986 was not recovered as monophyletic; the sample of the Limnochromini species Greenwoodochromis staneri (Poll, 1949), which yielded the smallest number of RAD sequences (Table 2), was not located in the clade of the tribe but close to the root of the tree (Supplementary Fig. S1). Five ML trees inferred from small data matrices with Wclust = 50% or Mintaxa = 51 had low bootstrap support (< 50%) on many nodes, and many tribes were not resolved as monophyletic groups.

With regard to the relationships among the tribes, the 15 ML trees inferred from large matrices (> 2×10^6 base pairs) with moderate Wclust (≥ 65%) and Mintaxa (≤ 36) values shared the same topology (Fig. 5, Supplementary Fig. S1). In these trees, the tribes Boulengerochromini, Bathybatini (including Hemibates) and Trematocarini formed a monophyletic group close to the root of the tree. The large clade consisting of the remaining tribes corresponds to MVhL (Takahashi et al., 2001). In this clade, the tribe Lamprologini was sister to the clade consisting of the other tribes, which corresponds to the H-lineage (Nishida, 1991). In the H-lineage, the tribes Cyphotilapiini Salzburger et al., 2002, Ectodini Poll, 1986 and Limnochromini formed a clade, and the tribes Cyprichromini


Poll, 1986, Benthochromini Takahashi, 2003, Perissodini Poll, 1986, Eretmodini and Haplochromini formed another clade. In the former clade, Ectodini and Limnochromini formed a clade that was sister to Cyphotilapiini. In the latter clade, Cyprichromini, Benthochromini and Perissodini formed a clade, which was sister to a clade consisting of Eretmodini and Haplochromini. In the 15 ML trees inferred from the large matrices (Wclust ≥ 65% and Mintaxa ≤ 36), nine of the ten nodes regarding the relationships among the tribes in EAR were supported by 100% bootstrap values, and the bootstrap value supporting the clade consisting of the tribes Cyphotilapiini, Ectodini and Limnochromini varied from 70% to 100% among these trees (Table 3, Supplementary Fig. S1).
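The counts of matrices and trees reported above follow directly from the parameter grid, and the short sketch below reproduces that bookkeeping. The grid values and the grouping rules are taken from the text; the function names and category labels are hypothetical.

```python
from itertools import product

WCLUST = (50, 65, 80, 95)                # clustering thresholds (%)
MINTAXA = (4, 12, 20, 28, 36, 44, 51)    # minimum numbers of taxa


def analysed_settings():
    """All grid points except the five with Wclust = 50% and Mintaxa >= 20."""
    return [(w, m) for w, m in product(WCLUST, MINTAXA)
            if not (w == 50 and m >= 20)]


def classify(w, m):
    """Group a setting according to the outcome reported in Section 3.2."""
    if w == 50 or m == 51:
        return "small_matrix_low_support"
    if (w, m) == (95, 44):
        return "limnochromini_not_monophyletic"
    return "all_tribes_monophyletic"


settings = analysed_settings()
counts = {}
for w, m in settings:
    label = classify(w, m)
    counts[label] = counts.get(label, 0) + 1

# Settings whose trees shared the same inter-tribal topology.
shared_topology = [(w, m) for w, m in settings if w >= 65 and m <= 36]
```

Under these rules the grid yields 23 settings in total, of which 17, 1 and 5 fall into the three outcome groups, and 15 settings satisfy Wclust ≥ 65% and Mintaxa ≤ 36.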

4. Discussion

4.1. Application of RAD phylogenetics for EAR

In RAD phylogenetic analyses using PyRAD software, the values of Wclust and Mintaxa affect the size and contents of the data matrix, and thus can affect the topology and reliability of the phylogenetic tree inferred from the matrix (Takahashi et al., 2014). In this study, the ML trees produced from large matrices (> 2.0×10^6 base pairs) with Wclust ≥ 65% and Mintaxa ≤ 36 recovered all tribes as monophyletic with 100% bootstrap support and shared the same topology regarding the relationships among the 12 tribes examined; the ten nodes among the tribes were strongly supported statistically, and all the nodes were supported by 100% bootstrap values when Wclust = 95% and Mintaxa = 12. However, when Wclust and Mintaxa were set to extreme values, the length of the data matrix was reduced to less than 1.5×10^6 base pairs and bootstrap support for the phylogenetic tree tended to be


weak, as shown in previous studies in other animal groups (Takahashi et al., 2014; Takahashi and Moreno, 2015). The small matrix size and low statistical support for the phylogenies may have been partly due to the increase in proportion of orthologous RAD loci that were wrongly identified and removed from the data matrix when Wclust was extremely small, or the excessive exclusion of RAD loci recovered in limited taxa when Mintaxa was extremely large. Large amounts of data were missing in samples with low coverage depths of loci (Fig. 4); in the most extreme case, more than 90% of sequence data were missing in the sample of Greenwoodochromis staneri when Wclust and Mintaxa were set to moderate values. However, samples with low coverage depths seemed not to be problematic in our phylogenetic analyses because these samples always formed a clade with con-tribal samples in robust trees inferred from large data matrices, as shown in a previous study using another animal group (Takahashi et al., 2014).
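The link between loci that are absent from a sample and missing data in the concatenated matrix can be illustrated with a toy example. This is a simplified sketch and not the PyRAD procedure: loci are represented as hypothetical dictionaries mapping sample names to aligned sequences, and absent samples are padded with "N".

```python
def filter_by_mintaxa(loci, mintaxa):
    """Keep loci recovered in at least `mintaxa` samples."""
    return [locus for locus in loci if len(locus) >= mintaxa]


def concatenate(loci, samples, missing="N"):
    """Concatenate loci per sample, padding absent loci with `missing`."""
    parts = {s: [] for s in samples}
    for locus in loci:
        length = len(next(iter(locus.values())))
        for s in samples:
            parts[s].append(locus.get(s, missing * length))
    return {s: "".join(p) for s, p in parts.items()}


def missing_proportion(seq, missing="N"):
    """Fraction of sites in a sequence that are missing."""
    return seq.count(missing) / len(seq) if seq else 0.0


# Toy data: three samples and three 4-bp loci (hypothetical).
samples = ["A", "B", "C"]
loci = [
    {"A": "ACGT", "B": "ACGA", "C": "ACGT"},  # recovered in all samples
    {"A": "GGCC", "B": "GGCA"},               # absent from sample C
    {"A": "TTTT"},                            # recovered in one sample only
]

kept = filter_by_mintaxa(loci, mintaxa=2)     # drops the third locus
matrix = concatenate(kept, samples)
missing = {s: missing_proportion(seq) for s, seq in matrix.items()}
```

In this toy matrix, sample C lacks one of the two retained loci, so half of its sites are missing, whereas samples A and B have none. A sample with low coverage depth behaves like sample C across many loci.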

4.2. Tribal classification

At present, there is disagreement regarding tribal classification of the genera Bathybates, Hemibates and Trematocara. Poll (1986) recognised the tribe Bathybatini for the genera Bathybates and Hemibates and the tribe Trematocarini for the genus Trematocara based on morphological features (the genus Telotrematocara, which was included in the tribe Trematocarini, is currently a synonym of Trematocara; see Takahashi, 2002). However, Takahashi (2003a) resolved Poll’s (1986) Bathybatini as a paraphyletic group in his morphological cladistic analysis, and synonymised the tribe Trematocarini


with the tribe Bathybatini to make the tribe Bathybatini a monophyletic group. In a subsequent mtDNA phylogeny, Koblmüller et al. (2005) showed that these genera constitute three equally divergent lineages. Therefore, Koblmüller et al. (2008b) suggested the presence of three monotypic tribes for these genera. The present phylogenies inferred from large data matrices (> 2.0×10^6 base pairs) resolved the genera Bathybates and Hemibates as a monophyletic group with 100% bootstrap support, sister to the genus Trematocara (Fig. 5). Fishes of the genera Bathybates and Hemibates share eggspot-like patterns on the anal fins of males, which are also seen in many species of the distantly related tribe Haplochromini (e.g., Amcoff et al., 2013; observations of TT). The genus Trematocara is morphologically characterised by a small number of scales on the body and large sensory pores on the head (e.g., Poll, 1986; Takahashi, 2003ab). Considering the present phylogenetic patterns and the morphological features of these genera, we propose that Poll’s (1986) classification recognising tribes Bathybatini (Bathybates and Hemibates) and Trematocarini (Trematocara) is more reasonable than the other classifications of Takahashi (2003a) and Koblmüller et al. (2008b), as suggested by Kirchberger et al. (2012) from their AFLP phylogeny.

For EAR cichlid fishes, and probably also for other animal groups, there is no clear definition of which monophyletic groups should be tribes. Therefore, it is not possible to say which tribal classification is correct. As discussed above, we suggest that Poll’s (1986) classification for Bathybates, Hemibates and Trematocara is reasonable, but given their phylogenetic relationships, the other classifications proposed by Takahashi (2003a) and Koblmüller et al. (2008b) may also be valid. Similarly, Muschick et al. (2012) and


Meyer et al. (2015) resolved Trematochromis benthicola (Matthes, 1962) as a sister group of the tribe Cyphotilapiini in their molecular phylogenies (this relationship was also supported in the present phylogenies) and treated T. benthicola as a member of Cyphotilapiini. However, considering phylogenetic patterns, an alternative classification recognising a separate tribe for T. benthicola is possible, as suggested by Takahashi (2003a). In the future, the classification that is most frequently referred to will survive as the major classification.

4.3. Phylogenetic relationships among EAR tribes

The common topology of the phylogenies inferred from the large matrices (> 2.0×10^6 base pairs) differed partly from the phylogenies proposed previously (Figs. 2, 5). In the present study, the tribes Boulengerochromini, Bathybatini (including Hemibates) and Trematocarini formed a monophyletic group supported by a 100% bootstrap value close to the roots of the trees. However, in earlier studies, this monophyletic group was not recovered (Nishida, 1991; Kocher et al., 1995; Muschick et al., 2012; Weiss et al., 2015), or even if recovered, the statistical support for monophyly of the group was very weak (Salzburger et al., 2002; Day et al., 2008; Meyer et al., 2015). The present phylogenies strongly resolved the H-lineage as a monophyletic group, similar to phylogenies based on nDNA data (Nishida, 1991; Meyer et al., 2015; Weiss et al., 2015; McGee et al., 2016) and sequences of the mtDNA cytb gene and control region (Sturmbauer and Meyer, 1993), but phylogenies based on data consisting of or partly including ND2 sequences of mtDNA did not resolve the H-lineage as monophyletic (Fig. 2b) (Kocher et al., 1995; Salzburger et al., 2002; Clabaut et


al., 2005, 2007; Day et al., 2008; Muschick et al., 2012). Within the H-lineage, the present phylogenies showed that the tribes Cyphotilapiini, Ectodini and Limnochromini formed a monophyletic group, and the tribes Cyprichromini, Benthochromini, Perissodini, Eretmodini and Haplochromini formed another monophyletic group. The former monophyletic group (Cyphotilapiini + Ectodini + Limnochromini) was recovered in nDNA-based phylogenies of Nishida (1991) and McGee et al. (2016), and the latter monophyletic group (Cyprichromini + Benthochromini + Perissodini + Eretmodini + Haplochromini) was recovered in an allozyme phylogeny of Nishida (1991).

Phylogenetic analyses based on mtDNA data are probably heavily impacted by incomplete lineage sorting and introgression (e.g., Rüber et al., 2001; Koblmüller et al., 2007ab, 2010; Sturmbauer et al., 2010). Nuclear data have been applied for phylogenetic analyses of EAR, but data set sizes were limited (Nishida, 1991; Clabaut et al., 2005; Meyer et al., 2015; Weiss et al., 2015; McGee et al., 2016). The present phylogenetic analyses based on large data sets of nDNA sequences generated with a HiSeq sequencer resulted in more robust phylogenies than earlier studies in terms of bootstrap values (Table 1; McGee et al., 2016 showed a robust Bayesian tree, but the ML tree based on the same data set had low bootstrap support on some nodes). Therefore, the present phylogenies may more precisely reflect the EAR’s species tree. However, large multi-locus sequence data, such as concatenated RAD sequence data, could support incorrect nodes with high bootstrap values (Kumar et al., 2012; Rubin et al., 2012; Salichos et al., 2014). Multi-locus data will result in a topology supported by the majority of the loci. Thus, if incomplete lineage sorting, introgression or natural/sexual selection pressures influenced a large


proportion of loci, this would lead to erroneous phylogenetic inferences. In fact, incomplete lineage sorting and introgression have been suggested among certain EAR tribes (Takahashi et al., 2001; Weiss et al., 2015). In future, analyses considering alternative gene phylogenies will deepen our knowledge regarding EAR fishes’ species tree and unravel the complex history of their tribal diversification.

Acknowledgements

We thank H. Phiri, M. Mbewe, D. Sinyinza and the other staff of the Lake Tanganyika Research Unit in Mpulungu for their support in the field, M. Hori and Y. Terai for providing samples, S. Koblmüller for essential advice, and the associate editor, G. Orti, and two anonymous reviewers for their valuable comments on an earlier version of this manuscript. This study was supported by Grants-in-Aid for Scientific Research (No. 23370043 and 26291078) from the Ministry of Education, Culture, Sports, Science and Technology (MEXT), Japan.

References

Amcoff, M., Gonzalez-Voyer, A., Kolm, N., 2013. Evolution of egg dummies in Tanganyikan cichlid fishes: the roles of parental care and sexual selection. J. Evol. Biol. 26, 2369–2382. http://dx.doi.org/10.1111/jeb.12231.

Catchen, J.M., Amores, A., Hohenlohe, P., Cresko, W., Postlethwait, J.H., 2011. Stacks: building and genotyping loci de novo from short-read sequences. Genes Genomes Genet. 1, 171–182. http://dx.doi.org/10.1534/g3.111.000240.


Clabaut, C., Salzburger, W., Meyer, A., 2005. Comparative phylogenetic analyses of the adaptive radiation of Lake Tanganyika cichlid fish: Nuclear sequences are less homoplasious but also less informative than mitochondrial DNA. J. Mol. Evol. 61, 666–681. http://dx.doi.org/10.1007/s00239-004-0217-2.

Clabaut, C., Bunje, P.M.E., Salzburger, W., Meyer, A., 2007. Geometric morphometric analyses provide evidence for the adaptive character of the Tanganyikan cichlid fish radiations. Evolution 61, 560–578. http://dx.doi.org/10.1111/j.1558-5646.2007.00045.x.

Cohen, A.S., Soreghan, M.J., Scholz, C.A., 1993. Estimating the age of formation of lakes: an example from Lake Tanganyika, East African Rift system. Geology 21, 511–514.

Day, J.J., Cotton, J.A., Barraclough, T.G., 2008. Tempo and mode of diversification of Lake Tanganyika cichlid fishes. PLoS One 3, e1730. http://dx.doi.org/10.1371/journal.pone.0001730.

Dunz, A.R., Schliewen, U.K., 2013. Molecular phylogeny and revised classification of the haplotilapiine cichlid fishes formerly referred to as "Tilapia". Mol. Phylogenet. Evol. 68, 64–80. http://dx.doi.org/10.1016/j.ympev.2013.03.015.

Eaton, D.A.R., 2014. PyRAD: assembly of de novo RADseq loci for phylogenetic analyses. Bioinformatics. http://dx.doi.org/10.1093/bioinformatics/btu121.

Etter, P.D., Bassham, S., Hohenlohe, P.A., Johnson, E.A., Cresko, W.A., 2011. SNP discovery and genotyping for evolutionary genetics using RAD sequencing. In: Orgogozo, V., Rockman, M.V. (Eds.), Molecular Methods for Evolutionary Genetics. Humana Press, New York, pp. 157–178. http://dx.doi.org/10.1007/978-1-61779-228-1_9.

Genner, M.J., Seehausen, O., Lunt, D.H., Joyce, D.A., Shaw, P.W., Carvalho, G.R., Turner,


G.F., 2007. Age of cichlids: New dates for ancient lake fish radiations. Mol. Biol. Evol. 24, 1269–1282. http://dx.doi.org/10.1093/molbev/msm050.

Jones, J.C., Fan, S., Franchini, P., Schartl, M., Meyer, A., 2013. The evolutionary history of Xiphophorus fish and their sexually selected sword: a genome-wide approach using restriction site-associated DNA sequencing. Mol. Ecol. 22, 2986–3001. http://dx.doi.org/10.1111/mec.12269.

Joyce, D.A., Lunt, D.H., Bills, R., Turner, G.F., Katongo, C., Duftner, N., Sturmbauer, C., Seehausen, O., 2005. An extant cichlid fish radiation emerged in an extinct Pleistocene lake. Nature 435, 90–95.

Kirchberger, P.C., Sefc, K.M., Sturmbauer, C., Koblmüller, S., 2012. Evolutionary history of Lake Tanganyika's predatory deepwater cichlids. Int. J. Evol. Biol. 2012, 716209. http://dx.doi.org/10.1155/2012/716209.

Koblmüller, S., Duftner, N., Katongo, C., Phiri, H., Sturmbauer, C., 2005. Ancient divergence in bathypelagic Lake Tanganyika deepwater cichlids: mitochondrial phylogeny of the tribe Bathybatini. J. Mol. Evol. 60, 297–314. http://dx.doi.org/10.1007/s00239-004-0033-8.

Koblmüller, S., Duftner, N., Sefc, K.M., Aibara, M., Stipacek, M., Blanc, M., Egger, B., Sturmbauer, C., 2007a. Reticulate phylogeny of gastropod-shell-breeding cichlids from Lake Tanganyika – the result of repeated introgressive hybridization. BMC Evol. Biol. 7, 7. http://dx.doi.org/10.1186/1471-2148-7-7.

Koblmüller, S., Egger, B., Sefc, K.M., Sturmbauer, C., 2007b. Evolutionary history of Lake Tanganyika’s scale-eating cichlid fishes. Mol. Phylogenet. Evol. 44, 1295–1305.

19

http://dx.doi.org/10.1016/j.ympev.2007.02.010. Koblmüller, S., Schliewen, U.K., Duftner, N., Sefc, K.M., Katongo, C., Sturmbauer, C., 2008a. Age and spread of the haplochromine cichlid fishes in Africa. Mol. Phylogenet. Evol. 49, 153–169. http://dx.doi.org/10.1016/j.ympev.2008.05.045. Koblmüller, S., Sefc, K.M., Sturmbauer, C., 2008b. The Lake Tanganyika cichlid species assemblage: Recent advances in molecular phylogenetics. Hydrobiologia 615, 5–20. http://dx.doi.org/10.1007/s10750-008-9552-4. Koblmüller, S., Egger, B., Sturmbauer, C., Sefc, K.M., 2010. Rapid radiation, ancient incomplete lineage sorting and ancient hybridization in the endemic Lake Tanganyika cichlid tribe Tropheini. Mol. Phylogenet. Evol. 55, 318–334. http://dx.doi.org/10.1016/j.ympev.2009.09.032. Kocher, T.D., Conroy, J.A., McKaye, K.R., Stauffer, J.R., Lockwood, S.F., 1995. Evolution of NADH dehydrogenase subunit 2 in East African cichlid fish. Mol. Phylogenet. Evol. 4, 420–432. Kumar, S., Filipski, A.J., Battistuzzi, F.U., Kosakovsky Pond, S.L., Tamura, K., 2012. Statistics and truth in phylogenomics. Mol. Biol. Evol. 29, 457–472. http://dx.doi.org/10.1093/molbev/msr202. Longo, G., Bernardi, G., 2015. The evolutionary history of the embiotocid surfperch radiation based on genome-wide RAD sequence data. Mol. Phylogenet. Evol. 88, 55–63. http://dx.doi.org/10.1016/j.ympev.2015.03.027. McGee, M.D., Faircloth, B.C., Borstein, S.R., Zheng, J., Hulsey, C.D., Wainwright, P.C., Alfaro, M.E., 2016. Replicated divergence in cichlid radiations mirrors a major

20

vertebrate innovation. Proc. R. Soc. B 283, 20151413. http://dx.doi.org/10.1098/rspb.2015.1413. Meyer, B.S., Matschiner, M., Salzburger, W., 2015. A tribal level phylogeny of Lake Tanganyika cichlid fishes based on a genomic multi-marker approach. Mol. Phylogenet. Evol. 83, 56–71. http://dx.doi.org/10.1016/j.ympev.2014.10.009. Muschick, M., Indermaur, A., Salzburger, W., 2012. Convergent evolution within an adaptive radiation of cichlid fishes. Current Biol. 22, 2362–2368. http://dx.doi.org/10.1016/j.cub.2012.10.048. Nishida, M., 1991. Lake Tanganyika as an evolutionary reservoir of old lineages of East African cichlid fishes: Inferences from allozyme data. Experientia 47, 974–979. Poll, M., 1986. Classification des Cichlidae du lac Tanganika. Tribus, genres et espèces. Acad. R. Belg. Mém. Cl. Sci. 45, 1–163. Roure, B., Baurain, D., Philippe, H., 2013. Impact of missing data on phylogenies inferred from empirical phylogenomic data sets. Mol. Biol. Evol. 30, 197–214. http://dx.doi.org/10.1093/molbev/mss208. Rüber, L., Meyer, A., Sturmbauer, C., Verheyen, E., 2001. Population structure in two sympatric species of the Lake Tanganyika cichlid tribe Eretmodini: evidence for introgression. Mol. Ecol. 10, 1207–1225. Rubin, B.E.R., Ree, R.H., Moreau, C.S., 2012. Inferring phylogenies from RAD sequence data. PLoS ONE 7, e33394. http://dx.doi.org/10.1371/journal.pone.0033394. Salichos, L., Stamatakis, A., Rokas, A., 2014. Novel information theory-based measures for quantifying incongruence among phylogenetic trees. Mol. Biol. Evol. 31, 1261-1271.

21

http://dx.doi.org/10.1093/molbev/msu061. Salzburger, W., Meyer, A., Baric, S., Verheyen, E., Sturmbauer, C., 2002. Phylogeny of the Lake Tanganyika cichlid species flock and its relationships to the Central and East African Haplochromine cichlid fish faunas. Syst. Biol. 51, 113–135. Salzburger, W., Mack T., Verheyen, E., Meyer, A., 2005. Out of Tanganyika: Genesis, explosive speciation, key-innovations and phylogeography of the haplochromine cichlid fishes. BMC Evol. Biol. 5, 17. http://dx.doi.org/10.1186/1471-2148-5-17. Schelly, R.C., Stiassny, M.L.J., 2004. Revision of the Congo River Lamprologus Schilthuis, 1891 (Teleostei: Cichlidae), with description of two new species. Am. Mus. Novitates 3451, 1–40. Schelly R., Stiassny, M.L.J., Seegers, L., 2003. Neolamprologus devosi sp. n., a new riverine lamprologine cichlid (Teleostei, Cichlidae) from the lower Malagarasi River, Tanzania. Zootaxa 373, 1–11. Schwarzer, J., Misof, B., Tautz, D., Schliewen, U.K., 2009. The root of the East African cichlid radiations. BMC Evol. Biol. 9, 186. http://dx.doi.org/10.1186/1471-2148-9-186. Stamatakis, A., 2014. RAxML version 8: a tool for phylogenetic analyses and postanalysis of large phylogenies. Bioinformatics 30, 1312–1313. http://dx.doi.org/10.1093/bioinformatics/btu033. Stamatakis, A., Hoover, P., Rougemont, J., 2008. A rapid bootstrap algorithm for the RAxML web servers. Syst. Biol. 57, 758–771. http://dx.doi.org/10.1080/10635150802429642. Sturmbauer, C., Meyer, A., 1993. Mitochondrial phylogeny of the endemic mouthbrooding

22

lineages of cichlid fishes from Lake Tanganyika in Eastern Africa. Mol. Biol. Evol. 10, 751–768. Sturmbauer, C., Verheyen, E., Meyer, A., 1994. Mitochondrial phylogeny of the Lamprologini, the major substrate spawning lineage of cichlid fishes from Lake Tanganyika in Eastern Africa. Mol. Biol. Evol. 11, 691–703. Sturmbauer, C., Baric, S., Salzburger, W., Rüber, L., Verheyen, E., 2001. Lake level fluctuations synchronize genetic divergences of cichlid fishes in African lakes. Mol. Biol. Evol. 18, 144–154. Sturmbauer, C., Salzburger, W., Duftner, N., Schelly, R., Koblmüller, S., 2010. Evolutionary history of the Lake Tanganyika cichlid tribe Lamprologini (Teleostei: Perciformes) derived from mitochondrial and nuclear DNA data. Mol. Phylogenet. Evol. 57, 266–284. http://dx.doi.org/10.1016/j.ympev.2010.06.018. Takahashi, K., Terai, Y., Nishida, M., Okada, N., 2001. Phylogenetic relationships and ancient incomplete lineage sorting among cichlid fishes in Lake Tanganyika as revealed by analysis of the insertion of retroposons. Mol. Biol. Evol. 18, 2057–2066. Takahashi, T., 2002. Systematics of the tribe Trematocarini (Perciformes: Cichlidae) from Lake Tanganyika, Africa. Ichthyol. Res. 49, 253–259. Takahashi, T., 2003a. Systematics of Tanganyikan cichlid fishes (Teleostei: Perciformes). Ichthyol. Res. 50, 367–382. Takahashi, T., 2003b. Comparative osteology of the infraorbitals in cichlid fishes (Teleostei: Perciformes) from Lake Tanganyika. Species Divers. 8, 1–26. Takahashi, T., 2014. Greenwoodochromini Takahashi from Lake Tanganyika is a junior

23

synonym of Limnochromini Poll (Perciformes: Cichlidae). J. Fish. Biol. 84, 929–936. http://dx.doi.org/10.1111/jfb.12309. Takahashi, T., Moreno, E., 2015. A RAD-based phylogenetics for Orestias fishes from Lake Titicaca. Mol. Phylogenet. Evol. 93, 307–317. http://dx.doi.org/10.1016/j.ympev.2015.08.012. Takahashi, T., Nagata, N., Sota, T., 2014. Application of RAD-based phylogenetics to complex relationships among variously related taxa in a species flock. Mol. Phylogenet. Evol. 80, 137–144. http://dx.doi.org/10.1016/j.ympev.2014.07.016. Tariel, J., Longo, G.C., Bernardi, G., 2016. Tempo and mode of speciation in Holacanthus angelfishes based on RADseq markers. Mol. Phylogenet. Evol. 98, 84–88. http://dx.doi.org/10.1016/j.ympev.2016.01.010. Tougas, S., Stiassny, M.L.J., 2014. Lamprologus markerti, a new lamprologine cichlid (Teleostei: cichlidae) endemic to the lower Congo River in the Democratic Republic of Congo, west-central Africa. Zootaxa 3852, 391–400. http://dx.doi.org/10.11646/zootaxa.3852.3.8. Verheyen, E., Salzburger, W., Snoeks, J., Meyer, A., 2003. Origin of the superflock of cichlid fishes from Lake Victoria, East Africa. Science 300, 325–329. Wagner, C.E., Keller, I., Wittwer, S., Selz, O.M., Mwaiko, S., Greuter, L., Sivasundar, A., Seehausen, O., 2013. Genome-wide RAD sequence data provide unprecedented resolution of species boundaries and relationships in the Lake Victoria cichlid adaptive radiation. Mol. Ecol. 22, 787–798. http://dx.doi.org/10.1111/mec.12023. Weiss, J.D., Cotterill, F.P.D., Schliewen, U.K., 2015. Lake Tanganyika––A ‘Melting Pot’ of

24

ancient and young cichlid lineages (Teleostei: Cichlidae)? PLoS One 10, e0125043. http://dx.doi.org/10.1371/journal.pone.0125043.

Fig. 1 Tribal classifications of cichlid fishes in the EAR. From left: the two major tribal classifications of Poll (1986) and Takahashi (2003a), the currently accepted classification based on recent studies (Clabaut et al., 2005, 2007; Koblmüller et al., 2008a,b; Muschick et al., 2012; Takahashi, 2014; Meyer et al., 2015), and the classification used in this study. 1: Poll's (1986) Tilapiini is a paraphyletic group, of which only Boulengerochromis microlepis is included in the EAR. 2: No name was assigned to this monotypic tribe because the generic allocation of Trematochromis benthicola had not been resolved at that time. 3: Earlier studies disagree on the validity of these tribes (see text). 4: No Orthochromini species are included in the present study.

Fig. 2 Schematic molecular phylogenies of the EAR tribes based on combined evidence from several studies. (a) A combined phylogeny of studies based on nDNA markers (Nishida, 1991; Meyer et al., 2015; McGee et al., 2016) and mtDNA cytb and control region sequences (Sturmbauer and Meyer, 1993). The tribes Benthochromini and Orthochromini were not included in these studies. (b) A combined phylogeny of studies wholly or partly based on mtDNA ND2 sequences (Kocher et al., 1995; Salzburger et al., 2002; Clabaut et al., 2005, 2007; Day et al., 2008; Muschick et al., 2012). Nishida (1991) and Clabaut et al. (2007) nested a species of the tribe Oreochromini (Oreochromis tanganicae) within the EAR, but it is not shown. For the phylogenies of Kocher et al. (1995) and Muschick et al. (2012), we re-designated the tribes Tylochromini and Oreochromini as an outgroup. Symbols on the right of scientific names indicate tribes native (circles) or non-native (crosses) to lakes Tanganyika, Malawi and Victoria and rivers.

Fig. 3 Properties of the 23 data matrices used in the present study. (a) Length of concatenated sequence. (b) Percentage of missing sites. (c) Percentage of gaps. (d) Number of single nucleotide polymorphisms (SNPs). Dark to light grey lines indicate Wclust values, as shown on the right of the figure.

Fig. 4 Percentage of missing sites at each mean coverage depth of loci in the matrix with Wclust = 95% and Mintaxa = 12. Open plots indicate ingroup samples and the grey plot denotes the outgroup sample. Matrices created with Wclust ≥ 60% and Mintaxa ≤ 44 exhibited the same tendency, i.e. a high proportion of missing data in samples with a small mean coverage depth of loci (not shown).
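The quantity on the vertical axis of Fig. 4 can be illustrated with a minimal sketch. It assumes that the concatenated matrix is available as a dictionary of equal-length sequences and that missing sites are coded as `N` or `?`. The function name, the missing-site symbols and the toy sequences are illustrative assumptions and are not taken from the authors' pipeline.

```python
def missing_fraction(alignment, missing="N?"):
    """Return the fraction of missing sites for each sample.

    `alignment` maps a sample name to its aligned sequence; all sequences
    must have the same length. Symbols listed in `missing` (assumed
    upper-case) are counted as missing sites; gaps ("-") are not.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same length")
    (length,) = lengths
    return {
        name: sum(seq.upper().count(symbol) for symbol in missing) / length
        for name, seq in alignment.items()
    }


# Hypothetical toy matrix: one well-covered and one poorly covered sample.
toy = {
    "sample_high_depth": "ACGTACGTAC",
    "sample_low_depth": "ACNNNNGT??",
}
fractions = missing_fraction(toy)
```

Plotting these fractions against the mean coverage depth of each sample would give a figure of the same kind as Fig. 4, in which samples with low depth carry more missing data.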

Fig. 5 A maximum-likelihood (ML) tree of 50 EAR cichlid fish samples resulting from a RAD sequence data matrix with Wclust = 95% and Mintaxa = 12. Numbers close to the nodes are ML bootstrap percentages. Photographs of fish on the right of the figure show representatives of the tribes: From top, Boulengerochromis microlepis (Boulenger, 1899), Trematocara kufferathi Poll, 1948, Bathybates vittatus Boulenger, 1914, Neolamprologus savoryi (Poll, 1949), Cyphotilapia frontosa (Boulenger, 1906), Cardiopharynx schoutedeni Poll, 1942, Greenwoodochromis christyi (Trewavas, 1953), Cyprichromis coloratus Takahashi and Hori, 2006, Benthochromis horii Takahashi, 2008, Perissodus eccentricus Liem and Stewart, 1976, Spathodus marlieri Poll, 1950 and Astatotilapia burtoni (Günther, 1893). Trees with other Wclust and Mintaxa values are shown in Supplementary Fig. S1.

Figure 1

[Diagram: four parallel columns (Poll, 1986; Takahashi, 2003a; Current classification; Present study) linking the tribes recognised under each classification. Superscripts 1–4 refer to the notes in the caption of Fig. 1.]

Figure 2

[Diagram: schematic phylogenies (a) and (b) of the EAR tribes, with the clades EAR, MVhL, H-lineage (a) and C-lineage (b) marked, and columns of symbols for Lake Tanganyika, Lake Malawi, Lake Victoria and rivers.]

Figure 3

[Plots: (a) Length of matrix (M bp), (b) Percentage of missing sites, (c) Percentage of gaps and (d) Number of SNPs (× 1000), each against the minimum number of taxa (Mintaxa = 4, 12, 20, 28, 36, 44, 51), with one line per Wclust value (95%, 80%, 65%, 50%).]

Figure 4

[Plot: Percentage of missing sites (vertical axis) against Mean coverage depth (horizontal axis, 0–150).]

Figure 5

[Tree: ML tree rooted with Oreochromis tanganicae (outgroup), with the clades EAR, MVhL and H-lineage marked. Tribe labels, from top: Boulengerochromini, Trematocarini, Bathybatini, Lamprologini, Cyphotilapiini, Ectodini, Limnochromini, Cyprichromini, Benthochromini, Perissodini, Eretmodini, Haplochromini. Scale bar: 0.0030.]

Table 1 Robustness of phylogenies of cichlid tribes in the EAR. The minimum statistical support [bootstrap percentage for neighbour-joining (NJ), maximum-likelihood (ML) and most-parsimonious (MP) trees; posterior probability for Bayesian tree] on the nodes regarding relationships among the EAR tribes is shown for each analysis.

| Study | Marker | Number of EAR tribes | Method | Minimum statistical support |
| --- | --- | --- | --- | --- |
| Nishida, 1991 | nDNA (allozyme) | 10 | NJ | Not estimated |
| | | | MP | Not estimated |
| | | | UPGMA | Not estimated |
| Sturmbauer and Meyer, 1993 | mtDNA (cytb, control region) | 7 | NJ | Not estimated |
| | | | MP | <50% |
| Kocher et al., 1995 | mtDNA (ND2) | 10 | NJ | 20% |
| Salzburger et al., 2002 | mtDNA (ND2, cytb) | 12 | ML | <50% |
| | | | MP | <50% |
| Clabaut et al., 2005 | mtDNA (ND2) | 12 | ML | <50% |
| | | | Bayes | <0.5 |
| | nDNA (rag1) | 12 | ML | <50% |
| | | | Bayes | <0.5 |
| | mtDNA (ND2) + nDNA (rag1) | 12 | ML | <50% |
| | | | Bayes | <0.5 |
| Clabaut et al., 2007 | mtDNA (ND2) | 11 | ML | Not estimated |
| Day et al., 2008 | mtDNA (ND2, control region) | 13 | ML | <50% |
| | | | Bayes | <0.5 |
| Muschick et al., 2012 | mtDNA (ND2) + nDNA (ednrb1, phpt1) | 12 | ML | Not estimated |
| | | | Bayes | <0.97 |
| Meyer et al., 2015 | nDNA (44 loci) | 11 | ML | <50% |
| | | | Bayes | <0.75 |
| Weiss et al., 2015 | mtDNA (ND2) | 13 | ML | <50%? |
| | nDNA (AFLP) | 13 | NJ | 37% |
| | | | MP | <50%? |
| | | | Bayes | <0.5? |
| McGee et al., 2016 | nDNA (ultraconserved elements) | 9 | ML | 52%¹ |
| | | | Bayes | 1.00¹ |
| Present study | nDNA (RAD-seq) | 12 | ML | 100%² |

¹ Tree inferred from 95% complete dataset without partitioning (see McGee et al., 2016 for details).

² Tree produced from the matrix with Wclust = 95% and Mintaxa = 12.

Table 2 Register number, locality and year of collection, the number of restriction-associated DNA (RAD) sequences that passed quality filtering and the range of mean coverage depths, which vary with clustering threshold (Wclust), of the cichlid samples used for molecular phylogenetic analyses.

Columns: Tribe, Taxon, No., Locality, Year, RAD sequences, Coverage depth

Bathybates fasciatus, Bathybates graueri, Bathybates vittatus, Hemibates stenosoma, Benthochromis horii, Benthochromis melanoides, Boulengerochromis microlepis, Cyphotilapia frontosa, Cyphotilapia gibberosa, Trematochromis benthicola, Cyprichromis coloratus, Cyprichromis zonatus, Paracyprichromis brieni, Aulonocranus dewindti, Callochromis macrops, Cardiopharynx schoutedeni, Cunningtonia longiventralis, Cyathopharynx furcifer

TT5286

Ngwenya market, Mpulungu, Zambia, LT Ngwenya market, Mpulungu, Zambia, LT Ngwenya market, Mpulungu, Zambia, LT Ngwenya market, Mpulungu, Zambia, LT Kasenga, Zambia, LT

2013 2013 2013 2012 2013 2005 2013

5,775,902 86–160

2010 2012 2010 2008 2009 2012 2012 2013 2006 2008 2006

5,095,904 72–121

Ingroup Bathybatini

Benthochromini

Boulengerochromini Cyphotilapiini

Cyprichromini

Ectodini

TT5283 TT5281 TT4362 TT5718 Zm05850-2 TT5352

Zambia, LT

–––

Purchase in Japan

TT4276 TT3846

Ngwenya market, Mpulungu, Zambia, LT Purchase in Japan

TT3275

Nkumbula Is., Zambia, LT

TT3590

Kasenga, Zambia, LT

TT5039

Kasenga, Zambia, LT

TT4484

Nkumbula Is., Zambia, LT

TT5530

Nkumbula Is., Zambia, LT

Tn06186 TT3291

Tanzania, LT

TT2918

Kasenga, Zambia, LT

Mtondwe Is., Zambia, LT

Kasenga, Zambia, LT

5,360,469 83–148 4,495,971 74–135 9,446,007 133–218 2,870,536 50–81 5,944,193 96–154 4,819,935 77–129

7,047,448 103–166 7,090,269 112–214 6,016,116 92–144 10,139,954 138–163 6,827,434 106–204 6,263,983 82–140 2,809,237 49–76 5,981,752 85–136 5,296,332 82–133 5,615,863 89–146

Eretmodini

Haplochromini

Lamprologini

Limnochromini

Ectodus descampsi, Lestradea perspicax, Xenotilapia tenuidentata, Spathodus marlieri, Tanganicodus irsacae, Astatotilapia burtoni, Haplochromis pyrrhocephalus, Haplochromis sp. small yellow, Lithochromis rubripinnis, Lobochilotes labiatus, Petrochromis famula, Simochromis diagramma, Chalinochromis brichardi, Lamprologus ocellatus, Lepidiolamprologus elongatus, Neolamprologus savoryi, Telmatochromis brachygnathus, Variabilichromis moorii, Baileychromis centropomoides, Gnathochromis permaxillaris, Greenwoodochromis abeelei, Greenwoodochromis christyi, Greenwoodoc

TT5677 Tn098

Ngwenya market, Mpulungu, Zambia, LT Tanzania, LT

TT3447

Nkumbula Is., Zambia, LT

TT2646

Kigoma, Tanzania, LT

Tn06165 TT5290

Tanzania, LT

–––

LV

–––

LV

–––

4,246,147 72–139

–––

LV

2,474,431 45–85

TT5428

Mpulungu, Zambia, LT

TT5497

Nkumbula Is., Zambia, LT

TT5377

Mtondwe Is., Zambia, LT

TT3442

Isanga, Zambia, LT

TT2825

Chibwensolo, Zambia, LT

TT5315

Nkumbula Is., Zambia, LT

––– 2013 2013 2013 2008 2006 2013

TT5296

Nkumbula Is., Zambia, LT

6,969,575 96–153

TT5619

Cape Kaku, Zambia, LT

2013 2012

TT5154

Nkumbula Is., Zambia, LT

7,753,422 102–137

TT2779

off Mtondwe Is., Zambia, LT

2012 2006

TT5346

Ngwenya Market, Mpulungu, Zambia

2013

5,075,170 77–139

TT5739

Chituta Bay, Zambia, LT

4,424,743 68–161

TT2772

off Mtondwe Is., Zambia, LT

2013 2006

TT4579

off Mtondwe Is., Zambia, LT

Mtondwe Is., Zambia, LT

2013 2009 2009 2006 2006 2013 –––

4,752,317 76–115

20

4,848,300 79–117 812,102 22–39 520,599 19–30 1,835,749 37–67 6,260,795 98–147 5,217,082 86–156

4,363,782 70–116 348,123 18–25 2,376,314 45–71 6,345,117 98–138 5,290,360 86–128 4,815,976 74–148

760,339 21–30

741,582 21–42

6,435,125 101–303

94,702 16–25

hromis staneri TT5493

Kasenga, Zambia

TT5665

Kasenga, Zambia

Zm06181-2 TT5669

Zambia

TT4516

off Mtondwe Is., Zambia, LT

TT2775

off Mtondwe Is., Zambia, LT

TT5476

Nkumbula Is., Zambia, LT

TT2961

Kasenga, Zambia, LT

TT5276

Oreochromis tanganicae TT5445

LT: Lake Tanganyika, LV: Lake Victoria.

Perissodini

Trematocarini

Limnochromis auritus, Reganochromis calliurus, Tangachromis dhanisi, Triglachromis otostigma, Haplotaxodon trifasciatus, Perissodus eccentricus, Plecodus straeleni, Trematocara kufferathi, Trematocara nigrifrons

12 3,183,582 45–105

Ngwenya market, Mpulungu, Zambia, LT

2013 2013 2006 2013 2012 2006 2013 2006 2013

Ngwenya market, Mpulungu, Zambia, LT

2013

2,506,366 42–73

Kasenga, Zambia

6,235,149 93–178 1,654,969 31–67 3,749,430 59–117 2,385,779 42–73 5,003,463 81–126 4,070,729 66–117 4,594,400 64–102 5,848,713 84–135

Outgroup Oreochromini

Table 3 Percentages of bootstrap trees that support a monophyletic group consisting of the tribes Cyphotilapiini, Ectodini and Limnochromini with various clustering threshold (Wclust) and minimum-number-of-taxa (Mintaxa) values.

| Wclust | Mintaxa = 4 | 12 | 20 | 28 | 36 | 44 | 51 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 50 | 0 | 0 | –– | –– | –– | –– | –– |
| 65 | 87 | 77 | 99 | 97 | 94 | 38 | 0 |
| 80 | 70 | 76 | 82 | 87 | 90 | 59 | 0 |
| 95 | 91 | 100 | 84 | 91 | 94 | 13 | 0 |

ML trees with the shading resolved the tribes Cyphotilapiini, Ectodini and Limnochromini as a monophyletic group.
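A value of the kind reported in Table 3 can be obtained by counting the bootstrap trees in which the three tribes form a clade. The sketch below is a minimal illustration under stated assumptions: each bootstrap tree is rooted with the outgroup and is written as nested tuples of tip labels, and the abbreviated labels and the four toy trees are hypothetical. It is not the procedure used in the study.

```python
def leaf_set(tree):
    """Return the set of tip labels under a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the tips in `group`."""
    group = frozenset(group)
    if leaf_set(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


def support_percentage(trees, group):
    """Percentage of trees in which `group` is monophyletic."""
    hits = sum(is_monophyletic(tree, group) for tree in trees)
    return 100.0 * hits / len(trees)


# Hypothetical bootstrap replicates with one tip per tribe:
# Cyp = Cyphotilapiini, Ect = Ectodini, Lim = Limnochromini,
# Hap = Haplochromini, Out = outgroup.
replicates = [
    ("Out", (("Cyp", ("Ect", "Lim")), "Hap")),
    ("Out", ((("Cyp", "Ect"), "Lim"), "Hap")),
    ("Out", (("Cyp", "Hap"), ("Ect", "Lim"))),
    ("Out", ("Cyp", ("Ect", ("Lim", "Hap")))),
]
support = support_percentage(replicates, {"Cyp", "Ect", "Lim"})
```

In this toy set the group is recovered as a clade in two of the four replicates. With real data each tribe would be represented by all of its sampled species, and the trees would first have to be parsed from the output of the tree-building program.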

Graphical Abstract

Major lineages of East African cichlids RAD-seq

Robust phylogeny

Highlights

• We inferred phylogenetic relationships among major East African cichlid lineages.

• Maximum-likelihood trees were generated from RAD sequence data.

• All tribes examined were resolved as monophyletic with 100% bootstrap support.

• Relationships among tribes were resolved with high statistical support.
