Phylogenomic Analysis of Spiders Reveals Nonmonophyly of Orb Weavers

Phylogenomic Analysis of Spiders Reveals Nonmonophyly of Orb Weavers

Current Biology 24, 1772–1777, August 4, 2014 ª2014 Elsevier Ltd All rights reserved http://dx.doi.org/10.1016/j.cub.2014.06.035 Report Phylogenomic...

680KB Sizes 0 Downloads 31 Views

Current Biology 24, 1772–1777, August 4, 2014 ª2014 Elsevier Ltd All rights reserved

http://dx.doi.org/10.1016/j.cub.2014.06.035

Report Phylogenomic Analysis of Spiders Reveals Nonmonophyly of Orb Weavers Rosa Ferna´ndez,1,* Gustavo Hormiga,2 and Gonzalo Giribet1 1Museum of Comparative Zoology and Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA 2Department of Biological Sciences, The George Washington University, Washington, D.C. 20052, USA

Summary Spiders constitute one of the most successful clades of terrestrial predators [1]. Their extraordinary diversity, paralleled only by some insects and mites [2], is often attributed to the use of silk, and, in one of the largest lineages, to stereotyped behaviors for building foraging webs of remarkable biomechanical properties [1]. However, our understanding of higher-level spider relationships is poor and is largely based on morphology [2–4]. Prior molecular efforts have focused on a handful of genes [5, 6] but have provided little resolution to key questions such as the origin of the orb weavers [1]. We apply a next-generation sequencing approach to resolve spider phylogeny, examining the relationships among its major lineages. We further explore possible pitfalls in phylogenomic reconstruction, including missing data, unequal rates of evolution, and others. Analyses of multiple data sets all agree on the basic structure of the spider tree and all reject the long-accepted monophyly of Orbiculariae, by placing the cribellate orb weavers (Deinopoidea) with other groups and not with the ecribellate orb weavers (Araneoidea). These results imply independent origins for the two types of orb webs (cribellate and ecribellate) or a much more ancestral origin of the orb web with subsequent loss in the so-called RTA clade. Either alternative demands a major reevaluation of our current understanding of the spider evolutionary chronicle. Results and Discussion Our analyses of multiple data matrices (Figure 1) result in a well-resolved phylogeny of the represented spider taxa that is largely congruent with the currently accepted major lineages (Figure 2) (see also the accompanying article [7]). Mesothelae appear as the sister group to a well-supported Opisthothelae, divided into Mygalomorphae (‘‘tarantulas’’ and trap-door spiders) and Araneomorphae (‘‘true spiders’’). Araneomorphae divides into Haplogynae and Entelegynae. The entelegynes consist of two main lineages: Araneoidea (the bulk of the orb weavers) and a clade that includes the four representatives of the RTA clade (named for the retrolateral tibial apophysis [RTA] on the male copulatory organs) plus Eresoidea (Oecobiidae) and Deinopoidea (Uloboridae, the representative of the other superfamily whose members spin orb webs). All of these nodes find near-maximal support in virtually all the analyses conducted (Figure 2). The notion of nonmonophyly of Orbiculariae based on molecular data is not entirely novel [8], but it is at odds with the predominant paradigm of spider evolution, as it contradicts most morphological and behavioral data

*Correspondence: [email protected]

analyses [1]. For this reason, molecular studies have often dismissed results that do not recover monophyly of orbicularians as artifactual or even ‘‘false’’ (e.g., [9]). However, the most taxon-intensive molecular analysis of this problem to date (362 taxa, six markers) did not recover orbicularian monophyly [10]. The results presented here required further testing since several potential pitfalls have been pointed out for phylogenomic analyses, including artificial support inflation [11]. Consequently, we defined multiple data matrices designed specifically to account for the effects of common factors affecting phylogenomic analyses, including missing data [12, 13], unequal rates of evolution in different lineages, compositional heterogeneity [14], and heterotachy [15].

Supermatrix Approach A first set of matrices was thus constructed to discern the potential effects of missing data (supermatrices I to IV; Figure 1). Their analyses showed no variation in the relationships of the major spider lineages, and all matrices showed a main division of Haplogynae and Entelegynae. Although some of the entelegyne relationships are matrix dependent, nearly all analyses find a clade that includes Deinopoidea (Uloboridae), Eresoidea (Oecobiidae), and the RTA clade, and no analysis recovered monophyly of Orbiculariae (i.e., a sister group relationship of Deinopoidea and Araneoidea) (Figure 2). In order to evaluate the effect of using genes with different evolutionary rates or phylogenetic constrains [16], we constructed three additional supermatrices including 100 genes with different evolutionary rates each (measured as percentage of identical sites). The three sets of genes found the same overall topology, only differing in the internal relationships within Araneoidea or the RTA clade and in the relative position of Oecobiidae. All analyses placed Uloboridae with the RTA clade (including or not Oecobiidae) with bootstrap support >99%, thus again refuting monophyly of Orbiculariae. Topological differences among the three sets of trees are not supported. This result also contrasts with others suggesting that resolving ancient divergences requires genes with strong phylogenetic signal [11] and that many have interpreted as if only certain rates would have such strong signal. In our case, the three sets of genes with different evolutionary rates are largely congruent with our results from virtually all other analyses. Other studies, however, show radically different results from different groups of genes [16]. Here we show that nearly identical phylogenetic hypotheses were inferred from sets of genes with radically different evolutionary rates, with or without missing data, or from 94 to 2,637 genes. This, once more, shows that phylogenomic data, even in the presence of missing data, contain enough information to provide strong phylogenetic resolution for ancient arthropod divergences [17–19]. In order to account for heterotachy, all supermatrices were analyzed with methods that implement mixture models (see the Experimental Procedures). Again, the phylogenetic hypotheses for the different matrices were similar. As an extra measure, supermatrix III was selected for further analyses to account for the potential effects of this confounding factor by using the IL approach as implemented in PhyML v.3.0.3 (see Figure 1 and the Experimental Procedures). The inferred

Spider Phylogenomics and the Origins of Orb Webs 1773

Figure 1. Description of the Different Supermatrices and Supernetworks Analyzed in Order to Account for the Main Confounding Factors in Phylogenomic Inference (A) Four matrices were analyzed to account for the effect of missing data. Supermatrix I (dark gray; 2,637 genes, 62.47% gene occupancy, 48% missing data, and 791,793 amino acids) was constructed by concatenating the set of OMA groups containing ten or more taxa. To reduce the amount of missing data, we constructed three additional matrices: supermatrix II (green; 789 genes, 79% gene occupancy, 23.54% missing data, and 201,801 amino acids), created by selecting the orthologs contained in 15 or more taxa, supermatrix III (red; 94 genes, 92.33% gene occupancy, 9.50% missing data, and 19,994 amino acids), which included the orthologs contained in 19 or more taxa, and supermatrix IV (navy blue), which excludes outgroups (100 genes present in all spiders, 2.1% missing data, and 25,406 amino acids). For further reduction of missing data, the mygalomorph was excluded from supermatrices III and IV. (B) For evaluation of the effect of using genes with different evolutionary rates or phylogenetic constrains, the 2,637 genes included in supermatrix I were ordered by the number of conserved sites. Their evolutionary rates varied following a Gaussian bell shape distribution with a mean percentage of identical sites (id.s.) of 34.78%. Supermatrix V (purple) included the 100 most variable genes (varying from 4.1 to 12.7% id.s.; 39.8% missing data and 26,265 amino acids), supermatrix VI (orange) was constructed with the 100 genes closest to the mean percentage of identical sites (34 to 35.6% id.s.; 38.4% of missing data and 27,166 amino acids), and supermatrix VII (cyan blue) contained the 100 conserved genes (from 75 to 94.9% id.s.; 34.8% of missing data and 28,152 amino acids). The genes containing 95% or more id.s. were not considered due to lack of phylogenetic signal. (C) Supermatrix III was further analyzed to account for heterotachy. The length of a branch is a random variable, characterized by a mean (i.e., number of substitutions) and a variance, proportional to the mean. The integrated length (IL) approach, as implemented in PhyML v. 3.0.3, integrates branch length over a wide range of scenarios, therefore allowing implementation of a further correction of heterotachy than mixture models do. (D) Compositional heterogeneity was tested in the genes of supermatrix IV (see Figure S2). Color codes correspond to those in the rest of the figures. See also Figure S1.

topology was the same as those recovered by most analyses, supporting the nonmonophyly of Orbiculariae (Figure 2). To discern whether compositional heterogeneity among taxa and/or within each individual ortholog alignment was affecting our phylogenetic results, we further analyzed supermatrix IV in BaCoCa v. 1.1 [20]. The relative composition frequency variability (RCFV) per taxon and per amino acid ranged from 0.0001 to 0.018, indicating therefore compositional homogeneity in all of the amino acids and taxa included in supermatrix IV (Figure S1) and eliminating possible biases due to compositional heterogeneity. Supernetwork Approach All supernetworks yielded very similar results (Figure 2, Figure S2). A supernetwork obtained with SuperQ v.1.1 [21] depicting possible gene tree/species tree conflict shows long edges separating spiders from their outgroups, as well as Opisthothelae, Araneomorphae, Haplogynae, and Entelegynae (Figure 2), corroborating traditional views on spider phylogeny and our supermatrix-based results. Although no conflict exists with respect to the position of Deinopoidea in relation to Araneoidea, the supernetwork identifies short internodes between Uloboridae, Oecobiidae, and the RTA clade. These results are thus congruent with the supermatrix approaches presented here and show much less conflict than in other arthropod radiations, such as the case of centipedes

[19]. In addition, the supernetworks from the different data matrices are highly congruent (Figure S2), and only those with the smallest number of genes (supermatrix IV) or with the slowest evolving set of genes (supermatrix VII) show shorter edges, indicating higher conflict than the other analyses, probably due to lack of information. These results are nonetheless still compatible with those of all other analyses. The Phylogeny of Spiders and the Nonmonophyly of Orb Weavers Our study provides the most comprehensive set of phylogenomic analyses to address spider evolution based on a broad taxonomic sampling of the main lineages, with emphasis on araneomorphs (12 families sampled, representing for w29% of all the araneomorph species diversity described). All the analyses performed in eight supermatrices (Figure 1; see Experimental Procedures) were largely congruent and yielded similar phylogenetic hypotheses (Figure 2). All data sets recovered a main division between Mesothelae (1 sp.) and Opisthothelae (13 spp.) and a division between Mygalomorphae (1 sp.) and Araneomorphae (12 spp.), and none of these major divisions showed evidence of gene conflict. Although our study highlights araneomorph relationships, it is reassuring to find strong support for the basal spider relationships. Within Araneomorphae, Haplogynae (Pholcidae and Dysderidae) and Entelegynae (the remaining ten araneomorph species) formed

Current Biology Vol 24 No 15 1774

Figure 2. Summary of the Evolutionary Relationships among Spider Taxa (A) ML phylogenomic hypothesis of spider interrelationships (supermatrix I, Ln L PhyML-PCMA: 29,648,921.86673). Checked matrices in each node represent nodal support for the different analyses in supermatrix I (2,637 genes; black), supermatrix II (789 genes; green), supermatrix III (94 genes; red), and supermatrix IV (100 genes, only spiders; navy blue). PP, PhyML-PCMA; EM, ExaML; PB, PhyloBayes; EB, ExaBayes; and PIL, ML analysis with integrated branch length as implemented in PhyML. Filled squares indicate nodal support values higher than 0.95/ 0.90/95 (posterior probability, PB and EB/Shimodaira-Hasegawa-like support, PP/bootstrap, EM). Grey squares indicate lower nodal support. Crossed squares indicate that the node was not recovered in the specific analysis. Supermatrix IV did not include outgroups (indicated by a white square). (B) Alternative topologies inferred from the different analyses. Fractions represent the number of analyses supporting each topology. From top to bottom and from left to right, the third topology was only recovered in supermatrix IV (PB), the fourth in supermatrix III (PB), and the fifth in supermatrix VII (PP). D, Deinopoidea; E, Eresoidea; RTA, RTA clade; and A, Araneae. (C) Supernetwork representation of quartets derived from individual ML gene trees, for all 2,637 genes concatenated in supermatrix I. Phylogenetic conflict in the position of Uloboridae (green) and Oecobiidae (cyan blue) is represented by short branches. See also Figure S2 and Table S1.

two well-resolved clades, without detectable gene conflict. These major spider lineages have long been recognized largely based on morphological data. The relationships within Entelegynae implied by our phylogenomic analyses are, however, not as currently understood [2, 5, 6], due to the placement of Deinopoidea (the cribellate orb weavers) with respect to the viscid silk orb weavers (Araneoidea). These two groups of spiders make geometrically similar orb webs using different materials but presumably homologous stereotypical behaviors [22]. The deinopoid sticky spiral is made of dry cribellate silk, which is metabolically expensive to produce, formed of thousands of fine looped fibrils woven on a core of two axial fibers. Its adhesive properties are attained by van der Vaals and hygroscopic forces [23, 24]. In contrast, the araneoid sticky spiral thread is coated with a viscid glycoprotein [25] and is produced faster and more economically and has higher stickiness than the dry deinopoid counterpart [26]. Furthermore, the axial fibers of the araneoid capture thread are more extensible than those of cribellar threads [27, 28]. Cribellar silk is plesiomorphic relative to the more recently evolved viscid araneoid silk. About 28% of the described extant spider species belong to Orbiculariae, and the vast majority of orb-weaving species belong to Araneoidea (17 families, over 12,000 species). In comparison, their putative sister group, Deinopoidea, includes only 328 species in two families, and its members spin orb webs with cribellate silk. This asymmetry in species diversity has been attributed to a shift in the type of capture thread from dry, fuzzy cribellate silk to viscid sticky silk, combined with changes in the silk spectral reflective properties and a transition from horizontal to vertical orb webs [29–31]. But most araneoid species build webs no longer recognizable as orbs, such as sheet webs

(e.g., in Linyphiidae) or cob webs (e.g., in Theridiidae), and some have abandoned capture webs altogether (e.g., Mimetidae). The controversy over a single or a convergent origin of the orb web goes back to at least the 1880s [32], but with the advent of cladistic methods, the preponderance of research in the past two decades, based mainly on behavioral and spinning organ data, has supported a single origin of the orb web [1]. Our results clearly show that Deinopoidea is not closely related to Araneoidea, even though some uncertainty remains with respect to the exact position of Deinopoidea. While most analyses placed it with Eresoidea and the RTA clade (only three analyses fail to find support for this hypothesis), none placed Deinopoidea with Araneoidea (Figure 2), therefore refuting the monophyly of Orbiculariae. However, inference of the internal relationships of Araneoidea and the RTA clade will require denser taxon sampling, as many of its constituent families were not represented in this study. Conclusions Here we addressed known potential pitfalls in reconstructing the spider tree of life using the most comprehensive NGSbased phylogenomic analyses to date, and none supported the monophyly of Orbiculariae. There are few orbicularian molecular phylogenies with dense taxonomic sampling, and only recently (i.e., [7, 10]) have taxon-dense or gene-dense studies suggested orbicularian nonmonophyly. This has far-reaching implications with respect to the origin and diversification of orb weavers and their spinning work. Alternative hypotheses congruent with our phylogenetic findings suggest that (1) orb webs evolved convergently in Araneoidea and Deinopoidea (although it is critical to add representatives of Deinopidae),

Spider Phylogenomics and the Origins of Orb Webs 1775

or that (2) the orb web is ancestral to a larger clade within Entelegynae (which includes the RTA clade), having been lost in all its members except Araneoidea and Deinopoidea. Either scenario implies a novel and radical hypothesis of spider evolution. Although the data presented here refute the monophyly of Orbiculariae, discerning among the two hypotheses of orb web evolution (convergence versus an ancestral single origin) will require further testing by inclusion of representatives of the family Nicodamidae (a non-orb-weaving lineage, allegedly sister to Araneoidea, that includes both cribellate and ecribellate species) and more entelegyne taxa, including palpimanoids. Understanding the evolution of the extraordinary diversity of araneoid web architectures will require genetic data for a much denser taxon sample. Furthermore, if, as implied in this study, the RTA clade evolved from orbicularian ancestors, then detailed studies of the webbuilding behavior of RTA clade members that spin complex, aerial webs (such as titanoecids or psechrids) are needed to search for orbicularian behavioral homologies. The level of resolution provided by the NGS data and the analyses here presented thus set the roadmap toward an until-now elusive spider tree of life. Experimental Procedures Samples and Next-Generation Sequencing New transcriptomic data were generated for thirteen specimens representing unsampled major groups of spiders (Araneae). We use additional data generated in our laboratories [19, 33] as outgroups. All specimens used in this study were legally collected. Information about the sampled species, Museum of Comparative Zoology (Harvard University) voucher numbers and Sequence Read Archive (SRA) accession numbers are provided in Table S1. Samples were flash frozen in liquid nitrogen or fixed in RNAlater (Life Technologies) immediately after collection. The transcriptome of Acanthoscurria gomesiana (Mygalomorphae: Theraphosidae) is based on 454 pyrosequencing [34]. mRNA was extracted using TRIzol (Life Sciences) and was purified with the Dynabeads mRNA Purification Kit (Invitrogen). cDNA libraries were constructed in the Apollo 324 automated system using the PrepX mRNA kit (Wafergen). The samples were run using the Illumina HiSeq 2500 platform (paired end, 150 bp) at the FAS Center for Systems Biology at Harvard University. Further details about the protocols can be found elsewhere [19].

was specified in ExaML. In the PhyML analysis, the starting tree was set to the optimal parsimony tree and the FreeRate model [51] was selected. Bayesian analysis was conducted with PhyloBayes MPI 1.4e [52] using the CAT-GTR model of evolution [53]. Three independent Markov chain Monte Carlo chains were run for 3,000–50,000 cycles (depending on the supermatrix). The initial 20% of all trees were discarded as burn-in. A 50% majority-rule consensus tree was computed from the combined remaining trees from the three independent runs. PhyloBayes never reached stationarity in some of the largest supermatrices after several months; therefore, ExaBayes [54] was used. For practical reasons and due to the similar results obtained for the different phylogenetic analysis (see the Results and Discussion), not all the analyses were implemented in all the supermatrices (see Figure 2), but at least one ML and one Bayesian inference analysis per supermatrix was explored (Figure 2). To discern whether compositional heterogeneity among taxa and/or within each individual ortholog alignment was affecting phylogenetic results, we further analyzed supermatrix IV in BaCoCa v.1.1 [20]. The relative composition frequency variability (RCFV) measures the absolute deviation from the mean for each amino acid for each taxon [55]. RCFV values were plotted in a heatmap using the R package gplots with an R script modified from [56]. Supernetwork Approach To investigate potential incongruence between individual gene trees, we inferred gene trees for each OMA group included in our supermatrices, as in [19]. Best-scoring ML trees were inferred for each gene under the selected model from 100 replicates. Gene trees were decomposed into quartets with SuperQ v.1.1 [21], and a supernetwork assigning edge lengths based on quartet frequencies was inferred selecting the ‘‘balanced’’ edge-weight optimization function, applying no filter; the supernetworks were visualized in SplitsTree v.4.13.1 [57]. All Python custom scripts can be downloaded from https://github.com/claumer and https://github.com/rfernandezgarcia. Accession Numbers Raw reads have been deposited in the National Center for Biotechnology Information Sequence Read Archive (Table S1). The SRA accession numbers for the new data published in this paper are SRR1328258, SRR1365208, SRR1365089, SRR1328334, SRR1329247, SRR1329248, SRR1329250, SRR1365651, and SRR1333842. Supplemental Information Supplemental Information includes two figures and one table and can be found with this article online at http://dx.doi.org/10.1016/j.cub.2014.06.035. Author Contributions

Data Sanitation, Sequence Assembly, and Orthology Assignment Quality filtering and adaptor trimming were done with Trimgalore! 0.3.3 [35]. Metazoan ribosomal RNA and arachnid mitochondrial DNA were filtered out via Bowtie 2.1.0 [36]. Strand-specific de novo assemblies were done in Trinity [37, 38] using paired read files and with path reinforcement distance being enforced to 75. Redundancy reduction was done with CD-HIT-EST [39] in the raw assemblies (95% global similarity). Candidate open reading frames (ORFs) were identified in TransDecoder [37]. Predicted peptides were then processed with an additional filter to select the longest ORF per Trinity subcomponent with a custom Python script. We assigned predicted ORFs into orthologous groups across all samples using OMA stand-alone v0.99u [40, 41]. Phylogenetic Analyses Each selected orthogroup was aligned individually using MUSCLE version 3.6 [42]. We applied a probabilistic character masking with ZORRO [43] to account for alignment uncertainty (and using FastTree 2.1.4 [44] to produce guide trees). Positions assigned a confidence score below a threshold of 5 were discarded with a custom Python script prior to concatenation (using Phyutility 2.6; [45]) and subsequent phylogenomic analyses. Maximum-likelihood inference was conducted with PhyML-PCMA [46], ExaML [47], and PhyML v. 3.0.3 implementing the IL approach (the latest version accessed in https://github.com/stephaneguindon/phyml-downloads/releases includes the command-line option --il [48, 49]; see Figure 1C). Bootstrap support values were based on 100 replicates. We selected 20 PCs in the PhyML-PCMA analyses and empirical amino acid frequencies. The persite rate category model (-m PSR, the equivalent of CAT in RAxML [50])

All authors planned and designed the research, G.H. collected specimens, R.F. generated and analyzed the data and contributed new reagents/analytic tools, and all authors wrote the paper. Acknowledgments The Research Computing group at the Harvard Faculty of Arts and Sciences was instrumental in facilitating the computational resources required. Jesu´s Ballesteros, Dimitar Dimitrov, Bob Kallal, Christopher Laumer, and Prashant P. Sharma assisted in some steps of this work, whether collecting specimens or data or providing ideas, assemblies, or help with methods. Three anonymous reviewers are acknowledged for their constructive comments. This work was supported by internal funds from the Museum of Comparative Zoology and by a postdoctoral fellowship from the Fundacio´n Ramo´n Areces to R.F. and from NSF ARTS grants 1144417 and 1144492 to G.G. and G.H. Received: May 1, 2014 Revised: June 12, 2014 Accepted: June 12, 2014 Published: July 17, 2014 References 1. Hormiga, G., and Griswold, C.E. (2014). Systematics, phylogeny, and evolution of orb-weaving spiders. Annu. Rev. Entomol. 59, 487–512.

Current Biology Vol 24 No 15 1776

2. Coddington, J.A., and Levi, H.W. (1991). Systematics and evolution of spiders (Araneae). Annu. Rev. Ecol. Syst. 22, 565–592. 3. Griswold, C.E., Coddington, J.A., Hormiga, G., and Scharff, N. (1998). Phylogeny of the orb-web building spiders (Araneae, Orbiculariae: Deinopoidea, Araneoidea). Zool. J. Linn. Soc. 123, 1–99. 4. Lopardo, L., Giribet, G., and Hormiga, G. (2011). Morphology to the rescue: Molecular data and the signal of morphological characters in combined phylogenetic analyses—a case study from mysmenid spiders (Araneae, Mysmenidae), with comments on the evolution of web architecture. Cladistics 27, 278–330. 5. Blackledge, T.A., Scharff, N., Coddington, J.A., Szu¨ts, T., Wenzel, J.W., Hayashi, C.Y., and Agnarsson, I. (2009). Reconstructing web evolution and spider diversification in the molecular era. Proc. Natl. Acad. Sci. USA 106, 5229–5234. 6. Dimitrov, D., Lopardo, L., Giribet, G., Arnedo, M.A., A´lvarez-Padilla, F., and Hormiga, G. (2012). Tangled in a sparse spider web: single origin of orb weavers and their spinning work unravelled by denser taxonomic sampling. Proc. Biol. Sci. 279, 1341–1350. 7. Bond, J.E., Garrison, N.L., Hamilton, C.A., Godwin, R.L., Hedin, M., and Agnarsson, I. (2014). Phylogenomics resolves a spider backbone phylogeny and rejects a prevailing paradigm for orb web evolution. Curr. Biol. Published online July 17, 2014. http://dx.doi.org/10.1016/j. cub.2014.06.034. 8. Hausdorf, B. (1999). Molecular phylogeny of araneomorph spiders. J. Evol. Biol. 12, 980–985. 9. Agnarsson, I., Coddington, J.A., and Kuntner, M. (2013). Systematics – progress in the study of spider diversity and evolution. In Spider Research in the 21st Century, D. Penney, ed. (Manchester: Siri Scientific Press), pp. 58–111. 10. Dimitrov, D., Benavides, L., Giribet, G., Arnedo, M.A., Griswold, C.E., Scharff, N., and Hormiga, G. (2013). Squeezing the last drops of the lemon: molecular phylogeny of Orbiculariae (Araneae). Abstract Book, 9th International Congress of Arachnology (Kenting National Park, Taiwan), p. 47. 11. Salichos, L., and Rokas, A. (2013). Inferring ancient divergences requires genes with strong phylogenetic signals. Nature 497, 327–331. 12. Roure, B., Baurain, D., and Philippe, H. (2013). Impact of missing data on phylogenies inferred from empirical phylogenomic data sets. Mol. Biol. Evol. 30, 197–214. 13. Dell’Ampio, E., Meusemann, K., Szucsich, N.U., Peters, R.S., Meyer, B., Borner, J., Petersen, M., Aberer, A.J., Stamatakis, A., Walzl, M.G., et al. (2014). Decisive data sets in phylogenomics: lessons from studies on the phylogenetic relationships of primarily wingless insects. Mol. Biol. Evol. 31, 239–249. 14. Foster, P.G. (2004). Modeling compositional heterogeneity. Syst. Biol. 53, 485–495. 15. Lopez, P., Casane, D., and Philippe, H. (2002). Heterotachy, an important process of protein evolution. Mol. Biol. Evol. 19, 1–7. 16. Nosenko, T., Schreiber, F., Adamska, M., Adamski, M., Eitel, M., Hammel, J., Maldonado, M., Mu¨ller, W.E., Nickel, M., Schierwater, B., et al. (2013). Deep metazoan phylogeny: when different genes tell different stories. Mol. Phylogenet. Evol. 67, 223–233. 17. Hejnol, A., Obst, M., Stamatakis, A., Ott, M., Rouse, G.W., Edgecombe, G.D., Martinez, P., Bagun˜a`, J., Bailly, X., Jondelius, U., et al. (2009). Assessing the root of bilaterian animals with scalable phylogenomic methods. Proc. Biol. Sci. 276, 4261–4270. 18. Rota-Stabelli, O., Daley, A.C., and Pisani, D. (2013). Molecular timetrees reveal a Cambrian colonization of land and a new scenario for ecdysozoan evolution. Curr. Biol. 23, 392–398. 19. Ferna´ndez, R., Laumer, C.E., Vahtera, V., Libro, S., Kaluziak, S., Sharma, P.P., Pe´rez-Porro, A.R., Edgecombe, G.D., and Giribet, G. (2014). Evaluating topological conflict in centipede phylogeny using transcriptomic data sets. Mol. Biol. Evol. 31, 1500–1513. 20. Ku¨ck, P., and Struck, T.H. (2014). BaCoCa—a heuristic software tool for the parallel assessment of sequence biases in hundreds of gene and taxon partitions. Mol. Phylogenet. Evol. 70, 94–98. 21. Gru¨newald, S., Spillner, A., Bastkowski, S., Bogershausen, A., and Moulton, V. (2013). SuperQ: computing supernetworks from quartets. IEEE/ACM Trans. Comput. Biol. Bioinform. 10, 151–160. 22. Eberhard, W.G. (1982). Behavioral characters for the higher classification of orb-weaving spiders. Evolution 36, 1067–1095. 23. Hawthorn, A.C., and Opell, B.D. (2003). van der Waals and hygroscopic forces of adhesion generated by spider capture threads. J. Exp. Biol. 206, 3905–3911.

24. Hawthorn, A.C., and Opell, B.D. (2002). Evolution of adhesive mechanisms in cribellar spider prey capture thread: evidence for van der Waals and hygroscopic forces. Biol. J. Linn. Soc. Lond. 77, 1–8. 25. Opell, B.D., and Bond, J.E. (2000). Capture thread extensibility of orbweaving spiders: testing punctuated and associative explanations of character evolution. Biol. J. Linn. Soc. Lond. 70, 107–120. 26. Opell, B.D. (1998). Economics of spider orb-webs: the benefits of producing adhesive capture thread and of recycling silk. Funct. Ecol. 12, 613–624. 27. Blackledge, T.A., and Hayashi, C.Y. (2006). Unraveling the mechanical properties of composite silk threads spun by cribellate orb-weaving spiders. J. Exp. Biol. 209, 3131–3140. 28. Opell, B.D., Markley, B.J., Hannum, C.D., and Hendricks, M.L. (2008). The contribution of axial fiber extensibility to the adhesion of viscous capture threads spun by orb-weaving spiders. J. Exp. Biol. 211, 2243– 2251. 29. Bond, J.E., and Opell, B.D. (1998). Testing adaptive radiation and key innovation hypotheses in spiders. Evolution 52, 403–414. 30. Opell, B.D. (1997). The material cost and stickiness of capture threads and the evolution of orb-weaving spiders. Biol. J. Linn. Soc. Lond. 62, 443–458. 31. Craig, C.L., Bernard, G.D., and Coddington, J.A. (1994). Evolutionary shifts in the spectral properties of spider silks. Evolution 48, 287–296. 32. Coddington, J. (1986). The monophyletic origin of the orb web. In Spiders. Webs, behavior and evolution, W.A. Shear, ed. (Stanford: Stanford University Press), pp. 319–363. 33. Sharma, P.P., Kaluziak, S., Pe´rez-Porro, A.R., Gonza´lez, V.L., Hormiga, G., Wheeler, W.C., and Giribet, G. Phylogenomic interrogation of Chelicerata reveals systemic conflicts in phylogenetic signal. Mol. Biol. Evol. 34. Lorenzini, D.M., da Silva, P.I., Jr., Soares, M.B., Arruda, P., Setubal, J., and Daffre, S. (2006). Discovery of immune-related genes expressed in hemocytes of the tarantula spider Acanthoscurria gomesiana. Dev. Comp. Immunol. 30, 545–556. 35. Wu, Z.P., Wang, X., and Zhang, X.G. (2011). Using non-uniform read distribution models to improve isoform expression inference in RNASeq. Bioinformatics 27, 502–508. 36. Langmead, B., and Salzberg, S.L. (2012). Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359. 37. Haas, B.J., Papanicolaou, A., Yassour, M., Grabherr, M., Blood, P.D., Bowden, J., Couger, M.B., Eccles, D., Li, B., Lieber, M., et al. (2013). De novo transcript sequence reconstruction from RNA-seq using the Trinity platform for reference generation and analysis. Nat. Protoc. 8, 1494–1512. 38. Grabherr, M.G., Haas, B.J., Yassour, M., Levin, J.Z., Thompson, D.A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q.D., et al. (2011). Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652. 39. Fu, L.M., Niu, B.F., Zhu, Z.W., Wu, S.T., and Li, W.Z. (2012). CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics 28, 3150–3152. 40. Altenhoff, A.M., Schneider, A., Gonnet, G.H., and Dessimoz, C. (2011). OMA 2011: orthology inference among 1000 complete genomes. Nucleic Acids Res. 39 (Database issue), D289–D294. 41. Altenhoff, A.M., Gil, M., Gonnet, G.H., and Dessimoz, C. (2013). Inferring hierarchical orthologous groups from orthologous gene pairs. PLoS ONE 8, e53786. 42. Edgar, R.C. (2004). MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797. 43. Wu, M., Chatterji, S., and Eisen, J.A. (2012). Accounting for alignment uncertainty in phylogenomics. PLoS ONE 7, e30288. 44. Price, M.N., Dehal, P.S., and Arkin, A.P. (2010). FastTree 2—approximately maximum-likelihood trees for large alignments. PLoS ONE 5, e9490. 45. Smith, S.A., and Dunn, C.W. (2008). Phyutility: a phyloinformatics tool for trees, alignments and molecular data. Bioinformatics 24, 715–716. 46. Zoller, S., and Schneider, A. (2013). Improving phylogenetic inference with a semiempirical amino acid substitution model. Mol. Biol. Evol. 30, 469–479. 47. Stamatakis, A., and Aberer, A.J. (2013). Novel parallelization schemes for large-scale likelihood-based phylogenetic inference. IEEE 27th International Parallel and Distributed Processing Symposium, 1195– 1204.

Spider Phylogenomics and the Origins of Orb Webs 1777

48. Guindon, S., and Gascuel, O. (2003). A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704. 49. Guindon, S. (2013). From trajectories to averages: an improved description of the heterogeneity of substitution rates along lineages. Syst. Biol. 62, 22–34. 50. Stamatakis, A. (2014). RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313. 51. Soubrier, J., Steel, M., Lee, M.S.Y., Der Sarkissian, C., Guindon, S., Ho, S.Y.W., and Cooper, A. (2012). The influence of rate heterogeneity among sites on the time dependence of molecular rates. Mol. Biol. Evol. 29, 3345–3358. 52. Lartillot, N., Rodrigue, N., Stubbs, D., and Richer, J. (2013). PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment. Syst. Biol. 62, 611–615. 53. Lartillot, N., and Philippe, H. (2004). A Bayesian mixture model for across-site heterogeneities in the amino-acid replacement process. Mol. Biol. Evol. 21, 1095–1109. 54. Stamatakis, A. (2014). ExaBayes. http://sco.h-its.org/exelixis/web/ software/exabayes/index.html. 55. Zhong, M., Hansen, B., Nesnidal, M., Golombek, A., Halanych, K.M., and Struck, T.H. (2011). Detecting the symplesiomorphy trap: a multigene phylogenetic analysis of terebelliform annelids. BMC Evol. Biol. 11, 369. 56. Warnes, G.R., Bolker, B., and Lumley, T. (2014). gplots: various R programming tools for plotting data. R package version 2.6.0. ftp://ftp. ccu.edu.tw/pub/languages/CRAN/web/packages/gplots/gplots.pdf. 57. Huson, D.H., and Bryant, D. (2006). Application of phylogenetic networks in evolutionary studies. Mol. Biol. Evol. 23, 254–267.