Genetic variation, pseudocryptic diversity, and phylogeny of Erpobdella (Annelida: Hirudinida: Erpobdelliformes), with emphasis on Canadian species


Journal Pre-proofs

Kevin Anderson, Georgina Braoudakis, Sebastian Kvist

PII: S1055-7903(19)30464-6
DOI: https://doi.org/10.1016/j.ympev.2019.106688
Reference: YMPEV 106688

To appear in: Molecular Phylogenetics and Evolution

Received Date: 26 July 2019
Revised Date: 11 November 2019
Accepted Date: 15 November 2019

Please cite this article as: Anderson, K., Braoudakis, G., Kvist, S., Genetic variation, pseudocryptic diversity, and phylogeny of Erpobdella (Annelida: Hirudinida: Erpobdelliformes), with emphasis on Canadian species, Molecular Phylogenetics and Evolution (2019), doi: https://doi.org/10.1016/j.ympev.2019.106688

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier Inc.


KEVIN ANDERSON a,b, GEORGINA BRAOUDAKIS c, SEBASTIAN KVIST a,b

a Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks Street, Toronto, Ontario, M5S 2B4, Canada

b Department of Natural History, Royal Ontario Museum, 100 Queen’s Park, Toronto, Ontario M5S 2C6, Canada

c Fisheries and Oceans Canada, Canada Centre for Inland Waters, 867 Lakeshore Road, Burlington, Ontario, L7S 1A1, Canada

Corresponding Author Email: [email protected]


Abstract

Leeches of the family Erpobdellidae are important members of benthic freshwater environments, where they are voracious predators of other invertebrates and an important source of nutrition for several species of vertebrates. Beset by a lack of reliable diagnostic morphological characters and destructive identification processes, molecular approaches have, in recent years, been employed to illuminate the relationships within this family, and DNA barcoding has been employed for identification purposes. However, an understanding of the levels of genetic variation across the geographic distributions of members of the genus is still lacking. Herein, we sequence the mitochondrial COI locus for 249 newly collected North American individuals, representing 5 species, as well as mitochondrial 12S rDNA, nuclear 18S rDNA, and nuclear 28S rDNA for a select subset of these. Our COI dataset was leveraged to detect potential cryptic species, and to calculate genetic distances as a proxy for the degree of gene flow between populations. Augmented by numerous sequences from GenBank, the multilocus dataset was used to reconstruct a phylogenetic hypothesis for worldwide members of the genus. Beyond corroborating previous overarching phylogenetic frameworks, our results show that an undescribed species that is morphologically and genetically similar to Erpobdella punctata exists in sympatry with this species – the new species has likely been overlooked in previous studies due to its morphological similarity with Erpobdella punctata. Erpobdella bucera is reported from Canada for the first time; and Erpobdella microstoma is newly reported from Saskatchewan and placed in a phylogeny for the first time. Finally, we find evidence for genetic structure in both E. cf. punctata and Erpobdella obscura that is correlated with major river drainage basin boundaries in North America.

Keywords: Erpobdella; COI; DNA barcoding; species delimitation; cryptic species complex


1. Introduction

Although the majority of leeches (subclass Hirudinea) assume an ectoparasitic, hematophagous lifestyle, several species are macrophagous predators (Sawyer, 1986; Toman and Dall, 1997; Young, 1981). In part due to their abundance, leeches play important roles in the function of ecosystems worldwide. On the one hand, many macrophagous leeches are voracious predators, often found devouring prey equal to their own size. On the other hand, leeches are also an important source of nutrition for several freshwater fish species (Clady, 1974; Sawyer, 1986), and several species are commonly sold as bait for commercial and recreational fishing, especially in northern North America. Moreover, several species of leeches (especially predaceous taxa) have been proposed as bioindicators to gauge levels of water pollution (Friese et al., 2003; Macova et al., 2009; Metcalfe et al., 1984). Members of the genus Erpobdella (Blainville, 1818), in particular, have been of interest in this regard and have been recognised as suitable indicators of pollutants ranging from trace metals (e.g. manganese and copper) to classes of PCBs and chlorophenols (Friese et al., 2003; Macova et al., 2009; Metcalfe et al., 1984).

Historically, members of Erpobdellidae have been classified into the following genera: Croatobranchus (Kerovec et al., 1999), Dina (Blanchard, 1892), Erpobdella, Mooreobdella (Pawlowski, 1955), Motobdella (Govedich et al., 1998), and Trocheta (Dutrochet, 1817). As our understanding of the relationships within and between these genera increased, it became clear that several of the genera were artificial assemblages, and were recovered as paraphyletic or polyphyletic groups in phylogenetic analyses. To ameliorate this, Siddall (2002) formally sunk all genera into Erpobdella, based on morphological and molecular phylogenetic analyses. This nomenclatural change has not been accepted by all leech systematists (e.g. Trajanovski et al., 2010).
However, since the synonymisation, multiple phylogenies with a broader taxonomic scope (e.g. Erpobdelliformes, Arhynchobdellida) have been reconstructed, providing further
support for Siddall’s (2002) reclassification of the genera within Erpobdellidae (Borda and Siddall, 2004; Oceguera-Figueroa et al., 2005, 2011; Tessler et al., 2018a, 2018b). Erpobdellid identification relies on relatively few diagnostic characters (for example, number of annuli between gonopores, eyespot patterns, and presence/absence of preatrial loops in the male reproductive system) and can be destructive (often requiring dissection) and time-consuming, such that molecular identification methods (primarily phylogenetic analysis and DNA barcoding) have become important tools for identification and classification (e.g. Langer et al., 2018; Siddall, 2002). Previous studies have failed to take into account the genetic variation across the broad geographic ranges of erpobdellids (which can extend across the majority of North America) and, as a result, the potential cryptic diversity and level of gene flow between populations are not as well understood as they are for other groups of leeches (Mack et al., 2019; Mack and Kvist, 2019; Trontelj and Utevsky, 2012). This is especially important when considering the reported levels of cryptic diversity within Annelida (de Carle et al., 2017; De Wit and Erséus, 2010; Gustafsson et al., 2009; Hovingh, 2004), coupled with evidence of unappreciated diversity within Erpobdellidae; for example, several studies have elevated subspecies to the level of species, in recognition of this “hidden” diversity (Oceguera-Figueroa et al., 2005, 2011). Leeches, in particular, are expected to have limited vagility across the environment because their biology confines them to freshwater “islands”. Thus, drainage basins are likely especially important for erpobdellids because their macrophagous ecology prevents them from moving between unconnected bodies of water via a host [but see Khan and Frick (1997)].
Species within Erpobdella are distributed as far north as the Northwest Territories in Canada and as far south as the state of Chiapas in Mexico, encompassing a distance of over 5000 kilometers (Klemm, 1982; Tessler et al., 2018). Given
their widespread distribution, it is likely that there is genetic structure and/or cryptic diversity among these leech populations. In our analyses, we would expect to detect this genetic structure as particular branching patterns where multiple clades form within a species, especially if these clades coincide with sampling from areas with biological and/or ecological significance (e.g., drainage basins). Similarly, these clades would exhibit a pattern of genetic variability such that the diversity between them would be greater than the diversity within each clade.

In spite of their ubiquity, sampling of erpobdellids has historically been geographically and taxonomically sparse. In Canada, 8 of the 37 nominal species of Erpobdellidae have been recorded. These are Erpobdella anoculata (Moore, 1898); Erpobdella annulata (Moore, 1922); Erpobdella fervida (Verrill, 1874); Erpobdella melanostoma (Sawyer and Shelley, 1976); Erpobdella microstoma (Moore, 1901); Erpobdella obscura (Verrill, 1872); Erpobdella parva (Moore, 1912); and Erpobdella punctata (Leidy, 1870) (Davies, 1973; Klemm, 1985); note that the names Erpobdella dubia (Moore & Meyer, 1951) and Erpobdella parva (Moore, 1912) have been formally synonymized under the latter epithet (Hovingh, 2004). Through increased taxon sampling, the present study aims to illuminate the levels of genetic variation (with emphasis on the barcoding locus) as a means of elucidating any potential genetic structure and/or cryptic diversity within the family. By using a multilocus approach with both mitochondrial and nuclear sequence data, we also build on previous phylogenetic hypotheses for the family and place the Canadian members of the family within a broader phylogenetic context.

2. Materials and Methods

2.1 Specimen collection

Specimens of Erpobdella were collected between August 2015 and June 2018 from the following Canadian provinces: Quebec, Ontario, Manitoba, Saskatchewan, Alberta, and British Columbia, as well as from Iowa and Minnesota in the US (Fig. 1). Specimens were collected both manually, from the undersides of stones, reeds, and other detritus, and with traps baited with chicken or beef liver and left for 1 to 12 hours. In total, 249 erpobdellids were collected from 55 localities (clusters of sites >100 km apart) consisting of lentic or slightly lotic environments (i.e., ponds, lakes, and slow-moving streams). All specimens were relaxed in a 10-15% ethanol solution prior to fixation in 95% ethanol. When the size of the leech permitted, a small portion of the caudal sucker was cut off (to reduce the probability of contamination by gut contents) and stored at -20˚C until DNA extraction. When the leech was too small for this, it was cut transversely, roughly in half, and the posterior end was used for DNA extraction. The anterior ends of the specimens serve as morphological vouchers and are deposited in the invertebrate zoology collections at the Royal Ontario Museum, Toronto, Canada. Metadata associated with each individual, including species name and collection locality, are presented in Table S1.

2.2 Specimen identification and morphological analyses

Specimen identification was a multi-step process. Based on specialized literature (Klemm, 1982; Sawyer, 1986), preliminary field identifications were made for freshly fixed specimens (< 1 hour after fixation) and these were subsequently re-examined in a laboratory setting. Identifications were then corroborated based on the placement of sequences within the COI tree (see below), i.e., relative to GenBank sequences. Dissections were then conducted to further support identifications as indicated by the original descriptions of each putative species
(Davies et al., 1985; Leidy, 1870; Moore, 1953, 1912, 1901; Moore and Meyer, 1951; Sawyer and Shelley, 1976; Verrill, 1872), as well as Klemm (1982) when the original description was insufficient for definitive diagnosis. All dissections were performed using a Leica Wild M10 dissection microscope fitted with a Spot Flex 15.2 64 megapixel camera (Spot Imaging).

2.3 DNA sequencing

For DNA extraction, the DNeasy Tissue Kit (QIAGEN Inc., Valencia, CA) was used following the manufacturer’s protocol. Extracted DNA was either used for immediate amplification or stored at -20˚C to prevent degradation. The 658 basepair (bp) “Folmer region” (Folmer et al., 1994) of cytochrome c oxidase subunit I (COI) was amplified and sequenced for each specimen. Following preliminary results from the COI locus, 21 specimens were selected for additional sequencing of mitochondrial 12S rDNA, and nuclear 18S rDNA and 28S rDNA. All primers used in this study, alongside their associated citations, are presented in Table 1. For each sample, the following amplification reaction mixture was used: 16.34 µL ddH2O; 2.5 µL MgCl2; 2.5 µL standard PCR buffer; 1 µL of each primer; 0.56 µL dNTP; and 0.1 µL Taq polymerase (Invitrogen, Carlsbad, CA). For COI, amplification used the following protocol: 94˚C for 1 min; 5 cycles of [94˚C (30 s), 40˚C (40 s), 72˚C (1 min)]; 35 cycles of [94˚C (30 s), 46˚C (40 s), 72˚C (1 min)]; and 72˚C for 5 min. The protocol used for 12S was 94˚C for 2 min; 32 cycles of [94˚C (30 s), 48˚C (30 s), 72˚C (45 s)]; and 72˚C for 7 min. The protocol used for 18S was 94˚C for 4 min; 35 cycles of [94˚C (30 s), 46˚C (30 s), 70˚C (1.5 min)]; and 70˚C for 7 min. The protocol used for 28S was 94˚C for 5 min; 39 cycles of [95˚C (1 min), 52˚C (1 min), 72˚C (1 min)]; and 72˚C for 7 min. ExoSAP-IT (Thermo Fisher Scientific Inc., Waltham, MA) was used according to the manufacturer’s protocol to purify all successful amplifications.
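The reagent volumes and the COI cycling profile listed above can be tallied in a few lines. The following is only an illustrative sketch and not part of the published protocol: the variable and function names are hypothetical, "each primer" is assumed to mean one forward and one reverse primer, the 10% pipetting overage is an assumption, and the template DNA volume (not stated in the text) is left out.

```python
# Per-reaction PCR volumes (µL) as listed in the text; template DNA is not listed.
# Assumption: "1 µL of each primer" means one forward and one reverse primer.
REACTION_UL = {
    "ddH2O": 16.34,
    "MgCl2": 2.5,
    "PCR buffer": 2.5,
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "dNTP": 0.56,
    "Taq polymerase": 0.1,
}

# COI thermocycler profile from the text: (cycles, [hold times in seconds]).
COI_PROFILE = [
    (1, [60]),           # 94 C for 1 min
    (5, [30, 40, 60]),   # 94 C / 40 C / 72 C
    (35, [30, 40, 60]),  # 94 C / 46 C / 72 C
    (1, [300]),          # 72 C for 5 min
]


def master_mix(n_samples, overage=0.10):
    """Scale each reagent for n_samples plus a hypothetical pipetting overage."""
    factor = n_samples * (1.0 + overage)
    return {reagent: round(vol * factor, 2) for reagent, vol in REACTION_UL.items()}


def hold_time_minutes(profile):
    """Sum programmed hold times only; ramping between temperatures is ignored."""
    return sum(cycles * sum(steps) for cycles, steps in profile) / 60.0


total_ul = round(sum(REACTION_UL.values()), 2)
print(f"Reagent volume per reaction (without template): {total_ul} µL")
print(f"COI programmed hold time: {hold_time_minutes(COI_PROFILE):.1f} min")
```

Under these assumptions the listed reagents sum to 24 µL per reaction, and the COI program holds for a little over an hour and a half before ramp times are added.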


Amplified products were cycle sequenced in both forward and reverse directions using the following reaction mixture: 4 µL ddH2O; 2 µL primer; 0.5 µL each of ABI Big Dye Terminator V 3.1 and Big Dye 5x Sequencing Buffer (Applied Biosystems, Carlsbad, CA). The following thermocycler protocol was used: 96˚C for 1 min; 30 cycles of [96˚C (10 s), 50˚C (5 s), 60˚C (4 min)]. Finally, DNA was precipitated from the ethanol and was sequenced using an ABI PRISM 3730 (Applied Biosystems, Carlsbad, CA). Sequences were edited and assembled using Geneious ver. 11.0.2 (Kearse et al., 2012); when necessary, the trim level was relaxed from a 0.1 error probability limit to allow for assembly. To screen for potential DNA contamination, sequences were BLASTed (using the BLASTn algorithm) against the GenBank non-redundant (nr) sequence database (https://blast.ncbi.nlm.nih.gov/Blast.cgi accessed: 01/04/2019) prior to any formal analysis.

2.4 Phylogenetic analyses

In total, sequence data for 424 terminals (both newly collected and from GenBank) were used for the COI tree, and a matrix of 91 terminals was used in the remaining gene trees and concatenated trees, comprising representatives from across the Erpobdelliformes along with the americobdelliform Americobdella valdiana and the hirudiniform Cylicobdella coccinea; the latter was used to root the tree, following the results of Oceguera-Figueroa et al. (2011) (Table S1). Importantly, sequences that were preceded by the tag “UNVERIFIED:” were excluded from the analysis. The online version of MAFFT ver. 7 (Katoh et al., 2017) was used to jointly align sequences for each data matrix, employing the ‘auto’ option under default settings. Mesquite ver. 3.51 (Maddison and Maddison, 2018) and the online version of ALTER (Glez-Peña et al., 2010)
were used to convert the alignment to the NEXUS and PHYLIP file types, respectively. Mesquite ver. 3.51 (Maddison and Maddison, 2018) was used to concatenate the gene matrices. Unless otherwise noted, the same programs and settings were used for gene tree and concatenated analyses. Parsimony analyses were conducted in TNT ver. 1.5 (Goloboff et al., 2008), using a New Technology search with 1000 replications, five rounds of ratcheting, and five rounds of tree fusing. Bootstrap support values were determined through 1000 iterations, applying default settings. Before conducting maximum likelihood and Bayesian inference analyses, PartitionFinder ver. 2.1.1 (Lanfear et al., 2012) was used to assess the most appropriate partitioning scheme and model of nucleotide evolution under the Akaike Information Criterion using the ‘greedy’ search algorithm and unlinked branch lengths. For the COI analyses, one partition was identified containing all three codon positions. For the concatenated analyses, five partitions were identified: COI position one, COI position two, 12S, and 28S were each assigned individual partitions, and COI position three and 18S shared a partition. Both the maximum likelihood and Bayesian analyses were conducted on the CIPRES platform (Miller et al., 2012). Maximum likelihood analyses were conducted with RAxML ver. 8.2.10 (Stamatakis, 2014) wherein 25 initial gamma rate categories were used and bootstrap support values were determined with 1000 iterations of the rapid bootstrapping algorithm using the settings specified above. For the concatenated dataset, Bayesian analyses were conducted using MrBayes ver. 3.2.6 (Huelsenbeck et al., 2001; Ronquist and Huelsenbeck, 2003) with analyses run for 100 million generations, with 2 runs and 8 chains per run and sampling every 1000 generations. For the COI dataset, we employed 30 million generations, with 2 runs and 4 chains per run and sampling every 1000 generations. A relative burn-in of 25% was assigned and
Tracer ver. 1.7.1 (Rambaut et al., 2018) was used to ensure that stationarity and convergence were achieved. FigTree ver. 1.4.3 (Rambaut, 2018) was used to visualize the phylogenetic trees.
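To make the five-partition scheme for the concatenated matrix concrete, the sketch below writes it in the partition-file syntax that RAxML accepts. It is a hedged illustration only: the concatenation order and the 12S, 18S, and 28S alignment lengths are hypothetical placeholders (only the 658 bp COI length comes from the text), the COI block is assumed to start in-frame at codon position one, and the function names are invented for this example.

```python
# Aligned locus lengths in base pairs, in an assumed concatenation order.
# Only the COI length (658 bp) is taken from the text; the rest are placeholders.
LOCI = [("COI", 658), ("12S", 350), ("18S", 1800), ("28S", 2000)]


def locus_ranges(loci):
    """Return {name: (start, end)} using 1-based, inclusive coordinates."""
    ranges, start = {}, 1
    for name, length in loci:
        ranges[name] = (start, start + length - 1)
        start += length
    return ranges


def raxml_partition_lines(loci):
    """Express the five partitions described in the text in RAxML syntax.

    Assumes the COI block begins at codon position one.
    """
    r = locus_ranges(loci)
    coi_start, coi_end = r["COI"]
    return [
        f"DNA, COI_pos1 = {coi_start}-{coi_end}\\3",
        f"DNA, COI_pos2 = {coi_start + 1}-{coi_end}\\3",
        f"DNA, 12S = {r['12S'][0]}-{r['12S'][1]}",
        f"DNA, 28S = {r['28S'][0]}-{r['28S'][1]}",
        # COI third positions and 18S share one partition.
        f"DNA, COI_pos3_18S = {coi_start + 2}-{coi_end}\\3, {r['18S'][0]}-{r['18S'][1]}",
    ]


for line in raxml_partition_lines(LOCI):
    print(line)
```

With real alignment lengths substituted for the placeholders, the printed lines could serve as a starting point for a partition file; the substitution models chosen by PartitionFinder are not reproduced here.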

2.5 Haplotype networks and COI distances

Haplotype networks were created in PopART ver. 1.7 (http://popart.otago.ac.nz/index.shtml accessed: 01/04/2019) using the TCS algorithm (statistical parsimony) for network construction (Clement et al., 2002). Uncorrected p-distances were calculated for the COI locus in MEGA ver. 7.0.26 (Kumar et al., 2016), employing uniform rates among sites and pairwise deletion of sites with gaps. The sequences were organized in two manners: in the first, within and between group distances were calculated for population-level clades (i.e., according to the most inclusive clades below the species level, as identified in Fig. 2, S1, S2) and in the second, the groups were formed based on the major river basin from which the sample was collected (Fig. 1). For each of the approaches above, E. parva, E. cf. punctata, and E. obscura were separately analysed. These analyses were restricted to the above three species because they were the most comprehensively sampled. For the drainage basin analysis, “E. cf. punctata 1” was excluded from the analysis due to low sampling and its putative status as a separate species from E. cf. punctata 2-4 (see section 3.2 and 3.4).
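For readers unfamiliar with the metric, an uncorrected p-distance is simply the proportion of compared sites at which two aligned sequences differ; under pairwise deletion, a site is dropped only for the pair in which it is missing. The function below is a minimal sketch of that definition, not MEGA's implementation. The function name is hypothetical, and treating `N` and `?` as missing data alongside gaps is an assumption.

```python
def p_distance(seq_a, seq_b, missing="-?N"):
    """Uncorrected p-distance between two aligned sequences.

    Pairwise deletion: a site is skipped for this pair if either sequence
    has a gap or missing-data character there (assumed set: '-', '?', 'N').
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in missing or b in missing:
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


# Toy sequences (not real data): 7 comparable sites, 1 difference.
example = p_distance("ACGTACGT", "ACGTTC-T")
print(f"{100 * example:.2f}%")
```

Multiplying the result by 100 gives the percentage scale used for the distances reported in the Results.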

3. Results

3.1 Gene trees

All sequences used in the present study are deposited in GenBank under the following accession numbers: MN612794-MN613104 (see Table S1). Of the 33 erpobdellid species analyzed, specimens of 5 species were newly collected for this study: Erpobdella bucera
(Moore, 1953), E. parva, E. microstoma, E. obscura, and a group of morphologically similar, but genetically variable, specimens all exhibiting the external phenotype of E. punctata (henceforth termed E. cf. punctata). Of those newly collected species, E. bucera is recorded for the first time in Canada (Ontario), and E. microstoma is recorded for the first time in Saskatchewan. Figures 2, S1 and S2 show the resulting COI trees from the maximum likelihood, Bayesian inference, and maximum parsimony analyses, respectively. For a simplified representation of the tree presented in figure 2, see figure 3. All trees agree on species-level clade assignments; that is, at the species level, all three optimality criteria show the same groupings of individual sequences. As expected from a tree based on only a single, rapidly evolving locus, there are some inconsistencies between the optimality criteria regarding deeper nodes. For this analysis, emphasis was put on the Canadian members of the family, and the levels of variation within and between these taxa/clades and their closest relatives. Specimens corresponding to the description of Erpobdella punctata nested within two distinct clades, with subclades present in each of these. As we are not yet certain which of these clades pertains to Erpobdella punctata sensu stricto, we refer to the subclades as “E. cf. punctata” below, with subclades indicated by numerical suffixes. The first clade is formed by E. cf. punctata 1 (likelihood bootstrap support [LBS] = 100, posterior probability [PP] = 100, parsimony bootstrap support [PBS] = 100) collected from Quebec and Ontario, and the second clade is composed of E. cf. punctata 2 – 4 (LBS = 96, PP = 100, PBS = 99), collected from Quebec, Ontario, Manitoba, Saskatchewan, Alberta, British Columbia and Iowa. Sequences of the species Erpobdella melanostoma and Erpobdella annulata (Moore, 1922) are recovered as the sister to E. cf. 
punctata 1 (LBS = 81, PP = 99; note that, under parsimony, E. melanostoma places as the sister to the E. cf. punctata + E. annulata clade [PBS = 2]) and E. cf. punctata 2 – 4
(LBS = 99, PP = 100, PBS = 93), respectively, rendering “E. cf. punctata” polyphyletic. Several sequences labeled Erpobdella montezuma (GenBank accession numbers GQ368760, KM611825, KM611937, KM611989, KM612138, and KM612253) (Davies et al., 1985) are nested within the clade formed by E. cf. punctata 3, suggesting that one of these identifications is erroneous. Based on our in-depth morphological analyses (see section 3.4), it seems overwhelmingly likely that the sequences labeled E. montezuma in fact represent specimens of E. cf. punctata. Erpobdella cf. punctata 2 – 4 form a clade with high support (LBS = 96, PP = 100, PBS = 99), and there is some evidence for genetic structure within the clade; E. cf. punctata 2 (LBS = 100, PP = 100, PBS = 99) consists of two specimens collected from British Columbia; E. cf. punctata 3 (LBS = 31, PP = 87, PBS = 72) consists of specimens collected from western Manitoba, Saskatchewan, and Alberta, as well as the GenBank sequences labeled M. montezuma collected, in part, from Arizona, USA; E. cf. punctata 4 (LBS = 94, PP = 100, PBS = 88) consists of specimens collected from eastern Canada (eastern Manitoba, Ontario, and western Quebec), as well as Iowa. Although all specimens of E. microstoma cluster together (LBS = 99, PP = 100, PBS = 99) with branch lengths that are consistent with intraspecific variation, there are two well-supported groups within this clade (Fig. 2). One consists of KX781833 and ROMIZI 10498 (LBS = 100, PP = 99, PBS = 99), whereas the other consists of ROMIZI 11406-11408 (LBS = 99, PP = 99, PBS = 98). KX781833 and ROMIZI 10498 were both collected in Minnesota, and the three newly sequenced specimens were collected from Pipestone Creek, Saskatchewan. Erpobdella obscura, the most densely sampled species in our dataset, also displays signs of genetic structure, but the degree of structure and the geographic areas defining each group are less marked than for other species analysed.
Only one clade (LBS = 97, PP = 100, PBS = 99)
was recovered for E. obscura, but (barring JQ821638) this clade can be further subdivided into two groups, on the basis of genetic distances: a paraphyletic group – E. obscura 1 – containing specimens from across most of Canada (western Ontario, Manitoba, Saskatchewan, Alberta, and British Columbia) and Minnesota; and E. obscura 2, a clade composed of specimens from eastern Ontario and western Quebec (LBS = 90, PP = 99, PBS = 97). Gene trees were also constructed for a subset of specimens for each of the 12S, 18S, and 28S rDNA data sets (supplementary Fig. S3-S8); specimen choice was guided by the results in the COI gene tree analysis to maximize genetic variation and clade inclusion. The tree topologies for the COI and 12S loci resemble each other more closely (albeit with more resolution in COI) than either resembles the gene trees based on nuclear data. Notably, in the 12S gene tree, ROMIZI13685 (E. parva) places squarely within the clade containing all of the E. obscura specimens, suggesting the sequence is possibly contaminated. The 18S gene trees (Fig. S5, S6) display less resolution than either mitochondrial gene tree, but all putative species collected for this analysis are recovered as monophyletic save for E. punctata, which is rendered paraphyletic by inclusion of a multitude of species, including E. parva and E. obscura. In a similar vein, the gene trees for 28S (Fig. S7, S8) reveal that this locus does not possess much phylogenetic information at the species level; in both the parsimony and maximum likelihood analyses, there were several polytomies. In these trees, JQ821582 (E. obscura) has a dramatically longer branch length, and its placement outside of Erpobdella suggests this sequence is the result of contamination and/or the specimen was misidentified as E. obscura.

3.2 Species trees

The resolution of the multilocus phylogenies (produced under ML, BI, and MP) differed somewhat, with maximum likelihood and Bayesian inference producing similarly resolved topologies while parsimony produced a less resolved hypothesis (Fig. 4, S9 and S10 respectively). By and large, however, the topologies are similar regarding the positions of the main clades. Under ML and BI, Salifidae is recovered as the sister to Orobdellidae (LBS = 31, PP = 66) and Gastrostomobdellidae is recovered as the sister to Erpobdellidae, albeit with weak support (LBS = 60, PP = 99). Under MP, the placements of these families are unresolved. Despite weak support at the inter-familial level, the monophyly of all families is well supported (LBS = 100, PP = 100, PBS ≥ 97 for each clade). The topologies under each optimality criterion are consistent within Erpobdellidae (albeit less resolved under MP) except with respect to the placement of Erpobdella lineata (Müller, 1774), which is recovered as the sister to the remaining erpobdellids under MP and ML (LBS = 100, PBS = 99), but is found to be sister to the ‘Nearctic’ clade (see below) under BI with low support (PP = 70). Aside from E. lineata, the erpobdellids form two main clades: one composed exclusively of Nearctic species (the ‘Nearctic’ clade) (LBS = 99, PP = 100, PBS = 83) and one composed almost exclusively of Palearctic species (the ‘Palearctic’ clade) (LBS = 100, PP = 100, PBS = 100). The only exception to this scheme is the placement of the Nearctic E. parva and E. obscura within the Palearctic clade. Although the Nearctic clade is largely unresolved beyond the species level in MP, two clades are recovered under ML and BI: one composed of E. melanostoma, E. cf. punctata 1, E. annulata, and E. cf. punctata 2-4 (LBS = 100, PP = 100, PBS = 99), and one composed of a mosaic of both northern and southern ranging species (i.e., E. bucera, Erpobdella triannulata (Moore, 1908), Erpobdella mexicana (Dugès, 1876), E.
microstoma, Erpobdella adani (Tessler,
Siddall & Oceguera-Figueroa, 2018), Erpobdella costata (Sawyer & Shelley, 1976), and Erpobdella ochoterenai (Caballero, 1932)) (LBS = 54, PP = 99). Although the monophyly of all species received high support, this latter clade had fairly weak support and only a subset of it (a clade composed of E. microstoma, E. adani, E. costata, and E. ochoterenai) is recovered under MP (PBS = 41). Notably, a sister relationship between E. microstoma (placed in a phylogeny for the first time here) and E. adani is well supported (LBS = 95, PP = 100, PBS = 90). Despite the weak support under MP, the clade containing E. microstoma, E. adani, E. costata, and E. ochoterenai is recovered under all three optimality criteria, and places as the sister to E. triannulata and E. mexicana (LBS = 40, PP = 95). These species, in turn, are recovered as the sister to E. bucera under ML and BI (LBS = 54, PP = 99). All three analyses robustly support the existence of two distinct clades within E. cf. punctata. One clade (E. cf. punctata 1) is recovered as the sister to E. melanostoma (LBS = 100, PP = 100, PBS = 89) while the other clade (E. cf. punctata 2 – 4) is recovered as the sister to E. annulata (LBS = 78, PP = 99, PBS = 93). The ‘Palearctic’ clade, on the other hand, has a consistent, well-supported topology across MP, ML, and BI. Within this clade, two subclades are recovered: one composed of the Palearctic erpobdellids represented in our dataset (Erpobdella testacea (Savigny, 1820), Erpobdella japonica (Blanchard, 1897), and Erpobdella octoculata (Linnaeus, 1758)) (LBS = 100, PP = 100, PBS = 99), and one composed of the sister species E. parva and E. obscura (the Nearctic exceptions) (LBS = 99, PP = 100, PBS = 99).

3.3 COI distances

Uncorrected p-distances were calculated using COI sequences, and can be found in Table S2. The average intraspecific distance within E. parva is 1.40 (0.30) (uncorrected p-distance (%)
± standard error of the mean). Likewise, the average intra-group distances within E. obscura 1 and 2 range from 0.50 (0.10) to 1.10 (0.20), while the average distance between the groups is 2.90 (0.50). Contrary to these rather low distance values, E. cf. punctata displays exceptionally high inter-clade distances when E. cf. punctata 1 is included; this is underscored by the accompanying haplotype network (Fig. S11). The intra-clade distances within E. cf. punctata range between 0 and 0.70 (0.10), while the inter-clade distance between E. cf. punctata 1 and E. cf. punctata 2 – 4 is 13.80 (1.20). By contrast, the inter-clade distances between any of E. cf. punctata 2 – 4 range between 2.80 (0.60) and 4.90 (0.80). For the drainage basin analysis, the most highly sampled basins were the St. Lawrence River (SLDB) and Nelson River (NDB) basins; these roughly correspond to eastern and western Canada, respectively. In both E. parva and E. obscura, the SLDB harboured a greater genetic diversity (i.e., 1.70 (0.30) for E. parva; 1.70 (0.70) for E. obscura) than the NDB (i.e., 0.30 (0.10) for E. parva; 1.00 (0.20) for E. obscura), but in E. cf. punctata 2 – 4 the opposite scenario was observed (Table S3). In fact, the most acute separation was found between specimens of E. cf. punctata 2 – 4 from the SLDB (intra-basin distance 0.50 (0.10)) and all other drainage basins included in the analyses (inter-basin distance 4.00 – 4.80 (0.60 – 0.80)). We hesitate to draw conclusions from the rest of the basins, in which sampling was limited, in terms of both the number of specimens obtained and the geographic location in which these specimens were collected.
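The within- and between-group averages reported above are means over pairwise distances, grouped either by clade or by drainage basin. The sketch below shows one way such averages could be computed from a table of pairwise distances. It is an illustration under stated assumptions: the specimen identifiers, group labels, and distance values are made-up toy data, the function name is hypothetical, and the standard errors reported in the text are not reproduced.

```python
from itertools import combinations, product
from statistics import mean


def group_distances(dist, groups):
    """Mean pairwise distances within each group and between each pair of groups.

    dist:   {frozenset({id_a, id_b}): distance}
    groups: {group_name: [specimen ids]}
    """
    within = {}
    for name, ids in groups.items():
        pairs = [dist[frozenset(p)] for p in combinations(ids, 2)]
        within[name] = mean(pairs) if pairs else None  # None for singletons
    between = {}
    for (g1, ids1), (g2, ids2) in combinations(groups.items(), 2):
        between[(g1, g2)] = mean(dist[frozenset(p)] for p in product(ids1, ids2))
    return within, between


# Toy data (hypothetical specimens and distances in %, not from Table S2 or S3).
GROUPS = {"east": ["e1", "e2"], "west": ["w1", "w2"]}
DIST = {
    frozenset({"e1", "e2"}): 0.5,
    frozenset({"w1", "w2"}): 1.0,
    frozenset({"e1", "w1"}): 3.0,
    frozenset({"e1", "w2"}): 3.0,
    frozenset({"e2", "w1"}): 2.5,
    frozenset({"e2", "w2"}): 3.5,
}

within, between = group_distances(DIST, GROUPS)
print(within)
print(between)
```

A between-group mean that clearly exceeds both within-group means, as in this toy example, is the pattern the text interprets as genetic structure.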

3.4 Morphological examinations

Erpobdella cf. punctata is composed of two morphologically distinct groups. E. cf. punctata 1 lacks preatrial loops (Fig. 5d), and the dorsal pigmentation pattern consists of a bright
yellow dorsum with two pairs of black, dotted, paramedial stripes, extending down the entire length of the body. There is variability in the amount of dorsal pigmentation between each outer pair of stripes, ranging from four distinct stripes to two thick black bars, but there is no observable pigmentation along the median line (i.e., between the stripes bordering the median of the dorsum) (Fig. 5a). Both gonopores are located in furrows and are two annuli apart (Fig. 5c). Erpobdella cf. punctata 2, 3, and 4 are morphologically uniform: all specimens dissected possess preatrial loops extending to ganglion XI, and have atrial cornuae that extend anteriorly and are curved ventromedially (Fig. 5h). In terms of pigmentation patterns, these specimens have a much more subdued, tan colouration and, in general, only the first pair of black, dotted, paramedial stripes (i.e., the pair bordering the median) is pronounced. Even so, both pairs of stripes are often somewhat discontinuous (especially towards the anterior end), and the pigmentation between each outer pair of stripes is variable (Fig. 5e). The gonopores are located in furrows, and are separated by two annuli (Fig. 5g). Importantly, no crop chambers, crop caeca, or large, follicular testisacs were found in any of the E. cf. punctata specimens, characters that are otherwise diagnostic for E. montezuma. Dissections of E. microstoma, E. bucera, E. obscura, and E. parva corroborate the initial identifications of each species. Erpobdella microstoma is a light gray colour without accessory pigmentation (Fig. 6a, b), gonopores are located in furrows, separated by three annuli (Fig. 6c), and the atrial cornuae curve postero-ventrally, with the ejaculatory ducts lacking preatrial loops (Fig. 6d). Erpobdella bucera is a light tan colour also lacking accessory pigmentation (Fig. 6e, f), the gonopores are located on annuli, separated by two annuli (Fig.
6g), and the cornuae project laterally from the atrium, with the ejaculatory ducts lacking preatrial loops (Fig. 6h). Erpobdella obscura is a grayish green colour, with mottled black pigmentation across the entire

17

length of the body, the venter is considerably lighter than the dorsum and devoid of most of the black pigmentation (Fig. 7a, b). The gonopores are located in furrows, and are separated by two annuli (Fig. 7c), the atrial cornuae extend anteriorly and curve ventromedially, and the ejaculatory ducts possess preatrial loops (Fig. 7d). Erpobdella parva is grayish-brown in colour, interrupted by transverse rows of yellow pigmentation spots on each annulus, and with a prominent black medial stripe running along the length of the body (becoming more faint anteriorly) (Fig. 7e, f). The gonopores are quite variable, being located on annuli in some specimens, in furrows in other specimens, or a combination of both. The separation between gonopores in the specimens observed ranged from two-and-a-half to four annuli. The most common arrangement was a separation of three-and-a-half annuli between gonopores, with the male pore located on an annulus and the female pore located in a furrow (Fig. 7g). The atrial cornuae extend anteriorly and curve ventromedially, and the ejaculatory ducts possess preatrial loops (Fig. 7h).

4. Discussion

Through broadened geographic sampling and a novel suite of analyses, the present study sheds light on the phylogeny of the leech genus Erpobdella and reveals some patterns of genetic variation across a broad swath of the geographic distribution of the included taxa. Due to the paucity of available sequence data for taxa across Erpobdelliformes and the relatively recent erection of Orobdellidae (Nakano et al., 2012), the relationships among the families comprising Erpobdelliformes have not yet been firmly established. Nakano et al. (2012) show a sister relationship between Erpobdellidae and Gastrostomobdellidae and, although the relationships between families remain uncertain in our parsimony tree, our ML and BI topologies support a sister relationship between Erpobdellidae and Gastrostomobdellidae as well (LBS = 60, PP = 95). COI was removed by Nakano et al. (2012) due to saturation, which could account for the different placement of Salifidae as sister to Erpobdellidae + Gastrostomobdellidae; however, the presented tree included fewer species across each of the families, and both of our topologies still remain largely compatible, suggesting that the saturation of COI does not have a large effect on the phylogenetic relationships of Erpobdelliformes. The additional support provided here for a sister relationship between Erpobdellidae and Gastrostomobdellidae is of particular interest because it implies the convergent acquisition/loss of the gastropore and the gastroporal duct in orobdellids and gastrostomobdellids or the salifids and the erpobdellids, respectively. However, these results should be interpreted with caution given the relatively low bootstrap support value.

Our phylogenetic hypotheses are broadly consistent with those of previous studies, notably Oceguera-Figueroa et al. (2011) and Tessler et al. (2018a). Differences lie primarily in the placement of E. lineata, which is not robust to different methods of analysis (e.g., Figs. 4, S9). Our hypotheses provide additional support for the biogeographic scenario proposed by Oceguera-Figueroa et al. (2011) for the evolution of Erpobdellidae. The authors proposed that two major clades of Erpobdella have evolved, driven by the Tethyan and Atlantic vicariance events: one clade composed of Palearctic species (with the exception of the Nearctic E. obscura and E. parva) and one that is composed exclusively of Nearctic species. The two alternative placements of E. lineata (either as the sister taxon to all remaining species of Erpobdella or as the sister taxon to the Nearctic clade) suggest that either the Nearctic and Palearctic clades both evolved from an ancestor shared with E. lineata, or that E. lineata is a descendant of the (likely Palearctic) progenitor of the Nearctic clade. The phylogenetic position of E. lineata clearly warrants further investigation, as it could help to differentiate these historical biogeographic hypotheses. In particular, future analyses should aim to include more comprehensive sequencing of species such as Erpobdella johanssoni (Johansson, 1927), Erpobdella mestrovi (Kerovec, Kučinić, and Jalžić, 1999) and Erpobdella krasensis (Sket, 1968), which may be close relatives of E. lineata.

Our results show that two species, both strongly resembling Erpobdella punctata, exist sympatrically across a section of the known geographic distribution of this species. Although the morphological similarity between these species is quite extensive, there are some distinguishing features that can diagnose each of the species. Members of E. cf. punctata 1 have dorsal pigmentation that is distinctly yellow in colour and always display two pairs of strong, black, dotted, dorsal paramedial stripes. There is often black pigmentation between each pair of stripes that lie lateral to the median, but the amount of pigment varies, ranging from almost no pigmentation (i.e., four distinct paramedial stripes) to what appears as two thick, black, paramedial bars. Moreover, the specimens have cornuae that project laterally, and the ejaculatory ducts do not form preatrial loops. By contrast, members of E. cf. punctata 2-4 have more tan dorsal pigmentation, and the inner pair of paramedial stripes is more prominent than the outer pair, which is often weak and discontinuous. There is also consistently less pigmentation between the pairs of stripes lateral to the median. These specimens have cornuae that project anteriorly and extend ventromedially, while the ejaculatory ducts form preatrial loops extending to ganglion XI. In his original description of E. punctata, Leidy (1870) made no reference to internal morphology, and described the colour of the specimen as “blackish olivaceous… above minutely punctated with yellowish olivaceous or dusky whitish.” Based on this description, it seems likely that he was referring to E. cf. punctata 1.
Meanwhile, Klemm (1982) uses the presence of preatrial loops as a diagnostic character that can, in part, help identify E. punctata. Since E. cf. punctata 1 lacks preatrial loops yet is consistent with Leidy’s description, we cannot be sure whether Klemm used specimens of the same species to produce his key. Furthermore, if both groups of E. cf. punctata are sympatric at the type locality, it is possible that Leidy based his description on both putative species. Because specimens from the type locality (Delaware River near Beverly, Burlington Co., NJ) were not examined herein, we hesitate to make any taxonomic changes.

It is important to note that we found evidence for erroneous identification of several sequences labeled as E. montezuma on GenBank. Given the placement (and short branch lengths) of “E. montezuma” within the E. cf. punctata 3 clade, and given that dissections of members of that clade showed internal morphology that contrasts starkly with the description of E. montezuma, we suggest that these sequences (GQ368760, GQ368820, GQ368802, GQ368779) are indeed wrongly labeled. The taxonomic history of this species is somewhat convoluted. First described as Erpobdella montezuma (Davies et al., 1985), it was transferred to the genus Motobdella by Govedich et al. (1998). Based on the results of a thorough phylogenetic analysis, Oceguera-Figueroa et al. (2011) transferred Motobdella montezuma back to Erpobdella. Unfortunately, the latter transfer was based on the erroneously labeled sequences, such that our finding necessitates the formal reversion of Erpobdella montezuma back to Motobdella montezuma, which we invoke herein.

This study reports novel localities for two species of Erpobdella. Our record of Erpobdella bucera from Frontenac Provincial Park, ON, represents the first collection of this species in Canada, and Erpobdella microstoma is recorded for the first time in Saskatchewan, specifically from Pipestone Creek, Moosomin Regional Park. The latter species is also placed in a phylogeny for the first time in the present study.

To date, few publications have analyzed the genetic structure of leech populations (especially Erpobdella) across large spatial scales. Some have analyzed genetic structure across smaller scales (Govedich et al., 1999), and those that examined larger scales did so with members of other families which differ greatly from Erpobdellidae with regard to their behaviour and, importantly, their feeding ecology (e.g., Hirudinidae) (Trontelj and Utevsky, 2012). Previous studies using other leech genera found some evidence of genetic structure across northern North America, but the strength of that evidence was variable (Mack et al., 2019; Mack and Kvist, 2019). The results of the present study are similarly mixed: some species show signs of genetic structure (E. cf. punctata and E. obscura), while others are recovered as a single, largely homogeneous group (e.g., E. parva). Moreover, the extent of gene flow (using genetic distances as a proxy) differs among species and among clades within species groups. As seen in Figs. 2, S1, and S2, erpobdellids from western (Manitoba, Saskatchewan, Alberta, British Columbia) and eastern (Ontario and Quebec) Canada are often separated in the phylogenetic trees; this pattern suggests that some geographic barrier may exist around the midpoint of the country, causing some degree of reproductive isolation. Results from our genetic distance analyses suggest that the separation might be driven, in part, by the degree of connectivity between major river drainage basins. In particular, the division between the St. Lawrence and the Nelson River drainage basins appears to be correlated with the phylogenetic pattern observed in E. obscura and E. cf. punctata 2-4 (i.e., population-level clades largely correspond to samples from a particular drainage basin). Unfortunately, the volume of sampling from other basins was limited, so our ability to draw conclusions remains limited as well. Still, our analysis adds to a growing body of evidence suggesting that drainage basins could play an important role in the dispersal of leeches and, therefore, influence gene flow across the entirety of their range.

Through this study, we continue to clarify the phylogenetic position of the various groups within Erpobdelliformes, finding some support for a sister relationship between Erpobdellidae and Gastrostomobdellidae. This suggests a convergent acquisition or loss of the gastroporal duct within Erpobdelliformes. Our phylogeny is also broadly consistent with an evolutionary history driven by the Tethyan and Atlantic vicariance events. We also find a new species of leech that closely resembles E. punctata, but the nature of the species description necessitates further work to determine which group represents E. punctata sensu stricto. This work is ongoing, and the results will be presented in a forthcoming publication. Finally, through increased sampling across the range of various Canadian species of Erpobdella (namely, E. obscura and E. cf. punctata) we provide further support for the hypothesis that drainage basin boundaries play an important role in structuring the genetic variation of leeches across large-scale landscapes (i.e., thousands of kilometres).
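
The comparison of genetic distances within and between drainage basins discussed above can be illustrated with a minimal Python sketch. It is not the analysis pipeline used in this study: the distance measure (uncorrected p-distance), the function names `p_distance` and `mean_distances`, and the toy sequences and specimen labels are all assumptions introduced purely for illustration.

```python
from itertools import combinations
from statistics import mean


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    valid = set("ACGT")
    compared = differing = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in valid and y in valid:
            compared += 1
            differing += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_distances(seqs: dict, basin_of: dict) -> tuple:
    """Return (mean within-basin distance, mean between-basin distance)."""
    within, between = [], []
    for s, t in combinations(sorted(seqs), 2):
        d = p_distance(seqs[s], seqs[t])
        (within if basin_of[s] == basin_of[t] else between).append(d)
    return mean(within), mean(between)


# Hypothetical aligned fragments and basin assignments (not real data).
seqs = {
    "A1": "ACGTACGTAC",
    "A2": "ACGTACGTAT",
    "B1": "ACGAACGTTC",
    "B2": "ACGAACGTTC",
}
basin_of = {"A1": "St. Lawrence", "A2": "St. Lawrence",
            "B1": "Nelson", "B2": "Nelson"}

within, between = mean_distances(seqs, basin_of)
print(f"mean within-basin distance:  {within:.3f}")
print(f"mean between-basin distance: {between:.3f}")
```

A mean between-basin distance that clearly exceeds the mean within-basin distance would be in line with reduced gene flow across the basin boundary. A real analysis would additionally need full alignments, adequate sampling in every basin, and an appropriate substitution model.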

Acknowledgements

This project was supported, in part, through NSERC CRSNG to KA, a NSERC Discovery Grant and a ROM Peer Review Grant to SK. The guidance, support, and collection effort from all members of the Kvist Lab are greatly appreciated, in particular the mentorship and extensive comments provided by Danielle de Carle. Thanks to Kristen Choffe and Oliver Haddrath for their help in the laboratory and their ongoing management of the ROM’s molecular systematics labs, and to Don Stacey and Maureen Zubowski for their tireless work maintaining the collection of invertebrate zoology. Jean-Marc Gagnon (CMN) kindly expedited the loan of several specimens, which greatly benefitted the final manuscript. Alejandro Oceguera-Figueroa (UNAM) provided materials that helped in the dissection of specimens. Additional thanks to two anonymous reviewers and Santiago Claramunt, Madeline Prater, Miguel Felismino, and Tsz Hung, who provided valuable comments on the manuscript.

References

Borda, E., Siddall, M.E., 2004. Review of the evolution of life history strategies and phylogeny of the Hirudinida (Annelida: Oligochaeta). Lauterbornia 52, 5–25.
Clady, M.D., 1974. Food Habits of Yellow Perch, Smallmouth Bass and Largemouth Bass in Two Unproductive Lakes in Northern Michigan. Am. Midl. Nat. 91, 453–459. doi:10.1002
Clement, M., Snell, Q., Walker, P., Posada, D., Crandall, K., 2002. TCS: estimating gene genealogies. Parallel Distrib. Process. Symp. 1–7.
Davies, R.W., Singhal, R.N., Blinn, D.W., 1985. Erpobdella montezuma (Hirudinoidea: Erpobdellidae), a new species of freshwater leech from North America. Can. J. Zool. 63, 965–969.
de Carle, D., Oceguera-Figueroa, A., Tessler, M., Siddall, M.E., Kvist, S., 2017. Phylogenetic analysis of Placobdella (Hirudinea: Rhynchobdellida: Glossiphoniidae) with consideration of COI variation. Mol. Phylogenet. Evol. 114, 234–248. doi:10.1016/j.ympev.2017.06.017
De Wit, P., Erséus, C., 2010. Genetic variation and phylogeny of Scandinavian species of Grania (Annelida: Clitellata: Enchytraeidae), with the discovery of a cryptic species. J. Zool. Syst. Evol. Res. 48, 285–293. doi:10.1111/j.1439-0469.2010.00571.x
Folmer, O., Black, M., Hoeh, W., Lutz, R., Vrijenhoek, R., 1994. DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Mol. Mar. Biol. Biotechnol. 3, 294–299.
Friese, K., Frömmichen, R., Witter, B., Müller, H., 2003. Determination of trace metals in the freshwater leech Erpobdella octoculata of the Elbe river - Evaluation of the analytical protocol. Acta Hydrochim. Hydrobiol. 31, 346–355. doi:10.1002/aheh.200300506
Glez-Peña, D., Gómez-Blanco, D., Reboiro-Jato, M., Fdez-Riverola, F., Posada, D., 2010. ALTER: Program-oriented conversion of DNA and protein alignments. Nucleic Acids Res. 38, 14–18. doi:10.1093/nar/gkq321
Goloboff, P.A., Farris, S., Nixon, K., 2008. TNT, a free program for phylogenetic analysis. Cladistics 24, 774–786. doi:10.1111/j.1096-0031.2008.00217.x
Govedich, F.R., Blinn, D.W., Hevly, R.H., Keim, P.S., 1999. Cryptic radiation in erpobdellid leeches in xeric landscapes: A molecular analysis of population differentiation. Can. J. Zool. 77, 52–57. doi:10.1139/z98-178
Gustafsson, D.R., Price, D.A., Erséus, C., 2009. Genetic variation in the popular lab worm Lumbriculus variegatus (Annelida: Clitellata: Lumbriculidae) reveals cryptic speciation. Mol. Phylogenet. Evol. 51, 182–189. doi:10.1016/j.ympev.2008.12.016
Hovingh, P., 2004. Erpobdella (Dina) parva complex (Annelida: Hirudinea: Arhynchobdellida: Erpobdellidae): additional description of Erpobdella parva, E. dubia, and E. lahontana and taxonomic revision. Hydrobiologia 517, 89–105.
Huelsenbeck, J.P., Ronquist, F., Nielsen, R., Bollback, J.P., 2001. Bayesian inference of phylogeny and its impacts on evolutionary biology. Science 294, 2310–2314.
Katoh, K., Rozewicki, J., Yamada, K.D., 2017. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Brief. Bioinform. 1–7. doi:10.1093/bib/bbx108

Kearse, M., Moir, R., Wilson, A., Stones-Havas, S., Cheung, M., Sturrock, S., Buxton, S., Cooper, A., Markowitz, S., Duran, C., Thierer, T., Ashton, B., Meintjes, P., Drummond, A., 2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649.
Klemm, D.J., 1982. Leeches (Annelida: Hirudinea) of North America. Cincinnati, OH.
Kumar, S., Stecher, G., Tamura, K., 2016. MEGA7: Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets. Mol. Biol. Evol. 33, 1870–1874. doi:10.1093/molbev/msw054
Lanfear, R., Calcott, B., Ho, S.Y.W., Guindon, S., 2012. PartitionFinder: Combined selection of partitioning schemes and substitution models for phylogenetic analyses. Mol. Biol. Evol. 29, 1695–1701. doi:10.1093/molbev/mss020
Langer, S.V., Vezsenyi, K.A., de Carle, D., Beresford, D.V., Kvist, S., 2018. Leeches (Annelida: Hirudinea) from the far north of Ontario: distribution, diversity, and diagnostics. Can. J. Zool. 96, 141–152. doi:10.1139/cjz-2017-0078
Leidy, J., 1870. Description of Nephelopsis punctata. Proc. Acad. Nat. Sci. Philadelphia 22, 89–91.
Mack, J., de Carle, D., Kvist, S., 2019. Prey, populations, and the Pleistocene: evidence for low COI variation in a widespread North American leech. Mitochondrial DNA Part A. doi:10.1080/24701394.2019.1634698
Mack, J., Kvist, S., 2019. Improved geographic sampling provides further evidence for the separation of Glossiphonia complanata and Glossiphonia elegans (Annelida: Clitellata: Glossiphoniidae). J. Nat. Hist. 53, 335–350. doi:10.1080/00222933.2019.1590658
Macova, S., Harustiakova, D., Kolarova, J., Machova, J., Zlabek, V., Vykusova, B., Randak, T., Velisek, J., Poleszczuk, G., Hajslova, J., Pulkrabova, J., Svobodova, Z., 2009. Leeches as sensor-bioindicators of river contamination by PCBs. Sensors 9, 1807–1820. doi:10.3390/s90301807
Maddison, W.P., Maddison, D.R., 2018. Mesquite: a modular system for evolutionary analysis. Version 3.51 http://www.mesquiteproject.org.
Metcalfe, J.L., Fox, M.E., Carey, J.H., 1984. Aquatic Leeches as Bioindicators of Organic Chemical Contaminants in Freshwater Ecosystems. Chemosphere 13, 143–150.
Miller, M.A., Pfeiffer, W., Schwartz, T., 2012. The CIPRES science gateway. Proc. 1st Conf. Extrem. Sci. Eng. Discov. Environ. Bridg. from Extrem. to campus beyond - XSEDE ’12. doi:10.1145/2335755.2335836
Moore, J.P., 1953. Three Undescribed North American Leeches (Hirudinea). Acad. Nat. Sci. Philadelphia 250, 9–13.
Moore, J.P., 1912. The leeches of Minnesota. Minneap. Geol. Nat. Hist. Surv. 5, 125–127.
Moore, J.P., 1901. The Hirudinea of Illinois. Bull. Illinois State Lab. Nat. Hist. V, 479–517.
Moore, J.P., Meyer, M.C., 1951. Leeches (Hirudinea) from Alaskan and adjacent waters. Wasmann J. Biol. 9, 11–77.
Nakano, T., Ramlah, Z., Hikida, T., 2012. Phylogenetic position of gastrostomobdellid leeches (Hirudinida, Arhynchobdellida, Erpobdelliformes) and a new family for the genus Orobdella. Zool. Scr. 41, 177–185. doi:10.1111/j.1463-6409.2011.00506.x
Oceguera-Figueroa, A., León-Règagnon, V., Siddall, M.E., 2005. Phylogeny and revision of Erpobdelliformes (Annelida, Arhynchobdellida) from Mexico based on nuclear …. Rev. Mex. Biodivers. 76, 191–198.
Oceguera-Figueroa, A., Phillips, A.J., Pacheco-Chaves, B., Reeves, W.K., Siddall, M.E., 2011. Phylogeny of macrophagous leeches (Hirudinea, Clitellata) based on molecular data and evaluation of the barcoding locus. Zool. Scr. 40, 194–203. doi:10.1111/j.1463-6409.2010.00465.x
Rambaut, A., 2018. FigTree v1.4.3.
Rambaut, A., Drummond, A.J., Xie, D., Baele, G., Suchard, M.A., 2018. Posterior Summarization in Bayesian Phylogenetics Using Tracer 1.7. Syst. Biol. 67, 901–904. doi:10.1093/sysbio/syy032
Ronquist, F., Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574. doi:10.1093/bioinformatics/btg180
Sawyer, R.T., 1986. Leech Biology and Behaviour. Clarendon Press, Oxford, UK.
Sawyer, R.T., Shelley, R.M., 1976. New records and species of leeches (Annelida: Hirudinea) from North and South Carolina. J. Nat. Hist. 10, 65–97. doi:10.1080/00222937600770061
Siddall, M.E., 2002. Phylogeny of the leech family Erpobdellidae (Hirudinida: Oligochaeta). Invertebr. Syst. 16, 1–6. doi:10.1071/IT01011
Stamatakis, A., 2014. RAxML version 8: A tool for phylogenetic analysis and post-analysis of large phylogenies. Bioinformatics 30, 1312–1313. doi:10.1093/bioinformatics/btu033
Tessler, M., de Carle, D., Voiklis, M.L., Gresham, O.A., Neumann, J., Cios, S., Siddall, M.E., 2018a. Worms that suck: phylogenetic analysis of Hirudinea solidifies the position of Acanthobdellida and necessitates the dissolution of Rhynchobdellida. Mol. Phylogenet. Evol. 127, 129–134. doi:10.1016/j.ympev.2018.05.001
Tessler, M., Siddall, M.E., Oceguera-Figueroa, A., 2018b. Leeches from Chiapas, Mexico, with a New Species of Erpobdella (Hirudinida: Erpobdellidae). Am. Museum Novit. 3895, 1–15. doi:10.1206/3895.1

Toman, M.J., Dall, P.C., 1997. The diet of Erpobdella octoculata (Hirudinea: Erpobdellidae) in two Danish lowland streams. Arch. Hydrobiol. 140, 549–563. doi:10.1127/archivhydrobiol/140/1997/549
Trajanovski, S., Albrecht, C., Schreiber, K., Schultheiß, R., Stadler, T., Benke, M., Wilke, T., 2010. Testing the spatial and temporal framework of speciation in an ancient lake species flock: The leech genus Dina (Hirudinea: Erpobdellidae) in Lake Ohrid. Biogeosciences 7, 3387–3402. doi:10.5194/bg-7-3387-2010
Trontelj, P., Utevsky, S.Y., 2012. Phylogeny and phylogeography of medicinal leeches (genus Hirudo): Fast dispersal and shallow genetic structure. Mol. Phylogenet. Evol. 63, 475–485. doi:10.1016/j.ympev.2012.01.022
Verrill, A.E., 1872. Art. XIX.--Brief Contributions to Zoology from the Museum of Yale College. No. XVII.--Descriptions of North American fresh-water Leeches. Am. J. Sci. Arts 3, 126–140.
Young, J.O., 1981. A comparative study of the food niches of lake-dwelling triclads and leeches. Hydrobiologia 84, 91–102. doi:10.1007/BF00026167

Figure legends

Fig. 1 Map of sampling sites for all newly collected specimens in the present study; black circles denote the position of a sampling site. The legend in the bottom left corner is a key to the coloured areas on the map, which represent major river drainage basin areas in North America.

Fig. 2 Maximum likelihood tree based on data from the mitochondrial COI locus (log likelihood = -12469.649376). Bootstrap values are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change. Pie charts positioned to the right of each clade depict the proportion of leeches sampled from a given province or state.

Fig. 3 A condensed representation of the COI tree presented in Fig. 2. Where possible, clades composed of a single species have been collapsed to a single node. Dashed lines indicate a clade composed primarily of sequences from newly collected specimens. Black circles atop nodes indicate a bootstrap support value of ≥95%. Erpobdella obscura has been collapsed into a single clade because Erpobdella obscura 1 is a paraphyletic grouping.

Fig. 4 Maximum likelihood tree based on the concatenated dataset (log likelihood = -32419.261141). Bootstrap values are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change. GenBank accession numbers for either COI or 12S were used as identifiers in lieu of a voucher number.

Fig. 5 Photos showing the internal and external morphology for E. cf. punctata 1 (ROMIZI 12434: a-d) and representatives of E. cf. punctata 2-4 (ROMIZI 11320: e, f; ROMIZI 11248: g, h). Lettering, from a-d and e-h, indicates a complete dorsal photo, ventral photo, a photo of the gonopores, and a photo of the anterior male reproductive system. The scale bar represents a length of 1 cm. MG = male gonopore, FG = female gonopore, C = atrial cornua, E = ejaculatory ducts, XI = ganglion XI, XII = ganglion XII.

Fig. 6 Photos showing the internal and external morphology for E. microstoma (ROMIZI 11407: a-d) and E. bucera (ROMIZI 12502: e-h). Lettering, from a-d and e-h, indicates a complete dorsal photo, ventral photo, a photo of the gonopores, and a photo of the anterior male reproductive system. The scale bar represents a length of 1 cm. MG = male gonopore, FG = female gonopore, C = atrial cornua, E = ejaculatory ducts, XI = ganglion XI, XII = ganglion XII.

Fig. 7 Photos showing the internal and external morphology for E. obscura (ROMIZI 12424: a-d) and E. parva (ROMIZI 12494: e-h). Lettering, from a-d and e-h, indicates a complete dorsal photo, ventral photo, a photo of the gonopores, and a photo of the anterior male reproductive system. The scale bar represents a length of 1 cm. MG = male gonopore, FG = female gonopore, C = atrial cornua, E = ejaculatory ducts, XI = ganglion XI, XII = ganglion XII.

Supplementary figure legends

Fig. S1 Bayesian tree based on data from the mitochondrial COI locus. Posterior probabilities for each clade are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change. Pie charts positioned to the right of each clade depict the proportion of leeches sampled from a given province or state.

Fig. S2 Strict consensus of ten most parsimonious trees based on the mitochondrial locus COI (length = 2592; CI = 0.198; RI = 0.904). Bootstrap values are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change. Pie charts positioned to the right of each clade depict the proportion of leeches sampled from a given province or state.

Fig. S3 Strict consensus of ten most parsimonious trees based on the mitochondrial locus 12S (length = 1410; CI = 0.350; RI = 0.767). Bootstrap values are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change.

Fig. S4 Maximum likelihood tree based on data from the mitochondrial 12S locus (log likelihood = -6654.113569). Bootstrap values are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change.

Fig. S5 Strict consensus of ten most parsimonious trees based on the nuclear locus 18S (length = 978; CI = 0.622; RI = 0.897). Bootstrap values are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change.

Fig. S6 Maximum likelihood tree based on data from the nuclear locus 18S (log likelihood = -8219.349328). Bootstrap values are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change.

Fig. S7 Strict consensus of ten most parsimonious trees based on the nuclear locus 28S (length = 411; CI = 0.635; RI = 0.798). Bootstrap values are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change.

Fig. S8 Maximum likelihood tree based on data from the nuclear locus 28S (log likelihood = -2446.965128). Bootstrap values are depicted above and to the left of their respective node. Branch lengths are drawn proportional to the amount of change.

Fig. S9 Bayesian tree based on a combination of the loci COI, 12S, 18S, and 28S. Posterior probabilities are depicted above and to the left of their respective node. GenBank accession numbers for either COI or 12S were used as identifiers in lieu of a voucher number. Branch lengths are drawn proportional to the amount of change.

Fig. S10 Strict consensus of ten most parsimonious trees based on a combination of the loci COI, 12S, 18S, and 28S (length = 5910; CI = 0.333; RI = 0.727). Bootstrap values are depicted above and to the left of their respective node. GenBank accession numbers for either COI or 12S were used as identifiers in lieu of a voucher number. Branch lengths are drawn proportional to the amount of change.

Fig. S11 A haplotype network reconstruction produced using the TCS method. Each coloured node represents a haplotype sampled for E. cf. punctata, with colour indicating the province(s) and/or state(s) it was sampled from and size depicting the number of individuals with that haplotype. Black circles of size 1 indicate inferred ancestral nodes that were not sampled. Hash marks on strings denote 1 mutational step. The legend in the bottom right depicts the size of a node with 1 individual and 10 individuals; coloured nodes below indicate the province/state associated with each colour.

[Fig. 1 map legend. Major River Drainage Basins: Mackenzie River, St. Lawrence River, Mississippi River System, Hudson Bay Seaboard, Nelson River, Fraser River. Map label: Hudson Bay.]

[Phylogenetic tree figure: tip labels (GenBank accession or ROMIZI voucher numbers with species names) and node support values omitted. Clade labels in this portion: E. microstoma, E. bucera, E. cf. punctata 1, E. cf. punctata 2, E. cf. punctata 3, E. cf. punctata 4.]

75

ROMIZI 12426 Erpobdella parva ROMIZI 12428 Erpobdella parva ROMIZI 12427 Erpobdella parva

48

ROMIZI 12493 Erpobdella parva ROMIZI 10473 Erpobdella parva 89 86

ROMIZI 13664 Erpobdella parva 94

ROMIZI 13679 Erpobdella parva ROMIZI 13685 Erpobdella parva

ROMIZI 11331 Erpobdella parva 45

ROMIZI 11330 Erpobdella parva ROMIZI 11332 Erpobdella parva ROMIZI 11481 Erpobdella parva

39

ROMIZI 11290 Erpobdella parva

E. parva

ROMIZI 11377 Erpobdella parva ROMIZI 11378 Erpobdella parva 14 23 67 24

ROMIZI 11291 Erpobdella parva ROMIZI 11437 Erpobdella parva ROMIZI 11436 Erpobdella parva ROMIZI 11480 Erpobdella parva

49

ROMIZI 11580 Erpobdella parva ROMIZI 11564 Erpobdella parva ROMIZI 11310 Erpobdella parva ROMIZI 11579 Erpobdella parva ROMIZI 11565 Erpobdella parva ROMIZI 11304 Erpobdella parva ROMIZI 11396 Erpobdella parva JQ821638 Erpobdella obscura ROMIZI 11490 Erpobdella obscura

3

ROMIZI 10462 Erpobdella obscura ROMIZI 11347 Erpobdella obscura ROMIZI 11289 Erpobdella obscura ROMIZI 11359 Erpobdella obscura ROMIZI 11322 Erpobdella obscura

97

5

ROMIZI 11348 Erpobdella obscura ROMIZI 11321 Erpobdella obscura ROMIZI 11293 Erpobdella obscura

61

ROMIZI 11295 Erpobdella obscura ROMIZI 11288 Erpobdella obscura ROMIZI 11389 Erpobdella obscura

48

74 14

ROMIZI 11309 Erpobdella obscura ROMIZI 11434 Erpobdella obscura ROMIZI 11435 Erpobdella obscura ROMIZI 11433 Erpobdella obscura ROMIZI 11269 Erpobdella obscura

7 54 63 28

ROMIZI 11219 Erpobdella obscura ROMIZI 11220 Erpobdella obscura ROMIZI 11225 Erpobdella obscura ROMIZI 11218 Erpobdella obscura ROMIZI 11556 Erpobdella obscura ROMIZI 11328 Erpobdella obscura

4

ROMIZI 11360 Erpobdella obscura ROMIZI 11374 Erpobdella obscura

34

ROMIZI 11581 Erpobdella obscura ROMIZI 11578 Erpobdella obscura

20

ROMIZI 11454 Erpobdella obscura ROMIZI 11329 Erpobdella obscura

21

ROMIZI 10472 Erpobdella obscura ROMIZI 11373 Erpobdella obscura ROMIZI 11372 Erpobdella obscura ROMIZI 11403 Erpobdella obscura

45 25

ROMIZI 10470 Erpobdella obscura ROMIZI 10466 Erpobdella obscura ROMIZI 10464 Erpobdella obscura ROMIZI 13659 Erpobdella obscura KM611847 Erpobdella obscura ROMIZI 13682 Erpobdella obscura KM612095 Erpobdella obscura KM611933 Erpobdella obscura ROMIZI 11346 Erpobdella obscura ROMIZI 11491 Erpobdella obscura ROMIZI 11241 Erpobdella obscura 87

ROMIZI 11577 Erpobdella obscura ROMIZI 10471 Erpobdella obscura

85

ROMIZI 10469 Erpobdella obscura

52

E. obscura 1

ROMIZI 11527 Erpobdella obscura 35

ROMIZI 11557 Erpobdella obscura ROMIZI 11402 Erpobdella obscura

5

KM612244 Erpobdella obscura AF003273 Erpobdella obscura 91

ROMIZI 11511 Erpobdella obscura

ROMIZI 11510 Erpobdella obscura 70

ROMIZI 11507 Erpobdella obscura ROMIZI 11508 Erpobdella obscura ROMIZI 11388 Erpobdella obscura

37

ROMIZI 11526 Erpobdella obscura ROMIZI 11525 Erpobdella obscura ROMIZI 10110 Erpobdella obscura 12

ROMIZI 13733 Erpobdella obscura ROMIZI 10126 Erpobdella obscura

27

ROMIZI 13731 Erpobdella obscura

21

46

ROMIZI 11523 Erpobdella obscura

83

ROMIZI 13678 Erpobdella obscura ROMIZI 13669 Erpobdella obscura

27

ROMIZI 13734 Erpobdella obscura 19

ROMIZI 13729 Erpobdella obscura ROMIZI 13735 Erpobdella obscura ROMIZI 11509 Erpobdella obscura ROMIZI 13667 Erpobdella obscura ROMIZI 13723 Erpobdella obscura ROMIZI 13726 Erpobdella obscura ROMIZI 13725 Erpobdella obscura KM612094 Erpobdella obscura ROMIZI 11524 Erpobdella obscura

56

KM612133 Erpobdella obscura ROMIZI 13658 Erpobdella obscura 61

ROMIZI 13732 Erpobdella obscura ROMIZI 13715 Erpobdella obscura

28

ROMIZI 13714 Erpobdella obscura

85

ROMIZI 13736 Erpobdella obscura ROMIZI 13730 Erpobdella obscura ROMIZI 13724 Erpobdella obscura ROMIZI 13737 Erpobdella obscura ROMIZI 13740 Erpobdella obscura

89

ROMIZI 13739 Erpobdella obscura ROMIZI 13738 Erpobdella obscura ROMIZI 13680 Erpobdella obscura ROMIZI 13727 Erpobdella obscura ROMIZI 13728 Erpobdella obscura ROMIZI 13684 Erpobdella obscura ROMIZI 12497 Erpobdella obscura ROMIZI 12442 Erpobdella obscura 90

ROMIZI 12492 Erpobdella obscura

68

ROMIZI 12439 Erpobdella obscura

ROMIZI 12441 Erpobdella obscura

0.3 substitutions per base pair

ROMIZI 12490 Erpobdella obscura ROMIZI 12440 Erpobdella obscura ROMIZI 12491 Erpobdella obscura ROMIZI 10293 Erpobdella obscura 34 30

ROMIZI 10290 Erpobdella obscura ROMIZI 10300 Erpobdella obscura ROMIZI 10292 Erpobdella obscura

88

4

ROMIZI 10301 Erpobdella obscura ROMIZI 10299 Erpobdella obscura

86

ROMIZI 12400 Erpobdella obscura

27

ROMIZI 12444 Erpobdella obscura

ROMIZI 12401 Erpobdella obscura

ROMIZI 12395 Erpobdella obscura ROMIZI 12406 Erpobdella obscura ROMIZI 12392 Erpobdella obscura ROMIZI 12396 Erpobdella obscura ROMIZI 12402 Erpobdella obscura ROMIZI 12412 Erpobdella obscura 23 66 23

ROMIZI 12393 Erpobdella obscura ROMIZI 12405 Erpobdella obscura ROMIZI 12416 Erpobdella obscura ROMIZI 12484 Erpobdella obscura ROMIZI 12408 Erpobdella obscura

62

ROMIZI 12409 Erpobdella obscura ROMIZI 12486 Erpobdella obscura ROMIZI 12414 Erpobdella obscura ROMIZI 12417 Erpobdella obscura ROMIZI 12407 Erpobdella obscura ROMIZI 12398 Erpobdella obscura ROMIZI 12399 Erpobdella obscura ROMIZI 12413 Erpobdella obscura ROMIZI 12488 Erpobdella obscura

34

ROMIZI 12482 Erpobdella obscura ROMIZI 12487 Erpobdella obscura ROMIZI 12411 Erpobdella obscura ROMIZI 12403 Erpobdella obscura ROMIZI 12485 Erpobdella obscura ROMIZI 12394 Erpobdella obscura ROMIZI 12410 Erpobdella obscura ROMIZI 12418 Erpobdella obscura ROMIZI 12404 Erpobdella obscura ROMIZI 12397 Erpobdella obscura ROMIZI 12415 Erpobdella obscura

E. o obscura 2

[Figure: condensed phylogenetic tree with species-level clades collapsed and nodal support values shown. Labeled clades include Erpobdella cf. punctata 1, Erpobdella cf. punctata 2, Erpobdella cf. punctata 3, Erpobdella cf. punctata 4, Erpobdella parva, and Erpobdella obscura. Scale bar: 0.3 substitutions per site.]

[Figure: phylogenetic tree of Erpobdelliformes with nodal support values. Labeled families: Salifidae, Orobdellidae, Gastrostomobdellidae, Erpobdellidae. Labeled clades within Erpobdellidae: Erpobdella parva, Erpobdella obscura 1, Erpobdella obscura 2, Erpobdella bucera, Erpobdella microstoma, Erpobdella cf. punctata 1, Erpobdella cf. punctata 2-4. Scale bar: 0.6 substitutions per site.]

[Figure: illustration panels; the panel labels A to H appear three times, with annotations XI, XII, MG, FG, C, and E. The images and their captions are not recoverable from this extraction.]


· A comprehensive phylogeny of the leech family Erpobdellidae is presented
· Evidence of genetic structure is presented for E. cf. punctata and E. obscura
· E. cf. punctata is revealed to be a cryptic species complex
· New distributional records are provided for E. bucera and E. microstoma


Table 1 From left to right: the gene name (grouped by mitochondrial or nuclear origin), the names of the primers (forward and reverse) used in this study, the 5’ to 3’ nucleotide sequence of each primer, and the publication that described the primer.

GENE            PRIMER NAME    PRIMER SEQUENCE (5’ → 3’)       CITATION

MITOCHONDRIAL
COI             LCO1490        GGTCAACAAATCATAAAGATATTGG       Folmer et al. (1994)
                HCO2198        TAAACTTCAGGGTGACCAAAAAATCA      Folmer et al. (1994)
12S             12S a          AACIIGGATTAGATACCC              Simon et al. (1994)
                12S b          GAGAGTGACGGGCGATGTGT            Simon et al. (1994)

NUCLEAR
18S RDNA        18S A          AACCTGGTTGATCCTGCCAGT           Modified from Apakupakul et al. (1999)
                18S L          CCAACTACGAGCTTTT                Modified from Apakupakul et al. (1999)
                18S C          CGGTAATTCCAGCTC                 Modified from Apakupakul et al. (1999)
                18S Y          CAGACAAATCGCTCC                 Modified from Apakupakul et al. (1999)
                18S B          TGATCCTTCCGCAGGTTCACCT          Modified from Apakupakul et al. (1999)
                18S O          AAGGGCACCACCAG                  Modified from Apakupakul et al. (1999)
28S RDNA        28S A          GACCCGTCTTGAAGCACG              Whiting (2002)
                28S B out      CCCACAGCGCCAGTTCTGCTTACC        Prendini et al. (2005)
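As a minimal sketch of how the primer sequences in Table 1 could be sanity-checked before ordering or reuse, the Python snippet below reports the length and GC fraction of the two COI primers and provides a reverse-complement helper. The function names (`gc_fraction`, `reverse_complement`, `summarize`) and the `PRIMERS` dictionary are hypothetical illustrations and are not part of the study's workflow; only the two sequences are copied from the table.

```python
# Illustrative only: the helper names below are assumptions, not the authors' code.
# The two COI primer sequences are copied from Table 1.
PRIMERS = {
    "LCO1490": "GGTCAACAAATCATAAAGATATTGG",
    "HCO2198": "TAAACTTCAGGGTGACCAAAAAATCA",
}

# Translation table mapping each unambiguous base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq: str) -> float:
    """Return the fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a sequence made only of A, C, G, T."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        # Symbols such as the "I" in primer 12S a are deliberately not handled here.
        raise ValueError("sequence contains symbols other than A, C, G, T")
    return seq.translate(_COMPLEMENT)[::-1]


def summarize(primers: dict) -> dict:
    """Map each primer name to (length in nucleotides, GC fraction rounded to 2 places)."""
    return {
        name: (len(seq), round(gc_fraction(seq), 2))
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc) in summarize(PRIMERS).items():
        print(f"{name}: {length} nt, GC fraction {gc}")
```

The helper rejects any symbol outside A, C, G, and T rather than guessing its complement, so a primer containing other symbols (such as 12S a as printed) would need separate handling.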
