DNA Barcoding in Amoebozoa and Challenges: The Example of Cochliopodium


Accepted Manuscript

Title: DNA Barcoding in Amoebozoa and Challenges: The Example of Cochliopodium
Author: Yonas I. Tekle
PII: S1434-4610(14)00049-2
DOI: http://dx.doi.org/doi:10.1016/j.protis.2014.05.002
Reference: PROTIS 25437
To appear in: Protist
Received date: 20-3-2014
Revised date: 15-5-2014
Accepted date: 17-5-2014

Please cite this article as: Tekle, Y.I., DNA Barcoding in Amoebozoa and Challenges: The Example of Cochliopodium, Protist (2014), http://dx.doi.org/10.1016/j.protis.2014.05.002

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


ORIGINAL PAPER


Yonas I. Tekle1

Spelman College, 350 Spelman Lane Southwest, Atlanta, GA 30314, USA

Submitted March 20, 2014; Accepted May 17, 2014

Monitoring Editor: C. Graham Clark

Running title: DNA Barcoding in Cochliopodium

1 e-mail [email protected]

The diversity of microbial eukaryotes in general, and of amoeboid lineages in particular, is poorly documented. Even though amoeboid lineages are among the most abundant microbes, taxonomic progress in the group has been hindered by the limitations of traditional taxonomy and the technical difficulty of studying them. Studies using molecular approaches such as DNA barcoding with the cytochrome oxidase I (COI) gene are slowly trickling in for Amoebozoa, and they will hopefully aid in unveiling the true diversity of the group. In this study a retrospective approach is used to test the utility of the COI gene in a scale-bearing amoeba, Cochliopodium, which is morphologically well defined. A total of 126 COI sequences and 62 unique haplotypes were generated from 9 Cochliopodium species. Extensive analyses exploring the effects of sequence evolution models and sequence length on genetic diversity computations were conducted. The findings show that COI is a promising marker for Cochliopodium, except in one case where it failed to delineate two morphologically well-defined cochliopodiums. Two species delimitation approaches also recognize 8 genetic lineages out of the 9 species examined. The taxonomic implications of these findings and factors that may confound COI as a barcode marker in Cochliopodium and other amoebae are discussed.

Key words: Cochliopodium; amoeba; COI; DNA barcode; morphology; species identification; species delimitation.


Introduction

Microbial eukaryotes account for most of the lineage diversity in the eukaryotic tree of life (Adl et al. 2012; Tekle et al. 2009). Nevertheless, the true diversity of microbial eukaryotes has been obscured by technical challenges in studying them, including small size, culturing difficulties and a paucity of diagnostic phenotypic variation. Consequently, for over two centuries traditional morphology-based taxonomy has failed to capture the full diversity of microbial eukaryotes (Haeckel 1866; Schmarda 1871), although several higher-level taxonomic lineages remain valid and morphology remains at the core of modern classification schemes (Adl et al. 2012; Patterson 1999). Molecular techniques have greatly enhanced our ability to analyze diverse groups of microbes by overcoming some of the traditional taxonomic problems (Parfrey et al. 2010; Tekle et al. 2009; Yoon et al. 2008). Studies based on molecular data are uncovering hidden diversity of microbes and helping to elucidate evolutionary relationships among them (Tekle et al. 2007, 2008; Yoon et al. 2008). Recent molecular approaches, particularly those related to species discovery, include metagenomics (Stoeck et al. 2007; Yi et al. 2010), next generation sequencing (Dunthorn et al. 2014; Egge et al. 2013) and DNA barcoding (Hebert et al. 2003a). The latter approach has gained some popularity in the last decade owing to its ease, relatively low cost, and successful application in some animals (Hebert et al. 2003a, b). DNA barcoding employs a short fragment of a gene sequence (e.g. mitochondrion-encoded Cytochrome c Oxidase Subunit 1, COI) for specimen identification and discovery. Such an approach is hoped to facilitate species identification with greater efficiency, speed and objectivity (Schindel and Miller 2005). The application of DNA barcoding in microbial eukaryotes, including some members of Amoebozoa, is steadily growing (Barth et al. 2006; Heger et al. 2011, 2013; Kosakyan et al. 2012, 2013; Lara 2011; Lin et al. 2009; Nassonova et al. 2010; Stern et al. 2010).

The supergroup Amoebozoa encompasses the majority of naked and several testate unicellular amoeboid lineages with dynamic pseudopodia, as well as some amoeboflagellates (Cavalier-Smith et al. 2004; Lahr et al. 2011a; Nikolaev et al. 2005, 2006; Tekle et al. 2008). Even though the Amoebozoa are among the most abundant microbes, their diversity is poorly documented because few investigations have examined morphological variability and genetic data sampling is limited (Smirnov 2008; Smirnov et al. 2011; Tekle et al. 2008). Most molecular genetic studies in Amoebozoa have focused on establishing relationships among well-characterized lineages representing higher-level taxonomic groups (Lahr et al. 2011a; Smirnov et al. 2005; Tekle et al. 2008), while few studies have focused on the genus level (Kudryavtsev et al. 2011; Lara et al. 2008; Smirnov et al. 2007). These efforts have resulted in an improved understanding of some lineages within the supergroup (Smirnov et al. 2005). However, relationships within Amoebozoa still remain unresolved, mainly because of limited genetic sampling and the high genetic heterogeneity observed in some amoebozoan genomes, which confounds phylogenetic reconstruction methods (Tekle et al. 2008). Recent molecular studies with a focus on species discovery are emerging in some amoebozoans, including testate (Heger et al. 2011, 2013; Kosakyan et al. 2012, 2013) and naked amoebae (Nassonova et al. 2010). These studies use COI alone, or together with nuclear genes, to explore species diversity at lower taxonomic levels and among morphologically indistinguishable morphotypes. The studies revealed the existence of extensive cryptic species, indicating that previous studies, based on highly conserved molecular markers (SSU-rDNA) and morphology, underestimated species diversity in these amoeboid taxa. Among the molecular markers used, COI has been found to consistently provide better resolution than nuclear genes, which are either too conserved or problematic because of high intraspecific divergence that overlaps with interspecific variation (Nassonova et al. 2010). In this study the utility of COI as a DNA barcode marker is explored in a scale-bearing amoeba genus, Cochliopodium.

The genus Cochliopodium includes discoid or globose amoebae enclosed in a flexible dorsal cuticle or tectum composed of intricate scales. There are currently around 22 described species from both freshwater and marine habitats (Anderson and Tekle 2013; Kudryavtsev et al. 2005, 2011; Tekle et al. 2013). The phylogenetic position of Cochliopodium within the Amoebozoa is not resolved, but the genus forms a strongly supported monophyletic clade within the supergroup (Kudryavtsev et al. 2005, 2011; Lahr et al. 2011a; Tekle et al. 2008). Morphology-based taxonomy of Cochliopodium is fairly consistent, with a few exceptions where some light microscopy characters such as size and shape might be homoplasious (Anderson and Tekle 2013; Tekle et al. 2013). In such instances, species can be successfully identified from the fine structure of microscales obtained using electron microscopy. The molecular phylogeny of Cochliopodium is also consistent with morphology. However, some cochliopodiums are observed to have nearly identical SSU-rDNA sequences (Anderson and Tekle 2013; Tekle et al. 2013). This precludes SSU-rDNA as a marker for species identification due to its high conservation (Nassonova et al. 2010). Some cochliopodiums have been characterized for two additional markers, actin and COI, but there are not enough data to evaluate their utility at lower taxonomic levels (Anderson and Tekle 2013; Kudryavtsev et al. 2011; Nassonova et al. 2010). Sequences of COI have been employed for species identification in cochliopodiums that were found to be indistinguishable at the SSU-rDNA level (Anderson and Tekle 2013). These findings, coupled with the robust diagnostic ultrastructural microscale morphology, make Cochliopodium an ideal taxon to test the utility of COI as a DNA barcode marker in Amoebozoa. Multiple COI sequences (126 clones) from nine cochliopodiums, obtained using different PCR conditions, were used to explore the extent of intraspecific divergence and its impact on species delineation. Extensive analysis was conducted to explore how different molecular evolution models and the total number of characters analyzed affect intra- and interspecific distance computations. Sixty-two unique haplotypes were recovered, a few of them with in-frame stop codons and deletions, features indicating pseudogenes. The findings show that COI is a promising marker for Cochliopodium, except in one case where two morphologically distinguishable cochliopodiums shared identical or similar COI sequences, with polymorphism that falls within the observed intraspecific divergence boundary. The findings of this study are compared to previous DNA barcode studies in other groups of Amoebozoa. DNA barcoding challenges and species delimitation thresholds that could be applicable to delimiting species in Amoebozoa are discussed.


Results


COI Gene Sequence Analysis

A total of 126 COI gene sequences were obtained from 9 Cochliopodium species (Table 1). The number of sequenced clones per species ranged from 6 to 38 (Table 1). A total of 62 unique haplotypes were recovered, including sequences with stop codons and in-frame deletions, possibly nuclear mitochondrial pseudogenes (known as Numts, Lopez et al. 1994). In general, the number of haplotypes increased with the number of sequences representing a particular species, but Cochliopodium spiniferum had the highest number of haplotypes relative to the number of clones sequenced (Table 1). The length of the amplified COI sequence was 668 bp after trimming flanking regions that included the primers. In some instances, shorter sequences were obtained. However, these might be a result of sequencing artifacts, since resequencing of the same PCR product yielded longer, regular-sized sequences in some isolates. The G+C content in Cochliopodium was uniform except in Cochliopodium gallicum, where it was slightly lower (26.9%) than the average (28.6%) for the rest of the group.
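
The bookkeeping behind the clone counts, haplotype counts and G+C percentages reported above can be sketched as follows. This is a minimal illustration, assuming trimmed clone sequences of equal length held as plain strings; the function names and the toy sequences are hypothetical and do not reproduce the pipeline or data of this study.

```python
from collections import Counter


def collapse_haplotypes(clones):
    """Group identical clone sequences and return {sequence: clone count}."""
    return Counter(seq.upper() for seq in clones)


def gc_content(seq):
    """Percent G+C among unambiguous bases (A, C, G, T) of one sequence."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence has no unambiguous bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


# Hypothetical toy clones (not data from this study).
clones = ["ATGATTTTAGCA", "ATGATTTTAGCA", "ATGATTTTAGCT", "atgattttagca"]
haplotypes = collapse_haplotypes(clones)
print(len(clones), "clones,", len(haplotypes), "unique haplotypes")
for seq, n_clones in haplotypes.most_common():
    print(seq, n_clones, round(gc_content(seq), 1))
```

Collapsing on exact string identity treats any single-base difference, including a sequencing error, as a new haplotype, which is one reason resequencing and pseudogene screening matter before haplotypes are counted.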

Intraspecific sequence divergence of all COI sequences generated was inferred using Kimura's two-parameter (K2P) model (Table 1). Multiple identical sequences were recovered from clones representing all species. The maximum intraspecific divergence, 0.9%, was found in Cochliopodium pentatrifurcatum (Table 1). Some of the substitutions in the haplotypes were nonsynonymous, resulting in 1-3 amino acid replacements per sequenced clone (Table 1). The maximum number of nonsynonymous substitutions (three amino acids) per clone was found in a pseudogene obtained from Cochliopodium minus CCAP 1537/1A (hereafter C. minus), while 2 nonsynonymous substitutions per clone were commonly found in all except three species: Cochliopodium actinophorum, Cochliopodium larifeili and C. gallicum. However, additional sequencing might reveal a similar pattern of nonsynonymy in these isolates. Shared nonsynonymous substitutions at the same nucleotide position were recovered more than once in two species: C. pentatrifurcatum and Cochliopodium megatetrastylus.
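
For readers unfamiliar with the K2P model, the pairwise computation can be sketched as below. This is an illustrative implementation of the standard K2P formula, assuming two pre-aligned sequences and pairwise deletion of gapped or ambiguous sites; the function names and the toy sequences are hypothetical, and the code is not the MEGA implementation used in this study.

```python
import math
from itertools import combinations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned sequences.

    Sites with a gap or ambiguous base in either sequence are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to equal length")
    sites = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        sites += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine<->pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    if transitions == 0 and transversions == 0:
        return 0.0
    p_ts = transitions / sites
    q_tv = transversions / sites
    w1 = 1 - 2 * p_ts - q_tv
    w2 = 1 - 2 * q_tv
    if w1 <= 0 or w2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(w1) - 0.25 * math.log(w2)


def max_pairwise(seqs):
    """Largest K2P distance among all pairs (0.0 for fewer than two sequences)."""
    return max((k2p_distance(a, b) for a, b in combinations(seqs, 2)), default=0.0)


# Hypothetical toy haplotypes of one species (not data from this study).
toy = ["AAAAAAAAAA", "GAAAAAAAAA", "AAAAAAAAAA"]
print(round(100 * max_pairwise(toy), 2), "% maximum intraspecific divergence")
```

Applied to all clones of a species, `max_pairwise` gives the kind of maximum intraspecific value reported in Table 1; real alignments of 668 bp would of course yield far smaller per-substitution steps than this ten-site toy.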

Interspecific sequence divergence among Cochliopodium species (including only unique haplotypes, without putative pseudogenes and short sequences) inferred using a K2P model ranged from 0% to 29.9% (Table 2). The minimum interspecific divergence in Cochliopodium species with unique scale morphology was always higher than 2.8%, except in one case (Table 2). Interspecific sequence divergence between C. pentatrifurcatum and C. minus fell within the range of intraspecific divergence (0%-0.9%) (Tables 1, 2). Although unique in their morphology, these two species shared identical sequences in 6 out of 12 and 23 out of 38 sequenced clones of C. minus and C. pentatrifurcatum, respectively. However, the unique haplotypes (Table 1) obtained from these isolates were never shared at either the DNA or the amino acid level when they were compared using the C. minus YT239 clone and a published C. pentatrifurcatum COI sequence (KC489470) as references. These two isolates were the closest in average interspecific divergence (3.1%) to C. megatetrastylus (Table 2). Average interspecific divergence outside of these three closely related taxa (C. minus, C. pentatrifurcatum and C. megatetrastylus) was always over 8%, reaching a maximum of 29.9% between C. spiniferum and C. gallicum (Table 2). Out of the nine isolates examined, C. gallicum appeared to be the most divergent taxon for the COI gene.

COI Phylogeny

COI phylogeny of the large dataset (668 bp) inferred using different models and algorithms (e.g. NJ and RAxML, data not shown) and maximum likelihood (ML) using MEGA yielded the same well-resolved topology (Fig. 1). All Cochliopodium COI molecular clones analyzed formed a monophyletic group with full bootstrap (BS) support (Fig. 1). With the exception of C. minus and C. pentatrifurcatum, which are nested within one clade (BS: 100, Fig. 1), all molecular clones representing each isolate with unique microscale morphology formed fully supported monophyletic groups (Fig. 1). The ML tree is mostly concordant with previously published data, with one exception related to the placement of C. spiniferum (Anderson and Tekle 2013; Kudryavtsev et al. 2011; Tekle et al. 2013). Cochliopodium spiniferum grouped with a clade consisting of (((C. minus + C. pentatrifurcatum) + C. megatetrastylus) + Cochliopodium minutoidum) with strong support (BS: 98, Fig. 1). In previous studies based on SSU-rDNA, with a slight difference in taxon sampling, the four sister taxa grouped with a clade consisting of C. actinophorum (Anderson and Tekle 2013; Tekle et al. 2013). The undescribed Cochliopodium sp. 'Con1' formed a strongly supported (BS: 99) sister-group relationship with C. actinophorum. A weakly supported clade (BS: 67) consisting of C. larifeili and C. gallicum branches basal to the rest of the Cochliopodium isolates (Fig. 1). The position of C. larifeili was unresolved in some analyses (data not shown). A similar tree topology with slightly lower bootstrap support was recovered from the 498-bp dataset described below (data not shown).

Effects of Molecular Models and Sequence Length in Genetic Diversity Computation

To account for the variability of dataset size (published sequence length) and models of sequence evolution used in previous studies (Kosakyan et al. 2012; Nassonova et al. 2010), a thorough analysis was performed to test the robustness of intraspecific and interspecific computation results in two datasets differing by 15%: a small dataset (498 bp) and a large dataset (668 bp, Figs 2 and 3). All six models analyzed gave similar maximum intraspecific (Supplementary Material Fig. S1) and minimum interspecific divergences (Fig. 2). The pairwise distances obtained with the 6 different models were consistent at the lower boundary divergences, ranging from 0.4%-8% (Fig. 2). The only noticeable effect of the selected models on divergence computation was observed at the higher boundary (≥ 8%) of average interspecific divergences (Fig. 2). While five of the six models gave similar divergences throughout, one model, uncorrected p-distance (p-distance), yielded lower divergence values in clones whose average interspecific divergence was ≥ 8% (Fig. 2). Average interspecific divergence in the p-distance analysis was lower by up to 2.8% when compared with the rest of the models (Fig. 2). The Maximum Composite Likelihood model as implemented in MEGA yielded a substantially higher discrepancy compared to the six models and was not used in any of the analyses (data not shown). Similar patterns were obtained when different dataset sizes were analyzed. Lower boundary average interspecific sequence divergence, ranging from 0.4%-8%, was not affected by the size of the dataset analyzed (Fig. 3). Average interspecific sequence divergence was higher in the small dataset (498 bp) than in the large dataset (668 bp) starting at divergences greater than 14% (Fig. 3). The small dataset yielded up to 2.7% higher average interspecific divergence than the large dataset (Fig. 3).
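
Why the uncorrected p-distance departs from a corrected model only at higher divergences can be seen from the standard K2P formula. The worked numbers below are hypothetical values chosen for illustration, not measurements from this study; P and Q denote the proportions of aligned sites showing transitions and transversions, respectively.

```latex
% Standard definitions: P = proportion of transitional differences,
% Q = proportion of transversional differences between two aligned sequences.
\begin{align*}
p &= P + Q, \\
d_{\mathrm{K2P}} &= -\tfrac{1}{2}\ln\bigl(1 - 2P - Q\bigr) - \tfrac{1}{4}\ln\bigl(1 - 2Q\bigr).
\end{align*}
% Hypothetical low-divergence pair: P = 0.003, Q = 0.002
\begin{align*}
p &= 0.005, &
d_{\mathrm{K2P}} &= -\tfrac{1}{2}\ln(0.992) - \tfrac{1}{4}\ln(0.996) \approx 0.00502 .
\end{align*}
% Hypothetical high-divergence pair: P = 0.06, Q = 0.04
\begin{align*}
p &= 0.100, &
d_{\mathrm{K2P}} &= -\tfrac{1}{2}\ln(0.84) - \tfrac{1}{4}\ln(0.92) \approx 0.108 .
\end{align*}
```

In this sketch the correction is negligible at about 0.5% divergence and close to one percentage point at 10%, which is consistent in direction with the observation that p-distance yields lower values only among the more divergent comparisons.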


Species Delimitation in Cochliopodium

A combination of approaches was used to delimit species in Cochliopodium. Since all nine isolates studied were morphologically well defined, direct attempts were made to find a local threshold to delineate species based on the observed maximum intraspecific and minimum interspecific divergences. Intraspecific divergences ranged from 0%-0.9%, while interspecific divergences ranged from 0%-29.9% (Tables 1, 2). The overlap in the distribution of intraspecific and interspecific divergences hindered finding a local threshold that could be used to delineate the nine morphologically distinguishable species at the COI sequence level. Species delimitation approaches developed for DNA datasets in the absence of morphology, such as automatic barcode gap discovery (ABGD) and the Bayesian Poisson tree process (bPTP), were used to determine genetic lineages in Cochliopodium. Both approaches yielded similar results. The number of genetic lineages defined with the ABGD method using standard settings varied with different a priori thresholds (Supplementary Material Fig. S2). ABGD suggested genetic lineages based on nine a priori thresholds. Lower and intermediate a priori threshold values suggested eight groups, each representing a hypothetical genetic lineage, while higher a priori thresholds suggested a lower number of groups (1-7) (Supplementary Material Fig. S2). Similarly, bPTP consistently partitioned the analyzed haplotypes into eight groups, each group receiving moderate to strong support (0.77-1) from both Bayesian posterior probabilities and maximum likelihood analyses (data not shown). Consistent with these two approaches, eight species can be delineated if a local threshold or barcode gap is defined between the maximum intraspecific and minimum interspecific divergences (0.9%-2.8%).
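
The local-threshold reasoning above can be sketched in a few lines. This is a plain threshold (single-linkage) illustration, not an implementation of ABGD or bPTP; the haplotype names, species labels and distances (in %) are hypothetical toy values chosen only to mirror the 0.9%-2.8% gap described in the text.

```python
def split_distances(dist, label):
    """Split pairwise distances into within-label and between-label lists."""
    intra, inter = [], []
    for (a, b), d in dist.items():
        (intra if label[a] == label[b] else inter).append(d)
    return intra, inter


def barcode_gap(intra, inter):
    """Return (max intra, min inter) if a gap exists, otherwise None."""
    hi, lo = max(intra), min(inter)
    return (hi, lo) if hi < lo else None


def cluster_by_threshold(names, dist, threshold):
    """Single-linkage clustering: join sequences whose distance is <= threshold."""
    parent = {n: n for n in names}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), d in dist.items():
        if d <= threshold:
            parent[find(a)] = find(b)
    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hypothetical toy distances in percent (not data from this study).
names = ["sp1_h1", "sp1_h2", "sp2_h1", "sp3_h1"]
dist = {
    ("sp1_h1", "sp1_h2"): 0.9, ("sp1_h1", "sp2_h1"): 0.3,
    ("sp1_h2", "sp2_h1"): 0.6, ("sp1_h1", "sp3_h1"): 3.1,
    ("sp1_h2", "sp3_h1"): 3.0, ("sp2_h1", "sp3_h1"): 2.8,
}
morpho = {"sp1_h1": "sp1", "sp1_h2": "sp1", "sp2_h1": "sp2", "sp3_h1": "sp3"}
merged = {"sp1_h1": "L1", "sp1_h2": "L1", "sp2_h1": "L1", "sp3_h1": "L2"}

print(barcode_gap(*split_distances(dist, morpho)))   # morphospecies labels
print(barcode_gap(*split_distances(dist, merged)))   # genetic lineage labels
print(cluster_by_threshold(names, dist, threshold=1.0))
```

With morphospecies labels the toy data show no gap because two morphospecies differ by less than the largest within-species distance; once those two are treated as one genetic lineage a gap appears, and any threshold inside it recovers the same grouping.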


Discussion

DNA barcoding based on a single-locus marker, COI, is playing a crucial role in uncovering hidden diversity of macroscopic and microscopic eukaryotes (Hebert et al. 2003b, 2004a, 2013; Kosakyan et al. 2012; Lin et al. 2009; Nassonova et al. 2010; Winterbottom et al. 2014). Such a molecular-based approach is of enormous importance in microscopic eukaryotes, particularly Amoebozoa, where progress in delineating lower taxonomic levels has been lagging behind due to limited morphological variability and greater plasticity of the existing characters. Studies employing molecular markers with a focus on the diversity of lower taxonomic groups in Amoebozoa are emerging (Heger et al. 2011, 2013; Kosakyan et al. 2012, 2013; Nassonova et al. 2010). Two recent studies that used multilocus markers in two taxonomic groups representing naked and testate amoebae demonstrated that COI gene analyses showed strong promise compared to nuclear genes for species discovery and specimen identification purposes (Heger et al. 2011; Nassonova et al. 2010). Some authors have used COI not only to unveil extensive cryptic species diversity but also to support taxonomic revisions, including redescription of species complexes, and to suggest reevaluation of some morphological characters (Kosakyan et al. 2012, 2013). Similarly, the findings of this study corroborate the utility of COI gene analyses in understanding the taxonomy and diversity of Cochliopodium.

Although the utility of DNA barcoding for diversity studies has been demonstrated across a broad range of eukaryotic taxonomic groups, some pitfalls limit the universal applicability of the marker (Collins and Cruickshank 2013; Rubinoff et al. 2006). The major hurdles that have been identified and discussed include overestimation of diversity, the lack of a global barcode gap, and experimental and analytical problems (reviewed in Collins and Cruickshank 2013). Overestimation of species diversity occurs when nuclear mitochondrial pseudogenes (Numts) are coamplified with orthologous genes (Song et al. 2008). These pseudogenes are not functionally constrained and accumulate more mutations than their orthologous mitochondrial counterparts. This might result in higher observed intraspecific divergences, which could lead to erroneously high diversity estimates if the pseudogenes are not detected and taken into consideration (Song et al. 2008). Multiple sequencing using different PCR conditions from population DNA in the current study enabled the discovery of pseudogenes and an assessment of their effects on species discovery and identification in Cochliopodium. Most of the detected pseudogenes had single deletions; a couple had stop codons, and one was a divergent sequence with a gap. Even though these pseudogenes were excluded from the final analyses, their inclusion did not substantially affect the outcomes of the analyses (data not shown).
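
A simple screen for the pseudogene signatures mentioned above (frame-disrupting deletions and in-frame stop codons) might look like the sketch below. It is an illustration only: the function name, the toy sequences and the choice of stop codons are assumptions. In particular, the stop-codon set assumes NCBI translation table 4, in which TGA is read as tryptophan, and the correct reading frame for a real amplicon would have to be supplied by the user.

```python
# Assumption: NCBI translation table 4 (mold/protozoan mitochondrial code),
# where TGA encodes Trp and only TAA and TAG terminate translation.
STOP_CODONS = {"TAA", "TAG"}


def screen_clone(seq, expected_length, frame=0):
    """Return a list of reasons a clone looks like a putative pseudogene.

    An empty list means neither signature was found; it does not prove
    that the sequence is a functional mitochondrial copy.
    """
    seq = seq.upper().replace("-", "")
    flags = []
    # An indel that is not a multiple of 3 shifts the reading frame.
    if (expected_length - len(seq)) % 3 != 0:
        flags.append("length differs from expected by a non-multiple of 3")
    codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
    stops = [i for i, codon in enumerate(codons) if codon in STOP_CODONS]
    if stops:
        flags.append(f"in-frame stop codon at codon index {stops[0]}")
    return flags


# Hypothetical toy clones (not data from this study); expected length 12 bp.
for clone in ["ATGGCATTTGGA", "ATGTGATTTGGA", "ATGTAATTTGGA", "ATGGCATTGGA"]:
    print(clone, screen_clone(clone, expected_length=12) or "no flags")
```

A screen of this kind only catches disrupted reading frames; pseudogenes with an intact frame would pass it, which is why comparing analyses with and without the flagged sequences, as done in this study, remains informative.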

One of the intensely debated major challenges in DNA barcoding is the determination of a sequence threshold, commonly known as the barcoding gap, to aid in species delimitation. The barcoding gap relies on substantial differences (a gap) between intraspecific and interspecific distances within a group of organisms. Several approaches have been proposed to define distance thresholds, including fixed barcode gaps based on reference sequences (Hebert et al. 2004b) and more recent approaches that optimize the threshold directly from each empirical dataset independent of reference sequences (Pons et al. 2006; Puillandre et al. 2012), or from an inferred phylogeny (Zhang et al. 2013). While fixed thresholds are prone to arbitrariness due to variation in coalescent depths among species (Collins and Cruickshank 2013), the taxonomy-independent approaches (Pons et al. 2006; Puillandre et al. 2012; Zhang et al. 2013) are expected to overcome this predicament. To date there is no defined working barcoding gap in Amoebozoa, though recent publications have used a combination of approaches to delimit species in testate amoebae. Kosakyan et al. (2012, 2013) used an observed intraspecific distance to delimit species in nebelid testate amoebae, while Heger et al. (2013) used a combination of approaches (fixed and data-driven thresholds) to delineate species in a similar group of testate amoebae, Hyalosphenia. A generic local threshold such as the ≥ 1% used by Heger et al. (2013) in amoebae might be reasonable based on the limited available data. However, a few important considerations need to be taken into account. The ≥ 1% threshold in the Heger et al. (2013) study is based on a previous report that documented ≤ 0.5% intraspecific variation in naked amoebae (Nassonova et al. 2010). In the present study, a higher intraspecific divergence reaching up to 0.9% was detected. The lower intraspecific divergence observed in the previous studies might be attributed to the low number of sequenced clones per isolate and the use of DNA obtained from clonally grown amoebae or single (few) cells. Therefore, intraspecific divergences should be thoroughly explored before a fixed threshold of 1% or higher is adopted, although such a threshold might have some unforeseen consequences.

While problems associated with a fixed threshold could be avoided by using a comprehensive sequence database or non-static, data-driven threshold computation approaches, one of the most challenging aspects of DNA barcoding is the absence of a barcode gap in some lineages due to overlap between intra- and interspecific divergences. In the present study two well-described species, C. minus (Kudryavtsev 2006) and C. pentatrifurcatum (Tekle et al. 2013), with unique microscale morphology and light microscope characters, shared identical COI gene sequences and interspecific haplotype divergences that fall within the intraspecific range (Tables 1, 2). Scale morphology has played a critical role in species description in Cochliopodium (e.g. Tekle et al. 2013). According to Anderson and Tekle (2013), three surface scale categories are recognized in Cochliopodium. These two species have scale types belonging to different categories (Anderson and Tekle 2013). Similarly, another study reported a high degree of sequence similarity that falls within the range of intramorphospecific (species complex) divergence in two morphologically distinguishable described vannellids (Nassonova et al. 2010). Interestingly, in both instances the haplotypes that were recovered from each isolate were unique and were not shared within each species pair. It is important to note, however, that no identical shared sequences were reported for the two vannellids, Vannella arabica and Vannella bursella (Nassonova et al. 2010). The discordance between morphology and molecular data might indicate that COI under special circumstances could underestimate the true extent of diversity in these groups of amoebae. Alternatively, it might equally be argued that these results indicate plasticity of the morphological characters used to describe these isolates. Morphology-based taxonomy in Cochliopodium is fairly consistent with the molecular phylogeny based on SSU-rDNA (Kudryavtsev et al. 2005; Tekle et al. 2013). Even though there has been no thorough examination of all the morphological characters used in Cochliopodium taxonomy, the ultrastructure of the microscales has been successfully employed in species identification and description (Dykova et al. 1998; Kudryavtsev and Smirnov 2006). Our previous study found that three closely related sister taxa of cochliopodiums (C. pentatrifurcatum, C. megatetrastylus and C. minutoidum) share nearly identical SSU-rDNA sequences (Anderson and Tekle 2013; Tekle et al. 2013). While SSU-rDNA fails to differentiate these three species, they can be readily distinguished using microscale morphology and minimum COI interspecific sequence divergences that range from 2.8%-7.7% (Table 2).

Despite the perfect congruence of morphology and COI in the above three species, the reason(s) for the failure of COI to delineate C. pentatrifurcatum and C. minus is unclear. Examination of additional nuclear genes (SSU-rDNA and actin) from C. minus and C. pentatrifurcatum also demonstrated that the two species were indistinguishable by this evidence (unpublished data). As discussed above, one of the criticisms of DNA barcoding has been overestimation of biodiversity, while this study presents the completely opposite scenario. It is counterintuitive to assume that morphology can overestimate diversity in amoebae, since the group suffers from a scarcity of phenotypic characters.

The source of the ambiguities in these findings is not clear, but they may be explained by two possible scenarios. First, it might be possible that morphological characters such as the microscales used in Cochliopodium are homoplasious. If this scenario is accurate, we have to assume that similar problems are also expected to occur in other members of the genus. Hence, the currently observed genetic and morphological diversity would necessitate a serious reevaluation in the group. This would require investigation using extensive sampling of large numbers of species, including species from different geographical localities, and multiple sequencing. An alternative likely explanation might be found by looking at a possible confounding genetic factor such as hybridization, horizontal gene transfer or recent speciation that renders the COI gene useless in distinguishing well-characterized species. Abundant examples of these cases are reported in other organisms (Rubinoff et al. 2006). Our recent work shows that Cochliopodium engages in an unusual cell-to-cell interaction leading to cellular and nuclear fusion (Tekle et al., unpubl. observ.). Large plasmodial amoebae are later observed to fragment into uninucleated amoebae. During the fusion process many cellular components are observed to mix freely. This behavior is common in several members of the group, and preliminary observation indicates that interspecies fusion is likely (unpublished data). If this is confirmed, exchange of cellular components including mitochondria and other genetic material might lead to transgenic (hybrid) species formation, in which isolates with different morphology might carry the same genetic material. However, this requires further investigation. Cellular fusion in several members of Amoebozoa has been reported (Lahr et al. 2011b). Therefore, similar cases of discordance between COI and morphology might be found in other amoebae with similar behavior. Understanding cell-to-cell interaction behaviors will undoubtedly provide evidence of great significance on how genetic lineages originate and evolve in Amoebozoa.


Finally, the simple analytical approach (NJ) and the use of a single model (K2P) commonly employed in DNA barcoding studies have been criticized (Collins and Cruickshank 2013). Experimental design in DNA barcoding, such as sampling coverage and the length of sequence used, also varies among studies (Heger et al. 2011, 2013; Kosakyan et al. 2012, 2013; Nassonova et al. 2010). In the present study the effects of both model choice and sequence length on species delimitation were investigated. The findings demonstrate that the selected model of sequence evolution and the sequence length have little to no effect on the maximum intraspecific and minimum interspecific divergence threshold boundaries in Cochliopodium.
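The two threshold boundaries discussed above can be illustrated with a short sketch. Given pairwise distances and a species label for each sequence, the function below reports the maximum intraspecific and minimum interspecific divergence; a barcode gap exists when the second exceeds the first. The function name `barcode_gap`, the input layout and the toy values are assumptions made for illustration and are not the author's data or pipeline.

```python
from itertools import combinations


def barcode_gap(labels, dist):
    """Return (max intraspecific, min interspecific) distance.

    labels: species name for each sequence (hypothetical input layout).
    dist:   symmetric matrix (list of lists) of pairwise distances.
    """
    intra, inter = [], []
    for i, j in combinations(range(len(labels)), 2):
        # Same label: within-species pair; otherwise between-species pair.
        (intra if labels[i] == labels[j] else inter).append(dist[i][j])
    max_intra = max(intra) if intra else 0.0
    min_inter = min(inter) if inter else float("nan")
    return max_intra, min_inter


# Toy example with made-up percent divergences for two species.
labels = ["sp_A", "sp_A", "sp_B", "sp_B"]
dist = [
    [0.0, 0.9, 2.8, 3.1],
    [0.9, 0.0, 3.0, 3.7],
    [2.8, 3.0, 0.0, 0.7],
    [3.1, 3.7, 0.7, 0.0],
]
max_intra, min_inter = barcode_gap(labels, dist)
print("max intraspecific:", max_intra)
print("min interspecific:", min_inter)
print("barcode gap present:", min_inter > max_intra)
```

When two morphospecies share haplotypes, as reported here for C. minus and C. pentatrifurcatum, the minimum interspecific value for that pair drops to zero and no gap is found.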

In conclusion, the success rate of COI as a barcoding marker in amoeboid diversity studies is well demonstrated. However, species delimitation based solely on this marker could at times lead to erroneous estimates, especially under special circumstances where the genetic history of the marker is compromised by evolutionary processes or behavioral factors that undermine its utility. A comprehensive, integrated taxonomic approach should therefore be considered for effective species identification and biodiversity description in amoebae.


Methods

Taxa studied: Of the nine Cochliopodium species studied, eight are formally described: two ATCC isolates (Cochliopodium pentatrifurcatum ATCC® 30935™ and C. megatetrastylus ATCC® 30936™) and six CCAP isolates (C. actinophorum CCAP 1537/10, C. gallicum CCAP 1537/6, C. larifeili CCAP 1537/8, C. minus CCAP 1537/1A, C. spiniferum CCAP 1537/3 and C. minutoidum CCAP 1537/7). The ninth is an undescribed Cochliopodium sp. labeled "Con1" collected from nature. The freshwater amoebae (all species except the marine C. gallicum) were cultured in ATCC medium 997, with mixed bacteria as food. C. gallicum was grown in artificial seawater (Alga-Gro® Seawater Medium #153754, Burlington, North Carolina) with bacteria.

DNA extraction, PCR amplification: DNA samples from nine species were extracted using the illustra™ DNA Extraction Kit BACC1 (GE Healthcare UK Ltd, Little Chalfont, Buckinghamshire HP7 9NA, England, Cat. No. RPN8501) per the manufacturer's instructions, with the addition of a phenol-chloroform and isoamyl alcohol step using Phase Lock Gel Heavy tubes (Eppendorf AG, Hamburg, Germany, Cat. No. 955154070). The COI primers of Folmer et al. (1994) were used to amplify a ~700 bp region of the gene. Phusion DNA Polymerase, a strict proofreading enzyme, was used to amplify the genes of interest. In order to explore a wide range of intraspecific variation and avoid preferential amplification of certain haplotypes, varied annealing temperatures ranging from 50 °C to 60 °C were used in multiple PCRs. Lucigen PCRSmart, Novagen Perfectly Blunt and Invitrogen Zero Blunt Topo cloning kits were used for cloning. Sequencing of cloned plasmid DNA was accomplished using vector-specific primers and the BigDye terminator kit (Perkin-Elmer). Sequences were run on an ABI 3100 automated sequencer in the Morehouse School of Medicine. We fully sequenced 6-38 clones per species and surveyed a minimum of 8 clones per isolate.
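Sequencing several clones per isolate yields many identical reads, which are then summarized as haplotypes (Table 1 lists both counts). A minimal sketch of that bookkeeping step is given below, assuming the clone sequences are already trimmed to the same region. The function name `collapse_haplotypes` and the toy sequences are hypothetical, and the sketch makes no attempt to separate true variants from PCR or sequencing error.

```python
from collections import Counter


def collapse_haplotypes(clones):
    """Group identical clone sequences.

    Returns a dict mapping each distinct sequence (haplotype)
    to the number of clones that carry it.
    """
    # Normalize case and surrounding whitespace before comparing.
    normalized = (seq.strip().upper() for seq in clones)
    return dict(Counter(normalized))


# Toy example: four clones, two distinct haplotypes (made-up sequences).
clones = ["ACGT", "acgt", "ACGA", "ACGT"]
haplotypes = collapse_haplotypes(clones)
print(len(clones), "clones ->", len(haplotypes), "haplotypes")
print(haplotypes)
```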

Alignment, phylogenetic and genetic diversity analyses: The COI gene sequence alignment was manually constructed using Se-Al, Sequence Alignment Editor (Rambaut et al. 1997) and BioEdit version 7.2.5 (Hall 1999). Pairwise distances and general sequence composition statistics were calculated in MEGA v6 (Tamura et al. 2013). In order to evaluate the effect of different models of sequence evolution on the divergence computation between DNA sequences, genetic distances using six different models, including uncorrected p-distance, Jukes-Cantor, Tajima-Nei, Kimura 2-parameter (K2P), Tamura-Nei and Tamura 3-parameter, with Ts (transition) and Tv (transversion) rate correction and uniform rates among sites, were compared in MEGA v6. These were applied to two datasets differing in the number of characters, i.e. small and large datasets consisting of 498 bp and 668 bp, respectively. All pseudogenes (Numts) were excluded from the final analysis. Maximum likelihood phylogenetic trees and bootstrap support values with 1000 replicates were inferred using MEGA v6 and RAxML BlackBox on the PhyloBench website with default settings and the GTR+Γ+I model (Stamatakis et al. 2008). The model of molecular evolution GTR+Γ+I with 4 rate categories was selected by the corrected Akaike Information Criterion in jModelTest 2.1.4 (Darriba et al. 2012). The final dataset for the phylogenetic tree reconstruction contained a total of 44 ingroup haplotypes representing 9 Cochliopodium species and 12 outgroup vannellids. Outgroup taxa were chosen based on available COI sequences that are closest to Cochliopodium. Two species delimitation methods were used: Automatic Barcode Gap Discovery (ABGD; Puillandre et al. 2012), which partitions genetic lineages directly from the sequence data, and Poisson tree processes (PTP), including the Bayesian implementation of the model (bPTP; Zhang et al. 2013), which partitions lineages from an inferred phylogeny. The online versions of ABGD (http://wwwabi.snv.jussieu.fr/public/abgd/) and bPTP (http://species.h-its.org/ptp/) were used with default parameters.
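To make the model comparison concrete, the sketch below computes three of the six distances named above (uncorrected p-distance, Jukes-Cantor and K2P) for one pair of aligned sequences, using the standard textbook formulas. It is an illustration only: the analyses in this study were run in MEGA v6, and the function names and toy sequences here are assumptions. Sites with gaps or ambiguous bases are skipped pairwise, which is one of several possible conventions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_differences(seq1, seq2):
    """Return (compared sites, transitions, transversions) for aligned sequences."""
    n = ts = tv = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguity codes
        n += 1
        if a == b:
            continue
        # A<->G or C<->T is a transition; anything else is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            ts += 1
        else:
            tv += 1
    return n, ts, tv


def distances(seq1, seq2):
    """Return p-distance, Jukes-Cantor and Kimura 2-parameter distances."""
    n, ts, tv = count_differences(seq1, seq2)
    P, Q = ts / n, tv / n  # proportions of transitions and transversions
    p = P + Q
    jc = -0.75 * math.log(1 - 4 * p / 3)
    k2p = -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)
    return {"p": p, "JC": jc, "K2P": k2p}


# Toy pair: 20 sites, one transition (A/G) and one transversion (T/A).
s1 = "ACGTACGTACGTACGTACGT"
s2 = "GCGTACGTACGTACGTACGA"
for name, value in distances(s1, s2).items():
    print(f"{name}: {value:.4f}")
```

At low divergence the corrected distances stay close to the uncorrected p-distance, which is consistent with the observation that model choice had little effect on the threshold boundaries. Restricting both sequences to a shorter window before calling `distances` gives a rough analogue of the small versus large dataset comparison.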


Acknowledgements

The author acknowledges the Office of the Provost, Spelman College, for funding this study. YIT acknowledges financial support from ASPIRE, NSF grant HRD-0714553. The author thanks Prof. O. Roger Anderson for his invaluable comments on the manuscript. Ms. Samantha Kelly and Ms. Lydia Gorfu are thanked for assistance in the laboratory.


References

Adl SM, Simpson AG, Lane CE, et al. (2012) The revised classification of eukaryotes. J Eukaryot Microbiol 59:429-493

Anderson OR, Tekle YI (2013) A description of Cochliopodium megatetrastylus n. sp. isolated from a freshwater habitat. Acta Protozool 52:55-64

Baldauf SL, Roger AJ, Wenk-Siefert I, Doolittle WF (2000) A kingdom-level phylogeny of eukaryotes based on combined protein data. Science 290:972-977

Barth D, Krenek S, Fokin SI, Berendonk TU (2006) Intraspecific genetic variation in Paramecium revealed by mitochondrial cytochrome c oxidase I sequences. J Eukaryot Microbiol 53:20-25

Burki F, Inagaki Y, Brate J, et al. (2009) Large-scale phylogenomic analyses reveal that two enigmatic protist lineages, Telonemia and Centroheliozoa, are related to photosynthetic chromalveolates. Genome Biol Evol 1:231-238

Cavalier-Smith T, Chao EEY, Oates B (2004) Molecular phylogeny of Amoebozoa and the evolutionary significance of the unikont Phalansterium. Eur J Protistol 40:21-48

Collins RA, Cruickshank RH (2013) The seven deadly sins of DNA barcoding. Mol Ecol Resour 13:969-975

Darriba D, Taboada GL, Doallo R, Posada D (2012) jModelTest 2: more models, new heuristics and parallel computing. Nat Methods 9:772

Dykova I, Lom J, Machackova B (1998) Cochliopodium minus, a scale-bearing amoeba isolated from organs of perch Perca fluviatilis. Dis Aquat Organ 34:205-210

Egge E, Bittner L, Andersen T, Audic S, de Vargas C, Edvardsen B (2013) 454 pyrosequencing to describe microbial eukaryotic community composition, diversity and relative abundance: a test for marine haptophytes. PLoS One 8:e74371

Folmer O, Black M, Hoeh W, Lutz R, Vrijenhoek R (1994) DNA primers for amplification of mitochondrial cytochrome c oxidase subunit I from diverse metazoan invertebrates. Mol Mar Biol Biotechnol 3:294-299

Haeckel EHPA (1866) Generelle Morphologie der Organismen. Allgemeine Grundzüge der organischen Formen-Wissenschaft, mechanisch begründet durch die von Charles Darwin reformirte Descendenztheorie. G. Reimer, Berlin

Hall T (1999) BioEdit: An important software for molecular biology. Nucl Acids Symp Ser 41:95-98

Harper JT, Gile GH, James ER, Carpenter KJ, Keeling PJ (2009) The inadequacy of morphology for species and genus delineation in microbial eukaryotes: an example from the parabasalian termite symbiont Coronympha. PLoS One 4:e6577

Hebert PD, Ratnasingham S, deWaard JR (2003b) Barcoding animal life: cytochrome c oxidase subunit 1 divergences among closely related species. Proc R Soc Lond B 270:S96-S99

Hebert PD, Cywinska A, Ball SL, deWaard JR (2003a) Biological identifications through DNA barcodes. Proc R Soc Lond B 270:313-321

Hebert PD, Stoeckle MY, Zemlak TS, Francis CM (2004b) Identification of birds through DNA barcodes. PLoS Biol 2:e312

Hebert PD, Penton EH, Burns JM, Janzen DH, Hallwachs W (2004a) Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proc Natl Acad Sci USA 101:14812-14817

Heger TJ, Mitchell EA, Leander BS (2013) Holarctic phylogeography of the testate amoeba Hyalosphenia papilio (Amoebozoa: Arcellinida) reveals extensive genetic diversity explained more by environment than dispersal limitation. Mol Ecol 22:5172-5184

Heger TJ, Pawlowski J, Lara E, et al. (2011) Comparing potential COI and SSU rDNA barcodes for assessing the diversity and phylogenetic relationships of cyphoderiid testate amoebae (Rhizaria: Euglyphida). Protist 162:131-141

Kosakyan A, Gomaa F, Mitchell EA, Heger TJ, Lara E (2013) Using DNA-barcoding for sorting out protist species complexes: a case study of the Nebela tincta-collaris-bohemica group (Amoebozoa; Arcellinida, Hyalospheniidae). Eur J Protistol 49:222-237

Kosakyan A, Heger TJ, Leander BS, Todorov M, Mitchell EA, Lara E (2012) COI barcoding of Nebelid testate amoebae (Amoebozoa: Arcellinida): extensive cryptic diversity and redefinition of the Hyalospheniidae Schultze. Protist 163:415-434

Kudryavtsev A, Smirnov A (2006) Cochliopodium gallicum n. sp. (Himatismenida), an amoeba bearing unique scales, from cyanobacterial mats in the Camargue (France). Eur J Protistol 42:3-7

Kudryavtsev A, Wylezich C, Pawlowski J (2011) Ovalopodium desertum n. sp. and the phylogenetic relationships of Cochliopodiidae (Amoebozoa). Protist 162:571-589

Kudryavtsev A, Bernhard D, Schlegel M, Chao EEY, Cavalier-Smith T (2005) 18S ribosomal RNA gene sequences of Cochliopodium (Himatismenida) and the phylogeny of Amoebozoa. Protist 156:215-224

Lahr DJ, Grant J, Nguyen T, Lin JH, Katz LA (2011a) Comprehensive phylogenetic reconstruction of Amoebozoa based on concatenated analyses of SSU-rDNA and actin genes. PLoS ONE 6:e22780

Lahr DJ, Parfrey LW, Mitchell EA, Katz LA, Lara E (2011b) The chastity of amoebae: re-evaluating evidence for sex in amoeboid organisms. Proc R Soc Biol Sci B 278:2081-2090

Lara E, Heger TJ, Ekelund F, Lamentowicz M, Mitchell EA (2008) Ribosomal RNA genes challenge the monophyly of the Hyalospheniidae (Amoebozoa: Arcellinida). Protist 159:165-176

Lara E, Heger TJ, Scheihing R, Mitchell EAD (2011) COI gene and ecological data suggest size-dependent high dispersal and low intra-specific diversity in free-living terrestrial protists (Euglyphida: Assulina). J Biogeogr 38:640-650

Lin S, Zhang H, Hou Y, Zhuang Y, Miranda L (2009) High-level diversity of dinoflagellates in the natural environment, revealed by assessment of mitochondrial cox1 and cob genes for dinoflagellate DNA barcoding. Appl Environ Microbiol 75:1279-1290

Lopez JV, Yuhki N, Modi W, Masuda R, O'Brien SJ (1994) Numt, a recent transfer and tandem amplification of mitochondrial DNA in the nuclear genome of the domestic cat. J Mol Evol 39:171-190

Nassonova E, Smirnov A, Fahrni J, Pawlowski J (2010) Barcoding amoebae: comparison of SSU, ITS and COI genes as tools for molecular identification of naked lobose amoebae. Protist 161:102-115

Nikolaev SI, Berney C, Petrov NB, Mylnikov AP, Fahrni JF, Pawlowski J (2006) Phylogenetic position of Multicilia marina and the evolution of Amoebozoa. Int J Syst Evol Microbiol 56:1449-1458

Nikolaev SI, Mitchell EAD, Petrov NB, Berney C, Fahrni J, Pawlowski J (2005) The testate lobose amoebae (order Arcellinida Kent, 1880) finally find their home within Amoebozoa. Protist 156:191-202

Parfrey LW, Grant J, Tekle YI, et al. (2010) Broadly sampled multigene analyses yield a well-resolved eukaryotic tree of life. Syst Biol 59:518-533

Patterson DJ (1999) The diversity of eukaryotes. Am Nat 154:S96-S124

Pons J, Barraclough TG, Gomez-Zurita J, et al. (2006) Sequence-based species delimitation for the DNA taxonomy of undescribed insects. Syst Biol 55:595-609

Puillandre N, Lambert A, Brouillet S, Achaz G (2012) ABGD, Automatic Barcode Gap Discovery for primary species delimitation. Mol Ecol 21:1864-1877

Rambaut A, Harvey PH, Nee S (1997) End-Epi: An application for inferring phylogenetic and population dynamical processes from molecular sequences. Comput Appl Biosci 13:303-306

Rubinoff D, Cameron S, Will K (2006) A genomic perspective on the shortcomings of mitochondrial DNA for "barcoding" identification. J Hered 97:581-594

Schindel DE, Miller SE (2005) DNA barcoding a useful tool for taxonomists. Nature 435:17

Schmarda LK (1871) Zoologie. Braumüller, Wien, 372 p

Smirnov A (2008) Amoebas, Lobose. In Schaechter M (ed) Encyclopedia of Microbiology. Elsevier, Oxford, pp 558-577

Smirnov AV, Chao E, Nassonova ES, Cavalier-Smith T (2011) A revised classification of naked lobose amoebae (Amoebozoa: Lobosa). Protist 162:545-570

Smirnov AV, Nassonova ES, Chao E, Cavalier-Smith T (2007) Phylogeny, evolution, and taxonomy of vannellid amoebae. Protist 158:295-324

Smirnov A, Nassonova E, Berney C, Fahrni J, Bolivar I, Pawlowski J (2005) Molecular phylogeny and classification of the lobose amoebae. Protist 156:129-142

Song H, Buhay JE, Whiting MF, Crandall KA (2008) Many species in one: DNA barcoding overestimates the number of species when nuclear mitochondrial pseudogenes are coamplified. Proc Natl Acad Sci USA 105:13486-13491

Spiegel FW (2011) Commentary on the chastity of amoebae: re-evaluating evidence for sex in amoeboid organisms. Proc R Soc Biol Sci B 278:2096-2097

Stamatakis A, Hoover P, Rougemont J (2008) A rapid bootstrap algorithm for the RAxML web-servers. Syst Biol 57:758-771

Stern RF, Horak A, Andrew RL, et al. (2010) Environmental barcoding reveals massive dinoflagellate diversity in marine environments. PLoS One 5:e13991

Stoeck T, Zuendorf A, Breiner HW, Behnke A (2007) A molecular approach to identify active microbes in environmental eukaryote clone libraries. Microb Ecol 53:328-339

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Mol Biol Evol 30:2725-2729

Tekle YI, Parfrey LW, Katz LA (2009) Molecular data are transforming hypotheses on the origin and diversification of eukaryotes. Bioscience 59:471-481

Tekle YI, Anderson OR, Lecky AF, Kelly SD (2013) A new freshwater amoeba: Cochliopodium pentatrifurcatum n. sp. (Amoebozoa, Amorphea). J Eukaryot Microbiol 60:342-349

Tekle YI, Grant J, Anderson OR, et al. (2008) Phylogenetic placement of diverse amoebae inferred from multigene analyses and assessment of clade stability within 'Amoebozoa' upon removal of varying rate classes of SSU-rDNA. Mol Phylogenet Evol 47:339-352

Tekle YI, Grant J, Cole JC, et al. (2007) A multigene analysis of Corallomyxa tenera sp. nov. suggests its membership in a clade that includes Gromia, Haplosporidia and Foraminifera. Protist 158:457-472

Winterbottom R, Hanner RH, Burridge M, Zur M (2014) A cornucopia of cryptic species - a DNA barcode analysis of the gobiid fish genus Trimma (Percomorpha, Gobiiformes). ZooKeys 381:79-111

Yi Z, Dunthorn M, Song W, Stoeck T (2010) Increasing taxon sampling using both unidentified environmental sequences and identified cultures improves phylogenetic inference in the Prorodontida (Ciliophora, Prostomatea). Mol Phylogenet Evol 57:937-941

Yoon HS, Grant J, Tekle Y, et al. (2008) Broadly sampled multigene trees of eukaryotes. BMC Evol Biol 8:14

Zhang J, Kapli P, Pavlidis P, Stamatakis A (2013) A general species delimitation method with applications to phylogenetic placements. Bioinformatics 29:2869-2876


Figure Legends

Figure 1. Maximum likelihood (ML) tree of Cochliopodium species inferred from COI gene sequences of the large dataset (668 bp) with K2P model. ML bootstrap values are shown at the nodes. All branches are drawn to scale.

Figure 2. Average interspecific COI gene sequence divergences inferred using six different models.

Figure 3. Comparison of sequence divergences in two datasets differing by 15%: small dataset (498 bp, broken line) and large dataset (668 bp, solid line).

Table 1. Cochliopodium COI gene sequence data and intraspecific divergences.

Taxon                             Clones  Haplotypes  Intraspecific divergence, maximum (average)  COI GenBank accession numbers
Cochliopodium pentatrifurcatum    38      13          0.9% (0.1%)                                  KJ781428 - KJ781438
Cochliopodium minus CCAP 1537/1A  12      7           0.8% (0.3%)                                  KJ781439 - KJ781444
Cochliopodium megatetrastylus     27      18          0.7% (0.2%)                                  KJ781445 - KJ781451
Cochliopodium minutoidum          6       3           0.4% (0.2%)                                  KJ781452 - KJ781454
Cochliopodium spiniferum          8       6           0.6% (0.3%)                                  KJ781455 - KJ781459
Cochliopodium sp. 'Con1'          14      7           0.8% (0.2%)                                  KJ781460 - KJ781461
Cochliopodium actinophorum        8       4           0.3% (0.2%)                                  KJ781462 - KJ781465
Cochliopodium larifeili           6       2           0                                            KJ781466
Cochliopodium gallicum            7       2           0.2% (0%)                                    KJ781467 - KJ781468

Table 2. Interspecific percentage sequence divergences, minimum-maximum (average), among Cochliopodium species.

Taxon                  C. minus             C. pentatrifurcatum  C. megatetrastylus   C. minutoidum        C. spiniferum        C. sp. 'Con1'        C. actinophorum      C. larifeili
C. minus CCAP 1537/1A
C. pentatrifurcatum    0-1.1 (0.4)
C. megatetrastylus     2.8-3.7 (3.2)        2.8-3.9 (3.1)
C. minutoidum          7.7-8.6 (8.1)        7.7-8.7 (8)          7.9-8.7 (8.2)
C. spiniferum          13.8-14.7 (14.3)     13.8-14.9 (14.2)     13.6-14.5 (14)       16.4-16.8 (16.6)
C. sp. 'Con1'          16.4-17.1 (16.7)     16.4-17.3 (16.7)     15.2-16 (15.5)       14.7-14.9 (14.8)     17.1-17.9 (17.5)
C. actinophorum        16.5-17.3 (16.9)     16.5-17.5 (16.9)     16-16.7 (16.3)       15.1-15.3 (15.2)     17.7-18.3 (18)       9.3-9.6 (9.5)
C. larifeili           20-20.6 (20.3)       20-20.8 (20.2)       20.2-20.6 (20.4)     20.8-20.8 (20.8)     20.2-20.2 (20.2)     20.6-20.8 (20.7)     22.8-23 (22.9)
C. gallicum            27-27.8 (27.3)       27-27.6 (27.3)       27-27.4 (27.2)       27-27.2 (27.1)       29.2-29.9 (29.5)     26.1-26.5 (26.3)     25.9-26.5 (26.2)     22.8-23 (22.9)

Figure 1 [Tree image not recoverable from the extracted text. Tips are COI sequences of the nine Cochliopodium species, with outgroup vannellids (Vannella arabica, V. bursella, V. calycinucleolus, V. danica, V. persistens and V. simplex); scale bar 0.05.]

Figure 2 [Plot image not recoverable from the extracted text. Y-axis: average sequence divergence (0% to 35%); x-axis: pairwise comparisons (1 to 36); series: PD, JC, Tajima-Nei, T3P, Tamura-Nei, K2P.]


Figure 3
