Molecular screening of free-living microbial eukaryotes: diversity and distribution using a meta-analysis


Thomas A Richards1,2 and David Bass2

Numerous environmental gene library studies have shown that eukaryote microbial diversity is much greater than expected. Molecular surveys of several ‘extreme’ and some more anthropomorphically commonplace environments have revealed many previously unsampled micro-eukaryotic lineages. However, it cannot be assumed that all of the sequences recovered from these studies are derived from real organisms, and for those that are, many questions remain about their distribution and ecology. Integrating all available sequence data from these studies reveals patterns of distribution, diversity and evolutionary relationships that are not accessible from independent analyses of the individual surveys and enables us to review the wider implications of such studies.

Addresses
1 Department of Zoology, The Natural History Museum, Cromwell Road, London, SW7 5BD, UK
2 Department of Zoology, The University of Oxford, South Parks Road, Oxford, OX1 3PS, UK

Corresponding author: Bass, David ([email protected])

Current Opinion in Microbiology 2005, 8:240–252
This review comes from a themed issue on Ecology and industrial microbiology
Edited by Sergio Sánchez and Betty Olson
Available online 6th May 2005
1369-5274/$ – see front matter © 2005 Elsevier Ltd. All rights reserved.

DOI 10.1016/j.mib.2005.04.010

Introduction

Our ability to analyse microbial diversity was revolutionised by the application of molecular techniques in the 1980s (e.g. [1,2]). In particular, the use of PCR primers to isolate small subunit ribosomal RNA (SSU rRNA) genes (as SSU rDNA) from natural prokaryote communities has revealed that the genetic diversity within these communities is far greater than that suggested by morphological investigations alone. It was also striking that the majority of the genotypes recovered by molecular surveys did not match those from laboratory cultures. This reinforced an earlier finding that many bacteria are resistant to cultivation [3]. Molecular sampling, which is affected by different experimental biases from cultivation or microscopy methods, has consistently recovered new phylogenetic lineages, some of which have been shown to represent novel higher-level taxa (e.g. [4]).

Recently, similar studies have used PCR primers that target eukaryotic SSU rRNA genes in the environment. These studies revealed extensive, previously unsampled, molecular diversity both within, and distinct from, known taxonomic groups. These data have also shown that many groups were more diverse in some environments than previously thought [5,6,7]. Currently, no study has demonstrated sampling saturation (e.g. [8]), indicating that further sampling from these environments would reveal even greater diversity [9]. A disproportionate number of these studies have used eukaryote-wide PCR primers to survey ‘extreme’ environments, for example oxygen-depleted sediments, highly acidic freshwater, deep-sea habitats, including hydrothermal vents, and Antarctic soils (see Table 1); fewer have screened commonly accessible human environments [10–12]. Both approaches have revealed many previously unknown lineages, some potentially, but often debatably, at the highest of taxonomic levels. In this review we pool data from as many eukaryotic environmental SSU rDNA surveys as were available by July 2004 (Table 1) to maximise the chances of revealing patterns of ecological and geographical structuring that are not apparent in individual studies. We discuss the benefits of environmental gene screening for allowing unique insights into patterns of microbial biodiversity and providing new raw data for phylogenetic analyses, as well as demonstrating the inherent flaws in this approach.

Factors affecting the analysis of environmental gene data

SSU rRNA genes are relatively easy to sample from both microbial cultures and environmental samples. This is currently the best-sampled gene in microbial eukaryotes for environmental and taxonomic studies. The diverse sampling of this gene family enables effective inference of the evolutionary positions of sampled SSU rDNA sequences using BLAST and phylogenetic methods. However, analyses based solely on SSU rDNA sequences suffer from several limitations. This is especially true for sequences of unidentified origin, as is the case for environmental eukaryotic SSU rDNA library (EE-SSU-L) sequences, which do not have morphological data or alternative gene phylogenies to provide highly desirable additional lines of evidence necessary for robust phylogenetic inference [13]. Three key factors complicate diversity assessment based on sequence analysis of environmental gene libraries: (i) chimaeric sequences; (ii) variability in rRNA cistron copies within a single nucleus or individual; and (iii) systematic artifacts produced by inappropriate phylogenetic methods.

Table 1. Comparison of sampling strategies used in the 13 studies included in this meta-analysis.

Environment | Reference | No. libraries | Filter size (if used)
Synthetic microbial bio-reactor fed by monocultures to create a detritus environment seeded with the microbes from the waters of lake Ketelmeer (Netherlands), a small turbid lake. | [45] | 1 | No filtration indicated
Marine Antarctic polar fronts –250, –500, –2000 and –3000* m (cold oligotrophic, highly oxygenated waters). | [42] | 5 | 5 µm (*unfiltered)
Equatorial Pacific Ocean at a depth of 75 m; oligotrophic. | [12] | 1 | 3 µm
Marine surface samples; Mediterranean, Antarctica and North Atlantic. | [11] | 5 | 2–5 µm
Everest mound region from the Guaymas hydrothermal vent. Sediment cores were sampled from sediment/seawater interface, and sediment. | [8] | 7 | No filtration indicated
Rio Tinto, Spain: pH 2 freshwater river with high concentration of heavy metals (3–20 g iron/litre). | [46] | 6 | No filtration indicated
Anoxic sediments: two marine (1–3 cm deep) plus one freshwater (5 cm deep). Samples were taken at >1 cm below surface of black reducing sediment. | [16] | 4 | No filtration indicated
Suboxic waters and anoxic sediments from a well protected intertidal pool in the Great Sippewisset salt marsh, Cape Cod, MA, USA. The sediment cores were sampled at the sediment/water interface and at a depth of 10 cm. | [41] | 2 | 5 µm
Cariaco basin in the Caribbean Sea on the northern continental margin of Venezuela: a large permanently anoxic basin. Vertical stratification of microbial communities and a clear transition from oxic to anoxic are seen. Three depths sampled at 270, 340 and 900 m, corresponding to oxic, oxic/anoxic interface and the anoxic component of the water column. | [43] | 3 | No filtration indicated
Rainbow hydrothermal sediment (depth 2264 m). Vent-fluid/seawater mixtures from Lucky Strike (depth 1695 m) and Rainbow Chimneys. Micro-colonisers exposed close to vent-fluid emissions for 15 days (Lucky Strike site). | [40] | Not available | 5 µm
Soil samples were collected from six sites on the Antarctic continent; sites corresponded to a latitudinal and environmental gradient between approximately 60 and 87°S. | [47] | 6 | No filtration indicated
Sediment from a small freshwater river (Seymaz, Geneva, Switzerland). | [15] | 1 | No filtration indicated
Oligotrophic freshwater lake in upstate New York, USA. Four sampling sites up to a depth of 20 m (euphotic portion of the water column). | [10] | 4 | 5 µm

Chimaeric sequences

PCR chimaeras occur when partial PCR products undergo recombination during the thermo-cycling conditions of a PCR protocol. These are likely to occur, at least at a low frequency, and are difficult to detect in EE-SSU-L experiments. Environmental PCR is performed using heterogeneous templates providing the opportunity for discrete SSU genes to recombine. Chimaera detection programs [14] seem to be only partially effective when compared with manual alignment methods, leaving many chimaeras undetected [15]. If chimaeras are not excluded from phylogenetic analyses they can occupy unique and intermediate evolutionary positions in gene trees and can potentially be interpreted as deep-branching novel lineages. An analysis of published data [15] focusing on 28 phylotypes previously claimed to represent unique eukaryotic diversity identified 3 chimaeras among them (CS_E042, LEMD145 and LEMD119) [8,16].

Intra-individual ribosomal-RNA polymorphism

Gene conversion and unequal crossing over are thought to reduce and potentially eliminate sequence variation among multi-copy genes such as the rRNA cistron [17]. The relatively few studies that have measured intra-individual SSU rRNA gene polymorphisms show that this character varies unpredictably within and between groups of related organisms [18]. Intra-individual variation can be negligible to low (for example in Symbiodinium (Dinophyceae) [19], Cladophoropsis (Ulvophyceae) and some bacteria [20]) or can be high enough potentially to cause problems when inferring evolutionary relationships between closely related taxa [21]. The suggested causes of intra-individual variation in rDNA sequence are many, including pseudogenes, hybridisation events and introgression [18,22,23], functionally different ribosome populations (e.g. [24,25]), proximity to mini- and microsatellite arrays [26] or ploidy levels [27], and may be generally related to cistron copy number. Identification of levels of rRNA gene polymorphism is essential for sensible interpretation of evolutionary and ecological relationships within phylogenetic clusters of highly similar sequences.

Methodological limitations of phylogenetic trees

Investigation of the evolutionary relationships of environmental SSU rRNA genes requires the construction of a phylogenetic tree using DNA/RNA sequences and should include a broad sampling of the eukaryotic diversity to facilitate precise positioning of environmental sequences. Trees with insufficient taxon sampling can be unstable to the addition of new sequences [28]. EE-SSU-L analyses with increased taxon sampling have led to the radical re-evaluation of the potential novelty of many phylotypes [15,29]. Conversely, DNA/RNA-based analyses encompassing broad eukaryotic sampling often suffer from systematic biases in the datasets, such as base compositional biases [30] and site-specific rate variation [31,32], which can lead to the misplacement of branches on phylogenetic trees. Alignments of highly divergent DNA/RNA sequences include a large number of sites that are mutationally saturated and uninformative for analysis of divergent evolutionary relationships. When comparing ancient eukaryotic relationships the number of saturated positions can far outweigh the number of phylogenetically informative positions, and although phylogenetic methods can be employed to negate systematic biases, many eukaryotic relationships are difficult to resolve with significant terminal bootstrap support, making inferences about the evolutionary grouping of EE-SSU-L sequences hazardous.

[Figure 1: phylogenetic tree of alveolate EE-SSU-L clusters, with the mystery alveolate lineages (MA) and Marine Alveolate Groups I and II marked. Scale bar: 0.01 substitutions/site. Key to environments and source studies: Antarctic soils [47]; acidic freshwater [46]; freshwater-seeded bio-reactor [45]; euphotic freshwater [10]; Seymaz small-river sediments [15]; oxygen-depleted marine [41]; anoxic Cariaco basin [43]; anoxic environment [16]; oceanic surface waters [11]; Antarctic deep-sea [42]; oceanic [12]; benthic hydrothermal vents [8]; hydrothermal vents [40]. Sequences suggested to be chimaeric by Berney et al. [15] are flagged.]
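The effect of mutational saturation discussed under 'Methodological limitations of phylogenetic trees' can be illustrated with a minimal sketch. It assumes the simple Jukes-Cantor (JC69) substitution model, which is not the model used in this review (the meta-analysis relies on LogDet distances), and the function names are hypothetical. Under JC69 the observed proportion of differing sites approaches 0.75 as the true number of substitutions per site grows, so highly divergent sequences carry very little recoverable distance information.

```python
import math


def p_distance(seq_a, seq_b):
    """Observed proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(p):
    """JC69-corrected distance (substitutions/site) for an observed p-distance.

    The correction is undefined once p reaches 0.75 (full saturation),
    so infinity is returned in that case.
    """
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def expected_p(d):
    """Expected observed p-distance after d substitutions/site under JC69."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


if __name__ == "__main__":
    # The observed difference flattens out while the true distance keeps growing.
    for d in (0.1, 0.5, 1.0, 2.0, 4.0):
        print(f"true d = {d:.1f}  expected observed p = {expected_p(d):.4f}")
```

Because the curve flattens near 0.75, small sampling errors in the observed difference translate into very large errors in the corrected distance for deep divergences, which is one way to see why saturated sites are uninformative for ancient relationships.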

Re-analysing the 13 environmental gene libraries

All the experiments reviewed here have followed similar approaches: total DNA was extracted directly from environmental samples, for example sediments, the products of a bioreactor, or size-specific fractions of aquatic communities; environmental clone libraries were then constructed and sampled for sequencing. Primer design, PCR protocols and cloning methodology vary between all experiments, which is likely to confer different biases on each survey. The possible ramifications of experimental intricacies are not discussed here. Some studies used selective filtration protocols (Table 1) to bias the sampling in favour of pico-eukaryotes (0.2–5.0 µm), which are below the size at which light or phase-contrast microscopy is informative for investigating microbial diversity; electron microscopy methods are of limited use owing to the complexity and destructiveness of fixation methods. However, electron microscopy and molecular screening have been used in a combined approach in which sequences from environmental gene libraries provide a starting point for eventual visualisation and morphological characterisation of the cells from which they came [33].

The 13 studies (Table 1) in our meta-analysis represent 49 discrete environmental samples, from which we constructed an alignment of all relevant GenBank nr database sequences. The alignment also included a range of sequences from cultured microbial strains occupying diverse phylogenetic positions and top BLASTn hits for representative environmental sequences. Sequences were aligned manually using the alignment program GDE in batches to a prealigned sample of 20 diverse sequences taken from a general alignment of 206 sequences [10] that also included a mask, enabling us to select the same set of positions from each batch before collation. Closely related phylotypes from the same environment were removed after the preliminary analysis to give a final dataset of 1077 sequences and 717 nucleotides, restricting the alignment to a central core sampled across all 13 experiments. Sequences that were too short to include >90% of this central core were excluded (79 sequences). The alignment was analysed using LogDet distance methods with a neighbour joining starting tree and tree-bisection-reconnection branch swapping algorithm [34]: a sequence rich/model simplistic approach. Models that make use of gamma rate correction are not practicable here, so LogDet models were used as they offer an alternative method for minimising long-branch artefacts and compositional heterogeneity problems [35,36] and have been rarely used in the 13 individual studies. A search was run for 100 days, which produced a tree useful for analysis of local topologies but not of deep eukaryote relationships. One thousand bootstrap replicates were conducted using the BioNJ method employing the same LogDet model. Alveolate and heterokont groups were resampled for individual analyses (Figures 1 and 2).
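To make the LogDet step above more concrete, the following is a minimal sketch of one common normalised form of the LogDet (paralinear) distance between two aligned sequences. It is an illustration only and is not the implementation used for the meta-analysis (which relied on the software cited as [34]); the function names are hypothetical, and the sketch ignores refinements such as the removal of invariant sites. For a pair of sequences, F is the 4x4 matrix of joint base frequencies, and the distance is d = -1/4 [ln det F - 1/2 ln(det Πx det Πy)], where Πx and Πy are the diagonal matrices of base frequencies in each sequence.

```python
import math

BASES = "ACGT"


def divergence_matrix(seq_a, seq_b):
    """4x4 matrix of joint base frequencies over sites where both are A/C/G/T."""
    counts = [[0.0] * 4 for _ in range(4)]
    n = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in BASES and b in BASES:
            counts[BASES.index(a)][BASES.index(b)] += 1
            n += 1
    if n == 0:
        raise ValueError("no comparable sites")
    return [[c / n for c in row] for row in counts]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = 1.0
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-15:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def logdet_distance(seq_a, seq_b):
    """Normalised LogDet (paralinear) distance between two aligned sequences."""
    f = divergence_matrix(seq_a, seq_b)
    det_f = determinant(f)
    freqs_a = [sum(row) for row in f]                             # base frequencies in seq_a
    freqs_b = [sum(f[i][j] for i in range(4)) for j in range(4)]  # base frequencies in seq_b
    if det_f <= 0 or min(freqs_a) == 0 or min(freqs_b) == 0:
        raise ValueError("LogDet distance undefined for this pair")
    log_pi = sum(math.log(x) for x in freqs_a) + sum(math.log(x) for x in freqs_b)
    return -0.25 * (math.log(det_f) - 0.5 * log_pi)
```

Because the distance depends only on the joint frequency matrix and not on an assumed stationary base composition, it is less sensitive to compositional heterogeneity between lineages, which is the property the text appeals to. The sketch raises an error when the determinant is not positive, a situation that can arise with short or highly divergent sequences.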

Taxonomic diversity

Figure 3 shows the relative representations of major taxonomic groups ascertained for the 13 experiments. Interestingly, the relative representation of taxa in terms of diversity is not radically different from that revealed by Berney et al. [15] in the sediments of a European freshwater river, although the relative dominance of fungi is higher in the freshwater study, possibly due to the samples not being filtered. More generally, these proportions might also be influenced by the relative ease of amplification of the different taxonomic groups. For example, Amoebozoa, which are commonly seen in environmental samples and are morphologically diverse [37], are poorly represented in all the surveys, as are Euglenozoa, which include the abundant and genetically diverse kinetoplastids (e.g. Bodo and Neobodo). Other poorly known groups, in terms of both phylogeny and biodiversity, are Apusozoa (apusomonads plus Diphylleia) and the centrohelid heliozoa [38,39]. The latter are not represented at all in our meta-analysis, but are common and diverse in many habitats (S von der Heyden, personal communication), whereas there is more parity in the apparent diversity of Apusozoa between microscope and molecular surveys.

Figure 1. Phylogeny of novel alveolate EE-SSU-L clusters. On the basis of this analysis we propose eight unidentified (‘mystery’) alveolate lineages labelled MA I–MA VIII, although more alveolate sister groups may have been sampled using EE-SSU-L methods (Table 2). Representative sequences from each unique lineage were used for BLASTn searches in addition to a general sampling of alveolate SSU rDNA sequences so that taxon sampling was optimised for the identification of mystery lineages. Bootstrap values above 49% are labelled. Environment types and publications are colour-coded and are identified in the boxed key; similar colours indicate similar environments. Tree and bootstrap replicates were calculated using a LogDet distance method. The alveolate marine group I and II phylogenetic clusters identified in [42] are labelled on the tree.

[Figure 2: phylogenetic tree of heterokont EE-SSU-L lineages, with the mystery heterokont lineages (MH) marked. Scale bar: 0.01 substitutions/site. Key entries: ‘Suggested to be chimaeric by Berney et al. [15]’ and ‘Massana et al. [48]’.]

Novel clades

Divergent sequences retrieved from ‘extreme’ and/or anoxic environments have attracted attention as putative novel taxa specifically adapted to such environments and therefore unlikely to occur in other environments [16,40]. Some of these sites have been targeted for molecular surveys as they are thought to represent environmental conditions present at least since the beginning of eukaryote evolution [16]. It is therefore hypothesized that these environments harbour a reservoir of eukaryotic lineages that have remained isolated from changes in global environmental conditions and could have retained some ancestral characters that have been lost by all other extant eukaryotes [41]. Two long-branch EE-SSU-L lineages, which have no particular affinity to known eukaryotes, have been recovered from three or more separate experiments (e.g. the first group includes Sey017, CS_R003, BOLA048 [listed as BOLA48 in GenBank] and DH148-5-EKD18 [listed as DH148-EKD18 in GenBank] and the second includes DH145-EKD11, CCW75 and LG08-02 [8,10,15,16,41,42]), demonstrating the biological authenticity of the sequences and the wide environmental distribution of these lineages. Although in our analyses the DH148-5-EKD18 group clustered with the Parabasalia with 46% bootstrap support, future work will seek to establish a robust sister group for this clade. Another potential novel higher eukaryotic taxonomic group (i.e. C1_E027) [8] was found to be a sister of the excavate groups carpediemonads and retortomonads [15]; our sequence rich/model simplistic approach was consistent with these findings. After re-evaluation here and elsewhere [15,29], in total we propose 15 EE-SSU-L clades with no clear affinity to known taxa (Table 2). However, the majority of the novel phylogenetic clusters can be grouped within known eukaryotic kingdoms and, in the absence of complementary data to test these groupings, the best tree topology must be taken as the null hypothesis even if the bootstrap support is low. Attempts to describe novel higher taxonomic groupings based on EE-SSU-L sequences alone must be viewed as the alternative hypothesis and should be treated with caution.

Some clades of novel lineages grouping within known eukaryotic ‘kingdoms’ are worthy of mention. Analysis of several marine environments [8,12,41,43] demonstrates a unique phylogenetic cluster branching close to the prasinophytes at the base of the green algae. Further analysis of culture collections and EE-SSU-L clones obtained from oceanic and coastal ecosystems suggests a diversity of seven independent lineages among prasinophytes, including one novel clade (VII) composed mainly of environmental sequences [44]. Van Hannen et al. identified three freshwater clones forming a deep branching clade within, or sister to, the fungi [45]. Ten additional sequences recovered from EE-SSU-L experiments on soil, anoxic sediments and acidic waters [15,16,46,47] also form part of this clade (e.g. Sey019, RT5iin3, MO319, BRKC111), which suggests that its members have a wide environmental distribution.

A large proportion of the novel diversity identified in EE-SSU-L experiments has grouped with or within the chromalveolates (heterokonts plus alveolates), consistent with the high detected diversity of these groups (Figure 3). Phylogenetic analysis of this diversity has led to the proposal of many novel lineages within these taxonomic groups; we have classified these lineages as Mystery Alveolate (MA) I–VIII and Mystery Heterokont (MH) I–XVI (Figures 1 and 2). EE-SSU-L investigations of oceanic pico-eukaryotic diversity revealed two alveolate phylogenetic groupings [12,42], named Marine Alveolate Groups I and II [40,42], which are illustrated in Figure 1. Figure 1 shows a phylogenetic re-evaluation of all novel putative alveolate clades and the positions of the two putative novel marine clades. Both of these encompass a range of sequences from different marine environments. Marine Alveolate Group II groups within the dinoflagellates with Amoebophrya sp. sequences, as other analyses have demonstrated [40], reinforcing the assertion that this clade represents a mostly unknown radiation within the dinoflagellate clade. However, terminal bootstrap support for Marine Alveolate Group I is below 50%, which suggests that with the increased sequence sampling in our analyses the monophyly of this group weakens and this group may actually represent three distinct clades. Our analysis and the sequence rich/simplistic model tree (not shown) demonstrate that several long-branch lineages, suggested to represent unique higher taxonomic groups [8,16], actually group with the gregarines [15,29] (e.g. LEMD002/LEMD003 sequence cluster; Figure 1 and [15]). Two discrete novel apicomplexan lineages were also recovered in our analysis (MA IV and MA III, Figure 1), plus two lineages that group with the alveolates but show no particular affinity to any known alveolate taxa (MA VIII and VII, Figure 1).

Massana et al. [48] demonstrated, using a technique related to that described in [33], that six to eight discrete novel heterokont lineages were present in ocean samples. Our re-analysis of the 13 EE-SSU-Ls suggests there could be up to 16 discrete novel heterokont lineages (Figure 2, MH I–XVI). A recent follow-up study [49] identified a further three novel lineages that are not included in our analyses, suggesting that heterokont diversity is exceptionally high.

Figure 2. Phylogeny of novel heterokont EE-SSU-L lineages. On the basis of this analysis we propose 16 unidentified (‘mystery’) heterokont lineages labelled MH I–MH XVI. Representative sequences from each unique lineage were used for BLASTn searches in addition to a general sampling of heterokont SSU rDNA sequences so that taxon sampling was optimised for the identification of mystery lineages. Bootstrap values above 49% are labelled. Environment types and publications are colour-coded and are identified in the boxed key (see Figure 1); similar colours indicate similar environments. Tree and bootstrap replicates were calculated using a LogDet distance method; grey hatched lines represent topological differences found in the bootstrap consensus tree: MH III and MH VIII are not monophyletic in the topology tree but are monophyletic in all other analyses. The MH XVI lineage was not included in this analysis as it forms a very long branch within the heterokonts (data not shown; see [10,46]).

Figure 3. Higher taxonomic groups detected in 13 environments. Relative diversities of SSU rDNA clones within higher taxonomic groups from 13 environmental gene library experiments (described in Table 1). Colour coding is arbitrary and does not relate to the labeling shown in all other figures.

Group | Relative diversity | Constituent taxa listed in the figure
Heterokonta | 30.38% | Bicosoecida, Labyrinthulida, Oomycetes, Olisthodiscus, Raphidophyta, Dictyochophyta, Pelagophyta, Xanthophyta, Chrysophyta, Oikomonads, Bacillariophyta, Bolidomonas
Alveolata | 26.13% | Dinozoa, Ciliophora, Apicomplexa, Gregarinea, Perkinsea, Colpodella
Opisthokonta | 16.00% | Fungi, Metazoa, Choanozoa
Rhizaria | 9.95% | Cercozoa, Foraminifera, Polycystinea, Acantharea
Viridaeplantae | 5.88% |
Cryptomonada | 2.62% |
Unclassified | 1.99% |
Parabasalian sister group | 1.63% |
Haptophyta | 1.54% |
Euglenozoa | 1.27% |
Amoebozoa | 0.99% |
Carpediemonas sister group | 0.90% |
Jakobea | 0.18% |
Mastigamoeba invertens | 0.18% |
Apusomonads, Oxymonads, Percolozoa, Diphylleia | 0.09% each |

Ecological structuring

Patterns of distribution determined by ecological conditions should be easier to detect in this meta-analysis (Figure 4) than in individual surveys. It is unwise to make generalisations about particular taxa being found in particular environments. Analyses of individual taxa (e.g. individual phyla or groups within phyla [6,49,50,51,52]) suggest that a high level of phylogenetic resolution is required to detect biogeography and environmental selection. However, some of the smaller clades indicated on Figure 4 have only been recovered from one gross environmental category or another (e.g. oceanic or freshwater) although most of them were recovered from more than one library, which suggests that they are real biological entities.

Table 2. List of environmental eukaryotic phylotypes with unstable or unclear phylogenetic affinities to known taxa (novel clades with unresolved eukaryotic affinities) (a).

Example from each publication | Accession number | Reference | Number of clones (b) | Notes
LG08-04 | AY919706 | [10] | 4* | Glaucocystophyceae/cryptomonad sister [10]
CCA32 | AY179990 | [41] | 1* | Possible dinoflagellate
CCW35 | AY180022 | [41] | 1* | Possible fungus [41]
D225 | AY256326 | [43] | 1* | Possible alveolate [43]
LG15-08 | AY919735 | [10] | 1* | Possible alveolate [10]
LG25-05 | AY919771 | [10] | 1 | Possible alveolate (Figure 1) [10]
OLI11005 | ECL402349 | [12] | 1 | Possible alveolate (Figure 1) [5]
C2_E029 | AY046819 | [8] | 2 | Possible alveolate (Figure 1) [8]
AT4-40 | AF530540 | [40] | 1* | Possible alveolate [40]
LG08-02 | AY919704 | [10] | 4* | Possible alveolate [10]
CCW75 | AY180032 | [41] | 2* | Possible alveolate [10,37]
DH145-EKD11 | AF290065 | [42] | 1* | Possible alveolate [41]
AT4-68 | AF530543 | [40] | 1* | No suggested sister relationship
RT5iin14 | AY082985 | [46] | 2* | No suggested sister relationship
BOLA176 | AF372786 | [16] | 1* | Possible alveolate [16]
BAQA58 | AF372761 | [16] | 1* | Possible cercozoan [41]
CCW46 | AY180023 | [41] | 1* | Possible cercozoan [41]
C2 E008 | AY046799 | [8] | 12 | Retortomonas/Carpediemonas group (this analysis and [15])
BOLA048 (c) | AF372821 | [16] | 17 | Possible parabasalian sister group (this analysis)
CS R003 | AY046643 | [8] | 1 | Possible parabasalian sister group (this analysis)
DH148-5-EKD18 (d) | AF290084 | [42] | 1 | Possible parabasalian sister group (this analysis)

(a) The clade numbers (1–15) in the left-hand column of the published table are arbitrary; they are not reproduced here.
(b) Asterisk indicates that this is listed as unclassified in Figure 4.
(c) BOLA048 is listed as BOLA48 in GenBank.
(d) DH148-5-EKD18 is listed as DH148-EKD18 in GenBank.

The most discriminating environmental difference appears to be that between marine and non-marine; out of the 66 groups represented by bars on Figure 4, at least 8 were recovered only from freshwater and 31 only from marine surveys, 12 only from anoxic samples, and only 2 were found solely at deep-sea sites. Note also the cases in which a group appears to have higher genetic diversity in one environment type; for example, diatoms are lineage-rich in marine as opposed to non-marine environments, whereas the opposite is true for chrysophytes and oikomonads. The subset of lineages recovered from the acidic freshwater environment [46] overlaps strongly with that from other freshwater environments [10,45]. This suggests that many freshwater protists can survive at a wide range of pH conditions, or that evolutionary transitions between lineages adapted to different degrees of acidity are relatively frequent, at least in some lineages. Very few effectively identical sequences were retrieved from both acidic and more neutral environments. However, effectively identical sequences from different oxygen-depleted environments are common, and often cluster together to the exclusion of those from oxygen-rich sites (Figure 5a,b).
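The tallies quoted above (groups recovered only from freshwater, only from marine surveys, and so on) amount to counting groups whose sequences all come from a single environment category. A minimal sketch of that bookkeeping follows, assuming each group has already been annotated with the set of categories in which it was found. The function name, the category labels and the toy data are hypothetical and are not taken from Figure 4.

```python
from collections import Counter


def exclusive_group_counts(presence):
    """Count groups recovered from exactly one environment category.

    presence maps a group name to the set of environment categories
    in which at least one of its sequences was recovered.
    """
    counts = Counter()
    for group, environments in presence.items():
        if len(environments) == 1:
            # The group is exclusive to its single category.
            counts[next(iter(environments))] += 1
    return counts


# Hypothetical toy data, not taken from Figure 4.
presence = {
    "clade_A": {"marine"},
    "clade_B": {"marine", "freshwater"},
    "clade_C": {"freshwater"},
    "clade_D": {"marine"},
    "clade_E": {"marine", "anoxic"},
}

counts = exclusive_group_counts(presence)
for environment in sorted(counts):
    print(environment, counts[environment])
```

In the meta-analysis itself the categories overlap (an anoxic site can also be marine), so a real tally would need an explicit rule for how such sites are labelled; the sketch simply treats every label as a separate category.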

In general the phylogenetic groups found in only one environment type are represented by relatively few sequences; any inferences drawn from such small sample sizes are compromised by gross undersampling, although it is perhaps more likely that narrower phylogenetic groups may be specialized to a particular set of ecological conditions. However, our meta-analysis is limited by the nature of its constituent studies. Such limitations are well illustrated by the fact that foraminiferans were only detected in anoxic environments and yet proliferate in many marine and freshwater habitats (although the atypical SSU rRNA genes of this group will probably bias against their detection in general eukaryote-wide surveys; a foraminiferan-specific survey detected the presence of Foraminifera in every sediment sample tested) [7]. Nonetheless, this approach provides an optimal resource for identifying taxa of interest for further study. An intriguing example of this is that most of the larger taxonomic groups have deep-sea representatives, with the surprising exception of Cercozoa, members of which are numerous and diverse in all other environments screened [6]. Similar, but phylogenetically finer-scale, patterns are revealed by the sequence-rich/model-simplistic tree


Figure 4

Phylogenetic grouping of 18S clones from thirteen environmental gene libraries. [Bar chart. Vertical axis: number of sequences sampled (0 to 120). Horizontal axis: phylogenetic groupings.]

Key: Antarctic soils [47•]; Acidic freshwater [46]; Freshwater seeded bio-reactor [45]; Euphotic freshwater [10]; Seymaz small-river sediments [15••]; Oxygen-depleted marine [41]; Anoxic Cariaco basin [43]; Anoxic environment [16]; Oceanic surface waters [11]; Antarctic deep-sea [42]; Oceanic [12]; Benthic hydrothermal vents [8]; Hydrothermal vents [40].

Phylogenetic groupings (horizontal axis labels, with bootstrap support): UNCLASSIFIED (P) PARABASALIAN SISTER GROUP* (100%) PERCOLOZOA (72%) CARPEDIEMONAS SISTER GROUP (M) JAKOBEA (55%) EUGLENOZOA (79%) ACANTHAREA (51%) POLYCYSTINEA (100%) CERCOZOA (M) FORAMINIFERA (100%) GREGARINA (P) (55%/69%) (M/85%) COLPODELLA (M/54%) DINOZOA (M) MYSTERY ALVEOLATE CLADE VIII (M/M) MYSTERY ALVEOLATE CLADE VII (M/M) MYSTERY ALVEOLATE CLADE VI (M/87%) MYSTERY ALVEOLATE CLADE V (M/84%) MYSTERY ALVEOLATE CLADE IV COCCIDIA (58%/?) MYSTERY ALVEOLATE CLADE III (100%/100%) MYSTERY ALVEOLATE CLADE II (96%/99%) MYSTERY ALVEOLATE CLADE I (93%/99%) PERKINSEA (M/84%) CILIOPHORA (M) DIPHYLLEIA (M) OXYMONADIDA (50%) APUSOMONADS (M) MASTIGAMOEBA INVERTENS (76%) ICHTHYOSPOREA (M) CHOANOZOA (M) METAZOA (70%) UNDETERMINED FUNGAL LINEAGE (M) FUNGI (M) AMOEBOZOA (P) CRYPTOMONADA (M) NUCLEOMORPH (M) HAPTOPHYTA (100%) STREPTOPHYTA (M) PRASINOPHYTA (M) VIRIDAEPLANTAE (87%) BICOSOECIDA (61%/75%) LABYRINTHULIDA (63%/64%) MYSTERY HETEROKONT CLADE XVI (100%) MYSTERY HETEROKONT CLADE XV (M/94%) MYSTERY HETEROKONT CLADE XIV (-/-) MYSTERY HETEROKONT CLADE XIII (NM II) (99%/100%) MYSTERY HETEROKONT CLADE XII (80%/99%) MYSTERY HETEROKONT CLADE XI (NM III) (96%/97%) MYSTERY HETEROKONT CLADE X (-/-) MYSTERY HETEROKONT CLADE IX (59%/82%) MYSTERY HETEROKONT CLADE VIII (-/63%) MYSTERY HETEROKONT CLADE VII (NM I) (M/91%) MYSTERY HETEROKONT CLADE VI (M/53%) MYSTERY HETEROKONT CLADE V (-/59%) MYSTERY HETEROKONT CLADE IV (NM VI) (-/60%) MYSTERY HETEROKONT CLADE III (NM VII) (-/56%) MYSTERY HETEROKONT CLADE II (NM IV) (100%/100%) MYSTERY HETEROKONT CLADE I (NM V) (56%/97%) OOMYCETES (100%) OLITHODISCUS (-/100%) RAPHIDOPHYTA (86%/100%) DICTYOCHOPHYTA (-/100%) PELAGOPHYTA (100%/100%) XANTHOPHYTA (M/77%) CHRYSOPHYTA + OIKOMONAS (M) BACILLARIOPHYTA (M/64%) BOLIDOMONAS (52%/95%)

Relative representation of major phylogenetic groupings and novel lineages across the meta-analysis. Bootstrap values above 49% are included; M indicates a monophyletic relationship and P indicates a polyphyletic relationship seen in both best topology and bootstrap trees. Secondary bootstrap values are derived from the alveolate and heterokont analyses (Figures 1 and 2). Environment types and publications are colour-coded and are identified in the boxed key; similar colours indicate similar environments.

(not shown) and can be inferred from Figures 1 and 2, which show representative phylotypes and environment distribution diversity for the heterokont and alveolate mystery lineages. Two types of pattern are evident: first, sequences from similar environment types group together, suggesting an ecological specialisation in these lineages at this taxonomic level [6,51]; and second, groups of sequences retrieved from the same library often cluster together, from which it could be inferred that the organisms represented by those lineages are endemic to a region including that sampling site. However, such patterns are potentially artifacts of undersampling [6], or may result from PCR recombination between closely related sequences and/or high levels of intra-individual rRNA variation.

Identical sequences retrieved from multiple libraries
Figure 5 shows examples of effectively identical sequences recovered from more than one library. Although the sequences in each cluster are often not literally identical, the variation between them is at a level below which real inter-strain genetic differences cannot be distinguished from the combined effects of rRNA


Figure 5

[Panels (a) to (c): parsimony trees of environmental clones and closely related named species, each with a scale bar giving the number of nucleotide changes over the number of positions sampled. Panel (d): box showing the Cercomonas sp. clone diversity and the colour-coded key.]

Key: Antarctic soils [47•]; Acidic freshwater [46]; Freshwater seeded bio-reactor [45]; Euphotic freshwater [10]; Seymaz small-river sediments [15••]; Oxygen-depleted marine [41]; Anoxic Cariaco basin [43]; Anoxic environment [16]; Oceanic surface waters [11]; Antarctic deep-sea [42]; Oceanic [12]; Benthic hydrothermal vents [8]; Hydrothermal vents [40].

Highly related and effectively identical sequences recovered from different SSU rRNA environmental gene libraries. Combined phylogenetic analyses of 13 gene library experiments revealed 37 closely related phylogenetic clusters comprising sequences from two or more EE-SSU-L experiments. BLASTn similarity searches of the GenBank nr database were used to sample highly related SSU sequences from known species (where available) and unpublished environmental SSU sequences. Alignments were constructed for all 37 groups of sequences. The alignments were masked to remove regions that were not present in one or more of the environmental clones and sequence positions that were clearly a product of sequencing error. For example, if all other sequences contained a run of five Ts and only one sequence possessed six Ts, then the sixth position was masked out, as it was hypothesized to be a sequencing error. As all sequences in each grouping were highly related, it was not necessary to use the mask to remove hyper-variable sequence regions. Parsimony searches were used to investigate the phylogeny of all 37 groups using 10 replicate heuristic searches with stepwise addition and the tree-bisection-reconnection (TBR) method. Gaps were encoded as a fifth character. Fourteen phylogenies showed environmental sequences from different environments that differed at fewer than 10 in 1000 positions (a–c). The alphanumeric code for each environmental sequence was assigned by the original authors and can be used as a nucleotide search term in GenBank. Grey curves indicate clusters of effectively identical phylotypes. A coloured dot next to each environmental sequence relates to the key and indicates the publication and environment type. A blank dot indicates that the sequence is available from GenBank but has no accompanying publication. The scale bars show the number of nucleotide changes/number of nucleotide positions sampled for each analysis.
(d) The box shows (i) the SSU rDNA sequence diversity observed from a clonal monoculture of Cercomonas sp. using standard environmental gene library methods (Shapiro and Bass, unpublished; see [6] for experimental details) and (ii) a colour-coded key for the environmental gene sequences; similar colours indicate similar environments. Phylogenies are numbered from left to right (a) 1–3, (b) 4–13, and (c) 14.
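The masking and the "fewer than 10 in 1000 positions" criterion described in this legend can be illustrated with a small sketch. It is not the authors' pipeline (the legend describes parsimony searches, which are not reproduced here). It assumes pre-aligned sequences of equal length, uses `?` as a hypothetical symbol for regions absent from a clone, keeps `-` as a fifth character state, and approximates the homopolymer rule by masking any column in which a single sequence carries a base and all others carry a gap. All function names and the toy sequences are illustrative.

```python
MISSING = "?"  # assumed symbol for positions not sequenced in a clone
GAP = "-"      # alignment gap, kept as a fifth character state


def mask_alignment(alignment):
    """Return the alignment with masked columns removed.

    A column is masked if any sequence is missing there, or if exactly
    one sequence has a base while all others have a gap (treated here
    as a likely sequencing error, e.g. one extra T in a run of Ts).
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[n]) != length for n in names):
        raise ValueError("all aligned sequences must have the same length")
    keep = []
    for i in range(length):
        column = [alignment[n][i] for n in names]
        if MISSING in column:
            continue
        bases = [c for c in column if c != GAP]
        if len(names) > 2 and len(bases) == 1:
            continue
        keep.append(i)
    return {n: "".join(alignment[n][i] for i in keep) for n in names}


def differences_per_1000(seq_a, seq_b):
    """Differing positions per 1000 compared; a gap counts as a state."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return 1000.0 * diffs / len(seq_a)


def effectively_identical(seq_a, seq_b, threshold=10.0):
    """Apply the legend's criterion: fewer than 10 differences per 1000."""
    return differences_per_1000(seq_a, seq_b) < threshold


# Hypothetical toy alignment; real SSU alignments span >1000 positions.
alignment = {
    "cloneA": "ACGTTTTT-ACGTACGT??",
    "cloneB": "ACGTTTTTTACGTACGTAC",
    "cloneC": "ACGTTTTT-ACGAACGTAC",
}
masked = mask_alignment(alignment)
print(len(masked["cloneA"]))
print(effectively_identical(masked["cloneA"], masked["cloneB"]))
print(effectively_identical(masked["cloneA"], masked["cloneC"]))
```

With sequences this short a single mismatch already exceeds the threshold, which is one reason the criterion is only meaningful over the roughly 500 to 1300 positions sampled in the Figure 5 analyses.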


polymorphism and PCR/sequencing error. The majority of the sites represented are in the northern hemisphere, from America eastwards to the Azores. A property uniting members of most of the clusters is that they were recovered from oxygen-depleted sites. Such sites from other parts of the world should be surveyed to see whether anoxia-adapted lineages are able to disperse through potentially harsh (for obligate anaerobes) oxygen-rich conditions frequently and far enough to have cosmopolitan distributions, or whether the relative patchiness of their required habitat conditions provides barriers to gene flow and dispersal, resulting in local genetic radiations. Figure 5 (a,b) suggests that the latter is not true. Effectively identical SSU rDNA sequences have been recovered from different oceans (Figure 5b) and environment types (Figure 5c), which suggests that these organisms are likely to be broadly ecologically tolerant and/or dispersed on a global scale (see [53] for a morphological study of cosmopolitanism and [54] for proposed global ubiquity of bacteria in terms of SSU rDNA sequence identity).

Ubiquity or biogeography?
There is currently little consensus about the nature of protist population and community structure. One hypothesis is that species below a certain size are cosmopolitan: estimates for this threshold size range from 100 mm to 10 mm [55,56]; however, results are currently inconclusive. Cosmopolitanism is predicted because such organisms are small and abundant enough to be distributed on a global scale. This debate is complicated by the likely differences in the ease with which different microbial lineages are dispersed, which may depend on abilities to encyst and to survive suboptimal conditions, for example drought or variations in salinity. The definition of the biological units for which geographical ranges are estimated is also key to this debate. In many cases, it has been shown that morphologically identical individuals (morphospecies) can be isolated from sites scattered around the world [56], but as other studies have shown that a single morphospecies may harbour high levels of genetic diversity (e.g. [57–59]), the degree of resolution offered by morphology as a marker for measuring distribution patterns of genetically distinct entities is clearly too low. Although some studies have also demonstrated that identical SSU rDNA sequences can be retrieved from sites distributed on a global scale (even if their corresponding ITS sequences cannot) [6,59], this does not appear to be true for all taxa. Indeed, biogeographical structuring has been proposed at both molecular and morphological levels [7,60–64]. Our meta-analysis reveals 14 instances (Figure 5) of effectively identical groups of SSU rDNA sequences being retrieved from two or more different environmental analyses (an additional 23 clusters were also detected but had a lower level of sequence identity and are not shown; see Figure 5 legend). We are unable to say whether those sequences recovered from only one site, area or environment type have restricted distributions until studies targeting each group individually have been carried out at a range of global sites.

Conclusions
Although there are more environments to probe using approaches similar to those considered in this review, the results from the surveys to date suggest that, while new lineages will continue to be found for the foreseeable future, most of these will belong to well-established higher taxonomic groups. As the genetic diversity of free-living protists is so high, clear patterns of ecological and geographical distribution are unlikely to be revealed by eukaryote-wide SSU approaches. A much finer scale of phylogenetic resolution (within genera or ‘species’) is required if truly revealing comparisons of biodiversity and distribution between environments are to be made, using a faster-evolving gene than SSU rRNA, and screening a comprehensive and systematic selection of sites on a global scale. In terms of investigating protist diversity and evolutionary relationships, it is clear that EE-SSU-L sequences in isolation can be hazardous to interpret and that such studies should be complemented by morphological, ultrastructural and amino acid phylogenetic analyses.

Acknowledgements
Both authors contributed equally to this paper. TAR is supported by the Biotechnology and Biological Sciences Research Council (BBSRC) and DB is supported by the Natural Environment Research Council (NERC). We thank J Shapiro for access to his data.

References and recommended reading
Papers of particular interest, published within the annual period of review, have been highlighted as: • of special interest; •• of outstanding interest.

1. Olsen GJ, Lane DJ, Giovannoni SJ, Pace NR, Stahl DA: Microbial ecology and evolution: a ribosomal RNA approach. Annu Rev Microbiol 1986, 40:337-365.
2. Giovannoni SJ, Britschgi TB, Moyer CL, Field KG: Genetic diversity in Sargasso Sea bacterioplankton. Nature 1990, 345:60-63.
3. Amann RI, Ludwig W, Schleifer KH: Phylogenetic identification and in situ detection of individual microbial cells without cultivation. Microbiol Rev 1995, 59:143-169.
4. Fuhrman JA, McCallum K, Davis AA: Novel major archaebacterial group from marine plankton. Nature 1992, 356:148-149.
5. Moon-van der Staay SY, van der Staay GWM, Guillou L, Vaulot D, Claustre H, Medlin LK: Abundance and diversity of prymnesiophytes in the picoplankton community from the equatorial Pacific Ocean inferred from 18S rDNA sequences. Limnol Oceanogr 2000, 45:98-109.
6. Bass D, Cavalier-Smith T: Phylum-specific environmental DNA analysis reveals remarkably high global biodiversity of Cercozoa (Protozoa). Int J Syst Evol Microbiol 2004, 54:2393-2404.
This study samples 41 environmental SSU rDNA libraries from varied environments and global sites. It shows that Cercozoa are far more diverse than previously thought and include many uncharacterised groups, and attempts to describe geographical and ecological patterns in their distribution.
7. Holzmann M, Habura A, Giles H, Bowser SS, Pawlowski J: Freshwater foraminiferans revealed by analysis of environmental DNA samples. J Eukaryot Microbiol 2003, 50:135-139.
Novel groups of freshwater foraminiferans are revealed by molecular probing, indicating geographically separate colonisation events of non-marine environments, and a previously unknown component of freshwater ecosystems.
8. Edgcomb VP, Kysela DT, Teske A, de Vera Gomez A, Sogin ML: Benthic eukaryotic diversity in the Guaymas Basin hydrothermal vent environment. Proc Natl Acad Sci USA 2002, 99:7658-7662.
9. Moreira D, López-García P: The molecular ecology of microbial eukaryotes unveils a hidden world. Trends Microbiol 2002, 10:31-38.
10. Richards TA, Vepritskiy AA, Gouliamova D, Nierzwicki-Bauer SA: The molecular diversity of freshwater picoeukaryotes from an oligotrophic lake reveals diverse, distinctive and globally dispersed lineages. Environ Microbiol 2005, in press.
11. Diez B, Pedrós-Alió C, Massana R: Study of genetic diversity of eukaryotic picoplankton in different oceanic regions by small-subunit rRNA gene cloning and sequencing. Appl Environ Microbiol 2001, 67:2932-2941.
12. Moon-van der Staay SY, De Wachter R, Vaulot D: Oceanic 18S rDNA sequences from picoplankton reveal unsuspected eukaryotic diversity. Nature 2001, 409:607-610.
13. Taylor FJ: Ultrastructure as a control for protistan molecular phylogeny. Am Nat 1999, 154:S125-S136.
14. Cole JR, Chai B, Marsh TL, Farris RJ, Wang Q, Kulam SA, Chandra S, McGarrell DM, Schmidt TM, Garrity GM et al.: The Ribosomal Database Project (RDP-II): previewing a new autoaligner that allows regular updates and the new prokaryotic taxonomy. Nucleic Acids Res 2003, 31:442-443.
15. •• Berney C, Fahrni J, Pawlowski J: How many novel eukaryotic ‘kingdoms’? Pitfalls and limitations of environmental DNA surveys. BMC Biol 2004, 2:13.
This study combines an excellent systematic review of claims made for the potential taxonomic novelty of some environmental eukaryotic small subunit library (EE-SSU-L) phylotypes using thorough methodologies and an EE-SSU-L investigation into freshwater sediments.
16. Dawson SC, Pace NR: Novel kingdom-level eukaryotic diversity in anoxic environments. Proc Natl Acad Sci USA 2002, 99:8324-8329.
17. Dover GA, Strachan T, Coen ES, Brown SD: Molecular drive. Science 1982, 218:1069.
18. Coyer JA, Smith GJ, Andersen RA: Evolution of Macrocystis spp. (Phaeophyceae) as determined by ITS1 and ITS2 sequences. J Phycol 2001, 37:574-585.
19. Baillie BK, Belda-Baillie CA, Tadashi M: Conspecificity and Indo-Pacific distribution of Symbiodinium genotypes (Dinophyceae) from giant clams. J Phycol 2000, 36:1153-1161.
20. Coenye T, Vandamme P: Intragenomic heterogeneity between multiple 16S ribosomal RNA operons in sequenced bacterial genomes. FEMS Microbiol Lett 2003, 228:45-49.
21. Pawlowski J, Fahrni J, Bowser SS: Phylogenetic analysis and genetic diversity of Notodendrodes hyalinosphaira. J Foraminiferal Res 2002, 32:173-176.
22. Márquez LM, Miller DJ, MacKenzie JB, van Oppen MJH: Pseudogenes contribute to the extreme diversity of nuclear ribosomal DNA in the hard coral Acropora. Mol Biol Evol 2003, 20:1077-1086.
23. O’Donnell K, Cigelnik E: Two divergent intragenomic rDNA ITS2 types within a monophyletic lineage of the fungus Fusarium are nonorthologous. Mol Phylogenet Evol 1997, 7:103-116.
24. Thompson J, van Spaendonk RML, Choudhuri SR, Janse CJ, Waters AP: Heterogeneous ribosome populations are present in Plasmodium berghei during development in its vector. Mol Microbiol 1999, 31:253-260.
25. Rooney AP: Mechanisms underlying the evolution and maintenance of functionally heterogeneous 18S rRNA genes in apicomplexans. Mol Biol Evol 2004, 21:1704-1711.
26. Kupriyanova NS, Nechvolodov KK, Kirilenko PM, Kapanadze BI, Yankovskii NK, Ryslov AP: Intragenomic polymorphism of ribosomal RNA in human chromosome 13. Mol Biol (Mosk) 1996, 30:51-60.
27. Krieger JFP: Evidence of multiple alleles of the nuclear 18S ribosomal RNA gene in sturgeon (Family: Acipenseridae). J Appl Ichthyol 2002, 18:290-297.
28. Horner DS, Embley TM: Chaperonin 60 phylogeny provides further evidence for secondary loss of mitochondria among putative early-branching eukaryotes. Mol Biol Evol 2001, 18:1970-1975.
29. Cavalier-Smith T: Only six kingdoms of life. Proc R Soc Lond B Biol Sci 2004, 271:1251-1262.
This study shows how the use of increased taxon sampling in phylogenetic analyses can identify known relatives in recognized protozoan phyla for many of the ‘mysterious’ lineages described in individual environmental gene surveys.
30. Foster PG, Hickey DA: Compositional bias may affect both DNA-based and protein-based phylogenetic reconstructions. J Mol Evol 1999, 48:284-290.
31. Lockhart PJ, Larkum AW, Steel M, Waddell PJ, Penny D: Evolution of chlorophyll and bacteriochlorophyll: the problem of invariant sites in sequence analysis. Proc Natl Acad Sci USA 1996, 93:1930-1934.
32. Hirt RP, Logsdon JM Jr, Healy B, Dorey MW, Doolittle WF, Embley TM: Microsporidia are related to Fungi: evidence from the largest subunit of RNA polymerase II and other proteins. Proc Natl Acad Sci USA 1999, 96:580-585.
33. Stoeck T, Fowle WH, Epstein SS: Methodology of protistan discovery: from rRNA detection to quality scanning electron microscope images. Appl Environ Microbiol 2003, 69:6856-6863.
This study describes a technique that enables fine microscopy studies of many organisms previously known exclusively by their SSU rRNA sequences. Based on sequences recovered from environmental gene libraries, 18S rRNA-targeted fluorochrome-labeled probes can be used to ‘find’ the cells from which they came, which can then be viewed by SEM to reveal their diagnostic morphological characteristics.
34. Swofford DL: PAUP*: Phylogenetic Analysis Using Parsimony (*and Other Methods). Sinauer Associates; 1998. URL: http://www.sinauer.com/detail.php?id=8060
35. Lockhart PJ, Steel MA, Hendy MD, Penny D: Recovering evolutionary trees under a more realistic model of sequence evolution. Mol Biol Evol 1994, 11:605-612.
36. Lake JA: Reconstructing evolutionary trees from DNA and protein sequences: paralinear distances. Proc Natl Acad Sci USA 1994, 91:1455-1459.
37. Cavalier-Smith T, Chao EE, Oates B: Molecular phylogeny of Amoebozoa and the evolutionary significance of the unikont Phalansterium. Eur J Protistol 2004, 40:21-48.
38. Bass D, Moreira D, López-García P, Polet S, Chao EE, von der Heyden S, Pawlowski J, Cavalier-Smith T: Polyubiquitin insertions and the phylogeny of Cercozoa and Rhizaria. Protist 2005, 156: in press.
39. Nikolaev SI, Berney C, Fahrni JF, Bolivar I, Polet S, Mylnikov AP, Aleshin VV, Petrov NB, Pawlowski J: The twilight of Heliozoa and rise of Rhizaria, an emerging supergroup of amoeboid eukaryotes. Proc Natl Acad Sci USA 2004, 101:8066-8071.
40. López-García P, Philippe H, Gail F, Moreira D: Autochthonous eukaryotic diversity in hydrothermal sediment and experimental microcolonizers at the Mid-Atlantic Ridge. Proc Natl Acad Sci USA 2003, 100:697-702.
41. Stoeck T, Epstein S: Novel eukaryotic lineages inferred from small-subunit rRNA analyses of oxygen-depleted marine environments. Appl Environ Microbiol 2003, 69:2657-2663.
42. López-García P, Rodríguez-Valera F, Pedrós-Alió C, Moreira D: Unexpected diversity of small eukaryotes in deep-sea Antarctic plankton. Nature 2001, 409:603-607.
43. Stoeck T, Taylor GT, Epstein SS: Novel eukaryotes from the permanently anoxic Cariaco Basin (Caribbean Sea). Appl Environ Microbiol 2003, 69:5656-5663.
44. Guillou L, Eikrem W, Chretiennot-Dinet MJ, Le Gall F, Massana R, Romari K, Pedrós-Alió C, Vaulot D: Diversity of picoplanktonic prasinophytes assessed by direct nuclear SSU rDNA sequencing of environmental samples and novel isolates retrieved from oceanic and coastal marine ecosystems. Protist 2004, 155:193-214.
45. van Hannen EJ, Mooij W, van Agterveld MP, Gons HJ, Laanbroek HJ: Detritus-dependent development of the microbial community in an experimental system: qualitative analysis by denaturing gradient gel electrophoresis. Appl Environ Microbiol 1999, 65:2478-2484.
46. Amaral Zettler LA, Gomez F, Zettler E, Keenan BG, Amils R, Sogin ML: Microbiology: eukaryotic diversity in Spain’s River of Fire. Nature 2002, 417:137.
47. • Lawley B, Ripley S, Bridge P, Convey P: Molecular analysis of geographic patterns of eukaryotic diversity in Antarctic soils. Appl Environ Microbiol 2004, 70:5963-5972.
This is the first study of its kind on soil environments and attempts to assess changes in eukaryotic microbial diversity across a transect of the Antarctic.
48. Massana R, Guillou L, Diez B, Pedrós-Alió C: Unveiling the organisms behind novel eukaryotic ribosomal DNA sequences from the ocean. Appl Environ Microbiol 2002, 68:4554-4558.
49. Massana R, Castresana J, Balagué V, Guillou L, Romari K, Groisillier A, Valentin K, Pedrós-Alió C: Phylogenetic and ecological analysis of novel marine stramenopiles. Appl Environ Microbiol 2004, 70:3528-3534.
50. Smirnov A, Thar R: Spatial distribution of gymnamoebae (Rhizopoda, Lobosea) in brackish-water sediments at the scale of centimeters and millimetres. Protist 2003, 154:359-369.
51. von der Heyden S, Chao EE, Cavalier-Smith T: Genetic diversity of goniomonads: an ancient divergence between marine and freshwater species. Eur J Phycol 2004, 39:343-350.
This study reveals high genetic diversity within morphospecies of Goniomonas, a phagotrophic relative of the cryptophytes, and shows a deep (several hundred million years) divergence between marine and freshwater-adapted strains.
52. Esteban GF, Finlay BJ: Cryptic freshwater ciliates in a hypersaline lagoon. Protist 2003, 154:411-418.
53. Finlay BJ, Clarke KJ: Apparent global ubiquity of species in the protist genus Paraphysomonas. Protist 1999, 150:419-430.
54. Zwart G, Hiorns WD, Methé BA, van Agterveld MP, Huismans R, Nold SC, Zehr JP, Laanbroek HJ: Nearly identical 16S rRNA sequences recovered from lakes in North America and Europe indicate the existence of clades of globally distributed freshwater bacteria. Syst Appl Microbiol 1998, 21:546-556.
55. Wilkinson DM: What is the upper size limit for cosmopolitan distribution in free-living microorganisms? J Biogeogr 2001, 28:285-291.
56. Finlay BJ: Global dispersal of free-living microbial eukaryote species. Science 2002, 296:1061-1063.
57. Bowers NJ, Pratt JR: Estimation of genetic variation among soil isolates of Colpoda inflata (Stokes) (Protozoa: Ciliophora) using the polymerase chain reaction and restriction fragment length polymorphism analysis. Arch Protistenkd 1995, 145:29-36.
58. de Vargas C, Bonzon M, Rees NW, Pawlowski J, Zaninetti L: A molecular approach to biodiversity and biogeography in the planktonic foraminifer Globigerinella siphonifera (d’Orbigny). Mar Micropaleontol 2002, 45:101-116.
59. Medlin LK, Lange M, Edvardsen B, Larsen A: Cosmopolitan haptophyte flagellates and their genetic links. In The Flagellates: Unity, Diversity, and Evolution. Edited by Leadbeater BSC, Green JC. Taylor and Francis; 1991:288-308.
60. Darling KF, Kucera M, Pudsey CJ, Wade CM: Molecular evidence links cryptic diversification in polar planktonic protists to Quaternary climate dynamics. Proc Natl Acad Sci USA 2004, 101:7657-7662.
Strong evidence of allopatric and ecological evolutionary processes in the planktonic marine foraminiferan Neogloboquadrina.
61. Foissner W, Strüder-Kypke M, van der Staay GWM, Moon-van der Staay S-Y, Hackstein JHP: Endemic ciliates (Protozoa, Ciliophora) from tank bromeliads (Bromeliaceae): a combined morphological, molecular, and ecological study. Eur J Protistol 2003, 39:365-372.
62. Kim E, Wilcox L, Graham L, Graham J: Genetically distinct populations of the dinoflagellate Peridinium limbatum in neighboring Northern Wisconsin lakes. Microb Ecol 2004, 48:521-527.
63. Skotarczak B, Przybos E, Wodecka B, Maciejewska A: Sibling species within Paramecium jenningsi revealed by RAPD. Acta Protozool 2004, 43:29-35.
64. Fokin SI, Przybos E, Chivilev SM, Beier CL, Horn M, Skotarczak B, Wodecka B, Fujishima M: Morphological and molecular investigations of Paramecium schewiakoffi sp. nov. (Ciliophora, Oligohymenophorea) and current status of distribution and taxonomy of Paramecium spp. Eur J Protistol 2004, 40:225-243.
Molecular analysis of sexual species complexes within the genus Paramecium shows that these biological species have restricted distributions and strong biogeographical structuring. This, along with refs [7,60,61–63], is representative of a growing body of evidence refuting Finlay’s [56] generalisation that microbial eukaryotes below a certain size have cosmopolitan distributions.