Review
TRENDS in Ecology and Evolution
Vol.20 No.4 April 2005
Tackling the population genetics of clonal and partially clonal organisms
Fabien Halkett1, Jean-Christophe Simon1 and François Balloux2
1 UMR INRA/AgroCampus de Rennes Biologie des Organismes et des Populations appliquée à la Protection des Plantes (BIO3P) INRA B.P. 35327, 35653 Le Rheu Cedex, France
2 Theoretical and Molecular Population Genetics Lab, Department of Genetics, University of Cambridge, Downing Street, Cambridge, UK, CB2 3EH
Many clonal organisms experience occasional events of sexual recombination, with profound consequences for their population dynamics and evolutionary trajectories. With the recent development of polymorphic genetic markers and new statistical methods, we now have an unprecedented ability to detect recombination in organisms that are thought to reproduce strictly, or essentially asexually. However, it is not always obvious which methodology to apply. Consequently, biologists might decide how to analyse their data without clear guidelines. Here, we discuss the available methods, focusing on those best suited when working with limited genetic information, such as a few genetic markers or DNA sequences. We conclude by commenting on the prospects offered by some recent conceptual advances and the access to high throughput technologies in an increasing number of model organisms. Organisms are often classified by the presence or absence of sex, although the actual situation is far more complex, with evidence of limited sexual recombination in many apparently asexual species [1,2]. This limited recombination category includes several medically and economically important groups, such as parasitic protozoa (e.g. Trypanosoma and Leishmania), pathogenic fungi (e.g. Candida albicans) and crop pests (e.g. aphids and mites). The rate of clonal versus sexual reproduction in natural populations has a crucial influence on demography and genetics. Whereas sexual populations pay a demographic cost by producing males, they take advantage of recombination that enhances the combination and spread of favourable mutations, relative to clonality. Therefore, even low rates of recombination in pests or pathogens would have profound consequences for prophylactic policies with regard to drug and pesticide resistance [3–6]. 
Corresponding author: Simon, J.-C. Available online 12 January 2005
Although knowledge of the rate of clonal reproduction (see Glossary) in natural populations is crucial for efficient pest and pathogen control, the tools for estimating such rates are still in their infancy. With the notable exception of bdelloid rotifers, even the empirical demonstration of strict clonal reproduction generally remains controversial [7,8], especially so in the absence of adequate sampling strategies and clear statistical approaches. However, there is room for optimism, as the availability of highly variable molecular markers, together with some recent theoretical advances [9,10], provides us with a larger toolbox for investigating natural rates of clonal reproduction. Nonetheless, the best approaches fail to stand out against the range of tests available, many of which were
Glossary
Asexual reproduction: all reproductive modes in which a new individual derives from only one parental source. Several kinds of asexual reproductive mode can be distinguished by the extent of genetic reorganization that takes place (i.e. whether meiosis occurs) and by the types of cell (somatic or gametic) that are involved. Here, we focus on asexual reproduction that proceeds through mitotic divisions (i.e. no recombination involved).
Chromosomal modification: includes fusion of two chromosomes, deletion of part of (or an entire) chromosome and translocation of part of a chromosome to another (homologous or not) chromosome.
Coalescence theory: mathematical model of the genealogical history of genetic samples. Based on this theory, statistical tools enable backwards reconstruction of genealogies that are compatible with the genetic data under scrutiny and direct estimation of the parameters that shape the genealogy (e.g. recombination rate).
Conjugation: transfer of DNA from a living donor bacterium to a recipient bacterium.
Gene conversion: the non-reciprocal transfer of genetic sequence between homologous genes from the same individual genome (e.g. as a consequence of DNA mismatch or double-strand break repair).
Genetic diversity: measure of genetic variability based on gene frequencies in the population. In haploid organisms, genetic diversity equates with allelic diversity. In diploid organisms, it is important to distinguish between allelic and genotypic diversity in the population, because allelic diversity could result from non-random assortment of alleles within individuals (genotypes), such as when clonal reproduction occurs.
Random mating: random assortment of genes between individuals from the same population; synonymous with panmictic reproduction.
Rate of clonal reproduction (or clonal rate): average per generation proportion of the individuals in a population that do not undergo sexual reproduction. Depending on the life cycle of the organism, the rate of clonal reproduction is either constant over time or fluctuates owing to periodic events of sexual reproduction.
Recombination: any process that results in the shuffling of genetic material. Our definition includes not only meiotic recombination (segregation followed by exchange of DNA fragment(s) between homologous chromosomes from single or different parental origins), but also non-meiotic processes, such as transformation in bacteria and template switching in viruses.
Repeated multilocus genotypes: allelic profile obtained at different loci shared by several individuals in a sample. They could result from insufficient discriminative power of the markers in use, or be clonal copies of the same genotype.
Transformation: uptake (followed by homologous recombination) of an external homologous DNA fragment (e.g. from dead, degraded bacteria).
Wahlund effect: deficit in heterozygotes owing to population subdivision in space or in time.
www.sciencedirect.com 0169-5347/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.tree.2005.01.001
Box 1. A multicriteria approach to test for clonality Several statistical methods are classically used to test for departure from random mating in populations (Table I). However, none of them alone can distinguish between potential causes leading to non-random reproduction (i.e. discriminating between the effects of clonal reproduction, inbreeding or selfing, and population subdivision from patterns of genetic structure). Therefore, the best method for detecting clonality should not only consider the advantages and limitations of each test, but should also combine them through a multicriteria approach.
Applying the three classes of tests in a sequential logical way enables more information about clonal reproduction to be generated and should limit misleading conclusions (Figure I). Ideally, inferences on clonal versus sexual reproduction based on a handful of neutral markers should be complemented by DNA sequence data. Analysis of DNA sequence data should enable variation of recombination rates across the genome to be estimated, together with identification of specific break points along chromosomes (Table I).
Table I. Advocated methods and related software that can be used to detect clonality (a)
Methods | Software and Web address | Comments and limitations | Refs
Population structuring
Analysis of molecular variance | AMOVA procedure in Arlequin (http://lgb.unige.ch/arlequin/); SAMOVA (http://wwwpeople.unil.ch/Isabelle.Dupanloup/samova.html) | Several hierarchical levels are implemented in the AMOVA procedure; could necessitate many runs for the proper population structure to be detected | [22]
Bayesian clustering method | STRUCTURE (http://pritch.bsd.uchicago.edu) | Relies on panmixy assumption and might fail to reveal population structure when clonal rate is too high; long processing time | [23,24]
Clonal diversity
Test for clonal origin of MLG (b) | MLGSim (http://www.molbiol.umu.se/forskning/saura/software.htm) | Tests for over-representation of repeated genotypes; any molecular marker; population size limited to 200 | [29]
Linkage disequilibrium
Correlation between pairs of loci | LINKDOS (implemented in GENETIX or in GENEPOP, see below) | Potentially the most accurate estimation of linkage disequilibrium; not applicable to sequence data | [32]
Ohta's indices | GENETIX (http://www.univ-montp2.fr/~genetix/genetix/genetix.htm) | Relatively inaccurate in the context of clonal reproduction; not applicable to sequence data | [33]
Index of multilocus association | MULTILOCUS (http://www.agapow.net/software/multilocus/) | The dependence of the method on the number of markers is corrected in this software; long processing time; not applicable to sequence data | [36]
Correlation between LD (b) and distance | none | No developed software known, method described in [36]; only applicable to sequence data | [37–40]
Genetic diversity and its apportionment
Allelic diversity and FIS estimates | GENEPOP (http://wbiomed.curtin.edu.au/genepop/); GENETIX (http://www.univ-montp2.fr/~genetix/genetix/genetix.htm) | Population genetic software offering various statistical tests (beyond the scope of this review); applicable to molecular markers | [70]
Genotypic diversity | MULTILOCUS (http://www.agapow.net/software/multilocus/) | No statistical test | [36]
Detecting recombination from sequence data | MaxChi2, GENECONV and the Homoplasy test implemented in START (http://pubmlst.org/software/analysis/start/) | For sequence data only; see [48–50] to further determine which approach is best suited to the data set | [45,71–73]
Inference of recombination rates and detection of hotspots (pairwise likelihood approaches) | LDhat (http://www.stats.ox.ac.uk/~mcvean/LDhat); LD and LDsr (http://www.maths.lancs.ac.uk/~fearnhea/software/) | Enables direct estimation of recombination rate variation through Reverse Jump Markov Chain Monte Carlo method; necessitates high number of markers | [62,64]
(a) When several softwares are available, only the most commonly used and the most tractable were mentioned.
(b) Abbreviations: LD, linkage disequilibrium; MLG, multilocus genotype.
not always specifically designed to seek evidence of clonality. As a result, the main problem currently lies with the application of available statistical tools and the interpretation of the results generated. Here, we clarify the situation by discussing the strengths and limitations of current methodologies, and briefly review the possibilities offered by new conceptual advances and increasing access to high throughput technologies.
Clonal reproduction and its importance in natural populations
Clonal reproduction encompasses all forms of asexual reproduction in which a new individual is produced by a single parent without genetic recombination. A direct consequence of clonal reproduction is that, barring mutations, the new individual is essentially genetically identical to its parent. Limited genetic reorganization, such as gene conversion and chromosomal modification [11], has nonetheless been documented in some organisms that otherwise reproduce clonally [12]. All other forms of reproduction involve profound genetic reorganization via recombination during meiosis. Recombination is the key mechanism generating high genotypic diversity and is generally considered to be the signature of sexual reproduction: that is, the formation of new unique individuals as a result of the fusion of the meiotic products from
Figure I. Sequential inference of population clonal rate. [Flowchart: sampling scheme and genotyping, followed by three sequential tests: (1) repeated genotypes (G:N = 1 versus G:N < 1, with a test for significant over-representation); (2) linkage disequilibrium (tested again without repeated genotypes); and (3) genetic variance apportionment (FIS; only applicable to diploids), leading to a conclusion of panmictic sexual reproduction or clonal reproduction. Side branches: add more polymorphic loci; perform a posteriori clustering; inconsistent results.]
different individuals. Although most prokaryotes are believed to reproduce clonally, many viruses and bacteria have been shown to recombine extensively [2,13,14], and bacteria have specific recombination mechanisms that occur between endo- and exogenous genetic material, including transformation and conjugation. Clonal reproduction is also widespread in eukaryotes [15], occurring in about one in 1000 taxa, and has been described in most major groups, except mammals and birds. Alternation between sexual and clonal reproduction is characteristic of many pathogens whose complex life cycle involves more than one obligate host. For example, the protozoan parasite Plasmodium falciparum has a sexual diploid phase in mosquitoes and reproduces as haploid clones in humans, resulting in the symptoms of malaria. In free-living species, the alternation between sexual and clonal reproduction, also termed cyclical parthenogenesis, is common in unicellular organisms, fungi, plants, rotifers, cladocerans and insects (e.g. aphids, gall wasps and cecidomyid flies).
Biological evidence for clonal reproduction
Before the application of genetic tools to whole-organism biology, the detection of clonality was only possible
through morphological observations, and was limited to dioecious taxa with sexual dimorphism. For example, among the non-marine ostracods, the absence of males in Darwinula leguminella, a species characterized by an excellent fossil record, strongly suggests that this taxon has persisted for at least 100 Myr without sex [16]. Alternatively, clonal organisms can be grown in the laboratory to assess their potential for sexual reproduction, which can sometimes be induced by environmental conditions (e.g. pH, temperature and day length). However, there are several limitations to such morphological approaches. The presence of rare males or sexual forms does not necessarily imply sexual recombination as these relict sexual forms might be non functional, as is the case in the brine shrimp Artemia [17]. Conversely, that no sexual form has been documented might simply stem from the fact that they are rare, cryptic, morphologically too different to be recognized as such or are produced under different environmental conditions than the one being considered [18]. Recombination, or the lack thereof, is thus best estimated directly at the level where it happens: in the genome, using population genetics tools.
Estimating clonal rate from population genetic data
Population genetic inferences of the rate of clonal reproduction require rigorous analyses that rely on three major steps: (i) an adequate sampling strategy; (ii) molecular analysis; and (iii) interpretation of results obtained using statistical tools (see Box 1 for a synthetic guideline and list of useful software and Box 2 for an application).
Sampling schemes
The ideal sampling scheme should apportion the genetic variance at all relevant levels, from the smallest spatial units where all individuals are expected to be clone-mates to larger areas encompassing a greater clonal diversity.
Sampling at the relevant scale is crucial for obtaining accurate inferences [19], and ill-defined a priori sampling units are a major source of misleading results (e.g. mixing of pathogen samples from different hosts or even different tissues). This will be true regardless of which genetic markers are assayed and which statistical tests are applied. The sampling strategy should ensure that there is no hidden genetic structuring within the units defined as sub-populations, to avoid a Wahlund effect, which is known to strongly influence parameter estimates, such as F-statistics and linkage disequilibrium (LD), and tends to mimic the signal of clonal reproduction. Some methods enabling a posteriori partitioning of samples into appropriate subpopulations are available (e.g. cumulative pooling [20], Monmonier algorithm [21] and maximization of total genetic variance [22]). In this context, the Bayesian clustering method implemented in the widely used software STRUCTURE [23,24] cannot deal adequately with genetic data from organisms reproducing mainly asexually. In addition to spatial subdivision, sampling can also include a temporal dimension [25], thus enabling monitoring of the evolution of rates of clonal reproduction over time.
Box 2. Aphids as a case study The typical reproductive mode of aphids (cyclical parthenogenesis) consists of regular alternation of several clonal generations with a synchronous event of sexual reproduction completing the annual life cycle. Some aphid lineages fail to fulfil the sexual part of this life cycle, leading to the coexistence of sexual (cyclical parthenogenetic) and asexual (obligate parthenogenetic) populations within many species [74]. This is true for the cereal aphid Rhopalosiphum padi (Figure I), which is an ideal diploid organism for studying the impact of different rates of clonal reproduction on population and genotypic structure [75,76].
Figure I. Sexual (a) and parthenogenetic (b) reproduction in the aphid, Rhopalosiphum padi. Photographs reproduced with permission of Bernard Chaubet.
Figure II. Proportion of genotypes discriminated (number of multilocus genotypes / number of individuals, 0–100%) as a function of the number of microsatellite loci combined (0–7) to separate genotypes in sexual (solid circles) and asexual (open circles) populations of Rhopalosiphum padi. Reproduced, with permission, from [75].
Table I. Comparative analysis of sexual and asexual populationsa
Resolving power of molecular markers to separate clones
Variation in clonal rates results in different proportions of distinct and repeated genotypes between sexual and asexual populations. The minimal number of microsatellite markers enabling discrimination of the number of genotypes in asexual and sexual populations can be estimated by plotting the proportion of distinct genotypes as a function of the number of loci analysed (Figure II). As long as a plateau is not reached, neither is accurate discrimination between clones. In the example in Figure II, five combined loci were sufficient to identify 94–100% of sexual genotypes, whereas at the very best (combination of all seven loci) two individuals (picked at random) have a 0.5 probability of being clone mates.
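The curve described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `distinct_genotype_curve` and the toy data are hypothetical, each individual is represented as a tuple with one genotype string per locus, and the proportion is averaged over every combination of k loci (the averaging rule is an assumption, not something specified in the text).

```python
from itertools import combinations


def distinct_genotype_curve(genotypes):
    """Mean proportion of distinct multilocus genotypes for k = 1..L loci.

    genotypes: list of tuples, one tuple per individual, one entry per locus
    (hypothetical input format). For each k, the proportion G:N is averaged
    over all combinations of k loci.
    """
    n_ind = len(genotypes)
    n_loci = len(genotypes[0])
    curve = {}
    for k in range(1, n_loci + 1):
        proportions = []
        for loci in combinations(range(n_loci), k):
            # Multilocus profiles restricted to the chosen subset of loci
            profiles = {tuple(g[i] for i in loci) for g in genotypes}
            proportions.append(len(profiles) / n_ind)
        curve[k] = sum(proportions) / len(proportions)
    return curve


# Toy data (invented): four individuals typed at three loci
sample = [
    ("a/a", "b/c", "d/d"),
    ("a/a", "b/c", "d/e"),
    ("a/b", "b/c", "d/d"),
    ("a/a", "b/c", "d/d"),  # same profile as the first individual
]
curve = distinct_genotype_curve(sample)
```

If the curve is still rising at the full set of loci, the markers probably lack the resolution to separate clones, which is the plateau criterion given in the text.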
Statistical tests applied to sexual and asexual populations
Estimation of clonal rate in a priori sexual and asexual populations was conducted according to the sequential inference proposed in Box 1. Strong differences were observed at each step of the analysis (Table I). Before estimation of the proportion of distinct genotypes (the G:N index), each repeated genotype (present only in the asexual population) was checked to ensure that it derived significantly from clonal reproduction (using MLGsim software; Box 1). LD was assessed from the proportion of pairs of loci significantly linked (performed with LINKDOS implemented in GENEPOP). Other estimations of LD (e.g. carried out with MULTILOCUS) gave similar results (data not shown). Whereas the observation of LD is not
Molecular analysis
Since the discovery of PCR, many different classes of genetic marker have been developed. Most of the analysis techniques require only small quantities of DNA, which is a major advantage, given that many clonal organisms are tiny. The choice of genetic markers is crucial, as it will limit the statistical tools that can be subsequently applied. First, a set of highly polymorphic loci should be used whenever possible. High polymorphism enables increased discrimination among genotypes and higher statistical power when testing for associations among loci [26]. When working with diploid or polyploid organisms, co-dominant markers (markers enabling homozygotes and heterozygotes to be distinguished at individual loci) are preferable because several tests are based on the amount of
Statistics | Asexual | Sexual
Repeated multilocus genotypes (RG): proportion of distinct genotypes (G:N) | 0.35 | 1
Linkage disequilibrium (LD): proportion of linked pairs of loci with RG (same LD tests performed without RGs) | 1 (0.19) | 0
Genetic variance apportionment: mean FIS value across loci (variance in FIS value among loci) | −0.394 (0.145) | −0.039 (0.002)
(a) We used the data set published in [76], which is available online.
definitive proof for clonal reproduction, the strong negative value of FIS (for the asexual population) ensures that there is no hidden genetic structure. All these statistics argue for a strong clonal structure in the asexual populations and, conversely, demonstrate that the sexual populations reproduce at random. Nonetheless, the additional measure of LD (when performed without repeated genotypes) suggests that many fewer pairs of loci are significantly linked, indicating that some recombination does occur in asexual populations. This conclusion is supported by the large variance in FIS values between loci. The loss of sex in the asexual population is thus only partial, and limited sexual events might account for the gene exchanges between the sexual and the asexual populations [75].
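The mean and among-locus variance of FIS reported in Box 2 can be illustrated with a minimal Python sketch. It assumes the simplest estimator, FIS = 1 − Ho/He per locus, with He computed from sample allele frequencies and no small-sample correction; this is not necessarily the estimator implemented in GENEPOP or GENETIX. The function names and input format (a list of allele pairs per locus) are hypothetical.

```python
from collections import Counter
from statistics import mean, pvariance


def fis_per_locus(genotypes):
    """Simple FIS = 1 - Ho/He for one locus.

    genotypes: list of (allele1, allele2) pairs for diploid individuals.
    Returns None for a monomorphic locus (He = 0).
    Assumption: no small-sample correction is applied.
    """
    n = len(genotypes)
    h_obs = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for g in genotypes for allele in g)
    h_exp = 1.0 - sum((c / (2 * n)) ** 2 for c in counts.values())
    if h_exp == 0:
        return None
    return 1.0 - h_obs / h_exp


def fis_summary(loci):
    """Mean and variance of FIS across polymorphic loci."""
    values = [fis_per_locus(g) for g in loci]
    values = [v for v in values if v is not None]
    return mean(values), pvariance(values)


# Toy loci (invented): fixed heterozygosity versus Hardy-Weinberg proportions
all_het = [("A", "B")] * 4
hwe = [("A", "A"), ("A", "B"), ("A", "B"), ("B", "B")]
mean_fis, var_fis = fis_summary([all_het, hwe])
```

In this toy case the first locus gives FIS = −1 (every individual heterozygous, as expected under long-term strict clonality) and the second gives 0, so the across-locus variance is large, the pattern that the box associates with occasional recombination.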
heterozygosity. Dominant markers, such as amplified fragment length polymorphism (AFLP) markers, should be restricted to the study of haploid organisms, whereas microsatellites and single nucleotide polymorphisms (SNPs) are well suited for both haploids and diploids. Ideally, DNA sequences should also be obtained because many statistical methods for estimating recombination rates focus exclusively on such data. Other considerations, such as the cost, the amount of work or previous characterization of primer sequences of molecular markers in the organisms under study, might also dictate which type of marker is selected. Clonal reproduction affects the dynamics of genes at all levels by preventing the reshuffling of alleles among and across loci. Here, we review the tools that
enable quantification of the deviation from sexual reproduction at those different levels.
Classic statistical approaches
Repeated genotypes
The most obvious signature of clonal multiplication is the presence of repeated multilocus genotypes in the population. Methods based on such genotypes can be applied regardless of the level of ploidy of the organism, and to data generated through any molecular method as long as at least two physically unlinked neutral genetic polymorphisms are assayed. For example, the simple index G:N (the ratio of the number of multilocus genotypes found over the sample size) or related approaches (Shannon or Simpson indices) are often used to estimate rates of clonal reproduction, with the G:N ratio ranging from 0 (all individuals share the same genotype, in case of strict clonality) to 1 (all individuals have distinct genotypes, under sexual reproduction) [27]. The conceptual simplicity of these methods is appealing and the detection of repeated multilocus genotypes can provide strong evidence for the presence of some clonal reproduction (Box 2). However, repeated multilocus genotypes are not definitive proof of clonal reproduction, because other reproductive systems can lead to apparently high frequencies of repeated genotypes in highly subdivided populations. For instance, in P. falciparum, repeated multilocus genotypes are essentially absent from geographical regions with high endemism but are abundant in regions with low prevalence where parasites mate with relatives during the sexual phase [28]. This situation, however, is an artefact generated by insufficient discriminative power when genetic diversity is low. Whether repeated genotypes result from clonal reproduction or from insufficient discriminative power of the genetic markers assayed can be tested statistically by using the expected frequencies of such identical multilocus genotypes under the observed allele frequencies and assuming random mating as a null hypothesis [29].
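As a rough illustration of these quantities, the sketch below computes G:N, a Simpson-type diversity index, and the expected frequency of a diploid multilocus genotype under random mating with independent loci. It is a simplified sketch, not the procedure implemented in MLGsim; the function names, the input formats and the unbiased form of the Simpson index are assumptions made for the example.

```python
from collections import Counter


def clonal_diversity(mlgs):
    """Return (G:N, Simpson diversity) for a list of multilocus genotype labels.

    G:N is the number of distinct genotypes over the sample size; in a finite
    sample its lower bound is 1/N rather than exactly 0.
    Simpson diversity is computed here as 1 - sum(n_i(n_i-1)) / (N(N-1)).
    """
    n = len(mlgs)
    counts = Counter(mlgs)
    g_over_n = len(counts) / n
    simpson = 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))
    return g_over_n, simpson


def genotype_probability(mlg, allele_freqs):
    """Expected frequency of a diploid multilocus genotype under random mating.

    mlg: tuple of (allele1, allele2) pairs, one per locus.
    allele_freqs: list of dicts mapping allele -> frequency, one per locus.
    Assumes Hardy-Weinberg proportions and independence among loci.
    """
    p = 1.0
    for (a, b), freqs in zip(mlg, allele_freqs):
        p *= freqs[a] ** 2 if a == b else 2 * freqs[a] * freqs[b]
    return p


# Toy sample (invented): genotype "A" sampled three times
g_n, simpson = clonal_diversity(["A", "A", "A", "B", "C"])
p_gen = genotype_probability(
    (("A", "A"), ("B", "C")),
    [{"A": 0.5, "a": 0.5}, {"B": 0.5, "C": 0.5}],
)
```

A genotype that is expected to be very rare under random mating but is sampled several times is more plausibly a clonal copy than a chance resampling, which is the logic of the null-hypothesis test cited in the text.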
Linkage disequilibrium and related approaches
Clonal reproduction generates non-random associations between loci [3,30], mimicking complete physical linkage over the entire genome. Associations among loci can be assessed either through the study of LD between pairs of loci [31–33] or estimated as an index of multilocus association [2,34–36]. The presence of repeated genotypes owing, for example, to a recent increase in the prevalence of some strains will greatly increase LD [2,28]. It is therefore recommended that LD estimations be performed with and without repeated identical multilocus genotypes. In populations that are characterized by frequent genetic recombination, a recent burst of a subset of strains is expected to lead to significant LD only when repeated genotypes are considered in the analysis [2]. Many tests based on LD have been specifically devised for DNA sequences or discrete linked genetic markers with known physical location. Although the exact rationale differs between different tests, they can be classified into two broad categories: (i) those that are based on the predicted decay of LD over physical distance along a chromosome and work well when recombination is
frequent; and (ii) those that aim at characterizing specific breaking points between adjacent runs of nucleotides and thus tend to perform best for relatively differentiated sequences and low rates of recombination. Testing for a decay of LD over physical distance between markers on a chromosome enables one to control for demographic factors and biases of the estimator itself, as these unwanted effects contribute to LD independently of the physical location of the polymorphism. The statistical properties of this approach have been thoroughly investigated [37–40], in part because significant LD–physical distance correlations have been proposed as one of the main pieces of evidence for recombination in human mitochondrial DNA [41]. Somewhat disappointingly, high variation in mutation rates or adaptive substitutions could lead to such patterns even in the absence of recombination [38,42,43]. A family of tests not based explicitly on LD, but exploiting the predictions that phylogenetic trees are only fully congruent under complete linkage between genetic markers, includes the homoplasy test [44], the incompatibility ratio [45] and the informative site test [46]. These methods measure the extent of recombination from the number of extra steps (i.e. steps that are not due to sequence mutations) in the observed maximum-parsimony tree. Because incongruence (number of extra steps) increases rapidly with recombination rate, these methods perform best under low divergence and recent events of recombination [45]. LD estimators (or derived methods) applied to a few genetic markers should be regarded only as tools to detect recombination, rather than to quantify its extent. A second limitation is that the presence of unrecognized population subdivision will be indistinguishable from the effect of clonal reproduction. Such substructure can originate from the presence of two or more reproductively isolated taxa [2] or from geographical and/or temporal population structuring.
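The recommendation to estimate LD with and without repeated multilocus genotypes can be illustrated with a small Python sketch. It assumes haploid individuals scored at biallelic loci and uses the squared correlation r² as the pairwise LD measure; these choices, the function names and the toy data are assumptions for illustration and do not reproduce LINKDOS or MULTILOCUS.

```python
from itertools import combinations


def r_squared(haplotypes, i, j):
    """Pairwise LD (r^2) between biallelic loci i and j in haploid data.

    Returns None if either locus is monomorphic in the sample.
    """
    n = len(haplotypes)
    ref_i, ref_j = haplotypes[0][i], haplotypes[0][j]  # arbitrary reference alleles
    p_i = sum(h[i] == ref_i for h in haplotypes) / n
    p_j = sum(h[j] == ref_j for h in haplotypes) / n
    p_ij = sum(h[i] == ref_i and h[j] == ref_j for h in haplotypes) / n
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        return None
    d = p_ij - p_i * p_j
    return d * d / denom


def mean_r2(haplotypes):
    """Mean r^2 over all pairs of polymorphic loci."""
    n_loci = len(haplotypes[0])
    values = [r_squared(haplotypes, i, j) for i, j in combinations(range(n_loci), 2)]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else None


def clone_correct(haplotypes):
    """Keep a single copy of each repeated multilocus genotype."""
    return list(dict.fromkeys(haplotypes))


# Toy sample (invented): one genotype over-represented five times
sample = ["00"] * 5 + ["01", "10", "11"]
ld_with_repeats = mean_r2(sample)
ld_without_repeats = mean_r2(clone_correct(sample))
```

In this toy sample the association disappears once repeats are removed, the pattern the text attributes to a recent burst of a few strains in an otherwise recombining population; LD that persists after clone correction would point instead towards clonality or hidden substructure.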
LD estimates also depend on other population parameters, such as size of populations and migration rates [47]. Different approaches applied to a given data set often provide highly incongruent results, so that conclusions have to be drawn from a battery of different approaches rather than from the statistical evidence of single tests [48–51].
Genetic diversity and its apportionment
Population genetics theory predicts that, for haploid organisms, the amount of genetic recombination does not affect the expected genetic diversity in a population at neutral loci. The situation is more complex in diploids, where two haplotypes are embedded within an individual, and it is important to disentangle population genetic diversity in terms of alleles and genotypes [52]. Strict clonal reproduction almost doubles the expected number of alleles compared with sexual reproduction and will tend to increase allelic diversity in the population by enabling the two alleles at each locus independently to accumulate mutations and irreversibly diverge within individuals [53]. This effect, which is sometimes referred to as the ‘Meselson effect’ (Box 3), only occurs when clonal reproduction is strongly predominant [52,54,55]. Genotypic diversity, however, decreases almost linearly with
the rate of clonal reproduction [47,52]. These predictions rely on the assumption that the populations under study are close to evolutionary equilibrium. This is probably rarely the case; most clonal lineages are evolutionarily young (Box 3) and the stabilization of genetic diversity takes a long time to occur in strictly clonal mating systems [47,55]. In addition, how clonal lineages originate influences their genetic diversity [18]. A recent single event of spontaneous loss of sex, followed by a rapid clonal expansion, would account for both low allelic and genotypic diversity. Conversely, clonal lineages arising through repeated hybridization between sexual taxa could be genetically extremely diverse [18,56]. Although the amount of genetic diversity per se is not a good indicator of the rate of genetic recombination, its apportionment between the individual and the population levels is more instructive in diploids. The traditional way to express the apportionment of genetic variance is through use of F-statistics that are relatively independent of equilibrium assumptions. The most informative F-statistic in this context is FIS, the deviation from random mating within subpopulations. Under clonal reproduction, FIS is expected to be negative, indicating an excess of heterozygotes relative to random mating. Whereas negative FIS can arise in small populations of dioecious or monoecious self-incompatible organisms [57], large negative values of
FIS can be considered as the ultimate signature of clonal diploid populations. Interestingly, rare events of sexual recombination will translate into a high variance in FIS over loci [47,52]. This approach is only informative for organisms where gene conversion is relatively rare. Gene conversion is frequent in yeasts (e.g. Candida) and has been suggested to be extremely common in the ostracod Darwinula stevensoni [58], because, although this is often considered as one of the oldest clonal organisms, it shows no evidence of excess heterozygosity [59].
Conclusion
In spite of the biological diversity of species or populations suspected of clonal reproduction, rarity or absence of recombination generates a few clear signatures. Here, we have reviewed these and have shown how clonality can be detected in natural populations (Boxes 1,2). We have focused on the statistical analyses that are best suited for data sets comprising a limited number of genetic markers, which is what most labs working in evolutionary genetics can currently afford. When working with large genomes, this (small) amount of genetic information limits one to testing for either the presence or absence of genetic recombination. A quantification of the rate at which recombination occurs requires more genetic information [60]. With the advent of high throughput technologies,
Box 3. Ancient asexual lineages Although there is a general trend for clonal lineages to be evolutionarily short lived, there are some groups of organisms (e.g. bdelloid rotifers, darwinulid ostracods and mycorrhizal fungi) that are thought to have been reproducing without sex for millions of years [77]. However, demonstrating ancient asexuality is a challenging task. Several lines of evidence have been proposed in eukaryotes for testing whether asexual lineages are ancient. Morphologically unique chromosomes and odd chromosome numbers are suggestive of longterm asexuality [8], as is the rarity of transposable elements (TEs). Sexual reproduction enables TEs to propagate through populations, whereas the loss of sex, by preventing their spread, has been predicted to result eventually in populations that are free of such elements [78]. Low rates of nucleotide substitution in coding regions and the decay or loss of genes involved in meiosis and sexual function are also predictive of old asexual lineages. Perhaps the best evidence for ancient asexuality is the demonstration of the so-called ‘Meselson’ effect (Figure I). This test works only for diploids because it focuses on the divergence of the two alleles within an individual. It relies on the principle that two allelic gene copies at a locus become highly divergent as a result of the independent accumulation of mutations in the absence of segregation [7,53]. If sexual reproduction was abandoned millions of generations ago, intra-individual allelic divergences can be significantly larger than in species that reproduce sexually. Application of a molecular clock approach then enables the age of asexual lineages to be estimated. There are, however, several pitfalls associated with this method, because other mechanisms might mask or mimic the allele divergence process (e.g. mitotic recombination, gene conversion, gene duplication, occasional automixis and hybridization) [79]. 
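The reasoning behind the Meselson effect and the molecular clock step can be sketched as follows. This is a deliberately crude illustration: it uses the uncorrected proportion of differing sites between the two aligned alleles of one individual, ignores multiple hits, and assumes that both copies have accumulated neutral mutations independently at the same constant rate since sex was lost, so that divergence d ≈ 2μt. The function names, the sequences and the rate value are hypothetical, and none of the confounding mechanisms listed above (gene conversion, mitotic recombination, hybridization) are modelled.

```python
def p_distance(seq1, seq2):
    """Uncorrected proportion of differing sites between two aligned alleles.

    Sites where either sequence has a gap or ambiguous base are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            differences += x != y
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def generations_since_sex_lost(divergence, rate_per_site_per_generation):
    """Rough age estimate from intra-individual allelic divergence.

    Assumes d = 2 * mu * t: two copies diverging independently at rate mu.
    """
    return divergence / (2 * rate_per_site_per_generation)


# Toy alleles (invented) from a single hypothetical clonal individual
allele_a = "ACGTACGTAC"
allele_b = "ACGTTCGTAA"
d = p_distance(allele_a, allele_b)
# The mutation rate below is a placeholder, not an estimate for any organism
t = generations_since_sex_lost(d, 1e-8)
```

Any such estimate is only as reliable as the assumed rate and the absence of the masking or mimicking processes noted in the box.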
Recently, application of these methods has enabled the ancient asexuality status of most putatively old asexuals (e.g. aphids and mycorrhizal fungi) to be rejected [77]. Only rotifer bdelloids have passed all the tests for asexual antiquity. This group forms a monophyletic clade of 370 described species that are thought to have been reproducing clonally for the past 80–100 million years. Neither males nor any signs of sex have been reported and bdelloid chromosomes are impossible to sort into homologous pairs [8]. www.sciencedirect.com
[Figure I. Panel (a): species tree with the sexual ancestor S and clonal lineages L1–L4. Panel (b): gene tree of haplotypes A and B for lineages L1–L4. Labels: 'Recombination between chromosomes', 'Origin of clonality'.]
Figure I. The Meselson effect. The figure shows four ancient clonal lineages (L1–L4) that originated from a single event of sexual recombination loss, and their sexual ancestor (S). Since the time of separation from the sexual lineages, the two sets of chromosomes, evolving within each lineage, have lost their ability to recombine. They thus behave as two distinct haplotypes, A and B, independently accumulating mutations [53]. The increase in sequence divergence between the two alleles at each locus within a clonal lineage is known as the Meselson effect [7]. The expected pattern of divergence between individuals [species tree (a)] is that the two distinct haplotypes (in green and red, respectively) are embedded in clonal individuals, whereas recombination between chromosomes within the sexual species is depicted as a net. Direct comparison between sequences of haplotypes A and B will lead to a different topology [gene tree (b)]: in a clonal lineage, each allele is more closely related to an allele in the same haplotype than it is to the companion allele in the same individual, so that the gene tree in asexual organisms is allele dependent and sequencing only a few alleles will produce incorrect topologies [7,53]. Adapted, with permission, from [58].
Furthermore, bdelloid rotifers lack TEs in their genome [78] and show allele sequence divergence that globally matches the Meselson effect. This array of evidence, rather than the use of a single criterion, strongly supports the view that bdelloid rotifers are truly ancient asexuals [7].
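The molecular clock reasoning described in this box can be illustrated with a minimal sketch. It assumes two aligned allele sequences sampled from the same diploid individual, a Jukes-Cantor correction for multiple hits (our choice for illustration, not a method prescribed by the studies cited) and a hypothetical per-site mutation rate `mu`. The sequences, the value of `mu` and the function names are all illustrative assumptions. Because the two alleles accumulate mutations independently, divergence grows at a rate of 2 × mu per generation, so the time since the loss of sex is roughly d / (2 × mu).

```python
import math


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance: expected substitutions per site given p."""
    if p >= 0.75:
        raise ValueError("sequences are saturated; distance is undefined")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def generations_since_sex_loss(d: float, mu: float) -> float:
    """Both alleles mutate independently, so divergence accrues at 2 * mu."""
    if mu <= 0:
        raise ValueError("mutation rate must be positive")
    return d / (2.0 * mu)


# Hypothetical alleles from one individual (20 aligned sites, 2 differences).
allele_a = "ACGTACGTACGTACGTACGT"
allele_b = "ACGTACGAACGTACGTTCGT"
MU = 1e-8  # hypothetical mutation rate per site per generation

p = p_distance(allele_a, allele_b)
d = jukes_cantor(p)
age = generations_since_sex_loss(d, MU)
print(f"p = {p:.3f}, d = {d:.4f}, age = {age:.3g} generations")
```

Such an estimate is only as good as its assumptions. As noted above, mitotic recombination, gene conversion, gene duplication, occasional automixis or hybridization can all bias the intra-individual divergence, and hence the inferred age, in either direction.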
combined with a significant reduction in genotyping and sequencing costs, this constraint might be relaxed. The current cost for genotyping 10 000 human SNPs is approximately £500 per individual, and this is expected to decrease significantly in the future. Data sets on the scale of the human genome HapMap [61], which enabled fine-scale characterization of recombination event distribution patterns [62], might soon be available for some clonal organisms. For example, two recent papers characterize 528 new microsatellite markers in the cyclical parthenogen Daphnia pulex [63] and 561 SNPs for the yeast Candida albicans [11]. These numbers of genetic markers should provide sufficient genome-wide coverage to enable accurate estimation of recombination rates in these organisms. Models based on coalescence theory that estimate recombination rates jointly with other parameters of the genealogy of the sample (such as migration rates) are also being developed [9,10,64–69]. These new simulation methods enable parameters (e.g. the recombination rate) to be estimated directly from the molecular data, without requiring multiple tests. Much can be expected from this concerted theoretical and technological development. Indeed, incorporating recombination events into accurate genealogies of clonal lineages would enable both the assessment of their evolutionary history, including clonal turnover in the population, and the deciphering of genetic diversity patterns. We are hopeful that such integrated approaches applied to clonal pathogens and pests will lead to major progress in our understanding of the evolution and spread of medically and economically important strains. This fundamental knowledge might, in turn, prove crucial for improving management strategies as well as for preventing the spread of emerging diseases.

Acknowledgements

We thank David Hosken, Irène Hummel and Franck Prugnolle for helpful discussion and critical reading of the article. F.H. was supported by a fellowship from Institut National de la Recherche Agronomique (INRA). This work was also supported by a grant from Région Bretagne (Opération A1C701-Programme 691). We apologize to the many authors whose work could not be cited because of space limitations.
References

1 Burt, A. et al. (1996) Molecular markers reveal cryptic sex in the human pathogen Coccidioides immitis. Proc. Natl. Acad. Sci. U. S. A. 93, 770–773
2 Maynard Smith, J. et al. (1993) How clonal are bacteria? Proc. Natl. Acad. Sci. U. S. A. 90, 4384–4388
3 Awadalla, P. (2003) The evolutionary genomics of pathogen recombination. Nat. Rev. Genet. 4, 50–60
4 Taylor, J.W. et al. (1999) The evolutionary biology and population genetics underlying fungal strain typing. Clin. Microbiol. Rev. 12, 126–146
5 Tibayrenc, M. and Ayala, F. (2002) The clonal theory of parasitic protozoa: 12 years on. Trends Parasitol. 18, 405–410
6 Posada, D. et al. (2002) Recombination in evolutionary genomics. Annu. Rev. Genet. 36, 75–97
7 Welch, D.M. and Meselson, M. (2000) Evidence for the evolution of bdelloid rotifers without sexual reproduction or genetic exchange. Science 288, 1211–1215
8 Welch, J.L.M. et al. (2004) Cytogenetic evidence for asexual evolution of bdelloid rotifers. Proc. Natl. Acad. Sci. U. S. A. 101, 1618–1621
9 Hudson, R.R. (2001) Two-locus sampling distributions and their application. Genetics 159, 1805–1817
10 McVean, G.A.T. et al. (2002) A coalescent-based method for detecting and estimating recombination from gene sequences. Genetics 160, 1231–1241
11 Forche, A. et al. (2004) Genome-wide single-nucleotide polymorphism map for Candida albicans. Eukaryot. Cell 3, 705–714
12 Gaunt, M.W. et al. (2003) Mechanism of genetic exchange in American trypanosomes. Nature 421, 936–939
13 Feil, E.J. et al. (2001) Recombination within natural populations of pathogenic bacteria: short-term empirical estimates and long-term phylogenetic consequences. Proc. Natl. Acad. Sci. U. S. A. 98, 182–187
14 Rhodes, T. et al. (2003) High rates of human immunodeficiency virus type 1 recombination: near-random segregation of markers one kilobase apart in one round of viral replication. J. Virol. 77, 11193–11200
15 Bell, G. (1982) The Masterpiece of Nature: The Evolution and Genetics of Sexuality, University of California Press
16 Martens, K. et al. (2003) How ancient are ancient asexuals? Proc. R. Soc. Lond. Ser. B 270, 723–729
17 Browne, R.A. (1992) Population genetics and ecology of Artemia: insights into parthenogenetic reproduction. Trends Ecol. Evol. 7, 232–237
18 Simon, J.C. et al. (2003) Phylogenetic relationships between parthenogens and their sexual relatives: the possible routes to parthenogenesis in animals. Biol. J. Linn. Soc. 79, 151–163
19 Anderson, J.B. and Kohn, L.M. (1998) Genotyping, gene genealogies and genomics bring fungal population genetics above ground. Trends Ecol. Evol. 13, 444–449
20 Goudet, J. et al. (1994) The different levels of population structuring of the dogwhelk, Nucella lapillus, along the south Devon coast. In Genetics and Evolution of Aquatic Organisms (Beaumont, A., ed.), pp. 81–95, Chapman & Hall
21 Monmonier, M. (1973) Maximum-difference barriers: an alternative numerical regionalization method. Geogr. Anal. 3, 245–261
22 Dupanloup, I. et al. (2002) A simulated annealing approach to define the genetic structure of populations. Mol. Ecol. 11, 2571–2581
23 Pritchard, J.K. et al. (2000) Inference of population structure using multilocus genotype data. Genetics 155, 945–959
24 Falush, D. et al. (2003) Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164, 1567–1587
25 Drummond, A.J. et al. (2003) Measurably evolving populations. Trends Ecol. Evol. 18, 481–488
26 Sunnucks, P. (2000) Efficient genetic markers for population biology. Trends Ecol. Evol. 15, 199–203
27 Ivey, C.T. and Richards, J.H. (2001) Genetic diversity of Everglades sawgrass, Cladium jamaicense (Cyperaceae). Int. J. Plant Sci. 162, 817–825
28 Anderson, T.J.C. et al. (2000) Microsatellite markers reveal a spectrum of population structures in the malaria parasite Plasmodium falciparum. Mol. Biol. Evol. 17, 1467–1482
29 Stenberg, P. et al. (2003) MLGsim: a program for detecting clones using a simulation approach. Mol. Ecol. Notes 3, 329–331
30 Tibayrenc, M. et al. (1991) Are eukaryotic microorganisms clonal or sexual? A population genetics vantage. Proc. Natl. Acad. Sci. U. S. A. 88, 5129–5133
31 Weir, B. (1979) Inferences about linkage disequilibrium. Biometrics 35, 235–254
32 Garnier-Gere, P. and Dillmann, C. (1992) A computer program for testing pairwise linkage disequilibria in subdivided populations. J. Hered. 83, 239–239
33 Ohta, T. (1982) Linkage disequilibrium due to random genetic drift in finite subdivided populations. Proc. Natl. Acad. Sci. U. S. A. 79, 1940–1944
34 Brown, A.H.D. et al. (1980) Multilocus structure of natural populations of Hordeum spontaneum. Genetics 96, 523–536
35 Haubold, B. et al. (1998) Detecting linkage disequilibrium in bacterial populations. Genetics 150, 1341–1348
36 Agapow, P.M. and Burt, A. (2001) Indices of multilocus linkage disequilibrium. Mol. Ecol. Notes 1, 101–102
37 Piganeau, G. and Eyre-Walker, A. (2004) A reanalysis of the indirect evidence for recombination in human mitochondrial DNA. Heredity 92, 282–288
38 Innan, H. and Nordborg, M. (2002) Recombination or mutational hot spots in human mtDNA? Mol. Biol. Evol. 19, 1122–1127
39 Meunier, J. and Eyre-Walker, A. (2001) The correlation between linkage disequilibrium and distance: implications for recombination in hominid mitochondria. Mol. Biol. Evol. 18, 2132–2135
40 Awadalla, P. and Charlesworth, D. (1999) Recombination and selection at Brassica self-incompatibility loci. Genetics 152, 413–425
41 Awadalla, P. et al. (1999) Linkage disequilibrium and recombination in hominid mitochondrial DNA. Science 286, 2524–2525
42 Wiuf, C. (2001) Recombination in human mitochondrial DNA? Genetics 159, 749–756
43 McVean, G.A.T. (2001) What do patterns of genetic variability reveal about mitochondrial recombination? Heredity 87, 613–620
44 Maynard Smith, J. and Smith, N.H. (1998) Detecting recombination from gene trees. Mol. Biol. Evol. 15, 590–599
45 Maynard Smith, J. (1999) The detection and measurement of recombination from sequence data. Genetics 153, 1021–1027
46 Worobey, M. (2001) A novel approach to detecting and measuring recombination: new insights into evolution in viruses, bacteria, and mitochondria. Mol. Biol. Evol. 18, 1425–1434
47 De Meeûs, T. and Balloux, F. (2004) Clonal reproduction and linkage disequilibrium in diploids: a simulation study. Infect. Genet. Evol. 4, 345–351
48 Posada, D. and Crandall, K.A. (2001) Evaluation of methods for detecting recombination from DNA sequences: computer simulations. Proc. Natl. Acad. Sci. U. S. A. 98, 13757–13762
49 Wiuf, C. et al. (2001) A simulation study of the reliability of recombination detection methods. Mol. Biol. Evol. 18, 1929–1939
50 Posada, D. (2002) Evaluation of methods for detecting recombination from DNA sequences: empirical data. Mol. Biol. Evol. 19, 708–717
51 Gandolfi, A. et al. (2003) Evidence of recombination in putative ancient asexuals. Mol. Biol. Evol. 20, 754–761
52 Balloux, F. et al. (2003) The population genetics of clonal and partially clonal diploids. Genetics 164, 1635–1644
53 Birky, C.W. (1996) Heterozygosity, heteromorphy, and phylogenetic trees in asexual eukaryotes. Genetics 144, 427–437
54 Bengtsson, B. (2003) Genetic variation in organisms with sexual and asexual reproduction. J. Evol. Biol. 16, 189–199
55 Yonezawa, K. et al. (2004) The effective size of mixed sexually and asexually reproducing populations. Genetics 166, 1529–1539
56 Delmotte, F. et al. (2003) Phylogenetic evidence for hybrid origins of asexual lineages in an aphid species. Evolution 57, 1291–1303
57 Balloux, F. (2004) Heterozygote excess in small populations and the heterozygote-excess effective population size. Evolution 58, 1891–1900
58 Butlin, R.K. (2000) Virgin rotifers. Trends Ecol. Evol. 15, 389–390
59 Schön, I. et al. (1998) Slow molecular evolution in an ancient asexual ostracod. Proc. R. Soc. Lond. Ser. B 265, 235–242
60 Stumpf, M. and McVean, G.A.T. (2003) Estimating recombination rates from population-genetic data. Nat. Rev. Genet. 4, 959–968
61 Couzin, J. (2004) Genomics – consensus emerges on HapMap strategy. Science 304, 671–671
62 McVean, G.A.T. et al. (2004) The fine-scale structure of recombination rate variation in the human genome. Science 304, 581–584
63 Colbourne, J. et al. (2004) Five hundred and twenty-eight microsatellite markers for ecological genomic investigations using Daphnia. Mol. Ecol. Notes 4, 485–490
64 Fearnhead, P. and Donnelly, P. (2002) Approximate likelihood methods for estimating local recombination rates. J. R. Stat. Soc. Ser. B 64, 657–680
65 Fearnhead, P. and Donnelly, P. (2001) Estimating recombination rates from population genetic data. Genetics 159, 1299–1318
66 Kuhner, M.K. et al. (2000) Maximum likelihood estimation of recombination rates from population data. Genetics 156, 1393–1401
67 Nielsen, R. (2000) Estimation of population parameters and recombination rates from single nucleotide polymorphisms. Genetics 154, 931–942
68 Hey, J. and Wakeley, J. (1997) A coalescent estimator of the population recombination rate. Genetics 145, 833–846
69 Rosenberg, N.A. and Nordborg, M. (2002) Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat. Rev. Genet. 3, 380–390
70 Raymond, M. and Rousset, F. (1995) Genepop (Version-1.2). Population-genetics software for exact tests and ecumenicism. J. Hered. 86, 248–249
71 Maynard Smith, J. (1992) Analysing the mosaic structure of genes. J. Mol. Evol. 34, 126–129
72 Sawyer, S. (1989) Statistical tests for detecting gene conversion. Mol. Biol. Evol. 6, 526–538
73 Jolley, K.A. et al. (2001) Sequence type analysis and recombinational tests (START). Bioinformatics 17, 1230–1231
74 Simon, J-C. et al. (2002) Ecology and evolution of sex in aphids. Trends Ecol. Evol. 17, 34–39
75 Delmotte, F. et al. (2002) Genetic architecture of sexual and asexual populations of the aphid Rhopalosiphum padi based on allozyme and microsatellite markers. Mol. Ecol. 11, 711–723
76 Halkett, F. et al. (2005) Admixed sexual and facultatively asexual aphid lineages at mating sites. Mol. Ecol. 14, 325–336
77 Normark, B.B. et al. (2003) Genomic signatures of ancient lineages. Biol. J. Linn. Soc. 79, 69–84
78 Arkhipova, I. and Meselson, M. (2000) Transposable elements in sexual and ancient asexual taxa. Proc. Natl. Acad. Sci. U. S. A. 97, 14473–14477
79 Birky, C.W. (2004) Bdelloid rotifers revisited. Proc. Natl. Acad. Sci. U. S. A. 101, 2651–2652