Annals of Botany 85 (Supplement A): 241–245, 2000. doi:10.1006/anbo.1999.1037, available online at http://www.idealibrary.com
S-allele Diversity in Lycium andersonii: Implications for the Evolution of S-Allele Age in the Solanaceae

ADAM D. RICHMAN*

Plant Sciences Department, Montana State University, Bozeman, MT 59717-0346, USA

Received: 21 July 1999 Returned for revision: 24 August 1999 Accepted: 22 October 1999
We evaluate competing explanations for striking differences in the average age of self-incompatibility (S-) alleles in population samples. The age of alleles is inferred from evidence for trans-generic evolution (TGE), in which an allele sampled from one species is more closely related to an allele found in another genus than any con-generic allele, as determined by phylogenetic analysis. Whereas some species exhibit extensive TGE, indicating very long persistence of allelic lineages, in others limited TGE suggests extensive extinction and origination of alleles. We consider two explanations for inter-specific differences in TGE and allelic turnover: (1) bottleneck event(s) which have accelerated the loss of allelic diversity in some species; and (2) differences among species in the origination rate of new allelic specificities. We used data on S-allele diversity in Lycium andersonii (Solanaceae), a self-incompatible perennial shrub of deserts of southwestern North America, to estimate the presumed change in origination rate. We find that predicted allelic turnover assuming a change in the origination rate of new S-allele specificities is insufficient to account for inter-specific differences in allele turnover.

© 2000 Annals of Botany Company

Key words: Balanced genetic polymorphism, self-incompatibility, Solanaceae, Lycium andersonii.
* Fax 1 406-994-7600, e-mail [email protected]

0305-7364/00/0A0245+05 $35.00/00

INTRODUCTION

Balanced genetic polymorphism, found, for example, in genes involved in self-recognition including the MHC Class II genes in vertebrates and the self-incompatibility (S-) gene in flowering plants, has been proposed as a source of inference of population history complementary to that of neutral genetic polymorphism, because genetic polymorphism maintained by balancing selection permits inferences about population size over much longer spans of time (Takahata, 1990, 1993a). This approach has been used to infer a population bottleneck at the S-gene in the Solanaceae (Richman et al., 1996b), and the absence of bottlenecks in the history of the human lineage (Klein et al., 1993b; Takahata, 1993b), during diversification of cichlid species flocks in Lake Malawi (Klein et al., 1993a), and in the founding population of Darwin's finches (Vincek et al., 1997).

In the case of the S-gene, the selective basis for the maintenance of polymorphism is well understood: the S-gene prevents self-fertilization because pollen carrying an allele also expressed in the style is rejected. Codominance of pollen alleles in the style makes modelling gametophytic self-incompatibility (GSI) particularly mathematically tractable (Vekemans and Slatkin, 1994). Because the frequency of an S-allele determines the probability of access to compatible mates, the fitness of an S-allele is negatively frequency dependent, with the expectation that alleles will be maintained in equal frequency at an equilibrium between the rate of origination and the loss of alleles due to drift, a function of Ne (Wright, 1960, 1965; Yokoyama and Nei, 1979; Yokoyama and Hetherington, 1982). Tests of empirical samples in the Solanaceae are largely consistent with the expectation of uniform allele frequencies (Richman et al., 1995, 1996a, b), although in the Papaveraceae significant deviations have been detected (O'Donnell and Lawrence, 1984; Lawrence et al., 1993; Lawrence and Franklin-Tong, 1994).

In addition to understanding the nature of selection maintaining variation at the S-gene, there has also been recent progress in understanding the molecular genetics of self-incompatibility. The S-gene product has been cloned for several plant families (Franklin-Tong and Franklin, 1993), and in the Solanaceae this gene product has been shown to be necessary and sufficient for rejection of self pollen (Huang et al., 1994; Lee et al., 1994). These advances permitted development of PCR-based techniques for population genetic studies of S-gene polymorphism at the molecular level (Brace et al., 1993, 1994; Richman et al., 1995, 1996a).

An emergent result from initial studies of S-allele diversity in the Solanaceae is that there appear to be marked differences in the number and age of S-alleles sampled from different species. In a comparison of S-allele diversity in two species from the Solanaceae (Richman et al., 1996b), a sample of S-alleles from Physalis crassifolia contained significantly more S-alleles than the number estimated from two widely separate population samples of S-alleles from Solanum carolinense. S-alleles in P. crassifolia are also apparently much younger than alleles in S. carolinense, clustering together in phylogenetic analysis, whereas the alleles sampled from S. carolinense show extensive trans-generic evolution (TGE, where alleles from
Richman – S-Locus Evolution in the Solanaceae
different genera cluster together), indicating the origin of these lineages pre-dates the genera in which they presently occur. In addition, alleles in P. crassifolia show a significant excess of replacement substitutions over synonymous substitutions, indicating diversifying selection for new specificities. An excess of replacement substitutions is largely absent for pairwise comparisons in S. carolinense, presumably due to the accumulation of many more silent sequence differences over time (Richman et al., 1996b). We concluded that the most likely explanation for the observed difference between population samples was a severe bottleneck in the history of the P. crassifolia lineage which caused the loss of most S-allele lineages, followed by population recovery and rediversification.

An alternative interpretation is that a change in the rate of origination of new allelic specificities is largely responsible for inter-specific differences in S-allele age and number (Uyenoyama, 1997). In support of this view, Uyenoyama (1997) showed that S-allele diversity in S. carolinense deviated from expectation under coalescent theory in that terminal branches of the species' S-allele genealogy were significantly too long, and suggested that the origination rate of new S-allele specificities was slowing down over time. A mechanism of tightly linked deleterious mutation sheltered by obligate heterozygosity at the S-locus was proposed to explain the slow-down in origination rate. In contrast, the genealogy for P. crassifolia showed no significant deviation from theoretical expectation, leading Uyenoyama (1997) to speculate that an increase in the origination rate was responsible for relatively rapid and recent allelic diversification.
Uyenoyama (2000) estimated the separate contributions of effective size and origination rate to differences in total terminal branch length (as a measure of allele age) and concluded that a change in origination rate was largely responsible for inter-specific differences in allele age.

Here we evaluate the evidence for a bottleneck event and/or a change in the origination rate at the S-locus, using data on S-allele diversity in Lycium andersonii, a self-incompatible plant of southwestern deserts of North America. The sample of alleles from this species is novel in combining high S-allele number with extensive TGE. Using the method of Uyenoyama (2000) we investigate whether a change in the origination rate alone is sufficient to account for inter-specific differences in allelic turnover, estimated from TGE. We find that the estimated change in origination rate is insufficient to explain inter-specific differences in allelic turnover. Our results suggest that a population bottleneck is required to account for differences in S-allele age among taxa. Finally, we suggest that inter-specific differences in terminal branch length may not, in fact, be due to change in the origination rate. Instead we suggest that more divergent alleles are selectively favoured over evolutionary time. Support for this inference with respect to the S-locus and MHC loci is discussed.

MATERIALS AND METHODS

Ten styles were taken from each of 16 individuals of Lycium andersonii sampled along a 100 m transect in the Granite
Mountains UC Reserve, USA in April 1995 and immediately frozen in liquid nitrogen. The partial S-locus genotype of each of these was determined by sequence analysis of RT-PCR products obtained using S-locus specific primers, using methods described in Richman et al. (1995, 1996a).

RESULTS

RFLP analysis indicated the presence of two different sequences in PCR products obtained from each plant, consistent with the expectation of obligate heterozygosity at a gametophytic S-locus. The RFLP patterns of different individuals were unique, indicating that no genotype occurred more than once in the population sample. Twenty-two alleles were recovered in the sample of 16 individuals, yielding a maximum likelihood estimate of 36 alleles in the population (Paxman, 1963), with a corresponding confidence interval of 25–69 alleles (O'Donnell and Lawrence, 1984). The assumption of the population estimate of equal frequencies of different genotypes was not rejected for the sample (χ² = 10.5, 21 d.f., P > 0.25).

Phylogenetic analysis of S-allele sequences sampled from multiple genera of Solanaceae found extensive trans-generic evolution of S-allele lineages in L. andersonii. A total of 12 trans-generic lineages are found in the genealogical reconstruction (Fig. 1, Table 1) and the 95% bootstrap confidence interval on this estimate is 9–14 lineages. Extensive TGE was also found in Solanum carolinense which, in the same analysis, had eight (95% CI 7–10) trans-generic lineages from a sample of 13 alleles. The estimate of TGE for Physalis crassifolia was much lower (95% CI 2–3), indicating greater lineage turnover (Fig. 2).

Estimates of TGE are potentially sensitive to sample size, which varies among taxa (Table 1). To address this issue, we estimated the amount of allele turnover (origination and extinction) given observed levels of TGE in population samples, using the approach of Takahata (Klein et al., 1993b; Takahata, 1993b).
The method assumes a random process of allelic extinction and origination. Given this model, it is possible to describe the evolution of the number of trans-generic lineages over time, as the population moves along a trajectory starting at the time of generic divergence (k = n) and ending with complete turnover of all lineages present initially (k = 1). The measure of turnover t0 describes the position of a population sample along this trajectory. Results of this analysis indicated a similarly low level of allele turnover in L. andersonii and S. carolinense relative to P. crassifolia (Table 1, Fig. 2).

We evaluated whether a presumptive increase in the origination rate was sufficient to account for inter-specific differences in TGE and allelic turnover. Differences in TGE are associated with differences in the shape of the allelic genealogy, with genealogies showing extensive TGE exhibiting a large deviation from expectation in that the terminal branches are too long, indicating an apparent slow-down in the origination of new S-allele specificities. Genealogies with lower TGE fail to show a similarly significant deviation (Table 1). Uyenoyama (1997) has interpreted significant deviation from expectation as consistent with a slow-down in the origination rate over evolutionary time,
FIG. 1. Phylogeny of S-alleles in the Solanaceae. Inferred amino acid sequences from L. andersonii were aligned with S-allele sequences from other Solanaceae using Clustal W with default settings and adjusted by eye. The amino acid alignment was then used to align DNA sequences. DNA sequences used correspond to amino acid positions 1–129 in Fig. 1 of Richman et al. (1996b). Pairwise DNA distances were estimated using the HKY model (Hasegawa et al., 1987). The phylogeny was determined using the neighbour joining algorithm implemented in PAUP* (Swofford, 1999). The tree was midpoint rooted. Citations for published S-sequences and/or their Genbank accession numbers are given in Richman et al. (1996b).
TABLE 1. Population and genealogy statistics for S-allele diversity in three species of Solanaceae

Species          m, n     N            TGE (k)     Allelic turnover   Rsd
L. andersonii    16, 22   36 (25–69)   12 (9–15)   0.10               4.74***
P. crassifolia   22, 28   44 (33–60)   2 (2–3)     1.15               2.59ns
S. carolinense   26, 13   14 (13–15)   8 (7–10)    0.08               5.80***
m, number of individuals sampled; n, number of different sequences recovered. N is the maximum likelihood estimate of the number of alleles in the population, given m and n (Paxman, 1963). The corresponding likelihood interval was estimated using the method of O'Donnell and Lawrence (1984). The number of trans-generic lineages k, defined as the number of lineages which predate the common divergence of Lycium, Solanum and Physalis, was determined from the genealogy in Fig. 1. Confidence intervals for k were determined by bootstrap resampling. Allelic turnover (see text) was estimated from TGE using the method of Takahata (1993a). The shape of the genealogy was measured using the statistic Rsd [= S(1 − 1/n)/D, where S is the sum of terminal branch lengths, n is the number of alleles in the sample, and D is the depth of the genealogy (Uyenoyama, 1997)]. The method assumes a molecular clock, and least squares branch lengths were estimated under this assumption using the program LINTRE (Takezaki et al., 1995). Significant deviation of the statistic from expectation for a balanced genealogy (see text) was determined by simulation (Uyenoyama, 1997). ***P < 0.01; ns, not significant.
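As an illustration of the Rsd statistic defined in the notes to Table 1, the following minimal Python sketch evaluates S(1 − 1/n)/D from a list of terminal branch lengths and a genealogy depth. The function name and the input values are hypothetical and are not taken from the study; the significance test by simulation described in the notes is not reproduced here.

```python
def rsd(terminal_branch_lengths, depth):
    """Return Rsd = S * (1 - 1/n) / D.

    S is the sum of terminal branch lengths, n the number of alleles
    (one terminal branch per allele) and D the depth of the genealogy,
    following the definition given in the notes to Table 1.
    """
    n = len(terminal_branch_lengths)
    if n < 2:
        raise ValueError("at least two alleles are required")
    if depth <= 0:
        raise ValueError("genealogy depth must be positive")
    s = sum(terminal_branch_lengths)
    return s * (1.0 - 1.0 / n) / depth


if __name__ == "__main__":
    # Hypothetical branch lengths for four alleles, in units of the
    # genealogy depth; these are illustrative values only.
    print(rsd([0.5, 0.5, 0.25, 0.25], 1.0))
```

Longer terminal branches relative to the depth of the genealogy give larger values of the statistic, which is the direction of deviation discussed in the text.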
and the closer fit of young genealogies as the result of an increase in the origination rate following a reduction of genetic load at the S-locus. Uyenoyama (2000) estimated the change in origination rate from:

$$\frac{L^*_1}{L^*_2} = \frac{N_2 u_2 f_2}{N_1 u_1 f_1}$$
FIG. 2. Likelihood estimates of allelic turnover t0 given k trans-generic lineages in a sample of size n, using the approach of Takahata (Klein et al., 1993b; Takahata, 1993b). The method assumes the evolution of the number of trans-generic lineages over time is due to a random process of allele birth and death, as the population moves along a trajectory starting at the time of generic divergence (k = n) and ending with complete turnover of all lineages present initially (k = 1). The measure of turnover t0 describes the position of a population sample along this trajectory.
where $L^*_i$ is the sum of total terminal branch lengths in species i, $N_i$ is the effective population size of species i, $u_i$ is the origination rate of species i and $f_i$ is the scaling factor for species i, a function of $u_i$ and $N_i$ (Vekemans and Slatkin, 1994). $N_i$ was estimated using the effective allele number for each species (Yokoyama and Hetherington, 1982; Uyenoyama, 2000). We applied the same approach to estimate a four-fold increase in u in P. crassifolia relative to L. andersonii.
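The relation attributed to Uyenoyama (2000) can be rearranged to give the ratio of origination rates, u2/u1 = (L*1/L*2)(N1 f1)/(N2 f2). The Python sketch below carries out only this rearrangement. It treats the scaling factors f1 and f2 as known inputs, which is a simplifying assumption because each f depends on the corresponding u and N; the function name and all numerical values are hypothetical.

```python
def origination_rate_ratio(l1, l2, n1, n2, f1, f2):
    """Return u2/u1 implied by L*1/L*2 = (N2 u2 f2) / (N1 u1 f1).

    l1, l2: sums of terminal branch lengths in species 1 and 2
    n1, n2: effective population sizes
    f1, f2: scaling factors, treated here as fixed inputs (an assumption;
            in the source each f is a function of u and N)
    """
    for name, value in (("l2", l2), ("n2", n2), ("f2", f2)):
        if value <= 0:
            raise ValueError(f"{name} must be positive")
    return (l1 / l2) * (n1 * f1) / (n2 * f2)


if __name__ == "__main__":
    # Hypothetical inputs: equal sizes and scaling factors, with terminal
    # branches twice as long in species 1 as in species 2.
    print(origination_rate_ratio(2.0, 1.0, 100.0, 100.0, 1.0, 1.0))
```

Because f is not independent of u, a full treatment would solve for u and f jointly; the sketch only makes explicit how the branch-length ratio and the assumed effective sizes enter the estimate.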
DISCUSSION
Uyenoyama (2000) assumed that differences in terminal branch length for species gene genealogies were due to a change in the origination rate of new allelic specificities to estimate a 25-fold increase in origination rate in P. crassifolia relative to S. carolinense. Using the same approach we found a much more modest estimate of the change in P. crassifolia relative to L. andersonii. The different estimates of u are, in fact, due largely to assumed differences in effective population size, not differences in sums of the lengths of terminal branches. A much larger difference in the origination rate is required for S. carolinense because it is assumed it has had a smaller effective size than P. crassifolia over the history of the genealogy, based on differences in estimated allele number (Table 1). Because L. andersonii has a high estimated S-allele number similar to P. crassifolia, the contribution of this assumption of differences in effective size to the inferred origination rate is much reduced.

We suggest that the assumption of small effective size in S. carolinense may be misleading. Estimates of allelic turnover in L. andersonii and S. carolinense are very similar, which would not be expected if the species had maintained different effective population sizes for a long time. Moreover, the number of alleles is expected to be evolutionarily labile (Takahata, 1993b; Richman and Kohn, 1999), and significant differences in allele number have been found even among congeneric species (Richman and Kohn, 1999). Therefore, the characteristics of the genealogy of S. carolinense may not reflect maintenance of smaller effective size over millions of years. In contrast, it is unlikely that L. andersonii has recently attained large effective size relative to S. carolinense, because the large number of trans-specific lineages in this species indicates that it has maintained a larger number of alleles for a long time.
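The turnover comparison made in this Discussion reduces to simple arithmetic on values reported in the paper: the turnover estimates for S. carolinense and L. andersonii in Table 1 (0.08 and 0.10), the four-fold change in origination rate, and the critical minimum turnover of 0.48 reported for P. crassifolia. The Python sketch below restates that check; the function names are hypothetical, and the proportional scaling of turnover with origination rate assumes constant population size, as stated in the text.

```python
# Turnover estimates from Table 1 and the critical value for P. crassifolia
# reported in the Discussion.
TURNOVER = {"S. carolinense": 0.08, "L. andersonii": 0.10}
FOLD_INCREASE = 4          # estimated change in origination rate
CRITICAL_TURNOVER = 0.48   # minimum turnover consistent with k = 2, n = 28


def scaled_turnover(turnover, fold):
    """Scale turnover in proportion to the origination rate
    (assumes constant population size)."""
    return turnover * fold


def is_sufficient(turnover, fold, critical):
    """True if the scaled turnover reaches the critical minimum."""
    return scaled_turnover(turnover, fold) >= critical


if __name__ == "__main__":
    for species, t in TURNOVER.items():
        scaled = scaled_turnover(t, FOLD_INCREASE)
        ok = is_sufficient(t, FOLD_INCREASE, CRITICAL_TURNOVER)
        print(f"{species}: scaled turnover {scaled:.2f}, sufficient: {ok}")
```

Under these inputs neither scaled value reaches the critical minimum, which is the basis for the conclusion that the estimated change in origination rate is insufficient by itself.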
Assuming that differences in terminal branch length are due to differences in the origination rate of allelic specificities, we asked whether differences in the origination rate are sufficient to account for differences in TGE among taxa. We used estimates of TGE for population samples (Table 1) to estimate the amount of allelic turnover using the approach of Takahata (Klein et al., 1993b; Takahata, 1993b). We then asked, if there was an increase in the origination rate in species with low turnover, was this expected to result in high turnover as is found in P. crassifolia? In particular, we estimate a four-fold increase in the origination rate in P. crassifolia relative to L. andersonii, based on differences in terminal branch lengths. A four-fold increase in the origination rate would result in a four-fold increase in turnover as well, given the assumption of constant population size. Estimates of turnover in S. carolinense and L. andersonii assuming a four-fold increase in turnover are 0.32 and 0.40, respectively. We note that these values are overestimates in that they assume that turnover occurs at a four-fold higher rate over the entire period since the divergence of Nicotiana from the common ancestor of Solanum and Lycium, when in fact we know that the increase in turnover must have occurred much later, after the divergence of Physalis from Solanum
(see phylogeny of genera in Fig. 1). We compared these values to the critical minimum level of turnover for P. crassifolia consistent with the observation of two trans-generic lineages in 28 alleles. This value (0.48), estimated using a log likelihood ratio test (Kishino and Hasegawa, 1989), exceeded the estimates of turnover for S. carolinense and L. andersonii assuming a four-fold increase in origination rate. We therefore conclude that this estimate of change in the origination rate by itself is insufficient to account for the estimated allelic turnover in P. crassifolia.

The preceding analysis assumes that a change in origination rate applies equally to all allelic lineages in a species. Alternatively, if one or only a few lineages show an accelerated rate of origination, this would be expected to result in higher turnover than estimated here. While we do not know how many lineages were present at the time of initial diversification, it is clear that extensive allelic diversification occurred in multiple trans-generic lineages which pre-date the presumed increase in origination rate. The observation of contemporaneous diversification of multiple trans-generic lineages in P. crassifolia strongly suggests that diversification was triggered by a single demographic event.

Nevertheless, Uyenoyama (1997) concluded that the genealogy for P. crassifolia does not deviate significantly from expectation for a genealogy at equilibrium, a finding which would appear to contradict the proposition that allelic diversity in this species is the result of a relatively recent bottleneck. However, it appears the analysis of Uyenoyama (1997) has only limited power to detect deviations from equilibrium expectation. An alternative analysis finds evidence for a significant diversification event in the genealogy of P. crassifolia (Richman, unpubl. res.).

While a bottleneck event appears necessary to explain the diversification of S-alleles in P.
crassifolia, this does not necessarily rule out a contribution of a change in the origination rate in explaining inter-specific differences in S-allele age. However, we suggest there is no compelling evidence to support a change in the origination rate. Long terminal branches relative to expectation found in L. andersonii and S. carolinense may result from violations of assumptions of the coalescent model other than change in the origination rate (Takahata, 1990; Uyenoyama, 1997). We have suggested that divergent allele advantage and not sheltered load is responsible for the observation of genealogies with long terminal branches (Richman and Kohn, 1999). Long terminal branches are also observed for other balanced genetic polymorphisms, including the MHC class II loci of mice (Wakeland et al., 1990; Richman and Kohn, 1999). Wakeland et al. (1990) proposed that divergent allele advantage is responsible for the maintenance of highly divergent sequence motifs in MHC genes known to affect antigen presentation. In this case, a relatively divergent heterozygous MHC genotype is postulated to present a wider variety of antigens to the immune system, conferring an advantage in resistance to pathogens. Divergent allele advantage may operate at the S-locus if the ability to discriminate self from non-self increases with S-allele sequence divergence. False rejection of similar alleles as self would generate a mating
disadvantage for close alleles relative to alleles with sequences highly divergent from others in the population. An important distinction between a mechanism of divergent allele advantage and a mechanism of sheltered load (Uyenoyama, 1997) is that the former does not rely on the assumption of sheltered recessive mutations at closely linked loci, and is therefore expected to be unaffected by the history of inbreeding.

ACKNOWLEDGEMENTS

I thank the organizers of the Symposium on Pollen/Stigma Interactions for the opportunity to participate, and also Dr M. J. Lawrence and colleagues for their seminal work on plant self-incompatibility and the inspiration it provided. This study was supported by National Science Foundation grant D.E.B. 98-70766 to A.D.R.

LITERATURE CITED

Brace J, King GJ, Ockendon DJ. 1993. Development of a method for the identification of S-alleles in Brassica oleracea based on digestion of PCR-amplified restriction endonucleases. Sexual Plant Reproduction 7: 169–176.
Brace J, King GJ, Ockendon DJ. 1994. A molecular approach to the identification of S-alleles in Brassica oleracea. Sexual Plant Reproduction 7: 203–208.
Franklin-Tong VE, Franklin FCH. 1993. Gametophytic self-incompatibility: Contrasting mechanisms for Nicotiana and Papaver. Trends in Cell Biology 3: 340–345.
Huang S, Lee HS, Karunanandaa B, Kao TH. 1994. Ribonuclease activity of Petunia inflata S proteins is essential for rejection of self-pollen. Plant Cell 6: 1021–1028.
Kishino H, Hasegawa M. 1989. Evaluation of the maximum likelihood estimate of the evolutionary tree topologies from DNA sequence data, and the branching order in Hominoidea. Journal of Molecular Evolution 29: 170–179.
Klein D, Ono H, O'Huigin C, Vincek V, Goldschmidt T, Klein J. 1993a. Extensive MHC variability in cichlid fishes of Lake Malawi. Nature 364: 330–334.
Klein J, Satta Y, Takahata N, O'Huigin C. 1993b. Trans-specific Mhc polymorphism and the origin of species in primates. Journal of Medical Primatology 22: 57–64.
Lawrence MJ, Franklin-Tong VE. 1994. The population genetics of the self-incompatibility polymorphism in Papaver rhoeas. IX. Evidence of an extra effect of selection acting on the S-locus. Heredity 72: 353–364.
Lawrence MJ, Lane MD, O'Donnell S, Franklin-Tong VE. 1993. The population genetics of the self-incompatibility polymorphism in Papaver rhoeas. V. Cross-classification of the S-alleles of samples from three natural populations. Heredity 71: 581–590.
Lee HS, Huang SS, Kao T-H. 1994. S-proteins control rejection of self-incompatible pollen in Petunia inflata. Nature 367: 560–563.
O'Donnell S, Lawrence MJ. 1984. The population genetics of the self-incompatibility polymorphism in Papaver rhoeas. IV. The estimation of the number of alleles in a population. Heredity 53: 495–507.
Paxman GJ. 1963. The maximum likelihood estimation of the number of self-sterility alleles in a population. Genetics 48: 1029–1032.
Richman AD, Kohn JR. 1996. Learning from rejection: the evolutionary biology of single-locus incompatibility. Trends in Ecology and Evolution 11: 497–502.
Richman AD, Kohn JR. 1999. Self-incompatibility alleles in Physalis: Implications for historical inference from balanced polymorphisms. Proceedings of the National Academy of Sciences, USA 96: 168–172.
Richman AD, Uyenoyama MK, Kohn JR. 1996a. S-allele diversity in a natural population of ground cherry Physalis crassifolia (Solanaceae) assessed by RT-PCR. Heredity 76: 497–505.
Richman AD, Uyenoyama MK, Kohn JR. 1996b. Allelic diversity and gene genealogy at the self-incompatibility locus in the Solanaceae. Science 273: 1212–1216.
Richman AD, Kao T-H, Schaeffer SW, Uyenoyama MK. 1995. S-allele sequence diversity in natural populations of Solanum carolinense Horsenettle. Heredity 75: 405–415.
Swofford DL. 1999. PAUP*. Phylogenetic analysis using parsimony (*and other methods). Version 4. Sunderland, MA: Sinauer.
Takahata N. 1990. A simple genealogical structure of strongly balanced allelic lines and trans-species evolution of polymorphism. Proceedings of the National Academy of Sciences, USA 87: 2419–2423.
Takahata N. 1993a. Allelic genealogy and human evolution. Molecular Biology and Evolution 10: 2–22.
Takahata N. 1993b. Evolutionary genetics of human paleo-populations. In: Takahata N, Clarke AG, eds. Mechanisms of molecular evolution. Sunderland, MA: Sinauer, 1–21.
Takezaki N, Rzhetsky A, Nei M. 1995. Phylogenetic test of the molecular clock and linearized trees. Molecular Biology and Evolution 12: 823–833.
Uyenoyama MK. 2000. The evolution of breeding systems. In: Singh RS, Krimbas C, eds. Evolutionary genetics from molecules to morphology. New York: Cambridge University Press.
Uyenoyama MK. 1997. Genealogical structure among alleles regulating self-incompatibility in natural populations of flowering plants. Genetics 147: 1389–1400.
Vekemans X, Slatkin M. 1994. Gene and allelic genealogies at a gametophytic self-incompatibility locus. Genetics 137: 1157–1165.
Vincek V, O'Huigin C, Satta Y, Takahata N, Boag PT, Grant PR, Grant BR, Klein J. 1997. How large was the founding population of Darwin's finches? Proceedings of the Royal Society of London B 264: 111–118.
Wakeland EK, Boehme S, She JX. 1990. The generation and maintenance of MHC class II gene polymorphism in rodents. Immunological Reviews 113: 207–226.
Wright S. 1960. On the number of self-incompatibility alleles maintained in equilibrium by a given mutation rate in the population of a given size: A re-examination. Biometrics 16: 61–85.
Wright S. 1965. The distribution of self-incompatibility alleles in populations. Evolution 18: 609–619.
Yokoyama S, Hetherington LE. 1982. The expected number of self-incompatibility alleles in finite plant populations. Heredity 48: 299–303.
Yokoyama S, Nei M. 1979. Population dynamics of sex-determining alleles in honey bees and self-incompatibility alleles in plants. Genetics 91: 609–626.