Speciation and evolution in the Gagea reticulata species complex (Tulipeae; Liliaceae)

Speciation and evolution in the Gagea reticulata species complex (Tulipeae; Liliaceae)

Molecular Phylogenetics and Evolution 62 (2012) 624–639 Contents lists available at SciVerse ScienceDirect Molecular Phylogenetics and Evolution jou...

2MB Sizes 23 Downloads 54 Views

Molecular Phylogenetics and Evolution 62 (2012) 624–639

Contents lists available at SciVerse ScienceDirect

Molecular Phylogenetics and Evolution journal homepage: www.elsevier.com/locate/ympev

Speciation and evolution in the Gagea reticulata species complex (Tulipeae; Liliaceae) Mehdi Zarrei a,⇑, Paul Wilkin a, Martin J. Ingrouille b, Ilia J. Leitch a, Sven Buerki a,c, Michael F. Fay a, Mark W. Chase a a b c

Royal Botanic Gardens, Kew, Richmond TW9 3AB, UK School of Biological and Chemical Science, Birkbeck College, University of London, Mallet Street, London WC1E 7HX, UK Department of Biodiversity and Conservation, Real Jardin Botanico, CSIC, Plaza de Murillo 2, 28014 Madrid, Spain

a r t i c l e

i n f o

Article history: Received 16 February 2011 Revised 29 October 2011 Accepted 4 November 2011 Available online 17 November 2011 Keywords: Ancestral polymorphisms Gene duplication Hybridization Low-copy nuclear genes Ploidy Network analysis

a b s t r a c t For the 12 named taxa in the Gagea reticulata species complex, 609 cloned sequences of the low-copy nuclear gene malate synthase (MS) were used to investigate species relationships, using standard phylogenetic tools and network analyses. Three (homologous) copies of MS locus were present in each individual analyzed, and multiple alleles were present at most of these loci. Duplication of MS occurred after divergence of the G. reticulata complex. After comparisons, 591 sequence types (i.e. haplotypes) were identified, requiring implementation of novel statistical analyses to group haplotypes in a smaller number of groups/lineages to enable further study. Haplotype groups/lineages are not fully congruent with species limits with some widely present among species. MS genotypes at the root of the network are those of G. setifolia from central Iran, with more derived sequences in this species found in the west and northwest. Presence of ancestral genotypes in several other taxa may indicate either the retention of ‘‘ancestral’’ polymorphisms, more recent introgressive hybridization, or both. The relative DNA content of specimens was estimated with flow cytometry (FCM). The FCM analyses revealed two levels of DNA content (putatively ‘‘diploid’’ and ‘‘tetraploid’’), but no correlation between number of MS gene copies and ploidy was found. Ó 2011 Elsevier Inc. All rights reserved.

1. Introduction The use of DNA sequence data in plant systematics has progressed dramatically over the past 20 years. Studies have mainly been based on a few well-characterized loci (Lewis and Doyle, 2001). Through the use of regions of plastid DNA (Asmussen and Chase, 2001; Shaw et al., 2005) and nuclear ribosomal DNA (nrDNA) (Álvarez and Wendel, 2003; Baldwin, 1992, 1993; Baldwin et al., 1995), a broad range of systematic and phylogenetic questions in plants have been addressed (Lewis and Doyle, 2001). However, early reliance by systematists on uniparentally (plastid) sequences alone (Chase and Fay, 2009) is potentially flawed (Rieseberg and Wendel, 1993), and development of biparentally inherited markers was a crucial step in increasing robustness of phylogenetic reconstruction. Nuclear ribosomal internal transcribed spacer, nrITS, sequences were among the first candidate nuclear genes for such studies, but gene conversion/concerted evolution of nrITS make it unreliable in some taxa for understanding parentage of hybrids (Chase et al., 2003). ⇑ Corresponding author. Present address: Department of Ecology and Evolutionary Biology, University of Toronto, 25 Willcocks St., Toronto, Ontario, Canada M5S 1B1. Fax: +1 416 586 7921. E-mail address: [email protected] (M. Zarrei). 1055-7903/$ - see front matter Ó 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.ympev.2011.11.003

Early studies tended to focus on family and generic relationships, but for research at lower taxonomic ranks it has become clear that DNA regions with a high evolutionary rate are required (Lewis and Doyle, 2001). In general nrITS has proved a suitable marker for lower taxonomic levels and has been used successfully for studies of congeneric and closely related species (e.g. Álvarez and Wendel, 2003; Baldwin et al., 1995; Feliner and Rossellò, 2007; Hughes et al., 2006; Peruzzi et al., 2008a,b; Peterson et al., 2004, 2008). However, nrITS sequence variation appears to be inadequate for the study of some closely related complexes of species in at least some taxa (Baldwin et al., 1995; Sang, 2002). Zarrei et al. (2009) investigated species delimitation within Gagea Salisb. sensu lato (including Lloydia Salisb. ex Rchb.) using plastid genes/spacers and nrITS. They revealed that a combined dataset of nuclear and plastid sequences did not have sufficient character polymorphisms to uncover relationships among the five major clades. Moreover, no resolution was observed between species in G. section Platyspermum Boiss., containing the G. reticulata complex of species investigated here. Despite its normally high level of evolutionary divergence, analysis of nrITS sequences did not resolve the backbone of the tree or provide unequivocal evidence of relationships between major groups. Moreover, from study of morphological, karyological, and molecular evidence, it has been revealed that polyploidization and hybridization play an important

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

role in evolution and speciation in Gagea (Peruzzi, 2008a; Peterson et al., 2009). For example, the hybrid origins of G. luberonensis J.-M. Tison, G. polidorii J.-M. Tison and G. pomeranica Ruthe have been documented by Peruzzi (2008a) using morphological and karyological evidence (Peruzzi, 2008b). However Zarrei et al. (2009) found no evidence of multiple copies of the nrITS region, suggestive of a complete homogenization of the nrITS region in the G. reticulata species complex. Low-copy nuclear genes are suitable alternatives to nrITS, with higher divergence expected in many taxa (Sang, 2002; Small et al., 2004; Wolfe et al., 1987). Here we have explored the utility of lowcopy nuclear gene, malate synthase (MS), in the systematics of the G. reticulata complex. Malate synthase, which catalyzes conversion of glyoxylate to malate in the glyoxylate cycle (Graham et al., 1989), has been successfully used to reconstruct evolutionary history in Arecaceae (Lewis and Doyle, 2001, 2002) and Rosa (Joly et al., 2006). The Gagea reticulata (Pall.) Schult. and Schult. f. species complex (reviewed in Zarrei and Zarre, 2005; Zarrei et al., 2010a,b,c) belongs to Gagea section Platyspermum and comprises at least twelve named taxa, G. alexeenkoana Miscz., G. anonyma Rech.f., G. bergii Litv., G. calcicola Zarrei and Wilkin, G. caroli-kochii Grossh., G. commutata K.Koch, G. graminifolia Vved., G. reticulata, G. robusta Zarrei & Wilkin, G. tenuifolia (Boiss.) Fomin, G. setifolia Baker, and G. vegeta Vved. The center of diversity for Gagea section Platyspermum is Iran, although most species also grow in neighboring countries. Some have wide distributions, reaching as far as North Africa. G. reticulata is the most widespread species in dry areas (Zarrei et al., 2010b,c). Monophyly of the G. reticulata species complex has been demonstrated by Zarrei et al. (2009), but that study did not provide resolution between species of the complex. Many of the named taxa appear to be closely related. This kind of pattern has been ascribed to reticulate evolution/hybridization in other groups (Hörandl et al., 2009; Linder and Rieseberg, 2004; Lo et al., 2010; Robertson et al., 2010; Vriesendorp and Bakker, 2005), but genetic data for G. section Platyspermum and particularly the G. reticulata species complex are scarce. Probably some taxa only merit recognition as subspecies or morphological variants. Zarrei et al. (2007, 2010b,c) went some way towards establishing species limits and published detailed morphological studies of Iranian species of Gagea, with a focus on the G. reticulata complex, but this group suffers from a lack of detailed phylogenetic studies. Unfortunately, karyological data for the G. reticulata species complex are scarce. Chromosome numbers have been reported only for G. caroli-kochii (2n = 24, Davlianidze, 1969, 1976); G. graminifolia (2n = 24, Romanov, 1936); G. reticulata (2n = 24, Ghafari, 2008; Heyn and Dafni, 1971; Koul et al., 1976; Koul and Wakhlu, 1985; Magulaev, 1992; Malik and Sehgal, 1959); G. taurica Stev. (=G. alexeenkoana, 2n = 24, Davlianidze, 1976), G. commutata (2n = 24, 36, Heyn and Dafni, 1971), and G. vegeta (2n = 60, Davlianidze and Levichev, 1987). In this paper, ploidy within and among species belonging to the G. reticulata species complex has been assessed by flow cytometry of individuals from numerous localities. These results are then combined with those from the MS analyses to address the following topics in this species complex: (i) species circumscriptions; (ii) extent of hybridization that has taken place; and (iii) potential migratory routes during speciation.

625

pled in the field from 134 localities for the twelve species of the G. reticulata species complex throughout Iran. Two or more individuals (up to 15 in some cases) were sampled per locality, and herbarium vouchers of the same individuals sampled for DNA (two individuals) were collected and numbered for a morphometric study (unpubl.). Because G. reticulata complex taxa reproduce vegetatively through stolons (G. anonyma), bulbils (G. bergii) and replacement bulbs (G. reticulate) (Zarrei et al., 2010b,c), samples were collected from individuals more than 5 m apart in each locality. For localities 78, 123–134 (population codes as in Appendix) only one individual was included for the molecular study. Sequences for five Gagea species, G. bulbifera (Pall.) Salisb., G. kunawurensis (Royle) Greuter (syn: G. stipitata Merckl. ex Bunge, Zarrei et al., 2011a), G. chomutovae (Pascher) Pascher, G. villosa (M.Bieb.) Sweet and G. serotina (L.) Ker Gawl., served as outgroups based on the results of Zarrei et al. (2009). 2.2. DNA extraction, marker amplification and gel cleaning DNA extractions were performed using silica-gel dried basal leaves, bracts and occasionally floral material (Chase and Hills, 1991) with the 2 CTAB method of Doyle and Doyle (1987). All samples were then precipitated in absolute ethanol at 20 °C for a week, pelleted, washed with 70% ethanol, air dried overnight, and resuspended in TE buffer (10 mM Tris/HCl, 1 mM EDTA, pH 8.0). An aliquot of 150 ll was purified using the NucleoSpin Extract II PCR purification kit (Machery-Nagel, GmbH & Co. KG, Germany) following the manufacturer’s protocol. This DNA was used for amplification of malate synthase (MS). The remaining DNA was precipitated in 2.5 volumes ethanol. DNA samples were then purified using a caesium chloride/ethidium bromide gradient (1.55 gml1 density), followed by removal of the ethidium bromide with butanol, dialysis and storage at 80 °C in the DNA Bank of the Jodrell Laboratory, at the Royal Botanic Gardens, Kew (http://data.kew. org/dnabank/homepage.html). Amplification of MS used the previously published primers, ms400f and ms943r (Table 1; Lewis and Doyle, 2001). These primers produce a fragment containing part of exon 1, all of exon 2, part of exon 3, and all of introns 1 and 2 (Fig. 1 in Lewis and Doyle, 2001). PCR reaction volumes of 25 ll were prepared with the following concentrations of components: 20.9 ll of 2.5 mM MgCl2 Reddy MixTM PCR master mix (Abgene Ltd., Epsom, UK), 1.25 mg/ll of both the forward (ms400f) and reverse (ms943r) primers, 0.5 ll of bovine serum albumin (BSA) (0.04%), 0.1 ll of Taq polymerase (GoTaqÒ Flexi DNA Polymerase, Promega, Southampton, UK) and 1.0 ll of genomic DNA. PCR conditions included an initial denaturation step of 4 min at 94 °C, followed by 38 cycles of denaturation (1 min at 94 °C), annealing (1 min at 55 °C), and elongation (2 min at 70 °C), with a final extension step of 10 min at 72 °C. Because ms400f and ms943r primers are respectively 64-fold and 96-fold degenerate (Lewis and Doyle, 2001), five times more primer per reaction was added in comparison with PCR with non-degenerate primers. All PCR products were run on an agarose gel, and selected band(s) were cut out and cleaned using the NucleoSpin Extract II PCR purification kit (Machery–Nagel, GmbH & Co., KG, Germany). Due to the presence of several copies of MS, obtaining unambiguous electropherograms by direct cycle sequencing of MS PCR products was not possible.

2. Materials and methods 2.3. Cloning and sequencing 2.1. Sampling Sampling, voucher information, and GenBank accession numbers are provided in the Appendix. Silica-gel dried (Chase and Hills, 1991) basal leaf, bract and occasionally floral material were sam-

By direct sequencing of MS PCR products, variation in both sequence length and base composition was detected. All gel-cleaned PCR products were spliced into a vector and then transformed into JM109 cells (pGem-T Easy Vector, Promega Ltd. Cat. No. A1360

626

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639 Table 1 Primers used for amplification and cycle sequencing of MS in the current project. Primer

Sequence

References

Forwards ms400f MS-830FGag Ms-ber-2010F MS-ber-2100F MS-anon-315F

50 -GGA AGA TGR TCA TCA AYG CNC TYA AYT C-30 50 -GTC TGG AGG TCA TGG GTT CG-30 50 -TCA GTT AGA ACC TCA TGC CGG-30 50 -ATT AAA ATG TTT TCC TCG GG-30 50 -CCT GTG TGT CAT GGG TGT GCC-30

Lewis and Doyle (2001) This paper This paper This paper This paper

Reverse ms943r MS-1890RGag MS-1500RGag MS-940RGag

50 -GTC TTN ACR TAG CTG AAD ATR TAR TCC C-30 50 -CTG GAG TGC TCC ATC TTA GG-30 50 -GTT MAG MCC CAC RGA ATG GTC CCT YAG Y-30 50 -RAT GGG ATA TGC GCC TCM GGC RAR TGC-30

Lewis and Doyle (2001) This paper This paper This paper

Fig. 1. A schematic diagram of the new approach developed to reduce the number of haplotypes while retaining information from polymorphic sites, which maximizes (i) the number of groups (i.e. haplotypes) and (ii) the percentage of polymorphic characters within each group. Height refers to the position (distance to the tips) where we have applied a cut and defined the number of groups.

Madison, Southampton, UK). The MS region was then re-amplified from the transformed bacterial colonies, using M13 primers. One to 12 clones of successfully transformed colonies per individual were then sequenced. For species with fewer individuals included in the analyses, e. g. G. commutata and G. graminifolia, more clones per individual were sequenced. In addition, PCR products were carefully checked on an agarose gel. If bands were of different lengths, at least four clones were sequenced. Otherwise, only one clone was sequenced. Clones of two individuals from the same population were compared to determine if they clustered together; if they did not, up to 11 more clones were sequenced.

Cleaned PCR products were sequenced on an Applied Biosystems, Inc. (ABI) 3730 genetic analyzer following the manufacturer’s protocols. Sequencing was performed using the vector primers M13F and M13R and internal primers MS-830FGag and MS1890RGag (Table 1). In a few cases MS-940RGag and MS-1500RGag were used to produce sequences of sufficient length. DMSO, to a final concentration of 7%, was added to each cycle sequencing reaction to reduce secondary structure in MS, which helped to produce longer sequences in most cases. Sequences were edited and assembled using Sequencher 4.1 (Gene Codes, Corp.). Alignment was performed using the online

627

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

version of MUSCLE (www.ebi.ac.uk/tools/muscle), subsequently manually adjusted by eye in PAUP v. 4.0b10 for Macintosh (Swofford, 2002), following the guidelines of Kelchner (2000). To improve the alignment of the intron 2 region, sequences were blasted against different repetitive DNA sequences available in the Plant Repeats Database (http://plantrepeats.plantbiology.msu.edu/ ). The most common transposons found were ‘‘CACTA/En/Spm’’, which have been detected in Oryza GSS repeats (Ouyang et al., 2004). The matrix is available on request from MZ or MWC. A recombination test was performed using RDP3 Beta 40 (recombination detection program, Martin et al., 2005). No signs of recombination were found in exons. However, intron 2, which was excluded from analyses, was shown to have recombination, but this could arise due to the presence of transposons. No points of crossing over (Kelly et al., 2010) were detected between intron 2 and upstream exon 2 or downstream exon 3. 2.4. Parsimony analysis A Fitch parsimony (Fitch, 1971) analysis was undertaken using PAUP v. 4.0b10 for Macintosh (Swofford, 2002) with the following parameters: 1000 replicates of random taxon-additions, treebisection-reconnection (TBR) branch swapping, and MulTrees on (i.e., keeping multiple, equally parsimonious trees). To reduce search time, ten trees per replicate were saved out of potentially thousands of trees. All trees collected from the primary heuristic searches were then used as starting trees in another search with a MAXTREES limit of 20,000. Internal support was estimated using 1000 bootstrap replicates (Felsenstein, 1985) using simple taxon addition and TBR swapping but permitting only ten trees per replicate to be held. All analyses were performed in two stages. First, the complete dataset (625 sequences, 591 distinct, including outgroups) was analyzed as above. In a second stage, parsimony and bootstrap analyses were performed on a reduced matrix by keeping only one clone per population when all clones from an individual accession tightly grouped. The reduced data matrix for the final phylogenetic analysis comprised 166 sequences. In addition, two separate analyses were performed on the complete matrix, based only on intron 1 or just exons to compare the relative contribution of both regions to the topology. 2.5. Network analysis The MS dataset was also investigated using a network approach, which we hoped might better explore the effect of hybridization within the G. reticulata species complex. However, as noted above, the number of sequence types prevented a straightforward interpretation of the haplotype network. To reduce the number of haplotypes they were hierarchically clustered based on a pair-wise distance matrix between aligned DNA sequences (the N model of the APE package is used; Paradis et al., 2004) into minimum variance clusters (utilizing Ward’s clustering). The clustering stage/height (i.e., number of clusters/haplotype groups) was selected, by examination of the rate of polymorphism and nucleotide diversity (or Nei’s genetic diversity; Nei, 1987) at a level which maintained a high information content (Fig. 1). Table 2 shows the number of sequences in each allele group, the number of polymorphic sites and the nucleotide diversity for each group. Once the haplotype groups have been defined in this way, a new reduced ‘‘haplotype group’’ matrix is reconstructed by summarizing the sequences of each group based on the IUPAC notation that allows polymorphism at any site in a sequence in each group to be taken into account. The new reduced matrix was used as input for the network analysis. This approach was implemented in R (R

Table 2 Statistical information from the haplotype group analysis of malate synthase (MS) sequences. Haplotypes

No. of sequence

No. of species

No. of polymorphic sites

Genetic diversity

H1 H2 H3 H4 H5 H6 H7 H8 H9 H10 H11 H12 H13 H14 H15 H16 H17 H18 H19 H20 H21 H22 H23 H24 H25 H26 H27 H28 H29 H30 H31

16 14 107 33 27 30 69 27 21 17 7 18 15 20 17 38 17 11 26 5 5 6 8 6 9 5 27 8 4 6 6

2 4 3 4 3 2 9 1 3 2 1 2 3 2 1 4 4 2 3 1 1 1 2 1 1 1 2 1 1 1 3

93 666 1022 1337 660 1029 1627 1123 857 999 40 817 572 78 1059 848 1299 427 1342 30 38 393 34 112 430 45 824 9 102 17 782

0.0145 0.0346 0.0042 0.0231 0.0259 0.0091 0.0285 0.0289 0.024 0.0343 0.0088 0.025 0.0208 0.0061 0.0241 0.0131 0.0068 0.0105 0.0238 0.0067 0.0141 0.012 0.0072 0.0054 0.0341 0.0058 0.0247 0.0026 0.0423 0.0085 0.0624

Development Core Team, 2010) and is available on request from SB/MZ. A network analysis is performed on the reduced ‘‘haplotype group’’ matrix using the parsimony criterion as implemented in TCS (Clement et al., 2000). In the TCS analysis gaps were treated as missing data, and the 95% parsimony connection limit was applied. Because outgroups are distantly related to the ingroups, they were excluded from network analysis. To interpret the allele group network biogeographically, Iran was subdivided into six distinct areas (Fig. 2): (1) Kerman (including Hazar, Lalezar, Jupar, and Khabr Mountains), (2) Zagros chain mountains, (3) northwestern mountains (including Sabalan and mountains near the Iraqi border), (4) Alborz Range (including Kopet Dagh Mountains and northeastern Iran), (5) eastern mountains, and (6) Taftan Mountains. 2.6. Flow cytometry 2.6.1. Flow cytometry means (FCM) measurements Fresh leaf material is usually used for flow cytometry. However, herbarium samples and silica-gel dried material (Chase and Hills, 1991) have been used in ploidy determination by Suda and Trávnícˇek (2006), although it is unsuitable in older specimens for accurate DNA quantification due to unquantifiable losses of DNA during storage. Approximate DNA amounts for 150 individuals collected from 110 localities belonging to 11 taxa were estimated using flow cytometry. Voucher information is provided in the Appendix. All measurements were made on c.1-year-old silica-gel preserved material. Nuclei were isolated from bracts or basal leaves (rarely from flowers) by soaking material in 1 ml of LB01 buffer (15 mM TRIS; 2 mM Na2EDTA; 0.5 mM spermine.4HCl; 80 mM KCl; 20 mM NaCl; 15 mM b-mercaptoethanol; 0.1 % (v/v) Triton X-

628

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

Fig. 2. The geographic distribution of haplotypes in Iran. Iran is divided into six geographic areas according to the topology that is most likely to prevent gene flow between different areas. The sampling for this project was extensive, and we have tried to include samples from all populations that we could reach. The unsampled areas in central and southeastern Iran are desert (Kavire Loot and Dashte Kavir, respectively) that are not inhabited by any of the G. reticulata species complex. If localities close to each other share similar genotypes, to avoid overcrowding the figure we have indicated only one locality positioned between these two or more localities. Areas 1, 4, and 5 share most of genotypes mentioned for each, but they are numbered fewer times due to lack of space. H21 is not included in this map because G. commutata was sampled from outside Iran.

100; pH 7.5; Dolezˇel et al., 1989) for 30 min. A further 1 ml of buffer was then added, and the tissue was chopped with a razor blade. No fruits or seeds were included due to potential problems arising from triploid endosperm and diploid embryos (Suda, 2004). Finally, material was filtered through a 30-lm nylon membrane, and nuclei were stained with 100 ll of propidium iodide (PI, 1 mg/ml) and 32 ll RNase (3 mg/ml) for five minutes. For each flow-cytometry run, 5000 nuclei per sample were analyzed on a CyFlow SL-3 flow cytometer (Münster, Germany) equipped with an argon laser. As this was a comparative experiment, it was important to analyze silica-gel dried materials of the same age due to the potential degradation of DNA over time. A constant gain of 221 was applied to all samples. One individual from each locality was analyzed in most cases except where the plants at a locality showed morphological variation or produced a strange flow cytometry histogram, in which case several individuals per locality were included (Appendix).

Table 3 Descriptive statistical information for species used in study of flow cytometry means (FCM) . S.Z. = sample size. Species

G. G. G. G. G. G. G.

anonyma calcicola caroli-kochii graminifolia vegeta bergii alexeenkoana

G. reticulata G. robusta G. setifolia G. tenuifolia

2.6.2. Analyses of FCM and ploidy estimation The mean position of the first peak (G0/G1 = assumed 2C DNA content) in the flow histogram was used for the analysis. Because no reference standard was available at the time of the experiment for the G. reticulata species complex, it was not possible to convert relative DNA amounts into absolute values in picograms (pg). Therefore flow cytometry data are given in relative DNA amounts here and refer to the mean peak position of the 2C peak (=FCM) in all experiments under standardized conditions. Stability of readings was checked routinely using a fresh sample of G. algeriensis (Chabert) Stroh (G. section Didymobulbos) from the RBG, Kew, nursery. Descriptive statistical information of the specimens examined in this project is given in Table 3.

‘‘Diploid’’

‘‘Tetraploid’’

Mean

Range

Mean

Range

130.08 134.85 135.14 122.44 139.41 – 127.46 – 118.63 – 110.42 – 121.74 – 129.04 –

114.06–145.43 134.00–136.37 127.98–147.29 120.05–124.83 134.11–145.06 – 113.41–146.78 – 104.03–136.09 – 94.53–116.60 – 79.82–145.67 – 119.51–150.00 –

– – –

– – –

198.39 – 214.85 – 198.92 – 184.68 – 189.13 – 222.92

178.72–210.00 – 174.75–253.96 – 189.24–208.70 – 170.00–195.00 – 173.25–205.00 – 221.99–223.84

S.Z.

21 3 3 2 6 7 11 17 22 5 5 6 23 2 15 2

3. Results 3.1. Dataset A total of 625 MS sequences obtained from the twelve named taxa of the G. reticulata species complex and four outgroups species at 139 localities was generated. Sixteen bases at the end of exon 1, all of intron 1, all 330 bases of exon 2, and 176 bases at the beginning of exon 3 (Fig. 1 in Lewis and Doyle, 2001) were included in both parsimony and network analyses. Intron 2 was excluded from

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

the analyses due to presence of transposons, which made alignment difficult. Amplified fragments of MS were 1060–1592 base pairs (bp) in length. The shortest was from a clone of G. setifolia from locality 111 (1060 bp) and the longest G. reticulata from locality 59 (1592 bp). Gagea anonyma and G. setifolia clones were mostly 1062 and 1217 bp, respectively. Gagea reticulata accessions were mainly 1100–1150 bp long, although occasionally they reached 1592 bp (e.g. some clones from localities 83, 69, 59 and 93). Intron 1 was responsible for most of the length variation in MS sequences. The shortest intron 1 belonged to G. alexeenkoana (locality 100) with 389 bp and the longest were in G. reticulata at localities 83 and 51 with 951 bp and locality 69 with 950 bp. However, clones of G. reticulata from locality 69 showed length variation for intron 1 of 474–498 bp. Introns were more variable than exons. There were a number of indels in intron 1, which did not correlate with taxa or haplotype groups. No indels were observed in any of the three exons. 3.2. Parsimony analyses The data matrix comprised 1916 positions of which 1392 (72.7%, including outgroups) were variable and 989 (51.6%, including outgroups) potentially parsimony informative. The analysis yielded 15 trees of 5355 steps with a consistency index of 0.41 and retention index of 0.85 (results not shown). The parsimony analysis of the reduced data set, comprising 33.8% of potentially parsimony informative positions (including outgroups), generated 110 most parsimonious trees with 2862 steps each, consistency index (CI) = 0.49, and retention index (RI) = 0.71. One of the most parsimonious trees randomly selected from 110 trees is presented in Fig. 3A and B. The monophyly of the ingroup taxa is well supported, bootstrap percentage (BP) 97. There is no resolution between major clades in the ingroup. However, there are some well-supported groups. Each group is named after the species that contributed greatest number of DNA sequences. Groups are recovered for most species, for example, anonyma-type I, anonyma-type II, and anonyma-type III are recovered with a BP of 90, 100, and 100, respectively. However, sequences of G. vegeta, G. commutata, and G. graminifolia do not form specific supported groups (Fig. 3A). The clones belonging to these three species fall in reticulata-type II. Two groups (bergii-type I and bergii-type II) are recovered in G. bergii (Fig. 3A and 3B). 3.3. Network analyses Based on the raw MS dataset, 591 sequence types were present (i.e., some of the 625 clones were identical with others) which correlated only poorly with taxa. With this number and level of complexity it was impossible to interpret the network and so further analysis was carried out on 31 haplotype groups produced in the manner described above. The haplotype group matrix was subjected to parsimony network analysis, with a standard Fitch parsimony (Fitch, 1971) using PAUP v. 4.0b10 for Macintosh (Swofford, 2002). Due to presence of large number of the polymorphic positions in these sequences that summarize the variation in the larger matrix, the resulting MP trees produce a large polytomy in the consensus tree (results not shown). Information for each haplotype group, including number of clones and species per haplotype group and genetic diversity within haplotype groups, is presented in Table 2. Allocation of species per haplotype is presented in Table 4. Haplotype group H3 included the greatest number of sequences (107), whereas H20, H21 and H26 had the fewest (5; Table 4). Most haplotype groups included sequences occurring in one to four species, whereas H7 contained sequences occurring in nine species (Table 2). The greatest genetic diversity of a haplotype group was found in H2 (0.0346) and the lowest in H28 (0.0026; Table 2).

629

The parsimony network obtained from TCS analysis with 2000 steps of fixed connection limits and treating gaps as missing data is presented in Fig. 4. The percentage of species comprising each haplotype is represented as pie charts. Species with less than 15% of clones per haplotype are not shown in the circles. Twenty-five haplotypes were retained in the network, although there were no consistent differences between H8 + H10, H2 + H7, and H16 + H19, which therefore are shown as one haplotype (Fig. 4). H8 and H10 appeared to be central to the other 23 haplotypes and therefore are close to a putative ‘‘ancestral’’ sequence. H12, H16 (including H19), and H4 are clusters derived from the ancestral haplotype (H8 + H10) with a maximum of three genetic changes. H27, H24, H28, H11, and H6 are clusters with more than five changes. H12 and H16 (+H19) are closely related with one change only. H22 and H20 are derivatives of H12. Similarly, H21, H18, H17, H15, H13, and H3 are derived from H16 (+H19). H5 is an intermediate cluster between H23 and H4. H4 is intermediate between H14 and the ancestral haplotype, whereas H25 is intermediate between H26 and H4. The 65 sequences of G. alexeenkoana occur in five haplotype groups. The majority was found in H4, H5, and H27, each with 19, 20, and 24 sequences, respectively (Table 4). H7 and H19 each included only one sequence of this species. Gagea anonyma (n = 135) fell in eight groups, but mostly in H3 and H6 with 94 and 23 sequences, respectively. Clones of G. bergii (n = 37) fell in five groups; H13 and H15 included the most (n = 12 and 17, respectively). Clones of G. calcicola (n = 26) were distributed in five groups, 16 sequences of which were in H14. Only three clones of G. caroli-kochii were sequenced, and all fell in H27. H21 included only five clones, all from G. commutata. All clones of G. graminifolia (n = 7) fell in H2 + H7. Clones of G. reticulata (n = 95) fell into 14 groups with most occurring in H12, H19, H7, H17 with 17, 16, 11, and 10, respectively. Only 25 clones of G. robusta were sequenced. They clustered into five groups; H2 + H7 possessed the greatest number (13). Gagea setifolia with 138 sequences was the species with the most clones sequenced in this project. They fell into 12 groups. H7 and H8 had the most with 22 and 27, respectively. Fifty-seven sequences of G. tenuifolia fell in six groups. Sixteen cloned sequences of G. vegeta were clustered into three groups (H7, H25, and H26). There was correlation between the relationships of MS types and the haplotype group network. The haplotype groups mapped onto one of the most parsimonious trees, randomly selected, is presented in Fig. 3A and B. There was clear allocation of haplotype groups to the MS types recognized according to the MP tree. All sequences of anonyma-type II belonged to H3 and anonyma-type I comprised mostly sequences of H6 (Fig. 3A). H13 and H15 consisted of bergii-type I (Fig. 3A) and bergii-type II, respectively. All alexeenkoana-type I sequences belonged to H27 (Fig. 3B), whereas H4 and H5 were consisted mostly of alexeenkoana-type II. Both bergii-type I (H13) and anonyma-type II (H3), which were related in the haplotype group network (Fig 4) were sister to each other in a clade (BP 97; Fig. 3A and F). H16–H19 were nested in reticulata-type III (Fig. 3B) and are all related in the network (Fig. 4). The geography of these genetic variants in Iran (based on data in Table 2) is shown in Fig. 2. Some haplotype groups possessed sequences from only one mountain range, whereas most haplotype groups shared their sequences among different mountains. All sequences of H27 were from northwestern Iran (area 3), whereas all sequences of H11 and H24 were from the Damavand Mountains in the north-western part of the Alborz Range (area 4). H1 and H2 were from the eastern range of the Alborz Range (area 5), whereas H3 possesses sequences from the Kerman Mountains (area 1) and the Taftan Mountains (area 6) as well as the Alborz Range. Sequences of G. setifolia collected from the Taftan Mountains fell into three groups, H3, H6, and H7. This is the only species that was col-

630

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

B

Fig. 3. (A) One of the most parsimonious trees, randomly selected from 110 trees, obtained from analysis of the reduced data matrix of malate synthase sequences (166 OUT). Tree length = 2862, CI = 0.49, and RI = 0.71. Branch lengths (DELTRAN optimization) are indicated above branches and bootstrap percentages below. Branches with no number underneath have BP < 50. Nodes indicated with a hyphen were not retained in the strict consensus tree. The number given after each species is the locality number (Appendix). The number in parenthesis after each species locality is the haplotype number according to Tables 2 and 4.

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

Fig. 3 (continued)

631

632

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

Table 4 Species composition of genetic groups (=haplotypes). Numbers in parenthesis refer to the number of clones for specific species in each haplotype group. The ploidy inferred from FCM data is also presented in parenthesis as d = ’’diploid’’, and t = ’’tetraploid’’ for each species in each haplotype group. ? denotes missing ploidy information for a given species in a specific haplotype group. Haplotype groups

Species and ploidy composition of clones making up each haplotype group

No. of clones

H1 H2 H3 H4 H5 H6 H7

G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G. G.

16 14 107 33 27 30 69

H8 H9 H10 H11 H12 H13 H14 H15 H16 H17 H18 H19 H20 H21 H22 H23 H24 H25 H26 H27 H28 H29 – H31

reticulata (2 = 13%; d, t), G. setifolia (14 = 87%; d) robusta (4 = 28.6%; d), G. setifolia (6 = 42.8%; d), G. bergii (1 = 7.1%; t), G. anonyma (3 = 21.5%; d) anonyma (94 = 87.8%; d), G. setifolia (12 = 10.2%; d), G. reticulata (1 = 2%; d) alexeenkoana (19 = 57.6%; d, t), G. setifolia (7 = 21.2%, d, t), G. calcicola (4 = 12.1%, t, d), G. anonyma (3 = 9.1%; d) alexeenkoana (20 = 74.1%; t), G. reticulata (1 = 3.7%; ?), G. setifolia (6 = 22.2%; t) anonyma (23 = 76.7%; d), G. setifolia (7 = 23.3%; d) alexeenkoana(1 = 1.4%; ?), G. anonyma (5 = 7.2%; d), G. bergii (4 = 8.1%; t), G. graminifolia (7 = 10.1%; d), G. reticulata (11 = 15.9%; d, t), robusta (13 = 18.8%; d, t), G. setifolia (22 = 31.9%; d), G. tenuifolia (4 = 8.1%; d, t), G. vegeta (2 = 2.9%; d) setifolia (27 = 100%; d) anonyma (2 = 9.5%; d), G. robusta (5 = 23.8%; d), G. setifolia (14 = 66.7%; d) bergii (3 = 17.6%; t), G. setifolia (14 = 82.4%; d, t) reticulata (7 = 100%; ?) reticulata (17 = 94.4%; d, t), G. setifolia (1 = 5.6%; d) anonyma (1 = 6.7%; d), G. bergii (12 = 80%; t), G. reticulata (2 = 13.3%; d, t) anonyma (4 = 20%; d), G. calcicola (16 = 80%; d) bergii (17 = 100%; t) calcicola (2 = 2.3%; d), G. reticulata (7 = 18.4%; d, t), G. robusta (2 = 2.3%; d, t), G. tenuifolia (27 = 77%; d, t) calcicola (1 = 5.9%; d), G. reticulata (10 = 58.8%; t), G. robusta (1 = 5.9%; d, t), G. tenuifolia (5 = 29.4%; d) reticulata (4 = 36.4%; d), G. tenuifolia (7 = 63.6%; d, t) alexeenkoana (1 = 3.8%; t), G. reticulata (16 = 61.5%; d, t), G. tenuifolia (9 = 34.6%; d, t) reticulata (5 = 100%; d) commutata (5 = 100%; ?) reticulata (6 = 100%; d) calcicola (3 = 37.5%; d), G. tenuifolia (5 = 62.5%; d) reticulata (6 = 100%; ?) vegeta (9 = 100%; d) vegeta (5 = 100%; d) alexeenkoana (24 = 88.9%; d, t), G. caroli-kochii (3 = 11.1%; d) setifolia (8 = 100%, ?) bulbifera (4), G. serotina (6), G. kunawurensis (10), G. villosa (2), G. chomutovae (3)(outgroups)

27 21 17 7 18 15 20 17 38 17 11 26 5 5 6 8 6 9 5 27 8 25

H8+H10

2

H1 4

H16+H19

H12

H2+H7

6 10

H27

3

7

H9

11

H4

H24 H11 H28

4

8

3 5

12

H22

H6

H5 5

H25

H15

17

4

18

H14 22

H18

7

H20

H23

H17 H13 H26

H21 5

H3

Gagea setifolia Gagea tenuifolia Gagea alexeenkoana Gagea reticulata Gagea robusta

Gagea calcicola Gagea caroli-kochii Gagea graminifolia Gagea vegeta Gagea commutata

Gagea anonyma

Gagea bergii

Fig. 4. Network inferred from the reduced haplotype matrix (see Fig. 1 for more details). Reduced haplotypes are referred to as ‘‘H’’. Number of steps between haplotypes is indicated along connecting lines (when nothing is displayed it means that only one step separates haplotypes). The species representing more than 15% of clones per haplotype are shown. For H8 + H10, H16 + H19 and H2 + H7, the percentage of sum of both haplotypes is encountered. Gagea caroli-kochii occurs only in 11.1% of H27. Because it is only representative of this species, it included in the chart.

633

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

A

2C

B

2C

4C 4C

8C

8C

2C

2C

C

4C

D

4C

8C

8C

Fig. 5. Histograms obtained by flow cytometric analysis (at gain of 221) of PI-stained nuclei isolated from basal leaves and bracts of (A) G. setifolia (locality 67, FCM = 79.82); (B) G. anonyma (locality 46, FCM = 138.86); (C) G. robusta (locality 81, FCM = 173.09); (D) G. alexeenkoana (locality 1, FCM = 253.96); X axis shows relative DNA content and Y axis number of nuclei counted. Signals spanning the range correspond to nuclei synthesizing DNA (S phase) and cell debris.

lected from this isolated mountain range. In contrast, the northern Zagros (Area 2), Alborz and northwestern mountains have groups in common, such as H16. H8 and H10, the putative ‘‘ancestral’’ groups, are found in several possibly ‘‘relict’’ populations around the central desert (Fig 6). 3.4. Flow cytometry The mean position of the first peak (G0/G1 = assumed 2C DNA content) in the flow histogram is given in the Appendix. Selected flow histograms are shown in Fig. 5. These show maximum variation in relative DNA contents of different species/individuals. The prominent peak corresponds to nuclei in G0/G1 phase (with 2C DNA content = 2C), whereas the smaller peak corresponds to nuclei in G2/M phase (with 4C DNA content = 4C). In some histograms, for example G. anonyma (Fig. 5B), G. robusta (Fig. 5C), and G. alexeenkoana (Fig. 5D), additional peaks were also observed that correspond to nuclei that have undergone endopolyploidy and possess 8C DNA content. Descriptive statistical information for specimens analyzed is summarized in Table 3. The lowest FCM measured belonged to G. setifolia at locality 67 (FCM = 79.82; Fig. 5A; Appendix and Table 3), whereas the highest was an accession of G. alexeenkoana from locality 1 (FCM = 253.96; Fig. 5D; Appendix and Table 3). The flow histograms of all plants sampled clearly show two separate and well-defined modes of FCM values. Some species exhibited only one mode of FCM values (i.e. G. anonyma; G. calcicola; G. caroli-kochii; G. graminifolia; G. vegeta, and G. bergii) representing a single ploidy level, whereas some have both (i.e. G. robusta; G. reticulata; G. setifolia; G. tenuifolia, and G. alexeenkoana) indicating the presence of both diploid and tetraploid levels in the species. In most cases, FCMs of different individuals from one population fell in

the same range, but some populations had both modes of FCM (for example localities 63, 76, 81, 82, and 119). 3.5. Number of MS copies Primers specifically designed to amplify only one copy type produced important data to establish the presence of multiple alleles not picked up by cloning. For example MS-ber-2100F primer in combination with MS-1890RGag (Table 1) amplified reticulatatype II in G. bergii from localities 51, 60, 64 and 129 (Fig. 3B and B and Appendix), although these sequence types were not picked up in the cloning process. Together with MS-1890RGag and MSber-2010F (Table 1), the MS-ber-2010F primer amplified MS from samples of G. bergii collected at localities 60 and 130, which did not otherwise produce bergii-type I copies (Fig. 3A and B). These tests confirm that three copies of MS are present in the genome of G. bergii at all localities sampled. In order to test the copy type in G. anonyma, the primer MS-anon-315F was designed (Table 1). Primer MS-anon-315F (specific for anonyma-type I) paired with MS-830RGag also amplified anonyma-type I in localities that did not have that type in clones that were sequenced (localities 24, 29, 31, 36, 48, 49, 70, 71 and 97). The highly degenerate PCR primers (ms400f + ms943r; Table 1) failed to amplify the specific copies of MS in some individuals. This suggests that copies in some taxa have diverged such that all three copies do not amplify with the standard primers. 4. Discussion Given the extent of morphological variation in the G. reticulata complex, it is not surprising that there have been disagreements regarding taxon limits. Gagea alexii Ali and Levichev has been re-

634

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

Fig. 6. The distribution of ancestral genetic groups (H8 + H10) in Iran. The numbers are based on haplotypes in Table 4. The grey area indicates the hypothetical paleodistribution of the ancestral population. All of areas 1, 5, and most of 4 (except the northeastern portions in the Kopet Dagh Mountains) are occupied by remnants of the ancestral populations. A few sequences were also recovered from eastern slopes of Zagros range (area 2). These clones are from isolated mountains near deserts that are thought to be remnants of larger ancestral population that once occupied all central, northern, and eastern Iran including the desert (Kavir) between them.

corded from southern Iran (Ajani et al., 2010) based on only one locality. This is a form of G. alexeenkoana that is common throughout western, southern, and northern Iran. Both molecular studies (e.g. localities 103, 117, and 118 in the current study; Appendix) and morphometric analyses (Zarrei et al., in preparation) did not support the identity of G. alexii as a distinct species. Ajani et al. (2010) identified G. luteoides Stapf to be a member of Gagea section Incrustatae Levichev. This species, which was reduced to a synonym of G. fragifera (Vill.) Bayer and López (Zarrei et al., 2007), belongs to Gagea section Didymobulbos K.Kokh, and the earlier section is now considered a synonym of G. section Platyspermum (Zarrei et al., 2011b,c). Gagea tenuifolia and G. tehranica Gand. were treated as distinct species in Flora Iranica (Wendelbo and Rechinger, 1990), but they were combined as G. reticulata by Zarrei et al. (2007). Gagea perpusilla Pascher was subsumed by Zarrei et al. (2007) as G. setifolia, as previously suggested by Wendelbo and Rechinger (1990). Gagea caroli-kochii was reduced to a synonym of G. alexeenkoana (Zarrei et al., 2007), and G. anonyma was reduced to a synonym of G. setifolia (Zarrei et al., 2007). However, for the purposes of this study, we retained them all as putatively distinct entities to evaluate their genetic distinctiveness and facilitate study of gene flow and hybridization/polyploidization at the population level. 4.1. Ploidy in the G. reticulata species complex In this study, the flow cytometry mean peak positions (FCM) reported in different individuals could have several causes. (i) There might have been differential rates of DNA degradation during sample storage, but this does not explain the large differences observed in some samples treated in the same way and expected to have similar DNA content. For example, the FCM for G. alexeenkoana from locality 1 was 2.24 times higher than that from locality 113

(Appendix), suggesting significantly higher relative DNA amounts. Since all samples were prepared from plant material that was approximately one-year old and had been collected in the same way (i.e. taken directly from the plant, cut into small pieces, transferred directly into the ziplock bag of silica-gel and kept in the dark), all enzymes are assumed to be deactivated immediately after removal from the live plant (Chase and Hills, 1991), particularly as the basal leaves and bracts of Gagea are thin and narrow (Zarrei et al., 2007). All these factors should have enhanced the drying process and deactivated enzymes involved in DNA degradation. In addition, preparations were treated identically during FCM analyses, and therefore differential effects of deterioration during sample preparation (Suda, 2004) were likely reduced to zero. Nevertheless, differences in cytochemistry between samples/species that interact with DNA degradation processes and/or interfere with the binding of the fluorochrome staining may contribute to some observed differences in relative DNA amounts between samples expected to be similar. (ii) Intraspecific variation in DNA amount is another possibility. Variation in genome size of the same species with constant chromosome number has been reported (Dolezˇel et al., 2007; Greilhuber, 2005). It is therefore possible that some variation reported in relative DNA amounts within a species is due to genuine intraspecific variation. For example, heterochromatin variation and supernumerary segments will contribute to variation in DNA amounts observed between individuals (Ohri, 1998). However, there is currently no cytological data available that can confirm whether intraspecific variation occurs in the G. reticulata complex. This will require careful karyological studies combined with genome size measurements made on fresh material. (iii) Variation in ploidy between individuals is the most likely explanation for the observed range of relative DNA amounts, that is, two cytotypes, ‘‘diploid’’ and ‘‘tetraploid’’, exist for some species,

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

although, to date, no tetraploid individuals have been reported in the G. reticulata complex (based on chromosome counts of six species, see Section 1). Peruzzi et al. (2009) showed that total haploid length (THL) of chromosome in the G. section Platyspermum is the most variable. This variation may be responsible for the FCM differences in accessions with the same potential ploidy. Therefore the data presented in this manuscript strongly support the presence of two distinct ploidy levels, ‘‘diploid’’ and ‘‘tetraploid’’ for G. reticulata, G. tenuifolia, G. alexeenkoana, G. robusta, and G. setifolia. The presence of just a single peak for G. bergii falling in the higher modal values suggests that all these individuals studied are ‘‘tetraploid’’. Although there is a previous report of 2n = 60 for G. vegeta (Davlianidze and Levichev, 1987), FCM values for this species fell within the ‘‘diploid’’ range. This either suggests that the count is a mistake, cytotypes with lower chromosome counts also occur in this species, or there has been a reduction in chromosome size leading to lower DNA amounts in spite of being a polyploid. Certainly, related species with different chromosome numbers but similar DNA amounts have been reported (e.g. Lysák et al., 2009). The existence of just a single ‘‘diploid’’ cytotype for G. graminifolia and G. caroli-kochii supports previous studies (Davlianidze, 1969, 1976; Romanov, 1936) and are new for G. calcicola, G. anonyma, and G. vegeta. Since the basic chromosome number for the G. reticulata species complex has been reported to be x = 12 (Ghaffari, 2008; Peruzzi, 2003), it is likely that ‘‘diploid’’ individuals have a chromosome number of 2n = 24 and ‘‘tetraploids’’ therefore are 2n = 48. 4.2. Three copies of malate synthase (MS) are present in G. reticulata species complex genome MS occurs in three copies in G. reticulata, G. setifolia, and G. anonyma. This is the first report of three copies of the MS gene in the genome of land plants. In two previous studies on palms (Lewis and Doyle, 2001, 2002), there was no evidence of multiple copies. Clones belonging to a single species were placed in three distinct clusters with a few exceptions. In most cases, all three copies were found in one individual plant; for example, clones from one plant of G. anonyma sampled from locality 95 (Appendix; Fig. 3A and B) had three copies of MS (anonyma-types I–III), and it is likely that there are three copies of MS in all individuals, although they were not always amplified. If all three copies amplified equally, sequencing eleven clones should be sufficient to detect all types in most accessions (Pillon et al., 2009). However, examination of the most parsimonious tree revealed that some taxa of Gagea did not produce three copies of MS. For example, G. bergii had two distinct MS types (bergii-types I and II, Fig. 3A and B, respectively), and the third type was shared with reticulata-type II (retrieved from G. bergii only at locality 96, Fig. 3A), which is a sign of allotetraploid origin. DNA sequences of these three MS genes were translated into malate synthase amino acids with no internal stop codons. However, we did not investigate whether base changes in exons occurred primarily in synonymous positions or whether nonsynonymous changes lead to changes in the amino acid sequence that altered the functionality of MS. This will be the subject of future research. 4.3. Evolution of malate synthase (MS) in the G. reticulata species complex and implications for reconstructing phylogeny of Gagea At least three MS loci were recognized in the genome of the G. reticulata species complex, and each locus contained several alleles. It is likely from the linkage patterns observed that the duplicated loci of MS are located on the same chromosome. This will be tested and reported in a later paper. Potentially more alleles might be re-

635

trieved from tetraploid individuals, but in this study no difference was observed between putative ‘‘diploids’’ and ‘‘tetraploids’’; thus, it seems likely that they possess the same number of MS copies. All three copies of MS were successfully translated to MS protein in both ‘‘diploids’’ and ‘‘tetraploids’’. This result may have several possible origins: (a) a process of gene homogenization (reviewed by Wendel, 2000) may have taken place during the evolution of MS after polyploidization; (b) loss of some copies of MS gene through deletion; (b) three copies of MS (in case of tetraploid) mutating as they developed new functions (neofunctionalization) and so were not amplified; and (c) tetraploids being recent autopolyploids and carrying the same sets of alleles as their diploid progenitors. These alternatives require further study. The presence of three copies of the MS gene in the genome of the G. reticulata complex and its absence in the outgroup taxa suggests that gene duplication occurred after the divergence of G. section Platyspermum from G. sections Gagea and Didymobulbos (Fig. 4 in Zarrei et al., 2009, 2010d). The failure to recover three major clades (one for each of the three MS loci) with each including representatives of most, if not all, taxa studied here (Clarkson et al., 2010; Cui et al., 2006) indicates that it is likely that the origin of the complex and duplication of MS have been coincident (see Fig. 5). 4.4. G. reticulata complex, hybridization and polyploidization There is no apparent correlation between ploidy estimated from our flow cytometry work and number of copies of MS in the genome of the G. reticulata species complex. Gagea anonyma and G. calcicola, which we believe to be ‘‘diploid’’, have three MS copies like G. bergii, which is likely to be ’’tetraploid’’. An allopolyploid origin of some taxa is indicated. For example, the MS data suggest that G. bergii shares one copy of MS with G. reticulata (reticulata-type II; Fig. 3A) and another with G. anonyma (BP 97; Fig. 3A), species with which it is sympatric, and so these species are its putative parents. This hypothesis is supported by network analyses of haplotype groups that reveal that G. bergii (H13 and H15, Fig. 4) has evolved from G. reticulata/G. tenuifolia ancestors (H16 + H19, Fig. 4) but that haplotype group H3, which is mainly found in G. anonyma, is also related. Gagea bergii shows some morphological characters intermediate between these two possible parents, also pointing to a hybrid origin. This will be described in more detail in a future paper. However, it resembles G. reticulata more than G. anonyma with long pedicels covered by long, dense villous hairs, which are more common in G. reticulata that G. anonyma. For other taxa the evidence for an allopolyploid origin is more equivocal. For example, although G. robusta, which has its own type of MS (robusta-type I, Fig. 3B), and has both diploid and tetraploid individuals, some alleles are of the setifolia-type III (Fig. 2B) and reticulata types II and III (Fig. 3A and B, respectively). The relationship with G. setifolia is stronger as shown by the haplotype network analysis. Sequences of G. robusta are divided into three haplotype groups (H2 + H7 and H9), which are shared with G. setifolia (Fig. 4 and Table 4) and are derived directly from ancestral G. setifolia haplotype groups (H8 + H10). Gagea robusta has a sympatric distribution with G. reticulata, but the presence of reticulata-types II and III alleles in G. robusta in might be due either to retention of the ancestral G. setifolia genotypes by both G. reticulata and G. robusta or recent gene flow between these two species. Some morphological traits in G. robusta, like pedicel length, number of flowers, and basal leaf length, are intermediate between G. setifolia and G. robusta. On a broader scale there is strong indirect evidence for potential hybridization and introgression from the MP tree (Fig. 3A and B). Species with overlapping geographical distributions show higher rates of intermixed alleles and alternatively isolated species show

636

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

a low rate of intermixing. For example, G. anonyma is an isolated species that is restricted to fine sandy soils in the dry desert plains and cannot be found in the foothills and mountains. It has only been seen at one location mixed with species of G. reticulata species complex (with G. setifolia in locality 44) (MZ pers. obs.). Alleles of G. anonyma are only rarely found intermixed with alleles of other species (G. setifolia from locality 44). In contrast, G. setifolia, which grows in soils with a high percentage of clay in the mountains, can be found growing with most other species in this group especially with G. reticulata, G. robusta and G. alexeenkoana, which share a common niche. The alleles of G. setifolia are mixed with alleles these species.

4.5. Species evolution and paleo-distribution of G. reticulata taxa Evolution of MS as detected in this study suggests that species evolution is complex and in places reticulate (Fig. 3A and B) and (Fig. 4). The haplotype group network has H8 + H10 central with all other haplotypes derived (Fig. 4). All clones of H8 and most of H10 (82.4%, Table 4) belong to G. setifolia from areas 1, 4, and 5. The remaining clones of H10 are from G. bergii in the eastern mountains of Iran (area 5, Fig. 6). From this evidence one hypothesis is that G. setifolia retains the ancestral genome of the G. reticulata species complex, represented by these two haplotype groups. These haplotype groups are distributed mainly in the mountains in the Kerman area in southern and central Iran (area 1), eastern Iran (area 5) and Damavand in central Alborz (northern Iran, area 4, Figs. 2 and 7), surrounding the central Kavir Desert. It unlikely that gene flow has taken place between these mountain ranges given that (i) the pollinators of Gagea are bees (Nishikawa, 2009), (ii) the mountain ranges are separated by more than 700 km and (iii) there are no Gagea species in the central Kavir Desert Figs. 2 and 7. It is more likely that they indicate the survival of relict populations from a former wider distribution in central Iran (Fig. 6). Haplotype groups H1, H2, H7, and H9 are most directly related to the ‘‘ancestral’’ haplotype groups (Fig. 2; Tables 2 and 4), but they have a wider distribution and are present in several species, indicative of population expansion to the southeastern and southwestern parts of Iran (Taftan Mountains and southern/western Zagros, respectively) as well as the whole of the Alborz Range (extending to northeastern Iran). We suggest that during colonization of these areas, disjunctions caused by abiotic (e.g. mountain uplift, extension to drier or more humid regions) or biotic (e.g. different flowering time) factors resulted in formation of restricted haplotypes (Figs. 2 and 7). All clones of G. graminifolia and 88% of clones found in G. robusta belong to haplotype groups H2, H7, and H9. H1 is composed entirely of G. setifolia accessions from eastern areas 4 and all area 5, and this species also makes up most of haplotype group H9 and about 1/3 of H2 + H7. Thus, G. graminifolia and G. robusta, which occur in eastern area 4, appear to be derived from G. setifolia (Fig. 4). The putative ancestral haplotypes (H8 + H10) and early-diverging haplotypes (H1, H2, H7, and H9) are predominantly ‘‘diploid’’ (Table 4). The occasional presence of clones from apparently ‘‘tetraploid’’ individuals in these groups might arise from retaining the ancestral copy in polyploids or hybridization between recently formed populations/species and ancestral populations/species. Other parts of the haplotype group network represent the expansion of G. reticulata and G. tenuifolia (first H12, H16, H19 then H22, H20, H18, H17) and G. alexeenkoana (H4 then H5). These haplotype groups have a more peripheral distribution that extends to the whole of the Zagros Mountains, eastern Alborz, and eastern Iran (Fig. 7). These data suggest that G. tenuifolia, G. alexeenkoana, and G. reticulata also evolved more or less directly from G. setifolia and have either retained in random fashion some of the genotypes

typical for that species or have haplotype groups such as H27 that are easily derived from ‘‘ancestral’’ G. setifolia haplotype groups. Gagea tenuifolia has no unique genotypes and shares three groups of them with G. reticulata and could thus be seen as a derivative of this species. Gagea reticulata is one of the most diverse species in the complex, and it has given rise, in addition to G. tenuifolia, to G. vegeta (H21 genotypes), G. bergii (H13 and H15). Where species share unrelated and more derived genotypes, then it is likely that these are the products of local and perhaps recent hybridization. Cases of hybridizations detected here include G. tenuifolia sharing H23 with G. calcicola, and G. anonyma sharing H14 with G. calcicola (Fig. 4). Haplotype group H3 found mainly in G. anonyma is related to G. bergii haplotype group H13, but its presence may be the result of introgressive hybridization as described above (Section 4.4). There are other examples where a genotype highly similar to one species occurs in an accession of another species; such occasional patterns are not included in Fig. 3A and B, which was designed to show more general patterns of relatedness. 4.6. Effects of geographical topology, paleo-climate, and different ecological conditions on speciation of the G. reticulata species complex It has been argued that the present-day vegetation of central Iran (habitat of the ancestral population of G. section Platyspermum, H8 + H10, which today we would call G. setifolia) formed in the late Miocene. The Dashte-Kavir areas contained water during some glacial periods (Rahimpour-Bonab et al., 2007a,b). Our results suggest that the ancestral population that occupied all central Iran (including Kavir) was distributed in this area in the Pliocene and most of the Quaternary (from 5 million years ago; see Patterson and Givnish, 2002 for discussion of the timing and origin of Tulipeae). Given that central Iran (including Kavir) and the Zagros Mountains emerged from the sea (refer to Agard et al., 2005; Mohajjel et al., 2003) at the same time and the Zagros lack the ancestral genotype (Fig. 6), it is suggested that the ancestral population migrated westwards from the central parts of Iran (Fig. 7). Although some recently evolved genotypes are present in the paleo-distribution area, the main distribution of the more recently evolved genotypes is peripheral in comparison to that of the ancestral populations (H8 + H10, Fig. 7). This confirms expansion of the ancestral population in mainly westward direction. Several Quaternary climatic changes occurred in Iran (reviewed in Kehl, 2009). In northern and western Iran, the climate changed between dry and cold conditions during the stadials and moist and warm conditions during the interglacials. In western Iran, the Younger Dryas and the lower Holocene were characterized by dry conditions. This mosaic of climate in Iran coincided with formation of mountains that caused fragmentation of the large ancestral distribution into smaller isolated populations, which permitted speciation or at least genetic divergence (drift). On the western edge of this range, migration via small populations created a similar level of isolation and divergence, leading to speciation (Figs. 2 and 7). The Damavand Mountains were formed by volcanic activity; the present cone was produced over 600 ky ago (Davidson et al., 2004). The most recent episode is dated at 7300 yr B.P., forming small, unglaciated, and unvegetated flows on the western flanks of the current cone at Damavand (Davidson et al., 2004). It is most likely that the presence of an ancestral genome (H8 + H10) in the Damavand area (Fig. 6) is due to secondary migration into this zone; the recent volcanic activity would have eliminated products of earlier occupations. However, this hypothesis needs further examination. 4.7. Implications for conservation It is wise to take into account the evolutionary importance of alleles/species as well as the IUCN criteria for conservation (IUCN,

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

637

Fig. 7. The distribution of recently evolved haplotypes (H3–H6, H12, H14, H16–H17, H23, and H27) in Iran. The numbers are from Table 4 and Figs. 2 and 5. The light grey area indicates the hypothetical paleo-distribution of the ancestral population (Fig. 6). The dark grey marked areas, 2, 3, and 6, are recently occupied by populations derived from the ancestral population. The Kopet-Dagh area populations in northeastern Iran are also derived from the ancestral population.

2001). It has been argued that it is important to preserve the conditions required for the persistence of evolutionary processes that generate biodiversity (Cowling and Pressey, 2001; Pillon et al., 2009; Smith et al., 1993). Thus, preserving ancestral and intermediate genotypes should have a priority over conserving of the most derived genotypes. Moreover, those populations with barriers to gene flow are also important in both an evolutionary and conservation context because they more likely accumulate mutations, which will ultimately led to speciation. Haplotype groups H8 + H10 are probably ‘‘ancestral’’ and associated with the origin of the G. reticulata species complex with a distribution in drier areas (Figs. 2 and 4). H10 also possesses high genetic diversity (Tables 2 and 4). Moreover, this haplotype group includes G. bergii with EN status (Zarrei et al., 2007). The geographical distribution of these haplotype groups indicate that the central portions of Alborz, Kerman, and mountains in eastern Iran are the most important places for conservation strategies to be focused (Fig. 2). Gagea bergii has some clones with ancestral genotypes (Figs. 2 and 4 and Table 4), and in order to preserve both the species and these genotypes conservation of this species from population 51 (Appendix), in area 5 (Fig. 2), is recommended. Haplotype groups H4, H12, H16, and H19 are evolutionarily important due to their intermediate position. Although haplotype group H12 is widespread in G. reticulata, the central Zagros Mountains, where H4, H16, and H19 co-occur, is another possible region where it would be worthwhile to develop a conservation plan (Fig. 2). Conservation strategies in the Taftan Mountains (area 6, Fig. 2) are also required to conserve the isolated populations of G. setifolia that occur in this area. Although haplotypes of these populations occur in other areas as well, their isolated distribution (possibly due to both biotic and abiotic factors) suggest they have followed distinct evolutionary paths and should therefore be conserved.

5. Conclusions MS data support the conclusion that Iran was the center of diversification and evolution for the G. reticulata species complex. Two algorithms, heuristic-bifurcations and networks, led to our conclusions that MS possesses three copies in the genome of G. reticulata aggregate, that speciation and duplications of malate synthase were coincident, and that G. setifolia includes the ancestral genotype; these analyses also revealed the migratory paths of the ancestral population. These results do not fully support the current species circumscription (several species are more closely related to some individuals of other species than they are to other members of their own species). Morphology (Zarrei et al., 2007, 2010a,b,c) does not correlate well with the supported clades/ groups based on the parsimony and network analyses presented here. A comparison of the morphological and genetic boundaries of species will be reported in a later paper. Potentially more data from other nuclear genes, particularly for Gagea species with evolutionary histories longer than the G. reticulata species complex, will be useful, but it is possible that in this actively evolving complex where changes in ploidy and introgressive hybridization are common further genetic data will not reveal a simple phylogenetic pattern.

Acknowledgments Support of the Iran Society in the UK, Calleva Trust, Norman Scarfe Charitable Trust, Royal Botanic Gardens, Kew, the Systematics Association, the Bentham-Moxon Trust, and Mr. Paul Miles is gratefully acknowledged. At RBG, Kew, the work could not have been completed without the help of Laura Kelly, Imalka Kahan-

638

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639

dawala, Laszlo Csiba, Dion Devey, Jim Clarkson, Félix Forest, and Lola Lledó. The first author would like to thank M. Djamali for discussion about the geology of Iran. We also thank two anonymous reviewers for their helpful comments on this manuscript. Appendix A. Supplementary material Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ympev.2011.11.003. References Agard, P., Omrani, J., Jolivet, L., Mouthereau, F., 2005. Convergence history across Zagros, Iran: constraints from collisional and earlier deformation. Int. J. Earth Sci. 94, 401–419. Ajani, Y., Noroozi, J., Levichev, I.G., 2010. Gagea alexii (Liliaceae), a new record from subnival zone of southern Iran with key and notes on sect. Incrustatae. Pak. J. Bot. 42, 67–77. Álvarez, I., Wendel, J.F., 2003. Ribosomal ITS sequences and plant phylogenetic inference. Mol. Phylogenet. Evol. 29, 417–434. Asmussen, C.B., Chase, M.W., 2001. Coding and noncoding plastid DNA in palm systematics. Am. J. Bot. 88, 1103–1117. Baldwin, B.G., 1992. Phylogenetic utility of the internal transcribed spacers of nuclear ribosomal DNA in plants: an example from the Compositae. Mol. Phylogenet. Evol. 1, 3–16. Baldwin, B.G., 1993. Molecular phylogenetics of Calycadenia (Compositae) based on ITS sequences of nuclear ribosomal DNA: chromosomal and morphological evolution reexamined. Am. J. Bot. 80, 222–238. Baldwin, B.G., Sanderson, M.J., Porter, J.M., Wojciechowski, M.F., Campbell, C.S., Donoghue, M.J., 1995. The ITS region of nuclear ribosomal DNA: a valuable source of evidence on angiosperm phylogeny. Ann. Missouri Bot. Gard. 82, 247– 277. Chase, M.W., Fay, M.F., 2009. Barcoding of plants and fungi. Science 325, 682–683. Chase, M.W., Hills, H.G., 1991. Silica gel: an ideal desiccant for preserving fieldcollected leaves for use in molecular studies. Taxon 40, 215–220. Chase, M.W., Knapp, S., Cox, A.V., Clarkson, J.J., Butsko, I.Y., Joseph, J., Savolainen, V., Parokonny, A.S., 2003. Molecular systematics, GISH and the origin of hybrid taxa in Nicotiana (Solanaceae). Ann. Bot. 92, 107–127. Clarkson, J.J., Kelly, L.J., Leitch, A.R., Knapp, A., Chase, M.W., 2010. Nuclear glutamine synthetase evolution in Nicotiana: Phylogenetics and the origins of allotetraploid and homoploid (diploid) hybrids. Mol. Phylogenet. Evol. 55, 99– 112. Clement, M., Posada, D., Crandall, K., 2000. TCS: a computer program to estimate gene genealogies. Mol. Ecol. 9, 1657–1660. Cowling, R.M., Pressey, R.L., 2001. Rapid plant diversification: planning for an evolutionary future. Proc. Natl. Acad. Sci. USA 98, 5452–5457. Cui, L., Wall, P.K., Leebens-Mack, J.H., Lindasy, B.G., Soltis, D.E., Doyle, J.J., Soltis, P.S., Carlson, J.E., Arumuganathan, K., Barakat, A., Albert, V.A., Ma, H., DePamphilis, C.W., 2006. Widespread genome duplication throughout the history of flowering plants. Genome Res. 16, 738–749. Davidson, J., Hassanzadeh, J., Berzins, R., Stockli, D.F., Bashukooh, B., Turrin, B., Pandamouz, A., 2004. The geology of Damavand volcano, Alborz Mountains, northern Iran. GSA Bull. 116, 16–29. Davlianidze, M.T., 1969. Chromosomatum numeris in genere Gagea Salisb.. Zam. Sist. Geog. Rast. Bot. Inst. Akad. Nauk. Gruzinnskoi SSR 27, 108–113. Davlianidze, M.T., 1976. Kavkazaskie predstaviteli roda Gagea Salisb., Tbilisi, pp. 1– 160 (In English translation entitled ‘‘A review of the systematics and the taxonomy of the Caucasian representatives of the genus Gagea Salisb.’’). Davlianidze, M.T., Levichev, I.G., 1987. Chromosome numbers in species of the genus Gagea (Liliaceae) from central Asia. Bot. J. (St. Petersburg) 72, 1271–1272. R Development Core Team, 2010. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. Dolezˇel, J., Binarová, P., Lucretti, S., 1989. Analysis of nuclear DNA content in plant cells by flow cytometry. Biol. Plantarum 31, 113–120. Dolezˇel, J., Greilhuber, J., Suda, J., 2007. Flow cytometry with plants: an overview. In: Dolezˇel, J., Greilhuber, J., Suda, J. (Eds.), Flow Cytometry with Plant Cells. Wiley-VCH, Weinheim, pp. 41–67. Doyle, J.J., Doyle, J.L., 1987. A rapid DNA isolation procedure from small quantities of fresh leaf tissue. Phytochem. Bull. 19, 11–15. Feliner, N.G., Rossellò, J.A., 2007. Better the devil you know? Guidelines for insightful utilization of nrDNA ITS in species-level evolutionary studies in plants. Mol. Phylogenet. Evol. 44, 911–919. Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791. Fitch, W.M., 1971. Towards defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20, 406–426. Ghafari, S.M., 2008. Chromosome reports for some plant species from Iran. Iran J. Bot. 14, 39–46. Graham, I.A., Smith, L.M., Brown, J.W., Leaver, C.J., Smith, S.M., 1989. The malate synthase gene of Cucumber. Plant Mol. Biol. 13, 673–684. Greilhuber, J., 2005. Intraspecific variation in genome size in angiosperms – identifying its existence. Ann. Bot. 95, 9–98.

Heyn, C.H., Dafni, A., 1971. Studies in the genus Gagea (Liliaceae) I. The platyspermous species in Israel and neighbouring areas. Isr. J. Bot. 20, 214–233. Hörandl, E., Greilhuber, J., Klímová, K., Paun, O., Temsch, E., Emadzade, Hodálová, K.I., 2009. Reticulate evolution and taxonomic concepts in the Ranunculus auricomus complex (Ranunculaceae): insights from analysis of morphological, karyological and molecular data. Taxon 58, 1194–1215. Hughes, C.E., Eastwood, R.J., Bailey, C.D., 2006. From famine to feast? Selecting nuclear DNA sequence loci for plant species-level phylogeny reconstruction. Philos. Trans. R. Soc. London 361, 211–225. IUCN, 2001. IUCN Red List Categories: Version 3.1. Prepared by the IUCN Species Survival Commission. IUCN, Gland, Switzerland & Cambridge, UK. Joly, S., Bruneau, A., 2006. Incorporating allelic variation for reconstructing the evolutionary history of organisms from multiple genes: an example from Rosa in North America. Syst. Biol. 55, 623–636. Kehl, M., 2009. Quaternary climate change in Iran – the state of knowledge. Erdkunde 63, 1–17. Kelchner, S.A., 2000. The evolution of non-coding chloroplast DNA and its application in plant systematics. Ann. Missouri Bot. Gard. 87, 482–498. Kelly, L.J., Leitch, A.R., Clarkson, J.J., Hunter, R.B., Knapp, S., Chase, M.W., 2010. Intragenic recombination events and evidence for hybrid speciation in Nicotiana (Solanaceae). Mol. Biol. Evol. 27, 781–799. Koul, A.K., Wakhlu, A.K., 1985. Studies on the genus Gagea 5: Embryology of Gagea reticulata Schult. J. Jpn. Bot. 60, 361–369. Koul, A.K., Wafai, B.A., Wakhlu, A.K., 1976. Studies on the genus Gagea III. Sporogenesis, early embryogeny and endosperm development in hexaploid Gagea stipitata. Phytomorphology 6, 255–263. Lewis, C.E., Doyle, J.J., 2001. Phylogenetic utility of the nuclear gene malate synthase in the palm family (Arecaceae). Mol. Phylogenet. Evol. 19, 409–420. Lewis, C.E., Doyle, J.J., 2002. A phylogenetic analysis of palm tribe Areceae using two low-copy nuclear genes. Plant Syst. Evol. 236, 1–17. Linder, C.R., Rieseberg, L.H., 2004. Reconstructing patterns of reticulate evolution in plants. Am. J. Bot. 91, 1700–1708. Lo, E.Y.Y., Stefanovíc, S., Dickinson, T.A., 2010. Reconstructing reticulation history in a phylogenetic framework and the potential of allopatric speciation driven by polyploidy in an agamic complex in Crataegus (Rosaceae). Evolution 64, 3593– 3608. Lysák, M.A., Koch, M.A., Beaulieu, J.M., Meister, A., Leitch, I.J., 2009. The dynamic ups and downs of genome size evolution in Brassicaceae. Mol. Biol. Evol. 26, 85–98. Magulaev, A.Y., 1992. Chromosome numbers in some species of vascular plants of the northern Caucasus flora. Bot. J. (St. Petersburg). 77, 88–90. Malik, C.P., Sehgal, S.M., 1959. Chromosome number in Gagea reticulata Schultes. J. Sci. Ind. Res. 8, 155–156. Martin, D.P., Williamson, C., Posada, D., 2005. RDP2: recombination detection and analysis from sequence alignments. Bioinformatics 21, 260–262. Mohajjel, M., Fergusson, C.L., Sahandi, M.R., 2003. Cretaceous-Tertiary convergence and continental collision, Sanandaj–Sirjan Zone. Western Iran: J. Asian Earth Sci. 21, 397–412. Nei, M., 1987. Molecular Evolutionary Genetics, Columbia, New York. Nishikawa, Y., 2009. Significance of intra-inflorescence variation on flowering time of a spring ephemeral, Gagea lutea (Liliaceae), under seasonal fluctuations of pollinator and light availabilities. Plant Ecol. 202, 337–347. Ohri, D., 1998. Genome size variation and plant systematic. Ann. Bot. 82, 75–83. Ouyang, S., Buell, C.R., 2004. The TIGR Plant Repeat Databases: a collective resource for the identification of repetitive sequences in plants. Nucleic Acids Res. 32, 360–363. Paradis, E., Claude, J., Strimmer, K., 2004. APE: analyses of phylogeny and evolution in R language. Bioinformatics 20, 289–290. Patterson, T.B., Givnish, T.J., 2002. Phylogeny, concerted convergence, and phylogenetic niche conservatism in the core Liliales: insights from rbcL and ndhF sequence data. Evolution 56, 233–252. Peruzzi, L., 2003. Contribution to cytotaxonomical knowledge of Gagea Salisb. (Liliaceae). sect. Foliatae A.Terrac. and synthesis of karyological data. Caryologia 56, 115–128. Peruzzi, L., 2008a. Hybridity as a main evolutionary force in the genus Gagea Salisb. (Liliaceae). Plants Biosyst. 142, 179–184. Peruzzi, L., 2008b. Contribution to the cytotaxonomical knowledge of the genus Gagea Salisb. (Liliaceae). III. New karyological data from the central Mediterranean area. Caryologia 61, 92–106. Peruzzi, L., Leitch, I.J., Caparelli, K.F., 2009. Chromosome diversity and evolution in Liliaceae. Ann. Bot. 103, 459–475. Peruzzi, L., Peterson, A., Tison, J.-M., Peterson, J., 2008a. Phylogenetic relationships of Gagea Salisb. (Liliaceae) in Italy, inferred from molecular and morphological data matrices. Plant Syst. Evol. 276, 219–234. Peruzzi, L., Tison, J.-M., Peterson, A., Peterson, J., 2008b. On the phylogenetic position and taxonomic value of Gagea trinervia (Viv.) Greuter and the whole G. sect. Anthericoides A. Terracc. (Liliaceae). Taxon 57, 1201–1214. Peterson, A., Harpke, D., Peruzzi, L., Levichev, I.G., Tison, J.-M., Peterson, J., 2009. Hybridization drives speciation in Gagea (Liliaceae). Plant Evol. 278, 133– 148. Peterson, A., John, H., Koch, E., Peterson, J., 2004. A molecular phylogeny of the genus Gagea (Liliaceae) in Germany inferred from non-coding chloroplast and nuclear DNA sequences. Plant Syst. Evol. 245, 145–162. Peterson, A., Levichev, I.G., Peterson, J., 2008. Systematics of Gagea and Lloydia (Liliaceae) and infrageneric classification of Gagea based on molecular and morphological data. Mol. Phylogenet. Evol. 46, 446–465.

M. Zarrei et al. / Molecular Phylogenetics and Evolution 62 (2012) 624–639 Pillon, Y., Munzinger, J., Amir, H., Hopkins, H.C.F., Chase, M.W., 2009. Reticulate evolution on a mosaic of soils: diversification of the New Caledonian endemic genus Codia (Cunoniaceae). Mol. Ecol. 18, 2263–2275. Rahimpour-Bonab, H., Shariatinia, Z., Siemann, M.G., 2007a. Role of rifting in evaporite deposition in the Great Kavir Basin, central Iran. Geol. Soc. Lond. 285, 69–85. Rahimpour-Bonab, H., Shariatinia, Z., Siemann, M.G., 2007b. Origin and geochemistry of Miocene marine evaporites associated with red beds: Great Kavir Basin, Central Iran. Geol. J. 42, 37–54. Rieseberg, L.H., Wendel, J.F., 1993. Introgression and its consequences in plants. In: Harrison, R. (Ed.), Hybrid Zones and the Evolutionary Process. Oxford University Press, Oxford, pp. 70–109. Robertson, A., Rich, T.C., Allen, A.M., Houston, L., Roberts, C., Bridle, J.R., Harris, S.A., Hiscock, S.J., 2010. Hybridization and polyploidy as drivers of continuing evolution and speciation in Sorbus. Mol. Ecol. 19, 1675–1690. Romanov, I.D., 1936. Die Embryosackentwicklung in der Gattung Gagea Salisb. Planta 25, 438–459. Sang, T., 2002. Utility of low-copy nuclear gene sequences in plant phylogenetics. Crit. Rev. Biochem. Mol. 37, 121–147. Shaw, J., Edgar, B., Lickey, J., Beck, T., Susan, B.F., Wusheng, L., Miller, J., Kunsiri, C.S.C., Edward, T.W., Randall, S.E., Small, F., 2005. The tortoise and the hare II: relative utility of 21 noncoding chloroplast DNA sequences for phylogenetic analysis. Am. J. Bot. 92, 142–166. Small, R.L., Cronn, R.C., Wendel, J.F., 2004. The use of nuclear genes for phylogeny reconstruction in plants. Aust. Syst. Bot. 17, 145–170. Smith, T.B., Bruford, M.W., Wayne, R.K., 1993. The preservation of process: the missing element of conservation program. Biodivers. Lett. 1, 164–167. Suda, J., 2004. An Employment of Flow Cytometry into Plant Biosystematics. PhD Thesis Presented to Charles University in Prague, Faculty of Science, Department of Botany. Suda, J., Trávnícˇek, P., 2006. Reliable DNA ploidy determination in dehydrated tissues of vascular plants by DAPI flow cytometry-new prospects for plant research. Cytometry 69, 273–280. Swofford, D.L. 2002. PAUP⁄: Phylogenetic analysis using parsimony (⁄and other methods), Verson 4. Computer program distributed by Sinauer Associates, Sunderland, MA.

639

Vriesendorp, B., Bakker, F.T., 2005. Reconstructing patterns of reticulate evolution in angiosperms: what can we do? Taxon 54, 593–604. Wendel, J.F., 2000. Genome evolution in polyploids. Plant Mol. Biol. 42, 225–249. Wendelbo, P., Rechinger, K.H., 1990. Gagea. In: Rechinger, K.H. (Ed.), Flora Iranica, vol. 165. Graz, Austria, pp. 13–57. Wolfe, K.H., Li, W.H., Sharp, P.M., 1987. Rates of nucleotide substitution vary greatly among plant mitochondrial, chloroplast, and nuclear DNAs. Proc. Natl. Acad. Sci. USA 84, 9054–9058. Zarrei, M., Zarre, S., 2005. Pollen morphology of the genus Gagea (Liliaceae) in Iran. Flora 200, 96–108. Zarrei, M., Zarre, S., Wilkin, P., Rix, E.M., 2007. Systematic revision of the genus Gagea Salisb. (Liliaceae) in Iran. Bot. J. Linn. Soc. 154, 559–588. Zarrei, M., Wilkin, P., Fay, M.F., Ingrouille, M.J., Zarre, S., Chase, M.W., 2009. Molecular systematics of Gagea and Lloydia (Liliaceae; Liliales): implications of analyses of nuclear ribosomal and plastid sequences for infrageneric classification. Ann. Bot. 104, 125–142. Zarrei, M., Wilkin, P., Ingrouille, M.J., Zarre, S., Chase, M.W., 2010a. The systematic importance of anatomical data in Gagea (Liliaceae) from Flora Iranica area. Bot. J. Linn. Soc. 164, 155–177. Zarrei, M., Wilkin, P., Ingrouille, M.J., Chase, M.W., 2010b. Gagea calcicola (Liliaceae), a new species from southwestern Iran. Kew Bull. 65, 89–96. Zarrei, M., Wilkin, P., Ingrouille, M.J., Chase, M.W., 2010c. Gagea robusta (Liliaceae), a new species from Flora Iranica area. Kew Bull. 65, 327–336. Zarrei, M., Wilkin, P., Ingrouille, M.J., Leitch, I., Buerki, S., Fay, M.F., Chase, M.W., 2010d. Species relationships in the Gagea reticulata species complex utilizing nucleotide sequences of the low-copy nuclear gene malate synthase and flow cytometry data. In: Poster in SPNHC & CBA-ABC Joint Conference, Ottawa, Canada, p. 102. Zarrei, M., Wilkin, P., Noltie, H.J., Ingrouille, M.J., Chase, M.W., 2011a. Clarifying the nomenclature and taxonomy of Gagea kunawurensis (Royle) Greuter (Liliaceae) and allied taxa. Edinb. J. Bot. 68, 43–59. Zarrei, M., Wilkin, P., Ingrouille, M.J., Chase, M.W., 2011b. A revised infrageneric classification for Gagea (Tulipeae; Liliaceae): insights from DNA sequence and morphological data. Phytotaxa 15, 44–56. Zarrei, M., Wilkin, P., Chase, M.W., 2011c. Gagea Salisb. (Liliaceae) in Iran: an updated species checklist. Phytotaxa 15, 33–43.