Plant Science 180 (2011) 228–237
Plant Science journal homepage: www.elsevier.com/locate/plantsci
Nucleotide diversity and linkage disequilibrium of nine genes with putative effects on flowering time in perennial ryegrass (Lolium perenne L.)
Alice Fiil a, Ingo Lenk b, Klaus Petersen c, Christian S. Jensen b, Klaus K. Nielsen b, Britt Schejbel a, Jeppe Reitan Andersen a,∗, Thomas Lübberstedt d
a Aarhus University, Faculty of Agricultural Sciences, Department of Genetics and Biotechnology, Forsøgsvej 1, DK-4200 Slagelse, Denmark
b DLF Trifolium A/S, Højerupvej 31, DK-4660 Store Heddinge, Denmark
c University of Copenhagen, Department of Biology, The Bioinformatics Centre, Ole Maaløvs vej 5, DK-2200 København N, Denmark
d Iowa State University, Department of Agronomy, 1204 Agronomy Hall, Ames, IA 50011, USA
Article info
Article history: Received 12 May 2010; received in revised form 13 July 2010; accepted 21 August 2010; available online 27 August 2010
Keywords: Perennial ryegrass; Lolium perenne; Flowering time; Nucleotide diversity; Linkage disequilibrium; MADS-box
Abstract
Optimization of flowering is an important breeding goal in forage and turf grasses, such as perennial ryegrass (Lolium perenne L.). Nine floral control genes including Lolium perenne CONSTANS (LpCO), SISTER OF FLOWERING LOCUS T (LpSFT), TERMINAL FLOWER1 (LpTFL1), VERNALIZATION1 (LpVRN1, identical to LpMADS1) and five additional MADS-box genes, were analyzed for nucleotide diversity and linkage disequilibrium (LD). For each gene, about 1 kb genomic fragments were isolated from 10 to 20 genotypes of perennial ryegrass of diverse origin. Four to twelve haplotypes per gene were observed. On average, one single nucleotide polymorphism (SNP) was present per 127 bp between two randomly sampled sequences for the nine genes (π = 0.00790). Two MADS-box genes, LpMADS1 and LpMADS10, involved in timing of flowering showed high nucleotide diversity and rapid LD decay, whereas MADS-box genes involved in floral organ identity were found to be highly conserved and showed extended LD. For LpMADS4, LpMADS5, LpCO, LpSFT and LpTFL1, LD extended over the entire region analyzed. The results are compared to previously published results on resistance genes within the same collection of genotypes and the prospects for association mapping of floral control in perennial ryegrass are discussed. © 2010 Elsevier Ireland Ltd. All rights reserved.
Abbreviations: CO, CONSTANS; FT, FLOWERING LOCUS T; LTS, Lolium Test Set; MADS-box, MINICHROMOSOME MAINTENANCE 1, AGAMOUS, DEFICIENS and SERUM RESPONSE FACTOR-box; NBS-LRR, nucleotide binding site and leucine rich repeat; PEBP, phosphatidylethanolamine-binding protein; QTL, Quantitative Trait Locus; QTN, Quantitative Trait Nucleotide; SFT, SISTER OF FT; SVP, Short Vegetative Phase; TFL1, TERMINAL FLOWER1; VRN1, VERNALIZATION1.
∗ Corresponding author. Tel.: +45 8999 3545; fax: +45 8999 3501.
E-mail addresses: [email protected] (A. Fiil), [email protected] (I. Lenk), [email protected] (K. Petersen), [email protected] (C.S. Jensen), [email protected] (K.K. Nielsen), [email protected] (B. Schejbel), [email protected] (J.R. Andersen), [email protected] (T. Lübberstedt).
0168-9452/$ – see front matter © 2010 Elsevier Ireland Ltd. All rights reserved. doi:10.1016/j.plantsci.2010.08.015
1. Introduction
High-quality forage and turf grass varieties are characterized by upright, dense and persistent growth. These traits together with high nutritional value are characteristic of vegetative and juvenile growth. Stem and inflorescence formation during maturation significantly reduces the digestibility, nutritional value, and productivity of forage grasses. In turf grasses, stem formation during the growth season suppresses tiller formation and affects quality, density and persistence of the sward. Therefore, control of flowering time is an important goal in breeding programs of forage and turf grasses [1].
Perennial ryegrass (Lolium perenne L., 2x = 14) is one of the most important turf and forage grass species in temperate regions. It is an out-breeding species with a strong self-incompatibility system. Environmental control of flowering time in perennial ryegrass includes primary induction by short days and/or vernalization, followed by secondary induction by long days and elevated temperature [2]. There is, however, large variation for this trait in perennial ryegrass, and requirements for primary and secondary induction have been shown to correlate with latitude of origin of the germplasm [2,3]. Four floral control genes, Lolium perenne VERNALIZATION1 (LpVRN1, identical to LpMADS1 and in the following denoted LpMADS1), CONSTANS (LpCO, identical to LpHD1 and in the following denoted LpCO), TERMINAL FLOWER1 (LpTFL1), and FLOWERING LOCUS T (LpFT), have been characterized in perennial and darnel ryegrass (Lolium temulentum L.) [3–11]. In addition, a number of MADS-box genes have been found to be differentially expressed during the transition from vegetative to reproductive growth [4,5]. MADS-box proteins belong to a family of eukaryotic transcription factors and are named after MINICHROMOSOME MAINTENANCE 1 (MCM1) in yeast, AGAMOUS (AG) in Arabidopsis,
DEFICIENS (DEF) in Antirrhinum and SERUM RESPONSE FACTOR (SRF) in humans [12]. All MADS-box genes contain a conserved 180 bp motif designated the MADS-box, which encodes the DNA-binding domain of the MADS-box transcription factor. In plants, MADS-box transcription factors play important roles in flower development, control of flowering time and organ differentiation [13]. Three fundamentally different types of MADS-box genes, type I, type II (MIKCc- and MIKC*-type), and MADS-box like, have been identified in plants. Based on DNA sequence homology, the MADS-box genes of type II, which constitute most of the well-studied MADS-box genes in plants, have been divided into 13 different gene subfamilies, termed AG-, AGL2-, AGL6-, AGL12-, AGL15-, AGL17-, DEF-, FLC-, GGM13-, GLO-, SQUA-, SVP/STMADS11-, and SOC1/TM3-like [13]. Members of a distinct gene subfamily not only share homology, but also tend to share expression patterns and functions of encoded proteins [14–18].
Ten MADS-box genes have been isolated from perennial ryegrass [4,5]. LpMADS1, LpMADS2 and LpMADS3 belong to the subfamily of SQUA-like genes [13] and LpMADS1 is an orthologue of the VRN1 gene in cereals [3,6,19]. LpMADS4 belongs to the AGL6-like subgroup [4], for which a biological function of the encoded proteins has not been defined [13]. LpMADS5, LpMADS6 and LpMADS7 all belong to the AGL2-like subfamily. LpMADS5 shows highest homology with the SEP-like genes of the E-class while LpMADS6 and LpMADS7 show highest homology to OsMADS1, a B-class gene of the ABCE model for genetic control of floral organ identity [4,20]. LpMADS10 is a putative homologue of Arabidopsis Short Vegetative Phase (SVP), belonging to the SVP/STMADS11-like subfamily [5]. In Arabidopsis and winter cereals, transcription factors encoded by SVP and SVP-like genes, respectively, function to delay floral transition [21,22].
Of the four floral control genes identified in perennial ryegrass, the three non-MADS-box genes are orthologues of well-characterized Arabidopsis floral control genes. LpCO is an orthologue of Arabidopsis CO [7,23] and belongs to a family of putative transcription factors defined by conserved zinc finger and CCT domains. Transcription of LpCO exhibits a diurnal oscillation and is induced by long days, and the encoded protein promotes flowering in both Arabidopsis and perennial ryegrass [7]. In addition, a regulatory mechanism of LpCO similar to that of Arabidopsis CO is indicated, in which the coincidence of light and high levels of CO expression, controlled by the circadian clock, induces expression of downstream genes under long day but not short day conditions [7,24]. In Arabidopsis, FT is a downstream target of CO and promotes flowering during long days [25,26]. FT encodes a protein similar to a phosphatidylethanolamine-binding protein (PEBP), such as Raf kinase inhibitor from mammals [27]. As in Arabidopsis, transcription of FT was induced by long days in darnel ryegrass [10]. Five, 13 and 15 FT-like genes have been identified in barley, rice and maize, respectively [28–32]. TERMINAL FLOWER1 (TFL1) also belongs to the plant PEBP gene family, but in contrast to FT, it is involved in repression of flowering in Arabidopsis [33]. A similar role for LpTFL1 has been suggested in perennial ryegrass [9].
To effectively utilize natural genetic variation in plant breeding, knowledge of correlations between phenotype and genotype is of crucial importance. Traditionally, this knowledge has been obtained through linkage mapping, in which co-segregation of genetic markers and phenotype is identified within families generated by controlled crosses. In recent years, association mapping, also known as linkage disequilibrium (LD) mapping, in populations of unrelated individuals has emerged as an alternative to linkage mapping.
However, prior knowledge of nucleotide diversity and LD in the species and/or loci of interest is highly desirable when designing association mapping experiments. LD is defined as the non-random co-segregation of alleles at two loci, and factors influencing LD include recombination, mutation, mating system, genetic drift, population admixture, and selection [34]. LD decays more
rapidly in allogamous than in autogamous species, because allogamous species are more likely to be heterozygous at a given locus and therefore experience a higher proportion of effective recombination. Perennial ryegrass is allogamous and highly heterozygous, thus promoting a rapid decay of LD. However, as observed in other crop plant species [35], levels of LD have been shown to vary between populations and loci within perennial ryegrass. LD decay was observed within 500 bp in 11 expressed resistance candidate genes in perennial ryegrass [36], while LD extended 1–2 kb within an alkaline invertase (LpcAI) gene and was indicated to vary between sub-populations at the LpCO locus in perennial ryegrass [37].
Here, we report levels of nucleotide diversity and LD in nine floral control candidate genes within 10 to 20 genotypes of perennial ryegrass of diverse origin. The objectives of this study were to (1) determine levels of nucleotide diversity and LD within regions of approximately 1 kb of nine floral control candidate genes, (2) compare nucleotide diversity and LD decay between genes, (3) compare nucleotide diversity and LD decay between genes involved in floral control and disease resistance within the LTS, and (4) discuss the prospects of candidate gene based association mapping in perennial ryegrass.
2. Materials and methods
2.1. Plant material
The plant material included in the present study is denoted the Lolium Test Set (LTS) and was developed within the EU project GRASP. The LTS includes 20 genotypes of perennial ryegrass originating from various European sources and selected to represent a wide range of genetic diversity (Table 1). The 20 genotypes were found to be genetically distinct using amplified fragment length polymorphism (AFLP), inter-simple sequence repeat (ISSR), random amplified polymorphic DNA (RAPD) and simple sequence repeat (SSR) markers [38].
2.2. Allele sequencing of candidate genes
Candidate sequence regions were initially isolated from perennial ryegrass genomic DNA (Lolium perenne, clone F6, DLF TRIFOLIUM) by genomic walking according to the manufacturer's instructions (Clontech, Mountain View, USA). Where possible, genomic walking primers were designed based on previously published sequence information (Supplementary Table 1). For LpSFT, primers were designed based on DNA sequence alignments between publicly available homologous sequences from closely related species. Amplicons of approximately 1 kb, spanning from the proximal promoter region into the coding region of each gene, were amplified by polymerase chain reaction (PCR) from 10 ng of genomic DNA template in standard PCR using the Expand High Fidelity polymerase (Roche, Basel, Switzerland) according to the manufacturer's instructions. PCR primers are listed in Supplementary Table 1. Amplified products were gel-purified (Qiagen, Düsseldorf, Germany) and cloned into the TOPO T/A pCR4 vector (Invitrogen, Carlsbad, USA) according to the manufacturer's instructions. Four individual plasmid preps (Qiagen, Düsseldorf, Germany) were made from each cloning according to the manufacturer's instructions, and purified plasmids were sequenced and chromatogram files edited as described by Xing et al. [36].
2.3. Analysis of sequence data
Alignment of sequences was performed using MAFFT [39]. The genomic structure of sequenced gene fragments was determined by alignment to previously published mRNA sequences of perennial ryegrass and other species (for LpSFT). The aligned sequences were
analyzed for diversity using DnaSP version 4.50.3 [40] and for LD using TASSEL version 2.1 [41]. Two estimates of diversity, π and θ, were calculated. π is the average number of nucleotide differences per site between two sequences [42], and θ per site is derived from the total number of mutations (Eta) and corrected for sample size [43]. The pairwise nucleotide diversity (π) across all nine genes was calculated by merging all nine alignments into one alignment. To account for regions of missing data in the alignments, the pairwise deletion option was employed in DnaSP for calculating nucleotide diversity (π). Using this option, only gaps and missing data present in a particular pairwise comparison were ignored.
To test for neutrality of mutations, Tajima's D statistics were applied [44]. Tajima's D is based on the difference between the number of segregating sites and the average number of nucleotide differences. Tajima's D is a statistical method for testing the neutral mutation hypothesis, according to which mutations are classified into three types: neutral, deleterious, and beneficial. Since deleterious mutations are eliminated by selection and beneficial mutations are too rare to be noticed, only the neutral mutations should be randomly maintained in the history of evolution. Under the neutral mutation hypothesis, Tajima's D is expected to be zero, but will tend to be negative given an excess of rare alleles and positive given an excess of common alleles. Such deviations from neutrality can be caused by selection [44].
LD was estimated using squared allele-frequency correlations (r²), which summarize both recombinational and mutational history [45], and visualized using LD decay plots. Promoter elements were identified manually in Notepad (Microsoft Office 2003) from published promoter elements in VRN1 [5,6,19].
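The diversity and neutrality statistics described above can be illustrated with a minimal sketch. It is not the DnaSP implementation used in the study: it assumes a gap-free alignment of equal-length sequences, counts segregating sites (S) rather than the total number of mutations (Eta), which coincide only when all variable sites are bi-allelic, and uses the variance constants from Tajima's original formulation. The function names and the toy alignment are hypothetical.

```python
from itertools import combinations
from math import sqrt, nan


def segregating_sites(seqs):
    """Number of alignment columns holding more than one nucleotide."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return total / len(pairs)


def diversity_stats(seqs):
    """Return (pi per site, theta per site, Tajima's D) for aligned sequences."""
    n, length = len(seqs), len(seqs[0])
    s = segregating_sites(seqs)
    k = mean_pairwise_differences(seqs)
    a1 = sum(1 / i for i in range(1, n))      # sample-size correction
    a2 = sum(1 / i**2 for i in range(1, n))
    pi = k / length
    theta = s / a1 / length
    if s == 0:
        return pi, theta, nan  # D is undefined without segregating sites
    # Variance constants of Tajima's D
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    d = (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))
    return pi, theta, d


# Toy alignment (hypothetical data): two singleton SNPs among four sequences.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTACGTAC"]
pi, theta, tajima_d = diversity_stats(toy)
print(pi, theta, tajima_d)
```

In the toy alignment both SNPs are singletons, so π falls below θ and D is negative, which is the pattern described above for an excess of rare alleles.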
Fig. 1. Genomic structure of the sequenced gene fragments.
3. Results
3.1. Sequences and alignments
Candidate gene regions were initially isolated by genomic walking based on pre-existing sequence information from perennial ryegrass and/or closely related species. Subsequently, genomic fragments of nine genes putatively involved in floral control (six MADS-box genes, two PEBP genes and one zinc finger/CCT domain gene) were amplified from 10 to 20 perennial ryegrass genotypes (Supplementary Table 2). The FT gene fragment showed 86% identity to the Hordeum vulgare FT-like protein 4 gene (HvFT4) [GenBank: DQ411320] and 83% identity to Zea mays CENTRORADIALIS 26 (ZCN26) [GenBank: EU241916] and was therefore named Lolium perenne Sister of FT (LpSFT). The amplified fragments of LpMADS1, LpMADS4, LpMADS5, LpMADS6, LpMADS7, LpMADS10, and LpCO included the 5′ untranslated region (UTR) and part of the first exon. The LpTFL1 fragment included the 5′ UTR to exon four, while the LpSFT fragment included the 5′ UTR to exon three. Genomic structures of the amplified genomic fragments are summarized in Fig. 1. Alignment lengths ranged from 989 bp for LpMADS4 to 1327 bp for LpMADS1. The complete alignments were used for analysis of LD, while regions with gaps and missing data were excluded from analysis of nucleotide diversity.
3.2. Polymorphisms, haplotypes and heterozygosity
In total, the alignments spanned 8133 bp, excluding gaps and missing data, and 306 SNPs were identified. The number of SNPs ranged from four in LpMADS7 to 90 in LpMADS10. Only five and four SNPs were identified in LpMADS6 and LpMADS7, respectively, while LpMADS1, LpMADS4, LpMADS5, LpCO and LpTFL1 showed between 16 and 35 SNPs (Table 2). The highest numbers of SNPs, 88 and 90, were identified in LpSFT and LpMADS10, respectively. Tri-allelic SNPs were identified for LpMADS10, while only bi-allelic SNPs were identified for all other genes (Table 2). No correlation was observed between the number of genotypes sequenced per gene and the number of SNPs (data not shown). For several of the gene fragments, the majority of SNPs were due to rare alleles. For LpSFT, 87.5% of the SNPs resulted from an allele from the genotype LTS04, and for LpMADS5, 65.7% of the SNPs resulted from an
Table 1 Description of perennial ryegrass genotypes in the Lolium Test Set.
Code | Name | Type | Specificity | Heading
LTS01 | G00612 | Forage | Parent mapping population | Late
LTS02 | G00559 | Forage | Parent mapping population | Late
LTS03 | NGB9C2 | Forage | Parent mapping population | Late
LTS04 | Veyo9C1 | Forage | Parent mapping population | Early
LTS05 | DLF5 | Turf | Parent mapping population | Early
LTS06 | DLF6 | Turf | Parent mapping population | Late
LTS07 | G00851 | Forage | Parent mapping population | No data
LTS08 | G00852 | Forage | Parent mapping population | No data
LTS09 | RASP17-03 | Forage | RASP family self-fertility | Early
LTS10 | ILGI 80 | Forage | S12 S51F Heterozygous for both S and Z loci | No data
LTS11 | Lp 34-551 | Turf | Colchicine induced type | Late
LTS12 | INRA1 | Forage | Parent mapping population | Late
LTS13 | INRA2 | Forage | Parent mapping population | Intermediate
LTS14 | INRA3 | Turf | Parent mapping population | Intermediate
LTS15 | INRA4 | Ecotype | Mediterranean origin: Greece | Early
LTS16 | INRA5 | Ecotype | Nordic origin: Sweden | Intermediate
LTS17 | WSC 22/9 | Forage | From WSC mapping population | Early
LTS18 | WSC 23/9 | Forage | From WSC mapping population | Intermediate
LTS19 | ILGI P150/112 74 | Forage | From ILGI mapping population | Late
LTS20 | ILGI P150/112 166 | Forage | From ILGI mapping population | No data
Table 2 Summary of variable sites and InDels. The number of InDel sites is the total length in bp of all InDels in the analyzed gene and the number of InDel events is the number of different InDels.
Gene | Variable sites | Singletons | Bi-allelic | Tri-allelic | InDel sites | InDel events
LpMADS1 | 21 | 3 | 18 | 0 | 137 | 4
LpMADS4 | 22 | 9 | 13 | 0 | 11 | 4
LpMADS5 | 35 | 0 | 35 | 0 | 105 | 22
LpMADS6 | 5 | 4 | 1 | 0 | 3 | 3
LpMADS7 | 4 | 2 | 2 | 0 | 1 | 1
LpMADS10 | 90 | 2 | 80 | 4 | 514 | 46
LpCO | 16 | 14 | 2 | 0 | 22 | 1
LpSFT | 88 | 2 | 86 | 0 | 51 | 11
LpTFL1 | 29 | 24 | 5 | 0 | 25 | 6
All genes | 306 | 60 | 242 | 4 | 869 | 98
Table 3 Number of haplotypes, number of haplotypes per frequency class within the Lolium Test Set, and heterozygosity (het) in percent.
Gene | Haplotypes | ≤5% | >5–20% | >20–40% | >40–60% | >60–80% | % het
LpMADS1 | 9 | 4 | 3 | 2 | 0 | 0 | 47
LpMADS4 | 8 | 4 | 3 | 0 | 0 | 1 | 45
LpMADS5 | 4 | 0 | 3 | 0 | 0 | 1 | 0
LpMADS6 | 6 | 4 | 1 | 0 | 0 | 1 | 29
LpMADS7 | 5 | 3 | 0 | 1 | 1 | 0 | 15
LpMADS10 | 12 | 4 | 7 | 1 | 0 | 0 | 31
LpCO | 5 | 2 | 2 | 0 | 0 | 1 | 21
LpSFT | 5 | 2 | 1 | 1 | 1 | 0 | 10
LpTFL1 | 9 | 7 | 1 | 0 | 0 | 1 | 35
Weighted average | 7.2 | 3.6 | 2.3 | 0.5 | 0.2 | 0.6 | 25.9
LTS03 allele. Likewise, 81.3% and 55.2% of the SNPs in LpCO and LpTFL1 resulted from LTS07 alleles. All InDels were located in noncoding regions, except for a 22 bp singleton deletion within the first exon of LpCO in the genotype LTS08. The number of InDel events ranged from one in LpMADS7 and LpCO to 46 in LpMADS10 and the total length of all InDel polymorphisms per gene ranged from 1 bp in LpMADS7 to 514 bp in LpMADS10 (Table 2). The number of haplotypes ranged from four in LpMADS5 to 12 in LpMADS10 with a weighted average of 7.2 haplotypes per gene (Table 3). For LpMADS4, LpMADS5, LpMADS6, LpCO and LpTFL1, one common haplotype was observed while two common haplotypes were observed for LpMADS1, LpMADS7, and LpSFT. Rare haplotypes accounting for ≤5% of the alleles represented 48% of the 63 haplotypes. The level of heterozygosity ranged from 0% to 47%, with an average of 26% (Table 3).
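The haplotype counts, frequency classes and heterozygosity summarized in Table 3 could be tallied along the following lines. This is a sketch under stated assumptions rather than the procedure used in the study: each genotype is assumed to be represented by two phased allele sequences, identical strings are treated as one haplotype, and the class above 80% is added only so that every frequency falls into a bin. The names `haplotype_summary` and `FREQUENCY_CLASSES` and the example data are hypothetical.

```python
from collections import Counter

# Frequency classes as (lower bound, upper bound, label); bounds are (lo, hi].
FREQUENCY_CLASSES = [
    (0.00, 0.05, "<=5%"),
    (0.05, 0.20, ">5-20%"),
    (0.20, 0.40, ">20-40%"),
    (0.40, 0.60, ">40-60%"),
    (0.60, 0.80, ">60-80%"),
    (0.80, 1.00, ">80%"),  # not a Table 3 column; added for completeness
]


def haplotype_summary(genotypes):
    """genotypes: list of (allele_1, allele_2) sequence strings per plant.

    Returns (number of haplotypes, haplotypes per frequency class,
    percentage of heterozygous genotypes).
    """
    alleles = [allele for pair in genotypes for allele in pair]
    counts = Counter(alleles)
    per_class = Counter()
    for count in counts.values():
        frequency = count / len(alleles)
        for low, high, label in FREQUENCY_CLASSES:
            if low < frequency <= high:
                per_class[label] += 1
                break
    heterozygous = sum(a != b for a, b in genotypes)
    return len(counts), dict(per_class), 100 * heterozygous / len(genotypes)


# Hypothetical example: five genotypes, one common and two rare haplotypes.
example = [("AC", "AC"), ("AC", "AT"), ("AC", "AC"), ("AC", "GT"), ("AC", "AC")]
n_haplotypes, classes, het = haplotype_summary(example)
print(n_haplotypes, classes, het)
```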
3.3. Nucleotide diversity, selection, and linkage disequilibrium
Pairwise nucleotide diversity (π) ranged from 0.00042 in LpMADS6 to 0.02160 in LpMADS10 (Table 4). Nucleotide diversity was generally lowest in the coding regions (π = 0 in LpMADS7 to π = 0.01049 in LpMADS10) and highest in non-coding regions (π = 0.00043 in LpMADS6 to π = 0.03100 in LpMADS1). An exception was LpCO, for which nucleotide diversity was lowest in the non-coding region (π = 0.00106) and highest in the coding region (π = 0.00125), showing 7 synonymous and 7 non-synonymous SNPs. One non-synonymous SNP was identified in each of LpMADS1, LpMADS6, and LpTFL1, while only synonymous SNPs were identified in LpMADS4, LpMADS5, LpMADS7, LpMADS10 and LpSFT. Considering sampling size, total nucleotide diversity (θ/bp) was lowest in LpMADS7 (θ/bp = 0.00095) and highest in LpSFT (θ/bp = 0.02597, Table 4). Tajima's D was negative for eight out of the nine candidate genes. Out of these, Tajima's D was significant for LpCO (P < 0.01), LpMADS6 (P < 0.05) and LpTFL1 (P < 0.05). Tajima's D was positive and not significant for LpMADS1 (Table 4). Across all nine genes, π was 0.00790 (1 SNP per 127 bp) when calculated using a merged alignment of all genes. Grouped by gene family, the MADS-box genes showed the highest nucleotide diversity (π = 0.00960), followed by the PEBP genes (π = 0.00626) and the zinc finger/CCT domain gene (π = 0.00116) (Table 5).
Levels of LD ranged from almost complete LD within LpSFT to rapid LD decay within LpMADS6 (Fig. 2). Based on the extent of LD, the nine genes were divided into three groups (A, B, and C, Fig. 2). For group A (LpMADS1 and LpMADS10), LD decay (r² < 0.2) was observed within 850 bp. For group B (LpMADS6 and LpMADS7), very few SNPs were identified and LD decay was observed for all pairwise comparisons. For group C (LpMADS4, LpMADS5, LpCO, LpSFT and LpTFL1), LD extended throughout the alignments.
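As an illustration of the r² measure behind these LD decay results, the sketch below computes r² between pairs of bi-allelic SNPs from phased haplotypes and pairs each value with the distance in bp. It is a simplified stand-in for the TASSEL analysis, not a reproduction of it; the function names, positions and alleles are hypothetical.

```python
from itertools import combinations


def r_squared(site_a, site_b):
    """Squared allele-frequency correlation between two bi-allelic sites.

    Each argument lists the allele carried by every haplotype, in the same
    haplotype order.
    """
    n = len(site_a)
    ref_a, ref_b = site_a[0], site_b[0]  # arbitrary reference alleles
    p_a = sum(x == ref_a for x in site_a) / n
    p_b = sum(y == ref_b for y in site_b) / n
    p_ab = sum(x == ref_a and y == ref_b for x, y in zip(site_a, site_b)) / n
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("both sites must be polymorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / denominator


def ld_by_distance(sites):
    """sites maps bp position -> alleles; returns (distance, r2) per SNP pair."""
    return [(abs(p - q), r_squared(sites[p], sites[q]))
            for p, q in combinations(sorted(sites), 2)]


# Hypothetical SNPs: positions 100 and 350 are fully linked, 900 is not.
sites = {100: "AAGG", 350: "CCTT", 900: "CTCT"}
decay = ld_by_distance(sites)
print(decay)

# Two singletons carried by different haplotypes among 20 sequences.
singleton_a = ["T"] + ["C"] * 19
singleton_b = ["G", "A"] + ["G"] * 18
print(r_squared(singleton_a, singleton_b))
```

The last call shows why sites with singletons on different alleles, as in group B, give r² far below the 0.2 threshold: with 20 haplotypes the value is 1/(20 − 1)², about 0.003.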
3.4. Regulatory regions in LpMADS1
Promoter regions of the nine genes were examined for regulatory regions and polymorphisms previously identified in VRN1 in wheat and perennial ryegrass [5,6,19]. Four putative VRN1 regulatory regions have been reported in wheat and perennial ryegrass: an ACGT core-binding site for a bZIP transcription factor, a putative VRN-box (TTAAAACCCCTCCCC) and two CArG-boxes [5,46–48]. ACGT core-binding sites were identified 377, 360, and 297 bp upstream of the LpMADS1 translation start site (ATG) (Fig. 1), and the site 360 bp upstream of ATG was identified in all genotypes. Genotypes LTS02, -03, -06, -08, -10, -14, -16 and -17 were homozygous for one ACGT core-binding site in the LpMADS1 promoter, while LTS01, -04, -09 and -11 were homozygous for three ACGT core-binding sites. LTS05, -07 and -13 were heterozygous for one and three ACGT core-binding sites, while sequence data were missing for LTS12 and -18. A putative VRN-box was identified 209 bp upstream of ATG, and a CT (TTAAAACCTCCCTCCCC; LTS01, -04, -09, -11, and -18) or a TCCTT (TTAAAACCTCCTTCCTCCCC; LTS02, -03, -06, -08, -10, -12, -14, and -17) insertion was identified in the putative VRN-box. LTS05, -07 and -13 were heterozygous for the inserts, while data were missing for LTS16. A CArG-box (CCTCGTTTTGG) identified in wheat was not identified in the present study, whereas another CArG-box (CCAAATTAAG) previously identified in perennial ryegrass [5] was present in all alleles. Between LTS03 and
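Table 4 Summary of DNA polymorphism and diversity estimates.
Gene | Region | Sites (bp)a | InDels (sites)b | SNP sitesc | Polymorphic sites in %d | πe | θ/bpf | Tajima's Dg
LpMADS1 | Entire region | 396 | 137 | 21 | 5.3 | 0.01992 | 0.01297 | 1.83429 ns
LpMADS1 | Non-coding regions | 216 | 137 | 17 | 7.9 | 0.03100 | 0.01925 | –
LpMADS1 | Coding regions, all sites | 180 | 0 | 4 | 2.2 | 0.00663 | 0.00543 | –
LpMADS1 | Coding regions, synonymous | 43 | – | 3 | 7.0 | 0.02640 | 0.01706 | –
LpMADS1 | Coding regions, non-synonymous | 137 | – | 1 | 0.7 | 0.00043 | 0.00179 | –
LpMADS4 | Entire region | 977 | 11 | 22 | 2.3 | 0.00203 | 0.00529 | −2.05913 ns
LpMADS4 | Non-coding regions | 801 | 11 | 19 | 2.4 | 0.00206 | 0.00558 | –
LpMADS4 | Coding regions, all sites | 176 | 0 | 3 | 1.7 | 0.00190 | 0.00401 | –
LpMADS4 | Coding regions, synonymous | 42.67 | – | 3 | 7.0 | 0.00784 | 0.01653 | –
LpMADS4 | Coding regions, non-synonymous | 131.33 | – | 0 | 0 | 0 | 0 | –
LpMADS5 | Entire region | 940 | 105 | 35 | 3.7 | 0.00689 | 0.01021 | −1.26163 ns
LpMADS5 | Non-coding regions | 764 | 105 | 33 | 4.3 | 0.00802 | 0.01185 | –
LpMADS5 | Coding regions, all sites | 176 | 0 | 2 | 1.1 | 0.00197 | 0.00312 | –
LpMADS5 | Coding regions, synonymous | 42.33 | – | 2 | 4.7 | 0.00818 | 0.01296 | –
LpMADS5 | Coding regions, non-synonymous | 131.67 | – | 0 | 0 | 0 | 0 | –
LpMADS6 | Entire region | 1007 | 3 | 5 | 0.5 | 0.00042 | 0.00128 | −1.86266*
LpMADS6 | Non-coding regions | 822 | 3 | 4 | 0.5 | 0.00043 | 0.00125 | –
LpMADS6 | Coding regions, all sites | 185 | 0 | 1 | 0.5 | 0.00039 | 0.00139 | –
LpMADS6 | Coding regions, synonymous | 46.83 | – | 0 | 0 | 0 | 0 | –
LpMADS6 | Coding regions, non-synonymous | 136.17 | – | 1 | 0.7 | 0.00052 | 0.00189 | –
LpMADS7 | Entire region | 989 | 1 | 4 | 0.4 | 0.00066 | 0.00095 | −0.74374 ns
LpMADS7 | Non-coding regions | 868 | 1 | 4 | 0.5 | 0.00075 | 0.00108 | –
LpMADS7 | Coding regions, all sites | 121 | 0 | 0 | 0 | 0 | 0 | –
LpMADS7 | Coding regions, synonymous | 29.50 | – | 0 | 0 | 0 | 0 | –
LpMADS7 | Coding regions, non-synonymous | 90.50 | – | 0 | 0 | 0 | 0 | –
LpMADS10 | Entire region | 940 | 514 | 90 | 9.6 | 0.02160 | 0.02377 | −0.34483 ns
LpMADS10 | Non-coding regions | 762 | 514 | 82 | 10.8 | 0.02420 | 0.02672 | –
LpMADS10 | Coding regions, all sites | 178 | 0 | 8 | 4.5 | 0.01049 | 0.01116 | –
LpMADS10 | Coding regions, synonymous | 43.28 | – | 8 | 18.5 | 0.04303 | 0.04590 | –
LpMADS10 | Coding regions, non-synonymous | 133.72 | – | 0 | 0 | 0 | 0 | –
LpCO | Entire region | 977 | 22 | 16 | 1.6 | 0.00119 | 0.00390 | −2.25791**
LpCO | Non-coding regions | 308 | 0 | 2 | 0.6 | 0.00106 | 0.00155 | –
LpCO | Coding regions, all sites | 669 | 22 | 14 | 2.1 | 0.00125 | 0.00498 | –
LpCO | Coding regions, synonymous | 152.51 | – | 7 | 4.6 | 0.00305 | 0.01092 | –
LpCO | Coding regions, non-synonymous | 513.49 | – | 7 | 1.4 | 0.00072 | 0.0034 | –
LpSFT | Entire region | 955 | 51 | 88 | 9.2 | 0.01961 | 0.02597 | −1.00383 ns
LpSFT | Non-coding regions | 681 | 51 | 82 | 12.0 | 0.02583 | 0.03394 | –
LpSFT | Coding regions, all sites | 274 | 0 | 6 | 2.2 | 0.00415 | 0.00617 | –
LpSFT | Coding regions, synonymous | 67.67 | – | 6 | 8.9 | 0.01680 | 0.02499 | –
LpSFT | Coding regions, non-synonymous | 205.33 | – | 0 | 0 | 0 | 0 | –
LpTFL1 | Entire region | 952 | 25 | 29 | 3.0 | 0.00266 | 0.00716 | −2.15874*
LpTFL1 | Non-coding regions | 621 | 24 | 21 | 3.4 | 0.00275 | 0.00795 | –
LpTFL1 | Coding regions, all sites | 331 | 1 | 8 | 2.4 | 0.00248 | 0.00568 | –
LpTFL1 | Coding regions, synonymous | 78.15 | – | 7 | 9.0 | 0.00987 | 0.02106 | –
LpTFL1 | Coding regions, non-synonymous | 242.85 | – | 1 | 0.4 | 0.00021 | 0.00097 | –
a Number of sites excluding sites with gaps/missing data.
b Total number of InDel sites analyzed.
c Total number of mutations, Eta.
d Polymorphic sites in percentage, measured as polymorphic sites in the target region divided by the total number of sites in the target region excluding sites with gaps/missing data.
e Nucleotide diversity, π, explained in Section 2.
f Theta (per site) from Eta, explained in Section 2.
g Tajima's D: ns, non-significant; * significant at P < 0.05; ** significant at P < 0.01.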
LTS04, a 325 bp InDel polymorphism was previously reported [6]. This InDel, located 1080–755 bp upstream from ATG, was found to be widespread among the genotypes included in this study. LTS01, -04, -09, -10, -12 and -18 carried the insertion, LTS03, -06, -14 and
-17 carried the deletion, while LTS02, -05, -07, -08, -11, -13, and -16 were heterozygous. Neither InDel nor regulatory element polymorphisms were found to discriminate early and late flowering genotypes within the LTS.
Fig. 2. Linkage disequilibrium decay within floral control genes. Plots of squared correlations of allele frequencies (r²) against bp distance between pairs of SNPs in the candidate genes LpMADS1, LpMADS4, LpMADS5, LpMADS6, LpMADS7, LpMADS10, LpCO, LpSFT, and LpTFL1. Based on the extent of LD, the nine genes were divided into three groups: A, B, and C. Group A includes LpMADS1 and LpMADS10, which show LD decay within 850 bp. Group B includes LpMADS6 and LpMADS7, in which low LD (r² < 0.2) was observed for all pairwise comparisons of SNPs. Group C includes LpMADS4, LpMADS5, LpCO, LpSFT and LpTFL1, in which LD extends across the 1 kb included in the analysis. Curves show nonlinear regression of r² on distance in bp.
Table 5 Comparison of nucleotide diversity in different gene classes for the LTS genotypes. MADS includes the genes LpMADS1, LpMADS4, LpMADS5, LpMADS6, LpMADS7 and LpMADS10. PEBP includes LpSFT and LpTFL1, and Zinc/CCT includes LpCO. The no. of sites is the total number of sites in the merged alignments, whereas the sites analyzed excludes gaps and missing data in a particular pairwise comparison. The number of differences and π are based on pairwise comparisons.
Gene family | No. of genes | No. of sites | Sites analyzed | No. of differences | π
MADS | 6 | 6818 | 4127.10 | 39.185 | 0.00960
PEBP | 2 | 2083 | 1259.85 | 9.363 | 0.00626
Zinc/CCT | 1 | 999 | 997.84 | 1.159 | 0.00116
All genes | 9 | 9900 | 6261.44 | 49.297 | 0.00790
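Because π is the expected proportion of differing sites between two randomly sampled sequences, its reciprocal gives the average spacing between SNPs in such a comparison. The short sketch below applies this to the π values of Table 5; the helper name `bp_per_snp` is hypothetical, and the per-family spacings are plain arithmetic on the tabulated values, not figures reported in the study.

```python
def bp_per_snp(pi):
    """Average number of bp per SNP between two randomly sampled sequences."""
    return 1.0 / pi


# Pairwise nucleotide diversity per gene family, copied from Table 5.
table5_pi = {"MADS": 0.00960, "PEBP": 0.00626, "Zinc/CCT": 0.00116,
             "All genes": 0.00790}

spacing = {family: round(bp_per_snp(pi)) for family, pi in table5_pi.items()}
print(spacing["All genes"])  # 127, i.e. one SNP per 127 bp
```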
4. Discussion
4.1. Nucleotide diversity and LD decay in relation to gene function
Six of the nine genes analyzed in the present study are MADS-box genes known to be differentially expressed during vernalization and/or transition from vegetative to reproductive growth [4,5]. LpMADS10 is an SVP-like MADS-box gene [5], and both VRN1 and SVP-like genes are involved in timing of flowering, VRN1 as a promoter and SVP-like genes as suppressors of floral meristem development in cereals [22,49]. In addition, VRN1 is specifically involved in the vernalization response in both perennial ryegrass [3–6] and cereals [49–51]. As LTS genotypes originate from throughout Europe and as flowering time is an important trait in the adaptation to different environments, a high degree of allelic diversity could thus be expected at loci involved in control of flowering time. In agreement with this, relatively high levels of nucleotide diversity and rapid LD decay were observed at the LpMADS1 and LpMADS10 loci within the LTS.
For LpMADS5, 23 of the 35 SNPs resulted from a rare allele in LTS03. LTS03 is a late flowering ecotype [3], originating from the Southern part of Denmark, and it could be expected that rare alleles, even at otherwise quite conserved loci, are more frequent in ecotypes than in domesticated plant materials. Few SNPs were identified in LpMADS6 and LpMADS7 and r² was below 0.2 for all pairwise comparisons, owing to the fact that most SNPs were singletons identified in different alleles. Tajima's D was negative for LpMADS5, LpMADS6 and LpMADS7, but significant only for LpMADS6, indicating selection and an excess of rare alleles at this locus. The low level of nucleotide diversity of LpMADS5 (ignoring the rare allele from LTS03), LpMADS6, and LpMADS7 is in agreement with an expected function of the encoded proteins in determining floral organ identity.
Loci involved in floral organ development, and hence in reproductive ability, could, in contrast to loci involved in timing of flowering, be expected to be under high selection pressure across the LTS.
LpCO shows low nucleotide diversity (π = 0.00119), extended LD, and significant selection within the LTS. This is in agreement with a function of LpCO in promoting flowering under long days [7], which would apply across Europe. Nucleotide diversity at the LpCO locus was found to be higher in a genomic fragment spanning 5.6 kb and including a putative peroxidase precursor (LpPX1) gene, intergenic region, and exon 1 of LpCO (π = 0.00500) in a sample of 96 genotypes originating from semi-natural and variety accessions [37]. In addition, Tajima's D values for the LpCO locus contrast between the present study and the study of Skøt et al. [37], who did not observe selection at the LpCO locus. While the analyzed genomic region overlaps between the two studies, the neighboring LpPX1 gene is included in the analysis by Skøt et al. [37]. In addition, several of the LTS genotypes originate from breeding programs, while the majority of genotypes included in the study by Skøt et al. [37] were derived from semi-natural populations. Thus, discrepancies between studies likely result from differences in size and nature of both analyzed plant materials and DNA sequence.
The two PEBP genes LpSFT and LpTFL1 both showed intermediate nucleotide diversity compared to the other genes analyzed in the present study. Transcription of a barley homologue of LpSFT (HvFT4) has been shown to increase during long days and to be positively correlated with vegetative to reproductive transition of the apex [29], and LpTFL1 has previously been shown to repress flowering and control meristem identity in perennial ryegrass [9]. While non-significant for LpSFT, Tajima's D was negative and significant for LpTFL1, indicating selection and an excess of rare alleles. LD extended across the LpTFL1 and LpSFT gene fragments, also when excluding rare alleles from the analysis.
Tajima's D was negative for eight out of the nine genes and significant for LpMADS6, LpCO and LpTFL1, indicating an overrepresentation of rare alleles in the sample. However, for LpCO and LpTFL1, one allele from LTS07 accounted for more than 50% of the SNPs. When removing this allele from the analysis, Tajima's D was no longer significant (data not shown). Thus, as the LTS constitutes a relatively small sample, a single rare allele in the sample might 'inflate' Tajima's D estimates, which should therefore be interpreted with caution.
4.2. Putative causative variation in LpMADS1 and LpCO
VRN1 has been shown to be a central vernalization response gene in both perennial ryegrass and cereals [3–6,19,49–51], and polymorphisms in regulatory regions of the promoter and first intron have been shown to differentiate winter and spring types of wheat and barley [19]. However, genetic regulation of the vernalization response might have diverged between perennial ryegrass and wheat. Perennial ryegrass is considered a short day–long day plant which requires short days and/or vernalization followed by long days to induce flowering, while winter wheat is considered a long day plant which only requires vernalization and not short days to induce flowering.
However, in some genotypes of winter wheat short days can replace the vernalization requirement [52], suggesting that wheat, like perennial ryegrass, is originally a short day–long day plant and supporting a similar genetic regulation of the vernalization response in wheat and perennial ryegrass. In perennial ryegrass, a 325 bp InDel polymorphism located in the promoter region of LpMADS1 has been reported between two genotypes (LTS03 and LTS04) contrasting in vernalization requirement and flowering time [6]. While polymorphic within the LTS, this InDel did not discriminate early and late heading genotypes, and is thus not likely to be involved in regulation of LpMADS1 transcription. In the promoter region of wheat VRN1, ACGT core-binding sites, a VRN-box, and a CArG-box have all been suggested to be involved in transcriptional regulation. The ACGT core-binding sites are binding sites for FDL2, a bZIP transcription factor which interacts with the FT protein; the VRN-box is a putative binding site for an unknown repressor; and the CArG-box is a putative binding site for the VRT2 protein to regulate VRN1 transcription in wheat [19,46,47]. Recently, however, the CArG-box was found to be non-essential for the vernalization response in wheat [48]. In the LTS, either one or three ACGT core-binding sites were identified, the VRN-box was found to be partly conserved, and the CArG-box previously identified in perennial ryegrass [5] was identified in all alleles of LpMADS1. However, none of the observed polymorphisms in the LpMADS1 promoter discriminated early- and
late heading genotypes in the LTS. Length variation in the first intron of VRN1 has been shown to correlate with vernalization requirement in wheat [53], and while conservation of promoter motifs between cereals and perennial ryegrass indicates similar regulatory mechanisms of VRN1, other regulatory regions are likely to be of more functional importance. In LpCO, a C/A SNP located 178 bp upstream from the coding region has previously been associated with flowering time [37]. However, while this SNP was also observed, it did not discriminate early and late heading genotypes in the LTS. While this indicates that it is not a true causative SNP, it might be in LD with other causative polymorphisms in the LpCO 3′ region, which was analyzed neither by Skøt et al. [37] nor in the present study. In addition, it cannot be excluded that the variation in flowering time observed in the present plant materials (Table 1) is caused by genetic variation in flowering time genes other than LpMADS1 and LpCO. It should be noted that the LTS constitutes a relatively small sample not selected specifically to represent genetic diversity in relation to flowering time. Thus, the present observations should only be considered indicative and should be validated in larger samples.
4.3. Comparison with other gene classes and species
Considering all nine floral control candidate genes, one SNP per 127 bp (π = 0.00790) was observed within the LTS. This is relatively low compared to resistance genes, for which one SNP per 33 bp (π = 0.0314) was previously observed within the LTS [36]. Not unexpectedly, the greatest difference in nucleotide diversity between the two gene classes was observed within coding regions. In addition, extended LD was observed within some floral control genes while LD decay was observed within 15–25 bp for nucleotide binding site and leucine rich repeat (NBS-LRR) like genes and within 300–900 bp for non-NBS-LRR genes among the resistance genes [36]. 
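The correspondence between π and the "one SNP per N bp" figures quoted at the start of this section can be illustrated with a minimal sketch. It assumes π is computed as the average number of pairwise differences per site over a gap-free alignment, and that the expected spacing between differences in two randomly sampled sequences is 1/π. The function names `nucleotide_diversity` and `bp_per_snp` are hypothetical helpers introduced here for illustration; they are not the DnaSP routines used in the study.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for aligned sequences.

    Assumption: all sequences have equal length and contain no gaps
    or missing data (a simplification of what dedicated software does).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching sites over all pairs of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def bp_per_snp(pi):
    """Expected number of bp per difference between two random sequences."""
    return 1.0 / pi


# Toy alignment (hypothetical data, not from the LTS).
toy = ["AAAA", "AAAT", "AATT"]
print(nucleotide_diversity(toy))

# Average pi reported for the nine floral control genes.
print(round(bp_per_snp(0.00790)))
```

Under these assumptions, the reported average of π = 0.00790 corresponds to roughly one difference per 127 bp, in line with the figure given in the text.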
On average, more than double the number of haplotypes per gene (16.3) was observed for the resistance genes [36] compared to the floral control genes (7.2) in the present study. For the floral control genes, a common haplotype with an allele frequency of more than 40% was observed for seven of the nine gene fragments, whereas for the resistance genes a common haplotype with an allele frequency of more than 40% was observed for only three of the eleven gene fragments [36]. Similarly, 78% of resistance gene alleles were rare (represented in less than 5% of the sample; [36]) whereas 48% of floral control gene alleles were rare. These differences are in agreement with the biological functions in which the two gene classes are involved. Genes involved in floral control would be expected to be under high, and somewhat similar, selection pressure throughout Europe, whereas the high variability of resistance genes is in agreement with their role in multiallelic gene-by-gene interactions with pathogen isolates, which might vary considerably over space and time. However, similar to the results from the resistance genes, relatively high levels of homozygosity were observed for the floral control genes. This is consistent with the origin of several of the LTS genotypes from breeding programs, in which some degree of allele fixation could be expected. Low nucleotide diversity has previously been reported for genes involved in floral control in other species, exemplified by hexaploid wheat FT (π = 0.001 [54]), maize Dwarf8 (π = 0.0018 [55]) and Arabidopsis FLC (π = 0.0044 [56]). However, in Arabidopsis, high nucleotide diversity has also been reported for the floral control gene CRY2 (π = 0.0125) [57]. In perennial ryegrass, LD has been shown to extend 500–2000 bp within LpCAD2 and LpCCR1, encoding enzymes involved in lignin biosynthesis [58]. 
In contrast, LD decay within 500 bp was detected for LpFT1 and Lp1-SST, which encode enzymes involved in oligosaccharide metabolism [58], and rapid LD decay has also been reported for a drought tolerance gene (LpASRa2) [59]. Thus, while the LTS is a relatively small sample, levels of LD are similar to those previously reported for perennial ryegrass.
4.4. Prospects of LD mapping in perennial ryegrass
Breeding for quantitative traits like flowering time can be greatly facilitated by identification of causative genes and, optimally, causative polymorphisms from which functional markers can be derived [60]. Traditionally, quantitative trait locus (QTL) mapping, based on genetic linkage in mapping populations, has been the method of choice for identifying genome regions affecting complex traits [61]. However, due to the limited number of recombination events occurring during construction of a mapping population, mapping resolution is relatively poor, on the order of megabases [62]. Recently, association mapping, based on LD in unrelated individuals, has emerged as a powerful alternative to linkage based QTL mapping [55,63]. Ancestral recombination allows for high mapping resolution, and putative causative polymorphisms have been identified by association mapping in floral control genes of Arabidopsis (FRI, FLC, CRY2 and GI) [56,57,64–66], maize (Dwarf8 and Vgt1) [55,67], wheat (FT) [54], barley (VRN1 and VRN2) [68] and perennial ryegrass (LpCO) [37]. The extent of LD dictates which association mapping approach is more appropriate. Rapid LD decay provides high genetic resolution, allows for a candidate gene approach, and facilitates identification of quantitative trait nucleotides (QTN). In contrast, extended LD facilitates genome scans. In accordance with the out-crossing and heterozygous nature of perennial ryegrass, the present and previous studies [36,37,58,59,69,70] indicate relatively low levels of LD in perennial ryegrass. Consequently, a candidate gene based association mapping approach could be expected to identify causative polymorphisms in large and diverse populations of perennial ryegrass genotypes. 
In contrast, extended LD might be identified in less diverse populations, and synthetic varieties developed from relatively few parents might be suitable for genome scans to identify novel candidate genes [70]. The level of LD varies greatly with the number of parents of the synthetic variety, and significant LD has been observed up to 1.6 Mb in a variety originating from six related parents [70]. To perform a genome scan, at least a few SNP markers per haplotype block are needed in order to distinguish the most common haplotypes [71]. The estimated genome size of Lolium perenne is 2.7 Gb [72]. In the mentioned synthetic variety, approximately 1700 linkage blocks would be expected (2700 Mb/1.6 Mb ≈ 1688). Thus, with five markers per linkage block, at least 8500 SNP markers would be necessary to cover the genome. However, this approach might not provide a higher resolution than QTL mapping in a population with two parents using the same number of markers. The rapid progress in the development of genomic tools, including genome sequencing and high-density SNP genotyping [73], will allow genome wide association studies in perennial ryegrass within the foreseeable future. Until then, however, candidate gene association studies can be utilized to further dissect causative genetic diversity for the several traits for which candidate genes are already identified in perennial ryegrass [36,58,59,74–77].
Acknowledgements
This study was conducted in the frame of the EU framework V project GRASP (QLRT-2001-00862).
Appendix A. Supplementary data
Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.plantsci.2010.08.015.
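The marker-number estimate for a genome scan given in Section 4.4 can be reproduced with a short calculation. The sketch below assumes that the genome is tiled by equally sized, non-overlapping linkage blocks and that every block receives the same number of markers; the function name `markers_for_genome_scan` is a hypothetical helper introduced only for this illustration.

```python
import math


def markers_for_genome_scan(genome_size_mb, ld_block_mb, markers_per_block):
    """Return (number of linkage blocks, number of markers needed).

    Assumption: blocks are equally sized and non-overlapping, so the
    block count is the genome size divided by the LD extent, rounded up.
    """
    blocks = math.ceil(genome_size_mb / ld_block_mb)
    return blocks, blocks * markers_per_block


# Values taken from the text: 2.7 Gb genome, LD up to 1.6 Mb,
# five markers per linkage block.
blocks, markers = markers_for_genome_scan(2700, 1.6, 5)
print(blocks, markers)
```

With these inputs the calculation gives 1688 blocks and 8440 markers, which the text rounds to approximately 1700 blocks and 8500 markers. A variety with shorter LD blocks would require proportionally more markers.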
References [1] C.S. Jensen, K. Salchert, C. Gao, C. Andersen, T. Didion, K.K. Nielsen, Floral inhibition in red fescue (Festuca rubra L.) through expression of a heterologous flowering repressor from Lolium, Mol. Breed. 13 (2004) 37–48. [2] T.S. Aamlid, O.M. Heide, B. Boelt, Primary and secondary induction requirements for flowering of contrasting European varieties of Lolium perenne, Ann. Bot. 86 (2000) 1087–1095. [3] L.B. Jensen, J.R. Andersen, U. Frei, Y.Z. Xing, C. Taylor, P.B. Holm, T.L. Lübberstedt, QTL mapping of vernalization response in perennial ryegrass (Lolium perenne L.) reveals co-location with an orthologue of wheat VRN1, Theor. Appl. Genet. 110 (2005) 527–536. [4] K. Petersen, T. Didion, C.H. Andersen, K.K. Nielsen, MADS-box genes from perennial ryegrass differentially expressed during transition from vegetative to reproductive growth, J. Plant Physiol. 161 (2004) 439–447. [5] K. Petersen, E. Kolmos, M. Folling, K. Salchert, M. Storgaard, C.S. Jensen, T. Didion, K.K. Nielsen, Two MADS-box genes from perennial ryegrass are regulated by vernalization and involved in the floral transition, Physiol. Plantarum 126 (2006) 268–278. [6] J.R. Andersen, L.B. Jensen, T. Asp, T. Lübberstedt, Vernalization response in perennial ryegrass (Lolium perenne L.) involves orthologues of diploid wheat (Triticum monococcum) VRN1 and rice (Oryza sativa) Hd1, Plant Mol. Biol. 60 (2006) 481–494. [7] J. Martin, M. Storgaard, C.H. Andersen, K.K. Nielsen, Photoperiodic regulation of flowering in perennial ryegrass involving a CONSTANS-like homolog, Plant Mol. Biol. 56 (2004) 159–169. [8] I.P. Armstead, L. Skøt, L.B. Turner, K. Skøt, I.S. Donnison, M.O. Humphreys, I.P. King, Identification of perennial ryegrass (Lolium perenne (L.)) and meadow fescue (Festuca pratensis (Huds.)) candidate orthologous sequences to the rice Hd1(Se1) and barley HvCO1 CONSTANS-like genes through comparative mapping and microsynteny, New Phytol. 167 (2005) 239–247. [9] C.S. Jensen, K. Salchert, K.K. 
Nielsen, A Terminal Flower1-like gene from perennial ryegrass involved in floral transition and axillary meristem identity, Plant Physiol. 125 (2001) 1517–1528. [10] R.W. King, T. Moritz, L.T. Evans, J. Martin, C.H. Andersen, C. Blundell, I. Kardailsky, P.M. Chandler, Regulation of flowering in the long-day grass Lolium temulentum by gibberellins and the FLOWERING LOCUS T gene, Plant Physiol. 141 (2006) 498–507. [11] B. Studer, L.B. Jensen, A. Fiil, T. Asp, “Blind” mapping of genic DNA sequence polymorphisms in Lolium perenne L. by high resolution melting curve analysis, Mol. Breed. 24 (2009) 191–199. [12] Z. Schwarz-Sommer, P. Huijser, W. Nacken, H. Saedler, H. Sommer, Genetic control of flower development by homeotic genes in Antirrhinum majus, Science 250 (1990) 931–936. [13] A. Becker, G. Theißen, The major clades of MADS-box genes and their role in the development and evolution of flowering plants, Mol. Phylogenet. Evol. 29 (2003) 464–489. [14] J.J. Doyle, Evolution of a plant homeotic multigene family—toward connecting molecular systematics and molecular developmental genetics, Syst. Biol. 43 (1994) 307–328. [15] M.D. Purugganan, S.D. Rounsley, R.J. Schmidt, M.F. Yanofsky, Molecular evolution of flower development—diversification of the plant Mads-box regulatory gene family, Genetics 140 (1995) 345–356. [16] G. Theißen, H. Saedler, Mads-box genes in plant ontogeny and phylogeny: Haeckel’s ‘biogenetic law’ revisited, Curr. Opin. Genet. Dev. 5 (1995) 628–639. [17] G. Theißen, J.T. Kim, H. Saedler, Classification and phylogeny of the MADS-box multigene family suggest defined roles of MADS-box gene subfamilies in the morphological evolution of eukaryotes, J. Mol. Evol. 43 (1996) 484–516. [18] G. Theißen, A. Becker, A. Di Rosa, A. Kanno, J.T. Kim, T. Münster, K.U. Winter, H. Saedler, A short history of MADS-box genes in plants, Plant Mol. Biol. 42 (2000) 115–149. [19] A. Distelfeld, C. Li, J. Dubcovsky, Regulation of flowering in temperate cereals, Curr. Opin. 
Plant Biol. 12 (2009) 178–184. [20] D.E. Soltis, H. Ma, M.W. Frohlich, P.S. Soltis, V.A. Albert, D.G. Oppenheimer, N.S. Altman, C. de Pamphilis, J. Leebens-Mack, The floral genome: an evolutionary history of gene duplication and shifting patterns of gene expression, Trends Plant Sci. 12 (2007) 358–367. [21] U. Hartmann, S. Höhmann, K. Nettesheim, E. Wisman, H. Saedler, P. Huijser, Molecular cloning of SVP: a negative regulator of the floral transition in Arabidopsis, Plant J. 21 (2000) 351–360. [22] B. Trevaskis, M. Tadege, M.N. Hemming, W.J. Peacock, E.S. Dennis, C. Sheldon, Short vegetative phase-like MADS-box genes inhibit floral meristem identity in barley, Plant Physiol. 143 (2007) 225–235. [23] J. Putterill, F. Robson, K. Lee, R. Simon, G. Coupland, The CONSTANS gene of Arabidopsis promotes flowering and encodes a protein showing similarities to zinc-finger transcription factors, Cell 80 (1995) 847–857. [24] M.J. Yanovsky, S.A. Kay, Molecular basis of seasonal time measurement in Arabidopsis, Nature 419 (2002) 308–312. [25] A. Samach, H. Onouchi, S.E. Gold, G.S. Ditta, Z. Schwarz-Sommer, M.F. Yanofsky, G. Coupland, Distinct roles of CONSTANS target genes in reproductive development of Arabidopsis, Science 288 (2000) 1613–1616. [26] P.A. Wigge, M.C. Kim, K.E. Jaeger, W. Busch, M. Schmid, J.U. Lohmann, D. Weigel, Integration of spatial and temporal information during floral induction in Arabidopsis, Science 309 (2005) 1056–1059.
[27] I. Kardailsky, V.K. Shukla, J.H. Ahn, N. Dagenais, S.K. Christensen, J.T. Nguyen, J. Chory, M.J. Harrison, D. Weigel, Activation tagging of the floral inducer FT, Science 286 (1999) 1962–1965. [28] F. Chardon, C. Damerval, Phylogenomic analysis of the PEBP gene family in cereals, J. Mol. Evol. 61 (2005) 579–590. [29] S. Faure, J. Higgins, A. Turner, D.A. Laurie, The FLOWERING LOCUS T-like gene family in barley (Hordeum vulgare), Genetics 176 (2007) 599–609. [30] T. Izawa, T. Oikawa, N. Sugiyama, T. Tanisaka, M. Yano, K. Shimamoto, Phytochrome mediates the external light signal to repress FT orthologs in photoperiodic flowering of rice, Genes Dev. 16 (2002) 2006–2020. [31] S. Zhang, W. Hu, L. Wang, C. Lin, B. Cong, C. Sun, D. Luo, TFL1/CEN-like genes control intercalary meristem activity and phase transition in rice, Plant Sci. 168 (2005) 1393–1408. [32] O.N. Danilevskaya, X. Meng, Z. Hou, E.V. Ananiev, C.R. Simmons, A genomic and expression compendium of the expanded PEBP gene family from maize, Plant Physiol. 146 (2008) 250–264. [33] Y. Kobayashi, H. Kaya, K. Goto, M. Iwabuchi, T. Araki, A pair of related genes with antagonistic roles in mediating flowering signals, Science 286 (1999) 1960–1962. [34] S.A. Flint-Garcia, J.M. Thornsberry, E.S. Buckler, Structure of linkage disequilibrium in plants, Annu. Rev. Plant Biol. 54 (2003) 357–374. [35] S. Myles, J. Peiffer, P.J. Brown, E.S. Ersoz, Z. Zhang, D.E. Costich, E.S. Buckler, Association mapping: critical considerations shift from genotyping to experimental design, Plant Cell 21 (2009) 2194–2202. [36] Y. Xing, U. Frei, B. Schejbel, T. Asp, T. Lübberstedt, Nucleotide diversity and linkage disequilibrium in 11 expressed resistance candidate genes in Lolium perenne, BMC Plant Biol. 7 (2007). [37] L. Skøt, J. Humphreys, M.O. Humphreys, D. Thorogood, J. Gallagher, R. Sanderson, I.P. Armstead, I.D. 
Thomas, Association of candidate genes with flowering time and water-soluble carbohydrate content in Lolium perenne (L.), Genetics 177 (2007) 535–547. [38] U.K. Posselt, P. Barre, G. Brazauskas, L.B. Turner, Comparative analysis of genetic similarity between perennial ryegrass genotypes investigated with AFLPs, ISSRs, RAPDs and SSRs, Czech J. Genet. Plant Breed. 42 (2006) 87–94. [39] K. Katoh, K. Misawa, K.i. Kuma, T. Miyata, MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform, Nucl. Acids Res. 30 (2002) 3059–3066. [40] J. Rozas, J.C. Sánchez-DelBarrio, X. Messeguer, R. Rozas, DnaSP, DNA polymorphism analyses by the coalescent and other methods, Bioinformatics 19 (2003) 2496–2497. [41] P.J. Bradbury, Z. Zhang, D.E. Kroon, T.M. Casstevens, Y. Ramdoss, E.S. Buckler, TASSEL: software for association mapping of complex traits in diverse samples, Bioinformatics 23 (2007) 2633–2635. [42] M. Nei, Molecular Evolutionary Genetics, Columbia University Press, New York, 1987. [43] G.A. Watterson, On the number of segregating sites in genetical models without recombination, Theor. Popul. Biol. 7 (1975) 256–276. [44] F. Tajima, Statistical method for testing the neutral mutation hypothesis by DNA polymorphism, Genetics 123 (1989) 585–595. [45] B.S. Weir, Genetic Data Analysis II, Sinauer, Sunderland, 1996. [46] N.A. Kane, Z. Agharbaoui, A.O. Diallo, H. Adam, Y. Tominaga, F. Ouellet, F. Sarhan, TaVRT2 represses transcription of the wheat vernalization gene TaVRN1, Plant J. 51 (2007) 670–680. [47] C.X. Li, J. Dubcovsky, Wheat FT protein regulates VRN1 transcription through interactions with FDL2, Plant J. 55 (2008) 543–554. [48] B. Pidal, L. Yan, D. Fu, F. Zhang, G. Tranquilli, J. Dubcovsky, The CArG-box located upstream from the transcriptional start of wheat vernalization gene VRN1 is not necessary for the vernalization response, J. Hered. 100 (2009) 355–364. [49] L. Yan, A. Loukoianov, G. Tranquilli, M. Helguera, T. Fahima, J. 
Dubcovsky, Positional cloning of the wheat vernalization gene VRN1, Proc. Natl. Acad. Sci. 100 (2003) 6263–6268. [50] L. Yan, M. Helguera, K. Kato, S. Fukuyama, J. Sherman, J. Dubcovsky, Allelic variation at the VRN-1 promoter region in polyploid wheat, Theor. Appl. Genet. 109 (2004) 1677–1686. [51] J. von Zitzewitz, P. Szücs, J. Dubcovsky, L. Yan, E. Francia, N. Pecchioni, A. Casas, T. Chen, P. Hayes, J. Skinner, Molecular and structural characterization of barley vernalization genes, Plant Mol. Biol. 59 (2005) 449–467. [52] J. Dubcovsky, A. Loukoianov, D. Fu, M. Valarik, A. Sanchez, L. Yan, Effect of photoperiod on the regulation of wheat vernalization genes VRN1 and VRN2, Plant Mol. Biol. 60 (2006) 469–480. [53] D. Fu, P. Szücs, L. Yan, M. Helguera, J.S. Skinner, J. von Zitzewitz, P.M. Hayes, J. Dubcovsky, Large deletions within the first intron in VRN-1 are associated with spring growth habit in barley and wheat, Mol. Genet. Genomics 273 (2005) 54–65. [54] I. Bonnin, M. Rousset, D. Madur, P. Sourdille, L. Dupuits, D. Brunel, I. Goldringer, FT genome A and D polymorphisms are associated with the variation of earliness components in hexaploid wheat, Theor. Appl. Genet. 116 (2008) 383–394. [55] J.M. Thornsberry, M.M. Goodman, J. Doebley, S. Kresovich, D. Nielsen, E.S. Buckler, Dwarf8 polymorphisms associate with variation in flowering time, Nat. Genet. 28 (2001) 286–289. [56] A.L. Caicedo, J.R. Stinchcombe, K.M. Olsen, J. Schmitt, M.D. Purugganan, Epistatic interaction between Arabidopsis FRI and FLC flowering time genes generates a latitudinal cline in a life history trait, Proc. Natl. Acad. Sci. 101 (2004) 15670–15675.
[57] K.M. Olsen, S.S. Halldorsdottir, J.R. Stinchcombe, C. Weinig, J. Schmitt, M.D. Purugganan, Linkage disequilibrium mapping of Arabidopsis CRY2 flowering time alleles, Genetics 167 (2004) 1361–1369. [58] R. Ponting, M. Drayton, N. Cogan, M. Dobrowolski, G. Spangenberg, K. Smith, J. Forster, SNP discovery, validation, haplotype structure and linkage disequilibrium in full-length herbage nutritive quality genes of perennial ryegrass (Lolium perenne L.), Mol. Genet. Genomics 278 (2007) 585–597. [59] N. Cogan, R. Ponting, A. Vecchies, M. Drayton, J. George, P. Dracatos, M. Dobrowolski, T. Sawbridge, K. Smith, G. Spangenberg, J. Forster, Gene-associated single nucleotide polymorphism discovery in perennial ryegrass (Lolium perenne L.), Mol. Genet. Genomics 276 (2006) 101–112. [60] J.R. Andersen, T. Lübberstedt, Functional markers in plants, Trends Plant Sci. 8 (2003) 554–560. [61] J.B. Holland, Genetic architecture of complex traits in plants, Curr. Opin. Plant Biol. 10 (2007) 156–161. [62] E. Ersoz, J. Yu, E.S. Buckler, Application of linkage disequilibrium and association mapping in crop plants, in: R. Varshney, R. Tuberosa (Eds.), Genomic Assisted Crop Improvement, Vol. 1, Genomics Approaches and Platforms, US Government, 2007, pp. 97–120. [63] C. Zhu, M. Gore, E.S. Buckler, J. Yu, Status and prospects of association mapping in plants, Plant Genome 1 (2008) 5–20. [64] J. Hagenblad, M. Nordborg, Sequence variation and haplotype structure surrounding the flowering time locus FRI in Arabidopsis thaliana, Genetics 161 (2002) 289–298. [65] M.J. Aranzana, S. Kim, K.Y. Zhao, E. Bakker, M. Horton, K. Jakob, C. Lister, J. Molitor, C. Shindo, C.L. Tang, C. Toomajian, B. Traw, H.G. Zheng, J. Bergelson, C. Dean, P. Marjoram, M. Nordborg, Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance genes, PLoS Genet. 1 (2005) 531–539. [66] M.T. Brock, P. Tiffin, C. 
Weinig, Sequence diversity and haplotype associations with phenotypic responses to crowding: GIGANTEA affects fruit set in Arabidopsis thaliana, Mol. Ecol. 16 (2007) 3050–3062. [67] S. Ducrocq, D. Madur, J.B. Veyrieras, L. Camus-Kulandaivelu, M. Kloiber-Maitz, T. Presterl, M. Ouzunova, D. Manicacci, A. Charcosset, Key impact of Vgt1 on flowering time adaptation in maize: evidence from association mapping and ecogeographical information, Genetics 178 (2008) 2433–2437. [68] J. Cockram, J. White, F.J. Leigh, V.J. Lea, E. Chiapparino, D.A. Laurie, I.J. Mackay, W. Powell, D.M. O’Sullivan, Association mapping of partitioning loci in barley, BMC Genetics 9 (2008). [69] L. Skøt, M.O. Humphreys, I. Armstead, S. Heywood, K.P. Skøt, R. Sanderson, I.D. Thomas, K.H. Chorlton, N.R.S. Hamilton, An association mapping approach to identify flowering time genes in natural populations of Lolium perenne (L.), Mol. Breed. 15 (2005) 233–245. [70] J. Auzanneau, C. Huyghe, B. Julier, P. Barre, Linkage disequilibrium in synthetic varieties of perennial ryegrass, Theor. Appl. Genet. 115 (2007) 837–847. [71] K. Zhang, Z.S. Qin, J.S. Liu, T. Chen, M.S. Waterman, F. Sun, Haplotype block partitioning and tag SNP selection using genotype data and their applications to association studies, Genome Res. 14 (2004) 908–916. [72] M.D. Bennett, I.J. Leitch, Nuclear DNA amounts in angiosperms: progress, problems and prospects, Ann. Bot. 95 (2005) 45–90. [73] J.A. Rafalski, Association genetics in crop improvement, Curr. Opin. Plant Biol. 13 (2010) 174–180. [74] N.O.I. Cogan, K.F. Smith, T. Yamada, M.G. Francki, A.C. Vecchies, E.S. Jones, G.C. Spangenberg, J.W. Forster, QTL analysis and comparative genomics of herbage quality traits in perennial ryegrass (Lolium perenne L.), Theor. Appl. Genet. 110 (2005) 364–380. [75] C. Zhang, S.Z. Fei, S. Warnke, L. Li, D. Hannapel, Identification of genes associated with cold acclimation in perennial ryegrass, J. Plant Physiol. 166 (2009) 1436–1445. [76] I.P. Armstead, L.B. Turner, A.H. Marshall, M.O. Humphrey, I.P. King, D. Thorogood, Identifying genetic components controlling fertility in the outcrossing grass species perennial ryegrass (Lolium perenne) by quantitative trait loci analysis and comparative genetics, New Phytol. 178 (2008) 559–571. [77] P. Dracatos, N. Cogan, M. Dobrowolski, T. Sawbridge, G. Spangenberg, K. Smith, J. Forster, Discovery and genetic mapping of single nucleotide polymorphisms in candidate genes for pathogen defence response in perennial ryegrass (Lolium perenne L.), Theor. Appl. Genet. 117 (2008) 203–219.