CHAPTER 13
Genomic Selection in Wheat Daniel W. Sweeney, Jin Sun, Ella Taagen, Mark E. Sorrells Plant Breeding and Genetics, Cornell University, Ithaca, NY, United States
Contents 13.1 Introduction 13.2 High-Throughput Phenotyping 13.2.1 High-Throughput Phenotyping Platforms 13.2.2 Application of High-Throughput Phenotyping in GS 13.3 Genotype by Environment Interaction 13.3.1 Use of Environmental Covariates in Prediction Models 13.3.2 Accounting for and Exploiting Epistasis 13.4 Genomic Selection for Wheat Disease Resistance 13.4.1 Wheat Rusts 13.4.2 Fusarium Head Blight 13.4.3 Other Wheat Diseases 13.5 Genomic Selection in Wheat for Nutritional Traits 13.6 Genomic Selection for Wheat Quality Traits 13.6.1 Milling and Flour Quality 13.6.2 Preharvest Sprouting 13.7 Future Prospects 13.8 Conclusion References
Applications of Genetic and Genomic Research in Cereals. https://doi.org/10.1016/B978-0-08-102163-7.00013-2
Copyright © 2019 Elsevier Ltd. All rights reserved.
13.1 INTRODUCTION
Improved utilization of quantitative traits is imperative to the continuous progress of plant breeding. While marker-assisted selection (MAS) and other marker technologies have aided in manipulating simple traits controlled by one or several genes, they are inefficient methods of selection for quantitative traits, which are polygenic and controlled by many small-effect loci (Bernardo, 2008; Heffner et al., 2009). Genomic selection (GS) is a strategy that utilizes phenotypes and high-density marker scores to predict the genomic breeding values of lines in a population (Heffner et al., 2009). By incorporating all marker information in the prediction model, GS captures more variation due to small-effect quantitative trait loci (QTL) and is suited for improving traits with both high and low heritability. Additionally, GS theory indicates that genetic gain per unit time can be improved as a result of
reduced phenotyping per generation and selection at any stage. Considering the declining costs of genotyping and sequencing services, and the stagnant to rising costs of phenotyping, GS is a tool with the potential to substantially improve breeding for high-value quantitative traits that are difficult and/or expensive to phenotype. Application of GS in wheat has the potential to significantly improve the breeding effort for quantitative traits including biotic and abiotic stress tolerance, nutritional and end-use quality, and yield. GS theory was originally developed by animal breeders because of the high cost of phenotyping and the inability to replicate individual genotypes, and has shown promise in many crops including wheat (Meuwissen et al., 2001; Heffner et al., 2011a,b; Lorenzana and Bernardo, 2009). As the limits of QTL mapping for breeding became apparent, notably that biparental mapping populations are not well adapted to many breeding applications and the statistical methods used to identify loci and implement MAS are insufficient for improving quantitative traits, plant breeders began experimenting with GS (Bernardo and Yu, 2007; Heffner et al., 2009). GS theory proposes using a training population that has been whole-genome genotyped and phenotyped in the target population of environments to predict the performance of related genotyped individuals that do not have phenotypic records (Meuwissen et al., 2001). Genotypic and phenotypic data from the training population are used to train a prediction model that uses genotypic information to calculate genomic estimated breeding values (GEBVs) for the breeding or selection population. The first empirical studies of GS in small grains were published between 2009 and 2011 (de los Campos et al., 2009; Crossa et al., 2010; Heffner et al., 2011a). Studies by Crossa et al. (2010) and de los Campos et al.
(2009) used data from the International Maize and Wheat Improvement Center (CIMMYT) wheat breeding program and concluded that including genomic markers in a model provided greater accuracy of the predicted breeding value than a model limited to pedigree relationships. Heffner et al. (2011a,b) compared the prediction accuracies of GS, MAS, and phenotypic selection for nine grain quality traits in soft white wheat, and concluded that GS could increase the rate of genetic gain per unit time and cost in a wheat breeding program. The notable advantages of GS to the breeder are the potential for a shortened breeding cycle and/or increased selection accuracy, depending on the trait, with an increased rate of genetic gain. Additionally, GS models are useful for predicting unobserved genotypes and/or unobserved environments (Burgueño et al., 2012; Heslot et al., 2014). For a more in-depth review of the methodology see Heffner et al. (2009) and Lorenz et al. (2011).
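As a rough illustration of the training and prediction steps described above, the following Python sketch fits a ridge-regression model to a simulated training population and computes GEBVs for selection candidates. Everything in it is an assumption made for illustration and is not taken from the studies cited: the simulated marker matrix and effect sizes, the function names rr_blup_effects and predict_gebv, and the shrinkage parameter, which is set here from the simulated variances (in practice the variance components would be estimated, for example by REML).

```python
import numpy as np


def rr_blup_effects(X_train, y_train, lam):
    """Estimate ridge-regression marker effects for a fixed shrinkage value lam."""
    mu = y_train.mean()                      # overall mean (fixed effect)
    col_means = X_train.mean(axis=0)         # marker means used for centering
    Xc = X_train - col_means
    n_markers = Xc.shape[1]
    # Solve (X'X + lam * I) beta = X'(y - mu); works even when markers >> lines.
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers),
                           Xc.T @ (y_train - mu))
    return mu, col_means, beta


def predict_gebv(X_new, mu, col_means, beta):
    """GEBV = mean + sum of centered marker scores times estimated effects."""
    return mu + (X_new - col_means) @ beta


# ASSUMPTION: simulated inbred lines with markers coded -1/1 and purely
# additive effects; none of these numbers come from a real wheat data set.
rng = np.random.default_rng(1)
n_train, n_cand, n_markers = 200, 50, 500
n_total = n_train + n_cand
sigma_b = 0.1                                # SD of simulated marker effects
sigma_e2 = n_markers * sigma_b ** 2          # residual variance (heritability near 0.5)

X = rng.choice([-1.0, 1.0], size=(n_total, n_markers))
true_effects = rng.normal(0.0, sigma_b, size=n_markers)
y = X @ true_effects + rng.normal(0.0, np.sqrt(sigma_e2), size=n_total)

X_train, y_train = X[:n_train], y[:n_train]
X_cand, y_cand = X[n_train:], y[n_train:]    # candidates: genotyped only

lam = sigma_e2 / sigma_b ** 2                # ratio of residual to marker variance
mu, col_means, beta = rr_blup_effects(X_train, y_train, lam)
gebv = predict_gebv(X_cand, mu, col_means, beta)

# Prediction accuracy as commonly reported: correlation between GEBVs and
# phenotypes that were withheld from training.
accuracy = np.corrcoef(gebv, y_cand)[0, 1]
print(f"prediction accuracy on candidates: {accuracy:.2f}")
```

The candidates' phenotypes are used only to score the predictions, which mirrors how cross-validated prediction accuracies are typically obtained in the studies reviewed in this chapter.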
Different wheat breeding programs will have different breeding methods and unique goals, germplasm, and constraints that influence strategy, models, and application of GS. Important considerations for a wheat breeder initiating GS include timing, training population composition and size, phenotyping and genotyping design, budget, and statistical modeling. GS can be applied to early or late stages of wheat breeding, with the respective goals of rapid cycling or increased selection accuracies. Choosing which individuals to include in the training population can be challenging and directly affects the prediction accuracy. Population structure and identification of potential subpopulations are important, as previous studies have clearly indicated that more closely related training and breeding populations lead to higher GS accuracies (Asoro et al., 2011; Wang et al., 2014). The size of the population required is dependent on multiple factors such as trait heritability and level of relatedness, but generally larger populations offer more accurate predictions (Heffner et al., 2011a,b). Multiple studies have indicated that GS accuracy increases linearly with the training population size, and observed plateaus in accuracy may be trait specific. For example, a linear increase in prediction accuracy was observed by Heffner et al. (2011a) when the training population size was increased from 24 to 96 within biparental populations, as well as for predictions within a population of advanced breeding lines when the training population was increased from 96 to 288 (Heffner et al., 2011b). Population size is also affected by the relationship between the training population and selection candidates, with smaller, closely related populations requiring fewer individuals (Rutkoski et al., 2015b). 
Density and genome distribution are the primary considerations for genetic markers, and genotyping by sequencing (GBS) has been widely used to generate large numbers of nonredundant evenly distributed markers (Heslot et al., 2013b). As a self-pollinated crop, wheat has relatively high levels of linkage disequilibrium between the markers and QTL, and GS may be relatively accurate using fewer markers than in outcrossing species (Heffner et al., 2011b; Rutkoski et al., 2013). It is important to note that accurate phenotypes are required for training GS models and there is a need for potentially novel phenotyping strategies that can maximize replication of the allele rather than the individual (Heslot et al., 2014). The training population should be frequently updated to maintain accuracy of the model’s phenotype and allele marker correlations. For example, if GS is applied early in a breeding scheme, the model should be retrained with the progeny of the selection candidates after two to three rounds of selection (Heffner et al., 2009; Rutkoski et al., 2015b).
A range of GS models has been intensively studied for accurate calculation of GEBVs of individuals in the selection population. For an in-depth comparison of GS models, see Heslot et al. (2012) and Pérez-Rodríguez et al. (2012). To briefly acquaint the reader with models commonly referenced in this chapter, ridge-regression best linear unbiased prediction (RR-BLUP) and genomic BLUP (G-BLUP) are the two models most frequently used for GS. These are mixed models that take relatedness into account via additive relationship or genomic matrices, respectively, and can easily fit models with many more markers than observations. RR-BLUP and G-BLUP follow the assumption of the infinitesimal model that all loci along the genome have a small effect on a trait. These models shrink all marker effects equally toward zero and are equivalent (Endelman, 2011). Bayesian models assume prior distributions for marker effects and weight marker effects differentially. The most commonly used is Bayes-Cπ, which allows some markers to have zero variance with a probability π (Juliana et al., 2017b). These are potentially appropriate models for traits that may be oligogenic. Reproducing Kernel Hilbert Spaces (RKHS) models are semiparametric and are used to capture nonadditive effects. Random forest (RF) is a machine learning approach. Most published GS studies compare multiple statistical models for accuracy, and many have reported little or no difference in prediction accuracy between models (Heslot et al., 2012). The BLUP models are often preferred because they require considerably less computation time, are relatively easy to use, and typically perform as well as or better than Bayesian models (Meuwissen et al., 2001; Heffner et al., 2011b; Juliana et al., 2017b). However, a comparison study of wheat, barley, Arabidopsis, and maize data sets with 11 different models by Heslot et al.
(2012) suggested that GS in plant breeding could be based on the Bayesian Lasso or weighted Bayesian shrinkage regression models and RF. Estimation of nonadditive effects may be desirable in some circumstances, especially when predicting fixed genotypes from a clonal or inbred training population. Several wheat studies have compared additive and nonadditive models (Mirdita et al., 2015; He et al., 2016). This chapter reviews recent empirical GS studies in wheat and applications to breeding strategies in wheat. The majority of the studies cited are genomic prediction studies in the sense that they are not actually selecting, crossing, and calculating gain from selection using GEBVs from GS models. However, such studies suggest that in practice GS could lead to faster cycling and enable increased selection intensities. At this point in time, GS
studies in wheat have covered the major trait classes of interest in bread wheat: yield, disease resistance, and quality. High-throughput phenotyping is beginning to be integrated into GS modeling, as are crop modeling strategies. GS also provides a new tool for exploiting genetic variation in diversity panels and for dealing with the perennial challenge of genotype by environment interactions. Genomics-enabled breeding strategies are critical for wheat breeders to meet the challenges of climate change, emerging diseases, water scarcity, and population growth in the 21st century. GS is a promising tool to help wheat breeders in their mission to improve the livelihoods of wheat growers and consumers worldwide.
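The equivalence of RR-BLUP and G-BLUP noted earlier in this section (Endelman, 2011) can be checked numerically. The sketch below is a minimal illustration under stated assumptions: simulated, centered marker scores, a fixed shrinkage parameter lam standing in for the ratio of residual to marker-effect variance, and an unscaled relationship matrix K = XX'. Dividing K by a constant, as is usual for a realized genomic relationship matrix, only rescales lam.

```python
import numpy as np

# ASSUMPTION: simulated data; 30 lines, 200 markers coded -1/1.
rng = np.random.default_rng(0)
n_lines, n_markers = 30, 200
X = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))
X = X - X.mean(axis=0)                 # center marker scores
y = rng.normal(size=n_lines)
y = y - y.mean()                       # center phenotypes
lam = 50.0                             # fixed shrinkage parameter

# RR-BLUP: estimate one effect per marker, then sum effects per line.
beta = np.linalg.solve(X.T @ X + lam * np.eye(n_markers), X.T @ y)
g_rr = X @ beta

# G-BLUP: predict genetic values directly from the relationship matrix.
K = X @ X.T
g_gblup = K @ np.linalg.solve(K + lam * np.eye(n_lines), y)

max_diff = np.max(np.abs(g_rr - g_gblup))
print(f"largest difference between the two predictions: {max_diff:.2e}")
```

The marker-based form solves a system with one equation per marker, whereas the relationship-based form solves one with one equation per line, which is part of the computational appeal of the BLUP models when markers greatly outnumber lines.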
13.2 HIGH-THROUGHPUT PHENOTYPING
13.2.1 High-Throughput Phenotyping Platforms
With recent rapid progress in genotyping technology, phenotyping has become a critical factor that could impede further advances of plant breeding because of the time and labor required, as well as the accuracy of the data. For these reasons, considerable effort has been put into the development of high-throughput phenotyping (HTP) platforms in crops, in an attempt to generate large-scale, high-density phenotypes with high accuracy and low cost. Field-based HTP platforms including ground-based platforms, aerial platforms, unmanned aerial systems or vehicles (UAS or UAV), spectral satellite imaging, and others (White et al., 2012; Araus and Cairns, 2014; Shakoor et al., 2017) have been established based on remote or proximal sensing and imaging technology. Remote sensing and imaging techniques usually fall into three categories: visible/near-infrared (VIS/NIR) spectroradiometry, infrared thermometry and thermal imaging, and red, green, and blue (RGB) light color digital photography (Araus and Cairns, 2014). Each HTP platform has its own advantages, and different sensor and image technologies can be deployed based on the traits of interest and the experimental design in the field (Shakoor et al., 2017). Moreover, they are currently applicable in many crops including wheat (Haghighattalab et al., 2016; Liebisch et al., 2015; Tanger et al., 2017; Watanabe et al., 2017). For example, a UAV mounted with RGB cameras was successfully used to estimate wheat plant height and growth rate (Holman et al., 2016), and a UAS equipped with a blue-green-NIR camera was utilized to capture vegetation indices for large wheat breeding nurseries (Haghighattalab et al., 2016), in which the vegetation indices are capable of predicting complex traits such
as grain yield (Rutkoski et al., 2016; Sun et al., 2017) and disease resistance (Bauriegel et al., 2011; Devadas et al., 2015). In addition to the field-based HTP platforms, HTP has also been developed for laboratory-level evaluations of plant samples and for near-infrared spectroscopy (NIRS) analysis of grain characteristics (Araus and Cairns, 2014).
13.2.2 Application of High-Throughput Phenotyping in GS
HTP platforms provide an opportunity to apply HTP traits in GS. In addition to genotyping, GS models require reliable phenotypes. HTP platforms allow relatively accurate and rapid phenotypic data collection on a large scale, which is useful for GS model training. For example, Watanabe et al. (2017) applied UAV remote sensing in genomic prediction modeling for height measurement of sorghum, and evaluated the possibility of replacing the traditional manual measurement with the HTP data for GS. Although the genomic prediction accuracy was lower using UAV data (in the range of 0.448–0.634) than with traditional manual measurement (in the range of 0.629–0.675), the results indicated that UAV remote sensing is feasible. Because of their cost and labor efficiency, HTP platforms are able to monitor the treatments during multiple plant growth stages, which enables the comparison of plant heights at the same stage (Watanabe et al., 2017). The application of HTP platforms in most crops is still at an early stage. However, in the long term, the prediction accuracy of UAV remote sensing has considerable potential relative to traditional measurements once the measurement errors from HTP platforms are reduced through improvements in experimental designs and HTP technologies (Watanabe et al., 2017). An HTP platform for wheat plant height has been developed as well (Holman et al., 2016) using UAV-based remote sensing, and similar approaches could be applied for wheat genomic prediction. Traits from HTP platforms could also be applied in multi-trait GS to predict complex traits. A multi-trait GS model (Jia and Jannink, 2012) takes advantage of correlated traits to improve the genomic prediction accuracy for the trait of interest. When different HTP traits are correlated with the trait of interest, such traits can be utilized to improve genetic gain (Sun et al., 2017).
Grain yield is a complex quantitative trait that is influenced by environment. Canopy temperature (CT) and normalized difference vegetation index (NDVI) are genetically correlated with grain yield (Rutkoski et al., 2016) and can be easily measured from HTP platforms (Haghighattalab et al., 2016). Rutkoski et al. (2016) and Sun et al. (2017) utilized CT and NDVI from HTP platforms to predict wheat grain yield using multi-trait
GS models and achieved an improvement in the prediction accuracy for grain yield by 70% on average in different environments. Compared to traditional measurements, especially for an environment-dependent trait such as CT, HTP platforms are superior to hand measurements because of the reduced data collection time and measurement errors caused by the external environment. In addition, the time-series data for CT and NDVI from HTP platforms offers the opportunity to select wheat cultivars with high predicted grain yield at early plant growth stages enabling selection to occur before harvest. HTP data can be used in combination with genotypes to improve genomic prediction accuracy, which would be highly beneficial in situations where grain yield could not be phenotyped due to severe weather or because of a lack of seeds (Rutkoski et al., 2016). The applications of HTP platforms in GS demonstrate their capability to expedite crop selection and increase genetic gains. It is expected that more HTP traits from HTP platforms will be accessible in the near future. However, the development of HTP platforms is lagging far behind the genotyping technologies. In order to effectively promote the application of HTP, more efforts are needed to develop low cost and high-performance HTP platforms (Shakoor et al., 2017) and to improve the big data integration, processing, and management from the HTP platforms (Araus and Cairns, 2014; Shakoor et al., 2017).
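For readers unfamiliar with NDVI, the index used above as a secondary trait is conventionally computed from canopy reflectance in the near-infrared and red bands as (NIR - Red) / (NIR + Red). The short Python sketch below shows the calculation on per-plot values; the function name ndvi and the reflectance numbers are hypothetical and are not taken from any of the cited studies, and a given HTP sensor may use different bands or a modified index.

```python
import numpy as np


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)     # NaN where the index is undefined
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


# ASSUMPTION: hypothetical mean reflectance values for three field plots.
plot_nir = np.array([0.52, 0.45, 0.60])
plot_red = np.array([0.08, 0.12, 0.05])
plot_ndvi = ndvi(plot_nir, plot_red)
print(np.round(plot_ndvi, 3))
```

Values closer to 1 indicate a denser, greener canopy. Repeating the calculation on each flight date gives the kind of time series of secondary-trait records that can enter a multi-trait GS model alongside grain yield.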
13.3 GENOTYPE BY ENVIRONMENT INTERACTION
When using GS to predict grain yield or plant biomass, genotype by environment (G×E) interaction is an important factor affecting the prediction accuracy, more so in plants than in animals. Whole-genome genotyping has created the opportunity for new approaches to characterize and understand G×E interaction. Studies have evaluated the impact of G×E interaction on prediction accuracy and the use of environmental information in prediction models. Genomic prediction studies can use historical multi-environment phenotypic data together with available genotypic data to base selection on a target set of environments, which may encompass more than the typical few years of data used by classical advanced testing programs (Heffner et al., 2009). It is important to note that atypical environments may impact GS more than conventional phenotypic selection because, in phenotypic selection, phenotypic data are only used to select or discard breeding lines, not to train a model. Consequently, genetic gain from phenotypic selection is only affected for a short time, whereas in GS atypical data can affect marker effect estimates
and the predicted performance of relatives, which will alter selection criteria in future selection cycles. GS may also be able to predict the performance of individuals in untested environments by using the phenotypes of their relatives that have been tested in those environments. Crossa et al. (2010) and Burgueño et al. (2011) were among the first to report that marker effects were different across environmental conditions. Burgueño et al. (2011) compared linear mixed models and factor analytic models for their predictive ability. When G×E interaction was important, modeling G×E interaction with the factor analytic model increased prediction accuracy, but when G×E interaction was not significant, prediction accuracies were similar for all the models. Burgueño et al. (2012), using the same 599 CIMMYT wheat lines as Crossa et al. (2010), were the first to apply the modeling of G×E covariance structures in multi-environment trials. When they included G×E interaction in the model, the prediction ability for unobserved individuals increased by about 20% compared to a single-environment prediction model. Models that included both markers and pedigrees were more predictive than those that included one or the other. Additionally, prediction accuracies were higher for predicting the performance of genotypes in untested environments than for predicting untested genotypes. This study showed that modeling the genetic covariance between environments could increase the accuracy when predicting performance within specific environments for lines that were only observed in some of the environments. The model was not predictive for lines with no phenotypic information, and environmental covariates (ECs) were not incorporated in the model. Other studies that included G×E interaction in the prediction model also observed an increase in accuracy (Heslot et al., 2013a; Jarquín et al., 2014; Lado et al., 2016; Lopez-Cruz et al., 2015; Cuevas et al., 2016a,b; Pérez-Rodríguez et al., 2017).
However, Dawson et al. (2013) analyzed 17 years of data from the Semi-Arid Wheat Yield Trials (SAWYT) of the International Maize and Wheat Improvement Center (CIMMYT) and found that there was no difference in accuracy between the models that took into account G×E interactions and models that did not. They also grouped locations based on genomic predictions for all genotypes in all environments, but this did not improve accuracies within groups. They concluded that there was no consistent pattern of G×E interaction among the mega-environments, and the unbalanced dataset could not be partitioned into clusters that had predictive power. Lado et al. (2016) characterized and modeled G×E interaction to design sets of environments having low G×E interaction to improve the prediction
accuracy of wheat genotype performance in untested environments. They employed mixed models to construct a covariance matrix across the environments using a highly unbalanced dataset to predict performance within or across the different sets of environments. They found that the best method for predicting new genotypes was to borrow information from relatives evaluated in other environments and then model the correlation matrix across the environments. For predicting the performance of genotypes in new environments, the best approach was either to predict within defined mega-environments or across locations for single years. Lopez-Cruz et al. (2015) modeled G×E interaction using a marker × environment (M×E) interaction GS model. They compared the M×E interaction model with both within-environment and across-environment analyses. When predicting the performance of genotypes in environments in which they had not been tested, the M×E interaction model produced 10% higher prediction accuracy than the within-environment analysis and 37% higher accuracy than the across-environment analysis. They concluded that the M×E interaction model could help identify markers linked to genes that contribute to stability across environments as well as those that interact with environments. However, the M×E interaction model is somewhat limited in its ability to interpret patterns of G×E interaction that are neutral or negative, so the authors concluded that the model is best suited for the analysis of positively correlated environments. Cuevas et al. (2016a,b) used a Bayesian genomic kernel model to account for the correlations between environments. The model that included G×E interaction consistently had higher prediction accuracy than single-environment models. In an earlier study, Cuevas et al. (2016a,b) used nonlinear Gaussian kernels to model M×E interaction in the wheat dataset reported in Crossa et al. (2010).
The RKHS model was extended to take into account the G×E interaction. For the wheat data, the prediction model including G×E interaction was up to 60% more accurate. Continuing their study of M×E interaction, Crossa et al. (2016) evaluated the use of priors that produce shrinkage (Bayesian ridge regression) and variable selection (BayesB). They evaluated the genomic prediction accuracy of models for M×E interaction, within and across environments, in a durum wheat population. They found that the M×E interaction model minimized the model residual variance and improved data-fitting gain for more simply inherited traits compared to more complex traits such as grain yield. Interestingly, the M×E interaction model found markers linked to major genes for heading date and their effects were stable across environments.
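The structure of the M×E interaction models discussed above can be summarized in one line. The form below is a common way of writing such a model and should be read as a sketch; the notation is assumed here and is not quoted from the chapter. For environment j, y_j holds the phenotypes, X_j the centered marker scores of the lines tested there, b_0 the marker effects shared by all environments, and b_j the environment-specific deviations.

```latex
% Assumed notation: environments j = 1, ..., J
\[
\mathbf{y}_j = \mu_j \mathbf{1} + \mathbf{X}_j \left( \mathbf{b}_0 + \mathbf{b}_j \right) + \boldsymbol{\varepsilon}_j,
\qquad
\mathbf{b}_0 \sim N\!\left(\mathbf{0}, \sigma^2_{b_0}\mathbf{I}\right), \quad
\mathbf{b}_j \sim N\!\left(\mathbf{0}, \sigma^2_{b_j}\mathbf{I}\right), \quad
\boldsymbol{\varepsilon}_j \sim N\!\left(\mathbf{0}, \sigma^2_{\varepsilon_j}\mathbf{I}\right)
\]
```

Under this form, markers with large shared effects in b_0 contribute to stability across environments, while large entries in b_j flag markers that interact with a particular environment. Because the only link between environments is the shared term, whose variance cannot be negative, the implied genetic covariance between environments is non-negative, which is consistent with the model being best suited to positively correlated environments.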
For grain yield, several large marker effects were identified in all the chromosome groups. The above studies that model G×E interaction are useful for understanding the relationships among environments and for increasing GS prediction accuracy. However, they are based on observed covariances among environments and, consequently, they can only describe past performance instead of being predictive of future performance. Pérez-Rodríguez et al. (2017) used single-step genomic and pedigree models to determine the prediction accuracy of 58,798 CIMMYT wheat lines grown under different management conditions including irrigated, drought, late heat, severe drought, and early heat over seven seasons in Ciudad Obregon, Mexico. Only 29,484 individuals had genotypes, consisting of 9045 markers. The trained models were used to predict grain yield of some lines at sites in South Asia. Phenotypes, pedigrees, and genotypes were used to evaluate the lines using a single-step model that combined pedigree and marker information into a unified H matrix. For the models without G×E, pedigree alone gave the highest accuracies (0.16–0.26). When G×E was included in the model, pedigree alone produced the highest accuracies in four of the six South Asia sites (0.25–0.29), markers were best for one site (0.39), and pedigree plus markers was most accurate at one site (0.25). There was a clear advantage to including G×E interaction in the model, but overall pedigree alone produced accuracies as good as genotypes or genotypes plus pedigrees. Heslot et al. (2013a) took a novel approach to analyze unbalanced data sets that are typical in plant breeding testing programs. Although not all genotypes are observed in all the environments, all marker effects are observed in all environments, allowing the use of Euclidean distances to determine the relationships among environments.
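The distance calculation described above can be sketched in a few lines of Python. The sketch assumes that marker effects have already been estimated separately within each environment and stored as one row per environment; the function name environment_distances and the small effects matrix are hypothetical.

```python
import numpy as np


def environment_distances(marker_effects):
    """Pairwise Euclidean distances between environments.

    marker_effects: array of shape (n_environments, n_markers), one row of
    estimated marker effects per environment.
    """
    effects = np.asarray(marker_effects, dtype=float)
    diff = effects[:, None, :] - effects[None, :, :]   # all pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1))


# ASSUMPTION: toy effects for three environments and three markers.
effects = np.array([[0.2, -0.1, 0.0],
                    [0.2, -0.1, 0.1],
                    [-0.3, 0.4, 0.0]])
D = environment_distances(effects)
print(np.round(D, 3))
```

In this toy example the first two environments have nearly identical marker effects and therefore a small distance, while the third stands apart, which is the kind of pattern that allows outlier environments to be identified.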
When marker effects were used to cluster environments, clear patterns were observed and outlier environments were readily identified (Fig. 13.1). Breeder field notes corroborated the abnormal environments but grouping environments based on similarity of marker effects did not improve the prediction accuracy. In addition, the marker effects can be used to compute the prediction accuracy between pairs of environments to generate a reciprocal prediction accuracy matrix between the environments. When grouping the environments based on the reciprocal prediction accuracies between environments, the prediction accuracy for yield across environments increased significantly. In a second experiment to manage G×E interaction, Heslot et al. (2013a) optimized the composition of the training population by first computing the average predictive accuracy of each environment for predicting the line
Fig. 13.1 Heat map showing the similarity of environments based on Euclidean distances computed using marker effects. Environment comparisons with red shading are more dissimilar and those with blue shading are more similar (Figure 3 from Heslot et al., 2013a).
performance in the other environments in the same dataset. Then, beginning with the least predictive, environments were removed one at a time and the model was retrained on the remaining environments. The environments that were removed were placed in an unpredictive set and the prediction accuracy of that set was calculated (Fig. 13.2). If the prediction accuracy of the predictive set improved, the process was repeated until it no longer increased, and the remaining environments were considered to be the optimal set. In their study, 18 out of the 58 environments were removed and accuracy increased from 0.54 to 0.61.
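The removal procedure described above amounts to a greedy search, which the following Python sketch illustrates. It is a simplified reading of the procedure and not the authors' code. The function optimize_predictive_set, the callback set_accuracy (which stands for retraining the model on a set of environments and returning its cross-validated accuracy), and the toy scores are all hypothetical.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def optimize_predictive_set(
    env_ids: Sequence[str],
    mean_predictive_accuracy: Dict[str, float],
    set_accuracy: Callable[[List[str]], float],
) -> Tuple[List[str], List[str], float]:
    """Greedily drop the least predictive environments while accuracy improves."""
    # Rank environments from least to most predictive of the other environments.
    order = sorted(env_ids, key=lambda env: mean_predictive_accuracy[env])
    predictive = list(env_ids)
    unpredictive: List[str] = []
    best = set_accuracy(predictive)
    for env in order:
        if len(predictive) <= 1:
            break
        trial = [e for e in predictive if e != env]
        accuracy = set_accuracy(trial)       # "retrain" on the remaining set
        if accuracy <= best:                 # stop once accuracy no longer increases
            break
        predictive, best = trial, accuracy
        unpredictive.append(env)
    return predictive, unpredictive, best


# ASSUMPTION: a toy stand-in for model retraining. Each environment has a
# made-up quality score, and small training sets are penalized so that
# removing environments eventually stops paying off.
quality = {"A": 0.7, "B": 0.6, "C": 0.2, "D": 0.1}


def toy_set_accuracy(envs: List[str]) -> float:
    mean_quality = sum(quality[e] for e in envs) / len(envs)
    return mean_quality * (1.0 - 0.5 / len(envs))


predictive, unpredictive, best = optimize_predictive_set(
    list(quality), quality, toy_set_accuracy
)
print(predictive, unpredictive, round(best, 4))
```

In a real application the callback would refit the GS model and cross-validate it at every step, so the cost of the search grows with the number of environments removed.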
13.3.1 Use of Environmental Covariates in Prediction Models
Conceptually, it would seem desirable to include environment descriptors in the prediction models to increase the accuracy and to characterize G×E interaction. However, when analyzing large numbers of markers and ECs, methods are required to limit the computational burden. One solution
Fig. 13.2 Optimization of the training population. The blue dots are cross-validated accuracies for the selected training population (predictive set) and red triangles are prediction accuracies for the environments removed from the training population (unpredictive set). Green squares are the prediction accuracies for a validation set observed in 2011 (Figure 5 from Heslot et al., 2013a). Plot axes: number of environments excluded from the predictive set (x, 0 to 60) and accuracy (y, 0.0 to 0.8); legend: predictive set, unpredictive set, prediction.
is to select a few of the most important ECs and a reduced number of markers. However, this can result in the loss of important information. Jarquín et al. (2017) modeled interactions between markers and ECs by describing genetic and environmental gradients as linear functions of the markers and ECs which they referred to as a reaction norm. Their dataset consisted of grain yield for 139 wheat lines grown in 340 environments. The lines were genotyped with 2395 single nucleotide polymorphisms (SNPs) and they compared a model with only main effects with one that included 68 ECs. Prediction accuracy was low when trying to predict new lines based on main effects only (0.175) but increased 35% (0.236) when ECs were included in the model. For lines that had been tested in some environments, including ECs in the model increased the accuracy by 17% from 0.439 to 0.514.
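One way to see how a reaction norm model of this kind avoids fitting every marker by EC combination explicitly is through the covariance structures involved. In the sketch below, which is an illustration under stated assumptions and not the authors' implementation, a marker-derived relationship matrix among lines and an EC-derived similarity matrix among environments are each expanded to the records and then multiplied element by element (a Hadamard product) to give the covariance of the interaction term. The simulated data, dimensions, and the helper name relationship are all hypothetical.

```python
import numpy as np

# ASSUMPTION: tiny simulated example; real data sets are far larger.
rng = np.random.default_rng(3)
n_lines, n_envs, n_markers, n_ecs = 4, 3, 50, 6
X = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))   # marker scores
W = rng.normal(size=(n_envs, n_ecs))                      # environmental covariates


def relationship(M):
    """Cross-product similarity from column-standardized covariates."""
    centered = M - M.mean(axis=0)
    sd = centered.std(axis=0)
    keep = sd > 0                     # drop columns that do not vary
    standardized = centered[:, keep] / sd[keep]
    return standardized @ standardized.T / standardized.shape[1]


G = relationship(X)        # similarity among lines, from markers
Omega = relationship(W)    # similarity among environments, from ECs

# One record per line-by-environment combination.
line_idx = np.repeat(np.arange(n_lines), n_envs)
env_idx = np.tile(np.arange(n_envs), n_lines)
Z_g = np.eye(n_lines)[line_idx]       # incidence matrix: records to lines
Z_e = np.eye(n_envs)[env_idx]         # incidence matrix: records to environments

K_g = Z_g @ G @ Z_g.T                 # main effect of markers
K_e = Z_e @ Omega @ Z_e.T             # main effect of ECs
K_gxe = K_g * K_e                     # interaction: element-wise product
print(K_gxe.shape)
```

The three matrices can then enter a mixed model as covariance structures for three random effects, so the number of parameters grows with the number of records and not with the number of marker by EC products.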
In a different approach, Heslot et al. (2014) used a crop model to integrate ECs into the GS framework to predict G×E, to increase the prediction accuracy, and to better understand the genetic architecture of G×E. Their dataset consisted of 2437 elite inbred lines genotyped with 1287 SNP markers and grown in 44 environments over 6 years in France. They used a crop model known as SiriusQuality (Martre et al., 2006) to compute the phenology and synchronize early, medium, and late maturing genotypes with the weather data. Environmental stress covariates (climatic variables at a specific developmental stage) were derived by using prior knowledge from the literature about the sensitivity of specific growth stages to abiotic stresses. Environmental stress covariates were used as independent variables in statistical genetic models for effect estimation and prediction. By extending the factorial regression model to the GS context, each marker was fit as a main effect and as a sensitivity to each of the stress covariates. The interactions between markers and stress covariates were captured using a machine learning algorithm, and genotype performance was predicted as a main effect plus a G×E deviation. The large number of markers and ECs presented computational issues, so they evaluated the variance of marker effects across environments and eliminated the markers with little or no variation. The photoperiod sensitivity gene Ppd-D1 had the highest variance but did not capture a significant part of the G×E variance. Models were evaluated using cross-validation and the best model consisted of 250 markers in combination with the nonlinear soft rule fit algorithm. The most important EC was the sum of the average daily temperature between meiosis and flowering.
Also important was drought in the early spring measured by “total number of dry days to 350 degree days” and the “sum of precipitation and evapotranspiration potential.” The factor analytic model can predict a G×E response for any genotype in any environment, even if an environment had no phenotypic data for that genotype. This allows the calculation of Euclidean distances for all environments by using the predicted level of genetic correlation between the environments. The G×E interactions can be clustered to show the structure of the target population of environments (TPE). By including the G×E interaction component, prediction accuracy for genotype performance in unobserved environments increased by 11.1% on average and the variability in prediction accuracy decreased by 10.8%. In contrast to the approaches that used covariances among environments to improve prediction accuracy, the use of selected ECs made it possible to predict performance in unobserved environments rather than taking a retrospective view of G×E interaction. This approach also contributes
information about the plant response to the environment so that the breeder can leverage agronomy and physiology knowledge, reduce dimensionality and nonlinearity, use existing breeding data, and interpret results to identify specific environmental stresses.
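The factorial-regression idea described above, in which each marker contributes a main effect plus a sensitivity to each stress covariate, can be sketched in a few lines of Python. This is only an illustration: the function name, argument names, and all numbers are hypothetical, and the purely linear form is an assumption. Heslot et al. (2014) captured the marker by covariate interactions with a machine learning algorithm rather than with this simple sum.

```python
from typing import Sequence


def predict_performance(
    marker_scores: Sequence[float],
    stress_covariates: Sequence[float],
    main_effects: Sequence[float],
    sensitivities: Sequence[Sequence[float]],
    intercept: float = 0.0,
) -> float:
    """Predict one genotype in one environment as main effect + GxE deviation.

    sensitivities[k][c] is the (hypothetical) change in the effect of
    marker k per unit of stress covariate c.
    """
    # Genotype main effect: sum of marker score times marker main effect.
    main = sum(x * b for x, b in zip(marker_scores, main_effects))
    # GxE deviation: each marker's sensitivity weighted by the environment's
    # stress covariates, then multiplied by the marker score.
    gxe = sum(
        x * sum(g * z for g, z in zip(row, stress_covariates))
        for x, row in zip(marker_scores, sensitivities)
    )
    return intercept + main + gxe


# Hypothetical example: two markers, two stress covariates.
prediction = predict_performance(
    marker_scores=[1, -1],
    stress_covariates=[2.0, 0.5],
    main_effects=[0.3, 0.1],
    sensitivities=[[0.1, 0.0], [0.0, -0.2]],
    intercept=5.0,
)
print(round(prediction, 3))
```

Because the environment enters only through its covariates, the same function can be evaluated for an environment with no phenotypic records, which is the property the text highlights.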
13.3.2 Accounting for and Exploiting Epistasis Genomic prediction models can be applied either to segregating populations or to advanced inbred lines. In the case of segregating populations, we are concerned with the estimation of breeding value and appropriate linear models (Fig. 13.3). However, when predicting genotypic value of pure lines, models that can account for gene interactions may provide higher prediction accuracy. In many of the studies cited above, multiple models were compared, and linear models often performed as well as nonlinear models. However, some of the reports, especially those using CIMMYT wheat datasets, have found that nonlinear models such as RKHS tend to increase prediction accuracy for complex traits such as grain
Fig. 13.3 Integration of GS in a pureline breeding program. In the rapid cycling phase, GS is used to enhance gain per unit time. In the inbreeding phase, MAS and PS can be imposed until the F4 or F5 generation and then whole-genome genotyping is used to select individuals that enter the training population or are recycled in the crossing program. Each phase is conducted simultaneously and the GS models are updated annually. [Figure title: Integrating genomic selection in a breeding program. Panels: (1) Recurrent selection stage, with GS models emphasizing additive genetic variation: winter and summer crossing, S0 selfing, a GH crossing and selfing cycle, and spring and fall plant and harvest; planting of GS-selected individuals, intermating, self-pollination, and S1 seed harvest occur twice a year. (2) Inbreeding and evaluation stage: F2 GH/field advance by SSD/bulk with MAS; F3 field advance by bulk or row phenotyping with MAS; F4 field advance by bulk or plot phenotyping with MAS; F4s can be phenotyped and F5 spikes selected for uniformity and for GBS genotyping. (3) Training population, with GS models capturing nonadditive genetic variation: F4:5 lines enter the master training nursery, are phenotyped, selected for uniform spikes, genotyped, and selected by GS + PS. (4) Advanced regional testing using GEBV + phenotype: each year selected lines are entered in the regional trials and/or recycled in the crossing program.]
yield (e.g., Cuevas et al., 2016a; Crossa et al., 2010; Pérez-Rodríguez et al., 2012). Theoretically, epistatic effects can be estimated for a large number of markers, but in practice the large number of epistatic interactions makes this computationally difficult. Models such as RKHS regression and radial basis function neural networks (RBFNN) (González-Camacho et al., 2012) can indirectly map markers into a high-dimensional space that can capture nonadditive genetic effects. Pérez-Rodríguez et al. (2012) compared the prediction accuracies of four linear GS models with three nonlinear models including RKHS regression, Bayesian regularized neural networks (BRNN), and RBFNN for days to heading and grain yield of 306 CIMMYT wheat lines. For days to heading, either RKHS or RBFNN had the highest accuracy for all 12 environments. Prediction accuracy for grain yield was highest for RKHS and RBFNN in six of the seven environments. Crossa et al. (2014) combined pedigree-derived additive and epistatic additive × additive relationship information in a single model to model G×E interaction. Including the pedigree additive × additive relationships in the models increased the prediction accuracy in three of the four environments, indicating that modeling additive-by-additive epistasis with G×E interaction is important for increasing the accuracy of predictions in wheat breeding populations.
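The two devices mentioned above, a kernel that maps marker profiles into a higher-dimensional space and an additive × additive relationship matrix, can be illustrated with a minimal Python sketch. The Gaussian kernel scaling (squared distance divided by the number of markers) and the bandwidth value are assumptions chosen for illustration, not the settings of the cited studies. The additive × additive matrix is formed here as the element-wise (Hadamard) product of an additive relationship matrix with itself, which is a standard construction; the small matrices are hypothetical.

```python
import math


def gaussian_kernel(markers, bandwidth=1.0):
    """Gaussian (RKHS-style) kernel: K[i][j] = exp(-bandwidth * d2 / p).

    d2 is the squared Euclidean distance between the marker profiles of
    lines i and j, and p is the number of markers (an assumed scaling).
    """
    p = len(markers[0])

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [
        [math.exp(-bandwidth * squared_distance(a, b) / p) for b in markers]
        for a in markers
    ]


def hadamard(a, b):
    """Element-wise (Hadamard) product of two equally sized matrices."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


# Hypothetical marker scores (-1/1 coding) for three lines and four markers.
markers = [[1, 1, -1, 1], [1, 1, -1, 1], [-1, 1, 1, -1]]
K = gaussian_kernel(markers)

# Hypothetical additive relationship matrix for two lines, and the
# additive x additive epistatic relationship matrix derived from it.
A = [[1.0, 0.5], [0.5, 1.0]]
AA = hadamard(A, A)
```

Either matrix can then serve as the covariance structure of a random genetic effect in a mixed model, in place of or alongside the additive relationship matrix.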
13.4 GENOMIC SELECTION FOR WHEAT DISEASE RESISTANCE Disease resistance in crops can be broadly classified into qualitative and quantitative, or nondurable and durable, resistance. Qualitative resistance is conditioned by single large effect genes, often termed R genes, that provide strong gene-for-gene resistance against specific races of a given pathogen species. Rapid pathogen evolution can overcome such resistance, particularly if the resistance genes are widely deployed in cultivated varieties. Quantitative resistance is conferred by many small-effect genes and provides more durable resistance because the pathogens cannot overcome multiple modes of resistance quickly. However, traditional qualitative disease resistance breeding strategies like pyramiding and MAS are not efficient or effective in breeding for quantitatively inherited disease resistance. GS for disease resistance could overcome these shortcomings and lead to more efficient breeding for disease resistance. Breeding for disease resistance and the applications of GS for disease resistance in crops are comprehensively reviewed in Poland and Rutkoski (2016).
13.4.1 Wheat Rusts Wheat rusts are typically caused by one of three main pathogens: Puccinia graminis (stem rust), Puccinia striiformis (stripe/yellow rust), or Puccinia triticina (leaf/brown rust). These obligate biotrophs are global in distribution and historically have been the most serious wheat pathogens (Poland and Rutkoski, 2016). Wheat rust resistance can be qualitative or quantitative. Seedling resistance is qualitative, whereas quantitative resistance expressed at the adult plant stage is termed adult plant resistance (APR) (Poland and Rutkoski, 2016). Durable rust resistance is needed for wheat cultivars worldwide as quickly evolving rust pathogens often overcome R gene resistance. Ornella et al. (2012) were the first to investigate GS for wheat stem and yellow rust. They found moderate to high prediction accuracies by cross-validating within biparental populations and moderate prediction accuracies between populations when they were related. Predicting an environment from a correlated environment was found to be useful as well. Daetwyler et al. (2014) used a diverse group of wheat landrace accessions to train GS models for all three rusts. Adding diagnostic PCR markers for the R genes Lr34, Sr57, and Yr18 improved prediction accuracy, but a linked marker for Sr2 did not improve stem rust prediction accuracy because Sr2 is a recent introgression not present in older landraces. Rutkoski et al. (2014) followed a similar strategy in GS for stem rust APR in spring wheat. They ran genome-wide association studies (GWAS) and fit significant markers as fixed effects in GS models. Markers linked to Sr2 were significant in their germplasm and improved prediction accuracies, but seedling disease scores did not fit as fixed effects. To date, only Rutkoski et al. (2015a) have published an actual gain-from-selection experiment using GS for disease resistance in wheat. 
They compared two cycles of GS to one cycle of phenotypic selection (PS) and examined inbreeding, genetic variance, and correlated response to selection. GS performed as well as PS per unit time with equal selection intensity but decreased genetic variance and increased inbreeding. Muleta et al. (2017) found that the prediction accuracies for stripe rust did not change appreciably in a diverse set of accessions from the USDA National Small Grains Collection when a subset of markers was used for prediction models instead of the whole marker set. They also found that combining genetically distant clusters in cross-validation led to higher prediction accuracy than cross-validation within clusters. Juliana et al. (2017a) evaluated least squares (LS), G-BLUP, G-BLUP with significant fixed marker effects, and three RKHS models for seedling leaf and stripe rust resistance and leaf, stripe, and stem rust APR using GBS SNPs.
Overall, G-BLUP and RKHS models performed the best but LS models performed as well or better when R genes were present in the training population germplasm.
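Several of the studies above fit markers for major genes as fixed effects alongside genome-wide markers. A minimal sketch of that idea, written in the ridge-regression (RR-BLUP) form, is shown below: the major-gene covariates are left unpenalized while the genome-wide marker effects are shrunk by a penalty. The function names, the penalty value, and the toy data are all hypothetical, the small linear solver is included only to keep the example self-contained, and none of this is presented as the code used in the cited papers.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_fixed_plus_ridge(fixed, markers, phenotypes, penalty):
    """Fit unpenalized fixed covariates and ridge-penalized marker effects.

    fixed:   n x q rows (for example an intercept and a major-gene marker).
    markers: n x p rows of genome-wide marker scores.
    penalty: ridge parameter, playing the role of the ratio of residual
             variance to marker-effect variance.
    """
    design = [list(f) + list(z) for f, z in zip(fixed, markers)]
    q, total = len(fixed[0]), len(design[0])
    lhs = [
        [sum(row[i] * row[j] for row in design) for j in range(total)]
        for i in range(total)
    ]
    for k in range(q, total):  # penalize only the genome-wide markers
        lhs[k][k] += penalty
    rhs = [sum(row[i] * y for row, y in zip(design, phenotypes)) for i in range(total)]
    solution = solve(lhs, rhs)
    return solution[:q], solution[q:]


# Toy data: phenotype is exactly 2 + 3 * (major-gene allele), so the
# polygenic marker effects should shrink to zero.
fixed = [[1, 0], [1, 1], [1, 0], [1, 1], [1, 1]]
markers = [[1, -1, 1], [-1, 1, 1], [1, 1, -1], [-1, -1, 1], [1, 1, 1]]
phenotypes = [2.0, 5.0, 2.0, 5.0, 5.0]
fixed_effects, marker_effects = fit_fixed_plus_ridge(fixed, markers, phenotypes, penalty=1.0)
```

In this construction a large-effect locus is estimated without shrinkage, which is the motivation usually given for fitting R genes or major QTL as fixed effects.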
13.4.2 Fusarium Head Blight Fusarium head blight (FHB) is a devastating fungal disease caused by Fusarium graminearum in humid small grain production areas worldwide. FHB disease symptoms include accumulation of the potent mycotoxin deoxynivalenol (DON) in the wheat grain. DON is a particularly difficult phenotype for selection because DON levels are frequently poorly correlated with visible disease symptoms. FHB resistance is largely quantitative, with a single large effect QTL, Fhb-1, widely introgressed into breeding material in North America (Jin et al., 2013). Rutkoski et al. (2012) published the first study of GS for FHB resistance in wheat. They found that GS models, especially random forest (RF) and RKHS, gave higher prediction accuracies for incidence and severity-related traits than multiple linear regression (MLR) models for MAS. Addition of QTL as fixed effects did not improve model performance. However, for DON, MLR performed as well as GS models and using QTL markers alone performed better than genome-wide markers. Mirdita et al. (2015) used a large set of 2325 breeding lines and varieties in 11 environments over 2 years as a training population for FHB severity. The authors tested two epistatic models, RKHS and extended G-BLUP, in addition to RR-BLUP and Bayes-Cπ. They found the best prediction accuracies with epistatic models, with a mean prediction accuracy of 0.6 with RKHS. Arruda et al. (2016) also compared MAS to GS for FHB and specifically looked at the additions of the large effect Fhb-1 locus, independent QTL, and QTL identified in the training population as fixed effects. GS models consistently outperformed the MAS models and the addition of significant QTL in the training population as fixed effects substantially increased the prediction accuracy. 
The authors cautioned that this is likely due to what they term “inside trading,” that is, estimating QTL effects in the same population that is used for the training population and including those effects as fixed inflates the prediction accuracy. Jiang et al. (2017) compared independent validation of MAS and GS models with cross-validation using a large set of European wheat breeding lines phenotyped for FHB resistance. When sampling across genotypes and environments, accuracies of cross-validated GS models were not inflated compared to independently validated models; however, accuracies of MAS models were inflated under cross-validation. The authors also looked at prediction accuracies of individual genotypes using the reliability
criterion, an analysis commonly used in animal breeding but infrequently utilized in plant breeding. Prediction accuracies of individual genotypes using the reliability criterion were four to sixfold higher for individuals with high reliability than with low reliability.
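The “inside trading” caution raised above by Arruda et al. (2016) comes down to which lines are allowed to inform marker selection. The Python sketch below, with hypothetical function names and toy data, ranks markers by their absolute correlation with the phenotype using only the lines passed in `rows`. Calling it with the training lines of a cross-validation fold keeps the validation lines out of QTL discovery, whereas calling it with all lines is the practice that can inflate accuracy. Correlation ranking is used here only as a simple stand-in for a QTL or association scan.

```python
import math


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def top_markers(markers, phenotypes, rows, n_top=1):
    """Indices of the n_top markers most correlated with the phenotype,
    computed from the lines listed in `rows` only."""
    y = [phenotypes[i] for i in rows]
    scores = []
    for k in range(len(markers[0])):
        x = [markers[i][k] for i in rows]
        scores.append((abs(pearson(x, y)), k))
    scores.sort(reverse=True)
    return [k for _, k in scores[:n_top]]


# Toy data: six lines (rows) by two markers (columns).
markers = [[1, 0], [2, 0], [3, 0], [4, 0], [0, 10], [0, -10]]
phenotypes = [1, 2, 3, 4, 10, -10]

training_rows = [0, 1, 2, 3]  # lines 4 and 5 are held out for validation
honest_choice = top_markers(markers, phenotypes, training_rows)
leaky_choice = top_markers(markers, phenotypes, list(range(6)))
```

In this toy case the two calls pick different markers, because the second marker only looks important once the held-out lines are allowed to contribute.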
13.4.3 Other Wheat Diseases Juliana et al. (2017b) used two CIMMYT bread wheat screening nurseries to build GS models for three major necrotrophic foliar diseases of wheat: Septoria tritici blotch (STB), Stagonospora nodorum blotch (SNB), and tan spot (TS). The LS, G-BLUP, four Bayesian, and three RKHS models were tested for pedigree and GS using GBS SNP and DArT markers. Prediction accuracies were moderate to high for all traits and for all models. The LS models consistently performed the worst and RKHS models incorporating markers and pedigree data tended to perform the best. Mirdita et al. (2015) used the same strategy for STB prediction as for FHB. They found the best prediction accuracies with epistatic models, with a mean prediction accuracy of 0.5 for STB with RKHS, which is comparable to the results of Juliana et al. (2017b). GS is a promising tool for wheat breeders to improve quantitative disease resistance in wheat. Prediction accuracies for rust resistance at the seedling and adult stage generally were high to moderate and tended to improve when R genes were included as fixed effects. Prediction accuracies for FHB traits in all studies tended to be moderate to high, with GS models consistently outperforming MAS models, a notable exception being DON in Rutkoski et al. (2012). Several studies showed that including significant markers for R genes or large effect QTL can increase prediction accuracy for disease resistance traits (Arruda et al., 2016; Daetwyler et al., 2014; Rutkoski et al., 2014). Association mapping is a useful way to screen large effect markers in diverse populations and is a worthwhile analysis to run when building GS models for disease resistance. The impact of adding such markers as fixed effects to GS models is highly dependent on population structure and the frequency of the causative locus in the training population. However, as Arruda et al. 
(2016) caution, inclusion of QTL from the training population as fixed effects may unrealistically inflate prediction accuracies. Inclusion of diagnostic markers as fixed effects for large effect QTL like Fhb-1 or R genes like Sr2 known to be present in the training and validation populations may be beneficial depending on the breeding goals and germplasm. Models capturing epistatic effects appear to be useful for some diseases.
13.5 GENOMIC SELECTION IN WHEAT FOR NUTRITIONAL TRAITS Wheat is one of the most widely produced and consumed cereal grains in the world, serving as a primary source of calories for millions of people worldwide. Access to sufficient calories, however, is not adequate for total nutrition. Micronutrient deficiencies are widespread in developed and developing nations alike and are a strong contributor to malnutrition. Breeding for increased vitamin and mineral content, or biofortification, of staple crops such as wheat is a potential avenue for alleviating micronutrient deficiencies and malnutrition. Iron (Fe) and zinc (Zn) deficiencies, respectively, affect over 800 million and 1 billion people worldwide and are particularly harmful for women and children (HarvestPlus, 2017). Recent biofortification efforts such as HarvestPlus, a Consortium of International Agriculture Research Centers (CGIAR) program, have resulted in the release of high Zn wheat varieties in South Asia (HarvestPlus, 2017). Genetic control of concentrations of many grain micronutrients appears to be quantitative, making these traits potential targets for GS (Trethowan et al., 2005; Velu et al., 2014). Micronutrient traits also require specialized phenotyping equipment, restricting breeders with minimal resources and budgets. Genomic prediction models for wheat grain Fe and Zn concentrations were examined by Velu et al. (2016) using the CIMMYT HarvestPlus Association Mapping panel, a population containing landrace progenies and various synthetic-derived progenies. The authors found low to moderate prediction accuracies for grain Fe and Zn concentrations measured across two growing seasons in multiple locations. The best prediction accuracies were obtained using G-BLUP models with both genetic and pedigree relationship matrices and inclusion of both genotype by environment (G×E) and pedigree by environment kernels. 
Several studies have shown that incorporating G×E interactions from multi-environment trials improves prediction accuracy across environments compared to not accounting for such interactions (Burgueño et al., 2012; Lopez-Cruz et al., 2015). Wheat Fe and Zn concentrations are heavily influenced by native soil Fe and Zn contents and tend to show large G×E interactions (Ortiz-Monasterio et al., 2007). Manickavelu et al. (2017) analyzed a set of Afghan wheat landraces grown in Japan and Afghanistan for grain potassium, phosphorus, magnesium, iron, zinc, and manganese contents. The authors ran GWAS for each trait and discovered only a single significant association for Zn, further suggesting the quantitative inheritance of wheat grain nutrient traits. Multiple models for each grain nutrient trait were fit in each country and accuracies were found
to be moderate to high, with macronutrient (P, K, Mg) accuracies higher than micronutrient (Fe, Zn, Mn) accuracies. Prediction accuracies in Japan were higher than those in Afghanistan, which was attributed to large environmental variance and soil conditions in Afghanistan. The G×E interactions were not explicitly modeled and multi-trait models were not evaluated despite significant genetic correlations between macronutrients.
13.6 GENOMIC SELECTION FOR WHEAT QUALITY TRAITS The value of wheat lies not only in yield per se, but also in its quality. Wheat quality is determined by a suite of qualitative and quantitative traits. Milling quality and market class of wheat are largely determined by hardness, an oligogenic trait, and protein content, a quantitative trait (Edwards et al., 2010; Battenfield et al., 2016). Hard wheats have a stronger starch-protein attachment than soft wheats. The additional energy needed to break these attachments in milling creates higher proportions of damaged starch granules, which aid in water absorption and produce better baking performance for leavened products (Battenfield et al., 2016). Soft wheats have less damaged starch and are preferred for cookies, cakes, and pastries. Hard wheats tend to have high protein and soft wheats tend to have low protein. Kernel weight, flour yield, flour protein, flour color, starch damage, and protein quality are additional wheat milling quality traits of importance to breeders and end users. The primary storage proteins in wheat, glutenins and gliadins, give wheat its unique viscoelastic properties in baking (Battenfield et al., 2016). Dough rheology measurements are taken by mixing flour with water to determine dough elasticity, strength, tolerance, and optimal mixing time conferred by glutenin and gliadin contents (Battenfield et al., 2016). Farinographs, mixographs, and alveographs measure different dough rheology components but are time, flour, and cost intensive. Baking quality traits such as loaf volume and loaf texture are valuable as well and also require significant resource inputs that may not be available or feasible in early generation breeding trials.
13.6.1 Milling and Flour Quality Heffner et al. (2011a,b) were the first to develop prediction models for milling and flour quality traits. In both studies, phenotypic prediction was more accurate than GS models for all end-use quality traits. Heffner et al. (2011a) compared RR-BLUP and Bayes-Cπ models and found that RR-BLUP performed better in a classic QTL mapping biparental population
while Bayes-Cπ performed better in an elite by elite population. In general, prediction accuracies for quality traits from multifamily predictions (Heffner et al., 2011b) tended to be higher than the accuracies from the biparental predictions (Heffner et al., 2011a). This is seemingly in contradiction with the general paradigm that prediction accuracies are maximized within biparental populations (Crossa et al., 2014; Lehermeier et al., 2014). However, the training population size in Heffner et al. (2011a) was only 96 individuals in each family and ~450 markers were used whereas in the multifamily study Heffner et al. (2011b) used a training population size of 288 and 1158 markers. Hoffstetter et al. (2016) measured flour yield and flour softness in an F4-derived breeding population of soft winter wheat. The authors tested subsetting the training population and marker sets to improve prediction accuracy. Removing lines with high G×E interaction did not change prediction accuracies within environment for quality traits but subsetting markers to only include markers with significant associations to the trait substantially improved prediction accuracy for flour yield and flour softness compared to using the complete marker set. Other studies testing reduced marker set prediction accuracy typically have found minimal difference between whole and subset marker sets. Several GS studies that have been recently published make use of large diverse training populations and comprehensive HTP of wheat quality traits. Battenfield et al. (2016) utilized a large CIMMYT spring wheat multiyear breeding population data set with full processing and end-use quality phenotypes to test multiple GS models for the accuracy of forward prediction for the next year. Forward prediction accuracies from 2011 to 2015 increased as more years were added to the models but cross-validation with all years outperformed all forward prediction models in all traits. 
The authors postulated that cross-validation prediction accuracies are inflated due to not accounting for G×E interactions and leveraging information from relatives across years that is not available in forward prediction. Increase in response to selection from GS was greater than 100% for flour yield, grain protein, flour protein, and several dough rheology traits (Battenfield et al., 2016). The authors saw GS for end-use quality as a complement, not a replacement, for phenotypic selection in the CIMMYT breeding program, but expected that gain from selection could ultimately be much higher considering that 10,000 lines can be genotyped at the same cost as phenotyping 1000 lines for end-use quality. Hayes et al. (2017) integrated near-infrared (NIR) and nuclear magnetic resonance (NMR) HTP into multi-trait GS models using a large training population of public and private breeding
lines and synthetic derivatives. In all, 44 quality traits were phenotyped in the training set over 2 years, including a set of noodle quality traits; the validation set was composed of training set subsets and Australian National Variety Trials (NVT) over 3 years. Including NIR and NMR phenotypes as correlated traits increased the prediction accuracy for grain, milling, and baking traits but noticeable improvements were not observed for dough rheology traits. Prediction accuracies were largely above 0.5, likely sufficient for effective early generation end-use quality selection (Hayes et al., 2017). Such accuracies in the NVT validation sets, which contain elite breeding lines, showed the robustness of the diverse training set to sufficiently predict elite breeding material.
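The forward prediction scheme described above for Battenfield et al. (2016), in which models trained on earlier years predict the next year and more years are added over time, can be sketched as an expanding-window split. The function name and the example years are hypothetical, and the sketch shows only how training and target years are paired, not how the models themselves were fit.

```python
def forward_splits(years):
    """Yield (training_years, target_year) pairs with an expanding window.

    Each target year is predicted only from years that precede it, unlike
    random cross-validation, which mixes lines from all years.
    """
    ordered = sorted(set(years))
    for i in range(1, len(ordered)):
        yield ordered[:i], ordered[i]


# Hypothetical trial years.
for training_years, target_year in forward_splits([2011, 2012, 2013, 2014]):
    print(training_years, "->", target_year)
```

Keeping later years out of the training set is what distinguishes this scheme from cross-validation across all years, which the authors suspected of giving inflated accuracies.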
13.6.2 Preharvest Sprouting Preharvest sprouting (PHS) occurs when wheat is exposed to excessive moisture in the field before harvest and germination processes are initiated. PHS damage is only visible to the naked eye in the most serious circumstances when rootlets and coleoptiles form on the spike. PHS damage can be measured directly through germination tests of loose grain and mist tests of individual spikes, or indirectly by falling number, a measurement of alpha-amylase activity through dough rheological properties. GS for PHS resistance in wheat was first tested by Heffner et al. (2011a,b). In the first study, a single biparental population of 96 individuals segregating for PHS resistance was used to train prediction models. The PS accuracy was found to be significantly higher than accuracy for RR-BLUP or Bayes-Cπ, but both GS models performed much better than MAS. The authors cite the high heritability of PHS, small training population size, low-density markers, polygenic nature of PHS, and low G×E as reasons for the superior performance of PS in the study. The second study used a multifamily training population of soft winter wheat breeding lines and evaluated several genomic prediction models as well as association analysis models to test the prediction accuracies and net merit from selection indices. Prediction accuracies for PHS using GS models outperformed association analysis models but not PS accuracy. However, in both studies, the authors conclude that the reduced cycle time and cost of GS for PHS and other quality traits would be superior to PS. Moore et al. (2017) used a panel of 1118 hard red and white winter wheat breeding lines as a training population with GBS SNP markers and PHS resistance phenotypes from germination tests. White wheats tend to be more susceptible to PHS than red wheats but no significant improvement in prediction accuracy was observed by adding kernel color as a fixed effect. However, they did find that adding five significant markers from GWAS improved prediction accuracy, supporting earlier findings from Bernardo (2014), Daetwyler et al. (2014), and Arruda et al. (2016).
13.7 FUTURE PROSPECTS Long-term GS experiments are still lacking in wheat and empirical results of response to GS in wheat are limited to the study by Rutkoski et al. (2015a). Realized gains from selection using GS are expected to be greater than those from phenotypic selection if selections based on GEBVs can substantially shorten the breeding cycle time. Heffner et al. (2010) simulated a public winter wheat breeding program and estimated long-term gain from selection for GS and MAS. Low to moderate GS prediction accuracies were sufficient to outperform MAS. Even in high heritability scenarios, GS was predicted to result in 1.5-fold greater genetic gain than MAS. Full implementation of GS into a breeding program would require significant reallocation of resources and a reordering of breeding objectives, but costs are expected to be equal to or less than MAS as phenotyping becomes less prevalent and occurs mostly to update prediction models. Resources would be shifted to genotyping and crossing costs instead of quality assays. Instead of phenotyping quality traits on a small number of lines in early or advanced yield trials, genomic prediction models could be used to discard a large percentage of early generation progeny, even F2 lines, with subsequent selection at harvest based on genomic and NIR/NMR predictions (Hayes et al., 2017). The immediate benefits of reduced phenotyping costs are more apparent for quality traits that involve specialized machinery, expensive reagents, and large quantities of flour than for disease resistance that can be scored at the seedling stage. However, Poland and Rutkoski (2016) point out that well trained GS models enable selection for yield at any stage; this allows for yield-based selections in very large early generation disease screens. They also highlighted the ability to simultaneously select for qualitative and quantitative disease resistance, which is typically hindered by R genes masking smaller effect QTL. 
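The argument that a shorter cycle can offset lower accuracy follows from the standard expression for expected gain per unit time: selection intensity times selection accuracy times additive genetic standard deviation, divided by cycle length. The Python sketch below evaluates that expression under the usual assumption of truncation selection on a normally distributed criterion. The accuracies, cycle lengths, and selected proportions are hypothetical and are not taken from the studies cited above.

```python
from statistics import NormalDist


def selection_intensity(proportion_selected):
    """Standardized selection differential for truncation selection on a
    normally distributed criterion: i = pdf(z) / p, with z the truncation
    point that leaves a fraction p above it."""
    standard_normal = NormalDist()
    z = standard_normal.inv_cdf(1.0 - proportion_selected)
    return standard_normal.pdf(z) / proportion_selected


def gain_per_year(proportion_selected, accuracy, sigma_a, cycle_years):
    """Expected genetic gain per year: i * r * sigma_A / L."""
    intensity = selection_intensity(proportion_selected)
    return intensity * accuracy * sigma_a / cycle_years


# Hypothetical comparison: GS with lower accuracy but a 1-year cycle versus
# phenotypic selection with higher accuracy and a 4-year cycle.
gs_gain = gain_per_year(0.10, accuracy=0.5, sigma_a=1.0, cycle_years=1.0)
ps_gain = gain_per_year(0.10, accuracy=0.8, sigma_a=1.0, cycle_years=4.0)
print(round(gs_gain / ps_gain, 2))
```

The same function also shows the effect of screening more candidates for a fixed number of selections: keeping the best 1% rather than the best 10% raises the intensity term while leaving the other terms unchanged.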
Genomic prediction models could allow early generation breeding material instead of advanced material to be evaluated for micronutrient content and/or end-use quality. Such models could improve micronutrient breeding efforts in developing nations where consistent access to micronutrient phenotyping equipment is rare and biofortified varieties are most needed. GS for biofortification is additionally attractive because
culturally important local crops could be efficiently improved without introducing new species or varieties that may not be readily accepted. GS models and breeding schemes accounting for nutritional quality G×E interaction in wheat are attractive because of the large effect of soil on grain micronutrient content. Early selection of quality traits could enable breeders to target specific consumer niches like craft baking, export markets, and health products with more precision, which in turn could increase the value of wheat products for growers and processors. Genomic prediction models could be used to predict quality performance in yield trials across diverse environments without phenotyping (Burgueño et al., 2012; Lopez-Cruz et al., 2015). Breeders may be able to breed for several disparate markets at the same time with relative ease by creating differentially weighted selection indices for their prediction models. In a world where improved basic cereal grain nutrition and tailor-made cereal grain products are paradoxically needed, GS for nutritional and end-use quality traits in wheat offers exciting possibilities for breeders, farmers, processors, and consumers.
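The idea of differentially weighted selection indices mentioned above can be illustrated with a short Python sketch: the same set of predicted values is ranked under two different weight vectors, one per target market. The line names, trait names, GEBVs, and weights are all hypothetical, and the GEBVs are assumed to be already standardized so that weights are comparable across traits.

```python
def rank_by_index(gebvs, weights):
    """Rank lines by a linear index: sum over traits of weight * GEBV."""

    def index_value(line):
        return sum(weights[trait] * gebvs[line][trait] for trait in weights)

    return sorted(gebvs, key=index_value, reverse=True)


# Hypothetical standardized GEBVs for two lines and two traits.
gebvs = {
    "line_A": {"protein": 2.0, "flour_yield": 0.0},
    "line_B": {"protein": 0.0, "flour_yield": 2.0},
}

# Hypothetical weights for two markets: one favoring high protein, the
# other favoring low protein and high flour yield.
bread_weights = {"protein": 0.8, "flour_yield": 0.2}
pastry_weights = {"protein": -0.5, "flour_yield": 0.5}

bread_ranking = rank_by_index(gebvs, bread_weights)
pastry_ranking = rank_by_index(gebvs, pastry_weights)
```

Because only the weights change, one set of genomic predictions can serve several markets without additional phenotyping.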
13.8 CONCLUSION Genomic prediction models have been trained for a large number of economically important traits in winter and spring wheat, including the ultimately most important trait, yield. Genomic prediction has also been shown to be an effective approach to dissect complicated epistatic and genotype by environment interactions in wheat. In the face of increasingly unpredictable weather and annual trends, breeding climate resilient wheat varieties will depend on proper exploitation of these two phenomena. GS provides the opportunity to select for any quantitative trait of interest at any stage of the breeding cycle, giving more power and freedom to today’s breeders. The dawn of the genomics-enabled breeding age is still upon the plant breeding world. Access to, and funding for, consistent genotyping is a requirement of applied GS, yet remains a challenge for public breeding programs globally. The promise of GS has yet to be fully realized in the developing world. Long-term effects of GS on variance and inbreeding in wheat have been theorized but presently have little empirical evidence. Applications of HTP to wheat genomics are actively being pursued but remain in their infancy. There is high potential for many unexplored applications of GS in wheat breeding. GS is not a breeding panacea, but is an invaluable new tool for wheat breeders in every season, every market, and every set of breeding goals.
REFERENCES Araus, J.L., Cairns, J.E., 2014. Field high-throughput phenotyping: the new crop breeding frontier. Trends Plant Sci. 19, 52–61. Arruda, M.P., Lipka, A.E., Brown, P.J., Krill, A.M., Thurber, C., Brown-Guedira, G., Dong, Y., Foresman, B.J., Kolb, F.L., 2016. Comparing genomic selection and marker-assisted selection for Fusarium head blight resistance in wheat (Triticum aestivum L.). Mol. Breed. 36, 1–11. Asoro, F.G., Newell, M.A., Beavis, W.D., Scott, M.P., Jannink, J.-L., 2011. Accuracy and training population design for genomic selection on quantitative traits in elite North American oats. Plant Genome J. 4, 132–144. Battenfield, S.D., Guzmán, C., Gaynor, R.C., Singh, R.P., Peña, R.J., Dreisigacker, S., Fritz, A.K., Poland, J.A., 2016. Genomic selection for processing and end-use quality traits in the CIMMYT spring bread wheat breeding program. Plant Genome 9. Bauriegel, E., Giebel, A., Geyer, M., Schmidt, U., Herppich, W.B., 2011. Early detection of Fusarium infection in wheat using hyper-spectral imaging. Comput. Electron. Agric. 75, 304–312. Bernardo, R., 2008. Molecular markers and selection for complex traits in plants: learning from the last 20 years. Crop Sci. 48, 1649–1664. Bernardo, R., 2014. Genomewide selection when major genes are known. Crop Sci. 54, 68–75. Bernardo, R., Yu, J., 2007. Prospects for genomewide selection for quantitative traits in maize. Crop Sci. 47, 1082–1090. Burgueño, J., Crossa, J., Cotes, J.M., Vicente, F.S., Das, B., 2011. Prediction assessment of linear mixed models for multienvironment trials. Crop Sci. 51, 944–954. Burgueño, J., de los Campos, G., Weigel, K., Crossa, J., 2012. Genomic prediction of breeding values when modeling genotype × environment interaction using pedigree and dense molecular markers. Crop Sci. 52, 707–719. Crossa, J., de los Campos, G., Pérez, P., Gianola, D., Burgueño, J., Araus, J.L., Makumbi, D., Singh, R.P., Dreisigacker, S., Yan, J., Arief, V., Banziger, M., Braun, H.J., 2010. 
Prediction of genetic values of quantitative traits in plant breeding using pedigree and molecular markers. Genetics 186, 713–724. Crossa, J., Pérez, P., Hickey, J., Burgueño, J., Ornella, L., Cerón-Rojas, J., Zhang, X., Dreisigacker, S., Babu, R., Li, Y., Bonnett, D., Mathews, K., 2014. Genomic prediction in CIMMYT maize and wheat breeding programs. Heredity 112, 48–60. Crossa, J., de los Campos, G., Maccaferri, M., Tuberosa, R., Burgueño, J., Pérez-Rodríguez, P., 2016. Extending the marker × environment interaction model for genomic-enabled prediction and genome-wide association analysis in durum wheat. Crop Sci. 56, 2193–2209. Cuevas, J., Crossa, J., Soberanis, V., Pérez-Elizalde, S., Pérez-Rodríguez, P., Campos, G.d.l., Montesinos-López, O.A., Burgueño, J., 2016a. Genomic prediction of genotype × environment interaction kernel regression models. Plant Genome 9 (3), 1–20. Cuevas, J., Crossa, J., Montesinos-López, O.A., Burgueño, J., Pérez-Rodríguez, P., Campos, G.d.l., 2016b. Bayesian genomic prediction with genotype × environment interaction kernel models. G3: Genes Genomes Genet. 7, 41–53. Daetwyler, H.D., Bansal, U.K., Bariana, H.S., Hayden, M.J., Hayes, B.J., 2014. Genomic prediction for rust resistance in diverse wheat landraces. Theor. Appl. Genet. 127, 1795–1803. Dawson, J.C., Endelman, J.B., Heslot, N., Crossa, J., Poland, J., Dreisigacker, S., Manes, Y., Sorrells, M.E., Jannink, J.L., 2013. The use of unbalanced historical data for genomic selection in an international wheat breeding program. Field Crops Res. 154, 12–22.
de los Campos, G., Naya, H., Gianola, D., Crossa, J., Legarra, A., Manfredi, E., Weigel, K., Cotes, J.M., 2009. Predicting quantitative traits with regression models for dense molecular markers and pedigree. Genetics 182, 375–385.
Devadas, R., Lamb, D.W., Backhouse, D., Simpfendorfer, S., 2015. Sequential application of hyperspectral indices for delineation of stripe rust infection and nitrogen deficiency in wheat. Precis. Agric. 16, 477–491.
Edwards, M.A., Osborne, B.G., Henry, R.J., 2010. Puroindoline genotype, starch granule size distribution and milling quality of wheat. J. Cereal Sci. 52, 314–320.
Endelman, J.B., 2011. Ridge regression and other kernels for genomic selection with R package rrBLUP. Plant Genome J. 4, 250.
González-Camacho, J.M., de los Campos, G., Pérez, P., Gianola, D., Cairns, J.E., Mahuku, G., Babu, R., Crossa, J., 2012. Genome-enabled prediction of genetic values using radial basis function neural networks. Theor. Appl. Genet. 125, 759–771.
Haghighattalab, A., González Pérez, L., Mondal, S., Singh, D., Schinstock, D., Rutkoski, J., Ortiz-Monasterio, I., Singh, R.P., Goodin, D., Poland, J., 2016. Application of unmanned aerial systems for high throughput phenotyping of large wheat breeding nurseries. Plant Methods 12, 35.
HarvestPlus, 2017. Crops. http://www.harvestplus.org/what-we-do/crops.
Hayes, B.J., Panozzo, J., Walker, C.K., Choy, A.L., Kant, S., Wong, D., Tibbits, J., Daetwyler, H.D., Rochfort, S., Hayden, M.J., Spangenberg, G.C., 2017. Accelerating wheat breeding for end-use quality with multi-trait genomic predictions incorporating near infrared and nuclear magnetic resonance-derived phenotypes. Theor. Appl. Genet. 130, 1–15.
He, S., Schulthess, A.W., Mirdita, V., Zhao, Y., Korzun, V., Bothe, R., Ebmeyer, E., Reif, J.C., Jiang, Y., 2016. Genomic selection in a commercial winter wheat population. Theor. Appl. Genet. 129, 641–651.
Heffner, E.L., Sorrells, M.E., Jannink, J.-L., 2009. Genomic selection for crop improvement. Crop Sci. 49, 1–12.
Heffner, E.L., Lorenz, A.J., Jannink, J.L., Sorrells, M.E., 2010. Plant breeding with genomic selection: gain per unit time and cost. Crop Sci. 50, 1681–1690.
Heffner, E.L., Jannink, J.-L., Iwata, H., Souza, E., Sorrells, M.E., 2011a. Genomic selection accuracy for grain quality traits in biparental wheat populations. Crop Sci. 51, 2597–2606.
Heffner, E.L., Jannink, J.-L., Sorrells, M.E., 2011b. Genomic selection accuracy using multifamily prediction models in a wheat breeding program. Plant Genome 4, 65–75.
Heslot, N., Yang, H.-P., Sorrells, M.E., Jannink, J.-L., 2012. Genomic selection in plant breeding: a comparison of models. Crop Sci. 52, 146–160.
Heslot, N., Jannink, J.L., Sorrells, M.E., 2013a. Using genomic prediction to characterize environments and optimize prediction accuracy in applied breeding data. Crop Sci. 53 (3), 921–933.
Heslot, N., Rutkoski, J., Poland, J., Jannink, J.L., Sorrells, M.E., 2013b. Impact of marker ascertainment bias on genomic selection accuracy and estimates of genetic diversity. PLoS ONE 8, e74612.
Heslot, N., Akdemir, D., Sorrells, M.E., Jannink, J.L., 2014. Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions. Theor. Appl. Genet. 127, 463–480.
Hoffstetter, A., Cabrera, A., Huang, M., Sneller, C., 2016. Optimizing training population data and validation of genomic selection for economic traits in soft winter wheat. G3 6, 2919–2928.
Holman, F.H., Riche, A.B., Michalski, A., Castle, M., Wooster, M.J., Hawkesford, M.J., 2016. High throughput field phenotyping of wheat plant height and growth rate in field plot trials using UAV based remote sensing. Remote Sens. 8, 1031.
Jarquín, D., Crossa, J., Lacaze, X., Du Cheyron, P., Daucourt, J., Lorgeou, J., Perez, P., Calus, M., de los Campos, G., Burgueño, J., 2014. A reaction norm model for genomic selection using high-dimensional genomic and environmental data. Theor. Appl. Genet. 127, 595–607.
Jarquín, D., Lemes da Silva, C., Gaynor, R.C., Poland, J., Fritz, A., Howard, R., Battenfield, S., Crossa, J., 2017. Increasing genomic-enabled prediction accuracy by modeling genotype × environment interactions in Kansas wheat. Plant Genome 10.
Jia, Y., Jannink, J.L., 2012. Multiple trait genomic selection methods increase genetic value prediction accuracy. Genetics 192, 1513–1522.
Jiang, Y., Schulthess, A.W., Rodemann, B., Ling, J., Plieske, J., Kollers, S., Ebmeyer, E., Korzun, V., Argillier, O., Stiewe, G., Ganal, M.W., Röder, M.S., Reif, J.C., 2017. Validating the prediction accuracies of marker-assisted and genomic selection of Fusarium head blight resistance in wheat using an independent sample. Theor. Appl. Genet. 130, 471–482.
Jin, F., Zhang, D., Bockus, W., Baenziger, P.S., Carver, B., Bai, G., 2013. Fusarium head blight resistance in U.S. winter wheat cultivars and elite breeding lines. Crop Sci. 53, 2006–2013.
Juliana, P., Singh, R.P., Singh, P.K., Crossa, J., Huerta-Espino, J., Lan, C., Bhavani, S., Rutkoski, J.E., Poland, J.A., Bergstrom, G.C., Sorrells, M.E., 2017a. Genomic and pedigree-based prediction for leaf, stem, and stripe rust resistance in wheat. Theor. Appl. Genet. 130, 1415–1430.
Juliana, P., Singh, R.P., Singh, P.K., Crossa, J., Rutkoski, J.E., Poland, J.A., Bergstrom, G.C., Sorrells, M.E., 2017b. Comparison of models and whole-genome profiling approaches for genomic-enabled prediction of Septoria tritici blotch, Stagonospora nodorum blotch, and tan spot resistance in wheat. Plant Genome 10.
Lado, B., Barrios, P.G., Quincke, M., Silva, P., Gutiérrez, L., 2016. Modeling genotype × environment interaction for genomic selection with unbalanced data from a wheat breeding program. Crop Sci. 56, 2165–2179.
Lehermeier, C., Krämer, N., Bauer, E., Bauland, C., Camisan, C., Campo, L., Flament, P., Melchinger, A.E., Menz, M., Meyer, N., Moreau, L., Moreno-González, J., Ouzunova, M., Pausch, H., Ranc, N., Schipprack, W., Schönleben, M., Walter, H., Charcosset, A., Schön, C.-C., 2014. Usefulness of multiparental populations of maize (Zea mays L.) for genome-based prediction. Genetics 198, 3–16.
Liebisch, F., Kirchgessner, N., Schneider, D., Walter, A., Hund, A., 2015. Remote, aerial phenotyping of maize traits with a mobile multi-sensor approach. Plant Methods 11, 9.
Lopez-Cruz, M., Crossa, J., Bonnett, D., Dreisigacker, S., Poland, J., Jannink, J.-L., Singh, R.P., Autrique, E., de los Campos, G., 2015. Increased prediction accuracy in wheat breeding trials using a marker × environment interaction genomic selection model. G3 5, 569–582.
Lorenz, A.J., Chao, S., Asoro, F.G., Heffner, E.L., Hayashi, T., Iwata, H., Smith, K.P., Sorrells, M.E., Jannink, J.L., 2011. Genomic selection in plant breeding: knowledge and prospects. Adv. Agron. 110, 77–123.
Lorenzana, R.E., Bernardo, R., 2009. Accuracy of genotypic value predictions for marker-based selection in biparental plant populations. Theor. Appl. Genet. 120, 151–161.
Manickavelu, A., Hattori, T., Yamaoka, S., Yoshimura, K., Kondou, Y., Onogi, A., Matsui, M., Iwata, H., Ban, T., 2017. Genetic nature of elemental contents in wheat grains and its genomic prediction: toward the effective use of wheat landraces from Afghanistan. PLoS ONE 12, e0169416.
Martre, P., Jamieson, P.D., Semenov, M.A., Zyskowski, R.F., Porter, J.R., Triboi, E., 2006. Modelling protein content and composition in relation to crop nitrogen dynamics for wheat. Eur. J. Agron. 25, 138–154.
Meuwissen, T.H.E., Hayes, B.J., Goddard, M.E., 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157, 1819–1829.
Mirdita, V., He, S., Zhao, Y., Korzun, V., Bothe, R., Ebmeyer, E., Reif, J.C., Jiang, Y., 2015. Potential and limits of whole genome prediction of resistance to Fusarium head blight and Septoria tritici blotch in a vast Central European elite winter wheat population. Theor. Appl. Genet. 128, 2471–2481.
Moore, J.K., Manmathan, H.K., Anderson, V.A., Poland, J.A., Morris, C.F., Haley, S.D., 2017. Improving genomic prediction for pre-harvest sprouting tolerance in wheat by weighting large-effect quantitative trait loci. Crop Sci. 57, 1315–1324.
Muleta, K.T., Bulli, P., Zhang, Z., Chen, X., Pumphrey, M., 2017. Unlocking diversity in germplasm collections via genomic selection: a case study based on quantitative adult plant resistance to stripe rust in spring wheat. Plant Genome 10.
Ornella, L., Singh, S., Perez, P., Burgueño, J., Singh, R., Tapia, E., Bhavani, S., Dreisigacker, S., Braun, H.-J., Mathews, K., Crossa, J., 2012. Genomic prediction of genetic values for resistance to wheat rusts. Plant Genome J. 5, 136–148.
Ortiz-Monasterio, J.I., Palacios-Rojas, N., Meng, E., Pixley, K., Trethowan, R., Pena, R.J., 2007. Enhancing the mineral and vitamin content of wheat and maize through plant breeding. J. Cereal Sci. 46, 293–307.
Pérez-Rodríguez, P., Gianola, D., González-Camacho, J.M., Crossa, J., Manès, Y., Dreisigacker, S., 2012. Comparison between linear and non-parametric regression models for genome-enabled prediction in wheat. G3 2, 1595–1605.
Pérez-Rodríguez, P., Crossa, J., Rutkoski, J., Poland, J., Singh, R., Legarra, A., Autrique, E., de los Campos, G., Burgueño, J., Dreisigacker, S., 2017. Single-step genomic and pedigree genotype × environment interaction models for predicting wheat lines in international environments. Plant Genome 10.
Poland, J., Rutkoski, J., 2016. Advances and challenges in genomic selection for disease resistance. Annu. Rev. Phytopathol. 54, 79–98.
Rutkoski, J., Benson, J., Jia, Y., Brown-Guedira, G., Jannink, J.-L., Sorrells, M., 2012. Evaluation of genomic prediction methods for Fusarium head blight resistance in wheat. Plant Genome J. 5, 51.
Rutkoski, J.E., Poland, J., Jannink, J.-L., Sorrells, M.E., 2013. Imputation of unordered markers and the impact on genomic selection accuracy. G3 427–439.
Rutkoski, J.E., Poland, J.A., Singh, R.P., Huerta-Espino, J., Bhavani, S., Barbier, H., Rouse, M.N., Jannink, J.-L., Sorrells, M.E., 2014. Genomic selection for quantitative adult plant stem rust resistance in wheat. Plant Genome 7.
Rutkoski, J., Singh, R.P., Huerta-Espino, J., Bhavani, S., Poland, J., Jannink, J.L., Sorrells, M.E., 2015a. Genetic gain from phenotypic and genomic selection for quantitative resistance to stem rust of wheat. Plant Genome 8.
Rutkoski, J., Singh, R.P., Huerta-Espino, J., Bhavani, S., Poland, J., Jannink, J.L., Sorrells, M.E., 2015b. Efficient use of historical data for genomic selection: a case study of stem rust resistance in wheat. Plant Genome 8, 1–10.
Rutkoski, J., Poland, J., Mondal, S., Autrique, E., Pérez, L.G., Crossa, J., Reynolds, M., Singh, R., 2016. Canopy temperature and vegetation indices from high-throughput phenotyping improve accuracy of pedigree and genomic selection for grain yield in wheat. G3 6, 2799–2808.
Shakoor, N., Lee, S., Mockler, T.C., 2017. High throughput phenotyping to accelerate crop breeding and monitoring of diseases in the field. Curr. Opin. Plant Biol. 38, 184–192.
Sun, J., Rutkoski, J.E., Poland, J.A., Crossa, J., Jannink, J.-L., Sorrells, M.E., 2017. Multitrait, random regression, or simple repeatability model in high-throughput phenotyping data improve genomic prediction for wheat grain yield. Plant Genome 10.
Tanger, P., Klassen, S., Mojica, J.P., Lovell, J.T., Moyers, B.T., Baraoidan, M., Naredo, M.E.B., McNally, K.L., Poland, J., Bush, D.R., Leung, H., Leach, J.E., McKay, J.K., 2017. Field-based high throughput phenotyping rapidly identifies genomic regions controlling yield components in rice. Sci. Rep. 7, 42839.
Trethowan, R.M., Reynolds, M., Sayre, K., Ortiz-Monasterio, I., 2005. Adapting wheat cultivars to resource conserving farming practices and human nutritional needs. Ann. Appl. Biol. 146, 405–413.
Velu, G., Ortiz-Monasterio, I., Cakmak, I., Hao, Y., Singh, R.P., 2014. Biofortification strategies to increase grain zinc and iron concentrations in wheat. J. Cereal Sci. 59, 365–372.
Velu, G., Crossa, J., Singh, R.P., Hao, Y., Dreisigacker, S., Perez-Rodriguez, P., Joshi, A.K., Chatrath, R., Gupta, V., Balasubramaniam, A., Tiwari, C., Mishra, V.K., Sohu, V.S., Mavi, G.S., 2016. Genomic prediction for grain zinc and iron concentrations in spring wheat. Theor. Appl. Genet. 129, 1595–1605.
Wang, Y., Mette, M., Miedaner, T., Gottwald, M., Wilde, P., Reif, J.C., Zhao, Y., 2014. The accuracy of prediction of genomic selection in elite hybrid rye populations surpasses the accuracy of marker-assisted selection and is equally augmented by multiple field evaluation locations and test years. BMC Genomics 15, 556.
Watanabe, K., Guo, W., Arai, K., Takanashi, H., Kajiya-Kanegae, H., Kobayashi, M., Yano, K., Tokunaga, T., Fujiwara, T., Tsutsumi, N., Iwata, H., 2017. High-throughput phenotyping of sorghum plant height using an unmanned aerial vehicle and its application to genomic prediction modeling. Front. Plant Sci. 8.
White, J.W., Andrade-Sanchez, P., Gore, M.A., Bronson, K.F., Coffelt, T.A., Conley, M.M., Feldmann, K.A., French, A.N., Heun, J.T., Hunsaker, D.J., Jenks, M.A., Kimball, B.A., Roth, R.L., Strand, R.J., Thorp, K.R., Wall, G.W., Wang, G., 2012. Field-based phenomics for plant genetics research. Field Crops Res. 133, 101–112.
FURTHER READING
Bernardo, R., 2014a. Genomewide selection of parental inbreds: classes of loci and virtual biparental populations. Crop Sci. 54, 2586–2595.
Crespo-Herrera, L.A., Crossa, J., Huerta-Espino, J., Autrique, E., Mondal, S., Velu, G., Vargas, M., Braun, H.J., Singh, R.P., 2017. Genetic yield gains in CIMMYT’s international elite spring wheat yield trials by modeling the genotype × environment interaction. Crop Sci. 57, 789–801.
He, S., Reif, J.C., Korzun, V., Bothe, R., Ebmeyer, E., Jiang, Y., 2017. Genome-wide mapping and prediction suggests presence of local epistasis in a vast elite winter wheat populations adapted to Central Europe. Theor. Appl. Genet. 130, 635–647.
Heslot, N., Jannink, J.-L., Sorrells, M.E., 2015. Perspectives for genomic selection applications and research in plants. Crop Sci. 55, 1–12.
Lado, B., Battenfield, S., Guzmán, C., Quincke, M., Singh, R.P., Dreisigacker, S., Peña, R.J., Fritz, A., Silva, P., Poland, J., Gutiérrez, L., 2017. Strategies for selecting crosses using genomic prediction in two wheat breeding programs. Plant Genome 10.
Liu, G., Zhao, Y., Gowda, M., Longin, C.F.H., Reif, J.C., Mette, M.F., 2016. Predicting hybrid performances for quality traits through genomic-assisted approaches in central European wheat. PLoS ONE 11, e0158635.
Marulanda, J.J., Mi, X., Melchinger, A.E., Xu, J.-L., Würschum, T., Longin, C.F.H., 2016. Optimum breeding strategies using genomic selection for hybrid breeding in wheat, maize, rye, barley, rice and triticale. Theor. Appl. Genet. 129, 1901–1913.
Michel, S., Ametz, C., Gungor, H., Akgöl, B., Epure, D., Grausgruber, H., Löschenberger, F., Buerstmayr, H., 2017. Genomic assisted selection for enhancing line breeding: merging genomic and phenotypic selection in winter wheat breeding programs with preliminary yield trials. Theor. Appl. Genet. 130, 363–376.
Paltridge, N.G., Milham, P.J., Ortiz-Monasterio, J.I., Velu, G., Yasmin, Z., Palmer, L.J., Guild, G.E., Stangoulis, J.C.R., 2012. Energy-dispersive X-ray fluorescence spectrometry as a tool for zinc, iron and selenium analysis in whole grain wheat. Plant Soil 361, 261–269.
Rincent, R., Kuhn, E., Monod, H., Oury, F.-X., Rousset, M., Allard, V., Le Gouis, J., 2017. Optimization of multi-environment trials for genomic selection based on crop models. Theor. Appl. Genet. 130, 1735–1752.
Sukumaran, S., Crossa, J., Jarquín, D., Reynolds, M., 2017. Pedigree-based prediction models with genotype × environment interaction in multienvironment trials of CIMMYT wheat. Crop Sci. 57, 1865–1880.
Yang, W., Guo, Z., Huang, C., Duan, L., Chen, G., Jiang, N., Fang, W., Feng, H., Xie, W., Lian, X., Wang, G., Luo, Q., Zhang, Q., Liu, Q., Xiong, L., 2014. Combining high-throughput phenotyping and genome-wide association studies to reveal natural genetic variation in rice. Nat. Commun. 5, 5087.
Yuan, L., Zhang, H., Zhang, Y., Xing, C., Bao, Z., 2017. Feasibility assessment of multi-spectral satellite sensors in monitoring and discriminating wheat diseases and insects. Optik 131, 598–608.
Zhao, Y., Mette, M.F., Gowda, M., Longin, C.F.H., Reif, J.C., 2014. Bridging the gap between marker-assisted and genomic selection of heading time and plant height in hybrid wheat. Heredity 112, 638–645.