Journal of Integrative Agriculture 2014, 13(9): 1845-1853
September 2014
RESEARCH ARTICLE
Molecular Diversity and Association Analysis of Drought and Salt Tolerance in Gossypium hirsutum L. Germplasm
JIA Yin-hua, SUN Jun-ling, WANG Xi-wen, ZHOU Zhong-li, PAN Zao-e, HE Shou-pu, PANG Bao-yin, WANG Li-ru and DU Xiong-ming
State Key Laboratory of Cotton Biology, Institute of Cotton Research, Chinese Academy of Agricultural Sciences, Anyang 455000, P.R.China
Abstract
Association mapping is a useful tool for detecting genes selected during plant domestication, based on their linkage disequilibrium (LD). This study estimated genetic diversity, population structure and the extent of LD in order to develop an association framework for identifying genetic variation associated with drought and salt tolerance traits. A total of 106 microsatellite marker primer pairs were used to genotype 323 Gossypium hirsutum germplasms, which were grown in a drought shed and a salt pond for evaluation. The markers were polymorphic (mean PIC=0.53), and three groups were detected (K=3) using the second-order likelihood statistic ΔK in STRUCTURE software. LD decay distances were estimated to be 13-15 cM at r²≥0.20. Significant associations between polymorphic markers and drought and salt tolerance traits were observed using the general linear model (GLM) and mixed linear model (MLM) (P≤0.01). The results also demonstrated that association mapping that accounts for the population structure and stratification existing in cotton germplasm resources could complement and enhance quantitative trait loci (QTLs) information for marker-assisted selection.
Key words: cotton germplasm, genetic diversity, simple sequence repeats (SSR) markers, linkage disequilibrium (LD), association analysis
INTRODUCTION
Cotton is the world's most important textile fiber crop and provides the majority of the fiber used in the textile industry. High yield and fiber quality are the primary focus of cotton breeding programs. However, cotton yield is reduced by water and salt stress when the crop is grown in arid and semi-arid areas. The narrow genetic base of modern cotton cultivars limits the improvement of stress resistance and yield (Rungis et al. 2005). Thus, an approach is urgently needed to identify novel germplasm resources or genes that improve the resistance of cotton.
In the past, considerable research has been conducted to detect QTLs for various traits under arid and saline conditions (Mansur et al. 1993; Lilley et al. 1996; Tuberosa et al. 1998). QTL mapping for drought tolerance in cotton has been reported for traits of productivity, physiology and fiber quality (Saranga et al. 2001; Levi et al. 2009a, b), and some valuable QTLs were found to raise cotton yield by increasing drought tolerance (Muhammad et al. 2011). Stress signal transduction and transcriptional regulation are considered to play an important role in the plant salt stress response. Genes related to kinases and phosphatases were found to switch the salt stress response on or off in cotton (Zhang et al. 2011). However, few
Received 26 June, 2013 Accepted 27 December, 2013 JIA Yin-hua, E-mail:
[email protected]; Correspondence DU Xiong-ming, E-mail:
[email protected] © 2014, CAAS. All rights reserved. Published by Elsevier Ltd. doi: 10.1016/S2095-3119(13)60668-1
studies on QTL mapping of salt tolerance have been conducted in cotton. QTL studies on crops are mainly based on linkage analysis of F2-, RIL- or DH (double haploid)-derived mapping populations using molecular marker technology. These methods are very useful for detecting marker alleles linked to their respective traits. A large number of simple sequence repeat (SSR) markers have been developed and can be accessed through the cotton marker database (CMD) (Blenda et al. 2006). Many DNA markers have been identified as available for future cotton breeding programs through marker-assisted selection (MAS) (Zhang et al. 2008). However, markers linked to QTLs in some populations might not be useful for detecting QTL effects because of linkage disequilibrium (LD) in mapping and breeding populations. Association mapping could be valuable for validating QTLs and detecting more of them in populations with complex pedigrees, complementing linkage analyses. Association mapping was first used successfully to identify alleles at loci contributing to human diseases (Goldstein 2003), and to map and eventually clone a number of genes underlying complex genetic traits in humans (Weiss and Clark 2002). This approach has also been used in different crops to identify markers and genes associated with a variety of phenotypes. During domestication, humans selected favorable traits that were best adapted to their agricultural environment. These selections led to genetic changes shared by all individuals of a cultivated species. In maize, the tb1 gene was selected very early to shorten the length of branches (Wang et al. 1999; Viviane et al. 2003), and has been cloned based on association analysis (Doebley et al. 1997). In wheat, some important SSR markers were found to be associated with kernel size (Breseghello and Sorrells 2006).
In barley, several markers were found to be associated with a number of agronomic traits, including yield, heading date, water-stress tolerance, and salt tolerance (Ivandic et al. 2003; Kraakman et al. 2004). Similarly, in cotton, several SSR markers were found to be associated with fiber quality in Gossypium hirsutum (Abdurakhmonov et al. 2008, 2009) and Gossypium arboreum (Kantartzi and Stewart 2008). Population structure is a common confounding factor in association studies (Pritchard et al. 2000b). Study populations are often composed of individuals derived from complex pedigrees and adapted to local environments. New methods have been developed to detect markers associated with traits more accurately by accounting for population structure and relatedness (Pritchard et al. 2000b). Besides identifying markers for QTL loci, the same population analysis can also be used to calculate LD. Accounting for population structure can significantly reduce the number of false positives in plant studies (Thornsberry et al. 2001). Currently, STRUCTURE is the main software used to evaluate population structure (Pritchard et al. 2000a), especially in populations of allogamous plants. In this study, we analyzed the association of SSR markers with drought and salt tolerance in a collection of cotton germplasms from all over the world.
RESULTS
Morphological traits
The 323 cotton germplasms were evaluated in the drought and salt environments separately. Phenotypic values of drought and salt tolerance showed a wide range, indicating that the trait data were suitable for association analysis (Table 1). The cultivar variance exceeded the error variance for both traits (Table 2). For drought tolerance, the lowest value was 63.66, representing the individual most sensitive to water stress, and the highest value was 86.98, representing the most resistant individual. For salt tolerance, the lowest value was 3.86 and the highest value was 132.91.
Table 1 Summary statistics of drought tolerance and salt tolerance
Trait              Mean   Standard deviation  Minimum  Maximum  P-value of normality test1)  Q12)   Median  Q33)
Drought tolerance  82.38  3.30                63.66    86.98    0.85                         80.63  83.46   85.48
Salt tolerance     31.48  21.08               3.86     132.91   0.89                         16.84  26.61   42.16
1) P-value of normality test, Shapiro-Wilk test. 2) Q1, quantile 25%. 3) Q3, quantile 75%.
Table 2 Mean squares of the analysis of variance (ANOVA) of drought tolerance and salt tolerance
Source       df   Salt tolerance  Drought tolerance
Cultivar     321  1 133.63        37.11586
Replicate    2    26 443.46       221.37997*
Error        -    961.38**        36.93683
R2 of model  -    0.40            0.342554
* and **, significance at the probability levels of 0.01 and 0.001, respectively. The same as below.
Diversity and structure
In the 323 cotton germplasms, 106 primer pairs detected 278 loci and 333 SSR alleles, with an average of 3.1 alleles per marker (range 2-6) (Table 3). The number of effective alleles varied from 1.2 to 4.9 with an average of 2.4. Forty-five SSR alleles (approximately 13.5%) were rare, being present in no more than 5% of the cotton individuals. The PIC of the SSR markers ranged from 0.17 to 0.79 with an average of 0.53. The genetic distance (GD) was estimated with NTSYSpc ver. 2.1; pairwise distances among all lines ranged from 0.04 to 0.57 with an average of 0.26, demonstrating a considerable range of genetic diversity. STRUCTURE was first used to estimate the structure of the data set. The log-likelihood increased as the number of groups (K) increased, and no clear maximum was found across the runs (Fig. 1-A). Then, the second-order change in log-likelihood was
calculated using the method described by Evanno et al. (2005) (Fig. 1-B). A strong signal for K=3 was found after calculating ΔK, so three was taken as the best-supported number of groups. The subpopulation structure of the 323 lines is shown as bar plots for the three groups (Fig. 2). A line was assigned to a group when at least 50% of its genetic background came from that single ancestral group. The proportions of the 323 lines assigned to the different groups were uneven. Some lines had a complex genetic background: they were assigned to one group but also showed ancestry from other groups. In total, 290 lines were assigned to the three groups with at least 50% probability, and the groups, labeled 1, 2 and 3, consisted of 101, 126 and 63 lines, respectively. The remaining 33 lines, with a probability lower than 50%, were placed in a mixed group.
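The 50% assignment rule described above can be illustrated with a minimal sketch. It assumes the Q-matrix is available as one row of membership coefficients per line; the function name `assign_groups` and the example coefficients are hypothetical and are not taken from the study data.

```python
def assign_groups(q_matrix, threshold=0.5):
    """Label each line with its dominant group (1-based) or 'mixed'.

    q_matrix: list of rows, each row holding the membership coefficients
    of one line in the K inferred groups (hypothetical input layout).
    """
    labels = []
    for row in q_matrix:
        # Index of the group with the largest membership coefficient.
        best = max(range(len(row)), key=lambda k: row[k])
        # Lines below the threshold in every group go to the mixed group.
        labels.append(best + 1 if row[best] >= threshold else "mixed")
    return labels


# Hypothetical membership coefficients for four lines and K=3.
q_example = [
    [0.82, 0.10, 0.08],
    [0.30, 0.55, 0.15],
    [0.40, 0.35, 0.25],
    [0.05, 0.15, 0.80],
]
print(assign_groups(q_example))  # [1, 2, 'mixed', 3]
```

The threshold is a parameter so that stricter cut-offs can be tried without changing the logic.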
Linkage disequilibrium analysis
LD was estimated using TASSEL 2.1. Of the 38 503 pairwise comparisons, 11.6% of SSR locus pairs were in significant LD (P≤0.01, r²≥0.01), of which 10.9% of locus pairs showed LD that had decayed over the long history of cotton cultivar selection. In different
Table 3 Summary of simple sequence repeat (SSR) polymorphisms
Item                                            Value
No. of polymorphic SSRs, locus                  278
No. of polymorphic SSRs, average allele/marker  3.1
Effective allele                                2.4
Polymorphic information content (PIC), range    0.17-0.79
Polymorphic information content (PIC), average  0.53
Rare allele (%)                                 13.5
Genetic distance, range                         0.04-0.57
Genetic distance, average                       0.26
[Fig. 1 axes: panel A, log-likelihood against number of groups (K); panel B, ΔK against number of groups (K)]
Fig. 1 Analysis of the population structure. The number of groups was calculated using STRUCTURE (Pritchard et al. 2000a). A, log-likelihood plot. The log-likelihood increased as the number of groups (K) increased. B, ΔK plot. ΔK=m(|L(K+1)-2L(K)+L(K-1)|)/s[L(K)] was used to assess the number of groups (K) (Evanno et al. 2005). A clear peak was detected at K=3.
groups, different LD levels were observed. In group 1, 3.5% of SSR marker pairs were in significant LD (P≤0.01, r²≥0.01; 38 503 pairwise comparisons), compared with 3.6% in group 2 and 3.0% in group 3, whereas only 1.2% of pairs showed significant LD in the mixed group. In future studies, a larger population should be used for better LD estimation. Because LD blocks are important for genome-wide association mapping, the pattern of haplotypic LD was studied with 278 loci covering most of the chromosomes. Pairwise estimates of r² varied from 0 to 1, and most values ranged from 0 to 0.1. Most of the locus pairs in LD (r²≥0.2) were located within a distance of less than 15 cM. These results indicated that LD in this population decayed with genetic distance (Fig. 3). The LD decay rate was measured as the chromosomal distance at which r² dropped to half its maximum value. Genome-wide LD decay distances in the present cotton sample were estimated at 7-8 and 13-15 cM, where r² dropped to 0.25 and 0.20, respectively.
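As a rough illustration of how a decay distance can be read from pairwise data, the sketch below bins locus pairs by genetic distance and reports the first bin whose mean r² falls below a threshold. This binned-mean approach is a simplification swapped in for illustration; the study itself reads the threshold off a trend line (Fig. 3). The function name, bin width and data points are hypothetical.

```python
from collections import defaultdict


def ld_decay_distance(pairs, threshold=0.2, bin_width=5.0):
    """Return the lower edge (cM) of the first bin with mean r^2 < threshold.

    pairs: iterable of (genetic_distance_cM, r2) tuples for linked loci.
    Returns None if no bin falls below the threshold.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for distance_cm, r2 in pairs:
        b = int(distance_cm // bin_width)  # index of the distance bin
        sums[b] += r2
        counts[b] += 1
    for b in sorted(sums):
        if sums[b] / counts[b] < threshold:
            return b * bin_width
    return None


# Hypothetical (distance, r^2) observations, not values from the study.
example_pairs = [
    (1.0, 0.60), (3.0, 0.40),
    (6.0, 0.35), (9.0, 0.25),
    (11.0, 0.22), (14.0, 0.24),
    (16.0, 0.15), (19.0, 0.10),
    (40.0, 0.02),
]
print(ld_decay_distance(example_pairs))  # 15.0
```

With real data a fitted curve would usually be preferable to fixed bins, since sparse bins make the mean unstable.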
Fig. 2 The bar plots of Q-matrix estimates for the variety accessions. Groups are represented in different colors (blue for group 1, green for group 2, red for group 3).
Fig. 3 The decay of linkage disequilibrium, measured as r², against genetic distance (cM) between all pairs of SSR loci. LD decay is considered at the threshold r² of 0.2 based on the trend line.
Association mapping of drought and salt tolerance
The associations of SSR markers with drought and salt tolerance in the G. hirsutum population were tested using the general linear model (GLM) and the mixed linear model (MLM). The 278 highly polymorphic SSR loci were used for association mapping. In GLM, at the significance threshold of P≤0.01 (Table 4), 15 markers were associated with drought tolerance and three markers were associated with salt tolerance. In MLM, which accounts for both population structure and kinship, three markers remained associated with drought tolerance (NAU2265_382, NAU2277_60, BNL1694_415) and three markers were associated with salt tolerance (NAU2741_282, NAU5099_280, MUSS440_401) (P≤0.01).
Table 4 SSR markers associated with drought tolerance and salt tolerance among the Gossypium hirsutum population
Traits    Association markers  GLM1) P-values  MLM2) P-values  Effect for phenotype
DroughtR  NAU3419_252          6.5×10⁻³        0.0757          -0.89
DroughtR  NAU2265_382          6.3×10⁻⁴        9.5×10⁻³        -1.18
DroughtR  NAU2277_60           4.4×10⁻³        5.2×10⁻³        1.82
DroughtR  NAU2277_72           3.8×10⁻³        0.0652          -0.96
DroughtR  NAU1190_228          9.4×10⁻³        0.1846          -0.91
DroughtR  NAU2679_218          5.1×10⁻³        0.0155          -1.53
DroughtR  BNL1694_415          1.5×10⁻³        6.5×10⁻³        1.15
DroughtR  TMB1963_218          5.7×10⁻³        0.0734          -0.94
DroughtR  TMB1963_243          2.7×10⁻³        0.0415          1.02
DroughtR  NAU2437_245          2.9×10⁻³        0.0244          -1.17
DroughtR  BNL1694_235          2.4×10⁻³        0.0265          -1.16
DroughtR  NAU1102_230          3.0×10⁻³        0.016           -1.26
DroughtR  NAU3110_224          3.5×10⁻³        0.0155          -1.22
DroughtR  NAU3110_292          3.6×10⁻³        0.0174          -1.22
DroughtR  NAU3110_318          3.6×10⁻³        0.0174          -1.22
SaltR     NAU2741_282          5.3×10⁻³        3.1×10⁻³        7.79
SaltR     NAU5099_280          8.3×10⁻³        8.4×10⁻³        -6.22
SaltR     MUSS440_401          0.0101          6.4×10⁻³        10.28
1) GLM, general linear model. 2) MLM, mixed linear model.
The phenotypic allele effects were also estimated (Table 4). Most of the markers showed significant negative effects on drought tolerance, and NAU2265_382 showed a consistent negative effect in both GLM and MLM. In contrast, NAU2277_60, BNL1694_415 and TMB1963_243 had positive effects on drought tolerance. NAU2277_60 and BNL1694_415
kept their positive effects in both models, GLM and MLM. NAU2741_282 and MUSS440_401 were associated with salt tolerance and showed significant positive effects of 7.79 and 10.28 (P≤0.01), meaning that genes linked with these markers could improve salt tolerance. However, genes associated with NAU5099_280 could decrease the ability to endure salt stress, with a negative effect of -6.22. Within the population, the most drought-tolerant cultivars, Zhong 507145 and Suyuan 04-162, carried the significant positive-effect alleles of NAU2277_60 and BNL1694_415. The most salt-tolerant cultivars, Zhong 2220, Ji 91-18 and r4136, also carried the significant positive-effect alleles of NAU2741_282, NAU5099_280 and MUSS440_401. The allele effects and resistant cultivars identified in this test would be useful for accurate selection in future marker-assisted breeding.
DISCUSSION
This study applied association mapping in cotton to identify genetic markers associated with drought and salt tolerance. The experimental materials were collected from all over the world and differed in drought and salt tolerance. Closely related cultivars in the population would violate the assumptions of the STRUCTURE algorithm (Pritchard et al. 2003) and inflate LD among unlinked loci. Our results showed that the number of SSR alleles per locus in these cotton accessions (2-6 alleles) was lower than that reported for landrace stocks (Lacape et al. 2007). The PIC of the SSR markers ranged from 0.17 to 0.79, and the genetic distance within the population ranged from 0.04 to 0.57. These results demonstrated the general genetic diversity of the population. More polymorphic markers and germplasms should perhaps be screened to identify potential alleles and confirm these results. A larger sample size would increase detection power and allow more alleles to reach counts high enough for association analysis (Breseghello and Sorrells 2006). In the present study, Bayesian approaches were used to infer the population structure, and the 323 upland cottons were assigned to three separate groups. Accurate grouping benefits the subsequent trait-marker association analyses.
The haplotypic LD decayed with the distance between loci in upland cotton. The decay distance was about 13-15 cM (r²≥0.2), which was similar to that reported by Abdurakhmonov et al. (2009) for landrace stock germplasm. The decay of LD between linked loci is useful for genome-wide association mapping (Stich et al. 2005). Ideal association mapping of genes requires strong LD only between markers tightly linked to the loci of a trait. LD between unlinked pairs of SSR loci was also observed in our study, in agreement with the findings of Abdurakhmonov et al. (2008). This may be attributed to co-selection of loci during the domestication process. The LD decay distance usually determines the marker density needed for association mapping. The cotton genome has a total recombination length of about 5 200 cM and an average of 400 kb cM⁻¹ (Paterson and Smith 1999). The LD decay distance of cotton was 13-15 cM (r²≥0.2) in this test, which means that roughly 300-400 polymorphic markers are required to conduct successful association mapping of complex traits. Although the 278 loci detected with 106 markers were fewer than this theoretical prediction, the number could be cut down to approximately 80-100 markers if the LD decay were set at the r²≥0.1 threshold, with a genetic distance of 55-65 cM. However, the cotton genome has a complex origin as a result of polyploidization; even the A-genome originating from G. herbaceum race africanum has experienced polyploidization (Ma et al. 2008), which produces complicated SSR data. This makes it more difficult to determine the number and location of markers, and needs further exploration. The GLM and the MLM were used to estimate the number of associated markers. The MLM can filter most of the significant markers detected by GLM, and reduces both false-positive and false-negative rates by adding population structure and kinship as covariates.
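The marker numbers quoted above follow from dividing the genome length by the LD decay distance. The short sketch below reproduces that arithmetic under the simplifying assumption of one evenly spaced marker per decay interval; the helper name `markers_needed` is hypothetical.

```python
import math

GENOME_LENGTH_CM = 5200  # total recombination length cited in the text


def markers_needed(decay_cm):
    """Markers required if one marker covers each LD-decay interval."""
    return math.ceil(GENOME_LENGTH_CM / decay_cm)


# Decay distances at the two r^2 thresholds discussed in the text.
estimates = {d: markers_needed(d) for d in (13, 15, 55, 65)}
print(estimates)  # {13: 400, 15: 347, 55: 95, 65: 80}
```

The results of about 347-400 markers at 13-15 cM and 80-95 markers at 55-65 cM are in line with the ranges of 300-400 and 80-100 given in the discussion.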
We found 15 markers associated with cotton drought tolerance and three markers associated with cotton salt tolerance at the significance level of P≤0.01 in GLM. For drought tolerance, 12 markers showed negative allele effects, and the remaining markers showed positive allele effects. However, most of those markers were filtered out after accounting for kinship in MLM. For salt tolerance, two markers showed positive allele effects, and the remaining marker showed a negative allele effect. The number of associated markers stayed the same when combined
with kinship in MLM. This means that population stratification significantly affects marker-trait association for drought tolerance and salt tolerance. QTLs are considered useful for marker-assisted selection in crop breeding. In past years, much research on QTLs controlling important traits has been carried out using linkage methods, such as QTLs for cotton fiber quality and resistance to Verticillium wilt, using inter-specific populations from crosses between G. hirsutum and Gossypium barbadense (Kohel et al. 2001; Lacape et al. 2010) and intra-specific G. hirsutum populations (Zhang et al. 2009). "QTL-rich" chromosomes have been identified in previous research; for example, the linkage groups or chromosomes 1, 6, 7, 14, 17, 20, 22, 23, LGA01, LGA03, LGD03, and LGD08 contain more fiber-related QTLs than their homeologous partners (Rong et al. 2007). QTL mapping for water stress, salt stress and related physiological traits has been developed to understand and use the molecular basis of drought tolerance in some crops. However, only a few examples of MAS for traits linked with drought and salt resistance have been reported in cotton (Levi et al. 2009a, b). In our association analysis, 15 markers were found to be associated with drought tolerance, and three markers with salt tolerance (P≤0.01). A few of these markers were located on the "QTL-rich" chromosomes 1 and 14, while the majority were located on the homeologs of "QTL-rich" chromosomes. All of these markers associated with drought and salt traits were identified for the first time and differed from the reported QTLs (Muhammad et al. 2009; Muhammad et al. 2014). However, drought and salt tolerance involve a complex system.
Yield, osmotic potential, carbon isotope ratio, leaf chlorophyll concentration, dry weight of stem and root, and proline density were the main traits related to drought tolerance (Levi et al. 2009a, b). Concentrations of elements such as nitrogen, phosphorus, potassium, and chloride were the main physiological traits related to salt tolerance. Three QTLs (BNL3259, BNL1153 and BNL2884) for osmotic potential related to drought were detected under dry conditions and mapped on chromosomes 14, 25 and 6, respectively (Muhammad et al. 2009). QTLs related to root length and root-shoot ratio under saline conditions were also screened with 109 cotton variety germplasms (Muhammad et al. 2014). In our research, we used the
relative survival rate of cotton seedlings under water stress and salt stress as the measure of drought and salt tolerance, respectively. Although this method produced visual results for drought and salt resistance, it was limited for fine QTL mapping of water and salt stress in cotton. For more accurate results, physiological and biochemical traits should be evaluated and integrated into the association analysis in the future, and further experiments over multiple locations and years should also be performed to increase the precision of the detected QTLs. Association analysis provides an efficient way to detect markers related to important resistance traits in cotton. In the present study, the associated markers were detected across the genetic background of several hundred diverse cotton lines from different geographic locations. The sample lines carried favorable genes selected by people or environmental factors since the dawn of farming. Although the QTL locations are preliminary, these markers associated with drought and salt tolerance provide a genome-wide view of selected genes in cotton, which implies potential application of MAS in cotton breeding. To validate the results and find more accurate markers associated with drought and salt tolerance traits, future research using whole-genome scans should be carried out with more controlled phenotypes, a high-density genetic map, and the core cotton germplasms.
CONCLUSION
The genetic diversity and population structure of 323 G. hirsutum germplasms were evaluated, and the LD level of the population was estimated. Three groups were detected using STRUCTURE software. LD decay distances were estimated to be 13-15 cM at r²≥0.20. Significant markers associated with drought and salt tolerance were screened with GLM and MLM (P≤0.01). Most of the markers showed effects on the traits of drought and salt tolerance.
MATERIALS AND METHODS
Plant materials
A total of 323 G. hirsutum germplasms were used to conduct the association analysis. These cotton germplasms were collected from all over the world and conserved in the GeneBank of the Cotton Research Institute at the Chinese Academy of Agricultural Sciences, including materials from the United States, central Asia, Australia, Africa, and different areas of China (Appendix A). Some individuals have been grown in their local habitat for several decades. All the cotton germplasms used in this study have been strictly self-pollinated in past years.
Phenotype evaluation
These cotton germplasms were characterized in a drought pond and a salt pond separately in Anyang, China. The drought pond was covered by a shelter that prevented rain and water from reaching the ground and kept the soil dry. Each individual was sown in one row with three replicates, completely randomized in different blocks. Two individuals that were sensitive to water stress were used as the control cultivars. The plant spacing was maintained at 0.1 m×0.06 m. In the seedling period, the soil moisture was slowly decreased to 3% and then restored to normal water content; this cycle was repeated three times. The relative survival rate was used as the measure of drought tolerance (Liu et al. 1998). The salt pond was also covered by a shelter to maintain the concentration of NaCl in the soil. The plant spacing was maintained at 0.15 m×0.06 m with three replicates of each individual. Two individuals that were sensitive to salt stress were used as the control cultivars. During the seedling period, the soil salt concentration was increased to 0.4% by adding NaCl and water, and maintained for 7 d. The concentration of NaCl in the soil was checked according to the method described by Ye and Liu (1998). The relative survival rate was likewise used as the measure of salt tolerance (Ye and Liu 1998). The trait values were analyzed using the Statistical Analysis System (SAS, SAS Institute Inc., USA).
Genotyping with SSR markers
DNA was extracted from the young and fully expanded leaves of each accession (Paterson and Smith 1999). The sequences of SSR primers were downloaded from the CMD (cotton marker database, www.cottonmarker.org/cgi-bin/panel.cgi). A total of 106 polymorphic SSRs, covering the 26 chromosomes of cotton, were screened using a panel of twenty cultivars chosen from these groups on the basis of their different phenotypes (Appendix B). The polymerase chain reaction (PCR) was performed in 10 μL volumes containing 1.0 μL 10× buffer (consisting of 20 mmol L⁻¹ MgSO4, 100 mmol L⁻¹ KCl, 80 mmol L⁻¹ (NH4)2SO4, 100 mmol L⁻¹ Tris-HCl at pH 9.0, and 0.5% NP-40), 50 ng template DNA, 0.5 mmol L⁻¹ dNTP, 0.4 U of Taq DNA polymerase (ET101-02) (TIANGEN Biotech, Beijing), and 0.5 µmol L⁻¹ forward and reverse primers. The PCR amplification procedure included a 3-min pre-denaturation step at 95°C, followed by 30 cycles of 94°C for 45 s, 57°C for 45 s and 72°C for 1 min, concluding with a 7-min extension at 72°C. The reactions were run on a PTC-100™ thermocycler (MJ Research Inc., USA). The PCR product was stored at 4°C before being run on an 8% non-denaturing PAGE gel (Sambrook et al. 1989). The gel was stained according to the method of Zhang et al. (2000) and then photographed using a GeneGenius gel light imaging system (Syngene, UK).
Allele diversity and population structure
G. hirsutum is an allotetraploid cotton, so a marker can produce more than one band. When a marker produced a single band, the allele was scored as 1, whereas when a marker produced more than two bands, alleles were scored as 1, 2, 3, and 4 according to the number of bands. Missing data were coded as -9. Diversity and heterozygosity were calculated from the data of 106 polymorphic SSRs in 323 individuals. Allele frequencies were calculated using SpaGeDi ver. 1.3 software (Hardy et al. 2002). The polymorphic information content (PIC) was analyzed using PowerMarker 3.25 software (Liu and Muse 2005). Genetic similarity coefficients were calculated using the Numerical Taxonomy Multivariate Analysis System (NTSYSpc) ver. 2.1 software (Exeter Software, USA). The genetic distance (GD) was estimated using the Neighbor Joining (N-J) algorithm with the minimum evolution objective function.
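For readers who want to see how the diversity statistics relate to allele frequencies, the following sketch computes PIC and the effective number of alleles for one marker. It assumes the widely used PIC definition, 1 - Σp_i² - Σ_{i<j} 2p_i²p_j², and the effective allele number 1/Σp_i²; whether the software packages named above apply exactly these forms is an assumption here, and the frequencies are hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """Polymorphism information content from allele frequencies."""
    homozygosity = sum(p * p for p in freqs)
    # Correction term summed over all unordered pairs of alleles.
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1 - homozygosity - cross


def effective_alleles(freqs):
    """Effective number of alleles, the reciprocal of homozygosity."""
    return 1 / sum(p * p for p in freqs)


# Hypothetical allele frequencies for a three-allele SSR marker.
freqs = [0.5, 0.3, 0.2]
print(round(pic(freqs), 4))                # 0.5478
print(round(effective_alleles(freqs), 2))  # 2.63
```

A marker with two equally frequent alleles gives a PIC of 0.375 under this definition, which is a convenient sanity check.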
Population structure analysis
Bayesian clustering was performed using STRUCTURE software for the population structure analysis (Pritchard et al. 2000a). The number of populations tested was denoted K, where K varied from 1 to 10. STRUCTURE was run under the admixture model with a run length of 100 000 and 10 000 replications after burn-in. Since we did not find distinct clusters and could not determine a significant number of K populations using STRUCTURE alone, we plotted ΔK to find the proper value of K following the method of Evanno et al. (2005).
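The ΔK statistic given in the caption of Fig. 1 can be sketched as follows. The sketch assumes that several replicate STRUCTURE runs are available for each K (needed for the standard deviation s[L(K)]) and uses the mean log-likelihoods in the numerator, which is one common way of computing it; the number of replicates and all log-likelihood values shown are hypothetical.

```python
from statistics import mean, stdev


def delta_k(log_likelihoods):
    """Compute Delta K for every K that has both neighbours K-1 and K+1.

    log_likelihoods: dict mapping K to a list of L(K) values obtained
    from replicate runs (at least two runs per K).
    """
    result = {}
    for k in sorted(log_likelihoods):
        if k - 1 in log_likelihoods and k + 1 in log_likelihoods:
            # Absolute second-order change of the mean log-likelihood.
            second_order = abs(
                mean(log_likelihoods[k + 1])
                - 2 * mean(log_likelihoods[k])
                + mean(log_likelihoods[k - 1])
            )
            result[k] = second_order / stdev(log_likelihoods[k])
    return result


# Hypothetical replicate log-likelihoods for K = 1..5.
runs = {
    1: [-55000, -55010, -54990],
    2: [-50000, -50020, -49980],
    3: [-46000, -46005, -45995],
    4: [-45000, -45100, -44900],
    5: [-44500, -44700, -44300],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(best_k)  # 3
```

ΔK is undefined for the smallest and largest K tested, so the tested range has to extend at least one step beyond the K values of interest.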
Linkage disequilibrium
The LD parameter r² was estimated using TASSEL 2.1 software (http://www.maizegenetics.net). LD between all pairs of SSR alleles was analyzed with minor allele frequency (MAF)-filtered datasets, in which SSR alleles with a frequency of 0.05 or less among the genotyped accessions were removed before the LD analyses. The MAF filtering was performed using the TASSEL site filtration function. LD was estimated as a weighted average of squared allele-frequency correlations between SSR loci. The significance of pairwise LD (P-value≤0.005) among all possible SSR loci was evaluated using TASSEL with the rapid permutation test in 10 000 shuffles. The LD values between all pairs of SSR loci were plotted as LD plots using TASSEL to obtain a general view of genome-wide LD patterns and evaluate 'block-like' LD structures.
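The squared allele-frequency correlation underlying these estimates can be illustrated for the simplest case of two biallelic loci. The sketch below computes r² = D²/(p_A(1-p_A)p_B(1-p_B)) with D = p_AB - p_A·p_B. It assumes homozygous lines scored 0/1 for one allele at each locus with no missing data, and it does not reproduce the weighting across multiple alleles that the software applies; the function name and the scores are hypothetical.

```python
def r_squared(locus_a, locus_b):
    """r^2 between two biallelic loci scored 0/1 over the same lines."""
    n = len(locus_a)
    p_a = sum(locus_a) / n  # frequency of allele 1 at locus A
    p_b = sum(locus_b) / n  # frequency of allele 1 at locus B
    # Frequency of lines carrying allele 1 at both loci.
    p_ab = sum(a * b for a, b in zip(locus_a, locus_b)) / n
    d = p_ab - p_a * p_b    # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else float("nan")


# Hypothetical scores for eight lines at two loci.
locus_a = [1, 1, 1, 1, 0, 0, 0, 0]
locus_b = [1, 1, 1, 0, 0, 0, 0, 1]
print(r_squared(locus_a, locus_b))  # 0.25
```

Two loci with identical scores give r² = 1, and a monomorphic locus returns NaN because the correlation is undefined.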
Association studies
The general linear model (GLM) association test was performed using the TASSEL 2.1 software, incorporating drought tolerance, salt tolerance, the SSR genotypes, and the Q matrix. The Q matrix of the population was set as a covariate, and 1 000 permutations were run to correct for multiple testing. The Q matrix was created with K=3, as determined by STRUCTURE. The mixed linear model (MLM) was also tested, incorporating the traits, the genotypes, and the Q and K matrices. The K matrix was created by calculating pairwise kinship coefficients. The phenotypic allele effect was estimated using the method described by Breseghello and Sorrells (2006):
$$a_i = \frac{\sum_j x_{ij}}{n_i} - \frac{\sum_k N_k}{n_k}$$
where $a_i$ is the phenotypic effect of allele $i$, $x_{ij}$ is the phenotype value of the $j$-th individual carrying allele $i$, $n_i$ is the total number of individuals carrying allele $i$, $N_k$ is the phenotype value of the $k$-th individual with the null allele at $i$, and $n_k$ is the total number of individuals with the null allele.
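The allele effect defined above is the difference between two group means, which the following minimal sketch makes explicit. The function name `allele_effect` and the phenotype values are hypothetical and serve only to show the calculation.

```python
def allele_effect(phenotypes, has_allele):
    """Mean phenotype of carriers minus mean phenotype of non-carriers.

    phenotypes: phenotype value of each individual.
    has_allele: matching sequence of 1/0 (or True/False) flags marking
    whether the individual carries the allele or the null allele.
    """
    carriers = [y for y, g in zip(phenotypes, has_allele) if g]
    nulls = [y for y, g in zip(phenotypes, has_allele) if not g]
    return sum(carriers) / len(carriers) - sum(nulls) / len(nulls)


# Hypothetical relative survival rates for five individuals.
phenotypes = [85.0, 84.0, 86.0, 80.0, 82.0]
has_allele = [1, 1, 1, 0, 0]
print(allele_effect(phenotypes, has_allele))  # 4.0
```

A positive value indicates that carriers of the allele have a higher mean phenotype than individuals with the null allele, and a negative value the opposite.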
Acknowledgements
This research was supported by the National Natural Science Foundation of China (31201246) and the Project of International Science and Technology Cooperation and Exchange from the Ministry of Science and Technology, China (2010DFR30620-3). We thank Zhang Xuehai from Huazhong Agricultural University for help with the analysis. The appendices associated with this paper are available at http://www.ChinaAgriSci.com/V2/En/appendix.htm
References
Abdurakhmonov I Y, Kohel R J, Yu J Z, Pepper A E, Abdullaev A A, Kushanov F N, Salakhutdinov I B, Buriev Z T, Saha S, Scheffler B E, Jenkins J N, Abdukarimov A. 2008. Molecular diversity and association mapping of fiber quality traits in exotic G. hirsutum L. germplasm. Genomics, 92, 478-487.
Abdurakhmonov I Y, Saha S, Jenkins J N, Buriev Z T, Shermatov S E, Scheffler B E, Pepper A E, Yu J Z, Kohel R J, Abdukarimov A. 2009. Linkage disequilibrium based association mapping of fiber quality traits in G. hirsutum L. variety germplasm. Genetica, 136, 401-417.
Blenda A, Scheffler J, Scheffler B, Palmer M, Lacape J M, Yu J Z, Jesudurai C, Jung S, Muthukumar S, Yellambalase P, Ficklin S, Staton M, Eshelman R, Ulloa M, Saha S, Burr B, Liu S, Zhang T Z, Fang D Q, Pepper A. 2006. CMD: A cotton microsatellite database resource for Gossypium genomics. BMC Genomics, 7, 132.
Breseghello F, Sorrells M E. 2006. Association mapping of kernel size and milling quality in wheat (Triticum aestivum L.) cultivars. Genetics, 172, 1165-1177.
Doebley J, Stec A, Hubbard L. 1997. The evolution of apical dominance in maize. Nature, 386, 485-488.
Evanno G, Regnaut S, Goudet J. 2005. Detecting the number of clusters of individuals using the software structure: A simulation study. Molecular Ecology, 14, 2611-2620.
Goldstein D B, Tate S K, Sisodiya S M. 2003. Pharmacogenetics goes genomic. Nature Reviews Genetics, 4, 937-947.
Hardy O J, Vekemans X. 2002. SpaGeDi: A versatile computer program to analyze spatial genetic structure at the individual or population levels. Molecular Ecology Notes, 2, 618-620.
Ivandic V, Thomas W T B, Nevo E, Zhang Z, Forster B P. 2003. Associations of simple sequence repeats with quantitative trait variation including biotic and abiotic stress tolerance in Hordeum spontaneum. Plant Breeding, 122, 300-304.
Kantartzi S K, Stewart J M. 2008. Association analysis of fibre traits in Gossypium arboreum accessions. Plant Breeding, 127, 173-179.
Kohel R J, Yu J, Park Y H, Lazo G R. 2001. Molecular mapping and characterization of traits controlling fiber quality in cotton. Euphytica, 121, 163-172.
Kraakman A T W, Rients E N, Petra M M, van den B M, Stam P, van E F A. 2004. Linkage disequilibrium mapping of yield and yield stability in modern spring barley cultivars. Genetics, 168, 435-446.
Lacape J M, Dessauw D, Rajab M, Noyer J L, Hau B. 2007. Microsatellite diversity in tetraploid Gossypium germplasm: assembling a highly informative genotyping set of cotton SSRs. Molecular Breeding, 19, 45-58.
Lacape J M, Llewellyn D, Jacobs J, Arioli T, Becker D, Calhoun S, Al-Ghazi Y, Liu S, Georges O P S, Giband M, de Assuncao H, Barroso P, Claverie M, Gawryziak G, Jean J, Vialle M, Viot C. 2010. Meta-analysis of cotton fiber quality QTLs across diverse environments in a Gossypium hirsutum×G. barbadense RIL population. BMC Plant Biology, 10, 132.
Levi A, Ovnat L, Paterson A H, Saranga Y. 2009a. Photosynthesis of cotton near-isogenic lines introgressed with QTLs for productivity and drought related traits. Plant Science, 177, 88-96.
Levi A, Paterson A H, Barak V, Yakir D, Wang B, Chee P W, Saranga Y. 2009b. Field evaluation of cotton near-isogenic lines introgressed with QTLs for productivity and drought related traits. Molecular Breeding, 23, 179-195.
Lilley J M, Ludlow M M, McCouch S R, O’Toole J C. 1996. Locating QTL for osmotic adjustment and dehydration tolerance in rice. Journal of Experimental Botany, 47, 1427-1436.
Liu J, Ye W, Fan B. 1998. Studying and utilization of resistance of cotton in China. China Cotton, 25, 5-6.
Liu K, Muse S V. 2005. PowerMarker: An integrated analysis environment for genetic marker analysis. Bioinformatics, 21, 2128-2129.
Ma X X, Ding Y Z, Zhou B L, Guo W Z, Lv Y L, Zhu X F, Zhang T Z. 2008. QTL mapping in A-genome diploid Asiatic cotton and their congruence analysis with AD-genome tetraploid cotton in genus Gossypium. Journal of Genetics and Genomics, 35, 751-762.
Mansur L M, Lark K G, Kross H, Oliveira A. 1993. Interval mapping of quantitative trait loci for reproductive, morphological, and seed traits of soybean (Glycine max L.). Theoretical and Applied Genetics, 86, 907-913.
Muhammad B, Yehoshua S, Zafar I, Muhammad A, Yusuf Z, Edward L, Peng C. 2009. Identification of QTLs and impact of selection from various environments (dry vs. irrigated) on the genetic relationships among the selected cotton lines from f6 population using a phylogenetic approach. African Journal of Biotechnology, 8, 4802-4810.
Muhammad S, Guo W Z, Ihsan U, Tabbasam N, Zafar Y, Rahman M, Zhang T Z. 2011. QTL mapping for physiology, yield and plant architecture traits in cotton (Gossypium hirsutum L.) grown under well-watered versus water-stress conditions. Electronic Journal of Biotechnology, 14, 1-13.
Muhammad S, Guo W Z, Zhang T Z. 2014. Association mapping for salinity tolerance in cotton (Gossypium hirsutum L.) germplasm from US and diverse regions of China. Australian Journal of Crop Science, 8, 338-346.
Paterson A H, Smith R H. 1999. Future horizons: Biotechnology for cotton improvement. In: Smith C W, Cothren J T, eds., Cotton: Origin, History, Technology, and Production. John Wiley & Sons, Inc., New York. pp. 415-432.
Pritchard J K, Stephens M, Donnelly P. 2000a. Inference of population structure using multilocus genotype data. Genetics, 155, 945-959.
Pritchard J K, Stephens M, Rosenberg N A, Donnelly P. 2000b. Association mapping in structured populations. The American Journal of Human Genetics, 67, 170-181.
Pritchard J K, Wen W. 2003. Documentation for Structure Software. ver. 2. Department of Human Genetics, University of Chicago, Chicago.
Rong J K, Feltus F A, Waghmare V N, Pierce G J, Chee P W, Draye X, Saranga Y, Wright R J, Wilkins T A, May O L, Smith C W, Gannaway J R, Wendel J F, Paterson A H. 2007. Meta-analysis of polyploid cotton QTL shows unequal contributions of subgenomes to a complex network of genes and gene clusters implicated in lint fiber development. Genetics, 176, 2577-2588.
Rungis D, Llewellyn D, Dennis E S, Lyon B R. 2005. Simple sequence repeat (SSR) markers reveal low levels of polymorphism between cotton (Gossypium hirsutum L.) cultivars. Australian Journal of Agricultural Research, 56, 301-307.
Sambrook J, Fritsch E F, Maniatis T. 1989. Molecular Cloning, vol. 2. Cold Spring Harbor Laboratory Press, New York.
Saranga Y, Menz M, Jiang C X, Wright R J, Yakir D, Paterson A H. 2001. Genomic dissection of genotype x environment interactions conferring adaptation of cotton to arid conditions. Genome Research, 11, 1988-1995.
Stich B, Melchinger A E, Frisch M, Maurer H P, Heckenberger M, Reif J C. 2005. Linkage disequilibrium in European elite maize germplasm investigated with SSRs. Theoretical and Applied Genetics, 111, 723-730.
Thornsberry J M, Goodman M M, Doebley J, Kresovich S, Nielsen D, Buckler E S. 2001. Dwarf8 polymorphisms associate with variation in flowering time. Nature Genetics, 28, 286-289.
Tuberosa R, Sanguineti M C, Landi P, Salvi S, Casarini E, Conti S. 1998. RFLP mapping of quantitative trait loci controlling abscisic acid concentration in leaves of drought-stressed maize (Zea mays L.). Theoretical and Applied Genetics, 97, 744-755.
Viviane J D, Ed S B, Bruce D S, Thomas P G, Alan C, Doebley J, Pääbo S. 2003. Early allelic selection in maize as revealed by ancient DNA. Science, 302, 1206-1208.
Wang R L, Stec A, Hey J, Lukens L, Doebley J. 1999. The limits of selection during maize domestication. Nature, 398, 236-239.
Weiss K M, Clark A G. 2002. Linkage disequilibrium and the mapping of complex human traits. Trends in Genetics, 18, 19-24.
Ye W, Liu J. 1998. The method of evaluating the salt stress in cotton and utilization. China Cotton, 25, 34-38. (in Chinese)
Zhang H B, Li Y, Wang B, Chee P W. 2008. Recent advances in cotton genomics. International Journal of Plant Genomics, 2008, 742304.
Zhang J, Wu Y T, Guo W Z, Zhang T Z. 2000. Fast screening of microsatellite markers in cotton with PAGE/silver staining. Cotton Science, 12, 267-269. (in Chinese)
Zhang X, Zhen J B, Li Z H, Kang D M, Yang Y M, Kong J, Hua J P. 2011. Expression profile of early responsive genes under salt stress in upland cotton (Gossypium hirsutum L.). Plant Molecular Biology Reporter, 29, 626-637.
Zhang Z S, Hu M H, Zhang J, Liu D J, Zheng J, Zhang K, Wang W, Wan Q. 2009. Construction of a comprehensive PCR-based marker linkage map and QTL mapping for fiber quality traits in upland cotton (Gossypium hirsutum L.). Molecular Breeding, 24, 49-61.
(Managing editor WANG Ning)
© 2014, CAAS. All rights reserved. Published by Elsevier Ltd.