Agricultural Sciences in China
December 2010
2010, 9(12): 1713-1725
Determination of the Number of SSR Alleles Necessary for the Analysis of Genetic Relationships Between Maize Inbred Lines
WU Cheng-lai, LI Sheng-fu, DONG Bing-xue, ZHANG Qian-qian and ZHANG Chun-qing
State Key Laboratory of Crop Biology/College of Agriculture, Shandong Agricultural University, Tai’an 271018, P.R.China
Abstract
The amount of molecular marker information has a considerable impact on the results of studies of genetic relationships among crop germplasm. The number of alleles required to reveal genetic relationships among maize inbred lines is a theoretical issue that needs to be addressed. In this study, 112 pairs of SSR (simple sequence repeat) primers and 97 maize inbred lines were used to study the relationship between the number of inbred lines and the numbers of SSR primers and alleles required for stable clustering. The results showed that the number of SSR primers is not tightly associated with the stability of cluster analysis results, whereas an increase in the number of alleles significantly improves that stability. The number of inbred lines (X) is significantly associated with the number of alleles required for stable cluster analysis (Y), and the regression equation is $Y = 600.8 \times e^{-15.9/X}$. This equation can be used to calculate the number of SSR alleles required for a genetic relationship study of maize inbred lines. These results provide a reference for determining the number of SSR alleles needed in genetic relationship analyses of maize inbred lines and other crop germplasm.
Key words: maize, inbred line, SSR, number of alleles, genetic relationship
INTRODUCTION
The clustering analysis of crop varieties or inbred lines is an important method for studying the genetic relationships of crop germplasm. Phenotypic markers, biochemical markers and heterosis indices were previously applied to germplasm cluster analysis, although certain limitations were noted in these studies (Wu 1983; Smith et al. 1990; Zeng 1990; Wang et al. 1997). In recent years, molecular markers based on DNA polymorphisms have overcome many limitations of traditional markers and have become a very important tool in the study of germplasm genetic relationships. Melchinger et al. (1990) categorized 12 maize inbred lines into different heterotic groups according to polymorphic variation at 304 RFLP (restriction fragment length polymorphism) loci. The result was generally consistent with known genealogical relationships. Using 135 RFLPs and 209 AFLPs (amplified fragment length polymorphisms), Marsan et al. (1998) analyzed the genetic relationships between 13 maize inbred lines and explored the relationship between genetic distance and heterosis. Rongwen et al. (1995) effectively differentiated 95 soybean [Glycine max (L.) Merr.] germplasms by analyzing 130 alleles using seven pairs of SSR primers. Plaschke et al. (1995) conducted cluster analysis on the genetic diversity of 40 common bread wheat varieties by analyzing 142 alleles using 23 pairs of SSR primers. Struss and Plieske (1998) studied the genetic diversity of 163 barley [Hordeum vulgare L.] varieties by analyzing 130 alleles with 15 pairs of SSR primers. Garland et al. (1999) carried out a genetic distance study and cluster analysis on 43 rice
Received 17 March, 2010
Accepted 7 May, 2010
WU Cheng-lai, Ph D, Tel: +86-538-8242458, E-mail: [email protected]; Correspondence ZHANG Chun-qing, Ph D, Professor, Tel: +86-538-8242682, E-mail: cqzhang@sdau.edu.cn
© 2010, CAAS. All rights reserved. Published by Elsevier Ltd. doi:10.1016/S1671-2927(09)60270-4
[Oryza sativa L.] varieties by analyzing 115 alleles with 10 pairs of SSR primers. Senior et al. (1998) performed a cluster analysis on 94 maize inbred lines by analyzing 365 alleles with 70 pairs of SSR primers. Li et al. (2000) categorized heterotic groups in 21 maize inbred lines by analyzing 127 alleles with 43 pairs of SSR markers. You et al. (2004) performed a cluster analysis on 96 wheat varieties by analyzing 802 alleles with 104 pairs of primers. Teng et al. (2004) studied maize heterotic groups and patterns during the past decade in China by analyzing 660 alleles with 111 pairs of SSR primers. Liu et al. (2006) analyzed heterotic patterns of maize hybrids used in Henan Province, China, using 485 alleles with 95 pairs of primers. Bao et al. (2007) studied genetic diversity in 98 East Asian Pyrus species by analyzing 168 alleles with six pairs of SSR primers. Silvestrini et al. (2008) studied genetic diversity in 180 coffee germplasms by using 95 RAPDs (random amplified polymorphic DNA). Yoon et al. (2009) studied genetic diversity in 2 758 local soybean varieties from Korea by analyzing 110 alleles with 6 pairs of SSR primers. Notably, the number of molecular markers (6-304), the number of materials (12-2 758) and the number of polymorphic alleles (110-802) all varied widely among these studies. In studies of genetic relationships between germplasms, the goal is to obtain true and reliable genetic relationships among the study materials using molecular markers. For this purpose, a certain number of molecular markers is needed to provide enough information. Beyond that point, however, adding more molecular markers does not induce significant changes in the genetic relationships of the materials (Wang et al. 2003). The amount of molecular marker information necessary to ensure the stability and reliability of genetic relationship analysis is a theoretical question in urgent need of an answer.
To address this question, many researchers have explored it in different crops using SSR markers. Zhang et al. (2002) studied the stability of cluster analysis through the relationship between the similarity coefficient matrices of 43 wheat varieties computed from different numbers of alleles. The authors claimed that 350 alleles were sufficient for a stable cluster analysis of 43 wheat varieties. Wang et al. (2003) analyzed the groups of 190 soybean varieties cultivated in China and suggested that the minimum number of alleles required to study genetic relationships in soybean varieties cultivated in China was 570 or above. Yang et al. (2005) studied the groups of common wild rice populations in Yulin, Guangxi Province, China, and suggested that approximately 200 microsatellite alleles evenly distributed in the genome were adequate to determine precise genetic relationships between individuals. You et al. (2004) performed cluster analysis on 96 wheat varieties using SSR markers and suggested that the minimum number of alleles needed to reveal the true genetic relationships between study materials was 550, regardless of the number of study lines. However, in genetic relationship studies of maize germplasm, the relationship between the number of inbred lines and the numbers of alleles and SSR primers needed for a stable cluster analysis is not clear. So far, no reports are available on whether this relationship is consistent across different crops. To determine the relationship above, 112 pairs of SSR primers and 97 maize inbred lines were used in this study.
MATERIALS AND METHODS
Plant materials
Ninety-seven maize inbred lines and their original sources are listed in Table 1. Of these, 68 inbred lines are commonly used for breeding and cover the major heterotic groups of current maize inbred lines in China. The other 29 inbred lines were selected by our research group.
Table 1 The maize inbred lines and their original sources
Code | Inbred line | Source
1 | Ye 478 | U8112×Shen 5003
2 | Qi 319 | Recycled line from U.S. hybrid 78599
3 | Huangzao 4 | Pollinated plant of inbred line Sipingtou
4 | Shen 5003 | Recycled line from U.S. hybrid 3147
5 | Dan 340 | Baigulü 9×Pod corn
6 | Zheng 58 | Derived from Ye 478
7 | Chang 7-2 | (Huangzao 4×Wei 95)×S901
8 | B38-2 1) | Recycled line from hybrid 0638
9 | 196 | D340×Huangzao 4
10 | Danhuang 212 | D729×Huangzao 4
11 | 1029 | Recycled line from U.S. hybrid XL80
12 | B104-1 1) | Group 2004
13 | X178 | Recycled line from U.S. hybrid 78599
14 | B57 1) | Group 2004
15 | Zhonghuang 204 | Derived from Mo 17
16 | Mo 17 | C103×187-2
17 | V77 | Unknown
18 | 7137 | Unknown
19 | Lx 9801 | 502/H21
20 | Huangye 4 | (Yejihong×Huangzao 4)×Dunzihuang
21 | B178 1) | Group 2004
22 | B52-2 1) | Group 2004
23 | 65232 | 6 237×Shen 5003
24 | D26-2 1) | Group 2004
25 | Cai 11-8 | Menke B×Zi 330
26 | Liao 6107-2 | Recycled line from Va35 Synthetic
27 | Qi 310 | Jin 21×Huangzao 4
28 | 84 | Unknown
29 | Ye 488 | U8 11×Shen 5003
30 | K14 | 5 005×6 917
31 | Ji 853 | Huangzao 4×Zi 330
32 | K12 | Derived from Huangzao 4
33 | B302 1) | Group 2004
34 | W616 | Unknown
35 | 52106 | (Aijin 525×Ye 107)×106
36 | 502 | Dan 340×Huangzao 4
37 | B79-2 1) | Group 2004
38 | Qi 318 | Recycled line from U.S. hybrid 78599
39 | 4Zi4 | (Huangzao 4×Zi 330)×Huangzao 4
40 | Zheng 22 | Dan 340×E28
41 | 4866 | Tie 7922×Ye 478
42 | B358 1) | Group 2004
43 | 543 | Unknown
44 | H21 | Huangzao 4×H84
45 | B117 1) | Group 2004
46 | Y17 | Unknown
47 | W518-1 | Unknown
48 | B121 1) | Group 2004
49 | A183-1 1) | Group 2004
50 | Yousuixuan-1 | Unknown
51 | B105-1-1 1) | Group 2004
52 | D18-1 1) | Group 2004
53 | Huang C | (Huangxiao 162×Zi 330)×Tuxpeno
54 | Y30-1-1 | Unknown
55 | E4-2 1) | Group 2004
56 | AM311 | Unknown
57 | 8723 | U8112×Ye 107
58 | VT187-4 | Unknown
59 | 03Qun-3-3 1) | Group 2003
60 | F22-1 1) | Group 2004
61 | Danhuang 02 | Recycled line from E28 synthetic
62 | Lv 9 | Lvda Red Cob
63 | B283-1 1) | Group 2004
64 | Luyuan 92 | Yuanqi 122×1137
65 | Y3 | Unknown
66 | Shen 118 | Zhao 23×Super Sweet
67 | B50 1) | Group 2004
68 | Zheng 653 | (5003×Zong 31)×5003
69 | D5-1-1 1) | Group 2004
70 | Longkang 11 | Zi 330×Mo 17
71 | A386-1 1) | Group 2004
72 | Y36 | Unknown
73 | A210-1 1) | Group 2004
74 | BJian8 | (B73×Jianrui 2)×U8112
75 | A348-4-2 1) | Group 2004
76 | D25-2 1) | Group 2004
77 | A22 | Unknown
78 | Liao 308 | Liao 85-308×foreign material
79 | D14-1 1) | Group 2009
80 | D10-2 1) | Group 2009
81 | 5237 | D340×Huangzao 4
82 | 3904 | Unknown
83 | 9418 | Unknown
84 | Shen 137 | Recycled line from U.S. hybrid 6JK111
85 | Ji 842 | Ji 63×Mo 17
86 | A150-3-1 1) | Group 2004
87 | ZH-2-1 | Unknown
88 | P138 | Recycled line from U.S. hybrid 78599
89 | Y26-1-1 | Unknown
90 | 3189 | Shen 5003×U8112
91 | W618 | Unknown
92 | 3841 | Mo 17×Huobai
93 | Shen 5005 | Recycled line from U.S. hybrid 3147
94 | B40-2 1) | Recycled line from hybrid 0638
95 | B209 1) | Group 2004
96 | Y32-1-1-1 | Unknown
97 | 8001 | 3189×Ye 488
1) Inbred lines produced by our research group. Group 2004 is a free-pollination group consisting of Nongda 108, Nongda 3138, Yedan 2, Yedan 13, and Nongdan 5; Group 2003 is a free-pollination group consisting of five US hybrid lines such as Xianyu 335.

DNA extraction and SSR analysis
Leaves from three seedlings were obtained for each inbred line to extract DNA using the CTAB (cetyltrimethyl ammonium bromide) method (Saghai-Maroof et al. 1984). The DNA concentration was measured using a SmartSpec™ Plus spectrophotometer and the quality of the DNA was determined by electrophoresis in 0.8% agarose. DNA was diluted to 20 ng µL⁻¹ with TE buffer before use. Of the 200 pairs of SSR primers, 112 pairs with a high polymorphism rate and even distribution across the 10 maize chromosomes were selected (Table 2) and used in PCR amplification of genomic DNA of the inbred lines. The PCR program was as follows: 94°C for 5 min; 34 cycles of denaturation at 94°C for 1 min, annealing at 60°C for 2 min, and extension at 70°C for 2 min; and a final extension at 70°C for 5 min. PCR products were separated on 9% non-denaturing polyacrylamide gel in 1×TBE buffer for 50 min and stained with the silver method (Li et al. 2001).
Table 2 The polymorphism information content of the 112 pairs of primers in the 97 inbred lines
SSR primer | Bin no. | Allele number | PIC
umc1071 | 1.01 | 8 | 0.71
umc1976 | 1.02 | 3 | 0.43
phi001 | 1.03 | 9 | 0.68
phi109275 | 1.03 | 4 | 0.66
umc1397 | 1.03 | 2 | 0.44
umc1917 | 1.04 | 2 | 0.23
umc1124 | 1.05 | 5 | 0.47
umc1035 | 1.06 | 8 | 0.80
umc1590 | 1.06 | 6 | 0.77
umc1122 | 1.06-1.07 | 3 | 0.52
umc1147 | 1.07 | 4 | 0.26
bnlg1564 | 1.07 | 6 | 0.61
phi120 | 1.11 | 2 | 0.66
phi064 | 1.11 | 7 | 0.72
phi96100 | 2.00-2.01 | 7 | 0.77
bnlg1017 | 2.02 | 3 | 0.54
umc1823 | 2.02 | 7 | 0.52
umc1555 | 2.03 | 6 | 0.47
bnlg1621 | 2.03 | 6 | 0.77
bnlg1036 | 2.06 | 4 | 0.50
nc003 | 2.06 | 7 | 0.73
bnlg1045 | 2.07 | 9 | 0.77
bnlg198 | 2.08 | 6 | 0.77
bnlg1520 | 2.09 | 5 | 0.60
umc2105 | 3.00-3.01 | 5 | 0.77
umc2101 | 3.00-3.01 | 7 | 0.80
umc1970 | 3.01 | 6 | 0.75
bnlg1144 | 3.02 | 13 | 0.73
bnlg1447 | 3.03 | 5 | 0.71
bnlg1452 | 3.04 | 7 | 0.70
phi036 | 3.04 | 5 | 0.70
umc1102 | 3.05 | 3 | 0.24
umc1307 | 3.05 | 5 | 0.49
bnlg197 | 3.06 | 5 | 0.66
bnlg1496 | 3.09 | 7 | 0.79
phi047 | 3.09 | 4 | 0.46
umc2148 | 4.01 | 5 | 0.77
umc1294 | 4.02 | 7 | 0.74
umc1896 | 4.05 | 7 | 0.71
nc005 | 4.05 | 9 | 0.79
bnlg1444 | 4.08 | 10 | 0.84
umc2188 | 4.08 | 10 | 0.83
umc1856 | 4.08 | 10 | 0.76
umc1808 | 4.08 | 11 | 0.76
bnlg2162 | 4.08 | 4 | 0.65
umc1058 | 4.11 | 6 | 0.65
phi024 | 5.01 | 4 | 0.64
umc1274 | 5.03 | 4 | 0.65
umc1221 | 5.04 | 9 | 0.73
bnlg1287 | 5.04 | 3 | 0.66
bnlg1074 | 5.04 | 4 | 0.36
umc1822 | 5.05 | 7 | 0.76
umc2164 | 5.05 | 6 | 0.76
mmc0081 | 5.05 | 10 | 0.87
umc1225 | 5.08 | 9 | 0.72
bnlg389 | 5.09 | 4 | 0.21
bnlg161 | 6 | 6 | 0.78
umc1133 | 6.01 | 5 | 0.61
bnlg1371 | 6.01 | 7 | 0.80
phi077 | 6.01 | 3 | 0.53
nc010 | 6.04 | 3 | 0.50
umc1014 | 6.04 | 7 | 0.77
umc1805 | 6.05 | 7 | 0.76
umc1859 | 6.06 | 6 | 0.74
phi123 | 6.07 | 2 | 0.25
umc1653 | 6.07-6.08 | 12 | 0.71
umc2177 | 7 | 6 | 0.59
umc1695 | 7 | 3 | 0.63
umc1066 | 7.01 | 3 | 0.58
umc1016 | 7.02 | 6 | 0.68
umc1983 | 7.02 | 5 | 0.73
umc1015 | 7.03 | 6 | 0.79
umc1936 | 7.03 | 4 | 0.51
umc1593 | 7.03-7.04 | 6 | 0.70
umc1944 | 7.04 | 5 | 0.70
umc0151 | 7.04 | 8 | 0.80
Dupssr13 | 7.04 | 5 | 0.79
phi116 | 7.06 | 4 | 0.65
umc1359 | 8 | 5 | 0.58
umc1414 | 8.01 | 5 | 0.62
bnlg1863 | 8.03-8.04 | 7 | 0.80
umc1562 | 8.05 | 7 | 0.63
umc1161 | 8.06 | 5 | 0.62
umc1268 | 8.07 | 2 | 0.50
bnlg1384 | 8.07 | 9 | 0.44
phi080 | 8.08-8.09 | 5 | 0.55
umc1663 | 8.08-8.09 | 4 | 0.56
phi015 | 8.08-8.09 | 6 | 0.78
umc1647 | 9 | 3 | 0.53
bnlg1724 | 9.01 | 5 | 0.66
umc2084 | 9.01 | 5 | 0.68
phi028 | 9.01 | 3 | 0.36
bnlg1401 | 9.02 | 10 | 0.81
bnlg2441 | 9.02 | 12 | 0.89
phi065 | 9.03 | 2 | 0.45
umc1267 | 9.03 | 2 | 0.50
bnlg127 | 9.03 | 3 | 0.46
umc1231 | 9.05 | 5 | 0.70
bnlg1270 | 9.05-9.06 | 6 | 0.69
phi041 | 10 | 7 | 0.73
umc1152 | 10.01-10.02 | 5 | 0.70
umc1367 | 10.03 | 3 | 0.32
bnlg1712 | 10.03 | 7 | 0.71
bnlg1518 | 10.04 | 7 | 0.80
umc2163 | 10.04 | 6 | 0.78
bnlg1526 | 10.04 | 4 | 0.65
bnlg1250 | 10.05 | 4 | 0.64
umc2122 | 10.06 | 3 | 0.46
bnlg2190 | 10.06 | 7 | 0.78
bnlg1677 | 10.07 | 5 | 0.70
umc11960 | 10.07 | 6 | 0.75
bnlg1450 | 10.07 | 8 | 0.75

Statistical analysis
Microsatellite profiles were scored as the presence (1) or absence (0) of clear bands. The polymorphism information content of each locus was computed as $PIC = 1 - \sum_i f_i^2$, where $f_i$ is the frequency of the $i$th allele at this locus. The genetic similarity coefficient between maize inbred lines was calculated as $GS = m/(m+n)$, where $m$ is the number of shared bands between genotypes and $n$ is the number of non-matching bands present. Based on the UPGMA method (unweighted pair group method using arithmetic averages), cluster analysis was performed using NTSYS-pc 2.10 software (Rohlf 2000).
Based on the descending order of the allelic number of each primer pair in the 97 inbred lines, the 112 pairs of primers were separated into eight primer sets with 14 pairs of primers in each set. The primers in each set, as well as the full 112 pairs of primers, were used to calculate the genetic similarity matrix of the 97 inbred lines. The correlation coefficient between the genetic similarity matrix from each set of primers (sub-matrix) and the genetic similarity matrix from the 112 pairs of primers (general matrix) was calculated using the MXCOMP program in NTSYS-pc 2.10 (Mantel 1967). A high correlation coefficient indicates that the clustering results from the sub-matrix and the general matrix match well, which means that the clustering results from the sub-matrix are more reliable. The effect of the allele number on the stability of clustering results was determined from the correlation between the number of alleles detected by each set of primers and the correlation coefficient of the corresponding matrices.
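The two coefficients defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the band profiles and allele frequencies are invented, the function names are hypothetical, and $n$ is read here as the number of bands present in exactly one of the two lines, which is one plausible reading of "non-matching bands present". The study itself used NTSYS-pc, not this code.

```python
from itertools import combinations


def pic(allele_frequencies):
    """PIC = 1 - sum(f_i^2) for the allele frequencies at one locus."""
    return 1.0 - sum(f * f for f in allele_frequencies)


def genetic_similarity(bands_a, bands_b):
    """GS = m / (m + n) for two 0/1 band profiles of equal length.

    m: bands present in both lines.
    n: bands present in exactly one line (assumed reading of "non-matching").
    """
    m = sum(1 for a, b in zip(bands_a, bands_b) if a == 1 and b == 1)
    n = sum(1 for a, b in zip(bands_a, bands_b) if a != b)
    return m / (m + n) if (m + n) else 0.0


def similarity_matrix(profiles):
    """Symmetric GS lookup keyed by (line, line) for a dict of band profiles."""
    names = list(profiles)
    gs = {(x, x): 1.0 for x in names}
    for x, y in combinations(names, 2):
        gs[(x, y)] = gs[(y, x)] = genetic_similarity(profiles[x], profiles[y])
    return gs


# Hypothetical band profiles (1 = band present, 0 = band absent).
profiles = {
    "line_A": [1, 0, 1, 1, 0, 1],
    "line_B": [1, 0, 0, 1, 1, 1],
    "line_C": [0, 1, 1, 0, 1, 0],
}
gs = similarity_matrix(profiles)
example_pic = pic([0.5, 0.3, 0.2])  # hypothetical three-allele locus
```

A dictionary keyed by line pairs keeps the sketch free of external dependencies; a real analysis of 97 lines would more naturally hold the similarities in a square array.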
Establishment of a regression model between the number of inbred lines and the number of SSR alleles required for stable clustering analysis
According to the cluster analysis results, inbred lines at a large genetic distance were removed from the 97 lines (Table 1), 10 at a time, to create 10 material sets containing 97, 87, 77, 67, 57, 47, 37, 27, 17, and 7 inbred lines. For each set of inbred lines, the genetic similarity coefficient matrix calculated from all 112 pairs of primers, i.e., from the total number of alleles produced by that set, served as the general matrix, and the genetic similarity coefficient matrices calculated from 1 to 111 pairs of primers served as sub-matrices. The MXCOMP program in NTSYS-pc 2.10 was applied to calculate the correlation coefficient between each sub-matrix and the corresponding general matrix (Mantel 1967). The number of alleles in the sub-matrix at which this correlation coefficient reached 0.9 was taken as the minimum number of alleles needed by that set of inbred lines for cluster analysis. DPS software was used to establish the regression model between the number of inbred lines in each of the 10 material sets and the number of SSR alleles needed for cluster analysis.
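The search for the minimum number of alleles can be sketched as follows. This is a simplified stand-in, not the MXCOMP procedure: it uses a plain Pearson correlation over the pairwise similarity values, adds allele columns one at a time rather than primer pairs, and runs on randomly generated band profiles. All names and data are hypothetical.

```python
import random
from itertools import combinations
from math import sqrt


def gs(a, b):
    """GS = m / (m + n) for two 0/1 band profiles (n = bands in exactly one line)."""
    m = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n = sum(1 for x, y in zip(a, b) if x != y)
    return m / (m + n) if (m + n) else 0.0


def pairwise_gs(profiles, n_alleles):
    """GS for every pair of lines, using only the first n_alleles columns."""
    return [gs(p[:n_alleles], q[:n_alleles]) for p, q in combinations(profiles, 2)]


def pearson(u, v):
    """Pearson correlation between two equal-length lists (0.0 if either is constant)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv) if suu and svv else 0.0


def minimum_alleles(profiles, threshold=0.9):
    """Smallest allele count whose sub-matrix correlates >= threshold with the general matrix."""
    total = len(profiles[0])
    general = pairwise_gs(profiles, total)
    for k in range(1, total + 1):
        if pearson(pairwise_gs(profiles, k), general) >= threshold:
            return k
    return total


# Synthetic data: 10 hypothetical lines scored at 200 alleles.
random.seed(0)
profiles = [[random.randint(0, 1) for _ in range(200)] for _ in range(10)]
k = minimum_alleles(profiles)
```

On real data the columns would be ordered by primer pair, so that each step adds all alleles of one more primer pair, as described in the text.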
RESULTS
Analysis of SSR marker information
Among 200 pairs of SSR primers evenly distributed on the 10 chromosomes of maize, 112 pairs that produced clear and steady amplification bands were selected (Table 2). There were 10-14 pairs of primers corresponding to each chromosome. A total of 643 alleles were obtained from the 97 lines with the 112 pairs of SSR primers. Each pair of primers detected 2 to 13 alleles, with an average of 5.73. The polymorphism information content provided by the 112 pairs of SSR primers differed markedly (Table 2). The average PIC of the 112 pairs of primers was 0.64 (range 0.21 to 0.89). The average allelic number of the primer sets also varied considerably among chromosomes (Table 3). The average allelic number and average polymorphism information content were highest on chromosome 4, with values of 8.5 and 0.75, respectively. The average numbers of alleles from primers on chromosomes 1 and 9 were relatively small, only 4.9 and 4.4, respectively; the average PIC of these primers was also low, 0.57 and 0.58, respectively. These results provide a reference for the selection of primers for studies of genetic relationships.
Table 3 Polymorphism information content of the primers targeting different chromosomes
Chromosome | Number of primers | Alleles | Average alleles | Total PIC | Average PIC
1 | 14 | 69 | 4.9 | 7.97 | 0.57
2 | 10 | 54 | 5.4 | 5.66 | 0.63
3 | 12 | 73 | 6.1 | 7.80 | 0.65
4 | 10 | 85 | 8.5 | 8.27 | 0.75
5 | 10 | 66 | 6.6 | 6.35 | 0.64
6 | 10 | 52 | 5.2 | 5.67 | 0.63
7 | 12 | 61 | 5.1 | 7.45 | 0.68
8 | 10 | 55 | 5.5 | 6.09 | 0.61
9 | 10 | 44 | 4.4 | 5.82 | 0.58
10 | 13 | 72 | 5.5 | 8.77 | 0.67
Total | 111 | 631 | 5.7 | 71.31 | 0.64

Categorization of the 97 inbred lines
Genetic similarity (GS) among the 97 inbred lines was calculated based on 643 polymorphic SSR alleles and ranged from 0.666 to 0.924 (GS values not shown). The 97 inbred lines were categorized into nine clusters (Fig. 1-A) with the UPGMA method, which is generally consistent with the genealogical record. The clusters included four groups: Reid (group Reid germplasm derived from modern U.S. hybrids in China) with 30 lines, PB (group B germplasm derived from modern U.S. hybrids in China) with 22 lines, SPT (lines derived from Sipingtou, a Chinese landrace) with 21 lines, and LRC (lines derived from Lvda Red Cob, a Chinese landrace) with 7 lines; the representative inbred lines were Ye 478, Qi 319, Huangzao 4, and Lv 9, respectively. A total of 17 inbred lines, grouped into four clusters, were not assigned to the heterotic groups because of their complex genetic backgrounds.

Fig. 1 Dendrogram of the 97 inbred lines based on a genetic similarity matrix of 643 (A) and 498 (B) SSR alleles.

Impact of the numbers of primers and alleles on clustering results
The total number of alleles generated from each set of primers varied considerably, from 143 in the first set to 37 in the eighth (Table 4). The genetic similarity matrix of the 97 inbred lines was calculated using each of the eight sets of primers, and the correlation coefficient between the sub-matrices and the general matrix ranged from 0.37126 to 0.58496, increasing with the number of alleles. The correlation between the allelic number and the correlation coefficient between the sub-matrix and the general matrix, r = 0.92 [$r_{0.01}(6)$ = 0.834], was highly significant, suggesting that the number of alleles significantly affects the clustering results.

Table 4 Correlation coefficient between the general matrix and the sub-matrices from the primer sets with the same number of primers
Primer set | Number of primer pairs | Number of alleles | Correlation coefficient between the general matrix and the sub-matrix of each primer set
1 | 14 | 143 | 0.58496
2 | 14 | 104 | 0.57299
3 | 14 | 95 | 0.56133
4 | 14 | 84 | 0.51378
5 | 14 | 73 | 0.50132
6 | 14 | 65 | 0.48315
7 | 14 | 51 | 0.41048
8 | 14 | 37 | 0.37126

When the 112 pairs of primers were classified into eight sets with similar numbers of alleles, the number of primers in each set varied considerably: the eighth set contained 21 primer pairs, three times as many as the first set (Table 5). The correlation coefficients between the genetic similarity matrix of the 97 inbred lines from each of the eight sets of primers (sub-matrix) and the genetic similarity matrix from the 112 pairs of primers (general matrix) did not vary in any obvious pattern, suggesting that the number of primers has no significant effect on clustering results when the number of alleles is held constant.

Table 5 Correlation coefficient between the general matrix and the sub-matrices from different primer sets with similar numbers of alleles
Primer set | Number of primer pairs | Number of alleles | Correlation coefficient between the sub-matrix from each set of primers and the general matrix
1 | 7 | 78 | 0.52531
2 | 9 | 82 | 0.57762
3 | 11 | 80 | 0.54683
4 | 11 | 77 | 0.45567
5 | 14 | 79 | 0.54957
6 | 14 | 78 | 0.53582
7 | 16 | 78 | 0.45088
8 | 21 | 78 | 0.52263

Genetic similarity coefficient matrices of the 97 inbred lines were calculated using 1 to 111 pairs of primers (sub-matrices). The correlation coefficients between each sub-matrix and the general matrix calculated with the 112 pairs of primers showed an upward trend with increasing numbers of primers and alleles (Fig. 2). The correlation between the number of primers and these matrix correlation coefficients was 0.911 [$r_{0.01}(110)$ = 0.289], while the correlation between the number of alleles and the matrix correlation coefficients reached 0.935 [$r_{0.01}(110)$ = 0.289]. Path analysis showed that the indirect path coefficient from the number of primers, mediated by the number of alleles, to the matrix correlation coefficient was 0.955, demonstrating that the effect of the number of primers on clustering results was mediated by the number of alleles. Therefore, to improve the reliability of cluster analysis in studies of genetic relationships, it is more important to increase the number of alleles than the number of primers.

Fig. 2 Correlation coefficient between the genetic similarity matrix from different numbers of alleles (primers) and the genetic similarity matrix from the 112 pairs of primers.

The number of SSR alleles required for stable cluster analysis with different numbers of inbred lines
The total number of alleles detected by the 112 primer pairs in each of the 10 sets of inbred lines, which consisted of 97, 87, 77, 67, 57, 47, 37, 27, 17, and 7 inbred lines, respectively, is shown in Table 6. The number of alleles increased by 141 when the number of inbred lines increased from 7 to 17, i.e., an increase of 14.1 alleles for each added inbred line. By contrast, the number of alleles increased by only 6 when the number of inbred lines increased from 87 to 97, an increase of only 0.6 alleles for each added inbred line. The correlation coefficients between the general matrix and the sub-matrices were calculated for each set of inbred lines. The number of alleles in the sub-matrix at which the correlation coefficient reached r = 0.9 was taken as the allelic number required for cluster analysis of that set (Table 6). The clustering results generated from the total number of alleles and from the minimum number of alleles required in each set of materials matched very well, except for a few inbred lines that changed slightly between groups (Table 6), indicating that the clustering results from the minimum number of alleles were generally reliable. For example, the clustering results of the 97 inbred lines using 643 alleles vs. 498 alleles (Fig. 1-A, B) showed only three materials with a cluster change. In the dendrogram generated using 498 alleles, A183-1 changed from the undefined group to the PB group, while 543 and W618 changed positions between the different undefined groups.

Table 6 The minimum number of alleles required for the clustering of different numbers of materials
Material group | Number of inbred lines | Total number of alleles | Correlation coefficient between the sub-matrix from the minimum number of alleles and the general matrix | Minimum number of alleles required | Number of inbred lines with changes between total-allele clustering and minimum-allele clustering | Fitted values for the minimum number of alleles
1 | 7 | 347 | 0.90608 | 95 | 0 | 62
2 | 17 | 488 | 0.90339 | 208 | 0 | 236
3 | 27 | 540 | 0.90119 | 310 | 1 | 333
4 | 37 | 566 | 0.90017 | 421 | 1 | 391
5 | 47 | 583 | 0.90288 | 439 | 2 | 428
6 | 57 | 606 | 0.90211 | 452 | 2 | 455
7 | 67 | 621 | 0.90156 | 481 | 2 | 474
8 | 77 | 630 | 0.90196 | 493 | 3 | 489
9 | 87 | 637 | 0.90173 | 496 | 3 | 500
10 | 97 | 643 | 0.90048 | 498 | 3 | 510

The minimum number of alleles required for the stable clustering of inbred lines increased as the number of inbred lines increased. Stable clustering analysis for seven inbred lines required 95 alleles, while 498 alleles were required for 97 inbred lines. Using the number of inbred lines and the number of alleles required for stable cluster analysis in each group (Table 6), a regression model was established (Fig. 3). The regression equation was $Y = 600.8 \times e^{-15.9/X}$, where X is the number of inbred lines and Y is the number of SSR alleles. The regression was highly significant (r = 0.9895, F = 374.98 > $F_{0.01}(1, 8)$ = 11.3). Based on this regression equation, the minimum number of alleles required for genetic relationship analysis of different numbers of maize inbred lines was calculated. According to the equation, Y increases with increasing X; however, as X increases without bound, Y approaches a limit of 600.8. This means that approximately 601 alleles are sufficient for the study of genetic relationships among any larger number of inbred lines. As shown in Fig. 3, when the number of lines was less than 40, the required number of alleles rose rapidly with the number of inbred lines, whereas when the number of lines was more than 40, the required number of alleles increased slowly.

Fig. 3 The fitted curve for the number of inbred lines with the minimum number of alleles required in cluster analysis.
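The regression equation is easy to evaluate directly. The short Python sketch below takes the coefficients 600.8 and 15.9 from the text and evaluates the curve at the ten set sizes of Table 6; the function name is hypothetical.

```python
from math import exp

A = 600.8   # asymptote of the curve, as reported in the text
B = 15.9    # rate constant, as reported in the text


def required_alleles(n_lines):
    """Evaluate Y = A * exp(-B / X) for X = n_lines inbred lines."""
    return A * exp(-B / n_lines)


set_sizes = [7, 17, 27, 37, 47, 57, 67, 77, 87, 97]
fitted = [round(required_alleles(x)) for x in set_sizes]

# Because exp(-B / X) tends to 1 as X grows, Y can never exceed A.
large_sample = required_alleles(10_000)
```

Rounded to whole alleles, these values should agree with the fitted column of Table 6, and the value for a very large X stays just below the asymptote of 600.8.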
DISCUSSION
Cluster analysis of 17 inbred lines based on 112 SSR markers
Of the 97 inbred lines, 17 could not be clustered into the heterotic groups. Most of these have complex genetic backgrounds; the exceptions are 7137, V77, A22, 543, W618, and Yousuixuan-1, for which no genealogical information is available. A150-3-1, A210-1, B178, A183-1, A386-1, and F22-1 were selected from a free-pollination group consisting of Nongda 108, Nongda 3138, Yedan 2, Yedan 13 and Nongdan 5. Luyuan 92 (Yuanqi 122×1137) has been assigned to different groups by different researchers (Li et al. 2003; Teng et al. 2004; Gao et al. 2005; Xiao et al. 2008). Similarly, 52106 [(Aijin 525×Ye 107)×106] has also been assigned to different groups (Li et al. 2003; Teng et al. 2004; Liu et al. 2006; Xiao et al. 2008). Ji 842 (Ji 63×Mo 17), Cai 11-8 (Menke B×Zi 330) and Shen 118 (Zhao 23×Super Sweet) have complex genetic backgrounds drawn from different groups (Zheng et al. 2002).
The effect of the number of primer pairs and alleles on the clustering results
Factors that influence the results of genetic relationship analysis include the number of lines examined and the number of alleles (or polymorphic loci) under study. So far, no reports have been published on how to determine this relationship. The MXCOMP program in NTSYS-pc 2.10 software (Mantel 1967) is commonly used to calculate the correlation coefficient between two genetic similarity coefficient matrices (or genetic distance matrices). The extent of changes in genetic relationships is then judged from the value of the calculated coefficient. A high correlation coefficient, above 0.90, indicates that the two matrices are highly consistent with each other, suggesting that the genetic relationship is generally stable (Zhang et al. 2002; You et al. 2004). Zhang et al. (2002) argued that the number of alleles is more suitable than the number of primers as an indicator of molecular marker information content, but without supporting experimental results. In our study, two types of primer sets were used: one type had the same number of primer pairs but different numbers of alleles, while the other had similar numbers of alleles but different numbers of primer pairs. Analysis of these two types of primer sets demonstrated that the number of alleles was significantly associated with the stability of cluster analysis results, while the number of primer pairs did not have a significant effect on that stability. The number of primers mainly exerted its effect on the stability of clustering results by changing the number of alleles. Our results revealed that the number of primer pairs cannot serve as an indicator of molecular marker information content, and supported the rationale for using the number of alleles as that indicator. Therefore, to improve the reliability of genetic relationship studies, primers with a high polymorphism rate should be selected to increase the number of alleles.
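The association between allele number and clustering stability can be rechecked from the published figures. The sketch below recomputes a Pearson correlation from the eight rows of Table 4; it assumes that the reported r = 0.92 is an ordinary Pearson coefficient, which the text does not state explicitly, and the helper name is hypothetical.

```python
from math import sqrt

# Values copied from Table 4: alleles per primer set and the matching
# correlation between each sub-matrix and the general matrix.
alleles = [143, 104, 95, 84, 73, 65, 51, 37]
matrix_r = [0.58496, 0.57299, 0.56133, 0.51378, 0.50132, 0.48315, 0.41048, 0.37126]


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


r = pearson(alleles, matrix_r)
print(f"r = {r:.2f}")
```

Under that assumption the recomputed value agrees with the reported r = 0.92 to two decimal places, well above the critical value of 0.834 for 6 degrees of freedom.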
The relationship between the number of lines and the required number of alleles Determining the number of alleles according to the number of studied materials in a genetic relationship study not only ensures the reliability of the results, but also drastically reduces unnecessary labor (Zhang et al. 2002). Many researchers have explored the number of alleles required for genetic relationship studies of given numbers of materials in wheat (Zhang et al. 2002), rice (Yang et al. 2005), and soybean (Wang et al. 2003), but the relationship between the number of materials and the number of alleles needed has remained unclear. You et al. (2004) studied the number of alleles required for stable cluster analysis in 33, 48, and 96 wheat varieties, using 16 numbers of alleles ranging from 50 to 750 in increments of 50, and found that all three sets required a minimum of 550 alleles. Our experiment not only revealed a significant correlation between the number of materials and the minimum number of alleles required for cluster analysis, but also established the regression relationship between the two: Y = 600.8 × e^(-15.9/X), where X is the number of materials and Y is the minimum number of alleles. When the number of materials was smaller than 37, the minimum number of alleles required increased rapidly with the number of materials, whereas from 37 to 97 materials it increased slowly. Based on this equation, the number of alleles required for a given number of materials can be readily calculated.
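As an illustration of how the fitted curve flattens, the following minimal sketch evaluates the regression equation and its first derivative at a few sample sizes. The function names and the chosen sample sizes are assumptions made for this example and are not part of the original analysis; only the coefficients 600.8 and 15.9 come from the text.

```python
import math

A = 600.8  # coefficient of the fitted curve; also its upper limit as X grows
B = 15.9   # exponent constant of the fitted curve


def alleles_required(n_lines: float) -> float:
    """Evaluate Y = 600.8 * e^(-15.9 / X) for X = n_lines."""
    if n_lines <= 0:
        raise ValueError("number of lines must be positive")
    return A * math.exp(-B / n_lines)


def marginal_alleles(n_lines: float) -> float:
    """First derivative dY/dX = Y * 15.9 / X^2 (extra alleles per added line)."""
    return alleles_required(n_lines) * B / n_lines ** 2


if __name__ == "__main__":
    # Illustrative sample sizes (hypothetical choices, except 37 and 97,
    # which are the breakpoints mentioned in the text).
    for x in (10, 20, 37, 60, 97):
        print(f"X={x:3d}  Y={alleles_required(x):6.1f}  "
              f"dY/dX={marginal_alleles(x):5.2f}")
```

Under this equation the slope falls steadily once X exceeds about 8, and Y approaches but never exceeds 600.8, which is consistent with the rapid rise below 37 lines and the slow rise between 37 and 97 lines described above.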
The number of alleles required for a genetic relationship study in different crops Zhang et al. (2002) reported that 350 alleles were required for genetic relationship analysis among 43 wheat varieties. A study by You et al. (2004) demonstrated that 550 alleles were sufficient for a genetic relationship study among 96 wheat varieties. Based on the equation established in the present study, the numbers of alleles required for stable cluster analysis of 43 and 96 maize inbred lines were calculated to be 415 and 509, respectively. Wang et al. (2003) argued that 570 alleles were enough for a genetic relationship study of 190 soybean varieties. Based on the equation established in this experiment, 553 alleles were required for cluster analysis of 190 maize inbred lines. The two sets of numbers are very close, indicating that different crops may adhere to similar principles.
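The comparison above can be reproduced with a short calculation. The sketch below is only an illustration: the function and variable names are hypothetical, and the reported values are those quoted in the text from Zhang et al. (2002), You et al. (2004), and Wang et al. (2003).

```python
import math


def alleles_required(n_lines: float) -> float:
    """Regression from the present study: Y = 600.8 * e^(-15.9 / X)."""
    return 600.8 * math.exp(-15.9 / n_lines)


# (crop, number of varieties, alleles reported as sufficient), as cited above
reported = [
    ("wheat", 43, 350),
    ("wheat", 96, 550),
    ("soybean", 190, 570),
]

for crop, n_varieties, n_alleles in reported:
    predicted = round(alleles_required(n_varieties))
    print(f"{crop:8s} n={n_varieties:3d}  reported={n_alleles}  "
          f"maize equation={predicted}  difference={predicted - n_alleles:+d}")
```

Rounded to whole alleles, the equation gives 415, 509, and 553 for 43, 96, and 190 lines, matching the figures stated in the text.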
CONCLUSION Increasing the number of primer pairs, and thereby the number of alleles, can improve the accuracy of experimental results in genetic relationship studies. Although the number of primer pairs itself does not have an evident effect on the stability of cluster analysis results, it influences that stability by increasing the number of alleles. The number of alleles is significantly correlated with the stability of cluster analysis results. Therefore, primer pairs with a high polymorphism rate should be chosen in genetic relationship studies to increase the number of alleles so that the reliability of results can be improved. Stable cluster analysis of different numbers of inbred lines requires different numbers of SSR alleles. The number of alleles required for the analysis exhibited a significant association with the number of inbred lines analyzed, and the regression relationship conforms to the exponential equation Y = 600.8 × e^(-15.9/X).
Acknowledgements This research was supported by the Natural Science Foundation of Shandong Province, China (Y2007D52) and the Improved Variety Project of Shandong Province (2008 No.6). Thanks are due to Prof. Chen Huabang, College of Agriculture, Shandong Agricultural University, China, and Prof. Chen Maoxue, College of Information Science, Shandong Agricultural University, China, for constructive advice.
References
Bao L, Chen K S, Zhang D, Cao Y F, Yamamoto T, Teng Y W. 2007. Genetic diversity and similarity of pear (Pyrus L.) cultivars native to East Asia revealed by SSR (simple sequence repeat) markers. Genetic Resources and Crop Evolution, 54, 959-971.
Gao X, Chen Z H, Zhu Y F, Zhao X Y, Shen J H, Cao S S. 2005. The usage of germplasm of Reid in breeding and produce of maize in China. Chinese Agricultural Science Bulletin, 21, 120-136. (in Chinese)
Garland S H, Lewin L, Abedinia M, Henry R, Blakeney A. 1999. The use of microsatellite polymorphisms for the identification of Australian breeding lines of rice (Oryza sativa L.). Euphytica, 108, 53-63.
Li X H, Fu J H, Zhang S H, Yuan L X. 2000. Genetic variation of inbred lines of maize detected by SSR markers. Scientia Agricultura Sinica, 33, 1-9. (in Chinese)
Li X H, Jiao S J, Fu J H, Zhang S H, Yuan L X, Li M S. 2001. The effects of two gel electrophoresis system on the polymorphism of SSR markers. Acta Agriculturae Boreali-Sinica, 16, 43-48. (in Chinese)
Li X H, Yuan L X, Li X H, Zhang S H, Li M S, Li W H. 2003. Heterotic grouping of 70 maize inbred lines by SSR markers. Scientia Agricultura Sinica, 36, 622-627. (in Chinese)
Liu Z H, Tang J H, Wang Q D, Hu Y M, Ji H Q, Chen W C. 2006. Analysis of heterotic patterns of maize hybrids used in China's Henan province. Scientia Agricultura Sinica, 39, 1689-1696. (in Chinese)
Mantel N A. 1967. The detection of disease clustering and a generalized regression approach. Cancer Research, 27, 209-220.
Marsan P A, Castiglioni P, Fusari F, Kuiper M, Motto M. 1998. Genetic diversity and its relationship to hybrid performance in maize as revealed by RFLP and AFLP markers. Theoretical and Applied Genetics, 96, 219-227.
Melchinger A E, Lee M, Lamkey K R, Hallauer A R, Woodman W L. 1990. Genetic diversity for restriction length polymorphisms and heterosis for two diallele sets of maize inbreds. Theoretical and Applied Genetics, 80, 488-496.
Plaschke J, Ganal M W, Röder M S. 1995. Detection of genetic diversity in closely related bread wheat using microsatellite markers. Theoretical and Applied Genetics, 91, 1001-1007.
Rohlf F J. 2000. NTSYS-pc: Numerical Taxonomy and Multivariate Analysis System. ver. 2.1, Exeter Software. Setauket, New York, USA.
Rongwen J, Akkaya M S, Bhagwat A A, Lavi U, Cregan P B. 1995. The use of microsatellite DNA markers for soybean genotype identification. Theoretical and Applied Genetics, 90, 43-48.
Saghai Maroof M A, Soliman K M, Jorgensen R A, Allard R W. 1984. Ribosomal DNA spacer length polymorphisms in barley: Mendelian inheritance, chromosomal location and population dynamics. Proceedings of the National Academy of Sciences USA, 81, 8014-8018.
Senior M L, Murphy J P, Goodman M M, Stuber C W. 1998. Utility of SSRs for determining genetic similarities and relationships in maize using an agarose gel system. Crop Science, 38, 1088-1098.
Silvestrini M, Maluf M P, Silvarolla M B, Filho O G. 2008. Genetic diversity of a Coffea germplasm collection assessed by RAPD markers. Genetic Resources and Crop Evolution, 55, 901-911.
Smith O S, Smith J S C, Bowen S L, Tenborg R A, Wall S J. 1990. Similarities among a group of elite maize inbreds as measured by pedigree, F1 grain yield, heterosis and RFLPs. Theoretical and Applied Genetics, 80, 833-840.
Struss D, Plieske J. 1998. The use of microsatellite markers for detection of genetic diversity in barley populations. Theoretical and Applied Genetics, 97, 308-315.
Teng W T, Cao J S, Chen Y H, Liu X H, Jing X Q, Zhang F J, Li J S. 2004. Analysis of maize heterotic groups and patterns during past decade in China. Scientia Agricultura Sinica, 37, 1804-1811. (in Chinese)
Wang B, Chang R Z, Tao L, Yan L, Tao L, Guan R X, Zhang M H, Feng Z F, Qiu L J. 2003. The number of SSR primers required for the analysis of the genetic diversity of the soybeans cultivated in China. Molecular Plant Breeding, 1, 82-88. (in Chinese)
Wang Y B, Wang Z H, Wang Y P, Zhang X, Lu L X. 1997. Studies on the heterosis utilizing models of main maize germplasm in China. Scientia Agricultura Sinica, 30, 16-24. (in Chinese)
Wu J F. 1983. A review on the germplasm bases of the main corn hybrids in China. Scientia Agricultura Sinica, 6, 1-8. (in Chinese)
Xiao M J, Li M S, Li X H, Zhang S H. 2008. Genetic diversity revealed by SSR markers among inbred lines used in summer maize breeding program in Huanghuaihai area of China. Journal of Maize Sciences, 16, 1-7.
Yang Q W, Chen C B, Zhang W X, Shi J X, Ren J F. 2005. Minimum number of SSR alleles needed for genetic structure analysis of Oryza rufipogon populations. Chinese Journal of Rice Science, 19, 297-302. (in Chinese)
Yoon M S, Lee J, Kim C Y, Kang J H, Cho E G, Baek H J. 2009. DNA profiling and genetic diversity of Korean soybean [Glycine max (L.) Merrill] landraces by SSR markers. Euphytica, 165, 69-77.
You G X, Zhang X Y, Wang L F. 2004. An estimation of the minimum number of SSR loci needed to reveal genetic relationships in wheat varieties: Information from 96 random accessions with maximized genetic diversity. Molecular Breeding, 14, 397-406.
Zhang X Y, Li C W, Wang L F, Wang H M, You G X, Dong Y S. 2002. An estimation of the minimum number of SSR alleles needed to reveal genetic relationships in wheat varieties. I. Information from large-scale planted varieties and cornerstone breeding parents in Chinese wheat improvement and production. Theoretical and Applied Genetics, 106, 112-117.
Zheng D H, Li Y R, Jin F X, Jiang J J. 2002. Pedigree and germplasm base of inbreds of the Lancaster heterotic group of maize in China. Scientia Agricultura Sinica, 35, 750-757. (in Chinese)
Zeng S X. 1990. The maize germplasm base of hybrids in China. Scientia Agricultura Sinica, 23, 1-9. (in Chinese)
(Managing editor ZHANG Yi-min)
© 2010, CAAS. All rights reserved. Published by Elsevier Ltd.