Whole-genome scan for signatures of recent selection reveals loci associated with important traits in White Leghorn chickens
D. F. Li, W. B. Liu, J. F. Liu, G. Q. Yi, L. Lian, L. J. Qu, J. Y. Li, G. Y. Xu, and N. Yang1
National Engineering Laboratory for Animal Breeding and MOA Key Laboratory of Animal Genetics and Breeding, College of Animal Science and Technology, China Agricultural University, Beijing 100193, China
ABSTRACT The chicken is considered an excellent model for genetic studies of phenotypic and genomic evolution, with large effective population size, specialized commercial lines, and strong human-driven selection. High-density chicken SNP chips can help to achieve a better understanding of the selection mechanisms in artificially selected populations. We performed genome-wide tests for selection signatures in 385 White Leghorn hens and mapped positively selected regions to the genome annotations. Ten QTL related to egg production, egg quality, growth, and disease resistance traits were selected for extended haplotype homozygosity tests to give a brief overview of recent selection signatures in chicken QTL. We also reported 185 candidate genes/CDSs showing top P-values and slower decay of haplotype homozygosity. Some of these genes seemed to have significant effects on economically important traits, and most of them have not been reported in chickens. The current study provides a genome-wide map of linkage disequilibrium extents and distributions and selection footprints in the chicken genome. A panel of genes, including PRL, NCKX1, NRF1, LHX2, and SFRP1, associated with egg production, metabolism traits, and response to illumination was identified. In addition, more genes were identified that have not yet been reported in chickens, and our results provide new clues for further study.
Key words: chicken genome, extended haplotype homozygosity test, linkage disequilibrium, selective signature
2012 Poultry Science 91:1804–1812 http://dx.doi.org/10.3382/ps.2012-02275
©2012 Poultry Science Association Inc. Received March 5, 2012. Accepted April 26, 2012.
1 Corresponding author: [email protected]
Downloaded from http://ps.oxfordjournals.org/ at University of Hawaii - Manoa on June 8, 2015
INTRODUCTION
A major goal in evolutionary biology is to describe and understand the genetic basis of how diversity arises within populations and how organisms adapt to their environments (Pritchard and Di Rienzo, 2010). The recent history of the domestic chicken is characterized by great changes in rearing environment and disease prevention/control strategies and by intensive human-driven selection (Rubin et al., 2010). Thus, the last 100 yr is the most interesting time in the history of poultry evolutionary biology, and many important genetic adaptations to a new environment, the farm, and to disease have evolved during this period. The chicken has served as an excellent model for genetic studies of phenotypic and genomic evolution owing to its large effective population size, specialized commercial lines, and strong human-driven selection (Rubin et al., 2010). It is also a very important agricultural animal. Identifying the genetic changes underlying these morphological changes provides new insight into understanding of the basis of morphological evolution (Pritchard and Di Rienzo, 2010).
Linkage disequilibrium (LD) refers to the nonrandom association of alleles at different sites. The extent and distribution of LD in chickens is a topic of great current interest (ICGSC, 2004). Studies of LD may enable us to learn more about the evolutionary history of populations. According to the theory of natural selection, a novel advantageous allele under positive selection will increase in prevalence rapidly, so the LD in the vicinity of the locus will be degraded relatively slowly and the surrounding conserved haplotype is long. A neutral mutation, by contrast, takes many generations to be fixed or lost, so the LD is degraded relatively rapidly through recombination and the surrounding conserved haplotype is relatively short (Nielsen, 2005; Sabeti et al., 2006). Based on this theory of selective sweeps, Sabeti et al. (2002) proposed the concepts of extended haplotype homozygosity (EHH) and relatively extended haplotype homozygosity (REHH) for the detection of recent selection. The EHH test has received considerable attention in human and domestic animal genetics because it is particularly useful for SNP data (Tang et al., 2007). By using 50-K SNP chips, Qanbari et al. (2010a) demonstrated a genome-wide map of selection footprints in
RECENT SELECTION SIGNATURES IN WHITE LEGHORNS
MATERIALS AND METHODS
Bird Resource and Genotyping
A line of White Leghorn (WL) chickens, originating from a commercial population, has been maintained and selected mainly for egg production at the experimental station of China Agricultural University for more than 10 yr. These birds were kept in individual cages for daily recording of egg production from 21 to 56 wk of age, and egg quality traits were measured individually at 40 and 60 wk of age (Liu et al., 2011). A total of 385 WL hens from 20 families were employed as the experimental population for the current study. All genomic DNA samples were extracted from blood by standard phenol-chloroform extraction. The DNA quality was estimated by 1% agarose gel electrophoresis and by calculating the optical density absorbance ratio, OD260nm/OD280nm. All samples were checked for quality, and genome-wide SNP genotyping using the 60 K SNP Illumina iSelect chicken array (Groenen et al., 2011) was then performed by DNA LandMarks Inc. (Quebec, Canada). This microarray contains probes for 57,636 SNP distributed across 29 autosomes (GGA1 to 28 and GGA32), 2 linkage groups (E22C19W28_E50C23 and E64), and 2 sex chromosomes.
Quality Control and Missing Genotypes Estimation
A total of 385 DNA samples were selected for genotyping, and the genotyping data were filtered with the software package PLINK (Purcell et al., 2007), retaining individuals with a missing rate <0.1 and SNP with a minor allele frequency (MAF) >0.01, a missing rate <0.1, and a Hardy-Weinberg equilibrium test P-value greater than 1.00E-06. Fully phased haplotype data were required for further analysis. After the quality control process, haplotypes for every chromosome were reconstructed with fastPHASE (Scheet and Stephens, 2006) using default parameters.
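The quality-control thresholds above can be illustrated with a minimal Python sketch. It is not PLINK's implementation: the function names, the 0/1/2 genotype coding with None for a missing call, and the use of a chi-square approximation for the Hardy-Weinberg test are all assumptions made for illustration.

```python
import math

def sample_passes_qc(calls, max_missing=0.1):
    """calls: genotype calls for one hen across all SNP; None marks a missing call."""
    missing = sum(1 for g in calls if g is None)
    return missing / len(calls) < max_missing

def hwe_chi2_pvalue(called):
    """Chi-square (1 df) Hardy-Weinberg P-value from genotypes coded 0/1/2.

    Assumption: a chi-square approximation stands in for whichever
    Hardy-Weinberg test PLINK applied in the study.
    """
    n = len(called)
    p = sum(called) / (2 * n)          # frequency of the counted allele
    q = 1.0 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))

def snp_passes_qc(calls, max_missing=0.1, min_maf=0.01, min_hwe_p=1e-6):
    """Apply the three per-SNP filters: missing rate, MAF, and HWE P-value."""
    called = [g for g in calls if g is not None]
    if not called or (len(calls) - len(called)) / len(calls) >= max_missing:
        return False
    p = sum(called) / (2 * len(called))
    if min(p, 1.0 - p) <= min_maf:
        return False
    return hwe_chi2_pvalue(called) > min_hwe_p
```

In this sketch a SNP is kept only if all three per-SNP conditions hold, and a hen is kept only if fewer than 10% of her calls are missing.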
Detection of Selection Signatures and Gene Annotation
For the EHH test, Sweep v.1.1 (Sabeti et al., 2002) was run with default parameters. The program was set to select core regions containing between 3 and 20 SNP. As reported by the ICGSC (2004), the recombination rate ranges from 2.8 to 6.4 cM/Mb among chicken chromosomes. We therefore chose 300 kb as the matched distance at which to determine the REHH value for each core region and to evaluate how LD decays across the genome. The EHH is defined as “the probability that two randomly chosen chromosomes carrying a tested core haplotype are homozygous at all SNPs for the entire interval from the core region to the distance x.” To account for factors such as variability of recombination, Sabeti et al. (2002) proposed the concept of REHH, “the ratio of the EHH on the tested core haplotype compared with the EHH of the grouped set of core haplotypes at the region not including the core haplotype tested.” According to the theory of natural selection, regions under positive selection carry high-frequency alleles embedded in a long-range LD background (Qanbari et al., 2010b). Accordingly, we used a haplotype frequency >25%, an REHH value greater than 1, and a P-value of the REHH test <0.05 or <0.01 as criteria to identify significant core haplotypes. The Genome Assembly/Annotation Projects database was used for mapping positively selected regions to genome annotations (ftp://ftp.ncbi.nih.gov/genomes/Gallus_gallus/mapview/, based on Build 2.1).
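The two quoted definitions can be made concrete with a small Python sketch. It is not the Sweep program: the function names and the input layout (a mapping from each core haplotype to the extended haplotypes of the chromosomes that carry it) are hypothetical, and the pooled EHH of the other core haplotypes is written in one common form, as pairs of identical chromosomes divided by all pairs within each non-tested core haplotype.

```python
from collections import Counter
from math import comb

def ehh(extended_haplotypes):
    """Probability that two randomly chosen chromosomes carrying the same
    core haplotype are identical over the whole interval out to distance x.

    extended_haplotypes: one string per chromosome, covering that interval.
    """
    c = len(extended_haplotypes)
    if c < 2:
        return 0.0
    identical_pairs = sum(comb(n, 2) for n in Counter(extended_haplotypes).values())
    return identical_pairs / comb(c, 2)

def rehh(by_core, tested):
    """EHH of the tested core haplotype divided by the pooled EHH of all
    other core haplotypes at the same core region.

    by_core: dict mapping core haplotype -> list of extended haplotypes.
    """
    identical_pairs = total_pairs = 0
    for core, haps in by_core.items():
        if core == tested:
            continue
        identical_pairs += sum(comb(n, 2) for n in Counter(haps).values())
        total_pairs += comb(len(haps), 2)
    if identical_pairs == 0:
        return float("inf")   # no homozygosity left on the other haplotypes
    return ehh(by_core[tested]) / (identical_pairs / total_pairs)
```

For example, if all four chromosomes carrying the tested core haplotype are still identical at distance x (EHH = 1) while only 1 of the 7 within-haplotype pairs on the other core haplotypes is identical, the REHH is 7.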
RESULTS
Markers and Core Haplotype Statistics
We genotyped 57,636 markers using the 60 K SNP Illumina iSelect chicken array, and a total of 37,518 (65.09%) markers and 372 individuals passed quality control. Most of the markers on GGA32, E64, E22, W, and Z were discarded after quality control, and the EHH test applies to biallelic loci of diploid species, so only the 36,949 markers on the 28 autosomes GGA1 to GGA28 (65.74% of the 56,206 array SNP on these chromosomes), with an average interval of 26 kb, were included in further analysis. Table 1 gives a summary description of the genome-wide marker distribution. A total of 25,760 SNP (69.72%) participated in forming core regions (Table 1), and more than half of the core regions contained 3 to 4 SNP (Figure 1). On GGA15 and GGA25, 672 and 68 SNP, respectively, were located in core regions; apart from GGA16, which had only 3 core regions, these chromosomes had the highest (63.83%) and lowest (28.66%) proportions of chromosome length covered by core regions. Overall, 3,741 core regions spanning 521.78 Mb
the Holstein genome and reported that several genes and QTL might be related to milk yield and composition as well as reproductive and behavioral traits. Rubin et al. (2010) resequenced the chicken genome using pooled genomic DNA and reported several genes associated with growth, appetite, and metabolic regulation. Their work cast light on the genetic basis of domestication and encouraged us to conduct in-depth studies in a larger population. In our previous study, high-density SNP chips were employed in a genome-wide association study that revealed several loci associated with egg production and quality traits in dwarf and White Leghorn chickens (Liu et al., 2011). In this study, the genotyping data of White Leghorn hens were used to detect the extent and distribution of LD and to scan the genome for positions that may have been targets of recent positive selection.
Table 1. Summary of genome-wide marker and core region (CR) distribution in White Leghorn chickens
GGA1 | Chromosome length (Mbp) | No. of SNP on array | No. of valid SNP | Mean distance (kb) | No. of CR | Coverage length (Mbp) | Mean CR length (kb) | Maximum CR length (kb) | CR SNP | CR length/Chr length (%) | Mean CR SNP | CR SNP/SNP (%)
1 | 200.99 | 9,059 | 6,086 | 33 | 825 | 110.86 | 134.37 ± 146.63 | 2,136.18 | 4239 | 55.16 | 5.14 ± 3.2 | 69.65
2 | 154.87 | 6,958 | 4,324 | 36 | 582 | 88.54 | 152.13 ± 185.69 | 2,392.59 | 3128 | 57.17 | 5.37 ± 3.44 | 72.34
3 | 113.66 | 5,171 | 3,476 | 33 | 465 | 58.67 | 126.18 ± 118.91 | 766.66 | 2339 | 51.62 | 5.03 ± 2.94 | 67.29
4 | 94.23 | 4,256 | 2,907 | 32 | 400 | 49.97 | 124.93 ± 118.4 | 863.35 | 2032 | 53.03 | 5.08 ± 3.16 | 69.90
5 | 62.24 | 2,766 | 1,872 | 33 | 264 | 32.08 | 121.53 ± 121.94 | 808.22 | 1274 | 51.55 | 4.83 ± 2.96 | 68.06
6 | 37.4 | 2,219 | 1,549 | 24 | 211 | 19.67 | 93.22 ± 86.05 | 523.24 | 1066 | 52.59 | 5.05 ± 3.01 | 68.82
7 | 38.38 | 2,280 | 1,442 | 26 | 191 | 21.38 | 111.95 ± 212.43 | 2,701.45 | 986 | 55.71 | 5.16 ± 3.43 | 68.38
8 | 30.67 | 1,813 | 1,268 | 24 | 165 | 17.37 | 105.27 ± 186.19 | 2,071.51 | 844 | 56.63 | 5.12 ± 3.45 | 66.56
9 | 25.55 | 1,504 | 1,036 | 25 | 132 | 11.60 | 87.88 ± 80.61 | 477.48 | 658 | 45.40 | 4.98 ± 2.98 | 63.51
10 | 22.56 | 1,682 | 1,078 | 21 | 155 | 12.57 | 81.1 ± 129.32 | 1,518.84 | 755 | 55.72 | 4.87 ± 2.67 | 70.04
11 | 21.93 | 1,647 | 929 | 24 | 136 | 13.18 | 96.89 ± 136.54 | 1,314.56 | 681 | 60.09 | 5.01 ± 3.21 | 73.30
12 | 20.54 | 1,671 | 1,111 | 18 | 146 | 12.09 | 82.8 ± 78.99 | 590.53 | 834 | 58.86 | 5.71 ± 3.39 | 75.07
13 | 18.91 | 1,492 | 1,043 | 18 | 137 | 10.25 | 74.82 ± 72.09 | 452.82 | 731 | 54.21 | 5.34 ± 3.42 | 70.09
14 | 15.82 | 1,284 | 859 | 18 | 111 | 9.33 | 84.02 ± 90.62 | 519.67 | 628 | 58.95 | 5.66 ± 3.8 | 73.11
15 | 12.97 | 1,337 | 855 | 15 | 110 | 8.28 | 75.26 ± 63.53 | 434.57 | 672 | 63.83 | 6.11 ± 3.73 | 78.60
16 | 0.43 | 31 | 18 | 24 | 3 | 0.10 | 32.25 ± 3.2 | 36.34 | 11 | 22.50 | 3.67 ± 0.47 | 61.11
17 | 11.18 | 1,104 | 730 | 15 | 88 | 5.71 | 64.84 ± 65.38 | 387.48 | 508 | 51.03 | 5.77 ± 4.04 | 69.59
18 | 10.93 | 1,160 | 742 | 15 | 98 | 5.65 | 57.6 ± 62.86 | 466.32 | 488 | 51.65 | 4.98 ± 3.1 | 65.77
19 | 9.94 | 1,053 | 709 | 14 | 101 | 5.67 | 56.14 ± 58.22 | 341.09 | 519 | 57.05 | 5.14 ± 3.27 | 73.20
20 | 13.99 | 1,991 | 1,233 | 11 | 148 | 8.77 | 59.29 ± 74.68 | 460.10 | 898 | 62.72 | 6.07 ± 4.39 | 72.83
21 | 6.96 | 970 | 666 | 11 | 96 | 3.69 | 38.45 ± 32.64 | 236.59 | 468 | 53.04 | 4.88 ± 2.69 | 70.27
22 | 3.94 | 476 | 231 | 17 | 26 | 2.16 | 82.95 ± 151.01 | 751.54 | 139 | 54.74 | 5.35 ± 2.93 | 60.17
23 | 6.04 | 794 | 519 | 12 | 72 | 2.80 | 38.84 ± 38.19 | 303.72 | 343 | 46.30 | 4.76 ± 2.47 | 66.09
24 | 6.4 | 937 | 645 | 10 | 91 | 3.43 | 37.67 ± 32.93 | 193.59 | 454 | 53.56 | 4.99 ± 3.34 | 70.39
25 | 2.03 | 241 | 128 | 16 | 13 | 0.58 | 44.76 ± 78.76 | 313.10 | 68 | 28.66 | 5.23 ± 4.56 | 53.13
26 | 5.1 | 852 | 586 | 9 | 82 | 2.48 | 30.3 ± 31.44 | 240.62 | 380 | 48.72 | 4.63 ± 2.68 | 64.85
27 | 4.84 | 665 | 420 | 11 | 58 | 2.82 | 48.68 ± 77.59 | 477.44 | 291 | 58.34 | 5.02 ± 2.83 | 69.29
28 | 4.51 | 793 | 487 | 9 | 63 | 2.08 | 32.97 ± 37.54 | 274.81 | 326 | 46.05 | 5.17 ± 3.03 | 66.94
Total | 957.01 | 56,206 | 36,949 | 25.9 | 4969 | 521.78 | 105.01 ± 133.29 | 2,701.45 | 25,760 | 54.52 | 5.18 ± 3.26 | 69.72
1GGA = Gallus gallus chromosome.
Li et al.
54, and the proportion of length covered by core regions exceeded 73%. For the QTL related to carcass weight (McElroy et al., 2006), the core region reached a length of 2,136 kb and covered the whole QTL region. The EHH tests on QTL associated with eggshell color, egg weight, and egg production rate showed significant or suggestive P-values. Table 2 also gives the highest REHH value among the core haplotypes in each QTL.
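The proportion of a QTL covered by core regions can be sketched as an interval-overlap calculation. The paper does not describe how this proportion was computed, so the function below is only one plausible reading: core regions are clipped to the QTL boundaries, overlapping regions are merged, and the covered length is divided by the QTL length. The function name and all core-region coordinates are hypothetical.

```python
def covered_fraction(qtl_start, qtl_end, core_regions):
    """Fraction of the QTL interval [qtl_start, qtl_end) lying inside at
    least one core region; core_regions is a list of (start, end) in bp."""
    # Keep only regions that overlap the QTL, clipped to its boundaries.
    clipped = sorted(
        (max(start, qtl_start), min(end, qtl_end))
        for start, end in core_regions
        if end > qtl_start and start < qtl_end
    )
    covered = 0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # A gap: close the previous merged block and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / (qtl_end - qtl_start)
```

Under this reading, a single core region longer than the QTL, as reported for the carcass weight QTL, gives a covered fraction of 1.0 however far the region extends beyond the QTL.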
Whole-Genome EHH Test
(54.52%) of the genome were detected (Table 1). Mean core region length was estimated at 105.01 ± 133.29 kb, with per-chromosome maximum lengths ranging from 36.34 kb (GGA16) to 2,701.45 kb (GGA7). The GGA1 had the greatest number of core regions. Figure 2 presents the distribution of core region lengths.
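The summary percentages quoted in this subsection follow directly from the totals in Table 1, as the short check below illustrates. The variable and function names are hypothetical; the numbers are the Table 1 totals.

```python
# Totals taken from the last row of Table 1.
chromosome_length_mbp = 957.01
coverage_length_mbp = 521.78
snp_on_array = 56206      # array SNP on GGA1 to GGA28
valid_snp = 36949         # SNP retained for analysis
cr_snp = 25760            # SNP that fall inside core regions

def percent(part, whole):
    """Percentage of whole represented by part, rounded to 2 decimals."""
    return round(100.0 * part / whole, 2)

genome_coverage_pct = percent(coverage_length_mbp, chromosome_length_mbp)
valid_snp_pct = percent(valid_snp, snp_on_array)
cr_snp_pct = percent(cr_snp, valid_snp)
# Mean spacing between retained SNP, in kb.
mean_spacing_kb = round(1000.0 * chromosome_length_mbp / valid_snp, 1)
```

These reproduce the 54.52% genome coverage, the 65.74% of array SNP retained, the 69.72% of SNP in core regions, and the mean marker distance of 25.9 kb.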
Core Regions in Candidate QTL
To get a brief overview of recent selection signatures in chicken QTL, we selected 10 QTL related to egg production, egg quality, growth, and disease resistance traits from previous reports (chicken QTL database, http://www.animalgenome.org/cgi-bin/QTLdb/GG/index) to perform the EHH test. The proportion of length covered by core regions versus QTL length and the lowest P-value of the EHH test among core regions located in each QTL are given in Table 2. The number of core regions spanning these QTL ranged from 1 to
Figure 2. Distribution of the number of SNP in the White Leghorn genome.
Gene Annotations
To assess the influence of positive selection on particular coding regions, genes/ESTs harboring core haplotypes with an REHH P-value <0.01 were identified by aligning the core positions to the chicken genome sequence (Build 2.1). Table 4 describes the genes reported previously in human or chicken, and all 185 identified genes are given in Supplementary Table 1 (available at http://www.ps.fass.org). On GGA20, a core region harboring 2 genes, eyes absent homolog 2 (EYA2) and solute carrier family 2, member 10 (SLC2A10), showed a strong signature of selection (REHH = 7.45, P-value = 0.00095). The eyes absent gene was first identified in Drosophila (Bonini et al., 1993) and plays an essential role in cell proliferation, differentiation, and death during organogenesis (Zhang et al., 2005). In the chicken, EYA2 was isolated from 14-d embryonic chicken lenses and shows a dynamic expression pattern in different tissues of diverse embryological origin (Mishima and Tomarev, 1998). Prolactin (PRL), located on GGA2 (REHH = 3.09, P-value = 0.00423), is involved in maintaining broody behavior and egg production (Buntin, 1996). The LIM homeobox 2 gene (LHX2, REHH = 4.64, P-value = 0.00893) is located on GGA17, and the secreted frizzled-related protein 1 gene (SFRP1, REHH = 6.15, P-value = 0.00123) is located on GGA22. Both LHX2 and SFRP1 are involved in the Wnt signaling pathway, a network best known for its roles in embryogenesis and cancer (Lie et al., 2005; Meng et al., 2011; Peukert et al., 2011).
Figure 1. Distribution of the length of core regions in the White Leghorn genome.
To detect core haplotypes under selection in the chicken genome, we calculated EHH at a distance of 300 kb on both the upstream and downstream sides of every possible core, giving a total of 15,060 EHH tests with an average of 24 tests per core region. Figure 3 gives a distribution map of the –log10(P-value) of REHH values against chromosomal position. The results of the whole-genome EHH tests were filtered with the criteria of core haplotype frequency >0.25, REHH >1, and P-value of REHH <0.05 or <0.01 to find the outlying core haplotypes. Of the 15,060 tests, 624 and 116 displayed outlying peaks at threshold levels of 0.05 and 0.01, respectively. As shown in Table 3, core haplotypes under positive selection were mainly distributed on GGA1 to 4.
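The filtering step described above can be sketched as follows. This is only an illustration of the three stated criteria and of the –log10 transform used for Figure 3; the record layout, the function names, and the haplotype frequencies in the usage example are hypothetical, and the sketch does not show how Sweep derives the REHH P-values.

```python
import math
from dataclasses import dataclass

@dataclass
class CoreHaplotypeTest:
    chromosome: str
    frequency: float   # core haplotype frequency
    rehh: float        # REHH at the 300-kb matched distance
    p_value: float     # P-value of the REHH test

def is_outlier(test, alpha, min_frequency=0.25, min_rehh=1.0):
    """True when a test meets all three criteria at significance level alpha."""
    return (test.frequency > min_frequency
            and test.rehh > min_rehh
            and test.p_value < alpha)

def count_outliers(tests, alpha):
    """Number of tests that pass the filter at significance level alpha."""
    return sum(1 for test in tests if is_outlier(test, alpha))

def neg_log10(p_value):
    """Transform a P-value for plotting against chromosomal position."""
    return -math.log10(p_value)
```

As a usage example, a test with REHH = 3.09 and P-value = 0.00423 (the values reported for the core region harboring PRL) passes at both thresholds provided its haplotype frequency, which is assumed here, exceeds 0.25.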
Table 2. Summary of the extended haplotype homozygosity tests on candidate QTL1
GGA: 1, 2, 5, 8, 9, 11
QTL position (bp): 77,690,681–78,690,681; 161,532,735–162,532,735; 7,838,822–18,028,322; 140,493,503–141,493,503; 86,206,740–87,208,344; 139,571,894–149,870,058; 30,162,990–30,261,755; 65,600,571–66,684,756; 7,136,680–8,136,680; 2,307,802–6,272,742; Total 32,637.17 (kb)
No. of CR: 1, 6, 54, 4, 6, 28, 1, 5, 5, 12; Total 122
Coverage length (kb): 2,136.18, 744.46, 7,386.66, 757.15, 716.67, 7,395.16, 85.31, 783.40, 754.03, 3,179.27; Total 23,938.28
CR length/ Chr length (%): 100.00, 74.45, 72.49, 75.72, 71.55, 71.81, 86.37, 72.26, 75.40, 80.18; Total 73.35
REHH: 1.66, 2.37, 2.57, 2.73, 2.73, 2.73, 2.53, 3.31, 4.37, 2.95, 2.73, 2.31; Total 2.75
Lowest P-value: 0.063, 0.050, 0.040, 0.003, 0.043, 0.009, 0.003, 0.036, 0.051, 0.009; Total 0.003
Trait: Carcass weight; Egg production rate; Marek’s disease-related traits; Abdominal fat weight; Body weight (42 d); Breast muscle weight; Egg shell color; Tibia bone mineral density; Body weight (1 d); Marek’s disease-related traits; Egg weight; Tibia breaking force
Reference: McElroy et al. (2006); Wardecka et al. (2002); Yonash et al. (1999); Atzmon et al. (2008); Schreiweis et al. (2006); Schreiweis et al. (2005); Tercic et al. (2009); Heifetz et al. (2009); Sasaki et al. (2004); Schreiweis et al. (2005)
1GGA = Gallus gallus chromosome; CR = core region; REHH = relatively extended haplotype homozygosity.
DISCUSSION
The Extent and Distribution of LD in Chicken Genome
The first assembly of the chicken genome sequence was published in 2004, which opened an entirely new way to understand the genetic basis of recent evolution in the chicken genome. Large-scale SNP data sets in the chicken genome increase the marker density and allow a better understanding of the selection mechanisms in artificially selected populations (Qanbari et al., 2010a). Benefiting from the recently established Illumina 60 K chicken SNP chip, we detected the extent of LD and scanned the genome for positions that may have been targets of recent positive selection in the White Leghorn population.
Linkage disequilibrium analysis is a powerful tool for fine mapping of genes/loci responsible for economic/disease-related traits and for performing genome-based selection (Meuwissen et al., 2001). The extent and distribution of LD is highly variable among chromosome regions and populations and is a topic of great current interest. The comparison of the physical distance along each chicken chromosome with the genetic distance between markers reveals that recombination rates have a strong negative association with chromosome length (ICGSC, 2004). In this study, the mean length of core haplotypes varies over a 4-fold range among chromosomes (30 kb to 134 kb) and has a strong positive association with chromosome length, consistent with the higher recombination rates on shorter chromosomes. The longest LD block is located on GGA7 (2,701 kb), about 74 times the length of the longest block on GGA16 (36 kb). Andreescu et al. (2007) used genotype data with 959 and 398 SNP on GGA1 and 4 to investigate the extent of LD in 9 commercial broiler breeding lines and found that LD measured by r2 extended over a very short distance (<1 cM). Aerts et al. (2007) demonstrated that the extent of LD was very different between GGA10 (15 kb) and GGA28 (4 cM) in breed Nutreco E3. Although the mean length of core haplotypes varied dramatically in physical length, no difference in LD between chromosomes was found when the measurements were converted from kb to cM, consistent with the results reported by Abasht et al. (2009) and Aerts et al. (2007) (see also Andreescu et al., 2007). With the SNP marker density used here, which differs from that of the earlier studies, the mean length of LD blocks on GGA1, 4, 10, and 28 was 134 kb, 125 kb, 81 kb, and 33 kb, respectively. Rao et al. (2008) selected 36 SNP in a region of 200 kb on GGA1 and investigated the LD pattern in Red Jungle Fowl (RJF) and 2 domestic chicken populations.
They found that the extent of LD in this region was about 150 kb (0.4 cM). Qanbari et al. (2010a) reported a comprehensive LD map using high-density SNP chips. Their result showed very different LD extent patterns
Table 3. Summary statistics of whole-genome extended haplotype homozygosity tests1
GGA | Tests on CH | No. of CR P < 0.05 | No. of CR P < 0.01
1 | 2,549 | 91 | 17
2 | 1,804 | 65 | 11
3 | 1,420 | 75 | 15
4 | 1,268 | 45 | 5
5 | 786 | 43 | 10
6 | 662 | 20 | 4
7 | 569 | 21 | 3
8 | 523 | 19 | 4
9 | 427 | 25 | 6
10 | 494 | 30 | 5
11 | 391 | 12 | 2
12 | 422 | 11 | 0
13 | 407 | 15 | 5
14 | 319 | 11 | 2
15 | 311 | 10 | 3
16 | 3 | 0 | 0
17 | 254 | 27 | 8
18 | 272 | 9 | 2
19 | 320 | 13 | 1
20 | 446 | 21 | 3
21 | 301 | 16 | 3
22 | 66 | 5 | 1
23 | 190 | 7 | 3
24 | 248 | 12 | 2
25 | 31 | 1 | 0
26 | 236 | 10 | 0
27 | 174 | 5 | 0
28 | 167 | 5 | 1
Total | 15,060 | 624 | 116
1GGA = Gallus gallus chromosome; CH = core haplotype; CR = core region.
long haplotype block. In a Chinese domestic chicken line, Dongxiang Blueshelled Layers, however, the mean length of LD blocks is 50 kb (unpublished data).
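The r2 values cited in this part of the discussion measure pairwise LD between two biallelic loci. As a reminder of what the statistic captures, the following minimal sketch computes it from phased haplotypes with the standard formula r2 = D^2 / [pA(1 − pA)pB(1 − pB)], where D = pAB − pA·pB. The function name and the 0/1 allele coding are assumptions for illustration and are not taken from the cited studies.

```python
def r_squared(haplotypes):
    """Pairwise LD (r2) between two biallelic loci.

    haplotypes: list of (a, b) tuples, one per phased chromosome, with the
    allele at each locus coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # allele 1 at locus A
    p_b = sum(b for _, b in haplotypes) / n            # allele 1 at locus B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        return float("nan")                            # a locus is monomorphic
    return d * d / denominator
```

Two loci whose alleles always occur together give r2 = 1, whereas loci whose alleles combine at random give r2 = 0.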
Core Regions in Candidate QTL Quantitative trait loci provide a way to associate segments of genome locations with quantitative traits in chicken. To date, more than 3,162 QTL representing 270 different traits have been reported in 158 publications (Hu et al., 2010). Ten QTL, well-reported to be related to egg production, egg quality, growth, and disease resistance traits, were selected from the chicken QTL database to examine the validity of EHH analysis and a brief overview of recent selection signatures in chicken QTL was obtained. As shown in Table 2, the mean proportion of length covered by core regions ver-
Figure 3. Genome-wide P-values for core haplotype with frequency >0.25.
between commercial white and brown layers. In white layers, r2 = 0.73 ± 0.36 was observed at short pairwise distances (<25 kb), and r2 dropped to 0.60 ± 0.38 at distances of 75 to 120 kb (Qanbari et al., 2010a), whereas in brown layers r2 dropped to 0.21 ± 0.26 within 100 kb. In the current study, 38,655 valid SNP with an average interval of 26 kb were used to investigate the LD pattern. Similar to the previous study, a mean LD block size of around 105 kb was observed. Wragg et al. (2012) reported the genome-wide structure of traditional and domestic village chickens and indicated a median haplotype block size of 11 to 12 kb. We are inclined to believe that this median block size represents the ancestral haplotype blocks. In WL, a commercial layer line, the LD in the vicinity of alleles under strong positive selection will be degraded relatively slowly, showing a
Table 4. Description of partial genes identified by relatively extended haplotype homozygosity (REHH) with P < 0.01
Feature name | GGA1 | GeneID | REHH | P-value | Description | Reference
NRF1 | 1 | 416677 | 3.79 | 0.00609 | Nuclear respiratory factor 1 | Chen et al. (1997)
PRL | 2 | 396453 | 3.09 | 0.00423 | Prolactin | Scanes et al. (1975)
POLG | 10 | 404292 | 4.37 | 0.00773 | Polymerase (DNA directed), gamma | Nguyen et al. (2005)
NCKX1 | 10 | 414892 | 3.56 | 0.00431 | Potassium-dependent sodium-calcium exchanger | Fain et al. (2001)
TBC1D24 | 14 | 416753 | 5.72 | 0.00237 | TBC1 domain family, member 24 | Corbett et al. (2010)
LHX2 | 17 | 395705 | 4.64 | 0.00893 | LIM homeobox 2 | Lie et al. (2005)
SLC2A10 | 20 | 419206 | 7.45 | 0.00095 | Solute carrier family 2 (facilitated glucose transporter), member 10 | Willaert et al. (2012)
EYA2 | 20 | 395745 | 7.45 | 0.00095 | Eyes absent homolog 2 (Drosophila) | Mishima and Tomarev (1998)
SFRP1 | 22 | 395237 | 6.15 | 0.00123 | Secreted frizzled-related protein 1 | Meng et al. (2011)
HNRPM | 28 | 420054 | 6.08 | 0.00679 | Heterogeneous nuclear ribonucleoprotein M | Elleder et al. (2004)
1GGA = Gallus gallus chromosome.
sus QTL length exceeded 73%, much higher than the genome-wide average (55%). The EHH tests on QTL also showed a relatively high mean value of REHH (2.75). All the evidence suggests that these QTL have been or are undergoing positive selection.
Gene Annotations
A whole-genome screen for positively selected regions was conducted, and to assess the influence of positive selection on genes/CDSs, we aligned core regions to the chicken genome sequence. A total of 185 genes/CDSs were identified, which are associated not only with egg production traits but also with growth traits. A polypeptide hormone secreted by the anterior pituitary gland, PRL, was isolated from pituitaries of chickens (Scanes et al., 1975). It has been well reported to be an indicator of broodiness and plays a crucial role in egg production (Buntin, 1996; Sharp, 1997). In addition to its wide influence on reproductive behavior, PRL has been shown to have a diverse spectrum of biological activities and functions in birds (Denbow, 1986; Skwarlo-Sonta et al., 1987; Edens, 2011). The EYA2 gene, an important gene in cell proliferation, differentiation, and death during organogenesis, shows a dynamic expression pattern in different tissues of diverse embryological origin (Bonini et al., 1993). The NCKX1 gene product is found in retinal rod photoreceptors, and the physiological role of NCKX proteins in retinal rod and cone photoreceptors is to extrude Ca2+ in the dark and to increase the activity of guanylyl cyclase by lowering cytosolic Ca2+. The NCKX proteins are important for the process of light adaptation and recovery from previous illumination (Fain et al., 2001). This parallels the result reported by Rubin et al. (2010), in whose study thyroid stimulating hormone receptor (TSHR), a gene having a pivotal role in metabolic regulation and photoperiod control of reproduction in vertebrates, was identified within the most striking selective sweep. Nuclear respiratory factor 1 (NRF1) encodes a protein that activates the expression of some key metabolic genes regulating signaling pathways required for respiration and heme biosynthesis (Chen et al., 1997; Qu et al., 2011).
The LHX2 and SFRP1 genes are involved in the Wnt signaling pathway, and DNA polymerase gamma (POLG) has been associated with Alpers syndrome in humans (Nguyen et al., 2005; Meng et al., 2011; Peukert et al., 2011). Within the most recent century, chickens have been intensively selected for egg laying and meat production, and poultry farming methods have changed dramatically, so it is not surprising that genes related to egg production, metabolism, and response to illumination show signatures of recent selection. In our previous study, 8 SNP showing genome-wise significant association with egg production and quality traits were revealed. Some significant SNP were located in genes, including GRB14 and GALNT1, which can affect the development and function of the ovary. Three
ACKNOWLEDGMENTS This work was supported in part by the Key Project of Chinese Ministry of Education (No. 104241 and 02183), the National Scientific Supporting Projects of China (2011BAD28B03), and Programs for Changjiang Scholars and Innovative Research in University (IRT0945 and IRT1191).
REFERENCES
Abasht, B., E. Sandford, J. Arango, P. Settar, J. E. Fulton, N. P. O’Sullivan, A. Hassen, D. Habier, R. L. Fernando, J. C. Dekkers, and S. J. Lamont. 2009. Extent and consistency of linkage disequilibrium and identification of DNA markers for production and egg quality traits in commercial layer chicken populations. BMC Genomics 10(Suppl. 2):S2.
Aerts, J., H. J. Megens, T. Veenendaal, I. Ovcharenko, R. Crooijmans, L. Gordon, L. Stubbs, and M. Groenen. 2007. Extent of linkage disequilibrium in chicken. Cytogenet. Genome Res. 117:338–345.
Andreescu, C., S. Avendano, S. R. Brown, A. Hassen, S. J. Lamont, and J. C. Dekkers. 2007. Linkage disequilibrium in related breeding lines of chickens. Genetics 177:2161–2169.
Atzmon, G., S. Blum, M. Feldman, A. Cahaner, U. Lavi, and J. Hillel. 2008. QTLs detected in a multigenerational resource chicken population. J. Hered. 99:528–538.
Bonini, N. M., W. M. Leiserson, and S. Benzer. 1993. The eyes absent gene: Genetic control of cell survival and differentiation in the developing Drosophila eye. Cell 72:379–395.
Buntin, J. D. 1996. Neural and hormonal control of parental behavior in birds. Pages 161–213 in Advances in the Study of Behavior. S. R. Jay and T. S. Charles, ed. Academic Press, Washington, DC.
Chen, S., P. L. Nagy, and H. Zalkin. 1997. Role of NRF-1 in bidirectional transcription of the human GPAT-AIRC purine biosynthesis locus. Nucleic Acids Res. 25:1809–1816.
Corbett, M. A., M. Bahlo, L. Jolly, Z. Afawi, A. E. Gardner, K. L. Oliver, S. Tan, A. Coffey, J. C. Mulley, L. M. Dibbens, W. Simri, A. Shalata, S. Kivity, G. D. Jackson, S. F. Berkovic, and J. Gecz. 2010. A focal epilepsy and intellectual disability syndrome is due to a mutation in TBC1D24. Am. J. Hum. Genet. 87:371–375.
Denbow, D. M. 1986. The influence of prolactin on food intake of turkey hens. Poult. Sci. 65:1197–1200.
Edens, F. W. 2011. Gender, age reproductive status effects on serum prolactin concentrations in different varieties and species of poultry. Int. J. Poult. Sci. 10:832–838.
Elleder, D., J. Plachy, J. Hejnar, J. Geryk, and J. Svoboda. 2004. Close linkage of genes encoding receptors for subgroups A and C of avian sarcoma/leucosis virus on chicken chromosome 28. Anim. Genet. 35:176–181.
Fain, G. L., H. R. Matthews, M. C. Cornwall, and Y. Koutalos. 2001. Adaptation in vertebrate photoreceptors. Physiol. Rev. 81:117–151.
Groenen, M. A., H. J. Megens, Y. Zare, W. C. Warren, L. W. Hillier, R. P. Crooijmans, A. Vereijken, R. Okimoto, W. M. Muir, and H. H. Cheng. 2011. The development and characterization of a 60K SNP chip for chicken. BMC Genomics 12:274.
Grossman, S. R., I. Shylakhter, E. K. Karlsson, E. H. Byrne, S. Morales, G. Frieden, E. Hostetter, E. Angelino, M. Garber, and O. Zuk. 2010. A composite of multiple signals distinguishes causal variants in regions of positive selection. Science 327:883.
Heifetz, E. M., J. E. Fulton, N. P. O’Sullivan, J. A. Arthur, H. Cheng, J. Wang, M. Soller, and J. C. Dekkers. 2009. Mapping QTL affecting resistance to Marek’s disease in an F6 advanced intercross population of commercial layer chickens. BMC Genomics 10:20.
Hu, Z.-L., C. A. Park, E. R. Fritz, and J. M. Reecy. 2010. QTLdb: A Comprehensive Database Tool Building Bridges between Genotypes and Phenotypes in Invited Lecture with full paper published electronically on The 9th World Congress on Genetics Applied to Livestock Production, Leipzig, Germany, August 1–6, 2010.
ICGSC. 2004. Sequence and comparative analysis of the chicken genome provide unique perspectives on vertebrate evolution. Nature 432:695–716.
Lie, D. C., S. A. Colamarino, H. J. Song, L. Desire, H. Mira, A. Consiglio, E. S. Lein, S. Jessberger, H. Lansford, A. R. Dearie, and F. H. Gage. 2005. Wnt signalling regulates adult hippocampal neurogenesis. Nature 437:1370–1375.
Liu, W., D. Li, J. Liu, S. Chen, L. Qu, J. Zheng, G. Xu, and N. Yang. 2011. A genome-wide SNP scan reveals novel loci for egg
SNP, rs13968878, rs13978498, and GGaluGA059301, were located at a mean distance of 270 kb from the nearest core regions (REHH > 2, P-value < 0.05). These results give us confidence to conduct further in-depth studies. Although 185 genes were identified and some of them have been reported before, a greater number have not yet been studied in the chicken. Traditional QTL mapping typically yields a low-resolution map, which makes it difficult to identify the QTG or QTN underlying a particular phenotype. The mean length of the core regions detected by the EHH test is no more than 250 kb, a distance short enough to locate single candidate genes. Although the EHH test bridges the gap between QTL and QTG, identifying the causal mutation remains difficult. Additional studies, especially at the transcript level and with complete sequencing of the candidate genes, are needed to clarify the mechanisms underlying the phenotypic traits. Considering that the EHH test may lack sensitivity for identifying lower-frequency selected alleles (Grossman et al., 2010), we focused on high-frequency functional genes (haplotype frequency > 25%). Because the significant core regions identified were large and contained many genes, we applied a filtering criterion of REHH values greater than 1. These settings may decrease the power of the EHH test; on the other hand, they help to minimize false positives and give more reliable results. To make full use of the identified genes, further bioinformatic studies, especially gene ontology analysis, will be needed. In addition, a denser chip, especially one with more SNP on GGA16, a chromosome that is very important for disease resistance in poultry, together with comparative studies between different populations, would allow a more reliable and comprehensive understanding of the footprints of selection.
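The two filters described above (core haplotype frequency > 25% and REHH > 1) can be illustrated with a minimal Python sketch. It assumes the pairwise-identity definition of EHH given by Sabeti et al. (2002), with REHH taken as the EHH of one core haplotype divided by the pooled EHH of all other core haplotypes at the same locus. The function names, the input layout, and the toy haplotypes are hypothetical and do not represent the pipeline or data used in this study; P-values are not computed here.

```python
from collections import Counter, defaultdict
from math import comb


def pair_counts(chromosomes):
    """Group chromosomes by core haplotype.

    chromosomes: iterable of (core, extended) string pairs, where `extended`
    is the haplotype from the core out to a chosen distance (hypothetical layout).
    Returns {core: (n_carriers, n_pairs_with_identical_extended_haplotype)}.
    """
    by_core = defaultdict(Counter)
    for core, extended in chromosomes:
        by_core[core][extended] += 1
    return {
        core: (sum(ext.values()), sum(comb(e, 2) for e in ext.values()))
        for core, ext in by_core.items()
    }


def ehh_and_rehh(chromosomes):
    """Return {core: (frequency, EHH, REHH)} at one distance from the core."""
    counts = pair_counts(chromosomes)
    total = sum(n for n, _ in counts.values())
    nan = float("nan")
    result = {}
    for core, (n, same) in counts.items():
        # EHH: share of carrier pairs that are identical over the extended region.
        ehh = same / comb(n, 2) if n > 1 else nan
        # Pooled EHH of every other core haplotype at this locus.
        other_same = sum(s for c, (_, s) in counts.items() if c != core)
        other_pairs = sum(comb(m, 2) for c, (m, _) in counts.items() if c != core)
        ehh_bar = other_same / other_pairs if other_pairs else nan
        # REHH is left undefined (NaN) when the pooled EHH is zero or missing.
        rehh = ehh / ehh_bar if ehh_bar > 0 else nan
        result[core] = (n / total, ehh, rehh)
    return result


def candidates(stats, min_freq=0.25, min_rehh=1.0):
    """Keep core haplotypes passing both thresholds quoted in the text."""
    return sorted(
        core for core, (freq, _, rehh) in stats.items()
        if freq > min_freq and rehh > min_rehh
    )


# Toy data (invented): 12 chromosomes carrying 3 core haplotypes.
toy = (
    [("AA", "CGT")] * 5 + [("AA", "CGA")]
    + [("AG", "CGT")] * 2 + [("AG", "TGA")] * 2
    + [("GG", "TTA"), ("GG", "CTA")]
)
stats = ehh_and_rehh(toy)
print(candidates(stats))
```

On the toy data only core haplotype AA passes both filters: AG is frequent enough but has an REHH below 1, and GG fails both thresholds. Raising `min_rehh` trades power for fewer false positives, which is the compromise discussed above.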
In conclusion, the current study presents a genome-wide map of LD extent and distribution and of selection footprints in the chicken genome. The EHH test on 10 QTL related to egg production, egg quality, growth, and disease resistance traits provided a brief overview of recent selection signatures in chicken QTL. In addition, 185 candidate genes/CDSs with top P-values and a slower decay of haplotype homozygosity were reported. A panel of genes, including PRL, NCKX1, NRF1, LHX2, and SFRP1, seemed to have significant effects on economically important traits in poultry production. Additional studies, especially comparisons of different populations, are needed to confirm and refine our results.
Sasaki, O., S. Odawara, H. Takahashi, K. Nirasawa, Y. Oyamada, R. Yamamoto, K. Ishii, Y. Nagamine, H. Takeda, E. Kobayashi, and T. Furukawa. 2004. Genetic mapping of quantitative trait loci affecting body weight, egg character and egg production in F2 intercross chickens. Anim. Genet. 35:188–194.
Scanes, C. G., N. J. Bolton, and A. Chadwick. 1975. Purification and properties of an avian prolactin. Gen. Comp. Endocrinol. 27:371–379.
Scheet, P., and M. Stephens. 2006. A fast and flexible statistical model for large-scale population genotype data: Applications to inferring missing genotypes and haplotypic phase. Am. J. Hum. Genet. 78:629–644.
Schreiweis, M. A., P. Y. Hester, and D. E. Moody. 2005. Identification of quantitative trait loci associated with bone traits and body weight in an F2 resource population of chickens. Genet. Sel. Evol. 37:677–698.
Schreiweis, M. A., P. Y. Hester, P. Settar, and D. E. Moody. 2006. Identification of quantitative trait loci associated with egg quality, egg production, and body weight in an F2 resource population of chickens. Anim. Genet. 37:106–112.
Sharp, P. J. 1997. Neurobiology of the onset of incubation behaviour in birds. Pages 193–202 in Frontiers in Environmental and Metabolic Endocrinology. S. K. Maitra, ed. Burdwan University Press, West Bengal.
Skwarlo-Sonta, K., J. Sotowska-Brochocka, D. Rosolowska-Huszcz, E. Pawlowska-Wojewodka, A. Gajewska, D. Stepien, and K. Kochman. 1987. Effect of prolactin on the diurnal changes in immune parameters and plasma corticosterone in White Leghorn chickens. Acta Endocrinol. (Copenh.) 116:172–178.
Tang, K., K. R. Thornton, and M. Stoneking. 2007. A new approach for using genome scans to detect recent positive selection in the human genome. PLoS Biol. 5:e171.
Tercic, D., A. Holcman, P. Dovc, D. R. Morrice, D. W. Burt, P. M. Hocking, and S. Horvat. 2009. Identification of chromosomal regions associated with growth and carcass traits in an F(3) full sib intercross line originating from a cross of chicken lines divergently selected on body weight. Anim. Genet. 40:743–748.
Wardecka, B., R. Olszewski, K. Jaszczak, G. Zieba, M. Pierzchala, and K. Wicinska. 2002. Relationship between microsatellite marker alleles on chromosomes 1-5 originating from the Rhode Island Red and Green-legged Partrigenous breeds and egg production and quality traits in F(2) mapping population. J. Appl. Genet. 43:319–329.
Willaert, A., S. Khatri, B. L. Callewaert, P. J. Coucke, S. D. Crosby, J. G. Lee, E. C. Davis, S. Shiva, M. Tsang, A. De Paepe, and Z. Urban. 2012. GLUT10 is required for the development of the cardiovascular system and the notochord and connects mitochondrial function to TGFbeta signaling. Hum. Mol. Genet. 21:1248–1259.
Wragg, D., J. M. Mwacharo, J. A. Alcalde, P. M. Hocking, and O. Hanotte. 2012. Analysis of genome-wide structure, diversity and fine mapping of Mendelian traits in traditional and village chickens. Heredity In Press.
Yonash, N., L. D. Bacon, R. L. Witter, and H. H. Cheng. 1999. High resolution mapping and identification of new quantitative trait loci (QTL) affecting susceptibility to Marek’s disease. Anim. Genet. 30:126–135.
Zhang, L., N. Yang, J. Huang, R. J. Buckanovich, S. Liang, A. Barchetti, C. Vezzani, A. O’Brien-Jenkins, J. Wang, M. R. Ward, M. C. Courreges, S. Fracchioli, A. Medina, D. Katsaros, B. L. Weber, and G. Coukos. 2005. Transcriptional coactivator Drosophila eyes absent homologue 2 is up-regulated in epithelial ovarian cancer and promotes tumor growth. Cancer Res. 65:925–932.
production and quality traits in White Leghorn and brown-egg dwarf layers. PLoS ONE 6:e28600.
McElroy, J. P., J. J. Kim, D. E. Harry, S. R. Brown, J. C. Dekkers, and S. J. Lamont. 2006. Identification of trait loci affecting white meat percentage and other growth and carcass traits in commercial broiler chickens. Poult. Sci. 85:593–605.
Meng, Y., Q. G. Wang, J. X. Wang, S. T. Zhu, Y. Jiao, P. Li, and S. T. Zhang. 2011. Epigenetic inactivation of the SFRP1 gene in esophageal squamous cell carcinoma. Dig. Dis. Sci. 56:3195–3203.
Meuwissen, T. H., B. J. Hayes, and M. E. Goddard. 2001. Prediction of total genetic value using genome-wide dense marker maps. Genetics 157:1819–1829.
Mishima, N., and S. Tomarev. 1998. Chicken Eyes absent 2 gene: Isolation and expression pattern during development. Int. J. Dev. Biol. 42:1109–1115.
Nguyen, K. V., E. Ostergaard, S. H. Ravn, T. Balslev, E. R. Danielsen, A. Vardag, P. J. McKiernan, G. Gray, and R. K. Naviaux. 2005. POLG mutations in Alpers syndrome. Neurology 65:1493–1495.
Nielsen, R. 2005. Molecular signatures of natural selection. Annu. Rev. Genet. 39:197–218.
Peukert, D., S. Weber, A. Lumsden, and S. Scholpp. 2011. Lhx2 and Lhx9 determine neuronal differentiation and compartition in the caudal forebrain by regulating Wnt signaling. PLoS Biol. 9:e1001218.
Pritchard, J. K., and A. Di Rienzo. 2010. Adaptation – Not by sweeps alone. Nat. Rev. Genet. 11:665–667.
Purcell, S., B. Neale, K. Todd-Brown, L. Thomas, M. A. Ferreira, D. Bender, J. Maller, P. Sklar, P. I. de Bakker, M. J. Daly, and P. C. Sham. 2007. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am. J. Hum. Genet. 81:559–575.
Qanbari, S., M. Hansen, S. Weigend, R. Preisinger, and H. Simianer. 2010a. Linkage disequilibrium reveals different demographic history in egg laying chickens. BMC Genet. 11:103.
Qanbari, S., E. C. G. Pimentel, J. Tetens, G. Thaller, P. Lichtner, A. R. Sharifi, and H. Simianer. 2010b. A genome-wide scan for signatures of recent selection in Holstein cattle. Anim. Genet. 41:377–389.
Qu, L., B. He, Y. Pan, Y. Xu, C. Zhu, Z. Tang, Q. Bao, F. Tian, and S. Wang. 2011. Association between polymorphisms in RAPGEF1, TP53, NRF1 and type 2 diabetes in Chinese Han population. Diabetes Res. Clin. Pract. 91:171–176.
Rao, Y. S., Y. Liang, M. N. Xia, X. Shen, Y. J. Du, C. G. Luo, Q. H. Nie, H. Zeng, and X. Q. Zhang. 2008. Extent of linkage disequilibrium in wild and domestic chicken populations. Hereditas 145:251–257.
Rubin, C. J., M. C. Zody, J. Eriksson, J. R. Meadows, E. Sherwood, M. T. Webster, L. Jiang, M. Ingman, T. Sharpe, S. Ka, F. Hallbook, F. Besnier, O. Carlborg, B. Bed’hom, M. Tixier-Boichard, P. Jensen, P. Siegel, K. Lindblad-Toh, and L. Andersson. 2010. Whole-genome resequencing reveals loci under selection during chicken domestication. Nature 464:587–591.
Sabeti, P. C., D. E. Reich, J. M. Higgins, H. Z. Levine, D. J. Richter, S. F. Schaffner, S. B. Gabriel, J. V. Platko, N. J. Patterson, G. J. McDonald, H. C. Ackerman, S. J. Campbell, D. Altshuler, R. Cooper, D. Kwiatkowski, R. Ward, and E. S. Lander. 2002. Detecting recent positive selection in the human genome from haplotype structure. Nature 419:832–837.
Sabeti, P. C., S. F. Schaffner, B. Fry, J. Lohmueller, P. Varilly, O. Shamovsky, A. Palma, T. S. Mikkelsen, D. Altshuler, and E. S. Lander. 2006. Positive natural selection in the human lineage. Science 312:1614–1620.