Genetic Basis of Adaptation and Maladaptation via Balancing Selection

Accepted Manuscript

Title: Genetic Basis of Adaptation and Maladaptation via Balancing Selection
Authors: Manoj Kumar Gupta, Ramakrishna Vadde
PII: S0944-2006(19)30038-8
DOI: https://doi.org/10.1016/j.zool.2019.125693
Article Number: 125693
Reference: ZOOL 125693
To appear in: Zoology
Received date: 7 March 2019
Accepted date: 3 July 2019

Please cite this article as: Gupta MK, Vadde R, Genetic Basis of Adaptation and Maladaptation via Balancing Selection, Zoology (2019), https://doi.org/10.1016/j.zool.2019.125693

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Manoj Kumar Gupta and Ramakrishna Vadde*

Department of Biotechnology & Bioinformatics, Yogi Vemana University, Kadapa-516003, Andhra Pradesh, India.

*Corresponding Author: Ramakrishna Vadde
Department of Biotechnology & Bioinformatics, Yogi Vemana University, Kadapa-516003, Andhra Pradesh, India
Email: [email protected]; [email protected]

Highlights

Natural selection gives rise to adaptive evolution.
Balancing selection maintains advantageous alleles at intermediate frequency.
The impact of balancing selection on the genome varies across time.
When individuals migrate to a contrasting dietary lifestyle, maladaptation may occur.

Abstract

Since humans left Africa about 100 thousand years ago, they have experienced numerous environmental and social transitions. During these transitions, their genome experienced various forms of selective pressure, retaining advantageous alleles through either positive selection or balancing selection while removing deleterious alleles through purifying selection. However, when an individual carrying certain advantageous genetic diversity migrates to a new environment or lifestyle, that diversity can become disadvantageous and ultimately cause maladaptation. Thus, understanding the role of evolution in adaptation and in the regulation of population dynamics is highly important for identifying naturally occurring advantageous or disease-risk alleles in contemporary populations. Recent advances in high-throughput sequencing technologies have made it easier to understand the impact of evolutionary forces on the genetic make-up of humans under different environmental and social conditions. The statistical tests described in this review will enable readers to identify signatures of balancing selection at different time scales more comprehensively. Additionally, these tests will help in identifying naturally occurring advantageous or disease-risk alleles, with applications in animal breeding, nature conservation and human medicine.

Keywords: balancing selection; population genomics; evolution; heterozygote advantage; frequency-dependent selection; diversity; polymorphism; maladaptation.

1. Introduction

The theory of evolution describes how populations of organisms have changed over time (Darwin, 1859).

Evolution does not denote changes that occur in an individual within its lifetime; it refers to changes in allele frequencies in populations over generations (Arber, 2008). Evolution may occur either on a small scale, i.e. within a single population (microevolution), or on a scale that exceeds the boundaries of a single species (macroevolution). Microevolution generally denotes changes in allele frequency within a species that eventually affect the phenotype of organisms within that species. For instance, genetic variation within Enterococci bacteria provides resistance to several kinds of antibiotics. Comparative studies on skeletal samples from contemporary and ancient humans have reported an increase in body dimensions (for instance, weight and BMI), a reduction in skeletal robustness and an increase in morphological variability in contemporary populations (Ruhli and Henneberg, 2011). Macroevolution denotes changes that occur above the species level, for instance, the emergence of a new genus or the extinction of a species (Erwin, 2010). Four basic mechanisms govern changes in allele frequencies in a population: mutation, genetic drift, gene flow and natural selection (Ayala, 1983; Hartl et al., 1997; Erwin, 2010). Mutation is the ultimate source of all genetic variation in the genome (Barton N. H., 2010). Genetic drift occurs due to "sampling error" in selecting alleles for the next generation from the gene pool of the present generation. Though genetic drift happens in populations of all sizes, its effect is stronger in small populations. Gene flow, or gene migration, is the movement of genes into or out of a population, e.g. the migration of human populations from rural to urban areas or the transport of pollen grains by various agents to new places (Ayala, 1983; Hartl et al., 1997; Arber, 2008). Natural selection arises when, in a population, individuals with certain genotypes are more likely to survive and reproduce than individuals with other genotypes, and hence pass on their alleles to the subsequent generation. This type of selection gives rise to adaptive evolution (Ayala, 1983; Hartl et al., 1997; Erwin, 2010). Therefore, natural selection is not “survival of the fittest” but “reproduction of the fittest”.

There are four basic ways, namely directional selection, stabilizing selection, disruptive (diversifying) selection and balancing selection, by which natural selection can affect the phenotypic distribution in a population. In directional selection, only a single phenotype is favored, causing allele frequency to shift continuously in one direction (Goldberg, 2010). Industrial melanism in Biston betularia (L.) is one of the best examples of directional selection (Cook and Saccheri, 2013). Biston betularia exists in three morphs, namely melanic “carbonaria”, non-melanic black-and-white “typica” and intermediate melanic “insularia” (Majerus, 1998). Before 1848, Biston betularia existed mainly as “typica”, and this color pattern enabled the moths to escape predation by birds through camouflage against lichen-covered tree trunks. However, with the air pollution of the nineteenth-century industrial revolution, soot covered almost all the trees and killed the light-colored lichens. This made “carbonaria” moths less conspicuous to bird predators than “typica” moths, increasing predation on “typica” moths and decreasing predation on “carbonaria” moths (Cook and Saccheri, 2013). Eventually, Biston betularia in industrialized areas of England were predominantly of the “carbonaria” variety. Several studies have also reported that, in postindustrial contemporary human populations, directional selection has reduced height in females (Byars et al., 2010; Stearns et al., 2012), increased the age at menopause (Tropf et al., 2015), lowered the age at first birth in females (Tropf et al., 2015) and increased weight in females (Byars et al., 2010; Stearns et al., 2012).

The phenotype of an organism also depends on stabilizing or disruptive selection. While stabilizing selection tends to decrease phenotypic variation, disruptive selection tends to increase it (Robertson, 1956; Turelli, 1984). There are only a few examples of stabilizing selection in humans. Though some studies have reported that human birth weight is under stabilizing selection (Karn and Penrose, 1951), one study reported that the intensity of stabilizing selection has declined in postindustrial societies (Ulizzi and Terrenato, 1992). In disruptive (or diversifying) selection, extreme values of a trait are favored over intermediate values. For instance, in an environment with very dark soil and very light rock, the frequency of mice with either very dark or very light coloration, which are camouflaged, would increase, while the frequency of intermediate-colored mice would decrease due to predation (Goldberg, 2010).
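The statement in the introduction that genetic drift is a "sampling error" whose effect is stronger in small populations can be illustrated with a minimal Wright-Fisher style sketch. The function names, population sizes and replicate counts below are hypothetical choices made for illustration; the model assumes a diploid population of constant size with no selection, mutation or migration.

```python
import random


def expected_drift_variance(p, pop_size):
    """Variance of the allele frequency after one generation of
    binomial sampling of 2N gene copies (assumed diploid population)."""
    return p * (1.0 - p) / (2 * pop_size)


def simulate_drift(pop_size, p0, generations, rng):
    """Return the allele-frequency trajectory of one population under drift alone."""
    n_copies = 2 * pop_size
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        # Each gene copy of the next generation is an independent draw
        # from the current gene pool: this is the "sampling error".
        count = sum(1 for _ in range(n_copies) if rng.random() < p)
        p = count / n_copies
        trajectory.append(p)
    return trajectory


def mean_one_generation_shift(pop_size, p0, replicates, rng):
    """Average absolute frequency change after one generation across replicates."""
    shifts = [abs(simulate_drift(pop_size, p0, 1, rng)[-1] - p0)
              for _ in range(replicates)]
    return sum(shifts) / replicates


if __name__ == "__main__":
    rng = random.Random(1)
    for n in (10, 1000):
        print(n, expected_drift_variance(0.5, n),
              mean_one_generation_shift(n, 0.5, 200, rng))
```

Under these assumptions the one-generation variance is p(1 - p)/(2N), so a population of 10 individuals is expected to fluctuate far more per generation than a population of 1000.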

Balancing selection is defined as a selective force by which different alleles are actively maintained in the gene pool of a population at frequencies higher than expected from genetic drift alone (King et al., 2007). In balancing selection, advantageous genetic diversity is maintained at intermediate frequency in a population (Hartl et al., 1997; Ridley, 2009; Sabeti et al., 2006; Baum et al., 2013; Dobzhansky and Dobzhansky, 1982). The best-known example of balancing selection is heterozygote advantage at the β-globin gene, which protects African human populations against malaria (Allison, 1954; Pasvol et al., 1978; Verrelli et al., 2002; Andrés et al., 2009; Hedrick, 2011). Balancing selection has also been reported in the ABO blood group in primates (Ségurel et al., 2012), the major histocompatibility complex (MHC) in vertebrates (Klein et al., 2007) and disease resistance (R) genes in plants (Karasov et al., 2014). Recently, Brandt and colleagues reported the first probable “divergent overdominance” mechanism for the nature of balancing selection on HLA genes across human populations (Brandt et al., 2018). Thus, such advantageous genetic diversity increases the fitness of an individual in a given environment. However, when an individual with certain advantageous genetic diversity migrates to a new environment or lifestyle, including food habits, that contrasts with the previous one, the advantageous genetic diversity becomes disadvantageous, which in turn may lead to maladaptation. Pleasants reported that migration of people with sickle-cell disease has resulted in flow of the defective gene from areas with a high frequency of the sickle-cell gene, such as Africa and India, to Western Europe, the eastern coast of South America, and North America (Pleasants, 2014). Because of this migration, sickle cell anemia is becoming prevalent in places that were not previously associated with the disorder. Another instance of maladaptation is hypertension in contemporary populations living in a high-salt environment (Balaresque et al., 2007). One study examined the effect of selection on 21 human disease variants in relation to the migration out of Africa and climate variables and reported that Type 2 Diabetes increases greatly with specific dietary customs (Blair and Feldman, 2015), for instance, high carbohydrate intake and sedentary lifestyle. Recently, a cross-sectional study examined the effect of diet in two Indian communities with sedentary lifestyles. Diabetes incidence in the lacto-vegetarian group (1.7%) was lower than in the non-vegetarian group (5.3%) despite similar lipid profiles and body mass index (BMI)/waist circumference (WC) between the groups (Praharaj et al., 2017). This raises the questions of which genes in the genome are under balancing selection and what causes maladaptation. Recent improvements in high-throughput sequencing technology have provided an opportunity to address these questions more comprehensively (Gupta et al., 2017; Wang and Mitchell-Olds, 2017; Brandt et al., 2018). Several classic cases have been restudied by applying various statistical tests to high-throughput sequencing data, and these studies reported clear evidence of different forms of selection. These tests can also be used to distinguish balancing selection from other forms of natural selection (Charlesworth, 2006).

This review deals with the effect of balancing selection on the genome and with the statistical tests that enable us to identify different signatures of balancing selection at various timescales. In addition, this review describes how a mismatch between two environmental conditions or lifestyles has turned the advantageous genetic diversity of ancient populations into a burden in contemporary populations.

2. Balancing selection

The “balance hypothesis” proposes that balancing selection is mainly responsible for maintaining advantageous genetic diversity in a population via heterozygote advantage, negative frequency dependent selection, and temporal or spatial habitat heterogeneity (Dobzhansky and Dobzhansky, 1982).

2.1 Heterozygote advantage

Under heterozygote advantage, also known as overdominant selection, heterozygotes are fitter than homozygotes (Smith, 1981; Semel et al., 2006; Dong et al., 2007; Andrés et al., 2009). Fisher first demonstrated that when homozygotes are less fit than the heterozygote, a stable polymorphism arises for a pair of alleles of a gene (Fisher, 1923). It was earlier believed that, because so much genetic variation is observed in nature, heterozygote advantage is common in natural populations (Dobzhansky, 1955; Hartl et al., 1997). Conversely, some authors claimed that, as most novel mutations are deleterious and are removed by natural selection, the majority of genes are homozygous for the wild type in nature (Muller, 1950). However, all these studies considered only specific variable traits with a known genetic basis, for instance, human blood groups and snail shell patterns, for estimating genetic variation in different organisms. Later, Lewontin and Hubby used acrylamide gel electrophoresis to examine genetic variation at dozens of loci in Drosophila pseudoobscura and reported that the same protein isolated from different members of the population migrated across the gel at different speeds, which is a signature of genetic variation (Lewontin and Hubby, 1966). Though earlier electrophoretic studies of isozymes had reported considerable variation in Drosophila, Lewontin and Hubby were the first to measure this variation quantitatively, and they reported heterozygosity at 36% of loci in the D. pseudoobscura genome (Lewontin and Hubby, 1966). Lewontin and Hubby also stated that if a large number of loci show heterozygote advantage, there will be a huge number of deleterious homozygotes, which will greatly reduce the fitness of an individual (Crow and Dove, 2000). In contrast, a small number of loci with heterozygote advantage will increase fitness.
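The stable polymorphism that arises when both homozygotes are less fit than the heterozygote can be made concrete with a short deterministic sketch. It assumes a single locus with two alleles, random mating, and relative fitnesses 1 - s, 1 and 1 - t for the two homozygotes and the heterozygote; the selection coefficients used in the example are hypothetical and are not estimates for any real locus. Under these assumptions the frequency of the second allele converges to s / (s + t) regardless of the starting frequency.

```python
def next_frequency(q, s, t):
    """One generation of selection on allele S, currently at frequency q.

    Assumed relative fitnesses: AA = 1 - s, AS = 1, SS = 1 - t.
    """
    p = 1.0 - q
    mean_fitness = p * p * (1.0 - s) + 2.0 * p * q + q * q * (1.0 - t)
    # S alleles come from half of the AS heterozygotes and from all SS homozygotes.
    return (p * q + q * q * (1.0 - t)) / mean_fitness


def iterate(q0, s, t, generations):
    """Apply the selection recursion for a number of generations."""
    q = q0
    for _ in range(generations):
        q = next_frequency(q, s, t)
    return q


def equilibrium(s, t):
    """Predicted stable frequency of allele S under heterozygote advantage."""
    return s / (s + t)


if __name__ == "__main__":
    s, t = 0.1, 0.8  # hypothetical selection coefficients
    print(equilibrium(s, t), iterate(0.01, s, t, 2000), iterate(0.9, s, t, 2000))
```

Starting the recursion from a rare allele (0.01) or a common one (0.9) leads to the same intermediate frequency, which is the behaviour described as balanced polymorphism.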

A typical example of balancing selection via heterozygote advantage is the β-globin gene in human populations exposed to malaria. Earlier studies have reported that numerous point mutations in the β-chain of normal adult hemoglobin (HbA), for instance sickle hemoglobin (HbS) (Williams et al., 2005), provide protection against Plasmodium infection (Williams, 2006). HbA generally comprises two α and two β globin chains. HbS arises from a single point mutation (Glu → Val) at the 6th codon of the β-globin gene (Ingram, 1959). Homozygotes for hemoglobin S (HbSS), with two affected β chains, develop sickle cell disease, in which polymerized hemoglobin causes red blood cells to sickle and occlude blood vessels. Vaso-occlusion affects many organs and tissues and results in high morbidity and mortality. Though heterozygotes for sickle hemoglobin (HbAS) have sickle cell trait, they are generally asymptomatic and are protected against malaria (Gong et al., 2013). In 1981, Smith reported that a gene modulating colour patterns in the polymorphic African butterfly Danaus chrysippus controls sexual selection during different seasons. Mating success of dark homozygotes was highest during the cloudy, cool and rainy season, whereas that of light-colored homozygotes was highest during the hot, dry season. Heterozygotes, however, showed high mating success in both seasons (Smith, 1981). Another instance of heterozygote advantage can be seen at the transferrin locus of pigeons (Hedrick, 2012). Transferrin is mainly responsible for inhibiting the growth of iron-dependent microorganisms (Skaar, 2010; Parrow et al., 2013; Silva and Faustino, 2015), a few of which may cause mortality of pigeon eggs (Frelinger, 1972). The transferrin that protects the egg is produced by the maternal genotype alone and affects the hatchability of the egg; Frelinger reported higher hatchability for eggs from heterozygous mothers than for eggs from homozygous mothers (Frelinger, 1972).

Resistance of brown rats to the rodenticide warfarin, conferred by the dominant allele R at an autosomal locus, is another example of heterozygote advantage (Hedrick, 2012). Resistant animals are heterozygous for the R allele and the wild-type susceptible (S) allele (Rost et al., 2004). Unlike in vertebrates, only a few studies have reported balancing selection in Drosophila (Croze et al., 2017). One study reported that only a small percentage of polymorphisms in Drosophila is maintained via heterozygote advantage (Hedrick, 2012), while a few studies reported no signatures of balancing selection (Croze et al., 2017). For instance, heterozygote advantage is found at the alcohol dehydrogenase (Adh) locus in D. melanogaster, which provided the first insight into DNA-level variation for an enzyme polymorphism (van Delden et al., 1978; Hudson et al., 1987). Several other studies have also reported the involvement of epistasis and heterozygote advantage in the genetic basis of heterosis in Brassica rapa (Dong et al., 2007), maize (Frascaroli et al., 2007), rice (Xiao et al., 1995; Li et al., 1997, 2001; L. Li et al., 2008), Arabidopsis (Melchinger et al., 2007) and tomato (Semel et al., 2006).

2.2 Negative frequency dependent selection

To date, negative frequency dependent selection is considered the most powerful selective force and is responsible for maintaining most balanced polymorphisms (Ayala and Campbell, 1974; Turelli and Barton, 2004; Kazancıoğlu and Arnqvist, 2014; Brisson, 2018). Negative frequency dependent selection is a phenomenon in which more common variants have a selective disadvantage relative to rarer variants. It increases the frequency of rare alleles and protects them from local extinction (Newbigin and Uyenoyama, 2005; Brisson, 2018). This type of selection is also known as rare-allele advantage and favours the accumulation of a large number of low-frequency alleles (Brisson, 2018). As this form of selection favours any allele that is different, it is also called diversifying selection (Hartl et al., 1997). One of the best examples of negative frequency dependent selection is the self-incompatibility system of plants, where only pollen carrying non-self alleles can pollinate a flower. Common variants have a selective disadvantage relative to rare variants because rare variants are compatible with nearly any other plant; as a variant's frequency gradually increases, this advantage diminishes. Negative frequency dependent selection also plays a significant role in maintaining rare phenotypes that have a greater probability of escaping predators (Olendorf et al., 2006). For instance, in natural populations of Cepaea nemoralis snails, color polymorphisms are preserved via negative frequency dependent selection imposed by their predator Turdus philomelos, which searches for the most common morph using a “search image”, resulting in lower predation pressure on rare morphs than on common morphs (Harvey et al., 1975; Allen, 1988). This process continues until the rare morph becomes the common one. Similarly, Fisher's principle shows that the ratio of females to males in human populations is maintained via negative frequency dependent selection: when one sex is more frequent than the other, parents producing the rarer sex gain an advantage in the form of more grandchildren (Fisher, 1930; Edwards, 1998).
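The rare-variant advantage described in this subsection can be sketched with a deliberately simple model. The sketch assumes two haploid morphs whose fitness declines linearly with their own frequency (a hypothetical fitness function with strength s, chosen only for illustration and not taken from any of the cited studies). Under that assumption whichever morph is rarer has the higher fitness, and the frequencies settle at 0.5.

```python
def next_frequency_nfds(p, s):
    """One generation of negative frequency dependent selection.

    p is the frequency of morph A; morph B has frequency 1 - p.
    Assumed fitnesses: w_A = 1 - s * p and w_B = 1 - s * (1 - p).
    """
    q = 1.0 - p
    w_a = 1.0 - s * p  # fitness of A falls as A becomes common
    w_b = 1.0 - s * q  # fitness of B falls as B becomes common
    return p * w_a / (p * w_a + q * w_b)


def iterate_nfds(p0, s, generations):
    """Apply the recursion for a number of generations."""
    p = p0
    for _ in range(generations):
        p = next_frequency_nfds(p, s)
    return p


if __name__ == "__main__":
    s = 0.5  # hypothetical strength of frequency dependence
    print(iterate_nfds(0.05, s, 500), iterate_nfds(0.95, s, 500))
```

A morph that starts rare increases and a morph that starts common declines, which mirrors the search-image predation example: the advantage disappears once the rare morph is no longer rare.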

2.3 Temporal or spatial habitat heterogeneity

Selection that changes over the course of time and is responsible for maintaining genetic diversity in a population is called “temporal variation”. If two alleles are advantageous at different times, they can attain a balanced equilibrium and be upheld in the population by natural selection. For instance, during placental malaria, placental trophoblasts express FLT1, an angiogenesis inhibitor associated with preeclampsia. The FLT1 gene contains some alleles that offer a reproductive advantage during the malaria season, while other alleles are advantageous outside the malaria season (Muehlenbachs et al., 2008). Selection that fluctuates across space is called “spatial variation”. For instance, if a natural population lives in two places and one allele is favored in one place while another allele is favored in the other, selection can maintain both alleles; e.g., inversion polymorphisms in Drosophila are maintained at specific frequencies in relation to latitude (Andrés et al., 2009). Fish experience spatial as well as temporal O2 variation in water bodies and have thus developed physiological, anatomical and biochemical strategies for adapting to changing gas availability (Souza and Bonilla-Rodriguez, 2007). The spatial spread of H5N1 avian influenza virus in some geographic regions is reported to have a large impact on the poultry industry and to pose a serious threat to human health (Tian et al., 2015). The occurrence of cryptosporidiosis in the west of Ireland was also reported to be highest during spring (Callaghan et al., 2009).

As obesity prevalence is higher in the Southeast, Central, Northeast and South regions of the United States, the inhabitants of these regions are more vulnerable to diabetes. However, inhabitants of the Southwest, West and Northern regions of the United States may also become vulnerable to diabetes as the prevalence of obesity increases (Li et al., 2016). Another analysis, in Australia, reported that the prevalence of type 1 diabetes (T1DM) increases significantly with distance from the equator (lower UV-B radiation) (Ball et al., 2014). This finding is further supported by another study reporting that T1DM is more common in countries nearer the polar regions. For example, Finland has the highest rate of type 1 diabetes diagnosis per year, approximately 40 per 100000 individuals, while in China it is just 0.1 per 100000 (Walker and Colledge, 2013). Compared with other populations, type 1 diabetes is more common in Caucasians, and most individuals are diagnosed in the winter months (Walker and Colledge, 2013). Another systematic analysis revealed that the incidence of childhood type 1 diabetes mellitus (CT1DM) is higher in oceanic climates than in other climates. Additionally, the incidence of CT1DM is higher in places with higher latitude and shorter sunshine duration. Thus, climate plays a vital role in CT1DM (Chen et al., 2017). Some other studies have reported a higher incidence of type 1 diabetes in sparsely populated rural areas (Patterson and Waugh, 1992; Cardwell et al., 2006) than in urban areas, while others reported the opposite (Haynes et al., 2006), and some studies reported no difference in the frequency of type 1 diabetes between rural and urban areas (Schoenle et al., 2001; Schober et al., 2003).
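The spatial-variation argument made earlier in this subsection, that both alleles can persist when each is favored in a different place, can be sketched with a two-patch model in the spirit of Levene's multiple-niche (soft selection) model. This is an illustrative assumption rather than a method described in the text: the patch shares and genotype fitnesses below are hypothetical, mating is assumed random across the whole population, and each patch is assumed to contribute a fixed share of the next generation.

```python
def next_frequency_two_patches(p, patches):
    """One generation of selection in a spatially heterogeneous habitat.

    p is the frequency of allele A before selection.
    patches is a list of (share, w_AA, w_Aa, w_aa) tuples; shares sum to 1
    and each patch contributes its share of the next generation.
    """
    q = 1.0 - p
    new_p = 0.0
    for share, w_AA, w_Aa, w_aa in patches:
        mean_w = p * p * w_AA + 2.0 * p * q * w_Aa + q * q * w_aa
        # Frequency of A among survivors in this patch, weighted by patch share.
        new_p += share * (p * p * w_AA + p * q * w_Aa) / mean_w
    return new_p


def iterate_patches(p0, patches, generations):
    """Apply the recursion for a number of generations."""
    p = p0
    for _ in range(generations):
        p = next_frequency_two_patches(p, patches)
    return p


# Hypothetical fitnesses: allele A is favored in patch 1, allele a in patch 2.
TWO_PATCHES = [(0.5, 1.0, 0.75, 0.5), (0.5, 0.5, 0.75, 1.0)]
# Control: a single uniform habitat in which A is always favored.
ONE_PATCH = [(1.0, 1.0, 0.75, 0.5)]

if __name__ == "__main__":
    print(iterate_patches(0.1, TWO_PATCHES, 2000))  # both alleles are retained
    print(iterate_patches(0.1, ONE_PATCH, 2000))    # allele A approaches fixation
```

With these symmetric values both alleles are retained at a frequency of 0.5, whereas the same fitnesses in a single uniform habitat drive one allele toward fixation. Whether a polymorphism is maintained in general depends on the fitness values and patch shares.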

2.4 Special case of balancing selection

In some cases, advantageous genetic diversity in a population is maintained by the combined effects of two or all three mechanisms of balancing selection, i.e. heterozygote advantage, negative frequency dependent selection and spatial/temporal selection. For instance, genetic diversity in some genes, e.g. the major histocompatibility complex (MHC) genes in vertebrates, is reported to be maintained through all three mechanisms (Spurgin and Richardson, 2010). The “heterozygote advantage” hypothesis suggests that individuals heterozygous at MHC loci respond to a wider range of pathogen peptides than homozygotes and consequently show better resistance to pathogens (Doherty and Zinkernagel, 1975; Hughes and Nei, 1988). The “negative frequency dependent” hypothesis proposes that pathogens are under strong selection to overcome the resistance conferred by the most common host MHC alleles. Hence, novel alleles that arise in the population offer better protection against pathogens than common alleles and thus have a selective advantage (Takahata and Nei, 1990). The “fluctuating selection” hypothesis suggests that both spatial and temporal heterogeneity in the type and abundance of pathogens are responsible for maintaining genetic diversity at the MHC (Hill, 1991).

Another interesting case of balancing selection is heterozygosity at the CFTR gene. This gene modulates the movement of chloride ions into and out of cells. Homozygotes are more prone to cystic fibrosis, while heterozygotes are less prone. When homozygotes are infected by diseases such as typhoid or cholera, they generally die because of the electrolyte imbalance and dehydration caused by diarrhea. In heterozygotes, the number of chloride channels is about 50% lower than in non-carriers, which prevents excess fluid loss during diarrhea in heterozygotes infected with a disease like cholera or typhoid (Rodman and Zamudio, 1991; Bramanti et al., 2000). However, as heterozygotes for CFTR lose more salt through sweating, this heterozygosity is advantageous in cold climates (because people sweat less in the cold) and disadvantageous in hot climates (Beckett, 1997).

3. Impact of balancing selection on genome

Balancing selection leaves various signatures in the genome that, either individually or jointly, form the basis of the statistical tests used to detect the process (Sabeti et al., 2002; Charlesworth, 2006; Hedrick, 2012). The impact of balancing selection can be subdivided into signatures in the current generation, the recent past (spread over the history of populations) and the distant past (observable over the history of species). Signatures of balancing selection in the current generation are produced and lost within the same generation. Signatures from the recent past are produced or lost over tens to thousands of generations and depend on several factors such as gene flow, recombination and genetic drift. Signatures of balancing selection in the distant past are determined by mutation and take many thousands to millions of generations to be generated or lost (Hedrick, 2006).

3.1 Distant Past

The signature of balancing selection in the distant past is mainly determined by mutation and takes many thousands to millions of generations to arise (Garrigan and Hedrick, 2003). Balancing selection in the distant past may result in long-term balancing selection (Hedrick, 2006; Hedrick et al., 1976). As balancing selection increases nucleotide diversity at nearby neutral sites (Barton, 1986; Charlesworth, 2006; Andrés et al., 2009; Fijarczyk and Babik, 2015), when an allele persists at a site for very long evolutionary times, the effect of balancing selection at that site can be detected from nearby neutral sites (Charlesworth, 2006; Wu et al., 2017). However, only a few nearby sites, those closely linked to the selected site(s), carry the signature of balancing selection and have longer coalescence times, i.e. their lineages trace back to common ancestors over a longer period than those at other sites (Charlesworth, 2006). One of the best classical examples of long-term balancing selection is the preservation of sex-determining alleles in the honeybee, which show high amino acid as well as synonymous site diversity (Hasselmann and Beye, 2004). Another example is selection against self-fertile recombinants, which is generally strong enough to generate high nucleotide diversity throughout the sequences of the multi-allelic pistil recognition genes of plants with gametophytic self-incompatibility (Richman et al., 1996; Lu, 2001, 2002) as well as in the pistil and pollen S-loci of species with sporophytic incompatibility (Sato et al., 2002; Charlesworth et al., 2003). Although replacement wears away variation at intron and synonymous sites within the gene, if the number of alleles is small or replacement amongst alleles is frequent, long-term balancing selection may lead to high differentiation amongst alleles at sites very close to the selected sites (Takahata and Satta, 1998; Navarro and Barton, 2002). However, loci will not be detected when selection has acted for only a short duration or when replacement is frequent enough to generate diversity between alleles. For instance, high allelic diversity in MHC develops through recombination amongst differentiated haplotypes, which is a signature of long-term balancing selection. Nevertheless, this diversity in MHC, though remarkably high for human sequences, is lower than the diversity in fungal or plant incompatibility gene sequences (Charlesworth et al., 2003).

Ancient genetic variants whose origin precedes speciation events, resulting in alleles shared amongst evolutionarily related species, are called trans-species polymorphisms (TSPs) (Klein, 1987). Unlike convergent evolution, TSP arises through the passage of alleles from ancestral to descendant species via incomplete lineage sorting (Klein, 1987; Klein et al., 2007). During convergent evolution, organisms (related or not) independently evolve similar traits because of adaptation to similar ecological niches or environments (Yeager and Hughes, 1999; Klein et al., 2007). TSP develops when a balanced polymorphism is maintained over a long period; this maintenance diminishes the influence of drift and slows the progression of coalescence and lineage sorting. In general, two forms of TSP exist, namely neutral TSP and balanced TSP. Neutral (transient) TSP occurs in closely related, newly diverged species and progressively disappears over time (Klein et al., 1998). It tends to be widespread across loci only for a short time after the speciation event (Nagl et al., 1998; Samonte et al., 2007). In contrast, balanced TSP is functionally more significant and results from selection for the maintenance of variability, i.e. balancing selection. Balanced TSP generally persists longer, possibly for millions or even tens of millions of years (Aguilar and Garza, 2007; Klein et al., 2007; Kamath and Getz, 2011; Li et al., 2011). Hence, the identification of balanced TSP variants is an important step towards identifying naturally occurring advantageous alleles with applications in animal breeding, nature conservation, and human medicine (Těšický and Vinkler, 2015). Balanced TSP has been reported in the self-incompatibility (SI) system of flowering plants (Ioerger et al., 1990; Dwyer et al., 1991; Richman et al., 1996), the MHC of jawed vertebrates (Klein, 1987), non-MHC immunoglobulins, PSMB8, host defense peptides, TRIM5α and oligoadenylate synthetase (Těšický and Vinkler, 2015). TSP is also well documented in complementary sex-determining genes in Hymenoptera (Heimpel and de Boer, 2007; Lechner et al., 2014), the ABO blood group system in primates (Kermarrec et al., 1999; Ségurel et al., 2012), and mating loci in fungi (Lukens et al., 1996; Muirhead et al., 2002; Diepen et al., 2013). Recently, Wu and colleagues reported five genes, namely AT1G35220, AT2G16570, AT4G29360, AT5G38460, and AT5G44000, in Arabidopsis thaliana and Capsella rubella as candidates under long-term balancing selection; these genes play a significant role in resistance to stress and adaptation to diverse habitats (Wu et al., 2017).

3.2 Recent Past

The signature of balancing selection in the recent past is mainly determined by selection in combination with other factors such as gene flow, genetic drift, and recombination, and is generated or lost over tens to thousands of generations, depending on the impact of these factors (Garrigan and Hedrick, 2003). Balancing selection in the recent past generally leads to short-term balancing selection (Hedrick, 2006; Hedrick et al., 1976). Host-pathogen interaction is the best example of short-term balancing selection (Charlesworth, 2006; Ebert, 2008). Pathogen-mediated pressure results in either balancing selection (trench warfare) or positive directional selection (also known as an arms race) (Croze et al., 2016). Balancing selection maintains allele frequencies at an intermediate level (Stahl et al., 1999), whereas during positive directional selection the allele frequency increases until fixation (Magwire et al., 2011). Other instances of short-term balancing selection can be observed in glucose-6-phosphate dehydrogenase (Verrelli et al., 2002) and hemoglobin E, which provide resistance against malaria (Ohashi et al., 2004). Other studies have also reported that short-term balancing selection has maintained "nontaster" and "taster" variants at the PTC (phenylthiocarbamide) locus in humans (Wooding et al., 2004; Charlesworth, 2006). People who can taste PTC are also capable of tasting other bitter substances, including toxic ones, and this variation in PTC perception may reflect variation in dietary preferences across human history and could correlate with susceptibility to diet-related diseases in modern populations (Wooding et al., 2004).

3.3 Current Generation

Balancing selection in the current generation is mainly responsible for local adaptation (Hedrick et al., 1976). Local adaptation means that local populations tend to have a higher mean fitness in their native environment than in other environments, and a higher fitness there than other populations introduced to their home site (Lascoux et al., 2016). Ever since Turesson defined the theory of ecotypes, local adaptation has been a topic of active research by evolutionary ecologists. Environmental conditions generally vary across both time and space, and local conditions decide which traits will be favored by natural selection. Local adaptation may occur in response to various selective factors, for instance parasites, climate and edaphic factors (Linhart and Grant, 1996). Local adaptation may also occur through alleles with environment-dependent fitness trade-offs (antagonistic pleiotropy) or alleles that have a selective advantage in one environment but are neutral in others (conditional neutrality) (Schnee and Thompson, 1984). Of these two situations, only antagonistic pleiotropy can produce strong balancing selection and maintain genetic variation. However, studies reporting local adaptation due to antagonistic pleiotropy are few compared with those reporting conditional neutrality (Ågren et al., 2013; Anderson et al., 2011, 2013). Local adaptation is also responsible for maintaining genetic variation, which can ultimately lead to ecological speciation (Savolainen et al., 2013). Though field studies are highly required for identifying locally adapted traits, population genetic analyses allow faster estimation of population structure and size while also identifying locally adapted traits (Wang and Shaffer, 2017). Population genetic analyses are especially needed for species in which direct observation is very difficult (Wang et al., 2009) or which are of conservation concern (van Strien et al., 2014). However, as these genetic analyses disclose the impact of selection over a short interval, they might not be representative of historical environments. Earlier studies utilized whole-organism data for detecting selective differences between populations, which in turn may generate balancing selection at the metapopulation level (Linhart and Grant, 1996). Recently, many studies have identified genes that are responsible for local adaptation and are also under balancing selection in a particular environment, for example, in high-altitude adaptation (Storz et al., 2007) and immunity in plants and animals (Prugnolle et al., 2005; Moeller and Tiffin, 2008).

4. Tests used in detection of balancing selection

Recently, many classic cases have been re-analyzed by applying sophisticated statistical tests to DNA sequence information, yielding clear evidence of different forms of selection (Edwards, 1998; Barreiro and Quintana-Murci, 2010; Fijarczyk and Babik, 2015; Brisson, 2018). These tests reveal how genomic diversity is affected by balancing selection, as well as how balancing selection acting over different timescales leaves different footprints in genomic sequences (Sabeti et al., 2002; Verrelli et al., 2002; Fijarczyk and Babik, 2015). At present, various statistical tests are available for detecting different signatures of balancing selection at different timescales (Charlesworth, 2006; Hedrick, 2012; Fijarczyk and Babik, 2015). The timescale is subdivided mainly by two time thresholds: 0.02–0.4 Ne generations ago, when signatures of very recent balancing selection (current generation) begin (Sabeti et al., 2006), and 4 Ne generations (Fijarczyk and Babik, 2015), which is the expected height of neutral genealogies, i.e. the expected time to the most recent common ancestor. Although very recent balancing selection leaves less variation between populations, it is often difficult to detect because it generates signatures similar to those of positive selection: like an incomplete sweep, it increases linkage disequilibrium near the selected locus and decreases differentiation between populations (Hermisson and Pennings, 2005). Older balancing selection alters the overall shape and structure of gene genealogies relative to genealogies under neutrality, which results in an excess of long internal branches (Fijarczyk and Babik, 2015) and enhances diversity near the target of selection via hitchhiking, such that an excess of alleles at intermediate frequency is observed (Charlesworth, 2006). Thus, the time to the most recent common ancestor may exceed the age of the species, which can generate trans-species polymorphisms (Klein, 1987; Klein et al., 1998; Diepen et al., 2013; Těšický and Vinkler, 2015). As the time since speciation increases, lineage sorting also progresses continuously, which ultimately leads to differences in gene genealogies between genes under balancing selection and the rest of the genome. Hence, balancing selection at different timescales has different effects on the genome (Charlesworth, 2006; Fijarczyk and Babik, 2015; Hedrick, 2012).

Balancing selection in the distant past gives rise to polymorphic sites or allelic lineages shared among species, and this type of balancing selection can be detected via the trans-species polymorphism test. Balancing selection in both the distant and the recent past gives rise to increased diversity near the selected locus, an excess of common polymorphisms and an excess of nonsynonymous polymorphisms. Increased diversity near the selected locus can be detected via the Hudson–Kreitman–Aguadé (HKA) test and Balancing Selection Likelihood Test T1 (BALLET). An excess of nonsynonymous polymorphisms can be detected via the McDonald–Kreitman test and Neutrality index (NI), McDonald–Kreitman Poisson Random Field (MKprf), Selection Inference using Poisson Random Effects (SnIPRE), omegaMap and gammaMap. An excess of common polymorphisms can be detected via Tajima’s D, Fu & Li’s F & D, the allele frequency spectrum, and Balancing Selection Likelihood Test T2 (BALLET). Balancing selection in both the recent past and the current generation gives rise to genetic differentiation between populations, which can be detected via fixation index (Fst) outliers. Balancing selection in the current generation gives rise to linkage disequilibrium, which can be detected via the long-range haplotype test (LRH), integrated haplotype score (iHS) and cross-population extended haplotype homozygosity (XP-EHH).

4.1. Trans-species polymorphisms (TSP)

Long-term balancing selection alone is responsible for generating TSP. TSP occurs when selection commences prior to the divergence of two species (Charlesworth, 2006; Siewert and Voight, 2017). TSPs are not under neutrality (Gao et al., 2015). Even though several studies have used primate outgroups for identifying TSPs in humans (Andrés et al., 2009; Leffler et al., 2013; Teixeira et al., 2015), this methodology fails to identify selection when the balanced polymorphism is missing in at least one of the species under consideration (Siewert and Voight, 2017). To overcome this problem, Cheng and DeGiorgio recently developed a new method for detecting trans-species balancing selection without relying on shared polymorphism (Cheng and DeGiorgio, 2018). This approach uses summary-statistic and model-based approaches to identify genomic regions under long-term balancing selection shared by a group of species, utilising genomic patterns of intra-specific polymorphism and inter-specific fixed differences (Cheng and DeGiorgio, 2018). They adapted the original framework designed by DeGiorgio and team (DeGiorgio et al., 2014) to construct the likelihood ratio test statistics T1,trans and T2,trans for detecting trans-species balancing selection. They also extended the Non-central Deviation (NCD) statistic to NCDtrans for application to multi-species data and amended the HKA test (represented as HKAtrans) to better accommodate genomic data from multiple species. They executed extensive simulations to evaluate the performance of the different methods, and applied the model-based T2,trans statistic to whole-genome human and chimpanzee data to gain insights into ancient balancing selection affecting these lineages (Cheng and DeGiorgio, 2018).

4.2. Hudson–Kreitman–Aguadé test

The HKA test compares the level of polymorphism within a species with the level of divergence between species (Hudson et al., 1987). This test uses sequence information at multiple unlinked loci (usually non-coding) from at least two closely related species, and investigates whether the polymorphism and divergence at those loci are due to adaptive or neutral evolution (Hudson et al., 1987). Excessive polymorphism within species at synonymous sites is a signature of purifying selection, while excessive divergence between species at nonsynonymous sites is a signature of positive directional selection (Hudson et al., 1987; Yang, 2006). Under balancing selection, loci will show higher within-species variation than neutrally evolving loci (Gokcumen et al., 2013). At a locus with a high mutation rate, both polymorphism and divergence will be high, whereas at a locus with a low mutation rate, both polymorphism and divergence will be low (Hudson et al., 1987; Yang, 2006). However, because the HKA test compares ratios between loci, when recent selection reduces polymorphism at one of the loci it becomes difficult to determine the real cause of a deviation from neutrality (Hudson et al., 1987). The HKA test is conservative because it assumes free recombination between genes and complete linkage within each gene (Hudson et al., 1987).

4.3. Balancing Selection Likelihood Test

The Balancing Selection Likelihood Test detects signatures of ancient balancing selection in genomic data (DeGiorgio et al., 2014). In 2014, DeGiorgio and team developed two model-based summaries, T1 and T2, which produce a composite likelihood of a site being under balancing selection (DeGiorgio et al., 2014). These two summaries are based on earlier models that estimate the effect of balancing selection on the genealogy at linked neutral loci (Hudson and Kaplan, 1988) and consider the spatial distributions of polymorphisms and substitutions around a selected site. Through simulations, DeGiorgio and team reported that both T1 and T2 are more powerful than the HKA test and Tajima's D under various demographic assumptions, for instance population growth and bottlenecks (DeGiorgio et al., 2014). However, T2, the more powerful of the two, requires sequence information from a closely related outgroup as well as prior knowledge of the underlying demographic history, which is used to generate an extensive grid of simulations for estimating the likelihood of a site being under balancing selection (DeGiorgio et al., 2014). Analysis of whole-genome sequencing data from Africans and Europeans using T1 and T2 identified several novel as well as previously identified loci showing signatures of balancing selection (DeGiorgio et al., 2014).

4.4. McDonald–Kreitman (MK) test, and Neutrality index (NI)

The neutral theory proposes that divergence between species and polymorphism within a species are two phases of the same evolutionary process, i.e., both occur because of random drift of selectively neutral mutations. Thus, if both nonsynonymous and synonymous mutations are neutral, the ratio of nonsynonymous (Pn) to synonymous (Ps) polymorphisms within a species should equal the ratio of nonsynonymous (Dn) to synonymous (Ds) differences between species. The McDonald–Kreitman test evaluates this prediction (McDonald and Kreitman, 1991). If nonsynonymous and synonymous substitutions are interspersed and share a common genealogy (or genealogies) as well as sampling scheme, one may perform a simple test of independence on the 2 × 2 contingency table and use the neutrality index, NI = (Pn/Ps)/(Dn/Ds), to estimate the direction and degree of departure from neutrality (Charlesworth and Eyre-Walker, 2008). Under neutrality, Pn/Ps = Dn/Ds, so NI = 1. NI < 1 represents an excess of nonsynonymous divergence between species because of positive selection. NI > 1 indicates an excess of nonsynonymous polymorphisms, suggesting that either balancing selection is maintaining polymorphism within one or both of the species or negative selection is preventing the fixation of deleterious mutations that exist in the population at low frequency (Egea et al., 2008; Fijarczyk et al., 2016). Though the MK test is generally applied to protein-coding data, it can be applied to any two categories of sites that are interspersed, e.g. protein-binding and non-protein-binding sites in a regulatory element (Jenkins et al., 1995).

4.5. MKprf, SnIPRE, omegaMap and gammaMap

The MKprf method is an extension of the MK test that estimates γ (the population-scaled selection coefficient) and the confidence interval (CI) of γ for each gene. A positive selection effect corresponds to a positive selection coefficient, while a negative selection effect corresponds to a negative selection coefficient. A positive selection effect indicates that nonsynonymous mutations are fixed at a higher rate than expected, and the reverse is true for negative selection (Eilertson et al., 2012). A gene whose 95% CI of γ lies above 0 is under positive selection, while a gene whose 95% CI of γ lies below 0 is under negative selection (Cai and Petrov, 2010). Though the MKprf method is more powerful than the MK test, a drawback for some researchers is that it requires prior specification of a population genetic model (Y. F. Li et al., 2008). Eilertson and team developed another method, SnIPRE, for detecting genes under natural selection (Eilertson et al., 2012). Although it works like the McDonald–Kreitman test, it is more robust to demography and has advantages over other statistics (Eilertson et al., 2012). Using Drosophila and human-chimp data, Eilertson and team reported that SnIPRE performs better than both the MK and MKprf tests (Eilertson et al., 2012). The main advantage of SnIPRE over MK and MKprf is that SnIPRE can identify genes under both negative and positive selection without prior specification of a population genetic model. Another advantage is that, if one is willing to assume a particular population genetic model, the SnIPRE parameters can be viewed as a re-parameterization of that model. With these extra assumptions, inference can be extended beyond identifying genes that are not evolving according to the neutral theory to quantifying the strength and direction of the selective forces (Eilertson et al., 2012). Though, like the MKprf method, SnIPRE uses genome-wide information to increase its statistical power, it makes no assumptions about mutation rate or species divergence time when identifying genes under selection.
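The MK-type methods discussed above all start from the same four counts, Pn, Ps, Dn and Ds. As a minimal sketch only (the counts are hypothetical and the function names `neutrality_index` and `fisher_exact_two_sided` are ours, not taken from any of the cited packages), the neutrality index and a two-sided Fisher's exact test on the 2 × 2 table could be computed as follows:

```python
from math import comb


def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn/Ps) / (Dn/Ds); equals 1 under strict neutrality."""
    if ps == 0 or dn == 0 or ds == 0:
        raise ValueError("Ps, Dn and Ds must be non-zero to compute NI")
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(pn, ps, dn, ds):
    """Two-sided Fisher's exact test on the table [[Pn, Ps], [Dn, Ds]]."""
    row1, row2, col1 = pn + ps, dn + ds, pn + dn
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x nonsynonymous polymorphisms
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(pn)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts for a single gene (illustration only, not real data)
pn, ps, dn, ds = 12, 10, 5, 20
ni = neutrality_index(pn, ps, dn, ds)
p_value = fisher_exact_two_sided(pn, ps, dn, ds)
print(f"NI = {ni:.2f}, Fisher p = {p_value:.4f}")
```

With these made-up counts NI is well above 1, i.e. an excess of nonsynonymous polymorphism; as noted for the MK test, such a result alone cannot distinguish balancing selection from segregating weakly deleterious variants.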

omegaMap was designed to detect natural selection and recombination in DNA or RNA sequences. While the signature of natural selection in omegaMap is identified using the dN/dS ratio, the signature of recombination is detected from patterns of linkage disequilibrium (Wilson and McVean, 2005). gammaMap detects natural selection in sequence data obtained from one or more species. It computes the distribution of selection coefficients and allows localization of the signature of selection using a Bayesian sliding-window method. This signature of selection is identified by contrasting the dN/dS ratio within and between species (Wilson et al., 2011).

4.6. Allele Frequency Spectrum, Tajima’s D and Fu & Li’s F & D

In a set of n aligned DNA sequences, the allele frequency spectrum is described as the vector $\eta = (\eta_k)_{k=0,1,2,\ldots,n}$, where $\eta_k$ = number of sites with k derived alleles. The ancestral allele is the nucleotide present at the site prior to mutation and the derived allele is the novel mutation (Achaz, 2009). This population genetic model assumes the infinite-sites model, i.e. (a) there are an infinite number of sites in the DNA sequence where mutations can take place, (b) every novel mutation arises at a new site, and (c) there is no recombination, and thus there is a clear demarcation between derived and ancestral states (Depaulis and Veuille, 1998). Each ‘site’ in the DNA sequence denotes a single nucleotide base pair (Kimura, 1969; Tajima, 1996). When analyzing real data, outgroup data are used to infer the ancestral and the derived allele.
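As an illustration of the definition above, the following sketch counts derived alleles per site to build the unfolded spectrum η. The toy alignment, the outgroup-inferred ancestral sequence and the function name `unfolded_sfs` are all hypothetical, and every site is assumed to be at most biallelic, in line with the infinite-sites model.

```python
def unfolded_sfs(alignment, ancestral):
    """Return eta, where eta[k] is the number of sites with k derived alleles.

    Assumes all sequences have the same length as `ancestral` and that any
    nucleotide differing from the ancestral state is the (single) derived allele.
    """
    n = len(alignment)
    eta = [0] * (n + 1)
    for site, anc in enumerate(ancestral):
        derived = sum(1 for seq in alignment if seq[site] != anc)
        eta[derived] += 1
    return eta


# Hypothetical sample of n = 4 aligned sequences and an inferred ancestral sequence
sample = ["ACGTA", "ACGTT", "ATGTT", "ATGCA"]
ancestral = "ACGTA"
sfs = unfolded_sfs(sample, ancestral)
print(sfs)
```

Here `sfs[0]` counts monomorphic sites, `sfs[1]` counts singletons, and so on; an excess of entries at intermediate k is the pattern associated with balancing selection.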

This type of neutrality test compares two estimators of the population mutation parameter θ, which depends on the effective population size (Ne) and the mutation rate per gene per generation (μ) and characterizes the mutation–drift equilibrium: $\theta = 2pN_e\mu$, where $p$ = ploidy (1 for haploids and 2 for diploids). When the standard neutral model is true, the expectations of the various unbiased estimators of θ are equal (Depaulis and Veuille, 1998). Typical estimators of θ in a sample of n sequences are the average number of pairwise differences ($\hat{\theta}_T$) and an estimator based on the number of segregating sites ($\hat{\theta}_w$) (Depaulis and Veuille, 1998):

$$\hat{\theta}_T = \frac{\sum_{i<j} d_{ij}}{n(n-1)/2}$$

where $d_{ij}$ = number of differences between two sequences i and j, and $n$ = number of sequences to compare, and

$$\hat{\theta}_w = \frac{S}{a_n}, \qquad a_n = \sum_{i=1}^{n-1} \frac{1}{i}$$

where $S$ = number of polymorphic (segregating) sites.

Tajima observed that $\hat{\theta}_w$ is strongly affected by deleterious mutations, because deleterious mutations are generally maintained at low frequency and $\hat{\theta}_w$ ignores the frequency of mutants. In contrast, as $\hat{\theta}_T$ takes the frequency of mutants into account, it is not much affected by the presence of deleterious mutations (Fu and Li, 1993). Thus, if some of the sequences in the sample experience selective effects, the value of θ estimated from $\hat{\theta}_w$ will differ from the value estimated from $\hat{\theta}_T$ (Fu and Li, 1993). Tajima’s D statistic is essentially a comparison between the average number of pairwise differences and the number of segregating sites in a set of DNA sequences:

$$D = \frac{\hat{\theta}_T - \hat{\theta}_w}{\sqrt{\mathrm{Var}(\hat{\theta}_T - \hat{\theta}_w)}}$$

where Var = variance.
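The two estimators and the D statistic can be sketched in a few lines. This is an illustration under stated assumptions, not a replacement for established software: the toy sequences are hypothetical, the function names are ours, and the variance term uses the standard constants from Tajima's original derivation (a1, a2, b1, b2, c1, c2, e1, e2), which the text above does not spell out.

```python
from itertools import combinations
from math import sqrt


def segregating_sites(seqs):
    """Number of alignment columns carrying more than one allele (S)."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def theta_t(seqs):
    """Average number of pairwise differences between sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs)
    return total / len(pairs)


def theta_w(seqs):
    """Estimator based on segregating sites: S / a_n."""
    a_n = sum(1 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / a_n


def tajimas_d(seqs):
    """Tajima's D; needs at least 4 sequences and 1 segregating site."""
    n, s = len(seqs), segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need n >= 4 sequences and at least one segregating site")
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)
    # Estimated variance of (theta_T - theta_w) under the neutral model
    variance = e1 * s + e2 * s * (s - 1)
    return (theta_t(seqs) - theta_w(seqs)) / sqrt(variance)


# Hypothetical sample of four aligned sequences
sample = ["ACGTA", "ACGTT", "ATGTT", "ATGCA"]
d_value = tajimas_d(sample)
print(theta_t(sample), theta_w(sample), d_value)
```

In this toy sample the pairwise estimator exceeds the segregating-sites estimator, so D is positive; with so few sequences and sites the value carries no statistical weight and only shows the direction of the comparison.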

Under the null neutral model, the mean of Tajima’s D is zero and its variance is one. Though a very rough rule of thumb is that values of Tajima’s D > +2 or < -2 are likely to be significant, this does not constitute a critical value for a significance test. Positive Tajima’s D is a signature of population bottlenecks, population structure and/or balancing selection. Under balancing selection, alleles are kept at intermediate frequencies, i.e. the average number of pairwise differences ($\hat{\theta}_T$) is elevated relative to the estimate from segregating sites ($\hat{\theta}_w$), which results in a positive Tajima’s D. Negative Tajima’s D is a signature of population expansion or positive selection. After a population expansion (or selective sweep), the majority of haplotypes in a population will be the same and thus any new mutation will be rare. When such rare mutations accumulate, $\hat{\theta}_T$ will underestimate θ relative to $\hat{\theta}_w$, giving a negative value of Tajima’s D. In the case of positive selection, even if the population does not experience any demographic change (population contraction, expansion or migration), $\hat{\theta}_w$ will be greater than $\hat{\theta}_T$ and thus Tajima’s D will be negative. Thus, positive values of Tajima’s D typically reflect an excess of alleles of intermediate frequency, which can result from balancing selection, and negative values of Tajima’s D indicate an excess of rare alleles, consistent with the occurrence of deleterious mutations and/or advantageous mutations being positively selected (Vasseur and Quintana-Murci, 2013; Yang, 2006).

Unlike Tajima’s D, Fu and Li's tests are based on coalescent theory. Coalescent theory describes how gene variants sampled from a population may have originated from a common ancestor. Fu and Li’s D and F tests require intraspecific polymorphism data as well as an outgroup, while Fu and Li’s D* and F* require only intraspecific data. As with Tajima’s D, under neutrality Fu and Li’s D* and F* should be close to 0, and values significantly different from 0 indicate departure from the neutral model (Hartl et al., 1997; Llopart et al., 2002; Yang, 2006). Negative values of Fu and Li's D* and F* indicate an excess of rare alleles and positive values indicate an excess of alleles of intermediate frequency (Vasseur and Quintana-Murci, 2013; Yang, 2006).

4.7. Fst outlier

Fst is a measure of population differentiation and is most useful for identifying overall genetic divergence among subpopulations (Hartl et al., 1997; Narum and Hess, 2011). The fixation index ranges between 0 (representing no genetic divergence) and 1 (representing fixation of alternate alleles in different subpopulations). Fst values of <0.05, 0.05-0.15, 0.15-0.25 and >0.25 represent little, moderate, great and very great genetic differentiation, respectively (Hartl et al., 1997). Differential selection pressure in different populations will increase Fst, while balancing selection that is common to all populations will decrease Fst (Singh and Krimbas, 2000; Volis101, 2008; Brandt et al., 2018). Scanning for Fst outliers helps to determine the effect of both short-term balancing selection and local adaptation at any locus. Fst scans often detect a high proportion of outliers (>1%, and in some cases even >10%), which is a signature of pervasive local adaptation (Narum and Hess, 2011).

4.8. LRH, iHS and XP-EHH

Extended Haplotype Homozygosity (EHH), iHS, and XP-EHH are three tests that use phased genotypes to identify putative regions of recent or ongoing positive or balancing selection in genomes (Szpiech and Hernandez, 2014). These three tests are based on the model of a hard selective sweep, in which a novel adaptive mutation arises on a haplotype and is driven to fixation, reducing diversity at nearby loci. If selection is strong enough, this fixation occurs faster than recombination or mutation can break up the haplotype (Szpiech and Hernandez, 2014). Sabeti and team developed a method, the LRH (long-range haplotype) test, which is capable of detecting footprints of recent positive selection (Sabeti et al., 2002). The LRH test depends on the relationship between an allele's frequency and the linkage disequilibrium between that allele and nearby loci. In the absence of selection, novel alleles take a long time to increase in frequency, and recombination around them ensures that blocks of linkage disequilibrium decay over time. In contrast, during positive selection, new alleles spread quickly through the population and are flanked by long blocks of linkage disequilibrium, as there has been less time for recombination to break them down. Though this process looks simple, the recombination rate is not uniform across the genome, which influences the speed of linkage disequilibrium decay. However, the LRH test overcomes this problem through a built-in internal control, in which various groups of loci in a region are compared so that any local inconsistencies in the recombination rate are adjusted for (Sabeti et al., 2002; Skipper, 2002).

To illustrate the mechanism behind the LRH test, Sabeti and team genotyped a collection of closely linked SNPs in a small region of the human genome and called this set of SNPs a “core haplotype”. The decay of linkage disequilibrium around each core haplotype was calculated by adding increasingly distant SNPs, which enabled the authors to estimate the probability that any two chromosome segments carrying a given core haplotype are identical by descent. They named this probability EHH. To find a signature of positive selection, Sabeti and team simply searched for core haplotypes with both high frequency and high EHH (Sabeti et al., 2002; Skipper, 2002).
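The EHH idea described above can be sketched as follows. This is a simplified illustration, assuming phased haplotypes coded as strings of 0/1 alleles and a single-SNP core; the data and the function name `ehh` are hypothetical, and identity by descent is approximated by identity in state over the interval, as is usual in practice.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, marker_index):
    """Probability that two randomly chosen carriers of the core allele are
    identical at every site from the core out to `marker_index` (inclusive)."""
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    if len(carriers) < 2:
        raise ValueError("need at least two chromosomes carrying the core allele")
    lo, hi = sorted((core_index, marker_index))
    # Group carriers by their extended haplotype over the interval
    counts = Counter(h[lo:hi + 1] for h in carriers)
    return sum(comb(c, 2) for c in counts.values()) / comb(len(carriers), 2)


# Hypothetical phased haplotypes; the core SNP is at index 2
haps = ["01100", "01101", "01110", "11100", "00001", "10011"]
decay = [ehh(haps, 2, "1", m) for m in range(5)]
print(decay)
```

EHH is 1 at the core itself and decays as more distant markers are added; a common core allele whose EHH stays high far from the core is the pattern the LRH test looks for.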

The iHS (integrated haplotype score) was developed by Voight and team (Voight et al., 2006) and compares the amount of EHH at a given SNP on the ancestral allele background relative to the derived allele background (Voight et al., 2006). Essentially, iHS is computed from the integral of the decay of EHH away from a specified core allele until EHH reaches 0.05 (Voight et al., 2006). An extreme positive iHS score (iHS > 2) denotes that haplotypes on the derived allele background are shorter than those on the ancestral allele background (Voight et al., 2006). An extreme negative iHS score (iHS < -2) denotes that haplotypes on the ancestral allele background are shorter than those associated with the derived allele (Voight et al., 2006). XP-EHH is a cross-population neutrality test based on the same principle as iHS; it can be used to compare two populations and to discover sweeps that occurred in one population but not in the other (Sabeti et al., 2007). As all three tests, namely LRH, iHS and XP-EHH, measure linkage disequilibrium, high diversity in any genomic region is direct evidence for balancing selection (Charlesworth, 2006).

5. Maladaptation

A maladaptive trait is one that persists in a population despite having a negative influence on the ability of individuals to pass on their genes (Crespi, 2000; Nesse, 2005). As stated above, when an individual with a certain advantageous genetic make-up migrates to a new environment or lifestyle (including food habits) that contrasts with the earlier one, the previously advantageous genetic variation becomes disadvantageous, which in turn may lead to maladaptation (Vasseur and Quintana-Murci, 2013; Quintana-Murci, 2016). It is well established that migration is the main cause of population expansion in living organisms (Blair and Feldman, 2015; Peischl and Excoffier, 2015; Chambre et al., 2017; Peischl et al., 2018). However, successful range expansion, i.e. expansion of the area where a particular species can be found during its lifetime, depends on the capability of a species to adapt to new local environmental conditions (Cornille et al., 2018). Hence, understanding the evolutionary processes involved in the distribution of species requires understanding the local adaptation of that species (Cornille et al., 2018). However, the degree to which a population adapts to its local environment remains unclear. One theory hypothesizes that gene flow from central to marginal populations often causes maladaptation in the marginal population (Haldane, 1957). Another theory hypothesizes that range expansion allows both deleterious and advantageous variants to be maintained at expanding range margins, where they may attain high frequencies over time. This phenomenon is generally called surfing, and in the case of deleterious alleles it leads to ‘expansion load’, which can reduce the fitness of recently expanded populations (Peischl et al., 2013, 2015; Peischl and Excoffier, 2015; Peischl et al., 2018). If the expansion load is high, both maladapted phenotypes and genotypes can persist for several hundreds to thousands of generations, and may also restrict the capability of a species to colonize new habitats (Peischl et al., 2015). If the frequency of advantageous mutations is higher, these advantageous alleles increase the capability of a species to colonize new habitats. However, when individuals carrying these advantageous alleles migrate to new environments and lifestyles (including food habits) that contrast with their previous ones, they experience maladaptation, which in turn causes several diseases such as type 2 diabetes (Fig. 1) (Vasseur and Quintana-Murci, 2013; Quintana-Murci, 2016).


Fig. 1. Mechanism involved in causing maladaptation. Migration causes either (a) gene flow from the central population to marginal populations, which subsequently leads to maladaptation, or (b) range expansion, which maintains both advantageous and deleterious alleles at expanding range margins. If the frequency of deleterious alleles increases over time, it may lead to maladaptation. If the previous and current environment/lifestyle, including food habits, are the same, an increase in the frequency of advantageous alleles over time will provide protection, but if the current environment/lifestyle, including food habits, contrasts with the previous one, an increase in the frequency of advantageous alleles over time will cause maladaptation.

For instance, urbanization provides access to calorie-dense, low-fiber foods together with sedentary lifestyles, which leads to a higher risk of morbidity and mortality from diet- and lifestyle-related diseases, e.g., obesity and diabetes. Blair and his team reported the effect of natural selection on 21 human disease variants in relation to migration out of Africa and climate variables (Blair and Feldman, 2015). Of the 21, type 2 diabetes (T2D) showed the highest correlation with accepted patterns of human migration. With increasing distance from Africa, the average heterozygosity of the 15 risk alleles decreased with an R² of 0.69, similar to the regression reported by Ramachandran and team using microsatellite loci (Ramachandran et al., 2005). This pattern of risk allele frequencies may arise because the risk of T2D increases strongly with specific dietary customs that most probably spread after the major Out-of-Africa migrations.

As the mechanism underlying the obesity pandemic in industrialized countries remains a topic of debate, various hypotheses have been proposed to date (Sellayah et al., 2014). All of these hypotheses try to reconcile gene-environment interactions with an understanding of human evolution (Sellayah et al., 2014). Of these, the “thrifty genotype” and “drifty genotype” hypotheses are the best known. The “thrifty genotype” hypothesis was proposed because humans have experienced numerous episodes of both famine and food surplus during their evolution. It proposes that genes associated with obesity in contemporary populations were positively selected in early hunter-gatherer populations because they allowed surplus food to be stored in the body in times of plenty, which helped individuals survive periods of food scarcity (Neel, 1962). However, under urban conditions, despite the constant availability of food, thrifty genotypes still prepare the body for a food scarcity that never arrives. This mismatch between two different environmental conditions causes obesity and the severe diseases associated with it, for instance obesity-related T2D (Neel, 1962). Apart from survival, the “thrifty genotype” also helped humans maintain fertility during famine. Earlier studies have reported that a pregnant woman under acute malnutrition may experience a long-term deleterious effect on her reproductive system, which may in turn cause permanently impaired fecundity (Song, 2013).

To challenge the thrifty genotype hypothesis, which was then the most widely accepted model for the genetic basis of obesity, John Speakman proposed the “drifty genotype” hypothesis, also known as the “predation release” theory (Speakman, 2008). The “drifty genotype” hypothesis proposes that genes responsible for causing obesity are present in all populations worldwide because early hominids removed the selection pressure that was previously exerted on them by predation (Speakman, 2008). About 2 million years ago, when the ancient ancestors of modern humans, namely Homo erectus and Homo habilis, learned to use stone tools and fire, constructed weapons, and started living in organized social structures, a living organism was able, for the first time in evolutionary history, to remove the threat of predation (Speakman, 2008). Hence, unlike in other animals, genes involved in escaping predators, for instance genes associated with speed, stamina and agility, were no longer as relevant to humans as they had been earlier (Lingle et al., 2008; Speiser et al., 2013; Spence et al., 2013). Thus, the “drifty genotype” hypothesis holds that, once the selection pressure from predation was removed, genes responsible for energy storage and obesity were not discarded via natural selection but were simply allowed to drift over the course of human evolution, which may explain the obesity pandemic in modern Western societies.
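
What it means for an allele to be "allowed to drift" can be illustrated with a minimal Wright-Fisher sketch of neutral genetic drift. This is a textbook model used here only for illustration; it is not the model used by Speakman (2008), and the function name `wright_fisher_drift`, the population size, the starting frequency and the number of generations are all hypothetical.

```python
import random


def wright_fisher_drift(initial_freq, population_size, generations, seed=None):
    """Simulate neutral drift of one allele in a diploid Wright-Fisher population.

    Returns the list of allele frequencies, one per generation,
    starting with the initial frequency (generation 0).
    """
    rng = random.Random(seed)
    n_copies = 2 * population_size  # diploid: 2N gene copies per generation
    freq = initial_freq
    trajectory = [freq]
    for _ in range(generations):
        # Each gene copy in the next generation is sampled independently
        # from the current generation, with no selection acting.
        count = sum(1 for _ in range(n_copies) if rng.random() < freq)
        freq = count / n_copies
        trajectory.append(freq)
    return trajectory


# Hypothetical example: an allele starting at 20% in a population of 50 diploids.
trajectory = wright_fisher_drift(initial_freq=0.2, population_size=50,
                                 generations=100, seed=1)
print(f"frequency after 100 generations: {trajectory[-1]:.2f}")
```

Under this model the allele frequency changes by chance alone, so an allele can rise, fall, be lost or become fixed without conferring any advantage, which is the qualitative behaviour the hypothesis attributes to obesity-related genes after predation pressure was removed.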

Though both the drifty and thrifty genotype hypotheses are widely accepted explanations for the occurrence of obesity in industrialized countries, Sellayah and colleagues proposed that these two theories may not apply to some populations of industrialized countries (Sellayah et al., 2014). Some studies have reported that although modern humans migrated “Out-of-Africa” ~10,000 years ago to various places and were exposed to various environmental conditions (Quintana-Murci et al., 1999; Bowler et al., 2003; Rasmussen et al., 2011), it is unlikely that famine along with drought drove the evolution of humans with thrifty genes (Hanson and Gluckman, 2011). Other studies reported that excess genetic change due to natural selection occurred in contemporary human populations about 10,000 years ago and that this genetic variation is responsible for both better health and disease in some populations (Voight et al., 2006). One example is heterozygote advantage at the β-globin gene, which provides protection against malaria in African human populations (Allison, 1954; Pasvol et al., 1978; Verrelli et al., 2002; Andrés et al., 2009; Hedrick, 2011). Recently, Sellayah and colleagues proposed that neither the thrifty nor the drifty genotype hypothesis is entirely correct (Sellayah et al., 2014). The descendants of early humans who either migrated to tropical or subtropical environments, for instance Pacific Islanders and black Americans, or remained in Africa maintained heat adaptation genes. Those who migrated to colder regions such as Siberia and Europe retained genes for cold adaptation. Sellayah and colleagues hypothesize that positive selection of uncoupling protein 1 (UCP1), which is highly expressed in brown adipose tissue, enabled their ancestors to survive in cold environments (Cannon and Nedergaard, 2004). In a cold environment, brown adipose tissue breaks down stored triglycerides into glycerol and free fatty acids (FFAs). FFAs move into the mitochondria of brown adipose tissue, where they activate UCP1, which uncouples oxidative phosphorylation from ATP synthesis and thereby generates heat through an energy-wasting mechanism (Brooks et al., 1980; Nicholls et al., 1978; Cannon and Nedergaard, 2004). In contrast, the amount of brown adipose tissue and the expression of UCP1 are lower in Africans and South Asians, which in turn is a risk factor for developing obesity in these populations when combined with a hypercaloric, high-fat diet and a sedentary lifestyle (Sellayah et al., 2014).

Another example of maladaptation is that genetic adaptation to a low-salt environment in ancestral populations
is a risk factor for hypertension in present-day populations living in a high-salt environment (Balaresque et al., 2007). This adaptive salt-retention trait enabled ancient humans, who consumed low levels of dietary salt, to survive in hot and humid areas (Balaresque et al., 2007). The evolution of our hominid ancestors, modern humans and terrestrial animals took place in an environment where they had no access to salt for about 2 million years (Batuman, 2013). Organisms that migrated from the marine environment to either brackish water or land adapted to the new environment by retaining the small amount of salt (~0.25 g per day) present in the natural diet (Batuman, 2013). As salt is an important component of our plasma, the absence of effective mechanisms to conserve salt would cause excessive bodily secretion, which may become fatal if not controlled at an early stage (Batuman, 2013). Interestingly, all genes associated with blood pressure are connected with sodium transport (Lifton et al., 2001). Unlike other terrestrial animals, only humans discovered salt as a dietary additive, and they consume about 10 g to 18 g or even more salt per day, which is about 50 to 70 times higher than in our natural Paleolithic diet and may in turn lead to heart disease, kidney failure, high blood pressure and stroke. Some studies reported that high salt consumption increases water and salt retention, reduces the pressure natriuresis (excretion of sodium in the urine) response and enhances glomerular filtration, which in turn leads to hypertension (Kurokawa and Okuda, 1998; Tekol, 2008).

Another instance of maladaptation can be observed in the changing selective pressures imposed by
infectious diseases over time. The “hygiene hypothesis” was first proposed by Strachan, who stated that infection can provide protection against atopy (Strachan, 1989). Strachan assumed that, because first-born children are less exposed to common infections than their siblings, the number of first-born children with atopic dermatitis and allergic rhinitis is higher (Strachan, 1989). This hypothesis was later confirmed by epidemiological studies (Ege et al., 2011). The “hygiene hypothesis” proposes that improvements in hygiene, together with the introduction of vaccines and antibiotics, have reduced microbial diversity. This improved hygiene, along with vaccines and antibiotics, is also responsible for an imbalance in the immune response associated with certain alleles, which in turn may result in an increased incidence of hypersensitive barrier tissues and type 2 allergic disease (Vasseur and Quintana-Murci, 2013; Haspeslagh et al., 2018). Some studies have also reported that when a pregnant mother is more exposed to cowsheds and farming, the number of newborn children with atopic diseases is very low (Riedler et al., 2001; Ege et al., 2006). One study suggested that prolonged exposure to high levels of endotoxin during the first year of life protects against asthma and atopy (Braun-Fahrländer et al., 2002). However, contradictory results have also been reported, in which the authors demonstrated a positive correlation between higher levels of endotoxin and a higher incidence of asthma in urban housing (Thorne et al., 2005; Tavernier et al., 2006). One epidemiological study reported that infection with Schistosoma sp. provides protection against atopy (Flohr et al., 2006). The hookworm Necator americanus is reported to promote increased T-helper 2 (Th2) cell activity in the host, which in turn provides protection against asthma (Pritchard
and Brown, 2001; Okada et al., 2010). However, infection with Trichuris trichiura and Ascaris lumbricoides has no significant effect on asthma (Okada et al., 2010). Hence, because of the continuous change in environment and lifestyle due to contemporary human activities, the incidence of maladaptation is increasing in all species, including humans.

Despite the extensive research carried out to date to understand the genetic basis of adaptation and maladaptation via balancing selection on the genomes of diverse organisms, only a limited number of confirmed reports exist relative to the number of organisms. Recent advances in high-throughput sequencing technologies have enabled us to understand far better the impact of balancing selection on the genetic make-up of contemporary humans in different environmental and social conditions (Gupta et al., 2017; Wang and Mitchell-Olds, 2017; Brandt et al., 2018). However, new questions regarding balancing selection arise every day, and significant areas for future research demand the development of more powerful statistical tools along with the investigation of gene networks in response to various environmental factors and lifestyles, including food habits. In the near future, the information provided in the present review article can be utilized in the fields of animal breeding, nature conservation and human medicine.

Conclusions and perspectives:

During the “Out-of-Africa” migration, advantageous alleles in the human genome were retained by either positive selection or balancing selection, while deleterious alleles were removed through purifying selection. Advantageous genetic diversity increases the fitness of an individual in a particular environment. This advantageous genetic diversity may be retained in the genome of the current generation, or may have spread over the history of populations (recent past) or species (distant past). However, when an individual with certain advantageous genetic diversity migrates to a new environment or lifestyle, including food habits, that contrasts with the previous one, this leads to maladaptation (Vasseur and Quintana-Murci, 2013; Quintana-Murci, 2016). For instance, genetic adaptation to a low-salt environment in ancestral populations is a risk factor for hypertension in present-day populations residing in a high-salt environment (Balaresque et al., 2007). Thus, the questions always remain: which genes in the genome are under balancing selection, and what causes maladaptation? Although several important statistical tests, such as Tajima’s D and the McDonald–Kreitman (MK) test, have been developed to understand the impact of balancing selection on the genetic make-up of humans across various timescales, environmental conditions, lifestyles and food habits, the number of confirmed reports is limited. The authors believe that the estimation of balancing selection can be improved by developing robust statistical tools that consider parameters such as explicit mutation models, the size and structure of the population over time, and uncertainty in the ancestral state. Additionally, results obtained from these statistical tools, when combined with other multidisciplinary approaches such as physiology, epidemiology and molecular biology, are most likely to identify functionally important variation(s) responsible for both normal phenotypic variation and disease susceptibility under two different environmental conditions and lifestyles, including food habits. In the near future, these variation(s) may serve as therapeutic targets for the prevention of diseases caused by maladaptation, such as hypertension and obesity.
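
As an illustration of one of the tests mentioned above, Tajima's D can be sketched as follows, using the standard formula from Tajima (1989). The function name `tajimas_d` and the toy haplotypes are hypothetical; the sketch assumes aligned, equal-length sequences and ignores the demographic and mutation-model issues raised above, so it should not be read as a complete test for balancing selection.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences (strings)."""
    n = len(sequences)
    if n < 4:
        # The variance of the statistic is zero for n = 2 and n = 3.
        raise ValueError("need at least 4 sequences")
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for column in zip(*sequences) if len(set(column)) > 1)
    if seg_sites == 0:
        return float("nan")  # D is undefined without polymorphism

    # pi: mean number of pairwise differences between sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs) / len(pairs)

    # Normalizing constants, which depend only on the sample size n.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n**2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    theta_w = seg_sites / a1  # Watterson's estimator of theta
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - theta_w) / sqrt(variance)


# Toy haplotypes (hypothetical): every variant at intermediate frequency.
intermediate = ["AAAA", "AAAA", "TTTT", "TTTT"]
# Toy haplotypes (hypothetical): every variant is a singleton.
singletons = ["TAAA", "ATAA", "AATA", "AAAT"]

print("D, intermediate-frequency variants:", round(tajimas_d(intermediate), 2))
print("D, singleton variants:", round(tajimas_d(singletons), 2))
```

The first toy sample yields a positive D and the second a negative D. An excess of intermediate-frequency variants, and hence a positive D, is the pattern expected under balancing selection, although demographic events such as population contraction can produce the same signal.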

Conflict of interest

Authors declare no conflict of interest.

References:

Achaz, G., 2009. Frequency Spectrum Neutrality Tests: One for All and All for One. Genetics 183, 249–258. https://doi.org/10.1534/genetics.109.104042 Ågren, J., Oakley, C.G., McKay, J.K., Lovell, J.T., Schemske, D.W., 2013. Genetic mapping of adaptation reveals fitness tradeoffs in Arabidopsis thaliana. Proc. Natl. Acad. Sci. 110, 21077–21082. https://doi.org/10.1073/pnas.1316773110 Aguilar, A., Garza, J.C., 2007. Patterns of Historical Balancing Selection on the Salmonid Major Histocompatibility Complex Class II β Gene. J. Mol. Evol. 65, 34–43. https://doi.org/10.1007/s00239-006-0222-8 Allen, J.A., 1988. Frequency-dependent selection by predators. Philos. Trans. R. Soc. Lond. B. Biol. Sci. 319, 485– 503. Allison, A.C., 1954. Protection afforded by sickle-cell trait against subtertian malareal infection. Br. Med. J. 1, 290– 294. Anderson, J.T., Lee, C.-R., Rushworth, C.A., Colautti, R.I., Mitchell‐ Olds, T., 2013. Genetic trade-offs and conditional neutrality contribute to local adaptation. Mol. Ecol. 22, 699–708. https://doi.org/10.1111/j.1365294X.2012.05522.x Anderson, J.T., Willis, J.H., Mitchell-Olds, T., 2011. Evolutionary genetics of plant adaptation. Trends Genet. 27, 258–266. https://doi.org/10.1016/j.tig.2011.04.001 Andrés, A.M., Hubisz, M.J., Indap, A., Torgerson, D.G., Degenhardt, J.D., Boyko, A.R., Gutenkunst, R.N., White, T.J., Green, E.D., Bustamante, C.D., Clark, A.G., Nielsen, R., 2009. Targets of Balancing Selection in the Human Genome. Mol. Biol. Evol. 26, 2755–2764. https://doi.org/10.1093/molbev/msp190 Arber, W., 2008. Molecular mechanisms driving Darwinian evolution. Math. Comput. Model., Mathematical Methods and Modelling of Biophysical Phenomena 47, 666–674. https://doi.org/10.1016/j.mcm.2007.06.003 Ayala, F., 1983. Microevolution and macroevolution. Evol. Mol. Man 387–402. Ayala, F.J., Campbell, C.A., 1974. Frequency-Dependent Selection. Annu. Rev. Ecol. Syst. 5, 115–138. 
https://doi.org/10.1146/annurev.es.05.110174.000555 Balaresque, P.L., Ballereau, S.J., Jobling, M.A., 2007. Challenges in human genetic diversity: demographic history and adaptation. Hum. Mol. Genet. 16 Spec No. 2, R134-139. https://doi.org/10.1093/hmg/ddm242 Ball, S.J., Haynes, A., Jacoby, P., Pereira, G., Miller, L.J., Bower, C., Davis, E.A., 2014. Spatial and temporal variation in type 1 diabetes incidence in Western Australia from 1991 to 2010: increased risk at higher latitudes and over time. Health Place 28, 194–204. Barreiro, L.B., Quintana-Murci, L., 2010. From evolutionary genetics to human immunology: how selection shapes host defence genes. Nat. Rev. Genet. 11, 17–30. https://doi.org/10.1038/nrg2698 Barton N. H., 2010. Mutation and the evolution of recombination. Philos. Trans. R. Soc. B Biol. Sci. 365, 1281–1294. https://doi.org/10.1098/rstb.2009.0320 Barton, N.H., 1986. The maintenance of polygenic variation through a balance between mutation and stabilizing selection. Genet. Res. 47, 209–216. Batuman, V., 2013. Salt and hypertension: why is there still a debate? Kidney Int. Suppl. 3, 316. https://doi.org/10.1038/kisup.2013.66 Baum, D.A., Futuyma, D.J., Hoekstra, H.E., Lenski, R.E., Moore, A.J., Peichel, C.L., Schluter, D., Whitlock, M.C., 2013. The Princeton Guide to Evolution. Princeton University Press. Beckett, K.M., 1997. The Cystic Fibrosis Heterozygote Advantage: A Synthesis of Ideas. Anthropologica 39, 147– 158. https://doi.org/10.2307/25605860 Blair, L.M., Feldman, M.W., 2015. The role of climate and out-of-Africa migration in the frequencies of risk alleles for 21 human diseases. BMC Genet. 16, 81. https://doi.org/10.1186/s12863-015-0239-3 Bowler, J.M., Johnston, H., Olley, J.M., Prescott, J.R., Roberts, R.G., Shawcross, W., Spooner, N.A., 2003. New ages for human occupation and climatic change at Lake Mungo, Australia. Nature 421, 837–840. https://doi.org/10.1038/nature01383

Bramanti, B., Sineo, L., Vianello, M., Caramelli, D., Hummel, S., Chiarelli, B., Herrmann, B., 2000. The selective advantage of cystic fibrosis heterozygotes tested by aDNA analysis: A preliminary investigation. Int. J. Anthropol. 15, 255–262. https://doi.org/10.1007/BF02445136 Brandt, D.Y.C., César, J., Goudet, J., Meyer, D., 2018. The Effect of Balancing Selection on Population Differentiation: A Study with HLA Genes. G3 Genes Genomes Genet. 8, 2805–2815. https://doi.org/10.1534/g3.118.200367 Braun-Fahrländer, C., Riedler, J., Herz, U., Eder, W., Waser, M., Grize, L., Maisch, S., Carr, D., Gerlach, F., Bufe, A., Lauener, R.P., Schierl, R., Renz, H., Nowak, D., von Mutius, E., Allergy and Endotoxin Study Team, 2002. Environmental exposure to endotoxin and its relation to asthma in school-age children. N. Engl. J. Med. 347, 869–877. https://doi.org/10.1056/NEJMoa020057 Brisson, D., 2018. Negative Frequency-Dependent Selection Is Frequently Confounding. Front. Ecol. Evol. 6. https://doi.org/10.3389/fevo.2018.00010 Brooks, S.L., Rothwell, N.J., Stock, M.J., Goodbody, A.E., Trayhurn, P., 1980. Increased proton conductance pathway in brown adipose tissue mitochondria of rats exhibiting diet-induced thermogenesis. Nature 286, 274–276. Byars, S.G., Ewbank, D., Govindaraju, D.R., Stearns, S.C., 2010. Colloquium papers: Natural selection in a contemporary human population. Proc. Natl. Acad. Sci. U. S. A. 107 Suppl 1, 1787–1792. https://doi.org/10.1073/pnas.0906199106 Cai, J.J., Petrov, D.A., 2010. Relaxed Purifying Selection and Possibly High Rate of Adaptation in Primate LineageSpecific Genes. Genome Biol. Evol. 2, 393–409. https://doi.org/10.1093/gbe/evq019 Callaghan, M., Cormican, M., Prendergast, M., Pelly, H., Cloughley, R., Hanahoe, B., O’Donovan, D., 2009. Temporal and spatial distribution of human cryptosporidiosis in the west of Ireland 2004-2007. Int. J. Health Geogr. 8, 64. https://doi.org/10.1186/1476-072X-8-64 Cannon, B., Nedergaard, J., 2004. 
Brown adipose tissue: function and physiological significance. Physiol. Rev. 84, 277–359. https://doi.org/10.1152/physrev.00015.2003 Cardwell, C.R., Carson, D.J., Patterson, C.C., 2006. Higher incidence of childhood-onset type 1 diabetes mellitus in remote areas: a UK regional small-area analysis. Diabetologia 49, 2074–2077. https://doi.org/10.1007/s00125-006-0342-0 Chambre, C., Gbedo, C., Kouacou, N., Fysekidis, M., Reach, G., Le Clesiau, H., Bihan, H., 2017. Migrant adults with diabetes in France: Influence of family migration. J. Clin. Transl. Endocrinol. 7, 28–32. https://doi.org/10.1016/j.jcte.2016.12.003 Charlesworth, D., 2006. Balancing selection and its effects on sequences in nearby genome regions. PLoS Genet. 2, e64. https://doi.org/10.1371/journal.pgen.0020064 Charlesworth, D., Bartolomé, C., Schierup, M.H., Mable, B.K., 2003. Haplotype structure of the stigmatic selfincompatibility gene in natural populations of Arabidopsis lyrata. Mol. Biol. Evol. 20, 1741–1753. https://doi.org/10.1093/molbev/msg170 Charlesworth, J., Eyre-Walker, A., 2008. The McDonald–Kreitman Test and Slightly Deleterious Mutations. Mol. Biol. Evol. 25, 1007–1015. https://doi.org/10.1093/molbev/msn005 Chen, Y., Huang, Y., Qiao, Y., Ling, W., Pan, Y., Geng, L., Xiao, J., Zhang, X., Zhao, H., 2017. Climates on incidence of childhood type 1 diabetes mellitus in 72 countries. Sci. Rep. 7. https://doi.org/10.1038/s41598-017-129548 Cheng, X., DeGiorgio, M., 2018. Detection of shared balancing selection in the absence of trans-species polymorphism. bioRxiv 320390. https://doi.org/10.1101/320390 Cook, L.M., Saccheri, I.J., 2013. The peppered moth and industrial melanism: evolution of a natural selection case study. Heredity 110, 207–212. https://doi.org/10.1038/hdy.2012.92 Cornille, A., Salcedo, A., Huang, H., Kryvokhyzha, D., Holm, K., Ge, X.J., Stinchcomb, J.R., Glemin, S., Wright, S., Lascoux, M., 2018. 
Local adaptation and maladaptation during the worldwide range expansion of a selffertilizing plant. bioRxiv 308619. https://doi.org/10.1101/308619 Crespi, B.J., 2000. The evolution of maladaptation. Heredity 84, 623–629. https://doi.org/10.1046/j.13652540.2000.00746.x Crow, J.F., Dove, W.F., 2000. Perspectives on Genetics: Anecdotal, Historical, and Critical Commentaries, 19871998. Univ of Wisconsin Press. Croze, M., Wollstein, A., Božičević, V., Živković, D., Stephan, W., Hutter, S., 2017. A genome-wide scan for genes under balancing selection in Drosophila melanogaster. BMC Evol. Biol. 17. https://doi.org/10.1186/s12862016-0857-z

Croze, M., Živković, D., Stephan, W., Hutter, S., 2016. Balancing selection on immunity genes: review of the current literature and new analysis in Drosophila melanogaster. Zool. Jena Ger. 119, 322–329. https://doi.org/10.1016/j.zool.2016.03.004 Darwin, C., 1859. On the Origin of Species by Means of Natural Selection, Or, The Preservation of Favoured Races in the Struggle for Life. J. Murray. DeGiorgio, M., Lohmueller, K.E., Nielsen, R., 2014. A model-based approach for identifying signatures of ancient balancing selection in genetic data. PLoS Genet. 10, e1004561. Depaulis, F., Veuille, M., 1998. Neutrality tests based on the distribution of haplotypes under an infinite-site model. Mol. Biol. Evol. 15, 1788–1790. https://doi.org/10.1093/oxfordjournals.molbev.a025905 Diepen, V., T.a, L., Olson, Å., Ihrmark, K., Stenlid, J., James, T.Y., 2013. Extensive Trans-Specific Polymorphism at the Mating Type Locus of the Root Decay Fungus Heterobasidion. Mol. Biol. Evol. 30, 2286–2301. https://doi.org/10.1093/molbev/mst126 Dobzhansky, T., 1955. A review of some fundamental concepts and problems of population genetics. Cold Spring Harb. Symp. Quant. Biol. 20, 1–15. Dobzhansky, T., Dobzhansky, T.G., 1982. Genetics and the Origin of Species. Columbia University Press. Doherty, P.C., Zinkernagel, R.M., 1975. Enhanced immunological surveillance in mice heterozygous at the H-2 gene complex. Nature 256, 50–52. Dong, D.-K., Cao, J.-S., Shi, K., Liu, L.-C., 2007. Overdominance and Epistasis Are Important for the Genetic Basis of Heterosis in Brassica rapa. HortScience 42, 1207–1211. Dwyer, K.G., Balent, M.A., Nasrallah, J.B., Nasrallah, M.E., 1991. DNA sequences of self-incompatibility genes from Brassica campestris and B. oleracea: polymorphism predating speciation. Plant Mol. Biol. 16, 481–486. https://doi.org/10.1007/BF00024000 Ebert, D., 2008. Host-parasite coevolution: Insights from the Daphnia-parasite model system. Curr. Opin. Microbiol. 11, 290–301. 
https://doi.org/10.1016/j.mib.2008.05.012 Edwards, A.W., 1998. Natural selection and the sex ratio: Fisher’s sources. Am. Nat. 151, 564–569. https://doi.org/10.1086/286141 Ege, M.J., Bieli, C., Frei, R., van Strien, R.T., Riedler, J., Ublagger, E., Schram-Bijkerk, D., Brunekreef, B., van Hage, M., Scheynius, A., Pershagen, G., Benz, M.R., Lauener, R., von Mutius, E., Braun-Fahrländer, C., Parsifal Study team, 2006. Prenatal farm exposure is related to the expression of receptors of the innate immunity and to atopic sensitization in school-age children. J. Allergy Clin. Immunol. 117, 817–823. https://doi.org/10.1016/j.jaci.2005.12.1307 Ege, M.J., Mayer, M., Normand, A.-C., Genuneit, J., Cookson, W.O.C.M., Braun-Fahrländer, C., Heederik, D., Piarroux, R., von Mutius, E., GABRIELA Transregio 22 Study Group, 2011. Exposure to environmental microorganisms and childhood asthma. N. Engl. J. Med. 364, 701–709. https://doi.org/10.1056/NEJMoa1007302 Egea, R., Casillas, S., Barbadilla, A., 2008. Standard and generalized McDonald–Kreitman test: a website to detect selection by comparing different classes of DNA sites. Nucleic Acids Res. 36, W157–W162. https://doi.org/10.1093/nar/gkn337 Eilertson, K.E., Booth, J.G., Bustamante, C.D., 2012. SnIPRE: selection inference using a Poisson random effects model. PLoS Comput. Biol. 8, e1002806. Erwin, D.H., 2010. Microevolution and macroevolution are not governed by the same processes. Contemp. Debates Philos. Biol. 180–193. Fijarczyk, A., Babik, W., 2015. Detecting balancing selection in genomes: limits and prospects. Mol. Ecol. 24, 3529– 3545. https://doi.org/10.1111/mec.13226 Fijarczyk, A., Dudek, K., Babik, W., 2016. Selective Landscapes in newt Immune Genes Inferred from Patterns of Nucleotide Variation. Genome Biol. Evol. 8, 3417–3432. https://doi.org/10.1093/gbe/evw236 Fisher, I., 1930. The theory of interest. N. Y. 43. Fisher, R.A., 1923. XXI.—on the dominance ratio. Proc. R. Soc. Edinb. 42, 321–341. 
Flohr, C., Tuyen, L.N., Lewis, S., Quinnell, R., Minh, T.T., Liem, H.T., Campbell, J., Pritchard, D., Hien, T.T., Farrar, J., Williams, H., Britton, J., 2006. Poor sanitation and helminth infection protect against skin sensitization in Vietnamese children: A cross-sectional study. J. Allergy Clin. Immunol. 118, 1305–1311. https://doi.org/10.1016/j.jaci.2006.08.035 Frascaroli, E., Canè, M.A., Landi, P., Pea, G., Gianfranceschi, L., Villa, M., Morgante, M., Pè, M.E., 2007. Classical genetic and quantitative trait loci analyses of heterosis in a maize hybrid between two elite inbred lines. Genetics 176, 625–644. https://doi.org/10.1534/genetics.106.064493

Frelinger, J.A., 1972. The Maintenance of Transferrin Polymorphism in Pigeons. Proc. Natl. Acad. Sci. U. S. A. 69, 326–329. Fu, Y.X., Li, W.H., 1993. Statistical tests of neutrality of mutations. Genetics 133, 693–709. Gao, Z., Przeworski, M., Sella, G., 2015. Footprints of ancient-balanced polymorphisms in genetic variation data from closely related species. Evol. Int. J. Org. Evol. 69, 431–446. https://doi.org/10.1111/evo.12567 Garrigan, D., Hedrick, P.W., 2003. Perspective: Detecting Adaptive Molecular Polymorphism: Lessons from the Mhc. Evolution 57, 1707–1722. https://doi.org/10.1111/j.0014-3820.2003.tb00580.x Gokcumen, O., Zhu, Q., Mulder, L.C.F., Iskow, R.C., Austermann, C., Scharer, C.D., Raj, T., Boss, J.M., Sunyaev, S., Price, A., Stranger, B., Simon, V., Lee, C., 2013. Balancing Selection on a Regulatory Region Exhibiting Ancient Variation That Predates Human–Neandertal Divergence. PLOS Genet. 9, e1003404. https://doi.org/10.1371/journal.pgen.1003404 Goldberg, D.T., 2010. Barron’s AP Biology. Barron’s Educational Series. Gong, L., Parikh, S., Rosenthal, P.J., Greenhouse, B., 2013. Biochemical and immunological mechanisms by which sickle cell trait protects against malaria. Malar. J. 12, 317. https://doi.org/10.1186/1475-2875-12-317 Gupta, M.K., Behara, S.K., Vadde, R., 2017. In silico analysis of differential gene expressions in biliary stricture and hepatic carcinoma. Gene 597, 49–58. Haldane, J.B.S., 1957. The cost of natural selection. J. Genet. 55, 511. https://doi.org/10.1007/BF02984069 Hanson, M.A., Gluckman, P.D., 2011. Developmental origins of health and disease: moving from biological concepts to interventions and policy. Int. J. Gynaecol. Obstet. Off. Organ Int. Fed. Gynaecol. Obstet. 115 Suppl 1, S35. https://doi.org/10.1016/S0020-7292(11)60003-9 Hartl, D.L., Clark, A.G., Clark, A.G., 1997. Principles of population genetics. Sinauer associates Sunderland. Harvey, P.H., Birley, N., Blackstock, T.H., 1975. 
The effect of experience on the selective behaviour of song thrushes feeding on artificial populations ofCepaea (held). Genetica 45, 211– 216. https://doi.org/10.1007/BF01517197 Haspeslagh, E., Heyndrickx, I., Hammad, H., Lambrecht, B.N., 2018. The hygiene hypothesis: immunological mechanisms of airway tolerance. Curr. Opin. Immunol. 54, 102–108. https://doi.org/10.1016/j.coi.2018.06.007 Hasselmann, M., Beye, M., 2004. Signatures of selection among sex-determining alleles of the honey bee. Proc. Natl. Acad. Sci. U. S. A. 101, 4888–4893. https://doi.org/10.1073/pnas.0307147101 Haynes, A., Bulsara, M.K., Bower, C., Codde, J.P., Jones, T.W., Davis, E.A., 2006. Independent effects of socioeconomic status and place of residence on the incidence of childhood type 1 diabetes in Western Australia. Pediatr. Diabetes 7, 94–100. https://doi.org/10.1111/j.1399-543X.2006.00153.x Hedrick, P.W., 2012. What is the evidence for heterozygote advantage selection? Trends Ecol. Evol. 27, 698–704. https://doi.org/10.1016/j.tree.2012.08.012 Hedrick, P.W., 2011. Population genetics of malaria resistance in humans. Heredity 107, 283–304. https://doi.org/10.1038/hdy.2011.16 Hedrick, P.W., 2006. Genetic polymorphism in heterogeneous environments: the age of genomics. Annu Rev Ecol Evol Syst 37, 67–93. Hedrick, P.W., Ginevan, M.E., Ewing, E.P., 1976. Genetic polymorphism in heterogeneous environments. Annu. Rev. Ecol. Syst. 7, 1–32. Heimpel, G.E., de Boer, J.G., 2007. Sex Determination in the Hymenoptera. Annu. Rev. Entomol. 53, 209–230. https://doi.org/10.1146/annurev.ento.53.103106.093441 Hermisson, J., Pennings, P.S., 2005. Soft sweeps: molecular population genetics of adaptation from standing genetic variation. Genetics 169, 2335–2352. https://doi.org/10.1534/genetics.104.036947 Hill, A.V.S., 1991. HLA Associations with Malaria in Africa: Some Implications for MHC Evolution, in: Klein, J., Klein, D. (Eds.), Molecular Evolution of the Major Histocompatibility Complex, NATO ASI Series. 
Springer Berlin Heidelberg, pp. 403–420. Hudson, R.R., Kaplan, N.L., 1988. The coalescent process in models with selection and recombination. Genetics 120, 831–840. Hudson, R.R., Kreitman, M., Aguadé, M., 1987. A Test of Neutral Molecular Evolution Based on Nucleotide Data. Genetics 116, 153–159. Hughes, A.L., Nei, M., 1988. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335, 167–170. https://doi.org/10.1038/335167a0 Ingram, V.M., 1959. Abnormal human haemoglobins. III. The chemical difference between normal and sickle cell haemoglobins. Biochim. Biophys. Acta 36, 402–411.

Ioerger, T.R., Clark, A.G., Kao, T.H., 1990. Polymorphism at the self-incompatibility locus in Solanaceae predates speciation. Proc. Natl. Acad. Sci. 87, 9732–9735. https://doi.org/10.1073/pnas.87.24.9732 Jenkins, D.L., Ortori, C.A., Brookfield, J.F.Y., 1995. A test for adaptive change in DNA sequences controlling transcription. Proc R Soc Lond B 261, 203–207. https://doi.org/10.1098/rspb.1995.0137 Kamath, P.L., Getz, W.M., 2011. Adaptive molecular evolution of the Major Histocompatibility Complex genes, DRA and DQA, in the genus Equus. BMC Evol. Biol. 11, 128. https://doi.org/10.1186/1471-2148-11-128 Karasov, T.L., Kniskern, J.M., Gao, L., DeYoung, B.J., Ding, J., Dubiella, U., Lastra, R.O., Nallu, S., Roux, F., Innes, R.W., Barrett, L.G., Hudson, R.R., Bergelson, J., 2014. The long-term maintenance of a resistance polymorphism through diffuse interactions. Nature 512, 436–440. https://doi.org/10.1038/nature13439 Karn, M.N., Penrose, L.S., 1951. Birth weight and gestation time in relation to maternal age, parity and infant survival. Ann. Eugen. 16, 147–164. Kazancıoğlu, E., Arnqvist, G., 2014. The maintenance of mitochondrial genetic variation by negative frequencydependent selection. Ecol. Lett. 17, 22–27. https://doi.org/10.1111/ele.12195 Kermarrec, N., Roubinet, F., Apoil, P.-A., Blancher, A., 1999. Comparison of allele O sequences of the human and non-human primate ABO system. Immunogenetics 49, 517–526. https://doi.org/10.1007/s002510050529 Kimura, M., 1969. The number of heterozygous nucleotide sites maintained in a finite population due to steady flux of mutations. Genetics 61, 893–903. King, R.C., Stansfield, W.D., Mulligan, P.K., 2007. A Dictionary of Genetics. Oxford University Press. Klein, J., 1987. Origin of major histocompatibility complex polymorphism: the trans-species hypothesis. Hum. Immunol. 19, 155–162. Klein, J., Sato, A., Nagl, S., O’hUigín, C., 1998. Molecular Trans-Species Polymorphism. Annu. Rev. Ecol. Syst. 29, 1-C1. 
Klein, J., Sato, A., Nikolaidis, N., 2007. MHC, TSP, and the Origin of Species: From Immunogenetics to Evolutionary Genetics. Annu. Rev. Genet. 41, 281–304. https://doi.org/10.1146/annurev.genet.41.110306.130137 Kurokawa, K., Okuda, T., 1998. Genetic and non-genetic basis of essential hypertension: maladaptation of human civilization to high salt intake. Hypertens. Res. Off. J. Jpn. Soc. Hypertens. 21, 67–71. Lascoux, M., Glémin, S., Savolainen, O., 2016. Local Adaptation in Plants, in: ELS. American Cancer Society, pp. 1–7. https://doi.org/10.1002/9780470015902.a0025270 Lechner, S., Ferretti, L., Schöning, C., Kinuthia, W., Willemsen, D., Hasselmann, M., 2014. Nucleotide Variability at Its Limit? Insights into the Number and Evolutionary Dynamics of the Sex-Determining Specificities of the Honey Bee Apis mellifera. Mol. Biol. Evol. 31, 272–287. https://doi.org/10.1093/molbev/mst207 Leffler, E.M., Gao, Z., Pfeifer, S., Ségurel, L., Auton, A., Venn, O., Bowden, R., Bontrop, R., Wall, J.D., Sella, G., Donnelly, P., McVean, G., Przeworski, M., 2013. Multiple instances of ancient balancing selection shared between humans and chimpanzees. Science 339, 1578–1582. https://doi.org/10.1126/science.1234070 Lewontin, R.C., Hubby, J.L., 1966. A molecular approach to the study of genic heterozygosity in natural populations. II. Amount of variation and degree of heterozygosity in natural populations of Drosophila pseudoobscura. Genetics 54, 595. Li, L., Lu, K., Chen, Z., Mu, T., Hu, Z., Li, X., 2008. Dominance, Overdominance and Epistasis Condition the Heterosis in Two Heterotic Rice Hybrids. Genetics 180, 1725–1742. https://doi.org/10.1534/genetics.108.091942 Li, L., Zhou, X., Chen, X., 2011. Characterization and Evolution of MHC Class II B Genes in Ardeid Birds. J. Mol. Evol. 72, 474–483. https://doi.org/10.1007/s00239-011-9446-3 Li, X., Staudt, A., Chien, L.-C., 2016. 
Identifying counties vulnerable to diabetes from obesity prevalence in the United States: a spatiotemporal analysis. Geospatial Health 11. https://doi.org/10.4081/gh.2016.439 Li, Y.F., Costello, J.C., Holloway, A.K., Hahn, M.W., 2008. “Reverse ecology” and the power of population genomics. Evol. Int. J. Org. Evol. 62, 2984–2994. https://doi.org/10.1111/j.1558-5646.2008.00486.x Li, Z., Pinson, S.R., Park, W.D., Paterson, A.H., Stansel, J.W., 1997. Epistasis for three grain yield components in rice (Oryza sativa L.). Genetics 145, 453–465. Li, Z.K., Luo, L.J., Mei, H.W., Wang, D.L., Shu, Q.Y., Tabien, R., Zhong, D.B., Ying, C.S., Stansel, J.W., Khush, G.S., Paterson, A.H., 2001. Overdominant epistatic loci are the primary genetic basis of inbreeding depression and heterosis in rice. I. Biomass and grain yield. Genetics 158, 1737–1753. Lifton, R.P., Gharavi, A.G., Geller, D.S., 2001. Molecular mechanisms of human hypertension. Cell 104, 545–556. Lingle, S., Feldman, A., Boyce, M.S., Wilson, W.F., 2008. Prey behavior, age-dependent vulnerability, and predation rates. Am. Nat. 172, 712–725. https://doi.org/10.1086/591675


Linhart, Y.B., Grant, M.C., 1996. Evolutionary Significance of Local Genetic Differentiation in Plants. Annu. Rev. Ecol. Syst. 27, 237–277. https://doi.org/10.1146/annurev.ecolsys.27.1.237 Llopart, A., Comeron, J.M., Brunet, F.G., Lachaise, D., Long, M., 2002. Intron presence–absence polymorphism in Drosophila driven by positive Darwinian selection. Proc. Natl. Acad. Sci. 99, 8121–8126. https://doi.org/10.1073/pnas.122570299 Lu, Y., 2002. Molecular evolution at the self-incompatibility locus of Physalis longifolia (Solanaceae). J. Mol. Evol. 54, 784–793. https://doi.org/10.1007/s00239-001-0080-3 Lu, Y., 2001. Roles of lineage sorting and phylogenetic relationship in the genetic diversity at the self-incompatibility locus of Solanaceae. Heredity 86, 195–205. Lukens, L., Yicun, H., May, G., 1996. Correlation of genetic and physical maps at the A mating-type locus of Coprinus cinereus. Genetics 144, 1471–1477. Magwire, M.M., Bayer, F., Webster, C.L., Cao, C., Jiggins, F.M., 2011. Successive Increases in the Resistance of Drosophila to Viral Infection through a Transposon Insertion Followed by a Duplication. PLOS Genet. 7, e1002337. https://doi.org/10.1371/journal.pgen.1002337 Majerus, M.E.N., 1998. Melanism: Evolution in Action. Oxford University Press. McDonald, J.H., Kreitman, M., 1991. Adaptive protein evolution at the Adh locus in Drosophila. Nature 351, 652. Melchinger, A.E., Piepho, H.-P., Utz, H.F., Muminovic, J., Wegenast, T., Törjék, O., Altmann, T., Kusterer, B., 2007. Genetic basis of heterosis for growth-related traits in Arabidopsis investigated by testcross progenies of near-isogenic lines reveals a significant role of epistasis. Genetics 177, 1827–1837. https://doi.org/10.1534/genetics.107.080564 Moeller, D.A., Tiffin, P., 2008. Geographic Variation in Adaptation at the Molecular Level: A Case Study of Plant Immunity Genes. Evolution 62, 3069–3081.
https://doi.org/10.1111/j.1558-5646.2008.00511.x Muehlenbachs, A., Fried, M., Lachowitzer, J., Mutabingwa, T.K., Duffy, P.E., 2008. Natural selection of FLT1 alleles and their association with malaria resistance in utero. Proc. Natl. Acad. Sci. 105, 14488–14491. https://doi.org/10.1073/pnas.0803657105 Muirhead, C.A., Glass, N.L., Slatkin, M., 2002. Multilocus self-recognition systems in fungi as a cause of trans-species polymorphism. Genetics 161, 633–641. Muller, H.J., 1950. Our load of mutations. Am. J. Hum. Genet. 2, 111–176. Nagl, S., Tichy, H., Mayer, W.E., Takahata, N., Klein, J., 1998. Persistence of neutral polymorphisms in Lake Victoria cichlid fish. Proc. Natl. Acad. Sci. 95, 14238–14243. https://doi.org/10.1073/pnas.95.24.14238 Narum, S.R., Hess, J.E., 2011. Comparison of FST outlier tests for SNP loci under selection. Mol. Ecol. Resour. 11, 184–194. Navarro, A., Barton, N.H., 2002. The effects of multilocus balancing selection on neutral variability. Genetics 161, 849–863. Neel, J.V., 1962. Diabetes Mellitus: A “Thrifty” Genotype Rendered Detrimental by “Progress”? Am. J. Hum. Genet. 14, 353–362. Nesse, R.M., 2005. Maladaptation and Natural Selection. Q. Rev. Biol. 80, 62–70. https://doi.org/10.1086/431026 Newbigin, E., Uyenoyama, M.K., 2005. The evolutionary dynamics of self-incompatibility systems. Trends Genet. 21, 500–505. https://doi.org/10.1016/j.tig.2005.07.003 Nicholls, D.G., Bernson, V.S., Heaton, G.M., 1978. The identification of the component in the inner membrane of brown adipose tissue mitochondria responsible for regulating energy dissipation. Experientia. Suppl. 32, 89–93. Ohashi, J., Naka, I., Patarapotikul, J., Hananantachai, H., Brittenham, G., Looareesuwan, S., Clark, A.G., Tokunaga, K., 2004. Extended linkage disequilibrium surrounding the hemoglobin E variant due to malarial selection. Am. J. Hum. Genet. 74, 1198–1208. https://doi.org/10.1086/421330 Okada, H., Kuhn, C., Feillet, H., Bach, J.-F., 2010.
The ‘hygiene hypothesis’ for autoimmune and allergic diseases: an update. Clin. Exp. Immunol. 160, 1–9. https://doi.org/10.1111/j.1365-2249.2010.04139.x Olendorf, R., Rodd, F.H., Punzalan, D., Houde, A.E., Hurt, C., Reznick, D.N., Hughes, K.A., 2006. Frequency-dependent survival in natural guppy populations. Nature 441, 633–636. https://doi.org/10.1038/nature04646 Parrow, N.L., Fleming, R.E., Minnick, M.F., 2013. Sequestration and Scavenging of Iron in Infection. Infect. Immun. 81, 3503–3514. https://doi.org/10.1128/IAI.00602-13 Pasvol, G., Weatherall, D.J., Wilson, R.J., 1978. Cellular mechanism for the protective effect of haemoglobin S against P. falciparum malaria. Nature 274, 701–703. Patterson, C.C., Waugh, N.R., 1992. Urban/Rural and Deprivational Differences in Incidence and Clustering of Childhood Diabetes in Scotland. Int. J. Epidemiol. 21, 108–117. https://doi.org/10.1093/ije/21.1.108


Peischl, S., Dupanloup, I., Foucal, A., Jomphe, M., Bruat, V., Grenier, J.-C., Gouy, A., Gilbert, K.J., Gbeha, E., Bosshard, L., Hip-Ki, E., Agbessi, M., Hodgkinson, A., Vézina, H., Awadalla, P., Excoffier, L., 2018. Relaxed Selection During a Recent Human Expansion. Genetics 208, 763–777. https://doi.org/10.1534/genetics.117.300551 Peischl, S., Dupanloup, I., Kirkpatrick, M., Excoffier, L., 2013. On the accumulation of deleterious mutations during range expansions. Mol. Ecol. 22, 5972–5982. https://doi.org/10.1111/mec.12524 Peischl, S., Excoffier, L., 2015. Expansion load: recessive mutations and the role of standing genetic variation. Mol. Ecol. 24, 2084–2094. https://doi.org/10.1111/mec.13154 Peischl, S., Kirkpatrick, M., Excoffier, L., 2015. Expansion load and the evolutionary dynamics of a species range. Am. Nat. 185, E81-93. https://doi.org/10.1086/680220 Pleasants, S., 2014. Epidemiology: A moving target. Nature 515, S2–S2. https://doi.org/10.1038/515S2a Praharaj, A.B., Goenka, R.K., Dixit, S., Gupta, M.K., Kar, S.K., Negi, S., 2017. Lacto-Vegetarian Diet and Correlation of Fasting Blood Sugar with Lipids in Population Practicing Sedentary Lifestyle. Ecol. Food Nutr. 1–13. https://doi.org/10.1080/03670244.2017.1337570 Pritchard, D.I., Brown, A., 2001. Is Necator americanus approaching a mutualistic symbiotic relationship with humans? Trends Parasitol. 17, 169–172. https://doi.org/10.1016/S1471-4922(01)01941-9 Prugnolle, F., Manica, A., Charpentier, M., Guégan, J.F., Guernier, V., Balloux, F., 2005. Pathogen-Driven Selection and Worldwide HLA Class I Diversity. Curr. Biol. 15, 1022–1027. https://doi.org/10.1016/j.cub.2005.04.050 Quintana-Murci, L., 2016. Understanding rare and common diseases in the context of human evolution. Genome Biol. 17, 225. https://doi.org/10.1186/s13059-016-1093-y Quintana-Murci, L., Semino, O., Bandelt, H.J., Passarino, G., McElreavey, K., Santachiara-Benerecetti, A.S., 1999. 
Genetic evidence of an early exit of Homo sapiens sapiens from Africa through eastern Africa. Nat. Genet. 23, 437–441. https://doi.org/10.1038/70550 Ramachandran, S., Deshpande, O., Roseman, C.C., Rosenberg, N.A., Feldman, M.W., Cavalli-Sforza, L.L., 2005. Support from the relationship of genetic and geographic distance in human populations for a serial founder effect originating in Africa. Proc. Natl. Acad. Sci. U. S. A. 102, 15942–15947. https://doi.org/10.1073/pnas.0507611102 Rasmussen, M., Guo, X., Wang, Y., Lohmueller, K.E., Rasmussen, S., Albrechtsen, A., Skotte, L., Lindgreen, S., Metspalu, M., Jombart, T., Kivisild, T., Zhai, W., Eriksson, A., Manica, A., Orlando, L., De La Vega, F.M., Tridico, S., Metspalu, E., Nielsen, K., Ávila-Arcos, M.C., Moreno-Mayar, J.V., Muller, C., Dortch, J., Gilbert, M.T.P., Lund, O., Wesolowska, A., Karmin, M., Weinert, L.A., Wang, B., Li, J., Tai, S., Xiao, F., Hanihara, T., van Driem, G., Jha, A.R., Ricaut, F.-X., de Knijff, P., Migliano, A.B., Gallego Romero, I., Kristiansen, K., Lambert, D.M., Brunak, S., Forster, P., Brinkmann, B., Nehlich, O., Bunce, M., Richards, M., Gupta, R., Bustamante, C.D., Krogh, A., Foley, R.A., Lahr, M.M., Balloux, F., Sicheritz-Pontén, T., Villems, R., Nielsen, R., Wang, J., Willerslev, E., 2011. An Aboriginal Australian genome reveals separate human dispersals into Asia. Science 334, 94–98. https://doi.org/10.1126/science.1211177 Richman, A.D., Uyenoyama, M.K., Kohn, J.R., 1996. Allelic diversity and gene genealogy at the self-incompatibility locus in the Solanaceae. Science 273, 1212–1216. Ridley, M., 2009. Evolution. John Wiley & Sons Incorporated. Riedler, J., Braun-Fahrländer, C., Eder, W., Schreuer, M., Waser, M., Maisch, S., Carr, D., Schierl, R., Nowak, D., von Mutius, E., ALEX Study Team, 2001. Exposure to farming in early life and development of asthma and allergy: a cross-sectional survey. Lancet Lond. Engl. 358, 1129–1133. 
https://doi.org/10.1016/S0140-6736(01)06252-3 Robertson, A., 1956. The effect of selection against extreme deviants based on deviation or on homozygosis. J. Genet. 54, 236. https://doi.org/10.1007/BF02982779 Rodman, D.M., Zamudio, S., 1991. The cystic fibrosis heterozygote--advantage in surviving cholera? Med. Hypotheses 36, 253–258. Rost, S., Fregin, A., Ivaskevicius, V., Conzelmann, E., Hörtnagel, K., Pelz, H.-J., Lappegard, K., Seifried, E., Scharrer, I., Tuddenham, E.G.D., Müller, C.R., Strom, T.M., Oldenburg, J., 2004. Mutations in VKORC1 cause warfarin resistance and multiple coagulation factor deficiency type 2. Nature 427, 537–541. https://doi.org/10.1038/nature02214 Ruhli, F.J., Henneberg, M., 2011. Evolutionary Medicine: Microevolution of human morphology and its medico-social impact. FASEB J. 25, 866.1-866.1. https://doi.org/10.1096/fasebj.25.1_supplement.866.1 Sabeti, P.C., Reich, D.E., Higgins, J.M., Levine, H.Z., Richter, D.J., Schaffner, S.F., Gabriel, S.B., Platko, J.V., Patterson, N.J., McDonald, G.J., 2002. Detecting recent positive selection in the human genome from haplotype structure. Nature 419, 832.


Sabeti, P.C., Schaffner, S.F., Fry, B., Lohmueller, J., Varilly, P., Shamovsky, O., Palma, A., Mikkelsen, T.S., Altshuler, D., Lander, E.S., 2006. Positive natural selection in the human lineage. Science 312, 1614–1620. https://doi.org/10.1126/science.1124309 Sabeti, P.C., Varilly, P., Fry, B., Lohmueller, J., Hostetter, E., Cotsapas, C., Xie, X., Byrne, E.H., McCarroll, S.A., Gaudet, R., Schaffner, S.F., Lander, E.S., 2007. Genome-wide detection and characterization of positive selection in human populations. Nature 449, 913–918. https://doi.org/10.1038/nature06250 Samonte, I.E., Satta, Y., Sato, A., Tichy, H., Takahata, N., Klein, J., 2007. Gene Flow between Species of Lake Victoria Haplochromine Fishes. Mol. Biol. Evol. 24, 2069–2080. https://doi.org/10.1093/molbev/msm138 Sato, K., Nishio, T., Kimura, R., Kusaba, M., Suzuki, T., Hatakeyama, K., Ockendon, D.J., Satta, Y., 2002. Coevolution of the S-locus genes SRK, SLG and SP11/SCR in Brassica oleracea and B. rapa. Genetics 162, 931–940. Savolainen, O., Lascoux, M., Merilä, J., 2013. Ecological genomics of local adaptation. Nat. Rev. Genet. 14, 807–820. https://doi.org/10.1038/nrg3522 Schnee, F.B., Thompson, J.N., 1984. Conditional Neutrality of Polygene Effects. Evolution 38, 42–46. https://doi.org/10.1111/j.1558-5646.1984.tb00258.x Schober, E., Rami, B., Waldhoer, T., 2003. Small area variation in childhood diabetes mellitus in Austria: links to population density, 1989 to 1999. J. Clin. Epidemiol. 56, 269–273. https://doi.org/10.1016/S0895-4356(02)00607-8 Schoenle, E.J., Lang-Muritano, M., Gschwend, S., Laimbacher, J., Mullis, P.E., Torresani, T., Biason-Lauber, A., Molinari, L., 2001. Epidemiology of Type I diabetes mellitus in Switzerland: steep rise in incidence in under 5 year old children in the past decade. Diabetologia 44, 286–289.
https://doi.org/10.1007/s001250051615 Ségurel, L., Thompson, E.E., Flutre, T., Lovstad, J., Venkat, A., Margulis, S.W., Moyse, J., Ross, S., Gamble, K., Sella, G., Ober, C., Przeworski, M., 2012. The ABO blood group is a trans-species polymorphism in primates. Proc. Natl. Acad. Sci. U. S. A. 109, 18493–18498. https://doi.org/10.1073/pnas.1210603109 Sellayah, D., Cagampang, F.R., Cox, R.D., 2014. On the Evolutionary Origins of Obesity: A New Hypothesis. Endocrinology 155, 1573–1588. https://doi.org/10.1210/en.2013-2103 Semel, Y., Nissenbaum, J., Menda, N., Zinder, M., Krieger, U., Issman, N., Pleban, T., Lippman, Z., Gur, A., Zamir, D., 2006. Overdominant quantitative trait loci for yield and fitness in tomato. Proc. Natl. Acad. Sci. U. S. A. 103, 12981–12986. https://doi.org/10.1073/pnas.0604635103 Siewert, K.M., Voight, B.F., 2017. Detecting Long-Term Balancing Selection Using Allele Frequency Correlation. Mol. Biol. Evol. 34, 2996–3005. https://doi.org/10.1093/molbev/msx209 Silva, B., Faustino, P., 2015. An overview of molecular basis of iron metabolism regulation and the associated pathologies. Biochim. Biophys. Acta BBA - Mol. Basis Dis. 1852, 1347–1359. https://doi.org/10.1016/j.bbadis.2015.03.011 Singh, R.S., Krimbas, C.B., 2000. Evolutionary Genetics: From Molecules to Morphology. Cambridge University Press. Skaar, E.P., 2010. The Battle for Iron between Bacterial Pathogens and Their Vertebrate Hosts. PLoS Pathog. 6. https://doi.org/10.1371/journal.ppat.1000949 Skipper, M., 2002. Human genetics: Tracking positive selection [WWW Document]. Nat. Rev. Genet. https://doi.org/10.1038/nrg942 Smith, D. a. S., 1981. Heterozygous advantage expressed through sexual selection in a polymorphic African butterfly. Nature 289, 174–175. https://doi.org/10.1038/289174a0 Song, S., 2013. Assessing the impact of in utero exposure to famine on fecundity: evidence from the 1959-61 famine in China. Popul. Stud. 67, 293–308. https://doi.org/10.1080/00324728.2013.774045 Souza, P.C. 
de, Bonilla-Rodriguez, G.O., 2007. Fish hemoglobins. Braz. J. Med. Biol. Res. 40, 769–778. https://doi.org/10.1590/S0100-879X2007000600004 Speakman, J.R., 2008. Thrifty genes for obesity, an attractive but flawed idea, and an alternative perspective: the ‘drifty gene’ hypothesis. Int. J. Obes. 32, 1611–1617. https://doi.org/10.1038/ijo.2008.161 Speiser, D.I., Lampe, R.I., Lovdahl, V.R., Carrillo-Zazueta, B., Rivera, A.S., Oakley, T.H., 2013. Evasion of predators contributes to the maintenance of male eyes in sexually dimorphic Euphilomedes ostracods (Crustacea). Integr. Comp. Biol. 53, 78–88. https://doi.org/10.1093/icb/ict025 Spence, R., Wootton, R.J., Barber, I., Przybylski, M., Smith, C., 2013. Ecological causes of morphological evolution in the three-spined stickleback. Ecol. Evol. 3, 1717–1726. https://doi.org/10.1002/ece3.581 Spurgin, L.G., Richardson, D.S., 2010. How pathogens drive genetic diversity: MHC, mechanisms and misunderstandings. Proc. R. Soc. Lond. B Biol. Sci. 277, 979–988. https://doi.org/10.1098/rspb.2009.2084


Stahl, E.A., Dwyer, G., Mauricio, R., Kreitman, M., Bergelson, J., 1999. Dynamics of disease resistance polymorphism at the Rpm1 locus of Arabidopsis. Nature 400, 667–671. https://doi.org/10.1038/23260 Stearns, S.C., Govindaraju, D.R., Ewbank, D., Byars, S.G., 2012. Constraints on the coevolution of contemporary human males and females. Proc. Biol. Sci. 279, 4836–4844. https://doi.org/10.1098/rspb.2012.2024 Storz, J.F., Sabatino, S.J., Hoffmann, F.G., Gering, E.J., Moriyama, H., Ferrand, N., Monteiro, B., Nachman, M.W., 2007. The Molecular Basis of High-Altitude Adaptation in Deer Mice. PLOS Genet. 3, e45. https://doi.org/10.1371/journal.pgen.0030045 Strachan, D.P., 1989. Hay fever, hygiene, and household size. BMJ 299, 1259–1260. Szpiech, Z.A., Hernandez, R.D., 2014. selscan: An Efficient Multithreaded Program to Perform EHH-Based Scans for Positive Selection. Mol. Biol. Evol. 31, 2824–2827. https://doi.org/10.1093/molbev/msu211 Tajima, F., 1996. Infinite-allele model and infinite-site model in population genetics. J. Genet. 75, 27. https://doi.org/10.1007/BF02931749 Takahata, N., Nei, M., 1990. Allelic genealogy under overdominant and frequency-dependent selection and polymorphism of major histocompatibility complex loci. Genetics 124, 967–978. Takahata, N., Satta, Y., 1998. Footprints of intragenic recombination at HLA loci. Immunogenetics 47, 430–441. Tavernier, G., Fletcher, G., Gee, I., Watson, A., Blacklock, G., Francis, H., Fletcher, A., Frank, T., Frank, P., Pickering, C.A., Niven, R., 2006. IPEADAM study: indoor endotoxin exposure, family status, and some housing characteristics in English children. J. Allergy Clin. Immunol. 117, 656–662. https://doi.org/10.1016/j.jaci.2005.12.1311 Teixeira, J.C., de Filippo, C., Weihmann, A., Meneu, J.R., Racimo, F., Dannemann, M., Nickel, B., Fischer, A., Halbwax, M., Andre, C., Atencia, R., Meyer, M., Parra, G., Pääbo, S., Andrés, A.M., 2015. 
Long-Term Balancing Selection in LAD1 Maintains a Missense Trans-Species Polymorphism in Humans, Chimpanzees, and Bonobos. Mol. Biol. Evol. 32, 1186–1196. https://doi.org/10.1093/molbev/msv007 Tekol, Y., 2008. Irreversible and reversible components in the genesis of hypertension by sodium chloride (salt). Med. Hypotheses 70, 255–259. https://doi.org/10.1016/j.mehy.2007.06.007 Těšický, M., Vinkler, M., 2015. Trans-Species Polymorphism in Immune Genes: General Pattern or MHC-Restricted Phenomenon? [WWW Document]. J. Immunol. Res. https://doi.org/10.1155/2015/838035 Thorne, P.S., Kulhánková, K., Yin, M., Cohn, R., Arbes, S.J., Zeldin, D.C., 2005. Endotoxin exposure is a risk factor for asthma: the national survey of endotoxin in United States housing. Am. J. Respir. Crit. Care Med. 172, 1371–1377. https://doi.org/10.1164/rccm.200505-758OC Tian, H., Cui, Y., Dong, L., Zhou, S., Li, X., Huang, S., Yang, R., Xu, B., 2015. Spatial, temporal and genetic dynamics of highly pathogenic avian influenza A (H5N1) virus in China. BMC Infect. Dis. 15, 54. https://doi.org/10.1186/s12879-015-0770-x Tropf, F.C., Stulp, G., Barban, N., Visscher, P.M., Yang, J., Snieder, H., Mills, M.C., 2015. Human fertility, molecular genetics, and natural selection in modern societies. PloS One 10, e0126821. https://doi.org/10.1371/journal.pone.0126821 Turelli, M., 1984. Heritable genetic variation via mutation-selection balance: Lerch’s zeta meets the abdominal bristle. Theor. Popul. Biol. 25, 138–193. Turelli, M., Barton, N.H., 2004. Polygenic variation maintained by balancing selection: pleiotropy, sex-dependent allelic effects and G x E interactions. Genetics 166, 1053–1079. Ulizzi, L., Terrenato, L., 1992. Natural selection associated with birth weight. VI. Towards the end of the stabilizing component. Ann. Hum. Genet. 56, 113–118. van Delden, W., Boerema, A.C., Kamping, A., 1978. The alcohol dehydrogenase polymorphism in populations of Drosophila melanogaster. I. 
Selection in different environments. Genetics 90, 161–191. van Strien, M.J., Keller, D., Holderegger, R., Ghazoul, J., Kienast, F., Bolliger, J., 2014. Landscape genetics as a tool for conservation planning: predicting the effects of landscape change on gene flow. Ecol. Appl. Publ. Ecol. Soc. Am. 24, 327–339. Vasseur, E., Quintana-Murci, L., 2013. The impact of natural selection on health and disease: uses of the population genetics approach in humans. Evol. Appl. 6, 596–607. https://doi.org/10.1111/eva.12045 Verrelli, B.C., McDonald, J.H., Argyropoulos, G., Destro-Bisol, G., Froment, A., Drousiotou, A., Lefranc, G., Helal, A.N., Loiselet, J., Tishkoff, S.A., 2002. Evidence for Balancing Selection from Nucleotide Sequence Analyses of Human G6PD. Am. J. Hum. Genet. 71, 1112–1128. Voight, B.F., Kudaravalli, S., Wen, X., Pritchard, J.K., 2006. A Map of Recent Positive Selection in the Human Genome. PLOS Biol. 4, e72. https://doi.org/10.1371/journal.pbio.0040072 Volis, S., 2008. Detection of signatures of positive selection in naturally occurring genetic variation. Popul. Genet. Res. Prog. 279.


Walker, B.R., Colledge, N.R., 2013. Davidson’s Principles and Practice of Medicine E-Book. Elsevier Health Sciences. Wang, B., Mitchell-Olds, T., 2017. Balancing selection and trans-specific polymorphisms. Genome Biol. 18, 231. https://doi.org/10.1186/s13059-017-1365-1 Wang, I.J., Savage, W.K., Shaffer, H.B., 2009. Landscape genetics and least-cost path analysis reveal unexpected dispersal routes in the California tiger salamander (Ambystoma californiense). Mol. Ecol. 18, 1365–1374. https://doi.org/10.1111/j.1365-294X.2009.04122.x Wang, I.J., Shaffer, H.B., 2017. Population genetic and field-ecological analyses return similar estimates of dispersal over space and time in an endangered amphibian. Evol. Appl. 10, 630–639. https://doi.org/10.1111/eva.12479 Williams, T.N., 2006. Human red blood cell polymorphisms and malaria. Curr. Opin. Microbiol., Host microbe interactions: fungi/Host microbe interactions: parasites/Host microbe interactions: viruses 9, 388–394. https://doi.org/10.1016/j.mib.2006.06.009 Williams, T.N., Mwangi, T.W., Wambua, S., Alexander, N.D., Kortok, M., Snow, R.W., Marsh, K., 2005. Sickle Cell Trait and the Risk of Plasmodium falciparum Malaria and Other Childhood Diseases. J. Infect. Dis. 192, 178–186. https://doi.org/10.1086/430744 Wilson, D.J., Hernandez, R.D., Andolfatto, P., Przeworski, M., 2011. A population genetics-phylogenetics approach to inferring natural selection in coding sequences. PLoS Genet. 7, e1002395. Wilson, D.J., McVean, G., 2005. Estimating diversifying selection and functional constraint in the presence of recombination. Genetics. Wooding, S., Kim, U.-K., Bamshad, M.J., Larsen, J., Jorde, L.B., Drayna, D., 2004. Natural selection and molecular evolution in PTC, a bitter-taste receptor gene. Am. J. Hum. Genet. 74, 637–646. https://doi.org/10.1086/383092 Wu, Q., Han, T.-S., Chen, X., Chen, J.-F., Zou, Y.-P., Li, Z.-W., Xu, Y.-C., Guo, Y.-L., 2017.
Long-term balancing selection contributes to adaptation in Arabidopsis and its relatives. Genome Biol. 18, 217. https://doi.org/10.1186/s13059-017-1342-8 Xiao, J., Li, J., Yuan, L., Tanksley, S.D., 1995. Dominance is the major genetic basis of heterosis in rice as revealed by QTL analysis using molecular markers. Genetics 140, 745–754. Yang, Z., 2006. Computational Molecular Evolution. OUP Oxford. Yeager, M., Hughes, A.L., 1999. Evolution of the mammalian MHC: natural selection, recombination, and convergent evolution. Immunol. Rev. 167, 45–58. https://doi.org/10.1111/j.1600-065X.1999.tb01381.x

Figure legend

Fig. 1: Mechanisms that cause maladaptation. Migration causes either (a) gene flow from the central population to a marginal population, which subsequently leads to maladaptation, or (b) range expansion, which maintains both advantageous and deleterious alleles at the expanding range margins. If the frequency of deleterious alleles increases over time, maladaptation may result. If the previous and current environment or lifestyle, including food habits, are the same, an increase in the frequency of advantageous alleles over time provides protection; if the current environment or lifestyle, including food habits, contrasts with the previous one, the same increase causes maladaptation.