Dynamics of Fst for the island model

Dynamics of Fst for the island model

ARTICLE IN PRESS Theoretical Population Biology 72 (2007) 485–503 www.elsevier.com/locate/tpb Dynamics of F st for the island model Sivan Rottenstre...

525KB Sizes 0 Downloads 74 Views

ARTICLE IN PRESS

Theoretical Population Biology 72 (2007) 485–503 www.elsevier.com/locate/tpb

Dynamics of F st for the island model Sivan Rottenstreicha,, Matthew B. Hamiltonb, Judith R. Millera a

Department of Mathematics, Georgetown University, St. Mary’s Building, Washington, DC 20057, USA b Department of Biology, Georgetown University, Reiss Science Building, Washington, DC 20057, USA Received 27 December 2006 Available online 4 September 2007

Abstract 1 F st is a measure of genetic differentiation in a subdivided population. Sewall Wright observed that F st ¼ 1þ2Nm in a haploid diallelic infinite island model, where N is the effective population size of each deme and m is the migration rate. In demonstrating this result, Wright relied on the infinite size of the population. Natural populations are not infinite and therefore they change over time due to genetic drift. In a finite population, F st becomes a random variable that evolves over time. In this work we ask, given an initial population state, what are the dynamics of the mean and variance of F st under the finite island model? In application both of these quantities are critical in the evaluation of F st data. 1 We show that after a time of order N generations the mean of F st is slightly biased below 1þ2Nm . Further we show that the variance of 1 F st is of order d where d is the number of demes in the population. We introduce several new mathematical techniques to analyze coalescent genealogies in a dynamic setting. r 2007 Elsevier Inc. All rights reserved.

Keywords: F statistics; F st ; Island model; Coalescent

1. Introduction Populations are often divided into subpopulations referred to as demes. Individuals within the same deme tend to be more genetically similar than individuals in different demes. One way of quantifying this phenomena is through expected heterozygosity. The expected total heterozygosity of the population, H T , is the probability that two gametes randomly selected from the total population are not identical in state. The expected within deme heterozygosity, H W , is the probability that two gametes randomly selected from the same deme are not identical in state. Typically one would expect H T 4H W . Wright introduced the quantity F st to measure the relative expected heterozygosity in bialleleic subdivided populations (Wright, 1942). The formula for F st is F st ¼

HT  HW . HT

(1.1)

Corresponding author.

E-mail address: [email protected] (S. Rottenstreich). 0040-5809/$ - see front matter r 2007 Elsevier Inc. All rights reserved. doi:10.1016/j.tpb.2007.08.007

The island model was introduced by Wright (1931) as a simple model of subdivided populations. A well known result concerning F st , due to Wright, states that for haploid 1 populations in an infinite island model, F st ¼ 1þ2Nm (Wright, 1942), where N is the size of each deme and m is the migration rate. If we want to understand natural populations then we cannot assume an infinite population size; hence we must relax the infinite population size assumption implicit in the infinite island model. Many authors have addressed this issue by considering F st for the finite island model, e.g. Nei et al. (1977), Takahata (1983), Crow and Aoki (1984) and Slatkin (1991). Due to drift in a finite population, F st at any point in time will be a random variable, where we think of F st over the set of all possible genealogies given a certain start state for the population. One is then interested in computing the mean and variance of F st at various points in time. The mean and variance can be computed in two different settings. When mutation is present we may consider the steady state distribution of F st . However, when mutation is very low or for some reason can be ignored F st has no

ARTICLE IN PRESS 486

S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

steady state since eventually one of the alleles will fix and we will arrive at F st ¼ 00. In such a situation we need to consider the dynamics of the mean and variance of F st as time evolves. This paper focuses on the following question: what are the dynamics of the mean and variance of F st in the island model without mutation for large, but finite, values of N and d and a migration rate, m, such that Nm5N; d. The literature on dynamic F st can be split into two categories. In one category, e.g. Nei et al. (1977) and Takahata (1983), the authors derive matrix equations that describe the dynamics of the mean and variance of H W and H T . For the mean this involves solving a two by two matrix equation; for the variance this involves solving a seven by seven matrix equation (Takahata, 1983). In practice solving the variance matrix equation is difficult. As Takahata states ‘actual calculations of such equations except for [the mean] is, however, often tedious so that is was done numerically’. Having solved for the moments of H W and H T one then uses a Taylor series approximation of F st to approximate the mean and variance of F st . It is difficult to show that the error term involved in these Taylor expansion is small (Nei et al., 1977; Takahata, 1983). In practice the complexity of the matrix equation and the Taylor expansion prevent a clear understanding of the dynamics of F st through this approach. In the second category of papers a quantity different than F st is analyzed. Letting E½ represent the expectation over all possible genealogies (in Section 2 we state more precisely what we mean by this) one can define a quantity, which we call F~ st , as follows: E½H T  H W  F~ st ¼ . (1.2) E½H T  F~ st is a deterministic quantity while F st is a random variable. In the population genetics literature, F~ st is often used as an approximation to E½F st , e.g. Wakeley (1998) and Vitalis and Couvet (2001). In general, the accuracy of this approximation is unclear. In Slatkin (1991) and Crow and Aoki (1984), F~ st rather than F st is analyzed. Since F~ st involves only the means of H T and H W this quantity is amenable to matrix equation methods (Crow and Aoki, 1984). Slatkin used coalescent theory to analyze F~ st (Slatkin, 1991). In doing so, Slatkin moved the analysis of F~ st from a matrix equation based approach to a coalescent theory based approach. In both (Slatkin, 1991; Crow and Aoki, 1984), F~ st is considered for an island model with mutation. Both papers show that the 1 steady state value of F~ st is 1þ2Nm and that regardless of the start state of the population the steady state is approached at an exponential rate in a time of order N. While this is a valuable result, it does not address the main question of this paper for two reasons. First, F~ st is not F st . Analysis of F~ st cannot substitute for analysis of F st . For example, F st has a variance while F~ st has no variance because it is deterministic. Hence, questions relating to variance, which are important in application, cannot be considered through

F~ st . Second, when mutation is not present or is very low F st has no steady state and so the results in Slatkin (1991) and Crow and Aoki (1984) do not apply. Understanding the mean and variance of F st in a dynamic setting is biologically interesting for two reasons. First, our current understanding of the variance of F st is incomplete. Typically, confidence intervals for F st data are built through bootstrapping techniques (Weir and Cockerham, 1984, 2002). An analytic framework for computing the variance of F st should help to shed light on these bootstrapping techniques and the formation of F st confidence intervals in general. Second, currently most F st analysis assumes a steady state value of F st . As pointed out in Whitlock and Mccauley (1999), populations cannot always be assumed to be in steady state. Understanding the effect of dynamics on F st is important not only for populations not in a steady state, but also for appreciating the implicit assumptions one makes on F st when a steady state is assumed. This paper is in spirit a continuation of Slatkin’s use of the coalescent to understand F~ st . However, we consider F st rather than F~ st . We use coalescent theory to compute the dynamics of the mean and variance of F st . Our results are novel in several ways. We derive explicit formulas for the mean and variance of F st as time evolves. Our formulas for the mean and variance provide a clear picture of how F st behaves over time. Finally our results are rigorous in the sense that we have explicit error bounds on the approximations we use. Our analysis uses and extends ideas developed in Wakeley (1998, 2003) and Wakeley and Takahashi (2004). We will make explicit the various connections of our approach to these works. The outline of this paper is as follows. In Section 2 we state the results of this paper. In Section 3 we introduce basic definition and notation. Sections 4 and 5 form the core of this paper, they consider the mean and variance of F st , respectively. We have placed some of the more mathematical technicalities needed in these two sections in the Appendix. Finally in Section 6 we discuss our results and the techniques used to derive them. 2. Results Figs. 1–5 represent data from a single run of a simulation of evolution in a biallelic haploid island model with 100 demes, 200 individuals in each deme, and a migration rate of :005. In the start state half the demes were homogeneous for one allele and half the demes were homogeneous for the other allele. This implies that the start state has F st ¼ 1. There is no mutation in this model. Hence the only steady state is a fixed population. We ran the model until the population fixed for one of the alleles. This took 25 080 generations. Figs. 1 and 2 show the F st value and allele frequency, respectively, over the entire simulation. Figs. 3–5 are enlargements of different sections of Fig. 1. As we will explain they represent different stages in the dynamics of

ARTICLE IN PRESS 1

1

0.9

0.9

0.8

0.8

0.7

0.7

0.6

0.6 Fst

Fst

S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

0.5

487

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1

0.1 0

0 0

0.5

1

1.5

2 x

generations

0

2.5

50

100

150

200

250

300

350

400

generations

104

Fig. 3. Initial stage.

Fig. 1. Value of F st plotted until population becomes fixed. N ¼ 200, d ¼ 100, m ¼ :005.

1 1

0.9 0.8

0.8

0.7

0.7

0.6

0.6

Fst

allelic frequency

0.9

0.5

0.5

0.4

0.4

0.3

0.3

0.2

0.2

0.1

0.1

0 1000

0 0

0.5

1

1.5 generations

2

2.5 x 104

1500

2000

2500

3000

3500

4000

generations Fig. 4. Quasi-steady stage.

Fig. 2. Frequency of allele in total population over time.

F st . The goal of this paper is to explain to the reader why these graphs look the way they do. The dynamics of F st can be split into three stages: an initial stage, a quasi-steady stage, and an end stage. The initial stage takes on order of N generations. In this stage 1 F st changes from its initial value to approximately 1þ2Nm . In the quasi-steady stage F st varies but within a window of 1 width of order p1ffiffid about 1þ2Nm . The quasi-steady stage lasts from generation of order N to generation of order Nd. At the end stage F st becomes increasingly erratic and does not 1 stay close to 1þ2Nm . During the end stage the population is approaching fixation for one of the two alleles. All this will be made precise through Theorems 1–3. This description of the dynamics is borne out in the figures. All three stages are shown in Fig. 1. The initial stage is enlarged in Fig. 3. F st starts at 1. Then

1 ¼ 13. Fig. 4 within N ¼ 200 generations it gets close to 1þ2Nm 1 enlarges the quasi-steady stage. F st stays within p1ffiffid ¼ 10 of 1 the value 3. Finally Figs. 2 and 5 show the erratic nature of F st in the end stage as the population approaches fixation. The dynamics of the initial stage of F st are well known through numerical simulation (Crow and Aoki, 1984). To our knowledge there are no analytic results that describe the initial stage behavior for F st , rather as discussed in the Introduction previous analytic results have dealt with F~ st . The dynamics of the quasi-steady stage of F st are not well known. Specifically, although Crow and Aoki (1984) and Slatkin (1991) and others have shown numerically that 1 after an initial stage F st stays fixed around 1þ2Nm , we are not aware of any work that derives the variance of F st during this stage. The erratic behavior of F st as fixation approaches has been noted previously in a qualitative manner e.g. Nei et al. (1977). However, there has been no

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

488

1

Under these assumptions we develop asymptotic expansions for the mean and variance of F st ðT e Þ. We condition both the mean and variance on the allelic state of the population at time T i . The allelic state of the population at T i is determined by the values of pk ðT i Þ for k ¼ 1; . . . ; d. Also, since F st ðT e Þ is undefined if pðT e Þ ¼ 0; 1, which occurs if the population fixes, we condition on pðT e Þa0; 1. We let E½ and V ½ denote expectation and variance over all possible genealogies that can be formed for the whole population of the island model from time T i to T e , we consider the following mean and variance:

0.9 0.8 0.7

Fst

0.6 0.5 0.4 0.3 0.2

E½F st ðT e ÞjT i ; pðT e Þa0; 1, V ½F st ðT e ÞjT i ; pðT e Þa0; 1.

0.1 0 2

2.1

2.2 2.3 generations

2.4

2.5 x 104

Fig. 5. End stage.

previous work quantifying the relationship between allelic frequency and the onset of the end stage. Before describing our results more precisely we need to define some variables and set some notation. The variables that define the finite island model are

ð2:3Þ

We have three main results that we express as Theorems 1–3. We present the theorems and then discuss the motivation behind their assumptions and their consequences. Theorem 1. If NbdbG and 15T e  T i 5d (more precisely T e  T i ¼ d 1 for any 0oo1) then E½F st ðT e ÞjT i ; pðT e Þa0; 1   1 1 h¯ E ðT i Þ 1  þ O 1þ=2 , ¼ 2G þ 1 d 2G þ 1 d

ð2:4Þ

where d N m

number of demes number of individuals in each deme migration rate

h¯ E ðtÞ ¼

For convenience we also set G ¼ Nm.

(2.1)

We consider a biallelic locus for a haploid model. The population evolves over non-overlapping generations through a Wright–Fisher mating model (we will make this precise in Section 3). We measure time in unit of N generations. So time t corresponds to generation Nt. We define

  2G 1 2 ð1 þ G þ 4G Þ þ G . pðtÞqðtÞ ð1 þ GÞð1 þ 2GÞ2

ð2:5Þ

Theorem 2. If NbdbG and 15T e  T i 5d (more precisely T e  T i ¼ Nd 1 for any 0oo1) then V ½F st ðT e ÞjT i ; pðT e Þa0; 1   1 1 ¼ aðGÞ b1 ðGÞ þ b2 ðGÞ d pðT i ÞqðT i Þ   1 þ O 1þ=2 , d

ð2:6Þ

where pðtÞ pk ðtÞ qðtÞ qk ðtÞ

frequency frequency frequency frequency

of of of of

allele allele allele allele

A in total population at time t A in deme k at time t B in total population at time t B in deme k at time t.

With this P notation we have H T ðtÞ ¼ pðtÞqðtÞ and H W ðtÞ ¼ d1 dk¼1 pk ðtÞqk ðtÞ. Then P pðtÞqðtÞ  d1 dk¼1 pk ðtÞqk ðtÞ . (2.2) F st ðtÞ ¼ pðtÞqðtÞ We assume the following relationship between N, d, and m: NbdbG. We assume that the population starts evolving at some initial time, T i . We make no assumptions on the population at T i except that it is not fixed for one of the alleles. We consider the population at some end time, T e .

aðGÞ ¼

G , ð1 þ 2GÞ ð1 þ GÞð3 þ 2GÞ 4

b1 ðGÞ ¼ 8Gð3 þ 18G þ 100G2 þ 116G3 þ 40G4 Þ, b2 ðGÞ ¼ 1 þ 29G þ 98G2 þ 96G3 þ 32G4 .

ð2:7Þ

Theorem 3. If Nbdb1 and T e  T i 5d then       E½F st ðT e ÞjT i ; p  1  ¼ OðeGðT e T i Þ Þ þ O 1 .  1 þ 2G d (2.8) V ½F st ðT e ÞjT i ; p ¼ OðeGðT e T i Þ Þ þ O

  1 . d

(2.9)

These theorems give asymptotic expansions for the mean and variance of F st ðT e Þ. In contrast to previous work, we have specific bounds on the error of our approximation in

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

terms of the order of d. This is an improvement on the previous methods used to analyze F st in which the order of the error was unknown. Notice the time restrictions 15T e  T i 5d in Theorems 1 and 2 and T e  T i 5d in Theorem 3. As noted in Wakeley (2003), Wakeley and Takahashi (2004), Wilkins (2004), the population dynamics of the island model have two scales. pðtÞ, which reflects the whole population, drifts on a timescale of order d. pk ðtÞ, which reflects a single deme, drifts on a timescale of order 1. F st is a combination of these two timescales. Choosing 15T e  T i 5d gives a length of time that is long enough to allow H W to reach an equilibrium but short enough so that H T does not change and upset that equilibrium. Theorems 1–3 explain Figs. 1–5. Recall that time is measured in units of N generations. In the context of this simulation, since N ¼ 200, a unit of time is 200 generations. If we set T i ¼ 0 and 15T e 5d ¼ 100 in the first two theorems then T e corresponds to generations between 200 and 20 000 which is precisely in the quasisteady stage shown by the simulation. In the quasi1 steady stage Theorems 1 and 2 say that F st  1þ2Nm 1 ¯ with a bias of 1þ2G hE ðT i Þand a standard deviation rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi aðGÞðb1 ðGÞþb2 ðGÞ

1 ÞÞ pðT i ÞqðT i Þ

. d We can use Theorem 3 to understand the initial stage. Theorem 3 shows that in the initial stage F st will go 1 exponentially in time of order 1 to 1þ2G . Further as the initial stage progresses, the standard deviation of F st will go exponentially in time of order 1 to size of order p1ffiffid . This agrees with the observation of Crow and Aoki (1984) and Slatkin (1991) made through the analysis of F~ st . The first two theorems specify when the end stage begins. The asymptotic expansion of Theorems 1 and 2 fail to hold when pðT i ÞqðT i Þ  d1. Since we can choose T i as we wish, this implies that F st is in the quasi-steady stage as long as 1 d1 d opðtÞo d . In our simulation d ¼ 100, so the quasi-steady stage ends and the end stage begins when pðtÞ ¼ :01 or pðtÞ ¼ :99. Examining Fig. 2 we see that pðtÞ  :99 at approximately generation 20 000. Then we see in Fig. 5 that after generations 20 000 F st becomes erratic. We emphasize that our results do not provide a description of F st in the end stage. Finally, we comment on the condition Nbd. This is actually not necessary for our results. If we removed this assumption we would have asymptotic expansions in N and d, rather than just d. However, we have not pursued this because the computations and resultant formulas would be much more cumbersome. 3. The model and definitions In this section we precisely define the model and provide definitions and notation. We define xk;j ðtÞ ¼ indicator variable for allele A for individual j in deme k at time t.

489

Then, pðtÞ ¼

d X N 1 X xk;j ðtÞ, Nd k¼1 j¼1

pk ðtÞ ¼

N 1X xk;j ðtÞ. N j¼1

ð3:1Þ

For convenience we set f st ðtÞ ¼ 1  F st ðtÞ. This means Pd 1 k¼1 pk ðtÞqk ðtÞ d . (3.2) f st ðtÞ ¼ pðtÞqðtÞ Notice f st ðtÞ ¼

H W ðtÞ H T ðtÞ ,

so that f st represents the ratio of within

deme heterozygosity to total population heterozygosity. It is mathematically easier to consider f st rather than F st . In Sections 4 and 5 we derive the mean and variance of f st . It is then straightforward to derive the mean and variance of F st . Often we will consider f st and p between two points in time. Let t1 and t2 be two time points, then we set Pd 1 k¼1 pk ðt2 Þqk ðt2 Þ d , f st ðt1 ; t2 Þ ¼ pðt1 Þqðt1 Þ Dpðt1 ; t2 Þ ¼ pðt2 Þ  pðt1 Þ. ð3:3Þ Notice f st ðtÞ ¼ f st ðt; tÞ. Now we define the island model more precisely. Given the current generation, we determine the allelic state of the jth individual in the kth deme in the next generation as follows. With probability m, this individual is given the allelic state of an individual in the current generation selected uniformly from all individuals not in the kth deme. With probability 1  m, this individual is given the allelic state of an individual in the current generation selected uniformly from all individuals in the kth deme. We consider F st ðtÞ on a range of times T e XtXT i . T e and T i are chosen according to the conditions of Theorems 1–3. To make this precise we set T e  T i ¼ d 1 for 0oo1. Hence 15T e  T i 5d. When we write E½f st ðt2 Þjt1  we mean the expected value of f st ðt2 Þ conditioned on the allelic state of the population at time t1 , i.e. pk ðt1 Þ for k ¼ 1; . . . ; d. To shorten our formulas we write E½f st ðt2 Þjt1 ; p for E½f st ðt2 Þjt1 ; pðT e Þa0; 1. We will analyze the finite island model through a coalescent approach. The symmetry of the island model allows us to describe the coalescent at any point in time by a state. The state of the coalescent is given by the number of uncoalesced individuals in separate demes. So for example suppose we start a coalescent with six individuals. Three individuals are in a certain deme, two individuals are in another deme, and one individual is in a third deme. Then the start state of the coalescent is ð3; 2; 1Þ. Now suppose that two of the individuals in the deme with three individuals coalesce. Then the coalescent contains five individuals. Two individuals share one deme, two individuals share a second deme, and one individual is in a third deme. So the state of the coalescent is ð2; 2; 1Þ. As a final example suppose that the coalescent has evolved to contain only two individuals each in a separate deme. Then the state of the coalescent is given by ð1; 1Þ.

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

490

4. E½f st ðT e ÞjT i ; p In this section we compute E½f st ðT e ÞjT i ; p. Our approach consists of three steps. First, we divide the interval T e  T i into smaller intervals of size s and derive a Taylor expansion for the difference: E½f st ðt þ sÞ  f st ðtÞjt; p. Second, we evaluate the terms in the Taylor expansion of E½f st ðt þ sÞ f st ðtÞjt; p. Using the results from the first two steps, we are able to write down a difference equation for E½f st ðtÞjT i ; p. In the third step, we solve the difference equation to find E½f st ðT e ÞjT i ; p. Below we separate these three steps into three subsections. 4.1. Taylor expansion of f st To compute E½f st ðT e ÞjT i ; p we split the time interval T e  T i into time intervals of size s with N1 5s51 (the reason for this range of s will be explained as we go along). We then derive an equation for the change in f st over each interval, E½f st ðt þ sÞ  f st ðtÞjT i ; p. We start in this direction by first considering f st ðt þ sÞ  f st ðtÞ. Standard algebra gives f st ðt þ sÞ  f st ðtÞ  ¼ f st ðt; t þ sÞ

 1  f st ðt; tÞ, 1 þ Gðt; t þ sÞ

ð4:1Þ

4.2. Computation of moments In this section we show how to compute the expressions in (4.4). Actually, for the sake of clarity, we compute slightly different expressions. Rather than conditioning on T i and pðT e Þa0; 1, in this section we condition on t. This change does not alter our computations, but it makes the formulas in this section more readable and intuitive. We demonstrate in Appendix A.2 that the results of this section extend to conditioning on T i and pðT e Þa0; 1. In Section 4.3 we return to conditioning on T i and pðT e Þa0; 1. Specifically, then, in this section we consider E½f st ðt; t þ sÞ  f st ðt; tÞjt, E½f st ðt; t þ sÞG h ðt; t þ sÞjt

where

for h ¼ 1; . . . ; M þ 1.

1 ðð1  2pðtÞÞDpðt; t þ sÞ Gðt; t þ sÞ ¼ pðtÞqðtÞ  Dp2 ðt; t þ sÞÞ. The term

1 1þGðt;tþsÞ

ð4:2Þ

in (4.1) can be expanded in Taylor series

1 ¼ 1  G þ G2  G3 þ    þ GM þ through the identity 1þG OðG Mþ1 Þ, where M is any positive integer (for now we leave M unspecified, although we will eventually choose M ¼ 2). This expansion holds for Gðt; t þ sÞo1 and by choosing s small enough we can guarantee this with high probability. In Appendix A.2, we show that we may limit the analysis to the case Gðt; t þ sÞo1 because Gðt; t þ sÞX1 has low probability and does not effect our expansions. Plugging this Taylor expansion into (4.1) gives

M X

ð1Þh G h ðt; t þ sÞ

h¼1

þ Oðf st ðt; t þ sÞG Mþ1 ðt; t þ sÞÞ.

ð4:3Þ

Now we apply E½jT i ; p to (4.3). We see that we must consider two expressions: E½f st ðt; t þ sÞ  f st ðt; tÞjT i ; p, E½f st ðt; t þ sÞG h ðt; t þ sÞjT i ; p, for h ¼ 1; . . . ; M þ 1.

ð4:4Þ

In the next subsection we show how to compute these two expressions.

ð4:5Þ

We split the computation of these two quantities into two sections. In Section 4.2.1 we compute E½f st ðt; t þ sÞ  f st ðt; tÞjt through a coalescent approach. In Section 4.2.2 we compute E½f st ðt; t þ sÞG h ðt; t þ sÞjt through a novel coalescent approach that we call the coupled coalescent. 4.2.1. E½f st ðt; t þ sÞ  f st ðt; tÞjt We have E½f st ðt; t þ sÞ  f st ðt; tÞjt # "  d 1 1X  E pk ðt þ sÞqk ðt þ sÞt ¼  pðtÞqðtÞ d k¼1 ! d 1X  pk0 ðtÞqk0 ðtÞ . d 0 k ¼1

f st ðt þ sÞ  f st ðtÞ ¼ ½f st ðt; t þ sÞ  f st ðt; tÞ þ f st ðt; t þ sÞ 

In previous work (Nei and Chakravarti, 1977; Nei et al., 1977; Takahata, 1983), a Taylor expansion similar to the one we have just described is used to analyze E½f st ðT e Þ. However, there is one critical difference in their approach and ours. We split the interval T e  T i into intervals of size s. In previous works, the authors do not split up the interval T e  T i . By choosing s51 the computation of the expressions in (4.4) is made simpler. In previous work, the expressions in (4.4) are analyzed using matrix equations. But this analysis does not get far beyond simply writing down the matrix equations.

ð4:6Þ

P So the only quantity we need to compute is E½d1 dk¼1 pk ðt þ sÞqk ðt þ sÞjt. We do this through a coalescent argument. We start with the following observation that allows us to make a connection between allele frequencies, pðtÞ; qðtÞ and pk ðtÞ; qk ðtÞ, and the allelic state of a given individual, xk;j ðtÞ: # "  d 1X  E pk ðt þ sÞqk ðt þ sÞt  d k¼1 " ! d N 1X 1X ¼ E xk;j ðt þ sÞ d k¼1 N j¼1 0 1 3  N X  1 ð1  xk;j0 ðt þ sÞÞAt5 @ N j 0 ¼1 

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

¼

d N 1X 1 X E½xk;j ðt þ sÞð1  xk;j 0 ðt þ sÞÞjt d k¼1 N 2 j;j0 ¼1;jaj0

d 1X N 1 E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞjt d k¼1 N   d 1X 1 ¼ E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞjt þ O . d k¼1 N

¼

ð4:7Þ

The key observation that justifies the result directly above is E½xk;j ðt þ sÞð1  xk;j0 ðt þ sÞÞjt ¼ E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞjt,

ð4:8Þ

which holds because under a Wright–Fisher mating scheme all individuals in deme k at time t þ s have the same allelic distribution. Eq. (4.7) shows that the expected value of pk ðt þ sÞqk ðt þ sÞ can be analyzed in terms of the expected value of the allelic state of two individuals, xk;1 and xk;2 . This relation can be generalized to show that the expected value of any expression involving the product of n allelic frequencies can expressed in terms of the expected value of n individual allelic states. So for example E½pk ðt þ sÞ5 pðt þ sÞ3 qðt þ sÞ2  can be expressed in terms of the expected value of 5 þ 3 þ 2 ¼ 10 individual allelic states. E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞjt can be analyzed as a coalescent of two individuals with time run backwards from time t þ s to time t. In general we can analyze the expected value of n individual allelic states through a coalescent containing n individuals. It is a standard result that if we measure time in units of N generations and if N is very large we may replace the discrete coalescent process by a continuous coalescent, see Wilkinson-Herbots (1998) and references therein. The error involved in this approximation is of order N1 and since we are taking Nbd this will not effect our analysis. In the continuous process the migration is a continuous time Poisson processes with rate G while the chance of two individuals coalescing within a group of k individuals sharing the same deme is a Poisson process with  rate k2 (Wilkinson-Herbots, 1998). There are some differences between our treatment of the coalescent and the standard equilibrium treatment of the coalescent. First, we caution the reader that in our analysis of the coalescent, unlike the standard approach in coalescent theory, we do not reverse the direction of time. We maintain the forward direction of time because we are analyzing f st forward in time and we want to understand the coalescent within this context. To be clear then, in our coalescent analysis when we consider two time points t þ s and t, the time t þ s represents a later time than t. In a standard coalescent analysis time would be reversed and t þ s would represent an earlier time than t. But a second more fundamental difference is that we run our coalescent over a fixed time interval, time t þ s to time t, and we condition on the state of the individuals in the coalescent at time t. To make this clear, consider Fig. 6. Here we show two separate realization, (a) and (b), of a coalescent involving xk;1 and xk;2 . For clarity we have not

491

shown the population separated into demes. In Fig. 6, the numbers 1 and 2 refer to lineages of xk;1 and xk;2 , respectively. The top row of squares represents the population at time t þ s (so in this example the population has four individuals), the bottom row represents the population at time t. Each square in the bottom row contains either an A or a B. This refers to the allelic state of the individual. So, for example, in (a) the individual represented by the box on the far left of the bottom row has an allelic state A. The letters a; b; c, and d serve as labels. In both figures (a) and (b), xk;1 ðt þ sÞ is represented by the box labeled a and xk;2 ðt þ sÞ is represented by the box labeled c. In both figures, moving backwards in time, xk;1 moves from boxes a to b while xk;2 moves from boxes c to d. In figure (b), xk;1 and xk;2 coalesce in box e. In (a), xk;1 and xk;2 do not coalesce between time t þ s and time t. Consider the value of xk;1 ðt þ sÞ and xk;2 ðt þ sÞ. Since there is no mutation in the model, the allelic state of xk;1 and xk;2 at time t þ s is determined by their ancestor at time t. In figure (a), the two have different ancestors both with allelic state A. So for figure (a), xk;1 ðt þ sÞ ¼ 1 and xk;2 ðt þ sÞ ¼ 1. In figure (b) the two have coalesced before time t. So we will have xk;1 ðt þ sÞ ¼ xk;2 ðt þ sÞ. We see further that their common ancestor at time t has allelic state B, so we will have xk;1 ðt þ sÞ ¼ xk;2 ðt þ sÞ ¼ 0. When we consider a coalescent from time t þ s to time t any number of migration events or coalescent events can happen. When we use the word event we mean either of these two types of events. In time s, the probability of two events occurring is Oðs2 Þ. In the following discussions we will not mention cases when more than one event occurs from time t þ s to t. We will simply group all such cases in an Oðs2 Þ term. This is the advantage of taking s51. Now thinking of the coalescent, hence backwards in time, P from time t þ s to t we can write an iterative equation for d1 dk¼1 E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞjt. d 1X E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞjt d k¼1

¼ ð1  2Gs  sÞL0 þ GsL1 þ GsL2 þ Oðs2 Þ,

ð4:9Þ

where L0 ¼

L1 ¼

L2 ¼

d N 1X 1 X 2 d k¼1 N j¼1

N X

xk;j ðtÞð1  xk;j 0 ðtÞÞ,

j 0 ¼1;j 0 aj

d 1X 1 d k¼1 N 2 ðd  1Þ

d X

N X N X

xk0 ;j ðtÞð1  xk;j 0 ðtÞÞ,

k0 ¼1;k0 ak j¼1 j 0 ¼1

d 1X 1 d k¼1 N 2 ðd  1Þ



d X

N X N X

xk;j ðtÞð1  xk0 ;j 0 ðtÞÞ.

ð4:10Þ

k0 ¼1;k0 ak j¼1 j 0 ¼1

We explain the above equation. The coalescent at time t þ s is in state ð2Þ, i.e. there is one deme with two individuals. If

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

492

a

time=t+s

c

1

a

2 b

d

1

c 2

1 b 2

d 2

1 e

1

2

1 A

1,2

2 B

A

A

1,2 time=t

A

B

A

A

Fig. 6. Two realizations of a time restricted coalescent.

in the time interval from time t þ s to t no event occurs then the coalescent state remains ð2Þ. This has probability 1  2Gs  s þ Oðs2 Þ since   the rate of migration is G and the rate of coalescence is 22 ¼ 1. If xk;1 migrates the coalescent enters state ð1; 1Þ. This has probability Gs þ Oðs2 Þ. If xk;2 migrates then the coalescent enters state ð1; 1Þ. This has probability Gs þ Oðs2 Þ. If a coalescence between the two individuals occurs then the coalescent enters state ð1Þ. This has probability s þ Oðs2 Þ. L0 represents the case in which the coalescent stays in state ð2Þ. In this case xk;1 and xk;2 follow separate lineages backwards from time t þ s to time t. At time t the ancestor of each of these individuals is equally likely to be any of the N individuals in deme k. From this argument we have L0 ¼ E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞjt, remain in state (2)]. L1 represents the case in which xk;1 migrates and the coalescent moves to state ð1; 1Þ. Since xk;1 migrates, at time t the ancestor of xk;1 ðt þ sÞ is equally likely to be any individual in any deme other than k. This gives L1 ¼ E½xk;1 ðt þ sÞxk;2 ðt þ sÞjt, move to state (1,1)]. L2 represents the case in which xk;2 migrates and is symmetric to L1 . If a coalescence occurs then xk;1 and xk;2 at time t þ s have the same common ancestor. Hence xk;1 ðt þ sÞ ð1  xk;2 ðt þ sÞÞ ¼ 0. Combining the probability of each of the five cases with the corresponding conditional expectation gives (4.9). Now we can re-express L0 , L1 , and L2 in terms of pðtÞ; qðtÞ; pk ðtÞ; qk ðtÞ. We could do this by algebraic manipulation, but it is easier to take a probabilistic approach. In the case of L0 , the allelic states of xk;1 ðt þ sÞ and xk;2 ðt þ sÞ are formed by independent draws, up to OðN1 Þ, of individuals from deme k at time t. This means they each have a probability pk ðtÞ of evaluating to one. A similar argument can be applied to L1 and L2 . This gives   d 1X 1 L0 ¼ pk ðtÞqk ðtÞ þ O , d k¼1 N

1 L1 ¼ L2 ¼ pðtÞqðtÞ þ d   1 þO 2 . d

d 1X pðtÞqðtÞ  p ðtÞqk ðtÞ d k¼1 k

!

Plugging into (4.7), we have a formula for E½d1 pk ðt þ sÞqk ðt þ sÞjt: # "  d 1X  E pk ðt þ sÞqk ðt þ sÞt  d k¼1 ¼ ð1  ð2G þ 1ÞsÞ

ð4:11Þ Pd

k¼1

d 1X p ðtÞqk ðtÞ þ 2GspðtÞqðtÞ d k¼1 k

2Gs ðpðtÞqðtÞ  pk ðtÞqk ðtÞÞ d  s þ O 2 þ Oðs2 Þ. d

þ

ð4:12Þ

Finally then, plugging the above expression into (4.6) gives E½f st ðt; t þ sÞ  f st ðtÞjt 2Gs ð1  f st ðtÞÞ ¼ ð2G þ 1Þsf st ðtÞ þ 2Gs þ d   s þ O 2 þ Oðs2 Þ. d

ð4:13Þ

4.2.2. E½f st ðt; t þ sÞG h ðt; t þ sÞjt From (4.2) we have E½f st ðt; t þ sÞG h ðt; t þ sÞjt " d 1 1X E p ðt þ sÞqk ðt þ sÞ ¼ d k¼1 k ðpðtÞqðtÞÞhþ1

#  ðð1  2pðtÞÞDpðt; t þ sÞ  Dpðt; t þ sÞ2 Þh t .

ð4:14Þ

From the above expression, it is straightforward to see that computing E½f st ðt; t þ sÞGðt; t þ sÞh jt for h ¼ 1; . . . ; M

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

reduces to computing E½Qh ðt; t þ sÞjt for h ¼ 1; . . . ; 2M where Qh ðt; t þ sÞ ¼

d X

1 p ðt þ sÞqk ðt þ sÞDph ðt; t þ sÞ. d k¼1 k

(4.15)

We could compute E½Qh ðt; t þ sÞjt by using the standard coalescent techniques of the previous section. For instance for h ¼ 2 we could write, E½Q2 ðt; t þ sÞjt # "  d 1X 2 pk ðt þ sÞqk ðt þ sÞðpðt þ sÞ  pðtÞÞ t ¼E  d k¼1 # "  d 1X  ¼E pk ðt þ sÞqk ðt þ sÞp2 ðt þ sÞt  d k¼1 # "  d 1X   2pðtÞE pk ðt þ sÞqk ðt þ sÞpðt þ sÞt  d k¼1 # "  d 1X  þp2 ðtÞE pk ðt þ sÞqk ðt þ sÞt .  d k¼1

E½f st ðt; t þ sÞGðt; t þ sÞjt ¼

d s 1  2pðtÞ 1 X p ðtÞqk ðtÞð1  2pk ðtÞÞ d ðpðtÞqðtÞÞ2 d k¼1 k s  f st ðt; tÞ2 þ Oðs2 Þ, d

ð4:18Þ

E½f st ðt; t þ sÞG 2 ðt; t þ sÞjt s ð1  2pðtÞÞ2 2 f ðt; tÞ þ Oðs2 Þ, d ðpðtÞqðtÞÞ st

E½f st ðt; t þ sÞG h ðt; t þ sÞjt ¼ Oðs2 Þ for h42.

ð4:19Þ

From the relations directly above we see that we need to take our Taylor expansion out to M ¼ 2 if we want an expansion for E½f st jT i ; p, (4.3), to Oðd1Þ. We note finally that the coupled coalescent requires sb N1 . The reason for this is explained in Appendix A.1. ð4:16Þ

d 1 X pk ðtÞqk ðtÞð1  2pk ðtÞÞ þ Oðs2 Þ, 2 d k¼1 !2 d s 1X p ðtÞqk ðtÞ þ Oðs2 Þ, E½Q2 ðt; t þ sÞjt ¼ d d k¼1 k

E½Q1 ðt; t þ sÞjt ¼ s

for h42.

(4.14) gives

¼

The three terms to the right of the equality directly above could be analyzed as coalescents with four, three, and two individuals, respectively. This would obviously be cumbersome. Furthermore, for higher values of h we would need to analyze coalescents with many individuals. For instance, if h ¼ 4 then we would need to consider a coalescents with six individuals. When we compute V ½F st ðT e ÞjT i ; p the situation becomes even more cumbersome with more coalescents and more individuals in each coalescent. The computations, while possible in theory, become unmanageable in practice. In order to avoid this cumbersome situation we introduce the notion of a coupled coalescent. This approach greatly simplifies computations of terms involving Dpðt; t þ sÞ. Intuitively, the coupled coalescent allows one to compare a coalescent against its average behavior. This allows for simpler computations because, to put it in a non-precise sense, average behavior can be ignored. Practically, this means that one can consider E½Qh ðt; t þ sÞjt through a single coupled coalescent that is straightforward to analyze. In Appendix A.1, we define the coupled coalescent and use it to compute E½Qh ðt; t þ sÞjt. We arrive at the following formulas:

E½Qh ðt; t þ sÞjt ¼ Oðs2 Þ

493

ð4:17Þ

Having derived expressions for E½Qh ðt; t þ sÞjt we now return to E½f st ðt; t þ sÞGðt; t þ sÞh jt. Plugging (4.17) into

4.3. The difference equation If we combine (4.3), (4.13), and (4.18), and generalize a bit to condition on T i and pðT e Þa0; 1 rather than t we arrive at the following difference equation: E½f st ðt þ sÞ  f st ðtÞjT i ; p ¼ ð2G þ 1ÞsE½f st ðtÞjT i ; p þ 2Gs   s s þ E½hE ðtÞjT i ; p þ O 2 þ Oðs2 Þ, d d

ð4:20Þ

where hE ðtÞ ¼ 2Gð1  f st ðtÞÞ  c1 ðtÞ þ c2 ðtÞ

d 1X p ðtÞqk ðtÞð1  2pk ðtÞÞ d k¼1 k !2

d 1X p ðtÞqk ðtÞ d k¼1 k

ð4:21Þ

and c1 ðtÞ ¼

1  2pðtÞ , ðpðtÞqðtÞÞ2

c2 ðtÞ ¼

ð1  2pðtÞÞ2 1 þ . 3 ðpðtÞqðtÞÞ2 ðpðtÞqðtÞÞ

ð4:22Þ

We could solve (4.20) directly. However, differential equations are easier to solve then difference equations. We show in Appendix A.2, that since s51 we can replace (4.20) by the following differential equation: d E½f st ðtÞjT i ; p dt ¼ ð2G þ 1ÞE½f st ðtÞjT i ; p þ 2G   1 1 þ E½hE ðtÞjT i ; p þ O 2 . d d

ð4:23Þ

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

494

This differential equation can be integrated to arrive at the following formula for E½F st ðT e ÞjT i ; p: E½f st ðT e ÞjT i ; p 2G 1 þ ¼ 2G þ 1 d   1 þO 2 , d

Z

Te

ds E½hE ðsÞjT i ; peð2Gþ1ÞðT e sÞ

Ti

ð4:24Þ

where we have used the fact eðT e T i Þ 5Oð 12 Þ to arrive at d the equality directly above. Now we face a difficulty. E½f st ðT e ÞjT i ; p depends on the value of E½hE ðtÞjT i ; p through the integral in (4.24). hE is a complicated function of pðtÞ, qðtÞ, pk ðtÞ, and qk ðtÞ. It would seem that we need to derive another differential equation for hE ðtÞ. This, however, is not the case. Observe that the integral in (4.24) is proceeded by a factor d1. In Appendix  A.3, we show that E½hE ðtÞjT i ; p ¼ h¯ E ðT i Þ þ O 1 (recall d =2

the definition of h¯ E from 1). If we plug  Theorem 1 ¯ E½hE ðtÞjT i ; p ¼ hE ðT i Þ þ O into (4.24) then we arrive d =2

at Theorem 1 and the first equation in Theorem 3. We want to explain how we derive h¯ E ðtÞ from hE ðtÞ and the intuition  behind the relation E½hE ðtÞjT i ; p ¼ ¯hE ðT i Þ þ O 1 . Furthermore, we would like to explain d =2

how our arguments relate to Wakeley (1998, 2003) and Wakeley and Takahashi (2004). In Wakeley (2003) and Wakeley and Takahashi (2004), the authors derive a diffusion equation for pðtÞ. They consider the island model as we do, except that they allow for selection. The basic argument in these papers is, with selection removed to fit our case and in our notation, as follows. Define p1;...;d ðtÞ ¼ ðp1 ðtÞ; p2 ðtÞ; . . . ; pd ðtÞÞ.Then the mean and variance of the change in pðtÞ can be expressed in terms p; p1;...;d . That is, 

     1 1 1  aðpðtÞ; p1;...;d ðtÞÞ þ o  pðtÞt ¼ , E p tþ N Nd Nd  "  #  2  1   pðtÞ t E p tþ  N   1 1 ¼ bðpðtÞ; p1;...;d ðtÞÞ þ o , ð4:25Þ Nd Nd for some functions a; b. To derive a diffusion equation for pðtÞ the arguments of a; b must be expressed solely in terms of pðtÞ. So the dependence on p1;...;d ðtÞ is a problem. But while pðtÞ changes on a timescale of order d, p1;...;d ðtÞ changes on a timescale of order 1. This is the key observation that is made in Wakeley (1998, 2003), Wakeley and Takahashi (2004) and Wilkins (2004). If pðtÞ is held fixed at p¯ then p1;...;d ðtÞ reaches a steady state distribution that depends only on p¯ . Let p¯ 1;...;d ð¯pÞbe this steady state distribution. Since the timescale of pðtÞ is so slow compared to p1;...;d ðtÞ one can make the

approximation p1;...;d ðtÞ  p¯ 1;...;d ðpðtÞÞ. With this approximation, (4.25) can be rewritten as 

     1 1 1 aðpðtÞ; p¯ 1;...;d ðpðtÞÞÞ þ o E p tþ  pðtÞt ¼ , N Nd Nd    2  1   pðtÞ t E p tþ  N   1 1 bðpðtÞ; p¯ 1;...;d ðpðtÞÞÞ þ o ¼ . ð4:26Þ Nd Nd The right side of the two equalities directly above depends only on pðtÞ and so the authors are able to derive a diffusion equation. All this is made rigorous by an application of a theorem due to Ethier and Nagylaki (1980). The remaining difficulty is an explicit characterization of p¯ 1;...;d . In Wakeley (2003), the author is only able to write down an approximation to this distribution. In Wakeley and Takahashi (2004), this distribution can be computed exactly. Now let us return to the relation E½hE ðtÞjT i ; p ¼ ¯hE ðT i Þ þ O 1 . The formula for h¯ E ðT i Þ is derived from d =2

the formula for hE ðT i Þ, (4.21), in two steps. First in (4.21), the terms pðtÞ and qðtÞ are replaced by pðT i Þ and qðT i Þ, respectively. The intuition behind this substitution is that since pðtÞ changes on a timescale of order d and t  T i oT e  T i ¼ d 1 we have pðtÞ  pðT i Þ. We are using the same argument found in Wakeley (2003) and Wakeley and Takahashi (2004). However, rather than using the Ethier and Nagylaki theorem we simply compute, using the coupled coalescent method, that E½ðpðtÞ  pðT i ÞÞÞ2 jT i  ¼ tT Oð d i ÞpOðd1 Þ. The second step in moving from hE ðtÞ to h¯ E ðT i Þ is a substitution for each pk ðtÞ. In the formula for hE ðtÞ, (4.21), we have the following two expression which are a function of pk ðtÞ: pk ðtÞqk ðtÞð1  2pk ðtÞÞ ¼ pk ðtÞ  3p2k ðtÞ þ 2p3k ðtÞ, pk ðtÞqk ðtÞ ¼ pk ðtÞ  p2k ðtÞ.

ð4:27Þ

So we consider E½pk ðtÞjT i , E½p2k ðtÞjT i , and E½p3k ðtÞjT i . We make the following substitution for E½prk ðtÞjT i  for r ¼ 1; 2; 3: ! j  r Y jG r 2     pðT i Þ . E½pk ðtÞjT i  ! pðT i Þ j þ jG þ 2j j¼2 jG þ 2 (4.28) This second substitution takes us from hE ðtÞ to h¯ E ðT i Þ. Where does the above substitution come from? We determine E½pk ðtÞr jT i  through a coalescent argument as in Section 4.2.1. But instead of considering a coalescent over a small time interval t to t þ s we consider the coalescent over the time interval t to T i . We need to consider 15t  T i 5d. The analysis of a coalescent over this time range is given by Wakeley (1998). For concreteness suppose we are considering E½p3k ðtÞjT i . We showed in Section 4.2.1 that this is equivalent to considering a

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

coalescent starting in state ð3Þ. Wakeley (1998) showed that four sequences of coalescent states have significant probability of occurring. In these sequences the first state is the state at time t and the last state is the state at time T e : ð3Þ ! ð2; 1Þ ! ð1; 1; 1Þ, ð3Þ ! ð2; 1Þ ! ð1; 1Þ, ð3Þ ! ð2Þ ! ð1; 1Þ, ð3Þ ! ð2Þ ! ð1Þ.

ð4:29Þ

These four states satisfy two properties. First, in these sequences individuals do not migrate to a deme that is already occupied by another individual in the coalescent. Second, in the last state of the coalescent each individual occupies a separate deme. The following two sequences of states violate the first and second properties, respectively: ð3Þ ! ð2; 1Þ ! ð3Þ ! ð2; 1Þ ! ð1; 1; 1Þ, ð3Þ ! ð2; 1Þ.

ð4:30Þ

We refer to sequences of states that satisfy the two properties as common sequences. We refer to sequences that violate one of the two properties as rare sequences. Common sequences are sequences that could occur in an infinite island model. Rare sequences will never occur in an infinite island model. In essence, since t  T i b1 there is enough time for all the individuals to either coalesce or end up in separate demes with high probability, but since t  T i 5d there is a very small probability for an individual to migrate to an occupied deme. The  probability of rare sequences occurT T ring is O e d i . Our (4.28) is equivalent to (30) in Wakeley (1998) with the difference that we are considering the coalescent over a finite time period while Wakeley considers the coalescent with mutation in steady state. Eq. (4.28) is constructed by considering only common sequences. More explicitly (4.28) is formed through the following sum: X PðSÞE½pr ðtÞjT i ; S, (4.31) all common sequences S

where S represents a common sequence and PðSÞ is the probability of the common sequence occurring in an infinite island model. As we mentioned, in Wakeley (2003) and Wakeley and Takahashi (2004), the authors need to determine the distribution p¯ 1;...;d . We are faced with a similar problem, but instead of needing to determine the whole distribution, we need to determine moments: E½pk ðtÞr jT i  for r ¼ 1; 2; 3. In Wakeley (2003) and Wakeley and Takahashi (2004), p¯ 1;...;d is determined by considering the allelic distributions in each deme in an infinite island model, this is done by considering a matrix equation (see (26) in Wakeley and Takahashi, 2004). Similarly, we compute E½pk ðtÞr jT i  by considering common sequences and ignoring rare sequences, this is done by considering a coalescent in an infinite island model.

495

5. V½f st ðT e ÞjT i ; p In this section we demonstrate Theorem 2 and the second equation in Theorem 3. We are interested in computing V ½f st ðT e ÞjT i ; p. Since we know E½f st ðT e ÞjT i ; p this reduces to understanding E½f st ðT e Þ2 jT i ; p. Our approach to analyzing E½f 2st ðT e ÞjT i ; p is identical to our analysis of E½f st ðT e ÞjT i ; p in Section 4 except that we consider f 2st rather than f st . We break up the analysis into the same three steps discussed in Section 4. 5.1. Taylor expansion of f 2st ðtÞ In analogy to (4.1) we have f 2st ðt þ s; t þ sÞ  f 2st ðt; tÞ  2 1 2 ¼ f st ðt; t þ sÞ  f 2st ðt; tÞ 1 þ Gðt; t þ sÞ

ð5:1Þ

Then, in analogy to (4.4) we see that we must consider the two expressions: E½f 2st ðt; t þ sÞ  f 2st ðt; tÞjT i ; p, E½f 2st ðt; t þ sÞG h ðt; t þ sÞjT i ; p

for h ¼ 1; . . . ; 3.

ð5:2Þ

5.2. Computation of moments We compute the two expressions in (5.2) using the techniques we developed in Section 4.2. E½f 2st ðt; t þ sÞ  f 2st ðt; tÞjt is computed using a time restricted coalescent. E½f 2st ðt; t þ sÞG h ðt; t þ sÞjt is computed using a coupled coalescent. 5.3. Difference equation We are able to form a difference equation for E½f 2st ðtÞjT i ; p. As in the case of E½f st ðtÞjT i ; p, we can approximate this difference equation by a differential equation with an error that is negligible within the context of our order expansions. d E½f 2st ðtÞjT i ; p dt ¼ ð4G þ 2ÞE½f 2st ðtÞjT i ; p

  1 1 þ 4GE½f 2;1;1 ðtÞjT i ; p þ E½hV ðtÞjt þ O 2 , d d

ð5:3Þ

where the formula of hV ðtÞ is given in Appendix A.4, and f 2;1;1 ðtÞ

¼

1 Pd 1 Pd xk;1 ðtÞð1  xk;2 ðtÞÞ k0 ;k00 ¼1;kak0 ak00 xk0 ;3 ðtÞð1  xk00 ;4 ðtÞÞ d k¼1 ðd  1Þ2 ðpðtÞqðtÞÞ2

.

ð5:4Þ

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

496

We can integrate (5.3) and arrive at the following formula for E½f st ðT e Þ2 jT i ; p: E½f 2st ðT e ÞjT i ; p Z Te dsE½f 2;1;1 ðsÞjT i ; peð4Gþ2ÞðT e sÞ ¼ Ti

1 þ d

Z

Te

dsE½hV ðsÞjT i ; pe

ð4Gþ2ÞðT e sÞ

Ti

  1 þO 2 . d

ð5:5Þ

Now notice that the first integral to the right of the equality directly above does not contain a d1 factor. So in order to solve for E½f 2st ðT e ÞjT i ; p we need to find a differential equation for E½f 2;1;1 ðsÞjT i ; p. This is in contrast to (4.24) in which we could solve for E½f st ðT e ÞjT i ; p using only the differential equation for E½f st ðtÞjT i ; p. Why is this the case? We computed E½f st ðt; t þ sÞ  f st ðt; tÞjt by considering a coalescent with two individuals. We started the coalescent at time t þ s in state (2). When we compute E½f 2st ðt; t þ sÞ  f 2st ðt; tÞjt we must consider a coalescent with four individuals grouped into two pairs. Specifically we consider the following term which we label I: I ¼ xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞxk0 ;3 ðt þ sÞ ð1  xk0 ;4 ðt þ sÞÞ.

ð5:6Þ

In this expression two individuals occupy deme k and two individuals occupy deme k0 . We restrict our attention to the case kak0 , which means that the coalescent starts in state ð2; 2Þ. Now consider the next state that the coalescent could enter. If the evolution of the coalescent will eventually be described by a common sequence then there are two possible states: ð1; 2Þ or ð2; 1; 1Þ. Entering the state ð1; 2Þ implies that a coalescence of one of the two pairs occurred. This will force I ¼ 0. Entering the state ð2; 1; 1Þ implies a migration event. Calculating E½f 2;1;1 ðtÞjT i ; p reduces to considering a coalescent in state ð2; 1; 1Þ at time t. We have not mentioned rare sequences. The coalescent in state ð2; 2Þ could also transition to state ð3; 1Þ. However, this has probability Oðds Þ. A term to represent this transition is found in hV . Using the techniques of Section 4 we can derive a differential equation for E½f 2;1;1 ðtÞjT i ; p: d E½f 2;1;1 ðtÞjT i ; p dt ¼ ð2G þ 1ÞE½f 2;1;1 ðtÞjT i ; p þ 2GE½f 1;1;1;1 ðtÞjT i ; p

  1 1 þ E½h2;1;1 ðtÞjT i ; p þ O 2 , d d

ð5:7Þ

where the formula of h2;1;1 ðtÞ is given in Appendix A.4, and f 1;1;1;1 ðtÞ Pd 1 k;k0 ;k00 ;k000 ¼1;kak0 ak00 ak000 xk;1 ðtÞð1  xk0 ;2 ðtÞÞxk00 ;3 ðtÞð1  xk000 ;4 ðtÞÞ ðd  1Þ2 d 2 ¼ . ðpðtÞqðtÞÞ2

ð5:8Þ

In order to solve for E½f 2;1;1 ðtÞjT i ; p we need to solve for E½f 1;1;1;1 ðtÞjT i ; p. Why? As mentioned, f 2;1;1 corresponds to the coalescent state ð2; 1; 1Þ. To move from ð2; 1; 1Þ as common sequence requires a transition to either ð1; 1; 1Þ or ð1; 1; 1; 1Þ. The transition to ð1; 1; 1Þ will force f 2;1;1 ðtÞ ¼ 0. E½f 1;1;1;1 ðtÞjT i ; p corresponds to considering a coalescent in state ð1; 1; 1; 1Þ at time t. To move from ð2; 1; 1Þ as a rare sequences requires a transition to either state ð3; 1Þ or ð2; 2Þ. These transition have probability Oðds Þ of occurring and are represented in h2;1;1 . Now we must consider E½f 1;1;1;1 ðtÞjT i ; p. Manipulation of the formula for f 1;1;1;1 gives   1 1 E½f 1;1;1;1 ðtÞjT i ; p ¼ 1 þ E½h1;1;1;1 ðtÞjT i ; p þ O 2 , d d (5.9) where the formula of h1;1;1;1 ðtÞ is given in Appendix A.4. The determination of E½f 1;1;1;1 ðtÞjT i ; p does not require a differential equation because in coalescent state ð1; 1; 1; 1Þ the four individuals each occupy different demes. This means, up to order d1, each individual has allelic state A with probability pðtÞ independently of the other three individuals. The correlations between the allelic states are of order d1 and are expressed in the formula for h1;1;1;1 ðtÞ. We have a hierarchy of differential equations. Solving this hierarchy from f 1;1;1;1 to f 2st we arrive at a formula for E½f st ðT e ÞjT i ; p in terms of hV ðtÞ; h2;1;1 ðtÞ, and h1;1;1;1 ðtÞ. Then we use the two step approximation that took us from hE to h¯ E in Section 4.3 to replace hV ðtÞ; h2;1;1 ðtÞ, and h1;1;1;1 ðtÞ with expressions that depend only on pðT i Þ and qðT i Þ. This process gives Theorem 2 and the second part of Theorem 3. 6. Discussion In order to use F st appropriately in application one must understand the mean and variance of F st . We have examined the dynamics of the mean and variance of F st for the finite island model. The island model is an idealization and actual populations cannot be assumed to obey its assumptions (Whitlock and Mccauley, 1999). However, we can use the island model to gain insight into the behavior of F st in a general setting. With this in mind, we ask the following questions. What causes bias in F st away from S. Wright’s formula F st ¼ 1 1þ2Nm and what causes the variance of F st ? Our results address these questions when a population is not in a mutation-drift steady state. The bias and variance of F st can be traced to two sources. Recall 1 Pd k¼1 pk ðtÞqk ðtÞ . F st ¼ 1  d pðtÞqðtÞ

(6.1)

The first source that causes bias and variance is the correlation between pðtÞ and pk ðtÞ. In the case of the infinite island model, pðtÞ is fixed because the population is infinite.

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

Another way of viewing this is that change in allelic frequency of allele A in a deme k has no effect on the allelic frequency of allele A in the total population. However, in the finite island model this is not so. If deme k has a sudden rise in the allelic frequency of allele A then the total population allelic frequency of A also tends to rise. Mathematically   this means pðtÞ and pk ðtÞ have a correlation of order O d1 . A second source of bias and variance relates to migration. Consider the lineage of an individual that is in deme k at time t. From a backward time, coalescent perspective the infinite island model implies that no individual can have a lineage that leaves deme k and then returns to deme k. This is not the case for the finite island model. In the finite island model a lineage can leave a deme k and  then return to it. The probability of such a return is O d1 . Wakeley (1998), points out that such return events have low probability and can be ignored in the limit of large d. However, in the case of F st , it is precisely these return events that contribute to the bias and variance of F st . Our technical computations quantify the role of these two sources of bias and variance. In Section 4 we computed the bias of F st . This involved computing two expectations given in (4.4). E½f st ðt; t þ sÞGðt; t þ sÞh jT i ; p involves changes in pðtÞ through the term Gðt; t þ sÞ. If pðtÞ does not change then E½f st ðt; t þ sÞGðt; t þ sÞh jT i ; p ¼ 0. This expectation then represents the bias caused by the correlation between pk ðtÞ and pðtÞ. On the other hand, the expression E½f st ðt; t þ sÞ  f st ðt; tÞjT i ; p does not involve changes in pðtÞ. That is, if pðtÞ does not change then we can still have E½f st ðt; t þ sÞ  f st ðt; tÞjT i ; pa0. This term then represents the bias caused by return events. It is interesting to consider models in which one of these sources is removed while the other remains. This is not possible in the island model because both sources go to zero as d is raised. However, a model such as the infinite stepping stone model would have return events without any change in pðtÞ. A model in which return events do not occur but in which pðtÞ changes is more difficult to describe. From a more technical perspective our techniques provide an alternative framework to the approach found in Wakeley (2003) and Wakeley and Takahashi (2004) with various advantages and disadvantages. Their results require the use of a technical Markov chain theorem that depends on analysis of differential operators (Ethier and Nagylaki, 1980). Our results depend only on coalescent theory. On the other hand, our results cannot be used in the presence of selection. Their results show that pðtÞ goes to a diffusion. This characterizes the distribution of pðtÞ over a range of time. In contrast, our results only give the mean and variance of F st at a given point in time. (This is a gap that we feel can be closed. Our techniques should be usable to give a diffusion result, but we have not pursued this). However, our approach provides expansions to order d1. Using the theorem of Ethier and Nagylaki, our Theorems 1 and 2

497

1 and V ½F st ðT e Þ ! 0 would be changed to E½f st ðT e Þ ! 1þ2G as d ! 1. Finally, we note that this paper does not describe F st in the end stage. The behavior of F st as the population approaches fixation is a problem that requires further work.

Appendix A A.1. The coupled coalescent In this section we define the coupled coalescent and use it to compute E½Qh ðt; t þ sÞjt. We split the discussion into three subsections. First, we define the coupled coalescent. Second, we show how to connect the coupled coalescent to E½Qh ðt; t þ sÞjt. Third, we use the coupled coalescent to compute E½Qh ðt; t þ sÞjt. A.1.1. The definition of the coupled coalescent We now define the coupled coalescent. Consider a coalescent containing n individuals. Let Y be the set of the coalescing individuals. So to be concrete, let Y ¼ fy1 ; y2 ; . . . ; yn g. We define a set of n new individuals, Y  . Let Y  ¼ fy1 ; y2 ; . . . ; yn g. We refer to yi as the couple of yi . We refer to Y as the main coalescent and Y  as the coupled coalescent. Along with the set Y  we define a decoupling event. A decoupling event is specified by a collection of sets Z1 ; . . . ; Z f such that Y [ Y  ¼ Z 1 [ Z 2 [ . . . [ Zf . We say that a decoupling event has occurred if a coalescent event involving two individuals, say yi and yj , in Y has occurred and yi and yj are not both contained in the same Z k . Now we specify the genealogy of Y  . Until the decoupling event occurs the genealogies of Y and Y  are identical. Let us explain this more concretely. Before the decoupling event yi and yi share the same lineage. For every generation after the decoupling event occurs, the genealogies of Y and Y  are completely independent. The coalescence of Y proceeds as usual. Y  is split into the f independent coalescents that are formed by the sets Y  \ Zi for i ¼ 1; . . . ; f . For an individual yi we let yi represent the couple before the decoupling event, and we let y i represent the couple after the decoupling event. To make this clear let us consider a specific case. Consider a four individual coalescent. So Y ¼ fy1 ; y2 ; y3 ; y4 g and Y  ¼ fy1 ; y2 ; y3 ; y4 g. Then let Z 1 ¼ fy1 ; y2 ; y1 ; y2 g and Z2 ¼ fy3 ; y4 ; y3 ; y4 g. Fig. A1 represents a realization of this coupled coalescent. We remind the reader that we analyze with time run backwards, so that Fig. A1 shows the coupled coalescent with the current time at the top rather than the standard practice of putting the current time at the bottom. In Fig. A1 the labels 1; 2; 3, and 4 represent the lineages of y1 ; y2 ; y3 , and y4 , respectively. The labels 1; 2; 3, and 4 represent the lineages of y1 ; y2 ; y3 , and y4 , respectively. The letters a through n serve as identifiers.

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

498

a 3,3*

2,2*

1,1*

4,4*

b 3,3*

2,2*

1,1*

3,3*

2,2*

1,1*

4,4*

4,4*

c 3,3*

1,1*,2,2*

4,4*

d 3,3*

1,1*,2,2*

1,1*,2,2*

4,4*

3,3*

4,4*

e 1,2,3

3*

1*,2*

f

1*,2*

i

3*

k 1*,2*

4*

h

g 1,2,3

4

4

j 4*

l 1,2,3

4

4*

3* m

1*,2*

1,2,3

3*,4*

4 n

1*,2*

3*,4*

1,2,3,4

Fig. A1. Coupled coalescent.

The coupled coalescent starts with each yi and yi following the same lineage. So for example y1 and y1 move together from boxes a to b. The first coalescent event occurs at box c and involves y1 ; y1 coalescing with y2 ; y2 . y1 ; y2 2 Z 1 so this is not a decoupling event. Since the individuals are coupled this means y1 coalesces with y2 . At this point then y1 ; y1 ; y2 ; y2 must all follow the same lineage. This is shown in their movement from boxes c to d. The next coalescent event occurs at box e, y1 ; y1 ; y2 ; y2 coalesces with y3 ; y3 . This is a decoupling event since y1 2 Z 1 and y3 2 Z2 . From this point on we deal with three independent coalescents. Consider then the generation before the decoupling event (the generation containing box f). First we have the coalescent containing y1 ; y2 ; y3 , and y4 represented in boxes f and i, recall that y1 coalesced with y2 and y3 . Second we have the coalescent y1 ; y2 represented by box g, recall that y1 coalesced with y2 so the two will remain coalesced. Third we have the coalescent y3 ; y4 represented by boxes h and j. These three coalescents now proceed independently of each other. In box k we have a coalescent event involving y1 ; y2 ; y3 and y1 ; y2 . But since these are independent coalescents in the next generation

they select different parents. This is also the case in box l when y3 and y4 coalesce but then select different parents in the next generation. Note that within each independent coalescent, coalescent events are treated as usual. For instance, after the coalescent event in box m, y3 and y4 will remain coalesced and follow the same lineage. Similarly after box n, y1 ; y2 ; y3 , and y4 all remain coalesced and follow the same lineage. A.1.2. Connecting the coupled coalescent and E½Qh ðt; t þ sÞjt Consider E½Q1 ðt; t þ sÞjt (recall the definition of Qh given in (4.15)). By an argument almost identical to that of Section 4.2.1, demonstrated through (4.7), we can connect the computation of E½Q1 ðt; t þ sÞjt to the allelic state of three individuals: E½Q1 ðt; t þ sÞjt

X 1 d x ðt þ sÞð1  xk;2 ðt þ sÞÞ ¼E k¼1 k;1 d !#    d 1X 1   xk0 ;3 ðt þ sÞ  pðtÞ  þ O .  d 0 N k ¼1

ðA:1Þ

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

Now a coupled coalescent by replacing pðtÞ by Pd we introduce 1  x ðt þ sÞ in (A.1). Specifically, set Y ¼ fxk;1 ; xk;2 ; 0 0 k ¼1 k ;3 d xk0 ;3 g and then Y  ¼ fxk;1 ; xk;2 ; xk0 ;3 g. We let Z 1 ¼ fxk;1 ; xk;2 ; xk;1 ; xk;2 g and Z 2 ¼ fxk0 ;3 ; xk0 ;3 g.We claim that E½Q1 ðt; t þ sÞjt can be expressed in terms of the coupled coalescent as follows: E½Q1 ðt; t þ sÞjt " d 1X xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ ¼E d k¼1 ! #    d 1X 1    ðxk0 ;3 ðt þ sÞ  xk0 ;3 ðt þ sÞÞ t þ O .  d 0 N k ¼1

ðA:2Þ Notice that in the above equation xk;1 ; xk;2 play no role. We now demonstrate (A.2). To do this, we observe two facts. P (1) E½d1 dk0 ¼1 xk0 ;3 ðt þ sÞjt ¼ E½pðt þ sÞjt ¼ pðtÞ. (2) By the definition of the coupled coalescent xk;3 is independent of xk;1 and xk;2 .

With these two facts in mind we have " E

d 1X xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ d k¼1 ! #  d 1X   ðxk0 ;3 ðt þ sÞ  xk0 ;3 Þ t   d 0 k ¼1 # "  d d 1X 1X  ¼E xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ xk0 ;3 ðt þ sÞt  d k¼1 d 0 k ¼1 # "  d d 1X 1X  E xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ xk0 ;3 ðt þ sÞ t  d k¼1 d 0 k ¼1 # "  d d 1X 1X  ¼ xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ xk0 ;3 ðt þ sÞt  d k¼1 d 0 k ¼1 # "  d 1X  E xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞt  d k¼1 # "  d 1X  by fact 2 xk0 ;3 ðt þ sÞ t E  d 0 k ¼1 # " d d  1X 1X  ¼E xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ xk0 ;3 ðt þ sÞt  d k¼1 d 0 k ¼1 # "  d 1X  xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞt pðtÞ E  d k¼1

by fact 1 " d 1X ¼E xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ d k¼1

! #  d 1X  ðxk0 ;3 ðt þ sÞ  pðtÞ t   d 0 k ¼1   1 ¼ E½Q1 ðt; t þ sÞjt þ O , N

499

ðA:3Þ

where the last equality above follows from (A.1). More generally, for arbitrary h we will have E½Qh ðt; t þ sÞjt 2 d 1X ¼ E4 xk ;1 ðt þ sÞð1  xk1 ;2 ðt þ sÞÞ d k ¼1 1 1 0 1 3  hþ1 d X Y  1  @ A  ðxki ;iþ1 ðt þ sÞ  xki ;iþ1 ðt þ sÞÞ t5 d k ¼1  i¼2 i   1 þO . N

ðA:4Þ

with Y ¼ fxk1 ;1 ; xk1 ;2 ; xk2 ;3 ; xk3 ;4 ; . . . ; xkhþ1 ;hþ2 g, the corresponding Y  , and Z1 ¼ fxk1 ;1 ; xk1 ;2 ; xk ;1 ; xk ;2 g and Z i ¼ 1 1 fxki ;iþ1 ; xk ;iþ1 g for i ¼ 2; . . . ; h þ 1. i

A.1.3. Computing E½Qh ðt; t þ sÞjt We use the coupled coalescent through (A.2) and (A.4) to compute E½Q1 ðt; t þ sÞjt, E½Q2 ðt; t þ sÞjt and then E½Qh ðt; t þ sÞjt for h42. Starting with (A.2) we have   1 E½Q1 ðt; t þ sÞjt ¼ J 0 þ J 1 þ O , (A.5) N where J0 ¼

d 1X E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ d k¼1

ðxk;3 ðt þ sÞ  xk;3 ðt þ sÞÞjt, J1 ¼

d d X 1 X E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ d 2 k¼1 k0 ¼1;k0 ak

ðxk0 ;3 ðt þ sÞ  xk0 ;3 ðt þ sÞÞjt.

ðA:6Þ

We claim that J 1 ¼ Oðs2 Þ. To see this consider the case when the coupled coalescent does not experience a decoupling event between time t þ s and t. In this case xk0 3 and xk0 ;3 follow the same lineage between t þ s and t and so xk0 ;3 ðt þ sÞ  xk0 ;3 ðt þ sÞ ¼ 0, which implies J 1 ¼ 0. For a decoupling event to occur, xk0 ;3 must migrate to the deme k and then coalesce with either xk;1 or xk;2 , in other words two events must occur. This then gives J 1 ¼ Oðs2 Þ. Already we see the huge simplification that the coupled coalescent provides. We have eliminated the most complex term in the computation of E½Q1 ðt; t þ sÞjt with very little effort. Now consider J 0 . As in the case of J 1 we need only consider the situation when a decoupling event occurs. This means either xk;3 coalesces with xk;1 , this has probability s and we consider it with the term J 0;1 below, or xk;3

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

500

coalesces with xk;2 , this has probability s and we consider it with the term J 0;2 below. With this in mind, we have J 0 ¼ sJ 0;1 þ sJ 0;2 þ Oðs2 Þ,

(A.7)

where "

d 1 X xk;1 ðtÞð1  xk;2 ðtÞÞðxk;1 ðtÞ  x k;3 ðtÞÞjt , 2 d k¼1 " # d 1 X  ¼E 2 xk;1 ðtÞð1  xk;2 ðtÞÞðxk;2 ðtÞ  xk;3 ðtÞÞjt . ðA:8Þ d k¼1

Let us explain (A.8). The formula for J 0;1 is simply the formula for J 0 but with xk;1 replacing xk;3 since the two  coalesce, and with x k;3 replacing xk;3 since a decoupling event has occurred. Finally we re-express J 0;1 and J 0;2 in terms of pðtÞ; qðtÞ; pk ðtÞ; qk ðtÞ. Consider J 0;1 . If the coalescent event has taken place at time t with t 4t then up to order OðN1 Þ, xk;1 ðtÞ; xk;2 ðtÞ, and x k;3 represent independent draws from deme k at time d. Since we take sb N1 we may assume with error OðN1 Þ that t 4t. This gives J 0;1

d 1 X ¼ 2 pk ðtÞqk ðtÞ  pk ðtÞ2 qk ðtÞ, d k¼1

J 0;2 ¼ 

d 1 X pk ðtÞ2 qk ðtÞ. d 2 k¼1

ðA:9Þ

Putting this all together and plugging into (A.5) gives E½Q1 ðt; t þ sÞjt d 1 X pk ðtÞqk ðtÞð1  2pk ðtÞÞ 2 d k¼1   1 þ Oðs2 Þ þ O . N

ðA:10Þ

(A.11)

where H0 ¼

Y  ¼ fxk;1 ; xk;2 ; xk0 ;3 ; xk00 ;4 g, Z1 ¼ fxk;1 ; xk;2 ; xk;1 ; xk;2 g, Z3 ¼ fxk00 ;4 ; xk00 ;4 g.

ðA:14Þ

If only one event occurs then, in the case of H 1 , either xk0 ;3 ðt þ sÞ  xk0 ;3 ðt þ sÞ ¼ 0 or xk00 ;4 ðt þ sÞ  xk00 ;4 ðt þ sÞ ¼ 0. This gives H 1 ¼ Oðs2 Þ. In the case of H 0 we only need consider the decoupling event in which xk0 ;3 coalesces with xk0 ;4 . This has probability s. Plugging these observations into (A.11) gives E½Q2 ðt; t þ sÞjt ¼s

d d 1X 1 X E½xk;1 ðtÞð1  xk;2 ðtÞÞðxk0 ;3 ðtÞ d k¼1 d 2 0 k ¼1

  x k0 ;3 ðtÞÞðxk0 ;3 ðtÞ  xk0 ;4 ðtÞÞjt   1 þ Oðs2 Þ þ O . N

ðA:15Þ

Then we note that xk;1 ðtÞ, xk;2 ðtÞ, xk0 ;3 ðtÞ, xk0 ;3 ðtÞ , and xk0 ;4 ðtÞ are independent draws from demes k and k0 . This gives !2 d s 1X E½Q2 ðt; t þ sÞjt ¼ p ðtÞqk ðtÞ d d k¼1 k   1 2 þ Oðs Þ þ O . ðA:16Þ N Finally, one can show by arguments similar to those above that E½Qh ðt; t þ sÞjt ¼ Oðs2 Þ þ OðN1 Þ for h42.

¼s

Now we consider E½Q2 ðt; t þ sÞjt. We have   1 E½Q2 ðt; t þ sÞjt ¼ H 0 þ H 1 þ O , N

Y ¼ fxk;1 ; xk;2 ; xk0 ;3 ; xk00 ;4 g,

Z2 ¼ fxk0 ;3 ; xk0 ;3 g,

#

J 0;1 ¼ E J 0;2

and with

d d 1X 1 X E½xk;1 ðt þ sÞ d k¼1 d 2 0

A.2. Technical details for E½f st ðT e ÞjT i ; p Here we wish to make precise the arguments of Section 4. In Section 4 we derived a differential equation for E½f st ðtÞjT i ; p, however, we did not show how to handle the conditioning on pðT e Þa0; 1. Furthermore we did not show that we may replace the difference equation, (4.20), by the differential equation, (4.23). In order to do this we introduce two sets of genealogies defined for each time t:

k ¼1

ð1  xk;2 ðt þ sÞÞðxk0 ;3 ðt þ sÞ  xk0 ;3 ðt þ sÞÞ ðxk0 ;4 ðt þ sÞ  xk0 ;4 ðt þ sÞÞjt,

H1 ¼

d d 1X 1X 1 d k¼1 d 0 d

d X

ðA:12Þ

E½xk;1 ðt þ sÞð1  xk;2 ðt þ sÞÞ

k00 ¼1;k00 ak0 sÞ  xk0 ;3 ðt þ

k ¼1

ðxk0 ;3 ðt þ 

xk00 ;4 ðt

þ sÞÞjt,

sÞÞðxk00 ;4 ðt þ sÞ ðA:13Þ

O ¼ fgenealogiesj pðT e Þa0; 1g.

1 Yt ¼ genealogiesj jpðsÞ  pðT i Þjp =4 for T i pspt . d O is the set of genealogies that we average over when we write E½f st ðT e ÞjT i ; p. Yt is the set of genealogies for which 1 the allele frequency does not change by more than =4 d between time T i and t. Now we use these sets. Observe that f st ðtÞ is bounded. In fact f st ðtÞ ¼ 1  F st ðtÞ 2 ½0; 1. Furthermore, Yt  O. With

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

this in mind we have E½f st ðt þ sÞjT i ; p ¼ E½f st ðt þ sÞjO ¼ E½f st ðt þ sÞjYtþs  þ OðPðYctþs ÞÞ.

ðA:17Þ

The advantage of conditioning over the genealogies Yt as 1 opposed to O is that pqðtÞ is bounded on Yt . Recalling the definitions given by (4.1) we have E½f st ðt þ sÞjYtþs  ¼ E½Rðt; t þ sÞjYtþs  

 jDpðt; t þ sÞj3 jY þO E , tþs pqðtÞ3 ðA:18Þ where Rðt; t þ sÞ ¼ f st ðt; t þ sÞ  f st ðt; t þ sÞGðt; t þ sÞ þ f st ðt; t þ sÞG2 ðt; t þ sÞ.

ðA:19Þ

Now consider E½Rðt; t þ sÞjYtþs . What makes this term difficult to compute is that we are conditioning on t þ s rather than t. However, since we are averaging over the set Ytþs , Rðt; t þ sÞ is bounded. We also have Yt  Ytþs . These two facts give E½Rðt; t þ sÞjYtþs  ¼ E½Rðt; t þ sÞjYt  þ OðPðYct ÞÞ.

(A.20)

Now that we have conditioned on Yt we can use the same analysis as in Section 4 to compute E½Rðt; t þ sÞjYt . Combining the results of Section 4 with (A.17), (A.18), and (A.20) we arrive at E½f st ðt þ sÞjYtþs  ¼ ð1  ð2G þ 1ÞsÞE½f st ðtÞjYt  þ 2Gs s þ E½hE ðtÞjYt  d  

jDpðt; t þ sÞj3  þO E Yt pqðtÞ3  s þPðYct Þ þ s2 þ 2 . ðA:21Þ d It is not hard to see that we have almost recovered (4.20). The difference is that we have to bound the error terms that are in the big-O parenthesis in the equality directly above and bound the difference between E½hE ðtÞjYt  and h¯ E ðT i Þ. More precisely, iterating the difference equation shows that we need to bound the following quantities: 

1 jDpðt; t þ sÞj3  1 E PðYct Þ; s, Yt ;  3 s s pqðtÞ 1 jE½hE ðtÞjYt   h¯ E ðT i Þj. ðA:22Þ d We can show using a coupled coalescent analysis,

jDpj3 2 jY (A.23) E t ¼ Oðs Þ. pqðtÞ3 One can also show that " # d =2 c PðYt Þp exp  . 4

(A.24)

501

This is a particularly technical result. Intuitively, pðtÞ is approximated by a diffusion that moves on a timescale d. Since T e  T i 5d, the change in pðtÞ at any point in the time interval can be approximated by a Gaussian. Bounding the tails of the Gaussian gives (A.24). For a demonstration of this approach see Durrett (1996). Finally we must bound jE½hE ðtÞjYt   h¯ E ðT i Þj. We will show in Appendix A.3 rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi! Te  Ti jE½hE ðtÞjYt   h¯ E ðT i Þj ¼ O . (A.25) d After plugging in our error bounds into (A.22) we arrive at (4.24) with the following error term:   1 1 2  O s þ ep ðT i Þd =16 þ 1þ=2 . (A.26) s d  1 1 . Notice This expression will be O 1þ=2 if we set s ¼ 1þ d d 1 that this requires sb N . A.3. E½hE ðtÞjT i ; p ¼ h¯ E ðT i Þ þ Oð

1 Þ d =2

In this section we show E½hE ðtÞjT i  ¼ h¯ E ðT i Þ þ O



1 d =2

.

Actually, we will prove a more general result that can be applied to hE or hV , h2;1;1 , h1;1;1;1 . Letting a; b; r be positive integers we show 

r pk ðtÞðpðtÞqðtÞÞa  E Yt ðpðtÞqðtÞÞb  ! j  r Y ðpqÞa ðT i Þ jG 2     pðT i Þ þ ¼ pðT i Þ j jG þ 2j ðpqÞb ðT i Þ j¼2 jG þ 2   1 þO  . ðA:27Þ d =2 This results makes precise the two step process that we used to derive h¯ E from hE in Section 4.3. That is, first we replace pðtÞ and qðtÞ by pðT i Þ and qðT i Þ to make the transition prk ðtÞðpðtÞqðtÞÞa ðpðtÞqðtÞÞb

!

prk ðtÞðpðT i ÞqðT i ÞÞa ðpðT i ÞqðT i ÞÞb

.

(A.28)

Then we make the substitution given in (4.28). We now prove (A.27). Our first step will be to remove the pðtÞqðtÞ dependence in (A.27). We have 

r pk ðtÞðpqÞa ðtÞ E Yt ðpqÞb ðtÞ  

r p ðtÞðpqÞa ðT i Þ ¼E k Yt ðpqÞb ðT i Þ   

ðpqÞa ðtÞ ðpqÞa ðT i Þ  þ E prk ðtÞ  Yt . ðA:29Þ ðpqÞb ðtÞ ðpqÞb ðT i Þ  We can bound the second term on the right of the equality directly above using two bounds. First, a coupled tT coalescent argument gives E½ðpðtÞ  pðT i Þ2 jT i  ¼ Oð d i Þ. c Second, (A.24) gives a bound on PðYt Þ. Putting these two

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503

502

A.4. Formula for hV ðtÞ

bounds together gives  

ðpqÞa ðtÞ ðpqÞa ðT i Þ  E prk ðtÞ  Yt ðpqÞb ðtÞ ðpqÞb ðT i Þ 

Below we give the formulas for hV , h2;1;1 , and h1;1;1;1 :

¼ OðE½ðpðtÞ  pðT i ÞÞ2 jYt 1=2 Þ rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi! Te  Ti c ¼ O PðYt Þ þ d " # rffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi!   d =2 Te  Ti 1 ¼ O exp  þ ¼ O =2 . 4 d d So we are left to determine PðYct Þ

order we show E½prk ðtÞ

pr ðtÞðpqa ðT i Þ jYt . E½ k b ðpqÞ ðT i Þ

this reduces to determining

hV ðtÞ ¼ hV ;a ðtÞ þ hV ;b ðtÞ þ hV ;c ðtÞ þ hV ;d ðtÞ, where hV ;a ðA:30Þ

With error of

E½prk ðtÞ.

! j  jG 2     pðT i Þ þ ¼ pðT i Þ j jG þ 2j j¼2 jG þ 2   1 GðtT i Þ þ O =2 þ e . d

(A.34)

! d 4G þ 2 1X 2 ¼ ðpk ðtÞqk ðtÞÞ , ðpðtÞqðtÞÞ2 d k¼1

hV ;b ¼ 2G

1 d

Pd

k¼1 ðpk ðtÞ

qk ðtÞqðtÞ þ pðtÞpk ðtÞqk ðtÞ2 Þ ðpðtÞqðtÞÞ2

So now

, (A.36)

r Y

hV ;c ðA:31Þ

2

(A.35)

! d ð1  2pðtÞÞ 1 X ¼ 4 pk ðtÞqk ðtÞ ðpðtÞqðtÞÞ3 d k¼1 ! d 1X pk0 ðtÞqk0 ðtÞð1  2pk0 ðtÞÞ ,  d 0 k ¼1

When r ¼ 1 (A.31) is satisfied. This can be shown through a time restricted coalescent analysis involving only a single individual. We proceed by induction and  assume that the formula holds for kor. Let a ¼ ðrG þ 2j Þ. We then have E½prk ðt þ sÞ

  1 E½xk;1 ðt þ sÞxk;2 ðt þ sÞ    xk;r ðt þ sÞ þ O ¼ d k¼1 N

 hV ;d ¼

2 3ð1  2pðtÞÞ2 þ 3 ðpðtÞqðtÞÞ ðpðtÞqðtÞÞ4



d 1X p ðtÞqk ðtÞ d k¼1 k

h2;1;1 ðtÞ ¼ h2;1;1;a ðtÞ þ h2;1;1;b ðtÞ þ h2;1;1;c ðtÞ,

!3 ,

(A.37)

d 1X

¼ ð1  asÞE½prk ðtÞ þ rGs 

d X

1 dðd  1Þ

h2;1;1;a ðtÞ ¼  4G  2Gf st ðtÞ 0 d X G @1 þ ðqðtÞp2k ðtÞqk ðtÞ ðpðtÞqðtÞÞ2 d k¼1

E½xk0 ;1 ðtÞxk;2 ðtÞ    xk;r ðtÞ

k;k0 ¼1;k0 ak

þ pðtÞpk ðtÞq2k ðtÞ þ q2 ðtÞp2k ðtÞ þ p2 ðtÞq2k ðtÞÞ !2 1 d 1X ðA:38Þ þ2 p ðtÞqk ðtÞ A, d k¼1 k

d r 1 X s þ E½xk;2 ðtÞ    xk;r ðtÞ þ Oðs2 Þ 2 d k¼1

¼ ð1  asÞE½prk ðtÞ þ rGsE½pðtÞpr1 k ðtÞ r s þ s2 . þ sE½pr1 k ðtÞ þ O d 2

where

ðA:32Þ

We would now like to replace E½pðtÞpr1 k ðtÞ by ðtÞ. This is almost identical to what we did in pðT i ÞE½pr1 k (A.29) and (A.30). We then arrive at    r E½prk ðt þ sÞ ¼ 1  rG þ s E½prk ðtÞ 2   r þ s rGpðti Þ þ E½pr1 k ðtÞ 2   s s þ s2 þ =2 . þO ðA:33Þ d d If we now plug in our inductive assumption and iterate the equality directly above then we arrive at (A.31).

0 d 2ð1  2pÞ @ 1X pðtÞqðtÞ p ðtÞqk ðtÞð1  2pk ðtÞÞ h2;1;1;b ðtÞ ¼ d k¼1 k ðpðtÞqðtÞÞ3 !2 1 d X 1 ðA:39Þ p ðtÞqk ðtÞ A, þð1  2pðtÞÞ d k¼1 k 

 2 3ð1  2pðtÞÞ2 h2;1;1;c ðtÞ ¼ þ pðtÞqðtÞ ðpðtÞqðtÞÞ3 ðpðtÞqðtÞÞ4 !2 d 1X p ðtÞqk ðtÞ .  d k¼1 k

ðA:40Þ

ARTICLE IN PRESS S. Rottenstreich et al. / Theoretical Population Biology 72 (2007) 485–503 d 1X p ðtÞqk ðtÞ d k¼1 k ! d d 1X 1X 2 2 2 2 þq ðtÞ p ðtÞ þ p ðtÞ q ðtÞ . ðA:41Þ d k¼1 k d k¼1 k

h1;1;1;1 ðtÞ ¼ 2 þ

1 ðpðtÞqðtÞÞ2

4pðtÞqðtÞ

References Crow, J.F., Aoki, K., 1984. Group selection for a polygenic behavioral trait: estimating degree of population subdivision. Proc. Nat. Acad. Sci. USA 81 (12), 6073–6077. Durrett, R., 1996. Stochastic Calculus: A Practical Introduction. CRC Press, Boca Rato. Ethier, S.N., Nagylaki, T., 1980. Diffusion approximations of Markov chains with two timescales and applications to population genetics. Adv. Appl. Probab. 12, 14–49. Nei, M., Chakravarti, A., 1977. Drift variances in F st and Gst statistics obtained from a finite number of isolated populations. Theor. Popul. Biol. 11, 307–325. Nei, M., Chakravarti, A., Tateno, Y., 1977. Mean and variance of F st in a finite number of incompletely isolated populations. Theor. Popul. Biol. 11, 291–306. Slatkin, M., 1991. Inbreeding coefficients and coalescence times. Genet. Res. Camb. 58, 167–175.

503

Takahata, N., 1983. Gene identity and genetic differentiation of populations in the finite island model. Genetics 104, 497–512. Vitalis, R., Couvet, D., 2001. Estimation of effective population size and migration rate from one and two locus identity measures. Genetics 157, 911–925. Wakeley, J., 1998. Segregating sites in Wright’s island model. Theor. Popul. Biol. 53, 166–174. Wakeley, J., 2003. Polymorphism and divergence for island model species. Genetics 163, 411–420. Wakeley, J., Takahashi, T., 2004. The many-demes limit for selection and drift in a subdivided population. Theor. Popul. Biol. 66, 83–91. Weir, B.S., Cockerham, C.C., 1984. Estimating F-statistics for the analysis of population structure. Evolution 38 (6), 1358–1370. Weir, B.S., Cockerham, C.C., 2002. Estimating F-statistics. Ann. Rev. Genet. 36, 721–750. Whitlock, M.C., Mccauley, D.E., 1999. Indirect measures of gene flow and 1 migration: F st a4Nmþ1 . Heredity 82, 117–125. Wilkins, J.F., 2004. A seperation timescales approach to the coalescent in a continuous population. Genetics 168, 2227–2244. Wilkinson-Herbots, H.M., 1998. Geneology and subpopulation differentiation under various models of population structure. J. Math. Biol. 37, 535–585. Wright, S., 1931. Evolution in Mendelian populations. Genetics 16, 97–159. Wright, S., 1942. Isolation by distance. Genetics 28, 114–138.