Efficient methods for studying stochastic disease and population dynamics

Efficient methods for studying stochastic disease and population dynamics

Theoretical Population Biology 75 (2009) 133–141 Contents lists available at ScienceDirect Theoretical Population Biology journal homepage: www.else...

3MB Sizes 0 Downloads 75 Views

Theoretical Population Biology 75 (2009) 133–141

Contents lists available at ScienceDirect

Theoretical Population Biology journal homepage: www.elsevier.com/locate/tpb

Efficient methods for studying stochastic disease and population dynamics M.J. Keeling a,∗ , J.V. Ross b a

Department of Biological Sciences and Mathematics Institute, University of Warwick, Gibbet Hill Road, Coventry, CV4 7AL, UK

b

King’s College, University of Cambridge, Cambridge, CB2 1ST, UK

article

info

Article history: Received 14 July 2008 Available online 21 January 2009 Keywords: Markov processes State space reduction Disease dynamics Catastrophes Parameter estimation

a b s t r a c t Stochastic ecological and epidemiological models are now routinely used to inform management and decision making throughout conservation and public-health. A difficulty with the use of such models is the need to resort to simulation methods when the population size (and hence the size of the state space) becomes large, resulting in the need for a large amount of computation to achieve statistical confidence in results. Here we present two methods that allow evaluation of all quantities associated with one- (and higher) dimensional Markov processes with large state spaces. We illustrate these methods using SIS disease dynamics and studying species that are affected by catastrophic events. The methods allow the possibility of extending exact Markov methods to real-world problems, providing techniques for efficient parameterisation and subsequent analysis. © 2009 Elsevier Inc. All rights reserved.

1. Introduction The foundations of ecological and epidemiological modelling are largely based upon deterministic equations for dynamics. However, it is becoming increasingly apparent that environmental and demographic stochasticity can cause dramatic deviations from this deterministic ideal (Bartlett, 1956; Rand and Wilson, 1991; Grenfell et al., 1998; Keeling et al., 2000; Spagnolo et al., 2003; Coulson et al., 2004). Three notable elements separate stochastic from deterministic dynamics. The first is the random nature of processes resulting in differences between realisations and thus temporal variation; in comparison a deterministic model always predicts the same behaviour (Taylor, 1961). The second is the opportunity for extinction, where, by chance, individuals may fail to ‘reproduce’ and the species dies out; this is important in most areas of population modelling, as recovery from extinction can usually only occur due to interaction with an external population (Bartlett, 1956). Finally, stochasticity induces fluctuations about the deterministic attractor which can lead to the population oscillating at or near its natural frequency (McKane and Newman, 2005). A common approach to studying stochastic population dynamics is integer-based event-driven models (also known as Monte Carlo simulations), mimicking the supposed behaviour of the real system (Gillespie, 1976; Renshaw, 1991). Such methods of simulation generally allow a range of complex, biologically-realistic features to be incorporated in the underlying model, and offer an



Corresponding author. E-mail addresses: [email protected] (M.J. Keeling), [email protected] (J.V. Ross). 0040-5809/$ – see front matter © 2009 Elsevier Inc. All rights reserved. doi:10.1016/j.tpb.2009.01.003

intuitive framework for many applied practitioners. However, given a single simulation, it is not clear whether the observed dynamics are representative of average behaviour or merely a chance outlier due to a rare combination of events. Consequently a large number of replicate simulations are required before confidence in results may be established. In essence, event-driven models are in silico experiments and therefore the results must be subject to the same statistical treatment as would be employed for any experimental observation. However, for a particular class of stochastic model called Markov processes, in which the future state of the population is determined solely by the current state, there exist computationally efficient and exact methods for evaluating behaviour when the state space (corresponding to all possible population configurations) is small (Norris, 1997; Keeling and Ross, 2008). These methods have many advantages over simulation methods, affording the use of simple matrix and vector operations to quickly and accurately determine a complete description of all possible behaviours of the stochastic system. The problem encountered with such methods, however, is that they are computationally infeasible when the state space is large (or even moderate) in size (Keeling and Ross, 2008). Here we present two novel methods which alleviate this constraint, allowing the aforementioned vector and matrix routines to be applied to processes with moderate to large-sized state spaces. We focus in particular on describing and applying the methods to one-dimensional Markov processes, although extensions to two or more dimensions are discussed. Such 1-D processes are ubiquitous in the field of population modelling, with applications arising in ecology for modelling single populations (Tomiuk and Loeschcke, 1994), metapopulations (Alonso and McKane, 2002)

134

M.J. Keeling, J.V. Ross / Theoretical Population Biology 75 (2009) 133–141

and populations subject to catastrophes (Mangel and Tier, 1993), and in epidemiology for modelling infectious diseases with SIS (susceptible-infectious-susceptible) dynamics, such as sexually transmitted infections (Andersson and Britton, 2000), hospital-acquired infections (Pelupessy et al., 2002) and host-parasite infections (Stone et al., 2008). The methods represent a significant improvement on other methods proposed in the literature by explicitly accounting for the states used in the approximation (cf. Gilpin and Taylor (1994) and Griebeler and Seitz (2001)) and avoiding the need for knowledge of neural networks and the accompanying issues of training and choice of time-step in the continuous-time case, and allow extension to much larger systems (cf. Griebeler and Seitz (2006)). Thus, we feel our methods provide a major advance for population modelling, as they allow the use of Markov processes in real-world scenarios, enabling accurate calibration and analysis to be undertaken. Our main method (termed the Interpolating Polynomial (IP) method) involves reducing the dimension of the state space by explicitly modelling only a subset of the states, and using interpolating polynomials to evaluate the other ‘missing’ state probabilities. The approximating polynomial (which is a function only of states explicitly modelled) replaces the missing state probabilities, thus reducing the size of the state-space, and allowing computationally exact state-of-the-art methods to be evoked. Not surprisingly, there exists a trade-off between the accuracy of the method and the reduction in the size of the state space, as will be demonstrated. However, the grid of interpolating points do not need to be equally-spaced, so that a fine grid may be used in regions of the state space where the distribution varies substantially between consecutive values (or where detailed results are required), and a much sparser grid may be used in regions where the distribution is relatively smooth. For the special case of (multi-dimensional) birth-and-death processes, an alternative approach (termed the Exact Probabilities and Exact Time (EPET) method) can be used to efficiently evaluate the exact stationary distribution, even for processes with large state spaces. The EPET method operates by amalgamating contiguous states into bins and approximating the movement between bins as Markovian, using the exact probabilities of movement to other bins and the exact average time spent within each bin to determine the rates. While this method produces exact results for the stationary distribution, the lack of internal structure within a bin means that transient dynamics are only captured approximately. The significance of the results presented here is two-fold. For the first time they allow for accurate and efficient evaluation of quantities associated with the type of large state space models that are necessary for real-world problems. Secondly, they allow the parameterisation of such stochastic models from data consisting of the state of the process collected at successive (not necessarily equally-spaced) time points. While a recent method has been presented for calibrating birth-and-death processes under these conditions (Ross et al., 2006, 2009), until now it has not been possible to accurately calibrate birth–death-and-catastrophe (BDC) processes without resorting to Monte Carlo (MC) simulation procedures (Cairns et al., 2007). In the next section we present the IP and EPET methods. We illustrate the techniques with two examples: SIS disease dynamics in a large population, and a metapopulation BDC process. We discuss our results and the extension of the method to higherdimensional (multi-species) processes in the final section of the paper. 2. Methods Here we present our two methods, allowing efficient and accurate computation for one-dimensional Markov processes that have large state spaces. We commence by introducing the required background theory of Markov processes.

Table 1 Transition rates for the SIS (susceptible-infectious-susceptible) model of disease dynamics; β is the transmission parameter, γ is the rate of recovery,  is the rate of import of infection from a source external to the population, and N is the total number of individuals. Event

Transition

Rate

Infection Recovery

n→n+1 n→n−1

q(n, n + 1) = β(n + )(N − n)/N q(n, n − 1) = γ n

Table 2 Transition rates for the BDC (birth–death-catastrophe) metapopulation model; c is the rate of propagule dispersion, e is the per-patch rate of local extinction due to natural causes, κ is the rate of catastrophes, ρ is the per-patch probability of death due to catastrophe, and N is the total number of patches. Event

Transition

Rate

Birth Death Catastrophe

n→n+1 n→n−1 n→n−l

q(n, n + 1) = cn(N − n)/N q(n, n − 1) = en q(n, n − l) = κ Bin(n, l, ρ)

2.1. Markov processes The dynamical behaviour of a continuous-time Markov chain is governed by fixed transition rates which we formulate into a matrix Q = (q(i, j), i, j ∈ S ), with q(i, j) representing the rate of transition from P state i to state j, for j 6= i, and q(i, i) = −q(i), where q(i) = j6=i q(i, j)(< ∞), is the total rate at which we leave state i. S is the state space representing all possible configurations (states) of the process. The model is completely specified by Q ; in particular the dynamics of the probability vector p are given by: dp dt

= Qp

(1)

where the elements of p are the probabilities of finding the system in a given configuration state at a given time. We now outline two examples which will illustrate the usefulness of one-dimensional Markov chains and will show the power of our methods. We consider the SIS model of disease dynamics and a metapopulation BDC process, both of which are defined on the integers 0 to N. For the SIS model the stochastic dynamics are defined by the events and corresponding transition rates as given in Table 1. The Susceptible-Infectious-Susceptible paradigm is a good yet simple approximation to the dynamics of many sexually transmitted infections. Only two processes occur: infection in which an encounter between a susceptible and an infectious individual leads to the transmission of infection; and recovery which usually follows treatment of an infected individual and leaves them once again susceptible to infection. In Table 1, n(t ) is the number of infectious individuals at time t, β is the transmission parameter, γ is the rate of recovery (hence 1/γ is the average infectious period),  is a parameter representing the import of infection from a source external to the population, and N is the total number of individuals in the population. In this example imports of infection (governed by  ) have been included to prevent the permanent extinction of the disease. The Birth–Death-Catastrophe model captures the dynamics of patches of habitat (or subpopulations). In general the colonisation and extinction of subpopulations follows a logistic growth model— as postulated in the simple Levins metapopulation model (Levins, 1969). However, rare catastrophes can occur which lead to the simultaneous extinction of a proportion of all subpopulations. For the BDC metapopulation model, we let n(t ) be the number of occupied patches at time t and define the events and corresponding transition rates as detailed in Table 2. In Table 2, c is the rate at which propagules are dispersed from each occupied patch, e is the per-patch rate of local extinction due

M.J. Keeling, J.V. Ross / Theoretical Population Biology 75 (2009) 133–141

to natural causes and κ is the rate at which catastrophes occur; when a catastrophe occurs we assume that each patch population is killed with probability ρ (i.e. the catastrophes are binomially distributed: Bin(n, l, ρ) represents the probability of l patches being killed when there are n patches occupied and each patch has probability ρ of being killed). We assume that our process is regular, which is guaranteed if Q is bounded in the sense that q(i) ≤ α (for all i), for some constant α ; a condition that is trivially satisfied when there are finitely many states (as in the examples). In this case we may write P (t ) = exp(Qt ), where exp is the matrix exponential, and P (t ) is the transition function; the (i, j)th entry of P (t ), denoted pij (t ), is the probability of moving from state i to state j in elapsed time t. In most cases the transition function cannot be evaluated explicitly. However, the behaviour of the process can often be investigated by working directly with the transition rates or we may evaluate the matrix exponential using a numerical package, such as EXPOKIT (Sidje, 1998), as we adopt here. One powerful way of understanding the behaviour of these Markov processes is to examine the eigenvalues and eigenvectors associated with the transition matrix Q . For the processes we consider here, the first eigenvalue is zero and the long-term, or stationary, distribution is given by the associated first left eigenvector of Q (normalised to sum to 1). Subsequent eigenvalues are all negative and inform us about convergence to the stationary distribution. For processes with an absorbing state (i.e. a state which it is impossible to leave, for example state 0 in the BDC model) the stationary distribution will be degenerate in the sense that all of the probability mass will be in the absorbing state. For such processes, we are more often interested in the quasistationary distribution of the process, which is given by the first left eigenvector (normalised to sum to 1) of the matrix QC being the matrix Q with row and column corresponding to the absorbing state removed (Norris, 1997; Keeling and Ross, 2008); the quasistationary distribution tells us about the long-term behaviour of our process but conditioned on not having gone extinct. Once again subsequent eigenvalues of QC inform us about convergence to quasi-stationarity (see, for example, Keeling and Ross (2008)). In addition, the expected time to absorption, τ , is found by solving the system of linear equations:

135

be easily incorporated into Bayesian MC methods. However, for each candidate set of parameters θc a new matrix Q (θc ) must be formed, and the matrix exponential evaluated corresponding to the sequence of observation states and times. This quickly becomes computationally infeasible as the population size (size of the state space) becomes moderate in size; it is generally considered that such approaches can only be used for populations up to size 50, or 100 (see, for example, (Cooper and Lipstich, 2004; Ross et al., 2006)), which is typically too small for many real-world applications. 2.2. The interpolating polynomial method (IP method) The standard one-dimensional Markov-chain methods have a state space defined on the integers: S = {0, 1, . . . , N }. The IP method works on a reduced state space Sˆ = {n1 , n2 , . . . , nM }, which is a subset of S; conceptually we think of Sˆ as being a coarse spacing of points between 0 and N. As such, if the method is successful, we have reduced our state-space from N + 1 to M dimensions. We have already seen in (1) that the rate of change of the probability vector p is a linear function of p; hence for birth–death processes (such as the SIS model) the rate of change of pi is a linear combination of pi−1 , pi and pi+1 . The difficulty of using a reduced state space is that some of the probabilities needed to calculate the rates of change may not be defined; the trick is therefore to interpolate for these missing state probabilities using the known ones. The success of this technique is due to the fact that interpolation using a polynomial of degree d can be expressed as a linear combination of the nearest d + 1 known probabilities. Therefore, we can formulate equations for the rates of change of the reduced-state probabilities pˆ , solely in terms of the probabilities pˆ , by using polynomial interpolation to fill in for the missing values. Given that the interpolation is linear in p, the resulting rates of change must also be linear and hence we have a system of differential equations which can once again be formulated as a matrix expression: dpˆ /dt = Qˆ pˆ . For example, using a quadratic interpolating polynomial we have the approximation: p(nk )+a ≈

x3 x2 pˆ nk−1 (x3 − x2 ) + x1 x3 pˆ nk (x1 − x3 ) + x2 x1 pˆ nk+1 (x2 − x1 ) x3 x2 (x3 − x2 ) + x1 x3 (x1 − x3 ) + x2 x1 (x2 − x1 )

(2)

(4)

where 1 is a vector of 1s and the i-th entry of τ (τi ) is the expected time to absorption starting in state i (Norris, 1997; Keeling and Ross, 2008). To use Markov processes for real-world modelling we must estimate parameters from data. Suppose that there is a parameter (or vector of parameters) θ , contained in some parameter space Θ , that must be estimated. We will allow the dependence on θ to be made explicit in our notation by writing Q (θ) for the transition rates and P (θ; t ) = exp(Q (θ )t ) = (pij (θ; t ), i, j ∈ S ) for the transition function. We will also write pi (θ; t ) for the probability that the process is in state i at time t. Given a set of d observations ik = n(tk )(k = 1, . . . , d) of the state of the process at times (0 ≤) t1 < · · · < td , the likelihood of observing them is

where x1 = nk−1 − nk − a, x2 = −a and x3 = nk+1 − nk − a are the differences between the point of interest (nk + a) and the three points used in the interpolation. Quite surprisingly, for a birth–death process, using such quadratic approximation still leads to a transition matrix Qˆ that is tri-diagonal, although the size of the state-space is reduced. For higher dimensional polynomials Qˆ is a banded matrix but is still generally sparse. The most natural assumption to make about the interpolation points is to use a regularly spaced grid, such that nk = m(k − 1) for a spacing of size m. Unfortunately, as shown below for the SIS model, this can lead to difficulties. As expected, the IP matrix Qˆ has one eigenvalue close to zero, and the associated eigenvector is a good match to the long-term dynamics of the full system. However, a more complete examination of the eigenvalues shows that numerical instabilities can occur, revealed as one or a conjugate pair of eigenvalues with large positive real parts. These instabilities are due to the importance of states at the boundaries, corresponding to n = 0 and n = N infectious individuals; in general the instabilities become stronger as polynomials of higher degree are used in the interpolation. Using an integer grid close to the boundaries resolves these difficulties with little extra computational overhead. In addition, for very large state-spaces it may be more efficient to utilise a very sparse grid of interpolation points in regions of the state space where there is little interest.

QC τ = −1,

L(θ) = pi1 (θ; t1 )

d Y

pik−1 ,ik (θ; tk − tk−1 ) .

(3)

k=2

We may then calculate the maximum likelihood estimator (MLE)

b θ of the parameters θ by maximising the likelihood (3) over Θ for

the given observations. As noted previously, the transition function cannot usually be evaluated explicitly, but progress can be made by computing the transition probabilities numerically. This, combined with a numerical search algorithm over the parameter space Θ , allows us to compute the MLE. Note that this likelihood may also

136

M.J. Keeling, J.V. Ross / Theoretical Population Biology 75 (2009) 133–141

These issues are illustrated in the two example systems studied below. 2.3. Exact probabilities and exact time method (EPET method) We now describe the EPET method in terms of how it is effected for birth–death processes, and later outline its generalisation to the multi-dimensional case. An explicit algorithm for performing EPET for one-dimensional birth death processes is given in the Appendix. In contrast to the interpolation method which reduced the state-space by considering a reduced set of points, the EPET method operates on aggregated bins of points, again reducing the state-space. Consider a set of non-overlapping, consecutive bins (K1 , . . . , KM ) which span the original state-space. The dynamics can now be re-expressed in terms of the rates at which transitions between bins occur. This argument is made more precise by considering two particular transition rates, depending on whether the bin has been entered from above or below. Thus four transitions are possible: Kb+ → Kb−+1

Kb+ → Kb+−1

Kb− → Kb−+1

Kb− → Kb+−1

where the superscripts + and − refer to entering the bin from above and below respectively. By treating the states within each bin in isolation, and assuming that leaving the bin (in either direction) is absorbing, we can use Eq. (2) to calculate the expected time spent in the bin starting at either end. The two probabilities of moving to the bin above or below can be determined, in generality, by finding the associated null spaces (see e.g. Keeling and Ross (2008), and Appendix). The transition rates are then approximated as the probabilities divided by the corresponding average time spent in the bin. The equilibrium dynamics of the probability of being in each bin can now be determined as before, by evaluating the eigenvector corresponding to the eigenvalue of 0. What is more, by using the calculated equilibrium transition rates between bins as boundary conditions, the internal equilibrium distribution within each bin can also be found. The approximation that the transitions between bins is Markovian, means that the EPET method often fails to capture the transient dynamics, and may fail to capture the quasi-equilibrium dynamics when the system has one or more absorbing states. For this reason we recommend use of the IP method when considering transient dynamics and processes possessing absorbing states. However, the EPET method produces the equilibrium distribution exactly for (multi-dimensional) birth–death processes without an absorbing state (see Appendix for details). For a 1-dimensional process, this does not always represent an advantage over the standard iterative solution given by the detailed balance equation. In higher dimensions, however, the savings can be considerable, breaking the large-scale eigenvalue problem into more manageable subunits. In higher dimensions, the transition rates associated with each bin have to be further subdivided by the point of entry, but this still represents a substantial saving. 3. Example systems 3.1. SIS disease dynamics The SIS disease model is a well-known and much used model of sexually-transmitted (Andersson and Britton, 2000), hospital-acquired (Pelupessy et al., 2002) and host-parasite infections (Stone et al., 2008). The associated Markov process defined in Table 1 provides a complete description of the dynamics, but it may be helpful to give the associated deterministic differential equation model for the (continuous) number of

infectious individuals, I: dI = β S (I + )/N − γ I dt where S = N − I,  measures the external force of infection, and transmission is assumed to be frequency dependent. Not only is this an important model from an epidemiological perspective, but additionally the Markov process has the mathematical advantage of possessing an equilibrium solution that can be expressed iteratively: pn+1 = pn

β(N − n)(n + ) . γ N (n + 1)

This provides us with a convenient means of testing the numerical results. We now examine the specific case where N = 2000, β = 1.3,  = 0.4 and γ = 1. Fig. 1 shows the accuracy and the computational savings gained by using the IP method (i.e. a sparse grid of interpolation points) ˆ 1 ) and when evaluating the smallest magnitude eigenvalue (λ associated eigenvector (eˆ 1 ). Several features emerge from these comparisons: the most obvious being that the state-space can readily be reduced by a factor of five to ten without seriously affecting the accuracy of the results (Fig. 1a,b). In addition, although higher degree interpolating polynomials are generally more accurate, polynomials of even degree perform better than those of odd degree (e.g. 6 is generally associated with lower errors than degree 7, and degree 2 generally out-performs degree 3). Certain spacing of interpolation points (n1 , . . . , nM ) provide a far better estimate of the eigenvector (e1 ) than others; spacings of 3, 6, 9, 12 and 15 perform poorly with little improvement even for high degree polynomials. It appears that these spacings fail to capture the precise position of the peak of the distribution, and if the spacings are offset (e.g. using 2, 5, 8, . . . rather than 0, 3, 6, . . .) then it is possible to greatly improve the accuracy. Thus we recommend that after first implementation of our method, a user should assess the shape of the distribution and ensure adequate placement of points to capture the peak(s) of the distribution. Finally, we turn to the computational savings, by comparing the methods when calculating exp(Qˆ )ˆp using EXPOKIT (Sidje, 1998) (Fig. 1c). Computation time clearly decreases as the size of the state-space decreases, with a drop that is faster than linear; although higher degree polynomials are associated with slower speeds, this effect is far less marked. We can conclude that for a range of interpolation grid spacings the IP method provides a reliable prediction of both the smallest magnitude eigenvalue and associated eigenvector; it is also clear that using higher (even) degree polynomials is most beneficial in terms of greater accuracy for little extra computational overheads. These results are all for the case of N = 2000. For far larger population sizes the savings become far more pronounced; in particular it allows evaluation of eigenvalues and matrix exponentials for population sizes that until now have been computationally infeasible. As commented on above, a simple regularly spaced grid of interpolation points can lead to numerical instabilities at the boundaries. This problem and its solution are dealt with more fully in the Appendix, where it is shown that an integer grid near the boundaries removes the instability. Using this refined grid, we consider the entire set of eigenvalues of both the full and interpolated system (Fig. 1d) and hence the ability of the IP method to capture transient behaviour. In this particular example we set N = 200 and nk = {0, 1, 2, . . . , 6, 11, 16, 21, . . . , 196, 197, 198, 199, 200}; this smaller population size emphasises the individual nature of the Markov process and therefore imposes a severe test on the IP methodology. Despite this, the first dozen largest real eigenvalues are in very good agreement, and while the interpolated system

M.J. Keeling, J.V. Ross / Theoretical Population Biology 75 (2009) 133–141

137

Fig. 1. (a–f) Assessment of the Interpolating Polynomial method for the one-dimensional SIS equation (N = 2000, R0 = β/γ = 1.3, γ = 1,  = 0.4). (a). Error in the smallest magnitude eigenvalue; this eigenvalue is zero for the full system. (b). Error in the eigenvector associated with the smallest magnitude eigenvalue (L2 norm). Given that the full eigenvector and IP eigenvector are defined over different points, we compare a reduced version of the full eigenvector just defined on the interpolation points and normalised to sum to one. (c). The average time in seconds required to calculate exp(Qˆ )ˆp for randomly chosen distributions pˆ , showing the computational savings associated with this method; calculations were performed on a 2.8 GHz PC using EXPOKIT and MATLAB. In comparison, the standard (full) model takes approximately 1.3 s. (d). Full eigenspectrum for the SIS model with N = 200, R0 = 1.3 and  = 0.1. Dots represent the 201 eigenvalues from the full matrix (Q ) and circles show the 49 eigenvalues from the IP matrix (Qˆ , nk = {0, 1, 2, . . . , 6, 11, 16, 21, . . . , 196, 197, 198, 199, 200}) using a quartic polynomial; note for clarity the first (zero) eigenvalue has not been plotted. e, f). Comparison of the IP method (red) and full system (blue) for the SIS model with imports when N = 200 (R0 = 1.3, ε = 0.1) for two sets of initial conditions. All results are calculated using EXPOKIT. In the main figures, the solid and dashed lines correspond to the mean number of infected individuals and the 95% confidence intervals. The inset figures show the distributions at various time points, solid blue lines are results for the full model, solid red circles are results for the IP method (using the grid nk = {0, 1, 2, . . . , 6, 11, 16, 21, . . . , 196, 197, 198, 199, 200} and quartic polynomials) and open red circles are an interpolation to the missing points; e. starts with the probability mass evenly distributed between 70 and 90, whereas in f. the probability mass is evenly distributed between 1 and 20. g, h). Comparison of stationary distribution for the SIS model with imports using both the full model and EPET method; R0 = β/γ = 1.3, γ = 1,  = 0.4; g N = 200, bins of width k = 3; h N = 2000 bins of width k = 10. Distribution using the EPET method is shown using bars; the distribution for the full system is shown in black, while the sum of the full distribution within each bin is shown in red.

138

M.J. Keeling, J.V. Ross / Theoretical Population Biology 75 (2009) 133–141

Fig. 2. Assessment of the Interpolating Polynomial method for the birth–death-catastrophe (BDC) model (N = 200, c = 2, e = 1, κ = 0.2 and ρ = 0.25). (a) Error in the smallest magnitude eigenvalue associated with the quasi-equilibrium process. It should be noted that these errors are larger than in Fig. 1a due to the much smaller system size. (b) Full eigenspectrum for the BDC model; dots represent the 201 eigenvalues from the full matrix (Q ) and circles show the 63 eigenvalues from IP matrix (Qˆ , nk = {0, 1, 2, . . . , 10, 14, 18, 22, . . . , 190, 194, 195, . . . 199, 200}) using a quartic polynomial.

does admit some oscillatory behaviour, these dynamics are rapidly damped. In practical terms this translates to being able to accurately capture the transient dynamics necessary for parameter estimation, as well as the equilibrium solution – examples of this are given in Fig. 1e and f. Finally, in Fig. 1g and h we compare the stationary distribution results of the EPET method and the full matrix approach for two parameterisations of the SIS model (N = 200, R0 = 1.3, ε = 0.1 and N = 2000, R0 = 1.3, ε = 0.4). As noted earlier, the stationary distribution from the EPET method (blue bars) is equal to the sum within each bin of the stationary distribution from the full model (red circles). 3.2. Birth–death-and-catastrophe dynamics The Markovian dynamics of the birth–death-and-catastrophe (BDC) process are far more complex than for the SIS model, although still one-dimensional. A deterministic (ODE) version of the model can be derived, but it provides a poor representation of the dynamics as it fails to capture the stochastic nature of the catastrophic jumps. The most striking feature is that the BDC model has zero (extinction of the species) as an absorbing state. This means that in general we are interested in the quasistationary distribution – that is the distribution conditional on non-extinction. The second distinction is that the transition matrix Q is no longer sparse. Instead, following a catastrophe it is possible to move to any lower state – thereby making Q triangular and reducing the computational efficiency that is often associated with sparse matrices. It is this second feature which emphasises the need for robust methods that can reduce the state-space. The IP method operates much as before, interpolating the unknown probabilities from the known probabilities defined on a grid of points. Again, similar instabilities can arise which can be removed by introducing integer spacing at the boundaries. Fig. 2a shows the error in calculating the quasi-stationary distribution, as the degree of the interpolating polynomial and the spacing of the interpolation grid varies. Here we have taken a relatively small population size (N = 200) and so extreme reductions in the state space are not possible. Throughout we have used an integer spacing from 1 to 10 followed by a regular grid. We note that the size of the integer regime needed to eliminate the instabilities grows with the order of the interpolating polynomial, and hence there is an important trade-off between using a high-degree polynomial to achieve sufficient accuracy across the majority of the grid, and yet keeping the size of the integer regime small. Fig. 2b shows the comparative eigenspectrum of the full matrix and the reduced state IP matrix for the BDC model. We find good

agreement between the first 20 eigenvalues and the corresponding eigenvectors (not shown), which is highly indicative of being able to predict the temporal dynamics for a range of scenarios. Again, as with the SIS model, we find that the IP method does give rise to a set of oscillatory solutions, but these are rapidly damped as the associated complex eigenvalues have large negative real parts. 4. Discussion 4.1. Extensions to two or more dimensions We have shown that the IP and EPET methods can be used to calculate equilibrium distributions for one-dimensional Markov chains, and that the IP method can also successfully capture transient dynamics and quasi-stationary behaviour. While 1-D Markov chains (essentially single-species models) have many uses, the vast majority of ecological and epidemiological processes are higher-dimensional with two or more interacting species or populations. Predator-prey models and SIR (susceptible-infectiousrecovered) disease models are both well studied examples of twodimensional Markov chain models which can be approached using extensions of our methodology. To extend the EPET method for multi-dimensional processes we need to break the state space up into (multi-dimensional) bins, and model explicitly the boundary states of each bin. These states correspond to the possible ways we may enter or leave each bin. We then evaluate, for each of these boundary states, the expected time we remain within the bin. We also evaluate the probabilities of first entering each state on the boundary of the other bins. The rates of transition are then calculated as for the one-dimensional case — the ratio of the probabilities to the corresponding expected time. Obviously the calculation of the required quantities becomes more demanding as the dimension of the process increases, but is still feasible for many processes (see e.g. Keeling and Ross (2008) for an indication of computational feasibility). The IP method has a natural extension to two or more dimensions, using higher dimensional interpolating polynomials. Some care must be taken choosing the polynomial such that the number of coefficients matches the number of interpolation points, thus maintaining the linear dependence. Two-dimensional Markov chains are clearly more computationally intensive than their one-dimensional counterparts. For example, for a population of exactly N individuals, the onedimensional SIS model has N + 1 states whereas the twodimensional SIR model has (N + 2)(N + 1)/2 states. However, the use of an interpolation grid of points has greater benefits in two or more dimensions; working with a grid spacing of 2 (e.g. even points) approximately halves the state space for a 1-D model,

M.J. Keeling, J.V. Ross / Theoretical Population Biology 75 (2009) 133–141

139

Fig. S1. (A) Transition diagram for the birth–death process with N + 1 states, where λn is the rate at which ‘births’ occur in a population of size n and µn is the rate at which j ‘deaths’ occur. (B) Transition diagram for EPET method with 3 states per bin; [Ki : {k}] has Ki representing the bin, the superscript j representing whether the bin has been entered from above (+) or below (−) and the k representing which state this corresponds to in the original state space (cf. A).

but quarters the state space for a 2-D model. While the optimal strategy may be to take different spacings and different boundary regions for each dimension/species, it is clear that huge computational savings can be achieved. 5. Summary We have presented and demonstrated the effectiveness of two novel methods for evaluating the limiting or dynamic behaviour of continuous-time Markovian models (EPET and IP methods). The EPET method reduces the size of a models state space (corresponding to all possible population state configurations) by grouping states into contiguous bins and approximating transitions between them as Markovian. The EPET method produces exact results for the equilibrium distribution of the system. The IP method also reduces the state space, but via use of interpolating polynomials to infer intermediate, omitted state probabilities. This methodology has two major benefits: it allows faster evaluation of model behaviour, and affords the opportunity to study real-world models that, until now, have been too large for analysis. In particular, it allows for calibration and subsequent analysis of Markov models to be undertaken efficiently. Acknowledgments The authors thank the Leverhulme Trust and King’s College (Zukerman Research Fellowship held by JVR). Appendix A.1. Exact probabilities and exact time (EPET) method It is well known that the equilibrium probabilities of a continuous-time Markov chain can be computed by multiplying the equilibrium probabilities of the embedded jump chain (the process defined by considering only jumps and not their times) by the corresponding expected holding time in each state. It is this fact which the EPET method exploits. We may evaluate precisely the probabilities of jumps of the embedded jump chain of the binned process (exact probabilities), and also the expected times within bins (exact times); more detail for each calculation is provided below. The EPET method operates on aggregated bins of points. Consider a set of non-overlapping, consecutive bins (K1 , . . . , KM ) which span the original state-space. The dynamics can now be reexpressed in terms of the rates at which transitions between bins occur. This argument is made more precise by considering two particular transition rates, depending on whether the bin has been

entered from above or below. Thus four transitions are possible: Kb+ → Kb−+1

Kb+ → Kb+−1

Kb− → Kb−+1

Kb− → Kb+−1

where the superscripts + and − refer to entering the bin from above and below respectively. In Fig. S1(B) the rates at which these transitions occur are denoted R(·, ·). We now explain how to evaluate these rates, and then provide an explicit algorithm for this purpose. By treating the states within each bin in isolation, and assuming that leaving the bin (in either direction) is absorbing, we can use Eq. (2) of the main paper to calculate the expected time spent in the bin starting at either end. The two probabilities of moving to the bin above or below can be determined, in generality, by finding the associated null spaces (see e.g. Keeling and Ross (2008)). For the birth–death process the required probabilities can actually be evaluated explicitly (see Section 3.1 of Ross and Pollett (2007), and below). The transition rates R are then approximated as the probabilities divided by the corresponding average time spent in the bin. EPET algorithm. 1. For each boundary state of each bin, evaluate the expected time you remain in the bin before leaving to another bin. These times may be evaluated via solving a system of linear equations (the number of equations in the system is equal to the number of states in the bin): Qb τb = −1, where Qb is the matrix Q restricted to states only in bin b, 1 is a vector of 1s and the required times are the first (τb (1); for the left boundary state) and last (τb (end); for the right boundary state) entries of τb (see Norris (1997) if more detail is required). 2. For each boundary state of each bin, evaluate the probability of exiting the bin in either direction. This probability (note they are complementary events) may be evaluated in a number of ways: (a) The probability p(U , L)i of reaching U before L (U > L), starting with a population of size i (L ≤ i ≤ U) for the general birth–death process with birth rates λx and death rates µx is given by p(U , L)i = si /sU , where sL = 0, sL+1 = 1 and, for L + 2 ≤ i ≤ U, si = 1 +

j i −1 X Y µk . λ j=L+1 k=L+1 k

(b) Another approach is to add two absorbing states to either end of the bin and consider the matrix Qb∗ being the matrix Q restricted to states only in bin b with the additional two states associated with the neighbouring bins. Now note that there are 2 zero eigenvalues associated with the matrix Qb , and that eigenvectors associated

140

M.J. Keeling, J.V. Ross / Theoretical Population Biology 75 (2009) 133–141

with zero-eigenvalues span the null-space of the matrix. The nullspace calculation provides a computationally efficient means of finding the eigenvectors and hence calculating the final state of the system (probability of being in either bin). For simplicity we can set the first 2 left eigenvectors to be l1 = I1 and l2 = Iend (where Ii is a vector of length equal to the size of the state space (number of states in bin plus two) and having a 1 in entry i). The right eigenvectors can then be found such that they span the nullspace of Q and satisfy ri .lj = δij (where δij is Kronecker delta). The long-term distribution is given by: p∞ =

2 X (ri .p0 ) li i=1

allowing us to compute the desired probabilities: p∞ (1) is the probability of exiting to the bin to the left before entering the bin to the right (i.e. p(L, U )i as denoted in 2(a); note that only two different p0 will be needed. (c) A final approach which is similar to (b), is to once again append two states to either side of the bin. However, instead of evaluating the null-space of the matrix Qb∗ , we evaluate the distribution of the process after a reasonable period of time (long enough so that most of the probability mass is in the two absorbing states). This may be evaluated, for example, by using EXPOKIT (Sidje, 1998). 3. The rates R are then determined by dividing the probability of reaching a particular neighbouring state (as evaluated in 2) by the expected time you remain within the bin (as evaluated in 1) for each boundary state. Considering the bin b = 2 in Fig. S1, for p(6,2) example, the rate R(K2+ , K3− ) = τ (3)5 , this being the probability 2 of reaching state 6 before state 2 starting from state 5, divided by the expected time you remain in bin b = 2 (i.e. states 3, 4, 5 of the original process) before exiting starting from state 5. A.2. Interpolating Polynomial (IP) method Algorithm. The Interpolating Polynomial (IP) method operates on a reduced state space Sˆ = {n1 , n2 , . . . , nM } and uses a polynomial of degree d specified by the nearest d + 1 known points to approximate any unknown points. By far the easiest way to construct the new matrix (Qˆ ) associated with the reduced statespace of the IP method is to return to the underlying Kolmogorov forward equation. In general the Kolmogorov forward equation for the full system can be expressed as follows for the probability of being in state n: dPn dt

= Qn (P0 , P1 , . . . , PN ) =

N X

qm,n Pm .

m=0

where Qn is a linear function of the current distribution of probabilities, and hence can be expressed as a matrix. We can now set up an interpolating polynomial of degree d (Z d ), such that each point in the full system is expressed as a linear combination of at most d + 1 points in the reduced state space: Pn ≈ Znd (Pn1 , Pn2 , . . . , PnM ) =

M X

zl,n Pnl .

l =1

We note that zn,k , when considered as a matrix of values, is generally sparse; and Znd is the identity when n is part of the reduced subspace. Combining the original Kolmogorov equation with the interpolating approximation gives: dPnk dt



N X

qm,nk

m=0



zl,m Pnl

l =1

M N X X l=1

M X

m=0

! zl,m qm,nk

Pnl

Fig. S2. Comparison of the (normalised) eigenvectors associated with the dominant (largest real) and smallest magnitude eigenvalues, for the full system (grey) and IP method. The IP method is performed on a grid with an integer boundary regime governed by the parameter q (nk = {0, 1, 2, . . . , q, q + 5, q + 10, . . .}). For values of q < 4 there exists an eigenvalue with positive real parts (associated eigenvector shown as a dashed line) indicating a growing numerical instability.

where the term in brackets can simply be found by multiplying the original Q matrix (containing only the columns corresponding to the reduced state-space) by the matrix (Z ) of interpolation coefficients. This matrix multiplication generates the new transition rates matrix Qˆ for the reduced state-space. We make the above reasoning explicit by considering a particular example: the SIS model with imports, a reduced statespace defined on every mth point, and a quadratic interpolating polynomial. The standard Kolmogorov forward equation across all points ({0, 1, 2, . . . , N }) is given by:   dPn = − β 0 (N − n)(n + ε) + γ n Pn dt   + [γ (n + 1)] Pn+1 + β 0 (N − n + 1)(n − 1 + ε) Pn−1 where β 0 = β/N. Considering the reduced spacing nk = m(k − 1), that is using a set of interpolation points {0, m, 2m, 3m, . . . , N }, then making a quadratic interpolation (Eq. (4) in main text) for the missing points we have:   dPnk = − β 0 (N − nk )(nk + ε) + γ nk Pnk dt h + [γ (nk + 1)] −(m − 1)Pnk−1 , i 2 + 2(m − 1)Pnk + (m + 1)Pnk+1 (2m2 )

 h + β 0 (N − nk + 1)(nk − 1 + ε) (m + 1)Pnk−1 , i 2 + 2 (m − 1)Pnk − (m − 1)Pnk+1 (2m2 )  = − β 0 (N − nk )(nk + ε) + γ nk − γ (nk + 1)(1 − 1/m2 )  − β 0 (N − nk + 1)(nk − 1 + ε)(1 − 1/m2 ) Pnk  0  β (N − nk + 1)(nk − 1 + ε)(m + 1) − γ (nk + 1)(m − 1) + Pnk−1 2m2   0 γ (nk + 1)(m + 1) − β (N − nk + 1)(nk − 1 + ε)(m − 1) + Pnk+1 . 2 2m

While the coefficients are undoubtedly more complex than the original, once calculated the associated transition rate matrix (Qˆ ) is still tri-diagonal but its size is reduced by approximately a factor of m. Instabilities near the boundary A more detailed analysis highlights that numerical instabilities can occur close to the boundaries. This instability is revealed as one or a conjugate pair of eigenvalues with large positive real parts. Although the use of lower order polynomials close to the boundary

M.J. Keeling, J.V. Ross / Theoretical Population Biology 75 (2009) 133–141

141

Fig. S3. Comparison of the IP method and full system for the SIS model with imports when N = 20, 000. Clearly similar patterns of errors and computational savings are achieved as in Fig. 1 of the main text – although, as expected, for the same grid spacing the errors are smaller.

can alleviate some of the instability, the most robust approach is to include extra points around the boundaries, defining the dynamics on the integers and only using the interpolation technique away from the boundaries. Fig. S2 shows the effects of using integer spacing close to the boundaries; in particular we set nk = {0, 1, 2, . . . , q, q + 5, q + 10, . . .}. In this example, we have deliberately chosen a small population size (N = 200) to highlight the spacing of the reduced state-space; it should be noted that in general larger population sizes are more amenable to approximation by a reduced state space. The (normalised) eigenvector associated with the dominant eigenvalue of the full system (e1 ) is shown in grey. For q less than 4, there exists an eigenvalue with a large real part that corresponds to the numerical instability (shown with dashed line for q = 2 and q = 3). However, for these q values the eigenvector corresponding to the eigenvalue of smallest magnitude (solid lines) still provides a reasonable approximation to the true long-term solution. For values of q greater than or equal to four, the largest real eigenvalue is also the smallest magnitude eigenvalue and hence the numerical instability disappears. By the time q = 6 the reduced state space model provides a robust description of the full dynamics. Comparison for larger N Finally, in Fig. S3 we extend the IP method to much larger population sizes (N = 20,000), which can be compared to Fig. 1a–c in the main text. A similar pattern of errors is found. However, given that the population size is larger, it is possible to use a sparser grid of interpolation points to capture the dynamics. References Alonso, D., McKane, A., 2002. Extinction dynamics in mainland-island metapopulations: An N-patch stochastic model. B. Math. Biol. 64, 913–958. Andersson, H., Britton, T., 2000. Stochastic Epidemic Models and Their Statistical Analysis. Springer. Bartlett, M.S., 1956. Deterministic and stochastic models for recurrent epidemics. Proc. Third Berkley Symp. Math. Statist. Probab. 4, 81–108. Cairns, B.J., Ross, J.V., Taimre, T., 2007. A comparison of models for predicting population persistence. Ecol. Model. 201, 19–26. Cooper, B., Lipstich, M., 2004. The analysis of hospital infection data using hidden Markov models. Biostatistics 5, 223–237. Coulson, T., Rohani, P., Pascual, M., 2004. Skeletons, noise and population growth: The end of an old debate? Trends Ecol. Evol. 19, 359–364.

Gillespie, D.T., 1976. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. J. Comput. Phys. 22, 403–434. Gilpin, M.E., Taylor, B.L., 1994. Reduced dimensional transition matrices: Extinction distributions from Markovian dynamics. Theor. Popul. Biol 46, 121–130. Grenfell, B.T., et al., 1998. Noise and determinism in synchronised sheep dynamics. Nature 394, 674–677. Griebeler, E.M., Seitz, A., 2001. The use of Markovian metapopulation models: A comparison of three methods reducing the dimensionality of transition matrices. Theor. Popul. Biol. 60, 303–313. Griebeler, E.M., Seitz, A., 2006. The use of Markovian metapopulation models: Reducing the dimensionality of transition matrices by self-organizing Kohonen networks. Ecol. Model. 192, 271–285. Keeling, M.J., Ross, J.V., 2008. On methods for studying stochastic disease dynamics. J. R. Soc. Interface 5, 171–181. Keeling, M.J., Ross, J.V., 2008. On methods for studying stochastic disease dynamics. J. R. Soc. Interface 5, 171–181. Keeling, M.J., Wilson, H.B., Pacala, S.W., 2000. Re-interpreting space, time-lags and functional responses in ecological models. Science 290, 1758–1761. Levins, R., 1969. Some demographic and genetic consequences of environmental heterogeneity for biological control. Bull. Ent. Soc. Am. 15, 237–240. Mangel, M., Tier, C., 1993. A simple direct method for finding persistence times of populations and application to conservation problems. Proc. Natl. Acad. Sci. USA 90, 1083–1086. McKane, A.J., Newman, T.J., 2005. Predator-prey cycles from resonant amplification of demographic stochasticity. Phys. Rev. Lett. 94, 218102. Norris, J.R., 1997. Markov Chains. Cambridge University Press. Pelupessy, I., Bonten, M.J.M., Diekmann, O., 2002. How to assess the relative importance of different colonization routes of pathogens within hospital settings. Proc. Natl. Acad. Sci. USA 99, 5601–5605. Rand, D.A., Wilson, H.B., 1991. Chaotic stochasticity—a ubiquitous source of unpredictability in epidemics. Proc. R. Soc. B 246, 179–184. Renshaw, E., 1991. Modelling Biological Populations in Space and Time. Cambridge University Press. Ross, J.V., Pagendam, D.E., Pollett, P.K., 2009. On parameter estimation in population models II: Multi-dimensional processes and transient dynamics. Theor. Popul. Biol. 75 (2–3), 123–132. Ross, J.V., Pollett, P.K., 2007. On costs and decisions in population management. Ecol. Model. 201, 60–66. Ross, J.V., Taimre, T., Pollett, P.K., 2006. On parameter estimation in population models. Theor. Pop. Biol. 70, 498–510. Sidje, R.B., 1998. Expokit. A software package for computing matrix exponentials. ACM Trans. Math. Softw. 24, 130–156. Sidje, R.B., 1998. Expokit. A software package for computing matrix exponentials. ACM Trans. Math. Softw. 24, 130–156. Spagnolo, B., Fiaasconaro, A., Valenti, D., 2003. Noise induced phenomena in Lotka–Volterra systems. Fluct. Noise. Lett. 3, L177–L185. Stone, P., Wilkinson-Herbots, H., Isham, V., 2008. A stochastic model for head lice infections. J. Math. Biol. 56, 743–763. Taylor, L.R., 1961. Aggregation, variation and the mean. Nature 189, 732–735. Tomiuk, J., Loeschcke, V., 1994. On the application of birth–death models in conservation biology. Cons. Biol. 8, 574–576.