A phylogenetic comparative method for studying multivariate adaptation

A phylogenetic comparative method for studying multivariate adaptation

Journal of Theoretical Biology 314 (2012) 204–215 Contents lists available at SciVerse ScienceDirect Journal of Theoretical Biology journal homepage...

540KB Sizes 0 Downloads 43 Views

Journal of Theoretical Biology 314 (2012) 204–215

Contents lists available at SciVerse ScienceDirect

Journal of Theoretical Biology journal homepage: www.elsevier.com/locate/yjtbi

A phylogenetic comparative method for studying multivariate adaptation Krzysztof Bartoszek a,n, Jason Pienaar b, Petter Mostad a, Staffan Andersson c, Thomas F. Hansen d a

Mathematical Sciences, Chalmers University of Technology and the University of Gothenburg, Gothenburg, Sweden Department of Genetics, University of Pretoria, Pretoria 0002, South Africa Department of Biological and Environmental Sciences, University of Gothenburg, Gothenburg, Sweden d CEES, Department of Biology, University of Oslo, Oslo, Norway b c

a r t i c l e i n f o

a b s t r a c t

Article history: Received 4 December 2011 Received in revised form 18 June 2012 Accepted 3 August 2012 Available online 24 August 2012

Phylogenetic comparative methods have been limited in the way they model adaptation. Although some progress has been made, there are still no methods that can fully account for coadaptation between traits. Based on Ornstein–Uhlenbeck (OU) models of adaptive evolution, we present a method, with R implementation, in which multiple traits evolve both in response to each other and, as in previous OU models, to fixed or randomly evolving predictor variables. We present the interpretation of the model parameters in terms of evolutionary and optimal regressions enabling the study of allometric and adaptive relationships between traits. To illustrate the method we reanalyze a data set of antler and body-size evolution in deer (Cervidae). & 2012 Elsevier Ltd. All rights reserved.

Keywords: Ornstein–Uhlenbeck process Multivariate phylogenetic comparative method Adaptation Optimality Allometry

1. Introduction Most biologists would agree that adaptive evolution is a multivariate phenomenon. Natural selection acts on many traits simultaneously and the selection pressures on individual traits are often functionally or environmentally interdependent. Furthermore, biological traits, or suites of traits contributing to a given function, usually consist of many interacting parts with complex genetic architectures that cause correlated responses due to pleiotropic gene action, see e.g. Walsh and Blows (2009). Thus, although much has been learned from the univariate analyses dominating both micro- and macroevolutionary studies they neglect the ‘‘covariance frontier’’ (Houle, 2007). It is here that the most important coadaptive or limiting forces (e.g. allometry) that guide and constrain adaptive change may be held. The difficulties of disentangling the myriad of evolutionary forces acting on traits extend across both micro- and macroevolutionary time scales. On the microevolutionary time scale, quantitative genetics, particularly the framework of Lande (1976, 1979) and Lande and Arnold (1983) for modeling evolution on the

n

Corresponding author. Tel.: þ46 31 772 10 00; fax: þ46 31 16 19 73. E-mail addresses: [email protected] (K. Bartoszek), [email protected] (J. Pienaar), [email protected] (P. Mostad), [email protected] (S. Andersson), [email protected] (T.F. Hansen). 0022-5193/$ - see front matter & 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.jtbi.2012.08.005

adaptive landscapes of Simpson (1944) has had some success in capturing multivariate trait evolution. Macroevolution, however, cannot be fully understood as microevolution on static adaptive landscapes. The high rates of evolution generally found on microevolutionary time scales suggest that the relatively slow macroevolutionary dynamics result from changes in the adaptive landscape itself, and particularly in the position of adaptive peaks (Arnold et al., 2001; Svensson and Calsbeek, 2012; Hansen, 2012). Phylogenetic comparative methods are the principal tools for studying phenotypic evolution—analysing macroevolutionary patterns and processes that span across speciation events. Although the most commonly employed comparative methodologies are multivariate in that they attempt to model correlated evolution between traits, the model of evolution underlying most of these methods, a multivariate Brownian-motion (BM) process (Felsenstein, 1985), cannot model adaptation to specific optima (Hansen and Orzack, 2005). This leaves most methods unable to model phenotypic trait evolution in the spirit of the adaptive landscape framework mentioned above. A suite of comparative methods based on stochastic process models that capture deterministic adaptation towards specific optimal states has been developed (Hansen, 1997; Martins and Hansen, 1997; Butler and King, 2004; Hansen et al., 2008; Labra et al., 2009; Jhwueng and Maroulas, 2011). These and other recent model-based comparative methods (O’Meara et al., 2006; Thomas et al., 2006; Harmon et al., 2008; Hohenlohe and Arnold, 2008; Revell and

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

Harmon, 2008; Lajeunesse, 2009; Revell and Collar, 2009; Cooper et al., 2010; Hadfield and Nakagawa, 2010; Eastman et al., 2011) are starting to broaden the variety of evolutionary hypotheses that can be compared across species. With data from enough different species and a sufficiently resolved phylogeny relating extant taxa and their ancestors to selective regimes with quantitatively predicted trait optimum values, these methods can be used to address the adaptive component of trait changes across the time scale of the phylogeny. In contrast to many of the other methods, however, the models of adaptive evolution suffer from a largely univariate perspective, although the OUCH package (Butler and King, 2004) allows a limited multivariate model and Jhwueng and Maroulas (2011) developed a bivariate model (see also Hansen and Martins, 1996; Martins et al., 2002; Roper et al., 2008); we are therefore limited in our ability to study multivariate adaptation across phylogenies. Our aim is to remedy the above by extending comparative methods based on the Ornstein–Uhlenbeck process to a general multivariate form, and provide a software package in R (R Development Core Team, 2010) to implement them. To illustrate the methods we reanalyze a data set on body size and antler size in cervids from Plard et al. (2011). The multivariate extension allows for the study of how different traits interact with each other and affect each other’s optimum values. It is therefore an extension of the adaptation-inertia model of Hansen et al. (2008) and also allows for modelling of optima processes that themselves are under different selective pressures, which is an extension of the models in Hansen (1997) and Butler and King (2004). The paper is organized as follows: Section 2 provides the mathematical formulation of the multivariate models, discusses their properties and introduces an R software package which estimates their parameters; Section 3 provides examples of the kinds of hypotheses that can be explored within this model framework; Section 4 considers a data set of antler length and body mass in deer as an illustration, and finally Section 5 summarizes the work and discusses remaining questions and future directions. Appendices A, B, C, D and E contain advanced mathematical properties of the models. The supplementary material to this work presents the possibilities of the software package in more detail and presents the results of a simulation study concerning parameter recovery.

2. Mathematical models Throughout this paper matrices are written in bold face, ! column vectors in normal font with an arrow above (i.e. Z ) and

205

scalar values in normal font. By I we denote the identity matrix of appropriate size. For a matrix M, MT denotes the transpose of M, M1 its inverse, MT is the transpose of the inverse of M and vecðMÞ is the vectorization, stacking of columns, of M. The operators  and  denote respectively the Hadamard and ! Kronecker products between matrices. The observation of Z in ! species i is denoted by Z si . By n we will always mean the number of species on the phylogeny. 2.1. The general modelling framework Hansen (1997), Butler and King (2004), Hansen et al. (2008) and Labra et al. (2009) have developed a stochastic modelling framework for adaptation of evolving traits towards nichedependent optimal values. Following a suggestion by Felsenstein (1988), the framework is based on an Ornstein– Uhlenbeck process that allows for systematic adaptation towards an optimal state, ! ! ! ! d Z ðtÞ ¼ Fð Z ðtÞ C ðtÞÞ dt þ R d W ðtÞ, ð1Þ ! where Z ðtÞ is the vector of trait values at time t, F (the drift matrix in mathematical nomenclature) represents rates of adap! tation of trait values towards optimum values, C ðtÞ is a fixedeffect vector function (the drift vector in mathematical nomen! clature) representing the optimum trait values at time t, W ðtÞ is a multivariate standard Brownian motion and R (the diffusion matrix) mediates the stochastic perturbations (see Fig. 1). The ! vector C ðtÞ is assumed to be a step function over the phylogeny. The changes of the values of the function represent switching of selective regimes; see Butler and King (2004) for more details and the deer analysis presented here for an example. The solution of the stochastic differential equation (1) is the stochastic process, Z t Z t ! ! ! ! Z ðtÞ ¼ eFt Z ð0Þ þ eFðtvÞ F C ðtÞ dv þ eFðtvÞ R d W ðvÞ: ð2Þ 0

0

At each point in time this process is multivariate normal and so is the distribution of all of the traits on the phylogeny. This means that it is enough to study the mean and covariance structures to know all the properties of the process. In certain situations one will be able to partition the trait ! ! ! vector Z into two parts denoted X and Y . They are meant to symbolize predictor and response variables respectively. Then in our model framework, relationships between them can be studied

Fig. 1. Example trajectories of the mvOUBM stochastic differential equation model with a constant regime. A: SDE simulated with best found maximum-likelihood parameter estimates from the deer data (Section 4), B: SDE simulated with parameters as used for studying the performance of the method in the supplementary material. In both graphs the response traits are plotted, their respective optimum process and the predictor process. Notice that in panel A we can see that the two traits are supporting each other while in panel B we can clearly see a trade off between the traits. In both cases we can see a relation between the trait and optimum process.

206

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

! ! by looking at the conditional distribution of Y on X , in particular the conditional mean and covariance. One way of modelling relationships between the predictor and response variables is to introduce a layering structure in Eq. (2), see Section 2.3. A detailed discussion of such a modelling approach for evolutionary time-series data can be found in Reitan et al. (in press). Eq. (1) describes the dynamics of trait evolution along a single lineage (or branch) in the tree. However, to estimate the parameters for the entire data set one must take into account the correct datadependency structure which has to be derived from the phylogeny. An Ornstein–Uhlenbeck model was implemented in the OUCH (Ornstein–Uhlenbeck models for phylogenetic Comparative hypotHeses) R package (Butler and King, 2004). Their implementation is for the special case in which F is symmetric positive definite. Another partial implementation is in the SLOUCH (Stochastic Linear Ornstein–Uhlenbeck models for phylogenetic Comparative hypotHeses) R package (Hansen et al., 2008; Labra et al., 2009). Some parameter restricted bivariate Ornstein– Uhlenbeck processes are discussed by Roper et al. (2008) and Jhwueng and Maroulas (2011). There are a number of other partial implementations of this kind of process evolving on a phylogenetic tree e.g. ape (Paradis et al., 2004), COMPARE (Martins, 2004), Brownie (O’Meara et al., 2006), geiger (Harmon et al., 2008) or Picante (Kembel et al., 2010), but none of them implement the full general setting of Eq. (1) and most of them ! only allow one trait with a constant C ðtÞ  C. In this work we present an R package for the estimation of parameters of Eq. (1) with phylogenetic comparative data. We do not limit the number of traits, assume that the drift vector is a step function over the phylogeny and the only restrictions we make are that we require F to have an eigendecomposition and exclude some singular cases. In addition to the general model the package is also specially designed for an important subcase (where F is singular), the multivariate Ornstein–Uhlenbeck Brownian-motion (mvOUBM) model presented in the next section. 2.2. OUBM-model

2.3. OUOU model Hansen et al. (2008) presented a model in which the predictor variable evolves as a Brownian motion. The multivariate Ornstein–Uhlenbeck framework generalizes this by also allowing the predictor to follow an Ornstein–Uhlenbeck process. This leads to the Ornstein–Uhlenbeck Ornstein–Uhlenbeck (OUOU) model, ! ! ! ! ! d Y ðtÞ ¼ Ayy ð Y ðtÞð c y ðtÞ þ Q X ðtÞÞÞ dt þ Ryy d W y ðtÞ, ! ! ! ! d X ðtÞ ¼ Axx ð X ðtÞ c x ðtÞÞ dt þ Rxx d W x ðtÞ: The correspondence to the parameters of Eq. (1) is, 2 3 ! " # " # Ayy Ayy Q Ryy 0 ! 6 cy7 F¼ , R¼ , C ¼ 4 ! 5: Rxx 0 Axx 0 c

ð6Þ

ð7Þ

x

The first and second conditional moments do not seem to be as tractable as in the mvOUBM. Therefore we just want to signal this submodel here, leaving a detailed consideration for a future study. Note that if we set Axx to 0 we get the mvOUBM model. Jhwueng and Maroulas (2011) considered a bivariate subcase of Eq. (6), dYðtÞ ¼ a1 ðYðtÞðb0 þb1 XðtÞÞÞ dt þ sy dW y ðtÞ,

The Ornstein–Uhlenbeck Brownian-motion (OUBM) model in phylogenetic comparative studies was introduced by Hansen et al. (2008). In its most recent version the SLOUCH package estimates by maximum likelihood the parameters of the following layered system of stochastic differential equations, ! m X dYðtÞ ¼ a YðtÞcðtÞ bi X i ðtÞ dt þ sy dW y ðtÞ, i¼1

dX i ðtÞ ¼ sxi dW xi ðtÞ,

! variables. The X variables are not under any adaptive pressures ! while Y is modelled as adapting to a randomly evolving optimum ! ! c ðtÞ þ Q X ðtÞ. Following Hansen (1997) we call this process the ! primary optimum for Y . One can also consider primary optima ! for individual elements of the Y variables that apart from ! ! depending on c ðtÞ and X also depend on the remaining elements ! of Y . We do not write this out here in the general form but advise the reader that this can be done by doing the matrix multiplication in Eq. (4). Bivariate cases are discussed in Section 3. Setting A to 0 in Eq. (4) results in the Brownian-motion model considered by Felsenstein (1985) and most existing phylogenetic comparative methods.

i ¼ 1, . . . ,m,

ð3Þ

where a 4 0 and as mentioned cðtÞ is a step function over the phylogeny. The multivariate OUBM model is, ! ! ! ! ! d Y ðtÞ ¼ Að Y ðtÞð c ðtÞ þ Q X ðtÞÞÞ dt þ Ryy d W y ðtÞ, ! ! d X ðtÞ ¼ Rxx d W x ðtÞ,

ð4Þ

where A can be any invertible matrix with an eigendecomposition, ! ! ! c ðtÞ is a step function, and W y ðtÞ and W x ðtÞ are independent multivariate standard Brownian motions. The correspondence to the parameters of Eq. (1) is 2 !3 " #   Ryy 0 A AQ ! c F¼ ð5Þ , R¼ , C ¼ 4 ! 5: Rxx 0 0 0 0 The motivation behind the layering structure in the case of Eq. ! ! (4) is that the Y and X variables represent two separate types of

dXðtÞ ¼ a2 ðXðtÞgÞ dt þ sx dW x ðtÞ,

ð8Þ

but in their estimation algorithm a1 and a2 are assumed positive and equal, which limits their biological interpretation. See also Roper et al. (2008) for a similar setup. 2.4. Model properties and their biological interpretations 2.4.1. Asymptotic behaviour To simplify the discussion of the asymptotics we assume that ! ! there exists a point t0 such that for all t 4 t 0 C ðtÞ  C is constant. The asymptotic behaviour of the above processes along a given lineage is determined by the eigenvalues of the F matrix. If the real parts of all the eigenvalues are positive, the process will converge in distribution. This together with the multivariate normality implies that all the moments converge and also that the moments of conditional distributions will converge. The traits will be tending in distribution to a random variable ! that has mean C . This value can be understood as the optimal value. The conditional distributions (in the mvOUBM case condi! tional on X ) also converge and convergence of the conditional means will be discussed in the next section. The mvOUBM stochastic process does not possess any stationary distribution because the F matrix has eigenvalues equalling 0. This is due to the driving Brownian-motion predictor variable, which has a variance that will tend to infinity. However the ‘‘residual’’ process ! ! Y Q X will converge in distribution to its stationary distribution.

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

Both of these conditional distributions will be multivariate normal and the corresponding mean and covariance structures are presented in Appendix B. The eigenvalues that govern the model’s asymptotics can be interpreted as rates of adaptation towards the optimum along the directions in the trait space described by the corresponding eigenvectors. The inverses of these are easier to interpret, and as in the univariate case (Hansen, 1997), we can define the half lives, hi ¼ lnð2Þ=Realðli Þ, for the eigenvalues li . They represent the time needed for the effect of the initial state to be halved on the direction of the i-th eigenvector, i.e. eRealðli Þhi ¼ 0:5. One of the differences between the uni- and multivariate cases is that the half lives do not operate in the trait space but in the eigenvector space of the matrices. The case of negative eigenvalues could naı¨vely be understood as some form character displacement. But since they represent a process that takes place over the entire phylogeny this could only apply to cases in which most of the species are geographically overlapping. More general interpretations of negative eigenvalues require further thought.

207

2.5. Implementation The models were implemented in an R package named mvSLOUCH (MultiVariate Stochastic Linear Ornstein–Uhlenbeck models for phylogenetic Comparative hypotHeses) that accompanies this paper. The package performs maximum-likelihood estimation of the model parameters in the spirit of a GLS with general covariance structure, described in e.g. (Carroll and Ruppert, 1987). The package allows the user to input a lot more than just tip data. It allows for missing values on some tips and measurement error on the data. It can be used to do model comparison, or to study different hypotheses about regimes or phylogenies. Apart from point estimates of the parameters it also returns detailed composite statistics and confidence intervals. A detailed description of the estimation algorithm and program possibilities can be found in the supplementary material. The package is available from the corresponding author Krzysztof Bartoszek.

3. Biological interpretation of the model 2.4.2. Evolutionary, optimal, and limiting regressions Hansen et al. (2008) introduced a distinction between the evolutionary and the optimal regression. The evolutionary regression is the ‘‘observed’’ regression we expect to see in a crossspecies data set. Due to phylogenetic inertia this may differ from the optimal regression, which is the optimal relationship between the traits (i.e. the regression of the primary optimum on the predictor variable(s)). In the framework of Hansen et al. (2008), in which the predictor variable evolves as a Brownian motion, the evolutionary regression equals the optimal regression times a phylogenetic correction factor that depends on the rate of adaptation and the time scale. As the time scale goes to infinity the evolutionary regression will converge on the optimal regression. This finding does not generalize to the multivariate Ornstein–Uhlenbeck model, however. If the predictor variable is not marginally behaving as a Brownian motion, the evolutionary regression could converge on a limiting regression that is different from the optimal regression. Specifically, if the model is dYðtÞ ¼ ay ðYðtÞðcy þ b1 XðtÞÞÞ dt þ sy dW y ðtÞ þ syx dW x ðtÞ, dXðtÞ ¼ ax ðXðtÞcx Þ dt þ sx dW x ðtÞ,

ð9Þ

then the limiting regression is lim E½YðtÞ9XðtÞ ¼

t-1

ay ax syx b þ2 : ay þ ax 1 ax þ ay sx

ð10Þ

Hence, even given infinite time the regression we observe across species is expected to be different from the optimal relationship between traits. The reason why this happens can be analogous to why measurement error in predictor variables produces biased regression slopes. When adaptation is not instantaneous, changes in the predictor variable are not instantaneously converted into changes in the response variable, and will therefore act as if it was measurement error in the predictor. The bias due to measurement error can be described by an attenuation factor that equals a ratio between the ‘‘true’’ variation in the predictor divided by the sum of the true and the error variance in the predictor (Hansen and Bartoszek, 2012). If the predictor variable evolves as a Brownian motion this factor will go to unity as ax ¼ 0, but if the true evolutionary variance in the predictor variable is constrained, as in the Ornstein–Uhlenbeck process, some bias will remain, as described by Eq. (10). A special case of this appears in the model of Jhwueng and Maroulas (2011), where one can show that the evolutionary regression slope converges on half the optimal slope when ay ¼ ax and syx ¼ 0.

3.1. Correlated adaptation Many traits need to function in a synchronized manner. In many mammals the teeth of the upper and lower jaws need to occlude. Another example can be found in plants, where the positions of anthers and stigma need to match for efficient pollination (Armbruster et al., 2009). Developing the latter example, we can imagine that there is stabilizing selection for anther position towards an optimum determined by stigma position and vice-versa. A general model capturing this is, dYðtÞ ¼ ay ðYðtÞf y ðXðtÞÞÞ dt þ stochastic change, dXðtÞ ¼ ax ðXðtÞf x ðYðtÞÞÞ dt þ stochastic change,

ð11Þ

for some functions fy and fx, where Y is anther position and X is stigma position. On top of this, one or both traits may also be pulled towards a fixed or randomly evolving optimum. In the models described in Eqs. (1) and (4) we go beyond the simple bivariate case, allowing for multiple traits evolving jointly and assuming the f functions are linear transformations of the traits with the possible addition of a randomly evolving optimum process. In the case of anthers and stigma, such a pull may be determined by the size of the most efficient pollinator (which can be a randomly evolving predictor). Hence, we can imagine the species being positioned at different points along a cordillera depending on their main pollinator (Armbruster, 1990). In this model, a strong indication of correlated adaptation can be that the primary optimum of both (or all) traits depends on the other traits. It should be realized, however, that such relationships could arise for other reasons, such as trade offs or other constraints (see below). Tests for correlated adaptation can be based on specific hypotheses for what the optimal relationship should be and by matching estimated parameter values with this hypothesis (or model). For example, one could postulate that anther and stigma positions should match and see if the estimated primary optimum conform to this match. 3.2. Trade off Particularly in life-history theory, cross-species comparative analyses are often done to study trade offs between traits. As with the study of intraspecific trade offs this is problematic, due to the fact that different species may have different amount of resources to divide among the traits masking any underlying trade off (e.g. van Noordwijk and de Jong 1986; Houle 1991; Fry 1993). Hence,

208

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

the comparative study of trade offs must be based on models that account for both acquisition and allocation of resources to the traits. Here, we present two models inspired by the genetic acquisition–allocation model of Houle (1991). These models consider two traits Y(t) and X(t) that are both functions of an underlying resource trait R(t) and an allocation trait A(t), as YðtÞ ¼ lRðtÞAðtÞ, XðtÞ ¼ RðtÞð1AðtÞÞ:

ð12Þ

If we assume that the logarithms of the two traits Y and X evolve as an Ornstein–Uhlenbeck process then we can rewrite the model for A(t) and R(t) in terms of the observable traits and can study the transformed variables in the Ornstein–Uhlenbeck framework. We introduce the new variables Yn and Xn, YðtÞ lAðtÞ ¼ ¼: Y n ðtÞ, XðtÞ 1AðtÞ XðtÞ ¼ RðtÞð1AðtÞÞ ¼: X n ðtÞ,

ð13Þ

3.3. Evolution of allometry An allometric relationship between two traits Y and X means a power law model relating them as Y ¼ a  X b (Huxley, 1932; Savageau, 1979). By taking the logarithm of both variables we get a linear relationship logðYÞ ¼ logðaÞ þb  logðXÞ and the b parameter is often called the allometric slope or allometric coefficient. Allometric relationships across species are called evolutionary allometries (von Bertalanffy and Pirozynski, 1952; Cheverud, 1982), and we can use our framework to test hypotheses about how they arise. Two hypotheses can be compared. The first is that the evolutionary allometry is determined by selection as captured in the optimal regression between the logarithms of the two traits. The second is that the evolutionary allometry is due to correlated changes in the traits caused by internal constraints. Both are captured by the model, dYðtÞ ¼ ay ðYðtÞb1 b2 XðtÞÞ dt þ syy dW y þ syx dW x ,

which become, after taking logarithms,

dXðtÞ ¼ ax ðXðtÞcÞ dt þ sxx dW x þ sxy dW y :

AðtÞ ¼ logðY n ðtÞÞ, 1AðtÞ logðXðtÞÞ ¼ logðð1AðtÞÞÞ þ logðRðtÞÞ ¼ logðX n ðtÞÞ:

To determine which hypothesis our data support we need to look at the estimated parameters. The first adaptation hypothesis would be supported by large values of ay and b2 with small syx . The second ‘‘constraint’’ hypothesis would be by small values of ay and syy and a large value of syx . For more details see Hansen and Bartoszek (2012) and Voje and Hansen (in press) for an application.

logðYðtÞÞlogðXðtÞÞ ¼ logðlÞ þ log

ð14Þ

Now logðYðtÞÞlogðXðtÞÞ and logðXðtÞÞ are observable quantities and as they are a linear transformation of logðYðtÞÞ and logðXðtÞÞ, they will also be evolving as an Ornstein–Uhlenbeck process,

ð18Þ

d logðY n ðtÞÞ ¼ a1 ðlogðY n ðtÞÞb1 logðX n ðtÞÞc1 ðtÞÞ dt þ s11 dW 1 þ s12 dW 2 , d logðX n ðtÞÞ ¼ a2 ðlogðX n ðtÞÞb2 logðY n ðtÞÞc2 ðtÞÞ dt þ s22 dW 2 þ s21 dW 1 :

ð15Þ This model should tell us how much the species vary in their trade off, A(t), while controlling for acquisition in R(t). In the mvOUBM framework, this model can be combined with other variables to test for the influence of these other variables on the primary optima of the allocation and acquisition traits. The model above assumes that a trade off exists and seeks to study the optimal allocation. It cannot test directly for the presence of a trade off. Towards constructing a model that can do this, we can treat R(t) as an independently evolving third variable in the model and test for a negative optimal relation between X(t) and Y(t) while controlling for R(t). Here we work on the untransformed variables and model the primary optima as

yy ðtÞ ¼ cy ðtÞ þby XðtÞ þ ly RðtÞ, yx ðtÞ ¼ cx ðtÞ þ bx YðtÞ þ lx RðtÞ,

ð16Þ

where cy and cx are the fixed effects, and positive values of the by and bx parameters can be taken as indications of a trade off. Assuming that R(t) evolves as a Brownian motion, this model can be represented in our framework as dYðtÞ ¼ ay ðYðtÞðcy ðtÞ þby XðtÞ þ ly RðtÞÞÞ dt þ syy dW y þ syx dW x , dXðtÞ ¼ ax ðXðtÞðcx ðtÞ þ bx YðtÞ þ lx RðtÞÞÞ dt þ sxx dW x þ sxy dW y , dRðtÞ ¼ sR dW R ,

ð17Þ

where sR is the diffusion parameter of the resource variable. The problem is that R(t) is not usually observable. We therefore consider two possible solutions. The first is to assume that R(t) can be represented as a function of observable traits. For example, one could use RðtÞ ¼ YðtÞ þ lXðtÞ, for some l that needs to be estimated. This requires careful interpretation of the results. The second solution is to model R(t) as a hidden variable. This approach has been outlined for time-series analysis by Reitan et al. (in press), and should also work in a phylogenetic setting, but the details, especially parameter estimability, remain to be developed.

4. Example: an analysis of deer antler length and body mass To illustrate the method, we reanalyze a data set of antler length and body mass in 31 different species of deer previously used by Plard et al. (2011) to analyze effects of sexual selection on antler length. For each species Plard et al. (2011) assembled data from the literature on antler length (AL in millimeters), male body mass (MBM, in kilograms), female body mass (FBM, in kilograms), breeding group size (BGS, three levels 1–2, 3–5 and 45) and mating tactics (MT, four levels: harem, territorial, tending and monogamous). Mating tactic is missing for Megamuntiacus vuquangensis. The phylogeny, illustrated in Fig. 2a, was taken from the Cervidae subtree of the Ruminant phylogeny from Ferna´ndez and Vrba (2004) with Cervus canadensis added. Plard et al. (2011) predicted that the relative size of antlers would reflect sexual selection, and that strength of sexual selection would increase with breeding group size. They also used sexual size dimorphism, the relative mass difference between males and females, as a proxy for the opportunity for sexual selection, because male mass, but not female mass, is likely to be a target of sexual selection in the different species. Our reanalysis deviates from Plard et al. (2011) in two main ways. The first is that we will use male mass as a response variable rather than just as a predictor variable. Our multivariate method allows us to use both antler size and body mass as simultaneous response variables, and this seems more biologically correct, because both antler size and male body mass are assumed to respond to sexual selection. This also allows us to ask whether there are functional interactions or trade offs between the two variables. The second difference is that we use models that allow adaptive evolution towards optima determined by the different breeding group sizes and mating systems. Plard et al. (2011) used non-phylogenetic ordinary least squares and general least squares analysis in which the covariances of the different species were assumed proportional to branch length. The former analysis ignores phylogenetic inertia, and the second is inconsistent with adaptation towards an optimum (Hansen and Orzack, 2005). Our approach incorporates

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

209

log (antler length [mm])

7

6

5 small medium large OLS E[AL | MBM](1)

4

3

4

5

6

log (male body mass [kg])

6

6

5

4

small medium large OLS evolutionary regression optimal regression

log (male body mass [kg])

log (antler length [mm])

7

5

4

3

small medium large OLS evolutionary regression optimal regression

2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0

2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0

log (female body mass [kg])

log (female body mass [kg])

Fig. 2. The deer phylogeny and pairs of variables plotted against each other with slopes from the ordinary least squares (OLS) regression, evolutionary and optimal regression estimates. The observations are grouped according to the breeding group size. The intercept is calculated as the mean of ‘‘yb^  x’’ as in the OLS case it would be different for each of the three regimes while in the other two regressions would be particular to each species depending on the evolution of breeding group sizes on each lineage. The regression lines are as follows: B: OLS 1:68 þ 0:99MBM, E½AL9MBM ¼ 0:65 þ 1:55MBM. C: OLS 1:51 þ 1:10FBM, evolutionary regression 2:19 þ 0:93FBM, optimal regression 0:12 þ 1:47FBM. D: OLS 0:33 þ 1:56FBM, evolutionary regression 0:05 þ 1:09FBM, optimal regression 0:19 þ 1:12FBM. The regimes were mapped onto the tree by the Fitch algorithm with delayed transformations (Fitch, 1971).

phylogenetic inertia and adaptation towards optimality in a unified framework. As Plard et al. (2011) we assume allometric relationships between antler length, male body mass and female body mass. This means that after taking logarithms we will have linear relationships between the transformed variables that fit our linear stochastic differential equation framework. In the context of sexual selection it is more relevant to consider sexual size dimorphism defined as the residual of the evolutionary regression of the logarithm of male body mass on female body mass (Ranta et al., 1994) instead of male body mass directly. We do not need to use sexual size dimorphism in the estimation procedure because this variable is a linear function of the logarithms of male and female body mass. Its effect can be derived from a model including male and female body masses untransformed (see Revell, 2009 for a discussion on transforming data that have phylogenetic dependencies). Below we present and discuss the results in terms of both male body mass and sexual size dimorphism. We ran a number of different models and used AIC c to choose the best one. We considered the following candidate adaptation models: 1. breeding group size and mating tactic affecting the primary optimum of antler length, male body mass and female body mass, 2. only breeding group size affecting the primary optimum of antler length, male body mass and female body mass,

3. only mating tactic affecting the primary optimum of antler length, male body mass and female body mass, 4. female body mass, breeding group size and mating tactic affecting the primary optimum of antler length and male body mass, 5. female body mass and breeding group size affecting the primary optimum of antler length and male body mass, 6. female body mass and mating tactic affecting the primary optimum of antler length and male body mass. Parameter estimates for the best fitting model are shown in Table 1. This model has the logarithms of antler size and male body mass as response variables, and includes an effect of the logarithm of female body mass and mating group size on both of these traits. Mating tactic was not found to be important. There is also no effect of antler size on the primary optimum of male body mass and vice versa. There is, however, a covariance between evolutionary change in antler size and male body mass. The estimated A matrix reveals moderate phylogenetic inertia in antler size with a half life of 27:5% of tree height, but very low phylogenetic inertia in male body mass with a half life only 2:2% of the tree height (Fig. 3). We define sexual size dimorphism as SSD ¼ logðMBMÞðevolutionary regression coefficientÞ  logðFBMÞ, which using the best found estimates from Table 1 equals SSD ¼ logðMBMÞ1:09  logðFBMÞ. To transform our data vector ðlogðALÞ,logðMBMÞ,logðFBMÞÞ to ðlogðALÞ,SSD,logðFBMÞÞ we do the

210

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

Table 1 Parameters from the best-found model (AICc ¼ 133:28, log likelihood¼  48.52), a mvOUBM model with diagonal A and upper triangular Ryy . In all matrices the first row corresponds to logðALÞ and the second to logðMBMÞ. The standard deviations for Q , the evolutionary regression and the effects of breeding-group size are conditional on A, Ryy , Rxx , logðALÞð0Þ, logðMBMÞð0Þ and logðFBMÞð0Þ. The standard deviations for A, Ryy , Rxx , logðALÞð0Þ, logðMBMÞð0Þ and logðFBMÞð0Þ are, due to the dependencies between observations, approximate as described in the supplementary material. The presented model is the best among 108 different models. The estimation for each model was run from different starting points, to account for potential estimability or multimodality issues, with a total of about 360 program instances. We considered different classes of F/A matrices, namely diagonal, having real positive eigenvalues and invertible to test for adaptation, if the estimated eigenvalues will all have positive real part then this will indicate adaptation. To see whether the random perturbations to the traits were correlated or not we considered diagonal and upper triangular R/Ryy matrices. In addition as the mating tactic for M. vuquangensis was missing we ran each setup once for each possible value of the mating tactic. Parameters for the best model

Estimate

A



Ryy



2:52

0

0

Ryy RTyy Rxx RTxx Rxx



(standard error) 

31:90 

0:45 0

1:06 1:28

1:34

1:36

1:36 1.40

1:64

logðALÞð0Þ logðMBMÞð0Þ logðFBMÞð0Þ Q

1.18 5.04 3.96 3.83   1:47

Evolutionary regression



" "

Small BGS



Medium BGS



Large BGS



#





ð0:87Þ

ð1:69Þ 

ð2:15Þ ð2:64Þ

Fig. 3. Log-likelihood surface for A’s two half lives conditional on all other model parameters fixed at the best found estimate (Table 1). Note that 0 is not included in the axes.

#



Table 2 Parameters that changed from the best found model after transforming to have sexual size dimorphism instead of the logarithm of male body mass as one of the variables. The estimates of the parameters not shown are the same as in Table 1. They did not change as A was diagonal. The first row of Q n corresponds to the optimal regression of logðALÞ on logðFBMÞ and the second to that of SSD on logðFBMÞ. In addition we will get an instantaneous covariance term between the noise for the SSD and logðFBMÞ equaling  1.51, the evolutionary regression of SSD on logðFBMÞ has to be 0 by definition of the sexual size dimorphism.

ð1:20Þ ð0:41Þ ð0:0Þ ð0:06Þ " # ð0:43Þ

1:12 0:93 1:09

ð2:81Þ

ð0:07Þ "



0:59

0:81

#

Parameters for the best model

Estimate

#

SSDð0Þn Qn

 0.21   1:47 0:04

ð0:07Þ 

"



"



"

0:34 0:28 0:20

ð0:27Þ ð1:9Þ ð0:32Þ ð1:9Þ

#

ð0:32Þ

0:04

following matrix, vector multiplication, 2 3 2 32 3 logðALÞ logðALÞ 1 0 0 6 SSD 7 6 76 logðMBMÞ 7 4 5 ¼ 4 0 1 1:09 54 5, logðFBMÞ logðFBMÞ 0 0 1

ð1:9Þ ð0:32Þ

#

ð19Þ

which carries over to the model parameters from Table 1. The estimated primary optima (see Table 2) for the logarithm of antler size and sexual size dimorphism (SSD) for the three breeding group sizes are small breeding group: yAL ¼ 0:59 þ1:47logðFBSÞ, ySSD ¼ 0:34 þ 0:04logðFBSÞ:

ð20Þ

Medium breeding group: yAL ¼ 0:28 þ1:47logðFBSÞ, ySSD ¼ 0:20 þ0:04logðFBSÞ:

ð21Þ

Large breeding group: yAL ¼ 0:81 þ1:47logðFBSÞ, ySSD ¼ 0:04 þ0:04logðFBSÞ:

ð22Þ

These results are consistent with an adaptive effect of breeding group size on antler size. For the mean female body mass of 46 kg, the primary optimum for antler size is predicted to increase from yAL ¼ 15:3 cm for small breeding group through 36.5cm for medium breeding group to 62.3cm for large breeding group. Breeding group size and female body mass explain 92% of the

variance. The confidence intervals on these numbers are large however, and the result must be seen as tentative. The effects on sexual size dimorphism are in the same direction, but smaller. The reason for the small effect of female body mass on sexual dimorphism is low phylogenetic inertia. Low phylogenetic inertia means that the evolutionary regression between male and female body mass approaches the optimal regression very rapidly. We defined sexual size dimorphism as the residual of the evolutionary regression of the logarithms of male on female body mass. Due to rapid adaptation this will be very close to the residual of the optimal regression of the logarithms of male on female body mass and so by definition, we should not expect a large effect. For average female mass of 46 kg, the predicted primary optima for male mass in three breeding group categories are 52.3 kg, 60.2 kg, and 70.4 kg, respectively. There is however a strong effect of female mass on the primary optimum of antler size, which are predicted to be relatively larger in large-bodied species. This is likely a reflection of the allometric relation between antler and body mass. Even if male body mass is not directly included in the optimum, antler size and male body mass are strongly correlated (observed correlation 0.89). This dependency comes from two sources from female body mass effecting them both and correlated random perturbations, Ryy RTyy in Table 1. This model provides a clear rejection of both adaptive coevolution and functional trade offs between antler size and male mass in the sense of a sexual dimorphism. This is because there are no cross effects on the primary optima. The two traits are strongly correlated, however. The correlation between the stochastic infinitesimal changes, derived from Ryy RTyy , for the two traits is 0.92. It is tempting to speculate that this correlation is

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

caused by a static allometric constraint on the two traits. The regression of the logarithm of antler size on the logarithm of male body mass derived from the infinitesimal variance matrix is 0.83. The phylogenetic half lives reveal relatively rapid adaptation in these traits. This is in stark contrast to the strong phylogenetic effects in the traits themselves. The half lives computed from a simple univariate analysis with a constant primary optimum reveal t 0:5 ¼ 257% of tree length for the logarithm of antler size and t 0:5 ¼ 55:3% for male body mass. This is essentially equivalent to antler length evolving as a Brownian motion and male body mass having very slow adaptation. This non-adaptive evolution is, however, an effect of their dependence on female body mass, which itself has a strong phylogenetic effect (t 0:5 ¼ 59%Þ. Hence, even if the traits could be assumed to evolve as a Brownian motion, the residuals from the analysis do not behave like a Brownian motion, and a standard phylogenetic comparative analysis as independent contrasts would be seriously wrong (see Labra et al., 2009). The non-phylogenetic analysis of Plard et al. (2011) was in fact much more realistic than their generalized least squares analysis based on a Brownian motion. It was not entirely ideal, however, as antler size did show a moderate level of phylogenetic inertia that should be taken into account, as done in the present analysis. The most important aspect of our analysis is that it improves on Plard et al. (2011) by more realistic modelling of the traits’ evolution and by providing insights into their speed of adaptation. We find that antler length responds rapidly and much faster than male body mass to changes in female body mass. We did not find any strong interaction in the adaptation of male antler and body mass. To the best of our knowledge ours is the first mathematical tool that allows for testing hypotheses about coadaptation in a phylogenetic context. Our results are qualitatively consistent with those of Plard et al. (2011) and furthermore explain why they found a model with no phylogenetic effects superior to one which assumed a Brownian-motion evolutionary process.

5. Summary Our aim in this study was to present a multivariate generalization of the models described by Hansen (1997); Butler and King (2004); Hansen et al. (2008). We also provide R code that implements this multivariate setting and generalizes the OUCH and SLOUCH software packages from the two aforementioned papers. With respect to the OUCH package, we drop the need for F to be symmetric positive definite and we extend the SLOUCH package to a multivariate response. We describe the models not only in terms of their original parameters, but also from the point of view of stochastic process theory, interpreting the mean and covariance functions of the process (see Appendix E), what conditional distributions one should look at and what the biological implications are. In some hypothesis-testing testing situations variables may be a priori assigned as ‘‘predictors’’ and ‘‘responses’’. To analyze this one typically looks at the conditional distribution of the responses on the predictors, which in most cases means a regression—looking at the conditional mean of the responses on the predictors. The accompanying software package will give the user a tool for a fully multivariate comparative analysis in the framework of the Ornstein–Uhlenbeck type models (Martins and Hansen, 1997). The multivariate implementations so far, e.g. Butler and King (2004), Hansen et al. (2008), Jhwueng and Maroulas (2011) have been limited to a few submodels, and the challenge remains to study and test hypotheses involving complex trait interactions and coevolution.

211

Even though our model and its implementation are very general, there remain open questions that need to be addressed. All of the more complex stochastic differential equation models in comparative analysis run into the trouble of parameter identifiability. An analytic study is required to show whether this is due to insufficient sample sizes or an underlying parameter identifiability issue. This seems to be a difficult problem as for example the ! estimability of the values of c ðtÞ, corresponding to different regimes relies heavily on the particular regime layout and branch lengths. Ane´ (2008) considers the Brownian-motion case, but she does not address the problem for the more complex Ornstein– Uhlenbeck framework. Her results need to be extended to the case of iterated GLS procedures in which the covariance matrix also depends on the parameters entering the mean. Another question concerns what is estimable despite small sample sizes. This is something we cannot answer with full certainty and it has to be treated case by case due to the dependency on the tree structure. Looking at the histograms in the supplementary material we can see that the biologically relevant parameters are well estimated even with trees as small as 32 species. It is however possible to get outliers, a difficult parameter to estimate or run into the problem of model identifiability. We mentioned that the different stochastic differential equation models are submodels of each other when certain entries of the F matrix are set to 0. If the observed data came from a distribution that had appropriate entries very small it would be next to impossible to distinguish between for example an Ornstein–Uhlenbeck Brownian-motion model of evolution and a Brownian motion. Boettiger et al. (2012) discuss this and study via Monte Carlo methods the power of univariate Ornstein–Uhlenbeck based phylogenetic comparative methods. Model development is another direction for further work. A model in which the discrete niches are allowed to develop stochastically together with the continuous traits would be helpful. If the niches represent an environment, one can assume causality and first estimate the niches and then conditional on them the model parameters, but if the niches are intrinsic properties of the species it is more correct to model the evolution of the niches jointly with the traits adapting to them. However, we are yet lacking a tool to jointly model the evolution of categorical and continuous variables. It would also be good to integrate both molecular, trait and environmental data in one estimation procedure. The studies of Huelsenbeck and Rannala (2003) and Lemey et al. (2010) addressed this problem but assumed a Brownian-motion model of evolution (see also Huelsenbeck et al., 2000) whilst a methodology for a more general class of stochastic processes would be needed. A possible direction is to use conditioned branching process models for the phylogeny and combine them with the stochastic models of phenotype evolution. There has been a lot of progress concerning conditioned branching process in the last decade due to amongst others Popovic (2004); Aldous and Popovic (2005); Gernhard (2008a, 2008b); Mooers et al. (2012); Stadler and Steel (2012). A less mathematical discussion on this has been presented by Aldous et al. (2011). This combination of phylogeny and evolutionary models is studied by Sagitov and Bartoszek (2012) for the Brownian-motion model. More complex models for traits are being worked on. An Approximate Bayesian Computation approach in a similar setting with an incomplete phylogeny has been proposed by Slater et al. (2011). Our work provides an open-source tool for the phylogenetic comparative methods community that allows the study of traits that are jointly responding and adapting to their surrounding environments or other (randomly) changing processes. A large suite of questions about the trait interactions can be phrased in the framework of our models and should therefore allow users to ‘‘escape from flatland’’ (Walsh, 2007).

212

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

Acknowledgments The simulations were performed on resources provided by the Swedish National Infrastructure for Computing (SNIC) at C3SE. We would like to acknowledge Rob Freckleton, Olle Nerman, Trond Reitan, Serik Sagitov and Franciska Steinhoff for many helpful comments and insights. Krzysztof Bartoszek was supported by the Centre for Theoretical Biology at the University of ¨ Vetenskaplig Forskning och Utbildning Gothenburg, Stiftelsen for i Matematik (Foundation for Scientific Research and Education in Mathematics), Knut and Alice Wallenbergs travel fund, Paul and Marie Berghaus fund, the Royal Swedish Academy of Sciences, and Wilhelm and Martina Lundgrens research fund.

Appendix A. Matrix exponential To be able to work with the stochastic differential equation models from Eqs. (1) and (4) we need to be able to compute eAt Rt T and 0 eAv WeA v dv (W is some matrix). Both these problems are numerically difficult (Van Loan, 1977, 1978; Moler and Van Loan, 2003). However, if we can do A’s eigendecomposition we can get analytical formulae, denoting P to be the matrix whose columns are A’s eigenvectors and l1 , . . . , lk to be the eigenvalues (throughout this work we assume A to be invertible so all its eigenvalues are nonzero), eAt ¼ P diagðel1 t , . . . ,elk t ÞP1 , !   Z t T 1 evA WevA dv ¼ P ð1eðli þ lj Þt Þ  P1 WPT PT , li þ lj 0 1 r i,j r k ðA:1Þ ! ! where diag( l ) is a diagonal matrix with the vector l on the diagonal. In particular if A is symmetric positive definite then P1 ¼ PT . In general the eigendecomposition of a matrix could be ! numerically difficult, but if the number of ‘‘response’’ traits ( Y traits) is not overly large this should not be a problem. To work with the model of Eq. (4) we need to exponentiate a special type of matrix and this exponentiation is, #     " At A B ðeAt IÞA1 B e exp  ðA:2Þ t ¼ 0 0 0 I

Appendix B. Stochastic differential equation models The stochastic differential equation models from Eqs. (1) and (4) are called Ornstein–Uhlenbeck models. However in the mathematical literature they are also referred to as Vaˇsicˇek processes (Vaˇsicˇek, 1977). We derive here the mean and covariance of the implemented stochastic differential equation models, Eqs. (1) and (4). We ! assume, as in the implementation that C ðtÞ is a step functions. The solution to Eq. (1) is the process (Gardiner, 2004), Z t Z t ! ! ! ! Z ðtÞ ¼ eFt Z ð0Þ þ eFðtvÞ F C ðtÞ dv þ eFðtvÞ Rd W ðvÞ, 0

m X

t 0

ðB:3Þ We now turn to analyzing the properties of the stochastic differential equation of Eq. (4). We drop the assumptions from the main text that Rxy ¼ 0 and Ryx ¼ 0 and Eq. (4) becomes, 0 2 ! 31 " # "!#   "!# Ryy Ryx ! A B c ðtÞ 5A Y Y @ 4 dt þ d ! ðtÞ ¼  d W ðtÞ: ! ðtÞ ! Rxy Rxx 0 0 X X 0 ðB:4Þ It is known (Gardiner, 2004) that the solution to this stochastic differential equation system is " !#    " ! # A B Y Y ðtÞ ¼ exp  t ! ! ð0Þ 0 0 X X 2 3      ! Z t A B A B 4 c 5ðvÞ dv exp  ðtvÞ þ ! 0 0 0 0 0 0 " #     R Z t Ryx ! yy A B exp  þ d W ðvÞ: ðB:5Þ ðtvÞ Rxy Rxx 0 0 0 ! ! The mean-value function is ðeAt IÞA1 B X ð0Þ þ eAt ð Y ð0Þ þ ! ! ! ! Pm At c At c j e j1 Þ c Þ for Y ðtÞ and X ð0Þ for X ðtÞ, and the j j ¼ 1 ðe covariance function at time t is made up of, Z t T ! eAv R11 eA v dv Var½ Y ðtÞ ¼ 0 Z t T þ eAv A1 BR21 eA v dv 0

Z

Ft c

ðe

j

Ft c

e

j1

T

eFv RRT eF v dv,

t

þ

T

eAv R12 BT AT eA

v

dv

0

þ

t

T

eAv A1 BR22 BT AT eA

v

dv

0

!

ÞC j,

T

A1 BðR21 þ R22 BT AT ÞAT ðIeA t Þ

j¼1

Z

0

Z

whose mean and covariance are,

! Var½ Z ðtÞ ¼

where the time points 0 ¼ t c0 ot c1 r t c2 r    rt cm ¼ t represent ! the boundaries of the different regimes (time intervals when C ðtÞ ! is constant along the given lineage, equalling C j ). We denote by s1 and s2 two species of interest, t a12 their time of divergence, t1 the time of species 1 and t2 the time of species 2 (see Fig. B1 for meaning of different time symbols). With this notation the between-species covariance is, Z t a  12 ! ! T T eFv RRT eF v dv eF ðt2 ta12 Þ : Cov½ Z s1 , Z s2  ¼ eFðt1 ta12 Þ

0

ðB:1Þ

! ! E½ Z ðtÞ ¼ eFt Z ð0ÞeFt

Fig. B1. Illustration of different time components on the phylogeny.

ðIeAt ÞA1 ðR12 þ A1 BR22 ÞBT AT ðB:2Þ

þtA1 BR22 BT AT ,

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

!! Cov½ Y , X ðtÞ ¼ ðIeAt ÞA1 ðR12 þ A1 BR22 ÞtA1 BR22 , ! Var½ X ðtÞ ¼ t R22 , where "

Ryy Rxy

Ryx Rxx

#"

Ryy Ryx Rxy Rxx

#T

"

R11 ¼: R21

ðB:6Þ

Appendix C. Phylogenetic regression setup

0

t a12

Av

e

þ Z

T

A1 BR21 eA

v

dv

0 t a12

T

eAv R12 BT AT eA

þ 0 Z ta 12

þ

! ! We can see that the limiting covariance between Y s1 and Y s2 is ! actually the covariance of the optimal process for Y , discussed in Section 2.4.2, at their time of divergence.

#

R12 : R22

The between-species covariances are equal to, Z t a 12 ! ! T eAv R11 eA v dv Cov½ Y s1 , Y s2  ¼ eAðt1 ta12 Þ Z

213

v

dv

 T T eAv A1 BR22 BT AT eA v dv eA ðt2 ta12 Þ

0

eAðt1 ta12 Þ ðIeAta12 ÞA1 ðR12 þ A1 BR22 ÞBT AT T

A1 BðR21 þ R22 BT AT ÞAT ðIeA

ta12

T

ÞeA

ðt2 ta12 Þ

þ t a12 A1 BR22 BT AT , ! ! Cov½ Y s1 , X s2  ¼ eAðt1 ta12 Þ ðIeAta12 ÞðR12 þ A1 BR22 Þ t a12 A1 BR22 , ! ! Cov½ X s1 , X s2  ¼ t a12 R22 :

ðB:7Þ

In Section 2.4.2 we defined the regression parameters as composite statistics of the underlying stochastic-differentialequation model parameters. To estimate parameters linearly entering the mean structure, a generalized least squares (GLS) procedure is commonly used. For this one needs to build a design matrix in which the i-th row usually is the expected value of the ith response conditional on the i-th predictors. In comparative studies this is not always a correct approach because the predictors and residuals can be dependent on each other and between observations. If we denote the combined vector of all of ! the response observations from all the species by R and the combined vector of all the predictor observations from all the ! species by U then we have to get the design matrix from the equation for the conditional expectation (in the normal distribu!! ! !! ! ! ! tion case), E½ R 9 U  ¼ E½ R  þCov½ R , U Var½ U 1 ð U E½ U Þ. If the predictors are independent amongst species, the above formula will result in the same design matrix as if it was filled in row by row. For illustration, we will use the example of the OUBM model with no fixed effects, a single predictor variable and assuming for simplicity that Yð0Þ ¼ Xð0Þ ¼ 0, dYðtÞ ¼ aðYðtÞb  XðtÞÞ dt þ sy dW y ðtÞ, dXðtÞ ¼ sx dW x ðtÞ:

B.1. Asymptotic properties If the F and A matrices have positive real part eigenvalues one can derive biologically relevant asymptotic proprieties. Assuming ! ! that from some time point t0 for t 4 t 0 C ðtÞ  C is constant, then the Ornstein–Uhlenbeck stochastic process has a stationary multivariate normal distribution with moments, ! t-1! E½ Z ðtÞ! C , ! t-1 Var½ Z ðtÞ! P



1

li þ lj



!  P1 RRT PT PT ,

ðB:8Þ

1 r i,j r k

where P is the matrix of eigenvectors of F and l1 , . . . , lk are its eigenvalues. The covariance between different species on the phylogeny tends to 0. In the case of Eq. (4) we do not have a stationary distribution as the Brownian-motion predictor’s variance escapes to infinity ! ! but assuming that c ðtÞ  c is constant for t 4 t 0 then the process ! ! ! ! Y ðA1 BÞ X  c tends in distribution to a mean 0 normal random variable with covariance   1  ðP1 ðR11 þ R12 BT AT þA1 BR21 P li þ lj 1 r i,j r k  þ A1 BR22 BT AT ÞPT Þ PT , ðB:9Þ where P is the matrix of eigenvectors of A and l1 , . . . , lk are its eigenvalues. Unlike in the Ornstein–Uhlenbeck model the covariance between traits in different species after divergence does not disappear but tends to a constant, due to the remaining covar! iance from the driving Brownian-motion X variables, ! ! t-1 Cov½ Y s1 , Y s2 ðtÞ! t a12 A1 BR22 BT AT , ! ! t-1 Cov½ Y s1 , X s2 ðtÞ! t a12 A1 BR22 , ! ! Cov½ X s1 , X s2 ðtÞ ¼ t a12 R22 :

ðC:1Þ

This gives the conditional expectation (see Hansen et al., 2008 for details)   1eat XðtÞ  b: ðC:2Þ E½Y9XðtÞ ¼ 1 at But ! !! ðC:3Þ E½ R 9 U  ¼ KT1 U  b, ! T ! T where R ¼ ½Y s1 , . . . ,Y sn  , U ¼ ½X s1 , . . . ,X sn  , T is the n by n matrix of divergence times of the tip species with t aij (see Fig. B1) being the i-th, j-th entry and K is an n by n matrix whose i, j entry equals: t aij ð1ðexpðaðt i t aij ÞÞexpðat i ÞÞ=ða  t aij ÞÞ,

ðC:4Þ

where ti is the time since the start of the tree till species i (see Fig. B1). In the case of multiple-predictors-multiple-responses mvOUBM model, the formula for the conditional expectation is ! ! even more complicated. Dropping the assumption that X ð0Þ is 0 ! and assuming that Y ð0Þ is at its optimal value, see Eq. (D.1), the part corresponding to the i-th species equals (Q  A1 BÞ, ! ! ! ! ! E½ Y si 9 U  ¼ Cð c ðtÞ,AÞ þ CðA,T, X ð0Þ, U ÞvecðQ T Þ,

ðC:5Þ

T

the matrix in front of vecðQ Þ equals, !

!T

!

CðA,T, X ð0Þ, U Þ ¼ ðDi  X ð0ÞÞ þ

n X j¼1

Dij 

n X

! ! ! t ajr ð X sr  X ð0ÞÞT ,

r¼1

ðC:6Þ Aðt i taij Þ

Ati

At i

1

Þ, Dij :¼ ðt aij Iðe e ÞA Þ and t ajr is where Di :¼ ðIe ! entry j, r of T1 . The part depending on the fixed effects, c ðtÞ, is, m X ! ! ! ðeAtcl eAtcl1 Þ c l , Cð c ðtÞ,AÞ ¼ eAti c ð0Þ þ eAti l¼1

ðB:10Þ

see Appendix B for notation related to the fixed effects.

ðC:7Þ

214

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

Fig. E1. An illustration of Simpson’s paradox in a phylogenetic comparative data setting. Simulated data which has a negative correlation inside all three clusters of species but a positive correlation between all species. The data are simulated under the Ornstein–Uhlenbeck model with parameters such that the correlation between the traits at time 0.1 (about the height of the three clusters) is  0.973 while at time 1 (the tree height) it is 0.883. In panel B we can see that the data overall have a positive correlation but inside the three clusters they have a negative relationship.

Appendix D. Value at root The estimation of the ancestral value of the response is known to be a difficult problem, e.g. Labra et al. (2009), and in the latest version of OUCH the ancestral trait value is not returned (see e.g. R-sig-phylo Digest Vol 21(9) 2009 for a brief overview of this). In a recent theoretical paper, Ane´ (2008) showed that the estimation of the ancestral state is inconsistent and tends to a random variable as the sample size goes to infinity. The same is shown by Sagitov and Bartoszek (2012) for the Yule tree model of tree growth. Due to the known difficulties of estimating the ancestral state, we follow Labra et al. (2009) in assuming that the ancestral state was at its optimum value, ! ! ! Y ð0Þ ¼ c ð0ÞA1 B X ð0Þ:

is negative being at the optimum for one variable could be detrimental for the other. The covariance function of the discussed here models can result in patterns that may seem counterintuitive e.g. Simpson’s paradox, see Fig. E1. Our considered here family of stochastic processes is rich enough to include in it that relations between closely related species can be drastically different from distantly related ones, which is often observed in reality.

Appendix F. Supplementary data Supplementary data associated with this article can be found in the online version at http://dx.doi.org.10.1016/j.jtbi.2012.08.005.

ðD:1Þ

If the eigenvalues of the drift matrix have positive real parts, then a theoretical possibility to deal with the problem of the ancestral state, is to extend the root branch back in time to minus infinity. The will have the effect of losing the influence of the ancestral state.

Appendix E. Covariance structures The covariance function of the stochastic process describes how it fluctuates around its mean value. Apart from the covariance function of the traits, we are also interested in the covariance of the limiting and optimal processes for the response variables, the conditional covariance of the response variables on the predictors i.e. the covariance of the residual process ! ! ! Y ðtÞE½ Y ðtÞ9 X ðtÞ at a given time instance and in the limit, and the covariance between the response variables and the predictor process. The covariance between the responses and their optimum process indicates whether the responses are actually tracking the optimum (positive value) or not (negative value). If the covariance between a response and its optimum value process is negative, the ‘‘optimum’’ is not in reality something optimal for the process but something the process is ‘‘trying to escape from’’. Another interesting value is the covariance between a response and the optimal process for another response variable. If it is positive the two variables are supporting each others’ approach. However if it

References Aldous, D., Popovic, L., 2005. A critical branching process model for biodiversity. Adv. Appl. Prob. 37, 1094–1115. Aldous, D.J., Krikun, M.A., Popovic, L., 2011. Five statistical questions about the tree of life. Syst. Biol. 60, 318–328. Ane´, C., 2008. Analysis of comparative data with hierarchical autocorrelation. Ann. Appl. Stat. 2, 1078–1102. Armbruster, W.S., 1990. Estimating and testing the shapes of adaptive surfaces: the morphology and pollination of Dalechampia blossoms. Am. Nat. 135, 14–31. Armbruster, W.S., Pe´labon, C., Hansen, T.F., Bolstad, G.H., 2009. Macroevolutionary patterns of pollination accuracy: a comparison of three genera. New Phytol. 183, 600–617. Arnold, S.J., Pfrender, M.E., Jones, A.G., 2001. The adaptive landscape as a conceptual bridge between micro- and macroevolution. Genetica 112-113, 9–32. von-Bertalanffy, L., Pirozynski, W.J., 1952. Ontogenetic and evolutionary allometry. Evolution 6, 387–392. Boettiger, C., Coop, G., Ralph, P., 2012. Is your phylogeny informative? Measuring the power of comparative methods. Evolution 66, 2240–2251. Butler, M.A., King, A.A., 2004. Phylogenetic comparative analysis: a modelling approach for adaptive evolution. Am. Nat. 164, 683–695. Carroll, R.J., Ruppert, D., 1987. Transformation and Weighting in Regression. Chapman and Hall, New York. Cheverud, J.M., 1982. Relationships among ontogenetic, static and evolutionary allometry. Am. J. Phys. Anthropol. 59, 129–149. Cooper, N., Jetz, W., Freckleton, R.P., 2010. Phylogenetic comparative approaches for studying niche conservativism. J. Evol. Biol. 23, 2529–2539. Eastman, J.M., Alfaro, M.E., Joyce, P., Hipp, A.L., Harmon, L.J., 2011. A novel comparative method for identifying shifts in the rate of character evolution on trees. Evolution 65, 3578–3589. Felsenstein, J., 1985. Phylogenies and the comparative method. Am. Nat. 125, 1–15.

K. Bartoszek et al. / Journal of Theoretical Biology 314 (2012) 204–215

Felsenstein, J., 1988. Phylogenies and quantitative characters. Annu. Rev. Ecol. Syst. 19, 445–471. Ferna´ndez, M.H., Vrba, E.S., 2004. A complete estimate of the phylogenetic relationships in Ruminantia: a dated species-level supertree of the extant ruminants. Biol. Rev. 80, 269–302. Fitch, W.M., 1971. Toward defining the course of evolution: minimum change for a specific tree topology. Syst. Zool. 20, 406–416. Fry, J.D., 1993. The ‘‘general vigor’’ problem: can antagonistic pleiotropy be detected when genetic covariances are positive?. Evolution 47, 327–333. Gardiner, C.W., 2004. Handbook of Stochastic Methods. Springer, Berlin. Gernhard, T., 2008a. The conditioned reconstructed process. J. Theor. Biol. 253, 769–778. Gernhard, T., 2008b. New analytic results for speciation times in neutral models. Bull. Math. Biol. 70, 1082–1097. Hadfield, J.D., Nakagawa, S., 2010. General quantitative genetic methods for comparative biology: phylogenies, taxonomies and multi-trait models for continuous and categorical characters. J. Evol. Biol. 23, 494–508. Hansen, T.F., 1997. Stabilizing selection and the comparative analysis of adaptation. Evolution 51, 1341–1351. Hansen, T.F., 2012. Adaptive landscapes and macroevolutionary dynamics, in: Svensson, E.I., Calsbeek, R. (Eds.), The adaptive landscape in evolutionary biology. Oxford University Press, pp. 205–226. Hansen, T.F., Bartoszek, K., 2012. Interpreting the evolutionary regression: the interplay between observational and biological errors in phylogenetic comparative studies. Syst. Biol. 61, 413–425. Hansen, T.F., Martins, E.P., 1996. Translating between microevolutionary process and macroevolutionary patterns: the correlation structure of interspecific data. Evolution 50, 1404–1417. Hansen, T.F., Orzack, S.H., 2005. Assessing current adaptation and phylogenetic inertia as explanations of trait evolution: the need for controlled comparisons. Evolution 59, 2063–2072. Hansen, T.F., Pienaar, J., Orzack, S.H., 2008. A comparative method for studying adaptation to a randomly evolving environment. Evolution 62, 1965–1977. Harmon, L.J., Weir, J.T., Brock, C.D., Glor, R.E., Challenger, W., 2008. GEIGER: investigating evolutionary radiations. Bioinformatics 24, 129–131. Hohenlohe, P.A., Arnold, S.A., 2008. MiPoD: a hypothesis-testing framework for microevolutionary inference from patterns of divergence. Am. Nat. 171, 366–385. Houle, D., 1991. Genetic covariance of fitness correlates: what genetic correlations are made of and why it matters. Evolution 45, 630–648. Houle, D., 2007. A dispatch from the multivariate frontier. J. Evol. Biol. 20, 22–23. Huelsenbeck, J., Rannala, B., Masly, J., 2000. Accommodating phylogenetic uncertainty in evolutionary studies. Science 88, 2349–2350. Huelsenbeck, J.P., Rannala, B., 2003. Detecting correlation between characters in a comparative analysis with uncertain phylogeny. Evolution 57, 1237–1247. Huxley, J.S., 1932. Problem of Relative Growth. Methuen, London. Jhwueng, D.C., Maroulas, V., 2011. Phylogenetic Ornstein–Uhlenbeck regression curves. ArXiv e-prints 1108.5364. Kembel, S.W., Cowan, P.D., Helmus, M.R., Cornwell, W.K., Morlon, H., Ackerl, D.D., Blomberg, S.P., Webb, C.O., 2010. Picante: R tools for integrating phylogenies and ecology. Bioinformatics 26, 1463–1464. Labra, A., Pienaar, J., Hansen, T.F., 2009. Evolution of thermal physiology in Liolaemus lizards: adaptation, phylogenetic inertia, and niche tracking. Am. Nat. 174, 204–220. Lajeunesse, M.J., 2009. Meta-analysis and the comparative phylogenetic method. Am. Nat. 174, 369–381. Lande, R., 1976. Natural selection and random genetic drift in phenotypic evolution. Evolution 30, 314–334. Lande, R., 1979. Quantitative genetic analysis of multivariate evolution, applied to brain: body size allometry. Evolution 33, 402–416. Lande, R., Arnold, S.J., 1983. The measurement of selection on correlated characters. Evolution 37, 1210–1226. Lemey, P., Rambaut, A., Welch, J.J., Suchard, M.A., 2010. Phylogeography takes a relaxed random walk in continuous space and time. Mol. Biol. Evol. 27, 1877–1885. Martins, E.P., 2004. COMPARE, version 4.6b. Computer programs for the statistical analysis of comparative data. Distributed by the author at /http://compare. bio.indiana.edu/S Department of Biology Indiana University, Bloomington, IN.

215

Martins, E.P., Diniz-Filho, J.A.F., Housworth, E.A., 2002. Adaptive constraints and the phylogenetic comparative method: a computer simulation test. Evolution 56, 1–13. Martins, E.P., Hansen, T.F., 1997. Phylogenies and the comparative method: a general approach to incorporating phylogenetic information into the analysis of interspecific data. Am. Nat. 149, 646–667. Moler, C., Van Loan, C.F., 2003. Nineteen dubious ways to compute the exponential of a matrix, twenty-five years later. SIAM Rev. 45, 1–37. Mooers, A., Gascuel, O., Stadler, T., Li, H., Steel, M., 2012. Branch lengths on birth– death trees and the expected loss of phylogenetic diversity. Syst. Biol. 61, 195–203. van Noordwijk, A.J., de Jong, G., 1986. Acquisition and allocation of resources: their influence on variation in life history tactics. Am. Nat. 128, 137–142. O’Meara, B.C., Ane´, C., Sanderson, M.J., Wainwright, P.C., 2006. Testing for different rates of continuous trait evolution using likelihood. Evolution 60, 922–933. Paradis, E., Claude, J., Strimmer, K., 2004. Ape: analyses of phylogenetics and evolution in R language. Bioinformatics 20, 289–290. Plard, F., Bonenfant, C., Gaillard, J.M., 2011. Revisiting the allometry of antlers among deer species: male–male sexual competition as a driver. Oikos 120, 601–606. Popovic, L., 2004. Asymptotic genealogy of a critical branching process. Ann. Appl. Probab. 14, 2120–2148. R Development Core Team, 2010. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. Ranta, E., Laurila, A., Elmberg, J., 1994. Reinventing the wheel: analysis of sexual dimorphism in body size. Oikos 70, 313–321. Reitan, T., Schweder, T., Henderiks, J. Phenotypic evolution studied by layered stochastic differential equations. Ann. Appl. Stat., in press www.e-publications.org/ims/ submission/index.php/AOAS/user/submissionFile/12293?confirm=a83f5019. Revell, L.J., 2009. Size-correction and principal components for interspecific comparative studies. Evolution 63, 3258–3268. Revell, L.J., Collar, D.C., 2009. Phylogenetic analysis of the evolutionary correlation using likelihood. Evolution 63, 1090–1100. Revell, L.J., Harmon, L.J., 2008. Testing quantitative genetic hypotheses about the evolutionary rate matrix for continuous characters. Evol. Ecol. Res. 10, 311–331. Roper, M., Pepper, R.E., Brenner, M.P., Pringle, A., 2008. Explosively launched spores of ascomycete fungi have drag–minimizing shapes. Proc. Natl. Acad. Sci. U.S.A. 105, 20583–20588. Sagitov, S., Bartoszek, K., 2012. Interspecies correlation for neutrally evolving traits. J. Theor. Biol., 309, 11–19. Savageau, M.A., 1979. Allometric morphogenesis of complex systems: derivation of the basic equations from first principles. Proc. Natl. Acad. Sci. U.S.A. 76, 6023–6025. Simpson, G.G., 1944. Tempo and Mode in Evolution. Columbia University Press, New York. Slater, G.J., Harmon, L.J., Wegmann, D., Joyce, P., Revell, L.J., Alfaro, M.E., 2011. Estimation of the branch points of a branching diffusion process. Evolution 66, 752–762. Stadler, T., Steel, M., 2012. Distribution of branch lengths and phylogenetic diversity under homogeneous speciation models. J. Theor. Biol. 297, 33–40. Svensson, E.I., Calsbeek, R. (Eds.), 2012. The Adaptive Landscape in Evolutionary Biology. Oxford University Press. Thomas, G.H., Freckleton, R.P., Szekely, T., 2006. Comparative analysis of the influence of developmental mode on phenotypic diversification rates in shorebirds. Proc. R. Soc. B 273, 1619–1624. Van Loan, C.F., 1977. The sensitivity of the matrix exponential. SIAM J. Numer. Anal. 14, 971–981. Van Loan, C.F., 1978. Computing integrals involving the matrix exponential. IEEE T. Automat. Contr. 23, 395–404. Vaˇsicˇek, O., 1977. An equilibrium characterisation of the term structure. J. Financ. Econ. 5, 177–188. Voje, K.L., Hansen, T.F., 2012. Evolution of static allometries: Adaptive change in allometric slopes of eye span in stalk–eyed flies. Evolution, in press. Walsh, B., 2007. Escape from flatland. J. Evol. Biol. 20, 36–38. Walsh, B., Blows, M.W., 2009. Abundant genetic variationþ strong selection¼ multivariate genetic constraints: a geometric view of adaptation. Ann. Rev. Ecol. Evol. Syst. 40, 41–59.