Infection, Genetics and Evolution 7 (2007) 651–655 www.elsevier.com/locate/meegid
Short communication
Evolutionary dynamics and spatial genetic structure of epizootic hemorrhagic disease virus in the eastern United States
Roman Biek *
Department of Biology and Center for Disease Ecology, 1510 Clifton Road, Emory University, Atlanta, GA 30322, USA
Received 7 July 2006; received in revised form 20 April 2007; accepted 25 April 2007
Available online 1 May 2007
Abstract
Epizootic hemorrhagic disease virus (EHDV) is a significant pathogen of wild and domestic ungulates worldwide. In North America, serotype EHDV-2 is responsible for the majority of outbreaks, which are most commonly observed in white-tailed deer. A recent study by Murphy et al. [Murphy, M.D., Howerth, E.W., MacLachlan, N.J., Stallknecht, D.E., 2005. Genetic variation among epizootic hemorrhagic disease viruses in the southeastern United States: 1978–2001. Infect. Genet. Evol. 5, 157–165] examined the genetic relationships of EHDV-2 sequences from outbreaks across the eastern United States for evidence of temporal and spatial structure but found no evidence for either. Here, I present results of further examination of the same data using additional types of analysis. Contrary to the earlier assessment, I find that for outbreaks observed within the same year, genetic and spatial distances are in fact positively correlated and that the virus is evolving at a rate similar to that seen in other vector-borne RNA viruses. Estimates of demographic history further revealed that population sizes of the virus had remained relatively stable over most of its history. A noticeable exception to this trend was a recent demographic bottleneck, possibly associated with a selective sweep, that affected one of the two viral genes examined. These results demonstrate that genetic variation accumulating at selectively neutral and measurably evolving sites in the EHDV-2 genome can be employed to gain insights into the spatial and temporal dynamics of this viral pathogen.
© 2007 Elsevier B.V. All rights reserved.
Keywords: Epizootic hemorrhagic disease virus; Virus evolution; Selection; Genetic structure; Phylodynamics; White-tailed deer; Orbivirus; Arbovirus; Culicoides
* Tel.: +1 404 727 9516; fax: +1 404 727 9516. E-mail address: [email protected].
1567-1348/$ – see front matter © 2007 Elsevier B.V. All rights reserved. doi:10.1016/j.meegid.2007.04.005
1. Introduction
Epizootic hemorrhagic disease virus (EHDV) and bluetongue virus (BTV) are two closely related, segmented RNA viruses in the family Reoviridae that can cause a severe hemorrhagic disease in ungulates. In North America, white-tailed deer (Odocoileus virginianus) is the most commonly affected host species. Disease symptoms can range from mild to severe debilitation and death, and there is evidence that pathogenicity and disease prevalence differ among geographic areas (Davidson and Doster, 1997). Whereas many serotypes of EHDV and BTV can be distinguished worldwide, most cases observed in white-tailed deer can be attributed to serotype EHDV-2 (Stallknecht et al., 1995; Davidson and Doster, 1997). Transmission of the virus occurs through blood-sucking insects; in North America, biting midges of the genus Culicoides are considered the most important vectors (Howerth et al., 2001).
In recent decades, EHDV-2 cases have been documented from many locations throughout North America. This raises the questions of how the virus is maintained and how its populations are spatially structured. One possible scenario is the existence of a number of endemic foci, which rarely exchange virus and thus evolve more or less independently of each other. An alternative scenario is that of disease foci being connected through frequent host and/or vector movement. To distinguish between these two possibilities, Murphy et al. (2005) recently conducted a phylogenetic analysis of two EHDV-2 genes, based on samples collected throughout the eastern United States over three decades. Their results revealed that closely related genotypes were widely distributed in space and time. Based on this finding, the authors concluded that frequent large-scale dispersal of virus must prevent the occurrence of spatially restricted viral lineages. In addition, Murphy et al. took their results as evidence that EHDV-2 had failed to evolve appreciably over the time period
covered by their samples and that there was a low probability of outbreaks being caused by newly emerging variants. However, these conclusions were not based on formal estimation or statistical testing. In recent years, a variety of likelihood-based methods have become available to perform these types of analysis. The objective of the current paper was therefore to reexamine the data presented by Murphy et al. (2005) and to explicitly test whether EHDV-2 in the eastern US is indeed (1) not showing evidence for positive selection; (2) not exhibiting any spatial structure; and (3) not evolving measurably. A further objective was to gain insights into the demographic history of EHDV-2 using coalescent methods.
2. Materials and methods
2.1. Sequence data and substitution models
EHDV-2 sequences as reported by Murphy et al. (2005) were obtained from GenBank. The data set consisted of 48 partial NS3 sequences (viral segment S10), 745 base pairs (bp) in length, along with the same number of partial VP2 sequences (segment L2, 681 bp) generated from samples collected throughout the eastern United States between 1978 and 2001. Previously described sequences from a specimen collected during 1962 in Alberta, Canada (GenBank accession numbers L29022 and L33818), were also included. A partition homogeneity test was conducted in PAUP* (Swofford, 2003) to assess whether data for NS3 and VP2 could be combined. The two genes yielded significantly incongruent phylogenies (P = 0.010), possibly as a result of reassortment between virus gene segments (Samal et al., 1987a,b; Stott et al., 1987). Thus, all analyses were carried out separately for each gene. Suitable substitution models were identified based on Akaike's Information Criterion corrected for sample size (AICc) and model averaging in program MODELTEST 3.6 (Posada and Crandall, 1998; Posada and Buckley, 2004).
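To make the model-selection criterion concrete, the following Python sketch computes AICc and Akaike weights for a small set of candidate substitution models. The log-likelihoods and parameter counts are invented for illustration and are not values from this study; taking the sample size n to be the alignment length is likewise an assumption, and MODELTEST's own implementation may differ in detail.

```python
import math


def aicc(log_likelihood, k, n):
    """AIC with small-sample correction: -2 lnL + 2k + 2k(k + 1) / (n - k - 1)."""
    if n - k - 1 <= 0:
        raise ValueError("sample size too small for the AICc correction")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def akaike_weights(scores):
    """Convert a dict of AICc scores into relative model weights summing to 1."""
    best = min(scores.values())
    rel = {name: math.exp(-0.5 * (s - best)) for name, s in scores.items()}
    total = sum(rel.values())
    return {name: r / total for name, r in rel.items()}


# Hypothetical candidates: name -> (log-likelihood, number of free parameters).
# These numbers are made up for the example only.
candidates = {
    "JC": (-2150.0, 0),
    "HKY+G": (-2050.0, 5),
    "GTR+G": (-2048.5, 9),
}
n_sites = 745  # assumed sample size: alignment length of the NS3 data set

scores = {name: aicc(lnl, k, n_sites) for name, (lnl, k) in candidates.items()}
weights = akaike_weights(scores)
best_model = min(scores, key=scores.get)

for name in sorted(scores, key=scores.get):
    print(f"{name:6s} AICc = {scores[name]:9.3f}  weight = {weights[name]:.4f}")
```

The Akaike weights are what model averaging would use to weight parameter estimates across candidate models instead of relying on the single best-scoring model.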
Models selected were HKY + G for NS3 and TrN + G for VP2 (parameter values available from the author).
2.2. Selection patterns
Sequences were examined for evidence of selection acting on particular amino acid sites based on the nonsynonymous/synonymous substitution rate ratio (ω = dN/dS), whereby ω > 1 indicates positive selection (Yang and Bielawski, 2000). Analysis was performed in program CODEML, which is part of the PAML package (Yang, 1997). According to the program's recommendations, four different models of increasing complexity were fitted to the data: M0 (one ratio), M1 (nearly neutral), M2 (positive selection), and M3 (discrete with three classes of ω). Relative model fit was assessed using likelihood ratio testing.
2.3. Spatial genetic structure
A global test for spatial genetic structure in EHDV-2 was conducted based on 43 samples. The analysis excluded the
1962 Alberta sequence because of the exceptionally large spatial distance from all other samples, as well as five sequences that fell in a separate genetic clade (Clade B, Murphy et al., 2005) that was >10% divergent from all remaining sequences (Clade A). Centroids for the state or county of origin were used to obtain spatial coordinates for each sample. Pairwise genetic distances between all Clade A sequences were calculated in PAUP* under the ML model selected for each data set and compared with the geographic distances between sample locations. To test for random distribution of genotypes, a Mantel test was performed using the statistical package R (http://www.r-project.org/) by keeping the spatial distance matrix constant and permuting the genetic distance matrix 5000 times. The observed value was then compared with the distribution generated from the permutations. In a second analysis, spatial distance values were grouped into seven categories with 200 km increments and plotted against the average genetic distance within each category. For this analysis, pairwise genetic distances among viruses were split into three groups depending on the time difference between samples (same year, 1–5 years apart, >5 years apart).
2.4. Estimating evolutionary rates and demographic history
Rates of evolution and demographic history for both EHDV genes were estimated using a Bayesian coalescent approach based on Markov Chain Monte Carlo (MCMC) sampling (Drummond et al., 2002), as implemented in program BEAST (Drummond and Rambaut, 2003). In this approach, differences in sampling time, in the current study ranging from 1962 to 2001, are used to scale branch length estimates according to a 'single rate dated tips' model (Rambaut, 2000) to obtain the expected rate of genetic change per unit time. As a measure of estimate uncertainty, the program returns the 95% highest posterior density (HPD) interval.
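For readers unfamiliar with HPD intervals, the sketch below shows one common way to obtain such an interval from MCMC output: it finds the shortest interval that contains the requested fraction of the sorted samples. This is an illustrative Python sketch, not the routine used by BEAST, and the "posterior samples" are synthetic draws generated only for the example.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples.

    Assumes a unimodal posterior; for multimodal posteriors the true HPD
    region may consist of several disjoint intervals.
    """
    if not samples:
        raise ValueError("need at least one sample")
    xs = sorted(samples)
    n = len(xs)
    k = max(1, math.ceil(mass * n))  # number of samples the interval must cover
    best_lo, best_hi = xs[0], xs[k - 1]
    # Slide a window of k consecutive sorted samples and keep the narrowest one.
    for i in range(1, n - k + 1):
        lo, hi = xs[i], xs[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi


# Synthetic stand-in for post-burn-in samples of a substitution rate
# (substitutions/site/year); these are NOT values from the actual analysis.
rng = random.Random(42)
rate_samples = [rng.gauss(4.75e-4, 0.9e-4) for _ in range(1000)]

lo, hi = hpd_interval(rate_samples, mass=0.95)
print(f"95% HPD: {lo:.2e} to {hi:.2e}")
```

Unlike an equal-tailed credible interval, the HPD interval is not forced to cut the same probability from each tail, so for skewed posteriors it is typically narrower.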
Based on the evolutionary rate, the age of the root (or of any other node) can also be estimated under the molecular clock model. In addition, the program can produce an estimate of the demographic history of the sampled population in the form of a Bayesian skyline plot (Drummond et al., 2005). Analyses were carried out for 10 million states, with the first 1 million subsequently removed as burn-in, and MCMC samples were taken every 1000 states. Demographic estimation was based on a group size of 10.
3. Results and discussion
The current analysis statistically corroborates some of the earlier conclusions made by Murphy et al. (2005) while also providing a number of new insights into the molecular evolution and history of EHDV-2.
No evidence for positive selection of amino acid replacements was found in the data. A nearly neutral model (M1), which assumes a combination of purifying selection and neutral sites, provided a better fit than model M0 (NS3: p = 0.037, VP2: p = 0.001). Models allowing for an additional class of positively selected sites (M2 and M3) were not supported relative to M1 (all p > 0.360). According to
parameter estimates from model M1, the majority of codon sites in the viral sequences were under strong purifying selection (NS3: 97.5%; VP2: 87.6%). These results support the notion of a long-established and evolutionarily stable relationship between the virus and its deer host.
For both EHDV genes, correlation coefficients between observed genetic and spatial distances fell within the distribution generated from permuting the distance matrix (data not shown). Hence, the hypothesis that genotypes are spatially distributed at random could not be rejected, despite spatial distances of >1000 km among outbreak locations. Also, no relationship between genetic distance and spatial distance was found for viruses sampled 1–5 years (NS3: r = 0.480, p = 0.276; VP2: r = 0.426, p = 0.340) or >5 years apart (NS3: r = 0.483, p = 0.272; VP2: r = 0.396, p = 0.379; Fig. 1A). In contrast, a positive relationship between genetic and spatial distances was evident for NS3 for outbreaks within the same year (r = 0.854, p = 0.014; Fig. 1A); a similar relationship was suggested for VP2, but was not statistically significant (r = 0.458, p = 0.301; Fig. 1B). The fact that any isolation by distance was so short-lived suggests that the virus is able to spread very effectively among localities, linking outbreaks epidemiologically on large spatial scales. The mechanism facilitating such rapid dissemination is not clear but could involve movement of deer as well as of the biting midge vector. Passive wind-mediated dispersal of insect vectors has been suggested as a means for long-distance virus transmission (Sellers and Maarouf, 1991).
Fig. 1. Correlation between genetic and spatial distances among EHD viruses sampled the same year, 1–5 years, or >5 years apart. (A) NS3 gene. (B) VP2 gene.
Fig. 2. Demographic history of two EHDV genes estimated through Bayesian skyline plots. (A) NS3 gene. (B) VP2 gene. Data are shown up to the upper 95% HPD of the estimate for the age of the most recent common ancestor.
Contrary to earlier assessments, the current analysis revealed that EHDV-2 had evolved measurably over the last 40 years at an estimated rate of 4.75 (HPD: 3.07–6.61) × 10⁻⁴ substitutions/site/year in the case of NS3 and a slightly faster rate of 6.26 (HPD: 4.36–8.30) × 10⁻⁴ for VP2. These rates are similar to those reported for the closely related bluetongue virus (3.6 × 10⁻⁴) and are also consistent with rates reported for other vector-borne RNA viruses (Jenkins et al., 2002). Based on the rates calculated here, it can be inferred that the most recent common ancestor of EHDV-2 in North America may have existed only about 100 years ago (NS3: 1926, HPD: 1894–1950; VP2: 1901, HPD: 1864–1930). First descriptions of symptoms consistent with hemorrhagic disease in deer also stem from this time period (Howerth et al., 2001), raising the possibility that it may have marked the first emergence of EHDV-2 in North America. Such an emergence could be caused, for example, by a host shift or an introduction of the virus from another continent. Future studies, involving EHDV-2 sampled at a global scale, may be able to distinguish between these different scenarios.
Bayesian skyline estimates for both genes indicated that the virus had maintained relatively stable population sizes over the last 100 years (Fig. 2A and B). The only exception to this was found in the VP2 data, which suggested a temporary population
decline during the late 1980s and early 1990s (Fig. 2B, see below). That the recent demographic history of EHDV-2 was otherwise characterized by a lack of demographic fluctuations was somewhat surprising given that white-tailed deer populations throughout North America have increased dramatically over the last 100 years (McCabe and McCabe, 1997). An increase in host density could be expected to result in increased opportunities for EHDV-2 to be transmitted and thus in an expanding virus population, as seen in other wildlife-pathogen systems (Barbour and Fish, 1993; Fischer et al., 1997; Yates et al., 2002). The fact that evidence for a demographic expansion of the virus was not apparent from the genetic data suggests that EHDV-2 dynamics are limited by factors other than deer host density. The abundance of competent vectors, for example, may have a much more pronounced effect on virus population sizes.
The population decline inferred from the VP2 data during the early 1990s (Fig. 2B) was not matched by a similarly strong trend for NS3 (Fig. 2A), suggesting that these dynamics reflect the specific history of the VP2 gene rather than that of the virus as a whole. One possible explanation is that a selective sweep removed much of the genetic variation for VP2. Such a scenario would be much more likely for the VP2 protein, which presumably contains epitopes recognized by neutralizing antibodies (DeMaula et al., 2000), than for the non-structural NS3 protein. If reassortment is as frequent in EHDV as it is in the closely related bluetongue virus (Samal et al., 1987a,b; Stott et al., 1987), a selective sweep could take place without having a noticeable effect on the diversity of other viral segments, especially in the presence of high levels of viral dispersal.
Fig. 3. Phylogenetic groupings for the VP2 gene of EHDV and their spatial distribution over time. (A) Maximum a posteriori tree out of 10,000 trees sampled during the Bayesian molecular-clock based analysis in BEAST. Color-coding indicates five clearly distinct genetic groups for which ancestral nodes received 100% posterior support based on all 10,000 trees sampled. (B) Geographic distribution of genetic groups for the periods 1978–1997 and 1998–2001. Colors as in (A).
The spatial distribution of VP2 genotypes at different time points provides tentative support for this scenario. Of the five genetically distinct groups evident from the phylogenetic
analysis (Fig. 3A), all five were represented in outbreaks from 1978–1997, and as late as 1994–1997 four different groups could still be detected. Between 1998 and 2001, however, only a single genetic group was implicated in outbreaks sampled throughout the eastern United States (Fig. 3B). Such a rapid and large-scale rise to dominance of one genetic group at the expense of all others may have been driven by a selective advantage benefiting the former group. Further genetic studies, preferably including current samples and viral segments in their entire length, will be needed to verify this intriguing hypothesis.
In conclusion, the current analyses demonstrate that neutrally evolving sites in the EHDV genome accumulate appreciable amounts of genetic change over time and that these changes can provide information about the temporal and spatial dynamics of the virus. Although geographic structure is apparently lost very quickly, due to high levels of virus dispersal, rapid genetic diversification permits the use of analytical tools for the detection of demographic and spatial trends. These tools are likely to prove useful in future studies aimed at identifying the factors controlling the emergence and spread of EHDV-2 and related viruses in host and vector populations.
Acknowledgments
I thank Les Real, Lance Waller, Molly Murphy, and two anonymous reviewers for their help and suggestions.
References
Barbour, A.G., Fish, D., 1993. The biological and social phenomenon of Lyme disease. Science 260, 1610–1616.
Davidson, W.R., Doster, G.L., 1997. Health characteristics and white-tailed deer population density in the southeastern United States. In: McShea, W.J., Underwood, H.B., Rappole, J.H. (Eds.), The Science of Overabundance: Deer Ecology and Population Management. Smithsonian Institution Press, Washington, pp. 164–184.
DeMaula, C.D., Bonneau, K.R., MacLachlan, N.J., 2000. Changes in the outer capsid proteins of bluetongue virus serotype ten that abrogate neutralization by monoclonal antibodies. Virus Res. 67, 59–66.
Drummond, A.J., Nicholls, G.K., Rodrigo, A.G., Solomon, W., 2002. Estimating mutation parameters, population history and genealogy simultaneously from temporally spaced sequence data. Genetics 161, 1307–1320.
Drummond, A.J., Rambaut, A., 2003. BEAST v1.2, available from http://evolve.zoo.ox.ac.uk/beast/.
Drummond, A.J., Rambaut, A., Shapiro, B., Pybus, O.G., 2005. Bayesian coalescent inference of past population dynamics from molecular sequences. Mol. Biol. Evol. 22, 1185–1192.
Fischer, J.R., Stallknecht, D.E., Luttrell, P., Dhondt, A.A., Converse, K.A., 1997. Mycoplasmal conjunctivitis in wild songbirds: the spread of a new contagious disease in a mobile host population. Emerg. Infect. Dis. 3, 69–72.
Howerth, E.W., Stallknecht, D.E., Kirkland, P.D., 2001. Bluetongue, epizootic hemorrhagic disease, and other orbivirus-related diseases. In: Williams, E.S., Barker, I.K. (Eds.), Infectious Diseases of Wild Mammals. Iowa State University Press, Ames, IA, pp. 77–97.
Jenkins, G.M., Rambaut, A., Pybus, O.G., Holmes, E.C., 2002. Rates of molecular evolution in RNA viruses: a quantitative phylogenetic analysis. J. Mol. Evol. 54, 156–165.
McCabe, T.R., McCabe, R.E., 1997. Recounting whitetails past. In: McShea, W.J., Underwood, H.B., Rappole, J.H. (Eds.), The Science of Overabundance: Deer Ecology and Population Management. Smithsonian Institution Press, Washington, pp. 11–26.
Murphy, M.D., Howerth, E.W., MacLachlan, N.J., Stallknecht, D.E., 2005. Genetic variation among epizootic hemorrhagic disease viruses in the southeastern United States: 1978–2001. Infect. Genet. Evol. 5, 157–165.
Posada, D., Crandall, K.A., 1998. MODELTEST: testing the model of DNA substitution. Bioinformatics 14, 817–818.
Posada, D., Buckley, T., 2004. Model selection and model averaging in phylogenetics: advantages of Akaike information criterion and Bayesian approaches over likelihood ratio tests. Syst. Biol. 53, 793–808.
Rambaut, A., 2000. Estimating the rate of molecular evolution: incorporating non-contemporaneous sequences into maximum likelihood phylogenies. Bioinformatics 16, 395–399.
Samal, S.K., el-Hussein, A., Holbrook, F.R., Beaty, B.J., Ramig, R.F., 1987a. Mixed infection of Culicoides variipennis with bluetongue virus serotypes 10 and 17: evidence for high frequency reassortment in the vector. J. Gen. Virol. 68 (Pt 9), 2319–2329.
Samal, S.K., Livingston Jr., C.W., McConnell, S., Ramig, R.F., 1987b. Analysis of mixed infection of sheep with bluetongue virus serotypes 10 and 17: evidence for genetic reassortment in the vertebrate host. J. Virol. 61, 1086–1091.
Sellers, R.F., Maarouf, A.R., 1991. Possible introduction of epizootic hemorrhagic disease of deer virus (serotype 2) and bluetongue virus (serotype 11) into British Columbia in 1987 and 1988 by infected Culicoides carried on the wind. Can. J. Vet. Res. 55, 367–370.
Stallknecht, D.E., Nettles, V.F., Rollor, E.A., Howerth, E.W., 1995. Epizootic hemorrhagic-disease virus and bluetongue virus serotype distribution in white-tailed deer in Georgia. J. Wildlife Dis. 31, 331–338.
Stott, J.L., Oberst, R.D., Channell, M.B., Osburn, B.I., 1987. Genome segment reassortment between two serotypes of bluetongue virus in a natural host. J. Virol. 61, 2670–2674.
Swofford, D.L., 2003. PAUP* (4.0b10). Phylogenetic Analysis Using Parsimony (*and Other Methods). Sinauer Associates, Sunderland, MA.
Yang, Z., 1997. PAML: a program package for phylogenetic analysis by maximum likelihood. Comput. Appl. Biosci. 13, 555–556.
Yang, Z.H., Bielawski, J.P., 2000. Statistical methods for detecting molecular adaptation. Trends Ecol. Evol. 15, 496–503.
Yates, T.L., Mills, J.N., Parmenter, C.A., Ksiazek, T.G., Parmenter, R.R., Vande Castle, J.R., Calisher, C.H., Nichol, S.T., Abbott, K.D., Young, J.C., Morrison, M.L., Beaty, B.J., Dunnum, J.L., Baker, R.J., Salazar-Bravo, J., Peters, C.J., 2002. The ecology and evolutionary history of an emergent disease: Hantavirus pulmonary syndrome. Bioscience 52, 989–998.