Patterns of sequence evolution at epitopes for host antibodies and cytotoxic T-lymphocytes in human immunodeficiency virus type 1


Virus Research 116 (2006) 98–105

Helen Piontkivska 1, Austin L. Hughes ∗
Department of Biological Sciences, University of South Carolina, Coker Life Sciences Bldg., 700 Sumter St., Columbia SC 29208, USA
Received 6 May 2005; received in revised form 31 August 2005; accepted 1 September 2005
Available online 7 October 2005

Abstract

Analysis of published sequence data from the nine protein-coding genes of human immunodeficiency virus type 1 (HIV-1) showed striking differences in evolutionary pattern between epitopes for host neutralizing antibodies (Ab) and epitopes for cytotoxic T cells (CTL). In all sequences analyzed, the greatest median amino acid residue diversity was seen at sites that formed part of Ab epitopes, but not of CTL epitopes. By contrast, sites belonging to CTL epitopes but not to Ab epitopes showed reduced median amino acid sequence diversity not only in comparison to sites in Ab epitopes but also in comparison to non-epitope sites. Ab epitopes that did not overlap CTL epitopes showed the highest frequency of comparisons in which the rate of nonsynonymous (amino acid-altering) nucleotide substitution exceeded that of synonymous nucleotide substitution, supporting the hypothesis that much of the diversity at Ab epitopes results from positive selection exerted by the host immune system. Though less frequent than that at Ab epitopes, there was evidence of such selection at certain CTL epitopes as well; and amino acid differences between sister pairs of sequences in CTL epitopes were more likely to be convergent than those in Ab epitopes. The pattern seen at CTL epitopes may represent the result of conflicting pressures favoring conservation of the amino acid sequence for functional reasons and amino acid replacements for reasons of CTL escape.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Human immunodeficiency virus; Escape mutation; Natural selection

∗ Corresponding author. Tel.: +1 803 777 9186; fax: +1 803 777 4002. E-mail address: [email protected] (A.L. Hughes).
1 Present address: Department of Biological Sciences, Kent State University, Kent, OH 44242, USA.
0168-1702/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.virusres.2005.09.001

1. Introduction

It has been suggested that the human immunodeficiency viruses show a stronger effect of positive Darwinian selection than any other organism studied so far (Rambaut et al., 2004). In both human immunodeficiency virus type 1 (HIV-1) and the monkey model simian immunodeficiency virus (SIV), there is evidence of positive selection exerted both by host neutralizing antibodies (Greenier et al., 2005; Richman et al., 2003; Zhang et al., 1993) and by the host cytotoxic T lymphocyte (CTL) system, involving binding of viral peptides by major histocompatibility complex (MHC) class I molecules and their recognition by T-cell receptors (Allen et al., 2000; da Silva and Hughes, 1999; Evans et al., 1999; Hughes et al., 2001; Goulder et al., 1997, 2001; Moore et al., 2002; O’Connor et al., 2002, 2004; Piontkivska and Hughes, 2004). In spite of this evidence, a great deal remains to be learned about positive selection on immunodeficiency viruses. The response of the vertebrate specific (or “adaptive”) immune system to viral infection involves both killing of infected cells by CTL and neutralization of virus particles by antibodies (Ab). But the relative contribution of these two aspects of the host immune response to selection on proteins of HIV-1 and related viruses remains unclear. In addition, since the strongest evidence for positive selection is derived from experimental studies of rhesus monkeys infected with SIV (O’Connor et al., 2004), it is so far uncertain to what extent diversity that is selectively maintained within an individual host contributes to diversity in the HIV-1 population.

H. Piontkivska, A.L. Hughes / Virus Research 116 (2006) 98–105

There are several reasons for believing that transmission of HIV-1 may involve a reduction in genetic diversity. First, natural transmission involves only a small sample of the viral population inhabiting a host, giving rise to a “bottleneck” in transmission (Rambaut et al., 2004). Second, there is evidence in SIV that certain CTL escape mutants may have reduced fitness; in HIV-1 it is hypothesized that such mutants will tend not to be transmitted from one host to another or, if transmitted, will quickly be eliminated by selection in hosts lacking the MHC genotype that originally favored the mutant (Friedrich et al., 2004). Finally, given the fact that HIV-1 is primarily sexually transmitted, the male genital tract may serve as a selective filter, eliminating viral genotypes poorly adapted to its distinctive features (Pillai et al., 2005). In spite of these factors, there is evidence that CTL escape mutants acquired in one host can be transmitted to subsequent hosts (Furutsuki et al., 2004; Goulder et al., 2001). In addition, statistical analyses of HIV-1 sequences from different hosts have shown a signature of past selection on CTL epitopes (Moore et al., 2002; Piontkivska and Hughes, 2004; Yusim et al., 2002).

Here we use analysis of published sequences to examine the relative contribution of selection by host CTL and Ab to diversification of amino acid sequences of proteins encoded by HIV-1. Positive Darwinian selection, favoring changes at the amino acid sequence level, is characterized by a pattern of nucleotide substitution whereby the number of nonsynonymous (amino acid-altering) nucleotide substitutions per nonsynonymous site (dN) exceeds the number of synonymous substitutions per synonymous site (dS) (Hughes and Nei, 1988). We compare the frequency of occurrence of dN > dS in known host CTL and Ab epitopes, using comparisons between pairs of closely related sequences that are phylogenetically and thus statistically independent (Felsenstein, 1985a).
Convergent or parallel evolution at the amino acid sequence level involves the independent occurrence of the same amino acid replacement in different lineages, and positive Darwinian selection is thought likely to enhance the probability of convergent evolution (Doolittle, 1994; Hughes, 1999). In HIV-1, sequence comparisons have shown that the same amino acid replacements occur convergently in different hosts in epitopes for both Ab (Strunnikova et al., 1995) and CTL (Piontkivska and Hughes, 2004). Moreover, convergent amino acid changes were observed experimentally in a CTL epitope in Tat of SIV in different monkeys independently infected with the same inoculum (Hughes et al., 2001). Since convergent evolution is a hallmark of positive selection, we compare the occurrence of convergent amino acid changes in CTL and Ab epitopes.

2. Methods

2.1. Sequences analyzed

DNA alignments of 9 HIV-1 coding genes were downloaded from the HIV Sequence Database, maintained at Los Alamos National Laboratory (http://www.hiv.lanl.gov/content/hiv-db/mainpage.html). Consensus sequences and sequences designated as contaminated were removed. The numbers of usable sequences in the alignments for each gene were as follows: gag 347; pol 342; vif 557; vpr 488; vpu 496; tat 349; rev 368; env 606; nef 821.

2.2. Amino acid diversity

At each aligned amino acid site, the amino acid diversity among all genes in the alignment was estimated by the mean of the pairwise JTT amino acid distance (Jones et al., 1992) among all residues at the site, excluding alignment gaps and stop codons. The JTT distance is an empirically derived amino acid distance based on comparisons of closely related, conserved sequences. Thus, the greater the JTT distance, the more radical the amino acid residue change as regards the chemical properties of amino acids. Because amino acid diversity was not normally distributed (Kolmogorov-Smirnov test; P < 0.01), nonparametric methods were used in analysis.

2.3. CTL epitopes and antibody binding sites

The set of CTL epitope regions was derived from the “best list” of CTL epitopes provided by Frahm and colleagues (Frahm et al., 2002). Only epitopes supported by strong experimental evidence are included in this list (Frahm et al., 2002). The list of HIV-specific antibody (Ab) binding sites was extracted from the comprehensive set maintained by the HIV Molecular Immunology Database (Korber et al., 2002). Only those antibodies with an experimentally demonstrated neutralizing reaction were included in the analysis. Adjacent or overlapping epitopes were combined as a single epitope region for purposes of analysis. Epitope regions with overlapping CTL and antibody epitopes were also combined as a single epitope region. All sites were classified into one of four categories: (1) sites with both CTL and Ab epitopes, (2) CTL epitope sites only, (3) Ab epitope sites only and (4) non-epitope sites.

2.4. Phylogenetically independent pairs

For each gene, phylogenetic trees were reconstructed by the neighbor-joining method (Saitou and Nei, 1987), using the MEGA2 program (Kumar et al., 2001).
Phylogenetic trees were constructed using (1) the Kimura 2-parameter nucleotide distance (Kimura, 1980) (see supplementary figures); and (2) the Poisson-corrected amino acid distance (Nei, 1987). Simple models of nucleotide and amino acid substitution were used because simpler models have lower variance and thus are more likely to yield reliable results in phylogenetic analyses with large numbers of sequences (Nei and Kumar, 2000). The reliability of internal branches on the tree was assessed by the bootstrap method (Felsenstein, 1985b) based on 1000 bootstrap replicates. Phylogenetic trees were used to identify phylogenetically independent “sister pairs” of closely related sequences that received at least 50% bootstrap support for the internal branch clustering two sequences in both nucleotide sequence-based and amino acid sequence-based trees. Note that no sister pair was closely related to any other sister pair; thus the membership of sister pairs is likely to be robust to minor errors in topology involving the terminal branches of the trees. Since all substitutions between the two members of a pair occurred since their last common ancestor, these substitutions are statistically independent of those in other pairs. The numbers of sister pairs for each gene were as follows: gag 39; pol 75; vif 69; vpr 36; vpu 19; tat 34; rev 28; env 66; nef 32. The phylogenetic trees are available for download from: http://shanghai.biology.kent.edu/AbCTL K2p topologies.pdf.

2.5. Synonymous and nonsynonymous substitution

For each gene, we estimated the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN) between each sister pair of sequences. We estimated dS and dN separately for each epitope region and for combined non-epitope portions of the gene; and we counted the numbers of comparisons where dN was greater than dS. A pattern of dN exceeding dS is indicative of positive Darwinian selection favoring amino acid sequence changes (Hughes and Nei, 1988; Hughes, 1999). We used Nei and Gojobori’s (1986) method to estimate dS and dN. In preliminary analyses, we applied a number of more complex models to a subset of the data: the Li (1993) method; the modified Nei and Gojobori method (Zhang et al., 1998); and the Yang and Nielsen (2000) method. All models produced essentially identical results. Different correction formulas are expected to give similar results when the number of substitutions per site is low (Nei and Kumar, 2000). In comparisons between sister pairs in the present data set, overall mean dS was 0.120 and overall mean dN was 0.036. Because these values are relatively low, we report only results using the Nei and Gojobori (1986) method; because it makes fewer assumptions than the other models, this method is expected to have a lower variance (Nei and Kumar, 2000). Moreover, in the present analyses, we were interested only in the relative frequencies of occurrence of a pattern of dN exceeding dS in different regions, and the use of different correction formulas rarely affects the relative values of dS and dN.
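As a rough illustration of the last step of this procedure, the sketch below applies the Jukes-Cantor correction, which the Nei and Gojobori (1986) method uses to convert proportions of synonymous and nonsynonymous differences into dS and dN, and then counts the comparisons in which dN exceeds dS. The counting of synonymous and nonsynonymous sites and differences from codons is not shown. The function names and the input counts are hypothetical and are not taken from the data analyzed here.

```python
import math


def jukes_cantor(p):
    """Jukes-Cantor corrected distance for a proportion of differences p."""
    if not 0.0 <= p < 0.75:
        raise ValueError("proportion of differences must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def ds_dn(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites):
    """Return (dS, dN) for one pair from counts of differences and sites."""
    p_s = syn_diffs / syn_sites        # proportion of synonymous differences
    p_n = nonsyn_diffs / nonsyn_sites  # proportion of nonsynonymous differences
    return jukes_cantor(p_s), jukes_cantor(p_n)


def fraction_dn_exceeds_ds(comparisons):
    """Proportion of pairwise comparisons in which dN > dS."""
    hits = 0
    for syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites in comparisons:
        d_s, d_n = ds_dn(syn_diffs, syn_sites, nonsyn_diffs, nonsyn_sites)
        if d_n > d_s:
            hits += 1
    return hits / len(comparisons)


# Hypothetical counts for three sister pairs in one epitope region:
# (synonymous differences, synonymous sites,
#  nonsynonymous differences, nonsynonymous sites)
example = [
    (1.0, 10.0, 1.0, 35.0),  # dN < dS
    (0.0, 10.0, 2.0, 35.0),  # no synonymous differences, so dN > dS
    (2.0, 10.0, 3.0, 35.0),  # dN < dS
]
fraction = fraction_dn_exceeds_ds(example)
```

Because only the sign of dN minus dS is counted, any correction that increases monotonically with the proportion of differences would classify these toy comparisons the same way, which is in line with the remark above that the choice of correction formula rarely affects the relative values.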

Fig. 1. Examples of convergent and unique differences among sister pairs of sequences, chosen from the phylogeny of 349 tat sequences. In the case of convergence, the same amino acid difference is found in two or more sister pairs, whereas a unique difference is one found in only a single pair. Two examples of sister pairs with differences that are neither unique nor convergent (“other”) are also shown. The example involves amino acid residues at site 22 in the Tat alignment.
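The three kinds of difference illustrated in Fig. 1 can be sketched as a small classifier for a single alignment site. This is only a minimal sketch under stated assumptions: a difference is called convergent when the same two residues differ in at least two sister pairs, and unique when at least one of its residues is seen in no other sister pair at that site. The latter reading, and the resulting definition of "other", are assumptions made for illustration. The residues in the example are hypothetical and are not those of site 22 of Tat.

```python
from collections import Counter


def classify_differences(pairs):
    """Classify each sister pair at one alignment site.

    pairs: list of (residue_a, residue_b), one tuple per sister pair.
    Returns one label per pair: 'same', 'convergent', 'unique' or 'other'.
    """
    # Count each unordered residue pair over the sister pairs that differ.
    diff_counts = Counter(frozenset(p) for p in pairs if p[0] != p[1])
    labels = []
    for i, (a, b) in enumerate(pairs):
        if a == b:
            labels.append("same")
            continue
        if diff_counts[frozenset((a, b))] >= 2:
            # The same difference occurs in at least two sister pairs.
            labels.append("convergent")
            continue
        # Residues observed at this site in all other sister pairs.
        elsewhere = {r for j, p in enumerate(pairs) if j != i for r in p}
        # ASSUMPTION: 'unique' means at least one residue is seen nowhere else.
        if a not in elsewhere or b not in elsewhere:
            labels.append("unique")
        else:
            labels.append("other")
    return labels


# Hypothetical residues at one site in six sister pairs.
site = [("K", "R"), ("R", "K"), ("K", "Q"), ("K", "K"), ("R", "N"), ("N", "K")]
labels = classify_differences(site)
```

Working on sister pairs alone keeps the classification independent of the deeper branches of the tree, so no ancestral reconstruction is needed.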

2.6. Convergent and unique amino acid pairs

If there was an amino acid difference between the two members of a phylogenetically independent pair, the difference was characterized as “convergent,” “unique,” or “other” (i.e., neither convergent nor unique) (Fig. 1). A convergent amino acid difference was one such that the exact same amino acid difference occurred at that site in at least two pairs (Fig. 1). Such a difference can be called convergent in the evolutionary sense because, if we assume that the clustering of sister pairs is correct, the same amino acid difference cannot occur in two pairs separately unless at least one convergent/parallel amino acid replacement has taken place. Note that this concept of a convergent amino acid difference requires fewer assumptions than identifying convergent amino acid changes by methods that involve reconstructing amino acid changes throughout a phylogeny (e.g., Yang et al., 1995). The latter methods are dependent on the assumption that all details of the phylogeny are accurate, whereas in practice it is often difficult to reconstruct all branches in a phylogeny with high confidence. In contrast to convergent amino acid differences, an amino acid difference was unique if the same pair of amino acids did not occur in any other sister pair (Fig. 1).

3. Results

3.1. Sequence diversity

For all nine proteins encoded by the HIV-1 genome, median amino acid diversity differed significantly among amino acid sites categorized by presence or absence of Ab and CTL epitopes (Kruskal-Wallis test; P < 0.001; Fig. 2A). By far the highest median amino acid diversity (1.585) was seen at sites included in Ab epitopes but not in CTL epitopes (Fig. 2A). Among sites included in CTL epitopes, median amino acid diversity at sites included also in Ab epitopes (0.486) was over five times as great as that at sites included in CTL but not Ab epitopes (0.084) (Fig. 2A). Sites included in CTL but not Ab epitopes in fact showed lower median sequence diversity than non-epitope sites (0.283). The Gag and Env proteins were the only two to contain both Ab and CTL epitopes. When these two proteins were considered separately (Fig. 2B–C), median amino acid diversity likewise differed significantly among amino acid sites categorized by presence or absence of Ab and CTL epitopes (Kruskal-Wallis test; P < 0.001 in each case). In each case, the highest median amino acid diversity was seen at sites included in Ab epitopes but not in CTL epitopes (Fig. 2B–C). However, in the case of Env, the differences among categories were less marked than in the case of Gag (Fig. 2B–C).

Fig. 2. Median amino acid diversity at sites characterized by presence or absence of Ab and CTL epitopes in (A) all proteins (Kruskal-Wallis test of equality of medians, P < 0.001); (B) Gag (P < 0.001); and (C) Env (P < 0.001). Numbers on the top of each bar represent the number of sites in that category. Legend for categories of sites: Ab−, antibody epitope absent; Ab+, antibody epitope present; CTL−, CTL epitope absent; CTL+, CTL epitope present.

In comparisons between sister pairs of sequences, we counted numbers of cases of regions with dN > dS (Fig. 3). Over all genes, the proportion of comparisons with dN > dS differed significantly among regions characterized by presence or absence of Ab and CTL epitopes (χ² = 26.96; 3 d.f.; P < 0.001; Fig. 3A). The highest proportion of comparisons with dN > dS was seen in comparisons of Ab epitope regions that did not include CTL epitopes (22.3%; Fig. 3A). Similar patterns were seen when the gag (Fig. 3B) and env (Fig. 3C) genes were considered separately.

Fig. 3. Percentages of comparisons with dN > dS between sister pairs of sequences in gene regions characterized by presence or absence of Ab and CTL epitopes in (A) all genes (test of uniformity of proportions among categories χ² = 26.96; 3 d.f.; P < 0.001); (B) gag (χ² = 33.46; 3 d.f.; P < 0.001); and (C) env (χ² = 42.35; 3 d.f.; P < 0.001). Numbers on the top of each bar represent the number of comparisons in that category. See Fig. 2 for legend for categories of regions.
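The per-site diversity summarized in Fig. 2 is a mean of pairwise distances among the residues in one alignment column (Section 2.2), and the categories are then compared by their medians. The following is a minimal sketch of that bookkeeping, not the procedure actually run for the paper: the JTT distance matrix is not reproduced here, so a simple mismatch distance stands in for it, and the alignment columns and category labels are hypothetical.

```python
from itertools import combinations
from statistics import median

SKIPPED = {"-", "*"}  # alignment gaps and stop codons are excluded


def mismatch(a, b):
    """Placeholder distance (NOT JTT): 1 if residues differ, else 0."""
    return 0.0 if a == b else 1.0


def site_diversity(column, distance=mismatch):
    """Mean pairwise distance among the residues of one alignment column."""
    residues = [r for r in column if r not in SKIPPED]
    pairs = list(combinations(residues, 2))
    if not pairs:
        return 0.0
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


def median_by_category(columns, categories):
    """Median site diversity within each category of sites."""
    grouped = {}
    for column, category in zip(columns, categories):
        grouped.setdefault(category, []).append(site_diversity(column))
    return {category: median(values) for category, values in grouped.items()}


# Hypothetical columns (one character per sequence) and site categories.
columns = ["KKKK", "KRK-", "KRQN", "AAAS"]
categories = ["CTL only", "Ab only", "Ab only", "CTL only"]
medians = median_by_category(columns, categories)
```

Passing a function that looks up an empirical distance matrix as the `distance` argument would bring the sketch closer to the measure described in Section 2.2, with larger values for more radical residue changes.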


We used a log-linear model to test for a partial association between the presence of an Ab epitope and dN > dS, controlling for the presence or absence of a CTL epitope. In the data for all genes, there was a highly significant partial association between Ab epitopes and dN > dS (χ² = 20.19; 1 d.f.; P < 0.001). This association was explained by the fact that the proportion of comparisons with dN > dS was greater in Ab epitope regions than in regions not belonging to Ab epitopes, whether or not a CTL epitope was also present (Fig. 3A). Likewise, there was a significant partial association between CTL epitopes and dN > dS when the effect of Ab epitopes was controlled for (χ² = 5.09; 1 d.f.; P = 0.024). The latter effect was explained mainly by the pattern in regions where an Ab epitope was absent. In that case, a higher proportion of comparisons with dN > dS occurred in CTL epitope regions (15.7%) than in regions not belonging to CTL epitopes (8.8%; Fig. 3A). By contrast, when an Ab epitope was present, there was little difference between the proportion of comparisons with dN > dS in CTL epitope regions (20.3%) and that in regions not belonging to CTL epitopes (22.3%) (Fig. 3A).

Considering only the comparisons with dN > dS, we compared the median difference between dN and dS among the four categories of genomic regions: non-epitope regions (median difference = 0.0145; 29 comparisons); Ab epitopes (median difference = 0.0460; 82 comparisons); CTL epitopes (median difference = 0.0476; 613 comparisons); and both Ab and CTL epitopes (median difference = 0.0213; 48 comparisons). The medians were significantly different among categories (Kruskal-Wallis test; P < 0.001). The significant difference among categories was evidently due to the higher median difference between dN and dS in regions where either an Ab epitope or a CTL epitope was present, in comparison to regions having neither type of epitope or both types.

3.2. Convergent and unique changes

The proportion of convergent amino acid differences between sister pairs of sequences differed significantly among sites categorized by presence or absence of Ab and CTL epitopes (χ² = 68.39; 3 d.f.; P < 0.001; Fig. 4A). The highest percentage of convergent changes (64.4%) was seen at sites in CTL but not Ab epitopes (Fig. 4A). The lowest percentage of convergent changes (48.7%) was seen at sites in Ab but not CTL epitopes (Fig. 4A). When the Gag protein alone was considered, the pattern was somewhat different (Fig. 4B). In Gag, the highest percentage of convergent changes (66.0%) occurred at sites in both Ab and CTL epitopes, while the lowest percentage of convergent changes (37.9%) was seen at sites in Ab but not CTL epitopes (Fig. 4B). In Env, on the other hand, the highest percentage of convergent changes (66.9%) was seen at sites in CTL but not Ab epitopes, and the lowest percentage of convergent changes (49.4%) was seen at sites in Ab but not CTL epitopes (Fig. 4C).

Fig. 4. Percentages of convergent amino acid differences between sister pairs at sites characterized by presence or absence of Ab and CTL epitopes in (A) all proteins (test of uniformity of proportions among categories χ² = 68.39; 3 d.f.; P < 0.001); (B) Gag (χ² = 11.74; 3 d.f.; P = 0.008); and (C) Env (χ² = 47.38; 3 d.f.; P < 0.001). Numbers on the top of each bar represent the number of comparisons in that category. See Fig. 2 for legend for categories of sites.

The proportions of unique changes were much lower than those of convergent changes, but the proportion of unique amino acid differences between sister pairs of sequences, like that of convergent amino acid differences, differed significantly among sites categorized by presence or absence of Ab and CTL epitopes (χ² = 77.69; 3 d.f.; P < 0.001; Fig. 5A). The highest percentage of unique changes (8.4%), like that of convergent changes, was seen at sites in CTL but not Ab epitopes (Fig. 5A). The lowest percentage of unique changes (2.4%) was seen at sites in both Ab and CTL epitopes, while the percentage of unique changes at sites in Ab but not CTL epitopes (3.0%) was nearly equally low (Fig. 5A). When Gag (Fig. 5B) and Env (Fig. 5C) were considered separately, sites in CTL but not Ab epitopes showed the highest percentages of unique changes; but the difference was much more marked in the case of Gag than in the case of Env.
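The tests of uniformity of proportions reported in this section compare one proportion across the four categories of sites, which gives 3 d.f. A minimal sketch of such a test is given below, assuming the ordinary Pearson chi-square statistic on a 4 × 2 table of counts. The function names are illustrative and the counts are hypothetical; they are not the counts behind Figs. 3–5.

```python
import math


def chi_square_uniformity(counts):
    """Pearson chi-square test that a proportion is equal across categories.

    counts: list of (hits, total), one tuple per category.
    Returns (statistic, degrees_of_freedom).
    """
    total_hits = sum(hits for hits, _ in counts)
    total_n = sum(n for _, n in counts)
    pooled = total_hits / total_n  # proportion expected under uniformity
    statistic = 0.0
    for hits, n in counts:
        expected_hits = n * pooled
        expected_misses = n * (1.0 - pooled)
        statistic += (hits - expected_hits) ** 2 / expected_hits
        statistic += ((n - hits) - expected_misses) ** 2 / expected_misses
    return statistic, len(counts) - 1


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variate with 3 d.f."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


# Hypothetical (hits, total) counts for four categories of sites.
toy_counts = [(10, 100), (20, 100), (30, 100), (20, 100)]
statistic, dof = chi_square_uniformity(toy_counts)
p_value = chi2_sf_3df(statistic)
```

The closed-form tail probability applies only to 3 d.f.; for other table sizes a general chi-square survival function from a statistics library would be needed.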

Fig. 5. Percentages of unique amino acid differences between sister pairs at sites characterized by presence or absence of Ab and CTL epitopes in (A) all proteins (test of uniformity of proportions among categories χ² = 77.69; 3 d.f.; P < 0.001); (B) Gag (χ² = 22.33; 3 d.f.; P < 0.001); and (C) Env (χ² = 13.39; 3 d.f.; P = 0.004). Numbers on the top of each bar represent the number of comparisons in that category. See Fig. 2 for legend for categories of sites.

4. Discussion

Analysis of published sequence data from HIV-1 showed striking differences in evolutionary pattern between epitopes for host antibodies (Ab) and epitopes for cytotoxic T cells (CTL). In all sequences analyzed, the greatest median amino acid residue diversity was seen at sites that formed part of Ab epitopes, but not of CTL epitopes. Sites belonging to CTL epitopes but not to Ab epitopes showed reduced median amino acid sequence diversity not only in comparison to sites in Ab epitopes but also in comparison to non-epitope sites. The absence of diversity at many known CTL epitopes in HIV-1 has been noted previously and has been attributed to fixation of escape mutants eliminating CTL epitopes in protein regions not subject to strong functional constraint (Yusim et al., 2002). An alternative explanation for relative conservation of CTL epitopes is that the processes by which peptides are chosen for presentation by class I MHC molecules tend to select relatively conserved portions of molecules (such as the hydrophobic cores of globular proteins) (Hughes and Hughes, 1995; da Silva and Hughes, 1998; Yeager et al., 2000). These two explanations are not mutually exclusive, and both processes may contribute to the observed reduction of amino acid sequence diversity in known CTL epitopes of HIV-1. Whatever mechanism is responsible for constraint on CTL epitopes, our results indicate that Ab epitopes are comparatively free from such constraints.

A pattern of nucleotide substitution whereby the number of nonsynonymous substitutions per nonsynonymous site (dN) exceeds the number of synonymous substitutions per synonymous site (dS) is evidence that positive Darwinian selection has acted to favor amino acid changes (Hughes and Nei, 1988). Using comparisons between phylogenetically independent pairs of sister sequences, we found that Ab epitopes that did not overlap CTL epitopes showed the highest frequency of comparisons with dN > dS. This result is consistent with the observation that the highest amino acid sequence diversity was seen at sites in Ab epitopes and supports the hypothesis that much of this diversity results from positive selection exerted by the host immune system, favoring mutations that escape from host Ab recognition.
In spite of the lower amino acid diversity and lower frequency of comparisons with dN > dS in CTL epitopes than in Ab epitopes, the frequency of comparisons with dN > dS was significantly greater in CTL epitopes than in non-epitope regions when the effect of Ab epitopes was controlled for statistically. The latter effect was explained mainly by the fact that, in the absence of an Ab epitope, the proportion of comparisons with dN > dS was nearly twice as high in CTL epitope regions as in non-epitope regions. The results thus support the hypothesis that positive selection has acted on certain CTL epitopes, although not as frequently as on Ab epitopes. Note that, with both Ab and CTL epitopes, the pattern of dN > dS was seen in only a minority of comparisons, consistent with previous evidence that positive selection on HIV-1 is “episodic.” In other words, the region of the viral genome subject to positive selection depends on the immune system of the individual host (Seibert et al., 1995; Evans et al., 1999; O’Connor et al., 2004; Piontkivska and Hughes, 2004).

In spite of the relatively conserved amino acid sequences of CTL epitopes compared to Ab epitopes, amino acid differences between sister pairs of sequences in CTL epitopes possessed distinctive characteristics. Amino acid differences between sister pairs in CTL epitopes were more likely to be convergent than those in Ab epitopes, and they were also more likely to be unique. The convergent nature of differences in CTL epitopes may, at least in part, reflect the fact that CTL epitopes tend to be in regions of functional constraint, which may limit the possible amino acid replacements that can occur. Likewise, the higher frequency of unique differences may in part reflect the overall lower rate of amino acid replacement. In Ab epitopes, differences between sister pairs tended to involve the same residues that were seen in other sequences but did not involve convergent evolution between the sister pairs. Note that this does not imply that the overall frequency of convergent evolution is necessarily lower at Ab epitopes than at CTL epitopes, since convergent changes at Ab epitopes may have occurred deep in the phylogeny rather than between sister pairs. However, convergent evolution of the former sort is difficult to demonstrate conclusively, since it depends on a phylogeny with strongly supported internal branches, which is difficult to obtain in a rapidly evolving virus such as HIV-1. Nonetheless, the observed greater frequency of convergent changes between sister pairs of sequences in CTL epitopes than in Ab epitopes identifies a distinctive pattern in the former, where recent convergent changes are a frequent occurrence. A bias toward recent changes might occur because many CTL escape mutants are at least mildly deleterious to the virus, giving rise to selection favoring back-mutation when the immune environment changes (Friedrich et al., 2004; Goulder and Watkins, 2004). The pattern seen at CTL epitopes may thus represent the results of conflicting pressures favoring conservation of the amino acid sequence for functional reasons and amino acid replacements for reasons of CTL escape.

Proposed strategies for vaccine design in the case of HIV-1 have targeted both antibody and CTL responses (Calarota and Weiner, 2003; Draenert and Goebel, 2004; Garber et al., 2004; Goulder and Watkins, 2004; Zolla-Pazner, 2004). In either case, the high polymorphism of the viral population poses a problem for vaccine development (Calarota and Weiner, 2003).
An enhanced understanding of the nature of this polymorphism and of the factors contributing to it may yield insights that will be important for the design of a successful vaccine.

Acknowledgments

This research was supported by grant GM43940 from the National Institutes of Health.

References

Allen, T.M., O’Connor, D.H., Jing, P., Dzuris, J.L., Mothé, B.R., Vogel, T.U., Dunphy, E., Liebl, M.E., Emerson, C., Wilson, N., Kunstman, K.J., Wang, X., Allison, D.B., Hughes, A.L., Desrosiers, R.C., Altman, J.D., Wolinsky, S.M., Sette, A., Watkins, D.I., 2000. Tat-specific CTL select for SIV escape variants during resolution of primary viremia. Nature 407, 386–390.
Calarota, S.A., Weiner, D.B., 2003. Present status of human HIV vaccine development. AIDS (Suppl.) 17, S73–S84.
da Silva, J., Hughes, A.L., 1998. Conservation of host cytotoxic T lymphocyte (CTL) epitopes as a host strategy to constrain parasite adaptation: evidence from the nef gene of human immunodeficiency virus 1 (HIV-1). Mol. Biol. Evol. 15, 1259–1268.
da Silva, J., Hughes, A.L., 1999. Molecular phylogenetic evidence of cytotoxic T lymphocyte (CTL) selection on human immunodeficiency virus type 1 (HIV-1). Mol. Biol. Evol. 16, 1420–1422.

Doolittle, R.F., 1994. Convergent evolution: the need to be explicit. Trends Biochem. Sci. 19, 15–18.
Draenert, R., Goebel, F.-D., 2004. Protective immunity in HIV infection: where do we stand? Infection 32, 250–252.
Evans, D.T., O’Connor, D.H., Jing, P., Dzuris, J.L., Sidney, J., da Silva, J., Allen, T.M., Horton, H., Venham, J.E., Rudersdorf, R.A., Vogel, T., Pauza, C.D., Bontrop, R.E., DeMars, R., Sette, A., Hughes, A.L., Watkins, D.I., 1999. Virus-specific cytotoxic T-lymphocyte responses select for amino-acid variation in simian immunodeficiency virus Env and Nef. Nat. Med. 5, 1270–1276.
Felsenstein, J., 1985a. Phylogenies and the comparative method. Am. Nat. 125, 1–15.
Felsenstein, J., 1985b. Confidence limits on phylogenies: an approach using the bootstrap. Evolution 39, 783–791.
Frahm, N., Goulder, P.J.R., Brander, C., 2002. Total assessment of HIV-specific CTL responses: epitope clustering, processing preferences, and the impact of HIV sequence heterogeneity. In: Korber, B.T.M., Brander, C., Haynes, B.F., Koup, R., Kuiken, C., Moore, J.P., Walker, B.D., Watkins, D.I. (Eds.), HIV Molecular Immunology 2002. Los Alamos National Laboratory, Los Alamos NM, pp. 3–21.
Friedrich, T.C., Dodds, E.J., Yant, L.J., Vojnov, L., Rudersdorf, R., Cullen, C., Evans, D.T., Desrosiers, R.C., Mothé, B.R., Sidney, J., Sette, A., Kunstman, K., Wolinsky, S., Piatak, M., Lifson, J., Hughes, A.L., Wilson, N., O’Connor, D.H., Watkins, D.I., 2004. Reversion of CTL escape-variant immunodeficiency viruses in vivo. Nat. Med. 10, 275–281.
Furutsuki, T., Hosoya, N., Kawana-Tachikawa, A., Tomizawa, M., Odawara, T., Goto, M., Kitamura, Y., Nakamura, T., Kelleher, A.D., Cooper, D.A., Iwamoto, A., 2004. Frequent transmission of cytotoxic-T-lymphocyte escape mutants of Human Immunodeficiency virus type 1 in the highly HLA-A24-positive Japanese population. J. Virol. 78, 8437–8445.
Garber, D.A., Silvestri, G., Feinberg, M.B., 2004. Prospects for an AIDS vaccine: three questions, no easy answers. Lancet Infect. Dis. 4, 397–413.
Goulder, P.J., Watkins, D.I., 2004. HIV and SIV CTL escape: implications for vaccine design. Nature Rev. Immunol. 4, 630–640.
Goulder, P.J., Phillips, R.E., Colbert, R.A., McAdam, S., Ogg, G., Nowak, M.A., Giangrande, P., Luzzi, G., Morgan, B., Edwards, A., McMichael, A.J., Rowland-Jones, S., 1997. Late escape from an immunodominant cytotoxic T-lymphocyte response associated with progression to AIDS. Nat. Med. 3, 212–217.
Goulder, P.J.R., Brander, C., Tang, Y., Tremblay, C., Colbert, R.A., Addo, M.M., Rosenberg, E.S., Nguyen, T., Allen, R., Trocha, A., Altfeld, M., He, S., Bunce, M., Funkhouser, R., Pelton, S.I., Burchett, S.K., McIntosh, K., Korber, B.T.M., Walker, B.D., 2001. Evolution and transmission of stable CTL escape mutations in HIV infection. Nature 412, 334–338.
Greenier, J.L., Van Rompay, K.K.A., Montefiori, D., Earl, P., Moss, B., Marthas, M.L., 2005. Simian immunodeficiency virus (SIV) envelope quasispecies transmission and evolution in infant rhesus macaques after oral challenge with uncloned SIVmac251: increased diversity is associated with neutralizing antibodies and improved survival in previously immunized animals. Virol. J. 2, 11, 2005.
Hughes, A.L., 1999. Adaptive Evolution of Genes and Genomes. Oxford University Press, New York.
Hughes, A.L., Hughes, M.K., 1995. Self peptides bound by HLA class I molecules are derived from highly conserved regions of a set of evolutionarily conserved proteins. Immunogenetics 41, 257–262.
Hughes, A.L., Nei, M., 1988. Pattern of nucleotide substitution at major histocompatibility complex class I loci reveals overdominant selection. Nature 335, 167–170.
Hughes, A.L., Westover, K., da Silva, J., O’Connor, D.H., Watkins, D.I., 2001. Simultaneous positive and purifying selection on overlapping reading frames of the tat and vpr genes of simian immunodeficiency virus. J. Virol. 75, 7966–7972.
Jones, D.T., Taylor, W.R., Thornton, J.M., 1992. The rapid generation of mutation data matrices from protein sequences. Comput. Appl. Biosci. 8, 275–282.
Kimura, M., 1980. A simple method for estimating evolutionary rate of base substitution through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120.
Korber, B.T.M., Brander, C., Haynes, B.F., Koup, R., Kuiken, C., Moore, J.P., Walker, B.D., Watkins, D.I. (Eds.), 2002. HIV Molecular Immunology 2002. Los Alamos, Los Alamos National Laboratory.
Kumar, S., Tamura, K., Jakobsen, I.B., Nei, M., 2001. MEGA2: molecular evolutionary genetics analysis software. Bioinformatics 17, 1244–1245.
Li, W.-H., 1993. Unbiased estimates of the rates of synonymous and nonsynonymous substitution. J. Mol. Evol. 36, 96–99.
Moore, C.B., John, M., James, I.R., Christiansen, F.T., Witt, C.S., Mallal, S.A., 2002. Evidence of HIV-1 adaptation to HLA-restricted immune responses at a population level. Science 296, 1439–1443.
Nei, M., 1987. Molecular Evolutionary Genetics. Columbia University Press, New York.
Nei, M., Gojobori, T., 1986. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol. Biol. Evol. 3, 418–426.
Nei, M., Kumar, S., 2000. Molecular Evolution and Phylogenetics. Oxford University Press, New York.
O’Connor, D.H., Allen, T.M., Vogel, T.U., Jing, P., de Souza, I.P., Dodds, E., Dunphy, E.J., Melsaether, C., Mothé, B., Yamamoto, H., Horton, H., Wilson, N., Hughes, A.L., Watkins, D.I., 2002. Acute phase cytotoxic T lymphocyte escape is a hallmark of simian immunodeficiency virus infection. Nat. Med. 5, 493–499.
O’Connor, D.H., McDermott, A.B., Krebs, K.C., Dodds, E.J., Miller, J.E., Gonzalez, E.J., Jacoby, T.J., Yant, L., Piontkivska, H., Pantophlet, R., Burton, D.R., Rehrauer, W.M., Wilson, N., Hughes, A.L., Watkins, D.I., 2004. A dominant role for CD8+-T-lymphocyte selection in simian immunodeficiency virus sequence variation. J. Virol. 78, 14012–14022.
Pillai, S.K., Good, B., Pond, S.K., Wong, J.K., Strain, M.C., Richman, D.D., Smith, D.M., 2005. Semen-specific genetic characteristics of human immunodeficiency virus type 1 env. J. Virol. 79, 1734–1742.
Piontkivska, H., Hughes, A.L., 2004.
Between-host evolution of CTL epitopes in Human Immunodeficiency Virus Type 1 (HIV-1): an approach based on phylogenetically independent comparisons. J. Virol. 78, 11758–11765. Rambaut, A., Posada, D., Crandall, K.A., Holmes, E.C., 2004. The causes and consequences of HIV evolution. Nat. Rev. Genet. 5, 52–61.

105

Richman, D.D., Wrin, T., Little, S.J., Petropoulos, C.J., 2003. Rapid evolution of the neutralizing antibody response to HIV type 1 infection. Proc. Natl. Acad. Sci. U.S.A. 100, 4144–4149. Saitou, N., Nei, M., 1987. The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. E 4, 406–425. Seibert, S.A., Howell, C.Y., Hughes, M.K., Hughes, A.L., 1995. Natural selection on the gag, pol, and env genes of human immunodeficiency virus 1 (HIV-1). Mol. Biol. E 12, 803–813. Strunnikova, N., Ray, S.C., Livingston, R.A., Rubalcaba, E., Viscidi, R.P., 1995. Convergent evolution within the V3 loop domain of human immunodeficiency virus type 1 in association with disease progression. J. Virol. 69, 7548–7558. Yang, Z., Nielsen, R., 2000. Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol. Biol. E 17, 32–43. Yang, Z., Kumar, S., Nei, M., 1995. A new method of inference of ancestral nucleotide and amino acid sequences. Genetics 141, 1641– 1650. Yeager, M., Carrington, M., Hughes, A.L., 2000. Class I and class II MHC bind self peptide sets that are strikingly different in their evolutionary characteristics. Immunogenetics 51, 8–15. Yusim, K., Kesmir, C., Gaschen, B., Addo, M.M., Atfeld, M., Brunak, S., Chigaev, A., Detous, V., Korber, B.T., 2002. Clustering patterns of cytotoxic T-lymphocyte epitopes in human immunodeficiency virus type 1 (HIV-1) proteins reveal imprints of immune evasion on HIV-1 global variation. J. Virol. 76, 8757–8768. Zhang, J., Rosenberg, H.F., Nei, M., 1998. Positive Darwinian selection after gene duplication in primate ribonuclease genes. Proc. Natl. Acad. Sci. U.S.A. 98, 3708–3713. Zhang, L.Q., MacKenzie, P., Cleland, A., Holmes, E.C., Leigh Brown, A.J., Simmonds, P., 1993. Selection for specific sequences in the external envelope protein of human immunodeficiency virus type 1 upon primary infection. J. Virol. 67, 3345–3356. Zolla-Pazner, S., 2004. 
Identifying epitopes of HIV-1 that induce protective antibodies. Nat. Rev. Immunol. 4, 199–210.