The envelope glycoprotein of HIV-1 may have incorporated the CD4 binding site from HLA-DQβ1




Life Sciences, Vol. 45, No. 20, pp. iii-ix, 1989. Printed in the U.S.A.

Pergamon Press

AIDS RESEARCH COMMUNICATIONS: THE ENVELOPE GLYCOPROTEIN OF HIV-1 MAY HAVE INCORPORATED THE CD4 BINDING SITE FROM HLA-DQβ1

R. I. Brinkworth

Graduate School of Science and Technology, Bond University, Gold Coast, Queensland 4229, Australia

(Received in final form September 1, 1989)

Summary

An hypothesis is presented which states that the increased binding to CD4 by the envelope glycoprotein (gp120) from HIV-1, compared with that from HIV-2, is due to the env gene from HIV-1 having at some stage incorporated exon 2 of the gene coding for the β subunit of a class II MHC protein, possibly HLA-DQ, which contains part of the CD4 binding site. Evidence is presented from amino acid sequence analysis and consideration of putative binding residues from gp120 and HLA-DQ.

An hypothesis is proposed that HIV-1 is derived from an ancestral virus which incorporated that part of the HLA-DQβ gene coding for the CD4 binding site. The proposal is that it is exon 2, coding for the HLA-DQβ1 domain (1). This would have the effect of increasing the specificity of the virus for one sub-class of T-lymphocyte, the "helper" T-cell, resulting in the acquired immunodeficiency syndrome, AIDS. An evolutionary link between two proteins is one mechanism by which two proteins could bind in a similar manner to the same receptor.

Human Immunodeficiency Virus (HIV)

The human immunodeficiency virus, HIV, is an enveloped retrovirus. The HIV genome codes for several proteins, with the env gene coding for the envelope precursor, gp160 (2). After export to the cell membrane, gp160 is proteolytically cleaved into the envelope glycoprotein, gp120, and the transmembrane glycoprotein, gp41 (3). The two proteins remain held together through non-covalent interactions, and this serves to anchor gp120 to the viral membrane. The initial event in the binding of HIV, or HIV-infected cells, to the target cell, the "helper" T-lymphocyte, is the binding of gp120 to a cell-surface antigen: the monomeric, monomorphic, membrane-bound glycoprotein, CD4 (4). This is the primary step in the infective process, although evidence has now been presented to show that CD4 is not the only receptor that HIV can employ to infect cells (5).

CD4 is involved in T-cell recognition, where proteins are taken up by antigen-presenting cells (APCs) with the subsequent processing of these proteins into "T-cell epitopes" (6). These peptides are presented to the T-lymphocytes bound to class II cell-surface antigens coded by the MHC. MHC class II proteins restrict the recognition to "T4+" cells carrying CD4. CD4 acts as an accessory protein, with the main interaction being between the MHC protein plus bound antigen and the T-cell receptor (TcR) on the T-lymphocyte (6). A direct interaction between CD4 and MHC class II proteins has been demonstrated (7).

There are two classes of protein that bind to CD4: the binding of both gp120 and class II MHC proteins to CD4 activates T-cells by stimulating the membrane transduction system involving the calcium-phosphoinositol pathway (8, 9), whereas the binding of anti-CD4 monoclonal antibodies, some of which, such as Leu3a, inhibit gp120 binding (10), simply inhibits this process (11). This would suggest that gp120 and class II MHC proteins have a similar mode of binding to CD4.

0024-3205/89 $3.00 + .00 Copyright (c) 1989 Pergamon Press plc

gp120

The gp120 from the HIV-1 strain, WMJ1, has 481 amino acids and twenty potential asparagine-linked glycosylation sites (2). The amino acid numbering system used here does not include the signal peptide. Results from other laboratories indicate that the CD4 binding site is located predominantly in the 150 residue C-terminal section of gp120:

(i) Work by Lasky and coworkers (12) established that Ala403 was essential for binding and that replacement by an aspartate residue was deleterious;

(ii) Ala403 is part of the potential T-cell epitope 391-406 designated "envT1" by Cease et al. (13). The delimiting cysteine residues for this region, Cys388 and Cys415, are thought to form a disulphide bond, resulting in a "hairpin" structure with an α-helix running down one side (12);

(iii) Experiments using insertion and deletion mutations by Kowalski et al. (14) suggested the possibility that the CD4 binding site in gp120 is divided into three epitopes: one following Asn444 (pIIIenv473), a second between Cys388 and Ile390 (pIIIenv419), which was assumed to have affected the binding of Ala403, and a third between His335 and Ser336 (pIIIenv363). Deleting the last 39 residues also abolished binding. This is presumably the same part of the binding site as pIIIenv473.
(iv) A truncated recombinant gp120, lacking the first 274 residues, has been shown to be unable to displace radio-labelled gp120 from a CD4+ cell line (15).

The dissociation constant for gp120 binding to CD4 has been measured at the nanomolar level (12, 15, 16). These results have enabled the author to focus attention on the C-terminal region of gp120.

MHC Class II Proteins

Class II proteins of the MHC are glycoproteins of two subunits, α and β, of molecular weights 33,000 and 28,000 respectively, located in the membranes of many cell types including macrophages (17). Each subunit can be divided (from the N-terminal) into two extracellular domains, 1 and 2, a transmembrane domain and a cytoplasmic domain. Each of the domains is coded on a different exon. The class II MHC protein chosen for study is a haplotype of HLA-DQ sequenced by Boss and Strominger (1). The α1 and β1 domains of an MHC class II protein are thought to form a structure similar to that of the α1 and α2 domains of the MHC class I protein, HLA-A2, whose crystal structure is known (18). The structure is that of two antiparallel α-helices overlying, and perpendicular to, an eight-stranded β-sheet. Each domain contributes one helix and four strands. Both helices are divided by hinge points into N-terminal "short" and C-terminal "long" helices. The foreign antigen is bound in the groove between the two α-helices, and parallel to both (18, 19, 20).

From comparison of the amino acid sequences, the class II β1 domain is the equivalent of the class I α2 domain. Matching up the amino acid sequence of the β1 domain of HLA-DQ with the α2 domain of HLA-A2 is somewhat easier than for the α1 domains because they are bounded by disulphide-linked cysteine residues and have equivalent tryptophan residues. A comparison of the amino acid residues from these domains, with fully and partially homologous residues highlighted, is as follows:

HLA-A2α2, 101-149: CDVGSDWRFLRGYHQYAYDGKDYIALKEDLRSWTAADMAAQTTKHKWEA
HLA-DQβ1, 15-63:   CYFTNGTERVRLVSRSIYNREEIVRFDSDVGEFRAVTLLGLPAAEYWNS

As postulated by Brown and coworkers (20), the N-glycosylation site on Asn19 of HLA-DQβ1 seems likely to be situated on the β-bend between β-strands 1 and 2. The α2 "short" helix of HLA-A2 runs from Met138 to Ala149 (18). On that basis, the β1 "short" helix of HLA-DQβ1 is predicted to run from Leu52 to Ser63. The presence of a helix breaker, Pro56, suggests that the structures of the two "short" helices may not be identical.

Models have been proposed for the binding of an MHC-foreign antigen complex (HLA-X) to the T-cell receptor (21, 22). The TcR in these models occupies a large proportion of the surface of the MHC protein made up of residues on the two α-helices. It follows that for CD4 to take part in the binding scheme involving the α1 and β1 domains at the same time as the TcR, it would probably have only limited access to the MHC protein surface. The most promising area on the periphery of the α1 and β1 domains is made up of the C-terminal end of the α1 helix and the N-linked sugar on the β-bend which follows, along with the β1 "short" helix, which, from the structure of HLA-A2, is found in the same vicinity.

Structural and other aspects of the putative β1 short helix, Leu52-Ser63 (LLGLPAAEYWNS), that are relevant to our proposals are as follows:

(i) Heteromorphic (non-conserved) residues in this sequence are residues 52, 53, 55 and 57 (1). These form part of hypervariable region 2 (HV2) which runs from Glu46 to Ala57 (1). This heteromorphism may affect their probability of binding to CD4, a monomorphic protein.
The heteromorphic residues in HV2 are all found in the early part of the β1 "short" helix; the later part surrounding Trp61 is conserved;

(ii) Trp61 is believed to form a hydrophobic pocket with Val38, Phe40 and Phe47 on β-strands 3 and 4, which presumably helps stabilise the structure in this region (20). Trp147 in HLA-A2α2 forms an analogous hydrophobic pocket with Ile124, Leu126 and Trp133. All known MHC proteins, whether class I or II, have a conserved tryptophan at this point (23, 24). Indeed, conserved tryptophan residues which have key roles in forming hydrophobic bonds are a common feature of immunoglobulin domains (25);

(iii) Reference to the crystal structure of HLA-A2 reveals that the following residues in the α2 "short" helix have what Wiley and colleagues have regarded as the appropriate conformation ("facing outwards") for receptor (TcR, CD3 or CD4) binding: Gln141, His145 and Ala149 (19, 20). The equivalent residues in HLA-DQ are Leu55, Glu59 and Ser63, the last two of which are monomorphic (conserved). Glu59 and Ser63 are therefore more likely to be part of the CD4 binding site than the TcR binding site;

(iv) Leu52 and Leu55 are equivalent to Met138 and Leu141 in the murine MHC class I protein, H-2Kb, which have been shown experimentally to be important for T-cell recognition (26). Leu55 is therefore implicated by (iii) and (iv) but not by (i). For this reason, it is much more likely that Leu55 binds to the TcR rather than to CD4;

(v) Preceding the short helix, on β-strand 3, is the sequence RFDS proposed as a CD4 binding site or "adhesiotope" by Auffray and coworkers (27). Note, however, that Phe40 has also been proposed as forming part of the hydrophobic pocket with Trp61;

(vi) Fujii et al. (28) have reported that an alloreactive rat monoclonal antibody, HOK7, cross-reacts with a conserved site in MHC class II molecules of humans, rats and mice. This site, Glu52 [Leu52 in HLA-DQ] to Lys65, corresponds mostly to the "short" helix predicted here, and the authors concluded that it might have an important role in T-cell recognition.


(vii) Fujinami et al. (29) have identified an amino acid sequence from the IE-2 peptide of human cytomegalovirus, Leu-Gly-Arg-Pro-Asp-Glu, showing some resemblance to Leu53-Gly54-Leu55-Pro56-Ala57-Ala58-Glu59 of HLA-DQ. This observation may help to explain how cytomegalovirus infection contributes to graft rejection.

(viii) Ala57 has been shown to be one of the residues associated with a predisposition to insulin-dependent diabetes mellitus (30). Gly45 (Glu45 in some alleles) also appears to make an important structural contribution in DQβ, with the proposal that residues 45-57 interact with a receptor or antigen (30).

Homologies between gp120 and HLA-DQ

When the amino acid sequences of various class II MHC alleles and the gp120 sequences from different isolates were compared, the best match was achieved when a particular allele of HLA-DQ (1) was matched with the gp120 from the WMJ1 strain of HIV-1 (2). The sequence of gp120 [WMJ1] from Cys415 to Ser451 shows partial homology with the sequence Cys15 to Ser63 of HLA-DQβ1:

HLA-DQβ1, 15-63: CYFTNGTERVRLVSRSIYNREEIVRFDSDVGEFRAVTLLGLPAAEYWNS
gp120 [WMJ1], 415-451: CSSNITGLLLTRDGGNSSSREEIFRPG G GNMR DNWRS

Some points to note:

(i) Arg446 is homologous with Arg48 if it can be assumed that the potential β-turn Gly441-Gly442-Gly443-Asn444 forms a truncated corner compared with Ser42-Asp43-Val44-Gly45 of HLA-DQβ1. A β-bend such as GGGN, although exposed, would be expected to exhibit much less antigenicity than SDVG.

(ii) Further along the sequence it is possible to match Trp449, Ser451 and Lys455 of gp120 with Trp61, Ser63 and Lys71 of HLA-DQβ1 only if the stretch Ala49-Tyr60 from HLA-DQβ1 is deleted, with Asp447-Asn448 possibly forming part of a β-bend. This "missing" stretch corresponds closely with most of the "short" helix of HLA-DQβ1 and the hypervariable region, HV2, mentioned earlier, which may in fact be a TcR binding epitope.
(iii) The potentially glycosylated Asn418 of gp120 maps close to the similarly glycosylated Asn19 of HLA-DQβ1 in this scheme. Asn19 is believed to form part of the β-bend linking β-strands 1 and 2.

(iv) Phe438, Pro440, Met445 and Trp449 of gp120 could form a hydrophobic pocket in a similar manner to that described for Val38, Phe47 and Trp61 of HLA-DQβ1.

(v) An insertion mutation in the vicinity of the hypervariable region of gp120, Asn430-Ser434, was not deleterious to CD4 binding (14), so this region is unlikely to be part of the binding site. Also, there is virtually no homology with β-strand 2 of HLA-DQ. This area is unlikely to be part of a CD4 binding site since it lies buried under the two α-helices, and it is therefore not likely to be conserved during HIV replication.

(vi) The above method of sequence comparison shows that gp120 has residues equivalent to Glu59, Trp61 and Ser63 in the later part of the β1 "short" helix of HLA-DQ. Glu59 and Ser63 have already been discussed as potential CD4 binding residues. The region around Trp449 in gp120 has previously been shown to be a neutralising epitope ("458-484") (31) and important for binding to CD4, as demonstrated by the insertion mutant pIIIenv473 (14).

(vii) The strongly homologous REEIVR/REEIFR peptides are situated at a similar distance from cysteine residues and N-glycosylation sites. REEIVR is found on a loop between β-strands 2 and 3, close to the "gap" between the ends of the two α-helices, and leads on to RFDS, proposed as an "adhesiotope" (27). The first two residues in the sequence are not conserved in gp120s from different strains of HIV-1, although they are always hydrophilic. Glu436 is the first residue in the highly conserved region of gp120 which continues to the C-terminal and beyond the gp160 cleavage site (2). When the REEIVR sequence of HLA-DQ is considered, the non-conserved residues in various MHC class II alleles are the first, fourth and fifth, although the last two of these are always hydrophobic. Thus, Glu36/436 and Arg39/439 are conserved in all gp120s from different HIV-1 strains and in all alleles of different MHC class II HLAs. The two may form an internal salt bridge. The fifth residue (Val38) forms part of the putative hydrophobic pocket mentioned previously.

Consideration of points (i)-(vii) suggests that although there is only partial homology between the two sequences, the residues common to both are predominantly those which appear to have either an important structural or binding role in HLA-DQ. Comparisons of the DNA sequences do not yield any information that is not already apparent from the amino acid sequences:

HLA-DQβ1, Cys15-Ser63

(1):

 C   Y   F   T   N   G   T   E   R   V   R   L   V   S
TGC TAC TTC ACC AAC CGG ACA GAG CGC GTG CGT CTT GTG AGC
 R   S   I   Y   N   R   E   E   I   V   R   F   D   S   D
AGA AGC ATC TAT AAC CGA GAA GAG ATC GTG CGC TTC GAC AGC GAC
 V   G   E   F   R   A   V   T   L   L   G   L   P   A   A   E
GTG GGG GAG TTC CGG GCG GTG ACG CTG CTG GGG CTG CCT GCC GCC GAG
 Y   W   N   S
TAC TGG AAC AGC

gp120 [WMJ1], Cys415-Ser451 (2):

 C   S   S   N   I   T   G   L   L   L   T   R   D   G
TGT TCA TCA AAT ATT ACA GGG CTG CTA TTA ACA AGA GAT GGT
 G   N   S   S   S   R   E   E   I   F   R   P   G   G
GGT AAT AGC AGC AGC AGG GAA GAG ATC TTC AGA CCT GGA GGA
 G   N   M   R   D   N   W   R   S
GGA GAT ATG AGG GAC AAT TGG AGA AGT

HIV-2

The sequence following the cysteine nearest the C-terminal (Cys415) in the gp120 from HIV-1 is largely conserved (2). By contrast, the equivalent region of gp120 deduced from the genome sequence of HIV-2 bears very little resemblance to the HIV-1 sequence (32):

HIV-1: CSSNITGLLLTRDGGNSSSREEIFRPGGGNMRDNWRSLYKYKVVKIEPLGVAPT
HIV-2: CNSTVTSIIANIDWQNNNQTNIT_FSAEVAELYRLELGDYKLVEITPIGFAPTKE

Recent work indicates that HIV-2, which appears to be much less virulent, does not bind as strongly to CD4 as does HIV-1 (5). In particular, it should be noted that the sequence DNWRS from HIV-1 is lacking in HIV-2.

Conclusions

The sequence comparisons alone are too weak to adequately support the proposition that gp120 from HIV-1 may have incorporated part of the HLA-DQβ gene coding for the CD4 binding site. However, the case is considerably strengthened by the results of experiments which defined the areas of the two proteins that are important for binding to CD4. In particular, the work of Kowalski and coworkers (14) using insertion and deletion mutants, especially pIIIenv473, established that an epitope of gp120 in the vicinity of Asn444 was important for binding. Provided the conformation of HLA-DQβ1 in the vicinity of Trp61 (EYWNS) is similar to that surrounding Trp147 in HLA-A2α2 (HKWEA), Glu59 and Ser63 of HLA-DQβ1 are candidates for receptor binding on the basis that they are homologues of His145 and Ala149 in HLA-A2α2. Since this part of HLA-DQβ1 is non-variable, they would therefore be candidates for CD4 binding rather than TcR binding. Close to Asn444 in gp120 is the sequence DNWRS, which could have a similar function to EYWNS in HLA-DQβ1, with Asp447 and Ser451 binding to CD4. This could account for the much weaker binding of HIV-2 to CD4 (5). Other homologous sections, such as REEIFR, may have a key structural role in maintaining the conformation of gp120 in this region.

References

1. J.M. BOSS and J.L. STROMINGER Proc. Natl. Acad. Sci. USA 81 5199-5203 (1984)
2. B.R. STARCICH, B.H. HAHN, G.M. SHAW, P.D. McNEELY, S. MODROW, H. WOLF, E.S. PARKS, W.P. PARKS, S.F. JOSEPHS, R.C. GALLO and F. WONG-STAAL Cell 45 637-648 (1986)
3. J.M. McCUNE, L.B. RABIN, M.B. FEINBERG, M. LIEBERMAN, J.C. KOSEK, G.R. REYES and I.L. WEISSMAN Cell 53 55-67 (1988)
4. K.A. NICHOLSON Science 231 382-385 (1986)
5. P.R. CLAPHAM, J.N. WEBER, D. WHITBY, K. McINTOSH, A.G. DALGLEISH, P.J. MADDON, K.C. DEEN, R.W. SWEET and R.A. WEISS Nature 337 368-370 (1989)
6. C. DeLISI and J.A. BERZOVSKY Proc. Natl. Acad. Sci. USA 82 7048-7052 (1985)
7. C. DOYLE and J.L. STROMINGER Nature 330 256-259 (1987)
8. A.P. FIELDS, D.P. BEDNARIK, A. HESS and W.S. MAY Nature 333 278-280 (1988)
9. H. KORNFELD, W.W. CRUIKSHANK, S.W. PYLE, J.S. BERMAN and D.M. CENTER Nature 335 445-448 (1988)
10. F. EMMRICH Immunol. Today 9 296-299 (1988)
11. Q.J.A. SATTENTAU, A.G. DALGLEISH, R.A. WEISS and P.C.L. BEVERLEY Science 234 1120-1123 (1986)
12. L.A. LASKY, G. NAKAMURA, D.H. SMITH, C. FENNIE, C. SHIMASAKI, E. PATZER, P. BERMAN, T. GREGORY and D.J. CAPON Cell 50 975-985 (1987)
13. K.B. CEASE, H. MARGALIT, J.L. CORNETTE, S.D. PUTNEY, W.G. ROBEY, C. OUYANG, H.Z. STREICHER, P.J. FISCHINGER, R.C. GALLO, C. DeLISI and J.A. BERZOVSKY Proc. Natl. Acad. Sci. USA 84 4249-4253 (1987)
14. M. KOWALSKI, J. POTZ, L. BASIRIPOUR, T. DORFMAN, W.C. GOH, E. TERWILLIGER, A. DAYTON, C. ROSEN, W. HASELTINE and J. SODROSKI Science 237 1351-1355 (1987)
15. D.K. FERRIS, D. LITTMAN and W.L. FARRAR Lymphocyte Activation and Differentiation - Fundamental Aspects, eds. J.C. MANI and J. DORMAND pp. 445-448, Walter de Gruyter, Berlin and New York (1988)
16. S.M. SCHNITTMAN, H.C. LANE, J. ROTH, A. BURROWS, T.M. FOLKS, J.H. KEHRL, S. KOENIG, P. BERMAN and A.S. FAUCI J. Immunol. 141 4181-4186 (1988)
17. J.F. KAUFMAN, C. AUFFRAY, A.J. KORMAN, D.A. SHACKELFORD and J.L. STROMINGER Cell 36 1-13 (1984)
18. P.J. BJORKMAN, M.A. SAPER, B. SAMRAOUI, W.S. BENNETT, J.L. STROMINGER and D.C. WILEY Nature 329 506-512 (1987)
19. P.J. BJORKMAN, M.A. SAPER, B. SAMRAOUI, W.S. BENNETT, J.L. STROMINGER and D.C. WILEY Nature 329 512-518 (1987)
20. J.H. BROWN, T. JARDETZKY, M.A. SAPER, B. SAMRAOUI, P.J. BJORKMAN and D.C. WILEY Nature 332 845-850 (1988)
21. P. MARRACK and J. KAPPLER Science 238 1073-1079 (1987)
22. M.M. DAVIS and P.J. BJORKMAN Nature 334 395-402 (1988)
23. J. KLEIN and F. FIGUEROA Immunol. Today 7 41-44 (1986)
24. F. FIGUEROA and J. KLEIN Immunol. Today 7 78-81 (1986)
25. D. BEALE and A. FEINSTEIN Q. Rev. Biophys. 9 135-180 (1976)
26. P. AJITKUMAR, S.S. GEIER, K.V. KESARI, F. BORRIELLO, M. NAKAGAWA, J.A. BLUESTONE, M.A. SAPER, D.C. WILEY and S.G. NATHENSON Cell 54 47-56 (1988)
27. F. MAZEROLLES, A. DURANDY, D. PIATIER-TONNEAU, D. CHARON, L. MONTAGNIER, C. AUFFRAY and A. FISCHER Cell 55 497-504 (1988)
28. H. FUJII, K. OASAWARA, I. IWABUCHI, K. MIZUNO, Y. MATSUNO, T. NIIYAMA, K. ONOE, Y. SHIOKAWA, T. NATORI and M. AIZAWA Transplant. Proc. 19 3000-3003 (1987)
29. R.S. FUJINAMI, J.A. NELSON, L. WALKER and M.B.A. OLDSTONE J. Virol. 62 100-105 (1988)
30. W.W. KWOK, C. LOTSHAW, E.C.B. MILNER, N. KNITTER-JACK and G.P. NEPOM Proc. Natl. Acad. Sci. USA 86 1027-1030
31. D.D. HO, M.G. SARNGADHARAN, M.S. HIRSCH, R.T. SCHOOLEY, T.R. ROYA, R.C. KENNEDY, T.C. CHANH and V.I. SATO J. Virol. 61 2024-2028
32. M. GUYADER, M. EMERMAN, P. SONIGO, F. CLAVEL, L. MONTAGNIER and M. ALIZON Nature 326 662-669 (1987)
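The ungapped part of the gp120/HLA-DQβ1 comparison discussed under "Homologies between gp120 and HLA-DQ" can be checked with a short script. The following is a minimal sketch in Python, not the author's procedure: it compares only the first 25 positions (Cys15/Cys415 to Arg39/Arg439), where the text implies a constant numbering offset of 400, and it lists potential N-glycosylation sites under the assumption that a "potential" site means the usual Asn-X-Ser/Thr motif with X not proline. The function and variable names are hypothetical, and the gp120 string is the 415-451 segment with the alignment gaps removed.

```python
# Illustrative sketch only; names and the motif definition are assumptions.

HLA_DQB1 = "CYFTNGTERVRLVSRSIYNREEIVRFDSDVGEFRAVTLLGLPAAEYWNS"  # residues 15-63
GP120_WMJ1 = "CSSNITGLLLTRDGGNSSSREEIFRPGGGNMRDNWRS"  # residues 415-451, no gaps

HLA_START = 15
GP120_START = 415


def ungapped_identities(a, b, a_start, b_start, length):
    """Return (a_pos, b_pos, residue) for each identical residue found when
    the first `length` positions of a and b are compared without gaps."""
    hits = []
    for i in range(length):
        if a[i] == b[i]:
            hits.append((a_start + i, b_start + i, a[i]))
    return hits


def sequon_positions(seq, start):
    """Return residue numbers of Asn in Asn-X-Ser/Thr motifs (X not Pro)."""
    positions = []
    for i in range(len(seq) - 2):
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST":
            positions.append(start + i)
    return positions


# Cys15/Cys415 through Arg39/Arg439: 25 positions, constant offset of 400.
identical = ungapped_identities(HLA_DQB1, GP120_WMJ1, HLA_START, GP120_START, 25)
hla_sequons = sequon_positions(HLA_DQB1, HLA_START)
gp120_sequons = sequon_positions(GP120_WMJ1, GP120_START)

if __name__ == "__main__":
    for hla_pos, gp_pos, res in identical:
        print(f"{res}{hla_pos} / {res}{gp_pos}")
    print("HLA-DQb1 Asn in N-X-S/T motifs:", hla_sequons)
    print("gp120 Asn in N-X-S/T motifs:", gp120_sequons)
```

Run on the two segments as printed in this paper, the sketch finds the identities at the shared cysteine and within REEIVR/REEIFR, and it flags Asn19 and Asn418 as motif asparagines. The gapped region after Arg39/Arg439 is deliberately left out, because the exact gap placement cannot be read from the alignment as reproduced here.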