Article
Molecular and Physicochemical Factors Governing Solubility of the HIV gp41 Ectodomain
Fadia Manssour-Triedo,1 Sara Crespillo,1 Bertrand Morel,1 Salvador Casares,1 Pedro L. Mateo,1 Frank Notka,2 Marie G. Roger,3 Nicolas Mouz,3 Raphaelle El-Habib,4 and Francisco Conejero-Lara1,*
1Departamento de Química Física e Instituto de Biotecnología, Facultad de Ciencias, Universidad de Granada, Granada, Spain; 2Geneart AG, Regensburg, Germany; 3PX’Therapeutics, Grenoble, France; and 4Sanofi Pasteur S.A., Marcy l’Etoile, France
ABSTRACT The HIV gp41 ectodomain (e-gp41) is an attractive target for the development of vaccines and drugs against HIV because of its crucial role in viral fusion to the host cell. However, because of the high insolubility of e-gp41, most biophysical and structural analyses have relied on the production of truncated versions removing the loop region of gp41 or the utilization of nonphysiological solubilizing conditions. The loop region of gp41 is also known as principal immunodominant domain (PID) because of its high immunogenicity, and it is essential for gp41-mediated HIV fusion. In this study we identify the aggregation-prone regions of the amino acid sequence of the PID and engineer a highly soluble mutant that preserves the trimeric structure of the wild-type e-gp41 under physiological pH. Furthermore, using a reverse mutagenesis approach, we analyze the role of mutated amino acids upon the physicochemical factors that govern solubility of e-gp41. On this basis, we propose a molecular model for e-gp41 self-association, which can guide the production of soluble e-gp41 mutants for future biophysical analyses and biotechnological applications.
INTRODUCTION
The HIV envelope glycoprotein (Env) promotes viral infection by mediating the fusion of the viral membrane with the host-cell membrane (1). Env is a trimer of heterodimers of two noncovalently associated subunits, gp120 and gp41, which originate from the proteolytic cleavage of the precursor protein gp160 (2). The transmembrane subunit gp41 is anchored to the viral membrane and drives membrane fusion. It consists of an extracellular domain (ectodomain), a transmembrane segment, and an intracytoplasmic tail. The gp41 ectodomain (e-gp41) contains a hydrophobic N-terminal fusion peptide (FP), an N-terminal heptad repeat (NHR), a disulfide-bridged loop region, a C-terminal heptad repeat (CHR), and a membrane-proximal external region (MPER). The current model of HIV infection suggests that binding of gp120 to CD4 and a co-receptor, such as CCR5 or CXCR4, triggers a conformational change of gp41. As a result, the fusion peptide at the N-terminus of gp41 is exposed and penetrates into the host-cell membrane. This is followed by a large conformational change from an unstable “prefusion” or extended state, in which the NHR and CHR regions are not associated, to a highly stable “postfusion” state, in which the NHR and CHR regions are integrated into a coiled-coil trimer of helical hairpins (3,4). It is believed that the large energy gain attained with this conformational change serves to bring the two membranes into close proximity, facilitating their fusion (5).
The loop connecting the NHR and CHR regions of e-gp41 is usually known as the principal immunodominant domain (PID) due to its high immunogenicity (6). The PID has been shown to be essential for effective HIV infection. Alanine-scanning mutagenesis analysis of the gp41 PID loop showed a prominent role for several conserved residues in virus infectivity, due to either disruption of processing and viral incorporation of the Env protein or increased dissociation of gp120 (7). Moreover, the study of peptides corresponding to the oxidized and reduced forms of the disulfide bond of the PID loop provides evidence that this region and its conserved cysteines play a role in gp41 membrane hemifusion, acting as a hinge between its opened and closed conformations (8). Removal of the disulfide bond or deletion of the loop also results in nonfunctional Env (9). However, mutagenesis analysis of the disulfide bond and surrounding residues indicated that the PID loop does not contribute to the extremely high thermal stability of the gp41 postfusion conformation, suggesting that it does not contribute to the forces driving gp41-mediated membrane fusion (10).
Submitted January 22, 2016, and accepted for publication July 20, 2016.
*Correspondence: [email protected]
Editor: Michele Vendruscolo
http://dx.doi.org/10.1016/j.bpj.2016.07.022
© 2016 Biophysical Society.
Biophysical Journal 111, 700–709, August 23, 2016
Detailed structural and biophysical analyses of e-gp41 including the PID have been largely hampered by the high insolubility of the recombinant constructs at physiological pH (11,12). Precipitation under physiological conditions may also affect the immune recognition, stability, and efficacy of vaccine candidates based on gp41 ectodomains, and their aggregation may obscure important epitopes of the immunogens. This low solubility also interferes with the development of diagnostic applications and immunoassays, production processes of recombinant proteins, and formulation of gp41-based vaccines. Efforts have been made to produce soluble recombinant forms of gp41 fused to glutathione S-transferase (GST) (13), chaperone modules (14), or trimerization tags (15). In vivo at physiological pH, gp41 aggregates have also been related to the acquisition of HIV-associated dementia (11). Aggregation of e-gp41 results from intermolecular interactions involving the PID (11), although the physicochemical factors governing these interactions remain unclear. To investigate the molecular reasons for the insolubility of e-gp41, we computationally analyzed the aggregation propensity of the e-gp41 sequence and engineered a mutant e-gp41 protein with several mutations located in the PID loop. The mutant e-gp41 is highly soluble at physiological pH and spontaneously acquires the trimeric conformation and overall structural properties of the wild-type (WT) e-gp41. Moreover, reversing individual mutations or subgroups of them has provided a more detailed understanding of the molecular basis of e-gp41 insolubility. These e-gp41 mutants may facilitate the direct investigation under physiological conditions of the structural and biophysical properties of the full-length gp41 ectodomain and could also serve as a basis for the development of soluble gp41-based immunogens.
MATERIALS AND METHODS
The DNA encoding the e-gp41 protein sequences was synthesized by Geneart (Regensburg, Germany) and inserted into a pM1800 vector provided by Sanofi Pasteur (Marcy l’Etoile, France). To facilitate purification by Ni-Sepharose affinity chromatography, the protein sequences were histidine tagged at the C terminus with the sequence GGGGSHHHHHH. For protein expression, Escherichia coli (E. coli) BLR(DE3) cells (Novagen, reference: 69053) were transformed with the plasmids and cultured in Luria-Bertani (LB) medium in the presence of 30 µg·mL⁻¹ of kanamycin at 37 °C. Expression was induced with isopropyl β-D-1-thiogalactopyranoside (IPTG) at a culture optical density of 0.6 at 600 nm, and cells were cultured overnight. Cells were collected by centrifugation, resuspended in lysis buffer (50 mM Tris pH 8.0 containing 500 mM NaCl, 2 mM MgCl2, 5 mM β-mercaptoethanol, and EDTA-free protease inhibitor), and lysed by a freeze/thaw cycle followed by three sonication cycles. The cell lysates were incubated at 4 °C with 2.5 U/mL of benzonase for 1 h, and then the soluble and insoluble fractions were separated by ultracentrifugation. The soluble fraction was directly dialyzed to purify the protein by Ni-Sepharose affinity chromatography as described below. To purify the protein from the insoluble fraction, the inclusion bodies (IBs) were resuspended using an extruder in 50 mM Tris buffer pH 8.0, 500 mM NaCl, 5 mM glycine, 5 mM β-mercaptoethanol, and
8 M urea. The homogenate was then subjected to three cycles of sonication and centrifuged at 30,000 rpm for 30 min. The solubilized protein was purified with a Ni-Sepharose Fast Flow column (Amersham GE Healthcare, Buckinghamshire, UK), previously equilibrated in 50 mM Tris buffer pH 8.0, 5 mM β-mercaptoethanol, and 500 mM NaCl. The equilibration buffer additionally contained 8 M urea and 1 mM glycine in the case of protein obtained from IBs. The protein was eluted using a gradient of imidazole from 0 to 1 M in the same buffer. The pooled protein fractions were extensively dialyzed against 50 mM Tris pH 8.5, 300 mM NaCl, and 1 mM β-mercaptoethanol to remove urea and imidazole. The proteins were then concentrated and further purified on a HiLoad 26/60 Superdex 200 size-exclusion chromatography (SEC) column (Amersham GE Healthcare). Sample concentrations were determined by UV absorption measurement at 280 nm. Protein purity and identity were assessed using SDS-PAGE and ESI-TOF mass spectrometry, respectively.
The solubility of the proteins was evaluated at pH 7.4 in 50 mM sodium phosphate buffer. A previously measured volume of stock protein solution was centrifuged for 30 min to remove any insoluble material, and its protein concentration was measured spectrophotometrically. The solution was then extensively dialyzed at 4 °C for 24 h. After the dialysis, the solution was again centrifuged to remove precipitated protein, and its volume and concentration were measured again. Immediately afterward, the solution was subjected to several concentration steps using centrifugal ultrafiltration (Millipore, Darmstadt, Germany; Amicon Ultra 4 Centrifugal Filters, 3 kDa). After each concentration step, the retained solution was homogenized and centrifuged to remove precipitated protein, and its volume and concentration were measured.
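The mass balance behind this solubility protocol can be sketched in a few lines of Python. The sketch assumes that the percentage of soluble protein is the ratio of protein mass (concentration times volume) remaining in solution after a step to the mass in the clarified stock solution. The function names and the numbers in the usage example are hypothetical and are not data from this study.

```python
def percent_soluble(c0_mg_ml, v0_ml, c_mg_ml, v_ml):
    """Percentage of the initial soluble protein mass still in solution.

    Assumption: protein mass = concentration x volume, with both values
    measured after centrifugation to remove precipitated protein.
    """
    initial_mass = c0_mg_ml * v0_ml
    if initial_mass <= 0:
        raise ValueError("initial concentration and volume must be positive")
    return 100.0 * (c_mg_ml * v_ml) / initial_mass


def volume_reduction_factor(v0_ml, v_ml):
    """Factor by which the sample volume was reduced during concentration."""
    if v_ml <= 0:
        raise ValueError("final volume must be positive")
    return v0_ml / v_ml


# Hypothetical example: 10 mL at 1.0 mg/mL concentrated to 2 mL at 4.5 mg/mL.
recovered = percent_soluble(1.0, 10.0, 4.5, 2.0)
factor = volume_reduction_factor(10.0, 2.0)
print(f"{recovered:.0f}% soluble after a {factor:.0f}-fold volume reduction")
```

Comparing mutants at the same volume-reduction factor, as done in this work, keeps the recovered percentages directly comparable between proteins.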
For comparative purposes, the solubility of each mutant was evaluated as the percentage of soluble protein remaining when the initial volume was reduced fivefold during the concentration process.
The molecular size of the proteins was measured at 25 °C by dynamic light scattering (DLS) using a DynaPro MS-X instrument (Wyatt, Santa Barbara, CA). Before the measurements, the sample solutions were centrifuged for 30 min at 14,000 rpm in a bench microcentrifuge. The measurements were made in a 30 µL quartz sample cuvette. Sets of DLS data were acquired by averaging 50 measurements with an acquisition time of 10 s each. The intensity autocorrelation curves were analyzed with the Dynamics V6 software to obtain the hydrodynamic radius distributions.
Analytical SEC experiments were performed using a Superdex-200 10/300GL column (Amersham GE Healthcare) equilibrated in 50 mM phosphate buffer, 100 mM NaCl pH 7.4. Purified protein (at 20 µM) was loaded onto the column and eluted at a flow rate of 0.4 mL/min, monitoring UV absorption at 280 nm. The molecular mass of the protein was estimated from a calibration curve obtained with the following standards: lysozyme (14.3 kDa), β-lactoglobulin (35 kDa), bovine serum albumin (monomer, 67 kDa, and dimer, 134 kDa), and catalase (250 kDa).
Circular dichroism (CD) measurements were performed with a Jasco J-715 spectropolarimeter (Tokyo, Japan) equipped with a thermostatted cell holder. Measurements of the far-UV CD spectra (260–200 nm) were made with a 1 mm path length quartz cuvette at a protein concentration of ~20 µM. Spectra were recorded at a scan rate of 100 nm/min, 1 nm step resolution, 1 s response, and 1 nm bandwidth. The resulting spectrum was usually the average of five scans. Near-UV CD spectra (350–250 nm) were measured at a protein concentration of ~40 µM using a 5 mm cuvette and were typically the average of 20 scans.
Each spectrum was corrected by baseline subtraction using the blank spectrum obtained with the buffer, and finally the CD signal was normalized to molar ellipticity ([θ], in deg·dmol⁻¹·cm²). The percentage of α-helical structure was estimated from the far-UV CD spectra as described elsewhere (16).
The thermal stability of the e-gp41 proteins was characterized by differential scanning calorimetry (DSC) using a VP-DSC microcalorimeter (Microcal, Northampton, MA). Scans were run from 5 to 125 °C at a scan rate of 90 °C·h⁻¹. Protein concentration was typically 40 µM. Baselines recorded with buffer in both cells were obtained before each experiment and subtracted from the experimental thermograms obtained with the samples. Reheating runs were carried out to determine the reversibility of the thermal
denaturation, with exactly the same parameters used for the main scan. After correction for the dynamic response of the instrument, the partial molar heat capacity (Cp) was calculated from the experimental DSC thermograms using the instrument’s software.
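The baseline subtraction and normalization of CD signals to molar ellipticity described in this section can be sketched as follows. This is a minimal illustration that assumes the widely used conversion [θ] = θ(mdeg) / (10 · c · l), with c in mol/L and l in cm, which gives units of deg·cm²·dmol⁻¹. The function names and the numbers in the usage example are hypothetical, and the instrument software may apply the conversion differently.

```python
def subtract_baseline(sample_mdeg, blank_mdeg):
    """Point-by-point subtraction of the buffer (blank) spectrum."""
    if len(sample_mdeg) != len(blank_mdeg):
        raise ValueError("sample and blank spectra must have the same length")
    return [s - b for s, b in zip(sample_mdeg, blank_mdeg)]


def molar_ellipticity(theta_mdeg, conc_molar, path_cm):
    """Convert an observed ellipticity (mdeg) to molar ellipticity.

    Assumed conversion: [theta] = theta_mdeg / (10 * c * l),
    with c in mol/L and l in cm; result in deg cm^2 dmol^-1.
    """
    if conc_molar <= 0 or path_cm <= 0:
        raise ValueError("concentration and path length must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm)


# Hypothetical two-point spectrum, 20 micromolar protein, 1 mm (0.1 cm) cuvette.
corrected = subtract_baseline([-41.0, -30.5], [-1.0, -0.5])
normalized = [molar_ellipticity(t, 20e-6, 0.1) for t in corrected]
print(normalized)
```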
RESULTS AND DISCUSSION
Design of mutations solubilizing e-gp41
To obtain a soluble recombinant protein, we selected most of the e-gp41 sequence (residues 24–157; Fig. S1 a in the Supporting Material), as previously studied elsewhere (12). To investigate the basis of insolubility of e-gp41 under physiological conditions and identify sequence regions mediating its aggregation, we analyzed the e-gp41 sequence using three different algorithms for sequence-based prediction of aggregation propensity available as web servers, i.e., TANGO (17), ZYGGREGATOR (18), and AGGRESCAN (19). High scores were obtained for several sequence regions depending on the specific algorithm. Sequence stretches coinciding in at least two of the three methods were found in four regions: 56–60 (QLTVW) in the NHR region, 82–86 (LGIWG) and 92–96 (ICTTA) in the PID loop, and 129–132 (SLIH) in the CHR region (Fig. 1). The two high-aggregation-propensity regions identified in the PID loop were in good
agreement with the PID involvement in e-gp41 insolubility as previously proposed (11). Three major physicochemical factors have been proposed as crucial in determining the aggregation propensity of polypeptide chains, i.e., hydrophobicity, β-sheet propensity, and net charge (20). An analysis of the HIV e-gp41 model structure (PDB: 1IF3), created by homology modeling using the NMR structure of the SIV e-gp41 (21), indicates an almost complete absence of charged residues in the PID (except for K90 at the tip of the loop) and an abundance of exposed hydrophobic residues. Particularly prominent is the high solvent exposure of L81, W85, L91, I92, A96, and W103. This high surface hydrophobicity may be an important factor in mediating intermolecular interactions. In fact, a previously described triple mutant e-gp41, including mutations L91K, I92K, and W103D, showed modest solubility (0.08 mg·mL⁻¹) at pH 7.5 (12). The PID also contains an abundance of residues with relatively high propensity to form β-sheet structure. This includes the hydrophobic I92 residue mentioned above and also threonine residues (T94 and T95). Finally, the theoretical isoelectric point of the e-gp41 sequence is ~6.7, and at pH 7.4 the net charge would be about −2.5. This relatively low net charge, together with the low abundance of charged residues, may favor PID-mediated aggregation. On this basis, several mutations were proposed to increase e-gp41 solubility: L81D, W85E, L91G, I92E, T95P, A96E, and W103D (Fig. 1). Consequently, six exposed hydrophobic side chains were replaced by small or charged side chains, two residues with high β-sheet propensity (Ile and Thr) were replaced by amino acids with lower β-sheet propensity, and five negatively charged residues were distributed along the PID region. To facilitate purification, a histidine tag with sequence GGGGSHHHHHH was added at the C terminus. The mutant was called e-gp41-7mut.
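The charge bookkeeping behind this design can be checked with a short Python sketch. It assumes integer side-chain charges at pH 7.4 (Asp and Glu as −1, Lys and Arg as +1, His and all other residues as neutral), which is a simplification of real titration behavior. The names `SIDE_CHAIN_CHARGE` and `charge_change` are illustrative and are not part of the original work; the mutation list is the one given in the text.

```python
import re

# Assumed integer side-chain charges near pH 7.4 (simplification).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def charge_change(mutation):
    """Charge change for a substitution written as <WT><position><new>, e.g. 'L81D'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation format: {mutation!r}")
    wild_type, _position, replacement = match.groups()
    return SIDE_CHAIN_CHARGE.get(replacement, 0) - SIDE_CHAIN_CHARGE.get(wild_type, 0)


# Mutations proposed in the text for e-gp41-7mut.
MUTATIONS = ["L81D", "W85E", "L91G", "I92E", "T95P", "A96E", "W103D"]

per_mutation = {m: charge_change(m) for m in MUTATIONS}
total_change = sum(per_mutation.values())
print(per_mutation)
print("net charge change:", total_change)
```

Under these assumptions, five of the seven substitutions each add one negative charge, in line with the five negatively charged residues mentioned in the text.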
This protein was produced and characterized in comparison with the WT e-gp41 (e-gp41-WT).
Production and biophysical analysis of e-gp41 proteins
FIGURE 1 Structural analysis of the e-gp41. (a) Ribbon representation of the trimer of hairpins model structure of HIV e-gp41 (21) shows aggregation-prone sequence regions (red) identified by computational algorithms (see text for details). (b) Backbone structure of the PID shows the mutated residues in e-gp41-7mut. The mutated side chains are labeled, represented with sticks and colored differently for each e-gp41 monomer. (c) Molecular surface of the PID highlights the mutated side chains using the same colors as in (b). All images have been generated using Swiss PDB viewer (36). To see this figure in color, go online.
The two proteins were cloned and expressed in E. coli. The e-gp41-7mut protein was expressed both as soluble protein and as inclusion bodies, whereas e-gp41-WT was expressed only in the insoluble fraction. The insoluble proteins could be easily solubilized from the inclusion bodies using buffer containing 8 M urea and refolded by dialysis. The soluble and insoluble fractions of e-gp41-7mut were processed, purified, and characterized separately and showed identical biophysical properties. Solubility tests were carried out at pH 7.4 in 50 mM sodium phosphate buffer as described in Materials and Methods. The stock protein solution was dialyzed against the buffer and further concentrated using centrifugal ultrafilters. At pH 7.4, the e-gp41-7mut protein was highly soluble
and could be concentrated to more than 10 mg·mL⁻¹ with a relatively small loss of protein. This high solubility contrasts with that of e-gp41-WT, which is fully insoluble under the same conditions. On the other hand, at pH 2.5 (50 mM glycine/HCl buffer), both proteins are highly soluble, as described elsewhere for e-gp41-WT (12).
The oligomerization state of the e-gp41 variants was measured by DLS. At pH 2.5, e-gp41-WT and e-gp41-7mut show hydrodynamic radii of 3.9 and 3.8 nm, respectively (Table S-I). A minor fraction of oligomeric particles of ~11 nm is also present in the solutions at acid pH. At pH 7.4, nearly 100% of e-gp41-7mut has a hydrodynamic radius of 3.7 nm. These radii are slightly larger than the hydrodynamic radius of 3.3 nm estimated using the program Hydropro (22) for a rigid-body model of the e-gp41 trimer (21). This small discrepancy could be attributed to the presence of the His-tags in the recombinant proteins and/or to the contribution of molecular flexibility. Analytical SEC experiments confirm the trimeric state of e-gp41-7mut (Fig. S2 a), consistent with the hydrodynamic radius observed by DLS (Fig. S2 c). Altogether, these molecular sizes are fully consistent with a trimeric state for both proteins. In addition, SDS-PAGE analysis of e-gp41-7mut under reducing and nonreducing conditions shows no significant cross-linking between monomers in the trimer (Fig. S2 b), suggesting native-like intrachain disulfide bonds in the PID of e-gp41-7mut.
The secondary structure was analyzed by far-UV CD (Fig. 2). The spectra of the two proteins are very similar and typical of proteins with a highly α-helical structure. The number of residues with α-helix structure is ~97–99, as estimated from the ellipticity at 222 nm (16). These values are fully consistent with the putative trimer-of-hairpins structure of e-gp41.
The near-UV CD spectra show a characteristic negative band with a minimum at 293 nm and a broad positive band around 275 nm (Fig. 2 b). This shape reports about the stacking of two tryptophan side chains from the CHR region (W117 and W120) into the conserved hydrophobic pocket made by the trimer of NHR helices in the trimer-of-hairpins conformation (23). Both WT and mutant e-gp41 show the same sharp negative band at 293 nm. Some differences are observed at lower wavelengths, either due to a higher proportion of oligomers in the WT protein at acid pH or to slight structural changes produced by the mutations at the PID. The CD spectra of e-gp41-7mut protein were similar at acid and physiological pH, indicating an identical structure. This structural analysis clearly shows that the solubilizing mutations included in the PID loop of the e-gp41-7mut protein do not alter the overall trimer-of-hairpins conformation of e-gp41. It has been reported previously that the high thermodynamic stability of the trimer of hairpins structure of e-gp41 is the main force driving membrane apposition and fusion during HIV infection. However, the PID does not appear to play an important role in the thermostability of
FIGURE 2 Structural analysis of e-gp41-WT and e-gp41-7mut by (a) Far-UV CD and (b) near-UV CD. The data have been normalized to molar ellipticity.
e-gp41 (10). To investigate the effect of the solubilizing mutations in e-gp41-7mut, its thermal unfolding was studied by DSC. The unfolding profile of e-gp41-WT at pH 2.5 is partially reversible in a second scan and relatively complex (Fig. 3), consistent with a previous report (12). The thermal profile shows a small exotherm at 80 °C, a major peak at ~106 °C, and a minor peak near 112 °C. The main transition is attributed to the unfolding of the six-helix bundle structure, whereas the latter peak was previously attributed to the unfolding of a minor population of protein molecules cross-linked by intermolecular disulfide bonds at the PID loop (12). Under the same conditions, the e-gp41-7mut protein also shows a main unfolding peak at 106 °C that is sharper than that of e-gp41-WT. No high-temperature peak is observed for the mutant, although there is a shoulder at ~100 °C, suggesting a destabilization of the PID structure by the mutations. The unfolding profile changes considerably on a second scan, indicating improper refolding after heating to 125 °C. Nevertheless, the two proteins have equal Tm for the major peak at pH 2.5, indicating that the stability of the six-helix bundle structure is not
compromised by the mutations, consistent with their structural similarity. At pH 7.4, the thermal unfolding of the e-gp41-7mut protein takes place at 114 °C and is fully irreversible, showing a heat capacity decrease after the denaturation peak, typical of aggregation at high temperature.
Design and production of reverse e-gp41 mutants
To understand in more detail the physicochemical reasons for the solubilizing effect of the mutations, a set of e-gp41 mutants was produced and characterized. Because each individual mutation changes, to a different extent, more than one physicochemical factor governing solubility (charge, β-sheet propensity, or hydrophobicity), an analysis using single mutants would not allow a proper exploration of these factors. Therefore, we used a different strategy to select the positions of mutations. Because e-gp41-7mut is highly soluble at physiological pH and, contrary to e-gp41-WT, can be easily characterized under this condition, we selected subsets of e-gp41-7mut mutations that were progressively reversed to the WT sequence. Three mutations were reversed individually: W85E and W103D, because of the large size and hydrophobicity of the tryptophan side chain, and T95P, which was introduced as a β-sheet breaker. Other groups of mutations were reversed gradually to make a total of eight mutants (Table 1). All the mutants were cloned, expressed in E. coli, and purified to homogeneity by a two-step procedure using Ni-His Trap affinity chromatography and size-exclusion chromatography, as described in Materials and Methods. Table S-II also summarizes some biophysical properties of the mutants, such as isoelectric point, net charge increment at pH 7.4, and hydrophobicity.
FIGURE 3 Thermal unfolding of e-gp41-WT and e-gp41-7mut monitored by DSC. First scans are represented by solid lines and second scans by dashed lines. The heat capacity curves have been displaced vertically for the sake of clarity.
Biophysical characterization of mutants
DLS measurements at pH 7.4 (50 mM phosphate buffer) indicate that all the mutants are trimeric, with hydrodynamic radii ranging between 3.6 and 3.9 nm (Table S-I). Their secondary structure is highly α-helical and similar to that of e-gp41-7mut, according to the far-UV CD spectra, with only small differences between the mutants (Fig. S3, a and b). The six-helix bundle structure is also maintained in the eight mutants, according to the conservation of the characteristic bands in the near-UV CD spectra (Fig. S3, c and d). Some spectral differences are observed for mutant e-gp41-H, possibly because of the presence of aggregates due to its lower solubility (see below). These data confirm that the different mutations at the PID do not affect the overall structure of e-gp41. Similarly, DSC analysis at pH 7.4 indicates that all mutants denature irreversibly with similarly high-temperature transitions, suggesting thermally induced aggregation (Fig. S4). At pH 2.5, the DSC thermal profiles share some features of either e-gp41-WT or e-gp41-7mut, depending on the mutant (Fig. S5). The Tm of the main unfolding transition changes very little between mutants and ranges between 105.5 °C and 106.4 °C, which indicates that the mutations do not alter the stability of the core structure of the ectodomain. However, the high-temperature shoulder observed for e-gp41-WT near 112 °C is attenuated by the mutations and is only slightly noticeable for the less-mutated e-gp41 variants E, F, G, and H. Instead, a shoulder
TABLE 1
Description of e-gp41 Mutants Studied in this Work

Mutant         Mutations(a)
e-gp41-7mut    L81D   W85E   L91G   I92D   T95P   A96E   W103D
e-gp41-A       L81D   –      L91G   I92D   T95P   A96E   W103D
e-gp41-B       L81D   W85E   L91G   I92D   T95P   A96E   –
e-gp41-C       L81D   W85E   L91G   I92D   –      A96E   W103D
e-gp41-D       L81D   W85E   L91G   –      –      –      W103D
e-gp41-E       –      W85E   –      I92D   T95P   A96E   –
e-gp41-F       L81D   –      L91G   –      –      –      W103D
e-gp41-G       –      –      –      I92D   T95P   A96E   –
e-gp41-H       –      W85E   –      –      –      –      –

(a) Dashes indicate WT residues at the corresponding position.
around 99 °C develops as more mutations are added. This suggests a gradual destabilization of the PID produced by the mutations.
Solubility tests were carried out as described in Materials and Methods. Briefly, a stock protein solution was dialyzed extensively against 50 mM phosphate buffer at pH 7.4 at 4 °C. After the dialysis, the protein concentration was reduced to different extents depending on the mutant because of precipitation. After centrifugation to remove the precipitated protein, the supernatants were subjected to progressive concentration using centrifugal ultrafilters, and the protein concentration was measured during the procedure to quantify the percentage of soluble protein recovered. The concentration procedure also led to some protein losses due to precipitation. The results are summarized in Fig. 4, which shows the percentage of recovered soluble protein versus the protein concentration achieved. In this graph, the less-soluble mutants appear at the bottom-left corner. The solubility of the different mutants varied widely and diminished gradually with the reversal of most mutations. The most soluble mutant is e-gp41-C, which is slightly more soluble than e-gp41-7mut. Other highly soluble mutants are mutants A, B, D, and E, although to different extents. In contrast, the solubility of mutants F, G, and H is considerably lower, especially that of e-gp41-H, which could be concentrated only up to 0.3 mg·mL⁻¹, with very poor recovery. These mutants contain a lower number of mutations compared with the WT protein.
A model for intermolecular association of e-gp41
The solubility changes produced by the mutations can be quantified for each e-gp41 mutant as the percentage of soluble protein recovered during the concentration procedure
FIGURE 4 Percent of soluble protein recovered for each e-gp41 mutant as a function of concentration achieved during the concentration process of the solubility tests (see Materials and Methods for details). To see this figure in color, go online.
for the same factor of volume reduction. Using this solubility parameter, we could then explore how these values correlate with the changes in hydrophobicity, charge, and β-sheet propensity exerted by the mutations (Fig. 5). Some moderate correlations may be noticed with the changes in net negative charge and hydrophobicity, whereas only a weak correlation exists with β-sheet propensity. The correlation does not improve much when these three physicochemical properties are combined as described by Chiti et al. (20) for the prediction of aggregation rates of peptides and proteins. In fact, there seem to be two groups of mutants showing separate correlations between solubility and intrinsic physicochemical properties. This indicates that the changes in the intrinsic aggregation propensity of the polypeptide chain cannot fully account for the observed solubility changes, and that specific interactions must also influence the solubility of the mutants. As stated above, intermolecular interactions mediated by the PID have been previously proposed in the high-molecular-weight aggregates of e-gp41 (11). A comparison of the solubility values between different mutants could allow us to extract the impact of individual mutations, or groups of them, on disrupting these intermolecular interactions. For instance, the ratio between the solubility factors of e-gp41-7mut and e-gp41-A reports on the effect of mutating residue W85 to E in the absence of the rest of the WT amino acids. This results in an “intrinsic” solubility ratio of 1.05, which accounts for the changes in the intrinsic physicochemical properties produced by this mutation. In contrast, the same W85E mutation in the presence of I92, T95, and A96 results in a solubility ratio of 4.8, as derived from a comparison between e-gp41-D and e-gp41-F.
Therefore, dividing this ratio by the intrinsic solubility ratio of the W85E mutation gives a “corrected” solubility ratio of 4.5, which could be attributed to the specific intermolecular interactions involving W85, I92, T95, and A96. Similarly, mutating W85 in the presence of L81, L91, and W103 (mutants E/G) increases the solubility by a ratio of 5.3, which gives a corrected ratio of 5.0 for the contribution of the interactions involving these four residues. A similar analysis can be made for other mutations, which has allowed us to estimate the corrected contributions of specific interactions between different sets of WT residues. Table 2 shows the solubility ratios between different e-gp41 mutants classified in two sets. The first set corresponds to mutations that produced small solubility increases attributable to the changes in the intrinsic aggregation propensity. The second set includes those mutations that disrupt specific intermolecular interactions. Corrected ratios were calculated by dividing each solubility ratio by the product of the intrinsic solubility ratios corresponding to the mutated residues. As an example, the corrected ratio for the B/H mutants is obtained by dividing its solubility ratio (19.9) by the product of the solubility ratios of B/E and 7mut/D (1.18 × 1.16). The higher a corrected ratio, the stronger would be the contribution of the involved residues to the intermolecular interactions in e-gp41.
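The correction described above is simple arithmetic and can be sketched in Python. The ratios below are the rounded values reported in Table 2; the dictionary keys, the function name `corrected_ratio`, and the grouping of mutations into keys are illustrative choices made for this sketch. Because the inputs are rounded, the results may differ slightly from the tabulated corrected ratios.

```python
from math import prod

# Intrinsic solubility ratios from Table 2, keyed by the mutated residues.
INTRINSIC_RATIO = {
    "W85E": 1.05,               # 7mut/A
    "L81D,L91G": 1.18,          # B/E
    "L81D,L91G,W103D": 1.50,    # 7mut/E
    "I92D,T95P,A96E": 1.16,     # 7mut/D
}


def corrected_ratio(solubility_ratio, mutated_groups):
    """Divide a solubility ratio by the product of the intrinsic ratios
    of the mutation groups that differ between the two compared mutants."""
    return solubility_ratio / prod(INTRINSIC_RATIO[g] for g in mutated_groups)


# Comparisons taken from Table 2: (solubility ratio, mutated groups).
comparisons = {
    "D/F": (4.76, ["W85E"]),
    "E/G": (5.26, ["W85E"]),
    "B/H": (19.9, ["L81D,L91G", "I92D,T95P,A96E"]),
}

for pair, (ratio, groups) in comparisons.items():
    print(f"{pair}: corrected ratio = {corrected_ratio(ratio, groups):.2f}")
```

Keeping the intrinsic ratios in one lookup table makes it explicit which comparison supplies each correction factor.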
FIGURE 5 Correlations between the solubility of e-gp41 mutants and the changes in intrinsic physicochemical properties of the polypeptide sequence: (a) hydrophobicity; (b) net charge; (c) β-sheet propensity; and (d) combined aggregation tendency. These properties have been calculated as in (20).
The data clearly indicate that individual WT residues or some sets of them have a small influence on solubility, fully attributable to their intrinsic physicochemical properties. In contrast, as more and more WT residues are present, their contribution to self-association of e-gp41 becomes reinforced cooperatively. For instance, L81, L91, and W103 do not appear to establish significant intermolecular interactions if the rest of the WT residues are mutated. However, the same residues contribute considerably to self-association if W85 is preserved, which indicates mutual interactions involving these residues. Fig. 6 illustrates a model of how two e-gp41 molecules may interact according to the observed effects of mutations. As described above, the PID surface is largely hydrophobic and has several cavities where prominent hydrophobic side chains can interact. For instance W103
TABLE 2 Contribution of e-gp41 Mutations to the Intrinsic Aggregation Propensity and to Specific Intermolecular Interactions between PID Residues
Compared Mutants  Mutated Residues                      Preserved WT Residues  Solubility Ratio  Corrected Ratio(a)
Changes in intrinsic aggregation propensity
7mut/A            W85E                                  –                      1.05              –
7mut/B            W103D                                 –                      1.27              –
B/E               L81D, L91G                            W103                   1.18              –
7mut/E            L81D, L91G, W103D                     –                      1.50              –
7mut/C            T95P                                  –                      0.96              –
C/D               I92D, A96E                            T95                    1.20              –
7mut/D            I92D, T95P, A96E                      –                      1.16              –
Changes in specific interactions
D/F               W85E                                  I92, T95, A96          4.76              4.53
7mut/F            W85E, I92D, T95P, A96E                –                      5.52              4.53
A/F               I92D, T95P, A96E                      W85                    5.25              4.53
7mut/G            L81D, W85E, L91G, W103D               –                      7.98              5.06
E/G               W85E                                  L81, L91, W103         5.26              5.01
B/G               L81D, L91G, W85E                      W103                   6.27              5.10
A/G               L81D, L91G, W103D                     W85                    7.59              5.06
7mut/H            L81D, L91G, I92D, T95P, A96E, W103D   –                      25.3              14.4
E/H               I92D, T95P, A96E                      L81, L91, W103         16.9              14.6
D/H               L81D, L91G, W103D                     I92, T95, A96          21.8              14.5
B/H               L81D, L91G, I92D, T95P, A96E          W103                   19.9              14.7
(a) Each solubility ratio was divided by the product of solubility ratios corresponding to the changes in intrinsic aggregation propensity produced by the mutated residues.
FIGURE 6 Model for the self-association of e-gp41. Molecular surfaces were calculated and represented using Swiss-PdbViewer (36). Exposed surfaces corresponding to the mutated residues are labeled and colored in blue and red for each e-gp41 moiety. The arrows indicate how prominent side chains may interact with hydrophobic pockets on the surface of the PID. To see this figure in color, go online.
could insert into a hydrophobic cavity created by the absence of a side chain at G89 and lined by several side chains including W85, C87, L91, and I92. Likewise, W85 could interact with a second hydrophobic cleft formed by the side chains of L81, A96, P98, A101, and W103. Stacking of the indole rings could play an important role in this interaction. Another mutual hydrophobic interaction could be established between L81 from one moiety and the pocket formed by G86 surrounded by I84, W85, I92, T94, and T95. In these intermolecular interactions W85 plays a central role and, in fact, the single mutation W85E provided a low, albeit significant, solubility at physiological pH for the e-gp41-H mutant, compared with the completely insoluble e-gp41-WT. It must be emphasized that this model for intermolecular association of e-gp41 may not be unique, and other interaction arrangements could be devised. Mutations may also modify the way intermolecular association occurs, because hydrophobic cavities are disrupted and negative charges are added by the mutations, possibly producing structural changes at the PID. Nevertheless, this model of intermolecular association is consistent with the aggregate morphology observed for SIV e-gp41 by electron microscopy (11).
The soluble e-gp41 constructs presented here may be of relevance for vaccine development. Increased solubility could facilitate characterization and development of vaccines based on these constructs as vehicles for epitope display. Although many HIV neutralizing epitopes are occluded in native functional Env spikes, polyclonal antibodies isolated from HIV-1-infected patients were shown to capture the virus via the gp41 PID (24), indicating that virions contain a mixture of native and nonnative Env forms, including exposed gp41 stumps (25). Although antibodies recognizing the highly immunogenic PID are considered nonneutralizing, it has recently been argued that antibody responses showing nonneutralizing activity in vitro may exert protection by antibody-dependent cell-mediated cytotoxicity and/or other nonneutralizing humoral effector functions (26). Some of the mutated residues in e-gp41-7mut are highly conserved, such as W85, T95, and W103 (Fig. S1 b), and therefore these mutations would not be appropriate for the design of soluble immunogens targeting PID epitopes. However, mutations at less conserved positions (L81, L91, A96), as well as other mutations devised from these results, may confer sufficient solubility while preserving immunogenicity of the PID.
In addition, MPER epitopes recognized by the potent neutralizing 2F5 and 4E10 mAbs appear to be conformational and functionally related to the N-terminal region of gp41 (27,28), suggesting that a hairpin conformation of gp41 may be required for a complete display of these neutralizing epitopes. Vaccination of mice with a human cell line expressing a full gp41 ectodomain in a 6-helix bundle (6-HB) conformation, including the PID and the transmembrane region, has been shown to elicit potent and broad neutralizing antibodies that recognize a conformational epitope constituted by the beginning of the NHR region and the MPER region (29). In addition, hybrid proteins combining fragments of HIV-1 gp41 and p15E of porcine endogenous retrovirus in a hairpin conformation were recognized by 2F5 and 4E10 Nabs with higher affinities than the isolated MPER epitopes (30). Vaccination of rats, guinea pigs, and a goat with these hybrid constructs induced an immune response against similar conformational epitopes (TLTVQARQL, at the beginning of NHR, and ELDKWA, located in the MPER), although this antibody response was weakly neutralizing, possibly because of the absence of a membrane environment, which is essential for the neutralization activity of Nabs targeting the MPER epitopes (31). These studies suggest that a trimeric hairpin conformation of gp41 in the context of the membrane is relevant in antibody-mediated HIV neutralization. The presence of the PID in these constructs may be crucial for correctly displaying neutralizing epitopes in gp41.
On the other hand, the interactions between gp41 and the membrane are still not fully understood. Recent findings have shown that gp41 6-HB trimeric complexes dissociate into monomers in a lipid-bound context (32), suggesting that not only the fusion peptide and the MPER but also the NHR and CHR regions may play a role in the initial destabilization of the membranes to promote hemifusion. Moreover, the actual relevance of the trimeric oligomerization state of gp41 in the formation of the fusion pore has also been questioned recently (33).
For all these reasons, a great deal of biophysical and structural characterization is still required using full-length e-gp41 constructs under near-physiological conditions, as well as in the context of the membrane environment. The results presented in this study will help to obtain these soluble constructs and may also facilitate the development of future e-gp41-based vaccine candidates.
CONCLUSIONS
This work shows how an analysis based on simple physicochemical properties of the polypeptide sequence can successfully guide the modification of the solubility of the HIV gp41 ectodomain by mutagenesis. In addition, a more detailed analysis of reverse mutations has revealed specific interactions in the intermolecular association of e-gp41 that cannot be predicted by sequence analysis, giving rise to methods to obtain soluble e-gp41 mutants for biophysical analyses and potential biotechnological applications.
SUPPORTING MATERIAL
Five figures and two tables are available at http://www.biophysj.org/biophysj/supplemental/S0006-3495(16)30588-4.
AUTHOR CONTRIBUTIONS
F.M.-T., B.M., S. Crespillo, and M.G.R. performed research; B.M. and S. Casares analyzed data; N.M. and F.N. contributed to analytic tools; P.L.M., R.E.-H., and F.C.-L. designed research; and F.C.-L. wrote the article.
ACKNOWLEDGMENTS
This research was funded by grant HEALTH-F3-2007-201038 (EuroNeut41) from the European Union Seventh Framework Programme and co-financed by the Andalusia Regional Government (Incentivos-BOJA-51-2008).
SUPPORTING CITATIONS
References (34,35) appear in the Supporting Material.
REFERENCES
1. Eckert, D. M., and P. S. Kim. 2001. Mechanisms of viral membrane fusion and its inhibition. Annu. Rev. Biochem. 70:777–810.
2. Moulard, M., and E. Decroly. 2000. Maturation of HIV envelope glycoprotein precursors by cellular endoproteases. Biochim. Biophys. Acta. 1469:121–132.
3. Weissenhorn, W., A. Dessen, ..., D. C. Wiley. 1997. Atomic structure of the ectodomain from HIV-1 gp41. Nature. 387:426–430.
4. Lu, M., S. C. Blacklow, and P. S. Kim. 1995. A trimeric structural domain of the HIV-1 transmembrane glycoprotein. Nat. Struct. Biol. 2:1075–1082.
5. Chan, D. C., and P. S. Kim. 1998. HIV entry and its inhibition. Cell. 93:681–684.
6. Gnann, J. W., Jr., J. A. Nelson, and M. B. A. Oldstone. 1987. Fine mapping of an immunodominant domain in the transmembrane glycoprotein of human immunodeficiency virus. J. Virol. 61:2639–2641.
7. Jacobs, A., J. Sen, ..., M. Caffrey. 2005. Alanine scanning mutants of the HIV gp41 loop. J. Biol. Chem. 280:27284–27288.
8. Ashkenazi, A., M. Viard, ..., Y. Shai. 2011. Viral envelope protein folding and membrane hemifusion are enhanced by the conserved loop region of HIV-1 gp41. FASEB J. 25:2156–2166.
9. Sen, J., A. Jacobs, ..., M. Caffrey. 2007. The disulfide loop of gp41 is critical to the furin recognition site of HIV gp160. Protein Sci. 16:1236–1241.
10. Jacobs, A., C. Simon, and M. Caffrey. 2006. Thermostability of the HIV gp41 wild-type and loop mutations. Protein Pept. Lett. 13:477–480.
11. Caffrey, M., D. T. Braddock, ..., G. M. Clore. 2000. Biophysical characterization of gp41 aggregates suggests a model for the molecular mechanism of HIV-associated neurological damage and dementia. J. Biol. Chem. 275:19877–19882.
12. Krell, T., F. Greco, ..., R. El Habib. 2004. HIV-1 gp41 and gp160 are hyperthermostable proteins in a mesophilic environment. Characterization of gp41 mutants. Eur. J. Biochem. 271:1566–1579.
13. Penn-Nicholson, A., D. P. Han, ..., M. W. Cho. 2008. Assessment of antibody responses against gp41 in HIV-1-infected patients using soluble gp41 fusion proteins and peptides derived from M group consensus envelope. Virology. 372:442–456.
14. Scholz, C., P. Schaarschmidt, ..., F. X. Schmid. 2005. Functional solubilization of aggregation-prone HIV envelope proteins by covalent fusion with chaperone modules. J. Mol. Biol. 345:1229–1241.
15. Gao, G., L. Wieczorek, ..., V. B. Rao. 2013. Designing a soluble near full-length HIV-1 gp41 trimer. J. Biol. Chem. 288:234–246.
16. Luo, P., and R. L. Baldwin. 1997. Mechanism of helix induction by trifluoroethanol: a framework for extrapolating the helix-forming properties of peptides from trifluoroethanol/water mixtures back to water. Biochemistry. 36:8413–8421.
17. Fernandez-Escamilla, A. M., F. Rousseau, ..., L. Serrano. 2004. Prediction of sequence-dependent and mutational effects on the aggregation of peptides and proteins. Nat. Biotechnol. 22:1302–1306.
18. Tartaglia, G. G., and M. Vendruscolo. 2008. The Zyggregator method for predicting protein aggregation propensities. Chem. Soc. Rev. 37:1395–1401.
19. Conchillo-Solé, O., N. S. de Groot, ..., S. Ventura. 2007. AGGRESCAN: a server for the prediction and evaluation of "hot spots" of aggregation in polypeptides. BMC Bioinformatics. 8:65.
20. Chiti, F., M. Stefani, ..., C. M. Dobson. 2003. Rationalization of the effects of mutations on peptide and protein aggregation rates. Nature. 424:805–808.
21. Caffrey, M. 2001. Model for the structure of the HIV gp41 ectodomain: insight into the intermolecular interactions of the gp41 loop. Biochim. Biophys. Acta. 1536:116–122.
22. García De La Torre, J., M. L. Huertas, and B. Carrasco. 2000. Calculation of hydrodynamic properties of globular proteins from their atomic-level structure. Biophys. J. 78:719–730.
23. Peisajovich, S. G., L. Blank, ..., Y. Shai. 2003. On the interaction between gp41 and membranes: the immunodominant loop stabilizes gp41 helical hairpin conformation. J. Mol. Biol. 326:1489–1501.
24. Burrer, R., S. Haessig-Einius, ..., C. Moog. 2005. Neutralizing as well as non-neutralizing polyclonal immunoglobulin (Ig)G from infected patients capture HIV-1 via antibodies directed against the principal immunodominant domain of gp41. Virology. 333:102–113.
25. Moore, P. L., E. T. Crooks, ..., J. M. Binley. 2006. Nature of nonfunctional envelope proteins on the surface of human immunodeficiency virus type 1. J. Virol. 80:2515–2528.
26. Excler, J. L., J. Ake, ..., S. A. Plotkin. 2014. Nonneutralizing functional antibodies: a new "old" paradigm for HIV vaccines. Clin. Vaccine Immunol. 21:1023–1036.
27. Bellamy-McIntyre, A. K., C. S. Lay, ..., P. Poumbourios. 2007. Functional links between the fusion peptide-proximal polar segment and membrane-proximal region of human immunodeficiency virus gp41 in distinct phases of membrane fusion. J. Biol. Chem. 282:23104–23116.
28. Fiebig, U., M. Schmolke, ..., J. Denner. 2009. Mode of interaction between the HIV-1-neutralizing monoclonal antibody 2F5 and its epitope. AIDS. 23:887–895.
29. Dawood, R., F. Benjelloun, ..., S. Paul. 2013. Generation of HIV-1 potent and broad neutralizing antibodies by immunization with postfusion HR1/HR2 complex. AIDS. 27:717–730.
30. Strasz, N., V. A. Morozov, ..., J. Denner. 2014. Immunization with hybrid proteins containing the membrane proximal external region of HIV-1. AIDS Res. Hum. Retroviruses. 30:498–508.
31. Alam, S. M., M. Morelli, ..., B. Chen. 2009. Role of HIV membrane in neutralization by two broadly neutralizing antibodies. Proc. Natl. Acad. Sci. USA. 106:20234–20239.
32. Roche, J., J. M. Louis, ..., A. Bax. 2014. Dissociation of the trimeric gp41 ectodomain at the lipid-water interface suggests an active role in HIV-1 Env-mediated membrane fusion. Proc. Natl. Acad. Sci. USA. 111:3425–3430.
33. Banerjee, K., and D. P. Weliky. 2014. Folded monomers and hexamers of the ectodomain of the HIV gp41 membrane fusion protein: potential roles in fusion and synergy between the fusion peptide, hairpin, and membrane-proximal external region. Biochemistry. 53:7184–7198.
34. Wilkins, M. R., E. Gasteiger, ..., D. F. Hochstrasser. 1999. Protein identification and analysis tools in the ExPASy server. Methods Mol. Biol. 112:531–552.
35. Chenna, R., H. Sugawara, ..., J. D. Thompson. 2003. Multiple sequence alignment with the Clustal series of programs. Nucleic Acids Res. 31:3497–3500.
36. Guex, N., and M. C. Peitsch. 1997. SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis. 18:2714–2723.