Biophysical Journal Volume 100 May 2011 2253–2261
Role of Hydration in Collagen Recognition by Bacterial Adhesins
Luigi Vitagliano,†* Rita Berisio,† and Alfonso De Simone‡*
†Istituto di Biostrutture e Bioimmagini, Consiglio Nazionale delle Ricerche (IBB-CNR), Naples, Italy; and ‡Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
ABSTRACT Protein-protein recognition regulates the vast majority of physiological or pathological processes. We investigated the role of hydration in collagen recognition by bacterial adhesin CNA by means of first principle molecular-dynamics samplings. Our characterization of the hydration properties of the isolated partners highlights dewetting-prone areas on the surface of CNA that closely match the key regions involved in hydrophobic intermolecular interactions upon complex formation, suggesting that the hydration state of the ligand-free CNA predisposes the protein to the collagen recognition. Moreover, hydration maps of the CNA-collagen complex reveal the presence of a number of structured water molecules that mediate intermolecular interactions at the interface between the two proteins. These hydration sites feature long residence times, significant binding free energies, and a geometrical distribution that closely resembles the hydration pattern of the isolated collagen triple helix. These findings are striking evidence that CNA recognizes the collagen triple helix as a hydrated molecule. For this structural motif, the exposure of several unsatisfied backbone carbonyl groups results in a strong interplay with the solvent, which is shown to also play a role in collagen recognition.
INTRODUCTION Most protein molecules have been evolutionarily selected to have biological activity in aqueous environments. Not only does water play a key role in balancing the factors that define the structural properties of proteins in solution, it is also an active biological molecule that is involved in the majority of the processes that occur in cells (1–3). It is therefore increasingly recognized that the hydration shell of proteins should be considered an integral part of these macromolecules to fully account for the physical principles underlying their biology (1). However, defining water properties at biological interfaces is a complex task, and despite the large number of studies that have been performed to date, the overall picture remains fragmented. Few studies have approached this challenge experimentally due to the difficulty of targeting biological waters, whose properties are often overwhelmed by the bulk solvent (4). However, theoretical approaches, and molecular simulations in particular, have enhanced our understanding of the nature of protein hydration and its influence on structural stability (5), enzyme catalysis (6), protein-protein interactions (7), and protein-DNA interactions (8). Buried water molecules have been shown to influence the structural dynamics of proteins (9,10), whereas surface waters can play a crucial role in biomolecular interactions and the processes that lead to protein aggregation into amyloid fibrils (11,12). The mechanisms by which solvent mediates protein interactions are diverse and include both entropic and enthalpic contributions to the binding affinity (1,11,13–16). In some circumstances, macromolecular interactions can be established through water molecules at the interface of the
Submitted November 8, 2010, and accepted for publication March 9, 2011.
binding partners, as in the case of protein-DNA interactions (8). In other cases, the formation of large, dry interfaces between macromolecules implies that the hydration influences the molecular docking in an indirect way (i.e., by selecting interfaces with a higher propensity to desolvate (3)). In this study, we explored in detail the role of hydration in collagen recognition and binding by macromolecular receptors. Collagen is the most abundant protein in vertebrates (17) and is characterized by 1), a peculiar amino acid composition that is unusually rich in glycine, proline, and (2S,4R)-hydroxyproline (Hyp) residues; 2), a repetitive sequence pattern of the type Gly-X-Y; and 3), a characteristic structure in which three polyproline II chains wrap along a common axis (designated the triple helix motif) (18–21). Although the collagen triple helix is particularly suitable as a rigid scaffold and is often employed for mechanical purposes, there is mounting evidence that this structural motif is also involved in signaling and recognition processes (17,22,23). For instance, human collagens are the target of adhesins from harmful bacterial pathogens (24). Collagen binding proteins are clustered in two general classes that recognize either specific collagen sequence motifs or the generic collagen consensus sequences Gly-Pro-Pro or Gly-Pro-Hyp (22). To date, three binding modes based on specific collagen sequences have been structurally characterized (25–27), whereas only one case with aspecific collagen sequence recognition, involving the collagen-binding protein from Staphylococcus aureus (collagen binding adhesin (CNA)) and the peptide [(GPO)4GPRGRT(GPO)4]3 (24), has been characterized by x-ray crystallography. Although this structure has provided fundamental clues regarding recognition by a bacterial protein of a widespread collagen sequence motif, its limited resolution (3.29 Å) has prevented the localization of solvent molecules
Editor: Nathan Andrew Baker. © 2011 by the Biophysical Society 0006-3495/11/05/2253/9 $2.00
doi: 10.1016/j.bpj.2011.03.033
within the complex. Because the collagen triple helix is associated with a unique hydration pattern that has been shown to be crucial for its stability (28), this missing information may be essential for unveiling fundamental aspects of collagen recognition by CNA. Our goal in this study was to address the role of hydration in collagen binding and recognition by CNA. To that end, we conducted extensive molecular-dynamics (MD) simulations of the CNA and the collagen-like peptide [(GPO)4GPRGRT(GPO)4]3 in both unbound and bound states to gain insights into water structure and dynamics at protein surfaces and protein-protein interfaces. Our analyses of the isolated partnering molecules showed good agreement with available experimental data, thus supporting the reliability of the employed approach for describing the hydration of both globular and fibrous peptides and proteins. The results highlight the role played by hydration in the processes of collagen recognition and binding by CNA, as well as the importance of macromolecular solvation for identifying regions that are prone to be involved in protein-protein interactions.
MATERIALS AND METHODS
Models and simulation parameters
MD samplings were performed on protein structures generated on the basis of the structural information provided by the complex between CNA and [(GPO)4GPRGRT(GPO)4]3 (PDB code 2F6A). In this crystal structure, the collagen-like peptide binds two independent adhesin molecules (24). A set of simulations was carried out on the unbound CNA (ApoCNA) and the collagen-like peptide [(GPO)4GPRGRT(GPO)4]3. To avoid any bias on the hydration status of the protein derived from the MD analyses, crystallographic water molecules were removed from the starting models. The analysis of the collagen-like model in its unbound state was conducted by considering the polypeptide [(GPO)4GPRGRT(GPO)4]3, whereas the starting coordinates of ApoCNA were derived from the crystal structure of the protein determined at 1.95 Å (PDB code 2F68) (24). A missing fragment from this structure, namely, the linker connecting the N-terminal (N1) and C-terminal (N2) domains that resulted in partial disorder in the crystal state (residues 166–169), was generated with the use of molecular modeling algorithms. It was previously demonstrated that the structure of CNA is stabilized by the presence of an isopeptide bond formed by the side chains Lys-176 and Asn-293 (29). This bond was explicitly parameterized in the simulated CNA models. We generated the model of the bound state (CNA-coll) composed of CNA and the collagen-like peptide [(GPO)4GPRGRT(GPO)4]3 by using the crystal structure of the complex as a template (PDB code 2F6A) (24). Because the complex had very low resolution (3.29 Å), we replaced the CNA structure with the high-resolution model of ApoCNA (PDB code 2F68), which allowed us to start the minimizations and simulations with a better stereochemical model. The starting structure of the complex was also modified in its triple helix portion. In the crystal structure of the complex between CNA and [(GPO)4GPRGRT(GPO)4]3, Hyp residues were modeled assuming the (2S,4S) diastereoisomer. Because peptides containing (2S,4S)Hyp are unable to fold into triple helices (18,30–33), we rebuilt the [(GPO)4GPRGRT(GPO)4]3 peptide by assuming for Hyp the naturally occurring (2S,4R) diastereoisomer, which is known to stabilize the triple helix. The resulting model of the complex was optimized with a series of energy minimizations both in vacuo and in water. We carried out an independent simulation of the complex by considering a shorter collagen model [(GPO)4GPRGRT]3 (CNA-coll-short), corresponding to the triple helix region that directly interacts with the adhesin. The CNA-coll and CNA-coll-short simulations provided consistent results with respect to both the hydration analyses and the conformational properties of the complex; therefore, we refer to the former throughout the text. However, we computed the binding free energies of hydration (vide infra) on the CNA-coll-short model because it provided faster convergence.
Simulation parameters
We performed MD simulations with the GROMACS package (34) by using the all-atom AMBER99sb force field (35,36) in combination with the TIP4P-ew explicit water model (37). The latter was specifically optimized for MD simulations performed with the use of Ewald summation methods, and showed extremely good agreement with experimental data over a range of temperatures. It was previously reported that the combination of AMBER99sb and TIP4P-ew provides a very good method for sampling protein and peptide conformations, as shown by cross-validation with NMR scalar couplings and relaxation (38). The simulations were carried out in the NPT ensemble with periodic boundary conditions at a constant temperature of 300 K. A rectangular box was used to accommodate the protein/peptide, water molecules, and ions. The systems included 16,971, 7339, and 17,614 water molecules (totaling 72,495, 30,496, and 76,204 atoms, respectively) for ApoCNA, collagen-like peptide, and the complex (CNA-coll), respectively. The CNA-coll-short simulation was composed of 16,703 water molecules (totaling 72,195 atoms). Additional information on the simulation parameters is available in the Supporting Material.
Hydration analysis
We analyzed the MD samplings to compute solvent density maps whose maxima represent the MD hydration sites (MDHS) (39–41). We then characterized the dynamics of waters in the hydration sites by extracting the residence times from the water autocorrelation functions. Further details on the calculation of solvent density maps and water residence times are reported in the Supporting Material.
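As a rough illustration of the two quantities introduced in the hydration analysis, the following Python sketch bins water-oxygen positions onto a grid to obtain a relative solvent density, picks out high-density cells as candidate hydration sites, and estimates a residence time from an occupancy autocorrelation function. It is only a sketch under stated assumptions: the grid spacing, the normalization by the mean over visited cells (used here as a stand-in for the bulk value), the 1/e decay criterion, and all function names are hypothetical choices for illustration, and the procedure actually used is the one described in the Supporting Material.

```python
import math
from collections import Counter

def density_map(frames, spacing=1.0):
    """Bin water-oxygen positions onto a cubic grid.

    frames: list of frames, each a list of (x, y, z) positions in Å.
    Returns {grid_cell: density relative to the mean over visited cells}.
    The mean over visited cells is an assumed proxy for the bulk density.
    """
    counts = Counter()
    for frame in frames:
        for x, y, z in frame:
            cell = (math.floor(x / spacing),
                    math.floor(y / spacing),
                    math.floor(z / spacing))
            counts[cell] += 1
    mean = sum(counts.values()) / len(counts)
    return {cell: n / mean for cell, n in counts.items()}

def hydration_sites(density, threshold=2.0):
    """Cells whose relative density exceeds the threshold, densest first."""
    sites = [cell for cell, rho in density.items() if rho > threshold]
    return sorted(sites, key=lambda cell: -density[cell])

def occupancy_autocorrelation(occupied, max_lag):
    """C(t) = <p(0) p(t)> / <p(0) p(0)> for a 0/1 occupancy series of one water."""
    n = len(occupied)
    c = []
    for lag in range(max_lag):
        pairs = [occupied[i] * occupied[i + lag] for i in range(n - lag)]
        c.append(sum(pairs) / len(pairs))
    return [value / c[0] for value in c]

def residence_time(occupied, dt, max_lag):
    """Time at which C(t) first drops below 1/e (a simple, assumed estimator)."""
    c = occupancy_autocorrelation(occupied, max_lag)
    for lag, value in enumerate(c):
        if value < 1.0 / math.e:
            return lag * dt
    return None  # C(t) did not decay within max_lag
```

In practice, `occupied` would be built per water molecule by testing whether its oxygen lies within some cutoff of a hydration site in each frame, and the resulting curves would be averaged over all waters that visit the site.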
Biophysical Journal 100(9) 2253–2261
Free energy of water binding: double decoupling method
To compute the free energy of binding of water molecules from the interior of proteins, we used the double-decoupling method (42). In this method, the free energy is obtained by computing the contributions of water removal from the protein cavity and from the bulk solvent to the gas phase. Although the removal of a water molecule from the bulk solution is quite straightforward, the calculation of the standard free energies of transferring a water from the complex to the gas phase is somewhat complex. This is particularly true for the part of the integral where water is weakly coupled to the protein, because such an evaluation would require the water molecule to explore the entire simulation box to achieve a convergent simulation. To circumvent this problem, we forced the water to occupy the binding site by using harmonic restraints while decoupling the interactions with the protein. This procedure corresponds to transferring the water into an ideal gas constrained in the binding site. More details regarding this procedure can be found elsewhere (11,42). The calculations were performed in both forward and reverse directions by thermodynamic integration (see the Supporting Material for details). The discrepancy between the free-energy values obtained from the forward and reverse calculations provides an estimate of the error.
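The bookkeeping behind this scheme can be sketched in a few lines of Python. The example below integrates hypothetical ⟨∂H/∂λ⟩ samples with the trapezoidal rule, takes the forward/reverse discrepancy as the error estimate, and combines the two decoupling legs into a binding free energy. The λ grid, the sample values, and the function names are assumptions made for illustration, and the restraint and standard-state corrections that a full double-decoupling calculation requires are deliberately left out.

```python
def trapezoid(xs, ys):
    """Trapezoidal rule for samples ys taken at ordered points xs."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i])
               for i in range(len(xs) - 1))

def ti_free_energy(lambdas, dhdl_forward, dhdl_reverse):
    """Thermodynamic integration of <dH/dlambda> over the same lambda grid.

    Returns (mean of forward and reverse estimates, |forward - reverse|),
    the second value serving as the error estimate.
    """
    dg_forward = trapezoid(lambdas, dhdl_forward)
    dg_reverse = trapezoid(lambdas, dhdl_reverse)
    return 0.5 * (dg_forward + dg_reverse), abs(dg_forward - dg_reverse)

def binding_free_energy(dg_site_to_gas, dg_bulk_to_gas):
    """Double decoupling without restraint/standard-state corrections:
    bulk -> site equals (bulk -> gas) minus (site -> gas)."""
    return dg_bulk_to_gas - dg_site_to_gas

# Hypothetical numbers, for illustration only (arbitrary energy units).
lambdas = [0.0, 0.5, 1.0]
dg_site, err_site = ti_free_energy(lambdas, [10.0, 6.0, 2.0], [10.0, 7.0, 2.0])
dg_bind = binding_free_energy(dg_site, dg_bulk_to_gas=5.0)
```

With this sign convention a negative `dg_bind` means the water is more stable in the site than in the bulk solvent.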
Identification of dehydrated patches on protein surfaces We conducted an amino-acidic hydration shells (aaHS) analysis by assigning to each residue a hydration shell composed of grid points from the
Molecular Basis of Collagen Binding
solvent density map located within 5 Å of the residue side-chain atoms. The residue solvation properties were thus averaged on the respective aaHS grid points. We computed the averages by excluding points that were not visited by waters during the MD simulation. In our analyses, we considered only residues that were located at the protein surface, which were selected on the basis of their solvent-accessible surface area. We used two indexes to describe the solvation properties of individual residues. The first index is computed as the mean value of the solvent density in its aaHS. The second index, designated the dewetting index, is computed as the fraction of grid points of the aaHS that present a lower solvent density than the bulk solution. We found an inverse, nonlinear correlation between the dewetting index and the average residue hydration (see Fig. S2 A). From this plot, we defined a dewetting index lower bound of 0.69 to identify dewetting-prone residues on the protein surface. It is worth noting that the dewetting index is independent of the surface extension of the residues, as suggested by the lack of correlation with the total number of grid points comprising the aaHS (Fig. S2 B).
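The two per-residue indexes can be written down compactly. The Python sketch below assumes that each residue's hydration shell is already available as a list of relative solvent densities (bulk = 1.0, with 0 marking grid points never visited by water) and that unvisited points are excluded from both indexes; the data layout, residue labels, and function names are hypothetical.

```python
def residue_hydration(shell_densities, bulk_density=1.0):
    """Return (mean density, dewetting index) for one residue's hydration shell.

    shell_densities: solvent density at each shell grid point, relative to bulk;
    points with zero density (never visited by water) are excluded.
    """
    visited = [rho for rho in shell_densities if rho > 0.0]
    if not visited:
        return None
    mean_density = sum(visited) / len(visited)
    below_bulk = sum(1 for rho in visited if rho < bulk_density)
    return mean_density, below_bulk / len(visited)

def dewetting_prone(shells, threshold=0.69):
    """Names of residues whose dewetting index reaches the threshold."""
    prone = []
    for name, densities in shells.items():
        result = residue_hydration(densities)
        if result is not None and result[1] >= threshold:
            prone.append(name)
    return prone

# Hypothetical shells for two surface residues (illustrative values only).
shells = {
    "res_A": [0.0, 0.5, 0.8, 1.5, 0.9],
    "res_B": [1.2, 1.5, 0.7, 2.0],
}
```

The default threshold of 0.69 is the lower bound quoted in the text; everything else in the example is made up to show the calculation.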
RESULTS We investigated the role of hydration in collagen recognition by CNA by means of first principle MD samplings with explicit titration of water molecules and protein atoms. The simulations were carried out on the unbound partners (ApoCNA and [(GPO)4GPRGRT(GPO)4]3) and the complex (CNA-coll and CNA-coll-short) with the use of the AMBER99sb force field (35,36) and the TIP4P-ew water model (37). To evaluate the accuracy of our computational approach, we tested the stability of the triple helix motif in the simulations of the collagen-like peptide and compared the calculated hydration patterns with those reported in the corresponding crystal structures.
Collagen-like peptide dynamics and structural stability
The fibrous nature of the collagen molecule precludes atomic-resolution investigations of full-length natural molecules; however, the use of peptide models that embed repetitive sequence motifs characteristic of collagen has been shown to be effective for unveiling the structural and thermodynamic properties of collagen molecules (18,43). The peptide (GPO)4GPRGRT(GPO)4, which assumes a stable triple-helical collagen-like structure due to the presence of several GPO triplets in its sequence (Fig. 1 A), is able to tightly bind CNA in solution (24). We assessed the dynamic properties of the isolated [(GPO)4GPRGRT(GPO)4]3 triple helix model in a 50 ns MD simulation by using the AMBER99sb force field (35,36) and TIP4P-ew (37) explicit water molecules. A direct indicator of the stability of the triple helix motif is the number of conserved main-chain hydrogen (H)-bonds along the trajectory (Fig. 1 B). These intermolecular H-bonds are peculiar to the triple helix motif and are established between the amide group of the Gly residues and carbonyl groups from complementary peptides that form the triple helix. Fig. 1 B suggests that the employed force field and simulation setup were able to maintain the initial H-bonding
FIGURE 1 Dynamics and stability of the collagen-like triple helix. (a) Structure of the starting conformation of the collagen triple helix motif. The model, composed of three peptides of sequence [(GPO)4GPRGRT(GPO)4]3, was derived from the coordinates of the x-ray structure 2F6A. The triple helix is divided into three regions of different amino acid compositions. This sequence is capped at the termini by the Gly-Pro-Hyp sequence (GPO) and contains the amino acid triplets GPR and GRT in the central part. (b) Evolution of total main-chain/main-chain H-bonds. (c) RMSD of the trajectory structures versus the x-ray model (Cα atoms only). (d) Distributions of Cα RMSD of the trajectory structures versus the x-ray model. The calculation was repeated separately for the N-terminal, central, and C-terminal regions of the peptide. (e) Triple helix bending angle of the trajectory structures. This angle was calculated from the centers of mass of the N-terminal, central, and C-terminal fragments of the peptide.
pattern of the structure, with an average number of 24.7 (± 0.78) native main-chain H-bonds. The simulated structures also show moderately low root mean-square deviations (RMSDs) (calculated on the Cα atoms) from the x-ray structure (average value of 2.27 ± 0.49 Å; Fig. 1 C). The RMSD values are smaller when they are separately computed from three consecutive segments of the peptide. Of these, RMSD values for the N- and C-terminal regions of the peptide (i.e., composed exclusively of GPO triplets) are on average smaller than those of the central region (Fig. 1 D), which in addition to imino acids contains Arg residues. These data are consistent with the presence of a principal motion involving a global bending angle around the central region of the peptide (Fig. 1 E), which implies larger RMSD values for both the whole molecule and the central part of the peptide. The increased flexibility of the amino-acid-rich region is in line with previous observations from MD analyses of other collagen-like polypeptides (28,44–46).
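The global bending angle used to describe this motion is defined from the centers of mass of the N-terminal, central, and C-terminal fragments. A minimal Python sketch of that geometry follows, assuming the angle is measured at the central fragment (so that a perfectly straight helix gives 180°); this reading of the definition, like the function names, is an assumption made for illustration.

```python
import math

def center_of_mass(coords, masses):
    """Mass-weighted mean position of a set of atoms."""
    total = sum(masses)
    return tuple(sum(m * c[axis] for m, c in zip(masses, coords)) / total
                 for axis in range(3))

def bending_angle(n_term_com, central_com, c_term_com):
    """Angle in degrees at the central center of mass between the vectors
    pointing to the N-terminal and C-terminal centers of mass."""
    v1 = [a - b for a, b in zip(n_term_com, central_com)]
    v2 = [a - b for a, b in zip(c_term_com, central_com)]
    dot = sum(a * b for a, b in zip(v1, v2))
    norm1 = math.sqrt(sum(a * a for a in v1))
    norm2 = math.sqrt(sum(a * a for a in v2))
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))  # guard against rounding
    return math.degrees(math.acos(cosine))
```

Evaluating `bending_angle` for every frame of a trajectory gives the kind of time series shown in Fig. 1 E.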
Collagen-like peptide in the isolated state: hydration analysis
We performed MD simulations of the [(GPO)4GPRGRT(GPO)4]3 peptide to characterize the hydration state of the triple helix in solution. The resulting solvent density maps highlight a number of strong peaks (MDHS; see Materials and Methods) mainly surrounding the carbonyl groups of the main chain and the hydroxyl groups of Hyp side chains (Fig. 2 A). The geometrical distribution of these sites and the formation of extended water bridges show an encouraging agreement with crystallographic data (47–49). The analysis of the water dynamics in the hydration sites indicates that although the MDHS exhibit high occupancy values, their residence times are quite short even though they are located in the first hydration shell of the molecule and form stable H-bonds with the peptide (Table 1). The simultaneous high occupancy and low residence time of MDHS surrounding the peptide is in agreement with the hopping model for triple helix hydration (50).
MD of ApoCNA
To assess the evolution of the ApoCNA structure in the simulation timescale (50 ns), we used a number of stereochemical parameters, which suggested that the gyration radius and the secondary structure elements of the starting crystallographic model were retained along the trajectories (Fig. S3). The RMSD profiles of the entire molecule and individual subunits (Fig. S3 C) highlight large-amplitude structural fluctuations that are due to the mutual orientation of the two subunits. Whereas the single domains present rather low and steady RMSD values (~1.1 Å), the whole molecule shows a large and irregular RMSD profile, indicating interdomain motions. However, the root mean-square fluctuation (RMSF) profile suggests a significant flexibility of the linker connecting domains N1 and N2 (Fig. 3, red line), which is in agreement with the disordered state of this fragment as found in the crystallographic structure
(24). This flexibility is then significantly reduced upon collagen binding (Fig. 3, black line).
Hydration of ApoCNA
The hydration analyses of ApoCNA showed several tightly bound waters located at the protein surface (Fig. 4 A). We found a good agreement between the calculated solvent density maps and available x-ray data (24), with 98 out of 166 crystallographic water molecules located within 1.4 Å (i.e., the conventional radius employed for water molecules) from the closest MDHS, or 128 out of 166 lying within 2 Å (Fig. 2 B and Fig. S4). We then analyzed the dynamics of waters in the second hydration shell. The resulting distribution of the residence times of H-bonds that formed between waters from the first and second hydration shells is higher and broader than that of H-bonds in the bulk solution (Fig. S5), suggesting a pairing effect between the first and second hydration layers. The calculated hydration map of ApoCNA shows a uniformly hydrated surface; however, a deeper analysis based on the aaHS (see Materials and Methods and Fig. S2) highlights clusters of surface residues that exhibit a significant degree of dewetting (Fig. 4, B–E). Of interest, some of the dewetting-prone patches correspond to surfaces that are involved in contacts between symmetry-related copies of the protein in the crystalline state (Fig. S6), which suggests that surface dewetting may be one of the factors that influence protein-protein association during crystallization. Some of the CNA dewetting-prone surfaces feature exposed aromatic side chains and large areas lacking MDHS (78.50 Å², 88.37 Å², and 94.98 Å² for Tyr-175, Phe-193, and Tyr-233, respectively, as estimated from the polygons connecting the surrounding hydration sites; Fig. 4, C–E, black dashes).
Of interest, such aromatic residues on the protein surface correspond to key amino acids involved in tight hydrophobic interactions that stabilize the complex CNA-coll, as suggested by both the crystallographic structure (24) and our simulations (Fig. 5). This correspondence indicates that the hydration state of ApoCNA predisposes the protein to engage in key hydrophobic interactions upon complex formation, which likely influences protein-protein recognition by relieving a substantial part of the desolvation free energy upon formation of large intermolecular surfaces (3,11,15,16,40).
MD of the CNA-collagen complex
FIGURE 2 Agreement between crystallographic water molecules and MDHS. (a) Collagen-like peptide hydration: computational (red maps) versus x-ray (green CPKs) water sites from the collagen structure (PDB code 1BKV). (b) Distribution of the atomic distance between calculated and x-ray (PDB code: 2F68) hydration sites in the ApoCNA hydration. The MDHS are superimposed on the x-ray structure by means of a local structural fit. The dashed line marks the conventional water radius (1.4 Å).
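The comparison summarized in Fig. 2 b amounts to a nearest-neighbor distance calculation between crystallographic waters and MDHS. A minimal Python sketch is given below, assuming both sets of sites are already expressed in the same reference frame (the local structural fit is not reproduced here); the coordinates and function names are hypothetical.

```python
import math

def nearest_distances(xray_waters, md_sites):
    """Distance from each crystallographic water to its closest MD hydration site."""
    return [min(math.dist(water, site) for site in md_sites)
            for water in xray_waters]

def fraction_within(distances, cutoff):
    """Fraction of distances that fall within the cutoff (in Å)."""
    return sum(1 for d in distances if d <= cutoff) / len(distances)

# Hypothetical coordinates in Å, for illustration only.
xray_waters = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
md_sites = [(1.0, 0.0, 0.0), (5.0, 1.5, 0.0)]
distances = nearest_distances(xray_waters, md_sites)
```

Applied to the counts reported for ApoCNA, the same fraction is 98/166 (about 59%) at the 1.4 Å cutoff and 128/166 (about 77%) at 2 Å.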
Our analysis of the stereochemical parameters used to assess the structural evolution of the system along the MD trajectories indicates that the bound-state CNA-coll is stable in the simulation timescale (50 ns). RMSD values from the x-ray model (24) of the individual N1 and N2 domains of CNA are moderately small; however, the overall RMSD values of the complex are significantly higher (Fig. S7). The principal factors that determine these high overall RMSD values
TABLE 1 Structured water molecules

          Collagen bound to CNA                                               Collagen
Water*    Residence times (ps)   Occupancy factors   ΔG of binding (kJ/mol)†   Residence times (ps)   Occupancy factors
W1        352.3                  0.73                0.69 (0.13)               44.3                   0.68
W2        1248.3                 0.91                1.72 (0.27)               189.3                  0.69
W3        1975.8                 0.69                1.38 (0.25)               68.7                   0.72
W4        542.3                  0.94                0.94 (0.12)               42.2                   0.63
W5        435.2                  0.83                1.01 (0.25)               180.7                  0.77
W6        954.3                  0.97                0.77 (0.12)               29.9                   0.66
W7        836.1                  0.90                0.82 (0.05)               8.1                    0.54
W8        2085.3                 0.86                1.88 (0.38)               242.3                  0.93
W9        845.2                  0.59                0.81 (0.13)               101.6                  0.50
W10       394.1                  0.56                0.71 (0.09)               33.2                   0.53
W11       2406.3                 0.83                1.77 (0.32)               241.5                  0.82
W12       1281.2                 0.91                1.12 (0.12)               5.2                    0.51
W13       422.2                  0.72                0.51 (0.08)               56.1                   0.67

*Numbering as in Fig. S8.
†Errors in parentheses are estimated as the difference between forward and reverse integrations.
are the variation of the relative orientation between the N1 and N2 domains and the fluctuations of the triple helix, which maintains the general features of the isolated collagen molecule with a global bending motion (Fig. S7 B) and a stable number of intermolecular main-chain H-bonds (Fig. S7 C). In addition to the low RMSD values associated with the individual N1 and N2 domains, the stability of the CNA molecule is also shown by the preservation of the secondary structure elements along the trajectory (Fig. S7 D). The distribution of bending angles of collagen is not perturbed upon binding (Fig. S7 E); however, the rates of interconversion of different bending states, estimated by means of the time autocorrelation function (Fig. S7 F), are affected such that the characteristic relaxation time increases from 65 ps in the free state to 201 ps in the bound state. This finding indicates that although the conformational properties of the collagen are unaffected upon binding, the triple helix has slower dynamics. Of note, despite the interdomain motion between the N1 and N2 domains, and the large bending fluctuations of the collagen, the hydrophobic interactions that stabilize the binding between CNA and the triple helix remained stable throughout the simulation (Fig. 5). The preservation of these hydrophobic contacts, which are formed by the side chains of Tyr-175, Phe-193, and Tyr-233 of CNA and proline residues of the triple helix, highlights the key role played by these interactions in stabilizing the complex.
FIGURE 3 Residue RMSFs of ApoCNA. The RMSF values were computed by considering the plateau region of the simulation (20–50 ns). A ribbon representation of the protein is drawn by means of a color scale ranging from red (for low RMSF values) to blue (for high RMSF values). The plot reports the RMSFs of CNA in the free form (red line) and when bound to collagen (black line).
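The characteristic relaxation times quoted for the bending motion (65 ps free, 201 ps bound) come from the time autocorrelation function of the bending angle. One simple way to extract such a time is sketched below in Python: the normalized autocorrelation of the fluctuations about the mean is computed directly, and the relaxation time is taken as the first lag at which it falls below 1/e. The 1/e criterion, the function names, and the synthetic input are assumptions made for illustration and may differ from the estimator used for Fig. S7 F.

```python
import math

def autocorrelation(series, max_lag):
    """Normalized autocorrelation of the fluctuations of a time series."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    variance = sum(d * d for d in dev) / n
    acf = []
    for lag in range(max_lag):
        cov = sum(dev[i] * dev[i + lag] for i in range(n - lag)) / (n - lag)
        acf.append(cov / variance)
    return acf

def relaxation_time(series, dt, max_lag):
    """First time at which the autocorrelation drops below 1/e, or None."""
    for lag, value in enumerate(autocorrelation(series, max_lag)):
        if value < 1.0 / math.e:
            return lag * dt
    return None

# Synthetic periodic signal standing in for a bending-angle trajectory.
angles = [160.0 + 10.0 * math.cos(2.0 * math.pi * i / 40.0) for i in range(400)]
tau = relaxation_time(angles, dt=2.0, max_lag=20)
```

For noisy trajectories, fitting an exponential to the early part of the autocorrelation function is usually a more robust alternative to a single threshold crossing.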
FIGURE 4 Solvent density map of adhesin. MDHS arising from the solvent density map are drawn by curve levels embedding regions with a water density higher than twice the bulk value. (a) Blue, yellow, and green ribbons are used for domains N1 and N2 and the connecting loop, respectively. The plot highlights the incomplete Ig fold of N1, which is completed by the C-terminal strand of the protein. Dewetting-prone residues are drawn as red sticks. (b) Key patches for CNA-collagen interaction. CNA is shown by ribbons (color code as in panel a). The molecular surface is drawn for domains N1 and N2. Panels c–e show a close-up view of the three patches (Tyr-175, Phe-193, and Tyr-233, respectively). Dashed lines mark the boundary of the dewetted regions by means of polygons connecting the surrounding hydration sites.
FIGURE 5 Preservation of the stacking interactions in the sampling. The top inset reports the evolution of the distances between the centers of mass of key aromatic residues from CNA and Pro residues from collagen. Black, red, and green lines are employed for Tyr-175, Phe-193, and Tyr-233, respectively. The CNA-coll complex is represented by cyan, yellow, and violet ribbons, which are used for domains N1 and N2 and the connecting loop, respectively. Collagen is shown by red traces connecting the Cα atoms. Side chains involved in tight hydrophobic interactions stabilizing the complex are represented by sticks.
Hydration of the CNA-collagen complex
Because the CNA-coll crystal structure was resolved at a limited resolution, it could not provide information on structured water molecules within the complex. However, the encouraging agreement between the calculated and x-ray hydration sites in the unbound ApoCNA and collagen-like peptide prompted us to investigate the hydration properties of the complex. The hydration maps calculated from MD samplings of CNA-coll revealed an intricate network of structured waters localized at the interface between the two proteins. Thirteen waters were employed to bridge the collagen and CNA groups by means of water-mediated H-bonds (Fig. 6). Very similar results were obtained in the simulation (CNA-coll-short) carried out by considering only the peptide moiety ([(GPO)4GPRGRT]3) that directly interacted with the adhesin. These water-mediated interactions were more abundant than the intermolecular interactions that were directly established between collagen and CNA. Most of these bridges are composed of a single water molecule (Fig. S8). In a few cases, two water molecules participated in the solvent-mediated protein-protein interactions. Whereas CNA essentially employs side chains to engage H-bonds with the bridging waters, collagen interacts by means of main-chain carbonyl groups or hydroxyproline side chains. Remarkably, the bridging water molecules at the protein-protein interface of the complex are conserved in the hydration pattern of the isolated collagen peptide (Fig. 6, B and C, and
FIGURE 6 Key waters at the interface between collagen and adhesin. (a) Top view of the complex. Cyan, yellow, and violet ribbons are used for domains N1 and N2 and the connecting loop, respectively. Collagen is shown by means of a red trace connecting Cα atoms. MD hydration sites are shown by green CPKs located in the peaks of the solvent density map. Side chains explicitly involved in the protein-water interactions are plotted by means of sticks. Panels b and c show a close-up view of a specific bridging water at the interface of the two proteins. (c) Hydration map of the site for the unbound collagen (red curve levels) and collagen bound to CNA (green curve levels).
Fig. S8), although they show different dynamics. The water residence times in the case of the isolated collagen are relatively short, which is typical of convex protein shapes (51), whereas in the case of the CNA-coll complex, the residence times in the corresponding hydration sites assume values comparable to those exhibited by partially or totally buried waters in the interior of globular proteins (Table 1). The free energy of binding of the bridging waters in the CNA-coll complex, which we calculated by using the double decoupling method of Hamelberg and McCammon (42), indicates an average stabilizing contribution of ~1 kcal/mol per hydration site (Table 1). This suggests that the waters play a key role in the stabilization of the complex.
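The trends described here can be checked directly against the values listed in Table 1. The short Python sketch below transcribes the bound-state free energies and the residence times of the thirteen bridging waters, then computes the mean free energy per site and the bound-to-isolated residence-time ratio for each site. Only the arithmetic is illustrated; the variable and function names are hypothetical, and the free energies are kept in the units printed in the table.

```python
# Values transcribed from Table 1 (W1 to W13).
dg_bound = [0.69, 1.72, 1.38, 0.94, 1.01, 0.77, 0.82,
            1.88, 0.81, 0.71, 1.77, 1.12, 0.51]
residence_bound_ps = [352.3, 1248.3, 1975.8, 542.3, 435.2, 954.3, 836.1,
                      2085.3, 845.2, 394.1, 2406.3, 1281.2, 422.2]
residence_isolated_ps = [44.3, 189.3, 68.7, 42.2, 180.7, 29.9, 8.1,
                         242.3, 101.6, 33.2, 241.5, 5.2, 56.1]

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def slowdown_factors(bound, isolated):
    """Ratio of bound to isolated residence time for each hydration site."""
    return [b / i for b, i in zip(bound, isolated)]

mean_dg = mean(dg_bound)  # close to 1 per hydration site
factors = slowdown_factors(residence_bound_ps, residence_isolated_ps)
```

Every site has a ratio above 2, which is the quantitative content of the statement that waters at the interface of the complex exchange far more slowly than those around the isolated triple helix.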
DISCUSSION A detailed characterization of the water structure and dynamics at biological interfaces is fundamental for a complete understanding of the processes that occur in
living organisms. The distribution of tightly hydrated patches and surfaces that are prone to dewetting is particularly important for the macromolecular recognition process because it allows the selection of complementary surfaces and modulates the binding affinity between molecules (3,52). Biological waters are therefore important not only for stabilizing the functional complexes but also for ensuring the ability of macromolecules to recognize their binding partners (52). This study illustrates the latter point very clearly by showing that dewetted patches in the unbound form of the collagen receptor CNA correspond to surfaces that are employed in hydrophobic protein-protein interfaces upon binding. The resulting intermolecular interactions that form upon binding by dewetted surfaces from the N2 domain of CNA and Pro residues from the collagen triple helix have a key role in stabilizing the complex, and remained stable in our MD simulations (Fig. 5). Thus, the hydration state can predispose the receptor to interact with collagen by relieving part of the free-energy cost associated with the desolvation of large surfaces. Dewetted hydrophobic patches on the ApoCNA also suggest that waters have an active role in mediating the recognition of collagen. Indeed, long-range hydrophobic forces have been elegantly connected to the strong correlation and coupling of the dipoles of the water molecules at the two hydrophobic surfaces (53). This is consistent with atomic force microscopy experiments showing that the energy of attraction between two hydrophobic monolayers deposited on mica, at 3 nm separation distance and at 25°C, is ~4 kBT/Å² (54). Whereas ApoCNA hydration predisposes the protein to collagen recognition by means of dewetting-prone regions, the CNA-collagen complex actively employs water molecules to bridge groups from the two partner molecules.
The interfacial hydration sites in the complex show elevated residence times and binding free energies that correspond to an overall stabilizing contribution of approximately 10 kcal/mol (i.e., by assuming that the free-energy contributions of individual molecules are additive). Of note, the geometrical distribution of water bridges across collagen and CNA is templated by the hydration pattern of the isolated collagen triple helix. Because the hydration pattern of collagen is an integral part of this structural motif (28,55), this finding suggests that CNA recognizes the triple helix as a hydrated scaffold by employing collagen hydration sites as anchor points for binding. This strategy recalls the aspecific protein-DNA binding that is also mediated by patterns of waters along the double-helix structural motif (56). Although DNA and collagen exhibit both structural and chemical differences, these two macromolecules also share some similarities, as both are characterized by regular hydration patterns arising from the three-dimensional arrangement of hydrophilic groups. These hydration patterns are somehow recognized by the respective protein partners and employed to mediate interactions that lead to binding.
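As a rough sense of scale for the energies discussed above, the following minimal Python sketch converts the quoted value of ~4 kBT at 25°C into kcal/mol and shows what the additivity assumption amounts to, namely a plain sum over interfacial hydration sites. The function names and the per-site values are illustrative assumptions and are not taken from this study; only the temperature, the ~4 kBT figure, and the additivity assumption come from the text.

```python
from typing import Sequence

# Exact SI values of the constants used for the unit conversion.
BOLTZMANN_J_PER_K = 1.380649e-23
AVOGADRO_PER_MOL = 6.02214076e23
JOULES_PER_KCAL = 4184.0


def kbt_in_kcal_per_mol(temperature_kelvin: float) -> float:
    """Thermal energy kB*T expressed per mole, in kcal/mol."""
    joules_per_mol = BOLTZMANN_J_PER_K * temperature_kelvin * AVOGADRO_PER_MOL
    return joules_per_mol / JOULES_PER_KCAL


def additive_contribution(site_free_energies: Sequence[float]) -> float:
    """Total contribution under the additivity assumption: a plain sum."""
    return sum(site_free_energies)


T_ROOM = 298.15  # 25 degrees C in kelvin

# Attraction energy quoted in the text: ~4 kB*T per square angstrom at 25 C,
# re-expressed here in kcal/mol per square angstrom.
afm_kcal_per_mol_per_A2 = 4.0 * kbt_in_kcal_per_mol(T_ROOM)

# HYPOTHETICAL per-site binding free energies in kcal/mol (negative means
# stabilizing). These are placeholders chosen for illustration only; they
# are NOT values reported in the study.
placeholder_sites = [-2.5, -2.0, -3.0, -2.5]
total_kcal_per_mol = additive_contribution(placeholder_sites)
total_in_kbt = total_kcal_per_mol / kbt_in_kcal_per_mol(T_ROOM)

print(f"4 kBT at 25 C = {afm_kcal_per_mol_per_A2:.2f} kcal/mol")
print(f"placeholder sum = {total_kcal_per_mol:.1f} kcal/mol "
      f"({total_in_kbt:.1f} kBT)")
```

The placeholder list sums to -10 kcal/mol by construction, so it only mirrors the order of magnitude mentioned in the text. A real estimate would use the per-site free energies computed for each hydration site, and the plain sum would hold only to the extent that the sites contribute independently.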
The dual influence that protein hydration exerts on collagen binding by the adhesin CNA reflects the unique role that solvent plays in the collagen structure. In this structural motif, the exposure of several unsatisfied backbone carbonyl groups favors a strong interaction with the solvent. Although the effective role played by the hydration layer of collagen in stabilizing the triple helix is still highly debated (18,21,28,55,57–61), our finding that collagen hydration is preserved upon protein-protein interaction strongly supports the idea that water patterns should be considered as an integral part of collagen molecules. Collagen is recognized by its biological partners via either sequence-specific or aspecific mechanisms (22). We found that in the case of CNA, protein-collagen interactions involve either hydrophobic residues or water molecules. In both cases, the collagen groups that participate in binding (Pro or Hyp side chains or exposed carbonyl groups) are widespread features of its triple-helix motif (18). This suggests that the interactions characterized here likely have a role in common mechanisms governing the aspecific recognition of collagen fragments. The employment of waters makes the binding surface highly adaptable and thus somewhat amenable to possible variations that can occur in the collagen sequence. In analogy to the oligopeptide binding protein OppA (62), water molecules can act as removable bricks as well as extensions of adhesin side chains in facilitating aspecific substrate binding. Bridging waters in protein interactions can be employed to mediate relatively weak and tunable interactions when the molecular associations need to be transient. In addition to assisting binding and recognition, bridging waters can also have active roles in biological mechanisms, as in the case of the cytochrome c2 redox protein (cyt c2) of Rhodobacter sphaeroides.
In this protein, changes in interfacial waters, which switch from structured to mobile, facilitate dissociation from the reaction center once it has fulfilled its function (63). Similarly, bridging waters at the interface of CNA and collagen can serve a lubricating function to favor the dynamical behavior required for the collagen hug binding mechanism of CNA (24).
Overall, our results highlight the biological role of the protein hydration shell and the fundamental influence this shell exerts on macromolecular interactions. This role is sometimes overlooked because crystallographic structures of macromolecular complexes are solved at limited resolution, thus preventing solvent localization. Even in cases of high resolution, information about dewetted regions on the protein surface cannot be straightforwardly deduced from analyses of crystalline structures because crystalline packing strongly biases the distribution of water sites around the protein. A large number of studies carried out on protein molecules in the isolated monomeric or oligomeric state suggest that MD simulations can straightforwardly provide a detailed analysis of the hydration properties of a given structure in solution (for instance, in the simulations presented here, the hydration patterns converged after 30 ns; Fig. S9), and may also provide a priori indications of hot spots in protein-protein interactions (52,64,65).
SUPPORTING MATERIAL
Methods, references, and figures are available at http://www.biophysj.org/ biophysj/supplemental/S0006-3495(11)00381-X.
We thank Dr. R. Radmer and Dr. T. Klein for providing the hydroxyproline parameters in the Amber force field. This work was supported by the Ministero dell’Istruzione, dell’Università e della Ricerca (FIRB RBRN07BMCT to R.B. and L.V.) and the Engineering and Physical Sciences Research Council (EP/G049998/1 to A.D.S.). CINECA provided computational resources (project No. 932 cne0fm4f). Support was also provided by the DEISA Consortium, cofunded through EU FP6 project RI-031513 and FP7 project RI-222919, within the DEISA Extreme Computing Initiative.
REFERENCES
1. Ball, P. 2008. Water as an active constituent in cell biology. Chem. Rev. 108:74–108.
2. Levy, Y., and J. N. Onuchic. 2006. Water mediation in protein folding and molecular recognition. Annu. Rev. Biophys. Biomol. Struct. 35:389–415.
3. Chandler, D. 2005. Interfaces and the driving force of hydrophobic assembly. Nature. 437:640–647.
4. Pal, S. K., J. Peon, and A. H. Zewail. 2002. Biological water at the protein surface: dynamical solvation probed directly with femtosecond resolution. Proc. Natl. Acad. Sci. USA. 99:1763–1768.
5. Zong, C., G. A. Papoian, ..., P. G. Wolynes. 2006. Role of topology, nonadditivity, and water-mediated interactions in predicting the structures of α/β proteins. J. Am. Chem. Soc. 128:5168–5176.
6. Abel, R., T. Young, ..., R. A. Friesner. 2008. Role of the active-site solvent in the thermodynamics of factor Xa ligand binding. J. Am. Chem. Soc. 130:2817–2831.
7. Papoian, G. A., J. Ulander, and P. G. Wolynes. 2003. Role of water mediated interactions in protein-protein recognition landscapes. J. Am. Chem. Soc. 125:9170–9178.
8. Fuxreiter, M., M. Mezei, ..., R. Osman. 2005. Interfacial water as a “hydration fingerprint” in the noncognate complex of BamHI. Biophys. J. 89:903–911.
9. Fischer, S., and C. S. Verma. 1999. Binding of buried structural water increases the flexibility of proteins. Proc. Natl. Acad. Sci. USA. 96:9613–9615.
10. Hamelberg, D., T. Shen, and J. A. McCammon. 2006. Insight into the role of hydration on protein dynamics. J. Chem. Phys. 125:094905.
11. De Simone, A., G. G. Dodson, ..., F. Fraternali. 2005. Prion and water: tight and dynamical hydration sites have a key role in structural stability. Proc. Natl. Acad. Sci. USA. 102:7535–7540.
12. Silva, J. L., and D. Foguel. 2009. Hydration, cavities and volume in protein folding, aggregation and amyloid assembly. Phys. Biol. 6:015002.
13. Choudhury, N., and B. M. Pettitt. 2007. The dewetting transition and the hydrophobic effect. J. Am. Chem. Soc. 129:4847–4852.
14. Cramer, C. J., and D. G. Truhlar. 2008. A universal approach to solvation modeling. Acc. Chem. Res. 41:760–768.
15. Despa, F., A. Fernández, and R. S. Berry. 2004. Dielectric modulation of biological water. Phys. Rev. Lett. 93:228104.
16. Fernández, A., and L. R. Scott. 2003. Adherence of packing defects in soluble proteins. Phys. Rev. Lett. 91:018102.
17. Myllyharju, J., and K. I. Kivirikko. 2004. Collagens, modifying enzymes and their mutations in humans, flies and worms. Trends Genet. 20:33–43.
18. Brodsky, B., and A. V. Persikov. 2005. Molecular structure of the collagen triple helix. Adv. Protein Chem. 70:301–339.
19. Okuyama, K. 2008. Revisiting the molecular structure of collagen. Connect. Tissue Res. 49:299–310.
20. Berisio, R., A. De Simone, ..., L. Vitagliano. 2009. Role of side chains in collagen triple helix stabilization and partner recognition. J. Pept. Sci. 15:131–140.
21. Shoulders, M. D., and R. T. Raines. 2009. Collagen structure and stability. Annu. Rev. Biochem. 78:929–958.
22. Leitinger, B., and E. Hohenester. 2007. Mammalian collagen receptors. Matrix Biol. 26:146–155.
23. Emsley, J. 2010. Convergent recognition of a triple helical hydrophobic motif in collagen. Structure. 18:1–2.
24. Zong, Y., Y. Xu, ..., S. V. Narayana. 2005. A ‘collagen hug’ model for Staphylococcus aureus CNA binding to collagen. EMBO J. 24:4224–4236.
25. Emsley, J., C. G. Knight, ..., R. C. Liddington. 2000. Structural basis of collagen recognition by integrin α2β1. Cell. 101:47–56.
26. Hohenester, E., T. Sasaki, ..., H. P. Bächinger. 2008. Structural basis of sequence-specific collagen recognition by SPARC. Proc. Natl. Acad. Sci. USA. 105:18273–18277.
27. Carafoli, F., D. Bihan, ..., E. Hohenester. 2009. Crystallographic insight into collagen recognition by discoidin domain receptor 2. Structure. 17:1573–1581.
28. De Simone, A., L. Vitagliano, and R. Berisio. 2008. Role of hydration in collagen triple helix stabilization. Biochem. Biophys. Res. Commun. 372:121–125.
29. Kang, H. J., F. Coulibaly, ..., E. N. Baker. 2007. Stabilizing isopeptide bonds revealed in gram-positive bacterial pilus structure. Science. 318:1625–1628.
30. Inouye, K., S. Sakakibara, and D. J. Prockop. 1976. Effects of the stereo-configuration of the hydroxyl group in 4-hydroxyproline on the triple-helical structures formed by homogenous peptides resembling collagen. Biochim. Biophys. Acta. 420:133–141.
31. Berisio, R., L. Vitagliano, ..., A. Zagari. 2002. Recent progress on collagen triple helix structure, stability and assembly. Protein Pept. Lett. 9:107–116.
32. Doi, M., Y. Nishi, ..., Y. Kobayashi. 2005. Collagen-like triple helix formation of synthetic (Pro-Pro-Gly)10 analogues: (4(S)-hydroxyprolyl-4(R)-hydroxyprolyl-Gly)10, (4(R)-hydroxyprolyl-4(R)-hydroxyprolyl-Gly)10 and (4(S)-fluoroprolyl-4(R)-fluoroprolyl-Gly)10. J. Pept. Sci. 11:609–616.
33. Holmgren, S. K., L. E. Bretscher, ..., R. T. Raines. 1999. A hyperstable collagen mimic. Chem. Biol. 6:63–70.
34. Hess, B., C. Kutzner, ..., E. Lindahl. 2008. GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. J. Chem. Theory Comput. 4:435–447.
35. Hornak, V., R. Abel, ..., C. Simmerling. 2006. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins. 65:712–725.
36. Sorin, E. J., and V. S. Pande. 2005. Exploring the helix-coil transition via all-atom equilibrium ensemble simulations. Biophys. J. 88:2472–2493.
37. Horn, H. W., W. C. Swope, ..., T. Head-Gordon. 2004. Development of an improved four-site water model for biomolecular simulations: TIP4P-Ew. J. Chem. Phys. 120:9665–9678.
38. Wickstrom, L., A. Okur, and C. Simmerling. 2009. Evaluating the performance of the ff99SB force field based on NMR scalar coupling data. Biophys. J. 97:853–856.
39. Makarov, V. A., B. K. Andrews, ..., B. M. Pettitt. 2000. Residence times of water molecules in the hydration sites of myoglobin. Biophys. J. 79:2966–2974.
40. Colombo, G., M. Meli, and A. De Simone. 2008. Computational studies of the structure, dynamics and native content of amyloid-like fibrils of ribonuclease A. Proteins. 70:863–872.
41. De Simone, A., J. E. Corrie, ..., F. Fraternali. 2008. Conformation and dynamics of a rhodamine probe attached at two sites on a protein: implications for molecular structure determination in situ. J. Am. Chem. Soc. 130:17120–17128.
42. Hamelberg, D., and J. A. McCammon. 2004. Standard free energy of releasing a localized water molecule from the binding pockets of proteins: double-decoupling method. J. Am. Chem. Soc. 126:7683–7689.
43. Jenkins, C. L., and R. T. Raines. 2002. Insights on the conformational stability of collagen. Nat. Prod. Rep. 19:49–59.
44. Stultz, C. M. 2002. Localized unfolding of collagen explains collagenase cleavage near imino-poor sites. J. Mol. Biol. 319:997–1003.
45. Raman, S. S., R. Parthasarathi, ..., T. Ramasami. 2006. Role of aspartic acid in collagen structure and stability: a molecular dynamics investigation. J. Phys. Chem. B. 110:20678–20685.
46. Ravikumar, K. M., and W. Hwang. 2008. Region-specific role of water in collagen unwinding and assembly. Proteins. 72:1320–1332.
47. Bella, J., B. Brodsky, and H. M. Berman. 1995. Hydration structure of a collagen peptide. Structure. 3:893–906.
48. Berisio, R., L. Vitagliano, ..., A. Zagari. 2000-2001. Crystal structure of a collagen-like polypeptide with repeating sequence Pro-Hyp-Gly at 1.4 Å resolution: implications for collagen hydration. Biopolymers. 56:8–13.
49. Okuyama, K., C. Hongo, ..., N. Nishino. 2004. Crystal structures of collagen model peptides with Pro-Hyp-Gly repeating sequence at 1.26 Å resolution: implications for proline ring puckering. Biopolymers. 76:367–377.
50. Melacini, G., A. M. Bonvin, ..., R. Kaptein. 2000. Hydration dynamics of the collagen triple helix by NMR. J. Mol. Biol. 300:1041–1049.
51. Hua, L., X. Huang, ..., B. J. Berne. 2006. Dynamics of water confined in the interdomain region of a multidomain protein. J. Phys. Chem. B. 110:3704–3711.
52. De Simone, A., R. Spadaccini, ..., F. Fraternali. 2006. Toward the understanding of MNEI sweetness from hydration map surfaces. Biophys. J. 90:3052–3061.
53. Despa, F., and R. S. Berry. 2007. The origin of long-range attraction between hydrophobes in water. Biophys. J. 92:373–378.
54. Meyer, E. E., Q. Lin, ..., J. N. Israelachvili. 2005. Origin of the long-range attraction between surfactant-coated surfaces. Proc. Natl. Acad. Sci. USA. 102:6839–6842.
55. Berisio, R., V. Granata, ..., A. Zagari. 2004. Imino acids and collagen triple helix stability: characterization of collagen-like polypeptides containing Hyp-Hyp-Gly sequence repeats. J. Am. Chem. Soc. 126:11402–11403.
56. Murphy, 4th, F. V., R. M. Sweet, and M. E. Churchill. 1999. The structure of a chromosomal high mobility group protein-DNA complex reveals sequence-neutral mechanisms important for non-sequence-specific DNA recognition. EMBO J. 18:6610–6618.
57. Engel, J., and D. J. Prockop. 1998. Does bound water contribute to the stability of collagen? Matrix Biol. 17:679–680.
58. Nishi, Y., S. Uchiyama, ..., Y. Kobayashi. 2005. Different effects of 4-hydroxyproline and 4-fluoroproline on the stability of collagen triple helix. Biochemistry. 44:6034–6042.
59. Holmgren, S. K., K. M. Taylor, ..., R. T. Raines. 1998. Code for collagen’s stability deciphered. Nature. 392:666–667.
60. Schumacher, M., K. Mizuno, and H. P. Bächinger. 2005. The crystal structure of the collagen-like polypeptide (glycyl-4(R)-hydroxyprolyl-4(R)-hydroxyprolyl)9 at 1.55 Å resolution shows up-puckering of the proline ring in the Xaa position. J. Biol. Chem. 280:20397–20403.
61. Shoulders, M. D., K. A. Satyshur, ..., R. T. Raines. 2010. Stereoelectronic and steric effects in side chains preorganize a protein main chain. Proc. Natl. Acad. Sci. USA. 107:559–564.
62. Tame, J. R., G. N. Murshudov, ..., A. J. Wilkinson. 1994. The structural basis of sequence-independent peptide binding by OppA protein. Science. 264:1578–1581.
63. Autenrieth, F., E. Tajkhorshid, ..., Z. Luthey-Schulten. 2004. Role of water in transient cytochrome c2 docking. J. Phys. Chem. B. 108:20376–20378.
64. Keskin, O., B. Ma, and R. Nussinov. 2005. Hot regions in protein-protein interactions: the organization and contribution of structurally conserved hot spot residues. J. Mol. Biol. 345:1281–1294.
65. Li, X., O. Keskin, ..., J. Liang. 2004. Protein-protein interactions: hot spots and structurally conserved residues often locate in complemented pockets that pre-organized in the unbound states: implications for docking. J. Mol. Biol. 344:781–795.
Biophysical Journal 100(9) 2253–2261