doi:10.1016/j.jmb.2003.11.033
J. Mol. Biol. (2004) 336, 241–251
Heterogeneous Folding of the trpzip Hairpin: Full Atom Simulation and Experiment

Wei Yuan Yang1, Jed W. Pitera2, William C. Swope2* and Martin Gruebele1,3*

1 Center for Biophysics and Computational Biology, University of Illinois, Urbana, IL 61801, USA
2 IBM Research, IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120, USA
3 Departments of Chemistry and Physics, University of Illinois, Urbana, IL 61801, USA
The β-hairpin trpzip2 can be tuned continuously from a two-state folder to folding on a rough energy landscape without a dominant refolding barrier. At high denaturant concentration, this extremely stable peptide exhibits a single apparent “two-state” transition temperature when monitored by different spectroscopic techniques. However, under optimal folding conditions the hairpin undergoes an unusual folding process with three clusters of melting transitions ranging from 15 °C to 160 °C, as monitored by 12 different experimental and computational observables. We explain this behavior in terms of a rough free energy landscape of the unfolded peptide caused by multiple tryptophan interactions and alternative backbone conformations. The landscape is mapped out by potentials of mean force derived from replica-exchange molecular dynamics simulations. Implications for deducing cooperativity from denaturant titrations, for the origin of folding cooperativity, and for the folding of thermophilic proteins are pointed out. trpzip is an excellent small tunable model system for the glass-like folding transitions predicted by landscape theory.

© 2003 Elsevier Ltd. All rights reserved.
*Corresponding authors
Keywords: energy landscape; molecular dynamics; thermal unfolding; hairpin
Abbreviations used: tm, melting temperature; GuHCl, guanidinium hydrochloride; CD, circular dichroism; SVD, singular value decomposition; MD, molecular dynamics.

Introduction

Many single domain proteins, and even some peptides, seem to fold by a highly cooperative “all or none” process.1,2 A sufficiently high barrier forces cooperative behavior, even if many small local minima are present along the reaction coordinate for folding. Where along the reaction coordinate a given spectroscopic signal changes becomes irrelevant when the barrier is high because the changes are likely to occur in a region of high energy, where no significant population can accumulate. All spectroscopic signals then consist of a weighted average of the two states on each side of the barrier. This property has been exploited in protein engineering studies, and in providing indirect information on the transition state region via the rate of formation of the folded and unfolded states.3 Unfortunately, high barriers make it hard to monitor more directly the sparsely populated transient conformations a protein takes on during folding.

Two-state folding in small proteins is by no means a trivial observation. Statistical mechanical folding models on rough energy landscapes predict that a cancellation of conformational entropy and contact enthalpy could lead to folding without a dominant barrier when the energy landscape is strongly biased in favor of the native state.4 Only small local minima remain in that case (residual roughness). In such models, the barrier arises from evolutionary selection against low energy partially folded states, resulting in an energy gap between the native ensemble of states and the lowest-lying excited states. This gap protects proteins from partial unfolding.

Recent experimental and computational studies have renewed interest in heterogeneous folding processes that sample intermediate conformations more extensively. “Burst phase” intermediates are by now familiar,5,6 and some have been resolved.7 In some cases these intermediates have the
properties of unfolded states under native conditions, with many attributes of the native state.8–10 In other cases they form by stretched kinetics.11,12 Thermodynamic studies in which different yet individually two-state-like spectroscopic probes do not match up have revealed conformational heterogeneity in the absence of a large barrier.13 Signatures of the breakdown of single-barrier transition state theory for a two-state folder have been observed in kinetic studies.14 Simulations of mini-proteins validated by experiments have shown heterogeneous folding kinetics.15 Folding heterogeneity is of course not new, given the large body of work on proline isomerization,16 helix-coil dynamics of natural17–19 and designed polymers,20 and disulfide bond formation.21 The more recent work extends heterogeneous dynamics to the fastest folding time-scales of proteins and of peptide models beyond helices, showing that the energy landscape contains a full hierarchy of local minima.22

Although cooperativity is nearly as generic as the formation of a hydrophobic core during folding, we suggest that heterogeneity may be enhanced by tailoring the main interactions that provide stability to the protein or peptide. Here, we present a molecular dynamics and experimental thermodynamic study of the tryptophan zipper 2 (termed trpzip2), a β-hairpin designed by Cochran et al.,23 looking for any heterogeneity in its folding process. This 12-residue peptide folds with a very high tm by utilizing two aromatic stacking interactions. Aromatic stacking is one of the strongest side-chain interactions participating in protein self-assembly,24 in residual unfolded structures,25 and in the increased thermal stability of thermophilic proteins.26,27 This gives trpzip2 a strong energy bias toward its native structure. We find that trpzip2 has at least three separate transition regions when monitored as a function of temperature by different molecular dynamics simulation observables.
Two of these regions can be observed experimentally by temperature scans alone, and the third when denaturant is added. Within each region, there is a clustered group of spectroscopic or simulated observables. Molecular dynamics reveals at least seven structurally distinct basins at low temperatures. Under optimal folding conditions, trpzip does not fold via a simple two-state process. Instead it undergoes a heterogeneous folding process on a rough free energy landscape. Without a single dominant barrier, residual landscape roughness can be probed directly. Yet at low temperatures, trpzip2 still settles into a well-defined minimum free energy structure in close accord with NMR experiments.
Results

Unusual stability of trpzip2

The 12-residue peptide trpzip2 offers several
Folding of trpzip
Figure 1. Initial conditions (A, 1HRX NMR structure; B, extended conformation). The relaxed 1HRX structure has tryptophan conformations much closer to the later 1LE1 NMR structure (C, 1HRX shown in gray; D, 1LE1 shown in gray for comparison with simulated structure). E, Lowest-energy conformation obtained during replica-exchange molecular dynamics simulations. When different Trp rotamers are enumerated and the resulting structures are optimized to eliminate thermal effects, a lowest-energy structure very close to the average 1LE1 NMR structure is obtained (F).
advantages for detailed investigations. Its small size allows extensive full-atom computations to be carried out.28 Figure 1 shows various initial and annealed structures during folding (details below) overlaid on top of the NMR structures.23,29 Experimentally, the hairpin is held together by two tryptophan–tryptophan pairs,23 so tryptophans represent a third of the protein sequence, and are the main constituents of the core. Tryptophan fluorescence (lifetime as well as λmax) can thus be taken as global probes for folding instead of measures of local environments within the peptide. In addition, exciton coupling between the tryptophan indoles gives strong far UV circular dichroism (CD) signals, which can be used to track long-range (1/R² dependence) angular correlations between dipoles of the Trp side-chains quantitatively.30 Fluorescence, CD and infrared absorption provide a variety of structural probes to explore deviations from two-state behavior.

Cochran et al. have reported that trpzip2 forms a β-hairpin with a cooperative melting temperature of 72 °C when the temperature titration is monitored by CD.29 This is surprising considering that the peptide has a very small hydrophobic core with all the side-chains incompletely buried. We found additionally that this peptide is very
Figure 2. CD spectrum of trpzip2 (2.4 mM, 1 cm pathlength) in pH 7 potassium phosphate buffer without denaturant, in 6 M GuHCl and in 7.9 M urea at 20 °C. The spectrum is >90% dominated by the couplet structure of the two interacting tryptophan pairs, whose interaction cannot be broken by denaturant at room temperature.
resistant to denaturation by chaotropes, maintaining a nearly native CD spectrum at 20 °C up to very high denaturant concentrations (Figure 2). It takes 6 M GuHCl and 95 °C to make trpzip2 exhibit the same small CD signature (Figure 3) as the “unfolded” single-stranded pentapeptide Carm5 that mimics the local tryptophan environment without allowing inter-strand interactions (see Materials and Methods). The same is evident in the tryptophan emission spectra. For fluorescence measurements, Carm5 was again used as a reference to divide out effects of the local tryptophan environment not related to unfolding (see Materials and Methods). trpzip2 and Carm5 fluorescence intensities both decrease
Figure 3. Thermal unfolding of trpzip2 in 0–6 M GuHCl, monitored using CD at 227 nm. tm = 72 °C in 0 M GuHCl (in agreement with Ref. 23), tm = 29 °C in 6 M GuHCl.
rapidly when the temperature is raised from 4 °C to 84 °C, but Carm5 fluorescence is subject only to the local tryptophan environment, while trpzip2 fluorescence is subject to inter-strand interactions and stacking leading to folding. The trpzip2 hairpin is held together by the two interacting β-strands; if the molecules start losing this structure, the motions and the local environments of the two β-strands should begin to uncouple, at the end approaching that of a free Carm5 monomer. Figure 4 shows representative raw wavelength/temperature/denaturant data obtained for trpzip and Carm5 to illustrate the quality of the fluorescence spectra. The normalized fluorescence intensity and wavelength trends become clearer in Figures 5 and 6 discussed below. When the intrinsic trend of trpzip2 is divided out using Carm5 data, the relative fluorescence intensity of trpzip2 increases when denaturant is added or when the temperature is raised, so the tryptophan residues are more quenched in the native state of trpzip than in Carm5 (Figure 5).

Two-state unfolding at high denaturant concentration

At GuHCl concentrations above 5 M, CD and fluorescence reveal an apparent two-state transition. A temperature titration from 1 °C to 95 °C in 6 M GuHCl, monitored by CD at 227 nm or by integrated fluorescence intensity, has a tm of 27(±2) °C (Figures 3 and 5A). The CD value approaches zero (no correlation between the tryptophan orientations). The integrated fluorescence intensity of trpzip2 steadily increases towards the value of Carm5, so the ratio in Figure 5 approaches unity (when [Carm5] = 2[trpzip2]).

Heterogeneity at low or zero denaturant concentration

At GuHCl concentrations below 6 M, the transition becomes increasingly heterogeneous. As the GuHCl concentration is lowered to 1 M, the thermal titrations monitored by CD and integrated fluorescence remain sigmoidal, but the midpoint temperatures deviate by more than 20 °C (Figures 3 and 5A).
In 1 M GuHCl, the CD transition has a midpoint > 55 °C, while the transition monitored by fluorescence integrated above 320 nm or 385 nm has a midpoint < 35 °C. (Here and elsewhere, the midpoints given are conservative estimates based on sigmoidal fits; when the unfolded baseline is not reached, the lowest possible temperature is given; when no native baseline is reached, the highest possible temperature is given. Thus, results such as tm > 55 °C and tm < 35 °C if anything underestimate the discrepancy.) As long as the denaturant concentration is low but non-zero, the fluorescence titration curves still level off, indicating completion of the transition (Figure 5A). However, the trpzip2/Carm5
Figure 4. Thermal unfolding of trpzip2 (thick lines) and Carm5 (thin lines) from 4 °C (red) to 84 °C (black) in 10 °C steps, monitored as a function of tryptophan fluorescence emission spectrum (excited at 280 nm). Only when the trpzip2 signal has the same wavelength and half the intensity of Carm5 has a completely denatured state been reached. A, 0 M GuHCl; B, 2 M GuHCl; C, 6 M GuHCl. In the plots, [Carm5] = 2 × [trpzip2] so the concentration of tryptophan is the same.
fluorescence emission ratios no longer approach unity at high temperature. The temperature-denatured states at low denaturant concentration are clearly more ordered than in 6 M GuHCl, and complete unfolding can be achieved only by simultaneous application of high temperature and denaturant.

An unusual thermal transition occurs in the absence of denaturant (Figures 5 and 6). The shape of the transition monitored by relative fluorescence intensity depends on the wavelength range being monitored (Figure 6A). At long wavelengths, the transition is apparently completed with a tm < 15 °C (Figure 5A or Figure 6A), but at shorter wavelengths, a pronounced maximum
Figure 5. Thermal unfolding of trpzip2 in 0–6 M GuHCl, monitored using integrated fluorescence (excitation = 280 nm, 385 nm cutoff filter used for filtering emission). A, trpzip2/Carm5 integrated fluorescence ratio as a function of temperature removes the intrinsic temperature dependence of the emission intensity, with tm < 15 °C in 0 M GuHCl, tm = 35 °C in 1 M GuHCl and tm = 27 °C in 6 M GuHCl; B, the peak wavelength in 0 M GuHCl shows a much higher transition temperature than integrated intensity, tm > 65 °C.
Figure 6. Integrated trpzip2/Carm5 fluorescence ratio (385 nm cutoff filter (black), bandpass between 340 nm and 485 nm (blue), 320 nm cutoff (red)) in A, 0 M GuHCl; B, 2 M GuHCl. Ratios different from 1 are reached at high temperature depending on the wavelength region probed. Only at 6 M GuHCl (not shown) does the ratio approach 1 in all cases.
Figure 7. Infrared spectrum of trpzip2 as a function of temperature in 0 M GuHCl. A, full spectrum; B, unfolding transition with tm > 60 °C obtained from the ratio of the amplitudes of the second and first singular value components (the first SVD component is the average spectrum).
is reached below 45 °C. At the same time, the transition tracked by the emission maximum λmax has a midpoint above 65 °C (Figure 5B). This is to be compared with more nearly sigmoidal traces at any wavelength in 2 M GuHCl (Figure 6B). Thus fluorescence intensity and fluorescence wavelength do not track the same transition in 0 M denaturant. The initial intensity increase without wavelength shift is indicative of increased tryptophan flexibility without solvation. Different trpzip2 ensembles along the reaction coordinate to unfolding clearly have very different fluorescence properties in the absence of denaturant. The 0 M GuHCl CD and infrared spectra (Figures 3 and 7) both show apparent two-state transitions at higher temperatures than the
Figure 8. Melting curves of molecular dynamics simulation observables. Top to bottom: radius of gyration (apparent tm = 162 °C), fraction of terminal salt-bridge formation (tm = 67 °C), average number of backbone H-bonds (tm = 72 °C; the broken line is number of native H-bonds with tm = 17 °C), fraction of structures less than 2.5 Å Cα-RMSD from the 1LE1 NMR ensemble (tm = 32 °C), average number of tryptophan residues in the native χ1 rotamer (tm = 27 °C), and number of residues with native-like secondary structure.
integrated fluorescence transition, but the midpoints do not correspond well with each other, or with the λmax transition in Figure 5B. When monitored by singular value decomposition of the IR spectrum (Figure 7, left), the midpoint is at tm > 59 °C (second SVD component shown in Figure 7, right), the lower bound being given by the absence of an unfolded state baseline.

MD simulation analysis

The trpzip2 simulations at each temperature were analyzed using several different geometric measures (Figure 8). To compare the overall similarity of the peptide backbone to the 1LE1 NMR ensemble, the Cα-RMSD of each simulated structure from all 20 conformations in the NMR ensemble was calculated. A structure was counted as “folded” if it was within 2.5 Å Cα-RMSD from at least one of the NMR structures. The formation of a salt-bridge between the terminal NH3+ and COO− was monitored by tracking the distance between those two groups and counting a salt-bridge as formed if the minimum inter-group heavy atom distance was less than 5 Å. Native and non-native backbone hydrogen bonds were counted as formed if the donor–acceptor distance was less than 4 Å and the donor–hydrogen–acceptor angle was greater than 120°. To assess the degree of tryptophan side-chain disorder, the distribution of χ1 side-chain torsion angles was measured at each temperature. The secondary structural content (αL-helix, β-sheet, coil, αR-helix) of trpzip2 was also modeled on a per-residue basis based on the φ and ψ torsion angles for each amino acid. This permitted the determination of secondary structure content as a function of temperature, as well as the fraction of native-like secondary structure.

Estimated midpoints tm for the unfolding transitions monitored by these observables are shown in Figure 8. Like the experimental transition midpoints monitored by CD, integrated fluorescence, λmax and IR, the tm values obtained from different MD observables differ (Figure 8).
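The geometric criteria quoted above (2.5 Å Cα-RMSD to at least one NMR model, a 5 Å heavy-atom cutoff for the terminal salt-bridge, and a 4 Å / 120° hydrogen-bond test) can be written down compactly. The following is only a minimal sketch of those thresholds, not the authors' analysis code: the function names, the plain-tuple coordinate format, and the assumption that per-frame RMSD values have already been computed are all illustrative choices.

```python
import math

def angle_deg(donor, hydrogen, acceptor):
    """Donor-hydrogen-acceptor angle in degrees, measured at the hydrogen."""
    v1 = [d - h for d, h in zip(donor, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_angle = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against rounding error before acos.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def hbond_formed(donor, hydrogen, acceptor, max_dist=4.0, min_angle=120.0):
    """Hydrogen bond test: donor-acceptor distance < 4 A and D-H-A angle > 120 deg."""
    return (math.dist(donor, acceptor) < max_dist
            and angle_deg(donor, hydrogen, acceptor) > min_angle)

def salt_bridge_formed(nterm_heavy_atoms, cterm_heavy_atoms, cutoff=5.0):
    """Salt bridge test: minimum inter-group heavy-atom distance < 5 A."""
    d_min = min(math.dist(a, b)
                for a in nterm_heavy_atoms for b in cterm_heavy_atoms)
    return d_min < cutoff

def fraction_folded(rmsds_per_frame, cutoff=2.5):
    """Fraction of frames within `cutoff` A Ca-RMSD of at least one NMR model.

    rmsds_per_frame: one list per frame holding the (precomputed) RMSD of
    that frame to every model of the NMR ensemble.
    """
    folded = sum(1 for rmsds in rmsds_per_frame if min(rmsds) <= cutoff)
    return folded / len(rmsds_per_frame)
```

Evaluating `fraction_folded` on the frames collected at each replica temperature would give one point of a melting curve of the kind summarized in Figure 8.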
Three main clusters of tm were found computationally. The lowest group involves the tryptophan side-chain conformation and Cα-RMSD, with tm ≈ 30 °C. Between 65 °C and 75 °C lie the analyses by fraction of total backbone hydrogen bonds and fraction of terminal salt-bridge. The radius of gyration, which monitors the loss of compactness, has a tm of approximately 160 °C. It is important to note that the lowest-energy conformation sampled in the simulation has only three of the four tryptophan residues in their native χ1-rotamers. An exhaustive enumeration of 1296 possible combinations of the native-like tryptophan rotamers for this lowest-energy simulation conformation yielded a structure of even lower energy, suggesting that the tryptophan ordering transition for our model may occur at
temperatures lower than the 27 °C estimated based on the temperature range of the simulations.

CD titration simulation

A time-sequence of structures from the equilibrated molecular dynamics simulation at each temperature was used to compute the CD-detected thermal unfolding titration in 0 M GuHCl. The MD simulation was sampled every 20 ps to avoid unnecessary correlation of successive structures. The 200–250 nm CD spectrum is dominated by tryptophan couplets in the folded state (Figure 2) and nearly zero in the unfolded state. The couplet is a characteristic CD line-shape, resulting from the interaction of two electronic states of two different tryptophan residues. The couplet strength and sign were computed for all six Trp pairings in each structure using the formalism of Woody and co-workers,30,31 then summed over all pairs and time-averaged. Thus, heterogeneity among separate populations and within individual populations is taken into account. The sign of the couplet depends on whether the tryptophan dipole axes are aligned or counter-aligned. The pairwise sum approximation breaks down particularly in the unfolded state, where multiple tryptophans can cluster, but there the net signal is very small because the dipoles are more randomly oriented. The resulting simulated amplitude of the couplet is shown in Figure 9. The computed CD melting transitions occur in the higher temperature cluster, in agreement with the experimental data (although somewhat below the experimental temperature). More intriguingly, the transition temperature computed for this observable is again not in exact agreement with the other computed variables. The midpoint occurs at 48 °C, about 20 °C lower than the other transitions in the “high temperature” cluster that is between 65 °C and 75 °C.
Figure 9. Simulated 0 M GuHCl circular dichroism melting curve, using the tryptophan exciton-coupling as an observable applied to the molecular dynamics simulation geometries; tm = 48 °C.
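The midpoints quoted throughout the Results are described as estimates from sigmoidal fits, but the fitting function is not given. As a hedged illustration only, the sketch below assumes the simplest two-state form, ΔG(T) = ΔH(1 − T/tm) with no heat-capacity term and fixed flat baselines, and finds tm and ΔH by a brute-force grid search. The function names, grid ranges and synthetic data are hypothetical and are not taken from the paper.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1

def fraction_unfolded(temp_k, tm_k, dh):
    """Two-state unfolded fraction with dG(T) = dH * (1 - T/Tm) (assumed model)."""
    dg = dh * (1.0 - temp_k / tm_k)
    k_unfold = math.exp(-dg / (R * temp_k))
    return k_unfold / (1.0 + k_unfold)

def fit_midpoint(temps_k, signal, s_native, s_unfolded):
    """Least-squares grid search for (Tm, dH) with fixed, flat baselines."""
    best_sse, best_tm, best_dh = float("inf"), None, None
    for i in range(261):          # Tm from 270.0 K to 400.0 K in 0.5 K steps
        tm = 270.0 + 0.5 * i
        for j in range(37):       # dH from 20 to 200 kJ/mol in 5 kJ/mol steps
            dh = 20000.0 + 5000.0 * j
            sse = 0.0
            for t, s in zip(temps_k, signal):
                f_u = fraction_unfolded(t, tm, dh)
                model = s_native + (s_unfolded - s_native) * f_u
                sse += (s - model) ** 2
            if sse < best_sse:
                best_sse, best_tm, best_dh = sse, tm, dh
    return best_tm, best_dh

# Synthetic, noise-free melting curve (hypothetical numbers) to exercise the fit.
temps = [274.15 + 5.0 * i for i in range(20)]   # roughly 1 to 96 degrees C
true_tm, true_dh = 345.0, 60000.0
data = [-1.0 + fraction_unfolded(t, true_tm, true_dh) for t in temps]
tm_fit, dh_fit = fit_midpoint(temps, data, s_native=-1.0, s_unfolded=0.0)
```

With real data the baselines are rarely flat or fully reached, which is why the text reports several midpoints only as upper or lower bounds; a fit of this kind would then need sloped baselines or would return only a limit on tm.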
Discussion

MD simulation and NMR structure

Figure 1 illustrates the utility of refining NMR structures even with the current generation of MD force fields. The originally published NMR structure had the two tryptophan pairs stacked. Refinement of this structure by MD yielded a structure in much better agreement with the “T-shaped” tryptophan interaction revealed by a later NMR analysis. Thermal sampling even at temperatures as low as 255 K reveals significant fluctuations of the tryptophan residues, but minimization of these low temperature samples again results in a structure very close to the later average NMR structure. Similarly favorable results have been obtained for the Trp cage by the same approach.32

Unfolding mechanism from experiment and simulation

Both the experimental data and MD simulations reveal transitions that appear to have cooperative sigmoidal shapes, but whose midpoints differ depending on the spectroscopic probe or computed observable used. Computed unfolding transitions are clustered in three general temperature regions (17–32 °C, 48–75 °C and 162 °C), but even within these regions transition temperatures are spread out. Experimentally, the two lower temperature clusters are observed by temperature tuning alone, and the third requires addition of denaturant. Rather than observing a single unfolding transition dominated by one large barrier, multiple denatured ensembles, each with different properties that manifest themselves in combination spectroscopically, can be observed as the temperature is lowered and the native state is populated.

Beginning with the experimental data, we were able to distinguish three separate stages in the unfolding reaction (starting at 0 M GuHCl and 20 °C):

(1) Fluorescence intensity increases first. This transition starts at low temperature (15–40 °C). Transitions associated purely with tryptophan fluorescence intensity are evidence for increased tryptophan mobility caused by a loosened native-like state.33 This is further corroborated by the observation of hyperfluorescence for certain integrated wavelength regions (Figure 6). In the region where a two-state analysis applies at least approximately (1–6 M GuHCl), the transition temperature is not very sensitive to the GuHCl concentration. The GuHCl-insensitivity of the transition indicates that peptide flexibility is not associated with core break up and subsequent GuHCl access.

(2) This is followed by a broad transition with a tm near 60–70 °C as shown by CD, amide I′ infrared bands, and fluorescence λmax shift. The higher temperature observables monitor solvent exposure of the tryptophan (λmax in Figure 5), orientation of tryptophan pairs (CD in Figure 3) and backbone hydrogen bonding (IR in Figure 7). According to CD, this transition is sensitive to GuHCl, indicating that it is the main transition involving access of GuHCl to the core. Non-identical transitions using the different probes show that this transition involves significantly populated partially folded populations with different spectroscopic signatures between populations, and perhaps even within individual populations. A likely candidate for these unfolded traps is different pair clusterings of the four tryptophan residues.

(3) A final transition gives trpzip2 a fluorescence spectrum nearly identical with the fragment peptide Carm5. This transition is accomplished experimentally only through the addition of GuHCl in addition to raising the temperature, which suggests that it involves breaking up residual random contacts between the tryptophans, further extending the global compactness of the peptide. In MD simulations, it can be observed without denaturant at high temperature (tm = 160 °C).

The molecular dynamics observables and spectral simulations in Figures 8 and 9 correlate well with the experimental data: observables linked to molecular flexibility and specifically tryptophan flexibility (Cα-RMSD and tryptophan χ1) have low melting points, while the simulated CD spectrum and total number of backbone hydrogen bonds (related to the IR spectrum) fall into the higher melting point cluster. The simulations in addition predict that these unfolding events occur into a relatively compact denatured ensemble, which is loosened to a larger radius of gyration only at very high temperatures (experimentally: at lower temperature but in denaturant). This clearly indicates that temperature extrapolations of experimental unfolding transitions must be done with caution because events not significantly sampled below the boiling point of water can be accessed at higher temperature.

The replica exchange simulations, thus validated by experiment, can provide additional structural and energetic insights into the folding process. Figure 10 shows that near and below the folding transition temperatures, several structural basins exist. These basins correspond to fairly compact structures, with distinct patterns of tryptophan–tryptophan interactions and different topologies as illustrated by the representative structures shown. Only at temperatures above 160 °C (e.g. 450 K and 600 K simulations) do these basins lose their integrity and merge into a continuous structural distribution populated by more highly unfolded conformations. Potentials of mean force along the Cα-RMSD reaction coordinate were calculated using the formula PMF = −kT ln(ρ). The resulting plots in Figure 10 show the folding free energy as a function of the reaction coordinate. The basins are separated by refolding barriers in the 1–4.8 kBT0 range (T0 = 298 K), showing that partially folded conformers can convert from one basin to another
Figure 10. A, Distribution of potential energies versus Cα-RMSD from the 1LE1 NMR ensemble sampled in the molecular dynamics simulations, with representative structures. Distributions indicating structures sampled at 255 K (blue), 310 K (green), 350 K (yellow), 450 K (orange), and 600 K (red), respectively. The native structure is the lowest-energy minimum found in the simulation, but there are numerous low-lying minima that are separated by significant energetic barriers. B, Potential of mean force at five temperatures as a function of the Cα-RMSD as a reaction coordinate. At low temperature, trpzip2 has a very rough energy landscape.
over very small barriers. The appearance of the folding free energy surface is generally that of a rough plateau with a final steep downhill step into the folded basin. This could have interesting consequences for the kinetics: if the interconversion time among the many denatured states becomes comparable to the final barrier crossing time, deviations from simple two-state kinetics should occur. Such deviations include appearance of new time scales during folding, or as yet unobserved differences in the fast folding kinetics depending on the probe used.14

GuHCl induces folding barrier and cooperativity

Although the trpzip2 free energy surface may seem somewhat unusual, it could turn out to be representative of proteins under highly optimized folding conditions. Recent examples include the
multiple transitions observed for BBL binding domain by Muñoz and co-workers,13 and the unusual folding kinetics of λ-repressor mutants highly optimized for folding rate.14 Protein unfolding titrations in denaturants are often cited as proof for an apparent two-state mechanism. In fact, denaturant titrations used to quantify cooperativity create such cooperativity: m-value analyses from protein folding studies have long shown that denaturants have significant effects on the stability of partially folded structures and transition states, thus raising the folding free energy barrier along with the native state free energy.34 The presence of denaturants could thus turn a folding reaction without a dominant barrier into a two-state reaction. trpzip2 is an example of this effect. In 0 M GuHCl, the folding transition is unusual with no single melting point common to all spectroscopic
probes. In 6 M GuHCl, the tm values deduced from CD and fluorescence intensity are very close. At intermediate denaturant concentrations, the deviations lie between these extremes. Thus, in protein folding reactions, whether a dominant folding barrier exists or not cannot be inferred from denaturant-titrations monitored by a single probe.

Effective two-body interactions and cooperativity

When folding is dominated by well-defined local interactions, one can expect that the entropy penalty and enthalpy stabilization for folding are spread out along the reaction coordinate(s) as pair contacts are made, resulting in lower folding barriers and increased microstructure in the free energy. In such a scenario, what normally constitutes the folding barrier is drawn out along the reaction coordinate and becomes the global/local roughness of the free energy landscape. For this reason (among others), the current generation of folding models that focuses on two-body interactions generally underestimates folding barriers, even when Go-like potentials are not utilized.35 trpzip2 is an example of an “idealized” hairpin, which more closely resembles the two-body interaction models used. Its folding is mainly driven by pairwise tryptophan stacking, hydrogen bonding, and salt bridging. Pairs of tryptophans and backbone amide groups can also associate non-natively, generating a wide array of local traps, thus leading to the “glassy” folding behavior without a well-defined transition temperature that we observe. When such traps get too deep, the design is “bad”, in the sense that a “good” design should reduce non-native local minima, in addition to deepening the native well. These expectations correspond well with our computational results that indicate there are multiple smaller basins on the trpzip2 free energy surface without a clearly dominant folding barrier.
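The basins and barriers referred to here come from potentials of mean force computed as PMF = −kT ln(ρ) along the Cα-RMSD coordinate. The paper does not say how the density ρ was estimated; the sketch below assumes a simple fixed-width histogram of RMSD samples and reports the PMF in kcal/mol relative to its lowest bin. The function name, bin width and units are illustrative assumptions.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal mol^-1 K^-1

def potential_of_mean_force(samples, temperature_k, bin_width=0.5, x_min=0.0):
    """PMF = -kT ln(rho) from a histogram of reaction-coordinate samples.

    Returns {bin_center: PMF}, shifted so the lowest bin is zero.
    Empty bins are omitted because their PMF is formally infinite.
    """
    counts = {}
    for x in samples:
        idx = int((x - x_min) // bin_width)
        counts[idx] = counts.get(idx, 0) + 1
    total = len(samples)
    pmf = {}
    for idx, n in counts.items():
        rho = n / (total * bin_width)            # normalized probability density
        center = x_min + (idx + 0.5) * bin_width
        pmf[center] = -K_B * temperature_k * math.log(rho)
    lowest = min(pmf.values())
    return {c: g - lowest for c, g in sorted(pmf.items())}
```

Dividing the returned values by `K_B * 298.0` would express barrier heights in units of kBT0, the convention used for the 1–4.8 kBT0 range quoted in the Discussion.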
Implications for thermophilic proteins

trpzip2 has a native fold that involves a string of tryptophans stacked in a linear fashion, much like one of the several observed mechanisms thermophiles use to increase stability of their proteins.26,27 Because aromatic stacking is one of the strongest among all types of side-chain interactions, we expect that trpzip2 can be used as a simple model for aromatic stacking interactions in thermophilic proteins. For example, in the thermophilic cytochrome P450 CYP119, it was reported that there are hints of non-two-state-like behavior from DSC-temperature scans.36 This could be caused by a decoupling of local folding (e.g. pair stacking) from global folding (e.g. overall collapse) of these proteins, analogous to the several thermodynamic transitions observed for trpzip2. If so, we would predict that P450 CYP119 has a lower folding barrier and more
complex folding titrations and kinetics than its mesophilic homologs, which in turn would allow it to yield more information in bulk studies by revealing intermediate populations which are not populated in the presence of a large barrier.

Aromatics also render thermophilic protein folding at high temperatures more like mesophilic protein folding at moderate temperatures by making the denatured ensemble more compact. As previously shown, protein denatured ensembles are compact at room temperature, but adopt increasingly extended conformations at higher temperatures due to side-chain entropy effects.37 This natural constraint makes folding of mesophilic proteins at high temperature different from the low temperature process. If compact denatured populations are advantageous for folding at all temperatures, and thus selected for by evolutionary pressures, this may be one cause for the utilization of aromatics in thermophilic proteins. The unfolded state compactification and residual structures created by aromatics (see Figure 10) might help retain the degradation resistance of thermophilic proteins. However, whether compact denatured states are in themselves advantageous for an organism, or merely a consequence of enhanced native stability, remains to be seen. In our trpzip2 calculations and experiments, aromatics indeed raise the compact-extended transition to very high temperatures, making extended states not accessible to experiments even at 100 °C in the absence of denaturant.
Materials and Methods

Peptide

trpzip2 was synthesized by the University of Illinois Biotechnology Center as the C-terminal amine (NH2-SWTWENGKWTWK-NH2) and purified to 90% following previous protocols. The identity and purity of the peptide were confirmed by electrospray mass spectrometry. The reference peptide for the unfolded state, NH2-KWTWK-NH2 (dubbed Carm5), was similarly prepared.

Thermodynamic measurements

The nominal concentration of each sample was determined assuming that each tryptophan contributes 5600 cm⁻¹ M⁻¹ to the peptide's extinction coefficient at 280 nm. Characterization by CD, fluorescence and infrared spectroscopy was done at pH 7 in 50 mM phosphate buffer. CD spectra from 200 nm to 250 nm were measured in a 1 cm path length cuvette. Unfolding transitions were induced by temperature in the 1–95 °C range, by guanidine hydrochloride (GuHCl) in the 0–6 M range, and by urea in the 0–7.9 M range. The spectra show the characteristic “couplet” band with a negative and positive lobe resulting from partial cancellation of two electronic resonances of the interacting tryptophan side-chains.30

Characterization using infrared absorption spectra was done in ²H₂O (p²H 7) using a 75 μm Mylar spacer sandwiched between two CaF2 windows. The lyophilized peptide was equilibrated in ²H₂O overnight before the IR measurements. A Fourier transform infrared spectrometer was used to collect amide I′ band data in the 1600–1700 cm⁻¹ region with 1 cm⁻¹ resolution.

For fluorescence measurements, the five-residue non-folding C-terminal peptide NH2-KWTWK-NH2 (Carm5) was used as a reference to separate the intrinsic temperature dependence of tryptophan fluorescence from the fluorescence changes induced by structural changes during trpzip2 unfolding. Integrated fluorescence intensities were obtained by wavelength scanning (Figure 2) and with filter combinations (Figure 5). Because trpzip2 is very sensitive to photo-bleaching, especially in the absence of denaturant, the excitation light intensity was kept to a minimum during experiments so that the results were independent of intensity.

Simulation methods

Atomistic molecular dynamics simulations can provide a detailed picture of the folding thermodynamics. Replica-exchange molecular dynamics is an enhanced sampling algorithm in which several replicas of the same protein are simulated in parallel at a range of different temperatures. Periodically, conformations of the system are exchanged between adjacent temperatures using a Metropolis-like acceptance criterion. Any given replica experiences a broad range of temperatures, allowing it to sample and then escape local minima. In addition, the replica-exchange algorithm provides correct thermodynamic ensemble averages at each temperature, making it well suited for studies of temperature-dependent phenomena.38 trpzip2 was simulated as an unblocked peptide (SWTWENGKWTWK) starting from the 1HRX NMR structure.29 These simulations relaxed to a tryptophan packing similar to the revised 1LE1 NMR structure.23 Simulations started from a completely unfolded structure yielded statistically equivalent results (data not shown).
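As an illustration of the exchange step described above, the following minimal Python sketch evaluates a Metropolis-like acceptance test for swapping conformations between two adjacent temperatures. It assumes the standard replica-exchange criterion, min(1, exp[(β_i − β_j)(E_i − E_j)]) with β = 1/(k_B T); the function names, the random-number handling and the example energies are hypothetical and are not taken from the script used in this work.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant, kcal mol^-1 K^-1


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Acceptance probability for exchanging the conformations currently
    held at temp_i (potential energy energy_i) and temp_j (energy_j)."""
    beta_i = 1.0 / (K_B * temp_i)
    beta_j = 1.0 / (K_B * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # min(1, exp(delta)), written so that exp() can never overflow
    return math.exp(min(0.0, delta))


def attempt_swap(energies, temps, k, rng=random):
    """Try to exchange the conformations at temps[k] and temps[k + 1].

    energies[k] is the potential energy of the conformation currently at
    temps[k]. The list is modified in place; returns True if accepted.
    """
    p = swap_probability(energies[k], temps[k], energies[k + 1], temps[k + 1])
    if rng.random() < p:
        energies[k], energies[k + 1] = energies[k + 1], energies[k]
        return True
    return False


if __name__ == "__main__":
    # Hypothetical energies (kcal/mol) at two adjacent temperatures (K).
    print(swap_probability(-540.0, 300.0, -550.0, 310.0))  # always accepted
    print(swap_probability(-550.0, 300.0, -540.0, 310.0))  # accepted sometimes
```

A swap that moves the lower-energy conformation to the lower temperature is always accepted; the reverse move is accepted only with a Boltzmann-weighted probability, which is what preserves the correct ensemble at each temperature.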
The protein was modeled with the AMBER parm96 force field39 and a Generalized Born/Solvent Accessible Surface Area (GB/SA) implicit solvent model.40 The GB/SA model used a solvent dielectric constant of 78.5 and a surface tension of 0.005 cal mol⁻¹ Å⁻². All ionizable residues were set to their ionization state at pH 7.0. After an initial steepest descents minimization, replicas were simulated at a total of 28 different temperatures ranging from 255 K to 600 K. A 1 fs time step was used, and all bonds were constrained to their equilibrium lengths with the SHAKE algorithm.41 A parallel replica-exchange script was used to drive molecular dynamics simulations using the AMBER 6.0 package.42 Conformations of the protein were saved every 0.25 ps. Multiple swaps between replicas at adjacent temperatures were attempted after every 5 ps of simulation, yielding acceptance ratios of over 30%. The simulations were maintained at the target temperatures with a combination of Andersen velocity reassignment43 after every replica swap, and Berendsen temperature rescaling with a time constant of 1 ps⁻¹.44 A total of 27.5 ns of molecular dynamics were simulated for each replica, for a total of 0.77 μs of simulation. The initial 15.5 ns of data at each temperature were treated as an equilibration phase and discarded. All analysis was carried out on the final 12 ns of simulation for each temperature. The weighted histogram averaging method (WHAM)45 was used to improve the estimation of the ensemble averages for all observables at a range of temperatures down to 200 K.

The close packing of the large Trp side-chains in the folded state of trpzip2 presents a significant sampling challenge for molecular dynamics simulation. To test whether the simulations did indeed find the lowest-energy structure populated by trpzip2, the lowest-energy replica-exchange structure was subjected to exhaustive sampling of all possible combinations of Trp χ1 and χ2 rotamers followed by minimization. This process yielded a yet lower energy structure (−554.8 kcal/mol versus −552.2 kcal/mol) that is essentially identical with the final 1LE1 NMR structure (Figure 1F).
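The idea behind histogram reweighting can be illustrated with a much simpler single-temperature variant. The Python sketch below is not the multi-temperature WHAM procedure used in this work; it only reweights frames sampled at one temperature to estimate an ensemble average at a nearby temperature, assuming Boltzmann weights exp[−(β_target − β_sampled)E]. The function name and the toy data (four frames with made-up energies and a folded/unfolded flag) are hypothetical.

```python
import math

K_B = 0.0019872041  # Boltzmann constant, kcal mol^-1 K^-1


def reweighted_average(observable, energies, temp_sampled, temp_target):
    """Estimate <observable> at temp_target from frames sampled at temp_sampled.

    observable and energies are equal-length sequences, one entry per frame;
    energies are potential energies in kcal/mol.
    """
    beta_s = 1.0 / (K_B * temp_sampled)
    beta_t = 1.0 / (K_B * temp_target)
    # Log-weights relative to the sampled ensemble.
    log_w = [-(beta_t - beta_s) * e for e in energies]
    # Subtract the largest log-weight before exponentiating for stability.
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    norm = sum(weights)
    return sum(w * a for w, a in zip(weights, observable)) / norm


if __name__ == "__main__":
    # Toy data: two low-energy "folded" frames, two high-energy "unfolded" ones.
    energies = [-552.0, -550.0, -548.0, -546.0]
    folded = [1.0, 1.0, 0.0, 0.0]
    for target in (280.0, 300.0, 320.0):
        print(target, reweighted_average(folded, energies, 300.0, target))
```

Such single-temperature reweighting is only reliable when the energy distributions at the two temperatures overlap well; combining the data from all simulated temperatures is what allows estimates over a wide temperature range.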
Acknowledgements

W.Y. & M.G. were supported by National Science Foundation grant MCB-0316925. M.G. thanks the University of Illinois for an Alumni Scholarship.
References

1. Jackson, S. E. (1998). How do small single-domain proteins fold? Fold. Des. 3, 81–91.
2. Muñoz, V., Thompson, P. A., Hofrichter, J. & Eaton, W. A. (1997). Folding dynamics and mechanism of β-hairpin formation. Nature, 390, 196–199.
3. Fersht, A. R. (2000). Transition-state structure as a unifying basis in protein-folding mechanisms: contact order, chain topology, stability, and the extended nucleus mechanism. Proc. Natl Acad. Sci. USA, 97, 1525–1529.
4. Bryngelson, J. D., Onuchic, J. N., Socci, N. D. & Wolynes, P. G. (1995). Funnels, pathways, and the energy landscape of protein folding: a synthesis. Proteins: Struct. Funct. Genet. 21, 167–195.
5. Mann, C. J. & Matthews, C. R. (1993). Structure and stability of an early folding intermediate of Escherichia coli trp aporepressor measured by far-UV stopped-flow circular dichroism and 8-anilino-1-naphthalene sulfonate binding. Biochemistry, 32, 5282–5290.
6. Jennings, P. & Wright, P. (1993). Formation of a molten globule intermediate early in the kinetic folding pathway of apomyoglobin. Science, 262, 892–895.
7. Ballew, R. M., Sabelko, J. & Gruebele, M. (1996). Direct observation of fast protein folding: the initial collapse of apomyoglobin. Proc. Natl Acad. Sci. USA, 93, 5759–5764.
8. Kuwata, K., Shastry, R., Cheng, H., Hoshino, M., Batt, C. A., Goto, Y. & Roder, H. (2001). Structural and kinetic characterization of early folding events in beta-lactoglobulin. Nature Struct. Biol. 8, 151–155.
9. Qin, Z., Ervin, J., Larios, E., Gruebele, M. & Kihara, H. (2002). Formation of a compact structured ensemble without fluorescence signature early during ubiquitin folding. J. Phys. Chem. B, 106, 13040–13046.
10. Parker, M. J. & Marqusee, S. (1999). The cooperativity of burst phase reactions explored. J. Mol. Biol. 293, 1195–1210.
11. Sabelko, J., Ervin, J. & Gruebele, M. (1999). Observation of strange kinetics in protein folding. Proc. Natl Acad. Sci. USA, 96, 6031–6036.
12. Osváth, S., Sabelko, J. & Gruebele, M. (2003). Tuning the heterogeneous early folding dynamics of phosphoglycerate kinase. J. Mol. Biol. 333, 187–199.
13. Garcia-Mira, M. M., Sadqi, M., Fischer, N., Sanchez-Ruiz, J. M. & Muñoz, V. (2002). Experimental identification of downhill protein folding. Science, 298, 2191–2195.
14. Yang, W. Y. & Gruebele, M. (2003). Folding at the speed limit. Nature, 423, 193–197.
15. Snow, C. D., Nguyen, H., Pande, V. S. & Gruebele, M. (2002). Absolute comparison of simulated and experimental protein-folding dynamics. Nature, 420, 102–106.
16. Wedemeyer, W. J., Welker, E. & Scheraga, H. A. (2002). Proline cis–trans isomerization and protein folding. Biochemistry, 41, 14637–14644.
17. Poland, D. & Scheraga, H. A. (1970). Theory of Helix-Coil Transitions in Biopolymers, Academic Press, New York.
18. Baldwin, R. L. (1995). α-Helix formation by peptides of defined sequence. Biophys. Chem. 55, 127–135.
19. Kallenbach, N. R., Lyu, P. & Zhou, H. (1996). CD spectroscopy and the helix-coil transition in peptides and polypeptides. In Circular Dichroism and the Conformational Analysis of Biomolecules (Fasman, G. D., ed.), pp. 201–261, Plenum Press, New York.
20. Yang, W. Y., Prince, R. B., Sabelko, J., Moore, J. S. & Gruebele, M. (2000). Transition from exponential to nonexponential kinetics during formation of a nonbiological helix. J. Am. Chem. Soc. 122, 3248–3249.
21. Creighton, T. E. (1984). Disulfide bond formation in proteins. Methods Enzymol. 107, 305–329.
22. Frauenfelder, H., Sligar, S. G. & Wolynes, P. G. (1991). The energy landscapes and motions of proteins. Science, 254, 1598–1603.
23. Cochran, A. G., Skelton, N. J. & Starovasnik, M. A. (2002). Tryptophan zippers: stable, monomeric beta-hairpins. Proc. Natl Acad. Sci. USA, 99, 9081–9081.
24. Bork, P. & Sudol, M. (1994). The WW domain: a signalling site in dystrophin? Trends Biochem. Sci. 19, 531–533.
25. Klein-Seetharaman, J., Oikawa, M., Grimshaw, S. B., Wirmer, J., Duchardt, E., Ueda, T. et al. (2002). Long-range interactions within a nonnative protein. Science, 295, 1719–1722.
26. Park, S. Y., Yamane, K., Adachi, S., Shiro, Y., Weiss, K. E., Maves, S. A. & Sligar, S. G. (2002). Thermophilic cytochrome P450 (CYP119) from Sulfolobus solfataricus: high resolution structure and functional properties. J. Inorg. Biochem. 91, 491–501.
27. Puchkaev, A. V., Koo, L. S. & Ortiz de Montellano, P. R. (2003). Aromatic stacking as a determinant of the thermal stability of CYP119 from Sulfolobus solfataricus. Arch. Biochem. Biophys. 409, 52–58.
28. Zagrovic, B., Snow, C. D., Khaliq, S., Shirts, M. R. & Pande, V. S. (2002). Native-like mean structure in the unfolded ensemble of small proteins. J. Mol. Biol. 323, 153–164.
29. Cochran, A. G., Skelton, N. J. & Starovasnik, M. A. (2001). Tryptophan zippers: stable, monomeric beta-hairpins. Proc. Natl Acad. Sci. USA, 98, 5578–5583.
30. Grishina, I. B. & Woody, R. W. (1994). Contributions of tryptophan side-chains to the circular dichroism of globular proteins: exciton couplets and coupled oscillators. Faraday Discuss. 99, 245–262.
31. Woody, R. W. (1994). Contributions of tryptophan side-chains to the far-ultraviolet circular dichroism of proteins. Eur. Biophys. J. 23, 253–262.
32. Pitera, J. W. & Swope, W. (2003). Understanding folding and design: replica-exchange simulations of “Trp-cage” miniproteins. Proc. Natl Acad. Sci. USA, 100, 7587–7592.
33. Ervin, J., Larios, E., Osváth, S., Schulten, K. & Gruebele, M. (2002). What causes hyperfluorescence: folding intermediates or conformationally flexible native states? Biophys. J. 83, 473–483.
34. Burton, R. E., Huang, G. S., Daugherty, M. A., Calderone, T. L. & Oas, T. G. (1997). The energy landscape of a fast-folding protein mapped by Ala → Gly substitutions. Nature Struct. Biol. 4, 305–310.
35. Plotkin, S. S. & Onuchic, J. N. (2000). Investigation of routes and funnels in protein folding by free energy functional methods. Proc. Natl Acad. Sci. USA, 97, 6509–6514.
36. Maves, S. A. & Sligar, S. G. (2001). Understanding thermostability in cytochrome P450 by combinatorial mutagenesis. Protein Sci. 10, 161–168.
37. Yang, W. Y., Larios, E. & Gruebele, M. (2003). On the beta-structure propensity of polypeptides at high temperature. J. Am. Chem. Soc., in the press.
38. Mitsutake, A., Sugita, Y. & Okamoto, Y. (2001). Generalized-ensemble algorithms for molecular simulations of biopolymers. Biopolymers, 60, 96–123.
39. Cornell, W. D., Caldwell, J. W. & Kollman, P. A. (1997). Calculation of the Phi-Psi maps for alanyl and glycyl dipeptides with different additive and non-additive molecular mechanical models. J. Chimie Physique Physico-Chimie Biol. 94, 1417–1435.
40. Tsui, V. & Case, D. A. (2000). Theory and applications of the generalized Born solvation model in macromolecular simulations. Biopolymers, 56, 275–291.
41. Ryckaert, J. P., Ciccotti, G. & Berendsen, H. J. C. (1977). Numerical integration of Cartesian equations of motion of a system with constraints: molecular-dynamics of n-alkanes. J. Comp. Phys. 23, 327–341.
42. Case, D. A., Pearlman, D. A., Caldwell, J. W., Cheatham, T. E. III, Ross, W. S. & Simmerling, C. L. (1999). AMBER, 6.0 edit., University of California, San Francisco.
43. Andersen, H. C. (1980). Molecular-dynamics simulations at constant pressure and/or temperature. J. Chem. Phys. 72, 2384–2393.
44. Berendsen, H. J. C., Postma, J. P. M., van Gunsteren, W. F., DiNola, A. & Haak, J. R. (1984). Molecular-dynamics with coupling to an external bath. J. Chem. Phys. 81, 3684–3690.
45. Ferrenberg, A. M. & Swendsen, R. H. (1989). Optimized Monte-Carlo data-analysis. Phys. Rev. Letters, 63, 1195–1198.
Edited by C. R. Matthews (Received 10 September 2003; received in revised form 17 November 2003; accepted 17 November 2003)