Following co-operative formation of secondary and tertiary structure in a single protein module



J. Mol. Biol. (1997) 268, 185-197

Jose L. Neira, Laura S. Itzhaki, Andreas G. Ladurner, Ben Davis, Gonzalo de Prat Gay and Alan R. Fersht*

MRC Unit for Protein Function and Design, Cambridge Centre for Protein Engineering, University Chemical Laboratory, Lensfield Road, Cambridge, CB2 1EW, UK

We have prepared a family of peptide fragments of the 64 amino acid protein chymotrypsin inhibitor 2 (CI2), corresponding to progressive elongation from the N terminus, in order to elucidate the basis of conformational preferences in single-domain proteins and to obtain insights into their conformational pathway. Structural analysis of the fragment comprising the first 50 residues, CI2(1-50), indicates that it is mainly disordered, with patches of hydrophobic residues exposed to the solvent. Structural characterisation of the fragment CI2(1-63), which lacks only the C-terminal glycine, Gly64, shows native-like structure in all regions of the fragment. The study provides insights into the contribution of specific residues to the stability and co-operativity of the intact protein. We define a Φ_NMR value, derived from chemical shift analysis, which describes the build-up of structure at the level of individual residues (protons). All the macroscopic probes used to study the growth of structure in CI2 on elongation of the chain (circular dichroism, fluorescence and gel filtration) are in agreement with the residue-by-residue description by NMR. It is seen that secondary and tertiary structure build up in parallel in the fragments and show similar structures to those developed in the transition state for folding of the intact protein.

© 1997 Academic Press Limited

*Corresponding author

Keywords: protein folding; nascent polypeptide chain; phi-value; NMR; folding pathway

Present addresses: B. Davis, Ludwig Institute for Cancer Research, University College/Middlesex Hospital Branch, 91 Riding House Street, London WC1P 88T, UK; G. de Prat Gay, Departamento de Bioquimica Medica, Universidade Federale do Rio de Janeiro, Cidade Universitaria, Rio de Janeiro 21941-590, Brazil.

Abbreviations used: ANS, 8-anilinonaphthalene-1-sulphonate; CI2, chymotrypsin inhibitor 2; DQF-COSY, double quantum-filtered correlation spectroscopy; 1D, one-dimensional; 2D, two-dimensional; NOE, nuclear Overhauser effect; NOESY, two-dimensional nuclear Overhauser effect spectroscopy; TOCSY, two-dimensional total correlation spectroscopy; tm, midpoint of thermal denaturation; TPPI, time proportional phase increment; TSP, [2,2,3,3-²H₄]-(trimethylsilyl)propionate; ppm, parts per million.

0022-2836/97/160185-13 $25.00/0/mb970932

Running title: Folding of a Nascent Polypeptide Chain

Introduction

Most small, single-domain proteins fold readily and rapidly in solution into their unique native three-dimensional structure. Although the information contained in the amino acid sequence is known to encode its three-dimensional structure (Anfinsen, 1973), the mechanism by which the primary structure directs folding of a protein is controversial, with current ideas varying from folding being initiated by a collapse of the polypeptide chain into a compact structure (Dill et al., 1993), to the accretion of secondary structural elements of marginal stability that are formed on the submillisecond time-scale (Kim & Baldwin, 1990). We are interested in the description of how structure is attained by a polypeptide chain in solution as it grows from its N terminus (de Prat Gay et al., 1995a). With these studies, we can elucidate whether the growth of structure in a small protein is highly co-operative or not. Further, they will provide insights into the importance of hierarchical accretion of structures in protein folding.

CI2 is a small monomeric protein that folds in vitro via a two-state mechanism, without the accumulation of folding intermediates and with only a single kinetically significant transition state (Jackson & Fersht, 1991a,b; Jackson et al., 1993a); the transition state has been analysed at the residue level by the protein engineering method (Jackson et al., 1993a; Otzen et al., 1994; Itzhaki et al., 1995a). There are no fully developed regions of secondary structure in the transition state for folding, although the N-terminal region of the helix is well developed (Jackson et al., 1993a; Otzen et al., 1994; Itzhaki et al., 1995a). CI2 appears to fold by a nucleation-condensation mechanism whereby the structure collapses around the N terminus of the helix as the helix itself is in the process of being formed (Itzhaki et al., 1995a; Fersht, 1995). Thus, CI2 is the simplest system for folding studies and can perhaps be considered as a single domain of larger proteins.

In our studies we have used a truncated form of CI2 lacking the first 20 unstructured residues (Clore et al., 1987a; McPhalen & James, 1987), where residue 1 corresponds to residue 20 (Jackson et al., 1993b) of the long form; the folding and structural properties of both forms are similar (Jackson & Fersht, 1991a,b; Jackson et al., 1993a,b; Harpaz et al., 1994). Crystallographic (McPhalen & James, 1987; Harpaz et al., 1994) and NMR (Clore et al., 1987a,b; Ludvigsen et al., 1991) studies have defined the secondary structure of CI2, which is: residues Thr3 to Trp5, β-strand 1; Trp5 to Leu8, type III reverse turn; Leu8 to Gly10, type II reverse turn; residues Lys11 to Ser12, β-strand 2; residues Ser12 to Lys24, α-helix; residues Pro25 to Gln28, type I reverse turn; Gln28 to Val34, β-strand 3; Gly35 to Asp45, reactive site loop (extended structure); residues Arg46 to Asp52, β-strand 4; residues Asp52 to Asp55, turn; residues Asp55 to Ala58, β-strand 5; and residues Val60 to Gly64, β-strand 6.

Figure 1. Schematic representation of secondary structure elements in CI2. The different fragments analysed during the elongation of the polypeptide chain (those studied here are in bold) are indicated with arrows at the position of the cleavage point.

We are using CI2 as a model system to elucidate the structural preferences of a nascent polypeptide chain (de Prat Gay et al., 1994a,b, 1995a,b; de Prat Gay & Fersht, 1994; Ruiz-Sanz et al., 1995; Itzhaki et al., 1995b; Neira et al., 1996). We have obtained fragments of increasing length from the N terminus of CI2 (Figure 1).
The study of the smallest fragments by NMR and CD indicates that there is little drive for independent formation of local secondary structure in CI2 (Itzhaki et al., 1995b) in the absence of tertiary interactions, although non-native local hydrophobic interactions are present (de Prat Gay et al., 1994b; Itzhaki et al., 1995b; Neira et al., 1996). Fluorescence, ANS-binding, CD and gel filtration studies of the larger fragments (CI2(1-40) and larger; de Prat Gay et al., 1995b) indicated that the major changes in secondary structure and accessibility to hydrophobic sites occur in parallel between fragments CI2(1-40) and CI2(1-53). These biophysical probes give us global structural information, i.e. at a macroscopic level. Here, we investigate whether these changes are "mirrored" at the residue level. The only biophysical technique capable of giving residue-by-residue information is NMR.

Previously, we have described the solution conformation preferences of CI2(1-5), CI2(1-13), CI2(1-25), CI2(1-28) (Itzhaki et al., 1995b), CI2(1-40) (de Prat Gay et al., 1994b; Neira et al., 1996), CI2(1-53) (de Prat Gay et al., 1995b) and CI2(1-60) (de Prat Gay et al., 1995a) by NMR; here, we report the conformational preferences of CI2(1-50) and CI2(1-63). The study of the solution conformation of CI2(1-50) will clarify the observed "macroscopic" changes occurring between CI2(1-40) and CI2(1-53) (de Prat Gay et al., 1995b); finally, the study of CI2(1-63) will allow us to describe, in structural terms, the importance of β-strand 6 for the folding of intact CI2. In addition, with these assignments we now describe the folding of the CI2 nascent polypeptide chain residue-by-residue, and compare it with the kinetic folding pathway of the intact protein.

Results and Discussion

NMR characterisation of CI2(1-50)

CI2(1-50) has a high tendency to aggregate and binds ANS. However, there is no evidence of a concentration dependence of the NMR signals between 50 and 500 μM of protein (data not shown). The NOE pattern indicates that CI2(1-50) is mainly unstructured (Figure 2). Strong αN(i, i + 1) NOE contacts and the absence of sequential NN(i, i + 1) contacts are indicative of extended conformations (Wüthrich, 1986). The conformational shifts for the Hα protons corroborate these results: all of them are in the range expected for a random coil, i.e. deviations of less than 0.1 ppm (Figure 3). There are small upfield deviations in two regions: the first region involves residues Ala16, Lys17, Ile20 and Asp23, and the second region involves Asp45. Further, the methyl group of Thr3 is upfield shifted (1.02 ppm versus 1.22 ppm observed in random coil peptides (Wüthrich, 1986)). Those deviations at the C-cap of the α-helix were observed in smaller fragments (Itzhaki et al., 1995b), and they result from the presence of low populations of helical structures; however, the absence of medium and long-range contacts suggests that the α-helix is not formed. The Hα chemical shift of Asp45 may be affected by the presence of the aromatic ring of Tyr42, as has been seen in CI2(41-64) (Neira et al., 1996). Anomalous conformational shifts in Thr3 have been observed in smaller fragments (de Prat Gay et al., 1994a; Itzhaki et al., 1995b), which indicate non-random conformations around Trp5. The conformational shifts are different from those of the intact protein (Kjær et al., 1987; data not shown). There are no residues that are protected against exchange with solvent (see Materials and Methods). All the measured ³J(NH,Hα) coupling constants are in the range expected for averaged random conformations (data not shown). Fluorescence experiments indicated that Trp5 in CI2(1-50) is solvent-exposed (de Prat Gay et al., 1995b); this is corroborated by the NMR study, where the chemical shifts of the aromatic ring are similar to those expected for an extended conformation (Wüthrich, 1986; see Figure 9), and its aromatic side-chain does not make any long-range contacts.

In conclusion, CI2(1-50) is mainly disordered, in agreement with CD and fluorescence studies (de Prat Gay et al., 1995b). Thus, there are no important structural changes at the residue level on elongation of the chain from CI2(1-40) to CI2(1-50). CI2(1-50) is the smallest fragment to bind ANS significantly, indicating the exposure of hydrophobic clusters (de Prat Gay et al., 1995a,b). However, there is no evidence for this from the NMR data: the chemical shifts of the stretch of hydrophobic residues located toward the C terminus do not deviate from their random coil values, nor are NOE contacts observed between their side-chains; this is probably due to averaging of chemical shifts among the different explored conformations.

Figure 2. Sequence and summary of NOE connectivities for CI2(1-50). The thickness of the lines reflects the intensity of the sequential NOE connectivities, i.e. weak, medium and strong, observed by counting the number of levels. The labels α(N, β, γ)N(β, γ)(i, i + j) indicate an observed contact between the Hα(N, β, γ) proton of a residue and the NH(β, γ) proton of the i + j residue. Amino acids have been numbered according to the intact protein.

NMR characterisation of CI2(1-63)

This fragment contains all the residues except the C-terminal residue, Gly64. In the intact protein, the side-chain of Val63 is buried, oriented towards Thr3, and makes contacts with residues in strands 1, 3 and 6, the active site loop and two turns (Itzhaki et al., 1995a). The negative charge of the C terminus interacts with the positively charged side-chain of Arg46. It has been proposed that this interaction is important in maintaining the structure in this region (Li & Daggett, 1995).

Coupling constants and chemical shifts

Figure 3. Conformational shifts for the Hα protons of CI2(1-50), according to the reported random coil values (Wüthrich, 1986): $\Delta\delta(\mathrm{H}\alpha) = \delta(\mathrm{H}\alpha)_{\text{fragment}} - \delta(\mathrm{H}\alpha)_{\text{random coil}}$. The lines indicate the range accepted for the random coil (0.1 ppm). Gaps represent residues whose resonances could not be assigned. Both Hα protons of glycine residues are included.
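
The conformational shift plotted in Figure 3 is a simple difference between the observed and the random coil chemical shift. A minimal Python sketch of that bookkeeping is given here, using the Thr3 methyl shifts quoted in the text as a worked example. The function names and the dictionary layout are illustrative assumptions and are not part of the original analysis; the 0.1 ppm window is the random coil range quoted for the Hα protons and is applied to a methyl group here only for illustration.

```python
# Illustrative sketch only: function names and data layout are assumptions.

RANDOM_COIL_WINDOW_PPM = 0.1  # range quoted in the text for the H-alpha protons


def conformational_shift(delta_observed, delta_random_coil):
    """Return the conformational shift (ppm): observed minus random coil."""
    return delta_observed - delta_random_coil


def is_random_coil_like(shift, window=RANDOM_COIL_WINDOW_PPM):
    """True when the shift lies inside the accepted random coil range."""
    return abs(shift) <= window


def classify(shifts_observed, shifts_random_coil):
    """Map each proton label to (conformational shift, random-coil-like?)."""
    result = {}
    for label, observed in shifts_observed.items():
        shift = conformational_shift(observed, shifts_random_coil[label])
        result[label] = (shift, is_random_coil_like(shift))
    return result


# Thr3 methyl: 1.02 ppm in CI2(1-50) versus 1.22 ppm in random coil peptides.
example = classify({"Thr3 methyl": 1.02}, {"Thr3 methyl": 1.22})
shift, coil_like = example["Thr3 methyl"]
print(f"Thr3 methyl: {shift:+.2f} ppm, random-coil-like: {coil_like}")
```

With this sign convention a negative value corresponds to an upfield shift relative to the random coil reference.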

It was not possible to measure all the ³J(NH,Hα) coupling constants (data not shown), but those that could be measured were in good agreement with those of the intact protein (Ludvigsen et al., 1991). The Hα conformational shifts relative to random coil values are shown in Figure 4(A). On the other hand, the shifts of the majority of residues are very similar to those of the intact protein (Kjær et al., 1987). Only in two regions do they differ significantly (i.e. > 0.1 ppm): namely, the C terminus and the region around Ile44 (Figure 4(B)). The C-terminal residues are upfield shifted, mainly because of the removal of Gly64. Ile44 and Asp45 are at the beginning of β-strand 4 in the intact protein, and they pack against residue Gly64, which is deleted in CI2(1-63). Further, there is an intense non-native NN(i, i + 1) NOE between residues Asp45 and Arg46 (see below). Thus, differences in chemical shifts are restricted to the regions which are close to the deleted residue in the intact protein.

Figure 4. (A) Conformational shifts for the Hα protons of CI2(1-63), according to the reported random coil values (Wüthrich, 1986): $\Delta\delta(\mathrm{H}\alpha) = \delta(\mathrm{H}\alpha)_{\text{fragment}} - \delta(\mathrm{H}\alpha)_{\text{random coil}}$. (B) Differences between the chemical shifts of the Hα protons of CI2(1-63) and the values observed in the intact protein (Kjær et al., 1987): $\Delta\delta(\mathrm{H}\alpha) = \delta(\mathrm{H}\alpha)_{\text{fragment}} - \delta(\mathrm{H}\alpha)_{\text{intact protein}}$. The lines indicate the range accepted for the random coil or for the intact protein (0.1 ppm). Gaps represent residues whose resonances could not be assigned. Both Hα protons of glycine residues are included.

NOE study

The presence of medium-range αβ(i, i + 3), αN(i, i + 2), NN(i, i + 2) and sequential NN(i, i + 1) contacts in the region comprising residues Ser12 to Lys24 confirms the formation of the α-helix in CI2(1-63) (Figure 5). However, the formation of the strands comprising the β-sheet is not so clear. We can define the β-strands by the long-range contacts αN(i, j), NN(i, j) and αα(i, j) (Wüthrich, 1986), between residues i and j. The inter-strand long-range contacts in CI2(1-63) are represented in Figure 6, defining the following β-strands: residues Thr3 to Trp5 (β-strand 1), Gly10 to Lys11 (β-strand 2), Ile30 to Val34 (β-strand 3), Arg48 to Asp52 (β-strand 4), Asp55 to Ala58 (β-strand 5) and Val60 to Val63 (β-strand 6). In the intact CI2, β-strand 3 comprises residues Gln28 to Val34 and β-strand 4 comprises residues Arg46 to Asp52 (Ludvigsen et al., 1991). The shortening of both β-strands in CI2(1-63) is highlighted by the presence of two non-native sequential NN(i, i + 1) NOEs from residues Asp45 to Arg48, which are not observed in a rigid β-strand (Wüthrich, 1986). More inter-strand αα(i, j) NOEs could not be observed because of overlapping or proximity to the spectrum diagonal. The rest of the contacts observed in CI2(1-63) are essentially native-like (Ludvigsen et al., 1991).

Figure 5. Sequence and summary of NOE connectivities for CI2(1-63). The open rectangles and broken lines indicate those contacts not observed due to signal overlapping or proximity to the spectrum diagonal. At the bottom of the Figure, a hatched rectangle indicates those "slow" NH protons found in the solvent protection experiments (see Materials and Methods). The thickness of the lines reflects the intensity of the sequential NOE connectivities, i.e. weak, medium and strong, observed by counting the number of levels. The labels α(N, β, γ)N(β, γ)(i, i + j) indicate an observed contact between the Hα(N, β, γ) proton of a residue and the NH(β, γ) proton of the i + j residue. Amino acids have been numbered according to the intact protein.

Figure 6. Schematic representation of the β-sheet structure of CI2(1-63) as determined in solution from the long-range contacts, chemical shifts and exchange-pattern protection. The continuous arrows indicate NOE contacts also observed in the intact CI2; broken arrows indicate those non-native NOE contacts and the broken lines indicate an ambiguous NOE contact due to overlap with another signal.

Solvent-exchange experiments

The exchange experiments (Figure 5) indicate protection for residues involved in β-strand 1, β-strand 2, residues Ser12 to Lys24, residues Gln27 to Leu32, residue Val34 and residues Val47 to Gln59 (forming β-strands 4 and 5, and the turn between them). Exchange rates are slow for those amide protons which are buried in the interior of the protein (i.e. not solvent-accessible) and/or hydrogen-bonded (Englander & Kallenbach, 1984). There is a good agreement with the exchange behaviour found in the intact protein (Ludvigsen et al., 1991; Neira et al., 1997). Discrepancies are observed in residues located in the turn region (Lys53 to Asn56) where the exchange was reported as "fast" for the intact protein (Ludvigsen et al., 1991). The discrepancy may arise, however, because of the different temperatures used in the two experiments (5 °C in this study and 25 °C for intact CI2 (Ludvigsen et al., 1991)). Figure 7 shows the NH-NH region of the NOESY spectrum in ²H₂O. The sequential contacts between non-exchanging residues involved in the α-helix are indicated.

Figure 7. NH-NH region in ²H₂O of CI2(1-63) showing the presence of sequential NN(i, i + 1) NOEs defining the α-helical structure. The NN(i, i + 1) NOE contact between residues Val13 and Glu14 was not observed in ²H₂O.

In conclusion, the experiments suggest that the α-helix extends from residues Ser12 to Lys24 as in the intact protein, and that the β-strands in CI2(1-63) comprise residues Thr3 to Trp5 (β-strand 1), Gly10 to Lys11 (β-strand 2), Ile30 to Val34 (β-strand 3), Arg48 to Val51 (β-strand 4), Asp55 to Ala58 (β-strand 5) and Val60 to Val63 (β-strand 6). Residues Asp45 to Val47 appear to adopt β-strand-like conformations, although these conformations are not occupied all the time. The main-chain hydrogen bond between Arg46 and Gly64 is absent, because Gly64 is removed. This results in a displacement of the β-sheet around the beginning of β-strand 4, but not a complete disruption of the native-like β-sheet scaffold, as judged by the similarity in conformational shifts in the other strands and the pattern of solvent protection when compared with the intact protein. This movement of β-strand 4 exposes hydrophobic regions (mainly those on the edges of β-strands 3 and 4), resulting in the previously described ANS-binding properties (de Prat Gay et al., 1995b).

The core of the protein appears to be completely formed since long-range contacts between Trp5 and residues in strands 1 and 6 (Lys2, Thr3, Glu4, Arg62, Val63), residue Val47 and residues in the α-helix (Val19, Ile20 and Asp23) are observed (Table 1). This suggests that the environment around Trp5 is native-like, i.e. it is completely buried and well-fixed. The same conclusion was obtained from the fluorescence experiments (de Prat Gay et al., 1995b). The region termed the minicore (formed by the side-chains of residues Leu32, Val38 and Phe50) is also completely fixed, as is judged from the long-range contacts between the aromatic moiety of Phe50 and residues Leu32, Val34 and Thr36. However, the minicore is substantially formed already in fragment CI2(1-53) (de Prat Gay et al., 1995b).

Table 1. Medium and long-range NOE interactions with the side-chain of Trp5 in aqueous solution in CI2(1-63) (pH 4.5, sodium acetate buffer, 5 °C)

Aromatic proton (a): NH (10.42); 2H (7.16); 4H (7.38); 5H (6.64); 6H (6.77); 7H (7.23)

NOE contacts with (b):
D42 Hβ; P6 Hβ; E7 NH, Hβ; L8 Hβ, Meδ; V19 Hα, Meγ; I20 Hγ, Meδ; T3 Hα; L8 Hα, Hβ, Meδ; I20 Hα, Hβ, Hγ, Meδ, Meγ; R62 NH, Hα, Hβ; P61 Hα; T3 Hα, Hβ; E4 Hα; P6 Hβ; E7 Hβ; I20 Hβ, Hγ, Meδ, Meγ; D23 Hβ; R62 Hα, Hβ; V63 NH, Hα, Hβ, Meγ; P6 Hβ, Hγ; L8 Hβ; I20 Hα, Hβ, Hγ, Meδ, Meγ; D23 Hβ; R62 Hα, Hβ, Hγ; V63 NH, Meγ; T3 Hα; E7 Hβ; I20 Meδ, Meγ; D23 Hα, Hβ
V19 Meγ; K2 Hγ
V19 Meγ; V47 Meγ; V63 Meγ; V47 Meγ
K2 Hγ; V19 Meγ; V47 Meγ
K2 Hγ; P6 Hγ; V19 Meγ; V47 Meγ; R62 Hγ; V63 Meγ

(a) The nomenclature of the protons of the aromatic moiety was according to Wüthrich (1986).
(b) The contacts with own backbone protons are not indicated.

The tm value of CI2(1-63) is 55 °C compared with 85 °C in CI2 (de Prat Gay et al., 1995b). The free energy of denaturation in water (measured by fluorescence) is also less in CI2(1-63) than in the intact protein: 2.0(0.1) and 7.60(0.12) kcal mol⁻¹, respectively (Itzhaki et al., 1995a). These findings indicate that even when the tertiary structure is native-like and there is a tight packing of the majority of the side-chains, the distortions in β-strand 4 and the removal of the main-chain hydrogen bond between Arg46 and Gly64 cause a loss of stability. On the other hand, CI2(1-63) is the first fragment in the series of the N-terminal fragments where a co-operative thermal transition is observed (de Prat Gay et al., 1995b); it seems that co-operativity in CI2 requires the packing of the majority of the side-chains.

Macroscopic versus residue-by-residue techniques: a Φ_NMR value

We have shown that during the elongation of the nascent polypeptide chain of CI2, secondary and tertiary interactions occur simultaneously, using fluorescence, CD, ANS-binding and gel filtration measurements (de Prat Gay et al., 1995b), but what is the behaviour of individual residues? We have used NMR to monitor, residue-by-residue, the formation of structure in the nascent chain. The changes in chemical shifts provide a simple way to describe the behaviour of the nascent polypeptide chain at a residue level; however, a more
meaningful measurement would compare the chemical shift values observed in each fragment with those observed in the intact protein. In addition, the chemical shifts should be normalized with respect to standard values, i.e. the chemical shifts of random coil models (Wüthrich, 1986). With this approach, we can define a Φ_NMR value, analogous to that used in the protein engineering method for studying protein folding (Fersht et al., 1992):

$$\Phi_{\mathrm{NMR}} = \frac{\delta_{\text{fragment}} - \delta_{\text{random-coil}}}{\delta_{\text{protein}} - \delta_{\text{random-coil}}}$$

where δ_fragment is the chemical shift of a particular proton in a fragment, δ_protein is the chemical shift of the same proton but in the intact protein (Kjær et al., 1987) and δ_random-coil is the chemical shift for that proton measured in the random-coil models (Wüthrich, 1986). The Φ_NMR value is similar to the general coefficient of similarity defined by Gower (1971) and used in studies of denatured proteins (Shortle & Abeygunawardana, 1993). As for the Φ value in the protein engineering studies, the Φ_NMR value is generally between 0 and 1; a value of zero indicates that the environment of the proton in the fragment is similar to that in a random coil model; a value equal to one indicates that the environment of the proton in the fragment is similar to that in the intact protein. Since the NH protons are very sensitive to any change in the solution conditions (Oldfield, 1995), we have calculated Φ_NMR for only the Hα protons. There are some residues for which the Hα chemical shift in the intact protein is very close to the random coil value (within 0.1 ppm), resulting in a value of (δ_protein − δ_random-coil) which is close to zero, and consequently large errors in the determination of Φ_NMR. Therefore, these residues are not included in the analysis. We have divided the residues according to the different structural elements previously described (Itzhaki et al., 1995a).

Core residues

The hydrophobic core is formed by docking of the α-helix onto the β-sheet and comprises residues Trp5, Leu8, Ala16, Val19, Ile20, Ala27, Ile29, Val47, Leu49, Val51, Ile57 and Pro61. The evolution of the Hα resonances and Φ_NMR values are shown in Figure 8(A) and (B), respectively, for selected residues belonging to the unique α-helix and β-turn regions. All the residues show an abrupt change when the chain is extended from 50 to 53 residues. CI2(1-53) was the first fragment for which there was evidence of α-helix formation and tertiary structure (de Prat Gay et al., 1995b). Formation of the hydrophobic core can also be followed with the aromatic resonances of Trp5 and the side-chain resonances of Ile20 (Figure 9). In the smaller fragments, random-coil chemical shifts are observed and the Φ_NMR values are close to zero. Weak peaks are observed for the indole resonances of Trp5 in CI2(1-53) and CI2(1-60), in addition to native-like resonances (Figure 9, inset), with chemical shifts that are close to the random-coil values. Thus, there is a minor (less than 10% as judged from the intensity in the 1D-NMR experiments), solvent-exposed conformation. This is in agreement with the fluorescence data (de Prat Gay et al., 1995b), which suggest that the tryptophan is partially buried in these two fragments. Only native-like resonances were observed for the side-chain of Ile20 in these fragments, but any peaks with random-coil chemical shifts would not be observed because of overlap in the methyl region.

Figure 8. (A) Evolution of the Hα resonances during elongation of the polypeptide chain for residues involved in the core of CI2. (B) Φ_NMR parameter for the same residues. The Φ_NMR parameter of Ala27 in fragments CI2(1-60) and CI2(1-53) has been scaled to 1.

α-Helix residues

Helix formation can be identified easily by an upfield displacement of the Hα resonances. The evolution of the chemical shifts and the Φ_NMR values are shown in Figure 10(A) and (B), respectively, for residues along the α-helix of the intact protein. All display similar behaviour: a sharp
change is observed on elongation from 50 to 53 residues. The dramatic stabilisation of the α-helix on elongation of the chain from 50 to 53 residues can be explained as follows. On extending from CI2(1-50) to CI2(1-53), residue Phe50 is added (in CI2(1-50), it is a homoserine). The bulky side-chain may drive formation of the minicore in CI2(1-53) (de Prat Gay et al., 1995b), and other tertiary interactions. Thus, the α-helix is formed only when sufficient tertiary interactions are developed to stabilise it, indicating that secondary and tertiary structures are formed simultaneously. This was observed also in the far-UV CD experiments (de Prat Gay et al., 1995b). Although the α-helix is present in CI2(1-53), it is not well-fixed, as indicated by the weakness of the medium-range NOEs (de Prat Gay et al., 1995b). Thus, the last β-strands (4 and 5) are necessary to fix the helix, and this is reflected in a significant increase in Φ_NMR values on elongation beyond residue 53, and stronger medium-range NOEs (de Prat Gay et al., 1995a).

Figure 9. (A) Evolution of the aromatic resonances of Trp5 during elongation of the polypeptide chain; inset: evolution of the indole signals in the folded form (continuous line) and in the unfolded form (broken line) (see the text). Nomenclature of the aromatic signals was according to Wüthrich (1986). (B) Evolution of the methylene and methyl resonances of Ile20 during elongation of the polypeptide chain.

Figure 10. (A) Evolution of the Hα resonances during elongation of the polypeptide chain for residues involved in the α-helix of CI2. (B) Φ_NMR parameter for the same residues.

β-Sheet residues

Figure 11(A) and (B) shows that β-strands 1 to 3 form only on elongation of the chain beyond residue 50. There are not sufficient fragments to describe the formation of strands 4 to 6. The β-sheet is only weakly populated in CI2(1-53): native-like chemical shifts are observed for residues in β-strands 3 and 4 (de Prat Gay et al., 1995b). The β-sheet becomes structured on going from residues 60 to 63. β-Strands 4 and 5 are better formed than the others in CI2(1-60) (as indicated by the long-range contacts and hydrogen exchange rates (de Prat Gay et al., 1995a)). Both β-strands make contacts with the α-helix that contribute to the hydrophobic core: Leu49 and Ile57 are hydrophobic core residues whose side-chains interact with that of Ala16, a key residue in the folding of CI2 (see below). The β-scaffold is present when 63 residues are reached, but only the central residues of β-strands 3 and 4 are fixed. Thus, the β-strands begin to form only when the α-helix is formed, and then require the last few residues for a rigid structure. Again, this supports the conclusion that secondary
and tertiary structure in CI2 are formed concomitantly.

Figure 11. (A) Evolution of the Hα resonances during elongation of the polypeptide chain for residues involved in the β-strands of CI2. (B) Φ_NMR parameter for the same residues. The Φ_NMR parameter of Ile30 in fragment CI2(1-53) has been scaled to 1.

Other regions (the reactive-site loop, the minicore and the turns) do not give any conclusive results since these are composed of residues towards the C terminus, and therefore they are not present in a sufficient number of fragments to describe their behaviour.

Finally, comparison of the changes in chemical shifts as a function of chain length shows that, for all the residues, there is little change until residue 53 is added. At this point, there is a dramatic change to native-like conformations (Figures 8 to 12). The transition occurs very co-operatively, and there are no substructures which form earlier than others. Neither are there any "non-native" states: there are no chemical shifts which deviate from random coil values by more than those of the intact protein (Φ_NMR > 1), or which show deviations of the opposite sign to the intact protein.

In conclusion, the NMR analysis shows that secondary and tertiary interactions are formed concomitantly on elongation of the polypeptide chain. It was suggested that important changes occurred between CI2(1-40) and CI2(1-53) based on the changes in CD, fluorescence and ANS-binding experiments. Now we can identify precisely that the changes occur between CI2(1-50) and CI2(1-53), when there are sufficient tertiary interactions present to stabilise the secondary structure. In Figure 12, we compare the results of our previous fluorescence and CD studies (de Prat Gay et al., 1995b) with the corresponding Φ_NMR values for selected residues of the α-helix and β-sheet. The CD and fluorescence data suggest an earlier and
more gradual accretion of native structure than the Φ_NMR values would indicate, although both probes do support the major conclusion that the largest change occurs after CI2(1-50). It must also be pointed out that CD provides information on the overall molecule, while the Φ_NMR values are residue-specific. We can calculate a mean value of Φ_NMR for each fragment (by averaging the Φ_NMR values for every residue) and these values agree well with the CD results (data not shown). Thus, all the probes appear to map the concomitant acquisition of secondary and tertiary structure in CI2.

Figure 12. Summary of the conformational changes in the nascent polypeptide chain of CI2. CD and fluorescence data are from de Prat Gay et al. (1995b). The Φ_NMR values of residues Val13 (α-helix) and Glu4 (β-sheet) are indicated.

Conclusions and implications for the mechanism of protein folding pathways

It has been suggested that folding of nascent polypeptides proceeds according to a hierarchical mechanism, i.e. the elements of secondary structure form when the nascent chain is released from the ribosome (Tsou, 1988). We have shown, using a range of biophysical techniques, that in CI2 the folding is not hierarchical: all the secondary structural elements remain weakly populated, if at all, until there are sufficient residues to develop the tertiary interactions required to stabilise them. Only then are secondary and tertiary structures formed. In larger proteins or proteins with several domains, a hierarchical mechanism could occur with the individual domains folding when their full sequence information is available.

CI2 folds via a single rate-determining transition state, which has been characterised by protein engineering (Otzen et al., 1994; Itzhaki et al., 1995a). In both the transition state and the series of states described here, the α-helix is the most developed element of structure, but only when there are contacts with the C-terminal β-strands. It has been shown, by experimental and theoretical approaches (Itzhaki et al., 1995a; Neira et al., 1996; Shakhnovich et al., 1996; Daggett et al., 1996), that Ala16 interacts strongly with Leu49 and Ile57 in the transition state of folding of CI2 and these interactions form the core of the nucleus of folding; further, the α-helix is well-formed in the transition state (Otzen et al., 1994; Itzhaki et al., 1995a; Neira et al., 1996). The structure of the transition state for folding is reminiscent of those of CI2(1-60) and CI2(1-53): the α-helix is formed in these larger fragments because there are sufficient tertiary contacts with β-strands. Thus, the interactions required to stabilise the structural features in the nascent chain parallel the elements of the structure which develop in the folding nucleus of the kinetic pathway of folding of the intact protein.

Materials and Methods

Materials

Cyanogen bromide (CNBr) was purchased from Fluka (Switzerland). 2H2O and TSP were obtained from Aldrich Chemical Co. (Gillingham, UK); d3-acetic acid and d3-sodium acetate were from Sigma. Standard suppliers were used for all other chemicals, and water was deionized and purified on an Elgastat system.

Preparation, cleavage and purification of the fragments

CI2(1-50) was produced by CNBr cleavage of engineered protein with a newly introduced Met residue at position 50 (F50M mutant), as described (de Prat Gay et al., 1995a). Purification and treatment have been described elsewhere (de Prat Gay et al., 1995a). The fragment CI2(1-63) was obtained by introducing a stop codon at position 64 in wild-type protein and purified as described (de Prat Gay et al., 1995a). The concentration was determined using the tabulated molar extinction coefficients from model compounds (Gill & von Hippel, 1989). The composition of the fragments was checked by time-of-flight mass spectrometry, and the purified samples were lyophilised and stored at −70 °C.

NMR spectroscopy

The spectra were recorded on a Bruker AMX-500 or a DMX-600 spectrometer. All the spectra were acquired at 5 °C. 1H chemical shifts are quoted relative to external TSP (0.0 ppm) measured under the same solution conditions. Lyophilised fragments were dissolved in 0.5 ml of 50 mM buffer (sodium d3-acetate, pH 4.5) in 90% H2O/10% 2H2O. In all cases, the pH was measured using a Russell glass electrode. No correction was made for isotope effects. Peptide concentrations were between 1.5 and 2.0 mM for CI2(1-63) and 400 μM for CI2(1-50). Aggregation was assessed by examining the chemical shifts and linewidths of tenfold-diluted samples under the same conditions. CI2(1-63) did not show any evidence of concentration dependence of the NMR signals within the concentration range used.
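The concentration determination from tabulated extinction coefficients mentioned under the preparation of the fragments can be illustrated with a minimal Python sketch. It assumes the commonly quoted Gill & von Hippel (1989) contributions of Trp and Tyr at 280 nm for denatured protein and omits the small cystine term. The residue counts, the absorbance reading and the function names are hypothetical and are not taken from the present study.

```python
# Per-residue contributions to the molar extinction coefficient at 280 nm
# (M^-1 cm^-1). ASSUMPTION: values as commonly quoted from Gill & von Hippel
# (1989) for denatured protein; check against the published table before use.
EPS280_TRP = 5690.0
EPS280_TYR = 1280.0


def molar_extinction_280(n_trp: int, n_tyr: int) -> float:
    """Sum the Trp and Tyr contributions (the small cystine term is omitted)."""
    return n_trp * EPS280_TRP + n_tyr * EPS280_TYR


def molar_concentration(a280: float, epsilon: float, path_cm: float = 1.0) -> float:
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/l."""
    if epsilon <= 0 or path_cm <= 0:
        raise ValueError("epsilon and path length must be positive")
    return a280 / (epsilon * path_cm)


if __name__ == "__main__":
    # Hypothetical fragment with one Trp and one Tyr, and a made-up reading.
    eps = molar_extinction_280(n_trp=1, n_tyr=1)
    conc = molar_concentration(a280=0.697, epsilon=eps)
    print(f"epsilon_280 = {eps:.0f} M^-1 cm^-1, c = {conc * 1e6:.1f} uM")
```

In practice the absorbance would be read on a diluted aliquot and the result scaled by the dilution factor to recover the stock concentration.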
Two-dimensional homonuclear NOESY (Jeener et al., 1979; Macura et al., 1981), TOCSY (Bax & Davis, 1985; Braunschweiler & Ernst, 1983), and DQF-COSY (Piantini et al., 1982) spectra were acquired. The spectra were recorded, in all cases, with 2048 complex data points and 128 t1 increments using the TPPI method (Marion & Wüthrich, 1983). Typically, 64 scans per t1 increment were obtained. The water signal was attenuated by presaturation during the relaxation delay (one second) and, additionally, in the NOESY during the mixing time (125 ms). For TOCSY spectra, acquired with the MLEV-17 sequence, the mixing times were 70 to 80 ms. Spectral widths were 8064 Hz in both dimensions for all experiments. The spectra were processed using BRUKER-UXNMR software on an Aspect X-32 workstation. Prior to Fourier transformation, squared sine bell window functions, shifted π/4 in the t2 dimension and π/6 in the t1 dimension, were applied. Polynomial base-line correction was applied in all cases in both dimensions. The final 2D data matrix contained 2000 × 1000 data points in all the experiments.

Exchange experiments in CI2(1-50) were carried out at identical pH by dissolving the sample in 90% 2H2O and 10% H2O. A 1D-NMR experiment was acquired 20 minutes after dissolving the sample. No amide protons were observed, indicating that there is no significant protection in the fragment. Exchange experiments in CI2(1-63) were carried out at the same pH by dissolving the sample in 90% 2H2O and 10% H2O and acquiring a NOESY experiment, because solvent protection in intact CI2 was determined after acquisition of a 12 hour COSY experiment, at 25 °C and pH 4.2 (Ludvigsen et al., 1991). We classified the protons as "slow" if they appear in the NOESY experiment and "fast" if they are not present.

Assignment of the fragments was carried out using standard procedures (Wüthrich, 1986). Since the NH protons show differences between the different sets of random coil models (Wüthrich, 1986; Wishart et al., 1995; Merutka et al., 1995), the analysis of the conformational shifts in CI2(1-50) and CI2(1-63) was carried out on the Hα protons, using the Wüthrich values (Wüthrich, 1986).
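The conformational-shift analysis of the Hα protons can be sketched as the difference between the observed shift and a random-coil reference, Δδ = δ(observed) − δ(random coil). The short Python example below is an illustration under stated assumptions only: the random-coil entries are values recalled for a few residue types and should be checked against the Wüthrich (1986) tables, the observed shifts are hypothetical, and the function names are not part of any software used in this work.

```python
# ASSUMPTION: random-coil H-alpha shifts (ppm) for a few residue types,
# recalled from commonly quoted tables; verify against Wuthrich (1986).
RANDOM_COIL_HA = {"Ala": 4.35, "Glu": 4.29, "Leu": 4.38, "Val": 4.18}


def conformational_shift(residue_type: str, observed_ppm: float) -> float:
    """Return observed minus random-coil H-alpha shift, in ppm."""
    return observed_ppm - RANDOM_COIL_HA[residue_type]


def conformational_shifts(assignments):
    """Map residue label -> conformational shift, rounded to 0.01 ppm.

    `assignments` is an iterable of (label, residue_type, observed_ppm).
    """
    return {
        label: round(conformational_shift(res_type, obs), 2)
        for label, res_type, obs in assignments
    }


if __name__ == "__main__":
    # Hypothetical observed shifts, for illustration only.
    example = [("Glu4", "Glu", 4.60), ("Val13", "Val", 3.90)]
    for label, delta in conformational_shifts(example).items():
        print(f"{label}: {delta:+.2f} ppm")
```

As a general rule of thumb, upfield (negative) Hα conformational shifts tend to accompany helical conformations and downfield (positive) ones extended strands, so the sign and magnitude of Δδ give a residue-level indication of structure.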

Acknowledgements

J.L.N. was supported by an EC Human Capital and Mobility fellowship, L.S.I. by a Beit Memorial fellowship in Medical Research and A.G.L. by a predoctoral fellowship from the Boehringer Ingelheim Fonds, Germany.

References

Anfinsen, C. B. (1973). Principles that govern the folding of polypeptide chains. Science, 181, 223–230.
Bax, A. & Davis, D. G. (1985). MLEV-17 based two-dimensional homonuclear magnetisation transfer spectroscopy. J. Magn. Reson. 65, 355–360.
Braunschweiler, L. & Ernst, R. R. (1983). Coherence transfer by isotropic mixing: application to proton correlation spectroscopy. J. Magn. Reson. 53, 521–528.
Clore, G. M., Gronenborn, A. M., James, M. N. G., Kjær, M., McPhalen, C. A. & Poulsen, F. M. (1987a). Comparison of the solution structure and X-ray structure of barley serine proteinase inhibitor 2. Protein Eng. 1, 313–318.
Clore, G. M., Gronenborn, A. M., Kjær, M. & Poulsen, F. M. (1987b). The determination of the three-dimensional structure of barley serine proteinase inhibitor 2 by nuclear magnetic resonance, distance geometry and restrained molecular dynamics. Protein Eng. 1, 305–312.
Daggett, V., Li, A., Itzhaki, L. S., Otzen, D. E. & Fersht, A. R. (1996). Structure for the transition state of folding of a protein derived from experiment and simulation. J. Mol. Biol. 257, 430–440.
de Prat Gay, G. & Fersht, A. R. (1994). Generation of a family of protein fragments for structure-folding studies. 1. Folding complementation of two fragments of chymotrypsin inhibitor-2 formed by cleavage at its unique methionine residue. Biochemistry, 33, 7957–7963.
de Prat Gay, G., Ruiz-Sanz, J. & Fersht, A. R. (1994a). Generation of a family of protein fragments for structure-folding studies. 2. Kinetics of association of the two chymotrypsin inhibitor-2 fragments. Biochemistry, 33, 7964–7970.
de Prat Gay, G., Ruiz-Sanz, J., Davis, B. & Fersht, A. R. (1994b). The structure of the transition state for the association of two fragments of the barley chymotrypsin inhibitor-2 to generate native-like protein and implications for mechanism of protein folding. Proc. Natl Acad. Sci. USA, 91, 10943–10946.
de Prat Gay, G., Ruiz-Sanz, J., Neira, J. L., Itzhaki, L. S. & Fersht, A. R. (1995a). Folding of a nascent polypeptide chain in vitro. Co-operative formation of structure in a protein module. Proc. Natl Acad. Sci. USA, 92, 2683–2686.
de Prat Gay, G., Ruiz-Sanz, J., Neira, J. L., Corrales, F. J., Otzen, D. E., Ladurner, A. G. & Fersht, A. R. (1995b). Conformational pathway of the polypeptide chain of chymotrypsin inhibitor-2 growing from its N terminus in vitro. Parallels with the protein folding pathway. J. Mol. Biol. 254, 968–979.
Dill, K. A., Fiebig, K. M. & Chan, H. S. (1993). Co-operativity in protein folding kinetics. Proc. Natl Acad. Sci. USA, 90, 1942–1946.
Englander, S. W. & Kallenbach, N. R. (1984). Hydrogen exchange and structural dynamics of protein and nucleic acids. Quart. Rev. Biophys. 16, 521–655.
Fersht, A. R. (1995). Optimisation of rates of protein folding: the nucleation-condensation mechanism and its implications. Proc. Natl Acad. Sci. USA, 92, 10869–10873.
Fersht, A. R., Matouschek, A. & Serrano, L. (1992). The folding of an enzyme. I. Theory of protein engineering analysis of stability and pathway of protein folding. J. Mol. Biol. 224, 771–782.
Gill, S. C. & von Hippel, P. H. (1989). Extinction coefficients from amino acid sequence data. Anal. Biochem. 182, 319–326.
Gower, J. C. (1971). General coefficient of similarity and some of its properties. Biometrics, 27, 857–871.
Harpaz, Y., elMasry, N. F., Fersht, A. R. & Henrick, K. (1994). Direct observation of better hydration at the N terminus of an α-helix with glycine rather than alanine as the N-cap residue. Proc. Natl Acad. Sci. USA, 91, 311–315.
Itzhaki, L. S., Otzen, D. E. & Fersht, A. R. (1995a). The structure of the transition state for folding of chymotrypsin inhibitor-2 analysed by protein engineering methods: evidence for a nucleation-condensation mechanism for protein folding. J. Mol. Biol. 254, 260–288.
Itzhaki, L. S., Neira, J. L., Ruiz-Sanz, J., de Prat Gay, G. & Fersht, A. R. (1995b). Search for nucleation sites in smaller fragments of chymotrypsin inhibitor-2. J. Mol. Biol. 254, 289–304.
Jackson, S. E. & Fersht, A. R. (1991a). Folding of chymotrypsin inhibitor-2. 1. Evidence for a two-state transition. Biochemistry, 30, 10428–10435.
Jackson, S. E. & Fersht, A. R. (1991b). Folding of chymotrypsin inhibitor-2. 2. Influence of proline isomerisation on the folding kinetics and thermodynamic characterisation of the transition state of folding. Biochemistry, 30, 10436–10443.
Jackson, S. E., elMasry, N. F. & Fersht, A. R. (1993a). Structure of the hydrophobic core in the transition state for folding chymotrypsin inhibitor-2: a critical test of the protein engineering method of analysis. Biochemistry, 32, 11270–11278.
Jackson, S. E., Moracci, M., elMasry, M., Johnson, C. M. & Fersht, A. R. (1993b). Effect of cavity creating mutations in the hydrophobic core of chymotrypsin inhibitor 2. Biochemistry, 32, 11259–11269.
Jeener, J., Meier, B., Backman, P. & Ernst, R. R. (1979). Investigation of exchange processes by two-dimensional NMR spectroscopy. J. Chem. Phys. 71, 4546–4550.
Kim, P. S. & Baldwin, R. L. (1990). Intermediates in the folding reactions of small proteins. Annu. Rev. Biochem. 59, 631–660.
Kjær, M., Ludvigsen, S., Sorensen, O. W., Denys, L. A., Kindler, J. & Poulsen, F. M. (1987). Sequence specific assignment of the proton nuclear magnetic resonance spectrum of barley serine proteinase inhibitor 2. Carlsberg Res. Commun. 52, 353–362.
Li, A. & Daggett, V. (1995). Investigation of the solution structure of chymotrypsin inhibitor 2 using molecular dynamics: comparison to X-ray crystallographic and NMR data. Protein Eng. 8, 1117–1128.
Ludvigsen, S., Shen, H., Kjær, M., Madsen, J. C. & Poulsen, F. M. (1991). Refinement of the three-dimensional solution structure of barley serine proteinase inhibitor 2 and comparison with the structures in crystals. J. Mol. Biol. 222, 621–635.
Macura, C., Huang, Y., Suter, D. & Ernst, R. R. (1981). Two-dimensional chemical-exchange and cross-relaxation spectroscopy of coupled nuclear spins. J. Magn. Reson. 43, 259–281.
Marion, D. & Wüthrich, K. (1983). Application of phase-sensitive two-dimensional correlated spectroscopy (COSY) for measurements of 1H-1H spin-spin coupling constants in proteins. Biochem. Biophys. Res. Commun. 11, 967–975.
McPhalen, C. A. & James, M. N. G. (1987). Crystal and molecular structure of the serine protease inhibitor (CI-2) from barley seeds. Biochemistry, 26, 261–269.
Merutka, G., Dyson, J. H. & Wright, P. E. (1995). "Random coil" 1H chemical shifts obtained as a function of temperature and trifluoroethanol concentration for the peptide series. J. Biomol. NMR, 5, 14–24.
Neira, J. L., Davis, B., Ladurner, A. G., Buckle, A., de Prat Gay, G. & Fersht, A. R. (1996). Towards the complete characterisation of a protein folding pathway: the structures of the denatured, transition and native states for the association/folding of two complementary fragments of cleaved chymotrypsin inhibitor 2. Direct evidence for a nucleation-condensation mechanism. Folding Design, 1, 189–208.
Neira, J. L., Itzhaki, L. S., Otzen, D. E., Davis, B. & Fersht, A. R. (1997). Hydrogen exchange in chymotrypsin inhibitor 2 probed by mutagenesis. J. Mol. Biol. In the press.
Oldfield, E. (1995). Chemical shifts and three-dimensional protein structures. J. Biomol. NMR, 5, 217–225.
Otzen, D. E., Itzhaki, L. S., elMasry, N. F., Jackson, S. E. & Fersht, A. R. (1994). Structure of the transition state for the folding/unfolding of chymotrypsin inhibitor-2 and its implications for mechanisms of protein folding. Proc. Natl Acad. Sci. USA, 91, 10422–10425.
Piantini, U., Sorensen, O. W. & Ernst, R. R. (1982). Multiple quantum filter for elucidating NMR coupling networks. J. Am. Chem. Soc. 104, 6800–6801.
Ruiz-Sanz, J., de Prat Gay, G., Otzen, D. E. & Fersht, A. R. (1995). Protein-fragments as models for events in protein folding pathways. Protein engineering analysis of the association of two complementary fragments of the barley chymotrypsin inhibitor-2 (CI-2). Biochemistry, 34, 1695–1701.
Shakhnovich, E. U., Abkevich, V. & Ptitsyn, O. (1996). Conserved residues and the mechanism of protein folding. Nature, 379, 96–98.
Shortle, D. & Abeygunawardana, C. (1993). NMR analysis of the residual structure in the denatured state of an unusual mutant of staphylococcal nuclease. Structure, 1, 121–134.
Tsou, C. L. (1988). Folding of a nascent peptide chain into a biologically active protein. Biochemistry, 27, 1809–1812.
Wishart, D. S., Bigam, C. G., Holm, A., Hodges, R. S. & Sykes, B. D. (1995). 1H, 13C and 15N random coil NMR chemical shifts of the common amino acids. I. Investigations of the nearest-neighbour effects. J. Biomol. NMR, 5, 67–81.
Wüthrich, K. (1986). NMR of Proteins and Nucleic Acids, John Wiley & Sons, Inc., New York.

Edited by J. Karn (Received 6 November 1996; received in revised form 21 January 1997; accepted 22 January 1997)

http://www.hbuk.co.uk/jmb Supplementary material for this paper, comprising one Table, is available from JMB Online.