Structure of the Transition State in the Folding Process of Human Procarboxypeptidase A2 Activation Domain

Article No. mb982158

J. Mol. Biol. (1998) 283, 1027–1036

Virtudes Villegas1, Jose C. Martínez2,3, Francesc X. Avilés1 and Luis Serrano2*

1 Departament de Bioquímica i Institut de Biologia Fonamental, Universitat Autònoma de Barcelona (UAB), 08193 Bellaterra, Barcelona, Spain

2 European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, Heidelberg D-69012, Germany

3 Departamento de Química Física, Facultad de Ciencias, Universidad de Granada, 18071 Granada, Spain

The transition state for the folding pathway of the activation domain of human procarboxypeptidase A2 (ADA2h) has been analyzed by the protein engineering approach. Recombinant ADA2h is an 81-residue globular domain with no disulfide bridges or cis-prolyl bonds, which follows a two-state folding transition. Its native fold is arranged in two α-helices packing against a four-stranded β-sheet. Application of the protein engineering analysis to 20 single-point mutants spread throughout the whole sequence indicates that the transition state for this molecule is quite compact, possessing some secondary structure and a hydrophobic core in the process of being consolidated. The core (folding nucleus) is made by the packing of α-helix 2 and the two central β-strands. The other two strands, at the edges of the β-sheet, and α-helix 1 seem to be completely unfolded. These results, together with previous analysis of ADA2h with either of its two α-helices stabilized through improved local interactions, suggest that α-helix 1 does not contribute to the folding nucleus, even though it is partially folded in the denatured state under native conditions. On the other hand, α-helix 2 folds partly in the transition state and is part of the folding nucleus. It is suggested that a good strategy to improve folding speed in proteins would be to stabilize the helices that are not folded in the denatured state but are partly present in the transition state. Comparison with other proteins shows that there is no clear relationship between fold and/or size and folding speed or level of structure in the transition state of proteins.

© 1998 Academic Press

*Corresponding author

Keywords: protein folding; transition state; folding kinetics; protein engineering; procarboxypeptidases

Abbreviations used: ADA2h, activation domain of human procarboxypeptidase A2; 3D, three-dimensional; SD, standard deviation.

E-mail address of the corresponding author: [email protected]

0022-2836/98/451027-10 $30.00/0

Introduction

For several years it has been considered that protein folding intermediates were an essential part of the folding process, helping to restrain the conformational space search and guiding the protein to its folded conformation. This idea has changed dramatically in the last few years, since it has been shown that no folding intermediates accumulate in the folding process of several small proteins, e.g. the chymotrypsin inhibitor CI-2 (Jackson & Fersht, 1991), B1 and B2 domains of the IgG-binding protein (Alexander et al., 1992), cytochrome c refolded at pH 4.9 (Sosnick et al., 1992), SH3 domain of α-spectrin (Viguera et al., 1994), ADA2h (Villegas et al., 1995a), P22arc (Milla et al., 1995), the B and Y-acyl-coenzyme A binding proteins (Kragelund et al., 1995), the cold shock protein, CspB (Schindler et al., 1995) and the truncated form of the N-terminal domain of phage λ repressor (Huang & Oas, 1995). The absence of detectable folding intermediates has been roughly correlated with the size of the polypeptide chain being folded (Viguera et al., 1994; Villegas et al., 1995a).

To understand the folding process, a detailed structural and thermodynamic characterization of the different states involved, as well as of the kinetic relationship between them, is needed. In the case of proteins with folding intermediates, relevant information regarding the denatured, intermediate and folded states can be obtained by different techniques (Baldwin, 1993). However, for two-state transition proteins no information regarding putative transient intermediate states can be obtained using classical techniques. Kinetic analysis of engineered proteins (Matouschek et al., 1989; Serrano et al., 1992b; Fersht, 1995) has been used as a tool to provide energetic and structural information about the transition state between the folded and denatured states. Using this approach, the structure of the folding transition state has been elucidated to different extents for six proteins: barnase (Matouschek et al., 1989; Serrano et al., 1992b; Fersht, 1993), CI-2 (Otzen et al., 1994), CheY (López-Hernández & Serrano, 1996), SH3 domain (Viguera et al., 1996a; Martínez et al., 1998), cro repressor (Burton et al., 1997) and P22arc (Milla et al., 1995).

Concerning the nature of the folding transition of these proteins, two groups can be identified: barnase (Matouschek et al., 1990, 1992) and CheY (López-Hernández & Serrano, 1996) show a kinetic folding intermediate, whereas CI-2 (Jackson & Fersht, 1991), SH3 domain (Viguera et al., 1994), cro repressor (Burton et al., 1997) and P22arc (Milla et al., 1995) follow two-state folding kinetics. The extent of the formation of interactions in the transition state is also a differential characteristic; most of the interactions found in the native state of barnase are either totally present or absent, and only the hydrophobic core seems to be partially formed (Serrano et al., 1992b). This characteristic is not related to the presence of any intermediate, since CheY behavior is similar to that described for the small two-state proteins, i.e. there are few, if any, interactions fully present or absent. In the case of CheY, two subdomains are found, one of which seems to be fully unfolded while the other resembles the transition state of the small two-state proteins (López-Hernández & Serrano, 1996). In light of all these results a general model of protein folding has been proposed based on a nucleation/condensation model (Itzhaki et al., 1995; Fersht, 1997).

Here the structure of the transition state for the folding process of ADA2h has been analyzed by the protein engineering method (Matouschek et al., 1989; Fersht et al., 1992; Serrano et al., 1992b; Fersht, 1995). The three-dimensional (3D) structure of the complete proenzyme has recently been solved by X-ray analysis (García-Saez et al., 1997), and kinetic characterization has also been performed on the recombinant activation domain (Villegas et al., 1995a). This 81-residue globular domain folds as an open sandwich, with an antiparallel α-β topology consisting of two α-helices and a four-stranded β-sheet (Catasús et al., 1995; García-Saez et al., 1997; Figure 1(b)). The absence of disulfide bridges and cis-prolyl bonds in this domain facilitates the assignment of the energetics of the folding process. The four trans-prolyl bonds present in ADA2h contribute little to the overall stability and kinetics of the protein, and so their effects are negligible (Villegas et al., 1995a). Moreover, the ADA2h fold has recently been stabilized by rational design of its α-helices (Villegas et al., 1995b) and the combination of the two stabilized α-helices results in a thermostable domain (Viguera et al., 1996b). In addition, stabilization of α-helix 2 significantly increases the refolding rate, thus indicating that folding speed can be improved in some cases (Viguera et al., 1996b).

To analyze the structure of the transition state in the folding process of ADA2h, 20 mutations spread throughout the whole sequence have been made and the refolding and unfolding kinetics, as well as the equilibrium reaction, have been analyzed.

Figure 1. (a) Amino acid sequence of ADA2h (second line) and secondary structure assignment following the Kabsch & Sander (1983) definition (upper line). The mutated residues are labeled in bold in the sequence and the mutations made are shown below them. Mutants which could not be expressed or purified are underscored. (b) Schematic representation of the 3D arrangement of these secondary structures. α-Helices are depicted as circles and β-strands as triangles.

Results

Non-disruptive mutants (Fersht et al., 1992), spread throughout the ADA2h sequence, were designed by examining the crystal structure for side-chain/side-chain or side-chain/main-chain interactions which could report upon the integrity of its secondary structure elements (Figure 1(a)) as well as of the tertiary packing (Figure 1(b)). Following the definition of Kabsch & Sander (1983), β-strand 1 spans residues 11–16; α-helix 1, 19–31 (or 19–35, when considering the last 3₁₀ helical turn as part of it); β-strand 2, 38–41; β-strand 3, 50–54; α-helix 2, 58–68; and β-strand 4, 72–77. Graphical examination of the crystal structure showed that there is a unique hydrophobic core, formed by Leu13, Ile15, Pro17, Leu26, Leu29, Leu35, Phe37, Pro41, Ala52, Val54, Val56, Pro57, Val61, Val64, Leu68 and Ile73.

Table 1 shows the mutations produced on ADA2h, their location, and the accessibility and contacts of the deleted groups. Approximately a third of the mutations affect polar interactions; the other two thirds affect hydrophobic contacts. Some mutations affect mainly local interactions and, therefore, report upon the integrity of the particular region of the protein, e.g. Glu20→Gly (α1), Ala31→Gly (α1), Asp38→Ala (β2), Ala50→Gly (β3), Gln60→Gly (α2) and Val64→Gly (α2). Others involve long-range interactions and report upon secondary structure as well as upon the 3D structure of the protein, e.g. Val12→Ala (β1), Glu14→Ala (β1), Ile15→Val (β1), Ile23→Val (α1), Leu26→Val (α1), Phe39→Leu (β2), Lys41→Ala (β2), His51→Ala (β3), Val52→Ala (β3), Asn58→Ala (α2), Phe65→Ala (α2), Ile71→Val (α2-β4), Tyr73→Leu (β4), and Ile75→Ala (β4). The replacement of Asn19 (N-cap α1) by Ala and by Gly resulted in the production of very little protein, which could not be purified, suggesting that the mutant proteins were greatly destabilized. This was also the case for mutations Ser42→Gly (loop β2-β3) and Val54→Ala (β3). In a previous work, two multiple mutants containing mutations on the solvent-exposed face of the two α-helices which only affected local interactions (Villegas et al., 1995b; Viguera et al., 1996b; α1 replacements: Asn25→Lys, Gln28→Glu, Gln32→Lys, Glu33→Lys; α2 replacements: Gln60→Glu, Val64→Ala, Ser68→Ala, Gln69→His) were analyzed. These two mutants provide information upon the integrity of the two helices as a whole.

Table 1. Location, accessibility and contacts of the modified chemical groups of the different ADA2h mutants

| Residue | Location | Accessible area^b (%) |
|---|---|---|
| Val12→Ala | β1 | 0.0 |
| Glu14→Ala | β1 | 25.3 |
| Ile15→Val | β1 | 2.0 |
| Glu20→Gly | α1 | 61.6 |
| Ile23→Val | α1 | 10.5 |
| Leu26→Val | α1 | 0.0 |
| Ala31→Gly | α1 | 57.1 |
| Asp38→Ala | β2 | 38.0 |
| Phe39→Leu | β2 | 4.1 |
| Lys41→Ala | β2 | 45.1 |
| Ala50→Gly | β3 | 0.0 |
| His51→Ala | β3 | 22.2 |
| Val52→Ala | β3 | 1.4 |
| Asn58→Ala | α2 | 33.8 |
| Gln60→Gly | α2 | 67.0 |
| Val64→Gly | α2 | 47.2 |
| Phe65→Ala | α2 | 31.3 |
| Ile71→Val | α2-β4 | 0.0 |
| Tyr73→Leu | β4 | 17.1 |
| Ile75→Ala | β4 | 33.0 |

Backbone^a column, entries in row order: Q11 A52 L13,A50 - P17,N19,Q22,P43,T44,T45,P46,E48 Q22,I23,N25, L27,Q28 F39,W40,R53 L26,L29,E30,L37,D38,K41,S42,H51 W40,S42,E48,T49,A50 I15,K39,E50 L13,E14,I15,T49,A50 D38 L35,V57 N62 Q64,A65,V66 N25,V64,L66 E21,Q22,F65,L66,Q69 E14,K63,L66,I71,A72,S74 S74,I77,E78

Side-chain^a column, entries in row order: W40,M76,I77, V80 V16,T51,A52,H53,S76,M78 V52 K24 P17,T44 I15,P17 - W40,R53 E30,S42,P43,V52 T44,E48,T49,H51 I15,P17,F37,K39,P41 E14,K41,T49,M76 I15,L37,F39 L35,Q36,L37,P55,V57 V68 Q64,E71,S72 L29,L66,S68,Q69 Q22,N25,Q69,F65 K63,L66,E67,S74 Q11,L13

a Interactions at less than 5.5 Å.
b Calculated with the program WHATIF (Vriend, 1990) using default parameters.

Urea equilibrium denaturation

Table 2 summarizes the free energy of unfolding in water (ΔGF-U) and the equilibrium m values for the mutants. The values for the wild-type and the helix-stabilized mutants previously obtained (Villegas et al., 1995a,b; Viguera et al., 1996b) are included for comparative purposes. As expected, the stability of the mutants decreases with respect to the wild-type (especially for those located at the core of the protein), with the exception of Ile23→Val and Asp38→Ala, which are located on the outermost layer of the protein and are partly solvent-accessible. In some cases, such as mutants Lys41→Ala and Tyr73→Leu, the protein was already partly unfolded in 0 M urea. To fit the denaturation curves of those mutants, we assumed that the fluorescence of the folded state and its dependence on urea concentration were similar to those of the WT protein. The m values, which report upon the differences in the hydrophobic surface exposed to the solvent between the native and denatured states, vary by around 5% (standard deviation, SD) across all analyzed mutants. A variation of 6% has been described for barnase (Serrano et al., 1992a), CI-2 (Itzhaki et al., 1995) and CheY (López-Hernández & Serrano, 1996) and, in principle, could be considered a result of experimental uncertainty.

Table 2. Equilibrium denaturation parameters for ADA2h and its mutants

| Protein | Secondary structure | m (kcal mol⁻¹ M⁻¹) | ΔGF-U,H2O (kcal mol⁻¹) | [Urea]1/2 (M) |
|---|---|---|---|---|
| WT^a | - | 0.95 ± 0.02 | -4.08 ± 0.10 | 4.3 |
| Val12→Ala | β1 | 1.04 ± 0.07 | -2.98 ± 0.21 | 2.9 |
| Glu14→Ala | β1 | 0.93 ± 0.03 | -3.78 ± 0.11 | 4.1 |
| Ile15→Val | β1 | 0.93 ± 0.03 | -3.65 ± 0.11 | 3.9 |
| Glu20→Gly | α1 | 0.83 ± 0.03 | -2.66 ± 0.10 | 3.2 |
| Ile23→Val | α1 | 1.05 ± 0.03 | -4.50 ± 0.12 | 4.3 |
| Leu26→Val | α1 | 0.98 ± 0.05 | -3.11 ± 0.17 | 3.2 |
| Ala31→Gly | α1 | 0.82 ± 0.02 | -2.88 ± 0.09 | 3.5 |
| Asp38→Ala | β2 | 0.98 ± 0.04 | -4.44 ± 0.20 | 4.5 |
| Phe39→Leu | β2 | 0.88 ± 0.08 | -2.20 ± 0.02 | 2.5 |
| Lys41→Ala^b | β2 | 0.73 ± 0.10 | -2.10 ± 0.20 | 2.9 |
| Ala50→Gly | β3 | 0.95 ± 0.04 | -2.68 ± 0.03 | 2.8 |
| His51→Ala | β3 | 1.00 ± 0.05 | -3.47 ± 0.19 | 3.5 |
| Val52→Ala | β3 | 0.97 ± 0.10 | -3.00 ± 0.33 | 3.1 |
| Asn58→Ala | α2 | 0.98 ± 0.06 | -4.0 ± 0.20 | 4.1 |
| Gln60→Gly | α2 | 0.96 ± 0.03 | -3.59 ± 0.10 | 3.7 |
| Val64→Gly | α2 | 0.85 ± 0.03 | -2.98 ± 0.10 | 3.5 |
| Phe65→Ala | α2 | 1.10 ± 0.05 | -3.10 ± 0.16 | 2.8 |
| Ile71→Val | β4 | 1.00 ± 0.03 | -2.77 ± 0.10 | 2.8 |
| Tyr73→Leu^b | β4 | 0.90^c | -1.3 ± 0.1 | 1.4 |
| Ile75→Ala | β4 | 0.98 ± 0.05 | -2.77 ± 0.2 | 2.8 |
| α1 stabilized^d | α1 | 0.9 ± 0.03 | -5.3 ± 0.2 | 5.9 |
| α2 stabilized^d | α2 | 0.9 ± 0.02 | -5.7 ± 0.1 | 6.3 |

Some of the mutations greatly destabilize the folded state, rendering a partially unfolded native state that prevents the proper fitting of the data. Unless stated, all the fittings have been done by fixing the slopes of the dependence of the fluorescence on urea concentration of the folded (a) and unfolded (b) states to those of the wild type.
a From a previous work (Villegas et al., 1995a).
b Subject to a large error due to the low stability of this protein, which prevents visualization of the full transition.
c This slope was fixed.
d From previous works (Villegas et al., 1995b; Viguera et al., 1996b).

Kinetic analysis

Table 3 summarizes the unfolding and refolding kinetic parameters obtained for the ADA2h mutants. The variation of the unfolding slopes, m‡-F, is around 6% (SD), as in the equilibrium denaturation experiments, the exceptions being Val12→Ala and Ala50→Gly, where a decrease of this parameter is observed. This decrease in m‡-F indicates that the mutation generates a shift of the transition state towards the folded state, as the observed increase in m‡-U corroborates. In the refolding slopes, m‡-U, the variation is similar and, apart from the two mutants mentioned above, a significant increase is observed in two other cases: Phe65→Ala and Ile71→Val. However, in these two mutants no differences were obtained in m‡-F, thus indicating a possible minor increase of accessibility to the solvent in the denatured state. The equilibrium m value can be calculated by subtracting the two kinetic slopes. The m values obtained from the kinetic and equilibrium analyses are well correlated within the margins of the experimental errors (6% SD). With respect to the kinetic and equilibrium ΔGF-U,H2O values (see Materials and Methods), the differences (around 0.3 kcal mol⁻¹) are due to proline isomerization, which is not taken into account in the analysis of kinetic data.

Table 3. Thermodynamic and kinetic parameters for ADA2h and its mutants

| Protein | ln k‡-F (s⁻¹) | RTm‡-F (kcal mol⁻¹ M⁻¹) | ln k‡-U (s⁻¹) | RTm‡-U (kcal mol⁻¹ M⁻¹) | m (M⁻¹) | ΔGF-U,H2O (kcal mol⁻¹) | φ‡-U |
|---|---|---|---|---|---|---|---|
| WT^a | -0.73 ± 0.12 | 0.29 ± 0.01 | 6.63 ± 0.15 | -0.70 ± 0.02 | 0.99 ± 0.02 | -4.4 ± 0.1 | - |
| Val12→Ala | +1.49 ± 0.10 | 0.22 ± 0.01 | 5.89 ± 0.11 | -0.87 ± 0.05 | 1.09 ± 0.07 | -2.6 ± 0.2 | 0.25 ± 0.01 |
| Glu14→Ala | -0.29 ± 0.09 | 0.30 ± 0.03 | 6.46 ± 0.13 | -0.69 ± 0.03 | 0.99 ± 0.06 | -4.0 ± 0.1 | 0.00 ± 0.04 |
| Ile15→Val | -0.87 ± 0.08 | 0.30 ± 0.01 | 5.78 ± 0.04 | -0.67 ± 0.01 | 0.97 ± 0.03 | -3.9 ± 0.1 | 1.0 ± 0.04 |
| Glu20→Gly | +0.36 ± 0.08 | 0.31 ± 0.03 | 6.19 ± 0.17 | -0.68 ± 0.05 | 0.99 ± 0.08 | -3.4 ± 0.1 | 0.22 ± 0.01 |
| Ile23→Val | -1.30 ± 0.20 | 0.33 ± 0.01 | 6.00 ± 0.04 | -0.67 ± 0.01 | 1.00 ± 0.03 | -4.3 ± 0.1 | 9.3 ± 31.0 |
| Leu26→Val | +0.25 ± 0.09 | 0.32 ± 0.09 | 5.43 ± 0.07 | -0.76 ± 0.03 | 1.09 ± 0.05 | -3.1 ± 0.1 | 0.55 ± 0.02 |
| Ala31→Gly | +0.21 ± 0.06 | 0.29 ± 0.02 | 6.87 ± 0.18 | -0.72 ± 0.04 | 1.01 ± 0.06 | -3.9 ± 0.1 | 0.00 ± 0.10 |
| Asp38→Ala | -0.76 ± 0.06 | 0.28 ± 0.01 | 6.61 ± 0.05 | -0.66 ± 0.01 | 0.96 ± 0.04 | -4.4 ± 0.2 | b |
| Phe39→Leu | +2.40 ± 0.40 | 0.26 ± 0.1 | 6.25 ± 0.30 | -0.62 ± 0.12 | 0.88 ± 0.08 | -2.3 ± 0.2 | 0.11 ± 0.05 |
| Lys41→Ala^c | | | 5.8–6.0 | | 0.73 ± 0.10 | -2.1 ± 0.2 | 0 |
| Ala50→Gly | +1.82 ± 0.10 | 0.24 ± 0.01 | 6.32 ± 0.20 | -0.77 ± 0.04 | 1.01 ± 0.04 | -2.7 ± 0.1 | 0.11 ± 0.01 |
| His51→Ala | 0.00 ± 0.10 | 0.32 ± 0.01 | 6.41 ± 0.10 | -0.73 ± 0.02 | 1.05 ± 0.05 | -3.6 ± 0.1 | 0.17 ± 0.04 |
| Val52→Ala | +0.02 ± 0.10 | 0.29 ± 0.02 | 5.60 ± 0.09 | -0.71 ± 0.03 | 1.00 ± 0.10 | -3.3 ± 0.2 | 0.59 ± 0.02 |
| Asn58→Ala | -0.41 ± 0.10 | 0.30 ± 0.01 | 6.28 ± 0.04 | -0.69 ± 0.01 | 0.96 ± 0.06 | -4.0 ± 0.2 | 0.52 ± 0.10 |
| Gln60→Gly | -0.18 ± 0.08 | 0.31 ± 0.01 | 6.63 ± 0.19 | -0.74 ± 0.04 | 1.05 ± 0.05 | -4.0 ± 0.2 | 0.48 ± 0.07 |
| Val64→Gly | -0.02 ± 0.08 | 0.30 ± 0.01 | 6.64 ± 0.16 | -0.73 ± 0.04 | 1.03 ± 0.05 | -3.9 ± 0.1 | 0.43 ± 0.02 |
| Phe65→Ala | +0.49 ± 0.10 | 0.31 ± 0.02 | 5.26 ± 0.10 | -0.84 ± 0.03 | 1.15 ± 0.03 | -2.8 ± 0.1 | 0.53 ± 0.01 |
| Ile71→Val | +1.29 ± 0.07 | 0.28 ± 0.01 | 6.10 ± 0.06 | -0.85 ± 0.03 | 1.13 ± 0.03 | -2.9 ± 0.1 | 0.21 ± 0.03 |
| Tyr73→Leu^c | | | 5.8–6.0 | | 0.90 ± 0.10 | -1.3 ± 0.2 | 0 |
| Ile75→Ala | +1.52 ± 0.20 | 0.25 ± 0.02 | 5.80 ± 0.20 | -0.72 ± 0.08 | 0.97 ± 0.05 | -2.5 ± 0.1 | 0.27 ± 0.02 |
| α1 stabilized^a | -2.92 ± 0.04 | 0.30 ± 0.01 | 6.95 ± 0.09 | -0.66 ± 0.02 | 0.96 ± 0.03 | -5.8 ± 0.1 | 0.13 ± 0.02 |
| α2 stabilized^a | -2.04 ± 0.02 | 0.32 ± 0.01 | 8.00 ± 0.09 | -0.58 ± 0.01 | 0.90 ± 0.02 | -6.0 ± 0.1 | 0.49 ± 0.03 |

a Previously published and shown here for comparative purposes.
b Not determined due to the small difference between the wild-type and mutant protein.
c The extremely fast unfolding precludes fitting of the data. However, the refolding rate constant can be estimated, since we have reliable data between 0.1 and 0.2 M urea. Similarly, the estimation of the equilibrium parameters is not very reliable due to the absence of the baseline for the folded state, but the change in free energy is sufficiently large, compared to the change in the refolding rate constant, to consider that φ‡-U must be close to 0.

The m‡-F value reflects the difference in solvent accessibility between the folded and transition states (Serrano et al., 1992b). This parameter can be compared to the equilibrium m value in order to determine the position of the transition state in the reaction coordinate. For the wild-type the ratio between m‡-F and m is 0.29, indicating that the transition state buries approximately 71% of the surface that has to be buried in the folded state. Similar values have been obtained for CheY (López-Hernández & Serrano, 1996), barnase (Serrano et al., 1992a) and SH3 (Viguera et al., 1996a). In the case of the CI-2 protein the ratio between m‡-F and the equilibrium m value indicates that its transition state is less compact (60% of the buried surface in the folded state is also buried in the transition state; Otzen et al., 1994).
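The relationships used in the kinetic analysis can be checked numerically against the wild-type row of Table 3. The following is a minimal sketch, not the authors' fitting procedure: the function names are hypothetical, and the value of RT assumes 25 °C (the temperature given in Materials and Methods) with the gas constant expressed in kcal mol⁻¹ K⁻¹.

```python
# Cross-check of the wild-type kinetic parameters quoted in Table 3.
# All function names are illustrative; only the numbers come from the table.

R_KCAL = 1.987e-3        # gas constant, kcal mol^-1 K^-1
T_KELVIN = 298.15        # 25 °C (assumed from Materials and Methods)
RT = R_KCAL * T_KELVIN   # about 0.59 kcal mol^-1


def equilibrium_m(rt_m_unfold: float, rt_m_refold: float) -> float:
    """Equilibrium m value obtained by subtracting the two kinetic slopes."""
    return rt_m_unfold - rt_m_refold


def kinetic_delta_g(ln_k_unfold: float, ln_k_refold: float) -> float:
    """Free energy of the folded relative to the unfolded state (kcal mol^-1),
    from the rate constants extrapolated to water, for a two-state process."""
    return -RT * (ln_k_refold - ln_k_unfold)


def buried_fraction(rt_m_unfold: float, m_eq: float) -> float:
    """Fraction of the surface buried on folding that is already buried
    in the transition state: 1 - m(unfolding) / m(equilibrium)."""
    return 1.0 - rt_m_unfold / m_eq


# Wild-type values from Table 3.
WT = {"ln_k_unfold": -0.73, "rt_m_unfold": 0.29,
      "ln_k_refold": 6.63, "rt_m_refold": -0.70}

m_wt = equilibrium_m(WT["rt_m_unfold"], WT["rt_m_refold"])
dg_wt = kinetic_delta_g(WT["ln_k_unfold"], WT["ln_k_refold"])
frac_wt = buried_fraction(WT["rt_m_unfold"], m_wt)

print(f"m = {m_wt:.2f} kcal mol^-1 M^-1")
print(f"kinetic dG(F-U) = {dg_wt:.2f} kcal mol^-1")
print(f"surface buried in the transition state = {frac_wt:.0%}")
```

With the wild-type numbers this gives m of about 0.99, a kinetic free energy close to the tabulated -4.4 kcal mol⁻¹, and a buried fraction of about 71%, in line with the values quoted in the text.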

Protein engineering analysis

The ratio between the change in the activation energy of the refolding reaction, ΔΔG‡-U, and the destabilization in free energy induced by the mutation, ΔΔGF-U, renders the parameter φ‡-U, which is indicative of the extent to which the deleted interactions are present in the transition state of the reference protein (Fersht, 1995). A value of φ‡-U = 1 indicates that the interaction is the same in the transition state as it is in the folded state, and a value of 0 indicates the absence of the interaction in the transition state. In a two-state process, φ‡-U = 1 − φ‡-F. The φ‡-U values for the 20 single mutations destabilizing ADA2h are presented in Table 3 and depicted in colors in Figure 2.

Figure 2. Structure of ADA2h transition state. Ribbon diagram showing the φ‡-U values on a color code. The side-chain groups of those residues that have a φ‡-U value between 1.0 and 0.7 are red, those that are between 0.7 and 0.3 are yellow, and those that have a φ‡-U value less than 0.3 are shown in blue. The only case in which a non-native interaction appears in the transition state is shown in green. The φ‡-U averaged value for the multiple stabilizing mutations is shown on the α-helix ribbon with the corresponding color.

Among the interactions analyzed, the only group totally formed in the transition state is the one corresponding to the Ile15→Val mutation. There is another group of mutants with φ‡-U values close to 0: Glu14→Ala, Ala31→Gly, Phe39→Leu, Lys41→Ala, Ala50→Gly and Tyr73→Leu. A mutant which cannot be included in the φ analysis is Asp38→Ala, since it does not differ enough from the wild-type in stability and kinetics. Therefore, φ‡-U values cannot be calculated with any degree of accuracy and, in consequence, they are not included in the analysis. On the other hand, the Ile23→Val mutant, although it has a similar free energy of unfolding to the wild-type protein, exhibits significant differences in the unfolding and refolding rate constants. This indicates that the Gibbs energy of the transition state has been modified with respect to both the folded and unfolded states and, therefore, that a non-native interaction could be present at this position.

Analysis of the interactions made by the groups deleted and/or changed and their location (Table 1) in relation to the φ‡-U values offers an interesting picture of the ADA2h transition state (Figure 2). Four residues which have φ‡-U values of 0.5 or higher (Ile15, Leu26, Phe65 and Val52), and Ile23, which could have non-native interactions, are clustered in the 3D structure, suggesting that they could be a folding nucleus (Val64 is also clustered with these residues and has a φ‡-U of 0.4). The multiple mutant on the surface of α-helix 2 (residues Gln60, Val64, Ser68, and Gln69) also has a high average φ‡-U value (0.5), suggesting that part of this helix is folded in the transition state. This is corroborated by the analysis of the individual Asn58, Gln60, Val64 and Phe65 mutations. Regarding α-helix 1, the φ‡-U value for the multiple mutant (Asn25, Gln28, Gln32, and Glu33) indicates that this helix, or part of it, is unfolded in the transition state. This is in agreement with mutations Glu20 and Ala31. However, this result is in apparent contradiction with the data for Ile23 and Leu26, located in this helix, suggesting that these residues are contributing to the transition state ensemble, although in the case of Ile23 not necessarily in a native conformation (see Discussion). Those residues located at the edge strands (Phe39 and Lys41 in β-strand 2, and Ile71, Tyr73 and Ile75 in β-strand 4) have φ‡-U values of around zero, although some of those residues are relatively buried in the native fold. Regarding the central β-strands 1 and 3, one residue with a high φ‡-U value is found in each case (Ile15 and Val52).
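The definition of φ‡-U used in this analysis can be illustrated with a few rows of Table 3. This is a minimal sketch under stated assumptions: the helper name `phi_refolding` and the `min_ddg` cut-off are hypothetical, RT assumes 25 °C, and ΔΔGF-U is taken as the mutant minus the wild-type ΔGF-U,H2O so that a destabilizing mutation gives a positive value.

```python
# Sketch of the phi-value calculation from refolding rate constants and
# kinetic free energies; inputs below are taken from Table 3.

R_KCAL = 1.987e-3        # gas constant, kcal mol^-1 K^-1
T_KELVIN = 298.15        # 25 °C (assumed)
RT = R_KCAL * T_KELVIN


def phi_refolding(ln_k_refold_wt, ln_k_refold_mut, dg_wt, dg_mut, min_ddg=0.3):
    """Return phi(‡-U) = ddG(‡-U) / ddG(F-U).

    ddG(‡-U) is the change in the refolding activation free energy and
    ddG(F-U) the change in stability. min_ddg is an illustrative threshold:
    below it the ratio is too sensitive to experimental error to be useful.
    """
    ddg_ts = -RT * (ln_k_refold_mut - ln_k_refold_wt)  # > 0 if refolding slows
    ddg_eq = dg_mut - dg_wt                            # > 0 if destabilized
    if abs(ddg_eq) < min_ddg:
        raise ValueError("stability change too small for a reliable phi value")
    return ddg_ts / ddg_eq


WT_LN_K, WT_DG = 6.63, -4.4   # wild-type ln k(‡-U) and dG(F-U),H2O

phi_i15v = phi_refolding(WT_LN_K, 5.78, WT_DG, -3.9)   # Ile15 -> Val
phi_v12a = phi_refolding(WT_LN_K, 5.89, WT_DG, -2.6)   # Val12 -> Ala
phi_v52a = phi_refolding(WT_LN_K, 5.60, WT_DG, -3.3)   # Val52 -> Ala

for name, phi in [("Ile15->Val", phi_i15v), ("Val12->Ala", phi_v12a),
                  ("Val52->Ala", phi_v52a)]:
    print(f"{name}: phi = {phi:.2f}")
```

The guard on small stability changes reflects why mutants such as Asp38→Ala are excluded from the analysis: when ΔΔGF-U is comparable to the experimental error, the ratio becomes meaningless.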

Brønsted behavior

As indicated above, the two central β-strands and the two α-helices have residues with intermediate φ‡-U values. Intermediate values in the protein engineering analysis are difficult to interpret since they could correspond to partial formation of interactions or to a mixture of fully folded and unfolded states arising from parallel pathways (Fersht et al., 1994). Fersht and coworkers (Fersht et al., 1994) used simple physicochemical reasoning to show that it is possible to distinguish between these two possibilities by performing a Brønsted plot analysis. If there are simple relationships between the rate constants and the changes in interaction energies, and assuming that all mutations are testing the same degree of structure formation, the natural logarithm of the unfolding rate constant should follow the Brønsted equation:

$$\ln k_{\ddagger\text{-F}} = \ln k^{\mathrm{o}}_{\ddagger\text{-F}} + (1 - \beta_{\mathrm{F}})\,\Delta\Delta G_{\text{F-U}}/RT \qquad (1)$$

where k°‡-F is the rate constant of unfolding of the wild-type protein and βF is a constant related to the degree of structure formation. The variation in ln k‡-F versus the difference in energy between the wild-type and the different mutants is plotted in Figure 3. There is a very good linear correlation between these two variables with 1 − βF = 0.73, thus indicating that there are not discrete populations with different degrees of structure in the transition state.

Figure 3. Brønsted plot for the unfolding (open circles) and refolding (filled circles) rate constants of the 20 mutants. The continuous lines are linear regression fits with values:

$$\ln k_{\ddagger\text{-F}} = -0.78 - 0.73\,\Delta\Delta G_{\text{F-U}}/RT$$

$$\ln k_{\ddagger\text{-U}} = 6.58 + 0.27\,\Delta\Delta G_{\text{F-U}}/RT$$

Discussion

Structure of the transition state

As already shown for CI-2 (Otzen et al., 1994), CheY (López-Hernández & Serrano, 1996), and SH3 domain (Viguera et al., 1996a; Martínez et al., 1998), most of the φ‡-U values in ADA2h are fractional, and thus are consistent with the nucleation-condensation model for protein folding (Itzhaki et al., 1995; Fersht, 1995, 1997). Of the 20 mutations analyzed, only one, in β-strand 1, probes an interaction that is totally formed. Seven other mutants analyzed in this work show intermediate fractional φ‡-U values, while the remaining mutants have values lower than 0.3. In the case of Ile23, as discussed in Results, a non-native interaction could be made at this position in the transition state (Figure 2). Although fractional φ‡-U values in general do not reflect a linear extent of formation of the analyzed interaction, the mutation of larger hydrophobic side-chains to smaller ones shows an approximately linear relationship (Matouschek et al., 1989, 1990; Fersht et al., 1992). Many of the mutations in this work are of that type and, in consequence, their fractional φ‡-U values are indicative of the extent of structure formation. In fact, the good linear fit of the Brønsted plot corroborates this hypothesis.

All of the residues with high φ‡-U values are localized around Val52, which is the center of the hydrophobic core that involves α-helix 2 and part of β-strands 1 and 3, as well as some residues at the center of α-helix 1. There are some residues in β-strands 1 and 3, i.e. Val12, Glu14, Ala50 and His51, which show low φ‡-U values. Glu14 and His51 interact with each other on the solvent-exposed face of β-strands 1 and 3, while Val12 packs against Arg55. Thus, these three mutations report on the integrity of the solvent-exposed face of the central β-strands 1 and 3. The fact that they exhibit low φ‡-U values suggests that although residues in those strands participate in the folding nucleus, the secondary structure is not consolidated. Ala50, which is part of the hydrophobic core, mainly packs against residues in β-strand 2. Its low φ‡-U value is expected since the edge strands seem to be almost unfolded.

The structure of the transition state of this protein looks like a collapsed globule with some secondary structure and a weakened hydrophobic core. Looking at the polypeptide chain connectivity (Figure 1), one could have expected the folding nucleus to be generated by the packing of α-helix 1 with β-strands 2 and 3, which is not the case. Those three elements of secondary structure are contiguous in the sequence and in the 3D structure, while β-strands 1 and 4 are topologically separated by intercalated secondary structure elements. This suggests that neither the protein topology nor a simple unspecific hydrophobic collapse can account in this protein for the partial folding of different protein regions in the transition state. There must rather be some specific interactions which facilitate the conformational search and stabilize the transition state.
The complexity and relative stability of the protein substructures, which is partly related to size, could be more important for determining the structure of the transition state in proteins.
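The Brønsted analysis invoked above amounts to a linear regression of ln k against ΔΔGF-U/RT. The sketch below shows one way such a fit could be set up with an ordinary least-squares routine; it is not the authors' procedure. The helper `linear_fit` is hypothetical, only a subset of the Table 3 rows is used, RT assumes 25 °C, and the sign convention (wild-type minus mutant ΔGF-U,H2O, so that destabilizing mutations give negative x values) is an assumption chosen to match the signs of the slopes quoted in the Figure 3 legend. No particular numerical result is implied for this subset.

```python
# Least-squares Brønsted fit of ln k versus ddG(F-U)/RT (illustrative only).

R_KCAL = 1.987e-3        # gas constant, kcal mol^-1 K^-1
T_KELVIN = 298.15        # 25 °C (assumed)
RT = R_KCAL * T_KELVIN


def linear_fit(xs, ys):
    """Ordinary least-squares line through (xs, ys): returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# (dG(F-U),H2O, ln k unfolding, ln k refolding) for a subset of Table 3 rows.
ROWS = {
    "WT":         (-4.4, -0.73, 6.63),
    "Val12->Ala": (-2.6,  1.49, 5.89),
    "Ile15->Val": (-3.9, -0.87, 5.78),
    "Leu26->Val": (-3.1,  0.25, 5.43),
    "Phe39->Leu": (-2.3,  2.40, 6.25),
    "Ala50->Gly": (-2.7,  1.82, 6.32),
    "Val52->Ala": (-3.3,  0.02, 5.60),
    "Phe65->Ala": (-2.8,  0.49, 5.26),
    "Ile71->Val": (-2.9,  1.29, 6.10),
    "Ile75->Ala": (-2.5,  1.52, 5.80),
}

dg_wt = ROWS["WT"][0]
# Assumed convention: x = (dG_wt - dG_mut) / RT, negative for destabilized mutants.
x = [(dg_wt - dg) / RT for dg, _, _ in ROWS.values()]
ln_k_unfold = [ku for _, ku, _ in ROWS.values()]
ln_k_refold = [kr for _, _, kr in ROWS.values()]

unfold_slope, unfold_intercept = linear_fit(x, ln_k_unfold)
refold_slope, refold_intercept = linear_fit(x, ln_k_refold)

print(f"unfolding: ln k = {unfold_intercept:.2f} + ({unfold_slope:.2f}) x")
print(f"refolding: ln k = {refold_intercept:.2f} + ({refold_slope:.2f}) x")
```

A single straight line describing all mutants is what the text takes as evidence against discrete transition-state populations; strong scatter or clustering around two lines would point to the alternative of parallel pathways.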

Folding and secondary structure propensities

In ADA2h there is experimental evidence indicating that an isolated peptide corresponding to the sequence of α-helix 1 populates the native helical conformation to a significant extent (27%; Villegas et al., 1995b). The peptide corresponding to α-helix 2 aggregates, and no experimental conditions could be found under which it is monomeric in the isolated state (Villegas et al., 1995b). However, the helix/coil transition algorithm AGADIR1s (Muñoz & Serrano, 1994, 1997), which correctly predicts the helical content of the first α-helix, predicts a very low helical tendency for α-helix 2 (4.5%). All this evidence renders the energy diagram for the folding reaction shown in Figure 4.


Figure 4. Energy diagram for the folding reaction of ADA2h. In the folded state the α-helices are shown as black rectangles and the β-strands as arrows. In the transition state α-helix 2 is shown in dark grey and α-helix 1 (light grey) is drawn with a broken line to indicate that it does not participate in the transition state. In the case of the β-strands, numbers 2 and 4 are shown as ribbons to indicate that they are quite unfolded, and 1 and 3 as broken arrows to indicate partial unfolding. In the denatured state in water, α-helix 1 is drawn as a rectangle since there is evidence that the sequence by itself will fold as an α-helix in water (27% of population; Villegas et al., 1995b). Denatured state, the denatured protein under strong native conditions (Fersht et al., 1994); unfolded state, the protein under strong denaturing conditions.

Independent stabilization of both helices through local interactions has shown, in the case of α-helix 1, a strong deceleration of the unfolding reaction and a very small acceleration of protein refolding, while in the case of α-helix 2, refolding was accelerated fourfold (Viguera et al., 1996b). NMR analysis of a peptide corresponding to α-helix 1 has shown that the helical conformation in aqueous solution includes residues Ile23 and Leu26 (Villegas et al., 1995b), which have high Φ‡-U values. Therefore, stabilization of this helix would be expected to accelerate refolding significantly, as in α-helix 2, which is not the case. This apparent contradiction can be explained if α-helix 1 does not contribute to the transition state ensemble, as is derived from the Φ‡-U analysis, although some residues involved in tertiary contacts and contained in this helix (i.e. Ile23 and Leu26) could participate in a way that is independent of helix formation. Therefore, although this helix could be partly folded in the denatured and transition states, since it does not participate in the nucleation process, changes in its stability will not alter the folding kinetics. A similar result was observed for the second helix in barnase (Serrano et al., 1992b). On the other hand, α-helix 2, as predicted by AGADIR1s, should be a random coil in the denatured state, and the protein engineering analysis indicates that it is partly formed in the transition state and that it is part of the folding nucleus. Therefore, its stabilization should accelerate refolding, as it does.
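As a rough illustration of the size of this effect, the fourfold acceleration reported for α-helix 2 can be converted into an activation free energy change with equation (8) of Materials and Methods. The sketch assumes T = 298 K and R = 1.987 cal mol⁻¹ K⁻¹; expressing the result in kcal mol⁻¹ is an assumption made here, not a value quoted from the text.

$$\Delta\Delta G_{\ddagger-U} = -RT\ln\left(\frac{k'_{\ddagger-U}}{k_{\ddagger-U}}\right) = -(0.592\ \mathrm{kcal\ mol^{-1}})\,\ln 4 \approx -0.82\ \mathrm{kcal\ mol^{-1}}$$

That is, under these assumptions the stabilized helix lowers the refolding barrier by a little less than 1 kcal mol⁻¹.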

Conclusions

Using the protein engineering analysis, a picture of the structure of ADA2h in its transition state has been obtained, which is in agreement with the nucleation-condensation model. This state seems to be quite compact (70% of the surface of the folded state is buried), as determined from the analysis of the refolding and unfolding slopes. Such a transition state seems to have some secondary structure present, corresponding to the internal elements of secondary structure, and is independent of the protein topology. Finally, our results indicate that in order to accelerate protein folding, we should stabilize secondary structure elements folded in the transition state and unfolded in the denatured state under native conditions.

Materials and Methods

Materials

All chemicals used have been described by Villegas et al. (1995a,b) and Viguera et al. (1996b). Mutagenesis was performed by PCR (Clakson et al., 1991); the mutant fragments were cloned into a modified pTZ18U (Bruix et al., 1993) and expressed in E. coli XL-1-BLUE. Protein expression and purification were carried out as described by Villegas et al. (1995b) and Viguera et al. (1996b). The expression level and the purification yield seem to be related to the stability found for each mutant (data not shown). Twenty mutant forms spread throughout the ADA2h sequence were obtained and are presented in this work: Val12 → Ala (β1); Glu14 → Ala (β1); Ile15 → Val (β1); Glu20 → Gly (α1); Ile23 → Val (α1); Leu26 → Val (α1); Ala31 → Gly (α1); Asp38 → Ala (β2); Phe39 → Leu (β2); Lys41 → Ala (β2); Ala50 → Gly (β3); His51 → Ala (β3); Val52 → Ala (β3); Asn58 → Ala (α2); Gln60 → Gly (α2); Val64 → Gly (α2); Phe65 → Ala (α2); Ile71 → Val (α2β4); Tyr73 → Leu (β4); and Ile75 → Ala (β4). Mutations Asn19 → Ala (α1 N-cap); Asn19 → Gly (α1 N-cap); Ser42 → Gly (β2 +1); and Val54 → Ala (β3) gave a very low yield of expression, preventing their analysis.

Equilibrium denaturation

Equilibrium denaturation and kinetic experiments were carried out as described by Villegas et al. (1995a). In all cases, the experiments were performed at 25 °C in 50 mM sodium phosphate (pH 7.0) and the suitable concentration of urea. The fluorescence emission of Trp40 of ADA2h was used to monitor any changes in the environment of this residue upon unfolding of the protein. Fluorescence was measured in an Aminco Bowman Series 2 luminescence spectrometer. Excitation was at 290 nm with a 2 nm slit. Fluorescence was detected through an 8 nm slit at 315 nm. In these experiments protein concentration was kept at 2.2 mM, and temperature at 298 K. The equilibrium constant for denaturation was calculated for each denaturant concentration by using equation (2):

$$K_{F-U} = \frac{F_N - F}{F - F_U} \qquad (2)$$

where F is the fluorescence value at a certain concentration of denaturant, and FN and FU are the corresponding fluorescence values for the fully folded and unfolded states in the absence of denaturant. It has been found experimentally that the free energy of unfolding of proteins in the presence of urea is linearly related to the concentration of denaturant (Pace, 1986):

$$\Delta G_{F-U} = \Delta G_{F-U,\mathrm{H_2O}} - m[\mathrm{urea}] \qquad (3)$$
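Equations (2) and (3) together amount to a linear extrapolation, which can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gas constant is taken in kcal mol⁻¹ K⁻¹, the function names are hypothetical, and the fluorescence values are synthetic numbers generated inside the example, not data from this work.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1 (unit choice is an assumption)
T = 298.0     # temperature in K, as stated in the text


def unfolding_constant(f, f_native, f_unfolded):
    """Equation (2): K_F-U = (F_N - F) / (F - F_U)."""
    return (f_native - f) / (f - f_unfolded)


def free_energy(k_fu):
    """Delta G_F-U = -RT ln K_F-U."""
    return -R * T * math.log(k_fu)


def linear_extrapolation(urea, delta_g):
    """Least-squares line through (urea, delta_g) points.

    Returns (dG_H2O, m) in the sense of equation (3), where the slope of
    delta_g against [urea] is -m.
    """
    n = len(urea)
    mean_x = sum(urea) / n
    mean_y = sum(delta_g) / n
    sxx = sum((x - mean_x) ** 2 for x in urea)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(urea, delta_g))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, -slope


def synthetic_signal(u, dg_water, m, f_native, f_unfolded):
    """Hypothetical two-state signal with flat baselines, used only as test input."""
    k = math.exp((m * u - dg_water) / (R * T))
    return (f_native + f_unfolded * k) / (1.0 + k)


# Synthetic example: made-up parameters, points taken inside the transition region.
urea_points = [3.0, 3.5, 4.0, 4.5, 5.0]
signals = [synthetic_signal(u, 4.0, 1.0, 100.0, 20.0) for u in urea_points]
dg_values = [free_energy(unfolding_constant(f, 100.0, 20.0)) for f in signals]
dg_water_fit, m_fit = linear_extrapolation(urea_points, dg_values)
print(dg_water_fit, m_fit)
```

The sketch uses only points inside the transition region, because equation (2) becomes ill-conditioned when F approaches either baseline.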

The values of m and ΔGF-U,H2O, the apparent free energy of unfolding in the absence of denaturant, may be calculated from equation (3), since ΔGF-U = −RT ln KF-U. The proportionality constant m reflects the co-operativity of the transition and is believed to be related to the difference in hydrophobic surface exposed to the solvent between the native and the denatured states. Taking all of these dependencies into account, the fluorescence data can be fitted to the following equation:

$$F = \frac{(F_N + a[\mathrm{urea}]) + (F_U + b[\mathrm{urea}])\,\exp\{(m[\mathrm{urea}] - \Delta G_{F-U,\mathrm{H_2O}})/RT\}}{1 + \exp\{(m[\mathrm{urea}] - \Delta G_{F-U,\mathrm{H_2O}})/RT\}} \qquad (4)$$

in which the dependence of the intrinsic fluorescence upon denaturant concentration, in both the native and the denatured states, is taken into account by the terms a[urea] and b[urea], respectively (linear approximation).

Kinetic experiments

Kinetics were followed by fluorescence in a Bio-Logic stopped-flow machine (SFM-3). The average dead time of the experiments was 50 ms due to artifacts arising from mixing water and high urea concentrations. A cell of 150 ml and an aging loop of 10 ml were used. The unfolding process was promoted by dilution of the native ADA2h in a 50 mM sodium phosphate buffer (pH 7.0) with the appropriate ratio of the same buffer containing different concentrations of urea. For the refolding reaction, the unfolded domain in the 50 mM sodium phosphate buffer (pH 7.0), containing 9.5 M urea, was mixed with an excess of the same buffer without urea to give several final urea concentrations. Fluorescence was measured through a 320 nm cutoff filter (excitation at 290 nm). The cell chamber and the syringes were kept at 298 K. The logarithm of the rate constants for the unfolding (ln k‡-F) and the refolding reactions (ln k‡-U) versus urea concentration can be fitted to a linear equation, outside the transition region. The rate constants in the transition region between the linear equations concerning both the unfolding and the refolding reactions were indistinguishable. The complete reaction can be fitted to the following equation (Jackson & Fersht, 1991):

$$\ln k = \ln\left[k_{\ddagger-U,\mathrm{H_2O}}\exp(-m_{\ddagger-U}[\mathrm{urea}]) + k_{\ddagger-F,\mathrm{H_2O}}\exp(-m_{\ddagger-F}[\mathrm{urea}])\right] \qquad (5)$$

where k is the rate constant at a given concentration of denaturant, k‡-U,H2O is the rate constant of refolding in water, k‡-F,H2O is the rate constant of unfolding in water, and m‡-U and m‡-F are the slopes of the refolding and unfolding reactions, respectively. The estimate of ΔGF-U,H2O from the kinetic data can be obtained by using:

$$\Delta G_{F-U,\mathrm{H_2O}} = -RT\ln\left(k_{\ddagger-U,\mathrm{H_2O}}/k_{\ddagger-F,\mathrm{H_2O}}\right) \qquad (6)$$

and the slope m can be obtained by:

$$m = RT\,(m_{\ddagger-F} - m_{\ddagger-U}) \qquad (7)$$
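The kinetic relations in equations (5) to (7) can be checked against each other numerically. The following is a minimal sketch, assuming R in kcal mol⁻¹ K⁻¹ and using invented rate constants and slopes; the function names are hypothetical, and the sign conventions simply follow the equations as printed (so the unfolding slope m‡-F is negative when unfolding speeds up with urea).

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1 (unit choice is an assumption)
T = 298.0     # temperature in K, as stated in the text


def ln_k_observed(urea, k_refold_w, m_refold, k_unfold_w, m_unfold):
    """Equation (5): log of the sum of the refolding and unfolding rate terms."""
    refold = k_refold_w * math.exp(-m_refold * urea)
    unfold = k_unfold_w * math.exp(-m_unfold * urea)
    return math.log(refold + unfold)


def dg_from_kinetics(k_refold, k_unfold):
    """Equation (6) as printed: -RT ln(k_refold / k_unfold)."""
    return -R * T * math.log(k_refold / k_unfold)


def m_from_kinetics(m_refold, m_unfold):
    """Equation (7): m = RT (m_unfold - m_refold)."""
    return R * T * (m_unfold - m_refold)


# Invented parameters for illustration only (not measurements from this work).
K_REFOLD_W, M_REFOLD = 100.0, 2.0   # s^-1 and M^-1
K_UNFOLD_W, M_UNFOLD = 0.1, -0.5    # s^-1 and M^-1

dg_water = dg_from_kinetics(K_REFOLD_W, K_UNFOLD_W)
m_value = m_from_kinetics(M_REFOLD, M_UNFOLD)

# Consistency with equation (3): the free energy at 3 M urea computed from the
# two individual rates should equal dg_water - m_value * 3.
urea = 3.0
k_refold_3 = K_REFOLD_W * math.exp(-M_REFOLD * urea)
k_unfold_3 = K_UNFOLD_W * math.exp(-M_UNFOLD * urea)
print(dg_from_kinetics(k_refold_3, k_unfold_3), dg_water - m_value * urea)
print(ln_k_observed(urea, K_REFOLD_W, M_REFOLD, K_UNFOLD_W, M_UNFOLD))
```

The two printed free energies agree, which is the reason equation (7) takes the difference of the two kinetic slopes multiplied by RT.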

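In the analysis of the energetics, the free energy changes of the transition state (ΔΔG‡-U, ΔΔG‡-F) upon mutation were calculated using the following equations:

$$\Delta\Delta G_{\ddagger-U} = -RT\ln\left(k'_{\ddagger-U}/k_{\ddagger-U}\right) \qquad (8)$$

$$\Delta\Delta G_{\ddagger-F} = -RT\ln\left(k'_{\ddagger-F}/k_{\ddagger-F}\right) \qquad (9)$$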

where k‡-U and k′‡-U are the rate constants for refolding for the wild-type and mutant, respectively, and k‡-F and k′‡-F are the corresponding rate constants for unfolding. The Φ‡-U values, which report the degree of formation in the transition state of the interactions broken upon mutation, were calculated by using the equation:

$$\Phi_{\ddagger-U} = \frac{\Delta\Delta G_{\ddagger-U}}{\Delta\Delta G_{F-U}} \qquad (10)$$

where ΔΔGF-U = ΔΔG‡-U − ΔΔG‡-F. The protein engineering method has been extensively reviewed by Fersht and co-workers (Matouschek et al., 1989; Fersht et al., 1992; Serrano et al., 1992b; Fersht, 1995, 1997).
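Equations (8) to (10) can be combined into a short calculation. The sketch below is illustrative only: the function names are hypothetical, R is assumed to be in kcal mol⁻¹ K⁻¹, and the rate constants are invented limiting cases rather than values measured for any ADA2h mutant.

```python
import math

R = 1.987e-3  # gas constant in kcal mol^-1 K^-1 (unit choice is an assumption)
T = 298.0     # temperature in K, as stated in the text


def ddg_activation(k_mutant, k_wild_type):
    """Equations (8) and (9): -RT ln(k' / k), with k' the mutant rate constant."""
    return -R * T * math.log(k_mutant / k_wild_type)


def phi_value(k_refold_wt, k_refold_mut, k_unfold_wt, k_unfold_mut):
    """Equation (10): ratio of the refolding activation term to ddG_F-U."""
    ddg_ts_u = ddg_activation(k_refold_mut, k_refold_wt)  # equation (8)
    ddg_ts_f = ddg_activation(k_unfold_mut, k_unfold_wt)  # equation (9)
    ddg_f_u = ddg_ts_u - ddg_ts_f
    if ddg_f_u == 0.0:
        raise ValueError("mutation does not change stability; phi is undefined")
    return ddg_ts_u / ddg_f_u


# Invented limiting cases (not data from this work):
# only refolding is slowed, so the broken interaction is fully formed in the transition state
print(phi_value(100.0, 10.0, 0.1, 0.1))
# only unfolding is accelerated, so the interaction is absent in the transition state
print(phi_value(100.0, 100.0, 0.1, 1.0))
# both rates change by the same factor in opposite directions
print(phi_value(100.0, 50.0, 0.1, 0.2))
```

Guarding against a zero denominator matters in practice, since mutations that barely change stability give ratios dominated by experimental error.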

Acknowledgments

V.V. has been a short-term EMBO and FEBS fellow and acknowledges the IBF biocomputing service for technical assistance. J.C.M. is supported by an EU-TMR postdoctoral fellowship. F.X.A. gratefully acknowledges financial support from the CICYT (Spain; Grant BIO950848) and from the Centre de Referència en Biotecnologia (Generalitat de Catalunya). The authors also thank I. García-Saez et al. for providing the PCPA2 crystal structure.

References

Alexander, P., Orban, J. & Bryan, P. (1992). Kinetic analysis of the folding and unfolding of the 56 amino acid IgG-binding domain of Streptococcal protein G. Biochemistry, 31, 7243-7248.
Baldwin, R. L. (1993). Pulse H/D exchange studies of folding intermediates. Curr. Opin. Struct. Biol. 3, 84-91.
Bruix, M., Pascual, J., Santoro, J., Prieto, J., Serrano, L. & Rico, M. (1993). 1H- and 15N-NMR assignment and solution structure of the chemotactic Escherichia coli Che Y protein. Eur. J. Biochem. 215, 573-585.
Burton, R. E., Huang, G. S., Daugherty, M. A., Calderone, T. & Oas, T. G. (1997). The energy landscape of a fast-folding protein mapped by Ala → Gly substitutions. Nature Struct. Biol. 4, 305-310.
Catasús, Ll., Vendrell, J., Avilés, F. X., Carreira, S., Puigserver, A. & Billeter, M. (1995). The sequence and conformation of human pancreatic procarboxypeptidase A2. cDNA cloning, sequence analysis, and 3D model. J. Biol. Chem. 270, 6651-6657.
Clakson, T., Gussow, D. & Jones, P. T. (1991). In PCR: A Practical Approach (McPerson, M. J., Quirke, P. & Taylor, G. R., eds), pp. 187-214, IRL Press, Oxford.
Fersht, A. R. (1993). Protein folding and stability: the pathway folding of barnase. FEBS Letters, 325, 5-16.
Fersht, A. R. (1995). Characterizing transition states in protein folding: an essential step in the puzzle. Curr. Opin. Struct. Biol. 5, 79-84.
Fersht, A. R. (1997). Nucleation mechanisms in protein folding. Curr. Opin. Struct. Biol. 7, 3-9.
Fersht, A. R., Matouschek, A. & Serrano, L. (1992). The folding of an enzyme. I. Theory of protein engineering analysis of stability and pathway of protein folding. J. Mol. Biol. 224, 771-782.
Fersht, A. R., Itzhaki, L. S., elMasry, N., Matthews, J. M. & Otzen, D. E. (1994). Single versus parallel pathways of protein folding and fractional formation of structure in the transition state. Proc. Natl. Acad. Sci. USA, 91, 10426-10429.
García-Saez, I., Reverter, D., Vendrell, J., Avilés, F. X. & Coll, M. (1997). The three-dimensional structure of human procarboxypeptidase A2. Deciphering the basis of the inhibition, activation and intrinsic activity of the zymogen. EMBO J. 16, 6906-6913.
Huang, G. S. & Oas, T. G. (1995). Submillisecond folding of monomeric λ repressor. Proc. Natl. Acad. Sci. USA, 92, 6878-6882.
Itzhaki, L. S., Otzen, D. E. & Fersht, A. R. (1995). The structure of the transition state for folding of chymotrypsin inhibitor 2 analysed by protein engineering methods: evidence for a nucleation condensation mechanism for protein folding. J. Mol. Biol. 254, 260-288.
Jackson, S. E. & Fersht, A. R. (1991). Folding of chymotrypsin inhibitor 2. I. Evidence for a two-state transition. Biochemistry, 30, 10428-10435.
Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577-2637.
Kragelund, B. B., Højrup, P., Jensen, M. S., Schjerling, C. K., Joul, E., Knudsen, J. & Poulsen, F. M. (1995). Fast and one-step folding of closely and distantly related homologous proteins of a four-helix bundle family. J. Mol. Biol. 256, 187-200.
López-Hernández, E. & Serrano, L. (1996). Structure of the transition state for folding of the 129 aa protein CheY resembles that of the smaller protein CI-2. Fold. Des. 1, 43-55.
Martinez, J. C., Pisabarro, M. T. & Serrano, L. (1998). Obligatory steps in protein folding and the conformational diversity of the transition state. Nature Struct. Biol. 5, 721-729.
Matouschek, A., Kellis, J. T., Jr, Serrano, L. & Fersht, A. R. (1989). Mapping the transition state and pathway of protein folding by protein engineering. Nature, 340, 122-126.
Matouschek, A., Kellis, J. T., Jr, Serrano, L., Bycroft, M. & Fersht, A. R. (1990). Transient folding intermediates characterized by protein engineering. Nature, 346, 440-445.
Matouschek, A., Serrano, L. & Fersht, A. R. (1992). The folding of an enzyme IV. Structure of an intermediate in the refolding of barnase analyzed by a protein engineering procedure. J. Mol. Biol. 224, 819-835.
Matouschek, A., Otzen, D. E., Itzhaki, L. S., Jackson, S. E. & Fersht, A. R. (1995). Movement of the transition state in protein folding. Biochemistry, 34, 13656-13662.
Milla, M. E., Brown, B. M., Waldburger, C. D. & Sauer, R. T. (1995). P22 Arc repressor: transition state properties inferred from mutational effects on the rates of protein unfolding and refolding. Biochemistry, 34, 13914-13919.
Muñoz, V. & Serrano, L. (1994). Elucidating the folding problem of helical peptides using empirical parameters. Nature Struct. Biol. 1, 399-409.
Muñoz, V. & Serrano, L. (1997). Development of the multiple sequence approximation within the agadir model of α-helix formation. Comparison with the zimm-bragg and lifson-roig formalisms. Biopolymers, 41, 495-509.
Otzen, D. E., Itzhaki, L. S., elMasry, N. F., Jackson, S. E. & Fersht, A. R. (1994). Structure of the transition state for the folding/unfolding of the barley chymotrypsin inhibitor 2 and its implications for mechanisms of protein folding. Proc. Natl. Acad. Sci. USA, 91, 10422-10425.
Pace, C. N. (1986). Determination and analysis of urea and guanidine hydrochloride denaturation curves. Methods Enzymol. 131, 226-280.
Schindler, T., Herrler, M., Marahiel, M. A. & Schmid, F. X. (1995). Extremely rapid protein folding in the absence of intermediates. Nature Struct. Biol. 2, 663-673.
Serrano, L., Kellis, J. T., Cann, P., Matouschek, A. & Fersht, A. R. (1992a). The folding of an enzyme II. Structure of barnase and the contribution of different interactions to protein stability. J. Mol. Biol. 224, 783-804.
Serrano, L., Matouschek, A. & Fersht, A. R. (1992b). The folding of an enzyme III. Structure for the transition state for unfolding of barnase analyzed by a protein engineering method. J. Mol. Biol. 224, 805-818.
Smith, C. K., Bu, Z., Anderson, K. S., Sturtevant, J. M., Engelman, D. M. & Regan, L. (1996). Surface point mutations that significantly alter the structure and stability of a protein's denatured state. Prot. Sci. 5, 2009-2019.
Sosnick, T. R., Mayne, L., Hiller, R. & Englander, S. W. (1992). The barriers in protein folding. Nature Struct. Biol. 1, 149-156.
Viguera, A. R., Martínez, J. C., Filimonov, V. V., Mateo, P. L. & Serrano, L. (1994). Thermodynamic and kinetic analysis of the SH3 domain of spectrin shows a two-state folding transition. Biochemistry, 33, 2142-2150.
Viguera, A. R., Wilmanns, M. & Serrano, L. (1996a). Different folding transition states may result in the same native structure. Nature Struct. Biol. 3, 874-880.
Viguera, A. R., Villegas, V., Avilés, F. X. & Serrano, L. (1996b). Native-like favourable helical local interactions can accelerate protein folding. Fold. Des. 2, 23-33.
Villegas, V., Azuaga, A., Catasús, Ll., Reverter, D., Mateo, P. L., Avilés, F. X. & Serrano, L. (1995a). Evidence for a two-state folding transition in the folding process of the activation domain of human procarboxypeptidase A2. Biochemistry, 34, 15105-15110.
Villegas, V., Viguera, A. R., Avilés, F. X. & Serrano, L. (1995b). Stabilization of proteins by rational design of α-helix stability using helix/coil transition theory. Fold. Des. 1, 29-34.
Vriend, G. (1990). WHATIF: a molecular modeling and drug design program. J. Mol. Graphics, 8, 52-56.

Edited by A. R. Fersht (Received 20 July 1998; accepted 20 August 1998)