doi:10.1016/S0022-2836(02)01425-0
J. Mol. Biol. (2003) 326, 899–909
Solution Structure of Switch Arc, a Mutant with 3₁₀ Helices Replacing a Wild-type β-Ribbon
Matthew H. J. Cordes1*, Nathan P. Walsh1, C. James McKnight2 and Robert T. Sauer1
1 Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
2 Department of Physiology and Biophysics, Boston University School of Medicine, Boston, MA 02118, USA
Adjacent N11L and L12N mutations in the antiparallel β-ribbon of Arc repressor result in dramatic changes in local structure in which each β-strand is replaced by a right-handed helix. The full solution structure of this “switch” Arc mutant shows that irregular 3₁₀ helices compose the new secondary structure. This structural metamorphosis conserves the number of main-chain and side-chain to main-chain hydrogen bonds and the number of fully buried core residues. Apart from a slight widening of the interhelical angle between α-helices A and B and changes in side-chain conformation of a few core residues in Arc, no large-scale structural adjustments in the remainder of the protein are necessary to accommodate the ribbon-to-helix change. Nevertheless, some changes in hydrogen-exchange rates are observed, even in regions that have very similar structures in the two proteins. The surface of switch Arc is packed poorly compared to wild-type, leading to ~1000 Å² of additional solvent-accessible surface area, and the N termini of the 3₁₀ helices make unfavorable head-to-head electrostatic interactions. These structural features account for the positive m value and salt dependence of the ribbon-to-helix transition in Arc-N11L, a variant that can adopt either the mutant or wild-type structures. The tertiary fold is capped in different ways in switch and wild-type Arc, showing how stepwise evolutionary transformations can arise through small changes in amino acid sequence. © 2003 Elsevier Science Ltd. All rights reserved
*Corresponding author
Keywords: protein folding; protein evolution; binary pattern; core packing; NMR structure
Present address: M. H. J. Cordes, Department of Biochemistry, University of Arizona, Tucson, AZ, USA.
Abbreviations used: NOE, nuclear Overhauser enhancement; NOESY, NOE spectroscopy; COSY, correlated spectroscopy; TOCSY, total COSY; DQF, double quantum filtered; HSQC, heteronuclear single quantum coherence; SASA, solvent-accessible surface area; TMAO, trimethylamine oxide; TMSP, 3-(trimethylsilyl)-propionic acid.
E-mail address of the corresponding author: [email protected]
Introduction
Nature often borrows from and builds upon its past successes. For example, many current protein folds undoubtedly arose via sequence and structural modifications of pre-existing proteins. Recently, we described a mutation-induced transformation of the fold of the Arc repressor dimer that involved a structural transition from an antiparallel β-ribbon to a pair of helices.1 Simple interchange of the wild-type Asn11 and Leu12 side-chains caused this dramatic change in configuration. The resulting mutant protein, called “switch” Arc, showed cooperative thermal unfolding and was roughly as stable as wild-type Arc. We also discovered that an Arc variant bearing just the N11L substitution could adopt either the wild-type or switch Arc folds and converted rapidly between these two structures.2 Here, we report the full solution structure of switch Arc. This structure reinforces the conclusion, originally based on a low-resolution NMR model,1 that switch Arc has a well-packed, albeit alternatively arranged, hydrophobic core. In addition, the solution structure shows that the N-terminal region of Arc, which undergoes the radical change in secondary structure, makes the same number of hydrogen bonds in the wild-type and mutant proteins. In chemical denaturation
0022-2836/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved
Solution Structure of Switch Arc
Figure 1. (a) Progression of Arc variants leading to switch Arc, showing solvent-exposed and core mutations. (b) Far ultraviolet circular dichroism (CD) spectra of each variant, at 50 μM protein in a 1 mm path-length cell, in 50 mM Tris (pH 7.5), 250 mM KCl at 15 °C.
experiments, switch Arc is actually slightly more stable than wild-type Arc. Hence, switch Arc appears to represent an equally successful solution to the structural problem of capping the Arc tertiary fold. This structural adaptability demonstrates the feasibility of creating new protein folds via relatively minor genetic changes during biological evolution and suggests that straightforward design principles can be used to re-engineer existing folds. Although switch Arc is folded stably, our studies show that it differs in a variety of subtle ways from wild-type Arc. Several of these differences, including apparent electrostatic repulsion, poor packing of surface side-chains, and more rapid hydrogen exchange could be interpreted as design “flaws” in switch Arc. In Arc-N11L, which can adopt either the new or ancestral folds, some of these flaws result in a preference for the wild-type fold under solvent conditions that favor surface burial or minimize electrostatic repulsion. Because the switch Arc fold arose by mutational serendipity without deliberate optimization, it makes sense that there seems to be room for improvement by further iterations in design or evolution.
Results and Discussion
Mutagenic path to switch Arc
As a follow-up to a study of the effects of hydrophobic surface mutations in the α-helices of Arc repressor,3 we substituted leucine for three surface polar residues (Gln9, Asn11 and Arg13) in both strands of Arc’s β-ribbon to generate the S-LLL mutant (Figure 1(a)). Purified S-LLL protein, which has six polar-to-hydrophobic mutations per dimer, differed from wild-type Arc in its far ultraviolet CD spectrum (Figure 1(b)), suggesting the possibility of an alternative fold. S-LLL showed thermal melting behavior that was partly irreversible, however, and became notably biphasic at high concentrations. Analytical ultracentrifugation of a closely related mutant (S-VLV), which also displayed biphasic behavior, showed that heat-induced aggregation, occurring at progressively lower temperatures as the concentration was raised, was the cause of the biphasic melts. For this reason, we considered S-LLL an unlikely candidate for high-resolution structural studies.
Arc S-LLL contains nine consecutive hydrophobic residues, Met7-Pro8-Leu9-Phe10-Leu11-Leu12-Leu13-Trp14-Pro15, where Leu9, Leu11 and Leu13 are the mutated positions. In a search for a variant that preserved the spectral changes but was less aggregation-prone, we sought to reduce the hydrophobicity of this region in a sensible fashion. To do this, we first modeled a potential alternative fold for S-LLL. The most obvious possibility was a structure in which a pair of α-helices replaced the wild-type β-ribbon. Leu12, which is part of the hydrophobic core in wild-type Arc (Figure 1(a)), occupied a surface position in this model. We reasoned, therefore, that replacing Leu12 with a polar residue ought to have little effect on the putative helical fold, should improve solubility, and might destabilize any wild-type structure in competition with the helical fold. We constructed the S-LLQL variant of Arc by adding the L12Q mutation to the S-LLL sequence. S-LLQL had a far-UV CD spectrum similar to that of S-LLL (Figure 1(b)) but did not aggregate and could be concentrated to millimolar concentrations. Two- and three-dimensional NMR experiments (see below) indicated the presence of a type I β-turn involving residues 12 and 13 of S-LLQL, and suggested that this turn constituted the C terminus of a helix extending from residues 9–13. Alterations in the solvent exposure of certain side-chains accompanied the change in backbone conformation. Most dramatically, the Leu11 side-chain was buried in the hydrophobic core of S-LLQL while the Gln12 side-chain was exposed to solvent. The corresponding residues in wild-type Arc, Asn11 and Leu12, show the opposite accessibility patterns. The side-chains of the remaining mutated residues in S-LLQL, Leu9 and Leu13, appeared to be solvent-exposed, although this was somewhat difficult to ascertain for Leu9 due to the poor quality of NMR models in its vicinity.
In the final design iteration, Leu9 and Leu13 were mutated back to the wild-type residues, Gln9 and Arg13, and Gln12 was also changed to Asn12. We named this mutant switch Arc, because it was identical with wild-type Arc except for interchange of the identities of residues 11 and 12 (this parsimony explains the rationale for the Q12N change). Switch Arc had essentially the same CD spectrum as Arc-S-LLQL (Figure 1(b)) but was more tractable to structure determination by NMR. We previously reported a model of switch Arc based on NOE data for residues 7–14.1
NMR studies of S-LLQL
Spin-system assignments for S-LLQL were derived from analysis of two-dimensional DQF-COSY, TOCSY and NOESY experiments and three-dimensional HSQC-TOCSY and HSQC-NOESY experiments at 30 °C on unlabelled and uniformly 15N-labelled samples of S-LLQL bearing the C-terminal affinity tag H6KNQHE. With the exception of the mutated region (residues 8–14),
sequential assignments were straightforward to obtain from 2D NOESY and 3D HSQC-NOESY spectra using the wild-type Arc assignments4 and the wild-type structure as a guide. For the mutated region (residues 8–14), the δ resonances of Pro8 could be identified by a strong NOE to the α proton of Met7, and the α hydrogen atom of Trp14 could be identified by a strong NOE to the δ resonances of Pro15. It was possible to walk from the spin system of Pro8 as far as Phe10 through dαN(i,i+1) and dNα(i,i) NOEs, and likewise from Trp14 to the α proton of Gln12. However, the amide resonances of Leu11 and Gln12 were not visible in 15N–1H HSQC spectra or in 3D HSQC-NOESY and HSQC-TOCSY experiments at 30 °C. In the structure of switch Arc solved subsequently, the amides of residues 11 and 12 are involved in the two N-terminal hydrogen bonds of a helix. Their absence from the S-LLQL spectra at 30 °C could be caused by line broadening arising from fraying of this helix on the NMR time scale. At 45 °C, both resonances were visible, and it was possible to walk continuously from Trp14 to Pro8 using intra-residue and sequential dNN and dαN connectivities. NOEs and JHNα values for S-LLQL were generated from analysis of 2D and 3D NOESY and HNHA experiments, respectively, at 45 °C. Strong dNN(i,i+1) NOEs between Gln12 and Leu13 and between Leu13 and Trp14, a dαN(i,i+2) NOE between Gln12 and Trp14, and small and large values of JHNα for Gln12 and Leu13, respectively, strongly suggested the presence of a type I β-turn centered around residues 12 and 13, and involving an i,i+3 hydrogen bond between the carbonyl oxygen atom of Leu11 and the amide proton of Trp14. Explicit structure calculations confirmed the presence of this structure.
Solution structure of switch Arc
We assigned the NMR spectrum of switch Arc at 30 °C based on the S-LLQL experiments. In contrast to S-LLQL, amide resonances for Leu11 and Asn12 in switch Arc were clearly visible at 30 °C. This difference probably reflects changes in the dynamics of the mutant helix. Additional assignments were made using 2D 1H–13C HSQC experiments performed on a 10% 13C/uniformly 15N-labelled switch Arc sample. These assignments included the methionine ε-CH3 resonances, stereospecific values for leucine and valine methyl groups,5 the γ resonances of Glu27, Glu28 and Lys47, and verification of the Phe10 ζ proton, which has a 1H chemical shift nearly degenerate with that of the ε proton. A small number of corrections to previous β proton assignments,1 including those of Glu36 and Arg40, were made using an HNHB spectrum. The β protons of Pro8 and Pro15 were stereospecifically assigned using the fact that the dαβ(3) NOE is stronger than the dαβ(2) NOE for all proline conformations.6
Table 1. Structural statistics for switch Arc solution structure

A. Distance restraintsᵃ
   Total                                        816
   Intra-residue                                162
   Sequential (|i − j| = 1)                     247
   Medium-range (|i − j| ≤ 4)                   254
   Long-range (|i − j| > 4)                      75
   Inter-molecularᵇ                              72
   Hydrogen bond distancesᶜ                       6

B. Dihedral restraintsᵃ
   φ (C′(i−1)–N(i)–Cα(i)–C′(i))                  31
   χ1 (N(i)–Cα(i)–Cβ(i)–Cγ(i))                    3

C. Stereospecific assignments
   β Methylene group                              2
   γ Valine methyl groups                         5
   δ Leucine methyl groups                        2

D. Average RMS deviations from
   Distance restraints (Å)                      0.040 ± 0.0021
   Dihedral angle restraints (deg.)             0.092 ± 0.0893
   Idealized covalent geometry
     Bonds (Å)                                  0.003 ± 0.0002
     Angles (deg.)                              0.509 ± 0.0056
     Impropers (deg.)                           0.383 ± 0.0098

E. Energiesᵈ
   Total                                        203 ± 5.8        (216)
   NOE                                          33.2 ± 3.6       (33.0)
   Dihedral                                     0.0668 ± 0.084   (0.02)
   Non-crystallographic symmetry                20.2 ± 1.04      (8.6)
   Bond                                         12.5 ± 0.65      (7.13)
   Angle                                        126 ± 2.81       (134)
   van der Waals                                10.8 ± 3.7       (15.2)

F. Ramachandran plotᵈ,ᵉ                         Number of residues    %
   Most favorable region                        1026 (80)             87.7 (88.9)
   Additionally allowed region                  132 (10)              10.6 (11.1)
   Generously allowed region                    8 (0)                 0.7 (0)
   Disallowed region                            4 (0)                 0.3 (0)

G. Average RMS deviations of atomic coordinates between 13 structures
                                                All       Residues 7–51 and 7′–51′
   Backbone heavy atoms (Å)                     2.026     0.56
   All heavy atoms (Å)                          2.361     1.32

ᵃ Restraints per monomer.
ᵇ Includes only unambiguous inter-strand restraints, as judged from the average structure.
ᶜ Three hydrogen bonds, each with two distance restraints, as described in Materials and Methods.
ᵈ Values in parentheses are for the minimized average of the 13 lowest energy accepted structures.
ᵉ Region of the Ramachandran plot as defined by PROCHECK-NMR.
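The restraint counts in Table 1 can be cross-checked against the totals quoted in the text (810 NOE restraints, 147 long-range or inter-strand, 847 restraints per monomer). The following is a minimal sketch of that tally; the variable names are illustrative, and the reading that the 847 total counts each of the three hydrogen bonds once (rather than its two distance restraints) is an assumption that happens to reproduce the quoted figure.

```python
# Per-monomer restraint counts as listed in Table 1 (names are illustrative).
noe_restraints = {
    "intra_residue": 162,
    "sequential": 247,
    "medium_range": 254,
    "long_range": 75,
    "inter_molecular": 72,
}
hbond_distance_restraints = 6  # three hydrogen bonds, two distance restraints each
hbond_count = 3
phi_restraints = 31
chi1_restraints = 3

noe_total = sum(noe_restraints.values())
distance_total = noe_total + hbond_distance_restraints
long_or_inter = noe_restraints["long_range"] + noe_restraints["inter_molecular"]

# Assumption: the overall total counts each hydrogen bond once.
restraints_per_monomer = noe_total + phi_restraints + chi1_restraints + hbond_count

print(f"NOE restraints:            {noe_total}")
print(f"All distance restraints:   {distance_total}")
print(f"Long-range + inter-strand: {long_or_inter}")
print(f"Restraints per monomer:    {restraints_per_monomer}")
```

Under these assumptions the tally reproduces the Table 1 total of 816 distance restraints as 810 NOE restraints plus six hydrogen-bond distances.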
Generation of restraints and structure calculations
A structural model for residues 7–14 of switch Arc was initially generated using NOE information for these residues plus simulated restraints for the remainder of the protein.1 The full solution structure incorporated 810 NOE restraints per monomer, of which 147 represented long-range or inter-strand distances (Table 1). A total of 31 φ angle and 3 χ1 angle restraints were derived from HNHA and HNHB spectra, respectively. Three α-helical hydrogen-bond restraints were assigned using observation of protection in a hydrogen-exchange experiment combined with the presence of dαN(i−3, i) and dαN(i−4, i) NOEs in 3D NOESY spectra. In all, 847 restraints per monomer were generated for residues 5–53. No restraints for residues 1–4 or the C-terminal H6KNQHE tag,
which are highly disordered, were used in the structure calculations. Restraints from a 50 ms 2D NOESY were classified as strong (1.8–2.8 Å), medium (1.8–3.3 Å), weak (1.8–3.8 Å), or very weak (1.8–4.3 Å), according to cross-peak intensities. Restraints from a 150 ms 3D NOESY were similarly classified as strong (1.8–3.1 Å), medium (1.8–4.2 Å), medium-weak (1.8–5.3 Å), or weak (1.8–6.4 Å), with the looser restraints reflecting the longer mixing time of this experiment. A total of 28 structures were generated using XPLOR 3.1 with a simulated annealing protocol designed to calculate symmetric multimer structures.7 Of these structures, 13 were accepted with no angle violations > 5° and no more than two NOE violations > 0.35 Å (Figure 2(a)). Table 1 lists the statistics for these structures. For the ensemble of 13 structures, the average pairwise r.m.s.d. value of 0.56 Å (backbone
Figure 2. (a) Superposition of 13 ensemble structures of switch Arc, showing backbone for residues 7–51 and 7′–51′. Side-chains of Phe10, Leu11 and Trp14 from each monomer are also shown to illustrate the well-defined hydrophobic core of switch Arc. (b) MOLSCRIPT28 ribbon diagram of minimized average solution structure of switch Arc, residues 8–48, with core side-chains (Phe10, Leu11 and Trp14) from each monomer shown in blue, and surface side-chains (Gln9, Asn12 and Arg13) from each monomer shown in cyan.
atoms) and 1.32 Å (heavy atoms), computed for residues 7–51 from both monomers, is at least of comparable quality to the statistics for the solution structures of wild-type Arc8 and Arc-MYL.9
Secondary structure of switch Arc
The minimized average solution structure of switch Arc contained a 3₁₀ helix in place of each strand of the wild-type antiparallel β-ribbon (Figure 2(b)), but the two proteins were otherwise very similar. The N termini of the two 3₁₀ helices in switch Arc are directly juxtaposed, creating a potential anion-binding site.10 Binding of negatively charged ions in this site would be stabilized by contacts with both helices, including interactions of 3.3 Å or less with the amide hydrogen atoms of residues 9 and 10 from each subunit. In globular proteins, 3₁₀ helices are typically four residues or fewer in length.11 Each 3₁₀ helix
in switch Arc extended from residues 9–13 as evaluated for the minimized average structure using MOLMOL.12 However, examination of the ensemble of structures gave the impression of slightly shorter helices; residues 9–12 were 3₁₀ helix in most structures, while residue 13 was classified as 3₁₀ helix in only one. Although semantics may dictate the precise length of the 3₁₀ helices in switch Arc, this region had similar conformations in all 13 NMR structures (see Figure 2(a)). Moreover, the residue 9–13 backbone angles occupied the helical region of Ramachandran space in all structures, with mean φ and ψ angles for the average structure (−75, −16), close to the average value of (−71, −18) for 3₁₀ helices in other proteins.11 Individual φ and ψ angles for residues 9–13 in all 13 switch Arc NMR structures also fell within the scatter observed in known structures. The 3₁₀ helices of switch Arc, like those in most proteins, are irregular. The cartoons in the bottom portion of Figure 3 show the hydrogen bonds in each 3₁₀ helix and in the wild-type β-ribbon. In each helix, there are three main-chain hydrogen bonds (Pro8–Leu11, Gln9–Asn12, and Leu11–Trp14) instead of four, because the Phe10 carbonyl oxygen atom makes a hydrogen bond to the side-chain amide of Asn34 rather than to the main-chain amide of Arg13. In the wild-type β-ribbon, there are six main-chain hydrogen bonds and two long-range hydrogen bonds, between the side-chain amide of Asn34 and the carbonyl oxygen atom of Arg13 in each subunit. Hence, when both monomers are considered, residues 9–13 in wild-type Arc and switch Arc both make a total of eight hydrogen bonds, even though the bonding partners and backbone conformations are strikingly different.
Core packing and tertiary structure comparisons
In switch Arc, three side-chains from each 3₁₀ helix, Phe10, Leu11, and Trp14, pack into the hydrophobic core. In addition, Pro8 makes packing contacts with Phe10 and Leu11, and Met7 packs against Leu11.
The side-chain conformations and positions of Phe10, Leu11, and Trp14 for switch Arc were quite well defined in the ensemble (Figure 2(a)), and the core appeared to be well packed. In wild-type Arc, three residues in each β-strand, Phe10, Leu12, and Trp14, are buried in the hydrophobic core. As the cartoon in the top part of Figure 3 shows, Phe10 in switch Arc occupies approximately the same position as Leu12 in wild-type Arc, and Leu11 in switch Arc has the same approximate position as Phe10 in wild-type Arc. Overall, the number and type of core side-chains in switch Arc and wild-type Arc are conserved despite dramatic differences in the actual packing arrangement. Both in terms of its hydrogen bonding and tertiary packing, switch Arc represents an alternative way to cap the α-helical framework of Arc
Figure 3. Core packing (top) and hydrogen bonding (bottom) schematics for (a) wild-type Arc and (b) switch Arc. In the core packing schematic, boxes are shaded to distinguish the two monomers, and arrows indicate the chain direction. In the hydrogen-bonding schematic, hydrogen bonds are shown as broken arrows, and peptide linkages as continuous lines.
repressor. The α-helical portions of wild-type and switch Arc (residues 16–48) are similar, and only minor adjustments accompany the strand → helix transition of residues 9–13. The inter-helical angle between α-helices A and B in the switch solution structure (93.1(±6.6)°) is slightly larger than that in the wild-type solution structure (85.7(±9.0)°), and the Leu19 side-chain adopts different rotamers in the two structures to accommodate the changes in core packing.
Solvent-accessible surface
Figure 4. Average residue SASA differences between the wild-type Arc and switch Arc solution structures, classified by wild-type SASA (i.e. whether the residue is buried or solvent-exposed in wild-type) and by region of the sequence. Residues with 0–20 Å² of SASA constitute the approximate hydrophobic core of the protein. Residues 8–15 correspond to the β-sheet of wild-type Arc or the 3₁₀ helices of switch Arc, plus the following turn. Residues 16–32 correspond to helix A plus the following turn. Residues 33–51 correspond to helix B plus the following turn.
Calculation of the solvent-accessible surface area (SASA) for residues 7–51 in the switch Arc and wild-type Arc solution structures (computed for each individual structure in the ensemble and then averaged) yields 6623(±100) Å² for switch Arc and 5655(±150) Å² for wild-type Arc. Thus, there is 968(±250) Å² additional SASA in the native state of switch Arc. Much of this difference arises from accessibility changes of residues on the protein surface and in the immediate region of the secondary structure change (Figure 4). For example, Gln9, Asn12, Arg13, and Arg16 in or near the 3₁₀ helix of switch Arc contribute an additional ~440 Å² SASA relative to Gln9, Asn11, Arg13 and Arg16 in or near the β-sheet of wild-type. Part of this difference is probably attributable
Protection against amide hydrogen exchange
Figure 5. (a) First 1H–15N HSQC spectrum acquired seven minutes after resuspension of switch Arc sample in 2H2O. (b) Protection factors for amide protons of switch and wild-type Arc.
directly to the secondary structure change itself: on average, the side-chains in a two-stranded β-sheet are packed more efficiently than those in a single helix.13 While some of this difference could, in principle, be made up by helix–helix packing, the end-to-end arrangement of the 3₁₀ helices of switch Arc does not permit any substantial additional surface burial to be realized, at least on the solvent-exposed face. Overall, the structural change observed in switch Arc leads to a “rougher” outer surface of the protein, mostly in the vicinity of the structural change, though there is some propagation of this effect into helix A and the interhelical turn between helices A and B. If all other factors were equal, wild-type Arc would be expected to be significantly more stable than switch Arc because more of its surface is solvent-inaccessible in the native structure. This expectation is not met, however, as the two proteins have nearly the same stabilities (see below and Cordes et al.1). As a result, other interactions (solvent exposure of the denatured state, electrostatics, van der Waals interactions, etc.) must stabilize switch Arc and/or destabilize wild-type Arc.
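The native-state SASA comparison reduces to a simple difference of two ensemble averages. The sketch below reproduces the quoted 968 Å² difference and shows that the quoted ±250 Å² is consistent with adding the two spreads linearly; that this is how the uncertainty was derived is an assumption, not something stated in the text, and the function name is hypothetical.

```python
# Ensemble-averaged native SASA for residues 7-51, in square angstroms,
# given as (mean, spread) pairs quoted in the text.
sasa_switch = (6623.0, 100.0)
sasa_wild_type = (5655.0, 150.0)


def difference_with_linear_error(a, b):
    """Return (a - b, err_a + err_b): a conservative, linearly summed bound.

    Assumption: linear summation is used here only because it matches the
    quoted +/-250; the source does not say how the uncertainty was combined.
    """
    return a[0] - b[0], a[1] + b[1]


delta_sasa, delta_err = difference_with_linear_error(sasa_switch, sasa_wild_type)

# Share of the difference attributed to Gln9, Asn12, Arg13 and Arg16 (~440 A^2).
local_fraction = 440.0 / delta_sasa

print(f"Additional SASA in switch Arc: {delta_sasa:.0f} +/- {delta_err:.0f} A^2")
print(f"Fraction from residues 9, 12, 13, 16: {local_fraction:.0%}")
```

On this reading, slightly under half of the extra exposed surface comes from the four named residues in or near the restructured region.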
Hydrogen-exchange experiments were performed as described for wild-type Arc14 by lyophilizing a uniformly 15N-labelled sample of switch Arc in H2O and dissolving the dried protein in 2H2O (pH 4.67, 150 mM KCl). Figure 5(a) shows the HSQC spectrum acquired immediately after resuspension in 2H2O. Figure 5(b) shows the residue-by-residue protection factors determined by analysis of successive HSQC experiments. In the regions corresponding to α-helices A and B (residues 16–29 and 33–48), the same patterns and general magnitudes of hydrogen-exchange protection were observed in wild-type and switch Arc. In helix A, the largest differences in protection were observed for residues 18, 21, 24 and 27, which lie roughly on one face of the helix. Interestingly, these four residues also show the largest differences in hydrogen-bond lengths between the average wild-type and switch Arc solution structures. Thus, the variations in protection factors in helix A probably reflect minor adjustments in the conformation of this secondary structure, primarily along one face. In helix B, somewhat less protection is observed for residues 39–43 of switch Arc, as compared to the same residues of wild-type Arc. Other than residues 9–13, this region contributes the most important dimer contacts in the protein, and core residues in this interface are the least tolerant to mutation in the entire protein.15 Accordingly, one might expect the dynamics of this region to be particularly sensitive to subtle structural changes. Although the central turn of helix B is not notably different in switch Arc, slight changes in the dimer interface are observed, most notably a shortened distance between the Val41 and Val41′ side-chains. We suggest that increased structural strain in this region may affect the local dynamics of the backbone.
For the 3₁₀ helix of switch Arc, discernible protection from exchange was observed only for the amide proton of Trp14, which is involved in the C-terminal hydrogen bond of this helix. Protection of Trp14, which is part of the β-sheet in wild-type Arc, appears to be somewhat greater in wild-type Arc than in switch Arc, although the magnitude of this difference is unclear, since an accurate value for switch Arc could not be determined. No protection was observed for the Leu11 and Asn12 amide hydrogen atoms in switch Arc, which are also involved in hydrogen bonds in the 3₁₀ helix. By contrast, the amide hydrogen atoms of residues 12 and 13 were visible in comparable experiments on wild-type Arc, but exchange was too fast for determination of accurate rates. Overall, protection appeared to be greater for the β-ribbon of wild-type Arc than for the 3₁₀ helix of switch Arc, though neither structure was well protected relative to the α-helical framework. The region of the protein that changed structure most dramatically was clearly the most dynamic in both the wild-type and switch structures. Moreover, the
Figure 6. (a) Urea-denaturation curves for wild-type and switch Arc at 10 μM protein, in 50 mM Tris (pH 7.5), 250 mM KCl at 25 °C. (b) Dependence of near-ultraviolet CD spectrum of Arc-N11L mutant on the presence of osmolytes. Colored curves show experimentally measured spectra; black curves show spectra simulated using linear combinations of wild-type and switch basis spectra.2 The Arc-N11L spectrum becomes more wild-type-like in 1 M TMAO and more like that of switch Arc in 2 M urea.
general flexibility of this region in both structures probably accounts for the highly dynamic interconversion of the β-ribbon and 3₁₀ helical forms of Arc-N11L.2
Solvent effects and structural preference
Exposure of solvent-accessible surface area upon unfolding of globular proteins is correlated with denaturant m values, obeying the approximate relationship m = 0.14 × ΔSASA.16 The SASA in the denatured state of both switch and wild-type Arc can be estimated as 16,111 Å² from the molecular mass according to Miller et al.17 Given the accessible surface in the native states as computed from the solution structures as above, the ΔSASA upon unfolding should be 10,456 Å² for wild-type Arc and 9488 Å² for switch Arc, yielding predicted m
values of 1.46 and 1.33, respectively. Fitting of urea-denaturation data (Figure 6(a)) showed that switch Arc may be slightly more stable than wild-type Arc (ΔGu,298 of 11.6 versus 10.8 kcal mol⁻¹) but gave the same m value within error (1.4(±0.1) kcal mol⁻¹ M⁻¹). Two factors complicate interpretation of this finding. First, the experimental error is comparable to the predicted difference, and both of the measured m values are within experimental error of the predicted values. Second, if differences exist in the SASA of the unfolded states of the two proteins, they could mask the expected change in m value. In the Arc-N11L mutant, the switch and wild-type folds exist in dynamic equilibrium.2 Because Arc-N11L has a single denatured state, the m-value difference between the alternative native structures should only depend on the native SASA difference. Moreover, because urea diminishes the energetic penalty for exposing surface, sub-denaturing concentrations of urea should increase the relative proportion of the structure with the higher SASA. The urea sensitivity of the Arc-N11L structural equilibrium (Figure 6(b)) was quantified by fitting near-UV CD spectra using linear combinations of the basis spectra of wild-type Arc and switch Arc (see Cordes et al.2). As expected from the higher SASA of switch Arc, urea was indeed found to favor the switch fold of Arc-N11L relative to the wild-type fold. At 0 M urea, the Keq value for the sheet-to-helix conversion is 1.8 and ΔGwt→sw = −0.34 kcal mol⁻¹. At 2 M urea the corresponding values are Keq = 3.3 and ΔGwt→sw = −0.69 kcal mol⁻¹. The m value calculated from the difference between the 0 M and 2 M urea data (ΔΔG/2 M = 0.17(±0.04) kcal mol⁻¹ M⁻¹) was in good agreement with the predicted value of −0.13 kcal mol⁻¹ M⁻¹.16 Trimethylamine oxide (TMAO) is an osmolyte that favors burial of surface area18 and thus should have the opposite effect from urea.
In 1 M TMAO, the near-UV CD spectra of Arc-N11L shifted toward the β-ribbon conformation (Figure 6(b)), again consistent with the native SASA differences between the wild-type and switch Arc structures. Recall that the head-to-head arrangement of the two 3₁₀ helices in switch Arc should give rise to unfavorable electrostatic repulsion. In the absence of additional effects, we would thus expect the proportion of the switch fold relative to the wild-type fold in Arc-N11L to increase as a function of ionic strength. To test this possibility, we monitored the relative proportion of the two structures in Arc-N11L by near-UV CD in the absence of added salt and in the presence of 250 mM KCl (data not shown). Salt stabilized the switch Arc fold relative to the wild-type fold in Arc-N11L; fitting of the near-UV CD spectra gave a [wild type]/[switch] ratio of approximately 3:1 in the absence of salt compared to a ratio near 2:3 in 250 mM KCl.
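The numerical steps in this section can be reproduced with a short calculation. The following is a sketch under stated assumptions: the coefficient 0.14 is taken to be in cal mol⁻¹ M⁻¹ Å⁻² (which reproduces the predicted m values of 1.46 and 1.33 kcal mol⁻¹ M⁻¹), free energies are obtained from ΔG = −RT ln Keq, and the temperature is taken as 15 °C, the temperature given in Materials and Methods for the near-UV CD spectra. Function and constant names are illustrative.

```python
import math

R_KCAL = 1.987e-3          # gas constant, kcal mol^-1 K^-1
M_PER_A2 = 0.14e-3         # assumed units: kcal mol^-1 M^-1 per square angstrom
SASA_DENATURED = 16111.0   # A^2, estimated from molecular mass
SASA_NATIVE = {"wild_type": 5655.0, "switch": 6623.0}  # A^2
TEMP_K = 288.15            # 15 C, assumed from the near-UV CD conditions


def predicted_m(native_sasa):
    """Predicted m value (kcal mol^-1 M^-1) from surface exposed on unfolding."""
    return M_PER_A2 * (SASA_DENATURED - native_sasa)


def delta_g(keq, temp_k=TEMP_K):
    """Free energy (kcal mol^-1) for an equilibrium constant: -RT ln(Keq)."""
    return -R_KCAL * temp_k * math.log(keq)


m_wt = predicted_m(SASA_NATIVE["wild_type"])
m_sw = predicted_m(SASA_NATIVE["switch"])
m_difference = m_sw - m_wt

dg_0m = delta_g(1.8)   # wild-type -> switch, 0 M urea
dg_2m = delta_g(3.3)   # wild-type -> switch, 2 M urea
urea_slope = (dg_2m - dg_0m) / 2.0  # change in delta G per molar urea

print(f"Predicted m: wild-type {m_wt:.2f}, switch {m_sw:.2f}, difference {m_difference:.2f}")
print(f"dG(wt->sw): {dg_0m:.2f} at 0 M urea, {dg_2m:.2f} at 2 M urea")
print(f"Urea dependence of dG(wt->sw): {urea_slope:.2f} kcal/mol/M")
```

With the rounded Keq values quoted in the text, the 2 M free energy comes out near −0.68 rather than −0.69 kcal mol⁻¹, a difference attributable to rounding. The sign of the urea dependence depends on the convention chosen for m; its magnitude (about 0.17) is what the text compares with the predicted 0.13.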
Conclusions
The solution structure of the switch Arc mutant shows that an irregular 3₁₀ helix replaces each β-strand of the wild-type structure. This structural change results in conservation of a well-packed hydrophobic core and of the number of main-chain and main-chain to side-chain hydrogen bonds. The surface of switch Arc is rougher and not as well packed as the surface of wild-type Arc. Indeed, switch Arc has ~1000 Å² of additional solvent-accessible surface area. This difference correlates with stabilization of the switch fold relative to wild-type in sub-denaturing concentrations of urea and stabilization of the wild-type fold relative to switch in trimethylamine oxide. The N termini of the 3₁₀ helices in switch Arc appear to make unfavorable head-to-head electrostatic interactions. As a result, the switch conformation is stabilized relative to the wild-type structure as the ionic strength is raised. The transition from the wild-type β-ribbon to the switch 3₁₀ helices occurs without major disruptions of the remaining α-helical portions of the Arc structure. A small widening of the inter-helical angle between α-helices A and B is, however, observed in the switch protein, and a few side-chains in this region of the protein adopt new rotamer conformations. These minor structural perturbations, however, do give rise to detectable changes in hydrogen-exchange rates in regions of wild-type and switch Arc that have very similar structures. Only two amino acid substitutions (N11L and L12N) are required to convert Arc stably from the wild-type fold to the switch fold. Just one of these changes (N11L) allows Arc to adopt either structure, providing a simple, two-step sequence path that connects both protein folds in a continuous fashion. This mutational path is very efficient, in the sense that it quickly leads to a stable structural transformation.
On the other hand, the structural and biophysical results presented here suggest that the evolved fold is far from ideal in terms of surface packing, balanced electrostatic interactions, and hydrogen-exchange properties. This makes intuitive sense and raises the challenge of optimizing the switch structure through design or selection.
Materials and Methods
Mutagenesis and protein purification
Genes encoding Arc S-LLL, Arc S-LLQL, switch Arc, and Arc-N11L were constructed by cassette mutagenesis of the arc-st11 gene of plasmid pET800 or pSA700. The st11 C-terminal extension (H6KNQHE) allows affinity purification and prevents intracellular degradation.¹⁹ Arc variants were overexpressed from Escherichia coli strains BL21(λDE3)-pET800, BL21(λDE3)pLysS-pET800, X90-pSA700 or UA2F-pSA700, and purified to greater than 95% homogeneity by chromatography on Ni²⁺-NTA and SP-Sephadex.¹⁹ Uniformly 15N-labelled samples were prepared by overexpression in M9T minimal medium containing 15NH4Cl (0.8 g/l) as the sole nitrogen source. The 10% 13C-uniformly 15N-labelled sample of switch Arc was prepared by overexpression in M9T medium containing 15NH4Cl (0.8 g/l) as the sole nitrogen source, and a mixture of 0.3 g/l [13C6]glucose and 2.7 g/l unlabelled glucose as the sole carbon source.
Circular dichroism
Far ultraviolet circular dichroism spectra of wild-type and mutant Arc repressors were obtained in 50 mM Tris (pH 7.5), 250 mM KCl at 15 °C using 50 μM protein samples and a 1 mm path-length cell. Urea-denaturation experiments on wild-type and switch Arc were performed at 25 °C essentially as described¹⁹ using 10 μM protein and monitoring ellipticity at 234 nm in a 1 cm path-length cuvette. Near-ultraviolet CD spectra of Arc-N11L were obtained in a 1 cm path-length cuvette at 15 °C on 100 μM protein in buffer B (10 mM Tris (pH 7.5), 0.2 mM EDTA) or buffer B plus 1 M TMAO, 2 M urea or 250 mM KCl. Fitting of spectra to obtain equilibrium constants for the sheet–helix interconversion was performed as described.²
NMR samples and spectra
All NMR samples except those used for amide-hydrogen exchange (see below) contained 20 mM sodium phosphate (pH 4.8–4.9), 10% 2H2O, with 1 mM 3-(trimethylsilyl)-propionic acid (TMSP) as an internal chemical-shift reference. Some samples also contained 0.01% (w/v) sodium azide as a preservative. S-LLQL experiments were performed on a 1.5 mM unlabelled sample and a 5.2 mM 15N-labelled sample. Switch Arc experiments were performed with a 4 mM uniformly 15N-labelled sample and a 7.5 mM 10% 13C-uniformly 15N-labelled sample.
S-LLQL NMR spectra at 303 K included: 2D 1H–1H NOESY (200 ms mixing time), 2D 1H–1H DQF-COSY, 2D 1H–1H TOCSY (70 ms mixing time), 3D 15N–1H HSQC-NOESY (200 ms mixing time) and 3D 15N–1H HSQC-TOCSY (80 ms mixing time). S-LLQL NMR spectra at 318 K included: 2D 1H–1H NOESY (25, 50, 100 and 150 ms mixing time), 2D 1H–1H DQF-COSY, 2D 1H–1H TOCSY (70 ms mixing time), 3D 15N–1H HSQC-NOESY (50 and 200 ms mixing time) and HNHA. Switch Arc spectra included: 2D 1H–1H NOESY (50 and 200 ms mixing times), 2D 1H–1H DQF-COSY and 2D 1H–1H TOCSY (50 and 110 ms mixing time), 3D 15N–1H HSQC-NOESY (150 ms mixing time), 3D 15N–1H HSQC-TOCSY (80 ms mixing time), HNHA, HNHB and 13C–1H HSQC. All switch Arc spectra were recorded at 303 K. Spectra were processed using NMRPipe/NMRDraw²⁰ and analyzed using NMRView.²¹
Amide-hydrogen exchange
An initial hydrogen-exchange experiment on switch Arc was performed by resuspension in 2H2O of a lyophilized 1.7 mM uniformly 15N-labelled sample containing 20 mM sodium phosphate (pH 4.91, uncorrected meter reading), 0.01% sodium azide, 1 mM TMSP, 10% 2H2O. Restraints used in structure calculations were derived from this experiment. Quantitative hydrogen-exchange experiments on switch Arc were performed on a 5 mM uniformly 15N-labelled sample containing 20 mM sodium phosphate (pH 4.67, uncorrected meter reading), 150 mM KCl, 0.01% sodium azide, 1 mM TMSP, 10% 2H2O. These conditions closely resemble those described for wild-type Arc.¹⁴ Quantitative rates of amide exchange were derived from this second experiment by exponential decay analysis (using the program NMRView) of amide cross-peak intensities in 20 15N–1H HSQC spectra recorded between 16 minutes and 46 hours after the initial spectrum of the resuspended sample. For most peaks, a “jitter” analysis of peak intensities was used, allowing the precise peak position of a given amide to vary slightly from spectrum to spectrum. For residues 22 and 25, which were highly overlapped, “center” analysis was used. Protection factors were computed by dividing intrinsic exchange rates computed using the Sphere server† by the observed exchange rates.
Restraint assignments
Distance restraints for switch Arc were derived from a 50 ms 2D NOESY spectrum and a 150 ms 3D NOESY spectrum using the program NMRView and classified into restraint classes as described in the text. To help resolve ambiguities due to chemical-shift degeneracy, the model of switch Arc previously published¹ was used as a template in generating restraints. 3JNHα values were derived from HNHA spectra as described.²² Apparent J values were multiplied by 1.1 to correct for the difference in relaxation rate between the anti-phase and in-phase terms. Residues with 3JNHα < 6.0 Hz were restrained to values of φ = 65(±25)°, whereas those with 3JNHα > 8.0 Hz were restrained to values of φ = 120(±40)°. 3JNHβ values were derived from HNHB spectra,²³ and determined quantitatively by comparing volume integrals from the HNHB spectrum with those from a 2D reference spectrum as described.²⁴ Terms due to passive coupling were ignored. Where 3JNHβ ≤ 1.5 Hz for both β protons, χ1 angles were restrained to 180(±40)°.
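The amide-exchange analysis described above (an exponential decay fit followed by division of the intrinsic rate by the observed rate) can be sketched in a few lines. This is a minimal illustration, not the NMRView procedure: it assumes a single-exponential decay to a zero baseline and fits it by linear least squares on log intensities. The peak intensities are synthetic, and the intrinsic rate is a hypothetical placeholder rather than a value obtained from the Sphere server.

```python
import math


def fit_exchange_rate(times_h, intensities):
    """Fit ln(I) = ln(I0) - k*t by least squares and return k (per hour).

    Assumes all intensities are positive and decay to a zero baseline.
    """
    logs = [math.log(i) for i in intensities]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    return -sxy / sxx


def protection_factor(k_intrinsic, k_observed):
    """Protection factor = intrinsic exchange rate / observed exchange rate."""
    return k_intrinsic / k_observed


# Synthetic cross-peak intensities for one amide (hypothetical numbers).
true_k = 0.05                                # per hour, used only to build test data
times = [0.27, 1.0, 2.0, 4.0, 8.0, 16.0, 24.0, 46.0]   # hours after resuspension
peak = [100.0 * math.exp(-true_k * t) for t in times]

k_obs = fit_exchange_rate(times, peak)
pf = protection_factor(25.0, k_obs)          # 25 per hour: hypothetical intrinsic rate

print(f"k_obs = {k_obs:.4f} per hour, protection factor = {pf:.0f}")
```

With real data, noise near the baseline makes the log transform unreliable for the latest time points, so a nonlinear fit with an explicit baseline term may be preferable.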
Amide protons were considered to be hydrogen bonded if they were visible in the first 15N–1H HSQC spectrum acquired following resuspension of a lyophilized H2O sample in 2H2O. When dαN(i − 4, i) and dαN(i − 3, i) NOEs to such amides were also visible in a 150 ms 3D NOESY spectrum, a hydrogen-bond restraint was assigned between the amide proton of residue i and the carbonyl oxygen atom of residue i − 4.²⁵ In many cases these NOEs could not be discerned due to resonance overlap, so only a small number of hydrogen-bond restraints were used. Two distance restraints were used to describe hydrogen bonds, 1.8–2.5 Å between the carbonyl oxygen atom and the amide proton, and 1.8–3.5 Å between the carbonyl oxygen atom and the amide nitrogen atom.
† http://www.fccc.edu/research/labs/roder/sphere/sphere.html
Structure determination
Structure calculations were performed using XPLOR 3.1²⁶ with a simulated annealing protocol designed for symmetric dimers.⁷ As starting points for the calculation, 28 structures were generated in which the conformation of the N-terminal region (residues 1–13) was random, and that of the remainder of the protein restrained to be similar to wild-type Arc.¹ The use of completely random starting structures led to extremely poor convergence rates, but those calculations which did converge yielded structures similar to those obtained from calculations using non-random starting structures. Each semirandom starting structure was then annealed to a model structure using the distance, angle and hydrogen bond restraints derived above. Two “seed” restraints, Arg40 ε-Phe45 ζ and Trp14 ε3-Tyr38 ε, were described as unambiguously intermolecular. In the wild-type Arc structure, the difference between the intra- and inter-monomer distances for these atoms is > 10 Å. Use of the seed restraints improved convergence but again did not alter the qualitative nature of the results. All other NOE distance restraints were described ambiguously using sum potentials. Hydrogen-bond restraints in α-helices were described as intra-monomer. In the conformational search phase of the calculations, non-bonded interactions were computed only between Cα atoms with a van der Waals term of 0.1, corresponding to protocol 2 of Nilges.⁷
Structural analysis
Secondary structures, hydrogen bonds, inter-helical angles and residue solvent-accessible surface areas for the wild-type and switch Arc ensemble and minimized average solution structures were assessed using the program MOLMOL.¹² Criteria for identification of hydrogen bonds were 2.5 Å maximum distance and a minimum angle between donor and acceptor of 135°. The structural quality data given in Table 1 were obtained using the program PROCHECK-NMR.²⁷
Protein Data Bank accession number
The ensemble of 13 accepted structures has been deposited in the Protein Data Bank under the accession number 1NLA.
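The hydrogen-bond criteria quoted under Structural analysis (2.5 Å maximum distance, 135° minimum angle) amount to a simple geometric test. The sketch below is one possible reading, not MOLMOL's implementation. The text does not say which atoms define the distance and angle, so the code assumes the distance is measured from the hydrogen to the acceptor and the angle is the donor–hydrogen–acceptor angle. The function name and coordinates are hypothetical.

```python
import math


def is_hydrogen_bond(donor, hydrogen, acceptor, max_dist=2.5, min_angle=135.0):
    """Return True if the H...acceptor distance and D-H...A angle pass the cutoffs.

    Assumption (not stated in the text): distance is hydrogen-to-acceptor in
    angstroms, and the angle is measured at the hydrogen between donor and acceptor.
    """
    h_to_a = [a - h for a, h in zip(acceptor, hydrogen)]
    h_to_d = [d - h for d, h in zip(donor, hydrogen)]
    dist = math.sqrt(sum(x * x for x in h_to_a))
    d_len = math.sqrt(sum(x * x for x in h_to_d))
    cos_angle = sum(x * y for x, y in zip(h_to_a, h_to_d)) / (dist * d_len)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    angle = math.degrees(math.acos(cos_angle))
    return dist <= max_dist and angle >= min_angle


# Hypothetical coordinates (angstroms): amide N, amide H, carbonyl O.
n_atom = (0.0, 0.0, 0.0)
h_atom = (1.0, 0.0, 0.0)

linear = is_hydrogen_bond(n_atom, h_atom, (3.0, 0.0, 0.0))   # 2.0 A, 180 degrees
bent = is_hydrogen_bond(n_atom, h_atom, (1.0, 2.0, 0.0))     # 2.0 A, 90 degrees
distant = is_hydrogen_bond(n_atom, h_atom, (4.0, 0.0, 0.0))  # 3.0 A, 180 degrees

print(linear, bent, distant)
```

Only the linear, short-distance arrangement passes both cutoffs. Counting such pairs over each ensemble member is one way the hydrogen-bond totals compared in the text could be tallied.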
Acknowledgements
This work was supported, in part, by NIH grant AI-15706 and post-doctoral grants to M.H.J.C. from Merck and the Helen Hay Whitney Foundation.
References
1. Cordes, M. H. J., Walsh, N. P., McKnight, C. J. & Sauer, R. T. (1999). Evolution of a protein fold in vitro. Science, 284, 325–327.
2. Cordes, M. H. J., Burton, R. E., Walsh, N. P., McKnight, C. J. & Sauer, R. T. (2000). An evolutionary bridge to a new protein fold. Nature Struct. Biol. 7, 1129–1132.
3. Cordes, M. H. J. & Sauer, R. T. (1999). Tolerance of a protein to multiple polar-to-hydrophobic surface substitutions. Protein Sci. 8, 318–325.
4. Breg, J. N., Boelens, R., George, A. V. E. & Kaptein, R. (1989). Sequence-specific 1H NMR assignment and secondary structure of the Arc repressor of bacteriophage P22, as determined by two-dimensional 1H NMR spectroscopy. Biochemistry, 28, 9826–9833.
5. Neri, D., Szyperski, T., Otting, G., Senn, H. & Wüthrich, K. (1989). Stereospecific nuclear magnetic resonance assignments of the methyl groups of valine and leucine in the DNA-binding domain of the 434 repressor by biosynthetically directed fractional 13C labeling. Biochemistry, 28, 7510–7516.
6. Cai, M., Huang, Y., Liu, J. & Krishnamoorthi, R. (1995). Solution conformations of proline rings in proteins studied by NMR spectroscopy. J. Biomol. NMR, 6, 123–128.
7. Nilges, M. (1993). A calculation strategy for the structure determination of symmetric dimers by 1H NMR. Proteins: Struct. Funct. Genet. 17, 297–309.
8. Bonvin, A. M. J. J., Vis, H., Breg, J. N., Burgering, M. J. M., Boelens, R. & Kaptein, R. (1994). Nuclear magnetic resonance solution structure of the Arc repressor using relaxation matrix calculations. J. Mol. Biol. 236, 328–341.
9. Nooren, I. M. A., Rietveld, A. W. M., Melacini, G., Sauer, R. T., Kaptein, R. & Boelens, R. (1999). The solution structure and dynamics of an Arc repressor mutant reveal premelting conformational changes related to DNA binding. Biochemistry, 38, 6035–6042.
10. Copley, R. R. & Barton, G. J. (1994). A structural analysis of phosphate and sulphate binding sites in proteins. J. Mol. Biol. 242, 321–329.
11. Barlow, D. J. & Thornton, J. M. (1988). Helix geometry in proteins. J. Mol. Biol. 201, 601–619.
12. Koradi, R., Billeter, M. & Wüthrich, K. (1996). MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. 14, 51–55.
13. Fleming, P. J. & Richards, F. M. (2000). Protein packing: dependence of protein size, secondary structure and amino acid composition. J. Mol. Biol. 299, 487–498.
14. Burgering, M. J. M., Hald, M., Boelens, R., Breg, J. N. & Kaptein, R. (1995). Hydrogen exchange studies of the Arc repressor: evidence for a monomeric folding intermediate. Biopolymers, 35, 217–226.
15. Milla, M. E. & Sauer, R. T. (1995). Critical side-chain interactions at a subunit interface in the Arc repressor dimer. Biochemistry, 34, 3344–3351.
16. Myers, J. K., Pace, C. N. & Scholtz, J. M. (1995). Denaturant m values and heat capacity changes: relation to changes in accessible surface areas of protein unfolding. Protein Sci. 4, 2138–2148.
17. Miller, S., Janin, J., Lesk, A. M. & Chothia, C. (1987). Interior and surface of monomeric proteins. J. Mol. Biol. 196, 641–656.
18. Wang, A. & Bolen, D. W. (1997). A naturally occurring protective system in urea-rich cells: mechanism of osmolyte protection of proteins against urea denaturation. Biochemistry, 36, 9101–9108.
19. Milla, M. E., Brown, B. M. & Sauer, R. T. (1993). P22 Arc repressor: enhanced expression of unstable mutants by addition of polar C-terminal sequences. Protein Sci. 2, 2198–2205.
20. Delaglio, F., Grzesiek, S., Vuister, G. W., Zhu, G., Pfeifer, J. & Bax, A. (1995). NMRPipe: a multidimensional spectral processing system based on UNIX pipes. J. Biomol. NMR, 6, 277.
21. Johnson, B. & Blevins, R. (1994). NMRView. A computer program for the visualization and analysis of NMR data. J. Biomol. NMR, 4, 603–614.
22. Vuister, G. W. & Bax, A. (1993). Quantitative J correlation: a new approach for measuring homonuclear three-bond J(HNHα) coupling constants in 15N-enriched proteins. J. Am. Chem. Soc. 115, 7772–7777.
23. Archer, S. J., Ikura, M., Torchia, D. A. & Bax, A. (1991). An alternative 3D NMR technique for correlating backbone 15N with side chain Hβ resonances in larger proteins. J. Magn. Reson. 95, 636–641.
24. Bax, A., Vuister, G. W., Grzesiek, S., Delaglio, F., Wang, A. C., Tschudin, R. & Zhu, G. (1994). Measurement of homo- and heteronuclear J couplings from quantitative J correlation. Methods Enzymol. 239, 79–105.
25. Arseniev, A., Schultze, P., Worgotter, E., Braun, W., Wagner, G., Vask, M. et al. (1988). Three-dimensional structure of rabbit liver (Cd7) metallothionein-2a in aqueous solution determined by nuclear magnetic resonance. J. Mol. Biol. 201, 637–657.
26. Brünger, A. (1993). XPLOR Version 3.1: A System for X-ray Crystallography and NMR, Yale University Press, New Haven, CT.
27. Laskowski, R. A., Rullmann, J. A., MacArthur, M. W., Kaptein, R. & Thornton, J. M. (1996). AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. J. Biomol. NMR, 8, 477–486.
28. Kraulis, P. J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallog. 24, 946–950.
Edited by P. Wright (Received 22 May 2002; received in revised form 2 December 2002; accepted 3 December 2002)