JOURNAL OF MAGNETIC RESONANCE 78, 425-439 (1988)
The Influence of Spin Diffusion and Internal Motions on NOE Intensities in Proteins

ANDREW N. LANE

National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, United Kingdom

Received May 8, 1987; revised November 18, 1987

An algorithm has been coded for calculating NOE time courses for an arbitrary number of protons. Simulations were made on three proteins: myoglobin, bovine pancreatic trypsin inhibitor, and the toxic domain of Escherichia coli heat-stable enterotoxin. The calculations were carried out to simulate several conditions under which NOE measurements are commonly made, namely variation of the overall tumbling time and solvent (¹H₂O and ²H₂O). The influence of spin diffusion and internal motions on NOE intensities in secondary structures was studied. It was found that spin diffusion is unlikely to lead to false identification of secondary structure elements, but does lead to considerable underestimation of distances longer than about 4 Å, which are used for determination of tertiary structures. Internal motions were simulated using truncated, anisotropic three-dimensional harmonic oscillation. For motions slower than the Larmor frequency, the oscillation has significant effects on the long-range NOEs, whereas faster motions tend to yield distance estimates similar to those of the rigid body calculation, because of counteracting effects of the motion on the correlation time and the average distance. These considerations have significance for the refinement of structures generated by distance geometry or other algorithms. © 1988 Academic Press, Inc.
INTRODUCTION
In recent years, proton NMR has been increasingly used in the determination of the structures of macromolecules in solution (1-10). One of the most important NMR parameters for structural work is the homonuclear cross-relaxation rate constant because of its strong ($r^{-6}$) dependence on internuclear distance (11, 12). The magnitude of the NOE at short mixing times has been used for identifying secondary structures (4, 13, 14) and for estimating interproton distances as input for distance geometry (2-5) and restrained molecular dynamics calculations (1, 7).

Unfortunately, there are difficulties with these approaches. Even in a rigid macromolecule, the intensity of an NOE is not necessarily a direct measure of a single internuclear distance, owing to transfer of magnetization via one or more spins; the approximation of two spins is strictly valid only in the limit of vanishingly short mixing times. The expected NOE patterns for different secondary structures were derived solely on the basis of distances obtained from crystal structures (4, 13). It has not been established whether spin diffusion could be sufficient to mask the characteristic patterns of NOEs. The so-called long-range NOEs that are used to build up a tertiary structure are generally interpreted as relatively poorly determined distances. Although this appears to be adequate to generate good topologies of small proteins in solution,
0022-2364/88 $3.00 Copyright © 1988 by Academic Press, Inc. All rights of reproduction in any form reserved.
the local structures are poorly determined (2). Better determination of local structures requires more precise and accurate distances. Hence, spin diffusion must be explicitly taken into account. Several workers have examined the extent of spin diffusion for a variety of specific models (6, 8, 12, 15-20), though apparently not for the actual geometry of real proteins.

The second difficulty with NOE measurements is that the cross-relaxation and spin-lattice relaxation rate constants depend not only on internuclear distances but also on spectral densities and therefore on relative motions of the dipolar coupled spins. The relaxation of protons in macromolecules is, for the most part, dominated by the overall tumbling rate, which can usually be estimated with adequate accuracy by a variety of methods (18, 21). However, in the presence of internal motions of sufficient amplitude or frequency, the effective correlation time will always be smaller than the overall tumbling time, therefore decreasing the relaxation rate constants (22, 23). Further, internal motion implies that the internuclear distance between pairs of spins is not necessarily constant, but may vary between some limits imposed by the structure of the molecule, such as the sum of the van der Waals radii. Hence, the observed cross-relaxation rate constant will in general be an average over all conformations sampled during the experiment. Because it is not in principle possible to determine the frequency and amplitude of the motion as well as the internuclear separation from the NOE experiment, an important practical question is how much error is introduced into structure determinations by ignoring internal motions.

Several studies (6, 16, 19, 20, 24) have shown that significant errors can be introduced into estimates of distances by ignoring spin diffusion. Olejniczak et al. (25) have calculated the effects of picosecond motions in lysozyme on the proton relaxation rate constants and NOEs by molecular dynamics simulations. These simulations suggested that NOE intensities were likely to be decreased by about a factor of 2 compared with expectations for a rigid body. Unfortunately, it is not at present feasible to extend molecular dynamics calculations to times longer than about 0.3 ns. Motions occurring on significantly longer time scales may have significant effects on NOE intensities, because the internuclear distances are still being averaged. It is desirable, therefore, to estimate the importance of relatively low frequency motions on the NOE intensities observable in macromolecules.

The purpose of this article is to address these difficulties for real proteins. The first problem, the effect of multiple pathways of magnetization transfer, is treated for a rigid body; it is shown how large apparent distances can be in error compared with the known distances. The influence of spin diffusion on the identification of secondary structures from relative intensities of short-range NOEs is also tested. The second problem, of motional averaging, is divided into two parts. First, it is assumed that the motion affects mainly the distance term, with insignificant effects on the effective correlation time. Finally, effects of motion on the correlation time, and therefore on relaxation constants, are treated.

METHODS
NOE Buildup Curves

In the high power limit, the NOE buildup rate is adequately described by the Bloch-Solomon equation for the z component of the magnetization (11, 15, 17),

$$\frac{dM_i}{dt} = -\rho_i (M_i - M_i^0) - \sum_{j \neq i} \sigma_{ij} (M_j - M_j^0), \qquad [1]$$

where $M_i$ is the magnetization of spin i, $M_i^0$ is its equilibrium magnetization, $\rho_i$ is its intrinsic spin-lattice relaxation rate constant, and $\sigma_{ij}$ are the cross-relaxation rate constants. The relaxation rate constants are given by (12, 26)

$$\rho_i = \alpha \sum_{j \neq i} [J(\omega_i - \omega_j) + 3J(\omega_i) + 6J(\omega_i + \omega_j)]/r_{ij}^6 \qquad [2]$$

$$\sigma_{ij} = \alpha [6J(\omega_i + \omega_j) - J(\omega_i - \omega_j)]/r_{ij}^6, \qquad [3]$$
where $\omega_{i,j}$ are the Larmor frequencies of the interacting spins, $J(\omega)$ are the spectral density functions, $r_{ij}$ are the internuclear distances, and $\alpha$ is a nuclear constant equal to $\gamma^4 h^2/40\pi^2$, where $\gamma$ is the gyromagnetic ratio and $h$ is Planck's constant. For protons, $\alpha = 56.92$ Å$^6$ ns$^{-2}$.

For each spin i, there is an equation of the form [1]. For a given set of initial conditions, such as $M_k = 0$ (i.e., saturation of spin k), there is a set of n - 1 simultaneous equations for n spins, which can be integrated by straightforward means. Equation [1] was integrated numerically, generating the entire NOE time course for each irradiated spin for a given conformation. The NOESY experiment can be similarly simulated by allowing the spin of interest to relax after inversion (8). An alternative method, particularly appropriate for the NOESY experiment, is to diagonalize the relaxation matrix (6, 19, 20).

A program has been coded (NOEMOT) for a microcomputer that starts with the Cartesian coordinates of the spin system. A preliminary description of this program has been given (27). The coordinates are translated into internuclear distances, from which the relaxation rate constants $\rho$ and $\sigma$ are calculated using Eq. [2] and Eq. [3]. It is necessary to supply a value of the correlation time, $\tau$. It is possible to set different correlation times for pairs of different spins, or a single rigid body correlation time for all pairs of spins. The latter can be either calculated from the molecular weight and shape or estimated from experiment. The spectrometer frequency is also supplied, although for macromolecules, the dependence of the relaxation parameters on frequency is weak. All calculations reported here were done for a spectrometer frequency of 500 MHz.

It may also be necessary to take into account other forms of relaxation. For amide protons, there are additional sources of relaxation due to interaction with the quadrupolar ¹⁴N nucleus (26). The main source of relaxation is the dipole-quadrupole interaction, which, despite the low gyromagnetic ratio of ¹⁴N, is quite effective owing to the short internuclear distance (1.12 Å). Relaxation by this mechanism is of the same form as for dipolar relaxation, with the term $\gamma^4$ in the expression for $\alpha$ being replaced by $\gamma_H^2\gamma_N^2$, where $\gamma_N$ is the gyromagnetic ratio of nitrogen. Scalar terms due to the J coupling (11) have an insignificant effect on $\rho_i$ and are ignored in these calculations.

An additional complication concerns relaxation of like spins, for example, in a methyl group. For equivalent methyl protons, the spin-lattice relaxation rate constant does not contain the term $J(\omega_i - \omega_j)$, which is therefore dropped from Eq. [2]. The methyl protons also provide a relaxation sink (17). The influence of correlated motions has
been ignored (11). The three methyl protons were treated in two ways that gave very similar results. First, the methyl group was treated as a super atom, centered at the mean of the coordinates of the three protons. Second, the three protons were considered explicitly, with the internal cross-relaxation rates equal to zero.

The spin to be irradiated is chosen, and the Bloch-Solomon equations are integrated using the method of Euler. The integration step size is initially set to $1/(40\rho_{\max})$. On each iteration, the NOE is calculated using this step size, and also for half the step size. The two results are compared, and if they differ by less than 1%, the larger step size is used; otherwise the calculation is repeated for half the smaller step size until the error is less than 1%. Usually a step size of $1/(40\rho_{\max})$ is sufficiently small. The calculation automatically accounts for spin-lattice relaxation and multiple pathways of magnetization transfer.

The apparent internuclear distances corresponding to the calculated NOE intensities at given times were estimated by two procedures. First, the slope of the NOE buildup curve at a time t was taken to be equal to $\sigma$. This is equivalent to assuming that at sufficiently short mixing times, the initial slope of the NOE buildup curve is given by

$$d\mathrm{NOE}/dt = \sigma. \qquad [4]$$
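The rate constants of Eqs. [2] and [3] and the Euler integration with step halving described above can be sketched in Python as follows. This is an illustration only and not the NOEMOT code. Several points are assumptions: the Lorentzian form of $J(\omega)$ (not printed in the text), the reading that $\alpha = 56.92$ gives rates in s⁻¹ for distances in Å and correlation times in ns, equal equilibrium magnetizations for all spins, the way the 1% comparison is normalized, and all function names.

```python
import numpy as np

ALPHA = 56.92  # proton constant quoted in the text; assumed: r in Å, tau in ns, rates in s^-1


def spectral_density(omega, tau):
    # Assumed Lorentzian J(w) = tau / (1 + (w*tau)^2); the text does not print J.
    return tau / (1.0 + (omega * tau) ** 2)


def relaxation_matrix(coords, tau_ns, freq_mhz=500.0):
    """R[i, i] = rho_i (Eq. [2]); R[i, j] = sigma_ij (Eq. [3]) for like spins."""
    coords = np.asarray(coords, dtype=float)
    omega = 2.0 * np.pi * freq_mhz * 1e-3            # Larmor frequency in rad/ns
    j0, j1, j2 = (spectral_density(w, tau_ns) for w in (0.0, omega, 2.0 * omega))
    diff = coords[:, None, :] - coords[None, :, :]
    r = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(r, np.inf)                      # exclude self terms
    inv_r6 = r ** -6.0
    R = ALPHA * (6.0 * j2 - j0) * inv_r6             # cross-relaxation terms
    np.fill_diagonal(R, ALPHA * (j0 + 3.0 * j1 + 6.0 * j2) * inv_r6.sum(axis=1))
    return R


def noe_time_course(R, k, t_end, tol=0.01):
    """Euler integration of Eq. [1] with spin k held saturated.

    The state is m_i = (M_i - M_i^0) / M_i^0, so m_k = -1 and m_i is the
    fractional NOE on spin i. Times are in seconds.
    """
    m = np.zeros(R.shape[0])
    m[k] = -1.0

    def euler(state, h):
        new = state - h * (R @ state)
        new[k] = -1.0                                # continuous saturation of spin k
        return new

    h0 = 1.0 / (40.0 * np.diag(R).max())             # initial step 1/(40 rho_max)
    t, times, values = 0.0, [0.0], [m.copy()]
    while t_end - t > 1e-12:
        h = min(h0, t_end - t)
        while True:
            full = euler(m, h)                       # one step of size h
            half = euler(euler(m, 0.5 * h), 0.5 * h) # two steps of size h/2
            scale = max(np.abs(np.delete(half, k)).max(), 1e-30)
            if np.abs(full - half).max() / scale < tol or h < 1e-9:
                break
            h *= 0.5                                 # halve the step and retry
        m, t = full, t + h
        times.append(t)
        values.append(m.copy())
    return np.array(times), np.array(values)
```

For two spins this reduces to the closed form $(\sigma/\rho)[1 - \exp(-\rho t)]$, which gives a convenient check on the integrator.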
The corresponding value of the distance was then calculated from the estimate of $\sigma$ using Eq. [3], assuming a correlation time equal to the overall tumbling time. Second, the distances were calculated using a calibration NOE (8). For two-spin systems, the evolution of the NOE is given by

$$\mathrm{NOE}(t) = (\sigma/\rho)[1 - \exp(-\rho t)]. \qquad [5]$$

The ratio of two NOEs at time t is then

$$\mathrm{NOE}(t)/\mathrm{NOE}_c(t) = \sigma/\sigma_c = r_c^6/r^6, \qquad [6]$$
where c denotes the calibration value, and the exponentials have been expanded to first order. This also assumes that the correlation times of the unknown and calibration vectors are the same.

Internal Motions

To treat internal motions, it is necessary to invoke a specific motional model. One choice that has been used is a two-state jump model between extreme distances (16). While this might be adequate for relaxation of carbon by attached protons, it is unrealistic for pairs of protons. An alternative is to postulate simple harmonic motion in three dimensions, in which the amplitudes and frequencies in the x, y, and z directions are not necessarily equal. The amplitudes and relative frequencies are chosen to correspond to average motions expected in proteins (28-30). Hence, each spin is allowed to move under a pseudo harmonic potential as

$$x_i(t) = \bar{x}_i + a_i \sin(f_i t + \phi_i), \qquad [7]$$
where $a$ is the amplitude, $f$ is the frequency, and $\phi$ is a phase angle. It is unreasonable for all the spins to maintain a fixed phase relation, so that $\phi$ is allowed to vary randomly
using a random number generator. The distances between spins are calculated as a function of time, from which the average relaxation rate constants are calculated as

$$\langle\sigma\rangle = -\alpha\tau\langle r^{-6}\rangle. \qquad [8]$$
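A minimal sketch of the averaging in Eqs. [7] and [8] is given below. It is not the author's implementation. As a simplifying assumption, the total phase $f t + \phi$ of each coordinate is drawn uniformly at random, so the frequencies drop out of the average; the optional hard limits `r_min` and `r_max` stand in for the truncation of the oscillation. The function names, the sample count, and the unit convention for $\alpha$ (Å, ns, s⁻¹) are all hypothetical choices.

```python
import numpy as np


def mean_inverse_r6(x1, x2, amp1, amp2, r_min=None, r_max=None,
                    n_samples=20000, seed=0):
    """Monte Carlo estimate of <r^-6> for two spins oscillating as in Eq. [7].

    x1, x2     : mean positions (length-3, in Å)
    amp1, amp2 : amplitudes along x, y, z (scalar or length-3, in Å)
    r_min/max  : optional hard cutoffs applied to the instantaneous distance
    """
    rng = np.random.default_rng(seed)
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    # Assumption: uniformly random phase per spin, axis, and sample.
    theta1 = rng.uniform(0.0, 2.0 * np.pi, size=(n_samples, 3))
    theta2 = rng.uniform(0.0, 2.0 * np.pi, size=(n_samples, 3))
    p1 = x1 + np.asarray(amp1, dtype=float) * np.sin(theta1)
    p2 = x2 + np.asarray(amp2, dtype=float) * np.sin(theta2)
    r = np.linalg.norm(p1 - p2, axis=1)
    if r_min is not None or r_max is not None:
        r = np.clip(r, r_min, r_max)      # distances beyond a limit are set to the limit
    return float(np.mean(r ** -6.0))


def averaged_sigma(inv_r6, tau_ns, alpha=56.92):
    """Eq. [8]: <sigma> = -alpha * tau * <r^-6> (slow-tumbling limit)."""
    return -alpha * tau_ns * inv_r6
```

With zero amplitudes the function returns the rigid-body value $r^{-6}$; with nonzero amplitudes the average is generally weighted toward the shorter distances sampled.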
It is also necessary to take into account covalent constraints on the amplitudes. For example, the distance between an αCH and the NH varies only between the limits 2.26 and 3.06 Å. Provision is therefore made for respecting these limits. Similarly, the lower limit between protons separated by many covalent bonds is 2.0 Å. Upper limits can also be included. The upper and lower bounds on distances were simply set as cutoffs, so that any distance exceeding these bounds was set to the value of the limit. This is equivalent to a hard sphere model of the atoms. For those pairs of protons whose distances are fixed by the covalent structure, e.g., neighboring protons in aromatic rings, or methylene protons, the upper and lower bounds on the distance were both set equal to the appropriate value (2.47 and 1.78 Å).

Three proteins were chosen for their different secondary structure contents, numbers of amino acid residues, and correlation times. They were ST Ib, a 14 amino acid residue peptide (turn only, $\tau_R$ = 1.6 ns), BPTI (helix plus sheet, 56 residues, $\tau_R$ = 3 ns), and Mb (helix only, 150 residues, $\tau_R \approx 9$ ns).¹ All calculations were made on either an IBM 9000 or an Apple Macintosh computer. Simulations routinely included 40-80 protons.

RESULTS
Using the program NOEMOT, NOE time courses were generated for helical, sheet, turn, and coil segments either in ¹H₂O (i.e., amide protons present) or in ²H₂O, for different overall tumbling times. The effects of internal motions on NOE intensities were also examined. Different kinds of calculation were performed to test separate hypotheses about the magnitudes of NOEs that can be expected in macromolecules, under different conditions, and for different kinds of secondary structures.

Determination of Secondary Structures from Patterns of NOE Intensities: Influence of Spin Diffusion

One of the criteria for identifying secondary structure elements in proteins is the pattern of NOE intensities along the peptide backbone (4, 13). This approach is predicated on the expectation that because secondary structures such as α helices are regular and repeating, there should be regular patterns of interproton distances that are characteristic of the different secondary structure elements. This of course assumes that real secondary structure elements are indeed regular and that spin diffusion does not complicate the issue. NOE time courses for α helices, β sheets, β turns, and coil segments were calculated, using coordinates for BPTI, Mb, and ST Ib. These calculations were made as a function of overall correlation time to assess the possible importance of spin diffusion. Characteristic distances for the secondary structures are shown in Fig. 1.

¹ Abbreviations used: BPTI, bovine pancreatic trypsin inhibitor; ST Ib, fragment 6-19 of the heat-stable enterotoxin from Escherichia coli; Mb, myoglobin.
FIG. 1. Short proton-proton distances in peptides. The $d_{\alpha N}$ refers to the dipolar connectivity between the C$_\alpha$H of residue i and the amide proton of residue i + 1.
In a regular α helix, the shortest distances are between consecutive NH protons, and between the α proton of the ith residue and the NH of the i + 3 residue (4), whereas in a regular β strand, the shortest distances are between αCH$_i$ and NH$_{i+1}$. These patterns of NOEs are usually classified in terms of weak and strong, as in Table 1.

α helix. Table 1 shows that even in the α helix, which tends to be fairly regular, missing NOEs are likely for short mixing times. Nevertheless, at short mixing times, the systematic absence of $d_{\alpha N}$ connectivities, and the observation of stretches of $d_{NN}$ and $d_{\alpha N}(i,i+3)$ connectivities, would be interpreted as α helix, especially if other criteria (such as coupling constants and amide proton exchange rates (2-4, 32-B)) indicated the presence of secondary structure. At 50 ms, the strongest interresidue NOEs are less than 6%. At longer mixing times (250 ms would be rather long for myoglobin), the discrimination between strong and weak NOEs is less clear cut, because spin diffusion is beginning to add to the magnetization transfer between distant spins, whereas the NOEs between close spins (<3 Å) are approaching their steady-state values. Figure
TABLE 1
NOE Intensities in Myoglobin Helix 4

$d_{ij}$                   αH      T51    E52    A53    E54    M55    K56    A57    S58
$d_{\alpha N}$             w/w     w/s    o/o    o/w    o/w    w/m    w/w    w/m    s/s
$d_{NN}$                   m/s     o/w    m/s    m/s    w/w    m/s    s/s    m/s    w/w
$d_{\beta N}$              s/s(a)  m/m    m/m    m/m    w/s    s/s    m/m    m/m    s/s
$d_{\alpha N}(i,i+3)$      w/m     o/o    o/w    o/w    o/w    w/w    o/o    -      -
$d(i,i+2)$                 o/w     o/w    o/w    o/w    o/w    o/o    o/w    -      -
$d_{\alpha\beta}(i,i+3)$   s/s(a)  o/o    w/m    o/o    o/w    o/w    o/o    -      -

Note. Intensities as a function of mixing time were calculated as described under Methods. The correlation time was 9 ns; o = intensities <1%, w = 1-3%, m = 3-5%, s > 5%. The first entry is for a mixing time of 50 ms, and the second for a mixing time of 100 ms. The distances $d_{ij}$ are defined in the legend to Fig. 1. The helix runs from Thr 51 to Ala 57. The column headed αH denotes the intensities expected in a perfect α helix on the basis of distances and Eq. [4].
(a) Entries are for the closest approach of a methylene proton.
2A shows the NOE time courses for selected pairs of spins. The pronounced lag is clear evidence for the presence of spin diffusion. These spins are separated by distances as large as 5.2 Å (Glu 52 αH to Thr 51 βH). On the other hand, the NOE to Glu 52 NH, an intraresidue effect that can be expected to be strong, shows no evidence of a lag. Indeed, the spin-lattice relaxation rate constant for Glu 52 NH spins is large, leading one to expect saturation at short mixing times. The steady-state NOE would be 6% in the two-spin approximation. However, by 200 ms, the actual NOE is approximately twice as large as the expected steady-state value, even though the direct magnetization transfer has reached >97% of its final value. The continuing increase of the NOE must be attributed to magnetization transfer among the spins within the glutamate residue. In fact, the spins of glutamate are all within 4 Å of one another, so that magnetization transfer is quite efficient in a slowly tumbling molecule. Clearly, the NOE buildup curve could not be adequately fitted by Eq. [5]. This observation casts doubt on the wisdom of using calibration distances in determining accurate distances from NOEs, which are needed for determining the local structure.

Further, NOE simulations that use only a third spin to account for spin-diffusion pathways are inadequate, as the geometry of amino acid residues is such that several pathways of magnetization transfer are usually present. This is shown in Fig. 2B, where the apparent distance calculated as described under Methods is plotted against the known (crystallographic) distance. The longer distances (>4 Å) are significantly underestimated. The dependence of r(app) on r(X ray) is approximately linear and can be described as r(app) ≈ 1.3 + 0.6 r(X ray). In fact, the apparent cross-relaxation rate constant varies approximately as $r^{-4}$, instead of the expected $r^{-6}$. The precision of the distance determinations is approximately
[Figure 2: (A) NOE versus time/ms; (B) r(app) versus r(X ray)/Å.]

FIG. 2. Distances and NOEs in helix 4 of myoglobin. NOEs and distances were calculated as described under Methods. The correlation time was 9 ns, and the solvent was ¹H₂O. (A) Time course of the NOE on irradiation of the αH of Glu 52. (1) Glu 52 NH (r = 2.81 Å); (2) Ala 53 NH (r = 3.57 Å); (3) Met 55 NH (r = 3.75 Å); (4) Thr 51 βH (r = 5.21 Å); (5) Glu 54 NH (r = 4.77 Å). (B) Apparent distance (r(app)) versus input distance (r(X ray)). Only interresidue distances are included. The continuous line shows the expected dependence in the absence of spin diffusion. The broken line is a linear regression having slope 0.58 and intercept 1.35 (r = 0.895). (○) Distance evaluated at 50 ms; (●) distance evaluated at 100 ms.
±0.1 Å for (true) distances shorter than about 3.5 Å, decreasing to about ±0.2 Å at longer distances. The accuracy for the longer distances, however, is much less, with the error about -1 Å at r = 4 Å.

β strands. These strands are often considerably less regular than α helices (34), so that the potential for missing NOEs, or for observing anomalous intensities, is greater. Table 2 shows the NOE intensity patterns calculated for β strands in BPTI. The $d_{\alpha N}$ intensities remain much stronger than the other intensities even at 240 ms; spin diffusion in itself is unlikely to cause misassignment of this kind of secondary structure. Further, β sheets are characterized by close contacts between adjacent strands, leading to NOEs between them. At 120 ms, the interstrand NOEs are mostly less than 4%, and many more are observed at 240 ms. However, the twist and curving of the strands could be incorrectly determined by ignoring spin diffusion.
TABLE 2
NOE Intensity Patterns for β Strands in BPTI

$d_{ij}$         β       R17    I18    I19    R20    Y21    F22    Y23    N24    A25
$d_{\alpha N}$   s/s     s/s    s/s    s/s    s/s    s/s    s/s    s/s    s/s    w/w
$d_{NN}$         o/w     o/w    o/w    o/w    o/w    o/w    o/o    o/w    o/w    m/m
$d_{\beta N}$    w/m(a)  o/w    o/w    o/w    m/s    w/m    w/m    o/w    o/w    w/m

$d_{ij}$         β       K26    A27    G28    L29    C30    Q31    T32
$d_{\alpha N}$   s/s     w/m    w/w    w/m    m/s    s/s    s/s    s/s
$d_{NN}$         o/w     m/m    s/s    m/s    o/w    o/w    o/w    o/w
$d_{\beta N}$    w/m(a)  w/m    w/m    -      s/s    w/m    m/s    o/w
Interstrand NOES Mixing time/ms
NOES From
To
I18NH 119cy I19o R20NH RZONH R20NH Y2la Y21a Y21a Y2lcy
v43a v34a F33a T32d F33a v34a! c3op Q31NH T32~x F33~x
Mixing time/ms
NOES 240
From
To
W
W
W
m
0 0
W
W
m
W
W
0 0
W
m
S
W
W
F22NH F22NH F22NH F2NH Y23a Y23a Y23a N24NH N24NH N24NH
c3oa! c3ocy Q31NH T32a c3oCY c30/3 Q31NH L29NH c300 Q31NH
120
W
W
120
240 W W S W S W W W
m W
Note. The correlation time was 3 ns. The first entry is for a mixing time of 120 ms, and the second for 240 ms; o, <1%; w, 1-3%; m, 3-8%; s, >8%. The two antiparallel β strands are found from Arg 17 to Asn 24 and from Leu 29 to Val 34, separated by a reverse turn from Ala 25 to Gly 28. The column headed β gives the expected intensities for a perfect β strand, based solely on distances.
(a) This entry is for the closest distance of approach.
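The intensity classes used in the table notes can be expressed as a small helper. This is a sketch for illustration; the function names are hypothetical, and the handling of values that fall exactly on a class boundary (3% and 8% here) is an assumption, since the notes do not specify it. The defaults follow the Table 2 thresholds; passing `strong=5.0` gives the Table 1 thresholds.

```python
def classify_noe(intensity_percent, weak=1.0, medium=3.0, strong=8.0):
    """Map an NOE intensity (in %) to the table codes o, w, m, s.

    o: below `weak`; w: from `weak` up to `medium`;
    m: from `medium` up to and including `strong`; s: above `strong`.
    Boundary handling is an assumption. The sign is ignored, because
    NOEs in slowly tumbling molecules are negative.
    """
    x = abs(intensity_percent)
    if x < weak:
        return "o"
    if x < medium:
        return "w"
    if x <= strong:
        return "m"
    return "s"


def table_entry(noe_short, noe_long, **thresholds):
    """Format a pair of intensities (short/long mixing time) as in the tables."""
    return f"{classify_noe(noe_short, **thresholds)}/{classify_noe(noe_long, **thresholds)}"
```

For example, `table_entry(0.5, 2.0)` yields `o/w`, the code used for an NOE that only becomes observable at the longer mixing time.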
β turns. β turns are defined for only four consecutive residues. The two middle residues do not have hydrogen-bonded amide protons, so the other criteria for identifying the presence of secondary structure are not so clear cut (4, 35). Table 3 shows the simulated NOE intensity pattern for the second turn of BPTI. It is not obvious that a turn is present in this sequence from the NOE data alone (actually, it is from Ala 25 to Gly 28). Indeed, the reliability of detection of a reverse turn is open to question, because spin diffusion can be sufficient to affect the relative intensity of one or two of the small number of NOEs, so that the pattern may be indistinguishable from other secondary structures such as a distorted β strand, or a segment of coil. Similar remarks apply to the patterns obtained for the other turns in BPTI and the turn in ST Ib (27).

Coil. Coil is not defined except as peptide that is not identifiably helix, sheet, or turn. Any pattern of NOE intensities can therefore be observed. However, regular, repeating patterns of intensities are not expected, as then the region would be a repetitive structure, and presumably identifiable as such. Over short stretches, though, the NOE patterns might sufficiently resemble those of defined secondary elements as to introduce ambiguity into the assignments. Table 4 shows a pattern of intensities for a segment of the peptide ST Ib (4) that has no identifiable secondary structure, and a short stretch of BPTI for which the φ, ψ angles are inconsistent with a regular secondary structure. In the former instance, the pattern of NOE intensities would be consistent with an α helix, even though the φ, ψ angles are nowhere near those typical of α helices, whereas in the latter, the intensities resemble those of extended chain and could conceivably be identified as β strand using this criterion alone. Wüthrich and co-workers have repeatedly stressed (3, 4, 35) that NOE intensity patterns should not be used alone for identifying secondary structures, but should be used in conjunction with coupling constants and amide exchange rates.
Long-Range NOEs in Myoglobin in ²H₂O: Effect of Overall Correlation Time

NOEs between protons on residues far apart in the amino acid sequence are critical for providing distance constraints for building tertiary structures. An example of long-range NOE constraints was given above for two β strands in BPTI.

TABLE 3
NOE Intensities for Turn 2 in BPTI
$d_{ij}$         N24    A25    K26    A27    G28    L29
$d_{\alpha N}$   s/s    o/w    w/m    w/w    w/w    w/m
$d_{NN}$         o/o    w/m    w/m    s/s    w/m    o/o
$d_{\beta N}$    o/o    o/w    w/w    w/w    -      s/s
Other            (a)    w(b)   o/o    w(c)   o/o

Note. The distances $d_{ij}$ are defined in the legend to Fig. 1. The NOE intensities are as in Table 2. The first entry is for 65 ms, the second for 130 ms. The reverse turn is from Ala 25 to Gly 28.
(a) N24NH to L29NH (w/w).
(b) A25α to G28NH (w).
(c) A27NH to L29NH (w).
TABLE 4
NOE Intensity Patterns for Coil Segments

BPTI
$d_{ij}$         L6     E7     P8     P9     Y10    T11
$d_{\alpha N}$   w/m    s/s    s/s    s/s    s/s    o/w
$d_{NN}$         m/s    o/o    o/o    o/o    o/w    m/s
$d_{\beta N}$    o/w    w/m    w/m    w/w    o/w    o/w

ST Ib
$d_{ij}$         C1     C2     E3     L4     C5
$d_{\alpha N}$   m/s    s/s    o/m    o/m    o/m
$d_{NN}$         m/s    o/w    m/s    s/s    m/s
$d_{\beta N}$    o/w    w/w    w/s    m/s    -

Note. The distances $d_{ij}$ are defined in the legend to Fig. 1. First entry 60 ms, second entry 240 ms. For proline residues, the δ protons play the role of NH protons. The correlation time was 3 ns. NOE intensities are defined as in Table 2.
Figure 3 shows representative time courses of the NOE buildup between residues on helices A and H of Mb (long-range NOEs). These curves clearly show lag phases that can be attributed to spin diffusion. The NOEs arise from spins that are separated by more than about 3.5 Å. However, they are observable only at relatively long mixing times, and as soon as they become observable, a significant fraction of the NOE intensity arises from spin diffusion. Indeed, the appearance of NOEs at the longer mixing times occurs between spins that are separated by as much as 8 Å, so that the qualitative argument that the presence of an NOE implies a distance of <5 Å is not necessarily
[Figure 3: NOE versus time/ms.]

FIG. 3. NOE buildup curves for residues in helices 1 and 8 in myoglobin. The correlation time was 9 ns, solvent ²H₂O. The irradiated proton was the αH of Glu 6. The residues included were Gln 8, Leu 9, Asp 126, Ala 127, Gln 128, Ala 130, Met 131, and Asn 132. (1) Ala 129 αH (r = 4.70 Å); (2) Lys 133 εH (r = 7.16 Å); (3) Lys 133 βH (r = 7.94 Å). The NOE intensities to Ala 129 αH have been divided by 5.
valid. The NOEs between Glu 6 αH and the side-chain protons of Lys 133 are all approximately the same, showing that spin diffusion occurs along the side chain and is effective by 300 ms for molecules the size of myoglobin.

Plots of the apparent distance versus crystallographic distance similar to Fig. 2B were constructed (not shown). Even at 50 ms, the plot is not a straight line of unit slope. The data for short distances are slightly overestimated, whereas for long distances, they are significantly underestimated. Further, the scatter becomes larger at longer distances. Most of the observed NOE intensity for large distances arises from spin diffusion, hence leading to an underestimate of the true distance. As expected, the error is more severe for longer correlation times. For example, at a correlation time of 2 ns, the error due to spin diffusion is negligible (<0.1 Å) for a mixing time of 50 ms, whereas at a correlation time of 10 ns, NOEs at a true distance of 4.4 Å give apparent distances of about 3.4 Å. A substantially shorter mixing time would be necessary to reduce the error. However, because most of the intensity arises from spin diffusion, the NOEs would disappear at much shorter mixing times, so that the information about proximity would be lost, and the constraints on the tertiary structure correspondingly poorer.
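To make the size of this effect concrete, the $r^{-6}$ relation of Eq. [6] can be inverted to ask how much extra intensity an apparent distance implies. The short sketch below is illustrative only; the function names are hypothetical, and it assumes equal correlation times for the two vectors, as Eq. [6] does.

```python
def apparent_distance(noe, noe_cal, r_cal):
    """Eq. [6] solved for r: r = r_c * (NOE_c / NOE)^(1/6)."""
    return r_cal * (noe_cal / noe) ** (1.0 / 6.0)


def intensity_excess(r_true, r_app):
    """Factor by which the NOE exceeds the direct two-spin value
    when a true distance r_true is read as r_app."""
    return (r_true / r_app) ** 6


# A true distance of 4.4 Å read as 3.4 Å corresponds to an NOE roughly
# 4.7 times larger than direct transfer alone would give.
print(round(intensity_excess(4.4, 3.4), 2))
```

On this reading, most of the intensity of such an NOE is relayed rather than direct, which is consistent with the statement above that these NOEs would vanish at much shorter mixing times.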
The Effect of Internal Motions on NOE Intensities

Internal motions have two effects on relaxation rate constants. First, the internuclear distance is no longer constant during the measurement, leading to averaging of the relaxation rate constants, and second, the correlation time for internuclear vectors will in general be different for each pair of spins, and also smaller than the overall tumbling time. These two effects are to some extent opposed, though the dependence of the relaxation rate constants on distance is much stronger than on the correlation time. Internal motions can decrease the value of $r^{-6}$, increase it, or have no effect, depending on the nature of the motion and the local geometry. It is important, therefore, to take into account the effects of anisotropic motions, with truncation effects, as they are likely to occur in real proteins.

Motions much slower than the overall tumbling time are common in proteins (32). If the motions are faster than the overall tumbling time of the molecule, then the effective correlation time will be reduced, to some extent offsetting the effect of motions on the distance. Calculations were therefore made using averaged cross-relaxation rate constants using a correlation time equal to that for overall tumbling (i.e., slow motions), or with correlation times shorter than the overall tumbling time (rapid motions). The effect of motions on the correlation time of interproton vectors is difficult to assess, though there is some evidence in the literature that the effect can be quite small. For example, the correlation times of the interproton vectors connecting the ring protons of surface tyrosine residues in RNase (21) and ST Ib (27), and of fixed-length vectors in lysozyme (25, 38), suggest that the effective correlation time is 40% or more of the overall tumbling time.

It is also possible to place lower limits on the correlation times from observed cross-relaxation rate constants. The upper and lower internuclear separation is determined by the covalent structure and van der Waals interactions (about 2 Å for protons). Thus, for protons separated by a small number of covalent bonds, the upper and lower possible distances can be decided by simple geometry. For example, the upper and lower limits for the distances between an αCH and the
amide proton are 2.25 and 3.06 Å. This allows upper and lower limits of the cross-relaxation rate constants to be defined, provided that the maximum correlation time is known. For a nearly spherical molecule, this must be the overall tumbling time. Hence, the ratio of the observed cross-relaxation rate constant to the maximum possible is given by

\[ \sigma/\sigma_{\max} = (\tau/\tau_R)\,(r_{\min}/r)^{6} \qquad [9] \]

where τ_R is the overall tumbling time. The minimum value of the ratio τ/τ_R occurs when r = r_min, so that the ratio σ/σ_max gives an estimate of the lowest possible value of τ. Applying this procedure to DNA fragments shows that the lower limit of the correlation time for D-ribose vectors is about 40-50% of the overall tumbling time (37, 39).

Figure 4 shows the effects of motion on the NOE intensity patterns for turn 3 of BPTI, for fast and slow motions. The slow motions (dynamic model 1) affect only internuclear distances. If the motions do not affect the correlation times significantly, the NOE intensities can be significantly altered compared with a rigid body. However, if the motion also decreases correlation times, then the “static” pattern is largely restored. Decreasing the correlation times for the atoms distant from the peptide backbone increases the scatter but, on average, brings the apparent distances closer to those obtained for a rigid molecule. Similar results were obtained for other secondary structures, and for interhelix NOEs in Mb. For the coil, the intensity patterns were different for each of the simulations, probably because there was little spin diffusion in the static simulation, so that the effects of motions dominate the NOE intensities.
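The bound implied by Eq. [9] can be sketched numerically. The Python fragment below is a minimal illustration and not the code used in this study. It assumes the slow-tumbling limit, in which the cross-relaxation rate constant is proportional to τ/r⁶; the function names and the example rate constants are hypothetical.

```python
def min_correlation_time(sigma_obs, sigma_max, tau_r):
    """Lower limit of the effective correlation time from Eq. [9].

    Setting r = r_min in Eq. [9] gives tau_min / tau_R = sigma_obs / sigma_max.
    sigma_max is the rate constant expected for r = r_min and tau = tau_R.
    Magnitudes are used because both rate constants are negative for
    slowly tumbling molecules.
    """
    ratio = abs(sigma_obs) / abs(sigma_max)
    if ratio > 1.0:
        raise ValueError("observed rate exceeds the assumed maximum")
    return ratio * tau_r


def distance_overestimate(tau_over_tau_r):
    """Factor by which a distance is overestimated when tau_R is assumed
    although the effective correlation time is tau < tau_R.

    Assumes sigma proportional to tau / r**6, so r_apparent = r * (tau_R / tau)**(1/6).
    """
    return (1.0 / tau_over_tau_r) ** (1.0 / 6.0)


# Hypothetical numbers: observed rate 45% of the maximum, tau_R = 3 ns.
tau_min = min_correlation_time(-0.45, -1.0, 3.0e-9)
print(f"lower limit of tau: {tau_min * 1e9:.2f} ns")
print(f"worst-case distance factor: {distance_overestimate(0.45):.3f}")
```

Because the distance enters as the sixth root, even a correlation time that is half the overall tumbling time changes the apparent distance by only a modest factor.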
DISCUSSION
NMR is being applied to the determination of the structures of increasingly large proteins (1-4, 35, 40-43). The rate of spin diffusion increases as the tumbling time of

[Figure 4. Horizontal axis: r(xray)/Å.]
FIG. 4. Effect of internal motions on the apparent distances in BPTI turn 3. Interresidue distances were evaluated at 100 ms. Solvent = ¹H₂O. (○) No internal motions, τ_R = 3 ns; (●) all correlation times = 3 ns, αCH and NH amplitudes = 0.5 Å, side-chain atoms 1.0 Å; (+) same amplitudes, but correlation times = 3 ns for αCH and NH, 2.0 ns for βCH₂, and 1.5 ns for other protons. The continuous line is of unit slope.

the molecule increases and hence will become more evident in the larger molecules. It is therefore necessary to evaluate the extent of spin diffusion in proteins for a range of correlation times. The simulations presented under Results demonstrate several points. The first is that spin diffusion does not in general affect the NOE intensity patterns characteristic of secondary structures, at least at the mixing times likely to be employed on proteins of moderate size (36). Spin diffusion is important for the estimation of distances between protons that are >3 Å apart, but not for protons separated by shorter distances. The reason for this is probably that for short interproton distances, the direct dipolar interaction is relatively strong, and the probability of finding other protons between the two of interest is small, owing to constraints of the covalent structure and van der Waals volumes. On the other hand, the probability of there being other spins between two protons separated by more than about 3 Å is relatively high, so that a substantial fraction of the magnetization transfer is relayed by the intervening spin(s).

Also evident from the calculations presented here is that NOEs cannot be reliably calculated using three or four spins (8, 12, 16). All of the spins in a sphere of radius about 6 Å should be included in such calculations, as otherwise the result may be significantly in error. The errors involved in estimating long distances can be very severe, about 1 Å at a true distance of 5 Å. Further, this error is made worse if the distances are estimated by the calibration method (Eq. [5]). This is because the weak NOEs at long distances give sigmoidal buildup curves, whereas the calibration NOEs tend to saturate, thereby effectively underestimating the calibration NOE, and therefore the apparent distance.
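The relay effect described above can be illustrated with a toy calculation. The sketch below is not the algorithm of this paper: it uses the standard rigid-body, isotropic-tumbling expressions for dipolar auto- and cross-relaxation, and every parameter in it is an assumption chosen for illustration (three collinear protons A, B, and C spaced 2.5 Å apart, a 3 ns correlation time, a 500 MHz spectrometer frequency, and a sixth-root calibration against the A-B pair, which is assumed to be the usual form of the calibration method). Only three spins are used to keep the example short, although the text above notes that a realistic calculation needs all spins within about 6 Å.

```python
import numpy as np

# (mu0/4pi)^2 * gamma_H^4 * hbar^2 / 10, expressed in Angstrom^6 s^-2
K = 5.70e10


def spectral_density(omega, tau):
    """Lorentzian spectral density for isotropic rigid tumbling."""
    return tau / (1.0 + (omega * tau) ** 2)


def relaxation_matrix(coords, tau, omega):
    """Relaxation matrix for like spins (rates in s^-1, coordinates in Angstrom)."""
    coords = np.asarray(coords, dtype=float)
    n = len(coords)
    j0, j1, j2 = (spectral_density(k * omega, tau) for k in (0, 1, 2))
    rate = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r6 = np.linalg.norm(coords[i] - coords[j]) ** 6
            rate[i, j] = K * (6.0 * j2 - j0) / r6              # cross-relaxation
            rate[i, i] += K * (j0 + 3.0 * j1 + 6.0 * j2) / r6  # auto-relaxation
    return rate


def noe_matrix(rate, t_mix):
    """exp(-R * t_mix), via eigendecomposition of the symmetric matrix R."""
    evals, evecs = np.linalg.eigh(rate)
    return evecs @ np.diag(np.exp(-evals * t_mix)) @ evecs.T


def apparent_distance(noe, noe_ref, r_ref):
    """Sixth-root calibration against a reference pair of known distance."""
    return r_ref * (noe_ref / noe) ** (1.0 / 6.0)


# Assumed geometry: A, B, C on a line, so the true A-C distance is 5.0 Angstrom.
coords = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
rate = relaxation_matrix(coords, tau=3.0e-9, omega=2.0 * np.pi * 500.0e6)

for t_mix in (1.0e-4, 0.1):  # mixing times in seconds
    a = noe_matrix(rate, t_mix)
    r_app = apparent_distance(a[0, 2], a[0, 1], r_ref=2.5)
    print(f"t_mix = {t_mix:g} s: apparent A-C distance = {r_app:.2f} Angstrom")
```

With these assumed parameters the apparent A-C distance is close to 5 Å at the very short mixing time and falls to about 4 Å at 100 ms, because most of the A-C cross peak is then relayed through B while the A-B calibration peak has already begun to level off.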
Figure 4 shows that internal motions slower than the Larmor frequency cause an increase in the observed NOE for large distances, and the effect can be quite severe for large molecules. On the other hand, if the motions are fast on the Larmor time scale, the correlation times become smaller than the overall tumbling times, and this offsets somewhat the effects of averaging distances, albeit to a degree that is difficult to imagine without calculating the NOE time courses.

What does this mean for determining the structure of proteins in solution using NMR? The determination of tertiary structures requires that a large number of NOEs between residues far apart in the amino acid sequence be obtained. In practice, the number of such NOEs is relatively small. Lichtarge (43) has shown that in myoglobin, there are only about 150 such NOEs (r < 5 Å), and further, the quality of the structure determination would not be greatly improved by including more long-range NOEs. Nevertheless, distance geometry calculations using distance constraints from NOE data are quite successful in determining the folding patterns of proteins (2-5, 44). These calculations do not use quantitative estimates of distances, but rather ranges based on relative intensities of NOEs, as in Tables 1-4. The results of such calculations suggest rms deviations of about 2 Å for backbone atoms, and around 4 Å for side-chain atoms. This yields a good picture of the global fold of the protein. However, the individual torsion angles (φ, ψ) are relatively poorly determined (2, 44), because the distance ranges used in the calculations are almost as large as the maximum possible over the entire allowed ranges of these angles. Clearly, a better determination of the structure at a local level would require both higher precision and higher accuracy of
the distances used. In the study of enzyme function, it is often more important to know with great confidence the local structure and dynamics, especially in the region of the substrate binding site, than the overall conformation.

The accuracy of the distances can be improved by specifically accounting for spin diffusion (6, 10, 19, 20). To do this, a reasonable estimate of the global fold is required, so that the structure can be iteratively refined against the NOE intensities. One method for doing this has been described for the structure of the ST Ib peptide (27). This approach is similar to that used by protein crystallographers (real-space refinement). Neglecting internal motions leaves a residual uncertainty in precision and accuracy, which for modest-amplitude motions is likely to be around ±0.2 Å. However, in many instances, an estimate of the maximum error introduced by the assumption of rigid body rotation can be obtained by comparing the observed cross-relaxation rate constant to the maximum possible cross-relaxation rate constant, as this yields a lower limit to the effective correlation time (cf. Eq. [9]). Available estimates indicate that this will lead to an overestimate of the distance of about 10%, which may be tolerable for many purposes.

All of these considerations are general; that is to say, there will be specific instances where distance estimates are incorrect by much more than the ranges already quoted. While these can be expected to be relatively rare, any refinement algorithm must be robust to a few “rogue” data points. Spin diffusion is sure to occur to a greater or lesser extent, especially for larger proteins. Instead of trying to eliminate it, or simply ignoring it, the transfer of magnetization through the dipolar network might be used to advantage.
Hence, rather than discarding all NOE time courses that show a lag, proper analysis of the time course should allow additional relationships between spins to be determined, thereby improving the quality of the structure determination. Spin diffusion permits relationships to be discovered between spins that are far apart (cf. Fig. 4), thereby increasing the range of the information about structure. It may also mean the difference between an underdetermined and an overdetermined structure.

ACKNOWLEDGMENTS

I thank Drs. O. Jardetzky, J. Feeney, M. Forster, and J.-F. Lefèvre for their interest in and encouragement of this work. I thank, too, Drs. O. Lichtarge and F. Frayman for providing coordinates of the protons of the proteins used in this study.

REFERENCES

1. R. KAPTEIN, E. R. P. ZUIDERWEG, R. M. SCHEEK, R. M. BOELENS, AND W. F. VAN GUNSTEREN, J. Mol. Biol. 182, 170 (1985).
2. M. P. WILLIAMSON, T. F. HAVEL, AND K. WÜTHRICH, J. Mol. Biol. 182, 295 (1985).
3. A. D. KLINE, W. BRAUN, AND K. WÜTHRICH, J. Mol. Biol. 189, 377 (1986).
4. K. WÜTHRICH, “NMR of Proteins and Nucleic Acids,” Wiley, New York, 1986.
5. D. R. HARE, L. SHAPIRO, AND D. J. PATEL, Biochemistry 25, 7445 (1986).
6. S. W. FESIK, T. J. O’DONNELL, R. T. GAMPE, AND E. T. OLEJNICZAK, J. Am. Chem. Soc. 108, 3165 (1986).
7. A. BRÜNGER, G. M. CLORE, A. M. GRONENBORN, AND M. KARPLUS, Proc. Natl. Acad. Sci. USA 83, 3801 (1986).
8. G. M. CLORE AND A. M. GRONENBORN, J. Magn. Reson. 61, 158 (1985).
9. R. E. KLEVIT AND E. B. WAYGOOD, Biochemistry 25, 7774 (1986).
10. O. JARDETZKY, A. N. LANE, J.-F. LEFÈVRE, O. LICHTARGE, B. HAYES-ROTH, AND B. BUCHANAN,
“NMR in the Life Sciences” (E. M. Bradbury and C. Nicolini, Eds.), p. 49, Plenum, New York, 1986.
11. A. ABRAGAM, “Principles of Magnetic Resonance,” Clarendon Press, Oxford, 1978.
12. G. WAGNER AND K. WÜTHRICH, J. Magn. Reson. 33, 675 (1979).
13. M. BILLETER, W. BRAUN, AND K. WÜTHRICH, J. Mol. Biol. 155, 321 (1982).
14. D. E. WEMMER AND B. R. REID, Annu. Rev. Phys. Chem. 36, 105 (1985).
15. A. A. BOTHNER-BY AND J. H. NOGGLE, J. Am. Chem. Soc. 101, 5152 (1979).
16. J. W. KEEPERS AND T. L. JAMES, J. Magn. Reson. 57, 404 (1984).
17. A. KALK AND H. J. C. BERENDSEN, J. Magn. Reson. 24, 343 (1976).
18. C. M. DOBSON, E. T. OLEJNICZAK, F. M. POULSEN, AND R. G. RATCLIFFE, J. Magn. Reson. 48, 97 (1982).
19. W. MASSEFSKI AND P. H. BOLTON, J. Magn. Reson. 65, 526 (1985).
20. E. T. OLEJNICZAK, R. T. GAMPE, AND S. W. FESIK, J. Magn. Reson. 67, 28 (1986).
21. A. N. LANE, J.-F. LEFÈVRE, AND O. JARDETZKY, J. Magn. Reson. 66, 201 (1986).
22. G. LIPARI AND A. SZABO, J. Am. Chem. Soc. 104, 4546 (1982).
23. G. LIPARI AND A. SZABO, J. Am. Chem. Soc. 104, 4559 (1982).
24. M. S. BROIDO, T. L. JAMES, G. ZON, AND J. W. KEEPERS, Eur. J. Biochem. 150, 117 (1985).
25. E. T. OLEJNICZAK, C. M. DOBSON, M. KARPLUS, AND R. M. LEVY, J. Am. Chem. Soc. 106, 1923 (1984).
26. D. R. KEARNS, CRC Crit. Rev. Biochem. 15, 237 (1984).
27. J. GARIÉPY, A. N. LANE, F. FRAYMAN, D. J. WILBUR, W. ROBIEN, G. SCHOOLNIK, AND O. JARDETZKY, Biochemistry 25, 7854 (1986).
28. J. C. HOCH, C. M. DOBSON, AND M. KARPLUS, Biochemistry 24, 3831 (1985).
29. R. M. LEVY, R. P. SHERIDAN, J. W. KEEPERS, G. S. DUBEY, S. SWAMINATHAN, AND M. KARPLUS, Biophys. J. 48, 509 (1985).
30. M. LEVITT, J. Mol. Biol. 168, 621 (1983).
31. A. PARDI, M. BILLETER, AND K. WÜTHRICH, J. Mol. Biol. 180, 741 (1984).
32. G. WAGNER, Q. Rev. Biophys. 16, 1 (1983).
33. S. W. ENGLANDER AND N. KALLENBACH, Q. Rev. Biophys. 16, 521 (1984).
34. J. S. RICHARDSON, “Protein Folding” (R. Jaenicke, Ed.), p. 41, Elsevier, Amsterdam, 1980.
35. G. WAGNER, D. NEUHAUS, E. WÖRGÖTTER, M. VASAK, J. H. R. KÄGI, AND K. WÜTHRICH, J. Mol. Biol. 187, 131 (1986).
36. W. J. CHAZIN, K. WÜTHRICH, S. HYBERTS, M. RANCE, W. A. DENNY, AND W. LEUPIN, J. Mol. Biol. 190, 439 (1986).
37. G. M. CLORE AND A. M. GRONENBORN, FEBS Lett. 172, 219 (1984).
38. E. T. OLEJNICZAK, F. M. POULSEN, AND C. M. DOBSON, J. Am. Chem. Soc. 103, 6574 (1981).
39. J.-F. LEFÈVRE, A. N. LANE, AND O. JARDETZKY, Biochemistry 26, 5056 (1987).
40. A. J. WAND AND S. W. ENGLANDER, Biochemistry 25, 1100 (1986).
41. S. J. HAMMOND, B. BIRDSALL, M. S. SEARLE, G. C. K. ROBERTS, AND J. FEENEY, J. Mol. Biol. 188, 81 (1986).
42. W. BRAUN, G. WAGNER, E. WÖRGÖTTER, M. VASAK, J. H. R. KÄGI, AND K. WÜTHRICH, J. Mol. Biol. 187, 125 (1986).
43. O. LICHTARGE, Ph.D. thesis, Stanford University (1986).
44. G. WAGNER, W. BRAUN, T. F. HAVEL, T. SCHAUMANN, N. GO, AND K. WÜTHRICH, J. Mol. Biol. 196, 611 (1987).