J. Mol. Biol. (1983) 168, 621-657
Molecular Dynamics of Native Protein II. Analysis and Nature o f Motion MICHAEl, LEVITT
Department of Chemical Physics Weizmann Institute of Science Rehovot, Israel (Received 4 November 1982, and in revised form 19 February 1983) The 132 picosecond simulation of atomic motion in bovine pancreatic trypsin inhibitor protein generated in the accompanying paper is analysed here using a variety of different methods. Together, these techniques, many of which have been used before in analyses of protein co-ordinate refinement, give a complete and comprehensible description of the trajectory. Some highlights of the simulation are as follows. (1) The atoms vibrate about a time-averaged conformation t h a t is close to the X-ray structure (within l ' l A root-mean-square deviation for the main-chain of all residues except the first and last two). The vibration amplitude is least for main-chain atoms in a-helix or fl-sheet secondary structure and most for side-chain atoms in the charged polar side-chains (Asp, Glu, Lys and Arg). The overall extent and distribution of atomic motion is in agreement with the temperature factors derived from the X-ray refinement: the reorientation of bond vectors is much less than observed by nuclear magnetic resonance. (2) The protein explores four distinct regions of conformational space in the 132 picoseconds simulated. The conformational change from region I I I to IV and back again lasts 40 picoseconds and is of particular interest as it is reversible and involves an increase in the hydrogen bond energy. (3) The changes in main-chain torsion angles show the expected cooperativity of ~bi+1 and r side-chains t h a t are close in space also change their conformational angles in unison. ( 4 ) H y d r o g e n bonds are variable and many break and reform again in the 132 picoseconds. Certain hydrogen bonds are much less stable than others; with particular variability seen in the a-helices and at the ends of the fl-hairpin. Most noticeable are the co-operative changes of hydrogen bonds at both ends of the fl-hairpin t h a t occur in going from region I I I to IV of the conformational space. (5) The overall solvent-accessible area remains close to t h a t of the X-ray structure but polar charged residues become less exposed while non-polar hydrophobic residues become more exposed. Together these results give a conceptual model for protein dynamics in which the molecule vibrates about a particular conformation but then suddenly changes conformation, jumping over an energy barrier into a new region of conformational space.
1. Introduction The folded structures of protein molecules are not easily comprehended. Even the smallest protein molecule has many hundreds of atoms that are arranged into an (i2I 0022-2836/83/230621-37 $03.00/(~
9 1983 Academic Press Inc. (London) Ltd.
622
M. LEVITT
intricate three-dimensional structure. These difficulties are much more severe when one tries to comprehend the nature of protein dynamics. The hundreds of atoms are now moving and the entire three-dimensional structure is undergoing subtle conformational changes. The method of molecular dynamics that is used to simulate protein motion does not automatically make it ally easier to understand the nature of the motion. For bovine pancreatic trypsin inhibitor, the small 58-residue protein that is the subject of this study, the raw results of the 132 picosecond simulation are 66,000 sets of 1581 atomic co-ordinates defining the positions of the 527 atoms considered. Storing this information on 14 computer tapes presents administrative problems; comprehensive analysis is extremely laborious and requires programs of considerable sophistication. In the accompanying paper (Levitt, 1983a), it was shown that trajectories generated with two potential energy functions, which differ in their treatment of hydrogen bonds, remain close to the native BPTIr X-ray structure (Huber et al., 1971; Deisenhofer & Steigemann, 1975). Here one of the trajectories, which was continued for 132 picoseconds, is subjected to an in-depth analysis using a battery of different techniques. Some of the techniques used have been used before in the analyses of conformational changes produced by protein refinement; others have been used to study folded conformations. These techniques include: (l) two-dimensional projections of the conformational space (Diamond, 1974); (2) convergent energy minimisation from points along the trajectory (Levitt, 1980); (3) distributions of (~, r main-chain torsion angles and (~+1, ~bi) shift plots: (4)tabulation of changes in side-chain conformations; (5)tabulation of changes in hydrogen bonds with schematic drawings of hydrogen bonding patterns; and (6) tabulation of changes in accessible areas for the entire molecule, classes of residues, and individual residues (Lee & Richards, 1971; Chothia, 1976). Taken together the results of these analyses give a new conceptual model for protein dynamics in which the molecule vibrates like a solid about a particular conformation but then suddenly changes conformation jumping over an energy barrier into a new region of conformational space. The conformational changes are small but involve co-operative motions of both side-chains and the backbone; hydrogen bonds also break and reform co-operatively. The potential energy surface is envisaged as a rather uneven egg tray with many more or less equivalent minima.
2. M e t h o d s (a) Analysing conformations The analysis of the tens of thousands of conformations generated by molecular dynamics presents formidable problems. The use of computer graphics and animated films enables a wide audience to comprehend the qualitative features of a simulation (Levitt & Feldmann, 1981). More quantitative analysis depends on the use of a broad range of different methods.
r Abbreviations used: BPTI, bovine pancreatic trypsin inhibitor; r.m.s., root-mean-square.
N A T U R E OF P R O T E I N DYNAMICS
623
(i) Co-ordinate deviations The deviation of a set of co-ordinates from the X-ray co-ordinates or any other set of coordinates provides an indispensable measure of the conformational changes caused by energy minimisation and molecular dynamics. Here conformations are compared using a positional deviation, Ar, and a distance deviation Ad. Calculation of Ar requires that the 2 sets of co-ordinates compared first be superimposed so as to overlap as well as possible9 Here this is done using matrix algebra that has received a surprising a m o u n t of attention (Eckhart, 1935; McLachlan, 1972,1979: Nyburg, 1974; Kabsch, 1976). The superimposition depends on which atoms are selected; here the positions of all the 454 non-hydrogen atoms of BPTI are always used9 Once superimposed, the positions of sets of selected atoms can be compared directly using: N~
zl, 's = { l l N s
(r~-4)2} *,
~
(l)
Atoms k selected
where r~ is the atomic position of atom k in co-ordinate set i, r~ is the position of the corresponding atom in set j and N s is the number of selected atoms. Here 4 selections of atoms are used: Ar~ all 454 non-hydrogen atoms; Ar~ 426 atoms of residues 3 to 56; Ari-j all 58 C~ atoms: Ar/4 54 C~ atoms of residues 3 to 56. (The 4 water molecules were not included in the comparison 9 Calculation of Ad, the distance deviation, can be done without superimposing the coordinate sets: Ad :i = {l/Nkl ~ k>l
~ (d~-d~)2} -~,
(2)
I
where d~l is the distance between atoms k and 1 in co-ordinate set i, d~l is the corresponding distance in co-ordinate set j and Nkt is the total number of such distances. Here 3 sets of distances are used. Ad~./.all 57 x 58/2= 1653 inter-C~ distances; Ad~i, 1431 inter-C~ distance IJ of residues 3 to 56; AdM(<41 all distances between main-chain atoms such that either d~,I or 9 ij d~,t are less than 4 A (about 2400 distances). The r.m.s, deviation, AdM(<4 ) is insensitive to overall changes in the protein conformation and measures how the local conformation changes (Levitt, 1980). These measures of r.m.s, deviation are usually used to compare a particular set of co-ordinates to the X-ray structure (called X) and Ar xi is then written simply as Ari or At. (it) Projecting co-ardiuate spaces When comparing several different conformations of a protein molecule, it is very convenient to have a 2-dimensional representation of the conformational space. For m such conformations, any of the measures of r.m.s, deviation given in the previous section can be used to give the set of m • ( m - 1)/2 deviations between all pairs i and j. If this deviation, denoted generally as Dii, is taken to be the distance between conformations i and j, it can be used to derive a 2-dimensional representation of the conformational space. One such representation is obtained by projecting the space onto the best plane through the conformations. Another is obtained by finding points (Pl, ql) in 2 dimensions such that the distances between the points i and j approximates the actual deviation Dir Both methods are described below. The best-plane projection was used by Diamond (1974) to illustrate the effect of crystallographic refinement on sets of atomic co-ordinates. In that study, these coordinates, denoted as the 3N-dimensional vector R i for structure i, are used to calculate the matrix of dot products S, where s i i = R i . R i. Here it is shown how Sii can be derived from the distances Dii without using the R' vectors themselves 9 First write: D 2 = ] R ~ _ RJ]2 = iRi[2 _ 2 R i . R.i + IRJ]2 = [Ril 2 + ]R~]2 _ 2Sij ' which shows that the value of IRil 2 is needed if Sii is to be calculated from Dij. By choosing
624
M. L E V I T T m
the origin of the space to be at the center of the conformations so that ~ R i = 0 , we sum Di~ J over all j values to get: m
m
Y D~j= ~ j=l
m
IRiI2-2Ri •
j=l
m
m
RJ§ ~ ]RJl2--mlRq2+ ~ IaJ[2.
j=l
j=l
i=1
m
Writing (R2~ =--1 ~ IRJl2 gives: l~l j = l
IRiI2=(1/m)
D
- ( R 2 ) , and summing this over all i values gives:
.i 1
i
1
j=l
Because Igi[2 can now be expressed only in terms of the Dij values, the Sij values can be calculated using:
Sii = [[R;[a + [RJl2 - D~]/2. The 2-dimensional representation of the m conformations is obtained by first diagonalising S to give ATAA, and then taking pi=A~Ail and qi=A~Ai2 as the co-ordinates of conformation i projected onto this plane (A 1 is the large eigenvalue of S and A 2 is the second largest). Although this method gives an accurate projection of the conformational space, it does not provide the most useful 2-dimensional representation. In such a representation the distances in 2-dimensions between conformations i and j should be as similar as possible to the actual co-ordinate deviations Di1 between these conformations. This suggests a more direct method for deriving the most useful 2-dimensional representation. The pairs of co-ordinates of each point (Pi, qi) are treated as parameters that are derived by a least-squares fit of the 2-dimensional distance [(p;_pj)2 + (q;_qj)2]89 to the actual distance D,j. Here this is done by using the variable metric minimisation method VA09D (Fletcher 1970, taken from the Harwell subroutine library) to minimise the total residual error defined as: m
m
~f {Dij --[(pi--])i) 2 + (qi--qi)2[89 2. i j
(3)
In such an optimization, the derived (p, q) values depend on the initial values. Here I0 completely random sets of initial (p, q) values were used and the set with the lowest residual error after optimisation was selected as the best 2-dimensional representation. For about 50 compared conformations, the error after optimisation is typically half of that after projection onto the best plane. Derivation of the projection by minimising the residual error is therefore used here. I t should be noted that projections can be made into subspaces of any dimension. For m sets of co-ordinates, the ( m - 1)-dimensional projection is always a perfect fit with no error. Projections onto 1 or 2 dimensions can both be drawn easily, but the error is generally much smaller in 2 dimensions. (iii) ~b, ~bangles, hydrogen bonds and accessible surface area The conformations were analysed by comparing their (r ~b) torsion angles, hydrogen bonds and accessible surface areas. This was usually done by tabulating the values in the X-ray structure followed by the significant changes in the other structures. The criteria for hydrogen bonds are that the H...acceplor distances be less than 2-4A and the d o n o r - H . . , acceptor angle be within 35 ~ of linearity. The accessible surface area is calculated by our implementation of the Lee & Eichards (1971) method using a water radius of 1"4 A and solvent exclusion radii for the different
N A T U R E OF P R O T E I N
DYNAMICS
625
types of atoms of: 0 A (H), l'4 A (O, Q or V), 1"65 A (N or M), 1"87 A (C), 1"76 A (A or B) and 1"85 A (S). This same program and set of radii have been used extensively in previous solvent accessibility calculations (Chothia, 1976). The accessible areas of classes of atoms, individual residues and entire conformations are obtained by summing up the accessible areas of individual atoms. (b) Dynamic properties (i) Correlation functions I t is customary to quantify dynamic behaviour by calculating the autocorrelation function Cv(t ) t h a t shows how the value of some parameter, p, at time 7 is correlated with the value a t time t + r.
Cv(t ) = (p(-:)p(~- + t) ) / (p(~.)p(T) ), where the average is taken over all times r. F o r a trajectory calculated a t discrete times this becomes: n-I
n-I
Cp(i At)= l/n ~ p(j At)p(j A t + i At)/{1/n ~ p(j At)t}, ./=0
j=O
where the summation extends over n time-steps. The denominator ensures t h a t a t t = 0 , Cp(t)= I. As t tends to infinity, (p(-c)p(t+T)) tends to (pO')Xp(T)) as for t values big enough there is no correlation between p(r) and p(-c+t). Redefining Cv(t ) as:
Cv(t ) = [(p(~)p(T + t) ) - (p(r) ) 2]/[ (p(-:) 2) - (p(T)) 2]
(4)
gives a function t h a t decays from 1 to 0 as the auto-correlation decays. The correlation function can be generalised for a pair of parameters p and q so t h a t :
Cvq(t) = [(p(-c)q(-: +t)) - (p(-:)) (q(-c)) ]/{[(pO.) 2) - (p(7)) 2] [(q(~)t ) _ (q(~))2]p.
(5)
Now Cpq(t) is not necessarily unity at t=O but is equal to the correlation coefficient of the 2 parameters. (ii) Vector reorientation The reorientation of vectors is of particular interest when comparing calculated and experimental results. Both nuclear magnetic resonance relaxation and fluorescence depolarisation depend on the second Legendre polynomial P2(cos0)=89 cos20 - 1), where 0 is the angle between the different orientations of the particular vector. If the vector concerned is/l(t) and has been normalised to be of unit length, COS0=#(T)~(T+t) and:
C~(t) =
=89 3< [la(r)~(~"+ t) ]t> -- 1}. The vector ~(t) is calculated from the atomic co-ordinates along the trajectory. F o r nuclear magnetic resonance relaxation involving the dipolar coupling of 2 nuclei, #(t) is directed along the line between the nuclei. F o r the depolarisation of fluorescence/~(t) is directed along the direction of the absorption dipole (Levy & Szabo, 1982). F o r a trajectory calculated with a time-interval At we have: n--[
C~(i At)= 89
~ [lu(j At)p(j A t + i At)] 2} --89 1=o
(6)
When t tends to infinity, C~(t) tends to a limiting value defined as S 2, where S is the generalised order p a r a m e t e r that measures the angular restriction of the motion of the vector p(t). This asymptotic value can be calculated from the plot of C,(t) against t. Alternatively, S t is given by the formula: n-I
$2 =1{(3/n2) E i=0
n-I
E ~(i At)p(j At)] 2}- 89 j=O
(7)
626
M. LEVITT
which is derived from the C,(t) formula by assuming that the components of St(r) and St(r+t) are uncorrelated as t tends to infinity. If reorientation of St(t) is completely restricted, i.e. St(i At)=st(j/It) for all, i, j values, then S 2 = I. If reorientation of St(t) is completely free, i.e. (~)----- 0 and (izx)=$ 2 for the x, y and z components of St, then 82=0. An alternative formula for S z using spherical harmonics is more efficient but also more complicated (Levy & Szabo, 1982); calculation of S 2 by eqn (7) takes a few seconds of computer time.
(iii) Power spectra The power spectrum, l(w), of any time-dependent parameter is calculated as the square of the Fourier transform of the particular parameter, p. l(w) ..~A(w) 2 + B(oo)2, where A(~) = f ; (p(r)-(p(r)))cos 2~ro~r dr and B(w) = ~ : ( p ( r ) - (p(r)))sin 2~rwr dr. For a trajectory calculated at n discrete time-intervals At, the lowest frequency of interest is l / ( n - l ) A t so that/ioJ= l/(n--l)At is an appropriate choice. This gives: n-l
A(i Aoj) =
(p(j At)--fi)cos(2n(j/(n-- 1)) j=O n-I
B(i Ao~) =
(p(j lit)--fi)sin(2mj/(n-- 1)),
(8)
j=O
where
l n-I
fi=-
~ p(j At).
n j=0
3. R e s u l t s (a) A t o m i c motion (i) Vibration about the time-averaged structure The a m p l i t u d e s of vibration of the a-carbon a t o m s (a~i) are presented in Figure 1 for the period 62 to 132 picoseconds of the t r a j e c t o r y . Certain p a r t s of the chain are v i b r a t i n g much less than other parts. I t is possible to define a core for which a~i<0"4 A and consists of a total of 37 residues in the ranges 3 to 11, 18 to 24, 29 to 35, 41 to 46 and 49 to 55. These regions coincide a l m o s t exactly with the regions of helix and fl-sheet secondary s t r u c t u r e in B P T I (helix, 2 to 5, 47 to 55; and fl-sheet, 17 to 23, 30 to 36, 44 to 46). The following eight residues h a v e a~i values greater than 0"5 A: A r g l , Pro2, P r o l 3 , Ala25, Ala27, Gly28, Gly57 and Ala58. I t is unexpected t h a t this list contains the residues with the smallest sidechains; the a-carbon seems to move more because there is no well-anchored sidechain to inhibit its m o v e m e n t . T h e a m p l i t u d e s of vibration deduced from the C ~ t e m p e r a t u r e factors (02= 3B/87r 2) are bigger t h a n those calculated here (the r.m.s. C ~ a m p l i t u d e s are 0"35 A and 0"68 A in the d y n a m i c s simulation and X - r a y study, respectively). This m a y be due to libration of the whole molecule in the crystal unit cell or to static disorder in t h a t different molecules in the crystal h a v e slightly different orientations or conformations. Some of the p e a k s in the X - r a y
NATURE OF PROTEIN DYNAMICS 2-0
i
i
i
627
i
(o)
I-5 _=
I-0
QQ ~
E o
p
0"5
0'0
B
I
,
I
'
/~t
I
,
B,
I
a
Argl7
(b
75 ~,rcJI
== 50 o> 25
Arg53 Lys26 ~' / / Gin=31
Lysl5 Phe4 9".
H AIo5 Glu49 I I
Arg39
'"
o
.
9
20
30
9
40
.
.
9
50
o o ..
60
Residue number Fro. 1. Amplitudes of atomic motion and temperature factors averaged over the period of 62 to 132 ps. (a) The variation of the r.m.s, fluctuation of a-carbon atoms, ~=i, with residue number, in which the unbroken line shows the calculated valhes and the dotted line shows the values derived from the X-ray temperature factors {Deisenh0fer & Steigemann, personal communication. The peaks occur at the chain termini and at the small side-chains. The short lines above the x-axis mark the 5 stretches of secondary structure: 31o helix (2 to 5), fl-strand (17 to 23, 30 to 36, 44 to 46), a-helix (47 to 55). The r.m.s, fluctuation is below 0.4 A for residues in these regions. (b) The variation of the mean residue temperature factors with residue number, in which the unbroken line shows the calculated values and the dotted line shows the values derived from the X-ray temperature factors. The peaks generally occur at the charged polar side-chains Glu, Lys and Arg.
a=i values correspond to peaks in the calculated values but there is no clear-cut correspondence of values (correlation coefficient of 0"34). The mean temperature factors of residues also show considerable variation along the chain (see Fig. 1). The largest motion is associated almost exclusively with the long polar side-chains, Gln, Lys and Arg. The two ends of the molecule also behave like side-chains. There is better overall agreement between the mean residue temperature factors calculated here and those observed by protein crystallography (see Fig. 1); many of the peaks match and the correlation coefficient is 0"45. The mean temperature factors for classes of atoms are given in Table 1. For each class, buried atoms vibrate less than exposed atoms. The buried chain atoms in the backbone (N, C= and C) have the smallest temperature factors (average of 7 A2), whereas the buried side atoms (H and O) have slightly larger values (average of 8"7 Az). The side-chain atoms have temperature factors (B) that increase with distance from the main chain: the mean B values are 9"2, 12"8, 14"8
M.
{i28
LEVITT TABLE 1
Classification of calculatedt temperature factors Atom type
Mean B value (A 2) Buried:~ Exposed
No.
All
C C~ C~ C;' C ~'t C ~'2 C '~ C al C ~2 C~ C ~l C~ C; H H ~2' H 'u2 H c21 H~22 N N ~2 N~ N ~2 N; N ~1 N ~z
56 56 50 31 3 6 !5 12 10 5 8 8 13 53 3 3 1 1 56 3 5 1 4 5 5
8"6 9"2 12-8 14-8 13"0 12-9 27 "6 16"8 19"3 56"6 22"7 25 "9 47 "9 10"3 20"7 21"1 118"5 122 "4 7"8 15"8 53"2 79"8 84"3 90"9 132"9
0
Exptl w
6-9 7"1 7-3 8-6 --10 "0 12"7 10"2 -27'1 12 "0 14 "4 9"5 12'6 21"1 -122"4 7" 1 9"6 ------
16.4 I 1"6 15'l 19-9 13.0 12.9 32 "0 22"4 33"0 56'6 15"5 30 "6 76"6 12"1 36"9 -118"5 -12"7 28"2 53"2 79"8 84-3 90"9 132"9
14"l 17'2 19"5 30"0 55"7 22'8 17'4
9"4 9"9
57
14"0
7"8
O:
1
13"2
--
0 ''I 0 ~1 0 42 0 ~1 0 ~2 O '~
3 5 2 3 2 5
I 1"4 i 8"0 20"7 51"1 59"2 33"7
-9"4 --43"0 --
18"8 13"2 11 "4 23"7 20"7 51"1 75.4 33-7
S~ S;'
1 6
18"6 12"2
18"6 11"3
-14"2
12"3 I 1.7 12'4 18'8 16-0 23.3 27 "4 14"6 13'9 51 '6 12'6 14"2 16 "8
I I "5 10"5 24"5 35"6 24'2 29'5
t Averaged over the period 62 to 130 ps of t h e L79 trajectory. T h e buried a t o m s have less t h a n 1 A 2 of accessible surface area. w experimental values are averages of the B values derived from the X - r a y refinement (Deisenhofer & S t e i g e m a n n , 1975) for the particular class of a t o m s .
a n d 2 7 " 6 A 2 f o r C ~, C ~, C ~ a n d C ~ a t o m s . for the seven agreement
common
classes
with the corresponding
correlation
coefficient
three
atoms,
show
a much
calculated
most
broader
values
X-ray
of 0"95. For
the correlation
The mean
of atom the
values
are less than
than the
the
X-ray
X-ray
factors
calculated
and the two sets of values
18 c l a s s e s
is lower at 0"53. The
spread
temperature
( C , C ~, C p, C ~, C ~, N , O ) a r e i n g o o d of atom
simulated
values;
values,
that
contain
temperature
for main-chain
whereas
have
more
factors
atoms,
for side-chain
a
than the
atoms,
NATURE OF PROTEIN DYNAMICS
~i29
the calculated values are greater than the X-ray values. Clearly, the simulation reproduces the overall extent and variation of atomic motion seen in the crystalline state by X-ray diffraction; the agreement for the individual atoms or residues is less good. (ii) The path through conformational space The conformations along the dynamics trajectory trace out a continuous path in the 1581-dimensional space defined by the Cartesian co-ordinates of all the 527 atoms. A glimpse at this path would illustrate how the trajectory explores the space, showing the regions occupied, the rates of conformational change and the amplitude of fluctuation. A two-dimensional projection of the path was constructed as described in Methods, section (a) (ii), using conformations three picoseconds apart along the trajectory (see Fig. 2(a)). Energy minimisation changes the conformation slightly from X to CX. Dynamics then explores the conformational space more fully. After the first 30 picoseconds, the r.m.s, deviation from the X-ray conformation Ad~ does not increase. There are four distinct regions of space that the trajectory occupies: (I) 1 to 22; (II) 23 to 62; (III) 63 to 76 and l l l to 132; (IV) 77 to l l l picoseconds. In each region the trajectory crosses back and forth several times. Because the ends of the molecules move much more than the ol/her parts, the twodimensional projection was recalculated using only residues 3 to 56 (z]d~i see Fig. 2(b)). The regions of space occupied by the trajectory are now smaller, but the four regions are even more apparent. Thus, these regions involve conformations that differ in more than just the positions of the terminal residues. For an atom of mass, m, 20 g/tool, the velocity, v, at temperature T is given by ~mvX2 =-ikaT,l or v= x/~aT/'m =3"5 A/picosecond at T = 3 0 0 K (kB is Boltzmann's constant). The actual length of the path between the conformations separated by three picoseconds along the trajectory should, therefore, be about l0 A, which is much longer than the straight-line distance of about 0"5 A, due to the oscillatory nature of the motion. When conformations separated by less than three picoseconds are included in the projection there is so much fine structure to the pathway that the broad features are obscured. In each region, the protein vibrates about the mean conformations for between 13 and 41 picoseconds. In this time the region is traversed many times (at thermal velocity the region will be crossed in less than l ps). The motion from one region to another is very quick taking less than two picoseconds. The most interesting conformational change concerns the transition from region III to region IV at 76 picoseconds and the transition back to region III at I II picoseconds. Because this change is reversible, it is not due to equilibration but is an example of the conformational mobility of native globular proteins. The nature of this conformational change will become clearer in the analysis that follows. (iii) The potential surface under the trajectory The path traced out by the trajectory depends on the level of the potential
M. L E V I T T
(i30 I-0
(o I
0'0
1T'-. . . . . . .
-I'0 --I'0
J
--'"
E or
I
I I'0
O0
p(h)
I'0
0-0
(b)
j
,4~
Error -I-0 -I'0
0
I'0
Flu. 2. Two-dimensional projections of the 1581-dimensional conformationa] space that contains the dynamic trajectory. The X-ray structure, X, the minimised X-ray structure, CX, and conformations separated by 3 ps along the 132 ps L79 trajectory are connected by a line representing the path of motion through the eonformational space (points at 27 and 30 ps are missing as that segment of the trajectory was lost in 1980). The distance between any pair of points in the Figure approximates the r.m.s, deviation between these sets of co-ordinates. The scale bar shows the mean difference between the distance on paper and the actual r.m.s, deviation (the maximum difference is double the mean difference). (a) The r.m.s, deviation Ad~ that depends on the distances between all pairs of a-carbons. (b) The r.m.s, deviation Ad~ that only considers distances between the a-carbons of residues 3 to 56. The clearly separable regions of space are as follows: I, 0 to 22; II, 23 to 62; III, 63 to 76 and I l l to 132; and IV, 77 to I I I ps.
NATURE
OF P R O T E I N
DYNAMICS
631
TABLe 2
Energy contributions and r.m.s, deviations of minimised conformations Energy (kcal/mol) Conformation
Total
Bond
Angle
Tots.
vdW.
Hbond
Ar^
X CX E6 El8 E32 E44 E56 E68 E77 E89 E98 Ell0 Ell9 El30
- 22 -393 - 18 -5 -20 - 14 - 19 - 28 -22 -23 -22 -20 -33 -38
62 3 0 0 0 0 l l l 1 0 1 1 1
152 34 2 l I 2 2 3 4 2 2 l 3 2
44 -28 -4 -7 - l0 -8 -9 - 3 -3 - l 0 -5 -1 0
- 178 -230 -4 6 4 0 1 - 4 -3 - 10 -8 -7 -11 - l0
- 101 - 172 - 12 -6 - 16 -8 - 13 - 24 -21 - 15 - 16 -10 -24 -31
0 1"14 1"74 1"61 2"58 2-51 2-54 2"46 2"56 2"64 2'65 2"64 2"60 2'58
r.m.s, deviation (A) Art Ar~. ArM(
0 0"59 0"96 0"86 1-10 1"02 0"97 0"94 l'01 1-17 1"13 1'22 1"21 1"06
0 0'15 0"27 0-24 0"30 0"27 0"27 0"28 0"27 0"30 0-29 0"33 0"28 0"28
All the energies of the minima along the trajectory are given relative to those, of' the CX reference conformation (0 ps). Tors., torsion; vdW., van der Waals': Hbond, hydrogen bond.
e n e r g y surface. O w i n g t o t h e r m a l v i b r a t i o n , t h e t i m e - a v e r a g e d v a l u e of t h e p o t e n t i a l e n e r g y is m u c h h i g h e r t h a n t h e level of t h e e n e r g y surface. T h i s level w a s i n v e s t i g a t e d b y m i n i m i s i n g t h e p o t e n t i a l e n e r g y of c o n f o r m a t i o n s s e p a r a t e d b y a b o u t t e n p i c o s e c o n d s a l o n g t h e t r a j e c t o r y t o give t h e set of E c o n f o r m a t i o n s (see T a b l e 2). T h e s e E c o n f o r m a t i o n s all h a v e lower p o t e n t i a l e n e r g i e s t h a n t h e C X c o n f o r m a t i o n . T h e e n e r g y v a l u e , itself, d e p e n d s on t h e r e g i o n in space f r o m which t h e e n e r g y m i n i m i s a t i o n b e g a n . All c o n f o r m a t i o n s t h a t h a v e m i n i m u m e n e r g y v a l u e s m o r e t h a n 25 k c a l / m o l below t h a t of C X are in region I I I , t h o s e w i t h e n e r g i e s of m o r e t h a n 20 k c a l / m o l below are in region IV, t h o s e w i t h e n e r g y v a l u e s b e t w e e n 15 a n d 20 k c a l / m o l below are i n r e g i o n s I a n d I I . T h e f a c t t h a t d y n a m i c s s i m u l a t i o n followed b y c o n v e r g e n t e n e r g y m i n i m i s a t i o n a l m o s t a l w a y s leads t o a c o n f o r m a t i o n of lower e n e r g y p r o v i d e s a useful m e a n s of e s c a p i n g f r o m local e n e r g y m i n i m a ; t h i s t e c h n i q u e h a s b e e n u s e d in s t u d i e s of p r o t e i n f o l d i n g ( L e v i t t , 1983b) a n d is r e f e r r e d to as d y n a m i c a n n e a l i n g . Most of t h e d r o p in e n e r g y r e l a t i v e to t h a t of C X comes f r o m b e t t e r h y d r o g e n b o n d i n g , b u t t h e r e are s i g n i f i c a n t c o n t r i b u t i o n s f r o m b o t h t o r s i o n a n g l e a n d v a n der W a a l s ' i n t e r a c t i o n s . T h e m i n i m u m e n e r g y c o n f o r m a t i o n s are all d i f f e r e n t i n d i c a t i n g t h a t t h e r e are m a n y local m i n i m a in t h e r e g i o n s o c c u p i e d b y t h e t r a j e c t o r y . T h e s e m i n i m a are n o t v e r y deep as m o l e c u l a r d y n a m i c s m o v e s b e t w e e n t h e m r e a d i l y . Closer a n a l y s i s of t h e e n e r g y v a l u e s gives a v e r y clear p i c t u r e of t h e n a t u r e of t h e d i f f e r e n t r e g i o n s of space e x p l o r e d b y t h e t r a j e c t o r y . I n r e g i o n s I a n d I I , t h e t o r s i o n a n g l e e n e r g y is a b o u t 7 k c a l / m o l m o r e f a v o u r a b l e t h a n in r e g i o n s I I I a n d IV, w h e r e a s t h e v a n der W a a l s ' e n e r g y shows j u s t t h e o p p o s i t e t r e n d . T h e h y d r o g e n b o n d e n e r g y in r e g i o n I I I is a b o u t 12 k c a l / m o l m o r e f a v o u r a b l e t h a n in
632
M. LEVITT
the other regions. This indicates the way the different energy contributions compensate; getting a lower van der Waals' energy necessitates a rise in the torsion angle energy. The very favourable hydrogen bonding in region III relative to region IV indicates that the reversible conformation change for the period 77 to 111 picoseconds involves these interactions. Another way to investigate the potential energy surface is to use the total energy value (potential plus kinetic). The time-averaged value of the kinetic energy is simply (EK) =~NA]CsT. The time-averaged value of the potential energy is above the level of the surface, U o, by the same amount so t hat (Ee) = Uo +~NAkBT (assuming that the specific heat, Cv, is 3kB/atom as expected for a classical solid; this is equivalent to assuming equipartition of the energy). The total energy (ET} = ( E p) + (EK) = U o + 3NAlcaT, allowing the level of the potential energy surface to be calculated as Uo= (ET)--3NAkBT. In the present calculation, the total energy is constant over the periods of 100 time-steps for which the same list of non-bonded pairs is used. If the mean temperature for this period is more than 315 K or less than 285 K the velocities are scaled to maintain the temperature within these limits; the mean temperature for the entire period 62 to 132 picoseconds is 289 K. The time variation of average temperature, total energy and expected level of the potential surface are shown in Figure 3. Also shown is the value of the potential energy, Umi., obtained by minimisation from different points along the trajectories (see Table 2). Although the minimum energy, Umi,, which shows the true level of the potential energy is 2 3 • kcal/mol higher than Uo, both U o and Urea. show approximately the same variation with time with low energy regions in the ranges 60 to 75 and 115 to 130 picoseconds and a higher plateau in the range 75 to 115 picoseconds. The 23 kcal/mol difference between U o and Umi. can be used to estimate a value for the specific heat per atoms, Cv, as follows. From the definition of Cv, the mean potential energy is (Ep~--U,,i,q- 89 and from the definition of U o, (ET}=Uo+3NAkB(T }. Using (EK}=~(NAka(T}) and ( E T ) = ( E p } + ( E K } gives Cv=3kB-(Umi. - Uo)2/NA(T ) or Cv=2"85+0"03l %, which is close to 3ka, the value for a perfectly harmonic solid. This is reasonable as harmonic or approximately harmonic potentials are used for bond lengths, bond angle and double-bond torsion angles. Only 208 single-bond torsion angles would be expected to deviate from harmonic behaviour at room temperature. This calculated value of the specific heat must not be compared to the experimental value of about 2.2kB as has been done previously (McCammon et al., 1977). The simulations all use classical dynamics in t h at every mode of motion can be excited at room temperature; the real system is quantum mechanical and most of the bond vibrations will not be excited until much higher temperatures.
(b) Torsion angle fluctuations In the preceding section, the changes of conformation were monitored by using the r.m.s, deviations of atomic positions or interatomic distances. Conformational changes can be more readily appreciated by considering the 208 ~, r and X single bond torsion angles t ha t are the softest degrees of freedom in the system. The
NATURE
OF P R O T E I N
DYNAMICS
033 (o)
296 292 A
I-,,. V
288 284
(b) o
470
-6
460
A
450
V
440 .
--42O o ~ v
o .....
9
.
.
.o..o.
.......
...4
o
(c)
.,..
o 9 ,o
o. "e ,e...
--430 --440 --450
--460
J
60 '
70
i
8'0
9'0
I00
,
Ii0
I'::;0
130
Time (ps)
Flu. 3. The variation of the temperature, the total energy, the expected level of the potential surface, and the minimum energy values for the period 62 to 130 ps. The quantities E T and Uo are smoothed by averaging over 5-ps intervals (2500 time steps). (a) The averaged temperature that is maintained close to 300 K by heating or cooling. (b) The averaged total energy , is shown as an unbroken line, whereas the actual level obtained by energy minimisation starting after different amounts of time, Umi., is shown as a dotted line.
r.m.s, c h a n g e s in t o r s i o n angles in going f r o m the X - r a y c o n f o r m a t i o n to the m e a n d y n a m i c c o n f o r m a t i o n are 34 ~ 38 ~ a n d 51 ~ for ~, ~ a n d X, respectively. These c h a n g e s are quite appreciable a n d are a n a l y s e d in some detail. (i) M a i n - c h a i n torsion angles T h e d i s t r i b u t i o n of (r r angle pairs is similar for b o t h X a n d M (see Fig. 4(a) a n d (b)). T h e largest c h a n g e s h a v e been p l o t t e d using (~bi, r p a i r s a n d (~i+1, ~bi) pairs (see Fig. 4(c) a n d (d)). T h e r e is little o b v i o u s s t r u c t u r e t o shifts of (~bi, ~bi) b u t those of (~i+1, r are p r e d o m i n a n t l y parallel t o the diagonal. B y m a k i n g equal a n d o p p o s i t e c h a n g e s t o ~i+1 a n d ~bi, m o r e f a v o u r a b l e (~, ~) values can be obtained. This causes a r e o r i e n t a t i o n of the p e p t i d e g r o u p b e t w e e n residues i a n d i 4- 1 w i t h o u t h a v i n g m u c h effect on the chain p a t h . T h e ~ a n d ~b angles c h a n g e with time a l o n g t h e t r a j e c t o r y . T h e r.m.s. f l u c t u a t i o n a v e r a g e s 22"7 ~ for all r angles a n d 14"8 ~ for all ~b angles (for t h e period o f 62 t o 132 ps). T h e r e are several m u c h larger f l u c t u a t i o n s t h a t indicate a c h a n g e of c o n f o r m a t i o n r a t h e r t h a n a simple v i b r a t i o n a b o u t an equilibrium value. F i g u r e 5 shows the t i m e - v a r i a t i o n o f 9 o f the I 1 m a i n - c h a i n t o r s i o n angles t h a t
M. L E V I T T
634
180
180
~o
oo o oo eo o
o
o
w -.>
o
o
o
x
z
0
x~
x
o x
--180 --180
' a
'
0
-180
180
,
--180
(a)
0
180
(b)
180
7
--180 --180
i
0
180
--180 --180
t
0
,hi (')
,hi+ l (')
(C)
(d)
180
Fro. 4. The distribution of (~, r angles in the X-ray and mean dynamic conformations, X and M, and the differences between these 2 sets of angles. (a) The distribution for X showing the glycine residues as x and the other residues as o. (b) The corresponding distribution for the M conformation. (c) The changes in (4, ~b) angles are plotted on a (~i, r map. (d) These same changes are plotted on a (~bi+~,~b,)map in which reorientation of the peptide unit is seen as a diagonal shift (i.e. A~i+l = --/lr ). Most of the shifts are seen to involve such reorientations.
have an r.m.s, fluctuation in excess o f 16 ~ (the 2 angles left o u t ale Gly57 ~b a n d Ala58 ~b t h a t involve m o t i o n of t h e C-terminus). E i g h t of the nine angles are (r 1, ~b/) pairs t h a t are a d j a c e n t t o the ( i + 1)th p e p t i d e g r o u p (all e x c e p t Asn24, r Their time v a r i a t i o n is seen t o be v e r y highly a n t i - c o r r e l a t e d with an increase in ~bi being c o m p e n s a t e d for b y a decrease in r T h e m o s t noticeable c h a n g e involves the pair (r ~bls), for which there is a c h a n g e f r o m (r r 290~ 0~ to (r r s) = ( 200~ 12~ for the period of 77 t o 111 picoseconds. This c o r r e s p o n d s to the period for which the protein is in region I V of c o n f o r m a t i o n a l space. One o t h e r correlated (r 1, ~bi) f l u c t u a t i o n also involves a c o n f o r m a t i o n a l c h a n g e ; (r ~bl~) goes f r o m (240 ~ 70 ~ to (300 ~ l 0 ~ a t a b o u t 70 picoseconds, s h o r t l y a f t e r the m o v e o u t o f region I I . T h e o t h e r torsion angles show large a m p l i t u d e v i b r a t i o n s r a t h e r t h a n j u m p s between well-defined states. T h e v a r i a t i o n of r has a different character, d r o p p i n g g r a d u a l l y f r o m a m e a n value o f 260 ~ to a value of a b o u t 220 ~ for t h e period o f 90 to 120 picoseconds. This torsion angle also seems t o be correlated with the m o v e f r o m region I I I t o I V a n d b a c k again.
NATURE
200
9
OF
PROTEIN
.
,
635
DYNAMICS .
,
9
,
(a)l
I00 0 2 .@
i
I
I
260 160
.-).
60 --40
9
I
240 140 a
o i--
20
(f
180
290 190
-0- 2 4 0 140 9
'
(h)l
540 240 240 140
,
60
'
I
70
,
I
80
n
I
n
90
I
I00
,
!
I10
I
I
120
I
(')t l
150
Time (ps)
Flu. 5. The time variation of the 9 main-chain torsion angles that have the largest fluctuation amplitudes (greater than 16~ for the period 62 to 132 ps. The angles shown and their r.m.s. fluctuation amplitudes (in parenthesis) are: (a) Pro9 ~b (24~ (b) Tyrl0 ~ (23~ (c) Lysl5 ~b (64~ (d) Alal6 ~ (48~ (e) Argl7 ~b(27~ (f) IlelS~ (28~ (g) Asn24 ~ (19~ (h) Gly37 r (21 ~ and (i) Cys38 (16~ Note that all of these angles except Ash24 ~ occur in pairs and are on both sides of a peptide unit; they change in opposition to reorient the peptide unit without changing the chain conformation (A'hi + , = - Ar
63U
M. LEVITT
The (r ~b) angles of the conformation energy minimised after 68 picoseconds (E68) are almost identical to those of the conformation minimised after 130 picoseconds (El30) with a r.m.s, difference in (4, ~b) angles of only 4 ~ (the r.m.s. co-ordinate deviation between these two conformations is 0"76 A (ArA,) and the r.m.s, difference of the side torsion angles is 45~ (ii) Side-chain torsion angles The pattern of changes in the 97 side-chain single bond X torsion angle were analysed by calculating values in conformations eight picoseconds apart along the entire 132 picosecond trajectory. The r.m.s, fluctuation averages 32 ~ for all 97 angles (for the period of 62 to 132 ps). The 25 torsion angles listed in Table 3 are those that have r.m.s, fluctuations greater than 25 ~ (in fact all have r.m.s. fluctuations greater than 36~ These 25 most active side-chain torsion angles occur in 16 different side-chains, with more than one angle being involved in the side-chains Asp3, Phe4, Lysl5, Arg39, Lys46, Glu49, Asp50 and Arg53. Some of the angles simply fluctuate between several states while others go abruptly from one conformation to another. At 60 picoseconds, Asp3 Xl, Lys46 XI, Asp50 X1, Met52 9/3 and Arg53 Xl change their conformations in unison; all these side-chains are close in space and most are in the C-terminal s-helix that is connected to the N-terminal 31o helix by the 5:51 disulphide bond. At 68 picoseconds, Lysl5 X2, Arg39 9/2 and Arg39 9/4 change their conformations; these residues are close in space and connected by the 14:38 disulphide bond. At 76 picoseconds, Lysl5 X4, Argl7 9/3 and Gin31 9/2 change their conformations; two of these residues are close in space (15 and 17), but the third (31) is 20 A away at the other end of the molecule. Once these abrupt changes have happened, the side-chain remains in the new conformation for the remainder of the trajectory. One of the aromatic side-chains, Phe4, shows a ring flip going from 9/2------90~ -) to §176 +) after 52 picoseconds. The 9/1 angle of Phe4 also changes after 84 picoseconds, but this change is not correlated with the ring flip. The phenylalanine side-chain swings out to become more exposed when 9/1 changes from g+ to t. The other seven aromatic ring 9/2 torsion angles do not show any sign of ring flipping (see Fig. 6). All the angles except that of Phe4 vibrate about the equilibrium value occasionally showing a brief excursion (less than 2ps) of up to 90 ~ (e.g. Tyr21 at t = 7 8 p s ) . The X2 angle of Phe4 has an equilibrium value of 90 ~ but changes from this value for periods of between two and six picoseconds; Phe4 looks as if it is trying to flip back over again. The autocorrelation functions of these eight X2 angles are shown in Figure 7. Except for Phe4, the correlation decays with a time constant of between 0"25 and 0"5 picoseconds (for Phe4 the time constant is qver 5 ps). This is in. good agreement with other simulations (McCammon et al., 1979). Once the correlation has decayed it does not increase above 0-25 showing that the X2 oscillations are well-damped. (c) Hydrogen bonds (i) Stability of hydrogen bonds In the present simulation, the hydrogen bond is a favourable interaction that
+1
+
§
§
§
+
§
I 0
I
§
§
§
I
q- I
4-
-t-
++
§
-I-
I+1
-t-
I
4-
-t-
t~
9 §
§
4-+
-t-
..I-
-t-
-t-
.-t-
-t-
--I-
--I-
--I-
-t-
--I-
§
--I-
I§247
+-I-
§
++1
-t- I
11-
§
I
--~
§
I
-I- -I-
§
II
-I-
+-t-
§
-I-
I§
-t-
I
§
-I- I
+1
r
§
§
I
I
-t-
-t-
-t-
I
I
§
+
§
§
o
§
E~
+
++ ~
§247
q--I-
§
§
-I-
§
§
--t-
M. L E V I T T
638
360
i
Phe4
180
i
Tyrl0 180
Tyr21
180
!
Phe22 o
180
o
._g
o
i
o p180
Tyr25
i
Phe33
180
Tyr55
180
Phe45
180 0
, 60
! 70
,
I 80
,
I 90
h
I I00
,
! I10
,
I 120
, 150
Time (ps)
FIG. 6. The time variation o f the,x2 torsion angles of the aromatic side-chains in BPTI. TJae r.m.s. fluctuations for these angles are 42, 8, 12 and 13 ~ for Phe4, 22, 44 and 45, respectively, and 9, 8, 9 and 17 ~ for Tyr]0, 21, 23 and 35, respectively.
NATURE 1.0
0"5
A
': O
OF
PROTEIN
.•_
Phe22
• / • ^A ^
0
9
i
639
Phe33
Phe 45
/~
9 V" V ~.,,/
-0.5
DYNAMICS
-v
~
V
i
i
0~
TyrlO
Tyr21
Tyr23
,
I
,
i
,
Tyr35
O
"5 <
0-5
[/'~..,[~ I vV L l V ~ J ^
0
~A ~/~ V" vv V v
_
-0-5
9
0
I
2
3
40
I
2
3
4
0
I
2
3
4
0
I
,
i
t
2
3
.
i
.
45
Time (ps) F]~:. 7. T h e autocorrelation f u n c t i o n s for t h e X2 angles of t h e 8 a r o m a t m rings in B P T I . T h e decay rates, taken as the time to reach a correlation o f 0"367 (e- I ), are 4, 0"2, 0'3 a n d 0"35 ps for Phe4, 22, 33 a n d 45, respectively, a n d 0"2, 0-5, 0-35 a n d 0-2 p s for T y r l 0 , 21, 23 a n d 35, respectively.
occurs between any 0 and H atoms t hat happen to be close enough and in the correct orientation. Because there is no need to decide a priori which pairs of atoms form hydrogen bonds, their dynamics can be investigated in a way impossible before (McCammon et al., 1977,1979). There are 49 0 . . H hydrogen bonds that are formed for at least 10~ of the time during the period of 62 to 132 picoseconds (see Table 4). Of these, 32 are also formed in the X-ray structure. The 49 hydrogen bonds can be classified into three classes: stable, formed more than 90% of the time: weak, formed more than 5 0 ~ and unstable, formed more than 10~/o. A hydrogen bond formed a fraction f of the time has a free energy of formation dGHa=]%T logr Thus, the 27 very stable hydrogen bonds have a ZIGHa value of less than - 1"55 kcal/mol, the 14 weak hydrogen bonds have AGH8 values less than 0 kcal/mol, and the eight unstable hydrogen bonds have zIG values between 0 and + 1"32 kcal/mol. Twentythree of the X-ray structure hydrogen bonds are in the stable class, seven in the weak class and two in the unstable class. Thirteen of the 17 protons that are most protected from exchange with solvent protons (as measured by 1H nuclear magnetic resonance, see Wagner & Wiithrich, 1982) are in the stable class, three are in the weak class and one is in the unstable classes. There are 14 other protons involved in the stable hydrogen bonds th at are not observed to exchange particularly slowly. Five of these are
640
M. L E V I T T TABLE 4
Calculated stability of hydrogen bonds Stabler Atom pair Pro2,0...Cys5,H Asp3,O... Leu6,H Asp3,0... Glu7,H GluT,0... Asn43,H~2t Tyrl0,O... Wat2,H2 Thrll,O...Oly36,H Thrl 1,O... Wat3,H 1 Ilel8,0...Tyr35,H Arg20,O... Phe33,H Tyr21,0... Phe45,H Phe22,0... Gin31,H Tyr 23,0... Asn43,Ha22 Asn24,0... Ala27,H Asn24,0... GIy28,H Gln31,O ... Phe22,H Phe33,0... Arg20,H Cys38,0... Wat3,H2 Asn43,0al ... Tyr23,H Phe45,0... Tyr21 ,H Ala48,0... Met52,H Cys51,0... Thr54,H Cys51,O...Cys55,H Cys51,0... Gly56,H Met52,0...Ala57,H Met52,O...AIa58,H Wat2,0... Lys41,H Wat3,0... Gly37,H
Fraction formed 0'90 (X)~: 0"98 (X) 0"99 0"99 (X S) 0"99 (XW) 0"99 (X S) 0"99 (XW) 1"00 (X S) 1-00 (X S) 1-00 (X S) 1.00 (X S) 0"97 (X S) 0"96 (X) 0"99 (X) 1-00 (X S) 1-00 (X S) 1"00 (XW) 0"99 (X S) 0"99 (X S) 0-92 (X S) 0-96 1-00 (X S) 0-99 1-00 (X) 0"98 (X) 0"98 (XW) 1-00 (W)
Atom pair
Weak and unstablet Fraction formed
Argl,O... Phe4,H Pro8,O... Watl ,H1 Asn24,0 al . . . Lys26,H Leu29,0... Asn24,H Leu29,0... Gln31,H~'21 GIn31,OEl...Asn24,Hn'' Lys41,O... Asn44,H Set47,0...Cys51,H Glu49,0... Arg53,H Wat1,0.. Tyrl0,H Wat1,0... Wat2,Hl Wat2,0... Asn44,H~21 Wat3,0... Cys38,H Wat4,0... Ala25,H
0"80 0-50 (W) 0"78 0'70 (X S) 0"58 0"79 0"79 (S) 0.64 (X) 0"78 (X S) 0"73 (XW) 0'84 (XW) 0'87 (XW) 0"86 (XW) 0"70 (XW)
Leu6,0... Wat4,H1 Leu6,0... Wat4,H2 Pro8,0... Watl ,H2 Tyr23,0... Wat4,H l Asn24,&l... Wat4,H2 Asn24,0... Gln31,Ha2 Tyr35,0... Ilel8.H Wat3,O...Cysl4,H
0.26 (W) 0"39 (W) 0-31 (W) 0-22 (XW) 0"24 (W) 0-29 0.16 (X S) 0-21 (XW)
Only those hydrogen bonds that involve an explicitly included hydrogen atom are considered here. The bond is considered to be formed if the 0 . . H distance is less than 2-4 A and if the 0 . . H--N angle is greater than 145~ t Stable hydrogen bonds are defined as those formed more than 90~'o of the time, weak hydrogen bonds are formed more than 50% of the time and unstable hydrogen bonds are formed more than 10% of the time. :~Hydrogen bonds marked by X are in the X-ray structure, those marked by W involve at least one of the 4 bound water molecules, and those marked by S involve a proton that is protected from exchange with solvent protons (kin (36~ < 10 -4 min- 1; see Wagner & Wiithrich, 1982).
i n v o l v e d in h y d r o g e n b o n d s t o s o l v e n t a n d s h o u l d t h e r e f o r e e x c h a n g e r a p i d l y . E i g h t o t h e r s are w e a k e r as t h e y are i n v o l v e d in b i f u r c a t e d h y d r o g e n b o n d s t o t h e following four o x y g e n a t o m s : A s p 3 , 0 , A s n 2 4 , 0 , Cys51,O a n d M e t 5 2 , 0 . O n l y one p r o t o n , t h a t in Cys5, w o u l d h a v e been e x p e c t e d t o be p r o t e c t e d from e x c h a n g e b u t h a s n o t been o b s e r v e d t o do so ( W a g n e r & W i i t h r i c h , 1982). T h e c o r r e l a t i o n b e t w e e n h y d r o g e n - b o n d s t a b i l i t y d e d u c e d from m o l e c u l a r d y n a m i c s a n d t h e r a t e of p r o t o n e x c h a n g e h a s b e e n d e s c r i b e d ( L e v i t t , 1981a); t h e p e r i o d of 32 t o 56 p i c o s e e o n d s of t h e p r e s e n t t r a j e c t o r y w a s u s e d t h e r e . S u c h a wide r a n g e of h y d r o g e n b o n d s t a b i l i t i e s is u n e x p e c t e d as t h e s a m e
NATURE OF PROTEIN DYNAMICS
641
0 . 9 H potential function is used for each. The potential energy of hydrogen bond formation is about -4-9 kcal/mol (see Fig. 2 of the accompanying paper) whereas the free energies of formation range from - 4 kcal/mol ( f = 0"999) to + 1"32 kcal/mol ( f = 0"!). Certain hydrogen bonds are less stable in the dynamics because forming the bond restricts the conformational freedom of the chain and so leads to a reduction in entropy and an increase in free energy. This entropic effect cannot be included in energy minimisation calculations. (ii)
Changingpatterns of hydrogen bonding
The hydrogen bonds formed in seven minimum energy conformations derived from points about 20 picoseconds apart along the trajectory are compared with those formed in the X-ray structure (see Table 5). The total number of hydrogen bonds found in the conformations along the trajectory increases from 45 after 18 picoseconds to 55 after 130 picoseconds. In the X-ray structure there are 44 hydrogen bonds, 23 of which are conserved in all conformations. The number of X-ray hydrogen bonds conserved in any particular conformation varies from 30 after 32 picoseconds to 37 after 130 picoseconds. The two conformations in region IV of the trajectory (E89 and E l l 0 ) have about five fewer conserved hydrogen bonds than do the conformations in region III (E68 and El30). There is a total of 32 main-chain hydrogen bonds formed between peptide groups in any of the conformations. Twelve of these are formed in all conformations and an additional 13 are formed in most conformations. Most of the conformations subjected to over 32 picoseconds of dynamics have 19 of the 22 hydrogen bonds formed in the X-ray structure plus seven new hydrogen bonds. These changes in the pattern of main-chain hydrogen bonds are shown in Figure 8. The major changes caused by dynamics are additional hydrogen bonds in the N-terminal 31o helix, and a distortion and extension of the second half of the C-terminal s-helix. Five of the seven new hydrogen bonds are bifurcated in that two hydrogen atoms are bound to a single oxygen atom. Bifurcated hydrogen bonds are not found in the X-ray structure, suggesting that the potential energy function used here may be defective in that it allows such bonds. The three hydrogen bonds in the X-ray structure that are not preserved after dynamics include 36,0.. 16,H, 4 2 , 0 . . 4 4 , H and 52,0..56,H. There are alternative pairings for two of these, 4 1 , 0 . . 44,H and 5 1 , 0 . . 56,H. The hydrogen bonds involving one of the four tightly bound water molecules are also highly conserved. Six are formed in all conformations, and another nine in most conformations. Eleven of the 13 hydrogen bonds formed in the X-ray structure are conserved. These changes in the pattern of water hydrogen bonds are shown in Figure 9. Two water molecules, Wat2 and Wat3, are completely localised while two others, Watl and War4, spin freely making hydrogen bonds to a number of oxygen atoms. The behaviour of the water molecules during the first 62 picoseconds of the trajectory has been described (Levitt, 1981b). The hydrogen bonds involving side-chains (main/side or side/side) are more variable than those involving the main-chain or the water molecules. Seven of the nine side-chain hydrogen bonds formed in the X-ray structure are preserved and
642
M.
LEVITT TABLE
5
Changes in hydrogen bonding along the trajectory
Property
T o t a l no. No. conserved No. extra
Conformation E56 E68
X
CX
El8
E32
E89
E1 l0
El30
44 ---
45 38 7
45 34 ll
48 30 18
49 35 14
53 36 17
53 32 21
53 31 22
55 37 18
x x x x x x x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x x x x x x x x x x x x
x x
x x x
x x
x
x x x x x x x x x x x x x x x x x
x x
x x
x x
x x
x x
x x x x x x x
x x x x x x x
x x x x x x x
x x x x x x x
x x x x x x x
24
23
26
Main~main Asp3,0... ThrI 1,0... IlelS,0... Phe33,0... Arg20,O... Phe45,0... Tyr21,O... Gln31,0... Phe22,0... Asn24,0... Ala48,0... Cys51,O... Pro2,0... Gly36,0... Tyr35,0... Leu29,0... Asn24,0... Arg42,0... Ser47,0... Glu49,0... Asp50,O... Met52,0...
Leu6,H Gly36,H Tyr35,H Arg20,H Phe33,H T y r 2 1 ,H Phe45,H Phe22,H G i n 3 1 ,H Gly28,H Met52,H Cys55,H Cys5,H Alal6,H IlelS,H Asn24,H Ala27,H Asn44,H C y s 5 1 ,H Arg53,H Thr54,H Gly56,H
Lys41,0... Cys51,O... Cys51,0... Met52,0... Met52,0... Argl,O... Asp3,0... Ser47,0... Ala48,0... Arg53,O...
Asn44,H Gly56,H Thr54,H Gly57,H Ala58,H Phe4,H Glu7,H Asp50,H Cys51 ,H Ala58,H
Number
x x x
x x
x x x
x x
x x x x x
x x x x x x x x x
22
19
22
24
24
25
x
x
x
x
x
x
x
x
x
x x x x x x
x x x x x x x x x
x x x x x x
x x x x x
x x x x
x x x x
x x x x
x x x x
x
x
x
x
x
x
x
x
x
x
x x x x x x x x
Main~side Glu7,0...
A s n 4 3 , H 621
V a 1 3 4 , 0 . . . T h r l 1 , 0 rl A s h 4 3 , 0 6 1 .. . T y r 2 3 , H T y r 2 3 , O . . . A s n 4 3 , H a22 A s p 5 0 , O . . . T h r 5 4 , 0 ;'l Glu7,0~2... Asn43,H Ser47,0~'... Asp50,H Glu49,0~... Glu49,H Ash24,061 ... Lys26,H G l y 5 6 , 0 . . . A r g l ,N '12 G l y 5 6 , 0 . . . T y r 2 3 , 0 'l G l y 3 7 , 0 . . . T y r 3 5 , 0 'l A s p S 0 , O 61 . . . S e r 4 7 , H
x x
x x x
x
x
x
NATURE
OF
PROTEIN
T.~aLE 5
Asp50,O 61 . . . Lys46,H Glu7,0~'2... Arg42,H Asp50,O 42 . .. Ser47 ,H G l y 2 8 , 0 . . . Tyr23,0" Asp50,062... Lys46,H A s h 2 4 , 0 . . . Gin31 ,H .21 L e u 2 9 , 0 . . . Gin31 ,H ~2l GluT,0 *l . . . Arg42,H Cys55,0. :. Tyr23,0" Number
643
DYNAMICS
(continued)
X
X
X
X
X
X
X
X
X
X
x
X
x x
x
x
7
12
19
9
9
11
13
15
13
x x 2
x
x
x
x
x
l
1
1
1
x x 2
x x 2
x x x x x x x x x
x x x x x x
x x x x x x
x x x x x x
x x x
x x x x x x x x x x
x x x 9 x x x
x x x
x x x x x x x x x x x x
x x
x x
x x
x
x x x
x x x
x x x x
x x x
x x x
x 14
x x 14
14
Side/side Asn44,0 ~l . . . Arg20,N "1 A sp50,O 61 ... A rg53,N " 2 GIn31,O ~1 . . . Asn24,H 621 Asp50,O al . . . Set47,0 r Number
x x
2
l
Water/any Wat1,0... Tyrl0,H T y r l 0 , O . . . Wat2,H2 T h r l 1 , 0 . . . Wat3,H 1 W a r 3 , 0 . . . Cysl4,H C y s 3 8 , 0 . . . Wat3,H2 W a r 2 , 0 . . . Lys41 ,H C y s 5 , 0 . . . Wat4,H 1 T y r 2 3 , 0 . . . Wat4,H 1 W a t 4 , 0 . . . Ala25,H W a r 3 , 0 . . . Cys38,H A s h 4 3 , 0 . . . W a t l ,H W a r 2 , 0 . . . Asn44,H ~21 W a t 1 , 0 . . . Wat2,Hl Glu7,O*2... W a t l ,H Ash24,061 . . . Wat4,H W a t 3 , 0 . . . Gly37 ,N L e u 6 , 0 . . . 4,H2 P r o S , 0 . . . W a t l ,H A r g 3 9 , 0 . . . Wat2,H2 G l u 7 , 0 . . . Wat4,H2 C y s l 4 , 0 . . . Wat3,H 1 Number
x x x x x x x x x x x x x
x x x
x x x x
x x
x x x x x
x 13
12
x x x x x x x x x x
13
14
15
16
A hydrogen bond t h a t is formed according to tile criteria given in Methods, section (d) (iii), is marked x.
there
are
an
additional
(Asn24,0~l...Lys26,H; There
are also
conformations bonds the
arginine atoms. 22
hydrogen
14 h y d r o g e n
bonds
along the trajectory.
are involved
14 b o n d s .
three
The
side-chains;
bonds
Asp50,O~t...Ser47,H
in more than two X-ray
that
are formed
one bond
structure
so that
hydrogen
hydrogen
bonds,
during
the
dynamics
Gln31,O~t...Asn24,H621). in only two
Most of the atoms
these bonds are prevented
One of the additional
formed and
involved
o n l y 18 a t o m s bonds
that
or three
of the
in these hydrogen are involved
are broken
by the repulsion
between
in
involve O and M
G l n 3 1 , O ~1 . . . A s n 2 4 , H 61, c o u l d n o t
644
M.
\ \4.
~,,~
9I - Io 31.t.
LEVITT
~)
N ~ O
022
~
42 I
'oo_'o. "--~.-4P <
__1
~?N/I ' <
~
>k"~3~-t N,'i.'.<7 s8,~'==
FI(:. 8. Schematic d r a w i n g showing the changes in t h e p a t t e r n of main-chain h y d r o g e n bonds t h a t results from molecular d y n a m i c s . T h e native, X - r a y s t r u c t u r e hydrogen b o n d s t h a t are preserved are shown as u n b r o k e n lines, those ghat are broken are shown as d a s h - d o t lines, while the additional hydrogen b o n d s that, formed d u r i n g t h e d y n a m i c s are shown as broken lines. Note how m o s t c h a n g e s are a t t h e e n d s of t h e chain a n d arise as t h e 31o-helix a t the N - t e r m i n u s a n d t h e a-helix at t h e C - t e r m i n u s are lengthened.
form in the X-ray structure as the 0 ~1 and N ~2 atoms of Gln31 had been interchanged in the list of X-ray co-ordinates; this was corrected by changing the Xa angle of Gin31 by 180 ~ before starting the simulations. The differences in hydrogen bonding patterns that occur in going from region III to region IV of the trajectory indicate the nature of this reversible conformational change. There are six hydrogen bonds that are not formed in both E89 and E l l 0 (region III) but are formed in E68 and El30 (region IV). Two of these hydrogen bonds are at the open end of the fl-hairpin T y r 3 5 , 0 . . . Ilel8,H and W a t 3 , 0 . . . C y s 3 8 , H ) and the other four are at the closed end of the fl-hairpin (Leu29,0...Asn24,H ; C y s 5 , 0 . . . W a t 4 , H1 ; T y r 2 3 , 0 . . . W a t 4 , H 2 ; and W a r 4 , 0 . . . Ala25,H). These hydrogen bonds break when the protein moves into region IV after 77 picoseconds and reform when it leaves the region after 11 picoseconds. This co-operative change in the pattern of hydrogen bonds is illustrated in Figure 10. The total number of hydrogen bonds does not drop in region IV (conformations E89 and E l l 0 ) but the hydrogen bond energy does increase by between 10 and ]5 keal/mol; the mean energy per hydrogen bond in region I I I is -3"7 kcal/mol, whereas in region IV it is -3"45 kcal/mol (see Table 2). The native X-ray hydrogen bonds that are broken are energetically more favourable than the extra
NATURE
OF P R O T E I N
37,NH\ 1.0"~,
DYNAMICS
645
/HN,38 0'86/I HN 14 ..W3~ '
I1,0/~''/" HI H2" ~ "
0,38
I0,0~
.HN,41
o.99~Ha IO'NH"'~73
/
~ W ] / 0"5/ / H ~ /
'~,~
~HN,S2
0"84
44
H2'~~. o'31 \
8,0
W2
H,
\
43,0
8,0
6,0.. 6,0 0,23 0"39'~. ... o 261I 0.22 ///H2 W4 HI.~. "'-. -
24,081 /
0"24
Io-7
o "'o,5
I
25,NH FI(;. 9. Schematic drawing showing the changes in the pattern of hydrogen bonds to the 4 bound water molecules that results from the dynamics simulation. Native X-ray hydrogen bonds that are conserved are shown as unbroken lines, those that are broken are shown as dot-dash lines, while the additional hydrogen bonds that are formed during the dynamics are shown as broken lines. The number written alongside each bond gives fraction of time during the period of 62 to 132 ps, for which the bond is formed. In two cases, both hydrogen atoms of a particular water molecule bonds to the same oxygen atom at different times (Leug,0 and Prog,O).
hydrogen bonds that are formed. Most of the extra hydrogen bonds formed in E89 or E l l 0 but not in E68 and E l 30 involve side-chain atoms ( G l y 5 6 , 0 . . . T y r 2 3 , O "; G l y 5 7 , 0 . . . T y r 3 5 , 0 "; Asn24,O...Gln31,H ~22 ; L e u 2 9 , 0 . . . Gin31 ,H ~21) or water molecules (Asn24,0 tl . . . Wat4,H1 ; G l u 7 , 0 . . . W a t 4 , H 2 ; C y s l 4 , 0 . . . W a t 3 , H 1 ) . The hydrogen bonds involving Asn24,0, and water molecules Wat3 and 4 are made to compensate for the broken hydrogen bonds. (d) Accessible surface area The total accessible surface areas and the changes of area of individual residues are shown in Table 6 for X, CX and En conformations separated by about 20 pieoseconds along the entire 132 picosecond trajectory. The total accessible area
646
M. LEVITT
(~ Water
/\/'" 16i
156 ,% .......
t31
~0
,8ql
~
23q
.......... N ',,
0 6
FI(:. 10. Schematic drawing showing tile native X-ray hydrogen bonds that are broken when the conformation changes at abaut 77 ps. These bonds, marked as broken lines, are reformed after Ill ps. This conformationa] change occurs reversibly during the trajectory and demonstrates the nature of the co-operative changes that happen spontaneously at room temperature. The potential energy level rises by about 12 kcal/mol when these hydrogen bonds break. of conformations along the t r a j e c t o r y is close to t h a t of X (within 5~ There is a small increase in the exposed area of non-polar residues (8%) and a much larger decrease in the exposed area of the polar residues (20O/o). Many of the residues t h a t become more exposed or more buried a t some t i m e during the simulation remain in t h a t s t a t e until the end. The residues t h a t become more exposed include A r g l , Pro2, Phe4, Cys5, Tyr23, Met52 and Gly57. Five of these seven residues h a v e non-polar side-chains. The residues t h a t become less exposed include Asp3, Glu7, Lys26, Lys41, Arg42, Lys46, Asp50 a n d Ala58. Six of these seven residues have charged polar side-chains. This p a t t e r n of changes in accessible surface area can be explained by the lack of solvent. T h e surrounding w a t e r would repel the non-polar h y d r o p h o b i c side-chains and they would not become more exposed. The w a t e r would a t t r a c t the polar side-chains and t h e y would not become more buried. Some of the changes in residue accessibility are correlated with the regions of the trajectory. In region IV (conformations E89 and E l l 0 ) , residues Ala25 to Gln31 show significant changes of solvent-accessible surface area relative to the X - r a y structure. These changes are corrected when the conformation m o v e s back from region I V to I I I (conformation El30). Changes in the accessible areas of
NATURE OF PROTEIN DYNAMICS
647
residues Phe4 to Glu7 also happen in region IV but these are not corrected in conformation El30. The greater exposure of Phe4 is a direct result of the change in the X1 angle after 84 picoseconds (see Table 3). The estimated solvent interaction energy change is about 8"5 kcal/mol for conformations obtained after the first 20 picoseconds. This value is very small compared to the differences in potential energy (see Table 2) but it must be remembered that the proportionality constant of 0"02 kcal/mol per A 2 refers to free energy not potential energy (Chothia, 1975).
(e) Order parameters Most experimental methods that can measure the internal dynamics of proteins are sensitive to the rate of reorientation of vectors. For nuclear magnetic resonance, the rate of decay of dipolar coupling depends on the reorientation of the vector between the pair of coupled atoms. For fluorescence, the rate of depolarisation depends on the reorientation of the absorption and emission dipole moments on the particular chromophore (Levy & Szabo, 1982). It is possible to calculate the rate and extent of vector reorientation from these experimental methods, correct them for the overall Brownian motion of the molecule and compare with the reorientation calculated from the simulation (Lipari et al., 1982). As such, these methods provide an important check on both the extent and rate of internal dynamics. Reorientation is measured by the autocorrelation function C,(t) that decays to a plateau value defined as the order parameter S 2 (see Methods, section (c) (ii)). Table 7 gives the calculated S 2 values for reorientation of: (a) bond vectors leading to methyl groups; (b) C~--N ' bonds in lysine side-chains; and (c) O--H bonds in the four bound water molecules. The order parameters of the methyl vectors are high, ranging between 0"70 and 0"92. Values obtained from the KMC 96 picosecond trajectory (Karplus & McCammon, 1979; Lipari et al., 1982) have a higher spread extending from 0"19 to 0"91. Experimental order parameters for the BPTI methyl groups (Richarz et al., 1980; Lipari et al., 1982) are much lower than either of these sets of values (0"12 to 0"52). For the alanine methyl bonds, the results from the trajectory of Karplus & McCammon (1979) are in better agreement with experiment than are the values calculated here. In order of decreasing mobility, the experiment gives Ala58, 27, 16, 40, 25 and 48, whereas the present simulation gives Ala58, 25, 27, 40, 48 and 16. The most noticeable disagreements are Alal6, 25 and 27. Most of the methyl bond autocorrelation functions C~(t) decay with time constants of less than 0.4 picosecond; for Ala25 and Ala27, the time constant is 1"6 picoseconds. The order parameters of the lysine C~--N; bonds are smaller than for methyl bonds ranging from 0"45 to 0"77; the time constants for the reorientation are larger ranging from 0"2 to 2-4 picoseconds. The water molecule O--H vectors show the largest range of values (0"03 to 0.91). Two water molecules are fixed (Wat2 and War3), and two (Watl and War4) show complete O - - H bond reorientation with rates of between 1"6 and 2 picoseconds. This mobility was inferred in section (c) (ii), above, from the pattern of hydrogen bonds involving
M. L E V I T T
648
TABLE 6
Changes in the accessible surface area along the trajectory Property or residue
X
CX
El8
E32
Total area Non-polart Polar
3879 664 1856
3860 713 1780
3883 716 1760
3908 787 1628
0
2-9
3"5
8"3
Energy:[: Argl w Pro2 Asp3 Phe4 Cys5 Leu6 Glut Pro8 Pro9 Tyrl0 Thrl l Glyl2 Prol3 Cysl4 Lysl5 Alal6 Argl7 Ilel8 Ilel9 A rg20 Tyr21 Phe22 Tyr23 Ash24 A la25 Lys26 Ala27 Gly28 Leu29 Cys30 Gin31 Thr32 Phe33 Val34 Tyr35 Gly36 Gly37 Cys38 Arg39 Ala40 Lys41 Arg42 Asn43 Ash44 Phe45 Lys46 Ser47
129"2 49"2 121"2 36"4 0"0 100"0 39"3 91 "3 53"0 73'5 67 "3 21 "5 85"5 54"6 174.9 48 -4 202.0 67 "5 104'3 43"1 56"8 20"9 5"6 38 '8 42 "4 191 "0 52'3 33"8 80 "6 17'5 69-7 71.8 23-2 70.6 11-8 0"8 36-4 25-3 177.0 51.4 100.4 169"0 0.0 17.8 24.4 158-6 32"9
38
43 59 --48
Conformation E56 E68 3810 771 1620 8' 1 39 43 -58
E89
ElI0
El30
3814 761 1644
3795 749 1590
3839 769 1602
3667 718 1544
7"7
8'4
8"5
8'6
42 46 --55
63 50 -63 33 27 -33 -22
-21
1
25
27
24
30
33
--25
27
25
62 52 -62 34 26
65 50 - 62 31 26 -26 -21
27
21 -21
--27 27 21
29
--29
--43
--23
1
-37
--28
-53
-31 -50
-24 -42
--33 --69
-35 --71
--34 --72
--29
--28
--28
--31
--29
--29
NATURE
Property of residue
X
Ala48 Glu49 Asp50 Cys51 Met52 A rg53 Thr54 Cys55 Gly56 Gly57 Ala58
33"8 116.8 66'5 0'0 71-7 169"8 49-8 0'0 21.4 36.8 126.6
CX
OF
PROTEIN
El8
E32
- 32
- 30
DYNAMICS
Confbrmation E56 E68
649
E89
ElI0
-28
- 34
43 - 3i
El30
27
25
23 26 -46
-37
29 -,54
33 -43
34 -55
33 -49
33 -47
The accessible area is in A 2 and was calculated by the method o f Lee & Richards (1971) using atomic radii, given in Methods. section (b) (iii). t The non-polar residues are Val, Leu, Ile, Cys, Met, Phe, Tyr, Trp; the polar residues are Asn, Asp, Gin, Glu, Lys, Arg. :~ Tile energy is the sum of the change in surface area multiplied by 0"024 kcal/mol per A 2 for nonpolar residues and - 0 - 0 2 4 kcal/mol per A 2 for polar residues (Chothia, 1976). w accessible area of each residue is listed for the X conformation only, tbr tile other conformations the change in area is given if it exceeds 20 .-~-' in magnitude.
water molecules (see Fig. 9). Because the experimental S 2 parameters are so much smaller than the simulated values, there must be motional averaging occurring on time scales longer than 100 picoseconds. This motion is likely to involve spontaneous transitions from one conformation to another like that seen here for regions III and IV of the trajectory.
(f) Overall change in conformation From the results presented above, it is clear that the mean dynamic conformation remains close to the X-ray structure. This similarity is demonstrated by the stereoscopic drawing of the X and M conformations (see Figs II and 12). Careful scrutiny of the stereopairs does, however, reveal differences. The N and C-termini of the chain change to extend the 31o and a-helices (see section (c) (ii), above). There are small changes of orientation of some of the hydrophobic side-chains (see Fig. 11). For example, the rings of Phe4 and Phe45, and the end of Met52 are noticeably different. There are much more extensive changes of conformation for the hydrophilic side-chains (see Fig. 12). For example, Lysl5 is bent in X and extended in M, whereas Lys26 is extended in X and bent in M. Both Asp3 and Asp50 have different conformations in X and M. Arginine residues 17, 39, 42 and 53 also have different conformations. In general, the polar side-chains in the M conformation seem to interact more strongly with the rest of the molecule. This trend, which is due to the lack of solvent or solvent effects has been described in section (d), above.
M. L E V I T T
650
TABLE 7
Reorientation rates and order parameters
Residue Leu6 Alal6 Ile 18 Ilel9 Ala25 A la27 Ala40 Ala48 Met52 Ala58 Lysl5 26 41 46 Watl Wat2 War3 War4
Bond
ro(ps )
Here
C ;' C ~' C v C +2 C~--C ~ CP--C ''~ C~I--C ~l C~--C r2 C~'I--C 61 C~--C p C~--C # C~--C ~ C~--C ~ S~---C~ C~---CI~ C+--N ~ C~--N ~ C~--N ~ C~--N ~ O--H1 O--H2 O--H 1 O--H2 O--H 1 O--H2 O--H 1 O--H2
0'4 0'3 0"6 0"4 0-2 0"2 0"2 1 "6 1 '6 0"2 0"2 0-2 0.4 2-4 0"8 0.4 0"2 1.6 2-0 0-4 0-4 0-3 0-2 1.8 2-0
0"86 0'80 0-92 0"83 0"84 0'88 0"81 0'70 0'85 0-91 0'91 0-81 0.70 0"45 0"60 0"77 0'65 0"13 0-05 0-74 0'85 0"89 0"91 0'03 0-13
S 2 value LSLt
Expl:~
0"20 0'20 0'82 0'36 0"28 0'86 0'55 0"86 0'79 0"85 0"91 0'07 0"60
0"25 0"25 0"38 0"29 0"17 0"43 0"27 0"56 O'34 0"38 0'55 0"22 0'13
t T a k e n from Lipari et al. (1982) who used t h e B P T I 96 picosecond trajectory of K a r p l u s & M c C a m m o n (1979). Values e s t i m a t e d by Lipari & Szabo from the laC nuclear m a g n e t i c resonance relaxation d a t a of Richarz el al. (1980).
4. Discussion
(a) Nature of protein dynamics Molecular dynamics generate a complete description of atomic motion fbr the period simulated. This description is contained in the thousands of sets of atomic co-ordinates of the conformations along the trajectory; it is impossi'ble to comprehend the nature of the motion without very extensive analysis. In this section the main results obtained from the 132 picosecond simulation of BPTI dynamics are highlighted and incorporated into a simple conceptual model of protein dynamics. (i) Highlights of the trajectory The atoms vibrate about their time-averaged positions with amplitudes that are similar to those seen in the BPTI crystal structure. The main-chain atoms
NATURE
OF
PROTEIN
~"~58
651
DYNAMICS
58 (o)
b) Fl('. 11. Stereoscopic drawings showing the a-carbon backbone, the disulphide bridges and hydrophobic side-chains (Val, Leu, Ile, Met, Phe and Tyr) of (a) the X-ray conformation, X; and (b) the mean conformation for the period, 62 to 132 ps, M. These drawings and those in Fig. 12 were drawn using the program PLUTO written by Dr S. Motherwell at the Chemical Laboratories, Cambridge University: I am grateful to him for having provided such a powerful tool.
652
M. L E V I T T
(a)
c
(b)
Fro. 12. Stereoscopic drawings showing the a-carbon backbone, the disulphide bridges and hydrophilic side-chains (Asp, Glu, Lys and Arg) of: (a) the X-ray conformation. X : and (b) the mean conformation, M.
NATURE
OF P R O T E I N
DYNAMICS
653
vibrate least for residues in s-helix or fl-sheet secondary structure and most for the small residues in exposed turns or at the ends of the chain. The side-chain atoms vibrate most for the polar charged residues that are exposed on the surface of the molecule. This very rational distribution of vibration amplitudes is expected to hold for any protein. During the 132 picosecond trajectory, the protein explores four distinct regions of the conformational space. Two of the regions are probably occupied as part of the equilibration process, but the other two provide a clear example of a reversible conformational change that occurs spontaneously at room temperature. The main-chain torsion angles remain fairly close to the X-ray values (average r.m.s, deviation for r and r of 36 ~ and vibrate with an average r.m.s, amplitude of 18 ~ Most large changes in r and ~b are anticorrelated with Ar --ZJr This type of co-operative motion seen in the first simulation of protein dynamics (McCammon et al., 1977), causes reorientation of the peptide unit without affecting the positions of the adjacent ~-carbons. The side-chain torsion angles have larger amplitudes (32 ~ on average), and show co-operative conformational changes that involve three or four adjacent side-chains. These jumps between local conformational states of side-chains happen on a time scale of about ten picoseconds. The hydrogen bonds have different stabilities that can" be quantified by calculating free energies of hydrogen bond formation. There is a core of stable hydrogen bonds that remain formed for most of the trajectory. Other hydrogen bonds are more variable, breaking and reforming several times during the 132 picosecond simulation. Most of this variability is associated with hydrogen bonds involving the side-chains. The reversible conformational change that occurs in going from region I I I to IV of the trajectory is seen to involve the co-operative breakage of six hydrogen bonds at the two ends of the fl-hairpin. It is interesting that two of the four water molecules included in the calculations are involved in these changes. This local melting would allow exchange of protons with water molecules and is a clear example of local melting that has been used to explain hydrogen exchange experiments (Englander & Englander, 1978). The conformational change associated with this reversible melting lasts about 40 picoseconds and does not involve any major atomic shifts. The solvent accessible surface area of BPTI fluctuates during the simulation but the mean value is only 1"6% less than t h a t of the X-ray structure. Individual residues change their accessible areas during the trajectory and some of these changes are reversible and correspond to the local melting described above. The calculated order parameters show that there is surprisingly little reorientation of bond vectors; much more motion is seen in nuclear magnetic resonance measurements made on BPTI protein in solution.
(ii) A conceptual model of protein dynamics X-ray crystallography of protein molecules has as its end products the mean positions of the atoms and the temperature factors that give the amplitude of
654
M. LEVITT
vibration about this mean structure. As such, this provides a model of protein dynamics that is like the dynamics of the atoms in crystals of small organic molecules. This behaviour is just that expected for any solid in which the restoring forces that hold atoms to their mean positions depend linearly on the displacement, namely, a harmonic solid. The first computer simulation of protein dynamics (McCammon et al., 1977) gave a vetT different picture. The internal motion was described as fluid-like in that the local atomic motion is more like diffusion than vibration about a mean structure. Support for this description was provided by the specific heat value of 2-3 kB obtained in the simulation; this value lies midway between the values of a classical fluid (1.5 kB) and a classical solid (3 ]ca). Although this pioneering study highlighted the rich variety of motional phenomena that occur in folded proteins at room temperature, it is now clear that the conclusions were drawn from the initial part of the trajectory that is plagued by problems of equilibration. Much longer trajectories with the same potential gave a specific heat of 3"1 ke and did not show some of the more dramatic features seen before (Karplus & McCammon, 1979,1981). The present results, which have been obtained with optimised energy parameters and extensive equilibration, lead to a new conceptual model for protein dynamics. The protein molecule is seen to vibrate about several distinct time-averaged conformations that occupy different low-energy regions of the potential energy surface. The motion in any one region is of small amplitude. After a time that is of the order of 50 picoseconds, the conformation changes significantly and begins to vibrate in a new region. These conformational changes are extensive and co-operative; here six hydrogen bonds at the two ends of the molecule break in going from region III to IV. The jump from one region to another is rapid, taking less than three picoseconds, and is reversible. Because most time is spent making small amplitude vibrations about a mean structure, the specific heat calculated here (2"8 ks, see Results, section (a) (iii)) is close to that expected for a classical harmonic solid. This picture of protein motion suggests that the potential energy has many local minima in the vicinity of the X-ray structure. These minima are separated by energy barriers of different heights, with the barrier height depending on the extent of the conformational change. The simplest changes involving jumps between different equilibrium positions of side-chain torsion angles (e.g., gauche and trans forms) occur on a time scale of ten picoseconds. The conformational change seen here in the move from region III to IV involves the co-operative breaking of hydrogen bonds and occurs on a time scale of 50 picoseconc[s. Longer simulations would probably reveal other more extreme changes that occur on longer time scales. At normal temperatures, very large conformational changes that lead to denaturation of the native structure must be very rare. The potential surface can be thought of as a rather uneven egg-tray bounded by high walls that keep the protein close to the native conformation (within 1"5 A r.m.s.). The various minimum energy conformations will have most of the stabilising interactions found in the X-ray structure, but still be distinct from it in terms of their r.m.s, deviation.
NATURE OF PROTEIN DYNAMICS
655
(b) Agreement with experiment The differences between the atomic co-ordinates in the time-averaged conformation and the X-ray structure are smaller here than in all previous in vacuo simulations (see the accompanying paper). The deviations of the core s-carbons (of residues 3 to 56) are even smaller than those obtained for simulations that include the solvent or crystal environment (van Gunsteren & Karplus, 1981,1982). Ill spite of this improvement the r.m.s, deviation obtained in the Present simulation (Ar~,=l.1 A) is still much larger than any expected error ill the X-ray co-ordinates or any change in BPTI conformation caused by crystal packing forces (the r.m.s, deviation between the X-ray structures of BPTI alone (Deisenhofer & Steigemann, 1975) and complexed with trypsin (Huber et al., 1974) is less than 0"4 A). The temperature factors calculated from the simulation are of the same general magnitude as those deduced from the X-ray refinement. The mean temperature factors of the six most common classes of atoms agree well with the corresponding X-ray values (correlation coefficient of 0"95). When the vibration amplitudes calculated from the temperature factors are averaged over all the atoms in each residue, the correlation coefficient with the corresponding X-ray values is 0"46. This is not as good as the agreement with experiment obtained in the simulation of cytochrome c dynamics (Northrup et al., 1980), which gave a correlation coefficient of 0"70. The level of agreement obtained here is considered satisfactory for a simulation that uses approximate potential energy functions, omits the solvent or crystal environment and studies the motion for such a short period. The X-ray temperature factors are also subject to error due to insufficiently high resolution (1"5 A for BPTI), static disorder, in that different molecules in the crystal are not identical, and overall libration of the protein molecule in the unit cell. The solvent and crystal environments would be expected to reduce the motion of the exposed charged side-chains that are calculated to vibrate too much, whereas the correction for disorder and overall libration would bring the X-ray amplitudes of the main-chain atoms closer to those calculated here. The hydrogen bonds of the X-ray structure are generally preserved in the simulation; the major changes in the pattern of hydrogen bonds occurs at the two ends of the chain. The 14 hydrogen bonds that are calculated to be stable, and not bifurcated or to a water molecule, involve 13 of the 17 protons measured by nuclear magnetic resonance to be most protected from exchange with solvent protons (Wagner & Wiithrich, 1982). The agreement between calculated and measured order parameters (S 2) is less good. In the present 132 picosecond simulation, there is much less bond reorientation than seen in 13C nuclear magnetic resonance measurements (Richarz et al., 1980; Lipari et al., 1982). There is a better agreement between experiment and the order parameters calculated from the last 96 of a 168 picosecond simulation of BPTI dynamics (Lipari et al., 1982; Karplus & McCammon, 1979), although there is still less motion in the simulation than seen experimentally. It would be interesting to compare with experimental values the S 2 values calculated from the one simulation that included solvent (van
656
M. LEVITT
Gunsteren & Karplus, 1981,1982), but these values have not been published. In the longer time scale of the nuclear magnetic resonance measurements, new lowenergy conformations may be explored, leading to the greater motional averaging seen experimentally. This is disappointing as the experimentally measured order parameters seemed to offer a way to test the reliability of the simulations. (c) Future implications Potential energy functions and methods used here are in no way specific to the one protein studied; the potential has not been optimized for BPTI and directional hydrogen bonding is expected to work well for any other protein. This has been demonstrated in preliminary simulations of the dynamics of trypsin aimed at calculating the stability of the hydrogen bonds for comparison with hydrogen exchange as measured by neutron diffraction (Kossiakoff, 1982). Calculations are also under way using the present methods to simulate the first steps in the release of oxygen from haemoglobin. These calculations are on proteins two to four times larger than BPTI and proper analysis of the trajectories is more difficult. The techniques used here for the analysis of the BPTI in vacuo trajectory will be very useful for these much larger systems. Tables showing the changes in sidechain torsion angles, hydrogen bonds and solvent-accessible area are all easily obtained for larger systems. The two-dimensional projection of the conformational space that shows regions explored by the moving protein should work as well for a 6000-dimensional space as for the 1581-dimensional BPTI space. The plot of the average level of the potential surface, Uo, against time is a very useful tool that can be incorporated into any simulation of molecular dynamics. It provides a view of undulations in the potential energy that is confirmed by the more difficult energy minimisation calculations. The tendency of the dynamics algorithm to move the conformations to the regions of lowest potential energy provides a technique for annealing conformations and escaping from local minima that has already been used in simulations of DNA dynamics (Levitt, 1983c) and protein folding (Levitt, 1983b). 5. C o n c l u s i o n
The present in-depth analysis of a 132 picosecond simulation of the atomic motion in BPTI protein has led to a new conceptual model of protein dynamics. The protein is seen to vibrate in a particular low-energy region of the energy surface for tens of picoseconds before jumping over a barrier into a neighbouring region. The changes associated with these jumps involve the co-operative motions of several residues and significant changes in the pattern of hydrogen bonding. The simulated atomic vibration amplitudes and stability of hydrogen bonds fit the temperature factors measured by X-ray refinement and the proton exchange rates measured by nuclear magnetic resonance experiments. This agreement and the small r.m.s, deviation from the X-ray structure indicate that the dynamics have been simulated realistically for native BPTI protein. The potential energy functions, methods of simulation and methods of analysis
NATURE OF PROTEIN DYNAMICS
657
used in this and the accompanying paper provide a powerful scheme for the s t u d y of both the static and dynamic properties of biologically i m p o r t a n t macromolecules. I thank the Weizmann Institute for providing ample computer resources and such a pleasant working environment. REFERENCES Chothia, C. (1975). Nature (London), 254, 304-308. Chothia, C. (1976). J. Mol. Biol. 105, 1-14. Deisenhofer, J. & Steigemann, W. (1975). Acta Crystallogr. sect. B, 31,238-250. Diamond, R. (1974). J. Mol. Biol. 82, 371-391. Eckhart, C. (1935). Phys. Rev. 47, 552-558. Englander, S. W. Englander, J. J. (1978). Methods Enzymol. 49, 24-39. Fletcher, R. (1970). Computer J. 13, 317-322. Huber, R., Kukla, D., Ruhlmann, A. & Steigemann, W. (1971). Cold Spring Harbor Syrup. Quant. Biol. 36, 141-150. Huber, R., Kukla, D., Bode, W., Schwager, P., Bartels, K., Deisenhofer, J. & Steigemann, W. (1974). J. Mol. Biol. 89, 73-101. Kabsch, W. (1976). Acta Crystallogr. sect. A, 32, 922-923. Karplus, M. & Mc(!ammon, J. A. (1979). Nature (London), 277, 578-579. Karplus, M. & McCammon, J. A. (1981). CRC Crit. Rev. Biochem. 9, 293-349. Kossiakoff, A. A. (1982). Nature (London), 296, 713-721. Lee, B. & Richards, F. H. (1971). J. Mol. Biol. 55, 379-400. Levitt, M. (1980). In Protein Folding (Jaenicke, R., ed.), pp. 17-39, Elsevier/NorthHolland, Amsterdam. Levitt, M. (1981a). Nature (London), 294, 379-380. Levitt, M. (1981b). Ann. N . Y . Acad. Sci. 367, 162-181. Levitt, M. (1983a). J. Mol. Biol. 168, 595-620. Levitt, M. (1983b). J. Mol. Biol. in the press. Levitt, M. (1983c). Cold Spring Harbor Syrup. Quant. Biol. 47, in the press. Levitt, M. & Feldmann, R. J. (1981). In Structural Aspects of Recognition and Assembly in Biological Macromolecules (Balaban, M., ed.), pp. 35-56, Elsevier/North-Holland, Amsterdam. Levy, R. M. & Szabo, A. (1982). J. Amer. Chem. Soc. 104, 2073-2075. Lipari, G., Szabo, A. & Levy, R. M. (1982). Nature (London), 300, 197-198. McCammon, J. A., Gelin, B. R. & Karplus, M. (1977). Nature (London), 267, 585-589. McCammon, J. A., Wolynes, P. G. & Karplus, M. (1979). Biochemistry, 18, 927-942. MeLachlan, A. D. (1972). Acta Crystallogr. sect. A, 28, 656-657. McLachlan, A. D. (1979). J. Mol. Biol. 128, 49-79. Northrup, S. H., Pear, M. R., MeCammon, J. A., Karplus, M. & Takano, T. (1980). Nature (London), 287,659-660. Nyberg, S. C. (1974). Acta Crystallogr. sect. A, 30, 251-253. Rieharz, R., Nagayama, K. & Wiithrich, K. (1980). Biochemistry, 19, 5189-5196. van Gunsteren, W. F. & Karplus, M. (1981). Nature (London), 293, 677-678. van Gunsteren, W. F. & Karplus, M. (1982). Biochemistry, 21, 2259-2274. Wagner, G. & Wfithrich, K. (1982). J. Mol. Biol. 160, 343-361. Edited by R. Huber Note added in proof: The method used here (see Methods, section (a) (ii)) to calculate 2-dimensional displays of the dynamics trajectories is in fact closely related to an established statistical method known as multidimensional scaling (see Kendall, D.G. (1971). Nat~re (London), 231, 158-159: and references cited therein).