Biochimica et Biophysica Acta, 303 (1973) 14-27 © Elsevier Scientific Publishing Company, Amsterdam - Printed in The Netherlands
BBA
36359
CONFORMATION OF THE LL AND LD HAIRPIN BENDS WITH INTERNAL HYDROGEN BONDS IN PROTEINS AND PEPTIDES*
R. CHANDRASEKARAN a,**, A. V. LAKSHMINARAYANAN a,**, U. V. PANDYA b AND G. N. RAMACHANDRAN a,b
a Department of Biophysics, University of Chicago, Chicago, Ill. 60637 (U.S.A.) and b Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012 (India)
(Received October 6th, 1972)
SUMMARY
The conformation of three linked peptide units having an internal 4 → 1 type of hydrogen bond has been studied in detail, and the low energy conformations are listed. These conformations all lead to the reversal of the chain direction, and may therefore be called "hairpin bends" or "U-bends". Since this bend can occur at the end of two chains hydrogen-bonded in the antiparallel β-conformation, it is also known as the "β-bend". Two types of conformation are possible when the residues at the second and third Cα atoms are both of type L (the LL bend), while only one type is possible for the LD and the DL bend. The LL bend can also accommodate the sequences LG, GL, GG (G = glycine), while the LD bend can accommodate the sequences LG, GD and GG. The conformations for the sequences DD and DL are exact inverses (or mirror images) of those for the sequences LL and LD, respectively, and have dihedral angles (φ2, ψ2), (φ3, ψ3) of the same magnitudes as, but of opposite signs to, those for the former types, which are listed along with the characteristics (length, angle and energy) of the hydrogen bonds. A comparison of the theoretical predictions with experimental data (from X-ray diffraction and NMR studies) on proteins and peptides shows reasonably good agreement. However, a systematic trend is observable in the experimental data, slightly deviating from theory, which indicates that some deformations occur in the shapes of the peptide units forming the bend, differing from that of the standard planar peptide unit.
INTRODUCTION
During a recent study¹ of the conformations of peptide chains containing alternating L- and D-amino acid residues, it was found that a sequence of an L-residue followed by a D-residue (or vice versa, a D-residue followed by an L-residue) has the property of assuming a special type of folding, in which the three peptide units linked together produce a reversal in chain direction. This conformation has low energy, and further, it is stabilised by a hydrogen bond between the NH group of the third peptide unit and the carbonyl oxygen of the first peptide unit (Fig. 1). A similar folded conformation was reported from the senior author's laboratory by Venkatachalam², who had employed only stereochemical yes-or-no criteria to characterise the structure so generated. Venkatachalam² only considered L-amino acid residues (including glycine, which we shall denote by G) and found that there are two types of "hairpin bends" (as we shall call them); one type which can accommodate the sequence LL at the bend, and the other type which can accommodate only the sequence LG. The term "β-bend" is also a suitable name for this bend, as it can occur at the end of two peptide chains in an antiparallel β-conformation. However, a preliminary examination of the conformational map of a dipeptide with a D-α-carbon atom shows that in most cases the G in the LG bend can be replaced by a D-amino acid residue, though not by an L-residue, thus leading to an LD bend.

Fig. 1. The LL and the LD hairpin bends having the hydrogen bond N4-H4···O1. The LL bend in (a) has L residues at both Positions 2 and 3, while the LD bend in (b) has an L residue in Position 2 and a D residue in Position 3. Both of them lead to the reversal in chain direction, as shown by the curved arrow.

The above statements are purely qualitative, but they indicate that a more detailed study of the LL and LD bends is desirable. Further, the occurrence of reversal of chain direction is a key feature for the closure of the ring in cyclic structures, occurring for example in antibiotics like gramicidin. It may be of importance for the compact folded structures of globular proteins also. In view of these, a detailed study has been made of the potential energy of peptide sequences containing the amino acid residues L-Ala-L-Ala, including the energy of the hydrogen bonds which may occur in such a structure. It is found that the low energy conformations obtained from theory are close to the conformations observed in the crystal structures of some open and cyclic oligopeptides and also similar to those found in the crystalline protein structures which have been solved by X-ray diffraction.

* Communication No. 28 from the Molecular Biophysics Unit, Indian Institute of Science.
** Present address: Department of Biological Sciences, Purdue University, Lafayette, Indiana 47907 (U.S.A.).
METHOD OF CALCULATION
The molecular fragment on which the calculations have been made is shown in Fig. 2. Trans planar peptide units of standard dimensions³ were used in all the calculations. Hence the conformation of the unit may be specified by the rotational angles (φ2, ψ2) and (φ3, ψ3) at the intermediate α-carbon atoms Cα2 and Cα3. The designation of the conformational angles and other notations follow the IUPAC-IUB conventions⁴. The alanyl side group of Cα2 is taken to be in the L-configuration, and we have considered both L- and D-configurations at Cα3. We have not used the DD or the DL combinations for the computations, since every conformation belonging to these types is the inverse of a conformation of an LL or an LD sequence, as the case may be. For example, the conformation (-φ2, -ψ2), (-φ3, -ψ3) of a DD fragment has the same type of hydrogen bond, and the same total energy, as an LL fragment with the dihedral angles (φ2, ψ2), (φ3, ψ3). Although we have considered only alanyl residues, closely similar backbone conformations may be expected for the other residues as well. However, proline side-chains at Cα2 or Cα3 may have a particularly stabilizing action, as discussed later in this paper.

Fig. 2. Designation of atoms in the three peptide units involved in the hairpin bend, with the dihedral angles at Cα2 and Cα3 also marked.

From the previous calculations on the conformation of two linked peptide units (see for instance ref. 3) we know that the regions of low energy of the dipeptide may be taken to be the following:

φ = -180° to -30°, ψ = -70° to +190° (-170°)
and φ = 40° to 80°, ψ = 20° to 100°,

for the conformation at an L-alanyl Cα atom. The corresponding values for the D-alanyl residue will be

φ = 30° to 180°, ψ = 170° (-190°) to 70°
and φ = -80° to -40°, ψ = -100° to -20°.
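The low-energy ranges quoted above lend themselves to a simple membership test. The following is a minimal Python sketch, not part of the original calculations: the names `L_ALA_RANGES`, `in_arc`, `allowed_L` and `allowed_D` are illustrative assumptions, and the ψ range written as -70° to +190° is treated as an arc that wraps past 180°. The D-residue test simply applies the inversion (φ, ψ) → (-φ, -ψ) described in the text.

```python
# Low-energy (phi, psi) ranges for an L-alanyl residue, in degrees, as quoted in the text.
# Each entry is ((phi_lo, phi_hi), (psi_lo, psi_hi)); a psi limit of +190 wraps past 180.
L_ALA_RANGES = [
    ((-180, -30), (-70, 190)),
    ((40, 80), (20, 100)),
]


def in_arc(angle, lo, hi):
    """True if `angle` lies on the arc running from `lo` up to `hi` (periodic in 360)."""
    return (angle - lo) % 360 <= (hi - lo) % 360


def allowed_L(phi, psi):
    """True if (phi, psi) falls in one of the low-energy ranges for an L-residue."""
    return any(
        in_arc(phi, *phi_range) and in_arc(psi, *psi_range)
        for phi_range, psi_range in L_ALA_RANGES
    )


def allowed_D(phi, psi):
    """A D-residue is treated as the inverse of an L-residue: test (-phi, -psi)."""
    return allowed_L(-phi, -psi)


if __name__ == "__main__":
    for phi, psi in [(-60, -50), (60, 40), (60, -40)]:
        print(phi, psi, "L:", allowed_L(phi, psi), "D:", allowed_D(phi, psi))
```

Writing the D-residue test as an inversion of the L-residue test, rather than as a second table of limits, keeps the two sets of ranges consistent by construction.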
These regions are mapped in Fig. 3. The values of (φ2, ψ2) and (φ3, ψ3) were varied in the appropriate regions, the existence of a hydrogen bond N4-H4···O1 between the third and the first peptide units was searched for, and its energy was calculated. It was stipulated that the conformations in which the length (R) N4···O1 lies between 2.6 and 3.2 Å and the angle (θ) between the N4-H4 and N4···O1 directions is less than 35° could be treated as examples having internal hydrogen bonds. While doing this, the above ranges of φ and ψ were expanded to include conformations which were somewhat higher in energy than the minimum for non-bonded interactions, since they may become favourable if the N4-H4···O1 bond is formed.

Fig. 3. Conformational maps showing the regions occupied by commonly occurring conformations (φ, ψ) for (a) L-alanyl and glycyl residues and (b) D-alanyl and glycyl residues.

For each possible conformation, the non-bonded energy was calculated using the parameters given in ref. 3. The hydrogen bond energy was taken to be given by the following expression, as obtained in an earlier study from our laboratory⁵, for the hydrogen bonds N4-H4···O1 between non-neighbouring peptide units:

$$V_{hb}\ (\text{kcal/mole}) = -4.5 + 25\,(R - 2.95)^{2} + 0.001\,\theta^{2} \tag{1}$$

where R is the distance in Å between the N and O atoms in the NH···O hydrogen bond, and θ is the angle in degrees between the NH and N···O directions. Preliminary calculations indicated that a hydrogen-bonded conformation for the LL sequence is possible in two regions given by (φ, ψ) values as follows:

Region Ia, (-80° to -20°, -90° to -10°), (-150° to -70°, 10° to 80°) and
Region Ib, (-80° to -30°, 80° to 140°), (20° to 80°, 10° to 70°).

If the second residue is D-alanine, then it was found that there is only one region of
TABLE I
DIFFERENT TYPES OF HAIRPIN BENDS AND THEIR LIKELY CONFORMATIONS AND ENERGIES OF STABILIZATION

Columns: dihedral angles φ2, ψ2, φ3, ψ3 (°); hydrogen bond parameters R (Å), θ (°), Vhb (kcal/mole); total stabilization energy V (kcal/mole).

    φ2     ψ2     φ3     ψ3       R       θ      Vhb        V

Region Ia; Type LL (also possible for LG, GL, GG)
   -50    -50   -110     40    2.97    21.3    -4.05    -6.53
   -50    -60    -90     40    2.95    33.3    -2.94    -6.33
   -50    -50   -110     50    3.05    32.0    -3.27    -6.31
   -50    -60   -100     40    3.05    23.8    -3.78    -6.21
   -40    -50   -100     50    3.00    16.7    -4.19    -6.15
   -40    -60   -110     50    2.90    26.3    -3.66    -6.07
   -50    -40   -120     50    2.96    30.8    -3.43    -6.05
   -50    -40   -120     40    2.88    19.4    -3.93    -6.01
   -40    -60   -120     50    3.08    17.4    -3.87    -5.86
   -50    -60   -100     50    3.14    33.8    -2.58    -5.85
   -50    -40   -130     40    3.05    10.8    -4.17    -5.83
   -40    -50   -130     40    2.96     4.7    -4.48    -5.77
   -50    -50   -110     30    2.93    14.2    -4.28    -5.77
   -40    -70   -100     50    2.98    28.5    -3.72    -5.72
   -40    -50   -130     60    3.08    28.7    -3.45    -5.71
   -50    -50   -120     40    3.13    12.5    -3.55    -5.68
   -40    -50   -120     50    2.82    24.8    -3.25    -5.66
   -40    -60   -110     40    2.83    15.7    -3.84    -5.59
   -50    -70    -90     40    3.14    26.6    -3.14    -5.56
   -50    -50   -100     40    2.86    30.8    -2.69    -5.55
   -60    -40   -100     30    2.99    26.5    -3.82    -5.49
   -60    -40   -110     40    3.14    27.4    -3.09    -5.48
   -50    -60   -100     30    3.01    17.4    -4.13    -5.43
   -60    -30   -120     40    3.05    26.0    -3.70    -5.39
   -60    -40   -110     30    3.08    18.5    -3.82    -5.39

Region Ib; Type LL (also possible for LG, GL, GG)
   -60    100     60     40    3.03    17.1    -4.10    -6.29
   -60    100     60     30    2.99     5.3    -4.44    -6.10
   -60    110     60     40    2.98    25.3    -3.88    -5.98
   -60    110     60     30    2.91    14.6    -4.24    -5.86
   -60    110     50     40    3.10    17.2    -3.72    -5.86
   -50    100     50     50    2.90    25.3    -3.74    -5.82
   -50    100     50     40    2.83    13.0    -3.96    -5.72
   -60    100     60     50    3.11    29.1    -3.25    -5.61
   -60    120     50     40    3.07    25.7    -3.63    -5.60
   -50    110     40     50    2.99    24.6    -3.91    -5.38
   -60    110     50     30    3.06     7.0    -4.15    -5.36
   -60    100     50     40    3.19    11.1    -3.13    -5.34

Regions IIa and IIb; Type DD (also possible for DG, GD, GG)
Conformations inverse to those listed in Regions Ia and Ib are of this type. The dihedral angles are of opposite signs, while the hydrogen bond parameters and stabilization energy are equal to those given for Regions Ia and Ib. Thus, the minimum energy conformations for these two regions are

Region IIa (minimum energy conformation)
    50     50    110    -40    2.97    21.3    -4.05    -6.53
Region IIb (minimum energy conformation)
    60   -100    -60    -40    3.03    17.1    -4.10    -6.29

Region III; Type LD (also possible for LG, GD, GG)
   -60    100     60     40    3.03    17.1    -4.10    -6.99
   -60    100     70     40    2.91    24.6    -3.79    -6.80
   -60     90     70     40    2.98    17.1    -4.20    -6.77
   -60    100     60     30    2.99     5.3    -4.44    -6.66
   -60    110     60     40    2.98    25.3    -3.88    -6.63
   -70     90     90     30    3.05    15.8    -4.04    -6.53
   -60    100     70     30    2.84    13.0    -4.00    -6.47
   -60    110    130    -40    2.99    24.2    -3.92    -6.44
   -60     90     70     30    2.94     4.7    -4.48    -6.44
   -60    110     60     30    2.91    14.6    -4.24    -6.38
   -70    100     80     30    3.09    17.4    -3.77    -6.36
   -50    140     90    -40    2.98    32.2    -3.25    -6.36
   -60    100     60     50    3.11    29.1    -3.25    -6.34
   -70    100     90     30    3.03    25.0    -3.81    -6.30
   -60     90     70     50    3.06    29.1    -3.51    -6.29
   -50    120    120    -40    2.91    18.3    -4.11    -6.27
   -60    100    140     40    2.91    24.5    -3.81    -6.27
   -50    120    120    -50    2.99    29.5    -3.65    -6.26
   -60     90     80     40    2.86    23.8    -3.62    -6.24
   -70     90    100     30    2.98    23.1    -3.97    -6.22
   -70     90     90     40    3.13    27.6    -3.18    -6.21
   -50    100     50     50    2.90    25.3    -3.74    -6.19
   -50    130    110     40    3.00    20.3    -4.06    -6.18
   -70     90     90     20    3.02     5.8    -4.35    -6.15
   -70    100     90     20    2.97    14.8    -4.28    -6.15
   -60    110    130    -30    2.93    13.6    -4.30    -6.14
   -60    110     50     40    3.10    17.2    -3.72    -6.12
   -50    110    140    -40    3.00     9.3    -4.36    -6.11
   -60    100    150    -40    3.03    16.9    -4.10    -6.10
   -60    120    120    -40    3.08    24.9    -3.62    -6.10
   -70    100    130    -30    3.02    26.4    -3.76    -6.09
   -50    100     50     40    2.83    13.0    -3.96    -6.06
   -50    110    140    -50    3.06    21.4    -3.84    -6.05
   -70    100     80     20    3.06     8.8    -4.14    -6.05
   -50    110    130    -50    2.91    29.1    -3.42    -6.01
   -60    120    120    -30    3.02    15.4    -4.17    -6.00
   -50    100    150    -50    2.97    22.5    -4.00    -5.97
   -60    120    110    -30    2.92    23.4    -3.89    -5.95
   -70     90    100     20    2.93    12.1    -4.33    -5.92
   -50    130    110    -50    3.08    30.8    -3.29    -5.91
   -70    100     80     40    3.17    28.8    -2.82    -5.90
   -60     90     60     40    3.14    12.4    -3.46    -5.90
   -50    110    130    -40    2.83    17.2    -3.75    -5.90
   -70     90     80     30    3.17     9.1    -3.33    -5.88
   -70     90     80     40    3.22    21.0    -2.76    -5.83
   -60    120    110    -40    3.00    33.6    -3.08    -5.83
   -60    120     50     40    3.07    25.7    -3.63    -5.83
   -50    100    150    -40    2.91    10.3    -4.35    -5.80

Region IV; Type DL (also possible for GD, GL, GG)
These conformations are inverse to those listed in Region III. The dihedral angles are of opposite signs, while the hydrogen bond parameters and stabilization energies are the same as those given for Region III. The minimum energy conformation is

Region IV (minimum energy conformation)
    60   -100    -60    -40    3.03    17.1    -4.10    -6.99
hydrogen-bonded conformations, characterised by the (φ, ψ) values: Region III, (-90° to -30°, 70° to 160°), (30° to 170°, -70° to +80°).
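As an illustration only, the three regions found in the preliminary search, together with their inverses, can be written as a small lookup. This Python sketch is a convenience under stated assumptions, not the authors' program: the names `REGIONS`, `INVERSE_NAME` and `regions_for` are hypothetical, the limits are those read from the text, and the inverse regions (IIa, IIb, IV) are obtained by reversing the signs of all four dihedral angles.

```python
# (phi2, psi2, phi3, psi3) limits in degrees for the hydrogen-bonded regions,
# as read from the text; each pair is (lower limit, upper limit).
REGIONS = {
    "Ia": ((-80, -20), (-90, -10), (-150, -70), (10, 80)),
    "Ib": ((-80, -30), (80, 140), (20, 80), (10, 70)),
    "III": ((-90, -30), (70, 160), (30, 170), (-70, 80)),
}
# Names of the inverse regions, reached by changing the sign of every angle.
INVERSE_NAME = {"Ia": "IIa", "Ib": "IIb", "III": "IV"}


def _inside(angles, limits):
    """True if every angle lies within its (lower, upper) limits."""
    return all(lo <= a <= hi for a, (lo, hi) in zip(angles, limits))


def regions_for(phi2, psi2, phi3, psi3):
    """List every region (direct or inverse) whose limits contain the conformation."""
    angles = (phi2, psi2, phi3, psi3)
    inverted = tuple(-a for a in angles)
    hits = [name for name, lim in REGIONS.items() if _inside(angles, lim)]
    hits += [INVERSE_NAME[name] for name, lim in REGIONS.items() if _inside(inverted, lim)]
    return hits


if __name__ == "__main__":
    print(regions_for(-50, -50, -110, 40))
    print(regions_for(-60, 100, 60, 40))
    print(regions_for(60, -100, -60, -40))
```

A backbone conformation such as (-60°, 100°), (60°, 40°) is reported in both Region Ib and Region III by this lookup; the two differ in the configuration at Cα3, which the backbone limits alone do not encode.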
It may be noted that Region Ib for LL is a smaller section of the dihedral angle ranges covered by Region III for LD. Also, if the second residue is glycine, then obviously, the domain with hydrogen bonds would consist of all the above three regions put together. A detailed search was then made in Regions Ia, Ib and III, varying the dihedral angles in steps of 10°, and the results are shown in Table I.

RESULTS OF THE CALCULATIONS IN COMPARISON WITH EXPERIMENTAL DATA
It was found that the LD pair has the most stable conformation at (-60°, 100°), (60°, 40°) with a hydrogen bond of length 3.0 Å and an angle of 17° between the NH and N···O directions. The total energy of this conformation is -7.0 kcal/mole with a contribution of -4.1 kcal/mole from the hydrogen bond. If, in this backbone conformation, the absolute configuration at Cα3 is changed from D to L, the total energy increases to -6.3 kcal/mole. This is also the minimum energy in Region Ib for any LL tripeptide. In addition, the LL tripeptide has another minimum energy conformation, (-50°, -50°), (-110°, 40°) in Region Ia. The hydrogen bond length for this is 3.0 Å and the angle is 21°. This contributes -4.05 kcal/mole to the total energy of -6.5 kcal/mole. The three minimum energy conformations are shown in Fig. 4, in a stereoscopic drawing. It can be seen that, in all the three conformations, the first and the third peptide units are nearly in a plane, while the middle unit is approximately normal to this plane. Also, the third peptide unit folds back towards the first one to form the N4-H4···O1 bond. As a result, a sequence containing three peptide units, with an L-residue followed by an L- or a D-residue in this conformation, may have the property of producing a reversal in the chain direction. Table I lists the low-energy 4 → 1 hydrogen-bonded conformations of both LL and LD tripeptides in the Regions Ia, Ib and III, which have a total energy within 1.2 kcal/mole of the respective minimum energies. Conformations having energy outside this range have less than 10% probability of being found. It can be seen that, among the conformations of low energy, the hydrogen bond length varies from 2.8 to 3.1 Å and the angle from 5° to 34°. Thus, a good linear hydrogen bond is possible in some of these conformations, although the minimum energy conformation has a somewhat non-linear hydrogen bond.
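The hydrogen-bond criterion and Eqn 1 of the Method section can be sketched in a few lines of Python. This is a minimal illustration rather than the original program: the function names `is_hydrogen_bonded` and `hbond_energy` are assumptions, and the angular term is taken as 0.001 θ², which is the reading of the printed expression adopted here.

```python
def is_hydrogen_bonded(r, theta):
    """Criterion from the text: N...O length 2.6-3.2 Å and angle theta below 35°."""
    return 2.6 <= r <= 3.2 and theta < 35.0


def hbond_energy(r, theta):
    """Eqn 1 as read here, in kcal/mole: -4.5 + 25 (R - 2.95)^2 + 0.001 theta^2.

    r is the N...O distance in Å; theta is the angle in degrees between
    the N-H and N...O directions.
    """
    return -4.5 + 25.0 * (r - 2.95) ** 2 + 0.001 * theta ** 2


if __name__ == "__main__":
    # Two (R, theta) pairs taken from Table I.
    for r, theta in [(2.99, 5.3), (3.03, 17.1)]:
        print(r, theta, is_hydrogen_bonded(r, theta), round(hbond_energy(r, theta), 2))
```

With the rounded R and θ values printed in Table I, this expression gives energies close to, though not always identical with, the tabulated Vhb values (for example about -4.43 against -4.44 for R = 2.99 Å, θ = 5.3°, and about -4.05 against -4.10 for R = 3.03 Å, θ = 17.1°).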
In order to compare the relative ease of occurrence of the LD or the LL bend for a tripeptide, we may count the number of examples of each type in Table I having energies below -5.8 kcal/mole (which is 1.2 kcal/mole above the absolute minimum, namely, -6.99 kcal/mole for the LD type). It is found that the number of such low energy conformations is 48 for the LD type (Region III) but only 19 for the LL type (Regions Ia and Ib). In addition, the ranges of conformational angles over which the energy is low are wider for the LD than for the LL bend.

As already mentioned, either of the residues at Cα2 or Cα3 may be replaced by a glycyl residue. In that case, all the conformations listed in Table I, Regions Ia and Ib, become available for an LG or GL sequence, while their inverses (Regions IIa and IIb) are possible for a GD or a DG sequence. Hence, in a protein structure with L-residues, a hairpin bend may be expected to contain a glycyl residue quite often, and similarly, in peptides (like antibiotics) containing both L- and D-residues, a mixed sequence, such as LD or DL, will be the likely site for the bend leading to cyclisation.

Fig. 4. Stereoscopic drawings showing the preferred folded conformations corresponding to (a) L-Ala-L-Ala (Region Ia), (b) L-Ala-L-Ala (Region Ib), (c) L-Ala-D-Ala (Region III). The diagrams were obtained using the computer program ORTEP prepared by Johnson²².

The folded structure we have described here is thus of great importance in cyclic oligopeptides, wherein a chain reversal could be readily made use of for closure of the ring. In fact, in the study of the conformations of cyclic penta- and hexapeptides, containing glycyl and L- or D-alanyl residues, made by Ramakrishnan and Sarathy⁶,⁷, the low energy conformations have this folding with an internal hydrogen bond. The crystal structures of cyclohexaglycyl and the cyclohexapeptide in ferrichrome A in fact contain these bends, as will be discussed (see Table III). From what has been discussed earlier, we may also say that in cyclic peptides having both L- and D-residues, the key points of chain reversal would be expected to be in the LD or LG sequences (also DL and GL) rather than in the LL sequences. They are likely, therefore, to be the key factors in the formation of the folded conformation of antibiotics⁸, which have mixed L- and D-residues. On the other hand, the LL and the LG bends have an importance in the folding of polypeptides. Thus in globular proteins, such a chain reversal would enable different parts of the molecule to come close together, resulting in the formation of a compact structure. Actually,

TABLE II
CONFORMATIONS SIMILAR TO HYDROGEN-BONDED HAIRPIN BENDS OBSERVED IN LYSOZYME AND CHYMOTRYPSIN

Dihedral angles (in degrees) are given as (φ2, ψ2); (φ3, ψ3).

Residue at Cα2   Residue at Cα3   Observed                     Nearest from theory

Lysozyme
55 Ile           56 Leu           (-46, -35); (-108, -10)      (-60, -40); (-110, 30)
70 Pro           71 Gly           (-48, -39); (-81, -9)        (-50, -60); (-90, 40)
75 Leu           76 Cys           (-56, -45); (-99, 21)        (-60, -40); (-100, 30)
96 Lys           97 Lys           (-63, -51); (-75, -18)       (-60, -40); (-100, 30)
123 Trp          124 Ile          (-64, -38); (-97, -4)        (-60, -40); (-100, 30)

Chymotrypsin
24 Pro           25 Gly           (-50, 117); (93, -11)        (-60, 120); (110, -30)
28 Pro           29 Trp           (-36, -40); (-92, -12)       (-50, -50); (-110, 30)
73 Gln           74 Gly           (-54, -46); (-81, -5)        (-50, -50); (-100, 40)
92 Ser           93 Lys           (-33, -34); (-121, 28)       (-50, -40); (-120, 40)
100 Asn          101 Asn          (-90, 157); (36, 63)         (-60, 120); (50, 40)
116 Gln          117 Thr          (-51, -11); (-116, 2)        (-60, -30); (-120, 40)
126 Ala          127 Ser          (-47, -41); (-78, 7)         (-50, -60); (-100, 30)
132 Ala          133 Gly          (-46, 120); (97, -24)        (-50, 130); (110, -40)
173 Gly          174 Thr          (62, -141); (-38, -26)       (60, -120); (-50, -40)
178 Asp          179 Ala          (-27, -46); (-93, 30)        (-50, -50); (-110, 30)
192 Met          193 Gly          (-64, 148); (99, -23)        (-50, 140); (90, -40)
195 Ser          196 Gly          (-37, 135); (95, -25)        (-50, 140); (92, -40)
218 Ser          219 Thr          (-57, -51); (-100, 26)       (-50, -50); (-110, 30)
TABLE III
THE HAIRPIN BEND FOUND IN SOME PEPTIDE CRYSTAL STRUCTURES

Dihedral angles are in degrees; the hydrogen bond length N4···O1 is in Å.

Structure                                          Ref.  Sequence        φ2     ψ2     φ3     ψ3    N4···O1

Region Ia; Type LL
Minimum energy conformation from theory                                  -50    -50   -110     40    2.97
p-Bromocarbobenzoxy-Gly-Pro-Leu-Gly                10    L-Pro-L-Leu     -58    -33   -104      8    2.97
o-Bromocarbobenzoxy-Gly-Pro-Leu-Gly-Pro            11    L-Pro-L-Leu     -65    -27   -105      8    3.00
Oxytocin C-terminal peptide Cys-Pro-Leu-Gly-NH2    12    L-Pro-L-Leu     -66    -29   -115     13    -

Type GG
Cyclohexaglycyl hemihydrate, cyclo(Gly)6           13    Gly-Gly         -69    -29    -94      8    2.96
                                                         Gly-Gly         -69    -30    -92      4    3.02
                                                         Gly-Gly         -69    -30    -95      7    3.03
                                                         Gly-Gly         -68    -31    -93      8    3.09
cyclo(Gly-Gly-D-Ala-D-Ala-Gly-Gly)                 14    Gly-Gly         -70    -15   -106     16    3.04

Region IIa; Type DD
Minimum energy conformation from theory                                   50     50    110    -40    2.97
cyclo(Gly-Gly-D-Ala-D-Ala-Gly-Gly)                 14    D-Ala-D-Ala      66     15    131    -31    3.16

Region III; Type LG
Minimum energy conformation from theory                                  -60    100     60     40    3.03
Ferrichrome A, cyclo(Orn-Orn-Orn-Ser-Ser-Gly)      15    L-Ser-Gly       -57    132     62      1    2.98
since D-residues do not occur in proteins, the chain reversal could be effected through the formation of an LL bend or, more frequently, via an LG fold. Some examples occurring in lysozyme (Phillips, D. C., personal communication) and chymotrypsin⁹ are given in Table II. The emphasis is on the general similarity of the observed conformation to a theoretical one, rather than on an exact correspondence. These are discussed further in the next section. Another interesting feature is that the conformation at Cα2 for a large number of low energy LL bends (Region Ia) is close to the α-helical conformation, namely (-55°, -50°) (see Table I). This suggests that a chain reversal is possible in the vicinity of an α-helix. Also, the conformational angle φ2 at Cα2 closely corresponds to that favourable for the occurrence of a proline residue, so that such a bend could be formed readily by the sequence -Pro-X-. The types of folded conformation discussed here have been observed to be present in several oligopeptides and particularly in cyclic peptides, as revealed by X-ray structure determinations on these compounds. Examples of these are given in Table III. All the LL bends and the GG bends are found in Region Ia, and the
TABLE IV
NMR STUDIES INDICATING THE PRESENCE OF HYDROGEN-BONDED HAIRPIN BENDS IN CYCLIC PEPTIDES

Dihedral angles (in degrees) are given as (φ2, ψ2); (φ3, ψ3).

Compound                                        Ref.    Residues at the bend     Expt*                      Theory**
                                                        At Cα2     At Cα3

Gramicidin S, cyclo(Val-Orn-Leu-D-Phe-Pro)2     16      D-Phe      L-Pro         (30, -150); (-60, -50)     (50, -100); (-50, -40)
Evolidine, cyclo(Ser-Phe-Leu-Pro-Val-Asn-Leu)   17      L-Leu      L-Ser         (-45, -40); (-110, 40)     (-50, -40); (-120, 40)
cyclo(Pro-Ser-Gly)2                             18      L-Pro      L-Ser         (-60, 110); (60, 30)       (-60, 110); (60, 30)
cyclo(Ser-Pro-Gly)2                             19      L-Pro      Gly           (-60, 120); (120, -30)     (-50, 120); (120, -40)
cyclo(Gly-Leu-Gly)2                             20      L-Leu      Gly           (-80, 120); (70, 0)        (-60, 110); (60, 30)
cyclo(Gly-Tyr-Gly)2                             20      L-Tyr      Gly           (-80, 120); (70, 0)        (-60, 110); (60, 30)
cyclo(Gly-Leu-Gly-Gly-Gly-Gly)                  20      L-Leu      Gly           (-80, 120); (70, 0)        (-60, 110); (60, 30)
cyclo(Gly-Tyr-Gly-Gly-Gly-Gly)                  20      L-Tyr      Gly           (-80, 120); (70, 0)        (-60, 110); (60, 30)
cyclo(Gly-Leu-Gly-Gly-Gly-Gly)                  20***   Gly (5)    Gly (6)       (-50, 120); (70, 20)       (-60, 110); (60, 30)
cyclo(Gly-Tyr-Gly-Gly-Gly-Gly)                  20***   Gly (5)    Gly (6)       (-50, 120); (70, 20)       (-60, 110); (60, 30)
cyclo(Gly-His-Gly-Ala-Tyr-Gly)                  20***   L-His      Gly           (-100, 130); (60, 50)      (-60, 120); (110, 40)
cyclo(Gly-His-Gly-Ala-Tyr-Gly)                  20***   L-Tyr      Gly           (-85, 140); (60, 20)       (-70, 100); (80, 40)
* As stated in the reference, which has generally been obtained by model building.
** Hydrogen-bonded conformation of the bend, according to our theory, which is closest to the experimental one.
*** The alternative conformations given in the original work are not listed here.

only LG bend observed (in ferrichrome A) is in Region III, which is favourable for LD and LG sequences. It can be seen here also that L-proline can conveniently occur as the first residue in the bend. A discussion of these in relation to theory is given in the next section.

The presence of the folded conformation of the types discussed above has been suggested by NMR studies to be a likely one for several cyclic peptides. These are given in Table IV along with the reported sites where the bend occurs and the approximate values of (φ, ψ) which fit the NMR data. The nearest conformation from theory with a hydrogen bond is also given for comparison. These are again discussed further in the next section.

DISCUSSION
Tables II, III and IV clearly indicate that hairpin bends having conformations at the bend close to the theoretical prediction occur in several examples; not only in cyclic peptides which require this type of bend for cyclisation, but also in protein chains and in short open chain peptides having special sequences. Considering in particular the data in Table III, it is seen that the observed conformations in Region Ia in a variety of examples have the (φ, ψ) values (-60° to -70°, -30°), (-90° to -110°, 5° to 15°). This may be compared with the minimum energy conformation from the theory given in this paper, viz. (-50°, -50°), (-110°, 40°). Since the same conformation occurs in different examples, it would appear that the observed conformation should be expected to be a particularly stable one. However, the calculated energy for this is not low, but quite high above the minimum. This was found to arise from a very short contact between the atoms H4 and N3. However, in the crystal structure of cyclohexaglycyl this short contact does not occur, because the bond angles along the main chain at Cα3 and C'3 are appreciably larger than their standard values, which were used in our calculations reported in this paper. In fact, it is well known that the region near ψ ≈ 0° is of relatively high energy because of this contact even in the (φ, ψ) map of a single pair of peptide units. Consequently, the observed conformation near (-100°, 10°) at Cα3 is itself very unfavourable (apart from whatever conformation may be occurring at Cα2), if a planar peptide unit of standard dimensions is adopted for the peptide unit 3 (Fig. 2). However, a small increase of the angle at Cα3 relieves the short contact mentioned above and leads to a lowering of the stabilisation energy in the region around (-90°, 0°). A preliminary examination using models indicates that an increase of the angle at C'3 is also particularly effective in this regard. In fact, the two angles have mean values of about 112° and 119° at Cα3 and C'3 in the cyclohexaglycyl structure, as against the standard values of 110° and 114°, respectively. It is therefore necessary that the calculations reported in this paper should be repeated with deformations for the dimensions of the standard peptide unit³.
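The comparison made above between an observed bend and the theoretical minimum can be sketched as a periodic angular difference. The Python below is an illustration only: the names `angle_diff` and `compare`, and the default `tolerance` of 30°, are assumptions introduced here (the 30° figure echoes the threshold used later in this Discussion), and the example angles are the Pro-Leu bend of ref. 10 from Table III set against the Region Ia minimum.

```python
ANGLE_NAMES = ("phi2", "psi2", "phi3", "psi3")


def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (periodic in 360)."""
    d = abs(a - b) % 360
    return min(d, 360 - d)


def compare(observed, theory, tolerance=30):
    """Return per-angle deviations and the names of angles deviating by more than `tolerance`."""
    diffs = {n: angle_diff(o, t) for n, o, t in zip(ANGLE_NAMES, observed, theory)}
    flagged = [n for n, d in diffs.items() if d > tolerance]
    return diffs, flagged


if __name__ == "__main__":
    observed = (-58, -33, -104, 8)   # L-Pro-L-Leu bend, Table III (ref. 10)
    theory = (-50, -50, -110, 40)    # Region Ia minimum energy conformation
    diffs, flagged = compare(observed, theory)
    print(diffs)
    print("deviating by more than 30 degrees:", flagged)
```

For this example only ψ3 is flagged, which is the angle affected by the short H4···N3 contact discussed above.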
These deformations should include changes in bond angles and also the possible occurrence of non-planarity due to rotations about the C'-N bond. In a recent paper from our laboratory²¹, it has been shown that a non-planar conformation of the peptide unit may even have a lower energy than the planar one. This also requires following up in relation to the results of the type reported in this paper.

Considering Table II, here again it will be seen that, in some examples, one or more of the dihedral angles differ by more than 30° between the observed and the theoretical conformations. This should not be considered to be a serious disagreement because, in most cases of this type, the observed ψ value is close to 0°, while the predicted one is of the order of ±30°, or larger in magnitude. In fact, Table I contains no conformation with |ψ| less than 30°, which is due to the bad contact mentioned earlier in this section, which occurs for standard dimensions of peptide units. Once again, the indications are that the peptide unit must be capable of deformation of various types, even in an open chain structure as in a protein. The consequences of this result for the interpretation of the electron density maps of protein structures are obviously quite important. This aspect will be discussed in detail in a separate communication.

As regards Table IV, the experimental values of the dihedral angles reported in this table are not very precise, because they have been obtained by the original authors only by model building and by making the best fit with observed coupling constants. Further, the necessary condition for cyclisation itself would distort the peptide units and also make the conformational angles at each α-carbon atom assume values which may correspond to energies higher than the minimum. Consequently we will only remark that the agreement between theory and experiment is reasonably satisfactory in these examples. In fact, a listing of the type given in Table I would be expected to be very valuable for such NMR studies; but NMR workers should not hesitate to try conformations slightly outside the range covered in Table I, particularly those having small values of |ψ|.

In the case of gramicidin S, the conformation as given in the reference quoted is the inverse of what is given in Table IV, which would be true for the sequence L-Phe-D-Pro at the bend. Since the compound has the sequence D-Phe-L-Pro, we have listed the equivalent conformation for this sequence. It is interesting that the hairpin bends in all the examples given in Table IV, except one, correspond to Region III (and Region IV) of Table I, which includes also Regions Ib and IIb, while the exception, viz. evolidine, has a bend of the type contained in Region Ia. This conformation of the bend in evolidine is however close to a low energy conformation listed in Table I, unlike the examples in Table III.

Thus, the folded conformations described in this paper seem to be important in peptide sequences, whether they are part of open or cyclic oligopeptides or polypeptides. This conformation leads to chain reversal, which enables ring closure in rigid cyclic peptides and a compact form for a long polypeptide chain, with possible hydrogen bonds as in the antiparallel β-structure. In addition, they are of great interest in relation to the structure of antibiotics, which may contain both L- and D-amino acid residues. In particular, if the sequence is alternately L and D, or has special sequences like ...LLDDLL..., the chain may take up a rigid ribbon-like structure, as has been pointed out in earlier papers¹,²³.

The role of proline in the formation of the LL or the LD bend requires special mention. As mentioned briefly in an earlier section, proline can occur at Cα2 in the LL bend.
On the other hand, proline can occur either at Cα2 or Cα3, or both, in the LD bend. Therefore, the sequence -Pro-Gly- or -Gly-Pro- in an all-L peptide structure can readily be the site of a β-bend. A careful examination of all available protein structure data is being made in relation to these features of the 4 → 1 hydrogen-bonded bend, and the results will be published elsewhere.

ACKNOWLEDGEMENTS

We thank Dr Struther Arnott at Purdue University for use of their facilities in preparing the stereoscopic diagrams (Fig. 4). This work was supported by USPHS grants AM-11493 and AM-15964.

REFERENCES
1 Ramachandran, G. N. and Chandrasekaran, R. (1972) in Progress in Peptide Research (Lande, S., ed.), Vol. II, pp. 195-215
2 Venkatachalam, C. M. (1968) Biopolymers 6, 1425-1436
3 Ramachandran, G. N. and Sasisekharan, V. (1968) Adv. Protein Chem. 23, 283-438
4 IUPAC-IUB Commission on Biochemical Nomenclature (1970) Biochemistry 9, 3471-3479
5 Ramachandran, G. N., Chandrasekaran, R. and Chidambaram, R. (1971) Proc. Indian Acad. Sci. A74, 270-283
6 Ramakrishnan, C. and Sarathy, K. P. (1969) Int. J. Protein Res. 1, 63-71
7 Ramakrishnan, C. and Sarathy, K. P. (1969) Int. J. Protein Res. 1, 103-111
8 Bodansky, M. and Perlman, D. (1969) Science 163, 352-358
9 Birktoft, J. J., Matthews, B. W. and Blow, D. M. (1969) Biochem. Biophys. Res. Commun. 36, 131-137
10 Ueki, T., Ashida, T., Kakudo, M., Sasada, Y. and Katsube, Y. (1969) Acta Cryst. B25, 1840-1849
11 Ueki, T., Bando, S., Ashida, T. and Kakudo, M. (1971) Acta Cryst. B27, 2219-2231
12 Rudko, A. D., Lovell, F. M. and Low, B. W. (1971) Nature New Biol. 232, 18-19
13 Karle, I. L. and Karle, J. (1963) Acta Cryst. 16, 969-980
14 Karle, I. L., Gibson, J. W. and Karle, J. (1970) J. Am. Chem. Soc. 92, 3755-3760
15 Zalkin, A., Forrester, J. D. and Templeton, D. H. (1966) J. Am. Chem. Soc. 88, 1810-1814
16 Stern, A., Gibbons, W. A. and Craig, L. C. (1968) Proc. Natl. Acad. Sci. U.S. 61, 734-741
17 Kopple, K. D. (1971) Biopolymers 10, 1139-1152
18 Torchia, D. A., di Corato, A., Wong, S. C. K., Deber, C. M. and Blout, E. R. (1972) J. Am. Chem. Soc. 94, 609-615
19 Torchia, D. A., Wong, S. C. K., Deber, C. M. and Blout, E. R. (1972) J. Am. Chem. Soc. 94, 616-620
20 Kopple, K. D., Go, A., Logan, Jr, R. H. and Šavrda, J. (1972) J. Am. Chem. Soc. 94, 973-981
21 Ramachandran, G. N., Lakshminarayanan, A. V. and Kolaskar, A. S. (1973) Biochim. Biophys. Acta 303, 8-13
22 Johnson, C. K. (1965) ORTEP, A Fortran Thermal Ellipsoid Plot Program for Crystal Structure Illustrations, Report ORNL-3794, Oak Ridge National Laboratory, Oak Ridge, Tenn.
23 Ramachandran, G. N. and Chandrasekaran, R. (1972) Indian J. Biochem. Biophys. 9, 1-11