ARCHIVES OF BIOCHEMISTRY AND BIOPHYSICS 167, 615-626 (1975)

A Two-Dimensional Representation of Protein Structures¹

JOHN B. R. DUNN AND IRVING M. KLOTZ

Department of Biochemistry and Molecular Biology, and Department of Chemistry, Northwestern University, Evanston, Illinois 60201

Received September 27, 1974

A two-dimensional chart of distances between residues of a protein and between their side-chains provides a concise and convenient representation of steric information derived from crystallographic data. Such a chart shows characteristic features corresponding to details of secondary structures of a polypeptide and also reveals which side-chains are interacting in the tertiary structure of the macromolecule.

Interpretations of function of proteins in terms of structure depend on perception of the spatial relationships and interactions of particular residues. These relationships may be perceived in a three-dimensional map or model based on coordinates derived from crystallographic data. However, it is difficult to present such information in two-dimensional printed form. A simple two-coordinate representation is needed. The Ramachandran plot (1) is one solution. An alternative approach is presented here. It extends a procedure used previously (2-4) in different contexts.

PRINCIPLE OF THE REPRESENTATION

A useful representation of protein structure may be generated by mapping interresidue distances in a form that draws attention to interactions that may play a key role in stabilization and function of the macromolecule. In a protein containing N residues, N²/2 residue interactions are possible. A graph with both ordinate and abscissa labeled in ordinal array of residue sequence can give a concise two-dimensional representation of conformation.

In a polypeptide, restriction of rotation about the amide bond fixes the distance from the α-carbon of residue i to the α-carbon of residue i + 1, dα(i,i+1), at 3.8 Å. The distance from the α-carbon of residue i to that of residue i + 2, dα(i,i+2), depends, however, on the dihedral angles of the i + 1 residue. In a beta structure the distance dα(i,i+2) is 7.47 Å, whereas it is 5.45 Å in an alpha helix. Calculation of corresponding distances from residues i to i + 2, i + 3, . . . produces the values listed in Table I and illustrated over a longer range in Fig. 1. These data illustrate a major difference between these two types of secondary structure. In a beta chain the α-carbon to α-carbon distance always increases with ordinal position in the sequence, whereas in an alpha helix, the third neighbor residue in the sequence is closer than the second neighbor residue because the structure coils back on itself. Similar trends are shown by any type of helix, and the point at which the interresidue distance drops is a characteristic of the type of helix.

TABLE I
DISTANCES BETWEEN ALPHA CARBON ATOMS OF RESIDUES IN α-HELIX AND β-CHAIN CONFORMATIONS

Distance dα(i,i+k) (Å)

k    Helix             Beta chain
0    0                 0
1    3.82 (3.82)ᵃ      3.82
2    5.43 (5.49)       7.47
3    5.09 (5.08)       11.16
4    6.36 (6.20)       14.94
5    8.71              18.60
6    9.92              22.40

ᵃ Values in parentheses are observed averages for helices in myoglobin.

FIG. 1. Chart of α-carbon to α-carbon distance versus ordinal position in polypeptide sequence for alpha helix (*) and β chain (O) secondary structures.

The distances between α-carbons can be classified as close, moderate, and far, and each category can be depicted by assignment of a specific symbol to observed distances falling within that range (Table II). In a plot of residue versus residue (Fig. 2), different secondary structures will have a different appearance near the diagonal. The residue number on the x axis is denoted by i and that on the y axis by j. The value of dα(i,j) for i = j is obviously zero and hence the diagonal will be composed of asterisks (*). The value of dα(i,j) for i = j ± 1 is always 3.8 Å. These points will be just off the diagonal and will also be represented by asterisks. Any two adjacent residues throughout the sequence have the same symbol on the graph. However, the symbols for i = j ± 2, i = j ± 3, i = j ± 4, etc. are determined by the type of secondary structure. Hence the map shows differences for regions of alpha helix as contrasted to beta chain. Due to structural imperfections and inaccuracies in X-ray data, actual appearances will differ somewhat from the ideal (Fig. 2) but secondary structures should still be easily recognized.

Since dα(i,j) = dα(j,i), the plot has a diagonal mirror plane, making half the map redundant and hence superfluous. Therefore one-half of the map can be used to present α-carbon-α-carbon interactions, which reflect secondary structure, and the other half can be used to present side-chain, functional-group interactions, which give more information on the interactions that stabilize tertiary structure. For calculation of distances between side chains one must assign a reference point to each functional group; the positions chosen for this purpose are listed in Table III.

The two parts of the chart then are constructed as follows. In the region to the lower right side of the diagonal (i.e., for i > j), the α-carbon-α-carbon distances are plotted. In the upper left portion of the plot (where i < j), the side chain-side chain interactions are plotted. The map then illustrates secondary structure and interacting residues in terms of the symbols for interaction distances (Table II). Asterisks in the upper left region reflect strong side chain interactions, hydrogen bonding, electrostatic bonds, van der Waals interactions, hydrophobic bonding or π-π interactions. The O and - symbols represent the larger distances and correspond to weaker interactions or to two groups interacting with a third. Since the functional side chain groups are extensions of their respective alpha carbon atoms, their map features near the diagonal will be somewhat different for alpha-helix as contrasted to beta sheet secondary structure frameworks.

Points away from the diagonal represent interactions between portions of the protein widely separated in primary structure. Symbols on the map at these coordinates reveal which residues are in close proximity in the tertiary structure and which are interacting. If a series of symbols forms a broad line parallel to the diagonal axis, the two sections of the protein represented by the i and j coordinates of the points are parallel. If the series of points is perpendicular to the diagonal axis, the two sections are antiparallel. Two sections that are perpendicular to one another will appear on the map as a small circle of points. On a chart, bends in the polypeptide structure have the appearance of beta structure near the diagonal with a few additional points about three to six spaces from the diagonal in the upper left section, normal to the diagonal at the center of the bend.

¹ This investigation was supported in part by Grant No. GM-09280 from the National Institute of General Medical Sciences, U.S. Public Health Service and Grant No. GB35296 from the National Science Foundation.

Copyright © 1975 by Academic Press, Inc. All rights of reproduction in any form reserved.

TYPICAL CHARTS²

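Before turning to charts of real proteins, the near-diagonal signatures described under Principle of the Representation can be checked numerically. The Python sketch below computes α-carbon to α-carbon distances for regular chains and converts them to the symbols of Table II. The helix parameters (100° and 1.5 Å per residue, α-carbon radius 2.3 Å) are textbook values assumed here and are not taken from this paper. The beta chain is approximated by a two-fold zigzag fitted to the k = 1 and k = 2 entries of Table I. All function and variable names are illustrative, and the computed distances are close to Table I rather than identical to it.

```python
import math

# Assumed ideal geometries (not taken from the paper):
# alpha helix: 100 degrees and 1.5 A rise per residue, C-alpha radius 2.3 A.
# beta chain: a two-fold zigzag (180 degrees per residue) whose rise and
# radius are chosen to match the k = 1 and k = 2 entries of Table I.
ALPHA_HELIX = {"turn_deg": 100.0, "rise": 1.5, "radius": 2.3}
BETA_CHAIN = {"turn_deg": 180.0, "rise": 3.735, "radius": 0.40}

# Distance ranges and symbols from Table II (upper bound in angstroms).
# Treating each upper bound as inclusive is an assumption.
SYMBOL_RANGES = [(4.1, "*"), (5.3, "O"), (7.2, "-")]


def ca_distance(k, turn_deg, rise, radius):
    """Distance between alpha carbons i and i+k on a regular helix."""
    angle = math.radians(k * turn_deg)
    chord = 2.0 * radius * math.sin(angle / 2.0)  # separation across the axis
    return math.hypot(chord, k * rise)            # combined with rise along it


def symbol_for(distance):
    """Return the Table II symbol, or a blank beyond the largest range."""
    for upper, symbol in SYMBOL_RANGES:
        if distance <= upper:
            return symbol
    return " "


def diagonal_signature(geometry, max_k=6):
    """Symbols met when moving k = 0..max_k steps away from the diagonal."""
    return "".join(
        symbol_for(ca_distance(k, **geometry)) for k in range(max_k + 1)
    )


if __name__ == "__main__":
    for name, geometry in (("helix", ALPHA_HELIX), ("beta", BETA_CHAIN)):
        distances = [round(ca_distance(k, **geometry), 2) for k in range(7)]
        print(name, distances, repr(diagonal_signature(geometry)))
```

Under these assumptions the helix distance at k = 3 falls below that at k = 2, so symbols persist out to k = 4, whereas the beta chain shows only the two asterisks at k = 0 and k = 1. This is the contrast that Fig. 2 depicts.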
² The interaction charts in Figs. 3-8 were constructed from atomic coordinates for each of the proteins. Coordinates for myoglobin were obtained from published tables (5), for hemoglobin from Drs. M. F. Perutz, G. Fermi, and A. D. McLachlan (MRC Laboratory, Cambridge, England), and for the other proteins from Dr. T. F. Koetzle (Brookhaven National Laboratory, Upton, NY). These coordinates had been derived from crystal structure studies for myoglobin (6), hemoglobin (7, 8), lamprey hemoglobin (9), rubredoxin (10), and cytochrome b5 (11). The distances between alpha carbon atoms or between positions of groups on side chains of all the ith and jth residues were computed from the Cartesian coordinates of the respective loci. If the distances fell within one of the interaction ranges, the appropriate symbol (Table II) was printed on the map at the position for i and j on the abscissa and ordinate, respectively. A CDC 6400 computer, using a Fortran program, calculated the distances, selected the appropriate interaction symbol and generated the control variables and commands for the Calcomp plotter, which produced the interaction chart.

³ Abbreviations used: N-terminal amine, NTR; C-terminal carboxyl, CTR; iron, FE; the four nitrogens of the porphyrin, N1, N2, N3, N4; and the two porphyrin propionic acid carboxyl groups, PA1, PA2.

TABLE II
SYMBOLS USED FOR DIFFERENT INTERACTION DISTANCES

d(i,j) (in Å)    Symbol
0-4.1            *
4.1-5.3          O
5.3-7.2          -

FIG. 2. Ideal appearances near diagonal of chart for alpha helix and for beta chain. (Panels: α-helix; β-chain.)

Figure 3 illustrates a two-dimensional representation of myoglobin. Interactions have been plotted for the 153 residues, the N-terminal amine (NTR),³ the C-terminal carboxyl (CTR), the iron (FE), the four nitrogens of the porphyrin (N1, N2, N3, N4), the two porphyrin propionic acid carboxyl groups (PA1, PA2), and the oxygen molecule (OS). In the map of myoglobin 1205 interactions are present. The arrangement of interaction symbols along the diagonal axis illustrates the predominance of alpha helix in this protein. The alpha helical regions evident from the map are: A helix, 3-19; B helix, 20-35; C helix, 36-43; D helix, 51-57; E helix, 58-77; F helix, 86-95; G helix, 100-119; and H helix, 125-149. Bends are evident in the following residue regions: 19-20, 37, 42-50, 78-87, 120-124, and 150-153. Since no beta structure is present in myoglobin, none is indicated in the map. The side-chain functional-group proximities in myoglobin are manifested in the upper left section of the map. Some specific interactions that may be picked out are Lys 79-Glu 4, Lys 133-Glu 6, Ala 130-Leu 9, His 82-Leu 137, Ala 94-Tyr 146. The chart also reveals the interaction of His 93 with heme iron and that of the
porphyrin propionate groups with Arg 45, Ser 92, His 93 and His 97, which facilitates the binding of heme to the protein. The broad lines generated by the groupings of these interactions provide information on the protein conformation, for example, indicating that portions of the protein segments are parallel or antiparallel. The two broad lines of points above residues 1-41 reveal that helices A and B are antiparallel to helices E and D, respectively, and that helices A and B are antiparallel to helix G. Other helical-segment interactions indicated are: helix G antiparallel to helix H; helix H parallel to helix F and the nonhelical segment EF. These residue-residue interaction positions are characteristic of a specific type of folding.

Figures 4-6 illustrate two-dimensional maps for the alpha chain of human hemoglobin, beta chain of human hemoglobin and sea lamprey hemoglobin, respectively. The great similarity in the visage of these charts (with each other and with that for myoglobin) reflects strikingly the nearly identical secondary and tertiary structures of these homologous proteins.

Figures 7 and 8, for cytochrome b5 and rubredoxin, respectively, illustrate the appearance of maps of proteins containing predominantly beta structures. Although cytochrome b5 is a heme protein, its tertiary folding and interactions differ from those of the oxygen carriers. The first two residue positions are unknown and their interactions have not been included in the map. Two small helical sections are evident for residues 65-71 and 54-58. However, the majority of the residues in this protein are in beta structures.

Rubredoxin contains no helices and none is evident in the map. The oxygen positions of structured water (denoted by O1, O2, . . .) have been included in this map to illustrate protein-water interactions. Of the 20 charged residues (NH3+, COO-) of rubredoxin, 12 are associated with structured water. Of the 35 hydrophobic residues 14 interact with water.

TABLE III
REFERENCE POSITIONS FOR CALCULATIONS OF INTERRESIDUE DISTANCESᵃ

Functional group          Residue     Reference origin
-CO2-                     Asp, Glu    A point midway between the two oxygens
-NH3+                     Lys         The ε-nitrogen
-CH2-(imidazole ring)     His         A point midway between the two imidazole nitrogens
-(CH2)3-NH-C(NH2)2+       Arg         A point midway between the two terminal nitrogens
-CO-NH2                   Asn, Gln    A point midway between the oxygen and nitrogen of the amide
-CH2-(phenyl ring)        Phe         The center of the ring
-CH2-(phenyl ring)-OH     Tyr         The phenolic oxygen
-CH2-SH                   Cys         The sulfur
-(CH2)2-S-CH3             Met         The sulfur
-CH2-OH                   Ser         The hydroxyl oxygen
-CH(OH)-CH3               Thr         A point midway between the gamma carbon and the hydroxyl oxygen
-H                        Gly         The alpha carbon
-CH3                      Ala         The beta carbon
-CH(CH3)-CH2-CH3          Ile         The delta carbon
-CH(CH3)2                 Val         A point midway between the gamma carbons
-CH2-CH(CH3)2             Leu         A point midway between the two delta carbons
-CH2-(indole ring)        Trp         A point centered on the bond joining the two rings
(pyrrolidine ring)        Pro         A point midway between the beta and delta carbons

ᵃ Similar assignments are made for other groups such as the porphyrin nitrogens, porphyrin propionic acid groups, and iron atom.

CONCLUSION

These matrix charts provide a very concise and convenient representation of crystallographic X-ray information on protein structure. The appearance of the map reveals details of the secondary and tertiary structure of a protein and indicates which residues are within interacting distance with others. From such representations one can readily examine the types of interaction that play a dominant role in the structure and function of a specific protein.
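
The construction of such a chart from atomic coordinates can be sketched in a few lines of Python. This is a minimal illustration and not the authors' Fortran program: the data layout, the function names, and the use of modern PDB atom names for a handful of the reference origins in Table III are all assumptions made for the example, as is the choice to treat the upper bound of each range in Table II as inclusive.

```python
import math

# Distance ranges and symbols from Table II (upper bound in angstroms).
# Treating each upper bound as inclusive is an assumption.
SYMBOL_RANGES = [(4.1, "*"), (5.3, "O"), (7.2, "-")]

# A few reference origins from Table III, written with modern PDB atom
# names. The atom names are an assumption; the table gives only positions.
REFERENCE_ATOMS = {
    "ASP": ("OD1", "OD2"),   # midway between the two oxygens
    "GLU": ("OE1", "OE2"),
    "LYS": ("NZ",),          # the epsilon nitrogen
    "SER": ("OG",),          # the hydroxyl oxygen
    "GLY": ("CA",),          # the alpha carbon
    "ALA": ("CB",),          # the beta carbon
    "LEU": ("CD1", "CD2"),   # midway between the two delta carbons
}


def symbol_for(distance):
    """Return the Table II symbol, or a blank beyond the largest range."""
    for upper, symbol in SYMBOL_RANGES:
        if distance <= upper:
            return symbol
    return " "


def reference_point(residue):
    """Average the atoms that define the side-chain reference origin."""
    names = REFERENCE_ATOMS[residue["name"]]
    points = [residue["atoms"][name] for name in names]
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def interaction_chart(residues):
    """Build the chart as strings; rows[j][i] holds the symbol for (i, j)."""
    alpha = [r["atoms"]["CA"] for r in residues]
    side = [reference_point(r) for r in residues]
    rows = []
    for j in range(len(residues)):
        row = []
        for i in range(len(residues)):
            # i >= j: alpha-carbon half (and diagonal); i < j: side-chain half.
            points = alpha if i >= j else side
            row.append(symbol_for(math.dist(points[i], points[j])))
        rows.append("".join(row))
    return rows


if __name__ == "__main__":
    # Hypothetical three-residue fragment, for illustration only.
    fragment = [
        {"name": "ALA", "atoms": {"CA": (0.0, 0.0, 0.0), "CB": (0.0, 2.0, 0.0)}},
        {"name": "GLY", "atoms": {"CA": (3.8, 0.0, 0.0)}},
        {"name": "SER", "atoms": {"CA": (7.6, 0.0, 0.0), "OG": (7.6, 2.0, 0.0)}},
    ]
    # Print j from high to low so that i > j falls to the lower right.
    for line in reversed(interaction_chart(fragment)):
        print(line)
```

Rows are printed from the highest j downward so that the α-carbon half of the map (i > j) appears to the lower right of the diagonal and the side-chain half (i < j) to the upper left, matching the layout described in the text.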

FIG. 3. Chart of interaction distances for myoglobin. Letters and brackets mark helical segments.

FIG. 4. Chart of interaction distances for alpha chain of human hemoglobin.

FIG. 6. Chart of interaction distances for sea lamprey hemoglobin.

FIG. 7. Chart of interaction distances for cytochrome b5.

FIG. 8. Chart of interaction distances for rubredoxin.

REFERENCES

1. RAMACHANDRAN, G. N., AND SASISEKHARAN, V. (1968) Adv. Protein Chem. 23, 283.
2. KLOTZ, I. M., LANGERMAN, N. R., AND DARNALL, D. W. (1970) Annu. Rev. Biochem. 39, 25.
3. PHILLIPS, D. W. (1970) Biochem. Soc. Symposia 30, 11.
4. NISHIKAWA, K., OOI, T., ISOGAI, Y., AND SAITO, N. (1972) J. Phys. Soc. Japan 32, 1331.
5. WATSON, H. C. (1969) Prog. Stereochem. 4, 299.
6. KENDREW, J. C., WATSON, H. C., STRANDBERG, B. E., DICKERSON, R. E., PHILLIPS, D. C., AND SHORE, V. C. (1961) Nature (London) 190, 666.
7. BOLTON, W., AND PERUTZ, M. F. (1970) Nature (London) 228, 551.
8. MUIRHEAD, H., AND GREER, J. (1970) Nature (London) 228, 516.
9. HENDRICKSON, W. A., LOVE, W. E., AND KARLE, J. (1973) J. Mol. Biol. 74, 331.
10. HERRIOTT, J. R., SIEKER, L. C., JENSEN, L. H., AND LOVENBERG, W. (1970) J. Mol. Biol. 50, 391.
11. MATHEWS, F. S., ARGOS, P., AND LEVINE, M. (1971) Cold Spring Harbor Symp. Quant. Biol. 36, 387.