Structure of ubiquitin refined at 1.8 Å resolution


J. Mol. Biol. (1987) 194, 531-544

Senadhi Vijay-Kumar, Charles E. Bugg and William J. Cook

Departments of Pathology and Biochemistry, Center for Macromolecular Crystallography and Comprehensive Cancer Center, University of Alabama at Birmingham, University Station, Birmingham, AL 35294, U.S.A.

(Received 12 August 1986, and in revised form 22 October 1986)

The crystal structure of human erythrocytic ubiquitin has been refined at 1.8 Å resolution using a restrained least-squares procedure. The crystallographic R-factor for the final model is 0.176. Bond lengths and bond angles in the molecule have root-mean-square deviations from ideal values of 0.016 Å and 1.5°, respectively. A total of 58 water molecules per molecule of ubiquitin are included in the final model. The last four residues in the molecule appear to have partial occupancy or large thermal motion. The overall structure of ubiquitin is extremely compact and tightly hydrogen-bonded; approximately 87% of the polypeptide chain is involved in hydrogen-bonded secondary structure. Prominent secondary structural features include three and one-half turns of α-helix, a short piece of 3₁₀-helix, a mixed β-sheet that contains five strands, and seven reverse turns. There is a marked hydrophobic core formed between the β-sheet and α-helix. The molecule features a number of unusual secondary structural features, including a parallel G1 β-bulge, two reverse Asx turns, and a symmetrical hydrogen-bonding region that involves the two helices and two of the reverse turns.

1. Introduction

Ubiquitin is a small protein that is probably present in all eukaryotic cells (Goldstein et al., 1975). It consists of a single 8565 Mr polypeptide chain of 76 amino acid residues (Fig. 1). Ubiquitin has been isolated and sequenced from a variety of sources. The primary structures through Arg74 are identical in insect (Gavilanes et al., 1982), trout (Watson et al., 1978), bovine (Schlesinger et al., 1975), and human (Schlesinger & Goldstein, 1975) ubiquitin. All of these sequencing studies indicated that ubiquitin was composed of only 74 amino acid residues. However, subsequent studies demonstrated that ubiquitin has 76 residues and terminates with Gly-Gly (Wilkinson & Audhya, 1981). Ubiquitin that lacks the COOH-terminal Gly-Gly is almost certainly a proteolytic artifact in vitro. The amino acid sequences of yeast (Wilkinson et al., 1986) and oat (Vierstra et al., 1986) ubiquitin differ in only three of 76 residues from that of ubiquitin in higher eukaryotes. Ubiquitin appears to be one of the most conserved of all eukaryotic proteins.

 1  Met-Gln-Ile-Phe-Val-Lys-Thr-Leu-Thr-Gly  10
11  Lys-Thr-Ile-Thr-Leu-Glu-Val-Glu-Pro-Ser  20
21  Asp-Thr-Ile-Glu-Asn-Val-Lys-Ala-Lys-Ile  30
31  Gln-Asp-Lys-Glu-Gly-Ile-Pro-Pro-Asp-Gln  40
41  Gln-Arg-Leu-Ile-Phe-Ala-Gly-Lys-Gln-Leu  50
51  Glu-Asp-Gly-Arg-Thr-Leu-Ser-Asp-Tyr-Asn  60
61  Ile-Gln-Lys-Glu-Ser-Thr-Leu-His-Leu-Val  70
71  Leu-Arg-Leu-Arg-Gly-Gly                  76

Figure 1. Amino acid sequence of human erythrocytic ubiquitin.

Ubiquitin has been identified in the nucleus, in the cytoplasm and on the cell-surface membrane, but it appears that its primary role is in intracellular, ATP-dependent protein degradation. Protein breakdown in this pathway requires the formation of covalent conjugates in which carboxyl termini of ubiquitin molecules become attached to the target protein via amide linkages. Two distinct types of ubiquitin-protein conjugates have been identified. One type contains ubiquitin joined via an isopeptide bond to ε-amino groups of protein lysine residues (Ciechanover et al., 1980; Hershko et al., 1980, 1983). The second type of conjugate is formed by ubiquitination of the NH₂-terminal α-NH₂ group of the acceptor protein (Hershko et al., 1984). Preferential modification in vitro of the NH₂-terminal α-NH₂ groups of various proteins prevents degradation of such proteins in reticulocyte extracts, suggesting that NH₂-terminal ubiquitination of proteolytic substrates is required for their degradation. Proteins in which most ε-NH₂ groups are blocked but the α-NH₂ group is free are degraded by the ubiquitin system, but at a reduced rate (Hershko et al., 1984). The resulting ubiquitin-protein conjugates, which often contain more than one ubiquitin molecule, can be either degraded to amino acids or simply deubiquitinated. Proteins that have been modified by the incorporation of amino acid analogs or denatured by chemicals are degraded by the ubiquitin pathway (Hershko et al., 1982; Chin et al., 1982). Ubiquitin synthesis can be induced by heat shock (Bond & Schlesinger, 1985) or by inhibiting ubiquitin-protein conjugation (Finley et al., 1984). This suggests that the ubiquitin-dependent proteolytic pathway is one mechanism that the cell uses to prevent damage that abnormal proteins could cause.

Ubiquitin-coding DNA sequences have been cloned from a variety of eukaryotes (Özkaynak et al., 1984; Wiborg et al., 1985; Bond & Schlesinger, 1985; Dworkin-Rastl et al., 1984). These studies have shown that ubiquitin-coding genes are organized into spacerless head-to-tail repeating units. Each gene segment codes for 76 amino acid residues. It therefore appears that ubiquitin is largely generated by the processing of polyubiquitin precursor proteins. This presumably reflects the need of the cell to generate large numbers of ubiquitin molecules in response to stress.

Ubiquitin is present in the nucleus and on the cell surface as a specific isopeptide derivative. In the nucleus, ubiquitin attaches to histone 2A to form a branched molecule (Busch & Goldknopf, 1981). This chromosomal protein (designated uH2A) consists of ubiquitin conjugated to histone H2A via the ε-NH₂ group of Lys119 (Olson et al., 1976; Hunt & Dayhoff, 1977). The function of this conjugate is not known, but it may be involved in the transcription of active genes (Mueller et al., 1985; Levinger & Varshavsky, 1982). Most recently, ubiquitin has been identified as part of a lymphocyte-homing receptor (Siegelman et al., 1986). The function of this conjugate is not known, but it is suspected that ubiquitin may help to confer receptor specificity.

It appears that NH₂-terminal ubiquitination of proteolytic substrates is both necessary and sufficient for ubiquitin-dependent degradation of such substrates (Hershko et al., 1984). Therefore, reversible conjugation of ubiquitin to a variety of intracellular and cell-surface proteins at ε-amino groups of their lysine residues (to yield branched ubiquitin-protein conjugates) may serve a regulatory function.

The structure of human erythrocytic ubiquitin at 2.8 Å resolution has been described (Vijay-Kumar et al., 1985). We now report extension of the resolution to 1.8 Å and refinement of the model.
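The sequence in Figure 1 is referred to by residue number throughout the paper (for example Arg72, Arg74 and the COOH-terminal Gly-Gly). The following is a minimal sketch, not part of the original work, that encodes the sequence and looks up residues by position; the helper names `residue` and `positions` are illustrative assumptions.

```python
# Sequence transcribed from Figure 1 (three-letter codes, residues 1-76).
UBIQUITIN = (
    "Met Gln Ile Phe Val Lys Thr Leu Thr Gly "
    "Lys Thr Ile Thr Leu Glu Val Glu Pro Ser "
    "Asp Thr Ile Glu Asn Val Lys Ala Lys Ile "
    "Gln Asp Lys Glu Gly Ile Pro Pro Asp Gln "
    "Gln Arg Leu Ile Phe Ala Gly Lys Gln Leu "
    "Glu Asp Gly Arg Thr Leu Ser Asp Tyr Asn "
    "Ile Gln Lys Glu Ser Thr Leu His Leu Val "
    "Leu Arg Leu Arg Gly Gly"
).split()


def residue(number):
    """Return the residue name at a 1-based sequence position (hypothetical helper)."""
    return UBIQUITIN[number - 1]


def positions(name):
    """Return all 1-based positions at which a residue type occurs (hypothetical helper)."""
    return [i for i, res in enumerate(UBIQUITIN, start=1) if res == name]


if __name__ == "__main__":
    print(len(UBIQUITIN))    # chain length
    print(positions("Lys"))  # lysine positions in ubiquitin itself
    print(positions("Arg"))  # arginine positions, relevant to tryptic cleavage
```

Looking up residues this way is only a convenience for checking the residue numbers quoted in the text against the figure.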

2. Materials and Methods

(a) Development of the initial model

Purified human erythrocytic ubiquitin and ubiquitin-74 (ubiquitin without the last 2 residues) were supplied by Keith D. Wilkinson of Emory University. Ubiquitin was crystallized as described by Cook et al. (1979); all crystals were obtained by seeding at pH 5.6. The crystals, which grow as large rectangular prisms, belong to orthorhombic space group P2₁2₁2₁ with a = 50.84 Å, b = 42.77 Å, and c = 28.95 Å. There is 1 molecule of ubiquitin per asymmetric unit.

Initial phase estimates for the structure of ubiquitin were derived from a single isomorphous derivative (mercuric acetate). Two Hg sites were located in isomorphous difference and anomalous Patterson maps, and these sites were refined by least-squares analysis using data from the centric zones. Phase angles for the native data were calculated from the single isomorphous derivative including anomalous dispersion effects. The overall figure-of-merit for the 2-site Hg derivative was 0.68. An electron density map for the protein was calculated including the 1734 native structure factors with d > 2.8 Å, and using centroid phase angles with figure-of-merit weights.

Using the amino acid sequence, guide co-ordinates for Cα, Cβ, and the end atom for each of the larger side-chains were identified. The last 4 residues at the COOH terminus (Leu73, Arg74, Gly75 and Gly76), which extend out into the solvent, could not be interpreted due to very noisy low electron density in this region. This portion of the molecule was modeled from a difference Fourier map calculated with single isomorphous replacement (SIR†) phases and native ubiquitin amplitudes minus the amplitudes of ubiquitin-74. These guide co-ordinates were used in conjunction with the computer graphics program FRODO (Jones, 1978) to generate atomic co-ordinates for the entire molecule. The co-ordinates were adjusted to approximate ideal geometry by the model building program included in the FRODO package. The R-index for the final graphics model based on the 2.8 Å data and using an overall isotropic temperature factor of 12 Å² was 0.46.

(b) Data collection

Intensity data were collected with a Picker FACS-I diffractometer at room temperature using an omega step-scan procedure (scan width 0.6 to 0.8°) and Ni-filtered CuKα radiation. The reflections were divided into 28 shells containing 100 reflections each and were collected from high to low resolution. Two crystals were used for data collection to 1.8 Å. Data for 1 crystal were collected from 1.8 to 2.8 Å, and data for the 2nd were collected from 2.5 Å to infinity. To monitor and correct for decomposition effects, 6 standard reflections with a wide spread across reciprocal space were measured periodically. During the collection of data, the intensities of the standard reflections from the 2 crystals showed systematic decreases of 11% and S%, respectively. The overlapping data in the resolution range 2.5 to 2.8 Å were used to scale the 2 data sets together. The merging R-index calculated from the 391 overlapping reflections in these 2 data sets was 0.097.

Since the intensity decay increases with both time and resolution, the intensities of high-angle reflections are underestimated and those of low-angle reflections are overestimated by traditional time-dependent decay corrections. In order to overcome this, we measured the intensities of 12 to 15 strong reflections in each shell on a new crystal and used these reflections to scale each shell individually. The monitoring reflections showed a very small intensity decrease of about 1.1%, so decay corrections within this subset of data were ignored. This scaling procedure improved the quality of the data, as judged by the drop in the merging R-index on |F| from 0.097 to 0.04H.

A total of 6029 unique reflections was measured to d = 1.8 Å. The empirical absorption correction due to North et al. (1968) was used (Imax/Imin = 1.3) to correct the anisotropy in X-ray transmission as a function of φ. Lorentz and polarization corrections were applied. Of the 5750 reflections that had intensities exceeding 2.5σ, there were 5554 in the range 1.8 Å ≤ d < 6.0 Å.

† Abbreviation used: SIR, single isomorphous replacement.

(c) Structure refinement

The model was refined using the reciprocal space restrained least-squares procedure of Hendrickson & Konnert (1981). Refinement proceeded in 12 rounds with a total of 174 cycles, resulting in significant improvement in the geometry of the model and a drop in the R-index from 0.47 to 0.18. A round of refinement represents a complete session of model-building followed by cycles of refinement until convergence was achieved. When convergence of the refinement was achieved, the resulting refined model was used to calculate a difference electron density map, and the model was rebuilt using an interactive graphics display with the FRODO model building system (Jones, 1978). The refinement of the structure progressed slowly, with a steady improvement in the quality of the model and the agreement between observed and calculated structure factors. Fig. 2 shows an overview of the progress of the refinement.

The starting point for refinement was the initial model fitted to a 2.8 Å SIR electron density map with an overall temperature factor of 12 Å². In the 1st 9 rounds of refinement, an inner resolution limit of 5.0 Å was used, but this was changed to 6.0 Å in the last 3 rounds. Weights for the stereochemical restraints were based on the statistical variations observed in small-molecule structures. As the refinement progressed, the relative weights given to structure factors were increased.

In the 1st round of refinement, the initial model, consisting of 602 protein atoms, was refined and then fitted to the SIR map. This procedure was repeated twice to get the best fit to the SIR map for a stereochemically reasonable model. In rounds 2 to 4, the refined model was refitted to an electron density map calculated by phase combination methods. The SIR and calculated phases were combined with equal weights in the resolution range 5 to 2.8 Å, and only SIR phases were used for the range ∞ to 5 Å. In round 5, the data were added gradually in 3 steps to 2.5 Å resolution, and electron density maps were computed using SIR phases for the ∞ to 5 Å data and calculated phases for the data beyond 5 Å. Beginning with round 5, OMIT maps (Bhat & Cohen, 1984) were used. In an OMIT map, an atom of the model does not contribute to the phases used to calculate electron density values at or near its position in real space; this

procedure reduces the bias by the model atoms contained in that region.

Figure 2. R-factor at the beginning of each round of refinement. Manual alterations in the model came after cycles 9, 23, 34, 49, 67, 81, 99, 120, 129, 153 and 162, and are indicated by arrows.

At this point (R = 0.32), we calculated (2|Fo|-|Fc|) and (|Fo|-|Fc|) electron density maps using calculated phases for the structure factors in the 5.0 to 2.5 Å range and SIR phases in the ∞ to 5 Å resolution range. Reflections for which (|Fo|-|Fc|) > 0.7(|Fo|+|Fc|) were omitted from the difference Fourier calculations, since the phases for reflections with |Fo| >> |Fc| are generally incorrect. Since very low angle reflections generally have very large |Fc| values, reflections with |Fc|/|Fo| > 3.0 were also excluded. These 2 conditions eliminated 64 reflections in the range Al) to 9.2 Å.

All positive peaks in the difference map larger than 0.5 electron/Å³ were examined for inter- and intramolecular contacts by use of the graphics system. Peaks were identified as water molecules when they had well-defined electron density and were within a distance of 2.4 to 3.5 Å from at least 1 other polar protein group or another water molecule. Using these criteria, 42 peaks were assigned as oxygen atoms from water molecules and were given isotropic temperature factors of 20 Å². As the last 4 residues at the COOH terminus were so poorly defined, they were deleted from the refinement at this point.

In round 8, the resolution was gradually extended to 2.2 Å in 4 steps. Beginning with round 9, the isotropic temperature factors of the protein atoms and the water molecules were refined. Refinement of B values was delayed until this stage of refinement to help prevent temperature factors from absorbing co-ordinate errors. The refinement statistics showed a better agreement between Fo and Fc in the low-resolution range, so the inner resolution limit was changed to 6 Å for subsequent refinement. In rounds 10 and 11, the resolution was gradually extended to 1.8 Å, and the model was rebuilt from (2|Fo|-|Fc|) maps.
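The water-assignment criteria described above (a difference-map peak above a threshold, plus at least one polar contact between 2.4 and 3.5 Å) can be sketched as a simple filter. This is an illustrative sketch only: the function name and argument layout are assumptions, and the subjective requirement of "well-defined electron density" is not modeled.

```python
def is_water_candidate(peak_height, contact_distances,
                       min_peak=0.5, d_min=2.4, d_max=3.5):
    """Apply the peak-height and contact-distance criteria for a water site.

    peak_height       -- difference-map peak height in electron/A^3
    contact_distances -- distances (A) from the peak to polar protein
                         groups or other water molecules
    min_peak          -- threshold on peak height (0.5 in the first pass)
    d_min, d_max      -- accepted contact-distance window in A
    """
    if peak_height <= min_peak:
        return False
    # At least one contact must fall inside the accepted window.
    return any(d_min <= d <= d_max for d in contact_distances)


if __name__ == "__main__":
    # Hypothetical peaks, for illustration only.
    print(is_water_candidate(0.62, [2.9, 4.1]))
    print(is_water_candidate(0.62, [3.8]))
    print(is_water_candidate(0.40, [2.9]))
```

Passing a lower `min_peak` reuses the same filter with a different peak-height threshold.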
An (|Fo|-|Fc|) map was then calculated and, using a threshold of 0.3 electron/Å³, 17 more water molecules were added to the model. These water molecules form a 2nd layer of water structure as they generally hydrogen-bond with the earlier interpreted water structure. At the end of round 11, the temperature factors of 29 water molecules were in the range 30 to 43 Å². In round 12, the temperature factors and occupancies of the disordered water structure were refined together, but they were updated separately in alternate cycles. At this stage 3 water molecules displayed relatively low B values (B = 8 to 11 Å²) and low occupancies (0.30 to 0.33). These atoms were assigned B values of 18 Å² and occupancies of 0.70, which were typical of other disordered water sites. Also, 3 of the 8 amide side-chains demonstrated large differences between their temperature factors, so the amide nitrogen positions were interchanged with the carbonyl oxygen positions. Refinement was then continued to convergence.

At this stage we calculated a difference electron density map using the refined phases and the amplitudes of the native protein minus the amplitudes of ubiquitin-74. This map clearly showed the last 4 residues at the COOH terminus (Fig. 3). However, when these residues were included in the refinement, the temperature factors became unreasonably large (> 60 Å²). Therefore, the temperature factors were fixed at 25 Å² for main-chain atoms and 30 Å² for side-chain atoms, and the occupancies were refined. This resulted in an average occupancy of about 0.45 for atoms in residues 73 and 74 and about 0.25 for atoms in residues 75 and 76. The occupancies for the atoms in these 4 residues were fixed at these values, and the temperature factors were then refined. Using these occupancies, the temperature factors gave values similar to those for Arg72.

The final model includes 602 protein atoms and 58 oxygen atoms of solvent; 28 protein atoms and 29 solvent molecules were refined with partial occupancies. The final R-index of the model, based on 5554 reflections in the range 1.8 Å < d < 6.0 Å with intensities exceeding 2.5σ, is 0.176. The refined co-ordinates for ubiquitin have been deposited with the Protein Data Bank, Chemistry Department, Brookhaven National Laboratory, Upton, NY 11973, U.S.A., from which copies are available.
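The R-index quoted throughout the refinement is the conventional crystallographic residual, R = Σ| |Fo| - |Fc| | / Σ|Fo|. The sketch below shows that calculation on a handful of made-up amplitudes; it assumes the calculated amplitudes are already on the same scale as the observed ones, and the function name is illustrative rather than taken from any refinement program.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor for paired amplitudes.

    f_obs, f_calc -- sequences of observed and calculated structure-factor
                     amplitudes, assumed to be on a common scale.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical amplitudes, for illustration only.
    observed = [100.0, 80.0, 60.0, 40.0]
    calculated = [90.0, 85.0, 50.0, 45.0]
    print(round(r_factor(observed, calculated), 3))
```

In practice the sum would run only over the reflections selected for refinement (here, those with 1.8 Å < d < 6.0 Å and intensities above 2.5σ).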

3. Quality of the Structure

A plot after the method of Luzzati (1952), presented as Figure 4, places an upper limit of 0.15 Å on the root-mean-square error in the atomic co-ordinates for ubiquitin. Comparison of the co-ordinates of the initial ubiquitin model with those of the refined model shows a root-mean-square difference of 1.50 Å for all 76 residues and 1.45 Å for the first 72 residues. Root-mean-square shifts for the 304 main-chain atoms and 297 side-chain atoms are 1.07 Å and 1.84 Å, respectively. The side-chains of two residues, Glu34 and Arg54, were re-interpreted as the refinement progressed, and these showed large deviations from the initial model. The overall figure-of-merit of the initial SIR phased model (0.68) corresponds to a mean error in phases of about 47°. The phases calculated from the refined model differ from the initial SIR phases by an average of 42°.

Figure 3. Stereo drawing of the last 4 residues at the COOH terminus (Leu73, Arg74, Gly75 and Gly76) with the superimposed electron density contour surfaces. This difference map was calculated using the refined phases and the amplitudes of the native protein minus the amplitudes of ubiquitin-74.

Figure 4. Dependence of the R-factor on the reciprocal of the resolution. Curves for the root-mean-square error of 0.10, 0.15 and 0.20 Å, as described by Luzzati (1952), are superimposed on the curve for ubiquitin.

As shown in Table 1, the root-mean-square variation in distances from ideal values falls close to or within the targeted variances. The final co-ordinates of the ubiquitin model deviate from ideal bond lengths and angles by 0.016 Å and 1.5°, respectively. Residues 72 to 76 were poorly defined in electron density maps, display relatively large temperature factors, and are likely to have large residual errors. There is only one intramolecular contact shorter than 2.5 Å, and this involves the O and CG atoms of Arg72. There are two short intermolecular contacts (2.0 to 2.2 Å) between symmetry-related ubiquitin molecules, one that involves Arg72 and another that involves the COOH-terminal carboxyl group.

Table 1. Summary of the least-squares refinement parameters

                              Target σ     Final value
Observed reflections                       5554 (> 2.5σ)
Number of atoms                            660 (58 H₂O)
R-index                                    0.176 (< 6 Å)
Distances (Å)
  Bonded                      0.020        0.016
  Angle                       0.030        0.034
  Dihedral                    0.050        0.055
Planes (Å)                    0.015        0.017
Chiral volumes (Å³)           0.150        0.178
Contacts (Å)
  Single                      0.500        0.283
  Multiple                    0.500        0.447
  Hydrogen-bonded             0.500        0.247
Torsion angles (deg.)
  Omega                       3.0          3.2
  Chi                         18.4         20.3
  Aromatic                    50.0         9.9
Thermal (ΔB, Å²)
  Main bonded                 1.00         1.46
  Main angle                  1.50         2.26
  Side bonded                 1.25         3.93
  Side angle                  1.80         4.77
Structure factors†
  A-term                      12.0         27.0
  B-term                      -32.0        -278.9

† These terms were used in the calculation of structure factor weights, according to the formula: σ = A + (B × ((sin θ/λ) - 1/6)).

Distances between acceptor and donor atoms of hydrogen bonds are generally above 2.7 Å. The shortest intramolecular acceptor-donor distance (2.51 Å) occurs between atom O of Gln62 and atom OG1 of Ser65. Only one other acceptor-donor distance falls between 2.5 Å and 2.6 Å; the remainder are above 2.6 Å. The shortest acceptor-donor distance between protein and water is approximately 2.4 Å, occurring between the amino-terminal nitrogen atom of Met1 and water 110. There are only two other interactions between water and protein with distances less than 2.6 Å.

Dihedral angles of the main chain generally conform well to their expected values. Except for Glu64, the φ, ψ pairs for non-glycine residues fall in allowed regions of the Ramachandran plot (Ramachandran et al., 1963; Fig. 5). The φ, ψ angles for Glu64 are quite unusual. Glu64 is the third residue in a type II reverse turn that is also part of a parallel G1 β-bulge, so the left-handed conformation is required.

Figure 5. A Ramachandran plot of the φ and ψ angles for each residue. The only non-glycine residues in a left-handed conformation are Ala46, Asn60 and Glu64. (○) Glycyl residues; (×) remaining residues.

The variances in categories of temperature factors fall near their intended values. The root-mean-square ΔB value for bonded atoms of the main chain is 1.46 Å², rising to 3.93 Å² for bonded atoms of the side-chains. Temperature factors for residues 1 to 72 range from a low of approximately 2 Å² to a high of 36 Å². For residues 73 and 74, which were given occupancies of 0.45, the temperature factors ranged from 28 to 42 Å². Temperature factors for the last two residues, which were given occupancies of 0.25, ranged from 36 to 37 Å². The overall temperature factor derived from a relative Wilson plot is 12.5 Å², and the average temperature factor for all atoms including water is 14.7 Å². The temperature factors averaged over atoms for each residue of the main chain and side-chains appear in Figure 6. The most flexible regions of the molecule are the last four residues at the COOH terminus and the reverse turns involving residues 7 to 10, 51 to 54 and 62 to 65.

The final difference Fourier map contained several uninterpreted peaks in the range 0.25 to 0.34 electron/Å³. These peaks were too close to other well-defined solvent molecule positions to be interpreted as additional solvent molecules with reasonable occupancies.
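Two of the quality figures in this section can be checked with one-line calculations. The sketch below assumes the usual reading of the figure-of-merit as the mean cosine of the phase error (so the error is about arccos m), and the standard isotropic relation B = 8π²⟨u²⟩ between a temperature factor and the mean-square atomic displacement along one direction. Both relations are textbook conventions supplied here for illustration, not formulas stated in the paper, and the function names are hypothetical.

```python
import math


def phase_error_from_fom(figure_of_merit):
    """Approximate mean phase error (degrees), assuming m = cos(error)."""
    return math.degrees(math.acos(figure_of_merit))


def rms_displacement_from_b(b_factor):
    """Root-mean-square displacement (A), assuming B = 8 * pi^2 * <u^2>."""
    return math.sqrt(b_factor / (8.0 * math.pi ** 2))


if __name__ == "__main__":
    # Figure-of-merit of the SIR phases (0.68) and overall Wilson B (12.5 A^2).
    print(round(phase_error_from_fom(0.68), 1))
    print(round(rms_displacement_from_b(12.5), 2))
```

Under these assumptions the figure-of-merit of 0.68 reproduces the quoted phase error of about 47°.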


4. Results

(a) Conformation of the molecule

Stereo drawings of the ubiquitin molecule are shown in Figure 7. The most prominent secondary structural features are 3.5 turns of α-helix involving residues 23 to 34, a short piece of 3₁₀-helix involving residues 56 to 59, and a mixed β-sheet that contains five strands (Fig. 8). Two of the inner strands, composed of residues 1 to 7 and 64 to 72, are parallel. The other three strands, which are composed of residues 10 to 17, 40 to 45 and 48 to 50, run in an antiparallel direction. The β-sheet has the characteristic left-handed twist, and the α-helix fits into the concavity formed by the sheet (Fig. 9).

There are two β-bulges in ubiquitin; both are G1 bulges, but they are quite different. One of the β-bulges is a typical G1 bulge between antiparallel β-strands. The bulge involves Gly10, Lys11 and Thr7 (Fig. 10), which would correspond to positions 1, 2 and X according to the nomenclature of Richardson et al. (1978). The bulge is somewhat unusual, in that Gly10 is the fourth residue in a type I turn; most G1 bulges contain glycine as the third residue in a type II turn. The other bulge involves the two parallel strands in the β-sheet. The strand that contains residues 64 to 72 begins with a G1 β-bulge (Fig. 11). The bulge involves the last two residues of a type II reverse turn (Glu64 and Ser65) and Gln2 on the adjacent strand. This type of parallel β-bulge is quite rare, and the presence of glutamate rather than glycine in position 1 of the bulge (and position 3 in a type II reverse turn) is also unusual.
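The turn types named above (I, II, III and their primed mirror images) are conventionally assigned from the φ, ψ angles of residues i+1 and i+2. The following is a rough sketch, not the authors' procedure, that assigns a turn to the nearest set of commonly quoted ideal angles; the ideal values and the least-squares scoring are assumptions made for illustration. The example rows are φ, ψ values as read from Table 2.

```python
# Commonly quoted ideal (phi1, psi1, phi2, psi2) angles in degrees (assumed values).
IDEAL_TURNS = {
    "I": (-60, -30, -90, 0),
    "I'": (60, 30, 90, 0),
    "II": (-60, 120, 80, 0),
    "II'": (60, -120, -80, 0),
    "III": (-60, -30, -60, -30),
    "III'": (60, 30, 60, 30),
}


def angular_difference(a, b):
    """Smallest signed difference between two angles, in (-180, 180]."""
    d = (a - b) % 360.0
    return d - 360.0 if d > 180.0 else d


def classify_turn(phi1, psi1, phi2, psi2):
    """Return the turn type whose ideal angles are closest (least squares)."""
    observed = (phi1, psi1, phi2, psi2)

    def score(name):
        ideal = IDEAL_TURNS[name]
        return sum(angular_difference(o, i) ** 2 for o, i in zip(observed, ideal))

    return min(IDEAL_TURNS, key=score)


# (label, phi(i+1), psi(i+1), phi(i+2), psi(i+2)) as read from Table 2.
TABLE_2_TURNS = [
    ("Thr7-Gly10", -73, -7, -101, 15),
    ("Pro37-Gln40", -57, -32, -68, -16),
    ("Phe45-Lys48", 48, 46, 62, 22),
    ("Gln62-Ser65", -55, 143, 67, 19),
]

if __name__ == "__main__":
    for label, *angles in TABLE_2_TURNS:
        print(label, classify_turn(*angles))
```

A nearest-ideal assignment like this is deliberately crude; published classifications usually also apply tolerance windows around each ideal angle.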

Figure 6. Variation of mean temperature factor for atoms of side-chains and the main chain as a function of residue number.

Figure 7. Stereo drawing of ubiquitin. Top, α-carbon backbone; bottom, complete molecule. The 2 drawings are in the same orientation.

Figure 8. A diagram of hydrogen bonding in the β-sheet.

Figure 9. Stereo drawing of the relationship of the α-helix to the β-sheet. Only the main-chain atoms of the helix and sheet are included.

Figure 10. Stereo drawing of the antiparallel G1 β-bulge involving Thr7, Gly10 and Lys11. The O and N atoms of Thr7 are hydrogen-bonded to N of Gly10 and O of Lys11, respectively. Residues 6 to 12 are included.

Figure 11. Stereo drawing of the parallel G1 β-bulge involving Gln2, Glu64 and Ser65. The O atom of Gln2 is hydrogen-bonded to N of Glu64. Residues 1 to 7 and 62 to 72 from the 2 parallel strands in the sheet are included.

The molecule contains seven reverse turns (Table 2). All of the turns have the expected 4 → 1 N-O hydrogen bond, although two of these are rather long. The longest is between Thr7 and Gly10, but this turn is somewhat unusual in that Gly10 is the first residue in a G1 β-bulge, and thus there is an extra residue in the hairpin connection between adjacent antiparallel β-strands (Fig. 10). There is a highly contorted "turn-rich" stretch beginning with Phe45 and ending with Ser65. This sequence of 21 residues contains four reverse turns and a short piece of 3₁₀-helix that forms two interlocked type III reverse turns. All but three residues in this sequence are involved in reverse turns.

Table 2. Geometry of reverse turns

Residues                           (φ, ψ) angles (deg.)      H bond
i       i+1     i+2     i+3        i+1          i+2          (Å)      Type
Thr7    Leu8    Thr9    Gly10      -73,  -7     -101,  15    3.4      I
Glu18   Pro19   Ser20   Asp21      -55, -25      -80,  -8    2.8      I
Pro37   Pro38   Asp39   Gln40      -57, -32      -68, -16    3.0      III
Phe45   Ala46   Gly47   Lys48       48,  46       62,  22    2.8      III'
Glu51   Asp52   Gly53   Arg54      -48, -42      -83,  -9    3.3      I
Thr55   Leu56   Ser57   Asp58      -61, -36      -64, -30    3.1      III†
Leu56   Ser57   Asp58   Tyr59      -64, -30      -56, -39    3.1      III†
Ser57   Asp58   Tyr59   Asn60      -56, -39      -91,   5    3.2      I
Gln62   Lys63   Glu64   Ser65      -55, 143       67,  19    2.9      II

† The 4 residues 56 to 59 form a 3₁₀-helix.

(b) Hydrogen bonding

In addition to the usual hydrogen bonds formed in the β-sheet and α-helix, there are a number of other interesting intramolecular hydrogen bonds. One unusual set of hydrogen bonds involves two reverse turns and the two helices. The main-chain N atoms of the first two residues of the α-helix (Ile23 and Glu24) form hydrogen bonds to the carbonyl oxygen atoms of the second and fourth residues in the type I turn involving residues 51 to 54 (Fig. 12). Similarly, the main-chain N atoms of the first two residues of the 3₁₀-helix (Leu56 and Ser57) form hydrogen bonds to the carbonyl oxygen atoms of the second and fourth residues in the type I turn involving residues 18 to 21. There is almost 2-fold symmetry in this portion of the structure. To our knowledge, this type of hydrogen-bonding scheme has not been observed.

Figure 12. Stereo drawing of the hydrogen-bonding scheme involving the 2 helices and 2 reverse turns. Residues 18 to 24 and 51 to 57 are included.

Two unusual hydrogen bonds occur in the large loop formed by residues 51 to 59 (Fig. 13). This loop is not simply a random coil, but includes five hydrogen-bond interactions. The first three are typical hydrogen bonds involving the N → O bond of the reverse turn 51-54 and the two N → O bonds of the 3₁₀-helix 56-59. A fourth hydrogen bond is formed between the side-chain of Asp58 and the peptide N atom of Thr55. There is one other example of this type of hydrogen bond in ubiquitin (see below). The loop is then further stabilized by another unusual hydrogen bond between the hydroxyl oxygen atom of the Tyr59 side-chain and the peptide N atom of Glu51. This hydrogen bond

Figure 13. Stereo drawing of the large loop formed by residues51 to 59. The hydroxyl oxygen of Tyr59 forms a hydrogen bond with N of Glu51. Residues50 to 59 are included, but the side-chainof Leu50 has been truncated at the 8 pos&ion renders the tyrosine virtually inaccessible and suggests that iodination of the tyrosine ring would result in steric hindrance that might disrupt the stabilizing interactions that involve this residue. It also probably explains our inability to obtain crystals of iodinated ubiquitin by simply using crystals of native ubiquitin as seeds (unpublished results). There are two examples of an unusual hydrogen bond involving a carboxylate oxygen of aspartate n and the main-chain nitrogen of residue n- 3. The first occurs in the type I reverse turn at residues 18 to 21 (Fig. 14), and the second occurs in the S1c-helix at residues 56 to 59. In each case, the nth residue (aspartate) is the third residue in the turn. The hydrogen-bonded ring includes 14 atoms. Another unusual hydrogen bond involves the S atom of Met1 and the peptide N atom of Lys63 (Fig. 11). The N-S distance is 3.6 A, which is quite reasonable. There are no other contacts less than 4-O b between the S atom and any other possible donors, A hydrogen bond involving the S atom of methionine and an amide N atom has been reported (Birktoft & Blow, 1972), but the N-S distance was 3.7 A. However, subsequent refinement of a-chymotrypsin by two different groups (Tsukada & Blow, 1985; Blevins & Tulinsky, 1985) demonstrated that

the S-N distance is about 3.6 Å, in agreement with the hydrogen-bond distance for ubiquitin.

Figure 14. Stereo drawing of the turn involving residues 18 to 21. Note the hydrogen bond between the carboxylate oxygen atom of Asp21 and N of Glu18, in addition to the usual 4 → 1 hydrogen bond of the turn. Residues 17 to 22 are included.

(c) COOH terminus

It has been shown that limited tryptic digestion of ubiquitin yields ubiquitin-74 and the dipeptide glycylglycine. This proteolytic cleavage apparently occurs during purification from most tissues (Wilkinson & Audhya, 1981). Since there is also an arginine residue at position 72, there is good reason to suspect that similar proteolytic cleavage occurs at this point also, although at a much lower rate. Indeed, treatment of ubiquitin with trypsin gives two major cleavage sites: Arg72 and Arg74 (Jabusch & Deutsch, 1985). Therefore, we suspect that our ubiquitin preparation had undergone partial proteolytic cleavage at Arg74 and, to a lesser extent, at Arg72. Similarly, we suspect that the ubiquitin-74 preparation had also undergone partial proteolytic cleavage at Arg72. This would explain why we see the last four residues so clearly, and not just the terminal glycylglycine, in the difference electron density map, and it would also fit qualitatively with the average occupancy derived for residues 73-74 and 75-76. Although extreme thermal motion could be part of the reason for the difficulty in seeing this part of the molecule

in the electron density maps, we feel that heterogeneity is probably more important. Although the average temperature factors for this region are higher than for the remainder of the molecule, this is the only portion of the molecule that is not involved in intramolecular hydrogen bonding. In addition, this is the portion of the molecule that interacts with target proteins, so a great deal of flexibility in this region is not unexpected.

(d) Solvent structure

Crystals of ubiquitin have a relatively low solvent volume of about 33% (Vm = 1.83 Å³/dalton), and the estimated number of water molecules per asymmetric unit is only 173. The solvent structure included in the refinement consists of 58 water molecules; a summary of the environment of the water molecules is given in Table 3. There are seven solvent molecules that have no contact closer than 3.2 Å with protein or other solvent atoms. However, these seven water molecules are within van der Waals' contact distance (4.0 Å) of other hydrogen-bond donors or acceptors. Of the 29 water molecules with partial occupancies, five form no contact closer than 3.2 Å, and 11 form only one contact. On the other hand, 20 of the 29 water molecules with full occupancy form two or more contacts closer than 3.2 Å.

Carbonyl oxygen atoms of peptide bonds generally participate in two to three times as many hydrogen bonds to water as amide nitrogen atoms (Finney, 1979; Holmes & Matthews, 1982; Rees et al., 1983). The ratio is about 1.6 for ubiquitin. This low ratio is probably due to the high number of carbonyl oxygen atoms involved in hydrogen-bonded secondary structure. As is commonly seen, water molecules form more hydrogen bonds with oxygen atoms than with nitrogen atoms. At the end of refinement, no solvent molecule had an occupancy less than 0.36 or a temperature factor greater than 46 Å².
For sites in the range of 36 to 50% occupancy, there may be an error in coordinates and a possibility for the misassignment of a water molecule to a site of noise in the electron density map. However, all of these sites correspond to peaks of difference electron density that were well above noise level and that had a high probability for hydrogen bonding with other atoms of the model.
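The solvent figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the original analysis. It assumes Matthews' rule of thumb that protein occupies roughly 1.23 Å³ per dalton (a value not stated in this paper) to estimate the solvent fraction from the reported 1.83 Å³/dalton. It then tallies the contact counts transcribed from Table 3 to recover the ratio of carbonyl-oxygen to amide-nitrogen contacts with water. All function and variable names are hypothetical.

```python
"""Arithmetic cross-check of the solvent figures in the text and Table 3."""

# Reported in the text: Matthews coefficient in cubic angstroms per dalton.
V_M = 1.83

# ASSUMPTION (not from the paper): Matthews' rule of thumb that protein
# occupies roughly 1.23 cubic angstroms per dalton of molecular mass.
PROTEIN_VOLUME_PER_DALTON = 1.23


def solvent_fraction(v_m: float) -> float:
    """Estimate the crystal solvent fraction from a Matthews coefficient."""
    return 1.0 - PROTEIN_VOLUME_PER_DALTON / v_m


# Table 3, keyed by the number of contacts (within 3.2 angstroms) per water.
# Each value: (waters in class, of which partially occupied, contacts by type).
TABLE_3 = {
    0: (7, 5, {}),
    1: (18, 11, {"H2O": 4, "O": 4, "OX": 3, "N": 5, "NX": 2}),
    2: (17, 8, {"H2O": 9, "O": 10, "OX": 5, "N": 7, "NX": 3}),
    3: (13, 3, {"H2O": 13, "O": 9, "OX": 11, "N": 3, "NX": 3}),
    4: (3, 2, {"H2O": 2, "O": 3, "OX": 1, "N": 1, "NX": 5}),
}


def row_is_consistent(n_contacts: int, row: tuple) -> bool:
    """A class of w waters with n contacts each must list n * w contacts."""
    waters, _, contacts = row
    return sum(contacts.values()) == n_contacts * waters


def contacts_by_type(table: dict) -> dict:
    """Sum the contacts over all classes for each interacting atom type."""
    totals = {}
    for _, _, contacts in table.values():
        for atom_type, count in contacts.items():
            totals[atom_type] = totals.get(atom_type, 0) + count
    return totals


TOTALS = contacts_by_type(TABLE_3)

# Backbone carbonyl oxygen (O) versus backbone amide nitrogen (N) contacts.
CARBONYL_TO_AMIDE_RATIO = TOTALS["O"] / TOTALS["N"]

# Fully occupied waters (class size minus partial) with two or more contacts.
FULL_OCCUPANCY_MULTI_CONTACT = sum(
    waters - partial
    for n_contacts, (waters, partial, _) in TABLE_3.items()
    if n_contacts >= 2
)

if __name__ == "__main__":
    print(f"solvent fraction: {solvent_fraction(V_M):.3f}")
    print(f"contacts by type: {TOTALS}")
    print(f"carbonyl O : amide N ratio: {CARBONYL_TO_AMIDE_RATIO:.3f}")
    print(f"full-occupancy waters with >= 2 contacts: {FULL_OCCUPANCY_MULTI_CONTACT}")
```

Under these assumptions the estimated solvent fraction is close to the quoted 33%, the O:N contact ratio comes to 26/16, or about 1.6, and 20 fully occupied water molecules make two or more contacts, in line with the values given in the text.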

5. Discussion

(a) Stability of the molecule

In addition to its physiological roles, ubiquitin is of interest because of its stability. The molecule is extremely resistant to tryptic digestion despite the presence of seven lysine residues and four arginine residues (Schlesinger et al., 1975). It is also quite stable over a wide range of pH and temperature values (Lenkinski et al., 1977). The overall structure of ubiquitin is extremely compact and tightly hydrogen-bonded. The only portion of the molecule without significant intramolecular hydrogen bonding and close packing contacts is the COOH terminus. The stability is probably also enhanced by the pronounced hydrophobic core formed by residues from the α-helix and the β-sheet (Fig. 15). The α-helix contributes three hydrophobic residues, and 11 of the 13 hydrophobic residues in the β-sheet participate in the formation of the hydrophobic core. Only Ile44 and Val70 are somewhat exposed. Other residues that contribute to this core include Ile36, Leu56 and Ile61.

The hydrogen-bonding interactions between the helices and the turns (Fig. 12) may be another factor responsible for the unusual stability of the molecule. These interactions were not clear until the protein model was refined. The molecule displays a novel pattern of hydrogen bonding that involves a symmetric arrangement of the two helices and two reverse turns. It is interesting to note that two of the three amino acid changes in yeast ubiquitin (Wilkinson et al., 1986) and all three of the amino acid changes in oat ubiquitin (Vierstra et al., 1986) occur in this region. The extreme evolutionary stability of ubiquitin suggests that the structural constraints necessary for catalytic activity or

Table 3
Water molecule environment

Number of           Number of water       Type of interacting atom
interactions per    molecules in class    H2O    O     OX    N     NX
water molecule
0                    7(5)                 -      -     -     -     -
1                   18(11)                4      4     3     5     2
2                   17(8)                 9      10    5     7     3
3                   13(3)                 13     9     11    3     3
4                    3(2)                 2      3     1     1     5
Total               58(29)                28     26    20    16    13

All atoms within 3.2 Å of a given water molecule are considered to be interacting atoms. The atom identification is as follows: H2O, solvent atom; O, backbone carbonyl oxygen; OX, any other protein oxygen; N, backbone amide nitrogen; NX, any other nitrogen. The numbers in parentheses represent water molecules with partial occupancies.


Figure 15. Stereo drawing of the hydrophobic interactions between the α-helix and the β-sheet. The view is down the barrel of the helix.

recognition of ubiquitin by other proteins are quite strict. Since the only sequence changes identified thus far are confined to this region, it seems likely that this region is not directly involved in these functions.

(b) Availability of lysines

In the ubiquitin-mediated pathway for the degradation of proteins, several molecules of ubiquitin are linked to the protein substrate by amide linkages. Hershko & Heller (1985) have shown that some high molecular weight conjugates with ubiquitin contain structures in which one molecule of ubiquitin is linked to an ε-amino group of another molecule of ubiquitin. Their findings indicated that the formation of polyubiquitin chains is not necessary for protein breakdown, but it might accelerate the rate of degradation.

The seven lysine residues in ubiquitin were examined for intramolecular contacts and accessibility for binding. Three of the lysine residues (Lys6, Lys33 and Lys63) are fully exposed on the surface of the molecule and form no intramolecular contacts. Lys6 occurs at the end of a β-strand just before a reverse turn, Lys33 occurs just before a reverse turn at the end of the α-helix, and Lys63 is the third residue in a reverse turn. Of the remaining four, two are involved in salt-bridge interactions and two are involved in hydrogen-bond interactions. The salt bridges are formed between the ε-amino nitrogen atoms of Lys11 and Lys27, and the carboxylate oxygen atoms of Glu34 and Asp52, respectively. The hydrogen bonds involve the ε-amino nitrogen atoms of Lys29 and Lys48, and the carbonyl oxygen atoms of Glu16 and Ala46, respectively. The two lysine residues at positions 27 and 29 are the least exposed of the seven. These two extend from the α-helix toward the β-sheet. These positions agree quite well with the results of Jabusch & Deutsch (1985), who examined the reaction of

p-nitrophenyl acetate with ubiquitin. They found that lysine residues 27 and 29 showed little reactivity. Residue 6 was the most readily acetylated, and residues 11, 33, 48 and 63 showed intermediate reactivities.

(c) Location of aromatic residues

Nuclear magnetic resonance studies (Cary et al., 1980; Jenson et al., 1980) suggested that the tyrosine and histidine residues, as well as one of the phenylalanine residues, were in hydrophobic environments and might be buried. In the crystal structure, the two phenylalanine and the histidine residues are on the same surface of the molecule. Phe4, His68 and Phe45 are on the three middle strands of the β-sheet. The closest side-chains around the imidazole ring of histidine are provided by Lys6, Ile44 and Thr66. While the histidine side-chain is almost perpendicular to the surface, the two phenylalanine side-chains lie roughly tangent to the surface of the molecule. Phe4 does not have any close hydrophobic contacts, but Phe45 lies in a shallow hydrophobic pocket composed of Ala46, Ile61 and Leu67. As described above, the tyrosine side-chain spans the large loop involving residues 51 to 59 and contributes to the stability of the loop by formation of a hydrogen bond between N of Glu51 and OH of the Tyr59 side-chain.

(d) Correlation of structure with function

Most studies suggest that ubiquitin functions primarily in intracellular ATP-dependent protein degradation. Protein breakdown in this pathway requires the formation of covalent conjugates in which carboxyl terminals of ubiquitin molecules become attached to the target protein via amide linkages. Therefore, this portion of the molecule would require considerable freedom of motion. The COOH terminus is protruding from the structure


and does not interact with the rest of the molecule by any hydrogen bonding or hydrophobic interactions. Thus, this portion of the molecule is accessible to enzymes involved in formation or cleavage of the isopeptide bond.

In light of the evidence that α-amino-terminal ubiquitination may be important for protein degradation, the environment of Met1 is of some interest. The first seven residues of the NH2 terminus are fairly tightly constrained by two adjacent strands of the β-sheet. Also, the side-chain of methionine is buried in the interior of the molecule and is hydrogen-bonded through the S atom to Lys63. This renders the NH2 terminus virtually inaccessible, and probably is important in preventing the degradation of ubiquitin by the ubiquitin pathway. Since ubiquitin is synthesized in cells as a polyubiquitin molecule where the COOH terminus of one molecule is covalently linked to the α-NH2 terminus of the next molecule, the first few residues probably have to undergo significant conformational changes after cleavage of the large precursor into individual molecules.

As mentioned above, the function of ubiquitin on the lymph node homing receptor is unclear. However, its conformation on the homing receptor is different from the conformation of free ubiquitin or of ubiquitin bound to other proteins (St John et al., 1986). It is suspected that these conformational changes might enable ubiquitin to contribute to specific receptor interactions. These conformational changes occur in residues 64 to 76. It is noteworthy that this area corresponds to the β-strand that begins with the unusual G1 β-bulge; β-bulges between pairs of hydrogen bonds on parallel strands are extremely rare (Richardson, 1981). In ubiquitin, the bulge is required for proper orientation of the middle strand of the five-stranded β-sheet.
Since β-bulges tend to occur in critical positions at active or binding sites of proteins (Richardson et al., 1978), it is possible that this unusual secondary structural feature is important in the interaction of ubiquitin with target proteins.

While it is clear that ubiquitin can function as a signal for proteolysis, there is still a great deal that is unknown about specific ubiquitin-dependent proteolytic reactions. Also, it is still unclear whether ubiquitination can serve non-proteolytic functions as well. Unfortunately, while on the basis of the structure of ubiquitin we can speculate about some of these processes, further studies on ubiquitin-protein conjugates and the enzymes involved in conjugation will be required for resolution of these questions.

We thank Wayne A. Hendrickson for helpful discussions about the refinement procedures and Keith D. Wilkinson for providing purified human erythrocytic ubiquitin and ubiquitin-74. This work was supported by National Institutes of Health grants GM-27144, CA-13148 and DE-02670. W.J.C. is the recipient of National Institutes of Health Career Development Award CA-00696.


References

Bhat, T. N. & Cohen, G. H. (1984). J. Appl. Crystallogr. 17, 244-248.
Birktoft, J. J. & Blow, D. M. (1972). J. Mol. Biol. 68, 187-240.
Blevins, R. A. & Tulinsky, A. (1985). J. Biol. Chem. 260, 8865-8872.
Bond, U. & Schlesinger, M. J. (1985). Mol. Cell Biol. 5, 949-956.
Busch, H. & Goldknopf, I. L. (1981). Mol. Cell. Biochem. 40, 173-187.
Cary, P. D., King, D. S., Crane-Robinson, C., Bradbury, E. M., Rabbani, A., Goodwin, G. H. & Johns, E. W. (1980). Eur. J. Biochem. 112, 577-580.
Chin, D. T., Kuehl, L. & Rechsteiner, M. (1982). Proc. Nat. Acad. Sci., U.S.A. 79, 5857-5861.
Ciechanover, A., Heller, H., Elias, S., Haas, A. L. & Hershko, A. (1980). Proc. Nat. Acad. Sci., U.S.A. 77, 1365-1368.
Cook, W. J., Suddath, F. L., Bugg, C. E. & Goldstein, G. (1979). J. Mol. Biol. 130, 353-355.
Dworkin-Rastl, E., Shrutkowski, A. & Dworkin, M. B. (1984). Cell, 39, 321-325.
Finley, D., Ciechanover, A. & Varshavsky, A. (1984). Cell, 37, 43-55.
Finney, J. L. (1979). In Water: A Comprehensive Treatise (Franks, F., ed.), vol. 6, pp. 47-122, Plenum Press, New York.
Gavilanes, J. G., de Buitrago, G. G., Perez-Castells, R. & Rodriguez, R. (1982). J. Biol. Chem. 257, 10,267-10,270.
Goldstein, G., Scheid, M., Hammerling, U., Boyse, E. A., Schlesinger, D. H. & Niall, H. D. (1975). Proc. Nat. Acad. Sci., U.S.A. 72, 11-15.
Hendrickson, W. A. & Konnert, J. H. (1981). In International Symposium on Biomolecular Structure (Srinavasan, R., ed.), pp. 43-57, Pergamon Press, Oxford.
Hershko, A. & Heller, H. (1985). Biochem. Biophys. Res. Commun. 128, 1079-1086.
Hershko, A., Ciechanover, A., Heller, H., Haas, A. L. & Rose, I. A. (1980). Proc. Nat. Acad. Sci., U.S.A. 77, 1783-1786.
Hershko, A., Eytan, E., Ciechanover, A. & Haas, A. L. (1982). J. Biol. Chem. 257, 13964-13970.
Hershko, A., Heller, H., Elias, S. & Ciechanover, A. (1983). J. Biol. Chem. 258, 8206-8214.
Hershko, A., Heller, H., Eytan, E., Kaklij, G. & Rose, I. A. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 7021-7025.
Holmes, M. A. & Matthews, B. W. (1982). J. Mol. Biol. 160, 623-639.
Hunt, L. T. & Dayhoff, M. O. (1977). Biochem. Biophys. Res. Commun. 74, 650-655.
Jabusch, J. R. & Deutsch, H. F. (1985). Arch. Biochem. Biophys. 238, 170-177.
Jenson, J., Goldstein, G. & Breslow, E. (1980). Biochim. Biophys. Acta, 624, 378-385.
Jones, T. A. (1978). J. Appl. Crystallogr. 11, 268-272.
Lenkinski, R. E., Chen, D. M., Glickson, J. D. & Goldstein, G. (1977). Biochim. Biophys. Acta, 494, 126-130.
Levinger, L. & Varshavsky, A. (1982). Cell, 28, 375-385.
Luzzati, V. (1952). Acta Crystallogr. 5, 802-810.
Mueller, R. D., Yasuda, H., Hatch, C. L., Bonner, W. M. & Bradbury, E. M. (1985). J. Biol. Chem. 260, 5147-5153.
North, A. C. T., Phillips, D. C. & Mathews, F. S. (1968). Acta Crystallogr. Sect. A, 24, 351-359.
Olson, M. O. J., Goldknopf, I. L., Guetzow, K. A., James, G. T., Hawkins, T. C., Mays-Rothberg, C. J. & Busch, H. (1976). J. Biol. Chem. 251, 5901-5903.
Ozkaynak, E., Finley, D. & Varshavsky, A. (1984). Nature (London), 312, 663-666.
Ramachandran, G. N., Ramakrishnan, C. & Sasisekharan, V. (1963). J. Mol. Biol. 7, 95-99.
Rees, D. C., Lewis, M. & Lipscomb, W. N. (1983). J. Mol. Biol. 168, 367-387.
Richardson, J. S. (1981). Advan. Protein Chem. 34, 167-339.
Richardson, J. S., Getzoff, E. D. & Richardson, D. C. (1978). Proc. Nat. Acad. Sci., U.S.A. 75, 2574-2578.
Schlesinger, D. H. & Goldstein, G. (1975). Nature (London), 255, 423-424.
Schlesinger, D. H., Goldstein, G. & Niall, H. D. (1975). Biochemistry, 14, 2214-2218.
Siegelman, M., Bond, M. W., Gallatin, W. M., St John, T., Smith, H. T., Fried, V. A. & Weissman, I. L. (1986). Science, 231, 823-829.
St John, T., Gallatin, W. M., Siegelman, M., Smith, H. T., Fried, V. A. & Weissman, I. L. (1986). Science, 231, 845-850.
Tsukada, H. & Blow, D. M. (1985). J. Mol. Biol. 184, 703-711.
Vierstra, R. D., Langan, S. M. & Schaller, G. E. (1986). Biochemistry, 25, 3105-3108.
Vijay-Kumar, S., Bugg, C. E., Wilkinson, K. D. & Cook, W. J. (1985). Proc. Nat. Acad. Sci., U.S.A. 82, 3582-3585.
Watson, D. C., Levy, W. B. & Dixon, G. H. (1978). Nature (London), 276, 196-198.
Wiborg, O., Pedersen, M. S., Wind, A., Berglund, L. E., Marcker, K. A. & Vuust, J. (1985). EMBO J. 4, 755-759.
Wilkinson, K. D. & Audhya, T. K. (1981). J. Biol. Chem. 256, 9235-9241.
Wilkinson, K. D., Cox, M. J., O'Connor, L. B. & Shapira, R. (1986). Biochemistry, 25, 4999-5004.

Edited by R. Huber