Prog. Biophys. molec. Biol., 1982, Vol. 40, pp. 115-159. Printed in Great Britain. All rights reserved.
0079-6107/82/020115-45$22.50/0 Copyright © Pergamon Press Ltd
NEUTRON DIFFRACTION OF CRYSTALLINE PROTEINS
ALEXANDER WLODAWER
National Measurement Laboratory, National Bureau of Standards, Washington, DC 20234, U.S.A. and Laboratory of Molecular Biology, NIADDK, Bethesda, MD 20205, U.S.A.
CONTENTS
I. INTRODUCTION
  1. Advantages of Neutron Diffraction
II. GENERAL DISCUSSION OF THE TECHNIQUE
  1. Generation of Neutrons
  2. Practical Units of Exposure
  3. Difficulties in Neutron Diffraction Experiments
    (a) Low neutron flux
    (b) Crystal growth and mounting
    (c) High incoherent background
  4. Radiation Damage
  5. Low-temperature Crystallography
  6. Initial Orientation of a Crystal for Data Collection
III. INSTRUMENTATION FOR DATA COLLECTION
  1. Four-Circle Diffractometers
  2. Photographic Measurements
  3. Linear Position-Sensitive Detectors
  4. Area Detectors
  5. Utilization of White Neutron Beams
    (a) Fourier chopper for a steady-state neutron source
    (b) Time-of-flight methods with pulsed neutron sources
IV. DATA REDUCTION
  1. Lorentz-polarization Correction
  2. Correction for λ/2 Component
  3. Absorption Correction
  4. Peak Integration
V. DETERMINATION OF AN INITIAL MODEL FOR A PROTEIN
  1. Isomorphous Replacement
  2. Anomalous Scattering
  3. Utilization of an X-ray Model
VI. METHODS OF NEUTRON STRUCTURE REFINEMENT
  1. Real-Space Refinement
  2. Difference Fourier Methods
  3. Reciprocal-Space Refinement
VII. RESULTS OF NEUTRON DIFFRACTION STUDIES OF PROTEINS
  1. Improvements to the Atomic Model of a Protein
  2. Search for Individual Hydrogen Atoms
  3. Solvent Structure
  4. Hydrogen Exchange
VIII. SUMMARY AND CONCLUSIONS
ACKNOWLEDGEMENT
REFERENCES
ABBREVIATIONS AND SYMBOLS
CA, HA, CG1, ND2, etc.   Atom names in amino acids, as given in IUPAC-IUB recommendations, with the exception that Roman rather than Greek letters describe atomic positions
Fo                       observed structure amplitude
Fc                       calculated structure amplitude
Fermi                    unit of scattering length equal to 10⁻¹³ cm
I                        reflection intensity
MIR                      multiple isomorphous replacement
NMR                      nuclear magnetic resonance
PSD                      position sensitive detector
R                        crystallographic R factor, defined as $R = \sum ||F_o| - |F_c|| / \sum |F_o|$
α                        calculated phase angle for a structure factor
μ                        linear absorption coefficient
λ                        wavelength
2θ, ω, χ, φ              diffractometer angles, as defined by Busing and Levy (1967)
I. INTRODUCTION
The aim of this review is to present the current status of single-crystal neutron diffraction studies of proteins. While the related technique of X-ray diffraction has become very popular since the structure of myoglobin was first reported by Kendrew et al. (1958), few reports of protein crystal investigations using neutron diffraction have appeared. This is changing rapidly, however, and several structures have been completed and reported in the last two years. These developments make it desirable to summarize the current status of neutron diffraction analysis of proteins, to present and discuss the type of information it can provide, and to consider the experimental difficulties that bear on the interpretation of the results. In particular, I will emphasize those features of the technique which differ from their X-ray counterparts, both in data collection and in interpretation.
While all of the reported neutron single-crystal investigations have dealt with proteins, there is nothing in the technique that would make it unsuitable for studying crystals of other biomolecules, such as nucleic acids. The term "protein crystallography" will often be used throughout this review, but the techniques described below are applicable to the whole range of macromolecules, in the same sense as established in the standard textbook "Protein Crystallography" (Blundell and Johnson, 1976), which does not limit itself to proteins alone. On the other hand, I will mention only in passing the studies of molecules that have a molecular weight of less than 5000 daltons (such as vitamin B12) and which belong crystallographically to the class of "small molecules".
1. Advantages of Neutron Diffraction
The interactions between neutrons and the atoms in the crystal lattice differ in several respects from the interactions between these atoms and X-rays. The major difference arises because X-rays are diffracted by electron clouds, while neutrons are scattered by nuclei. As a consequence, the X-ray scattering amplitudes are proportional to the number of electrons surrounding each atom, while the neutron scattering amplitudes depend on nuclear forces, which vary even between different isotopes of the same element (Table 1). Thus the X-ray scattering amplitudes are positive, are the same for different isotopes, and span a range of 0.28 × 10⁻¹² cm to 25.9 × 10⁻¹² cm at zero scattering angle for the naturally occurring elements. The neutron scattering amplitudes range from -0.87 × 10⁻¹² cm (⁶²Ni) to 4.94 × 10⁻¹² cm (¹⁶⁴Dy), and for the vast majority of isotopes the range is only 0.4 × 10⁻¹² cm (4 Fermi) to 0.8 × 10⁻¹² cm (8 Fermi) (Bacon, 1975). The negative sign of a scattering amplitude corresponds to a change in phase on scattering of 0° rather than 180°.
The importance of these scattering amplitudes to the studies of biological macromolecules is obvious. While the structure investigated by X-ray techniques is influenced in a similar way by the scattering of oxygen, nitrogen, and carbon atoms, the influence of hydrogen atoms is almost negligible unless very high-resolution data are available. In the structure investigated by neutron diffraction, however, the contribution of deuterium atoms is similar to that of the other three atom types mentioned above, while the contribution of hydrogens is still about half that magnitude but of the opposite sign. While this change of sign is a source of considerable experimental difficulties (see below), neutron diffraction allows direct location of hydrogens in the structure and allows differentiation between their isotopes. Since knowledge of the exact location of hydrogens is crucial for the understanding of many biological problems, this feature is probably the most important reason for using the neutron diffraction technique.

TABLE 1. A COMPARISON OF NEUTRON AND X-RAY SCATTERING DATA FOR SELECTED ELEMENTS AND ISOTOPES OF BIOLOGICAL INTEREST (Bacon, 1975)
                                                     X-ray scattering factors (10⁻¹² cm)
Element  Isotope  Atomic   Neutron scattering       sin θ = 0    (sin θ)/λ = 0.5 Å⁻¹
                  number   length (10⁻¹² cm)
H        ¹H        1       -0.374                    0.28         0.02
H        ²H (D)    1        0.667                    0.28         0.02
C        ¹²C       6        0.665                    1.69         0.48
N        ¹⁴N       7        0.94                     1.97         0.53
O        ¹⁶O       8        0.58                     2.25         0.62
Na       ²³Na     11        0.36                     3.09         1.14
Mg                12        0.52                     3.38         1.35
P        ³¹P      15        0.51                     4.23         1.83
S        ³²S      16        0.28                     4.5          1.9
Cl                17        0.96                     4.8          2.0
K                 19        0.37                     5.3          2.2
Mn       ⁵⁵Mn     25       -0.39                     7.0          3.1
Fe                26        0.95                     7.3          3.3
Ni       ⁶²Ni     28       -0.87                     7.9          3.6
Dy       ¹⁶⁴Dy    66        4.94                    18.6         10.0
U                 92        0.85                    25.9         14.8

Even though the scattering amplitudes of the most common atoms in biological systems (other than hydrogen) are similar, the scattering amplitude of nitrogen is about 40% higher than that of D, C or O. Thus precise neutron measurements should be capable of distinguishing nitrogen from carbon (providing the absolute orientation of residues such as histidine) or distinguishing nitrogen from oxygen in the amide groups of glutamine or asparagine. This differentiation is often made even easier by the differential exchange rates of hydrogens attached to these atoms, as will be shown below.
While X-radiation is ionizing and causes damage to the crystals (mostly through the generation of free radicals), no similar damage is present in neutron experiments. The radiation damage in the X-ray case can often lead to substantial changes in the structure and can seriously influence the resulting protein model, particularly the temperature factors. It is uncommon to collect all of the X-ray data from only one crystal, and the difficulties introduced by scaling can be considerable. No such problems exist in neutron experiments.
Since many neutron diffraction experiments are performed in deuterated solvents, this technique allows direct studies of hydrogen exchange in proteins. While the protons covalently bound to carbon atoms are unlikely to exchange and those bound to side-chain oxygen and nitrogen atoms will almost certainly exchange, only partial exchange is possible for some of the most highly protected amides. Knowledge of the locations of the protected amides can be used in the investigation of the mobility and accessibility of the various regions of a polypeptide.
Neutron diffraction experiments promise a much better description of the solvent layer surrounding a protein molecule. While the oxygen atom is the only effective scatterer in the X-ray measurement, the deuteriums provide similar scattering power in a neutron experiment, thus tripling the contribution of the solvent. (For reasons discussed below, the neutron experiments are usually carried out in deuterated solvents.) While the range of neutron scattering amplitudes is not large, some isotopes are very strong anomalous scatterers. The relative importance of the imaginary component of the structure amplitude is larger than in the X-ray case and can be utilized in the phasing of the structure.
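The statements about relative scattering power can be checked with simple arithmetic on the Table 1 values. The following is only a rough sketch: it assumes that the forward scattering of a water molecule is the plain sum of its atomic amplitudes, ignoring thermal motion and disorder, and the helper name `summed_amplitude` is purely illustrative.

```python
# Values from Table 1, in units of 1e-12 cm (Bacon, 1975).
NEUTRON_B = {"H": -0.374, "D": 0.667, "C": 0.665, "N": 0.94, "O": 0.58}
XRAY_F0 = {"H": 0.28, "D": 0.28, "C": 1.69, "N": 1.97, "O": 2.25}  # sin(theta) = 0


def summed_amplitude(table, composition):
    """Add per-atom amplitudes for a composition such as {"O": 1, "D": 2}."""
    return sum(table[atom] * count for atom, count in composition.items())


# Neutron case: heavy water compared with its oxygen atom alone.
d2o_neutron = summed_amplitude(NEUTRON_B, {"O": 1, "D": 2})
solvent_gain = d2o_neutron / NEUTRON_B["O"]

# Neutron case: light water, where the negative H terms outweigh the oxygen term.
h2o_neutron = summed_amplitude(NEUTRON_B, {"O": 1, "H": 2})

# X-ray case: share of the forward scattering of water that comes from oxygen.
xray_oxygen_share = XRAY_F0["O"] / summed_amplitude(XRAY_F0, {"O": 1, "H": 2})

# Nitrogen relative to deuterium, the basis for telling N from C or O.
nitrogen_excess = NEUTRON_B["N"] / NEUTRON_B["D"] - 1.0

print(f"D2O relative to O alone (neutrons): {solvent_gain:.2f}")
print(f"H2O summed amplitude (neutrons):    {h2o_neutron:.3f}")
print(f"Oxygen share of H2O (X-rays):       {xray_oxygen_share:.2f}")
print(f"N excess over D (neutrons):         {nitrogen_excess:.0%}")
```

Under these assumptions the deuterated water molecule scatters a little more than three times as strongly as its oxygen atom alone, in line with the "tripling" mentioned above, while the summed neutron amplitude of light water comes out slightly negative.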
II. GENERAL DISCUSSION OF THE TECHNIQUE
1. Generation of Neutrons
The existence of neutrons was demonstrated fifty years ago by Chadwick, and almost immediately these particles were considered for diffraction studies. Naturally occurring α-emitters were initially utilized as neutron sources, but the usefulness of neutrons for crystallography was only established after the development of beam reactors. The reaction responsible for neutron generation in the reactor core is the capture of a neutron by a ²³⁵U nucleus, which then breaks up into two or more fragments and releases several neutrons in turn. Under suitable geometric conditions, the process can achieve steady state when the excess neutrons are allowed to escape. The energy of these neutrons is very high (on the order of several MeV), and in this form they are not suitable for diffraction measurements, which generally require wavelengths of 1-3 Å, corresponding to neutron energies of about 82 to 9 meV. Such energies can easily be achieved by allowing the neutrons to equilibrate, by multiple collisions, with a suitable moderator (usually graphite or D₂O), which is kept at a temperature usually not exceeding about 320 K. The distribution of neutron energies at that temperature is shown in Fig. 1, and it can be seen that the root-mean-square neutron velocities correspond to a wavelength of 1.43 Å, suitable for crystallographic studies. This temperature is also suitable for efficient reactor operation, and many such sources have been constructed in the last thirty years.
The construction of high-flux reactors saw its peak in the 1960s and early 1970s, and very few new instruments have been constructed recently. Several reasons have contributed to this fact.
One of them is cost: the full price of a reactor may now exceed $100 million, which is comparable to the cost of major instruments used in particle physics and three to four orders of magnitude higher than that of laboratory X-ray sources. Also, the association of research reactors with the field of nuclear power (even though the low core temperature makes them much safer than power reactors) presents difficult political problems. The main reason, however, is that the technical development of research reactors probably peaked with the construction of the reactor at the Institut Laue-Langevin in Grenoble, France. This reactor has a very high neutron flux in its core (1.2 × 10¹⁵ neutrons cm⁻² sec⁻¹), since it was built with only a single fuel element and is operated at a power of 57 MW. The difficulties of dissipating that amount of power from a very compact core make the prospects of increasing the flux of reactor-generated neutrons in any significant way rather poor.
FIG. 1. A diagram showing the distribution of the neutron flux versus wavelength of the beam generated by a reactor operated at the moderator temperature of about 320 K. Peak wavelength and the wavelength used for protein crystallographic studies at the National Bureau of Standards Reactor are marked (1.43 and 1.67 Å on the wavelength axis). The wavelength band is marked with two vertical lines (not to scale).
A different type of neutron source can deal better with the problems of heat dissipation by producing neutrons in pulses rather than continuously. Two pulsed reactors have been used in Dubna, U.S.S.R., for a number of years, but even though the design of a macromolecular diffractometer has been described (Bally et al., 1974), no further reports have been published. Macromolecular diffraction facilities are also under construction at the spallation sources at the Rutherford Laboratory (England) and at the Argonne and Los Alamos laboratories (U.S.A.). Some discussion of the properties of pulsed sources and of their usefulness for crystallography will be presented below, but since all the current investigations have used steady-state reactor sources, I will concentrate in this review on the "classical" techniques.
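The link between neutron wavelength, energy, and moderator temperature used in this section follows from the de Broglie relation. The sketch below is an illustration only: the function names are hypothetical, the physical constants are standard SI values, and the moderator is treated as an ideal gas whose root-mean-square speed is sqrt(3kT/m).

```python
import math

H = 6.62607015e-34        # Planck constant, J s
M_N = 1.67492749804e-27   # neutron mass, kg
K_B = 1.380649e-23        # Boltzmann constant, J/K
EV = 1.602176634e-19      # joules per electron volt


def energy_mev(wavelength_angstrom):
    """Kinetic energy in meV of a neutron with the given de Broglie wavelength."""
    lam = wavelength_angstrom * 1e-10
    return H**2 / (2.0 * M_N * lam**2) / EV * 1e3


def rms_wavelength_angstrom(temperature_k):
    """Wavelength in angstroms corresponding to the r.m.s. speed sqrt(3kT/m)."""
    v_rms = math.sqrt(3.0 * K_B * temperature_k / M_N)
    return H / (M_N * v_rms) * 1e10


for lam in (1.0, 1.43, 1.68, 3.0):
    print(f"lambda = {lam:4.2f} A  ->  E = {energy_mev(lam):5.1f} meV")
print(f"r.m.s. wavelength at 320 K: {rms_wavelength_angstrom(320.0):.2f} A")
```

With these idealized assumptions the r.m.s. wavelength at 320 K comes out near 1.4 Å, close to the 1.43 Å quoted above, and the 1-3 Å range corresponds to energies of tens of meV rather than MeV.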
2. Practical Units of Exposure
The flux of neutrons incident on the sample varies (both in the short and in the long term) because of fluctuations in the reactor output. This would cause difficulties if the exposure times were set simply by the clock, as is customary in X-ray diffractometry. Instead, the neutron flux is monitored by a low-efficiency counter, each exposure is defined by a preset number of monitor counts, and the approximate relationship between monitor counts and clock time is used in choosing that preset. Kossiakoff and Spencer (1981) reported that the time spent on counting each frame in the data collection for trypsin varied from 105 to 120 sec for the identical monitor count. If the monitor counter is gated by the dead-time detection circuit of a PSD, the dead-time correction can be implemented at the same time (Wlodawer and Sjölin, 1982a).
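The effect of counting against a monitor rather than a clock can be illustrated with a few lines of arithmetic. This is a hypothetical sketch, not the procedure of any cited study: both function names and the example numbers other than the 105-120 sec range are assumptions.

```python
def scale_to_reference_monitor(detector_counts, monitor_counts, reference_monitor):
    """Scale raw detector counts to a common (reference) number of monitor counts."""
    if monitor_counts <= 0:
        raise ValueError("monitor_counts must be positive")
    return detector_counts * reference_monitor / monitor_counts


def implied_flux_spread(frame_times_sec):
    """Relative flux variation implied by the times needed to reach one fixed
    monitor count: the shortest frame saw the highest flux."""
    return max(frame_times_sec) / min(frame_times_sec) - 1.0


# Frame times of 105-120 sec for an identical monitor count (trypsin data).
spread = implied_flux_spread([105.0, 120.0])
print(f"flux variation a clock-timed exposure would have absorbed: {spread:.1%}")

# A frame stopped early (hypothetical numbers) can be put on the common scale.
print(scale_to_reference_monitor(900, 90_000, 100_000))
```

The point of the design is that the quantity held constant is the integrated incident flux, so fluctuations of reactor output change the duration of a frame but not its effective exposure.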
3. Difficulties in Neutron Diffraction Experiments
(a) Low neutron flux
The principal problem associated with the neutron crystallography of proteins is the low flux of neutrons at the sample position. The numbers quoted in different publications range from 5 × 10⁶ to 2 × 10⁸ neutrons cm⁻² sec⁻¹. This should be compared with 3 × 10¹⁰ photons cm⁻² sec⁻¹ available from a standard X-ray tube, 1.6 × 10¹¹ photons cm⁻² sec⁻¹ available from a GX-6 rotating anode, and 1.2 × 10¹³ photons cm⁻² sec⁻¹ available from a synchrotron (Phillips et al., 1977). The exact flux values depend strongly on the geometry of the beam tube, beam divergence, monochromator type, etc., but it is clear that the available neutron fluxes are several orders of magnitude lower than the corresponding fluxes of X-ray beams. If the same crystal were to be used for both types of experiments and the experimental details were to be kept similar, the time required for a neutron experiment would be prohibitively long. Clearly, it is imperative to compensate as much as possible for the effects of the low neutron flux.
While it is difficult to compare the fluxes for different neutron instruments directly, it may be instructive at least to consider the quoted estimates. The neutron flux is usually measured either by activation of a gold foil or by irradiation of a calibrated ²³⁵U sample placed in the crystal position. The flux registered on the smallest of the reactors used for protein studies, a 10 MW medium-flux facility at the National Bureau of Standards, Washington, DC, U.S.A., was 6 × 10⁶ neutrons cm⁻² sec⁻¹ with an estimated error of 10% (Wlodawer, 1980). This flux was measured with a calibrated fission monitor at λ = 1.68 Å; the beam line was equipped with a graphite monochromator, and beam divergence was 40'. Early experiments conducted on the 40 MW Brookhaven reactor (Upton, NY, U.S.A.) showed a flux of 10~ neutrons cm⁻² sec⁻¹ (λ = 1.527 Å, germanium 220 monochromator; Schoenborn, 1969). The flux was later increased to 10⁷ neutrons cm⁻² sec⁻¹ at 1.6 Å, using either a germanium monochromator or a pyrolytic graphite monochromator (Norvell et al., 1975). Most recently a flux of 5 × 10⁷ neutrons cm⁻² sec⁻¹ was achieved at λ = 1.56 Å with the help of a bent pyrolytic graphite monochromator (Kossiakoff and Spencer, 1981). The flux measured at λ = 1.68 Å on the D8 beam line of the Institut Laue-Langevin high-flux reactor (Grenoble, France) was 2 × 10⁸ neutrons cm⁻² sec⁻¹, as measured by the activation of a gold foil (Bentley et al., 1979).
While the neutron fluxes quoted above differ by as much as a factor of thirty, no direct comparisons have been made (for example, by collecting data from the same crystal). Therefore, the importance of the available neutron flux should not be overestimated. It is clearly advisable to maximize the flux as much as possible, within the constraints imposed by the resolution required in a given experiment.
(b) Crystal growth and mounting
In view of the low neutron fluxes, it is necessary to use large crystals for the collection of diffraction data. Some of the crystals used until now have indeed been very large. A crystal of metmyoglobin used by Schoenborn (1971) had dimensions of 4 × 3 × 2 mm (volume 24 mm³). A single crystal approximately 20 mm³ in volume was used to collect lysozyme data (Bentley and Mason, 1981). A crystal of 30 mm³ was used for ribonuclease A (Wlodawer, 1980), and two crystals of 8 mm³ each were used for oxymyoglobin (Phillips and Schoenborn, 1981). The structure of trypsin was investigated using the smallest crystal so far; its volume was only 1.6 mm³ (Kossiakoff and Spencer, 1980). It is doubtful that much smaller crystals could be used with the current data collection facilities.
No general methods for growing such large crystals have been described, and indeed, none of the references gives much detail of the procedures. The secret seems to lie in using large volumes of concentrated protein solution, carefully excluding unwanted nucleation centers, and introducing only one or two seeds into each aliquot. The largest ribonuclease crystal grown to date had a volume of 100 mm³ and grew alone in a vial containing 5 ml of protein solution at a concentration of 50 mg/ml (Wlodawer, 1980).
While the crystals used for data collection usually need to be deuterated, it is uncommon to deuterate the protein in solution before attempting the growth of crystals. Bello and Harker (1961) have shown that the growth of crystals from deuterated ribonuclease A is possible, but later attempts were unsuccessful (Wlodawer, unpublished). This failure was most probably due to an insufficiently broad search for crystallization conditions. Given enough effort, conditions suitable for the growth of deuterated crystals could probably be found, but since the original conditions were well established, this effort was not considered to be necessary.
It may be useful in the future, though, since the structure of selectively deuterated proteins may turn out to be of interest.
The ways of mounting crystals for neutron data collection are usually similar to those used for X-ray studies. The principal differences are caused by the length of time needed for data collection, which necessitates taking serious precautions against crystal movement. This is made easier by the relatively low parasitic scattering of mounting materials such as quartz; the crystal can therefore be immobilized with quartz wool with little increase in the background. While the thickness of the capillary walls of glass or quartz tubes used in X-ray work is usually only 0.01 mm, quartz tubes with walls as thick as 0.5 mm can be safely used to mount crystals for neutron investigations. Since the largest commercially available crystal mounting capillaries have a diameter of only 2 mm, such tubes have to be specially manufactured. A method for mounting ribonuclease A crystals so that their shortest dimension was parallel to the tube axis was shown by Wlodawer (1980), and a sketch can be seen in Fig. 2. The crystal was immersed in mother liquor and immobilized with quartz wool, and the tube was sealed with silicone grease, dental wax, and epoxy glue. A crystal mounted in this manner was both chemically and positionally stable for over two years.
(c) High incoherent background
Incoherent scattering is the other major cause of difficulties in collecting high-quality neutron diffraction data. Hydrogen atoms are responsible for almost all of this incoherent scattering, since the cross section of hydrogen for coherent scattering is about 40 times lower than its cross section for incoherent scattering. This is the source of two problems.
First, since neutrons are removed from the incident beam, its attenuation is increased (this process thus increases the effective absorption coefficient, even though, properly speaking, it differs from true absorption). More importantly, the neutrons are scattered isotropically, and the background level is increased. It is important to stress that this increase in the incoherent scattering is due to the sample itself and, thus, cannot be eliminated. The situation in the neutron case is quite different from that found in X-ray diffraction, where scattering by the crystal mount and by the air surrounding the sample provides the principal contribution to the general background (Krieger et al., 1974). In the description of this phenomenon, it was suggested that with the choice of the proper collimating geometry and by surrounding the sample with helium it is possible to lower the background of the X-ray data considerably. Since the background in the neutron case originates directly from the sample, these simple precautions will not help.
FIG. 2. Diagram of a method used to mount ribonuclease crystals for neutron diffraction study. The crystal was completely immobilized and immersed in a synthetic mother liquor. Labelled parts: crystal; quartz wool, mother liquor; silicone. (Wlodawer, 1980. By permission of the International Union of Crystallography.)
No detailed comparisons of the signal-to-noise ratios in X-ray and neutron scattering have been made, but some feeling for the magnitude of this effect can be obtained by analyzing the data provided by Moore et al. (1967). These authors found that the mean peak (including the background) to background ratio for the observable reflections in the study of a monocarboxylic acid derivative of vitamin B12 was only 1.3:1. The X-ray integrated peak-to-background ratio for a medium-strength reflection 0 0 17 from a ribonuclease A crystal was 17.5 (Fig. 3), while the same reflection had a peak-to-background ratio of only 0.43 in an equivalent neutron measurement (Wlodawer, 1978, unpublished). Another indication of the influence of the high background, combined with the low neutron flux, can be gleaned from the published scaling statistics of the neutron and X-ray diffraction data. In the case of ribonuclease, scaling of the duplicate X-ray data sets extending to 2.5 Å and collected on two crystals, with 97% of the reflections considered observed (F > 2σ(F)), yielded a scaling R of 0.027 ($R = \sum |F_1 - F_2| / \sum F$; Wlodawer, 1980). The corresponding scaling R for the 2.8 Å neutron data was 0.067, with only 82% of the reflections observed (Sjölin and Wlodawer, 1981).
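The cost of a poor peak-to-background ratio can be estimated from Poisson counting statistics. The sketch below is an illustration under stated assumptions, not an analysis from the cited work: it takes the quoted ratios of 17.5 and 0.43 to mean net peak counts divided by background counts, assumes that peak and background are counted for equal times, and uses hypothetical function names.

```python
import math


def relative_error(net_counts, peak_to_background):
    """sigma(I)/I for a net intensity I = total - background.

    Both the total and the background are Poisson distributed, so the
    variance of the difference is total + background.
    """
    background = net_counts / peak_to_background
    total = net_counts + background
    return math.sqrt(total + background) / net_counts


def time_penalty(peak_to_background):
    """Factor by which counting time must grow, relative to a background-free
    measurement, to reach the same relative error on the net intensity."""
    return 1.0 + 2.0 / peak_to_background


for label, ratio in (("X-ray", 17.5), ("neutron", 0.43)):
    print(f"{label:8s} ratio {ratio:5.2f}: "
          f"sigma/I at 1000 net counts = {relative_error(1000.0, ratio):.3f}, "
          f"time penalty = {time_penalty(ratio):.2f}")
```

On these assumptions the background alone makes the neutron measurement several times more expensive per reflection than the X-ray one, before the difference in incident flux is even considered.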
Another example is provided by oxymyoglobin, for which the merging R factors (defined as $R = \sum |I_1 - I_2| / \sum I$) were 0.065 for the 2 Å X-ray data (Phillips, 1980) and 0.141 for the neutron data (Phillips and Schoenborn, 1981). While it is possible to achieve quite low scaling R factors with careful and long measurements (for triclinic lysozyme, R, defined as $R = \sum |F_1 - F_2| / \sum F_2$, was 0.08 for the 3183 repeated measurements on the same crystal; Bentley et al., 1979), it is unreasonable to expect the quality of the neutron data to routinely match the quality of the X-ray data which can be collected in an equivalent experiment.
FIG. 3. Profiles of the 0 0 17 reflection, collected using a crystal of ribonuclease A, plotted against scan step. Closed circles: X-ray intensities. Open circles: neutron intensities. Both reflections were measured by ω step-scans of 21 points each. Counting time per point was 2 sec in the X-ray measurements, 30 sec in the neutron experiment.
One of the easiest ways of reducing the incoherent background level is through the deuteration of the crystals. This can be accomplished by transferring the crystals to a synthetic mother liquor in which most or all of the hydrogen atoms have been replaced by deuterium. In the earliest studies of myoglobin, crystals were soaked in a solution of (NH₄)₂SO₄ in D₂O, and mass spectrometric analysis indicated that 45% of the hydrogens in the crystals were replaced by deuterium (Schoenborn, 1969). It was necessary to change the solvent a number of times in order to obtain the maximum exchange. The time spent for the exchange has been quoted as several months for lysozyme (Bentley and Mason, 1981), three months for oxymyoglobin (Phillips and Schoenborn, 1981), and six months for ribonuclease A (Wlodawer, 1980). Careful deuteration does not usually lead to any deterioration of the diffraction pattern of a protein crystal, even though some small changes in the unit cell parameters can sometimes be noticed (Bentley and Mason, 1981).
In some cases deuteration of the crystals is not easy to achieve. Ribonuclease crystals that are transferred directly to a deuterated mother liquor invariably disintegrate (Wlodawer, 1978, unpublished), even if care has been taken to keep the pH and ionic strength constant. Crystals of cytochrome c' maintain their apparent integrity, but the diffraction extends only to about 5 Å resolution after rapid transfer (Weber and Wlodawer, 1979, unpublished). Much better results can be obtained if the exchange is slow. A typical deuteration procedure followed for a ribonuclease crystal consisted of exchanging 10 µl of the mother liquor daily (out of a total volume of 1 ml) in the first two weeks, 50 µl every other day in the next two weeks, and 250 µl weekly in the next three months, followed by two complete exchanges a month apart. While the crystals always developed small cracks during this procedure, their ability to diffract did not suffer, as monitored using X-ray diffraction.
It is necessary to pay careful attention to the fact that the apparent pH measured in a deuterated solution differs from the true value by about 0.4 units (Glasoe and Long, 1960). Thus the deuterated mother liquor should register only 6.6 on the pH meter if it is to be equivalent to the original hydrogenous mother liquor at pH = 7.0. The situation is even more complicated in the presence of alcohols, since the measured pH values are not related to the "true" pH in any simple manner. The experimenter has to pay attention to this phenomenon in order to preclude pH shock (Douzou et al., 1974) and to maintain the crystals at the pH conditions of most interest to a particular experiment.
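The gradual character of the exchange schedule described above can be made concrete with a dilution calculation. This is a simplified sketch under explicit assumptions: ideal mixing after every exchange, no contribution from exchangeable hydrogens in the crystal itself, and exchange counts (14 daily, 7 every other day, 12 weekly) chosen here to approximate the quoted periods. The function names and the `SLOW_EXCHANGE` list are illustrative only.

```python
def residual_h_fraction(schedule, total_volume_ul=1000.0):
    """Fraction of the original hydrogenous solvent left after partial exchanges.

    schedule is a list of (volume_replaced_ul, number_of_exchanges) pairs;
    each exchange dilutes the remaining H2O by (1 - volume / total_volume).
    """
    fraction = 1.0
    for volume_ul, repeats in schedule:
        fraction *= (1.0 - volume_ul / total_volume_ul) ** repeats
    return fraction


def meter_reading_for(target_ph, offset=0.4):
    """pH-meter reading in D2O taken as equivalent to target_ph in H2O,
    using the 0.4 unit offset quoted in the text."""
    return target_ph - offset


# Assumed counts: 14 daily, 7 every-other-day, and 12 weekly exchanges.
SLOW_EXCHANGE = [(10.0, 14), (50.0, 7), (250.0, 12)]

print(f"H2O left after two weeks:       {residual_h_fraction(SLOW_EXCHANGE[:1]):.2f}")
print(f"H2O left after four weeks:      {residual_h_fraction(SLOW_EXCHANGE[:2]):.2f}")
print(f"H2O left before full exchanges: {residual_h_fraction(SLOW_EXCHANGE):.3f}")
print(f"meter reading for pH 7.0:       {meter_reading_for(7.0):.1f}")
```

In this idealized picture the solvent is still mostly hydrogenous after the first two weeks, which is consistent with the intent of avoiding an abrupt change of environment for the crystal.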
It should be pointed out, though, that the effect of D₂O on the ionization constants is almost exactly opposite to the above-mentioned phenomenon, and thus the values of pK obtained in D₂O are coincidentally almost identical with those found in H₂O (Roberts et al., 1969; Bundi and Wüthrich, 1979).
4. Radiation Damage
Crystals exposed to thermal neutrons are not expected to suffer radiation damage (Schoenborn, 1969). This contrasts with their behavior in an X-ray beam, which always produces some degree of damage, even though its rate can sometimes be low. The expectation of high stability of protein crystals in the neutron beam has been borne out in practice, since none of the research groups has reported any neutron-induced intensity decay. The reasons for any observed decrease in the intensity of standard reflections could be traced to chemical or biological phenomena unrelated to neutron irradiation. An example was provided by Mason and Nunes (1976). The standard reflections collected for triclinic lysozyme weakened by 30%, and the cause was traced to the growth of mould in the capillary. Data collection was resumed after the growth was arrested by a 5 min exposure to UV light, and the new data scaled very well with the original set.
5. Low-temperature Crystallography
Cooling the crystals to low temperatures can be done more easily in neutron experiments than in X-ray experiments, since the crystals can be completely enclosed in aluminium cans, which are practically transparent to thermal neutrons. Thus if the samples can survive low temperatures (and there are indications that some protein crystals can be brought to temperatures much lower than the freezing point of the mother liquor; Walter and Steigemann, 1981), a constant low temperature can be maintained using, for example, closed-cycle refrigerators. Such devices have found wide application in neutron crystallographic studies but, so far, not in the protein field. Low-temperature neutron data collection has been reported only for oxymyoglobin. In that investigation, the crystal was cooled to -5 °C in order to retard heme oxidation (Phillips and Schoenborn, 1981). The full potential of low-temperature neutron crystallography has clearly not been explored as yet.
6. Initial Orientation of a Crystal for Data Collection
An X-ray crystallographer usually has a good idea of the unit cell parameters and the indices of strong reflections long before a crystal is used for intensity data collection. Initial exploration is made easier by the availability of film techniques and the existence of devices such as precession cameras, which provide undistorted images of the reciprocal space. Since film techniques are seldom used in neutron crystallography, such preliminary data are not available. The orientation matrix can easily be found if the morphology of the crystal is known and if the faces can be indexed, but this is not always the case. Some guidance in finding strong reflections useful for crystal orientation can be provided by the X-ray structure amplitudes, since they are generally similar to their neutron counterparts. The scaling $R = \sum |F_X - F_N| / \sum F_N$ was 0.325 for the comparison of X-ray and neutron data extending to 2.0 Å (Wlodawer and Sjölin, 1982a). Large differences can be expected, however, for individual reflections, especially at low values of θ, and they can be very misleading. For example, reflection 0 0 3 is characteristically strong in the X-ray diffraction pattern of ribonuclease but very weak in the neutron data. The opposite is true for the reflection 0 0 2, and such differences can make the initial crystal alignment tricky. This is, however, more of a nuisance than a real problem, and the orientation matrix can be found by other methods, for example, by using stereographic projections of the angles of some low-order reflections.
III. INSTRUMENTATION FOR DATA COLLECTION
1. Four-Circle Diffractometers

The principal instrument used for collecting single-crystal neutron data has been, until recently, a four-circle diffractometer. A neutron version of the instrument does not differ much from the X-ray diffractometer, except that a different type of detector must be used. The principles of operation of four-circle diffractometers have been summarized in a book by Arndt and Willis (1966) and will not be repeated here. I assume that the reader is familiar with that technique and will mention only those features which are important to neutron protein crystallography.

The detectors used in neutron diffractometers are usually gas-filled ionization or proportional chambers. Two gases are of practical importance: $^{3}$He and BF$_3$. For the former, the reaction caused by thermal neutron capture is $^{3}\mathrm{He} + n \rightarrow {}^{3}\mathrm{H} + p + 0.76\ \mathrm{MeV}$. For the latter, the principal reaction is $^{10}\mathrm{B} + n \rightarrow {}^{7}\mathrm{Li} + \alpha + 2.3\ \mathrm{MeV} + \gamma\ (0.5\ \mathrm{MeV})$. In both types of counters, the charged particles ionize the gas, and the ionization can easily be detected using modern electronics. The efficiency of such counters is usually better than 50% at λ = 1.5 Å and can even approach 100% for some high-pressure designs.

Two types of scans are routinely used in four-circle diffractometry. In an ω scan, only the crystal is rotated, through an angle large enough that the whole of the rocking curve can be detected. In a 2θ/ω scan, the detector motion is coupled 2:1 with the crystal rotation. The advantage of a 2θ/ω scan is that it provides a better estimate of the reflection background, while an ω scan can be used if the reflections are difficult to separate. Both types of scans have been used in practice in neutron data collection. In the early work on myoglobin (Schoenborn, 1971), data were collected using a 2θ/ω scan mode and neutrons with a wavelength of λ = 1.6 Å, as well as an ω scan and λ = 1.2 Å. The typical ω scan was 1.3° wide and was covered in 11 steps. Counting time per reflection was 5 min. The average background was about 100 counts per min, with 35,000 counts per min recorded for the strongest measured reflection (1 0 −1).

Another example of data collected using four-circle diffractometry was provided by triclinic lysozyme (Bentley and Mason, 1980; 1981). In that case, data were measured using an ω scan to a resolution of 1.4 Å. Each reflection was measured in 31 steps, and the time spent on each peak was 1 min for the data extending to 1.85 Å and 3-5 min for the data extending beyond that resolution range.

A method that increased the speed of data collection was utilized for lysozyme and for carbonmonoxymyoglobin. Instead of collecting all of the high-resolution data, only those reflections which appeared, during a pre-scan, to be significantly stronger than the background were fully recorded.
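The pre-scan selection described above can be illustrated with a minimal sketch. It assumes Poisson counting statistics and equal counting times on peak and background, neither of which is stated in the text; the function name `is_observed` and the example counts are hypothetical.

```python
import math


def is_observed(peak_counts, background_counts, n_sigma=2.0):
    """Decide whether a quickly pre-scanned reflection deserves a full scan.

    Assumption (not from the text): Poisson statistics, so the variance of
    the net intensity is the sum of the peak and background counts.
    """
    net = peak_counts - background_counts
    sigma = math.sqrt(peak_counts + background_counts)
    return net > n_sigma * sigma


# A weak reflection passes a 2-sigma test but fails a 7-sigma test.
weak_peak, weak_background = 200, 100
print(is_observed(weak_peak, weak_background, n_sigma=2.0))
print(is_observed(weak_peak, weak_background, n_sigma=7.0))
```

Raising `n_sigma` in the highest-resolution shell trades completeness for time, which is the compromise the text describes.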
While all the reflections with I > 2σ were considered observed for a resolution of up to 2.0 Å for carbonmonoxymyoglobin, only those with I > 7σ were measured in the shell of 1.8 to 2.0 Å. Considerable time savings were possible since only 15% of the reflections passed that criterion. In the lysozyme data collection, a preliminary four-point scan was used for the reflections in a shell between 1.4 and 1.5 Å, and only the 44% of the reflections for which I > 2σ were measured. This improvement is possible since the weakest reflections do not contribute substantially to the Fourier summations and their absence does not unduly decrease the quality of the Fourier maps. (Nevertheless, the absence of "unobserved" reflections may sometimes lead to difficulties during refinement.)

Several methods of decreasing the effect of incoherent background have been described. In one of them, a very small aperture is placed in front of the detector. Such an aperture decreases the peak intensities only slightly, while substantially cutting down on the background level. The disadvantage of a small aperture is that the decrease of integrated intensities is a function of the data resolution. For lysozyme this effect, called the luminance function, changed smoothly with the scattering angle, was negligible for d-spacings greater than 3 Å, and increased to 40% at a d of 1.4 Å (Bentley et al., 1979). Even if the luminance correction is not applied properly, the only effect is an increase of the apparent temperature factors (Nunes and Norvell, 1976).

Another method of decreasing the component of the background which is caused by incoherent scattering and, at the same time, removing the contribution of the λ/2 wavelength, was discussed in detail by Nunes and Norvell (1976). In that approach an analyzer crystal is inserted between the sample and the detector. The analyzer crystal is usually made of the same material as the monochromator, which, in most cases, is pyrolytic graphite.
The analyzer angle is set such that only the neutrons of the desired wavelength λ, the nominal wavelength of the monochromator, are reflected. While in a typical scan the peak intensity is decreased by a factor of two, the background is lowered by a factor of five (Fig. 4), so that the peak-to-background ratio is improved by a factor of about 2.5. The fixed (and usually small) size of the analyzer aperture introduces the need for a luminance correction of the type discussed above. An analyzer was used for the carbonmonoxymyoglobin data collection (Norvell et al., 1975; Norvell and Schoenborn, 1976).

Even with all of the improvements discussed above, neutron data collection using a single
FIG. 4. Two myoglobin rocking curves taken with an analyzer (lower curve) and without an analyzer (upper curve). The solid lines are straight-line fits to the background. All measurements are on the same scale (Nunes and Norvell, 1976).
counter is slow and inefficient. Even on the highest flux instruments it was only possible to collect, at most, about 500 reflections per day (Bentley et al., 1979), and several months were necessary to complete a single data set. Since an increase in the flux incident on the sample was not possible, more efficient methods of data collection became necessary.

2. Photographic Measurements

One of the obvious ways of increasing the speed of data collection is by applying photographic methods. Such techniques have been utilized with a great degree of success in X-ray protein crystallography and were the most efficient means of data collection before the introduction of electronic area detectors (Arndt, 1968; Arndt and Wonacott, 1977; Xuong et al., 1978). Unfortunately, the detection of neutrons by photographic methods is, by necessity, indirect, since thermal neutrons do not interact with film emulsions. It is necessary to utilize a neutron-to-light converter in order to detect and record the neutron diffraction pattern.

A description of a practical system for photographic data collection from protein crystals has been presented by Hohlwein and Mason (1981). Oscillation films were taken using a flat-film camera equipped with a 13 × 18 cm cassette, which was placed at a distance of 10 cm from the crystal. The scintillator-film system consisted of a sheet of Kodak Regulix film sandwiched between two neutron-sensitive screens. Each screen consisted of a 0.4 mm layer of $^{6}$LiF in a plastic binder, and the reaction responsible for the neutron detection was $^{6}\mathrm{Li} + n \rightarrow {}^{4}\mathrm{He} + {}^{3}\mathrm{H} + 4.8\ \mathrm{MeV}$, followed by conversion in ZnS to $10^{4}$ photons (4400 Å). The absorption for each screen was 20% at 1 Å and 27% at 1.7 Å. A number of oscillation photographs were taken from a triclinic lysozyme crystal mounted around its b* axis. The volume of the crystal was 2 mm$^3$, and the exposure time was 3 hours (Fig. 5).

Data extending to 2.9, 2.3 and 1.5 Å were measured using incident neutrons with wavelengths of 1.53 or 2.42 Å. Integrated intensities were calculated by scanning the films on a rotating-drum, computer-controlled microdensitometer. The usual corrections, other than the absorption correction and the correction for nonorthogonal directions of the film and diffracted neutron beam, were applied. The data from three films were compared with the
FIG. 5. Neutron oscillation photograph of a 2 mm$^3$ lysozyme crystal. Wavelength λ = 1.53 Å, oscillation angle 3.3°, minimum d-spacing 2.9 Å, crystal-to-film distance 10 cm, exposure time 3 hrs. (Hohlwein and Mason, 1981. By permission of the International Union of Crystallography).
scattering amplitudes measured from a much larger crystal (20 mm$^3$) on a four-circle diffractometer. The weighted R factors ($R = \sum w_d |F_d^2 - F_f^2| / \sum w_d F_d^2$, where subscripts f and d refer to the film and diffractometer data, respectively) were in the range of 0.135-0.25 for the 20-47 reflections common to each of the films and to the diffractometer data set. The conclusion of the authors was that this discrepancy was consistent with the expected quality of the data and that the accuracy of the photographic measurements was about equal to that of the diffractometer. A more detailed judgement will be possible only if a more extensive film data set becomes available.
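As an illustration of the weighted R factor quoted above, the following minimal sketch compares film and diffractometer amplitudes for a few common reflections. The function name `weighted_r` and the numerical values are hypothetical and are not taken from Hohlwein and Mason (1981).

```python
def weighted_r(f_diffractometer, f_film, weights):
    """Weighted R factor on squared amplitudes:
    R = sum(w_d * |F_d^2 - F_f^2|) / sum(w_d * F_d^2).
    """
    numerator = 0.0
    denominator = 0.0
    for f_d, f_f, w_d in zip(f_diffractometer, f_film, weights):
        numerator += w_d * abs(f_d ** 2 - f_f ** 2)
        denominator += w_d * f_d ** 2
    return numerator / denominator


# Hypothetical amplitudes for three reflections common to both data sets.
f_d = [12.0, 30.0, 8.0]
f_f = [11.0, 32.0, 8.5]
w = [1.0, 1.0, 1.0]
print(round(weighted_r(f_d, f_f, w), 3))
```

With only 20-47 common reflections per film, as in the text, such a statistic has a large sampling uncertainty, which is consistent with the author's caution about drawing detailed conclusions.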
3. Linear Position-Sensitive Detectors

Two data collection systems employing linear position-sensitive detectors (PSDs) have been constructed and utilized for routine data collection. The first to be completed was a diffractometer at Brookhaven (Cain et al., 1976) and the second was a diffractometer at the National Bureau of Standards (Prince et al., 1978). Both instruments used the same type of detectors, which consisted of aluminum tubes 2.5 cm in diameter, filled with 6 atm of $^{3}$He and 4 atm of Ar, with 3% methane added for quenching. Both diffractometers utilized very similar electronics (Alberi et al., 1975). They differed considerably in the diffraction geometry and in the details of their construction.

The Brookhaven diffractometer consisted of three linear PSDs, each 40 cm long, connected in series. The long axes of the detectors were coplanar and angled such that the midpoint of each detector was tangent to the cone of diffraction for the level being collected. The distance from the sample to the midpoint of each detector, when set for the zero level, could be varied from 50 to 200 cm and was set to 86.2 cm for the data collection from a crystal of trypsin (Spencer and Kossiakoff, 1980). In that case, all the data from 0° to 48° could be collected without repositioning the detectors. The angular range of the detector system was electronically divided into 720 data channels, with each channel approximately corresponding to the positional resolution of the detectors.
Normal-beam diffraction geometry was utilized by this diffractometer. The crystal was positioned with its real axis parallel to the φ axis of the instrument, and a scan about this axis at χ = 0° brought all reflections from a particular level into diffracting positions. The detector assembly was moved up in order to collect upper level data, while the masks positioned in front of the detectors were moved down. The reflections from the inaccessible regions could be measured by resetting the crystal to χ = 90°. Splitting the detector into three segments reduced the layer-line curvature which would occur if upper-level cones of data were projected onto a single linear detector of the same total length. The opening of the front mask, however, had to be increased to accommodate the upper level reflections. This linear PSD system has been successfully used to collect trypsin data to 2.2 Å resolution (Kossiakoff and Spencer, 1980; 1981), but it has now been replaced by a system utilizing an area detector (see below).

The design of the diffractometer at the National Bureau of Standards differs considerably from the linear PSD system at Brookhaven. The principal difference is in the utilization of a different diffraction geometry, which necessitated the vertical rather than horizontal mounting of the detector. Only one of the Weissenberg methods, the flat-cone technique, assures that all reflections from a reciprocal level will lie exactly in a plane, rather than on a cone. For that method the crystal is mounted so that it may be rotated around an axis normal to a set of layers of the reciprocal lattice, usually one of the real crystallographic axes. In the example shown in Fig. 6, rotation is accomplished around the c axis. If this axis makes an
FIG. 6. Schematic representation of flat-cone geometry. The rotation axis in the figure is c, and the lth level is in the flat-cone position. (Prince et al., 1978. By permission of the International Union of Crystallography).
angle 90° + μ with the incident beam, where sin μ = λl/c, the center of the sphere of reflection will lie in one of the reciprocal layers, and as the crystal is rotated about the c axis by the angle ψ, the lattice points in that layer will intersect the reflection sphere on a great circle. If the detector lies in the plane parallel to the great circle passing through the center of the sample, it will intercept all the diffracted beams from that layer.

The flat-cone diffractometer constructed at NBS is a modification of a conventional four-circle instrument. A detector 1 m long is mounted vertically in a shield on the 2θ arm of the instrument (Fig. 7). The detector can be manually tilted by up to 30° from the vertical position. Only the four rotations of the four-circle diffractometer are utilized, and no translational motions are necessary for the data collection. The crystal is usually mounted with its real axis parallel to the φ axis of the diffractometer, and the ψ rotation is accomplished near χ = 90°. Since the φ and ω axes are not redundant at this setting (unlike at χ = 0°), the crystal does not have to be aligned exactly on the goniometer head, because the
FIG. 7. Flat-cone diffractometer installed at the National Bureau of Standards Reactor.
required rotation can be accomplished by a combination of χ, φ, and ω motions. The distance from the sample to the detector (defined with the detector vertical) can be adjusted from 40 cm to 200 cm. It was set at 95.6 cm for ribonuclease data collection (Wlodawer, 1980). The tilt angle of the detector is adjusted to minimize the parallax, usually in such a way that the diffracted beam is perpendicular to the detector at its mid-point. For the sample-to-detector distance mentioned above, the optimal detector tilt angle was 30° from the vertical direction.

This diffractometer can also be used in the standard four-circle mode, since the bottom part of the detector intersects the horizontal plane of the instrument. The change of mode can be made by simply disregarding all the data falling outside of a small number of channels spanning the horizontal plane. This option is advantageous for several reasons. First, the initial orientation matrix can be found and refined in a standard way. Second, the reflections falling in the blind region of the flat-cone geometry can be easily collected without any need for remounting the crystal. Third, selected reflections can be collected in both flat-cone and four-circle modes and used for testing the hardware and the software. In the initial experiment conducted on this instrument, the diffractometer was indeed used in the four-circle mode to collect ribonuclease data to 2.8 Å resolution (Wlodawer, 1980). Subsequently, data to 2.0 Å resolution were collected in the flat-cone mode (Sjölin and Wlodawer, 1981), and a final 2 Å data set was obtained by combining the data measured in both modes. In this way no reflections were lost in the blind region of the flat-cone diffractometer.

4. Area Detectors
The ability of area detectors to increase the speed of data collection, compared to the speed provided by single counters, was discussed over a decade ago (Arndt, 1968). Nevertheless, the first neutron diffractometer equipped with an area PSD only recently became available for routine data collection at Brookhaven. The detector used in that instrument was a 20 × 20 cm gas-filled, multi-wire counter, described by Alberi (1976). The details of the construction and the methods of operation of this diffractometer have not been published yet, but the first structure determinations using the data collected with this instrument have been completed (Phillips and Schoenborn, 1981; Kossiakoff, 1982). While the relatively small size of this detector causes the number of simultaneously observed reflections to be only slightly larger than the number observed with the linear detector it superseded (Cain et al., 1976), two sources of improvement are immediately obvious. The counting time can be varied between the low and the high 2θ ranges, while all reflections belonging to a single level are counted for the same time using a linear detector. Since the counting time has to be optimized for the weaker high-angle reflections, for low-angle reflections it is unnecessarily long with a linear PSD, and substantial savings are possible if an area PSD is used. In addition, since no masks are placed in front of an area PSD, the chances of incomplete integration of the measured intensities are minimized if proper integration procedures are followed (see below).

A large multiwire PSD is under construction at ILL (Thomas et al., 1981). The final detector will consist of eight segments. Each curved segment will contain 64 anode wires 2.5 mm apart and 16 cathode wires 5.0 mm apart. At 115 cm from the sample, this will correspond to a possible angular resolution of 0.125° and 0.250° and a total angular range of 64° × 4°.
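The angular figures quoted for the ILL detector follow directly from the wire spacings and the sample-to-detector distance. The short check below is only a sketch; the function name `wire_angle_deg` is hypothetical, and the total range is computed under the assumption that the eight segments are placed side by side along the anode direction.

```python
import math


def wire_angle_deg(spacing_mm, distance_mm):
    """Angle subtended at the sample by one wire spacing, in degrees."""
    return math.degrees(math.atan(spacing_mm / distance_mm))


distance_mm = 1150.0                        # 115 cm from the sample
anode = wire_angle_deg(2.5, distance_mm)    # anode wires 2.5 mm apart
cathode = wire_angle_deg(5.0, distance_mm)  # cathode wires 5.0 mm apart

# Assumed layout: 8 segments x 64 anode wires in one direction,
# 16 cathode wires in the other.
total_anode_range = 8 * 64 * anode
total_cathode_range = 16 * cathode
print(round(anode, 3), round(cathode, 3))
print(round(total_anode_range), round(total_cathode_range))
```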
The detector will be placed upright and thus will share some of the properties of the linear and area PSDs described above. A prototype consisting of one detector section has been tested by collecting a data set from a crystal of phthalocyanine (Stansfield, 1981). The resulting data were judged to be of high quality.

A different design of area PSDs utilizes television tubes as detection devices. While several designs have been described (Davidson, 1976; Arndt and Gilmore, 1976), such instruments have not been used for protein data collection. The interest in such designs may revive, though, since a commercial X-ray diffractometer utilizing a television detector was recently announced. Other area detector designs utilizing scintillation counters are also in the experimental stages.

5. Utilization of White Neutron Beams
All of the instruments described above were designed to utilize monochromatic neutron beams from steady-state sources (reactors). Monochromatization is very wasteful since less than 1% of the originally present neutrons will pass a typical crystal monochromator. A considerable increase in data collection efficiency could be accomplished if a much wider bandpass could be utilized. Neutrons are particularly well suited to Laue methods since their wavelengths can be measured by time-of-flight techniques. I will briefly discuss here such methods applicable to steady-state and pulsed neutron sources.

(a) Fourier chopper for a steady-state neutron source

The Laue diffraction pattern can be deconvoluted if both the intensities and wavelengths of neutrons can be determined for all reflections. The simplest way to accomplish that aim is by utilizing a chopper which changes the characteristics of the beam from steady-state to pulsed. In the usual time-of-flight technique, the arrival time of the slowest neutrons must precede the arrival time of the fastest neutrons belonging to the next bunch, and thus the duty cycle of a chopper must be short. (The duty cycle is defined as the average fraction of time during which
the neutrons reach the sample.) This results in serious losses of the transmitted beam intensity. If the intensity of a neutron beam is modulated by a sinusoidal function rather than split into discrete pulses, it is possible to achieve a duty cycle of 25%, leading to a substantial intensity gain over the more conventional time-of-flight analysis. The general design of a Fourier chopper suitable for single-crystal diffraction measurements was discussed by Nunes et al. (1971), and the preliminary results obtained with an instrument constructed at Brookhaven were summarized by Nunes (1975). It was shown that the simultaneous use of all neutrons with wavelengths of 0.75-3.0 Å could increase the rate of data collection by two orders of magnitude compared with the rate for a standard four-circle diffractometer. The results obtained using a myoglobin crystal showed that the deconvolution of the spectrum into the intensities of individual reflections was indeed possible and that the resulting intensities were in good agreement with the data collected using monochromatic radiation. The errors in the intensities determined with the Fourier chopper were generally larger than for the reflections measured using monochromatic radiation, and the weakest reflections were most strongly affected. Much of the difficulty was due to the incoherent scattering of a protein crystal. A large improvement in the efficiency of data collection could be achieved by substituting an area PSD for the single counter.

(b) Time-of-flight methods with pulsed neutron sources

Laue techniques are absolutely necessary for the efficient utilization of pulsed neutron sources such as pulsed reactors or spallation accelerators (Carpenter et al., 1979). While the peak neutron flux from a pulsed source may exceed the average flux available from a reactor, the duty cycle may be less than 0.1%.
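Time-of-flight wavelength analysis rests on the de Broglie relation λ = h/(m v), with the neutron velocity v obtained from the flight path and the arrival time. The sketch below is illustrative only: the 10 m flight path is a hypothetical value, not a parameter of any instrument mentioned in the text, and the function names are assumptions.

```python
PLANCK_J_S = 6.62607015e-34          # Planck constant, J s
NEUTRON_MASS_KG = 1.67492749804e-27  # neutron rest mass, kg


def tof_wavelength_angstrom(flight_time_s, path_m):
    """Neutron wavelength (angstrom) from flight time over a known path."""
    velocity = path_m / flight_time_s
    wavelength_m = PLANCK_J_S / (NEUTRON_MASS_KG * velocity)
    return wavelength_m * 1e10


def flight_time_s(wavelength_angstrom, path_m):
    """Inverse relation: arrival time for a given wavelength."""
    return wavelength_angstrom * 1e-10 * NEUTRON_MASS_KG * path_m / PLANCK_J_S


# Hypothetical 10 m flight path and the 0.75-3.0 angstrom band quoted above.
path = 10.0
t_fast = flight_time_s(0.75, path)
t_slow = flight_time_s(3.0, path)
print(round(t_fast * 1e3, 2), "ms to", round(t_slow * 1e3, 2), "ms")
```

The spread between `t_fast` and `t_slow` sets the minimum spacing between bunches if frames are not to overlap, which is why a conventional chopper must have a short duty cycle.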
On the other hand, since the neutrons are produced in bunches without the need of introducing choppers, the time-of-flight techniques are an obvious choice. A large time- and position-sensitive detector can bring the efficiency of data collection to the levels observed with the steady-state sources. The design of diffractometers equipped with such detectors was discussed by Bally et al. (1974), Peterson et al. (1980) and Schultz et al. (1982). Absorption corrections and scaling due to changes of the incident flux and the sample reflectivity with the wavelength will require careful attention when using the Laue technique. The efficiency of data collection will be lowered in anomalous scattering experiments, since, in that case, the reflections cannot be simply accepted at any wavelength, but only near the resonance. On the other hand, given sufficient neutron flux, it is in principle possible to measure each reflection at a large number of wavelengths, thus increasing the accuracy of the method. While no protein data have been collected as yet using such instruments, the importance of the time-of-flight techniques may increase in the future, when more pulsed sources become available.

IV. DATA REDUCTION

Most of the procedures for calculating neutron structure amplitudes from the raw data are identical to the equivalent methods of X-ray data reduction. There are, however, some differences which will be briefly discussed in this section.

1. Lorentz-polarization Correction
One of the differences between neutron and X-ray data is due to the fact that neutrons, unlike X-rays, are not polarized by diffraction from a crystal. This simplifies the Lorentz-polarization (Lp) correction by making the polarization component always equal to 1.0. The Lorentz factor depends only on the geometry of the instrument used for data collection and is identical for both methods.

2. Correction for λ/2 Component
The neutron beam from a monochromator crystal is usually contaminated by neutrons of the wavelength λ/2, since the diffracting conditions for the first order of the principal wavelength and the second order for the λ/2 wavelength are the same. This is quite serious with the pyrolytic graphite monochromators, which allow the half-wavelength intensity to be considerable. The contribution of the wavelength λ/2 can be lowered by the use of
appropriate filters, but this also seriously lowers the intensity of the main wavelength. An analyzer crystal (see above) lowers the intensity of the λ/2 neutrons compared to λ, but it cannot be used with the diffractometers equipped with position-sensitive detectors. It is also possible to correct the diffracted intensities if the contribution of λ/2 neutrons to the overall flux is known. This method has been used in several investigations. The λ/2 contamination can be conveniently measured using an inorganic crystal, such as KBr, which has many systematically absent reflections. The fraction f for the λ/2 component was found to be 0.063 at 1.58 Å at Brookhaven (Kossiakoff and Spencer, 1981) and 0.10 at 1.68 Å at ILL (Mason and Nunes, 1976). The fraction f is lower for the neutrons with short wavelengths (Fig. 1), but the choice of wavelength for a given experiment is more often based on the need to resolve the reflections than on other considerations. The intensity of the reflection hkl can be calculated from:

$$F^{2}(hkl)_{\mathrm{obs}} = F^{2}(hkl)_{\mathrm{true}} + \tfrac{1}{8}\, f\, F^{2}(2h, 2k, 2l) \qquad (1)$$
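Equation (1) can be applied to a reflection list in a few lines. The sketch below reads the equation as $F^2_{\mathrm{obs}} = F^2_{\mathrm{true}} + (f/8)\,F^2(2h,2k,2l)$ and, as a simplifying assumption not stated in the text, uses the observed value of the doubled-index reflection in place of its true value. The function name and the example intensities are hypothetical.

```python
def correct_half_wavelength(f2_obs, fraction):
    """Remove the lambda/2 contribution from squared amplitudes.

    f2_obs maps (h, k, l) to the observed F^2; fraction is the measured
    lambda/2 fraction f. Reflections whose doubled index was not measured
    are returned unchanged (an assumption of this sketch).
    """
    corrected = {}
    for (h, k, l), value in f2_obs.items():
        doubled = f2_obs.get((2 * h, 2 * k, 2 * l))
        if doubled is None:
            corrected[(h, k, l)] = value
        else:
            corrected[(h, k, l)] = value - fraction / 8.0 * doubled
    return corrected


# Hypothetical data; f = 0.10 is the ILL value quoted in the text.
observed = {(1, 0, 0): 100.0, (2, 0, 0): 80.0}
result = correct_half_wavelength(observed, 0.10)
print(result[(1, 0, 0)], result[(2, 0, 0)])
```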
The factor 8 in this expression is a result of the decrease of reflectivity with $\lambda^{3}$. Since the intensities of high-order reflections are falling off due to Wilson statistics, the contribution of the λ/2 components will not exceed a few percent for almost all reflections, and this correction will usually have to be applied only to the relatively low-resolution data. Disregarding this effect is unlikely to lead to significant errors, but the correction is so simple that there is no real justification for not applying it to the data.

3. Absorption Correction
The true absorption of neutrons by most types of atoms is very low, with the exception of some isotopes of cadmium, beryllium, and a few other elements. Incoherent scattering by hydrogens is the main cause of the decrease in the flux of neutrons passing through the sample, and this effect is mainly responsible for the "absorption" effects observed with protein crystals. Nevertheless, the total absorption coefficient is lower for neutrons than for X-rays. The absorption coefficient measured by transmission was μ = 2.4 cm$^{-1}$ for a carbonmonoxymyoglobin crystal (Norvell et al., 1975) and 3.3 cm$^{-1}$ for metmyoglobin (Schoenborn et al., 1970). These values are about a factor of four lower than the typical X-ray absorption coefficients for proteins. Therefore, equivalent absorption will be observed with crystals four times as large in the neutron measurements as in the X-ray experiments.

The methods used to apply the absorption correction to neutron data have not been discussed in detail in the reports of structure investigations. It may be assumed that the method of North et al. (1968) was most commonly used, since it has been the most useful in X-ray investigations using diffractometry. Santoro and Wlodawer (1980) developed a modification of this method, suitable for Weissenberg diffractometers such as the flat-cone diffractometer (Prince et al., 1978). In the method of Santoro and Wlodawer, two absorption curves are utilized, one for a low-angle reflection and another for a reflection found at a high value of 2θ. The absorption correction was not applied at all in some neutron investigations, particularly if the crystals used to measure the data were small (Kossiakoff and Spencer, 1981). Such an omission is unlikely to lead to any serious difficulties, due to the low value of the absorption coefficient.

4. Peak Integration
A variety of peak integration methods have been used in X-ray data collection and some of them have been adapted for the particular needs of neutron crystallography. The problems of properly integrating the reflections are much more severe in the neutron case, since the presence of considerable background makes peak-to-background ratios very low for a large number of reflections. Two closely related techniques applied to the processing of the output of linear detectors will be discussed here. Both of them aim at making the best estimates of the background and at defining the accurate centers, shapes, and sizes of each reflection. While similar in some
aspects, these techniques also have considerable differences and will be discussed separately.

Spencer and Kossiakoff (1980) developed an algorithm for processing the data collected using the Brookhaven diffractometer (equipped with three linear PSDs), described earlier. The raw data consisted of two-dimensional boxes, since each crystal rotation step was subdivided along the detector length. This method relies on the fact that the general shape of reflections collected in the normal-beam geometry is ellipsoidal, with the size and direction of the axes varying in different parts of the reciprocal space (Fig. 8). The aim of the technique is two-fold. The position of the center, the axial lengths, and the tilt angle have to be determined for each reflection, and the background has to be determined very precisely. These two objectives are accomplished in alternating steps.
[Figure 8 data array: X axis in detector channels, Y axis in crystal rotation steps; the individual counts are not recoverable from the extracted text.]
FIG. 8. Smoothed data array for a moderately intense reflection. The calculated elliptical peak boundary and principal axes are shown. The principal axes originate at the center of mass of the peak and are rotated by the tilt angle relative to the data array axes (X, Y). (Reprinted with permission from Spencer and Kossiakoff (1980) J. Appl. Cryst. 13, 563.)

The first step of the data processing consisted of smoothing the elements of the data array by the application of a parabolic function, which combines the intensity of each point with those of the nearest and second nearest neighbors. The weights given to each of these classes of points were 5, 2, and −1, respectively, and the sum was divided by 9 in order to bring the intensities back to the original scale.

The initial estimate of the background for each reflection was obtained from the analysis of the distribution of all intensities in the array. The data array histogram was truncated somewhat above the maximum expected background value, and a Gaussian was fitted to this part of the curve. The initial background was usually too large, since the contribution from the peak distribution was not yet completely eliminated.

This preliminary background was used to determine the location of the ellipse. The matched filter, calculated by combining nine neighboring elements into a shape approximating the ellipse in the expected orientation, was placed throughout the data array, and the net intensity was compared with the standard deviation of the background. The array of standard deviations (sigma array; Kabsch, 1977) was used to determine the origin of the ellipse by finding the center of mass of all the sigma array elements greater than half the
magnitude of the largest element. During the subsequent refinement of the size, shape, and position of the reflection ellipse, the center was recomputed as the center of mass of all the data array elements which were contained within the ellipse perimeter for this refinement cycle. The details of the calculation of the momental ellipse for each reflection will not be repeated here. The ellipse needed only to be scaled to its proper size to arrive at the reflection mask. The initial lengths of the ellipse axes for weak reflections were chosen from a short lookup table and then were refined to their final values. The size and tilt angle of the ellipse were strongly constrained for the weak reflections. The points falling outside of the ellipse were subsequently used to recalculate the reflection background, and the refinement process was repeated, up to a maximum of five cycles, until the new integrated intensity differed by less than 0.5σ from its previous estimate. The resulting individual backgrounds for each reflection were combined into an average background for the whole detector, and the final peak integration was usually accomplished using the averaged background values.

Two levels of data were processed by four separate methods as a test of this procedure. In method 1 the complete ellipse treatment, including background averaging, was used. Method 2 was a single-pass procedure that did not include averaging the backgrounds. In method 3 backgrounds were obtained and averaged as in method 1, but the final intensity was obtained by integrating the whole data array and subtracting the background. Finally, in method 4 backgrounds were obtained by adding the first and the last two rows of each data array, while the remaining rows were integrated to give the peaks. The resulting symmetry R factors are plotted as a function of intensity in Fig. 9. The improvement, particularly for the weak reflections, is very significant.
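The smoothing step of this procedure can be sketched as a small convolution. The sketch assumes that, in the two-dimensional array, the "nearest" neighbors are the four edge-sharing elements and the "second nearest" are the four diagonal ones; under that reading the weights sum to 5 + 4(2) − 4(1) = 9, which matches the division by 9 quoted in the text. Leaving the border elements unsmoothed is a further assumption of this sketch, and the function name `smooth` is hypothetical.

```python
def smooth(array):
    """Smooth a 2-D count array with weights 5 (center), 2 (edge-sharing
    neighbors) and -1 (diagonal neighbors), divided by 9.

    Border elements are copied unchanged (an assumption of this sketch).
    """
    rows, cols = len(array), len(array[0])
    out = [[float(v) for v in row] for row in array]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            nearest = (array[r - 1][c] + array[r + 1][c]
                       + array[r][c - 1] + array[r][c + 1])
            second = (array[r - 1][c - 1] + array[r - 1][c + 1]
                      + array[r + 1][c - 1] + array[r + 1][c + 1])
            out[r][c] = (5 * array[r][c] + 2 * nearest - second) / 9.0
    return out


# A flat background is left unchanged; an isolated spike is damped.
flat = [[18] * 4 for _ in range(4)]
spike = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
print(smooth(flat)[1][1], smooth(spike)[1][1])
```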
The procedure was a key to successful processing of the data collected from a relatively small (1.6 mm³) crystal of trypsin. These data were used to refine the structure at 2.2 Å resolution. This peak integration technique was recently used to process diffraction data collected with an area detector. In that case, the resulting three-dimensional reflection boxes were collapsed into two dimensions before processing (Kossiakoff, 1982). A related method of peak integration was introduced by Sjölin and Wlodawer (1981).
[Figure 9: plot of R_sym against intensity group.]
FIG. 9. R_sym versus intensity for data processed by the four methods described in the text. $R_{sym} = \sum_{hkl}\sum_{i} |I(hkl)_i - \langle I(hkl)\rangle| \,/\, \sum_{hkl}\sum_{i} I(hkl)_i$, where the inner summation is over all measurements of symmetry-equivalent reflections. Each intensity group contains the same reflections for all four methods. Intensity groups are based on method 2, as follows: group 1, 50-100 counts above background; 2, 100-150 counts; 3, 150-250 counts; 4, 250-350 counts; 5, 350-500 counts; 6, 500-1000 counts; 7, over 1000 counts. (△) Method 1; (○) method 2; (+) method 3; (□) method 4. (Reprinted with permission from Spencer and Kossiakoff (1980) J. Appl. Cryst. 13, 563.)
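The symmetry R factor defined in the caption of Fig. 9 can be illustrated with a short calculation. The following is only a minimal sketch: the function name `r_sym` and the input layout (a list of index/intensity pairs, already reduced to a common asymmetric-unit index) are assumptions made for illustration and are not taken from the original programs.

```python
from collections import defaultdict


def r_sym(measurements):
    """Symmetry R factor for a list of ((h, k, l), intensity) pairs.

    Assumption: symmetry-equivalent measurements already share the same
    (h, k, l) key. Reflections measured only once add nothing to the
    numerator but still count in the denominator, as in the written formula.
    """
    groups = defaultdict(list)
    for hkl, intensity in measurements:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        numerator += sum(abs(v - mean) for v in values)  # |I_i - <I>|
        denominator += sum(values)                        # sum of I_i
    return numerator / denominator


data = [((1, 0, 0), 90.0), ((1, 0, 0), 110.0), ((0, 1, 0), 200.0)]
print(r_sym(data))
```

Grouping the reflections into intensity bins before calling such a function would give one R_sym value per bin, which is the quantity plotted in Fig. 9.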
Their dynamic mask procedure differed from the method described above in two important aspects. First, it utilized all data falling outside of the reflection boxes in order to calculate the initial background curve ("universal background"). Second, the masks containing the peaks were not described by analytical functions but were general, and thus the method could be used with one-, two- and three-dimensional data arrays. The universal background curve was calculated while the data were being measured using the linear PSD. Since the lack of computer storage capacity precluded direct saving of all background points, the following procedure was implemented. Each frame of data was checked, and all data belonging to reflection boxes were removed and stored on the disk. Remaining points were used to update an array computed in the manner suggested by Xuong et al. (1978). Each universal background point along the detector was recalculated by adding a suitable fraction of the new value (usually f = 1/16) to the old value multiplied by (1 − f), and the values for the missing points were obtained by interpolation. When a complete reflection was collected, a polynomial was fitted by least squares to the part of the universal background in the vicinity of the reflection (usually 50 channels each way), and the resulting coefficients were stored together with the reflection. This procedure reduced the amount of background data from several hundred numbers to only three or four per reflection while preserving most of the information content. To calculate a mask for each reflection, the data are smoothed in a manner similar to the one discussed above, and a "statistical filter" (Kabsch, 1977) is applied to distinguish between the data belonging to the peak and the data belonging to the background. The procedures were modified to include the initial estimate of the background from the universal background data.
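The running update of the universal background described above amounts to an exponentially weighted average of successive frames. The sketch below shows one such update step. The function name, the use of `None` to mark channels that fall inside reflection boxes, and the choice to leave those channels unchanged (rather than interpolating them, as the original procedure did) are all simplifying assumptions.

```python
def update_universal_background(old, frame, f=1.0 / 16.0):
    """Blend one new detector frame into the universal background.

    old   : list of current background estimates, one per detector channel
    frame : list of new counts; None marks a channel inside a reflection box
    f     : fraction of the new value mixed in (1/16 is the usual value quoted)
    """
    updated = list(old)
    for channel, value in enumerate(frame):
        if value is not None:
            # new estimate = f * new value + (1 - f) * old value
            updated[channel] = f * value + (1.0 - f) * old[channel]
        # Assumption: masked channels keep their old value here; the
        # original procedure filled such points by interpolation.
    return updated


background = [80.0, 80.0, 80.0]
background = update_universal_background(background, [96.0, None, 64.0])
print(background)
```

With f = 1/16 a single noisy frame changes the estimate only slightly, so the curve follows slow drifts in the background without storing every background point.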
Once the background level and its variance are established, the background is subtracted from each point and a flag is set, depending on whether the net intensity exceeds the background by less than σ (0), between σ and 2σ (1), between 2σ and 3σ (2), or over 3σ (3). The mask can now be calculated from the contiguous part of the sigma array including all points flagged 1, 2 or 3. The remaining points can now be used to calculate a new background estimate, and the universal background and the local background are combined to provide initial data for the next cycle of the procedure. The convergence is usually reached in two to three cycles. The procedure described here will not provide accurate estimates of intensity for weaker reflections (I < 5σ(I)) that may not create a contiguous mask. In this case a mask calculated for a neighboring medium or strong reflection is applied (including the position of its center of gravity). This helps in avoiding the biasing of the weaker reflections, for which the intensities contained in the σ = 0 elements of the sigma array could make a substantial contribution to the integrated intensity. For this purpose a number of masks corresponding to different areas of reciprocal space are stored, and they are updated throughout data processing. The procedure used to calculate a dynamic mask made the assumption that the box under consideration contained one reflection only. This is not always the case, and while the presence of other reflections can be predicted from the orientation matrix of the crystal, such calculations may be cumbersome in practice. On the other hand, discrepancy of the universal and local background can alert the experimenter to such a possibility, since, if parts of other reflections are present in the area from which local background was computed, its value would be higher than that of the universal background.
In that case the offending parts of the local background can be flagged and removed from subsequent calculations (Fig. 10). The dynamic mask procedure provides information about the positions of the centers of gravity for all those reflections which were sufficiently strong to describe their own masks. This provides estimates of the errors in the setting angles, and a new orientation matrix can be recalculated (Wlodawer et al., 1982a). While the accuracy of the estimate of misalignment of each reflection is not high, this is compensated for by the large number of available corrections. It should also be noticed that this information is strictly a by-product of the normal intensity measurements and was obtained without prolonging the experiment. The dynamic mask procedure was used to process 2.0 Å diffraction data collected from a crystal of ribonuclease A (Sjölin and Wlodawer, 1981).
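The flagging step and the extraction of a contiguous mask can be sketched for a one-dimensional data array as follows. This is an illustration under stated assumptions, not the published algorithm: the function names are hypothetical, the interval boundaries are treated as half-open, and the mask is grown outward from the channel with the highest count.

```python
def sigma_flags(counts, background, sigma):
    """Flag each point 0-3 by how far its net count exceeds the background."""
    flags = []
    for count in counts:
        net = count - background
        if net < sigma:
            flags.append(0)      # below 1 sigma: treated as background
        elif net < 2 * sigma:
            flags.append(1)
        elif net < 3 * sigma:
            flags.append(2)
        else:
            flags.append(3)
    return flags


def contiguous_mask(flags, seed):
    """Return (start, stop) of the run of nonzero flags containing `seed`.

    `stop` is exclusive. Returns None when the seed itself is flagged 0,
    i.e. the reflection is too weak to define its own mask.
    """
    if flags[seed] == 0:
        return None
    start = seed
    while start > 0 and flags[start - 1] > 0:
        start -= 1
    stop = seed
    while stop < len(flags) - 1 and flags[stop + 1] > 0:
        stop += 1
    return start, stop + 1


counts = [70, 75, 95, 130, 160, 120, 85, 72, 100]
flags = sigma_flags(counts, background=75, sigma=10)
seed = counts.index(max(counts))          # assumption: grow from the maximum
print(flags, contiguous_mask(flags, seed))
```

In this example the isolated high point at the end of the array is flagged but left out of the mask, because it is not contiguous with the peak; this mirrors the way points belonging to a neighboring reflection are kept out of the integration.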
[Figure 10: printout of the raw counts in a reflection box, followed by the corresponding mask array with elements −1, 0 and 1.]
FIG. 10. A box containing a weak reflection in the presence of two much stronger reflections. Points marked −1 were considered, on the basis of the normal distribution of background intensities, to belong to the other reflections. Points marked 0 were used to calculate the background and those marked 1 were used for peak integration. (Sjölin and Wlodawer, 1981. By permission of the International Union of Crystallography.)
V. DETERMINATION OF AN INITIAL MODEL FOR A PROTEIN
The experimentally measured neutron structure amplitudes are, by themselves, insufficient for the determination of the crystal atomic coordinates for a protein. The phases of all reflections have to be determined before a useful Fourier synthesis can be performed. The method which found its widest application in X-ray protein crystallography was multiple isomorphous replacement (MIR). In this technique, atoms which are strong X-ray scatterers are attached to the protein molecules. Useful heavy atom derivatives are usually prepared by the diffusion of compounds containing one or more of the heavier atoms (usually metals such as Pt, Hg, Au, lanthanides, etc.) into protein crystals or by co-crystallization of the protein and the heavy atom reagents. Since the X-ray atomic form factors are proportional to the number of electrons, they are an order of magnitude larger for the heavy atoms than for the light atoms such as C, N and O, which are responsible for most of the X-ray scattering of a protein crystal. The magnitudes of neutron scattering lengths are much more uniform, and good "heavy atoms" are difficult to find.
1. Isomorphous Replacement
The only "heavy atom" isotope which is potentially useful for neutron diffraction analysis is ¹⁶⁴Dy. The coherent scattering amplitude of this isotope is 4.94 × 10⁻¹² cm, about a factor of 5 larger than that of nitrogen and 7.5 times that of carbon or oxygen (Table 1). Since the lanthanides have been used very successfully as derivatives in X-ray investigations, the chemistry of their binding to proteins has been extensively studied, and the preparation of suitable complexes may be possible. Unfortunately, no other similarly useful isotopes are known; therefore, this potential derivative could only be used as a single isomorph, providing more ambiguous information than the MIR method. Isomorphous replacement has not yet been utilized in practice, and it is not clear if it will be very useful in the future, since the same derivative could also be used for X-ray investigations, requiring much less effort for the data collection. The resulting X-ray model could be used later to initiate neutron studies, as will be shown below.
2. Anomalous Scattering
A resonance frequency is present for a number of stable isotopes in the wavelength range useful for neutron diffraction (λ = 0.6 to 2 Å). The anomalous scattering has two components: a real component which changes its sign at the resonance frequency and the imaginary component detectable only near the resonance. While the anomalous scattering is small for X-rays, it can be substantial for neutrons, and thus the potential of utilizing this technique for neutron studies is large. The most promising isotopes for this purpose are ¹¹³Cd (λ₀ = 0.678 Å, b_real = 0.725 × 10⁻¹² cm, b_imag = 4.507 × 10⁻¹² cm), ¹⁴⁹Sm (λ₀ = 0.915 Å, b_real = 0.795 × 10⁻¹² cm, b_imag = 6.051 × 10⁻¹² cm), and ¹⁵⁷Gd (λ₀ = 1.614 Å, b_real = 0.809 × 10⁻¹² cm, b_imag = 7.16 × 10⁻¹² cm). The real component of anomalous scattering can be used to simulate isomorphism by collecting diffraction data at different wavelengths. The imaginary component can be used independently for phase determination by measuring the intensities of both reflections belonging to each Bijvoet pair. A combination of both types of data can yield unambiguous estimates of the neutron phases within the accuracy of the measured structure amplitudes (Ramaseshan, 1964; Phillips et al., 1977).
A practical test of the usefulness of anomalous neutron scattering was reported by Schoenborn (1975). A derivative of metmyoglobin was prepared by diffusing ¹¹³Cd acetate into a protein crystal. Data for this derivative were collected at 0.8 and 1.25 Å on the high-wavelength side of the resonance. Although collecting the data on both sides of the anomalous frequency would have been advantageous, the low neutron flux and overlapping reflections made it impossible. Altogether 5000 stronger reflections in the 2.0 Å sphere were collected at each wavelength and yielded 1200 independent structure amplitudes. All reflections with counting errors of more than 2°41 were discarded.
The location of the major Cd site was determined from the Patterson map calculated with the coefficients (|F_P| − |F_PH|), and the parameters were refined using least squares methods. The attempts to utilize the real part of the anomalous scattering for breaking the phase ambiguity were not successful, since the differences were small for the chosen wavelengths. However, the results have shown that the chances of obtaining useful phase information would have been good for a more favorable isotope such as ¹⁵⁷Gd. Unfortunately, it was not possible to prepare a Gd derivative with sufficiently high substitution.
3. Utilization of an X-ray Model
The neutron diffraction studies of proteins are usually concerned with investigating the details not accessible to X-ray techniques rather than with the ab initio studies of the structure. Under these conditions, a model based on X-ray diffraction data is usually available and can be used to supply the initial estimates of phases needed to compute the neutron Fourier map. The initial assumption is that the coordinates determined by both methods should be the same, although the differences in sample preparation (such as deuteration) may lead to some variations. This initial assumption is clearly subject to modification at a later time. The X-ray phases were used directly with the neutron structure amplitudes to calculate nuclear scattering density maps for metmyoglobin (Schoenborn, 1969; Schoenborn et al., 1970), and the resulting maps were shown to contain information not present in the corresponding electron density maps. It was estimated that the X-ray phases should deviate from the true neutron phases by 40° on the average and thus could serve as the initial approximation. The starting phase sets used in all subsequent neutron studies were recalculated using the neutron scattering lengths and X-ray atomic coordinates. The initial phase set may be based on the positions of nonhydrogen atoms only, or may also utilize some or all of the computed positions of hydrogens. Schoenborn and Diamond (1976) calculated the average difference between two such phase sets as 30° at 2 Å resolution, and thus the phases calculated without the inclusion of hydrogen atoms were considered to be good enough to initiate refinement. Kossiakoff and Spencer (1981) calculated the initial map of trypsin on the basis of the model which excluded hydrogens and later added hydrogen and deuterium atoms in the positions observed in a series of difference Fourier maps. A similar procedure was followed by Phillips and Schoenborn (1981) for oxymyoglobin, except that even those hydrogens which were not directly observed in the difference maps, but which could be predicted on the basis of stereochemistry, were used in subsequent phase calculations. Bentley and Mason (1981) predicted the positions of all hydrogens on the basis of the X-ray coordinates of lysozyme, assuming the potentially exchangeable ones to be deuteriums. The refinement behaved poorly, but it was not clear whether this was due to the early inclusion of hydrogens or to other factors. In later refinement efforts, hydrogens were added more slowly, with apparently better success. Wlodawer and Sjölin (1982a) used the computed positions for all hydrogens in the initial model, even for those side chains which provided no stereochemical indications about their positions.
The refinement was successful, even for those amide deuterium atoms which had to be changed to hydrogen, due to their protection from exchange (Wlodawer and Sjölin, 1982b). It should be stressed that no undue faith should be put in the features of an initial nuclear scattering density map phased directly from an X-ray model. This was shown by Phillips and Schoenborn (1981), who could not determine the protonation state of His E7 in oxymyoglobin on the basis of not only the initial map, which did not include hydrogens in the phasing, but also a later one, in which partial information about hydrogen positions was utilized. Nevertheless, the protonation state of this side chain became quite clear after refinement, indicating that the initial ambiguity was due to phasing errors.
VI. METHODS OF NEUTRON STRUCTURE REFINEMENT
The methods used in the protein structure refinement with the neutron data are a natural outgrowth of the methods used in X-ray diffraction studies. They can be broadly classified as real-space, difference Fourier, and reciprocal-space techniques (Hendrickson and Konnert, 1981). All of these methods have been used in practice and will be discussed below. Two properties of neutron diffraction are very important for the refinement algorithms. One is the negative scattering of hydrogen, and the other is the approximate doubling of the number of refinable parameters when compared with the case of X-ray diffraction. The former factor is especially important for the difference Fourier methods but can also cause difficulties in reciprocal-space algorithms, as will be discussed below. The increase in the parameters is due to the inclusion of hydrogen coordinates and temperature factors necessary in neutron refinement but only seldom included in the refinement based on the X-ray data. Bentley and Mason (1981) showed an example for the crystals of triclinic lysozyme. In this case, the ratios of the number of intensity observations to the number of parameters (four per atom) are 1.7 (X-ray) and 0.9 (neutron) at 2 Å resolution. These ratios are increased to 2.6 and 1.3, respectively, at 1.7 Å and to 4.5 and 2.3 at 1.4 Å. The ratios quoted above are theoretical and are lowered further by the exclusion of unobserved reflections, more numerous in the neutron than in the X-ray data, as was discussed previously. The successful methods of neutron structure refinement clearly must be capable of dealing with the case of low observation-to-parameter ratios.
1. Real-Space Refinement
The real-space method of neutron structure refinement is based on the algorithm of Diamond (1971). In this method a map is calculated with the best current estimate of phases, and the atomic coordinates are adjusted by rotation around interatomic bonds for the best fit to that map. This method has built-in flexibility in that the map used for refinement can be based on observed structure amplitudes and either the multiple isomorphous replacement (MIR) or calculated phases, or alternatively, the map can be based on difference Fourier coefficients such as (2F_o − F_c)α_c or (3F_o − 2F_c)α_c. In the neutron case, MIR phases are not available, and so far, almost all of the neutron refinements used F_o maps, although both difference Fourier maps and maps based on a partial phase set (not including the contribution to the phases from the heme atoms) were used by Hanson and Schoenborn (1981). The tree structure of Diamond's program introduces an artifact to the refinement of branched chains, and this has to be kept in mind. Since all atoms positioned before a given rotational parameter are moved when the angle is changed, the two hydrogen rotor groups on the side chains of valine, threonine, leucine, and isoleucine could not be refined in the same cycle. The two groups had to be rotated in alternating cycles, and the tree structure had to be modified between the cycles. Incidentally, the same behavior is encountered in the use of the computer graphics program BILDER (Diamond, 1981), which relies upon the same data structure as the refinement program. The structure of metmyoglobin has been refined by Schoenborn and Diamond (1976) using the real-space technique. The real-space refinement program of Diamond (1971) was altered to permit positive and negative scattering densities.
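The difference Fourier coefficients mentioned above combine the observed and calculated amplitudes with the calculated phase. As a minimal sketch (the function name and its argument layout are assumptions for illustration, not part of any of the cited programs), one coefficient of the general form (nF_o − (n − 1)F_c) with phase α_c can be computed as follows.

```python
import cmath
import math


def map_coefficient(f_obs, f_calc, phase_calc_deg, n=2):
    """Complex coefficient (n*Fo - (n-1)*Fc) * exp(i * alpha_c).

    n=2 gives the (2Fo - Fc) coefficient and n=3 gives (3Fo - 2Fc);
    n=1 reduces to an ordinary Fo map with calculated phases.
    """
    amplitude = n * f_obs - (n - 1) * f_calc
    return amplitude * cmath.exp(1j * math.radians(phase_calc_deg))


# One reflection with |Fo| = 120, |Fc| = 100 and a calculated phase of 90 degrees
print(map_coefficient(120.0, 100.0, 90.0))
print(map_coefficient(120.0, 100.0, 0.0, n=3))
```

Summing such coefficients over all reflections in a Fourier synthesis yields the map to which the atomic coordinates are fitted. In the neutron case the resulting density can be negative at hydrogen positions, which is why the refinement program had to accept densities of both signs.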
The model used to initiate the refinement was based on the nonhydrogen atomic positions derived from Kendrew's model and neutron scattering amplitudes, and the hydrogens were added in stereochemically sensible positions. Hydrogen atoms at nonexchangeable positions and deuteriums at exchangeable sites were all assigned half weights in the initial stages of refinement. No water of hydration was included and only the isotropic temperature factor B = 14.5 Å² was applied. The crystallographic R factor was lowered from 0.46 to 0.32 in four cycles of refinement with the data extending to 2.0 Å resolution. Only the atoms with a weight larger than 0.25 were used in the structure factor calculation, and consequently, 280 atoms were omitted from this calculation in the last cycle. The map calculated after four refinement cycles showed the necessity of adjusting the GH corner and both terminals, as well as of several surface side chains. Of all the atoms omitted in the phasing calculation, over 200 appeared with higher weights but some had slightly altered positions. The refinement clearly led to a more interpretable map and to a better model. The structure of carbonmonoxymyoglobin was also refined using Diamond's procedure (Norvell and Schoenborn, 1976; Hanson and Schoenborn, 1981). This refinement utilized all data with I > 2σ(I) to 2.0 Å and I > 7σ(I) in the shell between 2.0 and 1.8 Å. The initial model was based on the X-ray structure of Watson (1969). Sixteen cycles of refinement were done in the work of Norvell and Schoenborn (1976): four on the polypeptide chain alone, four on the heme region, two on the whole protein, two more on the heme region, and a final two on the whole protein. The crystallographic R factor was lowered from 0.46 to 0.37 (with 313 atoms not used in the structure factor calculations). The r.m.s. deviation of the coordinates for the nonhydrogen atoms from the starting model was 0.65 Å.
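The crystallographic R factor quoted throughout this section measures the overall disagreement between observed and calculated structure amplitudes. A minimal sketch of the conventional definition, R = Σ||F_o| − k|F_c|| / Σ|F_o|, is given below; the function name and the optional scale factor `scale` are assumptions introduced for illustration.

```python
def r_factor(f_obs, f_calc, scale=1.0):
    """Conventional crystallographic R factor for paired amplitude lists.

    f_obs  : observed structure amplitudes |Fo|
    f_calc : calculated structure amplitudes |Fc|, in the same order
    scale  : factor k placing |Fc| on the scale of |Fo| (assumed known)
    """
    numerator = sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(f_obs)
    return numerator / denominator


print(r_factor([100.0, 50.0, 25.0], [90.0, 55.0, 30.0]))
```

Because atoms with low weights were left out of the structure factor calculation in the refinements described above, the |F_c| values entering such a sum depend on which atoms are included, and R values from different stages are only comparable with that in mind.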
The later refinement effort (Hanson and Schoenborn, 1981) was initiated from the same point as the first one, which was the rebuilt Watson's structure (0.39 Å r.m.s. change from the initial model for nonhydrogen atoms). This refinement utilized ten consecutive maps, with two to nine cycles of real-space refinement run on each map. The procedure required 36 refinement cycles altogether. In the refinement of the first four maps, the atomic radii were fixed and the observed weights refined, while in the last cycles of refinement these parameters were refined in alternate cycles. After maps 3 and 6, manual refitting of the model was performed to correct atoms that were off density or in collision with other parts of the molecule. Overall isotropic thermal parameters were utilized for the first seven maps. For the last three, individual temperature factors were introduced but in a highly restrained fashion. Residue averages were used for main-chain atoms, non-H/D side-chain atoms, and H/D side-chain atoms. These averages were further combined to make them change smoothly. Their final values were 11.4 Å², 12.5 Å², and 12.7 Å² respectively. Small deviations of the individual radii from their group averages were permitted in the last cycle. The r.m.s. shift from the starting non-H/D coordinates was 0.93 Å after the seventh map. The r.m.s. shift for H/D atoms was 1.2 Å. For the main-chain atoms, final coordinates differed by 0.5 Å r.m.s. from the independently refined X-ray structure of Takano (1977), and for the side-chain atoms, they differed by 1.1 Å. The largest deviations from Takano's model were found for the side chains of arginine, lysine, and glutamic acid. Differences in atomic positions larger than 2 Å were found in 21 side chains. The crystallographic R factor was 0.273 for the coordinates used to calculate the second difference Fourier solvent map. Other results of this refinement will be discussed later.
2. Difference Fourier Methods
A difference Fourier technique has been employed by Kossiakoff and Spencer (1980, 1981) in the refinement of the structure of trypsin. It was based on the automated constrained Fourier refinement method of Chambers and Stroud (1977b). The X-ray version of the program relies on the principle that an electron density map computed with the coefficients (F_o − F_c) contains positive peaks in positions where additional electron density should be added to the model and negative peaks in positions where it should be subtracted. The position of an atom can thus be improved by computing the density gradient in the difference Fourier map at the atomic center and moving the atom toward higher density along this gradient.
For an atom which is correctly positioned, the density at the atomic center gives an indication of the shift in the temperature factor (or occupancy) for that atom. The modification required for neutron refinement changes the meaning of the gradients for the hydrogen atoms by reversing the shift directions. The geometry of the model which results from each refinement cycle departs from ideality, since all atoms are allowed to move in an unrestrained manner. After each cycle of coordinate shifts, the structure is rebuilt to ideal bond lengths and angles and to a lower global energy using a program developed by Hermans (Hermans and McQueen, 1974; Ferro et al., 1980). The energy refinement plays an important role in the overall process, and a judicious choice of the energy parameters makes a significant difference to the progress of refinement. In particular, the nonbonded and torsional energy components are downweighted compared to the terms involved in the reidealization of the geometry in the initial cycles of refinement. This ensures that no large shifts would be performed on the starting model during the first stages of refinement. The weights of these terms are gradually increased later, thus eliminating the persistent high energy interactions. The difference Fourier procedure was shown to be very successful in the refinement of trypsin. An initial phasing model was based on the structure of Chambers and Stroud (1977a, b) and on neutron scattering lengths. An R factor of 0.304 was lowered to 0.255 by the addition of hydrogen and deuterium atoms and by the addition of solvent, derived by the evaluation of a series of five difference Fourier maps. This was followed by nine cycles of refinement and by reidealization, which lowered R to 0.187. The overall shift of all nonhydrogen atoms from the initial X-ray model was 0.21 Å. The r.m.s. deviation from the ideal values was 0.013 Å for the bond lengths and 2.6° for the bond angles.
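The reversal of shift directions for hydrogen can be made concrete with a one-dimensional sketch of a gradient-based shift. Everything below is illustrative: the function name, the central-difference gradient, and the simple `gain` factor are assumptions, and the published programs derive the shift magnitude in a more elaborate way.

```python
def gradient_shift(diff_density, index, spacing, scattering_length, gain=1.0):
    """Shift of an atom along one axis from the difference-map gradient.

    diff_density      : difference Fourier map sampled along the axis
    index             : grid point nearest the atomic center (not an end point)
    spacing           : grid spacing in angstroms
    scattering_length : coherent scattering length of the atom
    gain              : hypothetical factor converting gradient to distance
    """
    # Central-difference estimate of the density gradient at the atom
    gradient = (diff_density[index + 1] - diff_density[index - 1]) / (2.0 * spacing)
    # Positive scatterers move up the gradient; negative scatterers such as
    # hydrogen move down it, because their own density has the opposite sign.
    direction = 1.0 if scattering_length > 0 else -1.0
    return gain * direction * gradient


density = [0.0, 1.0, 3.0]
print(gradient_shift(density, 1, 0.5, 0.665))    # carbon, b = +0.665e-12 cm
print(gradient_shift(density, 1, 0.5, -0.374))   # hydrogen, b = -0.374e-12 cm
```

When a hydrogen sits close to its parent atom, the two opposite-sign contributions overlap in the map, which is one way to picture why the gradients at the parent atom become unreliable for large displacements.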
The convergence limits of this technique at 2.2 Å resolution were evaluated by extensive computer simulations. It was found that, under these conditions, atoms deviating from their proper positions by less than 0.3 Å were likely to refine in three cycles or less. It was also found that if an atom with hydrogens attached was displaced by over 0.6 Å from its correct position, the effect of the neighboring hydrogens rendered the gradient at the parent atom inaccurate. In general, the main difficulty in applying the difference Fourier refinement to the neutron data is caused by the interaction of opposite sign gradients.
Several ways of minimizing the problems encountered in the refinement have been suggested by Spencer and Kossiakoff (1981). The temperature factors of multiple hydrogens attached to a common parent atom were restrained to be similar (but not necessarily identical). The temperature factors of deuterium atoms were allowed to refine independently from those for parent atoms, even though an approach in which the temperature factors would be constrained and the occupancies allowed to vary was suggested as more satisfactory. Water molecules were refined as single atoms having scattering lengths approximately 1.5 times that of oxygen in order to eliminate the errors in the map which could be propagated from the nonnegligible scattering effect of the deuterium atoms. (The chosen scattering length for the water "atom" assured good behavior of the temperature factors.)
3. Reciprocal-Space Refinement
Three different reciprocal-space refinement algorithms have been used with the neutron data. All of them are modifications of the programs extensively tested in the X-ray refinement of proteins, and they required only minor code modifications for neutron applications. The structure of triclinic lysozyme (Bentley and Mason, 1981) has been refined using the technique of Agarwal (1978). This refinement method utilizes the fast Fourier transform algorithm, making it very efficient computationally. The radius of convergence estimated for this technique is about half of the resolution of the diffraction data, and the amount of computations is almost proportional to the size of the structure. No internal restraints are utilized in this refinement scheme, and therefore, it is necessary to idealize the geometry after each refinement cycle. Of several algorithms available for this purpose, the method of Dodson et al. (1976) was chosen in the lysozyme refinement. Only preliminary results of lysozyme refinement are available at this time. The starting model was based on the intermediate results of the X-ray structure refinement by Jensen, Hodsdon, and Sieker (1977, unpublished). The X-ray data were characterized by R = 0.25 at 2.0 Å, for a model including 170 water molecules. The hydrogens were added in stereochemically reasonable positions to initiate the neutron refinement. The convergence was poor and the difference Fourier maps calculated at 1.4 Å resolution failed to give clear indications of the reasons. In a more cautious approach to the refinement, all hydrogens and deuteriums were removed from the initial calculations. These atoms were reintroduced later at various stages of refinement, together with reinterpreted solvent positions. For a model including 623 hydrogen and deuterium atoms and 165 solvent positions, the R factor was 0.282 after thirty cycles of refinement. Only 5% of the nonhydrogen atoms failed to refine well.
One of the difficulties encountered in this refinement was caused by strong interaction between the parameters of neighbouring atoms in the least-squares normal matrix. 1he refinement program uses the block-diagonal matrix approximation, which only accounts for the interactions between x, y and - on the same atom. A particularly strong interaction occurs between hydrogens and their neighboring atoms, displacing hydrogens h o m the parent atoms. This effect is not due to the short bond lengths but rather to the negative scattering of hydrogen atoms, since it does not occur with the positively scattering deuterium atoms. A method utilizing the complete normal matrix was suggested as a means of overcoming this difficulty. The refinement of lysozyme is still under way (Bentley and Mason, 1981 ). The structure of oxymyoglobin (Phillips and Schoenborn, 1981) was refined using an algorithm of Jack and Levitt (1978). This refinement method is based on the simultaneous minimization of a realistic potential-energy function and a crystallographic residual. For this reason, no separate model-building program is necessary, since the structure is nol allowed to deviate seriously from ideality by appropriately weighted energy terms. The presence of restraints allows this method to be used even with low-resolution data, for which the number of refinable parameters exceeds the number of available neutron structure amplitudes. On the other hand, it is possible to disregard the energy terms for some refinement cycles, in order to speed up convergence and reduce computing costs. The neutron refinement of oxymyoglobin was initiated after an excellent X-ray model (R = 0.159) became available (Phillips, 1980). The initial model, including 6(1 ordered water molecules, yielded R = 0 . 3 5 for the neutron data. The addition of 40",, of lhe missing
hydrogen and deuterium atoms lowered R to 0.33, and seven cycles of refinement (including three on individual atomic thermal parameters) resulted in R = 0.188. A difference map calculated upon completion of refinement showed unambiguously that a deuterium is bound to the nitrogen NE2 of histidine E7 and participates in a hydrogen bond to the molecular oxygen, even though the initial difference Fourier maps were ambiguous.
A refinement method specifically designed to increase the ratio of observations to parameters was introduced by Wlodawer and Hendrickson (1982). In their joint refinement technique, both the X-ray and neutron data were utilized simultaneously in each refinement cycle. That approach is based on the observation that an atomic model for a crystal structure should be consistent with both the X-ray and the neutron diffraction data. Hence more accurate atomic parameters can be expected from a simultaneous refinement with the data from both kinds of radiation than from either separate refinement. Since the degree of overdetermination will be increased in a joint refinement, improved refinement behavior might also be expected. All of the data for such a procedure should, of course, be measured from essentially identical crystals. In particular, for macromolecules both the X-ray and the neutron data should be measured from identically deuterated crystals in equilibrium with the same mother liquor. Such a suggestion for a joint analysis of X-ray and neutron data from macromolecules was made by Hoppe (1976). The procedure that was adopted in practice required relatively minor conceptual modifications to the X-ray procedures for stereochemically restrained refinement (Konnert, 1976; Hendrickson and Konnert, 1980, 1981). This procedure introduced stereochemical and other prior knowledge about the structure into the least-squares minimization. These geometrical "observations" served as restraints on the atomic parameters.
There may be several qualitatively different kinds of observations. These include the structure-factor data, "ideal" bond lengths and angles, planarity of certain groups, chirality at asymmetric centers, nonbonded contacts, restricted torsion angles, noncrystallographic symmetry, and limitations on bond and angle fluctuation due to thermal motion (Konnert and Hendrickson, 1980). Thus the function to be minimized is in the form of
\[ \Phi = \sum_i \phi_i \tag{2} \]
where each of the separate observational functions, \(\phi_i\), is usually (but not always) in the form of
\[ \phi_i = \sum_j \frac{1}{\sigma_j^2} \left( f_j^{\mathrm{obs}} - f_j^{\mathrm{calc}}(\{x\}) \right)^2 \tag{3} \]
Here each term relates to a particular observational quantity \(f_j^{\mathrm{obs}}\) for which a corresponding theoretical value \(f_j^{\mathrm{calc}}\) can be calculated from the set of refinable parameters \(\{x\}\), or possibly some subset of these. Each term is weighted by the inverse of the estimated variance \(\sigma_j^2\) for the particular observation or, in the early stages, possibly by some other variance estimate. The joint refinement simply requires adding another term in (2). Thus we now have
\[ \Phi = \phi_{\mathrm{X\text{-}ray}} + \phi_{\mathrm{neutron}} + \phi_{\mathrm{bonds}} + \phi_{\mathrm{planes}} + \cdots \tag{4} \]
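The sum of variance-weighted terms described above can be illustrated with a deliberately small numerical sketch. It is not PROLSQ and does not compute structure factors: each kind of observation is reduced, as an assumption made purely for illustration, to a single direct "measurement" of one refinable parameter (a bond length d), and all names and numbers are hypothetical. What it shows is how adding a neutron term and a restraint term to the X-ray term shifts the minimum of the total function according to the assigned variances.

```python
import numpy as np

def term(f_obs, f_calc, sigma):
    """One observational function: sum_j (f_obs_j - f_calc_j)^2 / sigma_j^2."""
    f_obs, f_calc, sigma = (np.asarray(a, dtype=float)
                            for a in (f_obs, f_calc, sigma))
    return float(np.sum(((f_obs - f_calc) / sigma) ** 2))

def joint_target(x, data):
    """Total function: the sum of all observational terms for parameters x.

    `data` maps a label to (observed values, model function, sigmas).
    """
    return sum(term(obs, model(x), sigma)
               for obs, model, sigma in data.values())

# Toy model (hypothetical): every data type "observes" the bond length directly.
identity = lambda d: [d]
data = {
    "xray":    ([1.50], identity, [0.05]),
    "neutron": ([1.58], identity, [0.10]),
    "bonds":   ([1.53], identity, [0.02]),  # stereochemical restraint
}

# Crude one-dimensional search; real programs solve least-squares normal equations.
d_grid = np.linspace(1.40, 1.70, 3001)
phi = np.array([joint_target(d, data) for d in d_grid])
d_best = d_grid[np.argmin(phi)]
print(f"d minimizing the joint target: {d_best:.4f}")
```

In this toy case the minimum falls at the variance-weighted mean of the three values, which makes explicit why the relative weights of the X-ray and neutron terms decide which data set dominates the refined model.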
The major tasks involved in implementing the joint refinement procedure were those needed anyway for neutron refinement, namely the incorporation of hydrogen atoms. The changes to the actual refinement program, PROLSQ, that would permit simultaneous use of both X-ray and neutron data mainly involved setting appropriate switches for reading the respective data sets, calculating structure factors and derivatives based on the appropriate scattering factors, and including a separate scale factor refinement. Also, a provision for a special class of nonbonded contacts, those involving hydrogen atoms that participate in hydrogen bonds, was included. Changes to PROTIN, the program that prepares the restraint observations for particular protein structures, were more extensive. A variety of program modifications were needed to allow for hydrogen atoms. In addition, new standard groups that included hydrogen positions were compiled. All of the restraint dictionaries were also appropriately upgraded. Distances involving hydrogen positions were put into special weighting categories.
For several reasons the joint refinement was found to be particularly attractive for the work on the neutron structure analysis of ribonuclease A. First, no refined structure of this enzyme was available at the outset of the neutron investigation; thus refinement of the X-ray model was necessary. This was accomplished at 2.0 Å resolution using partially deuterated crystals treated in the same manner as the large crystals used for X-ray data collection (Wlodawer et al., 1982b). Second, neutron data were initially collected only to 2.8 Å resolution, and thus the ratio of observed intensities to the number of atomic parameters was very unfavorable. In addition, the hydrogens in the aliphatic side chains were not expected to exchange, and the total scattering length of a CH2 group was very close to zero. At a resolution as low as 2.8 Å, such groups did not provide any useful contribution to the refinement process. Third, it was decided that the joint refinement with the neutron and X-ray data might, in general, be useful, since at any resolution it doubles the number of diffraction data while the number of refinable parameters is not increased above that for a separate neutron refinement. This considerably increased the ratio of observations to parameters. The initial X-ray model was characterized by R = 0.159 for the data between 10 Å and 2.0 Å and by an r.m.s. deviation of bond lengths from ideality of 0.022 Å. Estimated deviations of the final atomic positions from true values (Luzzati, 1952) were 0.175 Å. The model included a phosphate molecule and 176 partially occupied water sites. The agreement of that model with the 2.8 Å set of neutron structure amplitudes was R = 0.32. This model neglected almost a third of the neutron scattering power, since hydrogens were not included; hence this comparatively high value was not surprising. Hydrogen atoms were appended to the model in stereochemically reasonable positions.
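The remark that a CH2 group scatters almost nothing can be checked by simple addition of coherent scattering lengths. The values below are standard tabulated figures rounded to two decimals (in fermi, 1 fermi = 10^-13 cm); they are supplied here as an assumption for the illustration and are not quoted from this review, and the helper name is hypothetical.

```python
# Coherent neutron scattering lengths in fermi (rounded tabulated values;
# an assumption for this sketch, not figures taken from the text).
B_FERMI = {"H": -3.74, "D": 6.67, "C": 6.65, "N": 9.36, "O": 5.80}

def group_scattering_length(formula):
    """Sum the scattering lengths of a group given as {element: count}."""
    return sum(B_FERMI[element] * count for element, count in formula.items())

for name, formula in [
    ("CH2", {"C": 1, "H": 2}),
    ("CH3", {"C": 1, "H": 3}),
    ("H2O", {"O": 1, "H": 2}),
    ("D2O", {"O": 1, "D": 2}),
]:
    print(f"{name}: {group_scattering_length(formula):+.2f} fermi")
```

At low resolution the carbon and its two hydrogens are not resolved, so it is this small net sum, rather than the individual atomic values, that determines how visible the group is in a nuclear density map.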
In the initial approximation, all hydrogens attached to oxygen or nitrogen were assigned as deuterium and those attached to carbon as hydrogen. The addition of hydrogen and deuterium atoms lowered R to 0.306, even though some of the atoms were added wrongly (for example, methyl, amino, and hydroxyl groups were oriented arbitrarily) and the exchangeability of the others was not predicted properly. At this stage the deuteriums were not added to the solvent molecules. This model was refined jointly with the neutron data to 2.8 Å spacings and the X-ray data to 2.0 Å spacings. A total of 2575 neutron observations and 7708 X-ray observations were included. The relative weights for the contribution of the neutron and X-ray structure amplitudes were chosen in such a way that the resulting model was much more influenced by the X-ray data. Thus the X-ray R-factor changed little, from 0.159 to 0.163, in fifteen cycles of refinement. At the same time, the neutron R-factor was lowered from 0.306 to 0.236. From the X-ray starting point, the r.m.s. shift in atomic positions was 0.31 Å overall and only 0.12 Å for nonhydrogen protein atoms (solvent excluded). The joint refinement was subsequently continued using both neutron and X-ray data extending to 2.0 Å resolution (Wlodawer and Sjölin, 1982a). The refinement was terminated when the X-ray and neutron R-factors stabilized at 0.159 and 0.183, respectively. It was necessary to make considerable modifications to the solvent structure, and only 128 water molecules remained in the final model. The refined structure provided new information about the orientation of such residues as histidine, glutamine and asparagine. These residues could not be unambiguously oriented on the basis of the X-ray data only. The results from the joint refinement of ribonuclease showed that the procedure was stable and converged within an acceptable number of cycles.
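The r.m.s. shifts used to compare models in this section are root-mean-square atomic displacements between two coordinate sets with matching atom order. A minimal sketch follows, assuming that definition; the coordinates and the atom mask are hypothetical and serve only to show how a subset such as the nonhydrogen protein atoms would be selected.

```python
import numpy as np

def rms_shift(coords_a, coords_b, mask=None):
    """Root-mean-square displacement between two matched coordinate sets (in Å)."""
    a = np.asarray(coords_a, dtype=float)
    b = np.asarray(coords_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("coordinate sets must list the same atoms in the same order")
    if mask is not None:
        keep = np.asarray(mask, dtype=bool)
        a, b = a[keep], b[keep]
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))

# Hypothetical starting and refined coordinates for three atoms.
start = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [2.0, 0.0, 0.0]]
refined = [[0.3, 0.0, 0.0], [1.0, 0.4, 0.0], [2.0, 0.0, 1.2]]
is_nonhydrogen = [True, True, False]  # third atom stands in for a hydrogen

print(f"all atoms:   {rms_shift(start, refined):.3f}")
print(f"nonhydrogen: {rms_shift(start, refined, is_nonhydrogen):.3f}")
```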
The inclusion of X-ray data prevented large shifts in the positions of nonhydrogen atoms, while geometrical restraints removed the tendency reported for lysozyme (Bentley and Mason, 1980) for bonds involving hydrogen to become too long. However, some of these properties could be expected from stereochemically restrained refinement with neutron data alone. Thus in the interest of comparison, separate neutron refinement was carried out, and as a control, separate X-ray refinements at lower resolution were also performed. The separate refinement with 2.8 Å neutron data was also well behaved. Starting from the same initial model as that used in joint refinement, the R-factor was lowered to 0.156 in 21 cycles. However, this was done at the cost of increasing the X-ray R-factor to 0.386. This
FIG. 11. Difference Fourier maps of ribonuclease calculated using the phases from the joint X-ray and neutron refinement, after subtracting the contribution of residues 69-71. Coordinates after the joint refinement are marked in solid lines, those after the separate refinement are dashed. (a) Electron density contoured at the 5σ level. (b) Nuclear density contoured at the ±5σ level. Positive contours are solid; negative, dashed. (Wlodawer and Hendrickson, 1982. By permission of the International Union of Crystallography.)
refinement was also accompanied by much larger positional shifts than the joint refinement. In the separate refinement, the r.m.s. shift from the starting X-ray model was 0.65 Å (0.37 Å for nonhydrogen, nonsolvent atoms). The largest atomic shifts were on the order of 1.5 Å. Not unexpectedly, these occurred in the side chains of lysines, glutamines, asparagines, arginines and valines. Figure 11 shows a region in which the results of both refinements were in substantial disagreement. It is clear that, while the electron density map for glutamine 69 and asparagine 71 is of very high quality, the nuclear scattering density map was much poorer, and this may explain the discrepancy. The lack of density near CG 69 is quite a common feature, since the scattering of a carbon is almost completely balanced by the contribution of the two attached hydrogens and the resolution of the data is 2.8 Å. The occurrence of excessive shifts in the separate neutron refinement is due more to the nature of neutron scattering than to the limited extent of the neutron data. This was shown by running X-ray refinement with data which had the same resolution or the same observation-to-parameter ratio as the neutron refinement. These refinements with limited X-ray data, first at 2.8 Å (2799 reflections) and later at 3.4 Å (1605 reflections), were much better behaved than the neutron refinement. The R-factor was lowered from 0.144 to 0.118 in
six cycles at 2.8 Å and from 0.141 to 0.071 in nine cycles at 3.4 Å. The r.m.s. shift from initial coordinates for the nonsolvent atoms was 0.13 Å in the 2.8 Å test and 0.21 Å in the 3.4 Å test. The r.m.s. difference between the 2.8 Å and 3.4 Å models was 0.16 Å. Only side-chain atoms from one residue, lysine 37, had shifts as large as 0.75 Å at 2.8 Å resolution and 1.1 Å at 3.4 Å resolution, but since there is no density for this residue in the difference Fourier maps at 2.0 Å, its position was always considered uncertain. Other maximum shifts at 2.8 Å did not exceed 0.35 Å, only a third of the size noticed for over a dozen side chains in the separate neutron refinement. Only a few 0.5 Å shifts were found in the 3.4 Å refinement. It cannot be completely ruled out, though, that the poorer quality of the neutron data contributed to the difficulties encountered in the separate neutron refinement. The results derived from the application of the joint refinement procedure to a protein structure at medium resolution show that such a procedure is less likely to lead to serious errors than the separate refinement with neutron data alone. Even with an initial model in which many hydrogens were not properly placed, the refinement converged rapidly, while the idealized geometry was preserved. The joint refinement, starting from a well refined X-ray model, achieved a substantial improvement in agreement with the neutron data without appreciable change in agreement with the X-ray data. In contrast, the dramatic decrease in the neutron R-value during refinement with the neutron data alone was accompanied by great deterioration in the match with the X-ray data and large, meaningless shifts in a number of side chains. This ill behavior of neutron refinement was not simply due to the poor observation-to-parameter ratio of the problem, since an X-ray refinement with a comparable ratio of diffraction data to variables was well behaved.
VII. RESULTS OF NEUTRON DIFFRACTION STUDIES OF PROTEINS
The structures of several proteins have been studied in detail using neutron diffraction. So far, highly refined structures have been obtained for trypsin at 2.2 Å resolution (Kossiakoff and Spencer, 1980, 1981), oxymyoglobin at 2.0 Å (with partial data to 1.5 Å; Phillips and Schoenborn, 1981), and ribonuclease A at 2.0 Å (Wlodawer and Sjölin, 1982a). The structures of carbonmonoxymyoglobin (Hanson and Schoenborn, 1981), metmyoglobin (Schoenborn and Diamond, 1976), and triclinic lysozyme (Bentley and Mason, 1981) have been reported at an intermediate stage of refinement (R > 0.25). In addition, neutron data have been collected for other proteins. The structure of a small protein, crambin, should be of particular interest, since the available neutron data extend to 1.2 Å (M. Teeter, 1982, personal communication) and an excellent X-ray starting model is available (Hendrickson and Teeter, 1981). In this section I will discuss the presently available results of the neutron diffraction studies of proteins. Since the aim of this review is to present the technique rather than to discuss the details of the protein structure, similar types of information obtained for different proteins will be grouped together.
1. Improvements to the Atomic Model of a Protein
Neutron diffraction data have been used to obtain information about the protein models, even though in some cases this information was not completely unique. This was particularly true for the early work on met- and carbonmonoxymyoglobin, since their refined X-ray structures were not yet available. The hydrogen/deuterium bonding scheme was analyzed on the basis of the neutron diffraction of metmyoglobin and was described for both the main chain (Schoenborn, 1972) and the side chains (Schoenborn, 1971). Similar information can be obtained indirectly from the X-ray model by considering the distances and angles between the potential hydrogen bond donors and acceptors, but the information provided by neutron diffraction is more reliable, since the positions of hydrogens (and/or deuteriums) can be visualized directly. The structure of carbonmonoxymyoglobin was also independently refined starting from an unrefined X-ray model (Hanson and Schoenborn, 1981) and then was compared with the refined X-ray structure of Takano (1977). The r.m.s. differences between the positions of nonhydrogen atoms in the neutron and X-ray structures were
0.85 Å. These differences were due to the different ligands and different crystallization conditions (particularly D2O and pH), as well as different refinements. The relative contribution of these factors to the overall difference between these models is not known. Much smaller differences between the overall X-ray and neutron structures were observed in the cases where the neutron refinement was initiated from a well-refined X-ray model. The trypsin structure based on neutron data extending to 2.2 Å resolution differs by 0.21 Å (r.m.s. displacement of nonhydrogen atoms) from an initial X-ray model derived at 1.5 Å (Kossiakoff and Spencer, 1981). However, the neutron data provided an indication of some changes which could not be based on the X-ray data alone. In particular, the positions of the oxygens and nitrogens in four asparagine and glutamine side-chain amides were switched, based on the information provided by the neutron difference Fourier maps (Fig. 12). It should be stressed that, while the scattering length of nitrogen is higher than that of oxygen, the main contribution comes from the two deuterium atoms bound to nitrogen. The resulting average scattering of an ND2 group is 3.5 times larger than that of a single oxygen and can be rather easily detected in a difference Fourier map. These differences are, however, not detectable by X-ray scattering at a comparable resolution.
FIG. 12. (a) Difference map of trypsin side chain Asn 34. The nitrogen and oxygen positions shown are those from the X-ray model. The difference density indicates that the orientation of the nitrogen and oxygen atoms is incorrect. (b) Difference map for Ser 139. The orientation of a deuterium atom can be seen. (Reprinted with permission from Kossiakoff, A. A. and Spencer, S. A. (1981) Biochemistry 20, 6462. Copyright (1982) American Chemical Society.)
The orientation of some side-chain amide groups was also modified during the neutron refinement of triclinic lysozyme (S. Mason, 1979, personal communication) and ribonuclease A. Orientation of the imidazole rings of the histidine side chains in ribonuclease was determined on the basis of neutron data at 2.8 Å resolution (Fig. 13; Wlodawer and Sjölin, 1981). Another important residue, the catalytically active lysine 41, was rebuilt based on the neutron difference Fourier maps (Fig. 14; Wlodawer and Sjölin, 1982a). In the X-ray maps no clear density was found to extend beyond CG of this side chain (Wlodawer et al., 1982a), and the nitrogen NZ was rather arbitrarily placed about 9 Å from the phosphorus present in the active site. A careful analysis of a neutron difference Fourier map, however, revealed a large ball of density in the proximity of the phosphate and suggested that the side chain be directed there. This was indeed possible, and both the X-ray and neutron maps calculated subsequently showed unambiguous densities for Lys 41, with its nitrogen at a hydrogen-bonding distance from the oxygen O2 of the phosphate. Thus the interpretation based on the assignment of ND3 to the initially observed high-density peak was justified. The positions of other lysine side chains in the ribonuclease model, which resulted from the joint X-ray/neutron refinement (Wlodawer and Sjölin, 1982a), also differ to some extent from those based on the X-ray data only (Wlodawer et al., 1982b; Borkakoti et al., 1982). This is probably due to the influence of the high scattering power of ND3 exerted in the neutron refinement, which compensates for the influence of the increased temperature factors along the chain. On the other hand, difference neutron density for a lysine is often noncontiguous, as discussed before.
While the positions of some hydrogen atoms (in groups such as amide, guanidinium, HA) can be determined rather precisely from the knowledge of the nonhydrogen structure, the
FIG. 13. Sections of difference Fourier maps for five side chains of ribonuclease A. The maps were calculated at 2.8 Å resolution, with coefficients (Fo - Fc). The atoms belonging to these side chains and the phosphate molecule were removed from the structure factor calculations. All maps were contoured at the same arbitrary level. Solid lines, positive density; dashed lines, negative density; zero contour is omitted. The low density around the CB atoms is caused by the negative scattering of the attached hydrogens not resolved at this resolution. (a) His 12; (b) His 48; (c) His 105; (d) His 119; (e) Tyr 25 (note that the deuterium DH lies below the contoured plane). (Wlodawer and Sjölin, 1981. By permission of the National Academy of Sciences, U.S.A.)
positions of other hydrogens are not known in advance. Some of them can be located in the difference Fourier maps, while the position of the others may be determined in the refinement process. The orientations of the rotor CH3 groups are notoriously difficult to obtain, since they are not fixed in space but are rotating. Nevertheless, they are usually found in preferred orientations, and some success in their location has been reported in the structure of trypsin (Fig. 15; Kossiakoff and Spencer, 1981). The only previous information of that nature was reported on the basis of the X-ray refinement of insulin at 1.2 Å (Sakabe et al., 1981a).
FIG. 14. Difference Fourier maps for Lys 41 in ribonuclease. All maps utilized 2.0 Å data and calculated phases, with the contribution of Lys 41 removed. Initial coordinates are marked with the thin lines, final coordinates are marked with thicker lines (and contain hydrogens). (a) Initial nuclear scattering density map; (b) Final nuclear scattering density map; (c) Initial electron density map; (d) Final electron density map.
[Figure 15: rotor plots for Ala 24 and Leu 108 CD2; horizontal axis, angle (degrees).]
FIG. 15. Methyl rotor plots for selected side chains in trypsin. Plots for individual rotors were obtained from difference maps with the parent atom removed. Methyl hydrogens were rotated around their rotor axes in 20° increments. At each interval, densities found at the hydrogen positions were summed. A range of 120° was covered. (Reprinted with permission from Kossiakoff, A. A. and Spencer, S. A. (1981) Biochemistry 20, 6462. Copyright (1982) American Chemical Society.)
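The rotor search summarized in the caption of Fig. 15 (rotate the three methyl hydrogens about the rotor axis in 20° steps over 120°, and sum the map density at the hydrogen positions at each step) can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the procedure of Kossiakoff and Spencer: the idealized geometry (C-H 1.09 Å, X-C-H 109.5°), the function names, and above all the analytic `toy_density` standing in for a real difference map are hypothetical.

```python
import numpy as np

def methyl_hydrogens(c_pos, x_pos, theta_deg, bond=1.09, xch_angle_deg=109.5):
    """Positions of three methyl hydrogens on carbon C bonded to atom X,
    rotated by theta_deg about the X->C rotor axis (idealized geometry)."""
    c = np.asarray(c_pos, dtype=float)
    x = np.asarray(x_pos, dtype=float)
    u = (c - x) / np.linalg.norm(c - x)           # unit rotor axis
    helper = np.array([1.0, 0.0, 0.0]) if abs(u[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
    e1 = np.cross(u, helper)
    e1 /= np.linalg.norm(e1)                      # first vector perpendicular to axis
    e2 = np.cross(u, e1)                          # second perpendicular vector
    alpha = np.radians(180.0 - xch_angle_deg)     # angle between C->H and the axis
    positions = []
    for k in range(3):
        t = np.radians(theta_deg + 120.0 * k)
        direction = np.cos(alpha) * u + np.sin(alpha) * (np.cos(t) * e1 + np.sin(t) * e2)
        positions.append(c + bond * direction)
    return np.array(positions)

def rotor_scan(density, c_pos, x_pos, step_deg=20, span_deg=120):
    """Sum the density at the three hydrogen positions for each rotor angle."""
    angles = np.arange(0, span_deg, step_deg)
    sums = np.array([sum(density(p) for p in methyl_hydrogens(c_pos, x_pos, a))
                     for a in angles])
    return angles, sums

# Hypothetical test case: "true" hydrogens at 40 degrees, represented by
# negative Gaussian troughs (ordinary hydrogen scatters negatively).
C, X = [0.0, 0.0, 1.53], [0.0, 0.0, 0.0]
true_h = methyl_hydrogens(C, X, 40.0)

def toy_density(p, width=0.4):
    d2 = np.sum((true_h - np.asarray(p, dtype=float)) ** 2, axis=1)
    return float(-np.sum(np.exp(-d2 / (2.0 * width ** 2))))

angles, sums = rotor_scan(toy_density, C, X)
best = angles[np.argmin(sums)]   # most negative sum marks the preferred orientation
print("preferred rotor angle (degrees):", best)
```

Only a 120° range needs to be scanned because the threefold symmetry of the methyl group makes orientations 120° apart equivalent. With a real map, `density` would be replaced by interpolation into the difference Fourier grid.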
2. Search for Individual Hydrogen Atoms
While the exact location of many hydrogen atoms may not be very important to our knowledge of the overall structure of a protein, the presence or absence of a single hydrogen atom can sometimes have crucial importance for the catalytic behavior of an enzyme or an oxygen carrier. One of the most important problems in that category was the mechanism of action of the serine proteases. All the enzymes belonging to this class contain three invariant active-site residues: histidine 57, aspartic acid 102 and serine 195. The mechanism by which these enzymes hydrolyze peptide bonds can be described as a general base catalyzed nucleophilic attack on the carbonyl carbon of the substrate by the hydroxyl oxygen of Ser 195. At the same time, the hydroxyl proton of the serine is transferred to the imidazole of His 57. The mechanistic question, unresolved before the neutron results became available, was whether His 57 was the actual chemical base in the hydrolysis reaction (intermediate 2, Fig. 16) or whether the histidine acted as an intermediary through which Asp 102 functioned as the base (intermediate 1, Fig. 16). The answer to this question could be obtained by finding out whether the proton is attached to the ND1 nitrogen of His 57 or to the OD2 oxygen of Asp 102 under conditions similar to those present during the catalytic reaction. A definitive answer has been provided by Kossiakoff and Spencer (1980, 1981). They investigated the structure of the monoisopropylphosphate-inhibited trypsin at pD = 6.2 (uncorrected pH-meter reading). The MIP group closely mimics the expected structure and electrostatic properties of a real substrate-enzyme intermediate, and thus the protonation of this complex corresponds to that of the real substrate in the most crucial stage of the hydrolysis reaction. The preferred location of the deuterium in question was investigated in three ways.
In the first method, this atom was omitted from the model and the full structure refined through one cycle of coordinate shifts and model reidealization, in order to eliminate any bias in the phases that might have been introduced by a previous placement of this atom in the model. As seen in Fig. 17a, the difference Fourier map shows a distinct preference for the deuterium to reside on the imidazole. In the second test method, the deuterium was placed on the ND1 nitrogen of His 57 and
FIG. 16. Two different locations proposed for proton H(1) in the tetrahedral intermediate structure of trypsin. Proton H(1) is attached either to the carboxyl group of Asp 102, or to the nitrogen ND1 of His 57. Proton H(2) is attached to NE2 of His 57. (Reprinted with permission from Kossiakoff, A. A. and Spencer, S. A. (1981) Biochemistry 20, 6462. Copyright (1982) American Chemical Society.)
refined in two cycles. This refinement was well behaved. An equivalent refinement with the deuterium placed on OD2 of Asp 102 was quite unstable, and the deuterium atom resided along a distinct gradient, giving a clear indication of misplacement. In the third method, difference Fourier maps were calculated using structure factors for the two alternative models. The map resulting from the model with the deuterium bound to ND1 of His 57 was featureless, while placement of deuterium on OD2 of Asp 102 resulted in an interpretable difference map shown in Fig. 17c. The results of these studies, taken together with the calculations showing that the protonation states of aspartate and histidine were not artifacts due to the substitution of a MIP group for a real tetrahedral intermediate, settled the question of the identity of the chemical base in the hydrolysis reaction. The role of the "distal" histidine E7 in oxymyoglobin has been investigated by neutron diffraction (Phillips and Schoenborn, 1981). The function of this residue was not clear, and the presence of a hydrogen bond between NE2 of His E7 and oxygen O2 of the O2 molecule bound to the porphyrin iron was not positively established. The refined structure of oxymyoglobin definitely revealed the presence of a deuterium bound to NE2 at pD = 8.4 (Fig. 18). The deuterium is clearly visible in both the (Fo - Fc) and (2Fo - Fc) maps, even though its presence was not unambiguously determined in the unrefined model. It is also clear that the histidine side chain is uncharged, since no deuterium is bound to the ND1 nitrogen. The geometry of the hydrogen bond between NE2 and O2 indicates a medium-strength hydrogen bond, with an NE2-O2 distance of 2.98 Å and an NE2-DE2...O2 angle of 157°. Such a bond could contribute several kcal per mole to the enthalpy of oxygen binding, and the position of His E7 might act as a means of control for oxygen affinity.
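The two quantities used above to characterize the hydrogen bond, the donor-acceptor distance and the angle at the deuterium, follow directly from three atomic positions. The sketch below shows the calculation; the function name and the coordinates are hypothetical and are not the oxymyoglobin coordinates.

```python
import numpy as np

def hbond_geometry(donor, hydrogen, acceptor):
    """Return (donor-acceptor distance in Å, donor-H...acceptor angle in degrees)."""
    d, h, a = (np.asarray(v, dtype=float) for v in (donor, hydrogen, acceptor))
    distance = float(np.linalg.norm(a - d))
    v1, v2 = d - h, a - h                      # vectors from H to donor and acceptor
    cos_angle = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    angle = float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))
    return distance, angle

# Hypothetical coordinates (Å): donor N, deuterium D, acceptor O.
n_pos = [0.0, 0.0, 0.0]
d_pos = [1.0, 0.0, 0.0]
o_pos = [2.0, 1.0, 0.0]
dist, angle = hbond_geometry(n_pos, d_pos, o_pos)
print(f"N...O = {dist:.2f} Å, N-D...O = {angle:.0f} degrees")
```

An angle near 180° corresponds to a linear bond; with X-ray coordinates alone the hydrogen position, and hence this angle, has to be inferred rather than measured.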
FIG. 17. (a) A difference map (Fo - Fc) calculated with only the deuterium H(1) between the His 57 and Asp 102 side chains of trypsin left out of the phases. (b) A (2Fo - Fc) difference Fourier map calculated with both H(1) and H(2) omitted from phases. (c) A difference map in which the deuterium was placed by stereochemistry on the atom OD2 of Asp 102. All three maps show clearly that the deuterium is bound to the imidazole of His 57. (Reprinted with permission from Kossiakoff, A. A. and Spencer, S. A. (1981) Biochemistry 20, 6462. Copyright (1982) American Chemical Society.)
FIG. 18. Stereo view of (Fo - Fc) neutron difference map of oxymyoglobin. The refined model is superimposed, showing His E7, FeO2, and part of the heme. Contours are ±0.35, 0.55, 0.75 Fermi per Å³, with negative ones shown as broken lines. A strong, positive peak indicates the presence of deuterium bonded to NE2. (Reprinted by permission from Nature 292, 81. Copyright © 1981 Macmillan Journals Limited.)
In contrast, the studies of carbonmonoxymyoglobin have shown no similar hydrogen bond (Hanson and Schoenborn, 1981). This investigation was conducted at pH = 5.7. The Fourier maps and the real-space refinement indicated that there was a lack of deuterium bound to NE2 of His E7. The neutron weight in the final real-space refinement against a map with no phase contribution from DE2 was only 0.2 Fermi units, as compared to the expected value for full occupancy of 3.3 Fermi units (half of the scattering length of the deuterium atom, since the features not included in the phasing model return with half weights). The result, however, was interpreted with some caution, since the estimated standard deviation of the weight of the DE2 contribution was 1.8 Fermi units. The distance from NE2 to the oxygen of the CO ligand was 2.7 Å, shorter than expected for the nonbonded N...O interaction.
3. Solvent Structure
One of the early promises of the neutron diffraction studies of proteins was that they could provide a better description of the ordered solvent surrounding the protein molecule than the X-ray methods could. This is mainly caused by the higher fraction of total scattering originating from the solvent in the neutron case, since each D2O molecule has three atoms with roughly similar scattering power, while only oxygen provides a significant contribution in the X-ray case. In addition, the nuclear scattering density of a water molecule should not be spherically symmetrical if the deuterium positions are stabilized by hydrogen bonds, and thus more information about water orientation can be obtained. The number of water molecules found to associate with a protein molecule varies widely in the published reports. Two high-resolution X-ray studies provide a good point of reference. Sakabe et al. (1981b) found about 150 water sites within 3.5 Å of an insulin molecule during the analysis of 1.2 Å data. Sixty of them were within hydrogen-bonding distances of the backbone amides or carbonyls. Using data extending to 1.2 Å, Watenpaugh et al. (1978) reported finding 127 water sites surrounding a molecule of rubredoxin. These sites were mostly 2.5-3.0 Å or 4.0-4.5 Å from the nearest oxygen or nitrogen atoms belonging to the protein. Three sites located over 8 Å away from the protein were considered to be artifacts. The first detailed report on the water structure in a protein, based on the analysis of neutron diffraction data, was presented by Schoenborn and Hanson (1980). Forty water molecules were located during the neutron refinement of carbonmonoxymyoglobin. The occupancy for all of these sites was higher than 0.4. By comparison, the number of waters found in the X-ray refinement of metmyoglobin (Takano, 1977) was 72, and only 25 water positions were common to both structures.
All of the water molecules found in the neutron study were hydrogen bonded either to the protein or (in a few cases) to other water molecules. An example of a water cluster that bridges two protein molecules is shown in Fig. 19. A better description of the solvent was expected from the higher resolution data, as well as from the data collected using different H2O/D2O mixtures. Both of these approaches are under way. The investigation of bound water was also extended to the structure of metmyoglobin (Raghavan and Schoenborn, 1981). The neutron data only provided a small amount of new information about the solvent structure of trypsin (Kossiakoff and Spencer, 1981). The 40 best-determined water molecules from the X-ray model were easily located in the initial neutron difference Fourier map. Since the oxygen positions of these water molecules were known, other difference Fourier maps were calculated by subtracting the contribution of these oxygen atoms from the phasing model. As expected, at a resolution of 2.2 Å, the resulting peaks were broad and asymmetric (see Fig. 20). Some information about the positions of deuterium atoms was present, however.
FIG. 19. Section of the neutron difference density map (Fo - Fc) of carbonmonoxymyoglobin. The calculated structure factors do not include any water molecules. Peaks in the center are interpreted as three water (D2O) molecules. On the right-hand side, the neutron density map (Fo) for the same section, with all atoms included in the phase calculation. [Reprinted with permission from Schoenborn, B. P. and Hanson, J. C. (1980). In: Water in Polymers (ed. S. P. Rowland), p. 215, ACS Symp. 127, American Chemical Society, Washington, DC.]
A slightly different approach to the investigation of solvent structure was taken by Wlodawer and Sjölin (1982a). Only those water molecules which refined well with both X-ray and neutron data in the joint refinement were kept in the final model. This reduced the number of water sites from 176 in the X-ray structure (Wlodawer et al., 1982b) to 128 in the joint model. Even though the number of solvent molecules decreased, the R-factors for the structure stayed approximately the same, thus suggesting that a number of noise peaks were assigned as waters in the initial X-ray studies. A comparison of the X-ray and neutron results can ultimately lead to establishing more realistic acceptance levels for the "solvent" observed in the X-ray diffraction experiments.
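The argument made at the start of this section, that a D2O molecule presents three atoms of comparable scattering power to neutrons but essentially only one to X-rays, can be illustrated with a rough calculation. The sketch below is not taken from the studies cited; the neutron scattering lengths are standard tabulated values assumed here, the X-ray scattering power is approximated by the electron count, and the function name `oxygen_share` is purely illustrative.

```python
# Scattering power per atom. The neutron values are coherent scattering
# lengths in femtometres (standard tabulated values, assumed here and not
# quoted in the text); the X-ray values are electron counts, which
# approximate the scattering in the forward direction.
NEUTRON_B_FM = {"O": 5.80, "D": 6.67}
XRAY_ELECTRONS = {"O": 8.0, "D": 1.0}


def oxygen_share(scattering_power):
    """Fraction of the summed scattering amplitude of one D2O due to oxygen."""
    oxygen = scattering_power["O"]
    deuterium = scattering_power["D"]
    return oxygen / (oxygen + 2.0 * deuterium)


neutron_share = oxygen_share(NEUTRON_B_FM)
xray_share = oxygen_share(XRAY_ELECTRONS)

print(f"oxygen share of D2O scattering, neutrons: {neutron_share:.2f}")
print(f"oxygen share of D2O scattering, X-rays:   {xray_share:.2f}")
```

Under these assumptions oxygen accounts for roughly a third of the summed amplitude in the neutron case and for four fifths in the X-ray case. Summing amplitudes is a deliberately crude measure, but it is enough to show why the deuterium positions become visible only with neutrons.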
4. Hydrogen Exchange
The structure of a protein revealed by the techniques of X-ray or neutron diffraction is averaged over the time period of data collection. However, some information about the dynamic states of the molecules is also accessible. The studies of the amide hydrogen exchange can provide information about long-range flexibility in the different regions of the molecule. This technique is based on the observation that the exchange of either deuterium or
FIG. 20. Orientation of a D2O molecule in trypsin. Difference density with the X-ray-determined position for the oxygen atom subtracted. Resulting density is due to scattering of the two deuteriums alone. (Reprinted with permission from Kossiakoff, A. A. and Spencer, S. A. (1981) Biochemistry 20, 6462. Copyright (1982) American Chemical Society.)
tritium for amide hydrogens can pinpoint those areas which are both accessible to the solvent and sufficiently flexible for the process to take place (Hvidt and Linderstrøm-Lang, 1954). While the techniques most commonly used to monitor hydrogen exchange have been radioactive labeling and NMR spectroscopy, neutron diffraction has recently been shown to provide a unique tool that monitors the protection of specific amide protons. Neutron diffraction measurements are less susceptible to errors than the other methods used for this purpose: the NMR results are strongly dependent on the proper assignment of the resonances (Wüthrich and Wagner, 1979; Richarz et al., 1979), while chemical modification can be affected by problems encountered while the polypeptide is denatured during the proteolysis and chromatography steps (Rosa and Richards, 1979, 1981). If such steps are not performed, the chemical techniques can only yield overall information about the protein molecule and cannot yield the assignment of individual exchange rates (Schreier and Baldwin, 1976).
The studies of hydrogen exchange have not provided the principal motivation for carrying out neutron diffraction investigations of proteins but rather were a by-product of the long soaking in D2O, required in order to reduce the level of incoherent background. The time of soaking (a few months to a year) usually falls between the exchange periods for the unprotected and protected protons. Thus neutron diffraction does not yield kinetic data for hydrogen exchange but rather a "snapshot", which, however, is sufficient to indicate the less flexible regions of the protein molecule. The results of the neutron diffraction studies of hydrogen exchange recently became available for two proteins: ribonuclease A (Wlodawer and Sjölin, 1982b) and MIP-inhibited trypsin (Kossiakoff, 1982).
The coordinates obtained in a joint refinement of ribonuclease (Wlodawer and Sjölin, 1982a) formed the basis of the subsequent hydrogen exchange studies. The degree of exchange of amide hydrogens was calculated in a refinement with the neutron data only. In this refinement the positional and thermal parameters for all atoms were kept constant, while the occupancies were allowed to vary. The refinement was thus constrained for all parameters other than the occupancies and was free for the occupancies. All amide hydrogens were formally treated as deuteriums, with an occupancy of 1.0 corresponding to a deuteron and -0.55 to a proton, as indicated by their respective scattering lengths. The number of parameters in the occupancy refinement was equal to the number of atoms, namely 2360, while the number of observations was 4132. Since the overdetermination by a factor of 1.75 might not have been sufficient in the presence of substantial errors, it was necessary to evaluate the level of confidence in the results. The tests consisted of refining the occupancies of all atoms in the structure, without setting any arbitrary upper or lower limits, and monitoring the behavior of all atoms belonging to the main chain. This refinement clearly did not correspond to a physically meaningful situation, and the resulting occupancies of amide hydrogens were not correct. Nevertheless, the r.m.s. discrepancies of
the occupancies for other main-chain atoms provided a good estimate of the errors associated with the individual estimates of occupancies. Two starting models were used: one with all amide hydrogens assumed to be fully exchanged for deuterium and one in which they were unexchanged. In the first case, the final average occupancies and their r.m.s. deviations for N, CA, HA, O and D were 1.0 (0.17), 1.08 (0.21), 1.05 (0.20), 1.06 (0.21) and 0.72 (0.41), respectively. The results of the refinement starting from unexchanged amide hydrogens were 1.14 (0.19), 1.05 (0.23), 1.07 (0.23), 1.06 (0.23) and 0.31 (0.40). The results of this test showed that the r.m.s. deviations from the average occupancies for all the main-chain atoms other than amide hydrogens were about 0.21, and this number could be taken as an estimate of the errors involved in the measurement. The hydrogens HA behaved similarly to the nonhydrogen atoms, showing that the deviations for nonexchangeable hydrogens are the same as those for other main-chain atoms. On the other hand, the r.m.s. deviations for amide hydrogens were twice as large, due to a combination of the statistical errors and a real contribution of exchange which caused the occupancies to vary. Another interesting effect was the difference between the average occupancies of amide nitrogens and hydrogens in these two refinements, while the occupancies of other atoms stayed the same. This effect was due to an error caused by assigning individual occupancies (using diffraction data extending to 2 Å only) to atoms separated by a distance of about 1 Å. When the total occupancy for an N-H pair changed during the refinement, the occupancy of either of these atoms was modified, even though the increase in the occupancy of nitrogen had no physical meaning. To prevent this effect from biasing the final results, occupancies were subsequently limited to a maximum of 1.0 for all atoms other than amide hydrogens.
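The figures quoted in this test can be checked with a few lines of arithmetic. The sketch below takes the mean occupancies and r.m.s. deviations given above for the two starting models and pools the deviations by a simple unweighted mean; that pooling, and the names used, are assumptions made for illustration and not necessarily the procedure followed in the original work.

```python
from statistics import mean

# (mean occupancy, r.m.s. deviation) per main-chain atom type, as quoted in
# the text for the two test refinements.
START_DEUTERATED = {
    "N": (1.00, 0.17), "CA": (1.08, 0.21), "HA": (1.05, 0.20),
    "O": (1.06, 0.21), "D": (0.72, 0.41),
}
START_PROTONATED = {
    "N": (1.14, 0.19), "CA": (1.05, 0.23), "HA": (1.07, 0.23),
    "O": (1.06, 0.23), "D": (0.31, 0.40),
}


def pooled_rms(runs, atom_types):
    """Unweighted mean of the r.m.s. deviations (an illustrative assumption)."""
    return mean(run[atom][1] for run in runs for atom in atom_types)


runs = [START_DEUTERATED, START_PROTONATED]

# Atoms whose true occupancy is 1.0: their scatter estimates the error.
reference_error = pooled_rms(runs, ["N", "CA", "HA", "O"])
# Amide hydrogens: statistical error plus a real spread caused by exchange.
amide_error = pooled_rms(runs, ["D"])

# Observations per refined parameter in the occupancy refinement.
overdetermination = 4132 / 2360

print(f"error estimate from non-exchanging atoms: {reference_error:.2f}")
print(f"amide scatter relative to that estimate:  {amide_error / reference_error:.1f}x")
print(f"overdetermination: {overdetermination:.2f}")
```

With these inputs the pooled deviation of the non-exchanging atoms comes out close to 0.21 and the amide hydrogens scatter roughly twice as much, in line with the values stated in the text.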
The departure of mean occupancies from 1.0 indicated an error in the original estimate of neutron temperature factors and/or of the scale factor for the neutron data. The overall temperature factor was underestimated due to an overwhelming influence exerted on this parameter by the X-ray data used in the joint refinement. The derived mean value of the occupancy of nonhydrogen atoms was used to convert the final calculated occupancies of amide hydrogens. An attempt to refine the occupancies of amide hydrogens only (while keeping all other occupancies at 1.0) failed, since the structure at this stage of refinement was still far from perfect, and the variation of occupancies for all atoms provided a convenient "error sink". In the absence of such flexibility, all errors were concentrated in amide hydrogen occupancies and this made their estimates useless.
The results of the occupancy refinement (Fig. 21) showed 12 amide hydrogens to be completely protected from exchange, 6 to be exchanged 25%, and 13 approximately half exchanged. Altogether about a quarter of the amide hydrogens were not more than half exchanged after approximately one year of exposure to deuterated solvent. Not surprisingly, amide hydrogens corresponding to nonpolar amino acids were the most highly protected, and half or more of the methionines, valines, isoleucines, and leucines were protected. The distribution of protected hydrogens in the three-dimensional structure of ribonuclease is shown in Fig. 22. A vast majority (26 out of 31) of the protected amide hydrogens form hydrogen bonds to main-chain oxygen atoms in either helical or β-sheet secondary structures. However, the participation in such bonds is not a guarantee of protection, since almost two-thirds of the hydrogens involved in the main-chain hydrogen bonding network are fully exchanged. Of the remaining protected hydrogens, two are involved in unambiguous hydrogen bonds to the side chain groups.
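The relation between a refined occupancy and the degree of exchange follows directly from the convention described above, in which a site modelled as deuterium has an occupancy of 1.0 for a deuteron and -0.55 for a proton. The sketch below interpolates linearly between these two limits. Dividing by the mean occupancy of nonhydrogen atoms is one plausible reading of the conversion mentioned in the text and is an assumption, as are the function and parameter names.

```python
D_OCCUPANCY = 1.0    # pure deuteron at a site modelled as deuterium
H_OCCUPANCY = -0.55  # pure proton at the same site (value quoted in the text)


def deuterium_fraction(occupancy, mean_heavy_atom_occupancy=1.0):
    """Fraction of deuterium at an amide site, from its refined occupancy.

    The rescaling by the mean occupancy of nonhydrogen atoms is an assumed
    form of the correction mentioned in the text. Values outside the range
    0 to 1 are returned unchanged, since they indicate refinement error.
    """
    scaled = occupancy / mean_heavy_atom_occupancy
    return (scaled - H_OCCUPANCY) / (D_OCCUPANCY - H_OCCUPANCY)


def fraction_error(occupancy_error):
    """Propagate an occupancy error into an error in the deuterium fraction."""
    return occupancy_error / (D_OCCUPANCY - H_OCCUPANCY)


for occupancy in (1.0, 0.225, -0.55):
    print(occupancy, round(deuterium_fraction(occupancy), 3))

print("error in fraction for an occupancy error of 0.21:",
      round(fraction_error(0.21), 3))
```

On this scale an occupancy of about 0.22 corresponds to a half-exchanged site, and an occupancy error of 0.21 translates into an uncertainty of about 0.14 in the exchanged fraction, which is why only coarse levels of protection can be distinguished.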
Three hydrogens (18, 67 and 89) appear protected despite their location on the surface and hydrogen bonding to D2O molecules. While the amide of Ser 18 is also involved in a lattice contact, the other two are not. All three were found to be half exchanged, and their apparent protection may reflect the errors in refinement. The distribution of protected amides in the ribonuclease structure is not uniform. Of the two distinct areas surrounding the cleft near the active site, one appears to be much more flexible than the other. In particular, the part of the β-sheet containing residues 73-75, 79, 104, 106-109, 116 and 118 appears to be much more highly protected from exchange than any other region of the protein. Only the central part of the β-sheet found in the other half of the molecule contains protected amides (46, 82, 100). Each of the helices contains two or three
[Plot: occupancy of amide hydrogens (vertical axis, D at the top and H at the bottom) against residue number (horizontal axis, 10 to 120).]
FIG. 21. Plot of occupancies of amide hydrogens in ribonuclease as a function of residue number. Atoms which are predominantly of deuterium character are plotted above the central line; hydrogens are plotted below it. Every tenth residue is marked by a dot; arrows point to prolines, which have no amide hydrogens. Scatter of occupancies beyond 1.0 is an indication of errors in their determination (Wlodawer and Sjölin, 1982b).
FIG. 22. Stereo plot of the positions of all amide nitrogens in the ribonuclease A molecule, as well as of those amide hydrogens which were found to be at least half protected against exchange (Wlodawer and Sjölin, 1982b).
protected amides, but three amides in a row are protected only in the first helix. These amides (11-13) are located at the carboxyl end of the helix, near the active site cleft. An analysis of nitrogen temperature factors and of the degree of protection of corresponding hydrogens shows a possible correlation for the most highly protected amides only. The mean temperature factor for the nitrogen atoms with completely protected hydrogens was 7.2 (r.m.s. deviation 2.04), while for unprotected amides it was 10.75 (3.43). The temperature factors for partially protected amides were, however, not significantly different from unprotected ones, so the degree of protection cannot be described as a simple function of the local vibrational states of the molecules.
The results of a neutron diffraction investigation of hydrogen exchange in ribonuclease A showed some similarities to the results obtained by chemical exchange in ribonuclease S (Schreier and Baldwin, 1976; Rosa and Richards, 1979, 1981), but also some differences. Important features of the neutron work were the possibility of estimating the errors associated with the measurements of individual occupancies and the reduction in the probability of introducing systematic errors. The neutron method is more straightforward than chemical analysis and does not suffer from the artifactual lack of data for the amides involved in proteolytic cleavage. Nevertheless, the confidence which can be placed in the protection of any particular peptide should not be exaggerated, in view of the relatively large r.m.s. deviation of calculated occupancies. It will be interesting to compare the neutron and NMR results when both of these techniques are applied to the same protein.
The hydrogen exchange experiments were carried out on trypsin by Kossiakoff (1982), using diffraction data extending to 1.8 Å resolution.
The amide hydrogens were assigned as either unexchanged (0-15% D), partially exchanged (15-60% D), or fully exchanged (60-100% D) on the basis of the heights of the corresponding peaks in the final Fourier maps.
FIG. 23. Schematic representation of H/D exchange at each amide peptide site in the trypsin molecule. The key distinguishes full exchange, partial exchange, unexchanged sites, prolines, sequence insertions and deletions (based on the chymotrypsinogen numbering scheme), and carboxylate side chains. Peptide NH and carbonyl oxygens are shown when H-bonded. The H/D exchange information for residues 60-62, 110-111, and 147-150 may not be reliable. (Reprinted by permission from Nature, 296, 713. Copyright © 1982 Macmillan Journals Limited.)
Limiting the number of categories in this manner retained the informational content of the H/D data while guarding against possible overinterpretation of the relevance of small changes in the exchange ratio. The degree of exchange of the amide hydrogens is shown in Fig. 23. Of the 215 exchangeable amide groups, 68% were found to be fully exchanged, 8% partially exchanged and 24% unexchanged (after soaking the crystal for one year in a deuterated mother liquor, pH = 7). The most prominent feature visible in Fig. 23 is the clustering of unexchanged sites in regions corresponding to the β-sheet structures, contained in two beta barrels forming the core of this protein. Only 11 of the peptide groups involved in β-sheet hydrogen bonding were found to be completely exchanged, and all of these were located at the edges of sheet structures. An interesting observation about the accessibility of sites within the tightly bonded β-sheet core came from noticing that the hydrogen belonging to the hydroxyl group of the Ser 54 side chain was fully exchanged, while the adjacent amide hydrogen and its neighbors were not (Fig. 24). This showed that conformational fluctuations large enough to expose the β-sheet region to attack by the solvent do occur, but not with sufficient probability for the slower peptide exchange to progress appreciably within the time frame of the experiment. The correlation of the degree of protection with the distance to the bulk solvent showed that 20 of the fully exchanged sites were located 5 Å deep into the structure. Nine of them, however, were found to be located adjacent to one or several of the interior water molecules, and it was concluded that these internal waters are crucial to the exchange process. Nevertheless, some amide protons did not exchange, even though the oxygen atoms to which they were hydrogen bonded were in turn bonded to internal D2O molecules.
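The three-way classification used for trypsin amounts to a simple thresholding of the estimated deuterium content, and the quoted percentages can be turned back into approximate site counts. The sketch below is illustrative only: the treatment of values falling exactly on a boundary (15% or 60%) is an assumption, since the text does not specify it, and the counts are obtained merely by rounding the published percentages of the 215 sites.

```python
def classify_exchange(percent_d):
    """Assign an amide site to one of the three categories used for trypsin.

    Boundary values (exactly 15 or 60) are placed in the upper category;
    this convention is an assumption, not stated in the text.
    """
    if not 0 <= percent_d <= 100:
        raise ValueError("deuterium content must lie between 0 and 100%")
    if percent_d < 15:
        return "unexchanged"
    if percent_d < 60:
        return "partially exchanged"
    return "fully exchanged"


# Percentages of the 215 exchangeable amide groups quoted in the text.
TOTAL_SITES = 215
QUOTED_PERCENT = {
    "fully exchanged": 68,
    "partially exchanged": 8,
    "unexchanged": 24,
}

approximate_counts = {
    category: round(TOTAL_SITES * percent / 100)
    for category, percent in QUOTED_PERCENT.items()
}

print(classify_exchange(5), classify_exchange(40), classify_exchange(90))
print(approximate_counts)
```

Rounded in this way, the percentages correspond to about 146 fully exchanged, 17 partially exchanged and 52 unexchanged sites, which together account for all 215 amide groups.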
While a significant correlation between the degree of protection and the nature of the side
FIG. 24. Section of β-sheet structure residues 43-45 and 53-56 in trypsin, showing main chain atoms, as well as the side chain of Ser 54. Because of a bulge in the sheet, the H-bond interaction between 43 O and 55 NH cannot be made directly. The bonding structure of the sheet is maintained (dashed lines) through the mediation of the side chain of Ser 54. The hydroxyl proton of the serine is almost fully exchanged for a deuterium, yet the amide proton of Ala 55 to which the hydroxyl oxygen is hydrogen bonded remains unexchanged. (Reprinted by permission from Nature, 296, 713. Copyright © 1982 Macmillan Journals Limited.)
chains was found, it turned out to be indirect and due to the ubiquity of hydrophobic side chains in β-sheet structures. No distinct correlation between the potential exchange of the interior sites and their observed thermal motions (temperature factors) was found, and this implied that the temperature factors contain no systematic information on the characteristics of the larger scale breathing modes of a protein. The model most likely to explain the observed exchange behavior of trypsin involved localized conformational mobility. This assumed that the most probable fluctuations are those characterized by a degree of "regional melting" usually limited in extent to the breaking of only a small number of hydrogen bonds. This model modifies the earlier ones, which explained the phenomenon of hydrogen exchange either by cooperative unfolding ("breathing protein"), solvent penetration, or a combination of these effects (Wagner and Wüthrich, 1979). Further information about the issues concerning the relationship of exchange chemistry to protein dynamics can be expected from the modeling studies, which will use the neutron results as a base.
VIII. SUMMARY AND CONCLUSIONS
The technique of single-crystal neutron diffraction of proteins is now at a stage where it begins to provide significant returns on the investment of the last fifteen years. Three facilities are currently capable of collecting useful data in a routine fashion. The initial technical difficulties have been largely overcome, and particularly the introduction of position-sensitive detectors and the associated data reduction algorithms have provided a major impetus to the change from the stage of technique development to that of practical utilization. While very large crystals are still desirable, they are no longer absolutely necessary, and it can be expected that a volume of about 1 mm³ will soon be considered sufficient for most experiments.
Since crystals of that size can be grown for many proteins, neutron diffraction experiments could become more common. In view of the clear success in locating specific protons in proteins such as trypsin or myoglobin, similar studies can lead to a better understanding of the mechanism of action of different classes of proteins. The results of hydrogen exchange experiments provide a unique possibility for correlating the dynamic properties of proteins with their three-dimensional structures. Finally, the contribution of neutron diffraction to improving the structural models of proteins, and, in particular, to providing a better description of their bound solvent, may become important. In the next few years the technique should reach the level of maturity now enjoyed by X-ray diffraction.
ACKNOWLEDGEMENT
I would like to thank Drs. Lennart Sjölin, Wayne Hendrickson, David Davies, Edward Prince and Antonio Santoro for their help and fruitful discussions. I am indebted to Mrs. Sheryl Long for her skillful editorial assistance.
REFERENCES
AGARWAL, R. C. (1978) Acta Cryst. A34, 791.
ALBERI, J. L. (1976) Brookhaven Symp. Biol. 27, VIII-24.
ALBERI, J. L., FISHER, J., RADEKA, V., ROGERS, L. C. and SCHOENBORN, B. P. (1975) IEEE Trans. Nucl. Sci. 1, 255.
ARNDT, U. W. (1968) Acta Cryst. B24, 1355.
ARNDT, U. W. and GILMORE, D. J. (1976) Brookhaven Symp. Biol. 27, VIII-16.
ARNDT, U. W. and WILLIS, B. T. M. (1966) Single Crystal Diffractometry, University Press, Cambridge.
ARNDT, U. W. and WONACOTT, A. J. (1977) The Rotation Method in Crystallography, North Holland, Amsterdam.
BACON, G. E. (1975) Neutron Diffraction, 3rd edition, Clarendon Press, Oxford.
BALLY, D., CHIRTOC, V., GHEORGHIU, Z., POPOVICI, M., STOICA, A. D., TARINA, R. and BALAGUROV, A. M. (1975) Report IFA FN-48, University of Bucharest.
BELLO, J. and HARKER, D. (1961) Nature (London) 192, 756.
BENTLEY, G. A., DUEE, E. D., MASON, S. A. and NUNES, A. C. (1979) J. Chim. Phys. 76, 817.
BENTLEY, G. A. and MASON, S. A. (1980) Phil. Trans. R. Soc. Lond. B290, 505.
BENTLEY, G. A. and MASON, S. A. (1981) In Structural Studies on Molecules of Biological Interest (eds. G. DODSON, J. P. GLUSKER and D. SAYRE), p. 246, Clarendon Press, Oxford.
BLUNDELL, T. L. and JOHNSON, L. N. (1976) Protein Crystallography, Academic Press, New York.
BORKAKOTI, N., MOSS, D. A. and PALMER, R. A. (1982) Acta Cryst. B38, 2210.
BUNDY, A. and WÜTHRICH, K. (1979) Biopolymers 18, 285.
BUSING, W. R. and LEVY, H. A. (1967) Acta Cryst. 22, 457.
CAIN, J. E., NORVELL, J. C. and SCHOENBORN, B. P. (1976) Brookhaven Symp. Biol. 27, VIII-43.
CARPENTER, J. M., BLEWITT, T. H., PRICE, D. L. and WERNER, S. A. (1979) Physics Today, Dec. 1979.
CHAMBERS, J. L. and STROUD, R. M. (1977a) Protein Data Bank (Brookhaven National Laboratory, Upton, New York).
CHAMBERS, J. L. and STROUD, R. M. (1977b) Acta Cryst. B33, 1834.
DAVIDSON, J. B. (1976) Brookhaven Symp. Biol. 27, VIII-3.
DIAMOND, R. (1971) Acta Cryst. A27, 436.
DIAMOND, R. (1981) In Biomolecular Structure, Conformation, Function and Evolution (ed. R. SRINIVASAN), Vol. 1, p. 567, Pergamon Press, Oxford.
DODSON, E. J., ISAACS, N. W. and ROLLETT, J. S. (1976) Acta Cryst. A32, 782.
DOUZOU, P., HUI BON HOA, G. and PETSKO, G. A. (1974) J. Mol. Biol. 96, 367.
FERRO, D. R., McQUEEN, J. E., McCOWN, J. T. and HERMANS, J. R. (1980) J. Mol. Biol. 136, 1.
GLASOE, P. K. and LONG, F. A. (1960) J. Phys. Chem. 64, 188.
HANSON, J. and SCHOENBORN, B. P. (1981) J. Mol. Biol. 153, 117.
HENDRICKSON, W. A. and KONNERT, J. H. (1980) In Computing in Crystallography (eds. R. DIAMOND, S. RAMASESHAN and K. VENKATESAN), p. 13.01, Indian Academy of Sciences, Bangalore.
HENDRICKSON, W. A. and KONNERT, J. H. (1981) In Biomolecular Structure, Conformation, Function and Evolution (ed. R. SRINIVASAN), Vol. 1, p. 43, Pergamon Press, New York.
HENDRICKSON, W. A. and TEETER, M. M. (1981) Nature (London) 290, 107.
HERMANS, J. R. and McQUEEN, J. E. (1974) Acta Cryst. A30, 730.
HOHLWEIN, D. and MASON, S. A. (1981) J. Appl. Cryst. 14, 24.
HOPPE, W. (1976) Brookhaven Symp. Biol. 27, II-22.
HVIDT, A. and LINDERSTRØM-LANG, K. (1954) Biochim. Biophys. Acta 14, 574.
JACK, A. and LEVITT, M. (1978) Acta Cryst. A34, 782.
KABSCH, W. (1977) J. Appl. Cryst. 10, 426.
KENDREW, J. C., BODO, G., DINTZIS, H. M., PARRISH, R. G., WYCKOFF, H. and PHILLIPS, D. C. (1958) Nature (London) 181, 662.
KONNERT, J. H. (1976) Acta Cryst. A32, 614.
KONNERT, J. H. and HENDRICKSON, W. A. (1980) Acta Cryst. A36, 344.
KOSSIAKOFF, A. A. (1982) Nature (London) 296, 713.
KOSSIAKOFF, A. A. and SPENCER, S. A. (1980) Nature (London) 288, 414.
KOSSIAKOFF, A. A. and SPENCER, S. A. (1981) Biochemistry 20, 6462.
KRIEGER, M., CHAMBERS, J. L., CHRISTOPH, G. G., STROUD, R. M. and TRUS, B. L. (1974) Acta Cryst. A30, 740.
LUZZATI, V. (1952) Acta Cryst. 5, 802.
MOORE, F. M., WILLIS, B. T. M. and HODGKIN, D. C. (1967) Nature (London) 214, 130.
MASON, S. A. and NUNES, A. C. (1976) Annual report, Institute Laue-Langevin, Grenoble.
NORTH, A. C. T., PHILLIPS, D. C. and MATHEWS, F. S. (1968) Acta Cryst. A24, 351.
NORVELL, J. C., NUNES, A. C. and SCHOENBORN, B. P. (1975) Science 190, 568.
NORVELL, J. C. and SCHOENBORN, B. P. (1976) Brookhaven Symp. Biol. 27, II-12.
NUNES, A. C. (1975) J. Appl. Cryst. 8, 20.
NUNES, A. C., NATHANS, R. and SCHOENBORN, B. P. (1971) Acta Cryst. A27, 284.
NUNES, A. C. and NORVELL, J. (1976) Brookhaven Symp. Biol. 27, VII-57.
PETERSON, S. W., REIS, A. H., SCHULTZ, A. J. and DAY, P. (1980) In Advances in Chemistry Series, No. 186, Solid State Chemistry: A Contemporary Overview (eds. S. L. HOLT, J. B. MILSTEIN and M. ROBBINS), p. 75, American Chemical Society, Washington.
PHILLIPS, J. C., WLODAWER, A., GOODFELLOW, J. M., WATENPAUGH, K. D., SIEKER, L. C., JENSEN, L. H. and HODGSON, K. O. (1977) Acta Cryst. A33, 445.
PHILLIPS, S. E. V. (1980) J. Mol. Biol. 142, 531.
PHILLIPS, S. E. V. and SCHOENBORN, B. P. (1981) Nature (London) 292, 81.
PRINCE, E., WLODAWER, A. and SANTORO, A. (1978) J. Appl. Cryst. 11, 173.
RAGHAVAN, N. V. and SCHOENBORN, B. P. (1981) Abstract 02.808, XII-th International Congress of Crystallography, Ottawa, p. C-54. IUCR, Ottawa.
RAMASESHAN, S. (1964) In Advanced Methods in Crystallography (ed. G. N. RAMACHANDRAN), Academic Press, London.
RICHARZ, R., SEHR, P., WAGNER, G. and WÜTHRICH, K. (1979) J. Mol. Biol. 130, 19.
ROBERTS, G. C. K., MEADOWS, D. H. and JARDETZKY, O. (1969) Biochemistry 8, 2053.
ROSA, J. J. and RICHARDS, F. M. (1979) J. Mol. Biol. 133, 399.
ROSA, J. J. and RICHARDS, F. M. (1981) J. Mol. Biol. 145, 835.
SAKABE, N., SAKABE, K. and SASAKI, K. (1981a) In Structural Studies on Molecules of Biological Interest (eds. G. DODSON, J. P. GLUSKER and D. SAYRE), p. 509, Clarendon Press, Oxford.
SAKABE, K., SASAKI, K. and SAKABE, N. (1981b) Abstract 02.807, XII-th International Congress of Crystallography, Ottawa, p. C-54. IUCR, Ottawa.
SANTORO, A. and WLODAWER, A. (1980) Acta Cryst. A36, 442.
SCHOENBORN, B. P. (1969) Nature (London) 224, 143.
SCHOENBORN, B. P. (1971) Cold Spring Harbor Symp. Quant. Biol. 36, 569.
SCHOENBORN, B. P. (1972) In Structure and Function of Oxidation and Reduction Enzymes (eds. A. AKERSON and A. EHRENBERG), Pergamon Press, Oxford.
SCHOENBORN, B. P. (1975) In Anomalous Scattering (eds. S. RAMASESHAN and S. C. ABRAHAMS), p. 407, Munksgaard, Copenhagen.
SCHOENBORN, B. P. and DIAMOND, R. (1976) Brookhaven Symp. Biol. 27, II-3.
SCHOENBORN, B. P. and HANSON, J. C. (1980) In Water in Polymers (ed. S. P. ROWLAND), ACS Symp. 127, 215, American Chemical Soc., Washington DC.
SCHOENBORN, B. P., NUNES, A. C. and NATHANS, R. (1970) Ber. Bunsenges. Phys. Chem. 74, 1202.
SCHREIER, A. A. and BALDWIN, R. L. (1976) J. Mol. Biol. 105, 409.
SCHULTZ, A. J., TELLER, R. G., PETERSON, S. W. and WILLIAMS, J. M. (1982) AIP Conference Series (Argonne meeting, Summer 1981) American Institute of Physics, Washington.
SJÖLIN, L. and WLODAWER, A. (1981) Acta Cryst. A37, 594.
SPENCER, S. A. and KOSSIAKOFF, A. A. (1980) J. Appl. Cryst. 13, 563.
STANSFIELD, R. F. D. (1982) AIP Conference Series (Argonne meeting, Summer 1981) American Institute of Physics, Washington.
TAKANO, T. (1977) J. Mol. Biol. 110, 537.
THOMAS, M., STANSFIELD, R. F. D., BERNERON, M. and FILHOL, A. (1981) Annual Report, Institute Laue-Langevin, Grenoble.
WAGNER, G. and WÜTHRICH, K. (1979) J. Mol. Biol. 134, 75.
WALTER, J. and STEIGEMANN, W. (1981) Abstract 02.7-02, XII-th International Congress of Crystallography, Ottawa, p. C-50. IUCR, Ottawa.
WATENPAUGH, K. D., MARGULIS, T. N., SIEKER, L. C. and JENSEN, L. H. (1978) J. Mol. Biol. 122, 175.
WATSON, H. C. (1969) Progr. Stereochem. 4, 299.
WLODAWER, A. (1980) Acta Cryst. B36, 1826.
WLODAWER, A. and HENDRICKSON, W. H. (1982) Acta Cryst. A38, 239.
WLODAWER, A. and SJÖLIN, L. (1981) Proc. Natl. Acad. Sci. U.S.A. 78, 2853.
WLODAWER, A. and SJÖLIN, L. (1982a) Biochemistry, in press.
WLODAWER, A. and SJÖLIN, L. (1982b) Proc. Natl. Acad. Sci. U.S.A. 79, 1418.
WLODAWER, A., SJÖLIN, L. and SANTORO, A. (1982a) J. Appl. Cryst. 15, 79.
WLODAWER, A., BOTT, R. and SJÖLIN, L. (1982b) J. biol. Chem. 257, 1325.
WÜTHRICH, K. and WAGNER, G. (1979) J. Mol. Biol. 130, 1.
XUONG, N. H., FREER, S. T., HAMLIN, R., NIELSEN, C. and VERNON, W. (1978) Acta Cryst. A34, 289.