Computational strategies pertinent to NMR solution structure determination


Thomas L James
UCSF, San Francisco, USA

Experimental NMR data provide the basis for determining the three-dimensional structures of biopolymers in solution. Most work has been conducted on protein structure determination, but there have also been efforts to obtain meaningful nucleic acid and carbohydrate structures. Analysis of the large quantity of data obtained by NMR techniques requires much more time and effort than acquisition of the data. Advances in automating resonance assignment procedures and in enhancing the quality and quantity of structural restraints available from the experimental data continue to be made. There are further improvements in computational methods used to search conformational space to yield all possible structures consistent with the experimental restraints and to optimize resulting structures. Recently there has also been more consideration of the accuracy and precision of the structures obtained.

Current Opinion in Structural Biology 1994, 4:275-284

Introduction

Solution structures of about 200 proteins or peptides and a handful of nucleic acids have been determined using experimental NMR-derived structural restraints. Yet it could easily be argued that NMR as a method for determining molecular structures is, if not in its infancy, then in its adolescence. While structures are indeed being determined, there remains a comparatively rapid concomitant development of techniques for structure determination, both experimental and computational. This review will focus on computational aspects. In particular, I will be concerned here with structural calculations performed recently which enhance our insights into solution structure using experimental NMR data. In this regard, it should be noted that NMR structural restraints per se do not constitute a structure. Rather, ideally, an effort is made to generate an array of structures consistent with all available experimental data, which map out the conformational space accessible to the molecule. The method of generating these structures should in principle enable an intelligent search of all conformational space. However, in practice this is no easy problem for molecules larger than about 1000 daltons, so there has been no consensus on the exact protocol for NMR solution structure determination. Most people would agree, however, that our understanding of any structure might be enhanced if we could describe it more precisely, for example, if the array of structures generated in accord with experimental data was more closely related, and more accurate, i.e. afforded a better representation of its true nature. Obviously, precision and accuracy are not the same thing.

There are a number of recent reviews on NMR determination of protein structure [1-3] and DNA structure [4,5]. This review, however, will emphasize computational strategies introduced in the past year or so which could: first, enhance accuracy or precision; second, ease structure determination; third, provide attractive alternative means of structure generation; and finally, enable an assessment of structure quality. A review of some computational aspects of protein structure determination, especially distance geometry, has previously been presented [6] and a volume resulting from a NATO workshop on the subject has also appeared [7].

Abbreviations: 2D, two-dimensional; 3D, three-dimensional; 4D, four-dimensional; DG, distance geometry; ISPA, isolated spin pair approximation; MD, molecular dynamics; MD-tar, molecular dynamics with time-averaged restraints; NOE, nuclear Overhauser effect; NOESY, nuclear Overhauser effect spectroscopy; r-MD, restrained molecular dynamics; RMSD, root mean square deviation; ROESY, rotating-frame Overhauser effect spectroscopy; TOCSY, total correlation spectroscopy.
© Current Biology Ltd ISSN 0959-440X

Generation of solution structures

A consensus on the exact methodology of structure determination using NMR is lacking, but the thread common to all procedures is that NMR data must be acquired which permit assignment of resonances, and which are characteristic of the structure. A search of conformational space must be made in an attempt at simultaneously satisfying all geometric requirements imposed by the holonomic constraints of atom connectivities, bond lengths and bond angles in addition to all of the experimental NMR structural data. Unless the data are in serious error, or there are multiple conformations, the procedure should show what structures are compatible with available information. It is intuitively obvious that an increasing amount of data will serve to limit the accessible conformational space, and will thus yield a higher resolution structure.

While the rough idea for structure determination is clear, virtually every single step in achieving the goal has variants. There are different types of NMR spectra, means of assigning resonances, means of obtaining structural restraints, and methods of searching conformational space. Generally, though, the NMR data used to determine structures have always included interproton distance-related two-dimensional (2D) nuclear Overhauser effect (NOE) or possibly rotating-frame Overhauser effect spectroscopy (ROESY) cross-peak intensities. These are sometimes augmented by torsion angle-related data from scalar coupling based spectra. Typically, distances are estimated from 2D NOE spectra along with an estimate of upper and lower bounds to each of the distances. These serve as the structural restraints (or constraints).

The list of structural restraints is examined in a computational procedure whereby an objective function measures the deviation of the individual distances from their target values (experimental upper and lower bounds). Typically, one of the distance geometry algorithms would be used in an attempt at simultaneously satisfying all geometric requirements imposed by the holonomic constraints, as well as the experimental distance bounds. If successful, a list of atomic coordinates (the structure) is obtained along with an indication of the extent of their inconsistency with the experimental data.

Further refinement might entail consideration of some of the energetic aspects of the structure. In this case, the molecular force field could include penalty functions monitoring the deviation of NMR-derived distance restraints (and possibly scalar coupling-derived torsion angle restraints) as conformational space is searched. The mathematical procedures of molecular dynamics (MD) or simulated annealing commonly provide the means for carrying out such a search of conformational space.

Resonance assignments

For a biopolymer which is soluble, not aggregated, and predominantly assumes a single structure in solution, the first step in structure determination is the assignment of pertinent resonances. This may entail only proton resonances if a protein has fewer than 100 amino acyl residues, or if a DNA duplex, triplex or quadruplex has fewer than about 20 base pairs. However, for larger proteins or for RNA structures with more than about 20 nucleotides, it is usually necessary to incorporate stable isotopes. In this case, 15N and 13C resonances must also be assigned. Cumulative experience from several laboratories which produced software to aid proton resonance assignments in proteins using 2D NMR spectra is that signals remained so overlapped that automated assignments could generally only be made for a rather limited number of backbone resonances. However, the advent of extensive 15N and 13C labeling, along with three-dimensional (3D) and four-dimensional (4D) NMR techniques, which spread the signals into a third and fourth dimension, decreases the resonance overlap problem and increases the likely success of automatic assignment procedures.

Assignment of resonances is often the bottleneck in structure determination. As it is in principle amenable to automation, there is currently considerable activity in this area, with many of the programs extant having no published description. However, there are some reports which give a flavor of the different approaches taken. The degree of automation in assignment procedures ranges from automation of only a part of the process, e.g., the final sequential assignment step [8] or aid in bookkeeping by providing a list of chemical shifts with scalar connectivities [9], to procedures that attempt to make as many resonance assignments as possible automatically, using the input cross-peak frequency and fine structure data. Some of these approaches have recently been extended to 3D and 4D NMR.

Most automated (or semi-automated) techniques are based on computerizing the methods commonly used for manual assignment. Consequently, several semi-automated assignment aids for the commonly employed hierarchical assignment procedures have been developed over the past 6-7 years. Basically, spin systems are identified via spectra based on scalar coupling, and subsequently through-space NOE connectivities are made.
Recent developments for 3D NMR include a method which first identifies spin patterns in 3D total correlation spectroscopy (TOCSY)-TOCSY spectra, with manual assignment to specific amino acids, followed by automatic sequential connectivities made from 3D TOCSY-NOESY (NOE spectroscopy) spectra [10]. Others are extending earlier 2D NMR versions of hierarchical assignment procedures to 3D NMR data as well [11-13]. For example, a 3D and 4D NMR version of the program Pronto is being distributed for analysis of homonuclear and heteronuclear protein NMR spectra. Hierarchical methods typically utilize some secondary structural characteristics in making sequential assignments. Recently, at least two approaches have been explored which are based almost entirely on structure. In the first instance, NOE-derived interproton distances are utilized to generate a three-dimensional distribution of the protons, which is then used for assignments using knowledge of the primary sequence and atom connectivity [14,15]. It was concluded that for this procedure to work, excellent quality stereo-resolved data are required [15]. Perhaps of practical value is a method which incorporates a high-dimensional potential term into the force field used for restrained molecular dynamics (r-MD) refinement [16]. In this case, ambiguous NOE connectivities could be assigned and used for further structure improvement.

Various computational techniques have also been brought to bear on assignment problems, such as rule-based identification [17], combinatorial minimization [8] and genetic algorithms [18]. Initial reports with these techniques suggest that they could provide powerful means of automated assignment.

Structural restraints

Secondary structure is commonly discerned while assigning resonances. Traditionally this has entailed assessment of sequential NOE connectivities, among other things. Recently, it has become possible to estimate the extent of α-helix and β-sheet from proton chemical shifts even prior to resonance assignments. For example, α-proton resonances shift upfield about 0.4 ppm (relative to the random coil) for a helical configuration and downfield about 0.4 ppm in a β-strand, or extended configuration [19,20]. Chemical shifts are quite sensitive to small changes in structure, but historically it has proven difficult to utilize chemical shift information quantitatively. Aromatic ring current shifts depend on the distance and orientation of a nucleus relative to the ring. Incorporation of ring current shifts into r-MD calculations has been initiated [21]. Recent ab initio calculations have also related 13C, 15N, and 1H chemical shifts in proteins to backbone geometry as well as sidechain torsion angle χ1 [22].

All structures determined to date have utilized structural restraints from NOE or ROESY cross-peaks. However, the manner of using these data in structure determination has varied. Most protein structure determinations have employed distance restraints estimated using the nearly trivial calculation offered by the isolated spin pair (or two-spin) approximation, with an estimate of upper and lower bounds to the distances. While this approximation leads to sizable errors, both random and systematic [23-25], the high proton density of proteins typically leads to large numbers of restraints which can largely compensate for inaccuracies (see below). The strongest motivation for using the two-spin approximation is that, because the resulting distances are only approximate in any case, it is unnecessary to determine cross-peak intensities accurately.
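The two-spin calculation referred to above can be sketched in a few lines. This is a minimal illustration, assuming the usual sixth-power scaling against a reference cross-peak of known distance; the function names, the reference values and the fractional error used to set bounds are hypothetical choices for illustration, not values taken from any of the programs cited.

```python
def ispa_distance(intensity, ref_intensity, ref_distance):
    """Two-spin (ISPA) estimate: r = r_ref * (a_ref / a) ** (1/6).

    intensity      -- measured cross-peak intensity for the pair of interest
    ref_intensity  -- intensity of a cross-peak between protons at a known distance
    ref_distance   -- that known distance (same units as the returned value)
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)


def ispa_bounds(distance, rel_error=0.2):
    """Hypothetical symmetric bounds: +/- rel_error around the estimate."""
    return distance * (1.0 - rel_error), distance * (1.0 + rel_error)


# A cross-peak 64 times weaker than the reference corresponds to twice the distance.
r = ispa_distance(intensity=1.0, ref_intensity=64.0, ref_distance=2.0)
lower, upper = ispa_bounds(r)
```

Because of the sixth-root dependence, a large error in intensity translates into a modest error in distance, which is part of the appeal of the approximation; the systematic errors from neglected neighboring spins are not captured by this sketch.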
For nucleic acid structures, however, higher resolution is often necessary to say anything meaningful about the structure, so a consensus seems to be forming that more accurate distances are needed. Complete relaxation matrix analysis of proton 2D NOE spectra accurately relates numerous interproton distances to NOE intensities, accounting for all dipole-dipole interactions and thus obviating the 'spin diffusion' effects which lead to the major errors in the two-spin approximation [23]. For example, the program CORMA calculates 2D NOE peak intensities for any postulated model [23,24]. Earlier work in this area has previously been reviewed [25] and I will limit my discussion to more recent work.

The most efficient relaxation matrix techniques for calculating accurate distances entail an iterative approach [24,26-35]. While some of these have been in use for a few years, e.g., IRMA, MORASS and MARDIGRAS, there have been improvements in the algorithms. For example, internal motions have been incorporated into the IRMA and MARDIGRAS programs [35-37]. The effect on structure determination of distance values compromised by internal motions has been examined via relaxation matrix calculations [37,38]. In addition, for exchangeable protons, exchange with bulk water can significantly attenuate NOE cross-peak intensity, but the exchange rate has been incorporated into MARDIGRAS, permitting accurate distances to exchangeable protons to be determined [39].

Other relaxation matrix variants have also recently been presented. Visual comparison of experimental 2D NOE spectra and spectra simulated using the complete relaxation matrix approach enabled better fitting of overlapped cross-peaks [32,40]. A fitting of NOE build-up curves, accounting for all relaxation matrix elements, using linear prediction has also been demonstrated [41]. Alternatives to relaxation matrix methods, which also account for many of the dipole-dipole interactions, continue to be developed [42,43].

For structure determination, an estimate of the accuracy of the structural restraints is needed for setting bounds in distance geometry, or flat-well size in r-MD calculations. Currently, restraint bound estimates vary from laboratory to laboratory. Tighter distance bounds (small error bars) result in a better defined structure. However, distance bounds made tighter than warranted by experimental accuracy lead to a highly defined (small atomic root mean square deviation) but incorrect structure [44]. Ideally the bounds should be as tight as possible but not so tight that the real distance lies outside the bounds. As well as enabling more accurate distances, iterative relaxation matrix methods, at least MARDIGRAS, can aid our choice of bounds individually for each proton pair [45].
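The complete relaxation matrix idea discussed above, in which intensities at mixing time τ are obtained from the matrix exponential exp(-Rτ), can be illustrated with a toy model. The sketch below is not CORMA or MARDIGRAS. It assumes a rigid molecule in the slow-tumbling limit, where the cross-relaxation rate between two protons is simply -k/r^6 and each proton's self-relaxation is the sum of those magnitudes plus a uniform leakage term. The constants `k` and `leak`, the coordinates and all function names are hypothetical.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, squarings=10, terms=12):
    """Matrix exponential by Taylor series with scaling and squaring."""
    n = len(m)
    scale = 2.0 ** squarings
    small = [[x / scale for x in row] for row in m]
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms + 1):
        term = [[x / order for x in row] for row in mat_mul(term, small)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def rate_matrix(coords, k=488.0, leak=0.5):
    """Toy relaxation matrix: sigma_ij = -k / r_ij**6, rho_i = sum_j k / r_ij**6 + leak."""
    n = len(coords)
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                w = k / math.dist(coords[i], coords[j]) ** 6
                rates[i][j] = -w
                rates[i][i] += w
        rates[i][i] += leak
    return rates


def noe_intensities(coords, mixing_time, k=488.0, leak=0.5):
    """Intensity matrix a = exp(-R * tau) for the toy rate matrix."""
    rates = rate_matrix(coords, k, leak)
    return mat_exp([[-x * mixing_time for x in row] for row in rates])


# Three collinear protons 2.5 A apart: the 0-2 pair is 5 A apart with proton 1 between.
chain = [(0.0, 0.0, 0.0), (2.5, 0.0, 0.0), (5.0, 0.0, 0.0)]
full = noe_intensities(chain, mixing_time=0.2)
# Same outer pair with the relay proton removed (what the two-spin picture assumes).
pair = noe_intensities([chain[0], chain[2]], mixing_time=0.2)
# Two-spin distance estimate for the 0-2 pair, calibrated on the 0-1 cross-peak.
r_two_spin = 2.5 * (full[0][1] / full[0][2]) ** (1.0 / 6.0)
```

In this toy case the 0-2 cross-peak in `full` is considerably larger than in `pair`, because magnetization is relayed through the middle proton, and `r_two_spin` comes out well short of the true 5 Å. That is the systematic spin diffusion error which relaxation matrix methods are designed to remove.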
Individually chosen bounds are especially valuable for exchangeable protons and protons subject to motional or overlap averaging. Molecular motions compromise the accuracy of experimental restraints and, consequently, the precision and probably the accuracy of the structure defined. As noted, some efforts have already been made to account for internal motions in relaxation matrix calculations. In addition, initial efforts have been made to account for conformational ensembles resulting from conformational flexibility. In particular, NOE spectra have been calculated for such conformational ensembles and the calculated spectra compared with experimental data [46-49]. Relaxation matrix methods have also enabled calculated ROESY spectra for protein and DNA structure models to be compared with experimental ROESY data [50-52]. For example, a ROESY spectrum has been calculated for the multiple conformations of methyl β-cellobioside resulting from MD trajectories [53].

The resolution of protein structures has improved considerably with the advent of stereospecific assignments. The program HABAS has been written to determine stereospecific assignments via a complete grid search of local φ, ψ, and χ1 angles to eliminate one of the two possible stereospecific assignments for β-methylene protons [54]. STEREOSEARCH is another program, based on a conformational database search of crystal structures [55]. Coupling constant measurements are important for these programs; several heteronuclear NMR experiments, especially, have enabled J coupling constant measurements, which permit elucidation of stereospecific assignments and the rotameric states characterized by φ and χ1. A procedure termed CUPID determines the continuous angular distribution of rotamer probability from a minimum of four measured vicinal homonuclear and heteronuclear coupling constants [56,57]. This is particularly applicable to conformationally flexible peptide backbones, but could find use with carbohydrates or nucleic acids.

A recent calculation indicates that the interaction between coherent scalar coupling and incoherent dipolar relaxation, generally considered negligible, could be significant for biopolymers [58]. The effect depends on interproton distances, coupling constant values, and correlation times. Of particular concern are vicinal coupling constant values derived between two protons when one of the two is also dipolar coupled very strongly to another proton. It would appear that, with the small biopolymers currently studied, small corrections at most will be necessary. However, more calculations to determine the extent of this effect are warranted.

Extraction of torsion angle restraints from vicinal coupling constants involves use of a Karplus relationship. It should be noted that a re-parameterization of the Karplus equation for sugar rings in nucleic acids was recently published [5].
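As an illustration of how a Karplus relationship turns a coupling constant into a torsion angle restraint, the sketch below evaluates the generic form J(θ) = A cos²θ + B cos θ + C. The coefficients shown (A = 6.4, B = -1.4, C = 1.9 Hz, with θ = φ - 60°) are one commonly quoted parameterization for the HN-Hα coupling in proteins and are an assumption here; they are not taken from this review, and they are not the sugar-ring re-parameterization mentioned above. Function names are hypothetical.

```python
import math


def karplus(theta_deg, a, b, c):
    """Generic Karplus curve: J = A*cos^2(theta) + B*cos(theta) + C (Hz)."""
    cos_t = math.cos(math.radians(theta_deg))
    return a * cos_t * cos_t + b * cos_t + c


def j_hn_ha(phi_deg, a=6.4, b=-1.4, c=1.9):
    """3J(HN,Ha) as a function of backbone phi, assuming theta = phi - 60 degrees."""
    return karplus(phi_deg - 60.0, a, b, c)


def phi_candidates(j_obs, tol=0.5, step=1):
    """All phi values (degrees) whose predicted coupling lies within tol of j_obs.

    The Karplus curve is multivalued, so one coupling constant generally
    maps onto several allowed ranges of phi rather than a single angle.
    """
    return [phi for phi in range(-180, 180, step)
            if abs(j_hn_ha(phi) - j_obs) <= tol]


j_extended = j_hn_ha(-120.0)   # extended (beta-like) backbone
j_helical = j_hn_ha(-60.0)     # roughly helical backbone
allowed = phi_candidates(j_extended)
```

The list returned by `phi_candidates` makes the ambiguity explicit: a restraint derived from a single coupling constant is usually a set of angular ranges, which is one reason several couplings per torsion angle are valuable.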

Structure generation and refinement using restraints

Some protocols simultaneously refine distance restraints and structure via iteration through a grand cycle of structure generation. This is normally done via r-MD on one branch and relaxation matrix calculations on the other, starting from a model structure typically constructed using isolated spin pair approximation (ISPA) based distances. While this probably works fine, it might be argued that it is better to keep the restraint determination step and the structure generation step separate. For one thing, it enables a clear, individual assessment of the range (i.e., bounds) for each structural restraint (see above). It should also diminish any tendency to become trapped in a local energy minimum occurring in the vicinity of early iteration structures which are trying to satisfy inaccurate ISPA-derived distances. Separate establishment of accurate structural restraints also allows other structure determination techniques to be used on the same restraint set.

Distance geometry (DG) and r-MD (simulated annealing) calculations are most commonly used for structure generation and refinement. Both entail two key facets: first, a search of conformational space to yield a structure consistent with all restraints; and second, optimization of any resulting structure. Adequate sampling of conformational space is required to find all possible structures consistent with the experimental restraints. Methods entailing systematic searches of accessible conformational space have been advocated, but they are presently too expensive computationally to be used for biopolymers. Consequently, a directed search of conformational space is generally carried out. With molecules the size of proteins, there is rarely a guarantee that all of conformational space has indeed been searched.

Havel [6] reviewed the sampling problem as well as DG handling of the situation. The DG sampling problem has also been explored independently of other techniques [59]. The solution to the problem for the metric matrix approach lies in the appropriate protocol of random metrization, i.e., random choice of the order in which distances are chosen in the 'classical' embedding scheme in the metric matrix method. The latest versions of DG programs either utilizing a metric matrix, as in DG-II [6], EMBOSS [60], or X-PLOR [59], or an alternative approach, such as that in DIANA [61], provide the most efficient means of sampling a large region of conformational space.

Restrained molecular dynamics simulations attempt to reconcile experimental structural restraints with energetic considerations. Two recent reviews of r-MD are to be found in [62,63]. A hybrid DG/r-MD approach is often employed, with distance geometry generating a family of structures which are further refined via r-MD. In a hybrid approach, optimization steps in the DG program are generally eliminated to save computation time.
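Distance geometry programs of the kind listed above work from matrices of upper and lower distance bounds. One standard preprocessing step in the distance geometry literature, not described in this review and shown here only as a hedged sketch, is triangle-inequality bound smoothing: an upper bound can never exceed the sum of the upper bounds along a path through a third atom, and a lower bound can be raised when one leg is known to be long and the other short. The function name and the example numbers are hypothetical, and the lower-bound step is a single simplified pass rather than a full implementation.

```python
def smooth_bounds(upper, lower):
    """Triangle-inequality smoothing of symmetric bound matrices (nested lists).

    Upper bounds: u_ij <= u_ik + u_kj        (shortest-path style update)
    Lower bounds: l_ij >= l_ik - u_kj and l_ij >= l_jk - u_ki
    Returns new (upper, lower) matrices; the inputs are left unchanged.
    """
    n = len(upper)
    u = [row[:] for row in upper]
    lo = [row[:] for row in lower]

    # Tighten upper bounds first (Floyd-Warshall over intermediate atom k).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i != j and u[i][k] + u[k][j] < u[i][j]:
                    u[i][j] = u[i][k] + u[k][j]

    # Raise lower bounds using the smoothed upper bounds.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if i == j or k == i or k == j:
                    continue
                candidate = max(lo[i][k] - u[k][j], lo[j][k] - u[k][i])
                if candidate > lo[i][j]:
                    lo[i][j] = candidate
    return u, lo


# Three atoms: 0-1 is long (8 to 10), 1-2 is short (1 to 3), 0-2 is unrestrained.
upper_in = [[0.0, 10.0, 100.0], [10.0, 0.0, 3.0], [100.0, 3.0, 0.0]]
lower_in = [[0.0, 8.0, 1.0], [8.0, 0.0, 1.0], [1.0, 1.0, 0.0]]
upper_out, lower_out = smooth_bounds(upper_in, lower_in)
```

Here the unrestrained 0-2 pair ends up bounded between 5 and 13 purely from the other two restraints, which gives a feel for how covalent geometry and many loose restraints together narrow the accessible conformational space.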
Several MD programs, both commercial and academic, have been modified for incorporation of NMR restraints, i.e., the molecular force field is modified, giving:

$$V_{\mathrm{total}} = V_{\mathrm{bond}} + V_{\mathrm{angle}} + V_{\mathrm{dihedral}} + V_{\mathrm{van\,der\,Waals}} + V_{\mathrm{coulomb}} + V_{\mathrm{H\,bond}} + V_{\mathrm{NOE}} + V_{J\,\mathrm{coupling}} \qquad (1)$$

The first five terms serve to monitor the classical potential energy of the molecule. There may or may not be an explicit term used to maintain hydrogen bonds ($V_{\mathrm{H\,bond}}$). The final two terms serve as penalty functions monitoring the NOE-derived distance restraints and scalar coupling-derived torsion angle restraints, respectively. Most studies have not incorporated torsion angle restraints, but all have utilized NOE restraints. Different functional forms have been employed for these penalty functions, but most utilize distances derived from NOE (or ROESY) spectra, for example, a flat-well potential with quadratic boundaries beyond the experimental upper and lower distance bounds. An alternative penalty function more in keeping with the sixth-root relationship of the distance and 2D NOE cross-peak intensity has been proposed, namely, $V_{ij}^{\mathrm{NOE}} = k\,(a_{\mathrm{exp}}^{-1/6} - a_{\mathrm{calc}}^{-1/6})^2$ for any pair of protons i and j [64], where $a_{\mathrm{exp}}$ and $a_{\mathrm{calc}}$ are the experimental and calculated intensities for the ij cross-peak, respectively, and k is the NOE restraints' force constant; it was noted for a decapeptide that better convergence behavior was achieved.

Rather than using derived distances in the NOE penalty term, Yip and Case [65] directly utilized the difference between experimental 2D NOE intensities and intensities calculated via the program CORMA for the molecular structure at each step of the r-MD simulation. Their calculation is computationally demanding, as it entails calculating the gradient of the intensity matrix via CORMA as the structure changes during the MD trajectory. As originally written, the calculation can require up to two orders of magnitude more time than standard MD refinement against distances. But, in the case of proteins, it apparently yields improvements in the derived structure [66]. More efficient methods of calculating the gradient have recently been developed [67-69]. Intensities from 3D NOE-NOE spectra have been used directly in protein refinement, resulting in a better defined structure than when two-spin approximation distances were used [70]. The popular software packages AMBER and X-PLOR have the ability to perform molecular dynamics with refinement directly against NOE intensities.

Both homonuclear and heteronuclear vicinal coupling constants have been directly used as restraints in MD calculations on a cyclic peptide [71]. NOE distance restraints could be included as well. Although three-bond couplings are more commonly employed, the one-bond Cα-Hα coupling constant is sensitive to the ψ angle in the peptide backbone, and complements the vicinal coupling data characterizing φ and χ1. The φ and ψ angle dependence of the Cα-Hα coupling constant has been illustrated, with the one-bond coupling constants utilized directly as restraints in MD simulations [72].
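The two NOE penalty forms described in this section, the flat-well potential with quadratic walls and the sixth-root intensity penalty, can be written down compactly. The following is a minimal sketch; the force constant values, the restraint tuple layout and the function names are hypothetical and do not correspond to the input format of any particular MD package.

```python
def flat_well_penalty(r, lower, upper, k=1.0):
    """Zero inside [lower, upper]; quadratic in the violation outside."""
    if r < lower:
        return k * (lower - r) ** 2
    if r > upper:
        return k * (r - upper) ** 2
    return 0.0


def sixth_root_penalty(a_exp, a_calc, k=1.0):
    """V = k * (a_exp**(-1/6) - a_calc**(-1/6))**2 for one cross-peak."""
    return k * (a_exp ** (-1.0 / 6.0) - a_calc ** (-1.0 / 6.0)) ** 2


def noe_energy(distances, restraints, k=1.0):
    """Sum of flat-well penalties.

    distances  -- dict mapping (i, j) proton pairs to current model distances
    restraints -- iterable of (i, j, lower, upper) tuples
    """
    return sum(flat_well_penalty(distances[(i, j)], lo, up, k)
               for i, j, lo, up in restraints)


model = {(1, 2): 3.0, (1, 5): 4.0}
restraint_list = [(1, 2, 2.5, 3.5), (1, 5, 2.5, 3.5)]
v_noe = noe_energy(model, restraint_list, k=2.0)   # only the (1, 5) pair is violated
```

The flat well expresses the view that any distance within the experimental bounds is equally acceptable, whereas the sixth-root form compares quantities that scale like distances without first committing to a derived distance.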
Recognizing that conformational fluctuations play a role in determining distance restraints extracted from 2D NOE spectra, an alternative distance penalty function was proposed [63,73]. In this case, the distance between any two protons and the associated penalty is monitored as a running average with exponential weighting to emphasize more recent snapshots during an r-MD trajectory. This has the effect of permitting a broader search of conformational space in an r-MD simulation. For a DNA duplex, it was found that molecular dynamics with time-averaged restraints (MD-tar) generates an ensemble of structures which fit both the experimental 2D NOE data and the scalar coupling data better than the best structure obtained via traditional r-MD methods [47,48]. Incorporation of a penalty function for time-averaged scalar coupling constants has been demonstrated for the cyclic decapeptide antamanide, which is known to exhibit conformational flexibility [74].

There are methods other than DG and r-MD for generating structures from NMR data. Methods for building up structures by using the NMR data for constructing local structural elements, followed by a hierarchical build-up or a limited conformational space search based on consideration of further NMR data, have been proposed [75]. Another probabilistic method (FILMAN), based on a double iterated Kalman filter, operates in dihedral angle space and yields estimates of dihedral angle errors in the structures generated [76]. The algorithm REPENT has been developed and demonstrated to be of use for protein structure determination [77]. REPENT uses the sixth root of the ratio between the observed and CORMA-calculated NOE intensities as a gradient to drive changes in the distance restraints input to the distance geometry program DIANA in an iterative procedure. Another recently proposed method (termed PEACS), entailing a conformational search by potential energy annealing, is claimed to sample conformational space more extensively than traditional r-MD, yielding structures which are in better agreement with experimental data [78].

A Monte Carlo search in torsion angle space has been developed for proteins [79]. Originally developed to provide better sampling of conformational space than metric matrix distance geometry methods, Monte Carlo calculations have subsequently been shown to be readily adaptable to the multiple-minima problem characteristic of linear peptides in solution [80]. Another restrained Monte Carlo (r-MC) procedure using internal coordinates was also developed and applied to NOE data from a DNA duplex [81]. Several different r-MC protocols yielded the same structure (≤0.3 Å atomic RMSD).
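The time-averaged restraint idea described earlier in this section, a running average with exponential weighting toward recent snapshots, can be sketched as follows. The sketch assumes that the quantity averaged is r^-3, a choice often made for NOE-derived distances but not stated in the text above; the memory time `tau`, the time step and the function name are hypothetical.

```python
import math


def time_averaged_distance(distances, dt, tau):
    """Exponentially weighted, r**-3 averaged distance at the end of a trajectory.

    distances -- instantaneous distances, oldest first
    dt        -- time between snapshots
    tau       -- memory time of the exponential weighting (same units as dt)
    """
    n = len(distances)
    weighted_sum = 0.0
    weight_total = 0.0
    for index, r in enumerate(distances):
        age = (n - 1 - index) * dt          # how long ago this snapshot was taken
        weight = math.exp(-age / tau)       # recent snapshots count more
        weighted_sum += weight * r ** -3
        weight_total += weight
    return (weighted_sum / weight_total) ** (-1.0 / 3.0)


# A pair that alternates between a close (2 A) and a far (6 A) arrangement.
trajectory = [2.0, 6.0] * 50
r_bar = time_averaged_distance(trajectory, dt=1.0, tau=10.0)
```

For this alternating trajectory `r_bar` lies well below the arithmetic mean of 4 Å, so a restraint with an upper bound near 3 Å could be satisfied on average even though the pair spends half its time far apart. This is how time averaging lets a simulation visit conformations that an instantaneous restraint would forbid.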

Criteria for assessing quality of NMR structures

It is not uncommon in papers reporting NMR structures that the only criteria listed pertaining to structure quality are the number of restraints employed and the atomic RMSD, or variance among a family of structures generated via DG or r-MD. While knowledge of the RMSD may be useful, it is not actually an index of structure accuracy. Rather, it indicates the degree of convergence of a family of NMR structures which have been generated using the experimental restraints, i.e., it can be a useful but imperfect descriptor of precision. Precision and accuracy should be distinguished. Accuracy indicates how closely NMR structures approximate the truth, while an index of precision might be defined as a measure of how well the structures fit the experimental data. It has been pointed out that the RMSD is not necessarily even a very good measure of this precision [82]. The distribution of structures about the average structure is certainly not Gaussian as implied, and the value is highly subject to outlying structures in the distribution. Furthermore, efforts which reduce the RMSD may actually lead to structural errors [44].

There are criteria that can be used to assess the quality of a structure. One is to examine the consistency of a structure with the experimental NMR data. With X-ray crystallography, evaluation of the fit of a derived structure with the original electron densities via a residual index, or R factor, is standard. But an R factor is infrequently reported for NMR protein structures. However, calculated 2D NOE spectral intensities for any proposed molecular structure can be quantitatively compared with experimental intensities [26]. A residual index analogous to the crystallographic R factor, $R_1$, or a sixth-root residual index, $R^x$, can be used; the latter permits longer-distance NOE interactions (e.g., about 4 Å) to contribute to the calculated R factor. R factors are, in principle, dependent on the sign of the errors between the experimental and model structure's intensities, so a related quality factor ($Q_1$) has also been defined [83].

A free R factor, $R_{\mathrm{free}}$, was recently proposed for X-ray and NMR structures [84]. $R_{\mathrm{free}}$ measures the fit to a randomly selected set of test data which are not used in structure refinement, about 10% of the distance restraints for example. This avoids any model bias in calculating the R value. It might be expected that $R_{\mathrm{free}}$ will be larger than $R_1$, since the test data were not used in structure generation. However, discounting noise in the data, if the test data are consistent with the structure determined, $R_{\mathrm{free}}$ should not be too much larger; the exact amount considered acceptable probably awaits more experience with real data. In one particular case of a DNA duplex well-defined by 25 restraints/residue, a test set comprised of 20% of the total distances selected randomly was used [85]. The sixth-root residual index calculated for the test set was 60-70% larger than for those used in structure refinement. This is similar to the few X-ray structures examined via cross-validation to date.

Consistency of an NMR structure with torsion angle restraints may be evaluated via the RMSD between experimental and theoretical coupling constants, $J_{\mathrm{rms}} = \sqrt{(1/N)\sum (J_{\mathrm{exp}} - J_{\mathrm{theor}})^2}$, the summation being carried out over N, all or any subset of the coupling constants [81].
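The figures of merit discussed in this section can be computed in a few lines once experimental and calculated values are paired up. The definitions below are assumptions in the sense that the exact normalization used by each program may differ: $R_1$ is taken as the sum of absolute intensity differences divided by the sum of experimental intensities, $R^x$ applies the same formula to sixth roots of the intensities, and the test-set selection for a free R factor is a plain random split. All function names are hypothetical.

```python
import math
import random


def r_factor(a_exp, a_calc):
    """Crystallographic-style residual: sum|a_exp - a_calc| / sum(a_exp)."""
    return sum(abs(e - c) for e, c in zip(a_exp, a_calc)) / sum(a_exp)


def sixth_root_r_factor(a_exp, a_calc):
    """Same residual on sixth roots, giving weak (long-distance) peaks more say."""
    roots_exp = [e ** (1.0 / 6.0) for e in a_exp]
    roots_calc = [c ** (1.0 / 6.0) for c in a_calc]
    return r_factor(roots_exp, roots_calc)


def j_rms(j_exp, j_theor):
    """RMS deviation between experimental and theoretical coupling constants."""
    n = len(j_exp)
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(j_exp, j_theor)) / n)


def split_test_set(n_restraints, fraction=0.1, seed=0):
    """Randomly set aside a fraction of restraint indices for cross-validation."""
    indices = list(range(n_restraints))
    random.Random(seed).shuffle(indices)
    n_test = int(round(fraction * n_restraints))
    return sorted(indices[n_test:]), sorted(indices[:n_test])   # (working, test)


work, test = split_test_set(50, fraction=0.2)
r1 = r_factor([1.0, 3.0], [2.0, 3.0])
rx = sixth_root_r_factor([1.0, 64.0], [1.0, 1.0])
```

A free R value would then be obtained by evaluating `r_factor` or `sixth_root_r_factor` on the `test` indices only, for a structure refined against the `work` indices.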
One could also calculate all spectral cross-peaks to be expected for any given structural model, for example using SPHINX-LINSHA [86], and compare them with the experimental cross-peaks. This, of course, assumes that a well-parameterized Karplus relationship has been established between the dihedral angles and three-bond coupling constants.

A structure derived from NMR will depend on the number, type, distribution, motional averaging, and accuracy of restraints. The effect of some of these has been evaluated. It has been known for many years that, for protein structure determination, it is more important to have a large number of inaccurate restraints than a small number of accurate restraints. Basically, the holonomic restraints, together with the triangle and tetrangle inequalities, establish this. Analysis of test sets of distance restraints indicated that deviations from the test set models asymptotically approach a limiting value after about 30% of all potential restraints for a protein are available [87]. Tightening error bars on accurate restraints had the same qualitative effect as increasing the number of restraints. Assuming the structure determined for a 56-residue protein from 1058 restraints, including 854 approximate NOE distances, to be completely accurate, certain aspects of the structure generation process have been explored [88••]. While increased accuracy and precision of restraints were demonstrated to increase the accuracy and precision of the structure determined, the major determinant was found to be the number of restraints. Increasing the number of approximate restraints to ≥18 per residue was sufficient to yield a precision and apparent accuracy for the backbone at the threshold level one might expect to reach as set by librational motions. With fewer restraints, accuracy plays a more important role. For a DNA duplex well restrained with an average of 20 relatively accurate distance restraints and five experimental torsion angle restraints per residue, it was found that the precision (and apparent accuracy) of the structure derived via r-MD was degraded to only a small extent as up to 40% of the distance restraints were randomly neglected; but use of fewer restraints than this was decidedly deleterious [85].

There are few cases where different structure refinement methods have been independently applied to the same data set and the resulting structures compared. For a DNA duplex, it was reported that distance geometry structures exhibited most of the salient structural features without resorting to the empirical energy functions used to derive r-MD structures [89]. But experience with proteins, which typically have a greater number of restraints per residue than in most DNA studies, shows that energy refinement is salutary [1-3]. Indeed, a study designed to examine just that question concluded that accuracy was greater with r-MD structures than with those obtained solely using geometry, i.e., distance geometry and a double iterated Kalman filter [87]. A restrained Monte Carlo method, developed and applied to a DNA duplex, found that several different r-MC protocols yielded the same structure (<0.5 Å atomic RMSD) as that obtained with the same experimental data using a traditional r-MD approach [81••].
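The earlier remark that triangle inequalities let many loose restraints define a structure can be illustrated with a small sketch of upper-bound smoothing. It assumes restraints are given as upper distance bounds between atom pairs and tightens every pairwise bound using d(i,j) ≤ d(i,k) + d(k,j). The function name and data layout are hypothetical, and lower-bound and tetrangle smoothing are deliberately left out.

```python
def smooth_upper_bounds(n_atoms, upper):
    """Tighten upper distance bounds with the triangle inequality.

    upper maps an atom-index pair (i, j) to an upper bound on their distance.
    Returns a symmetric n_atoms x n_atoms matrix in which every entry is the
    tightest bound implied by chaining the supplied restraints
    (all-pairs shortest paths over the restraint graph).
    """
    inf = float("inf")
    u = [[0.0 if i == j else inf for j in range(n_atoms)]
         for i in range(n_atoms)]
    for (i, j), bound in upper.items():
        u[i][j] = u[j][i] = min(u[i][j], bound)
    # Floyd-Warshall style relaxation: route each pair through atom k.
    for k in range(n_atoms):
        for i in range(n_atoms):
            for j in range(n_atoms):
                via_k = u[i][k] + u[k][j]
                if via_k < u[i][j]:
                    u[i][j] = via_k
    return u


if __name__ == "__main__":
    # Two loose restraints imply a bound on a pair that was never measured.
    bounds = smooth_upper_bounds(3, {(0, 1): 3.0, (1, 2): 4.0})
    print(bounds[0][2])
```

Each additional restraint, even an imprecise one, can tighten the bounds on many unmeasured pairs at once, which is one way to see why the number of restraints matters so much.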
One might be tempted to consider a crystal structure as a close approximation of the truth. While a crystal structure is definitely valuable for comparison, there may be real differences between solution and crystal structures, not to mention that a crystal structure has its own limitations in accuracy. While most protein structures which can be compared yield very similar (if not identical) structures, there are cases where structures in solution and in the solid state apparently differ significantly [1,3,90,91]. It is quite likely that DNA crystal and solution structures will be less in accord than those of proteins. Most DNA duplexes crystallize in the A-form conformation, which is obtained in solution only with very high salt and DNA concentrations. Some DNA duplexes do crystallize in B-form, but crystal packing forces apparently play an important role in establishing conformation; crystallizing the same duplex in two different crystal forms caused significant changes in structural features [92]. There are few high-resolution NMR structures of duplex DNA which can be compared with a crystal structure. One exhibited some features in common with the X-ray structure, but also demonstrated some variations [93]. A similar finding was reported for a DNA quadruplex [94].

Conclusion Significant advances in computational strategies continue. These advances will increase the throughput in NMR structure labs. This in turn will contribute in the near future to the rate at which structures are determined via NMR: carbohydrate and RNA structures as well as proteins and DNA. Simultaneously, there will be some structures determined with greater precision and possibly greater accuracy.

An Automatable Procedure Based on 3D TOCSY-TOCSY and 3D TOCSY-NOESY. Biopolymers 1991, 31:699-712. 12.

Computational Aspects of the Study of Biological Macromolecules by Nuclear Magnetic Resonance Spectroscopy. Edited by Hoch JC, Poulsen FM, Redfield C. New York: Plenum Press; 1991:291-302. 13.

ECCLES C, GÜNTERT P, BILLETER M, WÜTHRICH K: Efficient Analysis of Protein 2D NMR Spectra Using the Software Package EASY. J Biomol NMR 1991, 1:111-130.

14.

MALLIAVIN TE, ROUH A, DELSUC MA, LALLEMAND JY: Approche Directe de la Détermination de Structures Moléculaire à Partir de l'Effet Overhauser Nucléaire [in French]. Chim Théorique 1992, 315 (Ser II):653-659.

15.

OSHIRO CM, KUNTZ ID: Application of Distance Geometry to the Proton Assignment Problem. Biopolymers 1993, 33:107-115.

16.

BERNSTEIN R, SCHNOCHEL A, CIESLAR C, HOLAK TA: High-Dimensional Potential for the Assignment of Ambiguous NOE Connectivities. J Magn Reson 1993, 102 [series B]:116-119.

17.

YU C, HWANG JF, CHEN TB, SOO VW: RUBIDIUM, a Program for Computer-Aided Assignment of Two-Dimensional NMR Spectra of Polypeptides. J Chem Inf Comp Sci 1992, 32:183-187.

18.

WEHRENS R, LUCASIUS C, BUYDENS L, KATEMAN G: HIPS, a Hybrid Self-Adapting Expert System for Nuclear Magnetic Resonance Spectrum Interpretation Using Genetic Algorithms. Anal Chim Acta 1993, 277:313-324.

19.

KUNTZ ID, KOSEN PA, CRAIG EC: Amide Chemical Shifts in Many Helices in Peptides and Proteins are Periodic. J Am Chem Soc 1991, 113:1406-1408.

20.

WISHART DS, SYKES BD, RICHARDS FM: Relationship Between Nuclear Magnetic Resonance Chemical Shift and Protein Secondary Structure. J Mol Biol 1991, 222:311-333.

21.

OSAPAY K, CASE DA: A New Analysis of Proton Chemical Shifts in Proteins. J Am Chem Soc 1991, 113:9436-9444.

Acknowledgements This work was supported by National Institutes of Health grants GM 41639, GM 39247 and RR01695.

References and recommended reading Papers of particular interest, published within the annual period of review, have been highlighted as: • of special interest •• of outstanding interest 1.

WAGNER G, HYBERTS SG, HAVEL TF: NMR Structure Determination in Solution: A Critique and Comparison with X-Ray Crystallography. Annu Rev Biophys Biomol Struct 1992, 21:167-198.

2.

WAGNER G: Prospects for NMR of Large Proteins. J Biomol NMR 1993, 3:375-385.

3.

JAMES TL, BASUS VJ: Generation of High-Resolution Protein Structures in Solution from Multidimensional NMR. Annu Rev Phys Chem 1991, 42:501-542.

4.

FEIGON J, SKLENÁŘ V, WANG E, GILBERT DE, MACAYA RF, SCHULTZE P: 1H NMR Spectroscopy of DNA. In Methods in Enzymology, DNA Structures, Part A: Synthesis and Physical Analysis of DNA vol 211. Edited by Lilley DMJ, Dahlberg JE. New York: Academic Press; 1992:235-253. 5.

WIJMENGA SS, MOOREN MMW, HILBERS CW: NMR of Nucleic Acids; from Spectrum to Structure. In NMR in Macromolecules. Edited by Roberts GC. Oxford: IRL Press; 1993:217-288.

6.

HAVEL TF: An Evaluation of Computational Strategies for Use in the Determination of Protein Structure from Distance Constraints Obtained by Nuclear Magnetic Resonance. Prog Biophys Mol Biol 1991, 56:43-78.

7.

HOCH JC, POULSEN FM, REDFIELD C (EDS): Computational Aspects of the Study of Biological Macromolecules by Nuclear Magnetic Resonance Spectroscopy. New York: Plenum

DE DIOS AC, PEARSON JG, OLDFIELD E: Secondary and Tertiary Structural Effects on Protein NMR Chemical Shifts: An Ab Initio Approach. Science 1993, 260:1491-1496. 1H, 15N, 13C and 19F chemical shifts in proteins are calculated via an ab initio approach. The results suggest that the φ and ψ backbone torsion angles, as well as the sidechain torsion angle χ1, could be estimated from the chemical shifts. 22. •

23.

KEEPERS JW, JAMES TL: A Theoretical Study of Distance Determination from NMR. Two-Dimensional Nuclear Overhauser Effect Spectra. J Magn Reson 1988, 57:404-426.

24.

BORGIAS BA, JAMES TL: COMATOSE: A Method for Constrained Refinement of Macromolecular Structure Based on Two-Dimensional Nuclear Overhauser Effect Spectra. J Magn Reson 1988, 79:493-512.

25.

JAMES TL: Relaxation Matrix Analysis of 2D NOE Spectra to Obtain Accurate Biomolecular Structural Constraints and to Assess Structural Quality. Curr Opin Struct Biol 1991, 1:1042-1053.

26.

POST CB, MEADOWS RP, GORENSTEIN DG: On the Evaluation of Interproton Distances for Three-Dimensional Structure Determination by NMR Using a Relaxation Rate Matrix Analysis. J Am Chem Soc 1990, 112:6796-6803.

27.

BOELENS R, KONING TMG, KAPTEIN R: Determination of Biomolecular Structures from Proton-Proton NOE's Using a Relaxation Matrix Approach. J Mol Struct 1988, 173:299-311.

28.

BORGIAS BA, JAMES TL: MARDIGRAS - A Procedure for Matrix Analysis of Relaxation for Discerning Geometry of an Aqueous Structure. J Magn Reson 1990, 87:475-487.

Press; 1991. 8.

9.

10.

BERNSTEIN R, CIESLAR C, ROSS A, OSCHKINAT H, FREUND J, HOLAK TA: Computer-Assisted Assignment of Multidimensional NMR Spectra of Proteins: Application to 3D NOESY-HMQC and TOCSY-HMQC Spectra. J Biomol NMR 1993, 3:245-251.

POWERS R, GARRETT DS, MARCH CJ, FRIEDEN EA, GRONENBORN AM, CLORE GM: 1H, 15N, 13C and 13CO Assignments of Human Interleukin-4 Using Three-Dimensional Double- and Triple-Resonance Heteronuclear Magnetic Resonance Spectroscopy. Biochemistry 1992, 31:4334-4346.

OSCHKINAT H, HOLAK TA, CIESLAR C: Assignment of Protein NMR Spectra in the Light of Homonuclear 3D Spectroscopy:

KJAER M, ANDERSEN KV, LUDVIGSEN S, SHEN H, WINDEKILDE D, SØRENSEN B, POULSEN FM: Outline of a Computer Program for the Analysis of Protein NMR Spectra. In


29.

KOEHL P, LEFÈVRE JF: The Reconstruction of the Relaxation Matrix from an Incomplete Set of Nuclear Overhauser Effects. J Magn Reson 1990, 86:565-583.

46.

LANDIS C, ALLURED MS: Elucidation of Solution Structures by Conformer Population Analysis of NOE Data. J Am Chem Soc 1991, 113:9493-9499.

30.

MADRID M, LLINAS E, LLINAS M: Model-Independent Refinement of Interproton Distances Generated from 1H-NMR Overhauser Intensities. J Magn Reson 1991, 93:329-346.

47.

31.

VAN DE VEN FJM, BLOMMERS MJJ, SCHOUTEN RE, HILBERS CW: Calculation of Interproton Distances from NOE Intensities. A Relaxation Matrix Approach without Requirement of a Molecular Model. J Magn Reson 1991, 94:140-151.

SCHMITZ U, KUMAR A, JAMES TL: Dynamic Interpretation of NMR Structural Data: Molecular Dynamics with Weighted Time-Averaged Restraints and Ensemble R-Factor. J Am Chem Soc 1992, 114:10654-10656.

32.

KIM SG, REID BR: Automated NMR Structure Refinement via NOE Peak Volumes. Application to a Dodecamer DNA Duplex. J Magn Reson 1992, 100:383-390.

33.

NIBEDITA R, KUMAR RA, MAJUMDAR A, HOSUR RV: Simulation of NOESY Spectra of DNA Segments: A New Scaling Procedure for Iterative Comparison of Calculated and Experimental NOE Intensities. J Biomol NMR 1992, 2:467-476.

34.

SUGAR IP, XU Y: Computer Simulation of 2D-NMR (NOESY) Spectra and Polypeptide Structure Determination. Prog Biophys Mol Biol 1992, 58:61-84.

35.

KONING TMG, BOELENS R, KAPTEIN R: Calculation of the Nuclear Overhauser Effect and the Determination of Proton-Proton Distances in the Presence of Internal Motions. J Magn Reson 1990, 90:111-123.

36.

LIU H, THOMAS PD, JAMES TL: Averaging of Cross-Relaxation Rates and Distances for Methyl, Methylene and Aromatic Ring Protons Due to Motion or Overlap: Extraction of Accurate Distances Iteratively via Relaxation Matrix Analysis of 2D NOE Spectra. J Magn Reson 1992, 98:163-175.

48. ••

SCHMITZ U, ULYANOV NB, KUMAR A, JAMES TL: Molecular Dynamics with Weighted Time-Averaged Restraints of a DNA Octamer: Dynamic Interpretation of NMR Data. J Mol Biol 1993, 234:373-389. Molecular dynamics calculations using exponentially weighted, time-averaged restraints yielded an ensemble of related DNA structures which fit both the experimental NOE and scalar coupling data better than any of the structures resulting from traditional rMD calculations which require all restraints to be met at each step in the MD trajectory. 49.

BONVIN AMJJ, RULLMANN JAC, LAMERICHS RMJN, BOELENS R, KAPTEIN R: 'Ensemble' Iterative Relaxation Matrix Approach: a New NMR Refinement Protocol Applied to the Solution Structure of Crambin. Proteins 1993, 15:385-400. The IRMA procedure is modified to deal with a set of protein structures generated via distance geometry. The relaxation matrix is calculated as due to the ensemble of structures. Refinement of the relaxation matrix and the structures proceeds as all structures in the ensemble are subjected to rMD in parallel. ••

50.

BAZZO R, EDGE CJ, WORMALD MR, RADEMACHER TW, DWEK RA: Full Simulation of ROESY, Including the Hartmann-Hahn Effects. Chem Phys Lett 1990, 174:313-319.

51.

KUMAR A, JAMES TL, LEVY GC: Macromolecule Structure Refinement by Relaxation Matrix Analysis of 2D NOE Spectra: Effect of Differential Internal Motion on Internuclear Distance Determination in Proteins. Israel J Chem 1992, 32:257-261.

BAUER CJ, FRENKIEL TA, LANE AN: A Comparison of the ROESY and NOESY Experiments for Large Molecules, with Application to Nucleic Acids. J Magn Reson 1990, 87:144-152.

52.

LEEFLANG BR, KROON-BATENBURG LMJ: CROSREL: Full Relaxation Matrix Analysis of NOESY and ROESY NMR Spectroscopy. J Biomol NMR 1992, 2:495-518.

38.

POST CB: Internal Motional Averaging and Three-Dimensional Structure Determination by Nuclear Magnetic Resonance. J Mol Biol 1992, 224:1087-1101.

53. •

39.

LIU H, KUMAR A, WEISZ K, SCHMITZ U, BISHOP KD, JAMES TL: Extracting Accurate Distances and Bounds from 2D NOE Exchangeable Proton Peaks. J Am Chem Soc 1993, 115:1590-1591.

40.

MIRAU PA: A Strategy for NMR Structure Determination. J Magn Reson 1992, 96:480-490.

41.

MALLIAVIN TE, DELSUC MA, LALLEMAND JY: Computation of Relaxation Matrix Elements from Incomplete NOESY Data Sets. J Biomol NMR 1992, 2:349-360.

42.

FORSTER MJ: Comparison of Computational Methods for Simulating Nuclear Overhauser Effects in NMR Spectroscopy. J Comput Chem 1991, 12:292-300.

37.

LAI X, CHEN C, ANDERSEN NH: The DISCON Algorithm, an Accurate and Robust Alternative to an Eigenvalue Solution for Extracting Experimental Distances from NOESY Data. J Magn Reson 1993, 101 [series B]:271-288. Secondary contributions to cross-relaxation are approximated in an iterative fashion, enabling interproton distances to be calculated. This procedure may be particularly useful for smaller molecular systems where autopeaks are resolved and intensities can be quantitated.

KROON-BATENBURG LMJ, KROON J, LEEFLANG BR, VLIEGENTHART JFG: Conformational Analysis of Methyl β-Cellobioside by ROESY NMR Spectroscopy and MD Simulations in Combination with the CROSREL Method. Carbohydrate Res 1993, 245:21-42. MD simulations yielded a set of structures, the resulting ensemble being used to calculate a ROESY spectrum which can be compared with experimental ROESY data. 54.

GÜNTERT P, BRAUN W, BILLETER M, WÜTHRICH K: Automated Stereospecific 1H NMR Assignments and Their Impact on the Precision of Protein Structure Determinations in Solution. J Am Chem Soc 1989, 111:3997-4004.

55.

NILGES M, CLORE GM, GRONENBORN AM: Biopolymers 1990, 29:813-822.

56.

DZAKULA Z, WESTLER WM, EDISON AS, MARKLEY JL: The CUPID Method for Calculating the Continuous Probability Distribution of Rotamers from NMR Data. J Am Chem Soc 1992, 114:6195-6199.

57.

DZAKULA Z, EDISON AS, WESTLER WM, MARKLEY JL: Analysis of χ1 Rotamer Populations from NMR Data by the CUPID Method. J Am Chem Soc 1992, 114:6200-6207.

43. •

44.

THOMAS PD, BASUS VJ, JAMES TL: Protein Solution Structure Determination Using Distances from 2D NOE Experiments: Effect of Approximations on the Accuracy of Derived Structures. Proc Natl Acad Sci USA 1991, 88:1237-1241.

45.

MUJEEB A, KERWIN SM, EGAN W, KENYON GL, JAMES TL: Solution Structure of a Conserved DNA Sequence from the HIV-1 Genome: Restrained Molecular Dynamics Simulation with Distance and Torsion Angle Restraints Derived from 2D NMR Spectra. Biochemistry 1993, 32:13419-13431.

58. •

HARBISON GS: Interference between J-Couplings and Cross-Relaxation in Solution NMR Spectroscopy: Consequences for Macromolecular Structure Determination. J Am Chem Soc 1993, 115:3026-3027. Interaction between coherent scalar coupling and incoherent dipolar coupling is typically considered negligible. However, as shown here, the extent of interaction may be measurable in the case of biopolymers. Ignoring this could lead to some error in coupling constant values derived from NMR spectra. 59.

KUSZEWSKI J, NILGES M, BRÜNGER AT: Sampling and Efficacy of Metric Matrix Distance Geometry: A Novel Partial Metrization Algorithm. J Biomol NMR 1992, 2:33-56.

60.

NAKAI T, KIDERA A, NAKAMURA H: Intrinsic Nature of the Three-Dimensional Structure of Proteins as Determined by Distance Geometry with Good Sampling Properties. J Biomol NMR 1993, 3:19-40. The program EMBOSS and its applications are described. EMBOSS utilizes a metric matrix approach followed by optimization via simulated annealing in four-dimensional space. Good sampling of conformational space together with good convergence is achieved. •

61.

GÜNTERT P, WÜTHRICH K: Improved Efficacy of Protein Structure Calculations from NMR Data Using the Program DIANA with Redundant Dihedral Angle Constraints. J Biomol NMR 1991, 1:447-456.

62.

BRÜNGER AT, KARPLUS M: Molecular Dynamics Simulations with Experimental Restraints. Acc Chem Res 1991, 24:54-61.

63.

VAN GUNSTEREN WF: Molecular Dynamics Studies of Proteins. Curr Opin Struct Biol 1993, 3:277-281.

64.

74. ••

TORDA AE, BRUNNE RM, HUBER T, KESSLER H, VAN GUNSTEREN WF: Structure Refinement Using Time-Averaged J-Coupling Constant Restraints. J Biomol NMR 1993, 3:55-66.

SHERMAN SA, JOHNSON ME: Derivation of Locally Accurate Spatial Protein Structure from NMR Data. Prog Biophys Mol Biol 1993, 59:285-339. A probabilistic method, termed FISINOE, of determining a protein structure by building up the structure from local structural elements constructed to be consistent with the NMR data is described. The method appears to be quite efficient. 75. ••

76.

KOEHL P, LEFÈVRE JF, JARDETZKY O: Computing the Geometry of a Molecule in Dihedral Angle Space Using NMR-derived Constraints. J Mol Biol 1992, 223:299-315.

STAWARZ B, GENEST M, GENEST D: A New Constraint Potential for the Structure Refinement of Biomolecules in Solution Using Experimental Nuclear Overhauser Effect Intensity. Biopolymers 1992, 32:633-642.

77.

HOFFMAN RC, XU RX, KLEVIT RE, HERRIOTT JR: A Simple Method for the Refinement of Models Derived from NMR Data Demonstrated on a Zinc Finger Domain from Yeast ADR. J Magn Reson 1993, 102 [series B]:61-72.

65.

YIP P, CASE DA: A New Method for Refinement of Macromolecular Structures Based on Nuclear Overhauser Effect Spectra. J Magn Reson 1989, 83:643-648.

78.

66.

GIPPERT GP, YIP PF, WRIGHT PE, CASE DA: Computational Methods for Determining Protein Structures from NMR Data. Biochem Pharmacol 1990, 40:15-22.

VAN SCHAIK RC, VAN GUNSTEREN WF, BERENDSEN HJC: Conformational Search by Potential Energy Annealing: Algorithm and Application to Cyclosporin A. J Comp Aided Mol Design 1992, 6:97-112.

79.

LEVY RM, BASSOLINO DA, KITCHEN DB, PARDI A: Solution Structures of Proteins from NMR Data and Modeling: Alternative Folds for Neutrophil Peptide 5. Biochemistry 1989, 28:9361-9372.

80.

RIPOLL DR, NI F: Refinement of the Thrombin-Bound Structure of a Hirudin Peptide by a Restrained Electrostatically Driven Monte Carlo Method. Biopolymers 1992, 32:359-365.

DELLWO MJ, WAND AJ: Computationally Efficient Gradients for Relaxation Matrix-Based Structure Refinement Including the Accommodation of Internal Motions. J Biomol NMR 1993, 3:205-214. Structure refinement via iterative fitting of relaxation matrix-calculated 2D NOE intensities entails calculation of the gradient of the matrix. This can be more efficient if distances, rather than coordinates, are used. This formulation also permits internal motions to be readily incorporated. •

67. •

68. • YIP P: A Computationally Efficient Method for Evaluating the Gradient of 2D. J Biomol NMR 1993, 3:361-365. NOE intensities, rather than distances derived from them, can be incorporated directly into r-MD calculations. A computationally demanding aspect of this is the calculation of the gradient of the intensity matrix via a relaxation matrix method during the course of the MD trajectory. This paper provides a method of speeding up this calculation. 69. •

NESTEROVA EN, CHUPRINA VP: An Efficient Method of Calculating Analytical Derivatives for Direct NOE Refinement of Macromolecular Structures. J Magn Reson 1993, 10 [series B]:94-96. For direct refinement against NOE intensities in r-MD protocols, calculation of the gradient of the intensity matrix via a relaxation matrix method has proven to be very time-consuming. This paper provides a method in which the calculation of the gradient requires less than twice the amount of time required to calculate the NOE intensities alone. 70.

HABAZETTL J, SCHLEICHER M, OTLEWSKI J, HOLAK TA: Homonuclear Three-Dimensional NOE-NOE Nuclear Magnetic Resonance Spectra for Structure Determination of Proteins in Solution. J Mol Biol 1992, 228:156-169.

71.

MIERKE DF, KESSLER H: Combined Use of Homonuclear and Heteronuclear Coupling Constants as Restraints in Molecular Dynamics Simulations. Biopolymers 1992, 32:1277-1282.

72.

MIERKE DF, GRDADOLNIK SG, KESSLER H: Use of One-Bond Cα-Hα Coupling Constants as Restraints in MD Simulations. J Am Chem Soc 1992, 114:8283-8284.

73.

TORDA AE, SCHEEK RM, VAN GUNSTEREN WF: Time-Average Nuclear Overhauser Effect Distance Restraints Applied to Tendamistat. J Mol Biol 1990, 214:223-235.

ULYANOV NB, SCHMITZ U, JAMES TL: Metropolis Monte Carlo Calculations of DNA Structure Using Internal Coordinates and NMR Distance Restraints: An Alternative Method for Generating High-Resolution Solution Structure. J Biomol NMR 1993, 3:547-568. An r-MC method was shown to be efficient and to yield the same DNA duplex structure (<0.5 Å atomic RMSD) as a traditional r-MD approach, regardless of the extant protocol employed. The r-MC methods did not use torsion angle restraints while the r-MD calculations did. 81. ••

82. ••

SHRIVER J, EDMONDSON S: Defining the Precision with Which a Protein Structure is Determined by NMR. Application to Motilin. Biochemistry 1993, 32:1610-1617. A method of accurately defining the precision of an NMR structure is described. The bias in the conventional method is discussed. 83.

WITHKA JM, SRINIVASAN J, BOLTON PH: Problems with, and Alternatives to, the NMR R Factor. J Magn Reson 1992, 98:611-617.

84. ••

BRÜNGER AT, CLORE GM, GRONENBORN AM, SAFFRICH R, NILGES M: Assessing the Quality of Solution Nuclear Magnetic Resonance Structures by Complete Cross-Validation. Science 1993, 261:328-331. Use of a free R-factor to provide an unbiased assessment of the fit of the final NMR structures to the experimental data is described. Complete cross-validation for a protein is performed by randomly partitioning the experimental 2D NOE data into ten test sets with the R-factor being calculated independently for all ten. 85. ••

WEISZ K, SHAFER RH, EGAN W, JAMES TL: Solution Structure of the Octamer Motif in Immunoglobulin Genes via Restrained Molecular Dynamics Calculations. Biochemistry 1994, 33:345-366. The structure of a DNA decamer duplex was determined using 398 distance restraints, determined via complete relaxation matrix analysis with MARDIGRAS, and 100 sugar torsion angle restraints in an r-MD protocol. The well-defined structure permitted analysis of the effects of using subsets of restraints.


86.

WIDMER H, WÜTHRICH K: Simulation of Two-Dimensional NMR Experiments Using Numerical Density Matrix Calculations. J Magn Reson 1986, 70:270-279.

87.

LIU Y, ZHAO D, ALTMAN RB, JARDETZKY O: A Systematic Comparison of Three Structure Determination Methods from NMR Data: Dependence upon Quality and Quantity of Data. J Biomol NMR 1992, 2:373-388.

88. ••

CLORE GM, ROBIEN MA, GRONENBORN AM: Exploring the Limits of Precision and Accuracy of Protein Structures Determined by Nuclear Magnetic Resonance Spectroscopy. J Mol Biol 1993, 231:82-102. Assuming the structure of a well-restrained protein determined with an average of 15 approximate distance restraints and 3.6 other restraints per residue to be the correct structure, the effect of the number, precision, and accuracy of restraints on the final structure was ascertained. The number of restraints was established as the major determinant of both precision and accuracy. The description of nonbonded contacts is also shown to have a substantial influence on the precision and accuracy of the structure determined. 89.

CHENG JW, CHOU SH, SALAZAR M, REID BR: Solution Structure of [d(GCGTATACGC)] 2. J Mol Biol 1992, 228:118-137.

90.

GOUDA H, TORIGOE H, SAITO A, SATO M, ARATA Y, SHIMADA I: Three-Dimensional Solution Structure of the B Domain of Staphylococcal Protein A: Comparisons of the Solution and Crystal Structures. Biochemistry 1992, 31:9665-9672. 91.

92.

BANCI L, BERTINI I, CARLONI P, LUCHINAT C, ORIOLI PL: Molecular Dynamics Simulations on HiPIP from Chromatium vinosum and Comparison with NMR Data. J Am Chem Soc 1992, 114:10683-10689.

LIPANOV A, KOPKA ML, KACZOR-GRZESKOWIAK M, QUINTANA J, DICKERSON RE: Structure of the B-DNA Decamer C-C-A-A-C-I-T-T-G-G in Two Different Space Groups: Conformational Flexibility of B-DNA. Biochemistry 1993, 32:1373-1389. 93.

NIKONOWICZ E, GORENSTEIN DG: Comparison of the X-ray Crystal, NMR Solution, and Molecular Dynamics Calculated Structures of a Tandem G-A Mismatched Oligonucleotide Duplex. J Am Chem Soc 1992, 114:7494-7503.

94.

SMITH FW, FEIGON J: Strand Orientation in the DNA Quadruplex Formed from the Oxytricha Telomere Repeat Oligonucleotide d(G4T4G4) in Solution. Biochemistry 1993, 32:8682-8692.

TL James, Department of Pharmaceutical Chemistry, University of California, San Francisco, California 94143-0446, USA.