JOURNAL OF MAGNETIC RESONANCE, ARTICLE NO.
Series B 113, 248–251 (1996)
0183
NOTES Application of a Three-Dimensional Maximum-Entropy Method to Processing Sections of Three-Dimensional NMR Spectra GUANG ZHU Department of Biochemistry, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong Received July 8, 1996; revised September 13, 1996
Multidimensional NMR has become an established method for determining the structure of proteins. The timedomain data recorded in such experiments is frequently severely truncated in the indirectly detected dimensions. One of the approaches widely used for minimizing the resulting truncation effects in the corresponding Fourier-transformed spectra and for optimizing the resolution relies on the maximum-entropy method (MEM) (1–15). The application of maximum-entropy methods to onedimensional and two-dimensional (1–13) NMR data has been discussed in great detail elsewhere. Below, the discussion will be limited to the details pertinent to our implementation of the algorithm for the processing of 3D NMR spectra. Although there is no fundamental difference in the application of MEM to 1D, 2D, and 3D NMR data, the size of the data matrix that needs to be considered rapidly increases with the dimensionality of the data set, resulting in a rapid decrease in the stability of the algorithm and a steep increase in the computational time needed for MEM processing. The 3D MEM is superior to the 1D and 2D MEM in processing 3D and 4D NMR data because the accumulated computational error can be greatly reduced. Analogous to the solution used by Mazzeo et al. (9) for processing of large 1D data sets, we propose to apply 3D MEM to small but crowded 3D sections of 3D or 4D Fourier-transformed spectra. Our implementation of 3D MEM is based on the algorithm of Gull and Daniell (2), and the application of this algorithm has been previously discussed in detail by Delsuc (12). Given a 3D experimental FID of noisy Nt 1 1 Nt 2 1 Nt 3 hypercomplex data points Dijk , where Ntk is the number of complex data points in the kth dimension, one seeks a frequency-domain spectrum F( v1 , v2 , v3 ) consisting of Nf 1 1 Nf 2 1 Nf 3 real points Flmn with l Å 1, . . . , Nf 1 ; m Å 1, . . . , Nf 2 ; n Å 1, . . . , Nf 3 and Nf k § 2Ntk (16) such that the following criteria are satisfied. (a) The spectral entropy S is maximized with S being defined as
1064-1866/96 $18.00 Copyright q 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
AID
JMRB 7487
/
6o14$$$$21
N f 1, N f 2, N f 3
∑
SÅ0
Flmn ú 0.
Flmn ln Flmn
(b) The square difference Nt 1, Nt 2, Nt 3
∑
2
x Å
i Å1, jÅ1,k Å1
(Dijk 0 Rijk ) 2 s 2ijk
[2]
is converged to 2Nt 1 1 2Nt 2 1 2Nt 3 , where Rijk is the inverse Fourier transformation of the MEM spectrum F( v1 , v2 , v3 ) and sijk is the standard deviation of the time-domain noise at the coordinate (i, j, k). The noise amplitude, sijk , is commonly considered to be independent of i, j, and k. The solution of the above constrained least-squares problem is obtained when the MEM spectrum F( v1 , v2 , v3 ) maximizes Q Å S 0 lx 2 by choosing the Lagrangian multiplier l, such that x 2 É 2Nt 1 1 2Nt 2 1 2Nt 3 . At the solution point, ÇQ Å 0, this corresponds to (2–13)
H F
N f 1, N f 2, N f 3
Flmn Å A exp 01 / 2l
1
∑
T lmn ijk
i Å1, jÅ1,k Å1
(Dijk 0 Rijk )Wijk s 2ijk
GJ
,
[3]
where Wijk Å
H
1 i £ Nt 1 , j £ Nt 2 , k £ Nt 3 0 Nt 1 õ i £ Nf 1 , Nt 2 õ j £ Nf 2 , Nt 3 õ k £ Nf 3
and T lmn ijk , the transformation matrix, satisfies the equation Ìx 2 Å2 ÌFlmn
N f 1, N f 2, N f 3
∑
i Å1, jÅ1,k Å1
T lmn ijk
F
(Dijk 0 Rijk )Wijk s 2ijk
248
11-07-96 15:39:56
[1]
l Å1,m Å1,n Å1
magbas
AP: Mag Res, Series B
G
.
[4]
249
NOTES
The introduction of the constant A is required when a constraint on the total power is incorporated into the MEM algorithm (4). The 3D transformation, T lmn ijk , expressed in Eq. [4] can be substituted by 3D Fourier transformation. The optimal l of each iteration is calculated by recognizing that at the solution point the following relation (12) is true: Ç SrÇQ Å 0. ÉÇ SÉ
[5]
To obtain the solution of Eq. [3], an iterative method is used (2). Beginning with a flat, featureless spectrum of amplitude A, the iterative method substitutes the Rijk obtained from the inverse Fourier transformation of the nth iteration F n ( v1 , v2 , v3 ) on the right hand-side of Eq. [3] to obtain the solution of the (n / 1)-th iteration F n/ 1 ( v1 , v2 , v3 ). Due to the exponential form of Eq. [3], this process is very unstable. Several methods have been proposed to stabilize this procedure. In our implementation, we use the so called fixed-point method (4, 12) to slow down the iterative procedure in the manner detailed below. /1 ( v1 , v2 , v3 ), is calculated First, the new spectrum, F nnew as a linear combination of the result of the previous iteration F nnew ( v1 , v2 , v3 ) and the outcome of Eq. [3], F n /1 ( v1 , v2 , v3 ):
hypercomplex data expressed in Eq. [7] is defined as a sum of the products of the corresponding components. However, we want to slow down the speed of iteration and use the same schemes as in Eq. [6] as /1 A nnew Å (1 0 b )A nnew / bA n /1 ,
/1 where A n /1 is calculated from Eq. [7] and A nnew is the actual amplitude used in the (n / 1)-th iteration. The factor b is a constant given by the user, controlling how much the /1 calculated A n /1 is used for the actual amplitude A nnew . The 0 initial amplitude A is given by
A0 Å
n t 1, Nt 2, Nt 3 ( Ni Å1, jÅ1,k Å1 É DijkrR ijk É An, Nt 1, Nt 2, Nt 3 n ( i Å1, jÅ1,k Å1 É R ijk É2
[6]
[7]
where R n is the trial FID of the nth iteration and É É is the norm of a complex number. The dot product of two
AID
JMRB 7487
/
6o14$$$$22
É DijkÉ . Nt 1 Nt 2 Nt 3
11-07-96 15:39:56
[9]
Wu’s correction for obtaining the MEM spectrum of each iteration is given (4) as /1 F nnew Å
1 / ( lNF nnew / s 2 ) n /1 F . 1 / ( lNF n /1 / s 2 )
[10]
The l calculated from Eq. [5] leads to rapid convergence to the solution point. To further stabilize the iterative algorithm, the l used for each iteration is calculated by /1 lnnew Å (1 0 z ) lnnew / zl n /1 ,
The factor a is used to control how much of the calculated spectrum F n /1 ( v1 , v2 , v3 ) is used to construct the new /1 spectrum F nnew ( v1 , v2 , v3 ). The factor a can be calculated (12) or given as a constant. A similar approach can also be applied to compute A and l. Further improvement on the stability of the iterative algorithm is also essential to obtain the MEM spectrum. In our current version of the 3D maximum-entropy algorithm, we use Wu’s method (4) to obtain optimal A, the amplitude of the MEM spectrum described in Eq. [3], and optimal mixing of the calculated spectrum F n /1 ( v1 , v2 , v3 ) and the MEM spectrum F nnew ( v1 , v2 , v3 ) of the nth iteration for obtaining /1 the MEM spectrum F nnew ( v1 , v2 , v3 ). The optimal A n /1 of the (n / 1)-th iteration can be computed based on the equation A n /1 Å
max 1£i£Nt 1,1£j £Nt 2,1£k£Nt 3
/1 F nnew ( v1 , v2 , v3 ) Å (1 0 a )F nnew ( v1 , v2 , v3 )
/ aF n /1 ( v1 , v2 , v3 ).
[8]
[11]
/1 is the l used for the (n / 1)-th iteration and where lnnew n /1 l is calculated from Eq. [5]. l 0 is the small constant (1.0 1 10 04 ). z determines the speed of the convergence and in our experience its value is 0.01–0.2. In addition, if the MEM spectrum of the nth iteration is scaled by a constant a, i.e., F nnew Å F n /a, the speed of convergence can also be controlled by varying the a. The iterative process is stopped when the x 2 no longer improves more than a given percentage or x 2 approaches 2Nt 1 1 2Nt 2 1 2Nt 3 . One can also monitor the convergence by observing the angle between Ç S and Çx 2 , because at the solution point they are antiparallel to each other. The iterative method described above ensures rapid convergence of the MEM spectra. In order to extract and expand a 3D section of a 3D spectrum, it is necessary to apply window functions along the truncated dimensions to obtain the Fourier-transformed spectrum before it is processed by the 3D MEM. This is required so that truncation effects do not distribute sinc wiggles outside the region being analyzed. If this is not done, the reconstruction of the MEM spectrum will be distorted as the information of the peaks in the sinc wiggles is lost. Applying window functions concentrates the information of a peak into its center, guaranteeing minimal loss of information when the zoom region of a 3D FT spec-
magbas
AP: Mag Res, Series B
250
NOTES
FIG. 1. (A) The 2D section of a 3D zoomed region selected from the Fourier-transformed 3D HCCH–TOCSY spectrum of the protein calmodulin and taken at 45.8 ppm in the carbon dimension. The original 3D matrix size of the time-domain data is 384*(H) 1 64*(H) 1 24*(C). A sine-squaredbell window function was applied to the F3 dimension, and sine-bell window functions were also applied to the F1 and F2 dimensions. After zero filling and Fourier transformation, a final spectrum with the size of 1024 1 256 1 64 was obtained. The size of this section of the 3D spectrum is 64 1 32 1 32. The 1D Z trace was taken from the black dot shown on this 2D slice. (B) The MEM spectrum obtained by expanding a region of the Fouriertransformed 3D HCCH–TOCSY spectrum. After applying Hilbert and inverse Fourier transformations, the corresponding time-domain data with the size of 24* 1 8* 1 12* were obtained. The final size of the MEM spectrum is 64 1 64 1 64. The total of 14 iterations required 20 min. computing time when our C program was run on a SUN SPARC 10, equipped with 128 Mbytes RAM. Deconvolutions on J couplings and window functions were also applied. The resulting 3D MEM spectrum shows great improvement in resolution in all three dimensions when compared with the original spectrum in (A). The Z trace was taken from the black dot shown on this 2-D slice.
trum to be expanded by the MEM method is selected. The line broadening due to the application of window functions can be readily deconvolved with the MEM method (5) by applying the same window function to the exponent of Eq. [3]. If the selection of the analyzed spectral region includes a border peak, the final MEM spectra will have a folded portion of this resonance peak due to the use of Fourier transformation in the MEM algorithm. It is important to notice that when the proposed procedure for selecting the spectral region is applied, the noise distribution is no longer constant. The distribution of the noise is modulated according to the apodization functions used. Therefore, in theory, the modulated si should be applied in Eq. [3]. In practice, less steep apodization functions are employed to prevent the exponent of that equation being divided by a very small number or zero. If this is not necessary, the same window function can be applied. If si is used as a constant, only a portion of the original 3D FID, where the original window functions assume large values, is fitted most effectively by MEM and the performance of the MEM algorithm is less optimal. It may also introduce a small peak between two close resonances in some cases.
AID
JMRB 7487
/
6o14$$$$23
11-07-96 15:39:56
The essential procedure for obtaining the MEM spectrum consists of the following steps: (i) applying window functions, zero padding, and phase correction to experimental 3D FID to obtain the phased FT 3D spectrum; (ii) extracting the zoom region to be expanded; (iii) using Hilbert and inverse Fourier transformations to obtain the corresponding time-domain data; (iv) computing the x 2 of the nth iteration, checking whether the convergence condition is reached, and if so, stopping the whole process, otherwise continuing; (v) /1 computing optimal l and A nnew values based on Eq. [5] and Eq. [7] respectively, for the (n / 1)-th iteration; (vi) applying the fixed point method and Wu’s correction (4) (Eqs. [8] and [10]) to obtain the MEM spectrum of the ( n / 1)-th iteration; and (vii) repeating the procedure starting from step (iv). The proposed 3D MEM algorithm is demonstrated for processing a section of the 3D HCCH–TOCSY NMR spectrum of the protein calmodulin (17). The experimental data set has the 3D matrix size of 384*(H) 1 64*(H) 1 24*(C) with * indicating complex data points. After applying a sinesquared-bell window function to the F3 dimension and sinebell window functions to the F1 and F2 dimensions, zero
magbas
AP: Mag Res, Series B
251
NOTES
filling, and Fourier transformation, a final FT spectrum matrix of 1024 1 256 1 64 was obtained. We cut a spectral region of size 64 1 32 1 32 and applied the Hilbert transformation and inverse Fourier transformation to obtain the corresponding 3D time-domain data, which was equivalent to a FID of size 24* 1 8* 1 12*. Finally, we applied our 3D MEM algorithm, obtaining a 3D MEM spectrum of size 64 1 64 1 64. As indicated in Fig. 1B, the MEM spectrum has superior resolution in all three dimensions compared with the corresponding FT spectrum (Fig. 1A). The total number of iterations required to reach the solution was 14. The a and b values for the calculation were 0.04, and 0.05 respectively. Deconvolution on J couplings (16 Hz) and window functions (18) was also applied. The total time required to generate this MEM spectrum was 20 min. on a SUN SPARC 10 workstation. Approximately 64 Mbytes of RAM are needed, which is a moderate requirement for modern workstations. The C program of the 3D MEM algorithm is available from the author. The relatively simple 3D MEM algorithm described above shows its capability of increasing the apparent spectral resolution and signal-to-noise ratio in all three dimensions simultaneously, which can be difficult to achieve by 1D or 2D MEM methods. The more sophisticated MEM procedure proposed by Skilling et al. (3) may be applied to achieve better stability and convergence. Nevertheless, the 3D MEM proposed here is effective and computationally efficient and can be applied to 3D and 4D NMR data processing, thereby resolving peaks from crowded spectral regions, and facilitating the analysis of multidimensional NMR data.
The author thanks Drs. A. Bax, H. Kuboniwa, and M. Delsuc for useful discussions and for providing the 3D data set. Many thanks to Dr. D. Smith
JMRB 7487
/
6o14$$$$23
REFERENCES 1. E. T. Jaynes, ‘‘Statistical Physics,’’ W. A. Benjamin, New York, p. 182, 1963. 2. S. F. Gull and G. J. Daniell, Nature (London) 272, 686 (1978). 3. J. Skilling and R. K. Bryan, Mon. Not. R. Astron. Soc. 211, 111 (1984). 4. N. L. Wu, Astron. Astrophys. 139, 555 (1984). 5. E. D. Laue, J. Skilling, J. Staunton, S. Sibisi, and R. G. Brereton, J. Magn. Reson. 62, 437 (1985). 6. E. D. Laue, M. R. Mayger, J. Skilling, and J. Staunton, J. Magn. Reson. 68, 14 (1986). 7. J. Hoch, in ‘‘Methods in Enzymology’’ (N. Oppenheimer and T. L. James, Eds.), Vol. 176, p. 216, Academic Press, San Diego, 1989. 8. F. Ni and H. Scheraga, J. Magn. Reson. 82, 413 (1989). 9. A. R. Mazzeo, M. A. Delsuc, A. Kumar, and G. C. Levy, J. Magn. Reson. 81, 512 (1989). 10. D. S. Stephenson, Prog. NMR Spectrosc. 20, 515 (1988). 11. G. J. Daniell and P. J. Hore, J. Magn. Reson. 84, 515 (1989). 12. M. A. Delsuc, ‘‘Maximum Entropy and Bayesian Methods’’ (J. Skilling, Ed.), p. 285, Kulwer Academic, Dordrecht/Norwell, MA, 1989. 13. J. A. Jones and P. J. Hore, J. Magn. Reson. 92, 363 (1991). 14. W. Boucher, E. D. Laue, S. C. Burk, and P. J. Domaille, J. Am. Chem. Soc. 114(6), 2262 (1992). 15. P. Schmieder, A. S. Stern, G. Wagner, and J. C. Hoch, J. Biomol. NMR 3, 569 (1993). 16. E. Bartholdi and R. R. Ernst, J. Magn. Reson. 11, 9 (1973).
ACKNOWLEDGMENTS
AID
for his assistance in preparing the manuscript. This work is supported by the Hong Kong Research Grants Council (HKUST563/95M, DAG94/ 950494).
11-07-96 15:39:56
17. A. Bax, G. M. Clore, and A. M. Gronenborn, J. Magn. Reson. 88, 425 (1990). 18. M. A. Delsuc and G. Levy, J. Magn. Reson. 76, 306 (1988).
magbas
AP: Mag Res, Series B