Bayesian reconstruction of projection reconstruction NMR (PR-NMR)

Computers in Biology and Medicine 54 (2014) 89–99 Contents lists available at ScienceDirect Computers in Biology and Medicine journal homepage: www...

Download PDF

802KB Sizes 0 Downloads 49 Views

Report

PDF Reader
Full Text

Computers in Biology and Medicine 54 (2014) 89–99

Contents lists available at ScienceDirect

Computers in Biology and Medicine journal homepage: www.elsevier.com/locate/cbm

Bayesian reconstruction of projection reconstruction NMR (PR-NMR) Ji Won Yoon 1 Cyber Defense Department and Graduate School of Information Security, Korea University, South Korea

art ic l e i nf o

a b s t r a c t

Article history: Received 9 May 2014 Accepted 14 August 2014

Projection reconstruction nuclear magnetic resonance (PR-NMR) is a technique for generating multidimensional NMR spectra. A small number of projections from lower-dimensional NMR spectra are used to reconstruct the multidimensional NMR spectra. In our previous work [1,2], it was shown that multidimensional NMR spectra are efﬁciently reconstructed using peak-by-peak based reversible jump Markov chain Monte Carlo (RJMCMC) algorithm. We propose an extended and generalized RJMCMC algorithm replacing a simple linear model with a linear mixed model to reconstruct close NMR spectra into true spectra. This statistical method generates samples in a Bayesian scheme. Our proposed algorithm is tested on a set of six projections derived from the three-dimensional 700 MHz HNCO spectrum of a protein HasA. & 2014 Elsevier Ltd. All rights reserved.

Keywords: Inverse problem Projection reconstruction Bayesian model selection Reconstruction of multidimensional NMR spectra Mixed linear model

1. Introduction It is well known that the structures of molecules of interest are studied both by X-ray crystallography and by NMR spectroscopy. While X-ray crystallography works only in a solid state, NMR spectroscopy characterizes three-dimensional structures, dynamics and molecular interactions of molecules in a liquid state and even in a gas state. It is also known that multidimensional NMR spectroscopy is very useful for obtaining more accurate and precise structure of molecules. However, a serious limitation of multidimensional NMR spectroscopy is the long time required to record a set of multidimensional NMR experiments. The minimum measurement time of an N dimensional NMR experiment increases as the number of dimensions increases. This makes higher multidimensional NMR spectroscopy intractable in practice. For these reasons, several investigators have been attempting to speed up these measurements through more efﬁcient approaches [3–7], which are traced back to the concept of accordion spectroscopy [8]. One of the recent approaches designed to address this problem is GFT-NMR by Kim and Szyperski [9]. GFT-NMR experiments detect sums of and differences in chemical shifts by linking the various evolution dimensions. Another approach, which is the main consideration of this paper, is projection reconstruction NMR (PR-NMR) by Freeman and Kupce [10–13]. To implement the reconstruction of multidimensional NMR spectra, we have investigated the use of several deterministic and statistical approaches [2,1,14]. Although

E-mail address: [email protected] Tel.: þ82 2 3290 4886.

1

http://dx.doi.org/10.1016/j.compbiomed.2014.08.016 0010-4825/& 2014 Elsevier Ltd. All rights reserved.

deterministic algorithms including back projection and lowest value algorithms can reconstruct the spectra quickly, the reconstructed spectra are rather different from the underlying true spectra. Meanwhile, the statistical approaches based on a Bayesian scheme are relatively slow but provide reconstructions that are closer to the true spectra than those using deterministic approaches. In addition, there are two types of schemes for reconstructing spectra from projections: the pixel-by-pixel approach and peak-by-peak approach. In our previous papers [1,2,14], we demonstrated maximum likelihood (ML) reconstruction, maximum a posterior (MAP) [15,16] reconstruction and maximum entropy (ME) [17,18,3,19] reconstruction for pixel-by-pixel approach. We also showed that reversible jump Markov chain Monte Carlo (RJMCMC) reconstruction based on a peak-by-peak scheme can be designed with the simple linear model and work better than other approaches based on a pixel-by-pixel approach in that it reduces the complexity of the problem and speeds up the calculation. Although the previous RJMCMC algorithm with the simple linear model works effectively in particular PR-NMR experiments, we can still improve the methodology by carefully considering the noise of each projection. In practice, each projection is NMR spectra obtained from separate lower dimensional experiments meaning that each projection may have a different noise level. In this paper, an extended and generalized RJMCMC based algorithm is proposed to replace the simple linear model with the linear mixed model to explain such different noise levels per each projection.2

2 These noises are regarded as random effect of the linear mixed model in this manuscript.

90

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

2. Projection reconstruction NMR

3. Statistical background

2.1. Dimension reduction

3.1. Linear mixed model

As mentioned in the introduction, the serious limitation of the promising multidimensional NMR spectroscopy is the long minimum measurement time, which increases as the number of dimensions increases. This makes higher multidimensional NMR spectroscopy impracticable. For example, suppose that n time points are sampled in each evolution period for a d-dimensional NMR experiment. In this case, the total experimental time is determined by the number of time points (n), the number of scans (Na), the number of dimensions (d), and the duration of each scan, typically about one second [20]. The minimum experimental time of the phase sensitive d-dimensional NMR experiment increases exponentially with the dimension as the dimension increases so therefore it is proportional to p nd . Therefore, several investigators have been attempting to speed up these measurements by more efﬁcient approaches [3–7], which are traced back to the concept of accordion spectroscopy [8]. One of the recent approaches to address the measurement time problem is projection reconstruction NMR (PR-NMR) by Freeman and Kupce [10–13]. The PR-NMR reconstructs multidimensional NMR spectra by using relatively lower dimensional NMR spectra quickly. Apart from Kupce and Freeman, several other groups have contributed to the PR-NMR and its related topics [21–25].

In many application domains, linear models are often used for modeling the relationship between a response variable y and predictor variables, x1 ; …; xp . The relationship can be modeled with a linearly parametrized function as follows:

2.2. Inter-modulation of chemical shifts An N-D experiment involves N 1 evolution periods and 1 acquisition period. For example, there exist two consecutive evolution periods (t1 and t2) for a 3-D experiment. In NMR spectroscopy, a signal that evolves at a chemical shift ΩA in t1 modulates the signal in the second evolution period t2. Thus, it can be regarded as an inter-modulation of the chemical shifts ΩA and ΩB of the two dimensions. In other words, signals that evolve at ΩA during t1 pass on a component cos ΩA or sin ΩA to the next evolution period. Therefore, the signal obtained in the acquisition period ðt 3 Þ is modulated as one of the following: S1 ¼ cos ðΩA t 1 Þ cos ðΩB t 2 Þ S2 ¼ sin ðΩA t 1 Þ sin ðΩB t 2 Þ S3 ¼ sin ðΩA t 1 Þ cos ðΩB t 2 Þ S4 ¼ cos ðΩA t 1 Þ sin ðΩB t 2 Þ: The inter-modulation creates a cross-peak in a two dimensional frequency domain F 1 F 2 plane at co-ordinates ðΩA ; ΩB Þ. 2.3. Basics of PR-NMR Projection reconstruction NMR (PR-NMR) is based on the intermodulation of chemical shifts. The basic idea of PR-NMR is to increase the evolution times t1 and t2 jointly at relative rates in order to generate a projection of the spectra onto a plane inclined at a proper angle. If t1 and t2 are set to t cos α and t sin α respectively for a 3D experiment, then we have combined the terms of the modulations for hyper-complex Fourier transformation: Re1 ¼ S1 S2 ¼ cos ½ðΩA cos α þ ΩB sin αÞt Re2 ¼ S1 þS2 ¼ cos ½ðΩA cos α ΩB sin αÞt Im1 ¼ S3 þ S4 ¼ sin ½ðΩA cos α þ ΩB sin αÞt Im2 ¼ S3 S4 ¼ sin ½ðΩA cos α ΩB sin αÞt: It can be observed from the above equations that the projections always occur in pairs tilted at 7 α.

y ¼ β 0 þ β1 x1 þ β 2 x2 þ ⋯ þ β p xp þ ϵ where the coefﬁcient βi (ﬁxed effect) for i ¼ 0; 1; 2; …; p is unknown and ϵ is the error term. Using the matrix representation for the n different observations, the regression equation for the above example can be written as Y ¼ XA þ ϵ where and the 2 1 6 61 6 X¼6 6⋮ 4 1

Y ¼ ðy1 ; y2 ; …; yn ÞT ; A ¼ ðβ 0 ; β1 ; …; βp ÞT ; ϵ ¼ ðϵ0 ; ϵ1 ; …; ϵn ÞT matrix X is formed by 3 xð1Þ ⋯ xð1Þ xð1Þ p 1 2 7 7 xð2Þ xð2Þ ⋯ xð2Þ p 7 1 2 7: ⋮ ⋮ ⋱ ⋮ 7 5 xðnÞ ⋯ xðnÞ xðnÞ p 1 2

However, some parameters among predictor variables can have random effects under uncertainty and such a simple linear model with ﬁxed effects cannot explain the parameters well. The linear mixed model (LMM) is a statistical model containing both ﬁxed and random effects to address this problem. The common form of the LMM is Y ¼ XA þ ZB þ ϵ

ð1Þ

where Z is the design matrix for the random effects relating observations and B is a vector of the coefﬁcients of random effect variables. 3.2. Reversible jump Markov chain Monte Carlo Reversible jump Markov chain Monte Carlo (RJMCMC) has been a popular method for addressing model selection problems [26] in many applications such as image processing [27,28] and nuclear spectroscopy [29]. RJMCMC enables Markov chains to move in between different dimensions of parameters. This can be used to infer the joint distribution of two types of parameters, a model indicator k and a parameter vector xk . This is described as inferring pðk; xk jyÞ, where y is observed information. The conventional MCMC mentioned cannot solve this problem since it is not straightforward in comparing the densities of objects in different dimensions. The reversible jump MCMC, suggested by [30], samples over a union space on the different dimensions of parameter spaces by applying the dimension matching procedures in a generalized MCMC framework.

4. Mathematical model for reconstruction of multidimensional NMR spectra 4.1. simple linear models for multidimensional NMR Reconstruction of multidimensional NMR spectra from a small number of projections can be achieved using two strategies: pixelby-pixel modeling [14,2] and peak-by-peak modeling [1,2]. While pixel-by-pixel modeling determines all pixels on an image individually in terms of the given projections, the peak-by-peak approach reconstructs an image using a ﬁnite collection of speciﬁc peak shapes. Even though the peak-by-peak model is an

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

idealization of reality, it is a very reasonable assumption since the NMR peak shape can be well approximated as a speciﬁc shape such as Gaussian or Lorentzian. In the peak-by-peak approach, each peak consists of a center position, amplitude, and peakwidth. Peak-by-peak estimation can be more efﬁcient since it models explicitly the sparseness inherent in the NMR spectra. Thus, in the peak-by-peak approach, we do not have to directly update all areas of the image as would be required in the pixel-bypixel approach. Another interpretation is that a solution can rapidly be obtained since the number of parameters for peak-bypeak estimation is much smaller than that for pixel-by-pixel. Here, then, we deﬁne the underlying image model at pixel location x with a ﬁnite collection of peaks, K

sðxÞ ¼ ∑ Ak N ðx; mk ; σ 2 IÞ

ð2Þ

k¼1

where sðxÞ is the intensity at the image position x, N ð; a; bÞ denotes multivariate(/univariate) normal distribution with mean a and covariance b and x ¼ ½x1 ; x2 ; …; xd T ; mk ¼ ½mk;1 ; mk;2 ; …; mk;d T

where d is the dimension of the evolutionary periods of the reconstructed NMR spectra. In this paper, we simply used d¼ 2 because we are reconstructing 3D NMR spectra. In Eq. (2), Ak is the amplitude of the k-th peak. We use the Gaussian shape, which has two components: mk for the center position and σ for peak width, as shown in Eq. (2). Two or more individual widths of the peaks may provide more accurate reconstruction than a single line width but we use a single width for our approach for two reasons. First, considering only a single width reduces time complexity in estimation; therefore, this can speed up our proposed approach.

2.5

2.5

1.5 1

1.5 1

0.5

0.5

0

0 20

40

60

80

Noisy observation Reconstruction Decomposition (K=1)

2

Intensity

Intensity

Second, we can reconstruct any shape of peaks only with such a single line width. That is, our proposed approach is based on a few assumptions. One is that the shape of the peak is considered as a Gaussian shape function. The other assumption is that a single line width can be used to approximate all peaks. Of course, our proposed algorithm may not perform well if the assumptions are not correct. Therefore, we simulate additional experiments to verify the assumptions used in this paper before applying these to real experimental NMR data. First, our proposed approach uses a Gaussian shape function to design the peak proﬁle in this paper although other line shape functions such as Lorentzian and Laplacian can also be effectively used. However, in fact, it is known that none of these correctly represents the underlying actual spectral line shape, which is deﬁned by the decay rate of the time signal and appodization function. Fig. 1 demonstrates a comparison between actual nonGaussian synthetic spectra and their reconstructed shape using the Gaussian shape function of the Eq. (2). The solid lines of Fig. 1(a) and (c) are the spectra with a single Laplacian shaped peak and the dashed lines of the subﬁgures are the reconstructed spectra with a Gaussian shaped function. As can be seen in the subﬁgures, the reconstructed spectra with 10 Gaussian components (K ¼10) are much closer to the actual Laplacian shape than one with a Gaussian component (K ¼1). In a way similar to Laplacian, the spectra with Lorentzian shaped peaks can also be well approximated using only Gaussian shaped functions as shown in Fig. 1(b) and (d). To validate the other assumption, we generate a synthetic spectra with two Gaussian shaped peaks that have different amplitudes and widths as shown in Fig. 2. The left peak of the synthetic spectra is constructed with the location parameter μ1 ¼ 30 and shape/width parameter σ 1 ¼ 3 and the right peak with μ2 ¼ 80 and σ 2 ¼ 5. With a ﬁxed width σ ¼ 3, the spectra are reconstructed varying the number of Gaussian components.

Noisy observation Reconstruction Decomposition (K=1)

2

100

20

Sample points

1.5 1

0

0 60

100

1 0.5

40

80

1.5

0.5

Sample points

60

80

Noisy observation Reconstruction Decomposition (K=10)

2

Intensity

Intensity

2.5

Noisy observation Reconstruction Decomposition (K=10)

20

40

Sample points

2.5 2

91

100

20

40

60

80

100

Sample points

Fig. 1. Fitting the shape functions using a Gaussian shape with a ﬁxed width for different peak shapes. (a) Laplacian (K¼1). (b) Lorentzian (K¼ 1). (c) Laplacian (K¼ 10). (d) Lorentzian (K ¼10).

92

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

2.5 2 1.5 1

1.5 1

0.5

0.5

0

0 20

40

60

80

Noisy observation Reconstruction Decomposition (K=10)

2

Intensity

Intensity

2.5

Noisy observation Reconstruction Decomposition (K=2)

100

20

Sample points

40

60

80

100

Sample points

Fig. 2. Gaussian ﬁtting for two Gaussian peaks with different amplitudes and widths. (a) Gaussian (K¼2). (b) Gaussian (K¼ 10).

the row of X: 6 projections for each peak

3000

2500

2000 (4, 3)

1500

1000

500

0

1st peak

2nd peak

3rd peak

4th peak

5th peak

the column of X

Fig. 3. The matrix X for given multidimensional spectra with K¼ 5 peaks and jΛj ¼ 6 projections. Angles used for projections are 301, 3301, 501, 3101, 01 and 901. (a) A sampled spectra. (b) Structure of X of the sampled spectra.

The dashed line of Fig. 2(a) shows the reconstructed spectra with K ¼2 and that of Fig. 2(b) demonstrates the reconstructed spectra with K ¼10. As can be seen in this Fig. 2(a), the right peak, with a different width, is not properly reconstructed even in the Gaussian case because of the different width of the peak. However, both peaks are closely approximated to the actual spectra only with a single width Gaussian when 10 Gaussians (K ¼10) with a common single width are used to compose the spectra. That is, as K increases, such differences or errors from different shapes or different widths are signiﬁcantly reduced, as shown in Fig. 1(b) and (d) and Fig. 2(b). Eventually, we can model real NMR peaks with various shapes by composing Gaussian shaped components that have a single width. In PR-NMR, input data are a small number of projections obtained at different projection angles. Suppose that y is the projection data. λi and l stand for the i-th projection angle and the sample index into a projection, respectively, i ¼ 1; …; jΛj. The PR-NMR data is then deﬁned as ¼ Rs ðλi ; lÞ þ ϵi yðiÞ l

ð3Þ

where yðiÞ is the l-th data-point of the i-th projection taken at l angle λi and Rs ðλi ; lÞ is the projection function for the i-th angle λi

and data point. Suppose that the number of data points in each projection is Ns and the number of peaks is K. Eq. (3) can be written in the linear model framework using vector notation as follows: Y ¼ XA þ e

ð4Þ

where A ¼ A1:K and e N ðe; 0; σ 2e IÞ. While we assumed σe to be known in our previous work [1], we estimate it in this paper. Y is the ð1Þ ðjΛjÞ stacked vector of projection data, deﬁned as Y ¼ ½yð1Þ ; 1 ; …; yN s ; …; y1

ðjΛjÞ T …; yN . X denotes the integrated peaks obtained by substituting s

Eq. (2) in terms of angles λ, A ¼ A1:K ¼ ½A1 ; A2 ; …; AK T is a vector of peak amplitudes, and e is a noise vector. When we have jΛj projections, X is a matrix with size ðN s jΛjÞ K and A1:K is a K 1 vector. The size of vectors Y and e is ðNs jΛjÞ. For example, suppose that we have jΛj ¼ 6 projections with Ns ¼ 512 points. If the sampled or given spectra have K ¼5 peaks, as shown in Fig. 3(a), then we obtain its corresponding X, from Fig. 3(b). Since there are 5 peaks in the image and each projection has 512 points, the number of columns is 5 and the number of rows is 6 512 ¼ 3072. In Fig. 3(b), the area “(4, 3)” denotes the 4th projection of an image with only the 3rd peak.

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

4.2. Proposed linear mixed model for multidimensional NMR spectra Although the simple linear model of the previous section works effectively in certain PR-NMR experiments [1], we can still improve the methodology by carefully considering the noise from each projection. In practice, each projection is NMR spectra obtained from separate lower dimensional experiments meaning each projection may have a different noise level. In order to subtract the noise from the spectra, we need to understand the noise. In general, there are two different types of noise: projection noise and full plane noise. In each experiment (of lower dimensional NMR spectra), a certain amount of noise can be added into all projected spectra with variance σ 2μ . We refer to this noise as projection noise for a particular projection. Full plane noise is the other type of noise that is randomly added in a particular location of the spectra. In PR-NMR, only projection noise may be considered by setting σ e ¼ 0. However, we added the “full plane noise” in order to indirectly consider the noise reduction of the reconstructed spectra rather than projected spectra. Such a denoising scheme for the full plane spectra is commonly applied even in one or two dimensional NMR processing [31]. We show the existence of both types of noise from Tables 1 and 2 in Section 5. We now map these two noises to the statistical terms that are commonly used in the linear mixed model. The projection noise is regarded as a projection-speciﬁc random effect and the full plane noise is interpreted as a ﬁxed effect in the linear mixed model. By carefully separating noise in the linear mixed model, we can expect better and well reconstructed multidimensional NMR spectra. Now our linear mixed model is deﬁned as Y ¼ XA þ ZB þ e;

ð5Þ

where A and B represent ﬁxed and random effects respectively. To subtract the projection noise in each projection, we model 2 ð1Þ 3 g 0Ns 1 0Ns 1 ⋯ 0Ns 1 6 7 gð2Þ 0Ns 1 ⋯ 0Ns 1 7 6 0Ns 1 7 Z¼6 6 ⋮ ⋮ ⋮ ⋱ ⋮ 7 4 5 0Ns 1 0Ns 1 0Ns 1 ⋯ gðjΛjÞ Table 1 Scales and noise levels of the projection data. Scale (ρi) indicates the scale of signal amplitudes and EðϵÞ and VðϵÞ indicate the mean and standard deviation of noise level ϵ, respectively. Metric

301

3301

501

3101

01

901

Scale (ρi) EðϵÞ pﬃﬃﬃﬃﬃﬃﬃﬃﬃ VðϵÞ

4.9367 1.7261 2.5877

5.0598 1.8199 2.9305

5.4074 0.1677 1.5278

5.3177 0.8501 2.0577

2.4739 0.6058 1.6067

5.1193 2.1472 9.0809

Table 2 Comparison of Linearσe with varying σe using several metrics: “bold” indicates the best value and “under bar” indicates the second best values. σe

RMSE

KLD

PSNR

SSIM

AUC

0.1 0.3 0.5 0.7 0.9 1 5 10 15 20 25 30

1.1275 1.1943 1.2311 1.1704 1.0960 1.1477 0.7938 0:7106 0.6821 0.7274 0.7457 0.8210

25.2739 24.5784 21.9748 20.0840 18.5391 26.3999 3.2067 3:3713 6.2410 8.0324 7.5607 13.1501

1.4584 1.3095 1.0714 1.2919 1.4865 1.3689 1:5633 1.5888 1.5487 1.5593 1.5884 1.5742

0.3399 0.2322 0.1766 0.2701 0.3630 0.2994 0.6677 0:7205 0.7450 0.6987 0.6580 0.5819

0.6749 0.6620 0.7302 0.7271 0.7814 0.6233 0.9910 0:9776 0.9491 0.9459 0.9498 0.8747

93

and B ¼ ½b1 ; b2 ; …; bjΛj T N ð0; σ 2μ IÞ, where gðiÞ is a projection noise pattern in the ith projection with Ns data points and 0Ns 1 is an Ns 1 zeros vector. In this model, ﬁnding g may not be straightforward since there is no known pattern for projection noise in a generalized form. One of the simple ways to determine the pattern of each projection is to use a low-pass ﬁlter, such as average ﬁlter or Gaussian ﬁlter, that passes slowly changing regions or signals and reduces or removes the amplitude of signals from quickly changing regions in the projection. For simplicity, we assume that there exists a constant amount of noise in each projection, and then we say that the expectations of the projection noise are scaled with variance σ 2μ for all projections as shown in Fig. 4(a). Therefore, we simply deﬁne gðiÞ ¼ 1Ns 1 for all iA f1; 2; …; jΛjg. To subtract such projection noise for each projection of PR-NMR, we model Z again as follows: Z ¼ IjΛjjΛj 1Ns 1

ð6Þ

where stands for the Kronecker product and 1Ns 1 is a N s 1 ones vector. The term Y in the linear mixed model is the same as that in simple linear model. X and A of the linear mixed model are also built with a similar structure to the previous simple linear model. Therefore, the linear mixed model reduces to a simple linear model when the random effect becomes zero. From this point of view, the linear mixed model is a sort of generalization of the linear model but has a richer structure to explain the data. Note that we can still represent such a richer linear mixed model with a simple linear model by only changing the structure of covariance as follows: ^ Y ¼ XA þ e;

ð7Þ

^ 0; σ μ ZZ þ σ where e^ N ðe; Therefore, we have a uniﬁed and simpliﬁed form for both models by 2

T

2 e IÞ.

Y ¼ XA þ ϵ ϵ N ðϵ; 0; Σ Þ; where 8 < σ 2e I Σ¼ : σ 2μ ZZT þ σ 2e I

ð8Þ

if it is a simple linear model if it is a Linear Mixed Model:

4.3. Target distribution of interest Since the peak-by-peak approach presents a computational challenge in that the number of peaks is not known a priori, trans-dimensional methods were applied in order to estimate the number of peaks in our previous paper [1]. Reversible jump Markov chain Monte Carlo (RJMCMC) is used to approach the trans-dimensional problems. Use ηk to denote the parameter vector associated with the peak proﬁles indexed by k A f1; 2; …; Kg. Then, for K peaks we have a set of hidden variables as follows:

η1:K ¼ ðm1:K ; A1:K ; σ ; σ e ; σ μ Þ

ð9Þ

where m1:K ¼ ðm1;1:K ; m2;1:K ; …; md;1:K Þ. With this speciﬁc scheme, our target posterior of interest is written as pðK; η1:K jYÞ, where K is also a random variable for the number of peaks. Interestingly, some hidden parameters among η1:K can be regarded as nuisance parameters that we do not need to estimate until they are required. For instance, the amplitudes of the peaks A1:K can be removed with the linear Gaussian assumption as shown in Eqs. (10) and (13). We can also estimate σe and σ μ via the maximum likelihood (ML) technique rather than by the RJMCMC scheme to obtain stable results within a practical time as well as to speed up the simulation by reducing the searching space in RJMCMC. We have now three types of hidden parameters:

94

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

0

0

+30

200

−30

200

100

100

0

0 0

200

400

0

+500

200

200

200

100

400

−500

100

0

0 0

200

400

0

200

0

200

400 0

0

90

200

100

100

0

0 0

200

400

0

200

400

Fig. 4. Projections and their underlying spectra. (a) Projections (observation). (b) Underlying multidimensional spectra (underlying true spectra).

1. fm1:K ; σ g: These are the main parameters of interest for the peak proﬁle that we want to estimate and reconstruct. They are estimated using the RJMCMC technique, and since only these parameters are of interest for the RJMCMC procedure, we rename these parameters θ1:K , i.e. θ1:K ¼ fm1:K ; σ g. 2. A1:K : This parameter is an element of peaks' proﬁle but we separately classify this parameter from other proﬁles of the peaks, θ1:K . 3. σe and σ μ : For stable and fast estimation, we infer these parameters by using the maximum likelihood scheme rather than the RJMCMC scheme. A set of hidden variables η1:K is decomposed as fθ1:K ; A1:K ; ðσ e ; σ μ Þg. It is very important to classify the hidden parameters since we can design a more efﬁcient algorithm by removing nuisance parameters or unnecessary Monte Carlo sampling steps for a few parameters. Therefore, our target distribution conditional on only σe and σ μ for the RJMCMC simulation is Z pðθ1:K ; A1:K ; KjY; σ e ; σ μ Þ dA1:K pðθ1:K ; KjY; σ e ; σ μ Þ ¼ Z p pðY; θ1:K ; A1:K ; Kjσ e ; σ μ Þ dA1:K Z ¼ pðYjθ1:K ; A1:K ; K; σ e ; σ μ ÞpðA1:K jKÞ dA1:K pðθ1:K ; Kjσ e ; σ μ Þ

¼ pðYjθ1:K ; σ e ; σ μ Þpðθ1:K ; KÞ (

)

K

¼ pðYjθ1:K ; σ e ; σ μ Þ pðσ Þ ∏ pðmk jKÞ pðKÞ (

k¼1

K

d

After drawing a sample on θ1:K and K via RJMCMC in each time step, we then estimate the σe and σ μ via the maximum likelihood (ML) approach and maximum a posteriori (MAP) approach:

σ^ e ; σ^ μ ¼ arg max log pðYjX; σ e ; σ μ Þ by ML estimate or σ e ;σ μ

σ^ e ; σ^ μ ¼ arg max log pðσ e ; σ μ jY; XÞ by MAP estimate: σ e ;σ μ

ð12Þ

For an MAP estimate, we set up the prior of σ2e and σ 2μ by inverse Gamma distribution: σ 2e IGðσ 2e ; α; β Þ and σ 2μ IGðσ 2μ ; α; βÞ where α ¼ 3 and β ¼ 6. We recursively run RJMCMC for the proﬁles of the peak θ1:K and ML/MAP for standard deviations of noises σe and σ μ . 4.4. Marginalized likelihood With such a uniﬁed and simpliﬁed form represented by Eq. (8), we use the following marginalized likelihood for both the simple linear model and the linear mixed model. The marginalized likelihood is obtained by integrating out the amplitude parameters by assuming that the amplitudes follow normal distribution as follows: Z pðYjX; σ e ; σ μ Þ ¼ pðY X; A; σ e ; σ μ ÞpðAσ e ; σ μ Þ dA 1 1 1 exp ½Y T Σ Y þ μTA 1TK1 C 1 1K1 μA HT Φ H 2 ¼ j2πΣ j1=2 j2π Cj1=2 j2πΦj1=2 ð13Þ where

)

Φ ¼ XT Σ 1 X þ C 1 1 H ¼ XT Σ Y þ C 1 ðμA 1K1 Þ:

¼ pðYjX; σ e ; σ μ Þ pðσ Þ ∏ ∏ pðmk;j jKÞ pðKÞ: k¼1j¼1

ð10Þ

and Σ is obtained from Eq. (8) and C ¼ diagðC A ; …; C A Þ of Eq. (11).

The distributions of this equation are deﬁned by pðA1:K Þ ¼ N ðA1:K ; μA 1K1 ; diagðC A ; …; C A ÞÞ pðmk;i Þ ¼ Uð0; T i Þ for all i A f1; 2; …; dg pðσ Þ ¼ Gðα; β Þ pðKÞ ¼ 1=c

4.5. Main algorithm

ð11Þ

where UðÞ and G represent uniform and Gamma distributions respectively. The distribution p(K) is a prior distribution for dimensionality and we assume it by uniform distribution (c is a normalization factor for uniform distribution.)

Both RJMCMC for pðK; θjY; σ e ; σ μ Þ and ML for σe and σ μ recursively run until convergence. More details for the algorithms are described below:

Update the proﬁles of peaks via RJMCMC. This step of the proposed algorithm is very similar to that of our previous approach where the details of the algorithms of RJMCMC for

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

PR-NMR are described [1]. The difference is the replacement of the simple linear model with the linear mixed model resulting in a richer covariance than with the previous approach. The updated covariance of each sample is used to build the likelihood pðYjθ1:K ; σ e ; σ μ Þ ¼ pðYjX; σ e ; σ μ Þ. 1. Propose a type of move from ‘Birth’, ‘Death’, ‘Split’, ‘Merge’ and ‘Dimension invariant’. 2. If the move type is dimension invariant, RJMCMC draws samples over parameters using a standard Metropolis-Hastings (MH) algorithm, so that each unknown parameter is updated according to an acceptance probability: ( ) 0 0 0 pðYjθ1:K ; σ e ; σ μ Þpðθ1:K jKÞqðθ1:K ; θ1:K Þ αKjK ¼ min 1; ð14Þ 0 pðYjθ1:K ; σ e ; σ μ Þpðθ1:K jKÞqðθ1:K ; θ1:K Þ 0

where θ1:K and θ1:K are proﬁles of the peaks of new and previous samples respectively. 3. If the move type is dimension variant such as ‘Birth’, ‘Death’, ‘Split’ and ‘Merge’, then RJMCMC draws samples over parameters using a Metropolis-Hastings (MH) algorithm in RJMCMC scheme, so that each unknown parameter is updated according to an acceptance probability: (

αK 0 jK ¼ min 1;

0

0

0

pðYjθ1:K 0 ; σ e ; σ μ Þpðθ1:K jK 0 ÞqðK; K 0 Þqðθ1:K ; θ1:K 0 Þ 0 pðYjθ1:K ; σ e ; σ μ Þpðθ1:K jKÞqðK 0 ; KÞqðθ 1:K 0 ; θ1:K Þ

)

ð15Þ

where K 0 and K are the number of the peaks of the new and previous samples respectively. Note that Green's Jacobian terms for dimension matching are not required in our algorithm. Estimate σe and σ μ via ML/MAP 1. Estimate σe and σ μ via ML: σ^ e ; σ^ μ ¼ argσ e ;σ μ max log pðYjX; σ e ; σ μ Þ or MAP: σ^ e ; σ^ μ ¼ argσ e ;σ μ max log pðσ e ; σ μ jY; XÞ. Recursively update the proﬁles (via RJMCMC) and standard deviations of the noises σe and σ μ (via ML/MAP) until convergence.

5. Experimental data analysis 5.1. Characteristics of datasets used Fig. 4(a) shows the projections of 13C15N correlation peaks in the 700 MHz HNCO spectrum of the protein HasA. This is the observed dataset Y. Figs. 4(b) and 6(a) show a contour image and its segmented binary image of a desired target map, respectively. The binary image is obtained by a simple single thresholding segmentation technique onto the ground truth spectra. The F1F2 plane is extracted from a full three-dimensional HNCO experiment on HasA, performed using the conventional method where both evolution times are incremented independently. The 16 peaks include a doublet and very weak peaks, as shown in Fig. 4(b). Now, our task here is to reconstruct the underlying multidimensional spectra of Fig. 4(b) from the six lower dimensional projected spectra of Fig. 4(a) by using several

95

deterministic and statistical approaches. Given this dataset, we have checked the characteristics of the dataset before applying the reconstruction approach to verify the assumption that each projection can have different amplitudes in scale. Thus, we summed all values of spectra in each projection to check whether the spectra are scaled up or down at the rate ρi for the i-th projection. As shown in Table 1, the summed values for 01 are rather different from others so this proves that each projection has a differently scaled amplitude for each projection. For a stable estimation, we re-scaled the i-th projection by 1=ρi . In addition, another interesting factor to characterize the data is the noise level. The noise level of each projection can be different with varying projection angles although the projections are made from the same 3D HNCO spectra. Table 1 also shows that noise levels vary in terms of the projection angles. In particular, there is huge noise at a 901 projection. In the table, EðϵÞ and VðϵÞ indicate the mean and standard deviation of noise level ϵ, respectively. 5.2. Sensitivity issue of the parameter linear model

σe in the previous simple

Before demonstrating the performance of our proposed linear mixed model approach, we tested sensitivity of the selection of model parameters such as σe in our previous simple linear model [1]. We call this model Linearσ e ¼ τ for any real positive value τ. In order to obtain the optimal value of σe for Linearσ e , we applied several metrics to the reconstructed spectra with varying σe as shown in Table 2. Table 2 shows that the Linearσ e has the best outputs between σ e ¼ 5 and σ e ¼ 15. In the view of RMSE, Linearσe performs well at σ e ¼ 15 while Linearσe performs well at σ e ¼ 5 in the view of KLD. Therefore, we select the central value σ e ¼ 10 to represent it. The following ﬁgures include multidimensional spectra reconstructed using Linearσ e ¼ 0:1 , Linearσ e ¼ 10 and Linearσ e ¼ 30 . As can be seen in Fig. 5, the best result is obviously obtained from the second approach of Fig. 5(c), but note that we have to carefully and manually set the model parameters σe at 10 in this approach. 5.3. Performance comparison with our proposed approach and its characteristics However, in general, we do not know the standard deviation of the noise; if we under-estimate it by 0.1 as in Fig. 5(b) or overestimate it by 30 as in Fig. 5(d), the restored spectra are rather different from the desired spectra. In the case of the underestimation of σe, we obtain unwanted peaks, which are simply denoted as “false alarm”. Thus, it is difﬁcult to distinguish the real peaks from the false alarms since noises are considered as signals and the ﬁltering technique fails. When we use an over-estimation of σe as in Fig. 5(d), we lose many weak but desired peaks in the reconstructed spectra. For these reasons, it is very crucial to obtain a precise and accurate σe in the simple linear model. We therefore need to address this problem by automatically estimating σe via

Fig. 5. Reconstructed multidimensional spectra by Linearσe with varying σe. (a) Underlying ground truth. (b) Linearσe ¼ 0:1 . (c) Linearσ e ¼ 10 . (d) Linearσe ¼ 30 .

96

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

Fig. 6. Reconstructed spectra using several algorithms. (a) Segmented binary spectra (Underlying ground truth). (b) Lowest value (pixel-by-pixel). (c) Additive back projection (pixel-by-pixel). (d) Maximum likelihood (pixel-by-pixel). (e) Maximum entropy (pixel-by-pixel). (f) LinearML (peak-by-peak). (g) MixedML (peak-by-peak). (h) LinearMAP (peak-by-peak). (i) MixedMAP (peak-by-peak).

ML or MAP in the simple linear model: these improved versions are named as LinearML , and LinearMAP for the ML and MAP estimates of σe, respectively. However, LinearML, and LinearMAP may not be sufﬁcient for obtaining spectra close to the underlying target map if lower dimensional spectra are independently corrupted by unwanted noise. Owing to such complicated models and datasets, we generalize the linear model to linear mixed model with ﬁxed effect. In this paper, we name this generalized model MixedML and MixedMAP with ML and MAP estimates for the hidden parameters respectively. In order to observe the performance of RJMCMC based on our proposed linear mixed model, we have compared several approaches categorized as either pixel-by-pixel or peak-by-peak.

Pixel-by-pixel approach: Let R, y and M be the unknown multidimensional spectra of the desired target map, given lower dimensional projections and an assumed model respectively. 1. LV (Lowest value): The lowest intensity is retained from corresponding locations in projections to infer each pixel value R of the target map. 2. ABP (Adaptive back-projection): This restores a spectra from a set of measured projections via a well-known inverse Radon transform technique [32]. 3. ML (Maximum likelihood): This seeks an optimal R^ to maximize the likelihood pðyjR; MÞ [16,2]. Since ML is a non-Bayesian (Frequentist) approach, we can modify this problem into maximum a posteriori (MAP) approach by maximizing the posterior pðRjy; MÞ in a Bayesian

framework [33]. However, we do not have any known prior information about R, so we simulate only ML estimation for performance evaluation in this paper. 4. ME (Maximum entropy): This reconstructs a desired spectra R by maximizing the entropy with a Lagrange multiplier [34,35]. Nowadays, it is common to use maximum entropy as a prior knowledge in Bayesian statistics, but we do not use ME for the prior in this paper. In addition, maximum entropy reconstruction algorithms are very popular in multidimensional NMR domains. However, this is rather different from our problem since the conventional maximum entropy reconstruction algorithms are considering sub-sampling with sparse sampling strategies rather than projection space. From this point of view we cannot use recent ME reconstruction algorithms developed by several researchers because of their different data characteristics for PR-NMR [3,19]. More details on other types of sub-sampled NMR datasets are described in [36,37]. Peak-by-peak approach: All RJMCMC algorithms are based on the peak-by-peak approach and they infer (θ1:K ; KÞ [and optionally ðσ e ; σ μ )]. 1. Linearσ e : this uses a simple linear model with ﬁxed σe (e.g. σ e ¼ 0:1, σ e ¼ 10, and σe ¼30). 2. LinearML: this uses a simple linear model but σe is updated in each sample of RJMCMC followed by an ML approach. 3. MixedML : this uses a linear mixed model to build likelihood and both σe and σ μ are updated in each sample of RJMCMC followed by an ML approach. Since the ﬁrst and second

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

97

Table 3 Performance evaluation using several metrics. Method

RMSE

KLD

PSNR

SSIM

AUC

Lowest value (LV) Additive back projection (ABP) Maximum likelihood (ML) Maximum entropy (ME) LinearML MixedML LinearMAP MixedMAP

0.8982 1.1555 1.0103 1.1374 0.8159 0.7804 0.8718 0.7794

8.4944 17.0698 16.7052 17.5728 5.3585 4.0741 9.9944 3.6605

1.3069 0.9416 1.5533 1.7884 1.6059 1.5519 1.5194 1.6352

0.4042 0.1631 0.4634 0.1739 0.6482 0.6743 0.6014 0.6733

0.8836 0.8182 0.8779 0.9081 0.9674 0.9831 0.9189 0.9916

1

1

LV

ABP ML

0.6

ME Linear

0.4

ML

Mixed

ML

Linear

0.2

ABP

0.8

LV

TPR or sensitivity

TPR or sensitivity

0.8

ML ME

0.6

Linear

ML

Mixed

ML

0.4

Linear

MAP

Mixed

0.2

MAP

MAP

Mixed

MAP

0

0

0.2

0.4

0.6

0.8

FPR or (1−specificity)

1

0

0

0.02

0.04

FPR or (1−specificity)

Fig. 7. ROC for several reconstruction approaches. (a) ROC. (b) Dilated view of ROC

derivatives are not explicitly deﬁned, we use a quasiNewton method to obtain σ^ e and σ^ μ . 4. LinearMAP : this uses a simple linear model but σe is updated in each sample of RJMCMC followed by an MAP approach. 5. MixedMAP : this uses a linear mixed model to obtain likelihood and both σe and σ μ are updated in each sample of RJMCMC followed by an MAP approach. Fig. 6 shows the reconstructed spectra by several pixel-by-pixel (b–e) and peak-by-peak (f–i) approaches referred to above. In particular, all RJMCMC reconstruction algorithms including our proposed approaches are peak-by-peak approaches and the subimages are the averaged reconstructed spectra after burn-ins from the four different approaches. In order to estimate the number and proﬁles of peaks, the four variants of RJMCMC techniques are also compared with other pixel-by-pixel approaches and we generate in total 6000 samples including 3000 burn-in periods. As can be seen in the ﬁgures, many false ridges exist in most pixel-by-pixel approaches (including LV, ABP, and ML) except ME. Interestingly, in ME reconstruction strong signals are enhanced but low signals are suppressed in the reconstructed multidimensional spectra. The likelihoods of Fig. 6(g) and (i) are designed with the linear mixed model to explain the projection noise for each projection. The model parameters σe and σ μ are recursively calculated whenever RJMCMC draws samples. Table 3 demonstrates the performance comparison with several popular metrics: root mean square error (RMSE), Kullback Leibler distance (KLD), peak signal-to-noise ratio (PSNR), structural similarity (SSIM) and area under curve (AUC). See the appendix for more details on the metrics used. Mixed model based approaches have smaller RMSEs and KLDs and higher SSIMs and AUCs compared with other algorithms except PSNR. Here, maximum entropy appears to have the best result in PSNR because the strong signals are enhanced and the effects of the enhanced signals are larger than the side-effects of

the suppressed signals. Therefore, mixed model based approaches (MixedML and MixedMAP) infer closer and more accurate multidimensional spectra. The area under curve (AUC) is calculated from the receiver operating characteristic (ROC), shown in sub-graph 7(a). The ROC is obtained by comparing two segmented binary spectra. One binary spectra is obtained by applying a single thresholding technique into a desired spectra of Fig. 6(a) and the other binary spectra is obtained by increasing a single threshold from 0 to 1 to segment the restored spectra. Performance from the best to worst shows MixedMAP MixedML LinearML LinearMAP ðLV; ML; MEÞ ABP. In order to view the details of the ROC, the dilated sub-graph is displayed in Fig. 7 (b). In this plot, the sensitivity of ME reconstruction algorithm sharply increases but it falls to about 0.8 after FPR¼ 0.01. This indicates that ME reconstruction ﬁnds a few strong peaks well in the low threshold but that it fails to uncover weak signals as the threshold increases. Fig. 8 shows the analysis of the samples of variants of RJMCMC reconstruction algorithms: (a) acceptance ratio, (b) trajectories of the number of peaks, (c) the distribution of the number of sampled peaks and (d) the convergence diagnostics. Interestingly, the MixedMAP has a lower acceptance ratio after burn-ins while MixedML has a higher acceptance ratio in the burn-ins period as shown in Fig. 8(a). In addition, MixedMAP has a relatively lower acceptance ratio throughout the whole period except at the beginning from 1 to 100. This suggests that MixedMAP generates well-mixed samples from the beginning and that it can speed up the RJMCMC algorithm to build target density. Fig. 8(b) shows the trajectories of the number of peaks for four RJMCMC algorithms. After the burn-in period, the number of peaks in LinearML and MixedML converges to around 25 while those in LinearMAP and MixedMAP converges to about 40. An alternative view concerning the number of peaks is the histogram/distribution of the number of peaks in Fig. 8(c), and the MAP approach has a set of samples with more peaks than the ML approach. Additionally the mixed model has more peaks than the linear model.

98

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

50

100

Linear

ML

40

Mixed

The number of peaks

Acceptance Ratio

ML

Linear

MAP

Mixed

MAP

10−1

30

20

Linear

ML

Mixed

ML

10

Linear

MAP

0 10−2

0

1000

2000

3000

4000

5000

6000

Mixed

MAP

0

1000

2000

Iteration

3000

4000

4000

6000

25

Linear

Linear (ML)

ML

Mixed (ML)

20

Mixed

ML

3000

Linear (MAP)

Linear

Mixed (MAP)

MAP

PSRF (σ)

Counts (after burn−ins)

5000

Iteration

Mixed

MAP

2000

15

10

1000 5

0

0

10

20

30

40

50

The number of peaks

0

1000

2000

3000

4000

5000

6000

Iteration number

Fig. 8. Analysis of the samples of variants of RJMCMC reconstruction algorithms: (a) acceptance ratio, (b) trajectories of the number of peaks, (c) the distribution of the number of sampled peaks and (d) the convergence diagnostics.

Fig. 8(d) plots the convergence diagnostics by using the potential scale reduction factor (PSRF) statistic for the peak's width σ [38].3 All four variants of RJMCMC techniques pass the convergence checking.

Conﬂict of interest statement None declared.

6. Conclusion Projection reconstruction nuclear magnetic resonance (PR-NMR) is an efﬁcient technique for reconstructing multidimensional NMR spectra with only a small number of lower dimensional projected NMR spectra. Because reconstructing multidimensional NMR spectra from the projections for PR-NMR is an ill-posed inverse problem, Bayesian inference has been used to overcome the problem in two ways: the pixel-by-pixel and the peak-by-peak approaches. In this paper, we extended and improved our previous peak-by-peak based reconstruction algorithm which employs reversible jump Markov chain Monte Carlo (RJMCMC). First of all, we introduced the linear mixed model with random effect to explain the projection noise that might exist in each projection of PR-NMR. Second, we added an extra procedure to estimate the standard deviations of noises that were assumed known in the previous work. Lastly, we have introduced a uniﬁed and simpliﬁed form for both a simple linear model and a linear mixed model. Therefore, we do not need to change and can reuse the previous systems of PR-NMR. 3 Convergence is asserted provided all the PSRF values fall below 1.2, a value that is known to work well in practice [38]. If this criterion is not satisﬁed for all estimands, then it is indicative of a lack of convergence and the simulation run needs to be lengthened.

Acknowledgments This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT and Future Planning (NRF-2013R1A1A1012797). The author is also supported by new faculty research grant from Korea University. The author sincerely thanks to Prof. R. Freeman who introduced PR-NMR and showed it to the author.

Appendix A. Evaluation metrics to calculate the similarity between desired ground truth spectra and reconstructed spectra by algorithms Let Rn and R^ be the reconstructed multidimensional spectra by a particular reconstruction method and its desired target spectra respectively.

Root mean square error (RMSE): this is one of the simplest metrics for image quality assessment and it is the averaging

J.W. Yoon / Computers in Biology and Medicine 54 (2014) 89–99

value of the squared intensity differences of R n and R^ by qﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃﬃ RMSEðRn Þ ¼ EððRn R^ Þ2 Þ.

Kullback Leibler distance (KLD): Let p ^ and pR be the true R n

distribution and the calculated distribution of the multidimensional spectra R respectively by KLDðR n Þ ¼ ∑i pR^ ðiÞlog pR^ ðiÞ=pRn ðiÞ where the quantiﬁcation in this paper is done with N b ¼ 10 bins. Peak signal to noise ratio (PSNR): this is also a well-known quality metric by PSNRðR n Þ ¼ 20 log 10 ðmaxðR n Þ=RMSEðRn ÞÞ where maxðR n Þ is the maximum intensity of the reconstruction spectra. Structural similarity (SSIM): the other metric is structural similarity based on image quality assessment, which is well known in the computer vision and image processing society. This metric is able to explain the strong dependencies between pixels of highly structured spectra [39]. SSIMðRn Þ ¼

ð2μRn μR^ þC 1 Þð2σ Rn R^ þC 2 Þ ðμ2Rn þ μ2^ þC 1 Þðσ 2Rn þ σ 2^ þ C 2 Þ R

ðA:1Þ

R

where

"

#

μR ¼ ∑f ni =N; μR^ ¼ ∑f^i =N; σ R ¼ ∑ðf ni μR Þ2 =N 1=2; n

n

"

i

i

σ R^ ¼ ∑ðf^i μR^ Þ2 =N i

n

and

i

#1=2 ;

and σ Rn R^ ¼ ∑ðf i μRn Þðf^i μR^ Þ=N: n

i

For simplicity, it is common to set C 1 ¼ C 2 ¼ 0, known as the “universal quality index (UQI)” [40].

References [1] J. Yoon, S.J. Godsill, Bayesian Inference for Multidimensional NMR image reconstruction, in: European Signal Processing Conference (EUSIPCO 2006), Florence, Italy, September 2006. [2] J. Yoon, S.J. Godsill, E. Kupce, R. Freeman, Deterministic and statistical methods for reconstructing multidimensional NMR spectra, Magn. Reson. Chem. 44 (March (3)) (2006) 197–209. [3] M. Mobli, A.S. Stern, J.C. Hoch, Spectral reconstruction methods in fast NMR: reduced dimensionality, random sampling and maximum entropy, J. Magn. Reson. 182 (1) (2006) 96–105. [4] D.J. Holland, M.J. Bostock, L.F. Gladden, D. Nietlispach, Fast multidimensional NMR spectroscopy using compressed sensing, Ang. Chem. Int. Ed. 50 (29) (2011) 6548–6551. [5] K. Kazimierczuk, V.Y. Orekhov, A comparison of convex and non-convex compressed sensing applied to multidimensional NMR, J. Magn. Reson. 223C (2012). [6] V.Y. Orekhov, V.A. Jaravine, Analysis of non-uniformly sampled spectra with multi-dimensional decomposition, Prog. Nucl. Magn. Reson. Spectrosc. 59 (3) (2011) 271–292. [7] J.C. Hoch, M.W. Maciejewski, M. Mobli, A.D. Schuyler, A.S. Stern, Nonuniform sampling and maximum entropy reconstruction in multidimensional NMR, Acc. Chem. Res. 47 (2) (2014) 708–717. [8] G. Bodenhausen, R.R. Ernst, Direct determination of rate constants of slow dynamic processes by two-dimensional accordion spectroscopy in nuclear magnetic resonance, J. Am. Chem. Soc. 104 (1982) 1304–1309. [9] S. Kim, T. Szyperski, GFT, NMR a new approach to rapidly obtain precise highdimensional NMR Spectral information, J. Am. Chem. Soc. 125 (2003) 1385–1393. [10] E. Kupce, R. Freeman, Projection-reconstruction of three-dimensional NMR spectra, J. Am. Chem. Soc. 125 (46) (2003) 13958–13959. [11] R. Freeman, E. Kupce, Distant echoes of the accordion—reduced dimensionality, GFT-NMR, and projection-reconstruction of multidimensional spectra, Concepts Magn. Reson. A 23A (2004). [12] E. Kupce, R. Freeman, Projection-reconstruction technique for speeding up multidimensional NMR spectroscopy, J. Am. Chem. Soc. 126 (January) (2004) 6429–6440. [13] E. Kupce, R. Freeman, Fast multidimensional NMR spectroscopy by the projection-reconstruction technique, Spectroscopy, October 2004. [14] J. Yoon, S.P. Wilson, K.H. Mok, A highly efﬁcient blocked Gibbs sampler reconstruction of multidimensional NMR spectra, in: Yee W. Teh, D.M. Titterington (eds), Journal of Machine Learning Research—Workshop and Conference Proceedings, in: Proceedings of the Thirteenth International Conference on Artiﬁcial Intelligence and Statistics (AISTATS), vol. 9, 2010, 940–947.

99

[15] N. Denisova, A maximum a posteriori reconstruction method for plasma tomography, Plasma Sources Sci. Technol. 13 (2004) 531–536. [16] J. Nuyts, K. Baete, D. Beque, P. Dupont, Comparison between MAP and postprocessed ML for image reconstruction in emission tomography when anatomical knowledge is available, IEEE Trans. Med. Imaging 24 (May (5)) (2005) 667–675. [17] T. Georgiou, P.J. Olver, A. Tannenbaum, Maximal Entropy for Reconstruction of Back Projection Images, Mathematical Methods in Computer Vision, IMA Volumes in Mathematics and its Applications, vol. 133, Springer-Verlag, New York, 2003, pp. 57–64. [18] C.R. Smith, W.T. Grandy, Maximum-Entropy and Bayesian Methods in Inverse Problems, D. Reidel Publishing Company, The Kluwer Academic Publisher Group, Dordrecht/Boston/Lancaser, 1985. [19] N.M. Balsgart, T. Vosegaard, Fast forward maximum entropy reconstruction of sparsely sampled data, J. Magn. Reson. 223 (2012) 164–169. [20] B. Simon, M. Sattler, Speeding up biomolecular NMR spectroscopy, Angew. Chem. Int. Ed. 43 (2004) 782–786. [21] Brian E. Coggins, Ronald A. Venters, Pei Zhou, Radial sampling for fast NMR: concepts and practices over three decades, Prog. Nucl. Magn. Reson. Spectrosc. 57 (4) (2010) 381–419. [22] H.R. Eghbalnia, A. Bahrami, M. Tonelli, K. Hallenga, J.L. Markley, Highresolution iterative frequency identiﬁcation for nmr as a general strategy for multidimensional data collection, J. Am. Chem. Soc. 127 (36) (2005) 12528–12536. [23] M. Maciejewski, M. Mobli, A. Schuyler, A.S. Stern, J.C. Hoch, Data sampling in multidimensional NMR: fundamentals and strategies, in: M. Billeter, V. Orekhov (Eds.), Novel Sampling Approaches in Higher Dimensional, NMR Topics in Current Chemistry, vol. 316, Springer, Berlin Heidelberg, 2012, pp. 49–77. [24] M. Mobli, M.W. Maciejewski, A.D. Schuyler, A.S. Stern, J.C. Hoch, Sparse sampling methods in multidimensional nmr, Phys. Chem. Chem. Phys. 14 (2012) 10835–10843. [25] S. Gradmann, C. Ader, I. Heinrich, D. Nand, M. Dittmann, A. Cukkemane, M. van Dijk, A. Bonvin, M. Engelhard, M. Baldus, Rapid prediction of multidimensional NMR data sets, J. Biomol. NMR 54 (4) (2012) 377–387. [26] J.A. Hoeting, D. Madigan, A.E. Raftery, C.T. Volinsky, Bayesian model averaging, Stat. Sci. 14 (4) (1999) 382–417. [27] W.J. Fitzgerald, S.J. Godsill, A.C. Kokaram, A.J. Stark, Bayesian Methods in Signal and Image Processing, in Bayesian Statistics 6, Oxford University Press, 1999, pp. 239–254. [28] Reversible Jump Markov Chain Monte Carlo for Unsupervised MRF Color Image Segmentation. BMVC 2004, 2004. [29] G.S. Razul, W.J. Fitzgerald, C. Andrieu, Bayesian model selection and parameter estimation of nuclear emission spectra using RJMCMC, Nucl. Inst. Methods Phys. Res. A 497 (2–3) (2003) 492–510. [30] P.J. Green, Reversible jump markov chain monte carlo computation and bayesian model determination, Biometrika (1995) 711–732. [31] A.S. Stern, D.L. Donoho, J.C. Hoch, NMR data processing using iterative thresholding and minimum l1-norm reconstruction, J. Magn. Reson. 188 (2) (2007) 295–300. [32] E. Kupce, R. Freeman, The radon transform: a new scheme for fast multidimensional NMR, Concepts Magn. Reson. 22A (December) (2003) 4–11. [33] P.P. Mondal, K. Rajan, Hybrid prior method for positron emission tomographic image reconstruction, in: ICISIP 2004. Intelligent Sensing and Information Processing, 2004. [34] G. Minerbo, MENT: a maximum entropy algorithm for reconstructing a source from projection data, Comput. Graph. Image Process. 10 (May (1)) (1979) 48–68. [35] M.L. Reis, N.C. Roberty, Maximum entropy algorithms for image reconstruction from projections, Inverse Probl. 8 (4) (1992) 623–644. [36] M.W. Maciejewski, M. Mobli, A.D. Schuyler, A.S. Stern, J.C. Hoch, Data sampling in multidimensional NMR: fundamentals and strategies, Top. Curr. Chem. 316 (2012) 23. [37] M. Mobli, M.W. Maciejewski, A.D. Schuyler, A.S. Stern, J.C. Hoch, Sparse sampling methods in multidimensional NMR, Phys. Chem. Chem. Phys. 14 (2012) 10835–10843. http://dx.doi.org/10.1039/C2CP40174F. [38] A. Gelman, D. Rubin, Inference from iterative simulation using multiple sequences, Stat. Sci. 7 (1992). [39] Z. Wang, A.C. Bovik, H.R. Sheikh, E.P. Simoncelli, Image quality assessment: from error visibility to structural similarity, IEEE Trans. Image Process. 13 (4) (2004) 600–612. [40] Z. Wang, A.C. Bovik, A universal image quality index, IEEE Signal Process. Lett. 9 (3) (2002) 81–84.

Ji Won Yoon received the B.Sc. degree in information engineering at the SungKyunKwan University, Korea. He obtained the M.Sc. degree in School of informatics at the University of Edinburgh UK in 2004 and the Ph.D. degree in signal processing group at the University of Cambridge UK in 2008 respectively. After doing post-doctoral research at robotics group, the University of Oxford and statistics group, Trinity College Dublin, he moved to IBM Research as a research scientist to do BigData analysis. Since 2012 he is an assistant professor at Korea University. His current research is Bayesian statistics, Bioinformatics, and cyber defense.

Bayesian reconstruction of projection reconstruction NMR (PR-NMR)

Bayesian reconstruction of projection reconstruction NMR (PR-NMR)

Recommend Documents