Metabolic Engineering 3, 322–343 (2001) doi:10.1006/mben.2001.0193, available online at http://www.idealibrary.com
Innovations in Generation and Analysis of 2D [13C, 1H] COSY NMR Spectra for Metabolic Flux Analysis Purposes

Wouter van Winden,*,†,1 Dick Schipper, Peter Verheijen,† and Joseph Heijnen*

*Bioprocestechnology Group and †Process Systems Engineering Group, Faculty of Applied Sciences, Delft University of Technology, The Netherlands; and Beijerinck Laboratory, DSM Research, Delft, The Netherlands

Received February 23, 2001; accepted May 16, 2001; published online August 6, 2001
2D [13C, 1H] COSY NMR is used by the metabolic engineering community for determining 13C–13C connectivities in intracellular compounds that contain information regarding the steady-state fluxes in cellular metabolism. This paper proposes innovations in the generation and analysis of these specific NMR spectra. These include a computer tool that allows accurate determination of the relative peak areas and their complete covariance matrices even in very complex spectra. Additionally, a method is introduced for correcting the results for isotopic non-steady-state conditions. The proposed methods were applied to measured 2D [13C, 1H] COSY NMR spectra. Peak intensities in a one-dimensional section of the spectrum are frequently not representative of relative peak volumes in the two-dimensional spectrum. It is shown that for some spectra a significant amount of additional information can be gained from long-range 13C–13C scalar couplings in 2D [13C, 1H] COSY NMR spectra. Finally, the NMR resolution enhancement by dissolving amino acid derivatives in a nonpolar solvent is demonstrated. © 2001 Academic Press
1 To whom correspondence and reprint requests should be addressed at Kluyver Laboratory for Biotechnology, Julianalaan 67, 2628 BC Delft, The Netherlands. Fax: +31-15-278-5307-2355. E-mail: W.A.VanWinden@tnw.tudelft.nl.

1096-7176/01 $35.00. Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved.

INTRODUCTION

For several decades labeling experiments have been used to analyze fluxes in the carbon metabolism of the cell that cannot be determined from net consumption or production rates alone, such as the fluxes through parallel or cyclic reactions. Tracer atoms that are commonly used in these studies are either the radioactive carbon isotope 14C or the stable isotope 13C. The corresponding measurement methods are scintillation counting, nuclear magnetic resonance spectroscopy (NMR), and mass spectrometry (MS). As for NMR, the most frequently reported method is the measurement of fractional enrichments of specific carbon atoms (Portais et al., 1993; Sonntag et al., 1993; Marx et al., 1996; De Graaf et al., 1999). In several studies of the TCA cycle, NMR measurements of fractional enrichments were complemented by measurements of intensities of fine structures in 13C spectra that are caused by adjacent 13C atoms (Malloy et al., 1987; Tran-Dinh et al., 1996; Jucker et al., 1998).

Szyperski (1995) proposed the use of mixtures of nonlabeled and uniformly 13C-labeled substrates in metabolic flux analysis. The use of these substrates leads to identical fractional enrichments for all carbon atoms. Therefore, the information to be gained from these experiments resides solely in the 13C–13C connectivities. These connectivities are derived from relative intensities of fine structures in 13C spectra that are obtained by means of 2D [13C, 1H] COSY NMR. These isotopomer measurements potentially constitute a much richer source of information than fractional enrichments.

Recently, software allowing the simulation of complete isotopomer distributions has made it possible to simultaneously estimate all metabolic fluxes from any type of labeling information (Schmidt et al., 1997). Combined with the application of mixtures of specifically and uniformly labeled substrates, this opened the possibility of generating and analyzing rich sets of both fractional enrichment data and relative fine structure intensity measurements for flux estimation (Schmidt et al., 1999).

The first step in determining fluxes from NMR data is the spectral analysis. This step partly determines the accuracies of the estimated fluxes and yields the information that is needed to determine their confidence intervals. Relatively little attention has been paid to this spectral analysis step in the metabolic engineering literature. This paper summarizes and extends current knowledge of a specific type of NMR spectrum, namely the 2D [13C, 1H] COSY spectrum. The presented innovations in the spectral analysis include:

• The introduction of a set of linear constraints on the positions of spectral peaks. These constraints yield a small set of parameters that accurately predict all peak positions and allow cross-checking of the assignment of peaks to their corresponding fine structures in complex spectra;
• The combination of the set of positional parameters with a flexible mathematical description of NMR lineshapes that is sparse in parameters. The resulting spectral model allows a nonlinear fit of measured NMR spectra that is both very fast and flexible enough to yield only small and randomly distributed residuals;
• A sensitivity analysis and error propagation calculation that yields the complete covariance matrix of the estimated relative intensities based on an estimation of the spectral noise;
• An extension of existing isotopic non-steady-state corrections for the case of relative fine structure intensities; and
• A calculation of the amount of additional information to be gained from long-range 13C–13C scalar couplings in 2D [13C, 1H] COSY NMR spectra.

Nonlinear fitting of 2D [13C, 1H] COSY spectra is not new. Szyperski et al. (1999) reported the development of a specialized software package, FCAL, for this purpose. However, this package will not be available to many workers in the field and the underlying calculations and assumptions were not published. This paper offers a systematic procedure for nonlinear spectral fitting that can be applied by anyone interested in maximizing the quality and quantity of data extracted from their spectra.

The proposed methods are applied and verified using two extensive sets of measured 2D [13C, 1H] COSY NMR spectra. Furthermore, it is checked whether peak intensities in a one-dimensional section of a 2D [13C, 1H] COSY NMR spectrum are representative of relative peak volumes in the two-dimensional spectrum. Finally, the NMR resolution enhancement by using amino acid derivatives that are soluble in less polar solvents is demonstrated.

THEORY

Single Section versus Sum of Sections

Measuring 13C labeling by means of the 2D [13C, 1H] COSY NMR technique results in a spectrum that gives the NMR signal intensities of the various carbon fine structures versus their 13C and 1H frequencies (see Fig. 1A). Separation of cellular components prior to these NMR measurements is not strictly needed, because most of the carbon atoms of the different cellular components have their own unique set of 13C and 1H chemical shifts (Szyperski et al., 1996). These unique coordinates make it possible to assign all the NMR signals to the corresponding carbon atoms. Spectral analysis is made easier by making a one-dimensional section of a 2D [13C, 1H] COSY NMR spectrum parallel to the 13C frequency axis (see Fig. 1B). If the relative areas of the various peaks in the section are independent of the 1H frequency at which the section is made, these relative areas can be used instead of the relative volumes to calculate the relative intensities of the peaks. However, if the relative peak areas in the section depend on the 1H frequency, a one-dimensional plot should be made by summing the sections at successive 1H frequencies (Szyperski, 1995).

Computer Tools for Improved Spectral Analysis
The relative peak areas (or relative intensities) in one-dimensional sections of 2D [13C, 1H] COSY spectra correspond to (groups of) isotopomers (see Fig. 2) and can be used for flux analysis (Szyperski, 1995; Schmidt et al., 1999; Petersen et al., 2000). When peak areas are determined manually by indicating the lower and upper frequencies between which the signal area is to be integrated, the following problems are encountered:

• Determining the beginning and end of separate peaks introduces a subjective element in the determination of the areas. Moreover, it is often difficult or even impossible to do so due to overlapping peaks. If only some of the peaks of a fine structure overlap, one can integrate the areas of the nonoverlapping peaks and correct these for the nonobservable peaks. This will, however, increase the error in the determined area by the same correction factor;
• In complex spectra including many (overlapping) peaks it is difficult to assign the various peaks to the fine structures. This is especially the case in spectra showing long-range couplings (see, e.g., Fig. 5); and
• Some of the amino acids in biomass lysates that are subjected to NMR analysis are present at low concentrations, e.g., phenylalanine constitutes less than 4% of the cellular protein in Escherichia coli, Penicillium chrysogenum, and Saccharomyces cerevisiae (Stephanopoulos et al., 1998). Besides, labeling experiments for 2D [13C, 1H] COSY NMR analysis are usually performed with low fractions of labeled substrate [e.g., 10% fully labeled glucose (Szyperski, 1995) or 100% glucose labeled at one single position (Marx et al., 1996)]. Both factors result in low signal-to-noise ratios, which makes all of the above problems even worse.

For the reasons stated above, computer-aided methods are required to identify and disentangle overlapping peaks (Wittig et al., 1995).

Multiple techniques and software packages for (automated) analysis of NMR spectra have been developed in the past (e.g., Bartels et al., 1997; Ge et al., 1993; Stephenson and Binsch, 1980; Gunther et al., 2000; Szyperski et al., 1999). These packages are generally good at analyzing NMR spectra of all sorts based on little to no user-supplied information. Trained NMR operators applying 2D [13C, 1H] COSY NMR to determine labeling patterns of biomass components will have no problems in assigning the signals in a spectrum to the various amino acids and carbohydrate-related compounds. They only need computer support for accurate determination of the signal intensities of the specific compounds. Ideally such a computer tool contains prior information about the specific characteristics of 2D [13C, 1H] COSY NMR spectra, such as lineshapes, positions of fine structure peaks with respect to those of other fine structures, and constraints imposing equal height on peaks of one fine structure (Wittig et al., 1995). Such prior information helps to prevent erroneous peak assignments and helps the algorithm to converge quickly to an optimal fit of the spectrum. The following section extends the prior information and constraints imposed on fits of 2D [13C, 1H] COSY NMR spectra that were proposed by Wittig et al. (1995).

FIG. 1. (A) Schematic example of a 2D [13C, 1H] COSY spectrum showing the intensities of the singlet and doublet fine structures of a carbon atom. (B) A section parallel to the 13C frequency axis of the 2D spectrum and the resulting 1D multiplet.

Spectral Model

Lineshapes of one-dimensional NMR peaks are not purely Gaussian or Lorentzian (Marshall et al., 2000). The Voigt function accurately describes experimental lineshapes, but cannot be calculated analytically. That is why several approximations of the Voigt lineshape are in use. Examples include the sum or product of a Lorentzian and a Gaussian lineshape (Stephenson and Binsch, 1980; Wittig et al., 1995; Conny and Powell, 2000) or the Pearson VII lineshape (Subhash and Mohanan, 1997). The Pearson VII lineshape [Eq. (1)] was chosen as the basis of our spectral model because it is sparse in parameters and still approximates the Voigtian lineshape very well.
FIG. 2. The fine structures that can be distinguished in a one-dimensional section of a 2D [13C, 1H] COSY spectrum with only two one-bond 13C–13C scalar couplings (left) and with an additionally observable long-range 13C–13C scalar coupling (right). The isotopomer groups causing the various fine structures are shown in the middle. The respective summed spectra are shown below.

$$
S_i(\omega_{^{13}\mathrm{C}}) = \frac{h_i}{\left[\dfrac{1}{p}\cdot\left(\dfrac{\omega_{^{13}\mathrm{C}} - \omega_{^{13}\mathrm{C}}^{\max,i}}{w_i}\right)^{2} + 1\right]^{p}} \tag{1}
$$
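As an illustration of Eq. (1), the following minimal Python sketch evaluates the Pearson VII lineshape and compares it with its two limiting cases. The function names (`pearson_vii`, `lorentzian`, `gaussian`, `fwhm`) and the numerical values are illustrative assumptions and not part of the authors' software. The `fwhm` helper follows from setting Eq. (1) equal to half the peak height.

```python
import math


def pearson_vii(omega, height, centroid, width, p):
    """Pearson VII lineshape of Eq. (1).

    height / ((1/p) * ((omega - centroid) / width)**2 + 1)**p
    """
    x = (omega - centroid) / width
    return height / (x * x / p + 1.0) ** p


def lorentzian(omega, height, centroid, width):
    """Special case p = 1 of Eq. (1)."""
    x = (omega - centroid) / width
    return height / (1.0 + x * x)


def gaussian(omega, height, centroid, width):
    """Limit p -> infinity of Eq. (1): height * exp(-x**2)."""
    x = (omega - centroid) / width
    return height * math.exp(-x * x)


def fwhm(width, p):
    """Full width at half maximum implied by Eq. (1).

    Solving S = height / 2 gives x**2 = p * (2**(1/p) - 1).
    """
    return 2.0 * width * math.sqrt(p * (2.0 ** (1.0 / p) - 1.0))


if __name__ == "__main__":
    # Hypothetical peak with unit height and unit width parameter.
    for p in (1.0, 2.0, 10.0, 1e6):
        print(p, pearson_vii(1.0, 1.0, 0.0, 1.0, p), fwhm(1.0, p))
```

In this formulation the width parameter equals half the full width at half maximum only for p = 1; for other values of p the two differ by the factor computed in `fwhm`.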
In Eq. (1) the symbol $S_i$ represents the NMR signal size of peak $i$ at frequency $\omega_{^{13}\mathrm{C}}$ (i.e., the horizontal position in the 13C dimension, see Fig. 1B). The Pearson VII lineshape function contains four parameters: $h_i$ is the maximal signal height, $\omega_{^{13}\mathrm{C}}^{\max,i}$ is the peak centroid, $w_i$ represents the linewidth (note: $w_i$ does not equal the full width at half-maximum in the present formulation of the Pearson VII function), and $p$ determines the lineshape. Equation (1) represents the Lorentzian lineshape for $p = 1$ and equals the Gaussian lineshape in the limit $p \to \infty$.

The number of peaks in the measured multiplet of a given 13C atom is a function of the number of its 13C–13C scalar couplings with other carbon atoms. If there are no 13C–13C scalar couplings, there is only one singlet. Every additional 13C–13C scalar coupling adds to each existing fine structure a split counterpart with twice the number of peaks. Consequently, a multiplet of an atom with $m$ 13C–13C scalar couplings consists of $3^m$ peaks divided over $2^m$ fine structures. The peak multiplicities of the fine structures that compose multiplets with 0, 1, 2, and 3 scalar couplings increase from [1] to [1, 2], [1, 2, 2, 4], and [1, 2, 2, 2, 4, 4, 4, 8]. Note that the numbers of elements of these sets indeed equal $2^m$ and their sums equal $3^m$.

The peaks belonging to a fine structure are identical. This considerably reduces the number of independent parameters needed to describe a multiplet. Only $2^m$ linewidth parameters and $2^m$ height parameters are needed to describe a multiplet consisting of $3^m$ peaks. Assuming that the lineshape of all the peaks in a multiplet is identical, only one parameter $p$ [see Eq. (1)] is needed.

The positions of the peaks in a multiplet are linearly constrained. Every 13C–13C scalar coupling splits a fine structure into a fine structure of which the peaks are shifted with respect to the "parent" peaks by plus and minus a given 13C frequency, called the "scalar coupling constant" and denoted by $J_{CC}$ (Szyperski, 1995). Consider Fig. 3A showing the multiplet of carbon atom $C_\beta$ that has one-bond 13C–13C scalar couplings with carbon atoms $C_\alpha$ and $C_\gamma$. The coupling constants are $J_{C_\beta C_\alpha}$ and $J_{C_\beta C_\gamma}$. The frequency of the central singlet peak of $C_\beta$ is $\omega_{^{13}\mathrm{C}}^{\max,s}$. Whereas the 13C–13C scalar couplings yield a symmetrical multiplet, 13C isotope effects (Hansen, 1988) cause asymmetry by slightly shifting all the peaks in the same (lowfield) direction. Consider a multiplet of carbon atom $C_\alpha$ that has a scalar coupling $J_{C_\alpha C_\beta}$ with $C_\beta$ and an isotope effect of the size $T_{C_\alpha C_\beta}$. The resulting multiplet is shown in Fig. 3B.

FIG. 3. (A) The positions of the peaks in a multiplet of carbon atom $C_\beta$, which are split by the 13C–13C scalar coupling with carbon atoms $C_\alpha$ and $C_\gamma$. (B) The positions of the peaks in a multiplet of carbon atom $C_\alpha$, which are split by the 13C–13C scalar coupling with carbon atom $C_\beta$ and shifted by an isotope effect $T_{C_\alpha C_\beta}$.

All the $3^m$ peak positions in a multiplet can be described by the central peak position of the singlet, $m$ scalar couplings, and $m$ isotope effects. The following equation shows how 1 + 2 + 2 = 5 parameters describe the central positions of $3^2 = 9$ peaks in a multiplet with two different 13C–13C scalar coupling constants:

$$
\begin{pmatrix}
1 & 0 & 0 & 0 & 0\\
1 & -1 & 0 & 1 & 0\\
1 & 1 & 0 & 1 & 0\\
1 & 0 & -1 & 0 & 1\\
1 & 0 & 1 & 0 & 1\\
1 & -1 & -1 & 1 & 1\\
1 & 1 & -1 & 1 & 1\\
1 & -1 & 1 & 1 & 1\\
1 & 1 & 1 & 1 & 1
\end{pmatrix}
\cdot
\begin{pmatrix}
\omega_{^{13}\mathrm{C}}^{\max,s}\\
J_{C_\beta C_\alpha}\\
J_{C_\beta C_\gamma}\\
T_{C_\beta C_\alpha}\\
T_{C_\beta C_\gamma}
\end{pmatrix}
=
\begin{pmatrix}
\omega_{^{13}\mathrm{C}}^{\max,s}\\
\omega_{^{13}\mathrm{C}}^{\max,d1,1}\\
\omega_{^{13}\mathrm{C}}^{\max,d1,2}\\
\omega_{^{13}\mathrm{C}}^{\max,d2,1}\\
\omega_{^{13}\mathrm{C}}^{\max,d2,2}\\
\omega_{^{13}\mathrm{C}}^{\max,dd,1}\\
\omega_{^{13}\mathrm{C}}^{\max,dd,2}\\
\omega_{^{13}\mathrm{C}}^{\max,dd,3}\\
\omega_{^{13}\mathrm{C}}^{\max,dd,4}
\end{pmatrix}
\tag{2}
$$
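The linear position model of Eq. (2) can be sketched in a few lines of Python. The block below is only an illustration: the names `DESIGN`, `peak_positions`, and `positional_parameters` and all numerical values are hypothetical, and the coupling constants are taken as the peak displacements that appear in Eq. (2). The second function inverts the first five rows of the matrix, so the singlet and the two doublets suffice to recover all five positional parameters.

```python
# Rows of the matrix in Eq. (2).
# Columns: singlet centroid, J(beta-alpha), J(beta-gamma),
#          T(beta-alpha), T(beta-gamma).
DESIGN = [
    (1, 0, 0, 0, 0),    # singlet
    (1, -1, 0, 1, 0),   # doublet 1, peak 1
    (1, 1, 0, 1, 0),    # doublet 1, peak 2
    (1, 0, -1, 0, 1),   # doublet 2, peak 1
    (1, 0, 1, 0, 1),    # doublet 2, peak 2
    (1, -1, -1, 1, 1),  # double doublet, peak 1
    (1, 1, -1, 1, 1),   # double doublet, peak 2
    (1, -1, 1, 1, 1),   # double doublet, peak 3
    (1, 1, 1, 1, 1),    # double doublet, peak 4
]


def peak_positions(centroid, j1, j2, t1, t2):
    """Return the nine peak positions of Eq. (2) for m = 2 couplings."""
    params = (centroid, j1, j2, t1, t2)
    return [sum(a * b for a, b in zip(row, params)) for row in DESIGN]


def positional_parameters(singlet, doublet1, doublet2):
    """Invert the first five rows of Eq. (2).

    doublet1 and doublet2 are (low, high) pairs of peak positions.
    """
    j1 = (doublet1[1] - doublet1[0]) / 2.0
    t1 = (doublet1[0] + doublet1[1]) / 2.0 - singlet
    j2 = (doublet2[1] - doublet2[0]) / 2.0
    t2 = (doublet2[0] + doublet2[1]) / 2.0 - singlet
    return singlet, j1, j2, t1, t2


if __name__ == "__main__":
    # Hypothetical values in arbitrary frequency units.
    positions = peak_positions(100.0, 17.0, 26.0, -0.5, -0.25)
    params = positional_parameters(
        positions[0], (positions[1], positions[2]), (positions[3], positions[4])
    )
    # The four double doublet positions follow from the recovered parameters.
    print(params)
    print(peak_positions(*params)[5:])
```

Comparing the predicted double doublet positions with the observed ones gives a simple consistency check on the manual peak assignment.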
We can thus conclude that whereas four parameters are needed to describe one NMR peak, a multiplet consisting of $3^m$ peaks may be described by $2 + 2^{m+1} + 2m$ parameters (1 power, $2^m$ heights, $2^m$ linewidths, 1 peak centroid, $m$ coupling constants, and $m$ isotope effects). This is a serious reduction: e.g., a multiplet consisting of 9 peaks ($m = 2$) can be described by 14 parameters instead of 36. This reduction is valuable when fitting a multiplet because it significantly reduces the parameter estimation effort.

Initialization of the Fit

The spectral model is fitted to an experimentally determined multiplet by minimizing the sum of squared residuals. This is an ill-posed nonlinear optimization problem in the sense that the optimization procedure does not know in what direction to search if the initially guessed peaks have no overlap with the true ones. In the worst case a suboptimal local minimum is found due to the fitting of minor noise peaks. Stephenson and Binsch (1980) discussed a solution for this problem that enables automatic peak finding, but it only rigorously applies to a multiplet of one peak. We have opted for a hybrid approach. The positions of some peaks can be uniquely detected by the operator. As shown in the previous section, once $2m + 1$ of a total of $3^m$ peak positions are known, the remaining ones can be estimated automatically. Adding an initial estimate for the linewidth of the peaks and initially assuming a Lorentzian lineshape [$p = 1$ in Eq. (1)] suffices to make the fitting procedure converge quickly to a global minimum.

By checking whether the $3^m$ estimated peak positions approximately correspond with the peaks in the measured multiplet, one verifies whether the manually indicated peak positions indeed correspond to the fine structures they were assumed to correspond with. Consider for example the case of Eq. (2). The five positional parameters can be obtained by inverting the equation and manually indicating the positions of the five peaks of the singlet and doublets 1 and 2. Based on the positional parameters one can calculate the positions of the remaining four double doublet peaks. If one of the double doublet peaks was incorrectly indicated as a doublet peak, the positional parameters will be wrong and the calculated positions of the double doublet peaks will not correspond with the experimentally observed ones. This method is of great help in correctly assigning all peaks to their corresponding fine structures in complex spectra.

Estimation of Errors in Relative Intensities

Only a few papers are available in which relative fine structure intensity data are presented together with their standard deviations. In these papers standard deviations are derived from multiple measured data sets (Malloy et al., 1987; Jucker et al., 1998; Schmidt et al., 1999), from comparison of redundant measurements and visual inspection of the signal-to-noise ratio (Schmidt et al., 1999), or from the outcomes of the application of various filtering functions (exponential, Gaussian, or mixed exponential-Gaussian line broadening) and methods to determine the peak area (simple integration or peak fitting) of a single NMR data set (Petersen et al., 2000; A. A. de Graaf, personal communication). A great advantage of the peak-fitting procedure discussed in the previous section is that the residual spectrum of the measured ($S_{meas}$) and optimally fitted ($S_{fit}$) spectra can be used to estimate the NMR noise and thus the measurement error from one single experiment. This method is based on the assumption that all NMR signal intensities that are measured along the $\omega_{^{13}\mathrm{C}}$ axis have measurement errors with a uniform variance (i.e., the homoscedasticity assumption applies). The error variance ($\sigma^2$) can be estimated using Eq. (3):

$$
\sigma^2 = \frac{(S_{meas} - S_{fit})^{T}\cdot(S_{meas} - S_{fit})}{n - p}, \tag{3}
$$

where $n$ is the number of spectral data (i.e., the length of the vector $S$) and $p$ is the total number of spectral parameters (not to be confused with the lineshape parameter $p$ of Eq. (1)). In Appendix A it is explained how the error variance of a spectrum can be used to calculate the covariance matrix ($C_{RI}$) of the relative intensities. Although standard deviations of relative fine structure intensities have been published before, the corresponding covariances have, to our knowledge, never been determined or taken into account. Only by taking into account covariances (i.e., the known correlations between dependent relative intensities) can one perform a correct statistical test of the differences between two data sets.

Isotopic Non-Steady-State Correction of Fractional Enrichment and Fine Structure Intensities

Labeling experiments are usually started by establishing a metabolic steady state in a continuous culture growing on unlabeled substrate. Subsequently, the medium is replaced by medium containing 13C-labeled substrate. Both media must be identical except for the isotopic composition of the carbon substrate so as not to disturb the metabolic steady state. In continuous cultures where the labeled substrate is limiting, the substrate concentration in the fermentor is so low that the switch to labeled medium leads to a stepwise onset of the import of labeled substrate into the cells. Because the intracellular metabolite levels are generally low,
it may be assumed that the metabolites immediately reach an isotopic steady state. This results in a stepwise onset of the accumulation of label in the biomass. This assumption was checked against literature data on the metabolite concentrations and fluxes in glycolysis (Theobald et al., 1997; Rizzi et al., 1997) and the pentose phosphate pathway (Vaseghi et al., 1999) and the concentration of and flux through the combined α-ketoglutarate/glutamate pool (Ter Schure et al., 1995; Rizzi et al., 1997), which may be considered to be in isotopic equilibrium (Tran-Dinh et al., 1996). All literature data apply to S. cerevisiae growing under conditions very similar to our experiments. Based on these data we calculated that by far the slowest wash-in of label in the intermediate pools is approximately 0.63 × 10⁻³ s⁻¹, for the α-ketoglutarate/glutamate pool. This rate is 22 times larger than the turnover rate of the biomass (0.1 h⁻¹ = 0.028 × 10⁻³ s⁻¹), which leads us to conclude that our assumption of a stepwise onset of the accumulation of label in the biomass is valid.

Under these assumptions the labeling of biomass follows first-order wash-in kinetics. After a number of residence times of growth on labeled medium the biomass is harvested in order to determine the label distribution in the biomass. Theoretically, the isotopic steady state of the biomass components is only reached after an infinite number of residence times. Therefore, experimental labeling data must be corrected for the deviation from the isotopic steady state at the time of harvesting the biomass (Marx et al., 1996; Szyperski, 1998; Möllney et al., 1999). The fractional enrichment vector $x$ of biomass component X at isotopic steady state ($t = \infty$) can be calculated from the fractional enrichment vector that is measured at time $t$ by means of the following equation:

$$
x(\infty) = \frac{x(t) - e^{-\mu \cdot t}\cdot x(0)}{1 - e^{-\mu \cdot t}} \tag{4}
$$
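A minimal numerical sketch of Eq. (4) is given below, assuming that the specific growth rate `mu`, the labeling time `t`, and the enrichment vector at time zero `x_0` are known. The function names and the example enrichments are hypothetical. The growth rate of 0.1 h⁻¹ and the harvesting time of 11.8 h correspond to the S. cerevisiae culture described under Materials and Methods. The helper `wash_in` is simply Eq. (4) solved for x(t), i.e., the first-order wash-in kinetics assumed in the text.

```python
import math


def wash_in(x_inf, x_0, mu, t):
    """First-order wash-in: x(t) = exp(-mu*t)*x(0) + (1 - exp(-mu*t))*x(inf)."""
    decay = math.exp(-mu * t)
    return [decay * x0 + (1.0 - decay) * xi for xi, x0 in zip(x_inf, x_0)]


def steady_state_correction(x_t, x_0, mu, t):
    """Eq. (4): extrapolate enrichments measured at time t to t = infinity."""
    decay = math.exp(-mu * t)
    return [(xt - decay * x0) / (1.0 - decay) for xt, x0 in zip(x_t, x_0)]


if __name__ == "__main__":
    mu = 0.1           # specific growth rate, 1/h
    t = 11.8           # labeling time, h
    x_0 = [0.012, 0.012]   # natural labeling at t = 0
    x_inf = [0.10, 0.25]   # hypothetical steady-state enrichments
    x_t = wash_in(x_inf, x_0, mu, t)
    print(x_t)
    print(steady_state_correction(x_t, x_0, mu, t))
```

The round trip through `wash_in` and `steady_state_correction` returns the assumed steady-state values, which shows how strongly enrichments measured after roughly one residence time still deviate from the isotopic steady state.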
In Eq. (4), $\mu$ is the specific growth rate of the culture and the vector $x(0)$ is defined by the natural labeling:

$$
x(0) = P_n \cdot i, \tag{5}
$$

where $P_n$ represents the natural fractional labeling (≈0.012), and $i$ is a vector of the same dimension as $x$ containing only ones.

The isotopic non-steady-state correction of the isotopomer distributions of the biomass components is also given by Eq. (4) above. In this case vector $x$ represents the isotopomer distribution vector, and the $i$th element of vector $x(0)$ is given by

$$
x_i(0) = (P_n)^{L(x_i)} \cdot (1 - P_n)^{U(x_i)}, \tag{6}
$$

where $L(x_i)$ is the number of 13C atoms and $U(x_i)$ is the number of 12C atoms of the $i$th isotopomer of biomass component X. For example, the vector $x(0)$ of a two-carbon compound is $(x_{00}, x_{01}, x_{10}, x_{11})^T = (0.976, 0.012, 0.012, 0.000)^T$ (for an explanation of the binary subscripts, see Appendix B).

If the labeling data used are 2D [13C, 1H] COSY NMR spectra, Eq. (4) cannot be used for isotopic non-steady-state correction, since the NMR spectra seldom yield a complete isotopomer distribution vector. In other words, $x(t)$ is not known. As was discussed before, the data obtained from 2D [13C, 1H] COSY NMR spectra are the relative intensities of fine structures, which represent ratios of groups of isotopomer fractions (Szyperski, 1995). The relative intensity ($x_{i,f}(t)$) of the fine structure "f" (e.g., a singlet or doublet) of the $i$th carbon atom of biomass component X at time $t$ can be corrected for isotopic non-steady state as follows (derivation in Appendix B):

$$
x_{i,f}(\infty) = \frac{x_{i,f}(t)\cdot x_i(t) - x_{i,f}(0)\cdot e^{-\mu \cdot t}\cdot P_n}{x_i(t) - e^{-\mu \cdot t}\cdot P_n}. \tag{7}
$$

In the above equation, $x_i$ without the subscript "f" represents the fractional enrichment of the $i$th carbon atom of X. Thus, in order to correct the relative intensity of a fine structure of a carbon atom, the fractional enrichment of the atom at time $t$ must be known as well. When the labeled substrate that is applied is uniformly labeled, the fractional enrichment of each carbon atom is identical and can be calculated by applying Eq. (8), which is derived from Eq. (4). The applicability of this equation (and thereby of the presented isotopic non-steady-state correction) is limited to cases where the substrates are uniformly labeled.

$$
x_i(t) = e^{-\mu \cdot t}\cdot P_n + (1 - e^{-\mu \cdot t})\cdot P_s \tag{8}
$$

In this equation $P_s$ is the fraction of uniformly labeled substrate in the medium. Substituting this equation into Eq. (7) yields the following isotopic non-steady-state correction in case of uniform labeling:

$$
x_{i,f}(\infty) = \frac{x_{i,f}(t)\cdot\left(e^{-\mu \cdot t}\cdot P_n + (1 - e^{-\mu \cdot t})\cdot P_s\right) - x_{i,f}(0)\cdot e^{-\mu \cdot t}\cdot P_n}{(1 - e^{-\mu \cdot t})\cdot P_s}. \tag{9}
$$

The value of the relative intensity at time 0 ($x_{i,f}(0)$) in Eqs. (7) and (9) can be calculated using Eq. (6), where $L(x_i)$ and $U(x_i)$ are the numbers of 13C atoms and 12C atoms, respectively, that neighbor the observed carbon atom. In the case of superimposed fine structures due to identical 13C–13C scalar couplings, the separate relative intensities at time 0 should be calculated using Eq. (6) and summed afterward.
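The corrections of Eqs. (6), (8), and (9) can be sketched as follows. This is only an illustration under the assumptions stated in the text (uniformly labeled substrate, first-order wash-in). The function names and the example intensity are hypothetical; the values P_n = 0.012, P_s = 0.1, a growth rate of 0.1 h⁻¹, and a harvesting time of 11.8 h are taken from the text.

```python
import math
from itertools import product

P_N = 0.012  # natural fractional labeling used in the text


def natural_isotopomer_vector(n_carbons, p_n=P_N):
    """Eq. (6): x_i(0) = p_n**L * (1 - p_n)**U, ordered 00..0 to 11..1."""
    return [
        p_n ** sum(bits) * (1.0 - p_n) ** (n_carbons - sum(bits))
        for bits in product((0, 1), repeat=n_carbons)
    ]


def enrichment_uniform(mu, t, p_s, p_n=P_N):
    """Eq. (8): fractional enrichment at time t for uniform labeling."""
    decay = math.exp(-mu * t)
    return decay * p_n + (1.0 - decay) * p_s


def correct_relative_intensity(x_f_t, x_f_0, mu, t, p_s, p_n=P_N):
    """Eq. (9): relative intensity of a fine structure at isotopic steady state."""
    decay = math.exp(-mu * t)
    numerator = x_f_t * (decay * p_n + (1.0 - decay) * p_s) - x_f_0 * decay * p_n
    return numerator / ((1.0 - decay) * p_s)


if __name__ == "__main__":
    # Two-carbon example from the text, rounded to three decimals.
    print([round(x, 3) for x in natural_isotopomer_vector(2)])
    # Singlet of a terminal carbon: its single neighbor is 12C at t = 0
    # with probability (1 - P_N), following Eq. (6) applied to the neighbors.
    x_singlet_0 = 1.0 - P_N
    # Hypothetical measured singlet intensity of 0.60 after 11.8 h.
    print(correct_relative_intensity(0.60, x_singlet_0, mu=0.1, t=11.8, p_s=0.1))
```

For labeling times that are long compared with 1/μ the exponential terms vanish and the corrected value approaches the measured one, in line with the remark that the correction may then be neglected.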
The isotopic non-steady-state correction of the relative intensities must be taken into account in the estimation of the corresponding covariances. Under the assumption that the errors in $P_n$, $P_s$, $t$, and $\mu$ in Eq. (9) are negligible compared to the error in $x_{i,f}(t)$, the covariance matrix of the relative intensities in the multiplet of the $i$th carbon atom of compound X at harvesting time $t$ ($C_{RI}(x_{i,f}(t))$, see Appendix A) can be corrected as follows:

$$
C_{RI}(x_{i,f}(\infty)) = \left(\frac{e^{-\mu \cdot t}\cdot P_n + (1 - e^{-\mu \cdot t})\cdot P_s}{(1 - e^{-\mu \cdot t})\cdot P_s}\right)^{2}\cdot C_{RI}(x_{i,f}(t)). \tag{10}
$$

When the labeled substrate has been supplied much longer than the dilution time prior to biomass harvesting (i.e., $t \gg 1/\mu$), the isotopic non-steady-state correction only marginally improves the measured labeling data and may be neglected. However, for labeling times shorter than four times the residence time, neglecting the correction results in serious relative errors. A short supply of labeled substrate has several advantages. Often only small amounts of biomass are required for NMR analysis. An entire fermentor content of fully labeled biomass is seldom needed. Labeling costs may therefore be reduced by reducing either the fermentor volume (Schmidt et al., 1999) or the concentration of substrate in the medium (and thus of the biomass). Both solutions may cause changes in the metabolic fluxes that are studied. Using the normal fermentor size and substrate concentration and opting for a short label supply and isotopic non-steady-state correction instead does not have this disadvantage. Another advantage of short labeling is that the observed metabolic system needs to be kept in a well-defined stationary state for only a short period of time. This reduces the risk of disturbances of the steady state. Finally, if substrate labeling is supplied for a long time to a fermentor with a sufficiently large volume and biomass concentration, several biomass samples may be taken prior to attaining the isotopic steady state without disturbing the metabolic steady state. By correcting these samples for the isotopic non-steady state, their labeling data may be compared. This allows multiple independent NMR analyses from a single continuous culture.

Additional Information Due to Long-Range Couplings

As stated before, 2D [13C, 1H] COSY spectra seldom yield complete isotopomer distribution vectors. Figure 2 shows the groups of isotopomers that can be distinguished in the section of a spectrum corresponding with a central carbon atom in a four-carbon compound where up to three 13C–13C couplings are observable. The figure is based on the assumption that all coupling constants are different. In the case where coupling constants are identical, various fine structures overlap and the number of observable isotopomer groups decreases.

Only the multiplets of carbon atoms that are covalently bound to at least one proton are observed in 2D [13C, 1H] COSY spectra. Consequently, amino acids always yield one multiplet fewer than the number of carbon atoms, because they have a carbonyl carbon atom that is not proton-bound. Suppose that all four carbon atoms of the compound in Fig. 2 are proton-bound (which is the case in, e.g., erythritol). In this case two multiplets of terminal carbon atoms and two multiplets of central carbons can be measured. How many of the $2^4 = 16$ isotopomer fractions (note: Figure 2 shows only one-half of all the possible isotopomers) of the compound can be determined from the combined four multiplets?

This question cannot be answered by simply summing the number of separate fine structures in all the multiplets. Some information may overlap. Moreover, the fine structure areas in a multiplet are normalized with respect to the total area of the multiplet. Consequently, one of the fractions of the isotopomer groups that are observed in a multiplet follows from the others. Suppose that none of the 13C–13C couplings is identical and that respectively one and two one-bond 13C–13C couplings can be observed in multiplets of terminal and central carbon atoms. In this case the number of separate isotopomer fractions that is determined from the four multiplets follows from Eq. (11), in which the relative intensities are the following ratios of sums of isotopomer fractions:

$$
\begin{aligned}
x_{1,s} &= \frac{\sum x_{10??}}{\sum x_{1???}}, & x_{1,d} &= \frac{\sum x_{11??}}{\sum x_{1???}} = 1 - x_{1,s},\\
x_{2,s} &= \frac{\sum x_{010?}}{\sum x_{?1??}}, & x_{2,d1} &= \frac{\sum x_{110?}}{\sum x_{?1??}},\\
x_{2,d2} &= \frac{\sum x_{011?}}{\sum x_{?1??}}, & x_{2,dd} &= \frac{\sum x_{111?}}{\sum x_{?1??}} = 1 - x_{2,s} - x_{2,d1} - x_{2,d2},\\
x_{3,s} &= \frac{\sum x_{?010}}{\sum x_{??1?}}, & x_{3,d1} &= \frac{\sum x_{?110}}{\sum x_{??1?}},\\
x_{3,d2} &= \frac{\sum x_{?011}}{\sum x_{??1?}}, & x_{3,dd} &= \frac{\sum x_{?111}}{\sum x_{??1?}} = 1 - x_{3,s} - x_{3,d1} - x_{3,d2},\\
x_{4,s} &= \frac{\sum x_{??01}}{\sum x_{???1}}, & x_{4,d} &= \frac{\sum x_{??11}}{\sum x_{???1}} = 1 - x_{4,s}.
\end{aligned}
$$
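The counting argument behind Eq. (11) can be checked numerically. The sketch below is not the authors' implementation: it builds one row per independent intensity ratio from the isotopomer patterns listed above, with a 1 for isotopomers in the numerator group and a coefficient C for isotopomers in the reference group (the doublet for terminal carbons, the double doublet for central carbons), and then computes the rank by exact Gaussian elimination. The names `ROWS`, `build_matrix`, and `rank` and the coefficient values are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Isotopomers of a four-carbon compound in binary order "0000" .. "1111".
ISOTOPOMERS = ["".join(bits) for bits in product("01", repeat=4)]

# (numerator pattern, reference pattern) per independent intensity ratio.
ROWS = [
    ("10??", "11??"),  # x_1,s  versus x_1,d
    ("010?", "111?"),  # x_2,s  versus x_2,dd
    ("110?", "111?"),  # x_2,d1 versus x_2,dd
    ("011?", "111?"),  # x_2,d2 versus x_2,dd
    ("?010", "?111"),  # x_3,s  versus x_3,dd
    ("?110", "?111"),  # x_3,d1 versus x_3,dd
    ("?011", "?111"),  # x_3,d2 versus x_3,dd
    ("??01", "??11"),  # x_4,s  versus x_4,d
]


def matches(isotopomer, pattern):
    """True if the isotopomer fits the pattern; '?' matches 0 or 1."""
    return all(p in ("?", c) for c, p in zip(isotopomer, pattern))


def build_matrix(coefficients):
    """One row per ratio: 1 for numerator isotopomers, C for reference ones."""
    matrix = []
    for (numerator, reference), c in zip(ROWS, coefficients):
        row = []
        for iso in ISOTOPOMERS:
            if matches(iso, numerator):
                row.append(Fraction(1))
            elif matches(iso, reference):
                row.append(c)
            else:
                row.append(Fraction(0))
        matrix.append(row)
    return matrix


def rank(matrix):
    """Rank by exact Gauss-Jordan elimination on Fractions."""
    rows = [row[:] for row in matrix]
    r = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


if __name__ == "__main__":
    # Hypothetical negative coefficients C_1 .. C_8 (minus an intensity ratio).
    coefficients = [Fraction(-1, k + 2) for k in range(8)]
    print(rank(build_matrix(coefficients)))
```

Exact rational arithmetic is used so that the rank does not depend on a floating-point tolerance.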
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 1
0 0 0 0 1 0 0 0
0 0 0 0 0 0 1 C8
0 1 0 0 0 0 0 0
0 1 0 0 0 0 0 1
Metabolic Engineering 3, 322343 (2001) doi:10.1006mben.2001.0193
0 0 0 1 0 1 0 0
0 0 0 1 C5 C6 C7 C8
1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1
1 0 0 0 1 0 0 0
where C 1 =&x 1,s x 1,d C 2 =&x 2,s x 2,dd C 3 =&x 2,d1 x 2,dd C 4 =&x 2,d2 x 2,dd
}
C 5 =&x 3,s x 3,dd C 6 =&x 3,d1 x 3,dd C 7 =&x 3,d2 x 3,dd C 8 =&x 4,s x 4,d
In Eq. (11) the subscripts ``s,'' ``d,'' ``d1,'' ``d2,'' and ``dd'' denote the singlet, doublet, doublets 1 and 2 (two different doublets in a central carbon multiplet), and double doublet. The binary subscripts are defined as in Schmidt et al. (1997). A question mark represents either a 0 or a 1. The rank of the 8_16 matrix above is 8. In other words, the 12 relative intensities of the four multiplets yield eight independent isotopomer data. If we assume that each multiplet has an additional long-range 13C 13C coupling that is large enough to yield well-resolved peaks, then the same compound has two multiplets of terminal carbon atoms which yield four (three independent) relative intensities and two multiplets of central carbon atoms yielding eight (seven independent) relative intensities. The corresponding 20_16 matrix has rank 14. This confirms the expectation that multiplets with more couplings indeed yield a superior amount of data. In the example above the obtained 14 independent isotopomer data are the maximally achievable number. The fully unlabeled isotopomer of a n-carbon compound is not visible in any of the 2D [ 13C, 1H] COSY spectra. Furthermore, relative intensities only yield ratios between the remaining (2 n &1) isotopomers due to the normalization. The maximal number of independent ratios between (2 n &1) entities is (2 n &1)&1. Therefore, 2D [ 13C, 1H] COSY NMR maximally yields (2 n &2) independent labeling data for a n-carbon compound. MATERIALS AND METHODS Labeled Biomass The labeled biomass that was used in this NMR study was obtained in two continuous cultures. In one of the 329
1 0 0 0 0 0 1 C8
C1 C1 C1 0 0 C2 1 1 C3 0 0 C4 0 0 0 0 0 1 0 0 0 0 1 0
C1 C2 C3 C4 C5 C6 C7 C8
$$\cdot \left( x_{0000},\; x_{0001},\; \ldots,\; x_{1110},\; x_{1111} \right)^{T} = 0 \qquad (11)$$
experiments S. cerevisiae (CEN.PK-113.7D) was grown aerobically on a C-limited defined medium containing glucose and ethanol (10:1 on a weight basis) as carbon sources and ammonium as a nitrogen source. The glucose consisted of 90% unlabeled glucose and 10% U-[13C6]glucose (CLM1396, 99% 13C, ARC Laboratories B.V., Amsterdam, The Netherlands). The yeast was grown in a fermentor working volume of 1670 ml at a biomass concentration of 1.70 g dry wt/liter and a dilution rate of 0.1 h^-1. Biomass samples of 80 ml (≈136 mg dry wt) were taken for NMR analysis after 11.8 and 35.4 h.

In the second experiment P. chrysogenum (DS12975, DSM, Delft, The Netherlands) was grown aerobically on a C-limited defined medium containing glucose as a carbon source and nitrate as a nitrogen source [for details, see Van Gulik et al. (2000)]. The glucose consisted of 90% unlabeled glucose and 10% U-[13C6]glucose. The fungus was grown in a fermentor working volume of 1360 ml at a biomass concentration of 1.15 g dry wt/liter and a dilution rate of 0.03 h^-1. A biomass sample of 100 ml (≈115 mg dry wt) was taken for NMR analysis after 119.6 h.

The samples of S. cerevisiae were centrifuged (6 min, 4800 rpm). After decanting the supernatant, cells were washed with 0.9% NaCl solution, centrifuged, and washed with demineralized water. After final centrifugation cells were frozen at -80 °C. Prior to NMR analysis the biomass was freeze-dried. The samples of P. chrysogenum were filtered (glass fiber filter, Gelman Sciences, U.S.A.). Filters with cells were washed with 0.9% NaCl solution and demineralized water. After final filtration cells were frozen at -80 °C prior to freeze-drying.

Preparation of Samples

Biomass was hydrolyzed in 10 ml 6 N HCl for 16 h at 110 °C. After filtration and evaporation to dryness, the residue was dissolved in 10 ml 0.1 N HCl and the amino acids were adsorbed to an ion-exchange resin (Dowex AG 50W X4) and washed with 0.1 N HCl. The amino acids were eluted with 4 N HCl. After evaporation the residue was dissolved in D2O.
The presented sample preparation includes separation of the proteinogenic amino acids from the remaining biomass components. This was also done by Szyperski (1995), Schmidt et al. (1999), and Sauer et al. (1999), whereas Petersen et al. (2000) used whole-cell lysate for NMR analysis. Separating the amino acids eliminates interference from multiplets of carbon atoms of other compounds. Water extracts of cell components other than proteinogenic amino acids were prepared by heating the biomass in 5 ml H2O to 90 °C for 10 min with shaking. After centrifugation, the supernatant was lyophilized and dissolved in D2O.

Derivatization

After drying the biomass hydrolysate by evaporation, the amino acids were converted to methyl esters with methanol/thionyl chloride according to Brenner et al. (1950). After the reaction, the residual solvent was removed in a stream of N2. The methyl esters were redissolved in 1 ml of a 50:50 mixture of deuterated methanol and acetone. The methyl esters were acylated with trifluoroacetic acid anhydride (Coulter and Hahn, 1968) and, after removal of the excess reagents in a stream of N2, dissolved in CDCl3.
2D [13C, 1H] COSY NMR Measurements

NMR measurements were performed at 600 MHz at 37 °C on a Bruker Avance 600 spectrometer. The [13C, 1H] COSY experiment was the HSQC sequence by Bax and Pochapsky (1992) with gradients for artifact suppression. Folding in F1 was used to reduce the sweep width. The 13C carrier was set to 57.5 ppm and 2400 increments were recorded with an effective sweep width of 20 ppm (t1,max = 398 ms). For the aromatic carbons the offset was 129.6 ppm and 512 increments were recorded with a sweep width in F1 of 3 ppm (t1,max = 652 ms). The window function used before Fourier transformation was a cosine bell shifted by π/3 in F2 and a sine bell shifted by π/2.6.

RESULTS

Improving Spectral Quality by Derivatization

In general, the linewidths of NMR signals depend on a number of parameters such as solvent viscosity, temperature, and the presence of paramagnetic ions. To improve the resolution, the temperature of the sample could be increased or the sample could be dissolved in a less viscous solvent. Amino acids are generally not soluble in apolar solvents. They do dissolve in methanol, but in this solvent they are partially converted to methyl esters during the measurement, which yields spectra of poor quality. A better option is to prepare methyl esters and dissolve these in a 50:50 mixture of methanol and acetone. An additional advantage over water-dissolved amino acids is that, due to the absence of salts in the methanol/acetone mixture, considerably less heat is generated by the 13C-decoupling during the acquisition time. In principle the methyl esters can be further converted to the N-acyl amino acid esters with TFAA, although not all amino acids are easily and quantitatively prepared. The resulting N-trifluoroacetyl amino acid esters are soluble in chloroform.

Four multiplets of amino acids in water and of the corresponding derivatives in chloroform were measured and compared. Figure 4 shows a typical result of this comparison. The multiplets are normalized to the same total height and width. It is clear that the linewidths of the peaks are considerably smaller for the chloroform-soluble derivative. The part of the multiplet shown within the rectangle clearly illustrates the better resolution of peaks of the derivatives dissolved in chloroform. Results of fits of multiplets of four α-carbon atoms of proteinogenic amino acids dissolved in water and chloroform are compared in Table 1. The table shows that the average linewidths are 1.4-1.7 times larger for water samples
FIG. 4. Comparison of normalized sections in the 13C direction of 2D [13C, 1H] COSY NMR spectra of the α-carbon atom of phenylalanine. (Upper) Multiplet of the chloroform-soluble N-trifluoroacetyl methyl ester of phenylalanine; (lower) multiplet of water-dissolved phenylalanine. The rectangle includes a peak of the doublet with the smaller 13C-13C scalar coupling constant and a peak of the double doublet.
TABLE 1
Comparison of Linewidths, D Values, and Determined Relative Intensities in Multiplets of Four Carbon Atoms of Amino Acids in Water and Chloroform (Hydrolyzed Biomass of S. cerevisiae)

Carbon atom       Quantity                  Water        Chloroform   Difference   SS weighted(a)   P(a)
α-Phenylalanine   Average linewidth (Hz)    1.714        1.145
                  D(b)                      3.92×10^-5   1.28×10^-5
                  Relative intensity                                               1.359            7.15×10^-1
                    Singlet                 0.120        0.121         0.000
                    Doublet 1(c)            0.023        0.016         0.007
                    Doublet 2               0.086        0.083         0.003
                    Double doublet          0.771        0.780        -0.009
α-Threonine       Average linewidth (Hz)    1.749        1.181
                  D                         1.85×10^-5   1.73×10^-5
                  Relative intensity                                               2.081            5.56×10^-1
                    Singlet                 0.279        0.274         0.005
                    Doublet 1               0.254        0.258        -0.004
                    Doublet 2               0.106        0.111        -0.004
                    Double doublet          0.361        0.358         0.003
α-Methionine      Average linewidth (Hz)    2.203        1.332
                  D                         4.09×10^-4   1.66×10^-4
                  Relative intensity                                               3.150            3.69×10^-1
                    Singlet                 0.243        0.236         0.007
                    Doublet 1               0.233        0.241        -0.008
                    Doublet 2               0.147        0.107         0.040
                    Double doublet          0.376        0.415        -0.039
α-Glutamic acid   Average linewidth (Hz)    1.777        1.237
                  D                         3.63×10^-5   2.52×10^-5
                  Relative intensity                                               11.753           8.28×10^-3
                    Singlet                 0.336        0.340        -0.004
                    Doublet 1               0.335        0.354        -0.019
                    Doublet 2               0.195        0.183         0.012
                    Double doublet          0.133        0.123         0.011

(a) See Eqs. (C1) and (C2) in Appendix C.
(b) For definition, see Eq. (12).
(c) Doublet 1 has the larger coupling constant in all cases.
than for chloroform. This is consistent with the observation made for Fig. 4, i.e., that dissolving the derivatives in chloroform leads to narrower peaks and thus to better resolution of the peaks. Table 1 also shows the effect of the derivatization and dissolution in chloroform on the size of the estimated covariances of the relative intensities. The total variance of the relative intensities in a multiplet consisting of F fine structures is represented by the value of a scalar D that is defined here as
$$D = \left( \prod_{i=1}^{F-1} s_i(C_{RI}) \right)^{1/(F-1)}. \qquad (12)$$
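As a small worked illustration of Eq. (12), consider a hypothetical multiplet consisting only of a singlet and a doublet (F = 2), each relative intensity having variance σ² (σ is an illustrative symbol, not a value from this study). Because the two relative intensities sum to one, their errors are fully anticorrelated, which gives:

```latex
C_{RI} = \sigma^{2}
\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix},
\qquad
s_1(C_{RI}) = 2\sigma^{2}, \quad s_2(C_{RI}) = 0,
\qquad
D = \bigl( s_1(C_{RI}) \bigr)^{1/(2-1)} = 2\sigma^{2}.
```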
In Eq. (12) s_i(C_RI) is the i-th singular value of the covariance matrix (C_RI) of the relative intensities of the multiplet. Only the F - 1 largest singular values are multiplied, since the F-th (smallest) singular value equals zero due to the dependence of one of the relative intensities on the others. Raising the product in Eq. (12) to the power 1/(F - 1) renders the outcome D independent of the number (F) of fine structures in the multiplet. This allows an unbiased comparison of the total variance of multiplets with varying numbers of peaks. Comparing the values of D for the water-dissolved amino acids and their chloroform-dissolved derivatives in Table 1, one sees that the latter have smaller values in all four cases. The values are between 1.07 and 3.07 times smaller for the chloroform-soluble derivatives. In other words, their relative intensities are more accurately determined.

It is important to verify that the relative intensities do not depend on the solvent. One can check that the relative
TABLE 2
Comparison of Relative Intensities in 1D Multiplets Obtained by Taking Either One Single Section of the 2D [13C, 1H] COSY Spectrum or a Sum of Sections at All 1H Frequencies Showing a Signal (Hydrolyzed Biomass of P. chrysogenum)

Carbon atom     Fine structure    Single section   Sum of multiple sections   Difference   SS weighted   P
α-Histidine     Singlet           0.131            0.121                       0.010       15.63         1.35×10^-3
                Doublet 1         0.070            0.073                      -0.004
                Doublet 2         0.013            0.014                      -0.001
                Double doublet    0.786            0.792                      -0.006
β-Histidine     Singlet           0.177            0.190                      -0.013       1.73          6.30×10^-1
                Doublet 1         0.005            0.004                       0.001
                Doublet 2         0.336            0.336                       0.000
                Double doublet    0.481            0.470                       0.012
δ-Histidine     Singlet           0.584            0.518                       0.067       103.82        2.21×10^-24
                Doublet           0.416            0.482                      -0.067
C1-Trehalose    Singlet           0.268            0.262                       0.006       0.95          3.31×10^-1
                Doublet           0.732            0.738                      -0.006
C2-Trehalose    Singlet           0.206            0.227                      -0.021       12.83         1.64×10^-3
                Doublet           0.298            0.292                       0.006
                Triplet           0.496            0.481                       0.015
C3-Trehalose    Singlet           0.375            0.363                       0.012       1.28          5.27×10^-1
                Doublet           0.235            0.238                      -0.003
                Triplet           0.390            0.399                      -0.009
C4-Trehalose    Singlet           0.182            0.163                       0.018       13.31         1.29×10^-3
                Doublet           0.331            0.324                       0.006
                Triplet           0.488            0.512                      -0.025
C5-Trehalose    Singlet           0.126            0.120                       0.007       1.70          4.28×10^-1
                Doublet           0.086            0.086                      -0.001
                Triplet           0.788            0.794                      -0.006
C6-Trehalose    Singlet           0.132            0.136                      -0.004       0.72          3.96×10^-1
                Doublet           0.868            0.864                       0.004
α-Valine        Singlet           0.292            0.280                       0.012       54.69         8.00×10^-12
                Doublet 1         0.594            0.598                      -0.003
                Doublet 2         0.041            0.045                      -0.004
                Double doublet    0.072            0.078                      -0.005
γ1-Valine       Singlet           0.350            0.364                      -0.014       42.67         6.49×10^-11
                Doublet           0.650            0.636                       0.014
γ2-Valine       Singlet           0.781            0.793                      -0.012       2.49          1.14×10^-1
                Doublet           0.219            0.207                       0.012
α-Tyrosine      Singlet           0.176            0.171                       0.004       4.73          1.93×10^-1
                Doublet 1         0.040            0.043                      -0.003
                Doublet 2         0.114            0.109                       0.005
                Double doublet    0.670            0.676                      -0.006
TABLE 2 (Continued)

Carbon atom     Fine structure    Single section   Sum of multiple sections   Difference   SS weighted   P
β-Tyrosine      Singlet           0.254            0.229                       0.025       46.69         7.27×10^-11
                Doublet           0.724            0.771                      -0.047
                Triplet           0.022            0.000                       0.022
δ-Tyrosine      Singlet           0.184            0.177                       0.006       0.82          6.65×10^-1
                Doublet           0.752            0.757                      -0.005
                Triplet           0.064            0.066                      -0.001
ε-Tyrosine      Singlet           0.299            0.293                       0.006       0.74          6.89×10^-1
                Doublet           0.298            0.299                      -0.001
                Triplet           0.403            0.408                      -0.004
α-Serine        Singlet           0.222            0.213                       0.009       17.13         6.63×10^-4
                Doublet 1         0.374            0.376                      -0.002
                Doublet 2         0.078            0.077                       0.002
                Double doublet    0.326            0.335                      -0.009
β-Serine        Singlet           0.580            0.569                       0.011       3.14          7.64×10^-2
                Doublet           0.420            0.431                      -0.011
intensities of the multiplets in the chloroform and water samples generally show close resemblance by inspecting the one-tailed probabilities (P) in Table 1. The calculation and meaning of the values of P are explained in Appendix C. When a minimal probability of 5% is chosen [i.e., α = 0.05 in Eq. (C2) of Appendix C], pairs of spectra with values of P larger than 0.05 do not differ significantly. This is the case for the multiplets of α-phenylalanine, α-threonine, and α-methionine dissolved in water and chloroform. For α-glutamic acid, however, a slight but statistically significant difference is observed.

Single Section versus Sum of Sections

To check whether the relative peak areas in a 1D section along the 13C frequency axis are independent of the 1H frequency at which the section is made, a single section was compared to the sum of all sections at the 1H frequencies where the signal was observable. Table 2 shows this comparison for 18 carbon atoms of several amino acids and trehalose. The analyzed multiplets were selected to cover a wide range of possible fine structures and corresponding relative intensities. Evaluating the one-tailed probabilities in Table 2 against a minimal probability of 5%, one sees that the single-section multiplet differs significantly from the multiple-section multiplet for α- and δ-histidine, C2- and C4-trehalose, α- and γ1-valine, β-tyrosine, and α-serine. This means that the
relative areas in the single-section multiplet are not representative of the relative volumes in the 2D COSY multiplet. In these cases the multiple-section multiplet must be used in order to obtain a reliable set of relative intensities. In the 10 remaining cases, where no significant difference is found, a single-section multiplet may be used to determine the relative intensities.

Improved Peakfitting for Determination of Fine Structure Areas

The developed software tool for computer-aided spectral analysis proved of great value in determining relative peak areas, especially in the case of complex multiplets with many overlapping peaks. Besides relative peak areas, the tool yielded a library of optimally estimated lineshapes [p in Eq. (1)], peak positional parameters (peak centroid of the singlets, scalar coupling constants, and isotope effects), and linewidths [w in Eq. (1)] for all measured multiplets. This library can be used for analyzing new sets of experimental data. The software tool also proved very helpful in correctly assigning all peaks to their corresponding fine structures in complex multiplets. This is illustrated by the δ- and ε-tyrosine multiplets shown in Fig. 5. These multiplets are superpositions of the multiplets of the 2nd and 6th carbons and of the 3rd and 5th carbons, respectively, of the aromatic ring of tyrosine. The multiplets could be neatly fitted by 27
FIG. 5. Multiplets of δ- and ε-tyrosine fitted by means of the spectral analysis software tool. The separate peaks to the right of the multiplets are the remainders of the cut-off peaks (rescaled). The various symbols below the multiplet indicate the peak centroids of the singlet, three doublets, three double doublets, and the quadruple doublet that are caused by three (two one-bond and one long-range) 13C-13C scalar couplings. The filled and open symbols of the same form indicate fine structures resulting from groups of isotopomers that only differ in their long-range 13C-13C scalar coupling.
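The peak bookkeeping behind such a multiplet can be sketched in a few lines. The example below is not the authors' tool and all names are hypothetical: it assumes that every fine structure is centred at the singlet centroid plus an additive isotope shift for each labeled coupling partner, and that each active coupling splits every line symmetrically by ±J/2. Under this reading, three couplings give one centroid, three coupling constants, and three isotope shifts, i.e., seven positional parameters. The numerical values are invented for illustration.

```python
from itertools import combinations, product

def multiplet_peaks(centroid, couplings, isotope_shifts):
    """Return {active couplings: sorted peak positions} for one carbon.

    centroid        position of the singlet (Hz)
    couplings       scalar coupling constants J_k (Hz), one per coupled carbon
    isotope_shifts  shift of the multiplet centre (Hz) per labeled neighbour
    """
    n = len(couplings)
    peaks = {}
    for size in range(n + 1):
        for active in combinations(range(n), size):
            # The centre moves by the isotope effect of every labeled neighbour.
            centre = centroid + sum(isotope_shifts[k] for k in active)
            # Each active coupling splits every line into two at +/- J/2.
            positions = [
                centre + sum(sign * couplings[k] / 2 for sign, k in zip(signs, active))
                for signs in product((-1, 1), repeat=size)
            ]
            peaks[active] = sorted(positions)
    return peaks

# Hypothetical values: two one-bond couplings and one long-range coupling.
peaks = multiplet_peaks(0.0, couplings=[58.0, 66.0, 9.0],
                        isotope_shifts=[-1.5, -1.2, -0.1])
n_structures = len(peaks)
n_peaks = sum(len(p) for p in peaks.values())
print(n_structures, n_peaks)  # 8 27
```

With three couplings the sketch enumerates one singlet, three doublets, three double doublets, and one quadruple doublet, which together contain 1 + 6 + 12 + 8 = 27 peaks.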
peaks that result from eight fine structures caused by three 13C-13C scalar couplings (cf. right-hand side in Fig. 2). The outermost peaks in the δ-tyrosine spectrum seem to be fitted by peaks that are too small, but it should be noted that the fit of these peaks is also influenced by the fit of the remaining peaks of the same fine structures that fall under the central cluster of peaks. Only seven positional parameters were needed to describe all 27 peak positions for both multiplets.

Estimation of Errors in Relative Intensities

The proposed error estimation procedure was used to generate covariance matrices of all multiplets that were analyzed. These covariance matrices have nonzero off-diagonal elements due to the overlap of peaks and due to the fact that relative intensities are normalized by dividing the area of a fine structure in a multiplet by the total area of all fine structures. Clearly, this information is richer than the estimated variances of separate relative intensities (neglecting covariances) that are commonly used (Schmidt et al., 1999; Petersen et al., 2000). Consider for example a comparison of two sets of relative intensities of a multiplet consisting of a singlet and a doublet. Due to the normalization of the peak areas, the intensities are fully correlated. When applying Eq. (C1) to these data (see Appendix C) and filling in only variances in C_RI (i.e., off-diagonal elements are set to zero), the weighted sum of squares is twice as large as when covariances are filled in as well. This leads to an erroneous outcome of the statistical test of Eq. (C2).

The accuracy of the error estimation procedure was tested by a Monte Carlo analysis. For this purpose four multiplets of different carbon atoms were fitted, resulting in estimates of the NMR noise and three analytically calculated covariance matrices (see Appendix A).
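The factor of two for a singlet/doublet multiplet can be reproduced with a short calculation. Eqs. (C1) and (C2) are given in Appendix C and are not reproduced here, so the sketch below rests on two assumptions: the weighted sum of squares is the quadratic form of the intensity differences with the (generalized) inverse of the summed covariance matrix, and P is the upper tail of a chi-square distribution with F - 1 degrees of freedom. The second assumption reproduces the P values listed for two-structure multiplets in Table 2 (e.g., a weighted sum of squares of 0.95 gives P ≈ 0.33). Function names and input numbers are illustrative.

```python
import math

def weighted_ss_two_structures(delta, var_sum, use_covariances=True):
    """Weighted sum of squares for a singlet/doublet multiplet.

    delta    difference in singlet relative intensity between two spectra
             (the doublet then differs by -delta because of normalization)
    var_sum  summed variance of the singlet intensity in the two spectra
    """
    if use_covariances:
        # The full covariance matrix is var_sum * [[1, -1], [-1, 1]] (singular);
        # dropping the redundant doublet leaves one independent residual.
        return delta ** 2 / var_sum
    # Variances only: both residuals are counted as if they were independent.
    return delta ** 2 / var_sum + (-delta) ** 2 / var_sum

def chi2_upper_tail_1dof(ss):
    """One-tailed probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(ss / 2.0))

delta, var_sum = 0.006, 4.0e-5          # illustrative numbers only
ss_full = weighted_ss_two_structures(delta, var_sum)
ss_diag = weighted_ss_two_structures(delta, var_sum, use_covariances=False)
p_full = chi2_upper_tail_1dof(ss_full)
p_diag = chi2_upper_tail_1dof(ss_diag)
print(ss_diag / ss_full)                 # 2.0
print(p_full, p_diag)
```

Neglecting the covariances doubles the test statistic and therefore lowers P, so a pair of spectra may wrongly be judged to differ significantly.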
Next, normally distributed random noise with the estimated variance was added to the original data and the resulting multiplets were fitted. This was repeated 500 times for each multiplet, resulting in sets of 500 estimated relative intensities. When the covariance matrices of these sets were calculated and compared to the analytically determined ones, a close resemblance was observed for most covariances. This suggests that both the proposed procedure for analytical error estimation and the homoscedasticity assumption for the spectral noise are correct.

Isotopic Non-Steady-State Correction

Biomass of S. cerevisiae was harvested from a continuous culture at two different times prior to attaining isotopic steady state in order to test the isotopic non-steady-state correction. 2D [13C, 1H] COSY NMR spectra of various
components of the biomass hydrolysate were measured and analyzed to obtain the relative intensities shown in Table 3 (see columns "not corrected"). The dilution rate of the continuous culture was 0.1 h^-1 and the two biomass samples were taken after 11.8 and 35.4 h of labeled substrate supply. In these two situations the remaining fraction of naturally labeled biomass is 0.31 and 0.03, respectively. Evaluating the one-tailed probabilities in the "not corrected" column of Table 3 against a minimal probability of 5% shows significant differences between "early" and "late" multiplets for all carbon atoms except β-glutamic acid, β-alanine, α-, β-, and γ-proline, and β- and δ2-leucine. In general, the multiplets of the early harvested biomass have larger relative intensities of the singlets and lower values of the (double) doublets. The explanation of this observation is that the labeling pattern of the naturally labeled biomass makes a larger contribution to the overall labeling pattern of the early harvested biomass. Natural labeling causes larger intensities of singlets than of other fine structures because the fortuitous labeling of several carbons in a row rarely occurs.

Table 3 also shows the relative intensities that are extrapolated to their values after infinite labeling supply (see columns "corrected") using Eq. (9) and Eq. (10). The value of P_s in these equations is 0.10, except for the carbon atoms that are indirectly derived from acetyl-coenzyme A (acetyl-CoA). These include the carbon atoms of glutamic acid and proline and the α-carbon of leucine. The lower value for P_s in these cases is caused by the influx of unlabeled ethanol from the feed into the acetyl-CoA pool. From the glucose consumption and biomass formation measurements it was determined that for each mole of acetyl-CoA formed from (10% labeled) glucose, 0.38 mol acetyl-CoA was formed from (unlabeled) ethanol. This leads to a value for P_s of 1.00/1.38 × 0.10 = 0.072.
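The numbers quoted above can be reproduced with a short calculation. The sketch assumes ideal first-order washout of the initial biomass in a chemostat, so that the remaining fraction after time t at dilution rate D is exp(-D·t); this assumption matches the reported fractions but is not taken from Eqs. (9) and (10), which are not reproduced here. The function names are hypothetical.

```python
import math

def unlabeled_fraction(dilution_rate, hours):
    """Fraction of the initial (naturally labeled) biomass still present after
    `hours` of labeled feed, assuming ideal first-order washout in a chemostat."""
    return math.exp(-dilution_rate * hours)

def acetyl_coa_label_fraction(glucose_label, ethanol_per_glucose_accoa):
    """Labeled-substrate fraction seen by the acetyl-CoA pool when unlabeled
    ethanol supplies `ethanol_per_glucose_accoa` mol per mol formed from glucose."""
    return glucose_label / (1.0 + ethanol_per_glucose_accoa)

early = unlabeled_fraction(0.1, 11.8)    # sample taken after 11.8 h
late = unlabeled_fraction(0.1, 35.4)     # sample taken after 35.4 h
p_s_accoa = acetyl_coa_label_fraction(0.10, 0.38)
print(round(early, 2), round(late, 2), round(p_s_accoa, 3))  # 0.31 0.03 0.072
```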
For all but 4 of the 20 multiplets shown, the correction yields extrapolated relative intensities of the early and late harvested biomass that are more similar than the uncorrected values (compare P values). However, the corrected relative intensities of the early and late harvested biomass should theoretically be identical except for random measurement errors. This is the case for only 10 of the 20 carbon atoms. Quite a number of relative intensities of the early harvested biomass are "overcorrected," suggesting that the corresponding carbon atoms are in isotopic steady state somewhat earlier than expected. A tentative explanation is that these amino acids are not only labeled by growth of new, labeled biomass, but also by protein turnover. In that case the assumptions underlying the isotopic non-steady-state correction do not strictly apply.

The four carbon atoms in Table 3 of which the corrected relative intensities are more different than the uncorrected
TABLE 3
Comparison of Relative Intensities in Multiplets of Amino Acids in Biomass Harvested after Different Times of Label Supply [(Left) Values prior to Isotopic Non-Steady-State Correction, (Right) Values after Isotopic Non-Steady-State Correction (Hydrolyzed Biomass of S. cerevisiae)]

                                    Not corrected                                               Corrected
Carbon atom       Fine structure    11.8 h   35.4 h   Difference   SS weighted   P              11.8 h   35.4 h   Difference   SS weighted   P
α-Histidine       Singlet           0.159    0.110     0.049       291.42        7.16×10^-63    0.115    0.107     0.009       19.23         2.45×10^-4
                  Doublet 1         0.048    0.041     0.007                                    0.050    0.041     0.009
                  Doublet 2         0.026    0.013     0.014                                    0.027    0.013     0.014
                  Double doublet    0.767    0.837    -0.070                                    0.808    0.840    -0.032
β-Histidine       Singlet           0.172    0.131     0.041       43.82         1.65×10^-9     0.128    0.128     0.000       5.41          1.44×10^-1
                  Doublet 1         0.024    0.013     0.012                                    0.025    0.013     0.012
                  Doublet 2         0.344    0.353    -0.009                                    0.361    0.354     0.007
                  Double doublet    0.461    0.503    -0.043                                    0.485    0.505    -0.020
δ-Histidine       Singlet           0.394    0.365     0.030       18.57         1.64×10^-5     0.362    0.362     0.000       0.00          9.93×10^-1
                  Doublet           0.606    0.635    -0.030                                    0.638    0.638     0.000
α-Phenylalanine   Singlet           0.147    0.120     0.027       45.40         7.60×10^-10    0.103    0.117    -0.014       13.23         4.17×10^-3
                  Doublet 1         0.030    0.023     0.007                                    0.031    0.023     0.008
                  Doublet 2         0.070    0.086    -0.016                                    0.073    0.086    -0.013
                  Double doublet    0.752    0.771    -0.019                                    0.792    0.773     0.019
β-Tyrosine        Singlet           0.153    0.125     0.028       22.53         1.28×10^-5     0.109    0.122    -0.013       3.87          1.45×10^-1
                  Doublet           0.756    0.784    -0.028                                    0.796    0.787     0.009
                  Triplet           0.091    0.091     0.000                                    0.096    0.091     0.004
α-Glutamic acid   Singlet           0.364    0.336     0.028       39.40         1.42×10^-8     0.319    0.333    -0.014       20.08         1.64×10^-4
                  Doublet 1         0.327    0.335    -0.007                                    0.351    0.336     0.014
                  Doublet 2         0.190    0.195    -0.005                                    0.203    0.196     0.007
                  Double doublet    0.118    0.133    -0.015                                    0.127    0.134    -0.007
β-Glutamic acid   Singlet           0.657    0.645     0.011       0.58          4.48×10^-1     0.632    0.644    -0.012       0.57          4.51×10^-1
                  Doublet           0.343    0.355    -0.011                                    0.368    0.356     0.012
γ-Glutamic acid   Singlet           0.215    0.196     0.019       10.14         1.74×10^-2     0.159    0.192    -0.033       25.64         1.13×10^-5
                  Doublet 1         0.696    0.717    -0.021                                    0.747    0.720     0.026
                  Doublet 2         0.019    0.019     0.000                                    0.020    0.019     0.001
                  Double doublet    0.069    0.068     0.001                                    0.075    0.069     0.006
α-Alanine         Singlet           0.164    0.157     0.007       8.53          3.62×10^-2     0.120    0.154    -0.034       71.61         1.93×10^-15
                  Doublet 1         0.049    0.047     0.003                                    0.051    0.047     0.004
                  Doublet 2         0.104    0.122    -0.018                                    0.109    0.123    -0.014
                  Double doublet    0.683    0.674     0.009                                    0.720    0.676     0.043
β-Alanine         Singlet           0.189    0.190    -0.001       0.02          8.79×10^-1     0.146    0.187    -0.041       41.51         1.17×10^-10
                  Doublet           0.811    0.810     0.001                                    0.854    0.813     0.041
α-Proline         Singlet           0.341    0.337     0.004       1.76          6.24×10^-1     0.294    0.333    -0.040       66.28         2.67×10^-14
                  Doublet 1         0.329    0.320     0.009                                    0.352    0.321     0.031
                  Doublet 2         0.202    0.199     0.003                                    0.216    0.200     0.016
                  Double doublet    0.129    0.145    -0.016                                    0.139    0.146    -0.007
TABLE 3 (Continued)

                                    Not corrected                                               Corrected
Carbon atom       Fine structure    11.8 h   35.4 h   Difference   SS weighted   P              11.8 h   35.4 h   Difference   SS weighted   P
β-Proline         Singlet           0.648    0.632     0.016       0.47          4.93×10^-1     0.623    0.631    -0.008       0.11          7.44×10^-1
                  Doublet           0.352    0.368    -0.016                                    0.377    0.369     0.008
γ-Proline         Singlet           0.220    0.182     0.038       4.98          8.28×10^-2     0.164    0.178    -0.014       2.61          2.71×10^-1
                  Doublet           0.706    0.725    -0.019                                    0.757    0.729     0.028
                  Triplet           0.074    0.092    -0.019                                    0.079    0.093    -0.014
δ-Proline         Singlet           0.240    0.209     0.031       7.29          6.92×10^-3     0.185    0.205    -0.020       2.95          8.57×10^-2
                  Doublet           0.760    0.791    -0.031                                    0.815    0.795     0.020
α-Leucine         Singlet           0.254    0.218     0.036       249.39        8.85×10^-54    0.200    0.214    -0.014       18.84         2.95×10^-4
                  Doublet 1         0.638    0.682    -0.044                                    0.685    0.686    -0.001
                  Doublet 2         0.032    0.026     0.006                                    0.034    0.026     0.007
                  Double doublet    0.076    0.074     0.002                                    0.081    0.074     0.007
β-Leucine         Singlet           0.837    0.830     0.007       0.07          9.67×10^-1     0.829    0.829     0.000       0.00          1.00
                  Doublet           0.163    0.170    -0.007                                    0.171    0.171     0.000
                  Triplet           0.000    0.000     0.000                                    0.000    0.000     0.000
δ1-Leucine        Singlet           0.253    0.216     0.037       12.96         3.17×10^-4     0.214    0.213     0.001       0.00          9.57×10^-1
                  Doublet           0.747    0.784    -0.037                                    0.786    0.787    -0.001
δ2-Leucine        Singlet           0.899    0.889     0.010       0.25          6.15×10^-1     0.894    0.889     0.005       0.07          7.90×10^-1
                  Doublet           0.101    0.111    -0.010                                    0.106    0.111    -0.005
α-Serine          Singlet           0.188    0.162     0.026       70.08         4.11×10^-15    0.415    0.159    -0.013       14.83         1.97×10^-3
                  Doublet 1         0.246    0.271    -0.025                                    0.258    0.272    -0.014
                  Doublet 2         0.068    0.070    -0.002                                    0.071    0.070     0.001
                  Double doublet    0.498    0.497     0.001                                    0.525    0.499     0.026
β-Serine          Singlet           0.435    0.377     0.058       19.09         1.25×10^-5     0.405    0.375     0.030       4.97          2.58×10^-2
                  Doublet           0.565    0.623    -0.058                                    0.595    0.625    -0.030
values are the γ-carbon atom of glutamic acid, the α-carbon atom of proline, and both the α- and β-carbon atoms of alanine. The fact that the uncorrected multiplets of both alanine carbon atoms are more similar than the corrected ones seems to indicate that alanine reaches its isotopic steady state much earlier than the other amino acids. This cannot be explained by protein turnover, as this would affect all amino acids. An alternative explanation could be that alanine is rare in biomass protein and is relatively abundant in a free form in the cell. However, alanine is neither a rare component of S. cerevisiae biomass nor known to be present in the cell in a free form in very high concentrations. Therefore, the observation must have another explanation.
ADDITIONAL LABELING INFORMATION

The 2D [13C, 1H] COSY NMR data of proteinogenic amino acids reported in the literature are relative intensities of fine structures caused by one-bond 13C-13C couplings (Petersen et al., 2000; Schmidt et al., 1999; Szyperski, 1995). Fiaux et al. (1999) mentioned the detection of an additional fine structure, a quadruple doublet, in the multiplet of β-histidine caused by splitting of the double doublet by a long-range 13Cβ-13Cδ coupling in the histidine molecule. Besides the reported quadruple doublet, the long-range 13Cβ-13Cδ coupling may add an additional doublet and two double doublets to the multiplet by splitting the singlet and doublets that result from the one-bond 13Cβ-13Cα and
FIG. 6. Example of additionally observable peaks due to long-range 13C-13C scalar couplings in the multiplets of β-histidine (left) and δ-histidine (right). (Spectra from hydrolyzed biomass of P. chrysogenum, obtained by summing multiple sections of the 2D [13C, 1H] COSY NMR multiplet.)

13Cβ-13Cγ couplings. Some of these fine structures are observed in our measurement of the multiplet of the β-carbon of histidine (Fig. 6). Measured multiplets of the δ-histidine (Fig. 6) and δ- and ε-tyrosine (Fig. 5) carbon atoms also show additional fine structures caused by long-range 13C-13C couplings. The multiplet of δ-tyrosine in Fig. 5 shows not only the 13Cγ-13Cδ and 13Cδ-13Cε couplings, but also the long-range coupling with the ζ-carbon. In the multiplet of ε-tyrosine the 13Cε-13Cγ coupling is observed in addition to the commonly reported one-bond couplings with the δ- and ζ-carbons. Likewise, Fig. 6 shows that the multiplet of δ-histidine does not consist of a singlet and a doublet only. Additional fine structures result from the fact that the multiplet shows not only the 13Cδ-13Cγ coupling, but also the long-range 13Cδ-13Cβ coupling. This is valuable information, since the γ-carbon itself is not proton bound, which makes its couplings with the β- and δ-carbons unobservable by 2D [13C, 1H] COSY NMR.

The relative intensities in Tables 2 and 3 were obtained by taking into account the long-range 13C-13C couplings and fitting eight fine structures to the measured δ- and ε-tyrosine and β-histidine multiplets (see Fig. 2) and four to the δ-histidine multiplet. Subsequently, the pairs of fine structures that are caused by the same one-bond 13C-13C couplings were summed. This method (method 1) is preferable to fitting only the fine structures that are expected from the one-bond 13C-13C couplings (method 2), since the latter causes a misfit. Table 4 shows the differences between the outcomes of the two methods for the multiplets shown in Figs. 5 and 6. Evaluating the one-tailed probabilities in Table 4 (see Appendix C) against a minimal probability of 5% clearly shows that the differences between the outcomes of the two methods are significant for three carbon atoms. In the case of δ-tyrosine, the difference is not significant. This is due to the relatively low signal-to-noise ratio for this compound (see Fig. 5), which causes a large covariance matrix and thus a small weighted sum of squared residuals. Although the fit of method 2 does not yield significantly differing relative intensities for this carbon atom, the nonrandom nature of the residual spectrum clearly indicates a serious misfit, which further contributes to the estimated noise. This serves to emphasize that the probability that is calculated according to Appendix C is only a good criterion for comparing two spectra when no systematic misfit is detected.

Many peaks of the multiplets in Figs. 5 and 6 are ill resolved. Still, the computer-aided spectral analysis tool is able to fit them and allocate relative intensities to them. The poor resolution is accounted for in the covariance matrices of the relative intensities.

As was outlined in the theory section, the additionally observable long-range couplings result in a larger number of independently known isotopomer groups within a molecule. This yields more independent labeling data for subsequent flux analysis. Whereas the fine structures of α-, β-, and
TABLE 4
Comparison of Two Methods (see Text) for Determining Relative Intensities in Multiplets Showing Long-Range 13C-13C Couplings (β- and δ-Histidine, ε-Tyrosine: Hydrolyzed Biomass of P. chrysogenum; δ-Tyrosine: Hydrolyzed Biomass of S. cerevisiae)

                                         Method 1                              Method 2
Carbon atom    Fine structure            Separate peaks   Summed peaks(a)                 Difference   SS weighted   P
β-Histidine    Singlet                   0.136            0.169               0.150        0.019       37.511        3.59×10^-8
               Doublet 3(b)              0.033
               Doublet 1                 0.008            0.029               0.040       -0.011
               Double doublet 13(c)      0.020
               Doublet 2                 0.239            0.296               0.260        0.036
               Double doublet 23         0.056
               Double doublet 12         0.172            0.507               0.550       -0.044
               Quadruple doublet(d)      0.335
δ-Histidine    Singlet                   0.459            0.518               0.489        0.028       11.089        8.69×10^-4
               Doublet 2                 0.058
               Doublet 1                 0.328            0.482               0.511       -0.028
               Double doublet            0.154
δ-Tyrosine     Singlet                   0.179            0.195               0.219       -0.024       3.759         1.53×10^-1
               Doublet 3                 0.016
               Doublet 1                 0.028            0.758(e)            0.760       -0.002
               Double doublet 13         0.270
               Doublet 2                 0.461
               Double doublet 23         0.000
               Double doublet 12         0.035            0.047               0.021        0.025
               Quadruple doublet         0.011
ε-Tyrosine     Singlet                   0.276            0.293               0.278        0.015       7.398         2.48×10^-2
               Doublet 3                 0.018
               Doublet 1                 0.124            0.299(e)            0.309       -0.010
               Double doublet 13         0.107
               Doublet 2                 0.069
               Double doublet 23         0.000
               Double doublet 12         0.383            0.408               0.413       -0.005
               Quadruple doublet         0.025

(a) Sum of relative intensities caused by the same one-bond 13C-13C coupling.
(b) Doublet 3 is caused by the long-range 13C-13C coupling.
(c) Double doublet 13 is caused by the combination of the coupling constants of doublets 1 and 3.
(d) The quadruple doublet is caused by the combination of all three coupling constants.
(e) Due to identical one-bond 13C-13C couplings, doublets 1 and 2 and double doublets 13 and 23 (see Fig. 5) are summed to yield a single relative area.
339
van Winden et al.
Metabolic Engineering 3, 322343 (2001) doi:10.1006mben.2001.0193
$-histidine that are reported in Tables 2 and 3 yield seven independent isotopomer measurements of the histidine molecule, the additionally observable 13C 13C couplings in the multiplets of the latter two carbon atoms yield six more independent measurement data. This means almost a doubling of the quantity of isotopomer information. CONCLUSIONS In this paper we have shown how the Pearson VII function can be combined with a set of linear constraints on the positions of spectral peaks to yield a mathematical description of 2D [ 13C, 1H] COSY NMR spectra that contains few parameters and still fits experimental multiplets very well. A computational tool that was developed based on this mathematical description of multiplets allowed us to efficiently and accurately determine the relative peak areas of fine structures in experimental multiplets containing up to 27 partly overlapping peaks. It could be checked whether the peaks had been correctly assigned to the corresponding fine structures. Complete covariance matrices of the determined relative intensities were calculated based on the estimated measurement noise and an analytical sensitivity analysis of the relative intensities with respect to the noise. It was checked whether peak intensities in a one-dimensional section along the 13C frequency axis are representative for relative peak volumes in a 2D [ 13C, 1H] COSY NMR multiplet. Comparison of single sections and sums of multiple sections of 18 multiplets showed that in 8 cases (i.e., 440) the relative intensities of single sections were not representative for the peak volumes. It was demonstrated that 2D [ 13C, 1H] COSY NMR spectra of chloroform-soluble derivatives of amino acids have better resolved peaks due to the lower viscosity of chloroform when compared to water. Also, the estimated variance of the determined relative intensities was smaller for the chloroform-soluble derivatives. 
These findings can be used to improve the accuracy of determined peak areas in multiplets with many overlapping peaks.

A new method was proposed to enable isotopic non-steady-state correction of relative fine structure intensities. Application of this correction to 20 multiplets of amino acids in biomass samples harvested at different times led to a significant improvement of 16 (i.e., 80%) of the multiplets. Two of the multiplets that were not improved by the correction were of alanine, suggesting that this amino acid is in isotopic steady state earlier than the remaining amino acids. The cause of this phenomenon is unknown.

Finally, for some of the amino acid carbon atoms long-range 13C-13C scalar couplings were observed in 2D [13C, 1H] COSY NMR spectra. We demonstrated that the presence of additional peaks in a multiplet should be taken into account when fitting the multiplets. Neglecting the satellite peaks caused by a long-range 13C-13C scalar coupling led to significantly different results. It was shown how to calculate the number of additional independent labeling data that result from the observability of the long-range 13C-13C scalar couplings. In our example of histidine, the number of independent labeling data increased from the previously reported 7 to 13.

APPENDIX A
Calculation of Covariances of Relative Intensities from Spectral Noise Variance

The relative intensities of a multiplet are the areas of the separate fine structures, normalized with respect to the total spectral area. It is assumed that the covariances of the determined relative intensities are solely caused by the spectral noise. Fine structure areas ($A_i$) are determined by finding the parameters $\beta = (w, h, \omega_{^{13}\mathrm{C},\max}, p)^T$ of the spectral model of Eq. (1) that optimally fit the measured multiplet and calculating

$$A_i = c \cdot m_i \cdot w_i \cdot h_i. \tag{A1}$$

In this equation $m_i$ is the multiplicity of the peaks in fine structure $i$. The value of constant $c$ in the equation depends on the lineshape of the peak and thus on the power $p$ in Eq. (1). The covariances of the spectral parameters ($C_\beta$) can be calculated from the spectral noise variance [$\sigma^2$, see Eq. (3)] by linearizing the fitted multiplet [sum of Eq. (1) for each peak] around the optimally fitted parameters and calculating

$$C_\beta = \sigma^2 \cdot \left(J_{S_{fit},\beta}^T \cdot J_{S_{fit},\beta}\right)^{-1}. \tag{A2}$$
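Eqs. (A1) and (A2) can be illustrated for the simplest case of a single peak. The following Python sketch is not the tool described in this paper; it assumes the Pearson VII lineshape $h / (Y/p + 1)^p$ with $Y = ((\omega - \omega_{\max})/w)^2$, treats only $w$ and $h$ as fitted parameters (position and power $p$ are held fixed), sets the constant $c$ to 1, and uses made-up values for the frequency grid, the peak parameters, and the noise variance. All function names are hypothetical.

```python
import math

def pearson7(omega, w, h, omega_max, p):
    """Assumed Pearson VII lineshape: h / (Y/p + 1)**p."""
    y = ((omega - omega_max) / w) ** 2
    return h / (y / p + 1.0) ** p

def jacobian_wh(omegas, w, h, omega_max, p):
    """Rows of the Jacobian: derivatives of the peak with respect to w and h."""
    rows = []
    for omega in omegas:
        y = ((omega - omega_max) / w) ** 2
        z = h / (y / p + 1.0) ** p
        d_w = z / (y / p + 1.0) * 2.0 * y / w   # chain rule through Y
        d_h = z / h
        rows.append((d_w, d_h))
    return rows

def parameter_covariance(rows, sigma2):
    """Eq. (A2) for two parameters: sigma^2 * (J^T J)^-1, inverted by hand."""
    a = sum(r[0] * r[0] for r in rows)
    b = sum(r[0] * r[1] for r in rows)
    d = sum(r[1] * r[1] for r in rows)
    det = a * d - b * b
    return [[sigma2 * d / det, -sigma2 * b / det],
            [-sigma2 * b / det, sigma2 * a / det]]

def area_variance(cov, w, h, m=1):
    """Variance of A = m*w*h (c = 1) by two-sided multiplication with dA/d(w, h)."""
    grad = (m * h, m * w)
    return sum(grad[i] * cov[i][j] * grad[j] for i in range(2) for j in range(2))

# Hypothetical example values, chosen only for illustration.
omegas = [k * 0.1 - 5.0 for k in range(101)]
w, h, omega_max, p, sigma2 = 1.0, 1.0, 0.0, 2.0, 1e-4
cov = parameter_covariance(jacobian_wh(omegas, w, h, omega_max, p), sigma2)
area_var = area_variance(cov, w, h)
```

In this sketch the width and height errors come out negatively correlated, so the variance of the area is smaller than the sum of the two separate contributions.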
The columns of the Jacobian $J_{S_{fit},\beta}$ in Eq. (A2) can be analytically derived from Eq. (1):

$$\frac{\partial S_{fit,i}}{\partial w_i} = \sum_{k=1}^{m_i} \frac{Z_{i,k}}{\frac{Y_{i,k}}{p} + 1} \cdot \frac{2 \cdot Y_{i,k}}{w_i} \tag{A3A}$$

$$\frac{\partial S_{fit,i}}{\partial h_i} = \sum_{k=1}^{m_i} \frac{Z_{i,k}}{h_i} \tag{A3B}$$

$$\frac{\partial S_{fit,i}}{\partial \omega_{^{13}\mathrm{C},\max,i,k}} = \frac{Z_{i,k}}{\frac{Y_{i,k}}{p} + 1} \cdot \frac{2 \cdot Y_{i,k}}{\omega_{^{13}\mathrm{C}} - \omega_{^{13}\mathrm{C},\max,i,k}} \tag{A3C}$$

$$\frac{\partial S_{fit}}{\partial p} = \sum_{i=1}^{F} \sum_{k=1}^{m_i} \left( -Z_{i,k} \cdot \left( \ln\left(\frac{Y_{i,k}}{p} + 1\right) - \frac{Y_{i,k}}{p \cdot \left(\frac{Y_{i,k}}{p} + 1\right)} \right) \right) \tag{A3D}$$

where

$$Y_{i,k} = \left( \frac{\omega_{^{13}\mathrm{C}} - \omega_{^{13}\mathrm{C},\max,i,k}}{w_i} \right)^2 \quad \text{and} \quad Z_{i,k} = \frac{h_i}{\left( \frac{Y_{i,k}}{p} + 1 \right)^p}.$$

In Eq. (A3), the subscript "i" refers to the ith fine structure and "k" refers to the kth peak of a fine structure. F is the total number of fine structures in a multiplet. The covariance matrix of the fine structure areas ($C_A$) follows from $C_\beta$ by two-sided multiplication with a matrix that contains the partial derivatives of $A$ with respect to $\beta$:

$$C_A = J_{A,\beta} \cdot C_\beta \cdot J_{A,\beta}^T. \tag{A4}$$

For a multiplet consisting of a singlet and doublet, $J_{A,\beta}$ is given by

$$J_{A,\beta} = c \cdot \begin{pmatrix} m_s \cdot h_s & 0 & m_s \cdot w_s & 0 & \cdots \\ 0 & m_d \cdot h_d & 0 & m_d \cdot w_d & \cdots \end{pmatrix}. \tag{A5}$$

The covariances of the relative intensities ($C_{RI}$) are derived from $C_A$ in a similar manner:

$$C_{RI} = J_{RI,A} \cdot C_A \cdot J_{RI,A}^T. \tag{A6}$$

Jacobian $J_{RI,A}$ contains the sensitivities of the relative intensities to the fine structure areas. For a multiplet consisting of only a singlet and a doublet fine structure, $J_{RI,A}$ can be derived, according to Van Winden et al. (2001), to be

$$J_{RI,A} = \frac{1}{A_{tot}^2} \cdot \left[ \begin{pmatrix} A_{tot} & 0 \\ 0 & A_{tot} \end{pmatrix} - \begin{pmatrix} A_s & A_s \\ A_d & A_d \end{pmatrix} \right]. \tag{A7}$$

Substituting Eqs. (A1), (A4), (A5), and (A7) in Eq. (A6), one finds that constant $c$ cancels from the final solution, so the value of this constant need not be known.

Examples of Covariances of Peak Areas and Relative Intensities

From the above it is clear that the covariances of the relative intensities ($C_{RI}$) partly stem from correlation of the determined peak areas (accounted for in $C_A$) and partly from the normalization of absolute peak intensities to relative values. The contribution of these two factors to the matrix $C_{RI}$ is illustrated in the following two examples, in which $C_A^*$ and $C_{RI}^*$ are the correlation matrices corresponding to $C_A$ and $C_{RI}$.

(I) Spectrum of the β-serine carbon atom consisting of a nonoverlapping singlet (area 1) and doublet (area 2):

$$C_A = 10^{-8} \cdot \begin{pmatrix} 9.387 & 4.412 \\ & 12.057 \end{pmatrix}, \quad C_A^* = \begin{pmatrix} 1.000 & 0.415 \\ & 1.000 \end{pmatrix},$$

$$C_{RI} = 10^{-5} \cdot \begin{pmatrix} 2.177 & -2.177 \\ & 2.177 \end{pmatrix}, \quad C_{RI}^* = \begin{pmatrix} 1.000 & -1.000 \\ & 1.000 \end{pmatrix}.$$

The correlation matrix $C_A^*$ shows that the errors in the peak surfaces are correlated, although the peaks do not overlap. These correlations stem from the single peak form [parameter "p" in Eq. (1)] that is estimated for all fine structures in a multiplet and from the correlations between the positions of the various peaks [see Eq. (2)]. Matrix $C_{RI}^*$ shows total (negative) correlation of the errors of the relative intensities of the singlet and the doublet, which is caused by the normalization of the two fine structure surfaces.

(II) Spectrum of the β-tyrosine carbon atom consisting of an overlapping singlet (area 1) and middle two double doublet peaks (1/2 × area 3), a doublet (area 2, consisting of two fully overlapping doublets), and two separated outer double doublet peaks (1/2 × area 3):

$$C_A = 10^{-5} \cdot \begin{pmatrix} 0.800 & 0.709 & -0.531 \\ & 4.451 & 1.760 \\ & & 4.657 \end{pmatrix}, \quad C_A^* = \begin{pmatrix} 1.000 & 0.376 & -0.276 \\ & 1.000 & 0.387 \\ & & 1.000 \end{pmatrix},$$

$$C_{RI} = 10^{-4} \cdot \begin{pmatrix} 0.668 & 0.436 & -1.104 \\ & 0.927 & -1.363 \\ & & 2.467 \end{pmatrix}, \quad C_{RI}^* = \begin{pmatrix} 1.000 & 0.554 & -0.860 \\ & 1.000 & -0.901 \\ & & 1.000 \end{pmatrix}.$$
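The effect of the normalization in Eqs. (A6) and (A7) can be reproduced numerically. The Python sketch below is illustrative only: the fine structure areas are hypothetical (they are not reported with the examples), the common scale factor of $C_A$ is left out because it does not affect correlations, and the helper names are invented for this illustration. Eq. (A7) is written here for F fine structures as $(J_{RI,A})_{ij} = (\delta_{ij} A_{tot} - A_i)/A_{tot}^2$, which reduces to Eq. (A7) for a singlet and a doublet.

```python
import math

def ri_jacobian(areas):
    """Sensitivities of relative intensities to areas: (delta_ij*A_tot - A_i) / A_tot**2."""
    a_tot = sum(areas)
    n = len(areas)
    return [[((a_tot if i == j else 0.0) - areas[i]) / a_tot ** 2 for j in range(n)]
            for i in range(n)]

def propagate(jac, cov):
    """Two-sided multiplication J * C * J^T, as in Eqs. (A4) and (A6)."""
    n_out, n_in = len(jac), len(cov)
    return [[sum(jac[i][k] * cov[k][l] * jac[j][l]
                 for k in range(n_in) for l in range(n_in))
             for j in range(n_out)]
            for i in range(n_out)]

def correlation(cov):
    """Correlation matrix corresponding to a covariance matrix."""
    sd = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(len(cov))]
            for i in range(len(cov))]

# C_A of example (I), written out symmetrically and without its scale factor.
cov_a = [[9.387, 4.412], [4.412, 12.057]]
areas = [0.6, 0.4]  # hypothetical singlet and doublet areas

cov_ri = propagate(ri_jacobian(areas), cov_a)
corr_a = correlation(cov_a)
corr_ri = correlation(cov_ri)
```

Whatever positive areas are chosen, the two relative intensities of a singlet-plus-doublet multiplet sum to one, so `corr_ri[0][1]` evaluates to -1 (up to rounding), in line with the matrix $C_{RI}^*$ of example (I).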
In the correlation matrix $C_A^*$ one sees that the errors of the nonoverlapping singlet and doublet areas and of the doublet and double doublet areas are somewhat positively correlated, for the reasons mentioned above. The errors of the singlet and double doublet areas, on the other hand, are negatively correlated [element $(C_A^*)_{1,3}$] due to the partial overlap between these fine structures. Comparison of $C_A^*$ and $C_{RI}^*$ again shows that normalization of the fine structure areas leads to larger correlations of the errors. It also shows that positively correlated errors of the peak surfaces do not necessarily lead to positively correlated errors of the relative intensities [see elements $(C_A^*)_{2,3}$ and $(C_{RI}^*)_{2,3}$]. This can be understood by recognizing that an overestimation of the absolute areas of a large and a small peak in a multiplet may lead to a decrease of the estimated relative area of the large peak and an increase of the relative area of the small one.

APPENDIX B
Isotopic Non-Steady-State Correction of 2D [13C, 1H] COSY Spectral Data

Sections of the 2D [13C, 1H] COSY spectrum yield one-dimensional multiplets for the various carbon atoms of biomass components. The relative intensity $x_{i,f}$ of a fine structure "f" in the multiplet of the ith carbon atom of biomass component X can be calculated from the fractional enrichment $x_i$ and from the sum of the fractions of the isotopomers that give rise to the fine structure concerned:

$$x_{i,f}(t) = \frac{\sum x_{i,bin}(t)}{x_i(t)}. \tag{B1}$$
The numerator on the right-hand side of Eq. (B1) is the sum of the isotopomer fractions. The subscript "bin" denotes the "binary isotopomer notation" as introduced by Schmidt et al. (1997). For example, for the singlet ("s") and doublet ("d") of the third (β) carbon of alanine, Eq. (B1) reads

$$ala_{3,s}(t) = \frac{\sum ala_{?01}(t)}{ala_3(t)}, \qquad ala_{3,d}(t) = \frac{\sum ala_{?11}(t)}{ala_3(t)}.$$

In the equations above, a question mark in a binary subscript denotes that both zero and one are allowed. Applying the isotopic non-steady-state correction [Eq. (4)] to both the numerator and denominator of Eq. (B1) yields

$$x_{i,f}(\infty) = \frac{\dfrac{\sum x_{i,bin}(t) - e^{-\mu \cdot t} \cdot \sum x_{i,bin}(0)}{1 - e^{-\mu \cdot t}}}{\dfrac{x_i(t) - e^{-\mu \cdot t} \cdot P_n}{1 - e^{-\mu \cdot t}}} = \frac{x_{i,f}(t) \cdot x_i(t) - x_{i,f}(0) \cdot e^{-\mu \cdot t} \cdot P_n}{x_i(t) - e^{-\mu \cdot t} \cdot P_n}. \tag{B2}$$
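The last form of Eq. (B2) translates directly into a small function. The Python sketch below is an illustration under stated assumptions, not the implementation used for the results in this paper: the function and variable names are hypothetical, all numerical values are invented, and the helper `washout` is a forward model assumed here only to generate self-consistent test data (a first-order approach of each quantity from its initial to its steady-state value, which is the relation that the first form of Eq. (B2) inverts).

```python
import math

def correct_relative_intensity(x_f_t, x_t, x_f_0, p_n, mu, t):
    """Eq. (B2): steady-state relative intensity of one fine structure.

    x_f_t : measured relative intensity at time t
    x_t   : measured fractional enrichment at time t
    x_f_0 : relative intensity at t = 0
    p_n   : fractional enrichment at t = 0 (P_n in Eq. (B2))
    """
    decay = math.exp(-mu * t)
    return (x_f_t * x_t - x_f_0 * decay * p_n) / (x_t - decay * p_n)

def washout(value_inf, value_0, mu, t):
    """Hypothetical forward model used only to build synthetic data."""
    decay = math.exp(-mu * t)
    return value_inf * (1.0 - decay) + value_0 * decay

# Invented numbers for illustration.
mu, t = 0.1, 20.0
p_n, x_f_0 = 0.011, 0.98
x_inf, s_inf = 0.10, 0.04          # steady-state enrichment and isotopomer sum

x_t = washout(x_inf, p_n, mu, t)             # enrichment observed at time t
s_t = washout(s_inf, x_f_0 * p_n, mu, t)     # isotopomer sum observed at time t
x_f_t = s_t / x_t                            # relative intensity via Eq. (B1)

x_f_inf = correct_relative_intensity(x_f_t, x_t, x_f_0, p_n, mu, t)
```

With these synthetic inputs the uncorrected relative intensity `x_f_t` is biased toward the initial labeling state, whereas `x_f_inf` recovers the steady-state ratio `s_inf / x_inf`.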
APPENDIX C
Significant Deviation of Relative Intensities

Whether two vectors of relative intensities (RI) differ significantly can be judged on the basis of the residual vector of the two subtracted vectors and the covariance matrix of the residual vector. This covariance matrix is simply the sum of the covariance matrices ($C_{RI}$) of the subtracted vectors. The covariance-weighted sum of squares ($SS_{weighted}$) of the elements of the residual is

$$SS_{weighted} = (RI_1 - RI_2)^T \cdot (C_{RI,1} + C_{RI,2})^{*} \cdot (RI_1 - RI_2). \tag{C1}$$

In Eq. (C1), "*" represents the generalized pseudoinverse. The use of the pseudoinverse implicitly accounts for the dependence that stems from the fact that the sums of the elements of both subtracted vectors equal one. $SS_{weighted}$ has a $\chi^2$ distribution in case the covariances are known. In our case the covariances are estimated from the spectral noise (Appendix A), but it was found by means of Monte Carlo simulation that the $\chi^2$ distribution describes the true distribution of $SS_{weighted}$ well. The number of degrees of freedom of the $\chi^2$ distribution of the weighted sum of squares equals the number of relative intensities of a multiplet (F) minus one (due to the discussed dependency of the residual elements). Thus, the significance of the deviation between two relative intensity vectors can be tested by evaluating the one-tailed probability P that a given deviation would be fortuitously found:

$$P\left(SS_{weighted} \le \chi^2(F-1)\right) < \alpha. \tag{C2}$$
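The one-tailed probability of Eq. (C2) can be evaluated without special software for the small integer numbers of degrees of freedom that occur here. The Python sketch below is an illustration, not the code used for this paper; the function name is hypothetical, and it relies on the standard closed-form expressions for the upper tail of the $\chi^2$ distribution with an integer number of degrees of freedom. The inputs are the $SS_{weighted}$ values of Table 4, with F - 1 degrees of freedom taken from the number of summed relative intensities per carbon atom.

```python
import math

def chi2_upper_tail(ss_weighted, dof):
    """P(chi2(dof) >= ss_weighted) for an integer number of degrees of freedom."""
    if ss_weighted <= 0.0:
        return 1.0
    half = ss_weighted / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{k < dof/2} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, dof // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd dof: erfc(sqrt(x/2)) plus a finite series in odd powers of sqrt(x)
    total = 0.0
    term = math.sqrt(ss_weighted)
    for r in range(1, (dof - 1) // 2 + 1):
        if r > 1:
            term *= ss_weighted / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2.0 / math.pi) * math.exp(-half) * total

# SS_weighted and number of relative intensities F per carbon atom (Table 4).
table4 = {
    "beta-histidine": (37.511, 4),
    "delta-histidine": (11.089, 2),
    "delta-tyrosine": (3.759, 3),
    "epsilon-tyrosine": (7.398, 3),
}
alpha = 0.05
p_values = {name: chi2_upper_tail(ss, f - 1) for name, (ss, f) in table4.items()}
significant = {name: p < alpha for name, p in p_values.items()}
```

Under these assumptions the computed probabilities agree with the P column of Table 4, and only δ-tyrosine fails the 5% criterion.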
REFERENCES
Bartels, C., Guntert, P., Billeter, M., and Wuthrich, K. (1997). GARANT - A general algorithm for resonance assignment of multidimensional nuclear magnetic resonance spectra. J. Comp. Chem. 18(1), 139–149.
Bax, A., and Pochapsky, S. S. (1992). Optimized recording of heteronuclear multidimensional NMR spectra using pulsed field gradients. J. Magn. Res. 99, 638–643.
Brenner, M., Muller, H. R., and Pfister, R. W. (1950). A new enzymatic synthesis. Helv. Chim. Acta 33, 568.
Conny, J. M., and Powell, C. J. (2000). Standard test data for estimating peak parameter errors in X-ray photoelectron spectroscopy. Surf. Interface Anal. 29, 856–872.
Coulter, J. R., and Hahn, C. S. (1968). Practical quantitative gas chromatographic analysis of amino acids using the n-propyl N-acetyl esters. J. Chromatogr. 36, 42.
De Graaf, A. A., Striegel, K., Wittig, R. M., Laufer, B., Schmitz, G., Wiechert, W., Sprenger, G. A., and Sahm, H. (1999). Metabolic state of Zymomonas mobilis in glucose-, fructose-, and xylose-fed continuous cultures as analysed by 13C- and 31P-NMR spectroscopy. Arch. Microbiol. 171, 371–385.
Fiaux, J., Andersson, C. I. J., Holmberg, N., Bulow, L., Kallio, P. T., Szyperski, T., Bailey, J. E., and Wuthrich, K. (1999). 13C NMR flux ratio analysis of Escherichia coli central carbon metabolism in microaerobic bioprocesses. JACS 121, 1407–1408.
Ge, W., Lee, H. K., and Nalcioglu, O. (1993). Simultaneous nonlinear least squares fitting technique for NMR spectroscopy. IEEE Nuclear Science Symposium on Medical Imaging Conference 2, San Francisco, CA, pp. 1322–1326.
Gunther, U. L., Ludwig, C., and Ruterjans, H. (2000). NMRLAB - Advanced NMR data processing in MATLAB. J. Magn. Res. 145, 201–208.
Hansen, P. E. (1988). Isotope effects in nuclear shielding. Prog. NMR Spectrosc. 20, 207–255.
Jucker, B. M., Lee, J. Y., and Shulman, R. G. (1998). In vivo 13C NMR measurements of hepatocellular tricarboxylic acid cycle flux. J. Biol. Chem. 273(20), 12187–12194.
Malloy, C. R., Sherry, A. D., and Jeffrey, F. M. (1987). Carbon flux through citric acid cycle pathways in perfused heart by 13C NMR spectroscopy. FEBS Lett. 212(1), 58–62.
Marshall, I., Bruce, S. D., Higinbotham, J., MacLullich, A., Wardlaw, J. M., Ferguson, K. J., and Seckl, J. (2000). Choice of spectroscopic lineshape model affects metabolite peak areas and area ratios. Magn. Res. Med. 44, 646–649.
Marx, A., de Graaf, A. A., Wiechert, W., Eggeling, L., and Sahm, H. (1996). Determination of the fluxes in the central metabolism of Corynebacterium glutamicum by nuclear magnetic resonance spectroscopy combined with metabolic balancing. Biotechnol. Bioeng. 49(2), 111–129.
Mollney, M., Wiechert, W., Kownatzki, D., and de Graaf, A. A. (1999). Bidirectional reaction steps in metabolic networks. IV. Optimal design of isotopomer labeling experiments. Biotechnol. Bioeng. 66(2), 86–103.
Petersen, S., de Graaf, A. A., Eggeling, L., Mollney, M., Wiechert, W., and Sahm, H. (2000). In vivo quantification of parallel and bidirectional fluxes in the anaplerosis of Corynebacterium glutamicum. J. Biol. Chem. 275(46), 35932–35941.
Portais, J.-C., Schuster, R., Merle, M., and Canioni, P. (1993). Metabolic flux determination in C6 glioma cells using carbon-13 distribution upon [1-13C]glucose incubation. Eur. J. Biochem. 217, 457–468.
Rizzi, M., Baltes, M., Theobald, U., and Reuss, M. (1997). In vivo analysis of metabolic dynamics in Saccharomyces cerevisiae. II. Mathematical model. Biotechnol. Bioeng. 55(4), 592–608.
Sauer, U., Lasko, D. R., Fiaux, J., Hochuli, M., Glaser, R., Szyperski, T., Wuthrich, K., and Bailey, J. E. (1999). Metabolic flux ratio analysis of genetic and environmental modulations of Escherichia coli central carbon metabolism. J. Bacteriol. 181(21), 6679–6688.
Schmidt, K., Carlsen, M., Nielsen, J., and Villadsen, J. (1997). Modeling isotopomer distributions in metabolic networks using isotopomer mapping matrices. Biotechnol. Bioeng. 55(6), 831–840.
Schmidt, K., Nørregaard, L. C., Pedersen, B., Meissner, A., Duus, J. Ø., Nielsen, J. Ø., and Villadsen, J. (1999). Quantification of intracellular metabolic fluxes from fractional enrichment and 13C-13C coupling constraints on the isotopomer distribution in labeled biomass components. Metab. Eng. 1, 166–179.
Sonntag, K., Eggeling, L., De Graaf, A. A., and Sahm, H. (1993). Flux partitioning in the split pathway of lysine synthesis in Corynebacterium glutamicum. Eur. J. Biochem. 213, 1325–1331.
Stephanopoulos, G. N., Aristidou, A. A., and Nielsen, J. (1998). "Metabolic Engineering. Principles and Methodologies," Academic Press, San Diego.
Stephenson, D. S., and Binsch, G. (1980). Automated analysis of high-resolution NMR spectra. I. Principles and computational strategy. J. Magn. Res. 37, 395–407.
Subhash, N., and Mohanan, C. N. (1997). Curve-fit analysis of chlorophyll fluorescence spectra: Application to nutrient stress detection in sunflower. Remote Sens. Environ. 60, 347–356.
Szyperski, T. (1995). Biosynthetically directed fractional 13C-labeling of proteinogenic amino acids. An efficient analytical tool to investigate intermediary metabolism. Eur. J. Biochem. 232, 433–448.
Szyperski, T., Bailey, J. E., and Wuthrich, K. (1996). Detecting and dissecting metabolic fluxes using biosynthetic fractional 13C labeling and two-dimensional NMR spectroscopy. TIBTECH 14, 453–459.
Szyperski, T. (1998). 13C-NMR, MS and metabolic flux balancing in biotechnology research. Q. Rev. Biophys. 31(1), 41–106.
Szyperski, T., Glaser, R. W., Hochuli, M., Fiaux, J., Sauer, U., Bailey, J. E., and Wuthrich, K. (1999). Bioreaction network topology and metabolic flux ratio analysis by biosynthetic fractional 13C labeling and two-dimensional NMR spectroscopy. Metab. Eng. 1, 189–197.
Ter Schure, E. G., Sillje, H. H. W., Verkleij, A. J., Boonstra, J., and Verrips, C. T. (1995). The concentration of ammonia regulates nitrogen metabolism in Saccharomyces cerevisiae. J. Bacteriol. 177(22), 6672–6675.
Theobald, U., Mailinger, W., Baltes, M., Rizzi, M., and Reuss, M. (1997). In vivo analysis of metabolic dynamics in Saccharomyces cerevisiae. I. Experimental observations. Biotechnol. Bioeng. 55(2), 305–316.
Tran-Dinh, S., Bouet, F., Huynh, Q.-T., and Herve, M. (1996). Mathematical models for determining metabolic fluxes through the citric acid and the glyoxylate cycles in Saccharomyces cerevisiae by 13C-NMR spectroscopy. Eur. J. Biochem. 242, 770–778.
Van Gulik, de Laat, W. T. A. M., Vinke, J. L., and Heijnen, J. J. (2000). Application of metabolic flux analysis for the identification of metabolic bottlenecks in the biosynthesis of penicillin-G. Biotechnol. Bioeng. 68(6), 602–618.
Van Winden, W. A., Verheijen, P. J. T., and Heijnen, J. J. (2001). Possible pitfalls of flux calculations based on 13C-labeling. Metab. Eng. 3, 151–162.
Vaseghi, S., Baumeister, A., Rizzi, M., and Reuss, M. (1999). In vivo dynamics of the pentose phosphate pathway in Saccharomyces cerevisiae. Metab. Eng. 1, 128–140.
Wittig, R., Mollney, M., Wiechert, W., and de Graaf, A. A. (1995). Interactive evaluation of NMR spectra from in vivo isotope labeling experiments. IFAC Comparative and Applied Biotechnology, Garmisch-Partenkirchen, Germany, pp. 230–233.