ARTICLE IN PRESS
Metabolic Engineering 7 (2005) 445–456 www.elsevier.com/locate/ymben
Genetically constrained metabolic flux analysis
Steven J. Cox (a), Sagit Shalel Levanon (b), George N. Bennett (b), Ka-Yiu San (c, d)
(a) Department of Computational and Applied Mathematics, Rice University, Houston, TX, USA
(b) Department of Biochemistry and Cell Biology, Rice University, Houston, TX, USA
(c) Department of Bioengineering, Rice University, Houston, TX, USA
(d) Department of Chemical Engineering, Rice University, Houston, TX, USA
Received 22 March 2005; accepted 22 July 2005. Available online 6 September 2005
Abstract
Significant progress has been made in using existing metabolic databases to estimate metabolic fluxes. Traditional metabolic flux analysis generally starts with a predetermined metabolic network. This approach has been employed successfully to analyze the behaviors of recombinant strains by manually adding or removing the corresponding pathway(s) in the metabolic map. The current work focuses on the development of a new framework that utilizes genomic and metabolic databases, including available genetic/regulatory network structures and gene chip expression data, to constrain metabolic flux analysis. The genetic network consisting of the sensing/regulatory circuits will activate or deactivate a specific set of genes in response to external stimulus. The activation and/or repression of this set of genes will result in different gene expression levels that will in turn change the structure of the metabolic map. Hence, the metabolic map will automatically "adapt" to the external stimulus as captured by the genetic network. This adaptation selects a subnetwork from the pool of feasible reactions and so performs what we term "environmentally driven dimensional reduction." The Escherichia coli oxygen and redox sensing/regulatory system, which controls the metabolic patterns connected to glycolysis and the TCA cycle, was used as a model system to illustrate the proposed approach.
© 2005 Elsevier Inc. All rights reserved.
Keywords: Genetic regulatory network; Metabolic network; Metabolic flux analysis
1. Introduction
Significant progress has been made in using existing metabolic databases to estimate metabolic fluxes. This progress is mainly due to advances in the area of metabolic flux balance analysis (FBA) (Edwards et al., 2002; Schilling et al., 2001; Wiback et al., 2004) and the development of experimental and analytical techniques, such as the use of isotope labeling in combination with nuclear magnetic resonance and mass spectrometry (Blank and Sauer, 2004; Klapa et al., 1999; Klapa et al., 2003; Sriram and Shanks, 2004; Zhao et al., 2004). These techniques enable a more accurate account of the intracellular metabolic fluxes.
Corresponding author. Department of Bioengineering, MS 142, 6100 Main Street, Rice University, Houston, TX 77005, USA. E-mail address: [email protected] (K.-Y. San).
1096-7176/$ - see front matter © 2005 Elsevier Inc. All rights reserved. doi:10.1016/j.ymben.2005.07.004
At the same time, a significant amount of knowledge has been accumulated in the past decades in the area of gene regulatory networks. The pace of discovery of new regulatory elements and their interactions has been hastened with recent advances in high throughput measuring techniques, such as gene chip array technology. Some of this information has been cataloged and organized in databases that are accessible from the World Wide Web, such as BioCyc (http://biocyc.org). The development and application of modeling methodologies (Boolean, differential and stochastic, and their hybrids) for gene regulatory networks has received considerable attention (see, e.g., the review by De Jong, 2002) but yet has rarely made the downstream link to metabolism or the upstream link to environment. In this article, an integrated framework, which utilizes the static databases to describe cellular behavior upon genetic and/or environmental changes, is presented.
A quantitative scheme in which existing knowledge of a genetic network is used to generate the appropriate metabolic map under specific conditions is developed. Metabolic flux analysis of the system is thus based on this updated metabolic map. As such, the full analysis, although it is based on static database information, becomes dynamic in nature as the metabolic map can be constantly adjusted to respond to any genetic and/or environmental change.
2. Background
Traditional FBA is generally based on a predetermined metabolic network (also known as a metabolic pathway map), as shown schematically in Fig. 1. FBA requires only the stoichiometry of biochemical pathways. When coupled with experimental measurements, this approach has been extremely useful in elucidating intracellular metabolic flux patterns under various genetic and environmental perturbations in a number of systems. FBA-based modeling and theoretical studies involving large-scale metabolic networks have also been used to provide insight into the "extremal" behaviors of organisms (Edwards et al., 2002; Schilling et al., 2001; Wiback et al., 2004). Recently, FBA has been employed to analyze the response of metabolic networks after gene deletions or additions (Burgard and Maranas, 2001; Burgard et al., 2003). Most applications of traditional metabolic FBA start with a predescribed metabolic map, which is generated from existing pathway databases (such as KEGG Metabolic Pathways). The final metabolic map of a particular cell line under investigation is usually obtained by modifying the reference map from existing databases to reflect the associated genetic changes, such as deletions of particular pathways or addition of new pathways (Fig. 1). It should be emphasized, however, that the resulting pathway map is static in the sense that the whole analysis depends solely on the genetic makeup of the cell line and cannot capture any potential effects due to changes in environmental conditions.
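To make the role of stoichiometry concrete, the following minimal sketch checks the steady-state balance that FBA imposes on internal metabolites. The three-reaction chain, the matrix `S`, and the helper `balance` are hypothetical illustrations and are not taken from the paper.

```python
# Hypothetical toy network: uptake -> A, A -> B, B -> secretion.
# Rows are internal metabolites (A, B); columns are reactions (v1, v2, v3).
S = [
    [1, -1, 0],   # A: produced by v1, consumed by v2
    [0, 1, -1],   # B: produced by v2, consumed by v3
]


def balance(S, v):
    """Return the net production rate of each metabolite, i.e. the product S v."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


steady = balance(S, [2.0, 2.0, 2.0])    # equal fluxes: nothing accumulates
unsteady = balance(S, [2.0, 1.0, 1.0])  # A accumulates because v2 < v1
```

A flux vector is admissible at steady state only when every entry of the returned list is zero, which is the constraint that FBA places on the reaction rates.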
2.1. Genetic network assisted MFA
It is well known that environmental and genetic perturbations have major effects on the metabolic behavior of cell cultures. For example, the metabolic profiles of Escherichia coli change according to the growth conditions with factors such as the availability of oxygen, pH, osmotic pressure, temperature and nutrient source. Some of these regulatory networks have been extensively studied (Gunsalus, 1992; Lin and Iuchi, 1991; Lynch and Lin, 1996; Park et al., 1997; Unden et al., 1995). For example, based on genome data, it is estimated that E. coli has about 180 regulons and 1120 regulatory interactions (Salgado et al., 2004). This information has been updated constantly; for example, the number of identified regulatory interactions was only 433 in 1998. In general, a regulatory network will direct the activation or repression of a set of genes in response to a specific environmental stimulus, such as oxygen or pH. This gene activation/repression action will subsequently result in a significant change in the pathway network. In view of the existence of these highly sophisticated networks and their critical roles in dictating the ultimate cellular responses, we have developed a new analysis scheme by extending traditional FBA to include these regulatory networks (Fig. 2). In the new scheme, the reference metabolic network map will still be generated from existing pathway databases as discussed before. However, two new components, "genetic network" and "expression pattern" (inside dashed box, Fig. 2), which are designed to capture any changes in the metabolic pathways in response to environmental variation, are added (San et al., 2003). The "genetic network" consists of the sensing/regulatory circuits, which will activate or repress a specific set of genes in response to external stimulus. The activation and/or repression of this set of genes will result in different gene expression levels that will in turn change the structure of the metabolic map. Hence, the metabolic map will automatically "adapt" to the external stimulus as captured by the genetic network. This critical genetic network component will be constructed from existing gene regulation knowledge in the literature and can be supplemented with gene expression patterns from gene chip expression analysis experiments. The construction of the genetic network will be illustrated in later sections using the E. coli oxygen sensing/regulatory network as an example.

Fig. 1. Traditional flux balance analysis (FBA). [Flow chart with boxes labeled: Genome Database, Pathway Database, A priori Knowledge, Metabolic Network, FBA, Metabolic Pattern.]

Fig. 2. Schematic of genetic network assisted flux balance analysis. [Flow chart with boxes labeled: Genome Database, Pathway Database, A priori Knowledge, Genetic Structure, Environmental Conditions, Genetic Network, Gene Regulation, Predicted Expression Patterns, Gene Chip Experiments, Expression Pattern Data, Metabolic Network, FBA, Metabolic Pattern.]
3. Example: oxygen sensing/regulatory network with Arc and FNR regulation
3.1. Background
The bacterium E. coli possesses a large number of sensing/regulation systems for rapid response to environmental changes. These regulation systems allow variation in the way electrons are channeled from donor to terminal acceptors such that the overall potential difference is maximized for any given growth condition. The adaptive responses are coordinated by a group of global regulators, which includes the one-component fumarate and nitrate reduction (FNR) protein and the two-component anoxic redox control (Arc) system (Fig. 3). With the initial onset of anaerobiosis ArcA is activated, and if these conditions persist or become more anaerobic, FNR is activated, leading in turn to the upregulation of ArcA and amplification of its effect (Guest et al., 1996).
The Arc system is a two-component regulatory system composed of ArcB, the transmembrane histidine kinase sensor, and ArcA, the cytosolic response regulator. ArcB activity is decreased by the presence of oxidized quinones during the transition from microanaerobic to aerobic growth (Georgellis et al., 2001). In the absence of inhibitor (when more reduced quinones are present) ArcB undergoes autophosphorylation and the phosphoryl group is transferred to ArcA by a His-Asp-His-Asp phosphorelay (Georgellis et al., 1999). Consequently, the increased level of phosphorylated ArcA represses the synthesis of some enzymes, such as the citric acid cycle enzymes, succinate dehydrogenases, lactate dehydrogenase, fumarase, pyruvate dehydrogenase, and the low oxygen affinity cytochrome o oxidase, while it activates the expression of other enzymes such as cytochrome d oxidase and enzymes involved in fermentative metabolism (Lin and Iuchi, 1991; Lynch and Lin, 1996; Unden et al., 1995; Georgellis et al., 1998, 1999).
A number of genetic studies indicate that ArcA has distinct physiological roles in the regulation of aerobic gene expression. (1) ArcA is required for normal activation of hemA expression (encoding glutamyl-tRNA dehydrogenase) during both anaerobic and aerobic growth. (2) A functional arcA gene is required for activation of cyd operon expression aerobically and anaerobically in fnr mutant cells. (3) The arcA gene product serves as a repressor for the gltA gene (encoding citrate synthase) in both aerobic and anaerobic cells (Lynch and Lin, 1996).
FNR activates and represses target genes in response to anaerobiosis, and FNR contains a bound iron that serves as a redox sensor. Most of the FNR modulon is concerned with maximizing the capacity for anaerobic energy generation. Thus, the FNR system induces the expression of genes that permit anaerobically growing E. coli to transfer electrons to alternative terminal acceptors (Lin and Iuchi, 1991). It also represses the aerobic genes, cytochrome d and o oxidase, and NADH dehydrogenase II. It acts as a positive regulator of genes expressed under anaerobic fermentative conditions such as aspartase, formate dehydrogenases, fumarate reductase, and pyruvate formate lyase (Guest et al., 1996). Moreover, arcA belongs to the FNR modulon. Thus, in addition to direct effects of FNR on anaerobic gene expression, two distinct mechanisms may indirectly affect anaerobic gene expression via effects on the Arc system: (1) elevation of ArcA levels in anaerobic cells, and (2) mediation of metabolic responses that change the concentration of certain anaerobic metabolites, which affect ArcB function (Lynch and Lin, 1996). This elevates FNR to the highest ranking regulator of anaerobic gene expression.

Fig. 3. Schematic showing selected oxygen and redox sensing pathways in E. coli (adapted from Sawers, 1999). [Diagram labels: cytoplasmic membrane; e- transport; redox, metabolites; O2; FNR; ArcB; ArcA; ArcA-P; Aer; Dos; CheW,A,Y; transcription; energy taxis; unknown.]
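The regulatory hierarchy described above can be summarized as a small Boolean sketch. It is an illustration under simplifying assumptions, not the authors' implementation: oxygen is treated as strictly present or absent, only three target genes are shown, and the function names, argument names and dictionary keys are ours.

```python
def regulators(o2_present, arcA_intact=True, fnr_intact=True):
    """Boolean activity of the two global regulators.

    Assumption: phosphorylated ArcA and active FNR exist only when oxygen
    is absent and the corresponding gene has not been knocked out.
    """
    anaerobic = not o2_present
    return {
        "ArcA-P": anaerobic and arcA_intact,
        "FNR": anaerobic and fnr_intact,
    }


def expression(o2_present, **strain):
    """On/off state of three illustrative target genes."""
    reg = regulators(o2_present, **strain)
    return {
        "icd": not reg["ArcA-P"],  # ArcA-P represses isocitrate dehydrogenase
        "frd": reg["FNR"],         # FNR activates fumarate reductase
        "pfl": reg["FNR"],         # FNR activates pyruvate formate lyase
    }


wild_type_aerobic = expression(True)
wild_type_anaerobic = expression(False)
arcA_mutant_anaerobic = expression(False, arcA_intact=False)
```

Passing `arcA_intact=False` or `fnr_intact=False` mimics a mutant strain, so the same rules can be queried for wild type and knockout backgrounds.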
4. Mathematical modeling of genetic networks: results and discussions
The central metabolic network is shown in Fig. 4. Notice that this map includes pathways that are active in either aerobic or anaerobic conditions. In other words, this network represents the "master" catch-all system that can normally be found in the static databases. The individual reactions together with the enzymes involved are listed in Appendix A. We shall be concerned with balancing the flux of m metabolites through a reaction network comprised of r reactions. The balance will take the form

Sv = b,

where S is the m-by-r stoichiometric matrix, v is the r-by-1 vector of reaction rates and b is the m-by-1 vector of measurable external fluxes. Each column of S corresponds to a particular reaction that requires a particular enzyme that in turn requires the expression of a particular structural gene. Hence, the state of the gene network directly and automatically, i.e., without human intervention, determines the row and column structure of S. In what follows we shall construct a master S that encodes each of the reactions of Fig. 4, and then study how S varies as we perturb the environment (via O2) that in turn perturbs the gene network via FNR and ArcA/B.

Fig. 4. Major central metabolic pathways showing some key genes and enzymes. Enzyme numbers [E.C.] and genes associated with several pathways are also included. Some reaction numbers, for identification purposes, are labeled in rectangular boxes.

Fig. 5. Main constituents of the FNR-ArcA/B regulon that impact major metabolites. [Diagram: FNR, ArcB and ArcA connected by activation and repression arrows to aspA, frdABCD, fumB, pfl, cyd, cyo, aceB, fumC, acnB, aceEF, sdhCDAB, sucCD, fumA, mdh, gltA, icd and sucAB.]

Fig. 5 shows a descriptive version of the table in Appendix B indicating the regulatory impact of the transcription factors, FNR and ArcA. To illustrate the proposed modeling scheme, we will examine two extreme cases that are either purely aerobic or anaerobic. Under these two conditions, we will assume the related genes are either on or off and so we model the gene activity via Boolean variables (Kauffman, 1974; De Jong, 2002). In particular, ArcB and FNR are on in the absence of oxygen (Guest et al., 1996; Lynch and Lin, 1996). When two genes regulate a third we suppose their action to be an 'or.' Now, if the O2 level is a Boolean variable, O2 = 1 (aerobic) and O2 = 0 (anaerobic), we may determine the action of the gene network on stoichiometry via the two functions row(O2) and col(O2). Here row(O2) is a numerical list of the row numbers of the participating metabolites while col(O2) is a numerical list of the column numbers of the participating reactions. Together they automatically select a submatrix of the master stoichiometric matrix via S(row(O2), col(O2)). We put this into practice first on the small network in Fig. 6. Here we have two enzymes directing the flow of 7 metabolites and so the dimension of the master stoichiometric matrix is 7-by-2, where flow in carries a plus sign and flow out carries a minus sign.
$$
\begin{bmatrix}
 1 &  0 \\
 1 &  0 \\
-1 &  0 \\
-1 & -1 \\
 1 &  1 \\
-1 & -1 \\
 0 &  1
\end{bmatrix}
\begin{bmatrix} v_{pdh} \\ v_{pfl} \end{bmatrix}
=
\begin{bmatrix} r_c \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ r_f \end{bmatrix}
\qquad
\begin{array}{l}
\mathrm{CO_2} \\ \mathrm{NADH} \\ \mathrm{NAD^+} \\ \mathrm{Pyr} \\ \mathrm{AcCoA} \\ \mathrm{HSCoA} \\ \mathrm{Formate}
\end{array}
$$

Fig. 6. A small sub-network on which one may demonstrate cause and effect of the aerobic/anaerobic switch. [Diagram: the stimulus O2 acts on the sensors/regulators ArcA and FNR, which act on the genes aceEF and pfl encoding the enzymes PDH and PFL; the metabolites shown are pyruvate, NAD+, HSCoA, acetyl-CoA, NADH, CO2 and formate; arrows denote stimulation or inactivation.]
The O2 level then distinguishes the anaerobic and aerobic reactions, namely col(0) = 2 and col(1) = 1, and the associated anaerobic and aerobic metabolites, row(0) = [4 5 6 7] and row(1) = [1 2 3 4 5 6]. Now, with regard to the full coupled system of Figs. 4 and 5, the master stoichiometric matrix encodes 25 possible reactions between 16 potential intermediates:
1. G6P
2. G3P
3. PEP
4. PYR
5. acetyl CoA
6. citrate
7. isocitrate
8. 2-ketoglutarate
9. succinyl-CoA
10. succinate
11. fumarate
12. malate
13. OAA
14. glyoxylate
15. aspartate
16. NADH
The full 16-by-25 system is presented in Appendix C and can be represented as S_i v = 0.
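Returning to the small network of Fig. 6, the selection S(row(O2), col(O2)) can be sketched as follows. This is a minimal illustration, not the authors' code: the matrix signs assume that a produced metabolite carries +1 and a consumed one carries -1, following the PDH and PFL reactions of Appendix A, and the rule that a metabolite row is kept whenever it takes part in an active reaction is an assumption that happens to reproduce the index lists quoted above for this toy case.

```python
METABOLITES = ["CO2", "NADH", "NAD+", "Pyr", "AcCoA", "HSCoA", "Formate"]
GENES = ["aceEF", "pfl"]          # column 1 encodes PDH, column 2 encodes PFL

# Master stoichiometric matrix: rows follow METABOLITES, columns follow GENES.
S_MASTER = [
    [1, 0],     # CO2 produced by PDH
    [1, 0],     # NADH produced by PDH
    [-1, 0],    # NAD+ consumed by PDH
    [-1, -1],   # pyruvate consumed by both
    [1, 1],     # acetyl-CoA produced by both
    [-1, -1],   # HSCoA consumed by both
    [0, 1],     # formate produced by PFL
]


def col(o2):
    """1-based column numbers of the reactions whose genes are on."""
    anaerobic = (o2 == 0)
    arcA = anaerobic               # both regulators are on without oxygen
    fnr = anaerobic
    gene_on = {
        "aceEF": not arcA,         # ArcA inactivates aceEF
        "pfl": arcA or fnr,        # 'or' rule: either regulator stimulates pfl
    }
    return [j for j, g in enumerate(GENES, start=1) if gene_on[g]]


def row(o2):
    """1-based row numbers of metabolites taking part in an active reaction."""
    cols = col(o2)
    return [i for i, r in enumerate(S_MASTER, start=1)
            if any(r[j - 1] != 0 for j in cols)]


def submatrix(o2):
    """Select S(row(O2), col(O2)) from the master matrix."""
    return [[S_MASTER[i - 1][j - 1] for j in col(o2)] for i in row(o2)]
```

Calling `submatrix(0)` keeps only the PFL column and the four metabolites it touches, while `submatrix(1)` keeps the PDH column and its six metabolites.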
The associated metabolic map is depicted in Fig. 4. This system is indeed underdetermined in the sense that the dimension of the null space of S_i (ignoring NADH balance) is 10. In other words, FBA reduces the original 25-variable problem to a 10-variable problem. The environment, acting through the gene net, places further constraints on metabolism. In particular, in the absence of oxygen, reactions 6 and 14 through 20 are inactivated and so

col(0) = [1:5 7:13 21:25],

and neither glyoxylate nor succinyl-CoA is produced and neither 2-ketoglutarate nor succinate is balanced and so

row(0) = [1:7 11:13 15 16].

As a result, S_0, the stoichiometric matrix associated with O2 = 0, is 12-by-17 and has a five-dimensional null space. The intersection of this five-dimensional hyperplane with the 17-dimensional orthant of vectors with non-negative components yields a polytope of admissible fluxes. Following Schuster et al. (2000), we depict in Fig. 7A the polytope's five vertices, or so-called 'elementary flux modes.' Conversely, in the presence of oxygen, reactions 5, 7, 19:22, 24 and 25 are repressed, in which case

col(1) = [1:4 6 8:18 23],

while all of the metabolites, save NADH, remain in play and so row(1) = [1:15]. As a result, S_1, the stoichiometric matrix associated with O2 = 1, is 14-by-17 and has a three-dimensional null space. Its intersection with the positive orthant results in a polytope whose four elementary flux modes are depicted in Fig. 7B. The summary of our calculations, see Table 1, depicts environmentally driven dimensional reduction but nonetheless still leaves one with a class of admissible metabolic patterns. At this stage the engineer, and perhaps the organism itself, singles out the pattern that optimizes a certain index of performance.
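The index bookkeeping in this paragraph can be checked mechanically. The sketch below expands the colon-style index lists, recovers which reactions are switched off in each environment, and shows how a null-space dimension follows from the rank of a stoichiometric matrix. The helper names `expand`, `rank` and `nullity` are ours, and the small matrix used to exercise `nullity` is a hypothetical example, since the full master matrix of Appendix C is not reproduced here.

```python
from fractions import Fraction


def expand(spec):
    """Expand an index list such as '1:5 7:13 21:25' into explicit integers."""
    out = []
    for token in spec.split():
        lo, _, hi = token.partition(":")
        out.extend(range(int(lo), int(hi or lo) + 1))
    return out


def rank(matrix):
    """Matrix rank by exact Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in matrix]
    rk = 0
    for c in range(len(rows[0]) if rows else 0):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][c] != 0:
                f = rows[i][c] / rows[rk][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def nullity(matrix):
    """Dimension of the null space: number of columns minus rank."""
    return len(matrix[0]) - rank(matrix)


col0 = expand("1:5 7:13 21:25")     # anaerobic columns
row0 = expand("1:7 11:13 15 16")    # anaerobic rows
col1 = expand("1:4 6 8:18 23")      # aerobic columns

all_reactions = set(range(1, 26))
off_anaerobic = sorted(all_reactions - set(col0))
off_aerobic = sorted(all_reactions - set(col1))

# Hypothetical 2-by-3 chain (not from the paper): one free flux remains.
toy_nullity = nullity([[1, -1, 0], [0, 1, -1]])
```

The lengths of `col0`, `row0` and `col1` reproduce the 12-by-17 anaerobic shape and the 17 aerobic columns stated in the text, and the two `off_` lists match the reactions named as inactivated or repressed.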
5. Comparison with experimental results
The network predictions of a selected set of genes were compared with several recently published experimental observations. There is good agreement between the model prediction and experimental observation (Table 2). For example, the expression of isocitrate dehydrogenase of the wild type was activated under aerobic conditions, resulting in a 6.4-fold increase in the icd gene expression level (Chao et al., 1997). This behavior is correctly predicted by the model, which leads to a simplified model with a prediction of zero flux under anaerobic conditions through this pathway and the TCA cycle (Alexeeva et al., 2003). Similarly, the expression pattern of fumarate reductase as predicted by the model is consistent with that of experimental observations (Kang et al., 2005). Also consistent with the reported experimental observations are the model predictions of zero flux through the fumarate–succinate pathway under aerobic conditions and non-zero flux under anaerobic conditions (Shalel Levanon et al., 2005).
Another feature of the model is its ability to predict the gene expression pattern of arcA and fnr mutant strains. For example, the model correctly predicted that the icd gene would be on in the arcA mutant under anaerobic conditions (the experimental data showed a 12-fold increase in the icd gene expression level). Furthermore, the model expects a non-zero flux through ICD/TCA pathways while experimental observation also showed non-zero flux, albeit at a very low level (Alexeeva et al., 2003). It should be noted that the present model focuses on environmental and genetic perturbations only and does not consider the effect of metabolite concentrations on enzyme activity. Thus, the differences between the predicted TCA cycle fluxes and the experimental fluxes may result from the effect of NADH on the TCA cycle activity. When the NADH/NAD+ ratio is high the concentration of oxaloacetate is low (since the malate dehydrogenase reaction is essentially at equilibrium in the cell), slowing the first step in the TCA cycle (Nelson and Cox, 2000).

Fig. 7. Elementary flux modes: (A) anaerobic case, and (B) aerobic case.

Table 1
Stoichiometric characteristics of the catch-all and two extreme metabolic maps

Stoichiometric matrix | Dimension | Independent solutions
S_i (master) | 16-by-25 | 10
S_0 (anaerobic) | 12-by-17 | 5
S_1 (aerobic) | 14-by-17 | 3

Table 2
Comparison of model prediction of selected genes (with wild type, arcA-, fnr- strains) under aerobic and anaerobic conditions with published experimental observations

Columns: Gene; then, for each of Effect of oxygen (wild type), Effect of ArcA (arcA mutant) and Effect of FNR (fnr mutant): Expression (Model, Experimental observations) and Flux (Model, Experimental observations).
Genes (rows): icd, sdh, frd.
Entries in extracted order (the rotated layout does not allow them to be assigned to individual cells):
ND
0.805^e (−O2)
0^e (+O2)
ND
Zero (−O2); Zero (+O2); Non-zero (−O2); Zero (−O2)
70-fold^g (−O2, −arcA/+arcA); On (−O2); 0^b,c (−O2); Off (−O2); sdh
11-fold^g (+O2/−O2)
On (−O2)
4-fold^d (−O2/+O2)
0^e (succinate flux, +O2); 1.57^e (−O2)
Off (+O2); On (−O2)
12-fold^a (−O2, −arcA/+arcA); ND; On (−O2); 0^b,c (−O2)
Zero (−O2); Zero (+O2); Non-zero (−O2); Zero (−O2); frd
Off (−O2); 0^b,c (−O2)
2.037^e (−O2)
0^e (+O2)
Off (+O2); Off (−O2)
5–6-fold^d (−O2, −fnr/+fnr)
2.5-fold^a (−O2, −fnr/+fnr); 3.7-fold^f (−O2, +fnr/−fnr); Off (−O2); 0^b,c (−O2)
Non-zero (−O2); Zero (+O2); Non-zero (−O2); Non-zero (−O2)
6.4-fold^a (+O2/−O2); Off (−O2); icd
ND: not determined. Fluxes are given in mmol/g dry weight/h. −O2: anaerobic; +O2: aerobic.
a Chao et al. (1997). b TCA cycle flux (mmol/g dry weight/h). c Alexeeva et al. (2003). d Kang et al. (2005). e Shalel Levanon et al. (2005). f Tseng et al. (1996). g Park et al. (1997).

6. Summary
A framework that utilizes existing knowledge of gene regulatory networks to assist metabolic flux analysis has been established. In this framework, environmental signals recruit transcription factors that in turn regulate the expression of the genes involved in the metabolic pathway, resulting in a functional metabolic map. We have successfully demonstrated the implementation of the proposed framework using the oxygen sensing/regulatory network. The use of the gene regulatory network provides additional constraints that reduce the dimension of the class of permissible metabolic states.

6.1. Limitations of current model
While the conceptual approach outlined in Fig. 2 is rich enough to accommodate most situations, our example suffers from two obvious limitations, arising from both expository and experimental exigencies. In the first place, it was convenient to treat oxygen as well as each of the regulatory and structural genes as Boolean variables. Furthermore, we assumed that when two genes regulate a third, their action is an 'or'. This limitation can be resolved when more information about the regulatory system becomes available. It remains to incorporate the growing body of work linking intermediate oxygen levels to intermediate levels of activated FNR and ArcA and their relative roles in the regulation of the main metabolic genes. Second, it remains to close the loop and quantify the regulatory impact that the resulting metabolites have on levels of activated FNR and ArcA as well as different enzymes.

Acknowledgment
This work was supported by the National Science Foundation (BES-0222691).
Appendix A. Enzymatic reactions for anaerobic growth
The following summarizes the central metabolic reactions (glycolysis & fermentation) in Escherichia coli grown anaerobically on glucose. Measured quantities (excreted products, carbon substrate, and biomass) are represented in boldface.

Reaction | Stoichiometry | Enzyme(s) | E.C. number
1 | Glucose+PEP ↔ Glucose 6-P+Pyruvate | Glucose:PTS enzymes; Enzyme I, HPr, IIGlc, IIIGlc |
2 | Glucose 6-P+ATP ↔ 2 Glyceraldehyde 3-P+ADP | |
2a | Glucose 6-P ↔ Fructose 6-P | Glucose-6-phosphate isomerase | 5.3.1.9
2b | Fructose 6-P+ATP ↔ Fructose 1,6-diP+ADP | Phosphofructokinase, PFK | 2.7.1.11
2c | Fructose 1,6-diP ↔ Dihydroxyacetone-P+Glyceraldehyde 3-P | Fructose-diP aldolase | 4.1.2.13
2d | Dihydroxyacetone-P ↔ Glyceraldehyde 3-P | Triose-P isomerase | 5.3.1.1
3 | Glyceraldehyde 3-P+NAD+ +Pi+ADP ↔ PEP+H2O+NADH+H+ +ATP | |
3a | Glyceraldehyde 3-P+NAD+ +Pi ↔ Glycerate 1,3-diP+NADH+H+ | 3-P Glyceraldehyde dehydrogenase | 1.2.1.12
3b | Glycerate 1,3-diP+ADP ↔ Glycerate 3-P+ATP | 3-P glycerate kinase | 3.6.1.7
3c | Glycerate 3-P ↔ Glycerate 2-P | P glycerate mutase | 5.4.2.1
3d | Glycerate 2-P ↔ PEP+H2O | Enolase | 4.2.1.11
4 | PEP+ADP ↔ Pyruvate+ATP | Pyruvate kinase | 2.7.1.40
5 | PEP+CO2 ↔ OAA+Pi | PEP carboxylase | 4.1.1.31
6 | Pyruvate+HSCoA+NAD+ ↔ Acetyl-CoA+NADH+CO2 | Pyruvate dehydrogenase | 1.2.4.1
7 | Pyruvate+HSCoA ↔ Formate+Acetyl-CoA | Pyruvate formate-lyase | 2.3.1.54
8 | Pyruvate+NADH ↔ Lactate+NAD+ | Lactate dehydrogenase | 1.1.1.28
9 | Acetyl-CoA+2 NADH+2 H+ ↔ Ethanol+HSCoA+2 NAD+ | |
9a | Acetyl-CoA+NADH+H+ ↔ Acetaldehyde+HSCoA+NAD+ | Aldehyde dehydrogenase | 1.2.1.10
9b | Acetaldehyde+NADH+H+ ↔ Ethanol+NAD+ | Alcohol dehydrogenase | 1.1.1.1
10 | Acetyl-CoA+Pi+ADP ↔ Acetate+HSCoA+ATP | |
10a | Acetyl-CoA+Pi ↔ Acetyl-P+HSCoA | Acetate phosphotransferase, PTA | 2.3.1.8
10b | Acetyl-P+ADP ↔ Acetate+ATP | Acetate kinase, ACK; or, chemical hydrolysis | 2.7.2.1
11 | Acetyl-CoA+H2O+oxaloacetate ↔ Citrate+CoA | Citrate synthase | 4.1.3.7
12 | Citrate ↔ isocitrate | Aconitate hydratase | 4.2.1.3
13 | Isocitrate+NADP+ ↔ oxalosuccinate+NADPH; Oxalosuccinate ↔ 2-ketoglutarate+CO2 | Isocitrate dehydrogenase | 1.2.4.2
14 | 2-ketoglutarate+NAD+ ↔ succinyl-CoA+NADH+CO2 | | 1.2.4.2
15 | ADP+phosphate+succinyl-CoA ↔ ATP+succinate+CoA | Succinate:CoA ligase | 6.2.1.5
16 | Succinate+NAD+ ↔ fumarate+NADH | Succinate dehydrogenase | 1.3.99.1
17, 21 | Fumarate+H2O ↔ malate | Fumarase C | 4.2.1.2
18, 25 | Malate+NAD+ ↔ oxaloacetate+NADH | Malate dehydrogenase | 1.1.1.37
19 | Isocitrate ↔ glyoxylate+succinate | Isocitrate lyase | 4.1.3.1
20 | Glyoxylate+acetyl-CoA ↔ Malate+HSCoA | Malate synthase | 4.1.3.2
22 | Fumarate+NADH+Pi+ADP ↔ Succinate+NAD+ +ATP | Fumarate reductase | 1.3.1.6
23 | OAA+NADH+NH4+ ↔ Aspartate+NAD+ | Aspartate aminotransferase | 2.6.1.1
24 | Aspartate ↔ Fumarate+NH3 | Aspartase | 4.3.1.1

Appendix B. Role of ArcA and FNR

Recommended name | EC number | Encoded by | Reactions | Effect
pyruvate dehydrogenase complex | 1.2.4.1 | aceEF | Acetyl-CoA+CO2+NADH ↔ CoA+pyruvate+NAD | ArcA(−)
citrate (si)-synthase | 4.1.3.7 | gltA | Acetyl-CoA+H2O+oxaloacetate ↔ citrate+CoA | FNR(−) ArcA(−)
aconitate hydratase | 4.2.1.3 | acnB | Citrate ↔ isocitrate | ArcA(−)
isocitrate dehydrogenase | 1.1.1.42 | icd | Isocitrate+NADP+ ↔ 2-oxoglutarate+CO2+NADPH | ArcA(−)
oxoglutarate dehydrogenase system | 1.2.4.2 | sucAB | 2-Oxoglutarate+lipoamide ↔ S-succinyldihydrolipoamide+CO2 | ArcA(−)
succinate-CoA ligase | 6.2.1.5 | sucCD | ATP+succinate+CoA ↔ ADP+phosphate+succinyl-CoA | ArcA(−)
succinate dehydrogenase | 1.3.99.1 | sdhCDAB | Succinate+acceptor ↔ fumarate+reduced acceptor | FNR(−) ArcA(−)
fumarate hydratase (fumarase) | 4.2.1.2 | fumC | Fumarate+H2O ↔ (S)-malate | FNR(−) ArcA(−)
Ref (sources cited in this part of the table): Guest et al., 1996; Lin and Iuchi, 1991; Lynch and Lin, 1996; Park and Gunsalus, 1995; Park et al., 1997; Unden et al., 1995.
Appendix B (continued)

Recommended name | EC number | Encoded by | Reactions | Effect
fumarate hydratase (fumarase) | 4.2.1.2 | fumA | Fumarate+H2O ↔ (S)-malate |
malate dehydrogenase | 1.1.1.37 | mdh | (S)-Malate+NAD+ ↔ oxaloacetate+NADH | FNR(−) ArcA(−)
isocitrate lyase | 4.1.3.1 | aceB | |
fumarate hydratase (fumarase) | 4.2.1.2 | fumB | (S)-Malate ↔ fumarate+H2O | FNR(+)
fumarate reductase | 1.3.1.6 | frdABCD | Fumarate+NADH ↔ succinate+NAD+ | FNR(+)
pyruvate formate-lyase | 2.3.1.54 | pfl | CoA+pyruvate ↔ acetyl-CoA+formate | FNR(+) ArcA(+)
aspartate ammonia-lyase | 4.3.1.1 | aspA | L-Aspartate ↔ fumarate+NH3 | FNR(+)
Cytochrome d oxidase | | cyd | |
Cytochrome o oxidase | | cyo | |
ArcA | | arcA | |
FNR | | fnr | |
Effect entries whose rows cannot be identified from the extracted layout (they belong to fumA, aceB, cyd, cyo, arcA and fnr): FNR(0)/(−) ArcA(−); ArcA(−); FNR(−); ArcA(−); FNR(−); FNR(+); ArcA(+); ArcA(+).
Ref (sources cited in this part of the table): Guest et al., 1996; Gunsalus, 1992; Lin and Iuchi, 1991; Lynch and Lin, 1996; Park and Gunsalus, 1995; Tseng et al., 1996; Unden et al., 1995.
Appendix C. The master stoichiometric matrix
The reaction and column numbers correspond to the boxes in Fig. 4. The rows correspond to balance of the intermediates
1. G6P
2. G3P
3. PEP
4. PYR
5. acetyl CoA
6. citrate
7. isocitrate
8. 2-ketoglutarate
9. succinyl-CoA
10. succinate
11. fumarate
12. malate
13. OAA
14. glyoxylate
15. aspartate
16. NADH

The system has the form

$$
S_i v = 0, \qquad v = (v_1, v_2, \ldots, v_{25})^T,
$$

where $S_i$ is the 16-by-25 master stoichiometric matrix and the right-hand side is the 16-by-1 zero vector. [Entries of the 16-by-25 matrix $S_i$: array not reproduced here.]
References
Alexeeva, S., Hellingwerf, K.J., Teixeira de Mattos, M.J., 2003. Requirement of ArcA for redox regulation in Escherichia coli under microaerobic but not anaerobic or aerobic conditions. J. Bacteriol. 185, 204–209.
Blank, L.M., Sauer, U., 2004. TCA cycle activity in Saccharomyces cerevisiae is a function of the environmentally determined specific growth and glucose uptake rates. Microbiology 150, 1085–1093.
Burgard, A.P., Maranas, C.D., 2001. Probing the performance limits of the Escherichia coli metabolic network subject to gene additions or deletions. Biotechnol. Bioeng. 74, 364–375.
Burgard, A.P., Pharkya, P., Maranas, C.D., 2003. Optknock: a bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng. 84, 647–657.
Chao, G., Shen, J., Tseng, C.P., Park, S.J., Gunsalus, R.P., 1997. Aerobic regulation of isocitrate dehydrogenase gene (icd) expression in Escherichia coli by the arcA and fnr gene products. J. Bacteriol. 179, 4299–4304.
De Jong, H., 2002. Modeling and simulation of genetic regulatory systems: a literature review. J. Comput. Biol. 9, 67–103.
Edwards, J.S., Covert, M., Palsson, B., 2002. Metabolic modelling of microbes: the flux-balance approach. Environ. Microbiol. 4, 133–140.
Georgellis, D., Kwon, O., De Wulf, P., Lin, E.C.C., 1998. Signal decay through a reverse phosphorelay in the Arc two-component signal transduction system. J. Biol. Chem. 273, 32864–32869.
Georgellis, D., Kwon, O., Lin, E.C.C., 1999. Amplification of signaling activity of the Arc two-component system of Escherichia coli by anaerobic metabolites: an in vitro study with different protein modules. J. Biol. Chem. 274, 35950–35954.
Georgellis, D., Kwon, O., Lin, E.C.C., 2001. Quinones as the redox signal for the Arc two-component system of bacteria. Science 292, 2314–2316.
Guest, J.R., Green, J., Irvine, A.S., Spiro, S., 1996. The FNR modulon and FNR-regulated gene expression. In: Lin, E.C.C., Lynch, A.S. (Eds.), Regulation of gene expression in Escherichia coli. Chapman & Hall, New York.
Gunsalus, R.P., 1992. Control of electron flow in Escherichia coli: coordinated transcription of respiratory pathway genes. J. Bacteriol. 174, 7069–7074.
Kang, Y., Weber, K.D., Qiu, Y., Kiley, P.J., Blattner, F.R., 2005. Genome-wide expression analysis indicates that FNR of Escherichia coli K-12 regulates a large number of genes of unknown function. J. Bacteriol. 187, 1135–1160.
Kauffman, S., 1974. The large scale structure and dynamics of gene control circuits. J. Theor. Biol. 44, 167–190.
Klapa, M.I., Park, S.M., Sinskey, A.J., Stephanopoulos, G., 1999. Metabolite and isotopomer balancing in the analysis of metabolic cycles: I. Theory. Biotechnol. Bioeng. 62, 375–391.
Klapa, M.I., Aon, J.C., Stephanopoulos, G., 2003. Ion-trap mass spectrometry used in combination with gas chromatography for high-resolution metabolic flux determination. Biotechniques 34, 832–836.
Lin, E.C.E., Iuchi, A.S., 1991. Regulation of gene expression in fermentative and respiratory systems in Escherichia coli and related bacteria. Ann. Rev. Genet. 25, 361–387.
Lynch, A.S., Lin, E.C.C., 1996. Regulation of aerobic and anaerobic metabolism by the Arc system. In: Lin, E.C.C., Lynch, A.S. (Eds.), Regulation of gene expression in Escherichia coli. Chapman & Hall, New York.
Nelson, D.L., Cox, M.M., 2000. Lehninger Principles of Biochemistry, third ed. Worth Publishers, New York.
Park, S.J., Gunsalus, R.P., 1995. Oxygen, iron, carbon, and superoxide control of the fumarase fumA and fumC genes of Escherichia coli: Role of the arcA, fnr, and soxR gene products. J. Bacteriol. 177, 6255–6262.
Park, S.J., Chao, G., Gunsalus, R.P., 1997.
Aerobic regulation of the sucABCD genes of Escherichia coli, which encode a-ketoglutarate dehydrogenase and succinyl coenzyme A synthetase: Roles of ArcA, Fnr, and upstream sdhCDAB promoter. J. Bacteriol. 179, 4138–4142. Salgado, H., Gama-Castro, S., Martinez-Antonio, A., Diaz-Peredo, E., Sanchez-Solano, F., Peralta-Gil, M., Garcia-Alonso, D.,
Jimenez-Jacinto, V., Santos-Zavaleta, A., Bonavides-Martinez, C., Collado-Vides, J., 2004. RegulonDB (version 4.0): transcriptional regulation, operon organization and growth conditions in Escherichia coli K-12. Nucleic Acids Res. 32 (Database issue D3037D306). San, K.-Y., Cox, S., Shalel Levanon, S., Bennett, G.N., 2003. Metabolic flux analysis based on dynamic genomic information. In: 225th American Chemical Society National Meeting, New Orleans, LA. Sawers, G., 1999. The aerobic/anaerobic interface. Curr. Opin. Microbiol. 2, 181–187. Schilling, C.H., Edwards, J.S., Letscher, D., Palsson, B.O., 2001. Combining pathway analysis with flux balance analysis for the comprehensive study of metabolic systems. Biotechnol. Bioeng. 71, 286–306. Schuster, S., Fell, D.A., Dandekar, T., 2000. A general definition of metabolic pathways useful for systematic organization and analysis of complex metabolic networks. Nature Biotech. 18, 326–332. Shalel Levanon, S., San, K.Y., Bennett, G.N., 2005. Effect of oxygen on the Escherichia coli ArcA and FNR regulation systems and metabolic responses. Biotechnol. Bioeng. 89, 556–564. Sriram, G., Shanks, J.V., 2004. Improvements in metabolic flux analysis using carbon bond labeling experiments: bondomer balancing and Boolean function mapping. Metab. Eng. 6, 116–132. Tseng, C.P., Albrecht, J., Gunsalus, R.P., 1996. Effect of microaerophilic cell growth conditions on expression of the aerobic (cyoABCDE and cydAB) and anaerobic (narGHJI, frdABCD, and dmsABC) respiratory pathway genes in Escherichia coli. J. Bacteriol. 178, 1094–1098. Unden, G., Becker, S., Bongaerts, J., Holighaus, G., Schirawski, J., Six, S., 1995. O2-Sensing and O2 dependent gene regulation in facultatively anaerobic bacteria. Arch. Microbiol. 164, 81–90. Wiback, S.J., Mahadevan, R., Palsson, B.O., 2004. Using metabolic flux data to further constrain the metabolic solution space and predict internal flux patterns: the Escherichia coli spectrum. Biotechnol. Bioeng. 
86, 317–331. Zhao, J., Baba, T., Mori, H., Shimizu, K., 2004. Effect of zwf gene knockout on the metabolism of Escherichia coli grown on glucose or acetate. Metab. Eng. 6, 164–174.