Computing complex metabolic intervention strategies using constrained minimal cut sets

Computing complex metabolic intervention strategies using constrained minimal cut sets

Metabolic Engineering 13 (2011) 204–213 Contents lists available at ScienceDirect Metabolic Engineering journal homepage: www.elsevier.com/locate/ym...

724KB Sizes 1 Downloads 64 Views

Metabolic Engineering 13 (2011) 204–213

Contents lists available at ScienceDirect

Metabolic Engineering journal homepage: www.elsevier.com/locate/ymben

Computing complex metabolic intervention strategies using constrained minimal cut sets a,b ¨ Oliver Hadicke , Steffen Klamt a,b,n a b

Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, D-39106 Magdeburg, Germany MaCS—Magdeburg Centre for Systems Biology, Sandtorstrasse 1, D-39106 Magdeburg, Germany

a r t i c l e i n f o

a b s t r a c t

Article history: Received 4 October 2010 Received in revised form 16 November 2010 Accepted 7 December 2010 Available online 13 December 2010

The model-driven search for gene deletion strategies that increase the production performance of microorganisms is an essential part of metabolic engineering. One theoretical approach is based on Minimal Cut Sets (MCSs) which are minimal sets of knockouts disabling the operation of a specified set of target elementary modes. A limitation of the current approach is that MCSs can induce side effects disabling also desired functionalities. We, therefore, generalize MCSs to Constrained MCSs (cMCSs) allowing for the additional definition of a set of desired modes of which a minimum number must be preserved. Exemplarily for ethanol production by Escherichia coli, we demonstrate that this approach offers enormous flexibility in defining and solving knockout problems. Moreover, many existing methods can be reformulated as special cMCS problems. The cMCSs approach allows systematic enumeration of all equivalent gene deletion combinations and also helps to determine robust knockout strategies for coupled product and biomass synthesis. & 2010 Elsevier Inc. All rights reserved.

Keywords: Metabolic engineering Elementary modes Minimal cut sets Targeted modification Strain optimization Ethanol production

1. Introduction The production of industrially relevant compounds from renewable resources using biological systems becomes more and more attractive, not only for economical but also for sustainability reasons. Metabolic engineering as an enabling technology for this process aims at developing new experimental and theoretical methodologies for the targeted improvement of metabolic pathways in suitable production hosts. During the last decade, a variety of new theoretical methods for strain optimization has been developed (Burgard et al., 2003; Pharkya et al., 2004; Patil et al., ¨ 2005; Trinh et al., 2006; Trinh et al., 2009; Hadicke and Klamt, 2010; Kim and Reed, 2010; Tepper and Shlomi, 2010). Many of them rely on constraint-based modeling approaches and thus on stoichiometric models of metabolic networks. Stoichiometric Optimization Techniques (SOTs) usually identify gene knockouts or gene overexpression candidates that potentially lead to a higher product yield. SOTs can make in silico predictions on the minimum product and/or biomass yield that follows from a given intervention strategy. Several successfully engineered mutant strains confirmed

Abbreviations: cMCS(s), constrained minimal cut set(s); EM(s), elementary mode(s); EMA, elementary-mode analysis; FBA, flux balance analysis; MCS(s), minimal cut set(s); MMF, minimal metabolic functionality; SOT, stoichiometric optimization technique n Corresponding author at: Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, D-39106 Magdeburg, Germany, Fax: +49 391 6110 509. E-mail address: [email protected] (S. Klamt). 1096-7176/$ - see front matter & 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.ymben.2010.12.004

these predictions (Edwards and Palsson, 2000; Fong et al., 2005; Ibarra et al., 2002; Jantama et al., 2008; Trinh et al., 2006; Trinh and Srienc, 2009; Unrean et al., 2010, Yazdani and Gonzalez, 2008). However, as SOTs do not involve kinetic data, they cannot predict the productivity of engineered strains with similar stringency. There are two key techniques for stoichiometric modeling and optimization of metabolic reaction networks: (1) Flux Balance Analysis (FBA) is a general framework for analyzing and simulating metabolic networks at steady-state (Oberhardt et al., 2009). FBA requires formulating a linear objective function to simulate the network. Examples are the optimization of biomass yield (a typical ‘‘natural’’ objective function) or product yield (a typical economic objective). FBA usually focuses on a particular solution for a given optimization problem but has the advantage that it can deal with large (genome-scale) networks. (2) Elementary-Modes Analysis (EMA) is a powerful tool for analyzing and optimizing metabolic networks by investigating minimal functional units (pathways) (Schuster et al., 2000; Trinh and Srienc, 2009). Enumerating the complete set of elementary modes yields the complete spectrum of possible metabolic behaviors. However, due to the combinatorial explosion of the number of EMs in large networks, its use is limited to medium-scale networks. Examples for FBA-based techniques for metabolic engineering include (Burgard et al., 2003; Pharkya et al., 2004; Patil et al., 2005; Pharkya and Maranas, 2006; Kim and Reed, 2010; Tepper and Shlomi, 2010), examples for EMA-based methods for network optimization

O. H¨ adicke, S. Klamt / Metabolic Engineering 13 (2011) 204–213

have been given in (Bordel and Nielsen, 2010; Klamt, 2006; Trinh et al., 2006; Trinh et al., 2008; Trinh and Srienc, 2009; Unrean et al., 2010; H¨adicke and Klamt, 2010). We mention two approaches explicitly as they have been applied with great success and served as a starting point for the development of related methods. The FBA-based approach OptKnock introduced by Burgard et al. (2003) was one of the first searching for genetic interventions leading to a coupling of natural and economical objectives. The advantage of this engineering strategy is that the power of adaptive evolution can be used to increase not only biomass but also product yield. Mathematically, OptKnock is a bi-level optimization problem that can be reformulated as a single level Mixed Integer Linear Program (MILP; Burgard et al. (2003)). Successful applications (e.g. (Fong et al., 2005)) and several refined variants of OptKnock (including OptStrain (Pharkya et al., 2004), OptReg (Pharkya and Maranas, 2006), RobustKnock (Tepper and Shlomi, 2010), and OptOrf (Kim and Reed, 2010)) underline that the basic idea of OptKnock was a milestone for stoichiometric network optimization. Despite these results, one essential assumption of these methods is growth-optimal behavior which is not always fulfilled (Schuster et al., 2008). The second approach, Minimal Metabolic Functionality (MMF), is based on EMA and was developed by Srienc and coworkers (Trinh et al., 2006; Unrean et al., 2010). MMF searches for combinations of gene deletions that will eliminate all undesired (non-optimal) elementary modes while keeping at least a few elementary modes with optimal or nearly optimal behavior intact. Analogous to OptKnock, obligatory coupling of product and biomass synthesis can be enforced by selected knockout sets. The MMF approach showed remarkable success in a number of case studies including the production of secondary metabolites with Escherichia coli (Trinh et al., 2006; Trinh et al., 2008; Trinh and Srienc, 2009; Unrean et al., 2010). An alternative EM-based approach is the concept of Minimal Cut Sets (MCSs) originally introduced in (Klamt and Gilles, 2004) and generalized in (Klamt, 2006). The idea of MCSs is to find combinations of reactions whose removal guarantees that a given functionality of the system (e.g. synthesis of a product) or certain behaviors (e.g. biomass synthesis with a yield lower than 90% of the maximum yield) are blocked. A limitation of the current approach is that MCSs might disable not only undesired but also desired network functions. For example, one MCS to block the synthesis of an undesired product is to remove the substrate uptake reaction. However, albeit it blocks product synthesis this intervention also impairs all other network functions including biomass synthesis. This MCS should, therefore, not be considered as an admissible cut set. In this work, we will present a generalized approach of MCSs, namely Constrained MCSs (cMCSs), which allows for the consideration of side constraints. Moreover, we will highlight the relationship of the extended approach to MMF and OptKnock-related techniques and show that the engineering goals of the latter can directly be formulated within this framework. An advantage of the novel cMCS approach compared to other approaches is that it enables full enumeration of all possible solutions (instead of computing only one particular solution). Furthermore, as one specific application of cMCSs, engineering objectives can be formulated in such a way that knockout strategies will be identified whose product yields are less sensitive with respect to growth– yield optimality. Throughout the paper, we will use the case study of anaerobic ethanol production by E. coli for illustration and for comparing the new cMCSs approach with other methods. 2. Methods

205

steady-state, i.e., the concentrations of the (internal) metabolites do not change over time yielding the well-known steady-state mass balancing equation dcðtÞ ¼ Nr ¼ 0: dt

ð1Þ

N is the m  q stoichiometric matrix and r the q-dimensional (flux) vector of the net reaction rates. Under typical biological conditions, some reactions will practically be irreversible. We denote the index set of irreversible reactions with Irr and it holds that ri Z 0 for all i A Irr:

ð2Þ

The system of Eqs. (1) and inequalities (2) characterizes the set of feasible steady-state flux distributions and the solution set forms a convex polyhedral cone; in this context called flux cone. An Elementary (flux) Mode (EM) (Schuster and Hilgetag, 1994; Schuster et al., 2000) is a non-trivial flux vector e which satisfies (1), (2) and a non-decomposability condition. The latter demands that there is no flux vector r a0 which obeys (1) and (2) and whose support is a proper subset of e, i.e., where supp(r)Csupp(e). The support of a vector r, supp(r), comprises the indices of non-zero elements in r: supp(r)¼{i: ri a0}. EMs have a number of very useful properties. For example, the set of EMs generates the flux cone: any flux vector r lying in the flux cone can be obtained by a nonnegative linear combination of the EMs. Another important property is that the set of EMs of any sub-network (comprising only a subset S of the reactions of the full network) can easily be obtained from the EMs of the full network. All EMs e whose support is completely contained in S, supp(e)CS, build up the new set of EMs in the sub-system and a recalculation is not necessary. Any EM that contains one of the removed reactions will disappear since each EM is non-decomposable and can thus not attain a steady-state flux by a subset of its reactions. By this property, the effect of reaction removals can easily be assessed. In the following, we will make use of the support representation E for an EM e, E¼supp(e), which uniquely characterizes an EM. The full set of EMs (in support notation) is denoted by E. If we want to repress certain flux distributions or disable network functions by removing an appropriate set of reactions (corresponding to a certain set of gene deletions), it is convenient to describe these undesired behaviors by a corresponding set T of EMs, T  E. We will name T the set of target modes (thus, in our context, the attribute target indicates something ‘‘unwanted’’ that will be ‘‘attacked’’). A cut set C is a set of reactions ‘‘hitting’’ all target modes, i.e., 8T A T : C \ T a |:

ð3Þ

It follows that a removal of the reactions contained in a cut set disables the operation of all target modes since, per definition, no subset of the reactions of an EM can attain a non-zero steady-state flux distribution. As for elementary modes, we demand a cut set to be minimal and call a cut set C a Minimal Cut Set (MCS) if no proper subset of C is a cut set (i.e., no subset of C hits all target modes). From this definition it follows that the MCSs can be computed as the minimal hitting sets of the target modes1 (the attribute ‘‘hitting’’ reflects property (3); Klamt (2006)). For computing the minimal hitting sets from the set of target modes the Berge algorithm (Berge, 1989) can be used which showed the best performance results (Haus et al., 2008). Originally, an MCS was defined as a set of reaction removals which would lead to a zero-flux in a given objective reaction

2.1. Definitions We consider a metabolic (stoichiometric) reaction network with q reactions and m (internal) metabolites and assume that it is in

1 Interestingly, the converse also holds: the target modes are the minimal hitting sets of the MCSs establishing a dual relationship between MCSs and EMs (Klamt, 2006).

206

O. H¨ adicke, S. Klamt / Metabolic Engineering 13 (2011) 204–213

calculating first the EMs and then the minimal hitting sets of the target modes appears to be more efficient in practice (Haus et al., 2008). Moreover, for a complex set of target modes it will be difficult or even impossible to formulate the corresponding FBA problem (without having these EMs at hand).

(Klamt and Gilles, 2004). However, the approach of target modes is more general and allows one to formulate more complicated intervention goals, e.g. ‘‘delete all EMs that have a biomass (or product) yield lower than 90% of the maximum yield’’ (Klamt, 2006). The special case of the original MCS definition is covered by selecting all those EMs as target modes where the objective reaction has a non-zero flux. Multiple objectives can be considered by defining T as the union of the associated sets of target modes. An example network together with its elementary modes is shown in Fig. 1. Assume we want to increase the yield of P and remove the ability of the network to produce the side products D and E. The corresponding set of target modes reads T ¼ fEM3,EM4,EM5g and the seven resulting MCSs are given in Table 1 (intervention problem I1). One can easily verify the required property of an MCS: each of the seven MCSs hits all target modes while no subset of any MCS could achieve this. In principle, many MCSs problems could also be defined and computed without the notion of target modes by formulating a suitable FBA problem. For example, MCSs blocking biomass synthesis can be determined by systematically simulating single, double, triple,y, knockouts and keeping those minimal knockout sets which imply a maximal growth yield of zero. This approach has been used, for instance, for computing synthetic lethals (Suthers et al., 2009). However, as long as the set of EMs is computable,

A(ext)

A(ext)

P(ext)

E(ext)

R2

R1

2.2. Constrained minimal cut sets The approach of MCSs offers a framework for computing intervention strategies, studying network robustness and for network diagnosis (Klamt, 2006; Behre et al., 2008). Nevertheless, regarding the identification of intervention strategies, the current MCS approach has some limitations: in many practical applications one is interested in MCSs that not only disable certain functionalities but also keep other network functions operable. For instance, in the example discussed above (Fig. 1/Table 1: intervention problem I1) the objective was to repress the production of side products D and E in order to increase the yield of P. The set of identified MCSs for this problem contains, among others, MCS1 (removing substrate uptake reaction) and MCS3 (removing product excretion). Clearly, these MCSs cannot be suitable knockout sets for the enhanced production of P as they destroy, as a side effect, all EMs enabling the synthesis of P.

P(ext) E(ext)

1

A(ext)

1 1

R3 C

EM1

D

1 1

B P

−1 C

1

A

P(ext) E(ext)

1

1

1

B

A

P(ext) E(ext) A(ext)

B P

A

C

P

1 B

R4

R8

R7 R5 A

E

D

EM2

A(ext)

P

EM3

D(ext)

D(ext)

C

E

P(ext) E(ext) A(ext)

D 1 D(ext)

E

P(ext) E(ext)

R9 1

2

R6 D R10

B 1 C

1

E A

EM4

2

1

1

B P

1

A

C

1

1

D(ext)

1

D

1

E

EM5

D(ext)

P 1

D

E

D(ext)

Fig. 1. Example network and its elementary modes. All reactions except R7 are irreversible.

Table 1 Intervention problems and resulting minimal cut sets for the network in Fig. 1. Intervention problem

Target modes T

Desired modes D1

I1) no synthesis of undesired products D and E

EM3, EM4, EM5

I2) no synthesis of undesired products D and E and production of P with maximal yield possible

EM3, EM4, EM5

EM1, EM2

1

MCS1¼ {R6}, MCS2 ¼ {R9,R10}, MCS3¼ {R3,R10}, MCS4 ¼{R5,R7,R10}

I3) no synthesis of undesired products D and E; preserve all modes with maximal yield of P

EM3, EM4, EM5

EM1, EM2

2

MCS1¼ {R6}, MCS2 ¼ {R9,R10}, MCS3¼ {R3,R10}

I4) no synthesis of undesired product E; synthesis of D and P with maximal yield possible

EM4, EM5

EM1, EM2

1

n1

Desired modes D2

n2

MCSs

MCS1¼ {R1}, MCS2 ¼ {R6}, MCS3 ¼{R2, R10}, MCS4¼ {R9,R10}, MCS5 ¼ {R3,R10}, MCS6¼ {R4,R5,R10}, MCS7 ¼{R5,R7,R10}

EM3

1

MCS1¼ {R3}, MCS2 ¼ {R9}, MCS3 ¼{R5,R7}

O. H¨ adicke, S. Klamt / Metabolic Engineering 13 (2011) 204–213

To account for the necessity to keep certain EMs intact, we introduce the concept of constrained MCSs. An MCS is now considered admissible, if it hits all target modes as required by (3) but additionally preserves a minimum number n of desired EMs. Formally, we define a set D of desired modes, D  E, of which we would like to keep as many as possible. We cannot expect that all EMs in D will be spared (not be hit) by an MCS. The minimum number n of desired EMs in D to be saved is, therefore, usually smaller than the number of modes contained in D, i.e., n r 9D9. For a given MCS C, we collect in DC all EMs D A D that are not hit by C DC ¼ fD A D9C \ D ¼ |g:

ð4Þ

A Constrained MCS (cMCS) satisfies (3) plus the constraint C

9D 9 Z n:

ð5Þ

Based on this generalization, an intervention problem I ¼ ðT,D,nÞ is defined by a set T of target modes and a set D of desired modes of which at least n must not be hit by a cMCS. We can easily embed the original definition of MCSs by setting D ¼ | and n¼0. For certain applications it is useful to also allow for several sets D1 ,    , Dg of desired modes with associated thresholds n1, y, ng and to demand that some modes must be preserved from each set Di (realizing a logical AND of these requirements). Accordingly, an intervention problem then reads I ¼ ðT,D1 , n1 , D2 ,n2 ,    ,Dg , ng ) and an admissible cMCS C must satisfy (5) for each pair ðDi , ni Þ C ð6Þ Di Z ni , i ¼ 1,    , g: Well-posed intervention problems will fulfill Di \ T ¼ |. It is allowed that some EMs of the complete set E are not contained in any of the sets T or Di , thus, the union of T and all Di does not necessarily cover E. We do not care about those ‘‘neutral’’ EMs—they may survive or not when the reactions of a cMCS are removed. With these extensions, we come back to our example network in Fig. 1. We now extend the intervention problem I1 (Table 1) with the demand that optimal production of P should still be possible while synthesis of E and D should be abolished. For this intervention problem I2, we have the same target modes as for problem I1 but an additional requirement is that at least one (n ¼1) of the two desired modes producing P with maximal yield, D ¼ fEM1,EM2g, has to be saved. As expected, the resulting admissible four cMCSs are a subset of the MCSs from the first scenario, since the same set of target modes was used (Table 1). In a third intervention scenario I3, we further constrain I2 by searching only for those cMCSs that attack all modes with side production of D and E and that keep all desired modes with maximal yield for P. We, therefore, modify I2 by setting n ¼2. We find that three out of the four MCSs from problem I2 are admissible and that this intervention goal can be achieved by deleting only one reaction (R6). Finally, the fourth scenario I4 illustrates the usage of more than one set of desired modes (Table 1). The intervention goal demands that production of E is to be repressed while synthesis of P and D with maximal yield should be possible. The set T of target modes now comprises EM4 and EM5 and we have one set of desired modes producing P, D1 ¼ fEM1,EM2g, and a second one for producing D, D2 ¼ fEM3g. Only if we keep at least one mode from each of these two sets functional, the intervention goal can be fulfilled. It is easy to verify the resulting three cMCSs; they are not contained in the solution of I1 because the set of target modes was different (Table 1). There are two possibilities to account for the constraints (6) when computing the cMCSs with the Berge algorithm. A naive approach would be to compute all MCSs as usual and to discard in a post-processing step all MCSs that violate (6) for any pair ðDi ,ni Þ. However, in typical applications the algorithm performs much better if we check after each iteration in the Berge algorithm

207

whether (6) is fulfilled for all pairs ðDi ,ni Þ. This especially holds true if the thresholds ni are high and/or the sets Di are small. With that, many preliminary cMCSs can already be discarded during early iterations avoiding that these non-admissible sets are combined to higher order MCSs. In the examples discussed below, this strategy enabled us to compute cMCSs with nine reaction knockouts within a few seconds. A pseudo-code of the adapted Berge algorithm for computing constrained MCSs is given in the Appendix A. We adapted the algorithm for computing MCS in our interactive MATLAB package CellNetAnalyzer, a comprehensive tool for the analysis of cellular (including metabolic) networks (Klamt et al., 2007). Using several selection criteria (e.g. involvement of certain reactions or metabolites, minimum/maximum yields, etc.) the user can conveniently specify a set of target modes T and up to two sets of desired modes, D1 and D2 , together with thresholds n1 and n2. In the Application Programming Interface (API) version of this function, arbitrary many pairs ðDi ,ni Þ can be defined. The number of maximal knockouts allowed can also be specified by the user. Computed cMCSs can be displayed in the graphical user interface and further analyzed, e.g. to find essential reaction removals contained in all cMCSs.

3. Results Our approach of constrained MCSs allows one to formulate a multitude of possible intervention problems. Some examples have already been illustrated in Table 1. The methodology is also general enough to reformulate existing engineering approaches including MMF, OptKnock, and RobustKnock. Taking the example of optimizing anaerobic production of ethanol by E. coli, we will highlight these relationships and demonstrate the potential advantages of a reformulation into a cMCSs problem. For the case study, we used a network model described in (Trinh et al., 2008) (see Appendix B) and restrict ourselves to anaerobic conditions with glucose as substrate. 3.1. Relationship of cMCSs to minimal metabolic functionality A straightforward way to apply the approach of cMCSs is to pick a set of desired modes and a set of target modes from the full set of EMs and to subsequently compute the cMCSs. This procedure is closely related to the method of Minimal Metabolic Functionality (MMF) by Srienc and coworkers (Trinh et al., 2006; Trinh and Srienc, 2009; Trinh et al., 2009). They propose a manual procedure where knockout candidates are identified iteratively. With this, as many elementary modes as possible are eliminated without affecting the maximal achievable product yield. If coupling of product and biomass synthesis is desired, this selection procedure is restricted to the set of EMs that still enable a reasonable biomass yield. The procedure stops when only a few elementary modes with desired characteristics remain. To formulate this engineering goal as a cMCSs problem, one would collect the acceptable EMs with desired behavior in set D and, separately, all remaining EMs in the set of target modes T. By demanding that all desired EMs are preserved ðn ¼ 9D9Þ, the cMCSs will exclusively retain all EMs of D. Every computed cMCS represents an equivalent intervention strategy for the corresponding MMF problem. With this reformulation, we can now compare the results of the cMCSs approach with the results of the original MMF procedure. For enhanced anaerobic ethanol production with E. coli, Trinh et al. (2008) identified a mutant with seven knockouts keeping only 12 EMs with desired behaviors. Using the same model (Appendix B), we collected these 12 EMs in D and set n ¼12 to ensure that each of them is preserved. The set T of target modes comprises the (4998) EMs that are functional at anaerobic conditions and are not

208

O. H¨ adicke, S. Klamt / Metabolic Engineering 13 (2011) 204–213

contained in D. The solution of this cMCSs problem resulted in 1048 different knockout strategies. A total of 60 of them require only six reaction removals. Furthermore, there are 252 cut sets of size seven, 464 of size eight and 272 solutions with nine reaction deletions. The strategy applied by Trinh et al. is contained in one of the 252 cut sets of size seven. The systematic enumeration of all equivalent knockout strategies revealed that the engineering goal can even be achieved with only six knockouts. The cMCSs approach thus complements the MMF method by providing an automated procedure systematically enumerating all knockout strategies that satisfy the given intervention goal. In this way, the globally optimal (e.g. smallest) knockout strategy can be found. As a general remark, we mention here that even for seemingly simple engineering problems ð0 o ni {9Di 9 and Di \ T ¼ |Þ no valid cMCS solution might exist at all. For example, consider the task of deleting all EMs except one growth-yield optimal EM and one ethanol-yield optimal EM in the E. coli network. For this engineering problem, no cMCS solution exists and an empty set of cMCSs would be returned by the adapted Berge algorithm. We conjecture that the existence of cMCSs is related to the convexity properties of the sets of desired and target modes but investigating this in detail goes beyond the scope of this paper. 3.2. Relationship of cMCSs to OptKnock and RobustKnock The engineering objective of OptKnock (Burgard et al., 2003) and RobustKnock (Tepper and Shlomi, 2010) is to identify a set of gene deletions that couple a cellular objective (e.g. maximization of biomass formation) to the synthesis of a desired chemical with high product yield. Both are bi-level optimization problems that assume optimal cell growth (maximal biomass yield) as an inner objective. While OptKnock searches for a set of reaction knockouts maximizing the upper boundary of the product yield YP/S, RobustKnock identifies knockout targets that maximize the lower boundary of YP/S. One constraint applied by both methods is that the achievable biomass yield YX/S should not fall below a predefined minimum tX/S. Both methods can be transformed into corresponding cMCSs problems. We consider initially the case of an unlimited number of knockouts; the case of a limited number of interventions is discussed afterwards. The set of desired modes D is identical for both methods and is constructed as follows: from the set of all EMs with a biomass yield above the required threshold tX/S, we select for D the EM(s) achieving the maximal product yield. This set may contain several EMs (each showing the maximal achievable product yield feasible for the requested minimum biomass yield) and we demand that at least one of them has to be preserved (n¼1). To mimick OptKnock, the set of target modes TOptKnock consists of all EMs with a biomass yield higher than that of the EMs from D. For RobustKnock, the set of target modes comprises TOptKnock extended by EMs that have an identical growth yield as the modes in D, but lower product yields. Table 2

summarizes the reformulations of the different methods into cMCSs problems. In contrast to the original formulation of the OptKnock/RobustKnock optimization problems, the maximal number of knockouts was not specified through this definition of the cMCSs problem. As long as the number of allowed knockouts is higher than the cardinality of the smallest cMCSs achieving the theoretically maximal product yield, the solutions can be identified by the corresponding cMCSs formulations given above. The size of the smallest cMCSs cardinality depends on the chosen threshold tX/S for the minimal growth yield that should be retained for the mutant. If the number of allowed knockouts is smaller than necessary to achieve the theoretical maximal coupling value, one could adapt the cMCSs approach into an iterative procedure as follows. The definitions of D and T remain the same but the maximal cMCSs size is limited to the number of allowed knockouts. If no cMCS exists for the optimal EM(s) in D, this EM is skipped from D and exchanged by the following suboptimal EM. Subsequently, the cMCSs are recomputed. This procedure is stopped if the first cMCS with desired maximal cardinality has been found. For the case study of anaerobic ethanol production, we set as threshold for the minimal growth yield (unit: Gram Cell Dry Weight (gCDW) per mmol glucose) to be retained tX/S ¼0.02. Initially, we allow arbitrary many knockouts. The maximal achievable product yield (unit: mmol ethanol per mmol glucose) for all EMs with YX/S Z0.02 is YP/S ¼1.639. There are 12 EMs exhibiting this optimal ethanol yield and each of them has a growth yield of YX/S ¼0.0237. These modes are collected as desired modes in D of which at least one must be preserved (n¼ 1). For OptKnock, the set T of target modes comprises the 1638 EMs that have a biomass yield higher than the product-yield optimal EMs selected in D (YX/S 40.0237). The RobustKnock formulation extends this set T with EMs that have an identical growth yield of YX/S ¼ 0.0237 (as the desired EMs) but a product yield lower than the optimal one exhibited by the EMs in D (this results in 890 additional target EMs). OptKnock reaches its optimum with three knockouts with a possible range of product yields at growth-yield optimal conditions between 0 and 1.639 (Table 3). For the OptKnock optimum, 18 equivalent cMCSs can be computed (there of six with three, nine with four and three with six interventions). The optimal mutants identified by RobustKnock necessitate at least six knockouts, but these solutions ensure the unique product yield of YP/S ¼1.639 at growth optimal conditions. In total, 536 cMCSs (there of 30 with six, 130 with seven, 336 with eight and 40 with nine interventions) exist yielding the same optimal value for the RobustKnock problem (Table 3). Table 3 also displays the results for the cMCSs formulation of OptKnock and RobustKnock, when restricting the number of knockouts below three and six, respectively. As for the MMF approach, the full set of cMCSs indicates that multiple equivalent solutions for both OptKnock and RobustKnock can exist, even when

Table 2 Reformulation of different engineering methods into cMCSs problems. MMF Desired modes D

Target modes T

OptKnock

Select a specific set of EMs with desired phenotypes

D

All EMs not contained in D

T ¼ fEMs E with X=S

YE n (minimal number of EMs to be retained from D)

RobustKnock

X=S ¼ fEMs E with YE Z tX=S P=S P=S and YE Z YZ for all EMs X=S Z with YZ Z tX=S g

9D9

1

X=S

D ¼ fEMs E with YE

Z tX=S

P=S P=S and YE Z YZ for all X=S Z with YZ Z tX=S g

T ¼ fEMs E with X=S

X=S

4 max ðYD Þg

YE

D A D

1

X=S

Z max ðYD Þg\ D D A D

EMs

O. H¨ adicke, S. Klamt / Metabolic Engineering 13 (2011) 204–213

limiting the number of interventions. For instance, when allowing four knockouts, six different knockout strategies solving the RobustKnock optimization problem (with the same optimal product yield) can be found. Again, the full enumeration of equivalent knockout strategies using the cMCSs approach helps to identify the best choice of all solutions and can also reveal essential knockout candidates occurring in each cMCSs.

Table 3 Product yield ranges (for yield-optimal growth) and number of equivalent gene deletion combinations (cMCSs) for knockout strategies suggested by RobustKnock and OptKnock for anaerobic overproduction of ethanol from glucose in E. coli. RobustKnock and OptKnock deliver one of the available cMCSs solutions. The product yield is given in mmol ethanol/mmol glucose. Number of knockouts

OptKnock formulation

Ethanol yield

Number of equivalent cMCSs 0 (wild-type) 1 2 3 4 5 6 7 8 9

RobustKnock formulation Number of equivalent cMCSs

0.613 1 1 6 9 3

Ethanol yield 0.613

0–1.398 0–1.548 0–1.639 0–1.639 0–1.639

1 3 2 6 0 30 130 336 40

1.099 1.343 1.490 1.548 1.639 1.639 1.639 1.639

209

3.3. Computing robust engineering strategies for coupled product and biomass synthesis As already pointed out by Tepper and Shlomi (2010), one potential problem of the OptKnock/RobustKnock approach (or the equivalent cMCSs formulation discussed above) is the high sensitivity of the solution with respect to the biomass yield. This sensitivity is reflected by EMs differing slightly in their growth yields but significantly in their product yields. Both methods assume optimal growth and, therefore favor the growth-yield optimal EMs that still enable high product yields. However, a network may contain EMs that are only slightly inferior or (in the case of OptKnock) even equivalent with respect to growth yield, but enable poor product yields. This aspect is illustrated in Fig. 2. Fig. 2a depicts the phenotypic capabilities of the wild-type E. coli network w.r.t. biomass and product yields for anaerobic growth (a similar yield trend analysis was used by Carlson et al. (2002)). Each point represents an EM and the convex hull of the EMs gives the admissible flux space. Growth yields range from 0 to 0.0513 and ethanol yields from 0 to 2. From Fig. 2a, we can see that optimal growth of the wild-type under anaerobic conditions is already coupled to ethanol production (cf. also with Table 3). We may now apply OptKnock and RobustKnock, respectively, with the aim to shift this coupling towards higher product yields. The high sensitivity of computed knockout strategies is illustrated by the remaining phenotypic space displayed for exemplary solutions found by OptKnock (Fig. 2b) and RobustKnock (Fig. 2c) when allowing two reaction deletions. The OptKnock mutant has a potential product yield of 1.548 at the optimal growth yield of 0.0295. However, even at growth-yield optimal conditions there

OptKnock double mutant

Ethanol yield

2

Ethanol yield

Wild type EMs and phenotypic space

1.5 1

2

0.5

1.5

0 0

0.01

1

0.02 0.03 Growth yield

0.04

0.05

RobustKnock double mutant

0.5 2 0

0.01

0.02

0.03

Growth yield

0.04

0.05

Ethanol yield

0 1.5 1 0.5 0 0

0.01

0.02

0.03

0.04

0.05

Growth yield Fig. 2. Phenotypic space of EMs for the wild-type (a) and for exemplary mutants found by OptKnock (b) and RobustKnock (c). (a) The admissible phenotypic space for growth and product yield of the E. coli metabolic network under anaerobic conditions (growth yield given in gCDW/mmol glucose; ethanol yield in mmol ethanol/mmol glucose). The dots represent EMs and the solid line the convex hull of all EMs. (b) Remaining EMs and phenotypic space for the double mutant {FEM5, FC1R} found by OptKnock (for reaction identifiers see Appendix B). Even for growth with optimal biomass yield no obligatory coupling occurs. (c) Remaining EMs and phenotypic space for an exemplary double mutant ({FEM7, GG2r}) identified by RobustKnock. Unique obligatory coupling occurs at growth-yield optimal conditions but product yields are strongly decreasing with suboptimal growth yields.

210

O. H¨ adicke, S. Klamt / Metabolic Engineering 13 (2011) 204–213

are EMs with a product yield of zero (which is less than in the wildtype). The exemplary double mutant identified by RobustKnock (Fig. 2c) has an optimal biomass yield of 0.0285 with a unique product yield of 1.343. However, if the growth yield drops slightly to 0.0257 (  90%), the achievable product yield may fall down to 0.91 (68%). With growth yields below 0.0166 the obligatory coupling can even completely vanish. Thus, close to the desired EM (with high biomass and product yield) there are other EMs exhibiting only a slightly lower biomass yield but very unfavorable (possibly even zero) product yields. The assumption, that the organism evolves exactly to the only optimal EM and does not utilize EMs having slightly smaller biomass but significantly inferior product yields, might be too strong. Furthermore, a slightly varying biomass composition or maintenance demand could easily change the order of growthoptimal EMs and low product-yield EMs may then be selected by adaptive evolution. Therefore, it is desirable that product yields are above a given threshold even in a certain range of suboptimal growth rates. We, therefore, reformulate the engineering problem for coupled biomass and product synthesis and select target and desired modes in such a way that the cMCSs solutions will be less sensitive w.r.t. the assumption of growth optimality P=S

X=S

P=S

X=S

D ¼ fall EMs E with YE Z tP=S and YE T ¼ fall EMs E with YE o tP=S and YE

1 Z tX=S g,

n¼1

2 Z tX=S g,

2 1 tX=S o tX=S :

ð7Þ

In contrast to the cMCSs problem given in the previous section, we now consider not only the requested minimal product (tP=S ) and 1 minimal biomass yield (tX=S ) for the desired modes. We introduce 2 the additional biomass yield threshold tX=S which will be required for the selection of target modes. We now consider EMs with low 2 product yield and a biomass yield larger than tX=S as target modes (not only larger than the biomass yield of the product-yield optimal 2 1 modes). With tX=S otX=S it is ensured that, for each computed cMCSs, retained EMs with low product yield will have a significantly lower biomass yield than the remaining desired modes. The larger the 1 2 2 difference between tX=S and tX=S (i.e., the smaller tX=S ), the larger is the distance between remaining and desired EMs. Thus, the assumption that the cell grows with optimal biomass yield becomes less important and the identified optimal knockout solutions more 2 robust. However, this higher robustness has its price: smaller tX=S imply larger sets of target modes thus requiring more knockouts (cMCSs with high cardinality) to hit all target modes. In the extreme 2 case tX=S ¼ 0, all EMs with a product yield lower than the threshold tP=S would be removed (irrespective of their biomass yield). In Fig. 3, using the phenotypic space, we illustrate this intervention strategy, again for the case of anaerobic ethanol production 1 2 by E. coli. In Fig. 3a, we used tX=S ¼0.02, tP=S ¼1.4 and tX=S ¼0.01. In 2 Fig. 3b, we used the same values except tX=S ¼ 0. The assigned sets of desired and target modes for the two engineering problems can be seen on the left-hand side, the remaining EMs (for exemplary 2 cMCSs) on the right-hand side. As expected, if we choose tX=S ¼ 0, we

Fig. 3. Robust solutions for coupled biomass and product synthesis. (a) Left: phenotypic space of the wild-type (same as in Fig. 2a) with target EMs (red crosses) and desired 1 2 EMs (blue filled circles) resulting from the chosen thresholds tX=S ¼ 0:02, tP=S ¼ 1:4 and tX=S ¼ 0:01. EMs assigned neither to target nor to desired modes are indicated by black open circles. Right: remaining EMs and phenotypic space for an exemplary cMCSs with four reaction deletions ({EDP1, PPP8r, FC1r, TRA5}) solving the intervention problem formulated on the left-hand side. (b) Left: phenotypic space of the wild-type (same as in Fig. 2a) with target EMs (red crosses) and desired EMs (blue filled circles) resulting 1 2 from the chosen thresholds tX=S ¼ 0:02, tP=S ¼ 1:4 and tX=S ¼ 0. EMs assigned neither to target nor to desired modes are indicated by black open circles. Right: remaining EMs and phenotypic space for an exemplary cMCSs with five reaction deletions ({TRA5, GG11, FEM3, FEM7, OPM4r}) solving the intervention problem formulated on the left-hand side. Obligatory coupling occurs now for all growth conditions. (For interpretation of the reference to color in this figure legend the reader is refered to the web version of this article.)

O. H¨ adicke, S. Klamt / Metabolic Engineering 13 (2011) 204–213

get an obligatory coupling between product and biomass synthesis irrespective of any growth optimality assumption and the product yield of all remaining EMs is at least as high as tP=S . However, this ideal behavior requires an increasing number of knockout investments. The minimum number of required interventions (smallest cMCSs) for the first scenario is four, whereas at least five knockouts are required for the fully coupled behavior. In total, we identified 2 2 1978 cMCSs for tX=S ¼0.01 and 1988 cMCSs for tX=S ¼0. In Fig.3, black open circles correspond to ‘‘neutral’’ EMs which are neither part of the desired nor part of the target modes (see also Section 2.2). In our example, according to (7), this set consists of 2 EMs whose biomass yield is either below tX=S or whose biomass 2 1 yield is between tX=S and tX=S with a simultaneous product yield above the requested minimum tP=S . There is no need to remove or to preserve these neutral EMs as long as at least one of the desired modes is retained. In fact, for the example mutants shown in Fig. 3, some of these EMs ‘‘survived’’.

4. Discussion and conclusions In this work, we presented a generalization of the minimal cut set approach allowing one to specify not only functionalities to be disabled but also those that have to be preserved. This extension to constrained MCSs facilitates the formulation of complex metabolic engineering goals. The combination of desired and undesired functionalities can conveniently be specified by appropriate sets of target EMs and desired EMs. The great flexibility of the new approach is reflected by the fact that popular existing methods such as MMF, OptKnock or RobustKnock can be reformulated as special cases of cMCSs problems. Moreover, with the lower bounds for the number of desired EMs that must be preserved by the cMCSs one can control the minimal flexibility of the system that has to be retained. The fundamental idea to couple product synthesis with cellular growth has proven to be a valuable approach for the optimization of microbial strains, in particular in combination with adaptive evolution. OptKnock, RobustKnock and other FBA-based approaches provide suitable mathematical formulations for this class of metabolic engineering problems. Using the cMCSs approach, we presented a modified procedure to compute more robust knockout strategies for coupled product and biomass synthesis which are less sensitive against the assumption of optimal growth. This approach could increase the success rate of such engineering strategies. In contrast to cMCSs, FBA-based methods have the advantage that they can be applied to genome-scale networks. As the computation of cMCSs relies on elementary modes this approach is restricted to medium-scale networks where an enumeration of EMs is feasible. However, for most of the applications this is not a major drawback: for many relevant products, the promising knockout candidates are usually located within the central metabolism. For example, applying RobustKnock and OptKnock to genome-scale networks, Tepper and Shlomi (2010) computed double and triple knockouts for growth-coupled synthesis of various chemicals of interest (including ethanol, succinate, hydrogen, formate and others). A total of 57 out of the 58 proposed targets lie in the central metabolism. It is, therefore, reasonable to confine oneself to smaller network models comprising a detailed representation of the central metabolism and a rough description of anabolic pathways. Clearly, if one is interested in products requiring anabolic reactions (e.g. amino acids) these pathways must be included with sufficient detail. Furthermore, several recent improvements of the algorithm for computing EMs (Terzer and Stelling, 2008) now enable their full enumeration even in larger networks (up to tens of millions of EMs). One possibility to circumvent the problem of enumerating the full set of EMs in

211

genome-scale networks is to apply sampling methods such as the EFMEvolver (Kaleta et al., 2009) and to use the resulting sample as an approximation of the whole set of EMs. An alternative method to compute instead a sample of MCSs in genome-scale networks was presented recently by Imielinski and Belta (2008) and an adaptation to constrained MCSs seems to be possible. As soon as the enumeration of EMs is computationally feasible, the cMCSs approach has the advantage that very specific engineering problems can be defined. Furthermore, a full enumeration of all equivalent knockout strategies can be computed. Some knockout combinations are experimentally difficult or even impossible to realize and the strategy with the fewest number of gene deletion is not always experimentally the easiest to create. Therefore, having the full set of alternative gene deletion combinations to accomplish the same engineering goal is of great value and the best combination - in terms of practical realization - can then be selected. With the full enumeration of cMCSs we may also identify important properties, such as knockouts being essential to achieve a given intervention goal. Higher order intervention sets with more than five knockouts can be computed with cMCSs, which is usually difficult with FBA-based methods (approximate variants such as OptGene (Patil et al., 2005) have then to be used). Computational times to calculate the cMCSs presented in this article (with up to nine knockouts) were in the order of seconds on a standard personal computer. We also tested different engineering problems with larger models (  120 reactions) and observed that the entire sets of cMCSs were computable in reasonable time (in the order of minutes). Regulatory or environmental constraints can be easily accounted for by selecting only those EMs that satisfy the imposed constraints. Furthermore, using simple replacement rules based on gene– enzyme-reaction mappings, EMs (of reactions) can be converted into the corresponding ‘‘EMs of genes’’. This allows the identification of gene (instead of reaction) knockout strategies. For example, if k genes encode for one enzyme complex catalyzing one reaction, k (gene) columns in the elementary-modes matrix replace the latter. Each gene then participates in all EMs where the original reaction participated. If a gene encodes a multifunctional enzyme catalyzing m reactions, the m reaction columns would be replaced by one (gene) column. This indicates that this gene participates in all EMs where any of the m reactions is involved. Afterwards, the minimal hitting set algorithm for cMCSs (Appendix A) is applied to this ‘‘gene EM’’ matrix. With the demonstrated advantages and applications, the cMCSs approach can become a valuable extension of the computational toolbox for metabolic engineering.

Acknowledgments We thank the reviewers for their constructive and valuable suggestions. We are grateful to Regina Samaga for helpful comments during the preparation of the manuscript. This work was supported by the German Federal Ministry of Education and Research (FORSYS center MaCS (Magdeburg Center for Systems Biology)) and the Ministry of Education and Research of SaxonyAnhalt (Research Center ‘‘Dynamic Systems’’).

Appendix Appendix A. Pseudo-code of the adapted Berge algorithm for computing constrained minimal cut sets. See Fig. A1.

212

O. H¨ adicke, S. Klamt / Metabolic Engineering 13 (2011) 204–213

Fig. A1

Appendix B. Metabolic model of the central metabolism of E. coli (taken from (Trinh et al., 2008)). See Table B1. Table B1 Reactions: GG1: GG2r: GG3: GG4: GG5r: GG6r: GG7r: GG8r: GG9r: GG10r: GG11: GG12: GG13: PPP1: PPP2: PPP3: PPP4r: PPP5r: PPP6r: PPP7r: PPP8r: EDP1: EDP2: TCA1: TCA2r: TCA3r: TCA4: TCA5: TCA6r: TCA7: TCA8r: TCA9r: TCA10:

PEP + GLCex ) G6P +PYR G6P3F6P F6P + ATP ) ADP +F16DP F16DP ) F6P F16DP3GA3P + DHAP GA3P3DHAP GA3P + NAD3NADH+ 3PGP ADP + 3PGP3ATP+ 3PG 3PG32PG 2PG3PEP PEP + ADP ) PYR+ ATP PYR + ATP ) PEP+ AMP PYR + NAD+ COASH ) NADH+CO2+ ACoA G6P + NADP ) GL6P+ NADPH GL6P ) 6PG NADP +6PG ) CO2+ NADPH+ R5P R5P3X5P R5P3Ribo5P Ribo5P + X5P3GA3P + S7P GA3P + S7P3F6P +E4P X5P +E4P3F6P + GA3P 6PG ) KDPG KDPG ) PYR +GA3P ACoA +OAA ) COASH+ CIT CIT3CACO CACO3ICIT NADP +ICIT ) CO2+ NADPH + AKG NAD + COASH+ AKG ) NADH+ CO2+ SCOA ADP + SCOA3ATP + COASH+SUCC SUCC + Q ) QH2+ FUM FUM3MAL NAD + MAL3NADH+OAA QH2 +FUM ) SUCC +Q

Table B1. (continued ) ANA1: ANA2: ANA3: GLB1: GLB2: FEM1: FEM2: FEM3: FEM4: FEM5: FEM6: FEM7: FEM8: FEM9: OPM3: OPM4r: FC1r: FC2: TRA1: TRA2: TRA3: TRA4: TRA5: TRA6: TRA7: BIO:

PEP +CO2 ) OAA NAD +MAL ) PYR+ NADH+ CO2 ATP +OAA ) PEP + ADP+ CO2 ICIT ) SUCC+ GLYOXY ACoA+ GLYOXY ) COASH +MAL PYR + COASH ) ACoA+ FOR PYR + Q ) CO2+ QH2+ ACE PYR + NADH ) NAD+ LAC FOR ) CO2+ H2ex NADH+ ACoA ) NAD + COASH+ACA NADH+ ACA ) NAD+ ETOH ACoA ) COASH +ACP ADP + ACP ) ATP +ACE PYR ) CO2+ ACA ATP ) ADP +ATPbase NADH+ Q3NAD +QH2 NAD +NADPH3NADH+ NADP ATP +AMP ) 2 ADP ETOH ) ETOHex ACE ) ACEex NH3ex ) NH3 LAC ) LACex SUCC ) SUCCex FOR ) FORex CO2 ) CO2ex 0.96 PEP+ 0.049 G6P + 3.92 PYR + 0.017 F6P + 40.68 ATP + 0.031 GA3P + 4.079 NAD +1.642 3PG + 1.207 ACoA+ 18.32 NADPH + 0.86 Ribo5P+ 0.512 E4P +2.355 OAA + 1.426 AKG +12.502 NH3 ) 40.68 ADP + 4.079 NADH+ 1.207 COASH+ 18.320 NADP+ BIOMASS Remark: compared to the original model given in (Trinh et al., 2008), we rescaled the unit for biomass to gCDW and for the metabolite stoichiometries to mmol. This does not affect the results.

Internal metabolites: 2PG 3PG 3PGP 6PG ACA

2-phosphoglycerate 3-phosphoglycerate 1,3-bisphosphoglycerate 6-phosphate-gluconate acetaldehyde

O. H¨ adicke, S. Klamt / Metabolic Engineering 13 (2011) 204–213

Table B1. (continued ) ACE ACoA ACP ADP AKG AMP ATP CACO CIT CO2 COASH DHAP E4P ETOH F16DP F6P FOR FUM G6P GA3P GL6P GLYOXY ICIT KDPG LAC MAL NAD NADH NADP NADPH NH3 OAA PEP PYR Q QH2 R5P Ribo5P S7P SCOA SUCC X5P

acetate acetyl-coenzyme A acetate-phosphate adenosine diphosphate alpha-ketoglutarate adenosine monophosphate adenosine triphosphate cis-aconitate citrate carbon dioxide coenzyme A dihydroxyacetone phosphate erythrose-4-phosphate ethanol fructose-1,6-biphosphate fructose-6-phosphate formate fumarate glucose-6-phosphate glyceraldehyde-3-phosphate 6-phosphogluconolactone glyoxylate isocitrate phospho-2-dehydro-3-deoxygluconate lactate malate oxidized form of nicotinamide adenine dinucleotide reduced form of nicotinamide adenine dinucleotide oxidized form of nicotinamide adenine dinucleotide phosphate reduced form of nicotinamide adenine dinucleotide phosphate ammonia oxaloacetate phosphoenolpyruvate pyruvate ubiquinone ubiquinole ribulose-5-phosphate ribose-5-phosphate sedoheptulose-7-phosphate succinyl-coenzyme A succinate xylulose-5-phosphate

External metabolites: ACEex extracellular acetate ATPbase maintenance energy BIOMASS biomass CO2ex extracellular carbon dioxide ETOHex extracellular ethanol FORex extracellular formate GLCex extracellular glucose H2ex extracellular hydrogen LACex extracellular lactate NH3ex extracellular ammonia SUCCex extracellular succinate

References Behre, J., Wilhelm, T., von Kamp, A., Ruppin, E., Schuster, S., 2008. Structural robustness of metabolic networks with respect to multiple knockouts. J. Theor. Biol. 252, 433–441. Berge, C., 1989. Hypergraphs: Combinatorics of Finite Sets. North-Holland Mathematical Library, Amsterdam. Bordel, S., Nielsen, J., 2010. Identification of flux control in metabolic networks using non-equilibrium thermodynamics. Metab. Eng. 12, 369–377. Burgard, A.P., Pharkya, P., Maranas, C.D., 2003. Optknock: A bilevel programming framework for identifying gene knockout strategies for microbial strain optimization. Biotechnol. Bioeng. 84, 647–657.

213

Carlson, R., Fell, D., Srienc, F., 2002. Metabolic pathway analysis of a recombinant yeast for rational strain development. Biotechnol. Bioeng. 79, 121–134. Edwards, J.S., Palsson, B.O., 2000. Metabolic flux balance analysis and the in silico analysis of Escherichia coli K-12 gene deletions. BMC Bioinformatics. 1, 1. Fong, S.S., Burgard, A.P., Herring, C.D., Knight, E.M., Blattner, F.R., Maranas, C.D., Palsson, B.O., 2005. In silico design and adaptive evolution of Escherichia coli for production of lactic acid. Biotechnol. Bioeng. 91, 643–648. ¨ Hadicke, O., Klamt, S., 2010. CASOP: A computational approach for strain optimization aiming at high productivity. J. Biotechnol. 147, 88–101. Haus, U.-U., Klamt, S., Stephen, T., 2008. Computing knock-out strategies in metabolic networks. J. Comput. Biol. 15, 259–268. Ibarra, R.U., Edwards, J.S., Palsson, B.O., 2002. Escherichia coli K-12 undergoes adaptive evolution to achieve in silico predicted optimal growth. Nature 420, 186–189. Imielinski, M., Belta, C., 2008. Exploiting the pathway structure of metabolism to reveal high-order epistasis. BMC Syst. Biol. 2, 40. Jantama, K., Haupt, M.J., Svoronos, S.A., Zhang, X., Moore, J.C., Shanmugam, K.T., Ingram, L.O., 2008. Combining metabolic engineering and metabolic evolution to develop nonrecombinant strains of Escherichia coli C that produce succinate and malate. Biotechnol. Bioeng. 99, 1140–1153. Kaleta, C., de Figueiredo, L. F., Behre, J., Schuster, S., 2009. EFMEvolver: Computing elementary flux modes in genome-scale metabolic networks. In: Grosse, I., Neumann, S., Posch, S., Schreiber, F., Stadler, P. (Eds.), Lecture Notes in ¨ Informatik, Bonn, Informatics—Proceedings, Vol. P-157, Gesellschaft fur pp. 179–189. Kim, J., Reed, J.L., 2010. OptORF: Optimal metabolic and regulatory perturbations for metabolic engineering of microbial strains. BMC Syst. Biol. 4, 53. Klamt, S., 2006. Generalized concept of minimal cut sets in biochemical networks. BioSystems. 83, 233–247. Klamt, S., Gilles, E.D., 2004. Minimal cut sets in biochemical reaction networks. Bioinformatics. 20, 226–234. Klamt, S., Saez-Rodriguez, J., Gilles, E.D., 2007. Structural and functional analysis of cellular networks with CellNetAnalyzer. BMC Syst. Biol. 1, 2. Oberhardt, M.A., Palsson, B.O., Papin, J.A., 2009. Applications of genome-scale metabolic reconstructions. Mol. Syst. Biol. 5, 320. ¨ Patil, K.R., Rocha, I., Forster, J., Nielsen, J., 2005. Evolutionary programming as a platform for in silico metabolic engineering. BMC Bioinformatics. 6, 308. Pharkya, P., Burgard, A.P., Maranas, C.D., 2004. OptStrain: A computational framework for redesign of microbial production systems. Genome Res. 14, 2367–2376. Pharkya, P., Maranas, C.D., 2006. An optimization framework for identifying reaction activation/inhibition or elimination candidates for overproduction in microbial systems. Metab. Eng. 8, 1–13. Schuster, S., Fell, D.A., Dandekar, T., 2000. A general definition of metabolic pathways useful for systematic organization and analysis of complex metabolic networks. Nat. Biotechnol. 18, 326–332. Schuster, S., Hilgetag, S., 1994. On elementary flux modes in biochemical reaction systems at steady state. J. Biol. Syst., 165–168. Schuster, S, Pfeiffer, T, Fell, DA., 2008. Is maximization of molar yield in metabolic networks favoured by evolution? J. Theor. Biol. 252, 497–504. Suthers, P.F., Zomorrodi, A., Maranas, C.D., 2009. Genome-scale gene/reaction essentiality and synthetic lethality analysis. Mol. Syst. Biol. 5, 301. Tepper, N., Shlomi, T., 2010. Predicting metabolic engineering knockout strategies for chemical production: Accounting for competing pathways. Bioinformatics. 26, 536–543. Terzer, M., Stelling, J., 2008. Large-scale computation of elementary flux modes with bit pattern trees. Bioinformatics. 24, 2229–2235. Trinh, C.T., Carlson, R., Wlaschin, A., Srienc, F., 2006. Design, construction and performance of the most efficient biomass producing E. coli. bacterium. Metab Eng. 8, 628–638. Trinh, C.T., Srienc, F., 2009. Metabolic engineering of Escherichia coli for efficient conversion of glycerol to ethanol. Appl. Environ. Microbiol. 75, 6696–6705. Trinh, C.T., Unrean, P., Srienc, F., 2008. Minimal Escherichia coli cell for the most efficient production of ethanol from hexoses and pentoses. Appl. Environ. Microbiol. 74, 3634–3643. Trinh, C.T., Wlaschin, A., Srienc, F., 2009. Elementary mode analysis: A useful metabolic pathway analysis tool for characterizing cellular metabolism. Appl. Microbiol. Biotechnol. 81, 813–826. Unrean, P., Trinh, C.T., Srienc, F., 2010. Rational design and construction of an efficient E. coli. for production of diapolycopendioic acid. Metab. Eng. 12, 112–122. Yazdani, S.S., Gonzalez, R., 2008. Engineering Escherichia coli for the efficient conversion of glycerol to ethanol and co-products. Metab. Eng. 10, 340–351.