Available online at www.sciencedirect.com
Chemical synthesis using synthetic biology
James M Carothers1,2, Jonathan A Goler1,2,3 and Jay D Keasling1,2,3,4,5,6

An immense array of naturally occurring biological systems have evolved that convert simple substrates into the products that cells need for growth and persistence. Through the careful application of metabolic engineering and synthetic biology, this biotransformation potential can be harnessed to produce chemicals that address unmet clinical and industrial needs. Developing the capacity to utilize biology to perform chemistry is a matter of increasing control over both the function of synthetic biological systems and the engineering of those systems. Recent efforts have improved general techniques and yielded successes in the use of synthetic biology for the production of drugs, bulk chemicals, and fuels in microbial platform hosts. Synthetic promoter systems and novel RNA-based, or riboregulator, mechanisms give more control over gene expression. Improved methods for isolating, engineering, and evolving enzymes give more control over substrate and product specificity and better catalysis inside the cell. New computational tools and methods for high-throughput system assembly and analysis may lead to more rapid forward engineering. We highlight research that reduces reliance upon natural biological components and point to future work that may enable more rational design and assembly of synthetic biological systems for synthetic chemistry.
Addresses
1 California Institute for Quantitative Biosciences and Berkeley Center for Synthetic Biology, University of California, Berkeley, CA 94720, USA
2 Joint BioEnergy Institute, Emeryville, CA 95608, USA
3 Synthetic Biology Engineering Research Center, University of California, Berkeley, CA 94720, USA
4 Department of Bioengineering, University of California, Berkeley and Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
5 Department of Chemical Engineering, University of California, Berkeley, USA
6 Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA

Corresponding author: Carothers, James M ([email protected]), Goler, Jonathan A ([email protected]) and Keasling, Jay D ([email protected])
Current Opinion in Biotechnology 2009, 20:498–503
This review comes from a themed issue on Systems and synthetic biology
Edited by Sven Panke and Ron Weiss
Available online 31st August 2009
0958-1669/$ – see front matter © 2009 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.copbio.2009.08.001
Introduction
Much progress in biotechnology can be understood as an improved ability to exert control over biological processes.
A central problem facing researchers 30 years ago was controlling the overexpression of a single gene product for purification from a heterologous microbial host such as Escherichia coli or Saccharomyces cerevisiae. In contrast, to engineer biology for synthetic chemistry the challenge is to regulate the expression of a large number of biocatalytic gene products and control their activity as a coordinated system inside a heterologous microbial host (Figure 1). Mastering the ability to effectively engineer biology is a challenge worth tackling because synthetic metabolic pathways can be used to produce a large variety of stereochemically pure small molecules that are difficult to synthesize with traditional organic chemistry. Improved methods for engineering synthetic biological systems may lead to efficient routes for producing pharmacologically active compounds, industrially important bulk chemicals, and liquid fuels for transportation [1,2].

One of us (JDK) recently authored a comprehensive review of the application of synthetic biology to problems in synthetic chemistry [1]. Here, we highlight efforts to improve control over the function of synthetic biological systems as well as to improve control over the engineering process itself. In general, it will be important to develop a more quantitative understanding of how the design of biological structures, pathways, and systems relates to context-dependent function. This review emphasizes work published in the last two years. In particular, we describe reports that lead to firstly, better controls for regulating gene expression in the cell; secondly, improved methods for identifying, engineering, and evolving enzymes to give better control over catalytic reactions, including improving performance or changing substrate specificity; and thirdly, computational tools and methods for high-throughput system assembly and analysis that may enable more rapid forward engineering.

Programming gene expression
To efficiently produce small molecules from an engineered metabolic pathway, expression of multiple, heterologous, biocatalytic genes must be balanced to avoid the accumulation of toxic intermediates and to ensure that the demand for cellular resources does not exceed the supply. Many traditional gene expression tools created to overproduce large quantities of a single protein are not well suited to this task [1]. For example, it was not possible to induce transcription independently from two different commonly used promoters, lac (Plac) and arabinose (PBAD), in the same E. coli cell until the AraC transcriptional regulator was engineered to avoid ‘crosstalk’ (Figure 1) [3]. Genetic controls that generate defined levels of expression in a given context are very useful, not only for identifying optimal biocatalytic gene expression stoichiometries [4] but also for subsequently programming those ratios in synthetic biological systems. One approach has been to compile databases, or ‘registries’, of genetic controls where the activities in different contexts are quantitatively characterized using heuristic rules [5,6].

Dynamic gene regulation with RNA
In addition to the more familiar protein-based regulatory controls such as transcription factors and response regulators [1], naturally occurring biological systems have evolved a variety of RNA-based mechanisms for regulating gene expression. To date, synthetic RNA sequences that fold into defined secondary structures, catalytic RNA structures (ribozymes), or ligand-binding RNA structures (aptamers) have been engineered to modulate transcription, translation, and mRNA degradation to program gene expression levels [7]. Synthetic RNA aptamers can be generated with in vitro selection to bind a chosen molecular target [8] and deployed as genetic control mechanisms [7]. In one successful example, RNA aptamers evolved to bind the tetracycline repressor protein (TetR) in E. coli induce gene expression from a tet promoter to the same extent as tetracycline [9]. Aptamers joined to a self-cleaving ribozyme form allosterically regulated, self-cleaving structures, or aptazymes [10]. Aptazymes can be employed such that ligand-mediated RNA cleavage alters the degradation rate of an mRNA [11] or the secondary structure surrounding a ribosome binding site (RBS) [12]. In these designs, the level of protein expression is a function of steady-state mRNA concentration and the degree of ribosomal access to the RBS, respectively (Figure 1, gray inset). In principle, aptamer and aptazyme-based feedback controls could be customized to couple changes in the concentration of a target ligand to changes in gene expression, without being limited to recognizing metabolites for which the binding structures exist in nature. Introducing feedback to dynamically control gene expression and enzyme function in engineered pathways should help prevent the build-up of the toxic intermediates and avoid wasteful translation. Furthermore, feedback mechanisms can transmit information throughout the pathway to reduce the rise-time to the steady state, improving product yields. 
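The claim that feedback can shorten the rise-time to steady state can be illustrated with a deliberately simple toy model. The sketch below is not taken from any of the cited studies: it assumes a single gene product p with first-order decay, compares constant (open-loop) production against production repressed by p itself, and tunes both to reach the same steady state. The function names, the repression form K/(K + p), and all parameter values are illustrative assumptions.

```python
import math

def rise_time(production, decay=1.0, target=1.0, dt=1e-3, t_max=50.0):
    """Return the time at which p(t), starting from 0, first reaches half of
    `target`, integrating dp/dt = production(p) - decay * p with Euler steps."""
    p, t = 0.0, 0.0
    while t < t_max:
        if p >= 0.5 * target:
            return t
        p += dt * (production(p) - decay * p)
        t += dt
    raise RuntimeError("half-maximal level was not reached within t_max")

# Hypothetical parameters (arbitrary units), chosen only for illustration.
DECAY = 1.0    # first-order loss rate of the gene product
TARGET = 1.0   # desired steady-state level, identical for both designs
K = 0.1        # repression constant of the assumed feedback element

# Open loop: constant production that balances decay at TARGET.
beta_open = DECAY * TARGET
# Feedback: stronger maximal production, repressed as p accumulates,
# tuned so that production equals decay exactly at TARGET.
beta_fb = DECAY * TARGET * (K + TARGET) / K

def open_loop(p):
    return beta_open

def feedback(p):
    return beta_fb * K / (K + p)

t_open = rise_time(open_loop)
t_fb = rise_time(feedback)
print(f"open-loop half-rise time: {t_open:.3f}")
print(f"feedback half-rise time:  {t_fb:.3f}")
```

In this toy model the feedback design starts with a high production rate that throttles itself as the product accumulates, so it reaches half of the shared steady-state level sooner than the open-loop design. Whether a real aptamer- or aptazyme-based controller behaves this way would depend on measured kinetics that this sketch does not attempt to capture.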
Engineered oscillatory circuits [13] could be used to program product formation that takes place in discrete steps for asymmetric syntheses or to cycle between phases of product formation and active transport out of the cell.
Identifying and engineering enzymes
Extant enzymes and pathways are often assembled into novel combinations during the evolutionary diversification of natural biological systems [14]. In an analogous manner, engineered metabolic pathways can be assembled from novel combinations of naturally occurring enzymes and pathways [15] (Table 1). Combining bioinformatic tools with screening methods for functionally interrogating sequence libraries derived from environmental samples of mixed microbial communities (i.e. metagenomic libraries [16]) may expand access to the catalytic versatility already present in nature. For example, a study involving the heterologous expression of a metagenomic library from a single site in Alaska yielded dozens of different beta-lactamase enzymes [17]. Those data underscore the point that even small microbial communities can contain a high degree of sequence and catalytic diversification. Naturally occurring enzymes that are readily expressed in a heterologous host and that exhibit a high degree of promiscuity with regard to their cognate substrates [18] may be especially good starting points for assembling synthetic pathways [19]. Other protein engineering methods, such as those informed by structural information [20] or theories of molecular evolution [21], can be pursued when the goal is to improve enzyme expression and activity without altering catalytic specificity [22]. These approaches have been successfully applied in our own laboratory to engineer proteins for heterologous terpenoid biosynthesis pathways [22]. Improved methods for evolving or rationally designing enzymes with novel function may eventually alleviate the need to rely upon naturally occurring biological structures as the source of catalysts. If so, there will be even greater control over the choice of substrates, products, rate constants, and cellular contexts where the desired chemistry can be performed. 
One recent computational design study produced enzymes capable of breaking carbon–carbon bonds using four different catalytic motifs [23]. Quantum mechanical transition state calculations guided the initial design of a protein for catalyzing carbon deprotonation via Kemp elimination based on a chosen catalytic mechanism. Subsequent in vitro evolution produced an enzyme with 10^6-fold rate enhancement over background [24]. The in vitro evolution of an RNA ligase enzyme showed that de novo enzymes with million-fold rate enhancements can be generated directly from libraries of noncatalytic protein scaffolds, without needing detailed mechanistic insight, if the starting library is diverse and well-designed [25].

Computational tools for guiding system design
Computational modeling and analysis can reduce the time and expense needed for experimentation by culling ideas that have no possibility of working and guiding system design in promising directions. For example, ReBit is a tool for interrogating databases of characterized enzymes to identify potential biocatalytic routes for producing a desired small molecule [15]. DESHARKY integrates ReBit-type analysis with models for flux balance and gene expression in an attempt to guide metabolic pathway design such that growth is optimized [26].

Synthetic genetic circuitry can be designed using computational platforms for evaluating and manipulating genetic information. BioJADE is a conceptual design tool whereby genetic circuits are designed via an electrical engineering metaphor [27]. BioJADE employs plug-ins to simulate the function of a putative genetic circuit using information drawn from a database or registry that describes the activities of well-characterized genetic components such as promoters, ribosome binding sites, and open reading frames. GenoCAD [28] implements a context-free grammar to achieve a similar end and RoVerGeNe evaluates the robustness of a particular circuit design with respect to a given range of conditions and biochemical parameters. Notably, the RoVerGeNe algorithms can identify parameters predicted to be consistent with a desired circuit function, information that can be used to guide the generation of individual components needed to construct the system [29].

Table 1. Identifying, engineering, and evolving new enzyme function

| Approach | Highlighted example(s) | Notes |
| --- | --- | --- |
| Functional metagenomic library screening | Isolation of β-lactamases [17] | Metagenomic libraries provide access to natural catalytic diversity |
| Engineer promiscuous enzyme active sites | Engineered new reaction specificity for P450BM3 from Bacillus megaterium [19] | Structure-based design avoids need to identify cognate enzymes in nature |
| Engineer nonactive site residues to improve expression and heterologous enzyme activity | Heterologous enzymes optimized for activity in E. coli terpenoid biosynthesis pathways [21,22] | Structural information and evolutionary insight used to engineer sequence and structural variants more active in microbial host |
| De novo enzyme design | Computational design of retro-aldol enzymes [23] | Design begins with mechanistic insight and potential protein scaffolds |
| Combination of rational design and directed evolution | Rational design and evolution of novel enzymes catalyzing Kemp elimination [24] | Mechanistic insight guides initial designs. Directed evolution with libraries informed by initial designs |
| In vitro selection from diverse libraries | Evolution of novel RNA-RNA ligase enzymes [25] | Does not require prior mechanistic insight |

High-throughput assembly
Once a synthetic metabolic pathway has been designed, a strategy must be put into place to assemble the necessary genetic expression constructs. Cox et al. [30] described a scheme for automating the assembly of individual genes to express protein sequence variants. Their protein fabrication automation (PFA) method assembles open reading frames (ORFs) from short chemically synthesized oligos using PCR, liquid handling robotics, and genetic selections. The technique routinely produces 24–48 gene-sized ORFs per run, and is amenable to high-throughput automation. Individual ORFs can be assembled into multigene pathways with cloning and expression systems similar to the one developed to combinatorially assemble and test novel three-module polyketide synthase enzymes [31]. In that case, gene cassettes expressing 147 combinations of three different modules were fabricated using unique restriction sites and three cycles of standard, ligation-mediated cloning [31,32]. In principle, the same combinatorial set of three-gene cassettes could be assembled from the library of polyketide synthase modules in a single step using sequence- and ligation-independent cloning (SLIC), a method that relies upon in vitro homologous recombination and single-strand annealing [33].

PCR error rates and the propensity for large pieces of DNA to shear place upper limits on the sizes of genetic constructs that can be produced in vitro. This difficulty can be surmounted by combining PCR-assembled sections of DNA in yeast via in vivo homologous recombination. As a demonstration, the highly robust recombination and error-correcting mechanisms of yeast were harnessed to assemble an entire Mycoplasma genitalium genome from 25 synthetic fragments [34].
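The combinatorial assembly of three-module cassettes described above can be sketched as a simple enumeration, for instance as a way to generate a build list for a liquid handling robot. This is only an illustration: the module names and the grouping into three positions are hypothetical, and the example does not reproduce the 147 combinations reported in [31].

```python
from itertools import product

# Hypothetical module libraries, one list per position in a three-module
# cassette. Names are placeholders, not the modules used in [31].
position_1 = ["mod1a", "mod1b"]
position_2 = ["mod2a", "mod2b", "mod2c"]
position_3 = ["mod3a", "mod3b"]

def enumerate_cassettes(*positions):
    """Return every ordered combination that takes one module per position."""
    return list(product(*positions))

cassettes = enumerate_cassettes(position_1, position_2, position_3)

print(f"{len(cassettes)} cassettes to build")
for first, second, third in cassettes[:3]:
    print(f"{first} - {second} - {third}")
```

The number of constructs grows as the product of the library sizes at each position, which is one reason single-step and automated assembly methods become attractive as libraries grow.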
Figure 1. Programming gene expression. We illustrate three examples of genetic controls that can be engineered to regulate the expression of heterologous enzymes (E1, E2, and E3) (A). At left, AraC* [3] is a mutant version of an arabinose-responsive transcriptional regulator created with directed evolution to avoid ‘cross-talk’ with the lactose-responsive transcriptional regulator, lac repressor (LacI). On the right, expression from the tet promoter begins when an in vitro-selected RNA aptamer binds specifically to the tetracycline repressor protein, TetR [9]. E1 and E3 translation occurs once their mRNAs have been transcribed (B). E2 translation initiates only when ligand-dependent aptazyme cleavage [12] removes secondary structure surrounding the ribosome binding site (RBS) (gray inset). Genetic tools like these can be used to control protein stoichiometries such that the translated enzymes (C) efficiently catalyze the conversion of substrate a into product d.
The design process can be integrated with methods for assembly. Clotho is an expandable platform-based design tool that is similar in spirit to BioJADE and GenoCAD (see above), but has different technological underpinnings (URL: http://2008.igem.org/Team:UC_Berkeley_Tools). Clotho has the ability to interface directly with devices in the laboratory, including automated liquid handling robots that could be directed to assemble genetic constructs. Systems-level analysis is important for understanding how synthetic biological pathways can be engineered to interact productively with the metabolism of the host cell [35] and is a crucial aspect of the assembly process. Developing methods to support and integrate the processes of high-throughput assembly and analysis will enable larger scale engineering projects as well as allow systematic measurement (Figure 2). In turn, quantitative analysis will facilitate pathway optimization and provide functional component and system characterization data for use in future applications.

Conclusions
Engineered metabolic pathways can be assembled and used to discover and produce clinically important and industrially important small molecules. The aim is to design synthetic pathways and systems that minimize metabolic and genetic overhead in order to maximize product formation. As shown here, significant progress continues to be made toward engineering genetic controls, enzymes, and pathways that can be tailored to any chemical problem, not just those for which the components (or products) already exist in nature.

Informing the design and analysis of synthetic biological systems using computational simulation will increase the speed and efficacy with which biology can be engineered. Building on the lessons and methodologies derived from the work reviewed here may help us reach a point where synthetic biological systems can be mechanistically modeled, purposefully designed and automatically assembled. If so, it will be possible to rapidly engineer synthetic biological systems to perform synthetic chemistry, for any desired application, that is efficient, inexpensive, and environmentally benign.

Figure 2. Engineering process. Integrating system design, assembly, and analysis with computational simulation should increase the speed and efficacy with which synthetic biological systems can be engineered for synthetic chemistry. Computational design and simulation data could drive both assembly (arrow A) and part generation (e.g. engineering or identifying genetic controls or enzymes that meet particular performance targets) (arrow D). The output of the assembly (arrow B) is then assayed and analyzed. The data gathered from assembly (i.e. which combinations were successfully built) and analysis (i.e. quantitative part and system performance) are captured and fed back (arrows A, C) to refine system simulations and guide the improvement of enzyme and genetic control parts (arrow D). Databases are used to collect and manage information throughout the entire process.
Acknowledgements
Work in the authors’ laboratory was supported by the Joint BioEnergy Institute (JBEI, URL: http://www.jbei.org) through contract DE-AC02-05CH11231 between Lawrence Berkeley National Laboratory and the US Department of Energy and by the Synthetic Biology Engineering Research Center (SynBERC, URL: http://www.synberc.org/) through a grant from the US National Science Foundation. JMC was supported in part by a Jane Coffin Childs Memorial Fund Postdoctoral Fellowship.
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as of special interest or of outstanding interest.

1. Keasling JD: Synthetic biology for synthetic chemistry. ACS Chem Biol 2008, 3:64-76.
2. Lee SK, Chou H, Ham TS, Lee TS, Keasling JD: Metabolic engineering of microorganisms for biofuels production: from bugs to synthetic biology to fuels. Curr Opin Biotechnol 2008, 19:556-563.
3. Lee SK, Chou HH, Pfleger BF, Newman JD, Yoshikuni Y, Keasling JD: Directed evolution of AraC for improved compatibility of arabinose- and lactose-inducible promoters. Appl Environ Microbiol 2007, 73:5711-5715.
   This paper shows that cross-talk can be engineered out of E. coli promoter systems, enabling the simultaneous, independent expression of multiple genes.
4. Tsuruta H, Paddon CJ, Eng D, Lenihan JR, Horning T, Anthony LC, Regentin R, Keasling JD, Renninger NS, Newman JD: High-level production of amorpha-4,11-diene, a precursor of the antimalarial agent artemisinin, in Escherichia coli. PLoS ONE 2009, 4:e4489.
5. Cox RS, Surette MG, Elowitz MB: Programming gene expression with combinatorial promoters. Mol Syst Biol 2007, 3: doi: 10.1038/msb4100187.
   Combinatorial random promoter libraries were analyzed to derive heuristic rules for programming gene expression in E. coli.
6. Canton B, Labno A, Endy D: Refinement and standardization of synthetic biological parts and devices. Nat Biotechnol 2008, 26:787-793.
7. Saito H, Inoue T: Synthetic biology with RNA motifs. Int J Biochem Cell Biol 2009, 41:398-404.
8. Carothers JM, Szostak JW: In vitro selection of functional oligonucleotides and the origins of biochemical activity. The Aptamer Handbook: Functional Oligonucleotides and Their Applications. Sven Klussman; 2006:3–28.
9. Hunsicker A, Steber M, Mayer G, Meitert J, Klotzsche M, Blind M, Hillen W, Berens C, Suess B: An RNA aptamer that induces transcription. Chem Biol 2009, 16:173-180.
10. Link KH, Guo L, Ames TD, Yen L, Mulligan RC, Breaker RR: Engineering high-speed allosteric hammerhead ribozymes. Biol Chem 2007, 388:779-786.
   This paper describes an in vitro selection method for creating and optimizing aptazymes comprising RNA aptamers and hammerhead ribozymes to potentially generate biologically useful regulators. Although the aptazymes engineered here could not be used to regulate gene expression in a mammalian model system, these methods may yield structures for dynamically controlling gene expression in other contexts.
11. Win MN, Smolke CD: A modular and extensible RNA-based gene-regulatory platform for engineering cellular function. Proc Natl Acad Sci U S A 2007, 104:14283.
12. Ogawa A, Maeda M: An artificial aptazyme-based riboswitch and its cascading system in E. coli. ChemBioChem 2008, 9:206-209.
13. Wong WW, Tsai TY, Liao JC: Single-cell zeroth-order protein degradation enhances the robustness of synthetic oscillator. Mol Syst Biol 2007, 3: doi: 10.1038/msb4100172.
14. Ganfornina M, Sanchez D: Generation of evolutionary novelty by functional shift. BioEssays 1999, 21:432-439.
15. Prather KLJ, Martin CH: De novo biosynthetic pathways: rational design of microbial chemical factories. Curr Opin Biotechnol 2008, 19:468-474.
16. Schloss P, Handelsman J: A statistical toolbox for metagenomics: assessing functional diversity in microbial communities. BMC Bioinformatics 2008, 9:34.
   This paper describes a method for estimating the functional richness within a microbial community by inferring peptide fragments from metagenomic DNA sequence data.
17. Allen HK, Moe LA, Rodbumrer J, Gaarder A, Handelsman J: Functional metagenomics reveals diverse β-lactamases in a remote Alaskan soil. ISME J 2008, 3:243-251.
18. Fischbach MA, Clardy J: One pathway, many products. Nat Chem Biol 2007, 3:353-355.
19. Dietrich J, Yoshikuni Y, Fisher K, Woolard F, Ockey D, McPhee D, Renninger N, Chang M, Baker D, Keasling JD: A novel semibiosynthetic route for artemisinin production using engineered substrate-promiscuous P450BM3. ACS Chem Biol 2009, 0: http://pubs.acs.org/doi/abs/10.1021/cb900006h.
20. Fox RJ, Huisman GW: Enzyme optimization: moving from blind evolution to statistical exploration of sequence–function space. Trends Biotechnol 2008, 26:132-138.
21. Yoshikuni Y, Dietrich JA, Nowroozi FF, Babbitt PC, Keasling JD: Redesigning enzymes based on adaptive evolution for optimal function in synthetic metabolic pathways. Chem Biol 2008, 15:607-618.
   A 1000-fold improvement in production from a synthetic mevalonate pathway was achieved by engineering enzymes for improved activity in vivo without affecting the catalyzed reactions.
22. Chang MCY, Eachus RA, Trieu W, Ro D, Keasling JD: Engineering Escherichia coli for production of functionalized terpenoids using plant P450s. Nat Chem Biol 2007, 3:274-277.
   Plant enzymes were engineered for efficient expression and catalysis in E. coli in order to assemble heterologous pathways with high levels of functionalized terpenoid production.
23. Jiang L, Althoff EA, Clemente FR, Doyle L, Rothlisberger D, Zanghellini A, Gallaher JL, Betker JL, Tanaka F, Barbas CF III et al.: De novo computational design of retro-aldol enzymes. Science 2008, 319:1387.
   Jiang et al. computationally designed and experimentally verified enzymes capable of catalyzing a reaction for which no known biological counterparts exist. Improved methods for enzyme design and evolution reduce the need to rely upon existing proteins as the source of catalysts and expand the range of chemistries that can be performed.
24. Röthlisberger D, Khersonsky O, Wollacott AM, Jiang L, DeChancie J, Betker J, Gallaher JL, Althoff EA, Zanghellini A, Dym O et al.: Kemp elimination catalysts by computational enzyme design. Nature 2008, 453:190-195.
   Computational design and in vitro selection were combined to generate functional enzymes.
25. Seelig B, Szostak JW: Selection and evolution of enzymes from a partially randomized non-catalytic scaffold. Nature 2007, 448:828-831.
26. Rodrigo G, Carrera J, Prather KJ, Jaramillo A: DESHARKY: automatic design of metabolic pathways for optimal cell growth. Bioinformatics 2008, 24:2554-2556.
27. Goler JA, Bramlett BW, Peccoud J: Genetic design: rising above the sequence. Trends Biotechnol 2008, 26:538-544.
28. Cai Y, Hartnett B, Gustafsson C, Peccoud J: A syntactic model to design and verify synthetic genetic constructs derived from standard biological parts. Bioinformatics 2007, 23:2760-2767.
29. Batt G, Yordanov B, Weiss R, Belta C: Robustness analysis and tuning of synthetic gene networks. Bioinformatics 2007, 23:2415-2422.
   A thorough treatment of how to determine which system components are best tunable to produce the desired results.
30. Cox JC, Lape J, Sayed MA, Hellinga HW: Protein fabrication automation. Prot Sci 2007, 16:379.
   The authors describe a rapid, low-cost way to produce genetic cassettes for expressing protein variants to explore functional diversity. This method is widely applicable and could be used to rapidly assemble a large number of genetic constructs.
31. Menzella HG, Carney JR, Santi DV: Rational design and assembly of synthetic trimodular polyketide synthases. Chem Biol 2007, 14:143-151.
32. Shetty R, Endy D, Knight T: Engineering BioBrick vectors from BioBrick parts. J Biol Eng 2008, 2:5.
33. Li M, Elledge S: Harnessing homologous recombination in vitro to generate recombinant DNA via SLIC. Nat Methods 2007, 4:251-256.
   A method for generating multi-gene expression constructs.
34. Gibson DG, Benders GA, Axelrod KC, Zaveri J, Algire MA, Moodie M, Montague MG, Venter JC, Smith HO, Hutchison CA: One-step assembly in yeast of 25 overlapping DNA fragments to form a complete synthetic Mycoplasma genitalium genome. Proc Natl Acad Sci U S A 2008, 105:20404.
   This paper demonstrates assembly of small genome-sized chromosomes.
35. Kizer L, Pitera DJ, Pfleger BF, Keasling JD: Application of functional genomics to pathway optimization for increased isoprenoid production. Appl Environ Microbiol 2008, 74:3229-3241.