Biocatalysts by evolution


Available online at www.sciencedirect.com

Christian Jäckel and Donald Hilvert

Proteins evolve by iterative cycles of mutation, selection and amplification. Analogous evolutionary strategies are being profitably exploited in the laboratory to generate and optimize biocatalysts for diverse biotechnological applications. In this review, we summarize recent efforts to improve this process by creating more effective protein libraries and more efficient screening/selection schemes. Targeted mutagenesis using simplified amino acid alphabets, statistical analyses of sequence–function–stability relationships, and neutral mutational drift have emerged as powerful tools for generating useful molecular diversity, while new techniques for controlling selection stringency and microfluidic methods for screening large populations of molecules promise to facilitate exploration of sequence space. Enzyme engineers interested in creating novel biocatalysts for abiological reactions are sure to profit from these advances.

Address
Laboratory of Organic Chemistry, ETH Zurich, CH-8093 Zurich, Switzerland

Corresponding author: Hilvert, Donald ([email protected])

Current Opinion in Biotechnology 2010, 21:753–759
This review comes from a themed issue on Chemical biotechnology
Edited by Phil Holliger and Karl Erich Jaeger
Available online 17th September 2010
0958-1669/$ – see front matter © 2010 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.copbio.2010.08.008

Introduction

Directed evolution is a powerful technique for probing structure–activity relationships and catalytic mechanisms in proteins and also provides a practical means of tailoring biocatalysts for new tasks [1,2]. Thermostability [3], tolerance to organic solvents [4], enantioselectivity [5], and substrate specificity [6] are among the properties successfully optimized by iterative cycles of mutagenesis and selection or screening. More challenging applications too, such as probing protein–protein interactions [7,8], changing catalytic mechanism [9], and modifying large molecular machines [10,11], have been achieved by this approach. Enzymes evolved in the laboratory are already being used as catalysts to synthesize organic building blocks [12] and pharmaceutical compounds [13,14,15], activate prodrugs [16], degrade anthropogenic chemicals [17], and modify genomes [18]. By enabling the redesign of molecular assembly lines for natural product biosynthesis [10] and modification of catalysts responsible for replication, transcription, and translation [11,19–21], directed evolution also promises to contribute significantly to the new field of synthetic biology.

Although each step of the evolutionary cycle has been successfully mimicked in vitro, improved strategies for exploring sequence space productively are crucial to future advances in this area. Given the essentially unlimited number of possible protein variants and practical limits on the size of genetic libraries, increasing the frequency of ‘hits’ in an experimental population so that desirable clones can be identified and subsequently amplified remains a major challenge. Recent progress toward this end is highlighted here, with an emphasis on library design and the development of more efficient screening/selection protocols.

Simplified amino acid alphabets

One of the simplest strategies to reduce the complexity of protein libraries is to limit the number of building blocks allowed at each randomized position [22]. Entire proteins, including novel scaffolds [23] and active enzymes [24], can be constructed from significantly fewer than the 20 canonical amino acids. Limited amino acid alphabets often yield proteins with better solubility and a lower tendency to aggregate than those constructed with the full 20-member alphabet [25]. However, computational [26] and experimental studies [24] suggest that extensively simplified protein sequences may also yield molten globular structures with weak tertiary interactions.

Saturation mutagenesis with focused libraries of restricted amino acid alphabets has proven to be an effective tool for tailoring enzyme properties. In one example, this approach was used to alter the specificity and selectivity of an epoxide hydrolase [27]. Three active site residues, chosen by inspection of crystal structures, were simultaneously mutagenized using degenerate codons encoding either all 20 proteinogenic amino acids or a subset of 12 representative building blocks. Screening of 5000 clones, corresponding to 95% coverage of the simplified library, resulted in a significantly larger number of active enzymes than sampling of an equal number of variants from the unrestricted library. In the latter case, it would have been necessary to screen ca. 10⁵ clones to provide 95% coverage, representing a significant increase in cost and effort. Very large evolutionary steps can be achieved by iteration of this process over multiple rounds [28].
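The screening numbers quoted above can be reproduced with the standard oversampling estimate T = −V ln(1 − F), where V is the number of codon combinations and F the desired coverage. The sketch below is only illustrative: it assumes that all codon combinations are equally likely, and it assumes NNK (32 codons) and NDT (12 codons) as the degenerate codons, which the text does not name explicitly.

```python
import math


def clones_for_coverage(n_positions, codons_per_position, coverage=0.95):
    """Clones to screen so that any given variant is sampled with probability `coverage`.

    Uses T = -V * ln(1 - F), assuming every codon combination is equally likely.
    """
    library_size = codons_per_position ** n_positions
    return math.ceil(-library_size * math.log(1.0 - coverage))


# Assumed codon sets (not stated in the text): NNK = 32 codons, NDT = 12 codons.
full_alphabet = clones_for_coverage(n_positions=3, codons_per_position=32)
reduced_alphabet = clones_for_coverage(n_positions=3, codons_per_position=12)

print(f"20 amino acids (NNK): about {full_alphabet} clones")     # on the order of 10^5
print(f"12 amino acids (NDT): about {reduced_alphabet} clones")  # on the order of 5 x 10^3
```

Under these assumptions the estimate agrees with the roughly 5000 and 10⁵ clones cited for 95% coverage of the two libraries.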


Focused libraries were also successfully employed to reengineer the substrate scope of a thermostable Baeyer–Villiger monooxygenase [29]. Codon degeneracies at four targeted positions were chosen to match the amino acid preferences found in an alignment of seven homologous monooxygenases, permitting a significant reduction in library size. Improved catalysts were obtained after screening only 1700 transformants, or 86% of the library. An approximately 1000-fold larger screening effort would have been necessary to provide comparable sampling of libraries constructed from the standard set of 20 amino acids. It is worth noting, however, that the best variants frequently contained non-consensus amino acids allowed by the degenerate codons, indicating that oversimplification of the side-chain repertoire might ultimately be counterproductive. Combinatorial complexity and chemical variety must be judiciously balanced at the stage of library design to minimize screening effort while, at the same time, maximizing the chances of finding new activities. Computer simulations on the evolution of antibody variable regions using reduced alphabets support this view. Antibodies with higher binding affinities were found by sparsely searching libraries with a larger alphabet than by exhaustively searching smaller libraries [30].
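As a rough plausibility check of the figures in this paragraph, the same oversampling relation can be inverted to ask what library size is implied by 1700 transformants at 86% coverage. The comparison library below (NNK codons, 32 per position, at four positions) is an assumption made for illustration; the text only says "the standard set of 20 amino acids".

```python
import math


def implied_library_size(clones_screened, coverage):
    """Library size V for which `clones_screened` clones give the stated coverage.

    Inverts T = -V * ln(1 - F), assuming equally likely codon combinations.
    """
    return clones_screened / -math.log(1.0 - coverage)


focused_size = implied_library_size(1700, 0.86)
full_size = 32 ** 4  # assumption: NNK codons at the four targeted positions
fold_increase = full_size / focused_size

print(f"implied focused library: about {focused_size:.0f} codon combinations")
print(f"screening effort for the full library: about {fold_increase:.0f}-fold larger")
```

With these assumptions the ratio comes out at a little over 1000, in line with the "approximately 1000-fold" estimate given above.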

Statistical analyses

Powerful mathematical algorithms have been developed to extract information from the sequence diversity inherent in homologous proteins [31–35]. Natural evolutionary relationships revealed by patterns of covarying residues, for instance, are useful for rationalizing functional properties of proteins [21,36] and for design [32,37]. Statistical analyses of the enormous volume of sequence data generated in laboratory evolution experiments can similarly shed light on structure–function–stability relationships relevant to engineering better catalysts [15,38,39,40,41]. Multivariate methods are able to identify beneficial mutations, even in clones with overall reduced function, that can be combined to give improved variants directly or incorporated into new combinatorial libraries for further rounds of evolution. By focusing on functional mutations, as opposed to functional genes, higher quality libraries are generated, thereby minimizing the need for laborious and costly downstream screening.

For example, the ProSAR algorithm, which extends traditional SAR-based approaches to protein optimization, was successfully used together with recombinant methods to produce a bacterial halohydrin dehalogenase for the industrial-scale synthesis of a cholesterol-lowering drug [15]. Eighteen rounds of diversification, screening, and statistical analysis of sequence–function relationships afforded catalysts with a ca. 4000-fold increased volumetric productivity for the targeted cyanation reaction. The chromate reductase activity of oxidoreductase ChrR was improved ca. 1500-fold using a comparable statistical model [38]. It is worth noting that individual amino acid contributions to activity were treated additively in both studies, suggesting that epistatic effects can be neglected to a first approximation.

Structure-guided recombination strategies have similarly been used to stabilize enzymes. In one example, a library of cellulase chimeras was created by shuffling eight pseudoindependent sequence blocks from three homologous fungal enzymes [39]. The additive contributions of the individual fragments to thermostability enabled the design of cellulase variants with enhanced activity towards crystalline cellulose at elevated temperatures. This approach was also used to create a diverse family of thermostable cytochrome P450 enzymes [40]. In addition to possessing substantially greater resistance to inactivation than the most stable parent, the best chimeras exhibited a range of novel functions, including the ability to produce drug metabolites.

Stabilized cytochrome P450 enzymes have also been reliably predicted by analysis of multiple sequence alignments [40]. In contrast to the approach described above, which is based on linear regression analysis of sequence-stability data, consensus design is founded on the assumption that frequencies of sequence elements correlate with their contributions to protein stability. Genetic diversity in the cytochrome P450 study was generated by shuffling larger sequence blocks from several homologous proteins, thus minimizing evolutionary relationships between sequences that often bias amino acid frequencies in conventional alignments of phylogenetically related proteins. Taking this approach to the extreme, sequence diversity essentially free of phylogenetic bias can be generated by targeted randomization of a single protein, followed by selection/screening for function (Figure 1). This strategy was successfully applied to the consensus design of binary-patterned chorismate mutase enzymes that were easier to produce and substantially more stable than reference proteins from the starting libraries [41]. Notably, because side-chain diversity was greatly restricted through the use of a simplified eight amino acid alphabet, these properties could be achieved by statistical analysis of a relatively small number (<30) of sequences.
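The core of consensus design, taking the most frequent residue at each randomized position among selected, active variants, can be sketched in a few lines. This is a minimal illustration rather than the procedure used in [41]: the five sequences are invented toy data, and only the eight-letter binary alphabet (Phe, Ile, Leu, Met and Asp, Glu, Asn, Lys) is taken from the study.

```python
from collections import Counter

HYDROPHOBIC = set("FILM")  # Phe, Ile, Leu, Met
HYDROPHILIC = set("DENK")  # Asp, Glu, Asn, Lys


def binary_pattern(sequence):
    """Reduce a sequence to its hydrophobic (h) / hydrophilic (p) pattern."""
    pattern = []
    for residue in sequence:
        if residue in HYDROPHOBIC:
            pattern.append("h")
        elif residue in HYDROPHILIC:
            pattern.append("p")
        else:
            raise ValueError(f"residue {residue!r} is outside the eight-letter alphabet")
    return "".join(pattern)


def consensus(sequences):
    """Return the most frequent residue at each position of equal-length sequences."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("sequences must be aligned and of equal length")
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*sequences))


# Hypothetical selected variants sharing one binary pattern (toy data, not from [41]).
selected = ["LEKIDF", "LEKLNF", "IEKLDF", "LDKLDM", "LENLDF"]

print(consensus(selected))                  # consensus residue per position
print(binary_pattern(consensus(selected)))  # consensus keeps the h/p pattern
```

Because every position is restricted to four residues, even a few dozen sequences give reasonably well-defined frequencies, which is consistent with the small sample size noted above.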

Figure 1

Consensus design without phylogenetic bias [41]. The helices of a homodimeric chorismate mutase were randomized according to a binary pattern of hydrophobic (Phe, Ile, Leu, Met) and hydrophilic (Asp, Glu, Asn, Lys) residues in three separate experiments. After functional selection for variants that complement a chorismate mutase-deficient E. coli strain, catalytically active enzymes were sequenced and used for statistical analysis of amino acid frequencies at the randomized sites. Artificial consensus proteins, which were constructed from the most frequent residue at each position (denoted as a gray X), were more stable and easier to fold than the parent library proteins.

Neutral drift libraries

Successful enzyme engineering often requires multiple mutations. If viable combinations are rare or if the necessary amino acid substitutions destabilize the protein or prevent proper folding, as is frequently observed, evolutionary redesign may ultimately fail. In nature, sequence diversity inherent in large protein populations can suppress the destabilizing effects of new mutations, as well as foster promiscuous functions divorced from the protein’s normal biological role, to facilitate acquisition of new activities. This natural sequence diversity arises by the process of neutral drift. Over time, proteins accumulate random mutations that are neutral with regard to fitness; that is, they do not interfere with biological function or structural integrity. When environmental conditions change, functional homologues present in the population may provide an adaptive advantage. Methods for shuffling families of homologous genes (also called molecular breeding) have provided a powerful and practical means of exploiting natural sequence diversity for the directed evolution of enzymes [42].

Neutral mutational drift can be purposely accelerated in the laboratory to facilitate accumulation of potentially adaptive mutations. In two seminal studies, serum paraoxonase [43] and cytochrome P450 BM3 [44] were subjected to random mutagenesis and functional selection for native activity. The resulting protein populations exhibited an expanded range of specificities and enhanced promiscuous activities compared to the starting enzymes, even though these properties were not subject to direct selection.

The threshold of native catalytic activity demanded in such experiments is an important parameter that determines the rate at which mutations accumulate [45]. In the absence of an activity constraint, deleterious mutations ultimately inactivate the protein. Selection for native activity thus purges lethal changes, but if the selection stringency is too high, mutations that might benefit new functions will also be lost. This tradeoff was observed during adaptation of TEM-1 β-lactamase to a new task, namely degradation of a cephalosporin antibiotic. Evolution was fastest when the pressure of purifying selection for the original penicillinase activity was relaxed but not entirely relieved [45].
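The effect of selection stringency on mutation accumulation can be illustrated with a deliberately crude simulation. Everything in this sketch is an assumption made for illustration and is not taken from the cited studies: each mutation multiplies native activity by a random factor between 0.5 and 1.0, variants survive a round only if their activity stays at or above a threshold fraction of the wild-type level, and survivors are resampled to a constant population size.

```python
import random


def mean_mutations_after_drift(threshold, rounds=30, pop_size=300,
                               mutation_prob=0.5, seed=1):
    """Toy neutral-drift model: mean mutations per variant after purifying selection.

    `threshold` is the fraction of wild-type native activity required to survive.
    """
    rng = random.Random(seed)
    # Each variant is (native activity relative to wild type, number of mutations).
    population = [(1.0, 0)] * pop_size
    for _ in range(rounds):
        offspring = []
        for activity, n_mut in population:
            if rng.random() < mutation_prob:
                # Assumed effect distribution: neutral to two-fold deleterious.
                activity *= rng.uniform(0.5, 1.0)
                n_mut += 1
            offspring.append((activity, n_mut))
        survivors = [v for v in offspring if v[0] >= threshold]
        if not survivors:  # population extinct under this stringency
            break
        # Amplify survivors back to the original population size.
        population = [rng.choice(survivors) for _ in range(pop_size)]
    return sum(n_mut for _, n_mut in population) / pop_size


for stringency in (0.9, 0.3):
    mean = mean_mutations_after_drift(stringency)
    print(f"threshold {stringency:.1f}: {mean:.1f} mutations per variant on average")
```

In this toy model a relaxed threshold lets the population carry more mutations per variant than a stringent one, mirroring the qualitative tradeoff described above; the absolute numbers have no experimental meaning.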

Computational simulations suggest that genetically diverse, drifting populations tend to become enriched in sequences that are stable and robust to mutation [46]. Laboratory evolution of cytochrome P450 BM3 has provided experimental support for this prediction. Proteins evolved from large neutral drift populations exhibited higher mutational robustness and structural stability than P450 variants obtained by evolution from a single starting sequence [47]. Studies on TEM-1 β-lactamases also showed that stabilizing mutations are enriched in subpopulations of highly polymorphic neutral drift libraries [48]. Moreover, the accumulation of global suppressor mutations greatly facilitated the evolution of TEM-1 variants that convert the artificial substrate cefotaxime. These results convincingly illustrate the utility of neutral variation for the acquisition of new function (Figure 2).

In nature, chaperones have been shown to buffer the expression of genetic variation [49–51]. This buffering capacity can increase the overall adaptive evolvability of an organism, since subpopulations expressing cryptic phenotypes may have a selective advantage under altered environmental conditions. Chaperone overexpression similarly promotes genetic variation in laboratory evolution experiments by suppressing the destabilizing effect of mutations. Mutagenesis and selection experiments with four different enzymes in the presence or absence of GroEL/GroES showed that the genetic diversity of neutral drift populations can be roughly doubled in this way [52]. Moreover, adaptive evolution of the promiscuous esterase activity of a phosphotriesterase coexpressed with GroEL/GroES yielded ca. 10-fold higher catalytic efficiencies compared to analogous experiments without these chaperones. Evolution of efficient chaperones for specific target proteins, for example using cell-surface display systems [53], may further extend the utility of this approach.

Figure 2

Directed evolution of enzymatic activity via combined neutral drift and adaptive evolution. Iterative rounds of randomization and screening or selection of protein libraries for low levels of starting activity (neutral drift) generate neutral networks of protein sequences (large light circles) that contain subpopulations with increased mutational robustness (small dark circles). After neutral drift starting from an enzyme with activity (A) (blue star), mutationally robust subpopulations of the resulting network that are promiscuous for activity (B) (end of left dotted arrow) represent ideal starting points for adaptive evolution (left solid arrow) of a highly active enzyme with novel function (B) (orange star). Mimicking the course of natural evolution, the design process can be repeated (right dotted and solid arrows) to evolve enzyme (B) towards potent biocatalysts for activity (C) (green star).

Improved screening and selection strategies

Identifying molecules in a protein population that do something interesting is generally the greatest challenge for directed evolution. Over the years, diverse methods have been developed for efficiently sampling the catalytic or binding activities of library members. For tasks with rare solutions, selection-based methods are particularly valuable because of their enormous throughput [54]. For example, in vitro display techniques enable sampling of >10¹² variants. The power of such approaches is illustrated by the use of mRNA display, a method that links proteins covalently to their encoding mRNA [55], to isolate and optimize a novel RNA ligase from a library based on a partially randomized zinc finger scaffold. Seventeen selection rounds afforded multi-turnover catalysts with >2 × 10⁶ rate accelerations. Notably, no prior mechanistic information was required to identify these enzymes; product formation served as the sole selection criterion.

Provided that catalytic activity can be linked to cell growth, genetic selection is an attractive high-throughput alternative to in vitro display techniques [54]. Designing in vivo selection systems is non-trivial, however. Living cells are complex, and modification of the host genome often leads to unanticipated phenotypes. Moreover, the threshold of activity needed for survival of an auxotrophic host and the dynamic range of selectable activities are usually difficult to predict.

Regulation of in vivo catalyst and/or substrate concentrations through metabolic engineering represents one solution to such problems (Figure 3) [56]. For example, intracellular enzyme concentrations can be systematically varied over a broad dynamic range by combining tunable transcription with a protein-degradation tag [57]. Such fine-tuned control allows the intracellular concentration of the growth-limiting catalyst to be matched to the level of activity needed to overcome the selection hurdle. Similarly, the concentration of critical metabolites within the cell can be manipulated either by controlling the supply of relevant compounds in the growth medium, or by modifying genetic and regulatory mechanisms or metabolic pathways within the cell to steer the production (or destruction) of a key substance. The latter approach was adopted to create a tunable selection system for prephenate dehydratases [58]. By coexpressing a regulable dehydrogenase that competes for prephenate, dehydratases whose activities vary >50,000-fold could be readily differentiated.

The evolution of biocatalysts for reactions that cannot be linked to host survival demands other assay strategies. Screening methods that directly monitor substrate disappearance or product appearance are typically quite versatile [59] but generally have much lower throughput than selection-based assays [22]. Recent advances in microfluidic methods may be able to bridge this gap, though. A highly efficient microfluidic fluorescence-activated droplet sorter (FADS) was recently developed that sorts small aqueous droplets dispersed in a fluorocarbon oil emulsion at rates up to 2000 droplets/s. If single cells producing the catalyst of interest are compartmentalized together with a fluorogenic substrate within the droplet, genotype and phenotype become linked in a simple manner. If enzyme-catalyzed conversion of substrate into product ensues, the resulting signal should be proportional to the activity of the compartmentalized enzyme. Proof-of-concept experiments with cells expressing variants of β-galactosidase have established the feasibility of this approach: cells producing active enzymes were successfully enriched and subsequently recovered from the sorted droplets [60]. The technology has also been successfully applied to the directed evolution of horseradish peroxidase mutants exhibiting catalytic rates 10 times higher than those of their parent [61]. Since the frequency of active clones in the starting libraries was roughly 1 in 10⁵, it would have taken about 100 days for an expensive, state-of-the-art robotic system to screen the same number of variants. To minimize background signal from cellular enzymes, enzymes displayed on retroviruses can also be sorted by droplet-based microfluidics [62]. Although partitioning of hydrophobic molecules into the fluorous oil phase currently limits this technique to reactions with polar substrates and products, the tremendous increase in sampling efficiency made possible by droplet sorting represents an important advance, and numerous applications can be expected in the future.

Figure 3

Strategies for tuning selection stringency. Left panel: Bacterial selection system for chorismate mutases (CMs) [57]. Constitutive expression of CM genes (cm) under control of the bla promoter (Pbla) on plasmid pKECMB complements the CM knock-out by converting chorismate into prephenate, a precursor of the essential amino acids tyrosine and phenylalanine. Because high yields of protein are generated, even weakly active enzymes complement. Selection stringency can be increased by balancing catalyst production and degradation with plasmid pKTS. The library is expressed from the tetA promoter (PtetA), which allows graded and homogeneous transcriptional control of catalyst production via concentration-dependent, tetracycline-mediated displacement of the TetR repressor. Adding a C-terminal degradation tag (SsrA), which directs the enzymes to the ClpXP protease, prevents catalyst accumulation in the cell. Right panel: A regulable prephenate dehydratase (PDT) selection system [58]. In PDT-deficient bacteria, prephenate is preferentially converted to 4-hydroxy-phenylpyruvate and then to tyrosine by prephenate dehydrogenase (PDH) and aromatic amino acid aminotransferase (AAT). However, because tyrosine inhibits E. coli PDH, prephenate accumulates, enabling spontaneous non-enzymatic conversion to phenylpyruvate, which is converted in turn to phenylalanine. ‘Leaky’ growth results, complicating selection of novel PDT enzymes. By coproducing a prephenate dehydrogenase from Z. mobilis (CDH) under tetracycline control, intracellular prephenate concentrations can be systematically reduced, diminishing background growth and also affording tunable stringency for PDT selection.
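To put the quoted sorting rate and hit frequency into perspective, the following back-of-the-envelope sketch estimates how many droplets must be examined to encounter at least one active clone with a given probability, and how long that takes at 2000 droplets/s. It assumes one clone per droplet and independent sampling, which is a simplification of real droplet loading.

```python
import math


def droplets_for_first_hit(hit_frequency, confidence=0.95):
    """Droplets to screen so that P(at least one active clone) >= confidence.

    Assumes independent droplets, each containing a single clone.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - hit_frequency))


def sorting_time_seconds(n_droplets, droplets_per_second=2000.0):
    """Time needed to sort `n_droplets` at the given sorting rate."""
    return n_droplets / droplets_per_second


n = droplets_for_first_hit(hit_frequency=1e-5)  # roughly 1 active clone in 10^5
print(f"droplets needed: about {n}")
print(f"sorting time at 2000 droplets/s: about {sorting_time_seconds(n) / 60:.1f} min")
```

Under these assumptions, a few hundred thousand droplets, sorted in a matter of minutes, are enough to expect a first hit, which makes the contrast with plate-based robotic screening easy to appreciate.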

Perspectives

Over the past decade, directed evolution has become an important tool in the field of protein engineering. Nowadays, as a consequence of improvements in screening/selection technologies and library design, existing enzymes can be almost routinely remodeled with respect to selectivity and substrate scope, paving the way to tailored biocatalysts on demand. The redesign of a transaminase for the industrial-scale synthesis of sitagliptin, an antidiabetic compound, impressively illustrates the practicability of this approach [13]. Creating new enzyme activities where none existed before remains an enormous challenge, of course, but recent breakthroughs in computational design suggest that help is not far off. Powerful algorithms have been developed to construct functional active sites for diverse chemical transformations [63,64,65]. Because the efficiencies of computationally designed enzymes are still modest compared to their natural counterparts, evolutionary strategies will be essential to turn these starting points into practically useful catalysts (Giger L, Blomberg R, Hilvert D, unpublished results) [66]. Once islands of activity are identified in silico, laboratory evolution will serve to specify and map the larger overall topography of the fitness landscape.

References and recommended reading

Papers of particular interest, published within the annual period of review, have been highlighted as: of special interest; of outstanding interest.

1. Romero PA, Arnold FH: Exploring protein fitness landscapes by directed evolution. Nat Rev Mol Cell Biol 2009, 10:866-876.

2. Turner NJ: Directed evolution drives the next generation of biocatalysts. Nat Chem Biol 2009, 5:567-573.

3. Reetz MT, Soni P, Acevedo JP, Sanchis J: Creation of an amino acid network of structurally coupled residues in the directed evolution of a thermostable enzyme. Angew Chem Int Ed 2009, 48:8268-8272.

4. Zumarraga M, Bulter T, Shleev S, Polaina J, Martinez-Arias A, Plou FJ, Ballesteros A, Alcalde M: In vitro evolution of a fungal laccase in high concentrations of organic cosolvents. Chem Biol 2007, 14:1052-1064.

5. Bartsch S, Kourist R, Bornscheuer UT: Complete inversion of enantioselectivity towards acetylated tertiary alcohols by a double mutant of a Bacillus subtilis esterase. Angew Chem Int Ed 2008, 47:1508-1511.

6. Williams GJ, Zhang C, Thorson JS: Expanding the promiscuity of a natural-product glycosyltransferase by directed evolution. Nat Chem Biol 2007, 3:657-662.

7. Zhou Z, Lai JR, Walsh CT: Directed evolution of aryl carrier proteins in the enterobactin synthetase. Proc Natl Acad Sci USA 2007, 104:11621-11626.

8. Müller MM, Kries H, Csuhai E, Kast P, Hilvert D: Design, selection, and characterization of a split chorismate mutase. Protein Sci 2010, 19:1000-1010.

9. Jochens H, Stiba K, Savile C, Fujii R, Yu JG, Gerassenkov T, Kazlauskas RJ, Bornscheuer UT: Converting an esterase into an epoxide hydrolase. Angew Chem Int Ed 2009, 48:3532-3535.

10. Fischbach MA, Lai JR, Roche ED, Walsh CT, Liu DR: Directed evolution can rapidly improve the activity of chimeric assembly-line enzymes. Proc Natl Acad Sci USA 2007, 104:11951-11956.

11. Neumann H, Wang KH, Davis L, Garcia-Alai M, Chin JW: Encoding multiple unnatural amino acids via evolution of a quadruplet-decoding ribosome. Nature 2010, 464:441-444.

12. Carballeira JD, Krumlinde P, Bocola M, Vogel A, Reetz MT, Bäckvall J-E: Directed evolution and axial chirality: optimization of the enantioselectivity of Pseudomonas aeruginosa lipase towards the kinetic resolution of a racemic allene. Chem Commun 2007:1913-1915.

13. Savile CK, Janey JM, Mundorff EC, Moore JC, Tam S, Jarvis WR, Colbeck JC, Krebber A, Fleitz FJ, Brands J et al.: Biocatalytic asymmetric synthesis of chiral amines from ketones applied to sitagliptin manufacture. Science 2010, 329:305-309.
A combination of rational, computational, and evolutionary methodologies was employed to evolve highly efficient transaminases to replace a rhodium-catalyzed asymmetric hydrogenation step in the manufacture of an antidiabetic drug. The resultant biocatalysts exhibit high enantioselectivity, broad substrate specificity, and tolerance to high concentrations of organic cosolvent and elevated temperatures, illustrating the practicability of tailoring enzymes for industrial application.

14. Gao X, Xie XK, Pashkov I, Sawaya MR, Laidman J, Zhang WJ, Cacho R, Yeates TO, Tang Y: Directed evolution and structural characterization of a simvastatin synthase. Chem Biol 2009, 16:1064-1074.

15. Fox RJ, Davis SC, Mundorff EC, Newman LM, Gavrilovic V, Ma SK, Chung LM, Ching C, Tam S, Muley S et al.: Improving catalytic function by ProSAR-driven enzyme evolution. Nat Biotechnol 2007, 25:338-344.
The directed evolution of a catalyst for the industrial production of the drug atorvastatin was guided by statistical analysis of protein sequence–activity relationships obtained during library screening.

16. Liu LF, Li YF, Liotta D, Lutz S: Directed evolution of an orthogonal nucleoside analog kinase via fluorescence-activated cell sorting. Nucleic Acids Res 2009, 37:4472-4481.

17. Pavlova M, Klvana M, Prokop Z, Chaloupkova R, Banas P, Otyepka M, Wade RC, Tsuda M, Nagata Y, Damborsky J: Redesigning dehalogenase access tunnels as a strategy for degrading an anthropogenic substrate. Nat Chem Biol 2009, 5:727-733.

18. Gordley RM, Gersbach CA, Barbas CF III: Synthesis of programmable integrases. Proc Natl Acad Sci USA 2009, 106:5053-5058.

19. Yoo TH, Tirrell DA: High-throughput screening for methionyl-tRNA synthetases that enable residue-specific incorporation of noncanonical amino acids into recombinant proteins in bacterial cells. Angew Chem Int Ed 2007, 46:5340-5343.

20. Neumann H, Slusarczyk AL, Chin JW: De novo generation of mutually orthogonal aminoacyl-tRNA synthetase/tRNA pairs. J Am Chem Soc 2010, 132:2142-2144.

21. Loakes D, Gallego J, Pinheiro VB, Kool ET, Holliger P: Evolving a polymerase for hydrophobic base analogues. J Am Chem Soc 2009, 131:14827-14837.

22. Jäckel C, Kast P, Hilvert D: Protein design by directed evolution. Annu Rev Biophys 2008, 37:153-173.

23. Jumawid MT, Takahashi T, Yamazaki T, Ashigai H, Mihara H: Selection and structural analysis of de novo proteins from an α3β3 genetic library. Protein Sci 2009, 18:384-398.

24. Walter KU, Vamvaca K, Hilvert D: An active enzyme constructed from a 9-amino acid alphabet. J Biol Chem 2005, 280:37742-37746.

25. Tanaka J, Doi N, Takashima H, Yanagawa H: Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids. Protein Sci 2010, 19:786-795.

26. Guarnera E, Pellarin R, Caflisch A: How does a simplified-sequence protein fold? Biophys J 2009, 97:1737-1746.

27. Reetz MT, Kahakeaw D, Lohmer R: Addressing the numbers problem in directed evolution. ChemBioChem 2008, 9:1797-1804.
An alphabet of 12 representative building blocks is compared to the complete set of all 20 proteinogenic amino acids for active site engineering of an epoxide hydrolase; a detailed mathematical analysis is provided for the calculation of library sizes in directed evolution experiments.

28. Reetz MT, Prasad S, Carballeira JD, Gumulya Y, Bocola M: Iterative saturation mutagenesis accelerates laboratory evolution of enzyme stereoselectivity: rigorous comparison with traditional methods. J Am Chem Soc 2010, 132:9144-9152.

29. Reetz MT, Wu S: Greatly reduced amino acid alphabets in directed evolution: making the right choice for saturation mutagenesis at homologous enzyme positions. Chem Commun 2008:5499-5501.
Optimized amino acid alphabets are determined for the randomization of a loop sequence in phenyl acetone monooxygenase on the basis of amino acid frequencies at the respective positions of functional homologs.

30. Munoz E, Deem MW: Amino acid alphabet size in protein evolution experiments: better to search a small library thoroughly or a large library sparsely? Protein Eng Des Sel 2008, 21:311-317.

31. Lockless SW, Ranganathan R: Evolutionarily conserved pathways of energetic connectivity in protein families. Science 1999, 286:295-299.

32. Bloom JD, Glassman MJ: Inferring stabilizing mutations from protein phylogenies: application to influenza hemagglutinin. PLoS Comput Biol 2009, 5:e1000349.

33. Magliery TJ, Regan L: Beyond consensus: statistical free energies reveal hidden interactions in the design of a TPR motif. J Mol Biol 2004, 343:731-745.

34. Perez-Jimenez R, Godoy-Ruiz R, Parody-Morreale A, Ibarra-Molero B, Sanchez-Ruiz JM: A simple tool to explore the distance distribution of correlated mutations in proteins. Biophys Chem 2006, 119:240-246.

35. Kuipers RKP, Joosten HJ, Verwiel E, Paans S, Akerboom J, van der Oost J, Leferink NGH, van Berkel WJH, Vriend G, Schaap PJ: Correlated mutation analyses on super-family alignments reveal functionally important residues. Proteins 2009, 76:608-616.
36. Lockless SW, Muir TW: Traceless protein splicing utilizing evolved split inteins. Proc Natl Acad Sci USA 2009, 106:10999-11004.
37. Lee J, Natarajan M, Nashine VC, Socolich M, Vo T, Russ WP, Benkovic SJ, Ranganathan R: Surface sites for engineering allosteric control in proteins. Science 2008, 322:438-442.
38. Barak Y, Nov Y, Ackerley DF, Matin A: Enzyme improvement in the absence of structural knowledge: a novel statistical approach. ISME J 2008, 2:171-179.
39. Heinzelman P, Snow CD, Smith MA, Yu XL, Kannan A, Boulware K, Villalobos A, Govindarajan S, Minshull J, Arnold FH: SCHEMA recombination of a fungal cellulase uncovers a single mutation that contributes markedly to stability. J Biol Chem 2009, 284:26229-26233.
40. Li YG, Drummond DA, Sawayama AM, Snow CD, Bloom JD, Arnold FH: A diverse family of thermostable cytochrome P450s created by recombination of stabilizing fragments. Nat Biotechnol 2007, 25:1051-1056.
   Two novel protein stabilization approaches (linear regression analysis of sequence-stability data and library-based consensus design) are successfully used in this study to predict stabilized variants of cytochrome P450 enzymes from libraries generated by shuffling sequence blocks.
41. Jäckel C, Bloom JD, Kast P, Arnold FH, Hilvert D: Consensus protein design without phylogenetic bias. J Mol Biol 2010, 399:541-546.
42. Crameri A, Raillard S-A, Bermudez E, Stemmer WPC: DNA shuffling of a family of genes from diverse species accelerates directed evolution. Nature 1998, 391:288-291.
43. Amitai G, Gupta RD, Tawfik DS: Latent evolutionary potentials under the neutral mutational drift of an enzyme. HFSP J 2007, 1:67-78.
44. Bloom JD, Romero PA, Lu ZY, Arnold FH: Neutral genetic drift can alter promiscuous protein functions, potentially aiding functional evolution. Biol Direct 2007, 2:17.
45. Bershtein S, Tawfik DS: Ohno’s model revisited: measuring the frequency of potentially adaptive mutations under various mutational drifts. Mol Biol Evol 2008, 25:2311-2318.
46. Noirel J, Simonson T: Neutral evolution of proteins: the superfunnel in sequence space and its relation to mutational robustness. J Chem Phys 2008, 129:.
47. Bloom JD, Lu Z, Chen D, Raval A, Venturelli OS, Arnold FH: Evolution favors protein mutational robustness in sufficiently large populations. BMC Biol 2007, 5:29.
   This study provides experimental evidence for the theory that neutral evolution of polymorphic populations yields proteins of higher mutational robustness and stability compared to evolution from single sequences.
48. Bershtein S, Goldin K, Tawfik DS: Intense neutral drifts yield robust and evolvable consensus proteins. J Mol Biol 2008, 379:1029-1044.
   Neutral evolution of polymorphic TEM-1 β-lactamase populations resulted in the enrichment of stabilizing mutations that improved the functional evolvability of lactamases via compensation of destabilizing mutations.
49. Rutherford SL, Lindquist S: Hsp90 as a capacitor for morphological evolution. Nature 1998, 396:336-342.
50. Queitsch C, Sangster TA, Lindquist S: Hsp90 as a capacitor of phenotypic variation. Nature 2002, 417:618-624.
51. Rutherford SL: Between genotype and phenotype: protein chaperones and evolvability. Nat Rev Genet 2003, 4:263-274.
52. Tokuriki N, Tawfik DS: Chaperonin overexpression promotes genetic variation and enzyme evolution. Nature 2009, 459:668-673.
53. Wilhelm S, Rosenau F, Becker S, Buest S, Hausmann S, Kolmar H, Jaeger K-E: Functional cell-surface display of a lipase-specific chaperone. ChemBioChem 2007, 8:55-60.
54. Taylor SV, Kast P, Hilvert D: Investigating and engineering enzymes by genetic selection. Angew Chem Int Ed 2001, 40:3310-3335.
55. Seelig B, Szostak JW: Selection and evolution of enzymes from a partially randomized non-catalytic scaffold. Nature 2007, 448:828-831.
56. Neuenschwander M, Kleeb AC, Kast P, Hilvert D: Equipping in vivo selection systems with tunable stringency. In Protein Engineering Handbook. Edited by Lutz S, Bornscheuer UT. Wiley-VCH; 2008:537-561.
57. Neuenschwander M, Butz M, Heintz C, Kast P, Hilvert D: A simple selection strategy for evolving highly efficient enzymes. Nat Biotechnol 2007, 25:1145-1147.
   Regulated transcriptional control was combined with a degradation tag to provide tight control of intracellular catalyst concentration over a wide dynamic range for the directed evolution of redesigned chorismate mutases with wild-type levels of catalytic activity.
58. Kleeb AC, Hansson Edalat M, Gamper M, Haugstetter J, Giger L, Neuenschwander M, Kast P, Hilvert D: Metabolic engineering of a genetic selection system with tunable stringency. Proc Natl Acad Sci USA 2007, 104:13907-13912.
59. Goddard JP, Reymond JL: Enzyme assays for high-throughput screening. Curr Opin Biotechnol 2004, 15:314-322.
60. Baret JC, Miller OJ, Taly V, Ryckelynck M, El-Harrak A, Frenz L, Rick C, Samuels ML, Hutchison JB, Agresti JJ et al.: Fluorescence-activated droplet sorting (FADS): efficient microfluidic cell sorting based on enzymatic activity. Lab Chip 2009, 9:1850-1858.
61. Agresti JJ, Antipov E, Abate AR, Ahn K, Rowat AC, Baret JC, Marquez M, Klibanov AM, Griffiths AD, Weitz DA: Ultrahigh-throughput screening in drop-based microfluidics for directed evolution. Proc Natl Acad Sci USA 2010, 107:4004-4009.
   The catalytic power of a highly efficient horseradish peroxidase was further improved using a novel screening technology that enables a throughput of 10^8 library variants per day and thus ca. 1000-fold faster directed evolution processes compared to state-of-the-art robotic systems.
62. Granieri L, Baret JC, Griffiths AD, Merten CA: High-throughput screening of enzymes by retroviral display using droplet-based microfluidics. Chem Biol 2010, 17:229-235.
63. Röthlisberger D, Khersonsky O, Wollacott AM, Jiang L, DeChancie J, Betker J, Gallaher JL, Althoff EA, Zanghellini A, Dym O et al.: Kemp elimination catalysts by computational enzyme design. Nature 2008, 453:190-195.
64. Jiang L, Althoff EA, Clemente FR, Doyle L, Röthlisberger D, Zanghellini A, Gallaher JL, Betker JL, Tanaka F, Barbas CF III et al.: De novo computational design of retro-aldol enzymes. Science 2008, 319:1387-1391.
65. Siegel JB, Zanghellini A, Lovick HM, Kiss G, Lambert AR, St. Clair JL, Gallaher JL, Hilvert D, Gelb MH, Stoddard BL et al.: Computational design of an enzyme catalyst for a stereoselective bimolecular Diels–Alder reaction. Science 2010, 329:309-313.
   Enzymes catalyzing a non-natural bimolecular Diels–Alder reaction with high stereoselectivity and substrate specificity were engineered by de novo computational design. The predicted structures of the most active biocatalysts were confirmed by X-ray crystallography.
66. Khersonsky O, Röthlisberger D, Dym O, Albeck S, Jackson CJ, Baker D, Tawfik DS: Evolutionary optimization of computationally designed enzymes: Kemp eliminases of the KE07 series. J Mol Biol 2010, 396:1025-1042.
