Update
TRENDS in Biotechnology
Vol.22 No.5 May 2004
Research Focus
Exposing relationships using directed evolution
Oliver J. Miller and Paul A. Dalby
The Advanced Centre for Biochemical Engineering, Department of Biochemical Engineering, University College London, Torrington Place, London WC1E 7JE, UK
Functionally related protein structures that have undergone significant mutagenesis and re-arrangement over a large evolutionary time-scale might no longer share enough sequence or structural similarity to be revealed by even the most advanced database searches. Recently, Christ and Winter used directed evolution to obtain functional variants of the RNA-hairpin-binding protein Rop. Using the functional sequences obtained, a structural database search revealed previously unknown similarity to the tRNA-binding region of valyl-tRNA synthetase.

It is well established that proteins of both similar and unrelated function can have the same overall structural topology but statistically insignificant sequence homology. For example, human hemoglobin and lupine leghemoglobin have very similar tertiary structures but only 15.6% homology at the amino acid sequence level [1]. The extent to which sequences can be altered and yet achieve the same protein fold has been investigated with the directed evolution of a functional Src homology 3 (SH3) domain, using phage-displayed libraries containing a simplified alphabet of just five amino acids [2]. Sequence simplification was achieved at 40 of the 45 randomized non-peptide-binding residues, highlighting that, potentially, a protein could evolve to have a dramatically different sequence while retaining its structure and function. Consequently, we can expect an abundance of distantly related proteins with similar functions that are difficult to identify by comparison of their sequences alone.

A fundamental aim of protein science is to develop a method to predict ab initio the folded structures of proteins from sequence data alone and, subsequently, to infer their function. The structure of many proteins can be identified by sequence homology with known protein structures, although this is not possible when a protein sequence has little or no significant homology to those in structure databases [3].
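As an illustration of the sequence-level comparison discussed above, the following minimal Python sketch computes a percentage of identical residues over the aligned columns of two sequences. It assumes that a figure such as the 15.6% quoted for the two globins is a percent identity over aligned positions, which the text does not state explicitly. The function name `percent_identity` and the toy alignment are hypothetical and are not real globin sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned, non-gap columns in which both residues are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical toy alignment (illustration only, not real protein sequences).
seq_a = "MKT-LLVAGS"
seq_b = "MRTALIV-GA"
print(percent_identity(seq_a, seq_b))  # 5 identical out of 8 compared columns -> 62.5
```

Other conventions exist for the denominator (for example, the length of the shorter sequence), so values from different tools are not always directly comparable.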
The recent report by Christ and Winter [4] demonstrates that directed evolution might bridge the gap to homology modeling for a subset of sequences that are related structurally and functionally but no longer have significant sequence homology.

Corresponding author: Paul A. Dalby ([email protected]).

Protein engineering by directed evolution

Over the past decade, directed evolution has become established as the leading method both for obtaining proteins with novel binding affinities and for altering the properties of enzymes [5,6]. It has been used to obtain enzyme activity [7] and to improve many properties of proteins, including binding affinities [8–10], enzyme activity [11,12], stability [13,14], substrate specificity [15,16], enantioselectivity [17] and protein expression [18]. The key to its success has been that it does not require comprehensive knowledge of protein structure and function. Successive rounds of random mutation and carefully designed selection or screening protocols identify improved proteins in a manner that mimics natural evolution processes. The screening or selection for new protein variants from a library of random mutants neatly avoids the requirement that currently hampers rational protein design (i.e. understanding the complex relationship between protein structure and function).

Mutations that alter or improve protein function are frequently obtained; these would have been difficult to predict by sequence analysis or protein modeling. Interestingly, these unexpected mutations, alongside those rationalized more readily, might play a significant role in understanding better both structure–function relationships and, as Christ and Winter have demonstrated, the evolutionary relationships between proteins [4]. Furthermore, directed evolution often reveals divergence to more than one consensus sequence that results in the same overall protein structure and function [10,19], thus highlighting the potential difficulty in identifying the evolutionary link between two distantly related sequences.

The ability of directed evolution to identify these changes has led to its increased use as a tool for identifying protein residues or structural elements with functional importance. For example, it has been used to identify residues that affect enzyme regulation [20], to obtain functional consensus sequences compatible with certain structural elements in proteins [10,19] and to identify peptide sequence motifs that interact with target proteins [21–23].
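The cycle of random mutation followed by selection or screening described above can be sketched as a toy simulation. This is only a schematic under stated assumptions: the `fitness` function (matches to a hidden target sequence) is a hypothetical stand-in for an experimental screen, and the mutation rate, library size and number of rounds are arbitrary illustrative values, not parameters from any cited study.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fitness(seq: str, target: str) -> int:
    """Hypothetical screen: number of positions that match a hidden target."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq: str, rng: random.Random, rate: float = 0.1) -> str:
    """Replace each residue with a random amino acid with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )


def evolve(parent, target, rounds=5, library_size=200, keep=10, seed=0):
    """Run successive rounds of random mutation and selection of the best variants."""
    rng = random.Random(seed)
    population = [parent]
    history = [fitness(parent, target)]
    for _ in range(rounds):
        # Build a library of random mutants from the current survivors.
        library = [mutate(rng.choice(population), rng) for _ in range(library_size)]
        # Survivors stay in the pool, so the best score never decreases.
        pool = population + library
        pool.sort(key=lambda s: fitness(s, target), reverse=True)
        population = pool[:keep]
        history.append(fitness(population[0], target))
    return population[0], history


best, history = evolve("A" * 12, "MKTLLVAGSWYR")
print(best, history)
```

Note that the simulation never uses knowledge of why a variant scores well; it only ranks library members by the screen, which is the property the text credits for the success of directed evolution.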
After detecting consensus sequence motifs that bind to a chosen target molecule, computational search tools can then be used to identify potential interaction partners. Using this method, protein interaction networks have been identified for SH3 domains that were then refined using two-hybrid screening [24].

Directed evolution in bioinformatics

Christ and Winter have extended the use of directed evolution to reveal an evolutionary relationship between two proteins that would have been difficult to identify using alternative current methods [4]. The consensus sequences obtained by directed evolution of the dimeric RNA-binding protein Rop, mapped to the RNA-binding
helix structure, have been used to identify a distantly related enzyme, valyl-tRNA synthetase (ValRS), with previously unknown structural and functional similarity to Rop. The two proteins have no significant sequence homology and only a search with alternative functional Rop sequences revealed the potential link to ValRS.

In their approach, Christ and Winter randomized five residues of Rop corresponding to the putative RNA-binding site within the N-terminal helix. A genetic complementation approach was then used to select active Rop variants. The basis of this system is a derivative of the naturally occurring ColE1 plasmid with the rop gene deleted. This deletion boosts the plasmid copy number and, consequently, increases the metabolic burden on the cell, which results in reduced growth rate. The increased copy number also raises the expression level of the plasmid-borne reporter gene lacZ. Clones from the library that expressed active variants of Rop in trans had their growth rates restored and were, therefore, enriched by growth selection in liquid media. Subsequent blue–white screening of colonies growing on X-Gal (5-bromo-4-chloro-3-indolyl-β-D-galactopyranoside) confirmed clones that expressed active variants of Rop: their lower levels of reporter gene expression colored them white.

After three such rounds of selection and screening, the sequences of 28 active Rop variants were compiled and used to search a Protein-Data-Bank-derived database with the SPASM program [25]. All combinations of the obtained sequences were used in the search pattern, excluding positions at which mutations occurred only once. The search pattern included only the mutated residues and allowed a maximum root-mean-square deviation of 1 Å from their spatial arrangement in Rop. Initially, the inclusion of residue 25 returned only Rop as a match but its exclusion enabled six other proteins to be identified, of which ValRS was the only RNA-binding protein.
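The construction of search patterns from the selected variants can be sketched as follows. This is a hedged illustration, not the authors' procedure or the SPASM input format: it assumes that "excluding positions at which mutations occurred only once" means dropping any residue observed only once at a given randomized position, and that leaving a position out of the pattern (as was done for residue 25) corresponds to the hypothetical `skip_positions` argument. The function names, the toy variant strings and the zero-based position indices are all invented for illustration and do not correspond to Rop numbering.

```python
from collections import Counter
from itertools import product


def allowed_residues(variants, min_count=2):
    """For each randomized position, keep residues seen at least `min_count` times."""
    length = len(variants[0])
    if any(len(v) != length for v in variants):
        raise ValueError("all variants must cover the same randomized positions")
    allowed = []
    for i in range(length):
        counts = Counter(v[i] for v in variants)
        allowed.append(sorted(aa for aa, n in counts.items() if n >= min_count))
    return allowed


def search_patterns(allowed, skip_positions=()):
    """Yield every combination of allowed residues as a {position: residue} dict."""
    kept = [
        (i, residues)
        for i, residues in enumerate(allowed)
        if i not in skip_positions and residues
    ]
    indices = [i for i, _ in kept]
    for combo in product(*(residues for _, residues in kept)):
        yield dict(zip(indices, combo))


# Hypothetical selected variants at five randomized positions (not real Rop data).
variants = ["NRFKQ", "NRFRQ", "SRFKQ", "NKFKE", "SRYKQ", "NRFKA"]
allowed = allowed_residues(variants)
print(allowed)
print(list(search_patterns(allowed)))
print(list(search_patterns(allowed, skip_positions=(0,))))
```

Each resulting pattern would then be passed to a structural motif search with a tolerance on the spatial arrangement of the residues; that step depends on the search program and is not shown here.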
This refinement of the search pattern seems to indicate that, in general, several versions of a search pattern might be required for efficient identification of ‘hits’ with SPASM.

Having obtained a match to ValRS, the authors built a model of wild-type Rop bound to RNA, based on the ValRS structure and the synthetic Tar–Tar* RNA hairpin, for which a nuclear magnetic resonance (NMR) structure is available. The binding affinity of Tar–Tar* for Rop is similar to that of ColE1, the natural target of Rop, making it a reasonable RNA structure to use in the model. The model obtained was consistent with previous NMR and biochemical data for Rop. Comparison of the Rop–RNA model with the known structure of Rop in the absence of RNA enabled Christ and Winter to rationalize the RNA binding in terms of a ‘ribose trap’, in which a hydrogen bond between Arg-13 and Asn-10 of Rop is broken to form new contacts with the ribose of RNA [4].

Concluding remarks

Overall, these results demonstrate that using directed evolution and structure searches is a powerful new approach for identifying potential new evolutionary links between distantly related protein sequences. Furthermore, the identification of structural and functional similarity to a protein for which a liganded structure is
available has enabled Christ and Winter to infer the mode of binding for their protein to a similar ligand. The similarities suggest a possible common evolutionary origin for Rop and ValRS, bearing in mind that most other tRNA synthetases (e.g. ArgRS) have different binding modes to RNA.

The technique used in the SPASM program identifies only proteins containing the search motif and does not require matches outside this region. Looking beyond the RNA contact sites, the authors found that both ValRS and Rop contain a four-helix bundle. However, Rop is an antiparallel bundle formed by a homodimer, whereas ValRS is a monomeric bundle. Also, Rop binds two RNA molecules in a symmetrical manner, whereas ValRS binds only one tRNA molecule. Consequently, it is difficult to determine whether the link between Rop and ValRS reflects divergent or convergent evolution. Despite this, many researchers should, surely, be revisiting the results of their directed evolution experiments to see whether they can reveal any further evolutionary links to functionally similar proteins.

This work has broad implications for the study of protein evolution. Prediction of evolutionary relationships is currently limited to cases in which sequence or structural similarities are readily identified. Distant protein relatives that have mutated beyond recognition at the sequence level could now be identified by the method described by Christ and Winter [4] and used to improve models of protein evolution. Extensive application might also reveal many more, previously unseen, relationships between protein families. It will be interesting to see whether this work will have an impact on sequence or structural homology searches. In the future, it might be possible to use a similar approach in silico, whereby localized random mutations are introduced into a structural model and the variants are prioritized by their predicted binding properties.
Derived consensus sequences and structures could then be used to search for potential distantly related proteins.

Acknowledgements

We thank the UK Biotechnology and Biological Sciences Research Council for funding O.J.M.
References

1 Berg, J.M. et al. (2002) Exploring evolution. In Biochemistry (5th edn), pp. 179–180, Freeman, New York
2 Riddle, D.S. et al. (1997) Functional rapidly folding proteins from simplified amino acid sequences. Nat. Struct. Biol. 4, 805–809
3 Baker, D. and Sali, A. (2001) Protein structure prediction and structural genomics. Science 294, 93–96
4 Christ, D. and Winter, G. (2003) Identification of functional similarities between proteins using directed evolution. Proc. Natl. Acad. Sci. U. S. A. 100, 13184–13189
5 Hoess, R.H. (2001) Protein design and phage display. Chem. Rev. 101, 3205–3218
6 Dalby, P.A. (2003) Optimising enzyme function by directed evolution. Curr. Opin. Struct. Biol. 13, 500–505
7 Goud, G.N. et al. (2001) Specific glycosidase activity isolated from a random phage display antibody library. Biotechnol. Prog. 17, 197–202
8 Lowman, H.B. et al. (1991) Selecting high-affinity binding proteins by monovalent phage display. Biochemistry 30, 10832–10838
9 Smith, G.P. et al. (1998) Small binding proteins selected from a combinatorial repertoire of knottins displayed on phage. J. Mol. Biol. 277, 317–332
10 Dalby, P.A. et al. (2000) Evolution of binding affinity in a WW domain probed by phage display. Protein Sci. 9, 2366–2376
11 Stemmer, W.P.C. (1994) Rapid evolution of a protein in vitro by DNA shuffling. Nature 370, 389–391
12 Chen, K. and Arnold, F.H. (1993) Tuning the activity of an enzyme for unusual environments: sequential random mutagenesis of subtilisin E for catalysis in dimethylformamide. Proc. Natl. Acad. Sci. U. S. A. 90, 5618–5622
13 Flores, H. and Ellington, A.D. (2002) Increasing the thermal stability of an oligomeric protein, β-glucuronidase. J. Mol. Biol. 315, 325–337
14 Pedersen, J.S. et al. (2002) Directed evolution of barnase stability using proteolytic selection. J. Mol. Biol. 323, 115–123
15 Matsumura, I. and Ellington, A.D. (2001) In vitro evolution of β-glucuronidase into a β-galactosidase proceeds through non-specific intermediates. J. Mol. Biol. 305, 331–339
16 Raillard, S. et al. (2001) Novel enzyme activities and functional plasticity revealed by recombining highly homologous enzymes. Chem. Biol. 8, 891–898
17 Zha, D.X. et al. (2001) Complete reversal of enantioselectivity of an enzyme-catalyzed reaction by directed evolution. Chem. Comm., 2664–2665
18 Lin, Z. et al. (1999) Functional expression of horseradish peroxidase in E. coli by directed evolution. Biotechnol. Prog. 15, 467–471
19 Zhou, H.X. et al. (1996) In vitro evolution of thermodynamically stable turns. Nat. Struct. Biol. 3, 446–451
20 Salamone, P.R. et al. (2002) Directed molecular evolution of ADP-glucose pyrophosphorylase. Proc. Natl. Acad. Sci. U. S. A. 99, 1070–1075
21 O’Neil, K.T. et al. (1992) Identification of novel peptide antagonists for GPIIb/IIIa from a conformationally constrained phage peptide library. Proteins 14, 509–515
22 Li, R.H. et al. (2003) Use of phage display to probe the evolution of binding specificity and affinity in integrins. Protein Eng. 16, 65–72
23 Kasanov, J. et al. (2004) Characterizing class I WW domains defines key specificity determinants and generates mutant domains with novel specificities. Chem. Biol. 8, 231–241
24 Tong, A.H.Y. et al. (2002) A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules. Science 295, 321–324
25 Kleywegt, G.J. (1999) Recognition of spatial motifs in protein structures. J. Mol. Biol. 285, 1887–1897
0167-7799/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibtech.2004.03.003
Nuclear remodeling after SCNT: a contractor’s nightmare
Peter Sutovsky1,2 and Randall S. Prather1
1 Department of Animal Science, University of Missouri-Columbia, S141 ASRC, 920 East Campus Drive, Columbia, MO 65211, USA
2 Department of Obstetrics & Gynecology, University of Missouri-Columbia, S141 ASRC, 920 East Campus Drive, Columbia, MO 65211, USA
As the success rate of somatic cell nuclear transfer (SCNT) remains low, researchers are turning to the very early stages of pre-implantation development to try to improve the developmental potential of reconstructed mammalian embryos. Two recent papers highlight the role of regulated proteolysis in nuclear remodeling after SCNT. First, Gao et al. describe a rapid, programmed replacement of the somatic-type linker histone H1 inside donor-cell nuclei with an oocyte-derived homolog after SCNT, which is subsequently reversed at the time of maternal-embryonic transition. Second, Zhou et al. report the first successful cloning of a rat by using selective blockers of the ubiquitin-dependent degradation of the cell-cycle regulator cyclin B. Therefore, fast, programmed proteolysis, particularly through the ubiquitin-proteasome pathway, might be of central importance for nuclear remodeling after SCNT.

Even though the number of species cloned by somatic-cell nuclear transfer (SCNT) grows steadily, the overall success rate remains low (<5%). Deviant patterns of nuclear remodeling and improper replication of the nuclear DNA methylation patterns (gene imprinting) during pre-implantation embryo development have been blamed for this poor developmental capacity of the clones. As a rule, both the maternal and the paternal DNA undergo gradual demethylation that is completed by the blastocyst stage. However, some researchers are now turning to very early stages of pre-implantation development to explain this problem. Notably, two recent papers indicate the importance of early events in nuclear remodeling of the donor cell after SCNT.

Corresponding author: Peter Sutovsky ([email protected]).

Histone replacement

The first paper by Gao et al. [1] demonstrates that the somatic-cell-type histone H1 in the donor-cell nucleus is rapidly replaced by oocyte-derived H1 within 60 min after nuclear transfer (NT) or intracytoplasmic sperm injection (ICSI) in mouse. The exchange of nuclear histone H1 is then reversed at the two- to four-cell stage, when the oocyte-derived molecules are replaced by the embryo-derived H1. This is likely to be a consequence of the onset of transcription and translation of the gene encoding embryonic H1, as well as the limited half-life of oocyte-derived H1. Overall, this shows that remodeling of the donor-cell nucleus during SCNT is similar to the remodeling of the sperm nucleus after natural fertilization, and that the mammalian ooplasm is programmed and well equipped for this function. The ooplasmic factors contributing to nuclear remodeling are being sought in hope