Folding and binding – new technologies and new perspectives
Editorial overview
Jane Clarke and Gideon Schreiber
Current Opinion in Structural Biology 2003, 13:71–74
0959-440X/03/$ – see front matter © 2003 Elsevier Science Ltd. All rights reserved.
DOI 10.1016/S0959-440X(03)00008-3
Jane Clarke Department of Chemistry, MRC Centre for Protein Engineering, Lensfield Road, Cambridge CB2 1EW, UK e-mail:
[email protected]
Jane's research group uses a number of techniques, including protein engineering, NMR, simulation and, recently, atomic force microscopy, to study the folding of structurally related protein families. She is a Wellcome Trust Senior Research Fellow.
Gideon Schreiber Department of Biological Chemistry, Weizmann Institute of Science, Rehovot 76100, Israel e-mail:
[email protected]
The Schreiber laboratory uses biophysical methods to study protein–protein interactions in order to gain a basic physical understanding of the kinetics and thermodynamics of the binding process. The understanding gained is translated into computer algorithms for prediction purposes.
www.current-opinion.com
Many years of intensive experimental and computational investigations have led to significant progress in the fields of protein folding and binding; however, many important questions remain to be answered. We are still some way from the relatively modest goals of being able to accurately predict, de novo, protein structures or functions, or to design ligands that bind to and possibly modify the properties of proteins. This is not simply a lack of computer power; the underlying physics and chemistry of these processes are not yet completely understood. Folding and binding are dynamic processes dominated by the formation and loss of large numbers of noncovalent interactions and so many of the experimental techniques and philosophies are common to both. A feature of recent studies has been the application of the quantitative techniques of chemistry and physics to these biological systems, and advances in experimental techniques and computation are central to most of the work presented here.

The first five reviews deal with varied aspects of protein folding, from studies benefiting from ultrafast techniques and single-molecule technology to improvements in computational techniques and bioinformatics.

Ferguson and Fersht describe how advances in experimental techniques and detection have allowed the earliest events in protein folding to be observed. They describe three stages in protein folding: specific or nonspecific chain collapse; formation of secondary and tertiary structure; and desolvation of the chain as it folds to lower energy conformations. Much early thinking on protein folding made the assumption that, as chain collapse must be much faster than structure formation, this list represented the strict order of events in any folding process. Most importantly, the authors point out that this need not be the case and describe new experiments that challenge this assumption.
Different denatured states, for instance, show very different collapse behaviour, in terms of both kinetics and thermodynamics, and, as yet, this behaviour is not predictable. The problem of protein folding becomes instantly more interesting (and more challenging), because it means that sequence and structural differences are important even for these fastest, most nonspecific events. New fast techniques come with their own set of experimental problems and Ferguson and Fersht describe how artefacts or changes in experimental conditions can affect the analysis. The authors advocate using multiple probes and multiple techniques, which should allow the observation of processes that are hidden from one probe or help account for artefacts introduced in the experiment. Possibly the most exciting feature of these experimental studies of very rapid folding processes is that they move experiment and simulation into the same, submicrosecond timeframe, allowing the determination of protein folding pathways at atomic resolution. Refreshingly, the authors remind us: "It is especially important that we benchmark simulations as rigorously as possible by experiment and learn from, as well as report, both our successes and failures".
Vendruscolo and Paci describe how a number of groups are responding to the challenge to unite theory and experiment. They describe how experimental and computational techniques have advanced 'hand in hand' to further our understanding of complex protein folding landscapes. Where simulation can reproduce experimentally measured variables, such as NMR data or Φ values, then details of the simulation can be confidently used to understand the (un)folding process. Unfolding simulations are significantly more successful and more common than simulations of folding, because the starting state is known; however, one of the problems is that of timescale. Simulations have had to be performed at high temperatures to induce the proteins to unfold on a computationally accessible timescale. Here, the fast folding techniques described by Ferguson and Fersht have closed the gap for a number of proteins. A second challenge is to define an unambiguous folding trajectory. The forced unfolding experiments described by Zhuang and Rief (see below) have led to a number of successful collaborations between experimentalists and theoreticians. Nonetheless, it is surprising to see how often theory and experiment are in agreement when the conditions used are so dissimilar. Paci and Vendruscolo propose that this agreement comes from the nature of the landscape itself: to avoid misfolding traps, evolution has carved deep valleys into the energy landscape of proteins and folding trajectories are restricted to these valleys even at very nonphysiological conditions. Understanding the nature of these landscapes requires all species to be characterised in detail. A new technique is to use experimental data as restraints in a simulation to find an ensemble of structures that are compatible with a set of experimental observables. This technique has been used to determine an ensemble of structures that are compatible with an experimental set of Φ values.
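As a rough illustration of the quantities involved, a Φ (phi) value for a single mutation is conventionally the ratio of the change in transition-state free energy to the change in native-state stability, both measured relative to the denatured state. The sketch below computes that ratio and a simple mean-squared penalty of the kind that could act as a restraint. The function names, the 0.5 kcal/mol cut-off and the penalty form are assumptions made for illustration; they are not taken from the review.

```python
from typing import Mapping


def phi_value(ddg_ts_minus_d: float, ddg_n_minus_d: float,
              min_ddg: float = 0.5) -> float:
    """Return Phi = ddG(TS-D) / ddG(N-D) for one mutation.

    Both arguments are free energy changes on mutation (e.g. kcal/mol).
    min_ddg is an illustrative cut-off (an assumption): ratios built on
    very small stability changes are too noisy to interpret.
    """
    if abs(ddg_n_minus_d) < min_ddg:
        raise ValueError("stability change too small for a reliable Phi value")
    return ddg_ts_minus_d / ddg_n_minus_d


def restraint_penalty(phi_calc: Mapping[int, float],
                      phi_exp: Mapping[int, float]) -> float:
    """Mean squared deviation between calculated and experimental Phi values.

    Keys are residue numbers. Only residues with an experimental value
    contribute. This functional form is a hypothetical example of a restraint.
    """
    if not phi_exp:
        raise ValueError("no experimental Phi values supplied")
    missing = [res for res in phi_exp if res not in phi_calc]
    if missing:
        raise KeyError(f"no calculated Phi value for residues {missing}")
    total = sum((phi_calc[res] - phi_exp[res]) ** 2 for res in phi_exp)
    return total / len(phi_exp)
```

A Φ value near 1 suggests that the mutated site is as structured in the transition state as in the native state, and a value near 0 suggests that it is as unstructured as in the denatured state. A structure, or an ensemble, with a low penalty is one that is compatible with the measured set.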
This review presents an optimistic picture of progress towards obtaining atomistic detail about the folding processes of a few well-studied systems, for which experiments and theory have worked closely together, but makes it clear that there is still a great deal of work to be done.

Our understanding of folding processes comes from ensemble measurements, but how much heterogeneity is there between molecules? Some theoretical studies have suggested that the simple experimental exponential kinetics do not reflect the multiplicity of folding behaviours. Single-molecule folding is relatively new, but the progress to date is described by Zhuang and Rief. Two techniques for observing single molecules have been exploited, fluorescence resonance energy transfer (FRET) and dynamic force spectroscopy, and both are covered in this review, with applications to the folding of both RNA and proteins. Ensemble studies of RNA folding are less well advanced than those of proteins, possibly as a result of the size of the molecules and the complexity of the kinetics
for the folding reactions. However, RNA is an excellent target for single-molecule FRET studies: the labelling is comparatively simple and attachment of the RNA molecules to a surface via specific base pairing is relatively easy and can be shown to have no effect on folding behaviour. FRET studies have been used to dissect the complicated folding pathway of RNA molecules, and can distinguish previously undetected transient species and multiple folding pathways. Single-molecule FRET studies of protein folding are far less advanced. Immobilisation techniques may have the drawback of holding the protein too close to the surface and large FRET dyes can change the folding behaviour unless carefully placed. However, structural fluctuations in proteins and peptides can be measured, and there is optimism for these studies in the future. Dynamic force spectroscopy has been shown to be relatively easily adapted to the study of individual protein unfolding events. This would seem to be particularly relevant to the large number of proteins that experience force in vivo, including muscle proteins and proteins of the extracellular matrix. The technique is not without technical difficulty and, to date, few proteins have been studied and only one protein has been investigated to any depth, the giant muscle protein titin. However, studies on this one protein show that, by combining this new technique with simulation and more traditional techniques, such as protein engineering, the effect of force on the energy landscape can be investigated in atomic detail. Intriguingly, the technique has now been applied to an examination of the structure of a membrane protein by 'pulling' the helices out of a bilayer.

In considering protein folding in vivo, it is a mistake to think that, once the protein has come off the ribosome and folded, this is an end to the matter.
Unfolding kinetic rate constants show that proteins have half-lives of a few hours and so there is likely to be constant unfolding and refolding of proteins during their lifetime. Matouschek reminds us that protein unfolding may be a process that is of significant and yet neglected importance within the cell in at least three biological processes: protein translocation across membranes; protein degradation; and passive elasticity of muscle. The elasticity of muscle is being investigated by atomic force microscopy, as described by Zhuang and Rief, but Matouschek shows that the three processes may have much in common. Many of the advances in our understanding come from the application of a number of techniques and physical-chemistry principles to the study of these systems. Both import and degradation involve complex, energy-dependent multicomponent machinery that recognises targeting signals. What has been unclear, however, is whether the machinery relies on the natural unfolding process, whether the targeting signal acts to destabilise the target protein in some way or whether the machinery itself acts as an
'unfoldase'. In this review, the author describes current experimental evidence suggesting that this machinery acts to unfold the target proteins by 'pulling' at the targeting sequence. In this way, force acts to change the unfolding pathway, catalysing unfolding. It will be intriguing to compare the results from further experiments in this direction to those on muscle proteins such as titin.

Lastly, Trifonov and Berezovsky consider evolutionary aspects of protein structure and folding. Their proposal is that polymer physics dictates a typical loop closure size of 25–35 elements (residues) for a polypeptide chain. This would dictate a basic structural element of this size, independent of the exact composition. From statistical analysis of modern proteins, the authors find that elements of this size are extremely abundant and propose that these correspond to the hypothetical ancestral loop-like molecules. From this they pose a number of questions: are these loops elementary folding elements and were early proteins indeed short peptide-like closed loops that became basic building blocks for modern proteins; do these basic loop elements guide the folding pathway of a protein? Although these questions remain open, they are addressed here from a different, evolutionary perspective. Intriguingly, David Baker's group has had some success in the CASP protein prediction competitions by using a short 'building block' technique that might reflect the loop element hypothesis described here (although the loops used in the structure prediction programs are significantly shorter). Many of the current models of protein folding, such as nucleation condensation, would not support such a folding model.
With the development of ultrafast folding methods, which can assign the earliest events in folding (see the review by Ferguson and Fersht), and as theoretical calculation methods become more powerful (see the review by Vendruscolo and Paci), it will be interesting to see how these evolutionary predictions stand up against experiment.

Protein–protein and protein–ligand binding are the focus of the last three reviews. In the first, new developments in calorimetric methodology are discussed. The other two reviews are more system specific, but the insights gained from detailed study of a single model system may be more productive towards understanding the physical principles behind protein interactions than any number of short studies.

The changes in bonding interactions or dynamics that occur upon ligand binding are reflected in the reaction enthalpy and entropy, which in turn determine the free energy of ligand association. Although this sounds simple, predicting these properties is a real challenge, particularly in the case of complex biological macromolecules in aqueous solution. In their review, Weber and Salemme introduce us to applications of calorimetric methods in
biological chemistry and drug discovery. Changes in the balance between the enthalpic and entropic components of binding are difficult to predict, even when apparently conservative chemical modifications are made to the protein or ligand structures. This calls for a more detailed understanding of binding as obtained by calorimetry. One of the best examples showing the added value of this method is the investigation of the binding of inhibitors to HIV-1 protease and drug-resistant mutants. Interestingly, the development of drug resistance is characterised by unfavourable enthalpy changes relative to the native protease. Drugs that overcome this resistance tend to have a larger favourable binding enthalpy due to increased flexibility in their mode of binding. Weber and Salemme discuss new calorimetric technologies, including methods to derive accurate binding affinities for very tight-binding ligands, and the development of high-throughput instruments requiring only small quantities of proteins and semi-automated methods for data analysis. These new methodologies may contribute to the wider use of calorimetric methods, which may result in a more detailed understanding of protein–ligand interactions.

One of the most important processes governed by protein interactions is signal transduction, which is the topic of the review by Herrmann. The process of signal transduction can be divided into three parts: input, processing and output. Ras and its effectors play a central role as part of the processing unit in many signal transduction cascades. A physical map of a signal transduction network involves knowing all interactions between proteins, although this is not sufficient to understand the process of decision-making in detail. For this, the structural, thermodynamic and kinetic details are required. Herrmann focuses our attention on the molecular details of the complex network of signalling through the Ras family of small GTPases, which bind multiple effectors.
Ras represents the minimal core structure of all GTPases, a feature of which is the conformational switch between GDP- and GTP-bound states and the concomitant change in binding kinetics and affinity for regulatory proteins and effectors. The structures of the Ras-binding domains of a number of different effectors have been solved, all showing the same topology, despite having little sequence homology. Unravelling the molecular details of Ras–effector interactions and the mechanism of effector activation takes us nearer to answering some of the fundamental questions raised in the biochemical investigations. How does this complicated network function? What is the importance of the thermodynamics and kinetics of binding in determining the specificity of signal transduction? What are the molecular and structural events giving rise to signalling, and why do some mutations in Ras cause only partial loss of signalling whereas others stop signalling altogether? In this review, Herrmann guides us carefully through the
huge body of data to give us an up-to-date picture of current knowledge, in addition to highlighting the challenges ahead.

A feature of the protein folding field has been the attempt to develop algorithms that can be used to predict protein structure from sequence. In the final review, Laskowski, Qasim and Yi propose that, knowing the family a protein comes from, it should be possible to go beyond the simple, qualitative assignment of a function. For some protein families, they assert, it should be possible to develop a predictive algorithm for standard free energies of association of any possible member of a protein family and a selected protein belonging to another family. The important assumption they make is that changes in binding free energies, on changing the sequences of both proteins, will be additive. The task of testing this theory and evaluating the success of such an algorithm was
undertaken for the Kazal family of protein inhibitors of serine proteinases. A protein engineering technique, developed originally to study intramolecular interactions in protein folding, was used. First, the variable contact position set was identified, and then all possible single variants at these positions were made and the standard free energies of association of these variants were measured. The measure of success of this approach and its implications for other systems are the subjects of this review.

Taken as a whole, the reviews in this Folding and binding section leave an impression of a lively field of research. Many technical advances are described, some in their infancy; a number of the reviews propose areas for future research, looking towards what may be achieved in the future. The interplay between experiment and theory is highlighted; and some of the articles include controversial ideas yet to be tested. Watch this space!
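The additivity assumption described above for the Kazal inhibitors can be sketched in a few lines. Under that assumption, the free energy of association of a multiply substituted variant is the reference value plus the sum of the changes measured for each single variant. The function name, position labels and numbers below are hypothetical, chosen only to show the arithmetic; they are not data from the review.

```python
from typing import Mapping, Tuple


def predict_association_dg(
    reference_dg: float,
    single_variant_ddg: Mapping[Tuple[str, str], float],
    variant: Mapping[str, str],
) -> float:
    """Predict dG of association for a variant, assuming strict additivity.

    reference_dg       -- measured dG for the reference inhibitor
    single_variant_ddg -- (position, residue) -> measured change in dG for
                          that single substitution, relative to the reference
    variant            -- position -> residue for each substituted position
    """
    # Each contact position is assumed to contribute independently
    # of all the others.
    return reference_dg + sum(
        single_variant_ddg[(position, residue)]
        for position, residue in variant.items()
    )


# Hypothetical single-variant measurements (kcal/mol), for illustration only.
example_ddg = {("site_a", "G"): 1.5, ("site_b", "L"): -0.5}
example_prediction = predict_association_dg(
    -10.0, example_ddg, {"site_a": "G", "site_b": "L"}
)
```

Comparing such predictions with measured values for multiply substituted variants is one way to judge how far additivity holds for a given family.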