From Computer Simulations to Human Disease

From Computer Simulations to Human Disease

Cell, Vol. 97, 291–298, April 30, 1999, Copyright 1999 by Cell Press From Computer Simulations to Human Disease: Emerging Themes in Protein Folding ...

205KB Sizes 0 Downloads 76 Views

Cell, Vol. 97, 291–298, April 30, 1999, Copyright 1999 by Cell Press

From Computer Simulations to Human Disease: Emerging Themes in Protein Folding Sheena E. Radford* and Christopher M. Dobson† * School of Biochemistry and Molecular Biology University of Leeds Leeds LS2 9JT United Kingdom † Oxford Centre for Molecular Sciences New Chemistry Laboratory University of Oxford South Parks Road Oxford OX1 3QT United Kingdom Understanding how a protein molecule finds its unique functional three-dimensional structure following its synthesis on the ribosome is a question that has intrigued scientists for decades (Anfinsen, 1973). Not only will a solution to the folding problem complete our knowledge of the series of events that link a gene sequence and the three-dimensional structure of a protein, but it could also play a key role in fine tuning structure prediction algorithms and open the door to the rational design of completely novel protein molecules. In addition, it is now becoming clear that an appreciation of protein folding has far-reaching consequences in biology, embracing topics such as macromolecular recognition, protein trafficking, the regulation of the cell cycle, diseases such as the amyloidoses and cancer, cell invasion, and membrane fusion. Major progress in understanding protein folding in its widest context is now resulting as a consequence of individuals from a wide range of disciplines joining forces to address key issues in the field. To this end a group of more than 70 scientists recently met (in December 1998) for a three-day meeting to discuss progress in this endeavor. The meeting was organized by Alan Fersht, Luis Serrano, and Manuel Rico and was held in the beautiful surroundings of the Instituto Juan March de Estudios e Investigaciones in Madrid. In this

Meeting Review

review, we summarize some of the major findings revealed at the meeting and attempt to lay them into context such that biologists and biophysicists alike can appreciate the significance of the breakthroughs made, the issues still to be tackled, and the future directions that research in this area might take. Protein Folding Mechanisms—Where Theory Meets Experiments The Protein Folding Problem Ever since the foundations of protein folding were laid more than 30 years ago, the importance of bringing together theoretical calculations and experimental studies has been apparent. Thus, the protein folding field was initiated by the inspirational experiments of Anfinsen (who demonstrated that all the information needed to guide a protein to its correct, biologically active structure is hidden within the sequence of amino acids [Anfinsen, 1973]) and the simple, but thought-provoking calculations of Levinthal (1968) (who suggested that folding occurs along specific pathways rather than by a random search). In the early 1980s attempts to elucidate folding mechanisms were severely limited by the lack of suitable protein systems for study and the absence of experimental methods appropriate to the study of a protein as it folds. The limitations of computational methods, including the limited quantitative understanding of the forces determining molecular interactions (force fields) and the lack of the computer power necessary to simulate events over a biologically significant timescale (most small proteins fold in less than a second), also ruled out a solution of the folding problem from theoretical approaches (Karplus, 1997). Experimental methods capable of monitoring the folding of proteins in real time began to emerge in the late 1980s and techniques such as stopped flow circular dichroism (which monitors the evolution of secondary

Figure 1. A Free-Energy Surface for the Folding of a 125-Residue Protein Model The polypeptide is represented by a string of beads, each one of which can take up positions on a cubic lattice; the native state (left) corresponds to one of the many possible arrangements of the residues in a 5 3 5 3 5 cube, The free energy (right) was obtained from a series of Monte Carlo simulations in which the protein was allowed to sample all conformational space. It is plotted as a function of the number of contacts between residues, Qc and Qs, which are divided into two groups, “core” (shown on the left in red) and “surface” (shown on the left in yellow). The paths (folding trajectories) taken by two representative molecules are indicated. As more native-like contacts are formed, the energy of the system generally decreases. A key feature of folding is that native-like interactions can be formed in many different orders, giving rise to the heterogeneity inherent in the statistical nature of the “new view” mentioned earlier. This is likely to be particularly pronounced for larger proteins, which unlike small proteins might become trapped in partially folded intermediate states that correspond to the local minima (one of which is visible at Qc 5 33 and Qs 5 0). A discussion of the nature of surfaces of this type is given in Dobson et al., 1998. The figure was kindly provided by Dinner and Karplus and is based on work described in Dinner and Karplus, 1999.

Cell 292

structure), stopped flow fluorescence (which reflects the burial of tryptophan residues as the hydrophobic core develops), quenched flow hydrogen exchange coupled with multidimensional NMR and mass spectrometric methods (which can identify the formation of hydrogen bonds as structure becomes persistent), and protein engineering methods (which enable the roles of individual amino acids in folding and stability to be deduced) are now commonplace (reviewed by Plaxco and Dobson, 1996). Using these and other methods, schemes of protein folding based on simple models began to emerge. These generally depicted the evolution of the native state as taking place on a single pathway in which increasingly native-like structure formed in a stepwise manner, much as envisaged in the original speculations of Levinthal. Commensurate with this, advances in theory and computational methods began to lead to new approaches to simulating folding at least at a rudimentary level. As described by Alan Fersht in the first session of the Juan March meeting, the 1990s have seen enormous changes in our conceptual view of the folding problem. The general relevance of partially folded states in directing the search for the native state is no longer clear, and the idea of specific hierarchical pathways is being challenged (see below). In addition, evidence from both experiments and theoretical methods now shows that there is not a single defined route to the native state, as was depicted in the early folding schemes of the 1980s, but the native state can be found by several different passages on the folding route map (reviewed by Dobson et al., 1998). New views of folding thus involve folding “landscapes” or “funnels” (Bryngelson et al., 1995; Dill and Chan, 1997), terms chosen by physical chemists to depict the folding transition as one involving an enormous number of possible species of decreasing energy, through which a protein searches according to the laws of statistical mechanics on its way to the native state (Figure 1). Breakthroughs from Experiments The challenge of the changing conceptual basis of protein folding has stimulated experimentalists to design innovative approaches that can be used to address the questions that are emerging as a result of the new view of folding in terms of energy landscapes. Key issues include the nature of the initial events in folding of a protein from its highly disordered “random coil” state, and the structure and dynamics of any compact states that are formed early in the process. The first of these issues was discussed by Bill Eaton at the Juan March meeting. In an elegant series of experiments using laser temperature jump methods, Eaton has been able to witness folding events from a few nanoseconds of the initiation step to the generation of the final fully folded state (see Eaton et al., 1998 for a review). Using synthetic peptides with fluorophores and quenchers placed at strategic locations, Eaton has shown that an a helix can form within a few nanoseconds or as slowly as a few microseconds, depending on its intrinsic stability, a b hairpin can form in about 10 ms and loops in as little as 1 ms (Hagen et al., 1996). The latter places a lower limit on the timescale of the initial steps in folding and sets the challenge for experimentalists wishing to extend this work to study such events in intact proteins. Both Bill Eaton and Heinrich Roder described new

devices that can make measurements of protein folding within 100 ms of their initiation by dilution of denaturant (Chan et al., 1997; Shastry and Roder, 1998). These instruments are based on novel rapid mixers (commercially available instruments currently have dead times of the order of a few milliseconds). This is an important breakthrough, not just because it opens the door to fast measurements of folding on virtually any protein (by contrast with temperature jump methods that are restricted to peptide fragments or proteins that cold denature, and electrochemical methods that require proteins with redox centres), but because it allows the nature of compact structures formed early in folding to be addressed. Using such apparatus, Roder has shown that the early events in folding of cytochrome c (Shastry and Roder, 1998) and the B1 domain of protein G (Park et al., 1997 and unpublished work) involve a specific collapse and formation of a well-defined compact species. Formation of the fully native state then occurs in slower steps on millisecond–second timescales. The general importance of these results awaits experiments using other proteins, but it supports the much-heralded concept that collapse of the polypeptide chain to a compact structure is an important first step in the folding of at least some proteins. The importance of partially folded intermediates in facilitating the search for the native state is another hot topic in folding. The debate focuses on whether intermediates provide access to the folded state or are trapped species that must unfold before the search to the native state can be continued. Evidence for both schemes abounds: folding simulations and some experiments demonstrate that intermediates are unproductive species stabilized by nonnative interactions, while other work suggests that they are obligatory species in the folding process. New work on cytochrome c (using microsecond mixing) (Shastry and Roder, 1998) and apopseudoazurin (a predominantly b sheet protein (discussed at the meeting by Sheena Radford [Capaldi et al., 1999]) suggests that intermediates for these proteins represent productive steps in the folding process. As described by Javier Sancho, the a/b protein, apoflavodoxin, by contrast, folds through a complex mechanism involving parallel routes to the native state, the major route involving a two-state transition to the native state and a minor pathway involving a partially folded intermediate. Multiple pathways also occur during the folding of other proteins such as the lysozymes (Dobson et al., 1998; and see Figure 1), suggesting that the folding of larger proteins is more complex than that described for small, simple proteins and demonstrates, at least for some proteins, that intermediates may be associated with routes through the energy landscape that provide access to the native state (Capaldi and Radford, 1998). A key issue in defining the mechanism of any chemical process is the characterization of species populated along the reaction coordinate (the unfolded, intermediate, and native states) as well as the “transition state” for the reaction that defines the nature of the free energy barrier between reactants and products. Populated species such as the native and denatured states can be studied at leisure at relatively high resolution using the capabilities of modern NMR, as described by Stefan Freund and Pascal Garcia in Madrid, and techniques such as real time NMR and stopped and quench flow

Meeting Review 293

Figure 2. Representation of the Fold of the Spectrin SH3 Domain (Ribbon) with Residues Whose Side Chains Have Been Identified as Having High φ Values in the Rate-Limiting Transition State for Folding Residues whose side chains have high φ values (.0.5, colored red in the figure) are presumed have native-like interactions that are highly formed in the rate-limiting transition state for folding. These are presumed to be involved in the folding nucleus and are thought to be particularly important in order for the protein to fold rapidly to its native state. Side chains with high φ values are located at key points defining the overall topology of the protein fold. Remarkably, one of the most critical residues is in an exposed b turn (seen at the bottom of the picture), and it appears to be essential to form this structure before the rest of the molecule can fold. Residues shaded blue in the figure have side chains with φ values , 0.4 or negative φ values (green), these residues do not stabilize the ratelimiting transition state. In large proteins a number of folding nuclei may exist, and these may be associated with different structural domains. This is the case in Figure 1 where a number of transition state regions can be seen. A general description of the concept of a folding nucleus and its mechanistic significance is given in Fersht, 1997.

methods can reveal key features of partially folded intermediates (Plaxco and Dobson, 1996). Transition states are more ephemeral species and for a reaction of the complexity of protein folding are conceptually somewhat different from those characteristic of a simple reaction. The properties of rate-limiting transition states in protein folding can be determined, however, using a protein engineering approach. In pioneering work Fersht has thus shown that the rate-limiting transition state for the small proteinase inhibitor, chymotrypsin inhibitor 2 (CI2), is an expanded form of the native state that contains a “nucleus” consisting of a handful of key interactions, but lacks the network of fully formed interactions that characterizes the native state. Secondary and tertiary structures then form cooperatively around this nucleus in a mechanism referred to as “nucleation condensation” (Fersht, 1997). In an important series of experiments involving the spectrin SH3 domain, Serrano has now used protein engineering to show that the folding nucleus of the SH3 domain includes a fully formed (native-like) b hairpin (Martinez et al., 1998) (Figure 2). Interestingly the same b hairpin is found in the transition state during folding of the src-SH3 domain (Grantcharova et al., 1998), but not of circular permutants of the spectrin SH3 domain (Viguera et al., 1996). This finding suggests

that the formation of a specific folding nucleus can be an obligate feature of a particular fold, but that an alternative transition state can emerge if drastic changes to the topology are made, even for a closely related sequence. In his talk, Eugene Shakhnovich raised the fascinating question of the interplay between the nature of folding nuclei and evolutionary pressure for proteins to fold rapidly and efficiently (Shakhnovich et al., 1996). He described procedures, which involve randomly changing the nature of residues in folding simulations, to identify sequences that have the property of folding rapidly (Shakhnovich, 1997). Such procedures can be thought of as simulating evolution. As we discuss later, the need to avoid aggregation may be an important factor in the development of the character of protein structures. As elegantly stated by Shakhnovich, the folding nucleus can be viewed as an “accelerator pedal” for folding, stabilizing the transition state and reducing the size of the hurdle over which molecules have to jump to find the native state (Shakhnovich, 1997). If this is the case, stabilizing the folding nucleus should accelerate folding. By grafting a sequence known from peptide studies to form a long and stable b hairpin onto the specific folding nucleus of the SH3 domain, Serrano has produced a chimeric protein that folds significantly more rapidly than its wild-type counterpart. The specific nucleus in the transition state of the SH3 domains contrasts with the more diffuse species described for the rate-limiting transition state of CI2 (Fersht, 1997), but correlates well with predictions from simulations as discussed in Madrid by Shakhnovich (see Shakhnovich, 1997 for a review). This raises the intriguing possibility that the folding of b sheet proteins, which involve extensive long-range interactions, may occur by a more narrowly defined route than proteins with other folds, particularly those containing helices. Accordingly, the transition state of the helical protein, phage l repressor (l6–85) can move along the reaction coordinate from highly folded to unfolded species in different mutated versions of the protein (Burton et al., 1997). The closely related 434 Cro repressor protein, however, has a transition state that is highly native-like and where certain interactions seem obligatory for fast folding (this work was described in Madrid by Manuel Rico). Interestingly, the Cro repressor protein has a lower intrinsic stability of its a helices relative to those in phage l repressor and a unique buried salt bridge in its native state. This suggests that long-range interactions between sidechains are needed to stabilize the transition state in the Cro repressor protein, perhaps more consistent with the behavior of proteins with complex folds, including those with b sheet structures. An important objective now is to extend the repertoire of proteins investigated in this manner such that more general rules about folding might be found, but these results highlight the important point that significant insights about folding mechanisms can be learned even from simple proteins such as these. Triumphs of Theory While the above studies highlight the rapidly developing field of protein folding from an experimentalist’s viewpoint, papers describing equally exciting breakthroughs by theoreticians are also currently flooding into the literature (Bryngelson et al., 1995; Dill and Chan, 1997; Karplus, 1997; Shakhnovich, 1997; Dobson and Karplus,

Cell 294

1999). Using a variety of models that simplify the otherwise unwieldy dimensions of the calculations involved, new insights and predictions about folding can be made. Using molecular dynamics simulations, the unfolding of a single molecule of CI2 is predicted to occur by a mechanism that bears a remarkable resemblance to the folding nucleus determined experimentally by Fersht (Ladurner et al., 1998). These simulations were performed at a temperature of 500 K to speed up the calculations such that the events of interest could be seen. Another approach to simulate unfolding within the timescale accessible to calculations was described at the meeting by Julia Goodfellow, in which water molecules were allowed to penetrate the protein to initiate unfolding. This approach is providing promising results and has been used so far to investigate the unfolding of a range of proteins including thermophilic and mesophilic ribonuclease H, lysozyme, p53, and the eye lens crystallins. CI2 also featured in molecular dynamics simulations discussed by Martin Karplus. Using a model in which all the atoms of the protein are explicitly considered and a simplified model for the solvent, the high-temperature unfolding of 24 individual molecules of CI2 was simulated. Analysis of the results indicates that there are many ways in which the protein can fold or unfold, a picture in accord with the statistical energy landscape model discussed earlier in this article. It is consoling to note, however, that the most popular trajectories cluster around a route that is consistent with the experimental findings (Lazaridis and Karplus, 1997). Aaron Dinner described the behavior of a lattice model comparable in length to larger well-studied proteins, such as lysozyme, barnase, myoglobin, and apoflavodoxin. These calculations are in fact those from which the surface shown in Figure 1 was derived. Good agreement between the kinetics and thermodynamics was achieved by using reaction coordinates specific to the folding mechanism (see Figure 1) (Dinner and Karplus, 1999). The results of these and a number of related simulations (Shakhnovich, 1997), bear a remarkable resemblance to the findings of experimental studies of refolding (Dobson et al., 1994). There is a multiplicity of folding pathways, one of which leads rapidly and directly to the native state, and several of which involve intermediates that are stabilized by nonnative interactions and represent kinetic traps (see Figure 1). The transition state for folding is a reduced version of the structural core of the protein containing some of the interactions that define the core in the native state and having little other structure, much like those suggested by the protein engineering experiments described above. While simplified models are needed to simulate the folding of proteins, more explicit models can be used to analyze the folding of peptides. This approach has been taken by Wilfred van Gunsteren using a short (7-residue) b peptide (Daura et al., 1998). Using a force field for methanol developed within the GROMOS package, van Gunsteren has calculated the conformational fluctuations of the b peptide at 350 K, close to the temperature of NMR studies used to determine its conformation in solution. Importantly, the calculation showed that the peptide adopts a left-handed 31 helix in the simulation, precisely as observed by NMR. The proof of

the pudding came when a 6 residue b peptide of similar sequence was studied both by NMR and by simulation. In each case an unsuspected right-handed 31 helix resulted. This demonstrates that the force field is adequate to simulate different structures and offers hope that full calculation of the folding of larger peptides and proteins may become possible when improved force fields are developed and sufficient computing power is available to allow calculations at ambient temperatures on a sufficiently long timescale to be directly compared with experiments. Indeed, simulation of the folding of a small protein to a metastable intermediate over a period of 1 ms, a time comparable to that of the fastest events detected in experimental studies, has just been reported (Duan and Kollman, 1998).

Tailor-made Proteins—Successes and Surprises Perhaps the ultimate challenge in protein folding is to demonstrate our ability to design new proteins at will or to redesign the scaffolds or surfaces of existing proteins for novel purposes. Presentations by Luis Serrano, Buzz Baldwin, Lynne Regan, Peter Kim, and Bill DeGrado at the Juan March meeting served to demonstrate just how close we are to fulfilling this ambition. Thus, as summarized by Serrano in an overview lecture, the problem of understanding simple elements of structure, such as the a helix, is all but complete (Mun˜oz and Serrano, 1997) and peptide models for b hairpins are now available (Blanco et al., 1998) such that it is only a matter of time until algorithms for predicting and designing the latter structures are similarly successful. The major hurdle to designing new proteins is encountered when sequences large enough to contain a variety of different elements of secondary structure are considered. Now more complex features such as long-range packing interactions and twists between elements of secondary structure need to be correctly engineered. Such features are currently not well understood. Nevertheless, rational design of three-stranded b sheet structures by painstaking iterations of design and redesign have produced several sequences that fold into the desired structure (see Kortemme et al., 1998, for one example), providing the first steps toward tailor-made proteins of the future. What does the future hold for protein design? Clear goals according to Serrano are to design b sheet proteins and a/b proteins larger than 60 amino acids in length. While simple peptide structures may be designed manually, the way ahead is to develop algorithms capable of automated design. Again the problem boils down to the size of the search problem, as originally depicted by Levinthal. How might one pick just the right sequence that folds into exactly the right structure, ensuring that it avoids traps and alternate conformations? Computer algorithms that can design sequences in hours that guarantee successful folding to a defined, biologically active protein are possibly close at hand. Again, it is the sheer dimensions of the conformational search problem and the subtleties of the interactions that select the native conformation over all other possibilities that are the challenges for such studies in the future.

Meeting Review 295

An outstanding success at protein redesign was recounted in Madrid by Peter Kim in his studies of coiledcoil proteins. These proteins constitute some 3%–5% of proteins in the human genome and play important roles in many cellular processes including the control of transcription and in membrane fusion. As well as providing simple models on which to base studies on the molecular description of protein folding, these proteins are also vitally important for the health of the cell. Understanding the folding and stability of coiled coils thus could provide important new routes for intervention of disease. Interestingly, all coiled coils found in nature are left-handed. Kim thus set himself the task of designing a novel right-handed coiled coil. The design he came up with, which used parameters of the coiled coil worked out from the known left-handed versions, was spectacularly vindicated. The X-ray crystal structure of the designed protein showed the desired right-handed twist, the positions of the atoms differing from those predicted by only about 0.2 A˚ (Harbury et al., 1998). At least for these regular helical structures, therefore, the design problem seems to be all but solved. In a second example of helical design, Lynne Regan and her colleagues have redesigned the dimeric four– helical bundle protein, ROP, the natural function of which is to bind a specific RNA complex and thereby regulate the replication of ColE1 plasmids. By tinkering with the amino acids that make up the core of the bundle, new ROP proteins with distinct stabilities and folding properties can be made (Munson et al., 1996). Perhaps most fascinatingly, Regan has recently managed to crystallize one of the mutant proteins that contains a simplified core containing only alanine and isoleucine residues. This protein resembles the wild-type in that it unfolds and refolds slowly. A shock was in store, however, since although the wild-type protein has an antiparallel arrangement of helical hairpins, those in the mutant are in a parallel arrangement. Such a large change in structure was completely unexpected and, even if it could have been predicted in advance, it is unlikely that one would have guessed that the protein with the new topology would retain wild-type properties. Not every rational redesign, therefore, leads to expected results. It does, however, raise the exciting idea that it may be possible to exploit relatively minor changes in natural proteins to generate very different and novel molecular structures. Protein Unfolding, Misfolding, and Aggregation— When Nature Gets It Wrong Until relatively recently, protein unfolding and misfolding, both of which often lead to aggregation, were viewed by many merely as experimental bugbears, which hindered otherwise straightforward manipulations involving proteins in the research laboratory, but were of little scientific interest. Nowadays this could not be further from the truth and new research fields have burgeoned that focus on these phenomena. Topics of particular interest include the roles of molecular chaperones in assisting protein folding in vivo and the involvement of aberrant protein folding and protein aggregation in disease. At the Juan March meeting, developments in both of these areas were discussed. Bernd Bukau

and Ulrich Hartl described their beautiful studies of the role of molecular chaperones in assisting folding in prokaryotes and eukaryotes in vivo. Bernd Bukau focused on the role of the E. coli hsp70 chaperones in assisting cellular folding (reviewed by Bukau and Horwich, 1998). This chaperone family (E. coli has three hsp70 proteins) is amazingly diverse in its functional activities, playing a role in functions ranging from folding of nascent proteins to assistance in the proteolytic degradation of proteins. Bukau described how this functional diversity arises by regulation of the ATPase cycle of the hsp70 chaperones by different cochaperones, explaining the versatility of these proteins. By contrast, Hartl focused on the role of the hsp70 and hsp60 chaperones in folding nascent polypeptide chains (reviewed by Ellis and Hartl, 1999). Folding can be either cotranslational or posttranslational, depending on the organism and the complexity of the protein involved. Thus, the simpler proteins of the prokaryotes appear to fold on chaperones posttranslationally, while the more complex multidomain proteins typical of eukaryotes fold in a cotranslational manner, involving a relay of molecular chaperones which ensures that the newly synthesized polypeptide chain reaches the desired native state without a chance of escaping to the hostile cellular environment. Nature has therefore devised awe-inspiring ways to avoid aggregation and achieve spontaneous folding in the complex milieu of the cell. When chaperones can no longer cope with the stress of folding, or mutated proteins of aberrant folding or stability are synthesized, the consequences for life are deadly. Thus, as recounted by Alan Fersht in an overview lecture, a large number of cancers (50% of all reported cases and 95% of lung cancers) involve mutations in one protein, p53. The vast majority of disease-causing mutations in p53 have been mapped to its central core domain and ablate the normal function of the protein as the “guardian of the genome.” The powerful approaches devised to study protein folding and unfolding have now allowed the effect of these mutations to be determined at the molecular level (Bullock et al., 1997). Using the core domain of p53, Fersht and his group have developed a quantitative assay for the stability of known p53 mutants. These assays have shown that many of the common disease-forming mutations are destabilizing (to greater or lesser extents), one mutant (R249S) being partially unfolded even at 378C. As a consequence, p53 is inactivated and cell cycle control is lost. The introduction of stable p53 molecules by gene therapy thus offers exciting possibilities for regaining cell cycle control. To this end, using rules learned from studies of protein folding, Fersht has made new mutants of p53, many of which are significantly more stable than the wild-type protein and are still functional (Nikolova et al., 1998). By combining these mutations with cleverly engineered mutations in the tetramerization domain (Mateu and Fersht, 1999) (which is needed for the function of the intact p53 molecule), hope glimmers for new therapies of the future. The amyloid diseases, which include Alzheimer’s and the prion diseases, have also attracted the attention of increasing numbers of protein folders in recent years. These diseases are scientifically fascinating because the culprit proteins fold correctly to a functional form

Cell 296

Figure 3. Various Representations of the Structure of an Amyloid Fibril Based on CryoEM Studies of the SH3 Domain of PI3 Kinase (a) represents an overall view of the fibril structure derived from the electron density map at a resolution of 23 A˚. The fibril is ca. 60 A˚ in diameter and consists of four segments of high density (shown in blue) that twist around each other with a periodicity of ca. 600 A˚ in the direction of the fibril axis. (b)–(d) are views of a model for the molecular structure of a fibril. This was made by “opening up” the b sandwich fold of the native structure of the SH3 domain (which is closely similar to that shown in Figure 2), straightening the b strands, and building these into the regions of high electron density. The rest of the polypeptide chain is simply represented by dots. The electron density distribution from the EM data is shown as a rounded surface (note the hollow core) and the directions of the strands are arbitrary. Full details of both the EM analysis and the modeling are given in Jimenez et al., 1999. The figure is reprinted from Jiminez et al., 1999, with permission.

after their synthesis on the ribosome, but then appear to unfold later in their lifetime, where they self-associate in an “explosive” reaction to form amyloid fibrils. As described by Phil Hawkins in Madrid, although about sixteen proteins are currently known to be involved in different amyloid diseases (which can be sporadic, familial or, as with the prions, infectious) they have very different native structures (involving a-helical or b sheet structure, or both) (Kelly, 1998). Fourier transform infrared spectroscopy and X-ray fiber diffraction studies indicate that the fibrils formed from each of these proteins has a similar overall structure that is mostly b sheet in nature. A key question is how the conformational transition between soluble and fibrillar forms occurs. For the prion proteins, the problem is even more complex because, as described in Madrid by Fred Cohen and Martin Billeter, models for the origin of infectivity need to be built onto models for amyloidosis (Cohen and Prusiner, 1998). Interestingly, it appears that only some of the mutations causing familial diseases destabilize the functional form of the prion protein, as has been found for mutations in other amyloidogenic proteins. Importantly, the recent availability of full-length recombinant prion protein and its fragments in vitro is now making possible quantitative analysis of prion stability and folding, assays for aggregation and the origin of strains, and investigations of the molecular basis of infectivity. One of the major frustrations in our quest to understand fibrillogenesis at the molecular level is that despite the elapse of more than a century since its discovery, the structure of a fibril at atomic resolution remains a mystery. At the Madrid meeting Chris Dobson described some recent strides toward a solution to this crucial question. Through their studies of the SH3 domain from PI3 kinase, Dobson and colleagues discovered that the partially denatured form at pH 2 assembled into long unbranched fibrils, which resembled those associated

with amyloid diseases. Although not known to be associated with any known disease, the SH3 domains formed such beautifully ordered fibrils that Helen Saibil has been able to generate high-resolution cryo-EM images that show four protofilaments twisting around each other to form hollow fibrils with a helical repeat (Jimenez et al., 1999). Further work is needed to push the resolution further so that the location of individual b strands within the protofilaments can be determined, but the images are already clear enough to permit individual protein molecules to be modeled into the maps (Figure 3). Dobson also provided evidence that fibrillogenesis is not restricted to a few selected proteins, but could be possible for many or indeed all proteins under the “right” or, in biological terms, “wrong” conditions (Chiti et al., 1999). Nature thus seems to have performed an excellent job, not just in designing proteins that can fold successfully to soluble, active structures, but also in controlling the environments within living systems to such a level that the aberrant behavior leading to the formation of amyloid fibrils is limited to just a handful of proteins. Avoidance of aggregation, and particularly of the intrinsic ability to form highly insoluble amyloid fibrils, might be a crucial driving force behind molecular evolution. Future Directions So what does the future for protein folding hold? It is clear that there is still some way to go until the ultimate goal of a complete solution to the folding problem is found and the full potential of folding in biology, biotechnology, and medicine can be realized. But this is an exciting and dynamic field in which there is currently enormous momentum and great progress has already been made. Thus, images of folding energy landscapes from simulations have revolutionized recent concepts of the molecular aspects of folding and have provided new challenges for experimentalists to determine the

Meeting Review 297

nature of a landscape for real proteins. Bill DeGrado has recently started along this route and excited the audience in Madrid by showing the first images of the folding of a leucine zipper, a single molecule at a time. Second, the possibility of making designer molecules at will appears to be within our grasp, opening new doors for the production of protein molecules of biotechnological or medical importance that nature has not already provided. Perhaps most excitingly, biophysicists are now turning their attention to folding in vivo and understanding the folding process in this complex environment is beginning to provide us with means of modulating and localizing the biological activity of proteins in a cellular context. This is already providing inspiration for new approaches to decipher and interfere with fatal human diseases such as cancer and the amyloidoses, and giving rise to new ideas about molecular evolution. While the last 10 years have seen huge leaps in our understanding of protein folding, it is clear that there will be even more excitement to come. Acknowledgments In this review we have highlighted issues of excitement to emerge from the Juan March meeting that we hope will be of interest to readers of Cell. We have not attempted to give a full account of all of the lectures given, and we apologize to those whose work is presented only briefly or is not mentioned here at all. We acknowledge with thanks the organizers of the Juan March meeting for putting together such an exciting program; Luis Serrano, Aaron Dinner, Martin Karplus, and Helen Saibil for providing Figures 1–3; and our colleagues from the Juan March meeting for critically reading this review prior to its publication. References Anfinsen, C.B. (1973). Principles that govern the folding of protein chains. Science 181, 223–230. Blanco, F., Ramirez Alvarado, M., and Serrano, L. (1998). Formation and stability of beta-hairpin structures in polypeptides. Curr. Opin. Struct. Biol. 8, 107–111. Bryngelson, J.D., Onuchic, J.N., Socci, N.D., and Wolynes, P.G. (1995). Funnels, pathways, and the energy landscape of proteinfolding: a synthesis. Proteins 21, 167–195. Bukau, B., and Horwich, A.L. (1998). The Hsp70 and Hsp60 chaperone machines. Cell 92, 351–366. Bullock, A.N., Henckel, J., DeDecker, B.S., Johnson, C.M., Nikolova, P.V., Proctor, M.R., Lane, D.P., and Fersht, A.R. (1997). Thermodynamic stability of wild-type and mutant p53 core domain. Proc. Natl. Acad. Sci. USA 94, 14338–14342. Burton, R.E., Huang, G.S., Daugherty, M.A., Calderone, T.L., and Oas, T.G. (1997). The energy landscape of a fast-folding protein mapped by Ala→Gly substitutions. Nat. Struct. Biol. 4, 305–310. Capaldi, A.P., and Radford, S.E. (1998). Kinetic studies of beta-sheet protein folding. Curr. Opin. Struct. Biol. 8, 86–92. Capaldi, A.P., Ferguson, S.J., and Radford, S.E. (1999). The Greek key protein apo-pseudoazurin folds through an obligate on-pathway intermediate. J. Mol. Biol. 286, 1621–1632. Chan, C.K., Hu, Y., Takahashi, S., Rousseau, D.L., Eaton, W.A., and Hofrichter, J. (1997). Submillisecond protein folding kinetics studied by ultrarapid mixing. Proc. Natl. Acad. Sci. USA 94, 1779–1784. Chiti, F., Webster, P., Taddei, N., Stefani, M., Ramponi, G., and Dobson, C.M. (1999). Designing conditions for in vitro formation of amyloid protofilaments and fibrils. Proc. Natl. Acad. Sci. USA 96, 3590–3594. Cohen, F.E., and Prusiner, S.B. (1998). Pathologic conformations of prion proteins. Annu. Rev. Biochem. 67, 793–819.

Daura, X., Jaun, B., Seebach, D., van Gunsteren, W.F., and Mark, A.E. (1998). Reversible peptide folding in solution by molecular dynamics simulation. J. Mol. Biol. 280, 925–932. Dill, K.A., and Chan, H.S. (1997). From Levinthal to pathways to funnels. Nat. Struct. Biol. 4, 10–19. Dinner, A., and Karplus, M. (1999). The thermodynamics and kinetics of protein folding. A lattice model analysis of multiple pathways with intermediates. J. Phys. Chem. B, in press. Dobson, C.M., and Karplus, M. (1999). The fundamentals of protein folding. Curr. Opin. Struct. Biol. 9, 92–101. Dobson, C.M., Evans, P.A., and Radford, S.E. (1994). Understanding how proteins fold: the lysozyme story so far. Trends Biochem. Sci. 19, 31–37. Dobson, C.M., Sali, A., and Karplus, M. (1998). Protein folding: a perspective from theory and experiment. Angewandte Chemie—Intl. Ed. 37, 868–893. Duan, Y., and Kollman, P.A. (1998). Pathways to a protein folding intermediate observed in a 1-microsecond simulation in aqueous solution. Science 282, 740–744. Eaton, W.A., Mun˜oz, V., Thompson, P.A., Henry, E.R., and Hofrichter, J. (1998). Kinetics and dynamics of loops, alpha-helices, beta-hairpins, and fast-folding proteins. Accounts Chem. Res. 31, 745–753. Ellis, R.J., and Hartl, F.U. (1999). Principles of protein folding in the cellular environment. Curr. Opin. Struct. Biol. 9, 102–110. Fersht, A.R. (1997). Nucleation mechanisms in protein folding. Curr. Opin. Struct. Biol. 7, 3–9. Grantcharova, V.P., Riddle, D.S., Santiago, J.V., and Baker, D. (1998). Important role of hydrogen bonds in the structurally polarized transition state for folding of the src SH3 domain. Nat. Struct. Biol. 5, 714–720. Hagen, S.J., Hofrichter, J., Szabo, A., and Eaton, W.A. (1996). Diffusion-limited contact formation in unfolded cytochrome c: estimating the maximum rate of protein folding. Proc. Natl. Acad. Sci. USA 93, 11615–11617. Harbury, P.B., Plecs, J.J., Tidor, B., Alber, T., and Kim, P.S. (1998). High-resolution protein design with backbone freedom. Science 282, 1462–1467. Jimenez, J.L., Guijarro, J.L., Orlova, E., Zurdo, J., Dobson, C.M., Sunde, M., and Saibil, H.R. (1999). Cryo-electron microscopy structure of an SH3 amyloid fibril and model of the molecular packing. EMBO J. 18, 815–821. Karplus, M. (1997). The Levinthal paradox: yesterday and today. Folding and Design 2, S69–S75. Kelly, J.W. (1998). The alternative conformations of amyloidogenic proteins and their multi-step assembly pathways. Curr. Opin. Struct. Biol. 8, 101–106. Kortemme, T., Ramirez Alvarado, M., and Serrano, L. (1998). Design of a 20-amino acid, three-stranded beta-sheet protein. Science 281, 253–256. Ladurner, A.G., Itzhaki, L.S., Daggett, V., and Fersht, A.R. (1998). Synergy between simulation and experiment in describing the energy landscape of protein folding. Proc. Natl. Acad. Sci. USA 95, 8473–8478. Lazaridis, T., and Karplus, M. (1997). “New view” of protein folding reconciled with the old through multiple unfolding simulations. Science 278, 1928–1931. Levinthal, C. (1968). Are there pathways for protein folding? J. Chim. Physique 65, 44–45. Martinez, J.C., Pisabarro, M.T., and Serrano, L. (1998). Obligatory steps in protein folding and the conformational diversity of the transition state. Nat. Struct. Biol. 5, 721–729. Mateu, M.G., and Fersht, A.R. (1999). Mutually compensatory mutations during evolution of the tetramerization domain of tumor supressor p53 lead to impaired hetero-oligomerization. Proc. Natl. Acad. Sci. USA 96, 3595–3599. Mun˜oz, V., and Serrano, L. (1997). Development of the multiple sequence approximation within the AGADIR model of alpha-helix formation: comparison with Zimm-Bragg and Lifson-Roig formalisms. Biopolymers 41, 495–509.

Cell 298

Munson, M., Balasubramanian, S., Fleming, K.G., Nagi, A.D., Obrien, R., Sturtevant, J.M., and Regan, L. (1996). What makes a protein a protein? Hydrophobic core designs that specify stability and structural properties. Prot. Sci. 5, 1584–1593. Nikolova, P.V., Henckel, J., Lane, D.P., and Fersht, A.R. (1998). Semirational design of active tumor supressor p53 DNA binding domain with enhanced stability. Proc. Natl. Acad. Sci. USA 95, 14675–14680. Park, S.H., Oneil, K.T., and Roder, H. (1997). An early intermediate in the folding reaction of the B1 domain of protein G contains a native-like core. Biochemistry 36, 14277–14283. Plaxco, K.W., and Dobson, C.M. (1996). Time-resolved biophysical methods in the study of protein folding. Curr. Opin. Struct. Biol. 6, 630–636. Shakhnovich, E.I. (1997). Theoretical studies of protein-folding thermodynamics and kinetics. Curr. Opin. Struct. Biol. 7, 29–40. Shakhnovich, E., Abkevich, V., and Ptitsyn, O. (1996). Conserved residues and the mechanism of protein folding. Nature 379, 96–98. Shastry, M.C.R., and Roder, H. (1998). Evidence for barrier-limited protein folding kinetics on the microsecond time scale. Nat. Struct. Biol. 5, 385–392. Viguera, A.R., Serrano, L., and Wilmanns, M. (1996). Different folding transition states may result in the same native structure. Nat. Struct. Biol. 3, 874–880.