YABBI 6948
No. of Pages 10, Model 5G
19 March 2015 Archives of Biochemistry and Biophysics xxx (2015) xxx–xxx 1
Contents lists available at ScienceDirect
Archives of Biochemistry and Biophysics journal homepage: www.elsevier.com/locate/yabbi
Review
Structural predictions of neurobiologically relevant G-protein coupled receptors and intrinsically disordered proteins
Giulia Rossetti a,b,1, Domenica Dibenedetto b,1, Vania Calandrini a,b,1, Alejandro Giorgetti a,c,⇑, Paolo Carloni a,b
a Computational Biophysics, German Research School for Simulation Sciences, and Computational Biomedicine, Institute for Advanced Simulation IAS-5 and Institute of Neuroscience and Medicine INM-9, Forschungszentrum Jülich, 52425 Jülich, Germany
b Jülich Supercomputing Centre, Forschungszentrum Jülich, 52425 Jülich, Germany
c Department of Biotechnology, University of Verona, Ca’ Vignal 1, Strada le Grazie 15, Verona 37134, Italy
Article info
Article history: Received 22 December 2014 and in revised form 11 March 2015. Available online xxxx
Keywords: G-protein coupled receptors; Intrinsic disordered proteins; Computational biophysics; Bioinformatics
Abstract
G protein coupled receptors (GPCRs) and intrinsically disordered proteins (IDPs) are key players in neuronal function and dysfunction. Unfortunately, their structural characterization is lacking in most cases. On the one hand, no experimental structure has been determined for the two largest GPCR subfamilies, both key proteins for neuronal pathways: the odorant receptors (450 members out of 900 human GPCRs) and the bitter taste receptors (25 members). On the other hand, the structural characterization of IDPs is also highly non-trivial. They exist as dynamic, highly flexible structural ensembles that undergo conformational conversions on a wide range of timescales, spanning from picoseconds to milliseconds. Computational methods may be of great help in characterizing these neuronal proteins. Here we review recent progress from our lab and other groups in developing and applying in silico methods for structural predictions of these highly relevant, fascinating and challenging systems.
© 2015 Published by Elsevier Inc.
Introduction

Neuronal signaling processes involve mostly unknown, exquisite molecular recognition mechanisms. The first step towards the so-called mechanistic systems biology [1] is to provide a detailed molecular description of the neuronal proteins involved. Unfortunately, the structural determinants of several classes of proteins are difficult to obtain experimentally. These include more than half of the largest protein family in humans, the G-protein coupled receptor (GPCR; see footnote 2) family, as well as intrinsically disordered proteins (IDPs), which constitute 30% of the human genome. This paper reviews our effort to use advanced molecular simulation to predict the structural determinants of some of these highly challenging systems.

GPCRs: computational challenges
Human G protein coupled receptors (GPCRs) form the largest membrane-bound receptor family expressed by humans [2]. This family (Fig. 1A) encompasses 4% of the protein-coding genes in humans. Remarkably, GPCRs participate in about 80% [3] of the signaling processes in the brain. In particular, they are responsible for the perception of chemicals from the environment. Thus, a detailed characterization of the molecular mechanisms underlying their function is needed. GPCRs share a similar architecture of seven TM helices held together by tertiary contacts (Fig. 1B). However, their sequences are remarkably variable [4]. One of the most important features of GPCRs is their ability to bind molecules of diverse shapes, sizes and chemical properties (ranging from metal ions to short peptides) and to fire a cascade of events (Fig. 1C). Understanding the function and dysfunction mechanisms of these receptors therefore implies characterizing the interaction with their natural ligands. The recent progress in GPCR crystallography has opened an unprecedented avenue for receptor–ligand characterization [5–21]. Indeed, the 2012 Nobel Prize in Chemistry was awarded to Brian Kobilka (Stanford University) and Robert Lefkowitz (Duke University) for their structural work on GPCRs (http://www.
⇑ Corresponding author at: Department of Biotechnology, University of Verona, Ca’ Vignal 1, Strada le Grazie 15, Verona 37134, Italy. E-mail address: [email protected] (A. Giorgetti).
1 Equally contributed to the work.
2 Abbreviations used: GPCRs, G protein coupled receptors; IDPs, intrinsic disordered proteins; TAS2Rs, taste 2 receptors; WT, wild-type; 3D, three-dimensional; MD, molecular dynamics; CG, coarse-grained; β2AR, β2 adrenergic receptor; S-Car, S-Carazolol; MC, Monte Carlo; MUCA, multi-canonical algorithm; REM, replica-exchange method; PB, Poisson–Boltzmann; HMQC, heteronuclear multiple-quantum correlation.
http://dx.doi.org/10.1016/j.abb.2015.03.011
0003-9861/© 2015 Published by Elsevier Inc.
Please cite this article in press as: G. Rossetti et al., Arch. Biochem. Biophys. (2015), http://dx.doi.org/10.1016/j.abb.2015.03.011
Fig. 1. (A) Scheme of the human GPCR phylogenetic tree according to the GRAFS system [31]. GPCRs are divided into 5 families (glutamate, rhodopsin, adhesion, frizzled/taste2, secretin). The rhodopsin family can be further divided into four subfamilies. This classification excludes the very large (388 GPCRs) olfactory receptor sub-branch of the rhodopsin family. (B) Representation of a general GPCR fold. The seven trans-membrane helices (TM) are colored from red (TM1) to blue (TM7). The GPCR binding cavity is represented as a yellow surface. (C) Schematic representation of GPCR signaling in eukaryotes. The GPCR is in complex with its cognate G-protein in its inactive state. Upon interaction with the agonist molecule, guanosine diphosphate (GDP) is exchanged for guanosine-5′-triphosphate (GTP) in one of the subunits of the trimeric G-protein (Gα). This allows the bound G-protein to be released from the receptor and to dissociate into the active, GTP-bound α subunit (Gα-GTP) and the dimer formed by the other two subunits, β and γ (Gβγ). In the bitter taste receptor pathway, the focus of this work, Gα-GTP stimulates phosphodiesterase (PDE) activity, thus reducing the intracellular cAMP concentration and opening the cAMP-inhibited channels. Concomitantly, Gβγ subunits stimulate the phospholipase Cβ2 (PLC-β2) enzyme. This catalyzes the formation of IP3 (inositol trisphosphate) from PIP2 (phosphatidylinositol 4,5-bisphosphate). IP3 in turn stimulates Ca2+ release from the endoplasmic reticulum. Thus, both Gα-GTP and Gβγ contribute to the increase of the intracellular Ca2+ concentration.
nobelprize.org/nobel_prizes/chemistry/laureates/2012/). However, the lack of structural data for about 95% of the members of the family calls for approaches in which GPCR-specific in silico tools are combined with extensive in vitro experiments. In the absence of structural information, determining the binding modes of agonists and antagonists may be challenging [22–24]. In particular, when docking protocols are applied to homology models, the accuracy of the predicted ligand binding mode decreases substantially as the distance between target and template receptors increases [25]. This is due to the poor prediction of side-chain conformations [25], which play a key role in the interaction between the ligand and the protein matrix. Some of the classical docking algorithms take into account the flexibility of both the ligand and the receptor binding site, thereby increasing the chance of generating reliable hypotheses on the predominant conformations of receptor–ligand complexes [25–29]. Nevertheless, experimental mutagenesis data are still the only tool for selecting the most probable binding mode [30]. In this respect, the compatibility between models and mutagenesis data is typically assessed by analyzing the interactions that the ligand establishes with the mutated residues. Notably, the fact that a residue contributes significantly to ligand binding does not necessarily imply a direct interaction with the ligand, as indirect effects due to the induction of conformational changes in the receptor are also possible. We therefore conclude that the contribution of experimental data is fundamental when dealing with homology models. In particular, when homology model building
is based on remote homologs, an extensive screening of the conformational space of the ligand and of the putative binding cavity, with the models validated against experiment, is highly recommended. In the next section we illustrate efforts aimed at characterizing the ligand–neuronal GPCR interaction when the structure of the receptor is predicted using homology models based on remote homologs.

Chemosensorial GPCRs. Mammals evolved a host of mechanisms to communicate chemical information from the environment and, in particular, to elicit cellular responses that provide an advantage in avoiding or seeking the chemical signatures of foods, mates, toxins, etc. For instance, bitter taste perception stems from the binding of bitter molecules to ca. 25 specific GPCRs referred to as taste 2 receptors (TAS2Rs) [32–34]. TAS2Rs are located in special subsets of taste receptor cells [33–36]. They are able to detect multiple and diverse natural and synthetic organic molecules [35]. In spite of their widespread presence in humans (as well as in other animals [37,38]), experimental structural information is so far lacking for bitter taste receptors as well as for other chemosensorial GPCRs. Hence, computational approaches are the method of choice to investigate the structure, dynamics and function of these GPCRs [39–41]. As discussed above, comparative modeling is the key tool for obtaining three-dimensional (3D) models of the receptors when experimental structural data are lacking. Bioinformatics approaches validated by functional assays and complemented with molecular docking [35] have provided structural insights into agonist–chemosensorial receptor interactions. The
responses of the different receptor mutants have been measured upon application of increasing concentrations of agonists. If the half maximal effective concentration (EC50) value is larger than that of the wild-type (WT), the receptor sensitivity is impaired, whilst the contrary is true if the EC50 is lower. A higher maximal signal amplitude usually indicates improved receptor activation relative to WT, whilst a lower one indicates impaired activation [42]. These pieces of information, when included in the model generation, provide valuable insights into the structure/function relationships. In spite of this, the approach has clear limitations in describing the active site cavity of the receptor. Structural prediction of chemosensorial receptors is very difficult because, in general, they share a sequence identity of less than 25% with structural templates [35]. Indeed, as stated before, the predicted orientation of the side chains is highly inaccurate. This implies that classical techniques such as virtual docking become very difficult or almost impossible to use. Molecular dynamics (MD) simulations, based on structures predicted by bioinformatics methods, are often used to identify ligand poses on GPCRs for which experimental structural information is not available. Still, this approach can be very CPU intensive. Moreover, when starting from structures with highly inaccurate side-chain orientations, the chance of ending up with a rather inaccurate model is very high [43] within the typical time-scale reached by state-of-the-art MD simulations of membrane proteins (from 20 to 200 ns [44]). On the other hand, in coarse-grained (CG) simulations the molecular details of the ligand/receptor interactions are lost, making these approaches not very useful. A strategy to overcome these limitations is to combine comparative modeling and docking techniques with hybrid atomistic/coarse-grained refinement of the docking pose. Indeed, hybrid atomistic/coarse-grained methods, in which only the binding cavity is represented in full atomistic detail while the rest is represented at a coarser resolution, are not only much cheaper than all-atom MD simulations [40], but may also be more accurate [42].
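The qualitative reading of EC50 shifts described above can be sketched in a few lines of Python. This is only an illustration: the Hill equation used for the dose-response curve is a common textbook assumption and is not stated in the text, and the function names, the `hill_n` coefficient and all numerical values are hypothetical.

```python
def hill_response(conc, ec50, max_amp=1.0, hill_n=1.0):
    """Receptor response at agonist concentration `conc`.

    Assumption: a Hill-type dose-response curve. `conc` and `ec50`
    must be expressed in the same concentration units.
    """
    return max_amp * conc**hill_n / (ec50**hill_n + conc**hill_n)


def classify_mutant(ec50_mutant, ec50_wt):
    """Qualitative reading of an EC50 shift relative to wild type (WT)."""
    if ec50_mutant > ec50_wt:
        return "impaired sensitivity"   # more agonist needed for half-maximal response
    if ec50_mutant < ec50_wt:
        return "increased sensitivity"  # less agonist needed
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical EC50 values, in arbitrary concentration units.
    print(classify_mutant(ec50_mutant=10.0, ec50_wt=2.0))
    print(hill_response(conc=2.0, ec50=2.0))  # half-maximal by definition of EC50
```

In practice the EC50 and the maximal amplitude would be obtained by fitting such a curve to the measured responses, and a tolerance reflecting experimental error would be needed before calling two EC50 values different.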
Hybrid atomistic-coarse grained approaches
Several groups have developed mixed modeling methods for proteins [45–52]. In some implementations, the protein of interest is described in full atomistic detail, while water and, where present, lipid molecules are described using coarse-grained representations [47,48,50,51]. Our group has developed a hybrid atomistic/coarse-grained approach [52], Molecular Mechanics/Coarse-Grained (MM/CG) molecular dynamics, that is especially conceived for structural predictions of agonist- and antagonist-GPCR complexes [45,46]. In this implementation, only the binding cavity of the protein, the ligand and the solvent close to the binding cavity are represented with an atomistic force field, while the rest of the protein frame is described using a Gō-like CG representation. Hydration is taken into account by including a sphere of water molecules centered on the ligand. An interface (I region) is defined between the MM and CG regions, which bridges the two models of different resolution. In this way, the number of degrees of freedom is reduced by up to 2 orders of magnitude [52]. This allows the system to be equilibrated on a much shorter time scale, and it may avoid artifacts caused by wrong orientations of side chains in loops and in helices far from the active site. We showed that 0.8 μs MM/CG simulations of a GPCR/inverse-agonist complex for which the 3D X-ray structure is experimentally available, the β2 adrenergic receptor/S-Carazolol (β2AR/S-Car) complex, reproduced the results of full atomistic MD simulations [52,53]. Importantly, our method was able to recover the orientation of the S-Car ligand found in the X-ray structure irrespective of its initial orientation [52].
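The three-region partition described above (MM, I and CG) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that residues are assigned to a region by the distance of their Cα atom from the ligand center; the text does not state the actual selection criterion, and the function name, cutoff radii and coordinates below are hypothetical.

```python
import math


def assign_regions(ca_coords, ligand_center, r_mm, r_interface):
    """Label each residue 'MM', 'I' or 'CG'.

    Assumption (not from the source): the label depends only on the
    distance between the residue's C-alpha atom and the ligand center.
      distance <= r_mm            -> atomistic region (MM)
      r_mm < distance <= r_interface -> interface region (I)
      otherwise                   -> coarse-grained region (CG)
    """
    if r_interface < r_mm:
        raise ValueError("r_interface must be >= r_mm")
    labels = []
    for xyz in ca_coords:
        d = math.dist(xyz, ligand_center)  # Euclidean distance (Python >= 3.8)
        if d <= r_mm:
            labels.append("MM")
        elif d <= r_interface:
            labels.append("I")
        else:
            labels.append("CG")
    return labels


if __name__ == "__main__":
    # Hypothetical C-alpha coordinates (in Å) of three residues.
    coords = [(0.0, 0.0, 0.0), (8.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
    print(assign_regions(coords, (0.0, 0.0, 0.0), r_mm=6.0, r_interface=10.0))
```

The saving in degrees of freedom comes from the CG residues, each of which is reduced to a small number of interaction sites, while only the residues labeled MM (plus ligand and nearby water) keep all their atoms.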
Using the same hybrid approach, we then studied the interaction of one of the best characterized human bitter taste receptors, TAS2R38, with two of its agonists. By combining our initial low-resolution homology models with virtual docking experiments, MM/CG simulations and experimental data, we were able to provide a detailed description of the ligand-binding site of the TAS2R38 receptor that is consistent with more than 40 site-directed mutagenesis experiments and functional calcium imaging assays performed on this receptor [42]. Our approach can be extended to other GPCRs, with direct applications in drug design. In the current MM/CG version, implemented within the GROMACS 4.5 code [94–98], the solvent is confined to a small hemisphere by Lennard-Jones-like potentials. This might introduce water structure artifacts that affect the binding modes of ligands. A possible way to improve the description of solvation is to implement an adaptive resolution scheme [54,55] that allows the water molecules to diffuse freely, while keeping the density uniform and the temperature balanced between the regions at different resolution. The adaptive resolution scheme effectively mimics a grand canonical ensemble, which allows a correct estimation of the ligand’s binding free energy. Adaptive resolution schemes have been exploited successfully for the study of biologically relevant systems such as ionic solutions [56] and macromolecules in solution [57], and are currently being implemented in standard MD software distributions, e.g. GROMACS [57] and ESPRESSO [55]. However, these implementations do not yet allow a mixed MM/CG description of the protein.
The reliability of the method can be further improved by implementing long-range electrostatic contributions of the CG part using, for instance, a topologically based multipolar reconstruction scheme [58], which reconstructs the electrostatic field of a protein from the three-dimensional coordinates of the Cα atoms. Moreover, replacing the GROMOS96 force field [59], currently used to model the all-atom region, with the AMBER force field [60] may allow a better description of neurobiologically relevant systems like ours [61,62].
Intrinsically disordered proteins: open issues and recent successful stories
Intrinsically disordered proteins (IDPs) are an important class of functional proteins with high abundance in nature [63–66], specifically in humans, where they represent almost one third of the genome [63–67]. Notably, IDPs are extensively associated with human diseases and amyloidosis [68,69]. Specifically, 79% of cancer-associated proteins, 57% of cardiovascular disease-associated proteins and 55% of neurodegenerative disease-associated proteins are predicted to contain 30 or more consecutive disordered residues [70]. Studying the structural determinants of this class of proteins is key to understanding their role in cellular function and dysfunction, in both healthy and disease-associated pathways. Unfortunately, traditional computational and experimental approaches have been hampered so far by a variety of challenges. IDPs do not adopt a well-defined native three-dimensional structure [71]; they lack stable tertiary and/or secondary structures when isolated in solution under near-physiological conditions [72,73] and exist as an ensemble of states both in solution and when not bound to a ligand in vivo [73]. In fact, IDPs have a distinct amino acid composition, enriched in A, R, G, Q, S, P, E and K while depleted in W, C, F, I, Y, V, L and N [74,75]. Lysine, glutamine, serine, glutamic acid and proline are disorder-promoting amino acids of low hydrophobicity and high net charge [76]. This combination of low hydrophobicity and relatively high net charge contributes to their lack of compact structure under physiological conditions
[66]. This means that an ensemble of inter-converting conformers is required to describe the conformational behavior of IDPs [77]. Subtle micro-environmental conditions, like the presence of a ligand with given surface features and charge, can cause funnels in the energy landscape to move, deepen or flatten, driving the IDPs towards different structural states and/or differently populated sets of structures [78,79]. A whole ‘‘new view’’ of the classical protein structure–function relationship is emerging, in which conformational diversity, far from being an unusual quirk of a few peculiar proteins, is intimately connected with protein function [80]. IDPs do not simply occur as ‘filler material’ amongst functional well-structured proteins; instead they are associated with a variety of biological functions [63]. IDPs are indeed enriched in signaling and regulatory functions, because disordered segments permit interaction with several proteins and hence the re-use of the same protein in multiple pathways [63,81–84]. IDPs employ potentially very different molecular recognition mechanisms and functional modes [83,85] compared to globular proteins. The majority of IDPs undergo a disorder-to-order transition upon functioning [65,73,76,81,86–92], i.e. a structural transition from a partially disordered state into a more highly ordered conformation in the complex [86,93], also called the folding-upon-binding mechanism (Fig. 2A; see footnote 3). The pervasive association between binding and folding led to the hypothesis that IDPs are, in their physiological state, no less folded than other proteins, because they always come bound to a partner [94], their ‘‘intrinsic’’ disorder being a hidden property that arises only when proteins are purified in vitro. The persistence of natively unfolded proteins throughout evolution may reside in the advantages of a flexible structure during disorder–order transitions in comparison with rigid proteins [65,73,90,96].
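The composition bias noted earlier in this section (enrichment in A, R, G, Q, S, P, E, K and depletion in W, C, F, I, Y, V, L, N, together with a high net charge) can be quantified for any sequence with a short Python sketch. The two residue sets are taken from the text; the function name, the per-residue net-charge convention (K and R as +1, D and E as −1, histidine and termini ignored) and the example sequence are assumptions made for illustration and do not constitute a disorder predictor.

```python
# Residue classes as listed in the text (one-letter codes).
ENRICHED_IN_IDPS = frozenset("ARGQSPEK")
DEPLETED_IN_IDPS = frozenset("WCFIYVLN")


def composition_profile(sequence):
    """Return (fraction enriched-type, fraction depleted-type, net charge per residue)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    frac_enriched = sum(aa in ENRICHED_IN_IDPS for aa in seq) / n
    frac_depleted = sum(aa in DEPLETED_IN_IDPS for aa in seq) / n
    # Assumption: K, R count +1 and D, E count -1; His and chain termini are ignored.
    net_charge = (seq.count("K") + seq.count("R")
                  - seq.count("D") - seq.count("E")) / n
    return frac_enriched, frac_depleted, net_charge


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    print(composition_profile("KKEESPQGWL"))
```

A high first value combined with a low second value and a large absolute net charge is the pattern the text associates with a lack of compact structure; thresholds for calling a sequence disordered would have to come from the cited literature.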
The potential advantages of an intrinsic lack of structure and of function-related disorder–order transitions are:
(i) Decoupling of specificity from binding strength. IDPs are capable of combining high specificity with low affinity [73,77,88,97]. This is due to the fact that folding and binding are coupled for IDPs. Therefore, the change in enthalpy is compensated by a much larger loss of conformational entropy than in globular proteins. This results in a lower absolute value of ΔG, decreasing the stability of the resulting complex. The key thermodynamic driving force for the binding reaction is consequently a favorable enthalpic contribution [98]. Although the binding strength is lowered, the specificity of the interaction, which is mainly governed by the complementarity of the partners, is not altered [64]. Therefore, although the interface properties of complexes formed by disordered proteins point to strong and specific interactions, this is not reflected in the binding strength.
(ii) Binding commonality, in which multiple, distinct sequences fold differently yet each recognizes a common binding surface [95] (Fig. 2B). These localized interacting regions give IDPs an increased modularity, as different binding regions can be incorporated into the same protein without excessively increasing protein length [99] (see footnote 4).
(iii) Binding diversity. IDPs may fold differently to recognize differently shaped partners through several structural accommodations at the various binding interfaces [63,65,73,86] (Fig. 2C). This phenomenon, known as one-to-many signaling [75], illustrates the complexity of the different binding modes of IDPs and enables an exceptional plasticity in cellular responses [83].
(iv) The creation of very large interaction surfaces, as the disordered protein wraps up [102] or surrounds its partner [103], making it possible to overcome steric restrictions [73,103]. These proteins thus utilize a much larger fraction of their accessible surface than globular proteins do [104]. The distinct binding mode of IDPs is also reflected in the physico-chemical nature of their interfaces, which favor hydrophobic interactions during binding, as opposed to the large number of polar–polar interactions at globular interfaces [104].
(v) An increased speed of interaction. IDPs display both faster rates of association, by reducing the dependence on orientation factors and by enlarging target sizes, and faster rates of dissociation, by un-zippering mechanisms [73,90]. The great conformational freedom of IDPs in a multidirectional search permits the recognition of distant and/or discontinuous determinants on the target [69,85]. Moreover, their extended structure enables them to contact their partner(s) over a large binding surface area, which allows the same interaction potential to be realized by shorter proteins overall [85,105].

3 Some IDPs engage in complexes that are much more dynamic, in which the IDPs do not necessarily adopt a specific conformation in the complex, but rather sample various states on the surface of the partner [86,93].
4 These binding regions can be close to each other or can form mutually exclusive overlapping sites. This allows a compact arrangement of multiple binding regions. Such a feature is possibly one of the reasons for the abundance of IDPs among protein–protein interaction network hubs [100,101].
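The thermodynamic argument in point (i) can be written out explicitly. The following is a sketch under a simplifying assumption that is not made in the text: the binding entropy is split into a conformational term, which is negative because the chain orders upon binding, and a remainder collecting all other contributions. The subscripts are introduced here for illustration only.

```latex
\begin{align}
  \Delta G_{\mathrm{bind}} &= \Delta H - T\,\Delta S, \\
  \Delta S &= \Delta S_{\mathrm{conf}} + \Delta S_{\mathrm{other}},
  \qquad \Delta S_{\mathrm{conf}} < 0, \\
  \Delta G_{\mathrm{bind}} &= \underbrace{\Delta H}_{<\,0}
     \;+\; \underbrace{T\,\lvert \Delta S_{\mathrm{conf}} \rvert}_{>\,0}
     \;-\; T\,\Delta S_{\mathrm{other}} .
\end{align}
```

For a protein that folds upon binding, the magnitude of the conformational entropy loss is larger than for a pre-folded globular protein, so the positive term offsets more of the favorable enthalpy and the absolute value of the binding free energy is smaller for a comparable interface. Specificity, which depends on the complementarity of the interface, is not affected by this entropic penalty.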
All these features, i.e. precise control and simple regulation of the binding thermodynamics [65,73,88,89,92], larger-than-average intermolecular interfaces despite the limited protein size (and, consequently, without an increase of cell size or crowding) [99], structural flexibility and the ability to bind vastly different ligands, together with their speed of interaction due to a greater capture radius and their ability to search spatially through interaction space, place IDPs among the major cellular regulators, recognizers and signal transducers [77]. The reduced lifetime of IDPs in the cell may also represent a mechanism for the rapid turnover of important regulatory molecules [65]. The inherent flexibility of IDPs calls for new experimental and computational strategies for studying these proteins, since describing the ensemble of conformations of an IDP at the atomistic level remains a considerable challenge. From an experimental point of view, the highly dynamic nature of IDPs [106], the presence of local and long-range conformational rearrangements [107,108], as well as the transient secondary structure and transient long-range tertiary structure [93,109,110], make their structural characterization difficult. Additional problems may arise from the low overall hydrophobicity and high net charge [111]. From a computational point of view, key challenges are the sampling of the wide conformational space as well as the lack of a force field tailored to unstructured proteins [112,113]. In this review we will first discuss the computational strategies that have been devised to tackle the conformational plasticity of neuronal IDPs, complemented by an application from our lab.
Computational methods for neuronal IDPs
Computational methods using physics-based empirical molecular mechanics force fields increasingly make critical contributions by providing general insights into the behavior of IDPs [77,98,114–117]. However, the dynamic and heterogeneous nature of IDPs presents substantial challenges, in terms of both force field accuracy and conformational sampling capability. MD simulations are indeed sensitive to the choice of the protein force field [114,118,119]. Hence, an important caveat is that these force fields are typically parameterized to reproduce the behavior of folded proteins rather
Fig. 2. Schematic of the molecular recognition mechanisms of IDPs. (A) Disorder-to-order transition [83]. (B) Binding commonality [95]. (C) Binding diversity [75].
than IDPs, and thus they may fail to capture important aspects of IDP conformational ensembles [62,120,121]. We will therefore describe the different techniques used so far for IDPs.
Molecular dynamics and Monte Carlo simulations

Molecular dynamics (MD) and Monte Carlo (MC) simulations complement experiments by elucidating chemical details underlying the conformational dynamics of biological macromolecules [122]. Unfortunately, it is extremely difficult to adequately sample the conformational space accessible to IDPs. In detail:
– MD in explicit solvent at room temperature is generally insufficient for achieving convergence in simulated structural ensembles of IDPs, owing to their large conformational space [123] and to so-called kinetic trapping, i.e., the tendency of the system to remain confined to local energy minima [124]. Such minima are separated by free-energy barriers, whose heights are often much larger than the thermal energy available to the system [124]. Therefore, MD is not always suitable for sampling the dynamical behavior of IDPs.
– In MC approaches, stochastic conformational searches are used to efficiently sample conformations of the protein chain [125]. MC surmounts energy barriers by moving through successive discrete local minima in the energy landscape. In this way, MC allows all the minima of conformational space to be sampled but does not resolve the energy barriers between them [126].
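The two points above can be illustrated on a toy one-dimensional double-well energy with a minimal Python sketch. It uses the Metropolis acceptance rule, a standard choice for MC sampling that the text does not name explicitly; the energy function, the function names and all parameters are hypothetical, and energies are expressed in units where k_B = 1.

```python
import math
import random


def double_well(x, barrier=5.0):
    """Toy 1D energy: minima at x = -1 and x = +1, barrier of height `barrier` at x = 0."""
    return barrier * (x * x - 1.0) ** 2


def acceptance_probability(delta_e, temperature):
    """Metropolis rule: downhill moves are always accepted, uphill moves with exp(-dE/T)."""
    if delta_e <= 0.0:
        return 1.0
    return math.exp(-delta_e / temperature)


def metropolis_walk(n_steps, temperature, max_step, x0=-1.0, seed=0):
    """Sample the double well with uniform trial displacements in [-max_step, max_step]."""
    rng = random.Random(seed)  # seeded for reproducibility
    x = x0
    energy = double_well(x)
    trajectory = []
    for _ in range(n_steps):
        trial = x + rng.uniform(-max_step, max_step)
        trial_energy = double_well(trial)
        if rng.random() < acceptance_probability(trial_energy - energy, temperature):
            x, energy = trial, trial_energy
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    for max_step in (0.2, 2.5):  # small local moves vs. moves spanning both wells
        traj = metropolis_walk(20000, temperature=0.5, max_step=max_step)
        fraction_right = sum(x > 0.0 for x in traj) / len(traj)
        print(max_step, fraction_right)
```

With small trial displacements the walker must climb the barrier step by step, which mimics the kinetic trapping described for MD when the barrier is large compared with the temperature. With displacements wide enough to span both wells, a trial move can land directly in the other minimum without visiting the barrier region, which is one way to read the statement that MC samples the minima but does not resolve the barriers.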
Enhanced sampling techniques

These methods achieve a random walk in potential-energy space, allowing the system to easily overcome the energy barriers that separate local minima. In this way, a much wider phase space may be sampled than with conventional simulations [124,128,129]. Three well-known approaches for carrying out generalized-ensemble MD or MC simulations are the multi-canonical algorithm [130–132], simulated tempering [133,134] and the replica exchange method [129,135] (Fig. 3). They are very briefly summarized here:
– The multi-canonical algorithm (MUCA) method [130–132] has been used successfully for systems that suffer from the multiple-minima problem, such as spin glasses [136,137] and protein folding [138]. The method assigns to each state with energy E a non-Boltzmann weight that is independent of temperature, so that a uniform potential energy distribution is obtained and all energy states are sampled with the same likelihood [139]. The flat distribution implies that a free random walk in potential energy space is realized in this ensemble [139]. This allows the simulation to escape from any local minimum-energy state and to sample the configurational space much more widely than conventional canonical MC or MD methods [139]. MUCA methods combined with a trajectory-parallelization method have also been developed [140,141] to further increase the sampling efficiency. These approaches were applied to the coupled folding and binding of an IDP in order to generate the corresponding free-energy landscape [142].
– Simulated tempering (ST) performs a free random walk in temperature space. This random walk, in turn, induces a random walk in potential energy space and allows the simulation to escape from states of local energy minima. In ST [133,134], the temperature of the system is randomly switched between several predefined values. Both the configuration and the temperature are updated during the simulation with a non-Boltzmann weight [130–132]. The frequencies of visiting different temperatures depend on the criteria defining the acceptance probability of temperature transitions [133,134]. In tempering methods, the system can frequently visit higher temperatures, where the sampling is ergodic, and go back to lower temperatures with a very different configuration [143,144]. ST, however, requires extensive initial simulations to accurately compute the Helmholtz free energy of the system at each temperature [143,144]. Therefore, the computational cost of the initial simulations might become
Please cite this article in press as: G. Rossetti et al., Arch. Biochem. Biophys. (2015), http://dx.doi.org/10.1016/j.abb.2015.03.011
Fig. 3. Flow-charts of the three main enhanced sampling methodologies: Replica Exchange Method, REM (panel A), Simulated Tempering, ST (panel B), and Multi-canonical algorithm, MUCA (panel C). Here $X = \{\ldots, x_m^{[i]}, \ldots, x_n^{[j]}, \ldots\}$ is a generalized ensemble of replicas, where the i-th replica $x_m^{[i]}$ indicates a state specified by the coordinates $q^{[i]}$ and momenta $p^{[i]}$ of N atoms at temperature $T_m$, $\beta_m = 1/k_B T_m$, E is the potential energy, n(E) is the density of states, $E_{MUCA}$ is the multicanonical potential energy, and S(E) is the entropy in the microcanonical ensemble. Advantages and weaknesses of the three methods are summarized in panel (D). The flow-charts have been built according to the descriptions of the algorithms provided in reference [127].
prohibitive for IDPs [145]. Despite these drawbacks, ST was applied, in combination with classical MD, to study the binding mechanism of two IDPs [146]. MUCA and ST both make use of non-Boltzmann probability weight factors, which are not known a priori and need to be determined by trial simulations [129–132,135]. This process can be non-trivial and very tedious for complex systems with many degrees of freedom.
– The replica-exchange method (REM) uses standard Boltzmann weight factors that are known a priori [128]. In this method, a number of non-interacting copies (or replicas) of the original system are simulated in parallel at different temperatures [128]; at given time intervals, the simulation conditions are exchanged between replica pairs with a specific transition probability [128]. Trajectories at higher temperatures are more likely to cross potential energy barriers, leading to a wider array of sampled configurations, while trajectories at lower temperatures provide the physically relevant acceptance probabilities. Recent applications of the method to small peptides [135,147–152] generally confirm that replica exchange can enhance protein conformational sampling as long as the activation enthalpies of the conformational transitions are positive. However, applying REM to IDPs can be challenging, since the nature of IDP sub-states and the energy barriers of their interconversion are not known, and the maximum temperature should be carefully chosen to lie slightly above the temperature at which the folding rate is maximal [147,148]. These aspects complicate the choice of key simulation parameters such as the number of replicas, the range and distribution of simulation temperatures, and the exchange attempt frequency.
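The acceptance rules behind the three generalized-ensemble methods listed above can be sketched in a few lines of Python. This is a minimal illustration and not the implementation used in any of the cited works: the function names, the kcal/mol unit choice, the toy energy levels and degeneracies, and the geometric temperature ladder (a common heuristic that the text does not discuss) are all assumptions made for the example.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption


def beta(temperature):
    """Inverse temperature beta = 1 / (k_B T)."""
    return 1.0 / (K_B * temperature)


def _metropolis(log_ratio):
    """Metropolis probability min(1, exp(log_ratio)), guarded against overflow."""
    return 1.0 if log_ratio >= 0.0 else math.exp(log_ratio)


def muca_energy_distribution(degeneracies):
    """MUCA: weight w(E) = 1/n(E), so P(E) ~ n(E) * w(E) is flat."""
    unnormalized = [n * (1.0 / n) for n in degeneracies]
    total = sum(unnormalized)
    return [p / total for p in unnormalized]


def canonical_energy_distribution(energies, degeneracies, temperature):
    """Canonical reference: P(E) ~ n(E) * exp(-beta * E)."""
    b = beta(temperature)
    unnormalized = [n * math.exp(-b * e) for e, n in zip(energies, degeneracies)]
    total = sum(unnormalized)
    return [p / total for p in unnormalized]


def st_acceptance(energy, beta_old, beta_new, g_old, g_new):
    """ST: probability of a temperature switch at fixed configuration.

    Uses the weight exp(-beta_m * E + g_m); g_m is the dimensionless free
    energy that has to be estimated beforehand by trial simulations.
    """
    return _metropolis(-(beta_new - beta_old) * energy + (g_new - g_old))


def rem_acceptance(energy_i, energy_j, beta_i, beta_j):
    """REM: probability of swapping the configurations of replicas i and j."""
    return _metropolis((beta_i - beta_j) * (energy_i - energy_j))


def geometric_ladder(t_min, t_max, n_replicas):
    """Geometrically spaced temperatures (a common heuristic, assumed here)."""
    ratio = (t_max / t_min) ** (1.0 / (n_replicas - 1))
    return [t_min * ratio ** k for k in range(n_replicas)]


if __name__ == "__main__":
    energies = [0.0, 1.0, 2.0, 3.0, 4.0]  # toy energy levels (kcal/mol)
    degeneracies = [1, 4, 6, 4, 1]        # toy density of states n(E)
    print(muca_energy_distribution(degeneracies))
    print(canonical_energy_distribution(energies, degeneracies, 300.0))
    ladder = geometric_ladder(300.0, 450.0, 4)
    print(ladder)
    print(rem_acceptance(-50.0, -55.0, beta(ladder[0]), beta(ladder[1])))
```

The sketch also makes the practical difference between the methods visible: `rem_acceptance` needs only energies and temperatures, whereas `muca_energy_distribution` presupposes the density of states and `st_acceptance` presupposes the free-energy parameters `g_old` and `g_new`, which is exactly what the trial simulations have to provide.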
A variation is the replica exchange with solute tempering method, REST2 [153], in which only the protein and the ions (i.e. the solute) are simulated at different effective temperatures by applying an appropriate potential energy function to each replica [154]. REM can also be coupled with Monte Carlo simulations (REMC) [155–157] to explore the conformational space of IDPs. REMC maintains many independent replicas of potential solutions, i.e. protein conformations, each at a different temperature. Each replica locally runs a Markov process sampling from the Boltzmann distribution in energy space. At each step, a conformation update involving a change in one or more backbone and side-chain torsion angles is proposed and is either accepted or rejected using a Metropolis criterion. A random walk in temperature space is achieved by periodic exchanges of conformations at neighboring temperatures [155]. A REMC-derived approach
based on an implicit-solvent all-atom potential [158] was recently applied to the disordered N-terminal domain of the prion protein (PrP) [156]. The authors were able to predict the conformational ensemble of the wild-type (WT) and mutated mouse PrPC N-terminal domain. Importantly, the work shows how pathogenic mutations (PMs) affect PrPC binding to functional interactors and/or its translocation [156] (Fig. 4A).
New generalized-ensemble algorithms can be obtained by combining the merits of the above three methods (reviewed in [139]; see footnote 5). Another particularly attractive approach to overcome the sampling bottleneck is to combine large numbers of equilibrium and/or generalized-ensemble simulations using network methods based on MC algorithms, such as Markov State Models [159–161]. This strategy has provided unprecedented detail on the energy landscapes of several proteins under both stable and unstable conditions [162–164]. Recently it has also been applied to IDPs [165,166].
Solvent representation
In all the methods discussed in the previous sections, the solvent can be represented either as a continuum, i.e. as a set of macroscopic constants describing the protein–solvent interface and protein–protein interactions [167,168], or as explicit molecules [150,169]. Traditional explicit-solvent protein force fields arguably provide the most realistic description of the solvent, but they also significantly increase the system size (10-fold), leading to a prohibitive computational cost for sufficiently sampling the immense conformational space of IDPs [112]. Moreover, explicit-solvent force fields are known to have a tendency to over-stabilize helices [118,170] and to overestimate the strength of protein–protein interactions [171].
Specifically, simulations of polyalanine peptides in explicit water showed that even the force fields providing the best match with the experimental J-coupling data (deviation from experiment comparable to the error of the J-coupling estimation) yield a helical fraction between 10% and 30% (OPLS-AA/L*, Gromos43a1*) [118], whilst NMR experiments indicate the almost exclusive formation of polyproline II. Simulations of the C-peptide of ribonuclease A based on the AMBER94, AMBER96, AMBER99, CHARMM22, OPLS-AA/L, and GROMOS96 force fields in explicit solvent revealed significant differences in secondary-structure-forming tendencies, with helix contents from 0 to 76% [170]. The best agreement with the experimental value (30%) was obtained with AMBER99 (23%) and CHARMM22 (26%).
A substantial reduction in computational cost without compromising the essential physics, albeit with a less realistic description of the solvent, can be obtained using implicit-solvent force fields, in which only the solute is represented atomistically and the system size is reduced [167,172]. The underlying idea of this model is to capture the mean influence of water by direct estimation of the solvation free energy [167,172]. Using the beta-hairpin from the C-terminus of protein G as an example, it has been shown that the implicit solvent models OPLSAA/SGB (Surface GB), AMBER94/GBSA (GB with Solvent Accessible Surface Area), and AMBER99/GBSA provide free energy landscapes quite different from that of the explicit solvent model OPLSAA/SPC [173]. Only AMBER96/GBSA gives a somewhat reasonable free energy contour map when compared with the explicit solvent model. All implicit solvent models show erroneous salt-bridge effects between charged residues. On the other hand, in both GBSA models with AMBER94 and AMBER99, the implicit solvent model dramatically increases the helical content, to 70–80% with AMBER94 and 60% with AMBER99, which is much larger than the 15–20% helical content found in previously reported explicit solvent simulations with AMBER94/TIP3P. Furthermore, the beta-hairpin population is underestimated in all implicit solvent models at low temperatures; for example, at 282 K it is estimated to be only 43% in OPLSAA/SGB, 31% in AMBER94/GBSA, 57% in AMBER96/GBSA, and 27% in AMBER99/GBSA, compared with 74% in the explicit solvent OPLSAA/SPC and about 80% in experiment [173].
Footnote 5: Replica-exchange multi-canonical algorithm, replica-exchange simulated tempering, multi-canonical replica-exchange method, simulated tempering replica-exchange method, and multidimensional replica-exchange method, the last of which also leads to replica-exchange free energy perturbation and replica-exchange umbrella sampling.
Recently, important advances have been made to greatly improve the efficiency and achievable accuracy of implicit solvent models based on the generalized Born (GB) approximation. By calculating the self-electrostatic solvation energy through a simple smoothing function that represents the dielectric boundary, together with a volume integration method, the GB method has been made consistent with the previously developed Poisson–Boltzmann (PB) theory based on finite-difference methods [174]. The improved method is roughly 3 times faster than other GB models and reproduces the PB solvation energies within 2% absolute error with a confidence of about 95% [175]. However, inherent and methodological drawbacks do exist [175]. Implicit solvent models do not properly describe short-range effects where the detailed interplay of a few non-bulk-like water molecules is important, and they might be further limited by the specific methodology for calculating the solvation free energy as well as by the physical parameters of the solvation model [176,177].
Specifically, the current surface-area-based models used to compute the nonpolar contribution to the solvation free energy have severe limitations, including an insufficient description of the conformational dependence of solvation, over-estimation of the strength of pairwise nonpolar interactions, and incorrect prediction of anticooperativity for three-body hydrophobic associations. For instance, both the GBSW/SA and GBMV/SA models, which employ, respectively, a vdW-based surface with a smooth dielectric boundary and a molecular surface, systematically over-stabilize the dimer interactions between nonpolar amino acid side chains in comparison with TIP3P explicit water, with an average over-stabilization of 0.54 and 0.53 kcal mol$^{-1}$, respectively [176]. Moreover, in the current SA models the length scale dependence is neglected and the atomic effective surface tension coefficient is conformationally independent. Such a simplification results in over-stabilization of pairwise interactions and failure to predict cooperativity in three-body hydrophobic associations [176]. Furthermore, it introduces severe distortion of the free energy landscape, such that the global minimum may no longer correspond to the true native basin [176]. Finally, non-specific collapsed states [178] can be over-stabilized. Despite these caveats, implicit-solvent force fields have been successfully applied to the simulation of regulatory IDPs [179,180]. Optimization efforts have also led to substantial improvement of other GB models, which now correctly reproduce alpha-helices, beta-hairpins and free energy landscapes [181,182].
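To make the ingredients of a GB/SA implicit-solvent energy concrete, the sketch below evaluates the polar term with the standard pairwise generalized Born expression (the form introduced by Still and co-workers) and the nonpolar term as a single surface tension multiplied by the solvent accessible surface area. It is illustrative only. The effective Born radii and per-atom surface areas are taken as given inputs, since computing them is precisely where models such as GBSW and GBMV differ, and the function names, the dielectric constants and the value of `gamma` are assumptions rather than parameters of any model cited above.

```python
import math

COULOMB_KCAL = 332.0637  # Coulomb constant in kcal*Angstrom/(mol*e^2)


def f_gb(distance, born_i, born_j):
    """Interpolation function f_GB = sqrt(r^2 + Ri*Rj*exp(-r^2 / (4*Ri*Rj)))."""
    rr = born_i * born_j
    return math.sqrt(distance ** 2 + rr * math.exp(-distance ** 2 / (4.0 * rr)))


def gb_polar_energy(coords, charges, born_radii, eps_in=1.0, eps_out=78.5):
    """Polar solvation free energy (kcal/mol) from the pairwise GB sum.

    coords: list of (x, y, z) in Angstrom; charges in units of e;
    born_radii: effective Born radii in Angstrom (assumed precomputed).
    The double sum includes i == j, which gives the Born self-energy.
    """
    prefactor = -0.5 * COULOMB_KCAL * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    for i in range(len(charges)):
        for j in range(len(charges)):
            r = math.dist(coords[i], coords[j])
            total += charges[i] * charges[j] / f_gb(r, born_radii[i], born_radii[j])
    return prefactor * total


def sa_nonpolar_energy(sasa_per_atom, gamma=0.005):
    """Nonpolar term gamma * SASA with one conformation-independent gamma.

    gamma (kcal/mol/Angstrom^2) is an assumed, illustrative value.
    """
    return gamma * sum(sasa_per_atom)


if __name__ == "__main__":
    # Hypothetical ion pair, 4 Angstrom apart, with made-up Born radii and areas.
    coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
    charges = [1.0, -1.0]
    born_radii = [2.0, 2.2]
    print(gb_polar_energy(coords, charges, born_radii))
    print(sa_nonpolar_energy([35.0, 40.0]))
```

The constant `gamma` in `sa_nonpolar_energy` is the simplification criticized in the text: a single surface tension coefficient that does not depend on conformation or length scale.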
An application from our lab
Recently, in Dibenedetto et al. [183], we proposed a computational protocol based on classical MD simulations (AMBER parm99SB force field [60] plus the ILDN modification [119,184]) for investigating how the conformational space of the IDP α-synuclein (AS) is affected by the binding of an anti-aggregation drug, dopamine (DOP) [185–187]. The conformational sampling challenge was addressed by starting the MD simulations from realistic configurations of AS obtained by NMR spectroscopy combined with paramagnetic relaxation enhancement (PRE) [188], instead of from randomly generated conformations.
Fig. 4. (A) Cartoon representation of the wild-type prion protein and selected conformations of its N-terminal domain generated by the REMC-derived approach based on an implicit-solvent all-atom potential [156]. (B) 2D ¹H–¹⁵N heteronuclear multiple-quantum correlation (HMQC) spectra of AS in the presence of DOP and the corresponding mixed cartoon/licorice representation of the AS·DOP complex. 2D ¹H–¹⁵N HMQC spectra of free AS and AS·DOP were measured at 303 K on a 200 μM AS sample in a buffer consisting of 20 mM sodium phosphate and 150 mM NaCl (pH 6.4), with 10 mg of DOP. The normalized mean weighted chemical shift variations per residue are calculated as $[(\Delta H)^2 + (\Delta N/10)^2]^{0.5}$, where $\Delta$ indicates the difference in chemical shift between the bound and free states, given for each backbone amide. The detail of the interaction between AS and DOP is magnified. The different contributions to the spectra, coming from direct contacts between DOP and AS and from conformational transitions, are highlighted with orange and green arrows, respectively. Adapted with permission from [183]. Copyright 2013 American Chemical Society.
Furthermore, the authors analyzed the conformational ensemble of AS, alone and in the presence of the drug, with a newly developed tool based on the dihedral angle distributions visited during MD [183]. The latter represents a truly innovative aspect of the work: not only is the tool able to detect and quantify backbone conformational transitions in proteins, it is also able to quantitatively detect the effects of the drug on the dynamical spectra of the protein. Finally, the tool allows the interpretation of 2D ¹H–¹⁵N heteronuclear multiple-quantum correlation (HMQC) spectra of AS in the presence of the anti-aggregation drug. Heteronuclear 2D NMR is arguably one of the most versatile and powerful experimental tools for investigating drug binding to IDPs [189], since it is able to detect fast conformational changes and binding events with high resolution and within a short time [189,190]. However, the differences in chemical shift due to direct contacts between the drug and the IDP are not easy to distinguish from those due to long-range effects of the binding [109]. The tool proposed in Dibenedetto et al. [183] was instead able to distinguish variations of chemical shifts due to direct contacts with the drug from those due to conformational changes of AS induced by long-range effects of the binding (Fig. 4B).
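The combined chemical shift variation defined in the caption of Fig. 4 is straightforward to compute once the amide ¹H and ¹⁵N shifts have been assigned in the free and bound states. A minimal sketch follows. It is not the analysis tool of Ref. [183]: the dictionary layout, the function name and the example shift values are hypothetical, and dividing by the largest value is only one possible reading of "normalized".

```python
import math


def chemical_shift_variation(free, bound, n_scale=10.0, normalize=False):
    """Per-residue [(dH)^2 + (dN / n_scale)^2]^0.5 between bound and free states.

    free, bound: dict mapping residue number -> (1H shift, 15N shift) in ppm.
    Residues missing from either state are skipped.
    """
    variation = {}
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            continue
        h_bound, n_bound = bound[residue]
        d_h = h_bound - h_free
        d_n = n_bound - n_free
        variation[residue] = math.sqrt(d_h ** 2 + (d_n / n_scale) ** 2)
    if normalize and variation:
        largest = max(variation.values())
        if largest > 0.0:
            variation = {r: v / largest for r, v in variation.items()}
    return variation


if __name__ == "__main__":
    # Made-up shifts for three residues, for illustration only.
    free = {10: (8.00, 120.0), 11: (8.20, 118.5), 12: (7.90, 121.0)}
    bound = {10: (8.03, 120.4), 11: (8.20, 118.5), 12: (7.95, 121.0)}
    for residue, value in sorted(chemical_shift_variation(free, bound).items()):
        print(residue, round(value, 4))
```

A per-residue profile of this kind shows where the spectrum changes, but not why. Separating direct drug contacts from long-range conformational effects requires additional structural information, which is the gap the dihedral-angle analysis of Ref. [183] addresses.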
Conclusions
We have presented recent computational investigations of proteins of neurobiological relevance whose structure determination poses challenges to experimental techniques. In the case of the largest neuronal GPCR subfamilies, those of olfactory and bitter taste receptors, we showed, as has been excellently done by many other groups (see e.g. Refs. [191,192]), that it is possible to reliably predict structural determinants using state-of-the-art bioinformatics combined with a novel dedicated computational technique, the hybrid atomistic/coarse-grained method [42]. Importantly, the
resulting models are consistent with a variety of molecular biology data. Our studies of neuronal IDPs suggest, as already pointed out, that methods at different resolutions may give important insights into their biological function as well as into ligand binding, a process that is so far not well understood [193].
References
[1] G. Bertini (Ed.), NMR of Biomolecules: Towards Mechanistic Systems Biology, Wiley-Blackwell, 2012. [2] T. Schoneberg et al., Pharmacol. Ther. 104 (3) (2004) 173–206. [3] A. Sali, J.P. Overington, Protein Sci. 3 (9) (1994) 1582–1596. [4] A.J. Venkatakrishnan et al., Nature 494 (7436) (2013) 185–194. [5] V. Katritch et al., J. Med. Chem. 53 (4) (2010) 1799–1809. [6] D.K. Tosh et al., J. Med. Chem. 55 (9) (2012) 4297–4308. [7] K.A. Jacobson, S. Costanzi, Mol. Pharmacol. 82 (3) (2012) 361–371. [8] R.O. Dror et al., Proc. Natl. Acad. Sci. U.S.A. 108 (46) (2011) 18684–18689. [9] A.C. Kruse, J. Hu, B.K. Kobilka, J. Wess, Curr. Opin. Pharmacol. 16C (2014) 24– 30. [10] A.C. Kruse, Nat. Rev. Drug Discovery (2014). [11] A. Manglik, B. Kobilka, Curr. Opin. Cell Biol. 27 (2014) 136–143. [12] E. Pardon et al., Nat. Protoc. 9 (3) (2014) 674–693. [13] D.P. Staus et al., Mol. Pharmacol. 85 (3) (2014) 472–481. [14] J.K. Jiang et al., Bioorg. Med. Chem. Lett. 24 (4) (2014) 1148–1153. [15] J.M. Johnston, M. Filizola, PLoS One 9 (2) (2014) e90694. [16] J.M. Johnston, M. Filizola, Adv. Exp. Med. Biol. 796 (2014) 95–125. [17] G. Scarabelli, D. Provasi, A. Negri, M. Filizola, Biopolymers 101 (1) (2014) 21– 27. [18] D. Rodriguez, H. Gutierrez-de-Teran, Curr. Pharm. Des. 19 (12) (2013) 2216– 2236. [19] K. Palczewski et al., Science 289 (5480) (2000) 739–745. [20] V. Cherezov et al., Science 318 (5854) (2007) 1258–1265. [21] S.G. Rasmussen et al., Nature 450 (7168) (2007) 383–387. [22] S. Costanzi, J. Med. Chem. 51 (10) (2008) 2907–2914. [23] M. Michino et al., Nat. Rev. Drug Discovery 8 (6) (2009) 455–463. [24] I. Kufareva, M. Rueda, V. Katritch, R.C. Stevens, R. Abagyan, Structure 19 (8) (2011) 1108–1126. [25] W. Sherman, T. Day, M.P. Jacobson, R.A. Friesner, R. Farid, J. Med. Chem. 49 (2) (2006) 534–553. [26] A.J. Orry, R. Abagyan, Methods Mol. Biol. 857 (2012) 351–373. [27] S. Bhattacharya, N. Vaidehi, Methods Mol. Biol. 914 (2012) 167–178.
[28] C.N. Cavasotto, Methods Mol. Biol. 819 (2012) 157–168. [29] K.A. Jacobson, Z.G. Gao, B.T. Liang, Trends Pharmacol. Sci. 28 (3) (2007) 111– 116. [30] S. Costanzi, Curr. Opin. Struct. Biol. 23 (2) (2013) 185–190. [31] R. Fredriksson, M.C. Lagerström, L.-G. Lundin, H.B. Schiöth, Mol. Pharmacol. 63 (6) (2003) 1256–1272. [32] J. Chandrashekar, M.A. Hoon, N.J. Ryba, C.S. Zuker, Nature 444 (7117) (2006) 288–294. [33] W. Nadler, A.T. Brunger, K. Schulten, M. Karplus, Proc. Natl. Acad. Sci. U.S.A. 84 (22) (1987) 7933–7937. [34] H. Matsunami, J.-P. Montmayeur, L.B. Buck, Nature 404 (6778) (2000) 601– 604. [35] X. Biarnes et al., PLoS One 5 (8) (2010) e12394. [36] A.V. Nair, M. Mazzolini, P. Codega, A. Giorgetti, V. Torre, Biophys. J. 90 (10) (2006) 3599–3607. [37] M. Nei, Y. Niimura, M. Nozawa, Nat. Rev. Genet. 9 (12) (2008) 951–963. [38] D. Dong, K. Jin, X. Wu, Y. Zhong, PLoS One 7 (2) (2012) e31540. [39] A. Grossfield, Biochim. Biophys. Acta 1808 (7) (2011) 1868–1878. [40] J.M. Johnston, M. Filizola, Curr. Opin. Struct. Biol. 21 (4) (2011) 552–558. [41] A. Bruno, G. Costantino, Mol. Inform. 31 (2012) 222–230. [42] A. Marchiori et al., PLoS One 8 (5) (2013) e64675. [43] A. Giorgetti, D. Raimondo, A.E. Miele, A. Tramontano, Bioinformatics 21 (Suppl. 2) (2005) ii72–76. [44] T.J. Piggot, P.J. Bond, S. Khalid, Proteins Solut. Interfaces Methods Appl. Biotechnol. Mater. Sci. (2013) 193–206. [45] M. Neri, C. Anselmi, M. Cascella, A. Maritan, P. Carloni, Phys. Rev. Lett. 95 (21) (2005) 218102. [46] M. Neri et al., Biophys. J. 94 (1) (2008) 71–78. [47] A.J. Rzepiela, M. Louhivuori, C. Peter, S.J. Marrink, Phys. Chem. Chem. Phys. 13 (22) (2011) 10437–10448. [48] Q. Shi, S. Izvekov, G.A. Voth, J. Phys. Chem. B 110 (31) (2006) 15045–15048. [49] E. Villa, A. Balaeff, K. Schulten, Proc. Natl. Acad. Sci. U.S.A. 102 (19) (2005) 6783–6788. [50] T.A. Wassenaar, H.I. Ingólfsson, M. Prieß, S.J. Marrink, L.V. Schäfer, J. Phys. Chem. B 117 (13) (2013) 3516–3530. [51] W. Han, K. 
Schulten, J. Chem. Theory Comput. 8 (11) (2012) 4413–4424. [52] M. Leguebe et al., PLoS One 7 (10) (2012) e47332. [53] S. Vanni, M. Neri, I. Tavernelli, U. Rothlisberger, PLoS Comput. Biol. 7 (1) (2011) e1001053. [54] M. Praprotnik, L.D. Site, K. Kremer, Annu. Rev. Phys. Chem. 59 (2008) 545– 571. [55] C. Junghans, S. Poblete, Comput. Phys. Commun. 181 (8) (2010) 1449–1454. [56] S. Bevc, C. Junghans, K. Kremer, M. Praprotnik, New J. Phys. 15 (10) (2013) 105007. [57] S. Fritsch, C. Junghans, K. Kremer, J. Chem. Theory Comput. 8 (2) (2012) 398– 403. [58] M. Cascella, M.A. Neri, P. Carloni, M. Dal Peraro, J. Chem. Theory Comput. 4 (8) (2008) 1378–1385. [59] W.R.P. Scott et al., J. Phys. Chem. A 103 (19) (1999) 3596–3607. [60] V. Hornak et al., Proteins 65 (3) (2006) 712–725. [61] K. Vanommeslaeghe et al., Abstr. Pap. Am. Chem. Soc. 238 (2009). [62] O.F. Lange, D. van der Spoel, B.L. de Groot, Biophys. J . 99 (2) (2010) 647–655. [63] H. Dyson, P. Wright, Nat. Rev. Mol. Cell Biol. 6 (3) (2005) 197–208. [64] P. Tompa, Trends Biochem. Sci. 27 (10) (2002) 527–533. [65] P.E. Wright, H.J. Dyson, J. Mol. Biol. 293 (2) (1999) 321–331. [66] V.N. Uversky, Protein Sci. 11 (4) (2002) 739–756. [67] A.K. Dunker, Z. Obradovic, P. Romero, E.C. Garner, C.J. Brown, Genome Inform. Int. Conf. Genome Inform. 11 (2000) 161–171. [68] V.N. Uversky, C.J. Oldfield, A.K. Dunker, Annu. Rev. Biophys. 37 (1) (2008) 215–246. [69] L.M. Iakoucheva, C.J. Brown, J.D. Lawson, Z. Obradovic´, A.K. Dunker, J. Mol. Biol. 323 (3) (2002) 573–584. [70] V.N. Uversky, C.J. Oldfield, A.K. Dunker, J. Mol. Recognit. 18 (5) (2005) 343– 384. [71] V.N. Uversky, D. Eliezer, Curr. Protein Pept. Sci. 10 (5) (2009) 483–499. [72] P.H. Weinreb, W. Zhen, A.W. Poon, K.A. Conway, P.T. Lansbury, Biochemistry 35 (43) (1996) 13709–13715. [73] A.K. Dunker et al., J. Mol. Graph. Model. 19 (1) (2001) 26–59. [74] V.N. Uversky, J.R. Gillespie, A.L. Fink, Funct. Bioinform. 41 (3) (2000) 415–427. [75] P. 
Romero et al., Proteins: Struct., Funct., Bioinf. 42 (1) (2001) 38–48. [76] R.M. Williams et al., Pac. Symp. Biocomput. (2001) 89–100. [77] A.K. Dunker, I. Silman, V.N. Uversky, J.L. Sussman, Curr. Opin. Struct. Biol. 18 (6) (2008) 756–764. [78] K. Sugase, H.J. Dyson, P.E. Wright, Nature 447 (7147) (2007) 1021–1025. [79] C.J. Oldfield et al., Biochemistry 44 (6) (2005) 1989–2000. [80] L.C. James, D.S. Tawfik, Trends Biochem. Sci. 28 (7) (2003) 361–368. [81] A.K. Dunker et al., Pac. Symp. Biocomput. (1998) 473–484. [82] M.M. Babu, R. van der Lee, N.S. de Groot, J. Gsponer, Curr. Opin. Struct. Biol. 21 (3) (2011) 432–440. [83] V.N. Uversky, Chem. Soc. Rev. 40 (3) (2011) 1623–1634. [84] R.G. Smock, L.M. Gierasch, Science 324 (5924) (2009) 198–203. [85] P. Tompa, FEBS Lett. 579 (15) (2005) 3346–3354. [86] H.J. Dyson, P.E. Wright, Curr. Opin. Struct. Biol. 12 (1) (2002) 54–60. [87] A.P. Demchenko, J. Mol. Recognit. 14 (1) (2001) 42–61. [88] G.E. Schulz, Molecular Mechanism of Biological Recognition, Elsevier/NorthHolland Biomedical Press, New York, 1979.
[89] R. Spolar, M. Record, Science 263 (5148) (1994) 777–784. [90] B.W. Pontius, Close Encounters: why Unstructured, Polymeric Domains can Increase Rates of Specific Macromolecular Association, 1993 (0968–0004 (Print)). [91] K.W. Plaxco, M. Gross, Nature, (00/17) 657–659. [92] R. Rosenfeld, S. Vajda, S, Vajda, C. DeLisi, C. DeLisi, Flexible Docking and Design, 1995 (1056–8700 (Print)). [93] P.E. Wright, H.J. Dyson, Curr. Opin. Struct. Biol. 19 (1) (2009) 31–38. [94] V.N. Uversky, Cell. Mol. Life Sci. CMLS 60 (9) (2003) 1852–1871. [95] P. Radivojac et al., Biophys. J . 92 (5) (2007) 1439–1456. [96] P. Romero et al., Pac. Symp. Biocomput. (1998) 437–448. [97] R.W. Kriwacki, L. Hengst, L. Tennant, S.I. Reed, P.E. Wright, Proc. Natl. Acad. Sci. 93 (21) (1996) 11504–11509. [98] D. Eliezer, Curr. Opin. Struct. Biol. 19 (1) (2009) 23–30. [99] K. Gunasekaran, C.-J. Tsai, S. Kumar, D. Zanuy, R. Nussinov, Trends Biochem. Sci. 28 (2) (2003) 81–85. [100] Z. Dosztányi, J. Chen, A.K. Dunker, I. Simon, P. Tompa, J. Proteome Res. 5 (11) (2006) 2985–2995. [101] C. Haynes, C.J. Oldfield, et al., Intrinsic Disorder is a Common Feature of Hub Proteins from Four Eukaryotic Interactomes. 2006, pp. 1553–7358 (Electronic). [102] Y. Choo, J.W. Schwabe, Nat. Struct. Biol. (00/01), 253–255. [103] W.E. Meador, A.R. Means, F.A. Quiocho, Science 275 (0036–8075) (1992) 1251–1255. [104] B. Mészáros, P. Tompa, I. Simon, Z. Dosztányi, J. Mol. Biol. 372 (2) (2007) 549– 561. [105] A. Mohan et al., J. Mol. Biol. 362 (5) (2006) 1043–1059. [106] N. Rezaei-Ghaleh, M. Blackledge, M. Zweckstetter, ChemBioChem 13 (7) (2012) 930–950. [107] N. Tokuriki, D.S. Tawfik, Science 324 (5924) (2009) 203–207. [108] V. Uversky, Protein. J. 28 (7) (2009) 305–325. [109] L. Salmon et al., J. Am. Chem. Soc. 132 (24) (2010) 8407–8418. [110] M. Kjaergaard, F.M. Poulsen, B.B. Kragelund, Methods Mol. Biol. 896 (2012) 233–247. [111] K.-P. Wu, J. Am. Chem. Soc. (2011). [112] T.H. Click, D. Ganguly, J. Chen, Int. J. Mol. Sci. 
11 (12) (2010) 5292–5309. [113] P.S. Nerenberg, T. Head-Gordon, J. Chem. Theory Comput. 7 (4) (2011) 1220–1230. [114] J.W. Ponder, D.A. Case, Force Fields for Protein Simulations, 2003, (0065–3233 (Print)). [115] A.D. Mackerell, Jr., Empirical Force Fields for Biological Macromolecules: Overview and Issues, 2004, (0192–8651 (Print)). [116] J.M. Bourhis, B. Canard, S. Longhi, Curr. Protein Pept. Sci. 8 (2) (2007) 135–149. [117] A.I. Bartlett, S.E. Radford, Nat. Struct. Mol. Biol. 16 (6) (2009) 582–588. [118] R.B. Best, N.-V. Buchete, G. Hummer, Biophys. J. 95 (1) (2008) L07–L09. [119] K. Lindorff-Larsen et al., Proteins: Struct., Funct., Bioinf. 78 (8) (2010) 1950–1958. [120] A.E. Aliev, D. Courtier-Murias, J. Phys. Chem. B 114 (38) (2010) 12358–12375. [121] S. Piana, K. Lindorff-Larsen, D.E. Shaw, Biophys. J. 100 (9) (2011) L47–L49. [122] D. Frenkel, B. Smit, M.A. Ratner, Phys. Today 50 (1997) 66. [123] R. Petrenko, J. Meller, Molecular Dynamics, eLS, John Wiley & Sons Ltd., 2001. [124] H.A. Scheraga, M. Khalili, A. Liwo, Annu. Rev. Phys. Chem. 58 (2007) 57–83. [125] A. Vitalis, R.V. Pappu, Annu. Rep. Comput. Chem. 5 (2009) 49–76. [126] Z. Li, H.A. Scheraga, Proc. Natl. Acad. Sci. 84 (19) (1987) 6611–6615. [127] A. Mitsutake, Y. Mori, Y. Okamoto, Methods Mol. Biol. 924 (2013) 153–195. [128] Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 314 (1) (1999) 141–151. [129] H. Lei, Y. Duan, Improved Sampling Methods for Molecular Simulation, 2007, (0959–440X (Print)). [130] J. Higo, J. Ikebe, N. Kamiya, H. Nakamura, Biophys. Rev. 4 (1) (2012) 27–44. [131] N. Nakajima, H. Nakamura, A. Kidera, J. Phys. Chem. B 101 (5) (1997) 817–824. [132] P.S. Shelokar, V.K. Jayaraman, B.D. Kulkarni, Eur. J. Oper. Res. 185 (3) (2008) 1213–1229. [133] E. Marinari, G. Parisi, EPL (Europhys. Lett.) 19 (6) (1992) 451. [134] S. Rauscher, R. Pomès, Biochem. Cell Biol. 88 (2) (2010) 269–290. [135] D.M. Zuckerman, E. Lyman, J. Chem. Theory Comput. 2 (4) (2006) 1200–1202. [136] B.A. Berg, T. 
Celik, Phys. Rev. Lett. 69 (15) (1992) 2292. [137] B.A. Berg, U.E. Hansmann, T. Celik, Phys. Rev. B 50 (22) (1994) 16444. [138] U.H.E. Hansmann, Y. Okamoto, J. Comput. Chem. 14 (11) (1993) 1333–1338. [139] Y. Okamoto, J. Mol. Graph. Model. 22 (5) (2004) 425–439. [140] J. Higo, N. Kamiya, T. Sugihara, Y. Yonezawa, H. Nakamura, Chem. Phys. Lett. 473 (4) (2009) 326–329. [141] J. Ikebe et al., J. Comput. Chem. 32 (7) (2011) 1286–1297. [142] J. Higo, Y. Nishimura, H. Nakamura, J. Am. Chem. Soc. 133 (27) (2011) 10448–10458. [143] X. Huang, G.R. Bowman, V.S. Pande, J. Chem. Phys. 128 (20) (2008) 205106. [144] S. Park, V.S. Pande, Phys. Rev. E 76 (1) (2007) 016703. [145] S. Rauscher, C. Neale, R. Pomès, J. Chem. Theory Comput. 5 (10) (2009) 2640–2662. [146] I. Staneva, Y. Huang, Z. Liu, S. Wallin, PLoS Comput. Biol. 8 (9) (2012) e1002682. [147] W. Zheng, M. Andrec, E. Gallicchio, R.M. Levy, Proc. Natl. Acad. Sci. 104 (39) (2007) 15340–15345. [148] H. Nymeyer, J. Chem. Theory Comput. 4 (4) (2008) 626–636. [149] W. Zhang, C. Wu, Y. Duan, J. Chem. Phys. 123 (15) (2005) 154105.
[150] X. Periole, A.E. Mark, Convergence and Sampling Efficiency in Replica Exchange Simulations of Peptide Folding in Explicit Solvent, 2007, (0021– 9606 (Print)). [151] F. Rao, A. Caflisch, J. Chem. Phys. 119 (2003) 4035. [152] D. Sindhikara, Y. Meng, A.E. Roitberg, Exchange Frequency in Replica Exchange Molecular Dynamics, 2008, (0021–9606 (Print)). [153] L. Wang, R.A. Friesner, B.J. Berne, J. Phys. Chem. B 115 (30) (2011) 9431–9438. [154] T. Terakawa, T. Kameda, S. Takada, J. Comput. Chem. 32 (2011) 1228–1234. [155] K. Hukushima, K. Nemoto, J. Phys. Soc. Jpn. 65 (6) (1996) 1604–1608. [156] X. Cong et al., J. Chem. Theory Comput. 9 (11) (2013) 5158–5167. [157] R.K. Das, R.V. Pappu, Proc. Natl. Acad. Sci. (2013). [158] A. Irback, S. Mohanty, Biophys. J. 88 (3) (2005) 1560–1569. [159] G.R. Bowman, K.A. Beauchamp, G. Boxer, V.S. Pande, J. Chem. Phys. 131 (2009) 124101. [160] G.R. Bowman, X. Huang, V.S. Pande, Methods 49 (2) (2009) 197–201. [161] G.R. Bowman, D.L. Ensign, V.S. Pande, J. Chem. Theory Comput. 6 (3) (2010) 787–794. [162] G.R. Bowman, V.S. Pande, Proc. Natl. Acad. Sci. 107 (24) (2010) 10890–10895. [163] V.A. Voelz, G.R. Bowman, K. Beauchamp, V.S. Pande, J. Am. Chem. Soc. 132 (5) (2010) 1526–1528. [164] V.A. Voelz, V.R. Singh, W.J. Wedemeyer, L.J. Lapidus, V.S. Pande, J. Am. Chem. Soc. 132 (13) (2010) 4702–4709. [165] Q. Qiao, G.R. Bowman, X. Huang, J. Am. Chem. Soc. 135 (43) (2013) 16092– 16101. [166] C.M. Baker, R.B. Best, Insights into the binding of intrinsically disordered proteins from molecular dynamics simulation, Wiley Interdisciplinary Reviews: Computational Molecular Science, 2013. [167] B. Roux, T. Simonson, Biophys. Chem. 78 (1) (1999) 1–20. [168] J. Chen, C.L. Brooks Iii, J. Khandogin, Curr. Opin. Struct. Biol. 18 (2) (2008) 140–148. [169] J. Mittal, R.B. Best, Biophys. J . 99 (3) (2010) L26–L28. [170] T. Yoda, Y. Sugita, Y. Okamoto, Chem. Phys. 307 (2–3) (2004) 269–283. [171] M. Kang, P.E. Smith, J. Comput. Chem. 27 (13) (2006) 1477–1485.
[172] M. Feig, C.L. Brooks III, Curr. Opin. Struct. Biol. 14 (2) (2004) 217–224.
[173] R. Zhou, Proteins 53 (2) (2003) 148–161.
[174] W. Im, D. Beglov, B. Roux, Comput. Phys. Commun. 111 (1–3) (1998) 59–75.
[175] W. Im, M.S. Lee, C.L. Brooks, J. Comput. Chem. 24 (14) (2003) 1691–1702.
[176] J. Chen, C.L. Brooks, 3rd., Implicit modeling of Nonpolar Solvation for Simulating Protein Folding and Conformational Transitions, 2008, (1463–9076 (Print)).
[177] J. Chen, C.L. Brooks, J. Am. Chem. Soc. 129 (9) (2007) 2444–2445.
[178] W. Zhang, D. Ganguly, J. Chen, PLoS Comput. Biol. 8 (1) (2012) e1002353.
[179] D. Ganguly, J. Chen, J. Am. Chem. Soc. 131 (14) (2009) 5214–5223.
[180] J. Chen, J. Am. Chem. Soc. 131 (6) (2009) 2088–2089.
[181] A. Okur, B. Strockbine, V. Hornak, C. Simmerling, J. Comput. Chem. 24 (1) (2003) 21–31.
[182] S. Jang, E. Kim, Y. Pak, Proteins: Struct., Funct., Bioinf. 66 (1) (2007) 53–60.
[183] D. Dibenedetto, G. Rossetti, R. Caliandro, P. Carloni, Biochemistry (2013). 130821094755001.
[184] K. Lindorff-Larsen et al., PLoS ONE 7 (2) (2012) e32131.
[185] R.G. Perez et al., J. Neurosci. 22 (8) (2002) 3090–3099.
[186] L. Yavich, H. Tanila, S. Vepsäläinen, P. Jäkälä, J. Neurosci. 24 (49) (2004) 11165–11170.
[187] V. Lehmensiek, E.-M. Tan, J. Schwarz, A. Storch, NeuroReport 13 (10) (2002).
[188] M.M. Dedmon, K. Lindorff-Larsen, J. Christodoulou, M. Vendruscolo, C.M. Dobson, J. Am. Chem. Soc. 127 (2) (2005) 476–477.
[189] P. Schanda, B. Brutscher, J. Am. Chem. Soc. 127 (22) (2005) 8014–8015.
[190] P. Schanda, E. Kupce, B. Brutscher, J. Biomol. NMR 33 (4) (2005) 199–211.
[191] A. Levit, D. Barak, M. Behrens, W. Meyerhof, M. Niv, Homology Model-Assisted Elucidation of Binding Sites in GPCRs. Membrane Protein Structure and Dynamics, Methods in Molecular Biology, (Humana Press), 2012, Vol. 914, pp. 179–205.
[192] F. Musiani, G. Rossetti, A. Giorgetti, P. Carloni, Adv. Exp. Med. Biol. 805 (2014) 441–457.
[193] V.N. Uversky, Front. Mol. Biosci. 1 (2014).