Accepted Manuscript

Molecular simulation of peptides coming of age: Accurate prediction of folding, dynamics and structures

Panagiota S. Georgoulia, Nicholas M. Glykos

PII: S0003-9861(18)30734-3
DOI: https://doi.org/10.1016/j.abb.2019.01.033
Reference: YABBI 7939
To appear in: Archives of Biochemistry and Biophysics
Received Date: 14 September 2018
Revised Date: 23 January 2019
Accepted Date: 28 January 2019
Please cite this article as: P.S. Georgoulia, N.M. Glykos, Molecular simulation of peptides coming of age: Accurate prediction of folding, dynamics and structures, Archives of Biochemistry and Biophysics (2019), doi: https://doi.org/10.1016/j.abb.2019.01.033.
Molecular simulation of peptides coming of age: accurate prediction of folding, dynamics and structures.

Panagiota S. Georgoulia1,2, Nicholas M. Glykos3*

1 Department of Chemistry and Biomedical Sciences, Faculty of Health and Life Sciences, Linnæus University, Kalmar, 39182, Sweden
2 Linnæus University Centre of Excellence “Biomaterials Chemistry”, Kalmar, 3912, Sweden
3 Department of Molecular Biology and Genetics, Democritus University of Thrace, University Campus, Alexandroupolis, 68100, Greece
Abstract
The application of molecular dynamics simulations to study the folding and dynamics of peptides has attracted a lot of interest in the last couple of decades. Following the successful prediction of the folding of several proteins using molecular simulation, foldable peptides emerged as a favourable system mainly due to their application in improving protein structure prediction methods and in drug design studies. However, the performance of such simulations is inherently linked to the accuracy of the empirical force fields used, whose optimisation and validation are of paramount importance. Here we review the most important findings in the field of molecular peptide simulations and highlight the significant advancements made over the last twenty years. Special reference is made to the simulation of disordered peptides and the remaining challenge of finding a force field able to describe their conformational landscape accurately.
Keywords: peptide simulations, peptide folding, peptide dynamics, empirical force fields, validation.
Preprint submitted to Archives in Biochemistry and Biophysics
January 30, 2019
1. Overview
Molecular simulations have evolved into a powerful theoretical technique for studying peptide structure and dynamics, as indicated by the plethora of references in the literature. The study of peptides thrived at the beginning of this millennium and remains an active field of research 20 years later. This interest is mainly due to the ability of foldable peptides to mimic essential characteristics of the much larger protein systems while maintaining the simplicity of smaller systems. This makes peptide systems not only easier to comprehend, but also cost-efficient in terms of both the computational power and the experimental resources required. Their advantages as model systems have been eloquently presented in other reviews[24, 63]. Herein, we try to underline their vital role in the development and advancement of force fields through numerous cycles of validation and optimisation that have led to what we perceive to be significant improvements in the ability of simulations to capture and reproduce the experimentally accessible physical reality.
2. Peptide folding simulations: two decades of continuous advancement
Historically, the study of peptides was prompted by the fact that they resemble the early events in protein folding and thus offer an excellent opportunity to decipher protein-structure relationships[35, 75, 97]. Early studies suggested that peptides may serve as nucleation sites from which folding is initiated[171, 200]. This paved the way for a number of studies aiming to characterise the timescales of the early folding events and to model them computationally.
The speed limit of folding has been estimated to be N/100 µs, where N is the number of residues[96], with its actual value being dependent on a number of factors such as the formation of the hydrophobic core, hydrogen bonds, electrostatic interactions, solvation energy, conformational entropy, diffusion, etc.[126]. The timescales for the formation of basic structural elements have been placed in the sub-second regime[20, 43]. The fastest early folding events were grouped by Gnanakaran et al.[63] based on the minimal sequence that can form each of these structures, from loop-closure events that can take place on the 10 ns time-scale[103], to α-helix formation, which takes place on the 100-200 ns time-scale[57, 199], and to the slower β-hairpin formation that may require several microseconds[132]. The Eaton group contributed extensive studies on the speed of folding, setting an upper limit for a 50-residue protein at 1 microsecond[42, 66]. This nano- to micro-second regime offered a unique opportunity for a direct comparison between theoretical calculations and experimental measurements[27, 95, 103, 202].
The combination of these relatively short folding timescales with the increase in the computational power available to research groups led to numerous peptide folding simulation studies, which significantly advanced the field. One of the pioneering studies was that of Daura et al., who showed in atomistic detail the reversible folding of two β-peptides in solution[32]. Later, the Caflisch group showed the folding, impressive for that time, of a 20-residue peptide to a three-stranded antiparallel β-sheet through a cumulative simulation time of 4 µs[49, 50]. Likewise, Pande et al. illustrated the efficiency of using different starting conformations and high-temperature unfolding to study the folding pathway of a β-hairpin fragment of protein G[143]. Challenging short peptides with many charged residues (10 out of 21 amino acids) were successfully folded computationally to the native structure in an attempt to describe the folding pathway and to estimate folding rates that were experimentally difficult to access[194].
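As a quick numerical illustration of the N/100 µs speed-limit estimate quoted above, the short sketch below evaluates it for a few chain lengths that appear in this review. The function name is ours and purely illustrative, and the estimate itself is an empirical rule of thumb rather than an exact result.

```python
def folding_speed_limit_us(n_residues: int) -> float:
    """Approximate lower bound on the folding time, in microseconds,
    from the empirical N/100 µs estimate quoted in the text."""
    if n_residues <= 0:
        raise ValueError("number of residues must be positive")
    return n_residues / 100.0


# Chain lengths of systems mentioned in this review.
for name, n in [("20-residue peptide", 20),
                ("villin headpiece", 36),
                ("50-residue protein", 50)]:
    print(f"{name:>20s} (N = {n}): ~{folding_speed_limit_us(n):.2f} µs")
```

For a 50-residue chain the rule gives about half a microsecond, which is of the same order as the 1 microsecond figure attributed to the Eaton group.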
In later years, the increase in available computational power made it possible for molecular simulations to reach even longer time scales, in the range of tens to hundreds of microseconds, approaching the folding times of longer peptides (20-60 amino acids)[125]. One of the first notable studies was the milestone 1-microsecond single folding simulation of the 36-residue villin headpiece in explicit water by Duan and Kollman[38]. Shortly after that, the Pande group demonstrated the efficiency of using many short trajectories instead of a single long one to describe the folding pathway of small peptides like BBA and the villin headpiece (23 and 36 residues long, respectively), an effort made possible through the development of distributed computing[45, 104, 167, 175, 205].
A few peptides served as preferred study systems due to their unique properties and the abundance of experimental data, like the 20-residue Trp-cage, the smallest stably folded peptide showing two-state folding properties[28, 151, 170], and the WW domain (approximately 40 residues) of the Pin1 protein[46, 52]. The latter served especially as a workhorse for small all-β-sheet peptides to examine the bias of force fields towards helical conformations[53, 131].
All-atom molecular simulations in explicit solvent have nowadays reached the impressive millisecond timescale thanks to the improvement of parallelization algorithms[2], highly optimised GPU implementations[55, 159, 180] and special-purpose supercomputers like Anton[163, 164]. As an example of the latter, it was the increased simulation length afforded by the Anton machine, together with a better performing force field, that allowed the correct folding of both FiP35, the fastest folding variant of the WW domains, and BPTI (58 residues) to experimental resolution[164], overcoming the deficiencies of previous trials[52, 53]. In 2011, the Shaw group took the whole field of molecular simulation a huge step forward by demonstrating the correct folding of 12 structurally diverse small proteins, ranging from 10 to 80 residues and covering all structural classes[108]. Apart from the obvious significance of a successful in silico folding study, they demonstrated the transient formation of native-like structural elements already in the unfolded state, providing unprecedented insights into the folding mechanism.
The small size of peptides is counteracted by their high flexibility and increased dynamics, which make probing their full conformational potential extremely difficult with conventional unconstrained molecular dynamics simulations. Thus, more often than not, enhanced sampling methods are employed to study peptide systems. In fact, most of the work cited in this review concerns studies that make use of some of the established accelerated dynamics techniques, such as replica exchange molecular dynamics (REMD)[181], simulated tempering (ST)[120] and adaptive tempering[206], to name but a few. The common concept in all of these methods is to enhance the conformational sampling through different biasing methods (for example umbrella sampling[182], metadynamics[100], and accelerated MD[68]) in order to reach the desired long timescales and retrieve thermodynamic and kinetic properties of the system. Although an in-depth presentation of these techniques is beyond the scope of this review, we should emphasise the contribution of these methods to the field of peptide folding[76, 179, 207] and refer the interested reader to some excellent reviews already available in the literature[36, 122, 130, 141].
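To make the replica exchange idea mentioned above slightly more concrete, the following minimal sketch shows the standard Metropolis criterion used to decide whether two temperature replicas swap configurations. It is an illustration only: the function names, the choice of units (kcal/mol and K) and the example energies are our own assumptions and are not taken from any of the cited studies or from a particular simulation package.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def exchange_probability(e_i, e_j, t_i, t_j):
    """Metropolis acceptance probability for swapping the configurations of
    two replicas with potential energies e_i, e_j (kcal/mol) that are
    currently simulated at temperatures t_i, t_j (K)."""
    beta_i = 1.0 / (K_B * t_i)
    beta_j = 1.0 / (K_B * t_j)
    delta = (beta_i - beta_j) * (e_i - e_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_exchange(e_i, e_j, t_i, t_j, rng=random):
    """Return True if the swap is accepted (stochastic)."""
    return rng.random() < exchange_probability(e_i, e_j, t_i, t_j)


# Hypothetical energies for two neighbouring replicas at 300 K and 310 K.
p = exchange_probability(-100.0, -90.0, 300.0, 310.0)
print(f"acceptance probability: {p:.3f}")
```

A swap that hands the lower-energy configuration to the colder replica is always accepted; the opposite swap is accepted only with a Boltzmann-weighted probability, which is what lets configurations trapped at low temperature escape through the hotter replicas while preserving the canonical ensemble at each temperature.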
To summarise the progress outlined above, we believe that molecular simulations of peptides have reached the level of maturity where they can realistically describe in atomistic detail the structure and dynamics of peptides, justifying their description as a computational microscope[37]. In the past, a folding simulation could have been characterised as successful based solely on its ability to locate the native (experimentally determined) structure. It is an indication of the maturity of the field that simply locating the native peptide structure through simulation is no longer sufficient for an application to be considered successful. The major question for recent applications is how realistically the simulation can describe the folding mechanism and the corresponding folding landscape, including the transiently stable conformations. Present-day molecular simulations perform relatively well in characterising structurally stable peptide conformations and in estimating rates and free energies of folding[149]. However, the unfolded state is still poorly understood. The inherent properties of the disordered state (a multitude of conformations of similar energies, separated by relatively low energy barriers) amplify any small force field deficiencies, although these will most probably not prevent the native structure from being located. The ability to perform extremely long simulations of peptide systems will allow a more extensive sampling of the unfolded/disordered state and should thus allow further refinement of the empirical force fields.
3. Molecular simulations of health-related peptides
Great interest in the study of the structure and dynamics of peptides came from their applications in drug design, even leading to the creation of a new subfield aptly named “computational peptidology”[209]. Molecular dynamics is a powerful tool to this end, and its contribution is increasingly being recognised[22, 41]. Simulation-based methods are actively used both in the early stages of drug discovery and in the final steps, where the aim is to decipher binding mechanisms[208].
Peptides are physiologically involved in many biological processes, with vital roles in metabolism and signalling. In particular, they have a long-recognised and well-established role as anti-cancer, anti-microbial, anti-viral, anti-angiogenic and anti-inflammatory agents. Their inherent capability to modulate protein function dramatically increases their druggability potential[89]. On top of that, they offer enhanced activity and specificity, which translate into smaller dosages and lower toxicity. However, their susceptibility to proteases appears to be a major bottleneck, underlining the need for alternative administration routes to overcome their low bioavailability[44]. Significant improvements in synthesis and purification techniques have made the production of peptides of up to 100 amino acids routine, opening new horizons for their large-scale health-related applications. A comprehensive analysis of the peptide-based drug market can be found in other reviews[110, 188].
Significant advancements have also been made concerning the availability of computational tools aiming to assist peptide-based drug design studies[90, 91]. However, most of these tools are limited in their general applicability by the assumption, more often than not unjustified, that the peptides of interest will have stable (native-like) structures in water. In this connection, special mention must be made of the Rosetta suite of programs, which has been catalytic in the field, with pioneering work in de novo peptide (and protein) structure prediction as assessed by CASP (Critical Assessment of Structure Prediction)[21]. The Rosetta algorithm implemented the fundamental concept of peptide fragments, which are short fragments of known protein structures that are assembled by Monte Carlo methods to predict protein structures[7, 172]. There are also many other in silico tools dedicated to structure prediction, but only a handful are adapted for peptides. Their accuracy is limited and highly dependent on the availability of experimental data. The most promising representative of this class is the Pepfold server, a bioinformatics-based de novo approach to predicting tertiary peptide structures from sequence[124]. The algorithm uses Hidden Markov Models and non-overlapping peptide fragments of four amino acids. It should be noted, however, that all of these methods and programs, irrespective of their detailed algorithmic basis, utilise empirical force fields to refine their proposed structures, thus highlighting the substantial interplay of peptides with force fields even in protein structure prediction methods.
Health-related applications of peptides also involve the search for small foldable peptides with predefined structural characteristics. The identification of small, stably folded and soluble peptides is not trivial[59]. Many small peptides (4-10 residues) of biological interest have been studied computationally but were found to be only marginally stable in solution[171]. Well-known exceptions to this rule are the designed 10-residue peptide chignolin[77, 98, 160] and its variant CLN025[70, 127, 157], which adopt a unique and outstandingly stable β-hairpin structure in water.
The question that then arises is how good the empirical force fields are at describing the structure and dynamics of peptides, especially of those that are not stably folded or that have more than one stable conformation. Are, for example, the force fields sensitive enough to predict the structural plasticity of peptides, or the sometimes pronounced effect of mutations on peptide structure and dynamics? Force field improvement and validation have gone hand-in-hand with peptide simulations. In the following section we present a historical perspective of the evolution of the force fields, highlighting the parallel paths that accurate peptide structure prediction and force field development have followed through the years.
4. Force fields: then and now
A number of force fields have been developed over the years for the simulation of biological macromolecules. These can be categorised into the CHARMM[114], AMBER[31], OPLS[85] and GROMOS[140] families. The similarities and differences among them, especially regarding their parameterisation protocols, can be found in other reviews[65, 153]. In what follows, we review the evolution of the force fields from their first appearance to the present, highlighting the pivotal role of peptides in their improvement.
4.1. CHARMM force field
Historically, the CHARMM (Chemistry at Harvard Macromolecular Mechanics) force field appeared in 1983[23] and was officially released as version CHARMM19[134] (Figure 1). The published parameters comprised the so-called 'extended' atom types and van der Waals parameters, wherein hydrogen atoms bonded to sulfur and carbon were included as part of their attached heavy atoms, whereas hydrogen atoms bonded to nitrogen and oxygen were treated explicitly, hence the name united-atom potentials. One of the major features was the fitting of the partial atomic charges to ab initio calculations, using the TIP3P water model[84] to calibrate the interactions. The potential function also included internal energy terms for bonds, angles, dihedrals and improper dihedrals, as well as non-bonded terms for van der Waals and electrostatic interactions, using a truncation cutoff for the latter. This basic form of the energy function has been carried over to almost all present-day empirical force fields.
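For orientation, the terms listed above can be collected into a single expression. The following is a sketch of the generic additive functional form only; the symbols are our own notation, and actual CHARMM releases carry additional terms (for example Urey-Bradley) and specific conventions that are not reproduced here.

```latex
U(\mathbf{r}) =
    \sum_{\text{bonds}} k_b \,(b - b_0)^2
  + \sum_{\text{angles}} k_\theta \,(\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} k_\phi \left[ 1 + \cos(n\phi - \delta) \right]
  + \sum_{\text{impropers}} k_\omega \,(\omega - \omega_0)^2
  + \sum_{i<j} \left\{
      \varepsilon_{ij} \left[
        \left( \frac{R_{\min,ij}}{r_{ij}} \right)^{12}
        - 2 \left( \frac{R_{\min,ij}}{r_{ij}} \right)^{6}
      \right]
      + \frac{q_i q_j}{4 \pi \varepsilon_0 \, r_{ij}}
    \right\}
```

The first four sums are the bonded (internal) terms; the last sum runs over non-bonded atom pairs and combines a Lennard-Jones term for the van der Waals interactions with a Coulomb term for the fixed partial charges q_i.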
CHARMM22[113, 114] was later introduced as an "additive" protein force field that included explicit parameters for all atoms (rendering the term 'extended atom' obsolete), emphasising the balance of the interactions within the protein, within the solvent, and between the two. This was mainly achieved through a refinement of the non-bonded interactions already implemented in the CHARMM19 version, by broadening the experimental and ab initio data used. Backbone parameters were based on N-methylacetamide (NMA) and the alanine dipeptide, and the side chains were based on a number of small model compounds. Additive force fields, however, do not allow the electrostatic parameters to change as a function of the environment, and thus neglect electronic polarisation. The CHARMM22 protein parameters were carried over to the version that followed, CHARMM27, where only the nucleic acid and lipid parameters were updated, together with parameters for a number of common ions[112].
This force field version remained prevalent for a decade until the introduction of the CHARMM22/CMAP version, which incorporated a dihedral correction map (CMAP)[115, 116]. CMAP is a two-dimensional grid of energy corrections in the space of the φ/ψ dihedral angles that is added to the potential energy equation; it improved the secondary structure propensities and compensated for the demonstrated helical bias of the CHARMM force field[178]. The parameters for the CMAP term were derived from a large collection of protein X-ray crystallographic data[116]. The CHARMM22/CMAP version had an improved performance[25], but deficiencies in the equilibrium between helical and extended conformations were noted, which affected the ability of the force field to correctly fold small peptides, mostly due to an increased helical propensity[13, 52, 53]. Thus, further improvements were incorporated by correcting the backbone and side-chain dihedral parameters to better sample the corresponding regions of the Ramachandran plot[19]. Corrections were based on experimental solution NMR data of weakly structured peptides for all residues except glycine and proline, for which QM-based corrections were made. We should note that this work was not based solely on alanine peptides but also on small helical and hairpin peptides.
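The idea of a grid-based φ/ψ correction can be sketched in a few lines. The example below is a deliberately simplified illustration, not the CHARMM implementation: it uses plain bilinear interpolation for brevity (a production implementation would use a smoother interpolation scheme), and the grid spacing, grid values and function name are all hypothetical.

```python
import math


def cmap_correction(grid, phi, psi):
    """Bilinearly interpolate a periodic phi/psi correction grid.

    grid[i][j] holds the energy correction at phi = -180 + i*step and
    psi = -180 + j*step (degrees), where step = 360 / len(grid).
    """
    n = len(grid)
    step = 360.0 / n
    # Fractional grid coordinates, wrapped onto the periodic grid.
    x = ((phi + 180.0) / step) % n
    y = ((psi + 180.0) / step) % n
    fx, fy = x - math.floor(x), y - math.floor(y)
    i0, j0 = int(math.floor(x)) % n, int(math.floor(y)) % n
    i1, j1 = (i0 + 1) % n, (j0 + 1) % n  # periodic neighbours
    return ((1 - fx) * (1 - fy) * grid[i0][j0]
            + fx * (1 - fy) * grid[i1][j0]
            + (1 - fx) * fy * grid[i0][j1]
            + fx * fy * grid[i1][j1])


# Hypothetical 24 x 24 grid (15 degree spacing) filled with made-up values.
N = 24
toy_grid = [[math.cos(math.radians(-180 + 15 * i))
             + math.sin(math.radians(-180 + 15 * j))
             for j in range(N)] for i in range(N)]

print(cmap_correction(toy_grid, -60.0, -45.0))   # value at a grid node
print(cmap_correction(toy_grid, -52.5, -45.0))   # value between two nodes
```

The correction is simply added to the rest of the potential energy for each residue's (φ, ψ) pair, so adjusting grid values reshapes the backbone energy surface locally without touching any other parameter.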
In later years, only non-protein parameters for lipids, carbohydrates, drug-like and cofactor molecules were implemented, giving rise in turn to CHARMM27r, CHARMM35 and the CHARMM General Force Field (CGenFF). Together they resulted in the latest major release, CHARMM36[19, 81], followed a few years later by CHARMM36m for intrinsically disordered proteins[82]. In parallel, the CHARMM C22* version[150] was presented, which aimed at a better helix-coil balance. This was achieved by replacing the CMAP correction for all residues except glycine and proline, and by modifying the side-chain partial charges and torsional terms for aspartate, glutamate and arginine (following the philosophy of the ff99SB* and ff03* corrections in the AMBER family, presented in the next section). A more comprehensive presentation of each component of the CHARMM all-atom additive force field, together with the parameterisation philosophy and methodology and the recently released CHARMM Drude polarizable force field, can be found elsewhere[111, 187, 210].
4.2. AMBER force field
The AMBER (Assisted Model Building with Energy Refinement) force field also first appeared in the early 1980s, sharing the united-atom idea[195], but was soon extended to an all-atom force field with torsional and angle parameters recalibrated on the basis of experimental conformational energies[196] (Figure 2). A decade of improvements in the parameters and the algorithm gave rise to a second-generation force field, named ff94[31]. Since then, protein (and nucleic acid) force fields in the AMBER family have by convention been named with "ff" followed by a two-digit year. Up to this point, the parameterisation had focused on gas-phase behaviour, but the need for potentials suitable for condensed-phase simulations became apparent for a more balanced modelling of biomolecules in solution. Unlike in CHARMM, the core parameterisation here employs more adjustable empirical parameters that can be optimised. For example, the fixed atomic partial charges of CHARMM, which are tightly connected to the TIP3P water model, are not present here. Instead, AMBER initially employed an ESP (electrostatic potential) fit for atom-centred charges and later an RESP (restrained ESP) fit, in which charges were fitted simultaneously to several conformations to achieve a better average behaviour. The dihedral parameters were fitted to the relative QM energies of alternate rotamers of small molecules, chosen to represent several conformations of glycine and alanine. Unlike CHARMM, the AMBER force fields underwent many optimisations and refinements to reach a satisfactory level of accuracy, giving rise to a large number of variants.
The ff94 force field was shown to over-stabilise helical conformations and to over-estimate computed melting temperatures in peptide simulations, sometimes even favouring helices over the native β-hairpin conformation[57, 137, 170]. The ff94 version was soon replaced by ff96 and C96[93], which featured improved calculations of electrostatics and van der Waals interactions and a better account of long-range effects. This resulted in a better fit to ab initio calculations on small peptides and more balanced solvent-solvent and solute-solvent interactions. One of the main drawbacks of this version was a bias, this time towards β-structures[74, 88, 138]. The next version, ff99[190], was yet another attempt to refit the backbone dihedral parameters, this time using more representative structures of alanine and glycine in the form of tetrapeptides and dipeptides. At that time, generalised parameters for compounds beyond proteins and nucleic acids were also released. The new potential surfaces were significantly different from those of its predecessors.
Force field optimisation was assisted by the longer simulation times that became feasible, which only uncovered more force field deficiencies. This opened up a decade of numerous, almost parallel efforts to improve the AMBER force field, making it challenging to keep up with them and even more so to choose the appropriate one for a particular study. Despite the extended optimisation of the ff99 version, AMBER force fields continued to suffer from an imbalance of secondary structural elements, over-stabilising α-helical peptides like their ff94 counterpart[57, 137].
A few years later ff03 was released, a so-called third-generation point-charge all-atom force field[39]. Its major contribution was the fitting of the electrostatic potential and main-chain torsion parameters to QM calculations of small peptides in the condensed phase with continuum solvent models. Initial testing showed a satisfactory balance between helical and extended conformations and a better description of the polyproline II region. A united-atom counterpart, ff03ua[201], was also released, in which the aliphatic hydrogens of the α-carbon and the aromatic hydrogens are represented explicitly whereas the aliphatic hydrogens of side-chains are united with their carbon atom, to improve speed in computationally demanding cases like folding simulations. A close variant of ff03, ff03*, incorporated a correction to the backbone dihedral potential to fit experimental data on the helix-coil transition, making it more transferable and able to fold peptides from both the helical and beta structural classes[15]. The ff03w force field[16] is a slightly modified version of ff03*, with a backbone dihedral potential correction that allows the use of the more accurate TIP4P/2005 water model[1]. This updated force field offered a better description of the folding thermodynamics and of the unfolded state as tested on small peptides, whilst preserving its ability to fold larger peptides, like the Trp cage and the GB1 hairpin, to the native structure. It still suffered, though, from poor solvation, with less favourable solvation free energies and too favourable protein association. In an attempt to further improve the representation of these properties, the ff03ws force field[18] was released, which essentially incorporated a tuning parameter for the Lennard-Jones protein-water interaction, specifically involving the water oxygen. The same scheme, applied to the ff99SB force field, gave rise to version ff99SBws[18].
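The kind of tuning described above for the protein-water Lennard-Jones interaction can be illustrated with a small sketch. It assumes standard Lorentz-Berthelot combination rules and a single multiplicative factor on the pair well depth; the scaling value and all atom parameters below are placeholders chosen for illustration, not values taken from the cited work.

```python
import math


def lj_pair_params(eps_i, rmin_half_i, eps_j, rmin_half_j, scale=1.0):
    """Lorentz-Berthelot combination with an optional scaling of the
    pair well depth (scale = 1.0 recovers the unmodified force field)."""
    eps_ij = scale * math.sqrt(eps_i * eps_j)   # geometric mean of well depths
    rmin_ij = rmin_half_i + rmin_half_j         # sum of half minimum distances
    return eps_ij, rmin_ij


def lj_energy(r, eps_ij, rmin_ij):
    """12-6 Lennard-Jones energy with its minimum of -eps_ij at r = rmin_ij."""
    x = (rmin_ij / r) ** 6
    return eps_ij * (x * x - 2.0 * x)


# Placeholder parameters for a protein atom and a water oxygen
# (well depth in kcal/mol, Rmin/2 in Angstrom).
plain = lj_pair_params(0.10, 1.9, 0.16, 1.7)
tuned = lj_pair_params(0.10, 1.9, 0.16, 1.7, scale=1.1)  # hypothetical factor

print("unscaled well depth:", lj_energy(plain[1], *plain))
print("scaled well depth:  ", lj_energy(tuned[1], *tuned))
```

Because only the protein-water pairs are rescaled, protein-protein and water-water interactions are left untouched, which is how such a scheme can strengthen solvation without re-parameterising either component on its own.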
The ff99SB force field[79] was a parallel effort focused largely on improving the φ/ψ dihedral terms of the previous ff94/ff99 energy functions, after showing that the inadequate backbone dihedral parameterisation was indeed responsible for the weaknesses of that generation of force fields. The major realisation here was that the two sets of backbone dihedral parameters introduced in the ff94 version must each be optimised individually for glycine and non-glycine residues. Glycine, due to the absence of a Cβ atom, needs only one set of dihedral parameters, φ/ψ (defined as φ = C-N-Cα-C, ψ = N-Cα-C-N), whereas all other amino acids, which carry a side-chain, also need a φ’/ψ’ term (defined as φ’ = C-N-Cα-Cβ, ψ’ = Cβ-Cα-C-N). The dihedral potential for the non-glycine residues is then the sum of the two. This resulted in better agreement with the PDB data for the dihedral angles and consequently a better description of the secondary structural preferences of small peptides, as well as an improved fit to NMR observables of larger peptides and proteins, like trpzip2, Trp cage, villin headpiece, ubiquitin and lysozyme.
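The additive treatment of the φ/ψ and φ’/ψ’ terms described above can be sketched as follows. The cosine-series form is the usual one for torsional terms in this family of force fields, but every numerical parameter and every name in the example is a made-up placeholder rather than an actual ff99SB value.

```python
import math


def fourier_torsion(angle_deg, terms):
    """Sum of cosine terms (V_n / 2) * [1 + cos(n * angle - gamma)].

    terms is an iterable of (V_n, n, gamma_deg) tuples.
    """
    a = math.radians(angle_deg)
    return sum(0.5 * v * (1.0 + math.cos(n * a - math.radians(g)))
               for v, n, g in terms)


def backbone_torsion_energy(phi, psi, params, phi_prime=None, psi_prime=None):
    """Backbone dihedral energy: phi/psi terms for every residue, plus the
    phi'/psi' terms (defined through C-beta) for non-glycine residues.
    Pass phi_prime and psi_prime as None for glycine."""
    energy = fourier_torsion(phi, params["phi"]) + fourier_torsion(psi, params["psi"])
    if phi_prime is not None and psi_prime is not None:
        energy += fourier_torsion(phi_prime, params["phi_prime"])
        energy += fourier_torsion(psi_prime, params["psi_prime"])
    return energy


# Placeholder parameter sets: (V_n in kcal/mol, periodicity n, phase in degrees).
toy_params = {
    "phi": [(0.8, 1, 0.0), (0.5, 2, 180.0)],
    "psi": [(1.2, 1, 180.0), (0.6, 3, 0.0)],
    "phi_prime": [(0.4, 2, 0.0)],
    "psi_prime": [(0.3, 1, 0.0)],
}

glycine = backbone_torsion_energy(-60.0, -45.0, toy_params)
alanine = backbone_torsion_energy(-60.0, -45.0, toy_params,
                                  phi_prime=60.0, psi_prime=75.0)
print(glycine, alanine)
```

Keeping the two sets separate means that the glycine surface and the non-glycine surface can be fitted independently, which is the point made in the text.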
In the same spirit, there were some efforts like the AMBER-GS version[57], where the φ/ψ torsion potential was completely removed to better fit experimental helix-coil parameters, but the α-helical bias was not removed, as shown by long equilibrium ensemble simulations[177]. As a remedy, the ff99SB-φ’ force field[177] was proposed, which combined the low helicity of the ff99 potential with the high helicity of ff94 by removing the additional barriers on the φ rotational degree of freedom. This version provided a better description of the helix-coil transition, but its applicability to non-helical peptides was not verified.
The fast, almost chaotic, production of ever more versions of the AMBER force fields continued unimpeded in the years to come. All versions from this period shared the core potential function of ff99SB, which demonstrated superior performance to previous versions[13], proving the significance of proper backbone dihedral parameterisation. Next in line came the ff99SB* force field[15], which incorporated into the ff99SB potential the same backbone correction as the ff03* version described above. This improved the helical balance and showed better agreement with NMR data of peptides, but with a poor description of the thermodynamics.
A different approach was introduced with the ff99SB-ILDN version[109], where the torsion potentials of the side-chains were targeted this time. Four residue types in particular, isoleucine (I), leucine (L), aspartate (D) and asparagine (N), were found to have rotamer distributions in the simulations that differed considerably from the experimentally observed ones. So, the side-chain torsion potentials for these four residues were fitted to high-level QM calculations, considerably improving the agreement with NMR data for the rotamer distributions. Of course, combinations of these optimisations surfaced, like ff99SB*-ILDN[15] and later ff99SB*-ILDN-Q[14], which combined the global backbone correction and the modified torsion parameters for the ILDN residues with refitted charges for residues D, E, K and R.
On another front, the Brüschweiler group introduced the concept of employing experimental NMR data, which are commonly used to cross-validate a force field’s performance, to optimise the force field instead. Thus they used a collection of NMR chemical shift data and residual dipolar couplings (RDCs) for peptides to optimise and improve the ff99SB version using an energy-based reweighting to match the experimental data in an iterative manner, giving rise to the versions ff99SBnmr1[105] and ff99SB-φψ[106], which showed superior performance when compared to their parent force field, ff99SB.
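The general idea behind energy-based reweighting can be sketched briefly: given frames from an existing trajectory and the energy change that a trial parameter set would introduce for each frame, Boltzmann weights yield an estimate of any ensemble average under the trial parameters without rerunning the simulation. The snippet below illustrates this generic principle under our own assumptions (units, function name and numbers are hypothetical); it is not the specific protocol of the cited studies.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def reweighted_average(observable, delta_energy, temperature=300.0):
    """Estimate the ensemble average of an observable under a modified
    potential from frames sampled with the original potential.

    observable[k]   : value of the observable in frame k
    delta_energy[k] : U_new - U_old for frame k, in kcal/mol
    """
    beta = 1.0 / (K_B * temperature)
    shift = min(delta_energy)  # subtracting the minimum avoids overflow
    weights = [math.exp(-beta * (de - shift)) for de in delta_energy]
    total = sum(weights)
    return sum(w * o for w, o in zip(weights, observable)) / total


# Hypothetical per-frame observable (e.g. a back-calculated NMR quantity)
# and per-frame energy changes for a trial parameter modification.
obs = [4.1, 6.8, 7.2, 5.0]
d_e = [0.0, 0.4, 0.6, 0.1]
print(reweighted_average(obs, d_e))
```

In an iterative scheme, the trial parameters would be adjusted until the reweighted averages agree with the experimental values; reweighting is only reliable while the modified ensemble still overlaps well with the frames that were actually sampled.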
Around that time, it became clear that it was imperative to produce a universal AMBER version with carefully selected corrections that could faithfully reproduce experimental data for peptides from all structural classes and that the developers could responsibly recommend to general users of molecular simulations. Towards this end, a preliminary version, ff12SB, and subsequently ff14SB[118] were officially released. Taking lessons from all the previous optimisation attempts, this version comprised a complete refit of the side-chain dihedral potential of all amino acids, including alternate protonation states for the ionizable ones, and empirical adjustments to the backbone dihedral potential, including the φ rotational profile. The superiority of this updated version was assessed through its ability to reproduce the secondary structural content and NMR parameters of peptides and proteins in solution. The ff12SB version was also optimised by adding a CMAP-like correction, ff12SB-cMAP, for simulations with implicit solvent models, when computational speed is a crucial factor[146].
Although ff14SB is the version still officially recommended by the developers at the time of writing, another version, ff15SB, came up for use in combination with the TIP3P-FB water model[192]. This version comprised a complete refitting of the bonded parameters of the parent ff99SB force field and was shown to retain the prediction accuracy of equilibrium properties while improving the accuracy of the temperature-dependent ones[127]. Its broader application and establishment remains to be seen. For completeness, there are nowadays force fields available that can also describe small molecule ligands (General AMBER force field, GAFF), carbohydrates (GLYCAM) and lipids (lipid14), as well as AMBER-compatible parameters for post-translational modifications (Forcefield PTM) and the ff15ipq version with implicitly polarised charges for use in explicit solvent[34].
4.3. Force fields for disordered peptides
Much of the current focus in the force field development community has turned towards the study of disordered peptides. An accurate description of the landscape of these structurally challenging peptides is of paramount importance due to the increasing acknowledgement of their abundance in protein structures as modular elements and their participation in many cellular processes[40]. Their main characteristic, the ability to interconvert between different conformations, also presents the greatest challenge: force fields that underwent decades of optimisation cycles to model well-folded peptides must now be transformed to perform comparably well with disordered peptides.
Disordered peptides present the additional challenge of interpreting complex experimental observables that are ensemble averages over transient conformations. The most common experimental techniques used for validation of the simulation data comprise NMR (nuclear magnetic resonance), usually amide proton exchange measurements, J-couplings, chemical shifts, NOEs (nuclear Overhauser effect) and RDCs (residual dipolar couplings), as well as FRET (Förster resonance energy transfer) and SAXS (small angle X-ray scattering), all of which have been reviewed recently[18, 26, 72, 80]. Infrared spectroscopy has lately emerged as a powerful tool to probe the conformational dynamics of the disordered states, providing experimental evidence on the secondary structures of sites that can be isolated by isotope labelling, which can be compared with computationally modelled amide I spectra[48].
Naturally, force fields had to embrace the need to represent the disordered state as accurately as the well-structured state. The first attempt to produce such a force field was based on the application of the CMAP correction method to the dihedral potential of 8 disorder-promoting residues (A, G, P, R, Q, S, E, K), using the ff99SB-ILDN force field as a basis and giving rise to the version ff99IDPs[193] (Figure 2). A later validation study using a set of 3 representative IDPs showed quantitative agreement with the experimental NMR chemical shifts and more representative conformational sampling, though still suffering from over-stabilisation of structured conformations, most probably due to inadequate representation of polar interactions[203].
A more comprehensive effort was presented through the refined CHARMM36m (Figure 1) force field[82], which tried to address a reported bias towards overpopulation of the αL region of the Ramachandran plot[154]. The exhaustive benchmarking gave a satisfactory fit to experimental parameters for IDPs while retaining consistency with its CHARMM36 counterpart for folded peptides, folding kinetics and free energy calculations. It removed the oversampling of the αL region observed in CHARMM36, but also underestimated the β region, as shown through simulations of peptides like chignolin and CLN025. The main issue addressed was over-compactness, a common characteristic of all current generation empirical force fields when applied to studies of the disordered state.
In the same spirit, and inspired by previous optimisation efforts, a force field termed a99SB-disp was released (Figure 2), to our knowledge the most recent empirical force field dedicated to accurately representing both folded and unfolded conformational states[156]. This force field is based on the ff99SB-ILDN version with the modifications necessary to employ the TIP4P-D water model. Its superior performance is attributed to (a) an iterative scheme to optimise the backbone and side-chain torsion parameters to better represent the protein-water van der Waals interactions, inspired by the ff99SB*-ILDN-Q version, and (b) the modification of the strength of the Lennard-Jones term for carbonyl oxygen and amide hydrogen pairs. Its performance remains to be evaluated in the years to come.
Amyloid (Aβ) peptides, and in particular their assembly into fibrils, have been one of the most challenging peptide systems to study, with significant difficulties encountered in both the experimental and computational approaches. The interested reader is referred to some excellent reviews of these Alzheimer’s disease-related peptides[133, 183]. A large number of studies have appeared which attempted to computationally characterise the peptides’ equilibrium conformations using the most advanced force fields. The great discrepancy among the predicted structures of the monomers, as well as of the various oligomers, only highlights the difficulty of describing the conformational preferences of disordered peptides and the force field bias that has yet to be addressed[136]. The differences between the various force fields, sometimes in the context of different water models as well, have been attributed to differing secondary structure biases as well as to intrapeptide hydrogen bonding, with helix-overestimating force fields, like AMBER99SB and CHARMM22-CMAP, having the poorest agreement with experimentally observed populations[174]. Later studies with improved force field versions accompanied by different water models, such as OPLS-AA/TIP3P, AMBER99SB-ILDN/TIP4P-Ew and CHARMM22*/TIP3SP, demonstrated strongly converged structural properties for this type of disordered peptide[158] and improved agreement with experimental J-coupling constants, chemical shifts and CD data[176]. These encouraging findings indicate that force field improvements may be heading in the correct direction. However, force field validation is difficult due to the limited availability of experimental observables for disordered peptide systems. This is particularly concerning when force field performance is found to be dependent on the peptide system per se. For instance, the CHARMM36/TIP3P combination was the most accurate in the case of the Aβ10-40 peptide but not in the case of the Aβ1-40 and Aβ1-42 peptides[173]. In a different study, the most recent versions AMBER14SB and CHARMM22* with the TIP3P water model showed a helical overestimation compared to CD data, whereas OPLS-AA and AMBER99SB-ILDN with the same water model represented conformational ensembles closer to experiment[119]. Clearly, further studies are needed to determine the source of the force field discrepancies, but more importantly to also comprehend whether the limited accuracy for disordered peptide systems is due to the interplay between the force field and water model parameters.
4.4. Other biomolecular force fields
The force field families of CHARMM and AMBER are the ones that underwent heavy optimisation and validation, by the developers but also by the rest of the community in the peptide folding and dynamics field. For completeness we should also mention two additional and significant contributions, those of the OPLS and GROMOS force fields.
The OPLS (optimised potentials for liquid simulations) force field also made its appearance in the 80s (OPLS-ua) and was initially intended for the simulation of liquid-state properties of water and organic liquids[84] (Figure 3). A distinct feature was the emphasis given to modelling the non-bonded interactions to fit liquid-state thermodynamic properties, like density and heat of vaporisation. The first release was a combined AMBER/OPLS force field[86] (AMBER ff94 was largely inspired by OPLS) and was later updated to the all-atom version OPLS-AA[85]. An update came out a few years later, in which side-chain potentials were reparameterized based on QM data of small peptides[87]. The major contribution from this effort was, and still is, the development of the water models TIP3P and TIP4P, which are heavily used to the present day. It was only very recently that new OPLS versions emerged, like OPLS2.1 and OPLS3, which are mostly optimised towards small molecules and protein-ligand binding studies[69].
Closing this section, we should refer to GROMOS (Groningen molecular simulation), which appeared in the 90s together with the homonymous computer simulation program, the most cited version being GROMOS96[186] (Figure 4). The newest release appeared a few years ago as GROMOS11 with the parameter sets 54A7[161] and now 54A8[155]. Its inferior performance compared to other force fields in comparative validation studies of peptide folding and dynamics prevented its broader use in the specific field of peptide simulations, and it will not be presented further here.
4.5. Water models for explicit representation of the solvent in peptide folding simulations
This review is focused on peptide folding simulations that comprise an explicit representation of the solvent. Therefore, the ability of the force fields to accurately describe the conformational landscape of the peptides is inextricably dependent on the ability of the associated water model to describe the physical properties of water molecules in the liquid phase. Numerous water potentials have been developed so far, but we shall only mention here the empirical potentials that are used in biomolecular simulations (Figure 5). A detailed presentation of all available water models and their comparative performance can be found elsewhere[30, 64, 73, 139, 142, 185, 189].
Their first appearance coincided with the early protein force field versions in the 80s, and they all share the idea of a rigid water monomer with the non-bonded interactions described by 3, 4, or 5 sites, composed of Coulombic terms for the intermolecular interaction pairs and a Lennard-Jones term for the oxygen atoms upon dimerization. Historically, two quite similar 3-site models were introduced in parallel in 1981. These were the TIPS (transferable intermolecular potential)[83] model and the SPC (simple point charge)[11] model, two of the most popular potentials for water molecules in molecular simulations even to date[153]. The main difference between the TIPS and SPC models is the tetrahedral geometry adopted by the latter and its ability to reproduce the experimental radial distribution function (including the second peak) and the self-diffusion coefficient. Later the SPC/E[10] model was released, which comprised an additional polarization correction leading to a better representation of the density, diffusion coefficient and dielectric constant in simulations of liquid water. Together these features made the SPC models more appropriate for modeling the bulk properties of water[121].
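The functional form shared by the rigid 3-site models described above (Coulombic terms between all intermolecular site pairs plus a single oxygen-oxygen Lennard-Jones term) can be sketched as follows. This is a minimal illustration, not a reference implementation: the SPC parameter values are the commonly quoted ones and are listed here as assumptions to be checked against the original publication, and the helper names and the planar placement of the monomers are hypothetical choices made for brevity.

```python
import math

COULOMB_CONSTANT = 332.0637  # kcal*Angstrom/(mol*e^2), common MD convention

# Commonly quoted SPC values, used here as assumptions for illustration.
SPC = {
    "q_oxygen": -0.82,    # partial charge on O (e)
    "q_hydrogen": 0.41,   # partial charge on each H (e)
    "r_oh": 1.0,          # O-H bond length (Angstrom)
    "angle_hoh": 109.47,  # H-O-H angle (degrees), tetrahedral
    "sigma": 3.166,       # O-O Lennard-Jones sigma (Angstrom)
    "epsilon": 0.1554,    # O-O Lennard-Jones well depth (kcal/mol)
}


def water_sites(model, oxygen_position):
    """Return [(charge, position)] for O, H1, H2 of a rigid monomer in the xy-plane."""
    half_angle = math.radians(model["angle_hoh"]) / 2.0
    ox, oy, oz = oxygen_position
    dx = model["r_oh"] * math.cos(half_angle)
    dy = model["r_oh"] * math.sin(half_angle)
    return [
        (model["q_oxygen"], (ox, oy, oz)),
        (model["q_hydrogen"], (ox + dx, oy + dy, oz)),
        (model["q_hydrogen"], (ox + dx, oy - dy, oz)),
    ]


def dimer_energy(sites_a, sites_b, model):
    """Interaction energy (kcal/mol) between two rigid 3-site monomers."""
    # Coulomb terms over all nine intermolecular site pairs.
    coulomb = sum(
        COULOMB_CONSTANT * qa * qb / math.dist(ra, rb)
        for qa, ra in sites_a
        for qb, rb in sites_b
    )
    # A single Lennard-Jones term between the two oxygen sites.
    r_oo = math.dist(sites_a[0][1], sites_b[0][1])
    sr6 = (model["sigma"] / r_oo) ** 6
    lennard_jones = 4.0 * model["epsilon"] * (sr6 ** 2 - sr6)
    return coulomb + lennard_jones


first = water_sites(SPC, (0.0, 0.0, 0.0))
second = water_sites(SPC, (3.0, 0.0, 0.0))
print(dimer_energy(first, second, SPC))
```

The 4- and 5-site models discussed in this section keep the same pairwise form and differ in where the charges are placed, which is why their cost grows with the number of site pairs.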
Later reparameterisation of TIPS, and the addition of a Lennard-Jones term to the hydrogen atoms as well, led to an improved representation of the density of liquid water and gave rise to what became the standard model for protein simulations, the TIP3P[84] model. Concurrently with TIP3P, its 4-site version was released, the TIP4P model, which also incorporated an additional interaction site on the bisector of the H-O-H angle, allowing a better electrostatic distribution. The TIP4P model provided a better description of most of water’s properties (like the phase diagram) but not of its dielectric constant. The increased computational cost (imposed by the larger number of interactions to be calculated) prohibited its wider use in the early peptide folding studies. The most crucial factor for the wider application of TIP3P, however, was its coupling to the vastly popular CHARMM22 force field. The 5-site version, TIP5P[117], appeared in 2000 and replaced the single negative charge on the bisector of TIP4P with two negative charges at the lone-pair positions. The truncated tetrahedral geometry gave an excellent representation of the structure of water, matching the density and radial distribution properties, but suffered in describing the gas phase. The increased complexity also significantly increased the computational requirements. As a result, the SPC and TIP4P models provided a satisfying description of liquid water’s structural and thermodynamic parameters at a much lower computational cost.
The scenery changed with the increase in the available computational power, but most importantly with the appearance of AMBER force field versions that gave more freedom in the choice of water model (RESP-fitted atomic charges). All aforementioned water models utilise truncated Coulombic interactions. An improved treatment of the long-range interactions with particle-mesh Ewald summation gave rise to the TIP4P-Ew[78] model, which improved the representation of liquid water’s thermodynamic parameters over a large temperature range without adding significantly to the computational cost. The parameterisation of the TIP4P/2005[1] model was carried out in the same spirit; it reproduces temperature-dependent properties but underestimates the dielectric constant. The TIP4P/ε[56] model is fine-tuned to match exactly that property over a large temperature range. An equivalent version is the TIP4Q[4] model, which includes an additional charge to increase the dipole moment, but at an additional computational expense compared to other TIP4P-based models. The common characteristic of these water models is the larger enthalpy of solvation, which leads to stronger interactions between the water and the embedded protein. It is for that reason that they have lately been employed in folding simulations of disordered, less hydrophobic peptides, facilitating a more accurate description of the expanded unfolded state. On the same front, the TIP3P-FB and TIP4P-FB[191] models were introduced by applying the so-called ForceBalance method, which combines reference data produced both experimentally and theoretically to derive parameters for force fields. The resultant models, and in particular TIP4P-FB, manage to reproduce most of the kinetic and thermodynamic properties of water, like the viscosity, diffusion coefficient, dielectric constant, radial distribution function and heat of vaporization, over a variety of temperatures and pressures.
The take-home message here is that there is no perfect water model that encapsulates all of the experimental physical properties of water. Furthermore, general users are highly restricted in their choice by the compatibility with the protein force field of choice and by whether the model is implemented in a given molecular mechanics program. For example, the molecular dynamics simulation engine NAMD[147] only supports TIP3P- and TIP4P-based models from the whole panel of non-polarizable water models. Gromacs[12, 184], on the other hand, supports SPC, SPC/E, TIP3P, TIP4P and TIP5P. It has thus become common knowledge in the field that the GROMOS force fields should preferably be paired with the SPC water models, that OPLS force fields are recommended to be used with TIP4P, and that the CHARMM family of force fields should be paired with the TIP3P model to avoid inconsistencies. AMBER force fields, especially the latest versions, are more versatile. Despite the relatively poor performance of some water models with respect to the description of the bulk solvent properties, their success in the folding of proteins and peptides appears to indicate that folding per se may not be as sensitive to the properties of bulk solvent. This finding is probably the result of the protein force fields having been developed with the available water models in mind, so that the deficiencies of the latter could be masked in the empirical parameterization of the force fields. Following the observation that the main deficiency of the current generation of force fields is the poor description of the protein-water interactions, an effort is underway to develop force fields that are parameterized in combination with the newly reported water models. In that spirit, TIP4P-D[148] was recently released in an effort to recapitulate the more extended conformational ensembles of the disordered state by increasing the strength of the dispersion interaction (LJ term) (see also section 6, The future). As is always the case with molecular simulation, it will be the extensive testing by the community that will define which of these models will survive the acid test of a quantitative comparison with the experimental data, including the ability to accurately describe the kinetics and thermodynamics of peptide folding.
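The conventional force field and water model pairings summarised in this section can be captured in a small lookup, for instance as a sanity check before setting up a simulation. The sketch below is purely illustrative: the table only encodes the pairings stated in the text, and the dictionary and function names are hypothetical rather than part of any simulation package.

```python
# Pairings as stated in the text; AMBER is deliberately left out because
# its recent versions are described as more versatile.
RECOMMENDED_WATER = {
    "GROMOS": ("SPC", "SPC/E"),
    "OPLS": ("TIP4P",),
    "CHARMM": ("TIP3P",),
}


def check_pairing(force_field_family, water_model):
    """Return True/False for a listed family, or None when no convention is listed."""
    family = force_field_family.upper()
    if family not in RECOMMENDED_WATER:
        return None
    return water_model.upper() in RECOMMENDED_WATER[family]


print(check_pairing("CHARMM", "TIP3P"))
print(check_pairing("GROMOS", "TIP3P"))
```

A return value of None signals that the text gives no single convention for that family, in which case the choice should follow the recommendations published with the specific force field version.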
5. The unceasing quest for a balanced biomolecular force field
The previous sections clearly demonstrated that one of the main contributions of peptide simulations is their continuing usage as test systems for the optimisation, correction and validation of empirical force fields[5, 61, 107, 123, 175]. One of the recurring challenges through all these studies has been to convincingly quantify force field accuracy, especially given that the relatively limited simulation timescales attainable are insufficient to guarantee a faithful configurational sampling of the peptides’ folding landscapes[51]. It soon became clear in the literature that the initial force field versions from all families were suffering from helical bias[13, 51, 53, 62, 79, 123, 137, 165, 178, 204]. A common remedy for the helical bias of the early force fields was to choose one with a bias towards the major secondary structure of the protein under study. This was obviously highly unsatisfactory and indeed proved ineffective, as demonstrated by the folding simulations of villin and the Pin1 WW domain mentioned earlier with the CHARMM22/CMAP force field[115, 116]. In these studies, villin folded to its native structure[54] but the WW domain, whose native structure is a 3-stranded β-sheet, did not[52]. The reason, of course, was that as far as the force field was concerned, a misfolded conformation had lower free energy than the native structure[47, 53]. Studies such as these highlight how small force field inaccuracies can have an additive effect and can result in stabilising misfolded non-native structures.
AMBER’s ff99SB was the first successful version to convincingly compensate for that and displayed superior performance over the rest of the force fields of its time: the literature consistently provided overwhelming evidence of a significantly improved balance of secondary structural elements and reasonable agreement with experimental data through multiple comparative validation studies, ranging from short glycine and alanine peptides to longer 10-20 residue peptides (disordered, helical, beta) up to small proteins like lysozyme and ubiquitin[8, 58, 79, 92, 94, 99, 102, 123, 144, 145, 168, 169, 197, 198]. Follow-up studies suggested two more force fields, ff99SB-ILDN-φ and ff99SB-ILDN-NMR, with significantly improved agreement with NMR observables, highlighting though that large uncertainties accompany this agreement: the calculated statistical errors in the validation were within the margin of the systematic experimental errors of these parameters[9].
The newer versions that surfaced, ff03/ff03* and ff99SB*-ILDN from AMBER, and CHARMM22/CMAP and CHARMM22* from CHARMM, all succeeded in predicting the native states and folding rates (for example in the case of the model protein villin headpiece), but displayed distinct discrepancies in the folding mechanism and in the description of the unfolded state in particular, with ff99SB*-ILDN and CHARMM22* being closer to the experiments[150]. CHARMM22* showed superior performance over its counterparts in the CHARMM family in describing the free energy landscape of the β-hairpin GB1[71]. Similarly, ff99SB and its variants (ff99SB*, ff99SB-ILDN, ff99SB*-ILDN), ff03, ff03* and GROMOS96 (43a1p, 53a6) succeeded in finding the residual β-hairpin preference of the disordered 16-mer Nrf2-derived peptide, but with extremely spurious results that were temperature dependent[29]. In accordance, a parallel study with ff99SB-ILDN, ff99SB*-ILDN, CHARMM22/CMAP and CHARMM22* provided an accurate description of the native state of ubiquitin, GB3, villin and the WW domain, but the force fields with better helix-coil balance, ff99SB*-ILDN, ff03* and CHARMM22*, had superior performance in the peptide systems[107]. Our own recent work also lends further support to the view that, for structurally challenging peptide systems, ff99SB*-ILDN outperforms its counterparts in the AMBER 99SB family[3, 60, 162].
To summarise, our take on the current literature is that, from the set of all extensively tested and validated force fields, there are nowadays versions from both the AMBER and CHARMM families that have been shown to be balanced and well-performing in a wide variety of tested systems, including cases of intrinsically disordered peptide systems, like the RS repeats[154]. All that being said, we present in the next section our views concerning the remaining challenges towards a universally useful biomolecular force field.
6. The future
We believe that a bird’s eye view of the recent literature leaves little doubt: peptide simulations followed the rest of the molecular dynamics field in making tremendous progress towards accurate and physically relevant simulations. The development of new algorithms, the production and validation of new robust and transferable force fields, together with the increase of computational power available to research groups, completely transformed the field. Indeed, it is no longer an exaggeration to state that we have reached the point where molecular simulations can faithfully predict and reproduce the peptides’ native states and folding rates, and in exceptional cases even provide an atomistic description of the folding mechanism in unprecedented detail, often unreachable by experiments. Having said that, there are still a significant number of unresolved issues and deficiencies, some of which are discussed in the paragraphs that follow.
Peptides with weak structural propensities are difficult systems to track computationally. The major challenge here is to describe, without overestimating, the residual (if any) secondary structure present in the unfolded state, whilst retaining the folding performance for the folders. Some recently reported issues concern the observed temperature dependence of the conformational preferences of peptides and the weak folding cooperativity, whose combination results in a relatively poor description of the kinetics and thermodynamics[33]. As discussed in the previous section on water potentials, the source of the “overcompactness” of simulated ensembles compared to the experimental observables (SAXS, FRET, Rg) appears to be the inaccurate description of the solvation of the unfolded chain, possibly due to a systematic underestimation of the dispersion parameter in the protein-water interactions[18, 72, 148]. A number of approaches have been proposed to remedy this, for example applying a scaling factor to the van der Waals interaction parameter[18], or re-parameterizing the Lennard-Jones parameter for water’s oxygen (TIP4P-D water model[148]) or hydrogen (mTIP3P)[82]. These approaches improved the agreement with the experimental observables for some peptide cases but not for others, rendering them not universally applicable. For example, recent studies showed that the more polar TIP3P model favours more compact structures[176]. On the other hand, the treatment of the protein-water interactions by the TIP4P-Ew and TIP4P/2005 water models eases the hydrophobically-driven collapse and improves sampling of the disordered state for peptides like Trp-cage and GB1[17], with TIP4P/2005 giving less collapsed conformations[135]. The latest of the released water models, TIP4P-D, gave a better match to the experimental Rg for a set of disordered peptides[148].
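The first remedy listed above, a scaling factor applied to the protein-water van der Waals interaction, can be sketched with the Lorentz-Berthelot combination rules used by AMBER-type force fields. This is a minimal illustration under stated assumptions: the `scale` argument and both function names are hypothetical, and no particular published value of the scaling factor is implied.

```python
import math


def protein_water_lj(sigma_protein, eps_protein, sigma_water, eps_water, scale=1.0):
    """Lorentz-Berthelot pair parameters with a scaled well depth.

    scale = 1.0 recovers the unmodified combination rule; values above 1.0
    strengthen the protein-water dispersion attraction (hypothetical knob).
    """
    sigma_ij = 0.5 * (sigma_protein + sigma_water)           # arithmetic mean
    eps_ij = scale * math.sqrt(eps_protein * eps_water)       # scaled geometric mean
    return sigma_ij, eps_ij


def lennard_jones(r, sigma, eps):
    """12-6 Lennard-Jones energy at separation r."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (sr6 ** 2 - sr6)
```

Because only the cross term between protein and water atoms is scaled, the protein-protein and water-water interactions, and therefore the bulk properties of the water model, are left untouched. Re-parameterizing the water oxygen itself, as in TIP4P-D, instead changes the water-water interactions as well.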
It has thus been realized that a new direction for further improvement of the force fields is the one aiming towards an accurate description of the solvation effect and a better accounting for the long-range interactions. It is probably not surprising that the newest of the empirical protein force field versions are adapting to new water models that provide a better description of the overall physical properties of the solvent. However, obtaining a balance between the water model and the protein force field is not a trivial exercise, especially with respect to peptide folding and dynamics. The bulk properties of the solvent are experimentally tractable and thus it is straightforward to validate the available water models on how well they describe the solvent’s molecular properties. However, the impact of the interactions between the solvent and the biomolecules on the conformation of the latter is still poorly understood and needs further exploration. A much different approach has been introduced with a force field based on the Kirkwood-Buff theory (KBFF)[152], which derives force field parameters that reproduce the microscopic characteristics of the solution in order to better describe the protein-water interactions. This force field was found to compensate for the overcompactness and match the experimentally found dimensions of disordered peptides[128], though still at the expense of partially unfolding peptides like GB1[129]. However, this was partially rescued by increasing the interaction cut-off to strengthen the contribution of the long-range interactions, as in the case of AMBER99SB*-ILDN with the TIP4P-D water model for Nup peptides[148].
Polarizable force fields might be an alternative route, as they account for environmental effects. Their main asset is the ability to describe changes of the charge distribution upon conformational changes of the molecules. This can be achieved via fluctuating charge models, Drude oscillator-based models[101], inducible dipole models or multipole electrostatics, as in AMOEBA[166], all of which are detailed in other reviews[6, 67]. Their general applicability is hindered by the large computational burden of the additional electrostatic force calculations, with simulations barely reaching the 100 ns timescale even for small peptide systems.
The difficulty in approaching all those unresolved matters in a meaningful way is that, on the one hand, the disordered state is heavily undersampled computationally and, on the other hand, complex experimental data of disordered peptides can have ambiguous interpretations. We believe that peptides have served, and will continue to serve, as an acid test for the ability of molecular simulations and force fields to reproduce the physical reality: small deficiencies in the force field parameterisation can be shielded in the larger well-folded protein context. Further improvements could be of substantial importance towards more accurate modelling of disease-related issues, like mutation effects, misfolding, aggregation or alternative enzyme conformations that affect drug binding. We hope that this review of the literature regarding molecular simulations of peptides and associated force fields will help readers to wisely select a force field suitable for a given application in order to achieve high quality results.
7. Author Information
Corresponding Author
*Phone: +30-25510-30620. Fax: +30-25510-30620. Web page: https://utopia.duth.gr/glykos/. E-mail: [email protected].
ORCID
Panagiota S. Georgoulia: 0000-0003-4573-8052
Nicholas M. Glykos: 0000-0003-3782-206X
Notes
The authors declare no competing financial interest.
Figure 1: CHARMM family of protein force fields. Circled tree map representation of the various protein-related force fields developed to the present time. The size and colour of each circle reflect the number of reported citations. Genealogically related force fields are encircled in the same hierarchical level, which is represented by the thickness of the lines. Force fields are ordered chronologically from left to right by year of appearance.
Figure 2: AMBER family of protein force fields. Circled tree map representation of the various protein force fields developed to present time as presented in Figure 1.
Figure 3: OPLS family of protein force fields. Circled tree map representation of the various protein force fields developed to present time as presented in Figure 1.
Figure 4: GROMOS family of protein force fields. Circled tree map representation of the various protein force fields developed to present time as presented in Figure 1.
Figure 5: Water models for biomolecular simulations. Circled tree map representation of the water potentials used in simulations of proteins and peptides. The size and colour of the circles follow their popularity, expressed as the number of citations. Models that can be used in conjunction with a specific family of force fields are encircled together, with consecutive levels of hierarchy represented by the line colour.
References [1] Abascal, J. L., and Vega, C. A general purpose model for the condensed phases of water: Tip4p/2005. The Journal of chemical physics 123, 23 (2005),
710
234505.
711
RI PT
709
´ll, S., Smith, J. C., [2] Abraham, M. J., Murtola, T., Schulz, R., Pa Hess, B., and Lindahl, E. Gromacs: High performance molecular simulations
713
through multi-level parallelism from laptops to supercomputers. SoftwareX 1
714
(2015), 19–25.
SC
712
[3] Adamidou, T., Arvaniti, K.-O., and Glykos, N. M. Folding simulations of
716
a nuclear receptor box-containing peptide demonstrate the structural persistence
717
of the lxxll motif even in the absence of its cognate receptor. The Journal of
718
Physical Chemistry B 122, 1 (2017), 106–116.
719
M AN U
715
[4] Alejandre, J., Chapela, G. A., Saint-Martin, H., and Mendoza, N. A non-polarizable model of water that yields the dielectric constant and the
721
density anomalies of the liquid: Tip4q. Physical Chemistry Chemical Physics 13,
722
44 (2011), 19728–19740.
D
720
[5] Aliev, A. E., and Courtier-Murias, D. Experimental verification of force
724
fields for molecular dynamics simulations using gly-pro-gly-gly. The Journal of
725
Physical Chemistry B 114, 38 (2010), 12358–12375.
EP
TE
723
[6] Baker, C. M. Polarizable force fields for molecular dynamics simulations of
727
biomolecules. Wiley Interdisciplinary Reviews: Computational Molecular Science
728
729 730
AC C
726
5, 2 (2015), 241–254.
[7] Baker, D., and Sali, A. Protein structure prediction and structural genomics. Science 294, 5540 (2001), 93–96.
731
[8] Baltzis, A. S., and Glykos, N. M. Characterizing a partially ordered
732
miniprotein through folding molecular dynamics simulations: Comparison with
733
the experimental data. Protein Science 25, 3 (2016), 587–596.
734
[9] Beauchamp, K. A., Lin, Y.-S., Das, R., and Pande, V. S. Are protein
735
force fields getting better? a systematic benchmark on 524 diverse nmr measure-
736
ments. Journal of chemical theory and computation 8, 4 (2012), 1409–1414. 29
ACCEPTED MANUSCRIPT
737
[10] Berendsen, H., Grigera, J., and Straatsma, T. The missing term in effective pair potentials. Journal of Physical Chemistry 91, 24 (1987), 6269–6271.
739
[11] Berendsen, H. J., Postma, J. P., van Gunsteren, W. F., and Hermans,
RI PT
738
740
J. Interaction models for water in relation to protein hydration. In Intermolecular
741
forces. Springer, 1981, pp. 331–342.
[12] Berendsen, H. J., van der Spoel, D., and van Drunen, R. Gromacs: a
743
message-passing parallel molecular dynamics implementation. Computer physics
744
communications 91, 1-3 (1995), 43–56.
746
747 748
[13] Best, R. B., Buchete, N.-V., and Hummer, G. Are current molecular
M AN U
745
SC
742
dynamics force fields too helical? Biophysical journal 95, 1 (2008), L07–L09. [14] Best, R. B., de Sancho, D., and Mittal, J. Residue-specific α-helix propensities from molecular simulation. Biophysical journal 102, 6 (2012), 1462–1467. [15] Best, R. B., and Hummer, G. Optimized molecular dynamics force fields
750
applied to the helix- coil transition of polypeptides. The journal of physical
751
chemistry B 113, 26 (2009), 9004–9015. [16] Best, R. B., and Mittal, J. Protein simulations with an optimized water
TE
752
D
749
model: cooperative helix formation and temperature-induced unfolded state col-
754
lapse. The Journal of Physical Chemistry B 114, 46 (2010), 14916–14923.
EP
753
[17] Best, R. B., and Mittal, J. Free-energy landscape of the gb1 hairpin in
756
all-atom explicit solvent simulations with different force fields: Similarities and
757
differences. Proteins: Structure, Function, and Bioinformatics 79, 4 (2011), 1318–
758
AC C
755
1328.
759
[18] Best, R. B., Zheng, W., and Mittal, J. Balanced protein–water interac-
760
tions improve properties of disordered proteins and non-specific protein associa-
761
tion. Journal of chemical theory and computation 10, 11 (2014), 5113–5124.
762
[19] Best, R. B., Zhu, X., Shim, J., Lopes, P. E., Mittal, J., Feig, M., and
763
MacKerell Jr, A. D. Optimization of the additive charmm all-atom protein
764
force field targeting improved sampling of the backbone φ, ψ and side-chain χ1
765
and χ2 dihedral angles. Journal of chemical theory and computation 8, 9 (2012),
766
3257–3273. 30
ACCEPTED MANUSCRIPT
[20] Bieri, O., Wirz, J., Hellrung, B., Schutkowski, M., Drewello, M.,
768
and Kiefhaber, T. The speed limit for protein folding measured by triplet–
769
triplet energy transfer. Proceedings of the National Academy of Sciences 96, 17
770
(1999), 9597–9601.
RI PT
767
[21] Bonneau, R., and Baker, D. Ab initio protein structure prediction: progress
772
and prospects. Annual review of biophysics and biomolecular structure 30, 1
773
(2001), 173–189.
SC
771
[22] Borhani, D. W., and Shaw, D. E. The future of molecular dynamics simula-
775
tions in drug discovery. Journal of computer-aided molecular design 26, 1 (2012),
776
15–26.
M AN U
774
777
[23] Brooks, B. R., Bruccoleri, R. E., Olafson, B. D., States, D. J.,
778
Swaminathan, S. a., and Karplus, M. Charmm: a program for macromolec-
779
ular energy, minimization, and dynamics calculations. Journal of computational
780
chemistry 4, 2 (1983), 187–217.
D
782
[24] Brooks, C. L. Protein and peptide folding explored with molecular simulations. Accounts of chemical research 35, 6 (2002), 447–454.
TE
781
[25] Buck, M., Bouguet-Bonnet, S., Pastor, R. W., and MacKerell Jr,
784
A. D. Importance of the cmap correction to the charmm22 protein force field:
785
dynamics of hen lysozyme. Biophysical journal 90, 4 (2006), L36–L38.
787
788 789 790 791
[26] Burger, V., Gurry, T., and Stultz, C. Intrinsically disordered proteins: Where computation meets experiment. Polymers 6, 10 (2014), 2684–2719.
AC C
786
EP
783
[27] Cellmer, T., Buscaglia, M., Henry, E. R., Hofrichter, J., and Eaton, W. A. Making connections between ultrafast protein folding kinetics and molecular dynamics simulations. Proceedings of the National Academy of
Sciences (2011).
792
[28] Chowdhury, S., Lee, M. C., Xiong, G., and Duan, Y. Ab initio folding
793
simulation of the trp-cage mini-protein approaches nmr resolution. Journal of
794
molecular biology 327, 3 (2003), 711–717.
31
ACCEPTED MANUSCRIPT
[29] Cino, E. A., Choy, W.-Y., and Karttunen, M. Comparison of secondary
796
structure formation using 10 different force fields in microsecond molecular dy-
797
namics simulations. Journal of chemical theory and computation 8, 8 (2012),
798
2725–2740.
RI PT
795
[30] Cisneros, G. A., Wikfeldt, K. T., Ojamae, L., Lu, J., Xu, Y., Tora-
800
bifard, H., Bartok, A. P., Csanyi, G., Molinero, V., and Paesani, F.
801
Modeling molecular interactions in water: From pairwise to many-body potential
802
energy functions. Chemical reviews 116, 13 (2016), 7501–7528.
803
SC
799
[31] Cornell, W. D., Cieplak, P., Bayly, C. I., Gould, I. R., Merz, K. M., Ferguson, D. M., Spellmeyer, D. C., Fox, T., Caldwell, J. W., and
805
Kollman, P. A. A second generation force field for the simulation of proteins,
806
nucleic acids, and organic molecules. Journal of the American Chemical Society
807
117, 19 (1995), 5179–5197.
M AN U
804
[32] Daura, X., Gademann, K., Jaun, B., Seebach, D., Van Gunsteren,
809
W. F., and Mark, A. E. Peptide folding: when simulation meets experiment.
810
Angewandte Chemie International Edition 38, 1-2 (1999), 236–240.
D
808
[33] Day, R., Paschek, D., and Garcia, A. E. Microsecond simulations of
812
the folding/unfolding thermodynamics of the trp-cage miniprotein. Proteins:
813
Structure, Function, and Bioinformatics 78, 8 (2010), 1889–1899.
TE
811
[34] Debiec, K. T., Cerutti, D. S., Baker, L. R., Gronenborn, A. M.,
815
Case, D. A., and Chong, L. T. Further along the road less traveled: Amber
816
ff15ipq, an original protein force field built on a self-consistent physical model.
AC C
817
EP
814
Journal of chemical theory and computation 12, 8 (2016), 3926–3947.
818
[35] Doniach, S., and Eastman, P. Protein dynamics simulations from nanosec-
819
onds to microseconds. Current opinion in structural biology 9, 2 (1999), 157–163.
820
[36] Doshi, U., and Hamelberg, D. Towards fast, rigorous and efficient confor-
821
mational sampling of biomolecules: Advances in accelerated molecular dynamics.
822
Biochimica et Biophysica Acta (BBA)-General Subjects 1850, 5 (2015), 878–888.
823
[37] Dror, R. O., Dirks, R. M., Grossman, J., Xu, H., and Shaw, D. E.
824
Biomolecular simulation: a computational microscope for molecular biology. An-
825
nual review of biophysics 41 (2012), 429–452. 32
ACCEPTED MANUSCRIPT
[38] Duan, Y., and Kollman, P. A. Pathways to a protein folding intermediate
827
observed in a 1-microsecond simulation in aqueous solution. Science 282, 5389
828
(1998), 740–744.
829
RI PT
826
[39] Duan, Y., Wu, C., Chowdhury, S., Lee, M. C., Xiong, G., Zhang, W., Yang, R., Cieplak, P., Luo, R., Lee, T., et al. A point-charge force
831
field for molecular mechanics simulations of proteins based on condensed-phase
832
quantum mechanical calculations. Journal of computational chemistry 24, 16
833
(2003), 1999–2012.
SC
830
[40] Dunker, A. K., Lawson, J. D., Brown, C. J., Williams, R. M.,
835
Romero, P., Oh, J. S., Oldfield, C. J., Campen, A. M., Ratliff, C. M.,
836
Hipps, K. W., et al. Intrinsically disordered protein. Journal of molecular
837
graphics and modelling 19, 1 (2001), 26–59.
839
840
[41] Durrant, J. D., and McCammon, J. A. Molecular dynamics simulations and drug discovery. BMC biology 9, 1 (2011), 71.
[42] Eaton, W. A., Munoz, V., Hagen, S. J., Jas, G. S., Lapidus, L. J.,
D
838
M AN U
834
Henry, E. R., and Hofrichter, J. Fast kinetics and mechanisms in protein
842
folding. Annual review of biophysics and biomolecular structure 29, 1 (2000),
843
327–359.
TE
841
[43] Eaton, W. A., Munoz, V., Thompson, P. A., Henry, E. R., and
845
Hofrichter, J. Kinetics and dynamics of loops, α-helices, β-hairpins, and
846
fast-folding proteins. Accounts of Chemical Research 31, 11 (1998), 745–753.
AC C
EP
844
847
[44] Edwards, C., Cohen, M., and Bloom, S. Peptides as drugs, 1999.
848
[45] Ensign, D. L., Kasson, P. M., and Pande, V. S. Heterogeneity even at
849
the speed limit of folding: large-scale molecular dynamics study of a fast-folding
850
variant of the villin headpiece. Journal of molecular biology 374, 3 (2007), 806–
851
816.
852
[46] Ensign, D. L., and Pande, V. S. The fip35 ww domain folds with structural
853
and mechanistic heterogeneity in molecular dynamics simulations. Biophysical
854
journal 96, 8 (2009), L53–L55.
33
ACCEPTED MANUSCRIPT
[47] Faver, J. C., Benson, M. L., He, X., Roberts, B. P., Wang, B., Mar-
856
shall, M. S., Sherrill, C. D., and Merz Jr, K. M. The energy computa-
857
tion paradox and ab initio protein folding. PloS one 6, 4 (2011), e18868.
RI PT
855
858
[48] Feng, C.-J., Dhayalan, B., and Tokmakoff, A. Refinement of peptide
859
conformational ensembles by 2d ir spectroscopy: Application to ala–ala–ala. Bio-
860
physical Journal 114, 12 (2018), 2820–2832.
[49] Ferrara, P., Apostolakis, J., and Caflisch, A. Thermodynamics and
862
kinetics of folding of two model peptides investigated by molecular dynamics
863
simulations. The Journal of Physical Chemistry B 104, 20 (2000), 5000–5010.
M AN U
SC
861
864
[50] Ferrara, P., and Caflisch, A. Folding simulations of a three-stranded
865
antiparallel β-sheet peptide. Proceedings of the National Academy of Sciences
866
97, 20 (2000), 10780–10785.
867 868
[51] Freddolino, P. L., Harrison, C. B., Liu, Y., and Schulten, K. Challenges in protein-folding simulations. Nature physics 6, 10 (2010), 751. [52] Freddolino, P. L., Liu, F., Gruebele, M., and Schulten, K. Ten-
870
microsecond molecular dynamics simulation of a fast-folding ww domain. Bio-
871
physical journal 94, 10 (2008), L75–L77.
873
TE
[53] Freddolino, P. L., Park, S., Roux, B., and Schulten, K. Force field
EP
872
D
869
bias in protein folding simulations. Biophysical journal 96, 9. [54] Freddolino, P. L., and Schulten, K. Common structural transitions in
875
explicit-solvent simulations of villin headpiece folding. Biophysical journal 97, 8
876
AC C
874
(2009), 2338–2347.
877
[55] Friedrichs, M. S., Eastman, P., Vaidyanathan, V., Houston, M.,
878
Legrand, S., Beberg, A. L., Ensign, D. L., Bruns, C. M., and Pande,
879
V. S. Accelerating molecular dynamic simulation on graphics processing units.
880
Journal of computational chemistry 30, 6 (2009), 864–872.
881
[56] Fuentes-Azcatl, R., and Alejandre, J. Non-polarizable force field of water
882
based on the dielectric constant: Tip4p/ε. The Journal of Physical Chemistry B
883
118, 5 (2014), 1263–1272.
34
ACCEPTED MANUSCRIPT
[57] Garcia, A. E., and Sanbonmatsu, K. Y. α-helical stabilization by side chain
885
shielding of backbone hydrogen bonds. Proceedings of the National Academy of
886
Sciences 99, 5 (2002), 2782–2787.
RI PT
884
887
[58] Georgoulia, P. S., and Glykos, N. M. Using j-coupling constants for force
888
field validation: application to hepta-alanine. The Journal of Physical Chemistry
889
B 115, 51 (2011), 15221–15227.
[59] Georgoulia, P. S., and Glykos, N. M. On the foldability of tryptophan-
891
containing tetra-and pentapeptides: an exhaustive molecular dynamics study.
893
The Journal of Physical Chemistry B 117, 18 (2013), 5522–5532.
M AN U
892
SC
890
[60] Georgoulia, P. S., and Glykos, N. M. Folding molecular dynamics simula-
894
tion of a gp41-derived peptide reconcile divergent structure determinations. ACS
895
Omega 3, 11 (2018), 14746–14754.
[61] Gnanakaran, S., and Garcia, A. E. Validation of an all-atom protein force
897
field: from dipeptides to larger peptides. The Journal of Physical Chemistry B
898
107, 46 (2003), 12555–12557.
D
896
[62] Gnanakaran, S., and Garc´ıa, A. E. Helix-coil transition of alanine peptides
900
in water: Force field dependence on the folded and unfolded structures. Proteins:
901
Structure, Function, and Bioinformatics 59, 4 (2005), 773–782.
EP
902
TE
899
[63] Gnanakaran, S., Nymeyer, H., Portman, J., Sanbonmatsu, K. Y., and Garcıa, A. E. Peptide folding simulations. Current opinion in structural
904
biology 13, 2 (2003), 168–174.
AC C
903
905
[64] Guillot, B. A reappraisal of what we have learnt during three decades of
906
computer simulations on water. Journal of Molecular Liquids 101, 1-3 (2002),
907
219–260.
908
[65] Guvench, O., and MacKerell, A. D. Comparison of protein force fields
909
for molecular dynamics simulations. In Molecular modeling of proteins. Springer,
910
2008, pp. 63–88.
911 912
[66] Hagen, S. J., Hofrichter, J., Szabo, A., and Eaton, W. A. Diffusionlimited contact formation in unfolded cytochrome c: estimating the maximum
35
ACCEPTED MANUSCRIPT
913
rate of protein folding. Proceedings of the National Academy of Sciences 93, 21
914
(1996), 11615–11617.
916
[67] Halgren, T. A., and Damm, W. Polarizable force fields. Current opinion in structural biology 11, 2 (2001), 236–242.
RI PT
915
[68] Hamelberg, D., Mongan, J., and McCammon, J. A. Accelerated molecu-
918
lar dynamics: a promising and efficient simulation method for biomolecules. The
919
Journal of chemical physics 120, 24 (2004), 11919–11929.
SC
917
[69] Harder, E., Damm, W., Maple, J., Wu, C., Reboul, M., Xiang, J. Y.,
921
Wang, L., Lupyan, D., Dahlgren, M. K., Knight, J. L., et al. Opls3:
922
a force field providing broad coverage of drug-like small molecules and proteins.
923
Journal of chemical theory and computation 12, 1 (2015), 281–296.
924
M AN U
920
[70] Hatfield, M. P., Murphy, R. F., and Lovas, S. Vcd spectroscopic proper-
925
ties of the β-hairpin forming miniprotein cln025 in various solvents. Biopolymers
926
93, 5 (2010), 442–450.
[71] Hazel, A. J., Walters, E. T., Rowley, C. N., and Gumbart, J. C.
D
927
Folding free energy landscapes of β-sheets with non-polarizable and polarizable
929
charmm force fields. The Journal of Chemical Physics 149, 7 (2018), 072317.
930
TE
928
[72] Henriques, J., Cragnell, C., and Skepo, M. Molecular dynamics simulations of intrinsically disordered proteins: force field evaluation and comparison
932
with experiment. Journal of chemical theory and computation 11, 7 (2015), 3420–
933
3431.
AC C
EP
931
934
[73] Hess, B., and van der Vegt, N. F. Hydration thermodynamic properties of
935
amino acid analogues: a systematic comparison of biomolecular force fields and
936
water models. The Journal of Physical Chemistry B 110, 35 (2006), 17616–17626.
937
[74] Higo, J., Ito, N., Kuroda, M., Ono, S., Nakajima, N., and Nakamura,
938
H. Energy landscape of a peptide consisting of α-helix, 310-helix, β-turn, β-
939
hairpin, and other disordered conformations. Protein Science 10, 6 (2001), 1160–
940
1171.
941 942
[75] Ho, B. K., and Dill, K. A. Folding very short peptides using molecular dynamics. PLoS computational biology 2, 4 (2006), e27. 36
ACCEPTED MANUSCRIPT
944
945 946
947
[76] Hoang Viet, M., Derreumaux, P., and Nguyen, P. H. Communication: Multiple atomistic force fields in a single enhanced sampling simulation, 2015. [77] Honda, S., Yamasaki, K., Sawada, Y., and Morii, H. 10 residue folded
RI PT
943
peptide designed by segment statistics. Structure 12, 8 (2004), 1507–1518. [78] Horn, H. W., Swope, W. C., Pitera, J. W., Madura, J. D., Dick, T. J., Hura, G. L., and Head-Gordon, T. Development of an improved
949
four-site water model for biomolecular simulations: Tip4p-ew. The Journal of
950
chemical physics 120, 20 (2004), 9665–9678.
SC
948
[79] Hornak, V., Abel, R., Okur, A., Strockbine, B., Roitberg, A., and
952
Simmerling, C. Comparison of multiple amber force fields and development
953
of improved protein backbone parameters. Proteins: Structure, Function, and
954
Bioinformatics 65, 3 (2006), 712–725.
955
M AN U
951
[80] Huang, J., and MacKerell, A. D. Force field development and simulations of intrinsically disordered proteins. Current opinion in structural biology
957
48 (2018), 40–48.
D
956
[81] Huang, J., and MacKerell Jr, A. D. Charmm36 all-atom additive protein
959
force field: Validation based on comparison to nmr data. Journal of computa-
960
tional chemistry 34, 25 (2013), 2135–2145.
EP
TE
958
963
proved force field for folded and intrinsically disordered proteins. nature methods
964
AC C
962
[82] Huang, J., Rauscher, S., Nawrocki, G., Ran, T., Feig, M., de Groot, ¨ller, H., and MacKerell Jr, A. D. Charmm36m: an imB. L., Grubmu
961
14, 1 (2016), 71.
965
[83] Jorgensen, W. L. Quantum and statistical mechanical studies of liquids. 10.
966
transferable intermolecular potential functions for water, alcohols, and ethers.
967
application to liquid water. Journal of the American Chemical Society 103, 2
968
(1981), 335–340.
969
[84] Jorgensen, W. L., Chandrasekhar, J., Madura, J. D., Impey, R. W.,
970
and Klein, M. L. Comparison of simple potential functions for simulating
971
liquid water. The Journal of chemical physics 79, 2 (1983), 926–935.
37
ACCEPTED MANUSCRIPT
[85] Jorgensen, W. L., Maxwell, D. S., and Tirado-Rives, J. Development
973
and testing of the opls all-atom force field on conformational energetics and
974
properties of organic liquids. Journal of the American Chemical Society 118, 45
975
(1996), 11225–11236.
RI PT
972
[86] Jorgensen, W. L., and Tirado-Rives, J. The opls [optimized potentials
977
for liquid simulations] potential functions for proteins, energy minimizations for
978
crystals of cyclic peptides and crambin. Journal of the American Chemical Society
979
110, 6 (1988), 1657–1666.
980
SC
976
[87] Kaminski, G. A., Friesner, R. A., Tirado-Rives, J., and Jorgensen, W. L. Evaluation and reparametrization of the opls-aa force field for proteins
982
via comparison with accurate quantum chemical calculations on peptides. The
983
Journal of Physical Chemistry B 105, 28 (2001), 6474–6487.
984
M AN U
981
[88] Kamiya, N., Higo, J., and Nakamura, H. Conformational transition states of a β-hairpin peptide between the ordered and disordered conformations in ex-
986
plicit water. Protein Science 11, 10 (2002), 2297–2307.
988
989
[89] Keller, T. H., Pichota, A., and Yin, Z. A practical view of druggability. Current opinion in chemical biology 10, 4 (2006), 357–361.
TE
987
D
985
[90] Kier, B. L., and Andersen, N. H. Probing the lower size limit for protein-like fold stability: ten-residue microproteins with specific, rigid structures in water.
991
Journal of the American Chemical Society 130, 44 (2008), 14675–14683.
993
[91] Kliger, Y. Computational approaches to therapeutic peptide discovery. Peptide
AC C
992
EP
990
Science 94, 6 (2010), 701–710.
994
[92] Koller, A. N., Schwalbe, H., and Gohlke, H. Starting structure depen-
995
dence of nmr order parameters derived from md simulations: implications for
996
997
judging force-field quality. Biophysical journal 95, 1 (2008), L04–L06.
[93] Kollman, P., Dixon, R., Cornell, W., Fox, T., Chipot, C., and Po-
998
horille, A. The development/application of a minimalistorganic/biochemical
999
molecular mechanic force field using a combination of ab initio calculations and
1000
experimental data. In Computer simulation of biomolecular systems. Springer,
1001
1997, pp. 83–96. 38
ACCEPTED MANUSCRIPT
[94] Koukos, P. I., and Glykos, N. M. Folding molecular dynamics simulations
1003
accurately predict the effect of mutations on the stability and structure of a
1004
vammin-derived peptide. The Journal of Physical Chemistry B 118, 34 (2014),
1005
10076–10084.
RI PT
1002
[95] Kubelka, J., Henry, E. R., Cellmer, T., Hofrichter, J., and Eaton,
1007
W. A. Chemical, physical, and theoretical kinetics of an ultrafast folding protein.
1008
Proceedings of the National Academy of Sciences 105, 48 (2008), 18655–18662.
1010
1011
[96] Kubelka, J., Hofrichter, J., and Eaton, W. A. The protein folding speed limit. Current opinion in structural biology 14, 1 (2004), 76–88.
M AN U
1009
SC
1006
[97] Kuhlman, B., and Baker, D. Exploring folding free energy landscapes using
1012
computational protein design. Current opinion in structural biology 14, 1 (2004),
1013
89–95.
¨hrova ´, P., De Simone, A., Otyepka, M., and Best, R. B. Force-field [98] Ku
1015
dependence of chignolin folding and misfolding: comparison with experiment and
1016
redesign. Biophysical journal 102, 8 (2012), 1897–1906.
1017
D
1014
[99] L. Fawzi, N., Phillips, A. H., Ruscio, J. Z., Doucleff, M., Wemmer, D. E., and Head-Gordon, T. Structure and dynamics of the aβ21–30 peptide
1019
from the interplay of nmr experiments and molecular simulations. Journal of the
1020
American Chemical Society 130, 19 (2008), 6145–6158.
1023 1024 1025
1026
EP
1022
[100] Laio, A., and Parrinello, M. Escaping free-energy minima. Proceedings of the National Academy of Sciences 99, 20 (2002), 12562–12566.
AC C
1021
TE
1018
[101] Lamoureux, G., MacKerell Jr, A. D., and Roux, B. A simple polarizable model of water based on classical drude oscillators. The Journal of chemical
physics 119, 10 (2003), 5185–5197.
[102] Lange, O. F., Van der Spoel, D., and De Groot, B. L. Scrutinizing
1027
molecular mechanics force fields on the submicrosecond timescale with nmr data.
1028
Biophysical journal 99, 2 (2010), 647–655.
1029
[103] Lapidus, L. J., Eaton, W. A., and Hofrichter, J. Measuring the rate
1030
of intramolecular contact formation in polypeptides. Proceedings of the National
1031
Academy of Sciences 97, 13 (2000), 7220–7225. 39
ACCEPTED MANUSCRIPT
[104] Larson, S. M., Snow, C. D., Shirts, M., and Pande, V. S. Folding@
1033
home and genome@ home: Using distributed computing to tackle previously
1034
intractable problems in computational biology. arXiv preprint arXiv:0901.0866
1035
(2009).
1037
1038
¨schweiler, R. Nmr-based protein potentials. Ange[105] Li, D.-W., and Bru wandte Chemie International Edition 49, 38 (2010), 6778–6780.
[106] Li, D.-W., and Bruschweiler, R. Iterative optimization of molecular me-
SC
1036
RI PT
1032
chanics force fields from nmr data of full-length proteins. Journal of chemical
1040
theory and computation 7, 6 (2011), 1773–1782.
1041
M AN U
1039
[107] Lindorff-Larsen, K., Maragakis, P., Piana, S., Eastwood, M. P.,
1042
Dror, R. O., and Shaw, D. E. Systematic validation of protein force fields
1043
against experimental data. PloS one 7, 2 (2012), e32131.
1045
1046
[108] Lindorff-Larsen, K., Piana, S., Dror, R. O., and Shaw, D. E. How fast-folding proteins fold. Science 334, 6055 (2011), 517–520. [109] Lindorff-Larsen, K., Piana, S., Palmo, K., Maragakis, P., Klepeis,
D
1044
J. L., Dror, R. O., and Shaw, D. E. Improved side-chain torsion potentials
1048
for the amber ff99sb protein force field. Proteins: Structure, Function, and
1049
Bioinformatics 78, 8 (2010), 1950–1958.
1051
EP
1050
TE
1047
[110] Loffet, A. Peptides as drugs: is there a market? Journal of peptide science: an official publication of the European Peptide Society 8, 1 (2002), 1–7. [111] Lopes, P. E., Guvench, O., and MacKerell, A. D. Current status of
1053
protein force fields for molecular dynamics simulations. In Molecular Modeling
1054
AC C
1052
of Proteins. Springer, 2015, pp. 47–71.
1055
[112] MacKerell Jr, A. D. Empirical force fields for biological macromolecules:
1056
overview and issues. Journal of computational chemistry 25, 13 (2004), 1584–
1057
1604.
1058
[113] MacKerell Jr, A. D., Bashford, D., Bellott, M., Dunbrack Jr,
1059
R. L., Evanseck, J. D., Field, M. J., Fischer, S., Gao, J., Guo, H.,
1060
Ha, S., et al. The journal of physical chemistry B 102, 18 (1998), 3586–3616.
40
ACCEPTED MANUSCRIPT
1061
[114] MacKerell Jr, A. D., Brooks, B., Brooks III, C. L., Nilsson, L., Roux, B., Won, Y., and Karplus, M. Charmm: the energy function and
1063
its parameterization. Encyclopedia of computational chemistry 1 (2002).
RI PT
1062
1064
[115] MacKerell Jr, A. D., Feig, M., and Brooks, C. L. Improved treatment of
1065
the protein backbone in empirical force fields. Journal of the American Chemical
1066
Society 126, 3 (2003), 698–699.
[116] Mackerell Jr, A. D., Feig, M., and Brooks III, C. L. Extending the
1068
treatment of backbone energetics in protein force fields: limitations of gas-phase
1069
quantum mechanics in reproducing protein conformational distributions in molec-
1070
ular dynamics simulations. Journal of computational chemistry 25, 11 (2004),
1071
1400–1415.
M AN U
SC
1067
1072
[117] Mahoney, M. W., and Jorgensen, W. L. A five-site model for liquid water
1073
and the reproduction of the density anomaly by rigid, nonpolarizable potential
1074
functions. The Journal of Chemical Physics 112, 20 (2000), 8910–8922. [118] Maier, J. A., Martinez, C., Kasavajhala, K., Wickstrom, L., Hauser,
D
1075
K. E., and Simmerling, C. ff14sb: improving the accuracy of protein side
1077
chain and backbone parameters from ff99sb. Journal of chemical theory and
1078
computation 11, 8 (2015), 3696–3713. [119] Man, V. H., Nguyen, P. H., and Derreumaux, P. High-resolution struc-
EP
1079
TE
1076
tures of the amyloid-β 1–42 dimers from the comparison of four atomistic force
1081
fields. The Journal of Physical Chemistry B 121, 24 (2017), 5977–5987.
1082 1083
AC C
1080
[120] Marinari, E., and Parisi, G. Simulated tempering: a new monte carlo scheme. EPL (Europhysics Letters) 19, 6 (1992), 451.
1084
[121] Mark, P., and Nilsson, L. Structure and dynamics of the tip3p, spc, and
1085
spc/e water models at 298 k. The Journal of Physical Chemistry A 105, 43
1086
(2001), 9954–9960.
1087
[122] Markwick, P. R., and McCammon, J. A. Studying functional dynamics in
1088
bio-molecules using accelerated molecular dynamics. Physical Chemistry Chem-
1089
ical Physics 13, 45 (2011), 20053–20065.
41
ACCEPTED MANUSCRIPT
[123] Matthes, D., and De Groot, B. L. Secondary structure propensities in
1091
peptide folding simulations: a systematic comparison of molecular mechanics
1092
interaction schemes.
RI PT
1090
1093
[124] Maupetit, J., Derreumaux, P., and Tuffery, P. Pep-fold: an online
1094
resource for de novo peptide structure prediction. Nucleic acids research 37,
1095
suppl 2 (2009), W498–W503.
[125] Mayor, U., Johnson, C. M., Daggett, V., and Fersht, A. R. Protein
SC
1096
folding and unfolding in microseconds to nanoseconds by experiment and simu-
1098
lation. Proceedings of the National Academy of Sciences 97, 25 (2000), 13518–
1099
13522.
M AN U
1097
[126] McCammon, J. A. A speed limit for protein folding. Proceedings of the National
1101
Academy of Sciences of the United States of America 93, 21 (1996), 11426.
1102
[127] McKiernan, K. A., Husic, B. E., and Pande, V. S. Modeling the mecha-
1103
nism of cln025 beta-hairpin formation. The Journal of Chemical Physics 147, 10
1104
(2017), 104107.
D
1100
[128] Mercadante, D., Milles, S., Fuertes, G., Svergun, D. I., Lemke,
1106
E. A., and Grater, F. Kirkwood–buff approach rescues overcollapse of a dis-
1107
ordered protein in canonical protein force fields. The journal of physical chemistry
1108
B 119, 25 (2015), 7975–7984.
EP
TE
1105
[129] Mercadante, D., Wagner, J. A., Aramburu, I. V., Lemke, E. A., and
1110
Grater, F. Sampling long-versus short-range interactions defines the ability of
1111 1112
1113
AC C
1109
force fields to reproduce the dynamics of intrinsically disordered proteins. Journal
of chemical theory and computation 13, 9 (2017), 3964–3974.
[130] Mitsutake, A., Sugita, Y., and Okamoto, Y. Generalized-ensemble al-
1114
gorithms for molecular simulations of biopolymers. Peptide Science: Original
1115
Research on Biomolecules 60, 2 (2001), 96–123.
1116
[131] Mittal, J., and Best, R. B. Tackling force-field bias in protein folding simu-
1117
lations: folding of villin hp35 and pin ww domains in explicit water. Biophysical
1118
journal 99, 3 (2010), L26–L28.
42
ACCEPTED MANUSCRIPT
[132] Munoz, V., Thompson, P. A., Hofrichter, J., and Eaton, W. A. Fold-
1120
ing dynamics and mechanism of β-hairpin formation. Nature 390, 6656 (1997),
1121
196.
RI PT
1119
[133] Nasica-Labouze, J., Nguyen, P. H., Sterpone, F., Berthoumieu, O.,
1123
Buchete, N.-V., Cote, S., De Simone, A., Doig, A. J., Faller, P.,
1124
Garcia, A., et al. Amyloid β protein and alzheimer???s disease: When
1125
computer simulations complement experimental studies. Chemical reviews 115,
1126
9 (2015), 3518–3563.
SC
1122
[134] Neria, E., Fischer, S., and Karplus, M. Simulation of activation free
1128
energies in molecular systems. The Journal of chemical physics 105, 5 (1996),
1129
1902–1921.
1130 1131
M AN U
1127
¨ller-Spa ¨th, S., Ku ¨ster, F., Hofmann, H., Haenni, [135] Nettels, D., Mu ¨egger, S., Reymond, L., Hoffmann, A., Kubelka, J., Heinz, D., Ru B., et al. Single-molecule spectroscopy of the temperature-induced collapse
1133
of unfolded proteins. Proceedings of the National Academy of Sciences 106, 49
1134
(2009), 20740–20745.
[136] Nguyen, P. H., Li, M. S., and Derreumaux, P. Effects of all-atom force
TE
1135
D
1132
fields on amyloid oligomerization: Replica exchange molecular dynamics simula-
1137
tions of the aβ 16–22 dimer and trimer. Physical Chemistry Chemical Physics
1138
13, 20 (2011), 9778–9788.
EP
1136
[137] Okur, A., Strockbine, B., Hornak, V., and Simmerling, C. Using pc
1140
clusters to evaluate the transferability of molecular mechanics force fields for
1141
1142 1143 1144
AC C
1139
proteins. Journal of computational chemistry 24, 1 (2003), 21–31.
[138] Ono, S., Nakajima, N., Higo, J., and Nakamura, H. Peptide free-energy profile is strongly dependent on the force field: Comparison of c96 and amber95.
Journal of Computational Chemistry 21, 9 (2000), 748–762.
1145
[139] Onufriev, A. V., and Izadi, S. Water models for biomolecular simulations. Wiley Interdisciplinary Reviews: Computational Molecular Science 8, 2 (2018), e1347.
[140] Oostenbrink, C., Villa, A., Mark, A. E., and Van Gunsteren, W. F. A biomolecular force field based on the free enthalpy of hydration and solvation: the gromos force-field parameter sets 53a5 and 53a6. Journal of computational chemistry 25, 13 (2004), 1656–1676.
[141] Ostermeir, K., and Zacharias, M. Advanced replica-exchange sampling to study the flexibility and plasticity of peptides and proteins. Biochimica et Biophysica Acta (BBA)-Proteins and Proteomics 1834, 5 (2013), 847–853.
[142] Ouyang, J. F., and Bettens, R. Modelling water: a lifetime enigma. CHIMIA International Journal for Chemistry 69, 3 (2015), 104–111.
[143] Pande, V. S., and Rokhsar, D. S. Molecular dynamics simulations of unfolding and refolding of a β-hairpin fragment of protein g. Proceedings of the National Academy of Sciences 96, 16 (1999), 9062–9067.
[144] Patapati, K. K., and Glykos, N. M. Three force fields’ views of the 3₁₀ helix. Biophysical journal 101, 7 (2011), 1766–1771.
[145] Patmanidis, I., and Glykos, N. M. As good as it gets? folding molecular dynamics simulations of the lyta choline-binding peptide result to an exceptionally accurate model of the peptide structure. Journal of Molecular Graphics and Modelling 41 (2013), 68–71.
[146] Perez, A., MacCallum, J. L., Brini, E., Simmerling, C., and Dill, K. A. Grid-based backbone correction to the ff12sb protein force field for implicit-solvent simulations. Journal of chemical theory and computation 11, 10 (2015), 4770–4779.
[147] Phillips, J. C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R. D., Kale, L., and Schulten, K. Scalable molecular dynamics with namd. Journal of computational chemistry 26, 16 (2005), 1781–1802.
[148] Piana, S., Donchev, A. G., Robustelli, P., and Shaw, D. E. Water dispersion interactions strongly influence simulated structural properties of disordered protein states. The journal of physical chemistry B 119, 16 (2015), 5113–5123.
[149] Piana, S., Klepeis, J. L., and Shaw, D. E. Assessing the accuracy of physical models used in protein-folding simulations: quantitative evidence from long molecular dynamics simulations. Current opinion in structural biology 24 (2014), 98–105.
[150] Piana, S., Lindorff-Larsen, K., and Shaw, D. E. How robust are protein folding simulations with respect to force field parameterization? Biophysical journal 100, 9 (2011), L47–L49.
[151] Pitera, J. W., and Swope, W. Understanding folding and design: Replica-exchange simulations of “trp-cage” miniproteins. Proceedings of the National Academy of Sciences 100, 13 (2003), 7587–7592.
[152] Ploetz, E. A., Bentenitis, N., and Smith, P. E. Developing force fields from the microscopic structure of solutions. Fluid phase equilibria 290, 1-2 (2010), 43–47.
[153] Ponder, J. W., and Case, D. A. Force fields for protein simulations. In Advances in protein chemistry, vol. 66. Elsevier, 2003, pp. 27–85.
[154] Rauscher, S., Gapsys, V., Gajda, M. J., Zweckstetter, M., de Groot, B. L., and Grubmuller, H. Structural ensembles of intrinsically disordered proteins depend strongly on force field: a comparison to experiment. Journal of chemical theory and computation 11, 11 (2015), 5513–5524.
[155] Reif, M. M., Hunenberger, P. H., and Oostenbrink, C. New interaction parameters for charged amino acid side chains in the gromos force field. Journal of chemical theory and computation 8, 10 (2012), 3705–3723.
[156] Robustelli, P., Piana, S., and Shaw, D. E. Developing a molecular dynamics force field for both folded and disordered protein states. Proceedings of the National Academy of Sciences (2018), 201800690.
[157] Rodriguez, A., Mokoema, P., Corcho, F., Bisetty, K., and Perez, J. J. Computational study of the free energy landscape of the miniprotein cln025 in explicit and implicit solvent. The Journal of Physical Chemistry B 115, 6 (2011), 1440–1449.
[158] Rosenman, D. J., Wang, C., and García, A. E. Characterization of aβ monomers through the convergence of ensemble properties among simulations with multiple force fields. The Journal of Physical Chemistry B 120, 2 (2015), 259–277.
[159] Salomon-Ferrer, R., Gotz, A. W., Poole, D., Le Grand, S., and Walker, R. C. Routine microsecond molecular dynamics simulations with amber on gpus. 2. explicit solvent particle mesh ewald. Journal of chemical theory and computation 9, 9 (2013), 3878–3888.
[160] Satoh, D., Shimizu, K., Nakamura, S., and Terada, T. Folding free-energy landscape of a 10-residue mini-protein, chignolin. FEBS letters 580, 14 (2006), 3422–3426.
[161] Schmid, N., Eichenberger, A. P., Choutko, A., Riniker, S., Winger, M., Mark, A. E., and van Gunsteren, W. F. Definition and testing of the gromos force-field versions 54a7 and 54b7. European biophysics journal 40, 7 (2011), 843.
[162] Serafeim, A.-P., Salamanos, G., Patapati, K. K., and Glykos, N. M. Sensitivity of folding molecular dynamics simulations to even minor force field changes. Journal of chemical information and modeling 56, 10 (2016), 2035–2041.
[163] Shaw, D. E., Dror, R. O., Salmon, J. K., Grossman, J., Mackenzie, K. M., Bank, J. A., Young, C., Deneroff, M. M., Batson, B., Bowers, K. J., et al. Millisecond-scale molecular dynamics simulations on anton. In Proceedings of the conference on high performance computing networking, storage and analysis (2009), ACM, p. 39.
[164] Shaw, D. E., Maragakis, P., Lindorff-Larsen, K., Piana, S., Dror, R. O., Eastwood, M. P., Bank, J. A., Jumper, J. M., Salmon, J. K., Shan, Y., et al. Atomic-level characterization of the structural dynamics of proteins. Science 330, 6002 (2010), 341–346.
[165] Shell, M. S., Ozkan, S. B., Voelz, V., Wu, G. A., and Dill, K. A. Blind test of physics-based prediction of protein structures. Biophysical journal 96, 3 (2009), 917–924.
[166] Shi, Y., Xia, Z., Zhang, J., Best, R., Wu, C., Ponder, J. W., and Ren, P. Polarizable atomic multipole-based amoeba force field for proteins. Journal of chemical theory and computation 9, 9 (2013), 4046–4063.
[167] Shirts, M., and Pande, V. S. Screen savers of the world unite! Science 290, 5498 (2000), 1903–1904.
[168] Showalter, S. A., and Brüschweiler, R. Quantitative molecular ensemble interpretation of nmr dipolar couplings without restraints. Journal of the American Chemical Society 129, 14 (2007), 4158–4159.
[169] Showalter, S. A., Johnson, E., Rance, M., and Brüschweiler, R. Toward quantitative interpretation of methyl side-chain dynamics from nmr by molecular dynamics simulations. Journal of the American Chemical Society 129, 46 (2007), 14146–14147.
[170] Simmerling, C., Strockbine, B., and Roitberg, A. E. All-atom structure prediction and folding simulations of a stable protein. Journal of the American Chemical Society 124, 38 (2002), 11258–11259.
[171] Simmerling, C. L., and Elber, R. Computer determination of peptide conformations in water: different roads to structure. Proceedings of the National Academy of Sciences 92, 8 (1995), 3190–3193.
[172] Simons, K. T., Kooperberg, C., Huang, E., and Baker, D. Assembly of protein tertiary structures from fragments with similar local sequences using simulated annealing and bayesian scoring functions. Journal of molecular biology 268, 1 (1997), 209–225.
[173] Siwy, C. M., Lockhart, C., and Klimov, D. K. Is the conformational ensemble of alzheimer's aβ10-40 peptide force field dependent? PLoS computational biology 13, 1 (2017), e1005314.
[174] Smith, M. D., Rao, J. S., Segelken, E., and Cruz, L. Force-field induced bias in the structure of aβ21–30: A comparison of opls, amber, charmm, and gromos force fields. Journal of chemical information and modeling 55, 12 (2015), 2587–2595.
[175] Snow, C. D., Nguyen, H., Pande, V. S., and Gruebele, M. Absolute comparison of simulated and experimental protein-folding dynamics. Nature 420, 6911 (2002), 102.
[176] Somavarapu, A. K., and Kepp, K. P. The dependence of amyloid-β dynamics on protein force fields and water models. ChemPhysChem 16, 15 (2015), 3278–3289.
[177] Sorin, E. J., and Pande, V. S. Exploring the helix-coil transition via all-atom equilibrium ensemble simulations. Biophysical journal 88, 4 (2005), 2472–2493.
[178] Steinbach, P. J. Exploring peptide energy landscapes: a test of force fields and implicit solvent models. Proteins: Structure, Function, and Bioinformatics 57, 4 (2004), 665–677.
[179] Stirnemann, G., and Sterpone, F. Recovering protein thermal stability using all-atom hamiltonian replica-exchange simulations in explicit solvent. Journal of chemical theory and computation 11, 12 (2015), 5573–5577.
[180] Stone, J. E., Hardy, D. J., Ufimtsev, I. S., and Schulten, K. Gpu-accelerated molecular modeling coming of age. Journal of Molecular Graphics and Modelling 29, 2 (2010), 116–125.
[181] Sugita, Y., and Okamoto, Y. Replica-exchange molecular dynamics method for protein folding. Chemical physics letters 314, 1-2 (1999), 141–151.
[182] Torrie, G. M., and Valleau, J. P. Nonphysical sampling distributions in monte carlo free-energy estimation: Umbrella sampling. Journal of Computational Physics 23, 2 (1977), 187–199.
[183] Tran, L., and Ha-Duong, T. Exploring the alzheimer amyloid-β peptide conformational ensemble: A review of molecular dynamics approaches. Peptides 69 (2015), 86–91.
[184] Van Der Spoel, D., Lindahl, E., Hess, B., Groenhof, G., Mark, A. E., and Berendsen, H. J. Gromacs: fast, flexible, and free. Journal of computational chemistry 26, 16 (2005), 1701–1718.
[185] van der Spoel, D., van Maaren, P. J., and Berendsen, H. J. A systematic study of water models for molecular simulation: derivation of water models optimized for use with a reaction field. The Journal of chemical physics 108, 24 (1998), 10220–10230.
[186] van Gunsteren, W. F., Billeter, S. R., Eising, A. A., Hünenberger, P. H., Krüger, P., Mark, A. E., Scott, W. R., and Tironi, I. G. Biomolecular simulation: the GROMOS96 manual and user guide.
[187] Vanommeslaeghe, K., and MacKerell, A. Charmm additive and polarizable force fields for biophysics and computer-aided drug design. Biochimica et Biophysica Acta (BBA)-General Subjects 1850, 5 (2015), 861–871.
[188] Vlieghe, P., Lisowski, V., Martinez, J., and Khrestchatisky, M. Synthetic therapeutic peptides: science and market. Drug discovery today 15, 1-2 (2010), 40–56.
[189] Wallqvist, A., and Mountain, R. D. Molecular models of water: Derivation and description. Reviews in Computational Chemistry (1999), 183–247.
[190] Wang, J., Cieplak, P., and Kollman, P. A. How well does a restrained electrostatic potential (resp) model perform in calculating conformational energies of organic and biological molecules? Journal of computational chemistry 21, 12 (2000), 1049–1074.
[191] Wang, L.-P., Martinez, T. J., and Pande, V. S. Building force fields: an automatic, systematic, and reproducible approach. The journal of physical chemistry letters 5, 11 (2014), 1885–1891.
[192] Wang, L.-P., McKiernan, K. A., Gomes, J., Beauchamp, K. A., Head-Gordon, T., Rice, J. E., Swope, W. C., Martínez, T. J., and Pande, V. S. Building a more predictive protein force field: a systematic and reproducible route to amber-fb15. The Journal of Physical Chemistry B 121, 16 (2017), 4023–4039.
[193] Wang, W., Ye, W., Jiang, C., Luo, R., and Chen, H.-F. New force field on modeling intrinsically disordered proteins. Chemical biology & drug design 84, 3 (2014), 253–269.
[194] Wei, C.-C., Ho, M.-H., Wang, W.-H., and Sun, Y.-C. Molecular dynamics simulation of folding of a short helical peptide with many charged residues. The Journal of Physical Chemistry B 109, 42 (2005), 19980–19986.
[195] Weiner, S. J., Kollman, P. A., Case, D. A., Singh, U. C., Ghio, C., Alagona, G., Profeta, S., and Weiner, P. A new force field for molecular mechanical simulation of nucleic acids and proteins. Journal of the American Chemical Society 106, 3 (1984), 765–784.
[196] Weiner, S. J., Kollman, P. A., Nguyen, D. T., and Case, D. A. An all atom force field for simulations of proteins and nucleic acids. Journal of computational chemistry 7, 2 (1986), 230–252.
[197] Wickstrom, L., Okur, A., and Simmerling, C. Evaluating the performance of the ff99sb force field based on nmr scalar coupling data. Biophysical journal 97, 3 (2009), 853–856.
[198] Wickstrom, L., Okur, A., Song, K., Hornak, V., Raleigh, D. P., and Simmerling, C. L. The unfolded state of the villin headpiece helical subdomain: computational studies of the role of locally stabilized structure. Journal of molecular biology 360, 5 (2006), 1094–1107.
[199] Williams, S., Causgrove, T. P., Gilmanshin, R., Fang, K. S., Callender, R. H., Woodruff, W. H., and Dyer, R. B. Fast events in protein folding: helix melting and formation in a small peptide. Biochemistry 35, 3 (1996), 691–697.
[200] Wright, P. E., Dyson, H. J., and Lerner, R. A. Conformation of peptide fragments of proteins in aqueous solution: implications for initiation of protein folding. Biochemistry 27, 19 (1988), 7167–7175.
[201] Yang, L., Tan, C.-h., Hsieh, M.-J., Wang, J., Duan, Y., Cieplak, P., Caldwell, J., Kollman, P. A., and Luo, R. New-generation amber united-atom force field. The journal of physical chemistry B 110, 26 (2006), 13166–13176.
[202] Yang, W. Y., and Gruebele, M. Folding at the speed limit. Nature 423, 6936 (2003), 193.
[203] Ye, W., Ji, D., Wang, W., Luo, R., and Chen, H.-F. Test and evaluation of ff99idps force field for intrinsically disordered proteins. Journal of chemical information and modeling 55, 5 (2015), 1021–1029.
[204] Yoda, T., Sugita, Y., and Okamoto, Y. Comparisons of force fields for proteins by generalized-ensemble simulations. Chemical Physics Letters 386, 4-6 (2004), 460–467.
[205] Zagrovic, B., Snow, C. D., Shirts, M. R., and Pande, V. S. Simulation of folding of a small alpha-helical protein in atomistic detail using worldwide-distributed computing. Journal of molecular biology 323, 5 (2002), 927–937.
[206] Zhang, C., and Ma, J. Enhanced sampling and applications in protein folding in explicit solvent. The Journal of chemical physics 132, 24 (2010), 244101.
[207] Zhang, T., Nguyen, P. H., Nasica-Labouze, J., Mu, Y., and Derreumaux, P. Folding atomistic proteins in explicit solvent using simulated tempering. The Journal of Physical Chemistry B 119, 23 (2015), 6941–6951.
[208] Zhao, H., and Caflisch, A. Molecular dynamics in drug design. European journal of medicinal chemistry 91 (2015), 4–14.
[209] Zhou, P., Wang, C., Ren, Y., Yang, C., and Tian, F. Computational peptidology: a new and promising approach to therapeutic peptide design. Current medicinal chemistry 20, 15 (2013), 1985–1996.
[210] Zhu, X., Lopes, P. E., and MacKerell Jr, A. D. Recent developments and applications of the charmm force fields. Wiley Interdisciplinary Reviews: Computational Molecular Science 2, 1 (2012), 167–185.