Protein-RNA complexation driven by the charge regulation mechanism

Accepted Manuscript

PII: S0006-291X(17)31358-X
DOI: 10.1016/j.bbrc.2017.07.027
Reference: YBBRC 38133
To appear in: Biochemical and Biophysical Research Communications
Received Date: 15 June 2017
Revised Date: 27 June 2017
Accepted Date: 5 July 2017

Please cite this article as: F.L. Barroso da Silva, P. Derreumaux, S. Pasquali, Protein-RNA complexation driven by the charge regulation mechanism, Biochemical and Biophysical Research Communications (2017), doi: 10.1016/j.bbrc.2017.07.027. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Fernando Luís Barroso da Silva*,a,b, Philippe Derreumaux b, Samuela Pasquali c

a Departamento de Física e Química, Faculdade de Ciências Farmacêuticas de Ribeirão Preto, Av. do café, s/no. – Universidade de São Paulo, BR-14040-903 Ribeirão Preto – SP, BRAZIL, Fax: +55 (16) 3315 48 80; Tel: +55 (16) 3315 42 22; E-mail: [email protected].
b Laboratoire de Biochimie Théorique, UPR 9080 CNRS, Institut de Biologie Physico Chimique, Université Paris Diderot – Paris 7 et Université Sorbonne Paris Cité, 13 rue Pierre et Marie Curie, 75005 Paris, France.
c Laboratoire de Cristallographie et RMN Biologiques, UMR 8015 CNRS Faculté des sciences pharmaceutiques et biologiques, Université Paris Descartes et Université Sorbonne Paris Cité, 4 Avenue de l'Observatoire, 75006 Paris, France.

Abstract

Electrostatic interactions play a pivotal role in many (bio)molecular association processes. The molecular organization and function in biological systems are largely determined by these interactions, from pure Coulombic contributions to more peculiar mesoscopic forces due to ion-ion correlation and proton fluctuations. The latter is a general electrostatic mechanism that gives attraction, particularly at low electrolyte concentrations. This charge regulation mechanism, due to titrating amino acid and nucleotide residues, is discussed here in a purely electrostatic framework. By means of constant-pH Monte Carlo simulations based on a fast coarse-grained proton titration scheme, a new computer molecular model was devised to study protein–RNA interactions. The complexation between the RNA silencing suppressor p19 viral protein and the 19-bp small interfering RNA was investigated at different solution pH and salt conditions. The outcomes illustrate the importance of the charge regulation mechanism, which enhances the association between these macromolecules in a similar way to that observed for other protein-polyelectrolyte systems typically found in colloidal science. Due to the highly negative charge of RNA, the effect is more pronounced in this system, as predicted by the Kirkwood-Shumaker theory. Our results contribute to the general physico-chemical understanding of macromolecular complexation and shed light on the extensive role of RNA in the cell's life.

Preprint submitted to BBRC, July 8, 2017

Keywords: protein titration, RNA titration, Monte Carlo simulations, electrostatic interactions, pH effects, charge regulation

1. Introduction

Protein-RNA interactions play a fundamental role in many cellular processes. Proteins stabilize, protect, and help transport the different kinds of RNA molecules, and mediate the interaction of RNA with other macromolecules. In particular, protein-RNA complexes are present at all steps of gene expression. For these reasons, protein–RNA interactions are a research topic of great interest in biological, medical and pharmaceutical sciences. Various infectious and genetic human diseases are associated with these interactions, which makes them an interesting target for medicinal drugs. [1, 2, 3] An important recent example related to public health, which has caused global concern, is the Musashi-1 protein, an RNA–binding protein involved in regulating neural stem cells and possibly used by the Zika virus, resulting in microcephaly. [4]

Understanding how proteins and RNA interact to form a complex is therefore a key problem in structural biology. Analysis of known protein-RNA structures highlights two distinct interaction mechanisms: one sequence dependent, involving the interaction of specific amino acids with specific bases through the formation of hydrogen bonds, and one sequence independent, based on the shape and charge complementarity of the interfaces. [5, 6, 7] The analysis of interfaces from crystallographic structures teaches us how proteins and RNA interact once they have come into close contact. However, before the stabilizing short-range interactions can come into action, the two partners have to find each other and correctly orient their interfaces. The question of how the protein and the RNA recognize each other from a distance is still open. At long range the main driving force is electrostatics, which makes recognition very sensitive to the charge of the two molecules, ionic conditions and pH, properties that are hard to account for from both a theoretical and a numerical standpoint. [8, 9, 10, 11, 12]

Computational studies of protein–RNA complexation often focus on charge-charge, charge-dipole, van der Waals and hydrophobic interactions. [13, 14, 15, 7, 16] Much less discussed are the important structure-sensitive electrostatic forces of Kirkwood & Shumaker, [17] where mesoscopic attractive forces between the macromolecules arise from charge fluctuations due to mutual perturbations in the acid–base equilibria. This is the core of the "charge regulation mechanism" [17, 18, 19] that can result in pH-dependent attractive interactions capable of overcoming ordinary Coulombic repulsions for like-charged macroparticles. The importance of such interactions for protein–polyelectrolyte and protein–nanoparticle complexation has been well documented in the literature. [20, 21, 8, 22, 18, 23, 19, 24] From the Kirkwood & Shumaker analytical theory, [17] it can be anticipated that such mesoscopic interactions will be more pronounced for highly charged macromolecules, affecting in particular the interaction of nucleic acids and other cellular components.

In this work, we extended our previous investigations on the complexation of a protein–protein pair [25, 26] and a protein–polyelectrolyte chain [21, 8, 18, 27] to describe a protein–RNA system. As a model system we investigated the complexation between the RNA silencing suppressor p19 viral protein and the 19-bp small interfering RNA (siRNA) at different solution pH and salt concentrations. Our aim was to explore the importance of electrostatic interactions for this association and to quantify the contributions from the charge regulation mechanism. [17, 8, 18, 19]

2. Model and Methodology

2.1. System
The p19-siRNA silencing complex has been extensively characterized experimentally, both for its structure [28] and for its activity under different ionic and pH conditions. [29] It is therefore an ideal model system for our computational investigation. In plants, as a defense against viral infections, a class of small RNA molecules called "small interfering RNAs" are assembled into complexes that guide the degradation of single-stranded complementary viral RNA through what is known as the silencing mechanism. Some viruses have adapted to evade this mechanism by producing the protein p19, which acts as a siRNA inhibitor. p19 binds to the double-stranded siRNA with high affinity, effectively sequestering it and preventing it from acting in the silencing pathway.

[30] In vitro fluorescence experiments have shown that the p19-siRNA interaction is highly affected by changes in pH. [29] Binding is observed for pH values spanning from 6 to ∼10. It is more efficient at low pH (6-7) and weaker at higher pH, with dissociation constants as much as four to five times higher. The changes in binding activity are likely the result of protonation/deprotonation of some of the protein's positively charged residues, in a reversible process, as suggested by extrapolated pKa values for protein residues. Experiments at different ionic concentrations show that the protein-RNA interaction is also highly dependent on salt concentration, with the affinity decreasing as the salt concentration increases. This result is indicative of the importance of electrostatic effects in the binding process. Mutations of ionizable residues of the RNA-binding interface can impair binding. In particular, the single mutation R43W of p19 and the double mutation R75G and R78G are sensitive to salt concentration. [29] The structure of the p19-siRNA complex (PDB id 1R9F) is shown in Figure 1(a).

2.2. Molecular model
Treating a biomolecular system with atomic detail and explicit water molecules involves the computation of a large number of inter- and intramolecular interactions, demanding excessive computational resources. As a consequence, constant-pH methods suffer from slow convergence and poor sampling properties. [9, 11, 12] High-throughput analysis of several different experimental conditions and mutations, which requires the same calculation to be repeated several times, represents a further difficulty for models with atomistic resolution and classical empirical molecular force fields. To overcome this problem, coarse-grained (CG) models, using simplified physical considerations, can be devised to explore the main physical features of a system with a reduced number of parameters and limited CPU time. Since the classical Tanford-Kirkwood model for ion binding to biomolecules, [31] continuum models have been quite appealing for studying electrostatic interactions in and between biomolecules. Recently, a new fast proton titration scheme (FPTS) was devised to perform constant-pH Monte Carlo (MC) simulations for RNA [32] as an extension of a previous version for proteins. [33, 12] Such computer models have also turned out to be a very powerful approach for complexation studies, especially when salt and pH effects should be accounted for in detail. FPTS implements proton charge fluctuations according to the acid-base equilibrium, and allows one to investigate contributions of the

charge regulation mechanism. [17, 18, 19] In benchmark studies for proteins and RNA, computed pKa values were comparable to those obtained with more sophisticated techniques, at a much lower computational cost. [32, 12] Moreover, the charges of the ionizable groups converge quite fast in this model and exhibit small standard deviations. [32, 12]

Following previous studies on protein–protein interactions, [34, 35, 25, 26] we propose here a similar coarse-grained model for protein–RNA systems. A cartoon of the model for a protein–RNA system is given in Figure 1(b). Both macromolecules are described at the mesoscopic level as a collection of charged Lennard-Jones (LJ) spheres of radii R_ai and valence z_ai representing the nucleotides and the amino acids. The sizes of these beads were taken from references [34] and [32]. Table 1 lists these values together with the intrinsic acid dissociation constants, the possible valences used for the protonated/deprotonated states of the titratable amino acid and nucleotide residues, and the number of their occurrences (N_aa) in the crystal structure coordinates provided by the RCSB Protein Data Bank (PDB). [36] The PDB file (PDB id 1R9F) was edited before the calculations as follows. All hetero atoms and water molecules were removed. Protein and RNA coordinates were split into two distinct files and converted into their CG descriptions using the bead sizes reported in Table 1. The virus protein p19 mutants (R43W and R75G+R78G) were prepared by the direct replacement of the amino acids in the CG structure. Titratable amino acids and nucleotides were assigned a charge that could be changed by the FPTS scheme according to the pH of the solution. [33, 32, 12]

The proposed CG model focuses on the electrostatic features of the system and on the computation of the free energy derivatives as a function of the separation distance at different experimental conditions, averaged over angular coordinates. Each partner, protein and RNA, is kept rigid, neglecting the internal degrees of freedom. This choice comes from several considerations, the first of which is that the complete coupling of proton titration and conformational changes represents a very complex problem [12] that is computationally too demanding with currently available models. In our work we are interested in understanding the driving forces for protein-RNA association that act at long distances. These forces are expected to be weakly affected by phenomena outside the interface zone, since the electrostatic potential at long range is rather insensitive to local structural details. Nevertheless, the effects of macromolecular conformational changes were partially investigated by means of the repetition of the

MC simulations with selected clustered structures obtained from classical molecular dynamics (MD) simulations. [37, 38] Given that RNA molecules are in general more flexible than proteins, we investigated only the effects of conformational changes of the RNA duplex by generating a trajectory with the HiRE-RNA high-resolution CG force field [39, 40] from the experimental structure. Although this macromolecular ensemble of configurations does not represent the proper behavior for all different pH values, it allows assessment of the impact of the RNA's dynamical structure on the association process at low computational cost. It is expected that conformational changes would have a stronger effect on the short-range interactions and a small influence on the potential of mean force w(r), which is the main result of our calculations. These additional MD runs were carried out in a spherical box (radius 120 Å) at room temperature, with a time step of 4 fs and the Langevin thermostat. The MD production simulation ran for 200 ns after the initial minimization, thermalization and equilibration. As done in reference [26], the nineteen most populated clusters obtained from a cluster selection of the MD trajectory were tested. This selection was based on the diversity of three-dimensional structural features of the RNA as measured by the root-mean-square deviation (RMSD) of the bead positions. A cluster analysis using the g_cluster tool in the GROMACS 3.3.1 package [41] with an RMSD cutoff of 2.25 Å was employed for this purpose. For the MD trajectory used, the RMSD ranged from 1.22 to 12.09 Å (with an average RMSD of 4.80 Å), showing the diversity of RNA structures in the selected ensemble of coordinates.

For the complexation study, a binary RNA–protein system was placed in an electroneutral open cylindrical cell of radius r_c and height l_c, as seen in Fig. 1(b), following the same scheme as for protein-protein interactions. [34, 35, 25, 26] Both the RNA and the protein are free to rotate as rigid bodies in any direction and can translate back and forth along the axis connecting their centers of mass. Such a simulation box is convenient for an efficient sampling of properties that depend on the biomolecular separation distance [e.g. w(r)]. Salt and counter ions are implicitly modeled by their screening properties via the inverse Debye length κ. Since the target experimental viral protein–RNA system was studied in NaCl solution, ion-ion correlations are of no importance here, allowing the use of a screening potential. [42] Therefore, all multipole–multipole interactions (charge-charge, charge-dipole, dipole-dipole, etc.) are computed using a screened Coulombic potential. The electrostatic interactions [u_el(r_ij)] between any two ionizable chemical groups

(either an amino acid or a nucleotide) i and j are then given by:

$$u_{el}(r_{ij}) = \frac{z_i z_j e^2}{4 \pi \epsilon_0 \epsilon \, r_{ij}} \exp(-\kappa r_{ij}) \qquad (1)$$

where ε0 is the vacuum permittivity (ε0 = 8.854 × 10^−12 C^2/(N m^2)), z_i and z_j denote the valency of i and j, respectively, e = 1.602 × 10^−19 C is the elementary charge, and r_ij is their distance. κ, the inverse Debye length, is given by $[8 \pi e^2/(\epsilon_0 \epsilon k_B T) \sum_{ions} c_k z_k^2]^{1/2}$, where kB (= 1.3807 × 10^−23 J K^−1) is the Boltzmann constant, T is the temperature in Kelvin and c_k is the number density of the mobile electrolyte species k (counter-ions and added salt). The driving force for macromolecular complexation must also include van der Waals interactions, the hydrophobic effect and excluded volume repulsions. To account for these physical interactions, we have chosen the simple description of the Lennard-Jones potential, i.e., for any two charged or neutral beads (amino acids or nucleotides) i and j, u_vdw(r_ij) is given by:

$$u_{vdw}(r_{ij}) = 4 \varepsilon_{LJ} \left[ \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right] \qquad (2)$$

where ε_LJ is, following refs. [34, 43, 32], assumed to be universal for any system and equal to 0.05005 kB T (= 0.124 kJ/mol). This should correspond to a Hamaker constant of ca. 9 kB T for amino acid pairs. [44] The term σ_ij (= R_ai + R_aj) is the separation distance of two interacting beads i and j at contact. For instance, σ_ij for ALA (R_a = 3.1 Å) and GLU (R_a = 3.8 Å) is 6.9 Å, σ_ij for A (R_a = 5.0 Å) and GLU (R_a = 3.8 Å) is 8.8 Å, and σ_ij for A (R_a = 5.0 Å) and G (R_a = 5.1 Å) is 10.1 Å. The choice of ε_LJ is somewhat arbitrary, and it regulates the strength of the attractive forces in the system. Based on previous calculations, [26] ε_LJ = 0.05005 kB T keeps the system in an electrostatic regime, in agreement with the experimental evidence for this specific RNA–protein system. [29] Having beads with different sizes (see Table 1) is a practical and convenient manner to model the averaged van der Waals and non-specific contributions from the hydrophobic effect. In principle, ε_LJ could be specifically tuned for each bead type in order to better reflect macromolecular hydrophobic moments [45] and guide a correct docking direction at short separation distances. However, this would involve a long parametrization procedure, useful for exploring non-electrostatic aspects but unnecessary for our purposes. The accurate modeling of specific hydrophobic terms remains a well-known open and complex problem. [46, 47, 48, 35, 26] With the present model, large beads will attract others more than smaller ones (see Eq. 2), which is enough to effectively model the main attractive contributions. Combining Eqs. 1 and 2, the total system's interaction energy for a given configuration U({r_k}) can be written as:

$$U(\{r_k\}) = \frac{1}{2} \sum_{i=1}^{N} \sum_{j=1}^{N} \left[ u_{el}(r_{ij}) + u_{vdw}(r_{ij}) \right] \qquad (3)$$

where {r_k} are the amino acid or nucleotide positions, N = N_p + N_RNA is the total number of beads, N_p is the number of amino acids of the protein and N_RNA is the number of bases of the RNA chain.

2.3. Monte Carlo simulations
Equilibrium properties of the system were obtained by means of standard Metropolis Monte Carlo simulations [49, 37] in the NVT ensemble. The production runs comprised at least 10^7 MC steps after equilibration. The simulation box was defined with radius r_c = 400 Å and height l_c = 300 Å in all calculations. The dielectric constant of the aqueous solution and the temperature were fixed at ε = 78.7 and T = 298 K, respectively. The pH of the solution was varied from 1 to 14. Salt concentrations from 0.1 to 1.0 M were investigated. We performed MC simulations of five different set-ups:

• Set A A single protein p19 in the monomeric state in NaCl solution. These simulations were used to quantify the main physico-chemical properties of the protein alone: protein net charge number (Z_P), protein dipole moment (µ_P) and protein capacitance (C_P ≡ ⟨Z_P^2⟩ − ⟨Z_P⟩^2) as a function of solution pH, and the pKa values for all its titratable groups.

• Set B A single RNA double helix in NaCl solution. These simulations describe the electrostatic properties of the RNA alone: RNA net charge number (Z_RNA), RNA dipole moment (µ_RNA) and RNA capacitance (C_RNA ≡ ⟨Z_RNA^2⟩ − ⟨Z_RNA⟩^2) as a function of solution pH, and the pKa values for all nucleotides.

• Set C The protein–RNA complex in its crystallographic structure in NaCl solution. The same quantities computed in sets A and B are computed for the complex.

• Set D The two macromolecules (protein p19 and the siRNA) in NaCl solution, apart from each other. The purpose of these simulations was to provide the computed free energy of interaction taking into account all electrostatic mechanisms.

• Set E Same as set D but with fixed charges for each amino acid residue and nucleotide, using the values obtained from a previous titration run (from sets A and B for the protein and the RNA, respectively). The titration scheme is switched off, which excludes the charge fluctuation mechanism from these calculations. Charge, dipole and higher-order moments are kept in this non-titrating system. Comparing results from sets D and E allows us to quantify the charge regulation contribution. [21, 26]
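The capacitance defined for sets A and B is the variance of the net charge number. As a minimal illustration, and only under the simplifying assumption of ideal, non-interacting two-state sites (the FPTS simulations include electrostatic coupling between sites, which this sketch ignores), the mean charge and capacitance follow directly from the Henderson-Hasselbalch relation. All function names and the example list of sites are hypothetical; only pK0(ARG) = 12 is taken from the text.

```python
def protonated_fraction(pH, pKa):
    """Fraction of protonated sites for an ideal acid-base equilibrium."""
    return 1.0 / (1.0 + 10.0 ** (pH - pKa))


def site_charge_and_capacitance(pH, pKa, z_deprotonated):
    """Mean charge number and charge variance of one two-state site.

    The site carries z_deprotonated when deprotonated and
    z_deprotonated + 1 when protonated.
    """
    theta = protonated_fraction(pH, pKa)
    mean_z = z_deprotonated + theta
    # Variance of a two-state (Bernoulli) variable: <z^2> - <z>^2
    capacitance = theta * (1.0 - theta)
    return mean_z, capacitance


def molecule_charge_and_capacitance(pH, sites):
    """Sum over independent sites; `sites` is a list of (pKa, z_deprotonated)."""
    total_z = 0.0
    total_c = 0.0
    for pKa, z_dep in sites:
        z, c = site_charge_and_capacitance(pH, pKa, z_dep)
        total_z += z
        total_c += c  # variances of independent sites add
    return total_z, total_c


if __name__ == "__main__":
    # Hypothetical molecule made of five ARG-like sites (pK0 = 12, charge 0 -> +1).
    arg_like_sites = [(12.0, 0)] * 5
    for pH in (7.0, 11.0, 12.0, 13.0):
        z, c = molecule_charge_and_capacitance(pH, arg_like_sites)
        print(f"pH {pH:4.1f}: <Z> = {z:5.2f}, C = {c:4.2f}")
```

In this idealized picture each site contributes at most 0.25 to the capacitance, at pH = pKa. In set E the charges are frozen at their mean values, so the corresponding variance is zero by construction.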

Runs were carried out with the wild-type (wt) protein (as given by the crystal structure coordinates), its two mutants (R43W and R75G+R78G) and the wt RNA structure (also as given by the crystal structure coordinates). With set-up D, additional calculations were done with the wt protein and the ensemble of nineteen RNA structures obtained from the MD trajectory. During the production runs, the quantities Z_P, µ_P, C_P, Z_RNA, µ_RNA, C_RNA, and the averaged charge number at each titratable chemical group were computed at a given pH and salt concentration. From the titration plots, the isoelectric point (pI) of the protein was defined as the solution pH where Z_P is equal to zero. Analogously, Z_RNA = 0 would define the pI for the RNA, although Z_RNA was always negative in the studied pH window. pKa values were extracted from the titration plots of each ionizable group, following the procedure described in refs. [32, 12]. They were calculated in simulations with both macromolecules (set D) and with only one molecule (sets A and B). The presence of one charged macromolecule in the neighborhood of an ionizable group perturbs its acid-base equilibrium. When a protein approaches or binds to the RNA, its charges can induce changes in the pKa values of the nearby titratable nucleotide residues, and vice versa. Mapping these pKa shifts can reveal the existence of preferential binding spots. Therefore, through the comparison of computed pKa values between the isolated macromolecules [pKa(p19) and pKa(RNA)] and their complexed form [pKa(complex)], we could identify the beads most affected by the presence of the other (charged) macromolecule.

The free energy of interaction was estimated by recording the histogram of the protein–RNA separations during the simulations with sets C and D. The bin size was 1 Å. This histogram was normalized to give the radial distribution function [g(r)], which yields the angularly averaged potential of mean force [βw(r) = − ln g(r)]. Calculations were performed with the Faunus biomolecular simulation package, [50] where the FPTS was already implemented. [33]

3. Results and discussion

3.1. Physical chemistry properties
We first examined the titration behavior of the two macromolecules alone (sets A and B). Figure 2(a) shows the titration plot, where the average net charges of p19 and of the siRNA are given as a function of pH. The salt concentration was set at physiological conditions (150 mM). The corresponding pI value for the protein is 9.3. The siRNA molecule is negatively charged at all pH regimes. These titration plots suggest that protein–RNA complexation would tend to happen only for solution pH < 9.3, where the two biomolecules have unlike charges (Z_P > 0 and Z_RNA < 0). Figures 2(b) and 2(c) describe how the macromolecular dipole moment (µ) and capacitance (C) vary with pH. The values of µ depend on the reference frame of the coordinate system, chosen here to be that of the center of mass of the system. [21] The protein dipole moment plot is similar to what was observed previously for whey proteins. [21] At pI, µ_P(p19) is equal to 69 Å (≈ 288 D), which is within a common range for proteins: smaller than the computed values for α-lactalbumin (µ = 82 Å), albumin (µ = 297 Å) and β-lactoglobulin (µ = 128 Å), but larger than those of proteins like insulin (µ = 49 Å) and lysozyme (µ = 24 Å). [18] The elongated form of the RNA and its even charge distribution do not result in a pronounced µ_RNA, as seen in the figure. This is natural for a macromolecule with a cylindrical shape and a distribution of negative charges. In this one-bead description of the RNA bases, the charge number of a base in the deprotonated and protonated states varies between −1 and 0 for A and C, and between −2 and −1 for G and U (see Table 1). µ_RNA peaks around pH 5 and 11, when the bases A and C and G and U, respectively,

have a tendency to be in the deprotonated states (see their computed pKa values in Table 3). The average capacitance of the two macromolecules varies significantly with pH, and it can reach relatively high values for both molecules (e.g. C_P = 2.5 at pH 4, and C_RNA = 3.2 at pH 11). Being a measure of charge fluctuations, C is high when these fluctuations are large. This happens when pH = pKa. For example, a molecule rich in ARG such as p19 will have a peak around pH 12 [pK0(ARG) = 12], as seen in this figure. Due to the number of GLU and ASP residues, the capacitance plot for the protein p19 also peaks around pH 4. RNA is made up of titratable chemical groups that have pK0 values in two pH regimes, 3.5–4.2 and 9.2. Therefore, two peaks are always expected in its capacitance plots, at pH ≈ 4 and pH ≈ 9.2. The exact position of these peaks will depend on the pKa values, which are often shifted. For this RNA, pKa values are shifted to the basic regime (pH > pKa), as will be discussed below (see Table 3). For instance, the pKa values of A and C are between 4.0 and 4.9 and between 4.9 and 5.6, respectively. This indicates that a peak is expected at pH ≈ 5, as observed in Figure 2(c). Since the magnitude of the regulation term depends on C, [17, 21, 8, 18] such high values of C suggest that the charge regulation mechanism plays an important role in the protein-RNA interaction.

3.2. pH effects
The calculated free energy of interaction for the complexation of p19 and the siRNA as a function of the solution pH is given in Fig. 3(a) at 100 mM NaCl (set D). Negative values for βw(r) are observed at pH values between 5 and 10.5. If the criterion for association is assumed to be βw(r) < 1 kB T, [25, 26] complexation is observed for pH values spanning from 5 to 10.2. The strongest attraction was computed at pH 6.5, where the lowest free energy minimum was measured [βw(r) = −5.6 kB T]. This result is in excellent agreement with the in vitro fluorescence experiments, where the strongest binding was measured at pH values between 6.2 and 7.6. [29] Our findings are also within the range of the simulated results obtained by Brooks and co-authors, who reported an optimal pH range for siRNA binding between 7 and 10 using a different crystallographic structure (PDB id 1RPU). [14] The typical error in these free energy profiles is smaller than 0.2 kB T. This estimate is obtained by extending the MC simulation three times by the same number of steps as the first round, and computing the free energy profiles at each iteration.
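The free energy profiles reported in this work follow from the relation βw(r) = −ln g(r) applied to a histogram of protein–RNA separations with 1 Å bins (Section 2.3). The sketch below illustrates that step only. The normalization, which sets g(r) to 1 using the average count of the outermost bins, is an assumption made for this example and is not necessarily the one used in Faunus; the function name and arguments are likewise hypothetical.

```python
import math


def pmf_from_separations(separations, r_min, r_max, bin_width=1.0, tail_bins=5):
    """Return a list of (bin center, beta*w) from sampled separations.

    Assumption: the outermost `tail_bins` bins sample the non-interacting
    regime, so their mean count defines g(r) = 1.
    """
    n_bins = int(round((r_max - r_min) / bin_width))
    counts = [0] * n_bins
    for r in separations:
        k = int((r - r_min) // bin_width)
        if 0 <= k < n_bins:
            counts[k] += 1

    tail = counts[-tail_bins:]
    reference = sum(tail) / len(tail)
    if reference == 0:
        raise ValueError("no samples in the reference (large separation) bins")

    profile = []
    for k, count in enumerate(counts):
        r_center = r_min + (k + 0.5) * bin_width
        if count > 0:
            beta_w = -math.log(count / reference)
        else:
            beta_w = math.inf  # separation never visited
        profile.append((r_center, beta_w))
    return profile


if __name__ == "__main__":
    # Synthetic data: one bin is visited about e^2 times more often than the rest.
    samples = []
    for k in range(10):
        n = 739 if k == 2 else 100
        samples.extend([20.0 + k + 0.5] * n)
    for r, bw in pmf_from_separations(samples, 20.0, 30.0):
        print(f"r = {r:5.1f} A, beta*w = {bw:6.2f}")
```

Repeating the analysis on successive extensions of the same run, as described above for the error estimate, would give one profile per extension whose spread can then be compared.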

We repeated these calculations with each RNA structure from the ensemble of nineteen clustered conformations generated by the MD trajectory with the HiRE-RNA force field, allowing us to estimate the impact of fluctuations in the RNA structure on the association process. Figure 3(b) shows the data for pH 7 and 150 mM NaCl. Similar plots were observed in other pH regimes and salt concentrations. The separation distance of the minimum in βw(r) is conserved for all the RNA structures, with a magnitude varying within ∼ 0.5 kB T. We can speculate that this would be the average difference observed if conformational changes were incorporated into the model, and that it therefore would not affect the qualitative description. The calculation performed with the most populated RNA cluster from the MD trajectory gives a βw(r) similar to that obtained with the crystal structure.

3.3. Salt effects
The ionic strength dependence of the equilibrium binding of the virus protein p19 to the siRNA is shown in Fig. 4(a). This plot was obtained from the MC simulations with set D at pH 7.2. As reported in the experimental studies, p19–siRNA association is highly dependent on NaCl concentration. [29] Increasing the salt concentration reduces the binding affinity due to the shielding of the electrostatic potential. Negative values for βw(r) are observed at salt concentrations between 100 and 350 mM. Systems with salt concentrations higher than 750 mM exhibit essentially no affinity. Our computations are in agreement with experimental findings, where 1 M NaCl completely inhibited the binding affinity. [29]
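In this model, salt enters only through the inverse Debye length κ in the screened Coulomb potential of Eq. 1, while the Lennard-Jones term of Eq. 2 is salt independent. The sketch below evaluates both pair potentials in units of kB T for a 1:1 salt. It assumes the standard SI textbook form κ² = e² Σ c_k z_k² / (ε0 ε kB T), whose prefactor may differ by convention from the expression quoted in Section 2.2. Avogadro's number and all function names are additions for this example, while the physical constants, ε = 78.7, T = 298 K, ε_LJ and the bead radii are taken from the text.

```python
import math

E_CHARGE = 1.602e-19   # elementary charge, C
EPS0 = 8.854e-12       # vacuum permittivity, C^2 / (N m^2)
K_B = 1.3807e-23       # Boltzmann constant, J / K
N_A = 6.022e23         # Avogadro's number, 1/mol (not given in the text)


def bjerrum_length(eps_r=78.7, temperature=298.0):
    """Bjerrum length e^2 / (4 pi eps0 eps kB T), in Angstrom."""
    lb_m = E_CHARGE ** 2 / (4.0 * math.pi * EPS0 * eps_r * K_B * temperature)
    return lb_m * 1e10


def debye_length(salt_molar, eps_r=78.7, temperature=298.0):
    """Debye length 1/kappa in Angstrom for a 1:1 salt (assumed SI form)."""
    ions_per_m3 = salt_molar * 1000.0 * N_A  # number density of each ion species
    kappa_sq = 2.0 * ions_per_m3 * E_CHARGE ** 2 / (EPS0 * eps_r * K_B * temperature)
    return 1e10 / math.sqrt(kappa_sq)


def screened_coulomb(z_i, z_j, r, salt_molar):
    """Eq. 1 in units of kB T; r in Angstrom."""
    kappa = 1.0 / debye_length(salt_molar)
    return bjerrum_length() * z_i * z_j * math.exp(-kappa * r) / r


def lennard_jones(r, radius_i, radius_j, eps_lj=0.05005):
    """Eq. 2 in units of kB T, with sigma_ij = R_i + R_j (Angstrom)."""
    sigma = radius_i + radius_j
    x6 = (sigma / r) ** 6
    return 4.0 * eps_lj * (x6 * x6 - x6)


if __name__ == "__main__":
    for c in (0.10, 0.15, 0.35, 1.00):
        u = screened_coulomb(+1, -1, 10.0, c)
        print(f"{c:4.2f} M: Debye length = {debye_length(c):5.2f} A, "
              f"u_el(+1, -1, 10 A) = {u:6.3f} kT")
    # ALA (3.1 A) and GLU (3.8 A) beads at their LJ minimum
    r_min = 2.0 ** (1.0 / 6.0) * (3.1 + 3.8)
    print(f"u_vdw at minimum = {lennard_jones(r_min, 3.1, 3.8):7.5f} kT")
```

With these inputs the Debye length is roughly 9.6 Å at 100 mM and shrinks by a factor of √10 at 1 M, which is the screening trend invoked above to explain the loss of affinity at high salt.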

3.4. Mutation effects

Pezacki and collaborators [29] have suggested that a few mutations in the p19 protein, particularly those involving ionizable residues, could be critical for the p19–siRNA association. We therefore tested the effects of a single and a double amino acid substitution on the interaction free energy [Fig. 4(b)]. Simulations were performed at pH 7.2 and physiological salt conditions. The replacement of an arginine by either a tryptophan (as in R43W) or a glycine (as in the double R75G+R78G mutation) mostly affects the electrostatic interactions, while it has a marginal effect on other physical interactions. At this pH value, arginine has a tendency to be protonated [pK0(ARG) = 12], which implies that these two mutants should carry a smaller positive charge, reducing the Coulombic attraction between p19 and the siRNA. Indeed, the free energy plots show this feature. The deepest minimum is observed for the wild-type protein, followed by the single and the double mutants. The difference between the minima of βw(r) for the mutants R43W and R75G+R78G is quite small (≈ 0.1 kBT).

3.5. pKa shifts

We calculated pKa values for the system in both the apo (siRNA-free) and holo (siRNA-bound) forms using simulation sets A, B and C. Tables 2 and 3 list these values for p19 and the siRNA, respectively. In the apo forms, aspartic acids, glutamic acids and tyrosines have their pKa values downshifted with respect to their pK0 values, while arginines, cysteines, lysines, adenosines and cytosines exhibit shifts in the opposite direction. As far as we are aware, there are no experimental data for comparison. However, we note that pKa values computed by FPTS are usually in good agreement with experimental data for both proteins and RNAs [12, 32]. pKa values can be used to identify the titratable groups that are directly involved in the complexation, which are the groups most perturbed by the presence of the other charged macromolecule. In Tables 2 and 3 we report the ∆pKa values. From these data, we can hypothesize that the amino acids R44, E10, H1, H104, K6 and K43 play an important role in the complexation mechanism, with R44 (∆pKa = +0.36), H1 (∆pKa = −0.32) and K43 (∆pKa = −0.50) being the most critical ones. On the RNA side, the bases A20, A38, C2, C6, C18, C35 and C39 were observed to have the largest pKa shifts.

3.6. Charge regulation contributions

We next examined the charge regulation contributions to the p19–siRNA assembly. A numerically more rigorous partitioning of the physical contributions to the potential of mean force can be obtained by switching off specific interactions in the effective model [26]. Figure 4(c) shows the simulated interaction free energy obtained with the fully titratable model (with charge fluctuations) and with the “non-titratable” model, in which the charges are fixed at values obtained from separate simulations (sets A and B). We have chosen to represent here a limiting case (pH 10.5) where both macromolecules are negatively charged. Despite the obvious Coulombic repulsion, fluorescence experiments indicate a weak binding at this pH [29]. With a fixed-charge model, the experimental behavior cannot be reproduced. However, switching the titration on gives the attractive contribution needed to compensate the strong electrostatic repulsion between the protein p19 and the siRNA molecule. The minimum of βw(r) is located at 23 Å. At this separation distance, the difference between the “titratable” and “non-titratable” curves is ≈ 1.7 kBT, which reveals the magnitude of the charge regulation mechanism in protein-RNA interactions. This result also exemplifies the importance of constant-pH simulations for a proper description of the main fundamental physical mechanisms.

3.7. Final remarks

By means of a simplified computer model and constant-pH Monte Carlo simulations, we have reproduced and rationalized the experimental observation that the p19–siRNA association is driven by electrostatic interactions, and that this process is highly sensitive to both the solution pH and the salt concentration. The optimal pH was found to be 6.5, in agreement with the experimental data. The mutation of positively charged amino acids at pH 7.2 reduces the binding affinity, as expected for electrostatic processes. The importance of the charge regulation mechanism was investigated, highlighting the contribution of this peculiar pH-dependent mesoscopic force to the complexation process. Similarly to what is typically observed for protein-polyelectrolyte systems in colloidal science, an enhancement of the attraction was observed for the complexation of the protein and the RNA. Due to the highly negative charge of the RNA, the effect is more pronounced in this system, as predicted by the Kirkwood-Shumaker theory. The proposed computer model is fast enough to be applied in a large-scale context, exploring different solution pH and salt concentration regimes. The typical CPU time for a production run quantifying the free energy of interaction for one experimental condition is ca. 70 hours on a personal laptop (Intel i7-3630QM at 2.40 GHz, running Ubuntu 12.04). Our results contribute to the general physical chemical understanding of macromolecular complexation and shed light on the extensive role of RNA in the cell’s life.
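To make the origin of the regulation attraction discussed in Section 3.6 more tangible, the sketch below evaluates the salt-free, second-order perturbation estimate often quoted in the charge regulation literature [17, 19], βw(r) ≈ l_B Z_A Z_B / r − (l_B² / 2r²)(C_A C_B + C_A Z_B² + C_B Z_A²). This is a sketch under stated assumptions rather than the model used in the simulations: salt screening is neglected, so the magnitudes cannot be compared with the simulated curves, the Bjerrum length of 7.1 Å is an assumed value for water at room temperature, and the charges and capacitances in the example are hypothetical placeholders, not values taken from the simulations.

```python
BJERRUM_ANGSTROM = 7.1  # assumed Bjerrum length of water at room temperature


def direct_coulomb(r, z_a, z_b, bjerrum=BJERRUM_ANGSTROM):
    """Unscreened monopole-monopole term in units of kT (r in Å)."""
    return bjerrum * z_a * z_b / r


def regulation_term(r, z_a, z_b, c_a, c_b, bjerrum=BJERRUM_ANGSTROM):
    """Salt-free second-order charge regulation term in units of kT.

    c_a and c_b are the capacitances (charge variances) of the two
    macromolecules; the term is attractive whenever one of them is non-zero.
    """
    prefactor = bjerrum**2 / (2.0 * r**2)
    return -prefactor * (c_a * c_b + c_a * z_b**2 + c_b * z_a**2)


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the signs and scaling.
    z_protein, z_rna = -2.0, -10.0
    c_protein, c_rna = 1.0, 1.0
    for r in (25.0, 50.0, 100.0):
        coul = direct_coulomb(r, z_protein, z_rna)
        reg = regulation_term(r, z_protein, z_rna, c_protein, c_rna)
        print(f"r = {r:5.1f} Å  Coulomb = {coul:+.3f} kT  regulation = {reg:+.3f} kT")
```

Two features of this expression match the discussion in the text: the regulation term is attractive even when both macromolecules carry charges of the same sign, and it grows with the square of the partner's charge, which is why a highly charged RNA enhances the effect.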

Acknowledgments

This work has been supported in part by the Fundação de Amparo à Pesquisa do Estado de São Paulo [Fapesp 2015/16116-3 (FLBDS)] and the Université Sorbonne Paris Cité (USPC) through a visiting professor grant. FLBDS also acknowledges the support of the University of São Paulo (USP) through the NAP-CatSinQ (Research Core in Catalysis and Chemical Synthesis), the computing hours at Rice University through the international collaboration program with USP, and the hospitality of the Institut de Biologie Physico-Chimique/USPC. FLBDS and SP acknowledge the mobility support from the joint USPC–USP Research program. This work was also supported by the grant DYNAMO (ANR-11-LABX-0011-01).

References

[1] A. M. Khalil, J. L. Rinn, RNA–protein interactions in human health and disease, Seminars in Cell & Developmental Biology 22 (4) (2011) 359–365.

[2] H. Zhou, M. Mangelsdorf, J. Liu, L. Zhu, J. Y. Wu, RNA-binding proteins in neurological diseases, Science China Life Sciences 57 (4) (2014) 432–444.

[3] C. Stavraka, S. Blagden, The La-related proteins, a family with connections to cancer, Biomolecules 5 (4) (2015) 2701–2722.

[4] P. L. Chavali, L. Stojic, L. W. Meredith, N. Joseph, M. S. Nahorski, T. J. Sanford, T. R. Sweeney, B. A. Krishna, M. Hosmillo, A. E. Firth, R. Bayliss, C. L. Marcelis, S. Lindsay, I. Goodfellow, C. G. Woods, F. Gergely, Neurodevelopmental protein Musashi 1 interacts with the Zika genome and promotes viral replication, Science, arXiv:http://science.sciencemag.org/content/early/2017/05/31/science.aam9243.full.p

[5] E. Kligun, Y. Mandel-Gutfreund, The role of RNA conformation in RNA–protein recognition, RNA Biology 12 (2015) 720–727.

[6] S. Cusack, RNA–protein complexes, Current Opinion in Structural Biology 9 (1999) 66–73.

[7] M. Krepl, M. Havrila, P. Stadlbauer, P. Banas, M. Otyepka, J. Pasulka, R. Stefl, J. Sponer, Can we execute stable microsecond-scale atomistic simulations of protein–RNA complexes?, Journal of Chemical Theory and Computation 11 (2015) 1220–1243.

[8] B. Jönsson, M. Lund, F. L. B. da Silva, Electrostatics in macromolecular solution, in: E. Dickinson, M. E. Leser (Eds.), Food Colloids: Self-Assembly and Material Science, Royal Society of Chemistry, London, 2007, pp. 129–154.

[9] Y. Chen, B. Roux, Constant-pH hybrid nonequilibrium molecular dynamics–Monte Carlo simulation method, J. Chem. Theory Comput. 11 (2015) 3919–3931.

[10] E. Socher, H. Sticht, Mimicking titration experiments with MD simulations: A protocol for the investigation of pH-dependent effects on proteins, Scientific Reports 22523 (2016) 1–12.

[11] W. Chen, Y. Huang, J. K. Shen, Conformational activation of a transmembrane proton channel from constant pH molecular dynamics, J. Phys. Chem. Lett. 7 (19) (2016) 3961–3966.

[12] F. L. B. da Silva, D. MacKernan, Benchmarking a fast proton titration scheme in implicit solvent for biomolecular simulations, J. Chem. Theory Comput. 13 (6) (2017) 2915–2929.

[13] A. D. MacKerell Jr., L. Nilsson, Molecular dynamics simulations of nucleic acid–protein complexes, Curr. Opin. Struct. Biol. 18 (2008) 194–199.

[14] S. M. Law, B. W. Zhang, C. L. Brooks III, pH-sensitive residues in the p19 RNA silencing suppressor protein from carnation Italian ringspot virus affect siRNA binding stability, Prot. Sci. 22 (5) (2013) 595–604.

[15] T. N. Do, P. Carloni, G. Varani, G. Bussi, RNA/peptide binding driven by electrostatics: insight from bidirectional pulling simulations, Journal of Chemical Theory and Computation 9 (3) (2013) 1720–1730.

[16] J. Šponer, M. Krepl, P. Banáš, P. Kührová, M. Zgarbová, P. Jurečka, M. Havrila, M. Otyepka, How to understand atomistic molecular dynamics simulations of RNA and protein–RNA complexes?, Wiley Interdisciplinary Reviews: RNA 8 (3) (2017) e1405.

[17] J. G. Kirkwood, J. B. Shumaker, Forces between protein molecules in solution arising from fluctuations in proton charge and configuration, Proc. Natl. Acad. Sci. USA 38 (1952) 863–871.

[18] F. L. B. da Silva, B. Jönsson, Polyelectrolyte–protein complexation driven by charge regulation, Soft Matter 5 (15) (2009) 2862–2868.

[19] M. Lund, B. Jönsson, Charge regulation in biomolecular solution, Quarterly Reviews of Biophysics 46 (2013) 265–281.

[20] P. M. Biesheuvel, M. A. Cohen Stuart, Electrostatic free energy of weakly charged macromolecules in solution and intermacromolecular complexes consisting of oppositely charged polymers, Langmuir 20 (2004) 2785.

[21] F. L. B. da Silva, M. Lund, B. Jönsson, T. Åkesson, On the complexation of proteins and polyelectrolytes, J. Phys. Chem. B 110 (2006) 4459–4464.

[22] W. M. de Vos, P. M. Biesheuvel, A. de Keizer, J. M. Kleijn, M. A. Cohen Stuart, Adsorption of the protein bovine serum albumin in a planar poly(acrylic acid) brush layer as measured by optical reflectometry, Langmuir 24 (13) (2008) 6575–6584.

[23] W. M. de Vos, F. A. Leermakers, A. de Keizer, M. A. Cohen Stuart, M. M. Kleijn, Field theoretical analysis of driving forces for the uptake of proteins by like-charged polyelectrolyte brushes: effects of charge regulation and patchiness, Langmuir 26 (1) (2010) 249–259.

[24] F. L. B. da Silva, M. Boström, C. Persson, Effect of charge regulation and ion–dipole interactions on the selectivity of protein–nanoparticle binding, Langmuir 30 (14) (2014) 4078–4083.

[25] L. Delboni, F. L. B. da Silva, On the complexation of whey proteins, Food Hydrocolloids 55 (2016) 89–99.

[26] F. L. B. da Silva, S. Pasquali, P. Derreumaux, L. G. Dias, Electrostatics analysis of the mutational and pH effects of the N-terminal domain self-association of the major ampullate spidroin, Soft Matter 12 (2016) 5600–5612.

[27] C. R. Brasil, A. C. Delbem, F. L. B. da Silva, Multiobjective evolutionary algorithm with many tables for purely ab initio protein structure prediction, J. Comput. Chem. 34 (20) (2013) 1719–1734.

[28] K. Ye, L. Malinina, D. J. Patel, Recognition of small interfering RNA by a viral suppressor of RNA silencing, Nature 426 (2003) 874–878.

[29] R. Koukiekolo, S. M. Sagan, J. P. Pezacki, Effects of pH and salt concentration on the siRNA binding activity of the RNA silencing suppressor protein p19, FEBS Letters 581 (2007) 3051–3056.

[30] J. M. Vargason, G. Szittya, J. Burgyán, T. M. T. Hall, Size selective recognition of siRNA by an RNA silencing suppressor, Cell 115 (2003) 799–811.

[31] C. Tanford, J. G. Kirkwood, Theory of protein titration curves I. General equations for impenetrable spheres, J. Am. Chem. Soc. 79 (1957) 5333–5339.

[32] F. L. B. da Silva, P. Derreumaux, S. Pasquali, Fast coarse-grained model for RNA titration, J. Chem. Phys. 146 (3) (2017) 035101.

[33] A. A. Teixeira, M. Lund, F. L. B. da Silva, Fast proton titration scheme for multiscale modeling of protein solutions, Journal of Chemical Theory and Computation 6 (10) (2010) 3259–3266.

[34] B. Persson, M. Lund, J. Forsman, D. E. W. Chatterton, T. Åkesson, Molecular evidence of stereo-specific lactoferrin dimers in solution, Biophys. Chem. 3 (3) (2010) 187–189.

[35] A. Kurut, C. Dicko, M. Lund, Dimerization of terminal domains in spiders silk proteins is controlled by electrostatic anisotropy and modulated by hydrophobic patches, ACS Biomater. Sci. Eng. 1 (6) (2015) 363–371.

[36] H. M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T. N. Bhat, H. Weissig, I. N. Shindyalov, P. E. Bourne, The Protein Data Bank, Nucleic Acids Research 28 (2000) 235–242.

[37] D. Frenkel, B. Smit, Understanding Molecular Simulation: From Algorithms to Applications, Academic Press, San Diego, 1996.

[38] Y. G. Spill, S. Pasquali, P. Derreumaux, Impact of thermostats on folding and aggregation properties of peptides using the optimized potential for efficient structure prediction coarse-grained model, J. Chem. Theory Comput. 7 (5) (2011) 1502–1510.

[39] S. Pasquali, P. Derreumaux, HiRE-RNA: A high resolution coarse-grained energy model for RNA, J. Phys. Chem. B 114 (2010) 11957–11966.

[40] F. Sterpone, S. Melchionna, P. Tuffery, S. Pasquali, N. Mousseau, T. Cragnolini, Y. Chebaro, J.-F. Saint-Pierre, M. Kalimeri, A. Barducci, Y. Laurin, A. Tek, M. Baaden, P. H. Nguyen, P. Derreumaux, The OPEP coarse-grained protein model: From single molecules, amyloid formation, role of macromolecular crowding and hydrodynamics to RNA/DNA complexes, Chem. Soc. Rev. 43 (13) (2014) 4871–4893.

[41] D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A. E. Mark, H. J. C. Berendsen, GROMACS: Fast, flexible and free, J. Comp. Chem. 26 (2005) 1701–1719.

[42] R. R. Netz, Electrostatics of counter-ions in and between planar charged walls: From Poisson-Boltzmann to the strong-coupling theory, Eur. Phys. J. E 5 (2001) 557–574.

[43] A. Kurut, B. A. Persson, T. Åkesson, J. Forsman, M. Lund, Anisotropic interactions in protein mixtures: Self assembly and phase behavior in aqueous solution, J. Phys. Chem. Lett. 3 (6) (2012) 731–734.

[44] M. Lund, B. Jönsson, A mesoscopic model for protein-protein interactions in solution, Biophys. J. 85 (2003) 2940–2947.

[45] D. Eisenberg, R. M. Weiss, T. C. Terwilliger, W. Wilcox, Hydrophobic moments and protein structure, Faraday Symp. Chem. Soc. 17 (1982) 109–120.

[46] M. Jönsson, M. Skepö, F. Tjerneld, P. Linse, Effect of spatially distributed hydrophobic residues on protein–polymer association, J. Phys. Chem. B 107 (2003) 5511–5518.

[47] R. A. Curtis, R. S. Pophale, M. W. Deem, Monte Carlo simulations of the homopolypeptide pair potential of mean force, Fluid Phase Equilibria 241 (2006) 354–367.

[48] R. Mezzenga, P. Fischer, The self-assembly, aggregation and phase transitions of food protein systems in one, two and three dimensions, Rep. Prog. Phys. 76 (2013) 046601 (43pp).

[49] N. A. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. Teller, E. Teller, Equation of state calculations by fast computing machines, J. Chem. Phys. 21 (1953) 1087–1097.

[50] B. Stenqvist, A. Thuresson, A. Kurut, R. Vácha, M. Lund, Faunus – A flexible framework for Monte Carlo simulation, Mol. Sim. 39 (14–15) (2013) 1233–1239.

[51] Y. Nozaki, C. Tanford, Examination of titration behavior, Methods Enzymol. 11 (1967) 715–734.

[52] P. Thaplyal, P. C. Bevilacqua, Experimental approaches for measuring pKas in RNA and DNA, Methods in Enzymology 549 (2014) 189–219.

Table 1: Properties of the coarse-grained amino acids and nucleobases. Rai is the radius used for residue i (amino acid or nucleotide). pK0 is the intrinsic acid dissociation constant taken from Nozaki & Tanford [51] for the amino acids and from Thaplyal & Bevilacqua [52] for the nucleic acid bases. Zai gives the charge numbers of residue i in the deprotonated and protonated states, respectively. Naa,base is the number of amino acids i found in the PDB protein structure (chain A) or the number of bases i in the RNA structure (PDB id 1r9f). (a) Cysteine not involved in disulfide bridge formation. (b) Rai values as given by Refs. [34] (for the protein amino acids) and [32] (for the RNA nucleotides). A dash marks non-titratable residues.

Residue i   Rai (Å) (b)   pK0    Zai          Naa,base
ALA         3.1           -      -            5
ARG         4.0           12.0   0 and +1     10
ASN         3.6           -      -            3
ASP         3.6           4.0    -1 and 0     6
CYS (a)     3.6           10.8   -1 and 0     2
GLN         3.8           -      -            5
GLU         3.8           4.4    -1 and 0     7
GLY         2.9           -      -            10
HIS         3.9           6.3    0 and +1     4
ILE         3.6           -      -            4
LEU         3.6           -      -            9
LYS         3.7           10.4   0 and +1     5
MET         3.8           -      -            0
PHE         3.9           -      -            8
PRO         3.4           -      -            4
SER         3.3           -      -            13
THR         3.5           -      -            8
TRP         4.3           -      -            4
TYR         4.1           9.6    -1 and 0     4
VAL         3.4           -      -            6
CTR         2.0           2.6    -1 and 0     0
NTR         2.0           7.5    0 and +1     0
A           5.0           3.5    -1 and 0     9
C           4.9           4.2    -1 and 0     10
G           5.1           9.2    -2 and -1    10
U           4.9           9.2    -2 and -1    11

Table 2: Calculated pKa values for the virus protein p19 (PDB id 1R9F) in 150 mM NaCl. Data from the MC runs with sets A, B and C. pKa shifts due to the complexation are given by ∆pKa = pKa(p19) − pKa(complex).

Residue     pKa (p19)   pKa (p19–siRNA complex)   ∆pKa
ARG-44      13.11       12.75                      0.36
ARG-47      13.33       13.28                      0.05
ARG-50      12.91       12.82                      0.09
ARG-57      12.55       12.60                     -0.05
ARG-73      12.25       12.25                      0.00
ARG-87      12.28       12.32                     -0.04
ARG-89      12.50       12.55                     -0.05
ARG-100     12.21       12.24                     -0.03
ARG-110     12.71       12.63                      0.08
ASP-9        3.18        3.14                      0.04
ASP-22       3.67        3.73                     -0.06
ASP-26       3.17        3.26                     -0.09
ASP-49       3.31        3.35                     -0.04
ASP-65       3.73        3.74                     -0.01
ASP-78       3.82        3.75                      0.07
CYS-82      10.91       10.75                      0.16
CYS-106     11.42       11.34                      0.08
GLU-10       3.53        3.72                     -0.19
GLU-16       4.04        3.99                      0.05
GLU-23       4.20        4.23                     -0.03
GLU-33       3.19        3.28                     -0.09
GLU-52       4.00        4.00                      0.00
GLU-107      3.73        3.68                      0.05
GLU-114      3.98        4.01                     -0.03
HIS-1        5.86        6.18                     -0.32
HIS-20       6.81        6.86                     -0.05
HIS-56       6.24        6.20                      0.04
HIS-104      6.28        6.45                     -0.17
LYS-6       10.49       10.60                     -0.11
LYS-32      10.46       10.53                     -0.07
LYS-39      10.38       10.39                     -0.01
LYS-43       9.90       10.40                     -0.50
LYS-112     10.82       10.90                     -0.08
TYR-45       9.43        9.45                     -0.02
TYR-48       9.49        9.43                      0.06
TYR-69       9.47        9.53                     -0.06
TYR-84       9.48        9.49                     -0.01
mean ∆pKa                                         -0.03
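The ∆pKa column of Table 2 can be recomputed and ranked with a few lines of code, which is a convenient way to single out the residues most perturbed by the complexation. The sketch below is only an illustration: it uses a subset of the residues, with the apo and holo pKa values copied from Table 2, and the function names are hypothetical.

```python
# (pKa in free p19, pKa in the p19-siRNA complex) for a subset of Table 2.
PKA_TABLE = {
    "ARG-44": (13.11, 12.75),
    "ARG-73": (12.25, 12.25),
    "GLU-10": (3.53, 3.72),
    "HIS-1": (5.86, 6.18),
    "HIS-104": (6.28, 6.45),
    "LYS-6": (10.49, 10.60),
    "LYS-43": (9.90, 10.40),
    "TYR-84": (9.48, 9.49),
}


def delta_pka(apo, holo):
    """Shift defined as in Table 2: pKa(p19) - pKa(complex)."""
    return round(apo - holo, 2)


def rank_by_shift(table):
    """Residues sorted by decreasing magnitude of their pKa shift."""
    shifts = {residue: delta_pka(*values) for residue, values in table.items()}
    return sorted(shifts.items(), key=lambda item: abs(item[1]), reverse=True)


if __name__ == "__main__":
    for residue, shift in rank_by_shift(PKA_TABLE):
        print(f"{residue:8s} {shift:+.2f}")
```

For this subset the ranking starts with LYS-43, ARG-44 and HIS-1, the three residues highlighted in the discussion of the pKa shifts.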

Table 3: Calculated pKa values for the 19-bp small interfering RNA (PDB id 1R9F) in 150 mM NaCl. Data from the MC runs with sets A, B and C. pKa shifts due to the complexation are given by ∆pKa = pKa(siRNA) − pKa(complex).

Residue     pKa (siRNA)   pKa (p19–siRNA complex)   ∆pKa
A5          4.45          4.51                      -0.06
A11         4.82          4.89                      -0.07
A12         4.80          4.75                       0.05
A14         4.63          4.56                       0.07
A20         4.21          4.10                       0.11
A25         4.63          4.70                      -0.07
A26         4.57          4.63                      -0.06
A27         4.85          4.80                       0.05
A38         4.03          4.35                      -0.32
C2          4.89          4.64                       0.25
C6          5.45          5.58                      -0.13
C8          5.63          5.71                      -0.08
C15         5.51          5.55                      -0.04
C18         5.23          5.39                      -0.16
C23         5.31          5.32                      -0.01
C32         5.54          5.54                       0.00
C33         5.43          5.51                      -0.08
C35         5.50          5.63                      -0.13
C39         4.92          5.12                      -0.20
mean ∆pKa                                           -0.05

Figure 1: (a) Top: Crystal structure of the p19–siRNA complex. The two sets of mutations are highlighted in purple (R43W) and in red (R75G + R78G). (b) Bottom: A sketch of the model system. One RNA chain and one protein, represented by collections of charged LJ spheres of radii Rai and valences zai that mimic amino acids and nucleotides, are surrounded by counter ions and added salt, implicitly described by the inverse Debye length κ. The solvent is represented by its static dielectric constant ε. Positively and negatively charged protein amino acids are represented in blue and red, respectively. The macromolecules' centers of mass are separated by a distance r.

[Figure 2, three panels with curves for the protein and the RNA: (a) "Titration", Z versus pH; (b) "Dipole moment number", µ versus pH; (c) "Capacitance", C versus pH.]

Figure 2: Main physical chemical properties of the studied macromolecules in the electrolyte solution. The salt concentration is 150 mM. Data from the MC simulation runs with sets A and B. (a) Top: Titration properties of the biomolecules in the electrolyte solution. The simulated average net charge number (Z) of the virus protein p19 and the 19-bp small interfering RNA as a function of pH. (b) Left: Simulated average dipole moment (µ) as a function of pH. (c) Right: Simulated average macromolecular capacitance (C) as a function of pH.

[Figure 3, two panels of βw(r) versus r (Å): left, "Free energy of interactions: pH effects", with curves for pH 5.0, 6.5, 8.0, 9.5, 10.0, 10.2, 10.5, 11.0 and 12.0; right, "Free energy of interactions: structural effects".]

Figure 3: pH effects on the interaction free energy [βw(r)]. (a) Left: Simulated βw(r) between the centers of mass of the p19 and the siRNA at different solution pHs and 100 mM NaCl. pH regimes where a negative minimum for βw(r) is observed are shown with solid lines, while repulsive cases are shown with dashed lines. (b) Right: Effects of thermal fluctuations on βw(r) at pH 7 and 100 mM NaCl. Nineteen different clustered configurations generated by MD simulations with the HiRE-RNA force field were tested.
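Curves such as those in Figure 3 are potentials of mean force between the two centers of mass. A minimal sketch of how such a curve can be built from sampled separations is given below, assuming the usual relation βw(r) = −ln g(r), isotropic three-dimensional sampling, and normalization to the outermost bin, which must therefore be populated. These are illustrative assumptions, the function name is hypothetical, and the analysis actually used to produce the figures may differ.

```python
import math


def pmf_from_counts(counts, r_min, dr):
    """Return a list of (r_mid, beta_w) from a histogram of separations.

    counts[i] is the number of sampled configurations with a separation in
    [r_min + i*dr, r_min + (i+1)*dr). Each count is divided by the volume of
    its spherical shell, and the result is normalised so that beta_w = 0 in
    the last bin. Empty bins are skipped.
    """
    densities = []
    for i, n in enumerate(counts):
        r_lo = r_min + i * dr
        r_hi = r_lo + dr
        shell_volume = 4.0 / 3.0 * math.pi * (r_hi**3 - r_lo**3)
        densities.append(n / shell_volume)

    reference = densities[-1]  # assumed to represent the non-interacting limit
    pmf = []
    for i, rho in enumerate(densities):
        if rho > 0.0:
            r_mid = r_min + (i + 0.5) * dr
            pmf.append((r_mid, -math.log(rho / reference)))
    return pmf


if __name__ == "__main__":
    # Hypothetical histogram, for illustration only.
    example_counts = [120, 900, 1500, 1700, 2300, 3000]
    for r, w in pmf_from_counts(example_counts, r_min=20.0, dr=5.0):
        print(f"r = {r:5.1f} Å  beta*w = {w:+.2f}")
```

A uniform (non-interacting) sample gives βw(r) = 0 at all separations, so any negative minimum in the output reflects a net attraction at that distance.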

[Figure 4, three panels of βw(r) versus r (Å): (a) "Free energy of interactions: salt effects", with curves for 0.10, 0.20, 0.25, 0.35, 0.75, 1.00 and 1.10 M; (b) "Free energy of interactions: mutational effects", with curves for WT, R43W and R75G.R78G; (c) "Free energy of interactions: charge regulation mechanism", with curves for fixed charges and charge fluctuation.]

Figure 4: (a) Top: Salt effects on the free energy of interactions [βw(r)]. The simulated βw(r) between the centers of mass of p19 and siRNA at different salt concentrations and pH 7.2. Cases where a negative minimum for βw(r) is observed are shown with solid lines, while repulsive cases are shown with dashed lines. (b) Left: Mutation effects on the interaction free energy [βw(r)]. Data for a single (R43W) and a double mutation (R75G+R78G) are shown together with the wild-type protein system. The MC simulations were carried out with set D at 150 mM NaCl and pH 7.2. (c) Right: Partition of contributions to the p19–siRNA complexation. The simulated free energy of interactions [βw(r)] between the centers of mass of the p19 and the 19-bp small interfering RNA at pH 10.5 and 150 mM salt concentration, as predicted with the different modifications of the model mentioned in the text.