Weighted histogram analysis method for multiple short-time metadynamics simulations

Weighted histogram analysis method for multiple short-time metadynamics simulations

Journal Pre-proofs Research paper Weighted histogram analysis method for multiple short-time metadynamics simulations Junichi Ono, Hiromi Nakai PII: D...

2MB Sizes 0 Downloads 61 Views

Journal Pre-proofs Research paper Weighted histogram analysis method for multiple short-time metadynamics simulations Junichi Ono, Hiromi Nakai PII: DOI: Reference:

S0009-2614(20)30299-2 https://doi.org/10.1016/j.cplett.2020.137384 CPLETT 137384

To appear in:

Chemical Physics Letters

Received Date: Revised Date: Accepted Date:

10 January 2020 6 March 2020 18 March 2020

Please cite this article as: J. Ono, H. Nakai, Weighted histogram analysis method for multiple short-time metadynamics simulations, Chemical Physics Letters (2020), doi: https://doi.org/10.1016/j.cplett.2020.137384

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2020 Published by Elsevier B.V.

Weighted

histogram

analysis

method

for

multiple

short-time

metadynamics simulations

Junichi Ono,a,b Hiromi Nakaia,b,c,*

aWaseda

Research Institute for Science and Engineering (WISE), Waseda University, Tokyo

169-8555, Japan bElements

Strategy Initiative for Catalysts and Batteries (ESICB), Kyoto University, Kyoto 615-8245,

Japan cDepartment

of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda

University, Tokyo 169-8555, Japan

*Corresponding author. Email address: [email protected] (Hiromi Nakai) Fax: +81 3 3205 2504

1

Abstract An efficient calculation scheme for obtaining free energy surfaces (FESs) in chemical or biological systems from multiple metadynamics (metaD) simulations was developed. In this study, the biased probability distributions of the collective variables and other degrees of freedom were separately calculated, in multiple short-time metaD simulations using the reweighting method. The corresponding equilibrium probability distribution was finally constructed using the weighted histogram analysis method (WHAM). This method was applied for the conformational transitions of alanine dipeptide in the gas phase as an illustrative application. The results confirmed that this method was feasible for efficiently generating the converged FES.

2

1. Introduction In chemical or biological systems, chemical reactions or structural changes that occur through transition states (TSs) are considered rare events [1-3], which can be generally characterized as transition motions with long waiting times and short transition path times. Molecular dynamics (MD) simulations are useful tools for analyzing such rare events at the atomic level. However, the accessible timescales with MD simulations are limited compared to the timescales of the rare events. Therefore, enhanced sampling methods, such as umbrella sampling and replica-exchange MD, which can effectively promote rare events in MD simulations, are commonly adopted to treat chemical reactions [4-21]. Metadynamics (metaD) [22] and its variants, such as the well-tempered version, denoted as WTmetaD [23], have been successfully applied for a number of phenomena in chemistry, biology, and material sciences to obtain free energy surfaces (FESs) [1,2,24-29]. In the metaD scheme, the repulsive potentials expressed as Gaussian functions in the space of the predefined collective variables (CVs), which should adequately describe the target reaction, are successively deposited as the history-dependent biasing potential for escaping from local minima [22]. In the long-time limit, the negative sign of the total biasing potential, i.e., the sum of Gaussians in the single trajectory, corresponds to the free energy surfaces in the standard metaD [22]. In particular, it has been proven that the total biasing potential exactly converges asymptotically to the FESs in the case of WTmetaD [30]. However, the accessible timescale of quantum MD simulations is still limited to picoseconds or sub-nanoseconds due to the high computational cost. Thus, the

3

convergence of metaD and/or WTmetaD for obtaining reliable FESs is difficult in practical quantum MD simulations or even in quantum-mechanics/molecular-mechanics (QM/MM) MD simulations. In the present study, a methodology was developed for converging the FESs from multiple short-time WTmetaD simulations, instead of performing a single long-time metaD simulation. Herein, a biased histogram with respect to the CVs as well as other degrees of freedom was calculated from the individual WTmetaD trajectories using the reweighting method [31]. Then, the resultant biased histograms were merged using the weighted histogram analysis method (WHAM) to obtain the converged unbiased histogram [32-34]. Finally, the FESs were obtained from the conventional form of the potential of mean force. The organization of this letter is as follows. In Section 2, after introducing the reweighting method for WTmetaD with slight modifications, WHAM is summarized for the biased histograms obtained from multiple WTmetaD simulations. In Section 3, a numerical application of the proposed method to alanine dipeptide in the gas phase is demonstrated. Concluding remarks are given in Section 4.

4

2. Theoretical aspects Let us consider a system described by the potential energy, U(R), with the position vector, R. For simplicity, equations described here are restricted to one-dimensional CV unless otherwise noted, but the extension to the multidimensional case is straightforward. The free energy profile, F(s), for the CV depending on R, s = s(R), which is the target quantity characterizing the chemical reaction of the system, is defined by

F ( s )  kBT log P0 ( s )  C ,

(1)

where kB is the Boltzmann constant, T is the absolute temperature, P0(s) is the unbiased equilibrium probability distribution function with respect to s, and C is the arbitrary constant. Note that to obtain F(s) correctly, accurate sampling of P0(s) in the configurational space is required by means of the long-time average or sufficient ensemble average. The unbiased equilibrium probability distribution function with respect to R is expressed as the Boltzmann distribution, i.e.,

P0 (R ) 

exp[ U (R )]

 d R exp[U (R)]

,

(2)

where β is the reciprocal temperature (kBT)-1. If the system instantaneously reaches the equilibrium state under the time-dependent biasing potential, V(s(R),t), in addition to the original potential, U(R), the resulting probability distribution at time, t, is given by [31]

P(R, t ) 

exp[ U (R )]exp[ V ( s (R ), t )]

 d R exp[U (R)]exp[V (s(R), t )] 5

 exp[ V ( s (R ), t )]exp[  c(t )]P0 (R ) .

(3)

Here, the time-dependent constant, exp[βc(t)], which is the fraction of the partition functions, is defined as exp[  c(t )] 

 d R exp[U (R)]

 d R exp[U (R)]exp[V (s(R), t )]

.

(4)

As a result, the biased probability distribution with respect to s at time, t, is expressed as P( s, t )   d R ( s  s (R )) P(R, t )

 exp[ V ( s, t )]exp[  c(t )]P0 ( s ) .

(5)

Using this relation, the biased probability distribution with respect to s at time, t+Δt, is given by

P( s, t  t )  exp[  (V ( s, t  t )  V ( s, t ))]exp[  (c(t  t )  c(t ))]P( s, t ) 

exp[ V ( s, t )] P ( s, t ) ,  exp[ V ( s, t )]

(6)

where ΔV(s,t) is defined as V(s,t+Δt) − V(s,t). Here, indicates the expectation value determined from P(s,t) as  exp[ V ( s, t )]   ds exp[ V ( s, t )]P( s, t ) .

(7)

Note that the first-order cumulant expansion of in Eq. (6) leads to the equation of motion for the reweighting method of WTmetaD previously developed by Parrinello and coworkers [31]. Thus, if the biased probability distribution is described as Gaussian and the width is sufficiently small in which the first-order cumulant expansion is satisfied, the propagation of Eq. (6) is equivalent to the previously suggested operation [31]. Note that this situation is achieved in the long-time limit as described below. 6

For updating the biased probability distribution, Eq. (6) is successively used when a new Gaussian is accumulated in the WTmetaD simulation. Then, the final biased probability distribution, P(s,tf), is calculated under the biasing potential, V(s,tf), at a specific time, tf. When the resulting biased probability distribution is sufficiently converged, the unbiased one is obtained from the following relation:

P0 ( s )  exp[ V ( s, tf )]exp[  c(tf )]P( s, tf ) .

(8)

In WTmetaD [23], the time derivative of the biasing potential is given by

 V ( s (t ), t )   ( s  s (t )) 2  exp V ( s, t )   exp      , 2 2    kB T 

(9)

where ΔT is the biasing temperature and σ is the width of the Gaussian. Here, ω is the initial deposition rate defined as W0/τ with the initial Gaussian height, W0, and the time interval for the deposition of a new Gaussian, τ. In the long-time limit, where P(s,t) is stationary, i.e., P(s,t+Δt) = P(s,t), the following relations are satisfied from Eqs. (6) and (7):

V ( s, t )  c(t ) ,

(10)

P( s, t )   ( s  s (t )) .

(11)

Note that Eq. (10) is derived from ΔV(s,t) = c(t+Δt) − c(t) in Eq. (6). In addition, if the Gaussian width, σ, is sufficiently small, then the Gaussian function in Eq. (9) can be expressed as the δ-function in Eq. (11), which yields

 V ( s (t ), t )  V ( s, t )   exp    2  P( s, t ) .  kB T  7

(12)

Finally, from Eqs. (1), (5), (10), and (12), the FES is given by

F (s)  

T  T V ( s, t )  C (t ) , T

(13)

where C(t) is the time-dependent constant. If the long-time WTmetaD simulation is performed, the FES can be obtained from Eq. (13). Otherwise, the FES should be obtained by using the results of short-time WTmetaD simulations via the ensemble average for the biased probability distributions, as proposed in this study. We next consider the biased probability distribution obtained from the i-th WTmetaD simulation, Pi(s,tf), under the biasing potential, Vi(s,tf). In the practical computation, Pi(s,tf) is actually discretized and expressed as Pi ( s, tf )s 

ni ( s, tf ) , N i (tf )

(14)

where Δs is the bin size, ni(s,tf) is the number of samples related with the bin for s, and Ni(tf) is the number of total samples. In WHAM [32-34], the unbiased probability distribution is assumed to be P0 ( s )   wi ( s ) exp[ Vi ( s, tf )]exp[  ci (tf )]Pi ( s, tf ) ,

(15)

i

where exp[  ci (tf )]   ds exp[ Vi ( s, tf )]P0 ( s ) . Note that the overlap between the individual probability distributions is required for good performance of WHAM, as discussed in the application to the umbrella sampling [32-34]. Here, the coefficient for the linear combination, wi(s), is optimally determined by error minimization and the assumption of the Poisson process for ni(s,tf) [8]. As a result, the coefficient is optimized as wi ( s ) 

N i (tf ) exp[ Vi ( s, tf )]exp[  ci (tf )] . N ( t ) exp[   V ( s , t )]exp[  c ( t )]  j f f j j f j

8

(16)

The unbiased probability distribution is given by P0 ( s ) 

 N (t ) P (s, t ) i

N

f

f

i

i

j

(tf ) exp[ V j ( s, tf )]exp[  c j (tf )]

.

(17)

j

Finally, the free energy profile is calculated with the use of Eq. (1). In the present method, the biased histogram, Hi(s,t), is accumulated along the individual WTmetaD trajectory as follows: First, during the period of (m-1)τ ≤ t < mτ for m = 1, 2, …, the biased histogram, Hi(s,t), is normally updated. Second, at t = mτ, a new Gaussian centered at si(t) is generated, i.e.,

 Vi ( s (t ), t )   ( s  si (t )) 2  Vi ( s, t )  W0 exp    exp    . kB T  2 2   

(18)

The expectation value in Eq. (6), , is approximately evaluated as  exp[ Vi ( s, t )]   ds exp[ Vi ( s, t )]Pi ( s, t )

 exp[ V (s, t )]H (s, t )   H (s, t ) i

s

s

i

.

(19)

i

Third, the time evolution of the histogram, Hi(s,t), is performed according to Eq. (6). Then, the resultant biased histogram Hi(s,tf) and the final biasing potential Vi(s,tf) from the i-th WTmetaD trajectory are utilized for WHAM in Eq. (17). Note that Ni(tf) and Pi(s,tf) in Eq. (17) are constructed from

 H ( s, t ) i

s

f

and H i ( s, tf ) /  H i ( s, tf ) , respectively. A flowchart of the present method is s

presented in Figure 1. Note that the present method is just the post processing technique which does not require any synchronization of the biasing potentials among the different trajectories during the metaD simulations, in contrast to multiple walkers metaD. This fact indicates that the present method 9

is highly suitable for the massively parallel computing. The present scheme is easily applicable for the calculation of the unbiased probability distribution for degrees of freedom other than CVs, for example, P0(x) with respect to another coordinate, x. Here, the histogram, Hi(s,t), described above is simply substituted for the joint histogram, Hi(s,x,t). After performing WHAM to obtain the joint probability distribution, P0(s,x), the marginal probability for x can be calculated from the integration of P0(s,x) with respect to s. Note that this procedure, which corresponds to the reweighting method [31,35,36], enables us to obtain the FES for dimensions larger than the number of CVs used in the metaD simulation. Finally, let us shortly describe the difference from the previous studies. The application of WHAM to metaD simulations was proposed by Laio and co-workers [37], in which the trajectories from the metaD simulations were classified into microstates and then the simple time-average of the bias potentials in the microstates was used for solving the WHAM equation. Similar methods for collecting the metaD simulations were proposed on the basis of other reweighting techniques: mean force integration (MFI) [38] and iterative trajectory reweighting (ITRE) [39]. The feature of the present method is that the biased probability distributions are propagated according to Eq. (6) in the individual metaD trajectories, which are finally gathered using WHAM.

10

3. Numerical application 3.1 Computational details As a numerical test of the proposed method, we focused on the conformational transitions of the main chain of alanine dipeptide (consisting of 22 atoms) in the gas phase, which is widely employed as a benchmark for various simulation methods [3,24]. Alanine dipeptide has two main conformations in the gas phase, C7eq and C7ax, as illustrated in Figure 2. These two conformations can be mainly characterized by two dihedral angles, ϕ and ψ, namely, {−81°, 71°} and {74°, −67°}, respectively, obtained from the geometry optimization in the gas phase with the computational method described below. In particular, the two-dimensional (2D) FES of alanine dipeptide for ϕ and ψ, namely, the Ramachandran plot, is widely used for evaluating the efficiency of enhanced sampling methods [3,24]. Quantum MD simulations for alanine dipeptide in the gas phase were performed with the aid of the DCDFTBMD program [40]. The density-functional tight-binding (DFTB) method was adopted to calculate the electronic energy and its energy gradient, i.e., molecular force field, at the DFTB3 level with the 3OB parameter set [41,42]. Grimme’s DFT-D3 dispersion correction with the Becke–Johnson damping scheme was included [43,44]. The unbiased DFTB-MD simulations were conducted under the constant temperature of T = 300 K controlled by an Andersen thermostat [45]. The velocity Verlet integrator and the Rattle method with a time step of 1.0 fs were used for the time integration. After 100-ps equilibration, 20-ns production runs were performed for 150 independent

11

trajectories. As a result, unbiased trajectories with a total temporal length of 3 μs were generated for the initial structures and velocities of the WTmetaD simulations, as well as for the calculation of the 2D FES as a reference. Here, WTmetaD was employed to accelerate the conformational transitions of alanine dipeptide. Two dihedral angles, ϕ and ψ were selected as the CVs for the 2D WTmetaD simulations. The initial height of the Gaussian, W0, was set to 0.3 kcal/mol. The width of the Gaussian, σ, was set to 10° for both ϕ and ψ. The time interval for depositing the Gaussian, τ, and the biasing temperature, ΔT, were set to 100 fs and 3200 K, respectively. The determination of the parameters was based on the previous suggestion [26], that is, W0, σ, τ, and ΔT were determined from the thermal energy, 0.5 kBT, half of the standard deviation of ϕ in the most stable state, C7eq, from the unbiased MD simulations, the initial decay of the autocorrelation function for ϕ from the unbiased MD simulations, and the preliminary estimation of the free energy barrier from C7eq and C7ax with the standard metaD simulation, respectively. Using these parameters, 200 WTmetaD simulations were carried out for 500 ps to confirm the convergence of the 2D FESs in WHAM, in which 100 simulations started from C7eq and the others from C7ax. A single 16-ns WTmetaD simulation with the same parameters was also performed for the reference of long-time WTmetaD. Note that the boundary correction of the biasing potential for the periodic CVs [46] was not included in the present calculations.

12

3.2 Results and discussion Figure 3a shows the 2D FES for ϕ and ψ of alanine dipeptide in the gas phase obtained from 3-μs unbiased DFTB-MD simulations. Although the 2D space of the CVs was not completely explored due to the absence of the biasing, two distinct basins corresponding to C7eq and C7ax are clearly visible; the basins of C7eq and C7ax are located at around {ϕ, ψ} = {−85°, 70°} and {70°, −70°}, respectively. Two main pathways exist from C7eq to C7ax; one is the pathway through the TS located at around {0°, 80°} (as denoted by Path 1) and the other is that around {0°, −80°} (Path 2). The corresponding free energy barriers from C7eq to C7ax in Paths 1 and 2, F1‡ and F2‡ , are 7.1 and 6.8 kcal/mol, respectively. The free energy difference between the minimum values of C7eq (Feq) and C7ax (Fax), ΔF = Feq − Fax, is −1.1 kcal/mol. The statistical error of the 2D FES was estimated by monitoring the convergence of ΔF with respect to the temporal length (Figure S1 in Supporting Information). Figure 3b displays the 2D FES obtained from the single 16-ns WTmetaD simulation. Note that the 2D FES, F(ϕ,ψ), was calculated from the biasing potential, V(ϕ,ψ), on the basis of the standard procedure of WTmetaD in Eq. (13). All the regions, including the unstable states with high free energies, which were not accessible in the unbiased MD simulations on the order of microseconds, were successfully sampled owing to the sufficient biasing deposited during the long-time simulation. The present result is in reasonable agreement with that from a previous report based on a single 15-ns WTmetaD simulation with the DFTB3 level [47]. The free energy barriers,

13

F1‡ and F2‡ , were estimated to be 6.6 and 6.4 kcal/mol, respectively. The free energy difference,

ΔF, was −1.2 kcal/mol. Despite the slight underestimation of the barrier heights compared to the corresponding results of the 2-μs unbiased MD simulations (Figure 3a), the present long-time WTmetaD simulation was found to reasonably reproduce the main features of the FES. Figure 3c shows the 2D FES obtained from multiple short-time WTmetaD simulations. Note that the 2D FES was calculated from Eq. (1) with the unbiased probability distribution, P0(φ,ψ), which was constructed by using WHAM in Eq. (17) for 10 independent 400-ps WTmetaD simulations. The number of iterations for solving the WHAM equation was less than 20, and furthermore the computational cost of WHAM was negligible compared with that of the energy/gradient calculation in the DFTB-metaD simulations. As shown in Figures S2, S3, and S4 in Supporting Information, the biased probability distributions have overlaps with each other. Although the highest regions located at around {0°, ±180°} could not be sampled during the sub-ns WTmetaD simulations, the main structures of the 2D FES were well reproduced. The free energy barriers, F1‡ and F2‡ , were estimated to be 6.4 and 6.5 kcal/mol, respectively. The free energy difference, ΔF, was −0.9 kcal/mol. These values were consistent with the results from the long-time WTmetaD simulation (Figure 3b). Therefore, the proposed method had high efficiency when the multiple short-time simulations were performed using the parallel computational environment. Next, we systematically investigated the convergence of the 2D FES with respect to the temporal length of the individual WTmetaD simulations, tf, and the number of WTmetaD 14

trajectories, M. Figure 4 illustrates the deviation of FES, ΔFES, of the present WHAM scheme based on multiple short-time WTmetaD simulations from the reference FES of the 2-μs unbiased MD simulations as a function of M for various tf values. Similar to the analysis of Ref. 48, the explicit definition of ΔFES is given as follows:

 ds[ F (s)  F (s)] H ( F  F (s))  ds H ( F  F (s)) 2

 FES 

r

max

max

r

,

(20)

r

where Fr(s) is the reference FES with respect to s = {ϕ, ψ}, Fmax is the upper-bound value from the minimum value of Fr(s) for evaluating the deviation, which was chosen to be 7 kcal/mol, and H is the Heaviside function. Here, Fr (s) and F (s) are the shifted FESs with respect to the corresponding average values of the reference and target FESs, Fr(s) and F(s), respectively. Note that tf changed from 100 to 500 ps in 100 ps increments, whereas M increased from 1 to 200 in increments of 1. The deviations were found to decrease systematically with increasing M, demonstrating that the present method successfully reduced the error by the ensemble average. The deviation converged to ~0.4 kcal/mol when tf = 100 ps, whereas the deviations converged to ~0.2 kcal/mol when tf ≥ 200 ps. Figure S5 in Supporting Information shows the deviation as a function of the total length of the sampling, i.e., L = M × tf, for various tf values. The behaviors of the decrease with respect to L are very close to each other when tf is longer than 200 ps. The result indicates that if tf is longer than the threshold, the present WHAM scheme calculated from the multiple short-time WTmetaD simulations gives reasonable FES. This feature is helpful in situations where multiple short-time calculations can be performed independently using (massive) parallel computational resources, even if the 15

corresponding long-time calculation is impractical. Further discussion on the comparison with the straightforward averaging is given in Supporting Information (Figure S6). Figure 5 shows the histogram of the residence time of C7eq before the initial transition to C7ax in the WTmetaD simulations starting from C7eq. From 100 independent WTmetaD simulations, the average residence time was determined to be 73 ± 34 ps. If tf is reasonably longer than this value, the transition of C7eq to C7ax occurs in the individual WTmetaD trajectories, which results in accurate sampling of the biased histogram, Hi(s,tf). Therefore, when the target reactions are observed in the individual WTmetaD trajectories at least once, the present method can reproduce reasonable FESs from multiple WTmetaD simulations instead of performing a single long-time WTmetaD simulation in which the reactions are observed several times. As proposed in the previous reweighting method [31], the present method is also based on the assumption in Eq. (19), which can yield the systematic errors. This error can be evaluated from the time series of ΔFES [31]. Figure 6 shows the time series of  FES , which is calculated by simply averaging ΔFES for the 200 independent WTmetaD simulations, multiplied by t1/2. It is found that the deviation approximately decreases as t-1/2 for tf ≥ 400 ps, indicating that the systematic errors were negligible in this temporal region. As explained at the end of Section 2, the present scheme enables us to obtain the FES with respect to the degrees of freedom other than the original CVs. Let us here consider a new coordinate, θ, which is defined schematically in Figure 2a. Figure 7 shows the 2D FESs for the one of the

16

original CVs, ϕ, and the new coordinate, θ. Figures 7a and 7b were obtained from the 3-μs unbiased MD simulations and WHAM for 10 independent 400-ps WTmetaD simulations (with the CVs of ϕ and ψ), respectively. The corresponding FESs for the other CV, ψ, and θ are shown in Figure S7 in Supporting Information. From Figure 7a, the basins of C7eq and C7ax are found to be located at around {ϕ, θ} = {−85°, 0°} and {70°, −10°}, respectively. The free energy barrier from C7eq and C7ax and the free energy difference between C7eq and C7ax were 6.5 and −1.0 kcal/mol, respectively. Figure 7b reasonably reproduces the overall structure of the FESs given in Fig. 7a; the corresponding free energy barrier and difference were 6.2 and −0.8 kcal/mol, respectively. These results indicate that the present method is also useful for reasonable reweighting with respect to the degrees of freedom other than the CVs.

17

4. Conclusions A computational method for generating the unbiased equilibrium probability distribution of the CVs and/or other degrees of freedom from multiple short-time WTmetaD simulations was developed. In the present method, the biased probability distributions are separately calculated from multiple short-time WTmetaD simulations through the reweighting technique, which was derived by modifying the previous technique [31] following the construction of the converged equilibrium probability distribution using WHAM and, furthermore, the determination of the FES. Numerical assessment of the conformational transitions of alanine dipeptide in the gas phase based on the 3-μs unbiased MD, single 16-ns WTmetaD, and multiple sub-ns WTmetaD simulations showed that the proposed method is feasible for reasonably determining the FES if the target reaction occurs in the individual WTmetaD trajectories. The present method is useful if multiple short-time calculations are possible using the parallel environment with the available computational resources, even if the corresponding long-time calculation is impractical. Therefore, the present method is applicable not only for quantum MD simulations but also for huge-scale classical MD simulations to obtain converged FESs for analyzing chemical and/or biological phenomena from static viewpoints. Instead of WHAM in our method, the multistate Bennett acceptance ratio (MBAR) [49] can also be employed to obtain the equilibrium probability distribution in a binless fashion. This could be of great use if multiple variables are simultaneously biased. Future work along this line could achieve the further efficiency and/or accuracy for the applications to more complex systems.

18

Acknowledgments The authors thank Professor Yasushige Yonezawa at Kindai University for helpful discussions on metaD. This work was supported in part by a Grant-in-Aid for Scientific Research (S) “KAKENHI Grant Number JP18H05264” and a Grant-in-Aid for Specially Promoted Research “KAKENHI Grant Number JP15H05701” from the Japan Society for the Promotion of Science (JSPS). This work was also supported by the Element Strategy Initiative of MEXT, “Grant Number JPMXP0112101003”. The calculations were performed in part at the Research Center for Computational Science (RCCS), Okazaki Research Facilities, National Institutes of Natural Sciences (NIIS), Japan.

19

References [1] K. Henzler-Wildman, D. Kern, Dynamic personalities of proteins. Nature 450 (2007) 964–972. [2] A. Laio, F.L. Gervasio, Metadynamics: A method to simulate rare events and reconstruct the free energy in biophysics, chemistry and material science, Rep. Prog. Phys. 71 (2008) 126601. [3] O. Valsson, P. Tiwary, M. Parrinello, Enhancing important fluctuations: Rare events and metadynamics from a conceptual viewpoint, Annu. Rev. Phys. Chem. 67 (2016) 159–184. [4] C. Dellago, P.G. Bolhuis, D. Chandler, Efficient transition path sampling: Application to Lennard-Jones cluster rearrangements, J. Chem. Phys. 108 (1998) 9236–9245. [5] Y. Sugita, Y. Okamoto, Replica-exchange molecular dynamics method for protein folding, Chem. Phys. Lett. 314 (1999) 141–151. [6] Y. Sugita, A. Kitao, Y. Okamoto, Multidimensional replica-exchange method for free-energy calculations, J. Chem. Phys. 113 (2000) 6042–6051. [7] E. Weinan, W. Ren, E. Vanden-Eijnden, String method for the study of rare events, Phys. Rev. B 66 (2002) 052301. [8] S. Sakuraba, N. Matubayasi, Distribution-function approach to free energy computation, J. Chem. Phys. 135 (2011) 114108. [9] C. Abrams, G. Bussi, Enhanced sampling in molecular dynamics using metadynamics, replica-exchange, and temperature-acceleration, Entropy, 16 (2014) 163–199. [10]J.D. Chodera, F. Noé, Markov state models of biomolecular conformational dynamics, Curr.

20

Opin. Struct. Biol. 25 (2014) 135–144. [11]E. Rosta, G. Hummer, Free energies from dynamic weighted histogram analysis using unbiased Markov state model, J. Chem. Theory Comput. 11 (2015) 276–285. [12]R. Harada, Y. Takano, T. Baba, Y. Shigeta, Simple, yet powerful methodologies for conformational sampling of proteins, Phys. Chem. Chem. Phys. 17 (2015) 6155–6173. [13]R.C. Bernardi, M.C.R. Melo, K. Schulten, Enhanced sampling techniques in molecular dynamics simulations of biological systems, Biochim. Biophys. Acta 1850 (2015) 872–877. [14]Y. Yonezawa, A method for predicting protein conformational pathways by using molecular dynamics simulations guided by difference distance matrices, J. Comput. Chem. 37 (2016) 1139–1146. [15]B.E. Husic, V.S. Pande, Markov State models: From an art to a science, J. Am. Chem. Soc., 140 (2018) 2386–2396. [16]R. Harada, Simple, yet efficient conformational sampling methods for reproducing/predicting biologically rare events of proteins, Bull. Chem. Soc. Jpn. 91 (2018) 1436–1450. [17]R. Akashi, Y.S. Nagornov, Stochastic formalism for thermally driven distribution frontier: A nonempirical approach to the potential escape problem, J. Phys. Soc. Jpn. 87, 063801 (2018). [18]H. Fujisaki, K. Moritsugu, A. Mitsutake, H. Suetani, Conformational change of a biomolecule studied by the weighted ensemble method: Use of the diffusion map method to extract reaction coordinates, J. Chem. Phys. 149 (2018) 134112.

21

[19]M. Shoji, M. Kayanuma, Y. Shigeta, A practical approach for searching stable molecular structures by introducing repulsive interactions among walkers, Bull. Chem. Soc. Jpn. 91 (2018) 1465–1473. [20]M. Shiga, M.E. Tuckerman, Finding free-energy landmarks of chemical reactions, J. Phys. Chem. Lett. 9 (2018) 6207–6214. [21]H. Oshima, S. Re, Y. Sugita, Replica-exchange umbrella sampling combined with Gaussian accelerated molecular dynamics for free-energy calculation of biomolecules, J. Chem. Theory Comput. 15 (2019) 5199–5208. [22]A. Laio, M. Parrinello, Escaping free-energy minima, Proc. Natl. Acad. Sci. U. S. A. 99 (2002) 12562–12566. [23]A. Barducci, G. Bussi, M. Parrinello, Well-tempered metadynamics: A smoothly converging and tunable free-energy method, Phys. Rev. Lett. 100 (2008) 020603. [24]B. Ensing, M. De Vivo, Z. Liu, P. Moore, M.L. Klein, Metadynamics as a tool for exploring free energy landscapes of chemical reactions, Acc. Chem. Res. 39 (2006) 73–81. [25]A. Barducci, M. Bonomi, M. Parrinello, Metadynamics, WIREs Comput. Mol. Sci. 1 (2011) 826–843. [26]G. Bussi, D. Branduardi, Free-energy calculations with metadynamics: Theory and practice, Rev. Comput. Chem. 28 (2015) 1–49. [27]A.W. Sakti, Y. Nishimura, H. Nakai, Rigorous pKa estimation of amine species using

22

density-functional tight-binding-based metadynamics simulations, J. Chem. Theory Comput. 14 (2018) 351–356. [28]K. Doi, Y. Yamada, M. Okoshi, J. Ono, C.-P. Chou, H. Nakai, A. Yamada, Reversible sodium metal electrodes: Is fluorine an essential interphasial component? Angew. Chem. Int. Ed. 58 (2019) 8024–8028. [29]A.W. Sakti, Y. Nishimura, H. Nakai, Recent advances in quantum-mechanical molecular dynamics simulations of proton transfer mechanism in various water-based environments, WIREs Comput. Mol. Sci. 10 (2020) e1419. [30]J.F. Dama, M. Parrinello, G.A. Voth, Well-tempered metadynamics converges asymptotically, Phys. Rev. Lett. 112 (2014) 240602. [31]M. Bonomi, A. Barducci, M. Parrinello, Reconstructing the equilibrium Boltzmann Distribution from well-tempered metadynamics, J. Comput. Chem. 30 (2009) 1615–1621. [32]S. Kumar, J.M. Rosenberg, D. Bouzida, R.H. Swendsen, P.A. Kollman, The weighted histogram analysis method for free-energy calculations on biomolecules. I. The Method, J. Comput. Chem. 13 (1992) 1011–1021. [33]M. Souaille, B. Roux, Extension to the weighted histogram analysis method: combining umbrella sampling with free energy calculations, Comput. Phys. Commun. 135, (2001) 40–57. [34]F. Zhu, G. Hummer, Convergence and error estimation in free energy calculations using the weighted histogram analysis method, J. Comput. Chem. 33 (2012) 453–465.

23

[35]P. Tiwary, M. Parrinello, A time-independent free energy estimator for metadynamics, J. Phys. Chem. B 119 (2015) 736–742. [36]L. Donati, B. G. Keller, Girsanov reweighting for metadynamics simulations, J. Chem. Phys. 149 (2018) 072335. [37]X. Biarnésad, F. Pietruccib, F. Marinellic, A. Laio, METAGUI. A VMD interface for analyzing metadynamics and molecular dynamics simulations, Comput. Phys. Commun. 183 (2012) 203– 211. [38]V. Marinova, M. Salvalaglio, Time-independent free energies from metadynamics via mean force integration, J. Chem. Phys. 151 (2019) 164115. [39]F. Giberti, B. Cheng, G.A. Tribello, M. Ceriotti, Iterative Unbiasing of Quasi-Equilibrium Sampling, J. Chem. Theory Comput. 16 (2020) 100–107. [40]Y. Nishimura, H. Nakai, DCDFTBMD: Divide-and-conquer density functional tight-binding program for huge-system quantum mechanical molecular dynamics simulations, J. Comput. Chem. 40 (2019) 1538–1549. [41]M. Gaus, A. Goez, M. Elstner, Parametrization and benchmark of DFTB3 for organic molecules, J. Chem. Theory Comput. 9 (2013) 338–354. [42]M. Gaus, X. Lu, M. Elstner, Q. Cui, Parameterization of DFTB3/3OB for sulfur and phosphorus for chemical and biological applications, J. Chem. Theory Comput. 10 (2014) 1518–1537. [43]S. Grimme, J. Antony, S. Ehrlich, H. Krieg, A consistent and accurate ab initio parametrization

24

of density functional dispersion correction (DFT-D) for the 94 elements H-Pu, J. Chem. Phys. 132 (2010) 154104. [44]S. Grimme, S. Ehrlich, L. Goerigk, Effect of the damping function in dispersion corrected density functional theory. J. Comput. Chem. 32 (2011) 1456–1465. [45]H.C. Andersen, Molecular dynamics simulations at constant pressure and/or temperature. J. Chem. Phys. 72 (1980) 2384–2393. [46]M. McGovern, J. de Pablo, A boundary correction algorithm for metadynamics in multiple dimensions, J. Chem. Phys. 139 (2013) 084102. [47]J. Cuny, K. Korchagina, C. Menakbi, T. Mineva, Metadynamics combined with auxiliary density functional and density functional tight-binding methods: Alanine dipeptide as a case study, J. Mol. Model 23 (2017) 72. [48]D. Branduardi, G. Bussi, M. Parrinello, Metadynamics with adaptive Gaussians. J. Chem. Theory Comput. 8 (2012) 2247–2254. [49]M.R. Shirts, J.D. Chodera, Statistically optimal analysis of samples from multiple equilibrium states, J. Chem. Phys. 129 (2008) 129105.

25

Figure captions

Figure 1. Flowchart of the present method for calculating the biased histogram, Hi(s,t), along the i-th WTmetaD trajectory.

Figure 2. Representative snapshots of alanine dipeptide in the two main conformations in the gas phase: (a) C7eq with {ϕ, ψ} = {−81°, 71°} and (b) C7ax with {ϕ, ψ} = {74°, −67°} obtained from the geometry optimizations. Also shown is the definition of the dihedral angles in the main chain: ϕ and ψ, employed as the CVs, and θ utilized for the reweighting method.

Figure 3. Two-dimensional FESs for ϕ and ψ of alanine dipeptide in the gas phase obtained from (a) the unbiased DFTB-MD simulations with a total temporal length of 3 μs, (b) the single 16-ns WTmetaD simulation, and (c) WHAM for 10 independent 400-ps WTmetaD simulations in kcal/mol. The arrows in (a) indicate the two pathways from C7eq to C7ax, Path 1 and Path 2, with the free energy barriers of F1‡ and F2‡ , respectively.

26

Figure 4. Deviation of FES, ΔFES, of the present WHAM scheme based on the multiple WTmetaD simulations from the reference FES of the 3-μs unbiased MD simulations as a function of M for various tf values in alanine dipeptide in the gas phase.

Figure 5. Histogram of the residence time of C7eq before the initial transition to C7ax in the 100 independent WTmetaD simulations starting from C7eq, in arbitrary units (a.u.).

Figure 6. Analysis for the error originating from the reweighting method. Solid line indicates the deviation calculated by simply averaging 200 independent WTmetaD simulations, multiplied by t1/2. Corresponding standard deviation of the average is depicted as the shadow.

Figure 7. Two-dimensional FESs for ϕ and θ of alanine dipeptide in the gas phase obtained from (a) the unbiased DFTB-MD simulations with the total temporal length of 3 μs and (b) WHAM for 10 independent 400-ps WTmetaD simulations in kcal/mol. Note that the CVs of the WTmetaD simulations are ϕ and ψ.

27

Figure 1. Flowchart of the present method for calculating the biased histogram, Hi(s,t), along the i-th WTmetaD trajectory.

28

Figure 2. Representative snapshots of alanine dipeptide in the two main conformations in the gas phase: (a) C7eq with {ϕ, ψ} = {−81°, 71°} and (b) C7ax with {ϕ, ψ} = {74°, −67°} obtained from the geometry optimizations. Also shown is the definition of the dihedral angles in the main chain: ϕ and ψ, employed as the CVs, and θ utilized for the reweighting method.

29

Figure 3. Two-dimensional FESs for ϕ and ψ of alanine dipeptide in the gas phase obtained from (a) the unbiased DFTB-MD simulations with a total temporal length of 3 μs, (b) the single 16-ns WTmetaD simulation, and (c) WHAM for 10 independent 400-ps WTmetaD simulations in kcal/mol. The arrows in (a) indicate the two pathways from C7eq to C7ax, Path 1 and Path 2, with the free energy barriers of F1‡ and F2‡ , respectively.

30

Figure 4. Deviation of FES, ΔFES, of the present WHAM scheme based on the multiple WTmetaD simulations from the reference FES of the 3-μs unbiased MD simulations as a function of M for various tf values in alanine dipeptide in the gas phase.

31

Figure 5. Histogram of the residence time of C7eq before the initial transition to C7ax in the 100 independent WTmetaD simulations starting from C7eq, in arbitrary units (a.u.).

32

Figure 6. Analysis for the error originating from the reweighting method. Solid line indicates the deviation calculated by simply averaging 200 independent WTmetaD simulations, multiplied by t1/2. Corresponding standard deviation of the average is depicted as the shadow.

33

Figure 7. Two-dimensional FESs for ϕ and θ of alanine dipeptide in the gas phase obtained from (a) the unbiased DFTB-MD simulations with the total temporal length of 3 μs and (b) WHAM for 10 independent 400-ps WTmetaD simulations in kcal/mol. Note that the CVs of the WTmetaD simulations are ϕ and ψ.

34

Highlights 

Post-processing method for efficiently obtaining free energy surfaces is proposed



Biased histograms in short-time metadynamics simulations are merged for convergence



Weighted histogram analysis method is utilized for the statistical average



Application to a small peptide demonstrates the efficiency of the present method



The present method is feasible for massive parallel computational environments

35

CRediT author statement Junichi Ono: Conceptualization, Methodology, Software, Investigation, Data Curation, Writing Original Draft, Visualization. Hiromi Nakai: Conceptualization, Validation, Resources, Writing - Review & Editing, Supervision, Funding acquisition.

36

37

Declaration of interests ☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

☐The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:

38