NRMC – A GPU code for N -Reverse Monte Carlo modeling of fluids in confined media

NRMC – A GPU code for N -Reverse Monte Carlo modeling of fluids in confined media

Accepted Manuscript NRMC – a GPU code for N -Reverse Monte Carlo modeling of fluids in confined media Vicente Sánchez-Gil, Eva G. Noya, Enrique Lomba ...

214KB Sizes 0 Downloads 45 Views

Accepted Manuscript NRMC – a GPU code for N -Reverse Monte Carlo modeling of fluids in confined media Vicente Sánchez-Gil, Eva G. Noya, Enrique Lomba

PII: DOI: Reference:

S0010-4655(17)30122-4 http://dx.doi.org/10.1016/j.cpc.2017.04.008 COMPHY 6197

To appear in:

Computer Physics Communications

Received date : 13 January 2017 Revised date : 10 April 2017 Accepted date : 14 April 2017 Please cite this article as: V. Sánchez-Gil, et al., NRMC – a GPU code for N -Reverse Monte Carlo modeling of fluids in confined media, Computer Physics Communications (2017), http://dx.doi.org/10.1016/j.cpc.2017.04.008 This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

NRMC – a GPU code for N-Reverse Monte Carlo modeling of fluids in confined media Vicente S´anchez-Gil, Eva G. Noya, Enrique Lomba Instituto de Qu´ımica F´ısica Rocasolano, Consejo Superior de Investigaciones Cient´ıficas, CSIC, Calle Serrano 119, E-28006 Madrid, Spain

Abstract NRMC is a parallel code for performing N-Reverse Monte Carlo modeling of fluids in confined media [V. S´anchez-Gil, E. G. Noya and E. Lomba, J. Chem. Phys. 140, 024504 (2014)]. This method is an extension of the usual Reverse Monte Carlo method to obtain structural models of confined fluids compatible with experimental diffraction patterns, specifically designed to overcome the problem of slow diffusion that can appear under conditions of tight confinement. Most of the computational time in N-Reverse Monte Carlo modeling is spent in the evaluation of the structure factor for each trial configuration, a calculation that can be easily R CUDA so that the code can be run on GPUs leads to a parallelized. Implementation of the structure factor evaluation in NVIDIA speed up of up to two orders of magnitude. PROGRAM SUMMARY Program Title: NRMC gpu Program Files doi: http://dx.doi.org/10.17632/kbbgbkn68m.1 Licensing provisions: GNU General Public License 3 (GPL) R CUDA Programming language: FORTRAN, C and NVIDIA Supplementary material: An example calculation is provided R MathKExternal routines/libraries: LAPACK (for gfortran) or Intel R Fortran (included in Intel’s distribution) ernel for Intel Nature of problem: Determination of structural models of confined fluids compatible with experimental diffractograms Solution method: N-Reverse Monte Carlo simulations using GPUs

1. Introduction Understanding the adsorption of gases on porous media is an important and challenging goal, both from a fundamental and a practical point of view[1, 2]. The properties of the fluid are often affected by the confining material, usually leading to the appearance of novel physical behaviour. But also, porous materials have many applications in industry, for example, in catalysis or in mixtures separation[1, 2]. In order to fully understand an adsorption process, it is desirable to have access to the detailed microscopic picture of the adsorbate/adsorbent system. Molecular simulations can provide such description and have helped to understand the adsorption in a large variety of systems (see, for example, Refs. [1, 2, 3]). However, simulations rely on the goodness of the models used to represent the interactions between the different components in the system and, unfortunately, in many occasions these models fail to quantitatively predict the experimental adsorption behaviour[4, 5]. Alternatively, one can obtain structural information about the adsorption process from diffraction experiments. If the adsorbent material presents a well characterized crystalline structure and narrow pores that can host a reduced Preprint submitted to Elsevier

number of molecules (e.g., adsorption of aromatic and hydrocarbon molecules on MFI-type zeolites)[6] an structural model of the adsorbed fluid can be obtained using the Rietveld refinement method[7]. However, when the porous material presents a disordered structure or when the number of adsorbed particles is large, the Rietveld refinement route is no longer practical. Another approach is to use Reverse Monte Carlo (RMC) modeling[8, 9]. In this method a structural model is built starting from an initial atomic configuration and performing random atomic move attempts that are accepted or rejected depending on whether the calculated diffraction pattern approaches or not the experimental one. Recently, we have adapted this technique to confined fluids[10]. Our implementation, designated N-RMC, is specially suited to circumvent the slow diffusion problems that appear under tight confinement. The method has proven to be able to reproduce the structure of adsorbed fluids on zeolites up to the level of three body correlations[10]. It has also been used in conjunction with real experimental data to extract the structure of adsorbed argon[11] and toluene[12] on pure silica MEL. In both cases, analysis of the experimental data also yielded information about a structural deformation of the zeolite at high loads. The code was specifically designed to study the adsorption on crystalline adsorbents, such as zeolites. However, in principle, it could be used to study also the adsorption on disordered materials. In this contribution, we provide a source code for N-RMC modeling that is implemented in FORTRAN and CUDA (Computer Unified Device Architecture) C languages. The theoretical evaluation of the structure factor is the most time consuming part of the code and can make the simulations extremely slow, specially when increasing the system size or the range of scattering vectors q included in the fit. Fortunately, this calculation can be easily parallelized, speeding up calculations by a factor that depends on the particular system. The use of Graphic Processing Units (GPU), instead of the Computing Processing Units (CPUs), can reduce even further the computational time. April 22, 2017

In the two examples provided, simulations run about a hundred times faster in a GPU than in a CPU. Comparative times for Example 1 (fit of TOF neutron diffraction pattern of argon on pure silica MEL zeolite) are given in Table 1. The paper is organized as follows. In Section 2, we start by describing how to evaluate theoretically the powder diffraction pattern, measured either with X-ray or time-of-flight (TOF) neutron scattering experiments. After that, we give a comprehensive overview of the N-RMC method. Instructions to compile the code are provided in Section 3. The input files needed to start a simulation are enumerated and described in Section 4, whereas the output files generated by the code are sketched in Section 5 . Along with the source code, we provide two examples that are briefly described in Section 6.

Bragg peaks corresponding to the crystal, and a diffuse contribution coming from the adsorbate and the atomic vibrations of the crystal. Bragg peaks emerge at the wave-vectors allowed by the crystal unit cell, whereas the presence of disorder leads to enhanced intensities also at vectors of the reciprocal space incompatible with the crystal symmetry. In this code, the diffuse contribution was evaluated using an approximate method proposed by Mellergård and McGreevy[13] in which the intensities of the diffuse reciprocal points of the simulated super-cell are averaged over a certain range of q (in the two examples provided diffuse scattering was averaged over 0.1 Å−1 ). Spurious effects on the diffraction pattern resulting from the finite size of the system are drastically reduced using this approximate method. 2.2. Peak shape In real experimental data, peaks are not δ functions, but, instead, they spread over a region of the reciprocal space adopting a shape that depends on the type of radiation and the resolution of the equipment. For X-ray, several empirical functions have been proposed to describe the peak shape as a function of the diffraction angle 2θ. In the present code, we have implemented a pseudo-Voigt function[18], that captures the asymmetry of the diffraction peaks by mixing a Gaussian and a Lorentzian function: I(2θ) = ηL(2θ − 2θ0 ) + (1 − η)G(2θ − 2θ0 ). (4)

2. Method 2.1. Theoretical calculation of the diffraction pattern As mentioned in the preceding section, the code is prepared to analyze powder diffraction data, measured with X-ray or TOF neutrons. In both cases, we need to evaluate the orientationally averaged structure factor: S (q) =

X 2π2 |F(q0 )|2 δ(q − q0 )/q02 NV < f >2 q0

(1)

where N and V are, respectively, the number of atoms and the volume of system, q0 are the allowed vectors in the reciprocal unit cell, and < f > is the average of the form factors (X-ray) or coherent scattering lengths (neutrons) of the constituent atoms, f j . The 1/q0 2 factor stems from the angular integration over all the possible q0 orientations in the powder sample[13]. When using experimental data, the δ-function in this equation is replaced by the resolution function. Finally, F(q) contains the correlations between the scattering nuclei and is given by: F(q) =

N X

f j exp(iqR j )

Here L(2θ−2θ0 ) and G(2θ−2θ0 ) are the normalized Lorentz and Gaussian functions, respectively, and η is a mixing parameter that controls the weight of each function. Here we took η =0.5. The variation of the instrumental broadening of the peaks with the scattering angle was modeled using an expression proposed by Caglioti et al.[19] for the half-width at half-maximum of the Lorentz and Gaussian functions (HWHM): HWHM 2 = U tan2 θ + V tan θ + W.

For adsorbents whose structure is already known (e.g. zeolites), the parameters of the empirical functions of the peak shape and the instrumental zero offset 2θ0 [18] can be obtained from a Pawley refinement of each experimental diffractogram using Materials Studio Package[20]. As for X-ray, the peak shapes in scattered neutron diffractograms can be modeled using several empirical functions, such as, for example, those proposed by Ikeda and Carpenter[21] or by Von Dreele et al.[22]. In the present code, specifically for TOF experiments, we implemented this latter functional form, which consists of a convolution of two back-to-back exponentials with a Gaussian distribution[22, 23]:   H(∆T ) = N eu erfc(y) + ev erfc(z) , (6)

(2)

j=1

where R j denotes the position of the atom j in the unit cell. As mentioned before, when using X-ray data, f j stands for the atomic form factors. They exhibit a dependence with the scattering vector that can be accounted for with the function: f j (q) =

4 X i=1

 q 2 ! ai exp −bi +c 4π

(5)

(3)

The values of the parameters ai , bi and c might be taken from the International Tables of Crystallography[14, 15]. For neutron diffraction data, f j are the coherent scattering lengths of the constituent atoms, and their values can be found in Refs. [16, 17]. When studying the adsorption of gases on ordered materials, we will often be dealing with a system with partial order, consisting of a perfectly ordered crystal (the adsorbent) and a disordered fluid (the adsorbate). In those cases, the diffraction pattern can be decomposed into two contributions[13]: the

where erfc denotes the complementary error function. ∆T in the equation above is the difference in TOF between a given profile point and the position of the considered peak, and N is a normalizing function: N= 2

αβ . 2(α + β)

(7)

Table 1: Benchmark GPU/CPU times for Example 1 run with 200 cycles for equilibration plus 500 cycles to perform averages. Examples have been run on various GPU types and on a single CPU for the serial code. Note that in the last instance, the serial code is not included with the files for this publication for practical reasons.

GPU model GTX 1070 GTX 980 GTX 590 Serial

GPU cores 2880 3072 512 –

Compute capability 6.1 5.2 2 –

CPU Model R Xeon E5-2650 2.20GHz Intel R Core i7-2600 3.40GHz Intel R Core i7-2600 3.40GHz Intel R Xeon R E5-2650 2.20GHz Intel

u, v, y and z are given by: u

=

v

=

y

=

z =

α (ασ2 + 2∆T ) 2 β (βσ2 − 2∆T ) 2 ασ2 + ∆T √ σ 2 βσ2 − ∆T √ σ 2

S cal (qi ) and S exp (qi ) are the calculated and experimental structure factors, σ(qi ) are the uncertainties of the experimental measurement and Nq is the number of discrete points used to evaluate the diffraction pattern. Differently from the usual RMC, the N-RMC implementation also includes particles insertion/deletion attempts. The acceptance probability in this case is:   2 2 2  − ∆Nold    χnew − χ2old ∆Nnew acc − P = min 1, exp −  (13) 2 2

(8)

Here, ∆N 2 = (N − Nexp )2 /σ2N , Nexp being the experimental estimation of number of adsorbed particles and σN its uncertainty. This insertion/deletion attempts are crucial to avoid the trapping of adsorbate molecules in some regions of the porous material. The efficiency of this trial moves is much improved by attempting insertions only on those regions of the porous material accessible to the adsorbate[25]. As usual in RMC, overlaps between particles are avoided incorporating a hard repulsive core. Move attempts that result in a particle overlap are immediately rejected[8, 9]. When using experimental data, besides considering the appropriate peak-shape in the theoretical calculation of the diffraction pattern, one needs to correct for background effects. In our implementation this correction is fitted along the simulations, renormalizing the experimental data to minimize its difference to the calculated pattern:

α, β and σ are functions of the TOF: α

= α0 + α1 /d

β

4

σ

2

= β0 + β1 /d =

(9)

σ20 + σ21 d + σ22 d2

The TOF can be translated into d-spacing using: T OF = DIFA d2 + DIFC d + ZERO

(10)

where DIFA, DIFC and ZERO are parameters characteristic of the experimental setup and of the detector bank[24]. Their values, as well as those of the remaining parameters, namely, α0 , α1 , β0 , β1 , σ0 , σ1 , σ2 must be provided by the user in the input file (see Section 4). They can be obtained from a Pawley refinement of the empty adsorbent material. 2.3. N-Reverse Monte Carlo modeling

χ02 =

RMC is a method to obtain structural models of a given system that are compatible with experimental diffraction patterns[8, 9]. This is attained by starting from a random configuration and performing random particle moves that are accepted or rejected depending on whether the calculated structure factor approaches or not the experimental one according to the acceptance probability: acc

P

  2    χnew − χ2old   , = min 1, exp − 2

Nq X (S cal (qi ) − S exp (qi ))2 i=1

σ2 (qi )

.

Nq X (S cal (qi ) − ((S exp (qi ) − a)/b))2 i=1

σ2 (qi )

,

(14)

where the parameters a and b are the fitted parameters. The code also includes an additional parameter, kχ0 , that can be used to allow for larger deviations between the calculated and measured patterns than the experimental error. The advantage of this is that atoms exhibit a higher mobility, allowing a faster sampling of the configurational space. The acceptance probability is given then by:   2 2  − ∆Nold    ∆χ02 ∆Nnew acc − P = min 1, exp −kχ0 (15)  2 2

(11)

where χ is a quantity that measures the deviation of the calculated and experimental structure factors, before and after the movement (χold and χnew , respectively): χ2 =

GPU/CPU time (s) 546 858 1128 51840

02 where ∆χ02 = χ02 new − χold . One rather common assumption that is made when modeling adsorption processes is that the atomic vibration of the adsorbent material does not affect much its equilibrium adsorption properties and, therefore, is only included when studying

(12) 3

dynamic properties, such as diffusion coefficients[3]. This approximation is valid as long as the structure of the adsorbent material does not change much during the adsorption process. However, there are increasing evidences that many porous materials, such as zeolites, are rather flexible and can be deformed at high loads or when the size of the adsorbed molecules is comparable to the pore size (see, for example, Refs. [26, 27]). In those cases, the flexibility of the adsorbent needs to be explicitly taken into account. RMC tends to find the most disordered configurations compatible with the experimental data. As a consequence, an unconstrained RMC usually leads to large and unphysical deformations of the material. This can be avoided by incorporating all the information we know about the system in the RMC modeling, for example, constraining the bond lengths and bending angles[9]. The current version of the code incorporates some constrains that are specifically designed for pure silica zeolites. In particular, it includes a constraining potential for the S i − O bonds given by: Ubond =

Nbond 1X kbond (rS i−O,i − r0,i )2 , 2 i=1

moderately increase these values so that a more efficient sampling is attained. Regarding kbond and kbend , they should be assigned large enough values to prevent the adsorbent from undergoing unphysically large deformations. Initial values might be estimated from computer simulations or, in the case of crytalline adsorbents, from the Debye-Waller factors obtained, for example, from Rietveld refinement. 3. Source code and compilation The source code consists of two files, one written in FORTRAN90, NRMC gpu.f90, and another one in CUDA libGPU.cu. As mentioned before, the theoretical evaluation of the diffraction pattern can be easily parallelized, simply dividing the summatory of Eq. 1 in small pieces that will be calculated by different processing units. This is precisely the part of the code that will be processed in GPUs. The code can be R Fortran, and easily compiled using the Makefile (for Intel Makefile.gfortran for gfortran compiler) included in the Source directory. File book.h must also be provided. The name of the executable file is NRMC gpu. The code is currently prepared to deal with atomic and small rigid molecular adsorbates (argon and toluene are two representative examples) and with adsorbents with a maximum of two components. As mentioned before, however, the constrains for the adsorbent material implemented in the code were specifically designed for pure silica zeolites. For other materials implementing different constraining potentials might be needed. Fortunately, in many instances keeping the adsorbent rigid is a rather good approximation[12], and in those cases the code can be used in its present form.

(16)

where Nbond is the number of Si-O bonds in the adsorbent zeolite, rS i−O,i is the instantaneous length of bond i and r0,i its equilibrium bond distance. kbond is the parameter that controls the maximum deviation of the bond distances from the equilibrium value. It can also be interpreted as the weight of this term with respect to the remaining terms in the acceptance probability. Similarly, the O − S i − O angles are constrained by a bending interaction of the form: Ubend =

Nbend 1X kbend (θO−S i−O,i − θ0,i )2 , 2 i=1

(17) 4. Input files

where Nbend is the number of O-Si-O bending terms, θO−S i−O,i is the instantaneous angle of bending term i and θ0,i its equilibrium angle. kbend is the parameter that controls the maximum deviation of the bending angles from the equilibrium value. When these constraints are incorporated to the N-RMC method, the acceptance probability of the zeolite atomic displacement attempts becomes: !! ∆χ02 acc 0 P = min 1, exp −kχ − ∆Ubond − ∆Ubend , (18) 2

Only four input files are needed to start an N-RMC simulation: • The input file, that contains all the information needed to perform an N-RMC simulation. The meaning of each entry is specified in the two examples provided. • One (or multiple) file(s) with the experimental diffraction pattern. The name of these files are provided in the input (variable file exp in section “Description of the experimental data”).

where ∆Ubond = Ubond,new − Ubond,old and ∆Ubend = Ubend,new − Ubend,old . The appropriate constraints are system dependent and for other adsorbents these constraints might be modified in the source code in subroutines init neighbors zeo, trans chi2 ox and trans chi2 si. Once all the terms are defined, one has to decide what values to assign to the parameters σχ0 , σN , kbond and kbend that control the weight of the different contibrutions to the acceptance probability. As mentioned before, σχ0 and σN can be directly related to the uncertainties in the experimental diffraction pattern and estimate of the number adsorbed molecules, respectively. However, sometimes it might be computationally advantageous to

• A file with the initial configuration of the adsorbent. The name of this file is provided in the input (variable adsorbent file name in section “Information relative to the N-RMC simulations”). • The file molecule.xyz, that provides the geometry of the molecule in a reference Cartesian coordinate system. In the first line, the number of atoms of the molecule, N, and the number of atoms Ncenter used to evaluate the center about which rotations will be performed are specified. The center of the molecule is simply identified with the geometric center of the Ncenter first atoms in the molecule. 4

The second line is for comments. The remaining lines have four columns, the first one indicates the species of the atom and the three remaining columns are the Cartesian coordinates of the atom.

where i is an index that runs over the number of experimental data, are also generated. All these files have five columns, the first one contains the experimental variable used to measure the diffraction pattern (i.e. the diffracted angle 2θ in x-ray, or time-of-flight in TOF neutrons), the modulus of the vectors in reciprocal space (in Å−1 ), the total calculated pattern, and the contributions arising from Bragg and diffuse scattering.

When restarting a previous N-RMC simulation, besides those four files, the program also needs some files produced in the previous run: • The file configuration ad-in.xyz with the initial configuration of the adsorbate. The first line specifies the number of atoms. The second line has the factor that scales the intensity of the experimental and calculated pattern so that they exhibit comparable intensities. The remaining lines have four entries with the atom name and the three the Cartesian coordinates. When dealing with molecules, the atomic coordinates of each atom in the molecule are in contiguous lines following the order specified in the input file, in section “Definition of the atoms and molecules present in the system”. This file can be generated by renaming configuration ad-out.xyz from the periodic output of the simulation.

and • Files configuration ad final.xyz configuration ad-out.xyz contain the Cartesian coordinates of the final and current configuration of the adsorbed fluid in xyz format. The final and current configuration of the adsorbent are stored in files configuration adsorbent.xyz and respectively. configuration adsorbent RMC.xyz, Additionally, the code calculates the average configuration of the adsorbent along the simulation, which is saved in file configuration adsorbent RMC average.xyz. • File trajectory ad.xyz and, if the zeolite is flexible also trajectory adsorbent.xyz, store configurations of the adsorbed fluid and the adsorbent along the simulation with a frequency that is specified by the parameter control2 in the input file.

• If the adsorbent is considered to be flexible, the final configuration of the previous run, file configuration adsorbent RMC.xyz, must also be provided.

• File background params.dat contains the parameters resulting from fitting the background function to the simulated data (a and b in Eq. 14) at the end of the simulation.

• The file background params.dat that contains the parameters resulting from fitting a background function to the simulated data (a and b in Eq. 14) in the last step of the previous simulation.

• File pos accessible.dat stores the Cartesian coordinates of a grid of points that are accessible to the adsorbed fluid. At the beginning of the simulation, the simulation box is partitioned into a grid of cells with a width specified in the input file (variable paso red). Then, the code checks in which cells of that grid is possible to insert an adsorbate particle without incurring in an overlap with the adsorbate. Restricting the particle insertion attempts to these cells leads to a considerably speed up of the equilibration process[25]. This file can be read in continuation runs.

5. Output files The program produces the following output: • File N vs cycles.dat saves the evolution of the number of particles along the simulation. File chi vs cycles.dat monitors the evolution the χ0 factor (14) and the constraining terms ∆Ubond and ∆Ubend that contribute to the acceptance probability (Eq. 18). File R wp vs cycles.dat stores the evolution of the weighted profile R-factor [28]. In the case that more than one experimental file is provided, separate Rwp factors are calculated for each set of data and printed as additional columns in this file.

• Files map x.dat, map y.dat and map z.dat store unidimensional density probability maps integrated along the three axes of the Cartesian coordinate system. For molecules these maps are calculated using the position of their reference point. The bin size of the grid used to build these maps is specified in the input (variable paso mapa).

• Files S initial.dat and S final.dat contain, respectively, the initial and final calculated diffraction patterns. The pattern is also saved periodically along the simulation in file S calculated.dat, which can be used to monitor the evolution of the simulation on the flight. File S experimental.dat stores the experimental data after correcting for the background contribution obtained fitting the experimental to the calculated pattern (Eq. 14). When more than one set of experimental data is provided, files S calculated i.dat and S experimental i.dat,

• File adsorbate centers.xyz saves the Cartesian coordinates of the reference point of each adsorbed molecule at the end of the simulation. and • Files RDF centre centre.xyz RDF centre centre.xyz save the pair distribution function and the bond angle distribution of the adsorbed fluid calculated using the reference point of the molecule. The bond angle distribution is defined as the integral 5

of the three-body correlation function over the first coordination shell (see Ref. [10]). The distance up to which this integral is performed is specified in the input file (variable r ang max). In the same way, the user can choose the radial and angular increment (∆r in Angstrom and ∆θ in radians) used to evaluate the pair distribution function (variables paso r and paso ang). As the code was prepared to deal with pure silica zeolites, it also calculates the pair correlations of the adsorbed fluid with the oxygen atoms of the zeolite, storing it at file RDF centre 1.xyz. If the zeolite is flexible its partial pair and angular distribution functions are printed as well (files RDF 1 1.xyz, RDF 1 2.xyz, RDF 2 2.xyz, ADF 1 2 1.xyz and ADF 2 1 1.xyz, where the indices refer to the atomic components). For other adsorbents with a different composition, the appropriate partial distribution functions can be calculated from the trajectory files trajectory ad.xyz and trajectory adsorbent.xyz. 6. Examples Two examples are provided along with the source code: • Example 1: TOF neutron diffraction pattern of argon adsorbed on pure silica MEL zeolite Experimental data of this example was taken from Ref. [11]. It corresponds to a load of ≈ 31 Ar atoms per unit cell. Note that in order to increase the signal from the adsorbed fluid measurements were performed for isotope 36 Ar, which exhibits a coherent scattering length about ten times larger than that of the usual 40 Ar isotope. In this example, the zeolite framework is kept frozen.

[6] N. Floquet, J. P. Coulomb, J. P. Bellat, J. M. Simon, G. Weber, G. Andre, J. Phys. Chem. C 111 (2007) 18182. [7] H. M. Rietveld, J. Appl. Crystallogr. 2 (1965) 65. [8] R. L. McGreevy, L. Pusztai, Mol. Simul. 1 (1988) 359. [9] R. L. McGreevy, J. Phys.: Condens. Matter 13 (2001) R887. [10] V. S´anchez-Gil, E. G. Noya, E. Lomba, J. Chem. Phys. 140 (2014) 024504. [11] V. S´anchez-Gil, E. G. Noya, J. M. Guil, E. Lomba, S. Valencia, I. da Silva, L. Pusztai, L. Temleitner, J. Phys. Chem. C 120 (2016) 2260. [12] V. S´anchez-Gil, E. G. Noya, A. Sanz, S. J. Khatib, , J. M. Guil, E. Lomba, R. G. Marguta, S. Valencia, J. Phys. Chem. C 120 (2016) 8640–8652. [13] A. Mellergård, R. L. McGreevy, Acta Crystallogr. A55 (1999) 783. [14] P. J. Brown, A. G. Fox, E. N. Maslen, M. A. O’Keefe, B. T. M. Willis, International Tables of Crystallograpy C (2006) 554–595. [15] http://lampx.tugraz.at/ hadley/ss1/crystaldiffraction/atomicformfactors/formfactors.php, 2016. [16] V. F. Sears, Neutron news 3 (1992) 29. [17] http://www.ncnr.nist.gov/resources/n-lengths/, 2016. [18] L. B. McCusker, R. B. von Dreele, D. E. Cox, D. Lou¨er, P. Scardi, J. Appl. Cryst. 32 (1999) 36–50. [19] G. Caglioti, A. Paoletti, F. P. Ricci, Nucl. Instrum. Methods 3 (1958) 223. [20] Materials Studio, version 5.0.0.0, Accelrys Software Inc.: San Diego, CA, 2009. [21] S. Ikeda, J. M. Carpenter, Nuclear Inst. and Methods in Physics Research, A 239 (1985) 536–544. [22] R. B. V. Dreele, J. D. Jorgensen, C. G. Windsor, J. Appl. Cryst. 15 (1982) 581–589. [23] L. B. McCusker, R. B. V. Dreele, D. E. Cox, D. Lou¨er, P. Scardi, J. Appl. Cryst. 32 (1999) 36–50. [24] A. C. Hannon, Nucl. Instrum. Methods Phys. Res. A 551 (2005) 88–107. [25] R. Q. Snurr, A. T. Bell, D. N. Theodorou, J. Phys. Chem 97 (1993) 13742. [26] P. L. Llewellyn, J.-P. Coulomb, Y. Grillet, J. Patarin, H. Lauter, H. Reichert, J. Rouquerol, Langmuir 9 (1998) 1846. [27] E. Garc´ıa-P´erez, J. B. Parra, C. O. Ania, D. Dubbeldam, T. J. H. Vlugt, J. M. Castillo, P. J. Merkling, S. Calero, J. Phys. Chem. C 112 (2008) 9976. [28] D. E. Clark, Evolutionary Algorithms in Molecular Design, John Wiley & Sons, Weinheim, Germany, 2000.

• Example 2: X-ray diffraction pattern of toluene adsorbed on pure silica MEL zeolite. Experimental data of this example was taken from Ref. [12]. It corresponds to a load of 6 toluene molecules per unit cell. Now the zeolite framework is also allowed to change, but imposing harmonic S i − O bond and S i − O − S i bending constraints that prevent large, unphysical deformations of the zeolite. 7. Acknowledgments This work was funded by Direcci´on General de Investigaci´on Cient´ıfica y T´ecnica under Grant No. FIS2013-47350-C5-4-R. VSG also thanks the CSIC for support by means of a JAE program Ph.D. fellowship. Useful discussions with Laszlo Pusztai and Laszlo Temleitner are gratefully acknowledged. [1] L. D. Gelb, K. E. Gubbins, R. Radhakrishnan, M. Sliwinska-Bartkowiak, Rep. Prog. Phys. 62 (1999) 1573–1659. [2] C. Alba-Simionesco, B. Coasne, G. Dosseh, G. Dudziak, K. E. Gubbins, R. Radhakrishnan, M. Sliwinska-Bartkowiak, J. Phys.: Condens. Matter 18 (2006) R15. [3] B. Smit, T. L. M. Maesen, Chem. Rev. 108 (2008) 4125. [4] R. J.-M. Pellenq, D. Nicholson, Langmuir 95 (1995) 1626. [5] V. S´anchez-Gil, E. G. Noya, J. M. Guil, E. Lomba, S. Valencia, Micropor. Mesopor. Mat. 222 (2016) 218–225.

6