An effective HIV-1 integrase inhibitor screening platform: Rationality validation of drug screening, conformational mobility and molecular recognition analysis for PFV integrase complex with viral DNA

An effective HIV-1 integrase inhibitor screening platform: Rationality validation of drug screening, conformational mobility and molecular recognition analysis for PFV integrase complex with viral DNA

Journal of Molecular Graphics and Modelling 78 (2017) 96–109 Contents lists available at ScienceDirect Journal of Molecular Graphics and Modelling j...

5MB Sizes 1 Downloads 83 Views

Journal of Molecular Graphics and Modelling 78 (2017) 96–109

Contents lists available at ScienceDirect

Journal of Molecular Graphics and Modelling journal homepage: www.elsevier.com/locate/JMGM

Topical Perspectives

An effective HIV-1 integrase inhibitor screening platform: Rationality validation of drug screening, conformational mobility and molecular recognition analysis for PFV integrase complex with viral DNA Wenyi Du a , Ke Zuo a , Xin Sun a , Wei Liu a , Xiao Yan a , Li Liang a , Hua Wan b , Fengzheng Chen c , Jianping Hu a,∗ a College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Key Laboratory of Medicinal and Edible Plants Resources Development, Chengdu University, Chengdu, China b College of Mathematics and Informatics, South China Agricultural University, Guangzhou, China c Department of Chemistry, Leshan Normal University, Leshan, China

a r t i c l e

i n f o

Article history: Received 24 July 2017 Received in revised form 1 October 2017 Accepted 3 October 2017 Available online 5 October 2017 Keywords: HIV-1 integrase Molecular docking Molecular dynamics simulation Binding free energy Drug design

a b s t r a c t As an important target for the development of novel anti-AIDS drugs, HIV-1 integrase (IN) has been widely concerned. However, the lack of a complete accurate crystal structure of HIV-1 IN greatly blocks the discovery of novel inhibitors. In this work, an effective HIV-1 IN inhibitor screening platform, namely PFV IN, was filtered from all species of INs. Next, the 40.8% similarity with HIV-1 IN, as well as the high efficiency of virtual screening and the good agreement between calculated binding free energies and experimental ones all proved PFV IN is a promising screening platform for HIV-1 IN inhibitors. Then, the molecular recognition mechanism of PFV IN by its substrate viral DNA and six naphthyridine derivatives (NRDs) inhibitors was investigated through molecular docking, molecular dynamics simulations and water-mediated interactions analyses. The functional partition of NRDs IN inhibitors could be divided into hydrophobic and hydrophilic ones, and the Mg2+ ions, water molecules and conserved DDE motif residues all interacted with the hydrophilic partition, while the bases in viral DNA and residues like Tyr212, Pro214 interacted with the hydrophobic one. Finally, the free energy landscape (FEL) and cluster analyses were performed to explore the molecular motion of PFV IN-DNA system. It is found that the association with NRDs inhibitors would obviously decrease the motion amplitude of PFV IN-DNA, which may be one of the most potential mechanisms of IN inhibitors. This work will provide a theoretical basis for the inhibitor design based on the structure of HIV-1 IN. © 2017 Elsevier Inc. All rights reserved.

1. Introduction Human immunodeficiency virus type 1 (HIV-1) is the culprit of the emerge of acquired immune deficiency syndrome (AIDS) [1–3]. HIV-1 life cycle mainly contains six phases: fusion, reverse transcription, integration, cleavage, packaging and reproduction [4]. Wherein, the integration consists of 3 -processing (3 -P) and strand transfer (ST) reactions [5,6]. During 3 -P reaction occurring in cytoplasm, HIV-1 integrase (IN) cleaves two nucleotides from 3 ends of viral DNA, then the 3 hydroxyl groups of the CA dinucleotides are exposed. Subsequently, ST reaction occurs in nucleus where the 3 hydroxyl attacks the phosphodiester of host chro-

∗ Corresponding author. E-mail address: [email protected] (J. Hu). https://doi.org/10.1016/j.jmgm.2017.10.002 1093-3263/© 2017 Elsevier Inc. All rights reserved.

mosome. HIV-1 will quickly spread along with the replication of the host chromosome once the ST reaction accomplished. The fulllength of HIV-1 IN enzyme comprises 288 residues, which can be divided into three domains[6]. Specifically, the N-terminal domain (NTD, residues 1 ∼ 50) contains a conserved “HHCC” motif binding a Zn2+ ion and contributes to enzymatic multimerization [7]; the catalytic core domain (CCD, residues 51 ∼ 211) has three conserved DDE motif residues (i.e., Asp64, Asp116, Glu152) chelating to two Mg2+ ions, which is the most important structural unit for exerting its integration activity [8]; the C-terminal domain (CTD, residue 212 ∼ 288) can nonspecifically bind with DNA [9]. Structurally and mechanistically same with RNaseH and Tn5 transposases, HIV-1 IN belongs to the polynucleotidyl transferases family [10–13]. In terms of that integration is a critical process in the HIV-1 life cycle, there is no homologous protein of IN in human body, thus HIV-1

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

IN becomes an important drug target for the treatment of AIDS in recent years [14]. IN strand transfer inhibitors (INSTIs) are a major class of active site inhibitors against HIV-1 integration through preventing the ST reaction, and plenty of significant developments and designs of INSTIs were reported in recent years [10]. Raltegravir (RLT) was the first INSTIs approved through FDA [15] in 2007, which provided a new option for highly effective antiretroviral therapy (HAART) [16], converted AIDS into a chronic controllable disease, and greatly reduced the mortality of AIDS patients. After this dramatic advance, Elvitegravir (EVG) and Dolutegravir (DTG) have been successively approved, promoting the research of INSTIs to a rapid development stage [17,18]. RLT, EVG and DTG all broadly belong to the bioisosteres compounds of diketone acid (DKA). DKA derivatives, containing aromatic ring, 1,3-dicarbonyl and acid group, are a class of superactive HIV-1 INSTIs, in which 1,3-dicarbonyl group seizes two Mg2+ ions, destroying the metal ion-mediated integration catalysis [19–21]. Currently, most of reported highly active INSTIs are the bioisosteres of DKA, either using heterocyclic to replace the aromatic ring or N atom instead of O atom in the carbonyl group, and so on. Sato group’s results illustrated that the bioisosteres of DKA have three structural characteristics (i.e., ketone analogs, enolization ketone and carbonyl oxygen) whose spatial arrangement is coplanar [22]. Wang et al. designed and synthesized a series of novel ␤-diketone derivatives, in which polyhydroxylated aromatic moiety served as a significant role in inhibiting ST reaction, and the 3,4,5-trihydroxylated aromatic moiety exhibited stronger inhibition than dihydroxylated aromatic one [20]. Feng et al. not only established a subunit exchange assay for the discovery of new inhibitors selectively modulating IN multi-merization, but also obtained some novel inhibitors effectively suppressing INSTIs phenotype resistant virus. Moreover, many in-depth studies on drug resistance have been carried out [23]. Wainberg’s group found that HIV-1 IN T66I/R263 K double mutations could greatly influence the infection and replication ability of the virus, exhibiting incompatible with the emergence of resistance against RLT [24,25]. It was also found that S119G/R/P/T mutations were frequently observed in HIV-1 clinical specimens, and the S119X polymorphism might contribute a lot to the current drug resistance caused by Y143R, Q148H, N155H and G140S/Q148H mutations [26]. There are many reports on structure, function, mutation, and resistance for HIV-1 IN, nevertheless, the experimental data related to full length structure of HIV-1 IN are still rare, which greatly hindered the discovery and development of novel INSTIs. As an alternative, prototype foamy virus (PFV) IN is adopted as the homologous protein model for the development of anti-AIDS drugs. Valkov et al. analyzed the conservation of CCD domain and the composition of active site, then stated that PFV IN may be a suitable alternative model for HIV-1 IN [27]. Johnson’s group generated a corresponding HIV-1 IN model from PFV IN crystal structure (PDB ID: 3OY9 and 3OS1), and predicted the in vitro anti-INSTIs activity using molecular dynamics (MD) simulation and molecular docking methods [6]. Johnson et al. also designed a series of novel INSTIs based on the previously established target model through structural superposition, homology modeling and molecular biology experiments [28]. In addition, Hu et al. proposed a possible inhibition mechanism of RLT and the recognition of DKAs with PFV IN-DNA via MD simulation and molecular docking methods [29]. Some previous work showed that PFV IN might be an ideal model for the development of HIV-1 INSTIs [6,27,29]. Actually, only the molecular docking and MD simulation were used to validate the reliability of the platform [6,29,30], but other important strategies including database filtering, free energy calculation, etc., have not yet been adopted. Furthermore, some central problems are still unclear, such as conformational motion, free energy landscape, and DNA conformational changes etc.

97

Hare [31] recently obtained diffractions crystal structure of PFV IN complexed with vital DNA and DTG inhibitor. In the present work, the rationality of the PFV IN-DNA crystal complex for INSTIs screening was firstly confirmed through structural and sequential analyses. Then, the effectiveness and sensibility of the model were further verified by molecular docking and virtual screening. Next, ten comparative MD simulations in conjunction with PCA, FEL, Cluster and Curves analyses were performed to explore the motion mode and conformational change of PFV IN-DNA. Finally, the optimized recognition of DKAs with PFV IN-DNA was obtained via hydrogen bond and energy decomposition calculation etc. In sum, this work provides the structural and theoretical help for anti-AIDS drug design based on IN structure. 2. Materials and methods All computational studies were carried out in four single machines, running on two Intel Xeon E5-2643v3 processors with 32 GB RAM and 2 TB hard disk with Red Hat Linux Enterprise version 6.5 as the operating system. 2.1. Virtual screening As an essential part in the discovery of lead compounds, virtual screening plays a key role in the development of novel drugs. It breaks through the limitations of traditional chemical syntheses and pharmacological experiments and becomes an excellent choice for rapid evaluation of compounds [32]. Before virtual screening, a novel HIV-1 IN inhibitors database was created, which consists of 1673 molecules with activity data obtained from the current literatures and the newest National Cancer Institute (NCI) database consisting of 265,242 structures. Two-steps screening (i.e., pharmacophore and molecular docking strategies) was adopted in the present work. In the first step, the UNITY module in the SYBYL-X 1.3 software was performed to construct the pharmacophore model of IN inhibitors, including three H-bond acceptors (HA), one H-bond donor (HD) and two hydrophobic aromatic centers (HP). Once the pharmacophore model was constructed, it was used to filter our created database to pick out the active compounds satisfying the specific geometric or physicochemical constraints [33]. The second step is the docking screening by using Surflex-Dock screen (SFXC) program in SYBYL-X 1.3, a fast docking tool with high accuracy. The docking acceptor was the optimized PFV IN-DNA crystal structure, and the ligand source was the reduced database after pharmacophore screening. Additional starting conformation per molecule, the maximum conformations per fragment and the maximum number of rotatable bonds per molecule were set as 0, 20, and 100, respectively. The search grid was set as 6 nm. Then, the following flags using default parameters were turned on: pre-dock minimization, post-dock minimization, molecule fragmentation, soft grid treatment, and activate spin alignment method. 2.2. Druggability analysis Druggability evaluation of a target is an important concept in the confirmation of drug binding pocket. Druggability provides a quantitative estimate of binding site and affinities for a potential drug acting on a specific protein target [34]. Druggability calculation, using a set of probes (i.e., isopropanol, acetamide, acetate and isopropylamine etc.) with diverse physicochemical properties, can automatically analyze MD trajectories of target protein complexed with these probe molecules, then achieve the druggability prediction. It can also be used to find the unknown active sites in proteins mainly by calculating the binding free energy between the probe

98

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

molecules and the protein surface. The binding free energy could be calculated by [34,35]: i Gprobebinding = −RT ln

n  i

n0

(1)

where, the ni /n0 represents the ratio of the observed density of probes to the expected density. R is the gas constant, and T is the absolute temperature. In the present study, the object of druggability assessment was the PFV IN crystal structure (PDB code: 3S3 M) with the DNA and ligands removed. The probe molecules were made up of 70% isopropanol, 10% acetamide and 20% isopropylamine. In term of druggability parameters, the simulation length was set to 40 ns, the temperature was 300 K, and probe merge radius was set as 0.55 nm. The position for the probe molecules with the lowest binding free energy was considered as the binding pocket.

non-bonded interactions. The MD time step was taken as 2 fs, and one snapshot was sampled every 500 steps (1 ps), thus 100,000 conformations were obtained in each MD simulation. Based on the MD trajectories, Molecular Mechanics/Poisson Boltzmann Surface Area (MM/PBSA) calculations can estimate the energy between PFV IN-DNA and its inhibitors including van der Waals energy (VDW), electrostatic energy (ELE), hydrophilic contribution to solvation free energy (PBELE), and hydrophobic contribution to solvation free energy (PBSUR), and conformational entropy (TS) [41]. At last, the binding free energy (Gbind ) could be calculated by the formula: Gbind = H − TS = (VDW + ELE + PBELE + PBSUR) −TS

(2)

2.3. Molecular docking

2.5. Free energy landscape

Six naphthyridine derivatives (NRDs) with high HIV-1 IN inhibitory activity [5] were docked to the binding site of PFV IN-DNA using AutoDock 4.0 package [36]. The program performs automated docking of a fully flexible ligand to the rigid receptor. The rotations of all the single bonds were taken into account during the docking calculation. Moreover, the Lamarckian genetic algorithm (LGA) combined with the local energy search method were used to find the globally appropriate conformations and binding positions of the ligands, and a semiempirical potential function was used as the energy scoring function. The following parameters were used for docking: The maximum number of energy evaluations was increased to 2.5 × 106 , the maximum number of generations was 2.7 × 105 , the number of points in x-/y-/z-dimension were set to 4.0 nm, the spacing was set to 0.0375 nm, the center grid box was set as x: −38, y: 32, z: −20, the GA-LS runs was set as 128. Finally, the structure with the lowest docking energy was taken as the final docking result.

The free energy landscape (FEL) can provide a quantitative description of protein dynamics [42]. FEL can be obtained from joint-probability distribution of reaction coordinates. It can be calculated by:

2.4. Molecular dynamics simulation All-atom molecular dynamics (MD) simulation is one of the most important technique in exploring the thermodynamics and kinetics features of biomolecules. In this work, ten independent MD simulations comprising PFV IN-DNA-RLT/EVG/DTG, PFV IN-DNA-Cmpd # 1 ∼ 6 and PFV IN-DNA monomer systems were carried out using Amber 16 package [37]. In the MD simulations, Amber ff14SB allatom force field [38] was applied and the solutes were solvated in a truncated periodic box with a 1.0 nm solute-wall distance using the TIP3P water model [39]. Finally, these ten systems contained 9087, 9300, 9297, 9294, 9319, 9295, 9288, 9288, 9286 and 9343 water molecules, as well as 30317, 30957, 30943, 30922, 31007, 30941, 30923, 30929, 30917, 31032 atoms, respectively. Six Na+ ions were added to maintain the charge balance in each system. Two phases of minimization were performed for the initial structures before MD simulations. First, the solute atoms were restrained with the force constant 2.09 × 105 kJ mol−1 nm−2 , and water molecules were minimized for 5000 steps with the steepest descent method followed by 5000 steps of conjugate gradient. Second, unrestrained minimizations containing 5000 step steepest descent and 5000 step conjugate gradient of the whole systems were executed with the convergence criterion that energy gradient is less than 4.18 kJ mol−1 nm−2 . MD simulations were also set to two stages. First, we restrained the solutes and slowly heated the systems up from 0 to 300 K over 5 ns. Then, non-restraint MD simulations at 300 K were performed for 95 ns. In these simulations, SHAKE algorithm [40] was applied to constrain covalent bonds involving hydrogen atoms and a 1.2 nm cut-off was used for all

G (X) = −kB T ln P (X)

(3)

where kB represents the Boltzmann constant, T represents the temperature during the simulation, and the P (X) is the probability distribution along the reaction coordinate X [43]. In the FEL, both g covar and g anaeig programs in Gromacs software [44] were adopted to calculate the eigenvector and eigenvalue. Then, the gaussian broadening method was employed to get the probability distribution [45]. The local basins populated with high probability usually represent low-energy conformations in stable states, while the barriers located in the low probability regions indicate the highenergy transient states. In our present study, the first and second principal components were chosen as the reaction coordinates, and the two-dimensional FEL was calculated to identify the dominant conformations with lower energy. 2.6. Cluster analysis Cluster analysis is an effective strategy analyzing the regional movement of a biological system during MD simulation. Each cluster analysis was performed from the 100,000 structural conformations gained from the ten MD systems using MMTSB packages [46]. According to the Lindorfflarsen’s protocol [47], root mean square deviation (RMSD) is calculated one by one, to create the RMSD matrix (N × N, N is equal to the number of conformations). Starting from the first conformation, the RMSD between it and the other conformations are observed. A RMSD cut-off value is set as the threshold of cluster and the conformations are fallen into the first cluster if their corresponding RMSD values are less than the threshold value. The cluster operation mentioned above does not stop until all the conformations are fallen into a certain cluster. Finally, the conformation with lowest potential energy is determined as the representative structure in the cluster. In this work, cluster analysis was performed based on the RMSD matrix of the PFV IN-DNA and PFV IN-DNA-DTG systems, respectively, with the RMSD threshold setting as 0.12 nm. 2.7. Conformational analysis of DNA Curves is a widely used nucleic acid conformational analysis program [48], which provides a full analysis of DNA structure, including base pair-axis, intra-base pair, inter-base pair, backbone and groove parameters. The grooves of DNA are the preferential binding sites

Dolutegravir

Dihydro-1H-isoindole derivatives 2-Hydroxyisoquinoline1,3(2H,4H)-diones 2-Pyridinone aminals

Naphthyridine derivatives

A:Ala10 ∼ His374; B:Asp116 ∼ Phe278

A:Leu8 ∼ Gln375; B:Asp116 ∼ Arg299

A:Leu8 ∼ Gln375; B:Asp116 ∼ Arg299

A:Asp9 ∼ Gln375; B:Asp116 ∼ Ile298

3L2U,3L2 V,3L2W

3S3 M,3S3N,3S3O

4BDY,4BDZ,4BE0,4BE1,4BE2

4IKF

4ZTF,4ZTJ

5FRM,5FRN,5FRO

2 [8]

3 [31]

4 [50]

5 [51]

6 [52]

7 [5]

A:Leu8 ∼ Gln375; B:Asp116 ∼ Arg299

Raltegravir, MK0536, PICA, GS9160, MK2048 and its analogs Raltegravir, Elvitegravir A:Leu8 ∼ Gln375; B:Asp116 ∼ Arg299 3OYA,3OYB,3OYC,3OYD,3OYE, 3OYF,3OYG,3OYH,3OYJ3OYL,3OYN 1 [49]

A:Leu8 ∼ Gln375; B:Asp116 ∼ Arg299

Ligands Sequences PDB codes No.

Table 1 Information of integrase crystal structures from PDB obtained with a four-step screening method.

A lot of crystal structures of the different domains of HIV-1 integrase (IN) have already been resolved except the entire structure (including IN, DNA, metal, etc.), which is the essence for structurebased drug design. We firstly accounted for the number of HIV-1 IN protein structures in protein data bank (PDB), and found 248 crystal structures totally from five species, such as HIV-1, murine leukosis virus, mouse mammary tumor virus, prototype foamy virus intasome and feline immunodefiency virus. In order to get a better HIV-1 IN inhibitor screening platform, we applied “four-step” screening method gradually reducing the search range to filter the INSTIs database. The specific screening procedure was: (a) selecting 159 PDB structures with a full-length of CCD from the total 248 crystal data; (b) next gaining 49 structures with the coordination of Mg2+ /Mn2+ ions in the binding pocket; (c) then reducing to 40 structures complexed with virus DNA; (d) finally getting 29 structures complexed with IN inhibitors. After the “four-steps” screening, all the 29 structures belonged to PFV family, and could be divided into seven categories (see Table 1). The crystal structures of PFV IN complexed with three approved drugs (i.e., Raltegravir, Elvitegravir, and Dolutegravir) were included in the first three classes, providing the accurate molecular recognition between PFV IN and its INSTIs. Seen from Table 1, the protein sequences of PFV IN obtained by screening are almost the same, which are composed of chain A (Leu8 ∼ Gln375), chain B (Asp116 ∼ Arg299) and different ligands with high inhibitory activity, such as RLT, MK0536, EVG, DTG, etc. Due to the high structural homology, the structural integrality binding with the newly approved DTG drug, and the high resolution, the crystal structure of 3S3 M was chosen as the IN inhibitor screening platform. Meanwhile, the crystal structures of 3OYA and 3L2U containing RLT and EVG, respectively, both were adopted as a good model to study the interactions between IN and its inhibitors. PFV IN-DNA system could be chosen as the platform of screening IN inhibitors, mainly based on the following consideration: 1) PFV IN, the same as HIV-1 IN, belongs to the homologous retroviral family; 2) The sequence BLAST results showed there was high protein sequence similarity (about 40.8%) between PFV and HIV-1 IN (see Fig. 1). Furthermore, the conserved DDE motif (Asp64, Asp116 and Glu152) in HIV-1 IN is consistent with that (i.e., Asp128, Asp185 and Glu221) in PFV IN. The position and length of their secondary structures are basically identical, which is consistent with the results of Hare and Galilee [8,53]. 3) The superimposition of 3S3 M with 3L3U (i.e., HIV-1 IN CCD crystal structure with high resolution) showed that the most obvious difference was between Ser209 ∼ Arg222 in 3S3 M and Gly140 ∼ Ser153 in 3L3U (see in Fig. 1B), with the loop (3L3U) turning into “␣-helix-loop” [54] (3S3 M), which made these residues closer to the binding pocket and more favorable to bind with viral DNA. Interestingly, the frequently reported drug resistance mutations were mostly located in this fragment (such as Y143R, G140S and N155H, etc.) [5]. However, compared to the conserved Glu152 in HIV-1 IN structure, the corresponding residue Glu221 in PFV IN is closer to the Mg2+ ions from 1.45 nm to 0.20 nm. Therefore, this “␣-helix-loop” structure may be a rational conformation of IN-DNA complex.

Features

3.1. Bioinformatics analyses of PFV and HIV-1 integrase

The ligand exhibits stronger inhibitory effect on both ST and 3 -P reactions. The ligand bears a novel prodrug structure containing a screw ring, and belongs to diketone acid derivatives. The carbonyl oxygen in diketone acid is replaced by a nitrogen atom, still remaining strong inhibitory activity.

3. Results and discussion

99

3L2U has the coordinate of Mg2+ ions with highest resolution, while the rest contain two Mn2+ ions. 3S3 M has a relatively higher resolution, and the other two structures are G217H and N224H mutants with drug resistance. The ligand is a typical diketone acid.

for protein-DNA recognitions. In this work, 200 snapshots were extracted from each trajectories of PFV IN-DNA and IN-DNA-DTG systems every 500 ps in order to calculate the mean DNA conformational parameters. The following parameters were analyzed to describe the viral DNA structural deformability, containing inclination angles, twist angles and groove widths.

3OYA contains the earliest drug raltegravir, and the ligands in other structures are typical inhibitors under research.

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

100

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

Fig. 1. Sequence and structure comparison of PFV IN and HIV-1 IN systems. (A) The protein sequence alignment of PFV IN and HIV-1 IN (3S3 M and 3L3U). Red, blue and gray represent ␣ helix, ␤ sheet and loop respectively. The sequence similarity is 40.8%. (B) Superimposition of the structures of CCD in 3S3 M (light blue) and 3L3U (gray). The deep blue and red show the obvious difference between them. Glu221 and Glu152 both indicate the position of conserved residue, and the distance between their oxygen atom and the Mg2+ are 1.45 nm and 0.20 nm, respectively. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Structural biology information, protein sequence alignment, and structural superimposition, all proved that 3S3 M and 3L3U are similar both in sequence and in structure. Comparatively, the 3S3 M PFV IN structure is more complete, including all the binding information of viral DNA and Mg2+ ions, which may be closer to the physiological state and have great significance for the following molecular recognition study of IN-DNA complexed with inhibitors. Recent years, many groups have tried to explore the structure and functions of HIV-1 IN based on the PFV IN model. Quevedo et al. [55] evaluated four homology HIV-1 IN models from the crystal structure of PFV IN (PDB ID: 3OYA), and proved that the crystal structure of PFV IN was more effective than the other HIV-1 IN homology models in the high throughput screening of IN inhibitors. Miri [54] constructed a complex model containing DNA, Mg2+ , RLT and PFV IN (PDB code: 3L2T) using homology modeling and molecular docking, then MD simulation was performed to optimize the structure, finally elaborated the possible inhibitory mechanism of RLT. In a word, the structure of PFV IN-DNA is a potentially reliable template for HIV-1 INSTIs virtual screening research from the summary analysis of structural biology information.

3.2. Virtual screening In this work, the database comprising 432 IN inhibitors collected from others’ experiments was established, in which IC50 values range from 0.59 to 1.11 ␮M. Another free database containing 2737 INSTIs was downloaded from ChEMBL data base. Then, these two databases were merged into a new database by deleting the molecules with duplicate or incomplete information. This new database (containing 1673 molecules) was defined as the active compounds database used for INSTIs virtual screening. The free NCI database containing a huge number of drugs with structure and property information is often used for drug design. To further validate the dependability of the PFV IN-DNA platform, we filtered the newest NCI database (265,242 molecules) combined with our active compounds database (1673 molecules), hoping to screen out the active compounds from the 266,915 molecules.

Table 2 Distance restrictions of pharmacophore model presented in our work.

Distance (nm)

HP1-HP2

HP1-HA1

HA1-HA3

HP2-HA3

HA1-HA2

0.7004

0.5580

0.5506

0.3656

0.2719

Virtual screening was divided into two stages: Firstly, the virtual screening based on pharmacophore was applied. The model in this work was established based on the structure of DTG because of the novel structure and relatively high inhibitory activity (IC50 = 1.7 nM). Fig. 2 shows the 6-point pharmacophore model of DTG with Discovery Studio 2.5 package (Accelrys, San Diego, California, USA). In this model, the hydrogen bond donor (HD) and hydrogen bond acceptor (HA) were located in the carbonyl and hydroxyl moieties interacting with Asp128, Asp185, Glu221 and two Mg2+ ions, while the hydrophobic (HP) feature was situated in the dihydropyridine and fluorine phenyl moieties recruited by adenine (A-1) and cytosine (C-2) in viral DNA. In addition, Table 2 lists the distance restrictions in the pharmacophore model. In order to effectively screen the database, the 4-point pharmacophore match (i.e., HA1-HA2-HP1-HP2, HA1-HA3-HD-HP1, HA1-HA2-HD-HP2) is indispensable. In the second stage of virtual screening, molecular docking was carried out between the energy-optimized crystal structure of PFV IN-DNA (3S3 M) and all the preliminary screening ligand molecules (i.e., after pharmacophore screening) using the Libdock module in the Discovery studio 2.5 package (Accelrys, San Diego, California, USA). Finally, the molecules were ranking in order of the composite docking score based on the second stage above. The relationship between the number of hit compounds and that of screened compounds is depicted in Fig. 3. Apparently, the 3.8% (top 10,100 compounds) of the database screened includes all the active compounds, and the screen efficiency value has reached 26.43. Here, the definition of screen efficiency value is a ratio of the number of top compounds containing all active ones divided by total number of compounds. The higher the screen efficiency value is, the higher the screening efficiency is[56]. Therefore, the screening result shows that the PFV IN-DNA model can effectively distinguish the active IN inhibitors from inactive compounds,

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

101

Fig. 2. Two-dimensional pharmacophore model of dolutegravir in PFV IN structure. The red, green and blue rings indicate the HD, HA and HP structure features, respectively. The deep blue lines represent the interacting residues/bases, while the gray corresponds to the HIV-1 IN structure. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 3. Correlation between the number of hit compounds identified and the number of compounds screened in the modified NCI database containing 266,915 compounds.

which demonstrates the PFV IN-DNA platform can be used to evaluate and select the HIV-1 INSTIs. 3.3. Analysis of binding pocket The main feature of MD-based druggability simulations is that solvent and coupling effect between protein and water/probe molecules are considered. In this work, the druggability program[34] was used to find the druggable pocket of PFV IN surface. After adding the water and probe molecules, the PFV IN protein was in a cube solvent box (shown in Fig. S1A), and a 40 ns MD-based druggability simulation provided a reference state for estimating the binding free energy of every probe molecule. The calculated interaction spots were shown in Fig. 1SB, each spot represents a probe molecule, and the color of spots means the level

of binding free energy. Specifically, the blue, white and red balls represent the binding energy gradually decreases. Thus, it was obvious that the druggable pocket (shown in Fig. 4A) where the five probe molecules located was the optimal binding pocket in the surface of PFV IN. The calculated druggable site is the same with that in Li’s work [57]. It is just the region where the DDE motif located. In this calculated druggable site, the precise binding modes of the INSTIs drugs will be discussed below. The calculated druggability binding pocket is exactly where RLT, EVG and DTG located, which is consistent with PFV IN crystal structures of 3OYA, 3L2U, 3S3 M respectively (see Fig. 4). The common features of these three structures are that the two carbonyl oxygen atoms in the DKAs chelated to two Mg2+ ions, and the fluorophenyl and the pyrimidine ring of C-2 constructed a ␲-␲ stacking interaction with the interplanar distance of 0.34 ∼ 0.38 nm, which was also mentioned in the pharmacophore model. The subtle differences of binding information among the three drugs are that: 1) The methyl diazole ring in the RLT made another ␲-␲ stacking interaction with Tyr212 residue with a distance of 0.36 nm, which might be the main reason why the inhibitory activity of RLT to IN Y143R mutant decrease more than 40 folds compared with the wild type enzyme [5]. Relatively, the ␲-␲ stacking interaction was disappeared in both EVG and DTG inhibitors because of the lack of the aromatic rings on their corresponding part. It was proposed Tyr212 could be “a barrier” to restrain the movement of ligands, so the inhibitory activity of EVG and DTG only decreased 1.2 and 2.7 fold to the Y143R mutant. 2) The aromatic ring in DTG formed a strong ␲-␲ stacking interaction with the A-1 base in viral DNA with a distance of 0.39 nm. Obviously, this stacking restrained the movement of viral DNA and further weakened its 3 -P and ST reactions. This interaction might be the reason why DTG has the highest inhibitory activity to IN. From Fig. 4B ∼ D, there existed a deep cavity between Tyr212 and Pro214. The different fragment of marketed drugs RLT/EVG/DTG all located above the cavity, more or less interacting with Tyr212 and Pro214. It is proposed that the affinity between IN-DNA and DKA compounds might rise if a long aliphatic chain bearing stronger interactions

102

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

Fig. 4. Binding pocket and binding mode of PFV IN with three marketed drugs. (A) The binding site information of in the surface of PFV IN calculated by the probe molecules using druggability. The binding modes of RLT (B), EVG (C) and DTG (D) in the co-crystal structures (3OYA, 3L2U, 3S3 M). The digits represent the important interatomic distances (unit: Å).

with these two residues was added to DKAs. This hypothesis will be discussed in the following molecular docking section. In order to get detailed binding information of the three drugs approved with PFV IN, the contact residues were analyzed. Here, the contact residues are defined as the residues/bases with the distance less than 0.4 nm to the ligand, which are significant for the recognition between ligand and receptor [58]. The contact residues/bases are shown in Table S1. Apparently, Asp128, Asp185, Y212, Pro214, Gln215, Glu221 and A-1, C-2, G-15 all are the common residues/bases in the three crystal structures. From the interactions with these residues/bases, it is found the RLT/EVG/DTG compounds can be divided into two parts, the hydrophilic part (corresponding to HA1, HA2, HA3 and HD in Fig. 2) and the hydrophobic part (corresponding to HP1 and HP2 in Fig. 2). The conserved residues Asp128, Asp185, Glu221 and Mg2+ form hydrogen and coordinate bonds with the hydrophilic part of the drugs, while the

Tyr212, Pro214, Gln215 and the bases constructed van der Waals and ␲-␲ interactions with the hydrophobic part. Furthermore, the molecular van der Waals surfaces of RLT/EVG/DTG were shown in Fig. S2, which were consistent with the analysis of the crystal structures.

3.4. Molecular docking Recently, Zhao et al. [5] designed and synthetized a series of naphthyridine derivatives (NRDs) with noncytotoxic and nanomolar IC50 against HIV-1 variants harboring all of the major drug-resistant mutations. In the structure of NRDs, a carbonyl in diketone acid is substituted by a pyridine nitrogen using the strategy of bioisosteres. Due to the pyridine nitrogen having more electrophobic effect than the former carbonyl group, the consequence of this substitution is that NRDs still keep the interactions

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109 Table 3 The molecular structures and ST reaction IC50 values of NRDs and DTG.

103

results showed a good correlation between the predicted complex structure and the inhibit activity, and the corresponding correlation coefficient value was 0.84 (see Fig. S3). To further explore the relationship between calculated energy and experimented activity, the correlation between them will be discussed in next part using MD simulation sampling. 3.5. Molecular dynamics analysis

Cmpd #

R

ST-IC50 (nM)

1 2 3 4 5 6 DTG

H (CH2 )3 OH (CH2 )5 OH (CH2 )6 OH (CH2 )8 OH (CH2 )2 SO2 Ph

19.0 ± 2.0 11.3 ± 2.7 2.7 ± 0.4 0.6 ± 0.1 5.4 ± 0.8 3.7 ± 1.3 1.7

with the metal ions and key residues. Interestingly, the long chain substitution in 6-position may be suitable to the deep cavity mentioned above, which will improve the cohesive inhibition strength (see Table 3). The NRDs possess strong inhibitory activity similar to the newly approved DTG, however, the detailed binding modes with IN remain unknown. In order to further test the screening ability of PFV IN-DNA platform, the binding modes of NRDs with PFV IN-DNA were predicted with molecular docking method, in which the druggability pocket calculated above was defined as the docking box center. The binding modes of PFV IN-DNA with six NRDs compounds are almost the same (see Fig. 5) with the contact residues/bases consisting of Asp128, Asp185, Tyr212, Pro214, Gln215, Glu221, A-1, C-2 and G-15. The common group of these binding patterns are shown below: 1) The aromatic ring and fluorophenyl in Cmpd # 1 ∼ 6 constructed steady ␲-␲ interactions with purine and pyrimidine in A-1 and C-2 respectively with a low distance of 0.365 ∼ 0.385 nm. 2) In addition, the fluoride atom formed a stable hydrogen bond (0.266 ∼ 0.346 nm) with the NH group in residue Gln215 that made the fluorophenyl bind in the cavity. However, their major difference lied in the depth of the 6-substitutent getting into the hole. Specifically, the deeper and stronger the 6substitutent binding in the hole, the higher inhibition they will be. For example, Cmpd # 2 (Fig. 5B) with a trimethylene alcohol substitute exhibited better inhibition than Cmpd # 1 (Fig. 5A) with no substitutent. Especially, the hexamethylene alcohol group penetrated into the bottom of the cavity so that the terminal hydroxyl held hydrogen bonds with the oxygen atom in main chain carbonyl of Thr210 or with the nitrogen atom in main chain imine of Tyr212 with the distances of 0.210 nm and 0.289 nm (shown in Fig. 5D), respectively, favoring the molecular recognition between PFV IN and viral DNA. Comparatively, the pentamethylene alcohol group in Cmpd # 3 cannot implant into the tunnel. On the contrary, the octamethylene alcohol group in Cmpd # 5 has to crook back to shape the groove. Therefore, the hexamethylene alcohol is the best suitable length, so the Cmpd # 4 exhibited the highest inhibition (ST IC50 = 0.59 nM). As conservative interactions in INSTIs, the coordinated bonds formed by carbonyl and nitrogen in pyridine with Asp128, Asp185, Glu221 and Mg2+ ions still existed. In conclusion, the 6-substitutent can not only increase the inhibit activity, but also partly avoid the drug resistance from Tyr212 mutation [5] (the inhibitory ability of NRDs compounds only reduced 2.3 folds in Y143R mutation, while the 162 folds of RLT), which showed that NRDs are promising second generation INSTIs. Generally speaking, the substitution of pyridine nitrogen with highly resonance donating effect for the oxygen in carbonyl, as well as a long aliphatic chain bounded in the cavity constructed by Pro214, Tyr212, Thr210 and Asp185, made the NRDs perform better inhibition than traditional DKAs. Meanwhile, the docking

Ten 100 ns MD simulations were performed for six docked complex models containing PFV IN-DNA-Cmpd # 1 ∼ 6, three PFV IN crystal structures complexed with RLT, EVG and DTG (3OYA, 3L2U and 3S3 M respectively), and the PFV IN-DNA monomer systems. The RMSD distribution of the ten systems versus simulation time are shown in Fig. S4. As a whole, all the RMSD values of the nine systems binding with inhibitors (see Fig. S4 B ∼ J) are consistently below 0.152 nm with nearly 10% fluctuation amplitude, while the monomer system (i.e., PFV IN-DNA) rises up to 0.181 nm with about 20% fluctuation amplitude. The results show the motion range of the PFV IN-DNA system will obviously decrease after the association with INSTIs. The previous research [58] also proposed that violent movement of PFV IN might be indispensable to exert its catalytic function. Table S2 lists the calculated binding free energy (Gcal ) and its main energy items. Fig. 6A shows the correlation between the Gcal values of nine complex systems and the corresponding experimental pIC50 data. There is a significant correlation with a coefficient of 0.89 between Gbind and pIC50 , suggesting PFV IN-DNA platform is able to effectively predict the bioactivity of INSTIs. From Table S2, the hydrophobic energy (VDW+PBSUR) have a lower negative value than electrostatic energy (ELE+PBELE). Meanwhile, the correlation between the hydrophobic energy contribution and the corresponding pIC50 runs up to 0.96 (P < 0.05, see Fig. 6B), which is higher than that of Gcal . Therefore, it is proposed the hydrophobic energy may form the main driving force to the recognition of PFV IN-DNA with INSTIs. Good correlation of the experimental pIC50 values with calculated binding free energy Gcal not only verifies the reliability of all MD simulations but also proves the effectiveness of PFV IN-DNA system as a new HIV-1 integrase inhibitor screening platform. 3.6. Conformational changes of PFV IN-DNA Free energy landscape (FEL) is a probability distribution function of free energy obtained by principal component analysis (PCA) based on the MD trajectory. In order to study the conformational changes of the PFV IN-DNA system, the conformation distribution of the PFV IN-DNA-DTG (DTG-bound) and PFV IN-DNA (DTG-free) systems along the PCs was investigated. Fig. 7A and C displays the free energy contour maps of the two systems at 300 K, with deeper color indicating lower energy. In the DTG-free system, the local minima approximately was distributed in the left of the FEL (see Fig. 7A), which mainly corresponded to the conformations from 60 ∼ 75 ns (see Fig. S5A). Relative to the DTG-free system, there are more local minima in the DTG-bound system locating in the lower left of the FEL, which includes the conformations from 15 ∼ 75 ns (see Fig. S5B). Compared with the DTG-free system, the DTG-bound system is more stable for the more low energy conformations. Besides, the conformational transition space diminishes obviously after DTG binding to PFV IN-DNA, and the PC1 decreases from −1.153 ∼ 1.774 to −1.107 ∼ 1.401 interval, and PC2 from −0.985 ∼ 1.258 to −1.016 ∼ 1.078 interval, respectively. It suggests that the motion amplitude decreases markedly and the DTG-bound system becomes more sTable Specifically, there are two low-energy regions both in the DTG-free and DTG-bound systems during the MD simulations. However, the dark areas in DTG-bound

104

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

Fig. 5. The docking results of the six NRDs with PFV IN-DNA systems. The black and green show the name of amino acids and compounds. The purple and blue digits represent the distance of ␲-␲ stacking and hydrogen bonds (angstrom). (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

system are bigger than the other one, which indicates that more steady conformations in DTG-bound system. Consequently, the association with DTG may cause the decrease of motion amplitude in PFV IN-DNA system, which may be one of the inhibitory mechanism of DTG. To more accurately evaluate the conformational distribution of the local minima versus molecular time, cluster analyses of the DTG-bound and DTG-free systems were performed with a RMSD threshold of 0.12 nm. Fig. 7B and D shows the cluster numbers of two systems are both 2. Each cluster represents an independent conformational state, which is consistent with the results in FEL.

3.7. Water-mediated chemical-bonds As we all know, a protein must be in the water environment when executing its biological function in vivo, thus water molecules generally contribute to the stability of the protein spatial structure and conformation. In most cases, water molecules are involved in the formation of intermolecular hydrogen bonds, electrostatic and van der Waals interactions [59]. In this work, in order to study the role of water molecules in molecular recognition between PFV IN-DNA and INSTIs, the number of interfacial water molecules were counted in the PFV IN-DNA-DTG and PFV IN-DNA systems. Here, the water molecules

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

105

Fig. 6. Correlation of calculated binding free energy values with its corresponding experimental pIC50 . (A) The correlation between binding free energy calculated by MM/PBSA method and experimental pIC50 data. (B) The correlation between pIC50 and hydrophobic energy contribution.

Fig. 7. The free energy landscape of DTG-free (A) and DTG-bound (C), as well as the corresponding conformation cluster analysis (B and D).

106

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

Fig. 8. Water-mediated chemical-bonds in the PFV IN-DNA-DTG and PFV IN-DNA systems. (A) The number of waters around the protein versus simulation time; (B) The coordinate bonds (purple) and hydrogen bonds (blue) involving waters molecules. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

within 0.4 nm from all residues of the protein are defined as interfacial water. Fig. 8A shows the change of the interfacial water molecules versus molecular time in the two systems, with water molecules sampled every 500 ps. The fluctuation trends of interfacial water molecules in the PFV IN-DNA and PFV IN-DNA-DTG systems are similar and finally converge to 12. Through careful observation, three water molecules, forming coordinate bonds with Mg2+ ions and hydrogen bonds with the residues (i.e., Asp128, Tyr129, Asp185 and Glu221) around, stably exist in the active site, aiding the Mg2+ ions effectively chelated by the hydrophilic part of DTG. The distances between the three water molecules and the Mg2+ ions stay below 0.211 nm within 5% of fluctuation amplitude (see in Fig. S6). The distances between the oxygen atoms of water molecules and that of DTG, the angle between two water molecules and one Mg2+ ion as well, all are also shown in Fig. S6. Therefore, tight hydrogen bonds and coordinate bonds network constructed by three water molecules with two Mg2+ ions, four key residues play an important role in the molecular recognition and catalytic activity of IN protein. 3.8. Functional conformation changes of viral DNA Viral DNA is the substrate of 3 -P and ST reactions in HIV-1 life cycle. Therefore, the elucidation of the conformational change of viral DNA, will greatly help understanding the integration mechanism of HIV-1 IN. Fig. 9 shows the binding mode between viral DNA and PFV IN. The hydrogen bonds and electrostatic interactions formed by Thr210, Ser216, Gly218, Arg222, Arg229, Ser258 and DNA are the main driving force to the recognition of virus DNA with PFV IN (detail data not shown). In vivo, the motion freedom of viral DNA and its specific interactions with HIV-1 IN both are important for the integration reaction. As for DTG molecule, it was restricted between the A-1 and C-2 bases and specially blocked the essential necessary movement of viral DNA, and it was a widely accepted inhibitory mechanism [58]. The major and minor grooves of DNA are generally important potential binding sites with protein, and the protein-DNA interactions will be determined through focusing on the width change of grooves over time. In the present work, Curves package was applied to analyze the change of viral DNA groove parameters in the PFV IN-DNA and PFV IN-DNA-DTG systems. Fig. 10A displays the minor groove width change in the two systems. The minor groove width of the PFV IN-DNA system is 0.994 ± 0.090 nm with a 9.03% fluc-

tuation, while the 1.012 ± 0.047 nm with a 4.6% fluctuation in PFV IN-DNA-DTG. This means that the minor groove width of viral DNA in the former system show more remarkable fluctuation than the latter. With DTG only binding to DNA minor groove, the width of major groove exhibits no obvious change (data not shown). Meanwhile, the fluctuation can be revealed from other DNA parameters. For example, the twist angle between base pairs 1 and 2 in both PFV IN-DNA and PFV IN-DNA-DTG systems are 51.578 ± 9.950◦ and 79.999 ± 32.382◦ , respectively (see Fig. 10B). Moreover, the inclination distributions are −65.618 ± 9.060◦ and −69.503 ± 33.672◦ (see Fig. S7). Reducing the degree of deformation and movement freedom of viral DNA may also be a possible inhibitory mechanism of the INSTIs (see Fig. 10 and S6). 4. Conclusions Due to the lack of the complete HIV-1 IN three dimensional structures including full-length enzyme, its substrate viral DNA, catalytic Mg2+ ions and INSTIs, etc, the drug development based on HIV-1 IN has been greatly hindered. In this work, the crystal structure of PFV IN-DNA was proposed as a potential HIV-1 IN inhibitor screening platform through structural biology information survey. The 40.8% sequence similarity between HIV-1 IN and PFV IN, 26.43 of the virtual screening efficiency and the good correlation between calculated binding free energy and experimental pIC50 values, all results confirmed this viewpoint. Ten comparative 100 ns MD simulations were performed for the PFV IN-DNA-NRDs1 ∼ 6, PFV IN-DNA-RLT/EVG/DTG and PFV IN-DNA systems, respectively. The specific binding mode of PFV IN, DNA with INSTIs were displayed. DTG, the second generation INSTIs, can be divided into hydrophilic and hydrophobic parts respectively. The Mg2+ ions, three water molecules and conserved residues interacted with the hydrophilic part, while the bases in viral DNA and residues like Tyr212, Pro214 interacted with the latter. Furthermore, the calculated binding free energy values are in good agreement with the experimental data. In the aspect of conformational change analysis for PFV INDNA system, both FEL based on principal component analyses and cluster analyses were carried out. The results show that the motion amplitude will decrease and the system contains more stable conformations in the presence of inhibitors. Further, functional conformation change of DNA containing minor groove width, twist and inclination all prove that PFV IN-DNA-DTG is more stable than

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

107

Fig. 9. The binding mode between PFV IN and viral DNA. (A) The three dimensional structure of the binding mode. Red sticks are used to describe the amino acid interacting with DNA. (B) The sketch map of contact residues. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 10. Change of the parameters of base pairs in virus DNA, after DTG combined to the PFV IN-DNA system. (A) The minor groove width of base pair 3 versus simulation time; (B) The twist between base pair 1 and 2 versus simulation time.

the PFV IN-DNA system. It may be one of the potential inhibitory mechanisms of the INSTIs decreasing the motion amplitude of the PFV IN-DNA system and limiting the space requirement for its biological function. In conclusion, the PFV IN-DNA platform is a rational, reliable and effective HIV-1 IN inhibitor screening platform. Subsequently, this platform can serve as an alternative effective template for antiAIDS drug screening and provides theoretical advice to the HIV-1 IN inhibitor drug design.

Science and Technology Bureau Foundation of Sichuan Province (2015JY0117), the Key Project of Sichuan Provincial Education Bureau (17ZA0194), the Project of Leshan Science and Technology Administration (14GZD022), The Project of Leshan Normal University (Z16019), The Projector of Sichuan Provincial Key Laboratory of nature product chemistry and small molecule catalysis (TRCWYXFZCH2016015), Education Department of Sichuan Province (14CZ0024) and Scientific Research Fund of Leshan Normal University (Z1305).

Acknowledgments

Appendix A. Supplementary data

This work was supported by the funding from the National Natural Science Foundation of China (31600591, 11247018), the

Supplementary data associated with this article can be found, in the online version, at https://doi.org/10.1016/j.jmgm.2017.10.002.

108

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109

References [1] D. Akoadjei, W. Fu, C. Wallin, K.S. Katz, G. Song, D. Darji, J.R. Brister, R.G. Ptak, K.D. Pruitt, HIV-1, human interaction database: current status and new features, Nucleic Acids Res. 43 (2014) 566–570. [2] R.A. Weiss, How does HIV cause AIDS? Science 260 (1993) 1273–1279. [3] H.B. Wang, Q.H. Mo, Z. Yang, HIV vaccine research: the challenge and the way forward, J. Immunol. Res. 2015 (2015) 1–5. [4] J.A. Smith, R. Daniel, Human vaginal fluid contains exosomes that have an inhibitory effect on an early step of the HIV-1 life cycle, AIDS 30 (2016) 1. [5] X.Z. Zhao, S. Smith, D. Maskell, M. Metifiot, V. Pye, K. Fesen, C. Marchand, Y. Pommier, P. Cherepanov, S. Hughes, HIV-1 integrase strand transfer inhibitors that reduce susceptibility to drug resistant mutant integrases, Acs. Chem. Biol. 11 (2016) 1074–1081. [6] B.C. Johnson, M. Métifiot, Y. Pommier, S.H. Hughes, Molecular dynamics approaches estimate the binding energy of HIV-1 integrase inhibitors and correlate with In vitro activity, Antimicrob. Agents Ch. 56 (2012) 411–419. [7] M. Cai, M. Caffrey, G.M. Clore, A.M. Gronenborn, H. Ying, R. Zheng, R. Craigie, Solution structure of the His 12 Cys mutant of the N-terminal zinc binding domain of HIV-1 integrase complexed to cadmium, Protein Sci. 7 (1998) 2669–2674. [8] S. Hare, S.S. Gupta, E. Valkov, A. Engelman, P. Cherepanov, Retroviral intasome assembly and inhibition of DNA strand transfer, Nature 464 (2010) 232–236. [9] E.Z. Loizidou, Computational inhibition studies of the human proteasome by argyrin-based analogues with subunit specificity, Chem. Biol. Drug Des. 84 (2014) 99–107. [10] J. Cocohoba, B.J. Dong, Raltegravir: the first HIV integrase inhibitor, Clin. Ther. 30 (2008) 1747–1765. [11] M. Nowotny, S.A. Gaidamakov, R.J. Crouch, W. Yang, Crystal structures of RNase H bound to an RNA/DNA hybrid: substrate specificity and metal-dependent catalysis, Cell 12 (2005) 1005–1016. [12] D.R. Davies, I.Y. Goryshin, W.S. Reznikoff, I. Rayment, Three-dimensional structure of the Tn5 synaptic complex transposition intermediate, Science 289 (2000) 77–85. [13] F. Dyda, A.B. Hickman, T.M. Jenkins, A. Engelman, R. Craigie, D.R. Davies, Crystal structure of the catalytic domain of HIV-1 integrase: similarity to other polynucleotidyl transferases, Science 266 (1994) 1981–1986. [14] D.C. Swinney, J. Anthony, How were new medicines discovered? Nat. Rev. Drug Discov. 10 (2011) 507–519. [15] V. Summa, A. Petrocchi, F. Bonelli, B. Crescenzi, M. Donghi, M. Ferrara, F. Fiore, C. Gardelli, O.G. Paz, D.J. Hazuda, Discovery of raltegravir, a potent selective orally bioavailable HIV-Integrase inhibitor for the treatment of HIV-AIDS infection, J. Med. Chem. 51 (2008) 5843–5855. [16] C.D. Madsen, J. Eugenolsen, O. Kirk, J. Parner, C.J. Kaae, M.S. Brasholt, N.J. Ole, K. Krogsgaard, TTV viral load as a marker for immune reconstitution after initiation of HAART in HIV-infected patients, HIV Clin. Trials 3 (2002) 287–295. [17] S. Pasquini, C. Mugnaini, C. Tintori, M. Botta, A. Trejos, R.K. Arvela, M. Larhed, M. Witvrouw, M. Michiels, F. Christ, Investigations on the 4-quinolone-3-carboxylic acid motif. 1. Synthesis and structure-activity relationship of a class of human immunodeficiency virus type 1 integrase inhibitors, J. Med. Chem. 51 (2008) 5125–5129. [18] B.A. Johns, T. Kawasuji, J.G. Weatherhead, T. Taishi, D.P. Temelkoff, H. Yoshida, T. Akiyama, Y. Taoda, H. Murai, R. Kiyama, Carbamoyl pyridone HIV-1 integrase inhibitors 3. A diastereomeric approach to chiral nonracemic tricyclic ring systems and the discovery of dolutegravir (S/GSK1349572) and (S/GSK1265744), J. Med. Chem. 56 (2013) 5901–5916. [19] M.L. Barreca, S. Ferro, A. Rao, L.D. Luca, M. Zappalà, A. Monforte, Z. Debyser, M. Witvrouw, A. Chimirri, Pharmacophore-based design of HIV-1 integrase strand-transfer inhibitors, J. Med. Chem. 48 (2005) 7084–7088. [20] Y. Wang, J. Rong, B. Zhang, L. Hu, X. Wang, C. Zeng, Design and synthesis of N-methylpyrimidone derivatives as HIV-1 integrase inhibitors, Bioorg. Med. Chem. 23 (2015) 735–741. [21] C.G. Cuzzucoli, L. Pescatori, A. Messore, V.N. Madia, G. Pupo, F. Saccoliti, L. Scipione, S. Tortorella, F. Esposito, A. Corona, Structure-activity relationship of pyrrolyl diketo acid derivatives as dual inhibitors of HIV-1 integrase and reverse transcriptase ribonuclease H domain, J. Med. Chem. 58 (2015) 1915–1928. [22] M. Sato, T. Motomura, H. Aramaki, T. Matsuda, M. Yamashita, Y. Ito, H. Kawakami, Y. Matsuzaki, W. Watanabe, K. Yamataka, Novel HIV-1 integrase inhibitors derived from quinolone antibiotics, J. Med. Chem. 49 (2007) 1506–1508. [23] L. Feng, R.C. Larue, A. Slaughter, J.J. Kessl, M. Kvaratskhelia, HIV-1 integrase multimerization as a therapeutic target, Curr. Top. Microbiol. Immunol. 389 (2015) 93–119. [24] J. Liang, T. Mesplède, M. Oliveira, K. Anstett, M.A. Wainberg, The combination of the R263K and T66I resistance substitutions in HIV-1 integrase is incompatible with high-level viral replication and the development of high-level drug resistance, J. Virol. 89 (2011) 11269–11274. [25] M. Oliveira, T. Mesplède, D. Moïsi, R.I. Ibanescu, B. Brenner, M.A. Wainberg, The dolutegravir R263K resistance mutation in HIV-1 integrase is incompatible with the emergence of resistance against raltegravir, AIDS 29 (2015) 2255–2260.

[26] A. Hachiya, H. Ode, M. Matsuda, Y. Kito, U. Shigemi, K. Matsuoka, J. Imamura, Y. Yokomaku, Y. Iwatani, W. Sugiura, Natural polymorphism S119R of HIV-1 integrase enhances primary INSTI resistance, Antiviral Res. 119 (2015) 84–88. [27] E. Valkov, S.S. Gupta, S. Hare, A. Helander, P. Roversi, M. Mcclure, P. Cherepanov, Functional and structural characterization of the integrase from the prototype foamy virus, Nucleic Acids Res. 37 (2008) 243–255. [28] B.C. Johnson, M. Métifiot, A. Ferris, Y. Pommier, S.H. Hughes, A homology model of HIV-1 integrase and analysis of mutations designed to test the model, J. Mol. Biol. 425 (2013) 2133–2146. [29] J.P. Hu, H.Q. He, D.Y. Tang, G.F. Sun, Y.Q. Zhang, J. Fan, S. Chang, Study on the interactions between diketo-acid inhibitors and prototype foamy virus integrase-DNA complex via molecular docking and comparative molecular dynamics simulation methods, J. Biomol. Struct. Dyn. 31 (2012) 734–747. [30] L.L. De, G.S. De, S. Ferro, R. Gitto, F. Christ, Z. Debyser, A. Chimirri, HIV-1 integrase strand-transfer inhibitors: design, synthesis and molecular modeling investigation, Eur. J. Med. Chem. 46 (2011) 756–764. [31] S. Hare, S.J. Smith, M. Métifiot, A. Jaxa-Chamiec, Y. Pommier, S.H. Hughes, P. Cherepanov, Structural and functional analyses of the second-generation integrase strand transfer inhibitor dolutegravir (S/GSK1349572), Mol. Pharmacol. 80 (2011) 565–572. [32] N.M. Cerqueira, D. Gesto, E.F. Oliveira, D. Santos-Martins, N.F. Brás, S.F. Sousa, P.A. Fernandes, M.J. Ramos, Receptor-based virtual screening protocol for drug discovery, Arch. Biochem. Biophys. 582 (2015) 56–67. [33] dwq, SYBYL-X, Version 1.3, Tripos International, St. Louis, MO, 2010. [34] A. Bakan, N. Nevins, A.S. Lakdawala, I. Bahar, Druggability assessment of allosteric proteins by dynamics simulations in the presence of probe molecules, J. Chem. Theory. Comput. 8 (2013) 2435–2447. [35] J. Seco, F.J. Luque, X. Barril, Binding site detection and druggability index from first principles, J. Med. Chem. 52 (2009) 2363–2371. [36] G.M. Morris, D.S. Goodsell, R.S. Halliday, R. Huey, W.E. Hart, R.K. Belew, A.J. Olson, Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function, J. Comput. Chem. 19 (1998) 1639–1662. [37] D.A. Case, R.M. Betz, W. Botello-Smith, D.S. Cerutti, T.E. Cheatham III, T.A. Darden, R.E. Duke, T.J. Giese, H. Gohlke, A.W. Goetz, N. Homeyer, S. Izadi, P. Janowski, J. Kaus, A. Kovalenko, T.S. Lee, S. LeGrand, P. Li, C. Lin, T. Luchko, R. Luo, B. Madej, D. Mermelstein, K.M. Merz, G. Monard, H. Nguyen, H.T. Nguyen, I. Omelyan, A. Onufriev, D.R. Roe, A. Roitberg, C. Sagui, C.L. Simmerling, J. Swails, R.C. Walker, J. Wang, R.M. Wolf, X. Wu, L. Xiao, D.M. York, P.A. Kollman, Amber 2016, University of California, San Francisco, 2016. [38] J. Wang, R.M. Wolf, J.W. Caldwell, P.A. Kollman, D.A. Case, Development and testing of a general amber force field, J. Comput. Chem. 25 (2004) 1157–1174. [39] W.L. Jorgensen, J. Chandrasekhar, J.D. Madura, R.W. Impey, M.L. Klein, Comparision of simple potential functions for simulating liquid water, J. Chem. Phys. 79 (1983) 926. [40] S. Miyamoto, P.A. Kollman, Settle An analytical version of the SHAKE and RATTLE algorithm for rigid water models, J. Comput. Chem. 23 (1992) 952–962. [41] I. Slynko, M. Scharfe, T. Rumpf, J. Eib, E. Metzger, R. Schüle, M. Jung, W. Sippl, Virtual screening of PRK1 inhibitors: ensemble docking, rescoring using binding free energy calculation and QSAR model development, J. Chem. Inf. Model. 54 (2014) 138–150. [42] H. Wan, J.P. Hu, K.S. Li, X.H. Tian, S. Chang, Molecular dynamics simulations of DNA-free and DNA-bound TAL effectors, PLoS One 8 (2013) 1002–1004. [43] G.G. Maisuradze, A. Liwo, H.A. Scheraga, Relation between free energy landscapes of proteins and dynamics, J. Chem. Theory. Comput. 6 (2010) 583–595. [44] B. Hess, C. Kutzner, D.V.D. Spoel, E. Lindahl, GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation, J. Chem. Theory. Comput. 4 (2008) 435–447. [45] G.G. Zakaraya, J. Maisuradze, J. Ulstrup, Theory of inhomogeneous environmental gaussian broadening of resonance Raman excitation profiles for polyatomic molecules in solution, J. Raman Spectrosc. 20 (1989) 359–365. [46] M. Feig, J. Karanicolas, MMTSB Tool Set: enhanced sampling and multiscale modeling methods for applications in structural biology, J. Mol. Graph. Model. 22 (2004) 377–395. [47] K. Lindorff-Larsen, S. Piana, R.O. Dror, D.E. Shaw, How fast-folding proteins fold, Science 334 (2011) 517. [48] R. Lavery, M. Moakher, J.H. Maddocks, D. Petkeviciute, K. Zakrzewska, Conformational analysis of nucleic acids revisited: curves+, Nucleic. Acids Res. 37 (2009) 5917–5929. [49] S. Hare, A.M. Vos, R.F. Clayton, J.W. Thuring, M.D. Cummings, P. Cherepanov, Molecular mechanisms of retroviral integrase inhibition and the evolution of viral resistance, Proc. Natl. Acad. Sci. U. S. A. 107 (2010) 20057–20062. [50] M. Métifiot, K. Maddali, B.C. Johnson, S. Hare, S.J. Smith, X.Z. Zhao, C. Marchand, BT Jr, S.H. Hughes, P. Cherepanov, Activities, crystal structures, and molecular dynamics of dihydro-1H-isoindole derivatives, inhibitors of HIV-1 integrase, Acs. Chem. Biol. 8 (2013) 209–217. [51] B.A. Desimmie, J. Demeulemeester, V. Suchaud, O. Taltynov, M. Billamboz, C. Lion, F. Bailly, S.V. Strelkov, Z. Debyser, P. Cotelle, 2-hydroxyisoquinoline-1,3(2H,4H)-diones (HIDs), novel inhibitors of HIV integrase with a high barrier to resistance, ACS Chem. Biol. 8 (2013) 1187–1194. [52] I.T. Raheem, A.M. Walji, D. Klein, J.M. Sanders, D.A. Powell, P. Abeywickrema, G. Barbe, A. Bennet, K. Childers, M. Christensen, Correction to discovery of 2-pyridinone aminals: a prodrug strategy to advance a second generation of

W. Du et al. / Journal of Molecular Graphics and Modelling 78 (2017) 96–109 HIV-1 integrase strand transfer inhibitors, J. Med. Chem. 58 (2015) 8154–8165. [53] M. Galilee, E. Britan-Rosich, S.L. Griner, S. Uysal, V. Baumgärtel, D.C. Lamb, A.A. Kossiakoff, M. Kotler, R.M. Stroud, A. Marx, The preserved HTH-Docking cleft of HIV-1 integrase is functionally critical, Structure 24 (2016) 1936–1946. [54] L. Miri, G. Bouvier, A. Kettani, L. Wakrim, M. Nilges, T.E. Malliavin, Stabilization of the integrase-DNA complex by Mg2+ ions and prediction of key residues for binding HIV-1 integrase inhibitors, Proteins: Struct. Funct. Bioinf. 82 (2014) 466–478. ˜ [55] M.A. Quevedo, S.R. Ribone, M.C. Brinón, W. Dehaen, Development of a receptor model for efficient in silico screening of HIV-1 integrase inhibitors, J. Mol. Graph. Model. 52 (2014) 82–90.

109

[56] Z. Feng, M.H. Alqarni, P. Yang, Q. Tong, A. Chowdhury, L. Wang, X.Q. Xie, Modeling molecular dynamics simulation, and mutation validation for structure of cannabinoid receptor 2 based on known crystal structures of GPCRs, J. Chem. Inf. Model. 54 (2014) 2483–2499. [57] Y. Li, S. Xuan, Y. Feng, A. Yan, Targeting HIV-1 integrase with strand transfer inhibitors, Drug Discov. Today 20 (2015) 435–449. [58] S. Réty, L. Reaeábková, B. Dubanchet, J. Silhán, P. Legrand, A. LewitBentley, Structural studies of the catalytic core of the primate foamy virus (PFV-1) integrase, Acta. Crystallogr. 66 (2010) 881–886. [59] G.A. Papoian, J. Ulander, M.P. Eastwood, Z. Luthey-Schulten, P.G. Wolynes, Water in protein structure prediction, Proc. Natl. Acad. Sci. U. S. A. 101 (2004) 3352–3357.