Accepted Manuscript

Title: On the mechanism of substrate/non-substrate recognition by P-glycoprotein

Authors: Azat Mukhametov, Oleg A. Raevsky

PII: S1093-3263(16)30456-9
DOI: http://dx.doi.org/doi:10.1016/j.jmgm.2016.12.008
Reference: JMG 6808

To appear in: Journal of Molecular Graphics and Modelling

Received date: 27-6-2016
Revised date: 8-12-2016
Accepted date: 9-12-2016

Please cite this article as: Azat Mukhametov, Oleg A. Raevsky, On the mechanism of substrate/non-substrate recognition by P-glycoprotein, Journal of Molecular Graphics and Modelling http://dx.doi.org/10.1016/j.jmgm.2016.12.008

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

On the mechanism of substrate/non-substrate recognition by P-glycoprotein

Azat Mukhametov a,*, Oleg A. Raevsky a

a Department of Computer-aided Molecular Design, Institute of Physiologically Active Compounds of the Russian Academy of Science (IPAC RAS), Severny pr-d 1, 142432 Chernogolovka, Moscow Region, Russian Federation

E-mails: [email protected] (for A.M), [email protected] (for O.A.R).

* Corresponding author. Tel. +7(496)524–2600

Highlights

Molecular Dynamics simulations and docking of drugs performed for P-gp

Drugs with ER<1 almost do not bind the main binding cavity (MBC) of P-gp

Drugs with 1≤ER≤2 bind both MBC and other binding sites almost equally

Strong substrates with ER>2 preferably bind the MBC

Binding the MBC might be a prerequisite for pumping the compound off the P-gp

Abstract

P-glycoprotein (P-gp, multi-drug resistance protein, MDR1) plays a gatekeeper role, interfering with the delivery of many pharmaceuticals to their target tissues and cells. We performed Molecular Dynamics (MD) simulations to generate fifty side-chain variants of P-gp (PDB IDs: 4Q9H-L), followed by docking of 31 drugs with known efflux ratio (0.6≤ER≤22.7) to the whole protein surface except the ATPase domains and the extracellular part. For each ligand, the complex with the most negative binding energy was then selected. All compounds docked to one of two areas: the main binding cavity at the top of P-gp (12.5% of compounds with ER<1; 44.4% of those with 1≤ER≤2; and 100% of those with ER>2), or the binding sites in the middle of P-gp (87.5% of ER<1; 55.6% of 1≤ER≤2; and 0% of ER>2). Our results show that anti-substrates (ER<1), intermediate compounds (1≤ER≤2) and strong substrates (ER>2) might behave differently in relation to P-gp. According to our calculations, the anti-substrates almost do not bind the main binding cavity (MBC) of P-gp and rather approach the other binding sites on the protein; the substrates preferably bind the MBC; and the intermediate compounds with 1≤ER≤2 bind the MBC and the other binding sites almost equally. The modelling results are in line with the known hypothesis that binding the MBC is a prerequisite for pumping a compound off P-gp.

Keywords: P-glycoprotein, P-gp, MDR1, ABCB1, Docking, Substrates

1. Introduction

P-glycoprotein (P-gp, multi-drug resistance protein, MDR1) plays a “gate-keeper” role, preventing many drugs from reaching their pharmaceutically active concentration at their respective targets in the human body [1]. This specifically affects the oral availability of most drugs and the brain availability of neuropharmaceuticals, and creates a basis for the multi-drug resistance of cancer cells. P-gp is a 170 kDa membrane protein residing at the interface between the cytoplasm and the periplasm [2]. In the inward-facing conformation it has a turned (inverted) V-like structure (Fig. 1), with the main substrate binding cavity at the internal side of the top and the two ATPase units at the termini [3]. At the junction of the two halves are the portals through which external substances enter the internal part of P-gp [3]. Potentially toxic substances approach the cell, diffuse from the periplasm into the membrane and approach a portal of P-gp. The external ligand may then a) bind the main internal binding cavity at the top of the turned V-like structure of P-gp and subsequently be “pumped” off to the periplasm, or b) simply leave the P-gp structure, either continuing its diffusion into the cytoplasm or returning to the periplasm.

In the process of “pumping” a toxic molecule out of the cell to the periplasm, P-gp changes its conformation from the turned V-like to a normal V-like (outward-facing) conformation, with the two ATPases coupled [4]. After elimination of the potentially toxic molecule, ATP hydrolysis provides the energy to return P-gp to its initial (turned V-like) state with the two ATPases separated [5]. The mechanism of substrate/non-substrate recognition by P-gp is still unknown. Before the structure of P-gp was resolved by X-ray crystallography, multiple, partly contradictory experimental observations were made on the points of interaction between substrates and the P-gp molecule [5]. Elucidation of these interaction points and recent docking studies suggest that most of them are located in the main binding cavity (MBC) of P-gp [6]. Some authors suggest that there are specific sites on P-gp with which interaction is critical for triggering the pumping mechanism. Recently, an interesting modelling observation was made [7]: upon binding of a substrate, the pores at the periplasmic part of P-gp open, and water molecules diffuse into the binding cavity and surround the initially dehydrated substrate molecule; moreover, they surround it preferentially on the side proximal to the cytoplasm, thus creating a force that pumps it off into the periplasm. There have been a few attempts to dock compounds to the whole surface of P-gp [4, 8]; however, no systematic study with a large set of substrates and non-substrates has been performed.

The main purpose of our studies of P-gp was to derive models and regularities which would assist us in designing new neuropharmaceuticals. The critical quality of a neuropharmaceutical is its ability to penetrate the Blood-Brain Barrier, an essential part of which is P-gp. For neuropharmaceuticals there is no benefit in making them inhibitors of P-gp; it is quite enough to make them non-substrates of P-gp.

Thus, we directed our study to a narrow task: to find simple models and regularities capable of distinguishing P-gp non-substrates from substrates with high probability. In our work we performed a computer-aided simulation of the probing of the P-gp surface by prospective ligand molecules. Molecular Dynamics simulations were applied to generate a variety of sidechain conformations for five recently X-ray resolved crystal structures of mouse P-gp, followed by computer-aided docking of 31 drug molecules with a known efflux ratio (ER) to the whole surface of the protein except the ATPase domains and the extracellular part. This simulated probing produced results which shed light on the mechanism of P-gp activity. We describe them below.

2. Material and Methods

2.1 Initial data preparation. Structures of P-gp as the apo-protein and as co-crystallized protein-ligand complexes [6] were retrieved from the PDB.ORG database (PDB IDs: 4Q9H, 4Q9I, 4Q9J, 4Q9K, 4Q9L, with respective resolutions of 3.40, 3.78, 3.60, 3.80, and 3.80 Å). Ligands were removed, and the protein structures were prepared in Protein Preparation Wizard (Schrödinger [9], Academic version). The drug ligands were taken from the work of Broccatelli [10] and rebuilt in CheD [11]. Only ligands with an explicit structure and a non-ionized form at pH 7.0 were retained. Molecules were converted to 3D in the Sybyl package and energy-minimized using the implemented Powell algorithm [12]. In total, 31 ligands were prepared.

2.2 Molecular Dynamics simulation. Molecular Dynamics (MD) simulations were performed to generate a variety of side-chain conformations. The P-gp structures from 4Q9H, 4Q9I, 4Q9J, 4Q9K and 4Q9L were used for the MD simulations. The Amber Tools package [13] was used to prepare the input data. Prior to preparing the protein structures with the XLEAP application of AMBER Tools [13], hydrogens were deleted using Maestro (Schrödinger [9], Academic version). Counter-ions were added to neutralize the system charges. No solvation environment was used. The proteins were prepared using the FF99SB force field. The MD simulations were executed using NAMD [14]. Three steps of minimization (ions move; ions and hydrogens move; ions, hydrogens, and sidechains move) and a heating stage (backbone restrained) preceded the simulation. The protein backbone was fixed during all stages of the MD simulations. The system was heated from 0 to 300 K in 500,000 steps with temperature reassignment, and a Langevin thermostat with a collision frequency of 5.0 ps⁻¹ was used for subsequent temperature maintenance. All MD simulations were performed with a 1.0 fs time step and a cutoff of 12.0 Å for non-bonded interactions. The production MD was performed for 1 ns at 310 K, with the backbone restrained.

2.3 Trajectory Clustering. The trajectories resulting from the 1 ns production MD of the five P-gp variants were used for clustering. For all P-gp variants, the parts excluding the two ATPase domains were used. The hierarchical clustering algorithm in the PTRAJ program of AMBER Tools [13] was applied to the coordinates of the respective residues, with RMSD as the distance metric. As a result, 10 clusters per trajectory were generated, and one representative of each cluster was written in pdb format by the automatic procedure implemented in the cluster command of ptraj.
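The clustering step above can be illustrated with a minimal sketch. It assumes a precomputed pairwise RMSD matrix between trajectory frames, uses average-linkage agglomerative clustering, and takes the medoid frame of each cluster as its representative. These choices and all function names are illustrative assumptions; the cluster command of ptraj has its own linkage options and its own way of choosing representatives, which the text does not specify.

```python
from itertools import combinations


def average_linkage(dist, a, b):
    """Mean pairwise RMSD between the frames of clusters a and b."""
    return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))


def hierarchical_clusters(dist, n_clusters):
    """Merge the two closest clusters until n_clusters remain (assumed: average linkage)."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        x, y = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(dist, clusters[p[0]], clusters[p[1]]),
        )
        clusters[x] = clusters[x] + clusters[y]  # x < y, so deleting y keeps x valid
        del clusters[y]
    return clusters


def medoid(dist, members):
    """Frame with the smallest summed RMSD to the other frames of its cluster."""
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


# Toy stand-in for a trajectory: six "frames" described by one coordinate each,
# so that |a - b| plays the role of the RMSD between two frames.
toy_frames = [0, 1, 2, 50, 51, 90]
toy_rmsd = [[abs(a - b) for b in toy_frames] for a in toy_frames]

toy_clusters = hierarchical_clusters(toy_rmsd, n_clusters=3)
toy_representatives = [medoid(toy_rmsd, c) for c in toy_clusters]
```

In the setting described in the text, the number of clusters would be 10 per trajectory, which gives 50 representatives over the five P-gp structures.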

2.4 Computational docking. The 50 cluster representatives of P-gp were used as targets for docking. Computational docking of 31 small-molecule drugs was performed with AutoDock [15]. Grids were prepared with AutoDock Tools [15] for a large box containing the full structure of P-gp except the two ATPase domains and the extracellular part. For each ligand and each protein target, ten models were generated and the one with the most negative binding energy was saved. Then, for each of the 31 ligands, only the one of the 50 cluster representatives giving the most negative protein-ligand binding energy was retained. Thus, 31 protein-ligand complexes were finally generated and used for the following analyses.

We used the following docking protocol. In the calculation folder, five separate folders were created, one for each P-gp source structure 4Q9H-L, with a subfolder for each of the ten cluster representatives. The calculations were thus run in each of the fifty subfolders, which contained the respective cluster representative, the common library of 31 ligands, and the launching scripts. Following the launching script, the cluster representatives were converted from pdb to pdbqt format with prepare_receptor4.py of MGLTools. The ligands were converted from mol2 to pdbqt format with prepare_ligand4.py of MGLTools. The reference gpf file was prepared for 4Q9H with the following parameters: number of grid points (npts) 120/ 120/ 120; spacing 0.375 Å; receptor atom types (receptor_types) A, C, HD, N, OA, SA; ligand atom types (ligand_types) A, Br, C, Cl, F, HD, N, NA, OA, S, SA; xyz-coordinates (gridcenter) 59.212/ 13.546/ 7.767. The gpf files for the cluster representative proteins were created using prepare_gpf4.py for the set of atom types occurring in the ligands (A, Br, C, Cl, F, HD, N, NA, OA, S, SA), using the reference gpf file prepared for 4Q9H. Then the respective grid files (.glg and .map) were prepared with autogrid4 of the AutoDock software, probing the receptor surface with each ligand atom type in turn. After that, docking of each ligand was performed in two steps: preparation of the dpf file containing all docking settings with prepare_dpf4.py, and docking with autodock4, taking the dpf file as input and writing a dlg file as output. All parameters not mentioned here were left at their defaults for AutoDock 4.2 and MGLTools 1.5.6.

2.5 Structure visualization and analysis. Analyses of the structures of proteins, ligands, and protein-ligand complexes were performed in Maestro (Schrödinger [9], Academic version) and VMD [16].
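As a small worked example for the grid parameters quoted in the docking protocol above, the following sketch computes the extent of the AutoGrid box from npts, spacing and gridcenter. It relies on the AutoGrid convention that npts counts intervals, so each axis carries npts + 1 grid points and spans npts x spacing. The function and variable names are illustrative and are not part of the authors' scripts.

```python
def grid_box(npts, spacing, center):
    """Return per-axis (min, max) bounds and the number of grid points of an AutoGrid box."""
    bounds = []
    for n, c in zip(npts, center):
        half = n * spacing / 2.0  # half of the edge length along this axis, in Å
        bounds.append((c - half, c + half))
    points = [n + 1 for n in npts]  # AutoGrid places npts + 1 points per axis
    return bounds, points


def inside(bounds, xyz):
    """True if a point lies within the box on every axis."""
    return all(lo <= v <= hi for (lo, hi), v in zip(bounds, xyz))


# Values of the reference gpf file prepared for 4Q9H, as given in the text
NPTS = (120, 120, 120)
SPACING = 0.375  # Å
CENTER = (59.212, 13.546, 7.767)

BOUNDS, POINTS = grid_box(NPTS, SPACING, CENTER)
EDGE = NPTS[0] * SPACING  # edge length in Å

if __name__ == "__main__":
    print(f"edge length: {EDGE} Å, grid points per axis: {POINTS[0]}")
    for axis, (lo, hi) in zip("xyz", BOUNDS):
        print(f"{axis}: {lo:.3f} .. {hi:.3f}")
```

With the quoted values each edge of the box is 120 x 0.375 = 45 Å, centred on the gridcenter coordinates.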

3. Results

A set of 31 molecules for docking was prepared in neutral form, based on the assumption [17] that a ligand has to deionize and shed its water shell before entering the lipid bilayer, and remains in this form until being pumped off into the cytoplasm. The ER of the selected molecules ranged over 0.6≤ER≤22.7: 8 molecules with ER<1 (0.6≤ER≤0.9), 18 molecules with 1≤ER≤2 (1.0≤ER≤1.5) and 5 molecules with ER>2 (2.8≤ER≤22.7).

In many protein targets, a ligand probes multiple potential binding sites and binding points until it reaches a position that satisfies the principle of minimum energy. We combined MD simulations with docking to the replicas to simulate such probing. Inspection of the MD trajectories in VMD verified that the simulations completed successfully, with the sidechains moving at T=310 K. Visual analysis showed that the replicas generated by the clustering procedure had a variety of sidechain conformations. Thus, docking to the whole of P-gp, excluding the periplasmic part and the ATPase domains, was performed for the 50 side-chain variants. The drug names and binding characteristics are given in Table 1. The positions of these ligands superimposed on the structure of 4Q9H are shown in Fig. 1.

Generally, we can divide all docking poses into two classes:

a) Main binding cavity (MBC) at the top of P-gp, apparently responsible for the binding of substrates prior to their “pumping off” from P-gp.

b) Other binding sites (OBS), located mainly in the middle part of the P-gp structure, apparently binding the non-substrates.

For the following analyses we divided all compounds into three groups: ER<1 (0.6≤ER≤0.9), 1≤ER≤2 (1.0≤ER≤1.5), and ER>2 (2.8≤ER≤22.7).

ER<1: compounds which are non-substrates, “absorbed” into the cell rather than pumped off from it. Among the 8 molecules in this subgroup (actual range 0.6≤ER≤0.9), 7 compounds (87.5%) docked in the middle part (OBS) and 1 compound (12.5%), with ER=0.9, docked at the top (MBC) of P-gp (Fig 1A). The most frequently interacting amino acids were Phe693 and Tyr994 (each participated in binding 4 compounds of 8); Trp694 participated in binding 2 compounds of 8. The other amino acids made contacts with only one of the eight compounds: Glu778, Lys822, Ala691, Gln721, Phe724, Ser725, Asn838, Ala976 and Phe979.

1≤ER≤2: compounds which are also commonly considered non-substrates but are slightly effluxed from the cell. Among the 18 molecules in this subgroup (actual range 1.0≤ER≤1.5), 8 compounds docked at the top of P-gp in the MBC, and 10 compounds docked in the middle part (OBS). Thus, the 44.4%/55.6% distribution does not discriminate compounds with 1≤ER≤2 on the basis of MBC/OBS binding alone. The most frequently interacting residues were Phe979 (participated in binding 6 compounds of 18); Tyr994, Tyr306, Phe693 (each participated in binding 4 compounds of 18); and Gln721 (3 of 18). Six residues participated in binding 2 compounds of 18 each: Gln721, Asn838, Met874, Gln878, Lys881, Asn926 and Lys929. And 13 residues bound only 1 compound of 18 each: Phe728, Ala824, Phe724, Glu778, Ala691, Phe693, Trp694, Ser725, Phe755, Phe332, Lys230, Lys238 and Lys287.

ER>2: compounds which are substrates, being clearly effluxed from the cell. We had only five compounds with ER>2 (2.8≤ER≤22.7), all of which bound at the main binding cavity (MBC) of P-gp. Thus, for our small set of compounds, we attained 100% recognition of substrates by the MBC of P-gp. Among the residues participating in binding, each of Gln721, Phe724, Ser725 and Phe979 bound 2 compounds of five, and each of nine further residues bound only 1 compound of five: Asn838, Ala976, Gln986, Tyr303, Phe728, Ser975, Met188, Gln343 and Met945.

Thus, from the results of our study, non-substrates with ER<1 (we would call them “anti-substrates”) bind to the OBS of P-gp with 87.5% probability and to the MBC with only 12.5% probability. In contrast, 100% of substrates with ER>2 are recognised by the MBC of P-gp. Compounds with 1≤ER≤2, which are commonly considered non-substrates but in reality show slight substrate properties, bind to the MBC/OBS in a 44.4%/55.6% proportion, respectively. Overall, docking of the 31 compounds against the fifty replicas of P-gp shows a clear difference in binding sites between the groups (Fig 1, Table 1).
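The percentages quoted above follow directly from the per-group counts of compounds docked in the MBC and in the OBS. The short sketch below reproduces that arithmetic and shows the three-way ER grouping used in the text; the function names and the dictionary layout are illustrative assumptions.

```python
def er_class(er):
    """Assign an efflux ratio to one of the three groups used in the text."""
    if er < 1.0:
        return "ER<1"
    if er <= 2.0:
        return "1<=ER<=2"
    return "ER>2"


# (docked in MBC, docked in OBS) for each group, as reported in the Results
COUNTS = {"ER<1": (1, 7), "1<=ER<=2": (8, 10), "ER>2": (5, 0)}


def site_percentages(counts):
    """Percentage of each group docked in the MBC and in the OBS, to one decimal."""
    result = {}
    for group, (mbc, obs) in counts.items():
        total = mbc + obs
        result[group] = (round(100.0 * mbc / total, 1), round(100.0 * obs / total, 1))
    return result


if __name__ == "__main__":
    for group, (mbc_pct, obs_pct) in site_percentages(COUNTS).items():
        print(f"{group}: MBC {mbc_pct}%  OBS {obs_pct}%")
```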

4. Discussion

Historically, P-gp has been considered a key target for new-generation anti-cancer chemotherapeutics. For this reason most efforts have been devoted to finding an inhibitor of this protein, and most studies on P-gp are still carried out by oncology drug hunters hoping to find a marketable inhibitor. However, P-gp is also found in two very important barriers in the human body: the gastrointestinal tract-blood barrier and the Blood-Brain Barrier (BBB). There is no need for common drugs to inhibit P-gp, and such inhibition could even be harmful to human health. Thus, for applications other than cancer, finding non-substrates of P-gp is the only goal in dealing with this protein.

The structure of any protein under natural conditions, including P-gp, is flexible in an aqueous environment; it also has to adapt to the specific size and binding properties of a given ligand. The high flexibility of the P-gp structure in its native environment has been shown experimentally. P-gp easily changes the size of its internal cavity to fit ligands ranging from small to large, as demonstrated in many publications. Superposition of the P-gp structures 4Q9H-L shows that they differ in the distance between the two lobes. Generally, the bigger the ligand co-crystallised with P-gp, the bigger the internal cavity formed to accommodate it, and thus the bigger the distance between the lobes.

A protein as large as P-gp is far from a routine case for a molecular modelling study. Besides the serious challenges of computational power and calculation time, there are structural challenges as well. Even a visual analysis of the surface of P-gp yields observations that set it apart from a typical docking target. Among them: the surface of P-gp is hydrophilic, both inside and outside, except for the “binding belt”, which is immersed in the membrane bilayer; the internal surface of P-gp in the main binding cavity (MBC) is substantially non-specific, formed mainly of residues capable of easily adjusting their aromatic and hydrophobic sidechains to form a “binding cage” around the many molecules currently known to be substrates of P-gp; at the same time, the residues of the internal part of P-gp outside the MBC are of various kinds: aromatic, aliphatic, hydrophilic, charged, etc. Moreover, the internal surface of P-gp, both in the MBC and outside it, is quite shallow and formed of comparatively short residues which are unable to move significantly (<1 Å) relative to each other, both in MD simulations and, apparently, under native conditions. In contrast, the outer surface of P-gp is formed mainly of lengthy sidechains capable of moving significantly (>1 Å) relative to each other, both in MD simulations and, apparently, under native conditions.

There are multiple, extremely complicated mechanisms by which P-gp inhibitors may work, and a number of works have been devoted to studying them. However, we do not consider the mechanisms by which P-gp activity can be halted or interrupted. Our study is devoted to understanding how, in its ordinary activity, P-gp recognises a molecule to pump off. As stated above, P-gp is a highly flexible structure capable of recognising both small and very large molecules by increasing the distance between the two lobes and the size of the internal binding cavity (the main binding cavity, MBC). In fact, P-gp does not have a clear, simple binding pocket; it rather has a so-called “binding belt”. There are also hypotheses on the various roles of different binding points on the structure of P-gp in ligand recognition and in triggering the different stages of the pumping mechanism. Surprisingly, the same ligand has been co-crystallised with P-gp at up to several different sites on its surface.

Using for molecular modelling a simple protein with a known or putative binding pocket co-crystallized with an inhibitor accounts for induced fit only with respect to the fixed protein conformation adjusted to this specific inhibitor. Using this structure for simple rigid-protein docking merely gives a protein-ligand complex in which the ligand is bound to a protein in a conformation ideal for another ligand. That is not a good solution. It is important to consider backbone flexibility at least partly, and sidechain flexibility fully. While a common docking procedure provides the option of rotating sidechains (preferably only 5-10, since common docking software is designed for docking small ligands to a simple protein with a small binding site), it normally cannot simulate backbone flexibility. Thus, a molecular dynamics simulation procedure must be introduced.

We therefore decided to design a specific simulation protocol able to simulate the binding of non-substrates and substrates to P-gp using the limited computational resources available to us. By the time our study started, the five newest P-gp structures, co-crystallized with ligands of different size, had been resolved [6]. Superposition of these protein structures demonstrated a difference in the distance between the two halves of P-gp.
Generally, for bigger ligands there was a longer distance between the two halves of P-gp. Accordingly, we could save time on unrestrained simulation of P-gp by using these five P-gp variants for further studies. Of course, these structures do not represent all possible P-gp conformations in an aqueous environment; however, they are samples of conformations bound to ligands of different size. They may even be better suited to binding various ligands than the snapshots which could be generated by unrestrained MD simulation of apo-P-gp in water.

Researchers normally dock molecules to a transporter (e.g. 4Q9H) in its free form (apo-protein), which is partly justified as long as they consider binding of the ligand only to the main binding cavity. However, according to observations, the size of the main binding cavity in P-gp changes depending on the size of the ligand. The surface of the cavity (a measure of its “roughness”) also changes to fit the ligand. In addition, the tails of the lobes of P-gp move depending on the size of the ligand. Moreover, as has been shown in publications, P-gp undergoes great changes in its structure upon binding of a ligand to its various parts, e.g. the gateway area, the binding belt area, and other areas and sites.

It is known that a protein binding a ligand undergoes both a “big fit”, in which the spatial orientation of the backbone changes, and a “little fit”, in which only the sidechains move, to better bind the ligand. There are two ways to take the “little fit” into account: using the option implemented in docking software (e.g. AutoDock) to systematically rotate specifically selected residues during a docking simulation; or using MD simulation to generate a variety of sidechain conformations. However, considering the extremely large surface of P-gp, the built-in AutoDock option of changing the coordinates of pre-selected sidechains during docking would demand an enormous amount of calculation time, because hundreds of residues would have to be rotated and because this is done systematically. MD simulation generates native-like, spontaneous sidechain orientations, which might be done faster than by the systematic residue-rotation procedure in AutoDock. For that reason we chose the MD simulation approach.

We decided to take the aforementioned five P-gp structures from the PDB.ORG database to account for the “big fit”, and to apply a restrained MD simulation to them to account for the “little fit”. Our aim was to generate spontaneous sidechain orientations in P-gp similar to those which could be obtained in docking software with the “rotate sidechains” option switched on. Thus, we designed a kind of artificial MD procedure, in which simulations were performed for each unliganded P-gp structure, but without water molecules and with the backbone kept rigid during the whole simulation. Trajectory analysis showed that we obtained what we wanted: a spontaneous orientation of sidechains for each P-gp structure.

After the MD simulations, the trajectories were clustered using the procedure in ptraj of AmberTools. In total, 10 clusters were generated for each trajectory, and a representative of each was selected. Superposition of the ten representatives for each P-gp structure demonstrated differences in their spatial coordinates. Thus, using a comparatively fast and simple MD procedure we obtained 50 P-gp structures. We could have used the built-in sidechain rotation procedure in AutoDock for the same purpose; however, it would have demanded significantly more computational resources and time for systematic sidechain rotation.

Then, we developed a specific protocol for docking with AutoDock. AutoDock itself is not optimized for fast virtual screening: e.g., it generates a grid before docking every ligand. However, it is possible to specify all atom types occurring in the molecules, generate a map for each atom type, and then reuse the maps for all docking procedures. That is what we implemented. By default, AutoDock performs 10 docking runs, the results of which it ranks by binding energy. Thus, by selecting the lowest-energy complex for a given ligand across the 50 P-gp variants, we enhance the basic AutoDock procedure by taking into account both sidechain flexibility and backbone flexibility (the latter partly, through the use of 5 initial P-gp structures), and we also improve the statistical robustness of the docking results. In this way, even with the default docking procedure of 10 runs, the protocol can be regarded as 10 x 50 = 500 docking runs per ligand against the P-gp structure in general, which gives high statistical stability of the results.
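The selection logic described above can be sketched as a two-stage minimum: first the best of the docking runs for each P-gp variant, then the best variant for the ligand. The data layout, the function names, the variant labels and the energy values below are hypothetical placeholders, not results from the study.

```python
def best_run_per_variant(runs_by_variant):
    """Lowest (most negative) binding energy among the docking runs of each variant."""
    return {variant: min(runs) for variant, runs in runs_by_variant.items()}


def best_variant(runs_by_variant):
    """Variant whose best run has the lowest energy, together with that energy."""
    best = best_run_per_variant(runs_by_variant)
    variant = min(best, key=best.get)
    return variant, best[variant]


def total_runs(n_variants=50, runs_per_variant=10):
    """Number of docking runs pooled per ligand (50 x 10 = 500 in the text)."""
    return n_variants * runs_per_variant


# Hypothetical energies (kcal/mol) for one ligand against three variants;
# the variant labels are made up for illustration.
example = {
    "4Q9H_c01": [-6.1, -5.8, -7.0],
    "4Q9I_c04": [-8.2, -7.9, -6.5],
    "4Q9L_c10": [-7.4, -7.7, -5.0],
}

if __name__ == "__main__":
    print(best_variant(example))
    print(total_runs())
```

Taking the minimum in two stages gives the same answer as a single minimum over all pooled runs; keeping the stages separate simply mirrors the order in which the text describes the protocol.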
In our study we used the default settings for docking: number of individuals in population (ga_pop_size) 150; maximum number of energy evaluations (ga_num_evals) 2500000; maximum number of generations (ga_num_generations) 27000; number of top individuals to survive to next generation (ga_elitism) 1; rate of gene mutation (ga_mutation_rate) 0.02; rate of crossover (ga_crossover_rate) 0.8; ga_window_size 10; and number of runs (ga_run) 10. We used 50 cluster representatives generated from MD simulations, and all these structures differ only slightly from each other. Since all ligands docked on the internal surface of P-gp (in the MBC and outside it), whose sidechains showed only small fluctuations in the MD simulations, and since the differences between the structures 4Q9H-L are also small, we suggest that the results can be considered an approximation to docking to an “average” P-gp structure. In this case the effective number of runs (ga_run) approximates 50 x 10 = 500.

As a result we obtained a large amount of data from docking 31 ligands against 50 P-gp structures. These data could only be analysed automatically; thus we developed special scripts for extraction of energy data, selection of minimum-energy ligand-protein pairs, and extraction of protein-ligand structures from the docking results.

In our docking we used the whole structure of P-gp except its periplasmic part and the two ATPases. Despite the apparent attractiveness of the external part of P-gp for ligand binding due to the extended “fuzziness” of its surface, and the apparent unattractiveness of the internal part due to the “shallowness” of its surface, no docking occurred on the external part of P-gp: 100% of ligands docked on the internal surface. This observation indicates that the internal surface of P-gp (both in the MBC and outside it) is the only target for ligands, whether substrates or non-substrates. The experimental observation that the same ligand molecule could co-crystallize at several different sites simultaneously in the same crystal structure shows that P-gp does not provide a single specific binding site for any ligand, whether substrate or non-substrate.

Thus, we propose that any docking result against a transport protein such as P-gp is valid, provided it is one of multiple probable binding sites which differ in binding energy and which the ligand probably samples as a potential binding point. According to our current understanding of how P-gp carries out its pumping activity, the ligand moves smoothly across the P-gp structure and its surface, not stopping at any given place for long (in the case of non-substrates) until it binds the MBC at a proper low-energy binding site (in the case of substrates). We think that the results obtained in our study support the chosen methodology.
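The automatic analysis mentioned above (extraction of energy data and selection of minimum-energy ligand-protein pairs) might look like the following sketch. It assumes that each docking run in an AutoDock 4.2 dlg file reports its energy on a line that begins with DOCKED: and contains the phrase "Estimated Free Energy of Binding" followed by an equals sign, a number and the unit kcal/mol. This line format, the function names and the sample text are assumptions made for illustration; the authors' actual scripts are not given in the paper.

```python
import re

# Assumed format of the per-run energy line in a dlg file
ENERGY_LINE = re.compile(
    r"^DOCKED:.*Estimated Free Energy of Binding\s*=\s*([+-]?\d+(?:\.\d+)?)\s*kcal/mol"
)


def extract_energies(dlg_text):
    """Collect the binding energy (kcal/mol) of every docking run found in the text."""
    energies = []
    for line in dlg_text.splitlines():
        match = ENERGY_LINE.match(line)
        if match:
            energies.append(float(match.group(1)))
    return energies


def lowest_energy_pairs(dlg_texts):
    """Map each ligand to the (receptor, energy) pair with the most negative energy.

    dlg_texts maps (ligand, receptor) to the text of the corresponding dlg file.
    """
    best = {}
    for (ligand, receptor), text in dlg_texts.items():
        energies = extract_energies(text)
        if not energies:
            continue  # skip files without a recognisable energy line
        energy = min(energies)
        if ligand not in best or energy < best[ligand][1]:
            best[ligand] = (receptor, energy)
    return best


# Made-up dlg fragments for one hypothetical ligand and two hypothetical receptors
SAMPLE = {
    ("ligand_01", "rec_A"): (
        "DOCKED: USER    Estimated Free Energy of Binding    =   -7.25 kcal/mol\n"
        "DOCKED: USER    Estimated Free Energy of Binding    =   -6.10 kcal/mol\n"
    ),
    ("ligand_01", "rec_B"): (
        "DOCKED: USER    Estimated Free Energy of Binding    =   -8.03 kcal/mol\n"
    ),
}

if __name__ == "__main__":
    print(lowest_energy_pairs(SAMPLE))
```

Restricting the match to lines that start with DOCKED: is a design choice intended to count each run once, on the assumption that summary sections of the file repeat the same phrase without that prefix.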

5. Conclusion

We conclude that there are two main binding areas which differ in their ability to recognise compounds with distinct properties: the main binding cavity (MBC) and the other binding sites (OBS). The MBC is located at the top of P-gp and preferably binds substrates, which are then pumped off into the periplasm; the OBS are located in the middle part of P-gp and possibly bind non-substrates transiently before letting them pass into the cytoplasm. This is supported by the fact that 87.5% of the compounds with ER<1, the non-substrates (“anti-substrates”) that diffuse into the cytoplasm rather than being effluxed into the periplasm, did not dock at the MBC of P-gp but bound the OBS. In contrast, 100% of the compounds with ER>2, which are strong substrates, docked at the MBC of P-gp. Intermediate compounds with 1≤ER≤2, which are commonly considered non-substrates but in reality show slight P-gp substrate properties, bound the MBC and OBS in an almost equal proportion of 44.4%/55.6%. These results may add to the understanding of the mechanisms of non-substrate/substrate recognition by P-gp.

Acknowledgements

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Supplementary Information

PDB files for the protein structures and MOL2 files for the ligands of the 31 final complexes obtained. A 50x31 matrix of the protein-ligand binding energies obtained from the docking procedure.

References

[1] J.H. Lin, M. Yamazaki, Role of P-glycoprotein in pharmacokinetics: clinical implications, Clinical pharmacokinetics, 42 (2003) 59-98.
[2] R.L. Juliano, V. Ling, A surface glycoprotein modulating drug permeability in Chinese hamster ovary cell mutants, Biochimica et biophysica acta, 455 (1976) 152-162.
[3] S.G. Aller, J. Yu, A. Ward, Y. Weng, S. Chittaboina, R. Zhuo, P.M. Harrell, Y.T. Trinh, Q. Zhang, I.L. Urbatsch, G. Chang, Structure of P-glycoprotein reveals a molecular basis for poly-specific drug binding, Science, 323 (2009) 1718-1722.
[4] J.W. McCormick, P.D. Vogel, J.G. Wise, Multiple Drug Transport Pathways through Human P-Glycoprotein, Biochemistry, 54 (2015) 4374-4390.
[5] R. Callaghan, Providing a molecular mechanism for P-glycoprotein; why would I bother?, Biochemical Society transactions, 43 (2015) 995-1002.
[6] P. Szewczyk, H. Tao, A.P. McGrath, M. Villaluz, S.D. Rees, S.C. Lee, R. Doshi, I.L. Urbatsch, Q. Zhang, G. Chang, Snapshots of ligand entry, malleable binding and induced helical movement in P-glycoprotein, Acta crystallographica. Section D, Biological crystallography, 71 (2015) 732-741.
[7] J.C. Jagodinsky, U. Akgun, Characterizing the binding interactions between P-glycoprotein and eight known cardiovascular transport substrates, Pharmacology research & perspectives, 3 (2015) e00114.
[8] M. Zeino, M.E. Saeed, O. Kadioglu, T. Efferth, The ability of molecular docking to unravel the controversy and challenges related to P-glycoprotein--a well-known, yet poorly understood drug transporter, Investigational new drugs, 32 (2014) 618-625.
[9] Schrödinger Suite 2015 (Academic Version), Schrödinger LLC.
[10] F. Broccatelli, QSAR models for P-glycoprotein transport based on a highly consistent data set, Journal of chemical information and modeling, 52 (2012) 2462-2470.
[11] S.V. Trepalin, A.V. Yarkov, CheD: chemical database compilation tool, Internet server, and client for SQL servers, Journal of chemical information and computer sciences, 41 (2001) 100-107.
[12] SYBYL Software, Tripos Associates Inc.
[13] D.A. Case, T.A. Darden, T.E. Cheatham III, C.L. Simmerling, J. Wang, R.E. Duke, R. Luo, R.C. Walker, W. Zhang, K.M. Merz, B. Roberts, B. Wang, S. Hayik, A. Roitberg, G. Seabra, I. Kolossvai, K.F. Wong, F. Paesani, J. Vanicek, J. Liu, X. Wu, S.R. Brozell, T. Steinbrecher, H. Gohlke, Q. Cai, X. Ye, J. Wang, M.-J. Hsieh, G. Cui, D.R. Roe, D.H. Mathews, M.G. Seetin, C. Sagui, V. Babin, T. Luchko, S. Gusarov, A. Kovalenko, P.A. Kollman, AMBER 11, (2010).
[14] J.C. Phillips, R. Braun, W. Wang, J. Gumbart, E. Tajkhorshid, E. Villa, C. Chipot, R.D. Skeel, L. Kale, K. Schulten, Scalable molecular dynamics with NAMD, Journal of computational chemistry, 26 (2005) 1781-1802.
[15] G.M. Morris, R. Huey, W. Lindstrom, M.F. Sanner, R.K. Belew, D.S. Goodsell, A.J. Olson, AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility, Journal of computational chemistry, 30 (2009) 2785-2791.
[16] W. Humphrey, A. Dalke, K. Schulten, VMD: visual molecular dynamics, Journal of molecular graphics, 14 (1996) 33-38, 27-38.
[17] H. Gunaydin, M.M. Weiss, Y. Sun, De novo prediction of p-glycoprotein-mediated efflux liability for druglike compounds, ACS medicinal chemistry letters, 4 (2013) 108-112.

Figure Captions

Fig 1 Results of docking of 31 drugs to P-gp. (left) Compounds with ER<1; (middle) compounds with 1≤ER≤2; (right) compounds with ER>2. All compounds docked to one of two areas: the main binding cavity (MBC) at the top of P-gp (12.5% of compounds with ER<1; 44.4% of 1≤ER≤2; and 100% of ER>2), or the other binding sites (OBS) in the middle of P-gp (87.5% of ER<1; 55.6% of 1≤ER≤2; and 0% of ER>2). The P-gp structures in the left and right panels are rotated slightly to the right to make all ligands visible. This figure is black-and-white in print and online.

Table 1 Complexes* of 31 drugs with the structure of P-gp.

| # | Drug name | ER | Ebind, kcal/mol | Interacting amino acids, 4Q9H/I/J/K/L (source numbers in mouse P-gp) | Interacting TMs, NBD1/NBD2 half | Binding site (MBC/OBS) |
|---|---|---|---|---|---|---|
| 1 | Caffeine | 0.6 | -5.46 | F693 | LinkFrag (NBD2) | OBS |
| 2 | Diazepam | 0.7 | -8 | W694, Y994 | LinkFrag, TM12 (NBD2) | OBS |
| 3 | Indomethacin | 0.7 | -8.68 | E778, K822 | TM8,9 (NBD2) | OBS |
| 4 | Methotrexate | 0.7 | -7.91 | A691, F693 | LinkFrag (NBD2) | OBS |
| 5 | Antipyrine | 0.8 | -5.87 | Y994 | TM12 (NBD2) | OBS |
| 6 | Nordazepam | 0.9 | -8.34 | F693, Y994 | LinkFrag, TM12 (NBD2) | OBS |
| 7 | Carbamazepine | 0.9 | -7.76 | F693, W694, Y994 | LinkFrag, TM12 (NBD2) | OBS |
| 8 | Lamotrigine | 0.9 | -7.25 | Q721, F724, S725, N838, A976, F979 | TM7,9,12 (NBD2) | MBC |
| 9 | Ketoconazole | 1 | -9.9 | Y306, F728 | TM5,7 (NBD2) | MBC |
| 10 | Ramelteon | 1 | -8.12 | F693, A824, Y994 | LinkFrag, TM9,12 (NBD2) | OBS |
| 11 | Zolpidem | 1 | -8.15 | F724, F979 | TM7,12 (NBD2) | MBC |
| 12 | Phenytoin | 1 | -7.17 | E778 | TM8 (NBD2) | OBS |
| 13 | Meprobamate | 1 | -5.61 | A691, F693 | LinkFrag (NBD2) | OBS |
| 14 | Zonisamide | 1 | -7.64 | F693, W694, Y994 | LinkFrag, TM12 (NBD2) | OBS |
| 15 | Flumazenil | 1 | -7.18 | F979 | TM12 (NBD2) | MBC |
| 16 | Tacrine | 1.1 | -7.6 | Q721, S725, N838 | TM7,9 (NBD2) | MBC |
| 17 | Bromazepam | 1.2 | -8.18 | F693, Y994 | LinkFrag, TM12 (NBD2) | OBS |
| 18 | Nitrazepam | 1.2 | -9.19 | M874, Q878, K881, N926, K929 | TM10,11 (NBD1) | OBS |
| 19 | Pyridostigmine | 1.2 | -5.36 | F693 | LinkFrag (NBD2) | OBS |
| 20 | Clonazepam | 1.3 | -9.64 | M874, Q878, K881, N926, K929 | TM10,11 (NBD1) | OBS |
| 21 | Zaleplon | 1.4 | -8.8 | Y306, Q721, F979 | TM5,7,12 (NBD2) | MBC |
| 22 | Alprazolam | 1.4 | -9.21 | Y994 | TM12 (NBD2) | OBS |
| 23 | Oxcarbazepine | 1.4 | -7.81 | Y306, Q721, F755, F979 | TM5,7,8,12 (NBD2) | MBC |
| 24 | Loratadine | 1.4 | -9.72 | Y306, F332, F979 | TM6 (NBD1); TM5,12 (NBD2) | MBC |
| 25 | Nimodipine | 1.5 | -8.11 | K230, K238, K287 | TM4,5 (NBD2) | OBS |
| 26 | Riluzole | 1.5 | -6.26 | N838, F979 | TM9,12 (NBD2) | MBC |
| 27 | Trimethoprim | 2.8 | -6.89 | (F724), S725, N838, A976, F979 | TM7,9,12 (NBD2) | MBC |
| 28 | Famciclovir | 3.2 | -6.52 | Q721, F724, S725, F979, Q986 | TM7,12 (NBD2) | MBC |
| 29 | Prazosin | 3.8 | -8 | Y303, Q721, (F728, S975) | TM5,7,12 (NBD2) | MBC |
| 30 | Prednisone | 3.8 | -9.66 | M188, Q343 | TM3,6 (NBD1) | MBC |
| 31 | Dipyridamole | 22.7 | -9.98 | M945 | TM11 (NBD1) | MBC |

*Complexes were generated by docking to the whole surface of P-gp, except the ATPase domains and the extracellular part. White – docked to the other binding sites (OBS) in the middle of P-gp structure. Gray – docked to the main binding cavity (MBC) at the top of P-gp. This table is black-and-white in print and online.
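The residue column of Table 1 can be tallied per binding area to see which contacts recur within the MBC and within the OBS. The snippet below is only an illustrative sketch based on a hand transcription of the table; `CONTACTS` and `residue_counts` are hypothetical names, and residues that the table gives in parentheses are counted like any other, which is a simplifying assumption.

```python
from collections import Counter

# (binding site, contacting residues) for complexes 1-31, transcribed by hand
# from Table 1. Parenthesised residues are included without special treatment.
CONTACTS = [
    ("OBS", "F693"), ("OBS", "W694 Y994"), ("OBS", "E778 K822"),
    ("OBS", "A691 F693"), ("OBS", "Y994"), ("OBS", "F693 Y994"),
    ("OBS", "F693 W694 Y994"),
    ("MBC", "Q721 F724 S725 N838 A976 F979"),
    ("MBC", "Y306 F728"), ("OBS", "F693 A824 Y994"), ("MBC", "F724 F979"),
    ("OBS", "E778"), ("OBS", "A691 F693"), ("OBS", "F693 W694 Y994"),
    ("MBC", "F979"), ("MBC", "Q721 S725 N838"), ("OBS", "F693 Y994"),
    ("OBS", "M874 Q878 K881 N926 K929"), ("OBS", "F693"),
    ("OBS", "M874 Q878 K881 N926 K929"), ("MBC", "Y306 Q721 F979"),
    ("OBS", "Y994"), ("MBC", "Y306 Q721 F755 F979"),
    ("MBC", "Y306 F332 F979"), ("OBS", "K230 K238 K287"),
    ("MBC", "N838 F979"), ("MBC", "F724 S725 N838 A976 F979"),
    ("MBC", "Q721 F724 S725 F979 Q986"), ("MBC", "Y303 Q721 F728 S975"),
    ("MBC", "M188 Q343"), ("MBC", "M945"),
]


def residue_counts(rows):
    """Count how many complexes in each binding area contact each residue."""
    counts = {"MBC": Counter(), "OBS": Counter()}
    for site, residues in rows:
        counts[site].update(residues.split())
    return counts


COUNTS = residue_counts(CONTACTS)
for site in ("MBC", "OBS"):
    print(site, COUNTS[site].most_common(3))
```

In this transcription F693 is counted in 9 complexes, all of them OBS, and F979 in 9 complexes, all of them MBC. The tally is a reading aid for the table rather than an additional result of the study.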