Development of pharmacophore-based classification model for activators of constitutive androstane receptor


DMPK151_proof ■ 23 November 2016 ■ 1/7

Drug Metabolism and Pharmacokinetics xxx (2016) 1–7

Contents lists available at ScienceDirect

Drug Metabolism and Pharmacokinetics

Journal homepage: http://www.journals.elsevier.com/drug-metabolism-and-pharmacokinetics

Regular Article

Development of pharmacophore-based classification model for activators of constitutive androstane receptor

Kyungro Lee a, Hwan You a, Jiwon Choi b, Kyoung Tai No a, b, *

a Department of Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, South Korea
b Bioinformatics & Molecular Design Research Center, Yonsei University, Seoul 03722, South Korea

Article info

Article history: Received 5 August 2016; received in revised form 21 September 2016; accepted 10 November 2016; available online xxx

Keywords: Constitutive androstane receptor; Pharmacophore-based descriptor; Classification; Machine learning; Support vector machine

Abstract

Constitutive androstane receptor (CAR) is predominantly expressed in the liver and is important for regulating drug metabolism and transport. Despite its biological importance, there have been few attempts to develop in silico models to predict the activity of CAR modulated by chemical compounds. The number of in silico studies of CAR may be limited because of CAR's constitutive activity under normal conditions, which makes it difficult to elucidate the key structural features of the interaction between CAR and its ligands. In this study, to address these limitations, we introduced 3D pharmacophore-based descriptors that integrate ligand-based and structure-based pharmacophore features and thereby represent the receptor-ligand interaction. Machine learning methods (support vector machine and artificial neural network) were applied to develop an in silico model with the descriptors, which contain significant information regarding the ligand binding positions. The best classification model, built with a solvent accessibility volume-based filter and the support vector machine, achieved prediction accuracies of 87.5% and 85.4% for the training set and validation set, respectively. This demonstrates that our model can be used to accurately predict CAR activators and offers structural information regarding ligand/protein interactions.

© 2016 Published by Elsevier Ltd on behalf of The Japanese Society for the Study of Xenobiotics.

1. Introduction

The nuclear constitutive active/androstane receptor (CAR, NR1I3) is a key member of the nuclear receptor (NR) transcription factors, which regulate metabolic homeostasis when activated by their ligands. CAR serves as a xenobiotic sensing receptor that organizes the cellular defense system against endogenous and exogenous challenges by controlling the expression of genes encoding phase I oxidation enzymes (e.g., cytochrome P450s), phase II conjugation enzymes (e.g., UDP-glucuronosyltransferases), and phase III efflux transporters (e.g., multidrug resistance proteins) [1]. Drugs that bind to CAR in hepatocytes can disturb the liver metabolic system through pharmacokinetic drug–drug interactions. CAR is also associated with several hepatic functions, including bilirubin metabolism, fatty acid oxidation, bile acid homeostasis, hormone homeostasis, gluconeogenesis regulation, and cell apoptosis and proliferation [2].

* Corresponding author. E-mail addresses: [email protected], [email protected] (K.T. No).

http://dx.doi.org/10.1016/j.dmpk.2016.11.005 1347-4367/© 2016 Published by Elsevier Ltd on behalf of The Japanese Society for the Study of Xenobiotics.

Please cite this article in press as: Lee K, et al., Development of pharmacophore-based classification model for activators of constitutive androstane receptor, Drug Metabolism and Pharmacokinetics (2016), http://dx.doi.org/10.1016/j.dmpk.2016.11.005

NRs are structurally conserved and are composed of three domains: a highly variable N-terminal DNA binding domain, hinge domain, and ligand binding domain (LBD). In general, the LBD of NRs is enclosed by 12 α-helices, where α12 containing the activation function (AF) domain is a crucial region for activation [3–5]. Despite the structural similarity of CAR with other NRs, CAR is constitutively activated in the absence of ligand binding [6]. Helices αX and αAF of CAR mediate the ligand-independent interaction with coactivators and confer constitutive activity [7,8]. Furthermore, the positive K195 residue in helix α5 preferentially interacts with the negatively charged carboxy-terminus and constructs the active conformation of αAF, which has relevance to the shortened loop between αX and αAF in addition to a shortened αAF compared to other NRs [9,10]. Additionally, a mutagenesis study showed that several amino acids within α3 (Asn165), α5 (Val199), α10 (Tyr326, Ile330, and Gln331), and αAF (Leu343 and Ile346) contributed to the constitutive activity of CAR and some residues within α3 (Ile164 and Asn165), α5 (Cys202 and His203), and α7 (Phe234 and Phe238) affected the selectivity of chemicals for CAR activation [11]. The activity of CAR is commonly determined using a cell-based reporter gene assay, which measures the expression of a reporter in
the cell co-transfected with CAR and a reporter gene plasmid. The reporter assay has identified many compounds as agonists/inverse agonists of CAR, but the assay results do not demonstrate whether these compounds directly bind to the ligand-binding site of CAR. For example, phenobarbital indirectly augments the expression of the reporter genes by regulating epidermal growth factor receptor signaling, which is crucial for regulating the dephosphorylation of CAR at Thr38 and nuclear translocation of CAR [12,13]. As CAR accumulates in the nucleus, up-regulation of CAR activity induces the transcription of drug metabolism and transporter-related genes. In addition to phenobarbital, phenytoin, triclocarban, galangin, chrysin, and baicalein are known to indirectly activate CAR [14–16]. The reporter assay alone therefore cannot separate the compounds that bind to the CAR-LBD from the other activators of CAR. Two-hybrid assays and fluorescence resonance energy transfer assays were recently developed to verify the direct interaction between CAR and its ligands [16–18]. Since the activators in these assays are determined based on their function of dissociating the coactivator protein, TIF2 or PGC1, from CAR, further studies are needed to confirm whether they act at the ligand binding site or at a different activation site, particularly at the binding interface between CAR and coactivator proteins. Furthermore, only a limited number of scaffolds, such as polychlorinated biphenyls and phthalates, have been tested in these assays, and the number of validated compounds is too small to serve as training data for developing machine-learning models. We therefore collected data for chemicals regulating the activation of CAR, which include the results from yeast two-hybrid assays, fluorescence resonance energy transfer assays, and reporter gene assays, excluding known indirect activators and their analogs.

Machine learning is a data-driven method for building decision or prediction models that is widely used in computer-aided molecular modeling. Diverse machine-learning methods have been used to infer information about structure-based receptor-ligand binding. Although there are several machine-learning-based prediction models for the sister xenobiotic receptor pregnane X receptor (NR1I2) [19–21], only one machine-learning model for predicting the activity of CAR with its ligands has been developed, and this model indicated the critical residues involved in CAR ligand binding [22]. Because CAR is constitutively active under normal conditions, it is difficult to determine whether the activation of CAR is caused by ligand binding. In this study, we developed a general classification model derived from a support vector machine (SVM) and a solvent accessibility volume (SAVol)-based filter. Through validation with a predefined external validation set, the model was evaluated for its ability to classify compounds as activators/non-activators of CAR. The purpose of this study was to develop a prediction model and identify critical structural interactions between CAR and its ligands that contribute strongly to binding affinity.

2. Materials and methods

To identify the activators of CAR and infer structural information regarding the interaction between CAR and its ligands, pharmacophore-based descriptors were introduced to represent the binding features. First, reliable binding poses of CAR ligands were selected through docking calculations. The poses of the ligands were used to generate a ligand-based pharmacophore. A receptor-based pharmacophore representing possible interactions between ligands and CAR was built in the ligand-binding site. The receptor- and ligand-based pharmacophores were used to calculate descriptors representing the crucial interactions between a ligand and CAR. Machine-learning methods combined with a genetic algorithm (GA) were used to develop the binary classification model. The overall modeling procedure is depicted in Fig. 1.

2.1. Data set

A set of 548 compounds used in this study was collected from previous studies [17,18,22–55]. The compounds in the data set were classified in several ways, as shown in Fig. 2. The compounds were classified as i) activators or non-activators and ii) binders or non-binders of CAR.
By applying the biological and biophysical criteria (transcriptional activation and binding), the compounds were clustered into four groups: i) binders and activators, ii) binders and non-activators, iii) non-binders and activators, and iv) non-binders and non-activators. Among the collected activators, 23 compounds containing the common scaffold of known indirect activators (non-binder and activator) were eliminated from the data set, and the remaining activators were regarded as binders of CAR. During analysis of the data set, we found that all known CAR activators have a solvent accessible volume (SAVol) in the range of 326–800 Å³. This implies that the size of the ligand is correlated to the volume of the CAR ligand-binding pocket. Machine-learning models were developed after excluding the compounds outside this range from the data set. Finally, the data set of 392 compounds (222 activators and 170 non-activators) was divided into two groups, 70% for a training set and 30% for a validation test set (Supplemental Table 1).

Fig. 1. Flowchart of pharmacophore-based classification model development in the present study.
[Flowchart boxes: Data Collection and Filtration; Docking; Ligand Pharmacophore Generation; Receptor Pharmacophore Generation; Distance and Angle Calculation; Descriptor calculation; Model Development (Support Vector Machine, Neural Network); Model Validation.]

Fig. 2. Composition of the data set for classification model development. The four groups used as published experimental data are depicted on the left. The specific composition of the data set for model development is presented on the right. The figures in parentheses refer to the number of compounds in the group.
[Figure content: DB (548) with four groups: activator & binder, activator & non-binder, non-activator & binder, non-activator & non-binder. Removed from DB by scaffold known from experiment: indirect activators (23). Preassigned as Class II: small and large compounds (133). Constructed data set for machine learning model (392): Class I, activator (222); Class II, non-activator (170).]

2.2. Best pose selection by docking

Two X-ray crystal structures of CAR complexed with 6-(4-chlorophenyl)imidazo[2,1-b][1,3]thiazole-5-carbaldehyde O-(3,4-dichlorobenzyl) oxime (CITCO) and 5β-pregnane-3,20-dione were downloaded from the RCSB Protein Data Bank (PDB ID: 1XVP and 1XV9, respectively). After attaching hydrogens, the energy of each complex structure was minimized using the CHARMm force field with the Polak-Ribiere conjugate gradient method. The main structural difference between the two complexes is the orientation of Asn165 in the binding pocket. Asn165-flexible docking of the complexes was performed with AutoDock 4.2 [56] and ICM docking [57]. To evaluate the performance of the two docking methods for the X-ray crystal structures (1XVP and 1XV9), we performed re-docking of the co-crystallized ligands onto the X-ray structures of CAR used in this study [58]. The ICM docking method with the co-crystallized ligand of the PDB entry 1XV9 structure was selected as the representative method because the RMSD of the best scoring pose was lower than that of the other combinations in re-docking. Fifty poses for each ligand were generated using ICM docking, and the scores for each binding pose were calculated with the following 10 scoring functions: LigScore1 [59], LigScore2, PLP1 [60], PLP2 [61], Ludi1 [62], Ludi2, Ludi3, PMF [63], PMF04 [64], and Jain [65]. To overcome the limitations of individual scoring functions, a consensus scoring method was applied to select the best pose [66]. Briefly, we built the consensus score from the 10 scoring functions by counting the poses ranked in the top 30% by each scoring function.

2.3. Generation of structure-based descriptors

Both receptor- and ligand-based pharmacophores were used to generate structure-based descriptors. A receptor-based pharmacophore was generated using the 'Interaction Generation' module in Discovery Studio 2016 [67], which creates pharmacophore features from a Ludi interaction map on the surrounding residues in the binding site of CAR. Ninety hydrogen bond (HB) acceptors, 438 HB donors, and 220 hydrophobic features were generated, and then 109 pharmacophore features (3 HB acceptors, 3 HB donors, and 103 hydrophobic features) were selected by extracting the cluster centers from the features (Supplemental Fig. 1A). The ligand-based pharmacophore for each compound in the data set was generated using the 'Feature Mapping' module to identify all possible pharmacophore features (Supplemental Fig. 1B). The pharmacophore-based descriptor is the matching score of the hydrogen bond and hydrophobic features. The score was determined using a modified simple geometric function representing the distance and angle between each receptor-based pharmacophore feature and the closest ligand-based pharmacophore feature [68]. Hydrogen bond feature-based descriptors (HBDs) and hydrophobic feature-based descriptors (FDs) were calculated using the following equations:

$$\mathrm{HBD}(i) = \max_{j}\left[ f(\theta_{ij}) \cdot f(r_{ij}) \right] \qquad (1)$$

$$\mathrm{FD}(k) = \max_{l}\left[ f(r_{kl}) \right] \qquad (2)$$

where i and k are receptor-based pharmacophore features, j and l are ligand-based pharmacophore features, and f(θ) and f(r) are angle- and distance-dependent functions between receptor- and ligand-based pharmacophore features, respectively (Supplemental Fig. 2). f(θ) and f(r) are modified functions of PharmDock [69].

$$f(\theta_{ij}) = \begin{cases} 1, & \theta_{ij} < \dfrac{\pi}{6} \\[4pt] \cos^{2}\!\left( \dfrac{3}{2}\theta_{ij} - \dfrac{\pi}{4} \right), & \dfrac{\pi}{6} \le \theta_{ij} < \dfrac{\pi}{2} \\[4pt] 0, & \theta_{ij} \ge \dfrac{\pi}{2} \end{cases} \qquad (3)$$

$$f(r_{ij}) = \begin{cases} 1, & r_{ij} < 1.5\ \text{Å} \\[4pt] \cos^{2}\!\left( \dfrac{\pi}{3} r_{ij} - \dfrac{\pi}{2} \right), & 1.5\ \text{Å} \le r_{ij} < 3.0\ \text{Å} \\[4pt] 0, & r_{ij} \ge 3.0\ \text{Å} \end{cases} \qquad (4)$$

θij is the angle between the HB feature of the receptor and each HB feature of the ligand, and rij is the distance between the origin of the HB feature vector of the receptor and ligand or between the F feature point of the receptor and ligand.
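The descriptor definitions in Eqs. (1)–(4) can be illustrated with a short sketch. It assumes the piecewise forms f(θ) = cos²(3θ/2 − π/4) for π/6 ≤ θ < π/2 and f(r) = cos²(πr/3 − π/2) for 1.5 Å ≤ r < 3.0 Å, as read from Eqs. (3) and (4). The function names, the input layout, and the choice to return 0 when a ligand has no feature of the matching type are illustrative assumptions and are not taken from the paper or from Discovery Studio.

```python
import math


def f_angle(theta: float) -> float:
    """Angle term, Eq. (3); theta in radians."""
    if theta < math.pi / 6:
        return 1.0
    if theta < math.pi / 2:
        return math.cos(1.5 * theta - math.pi / 4) ** 2
    return 0.0


def f_dist(r: float) -> float:
    """Distance term, Eq. (4); r in angstroms."""
    if r < 1.5:
        return 1.0
    if r < 3.0:
        return math.cos(math.pi * r / 3 - math.pi / 2) ** 2
    return 0.0


def hbd(ligand_hb_features):
    """Eq. (1): best angle*distance match for one receptor HB feature.

    ligand_hb_features: iterable of (theta, r) pairs, one per ligand HB feature.
    Returning 0.0 for an empty iterable is an assumption of this sketch.
    """
    return max((f_angle(t) * f_dist(r) for t, r in ligand_hb_features), default=0.0)


def fd(ligand_distances):
    """Eq. (2): best distance match for one receptor hydrophobic feature.

    ligand_distances: iterable of distances r to each ligand hydrophobic feature.
    """
    return max((f_dist(r) for r in ligand_distances), default=0.0)


if __name__ == "__main__":
    # Hypothetical geometry for one receptor feature, for illustration only.
    print(hbd([(math.pi / 3, 2.25), (0.0, 4.0)]))
    print(fd([2.25, 5.0]))
```

Both terms equal 1 at the lower threshold and fall smoothly to 0 at the upper threshold, so a descriptor value lies between 0 (no matching ligand feature nearby) and 1 (a ligand feature in near-ideal geometry).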


2.4. Descriptor selection and model development

A GA was employed to select the descriptors that contributed most to the models. The population size and number of generations of the GA were 100 and 1000, respectively. Descriptor selection was stopped when the performance of the developed model did not improve for 200 generations. The population set of the GA was independently combined with two learning methods: SVM and artificial neural network (ANN) algorithms. After selecting the descriptors, the parameters of the learning methods, SVM (kernel, γ, C, and ε-insensitive loss function) and ANN (number of hidden layers, number of neurons, learning rate, and momentum), underwent grid-based optimization. The optimum kernel function of the SVM was a radial basis function defined by exp(−γ‖x − y‖²). The kernel γ, kernel capacity C, and specified insensitivity ε of the SVM were optimized to 0.919, 0.621, and 0.005, respectively. The optimum architecture of the ANN was composed of one hidden layer with seven neurons. The learning rate and momentum of the ANN were optimized to 0.278 and 0.103, respectively. All machine-learning calculations were performed using RapidMiner 5.3 [70]. To avoid overfitting to the training set, we used 5-fold cross-validation to estimate the fitness of the training models, followed by evaluation on the external validation test set. The quality of the models was represented by the following standard parameters derived from the confusion matrix: sensitivity (SE), specificity (SP), accuracy (Q), and Matthews correlation coefficient (MCC).

3. Results

3.1. Composition of data set

To confirm the unbiased distribution of the training and test sets, principal component analysis was performed using a set of molecular descriptors: AlogP, molecular weight, molecular surface area, number of rotatable bonds, number of rings, and number of hydrogen bond acceptors and donors.
The distribution of compounds was elucidated using the three main principal components (Supplemental Fig. 3). The proportions of variance were 0.354, 0.177, and 0.148, respectively. Because the activators/non-activators and the training/test sets were checked and the four groups of compounds were distributed evenly in 2D space, the data set is suitable for model development.

3.2. Performance of classification models

To identify activators of CAR, a training set of 274 compounds and a test set of 118 compounds, encoded by the pharmacophore-based descriptors, were used with two machine-learning algorithms, SVM and ANN. To select the best descriptor set for building the model, the GA was integrated with each of the two algorithms. The number of descriptors was increased up to 10 for both algorithms, and the SVM model encoded by seven descriptors gave the best predictability for the test set (Fig. 3). These results indicate that the 7-descriptor subset is sufficiently large to explain the model. Information from the selected pharmacophore-based structural locations explained the structural relationship between CAR and its ligands in the complex. Table 1 shows that the GA-SVM model performed substantially better than the GA-ANN model. The predictabilities of the model, Q and MCC, were 0.805 and 0.606 for the test set, respectively. Although these SVM results were lower than expected, when the 133 compounds preassigned as non-activators by the filter were included, the predictabilities (Q) of the model on the total data set were 0.875 and 0.854 in the training and test sets, respectively.
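The step from the SVM-only accuracy to the accuracy on the total data set can be checked with a little arithmetic. The sketch below takes the SVM confusion-matrix counts reported in Table 1 and adds the 133 filtered compounds as correctly predicted non-activators. The text does not state how those 133 compounds were divided between the training and test sets, or that all of them were counted as true negatives, so the 93/40 split used here is an assumption chosen because it is consistent with the reported values, not a figure from the paper.

```python
import math


def q_and_mcc(tp, fp, tn, fn):
    """Overall accuracy Q and Matthews correlation coefficient."""
    q = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fn) * (tp + fp) * (tn + fn) * (tn + fp))
    return q, (tp * tn - fn * fp) / denom


# SVM counts as (TP, FP, TN, FN), copied from Table 1.
SVM_COUNTS = {"training": (133, 24, 95, 22), "test": (54, 10, 41, 13)}

# ASSUMPTION (not stated in the source): the 133 compounds removed by the
# SAVol filter are split 93/40 and all count as true negatives.
FILTERED_TRUE_NEGATIVES = {"training": 93, "test": 40}


def combined(split):
    """Q and MCC after adding the filtered compounds to the SVM counts."""
    tp, fp, tn, fn = SVM_COUNTS[split]
    return q_and_mcc(tp, fp, tn + FILTERED_TRUE_NEGATIVES[split], fn)


if __name__ == "__main__":
    for split in SVM_COUNTS:
        q, mcc = combined(split)
        print(f"{split}: Q = {q:.3f}, MCC = {mcc:.3f}")
```

Under this assumption the sketch gives Q = 0.875 and 0.854, matching the values quoted above, and MCC = 0.744 and 0.701, matching the values given in the Discussion.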

Fig. 3. Predictability (Q) is plotted against the number of descriptors. The solid line is the training set and dotted line is the test set.
[Plot: x-axis, number of descriptors (2–11); y-axis, predictability (Q), 0–0.8; series: SVM, SVM-Test, ANN, ANN-Test.]

Table 1 Comparison of performance among combinational models.

Set       Model  D  TP   FP  TN  FN  SE(a)  SP(b)  Q(c)   MCC(d)
Training  SVM    7  133  24  95  22  0.858  0.798  0.832  0.658
Training  ANN    7  119  32  83  18  0.869  0.722  0.802  0.600
Test      SVM    7  54   10  41  13  0.806  0.804  0.805  0.606
Test      ANN    7  63   21  33  11  0.851  0.611  0.750  0.481

Number of descriptors (D), true positive (TP), true negative (TN), false positive (FP), false negative (FN), sensitivity (SE), specificity (SP), overall prediction accuracy (Q), and Matthews correlation coefficient (MCC).
a SE = TP/(TP + FN).
b SP = TN/(TN + FP).
c Q = (TP + TN)/(TP + TN + FP + FN).
d MCC = [(TP × TN) − (FN × FP)]/[(TP + FN)(TP + FP)(TN + FN)(TN + FP)]^(1/2).
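The footnote formulas of Table 1 can be applied directly to the listed counts. The following minimal sketch recomputes SE, SP, Q, and MCC for each row; the function and variable names are illustrative and not taken from the paper or from RapidMiner.

```python
import math


def confusion_metrics(tp, fp, tn, fn):
    """Return (SE, SP, Q, MCC) using the footnote formulas of Table 1."""
    se = tp / (tp + fn)
    sp = tn / (tn + fp)
    q = (tp + tn) / (tp + tn + fp + fn)
    denom = math.sqrt((tp + fn) * (tp + fp) * (tn + fn) * (tn + fp))
    mcc = (tp * tn - fn * fp) / denom
    return se, sp, q, mcc


# (TP, FP, TN, FN) copied from Table 1.
TABLE1_COUNTS = {
    ("training", "SVM"): (133, 24, 95, 22),
    ("training", "ANN"): (119, 32, 83, 18),
    ("test", "SVM"): (54, 10, 41, 13),
    ("test", "ANN"): (63, 21, 33, 11),
}

if __name__ == "__main__":
    for (split, model), counts in TABLE1_COUNTS.items():
        se, sp, q, mcc = confusion_metrics(*counts)
        print(f"{split:8s} {model}: SE={se:.3f} SP={sp:.3f} Q={q:.3f} MCC={mcc:.3f}")
```

Rounded to three decimals, the recomputed values agree with the table for all four rows. Note that the ANN rows as printed sum to 252 and 128 compounds rather than the 274 and 118 stated for the training and test sets; the sketch uses the counts as given.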

To exclude the possibility of chance correlation, the reliability of the model was evaluated by Y-randomization [71]. The activity labels of the training set were randomized and the SVM model was then developed again. The predictability after Y-randomization was compared with that of the original data and is summarized in Table 2. The MCC values of the randomized models were close to zero and the Z-scores were greater than three, indicating the statistical significance of the best model.

Table 2 Model validation through Y-randomized data.

Model               SVM           ANN
Original(a)         0.606         0.481
Y-randomization(b)  0.037 ± 0.10  0.012 ± 0.14
Z-score(c)          5.75          3.35

a MCC of the best model validated by test set (MCC_ori).
b Y-randomization values represent the mean (MCC_rand^mean) ± deviation (σ) of sensitivities from 10 independent runs.
c z = (MCC_ori − MCC_rand^mean)/σ.

3.3. Structural analysis of pharmacophore-based descriptors

To identify structural information representing the binding of CAR to its ligands, the descriptors composing the models were analyzed. Among the whole set of 109 descriptors, the seven descriptors based on pharmacophore features selected from the SVM model are illustrated in Fig. 4. The features are composed of one hydrogen
bond acceptor (HA1) and six hydrophobic features (F1, F2, F3, F4, F5, and F6). To evaluate the relevance of the selected descriptors, the contribution factor (Ci) proposed by Cherqaoui et al. [72] and the difference of MCC (ΔMCC) were calculated by comparing the predictabilities of the model re-trained without one descriptor (Table 3). The Ci of F5 was the highest and that of F1 was higher than the average of the other Ci values. F5 and F1 were located in the space between the α6 and α7 helices. F5 was close to the α-carbon chain of Gly229 and contacted the aromatic ring of Phe234. F1 was near the carbon chains of Leu239 and Leu242. If a compound interacted with the two hydrophobic features, the stability of the ligand binding site of CAR was expected to increase. The hydrophobic center of 5β-pregnane-3,20-dione and the p-chlorophenyl substituent of CITCO were matched with F5 and F1 and were found to interact with Phe234, Leu239, and Leu242. The only hydrophilic feature, HA1, represents a hydrogen bond with His203. The C21 ketone atom of 5β-pregnane-3,20-dione and the nitrogen of the imidazothiazole heterocycle of CITCO formed a hydrogen bond with the lone pair at the ε-nitrogen of His203. Among the selected features, F3 near HA1 was located at the closest position to the AF helix. However, the nearest residue in the helix, Leu343, was 5.3 Å from F3. Asn165 and Tyr326 lie between F3 and the helix; they are components of a barrier that shields the ligand binding pocket from the AF helix and packs against the AF helix and αX [8]. Although it was difficult for F3 alone to directly affect the packing of the ligand binding pocket with the AF helix, F3 in cooperation with HA1 was a crucial feature for stabilizing the packing of the ligand binding pocket and for interacting with co-activator proteins. F4 was located at C23 of the 5β-pregnane-3,20-dione binding pose and interacted with Leu206 of CAR. Many small molecules in the data set were not matched with F4. This may be because of its position at the center of the binding site and the difficulty in representing the interaction with specific residues. Hydrophobic feature F6 was located at the hydrophobic interaction between the chlorine of CITCO and CAR. This is thought to be an important factor for stabilizing the receptor-ligand complex structure after ligands enter the ligand binding pocket of CAR.

Fig. 4. Composition of seven pharmacophore features derived from the trained SVM model (PDB ID: 1XV9): one hydrogen bond acceptor and six hydrophobic queries. The hydrogen bond acceptor is denoted as HA and the hydrophobe as F. Pharmacophore features are marked in blue and residues surrounding the binding pocket are marked in black. The figures were created using Discovery Studio 2016. (A) Pharmacophore in the ligand binding site of CAR. The α2 (T127–R136) and α3 (H160–Q171) helices are deleted for clarity. The surface of CAR is depicted by surface of solvent atom charge; blue and red represent positive and negative charge, respectively. The radius of each probe is 1.4 Å. (B) 180°-rotated ribbon diagram of (A), with αX in dark blue and αAF in light blue. Parts of α10 (Y328–H332) and α6 (D228–Q235) are deleted for clarity. The carbon atoms of four barrier residues are colored in yellow. The residues from αAF are colored in light green. Pharmacophore model with agonists, (C) 5β-pregnane-3,20-dione and (D) CITCO.

Table 3 Relative contributions of seven selected descriptors from the SVM model.

Descriptor    Ci (%)(a)  ΔMCC(b)
Acceptor1     13.38      0.110
Hydrophobe1   14.92      0.237
Hydrophobe2   14.09      0.201
Hydrophobe3   12.02      0.106
Hydrophobe4   16.16      0.274
Hydrophobe5   16.58      0.311
Hydrophobe6   12.85      0.146

a Ci = 100Δmi / Σ(i=1..7) Δmi; Δmi is the mean of the deviations' absolute values between observed and estimated values.
b ΔMCC is calculated as the difference in MCC yielded with all descriptors and without the regarded descriptor.

4. Discussion

Although numerous computational studies have evaluated NRs, the application of such models to CAR activators, which is of great interest in medicinal chemistry, has not been thoroughly evaluated. In this study, we developed machine-learning models based on pharmacophore-based descriptors. First, the SAVol filter was applied to the data set. When divided based on SAVol 326 Å³,
compounds were classified with high accuracy (67%). Small molecules with SAVol values of less than 326 Å³ may not sufficiently fill the CAR ligand-binding pocket; accordingly, these molecules were classified as non-activators. To prevent biases in model development with respect to molecule size, we excluded the small molecules from the data set. After docking simulations, the large molecules with SAVol of greater than 800 Å³ were excluded owing to ligand-protein bumping. Using the preprocessed data set, we calculated pharmacophore-based descriptors that represent the position and interactions of compounds within the ligand-binding site of CAR. Machine-learning algorithms, SVM and ANN, were then applied to develop a classification model using the descriptors. Based on the results, SVM combined with the SAVol filter performed better than the other models. The Q values of the model were 0.875 and 0.854 and the MCC values were 0.744 and 0.701 for the training and test sets, respectively. Using a Y-randomization test, the developed model was validated and found to be robust. Furthermore, when the 23 compounds containing known common scaffolds of indirect activators were applied to the model, Q was only 47.8% and the sensitivity was zero (Supplemental Table 2). Because these compounds share a common structure with indirect activators, they may not target the ligand-binding site of CAR. Since the model did not classify these compounds as activators, we conclude that its performance is specific to compounds that target CAR directly. Based on the analysis of the contribution of descriptors, hydrophobic features F5 and F1 were important for stabilizing the α6 and α7 helices, and F3 and HA1 near the αX and AF helices were crucial for LBD packing and for interacting with co-activators. Although only seven descriptors and a size filter were used, the model showed good performance and can be used for identifying CAR activators from large compound libraries.
Together, these findings provide insight into how the CAR can be activated by ligands and suggest that structural information can be determined using pharmacophore models. The model built in this study can be used to understand the drug metabolism pathway, expression of metabolizing enzymes, and drugedrug interactions. Overall, this pharmacophore-based descriptors approach can be utilized in various machine-learning models to provide valuable insights into receptor-ligand complexes to guide future structureand ligand-based drug design. Conflict of interest The authors have declared no conflicts of interest. Acknowledgements This research is supported by the Industrial Core Technology Development Program (10054749, software development about drug metabolism prediction) and funded by the Ministry of Trade, Industry and Energy (MOTIE), and supported by the Ministry of Knowledge Economy through Korea Research Institute of Chemical Technology (SI-1205, SI-1304, SI-1404, SI-1505). This work is also supported in part by Brain Korea 21 (BK21) PLUS Program. Appendix A. Supplementary data Supplementary data related to this article can be found at http:// dx.doi.org/10.1016/j.dmpk.2016.11.005. References [1] Nakata K, Tanaka Y, Nakano T, Adachi T, Tanaka H, Kaminuma T, et al. Nuclear receptor-mediated transcriptional regulation in Phase I, II, and III xenobiotic metabolizing systems. Drug Metab Pharmacokinet 2006;21:437e57.

[2] di Masi A, De Marinis E, Ascenzi P, Marino M. Nuclear receptors CAR and PXR: molecular, functional, and biomedical aspects. Mol Aspects Med 2009;30: 297e343. [3] Kumar R, Thompson EB. The structure of the nuclear hormone receptors. Steroids 1999;64:310e9. [4] Holm L, Sander C. Dali/FSSP classification of three-dimensional protein folds. Nucleic Acids Res 1997;25:231e4. [5] Holm L, Sander C. The FSSP database of structurally aligned protein fold families. Nucleic Acids Res 1994;22:3600e9. [6] Choi HS, Chung M, Tzameli I, Simha D, Lee YK, Seol W, et al. Differential transactivation by two isoforms of the orphan nuclear hormone receptor CAR. J Biol Chem 1997;272:23565e71. [7] Andersin T, Vaisanen S, Carlberg C. The critical role of carboxy-terminal amino acids in ligand-dependent and -independent transactivation of the constitutive androstane receptor. Mol Endocrinol 2003;17:234e46. [8] Xu RX, Lambert MH, Wisely BB, Warren EN, Weinert EE, Waitt GM, et al. A structural basis for constitutive activity in the human CAR/RXRalpha heterodimer. Mol Cell 2004;16:919e28. [9] Dussault I, Lin M, Hollister K, Fan M, Termini J, Sherman MA, et al. A structural model of the constitutive androstane receptor defines novel interactions that mediate ligand-independent activity. Mol Cell Biol 2002;22:5270e80. [10] Suino K, Peng L, Reynolds R, Li Y, Cha JY, Repa JJ, et al. The nuclear xenobiotic receptor CAR: structural determinants of constitutive activation and heterodimerization. Mol Cell 2004;16:893e905. [11] Jyrkkarinne J, Windshugel B, Makinen J, Ylisirnio M, Perakyla M, Poso A, et al. Amino acids important for ligand specificity of the human constitutive androstane receptor. J Biol Chem 2005;280:5960e71. [12] Mutoh S, Sobhany M, Moore R, Perera L, Pedersen L, Sueyoshi T, et al. Phenobarbital indirectly activates the constitutive active androstane receptor (CAR) by inhibition of epidermal growth factor receptor signaling. Sci Signal 2013;6:ra31. [13] Li H, Wang H. 
Activation of xenobiotic receptors: driving into the nucleus. Expert Opin Drug Metab Toxicol 2010;6:409e26. [14] Wang H, Faucette S, Moore R, Sueyoshi T, Negishi M, LeCluyse E. Human constitutive androstane receptor mediates induction of CYP2B6 gene expression by phenytoin. J Biol Chem 2004;279:29295e301. [15] Yueh MF, Li T, Evans RM, Hammock B, Tukey RH. Triclocarban mediates induction of xenobiotic metabolism through activation of the constitutive androstane receptor and the estrogen receptor alpha. PLoS One 2012;7: e37705. [16] Carazo Fernandez A, Smutny T, Hyrsova L, Berka K, Pavek P. Chrysin, baicalein and galangin are indirect activators of the human constitutive androstane receptor (CAR). Toxicol Lett 2015;233:68e77. [17] Zhang H, Zhang Z, Nakanishi T, Wan Y, Hiromori Y, Nagase H, et al. Structuredependent activity of phthalate esters and phthalate monoesters binding to human constitutive androstane receptor. Chem Res Toxicol 2015;28: 1196e204. [18] Kamata R, Shiraishi F, Kageyama S, Nakajima D. Detection and measurement of the agonistic activities of PCBs and mono-hydroxylated PCBs to the constitutive androstane receptor using a recombinant yeast assay. Toxicol Vitro 2015;29:1859e67. [19] Ma SL, Joung JY, Lee S, Cho KH, No KT. PXR ligand classification model with SFED-weighted WHIM and CoMMA descriptors. SAR QSAR Environ Res 2012;23:485e504. [20] Dybdahl M, Nikolov NG, Wedebye EB, Jonsdottir SO, Niemela JR. QSAR model for human pregnane X receptor (PXR) binding: screening of environmental chemicals and correlations with genotoxicity, endocrine disruption and teratogenicity. Toxicol Appl Pharmacol 2012;262:301e9. [21] Handa K, Nakagome I, Yamaotsu N, Gouda H, Hirono S. Three-dimensional quantitative structure-activity relationship analysis for human pregnane X receptor for the prediction of CYP3A4 induction in human hepatocytes: structure-based comparative molecular field analysis. J Pharm Sci 2015;104: 223e32. 
[22] Jyrkkarinne J, Windshugel B, Ronkko T, Tervo AJ, Kublbeck J, LahtelaKakkonen M, et al. Insights into ligand-elicited activation of human constitutive androstane receptor based on novel agonists and three-dimensional quantitative structure-activity relationship. J Med Chem 2008;51:7181e92. [23] Hernandez JP, Mota LC, Baldwin WS. Activation of CAR and PXR by dietary, environmental and occupational chemicals alters drug metabolism, intermediary metabolism, and cell proliferation. Curr Pharmacogenomics Person Med 2009;7:81e105. [24] Li L, Stanton JD, Tolson AH, Luo Y, Wang H. Bioactive terpenoids and flavonoids from Ginkgo biloba extract induce the expression of hepatic drugmetabolizing enzymes through pregnane X receptor, constitutive androstane receptor, and aryl hydrocarbon receptor-mediated pathways. Pharm Res 2009;26:872e82. [25] Kretschmer XC, Baldwin WS. CAR and PXR: xenosensors of endocrine disrupters? Chem Biol Interact 2005;155:111e28. [26] Yao R, Yasuoka A, Kamei A, Kitagawa Y, Tateishi N, Tsuruoka N, et al. Dietary flavonoids activate the constitutive androstane receptor (CAR). J Agric Food Chem 2010;58:2168e73. [27] Burk O, Piedade R, Ghebreghiorghis L, Fait JT, Nussler AK, Gil JP, et al. Differential effects of clinically used derivatives and metabolites of artemisinin in the activation of constitutive androstane receptor isoforms. Br J Pharmacol 2012;167:666e81.

Please cite this article in press as: Lee K, et al., Development of pharmacophore-based classification model for activators of constitutive androstane receptor, Drug Metabolism and Pharmacokinetics (2016), http://dx.doi.org/10.1016/j.dmpk.2016.11.005

66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58

DMPK151_proof ■ 23 November 2016 ■ 7/7

K. Lee et al. / Drug Metabolism and Pharmacokinetics xxx (2016) 1e7 [28] Kublbeck J, Jyrkkarinne J, Poso A, Turpeinen M, Sippl W, Honkakoski P, et al. Discovery of substituted sulfonamides and thiazolidin-4-one derivatives as agonists of human constitutive androstane receptor. Biochem Pharmacol 2008;76:1288e97. [29] Casabar RC, Das PC, Dekrey GK, Gardiner CS, Cao Y, Rose RL, et al. Endosulfan induces CYP2B6 and CYP3A4 by activating the pregnane X receptor. Toxicol Appl Pharmacol 2010;245:335e43. [30] Chen X, Zhang J, Baker SM, Chen G. Human constitutive androstane receptor mediated methotrexate induction of human dehydroepiandrosterone sulfotransferase (hSULT2A1). Toxicology 2007;231:224e33. [31] Kobayashi K, Yamanaka Y, Iwazaki N, Nakajo I, Hosokawa M, Negishi M, et al. Identification of HMG-CoA reductase inhibitors as activators for human, mouse and rat constitutive androstane receptor. Drug Metab Dispos 2005;33: 924e9. [32] Lynch C, Pan Y, Li L, Ferguson SS, Xia M, Swaan PW, et al. Identification of novel activators of constitutive androstane receptor from FDA-approved drugs by integrated computational and biological approaches. Pharm Res 2013;30:489e501. [33] Fisher CD, Augustine LM, Maher JM, Nelson DM, Slitt AL, Klaassen CD, et al. Induction of drug-metabolizing enzymes by garlic and allyl sulfide compounds via activation of constitutive androstane receptor and nuclear factor E2-related factor 2. Drug Metab Dispos 2007;35:995e1000. [34] Lau AJ, Yang G, Chang TK. Isoform-selective activation of human constitutive androstane receptor by Ginkgo biloba extract: functional analysis of the SV23, SV24, and SV25 splice variants. J Pharmacol Exp Ther 2011;339: 704e15. [35] Anderson LE, Dring AM, Hamel LD, Stoner MA. Modulation of constitutive androstane receptor (CAR) and pregnane X receptor (PXR) by 6-arylpyrrolo [2,1-d][1,5]benzothiazepine derivatives, ligands of peripheral benzodiazepine receptor (PBR). Toxicol Lett 2011;202:148e54. 
[36] Omiecinski CJ, Coslo DM, Chen T, Laurenzana EM, Peffer RC. Multi-species analyses of direct activators of the constitutive androstane receptor. Toxicol Sci 2011;123:550e62. [37] Kublbeck J, Jyrkkarinne J, Molnar F, Kuningas T, Patel J, Windshugel B, et al. New in vitro tools to study human constitutive androstane receptor (CAR) biology: discovery and comparison of human CAR inverse agonists. Mol Pharm 2011;8:2424e33. [38] Al-Salman F, Plant N. Non-coplanar polychlorinated biphenyls (PCBs) are direct agonists for the human pregnane-X receptor and constitutive androstane receptor, and activate target gene expression in a tissue-specific manner. Toxicol Appl Pharmacol 2012;263:7e13. [39] Svard J, Spiers JP, Mulcahy F, Hennessy M. Nuclear receptor-mediated induction of CYP450 by antiretrovirals: functional consequences of NR1I2 (PXR) polymorphisms and differential prevalence in whites and sub-Saharan Africans. J Acquir Immune Defic Syndr 2010;55:536e49. [40] Li H, Chen T, Cottrell J, Wang H. Nuclear translocation of adenoviral-enhanced yellow fluorescent protein-tagged-human constitutive androstane receptor (hCAR): a novel tool for screening hCAR activators in human primary hepatocytes. Drug Metab Dispos 2009;37:1098e106. [41] Li L, Chen T, Stanton JD, Sueyoshi T, Negishi M, Wang H. The peripheral benzodiazepine receptor ligand 1-(2-chlorophenyl-methylpropyl)-3isoquinoline-carboxamide is a novel antagonist of human constitutive androstane receptor. Mol Pharmacol 2008;74:443e53. [42] Yao R, Yasuoka A, Kamei A, Kitagawa Y, Rogi T, Taieishi N, et al. Polyphenols in alcoholic beverages activating constitutive androstane receptor CAR. Biosci Biotechnol Biochem 2011;75:1635e7. [43] Dring AM, Anderson LE, Qamar S, Stoner MA. Rational quantitative structureactivity relationship (RQSAR) screen for PXR and CAR isoform-specific nuclear receptor ligands. Chem Biol Interact 2010;188:512e25. [44] Faucette SR, Zhang TC, Moore R, Sueyoshi T, Omiecinski CJ, LeCluyse EL, et al. 
Relative activation of human pregnane X receptor versus constitutive androstane receptor defines distinct classes of CYP2B6 and CYP3A4 inducers. J Pharmacol Exp Ther 2007;320:72e80. [45] DeKeyser JG, Laurenzana EM, Peterson EC, Chen T, Omiecinski CJ. Selective phthalate activation of naturally occurring human constitutive androstane receptor splice variants and the pregnane X receptor. Toxicol Sci 2011;120: 381e91. [46] Chen T, Tompkins LM, Li L, Li H, Kim G, Zheng Y, et al. A single amino acid controls the functional switch of human constitutive androstane receptor (CAR) 1 to the xenobiotic-sensitive splicing variant CAR3. J Pharmacol Exp Ther 2010;332:106e15.

7

[47] Chang TK, Waxman DJ. Synthetic drugs and natural products as modulators of constitutive androstane receptor (CAR) and pregnane X receptor (PXR). Drug Metab Rev 2006;38:51e73. [48] Kublbeck J, Laitinen T, Jyrkkarinne J, Rousu T, Tolonen A, Abel T, et al. Use of comprehensive screening methods to detect selective human CAR activators. Biochem Pharmacol 2011;82:1994e2007. [49] Zhang XJ, Shi Z, Lyv JX, He X, Englert NA, Zhang SY. Pyrene is a novel constitutive androstane receptor (CAR) activator and causes hepatotoxicity by CAR. Toxicol Sci 2015;147:436e45. [50] Yu L, Wang Z, Huang M, Li Y, Zeng K, Lei J, et al. Evodia alkaloids suppress gluconeogenesis and lipogenesis by activating the constitutive androstane receptor. Biochim Biophys Acta 2015. Q4 [51] Yarushkin AA, Kachaylo EM, Pustylnyak VO. The constitutive androstane receptor activator 4-[(4R,6R)-4,6-diphenyl-1,3-dioxan-2-yl]-N,N-dimethylaniline inhibits the gluconeogenic genes PEPCK and G6Pase through the suppression of HNF4alpha and FOXO1 transcriptional activity. Br J Pharmacol 2013;168:1923e32. [52] Sharma D, Lau AJ, Sherman MA, Chang TK. Differential activation of human constitutive androstane receptor and its SV23 and SV24 splice variants by rilpivirine and etravirine. Br J Pharmacol 2015;172:1263e76. [53] Lau AJ, Chang TK. Indirect activation of the SV23 and SV24 splice variants of human constitutive androstane receptor: analysis with 3-hydroxyflavone and its analogues. Br J Pharmacol 2013;170:403e14. [54] Kittayaruksakul S, Zhao W, Xu M, Ren S, Lu J, Wang J, et al. Identification of three novel natural product compounds that activate PXR and CAR and inhibit inflammation. Pharm Res 2013;30:2199e208. [55] Imai J, Yamazoe Y, Yoshinari K. Novel cell-based reporter assay system using epitope-tagged protein for the identification of agonistic ligands of constitutive androstane receptor (CAR). Drug Metab Pharmacokinet 2013;28:290e8. 
[56] Morris GM, Huey R, Lindstrom W, Sanner MF, Belew RK, Goodsell DS, et al. AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. J Comput Chem 2009;30:2785e91. [57] Molsoft LLC. ICM software package. Version 38. [58] Wang R, Lu Y, Wang S. Comparative evaluation of 11 scoring functions for molecular docking. J Med Chem 2003;46:2287e303. [59] Krammer A, Kirchhoff PD, Jiang X, Venkatachalam CM, Waldman M. LigScore: a novel scoring function for predicting binding affinities. J Mol Graph Model 2005;23:395e407. [60] Gehlhaar DK, Verkhivker GM, Rejto PA, Sherman CJ, Fogel DR, Fogel LJ, et al. Molecular recognition of the inhibitor AG-1343 by HIV-1 protease: conformationally flexible docking by evolutionary programming. Chem Biol 1995;2:317e24. [61] Gehlhaar D, Bouzida D, Rejto P. Rational drug design: novel methodology and practical applications. ACS symposium series. 1999. p. 2e311. €hm H-J. The development of a simple empirical scoring function to estimate [62] Bo the binding constant for a protein-ligand complex of known threedimensional structure. J Comput Aided Mol Des 1994;8:243e56. [63] Muegge I, Martin YC. A general and fast scoring function for protein-ligand interactions: a simplified potential approach. J Med Chem 1999;42:791e804. [64] Muegge I. PMF scoring revisited. J Med Chem 2005;49:5895e902. [65] Jain A. Scoring noncovalent protein-ligand interactions: a continuous differentiable function tuned to compute binding affinities. J Comput Aided Mol Des 1996;10:427e40. [66] Charifson PS, Corkery JJ, Murcko MA, Walters WP. Consensus scoring: a method for obtaining improved hit rates from docking databases of threedimensional structures into proteins. J Med Chem 1999;42:5100e9. mes BIOVIA. Discovery studio modeling environment. Release [67] Dassault Syste mes; 2016. 2016. San Diego: Dassault Syste [68] Hu B, Lill MA. Exploring the potential of protein-based pharmacophore models in ligand pose prediction and ranking. 
J Chem Inf Model 2013;53: 1179e90. [69] Hu B, Lill MA. PharmDock: a pharmacophore-based docking program. J Cheminform 2014;6:14. [70] Surhone LM, Tennoe MT, Henssonow SF. Rapidminer. VDM Publishing; 2010. [71] Lee S, Kang YM, Park H, Dong MS, Shin JM, No KT. Human nephrotoxicity prediction models for three types of kidney injury based on data sets of pharmacological compounds and their metabolites. Chem Res Toxicol 2013;26:1652e9. [72] Cherqaoui D, Esseffar M, Villemin D, Cense JM, Chastrette M, Zakarya D. Structure musk odour relationship studies of tetralin and indan compounds using neural networks. New J Chem 1998;22:839e43.

Please cite this article in press as: Lee K, et al., Development of pharmacophore-based classification model for activators of constitutive androstane receptor, Drug Metabolism and Pharmacokinetics (2016), http://dx.doi.org/10.1016/j.dmpk.2016.11.005

59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116