CORAL: Prediction of binding affinity and efficacy of thyroid hormone receptor ligands


European Journal of Medicinal Chemistry 101 (2015) 452-461


European Journal of Medicinal Chemistry journal homepage: http://www.elsevier.com/locate/ejmech

Research paper

A.P. Toropova*, A.A. Toropov, E. Benfenati

IRCCS, Istituto di Ricerche Farmacologiche Mario Negri, 20156, Via La Masa 19, Milano, Italy

Article info

Article history: Received 30 April 2015; received in revised form 1 July 2015; accepted 6 July 2015; available online 10 July 2015

Keywords: QSAR; Endocrine disrupting; Thyroid hormone receptor; Monte Carlo method

Abstract

Quantitative structure-activity relationships (QSARs) for binding affinity to thyroid hormone receptors, based on attributes of molecular structure extracted from the simplified molecular input-line entry system (SMILES), are established using the CORAL software (http://www.insilico.eu/coral). The half maximal inhibitory concentration (IC50) is used as the measure of binding affinity. Molecular features that are statistically reliable promoters of an increase or a decrease of IC50 are suggested. Examples of modifications of molecular structure that lead to an increase or a decrease of the endpoint are presented. © 2015 Elsevier Masson SAS. All rights reserved.

1. Introduction

Endocrine disrupting chemicals are natural or synthetic compounds that have the potential to interfere with the endocrine system, often by imitating or blocking endogenous hormones [1]. Thyroid hormones are important endocrine signalling hormones involved in a number of physiological processes, such as lipid metabolism, control of energy expenditure, and brain development [1-3]. Consequently, the synthesis and analysis of chemicals that can act as analogues of thyroid hormones is an important task [4]. Several classes of environmental chemicals have a high degree of structural similarity to thyroid hormones [5]. The half maximal inhibitory concentration is a traditional measure of the biological activity of thyroid hormone analogues [6]. Quantitative structure-activity relationships (QSARs) are widely used to predict endpoints related to the biological behaviour of thyroid hormone analogues [7].

* Corresponding author. Laboratory of Environmental Chemistry and Toxicology, IRCCS - Istituto di Ricerche Farmacologiche Mario Negri, Via La Masa 19, 20156 Milano, Italy. E-mail address: [email protected] (A.P. Toropova).
http://dx.doi.org/10.1016/j.ejmech.2015.07.012
0223-5234/© 2015 Elsevier Masson SAS. All rights reserved.

Endocrine disrupting chemicals may act via multiple pathways; however, one privileged route is their direct interaction with nuclear receptors, which leads to modulation of gene expression. Several functional domains of thyroid hormone receptors have been identified. An important one is the ligand-inducible co-activator binding domain termed activation function 2, or AF-2 [1]. The aim of this work is to build QSAR models for binding affinity to thyroid hormone receptors (AF-2).

2. Method

2.1. Data

The chemical structures of 181 compounds able to bind at the AF-2 domain were taken from the literature; the collected IC50 values fall in the range from 0.310 μM to 100 μM [1]. The endpoint under consideration is the negative decimal logarithm of the half maximal inhibitory concentration (pIC50) [1]. The available data were randomly split three times into training (≈75%), calibration (≈12.5%), and validation (≈12.5%) sets. Each set has a specific role: the training set is used to build the model; the calibration set is used to prevent overtraining; and the validation set is used to estimate the predictive potential of the model (see Figs. 1 and 2).
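The endpoint definition and the three-way split can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `pic50_from_micromolar` and `random_split`, the seed, and the rounding of the set sizes are hypothetical choices and are not taken from the CORAL software, whose actual splits (Table 2) have slightly different sizes.

```python
import math
import random

def pic50_from_micromolar(ic50_um: float) -> float:
    """pIC50 = -log10(IC50 in mol/L), with IC50 supplied in micromolar."""
    return -math.log10(ic50_um * 1e-6)

def random_split(n_items, fractions=(0.75, 0.125, 0.125), seed=0):
    """Randomly assign item indices to training, calibration and validation sets.

    The set sizes are obtained by rounding (an assumption of this sketch);
    the validation set receives whatever remains.
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_train = round(fractions[0] * n_items)
    n_calib = round(fractions[1] * n_items)
    train = indices[:n_train]
    calib = indices[n_train:n_train + n_calib]
    valid = indices[n_train + n_calib:]
    return train, calib, valid

# End points of the reported IC50 range, converted to the modelled endpoint.
print(round(pic50_from_micromolar(100.0), 3))
print(round(pic50_from_micromolar(0.310), 3))

# One random split of 181 compounds.
train, calib, valid = random_split(181, seed=1)
print(len(train), len(calib), len(valid))
```

The two converted values, 4.0 and 6.509, coincide with the lowest and highest experimental pIC50 listed in Table 6, which is consistent with the IC50 range being expressed in micromolar units.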


2.2. Optimal descriptors

The optimal descriptors were used in this work to establish one-variable correlations with pIC50. These descriptors are calculated as follows [8]:

$$DCW(T, N) = \sum_{k=1}^{N_s} w(S_k) + \sum_{k=1}^{N_s-1} w(SS_k) + \sum_{k=1}^{N_s-2} w(SSS_k) + w(NOSP) + w(HALO) + w(BOND) + w(PAIR) \qquad (1)$$

where $S_k$ is an indivisible component of the simplified molecular input-line entry system (SMILES) [9-11], e.g. "C", "N", "Cl", "Br", etc.; $SS_k$ and $SSS_k$ are combinations of two and three indivisible components of SMILES (Table 1). $S_k$, $SS_k$, and $SSS_k$ are local SMILES attributes, whereas NOSP, HALO, BOND, and PAIR are global SMILES attributes [2]. NOSP is a descriptor that reflects the presence (absence) of nitrogen, oxygen, sulphur, and phosphorus; HALO is a descriptor that reflects the presence (absence) of fluorine, chlorine, bromine, and iodine; BOND is a descriptor that reflects the presence (absence) of double, triple, and stereochemical bonds; and PAIR is a descriptor that reflects the presence (absence) of all binary combinations of the above listed molecular features [8]. Table 1 contains an example of the translation of SMILES into the above attributes.

T is the threshold, i.e. the coefficient used to classify SMILES attributes into two categories: (i) rare (the frequency of the attribute in the training set is less than T) and (ii) active (the frequency of the attribute in the training set is larger than T). The rare attributes are blocked: their correlation weights are fixed at zero. Therefore, blocked attributes are not involved in building the model. N is the number of epochs of the Monte Carlo optimization procedure. The Monte Carlo method is used to calculate the optimal correlation weights of the local and global attributes, i.e. those that give the maximal correlation coefficient between DCW(T, N) and the endpoint for the training set [8,12,13]. However, an unlimited number of epochs of the optimization can lead to overtraining (a perfect correlation for the training set and a poor correlation for an external set). To avoid overtraining, one should determine, after several runs of the optimization, the values T* and N* that give the best correlation coefficient for the calibration set [12,13]. It should be noted that T* and N* often give reasonably good statistical quality for the calibration set together with poorer statistical quality for the training set. Further improvement of the statistical quality for the training set, however, leads to a decrease of the statistical quality for the calibration set, and hence the predictive potential for external compounds will most probably be reduced. Having the correlation weights obtained with T = T* and N = N*, one can calculate the following model with the compounds of the training set:

$$pIC50 = C_0 + C_1 \times DCW(T^*, N^*) \qquad (2)$$

The model calculated with Eq. (2) should be checked with the external validation set.

3. Results and discussion

According to the principle "a QSAR is a random event" [13], three different splits into the training, calibration, and validation sets were examined. The models for the three random splits are the following:

$$pIC50 = 4.1264(\pm 0.0077) + 0.06586(\pm 0.0003) \times DCW(3, 10) \qquad (3)$$

$$pIC50 = 3.9131(\pm 0.0158) + 0.01664(\pm 0.0001) \times DCW(1, 3) \qquad (4)$$

$$pIC50 = 4.1909(\pm 0.0112) + 0.01852(\pm 0.0001) \times DCW(3, 4) \qquad (5)$$

The statistical quality of these models is presented in Table 2.

Fig. 1. Graphical representation of the best model (Eq. (3), Split 1).

Fig. 2. Test of the applicability domain for the best model (Split 1, Eq. (3)). The accuracy of prediction for the calibration and validation sets is ±0.5, whereas the accuracy for the training set is ±1.0.
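The roles of the threshold T, the number of epochs N, and the calibration set in the optimization of correlation weights can be sketched in Python as follows. This is not the CORAL implementation: the perturbation scheme (a uniform random step that is kept only if the training correlation improves), the tie rule for the threshold (an attribute is treated as active when it occurs in at least T training compounds), the function names, and the toy attribute lists with their endpoint values are all assumptions made for illustration.

```python
import random
from collections import Counter

def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either series is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def dcw(attributes, weights):
    """Sum of correlation weights (cf. Eq. (1)); blocked attributes contribute zero."""
    return sum(weights.get(a, 0.0) for a in attributes)

def active_attributes(train, threshold):
    """Attributes present in at least `threshold` training compounds (assumed tie rule)."""
    counts = Counter(a for attrs, _ in train for a in set(attrs))
    return {a for a, c in counts.items() if c >= threshold}

def correlation(data, weights):
    return pearson_r([dcw(attrs, weights) for attrs, _ in data],
                     [y for _, y in data])

def optimize_weights(train, calib, threshold, n_epochs, seed=0, step=0.5):
    """Hypothetical Monte Carlo search: perturb one weight at a time and keep
    the change only if the training-set correlation increases."""
    rng = random.Random(seed)
    weights = {a: 1.0 for a in sorted(active_attributes(train, threshold))}
    best_r = correlation(train, weights)
    history = []  # (epoch, r on training set, r on calibration set)
    for epoch in range(1, n_epochs + 1):
        for a in weights:
            old = weights[a]
            weights[a] = old + rng.uniform(-step, step)
            new_r = correlation(train, weights)
            if new_r > best_r:
                best_r = new_r
            else:
                weights[a] = old  # reject the move
        history.append((epoch, best_r, correlation(calib, weights)))
    return weights, history

# Toy data (invented for illustration): attribute lists with endpoint values.
train = [
    (["C", "C", "O", "BOND"], 4.1),
    (["C", "N", "O", "BOND"], 4.6),
    (["C", "Cl", "HALO"], 5.2),
    (["C", "C", "Cl", "HALO", "N"], 5.6),
    (["C", "S", "O"], 4.9),
    (["C", "N", "Cl", "HALO", "Br"], 5.9),
]
calib = [
    (["C", "O", "BOND"], 4.3),
    (["C", "Cl", "HALO", "N"], 5.5),
    (["C", "S", "N"], 5.0),
]

weights, history = optimize_weights(train, calib, threshold=2, n_epochs=20, seed=1)
# N* is taken as the epoch with the best calibration-set correlation.
n_star = max(history, key=lambda h: h[2])[0]
print(sorted(weights), n_star)
```

In this sketch the attributes "S" and "Br" occur in only one training compound each, so with a threshold of 2 they are blocked and never receive a weight. The training correlation can only grow from epoch to epoch, while the calibration correlation is free to fall, which is the behaviour that motivates choosing N* on the calibration set.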


Table 1 An example of translation of SMILES into SMILES attributes: Sk, SSk, and SSSk; NOSP, HALO, BOND, and PAIR.
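The translation of a SMILES string into local and global attributes can be sketched in Python as follows. This is a simplified illustration, not the CORAL tokenizer: the regular expression, the treatment of bracket atoms, the encoding of NOSP, HALO, and BOND as bit strings, and the use of "=", "#", and "@" as the bond symbols are assumptions, and the PAIR attribute is omitted. The example SMILES are two entries of Table 6 written with "=" for the double bond.

```python
import re

# Assumed tokenisation: bracket atoms, two-letter halogens, %nn ring closures,
# and otherwise single characters.
TOKEN_RE = re.compile(r"\[[^\]]+\]|Cl|Br|%\d{2}|.")

def local_attributes(smiles):
    """Return the lists of S_k, SS_k and SSS_k attributes of a SMILES string."""
    s = TOKEN_RE.findall(smiles)
    ss = [a + b for a, b in zip(s, s[1:])]
    sss = [a + b + c for a, b, c in zip(s, s[1:], s[2:])]
    return s, ss, sss

def element(token):
    """Element symbol of an atom token, or None for bonds, branches and digits."""
    if token.startswith("["):
        m = re.match(r"\[\d*([A-Z][a-z]?|[a-z])", token)
        return m.group(1).capitalize() if m else None
    return token.capitalize() if token[0].isalpha() else None

def global_attributes(smiles):
    """Presence/absence flags for NOSP, HALO and BOND, encoded as bit strings."""
    tokens = TOKEN_RE.findall(smiles)
    elements = {element(t) for t in tokens}

    def flags(symbols):
        return "".join("1" if sym in elements else "0" for sym in symbols)

    bond = "".join("1" if any(ch in t for t in tokens) else "0" for ch in "=#@")
    return {
        "NOSP": flags(("N", "O", "S", "P")),
        "HALO": flags(("F", "Cl", "Br", "I")),
        "BOND": bond,
    }

smiles = "CCCCCCc1ccc(cc1)C(=O)CCl"
s, ss, sss = local_attributes(smiles)
print(len(s), len(ss), len(sss))
print(global_attributes(smiles))
print(global_attributes("CCCCCCOc1ccc(cc1)C(=O)CC[N@]1C[C@@H]1C"))
```

For a SMILES with N_s indivisible components the sketch yields N_s - 1 two-component and N_s - 2 three-component attributes, which matches the upper limits of the three sums in Eq. (1).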


Table 2 The statistical characteristics of QSAR models of pIC50 for thyroid hormone receptor ligands.

Set                  n*    r2      q2      MAE    s      F    T*  N*
Split 1
  Training           138   0.6679  0.6583  0.249  0.325  273  3   10
  Calibration        22    0.8557  0.8235  0.112  0.127
  Validation         21    0.8721          0.124  0.157
Split 2
  Training           132   0.5187  0.5003  0.288  0.384  140  1   3
  Calibration        25    0.8003  0.7737  0.187  0.231
  Validation         24    0.7642          0.185  0.225
Split 3
  Training           139   0.5374  0.5227  0.296  0.380  159  3   4
  Calibration        20    0.8852  0.8600  0.128  0.157
  Validation         22    0.7700          0.190  0.222
Model from Ref. [1]
  Training           181   0.70    -       0.24   -
  Validation         r2 = 0.70 for 5-fold external cross-validation [1]
Model from Ref. [6]
  Training (TRα)     55    0.90    0.77    -      0.44
  Validation (TRα)   13    0.88    -       -      1.10
  Training (TRβ)     55    0.87    0.69    -      0.50
  Validation (TRβ)   13    0.92    -       -      1.13

* n is the number of compounds in the set; r2 is the squared correlation coefficient; q2 is the cross-validated r2; MAE is the mean absolute error; s is the root-mean-square error; F is the Fisher F-ratio; T* and N* are the preferable values of the threshold and the number of epochs, respectively.
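The statistical characteristics defined in the footnote of Table 2 can be computed as in the following Python sketch. It is an illustration only: the function names are hypothetical, q2 is not computed because it requires refitting the model for each left-out compound, and the F-ratio uses the standard expression for a one-variable regression, F = r2(n - 2)/(1 - r2), which is assumed here to be the definition used for Table 2.

```python
import math

def fisher_f(r2, n):
    """Fisher F-ratio of a one-variable linear regression with n points."""
    return r2 * (n - 2) / (1.0 - r2)

def regression_statistics(observed, predicted):
    """Return n, r2, MAE, s (root-mean-square error) and F for paired values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sxy = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    syy = sum((p - mean_p) ** 2 for p in predicted)
    r2 = sxy ** 2 / (sxx * syy)
    residuals = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in residuals) / n
    rmse = math.sqrt(sum(e * e for e in residuals) / n)
    return {"n": n, "r2": r2, "MAE": mae, "s": rmse, "F": fisher_f(r2, n)}

# Toy values (invented) to exercise the function.
stats = regression_statistics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
print(stats)

# Consistency check against the training row of Split 1 in Table 2.
print(round(fisher_f(0.6679, 138), 1))
```

With the rounded training values of Split 1 (n = 138, r2 = 0.6679) the expression gives F ≈ 273.5, in line with the tabulated value of 273; the training rows of Splits 2 and 3 agree in the same way.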

One can see (Table 2) that these models have different statistical characteristics and different T* and N*, but all of them have satisfactory predictive potential. The statistical quality of the model for the same data described in the literature [1] is the following: r2 = 0.70; MAE = 0.24 (Table 2). Unfortunately, the statistical characteristics of that model for an external set (as well as the list of compounds used for external validation) are not reported in work [1]. Thus, the statistical quality for the visible training and calibration sets of the models described in this work is poorer, but the statistical quality for the invisible validation set is comparable with that of the model from work [1]: the MAE values for Eqs. (3)-(5) are 0.124, 0.185, and 0.190, respectively, whereas the MAE in work [1] is 0.24. Of course, a comparison of statistical quality on the same validation set would be the best way, but such a comparison is unavailable, because the list of compounds used as the external validation set is not indicated in work [1].

In Ref. [6], the α and β thyroid hormone receptors (TRα and TRβ) are examined. The statistical quality of the best models (Table 2) is better, but it should be taken into account that (i) the number of compounds analysed is considerably smaller (68 vs. 181); and (ii) the accuracy of prediction for the validation sets is considerably poorer than the accuracy of prediction for the training sets (0.44 for training vs. 1.10 for validation, and 0.50 vs. 1.13) [6]. Thus the approach suggested in the present work gives models whose statistical quality is comparable with that of the models described in work [6].

The majority of SMILES attributes ($S_k$, $SS_k$, $SSS_k$, NOSP, HALO, BOND, and PAIR) are rare; consequently, the attributes that are active (not rare) become more informative. Having data on several runs of the optimization, one can extract statistically reliable attributes (i.e. attributes that have a large frequency in the training set) with stable positive correlation weights and, vice versa, attributes with stable negative correlation weights. The statistically reliable attributes with stable positive and negative correlation weights are listed in Table 3. This information makes it possible to formulate hypotheses on how one should modify the molecular structure of compounds (which are examined in experiment) in order to increase (or decrease) the pIC50 value. Table 4 contains examples of such modifications, made according to the data in Table 3. However, these modifications (Table 4) can be useful if the described calculations are checked and directed by experimentalists using a "feedback" mechanism.

Table 5 contains the list of outliers for the validation sets, identified with the statistical defects of SMILES described in the literature [14]. According to the distribution of substances into the training, calibration, and validation sets, each SMILES is characterized by a defect D calculated as follows:

$$D = \sum_{k=1}^{N_{SA}} Defect_{SA_k} \qquad (6)$$

where

$$Defect_{SA_k} = \frac{|P_{TRN}(SA_k) - P_{CLB}(SA_k)|}{N_{TRN}(SA_k) + N_{CLB}(SA_k)} \qquad (7)$$

Here $N_{SA}$ is the number of active (not blocked) SMILES attributes in the given SMILES; $P_{TRN}(SA_k)$ is the probability of the presence of $SA_k$ in compounds of the training set, i.e.

$$P_{TRN}(SA_k) = N_{TRN}(SA_k) / N_{TRN}$$

and $P_{CLB}(SA_k)$ is the probability of the presence of $SA_k$ in compounds of the calibration set, i.e.

$$P_{CLB}(SA_k) = N_{CLB}(SA_k) / N_{CLB}$$

$N_{TRN}(SA_k)$ is the number (frequency) of compounds that contain $SA_k$ in the training set; $N_{TRN}$ is the total number of compounds in the training set; $N_{CLB}(SA_k)$ is the number (frequency) of compounds that contain $SA_k$ in the calibration set; $N_{CLB}$ is the total number of compounds in the calibration set.

A substance characterized by defect D is an outlier if $D > 2 \times \bar{D}$, where $\bar{D}$ is the average defect for the SMILES distributed into the training set. Unfortunately, the above criteria (Eqs. (6) and (7)) are not a "mathematical guarantee" of true classification, but they are a probabilistic measure of the quality of the distribution into the "visible" training and calibration sets and the "invisible" validation set. SMILES with a large defect D should be regarded as "suspect"; however, their categorization as outliers should be based on additional examination.

Table 6 contains the splits into the training, calibration, and validation sets, together with the experimental pIC50 values taken from the literature [1] and the calculated values. The Supplementary materials section contains technical details of the models for the splits examined in this work.
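The defect of Eqs. (6) and (7) and the outlier criterion can be sketched in Python as follows. The sketch makes several assumptions that the text does not settle: each compound is represented by a plain list of attribute names, the sum in Eq. (6) runs over the distinct active attributes of a SMILES, and the toy training and calibration sets are invented. Because of these assumptions its numerical scale need not match the D values reported in Table 5.

```python
from collections import Counter

def attribute_counts(compounds):
    """Number of compounds in which each attribute occurs at least once."""
    return Counter(a for attrs in compounds for a in set(attrs))

def defect(attrs, train, calib, active):
    """Defect D of one SMILES: sum of per-attribute defects, Eqs. (6) and (7)."""
    n_trn, n_clb = attribute_counts(train), attribute_counts(calib)
    total = 0.0
    for a in set(attrs) & active:
        denominator = n_trn[a] + n_clb[a]
        if denominator == 0:
            continue  # attribute unseen in both sets; skipped in this sketch
        p_trn = n_trn[a] / len(train)
        p_clb = n_clb[a] / len(calib)
        total += abs(p_trn - p_clb) / denominator
    return total

def is_outlier(attrs, train, calib, active):
    """Criterion D > 2 * mean(D over the training set)."""
    mean_d = sum(defect(t, train, calib, active) for t in train) / len(train)
    return defect(attrs, train, calib, active) > 2.0 * mean_d

# Toy sets (invented): every attribute is treated as active.
train = [["C", "O"], ["C", "N"], ["C", "O"], ["C", "Cl"]]
calib = [["C", "O"], ["C", "N"]]
active = {"C", "O", "N", "Cl"}

print(defect(["C", "Cl"], train, calib, active))
print(is_outlier(["C", "Cl"], train, calib, active))
print(is_outlier(["C", "N"], train, calib, active))
```

In the toy example "Cl" occurs in the training set but never in the calibration set, so the SMILES containing it has the largest defect and is the only one flagged; an attribute distributed evenly over both sets contributes nothing.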

4. Conclusions

The Monte Carlo method can be used to build a model for the half maximal inhibitory concentration (pIC50). The model has a mechanistic interpretation in terms of promoters of increase or decrease of the half maximal inhibitory concentration (logarithmic scale). The domain of applicability of the models calculated with Eqs. (3)-(5) is defined according to the prevalence of local and global SMILES attributes in the training and validation sets [14]. The distribution of the available data into the visible training and calibration sets and the invisible validation set influences the statistical quality of the model. The presented data indicate that the suggested approach can be used for regulatory purposes [15], in accordance with the OECD principles [16].


Table 3 Molecular features with statistically reliable impact upon pIC50.

Table 4 Two hypotheses defined according to the data in Table 3.

Compound: Comments
Starting compound: Compound described in the literature [1]; pIC50(expr) = 4.896.
Modification 1: The presence of oxygen together with a double bond is a promoter of pIC50 increase. The oxygen is deleted; hence the pIC50 should decrease (see ID 1 in Table 3).
Modification 2: The presence of a ring that contains oxygen is a promoter of pIC50 increase. Oxygen is added into the ring; hence the pIC50 should increase (see ID 5 in Table 3).


Table 5 The list of outliers according to the inequality $D > 2 \times \bar{D}$. The defect D is calculated with Eq. (6).

Split  SMILES                                                                      D       D (average)
1      CCCCCCCc1ccc(cc1)C(=O)C(\C)=C\C                                             8.052   3.375
2      C[C@H]1Cc2ccccc2N1C(=O)c1sc(nc1C(F)(F)F)-c1ccc(c(c1)N(=O)=O)S(C)(=O)=O      56.137  6.166
3      CCCCCCOc1ccc2C(=O)[C@H](CN(C)C)CCc2c1                                       15.050  5.959
3      Cc1nc(sc1C(=O)N1CCCCC1)-c1ccc(c(c1)N(=O)=O)S(C)(=O)=O                       30.070  5.959

Table 6 The splits into the training (T), calibration (C), and validation (V) sets, and the experimental values of pIC50 together with the values calculated with Eqs. (3)-(5). The experimental data were taken from the literature [1].

Split 1   Split 2   Split 3   SMILES   pIC50 (expr)   Eq. (3)   Eq. (4)   Eq. (5)

T T T T T T T T T T T T T T T T T T T T T T T V T C V T V T T T T T

T T T T T T T T T T T T T V V T T T C C T T C T T V T T T T T V T T

T T T T T T T T V T T T T T C T T T T V T T V T T C C T V T T T T T

CCCCCCOc1ccc(cc1C(F)(F)F)C(]O)CCN(C)C CCCCCCc1ccc(cc1)C(]O)[C@@H]1CO1 CCCCCCc1ccc(NC(]O)CCC(O)]O)cc1 CCCCCCc1ccc(cc1)C(]O)CCN1CCC(CC1)C(]O)OC CCCCCCCc1ccc(cc1)C(]O)C(C)]C CCCCCCc1ccccc1OC(]O)C]C CCCCCCc1ccc(NC(]O)\C]C\c2ccccc2)cc1 CCCCCCc1ccc(cc1)C(]O)CCl CCCCCCc1ccc(cc1)C(]O)CCN(C)CCc1ccccc1 CCCCCCCCc1ccc(cc1)C(]O)C]C CCCCCCOc1ccc(C(]O)CCN(C)C)c(OC)c1 CCCCCCc1ccc(cc1)C(]O)CCN(C)CCO CCCCCCc1ccc(NC(]O)C]C)cc1 CCCCCCOc1ccc(cc1)C(]O)CC[N@]1C[C@@H]1C CCCCCCc1ccc(cc1)C(]O)CCN(C)C CC(C)(C)Cc1ccc(cc1)C(]O)C]C CCN1CCC(Cc2ccc(cc2)C(]O)CCN(C)C)CC1 CCCCCCOc1ccc(cc1)C(]O)CCN1CCCC1 CCCC(]O)Nc1cccc2CCCCc12 CCCCc1ccc(cc1)C(]O)C]C CCCCCCOc1ccc(cc1)C(]O)CCN(C)C CCCCCCc1ccc(NC(]O)\C]C\C)cc1 CCCCCCOc1ccc(cc1-c1ccccc1)C(]O)CCN(C)C CCCCCCc1ccc(cc1)C(]O)CCNCCC CCCCCC(]O)Nc1ccc(cc1)C(]O)CCN(C)C CCCCCc1ccc(cc1)C(]O)C]C CCCCCCc1ccc(cc1)C(]O)CCN1CCOCC1 CCCCCCS(]O)(]O)c1cc(Cl)c(C(]O)CC[N@]2C[C@@H]2C)c(Cl)c1 CCCCCCc1ccc(cc1)C(]O)CF CCCCCCOc1ccc(C(]O)CCN(C)C)c(SC)c1 CCCCCCOc1ccc(cc1C)C(]O)CCN(C)C CCCCCCCc1ccc(cc1)C(]O)CCN(C)C CCCCCCOc1cc(C)c(C(]O)CCN2CCNC(]O)C2)c(C)c1 CN(Cc1cccc(F)c1)C(]O)c1cnc(s1)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O

4.0000 4.0660 4.1560 4.1610 4.1800 4.2150 4.2530 4.3620 4.4550 4.4700 4.5970 4.6350 4.6360 4.6580 4.6930 4.7190 4.7240 4.7240 4.7470 4.7520 4.7830 4.8180 4.8510 4.8960 4.9070 4.9170 4.9430 4.9510 4.9790 4.9830 4.9960 5.0180 5.0410 5.0710

4.5380 4.0982 4.6263 4.1752 4.8731 4.1773 4.1772 4.7159 4.5983 5.1207 5.0921 4.9004 4.8358 4.8150 4.8142 5.2022 4.7099 5.0124 5.0026 5.1360 5.2702 4.7569 4.7905 5.0131 5.1926 5.1322 4.9055 5.7901 5.0983 5.5078 5.4724 4.8103 5.5965 5.1149

5.1745 4.7240 4.6951 5.0191 4.9069 4.5967 4.6854 4.6718 4.7861 4.9316 5.4046 5.0069 4.8133 4.7842 4.8428 4.8992 4.9116 5.1167 4.6260 4.7321 5.1672 4.9546 5.1875 4.7965 5.1425 4.7820 5.0562 5.7201 4.6875 5.5317 5.3173 4.8926 5.6355 5.2879

5.2153 4.7595 4.6730 4.7837 4.9924 4.7625 4.7158 4.8866 4.7166 5.1348 5.3120 4.9553 5.0494 4.6502 4.8628 5.1123 4.7515 5.0756 4.7197 5.0057 5.1962 5.0818 5.1752 4.7840 5.1315 5.0380 4.9971 5.7856 4.7058 5.5748 5.3837 4.8951 5.5404 5.2185

Table 6 (continued)

T T T T V T T e T T T V T T T V C T C T T T C T T T T C V T T C T T T T T T T T T T C T T T T T T T V C V T T T T T T T T T T T V T T C V T T C T T

V T T C T T C T T T T C T T V T T V T V T T T T T T T C C V T T T e T C T C T T T C T T T T T C T T V T T T T T T T T T T T T T T V T C T T T V T T

T T T C T T T T V T T T T T T V C T V T V T T V T T T V T T T C T T T T V T T T T T T T T T T C T T C C T T T T T T V V C T T T T T T T C T C V V T

CCCCCCOc1ccc(cc1S(C)(]O)]O)C(]O)CCN(C)C CCCCCCCc1ccc(cc1)C(]O)\C]C\c1ccccc1 CCCN(C(]O)c1csc(n1)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O)c1ccccc1 CCCCCCNC(]O)c1ccc(cc1)C(]O)CCN(C)C CCCCCCCc1ccc(cc1)C(]O)C(\C)]C\C CC1CCN(CC1)C(]O)c1sc(nc1C)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O CCCCCCOc1cccc(c1)C(]O)CCN(C)C Cc1nc(sc1C(]O)N1C[C@]2(C)CC1CC(C)(C)C2)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O CCCCCCOc1ccc(cc1OC)C(]O)CCN(C)C CCCCCCOc1cc(C)c(C(]O)CC[N@]2C[C@@H]2C)c(C)c1 CCCCCCc1ccc(cc1)C(]O)CCN(C1CCCCC1)C1CCCCC1 CCCCCCSc1ccc(cc1)C(]O)CCN(C)C CCCCCCOc1ccc(cc1)C(]O)CCN1CCN(C)CC1 CCCCCCSc1cc(Cl)c(C(]O)CCN(C)C)c(Cl)c1 CCCCCCS(]O)(]O)c1cc(C)c(C(]O)CCN2CCNC(]O)C2)c(C)c1 CCCCCCOc1ccc(C(]O)CCN(C)C)c(Br)c1 CCCCCCSc1cc(C)c(C(]O)CC[N@]2C[C@@H]2C)c(C)c1 C[C@H]1Cc2ccccc2N1C(]O)c1sc(nc1C(F)(F)F)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O CCCCCCSc1ccc(C(]O)CC[N@]2C[C@@H]2C)c(Cl)c1 CCCCCCSc1cc(C)c(C(]O)CCN(C)C)c(C)c1 CCCCCCOc1ccc(cc1)C(]O)CCN1CCCCC1 CCCCCCS(]O)(]O)c1cc(C)c(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(C)c1 CCCCCCOc1cc(Cl)c(cc1Cl)C(]O)CC[N@]1C[C@@H]1C CCCCCCS(]O)(]O)c1ccc(cc1)C(]O)CCN(C)C CCCCCCc1ccc(cc1)C(]O)CCN(C)Cc1ccco1 CCCCCCOc1ccc(C(]O)CCN2CCN(CC2)C(C)]O)c(Cl)c1Cl CCCCCCOc1ccc(cc1F)C(]O)CCN(C)C CCCCCCOc1ccc(C(]O)CC[N@]2C[C@@H]2C)c(Cl)c1 CCCCCCc1ccc(cc1)C(]O)CCCBr CCCCCCOc1ccc(cc1Cl)C(]O)CCN(C)C CCCCCCc1ccc(NC(]O)CC(]C)C(O)]O)cc1 CCCCCCSc1cc(Cl)c(cc1Cl)C(]O)CC[N@]1C[C@@H]1C CS(]O)(]O)c1ccc(cc1N(]O)]O)-c1nc(c(s1)C(]O)N1CCc2ccccc2C1)C(F)(F)F CCCCCCSc1ccc(C(]O)CCN2CCOCC2)c(Cl)c1 CS(]O)(]O)c1ccc(cc1N(]O)]O)C(]O)OCC(]O)NC12CC3CC(CC(C3)C1)C2 CCCCCCOc1ccc(C(]O)CCN2CCN(CC2)C(C)]O)c(Cl)c1 CCCCCCSc1cc(C)c(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(C)c1 CCCCCCOc1ccc(C(]O)CCN(C)C)c2ccccc12 CCCCCCc1ccc(cc1)C(]O)CCN(C(C)C)C(C)C CCCCCCS(]O)(]O)c1ccc(C(]O)CCN2CCN(CC2)C(C)]O)c(Cl)c1 CCCCCCSc1ccc(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(Cl)c1Cl CCCCCCS(]O)(]O)c1ccc(C(]O)CCN(C)C)c(Cl)c1 CCCCCCS(]O)(]O)c1cc(C)c(C(]O)CC[N@]2C[C@@H]2C)c(C)c1 CN(Cc1ccccc1F)C(]O)c1sc(nc1C(F)(F)F)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O 
CN(Cc1ccccc1)C(]O)c1sc(nc1C(F)(F)F)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O CCCc1ccc(OC(]O)C]C)cc1 CCCCCCOc1ccc(cc1N(]O)]O)C(]O)CCN(C)C CCCCCCS(]O)(]O)c1ccc(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(Cl)c1 CN(Cc1cccc(F)c1)C(]O)c1sc(nc1C(F)(F)F)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O CCCCCCc1ccc(NC(]O)\C]C/C(O)]O)cc1 CCCCCCOc1ccc(cc1)C(]O)CCN1CCOCC1 CCCCCCOc1cc(C)c(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(C)c1 CCCCCCOc1ccc(cc1I)C(]O)CCN(C)C CCCCCCOc1ccc(cc1Br)C(]O)CCN(C)C CCCCCCSc1cc(Cl)c(C(]O)CC[N@]2C[C@@H]2C)c(Cl)c1 CCCCCCOc1ccc(C(]O)CCN(C)C)c(c1)S(C)(]O)]O CCc1nc(sc1C(]O)N1C[C@]2(C)CC1CC(C)(C)C2)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O CCCCCCOc1ccc(cc1)C(]O)[C@@H](CN(C)C)C(C)C CCCCCCOc1ccc2C(]O)[C@H](CN(C)C)CCc2c1 CCOCCOc1ccc(cc1)C(]O)CCN(C)C CCCCCCSc1ccc(C(]O)CCN2CCNC(]O)C2)c(Cl)c1Cl CCCCCCOc1ccc(C(]O)CCN(C)C)c(I)c1 C[C@@]12CC(CC(C)(C)C1)N(C2)C(]O)c1sc(nc1C(F)(F)F)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O CCCCCCCc1ccc(cc1)C(]O)C]C CCCCCCSc1ccc(C(]O)CCN2CCNC(]O)C2)c(Cl)c1 CCCCCCOc1ccc(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(Cl)c1 CCCCCCOc1cc(Cl)c(cc1Cl)C(]O)CCN1CCNC(]O)C1 CCCCCCOc1cc(C)c(C(]O)CCN2CCOCC2)c(C)c1 CCCCCCOc1cc(C)c(C(]O)CCN2CCN(CC2)C(C)]O)c(C)c1 CCCCCCOc1ccc(C(]O)CCN(C)C)c(c1)C(F)(F)F CCCCCCOc1ccc(C(]O)CCN2CCOCC2)c(Cl)c1 CCCCCCOc1ccc(C(]O)CCN(C)C)c(Cl)c1 Cc1nc(sc1C(]O)N1CCCCC1)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O CCCCCCOc1cc(C(C)C)c(cc1C)C(]O)CCN(C)C

5.0760 5.0860 5.0860 5.0920 5.1190 5.1310 5.1490 5.1490 5.1550 5.1870 5.2080 5.2290 5.2520 5.2680 5.2680 5.2760 5.2760 5.2840 5.2920 5.3010 5.3010 5.3010 5.3010 5.3100 5.3190 5.3280 5.3280 5.3370 5.3670 5.3670 5.3770 5.3770 5.3770 5.3870 5.3870 5.3980 5.3980 5.3980 5.4090 5.4200 5.4440 5.4440 5.4560 5.4690 5.4810 5.5230 5.5230 5.5380 5.5530 5.5690 5.5690 5.5690 5.5690 5.5850 5.6200 5.6200 5.6200 5.6380 5.6380 5.6780 5.6780 5.6780 5.6780 5.6990 5.6990 5.6990 5.6990 5.6990 5.7210 5.7210 5.7450 5.7450 5.7450 5.7470

5.1364 5.1338 5.0577 4.9584 4.9590 5.4353 5.1130 5.3026 5.1344 5.3897 5.2667 5.2461 5.2835 5.8858 5.6313 5.5039 5.3656 5.1798 5.3677 5.6274 5.0086 5.6161 5.5035 5.3051 4.8688 5.9397 5.3405 5.3919 5.0001 5.7024 4.9566 5.4794 5.5076 5.7748 5.2331 5.7218 5.4825 5.2988 5.4900 5.8637 5.8095 5.7955 5.4246 5.3573 5.5959 5.1811 5.6868 5.7253 5.7434 5.4753 5.3616 5.5066 5.5418 5.5418 5.6241 5.3736 5.3442 5.8637 5.8341 5.6270 5.7924 5.5039 5.7070 5.1245 5.5745 5.6158 5.9652 5.7968 5.7197 5.0881 5.7989 5.6537 5.7913 5.9542

5.4528 4.9498 5.2439 5.0550 5.0357 5.6187 5.0923 5.6149 5.2900 5.3919 5.1585 5.1006 5.2878 5.8476 5.6376 5.5874 5.3252 5.3338 5.2686 5.5798 5.1666 5.7003 5.4979 5.1693 5.1169 5.9243 5.2535 5.3352 5.0461 5.5565 4.8969 5.4313 5.7015 5.7217 5.3338 5.7995 5.6567 5.0965 5.3759 5.8600 5.7832 5.6502 5.3940 5.5360 5.4360 4.7322 5.3630 5.7020 5.4983 5.3750 5.3807 5.7233 5.6459 5.6998 5.5931 5.5192 5.6397 5.8500 5.6250 5.3314 5.6370 5.7627 5.7718 4.8817 5.5122 5.7250 5.9224 5.8450 5.8562 5.3242 5.7883 5.5898 5.4684 5.8296

5.3047 5.0909 5.2037 4.9436 5.0480 5.6800 5.1175 5.4347 5.2865 5.3536 4.9869 5.1500 5.2190 5.9668 5.5881 5.6894 5.3073 5.0999 5.3494 5.6011 5.1079 5.6484 5.4697 5.2439 4.7467 5.9320 5.1454 5.3956 5.1455 5.6920 4.9299 5.4234 5.6040 5.8167 5.3711 5.7653 5.6065 4.9759 5.3255 5.8316 5.8338 5.7557 5.4013 5.5236 5.5373 4.9922 5.4370 5.7090 5.6578 5.1536 5.3305 5.6527 5.2287 5.5991 5.6731 5.4819 5.4670 5.8982 5.4092 5.3864 5.7029 5.3189 5.8033 5.1025 5.5362 5.7133 5.9412 5.8209 5.7232 5.3263 5.8630 5.6894 5.4325 5.8394

Table 6 (continued)

T T V T T V T T T T C T T T T C T V C T T V C T T C T T T T T T V C V T C T V T T T T T T T T T T C T C T C T T T T V C V T T T T T T V C T T T T

T T T T T T T T T T T T T T T V V T V T T T C T T C V C T V V T V V V T T T C T V C T T T T T T V T T T T C T C C T C C V T T T T C T T C T T T T

T T T T T C T T T T T T T T T T T T C T C V V T T T T T T C V T T V V T T T C T T T V T T T T T C T T V T T T T T T T T T T T T C C T T T T T T T

CCCCCCOc1cc(C)ccc1C(]O)CCN(C)C CCCCCCOc1ccc(cc1)C(]O)[C@@H](C)CN(C)C CCCCCCSc1cc(Cl)c(C(]O)CCN2CCNC(]O)C2)c(Cl)c1 CCCCCCS(]O)(]O)c1cc(Cl)c(cc1Cl)C(]O)CC[N@]1C[C@@H]1C CCCCCCOc1ccc2C(]O)[C@@H](CN(C)C)Cc2c1 CCCCCCSc1ccc(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(Cl)c1 CS(]O)(]O)c1ccc(cc1N(]O)]O)-c1ncc(s1)C(]O)NC12CC3CC(CC(C3)C1)C2 CCCCCCc1ccc(cc1)C(]O)CCBr CCCCCCSc1cc(C)c(C(]O)CCN2CCOCC2)c(C)c1 CCCCCCOc1ccc2C(]O)[C@@H](CN(C)C)COc2c1 CCc1nc(sc1C(]O)N1CCCCC1)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O CCCCCCc1ccc(cc1)C(]O)\C]C/C(O)]O CCCC(]O)Nc1ccc(cc1)C(]O)C]C CCCCCCS(]O)(]O)c1ccc(C(]O)CC[N@]2C[C@@H]2C)c(Cl)c1 CCCCCCOc1cc(Cl)c(cc1Cl)C(]O)CCN1CCN(CC1)S(]O)(]O)CC CCCCCCSc1ccc(C(]O)CCN(C)C)c(Cl)c1 CCCCCCOc1cc(Cl)c(C(]O)CCN2CCNC(]O)C2)c(Cl)c1 CCCCCCS(]O)(]O)c1cc(Cl)c(cc1Cl)C(]O)CCN(C)C CCCCCCSc1cc(C)c(C(]O)CCN2CCNC(]O)C2)c(C)c1 CS(]O)(]O)c1ccc(cc1N(]O)]O)-c1nc(c(s1)C(]O)N1CCCCC1)-c1ccccc1 CCCCCCSc1ccc(C(]O)CCN2CCOCC2)c(Cl)c1Cl CCCCCCSc1cc(Cl)c(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(Cl)c1 CCCCCCSc1ccc(C(]O)CCN(C)C)c(Cl)c1Cl CS(]O)(]O)c1ccc(cc1N(]O)]O)-c1ncc(s1)C(]O)N1CCCCC1 CCCCCCc1ccc(cc1)C(]O)CC CCCCCCOc1cc(Cl)c(cc1Cl)C(]O)CCN1CCOCC1 CCCCCCOc1ccc(C(]O)CCN2CCNC(]O)C2)c(Cl)c1 CCCCCCS(]O)(]O)c1cc(C)c(C(]O)CCN2CCN(CC2)C(C)]O)c(C)c1 CCCCCCSc1cc(Cl)c(cc1Cl)C(]O)CCN(C)C CCCCCCOc1ccc(C(]O)CCN2CCNC(]O)C2)c(Cl)c1Cl CCCCCCS(]O)(]O)c1ccc(C(]O)CC[N@]2C[C@@H]2C)c(Cl)c1Cl CCCCCCS(]O)(]O)c1cc(C)c(C(]O)CCN(C)C)c(C)c1 CCCCCCS(]O)(]O)c1cc(Cl)c(C(]O)CCN(C)C)c(Cl)c1 CCCCCCOc1ccc(C(]O)CCN(C)C)c(Cl)c1Cl CCCCCCSc1cc(Cl)c(C(]O)CCN2CCOCC2)c(Cl)c1 CCCCCCSc1cc(Cl)c(cc1Cl)C(]O)CCN1CCOCC1 CCCCCCS(]O)(]O)c1ccc(C(]O)CCN2CCN(CC2)C(C)]O)c(Cl)c1Cl CCCCCCS(]O)(]O)c1cc(Cl)c(C(]O)CCN2CCNC(]O)C2)c(Cl)c1 CCCCCCOc1ccc(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(Cl)c1Cl CCCCCCS(]O)(]O)c1cc(C)c(C(]O)CCN2CCOCC2)c(C)c1 CCCCCCSc1ccc(C(]O)CCN2CCN(CC2)C(C)]O)c(Cl)c1Cl CCCCCCS(]O)(]O)c1ccc(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(Cl)c1Cl CCCCCCS(]O)(]O)c1cc(Cl)c(cc1Cl)C(]O)CCN1CCOCC1 CCCCCCOc1ccc(cc1C(C)(C)C)C(]O)CCN(C)C CCCCCCOc1c(C)cc(cc1C)C(]O)CCN(C)C 
CCCCCCOc1ccc(cc1SC)C(]O)CCN(C)C CCCCCCOc1ccc(C(]O)CCN(C)C)c(C)c1 CCCCCCSc1cc(C)c(C(]O)CCN2CCN(CC2)C(C)]O)c(C)c1 CCCCCCSc1cc(Cl)c(C(]O)CCN2CCN(CC2)C(C)]O)c(Cl)c1 CCCCCCS(]O)(]O)c1cc(Cl)c(cc1Cl)C(]O)CCN1CCN(CC1)S(]O)(]O)CC CCCCCCS(]O)(]O)c1cc(Cl)c(C(]O)CCN2CCOCC2)c(Cl)c1 CCCCCCOc1cc(Cl)c(C(]O)CCN2CCOCC2)c(Cl)c1 CCCCCCS(]O)(]O)c1cc(Cl)c(cc1Cl)C(]O)CCN1CCN(CC1)C(C)]O CCCCCCOc1cc(Cl)c(cc1Cl)C(]O)CCN1CCN(CC1)C(C)]O CCCCCCOc1ccc(C(]O)CC[N@]2C[C@@H]2C)c(Cl)c1Cl CCCCCCSc1cc(Cl)c(cc1Cl)C(]O)CCN1CCN(CC1)S(]O)(]O)CC CCCCCCS(]O)(]O)c1ccc(C(]O)CCN2CCNC(]O)C2)c(Cl)c1 CCCCCCOc1cc(C)c(C(]O)CCN(C)C)c(C)c1 CCCCCCOc1ccc(C(]O)CCN2CCOCC2)c(Cl)c1Cl CCCCCCS(]O)(]O)c1ccc(C(]O)CCN2CCOCC2)c(Cl)c1Cl CCCCCCS(]O)(]O)c1cc(Cl)c(C(]O)CCN2CCN(CC2)C(C)]O)c(Cl)c1 CCCCCCOc1cc(Cl)c(C(]O)CCN(C)C)c(Cl)c1 CS(]O)(]O)c1ccc(cc1N(]O)]O)-c1nc(c(s1)C(]O)N1CCCCC1)C(F)(F)F CCCCCCOc1cc(Cl)c(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(Cl)c1 CCCCCCSc1cc(Cl)c(cc1Cl)C(]O)CCN1CCN(CC1)C(C)]O CCCCCCOc1cc(Cl)c(cc1Cl)C(]O)CCN(C)C CCCCCCSc1ccc(C(]O)CCN2CCN(CC2)C(C)]O)c(Cl)c1 CCCCCCS(]O)(]O)c1cc(Cl)c(cc1Cl)C(]O)CCN1CCNC(]O)C1 CCCCCCOc1cc(Cl)c(C(]O)CCN2CCN(CC2)C(C)]O)c(Cl)c1 CCCCCCSc1cc(Cl)c(cc1Cl)C(]O)CCN1CCNC(]O)C1 CCCCCCS(]O)(]O)c1ccc(C(]O)CCN2CCOCC2)c(Cl)c1 CCCCCCS(]O)(]O)c1cc(Cl)c(C(]O)CCN2CCN(CC2)S(]O)(]O)CC)c(Cl)c1 CC1CCN(CC1)C(]O)c1sc(nc1C(F)(F)F)-c1ccc(c(c1)N(]O)]O)S(C)(]O)]O

5.7700 5.7700 5.7700 5.7700 5.7700 5.7700 5.7700 5.7960 5.7960 5.7960 5.7960 5.8240 5.8240 5.8240 5.8240 5.8240 5.8240 5.8540 5.8540 5.8540 5.8860 5.8860 5.8860 5.8860 5.9210 5.9210 5.9210 5.9210 5.9210 5.9210 5.9590 5.9590 5.9590 5.9590 5.9590 6.0000 6.0000 6.0000 6.0460 6.0460 6.0460 6.0460 6.0460 6.0970 6.0970 6.0970 6.0970 6.0970 6.0970 6.0970 6.0970 6.0970 6.0970 6.0970 6.1550 6.1550 6.1550 6.1550 6.1550 6.1550 6.1550 6.1610 6.1800 6.2220 6.2220 6.2220 6.2220 6.2220 6.2220 6.2220 6.3010 6.3980 6.5090

5.3940 5.5258 5.8308 5.6454 5.5213 5.5916 5.8259 5.0039 5.7726 5.7768 5.8329 6.0802 5.3087 5.5338 6.1346 5.6295 5.8549 6.1006 5.5723 5.8056 5.9927 5.8479 5.8474 5.8082 4.7403 6.0501 5.5986 5.7545 5.9346 5.8165 5.7516 5.6864 6.0519 5.8716 6.0311 6.0260 6.0816 5.9968 5.8337 5.8317 5.9156 5.9432 6.1920 5.5804 6.0520 5.6486 5.5076 5.6955 5.9540 6.2441 6.1971 6.0553 6.0817 5.9398 5.6098 6.1104 5.7405 5.6515 6.0168 6.1587 6.1200 5.9100 6.0632 5.8721 5.9157 5.9587 5.6977 6.1071 5.9781 5.9410 5.9408 5.9816 6.1264

5.2800 5.5835 5.8367 5.5584 5.4088 5.6584 5.8329 4.9962 5.7783 5.6832 5.4933 5.4622 5.0151 5.3956 6.0745 5.5231 5.9034 5.9414 5.5689 5.9649 5.8465 5.9829 5.6479 5.8283 4.6136 6.0944 5.5789 5.8583 5.8144 5.7037 5.5204 5.6485 5.9747 5.7146 6.0462 6.0278 5.9847 5.9638 5.8498 5.8470 5.8577 5.8268 6.1548 5.6507 5.5630 5.6086 5.4007 5.7895 6.0574 6.0514 6.1732 6.1128 6.1825 6.1221 5.4600 6.0078 5.6393 5.6464 5.9131 5.9735 6.1844 5.9143 5.9058 6.0495 6.0555 5.8810 5.7329 5.9828 6.1240 5.8558 5.8487 6.0264 5.6089

5.3490 5.6668 5.8599 5.5360 5.3910 5.6671 5.9038 5.1132 5.7747 5.5600 5.4648 5.3191 5.2096 5.4619 6.0698 5.6431 5.9061 6.0820 5.4942 5.9205 5.9834 5.9908 5.8098 5.8421 4.7242 6.1499 5.5824 5.7710 5.9694 5.7491 5.6286 5.6951 6.0794 5.8561 6.1404 6.1037 5.9983 5.9725 5.8800 5.8687 5.8857 5.8757 6.2163 5.6404 5.6684 5.4452 5.4230 5.6770 6.0427 6.0655 6.2530 6.1867 6.1380 6.0717 5.5623 6.0236 5.6487 5.6473 6.0297 6.0960 6.1553 6.0131 5.9234 6.0370 6.0255 6.0157 5.7190 6.0075 6.0890 5.8949 5.9293 6.0327 5.7086


Acknowledgements

We thank the EC project PROSIL, funded under the LIFE program (project LIFE12ENV/IT/000154), for financial support.

Appendix A. Supplementary material

Supplementary material related to this article can be found at http://dx.doi.org/10.1016/j.ejmech.2015.07.012.

References

[1] R. Politi, I. Rusyn, A. Tropsha, Prediction of binding affinity and efficacy of thyroid hormone receptor ligands using QSAR and structure-based modeling methods, Toxicol. Appl. Pharmacol. 280 (2014) 177-189.
[2] J.J. Hangeland, A.M. Doweyko, T. Dejneka, T.J. Friends, P. Devasthale, K. Mellström, J. Sandberg, M. Grynfarb, J.S. Sack, H. Einspahr, M. Färnegårdh, B. Husman, J. Ljunggren, K. Koehler, C. Sheppard, J. Malm, D.E. Ryono, Thyroid receptor ligands. Part 2: thyromimetics with improved selectivity for the thyroid hormone receptor beta, Bioorg. Med. Chem. Lett. 14 (2004) 3549-3553.
[3] S. Raval, P. Raval, D. Bandyopadhyay, K. Soni, D. Yevale, D. Jogiya, H. Modi, A. Joharapurkar, N. Gandhi, M.R. Jain, P.R. Patel, Design and synthesis of novel 3-hydroxy-cyclobut-3-ene-1,2-dione derivatives as thyroid hormone receptor β (TR-β) selective ligands, Bioorg. Med. Chem. Lett. 18 (2008) 3919-3924.
[4] T.P. Burkholder, B.E. Cunningham, J.R. Clayton, P.A. Lander, M.L. Brown, R.A. Doti, G.L. Durst, C. Montrose-Rafizadeh, C. King, H.E. Osborne, R.M. Amos, R.W. Zink, L.E. Stramm, T.P. Burris, G. Cardona, D.L. Konkol, C. Reidy, M.E. Christe, M.J. Genin, Design and synthesis of a novel series of [1-(4-hydroxy-benzyl)-1H-indol-5-yloxy]-acetic acid compounds as potent, selective, thyroid hormone receptor β agonists, Bioorg. Med. Chem. Lett. 25 (2015) 1377-1380.
[5] K. Shiizaki, S. Asai, S. Ebata, M. Kawanishi, T. Yagi, Establishment of yeast reporter assay systems to detect ligands of thyroid hormone receptors α and β, Toxicol. Vitro 24 (2010) 638-644.
[6] N.F. Valadares, M.S. Castilho, I. Polikarpov, R.C. Garratt, 2D QSAR studies on thyroid hormone receptor ligands, Bioorg. Med. Chem. 15 (2007) 4609-4617.
[7] G. Azimi, S. Afiuni-Zadeh, A. Karami, A QSAR study for modeling of thyroid receptors β1 selective ligands by application of adaptive neuro-fuzzy inference system and radial basis function, J. Chemom. 26 (2012) 135-142.
[8] A.A. Toropov, A.P. Toropova, B.F. Rasulev, E. Benfenati, G. Gini, D. Leszczynska, J. Leszczynski, CORAL: QSPR modeling of rate constants of reactions between organic aromatic pollutants and hydroxyl radical, J. Comput. Chem. 33 (2012) 1902-1906.
[9] D. Weininger, SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules, J. Chem. Inf. Comput. Sci. 28 (1988) 31-36.
[10] D. Weininger, A. Weininger, J.L. Weininger, SMILES. 2. Algorithm for generation of unique SMILES notation, J. Chem. Inf. Comput. Sci. 29 (1989) 97-101.
[11] D. Weininger, SMILES. 3. DEPICT. Graphical depiction of chemical structures, J. Chem. Inf. Comput. Sci. 30 (1990) 237-243.
[12] A.P. Toropova, A.A. Toropov, E. Benfenati, G. Gini, D. Leszczynska, J. Leszczynski, CORAL: quantitative structure-activity relationship models for estimating toxicity of organic compounds in rats, J. Comput. Chem. 32 (2011) 2727-2733.
[13] A.A. Toropov, A.P. Toropova, T. Puzyn, E. Benfenati, G. Gini, D. Leszczynska, J. Leszczynski, QSAR as a random event: modeling of nanoparticles uptake in PaCa2 cancer cells, Chemosphere 92 (2013) 31-37.
[14] A.P. Toropova, A.A. Toropov, R. Rallo, D. Leszczynska, J. Leszczynski, Optimal descriptor as a translator of eclectic data into prediction of cytotoxicity for metal oxide nanoparticles under different conditions, Ecotoxicol. Environ. Saf. 112 (2015) 39-45.
[15] REACH, 2007. http://ec.europa.eu/environment/chemicals/reach/reach_intro.htm (accessed 30.04.15).
[16] OECD Principles for the Validation, for Regulatory Purposes, of (Quantitative) Structure-Activity Relationship Models, 2004. http://www.oecd.org/dataoecd/33/37/37849783.pdf (accessed 30.04.15).