QSARs for a series of inhibitory anilides

Chemosphere, Vol. 36, No. 13, pp. 2809-2818, 1998
© 1998 Elsevier Science Ltd. All rights reserved. Printed in Great Britain
0045-6535/98 $19.00+0.00

Pergamon

PII: S0045-6535(97)10239-9

QSARs FOR A SERIES OF INHIBITORY ANILIDES

D. Zakarya 1), E. M. Larfaoui 2), A. Boulaamail 2), M. Toilabi 3) and T. Lakhlifi 2)

1) Faculté des Sciences et Techniques, B. P. 146, Cité Yasmina, Mohammedia, Morocco. Fax: 212 3 31 53 53. Email: [email protected]
2) Faculté des Sciences, B. P. 1046, Zitoune, Meknès, Morocco.
3) Faculté des Sciences, Université Mohammed V, Avenue Ibn Batouta, Rabat, Morocco.

(Received in Germany 24 June 1997; accepted 6 December 1997)

ABSTRACT

A neural network was applied to the study of a series of anilide herbicides that inhibit photosystem II (PSII). The molecules were encoded by a set of dimensional parameters. The model established using a back-propagation neural network (r = 0.967; n = 76) was superior to that obtained using multiple linear regression (r = 0.922; n = 76). The contributions of the descriptors to the PSII inhibitory activity, pI50, were calculated by a method that consists in analysing the statistical coefficients between observed and calculated pI50 obtained with the neural network when a descriptor is removed. The results were interpreted in terms of molecule-receptor site interactions. © 1998 Elsevier Science Ltd. All rights reserved

KEY WORDS: QSAR; Anilide herbicides; Photosystem II inhibitory activity; Multiple Linear Regression; Neural Network; Descriptor's contribution; Molecule-receptor site interaction.

INTRODUCTION

Recently, neural networks (NN) have attracted the attention of chemists, who recognized that these tools are well adapted to processing data in which the relation between a cause and its effects cannot be exactly defined. The prediction of the secondary structure of proteins 1 is a well-known example of the application of NN to scientific problems. In chemometrics, NN have given better results than classical tools such as multiple linear regression (MLR). Andrea and Kalayeth 2 studied 256 compounds belonging to the 2,4-diamino-6,6-dimethyl-5-phenyl-dihydrotriazine family and concluded that the NN results exceeded the MLR ones. The activity studied was the inhibition of dihydrofolate reductase (DHFR). Similar conclusions were reached by Aoyama et al. 3-6 in studies of the anticancer activity of mitomycins and carbiquinones, the hypertensive activity of arylacryloyl-piperazines and three different biological data for 1,4-benzodiazepines.

Shimizu et al. 7 analyzed the quantitative structure-activity relationship for inhibition of photosystem II by meta- and para-substituted anilides using multiple linear regression. To improve the quality of their model, they used the bilinear model of Kubinyi 8, expressed by the term $\log(\beta \cdot 10^{\log P} + 1)$, where log P is the octanol-water partition coefficient and β is a constant. This term may not be sufficient to account for the nonlinearity of the phenomenon. These facts suggested the use of NN, a promising technique, to improve the model proposed by Shimizu et al. 7

MATERIAL AND METHODS

The chemicals studied are taken from Shimizu et al. 7 They belong to the anilide family (Table 1). The hydrophobicity of the anilide herbicides is estimated using the partition coefficient between 1-octanol and water (log P) 9, and the functional groups R1, R2, R3 and R4 of the molecules (Table 1) are described by a set of Verloop parameters 10 modified by Asao and Iwamura 11, as shown in Figure 1.

Table 1: Chemical structures of the studied anilides

No.  R1             R2               R3            R4  Obs. a MLR b  NN c   NN d
1    Cl             Cl               Et            H   6.35   6.23   6.36   6.43
2    Cl             Cl               i-Pr          H   6.07   6.44   6.65   6.85
3    Cl             Cl               1-Me-allyl    H   6.77   5.98   6.31   6.06
4    Cl             Cl               n-Bu          H   6.34   6.16   6.67   6.61
5    Cl             Cl               i-Bu          H   4.94   5.16   5.09   5.65
6    Cl             Cl               1-Me-n-Bu     H   7.22   6.42   5.80   6.70
7    Cl             Cl               2-Me-n-Bu     H   6.33   5.65   6.13   5.70
8    H              Cl               1,1-Me2-n-Bu  H   5.80   5.76   6.12   6.15
9    Cl             Me               1-Me-n-Bu     H   6.48   6.06   6.59   6.52
10   Cl             Cl               n-pentyl      H   6.63   6.50   6.81   6.78
11   Cl             Cl               i-pentyl      H   6.45   5.73   5.90   5.35
12   Cl             Cl               c-Hx          H   5.17   4.91   4.90   4.68
13   Cl             Cl               (CH2)2-c-Hx   H   4.44   5.64   4.89   5.98
14   Cl             Cl               (CH2)2-Ph     H   4.77   4.40   4.68   4.46
15   Cl             Cl               CH2-OPh       H   4.42   4.56   4.65   4.73
16   H              H                1-Me-c-Pr     H   5.04   4.29   4.79   4.76
17   Cl             H                1-Me-c-Pr     H   5.72   5.54   5.53   5.42
18   Br             H                1-Me-c-Pr     H   5.50   5.64   5.69   5.67
19   OMe            H                1-Me-c-Pr     H   4.57   5.01   4.88   4.92
20   CF3            H                1-Me-c-Pr     H   5.32   5.67   5.74   5.83
21   O-i-pentyl     H                1-Me-c-Pr     H   7.05   6.30   6.74   6.72
22   H              F                1-Me-c-Pr     H   5.14   5.04   4.91   4.85
23   H              Cl               1-Me-c-Pr     H   5.80   5.46   5.42   5.27
24   H              OMe              1-Me-c-Pr     H   4.78   4.76   4.64   4.68
25   H              CN               1-Me-c-Pr     H   4.51   4.99   4.86   4.93
26   Cl             Cl               1-Me-c-Pr     H   6.88   5.95   6.23   6.08
27   O-i-pentyl     H                c-Pr          H   6.90   6.60   6.92   6.93
28   O(CH2)2Ph      H                c-Pr          H   6.93   6.72   6.99   6.99
29   O(CH2)3Ph      H                c-Pr          H   7.09   7.07   7.08   7.07
30   O(CH2)5Ph      H                c-Pr          H   6.73   7.75   6.99   7.21
31   O(CH2)3OPh     H                c-Pr          H   7.39   6.82   7.03   6.98
32   O(CH2)4OPh     H                c-Pr          H   6.67   7.16   7.08   7.15
33   H              H                CH2Ph         H   4.95   5.23   4.80   4.76
34   F              H                CH2Ph         H   5.73   5.50   5.41   5.29
35   O-i-pentyl     H                CH2Ph         H   7.13   6.70   6.82   6.77
36   H              Cl               CH2Ph         H   5.75   5.80   6.12   6.23
37   H              Me               CH2Ph         H   5.13   5.47   5.33   5.36
38   Cl             Cl               CH2Ph         H   6.72   6.35   6.76   6.71
39   H              H                n-Bu          H   4.12   4.97   4.41   4.50
40   OH             H                n-Bu          H   4.12   4.81   4.22   4.31
41   OMe            H                n-Bu          H   4.87   5.22   4.83   4.86
42   O-n-Pr         H                n-Bu          H   6.05   5.90   6.39   6.41
43   O-i-Pr         H                n-Bu          H   5.64   5.04   5.37   5.11
44   O-n-Bu         H                n-Bu          H   6.46   6.25   6.73   6.66
45   O-i-Bu         H                n-Bu          H   6.50   6.16   6.68   6.60
46   O-sec-Bu       H                n-Bu          H   5.76   5.38   5.80   5.66
47   O-i-pentyl     H                n-Bu          H   7.29   6.51   6.81   6.73
48   O-n-pentyl     H                n-Bu          H   6.41   6.59   6.82   6.82
49   O-1-Me-n-Bu    H                n-Bu          H   5.60   5.73   5.98   6.05
50   O-n-heptyl     H                n-Bu          H   6.92   7.28   6.70   6.92
51   O-1-Me-n-Hx    H                n-Bu          H   6.24   6.41   6.01   6.12
52   OCH2-c-Hx      H                n-Bu          H   6.50   7.40   6.65   7.00
53   O(CH)MeCOOMe   H                n-Bu          H   3.99   4.25   3.91   3.94
54   O(CH)MeCOOEt   H                n-Bu          H   4.19   4.59   4.45   4.75
55   H              OMe              n-Bu          H   4.26   4.97   4.41   4.48
56   H              OEt              n-Bu          H   4.31   3.89   3.77   3.68
57   H              O-n-Pr           n-Bu          H   4.16   4.23   4.02   3.98
58   H              O-i-Pr           n-Bu          H   3.90   4.15   3.94   3.93
59   H              O-n-Bu           n-Bu          H   4.49   4.57   4.49   4.54
60   H              O-i-Bu           n-Bu          H   4.24   4.49   4.36   4.43
61   H              O-sec-Bu         n-Bu          H   4.18   4.49   4.36   4.45
62   H              O-i-pentyl       n-Bu          H   5.27   4.83   4.90   4.84
63   H              O-n-pentyl       n-Bu          H   5.11   4.90   4.98   4.93
64   H              O-1-Me-n-Bu      n-Bu          H   4.73   4.83   4.90   4.91
65   H              OCH(Et)2         n-Bu          H   4.57   4.83   4.90   4.94
66   H              O-n-heptyl       n-Bu          H   5.88   5.60   5.44   5.19
67   H              O-1-Me-n-Hx      n-Bu          H   5.28   5.52   5.42   5.42
68   H              OCH2-c-Hx        n-Bu          H   5.08   5.73   5.46   5.73
69   H              O(CH)MeCOOMe     n-Bu          H   3.46   3.36   3.64   3.56
70   H              O(CH)MeCOOEt     n-Bu          H   3.63   3.70   3.70   3.65
71   H              O(CH)MeCOO-i-Pr  n-Bu          H   3.89   3.96   3.81   3.79
72   O-i-pentyl     H                1-Me-n-Bu     H   6.80   6.77   6.81   6.84
73   Cl             Cl               n-Bu          Me  3.88   3.76   3.55   3.48
74   O-i-pentyl     H                n-Bu          Me  3.65   4.11   3.71   3.94
75   H              O-i-pentyl       n-Bu          Me  3.38   2.44   3.47   3.52
76   O(CH2)aPh      H                c-Pr          Me  4.09   4.67   4.13   4.41

Activity is given as pI50. a: Observed pI50, b: Calculated using MLR, c: Calculated using NN, d: Predicted using cross-validation. Me: Methyl, Et: Ethyl, Pr: Propyl, Bu: Butyl, Ph: Phenyl, Hx: Hexyl, i: iso, n: normal, c: cyclo

Fig. 1: Illustration of the Verloop parameters used. D is the length of a substituent in the extended conformation. It is measured along the D-axis, which starts at the connecting end and runs along the zigzag aliphatic chain. Wr is the width of the right-hand side of the substituent measured from the D-axis when viewed from the connecting end, and Wl is that of the left-hand side. Similarly, Tr and Tl are the thicknesses of the right- and left-hand sides, respectively. Wr3 is the width Wr at the 3-position.

The variable Ibr(R1) takes the value one for compounds having R1 branched at the β-position from the benzene ring and zero for the others. IOR(R2) takes the value one for compounds having a functional group R2 larger than ethoxy in terms of the D defined in Figure 1 and zero for the others. INMe takes the value one for the N-methylated derivatives and zero for the others. These three indicator variables were introduced to account for the particular behaviour of the corresponding three categories of compounds.

The activity was expressed as the logarithm of the reciprocal of the molar concentration at which 50 per cent inhibition of the photosynthetic DCIP reduction is obtained, pI50. The values of pI50 are normalized between 0 and 1 for the NN calculations.
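The definition of pI50 and its rescaling to the interval [0, 1] can be illustrated with a short Python sketch. The text does not say which normalization was applied; linear min-max scaling is assumed here, and the function names are hypothetical. The five values are the first observed activities listed in Table 1.

```python
import math


def pi50(i50_molar):
    """pI50 = log10(1 / I50), with I50 the molar concentration giving 50% inhibition."""
    return -math.log10(i50_molar)


def min_max_normalize(values):
    """Linearly rescale values to [0, 1] (assumed scheme, not stated in the paper)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(scaled, lo, hi):
    """Map a network output in [0, 1] back to the pI50 scale."""
    return scaled * (hi - lo) + lo


observed = [6.35, 6.07, 6.77, 6.34, 4.94]  # first five observed pI50 values of Table 1
scaled = min_max_normalize(observed)
```

Scaling the target to [0, 1] matches the output range of a sigmoid output neuron, which is presumably why the normalization was applied.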

RESULTS AND DISCUSSION

Multiple Linear Regression

The best quantitative structure-activity relationship model for inhibition of photosystem II obtained by Shimizu et al. 7 is given below.

$$\mathrm{pI}_{50} = 0.95\,\log P - 1.39\,\log(\beta \cdot 10^{\log P} + 1) - 0.62\,T_l(R_3) - 0.87\,W_{r3}(R_3) - 0.75\,I_{br}(R_1) - 1.44\,I_{OR}(R_2) - 2.59\,I_{NMe} + 6.64 \qquad (1)$$

n = 76, r = 0.940, s = 0.401, log β = -5.08

As mentioned above, the authors introduced $\log(\beta \cdot 10^{\log P} + 1)$ in order to account for the nonlinearity of the phenomenon. When we remove this parameter and refit the model 12, equation (1) becomes:

$$\mathrm{pI}_{50} = 0.64\,\log P - 0.62\,T_l(R_3) - 0.81\,W_{r3}(R_3) - 0.79\,I_{br}(R_1) - 1.50\,I_{OR}(R_2) - 2.74\,I_{NMe} + 7.51 \qquad (2)$$

n = 76, r = 0.922, s = 0.457
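As a worked illustration, equations (1) and (2) can be evaluated directly from their published coefficients. The sketch below is only a transcription of the two regressions into Python; the function and argument names are hypothetical, and the descriptor values in the example call are invented for illustration, not taken from the data set.

```python
import math

LOG_BETA = -5.08  # log(beta) reported with equation (1)


def bilinear_term(log_p, log_beta=LOG_BETA):
    """Kubinyi bilinear term log10(beta * 10**logP + 1)."""
    return math.log10(10 ** (log_beta + log_p) + 1.0)


def pi50_eq1(log_p, t_l, w_r3, i_br, i_or, i_nme):
    """Equation (1): MLR model including the bilinear term."""
    return (0.95 * log_p - 1.39 * bilinear_term(log_p)
            - 0.62 * t_l - 0.87 * w_r3
            - 0.75 * i_br - 1.44 * i_or - 2.59 * i_nme + 6.64)


def pi50_eq2(log_p, t_l, w_r3, i_br, i_or, i_nme):
    """Equation (2): MLR model without the bilinear term."""
    return (0.64 * log_p - 0.62 * t_l - 0.81 * w_r3
            - 0.79 * i_br - 1.50 * i_or - 2.74 * i_nme + 7.51)


# Hypothetical descriptor values (illustration only, not a compound of Table 1)
example = dict(log_p=3.0, t_l=2.0, w_r3=2.0, i_br=0, i_or=0, i_nme=0)
value_eq1 = pi50_eq1(**example)
value_eq2 = pi50_eq2(**example)
```

Because the three indicator variables enter linearly, setting INMe to one lowers the pI50 calculated by equation (2) by exactly 2.74 units whatever the other descriptors are.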

The results above show clearly that $\log(\beta \cdot 10^{\log P} + 1)$ does not sufficiently improve the quality of the model (r rises only from 0.922 to 0.940) and does not account for the nonlinearity as hoped. This is because $\log(\beta \cdot 10^{\log P} + 1)$ (denoted BT) is fairly correlated with log P (r = 0.73, Table 2).
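The correlation between the bilinear term and log P can be sketched as follows. This is an illustration only: the log P values are hypothetical, the Pearson coefficient is assumed to be the statistic reported in Table 2, and the result is not expected to reproduce the published value of 0.73, which depends on the log P values of the 76 compounds.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


LOG_BETA = -5.08  # log(beta) from equation (1)
log_p_values = [2.0, 3.0, 4.0, 5.0, 6.0]  # hypothetical log P values
bt_values = [math.log10(10 ** (LOG_BETA + lp) + 1.0) for lp in log_p_values]
r_logp_bt = pearson_r(log_p_values, bt_values)
```

Since BT increases monotonically with log P, the two descriptors necessarily carry overlapping information, which limits what the bilinear term can add to a linear model.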

Table 2: Correlation between descriptors.

          log P   Wr3(R3)  Tl(R3)  Ibr(R1)  IOR(R2)  INMe    BT
log P      1.00
Wr3(R3)    0.25    1.00
Tl(R3)    -0.12   -0.58     1.00
Ibr(R1)    0.03    0.06    -0.11    1.00
IOR(R2)    0.04    0.10    -0.20   -0.15     1.00
INMe       0.13   -0.05     0.01   -0.07    -0.12     1.00
BT         0.73    0.11    -0.03    0.05     0.02     0.17    1.00

On the other hand, due to the ability of NN 13,14 to model nonlinear phenomena, it is interesting to study the present series of compounds with this technique.

Neural Network

The data set was submitted to a NN with three layers and complete connections between neurons. The input layer consists of the six descriptors ($\log(\beta \cdot 10^{\log P} + 1)$ was not taken into account). The optimized hidden layer has five neurons, and the activity pI50 constitutes the output neuron (configuration: 6-5-1).

The transfer function used is a sigmoid, f(x) = 1 / (1 + e^(-x)). The connection weights are optimized using a back-propagation algorithm; 1000 cycles were sufficient to obtain a good optimisation of the weights. The correlation coefficient between observed and calculated pI50 was 0.967, which is higher than those of the MLR models based on Shimizu et al. 7 (0.940 and 0.922 for equations (1) and (2), respectively). This indicates that the introduction of $\log(\beta \cdot 10^{\log P} + 1)$ does not account sufficiently for the nonlinearity, as suggested in the introduction of this paper. The plot of calculated against observed values of pI50 (Fig. 2) shows a good distribution of the standard deviations (SD).
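The 6-5-1 architecture, sigmoid transfer function and back-propagation training described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the learning rate, the initial weight range, the use of bias terms and per-sample weight updates are not given in the paper and are chosen here arbitrarily, and the training data are synthetic stand-ins rather than the anilide descriptors.

```python
import math
import random


def sigmoid(x):
    """Logistic transfer function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


class Net651:
    """Fully connected 6-5-1 network; the last weight of each neuron is its bias."""

    def __init__(self, n_in=6, n_hidden=5, seed=0):
        rng = random.Random(seed)
        # Assumed initialisation: uniform weights in [-0.5, 0.5]
        self.w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                    for _ in range(n_hidden)]
        self.w_o = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x):
        # zip stops at len(x), so the trailing bias weight is added separately
        h = [sigmoid(sum(w * v for w, v in zip(row, x)) + row[-1])
             for row in self.w_h]
        y = sigmoid(sum(w * v for w, v in zip(self.w_o, h)) + self.w_o[-1])
        return h, y

    def predict(self, x):
        return self.forward(x)[1]

    def train_step(self, x, target, lr=0.5):
        """One back-propagation update on the squared error of one sample."""
        h, y = self.forward(x)
        delta_o = (y - target) * y * (1.0 - y)
        # Hidden deltas are computed before the output weights change
        delta_h = [delta_o * self.w_o[j] * h[j] * (1.0 - h[j])
                   for j in range(len(h))]
        for j, hj in enumerate(h):
            self.w_o[j] -= lr * delta_o * hj
        self.w_o[-1] -= lr * delta_o
        for j, row in enumerate(self.w_h):
            for i, xi in enumerate(x):
                row[i] -= lr * delta_h[j] * xi
            row[-1] -= lr * delta_h[j]


def mse(net, inputs, targets):
    """Mean squared error of the network over a data set."""
    return sum((net.predict(x) - t) ** 2
               for x, t in zip(inputs, targets)) / len(targets)


# Synthetic stand-in data (NOT the anilide descriptors): 20 samples, 6 inputs in [0, 1]
rng = random.Random(1)
inputs = [[rng.random() for _ in range(6)] for _ in range(20)]
targets = [0.5 * (x[0] + x[1] ** 2) for x in inputs]

net = Net651()
mse_before = mse(net, inputs, targets)
for _ in range(1000):  # 1000 training cycles, as reported in the text
    for x, t in zip(inputs, targets):
        net.train_step(x, t)
mse_after = mse(net, inputs, targets)
```

The targets lie in [0, 1] because the sigmoid output neuron cannot produce values outside that interval, which is consistent with the normalization of pI50 mentioned in the methods.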

Fig. 2: Calculated pI50 vs. observed pI50

In order to test the efficiency of the established model, we used a cross-validation procedure.

Prediction ability of the neural network

We used cross-validation 15 to evaluate the prediction ability of the network. In this procedure, one compound is removed from the data set, the network is trained with the remaining compounds and then used to predict the discarded compound. The process is repeated in turn for each compound in the data set. Cross-validation was carried out with the same architecture and parameters as the NN described above.
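The leave-one-out procedure described above can be sketched generically. In the sketch below the model is passed in as a pair of `fit` and `predict` callables; a least-squares straight line stands in for retraining the 6-5-1 network, and both the helper names and the data are hypothetical.

```python
def leave_one_out(xs, ys, fit, predict):
    """Return one prediction per sample, each from a model that never saw it."""
    predictions = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        predictions.append(predict(model, xs[i]))
    return predictions


def fit_line(xs, ys):
    """Least-squares straight line, a stand-in for retraining the network."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


# Hypothetical one-descriptor data lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]
loo_predictions = leave_one_out(xs, ys, fit_line, predict_line)
```

For the 76 anilides this loop would train 76 separate networks, each on 75 compounds, which is what makes the predicted values independent of the compound being predicted.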

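The statistics describing the quality of the prediction ability are as follows:

- Standard error of prediction:

$$\mathrm{SEP} = \left[ \frac{\sum \left( \mathrm{pI}_{50}^{obs} - \mathrm{pI}_{50}^{pred} \right)^2}{N} \right]^{1/2} = 0.107$$

where N is the number of compounds used.

- Cross-validated r²:

$$r^2_{cv} = 1 - \frac{\sum \left( \mathrm{pI}_{50}^{obs} - \mathrm{pI}_{50}^{pred} \right)^2}{\sum \left( \mathrm{pI}_{50}^{obs} - \mathrm{pI}_{50}^{mean} \right)^2} = 0.853$$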
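Both statistics follow directly from their definitions, as the sketch below shows. The function names are hypothetical, and the five value pairs are only the first observed and cross-validated activities listed in Table 1, so the results are not expected to match the published figures, which were computed over all 76 compounds.

```python
import math


def standard_error_of_prediction(observed, predicted):
    """SEP = sqrt(sum((obs - pred)^2) / N)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def cross_validated_r2(observed, predicted):
    """r2_cv = 1 - sum((obs - pred)^2) / sum((obs - mean_obs)^2)."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total


obs = [6.35, 6.07, 6.77, 6.34, 4.94]       # first five observed pI50 (Table 1)
pred_cv = [6.43, 6.85, 6.06, 6.61, 5.65]   # their cross-validated predictions (Table 1)
sep_subset = standard_error_of_prediction(obs, pred_cv)
r2_subset = cross_validated_r2(obs, pred_cv)
```

Unlike the fitted correlation coefficient, the cross-validated r² can become negative when the predictions are worse than simply using the mean observed activity.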

Descriptor's contributions and interpretation

To evaluate the contributions of the descriptors in explaining the behaviour of pI50 based on NN, we developed a method 16 which consists in removing one descriptor at a time and analysing the statistical coefficients between the observed activities and those calculated using NN. Comparison of these statistics with those calculated by NN when no descriptor was removed gives an idea of the importance of the removed descriptor. The values of the statistical coefficients when a descriptor is removed and for the full set of descriptors (last row: reference) are reported in Table 3.

Table 3: Correlation coefficients (r) and standard errors (s) calculated using NN and MLR when a descriptor is removed and for the full set of descriptors.

Removed descriptor   r (NN)   r (MLR)   s (NN)   s (MLR)
log P                0.6807   0.6745    0.2043   0.5881
IOR(R2)              0.7757   0.7551    0.1760   0.7622
INMe                 0.8140   0.7833    0.1620   0.7220
Wr3(R3)              0.8949   0.8180    0.1244   0.6683
Ibr(R1)              0.9507   0.8931    0.0864   0.5234
Tl(R3)               0.9516   0.8957    0.0885   0.5172
None (reference)     0.9673   0.9220    0.0706   0.4573

Analysis of Table 3 shows that removing log P decreases r from 0.9673 to 0.6807. Its presence therefore increases the quality of the model (i.e. log P contributes greatly to explaining the behaviour of pI50). A similar interpretation may be made for the other descriptors, and the following order of contributions is obtained:

log P > IOR(R2) > INMe > Wr3(R3) > Ibr(R1) > Tl(R3)
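The ordering above can be reproduced from the NN correlation coefficients of Table 3 by ranking the descriptors by the drop in r that their removal causes. The short sketch below does only that; the variable names are hypothetical, and the descriptor labels are abbreviated.

```python
R_FULL = 0.9673  # r of the NN with the full set of six descriptors (Table 3)

# r of the NN when the named descriptor is removed (Table 3)
r_without = {
    "log P": 0.6807,
    "IOR": 0.7757,
    "INMe": 0.8140,
    "Wr3": 0.8949,
    "Ibr": 0.9507,
    "Tl": 0.9516,
}

# Contribution of a descriptor, taken here as the loss in r when it is left out
drop_in_r = {name: R_FULL - r for name, r in r_without.items()}
ranking = sorted(drop_in_r, key=drop_in_r.get, reverse=True)
```

Removing log P costs about 0.29 units of r, whereas removing Ibr or Tl costs less than 0.02, which is why the last two descriptors rank lowest.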

According to the results above, log P contributes greatly to the behaviour of pI50. It is also responsible for the nonlinearity of the relationship between pI50 and the descriptors. This is shown in Table 3, where the correlation coefficient r is almost the same for NN and MLR (i.e. the phenomenon is linear) when log P is removed. Consequently, log P is mainly responsible for the nonlinearity. Further evidence is obtained when pI50 calculated using NN is plotted as a function of log P: the resulting plot indicates nonlinearity. This suggests that the molecular diffusion to the receptor site follows a nonlinear route.

The activity of anilides decreases with increasing bulkiness of the acyl moiety (R3). R3 probably encounters narrow regions in the receptor site in the directions Wr3 and Tl. As indicated by IOR(R2), substituents larger or bulkier than ethoxy diminish the activity; the region of the receptor is probably narrow for these substituents. Ibr(R1) decreases the activity; the region to be occupied by R1 is suggested to be narrow. The activity decreases markedly with N-methylation, as reflected by the indicator variable INMe. This seems to be due to a reduction of conformational flexibility.

The theory of narrow and wide regions was previously postulated by Omokawa et al. 17 in their study of the inhibitory activity of 1,3,5-triazines. They suggested that the amino group with a small substituent and low steric hindrance would be placed in a narrow region of the receptor site in the PSII protein, and the other amino group would be placed in the wide region.

REFERENCES

1. N. Qian and T. J. Sejnowski, J. Mol. Biol., 202, 865 (1988)
2. T. A. Andrea and H. Kalayeth, J. Med. Chem., 34, 2824 (1991)
3. T. Aoyama, Y. Suzuki and H. Ichikawa, J. Med. Chem., 33, 905 (1990)
4. T. Aoyama and H. Ichikawa, Chem. Pharm. Bull., 39, 372 (1991)
5. T. Aoyama, Y. Suzuki and H. Ichikawa, J. Med. Chem., 33, 2583 (1990)
6. T. Aoyama and H. Ichikawa, Chem. Pharm. Bull., 39, 358 (1991)
7. R. Shimizu, H. Iwamura and T. Fujita, J. Agric. Food Chem., 36, 1276 (1988)
8. H. Kubinyi, Drug Res., 29, 1067 (1979)
9. C. Hansch and A. Leo, The Fragment Method of Calculating Partition Coefficients, in: Substituent Constants for Correlation Analysis in Chemistry and Biology, Wiley Interscience, New York (1979)
10. A. Verloop, W. Hoogenstraaten and J. Tipker, Development and Application of New Steric Substituent Parameters in Drug Design, in: E. J. Ariens (ed.), Drug Design, Academic, New York, Vol. VII (1979)
11. M. Asao and H. Iwamura, Faculty of Agriculture, Kyoto University, Kyoto, Japan, unpublished data (1985)
12. Statitcf Software, Institut Technique des Céréales et Fourages, Paris, France (1987)
13. J. L. McClelland and D. E. Rumelhart, Parallel Distributed Processing, MIT Press, Cambridge, MA (1988)
14. W. H. Press, B. P. Flannery, S. A. Teukolsky and W. T. Vetterling, Numerical Recipes in C, Cambridge University Press, Cambridge (1988)
15. D. Villemin, D. Cherqaoui and J. M. Cense, J. Chim. Phys., 90, 1505 (1993)
16. M. Chastrette, D. Zakarya and J. F. Peyraud, Eur. J. Med. Chem., 29, 343 (1994)
17. H. Omokawa and M. Konnai, Agric. Biol. Chem., 54, 2373 (1990)