Bioorganic & Medicinal Chemistry 11 (2003) 4523–4533
QSAR Study on Tadpole Narcosis Vijay K. Agrawal,a Sanjeev Chaturvedi,a Michael H. Abrahamb and Padmakar V. Khadikarc,* a
QSAR and Computer Chemical Laboratories, A.P.S. University, Rewa-486 003, India Department of Chemistry,University College London, 20 Gordon Street, London, UK c Research Division, Laxmi Fumigation and Pest Control Pvt. Ltd. 3 Khatipura, Indore-452 007, India b
Received 25 February 2003; accepted 24 March 2003
Abstract—This paper deals with a Quantitative Structure–Activity Relationship (QSAR) study on a large set of 123 compounds using a combination of topological indices as well as Abraham’s molecular descriptors. The results have shown that an excellent O model (R=0.9542) is obtained in hexa-parametric correlation containing W, logRB (topological indices) along with R2, pH 2 , b2 and Vx as the correlating parameters. The results are discussed critically. # 2003 Elsevier Ltd. All rights reserved.
Introduction Abraham and Rafols1 have reported factors that influence tadpole narcosis using LFER (linear free-energy relationship) approach. They observed that the narcotic concentration of solutes in mol dm3 is given by the following relationship:
For the same set of the compounds, Abraham and Rafols1 further observed that tadpole narcosis is related to water–octanol partition coefficient (logP(oct)) according to the following equation: logð1=Cnar Þ ¼ 1:129 þ 0:8331 logPðoctÞ n ¼ 84; r ¼ 0:9212; SD ¼ 0:407
logð1=Cnar Þ ¼ 0:579 þ 0:824R2
ð2Þ
0:334pH 2
2:871bO 2 þ 3:097Vx
ð1Þ
where the descriptors used are: R the solute excess molar refraction, pH 2 the solute dipolarity/polarizability, bO 2 the solute hydrogen-bond basicity and Vx is solute volume. Applying aforementioned relationship to a set of 84 compounds Abraham and Rafols1 obtained the correlation coefficient r=0.9730 and the standard deviation (SD)=0.246 log units. The above relationship (eq 1) shows that solute hydrogen-bond basicity and solute volume are the main parameters influencing the tadpole narcosis and that hydrogen-bond acidity has no effect at all.
The statistics of (eq 2) is improved by the addition of Vx as the second correlating parameter: logð1=Cnar Þ ¼ 0:621 þ 0:7431 logPðoctÞ þ 0:668 Vx n ¼ 84; r ¼ 0:9400; SD ¼ 0:359
Finally, Abraham and Rafols1 added additional compounds, thus, increasing the set of 84 compounds to 114 and arrived at a conclusion that the tadpole narcosis is a general phenomenon. Considering tadpole narcosis as a general biological property (SP) Meyer and Hemmi2 proposed the following relationship between logSP and the partition coefficient (logP): logSP ¼ a logP þ C
*Corresponding author. Tel.: +91-731-253-1906; e-mail: vijay- agrawal @lycos.com 0968-0896/$ - see front matter # 2003 Elsevier Ltd. All rights reserved. doi:10.1016/S0968-0896(03)00446-2
ð3Þ
ð4Þ
Needless to state that here SP stands for tadpole narcosis. This relationship (eq 4), now-a-days, is referred as
4524
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533
the Overton–Meyer relationship; or some times just as the Overton rule, and has been applied to numerous series of biological results.3,4 During recent years, our research is concerned mainly to investigate the use of topological and/or other molecular descriptors in place of logP.511 Thus, our primary objective is to study the modeling tadpole narcosis using topological indices and/or their combinations with other molecular descriptors. We have, therefore, initially used a larger set of distance-based topological indices (W, Sz, 1w, B, J, and logRB) along with the aforementioned Abraham’s molecular descripH O tors (R2, pH 2 , a2 , b2 and Vx), thus making a total of 11 molecular descriptors. Another objective of our present study is to investigate whether or not our methodology will be useful for more than 84 compounds. In doing so we have considered: (i) tadpole narcosis as a general phenomenon, and (ii) used only those compounds from the original set1 for which logP values are reported by Abraham and Rafols.1 We did this to support our primary objective, that is use of topological and/or their combinations with other molecular descriptors in place of logP. The results as discussed below show that both the objectives of our present study are fulfilled. At this stage, it is worth reporting that the topological indices used are: Wiener index (W),12 Szeged index (Sz),13,14 first order connectivity index (1w), branching index (B),15 Balaban index (J),16 and logRB.17 Also, that the statistical analysis for obtaining the excellent model is done following maximum R2 method.18 The quality factor (Q),19,20 though criticized recently,21 has been found useful in obtaining predictive potential of the proposed models.
parameters. It resulted into five hexa-parametric regression expressions out of which the one containing O R2, pH 2 b2 , Vx, W and logRB as the correlating parameters gave better result. This model is found as under: logð1=Cnar Þ ¼ 1:0079 þ 1:2975ð0:2059ÞR2 1:0016ð0:2211ÞpH 2 1:7924 ð0:2746ÞbO 2 þ 2:4560 ð0:2838ÞVx þ 0:0120ð0:0054ÞW
ð5Þ
0:0379ð0:0187ÞlogRB n ¼ 123; SE ¼ 0:5807; r ¼ 0:8644; F ¼ 57:120; Q ¼ 1:4885; p ¼ 0:000 Here and hereafter, n is the number of compounds used, SE is the standard error of estimation, r is the multiple correlation coefficient, F is the F-ratio, Q is the quality factor and p is the probability. Eq. 5 shows that the solute-volume (Vx) plays a dominating role in the exhibition of the tadpole narcosis of the set of 123 compounds used. Also, among the distance-based topological indices used in the multi-parametric regression, W and logRB are more useful for modeling tadpole narcosis ðlog 1=Cnar Þ. The successive regression analysis resulted into four statistically significant seven-parametric regression models. However, only one out of these seven regressions gave better results than the regression model discussed above. This model is found to contain R2, pH 2 1 bO 2 , Vx, W, w, B and logRB as the correlating parameters. This model is found as given below:
Results and Discussion Table 1 records the set of 123 compounds used in the present study. The narcotic activity ðlog 1=Cnar Þ and H O the Abraham’s molecular descriptors (R2, pH 2 , a2 ,b2 and Vx) for the set of 123 compounds are also recorded in Table 1. It is recorded that here Cnar is the narcotic concentration, that is the minimum concentration recorded for narcosis in mol dm3. As stated in the introduction section it is to be noted that the narcotic concentration is related to the partition coefficient of the solute between water and olive oil2 and other related partition coefficients.3
logð1=Cnar Þ ¼ 0:7712 þ 1:0463ð0:1991ÞR2 1:2475ð0:2124ÞpH 2 1:9686 ð0:2578ÞbO 2 þ 1:2220 ð0:3824ÞVx þ 0:0144ð0:0050ÞW
ð6Þ
þ 0:5087ð0:1143Þ 1 wðor BÞ 0:0528ð0:0177ÞlogRB n ¼ 123; SE ¼ 0:5387; r ¼ 0:8856;
The topological indices (W, Sz, 1w, B, J, logRB) as calculated from the Luko-1 program of Lukovits are presented in Table 2. The correlation matrix as given in Table 3 reflects the mutual correlatedness. In addition, this correlation matrix gives the correlations of molecular descriptors (topological indices and Abraham’s molecular descriptors with tadpole narcosis ðlog 1=Cnar Þ.
Our results, therefore, show that the combination of topological indices with the Abraham’s molecular descriptors can be used to model tadpole narcosis of higher number of compounds (i.e., 123 compounds rather than 84 compounds).
The initial regression procedure, using maximum R2 method,18 has indicated that statistically useful regression models start coming with the model having six
In view of the above Abraham and Rafols1 added additional compounds to the set of 84 compounds from the Overton’s data22,23 thus making a new set consisting
F ¼ 59:720; Q ¼ 1:6439; p ¼ 0:000
4525
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533 Table 1. Name of the compounds used, their tadpole narcosis ðlog1=Cnar Þ and Abraham’s molecular descriptors Compd no. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
Compd name
Logð1=Cnar Þ
R2
IIH 2
aH 2
bo2
Vx
Pentane 2-Methylbut-2-ene Trichloromethane Tetrachloromethane Chloroethane 1,2-Dichloromethane Bromoethane Iodoethane Diethylether Propanone Butanone Pentan-2-one Pentan-3-one Camphor Ethylformate Methylacetate Ethylacetate Propylacetate Butylacetate Isobutylacetate Pentylacetate Ethylpropanoate Ethylbutanoate Ethylpentanoate Butylpentanoate Ethylisobutanoate Triacetin Acetonitrile Nitrometane Pentanamide N-Ethylurethane Methanol Ethanol Propan-1-ol Propan-2-ol Butan-1-ol 2-Methylprpan-1-ol 2-Methylpropan-2-ol 3-Methylbutan-1-ol 2-Methylbutan-2-ol Octan-1-ol Menthol Ethane-1,2-diol Ethanethiol Carbondisulphide Triethylphosphate Benzene m-Xylene Naphthalene Phenanthrene Methylphenylether 1,3-Dimetoxybenzene 1,4-Dimetoxybenzene Acetophenone Aniline N,N-Dimethylanilene Diphenylamine Azobenzene Acetanilide p-Metoxyacetanilide p-Etoxyacetanilide Phenol o-Cresol m-Cresol p-Cresol 2-Isopropylphenol 4-tert-Pentylphenol 2-Methoxyphenol Catechol Resorcinol Hydroquinone
2.55 2.64 2.85 3.14 2.35 2.63 2.57 2.96 1.47 0.54 1.04 1.72 1.54 2.88 1.16 1.10 1.52 1.96 2.30 2.24 2.72 1.96 2.37 2.72 3.60 2.24 1.64 0.44 1.09 1.30 1.40 0.24 0.54 0.96 0.89 1.42 1.35 0.89 1.64 1.20 3.40 3.97 0.19 2.09 3.28 1.96 2.68 3.42 4.19 5.25 2.82 3.35 3.05 3.04 1.96 2.85 4.43 4.74 2.31 2.09 2.55 2.28 2.92 2.75 2.75 4.26 4.52 2.57 2.12 1.64 2.12
0.000 0.159 0.425 0.458 0.227 0.416 0.366 0.640 0.041 0.179 0.166 0.143 0.154 0.450 0.146 0.142 0.106 0.092 0.071 0.052 0.067 0.087 0.068 0.049 0.033 0.034 0.136 0.237 0.313 0.400 0.236 0.278 0.246 0.236 0.212 0.224 0.217 0.180 0.192 0.194 0.199 0.400 0.404 0.392 0.876 0.000 0.610 0.623 1.340 2.055 0.708 0.816 0.806 0.818 0.955 0.957 0.700 0.680 0.870 0.970 0.940 0.805 0.840 0.822 0.820 0.822 0.810 0.837 0.970 0.980 1.000
0.000 0.080 0.490 0.380 0.400 0.640 0.400 0.400 0.250 0.700 0.700 0.680 0.660 0.850 0.660 0.640 0.620 0.600 0.600 0.570 0.600 0.580 0.580 0.580 0.560 0.550 1.300 0.900 0.950 1.300 0.820 0.440 0.420 0.420 0.360 0.420 0.390 0.300 0.390 0.300 0.420 0.480 0.900 0.350 0.260 1.000 0.520 0.520 0.920 1.290 0.750 1.010 1.000 1.010 0.960 0.840 0.880 1.200 1.400 1.630 1.600 0.890 0.860 0.880 0.870 0.790 0.890 0.910 1.070 1.000 1.000
0.000 0.000 0.150 0.000 0.000 0.100 0.000 0.000 0.000 0.040 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.070 0.060 0.500 0.240 0.430 0.370 0.370 0.330 0.370 0.370 0.310 0.370 0.310 0.370 0.320 0.580 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.260 0.000 0.600 0.000 0.500 0.480 0.480 0.600 0.520 0.570 0.570 0.520 0.560 0.220 0.850 1.100 1.160
0.000 0.070 0.020 0.000 0.100 0.110 0.120 0.150 0.450 0.490 0.510 0.510 0.510 0.560 0.380 0.450 0.450 0.450 0.450 0.470 0.450 0.450 0.450 0.450 0.450 0.470 0.350 0.320 0.310 0.620 0.610 0.470 0.480 0.480 0.560 0.480 0.480 0.600 0.480 0.600 0.480 0.610 0.780 0.240 0.030 1.060 0.140 0.160 0.200 0.260 0.290 0.450 0.500 0.480 0.500 0.470 0.380 0.440 0.670 0.860 0.840 0.300 0.300 0.340 0.310 0.440 0.410 0.520 0.520 0.580 0.600
0.8131 0.7710 0.6167 0.7391 0.5128 0.6352 0.5654 0.6486 0.7309 0.5470 0.6879 0.8288 0.8288 1.3161 0.6057 0.6057 0.7466 0.8875 1.0284 1.0284 1.1693 0.8875 1.0284 1.1693 1.4511 1.0284 1.5999 0.4042 0.4237 0.9286 0.9873 0.3082 0.4491 0.5900 0.5900 0.7309 0.7309 0.7309 0.8718 0.8718 1.2950 1.4677 0.5078 0.5539 0.4905 1.3934 0.7164 0.9982 1.0854 1.4540 0.9160 1.1160 1.116 1.0139 0.8162 1.0980 1.4240 1.4808 1.1133 1.3133 1.4542 0.7751 0.9160 0.9160 0.9160 1.3387 1.4796 0.9747 0.8338 0.8338 0.8338 (continued)
4526
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533
Table 1 (continued) Compd no. 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123
Compd name
Logð1=Cnar Þ
R2
IIH 2
aH 2
bo2
Vx
Vanilin Eugenol Phenylthiourea Coumarin Phthalide Piperonol Pyridine Quinoline Antipyrine Caffine Morphine Phenylurea Acetamide Methylurethane Nicotine 2-Propylpiperidine Urea N-Ethylurethane N-Propylurethane N-Isobutylurethane N-Isopentylurethane Methanol Ethanol Propan-1-ol Butan-1-ol Pentan-1-ol Hexan-1-ol Heptan-1-ol Octan-1-ol Nonan-1-ol Decan-1-ol Undecan-1-ol Dodecan-1-ol Butan-2-ol Pentan-2-ol Hexan-2-ol Heptan-2-ol Octan-2-ol Cyclohexanol Cycloheptanol Cyclooctanol Cyclodecanol Pentan-1,5-diol Hexan-1,6-diol Heptan-1,7-diol Octan-1,8-diol Nonan-1,9-diol Decan-1,10-diol Dodecan-1,12-diol Acetal Benzamide Benzylalcohol
2.48 3.91 2.18 3.24 2.37 2.78 1.60 2.72 1.89 1.92 2.76 2.34 0.77 0.57 3.51 3.48 0.60 1.46 2.18 2.50 2.93 0.23 0.72 1.14 1.97 2.54 3.24 3.64 4.24 4.43 1.90 5.09 5.33 1.77 2.32 2.86 3.48 4.21 2.30 2.89 3.41 4.08 1.72 1.60 2.48 3.02 3.19 3.60 4.41 1.92 2.52 2.70
1.040 0.946 1.250 1.060 0.950 0.990 0.631 1.268 1.320 1.400 2.200 1.110 0.460 0.263 0.865 0.364 0.500 0.236 0.225 0.202 0.190 0.278 0.246 0.236 0.224 0.219 0.210 0.211 0.199 0.193 0.191 0.181 0.175 0.217 0.195 0.187 0.188 0.158 0.460 0.513 0.578 0.621 0.388 0.385 0.381 0.380 0.370 0.370 0.360 0.000 0.990 0.803
1.040 0.990 1.720 1.790 1.900 1.600 0.840 0.970 1.500 1.550 2.340 1.400 1.300 0.820 1.340 0.440 1.000 0.820 0.820 0.790 0.790 0.440 0.420 0.420 0.420 0.420 0.420 0.420 0.420 0.420 0.420 0.420 0.420 0.360 0.360 0.360 0.360 0.360 0.540 0.540 0.540 0.540 0.950 0.950 0.950 0.950 0.950 0.950 0.950 0.670 1.500 0.870
0.320 0.220 0.490 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.860 0.770 0.540 0.240 0.000 0.100 0.500 0.240 0.240 0.240 0.240 0.430 0.370 0.370 0.370 0.370 0.370 0.370 0.370 0.370 0.370 0.370 0.370 0.330 0.330 0.330 0.330 0.330 0.320 0.320 0.320 0.320 0.720 0.750 0.750 0.750 0.750 0.750 0.750 0.000 0.490 0.390
0.670 0.510 0.780 0.460 0.460 0.520 0.470 0.510 1.480 1.340 1.790 0.770 0.680 0.610 0.940 0.690 0.900 0.610 0.610 0.610 0.610 0.470 0.480 0.480 0.480 0.480 0.480 0.480 0.480 0.480 0.480 0.480 0.480 0.560 0.560 0.560 0.560 0.560 0.570 0.580 0.580 0.580 0.920 0.920 0.920 0.920 0.920 0.920 0.920 0.076 0.670 0.560
1.1313 1.3544 1.1774 1.0619 0.9640 1.0227 0.6753 1.0443 1.5502 1.3632 2.0648 1.0726 0.5059 0.8464 1.3710 1.2270 0.4648 0.9873 1.1282 1.2691 1.4100 0.3082 0.4491 0.5900 0.7309 0.8718 0.0127 1.1536 1.2950 1.4354 1.5763 1.7170 1.8580 0.7309 0.8718 1.0127 1.1536 1.2950 0.9040 1.0450 1.1860 1.4677 0.9305 1.0714 1.2123 1.3532 1.4941 1.6350 1.9168 1.0714 0.9728 0.9160
of 114 compounds. They obtained the following regression model for modeling logð1=Cnar Þ:
O is the six parametric model containing R2, pH 2 b2 , Vx, W, and logRB and is found as below:
logð1=Cnar Þ ¼ 0:595 þ 0:805ð0:107ÞR2 0:725
logð1=Cnar Þ ¼ 0:4461 þ 0:9574ð0:1281ÞR2
ð0:135ÞpH 2
2:489ð0:166ÞbO 2
þ 3:341ð0:115ÞVx
0:5530ð0:1310ÞpH 2 2:4493 ð7Þ
ð0:1729ÞbO 2 þ 3:3111
n ¼ 114; SE ¼ 0:341; r ¼ 0:9520;
ð0:1839ÞVx þ 0:0095ð0:0031ÞW
F ¼ 263:000
0:0330ð0:0109ÞlogRB
When we attempted our methodology to the above referred set of 114 compounds we obtained slightly better model than the model given by eq 7. However, ours
n ¼ 114; SE ¼ 0:3217; r ¼ 0:9592; F ¼ 205:103; Q ¼ 2:9816; p ¼ 0:000
ð8Þ
4527
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533 Table 2. Value of various calculated topological indices for the compounds used in the present study Compd
Sz
W
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
20 20 9 16 4 10 4 4 20 9 18 32 32 174 20 18 32 52 79 71 114 50 76 110 200 71 424 4 9 143 98 4 10 20 18 35 32 28 29 44 165 291 35 10 4 166 54 110 243 677 109 192 200 142 142 142 505 498 228 360 445 109 142 146 150 366 372 184 184 192 200
20 20 9 16 4 10 4 4 20 9 18 32 32 123 20 18 32 52 79 71 114 50 76 110 200 71 424 4 9 143 98 4 10 20 18 35 32 28 29 44 165 191 35 10 4 166 27 62 109 271 64 121 125 88 88 88 307 399 156 250 316 64 86 88 90 244 252 117 117 121 125
1
w (=B)
J
Log(RB)
2.4142 2.4142 1.7321 2.0000 1.4142 1.9142 1.4142 1.4142 2.4142 1.7321 2.2701 2.7701 2.7701 4.9830 2.4142 2.2701 2.7701 3.2701 3.7701 3.6639 4.2701 3.3081 3.8081 4.3081 5.3081 3.6639 6.9136 1.4142 1.7321 4.6807 4.2187 1.4142 1.9142 2.4142 2.2701 2.9142 2.7701 2.5607 2.8425 3.1213 4.9142 5.6471 2.9142 1.9142 1.4142 5.1820 3.0000 3.7877 4.9663 6.9495 3.9319 4.8637 4.8637 4.3045 4.3045 4.3045 6.8770 6.9319 5.2152 6.1471 6.6471 3.9319 4.3425 4.3257 4.3257 6.1851 6.0977 4.8805 4.8805 4.8637 4.8637
2.1906 2.1906 2.3238 3.0237 1.6330 1.9747 1.6330 1.6330 2.1906 2.3238 2.5395 2.6272 2.6272 2.3963 2.1906 2.5395 2.6272 2.6783 2.7158 3.0988 2.7467 2.8318 2.8621 2.8766 3.0171 3.0988 3.8357 1.6330 2.3238 3.1296 3.3248 1.6330 1.9747 2.1906 2.5395 2.3391 2.6272 3.1685 2.0939 3.3604 2.6476 2.5012 2.3391 1.9747 1.6330 3.7902 2.0000 2.1924 1.9254 1.4792 2.1250 2.2465 2.1738 2.2284 2.2284 2.2284 1.8180 1.8247 2.3206 2.4103 2.3868 2.125 2.2973 2.2317 2.1804 2.4968 2.4021 2.3400 2.3400 2.2465 2.1738
5.6630 5.6630 2.0794 4.1589 0.6931 2.4849 0.6931 0.6931 5.6630 2.0794 4.9698 9.5342 9.5342 39.5886 5.6630 4.9698 9.5342 15.9311 24.3021 22.2872 34.7732 15.4203 23.6090 33.9259 60.3280 22.2872 128.6710 0.6931 2.0794 44.2428 31.0637 0.6931 2.4849 5.6630 4.9698 10.4505 9.5342 8.1479 8.1479 13.5231 48.9613 61.5396 10.4505 2.4849 0.6931 53.1685 7.4547 19.0038 34.4240 86.8067 19.6969 38.3361 39.0780 27.6625 27.6625 27.6625 95.8633 113.7509 49.7028 79.2185 98.6359 19.6969 27.1516 27.6625 28.0679 77.8967 78.9751 37.4198 37.4198 38.3361 39.0780 (continued)
4528
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533
Table 2 (continued) Compd 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123
Sz
W
303 376 340 308 185 248 54 243 414 423 2224 340 29 70 323 187 65 98 135 168 219 4 10 20 35 56 84 120 165 220 286 364 455 31 50 76 110 153 109 121 217 381 120 165 220 286 364 455 560 75 228 148
198 254 250 144 106 147 27 109 284 258 940 250 29 70 242 121 65 98 135 168 219 4 10 20 35 56 84 120 165 220 286 364 455 31 50 76 110 153 64 88 121 206 120 165 220 286 364 455 560 75 156 94
1
w (=B)
J
Log(RB)
5.8124 6.2956 6.1259 5.3602 4.8707 5.3982 3.0000 4.9663 6.6984 6.5366 11.2019 6.1259 2.6427 3.6807 5.9319 4.8425 3.5534 4.2187 4.7187 5.1294 5.6294 1.4142 1.9142 2.4142 2.9142 3.4142 3.9142 4.4142 4.9142 5.4142 5.9142 6.4142 6.9142 2.8081 3.3081 3.8081 4.3081 4.8081 3.9319 4.4319 4.9319 5.9319 4.4142 4.9142 5.4142 5.9142 6.4142 6.9142 7.4142 3.8081 5.2152 4.4319
2.4127 2.3891 2.3975 1.9323 2.0146 1.9100 2.0000 1.9254 2.0049 2.2264 1.6177 2.3975 2.9935 3.1708 1.9062 2.2513 3.4642 3.3248 3.3759 3.6998 3.7309 1.6330 1.9747 2.1906 2.3391 2.4475 2.5301 2.5951 2.6476 2.6909 0.7272 2.7582 2.7848 2.7542 2.8318 2.8621 2.8766 2.8862 2.1250 2.1827 2.1611 2.1690 2.5951 2.6476 2.6909 2.7272 2.7582 2.7848 2.8079 2.9196 2.3206 2.0779
63.1978 80.4242 79.2185 45.7908 33.5078 46.4839 7.4547 34.4240 90.7187 84.4145 297.6837 79.2185 8.5533 21.9995 72.3677 38.3361 20.5724 31.0637 42.5846 53.4405 68.8326 0.6931 2.4849 5.6630 10.4505 17.0297 25.5549 36.1595 48.9613 64.0657 81.5680 101.5552 124.1073 9.2462 15.4203 23.6090 33.9259 46.4764 19.6969 27.7803 38.5184 65.6057 36.1595 48.9613 64.0657 81.5680 101.5552 124.1073 149.2986 23.3858 49.7028 29.2719
Table 3. Correlation matrix for the inter-correlation of various parameters
Logð1=Cnar Þ R2 IIH 2 aH 2 bo2 Vx Sz W 1 w (=B) J logRB
Logð1=Cnar Þ
R2
IIH 2
aH 2
bo2
Vx
Sz
W
1.0000 0.2792 0.0334 0.0069 0.1016 0.6653 0.4088 0.5055 0.5826 0.0772 0.4935
1.0000 0.7176 0.1971 0.2964 0.2710 0.6108 0.4163 0.5238 0.4667 0.4577
1.0000 0.2062 0.5560 0.3706 0.6037 0.5414 0.6231 0.0769 0.5737
1.0000 0.4269 0.1545 0.3071 0.3693 0.3501 0.0435 0.3616
1.0000 0.4539 0.5716 0.5935 0.5943 0.1926 0.6093
1.0000 0.7002 0.8501 0.8948 0.1800 0.8468
1.0000 0.9077 0.8242 0.1372 0.9305
1.0000 0.9098 0.0474 0.9962
1
w (=B)
J
LogRB
1.0000 0.0950 0.9216
1.0000 0.0402
1.0000
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533
Thus, our model expressed by (eq 8) gave slightly better results than given by Abraham and Rafols1 model (eq 7) but it suffers from the defect in that six parameters used in place of four parameters has been used by the Abraham and Rafols.1 This better statistics is due to the presence of W and logRB indices. The other points raised in discussing the merits/demerits of Abraham and Rafols1 model are applicable to our model (eq 8) also. In order to confirm our results we have estimated logð1=Cnar Þ from the above most appropriate models and compared them with the experimental logð1=Cnar Þ values. Such a comparison is shown in Table 4. The comparison and the difference between observed and estimated tadpole narcosis (residue) shows that the model expressed by (eq 6) is the best model for modeling tadpole narcosis of 123 compounds and that (eq 8) is the best model for this purpose containing 114 compounds. The predicted correlation coefficients estimated from Figures 1–3 (r2=0.7843) and (r2=0.9201) also confirm our findings. Finally, it is worth mentioning that the Abraham and Rafols1 model based on logP for a set of 114 compounds is found as: logð1=Cnar Þ ¼ 1:210 þ 0:835ð0:031ÞlogPoct
4529
Topological indices Wiener index (W). The Wiener index (W) is a widely used topological index.12 It is based on the vertex distances of the respective molecular graph. Molecular graph can be denoted by G and having v1, v2, v3,. . .,vn as its vertices. Let d vi ; vj GÞ stands for the shortest distance between the vertices vi and vj. Then the Wiener index is defined as:
W ¼ WðGÞ ¼ 1=2
n X n X d vi ; vj GÞ
ð10Þ
i¼1 j¼1
Szeged index (Sz). Let e be an edge of the molecular graph G. Let n1 ðejGÞ be the number of vertices of G lying closer to one end of e; let n2 ðejGÞ be the number of vertices of G lying closer to the other end of e. Then the Szeged index (Sz) is defined13,14 as:
SzðGÞ ¼ Sz ¼
X n1 ðejGÞn2 ðejGÞ
ð11Þ
e
with the summation going over all the edges of G. ð9Þ
n ¼ 114; r ¼ 0:9301; SD ¼ 0:419; F ¼ 718
However, for the small set of 114 compounds, we obtained a model (eq 8) which gave better statistics than the model expressed by (eq 9).
In cyclic graphs, there are edges equidistant from both the ends of edge e; by definition of Sz such edges are not taken into account. First-order connectivity index (1,). The connectivity index w=w(G) of a graph G is defined by Randic24,25 as under (eq 12): X 0:5 w ¼ wðGÞ ¼ di dj ð12Þ ij
Conclusion The aforementioned results show that we can combine topological and Abraham’s molecular descriptors to obtain a model with better statistics. However, record that the model expressed by (eq 9) uses a single correlating parameter compared to six parametric model expressed by (eq 8).
Experimental
where di and dj are the valence of a vertex i and j, equal to the number of bonds connected to the atoms i and j, in G. In the case of hetero-systems the connectivity is given in terms of valence delta values dvi and dvj of atoms i and j and is denoted by wv. This version of the connectivity index is called the valence connectivity index and is defined24,25 as under (eq 13): i0:5 Xh wv ¼ wv ðGÞ ¼ dvi dvj ð13Þ
Tadpole narcosis The tadpole narcosis data were adopted from the work of Abraham and Rafols.1
ij
where the sum is taken over all bonds i–j of the molecule. Valence delta values are given by the following expression:
Abraham’s molecular descriptors. Abraham’s molecular O descriptors (R2, pH 2 , b2 , Vx) were adopted from the work of Abraham and Rafols.1
dvi ¼
Molecular graphs. The carbon-hydrogen suppressed molecular graphs were used for the calculation of topological indices W, Sz, 1w, B, J, logRB (Table 2).
where Zi is the atomic number of atom i, Zvi is the number of valence electron of the atom i and Hi is the number of hydrogen atoms attached to atom i.
Zvi Hi Zi Zj 1
ð14Þ
4530
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533
Table 4. Comparison of observed and estimated tadpole narcosis ðlogð1=Cnar Þ from different models Compd
Obs. Logð1=Cnar Þ
Estimated logð1=Cnar Þ using Model-5
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67
2.55 2.64 2.85 3.14 2.35 2.63 2.57 2.96 1.47 0.54 1.04 1.72 1.54 2.88 1.16 1.1 1.52 1.96 2.3 2.24 2.72 1.96 2.37 2.72 3.6 2.24 1.64 0.44 1.09 1.3 1.4 0.24 0.54 0.96 0.89 1.42 1.35 0.89 1.64 1.2 3.4 3.97 0.19 2.09 3.28 1.96 2.68 3.42 4.19 5.25 2.82 3.35 3.05 3.04 1.96 2.85 4.43 4.74 2.31 2.09 2.55 2.28 2.92 2.75 2.75 4.26 4.52
Model-6
Model-8
Est.
Res.
Est.
Res.
Est.
Res.
3.03 2.928 2.577 3.071 2.004 2.295 2.277 2.784 1.825 1.033 1.325 1.657 1.691 2.946 1.368 1.26 1.574 1.92 2.246 2.196 2.61 1.929 2.252 2.591 3.363 2.192 3.4 0.855 0.977 1.435 1.824 0.864 1.175 1.508 1.396 1.837 1.857 1.688 2.187 2.041 3.292 3.519 0.504 2.122 3.057 1.507 2.829 3.485 4.137 5.452 2.927 2.99 2.917 2.696 2.402 3.271 3.905 4.018 2.258 2.318 2.748 2.549 2.952 2.842 2.912 3.76 4.1
0.48 0.288 0.273 0.069 0.346 0.335 0.293 0.176 0.355 0.493 0.285 0.063 0.151 0.066 0.208 0.16 0.054 0.04 0.054 0.044 0.11 0.031 0.118 0.129 0.237 0.048 1.76 0.415 0.113 0.135 0.424 0.624 0.635 0.548 0.506 0.417 0.507 0.798 0.547 0.841 0.108 0.451 0.314 0.032 0.223 0.453 0.149 0.065 0.053 0.202 0.107 0.36 0.133 0.344 0.442 0.421 0.525 0.722 0.052 0.228 0.198 0.269 0.032 0.092 0.162 0.5 0.42
2.982 2.859 2.22 2.708 1.68 1.954 1.85 2.18 1.727 0.69 1.06 1.448 1.484 2.904 1.31 1.127 1.502 1.889 2.24 2.156 2.614 1.926 2.275 2.626 3.39 2.162 3.387 0.501 0.722 1.587 1.918 0.705 1.095 1.487 1.314 1.865 1.827 1.573 2.039 1.992 3.384 3.558 0.591 1.936 2.644 1.359 2.882 3.496 4.237 5.432 3.007 3.036 2.958 2.658 2.582 3.138 4.256 4.195 2.252 2.209 2.607 2.742 3.12 2.991 3.068 3.963 4.071
0.432 0.219 0.63 0.432 0.67 0.676 0.72 0.78 0.257 0.15 0.02 0.272 0.056 0.024 0.15 0.027 0.018 0.071 0.06 0.084 0.106 0.034 0.095 0.094 0.21 0.078 1.747 0.061 0.368 0.287 0.518 0.465 0.555 0.527 0.424 0.445 0.477 0.683 0.399 0.792 0.016 0.412 0.401 0.154 0.636 0.601 0.202 0.076 0.047 0.182 0.187 0.314 0.092 0.382 0.622 0.288 0.174 0.545 0.058 0.119 0.057 0.462 0.2 0.241 0.318 0.297 0.449
3.142 2.939 2.592 3.137 1.91 2.338 2.169 2.633 1.669 0.859 1.254 1.693 1.714 3.26 1.3 1.139 1.565 2.009 2.437 2.376 2.888 2.013 2.439 2.871 3.787 2.37 — 0.745 0.881 1.57 1.903 0.353 0.774 1.222 1.04 1.662 1.673 1.401 2.133 1.856 3.474 3.72 0.095 1.887 2.707 1.739 2.783 3.632 4.228 5.598 2.992 3.151 3.038 2.779 2.234 3.309 4.178 4.311 2.398 2.486 2.978 2.517 2.997 2.873 2.956 3.908 4.422
0.592 0.299 0.258 0.003 0.44 0.292 0.401 0.327 0.199 0.319 0.214 0.027 0.174 0.38 0.14 0.039 0.045 0.049 0.137 0.136 0.168 0.053 0.069 0.151 0.187 0.13 — 0.305 0.209 0.27 0.503 0.113 0.234 0.262 0.15 0.242 0.323 0.511 0.493 0.656 0.074 0.25 0.095 0.203 0.573 0.221 0.103 0.212 0.038 0.348 0.172 0.199 0.012 0.261 0.274 0.459 0.252 0.429 0.088 0.396 0.428 0.237 0.077 0.123 0.206 0.352 0.098 (continued)
4531
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533 Table 4 (continued) Compd
Obs. Logð1=Cnar Þ
Estimated logð1=Cnar Þ using Model-5
68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123
2.57 2.12 1.64 2.12 2.48 3.91 2.18 3.24 2.37 2.78 1.6 2.72 1.89 1.92 2.76 2.34 0.77 0.57 3.51 3.48 0.6 1.46 2.18 2.5 2.93 0.23 0.72 1.14 1.97 2.54 3.24 3.64 4.24 4.43 1.9 5.09 5.33 1.77 2.32 2.86 3.48 4.21 2.3 2.89 3.41 4.08 1.72 1.6 2.48 3.02 3.19 3.6 4.41 1.92 2.52 2.7
Model-6
Model-8
Est.
Res.
Est.
Res.
Est.
Res.
2.631 2.297 2.286 2.296 2.876 3.658 2.401 2.368 1.884 2.273 1.843 3.337 2.345 2.117 3.388 2.3 0.35 1.52 2.634 2.816 0.184 1.824 2.163 2.494 2.854 0.864 1.175 1.508 1.837 2.179 0.071 2.905 3.292 3.717 4.19 4.702 5.279 1.742 2.054 2.391 2.756 3.105 2.285 2.664 3.083 3.825 1.267 1.664 2.093 2.568 3.08 3.664 4.65 2.846 1.968 2.444
0.061 0.177 0.646 0.176 0.396 0.252 0.221 0.872 0.486 0.507 0.243 0.617 0.455 0.197 0.628 0.04 0.42 0.95 0.876 0.664 0.416 0.364 0.017 0.006 0.076 0.634 0.455 0.368 0.133 0.361 3.169 0.735 0.948 0.713 2.29 0.388 0.051 0.028 0.266 0.469 0.724 1.105 0.015 0.226 0.327 0.255 0.453 0.064 0.387 0.452 0.11 0.064 0.24 0.926 0.552 0.256
2.871 2.639 2.619 2.619 3.097 3.791 2.371 2.422 1.903 2.446 1.805 3.439 1.97 1.914 2.672 2.515 0.221 1.576 2.511 2.926 0.501 1.918 2.258 2.554 2.89 0.705 1.095 1.487 1.865 2.241 1.389 2.997 3.384 3.798 4.249 4.733 5.273 1.727 2.078 2.438 2.811 3.163 2.443 2.825 3.228 3.92 1.383 1.778 2.195 2.647 3.131 3.677 4.447 2.878 2.081 2.606
0.301 0.519 0.979 0.499 0.617 0.119 0.191 0.818 0.467 0.334 0.205 0.719 0.08 0.006 0.088 0.175 0.549 1.006 0.999 0.554 0.099 0.458 0.078 0.054 0.04 0.475 0.375 0.347 0.105 0.299 1.851 0.643 0.856 0.632 2.349 0.357 0.057 0.043 0.242 0.422 0.669 1.047 0.143 0.065 0.182 0.16 0.337 0.178 0.285 0.373 0.059 0.077 0.037 0.958 0.439 0.094
2.579 2.151 2.06 2.044 2.774 3.808 2.45 2.723 2.275 2.49 1.682 3.236 2.103 1.836 2.852 2.17 0.171 — — 2.813 — 1.903 2.332 2.749 3.183 0.353 0.774 1.222 1.662 2.107 — 3.011 — 3.96 — 5.017 5.602 1.494 1.917 2.353 2.805 — 2.145 2.601 3.09 3.981 1.071 1.541 2.03 2.548 3.089 3.679 4.773 — 1.992 2.326
0.009 0.031 0.42 0.076 0.294 0.102 0.27 0.517 0.095 0.29 0.082 0.516 0.213 0.084 0.092 0.17 0.599 — — 0.667 — 0.443 0.152 0.249 0.253 0.123 0.054 0.082 0.308 0.433 — 0.629 — 0.47 — 0.073 0.272 0.276 0.403 0.507 0.675 — 0.155 0.289 0.32 0.099 0.649 0.059 0.45 0.472 0.101 0.079 0.363 — 0.528 0.374
—Indicates deletion of nine compounds from the larger set of 123 making smaller set of 118 compounds; the model expressed by (eq 8) uses only 114 compounds.
Now-a-days, the connectivity and the valence connectivity indices expressed by eqs 12 and 13 are termed as first-order connectivity and first-order valence connectivity indices, respectively. Branching index (B) and logRB. The branching index B and logRB have been calculated by the method as
described in literature.15,16 Balaban index (J). The Balaban index, J (the average distance sum connectivity index) is defined17 by: 1=2 M X J ¼ Jð G Þ di dj ð15Þ m þ 1 bonds
4532
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533
where M is the number of bonds in a graph G, m is the cyclomatic number of G and di’s (i=1, 2, 3,. . .,N) are the distance sums (distance degrees) of atoms in G such that
(ii) The off-diagonal elements: ðDÞij di ¼
X kr
ð19Þ
r
di ¼
N X ðDÞij
ð16Þ
j¼1
The cyclomatic number m of G indicates the number of independent cycles in G and is equal to the minimum number of cuts (removal of bonds) necessary to convert a polycylic structure into an acyclic structure: m¼MNþ1
ð17Þ
One way to compute the Balaban index (J) for heterosystem is to modify the elements of the distance matrix for hetero-system as follows: (i) The diagonal elements: ðDÞij ¼ 1 ðZc =Zi Þ
ð18Þ
where Zc=6 and Zi=atomic number of the given element.
where the summation is over all bonds. The bond parameter kr is given by: kr ¼ 1=br Z2c =Zi Zj where br is the bond weight with values: 1 for single bond, 2 for double bond, 1.5 for aromatic bond and 3 for triple bond. Quality factor (Q). The quality factor (Q) is defined19,20 as the ratio of correlation coefficient (R) to the standard error of estimation (SE), that is Q ¼ R=SE. Thus, higher the value of R, the lower the SE, the larger will be Q, and better will be the quality of the model. Regression analysis. Maximum R2 improvement method is to identify prediction models.18 This method finds the ‘best’ one variable model, the ‘best’ two variable model and so forth for the prediction of property/activity. Several models (combinations of variables) were examined to identify combinations of variables with good prediction capabilities. In all regression models developed a variety of statistics associated with residues, that is the Wilks–Shapiro test for normality and Cooks Dstatistics for outliers, to obtain the most reliable results were examined.18
Figure 1. Correlation between observed and estimated logð1=Cnar Þ using eq 5.
Multiple regression analyses for correlating tadpole narcosis of the present set of compounds with the aforementioned molecular descriptors were carried out using Regress-1 software as supplied by Professor I. Lukovits, Hungarian Academy of Sciences, Budapest, Hungary. Several multiple regressions were attempted using correlation matrix from this program and the best results were considered and discussed in developing QSAR and hence, for modeling the tadpole narcosis of the compounds.
Figure 2. Correlation between observed and estimated logð1=Cnar Þ using eq 6.
Figure 3. Correlation between observed and estimated logð1=Cnar Þ using eq 8.
V. K. Agrawal et al. / Bioorg. Med. Chem. 11 (2003) 4523–4533
Computations. All the computations were carried out in Power Macintosh 9600/233.
Acknowledgements The authors are thankful to Professor I. Lukovits, Hungarian Academy of Sciences, Budapest, Hungary for providing software to carryout regression analysis and to Prof. Ivan Gutman, Faculty of Science, University of Kragujevac, Yugoslavia for introducing one of the authors (P.V.K.) to this fascinating field of chemical graph theory and topology.
References and Notes 1. Abraham, M. H.; Rafols, C. J. Chem. Soc, Perkin Trans. 1995, 1843 and references therein. 2. Meyer, K. H.; Hemmi, H. Biochem. Zeit. 1935, 277, 39. 3. Dearden, J. C. Envir. Health Persp. 1985, 61, 203. 4. Chaturvedi, S.K. QSAR Studies on Non-polar Narcosis; PhD Thesis, A.P.S. University, Rewa, India, and references therein. 5. Khadikar, P. V.; Singh, S.; Shrivastava, A. Bioorg. Med. Chem. 2002, 12, 1125. 6. Khadikar, P. V.; Phadnis, A.; Shrivastava, A. Bioorg. Med. Chem. 2002, 10, 1181. 7. Khadikar, P. V.; Agrawal, V. K.; Shrivastava, A. Bioorg. Med. Chem. 2002, 10, 3499.
4533
8. Khadikar, P.V.; Mandloi, D.; Bajaj, A.V.; Joshi, S. Bioorg. Med. Chem. Lett. 2003, 13, 419. 9. Khadikar, P. V.; Karmarkar, S.; Varma, R. G. Acta Chim. Slov. 2002, 49, 755. 10. Khadikar, P. V.; Mathur, K. C.; Singh, S.; Phadnis, A.; Shrivastava, A.; Mandloi, M. Bioorg. Med. Chem. 2002, 10, 1761. 11. Agrawal, V. K.; Singh, J.; Khadikar, P. V. Bioorg. Med. Chem. 2002, 10, 3981. 12. Wiener, H. J. Am. Chem. Soc. 1947, 69, 17. 13. Gutman, I. Graph Theory Notes (New York) 1994, 27, 9. 14. Khadikar, P. V.; Deshpande, N. V.; Kale, P. P.; Dobrynin, A.; Gutman, I.; Domotor, G. J. Chem. Inf. Comput. Sci. 1995, 35, 547. 15. Devillers, J.; Balaban, A. T. Topological Indices and Related Descriptors in QSAR and QSPR; Gorden & Breach: Williston, VT, 2000. 16. Diudea M. V. QSPR/QSAR Studies by Molecular Descriptors; Babes-Bolyai University: Cluj, Romania, 2000. 17. Balaban, A. T. Chem. Phys. Lett. 1982, 89, 399. 18. Chaterjee, S.; Hadi, A. S.; Price, B. Regression Analysis by Examples, 3rd ed.; Wiley: New York, 2000. 19. Pogliani, L. Amino Acids 1994, 6, 141. 20. Pogliani, L. J. Phys. Chem. 1996, 100, 18065. 21. Todeschini, R. Chemometrics Web News; Milano Chemometric & QSAR Research Group, file:/C/WINDOWS/Desktop/Web news on Chemometrics.html, Feb. 2001. 22. Overton, C. E. Studies Uberdie Narkose; Fischer: Jena, 1901. 23. Overton, C. E. In Studies on Narcosis; Lipnick, R. L., Ed.; Chapman and Hall; London, 1991. 24. Randic, M. J. Am. Chem. Soc. 1975, 97, 6609. 25. Randic, M. Croat. Chem. Acta 1993, 66, 289.