Chapter 11
Virtual Screening of Phytochemicals
Manabendra D. Choudhury*, Walid A. Atteya*, Keshav Dahal*, Pankaj Chetia†, Karabi D. Choudhury†, Anant Paradkar*
*University of Bradford, Bradford, United Kingdom
†Assam University, Silchar, India
Chapter Outline
11.1. Introduction
11.1.1 Artificial Neural Networks (ANNs)
11.1.2 Application of ANNs in Pharmaceutical Science
11.1.3 ANNs in Predicting Bioactivity
11.1.4 Gossypol and its Derivatives
11.2. Materials and Methods
11.2.1 Input and Output Vector Definition for Data
11.2.2 Software and Hardware Environment
11.2.3 Modelling Procedure
11.2.4 Training and Test Data Set
11.2.5 Experimental Data Set
11.2.6 Docking Experiment
11.3. Results and Discussion
11.4. Conclusions
Acknowledgements
References
11.1. INTRODUCTION
The fundamental objective of any drug discovery process is to reduce the time and cost of bringing an effective drug to the market, and the techniques used in drug discovery aim to shorten the time needed to identify drug candidates. Although the Quantitative Structure Activity Relationship (QSAR) technique (Perkins et al., 2003; Ishikawa et al., 2012) has been applied in drug discovery for quite some time, it seldom has a high impact on lead discovery because it mainly influences the later stages of drug development, particularly the prediction of IC50 and a few other parameters. A revolutionary change in conventional methods of testing bioactivity of plant
Computational Phytochemistry. https://doi.org/10.1016/B978-0-12-812364-5.00011-0 © 2018 Elsevier Inc. All rights reserved.
materials has taken place with the advent of computers and computational techniques. Computer-aided tools have significantly reduced the time necessary for bioactivity assessments, and a quick and cost-effective prediction of the biological activity of plant materials is now possible based on physicochemical properties. Although Computer-Aided Drug Discovery (CADD) (Zhang, 2011; Sliwoski et al., 2014) has emerged as a broad subject in the field of medicinal plant research, computational molecular docking and computational target fishing (Wang and Xie, 2014; Katsila et al., 2016) are the specific techniques in this field. Above all, Artificial Neural Networks (ANNs) (Dayhoff and DeLeo, 2001) have proved to be a promising technique for direct prediction of a specific property of specific biomolecules. This chapter, using a specific example, describes how ANNs can be used to predict the possible biological activity of a few drug molecules.
11.1.1 Artificial Neural Networks (ANNs)
ANNs are biologically inspired computer programmes designed to simulate the way in which the human brain processes information (Dayhoff and DeLeo, 2001). ANNs gather their knowledge by detecting patterns and relationships in data and learn (or are trained) through experience, not from programming; this is the basic difference between ANNs and classical computer programmes. Another significant difference is that the algorithms used for data analysis are flexible and can be changed at any time during the analysis. The distinctive feature of ANNs is their ability to deal effectively with multidimensional problems, including those with several thousand features. An ANN is formed from hundreds of single units, i.e. artificial neurons or processing elements, which are connected by coefficients (weights), constitute the neural structure, and are organized in layers. The power of neural computation comes from connecting neurons in a network: the better the neurons are connected, the better the predicted output. The activity of a neural network is determined by the transfer functions of its neurons, by the learning rule, and by the architecture itself. A successful ANN study depends on minimizing the prediction error by optimizing the interunit connections during training. By repeating this process through trial and error, the network reaches the specified level of accuracy. Once the network has been trained to a minimum prediction error and tested, it may be used with new input information to predict the output. The information in ANNs is encoded in the strength of the network's 'synaptic' connections (Zupan and Gasteiger, 1993; Kaliszan et al., 2003). Recent studies on ANNs are mainly centred on designing new network types by changing the transfer functions of neurons, the learning rule, and the connection formula.
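The description of neurons, weights, and transfer functions above can be sketched in a few lines of Python. This is only an illustration, not the implementation used in this chapter (which relied on MATLAB): the function names, the example weights, and the choice of a hyperbolic tangent transfer function are assumptions made for the sketch.

```python
import math


def tansig(x):
    """Hyperbolic tangent sigmoid transfer function (assumed here for illustration)."""
    return math.tanh(x)


def neuron_output(inputs, weights, bias, transfer=tansig):
    """Weighted sum of the inputs plus a bias, passed through a transfer function."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(net)


def layer_output(inputs, weight_rows, biases, transfer=tansig):
    """A layer is a list of neurons that all receive the same inputs."""
    return [neuron_output(inputs, row, b, transfer)
            for row, b in zip(weight_rows, biases)]


# Hypothetical input vector and weights, chosen only to show the call pattern.
example_inputs = [0.5, -1.0, 2.0]
example_hidden = layer_output(
    example_inputs,
    weight_rows=[[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]],
    biases=[0.0, 0.1],
)
```

Training, in this picture, amounts to adjusting the entries of `weight_rows` and `biases` until the outputs match known targets.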
11.1.2 Application of ANNs in Pharmaceutical Science
The use of ANNs in drug discovery is relatively recent. It began at the end of the 1980s, when ANNs were applied to solve various chemical problems, including the study of Quantitative Structure Activity Relationships (QSAR). ANNs were found to be useful in compound classification, modelling of structure activity relationships, identification of potential drug targets, and localization of structural and functional features of biopolymers (Isu et al., 1994). Hussain et al. (1991) were the first to introduce ANNs to the field of pharmaceutical technology, pointing to the possible advantages of a guided search for the optimal pharmaceutical formulation. Various dosage forms have been the subjects of neural analysis: tablets (Bourquin et al., 1998a,b,c), pellets (Peh et al., 2000), capsules (Mendyk et al., 2007), emulsions (Gašperlin et al., 1998, 2000) and microemulsions (Agatonovic-Kustrin and Alany, 2001), hydrogels (Takayama et al., 1999, 2003), and transdermal delivery systems (Kandimalla et al., 1999). Reports are also available on the potential use of ANNs for pharmacological classification of drugs (Buciński et al., 2000), optimization of HPLC separations of bioactive compounds (Buciński and Bączek, 2002), in vitro permeability determination in Caco-2 cells (Paixão et al., 2010), and prediction of drug release and formulation (Chen et al., 1999; Petrović et al., 2009). Competitive adsorption of phenol and resorcinol from a water environment using carbonaceous adsorbents has also been modelled using ANNs (Aghav et al., 2011).
11.1.3 ANNs in Predicting Bioactivity
The use of ANNs to predict bioactivity classes from the physicochemical parameters of agents was demonstrated for dihydrofolate reductase inhibitors (So and Richards, 1992). Antitumour activity could also be predicted by ANNs (Zupan and Gasteiger, 1993, 1999). ANNs were proposed as decision support systems in dentistry (Brickley et al., 1998) and urology (Snow et al., 1999) and in the assessment of HIV/AIDS-related health performance (Lee and Park, 2001). The antioxidant capacity of cruciferous sprouts was also predicted using a neural network (Buciński et al., 2004). Prediction of specific bioactivities, such as anti-HIV, anticancer, and psychometric activity, for a series of molecules has been performed using ANNs (Vanyúr et al., 2003; Naik and Patel, 2009; Haghdadi and Fatemi, 2010).
11.1.4 Gossypol and its Derivatives
Gossypol (Fig. 11.1) is a phenolic aldehyde, yellow in colour, found in cottonseeds (Gossypium herbaceum). It can permeate cells and act as an inhibitor of several dehydrogenase enzymes, and it has been tested as a male oral contraceptive in China (Coutinho, 2002). In addition to its contraceptive properties, it also possesses antimalarial properties (Keshmiri-Neghab and Goliaei, 2014). The IUPAC name of the compound is 2,2′-bis-(formyl-1,6,7-trihydroxy-5-isopropyl-3-methylnaphthalene), and it has 14 derivatives available in the NCBI PubChem compound database. Although the parent compound gossypol is known for its male contraceptive property, the biological activities of its derivatives are still awaiting experimentation. We have undertaken a study involving gossypol and its derivatives, as described below, to demonstrate the application of ANNs as a tool for predicting the biological activity of these compounds, with the intention of suggesting possible new drug leads. The prediction obtained from the ANNs is cross-validated by an in silico search for receptors of the chosen ligands.

FIG. 11.1 Gossypol from Gossypium herbaceum.
11.2. MATERIALS AND METHODS
11.2.1 Input and Output Vector Definition for Data
Data on the physicochemical properties of 117 molecules whose biological activities are well known were collected from the NCBI PubChem Compound database. Twenty-two descriptors were recorded from the database for each of the 117 molecules. The physicochemical properties of all compounds in the data set were taken as input, and their respective biological activities were taken as output. Similarly, 22 descriptors were recorded for each of 6 compounds whose biological properties have not been tested; these formed the experimental data set. The physicochemical descriptors used for preparing the training, test, and experimental data sets are shown in Table 11.1.
TABLE 11.1 Physicochemical Descriptors Used for Preparing Training, Test, and Experimental Dataset

| Serial | Module Descriptor | Description |
|---|---|---|
| 1 | Molecular weight | Weight of all atoms in a compound |
| 2 | XLogP3 | Partition coefficient |
| 3 | H-bond donor | A hydrogen atom attached to a relatively electronegative atom plays the role of the hydrogen bond donor |
| 4 | H-bond acceptor | Any group which can share its electron for H-bond, e.g., O, N, etc. |
| 5 | Rotatable bond count | Number of rotatable bonds in a molecule |
| 6 | Exact mass | Sum of masses of the individual isotopes of a molecule |
| 7 | Mono-isotopic mass | Sum of the masses of the atoms in a molecule |
| 8 | Topological polar surface area | Topological polar surface area (TPSA) makes use of functional group contributions based on a large database of structures; it is a convenient measure of the polar surface area that avoids the need to calculate ligand 3D structure or to decide which is the relevant biological conformation or conformations |
| 9 | Heavy atom count | Number of atoms that contain more than the common number of neutrons |
| 10 | Formal charge | Charge assigned to an atom in a molecule |
| 11 | Complexity | Complexity describes the behaviour of a system or model whose components interact in multiple ways |
| 12 | Isotope atom count | Number of atoms with the same number of protons, but differing numbers of neutrons in a molecule |
| 13 | Defined atom stereo centre count | Count of any point in a molecule that leads to stereoisomerism |
| 14 | Undefined atom stereo centre count | Unknown group in a molecule that may lead to stereoisomerism |
| 15 | Defined bond stereo centre count | Nonrotatable bond known |
| 16 | Undefined bond stereo centre count | Nonrotatable bond unknown |
| 17 | Covalently bonded unit count | Count of groups bound through covalent bonds |
| 18 | Feature 3D acceptor count | Count of 3D feature acceptor of the molecule |
| 19 | Feature 3D ring count | Number of aromatic rings present in a molecule |
| 20 | Effective rotor count | Which provides flexibility |
| 21 | Conformer sampling RMSD | Deviation among all conformers upon reaction |
| 22 | CID conformer count | Number of different conformers of a molecule with same molecular formula |
| 23 | Biological activity | Activity of the compounds as drugs |

11.2.2 Software and Hardware Environment
The test bed used for the implementation of the ANNs was an Intel Core 2 Duo 2.93 GHz processor with 4 GB of RAM running 64-bit Windows 7 Enterprise Edition, Service Pack 1. The ANNs were implemented using MATLAB (Matrix Laboratory), version 7.12.0.635, Release 2011a, 64-bit for Windows. The MATLAB Neural Network Toolbox, together with some in-house MATLAB code, was used to design, implement, visualize, and simulate the neural networks. This allowed flexible implementation of the neural network algorithm and plotting of the required functions and data. The common and well-known backpropagation algorithm for training neural networks was chosen to minimize the objective function.
11.2.3 Modelling Procedure
To build a model that can predict the biological activity of unknown molecules from their physicochemical descriptors, a four-layered feed-forward neural network was developed. A sensitivity analysis was carried out by changing the number of neural network layers and the number of hidden nodes inside the layers to obtain the best prediction accuracy. Two other parameters had a significant effect on the error rate of the proposed model: the backpropagation training function and the hidden layer transfer functions. Initially, commonly used neural network parameters were applied (Buciński et al., 2004): a learning coefficient of 0.02, a momentum of 0.6, and a maximum of 3000 epochs. After the desired goal had been achieved, these parameters were changed to decrease the time needed to build the neural network model without affecting the goal. The final ANN model used in the present study is presented in Fig. 11.2.
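As a rough illustration of what training with a learning coefficient of 0.02, a momentum of 0.6, and a cap of 3000 epochs means in practice, the following Python sketch trains a very small network by backpropagation with momentum. It is not the MATLAB model described in this chapter: the one-input toy data, the single hidden layer of four tanh units, the error goal, and all function names are assumptions chosen to keep the example short.

```python
import math
import random


def forward(x, w1, b1, w2, b2):
    """One input -> tanh hidden layer -> single linear output."""
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    output = sum(v * h for v, h in zip(w2, hidden)) + b2
    return hidden, output


def mean_squared_error(samples, w1, b1, w2, b2):
    return sum((t - forward(x, w1, b1, w2, b2)[1]) ** 2
               for x, t in samples) / len(samples)


def train(samples, n_hidden=4, learning_rate=0.02, momentum=0.6,
          max_epochs=3000, goal=1e-4, seed=1):
    rng = random.Random(seed)
    w1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b1 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1.0, 1.0) for _ in range(n_hidden)]
    b2 = 0.0
    # Previous weight changes, reused by the momentum term.
    dw1 = [0.0] * n_hidden
    db1 = [0.0] * n_hidden
    dw2 = [0.0] * n_hidden
    db2 = 0.0
    history = [mean_squared_error(samples, w1, b1, w2, b2)]
    for _ in range(max_epochs):
        for x, t in samples:
            hidden, output = forward(x, w1, b1, w2, b2)
            error = output - t  # derivative of 0.5 * (output - t)**2
            for j in range(n_hidden):
                # Backpropagate through the output weight and the tanh unit.
                delta = error * w2[j] * (1.0 - hidden[j] ** 2)
                dw2[j] = momentum * dw2[j] - learning_rate * error * hidden[j]
                dw1[j] = momentum * dw1[j] - learning_rate * delta * x
                db1[j] = momentum * db1[j] - learning_rate * delta
                w2[j] += dw2[j]
                w1[j] += dw1[j]
                b1[j] += db1[j]
            db2 = momentum * db2 - learning_rate * error
            b2 += db2
        history.append(mean_squared_error(samples, w1, b1, w2, b2))
        if history[-1] <= goal:
            break  # stop early once the (assumed) error goal is reached
    return (w1, b1, w2, b2), history


# Toy data (hypothetical): learn t = x**2 on [-1, 1].
samples = [(i / 10.0, (i / 10.0) ** 2) for i in range(-10, 11)]
weights, history = train(samples)
```

The `history` list records the mean squared error after each epoch, which is the kind of curve a training-performance plot displays.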
11.2.4 Training and Test Data Set
The data recorded from the NCBI PubChem compound database for training and testing the neural networks were first cleaned of redundancy and then divided into a training data set (70%) and a test data set (30%) for internal validation. Before the data were split into training and test sets, the biological activity attribute was encoded as shown in Table 11.2, so that each type of bioactivity had a numerical representation.
FIG. 11.2 Artificial Neural Network model set out finally for present study. Inputs: the 22 physicochemical descriptors listed in Table 11.1 (molecular weight through CID conformer count). Output: BA_code (biological activity code).
TABLE 11.2 Biological Activity Attributes

| Biological Activity Code | Biological Activity Description/Name |
|---|---|
| 10 | Anabolic agent |
| 20 | Analgesic |
| 30 | Narcotic analgesic |
| 40 | Antiinflammatory |
| 50 | Anticonvulsant |
| 60 | Antineoplastic |
| 70 | Cardioprotection |
| 80 | Vasodilator |
| 90 | Antianxiety agent |
| 100 | Hypnotic |
| 110 | Contraceptive |
| 120 | Atherogenic |
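The encoding of Table 11.2 and the 70%/30% division can be sketched as follows. The code is illustrative only: the chapter does not state how records were assigned to the two sets, so the seeded random shuffle, the rounding rule, and the function names are assumptions.

```python
import random

# Activity names and codes as listed in Table 11.2.
ACTIVITY_CODES = {
    "Anabolic agent": 10,
    "Analgesic": 20,
    "Narcotic analgesic": 30,
    "Antiinflammatory": 40,
    "Anticonvulsant": 50,
    "Antineoplastic": 60,
    "Cardioprotection": 70,
    "Vasodilator": 80,
    "Antianxiety agent": 90,
    "Hypnotic": 100,
    "Contraceptive": 110,
    "Atherogenic": 120,
}


def encode_activity(name):
    """Map an activity name to its numerical code."""
    return ACTIVITY_CODES[name]


def split_train_test(records, train_fraction=0.7, seed=42):
    """Shuffle a copy of the records and cut it at the requested fraction.

    Random assignment is an assumption; the source only gives the proportions.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder record identifiers standing in for the 117 molecules.
train_set, test_set = split_train_test(range(117))
```

With 117 records this rule yields 82 training and 35 test records; a different rounding rule would shift one record between the sets.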
11.2.5 Experimental Data Set
Six gossypol derivatives were selected for prediction of their bioactivity: diaminogossypol, CID: 198041 (Compound-1); mono-aldehyde gossypol, CID: 195071 (Compound-2); gossylic lactone, CID: 5479154 (Compound-3); gossypol tetraacetic acid, CID: 130831 (Compound-4); ethyl gossypol, CID: 374353 (Compound-5); and gossypol-6,6'-dimethyl ether, CID: 25200979 (Compound-6). The descriptors of these six compounds, together with their compound identifier (CID) numbers, were downloaded from the NCBI PubChem compound database. These compounds were selected for the present investigation because their biological activities have not yet been tested, either in silico or in vivo. However, the parent compound, from which they are derived by chemical group substitution, is known for its oral male contraceptive property. The physical and chemical descriptors used for the experimental compounds are given in Table 11.3.
11.2.6 Docking Experiment
Once the ANN prediction was complete, the chosen compounds were treated as ligands, and docking experiments were performed to search for suitable targets/receptors in order to validate the ANN prediction. Because the predicted activity of the chosen ligands is contraceptive and the parent compound gossypol is known for its male contraceptive property, docking was carried out with the chosen ligands against acrosin and hyaluronidase, the two vital enzymes of human spermatozoa. Since 3D structures of acrosin and hyaluronidase could not be obtained from the RCSB Protein Data Bank, the structures of these molecules were predicted by homology modelling with Modeller 9v7. The active sites of the receptors (enzymes) were predicted using Q-SiteFinder (Laurie and Jackson, 2005). The structures of the chosen ligands were obtained from the NCBI PubChem database in SDF format, and docking experiments were carried out using BioSolveIT FlexX 1.3.0 (Stahl, 2000). As a control, the parent compound gossypol was also docked with these enzymes using the same software.
11.3. RESULTS AND DISCUSSION
The predicted output for the experimental compounds, recorded in Table 11.4, showed that compounds 3, 4, and 6 are contraceptive, compounds 1 and 2 are an antianxiety agent and a vasodilator, respectively, and compound 5 is hypnotic. As the parent compound gossypol has contraceptive properties, it was expected that some of its derivatives would have the same property. While the model was being trained, many attempts were made to obtain the best prediction accuracy by changing the neural network settings. These settings included the number of neural network layers, the number of hidden nodes inside the layers, and the layer propagation functions. The effect of the neural network parameters on the learning rate depends entirely on the
TABLE 11.3 Physicochemical Descriptors of the Gossypol Derivatives

| Descriptors | Compound 1 (CID:198041) | Compound 2 (CID:195071) | Compound 3 (CID:5479154) | Compound 4 (CID:130831) | Compound 5 (CID:374353) | Compound 6 (CID:25200979) |
|---|---|---|---|---|---|---|
| Molecular weight | 548.5837 | 490.5443 | 514.5226 | 686.7011 | 518.5544 | 546.6076 |
| XLogP3 | 5.6 | 6.9 | 7.7 | 6.3 | 6.7 | 7.6 |
| H-bond donor | 8 | 6 | 4 | 2 | 6 | 4 |
| H-bond acceptor | 10 | 7 | 8 | 12 | 8 | 8 |
| Rotatable bond count | 5 | 4 | 3 | 13 | 5 | 7 |
| Exact mass | 548.2159 | 490.1992 | 514.1628 | 686.2363 | 518.1941 | 546.2254 |
| Mono-isotopic mass | 548.2159 | 490.1992 | 514.1628 | 686.2363 | 518.1941 | 546.2254 |
| Topological polar surface area | 156 | 208 | 138 | 289 | 180 | 192 |
| Heavy atom count | 40 | 36 | 38 | 50 | 38 | 40 |
| Formal charge | 0 | 0 | 0 | 0 | 0 | 0 |
| Complexity | 848 | 773 | 890 | 1180 | 808 | 879 |
| Isotope atom count | 0 | 0 | 0 | 0 | 0 | 0 |
| Defined atom stereo centre count | 0 | 0 | 0 | 0 | 0 | 0 |
| Undefined atom stereo centre count | 0 | 0 | 0 | 0 | 0 | 0 |
| Defined bond stereo centre count | 0 | 0 | 0 | 0 | 0 | 0 |
| Undefined bond stereo centre count | 0 | 0 | 0 | 0 | 0 | 0 |
| Covalently bonded unit count | 1 | 1 | 1 | 1 | 1 | 1 |
| Feature 3D acceptor count | 2 | 1 | 2 | 6 | 2 | 4 |
| Feature 3D ring count | 8 | 6 | 4 | 2 | 6 | 4 |
| Effective rotor count | 5 | 4 | 3.4 | 13 | 5 | 7 |
| Conformer sampling RMSD | 1 | 0.8 | 0.8 | 1.4 | 0.8 | 1 |
| CID conformer count | 2 | 2 | 2 | 2 | 4 | 3 |
TABLE 11.4 ANNs Predicted Output of the Gossypol Derivatives

| Predicted Output for the Compounds | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Biological activity code | 90.0424 | 79.9101 | 109.9587 | 109.7493 | 100.0998 | 109.6576 |
| Biological activity names | Antianxiety agent | Vasodilator | Contraceptive | Contraceptive | Hypnotic | Contraceptive |
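The step from a continuous network output such as 109.9587 to an activity name can be illustrated with a short Python sketch. The chapter does not state the decoding rule explicitly; mapping each output to the nearest code of Table 11.2 is an assumption that happens to reproduce the names listed in Table 11.4. The function and variable names are hypothetical.

```python
# Codes and names from Table 11.2.
CODE_TO_ACTIVITY = {
    10: "Anabolic agent",
    20: "Analgesic",
    30: "Narcotic analgesic",
    40: "Antiinflammatory",
    50: "Anticonvulsant",
    60: "Antineoplastic",
    70: "Cardioprotection",
    80: "Vasodilator",
    90: "Antianxiety agent",
    100: "Hypnotic",
    110: "Contraceptive",
    120: "Atherogenic",
}


def decode_activity(network_output):
    """Return the nearest activity code and its name (assumed decoding rule)."""
    nearest = min(CODE_TO_ACTIVITY, key=lambda code: abs(code - network_output))
    return nearest, CODE_TO_ACTIVITY[nearest]


# Network outputs for compounds 1 to 6, as reported in Table 11.4.
predicted_codes = {1: 90.0424, 2: 79.9101, 3: 109.9587,
                   4: 109.7493, 5: 100.0998, 6: 109.6576}
decoded = {compound: decode_activity(value)
           for compound, value in predicted_codes.items()}
```

Because the activity codes are arbitrary labels spaced 10 apart, an output that falls midway between two codes would be ambiguous under this rule; none of the six reported outputs is in that situation.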
data set. Our goal was to optimize these settings until the error was minimized and reached an acceptable level of accuracy. Two quantities are particularly important in the learning process: the mean squared error (MSE) and the regression value. The MSE represents the learning accuracy of the neural network; it is the average squared difference between outputs and targets. Lower values are better, and zero means no error. The pattern of accuracy of the learning of the neural network for the present study is presented in Fig. 11.3. The MSE recorded during the present learning was 0.02, which indicated that the learning of the ANN in the present study was accurate. The MSE was calculated as follows:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N} \left(T_i - O_i\right)^2$$

where MSE is the mean squared error, N is the number of inputs, T_i are the target values, and O_i are the output values from the neural network model. The regression factor is a measure of the correlation between outputs and targets and indicates the prediction accuracy. A value of 1 means a close relationship; 0 means a random relationship. The regression factor of the neural network used in the present study is presented in Fig. 11.4. Its value in the present work was 0.99, which signifies the accuracy of the prediction. During the work, the data were processed into a format suitable for MATLAB, and the input and target data attributes were defined. Many experiments were executed by tuning the neural network parameters to obtain the best accuracy and the best target data. Fig. 11.5 shows a comparison between
FIG. 11.3 Pattern of accuracy of the learning of the neural network for the present study.

FIG. 11.4 Regression factor of the proposed neural network used in the present study. Plot of output against target with the data, the fitted line, and the line Y = T; R = 0.99179; fit: Output ≈ 0.97 × Target + 0.019.

FIG. 11.5 Comparison between the neural network output data and the target data.
the neural network output data and the target data. The data values for output and target are shown in Table 11.5. The data show how close the output values of the neural network model are to the target values. In the present study, the best results were obtained with a four-layered feed-forward multilayer perceptron neural network (counting the three hidden layers and the output layer). There were one input layer, three hidden layers, and one output layer, with ten artificial neurons in the first hidden layer, three each in the second and third hidden layers, and one artificial neuron in the output layer.
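The layer sizes given above (22 inputs, hidden layers of 10, 3, and 3 neurons, and 1 output) can be written down as a small Python sketch that counts the adjustable weights and biases and runs one forward pass. The random initial weights and the assignment of transfer functions (tanh in the hidden layers, linear at the output) are assumptions for illustration and are not necessarily the per-layer configuration the authors settled on.

```python
import math
import random

# 22 descriptors -> 10 -> 3 -> 3 hidden neurons -> 1 output (activity code).
LAYER_SIZES = [22, 10, 3, 3, 1]


def count_parameters(sizes):
    """Weights plus biases over all consecutive layer pairs."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


def init_network(sizes, seed=0):
    """Small random weights and zero biases (an arbitrary initialisation)."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(network, inputs):
    """Propagate one input vector through every layer in turn."""
    activations = list(inputs)
    last = len(network) - 1
    for index, (weights, biases) in enumerate(network):
        nets = [sum(w * a for w, a in zip(row, activations)) + b
                for row, b in zip(weights, biases)]
        # Assumed transfer functions: tanh in hidden layers, linear output.
        activations = nets if index == last else [math.tanh(n) for n in nets]
    return activations


network = init_network(LAYER_SIZES)
n_parameters = count_parameters(LAYER_SIZES)
prediction = forward(network, [0.0] * 22)
```

For these sizes the count is 230 + 33 + 12 + 4 = 279 adjustable parameters, which is more than the 117 molecules available; this gives some context for the attention paid to the held-out test set.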
TABLE 11.5 Predicted Output of Training and Test Set With Reference to Biological Activity Attribute (Target)

| Output | Target |
|---|---|
| 69.37214 | 70 |
| 39.70692 | 40 |
| 50.63383 | 50 |
| 25.8101 | 30 |
| 39.70048 | 40 |
| 40.06018 | 40 |
| 31.64394 | 30 |
| 39.57521 | 40 |
| 41.42172 | 40 |
| 40.02308 | 40 |
| 101.2782 | 100 |
| 40.05743 | 40 |
| 90.04237 | 90 |
| 18.92682 | 20 |
| 119.8577 | 120 |
| 29.47375 | 30 |
| 48.97252 | 50 |
| 28.35779 | 30 |
| 20.13945 | 20 |
| 88.79931 | 90 |
| 69.33511 | 70 |
| 80.70457 | 80 |
| 109.9587 | 110 |
| 61.64463 | 60 |
| 109.7493 | 110 |
| 80.10604 | 80 |
| 28.92282 | 30 |
| 20.35135 | 20 |
| 39.17767 | 40 |
| 9.11467 | 10 |
| 119.8899 | 120 |
| 9.632499 | 10 |
| 39.26161 | 40 |
| 79.91012 | 80 |
| 39.63317 | 40 |
| 18.29689 | 20 |
| 79.47068 | 80 |
| 39.09463 | 40 |
| 79.90057 | 80 |
| 120.63 | 120 |
| 90.56023 | 90 |
| 59.46885 | 60 |
| 59.87975 | 60 |
| 40.29981 | 40 |
| 40.22298 | 40 |
| 50.05615 | 50 |
| 110.3123 | 110 |
| 39.84036 | 40 |
| 39.74226 | 40 |
| 79.92715 | 80 |
| 20.22006 | 20 |
| 20.95283 | 20 |
| 39.32984 | 40 |
| 39.37376 | 40 |
| 38.44877 | 40 |
| 70.20587 | 70 |
| 10.16076 | 10 |
| 10.71323 | 10 |
| 20.37211 | 20 |
| 79.0657 | 80 |
| 39.44648 | 40 |
| 10.48536 | 10 |
| 19.84768 | 20 |
| 11.41296 | 10 |
| 90.43699 | 90 |
| 117.9819 | 120 |
| 18.36777 | 20 |
| 109.5418 | 110 |
| 39.51115 | 40 |
| 19.79686 | 20 |
| 41.86343 | 40 |
| 79.70558 | 80 |
| 39.39451 | 40 |
| 119.3081 | 120 |
| 30.23937 | 30 |
| 39.81784 | 40 |
| 79.94202 | 80 |
| 30.67188 | 30 |
| 69.83762 | 70 |
| 118.5441 | 120 |
| 22.49832 | 20 |
| 18.79895 | 20 |
| 20.71337 | 20 |
| 40.40256 | 40 |
| 18.81931 | 20 |
| 21.7158 | 20 |
| 20.12916 | 20 |
| 40.36983 | 40 |
| 40.05577 | 40 |
| 41.04731 | 40 |
| 39.6325 | 40 |
| 109.6576 | 110 |
| 37.64742 | 40 |
| 99.48608 | 100 |
| 37.87494 | 40 |
| 41.51219 | 40 |
| 110.2016 | 110 |
| 42.77141 | 40 |
| 119.1196 | 120 |
| 37.1825 | 40 |
| 29.26267 | 30 |
| 21.32417 | 20 |
| 79.47948 | 80 |
| 32.78027 | 30 |
| 120.059 | 120 |
| 19.59192 | 20 |
| 39.44699 | 40 |
| 99.15079 | 100 |
| 20.47612 | 20 |
| 69.74487 | 70 |
| 28.65926 | 30 |
| 39.48187 | 40 |
| 18.90024 | 20 |
| 118.4873 | 120 |
| 119.0782 | 120 |
| 100.0998 | 100 |
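The two accuracy measures discussed in this section, the mean squared error and the regression (correlation) value, can be computed from output/target pairs as in the following Python sketch. It uses only the first ten rows of Table 11.5 as sample data; the function names are hypothetical. Values computed on these raw activity codes are not expected to reproduce the reported MSE of 0.02, which may have been obtained on scaled data (an assumption, since the chapter does not say).

```python
import math


def mean_squared_error(targets, outputs):
    """Average squared difference between targets and outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)


def pearson_r(targets, outputs):
    """Pearson correlation coefficient between targets and outputs."""
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    covariance = sum((t - mean_t) * (o - mean_o)
                     for t, o in zip(targets, outputs))
    spread_t = math.sqrt(sum((t - mean_t) ** 2 for t in targets))
    spread_o = math.sqrt(sum((o - mean_o) ** 2 for o in outputs))
    return covariance / (spread_t * spread_o)


# First ten output/target pairs of Table 11.5.
sample_outputs = [69.37214, 39.70692, 50.63383, 25.8101, 39.70048,
                  40.06018, 31.64394, 39.57521, 41.42172, 40.02308]
sample_targets = [70, 40, 50, 30, 40, 40, 30, 40, 40, 40]

sample_mse = mean_squared_error(sample_targets, sample_outputs)
sample_r = pearson_r(sample_targets, sample_outputs)
```

Applying the same two functions to all rows of the table would give the corresponding whole-set values on the raw code scale.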
Different transfer functions, including 'tansig' and 'purelin', were tested in different layers to find the function that best fits the model. The best input layer function was 'tansig', the best hidden layer function was 'purelin', and the best backpropagation training function was 'trainlm'.

Docking of gossypol with the human sperm enzymes acrosin and hyaluronidase showed that it could bind successfully, with scores of −25.6376 and −24.2222 kcal/mol, respectively. Docking experiments with the compounds having ANN-predicted contraceptive activity (compounds 3, 4, and 6) showed that the affinity of gossypol-6,6'-dimethyl ether for acrosin and hyaluronidase was comparable to that of gossypol. All three derivatives had significant binding capability with these two enzymes, except that the affinity of gossypol tetraacetic acid for hyaluronidase was low (Tables 11.6, 11.7A–11.7H, and Fig. 11.6). Hyaluronidase and acrosin are two vital enzymes of human spermatozoa. Hyaluronidase is responsible for digestion of hyaluronan in the corona radiata, enabling conception (Alberts, 2008); acrosin makes sperm able to penetrate the ovum by lysis of the zona pellucida through the acrosome reaction, thus facilitating penetration of the sperm through the innermost glycoprotein layers (Honda et al., 2002). Inhibition of either of these enzymes by the gossypol derivatives in question will therefore have an adverse effect on the ability of the spermatozoa to enable conception. The majority of mammalian ova are covered in a layer of granulosa cells intertwined in an extracellular matrix that contains a high
TABLE 11.6 Docking Score of the Chosen Ligands and Gossypol Against Acrosin and Hyaluronidase

| Ligand | Enzyme | H-Bond Forming Amino Acids | Docking Score (kcal/mol) |
|---|---|---|---|
| Gossypol (CID 3503) | Acrosin | Tyr39, Val70, His71, Arg74, and Val99 | −25.6376 |
| Gossypol (CID 3503) | Hyaluronidase | Tyr179, His187, Ser222, Asn226, Thr227, and Gln228 | −24.2222 |
| Gossylic lactone (CID 5479154) | Acrosin | His71, Trp73, Val99, and Glu100 | −17.9569 |
| Gossylic lactone (CID 5479154) | Hyaluronidase | Tyr179, His187, Thr227, and Gln228 | −15.1068 |
| Gossypol tetraacetic acid (CID 130831) | Acrosin | Lys21, Ala23, and Trp156 | −12.1150 |
| Gossypol tetraacetic acid (CID 130831) | Hyaluronidase | Tyr357 and His359 | −5.1035 |
| Gossypol-6,6'-dimethyl ether (CID 25200979) | Acrosin | Trp30 and Cys136 | −20.7042 |
| Gossypol-6,6'-dimethyl ether (CID 25200979) | Hyaluronidase | Asp279, Asn353, Ser355, and Tyr357 | −19.5150 |
concentration of hyaluronan. When a capacitated sperm reaches the ovum, it is able to penetrate this layer with the assistance of hyaluronidase enzymes present on the surface of the sperm. Once this occurs, the sperm is capable of binding to the zona pellucida, and the acrosome reaction can then occur with the help of the enzyme acrosin. The lysis resulting from the acrosome reaction enables spermatozoa to reach the innermost glycoprotein layers of the ovum to effect conception (Alberts, 2008). Alteration of the activity of these two enzymes is, therefore, directly linked with male contraceptive activity. The docking experiments therefore, on the one hand, validated the ANN prediction of contraceptive action for compounds 3, 4, and 6 by showing their binding potential with two vital enzymes of human spermatozoa and, on the other hand, suggested possible molecular pathways by which these compounds may affect contraception.
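The comparison of each derivative with gossypol can be made explicit with a short Python sketch that uses the docking scores of Table 11.6. It assumes the usual convention that a more negative score indicates stronger predicted binding; the ratio to the gossypol score is a simple illustrative measure introduced here, not one used in the chapter, and the function names are hypothetical.

```python
# Docking scores (kcal/mol) from Table 11.6.
DOCKING_SCORES = {
    "Gossypol": {"acrosin": -25.6376, "hyaluronidase": -24.2222},
    "Gossylic lactone": {"acrosin": -17.9569, "hyaluronidase": -15.1068},
    "Gossypol tetraacetic acid": {"acrosin": -12.1150, "hyaluronidase": -5.1035},
    "Gossypol-6,6'-dimethyl ether": {"acrosin": -20.7042, "hyaluronidase": -19.5150},
}


def relative_scores(scores, reference="Gossypol"):
    """Score of each ligand divided by the reference score for the same enzyme."""
    ref = scores[reference]
    return {
        ligand: {enzyme: value / ref[enzyme] for enzyme, value in by_enzyme.items()}
        for ligand, by_enzyme in scores.items()
        if ligand != reference
    }


def rank_ligands(scores, enzyme):
    """Ligand names ordered from most negative to least negative score."""
    return sorted(scores, key=lambda ligand: scores[ligand][enzyme])


relative = relative_scores(DOCKING_SCORES)
acrosin_ranking = rank_ligands(DOCKING_SCORES, "acrosin")
```

On this measure the dimethyl ether reaches roughly four fifths of the gossypol score for both enzymes, whereas gossypol tetraacetic acid reaches only about one fifth for hyaluronidase, which matches the qualitative statements in the text.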
TABLE 11.7A–11.7H Docking Parameters of the Chosen Ligands and Gossypol Against Acrosin and Hyaluronidase (rank 1 pose in every case; no RMSD or Simil values are given; structure drawings omitted)

| Table | Ligand [CID] | Enzyme | Rank | Score | Match | Lipo | Ambig | Clash | Rot | #Match |
|---|---|---|---|---|---|---|---|---|---|---|
| 11.7A | Gossylic lactone [5479154] | Acrosin | 1 | −17.9569 | −17.3358 | −10.1318 | −11.1563 | 6.8671 | 8.4000 | 16 |
| 11.7B | Gossypol tetraacetic acid [130831] | Acrosin | 1 | −12.1150 | −17.3531 | −11.2893 | −9.5479 | 3.8753 | 16.8000 | 11 |
| 11.7C | Gossypol-6,6'-dimethyl ether [25200979] | Acrosin | 1 | −20.7042 | −23.6019 | −10.0613 | −6.9311 | 3.2901 | 11.2000 | 15 |
| 11.7D | Gossylic lactone [5479154] | Hyaluronidase | 1 | −15.1068 | −19.3266 | −5.5109 | −7.5996 | 3.5302 | 8.4000 | 14 |
| 11.7E | Gossypol tetraacetic acid [130831] | Hyaluronidase | 1 | −5.1035 | −16.2112 | −10.8139 | −7.8980 | 7.6196 | 16.8000 | 14 |
| 11.7F | Gossypol-6,6'-dimethyl ether [25200979] | Hyaluronidase | 1 | −19.5150 | −27.1367 | −8.1126 | −7.4221 | 6.5564 | 11.2000 | 17 |
| 11.7G | Gossypol [3503] | Acrosin | 1 | −25.6376 | −27.8142 | −12.0931 | −9.7579 | 7.4276 | 11.2000 | 21 |
| 11.7H | Gossypol [3503] | Hyaluronidase | 1 | −24.2222 | −31.7353 | −3.2238 | −6.9964 | 1.1334 | 11.2000 | 13 |
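A simple arithmetic check on the values of Tables 11.7A–11.7H is sketched below in Python. In the tabulated data, the total score differs from the sum of the five component terms (Match, Lipo, Ambig, Clash, Rot) by approximately 5.4 for every ligand and enzyme pair. This is an observation made from the tables themselves; interpreting the difference as a fixed offset of the scoring function is an assumption, and the function name is hypothetical.

```python
# (score, match, lipo, ambig, clash, rot) for each sub-table.
DOCKING_TERMS = {
    "11.7A": (-17.9569, -17.3358, -10.1318, -11.1563, 6.8671, 8.4000),
    "11.7B": (-12.1150, -17.3531, -11.2893, -9.5479, 3.8753, 16.8000),
    "11.7C": (-20.7042, -23.6019, -10.0613, -6.9311, 3.2901, 11.2000),
    "11.7D": (-15.1068, -19.3266, -5.5109, -7.5996, 3.5302, 8.4000),
    "11.7E": (-5.1035, -16.2112, -10.8139, -7.8980, 7.6196, 16.8000),
    "11.7F": (-19.5150, -27.1367, -8.1126, -7.4221, 6.5564, 11.2000),
    "11.7G": (-25.6376, -27.8142, -12.0931, -9.7579, 7.4276, 11.2000),
    "11.7H": (-24.2222, -31.7353, -3.2238, -6.9964, 1.1334, 11.2000),
}


def score_offset(terms):
    """Total score minus the sum of the five component terms."""
    score, *components = terms
    return score - sum(components)


offsets = {table: score_offset(terms) for table, terms in DOCKING_TERMS.items()}
```

A check of this kind is useful when a table has been transcribed from a rotated or scanned page, because a value placed in the wrong row or column breaks the otherwise constant difference.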
FIG. 11.6 Bonding pattern of chosen ligands and gossypol with their receptors. (A) 2D view and (B) 3D view of gossylic lactone and acrosin bonding. (C) 2D view and (D) 3D view of gossypol tetraacetic acid and acrosin bonding. (E) 2D view and (F) 3D view of gossypol-6,6'-dimethyl ether and acrosin bonding. (G) 2D view and (H) 3D view of gossylic lactone and hyaluronidase bonding. (I) 2D view and (J) 3D view of gossypol tetraacetic acid and hyaluronidase bonding. (K) 2D view and (L) 3D view of gossypol-6,6'-dimethyl ether and hyaluronidase bonding. (M) 2D view and (N) 3D view of gossypol and acrosin bonding. (O) 2D view and (P) 3D view of gossypol and hyaluronidase bonding.
11.4. CONCLUSIONS
Of the six gossypol derivatives, the ANN model predicted gossylic lactone, gossypol tetraacetic acid, and gossypol-6,6'-dimethyl ether to be contraceptives. Diaminogossypol and mono-aldehyde gossypol were predicted to have antianxiety and vasodilator activity, respectively, and ethyl gossypol was predicted to be hypnotic. Cross-validation of the ANN prediction of contraceptive action for gossylic lactone, gossypol tetraacetic acid, and gossypol-6,6'-dimethyl ether by docking experiments supported the ANN result. These three compounds are expected to exert male contraceptive action by inhibiting the acrosin and hyaluronidase enzymes of human spermatozoa.
ACKNOWLEDGEMENTS
Manabendra D. Choudhury sincerely acknowledges the Commonwealth Scholarship Commission for awarding him the Commonwealth Academic Staff Fellowship at the University of Bradford, United Kingdom. The support of the Bioinformatics Centre, Assam University, Silchar, India, in carrying out the docking part of the work is sincerely acknowledged.
REFERENCES
Agatonovic-Kustrin, S., Alany, R.G., 2001. Role of genetic algorithms and artificial neural networks in predicting the phase behavior of colloidal delivery systems. Pharm. Res. 18, 1049–1055.
Aghav, R.M., Kumar, S., Mukherjee, S.N., 2011. Artificial neural network modelling in competitive adsorption of phenol and resorcinol from water environment using some carbonaceous adsorbents. J. Hazard. Mater. 188, 67–77.
Alberts, B., 2008. Molecular Biology of the Cell. Garland Science, New York, p. 1298. ISBN 0-8153-4105-9.
Brickley, M.R., Shepherd, J.P., Armstrong, R.A., 1998. Neural networks. J. Dent. 26, 305–309.
Bourquin, J., Shmidli, H., Hoogevest, P., Van Leuenberger, H., 1998a. Advantages of Artificial Neural Networks (ANNs) as alternative modeling technique for data sets showing non-linear relationship using data from a galenical study on a solid dosage form. Eur. J. Pharm. Sci. 7, 5–16.
Bourquin, J., Shmidli, H., Hoogevest, P., Van Leuenberger, H., 1998b. Comparison of artificial neural networks (ANN) with classical modeling techniques using different experimental designs and data from a galenical study on a solid dosage form. Eur. J. Pharm. Sci. 6, 287–300.
Bourquin, J., Shmidli, H., Hoogevest, P., Van Leuenberger, H., 1998c. Pitfalls of artificial neural networks (ANN) modeling technique for data sets containing outlier measurements using a study on mixture properties of a direct compressed dosage form. Eur. J. Pharm. Sci. 7, 17–28.
Buciński, A., Bączek, T., 2002. Optimization of HPLC separations of flavonoids with the use of artificial neural networks. Pol. J. Food Nutr. Sci. 4, 47–51.
Buciński, A., Nasal, A., Kaliszan, R., 2000. Pharmacological classification of drugs based on neural network processing of molecular modeling data. Comb. Chem. High Throughput Screen. 3, 525–533.
Buciński, A., Zieliński, H., Kozłowska, H., 2004. Artificial neural networks for prediction of antioxidant capacity of cruciferous sprouts. Trends Food Sci. Technol. 15, 161–169.
Chen, Y., McCall, T.W., Baichwal, A.R., Meyer, M.C., 1999. The application of an artificial neural network and pharmacokinetic simulations in the design of controlled-release dosage forms. J. Control. Release 59, 33–41.
Coutinho, E.M., 2002. Gossypol: a contraceptive for men. Contraception 65, 259–263.
Dayhoff, J.E., DeLeo, J.M., 2001. Artificial neural networks: opening the black box. Cancer 91, 1615–1635.
Gašperlin, M., Tušar, L., Tušar, M., Kristl, J., Šmid-Korbar, J., 1998. Lipophilic semisolid emulsion systems: viscoelastic behavior and prediction of physical stability by neural network modeling. Int. J. Pharm. 168, 243–254.
Gašperlin, M., Tušar, L., Tušar, M., Šmid-Korbar, J., Zupan, J., Kristl, J., 2000. Viscosity prediction of lipophilic semisolid emulsion systems by neural network modeling. Int. J. Pharm. 196, 37–50.
Haghdadi, M., Fatemi, M.H., 2010. Artificial neural network prediction of the psychometric activities of phenylalkylamines using DFT-calculated molecular descriptors. J. Serb. Chem. Soc. 75, 1391–1404.
Honda, A., Siruntawineti, J., Baba, T., 2002. Role of acrosomal matrix proteases in sperm-zona pellucida interactions. Hum. Reprod. Update 8, 405–412.
Hussain, A.S., Yu, X., Johnson, R.D., 1991. Application of neural computing in pharmaceutical product development. Pharm. Res. 8, 1248–1252.
Ishikawa, T., Hirano, H., Saito, H., Sano, K., Ikegami, Y., Yamaotsu, N., Hirono, S., 2012. Quantitative structure-activity relationship (QSAR) analysis to predict drug-drug interactions of ABC transporter ABCG2. Mini Rev. Med. Chem. 12, 505–514.
Isu, Y., Nagashima, U., Hosoya, H., Aoyma, T., 1994. Development of neural network simulator for structure-activity correlation of molecules. J. Chem. Softw. 2, 76–95.
Kaliszan, R., Bączek, T., Buciński, A., Buszewski, B., Sztupecka, M., 2003. Prediction of gradient retention from the linear solvent strength (LSS), quantitative structure-retention relationships (QSRR), and artificial neural networks (ANN). J. Sep. Sci. 26, 271–282.
Kandimalla, K.K., Kanikkannan, N., Singh, M., 1999. Optimization of a vehicle mixture for the transdermal delivery of melatonin using artificial neural networks and response surface method. J. Control. Release 61, 71–82.
Katsila, T., Spyroulias, G.A., Patrinos, G.P., Matsoukas, M.-T., 2016. Computational approaches in target identification and drug discovery. Comput. Struct. Biotechnol. J. 14, 177–184.
Keshmiri-Neghab, H., Goliaei, B., 2014. Therapeutic potential of gossypol: an overview. Pharm. Biol. 52, 124–128.
Laurie, A., Jackson, R., 2005. Q-SiteFinder: an energy-based method for the prediction of protein–ligand binding sites. Bioinformatics 21, 1908–1916.
Lee, C.W., Park, J.A., 2001. Assessment of HIV/AIDS-related health performance using an artificial neural network. Inf. Manage. 38, 231–238.
Mendyk, A., Dorożyński, P., Jachowicz, R., 2007. Proceedings of International Joint Conference on Neural Networks, Orlando, FL, 12–17 August 2007. IEEE catalog number: 07CH37922C, ISBN: 1-04244-1380-X, ISSN: 1098-7576.
Naik, P.K., Patel, A., 2009. Prediction of anticancer/non-anticancer drugs based on comparative molecular moment descriptor using Artificial Neural Network and support vector machine. Dig. J. Nanomater. Biostruct. 4, 19–43.
Paixão, P., Gouveia, L.F., Morais, J.A.G., 2010. Prediction of the in vitro permeability determined in Caco-2 cells by using artificial neural networks. Eur. J. Pharm. Sci. 41, 107–117.
Peh, K.K., Lim, C.P., Quek, S.S., Khoh, K.H., 2000. Use of artificial neural networks to predict drug dissolution profiles and evaluation of network performance using similarity factor. Pharm. Res. 17, 1384–1388.
Perkins, R., Fang, H., Tong, W., Welsh, W.J., 2003. Quantitative structure-activity relationship methods: perspectives on drug discovery and toxicology. Environ. Toxicol. Chem. 22, 1666–1679.
Petrović, J., Ibrić, S., Betz, G., Jelena Parojčić, Đ.Z., 2009. Application of dynamic neural networks in the modeling of drug release from polyethylene oxide matrix tablets. Eur. J. Pharm. Sci. 38, 172–180.
Silwoski, G., Kothiwale, S., Meiler, J., Lowe Jr., E.W., 2014. Computational methods in drug discovery. Pharmacol. Rev. 66, 334–395.
Snow, P.B., Rodvold, D.M., Brandt, M.J., 1999. Artificial neural networks in clinical urology. Urology 54, 787–790.
So, S.S., Richards, W.G., 1992. Application of neural networks: quantitative structure-activity relationships of the derivatives of 2,4-diamino-5-(substituted-benzyl)pyrimidines as DHFR inhibitors. J. Med. Chem. 35, 3201–3207.
STAHL, 2000. Stahlbau, 69, 672. https://doi.org/10.1002/stab.200002470.
Takayama, K., Fujikawa, M., Nagai, T., 1999. Artificial neural network as a novel method to optimize pharmaceutical formulation. Pharm. Res. 16, 1–6.
Takayama, K., Fujikawa, M., Obata, Y., Morishita, M., 2003. Neural network based optimization of drug formulations. Adv. Drug Deliv. Rev. 55, 1217–1231.
Vanyúr, R., Héberger, K., Jakus, J., 2003. Prediction of anti-HIV-1 activity of a series of tetrapyrrole molecules. J. Chem. Inf. Comput. Sci. 43, 1829–1836.
Wang, L., Xie, X.-Q., 2014. Computational target fishing: what should chemogenomics researchers expect for the future of in silico drug design and discovery? Future Med. Chem. 6, 247–249.
Zhang, S., 2011. Computer-aided drug discovery and development. Methods Mol. Biol. 716, 23–38.
Zupan, J., Gasteiger, J., 1993. Neural Networks for Chemists. An Introduction. Wiley-VCH, Weinheim.
Zupan, J., Gasteiger, J., 1999. Neural Networks in Chemistry and Drug Design, second ed. Wiley-VCH, Weinheim.