Virtual Screening of Phytochemicals


Chapter 11

Manabendra D. Choudhury*, Walid A. Atteya*, Keshav Dahal*, Pankaj Chetia†, Karabi D. Choudhury†, Anant Paradkar*

*University of Bradford, Bradford, United Kingdom
†Assam University, Silchar, India



Chapter Outline
11.1. Introduction
11.1.1 Artificial Neural Networks (ANNs)
11.1.2 Application of ANNs in Pharmaceutical Science
11.1.3 ANNs in Predicting Bioactivity
11.1.4 Gossypol and its Derivatives
11.2. Materials and Methods
11.2.1 Input and Output Vector Definition for Data
11.2.2 Software and Hardware Environment
11.2.3 Modelling Procedure
11.2.4 Training and Test Data Set
11.2.5 Experimental Data Set
11.2.6 Docking Experiment
11.3. Results and Discussion
11.4. Conclusions
Acknowledgements
References

Computational Phytochemistry. https://doi.org/10.1016/B978-0-12-812364-5.00011-0 © 2018 Elsevier Inc. All rights reserved.

11.1. INTRODUCTION

The fundamental objective of any drug discovery process is to reduce the time and cost of bringing an effective drug to the market, and the techniques used in drug discovery aim to shorten the time needed to identify drug candidates. Although the Quantitative Structure Activity Relationship (QSAR) technique (Perkins et al., 2003; Ishikawa et al., 2012) has been applied in drug discovery for quite some time, it does not regularly have a high impact on lead discovery, because it mainly influences later stages of drug development, particularly the prediction of IC50 and a few other parameters. The advent of computers and computational techniques has brought a revolutionary change to the conventional methods of testing the bioactivity of plant materials. Computer-aided tools have significantly reduced the time necessary for bioactivity assessments, and a quick and cost-effective prediction of the biological activity of plant materials is now possible based on physicochemical properties. Computer-Aided Drug Discovery (CADD) (Zhang, 2011; Sliwoski et al., 2014) has emerged as a broad subject in medicinal plant research; within it, computational molecular docking and computational target fishing (Wang and Xie, 2014; Katsila et al., 2016) are specific techniques. Above all, Artificial Neural Networks (ANNs) (Dayhoff and DeLeo, 2001) have proved to be a promising technique for the direct prediction of a specific property of specific biomolecules. This chapter uses a specific example to describe how ANNs can be used to predict the possible biological activity of a few drug molecules.

11.1.1 Artificial Neural Networks (ANNs)

ANNs are biologically inspired computer programmes designed to simulate the way in which the human brain processes information (Dayhoff and DeLeo, 2001). ANNs gather their knowledge by detecting patterns and relationships in data, and they learn (or are trained) through experience rather than by explicit programming; this is the basic difference between ANNs and classical computer programmes. Another significant difference is that the algorithms used for data analysis are flexible and can be changed at any time during the analysis. The distinctive feature of ANNs is their ability to deal effectively with multidimensional problems, including problems with several thousand features. An ANN is formed from hundreds of single units, i.e. artificial neurons or processing elements, connected by coefficients (weights), which constitute the neural structure and are organized in layers. The computational power of a neural network comes from connecting neurons in a network: the better the neurons are connected, the better the predicted output. The activity of a neural network is determined by the transfer functions of its neurons, by the learning rule, and by the architecture itself. A successful result from an ANN study depends on minimizing the prediction error by optimizing the interunit connections during training. Through this process of trial and error, the network reaches the specified level of accuracy. Once the network has been trained to a minimum prediction error and tested, it may be used with new input information to predict the output. The information in ANNs is encoded in the strength of the network's 'synaptic' connections (Zupan and Gasteiger, 1993; Kaliszan et al., 2003). Recent studies on ANNs mainly centre on designing new network types by changing the transfer functions of the neurons, the learning rule, or the connection formula.
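As a minimal sketch of the ideas in this section, the following Python fragment shows a single artificial neuron (a weighted sum of inputs passed through a transfer function) and a layer built from such neurons. The function names, the choice of tanh as the transfer function, and all weight values are illustrative assumptions and are not taken from the chapter.

```python
import math

def neuron(inputs, weights, bias, transfer=math.tanh):
    """One processing element: weighted sum of inputs, then a transfer function."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return transfer(net)

def layer(inputs, weight_rows, biases, transfer=math.tanh):
    """A layer is a group of neurons that all receive the same inputs."""
    return [neuron(inputs, w, b, transfer) for w, b in zip(weight_rows, biases)]

# Hypothetical input and weights, chosen only for illustration.
x = [0.5, -1.0, 2.0]
hidden = layer(x, [[0.1, 0.2, 0.3], [-0.4, 0.5, 0.6]], [0.0, 0.1])

# A linear output unit combines the two hidden activations.
output = neuron(hidden, [1.0, -1.0], 0.0, transfer=lambda v: v)
print(hidden, output)
```

Training consists of adjusting the weights and biases so that the output approaches the known target for each training example; the prediction step itself is only the forward calculation shown here.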


11.1.2 Application of ANNs in Pharmaceutical Science

The use of ANNs in drug discovery is fairly recent: it began at the end of the 1980s, when ANNs were applied to various chemical problems, including the study of Quantitative Structure Activity Relationships (QSAR). ANNs were found to be useful in compound classification, modelling of structure activity relationships, identification of potential drug targets, and localization of structural and functional features of biopolymers (Isu et al., 1994). Hussain et al. (1991) were the first to introduce ANNs to the field of pharmaceutical technology, pointing to the possible advantages of a guided search for the optimal pharmaceutical formulation. Various dosage forms have been the subject of neural analysis: tablets (Bourquin et al., 1998a,b,c), pellets (Peh et al., 2000), capsules (Mendyk et al., 2007), emulsions (Gašperlin et al., 1998, 2000), microemulsions (Agatonovic-Kustrin and Alany, 2001), hydrogels (Takayama et al., 1999, 2003), and transdermal delivery systems (Kandimalla et al., 1999). Reports are also available on the potential use of ANNs for pharmacological classification of drugs (Buciński et al., 2000), optimization of HPLC separations of bioactive compounds (Buciński and Bączek, 2002), in vitro permeability determination in Caco-2 cells (Paixão et al., 2010), and prediction of drug release and formulation (Chen et al., 1999; Petrović et al., 2009). Competitive adsorption of phenol and resorcinol from water using carbonaceous adsorbents has also been modelled using ANNs (Aghav et al., 2011).

11.1.3 ANNs in Predicting Bioactivity

The use of ANNs to predict bioactivity classes from the physicochemical parameters of agents was demonstrated for dihydrofolate reductase inhibitors (So and Richards, 1992). Antitumour activity could also be predicted by ANNs (Zupan and Gasteiger, 1993, 1999). ANNs have been proposed as decision support systems in dentistry (Brickley et al., 1998) and urology (Snow et al., 1999) and in the assessment of HIV/AIDS-related health performance (Lee and Park, 2001). The antioxidant capacity of cruciferous sprouts has also been predicted using a neural network (Buciński et al., 2004). Specific bioactivities, such as anti-HIV, anticancer, and psychometric activity, have been predicted for series of molecules using ANNs (Vanyúr et al., 2003; Naik and Patel, 2009; Haghdadi and Fatemi, 2010).

11.1.4 Gossypol and its Derivatives

Gossypol (Fig. 11.1) is a yellow phenolic aldehyde found in cottonseeds (Gossypium herbaceum). It can permeate cells and act as an inhibitor of several dehydrogenase enzymes, and it has been tested as a male oral contraceptive in China (Coutinho, 2002). In addition to its contraceptive properties, it also possesses antimalarial properties (Keshmiri-Neghab and Goliaei, 2014). The IUPAC name of the compound is 2,2′-bis-(formyl-1,6,7-trihydroxy-5-isopropyl-3-methylnaphthalene), and it has 14 derivatives available in the NCBI PubChem Compound database. Although the parent compound gossypol is known for its male contraceptive property, the biological activities of its derivatives are still awaiting experimentation. The work described below uses gossypol and its derivatives to demonstrate the application of ANNs as a tool for predicting the biological activity of these compounds, with the intention of suggesting possible new drug leads. The prediction obtained from the ANNs is cross-validated by an in silico search for receptors of the chosen ligands.

FIG. 11.1 Gossypol from Gossypium herbaceum.

11.2. MATERIALS AND METHODS

11.2.1 Input and Output Vector Definition for Data

Data on the physicochemical properties of 117 molecules whose biological activities are well known were collected from the NCBI PubChem Compound database. Twenty-two descriptors were recorded from the database for each of the 117 molecules. The physicochemical properties of all compounds in the data set were taken as input and their respective biological activities as output. Similarly, 22 descriptors were recorded for each of six compounds whose biological properties have not been tested; these form the experimental data set. The physicochemical descriptors used to prepare the training, test, and experimental data sets are shown in Table 11.1.

11.2.2 Software and Hardware Environment

The test bed used for the implementation of the ANNs was an Intel Core 2 Duo 2.93 GHz processor with 4 GB of RAM running 64-bit Windows 7 Enterprise Edition, Service Pack 1. The ANNs were implemented in MATLAB (Matrix Laboratory), version 7.12.0.635, Release 2011a, 64-bit, for Windows. The MATLAB Neural Network Toolbox, together with some in-house MATLAB code, was used to design, implement, visualize, and simulate the neural networks. This allowed flexible implementation of the neural network algorithm and plotting of the required functions and data. The common and well-known back propagation algorithm for training neural networks was chosen to minimize the objective function.

TABLE 11.1 Physicochemical Descriptors Used for Preparing Training, Test, and Experimental Dataset

| Serial | Descriptor | Description |
|---|---|---|
| 1 | Molecular weight | Sum of the atomic weights of all atoms in a compound |
| 2 | XLogP3 | Computed octanol-water partition coefficient (log P) |
| 3 | H-bond donor | Number of hydrogen bond donors, i.e. hydrogen atoms attached to a relatively electronegative atom |
| 4 | H-bond acceptor | Number of groups that can share an electron pair to accept a hydrogen bond, e.g., O, N |
| 5 | Rotatable bond count | Number of rotatable bonds in a molecule |
| 6 | Exact mass | Mass of a molecule calculated from the masses of its individual isotopes |
| 7 | Mono-isotopic mass | Mass of a molecule calculated from the most abundant isotope of each element |
| 8 | Topological polar surface area | Topological polar surface area (TPSA) is computed from functional group contributions derived from a large database of structures. It is a convenient measure of the polar surface area that avoids the need to calculate the 3D structure of the ligand or to decide which is the relevant biological conformation or conformations. |
| 9 | Heavy atom count | Number of non-hydrogen atoms in a molecule |
| 10 | Formal charge | Sum of the formal charges assigned to the atoms of a molecule |
| 11 | Complexity | Rating of the structural complexity of a molecule, as computed by PubChem |
| 12 | Isotope atom count | Number of atoms specified as a particular isotope, i.e. with the same number of protons but a different number of neutrons |
| 13 | Defined atom stereo centre count | Number of atom stereocentres whose configuration is specified |
| 14 | Undefined atom stereo centre count | Number of atom stereocentres whose configuration is not specified |
| 15 | Defined bond stereo centre count | Number of nonrotatable (stereo) bonds whose configuration is specified |
| 16 | Undefined bond stereo centre count | Number of nonrotatable (stereo) bonds whose configuration is not specified |
| 17 | Covalently bonded unit count | Number of units held together by covalent bonds |
| 18 | Feature 3D acceptor count | Number of hydrogen bond acceptor features in the 3D structure of a molecule |
| 19 | Feature 3D ring count | Number of ring features (e.g. aromatic rings) in the 3D structure of a molecule |
| 20 | Effective rotor count | Measure of the flexibility of a molecule |
| 21 | Conformer sampling RMSD | Root-mean-square deviation (RMSD) used when sampling the conformers of a molecule |
| 22 | CID conformer count | Number of conformers available for the compound |
| 23 | Biological activity | Activity of the compound as a drug (output) |

11.2.3 Modelling Procedure

To build a model that can predict the biological activity of unknown molecules from their physicochemical descriptors, a four-layered feed-forward neural network was developed. A sensitivity analysis was carried out by changing the number of network layers and the number of hidden nodes in each layer to obtain the best prediction accuracy. Two other settings had a significant effect on the error rate of the proposed model: the back propagation training function and the hidden layer functions. Initially, commonly used neural network parameters were applied (Buciński et al., 2004): a learning coefficient of 0.02, a momentum of 0.6, and a maximum of 3000 epochs. After the desired goal had been achieved, these parameters were changed to decrease the time needed to build the neural network model without affecting the goal. The final ANN model for the present study is shown in Fig. 11.2.
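To illustrate what the learning coefficient, momentum, and epoch limit control, the following Python sketch applies gradient descent with momentum to a toy one-input linear model. Only the three parameter values (0.02, 0.6, and 3000) come from the text; the toy data, the error goal, and the function name `train` are assumptions made for illustration, and this is not the MATLAB training routine used in the study.

```python
LEARNING_RATE = 0.02   # learning coefficient quoted in the text
MOMENTUM = 0.6         # momentum quoted in the text
MAX_EPOCHS = 3000      # epoch limit quoted in the text

def train(xs, ts, goal=1e-6):
    """Fit t = w*x + b by gradient descent with momentum on the mean squared error."""
    w, b = 0.0, 0.0
    dw_prev, db_prev = 0.0, 0.0
    n = len(xs)
    for epoch in range(1, MAX_EPOCHS + 1):
        errors = [(w * x + b) - t for x, t in zip(xs, ts)]
        mse = sum(e * e for e in errors) / n
        if mse <= goal:          # stop early once the error goal is met
            break
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Momentum: each step reuses a fraction of the previous step.
        dw = MOMENTUM * dw_prev - LEARNING_RATE * grad_w
        db = MOMENTUM * db_prev - LEARNING_RATE * grad_b
        w, b = w + dw, b + db
        dw_prev, db_prev = dw, db
    return w, b, mse, epoch

# Hypothetical data generated from t = 2x + 1.
xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ts = [2 * x + 1 for x in xs]
w, b, mse, epochs = train(xs, ts)
print(w, b, mse, epochs)
```

A larger learning coefficient or momentum shortens training but can make the error oscillate; the epoch limit simply bounds the training time if the goal is never reached.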

11.2.4 Training and Test Data Set

The data recorded from the NCBI PubChem Compound database for training and testing the neural networks were, after removal of redundancy, divided into a training data set (70%) and a test data set (30%) for internal validation. Before the data were split into training and test sets, the biological activity attribute was encoded as shown in Table 11.2 to give a numerical representation of each type of bioactivity.

FIG. 11.2 Artificial Neural Network model set out finally for present study. Inputs: molecular weight, XLogP3, H-bond donor, H-bond acceptor, rotatable bond count, exact mass, mono-isotopic mass, topological polar surface area, heavy atom count, formal charge, complexity, isotope atom count, defined atom stereocenter count, undefined atom stereocenter count, defined bond stereocenter count, undefined bond stereocenter count, covalently-bonded unit count, feature 3D acceptor count, feature 3D ring count, effective rotor count, conformer sampling RMSD, and CID conformer count. Output: BA_code.

TABLE 11.2 Biological Activity Attributes

| Biological Activity Code | Biological Activity Description/Name |
|---|---|
| 10 | Anabolic agent |
| 20 | Analgesic |
| 30 | Narcotic analgesic |
| 40 | Antiinflammatory |
| 50 | Anticonvulsant |
| 60 | Antineoplastic |
| 70 | Cardioprotection |
| 80 | Vasodilator |
| 90 | Antianxiety agent |
| 100 | Hypnotic |
| 110 | Contraceptive |
| 120 | Atherogenic |
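The encoding in Table 11.2 and the 70/30 split can be sketched in Python as follows. The code table is copied from Table 11.2; the helper names `decode` and `split`, the nearest-code rule for mapping a continuous network output back to an activity class, the random seed, and the use of 117 as the record count are assumptions made for illustration.

```python
import random

ACTIVITY_CODES = {
    10: "Anabolic agent", 20: "Analgesic", 30: "Narcotic analgesic",
    40: "Antiinflammatory", 50: "Anticonvulsant", 60: "Antineoplastic",
    70: "Cardioprotection", 80: "Vasodilator", 90: "Antianxiety agent",
    100: "Hypnotic", 110: "Contraceptive", 120: "Atherogenic",
}

def decode(output):
    """Map a continuous network output to the nearest activity code (assumed rule)."""
    code = min(ACTIVITY_CODES, key=lambda c: abs(c - output))
    return code, ACTIVITY_CODES[code]

def split(records, train_fraction=0.7, seed=0):
    """Shuffle the records and divide them into training and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# 117 placeholder record indices; the real count after removing redundancy may differ.
train_set, test_set = split(range(117))
print(len(train_set), len(test_set))

# A network output of 109.9587 lies closest to code 110.
print(decode(109.9587))
```

Spacing the codes ten units apart leaves room for the network output to deviate from the target by a few units and still be assigned to the intended class.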


11.2.5 Experimental Data Set

Six gossypol derivatives were selected for prediction of their bioactivity: diaminogossypol, CID: 198041 (Compound-1); mono-aldehyde gossypol, CID: 195071 (Compound-2); gossylic lactone, CID: 5479154 (Compound-3); gossypol tetraacetic acid, CID: 130831 (Compound-4); ethyl gossypol, CID: 374353 (Compound-5); and gossypol-6,6'-dimethyl ether, CID: 25200979 (Compound-6). The descriptors of these six compounds, together with their compound identifier (CID) numbers, were downloaded from the NCBI PubChem Compound database. These compounds were selected for the present investigation because their biological activities have not yet been tested, either in silico or in vivo. However, the parent compound, from which they are derived by chemical group substitution, is known for its oral male contraceptive property. The physical and chemical descriptors of the experimental compounds are given in Table 11.3.

11.2.6 Docking Experiment

After the ANN prediction had been completed, the chosen compounds were treated as ligands, and docking experiments were performed to search for suitable targets/receptors in order to validate the ANN prediction. Because the predicted activity of the chosen ligands is contraceptive and the parent compound gossypol is known for its male contraceptive property, the ligands were docked against acrosin and hyaluronidase, two vital enzymes of human spermatozoa. Since 3D structures of acrosin and hyaluronidase could not be obtained from the RCSB Protein Data Bank, the structures were predicted by homology modelling with Modeller 9v7. The active sites of the receptors (enzymes) were predicted using Q-SiteFinder (Laurie and Jackson, 2005). The structures of the chosen ligands were obtained from the NCBI PubChem database in SDF format, and docking experiments were carried out using BioSolveIT FlexX 1.3.0 (Stahl, 2000). As a control, the parent compound gossypol was also docked with the same enzymes using the same software.

11.3. RESULTS AND DISCUSSION

The predicted output for the experimental compounds, recorded in Table 11.4, showed that compounds 3, 4, and 6 were predicted to be contraceptive, compounds 1 and 2 to be an antianxiety agent and a vasodilator, respectively, and compound 5 to be hypnotic. As the parent compound gossypol has contraceptive properties, it was expected that some of its derivatives would have the same property.

TABLE 11.3 Physicochemical Descriptors of the Gossypol Derivatives

| Descriptors | Compound 1 (CID: 198041) | Compound 2 (CID: 195071) | Compound 3 (CID: 5479154) | Compound 4 (CID: 130831) | Compound 5 (CID: 374353) | Compound 6 (CID: 25200979) |
|---|---|---|---|---|---|---|
| Molecular weight | 548.5837 | 490.5443 | 514.5226 | 686.7011 | 518.5544 | 546.6076 |
| XLogP3 | 5.6 | 6.9 | 7.7 | 6.3 | 6.7 | 7.6 |
| H-bond donor | 8 | 6 | 4 | 2 | 6 | 4 |
| H-bond acceptor | 10 | 7 | 8 | 12 | 8 | 8 |
| Rotatable bond count | 5 | 4 | 3 | 13 | 5 | 7 |
| Exact mass | 548.2159 | 490.1992 | 514.1628 | 686.2363 | 518.1941 | 546.2254 |
| Mono-isotopic mass | 548.2159 | 490.1992 | 514.1628 | 686.2363 | 518.1941 | 546.2254 |
| Topological polar surface area | 156 | 208 | 138 | 289 | 180 | 192 |
| Heavy atom count | 40 | 36 | 38 | 50 | 38 | 40 |
| Formal charge | 0 | 0 | 0 | 0 | 0 | 0 |
| Complexity | 848 | 773 | 890 | 1180 | 808 | 879 |
| Isotope atom count | 0 | 0 | 0 | 0 | 0 | 0 |
| Defined atom stereo centre count | 0 | 0 | 0 | 0 | 0 | 0 |
| Undefined atom stereo centre count | 0 | 0 | 0 | 0 | 0 | 0 |
| Defined bond stereo centre count | 0 | 0 | 0 | 0 | 0 | 0 |
| Undefined bond stereo centre count | 0 | 0 | 0 | 0 | 0 | 0 |
| Covalently bonded unit count | 1 | 1 | 1 | 1 | 1 | 1 |
| Feature 3D acceptor count | 2 | 1 | 2 | 6 | 2 | 4 |
| Feature 3D ring count | 8 | 6 | 4 | 2 | 6 | 4 |
| Effective rotor count | 5 | 4 | 3.4 | 13 | 5 | 7 |
| Conformer sampling RMSD | 1 | 0.8 | 0.8 | 1.4 | 0.8 | 1 |
| CID conformer count | 2 | 2 | 2 | 2 | 4 | 3 |

TABLE 11.4 ANNs Predicted Output of the Gossypol Derivatives

| Predicted output | Compound 1 | Compound 2 | Compound 3 | Compound 4 | Compound 5 | Compound 6 |
|---|---|---|---|---|---|---|
| Biological activity code | 90.0424 | 79.9101 | 109.9587 | 109.7493 | 100.0998 | 109.6576 |
| Biological activity name | Antianxiety agent | Vasodilator | Contraceptive | Contraceptive | Hypnotic | Contraceptive |

While training the model, many attempts were made to obtain the best prediction accuracy by changing the neural network settings. These settings included the number of neural network layers, the number of hidden nodes in each layer, and the layer propagation functions. The effect of the neural network parameters on the learning rate depends entirely on the data set. Our goal was to optimize these settings until the error was minimized and reached the required level of accuracy. Two quantities are very important in the learning process: the mean squared error and the regression value. The mean squared error (MSE) represents the learning accuracy of the neural network. It is the average squared difference between outputs and targets; lower values are better, and zero means no error. The pattern of accuracy of the learning of the neural network in the present study is presented in Fig. 11.3. The mean squared error recorded during learning was 0.02, which indicates that the learning of the ANN in the present study was accurate. The mean squared error was calculated as follows:

$$\mathrm{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left(T_i - O_i\right)^2$$

where MSE is the mean squared error, N is the number of input samples, T_i are the target values, and O_i are the output values from the neural network model.

FIG. 11.3 Pattern of accuracy of the learning of the neural network for the present study.

The regression factor is a measure of the correlation between outputs and targets and indicates the prediction accuracy. A value of 1 means a close relationship; 0 means a random relationship. The regression factor of the neural network used in the present study is presented in Fig. 11.4. Its value in the present work was 0.99, which signifies the accuracy of the prediction.

FIG. 11.4 Regression factor of the proposed neural network used in the present study. Output is plotted against target: R = 0.99179; fitted line Output ≈ 0.97 × Target + 0.019; reference line Y = T.

During the work, the data were processed into a format suitable for MATLAB, and the input and target data attributes were defined. Many experiments were run, tuning the neural network parameters to obtain the best accuracy and the closest match to the target data. Fig. 11.5 shows a comparison between the neural network output data and the target data. The output and target values are listed in Table 11.5. The data show how close the output values of the neural network model are to the target values.

FIG. 11.5 Comparison between the neural network output data and the target data.

In the present study, the best results were obtained with a four-layered feed-forward multilayer perceptron neural network, in which an input layer feeds three hidden layers and one output layer. There were ten artificial neurons in the first hidden layer, three each in the second and third hidden layers, and one artificial neuron in the output layer.
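The reported architecture (22 descriptor inputs, hidden layers of 10, 3, and 3 neurons, and one output neuron) can be sketched as a forward pass in plain Python. The layer sizes come from the text. The random weights, the zero-valued placeholder input, and the assignment of transfer functions to layers are assumptions made for illustration: the chapter names 'tansig' and a linear function among those tested, and the code follows one possible reading of that report. This is not the trained model.

```python
import math
import random

def tansig(n):
    # MATLAB's tansig is 2 / (1 + exp(-2n)) - 1, which equals tanh(n).
    return math.tanh(n)

def purelin(n):
    # Linear transfer function: output equals net input.
    return n

LAYER_SIZES = [22, 10, 3, 3, 1]   # inputs, three hidden layers, one output

def init_network(sizes, seed=0):
    """Create random weights and biases for each layer (illustrative values only)."""
    rng = random.Random(seed)
    net = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-0.5, 0.5) for _ in range(n_out)]
        net.append((weights, biases))
    return net

def forward(net, x, transfers):
    """Propagate one input vector through every layer in turn."""
    a = list(x)
    for (weights, biases), f in zip(net, transfers):
        a = [f(sum(w * v for w, v in zip(row, a)) + b) for row, b in zip(weights, biases)]
    return a

network = init_network(LAYER_SIZES)
transfers = [tansig, purelin, purelin, purelin]   # assumed assignment
descriptors = [0.0] * 22                          # placeholder for 22 scaled descriptor values
prediction = forward(network, descriptors, transfers)

# Number of adjustable parameters (weights plus biases) in this architecture.
n_parameters = sum(len(w) * len(w[0]) + len(b) for w, b in network)
print(prediction, n_parameters)
```

With these layer sizes the model has 279 adjustable parameters, which is worth keeping in mind given a data set of roughly one hundred molecules.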

TABLE 11.5 Predicted Output of Training and Test Set With Reference to Biological Activity Attribute (Target)

| Output | Target |
|---|---|
| 69.37214 | 70 |
| 39.70692 | 40 |
| 50.63383 | 50 |
| 25.8101 | 30 |
| 39.70048 | 40 |
| 40.06018 | 40 |
| 31.64394 | 30 |
| 39.57521 | 40 |
| 41.42172 | 40 |
| 40.02308 | 40 |
| 101.2782 | 100 |
| 40.05743 | 40 |
| 90.04237 | 90 |
| 18.92682 | 20 |
| 119.8577 | 120 |
| 29.47375 | 30 |
| 48.97252 | 50 |
| 28.35779 | 30 |
| 20.13945 | 20 |
| 88.79931 | 90 |
| 69.33511 | 70 |
| 80.70457 | 80 |
| 109.9587 | 110 |
| 61.64463 | 60 |
| 109.7493 | 110 |
| 80.10604 | 80 |
| 28.92282 | 30 |
| 20.35135 | 20 |
| 39.17767 | 40 |
| 9.11467 | 10 |
| 119.8899 | 120 |
| 9.632499 | 10 |
| 39.26161 | 40 |
| 79.91012 | 80 |
| 39.63317 | 40 |
| 18.29689 | 20 |
| 79.47068 | 80 |
| 39.09463 | 40 |
| 79.90057 | 80 |
| 120.63 | 120 |
| 90.56023 | 90 |
| 59.46885 | 60 |
| 59.87975 | 60 |
| 40.29981 | 40 |
| 40.22298 | 40 |
| 50.05615 | 50 |
| 110.3123 | 110 |
| 39.84036 | 40 |
| 39.74226 | 40 |
| 79.92715 | 80 |
| 20.22006 | 20 |
| 20.95283 | 20 |
| 39.32984 | 40 |
| 39.37376 | 40 |
| 38.44877 | 40 |
| 70.20587 | 70 |
| 10.16076 | 10 |
| 10.71323 | 10 |
| 20.37211 | 20 |
| 79.0657 | 80 |
| 39.44648 | 40 |
| 10.48536 | 10 |
| 19.84768 | 20 |
| 11.41296 | 10 |
| 90.43699 | 90 |
| 117.9819 | 120 |
| 18.36777 | 20 |
| 109.5418 | 110 |
| 39.51115 | 40 |
| 19.79686 | 20 |
| 41.86343 | 40 |
| 79.70558 | 80 |
| 39.39451 | 40 |
| 119.3081 | 120 |
| 30.23937 | 30 |
| 39.81784 | 40 |
| 79.94202 | 80 |
| 30.67188 | 30 |
| 69.83762 | 70 |
| 118.5441 | 120 |
| 22.49832 | 20 |
| 18.79895 | 20 |
| 20.71337 | 20 |
| 40.40256 | 40 |
| 18.81931 | 20 |
| 21.7158 | 20 |
| 20.12916 | 20 |
| 40.36983 | 40 |
| 40.05577 | 40 |
| 41.04731 | 40 |
| 39.6325 | 40 |
| 109.6576 | 110 |
| 37.64742 | 40 |
| 99.48608 | 100 |
| 37.87494 | 40 |
| 41.51219 | 40 |
| 110.2016 | 110 |
| 42.77141 | 40 |
| 119.1196 | 120 |
| 37.1825 | 40 |
| 29.26267 | 30 |
| 21.32417 | 20 |
| 79.47948 | 80 |
| 32.78027 | 30 |
| 120.059 | 120 |
| 19.59192 | 20 |
| 39.44699 | 40 |
| 99.15079 | 100 |
| 20.47612 | 20 |
| 69.74487 | 70 |
| 28.65926 | 30 |
| 39.48187 | 40 |
| 18.90024 | 20 |
| 118.4873 | 120 |
| 119.0782 | 120 |
| 100.0998 | 100 |
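The two accuracy measures discussed in this section, the mean squared error and the correlation (regression factor) between outputs and targets, can be computed as in the following Python sketch. The five output/target pairs are the first five rows of Table 11.5; the function names are illustrative. The values here are on the coded 10 to 120 scale, so the result is not directly comparable with the MSE of 0.02 reported in the text, and five rows are too few to reproduce the reported R.

```python
import math

def mse(targets, outputs):
    """Mean squared error: average squared difference between targets and outputs."""
    return sum((t - o) ** 2 for t, o in zip(targets, outputs)) / len(targets)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# First five rows of Table 11.5.
outputs = [69.37214, 39.70692, 50.63383, 25.8101, 39.70048]
targets = [70, 40, 50, 30, 40]

sample_mse = mse(targets, outputs)
sample_r = pearson_r(targets, outputs)
print(sample_mse, sample_r)
```

In this small sample the squared error is dominated by a single row (output 25.8101 against target 30), which shows how sensitive the MSE is to individual large deviations.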

Different transfer functions, including 'tansig' and 'purelin', were tested in different layers to find the one that best fits the model. The best input layer function was 'tansig', the best hidden layer function was 'purelin', and the best back propagation training function was 'trainlm'.

Docking of gossypol with the human sperm enzymes acrosin and hyaluronidase showed that it could bind successfully, with scores of −25.6376 and −24.2222 kcal/mol, respectively. Docking experiments with the compounds that the ANNs predicted to be contraceptive (compounds 3, 4, and 6) showed that the affinity of gossypol-6,6'-dimethyl ether for acrosin and hyaluronidase was comparable to that of gossypol. All three derivatives had significant binding capability with these two enzymes, with the exception that the affinity of gossypol tetraacetic acid for hyaluronidase was lower (Tables 11.6, 11.7A–11.7H, and Fig. 11.6).

Hyaluronidase and acrosin are two vital enzymes of human spermatozoa. Hyaluronidase digests hyaluronan in the corona radiata, enabling conception (Alberts, 2008). Acrosin enables sperm to penetrate the ovum by lysis of the zona pellucida through the acrosome reaction, thus facilitating penetration of the sperm through the innermost glycoprotein layers (Honda et al., 2002). Inhibition of either enzyme by the gossypol derivatives in question would therefore adversely affect the ability of spermatozoa to enable conception. The majority of mammalian ova are covered in a layer of granulosa cells intertwined in an extracellular matrix that contains a high concentration of hyaluronan. When a capacitated sperm reaches the ovum, it can penetrate this layer with the assistance of hyaluronidase enzymes present on the surface of the sperm. Once this occurs, the sperm is capable of binding to the zona pellucida, and the acrosome reaction can then occur with the help of the enzyme acrosin. The lysis resulting from the acrosome reaction enables spermatozoa to reach the innermost glycoprotein layers of the ovum to effect conception (Alberts, 2008). Alteration of the activity of these two enzymes is therefore directly linked with male contraceptive activity. The docking experiments thus, on the one hand, validated the ANN prediction of contraceptive action for compounds 3, 4, and 6 by showing their binding potential with two vital enzymes of human spermatozoa and, on the other hand, suggested possible molecular pathways by which these compounds may effect contraception.

TABLE 11.6 Docking Score of the Chosen Ligands and Gossypol Against Acrosin and Hyaluronidase

| Ligand | Enzyme | H-Bond Forming Amino Acids | Docking Score (kcal/mol) |
|---|---|---|---|
| Gossypol (CID 3503) | Acrosin | Tyr39, Val70, His71, Arg74, and Val99 | −25.6376 |
| Gossypol (CID 3503) | Hyaluronidase | Tyr179, His187, Ser222, Asn226, Thr227, and Gln228 | −24.2222 |
| Gossylic lactone (CID 5479154) | Acrosin | His71, Trp73, Val99, and Glu100 | −17.9569 |
| Gossylic lactone (CID 5479154) | Hyaluronidase | Tyr179, His187, Thr227, and Gln228 | −15.1068 |
| Gossypol tetraacetic acid (CID 130831) | Acrosin | Lys21, Ala23, and Trp156 | −12.1150 |
| Gossypol tetraacetic acid (CID 130831) | Hyaluronidase | Tyr357 and His359 | −5.1035 |
| Gossypol-6,6'-dimethyl ether (CID 25200979) | Acrosin | Trp30 and Cys136 | −20.7042 |
| Gossypol-6,6'-dimethyl ether (CID 25200979) | Hyaluronidase | Asp279, Asn353, Ser355, and Tyr357 | −19.5150 |
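The comparison with the gossypol control can be made explicit by expressing each derivative's docking score as a fraction of the gossypol score for the same enzyme. The scores in the Python sketch below are copied from Table 11.6; the ratio itself and the helper name `relative_to_gossypol` are illustrative assumptions and are not a measure used in the chapter. More negative scores indicate more favourable predicted binding.

```python
# Docking scores (kcal/mol) from Table 11.6, keyed by ligand and enzyme.
SCORES = {
    "Gossypol": {"Acrosin": -25.6376, "Hyaluronidase": -24.2222},
    "Gossylic lactone": {"Acrosin": -17.9569, "Hyaluronidase": -15.1068},
    "Gossypol tetraacetic acid": {"Acrosin": -12.1150, "Hyaluronidase": -5.1035},
    "Gossypol-6,6'-dimethyl ether": {"Acrosin": -20.7042, "Hyaluronidase": -19.5150},
}

def relative_to_gossypol(ligand, enzyme):
    """Score of a ligand as a fraction of the gossypol score for the same enzyme."""
    return SCORES[ligand][enzyme] / SCORES["Gossypol"][enzyme]

derivatives = [name for name in SCORES if name != "Gossypol"]
for enzyme in ("Acrosin", "Hyaluronidase"):
    # Sort so that the most negative (most favourable) score comes first.
    ranked = sorted(derivatives, key=lambda name: SCORES[name][enzyme])
    for name in ranked:
        print(enzyme, name, round(relative_to_gossypol(name, enzyme), 2))

best_for_acrosin = min(derivatives, key=lambda name: SCORES[name]["Acrosin"])
best_for_hyaluronidase = min(derivatives, key=lambda name: SCORES[name]["Hyaluronidase"])
```

On this measure gossypol-6,6'-dimethyl ether reaches about four-fifths of the gossypol score for both enzymes, while gossypol tetraacetic acid reaches only about one-fifth for hyaluronidase, in line with the discussion above.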

TABLES 11.7A–11.7H Docking Parameters of the Chosen Ligands and Gossypol Against Acrosin and Hyaluronidase

| Table | Ligand | Enzyme | Rank | Score | Match | Lipo | Ambig | Clash | Rot | RMSD | Simil | #Match |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 11.7A | Gossylic lactone [CID 5479154] | Acrosin | 1 | −17.9569 | −17.3358 | −10.1318 | −11.1563 | 6.8671 | 8.4000 | - | - | 16 |
| 11.7B | Gossypol tetraacetic acid [CID 130831] | Acrosin | 1 | −12.1150 | −17.3531 | −11.2893 | −9.5479 | 3.8753 | 16.8000 | - | - | 11 |
| 11.7C | Gossypol-6,6'-dimethyl ether [CID 25200979] | Acrosin | 1 | −20.7042 | −23.6019 | −10.0613 | −6.9311 | 3.2901 | 11.2000 | - | - | 15 |
| 11.7D | Gossylic lactone [CID 5479154] | Hyaluronidase | 1 | −15.1068 | −19.3266 | −5.5109 | −7.5996 | 3.5302 | 8.4000 | - | - | 14 |
| 11.7E | Gossypol tetraacetic acid [CID 130831] | Hyaluronidase | 1 | −5.1035 | −16.2112 | −10.8139 | −7.8980 | 7.6196 | 16.8000 | - | - | 14 |
| 11.7F | Gossypol-6,6'-dimethyl ether [CID 25200979] | Hyaluronidase | 1 | −19.5150 | −27.1367 | −8.1126 | −7.4221 | 6.5564 | 11.2000 | - | - | 17 |
| 11.7G | Gossypol [CID 3503] | Acrosin | 1 | −25.6376 | −27.8142 | −12.0931 | −9.7579 | 7.4276 | 11.2000 | - | - | 21 |
| 11.7H | Gossypol [CID 3503] | Hyaluronidase | 1 | −24.2222 | −31.7353 | −3.2238 | −6.9964 | 1.1334 | 11.2000 | - | - | 13 |
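The docking parameters in Tables 11.7A–11.7H list the total score together with its component terms. In the tabulated values, the total equals the sum of the Match, Lipo, Ambig, Clash, and Rot terms plus a constant of 5.4, to within rounding. This relationship is read off the numbers themselves and is not stated in the chapter; the constant name `OFFSET` and the tolerance in the Python check below are assumptions.

```python
OFFSET = 5.4  # constant inferred from the tabulated values, not stated in the chapter

# (table, score, match, lipo, ambig, clash, rot), copied from Tables 11.7A-11.7H.
ROWS = [
    ("11.7A", -17.9569, -17.3358, -10.1318, -11.1563, 6.8671, 8.4),
    ("11.7B", -12.1150, -17.3531, -11.2893, -9.5479, 3.8753, 16.8),
    ("11.7C", -20.7042, -23.6019, -10.0613, -6.9311, 3.2901, 11.2),
    ("11.7D", -15.1068, -19.3266, -5.5109, -7.5996, 3.5302, 8.4),
    ("11.7E", -5.1035, -16.2112, -10.8139, -7.8980, 7.6196, 16.8),
    ("11.7F", -19.5150, -27.1367, -8.1126, -7.4221, 6.5564, 11.2),
    ("11.7G", -25.6376, -27.8142, -12.0931, -9.7579, 7.4276, 11.2),
    ("11.7H", -24.2222, -31.7353, -3.2238, -6.9964, 1.1334, 11.2),
]

def residual(row):
    """Difference between the reported score and the sum of its terms plus OFFSET."""
    _, score, *terms = row
    return score - (sum(terms) + OFFSET)

residuals = {row[0]: residual(row) for row in ROWS}
for table, value in residuals.items():
    print(table, round(value, 4))
```

A check of this kind is also useful for confirming that values recovered from a rotated or split table have been assigned to the right columns.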

FIG. 11.6 Bonding pattern of chosen ligands and gossypol with their receptors. (A) 2D view of gossylic lactone and acrosin bonding. (B) 3D view of gossylic lactone and acrosin bonding. (C) 2D view of gossypol tetraacetic acid and acrosin bonding. (D) 3D view of gossypol tetraacetic acid and acrosin bonding. (E) 2D view of gossypol-6,6'-dimethyl ether and acrosin bonding. (F) 3D view of gossypol-6,6'-dimethyl ether and acrosin bonding. (G) 2D view of gossylic lactone and hyaluronidase bonding. (H) 3D view of gossylic lactone and hyaluronidase bonding. (I) 2D view of gossypol tetraacetic acid and hyaluronidase bonding. (J) 3D view of gossypol tetraacetic acid and hyaluronidase bonding. (K) 2D view of gossypol-6,6'-dimethyl ether and hyaluronidase bonding. (L) 3D view of gossypol-6,6'-dimethyl ether and hyaluronidase bonding. (M) 2D view of gossypol and acrosin bonding. (N) 3D view of gossypol and acrosin bonding. (O) 2D view of gossypol and hyaluronidase bonding. (P) 3D view of gossypol and hyaluronidase bonding.

332  Computational Phytochemistry

11.4. CONCLUSIONS

Of the six gossypol derivatives studied, gossylic lactone, gossypol tetraacetic acid, and gossypol-6,6'-dimethyl ether were predicted to be contraceptives. Diaminogossypol and mono-aldehyde gossypol were predicted to have antianxiety and vasodilation activity, respectively, and ethyl gossypol to be hypnotic. Cross-validation of the ANN predictions of contraceptive action for gossylic lactone, gossypol tetraacetic acid, and gossypol-6,6'-dimethyl ether by docking experiments confirmed the ANN results. These three derivatives are thought to exert male contraceptive action by inhibiting the acrosin and hyaluronidase enzymes of human spermatozoa.

ACKNOWLEDGEMENTS

Manabendra D. Choudhury sincerely acknowledges the Commonwealth Scholarship Commission for awarding him the Commonwealth Academic Staff Fellowship at the University of Bradford, United Kingdom. The support of the Bioinformatics Centre, Assam University, Silchar, India, in carrying out the docking part of the work is sincerely acknowledged.

REFERENCES

Agatonovic-Kustrin, S., Alany, R.G., 2001. Role of genetic algorithms and artificial neural networks in predicting the phase behavior of colloidal delivery systems. Pharm. Res. 18, 1049–1055.
Aghav, R.M., Kumar, S., Mukherjee, S.N., 2011. Artificial neural network modelling in competitive adsorption of phenol and resorcinol from water environment using some carbonaceous adsorbents. J. Hazard. Mater. 188, 67–77.
Alberts, B., 2008. Molecular Biology of the Cell. Garland Science, New York, p. 1298. ISBN 0-8153-4105-9.
Brickley, M.R., Shepherd, J.P., Armstrong, R.A., 1998. Neural networks. J. Dent. 26, 305–309.
Bourquin, J., Shmidli, H., Hoogevest, P., Van Leuenberger, H., 1998a. Advantages of Artificial Neural Networks (ANNs) as alternative modeling technique for data sets showing non-linear relationship using data from a galenical study on a solid dosage form. Eur. J. Pharm. Sci. 7, 5–16.
Bourquin, J., Shmidli, H., Hoogevest, P., Van Leuenberger, H., 1998b. Comparison of artificial neural networks (ANN) with classical modeling techniques using different experimental designs and data from a galenical study on a solid dosage form. Eur. J. Pharm. Sci. 6, 287–300.
Bourquin, J., Shmidli, H., Hoogevest, P., Van Leuenberger, H., 1998c. Pitfalls of artificial neural networks (ANN) modeling technique for data sets containing outlier measurements using a study on mixture properties of a direct compressed dosage form. Eur. J. Pharm. Sci. 7, 17–28.
Buciński, A., Bączek, T., 2002. Optimization of HPLC separations of flavonoids with the use of artificial neural networks. Pol. J. Food Nutr. Sci. 4, 47–51.
Buciński, A., Nasal, A., Kaliszan, R., 2000. Pharmacological classification of drugs based on neural network processing of molecular modeling data. Comb. Chem. High Throughput Screen. 3, 525–533.
Buciński, A., Zieliński, H., Kozłowska, H., 2004. Artificial neural networks for prediction of antioxidant capacity of cruciferous sprouts. Trends Food Sci. Technol. 15, 161–169.
Chen, Y., McCall, T.W., Baichwal, A.R., Meyer, M.C., 1999. The application of an artificial neural network and pharmacokinetic simulations in the design of controlled-release dosage forms. J. Control. Release 59, 33–41.

Coutinho, E.M., 2002. Gossypol: a contraceptive for men. Contraception 65, 259–263.
Dayhoff, J.E., DeLeo, J.M., 2001. Artificial neural networks: opening the black box. Cancer 91, 1615–1635.
Gašperlin, M., Tušar, L., Tušar, M., Kristl, J., Šmid-Korbar, J., 1998. Lipophilic semisolid emulsion systems: viscoelastic behavior and prediction of physical stability by neural network modeling. Int. J. Pharm. 168, 243–254.
Gašperlin, M., Tušar, L., Tušar, M., Šmid-Korbar, J., Zupan, J., Kristl, J., 2000. Viscosity prediction of lipophilic semisolid emulsion systems by neural network modeling. Int. J. Pharm. 196, 37–50.
Haghdadi, M., Fatemi, M.H., 2010. Artificial neural network prediction of the psychometric activities of phenylalkylamines using DFT-calculated molecular descriptors. J. Serb. Chem. Soc. 75, 1391–1404.
Honda, A., Siruntawineti, J., Baba, T., 2002. Role of acrosomal matrix proteases in sperm-zona pellucida interactions. Hum. Reprod. Update 8, 405–412.
Hussain, A.S., Yu, X., Johnson, R.D., 1991. Application of neural computing in pharmaceutical product development. Pharm. Res. 8, 1248–1252.
Ishikawa, T., Hirano, H., Saito, H., Sano, K., Ikegami, Y., Yamaotsu, N., Hirono, S., 2012. Quantitative structure-activity relationship (QSAR) analysis to predict drug-drug interactions of ABC transporter ABCG2. Mini Rev. Med. Chem. 12, 505–514.
Isu, Y., Nagashima, U., Hosoya, H., Aoyma, T., 1994. Development of neural network simulator for structure-activity correlation of molecules. J. Chem. Softw. 2, 76–95.
Kaliszan, R., Bączek, T., Buciński, A., Buszewski, B., Sztupecka, M., 2003. Prediction of gradient retention from the linear solvent strength (LSS), quantitative structure-retention relationships (QSRR), and artificial neural networks (ANN). J. Sep. Sci. 26, 271–282.
Kandimalla, K.K., Kanikkannan, N., Singh, M., 1999. Optimization of a vehicle mixture for the transdermal delivery of melatonin using artificial neural networks and response surface method. J. Control. Release 61, 71–82.
Katsila, T., Spyroulias, G.A., Patrinos, G.P., Matsoukas, M.-T., 2016. Computational approaches in target identification and drug discovery. Comput. Struct. Biotechnol. J. 14, 177–184.
Keshmiri-Neghab, H., Goliaei, B., 2014. Therapeutic potential of gossypol: an overview. Pharm. Biol. 52, 124–128.
Laurie, A., Jackson, R., 2005. Q-SiteFinder: an energy-based method for the prediction of protein–ligand binding sites. Bioinformatics 21, 1908–1916.
Lee, C.W., Park, J.A., 2001. Assessment of HIV/AIDS-related health performance using an artificial neural network. Inf. Manage. 38, 231–238.
Mendyk, A., Dorożyński, P., Jachowicz, R., 2007. Proceedings of International Joint Conference on Neural Networks, Orlando, FL, 12–17 August 2007. IEEE catalog number: 07CH37922C, ISBN: 1-04244-1380-X, ISSN: 1098-7576.
Naik, P.K., Patel, A., 2009. Prediction of anticancer/non-anticancer drugs based on comparative molecular moment descriptor using Artificial Neural Network and support vector machine. Dig. J. Nanomater. Biostruct. 4, 19–43.
Paixão, P., Gouveia, L.F., Morais, J.A.G., 2010. Prediction of the in vitro permeability determined in Caco-2 cells by using artificial neural networks. Eur. J. Pharm. Sci. 41, 107–117.
Peh, K.K., Lim, C.P., Quek, S.S., Khoh, K.H., 2000. Use of artificial neural networks to predict drug dissolution profiles and evaluation of network performance using similarity factor. Pharm. Res. 17, 1384–1388.
Perkins, R., Fang, H., Tong, W., Welsh, W.J., 2003. Quantitative structure-activity relationship methods: perspectives on drug discovery and toxicology. Environ. Toxicol. Chem. 22, 1666–1679.

Petrović, J., Ibrić, S., Betz, G., Jelena Parojčić, Đ.Z., 2009. Application of dynamic neural networks in the modeling of drug release from polyethylene oxide matrix tablets. Eur. J. Pharm. Sci. 38, 172–180.
Silwoski, G., Kothiwale, S., Meiler, J., Lowe Jr., E.W., 2014. Computational methods in drug discovery. Pharmacol. Rev. 66, 334–395.
Snow, P.B., Rodvold, D.M., Brandt, M.J., 1999. Artificial neural networks in clinical urology. Urology 54, 787–790.
So, S.S., Richards, W.G., 1992. Application of neural networks: quantitative structure-activity relationships of the derivatives of 2,4-diamino-5-(substituted-benzyl)pyrimidines as DHFR inhibitors. J. Med. Chem. 35, 3201–3207.
STAHL, 2000. Stahlbau, 69, 672. https://doi.org/10.1002/stab.200002470.
Takayama, K., Fujikawa, M., Nagai, T., 1999. Artificial neural network as a novel method to optimize pharmaceutical formulation. Pharm. Res. 16, 1–6.
Takayama, K., Fujikawa, M., Obata, Y., Morishita, M., 2003. Neural network based optimization of drug formulations. Adv. Drug Deliv. Rev. 55, 1217–1231.
Vanyúr, R., Héberger, K., Jakus, J., 2003. Prediction of anti-HIV-1 activity of a series of tetrapyrrole molecules. J. Chem. Inf. Comput. Sci. 43, 1829–1836.
Wang, L., Xie, X.-Q., 2014. Computational target fishing: what should chemogenomics researchers expect for the future of in silico drug design and discovery? Future Med. Chem. 6, 247–249.
Zhang, S., 2011. Computer-aided drug discovery and development. Methods Mol. Biol. 716, 23–38.
Zupan, J., Gasteiger, J., 1993. Neural Networks for Chemists. An Introduction. Wiley-VCH, Weinheim.
Zupan, J., Gasteiger, J., 1999. Neural Networks in Chemistry and Drug Design, second ed. Wiley-VCH, Weinheim.