CHAPTER 25
An Overview of Computational Methods, Tools, Servers, and Databases for Drug Repurposing
Sailu Sarvagalla, Safiulla Basha Syed, Mohane Selvaraj Coumar
Centre for Bioinformatics, School of Life Sciences, Pondicherry University, Puducherry, India
1 DRUG REPURPOSING
Drug discovery is a multistep, arduous process that requires a lot of time, money, and effort and has a low success rate. During the drug development phases, medicinal chemists often face challenges pertaining to the optimization of the compounds. Cutting-edge technologies such as microfluidics-assisted chemical synthesis, microfluidics-based biological activity screening, and artificial intelligence systems can accelerate the optimization process and allow chemists to explore a broader area of chemical space than manual synthesis and testing of compounds. Nevertheless, these systems are quite expensive; moreover, the lead compounds still need to pass through preclinical and clinical trials, which again require a lot of time, effort, and expense before they are approved (Schneider, 2018). To overcome this hurdle, drug repurposing has gained importance in the drug discovery process over the past decades (Doan, Pollastri, Walters, & Georg, 2011). Drug repurposing, or drug repositioning, is a method used to find new therapeutic uses for drugs that have already been approved for other indications. Drug repurposing makes the drug discovery process much easier, because it not only speeds up the process but also circumvents the preclinical and clinical studies related to ADMET properties, avoids drug optimization issues, and enhances the drug development success rate (Fig. 1) (Novac, 2013). Moreover, recent data also suggest a high success rate for the drug repurposing strategy (Brown & Patel, 2017; Shameer et al., 2017). During the past decade, there has been an advancement in the acquisition of biological and chemical data pertaining to the alteration of pathways in diseases, alteration in protein
In Silico Drug Design. https://doi.org/10.1016/B978-0-12-816125-8.00025-0
© 2019 Elsevier Inc. All rights reserved.
FIG. 1 Schematic representation of the steps involved in conventional new drug discovery and drug repurposing strategy.
structures related to the disease, drug targets, mechanism of drug actions, disease biomarkers, and genomics, which altogether offer biologists an opportunity to understand the complexity of diseases, drugs, and drug-target interactions and eventually to help in the drug repurposing process (Li & Jones, 2012). Consequently, there has been an increase in the number of tools available in different areas, such as chemoinformatics (Jonsdottir, Jorgensen, & Brunak, 2005), bioinformatics (Issa, Kruger, Byers, & Dakshanamurthy, 2013), systems biology (Zou, Zheng, Li, & Su, 2013), and biological networks (Lotfi Shahreza, Ghadiri, Mousavi, Varshosaz, & Green, 2017). Application of these computational methods in drug repurposing is cost- and time-effective. Depending upon the knowledge and data available about the disease, drugs and drug-target interactions, and treatment outcomes, two types of strategies can be adopted for drug repurposing: phenotype-based screening, and knowledge and database-based methods (Fig. 2).
1.1 Phenotype-Based (Blinded) Screening
Phenotype-based screening is helpful when there is little or no information about the target. Basically, in this approach large compound libraries are randomly screened (Sardana et al., 2011). However, in this approach the molecular mechanism of drug action and target
4. TOOLS AND DATABASES
FIG. 2 Different approaches for drug repurposing.
may remain unknown. The compounds identified through this approach can target more than one protein and even different pathways (Zheng, Thorne, & McKew, 2013). Many drugs were identified by this route; a well-known example is aspirin, whose mechanism of action was revealed only decades after its introduction (Vane & Botting, 2003). The major advantage of this approach is that it can be applied to numerous diseases without prior knowledge of the targets involved in them.
1.2 Knowledge and Database Methods
Knowledge and database-oriented drug repurposing requires specific knowledge and data about the disease, targets or biomarkers, drug-target interactions, drug mechanism of action, altered biological pathways, and genomics. This strategy comprises target-based, signature-based, and pathway or network-based methods (Jin & Wong, 2014).
1.2.1 Target-Based Methods
The target-based approach requires prior knowledge about the targets related to the disease. This approach starts with the identification of a target related to the disease, after which the drugs in the DrugBank database can be subjected to in vitro and in vivo high-throughput screening (Swamidass, 2011). Computational and bioinformatics tools are handy in this approach, as all drugs available in DrugBank can be screened by in silico docking studies (Sawada, Iwata, Mizutani, & Yamanishi, 2015), and the resulting hits can later be validated by in vitro and in vivo studies. The major advantage of this approach is that most known targets are directly related to the disease; this improves the drug discovery process and increases the likelihood of finding better drugs, unlike phenotype-based screening, where the drug target may remain unknown.
1.2.2 Signature-Based Methods
In this approach, large transcriptomic data sets are used to reveal differences in gene expression between disease and healthy controls (Sithara, Crowley, Walder, & Aston-Mourney, 2017). This comparison helps to uncover unknown disease mechanisms. Similarly, gene-expression signature data generated before and after treatment of a disease can help to identify a drug-induced differential gene-expression signature and to determine whether the drug reverts the disease-associated differential gene-expression signature back toward the healthy condition (Iorio, Rittman, Ge, Menden, & Saez-Rodriguez, 2013). The advantage of this approach is that it can disclose new mechanisms of action of drugs, gene connections, and pathways shared by different diseases. The best example is the Connectivity Map (CMap), a database that contains the gene-expression analysis of five human cancer cell lines treated with more than 1000 FDA-approved drugs (Lamb et al., 2006).
1.2.3 Pathway or Network-Based Methods
In this method, data on signaling pathways and relationships among different biological molecules are used to reorganize the pathways in the form of networks to find key targets for drug repurposing (Lotfi Shahreza et al., 2017; March-Vila et al., 2017). A large network that includes 645 disease-disease, 5008 disease-drug, and 164,374 drug-drug relationships has been constructed from the NCBI Gene Expression Omnibus (GEO) data sets (Barrett, 2013). This large network data set offers researchers an opportunity to find a new drug target/pathway for a disease and for effective drug repurposing.
Depending upon the objectives and the availability of information, any of these methods or a combination of them can be applied for efficient drug repurposing. This chapter focuses on the methods, tools, databases, and servers that are used for target-based drug repurposing.
With this in mind, Section 2 discusses computer-aided drug design (CADD) techniques in general for drug discovery. Ways to use CADD for target-based drug repurposing, along with the tools, databases, and servers freely available over the Internet for this purpose (i.e., modeling the target and screening the drugs), are listed and discussed in Section 3. Section 4 briefly describes the databases/tools available for gene signature and pathway-based drug repurposing. Furthermore, the utility of the target-based method for drug repurposing using freely available online resources is illustrated in Section 5 with a case study that uses Aurora kinase C as the target protein and screens the DrugBank database compounds. Since the number of open access tools and databases for drug design is growing rapidly, it is difficult to list all of them in this chapter. Hence, interested readers may also refer to the following web resources and publications for more information:
1. Open source molecular modeling (Pirhadi, Sunseri, & Koes, 2016) (https://opensourcemolecularmodeling.github.io/).
2. VLS3D (Villoutreix, Lagorce, Labbe, Sperandio, & Miteva, 2013) (http://www.vls3d.com/links/51-shortlist).
3. Off-targets, repurposing, repositioning hypotheses (http://www.vls3d.com/index.php/links/chemoinformatics/off-targets-repurposing).
2 COMPUTER-AIDED DRUG DESIGN
Drug design is a critical step in the drug discovery process. It is an iterative process of finding a new drug molecule for a particular drug target (e.g., enzymes, receptor proteins, ion channels, etc.) that can alter the target's functional activity (Kindt, Morse, Gotschlich, & Lyons, 1991; Prachayasittikul et al., 2015; Takenaka, 2001). The designed drug molecule should be complementary in shape and charge to the target protein so that it binds strongly to the target and inhibits or activates its function, which in turn provides therapeutic benefit to the patient. However, the design and development of new drug molecules using traditional drug discovery techniques, which depend on trial-and-error (random) testing of large numbers of chemical substances on cultured cells/biochemical assays (in vitro testing) or on animals (in vivo testing), are very time consuming and expensive. With the recent advancements in computational methodologies, a more rational approach to the design and development of drug molecules is possible. The process of drug design using computational (in silico) tools is referred to as CADD (Tang, Zhu, Chen, & Jiang, 2006; Zhang, 2011). Once the molecules are designed in silico, the most promising ones can be synthesized/purchased and tested in vitro/in vivo, thereby effectively decreasing the number of molecules that need to be tested. CADD thus acts like a “virtual shortcut” to identify small molecules for in vitro testing, predict their effectiveness and plausible side effects, and help improve their bioavailability. As a result, CADD is more cost-effective and reduces the time required to discover a drug. Moreover, the field of CADD has recently witnessed rapid growth and advancement in technologies, at both the hardware and software levels, and has immense potential in the drug discovery process.
A number of drugs, including aliskiren, saquinavir, oseltamivir, dorzolamide, boceprevir, zanamivir, rupintrivir, and nolatrexed were discovered and optimized using CADD techniques (Sliwoski, Kothiwale, Meiler, & Lowe Jr., 2014; Talele, Khedkar, & Rigby, 2010). CADD can be classified into two types, namely structure-based drug design (SBDD) and ligand-based drug design (LBDD) (Sliwoski et al., 2014).
LBDD is carried out when the target protein 3D structure is not available. It mainly depends on the knowledge of other molecules that bind to the target protein/enzyme of interest. The ligand-based pharmacophore screening, molecule similarity search, and quantitative structure-activity relationship (QSAR) are a few of the standard LBDD methods that are widely employed in CADD to shortlist the molecules for testing. The basic assumption of LBDD is that similar molecules will have similar activities. In the ligand-based pharmacophore screening, the ligands that bind to the target protein are used to develop a common pharmacophore hypothesis to screen the small molecule libraries. In the molecular similarity search, the known ligand’s molecular fingerprint/substructure can be used to identify new molecules with similar fingerprint/substructure by screening the small molecule libraries. QSAR is a regression or classification-based method that models the relationship between the structural features of a set of known ligands that bind to the target protein and their corresponding biological activity. Thus, the generated QSAR models can be used to screen small molecule libraries to find new molecules with improved biological activity (Verma, Khedkar, & Coutinho, 2010).
On the other hand, SBDD depends on the knowledge of the 3D structure of the target protein and it has rapidly grown because of the advancements in proteomics and genomics,
along with concomitant development in structural biology and computational chemistry. Moreover, SBDD has helped in the discovery of many successful drugs such as saquinavir, oseltamivir, and zanamivir. Once the drug target’s 3D structure is resolved using either experimental (X-ray or NMR) or computational methods (homology modeling and threading), the drug target’s structure can then be used to virtually screen small molecule libraries and prioritize molecules for biological testing. Alternatively, the 3D structure of the target protein can be used to design molecules by assembling molecular fragments in the active site of the target, which is referred to as de novo drug design (Hartenfeller & Schneider, 2011). In addition, SBDD is extensively used in the lead optimization process, to improve the binding affinity of a molecule with the target protein by analyzing the protein-ligand interaction and contemplating the necessary structural modifications in the molecules to improve the interaction with the target (Sliwoski et al., 2014).
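The molecular similarity search described under LBDD can be illustrated with a minimal sketch. It assumes that each molecule has already been reduced to a binary fingerprint, represented here as the set of indices of its "on" bits, and it ranks library molecules by the Tanimoto coefficient against a known ligand. The function names, the toy fingerprints, the compound names, and the 0.5 threshold are all hypothetical choices made for illustration; in practice a cheminformatics toolkit would generate the fingerprints from real structures.

```python
from typing import Dict, FrozenSet, List, Tuple


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of 'on' fingerprint bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def similarity_search(
    query: FrozenSet[int],
    library: Dict[str, FrozenSet[int]],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Return library entries at or above the threshold, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Hypothetical toy fingerprints: each integer is the index of a set bit.
known_ligand = frozenset({1, 4, 7, 9, 12})
toy_library = {
    "compound_A": frozenset({1, 4, 7, 9, 13}),   # shares 4 bits, union of 6
    "compound_B": frozenset({2, 3, 5}),          # shares no bits
    "compound_C": frozenset({1, 4, 7, 9, 12}),   # identical fingerprint
}

hits = similarity_search(known_ligand, toy_library, threshold=0.5)
for name, score in hits:
    print(f"{name}: {score:.2f}")
```

The threshold controls how strictly the "similar molecules have similar activities" assumption is applied: a higher value returns fewer, closer analogs, while a lower value broadens the search at the cost of more false positives.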
2.1 CADD Techniques
2.1.1 Virtual Screening
Virtual screening (VS) is a computational method used to search small molecule libraries in order to identify and narrow down potential molecules that are likely to bind/interact with the drug target (Lavecchia & Di Giovanni, 2013; Rester, 2008). The potential molecules identified in VS can be tested in an in vitro assay (e.g., biochemical assay, cell culture, etc.) to confirm the activity of the molecule. Molecules with confirmed activity are referred to as hits or leads. The VS process decreases the time and money required to test a small molecule library, as it acts as a filter to select only a few potential molecules for testing in an in vitro assay. Moreover, VS is very fast and widely employed in the drug discovery process to identify small molecule inhibitors. The VS methods used to screen the small molecule libraries are generally classified as either ligand-based screening methods (e.g., similarity search and pharmacophore) or structure-based screening methods (e.g., docking and structure-based pharmacophore (SBP)).
2.1.1.1 DOCKING-BASED VIRTUAL SCREENING
Molecular docking is a computational technique used to predict the binding orientation of a ligand with a biomolecule (target) (Bartuzi, Kaczor, Targowska-Duda, & Matosiuk, 2017; Lengauer & Rarey, 1996); it can be applied to protein-small molecule, protein-protein, protein-DNA, and protein-RNA interactions. Docking involves two components: a search algorithm and a scoring function (Kitchen, Decornez, Furr, & Bajorath, 2004). The search algorithms (e.g., systematic, molecular dynamics (MD) simulation, and genetic algorithm) explore the translational, rotational, and conformational degrees of freedom to find an appropriate binding orientation of the ligand in the target protein. The scoring functions (e.g., force field, empirical, machine-learning, and knowledge-based) then use this binding orientation to predict the binding affinity or activity of the ligand. Hence, docking has become a powerful computational tool to predict various biomolecular interactions. Some of the well-known docking programs commonly used by the scientific community are AutoDock, GOLD, DOCK, Glide-XP, LibDock, CDOCKER, FlexX, and LigandFit. All these programs use different searching and scoring algorithms with different parameters, and these
could be classified broadly as shape-based (e.g., LibDock) and energy-based (e.g., Glide XP) methods. Using the docking programs, compound libraries can be virtually screened to identify hits.
2.1.1.2 STRUCTURE-BASED PHARMACOPHORE SCREENING
Virtual screening using SBP is an important alternative to docking-based VS (Falchi, Caporuscio, & Recanatini, 2014; Pirhadi, Shiri, & Ghasemi, 2013). In SBP, a set of pharmacophore features (hydrophobic, aromatic, hydrogen bond acceptor, hydrogen bond donor, positive and negative ionizable groups) of the ligands is identified. These features are essential for optimal intermolecular interactions of the ligands with the biological target. In the absence of a protein 3D structure or when the drug target is unknown, ligand-based pharmacophore hypotheses are generated to screen the small molecule databases. However, recent advancements both in structural biology and computational chemistry have greatly helped to derive pharmacophore hypotheses from the 3D structures of proteins and/or protein-ligand complexes. Several tools such as E-pharmacophore (Schrödinger Suite 9.2 software package), Catalyst (Discovery Studio 3.1), Ligand and Structure-Based Query Editor (MOE), ZINCPharmer, and LigandScout can be employed for this purpose. The derived pharmacophore hypotheses can be used for virtual screening of compound libraries to identify hits.
2.1.2 Molecular Dynamics Simulation
The hits/leads identified from virtual screening could be further evaluated using molecular dynamics (MD) simulations. MD simulation (Ciccotti, Ferrario, & Schuette, 2014; Foroutan, Fatemi, & Esmaeilian, 2017) is a method used to study the dynamic behavior of atoms and molecules by numerically solving Newton’s equations of motion, starting from a defined conformational state. In this approach, the atoms and molecules are allowed to interact over a period of time so that the system evolves, and the interatomic forces and potential energies are calculated using molecular mechanics force fields. By using MD simulation, the nature and stability of the target protein-ligand interaction in the presence of solvent (water) molecules can be studied.
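To make the idea of "solving Newton's equations of motion" concrete, the following is a minimal sketch of the velocity Verlet scheme, a time-stepping integrator commonly used in MD codes. It is not taken from any particular MD package: the system is a single particle in one dimension, the harmonic spring stands in for a molecular mechanics force field, and all names, units, and parameter values are illustrative assumptions.

```python
import math
from typing import Callable, List, Tuple


def velocity_verlet(
    x0: float,
    v0: float,
    force: Callable[[float], float],
    mass: float,
    dt: float,
    n_steps: int,
) -> List[Tuple[float, float]]:
    """Integrate m * d2x/dt2 = F(x) for one particle in one dimension."""
    x, v = x0, v0
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


# Toy "force field" (assumption): a harmonic spring, F(x) = -k * x, arbitrary units.
K_SPRING = 1.0
MASS = 1.0
DT = 0.01
N_STEPS = 1000


def spring_force(x: float) -> float:
    return -K_SPRING * x


def total_energy(x: float, v: float) -> float:
    """Kinetic plus potential energy of the toy oscillator."""
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x


traj = velocity_verlet(1.0, 0.0, spring_force, MASS, DT, N_STEPS)
e_start = total_energy(*traj[0])
e_end = total_energy(*traj[-1])

# Analytical position for comparison: x(t) = cos(omega * t) for x0 = 1, v0 = 0.
x_exact = math.cos(math.sqrt(K_SPRING / MASS) * DT * N_STEPS)
print(f"energy drift: {abs(e_end - e_start):.2e}")
print(f"position error: {abs(traj[-1][0] - x_exact):.2e}")
```

In a real simulation the force call is the expensive step, since it evaluates bonded and nonbonded interactions over all atoms of the solvated protein-ligand system; monitoring the total energy, as done here, is a common sanity check on the chosen time step.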
Based on favorable results from the computational investigations, the identified hits/leads could be further tested in vitro and then subjected to chemical optimization to improve the biological activity.
3 TARGET-BASED DRUG REPURPOSING USING CADD TECHNIQUES
The advancements in biological sciences such as genomics, proteomics, and molecular and structural biology have greatly assisted in identifying and understanding various pathological conditions and have also provided a number of opportunities to explore and identify novel drug targets for disease intervention and drug discovery processes (Hood et al., 2012; Ma & Zhao, 2013). At the same time, the number of approved and investigational drugs for various disease conditions has increased in recent years (Wishart et al., 2018). Since the design and development of new drug molecules for particular disease conditions are very time consuming and expensive processes, utilizing the available knowledge of drugs and
FIG. 3 Target-based drug repurposing using CADD techniques.
drug target information for drug repositioning is an effective and beneficial way to identify novel therapeutic molecules for new disease indications (Dudley, Deshpande, & Butte, 2011; Jin & Wong, 2014; Wang, Chen, & Deng, 2013). One of the common approaches used by the scientific community for drug repurposing is the target-based method (Jin & Wong, 2014; Li et al., 2016). In this approach, the approved DrugBank compounds (Wishart et al., 2018) can be virtually screened against different target proteins using CADD techniques. In this section, we briefly describe the methods used to carry out drug repurposing using CADD, along with the available drug and drug target databases, tools, servers, and software that are useful for predicting the target structure and ligand binding sites and for carrying out VS. The overall framework of target-based drug repurposing using CADD techniques is shown in Fig. 3.
3.1 Drug/Small Molecule Databases
DrugBank is a database that combines bioinformatics and cheminformatics resources to establish an extensive chemical, pharmacological, and pharmaceutical (i.e., drugs)
database in combination with drug target (i.e., sequence, structure, and pathway) information (Wishart et al., 2018). Currently, DrugBank contains over 11,027 drug entries, including 2517 FDA-approved small molecule drugs, 948 FDA-approved biotech (protein/peptide) drugs, and 109 nutraceuticals. In addition, it contains over 5114 experimental drugs and extensive SNP (single nucleotide polymorphism) drug data that can be used for pharmacogenomics studies. Along with drug data, this database also contains 4911 nonredundant drug target (i.e., protein/enzyme/transporter/carrier) sequences that are linked to drug entries. In addition to the DrugBank database, there are a number of other databases for drugs, such as SuperDRUG2, DrugCentral, and WITHDRAWN, along with small molecule library databases, e.g., BindingDB, ChEMBL, KEGG Drug, ZINC, etc. (Table 1). These databases are freely accessible for retrieving information and compounds, which can be virtually screened against different drug targets for the identification of novel therapeutic molecules.
TABLE 1 List of Available Drug/Small Molecule Databases

| Sl. No. | Database | Description and Web-Links | References |
|---|---|---|---|
| 1 | DrugBank | Freely available curated database containing approved, investigational, and nutraceutical drugs and drug target information (https://www.drugbank.ca/) | (Wishart et al., 2018) |
| 2 | SuperDRUG2 | A free one-stop resource for approved drugs that is searchable by text and 2D chemical structure (http://cheminfo.charite.de/superdrug2/) | (Siramshetty et al., 2018) |
| 3 | DrugCentral | The web server provides information on active chemical entities, mode of action, and indications for the drugs. The database is searchable by drugs, targets, and disease (http://drugcentral.org/) | (Ursu et al., 2017) |
| 4 | WITHDRAWN | A database of drugs withdrawn or discontinued due to safety concerns, grouped according to their toxicity. The database is searchable by text and 2D chemical structure (http://cheminfo.charite.de/withdrawn/index.html) | (Siramshetty et al., 2016) |
| 5 | Drug Repurposing Hub | Provides biological screening information about 5000 drugs that are either approved or have reached clinical trials. Data such as chemical structure, status of clinical development, supplier information, mode of action, drug targets, and approved indications are available (https://clue.io/repurposing) | (Corsello et al., 2017) |
| 6 | BindingDB | BindingDB is a publicly accessible web database of measured binding affinities of proteins considered to be drug targets with small, drug-like molecules. BindingDB contains 1,439,799 binding data, for 7042 protein targets and 644,978 small molecules (https://www.bindingdb.org/bind/index.jsp) | (T. Liu, Lin, Wen, Jorissen, & Gilson, 2007) |
| 7 | ChEMBL | ChEMBLdb is a manually curated chemical database of bioactive molecules that are drug-like (https://www.ebi.ac.uk/chembl/) | (Gaulton et al., 2012) |
| 8 | KEGG Drug | KEGG Drug has comprehensive drug information of approved drugs in United States, Europe, and Japan. Chemical structures are associated with therapeutic target, metabolizing enzyme, and other molecular interaction network information (http://www.genome.jp/kegg/drug/) | (Kanehisa, Goto, Furumichi, Tanabe, & Hirakawa, 2010) |
| 9 | ZINC database | ZINC database contains over 35 million commercially purchasable compounds. These compounds are stored in ready-to-dock 3D formats for virtual screening and are freely accessible (http://zinc.docking.org/) | (Irwin & Shoichet, 2005) |
| 10 | Therapeutic Target DB | The database currently contains 2025 targets, 17,816 drugs, and 3681 multitarget agents (http://bidd.nus.edu.sg/group/cjttd/) | (Y. H. Li et al., 2018) |
3.2 Drug Target Databases
In the drug discovery pipeline, the first step is to understand the pathophysiology of a particular disease condition, followed by identification of drug targets (i.e., protein/nucleic acid) and then lead discovery for the target (Lindsay, 2003). Technological advancements in various fields of science have greatly helped to understand disease mechanisms as well as to identify drug targets using different approaches, including genomics, proteomics, transcriptomics, epigenetics, phenotypic screening, and biomarker studies (Schenone, Dancik, Wagner, & Clemons, 2013). The resulting drug target sequence, structure, and functional information from such studies is deposited in various freely available databases for data retrieval, analysis, and understanding of disease mechanisms. This information can be used for drug repositioning using CADD techniques. The target-based drug repurposing methods invariably depend on the three-dimensional (3D) structure of drug targets (protein/DNA/RNA); these methods are powerful and many success stories have been reported in the literature (Shim & Liu, 2014; Vasaikar, Bhatia, Bhatia, & Chu Yaiw, 2016). The 3D structures of the drug targets (protein/DNA/RNA) can be resolved using experimental techniques, including X-ray, NMR, and cryo-electron microscopy (cryo-EM) methods, and the derived structural knowledge is deposited in the Protein Data Bank (PDB) archive (Berman, Henrick, Nakamura, & Markley, 2007). PDB serves as a single repository for the structures of proteins, nucleic acids, and their complexes. Presently, the Worldwide Protein Data Bank (wwPDB) organization manages the PDB archive and assures that it is freely available to the global research community. Moreover, this organization distributes the deposited structural information to the scientific community through its group members, including Research Collaboratory for Structural Bioinformatics Protein Data Bank
(RCSB PDB) (Rose et al., 2011), Biological Magnetic Resonance Data Bank (BMRB) (Markley et al., 2008), Protein Data Bank Japan (PDBj) (Kinjo et al., 2012), and Protein Data Bank in Europe (PDBe) (Gutmanas et al., 2014). From these databases, we can download the high-resolution 3D structures of proteins or protein-ligand complexes using simple query or advanced search options. Moreover, these databases also offer various structure analysis and visualization tools, with which we can analyze and understand the interaction mechanisms that exist between the biomolecules, including protein-ligand, protein-protein, etc. Understanding the structural interaction mechanism is essential for drug design and drug repurposing using SBDD methods. Even though wwPDB serves as a repository for biomolecular structural information, there is a huge gap between the number of experimentally resolved structures and the number of reported drug targets that are identified using various experimental methods. The reason is that resolving the 3D structures of these drug targets using experimental methods (such as NMR, X-ray, and cryo-EM) is difficult and challenging. This gap can be narrowed by computational methods, including homology modeling and protein-protein docking.
Like the biomolecular structural databases, there are also a number of databases for protein/nucleic acid sequence deposition and retrieval. The Universal Protein Resource (UniProt) (UniProtConsortium, 2018) is a freely accessible resource for annotated protein information such as sequence, structural, and functional information. It integrates three component databases: UniProt Knowledgebase (UniProtKB), UniProt Archive (UniParc), and UniProt Reference Clusters (UniRef). Hence, it can be considered a one-stop source for well-annotated protein sequence and functional information.
The National Center for Biotechnology Information (NCBI) provides comprehensive resources for nucleic acid and protein sequence information. The NCBI RefSeq (Pruitt, Tatusova, & Maglott, 2007) database provides information about the genomic DNA and transcript sequences of various organisms. In addition, the Protein database (Pruitt et al., 2007) of NCBI contains a collection of protein sequence information from several sources, including translations from annotated coding regions in GenBank, RefSeq, and TPA, as well as records from SwissProt, PIR, PRF, and PDB. Hence, in the absence of the 3D structure of a drug target, we can search and retrieve the target sequence information from these databases. Further, this knowledge can be used to predict 3D structures of the drug targets using various computational methods. A few of the available drug target sequence and structural databases are listed in Table 2.
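Sequences retrieved from resources such as UniProt or the NCBI Protein database are commonly exchanged in FASTA format, where a header line beginning with ">" is followed by one or more lines of sequence. The sketch below shows one simple way to read such records into memory before passing them to a structure prediction tool. The parser is a minimal illustration written for this purpose, and the record shown is a made-up toy entry, not a real database sequence.

```python
from typing import Dict, Iterable, Optional


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each FASTA header (without the leading '>') to its full sequence."""
    records: Dict[str, str] = {}
    header: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequence may wrap over several lines
    return records


# Made-up record for illustration only; not a real database entry.
toy_fasta = """>toy|T0001|EXAMPLE_TARGET hypothetical kinase domain
MKTAYIAKQR
QISFVKSHFS
"""

records = parse_fasta(toy_fasta.splitlines())
for header, sequence in records.items():
    print(header.split()[0], len(sequence))
```

Keeping the header intact preserves the accession and description, which is useful later when a predicted model has to be traced back to its source entry.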
3.3 Prediction of 3D Structure of Drug Targets
In the absence of experimental (X-ray and NMR) 3D structures, in silico methods can be used to predict the 3D structure of drug targets. Three methods are commonly used to build the target model: (1) homology modeling (Vyas, Ukawala, Ghate, & Chintha, 2012), (2) threading and fold recognition (Rost, Schneider, & Sander, 1997), and (3) ab initio modeling (Liwo, Lee, Ripoll, Pillardy, & Scheraga, 1999; Wu, Skolnick, & Zhang, 2007). Homology modeling is also called comparative modeling, and it predicts the 3D structure of the target protein based on its sequence homology with a protein (called a template) whose 3D structural information is solved experimentally. Homology modeling comprises the following steps: (i) identification of template structures (homolog/related sequence structures), (ii) alignment of target sequence to template, (iii) model building (copying template
TABLE 2 List of Available Protein Sequence and Structural Databases

| Sl. No. | Database/Server | Description and Web-Links | References |
|---|---|---|---|
| 1 | UniProt | UniProt is a freely accessible resource that provides comprehensive high-quality protein sequence and functional information (http://www.uniprot.org/) | (UniProtConsortium, 2018) |
| 2 | RefSeq | A freely accessible database of genomic DNA, transcript, and protein sequence information (https://www.ncbi.nlm.nih.gov/refseq/) | (Pruitt et al., 2007) |
| 3 | Protein | The Protein database contains a collection of sequence information from several sources, including translations from annotated coding regions in GenBank, RefSeq, and TPA, as well as records from SwissProt, PIR, PRF, and PDB (https://www.ncbi.nlm.nih.gov/protein/) | (Pruitt et al., 2007) |
| 4 | wwPDB | The single largest repository of information for 3D structures of biomolecules and is freely accessible (http://www.wwpdb.org/) | (Berman et al., 2007) |
| 5 | RCSB PDB | A database repository of 3D structures of proteins, nucleic acids, and complex assemblies (https://www.rcsb.org/) | (Rose et al., 2011) |
| 6 | PDBe | PDBe is the European resource for biological macromolecular structures (http://www.ebi.ac.uk/pdbe/) | (Gutmanas et al., 2014) |
| 7 | PDBj | PDBj is the Japanese resource for biological macromolecular structures and also provides integrated tools for analysis (https://pdbj.org/) | (Kinjo et al., 2012) |
| 8 | BMRB | A repository for NMR Spectroscopy data from proteins, peptides, nucleic acids, and other biomolecules (http://www.bmrb.wisc.edu/) | (Markley et al., 2008) |
coordinates to target), (iv) loop modeling and side chain refinements, and (v) model optimization and validation. Threading and fold recognition predicts the unknown sequence structure based on recognizing probable folds in the structural databases and then selecting the best-fitting fold for further model building and refinement. The rule of thumb of this method is that the structures are more conserved than the protein sequences. There are fewer protein folds available as compared to the reported protein sequences in the SCOP (Structural Classification Of Proteins) database (Andreeva et al., 2008). Hence, this method could identify structurally comparable proteins even without noticeable sequence similarity (distantly related sequence). On the other hand, ab initio methods attempt to predict the full-length structure of the protein based on the sequence information alone without depending on the template or other structural information. These methods are empirical in nature and are not accurate. Based on the energy minimization principles, these methods find the possible conformation with the lowest global minima energy. There are several in silico tools, web servers, and databases available for modeling the protein structures using the three methods of comparative modeling, threading, and ab initio methods and they are listed in Table 3.
TABLE 3 List of Available Protein 3D Structure Prediction Tools/Servers

| Sl. No. | Tools/Servers | Description and Web-Links | References |
| --- | --- | --- | --- |
| | HOMOLOGY BASED OR COMPARATIVE METHODS | | |
| 1 | Modeller | MODELLER is a downloadable program that can run on most Unix/Linux, Windows, and Mac systems. It is used for homology modeling of protein 3D structures. It can also perform de novo modeling of loops, optimization of modeled protein structures, and multiple alignment of protein sequences and/or structures (http://salilab.org/modeller/) | (Webb & Sali, 2016) |
| 2 | ModWeb | A web server for protein modeling (https://modbase.compbio.ucsf.edu/modweb/) | (Pieper et al., 2014) |
| 3 | ModBase | ModBase (http://salilab.org/modbase) is a database of protein structures modeled by the modeling pipeline ModPipe (https://salilab.org/modpipe/) | (Pieper et al., 2014) |
| 4 | SWISS-MODEL | The web server performs homology modeling in an automated manner and can be accessed through the ExPASy web server (http://swissmodel.expasy.org/) | (Biasini et al., 2014) |
| 5 | PyMod 2.0 | A downloadable program for sequence similarity searches, multiple sequence alignments, and homology modeling within the PyMOL environment (http://schubert.bio.uniroma1.it/pymod/index.html) | (Janson, Zhang, Prado, & Paiardini, 2017) |
| | THREADING AND FOLD RECOGNITION-BASED METHODS | | |
| 6 | I-TASSER | Iterative Threading ASSEmbly Refinement (I-TASSER) uses a hierarchical approach to protein structure and function prediction. It is available for download (http://zhanglab.ccmb.med.umich.edu/I-TASSER/download/) and as a web server (http://zhanglab.ccmb.med.umich.edu/I-TASSER/) | (Yang et al., 2015) |
| 7 | RaptorX | Predicts secondary and tertiary structures and assigns confidence scores to the prediction results. It is available for download as well as a web server (http://raptorx.uchicago.edu/) | (Kallberg et al., 2012) |
| 8 | MUSTER | The web server MUlti-Sources ThreadER (MUSTER) identifies multiple template structures from the PDB library and builds the model (http://zhanglab.ccmb.med.umich.edu/MUSTER/) | (Wu & Zhang, 2008) |
| | AB INITIO AND COMBINATION OF THREADING AND AB INITIO METHODS | | |
| 9 | ROBETTA | The web server uses Rosetta software for ab initio and comparative modeling of proteins (http://www.robetta.org/) | (Kim, Chivian, & Baker, 2004) |
| 10 | Bhageerath | An energy-based protein structure prediction server that is validated using 80 small globular proteins; it predicts five candidate structures for an input amino acid sequence (http://www.scfbio-iitd.res.in/bhageerath/index.jsp) | (Jayaram et al., 2006) |
| 11 | QUARK | A web server for ab initio protein structure prediction and protein peptide folding that aims to construct the correct protein 3D model from amino acid sequence (http://zhanglab.ccmb.med.umich.edu/QUARK/) | (Xu & Zhang, 2013) |
| 12 | PEP-FOLD | A web server that predicts peptide structure from amino acid sequences using a de novo approach (http://bioserv.rpbs.univ-paris-diderot.fr/services/PEP-FOLD/) | (Thevenet et al., 2012) |
| | LOOP MODELING | | |
| 13 | FALCLoop | Fragment Assembly and Loop Closure (FALC) is a downloadable tool for modeling missing regions in a protein (http://galaxy.seoklab.org/softwares/falc.html) | (Ko et al., 2011) |
| 14 | FREAD | The web server can be used to fill the gaps in the 3D protein model (http://opig.stats.ox.ac.uk/webapps/fread/php/) | (Choi & Deane, 2010) |
| 15 | RCD+ | RCD+ server is a fast loop-closure modeling tool based on an improved version of the RCD method (http://rcd.chaconlab.org/) | (Chys & Chacon, 2013) |
| 16 | ModLoop | A web server for automated modeling of loops in protein structures (https://modbase.compbio.ucsf.edu/modloop/) | (Fiser & Sali, 2003) |
| 17 | LoopIng | Uses the Random Forest automatic learning technique to select structural templates for protein loops from a set of candidates (http://circe.med.uniroma1.it/looping/) | (Messih, Lepore, & Tramontano, 2015) |
| 18 | Sphinx | The protein loop modeling algorithm generates high-accuracy predictions and decoy sets enriched with near-native loop conformations (http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/Sphinx.php) | (Marks et al., 2017) |
3.3.1 Homology Modeling/Comparative Modeling
3.3.1.1 IDENTIFICATION OF TEMPLATE STRUCTURE
The first step in homology modeling is to identify a suitable template structure for the given query sequence. This can be carried out with the sequence similarity search program protein-BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi, part of the Protein database of NCBI; Fig. 4), which identifies homologous protein structures in the PDB database and lists them in order of E value, sequence identity, query coverage, etc. The template should be a high-resolution structure with the lowest E value and the highest query coverage and sequence identity. Generally, more than 30% sequence identity between target and template is considered appropriate for homology modeling. When no single template covers the full-length sequence, multiple template structures can be combined to improve the model quality.
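The selection criteria above can be sketched as a simple filter and sort. This is only an illustration under stated assumptions: the hit records, identifiers, and numbers are hypothetical, the 30% cutoff is applied to sequence identity, and the tie-breaking order (E value, then coverage, then identity, then resolution) is one reasonable choice rather than a rule prescribed by BLAST.

```python
from dataclasses import dataclass


@dataclass
class TemplateHit:
    pdb_id: str           # hypothetical identifier, not a real PDB entry
    e_value: float        # lower is better
    identity_pct: float   # higher is better
    coverage_pct: float   # query coverage, higher is better
    resolution_A: float   # crystallographic resolution, lower is better


def rank_templates(hits, min_identity=30.0):
    """Drop hits at or below the identity cutoff, then sort the rest.

    Sort order is an assumption of this sketch: lowest E value first,
    then highest coverage, highest identity, and best resolution.
    """
    usable = [h for h in hits if h.identity_pct > min_identity]
    return sorted(
        usable,
        key=lambda h: (h.e_value, -h.coverage_pct, -h.identity_pct, h.resolution_A),
    )


# Hypothetical BLAST hits against the PDB
hits = [
    TemplateHit("tmplA", 1e-80, 62.0, 98.0, 1.8),
    TemplateHit("tmplB", 1e-80, 62.0, 98.0, 2.9),
    TemplateHit("tmplC", 1e-20, 28.0, 60.0, 1.5),
    TemplateHit("tmplD", 1e-45, 41.0, 85.0, 2.1),
]

ranked = rank_templates(hits)
print([h.pdb_id for h in ranked])
```

In practice the ranked list is a starting point; the top candidates would still be inspected manually (for example, for bound ligands or missing regions) before a template is chosen.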
FIG. 4 Identification of template structures by querying the Protein-BLAST server of NCBI.
3.3.1.2 ALIGNMENT OF TARGET SEQUENCE TO TEMPLATE
After selecting the template structure, the next step is to align the target protein sequence to the template sequence using highly sensitive multiple sequence alignment algorithms (e.g., Clustal Omega (Sievers & Higgins, 2018), Praline (Simossis & Heringa, 2005), T-Coffee (Di Tommaso et al., 2011), etc.) to obtain an optimum alignment. This is a critical step in homology modeling, as an improper alignment could misplace homologous residues and thereby degrade the quality of the model. Hence, the alignment should be checked by visual inspection.
3.3.1.3 MODEL BUILDING
After achieving the optimum alignment, the model can be built by copying the template residue coordinates to the target protein. If the template and target sequence are identical, both backbone and side chain atom coordinates are copied to the target protein; otherwise only backbone atoms are copied, and the side chain atoms are fixed in subsequent steps. During sequence alignment, some regions of the target sequence frequently cannot be aligned to the template sequence because of insertions and deletions. These produce gaps that cannot be modeled directly from the template and would otherwise leave holes in the protein structure. Hence, closing the gaps is an essential step in protein modeling, and this can be done by loop modeling methods. Currently there are two loop modeling methods available: the first is a database search and the second is an ab initio method. The database search method finds "spare parts" from the known structures in the PDB database, and then selects the best segment with minimal steric clash and maximal sequence similarity. Finally, this segment is used to fill the loop by fitting it onto the two stem regions of the target protein. On the other hand, the ab initio method generates many random loops and then selects the best loop that does not have any steric clash with nearby side chains and also maintains a reasonably low free energy. Several software tools/modules have been developed to model loops using both database search (e.g., FREAD, etc.) and ab initio methods (e.g., RCD+, etc.), and they are listed in Table 3. To build a reasonably good model, the loops and missing side chain atoms need to be modeled and refined using an amino acid rotamer library. This library contains favorable side chain torsion angles of amino acids, derived from known protein crystal structures. Using this library, the modeled loops and the main and side chain atoms are finally refined to relieve the steric clashes.
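The coordinate-copying rule can be illustrated with a toy example. The sketch below makes several assumptions that are not in the text: the rule is applied per aligned residue pair, the alignment is given as two gapped strings, and the coordinates are made-up placeholders. It is not how Modeller or any other listed tool is implemented; it only shows which atoms are copied and which target residues are left for loop modeling.

```python
BACKBONE = {"N", "CA", "C", "O"}


def build_model(target_aln, template_aln, template_atoms):
    """Copy template coordinates onto the target along a pairwise alignment.

    target_aln, template_aln: aligned sequences of equal length, '-' for gaps.
    template_atoms: one dict per template residue, atom name -> (x, y, z).
    Returns (model, loops): per-target-residue atom dicts and the indices of
    target residues that have no template coordinates (to be loop-modeled).
    """
    model, loops = [], []
    t_idx = 0  # position in the template residue list
    q_idx = 0  # position in the target sequence
    for q, t in zip(target_aln, template_aln):
        if q == "-":
            # Deletion in the target: skip the template residue.
            if t != "-":
                t_idx += 1
            continue
        if t == "-":
            # Insertion in the target: nothing to copy, flag for loop modeling.
            model.append({})
            loops.append(q_idx)
        else:
            atoms = template_atoms[t_idx]
            if q == t:
                # Identical residue: copy backbone and side chain atoms.
                model.append(dict(atoms))
            else:
                # Different residue: copy backbone only; the side chain is
                # rebuilt later (e.g., from a rotamer library).
                model.append({n: xyz for n, xyz in atoms.items() if n in BACKBONE})
            t_idx += 1
        q_idx += 1
    return model, loops


def toy_residue(i):
    """Placeholder coordinates for template residue i (not real geometry)."""
    return {name: (float(i), float(j), 0.0)
            for j, name in enumerate(["N", "CA", "C", "O", "CB"])}


target_aln = "MKT-LV"
template_aln = "MRTA-V"
template_atoms = [toy_residue(i) for i in range(5)]  # template residues M, R, T, A, V

model, loops = build_model(target_aln, template_aln, template_atoms)
print(loops)
print([sorted(res) for res in model])
```

Here the target residue opposite the template gap is reported in `loops`, which corresponds to the gaps that loop modeling methods are meant to close.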
Most of the software modules/tools/servers used for model building and optimization, such as Modeller (Webb & Sali, 2016), Swiss-PdbViewer (Guex & Peitsch, 1997), and UCSF Chimera (Pettersen et al., 2004), incorporate side chain refinement functions along with model building.
3.3.1.4 MODEL OPTIMIZATION AND VALIDATION
Model optimization/refinement can be carried out using energy minimization or molecular dynamics simulation methods. The main aim of model optimization/refinement is to correct irregularities in the protein structure by relieving steric clashes and strains among the atoms. Energy minimization methods move the atoms locally and therefore typically reach only a nearby local minimum, whereas molecular dynamics simulation methods move the atoms in both uphill and downhill directions on a rough energy landscape. Molecular dynamics can thus overcome local energy barriers and approach the global minimum structure. A number of software tools are available to refine the models. UCSF Chimera (Pettersen et al., 2004), Swiss-PdbViewer (Guex & Peitsch, 1997), etc., can be used to minimize the structures using various algorithms. Moreover, the freely available molecular dynamics package GROMACS (Hess, Kutzner, van der Spoel, & Lindahl, 2008) can also be used to minimize the structure and search for the global minimum. Once the model is optimized, it has to be validated to make sure that its geometric features are reliable and consistent with physicochemical rules. This process involves checking for anomalies in bond lengths, bond angles, backbone torsion angles (phi-psi angles), and close contacts. This can be carried out using statistical models, where the parameters of the constructed model are compared to those of existing standard models. The results show which regions of the sequence are folded properly and which regions have errors. If there are any errors and irregularities in the model, further refinement needs to be done. Rampage, ProSA-web (Wiederstein & Sippl, 2007), Verify3D (Eisenberg, Luthy, & Bowie, 1997), etc. are some of the tools/modules/servers that are widely used to check the structural geometries of the models. The available open source model optimization and validation tools/servers are listed in Table 4.
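Why plain energy minimization tends to stop at a local minimum can be seen on a one-dimensional toy landscape. The sketch below is an illustration only: the double-well function, step size, and starting points are assumptions chosen for clarity, and real force fields act on thousands of coordinates. Steepest descent always moves downhill, so the result depends on which well the starting point sits in.

```python
def energy(x):
    """Toy double-well 'energy landscape' with two minima of unequal depth."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Analytical derivative of energy(x)."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def steepest_descent(x0, step=0.01, tol=1e-8, max_iter=100000):
    """Move downhill along the negative gradient until it vanishes."""
    x = x0
    for _ in range(max_iter):
        g = gradient(x)
        if abs(g) < tol:
            break
        x -= step * g
    return x


x_right = steepest_descent(1.5)    # starts in the shallower well
x_left = steepest_descent(-1.5)    # starts in the deeper well

print(f"from  1.5: x = {x_right:.3f}, E = {energy(x_right):.3f}")
print(f"from -1.5: x = {x_left:.3f}, E = {energy(x_left):.3f}")
```

The run started at 1.5 settles in the higher-energy well and never crosses the barrier, which is the behavior that sampling methods such as molecular dynamics are intended to overcome.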
TABLE 4 List of Open Source Model Optimization and Validation Tools and Servers

| Sl. No. | Tools/Servers | Description and Web-Links | References |
| --- | --- | --- | --- |
| | MODEL OPTIMIZATION | | |
| 1 | UCSF Chimera | A downloadable interactive visualization and analysis tool for molecular structures, sequence alignments, and docking results. It can provide high-quality images and animations for publication and presentation purposes (https://www.cgl.ucsf.edu/chimera/) | (Pettersen et al., 2004) |
| 2 | Swiss-PdbViewer | The downloadable program has an intuitive graphic and menu interface that allows the analysis of several proteins to carry out structural alignments and compare their active sites (https://spdbv.vital-it.ch/) | (Guex & Peitsch, 1997) |
| 3 | YASARA | A web server that performs energy minimization of 3D structures of proteins using the YASARA force field (http://www.yasara.org/minimizationserver.htm) | (Krieger et al., 2009) |
| 4 | 3Drefine | Web server for protein structure refinement (http://sysbio.rnet.missouri.edu/3Drefine/) | (Bhattacharya, Nowotny, Cao, & Cheng, 2016) |
| 5 | GROMACS | A downloadable package to perform molecular dynamics simulation of proteins, lipids, and nucleic acids (http://www.gromacs.org/) | (Hess et al., 2008) |
| | MODEL VALIDATION | | |
| 6 | Ramachandran plot | Web server that generates a Ramachandran plot for the 3D structure of a protein, which helps in analyzing the quality of the structure (http://dicsoft1.physics.iisc.ernet.in/rp/index.html) | (Gopalakrishnan, Sowmiya, Sheik, & Sekar, 2007) |
| 7 | RAMPAGE | Web server for Ramachandran plot analysis of the 3D structure of a protein (http://mordred.bioc.cam.ac.uk/rapper/rampage.php) | – |
| 8 | ProSA-web | Downloadable program and web server to detect errors in the 3D structure of a protein (https://prosa.services.came.sbg.ac.at/prosa.php) | (Wiederstein & Sippl, 2007) |
| 9 | VERIFY3D | The web server determines the compatibility of the atomic model (3D) with its own amino acid sequence (1D) by assigning a structural class based on its location and environment (alpha, beta, loop, polar, nonpolar, etc.) and comparing the results to good structures (http://services.mbi.ucla.edu/Verify_3D/) | (Eisenberg et al., 1997) |
| 10 | PROCHECK | A downloadable program that checks the residue-by-residue geometry of a protein structure and also includes PROCHECK-NMR for checking structures solved by NMR (https://www.ebi.ac.uk/thornton-srv/software/PROCHECK/) | (Laskowski, Rullmann, MacArthur, Kaptein, & Thornton, 1996) |
| 11 | MolProbity | Web server for validation of macromolecular structures (http://molprobity.biochem.duke.edu/) | (Chen et al., 2010) |
| 12 | SAVES | The Structure Analysis and Verification Server (SAVES) is a metaserver that runs six programs (PROCHECK, WHAT_CHECK, ERRAT, VERIFY_3D, PROVE, CRYST1 record matches) for checking and validating protein structures during and after model refinement (https://services.mbi.ucla.edu/SAVES/) | – |
3.4 Binding Site Identification
The binding site is the cavity or pocket on the surface of the target protein/nucleic acid to which the ligand/drug molecule binds through intermolecular forces (e.g., hydrogen bonds, electrostatic interactions, ionic bonds, Van der Waals forces, etc.), resulting in a conformational change of the target protein and leading to its functional activation/inhibition. Identification of the ligand binding site on a given target protein is the first prerequisite step in the design of a drug that can interact with the target (Guo et al., 2015; Perot, Sperandio, Miteva, Camproux, & Villoutreix, 2010; Sliwoski et al., 2014). Generally, the ligand binding site can be identified from the cocrystal structures of reported drug targets or of closely related proteins that are in complex with natural/nonnatural ligands/peptide molecules. In the absence of ligand binding site information, in silico methods/tools can be used to identify the potential druggable cavities/pockets on the surface of the target protein. Usually, these in silico methods/tools are categorized as protein geometry-based methods (e.g., CASTp, PocketDepth, etc.), energy-based methods (e.g., PocketFinder, Q-SiteFinder, FTMap, etc.), evolutionary-based machine learning methods (e.g., eFindSite, ATPbind, etc.), and combinations of these methods. These methods can be used to identify and characterize the potential druggable binding sites, and the binding site knowledge obtained can then be used for drug repurposing. Some of the freely available binding site prediction software/tools/web servers are listed in Table 5.
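The idea behind geometry-based pocket detection can be shown on a two-dimensional toy grid. This sketch is a simplified buriedness scan written for illustration; it is not the algorithm used by CASTp, PocketDepth, or any other tool named above, and the grid, the four scan directions, and the `min_blocked` threshold are all assumptions. An empty cell is called a pocket cell when protein is found along at least three of the four axis directions.

```python
# '#' marks protein-occupied cells, '.' marks empty (solvent) cells.
GRID = [
    "........",
    ".#....#.",
    ".#....#.",
    ".######.",
    "........",
]

DIRECTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def hits_protein(grid, r, c, dr, dc):
    """Walk from (r, c) in one direction; True if a protein cell is reached."""
    r, c = r + dr, c + dc
    while 0 <= r < len(grid) and 0 <= c < len(grid[0]):
        if grid[r][c] == "#":
            return True
        r, c = r + dr, c + dc
    return False


def find_pocket_cells(grid, min_blocked=3):
    """Return empty cells enclosed by protein in at least min_blocked directions."""
    pocket = set()
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell != ".":
                continue
            blocked = sum(hits_protein(grid, r, c, dr, dc) for dr, dc in DIRECTIONS)
            if blocked >= min_blocked:
                pocket.add((r, c))
    return pocket


pocket_cells = find_pocket_cells(GRID)
print(sorted(pocket_cells))
```

The cells inside the U-shaped cross-section are reported, while cells outside the protein are blocked in at most one direction and are ignored. Real tools work on 3D atomic coordinates and add clustering and scoring on top of this kind of geometric signal.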
3.5 Virtual Screening for Drug Repurposing
Once the 3D structure of the drug target is ready, the next step is to discover suitable drug molecules that can bind to the target protein and alter its function. As discussed in Section 3, VS of database compounds can help in identifying suitable ones for testing. VS methods are broadly classified as docking-based screening and pharmacophore-based screening, and both methods can be employed for drug repurposing for new indications.
TABLE 5 List of Freely Available Binding Site/Pocket Prediction Tools/Servers

| Sl. No. | Tools/Servers | Description and Web-Links | References |
| --- | --- | --- | --- |
| 1 | eFindSite | A ligand binding site prediction and virtual screening web server that detects common ligand binding sites in a set of evolutionarily related proteins identified by 10 threading/fold recognition methods (http://brylinski.cct.lsu.edu/efindsite) | (Brylinski & Feinstein, 2013) |
| 2 | COACH | Uses a metaserver approach for protein-ligand binding site prediction (https://zhanglab.ccmb.med.umich.edu/COACH/) | (Yang, Roy, & Zhang, 2013) |
| 3 | ATPbind | ATPbind is a metaserver that predicts the protein ATP binding site using the support vector machine (SVM) method (https://zhanglab.ccmb.med.umich.edu/ATPbind/) | (Hu, Li, Zhang, & Yu, 2018) |
| 4 | CASTp | Computed Atlas of Surface Topography of proteins (CASTp) is an online resource for locating, delineating, and measuring concave surface regions on 3D structures of proteins (http://sts.bioe.uic.edu/castp/index.html?2cpk) | (Binkowski, Naghibzadeh, & Liang, 2003) |
| 5 | fpocket | fpocket is a Voronoi tessellation-based open source protein pocket (cavity) detection algorithm (http://fpocket.sourceforge.net/) | (Le Guilloux, Schmidtke, & Tuffery, 2009) |
| 6 | 3DLigandSite | 3DLigandSite is an automated method for the prediction of ligand binding sites in 3D protein structures (http://www.sbg.bio.ic.ac.uk/3dligandsite/) | (Wass, Kelley, & Sternberg, 2010) |
| 7 | Pocketome | An encyclopedia of conformational ensembles of druggable binding sites that can be identified experimentally from the cocrystal structures available in the Protein Data Bank (http://www.pocketome.org/) | (Kufareva, Ilatovskiy, & Abagyan, 2012) |
| 8 | FINDSITE | FINDSITE is a threading-based binding site prediction and ligand screening algorithm that detects common ligand binding sites in a set of evolutionarily related proteins (http://cssb.biology.gatech.edu/findsite) | (Brylinski & Skolnick, 2008) |
| 9 | PocketDepth | A geometry-based pocket prediction method that uses depth-based clustering (http://proline.physics.iisc.ernet.in/pocketdepth/) | (Kalidas & Chandra, 2008) |
| 10 | DeepSite | DeepSite uses deep neural networks to predict protein binding pockets (http://www.playmolecule.org/deepsite/) | (Jimenez, Doerr, Martinez-Rosell, Rose, & De Fabritiis, 2017) |
A number of docking tools/servers (e.g., AutoDock, DOCK, eFindSite, etc.) are available, and they are widely applied in drug design. These docking methods can also be used for drug repurposing: the approved DrugBank compounds are docked to a novel drug target to evaluate their binding affinity and, based on the predicted binding affinity, the compounds are shortlisted for testing using various in vitro and in vivo methods.
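The shortlisting step can be sketched as a filter on predicted binding energies. Everything concrete here is assumed for illustration: the compound names and scores are invented, the energies are taken to be in kcal/mol with more negative values meaning stronger predicted binding, and the cutoff of -8.0 and the top-3 limit are arbitrary choices rather than recommended values.

```python
def shortlist(scores, cutoff=-8.0, top_n=3):
    """Keep compounds scoring at or below the cutoff, best (most negative) first.

    scores: dict mapping compound name -> predicted binding energy (kcal/mol).
    """
    passing = [(name, score) for name, score in scores.items() if score <= cutoff]
    passing.sort(key=lambda item: item[1])
    return passing[:top_n]


# Hypothetical docking scores for approved drugs against a new target
scores = {
    "drug_A": -9.4,
    "drug_B": -6.1,
    "drug_C": -8.2,
    "drug_D": -10.3,
    "drug_E": -8.9,
}

selected = shortlist(scores)
for name, score in selected:
    print(f"{name}\t{score:.1f}")
```

Docking scores are approximate, so a shortlist like this is a prioritization aid for experimental testing rather than evidence of activity.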
AutoDock Vina is an open-source standalone docking-based VS tool used to screen a large number of chemical compounds from different databases. It is a grid-based docking method that predicts the binding mode of the compounds at the defined binding site of the target protein. AutoDock Vina can be employed to screen DrugBank compounds to assess their binding affinity towards other novel drug targets. MTiOpenScreen is a web server that provides a docking-based VS facility built on AutoDock Vina. Using this server, diverse compounds from different databases as well as custom library compounds can be screened against a particular drug target. eFindSite is another web server that automatically predicts the ligand binding site on a given target protein using a metathreading approach, and then uses this information to screen a large number of compounds from different databases, including BindingDB, ChEMBL, DrugBank, KEGG compounds, RCSB PDB, etc. The compounds are then ranked based on their Z-scores, and the top-ranked ones can be tested using in vitro methods. In addition to these, a number of other servers, databases, and tools are available for docking-based screening; they are listed in Table 6.
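Setting up a small AutoDock Vina screen amounts to running the same docking command for every ligand with a fixed search box. The sketch below only assembles the command lines and does not run them. The file names, box center, and box size are hypothetical placeholders; the options used (`--receptor`, `--ligand`, `--center_x/y/z`, `--size_x/y/z`, `--exhaustiveness`, `--out`) are standard Vina command-line options, and the executable is assumed to be on the path as `vina`.

```python
import shlex


def vina_command(receptor, ligand, center, size, out, exhaustiveness=8):
    """Build the argument list for one AutoDock Vina docking run."""
    cx, cy, cz = center
    sx, sy, sz = size
    return [
        "vina",
        "--receptor", receptor,
        "--ligand", ligand,
        "--center_x", str(cx), "--center_y", str(cy), "--center_z", str(cz),
        "--size_x", str(sx), "--size_y", str(sy), "--size_z", str(sz),
        "--exhaustiveness", str(exhaustiveness),
        "--out", out,
    ]


# Hypothetical inputs: a prepared receptor and two prepared ligands (PDBQT format)
receptor = "target.pdbqt"
ligands = ["drug_A.pdbqt", "drug_B.pdbqt"]
box_center = (10.0, 12.5, -3.0)   # assumed binding site center, in angstroms
box_size = (20, 20, 20)           # assumed search box edge lengths, in angstroms

commands = [
    vina_command(receptor, lig, box_center, box_size, lig.replace(".pdbqt", "_out.pdbqt"))
    for lig in ligands
]

for cmd in commands:
    print(shlex.join(cmd))
```

Each argument list could be passed to `subprocess.run` once the input files exist; the box would normally be centered on the binding site identified in the previous step.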
TABLE 6 List of Available Docking-based Screening Tools/Web Servers

| Sl. No. | Tools/Servers | Description and Web-Links | References |
| --- | --- | --- | --- |
| 1 | AutoDock 4 | Downloadable Lamarckian genetic algorithm-based docking software that consists of two main programs: (1) AutoDock, which performs the docking of the ligand to a set of grids describing the target protein, and (2) AutoGrid, which precalculates these grids (http://autodock.scripps.edu/) | (Morris et al., 2009) |
| 2 | AutoDock Vina | An open-source downloadable program for molecular docking that supports Windows and Linux systems. Its binding mode predictions are reported to be more accurate than those of AutoDock 4 (http://vina.scripps.edu/) | (Trott & Olson, 2010) |
| 3 | DOCK | Downloadable; the latest DOCK 6 version incorporates receptor flexibility (http://dock.compbio.ucsf.edu/) | (Allen et al., 2015) |
| 4 | iGEMDOCK | iGEMDOCK is a downloadable graphical environment for virtual screening that uses an evolutionary technique for docking (http://gemdock.life.nctu.edu.tw/dock/igemdock.php) | (Yang & Chen, 2004) |
| 5 | Rosetta | A multipurpose application for structure prediction, design, and remodeling of proteins and nucleic acids. It uses a Monte Carlo minimization-based docking algorithm (https://www.rosettacommons.org/software) | (Kaufmann, Lemmon, Deluca, Sheehan, & Meiler, 2010) |
| 6 | GalaxyDock | Downloadable; works based on the protein side chain conformational space annealing method. Runs on Linux and Mac systems (http://galaxy.seoklab.org/softwares/galaxydock.html) | (Baek, Shin, Chung, & Seok, 2017) |
| 7 | MTiOpenScreen | AutoDock Vina pipeline-based web server for virtual screening (http://mobyle.rpbs.univ-paris-diderot.fr/cgi-bin/portal.py#forms::MTiOpenScreen) | (Labbe et al., 2015) |
| 8 | Computer-Aided Drug Design Platform using PyMOL | A downloadable PyMOL plugin for the analysis, computation, and simulation of protein-ligand complexes; it runs on Linux and requires PyMOL to be installed beforehand (http://people.pharmacy.purdue.edu/mlill/software/pymol_plugins/install.shtml) | (Lill & Danielson, 2011) |
| 9 | GriDock | Downloadable virtual screening front-end for AutoDock 4, designed for docking of ligands stored in a single database (SDF format) (http://159.149.85.2/cms/index.php?Software_projects:GriDock) | (Vistoli, Pedretti, Mazzolari, & Testa, 2010) |
| 10 | SwissDock | A web service that predicts the molecular interactions between a target protein and a small molecule. It is based on the docking software EADock DSS (http://www.swissdock.ch/) | (Grosdidier, Zoete, & Michielin, 2011) |
| 11 | PatchDock | The web server uses shape complementarity principles for the docking of proteins, DNA, peptides, and drugs (http://bioinfo3d.cs.tau.ac.il/PatchDock/patchdock.html) | (Schneidman-Duhovny, Inbar, Nussinov, & Wolfson, 2005) |
| 12 | iScreen | The world's first cloud-computing web server for virtual screening and de novo drug design based on TCM Database@Taiwan (http://iscreen.cmu.edu.tw/) | (Tsai, Chang, & Chen, 2011) |
| 13 | idock | Structure-based virtual screening tool, available as a web server and for download (http://istar.cse.cuhk.edu.hk/idock/) | (H. Li, Leung, & Wong, 2012) |
As discussed in Section 3, SBP screening is a complementary and alternative method to docking for screening large chemical libraries against a given drug target. The pharmacophore features derived from the protein-ligand complex use the knowledge of the potential interactions that exist between the protein and the ligand, whereas the features derived from the protein alone use active site/hot spot residue information. The derived pharmacophore features can be used to screen DrugBank compounds for new indications. A few tools/software (e.g., AnchorQuery, ZINCPharmer, etc.) are available for SBP screening, and they are listed in Table 7.
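Pharmacophore-based screening can be pictured as checking whether a compound conformer presents the required feature types at compatible distances. The sketch below is a deliberately simple illustration and not the search method of ZINCPharmer, Pharmit, or AnchorQuery: the query features, coordinates, compound names, and the 1.0 angstrom tolerance are all assumptions, and matching on pairwise distances alone cannot tell a geometry from its mirror image.

```python
from itertools import permutations
from math import dist


def matches(query, conformer, tol=1.0):
    """True if some subset of conformer features reproduces the query.

    query, conformer: lists of (feature_type, (x, y, z)).
    A match needs the same feature types and all pairwise distances
    within tol (angstroms). Mirror images are not distinguished.
    """
    n = len(query)
    for combo in permutations(range(len(conformer)), n):
        # Feature types must agree position by position.
        if any(conformer[j][0] != query[i][0] for i, j in enumerate(combo)):
            continue
        # All pairwise distances must agree within the tolerance.
        if all(
            abs(dist(query[a][1], query[b][1])
                - dist(conformer[combo[a]][1], conformer[combo[b]][1])) <= tol
            for a in range(n) for b in range(a + 1, n)
        ):
            return True
    return False


# Hypothetical three-point pharmacophore derived from a binding site
query = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (4.0, 0.0, 0.0)),
    ("hydrophobic", (0.0, 3.0, 0.0)),
]

# Hypothetical library: one conformer per compound
library = {
    "compound_1": [  # same geometry in a different frame, plus an extra donor
        ("hydrophobic", (10.0, 3.0, 5.0)),
        ("donor", (10.0, 0.0, 5.0)),
        ("acceptor", (10.0, 0.0, 9.0)),
        ("donor", (20.0, 20.0, 20.0)),
    ],
    "compound_2": [  # donor and acceptor too far apart
        ("donor", (0.0, 0.0, 0.0)),
        ("acceptor", (9.0, 0.0, 0.0)),
        ("hydrophobic", (0.0, 3.0, 0.0)),
    ],
}

pharmacophore_hits = [name for name, conf in library.items() if matches(query, conf)]
print(pharmacophore_hits)
```

The brute-force enumeration is fine for a handful of features; screening tools rely on indexing and multiple conformers per compound to scale to large libraries.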
TABLE 7 List of Available Pharmacophore-based Screening Tools/Web Servers

| Sl. No. | Tools/Servers | Description and Web-Links | References |
| --- | --- | --- | --- |
| 1 | PharmDock | A pharmacophore-based docking program that combines pose sampling and ranking based on optimized protein-based pharmacophore models with local optimization using an empirical scoring function. The program comes with an easy-to-use GUI within PyMOL and is downloadable (http://people.pharmacy.purdue.edu/mlill/software/pharmdock/) | (Hu & Lill, 2014) |
| 2 | ZINCPharmer | An open source pharmacophore search web server that can identify pharmacophore features directly from structure or use MOE and LigandScout pharmacophore definitions for searching chemical structures from the ZINC or MolPort databases (http://zincpharmer.csb.pitt.edu/) | (Koes & Camacho, 2012b) |
| 3 | AnchorQuery | Web server specialized in pharmacophore search for targeting protein-protein interactions (http://anchorquery.ccbb.pitt.edu/) | (Koes, Domling, & Camacho, 2018) |
| 4 | Pharmit | A pharmacophore-based virtual screening web server, which can generate pharmacophore features from an input ligand or from a protein-ligand complex (http://pharmit.csb.pitt.edu/) | (Sunseri & Koes, 2016) |

3.6 Drug Repurposing for Protein-Protein Interactions
The previous sections discuss the application of CADD techniques and tools for drug repurposing using target-based approaches. The methods, tools, servers, and databases described there focus on the classical drug targets for drug discovery, i.e., enzymes, receptors, ion channels, and transporters. However, growing evidence supports the fact that altered signaling pathways leading to pathological states could also be due to aberrations in protein-protein interactions (PPIs). Such PPIs are emerging as possible drug targets for modulation by small molecule inhibitors. One of the main advantages of PPI inhibitors is that they are very specific to a particular PPI, which offers more selectivity for the drug with lower toxicity. The availability of computational methods and tools helps in faster identification of PPI inhibitors (Choi & Choi, 2017; Cierpicki & Grembecka, 2015; Villoutreix et al., 2014). The availability of the PPI partners' 3D structure can be a limiting factor for identifying a drug or repurposing a drug to disrupt their interaction. Specialized servers, tools, and databases are available to gather information on PPIs related to disease states, and to model and predict these PPIs. Moreover, tools are available to detect the hot spot regions of the PPIs. Using these tools, one can predict the complex structure of PPIs and identify the hot spot regions/residues of the complex. Hot spot residues are the residues in a PPI interface that contribute most to the binding of the two proteins in the complex (Kuttner & Engel, 2012; Rosell & Fernandez-Recio, 2018). Typically, hot spot region information is used to design ligands that bind strongly to these residues and interfere with the PPI, which could alleviate the pathophysiological condition. Interested readers may refer to recent reviews detailing how computational approaches can be utilized for the design of PPI inhibitors (Gromiha, Yugandhar, & Jemimah, 2017; Johnson & Karanicolas, 2017; Peng, Wang, Peng, Wu, & Pan, 2017; Sarvagalla & Coumar, 2016).
Tables 8–10 provide a list of databases, tools, and servers freely available to model PPIs and use them for drug repurposing.
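Hot spot detection by computational alanine scanning can be reduced to thresholding the predicted change in binding free energy for each interface residue. The sketch below assumes hypothetical residue labels and ΔΔG values; the 2.0 kcal/mol threshold is a commonly used convention and is treated here as an assumption, since the text does not state a cutoff.

```python
def hot_spots(ddg_by_residue, threshold=2.0):
    """Return residues whose alanine mutation costs at least `threshold` kcal/mol.

    ddg_by_residue: dict mapping residue label -> predicted ddG of binding
    upon mutation to alanine (positive means the complex is destabilized).
    Results are ordered from the largest to the smallest ddG.
    """
    selected = [res for res, ddg in ddg_by_residue.items() if ddg >= threshold]
    return sorted(selected, key=lambda res: -ddg_by_residue[res])


# Hypothetical alanine-scanning output for a two-chain interface (kcal/mol)
ddg = {
    "A:Trp45": 3.1,
    "A:Ala46": 0.2,
    "B:Tyr102": 2.4,
    "B:Ser103": 0.7,
    "A:Arg50": 2.0,
}

predicted_hot_spots = hot_spots(ddg)
print(predicted_hot_spots)
```

Residues flagged this way would be the natural anchor points when docking or pharmacophore screening is focused on a PPI interface.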
TABLE 8 List of Curated PPI Modulator Databases

| Sl. No. | Name of the Database | Description and Web-Links | References |
| --- | --- | --- | --- |
| 1 | 2P2IDB | A hand-curated structural database dedicated to protein-protein interactions with known orthosteric modulators (http://2p2idb.cnrs-mrs.fr) | (Basse et al., 2013) |
| 2 | iPPI-DB | A manually curated and interactive database of small nonpeptide inhibitors of protein-protein interactions (http://www.ippidb.cdithem.fr/) | (Labbe et al., 2016) |
| 3 | TIMBAL | A database holding molecules of molecular weight <1200 Da that modulate protein-protein interactions (http://mordred.bioc.cam.ac.uk/timbal) | (Higueruelo et al., 2009) |
| 4 | ANCHOR | Web-based tool to facilitate the analysis of protein-protein interfaces with regard to their suitability for small molecule drug design (http://structure.pitt.edu/anchor/) | (Meireles, Domling, & Camacho, 2010) |
TABLE 9 List of Available Protein-Protein Docking Tools and Web Servers

| Sl. No. | Tools/Servers | Description and Web-Links | References |
| --- | --- | --- | --- |
| 1 | ClusPro | Web server that performs rigid-body docking using the FFT correlation approach and RMSD clustering (http://nrc.bu.edu/cluster) | (Comeau, Gatchell, Vajda, & Camacho, 2004) |
| 2 | HADDOCK | A data-driven flexible docking web server used to model biomolecular complexes (http://haddocking.org/) | (van Zundert et al., 2016) |
| 3 | ZDOCK | Available as a web server that uses fast Fourier transformation (FFT), shape complementarity, desolvation energy, and electrostatic interactions to predict the complex structures (http://zdock.umassmed.edu/) | (Pierce et al., 2014) |
| 4 | RosettaDock | Web server that optimizes the rigid body docked poses and side chains to find the lowest energy conformation of the complex (http://rosettadock.graylab.jhu.edu/) | (Chaudhury et al., 2011) |
| | REFINEMENT/RANKING/POST DOCKING ANALYSIS MODULES | | |
| 5 | FireDock | An efficient method for refinement and rescoring of rigid-body protein-protein docking poses (http://bioinfo3d.cs.tau.ac.il/FireDock/) | (Mashiach, Schneidman-Duhovny, Andrusier, Nussinov, & Wolfson, 2008) |
| 6 | HADDOCK Refinement interface | Refines protein-protein docking poses using the MD simulated annealing method (http://haddock.science.uu.nl/services/HADDOCK2.2/haddockserver-refinement.html) | (van Zundert et al., 2016) |
| 7 | DockRank | Available as a web server; ranks docked models using predicted partner-specific protein-protein binding sites (http://ailab1.ist.psu.edu/DockRank/) | (Xue, Jordan, El-Manzalawy, Dobbs, & Honavar, 2014) |
| 8 | DOCKSCORE | A web server for ranking protein-protein docked poses that employs DockScore for ranking the poses (http://caps.ncbs.res.in/dockscore/) | (Malhotra, Mathew, & Sowdhamini, 2015) |
TABLE 10 List of PPI Hot Spot Detection Tools and Web Servers

| Sl. No. | Tools/Servers | Description and Web-Links | References |
| --- | --- | --- | --- |
| 1 | Robetta | Web server useful for computational alanine scanning to detect the hot spots in protein-protein interaction interfaces (http://www.robetta.org/alascansubmit.jsp) | (Park, Kim, Ovchinnikov, Baker, & DiMaio, 2018) |
| 2 | KFC2 | Knowledge-based FADE and Contacts (KFC2) web server to predict binding hot spots within protein-protein interfaces (http://kfc.mitchell-lab.org/) | (Zhu & Mitchell, 2011) |
| 3 | HotRegion | A database of predicted hot spot clusters (http://prism.ccbb.ku.edu.tr/hotregion/) | (Cukuroglu, Gursoy, & Keskin, 2012) |
| 4 | DrugScorePPI | Web server for in silico alanine scanning for hot spot detection (http://cpclab.uni-duesseldorf.de/dsppi/main.php) | (Kruger & Gohlke, 2010) |
| 5 | Hot Spot Prediction | Downloadable program (http://sfb.kaust.edu.sa/pages/software.aspx) | (P. Chen et al., 2013) |
| 6 | PocketQuery | PocketQuery is a web service for interactively exploring hot spot residues at the protein-protein interaction interface (http://pocketquery.csb.pitt.edu/) | (Koes & Camacho, 2012a) |
4 DATABASES AND TOOLS FOR DRUG REPURPOSING USING GENE EXPRESSION SIGNATURES
In addition to the databases, tools, and servers described in the previous section for target-based drug repurposing, a number of databases/servers are freely available to the scientific community for drug repurposing using gene expression information. Some of these servers are described here to help readers use them in their drug repurposing studies.
The Connectivity Map (CMap; https://clue.io/) database was constructed from gene expression signatures of cultured human cells treated with different bioactive compounds. These gene expression signatures help to connect drugs, genes, and diseases. Recently, the gene expression signature data in CMap were expanded 1000-fold using L1000 expression profiling, a low-cost, high-throughput gene expression profiling method. A total of 476,251 expression signatures were obtained from a panel of cell lines exposed to 27,927 perturbagens. From this database, users can access 1.3M L1000 profiles and the tools for analyzing the results (Subramanian et al., 2017).
L1000 firework display (L1000FWD; http://amp.pharm.mssm.edu/L1000FWD) is a freely accessible web application in which users can visualize over 16,000 drug and small molecule-induced gene expression signatures. Over 20,000 small molecule compounds were used to profile the gene expression changes in human cell lines using the L1000 expression method (Wang, Lachmann, Keenan, & Ma’ayan, 2018).
DMAP (http://bio.informatics.iupui.edu/cmaps) is a database of in silico drug-protein connectivity maps. Users can obtain data about the effect of a drug on disease-associated genes or proteins. Compared to CMap, DMAP has more data coverage, with 24,121 compounds and 438,004 chemical-to-protein effect relationships (Huang et al., 2015).
Drug Signature Database (DSigDB; http://tanlab.ucdenver.edu/DSigDB) is a freely available database with 22,527 gene sets and 17,389 compounds covering 19,531 genes. These data can be integrated into gene set enrichment analyses (GSEAs) to identify direct links between genes and drugs for drug repurposing studies (Yoo et al., 2015).
DrugSig (http://biotechlab.fudan.edu.cn/database/drugsig) is a freely available database with more than 1300 drugs, 7000 microarrays, and 800 drug targets. It was constructed from drug response microarray data by selecting the top 500 upregulated and downregulated genes. Both signature-based and target-based approaches to drug repurposing can be integrated using the gene expression signature data available in this database (Wu, Huang, Zhong, & Huang, 2017).
Drug-Path (http://www.cuilab.cn/drugpath) is a freely available database built on the drug-influenced gene expression data available in Connectivity Map (CMap) to identify drug-associated pathways. Users can search for a drug of interest to obtain the list of pathways influenced by that drug, which can then be visualized and downloaded (Zeng, Qiu, & Cui, 2015).
NFFinder (http://nffinder.cnb.csic.es) is a freely available tool that uses transcriptomic data to relate drugs and diseases; it is useful for finding potent drugs for orphan diseases (Setoain et al., 2015).
Many more freely available databases, such as QUADrATiC (O’Reilly et al., 2016), DeSigN (Lee et al., 2017), Cogena (Jia et al., 2016), and Connection Map for Compounds (CMC) (Liu et al., 2016), also provide gene expression signatures that link drugs, genes, and diseases and can be useful for drug repurposing studies.
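The idea common to the signature-based resources described in this section, namely matching a drug-induced expression signature against a disease signature, can be illustrated with a minimal Python sketch. It is not the scoring scheme of any of the servers listed here (CMap, for example, relies on a rank-based enrichment statistic). It simply assumes that each signature is a pair of upregulated and downregulated gene sets and scores how strongly a drug opposes the disease signature. The function names, gene names, and drug names are hypothetical.

```python
def reversal_score(disease_up, disease_down, drug_up, drug_down):
    """Return a score in [-1, 1]; positive values mean the drug signature
    tends to oppose (reverse) the disease signature."""
    disease_up, disease_down = set(disease_up), set(disease_down)
    drug_up, drug_down = set(drug_up), set(drug_down)
    # Genes that the drug moves in the direction opposite to the disease
    reversed_genes = len(disease_up & drug_down) + len(disease_down & drug_up)
    # Genes that the drug moves in the same direction as the disease
    mimicked_genes = len(disease_up & drug_up) + len(disease_down & drug_down)
    total = reversed_genes + mimicked_genes
    if total == 0:
        return 0.0  # no shared genes, so no evidence either way
    return (reversed_genes - mimicked_genes) / total


def rank_drugs(disease_signature, drug_signatures):
    """Sort drugs from the strongest reversal to the strongest mimicry."""
    d_up, d_down = disease_signature
    scores = {
        name: reversal_score(d_up, d_down, up, down)
        for name, (up, down) in drug_signatures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: (upregulated genes, downregulated genes)
disease = ({"GENE_A", "GENE_B", "GENE_C"}, {"GENE_X", "GENE_Y"})
drugs = {
    "drug_1": ({"GENE_X", "GENE_Y"}, {"GENE_A", "GENE_B"}),
    "drug_2": ({"GENE_A"}, {"GENE_X"}),
    "drug_3": ({"GENE_A", "GENE_X"}, {"GENE_Q"}),
}
ranking = rank_drugs(disease, drugs)
for name, score in ranking:
    print(f"{name}: {score:+.2f}")
```

In this toy example, drug_1 reverses every shared gene and is ranked first, whereas drug_2 mimics the disease signature and is ranked last. In practice, the gene sets would be taken from one of the databases above, and a rank-based statistic would usually replace the simple set overlap.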
5 DRUG REPURPOSING FOR AURORA KINASE C TARGET USING CADD (A CASE STUDY)
5.1 Aurora Kinase C Model Building and Validation
To demonstrate how the resources discussed in the previous sections can be utilized for drug repurposing, we have chosen Aurora kinase C as a drug target. Virtual screening (VS) of approved drugs using docking-based methods can identify possible drugs that can be repurposed for this target. There are three isoforms of Aurora kinases (A, B, and C), and they are involved in the cell division process. Since overexpression of Aurora kinases is reported in cancer, they have been targets for new drug development in the past decade (Borisa & Bhatt, 2017; Cheung, Sarvagalla, Lee, Huang, & Coumar, 2014; Damodaran, Vaufrey, Gavard, & Prigent, 2017). Several 3D structures of human Aurora kinase A (PDB ID: 3E5A, etc.) and Aurora kinase B (PDB ID: 4AF3) in complex with inhibitors are reported in RCSB PDB. However, the 3D structure of Aurora kinase C is not yet reported. Hence, we specifically chose this as a drug target to demonstrate the homology modeling process, followed by the use of this model to carry out VS of DrugBank compounds.
Human Aurora kinase was searched by text in the UniProt database (http://www.uniprot.org/uniprot), and the FASTA sequence file with ID: Q9UQB9 was retrieved for Aurora kinase C. Subsequently, the FASTA sequence was used to build the 3D structure of the target protein using the SWISS-MODEL web server (https://swissmodel.expasy.org/). SWISS-MODEL is a fully automated and freely available homology modeling server that predicts the protein 3D structure by selecting highly identical and homologous template structures from PDB. It is accessible via the ExPASy web server or the Swiss-PdbViewer module. To build the structure, the FASTA sequence of Aurora kinase C was first submitted to the SWISS-MODEL server to identify appropriate template PDB structures. For Aurora kinase C, several PDB structures were identified as possible templates (Fig. 5). From these, the Aurora kinase B structure (PDB ID: 4AF3) was chosen as the template, as it had maximum query coverage and 84% identity to the Aurora kinase C sequence. Using 4AF3 as the input template structure, the Aurora kinase C 3D structure was built by the server and given as output in PDB format. Users have the option of choosing the template structure for model building. This option will be particularly useful when PDB structures are available
from different organisms or when query coverage for the search sequence is very low. In such cases, the users can guide the model building process by selecting the appropriate template.
FIG. 5 Template retrieval from SWISS-MODEL (https://swissmodel.expasy.org/) for Aurora.
Next, the modeled Aurora kinase C structure was subjected to energy minimization using UCSF Chimera software (https://www.cgl.ucsf.edu/chimera/download.html). This is a freely available standalone tool for visualization and processing of biomolecular structures. During minimization, the missing hydrogen atoms were added and the AMBER ff14SB force field was used to correct the bond orders and angles of the protein residues. Finally, the structure was optimized using steepest descent (100 steps) and conjugate gradient (10 steps) methods.
The overall geometric quality and reliability of the minimized Aurora C structure were then checked using the Rampage (http://mordred.bioc.cam.ac.uk/rapper/rampage.php) and Protein Structure Analysis (ProSa) web servers (https://prosa.services.came.sbg.ac.at/prosa.php). The Rampage server generates the Ramachandran plot, which represents the energetically allowed regions for the backbone dihedral angles ψ against φ of amino acid residues in the modeled protein structure. For the Aurora kinase C modeled structure, we found that 95.1% of residues were in the favored region, 4.5% of residues were in the allowed region, and only 0.4% of residues were in the outlier region. Further, the Z-score quality plot from the ProSa web server revealed that the constructed model has NMR structure quality and is suitable for further investigations. The built 3D model of Aurora kinase C and the validation results are provided in Fig. 6.
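The backbone dihedral angles φ and ψ that a Ramachandran plot displays can be computed directly from atomic coordinates. The following Python sketch illustrates that calculation only; it is not the procedure implemented by the Rampage server, and it does not classify residues into favored, allowed, or outlier regions. The function names and the toy coordinates are assumptions made for this example; in practice, the coordinates would be read from the PDB file of the model.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle (degrees) about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Remove the components parallel to the central bond
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi_psi(prev_c, n, ca, c, next_n):
    """phi = C(i-1)-N-CA-C and psi = N-CA-C-N(i+1) for residue i."""
    phi = dihedral(prev_c, n, ca, c)
    psi = dihedral(n, ca, c, next_n)
    return phi, psi


# Toy coordinates (not real protein geometry), chosen so that the
# angles are easy to verify by hand.
phi, psi = phi_psi((1, 0, 0), (0, 0, 0), (0, 1, 0), (0, 1, 1), (1, 1, 1))
print(f"phi = {phi:.1f}, psi = {psi:.1f}")
```

Each (φ, ψ) pair computed in this way corresponds to one point in the Ramachandran plot of the model.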
FIG. 6 Aurora kinase C 3D model building and validation. (A) 3D model of Aurora kinase C built by SWISS-MODEL server (https://www.cgl.ucsf.edu/chimera/download.html) and the model validation by (B) Ramachandran plot from the Rampage server (http://mordred.bioc.cam.ac.uk/rapper/rampage.php) and (C) Z-score quality plot from the ProSa server (https://prosa.services.came.sbg.ac.at/prosa.php).
5.2 Binding Site Identification and Virtual Screening of DrugBank Compounds
Next, the minimized Aurora C structure was used to identify the ligand binding sites using two web servers: (1) ATPbind (https://zhanglab.ccmb.med.umich.edu/ATPbind/) and (2) eFindSite (http://brylinski.cct.lsu.edu/content/efindsite-webserver). The ATPbind server predicts the ATP binding site in a given protein structure using support vector machine (SVM) learning methods. On the other hand, eFindSite predicts the common ligand binding site using a set of evolutionarily related proteins identified by the meta-threading method. Further, based on the identified binding site, it also virtually screens small molecule library compounds from different databases. We used both servers to identify ligand binding sites in the modeled Aurora kinase C structure by submitting the PDB file to the servers. The ATPbind server results revealed that the residues Leu14, Gly15, Lys16, Gly17, Lys18, Gly20, Val22, Lys37, Lys69, Lys85, Glu86, Tyr87, Ala88, Glu92, Glu135, Asn136, Leu138, and Asp149 formed the ATP binding site in Aurora C. On the other hand, eFindSite identified a total of five probable ligand binding sites. Comparison of the first binding site residues (Leu14, Gly15, Lys18, Phe19, Gly20, Val22, Lys37, Leu69, Leu85, Tyr87, Ala88, Gly91, Glu92, Tyr94, Glu135, Asn136, Leu137, Leu138, Lys146, Ile147, Ala148, Asp149, Phe150, and Trp152) with the ATPbind server results revealed that most of the identified residues are common to both methods. Hence, we could consider these residues for defining the ligand binding site for VS of DrugBank compounds for drug repurposing using SBDD methods. The ligand binding site identified by the ATPbind server is shown in Fig. 7.
FIG. 7 ATP binding site prediction by the ATPbind server (https://zhanglab.ccmb.med.umich.edu/ATPbind/) for modeled 3D structure of Aurora kinase C.
To identify drugs that could be repurposed for the Aurora kinase C target, we carried out VS of DrugBank compounds using the eFindSite web server, which automatically identifies the ligand binding site residues and screens various in-built small molecule library compounds (BindingDB, ChEMBL, DrugBank, KEGG compounds, KEGG drugs, NCI-open, RCSB PDB, and ZINC) for the identification of probable hits. While setting up the screening, we selected only approved DrugBank compounds. The top five Z-score compounds and their DrugBank IDs are shown in Fig. 8. DB06963 is an experimental drug reported as a serine/threonine-protein kinase PLK1 inhibitor, and the Plk1-DB06963 crystal structure complex is available as PDB ID: 3DBC. DB08583 is another experimental drug reported as a tyrosine kinase Abl1 inhibitor, and the Abl1-DB08583 crystal structure complex is available as PDB ID: 3DBC. DB08242 is an experimental drug reported as a p38α MAP kinase inhibitor, and the crystal structure complex is available as PDB ID: 3L8X. On the other hand, DB06909 is an experimental drug reported as a selective PDE4B inhibitor, and the crystal structure complex is available as PDB ID: 3D3P. DB07689 is reported as an inhibitor of pteridine reductase 1 of Leishmania major and is available in complex with the protein as PDB ID: 3H4V in RCSB PDB. Of the five top Z-score compounds, three are known kinase inhibitors, suggesting that kinase inhibitors could possess polypharmacological effects by inhibiting multiple kinase targets, which could be exploited for drug repurposing.
FIG. 8 Compounds identified from DrugBank as potential candidates for Aurora kinase C target using the eFindSite web server (http://brylinski.cct.lsu.edu/content/efindsite-webserver).
As a next step, the ability of the identified DrugBank compounds to interact with Aurora kinase C can be investigated using in silico techniques such as docking and molecular dynamics simulation. Further testing in in vitro and in vivo experimental models of Aurora kinase C inhibition could then enable drug repurposing for this target.
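The comparison between the ATPbind and eFindSite predictions described in this section can be reproduced with a few lines of Python. The sketch below is only an illustration: it uses the residue lists exactly as reported in the text (with the ATPbind entry for position 20 written as Gly20), compares them by residue number, and reports a Jaccard index. The choice of the Jaccard index and all variable and function names are assumptions made for this example.

```python
import re

ATPBIND = (
    "Leu14 Gly15 Lys16 Gly17 Lys18 Gly20 Val22 Lys37 Lys69 Lys85 "
    "Glu86 Tyr87 Ala88 Glu92 Glu135 Asn136 Leu138 Asp149"
).split()

EFINDSITE = (
    "Leu14 Gly15 Lys18 Phe19 Gly20 Val22 Lys37 Leu69 Leu85 Tyr87 "
    "Ala88 Gly91 Glu92 Tyr94 Glu135 Asn136 Leu137 Leu138 Lys146 "
    "Ile147 Ala148 Asp149 Phe150 Trp152"
).split()


def by_position(residues):
    """Map residue number to three-letter name, e.g. 'Leu14' -> {14: 'Leu'}."""
    parsed = {}
    for label in residues:
        match = re.fullmatch(r"([A-Za-z]{3})(\d+)", label)
        if match is None:
            raise ValueError(f"unrecognized residue label: {label}")
        parsed[int(match.group(2))] = match.group(1)
    return parsed


atp = by_position(ATPBIND)
efs = by_position(EFINDSITE)

shared = sorted(atp.keys() & efs.keys())   # positions predicted by both servers
union = atp.keys() | efs.keys()            # positions predicted by either server
jaccard = len(shared) / len(union)
# Shared positions where the two lists give different residue names
mismatched = [pos for pos in shared if atp[pos] != efs[pos]]

print("Shared positions:", shared)
print(f"Jaccard index: {jaccard:.2f}")
print("Name mismatches at positions:", mismatched)
```

With the residue lists as printed above, 15 of the 27 positions predicted by either server are shared (Jaccard index of about 0.56). The sketch also flags positions 69 and 85, where the two lists give different residue names (Lys versus Leu), so these positions should be checked against the Aurora kinase C sequence before the binding site is defined.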
6 SUMMARY
New drug discovery is a tedious process that demands a large investment of time and money. In a typical new drug discovery program, it takes between 10 and 17 years for a drug to move from bench to bedside. Scientists worldwide seek ingenious ways to bring drugs to market in shorter time spans. One such strategy is drug repurposing, in which older approved drugs are developed for new therapeutic indications. This strategy can shorten the time required to bring a drug to the market, because the preclinical and clinical toxicity profiles of the old drugs are already known. Drug repurposing can result from the serendipitous observation, in either clinical or preclinical studies, that a drug is effective for a new indication. In recent times, systematic investigation of older drugs against newer drug targets using preclinical assessments has led to the identification of new uses for old drugs. In this connection, the use of CADD techniques for drug repurposing is gaining momentum because CADD is inherently efficient in time and resources.
This chapter brings together in silico tools, servers, and databases that are available as open source for drug repurposing. It provides information on open access databases for drugs, protein sequences of drug targets, and 3D protein structures of drug targets. Moreover, the tools/software and servers available for constructing the 3D structure of drug targets from protein sequence using homology modeling are discussed. Finally, docking and pharmacophore-based VS tools/software and servers are discussed. Using the information provided in this chapter, scientists can set up VS experiments to screen approved drugs against new drug targets. This is demonstrated through an example using Aurora kinase C as a drug target. We believe that the computational resources provided in this chapter can help drug discovery scientists to speed up not only drug repurposing but also new drug discovery.
References
Allen, W. J., Balius, T. E., Mukherjee, S., Brozell, S. R., Moustakas, D. T., Lang, P. T., et al. (2015). DOCK 6: impact of new features and current docking performance. Journal of Computational Chemistry, 36(15), 1132–1156.
Andreeva, A., Howorth, D., Chandonia, J. M., Brenner, S. E., Hubbard, T. J., Chothia, C., et al. (2008). Data growth and its impact on the SCOP database: new developments. Nucleic Acids Research, 36(Database issue), D419–D425.
Baek, M., Shin, W. H., Chung, H. W., & Seok, C. (2017). GalaxyDock BP2 score: a hybrid scoring function for accurate protein-ligand docking. Journal of Computer-Aided Molecular Design, 31(7), 653–666.
Barrett, T. (2013). Gene expression omnibus (GEO). In The NCBI Handbook [Internet] (2nd ed.): NCBI.
Bartuzi, D., Kaczor, A. A., Targowska-Duda, K. M., & Matosiuk, D. (2017). Recent advances and applications of molecular docking to G protein-coupled receptors. Molecules, 22(2).
Basse, M. J., Betzi, S., Bourgeas, R., Bouzidi, S., Chetrit, B., Hamon, V., et al. (2013). 2P2Idb: a structural database dedicated to orthosteric modulation of protein-protein interactions. Nucleic Acids Research, 41(Database issue), D824–D827.
Berman, H., Henrick, K., Nakamura, H., & Markley, J. L. (2007). The worldwide Protein Data Bank (wwPDB): ensuring a single, uniform archive of PDB data. Nucleic Acids Research, 35(Database issue), D301–D303.
Bhattacharya, D., Nowotny, J., Cao, R., & Cheng, J. (2016). 3Drefine: an interactive web server for efficient protein structure refinement. Nucleic Acids Research, 44(W1), W406–W409.
Biasini, M., Bienert, S., Waterhouse, A., Arnold, K., Studer, G., Schmidt, T., et al. (2014). SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Research, 42(Web Server issue), W252–W258.
Binkowski, T. A., Naghibzadeh, S., & Liang, J. (2003). CASTp: Computed Atlas of Surface Topography of proteins. Nucleic Acids Research, 31(13), 3352–3355.
Borisa, A. C., & Bhatt, H. G. (2017). A comprehensive review on Aurora kinase: small molecule inhibitors and clinical trial studies. European Journal of Medicinal Chemistry, 140, 1–19.
Brown, A. S., & Patel, C. J. (2017). A standard database for drug repositioning. Scientific Data, 4, 170029.
Brylinski, M., & Feinstein, W. P. (2013). eFindSite: improved prediction of ligand binding sites in protein models using meta-threading, machine learning and auxiliary ligands. Journal of Computer-Aided Molecular Design, 27(6), 551–567.
Brylinski, M., & Skolnick, J. (2008). A threading-based method (FINDSITE) for ligand-binding site prediction and functional annotation. Proceedings of the National Academy of Sciences of the United States of America, 105(1), 129–134.
Chaudhury, S., Berrondo, M., Weitzner, B. D., Muthu, P., Bergman, H., & Gray, J. J. (2011). Benchmarking and analysis of protein docking performance in Rosetta v3.2. PLoS One, 6(8), e22477.
Chen, P., Li, J., Wong, L., Kuwahara, H., Huang, J. Z., & Gao, X. (2013). Accurate prediction of hot spot residues through physicochemical characteristics of amino acid sequences. Proteins, 81(8), 1351–1362.
Chen, V. B., Arendall, W. B., 3rd, Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., et al. (2010). MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallographica Section D, Biological Crystallography, 66(Pt 1), 12–21.
Cheung, C. H., Sarvagalla, S., Lee, J. Y., Huang, Y. C., & Coumar, M. S. (2014). Aurora kinase inhibitor patents and agents in clinical testing: an update (2011 - 2013). Expert Opinion on Therapeutic Patents, 24(9), 1021–1038.
Choi, S., & Choi, K. Y. (2017). Screening-based approaches to identify small molecules that inhibit protein-protein interactions. Expert Opinion on Drug Discovery, 12(3), 293–303.
Choi, Y., & Deane, C. M. (2010). FREAD revisited: accurate loop structure prediction using a database search algorithm. Proteins, 78(6), 1431–1440.
Chys, P., & Chacon, P. (2013). Random coordinate descent with Spinor-matrices and geometric filters for efficient loop closure. Journal of Chemical Theory and Computation, 9(3), 1821–1829.
Ciccotti, G., Ferrario, M., & Schuette, C. (2014). Molecular dynamics simulation. Entropy, 16, 233.
Cierpicki, T., & Grembecka, J. (2015). Targeting protein-protein interactions in hematologic malignancies: still a challenge or a great opportunity for future therapies? Immunological Reviews, 263(1), 279–301.
Comeau, S. R., Gatchell, D. W., Vajda, S., & Camacho, C. J. (2004). ClusPro: a fully automated algorithm for protein-protein docking. Nucleic Acids Research, 32(Web Server issue), W96–W99.
Corsello, S. M., Bittker, J. A., Liu, Z., Gould, J., McCarren, P., Hirschman, J. E., et al. (2017). The Drug Repurposing Hub: a next-generation drug library and information resource. Nature Medicine, 23(4), 405–408.
Cukuroglu, E., Gursoy, A., & Keskin, O. (2012). HotRegion: a database of predicted hot spot clusters. Nucleic Acids Research, 40(Database issue), D829–D833.
Damodaran, A. P., Vaufrey, L., Gavard, O., & Prigent, C. (2017). Aurora A kinase is a priority pharmaceutical target for the treatment of cancers. Trends in Pharmacological Sciences, 38(8), 687–700.
Di Tommaso, P., Moretti, S., Xenarios, I., Orobitg, M., Montanyola, A., Chang, J. M., et al. (2011). T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension. Nucleic Acids Research, 39(Web Server issue), W13–W17.
Doan, T. L., Pollastri, M., Walters, M. A., & Georg, G. I. (2011). The future of drug repositioning: old drugs, new opportunities. Annual Reports in Medicinal Chemistry, 46, 385–401.
Dudley, J. T., Deshpande, T., & Butte, A. J. (2011). Exploiting drug-disease relationships for computational drug repositioning. Briefings in Bioinformatics, 12(4), 303–311.
Eisenberg, D., Luthy, R., & Bowie, J. U. (1997). VERIFY3D: assessment of protein models with three-dimensional profiles. Methods in Enzymology, 277, 396–404.
Falchi, F., Caporuscio, F., & Recanatini, M. (2014). Structure-based design of small-molecule protein-protein interaction modulators: the story so far. Future Medicinal Chemistry, 6(3), 343–357.
Fiser, A., & Sali, A. (2003). ModLoop: automated modeling of loops in protein structures. Bioinformatics, 19(18), 2500–2501.
Foroutan, M., Fatemi, S. M., & Esmaeilian, F. (2017). A review of the structure and dynamics of nanoconfined water and ionic liquids via molecular dynamics simulation. The European Physical Journal. E, Soft Matter, 40(2), 19.
Gaulton, A., Bellis, L. J., Bento, A. P., Chambers, J., Davies, M., Hersey, A., et al. (2012). ChEMBL: a large-scale bioactivity database for drug discovery. Nucleic Acids Research, 40(Database issue), D1100–D1107.
Gopalakrishnan, K., Sowmiya, G., Sheik, S. S., & Sekar, K. (2007). Ramachandran plot on the web (2.0). Protein and Peptide Letters, 14(7), 669–671.
Gromiha, M. M., Yugandhar, K., & Jemimah, S. (2017). Protein-protein interactions: scoring schemes and binding affinity. Current Opinion in Structural Biology, 44, 31–38.
Grosdidier, A., Zoete, V., & Michielin, O. (2011). SwissDock, a protein-small molecule docking web service based on EADock DSS. Nucleic Acids Research, 39(Web Server issue), W270–W277.
Guex, N., & Peitsch, M. C. (1997). SWISS-MODEL and the Swiss-PdbViewer: an environment for comparative protein modeling. Electrophoresis, 18(15), 2714–2723.
Guo, Z., Li, B., Cheng, L. T., Zhou, S., McCammon, J. A., & Che, J. (2015). Identification of protein-ligand binding sites by the level-set variational implicit-solvent approach. Journal of Chemical Theory and Computation, 11(2), 753–765.
Gutmanas, A., Alhroub, Y., Battle, G. M., Berrisford, J. M., Bochet, E., Conroy, M. J., et al. (2014). PDBe: Protein Data Bank in Europe. Nucleic Acids Research, 42(Database issue), D285–D291.
Hartenfeller, M., & Schneider, G. (2011). De novo drug design. Methods in Molecular Biology, 672, 299–323.
Hess, B., Kutzner, C., van der Spoel, D., & Lindahl, E. (2008). GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation. Journal of Chemical Theory and Computation, 4(3), 435–447.
Higueruelo, A. P., Schreyer, A., Bickerton, G. R., Pitt, W. R., Groom, C. R., & Blundell, T. L. (2009). Atomic interactions and profile of small molecules disrupting protein-protein interfaces: the TIMBAL database. Chemical Biology & Drug Design, 74(5), 457–467.
Hood, L. E., Omenn, G. S., Moritz, R. L., Aebersold, R., Yamamoto, K. R., Amos, M., et al. (2012). New and improved proteomics technologies for understanding complex biological systems: addressing a grand challenge in the life sciences. Proteomics, 12(18), 2773–2783.
Hu, B., & Lill, M. A. (2014). PharmDock: a pharmacophore-based docking program. Journal of Cheminformatics, 6, 14.
Hu, J., Li, Y., Zhang, Y., & Yu, D. J. (2018). ATPbind: accurate protein-ATP binding site prediction by combining sequence-profiling and structure-based comparisons. Journal of Chemical Information and Modeling, 58(2), 501–510.
Huang, H., Nguyen, T., Ibrahim, S., Shantharam, S., Yue, Z., & Chen, J. Y. (2015). DMAP: a connectivity map database to enable identification of novel drug repositioning candidates. BMC Bioinformatics, 16(Suppl. 13), S4.
Iorio, F., Rittman, T., Ge, H., Menden, M., & Saez-Rodriguez, J. (2013). Transcriptional data: a new gateway to drug repositioning? Drug Discovery Today, 18(7-8), 350–357.
Irwin, J. J., & Shoichet, B. K. (2005). ZINC - a free database of commercially available compounds for virtual screening. Journal of Chemical Information and Modeling, 45(1), 177–182.
Issa, N. T., Kruger, J., Byers, S. W., & Dakshanamurthy, S. (2013). Drug repurposing a reality: from computers to the clinic. Expert Review of Clinical Pharmacology, 6(2), 95–97.
Janson, G., Zhang, C., Prado, M. G., & Paiardini, A. (2017). PyMod 2.0: improvements in protein sequence-structure analysis and homology modeling within PyMOL. Bioinformatics, 33(3), 444–446.
Jayaram, B., Bhushan, K., Shenoy, S. R., Narang, P., Bose, S., Agrawal, P., et al. (2006). Bhageerath: an energy based web enabled computer software suite for limiting the search space of tertiary structures of small globular proteins. Nucleic Acids Research, 34(21), 6195–6204.
Jia, Z., Liu, Y., Guan, N., Bo, X., Luo, Z., & Barnes, M. R. (2016). Cogena, a novel tool for co-expressed gene-set enrichment analysis, applied to drug repositioning and drug mode of action discovery. BMC Genomics, 17, 414.
Jimenez, J., Doerr, S., Martinez-Rosell, G., Rose, A. S., & De Fabritiis, G. (2017). DeepSite: protein-binding site predictor using 3D-convolutional neural networks. Bioinformatics, 33(19), 3036–3042.
Jin, G., & Wong, S. T. (2014). Toward better drug repositioning: prioritizing and integrating existing methods into efficient pipelines. Drug Discovery Today, 19(5), 637–644.
Johnson, D. K., & Karanicolas, J. (2017). Computational screening and design for compounds that disrupt protein-protein interactions. Current Topics in Medicinal Chemistry, 17(23), 2703–2714.
Jonsdottir, S. O., Jorgensen, F. S., & Brunak, S. (2005). Prediction methods and databases within chemoinformatics: emphasis on drugs and drug candidates. Bioinformatics, 21(10), 2145–2160.
Kalidas, Y., & Chandra, N. (2008). PocketDepth: a new depth based algorithm for identification of ligand binding sites in proteins. Journal of Structural Biology, 161(1), 31–42.
Kallberg, M., Wang, H., Wang, S., Peng, J., Wang, Z., Lu, H., et al. (2012). Template-based protein structure modeling using the RaptorX web server. Nature Protocols, 7(8), 1511–1522.
Kanehisa, M., Goto, S., Furumichi, M., Tanabe, M., & Hirakawa, M. (2010). KEGG for representation and analysis of molecular networks involving diseases and drugs. Nucleic Acids Research, 38(Database issue), D355–D360.
Kaufmann, K. W., Lemmon, G. H., Deluca, S. L., Sheehan, J. H., & Meiler, J. (2010). Practically useful: what the Rosetta protein modeling suite can do for you. Biochemistry, 49(14), 2987–2998.
Kim, D. E., Chivian, D., & Baker, D. (2004). Protein structure prediction and analysis using the Robetta server. Nucleic Acids Research, 32(Web Server issue), W526–W531.
Kindt, T., Morse, S., Gotschlich, E., & Lyons, K. (1991). Structure-based strategies for drug design and discovery. Nature, 352, 581.
Kinjo, A. R., Suzuki, H., Yamashita, R., Ikegawa, Y., Kudou, T., Igarashi, R., et al. (2012). Protein Data Bank Japan (PDBj): maintaining a structural data archive and resource description framework format. Nucleic Acids Research, 40(Database issue), D453–D460.
Kitchen, D. B., Decornez, H., Furr, J. R., & Bajorath, J. (2004). Docking and scoring in virtual screening for drug discovery: methods and applications. Nature Reviews. Drug Discovery, 3(11), 935–949.
Ko, J., Lee, D., Park, H., Coutsias, E. A., Lee, J., & Seok, C. (2011). The FALC-Loop web server for protein loop modeling. Nucleic Acids Research, 39(Web Server issue), W210–W214.
Koes, D. R., & Camacho, C. J. (2012a). PocketQuery: protein-protein interaction inhibitor starting points from protein-protein interaction structure. Nucleic Acids Research, 40(Web Server issue), W387–W392.
Koes, D. R., & Camacho, C. J. (2012b). ZINCPharmer: pharmacophore search of the ZINC database. Nucleic Acids Research, 40(Web Server issue), W409–W414.
Koes, D. R., Domling, A., & Camacho, C. J. (2018). AnchorQuery: rapid online virtual screening for small-molecule protein-protein interaction inhibitors. Protein Science, 27(1), 229–232.
Krieger, E., Joo, K., Lee, J., Raman, S., Thompson, J., Tyka, M., et al. (2009). Improving physical realism, stereochemistry, and side-chain accuracy in homology modeling: four approaches that performed well in CASP8. Proteins, 77(Suppl 9), 114–122.
Kruger, D. M., & Gohlke, H. (2010). DrugScorePPI webserver: fast and accurate in silico alanine scanning for scoring protein-protein interactions. Nucleic Acids Research, 38(Web Server issue), W480–W486.
Kufareva, I., Ilatovskiy, A. V., & Abagyan, R. (2012). Pocketome: an encyclopedia of small-molecule binding sites in 4D. Nucleic Acids Research, 40(Database issue), D535–D540.
Kuttner, Y. Y., & Engel, S. (2012). Protein hot spots: the islands of stability. Journal of Molecular Biology, 415(2), 419–428.
Labbe, C. M., Kuenemann, M. A., Zarzycka, B., Vriend, G., Nicolaes, G. A., Lagorce, D., et al. (2016). iPPI-DB: an online database of modulators of protein-protein interactions. Nucleic Acids Research, 44(D1), D542–D547.
Labbe, C. M., Rey, J., Lagorce, D., Vavrusa, M., Becot, J., Sperandio, O., et al. (2015). MTiOpenScreen: a web server for structure-based virtual screening. Nucleic Acids Research, 43(W1), W448–W454.
Lamb, J., Crawford, E. D., Peck, D., Modell, J. W., Blat, I. C., Wrobel, M. J., et al. (2006). The Connectivity Map: using gene-expression signatures to connect small molecules, genes, and disease. Science, 313(5795), 1929–1935.
Laskowski, R. A., Rullmann, J. A., MacArthur, M. W., Kaptein, R., & Thornton, J. M. (1996). AQUA and PROCHECK-NMR: programs for checking the quality of protein structures solved by NMR. Journal of Biomolecular NMR, 8(4), 477–486.
Lavecchia, A., & Di Giovanni, C. (2013). Virtual screening strategies in drug discovery: a critical review. Current Medicinal Chemistry, 20(23), 2839–2860.
Le Guilloux, V., Schmidtke, P., & Tuffery, P. (2009). Fpocket: an open source platform for ligand pocket detection. BMC Bioinformatics, 10, 168.
Lee, B. K., Tiong, K. H., Chang, J. K., Liew, C. S., Abdul Rahman, Z. A., Tan, A. C., et al. (2017). DeSigN: connecting gene expression with therapeutics for drug repurposing and development. BMC Genomics, 18(Suppl 1), 934.
Lengauer, T., & Rarey, M. (1996). Computational methods for biomolecular docking. Current Opinion in Structural Biology, 6(3), 402–406.
Li, H., Leung, K. S., & Wong, M. H. (2012). Idock: a multithreaded virtual screening tool for flexible ligand docking. In Paper Presented at the IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), San Diego, United States. 9–12 May.
Li, J., Zheng, S., Chen, B., Butte, A. J., Swamidass, S. J., & Lu, Z. (2016). A survey of current trends in computational drug repositioning. Briefings in Bioinformatics, 17(1), 2–12.
Li, Y. H., Yu, C. Y., Li, X. X., Zhang, P., Tang, J., Yang, Q., et al. (2018). Therapeutic target database update 2018: enriched resource for facilitating bench-to-clinic research of targeted therapeutics. Nucleic Acids Research, 46(D1), D1121–D1127.
Li, Y. Y., & Jones, S. J. (2012). Drug repositioning for personalized medicine. Genome Medicine, 4(3), 27.
Lill, M. A., & Danielson, M. L. (2011). Computer-aided drug design platform using PyMOL. Journal of Computer-Aided Molecular Design, 25(1), 13–19.
Lindsay, M. A. (2003). Target discovery. Nature Reviews. Drug Discovery, 2(10), 831–838.
Liu, L., Tsompana, M., Wang, Y., Wu, D., Zhu, L., & Zhu, R. (2016). Connection Map for Compounds (CMC): a server for combinatorial drug toxicity and efficacy analysis. Journal of Chemical Information and Modeling, 56(9), 1615–1621.
Liu, T., Lin, Y., Wen, X., Jorissen, R. N., & Gilson, M. K. (2007). BindingDB: a web-accessible database of experimentally determined protein-ligand binding affinities. Nucleic Acids Research, 35(Database issue), D198–D201.
Liwo, A., Lee, J., Ripoll, D. R., Pillardy, J., & Scheraga, H. A. (1999). Protein structure prediction by global optimization of a potential energy function. Proceedings of the National Academy of Sciences of the United States of America, 96(10), 5482–5485.
Lotfi Shahreza, M., Ghadiri, N., Mousavi, S. R., Varshosaz, J., & Green, J. R. (2017). A review of network-based approaches to drug repositioning. Briefings in Bioinformatics, bbx017.
Ma, H., & Zhao, H. (2013). Drug target inference through pathway analysis of genomics data. Advanced Drug Delivery Reviews, 65(7), 966–972.
Malhotra, S., Mathew, O. K., & Sowdhamini, R. (2015). DOCKSCORE: a webserver for ranking protein-protein docked poses. BMC Bioinformatics, 16, 127.
March-Vila, E., Pinzi, L., Sturm, N., Tinivella, A., Engkvist, O., Chen, H., et al. (2017). On the integration of in silico drug design methods for drug repurposing. Frontiers in Pharmacology, 8, 298.
Markley, J. L., Ulrich, E. L., Berman, H. M., Henrick, K., Nakamura, H., & Akutsu, H. (2008). BioMagResBank (BMRB) as a partner in the Worldwide Protein Data Bank (wwPDB): new policies affecting biomolecular NMR depositions. Journal of Biomolecular NMR, 40(3), 153–155.
Marks, C., Nowak, J., Klostermann, S., Georges, G., Dunbar, J., Shi, J., et al. (2017). Sphinx: merging knowledge-based and ab initio approaches to improve protein loop prediction. Bioinformatics, 33(9), 1346–1353.
Mashiach, E., Schneidman-Duhovny, D., Andrusier, N., Nussinov, R., & Wolfson, H. J. (2008). FireDock: a web server for fast interaction refinement in molecular docking. Nucleic Acids Research, 36(Web Server issue), W229–W232.
Meireles, L. M., Domling, A. S., & Camacho, C. J. (2010). ANCHOR: a web server and database for analysis of protein-protein interaction binding pockets for drug discovery. Nucleic Acids Research, 38(Web Server issue), W407–W411.
Messih, M. A., Lepore, R., & Tramontano, A. (2015). LoopIng: a template-based tool for predicting the structure of protein loops. Bioinformatics, 31(23), 3767–3772.
Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F., Belew, R. K., Goodsell, D. S., et al. (2009). AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility. Journal of Computational Chemistry, 30(16), 2785–2791.
Novac, N. (2013). Challenges and opportunities of drug repositioning. Trends in Pharmacological Sciences, 34(5), 267–272.
O’Reilly, P. G., Wen, Q., Bankhead, P., Dunne, P. D., McArt, D. G., McPherson, S., et al. (2016). QUADrATiC: scalable gene expression connectivity mapping for repurposing FDA-approved therapeutics. BMC Bioinformatics, 17(1), 198.
Park, H., Kim, D. E., Ovchinnikov, S., Baker, D., & DiMaio, F. (2018). Automatic structure prediction of oligomeric assemblies using Robetta in CASP12. Proteins, 86(Suppl 1), 283–291.
Peng, X., Wang, J., Peng, W., Wu, F. X., & Pan, Y. (2017). Protein-protein interactions: detection, reliability assessment and applications. Briefings in Bioinformatics, 18(5), 798–819.
Perot, S., Sperandio, O., Miteva, M. A., Camproux, A. C., & Villoutreix, B. O. (2010). Druggable pockets and binding site centric chemical space: a paradigm shift in drug discovery. Drug Discovery Today, 15(15-16), 656–667.
Pettersen, E. F., Goddard, T. D., Huang, C. C., Couch, G. S., Greenblatt, D. M., Meng, E. C., et al. (2004). UCSF Chimera—a visualization system for exploratory research and analysis. Journal of Computational Chemistry, 25(13), 1605–1612.
Pieper, U., Webb, B. M., Dong, G. Q., Schneidman-Duhovny, D., Fan, H., Kim, S. J., et al. (2014). ModBase, a database of annotated comparative protein structure models and associated resources. Nucleic Acids Research, 42(Database issue), D336–D346.
Pierce, B. G., Wiehe, K., Hwang, H., Kim, B. H., Vreven, T., & Weng, Z. (2014). ZDOCK server: interactive docking prediction of protein-protein complexes and symmetric multimers. Bioinformatics, 30(12), 1771–1773.
Pirhadi, S., Shiri, F., & Ghasemi, J. B. (2013). Methods and applications of structure based pharmacophores in drug discovery. Current Topics in Medicinal Chemistry, 13(9), 1036–1047.
Pirhadi, S., Sunseri, J., & Koes, D. R. (2016). Open source molecular modeling. Journal of Molecular Graphics & Modelling, 69, 127–143.
Prachayasittikul, V., Worachartcheewan, A., Shoombuatong, W., Songtawee, N., Simeon, S., & Nantasenamat, C. (2015). Computer-aided drug design of bioactive natural products. Current Topics in Medicinal Chemistry, 15(18), 1780–1800.
Pruitt, K. D., Tatusova, T., & Maglott, D. R. (2007). NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Research, 35(Database issue), D61–D65.
Rester, U. (2008). From virtuality to reality—virtual screening in lead discovery and lead optimization: a medicinal chemistry perspective. Current Opinion in Drug Discovery & Development, 11(4), 559–568.
Rose, P. W., Beran, B., Bi, C., Bluhm, W. F., Dimitropoulos, D., Goodsell, D. S., et al. (2011). The RCSB Protein Data Bank: redesigned web site and web services. Nucleic Acids Research, 39(Database issue), D392–D401.
Rosell, M., & Fernandez-Recio, J. (2018). Hot-spot analysis for drug discovery targeting protein-protein interactions. Expert Opinion on Drug Discovery, 13(4), 327–338.
Rost, B., Schneider, R., & Sander, C. (1997). Protein fold recognition by prediction-based threading. Journal of Molecular Biology, 270(3), 471–480.
Sardana, D., Zhu, C., Zhang, M., Gudivada, R. C., Yang, L., & Jegga, A. G. (2011). Drug repositioning for orphan diseases. Briefings in Bioinformatics, 12(4), 346–356.
Sarvagalla, S., & Coumar, M. S. (2016). Protein-protein interactions (PPIs) as an alternative to targeting the ATP binding site of kinase: in silico approach to identify PPI inhibitors. In S. Dastmalchi, M. Hamzeh-Mivehroud, & B. Sokouti (Eds.), Applied Case Studies and Solutions in Molecular Docking-Based Drug Design (pp. 249–277). IGI Global.
Sawada, R., Iwata, H., Mizutani, S., & Yamanishi, Y. (2015). Target-based drug repositioning using large-scale chemical-protein interactome data. Journal of Chemical Information and Modeling, 55(12), 2717–2730.
Schenone, M., Dancik, V., Wagner, B. K., & Clemons, P. A. (2013). Target identification and mechanism of action in chemical biology and drug discovery. Nature Chemical Biology, 9(4), 232–240.
Schneider, G. (2018). Automating drug discovery. Nature Reviews. Drug Discovery, 17(2), 97–113.
Schneidman-Duhovny, D., Inbar, Y., Nussinov, R., & Wolfson, H. J. (2005). PatchDock and SymmDock: servers for rigid and symmetric docking. Nucleic Acids Research, 33(Web Server issue), W363–W367.
Setoain, J., Franch, M., Martinez, M., Tabas-Madrid, D., Sorzano, C. O., Bakker, A., et al. (2015). NFFinder: an online bioinformatics tool for searching similar transcriptomics experiments in the context of drug repositioning. Nucleic Acids Research, 43(W1), W193–W199.
Shameer, K., Glicksberg, B. S., Hodos, R., Johnson, K. W., Badgeley, M. A., Readhead, B., et al. (2017). Systematic analyses of drugs and disease indications in RepurposeDB reveal pharmacological, biological and epidemiological factors influencing drug repositioning. Briefings in Bioinformatics, bbw136.
Shim, J. S., & Liu, J. O. (2014). Recent advances in drug repositioning for the discovery of new anticancer drugs. International Journal of Biological Sciences, 10(7), 654–663.
Sievers, F., & Higgins, D. G. (2018). Clustal Omega for making accurate alignments of many protein sequences. Protein Science, 27(1), 135–145.
Simossis, V. A., & Heringa, J. (2005). PRALINE: a multiple sequence alignment toolbox that integrates homology-extended and secondary structure information. Nucleic Acids Research, 33(Web Server issue), W289–W294.
Siramshetty, V. B., Eckert, O. A., Gohlke, B. O., Goede, A., Chen, Q., Devarakonda, P., et al. (2018). SuperDRUG2: a one stop resource for approved/marketed drugs. Nucleic Acids Research, 46(D1), D1137–D1143.
Siramshetty, V. B., Nickel, J., Omieczynski, C., Gohlke, B. O., Drwal, M. N., & Preissner, R. (2016). WITHDRAWN—a resource for withdrawn and discontinued drugs. Nucleic Acids Research, 44(D1), D1080–D1086.
Sithara, S., Crowley, T. M., Walder, K., & Aston-Mourney, K. (2017). Gene expression signature: a powerful approach for drug discovery in diabetes. The Journal of Endocrinology, 232(2), R131–R139.
Sliwoski, G., Kothiwale, S., Meiler, J., & Lowe, E. W., Jr. (2014). Computational methods in drug discovery. Pharmacological Reviews, 66(1), 334–395.
Subramanian, A., Narayan, R., Corsello, S. M., Peck, D. D., Natoli, T. E., Lu, X., et al. (2017). A Next Generation Connectivity Map: L1000 platform and the first 1,000,000 profiles. Cell, 171(6), 1437–1452.e17.
Sunseri, J., & Koes, D. R. (2016). Pharmit: interactive exploration of chemical space. Nucleic Acids Research, 44(W1), W442–W448.
Swamidass, S. J. (2011). Mining small-molecule screens to repurpose drugs. Briefings in Bioinformatics, 12(4), 327–335.
Takenaka, T. (2001). Classical vs reverse pharmacology in drug discovery. BJU International, 88(Suppl 2), 7–10.
Talele, T. T., Khedkar, S. A., & Rigby, A. C. (2010). Successful applications of computer aided drug discovery: moving drugs from concept to the clinic. Current Topics in Medicinal Chemistry, 10(1), 127–141.
Tang, Y., Zhu, W., Chen, K., & Jiang, H. (2006). New technologies in computer-aided drug design: toward target identification and new chemical entity discovery. Drug Discovery Today: Technologies, 3(3), 307–313.
Thevenet, P., Shen, Y., Maupetit, J., Guyon, F., Derreumaux, P., & Tuffery, P. (2012). PEP-FOLD: an updated de novo structure prediction server for both linear and disulfide bonded cyclic peptides. Nucleic Acids Research, 40(Web Server issue), W288–W293.
Trott, O., & Olson, A. J. (2010). AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry, 31(2), 455–461.
Tsai, T. Y., Chang, K. W., & Chen, C. Y. (2011). iScreen: world’s first cloud-computing web server for virtual screening and de novo drug design based on TCM database@Taiwan. Journal of Computer-Aided Molecular Design, 25(6), 525–531.
UniProtConsortium. (2018). UniProt: the universal protein knowledgebase. Nucleic Acids Research, 46(5), 2699.
Ursu, O., Holmes, J., Knockel, J., Bologa, C. G., Yang, J. J., Mathias, S. L., et al. (2017). DrugCentral: online drug compendium. Nucleic Acids Research, 45(D1), D932–D939.
van Zundert, G. C. P., Rodrigues, J., Trellet, M., Schmitz, C., Kastritis, P. L., Karaca, E., et al. (2016). The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes. Journal of Molecular Biology, 428(4), 720–725.
Vane, J. R., & Botting, R. M. (2003). The mechanism of action of aspirin. Thrombosis Research, 110(5-6), 255–258.
Vasaikar, S., Bhatia, P., Bhatia, P. G., & Chu Yaiw, K. (2016). Complementary approaches to existing target based drug discovery for identifying novel drug targets. Biomedicine, 4(4), 27.
Verma, J., Khedkar, V. M., & Coutinho, E. C. (2010). 3D-QSAR in drug design—a review. Current Topics in Medicinal Chemistry, 10(1), 95–115.
Villoutreix, B. O., Kuenemann, M. A., Poyet, J. L., Bruzzoni-Giovanelli, H., Labbe, C., Lagorce, D., et al. (2014). Druglike protein-protein interaction modulators: challenges and opportunities for drug discovery and chemical biology. Molecular Informatics, 33(6-7), 414–437.
Villoutreix, B. O., Lagorce, D., Labbe, C. M., Sperandio, O., & Miteva, M. A. (2013). One hundred thousand mouse clicks down the road: selected online resources supporting drug discovery collected over a decade. Drug Discovery Today, 18(21-22), 1081–1089.
Vistoli, G., Pedretti, A., Mazzolari, A., & Testa, B. (2010). Homology modeling and metabolism prediction of human carboxylesterase-2 using docking analyses by GriDock: a parallelized tool based on AutoDock 4.0. Journal of Computer-Aided Molecular Design, 24(9), 771–787.
Vyas, V. K., Ukawala, R. D., Ghate, M., & Chintha, C. (2012). Homology modeling a fast tool for drug discovery: current perspectives. Indian Journal of Pharmaceutical Sciences, 74(1), 1–17.
Wang, Y., Chen, S., & Deng, N. (2013). Drug repositioning by kernel-based integration of molecular structure, molecular activity, and phenotype data. PLoS One, 8(11), e78518.
Wang, Z., Lachmann, A., Keenan, A. B., & Ma’ayan, A. (2018). L1000FWD: fireworks visualization of drug-induced transcriptomic signatures. Bioinformatics, bty060.
Wass, M. N., Kelley, L. A., & Sternberg, M. J. (2010). 3DLigandSite: predicting ligand-binding sites using similar structures. Nucleic Acids Research, 38(Web Server issue), W469–W473.
Webb, B., & Sali, A. (2016). Comparative protein structure modeling using MODELLER. Current Protocols in Bioinformatics, 54, 5.6.1–5.6.37.
Wiederstein, M., & Sippl, M. J. (2007). ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Research, 35(Web Server issue), W407–W410.
Wishart, D. S., Feunang, Y. D., Guo, A. C., Lo, E. J., Marcu, A., Grant, J. R., et al. (2018). DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Research, 46(D1), D1074–D1082.
Wu, H., Huang, J., Zhong, Y., & Huang, Q. (2017). DrugSig: a resource for computational drug repositioning utilizing gene expression signatures. PLoS One, 12(5), e0177743.
Wu, S., Skolnick, J., & Zhang, Y. (2007). Ab initio modeling of small proteins by iterative TASSER simulations. BMC Biology, 5, 17.
Wu, S., & Zhang, Y. (2008). MUSTER: improving protein sequence profile-profile alignments by using multiple sources of structure information. Proteins, 72(2), 547–556.
Xu, D., & Zhang, Y. (2013). Toward optimal fragment generations for ab initio protein structure assembly. Proteins, 81(2), 229–239.
Xue, L. C., Jordan, R. A., El-Manzalawy, Y., Dobbs, D., & Honavar, V. (2014). DockRank: ranking docked conformations using partner-specific sequence homology-based protein interface prediction. Proteins, 82(2), 250–267.
Yang, J., Roy, A., & Zhang, Y. (2013). Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment. Bioinformatics, 29(20), 2588–2595.
Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., & Zhang, Y. (2015). The I-TASSER Suite: protein structure and function prediction. Nature Methods, 12(1), 7–8.
Yang, J. M., & Chen, C. C. (2004). GEMDOCK: a generic evolutionary method for molecular docking. Proteins, 55(2), 288–304.
Yoo, M., Shin, J., Kim, J., Ryall, K. A., Lee, K., Lee, S., et al. (2015). DSigDB: drug signatures database for gene set analysis. Bioinformatics, 31(18), 3069–3071.
Zeng, H., Qiu, C., & Cui, Q. (2015). Drug-Path: a database for drug-induced pathways. Database: The Journal of Biological Databases and Curation, 2015, bav061.
Zhang, S. (2011). Computer-aided drug discovery and development. In S. D. Satyanarayanajois (Ed.), Vol. 716 Drug Design and Discovery (pp. 23–38). Springer Protocol.
Zheng, W., Thorne, N., & McKew, J. C. (2013). Phenotypic screens as a renewed approach for drug discovery. Drug Discovery Today, 18(21-22), 1067–1073.
Zhu, X., & Mitchell, J. C. (2011). KFC2: a knowledge-based hot spot prediction method based on interface solvation, atomic density, and plasticity features. Proteins, 79(9), 2671–2683.
Zou, J., Zheng, M. W., Li, G., & Su, Z. G. (2013). Advanced systems biology methods in drug discovery and translational biomedicine. BioMed Research International, 2013, 742835.