Structural modelling and dynamics of proteins for insights into drug interactions

Advanced Drug Delivery Reviews 64 (2012) 323–343

Contents lists available at SciVerse ScienceDirect

Advanced Drug Delivery Reviews journal homepage: www.elsevier.com/locate/addr

Structural modelling and dynamics of proteins for insights into drug interactions☆

Tim Werner a, Michael B. Morris b,c, Siavoush Dastmalchi d, W. Bret Church a,⁎

a Group in Biomolecular Structure and Informatics, Faculty of Pharmacy A15, University of Sydney, Sydney, NSW 2006, Australia
b Discipline of Physiology and Bosch Institute, School of Medical Sciences, University of Sydney, NSW 2006, Australia
c Centre for Developmental and Regenerative Medicine, Kolling Institute of Medical Research, Royal North Shore Hospital, St. Leonards, NSW 2065, Australia
d Biotechnology Research Centre and Department of Medicinal Chemistry, School of Pharmacy, Tabriz University of Medical Sciences, Tabriz, Iran

Article info

Article history: Received 13 July 2011; Accepted 24 November 2011; Available online 1 December 2011

Keywords: Molecular dynamics; Protein structure; Computational biology; Structural biology; Drug targets; Drug–target interactions

Abstract

Proteins are the workhorses of biomolecules and their function is affected by their structure and their structural rearrangements during ligand entry, ligand binding and protein–protein interactions. Hence, knowledge of protein structure and, importantly, of the dynamic behaviour of that structure is critical for understanding how the protein performs its function. Predictions of the structure and the dynamic behaviour can be performed by combinations of structure modelling and molecular dynamics simulations. The simulations also need to be sensitive to the constraints of the environment in which the protein resides. Standard computational methods now exist in this field to support the experimental effort of solving protein structures. This review presents a comprehensive overview of the basis of the calculations and the well-established computational methods used to generate and understand protein structure and function and to study their dynamic behaviour, with reference to lung-related targets.

© 2011 Elsevier B.V. All rights reserved.

Contents

1. Introduction
2. Protein structure prediction
   2.1. Classes of protein structure prediction methods
   2.2. Critical assessment of protein structure prediction (CASP)
   2.3. Quantitative measures of structural differences
   2.4. Template-based modelling
      2.4.1. Homology modelling
      2.4.2. Threading
   2.5. Free modelling
      2.5.1. ROSETTA
      2.5.2. I-TASSER
   2.6. Comparison of structure-prediction methods
3. Force field-based modelling methods
   3.1. Energy minimisation
   3.2. Molecular dynamics
      3.2.1. Force fields in molecular dynamics simulations
      3.2.2. Explicit treatment of solvent
      3.2.3. Implicit treatment of solvent
      3.2.4. Specialised techniques in molecular dynamics simulations
4. Structure-based virtual (in silico) screening
5. Selected examples of computational modelling in the lung
   5.1. Molecular dynamics simulation of the air/water interface in the lung
      5.1.1. The hypophase and lung surfactant
      5.1.2. Modelling surfactant collapse and lipid–protein interactions
   5.2. Other lung proteins
      5.2.1. GPCRs
      5.2.2. Epidermal growth factor receptor
      5.2.3. Cystic fibrosis transmembrane conductance regulator
      5.2.4. Antigen α4β1 (VLA-4, very late antigen-4)
6. Conclusion
Acknowledgements
References

☆ This review is part of the Advanced Drug Delivery Reviews theme issue on “Computational and visualization approaches in respiratory drug delivery”.
⁎ Corresponding author. E-mail address: [email protected] (W.B. Church).
0169-409X/$ – see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.addr.2011.11.011

1. Introduction

Currently, the Worldwide Protein Data Bank (wwPDB) contains ~70,000 experimentally solved structures for proteins. This set includes multiple structures of the same or very closely related proteins. Because of this redundancy, the number of ‘unique’ solved protein structures available is far lower than 70,000. In contrast, the National Center for Biotechnology Information non-redundant database contains more than 3,000,000 sequences from a wide variety of species. This represents a total number of residues that is ~120 times larger than the corresponding database of unique chains from the PDB [1]. With the advent of ever-more-efficient high-throughput gene sequencing technologies and their application to increasing numbers of genomes, sequence depositions into databases are growing exponentially while the rate of growth of structure determinations has slowed [1]. Thus, the gap between sequences available and structures solved experimentally is growing rapidly.

Furthermore, high-resolution structures are essentially ‘snapshots’ of a single conformation of a protein, meaning that complex, often subtle, structural movements that might be essential to function, stability, and partner binding can be difficult or impossible to determine from these structures alone.

The situation is particularly acute for membrane proteins. Fewer than 1% of experimentally solved structures are those of membrane proteins even though they typically represent 30% of a genome and are the targets of more than half of therapeutic drugs [2]. This deficiency in high-resolution structural information for membrane proteins is mainly due to the more demanding requirements for overexpression, purification and crystallisation compared to water-soluble proteins [3].

To fill the ever-widening gap, computationally driven 3D structure prediction methodologies can be used to model structures of proteins where no experimentally derived structure exists.
Even where an experimentally determined structure does exist, in silico methods can be used to model the effects of mutations, predict the location of binding surfaces for other macromolecules and small-molecule effectors, estimate binding energies, and predict local and non-local movements required for events such as binding, signalling and catalysis to occur.

In this review, we outline the basics of computational modelling methodologies and feature approaches to select lead compounds by employing in silico (virtual) drug screening and structure-based drug design (SBDD) (Fig. 1). At the end of the review, we present examples of these approaches as applied to the field of molecular respiratory medicine. These highlight the usefulness of computational methods in informing on normal and abnormal lung function and in directing the development of therapeutic interventions.

2. Protein structure prediction

2.1. Classes of protein structure prediction methods

Protein structure prediction methods have traditionally been grouped into homology modelling, threading, de novo and ab initio methods. Many modelling strategies now use combinations of these methods or incorporate features of more than one strategy to model protein structure. As a consequence, the division between these conventional categories has become blurred and computational methods are now often grouped into template-based methods, which include homology modelling and threading, and ‘free-modelling’ methods, which include de novo and ab initio methods (see Section 2.2).

Homology modelling, or comparative modelling as it is sometimes known, is based on the observation that evolutionarily related proteins (i.e., proteins that are related to one another in terms of amino-acid sequence) tend to have similar structures. As a result, the unsolved structure of a protein (the target) can be modelled using the solved structure of a related protein (the template) [4].

In contrast, threading methods can be used to generate structures even if the target and template sequences are not evolutionarily related. The rationale is that proteins can adopt the same fold even if there is no obvious sequence relationship, because structure is more conserved than sequence [5]. Threading methods assign the target sequence to templates with known folds, where each type of fold represents structures sharing closely similar architecture regardless of sequence. For every trial template, the optimal sequence-to-structure alignment is evaluated by a set of scoring functions based on physico-chemical parameters. By their nature, threading methods are limited to a search of known folds and are unable to correctly predict the structure of the target if, in reality, it adopts a novel fold. That said, there appears to be a limited number of folds: estimates of the number of protein folds for water-soluble proteins vary from 400 to 10,000 [6–11], with 2700 folds being a relatively recent estimate [11] derived by analysing the Structural Classification of Proteins (SCOP) [12,13] database. The current SCOP release 1.75 (June 2009) contains 1195 different folds belonging to 1962 superfamilies and 3902 families.
Because of the limitations indicated above, 3D structures for many sequences cannot be predicted reliably by homology modelling or threading. Instead, template-free, or free-modelling, methods have been developed. To a greater or lesser extent these rely on generating structures based on the physico-chemical/thermodynamic properties of the string of amino acids without direct reference to solved structures. Free-modelling methods build 3D models using scoring functions (energy functions). The functions contain terms which need to balance faithful representation of the system properties on the one hand with computational tractability on the other. Generating the models is an iterative process using strategies to search the energy landscape and find the in silico conformation of the protein lying at the global minimum of potential energy as determined by the scoring functions.

Free-modelling methods can be divided into two groups. De novo methods use knowledge-based force fields as scoring functions, which can rely heavily on our experimentally derived understanding of proteins undergoing folding together with information on experimentally determined structures deposited in databases. Therefore, they are not template free in the strictest sense. For example, many de novo methods are related to threading methods because they use the structures of small fragments of proteins whose complete structures have been solved; for this reason they are sometimes referred to as ‘mini-threading’. Ab initio methods use force fields based on first principles, without reference to solved structures. The force fields are defined as a series of relatively simple terms which are used to calculate the energy of the system. Ab initio methods often use all-atom models, so that the search space is very large and the energy function becomes very complex. For protein work, they are largely limited to modelling the structures of peptide-sized sequences within proteins. De novo and ab initio methods share some limitations, such as the high computational cost of searching the energy space and the difficulty in correctly selecting the ‘native’ structure from a large number of alternative conformations.

Fig. 1. Simplified flowchart for the structure-based drug design procedure. Structure-based drug design is an iterative process; the iteration is not shown here for the sake of clarity. The drug design includes additional steps, such as toxicity, bioavailability and synthesisability checks. The experimental steps in the process are much more time-consuming and costly than the computational steps. The clinical trials are the single most expensive component of the entire drug design process and therefore demand the best preparation for success.

2.2. Critical assessment of protein structure prediction (CASP)

CASP is a biennial competition in which protein structure prediction techniques are evaluated. In the lead up to each CASP meeting, sequences are made available for proteins whose structures have been solved but not published, providing a blind test of 3D structure-prediction methods. The CASP competition has been useful for evaluating improvements in prediction methods and identifying bottlenecks in the process. The most recent competitions have been divided into template-based modelling and free-modelling sections. As noted above, this distinction is somewhat arbitrary, in part because some methods draw on aspects of both approaches and also because software may not explicitly declare to the competitors whether a template was used for the specific target. After the competition closes, the submitted models are quantitatively compared to determine which computational methods produce structures most similar to the experimentally determined structures.

2.3. Quantitative measures of structural differences

Traditionally, differences between two structures, such as between a model structure and the experimentally solved structure, are measured by the root mean square deviation (RMSD), i.e., the square root of the mean squared distance between corresponding atoms of the two structures when they are superimposed. In this review, all RMSDs reported for proteins are Cα or backbone RMSDs, except in the case of ligands where all-atom RMSDs are used. RMSD is still considered a reasonable comparative measure of differences between high-resolution structures. However, RMSD as a measure of the quality of low-resolution models is not ideal because badly modelled local structural elements have a large impact on the RMSD even if the overall structure is well modelled [14]. For this reason the Global Distance Test Total Score (GDT-TS) [15] is often adopted for comparison of low-resolution models. For each of the Cα distance cut-offs of 1, 2, 4 and 8 Å, GDT-TS finds the maximum percentage of residues in the model that can be superimposed on the corresponding residues of the experimentally solved structure within that cut-off, and then averages the four percentages. Because of the wide range of distance cut-offs, GDT-TS also rewards models with the correct overall fold even if parts of the model are poor.

The performance of computational modelling methods can be compared by using multiple targets, as in the CASP competitions. A metric is required which combines the scores of all predicted models over all targets for each prediction method. The Z-score was developed to take into account the degree of difficulty of modelling the targets, so that methods that succeed on difficult targets are rewarded more than methods that succeed only on relatively simple predictions. In CASP, all three scores (RMSD, GDT-TS and Z-scores) are used to rank the prediction methods, and the rank of a particular method can change depending on the type of scoring used.
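
The contrast between RMSD and a GDT-TS-style score can be illustrated with a small sketch. It is a simplification under stated assumptions: the two coordinate sets are taken to be already superimposed and paired residue by residue, whereas the real GDT procedure searches over many superpositions. The function names and the toy coordinates are hypothetical, and the cut-offs are a parameter that defaults to the conventional GDT-TS values of 1, 2, 4 and 8 Å.

```python
import math


def rmsd(model, reference):
    """Root mean square deviation (in Å) between paired atoms, e.g. Cα atoms.

    Assumption: both coordinate lists are already superimposed and
    model[i] corresponds to reference[i].
    """
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = [math.dist(a, b) ** 2 for a, b in zip(model, reference)]
    return math.sqrt(sum(squared) / len(squared))


def gdt_ts_fixed_superposition(model, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT-TS-style score (0-100) for ONE fixed superposition.

    The real GDT algorithm maximises the number of residues within each
    cut-off over many superpositions; that search is omitted here.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("need two non-empty coordinate lists of equal length")
    distances = [math.dist(a, b) for a, b in zip(model, reference)]
    percentages = [
        100.0 * sum(d <= cutoff for d in distances) / len(distances)
        for cutoff in cutoffs
    ]
    return sum(percentages) / len(percentages)


# Toy data (hypothetical): ten Cα atoms on a line; the model is perfect
# except for the last residue, which is displaced by 20 Å.
reference = [(3.8 * i, 0.0, 0.0) for i in range(10)]
model = list(reference)
model[-1] = (reference[-1][0], 20.0, 0.0)

print(f"RMSD   = {rmsd(model, reference):.2f} Å")   # sqrt(400 / 10) ≈ 6.32 Å
print(f"GDT-TS = {gdt_ts_fixed_superposition(model, reference):.1f}")  # 90.0
```

In this toy case a single badly placed residue raises the RMSD above 6 Å, while the GDT-TS-style score still reports that 90% of the residues are placed within every cut-off.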

2.4. Template-based modelling

2.4.1. Homology modelling

Homology modelling is the most commonly used approach for modelling the 3D structures of proteins for which structures are not solved experimentally. It is estimated that for every unique protein in the PDB, models of an average of 20 other homologous proteins can be built [16].

Homology modelling involves three sequential steps: model building, refinement and evaluation.

The model-building step involves identifying the best template by aligning the sequence of the target with template sequences of proteins with known structures. A single template or multiple templates are chosen, with the major (but not only) consideration being the extent of sequence identity/similarity between the target and the template. The chosen template acts as a “pattern” for the 3D coordinates of the target protein based on the conserved positions. Generally speaking, sequence identity needs to be 25% or greater for homology modelling to be employed successfully, but many exceptions arise. For example, models of the G protein-coupled receptor (GPCR) rhodopsin have been generated using the solved structure of bacteriorhodopsin as the template even though rhodopsin does not share any significant sequence similarity with bacteriorhodopsin [17]. In this case, structural information rather than sequence similarity motivated the use of bacteriorhodopsin as a template, even though it was a low-resolution template with an overall RMSD of around 7.4 Å to rhodopsin. This template is now outdated for homology modelling of GPCRs because the crystal structures of rhodopsin and other GPCRs have been solved. Related approaches to sequence conservation can also be used: for example, rather than using a strict sequence alignment, the hydrophobic periodicity of a helical transmembrane region from the solved structure of a membrane protein might be used as the basis of a “pattern” when aligning the target sequence [4,18]. Stretches of amino acids in the target which do not fit the pattern of the template are often loop regions and are usually modelled using a database of fragment structures [19] (see Section 2.4.1.4) or by ab initio approaches (see Section 2.5).

In the refinement phase, the structures of loops and side chains are usually refined by molecular dynamics or energy minimisation procedures (see Section 3).

In the last step, model evaluation, the refined models are evaluated for their agreement with information gathered from a number of sources, including generally known structural features and other experimental results. Each of these can take a variety of forms. For example, structural features of the model can be evaluated by generating a Ramachandran plot and by calculating clash scores for steric overlap [20–22].
Experimental results can include independent measures of the location of disulphide bonds, secondary structure content as measured by circular dichroism or infrared spectroscopy, low-resolution structural data, and information about conformation and function derived from mutagenesis studies. Additional computational analysis can also be valuable: for example, can a known ligand for the protein be docked into the model structure's binding site? Using this information, the best model can be chosen out of a set of predictions.

Most of the methods and assumptions used in homology modelling are derived from the physico-chemical properties of water-soluble proteins, including experimentally determined low- and high-resolution structures. Homology model building of membrane proteins follows similar rules to that of water-soluble proteins. However, including information specific to membrane proteins, such as the location of hydrophobic transmembrane regions and the incorporation of a lipid environment, might increase the accuracy of the models [23].

Many homology models have been used in the drug discovery process, including models for protein kinases [24–26], GPCRs [27–32] and other membrane proteins [33–36].

2.4.1.1. Template identification and selection

Template identification is done by comparing the target sequence with each member of a database, which is usually the wwPDB database or a close relative thereof. The search methods usually provide a ranking of templates based on the target–template alignment and various alignment scores which describe the quality of the ‘compatibility’. The alignment is often based on approximations to improve the search time and therefore may be suboptimal.

There are different approaches to template identification: sequence–sequence or sequence–profile search methods such as FASTA [37,38] and BLAST [39] use pairwise or sequence–profile alignment algorithms, respectively.
A sequence profile is usually constructed from a protein family by aligning all members together. Other methods use profile–profile alignments or Hidden Markov Models (HMMs) to identify eligible templates. More advanced methods such as HHpred [40] or Phyre [41] use profiles or HMMs combined with structural features (e.g. secondary structures); these have been used very successfully in the CASP competitions. Sequence–sequence and sequence–profile comparisons perform well when target and template sequence identity is high (above 30–40%). Profile–profile and HMM methods seem to achieve better results than methods using sequence–profile or sequence–sequence comparisons when sequence identity is below 30% [42–44].

After generating a list of candidate templates, the most eligible structures must be selected. Often these are the structures with the highest alignment score to the target, which includes factors such as sequence identity, sequence similarity and penalties for gaps. However, this is not the only criterion used in homology modelling. When template structures with similar alignment scores are available, the structure solved to highest resolution is often chosen. Other factors such as the presence of bound ligand, the pH of template/target and the solvent type can also be important for selecting the template.

In cases with low sequence identity between target and template the alignment becomes difficult and is often unreliable [45]. Doolittle [46] formulated some rules of thumb: 1) When the sequences are longer than 100 amino acids and are more than 25% identical (including gaps) then they are probably related. 2) If the sequences are found to be 15–25% identical the sequences might still be related (‘twilight zone’) but additional statistical analyses should be performed to help establish this with confidence [46]. 3) Sequences with less than 15% sequence identity are most likely not related.

There is some evidence that using multiple templates for the homology model building can improve the quality of the model compared to using single templates [47,48].
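
Doolittle's rules of thumb quoted above can be written down as a small decision function. This is a minimal sketch, not a published implementation: the function names are hypothetical, identity is computed over all alignment columns (so gap columns count towards the length), and the case of high identity between sequences of 100 residues or fewer, which the rules do not address, is flagged as inconclusive as an assumption of this sketch.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns holding identical residues.

    Both arguments are rows of a pairwise alignment of equal length,
    with '-' marking gaps. Gap columns count towards the length.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equally long")
    identical = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * identical / len(aligned_a)


def doolittle_category(identity_percent, sequence_length):
    """Classify a sequence pair using the three rules of thumb."""
    if identity_percent < 15.0:
        return "most likely not related"
    if identity_percent <= 25.0:
        return "twilight zone: further statistical analysis needed"
    if sequence_length > 100:
        return "probably related"
    # Assumption: the rules are silent on short, highly identical sequences.
    return "inconclusive: sequences too short for the rule of thumb"


# Toy alignment (hypothetical sequences): 9 identical columns out of 11.
identity = percent_identity("MKT-AYIAKQR", "MKTWAYLAKQR")
print(f"{identity:.1f}% identical")
print(doolittle_category(30.0, 150))
print(doolittle_category(20.0, 150))
print(doolittle_category(10.0, 150))
```

The thresholds are deliberately kept as plain comparisons so that they can be adjusted; they are guides for deciding when more careful statistical analysis is needed, not a substitute for it.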

2.4.1.2. Target–template alignment

Homology modelling programs use the target–template alignment as input, but the alignments produced by the above search methods are usually sub-optimal and specialised alignment tools are often used to create a better alignment. This is crucial since the alignment of sequences is the most important step in the homology modelling procedure [49]. For example, even one incorrectly aligned residue leads to a displacement of its α carbon at least 3.8 Å away from the correct location, and generally this cannot be corrected automatically by most of the optimising procedures available (see Section 3). An inappropriate single-residue gap in an α-helical region induces a rotation of the remainder of the residues in the helix by ~100°. In an evolutionary sense, such deletions may be important for altering function. Alternatively, they can be ‘neutralised’ through localised helix unwinding. From the computational modelling perspective, gaps within secondary structure elements can prove extremely problematic, whereas gaps at the ends can sometimes be tolerated.

Specialised alignment tools can be grouped into pairwise alignment and multiple-sequence alignment approaches. The most widely used alignment methods are ClustalW [50], T-Coffee [51], 3D-Coffee [52], and MUSCLE [53], which can all be used for pairwise or multiple alignments. Additional information is also often used to improve alignment, including the placement of hydrophobic regions, secondary structure elements, and disulphide bonds. Thus, the expertise of the modeller, drawing on biochemical information such as function, family characteristics, mutagenesis observations and other information that may require manual intervention, is frequently used to refine automated alignments.
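
The two figures quoted above for alignment errors in a helix (~3.8 Å and ~100°) can be checked with a short calculation. The sketch below assumes textbook parameters for an ideal α-helix, which are not given in the text: 3.6 residues per turn, a rise of 1.5 Å per residue, and a Cα radius of about 2.3 Å. The function name is hypothetical.

```python
import math

# Assumed textbook geometry of an ideal α-helix (not taken from the review).
RESIDUES_PER_TURN = 3.6
RISE_PER_RESIDUE = 1.5   # Å along the helix axis
CA_RADIUS = 2.3          # Å, approximate radius of the Cα trace


def helix_ca(i):
    """Cα position of residue i on an ideal α-helix running along z."""
    angle = math.radians(360.0 / RESIDUES_PER_TURN * i)
    return (
        CA_RADIUS * math.cos(angle),
        CA_RADIUS * math.sin(angle),
        RISE_PER_RESIDUE * i,
    )


# Shifting the alignment by one residue moves every residue to where its
# neighbour should be: a rotation about the axis plus a Cα displacement.
rotation = 360.0 / RESIDUES_PER_TURN
shift = math.dist(helix_ca(0), helix_ca(1))

print(f"rotation per residue: {rotation:.0f} degrees")  # 100
print(f"Cα displacement:      {shift:.2f} Å")          # 3.83
```

Under these assumptions a one-residue register error reproduces both numbers, which is why a single misplaced gap inside a helix propagates to every downstream residue of that helix.
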
The CASP competitions have borne out the value of expert intervention: it has been repeatedly observed that predictions using intervention based on human expertise are mostly better than predictions from fully automated servers [54,55].

2.4.1.3. Model building

There are different approaches for building the model, which can be grouped as [56]: rigid-body assembly methods (e.g., using 3D-JIGSAW [57], BUILDER [58] and SWISS-MODEL [59]), segment matching methods (e.g., using SegMod/ENCAD [60]), spatial restraint methods (e.g., using MODELLER [61,62]), and artificial evolution methods (e.g., using NEST [63]).

Rigid-body assembly methods build a model from those fragments of the template structure that align to the target sequence. For those parts of the target which do not align with the template, a structural database of known protein structures or de novo methods (see Section 2.5) are used to model the missing parts.

Segment matching methods build a guiding structure of the target based on the alignment, and then the target structure is broken into a set of short segments which are used to search a database of known structures for matching segments. The selection of each segment is based on sequence identity and geometrical considerations as well as energetic criteria. The full-atom model is then built by using the guiding structure as an anchor to place the segments.

Spatial restraint methods construct the model by satisfying restraints derived from the template structure. The restraints, including bond lengths and angles, van der Waals contact distances, and dihedral angles, are mapped onto the target structure based on the alignment.

The artificial evolution methods use rigid-body assembly methods in combination with stepwise ‘evolutionary mutations’ on the template until its sequence is changed to that of the target. This approach considers the alignment between target and template as a list of operations, which includes amino-acid substitutions, insertions and deletions.

Various studies have shown that no single model-building program is universally superior. MODELLER, NEST and SegMod/ENCAD generally perform better than the others, with MODELLER performing the best on average [49,64]. It should be mentioned that SegMod/ENCAD was written more than 15 years ago and has not undergone any further development since.
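
The idea behind spatial restraint methods can be sketched in a few lines: distances measured in the template are transferred to the target through the alignment, and candidate models are scored by how badly they violate them. This is a toy illustration only. The harmonic (squared-deviation) penalty, the restriction to Cα–Cα distances, the function names and the coordinates are all assumptions of the sketch, and it is not the objective function of MODELLER or of any other program named above.

```python
import math
from itertools import combinations


def derive_distance_restraints(template_ca, alignment):
    """Transfer Cα-Cα distances from the template to the target.

    `alignment` maps target residue index -> template residue index for
    aligned positions only. Returns {(i, j): distance in Å} for target pairs.
    """
    restraints = {}
    for (t1, p1), (t2, p2) in combinations(sorted(alignment.items()), 2):
        restraints[(t1, t2)] = math.dist(template_ca[p1], template_ca[p2])
    return restraints


def restraint_violation(model_ca, restraints):
    """Sum of squared deviations (Å^2) of model distances from the restraints."""
    return sum(
        (math.dist(model_ca[i], model_ca[j]) - target_distance) ** 2
        for (i, j), target_distance in restraints.items()
    )


# Toy template with three Cα atoms and a one-to-one alignment (hypothetical).
template_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0)]
alignment = {0: 0, 1: 1, 2: 2}
restraints = derive_distance_restraints(template_ca, alignment)

# A translated copy of the template satisfies every restraint, because the
# restraints are internal distances; a stretched-out model does not.
model_good = [(x + 10.0, y, z) for x, y, z in template_ca]
model_bad = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]

print(restraint_violation(model_good, restraints))
print(restraint_violation(model_bad, restraints))
```

A model-building program would minimise such a function over the model coordinates; here the two candidate models are only scored, with the lower value indicating better agreement with the template-derived restraints.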

2.4.1.4. Model refinement

For homology modelling, refinement tends to focus on the correct orientation (rotamer position) of the side chains and the structure of the loops. One persistent observation, however, is that the models obtained are structurally closer to the template than to the true structure of the target. Attempts have been made to refine homology structures away from the template structure to become closer to the true structure of the target using methods that do not depend on the template. These include the use of physical parameters and knowledge-based input (see Section 2.5). However, because different templates are often used to generate a homology model and different combinations of methods are then applied to refine these models, no absolute conclusions about the applicability of refinement have emerged [65].

Side-chain modelling is normally done by the homology modelling program but this is not always optimal. Therefore, models are often refined by standalone programs such as SCWRL4 [66] (which uses rotamer libraries derived from known structures), or else molecular dynamics simulations are run for the entire model (see Section 3.2).

The regions of the target sequence which do not have a corresponding homologous region in the template are often loops, which nevertheless can play important structural roles, form ligand-binding sites, etc. Loops are normally modelled using a database-search or de novo conformational-search approach. In the database-search approach, a database of loops derived from known structures is interrogated. Loop conformations are selected using a scoring process that combines information from matching sequence length, sequence similarity, and other factors such as secondary structure or chemical properties. One of the most popular loop database servers is ArchPred [67].
Currently, loop searches are only carried out for loops of length up to 10 residues because the number of possible conformations for longer loops becomes very large. Conformations for shorter loops are well represented in databases such as wwPDB but recognising them through appropriate scoring remains problematic.

To help overcome these limitations, de novo methods have been developed to predict conformations of loops by searching conformational space. Methods include Monte Carlo simulations, simulated annealing, genetic algorithms, and molecular dynamics simulations, often in combination with knowledge-based potentials. For these de novo approaches there is no limitation on the length of loop that can be modelled, but with increasing length the number of possible conformations increases rapidly, making modelling very time-consuming.

Hybrid methods are also available and include: 1) Combining database searching for short fragments with de novo methods for longer fragments (e.g., using SWISS-MODEL [59]). 2) Generating two different sets of conformations, one from database searching and one from conformational searching, and then clustering the predictions in order to create a consensus that must pass a set of rule-based filters (e.g., using CODA [68]). 3) Building loops derived from known structures and then running a conformational search restricted to randomly selected fragments (e.g., using ROSETTA [69]; see Section 2.5).

2.4.1.5. Model quality assessment

After the model has been built, its stereochemistry can be checked using programs such as PROCHECK [21], WHATCHECK [22] or MolProbity [20]. These programs are not optimal because they check the capability of the homology modelling algorithm to build the structure rather than verifying the actual quality of the model. However, they are still useful for detecting errors in the modelling process and in the models themselves (such as bad phi/psi angles or clashes), and this information can be used to choose the best model out of a set of predictions.

Another approach is to calculate a pseudo-energy profile of the model using programs such as PROSA [70,71] or Verify3D [72,73].
These programs assign an energy value to each amino acid in the sequence derived from atomistic coordinates of correctly folded 3D structures. Peaks in the profile indicate an unfavourable contribution to the potential energy of the structure and point to errors. Besides evaluation of the models by these specialised programs, models can also be validated through reference to experimental results (e.g., mutagenesis studies and disulphide bond locations) as well as computational methods such as those which evaluate the ability of a known ligand to fit within the model's ligand-binding site. 2.4.1.6. Errors in homology modelling. Clearly there are many potential sources of error in the homology modelling process including poor choice of template, misalignment of the target sequence to the template, incorrect loop folding and choice of side-chain conformation. Inappropriate choice of template and misalignment appear to be the main sources of error, particularly when sequence identity is less than 30%. Chothia and Lesk [74] studied the relationship between protein sequence and 3D structure of homologous proteins and showed that the RMSD increased with decreasing sequence identity, and that if sequence identity was less than 20% large structural differences generally occurred. 2.4.2. Threading Threading approaches can be used in place of homology modelling (i.e., even if a homologous template can in fact be found for the target sequence) but is thought to be more particularly useful when the only available template structures share less than ~30% sequence identity with the target sequence. Thus, threading can be used to select template structures even if the target and template sequences are not evolutionarily related. However, the method does require that the target does indeed adopt a fold whose 3D structure has already been solved.


T. Werner et al. / Advanced Drug Delivery Reviews 64 (2012) 323–343

Threading attempts to search through the folds with known structure and identify the ones which are most likely to be appropriate for the target sequence. Therefore, the threading method needs a target sequence and a structure library as input, and a selection process using a scoring function that finds the best sequence–structure match (threading). Threading is often considered a method that falls between homology modelling and de novo modelling and in fact is often incorporated directly or indirectly into these approaches to help improve structure modelling [75]. For example, many threading methods just identify the template structure or provide the target–template alignment for the modelling process, leaving the actual model building to be done by homology modelling programs [76].

The library of different protein folds is derived from databases containing known structures (such as the PDB). The selections can be based on a variety of criteria including sequence similarity to other known structures, the resolution (in Å) of the solved structure, etc. Preferably, the library should be large enough to contain a wide range of structures representing all currently known folds, but because the threading process itself is often slow this can put considerable strain on computational resources.

Scoring functions are designed to measure the ‘fitness’ of a sequence–structure alignment. Normally, the sequence of residues is aligned to each structure in the library using a knowledge-based energy function that scores the sequence of residues placed on the backbone structure of the potential template [77]. The assumption is that a suitable template fold provides a framework which lets the target's sequence of residues interact favourably, leading to a higher score [78]. The resulting rough models are then ranked according to an energy function, which can be different to the knowledge-based energy function used for the alignment.
A variety of parameters can be included in the energy functions. These include pairwise interaction scores derived from the analysis of pairs of residue–residue contacts in known structures, profile energies which take into account the environment (e.g., the burial state) of a residue, secondary structure compatibilities, sequence similarity, and gap penalties [78]. The development of the profile energy potential was motivated by the computational difficulties of finding optimal alignments with gaps when using only pairwise energies [77]. The best known threading program is THREADER [79,80], which uses residue–residue contacts, solvation potentials, secondary structure predictions, and target–template sequence similarities for the scoring function. RAPTOR [81] takes into account residue solvation potentials, secondary structure, amino-acid substitution rates and pairwise interaction scores. After generating the models, RAPTOR ranks the models with a support vector machine.

2.5. Free modelling

The 3D structures for many sequences cannot yet be predicted reliably by homology modelling or threading simply because suitable template structures do not presently exist. Because of this problem, and the additional difficulty that many modelled structures end up looking more like the template structure than they should, free-modelling methods were developed. The template-free approach to modelling has been guided by Anfinsen's thermodynamic hypothesis, which states that a protein's structure in a given environment is determined by the sequence and corresponds to the global minimum of the potential energy of the system. Levinthal formulated the paradox that the folding process cannot follow a random path to find the native conformation because it would take longer than the age of the universe. The concept of folding funnels [82,83] was then developed in which protein folding follows an energy landscape, moving downhill to the global minimum.
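To make the idea of a sequence–structure score concrete, the following is a minimal sketch of gapless threading with a single environment (burial) term. It is not the scoring function of any published threading program: the hydrophobic residue set, the ±1 energies, the fold names and the burial strings are all hypothetical values chosen for illustration, and real methods add pairwise terms, gaps and secondary structure compatibility. Lower energy is taken here to mean a better fit.

```python
# Toy gapless threading: every name and number below is a hypothetical
# illustration, not a parameter of a real threading program.
HYDROPHOBIC = set("AVLIMFWC")  # assumed set of residues that prefer burial


def position_energy(residue, environment):
    """Return -1 if the residue suits the environment ('B' buried, 'E' exposed), else +1."""
    buried = environment == "B"
    compatible = (residue in HYDROPHOBIC) == buried
    return -1.0 if compatible else 1.0


def thread(sequence, burial_pattern):
    """Place the sequence on a template position by position (no gaps) and sum the energies."""
    if len(sequence) != len(burial_pattern):
        raise ValueError("gapless threading needs equal lengths")
    return sum(position_energy(r, e) for r, e in zip(sequence, burial_pattern))


def rank_templates(sequence, library):
    """Score the sequence against every fold in the library, best (lowest energy) first."""
    scores = [(name, thread(sequence, pattern)) for name, pattern in library.items()]
    return sorted(scores, key=lambda item: item[1])


# Hypothetical fold library described only by per-position burial states.
library = {"fold_A": "BEBEBEBE", "fold_B": "EEBBEEBB"}
ranking = rank_templates("LKVEAKIE", library)
for name, score in ranking:
    print(name, score)
```

In this toy case the alternating hydrophobic/polar sequence fits the alternating burial pattern of the first fold at every position, so that fold is ranked first.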
The concept of the folding funnel has elegantly shown a way out of the Levinthal paradox and also illustrated how a folding protein may become trapped at some crevice (local minimum) of the energy surface.
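The scale of the Levinthal paradox can be appreciated with a back-of-envelope calculation. The sketch below assumes three accessible conformations per residue and a sampling rate of 10^13 conformations per second; both numbers are illustrative assumptions rather than values taken from the references cited here.

```python
# Back-of-envelope Levinthal estimate; the number of states per residue and
# the sampling rate are illustrative assumptions.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate


def exhaustive_search_years(n_residues, states_per_residue=3, samples_per_second=1e13):
    """Years needed to visit every conformation once in a purely random search."""
    n_conformations = states_per_residue ** n_residues
    return n_conformations / samples_per_second / SECONDS_PER_YEAR


years = exhaustive_search_years(100)
print(f"{years:.1e} years")
print(f"{years / AGE_OF_UNIVERSE_YEARS:.1e} times the age of the universe")
```

Under these assumptions a 100-residue chain would need of the order of 10^27 years for an exhaustive random search, which is why a funnel-like landscape that guides the search is required.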

Free-modelling methods build 3D structures of target sequences using iterative processes in which the conformation of the folding structure is changed until the conformation with the lowest potential energy is found. Techniques used to search the energy landscape are combined with a scoring (energy) function used to estimate the value of potential energy. Ultimately, the terms in the energy function may not be intended to faithfully reproduce energies, but rather to promote computational tractability. The energy landscape is normally searched using Monte Carlo simulation or molecular dynamics approaches.

Template-free methods can be divided into two groups, ab initio and de novo, based on their energy functions. Ab initio methods use energy functions based on first principles of energy and atomic motion. The algorithms generally use a series of relatively simple terms to calculate the energies of structures; nevertheless, the computational demands are considerable. Widely used methods include UNRES [84,85] and ASTRO-FOLD [86,87]. Despite attempts to reduce the computational costs, ab initio methods generally are limited to small molecules, including peptides, where they can be used to model the structures of fragments of sequence up to about 100 residues in length [85]. De novo methods combine quantitative understanding of the physics of folding with knowledge about previously solved protein structures. Commonly used de novo methods include ROSETTA [69] and I-TASSER [75,88,89].

2.5.1. ROSETTA

ROSETTA is a fragment-based structure prediction method which typically can produce structures with accuracies of 3–6 Å RMSD compared to the experimentally determined structures [69]. Generally speaking, such models have the correct fold and secondary structures, and active sites have many of the correct functional residues clustered within them.
When ROSETTA is combined with limited experimental data, such as disulphide bond location, accuracy of the models often improves to 2–3 Å RMSD [69]. Consistent with our understanding of protein folding, ROSETTA tries to mimic the fact that local structures fluctuate, though most are eventually ‘locked in’ to a final compact 3D structure stabilised by additional long-range interactions, which can include contact between individual secondary structure elements. ROSETTA employs a nonredundant library composed of X-ray structures of ≤2.5 Å resolution with no entries sharing more than 50% sequence identity [69]. For each 3- or 9-residue fragment of the target sequence, a search is made of the library to select appropriate conformations based on sequence alignment and secondary structure considerations. The top 25 ranked conformations effectively represent a set of discrete conformations that a fragment can adopt. Complete 3D structures for the target sequence are then made by randomly combining conformations of the fragments using a Monte Carlo simulated annealing approach. A scoring function (or potential energy surface) is used to evaluate and compare models based on a number of parameters which directly or indirectly include solvation, electrostatic effects, H-bonding, van der Waals interactions, secondary structure packing and steric overlap [90]. Several of these are probabilistic terms based on analysis of solved structures. The use of a nonredundant library, the initial restriction of the conformational space that individual fragments can adopt, together with other approximations such as replacing all side-chain atoms with a single centroid ‘atom’, all help to reduce computational load [91]. However, these restrictions can be relaxed if additional structural refinement of the 3D structures is desired. The scoring function can also be refined by modifying terms or including extra terms which are more physically realistic with respect to protein structure and stability.

2.5.2. I-TASSER

I-TASSER was developed to use two stages of calculation in the de novo predictions of structures. It combines fragment-based structure


prediction with threading: initial models are built using a conformational search and are then submitted to a second conformational search for refinement. I-TASSER searches the whole PDB library with its threading algorithm [76] to select appropriate template fragments, which may be of varied lengths. The aligned fragments are then combined to assemble the global structure, and missing regions for which a template has not been found are built by ab initio modelling [92]. The conformational space is searched by replica-exchange Monte Carlo simulations [93] using a classical knowledge-based energy function, restraints from the threading template, and sequence-based contact predictions [94]. In the resulting models, the side-chain atoms are represented only as a single centroid; the models are then clustered [95] and refined by undergoing the fragment-based assembly simulation again. In this second simulation, the spatial restraints are extracted from the centroids and the PDB structures are searched by a structure alignment program [96]. The conformations are clustered and the lowest energy structure in each cluster is selected and expanded to a full-atomistic structural representation [97]. The I-TASSER server can be accessed openly for the prediction of protein structures and their functions.

Other free modelling approaches have performed well in the recent CASP competitions, including the multi-level combination approach MULTICOM [98], which combines complementary and alternative templates, alignments and models, and MUFOLD-MD [99], which implements molecular dynamics simulation for model evaluation.

2.6. Comparison of structure-prediction methods

The three commonly used structure prediction methods, homology modelling, threading and de novo free modelling, have different strengths and weaknesses and their reliability varies for each structural problem.
The choice of approach to be employed not only depends on the availability of a suitable template but also on the computational resources available. Homology modelling approaches perform very well for proteins with closely homologous templates and most predicted structures have an RMSD of 1–2 Å from the experimental structure [100]. However, a homolog with sufficient sequence identity is not always available and not every homologous protein is a good template. Threading techniques are often used for target sequences where a template can be found that is either a distant homolog or is not homologous but adopts a similar fold. The resulting model quality is, on average, worse using threading compared to homology modelling, with RMSDs usually in the range of 2–6 Å, with larger errors in the loop regions. For target proteins for which no suitable solved template structure exists, de novo approaches can be used but successful structure prediction is usually limited to small proteins (<120 residues) and the RMSD compared to the experimental structure is quite high (4–8 Å) (Fig. 2).

In a 2007 study [101], small protein domains (<150 residues), which were not homologous to known structures, were selected from the yeast genome. 3D structures of these 3338 domains were predicted using ROSETTA and then classified into the SCOP superfamilies by a structure-based comparison of the predicted models with the SCOP structures, and also by integration of Gene Ontology (GO) data [102]. Out of the 3338 models, 404 could be assigned using only structure-comparison methods and a further 177 were successfully assigned after integrating GO data. The results show that the de novo prediction methods might be useful for special problems but one cannot expect a high success rate. In CASP 8 there were only 10 template-free and 3 template-free/template-based targets.
A visual assessment [103] concluded that only six targets had excellent models predicted of which two were template-free/template-based targets. Alternative rankings and


scorings showed that MUFOLD, MULTICOM and ROSETTA performed best. However, the small number of free-modelling targets and the small size of the targets (11 of 13 targets had a sequence length of 44–87 residues) make it difficult to generalise.
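The RMSD values quoted in this comparison measure the average distance between equivalent atoms of a model and the experimental structure. A minimal sketch of the calculation is given below; it assumes that the two coordinate sets contain the same atoms in the same order (for example Cα atoms) and that they have already been optimally superimposed, a step that is normally carried out first and is omitted here. The coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (same units as the input) between two
    equally ordered, already superimposed lists of (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical C-alpha positions in angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (3.8, 1.0, 0.0), (7.6, 0.0, 0.0)]
print(f"RMSD = {rmsd(experimental, model):.2f} angstroms")
```

Because the deviations are squared before averaging, a few badly placed loop residues can dominate the RMSD of an otherwise accurate model.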

3. Force field-based modelling methods

In the process of molecular modelling we need to find the global minimum of the energy surface by starting at an initial conformation and altering it according to the specialised rules of the modelling process. These so-called molecular mechanics approaches treat atoms as a series of balls (the atomic nuclei) connected by springs (representing covalent bonds). The electrons are not considered explicitly and the energy of the molecule is described only by the positions of the atomic nuclei (Born–Oppenheimer approximation). The energy of the system is described by a function consisting of a set of terms that describe covalent and non-covalent (electrostatic and van der Waals) interactions. This function is referred to as the 'force field' [104]. The 'classical' force field consists of two major parts: the first describes interactions between atoms connected via covalent bonds, and the second describes the non-covalent interactions using the Lennard–Jones and Coulomb terms [104]. The Lennard–Jones term models the van der Waals interactions and the Coulomb term describes electrostatic interactions between charges.

The most common biomolecular mechanical force fields are CHARMM [105–107], AMBER [108], OPLS [109,110] and GROMOS [111]. The terms in these force fields are mostly derived from experimental and quantum-mechanical data and were then optimised by adjusting the parameter values until the force fields were able to reproduce the data in training sets. These training sets can include data derived from a number of sources including spectroscopic, thermodynamic and crystallographic data of protein structures, as well as data derived from quantum-mechanical calculations. Force fields have different strengths and weaknesses and all have limitations due to the simplifications required to make them computationally tractable. For two well-written reviews providing deeper insights into classical force fields see [104,112].
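The non-covalent part of a classical force field can be illustrated for a single pair of atoms. The sketch below evaluates a 12-6 Lennard–Jones term and a Coulomb term; the function names and the example parameters (epsilon, sigma and the partial charges) are hypothetical and are not taken from CHARMM, AMBER, OPLS or GROMOS. Units are assumed to be kJ/mol, nm and elementary charges.

```python
# Electric conversion factor 1/(4*pi*eps0) in kJ mol^-1 nm e^-2.
COULOMB_CONSTANT = 138.935458


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy: 4*eps*[(sigma/r)^12 - (sigma/r)^6]."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def coulomb(r, q_i, q_j):
    """Electrostatic energy of two point charges separated by r (vacuum)."""
    return COULOMB_CONSTANT * q_i * q_j / r


def nonbonded_pair_energy(r, epsilon, sigma, q_i, q_j):
    """Sum of the van der Waals and electrostatic terms for one atom pair."""
    return lennard_jones(r, epsilon, sigma) + coulomb(r, q_i, q_j)


# Hypothetical parameters for illustration only.
epsilon, sigma = 0.65, 0.32   # kJ/mol, nm
q_i, q_j = -0.8, 0.4          # elementary charges
for r in (0.30, 0.36, 0.50, 1.00):
    total = nonbonded_pair_energy(r, epsilon, sigma, q_i, q_j)
    print(f"r = {r:.2f} nm  E = {total:9.3f} kJ/mol")
```

The Lennard–Jones term is zero at r = sigma and reaches its minimum of -epsilon at r = 2^(1/6) sigma, whereas the Coulomb term decays only as 1/r, which is why long-range electrostatics tends to dominate the cost of evaluating the non-covalent interactions.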

3.1. Energy minimisation

Minimisation in molecular mechanics refers to strategies used to traverse the energy surface of a molecule in order to find a conformation or conformations close to or at the global minimum. The energy minimisation procedure always tries to walk downhill on the energy surface by changing the structure in order to find a structure of lower energy. The process is iterative and is usually continued until the calculated energies of successive structures differ by less than a specified threshold, indicating the structure is now at a minimum. Different starting conformations of the protein should converge to the same minimum, providing confidence that the minimum is global. Convergence to a minimum with higher energy indicates that the minimum is local, not global. Three common algorithms used for energy minimisation are steepest descent, conjugate gradients and Newton–Raphson, and these are often combined in various ways to provide a robust convergence to the global minimum.

The energy minimisation procedure is often used to refine protein structures; i.e., it is used to explore the energy surface from a position close to what is already thought to be near the global minimum. Minimisation can also be used as a means of analysing the effect of mutations in a protein, such as a point mutation, by re-orienting residues (mostly in the region of the mutation) to minimise unfavourable contacts and steric constraints and maximise favourable contacts.
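The behaviour described in this section can be sketched with steepest descent on a one-dimensional surface. The double-well function below is a hypothetical stand-in for a molecular energy surface, and the fixed step size and convergence threshold are arbitrary assumptions; practical minimisers typically adapt the step size and work on thousands of coordinates.

```python
def energy(x):
    """Hypothetical 1D energy surface with one local and one global minimum."""
    return (x * x - 1.0) ** 2 + 0.3 * x


def gradient(x):
    """Analytical derivative of energy(x)."""
    return 4.0 * x * (x * x - 1.0) + 0.3


def steepest_descent(x0, step=0.01, tol=1e-12, max_iter=10000):
    """Walk downhill until successive energies differ by less than tol."""
    x, e = x0, energy(x0)
    for _ in range(max_iter):
        x_new = x - step * gradient(x)  # move against the gradient
        e_new = energy(x_new)
        converged = abs(e - e_new) < tol
        x, e = x_new, e_new
        if converged:
            break
    return x, e


x_a, e_a = steepest_descent(1.5)    # start on the right-hand side
x_b, e_b = steepest_descent(-1.5)   # start on the left-hand side
print(f"start  1.5 -> x = {x_a:.3f}, E = {e_a:.3f}")
print(f"start -1.5 -> x = {x_b:.3f}, E = {e_b:.3f}")
```

The two starting points end in different minima; the higher energy reached from the right-hand start identifies that minimum as local, which is the diagnostic described above for comparing runs from different starting conformations.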



Fig. 2. Accuracy of computational structure prediction methods and their applications. Shown are computational structure prediction methods (de novo prediction, threading and homology modelling) and their corresponding accuracy and applications. The protein structure model is that of myoglobin, which was the first protein crystal structure to be solved. The different detail levels of the structure clarify visually the information which can be derived from the models constructed by the methods. The RMSD values reflect the expected differences between model and biological structures and limited applications are highlighted here.

3.2. Molecular dynamics

In molecular dynamics simulations, the behaviour of the system, as measured by the value of the force field, is monitored over time using rules (generally referred to as molecular mechanics) that draw heavily on the laws of Newtonian motion. The resulting calculated trajectory for the molecule provides detailed information on the fluctuations and conformational changes of proteins over time. Molecular dynamics simulations have been a common tool to investigate biological macromolecules since they were introduced in the late 1950s to study the interaction of hard spheres [113,114]. Over the years, more and more molecular simulation program packages have been developed [105,108,115–117]. The first molecular dynamics simulation for a protein was performed in 1977 with the simulation of the dynamics of folded bovine pancreatic trypsin inhibitor (BPTI) [118]. It took more than 15 years before the resources were available to perform the first molecular dynamics simulation of a membrane-embedded peptide in 1994 [119] and the first integral membrane

protein in 1995 [120]. Since that time, many different studies using molecular dynamics simulation have been carried out to explore the dynamics of structures [121], to refine structures [122–124], to determine ligand–protein interactions [125], to simulate lipid surfaces [126,127], to sample conformations [128] and to model membrane-embedded proteins [129–131]. Before the molecular dynamics simulation can be performed, the simulation system has to be defined by setting up the atoms (protein, lipids, water, ions) and their positions, the temperature, the pressure and also the system cell, inside which the simulation takes place. All these parameters can also change during the simulation. Many simulations are performed with periodic boundary conditions, so that objects and interactions which pass through the cell boundaries reappear on the other side of the cell. Thus, when applying a periodic boundary condition, it is very important that the simulation system is large enough so that the simulation systems in different periodic cells are not affected by each other in an unintended way. Moreover, it is not always clear where the boundaries of the system should be placed


so that all necessary interaction partners are included whilst keeping the system to a manageable size. After setting up the system, the force field and the molecular simulation algorithm have to be chosen. The choice of the force field is not trivial because it depends on a number of factors including the specific system to be tested, the purpose of the simulation and its timescale, and the computational resources available. The simulation is started by assigning initial velocities to all atoms in such a way that their distribution suits the starting temperature of the system. The force on each atom is then calculated from the force field. Knowing the force, the acceleration of each atom in the system can be determined in order to solve the equation of motion and to obtain the new force for the next simulation step. Assuming a finite time step, integration of the governing equation over time yields the trajectory which describes the velocities, positions and accelerations of the particles in the system at every time step. Because of the complexity, the equation cannot be solved analytically and molecular dynamics programs use numerical integration with one of numerous algorithms such as the Verlet method or the Leapfrog method [132]. Both algorithms use the finite difference method in which the integrations are divided into many small steps.

One of the most important questions is how long one should simulate the system in order to see effects (Fig. 3) and whether these effects are stable or only temporary. The simulation times vary between a few nanoseconds and 1 ms [133,134] depending, in part, on the size of the system. In most molecular dynamics simulations, a system is considered stable when it is seen to equilibrate over a longer timescale. However, the timescale to simulate depends on the effects the modeller wants to observe and the available computer power.

3.2.1. Force fields in molecular dynamics simulations

There are three common types of force fields in molecular dynamics simulations of biomolecules: all atom, united atom and coarse-grained. In the all atom force fields (AMBER, CHARMM, OPLS-AA

Fig. 3. Comparison of motions in proteins and their time scale. The time scale for the motions depends on the size of the structural elements and the environment. Side chains on the surface of a protein move faster than side chains buried in the protein forming many contacts to other side chains. The time scale for α-helix, β-hairpin and protein folding depends largely on their size. The graphic shows that MD simulations are generally used for a small part of the dynamic range in a protein.


[109]) every atom is treated separately and parameters, such as charge and non-bonded as well as bonded interaction parameters, are provided for every single atom in the system. In the united atom force fields (GROMOS, OPLS-UA [135]), the non-polar hydrogens are not treated as separate atoms; instead, they are combined with the carbon into united atoms, which decreases the number of particles in the system. The united atom force fields were mostly created at a time when limited power of computers made it attractive to not include the large number of hydrogen atoms. In addition, in many united atom force fields the aromatic ring charges cannot correctly be treated due to the significant quadrupolar charge distribution having an effective positive charge near the hydrogens and a negative charge near to the middle of the ring. This effect can be crucial for determining the pi–pi interaction of aromatic rings (stacking, edge-to-face, face-to-face) and interaction between rings and other chemical groups. This limitation was later removed in the GROMOS force field by adding aromatic hydrogen atoms to nucleotide bases [136]. Moreover, the forces that influence the rotation between conformations of five-atom aliphatic rings are difficult to describe when just united atoms are treated in the force field. Finally, it is difficult to compare computed and observed vibrational frequencies with united atom models [104]. Despite the limitations of united atom force fields they are still widely used because in most simulations the approximation is not the limiting factor and speed is more important. The coarse-grained force fields reduce the computational requirement for the simulation by grouping atoms into ‘super atoms’ in which one particle represents 2–5 atoms. One of the most popular coarse-grained force fields is MARTINI [137,138] and its extension for proteins [139]. The larger the groups of atoms treated as a super atom, the larger the saving in computational cost. 
This elimination of internal degrees of freedom implies that their effect must be taken into account implicitly, which will lead to a loss of accuracy. There has recently been considerable interest in using coarse-grained models for protein–protein complexes [140,141], vesicle fusion [142,143] and protein folding [144–146], which are currently beyond the capabilities of atomistic simulations. Moreover, many groups are using coarse-grained models for membrane protein simulations [147–151] in order to simulate over a longer time scale. In addition to these three common force fields, there are some specialised force fields such as polarisable force fields in which a particle's charge is influenced by electrostatic interactions with its neighbours, in contrast to the ‘classic’ force field in which the charge is not affected by the local electrostatic environment [152].

3.2.2. Explicit treatment of solvent

Molecular dynamics can explicitly or implicitly include the presence of solvents such as water and lipids to provide a more realistic simulation compared to in vacuo. Explicit inclusion of solvent dramatically increases computational cost because the number of particles in the system greatly increases. The computational cost can rise even further depending on the model of the solvent used. For example, different models of water can be used in molecular dynamics including ST2 [153], SPC [154], and TIP3P, TIP4P, and TIP5P [155]. SPC and TIP3P are rigid 3-site models having three interaction sites which correspond to the three atoms. TIP4P is a 4-site model additionally placing the negative charge on a dummy atom near the oxygen, and TIP5P is a 5-site model, placing the negative charge on two dummy atoms representing the lone pairs.
Despite the TIP5P model including interaction sites at the lone pair positions and the 3-site water models lacking lone-pair positions, SPC and TIP3P remain the most commonly used solvent models because they approximately halve the number of particles in the system. Molecular dynamics simulations of membrane-embedded proteins are commonly performed in bilayers with only a single lipid type and neglecting the ion gradient across the membrane bilayer.



Different methods can be used to insert the protein into the bilayer, such as removing overlapping lipids, using repulsive forces to create a space for the protein, and building the bilayer around the protein [156].

3.2.3. Implicit treatment of solvent

For many molecular dynamics simulations, the surrounding solvent is treated implicitly. This reduces computational cost since the most time-consuming calculation in a typical molecular dynamics simulation is the evaluation of forces between atoms that do not share covalent bonds. However, implicit solvent simulation improves the simulation speed at the cost of accuracy. The implicit treatment of solvent in molecular dynamics involves the design of an effective potential function that describes the change in energy and the direct effect of the solvent on the protein through collisions that can be treated by Langevin dynamics [132]. The most common methods of treating electrostatic interactions between the protein and the solvent use the generalised Born surface area (GBSA) model [157]. Implicit treatment of solvent should be used with care in all-atom molecular dynamics simulations because it can cause problems, particularly in the case of the free energy landscape of folding, and can overestimate the stability of the native state [158]. Moreover, the balance between the polar electrostatic and non-polar interactions might not be well preserved for charged systems over a longer time-scale or when large conformational changes are involved [159]. On the other hand, it performs well when combined with advanced sampling techniques in the study of folding, and has found application in the assembly of membrane proteins, as well as in de novo structure prediction and refinement of homology models [160–162].

3.2.4. Specialised techniques in molecular dynamics simulations

Many additional specialised techniques have been developed for particular problems, including multiscale simulations, mixed quantum mechanical/classical simulations (QM/MM), first introduced by Warshel and Levitt in 1976 [163], and Replica-Exchange molecular dynamics, first applied to proteins by Sugita and Okamoto [164]. The QM/MM simulation has been designed for biological processes involving quantum effects which can never be modelled with classical molecular dynamics simulations. The system is subdivided into a core and the surroundings, where the core is treated at the quantum-mechanical level and the surroundings are described with molecular mechanics. The electrostatic interactions between the embedded QM-atoms and the surrounding MM-atoms are incorporated into the QM energy calculation.

There are also attempts to simulate systems using mixed atomistic and coarse-grained molecular dynamics (MM/CG). Some parts of the system, such as peptides or binding sites, are described in full atomistic detail in order to get a more detailed view whereas other parts (lipids and solvent) are described using coarse-grained models [165–168] to reduce the computational cost. Multiscale simulations use the atomistic 'classical' molecular dynamics simulation combined with coarse-grained simulation, but in contrast to MM/CG simulations the MM simulation is not directly incorporated into the system. There are different multiscale protocols: protocols where a coarse-grained simulation is performed and then the sampled structures are converted into an all-atom structure for refinement [169], and protocols where the coarse-grained potential is derived from the all-atom simulation [170]. There are also attempts to design a two-way integration of the all-atom and coarse-grained simulations using a self-learning multiscale (SLMS) method [171].
An advantage of multiscale simulations is that they can proceed over a longer timescale, at reasonable computational cost, while still ultimately providing models at atomic resolution. Replica-Exchange Molecular Dynamics (REMD) simulation is a technique used to avoid trapping of structures in local energy minima. With this approach, a number of molecular dynamics or Monte

Carlo simulations are performed on several computers starting at different temperatures. Then after a certain time the temperatures are exchanged between different trajectories based on the Metropolis criterion. This method can be used for sampling of structures but it is not suitable for studying the kinetics of proteins. REMD has been used with both implicit- [172] and explicit-solvent simulations [173,174] as well as in modified form [175,176].

The increase of computer power over the next decades will overcome some of the problems with molecular dynamics simulations, but better resources will also encourage people to simulate larger systems on a longer time scale. Therefore, simplifications will always be necessary for particular needs. Besides the simplification of systems, more realistic simulations like quantum-mechanical and atomistic simulations will also be applied to more complex systems containing hundreds to millions of atoms in the near future. Recently, graphics processing units (GPUs) have become extremely popular for performing molecular dynamics simulations because of the improvement of speed and the lower cost of facilities needed. GPU-based molecular dynamics simulations can be used as a standard method for the refinement of homology models and other models without the need for large computer clusters.

4. Structure-based virtual (in silico) screening

Traditional screening methods are based on experimental high-throughput screening to identify biologically active compounds. They are extremely laborious and expensive and have often failed to identify potential lead compounds even when the process is highly automated [177]. In the molecular docking approach, a ligand is normally automatically placed into a predetermined binding site of a 3D receptor structure model; hence the term structure-based. The receptor is usually a protein or a nucleic acid and the ligand a small organic-based molecule or peptide.
The optimal position and orientation of the ligand are found with a search algorithm and a scoring function which rank the ‘goodness’ of the solutions. Molecular docking has a wide area of application in computational structural biology, such as preparing ligand/receptor simulations, validating experimental data, evaluation of predicted structural models, and more. However, a most important application for docking has become virtual screening, where a compound library is screened against one or more targets and returns a ranked list of potential lead compounds. The first step for docking is to define the binding site, as this reduces the computational complexities. The easiest way to find the binding site is to use experimental information derived from mutagenesis studies or cross-linking studies, or to consider homologous structures with known binding pockets. That information is not always available, but structural methods have been developed which are useful in finding a binding site in a protein for which 3D structural information is available [178–185]. Besides the structural approaches, sequence-based approaches have been developed for finding the binding site using sequence conservation information [186]. The ConCavity algorithm is a hybrid approach which integrates evolutionary sequence conservation information with structure-based surface pocket predictions [187]. After defining the binding site, the ligand pose can be predicted in this pocket using molecular docking algorithms. With the advance of virtual screening, an important goal for the development of molecular docking algorithms has been the achievement of fast methods that are capable of reproducing experimental data to a level of accuracy that supports the discovery of lead compounds for pharmaceutical interventions. Methods have been developed with different scoring and search algorithms such as DOCK6.4 [188], AUTODOCK4.2 [189,190], GOLD [191,192], GLIDE [193,194], FLEXX [195], and ICM [196]. 
The search algorithm must be able to sample different conformations and orientations of the ligand quickly, whereas the scoring function should be able to discriminate between different ligand–receptor interactions. Rigid-body docking is docking in which the ligand and the receptor are both held rigid, whereas flexible docking takes the flexibility of the ligand and/or the receptor into account. Rigid-body docking methods are fast but they do not consider induced fit, in which ligand and receptor adapt to the binding by influencing the local conformations of each other. Although there is flexibility in a protein, it may be the case that the only parts of the protein that flex during the docking process are some side chains in the binding site. However, many docking algorithms lack the ability to sample the protein binding site appropriately because sufficiently extensive movements of the side chains are not considered during the docking process. Taking into account the flexibility of the ligand and the receptor comes at a much higher computational cost than the rigid-body docking approach.

Water can also play important roles in the protein–ligand interaction by mediating hydrogen bonds between protein and ligand or by being displaced by the ligand. In many approaches, the water molecules can be positioned based on explicit information from protein crystal structures, or the position of potential water molecules in the binding site can be predicted by methods such as GRID [197,198]. Generally, the water molecules in these cases must be placed explicitly into the binding site before the ligands are docked. On the other hand, a few methods have been developed to place water molecules as an automated part of the molecular docking simulation [194,199].

Search algorithms such as Molecular Dynamics, Monte Carlo, Simulated Annealing, Genetic Algorithms, Tabu Search, Incremental Construction and more [200] are all appropriate for molecular docking. The scoring function approaches can be roughly grouped into force-field methods, empirical methods, and knowledge-based methods.
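The interplay between a search algorithm and a scoring function can be illustrated with a deliberately simplified sketch. It assumes a point-like "ligand", a handful of fixed receptor atoms, a 12-6 Lennard-Jones term standing in for a force-field score, and a Metropolis Monte Carlo search; the function names, parameter values and coordinates are all hypothetical and do not correspond to any of the docking programs cited above.

```python
import math
import random


def lj_score(pos, receptor_atoms, r_min=3.5, eps=1.0):
    """Toy force-field-style score: sum of 12-6 Lennard-Jones terms.

    Lower is better. r_min (Angstrom) and eps are illustrative values only.
    """
    total = 0.0
    for atom in receptor_atoms:
        r = max(math.dist(pos, atom), 1e-6)  # avoid division by zero
        ratio = r_min / r
        total += eps * (ratio ** 12 - 2.0 * ratio ** 6)
    return total


def monte_carlo_search(receptor_atoms, start, steps=5000,
                       step_size=0.5, kT=0.6, seed=0):
    """Metropolis Monte Carlo search over the position of a point ligand."""
    rng = random.Random(seed)
    pos = tuple(start)
    score = lj_score(pos, receptor_atoms)
    best_pos, best_score = pos, score
    for _ in range(steps):
        # Propose a small random displacement of the ligand.
        trial = tuple(c + rng.uniform(-step_size, step_size) for c in pos)
        trial_score = lj_score(trial, receptor_atoms)
        delta = trial_score - score
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            pos, score = trial, trial_score
            if score < best_score:
                best_pos, best_score = pos, score
    return best_pos, best_score


# Hypothetical three-atom "binding site" and a ligand starting far away.
receptor = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
pose, score = monte_carlo_search(receptor, start=(10.0, 10.0, 10.0))
print(pose, score)
```

Real docking programs additionally sample ligand orientation and torsions and use far richer scoring terms; the sketch only shows why the search must be fast and the score discriminating.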
Normally, the docking process returns several possible conformations ranked by the chosen scoring system. However, the scoring functions are usually not good enough to discriminate properly between docking poses, which means that the pose with the highest score is rarely the best one. Simplified and fast scoring methods are often available in the docking software packages, and can be substituted by potentially more accurate scoring methods such as Molecular Mechanics Generalized Born (MMGB) or Molecular Mechanics Poisson–Boltzmann (MMPB) binding energies, which are in turn more computationally intensive [201]. In summary, three key challenges remain: accurate scoring and ranking of compounds, inclusion of protein flexibility and ligand-induced fit, and the placement of solvent molecules in the protein–ligand interface.

Virtual screening has attracted interest in the pharmaceutical industry as a productive and cost-effective technology in the search for lead compounds. It attempts to determine a set of compounds with the aid of the docking score, in which potential leads are ranked higher than compounds with lower or no activity. Due to the overall unreliability of the docking score and the large number of false positive docking poses, compounds are often also picked from the sorted list by visual inspection of the top 100–1000 compounds. This practice rests on the assumption that even the best scoring functions available cannot entirely replace the experience of a modeller or chemist. The enrichment factor is a measure of the quality of the virtual screening method, and compares the number of active compounds retrieved in the virtual screening process to the number of active compounds found when compounds are selected randomly. A desirable docking program for virtual screening should generate high enrichment in a short processing time.
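As a minimal sketch of the enrichment factor just described, the function below divides the fraction of actives found in the top-ranked slice of a screened library by the fraction of actives in the whole library, which is the rate expected from random selection. The function name, the `top_fraction` parameter and the example library are illustrative assumptions, not taken from any cited study.

```python
def enrichment_factor(ranked_is_active, top_fraction=0.01):
    """Enrichment factor for a ranked virtual-screening list.

    ranked_is_active: list of booleans, best-scored compound first,
                      True where the compound is a known active.
    top_fraction:     fraction of the ranked list that is inspected.
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty list with at least one active")
    n_top = max(1, round(n_total * top_fraction))
    hits_in_top = sum(ranked_is_active[:n_top])
    # (hit rate in the selected subset) / (hit rate expected at random)
    return (hits_in_top / n_top) / (n_actives / n_total)


# Hypothetical library of 100 compounds with 5 actives, all ranked first.
ranking = [True] * 5 + [False] * 95
print(enrichment_factor(ranking, top_fraction=0.10))
```

An enrichment factor of 1 corresponds to random selection, so values well above 1 in a small top fraction are what a useful screening protocol should deliver.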
There are various studies available comparing protein–ligand docking programs, which concluded that different schemes perform differently on different targets, and that generalising is difficult [202–205]. The particular chemical makeup of the binding site, such as its size, hydrophobicity, and solvent accessibility, has a role to play. One cannot expect to get good results in virtual screening using low accuracy or low resolution structures. Therefore, the choice and preparation of the protein structure are very important. The success of the screen itself then depends on all its components: the database of compounds to be screened, the search algorithm effectiveness, the scoring function, and the computational resources available.

Screening of large compound databases can be time-consuming, and hence the database is often searched with a fast docking approach first and then the top ligands are selected and re-ranked with a slower docking method that, for example, allows for induced fit and has a more accurate scoring function. The database can be prefiltered by applying a drug-likeness or ADMET (absorption/distribution/metabolism/excretion/toxicity) filter, so that only compounds with selected characteristics remain in the database. One of the most common filter approaches is Lipinski's rule of five [206]. Moreover, the database can be reduced by considering redundancy in related molecules, in order to get a diverse dataset that will be faster to search. Databases for virtual screening can be provided exclusively with compounds that can be purchased and for which suppliers are available; e.g., the ZINC database [207].

Other methods can also be used for in silico screening of compound databases, including pharmacophore-based approaches and quantitative structure–activity relationship (QSAR) modelling, which are considered ligand-based methods rather than structure-based. For pharmacophore-based virtual screening [208], the pharmacophore itself is a framework of steric and electronic features necessary for activity at the binding site of the protein target. Pharmacophore models can be derived from the binding pocket or from a set of known active compounds. The models can then be used to search compound databases for leads using computational approaches similar to structure-based methods. QSAR is based on the observation that similar structures are expected to have similar biological activities.
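The rule-of-five prefilter mentioned above can be sketched as follows. The thresholds are the commonly quoted ones (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors); tolerating one violation is a frequent convention and is treated here as an adjustable assumption. The descriptors are assumed to be precomputed by a cheminformatics toolkit, and the class name, function names and example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (assumed to come from another tool)."""
    mol_weight: float   # Da
    logp: float         # octanol/water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d):
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_rule_of_five(d, max_violations=1):
    """True if the compound stays within the allowed number of violations."""
    return lipinski_violations(d) <= max_violations


# Hypothetical library entries, not real compounds.
library = {
    "cmpd_A": Descriptors(350.0, 2.5, 2, 5),
    "cmpd_B": Descriptors(650.0, 6.2, 3, 8),
}
kept = [name for name, d in library.items() if passes_rule_of_five(d)]
print(kept)
```

Applying such a filter before docking shrinks the database, so the slower and more accurate docking stage is spent only on compounds with acceptable properties.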
QSAR correlates the biological activity of known ligands with their molecular features such as physicochemical (charges, logP, molar refractivity, etc.) and topological properties (bond arrangements, connectivity, etc.) [209]. 3D-QSAR additionally considers the 3D structure of the compounds [210], and includes the pseudoreceptor approach. In this approach, a pseudoreceptor is constructed around a single ligand or an ensemble of superposed ligands such that the representation of bioactive ligands is not restricted to atomistic representations [211] but rather to the selected physicochemical and topological properties. Pseudoreceptor modelling, as well as other 3D-QSAR techniques, is able to bridge the gap between pure ligand-based and receptor-based approaches.

5. Selected examples of computational modelling in the lung

5.1. Molecular dynamics simulation of the air/water interface in the lung

5.1.1. The hypophase and lung surfactant

The lung is an essential organ whose primary function is to transport oxygen from the atmosphere into the bloodstream and to release carbon dioxide from the bloodstream into the atmosphere. This gas exchange occurs in the clusters of alveoli, the terminal sacs of the branching luminal network arising from the trachea. The alveolar epithelial surface is covered by a thin aqueous layer (hypophase) which is exposed to air [212]. This lipid-rich film of pulmonary surfactant contains phospholipids, neutral lipids (mainly cholesterol) and specific surfactant proteins [213]. Alveolar walls contain two epithelial cell types: flat, squamous cells in which gas exchange occurs and cells that secrete the surfactant proteins. The lung surfactant monolayer maintains near-zero surface tension inside the alveoli during exhalation but must respread quickly to cover the alveolar surface during inhalation, which requires that its viscosity resides within a specified range. These properties help to reduce the work of breathing, prevent the collapse of the lung at the end of exhalation, and promote appropriate gas exchange at the interface during the compression and expansion cycles of the breathing process [214].

The composition of pulmonary surfactant is complex and includes lipids such as 1,2-dipalmitoylphosphatidylcholine (DPPC), monounsaturated phosphatidylglycerol (PG), unsaturated phosphatidylcholine (PC) and cholesterol, as well as the specific surfactant proteins SP-A, SP-B, SP-C and SP-D [212]. SP-A and SP-D are large, multimeric, hydrophilic glycoproteins which are critical for the regulation of host defence [215]. In contrast, SP-B and SP-C are smaller and extremely hydrophobic, playing key roles in facilitating optimal dynamic behaviour of the lung during respiratory compression–expansion cycling. High resolution structures of the SP-B and SP-C proteins have not been determined. Only low resolution structures of SP-C (PDBID: 1SPF), obtained by FTIR and NMR spectroscopies, and the α-helical peptide of SP-B, SP-B1–25 (PDBID: 1DFW), have been published.

The collapse of a monolayer surface plays an important role in the regulation of surface tension. Although the structure of the monolayer formed upon collapse in the lung can be characterised experimentally, the underlying mechanism of the collapse is still not fully understood. One widely accepted assumption is that DPPC enrichment of the surface film is required to support very low surface tension. However, the proportion of DPPC in the monolayer is not known and the mechanism of its insertion into the layer is not clearly understood. Some studies suggest that SP-B and SP-C could selectively promote insertion of DPPC into the air–liquid interface [216]. Alternatively, the 'squeeze-out' theory [217–219] proposes that to reach near-zero surface tension the monolayer must 'squeeze out' non-DPPC components during compression, leading to DPPC enrichment.
It is also possible that the ability of surfactant to produce very low tension during breathing depends on the formation of a multilayered film at the interface which acts as a surfactant reservoir [220–223]. However, the validation of any one of these models in vivo requires new experimental models. A common feature of almost all lung surfactants and model mixtures is the coexistence of a semi-crystalline solid phase called the liquid-condensed phase and a disordered fluid phase known as the liquid-expanded phase [224]. Successful surfactant therapies have been developed for treating premature infants with respiratory distress syndrome (RDS) [225,226]. However, developing surfactant therapies for other conditions such as acute respiratory distress syndrome and asthma will likely prove difficult without a better understanding of underlying properties, including dynamics, of the hypophase.

5.1.2. Modelling surfactant collapse and lipid–protein interactions

Computer simulations of phospholipid systems can yield new insights into the structure and dynamics of small-scale interactions not detectable in experimental work, particularly in vivo. Atomistic molecular dynamics studies have been used to simulate monolayer collapses for phospholipids [227] and fatty acids [228] and also to study the orientation and interaction of SP-B1–25 in DPPC [229,230] and in palmitic acid (PA) monolayers [231,232]. Coarse-grained molecular dynamics simulations of the hypophase have also been undertaken to further understand the mechanisms of collapse. Collapse of short and long-tail phospholipid (DC14PC and DC29PC) monolayers occurs into the air side of the air–water interface via a bridge transport mechanism [233]. MARTINI coarse-grained modelling of DPPC/1-palmitoyl-2-oleoyl-sn-glycero-3-phosphoglycerol (POPG)/cholesterol/SP-C [234], DPPC [235] and DPPC/POPG [236] monolayers showed that the monolayers collapse into the water subphase by forming a bilayer fold.
The effect of cholesterol concentration in a DPPC/POPG/cholesterol membrane mixture has been simulated using both atomistic and coarse-grained force fields; the monolayer remained as a liquid-expanded phase when the cholesterol concentration was low but adopted a liquid-condensed phase at higher cholesterol concentrations [237].

Coarse-grained simulations of mixtures of DPPC/POPG/PA have been undertaken which include the presence of protein surfactants (SP-B1–25, SP-C, and several SP-B1–25 mutants) [238]. The results suggested that monolayer collapse occurs via two different mechanisms: through the growth of undulations and by nucleation around a defect. Monolayer folding correlated with the average value of the chain order parameter (a measure of the order of lipid tails). Unsaturated phospholipids (POPG) and the surfactant proteins fluidised the monolayer and promoted folding, while PA displayed a charge-dependent condensing effect on DPPC monolayers containing SP-C. Thus, these surfactant proteins do appear to play an important role in viscosity and folding of hypophase lipids.

5.2. Other lung proteins

5.2.1. GPCRs

GPCRs constitute a large family of membrane receptors consisting of seven transmembrane domains connected by extracellular and intracellular loops (Fig. 4). A striking feature of this protein-receptor family is the chemical diversity of the ligands. In addition, more than one third of all current therapeutics target GPCRs, yet they represent only ~1% of the genome. GPCRs are clustered into five different families based on sequence similarities within transmembrane domains: the rhodopsin family, the adhesion family, the frizzled/taste family, the glutamate family and the secretin family [239]. Many GPCRs play a major role in the maintenance of the function of the lung, such as β2-adrenergic receptor, CXC chemokine receptor type 7 (CXCR7), muscarinic acetylcholine receptors (mAChRs) and histamine 4 receptor (H4R). Despite the enormous difficulties generally encountered in crystallising membrane proteins and solving their structures, the last 11 years have seen good progress for GPCRs.
Currently, there are crystal structures for several GPCRs in inactive and/or activated states and this has greatly increased our understanding not just of their structures but also their ligand-binding properties and the conformational changes associated with cell signalling [240–250].

Fig. 4. Schematic figure of the generalised GPCR structure. The model was based on the β2-adrenergic receptor structure with bound carazolol. The 7 helices cross the membrane, and the ligand is displayed by dots. Helix VIII, which is parallel to the membrane, does not exist in all crystal structures. The blue spheres represent the flexible regions ('kinks') in the membrane domains of the GPCR. The kinks were calculated with an in-house algorithm. The residue interactions towards the intracellular end of the helices between helix III and helix VI are the 'ionic lock'.

5.2.1.1. β2-adrenergic receptor. The crystal structure of β2-adrenergic receptor (PDBID: 2RH1) in the inactive form bound to the partial
inverse agonist carazolol was solved in 2007 [240]. This receptor is the most commonly used target for the treatment of asthma symptoms by inhalation of selective β2-agonists combined with corticosteroids [251]. Therefore, there is considerable interest in examining the protein–ligand interactions and understanding the conformational changes that occur during the activation process. Simpson et al. [252] modelled an active form of the β2-adrenergic receptor based on the information derived from the opsin crystal structure (PDBID: 3DQB) which contains some features of the activated state. The opsin-based homology model was built for the intracellular loops and the transmembrane domains while the extracellular domains were modelled using the crystal structure of β2-adrenergic receptor itself (PDBID: 2RH1). Conformational restraints were then included based on experimental results from zinc-binding studies, site-directed spin labelling and other techniques. The model was then subjected to molecular dynamics simulation during which conformational changes in helix 6 as well as smaller changes in helices 5 and 7 were observed. The model has subsequently been validated by manual docking and virtual screening experiments which show it is selective for β2-selective agonists over β1-selective agonists [252]. Thus, the results suggest that an activated state of the β2-adrenergic receptor has been modelled successfully. Moreover, this modelling approach may be appropriate for studying the activated states of other GPCRs. Recently, the crystal structure of the active state of the β2-adrenergic receptor (PDBID: 3P0G) has been published [247]. A rough comparison between the model and the crystal structure shows that the model demonstrates the overall directional changes of the helix movements between the active and inactive states of the protein but underestimates the amount of the movement. 
For example, helix 6 in the crystal structure of the active state of β2-adrenergic receptor moves outward 11.4 Å relative to the inactive structure, whereas in the model the same helix moves only 3–4 Å. However, all changes occurring in the active β2-adrenergic crystal structure are remarkably similar to those observed in opsin, so that models which are based on opsin should be appropriate to model the active state of β2-adrenergic receptor.

Rosenbaum et al. [253] solved the crystal structure of β2-adrenergic receptor covalently bound to a synthetic agonist (PDBID: 3PDS). The crystal structure did not show the conformational changes on the cytoplasmic side associated with G protein binding even though their experimental results showed that the agonist-bound receptor does indeed activate G proteins. Therefore, they performed a 30 μs molecular dynamics simulation on the experimentally determined structure of the agonist-bound receptor to which was also bound an antibody fragment on the receptor's cytoplasmic face (PDBID: 3P0G) [247]. In this complex, the antibody fragment stabilises the agonist-bound receptor in the active conformation, in much the same way that G protein might be expected to. Removal of the antibody fragment in the simulation resulted in the receptor rapidly relaxing back to a structure similar to an agonist-bound inactive conformation.

Dror et al. [254] performed the first MD simulations in which the complete pathway of drug interaction in GPCRs is described, from initial association, through drug entry into the binding pocket, to the adoption of the final bound conformation. Four independent MD simulations with β2-adrenergic receptor including three different antagonists and one agonist were performed. Additionally, a fifth MD simulation was performed with the β1-adrenergic receptor and dihydroalprenolol. In the MD simulations, the drug molecules were placed in the solvent away from the binding site and then spontaneously associated with the GPCR to achieve final poses without incorporating any artificial biasing forces. During the simulation, the orthosteric site was the dominant binding site and the only site to which ligand binding remained stable. All ligands traversed the same well-defined, dominant pathway for which the largest energetic barrier to binding was at a distance of 15 Å from the binding pocket. However, other metastable binding sites could be observed which were previously unknown. The MD simulations were able to reproduce the ligand pose observed in the crystal structure with an RMSD of 0.8 Å between the average simulation pose and the crystallographic pose of alprenolol. These MD simulations are an alternative to standard docking approaches and additionally provide information about the drug interaction pathway.

The ionic-lock is a limited set of specific contacts on the cytoplasmic side between helix 3 and helix 6 in many GPCRs that plays a role in the mechanics of the activation process. Biochemical experiments suggest that the β2-adrenergic receptor does form these contacts [255]. The first crystal structure of β2-adrenergic receptor in the inactive state bound to carazolol (PDBID: 2RH1) did not contain these contacts, raising questions about the conformation of the inactive state and the role of the ionic-lock in receptor activation. However, molecular dynamics simulations were able to simulate the ionic-lock starting with the crystal structure that lacked these contacts [256,257]. This combination of experimentally determined structures and molecular dynamics simulation has provided deep insight into conformational changes associated with ligand binding, G protein coupling, and stabilisation of the active state. In particular, these promising results show that molecular dynamics simulations are able to provide information which cannot always be directly observed in the crystal structures themselves.

5.2.1.2. Muscarinic acetylcholine receptors. Five different muscarinic acetylcholine receptor subtypes (M1–M5) have been identified in mammals [258] that mediate the effects of acetylcholine in the central and peripheral nervous systems as well as in many organs such as the lung, in which they control muscle contraction and bronchoconstriction of smooth muscle [259]. The acetylcholine-induced airway smooth muscle contraction is primarily mediated through activation of the receptor M3 [260].
In addition, dysfunction of neuronal M2 muscarinic receptors increases acetylcholine release and bronchoconstriction [261,262]. Pedretti et al. [263] have built homology models of all five human muscarinic receptors based on the crystallographic structure of bovine rhodopsin. Cytoplasmic loop 3 in the receptor and the large C-terminal domain in M1 were modelled using other closely related proteins. Then, the physiological ligand acetylcholine was docked into the models. The results showed compatibility with mutagenesis studies and the models can be of further use for investigations using molecular dynamics simulations and virtual screening.

Bhattacharjee et al. [264] performed 3D-QSAR studies of 2,2-diphenylpropionates to find novel potent muscarinic antagonists. The geometry of all ligands was optimised using quantum chemical methods and then pharmacophores were built. The pharmacophores were used to screen an in-house compound database. Several potential anti-muscarinic compounds were identified, from which ten were chosen for evaluation in a biological assay. None of the compounds had previously been reported to have anti-muscarinic properties but in the biological assay eight of them showed inhibition of ligand binding in the concentration range of 2–200 nM.

5.2.1.3. C-X-C chemokine receptor type 7. C-X-C chemokine receptor type 7 (CXCR7, RDC1) promotes breast and lung tumour growth [265] and binds C-X-C chemokine ligand 12 (CXCL12, SDF-1) [266], which is also known to bind C-X-C chemokine receptor type 4 (CXCR4). It has been suggested that CXCR7 scavenges CXCL12, which in turn would modulate the activity of the ubiquitously expressed CXCR4 in development and also in tumour formation [267]. De novo modelling with I-TASSER was used to confirm that CXCR7 belonged to the chemokine family of GPCRs [268], in agreement with binding experiments [269]. I-TASSER was used to model the structures of all GPCRs in the human genome.
Similar structures for different GPCRs were clustered and assigned to specific functional classes, with CXCR7 clustering with other chemokine receptors [268]. This study demonstrates the usefulness of de novo prediction methods for assigning GPCRs to specific families and shows that structural models can be more sensitive than sequence-based models.

CXCR7 agonists have been identified by building a homology model of the receptor based on the crystal structure of inactive rhodopsin and refining it by introducing a twist in helix 6 to model the active form of CXCR7 [270]. Virtual screening of a database of compounds was performed by docking the candidate compounds into the binding site. A total of 1000 compounds were selected based on the docking score and their chemical diversity. Out of the 1000 compounds, 392 were available for experimental screening, which resulted in two hits confirmed to have agonist activity.

5.2.1.4. Histamine 4 receptor. The H4 receptor is involved in physiological and pathophysiological processes in the immune system, making it a target in the development of drug candidates that could find application in the treatment of asthma, allergy, chronic pruritus and autoimmune diseases [271]. A rhodopsin-based homology model of the human histamine H4 receptor was built to investigate the binding mode of a series of structurally diverse H4 receptor agonists [272]. Two essential points of interaction in the receptor binding site (Asp94 and Glu182) were revealed by docking of these ligands into the homology model and by mutagenesis studies. The mutagenesis studies showed that Asp94 is essential for agonist binding whereas mutation of Glu182 leads to discrimination between histamine-like ligands (in which the affinity was decreased) and all other ligands (where affinity was unchanged). These results could be explained by the proposed agonist binding model derived from the docking.

Homology models of H4 receptor were developed based on the crystal structure of bovine rhodopsin [29].
The models were then refined by rotation of the side-chain of the critical Asp94 around the Cα–Cβ bond to accommodate histamine more favourably. Enrichment testing showed that the models were able to select H4 ligands from random decoys. The models were then used to investigate 5 million unique compounds by docking. From this, a total of 255 compounds were selected and tested in competitive binding assays, and 16 of these showed more than 20% displacement of [3H]histamine, an overall hit rate of 6.3%.

A molecular dynamics simulation of an H4 receptor homology model in a membrane-embedded environment has been performed with and without bound ligands [273]. Two molecular dynamics simulations with bound ligands were undertaken. The first included histamine and allowed for changes in the receptor structure (e.g. movement of helix 6 outward relative to helices 3 and 7) and changes in the interaction pattern of histamine at the binding site. The results were in agreement with experimental observations. The second simulation, with a bound selective antagonist, showed significant movement of helix 6 in the direction of helix 3. A simulation without ligand showed no movement of helix 6. The authors suggested that the simulation without ligand and that with the bound antagonist resulted in an inactive receptor conformation whereas in the presence of histamine, the model resembled the active state. These results demonstrate that molecular dynamics might be capable of modelling and discriminating between active and inactive states of GPCRs.

A combination of pseudoreceptor modelling and molecular dynamics simulations has been used to investigate the ligand–receptor interaction in the H4 receptor [274]. A homology model of the H4 receptor was built from the β2-adrenergic receptor and the pseudoreceptor was generated based on four different ligands.
The receptor's binding pocket was then identified from a molecular dynamics simulation of the membrane-embedded protein by matching the pseudoreceptor into every receptor conformation provided by the molecular dynamics simulation. Docking of a selective antagonist into this binding pocket showed favourable docking scores, suggesting that pseudoreceptor models can be used to prioritise a particular protein model obtained by molecular dynamics simulations. This might help to improve virtual screening through identifying better receptor models.

Werner et al. [275] performed two 50 ns molecular dynamics simulations to characterise the binding mode of highly related compounds and their impact on receptor activation. A partial agonist was shown to stabilise the active receptor conformation and caused transmembrane helix 6 to move, whereas during the simulation of the inverse agonist no significant movement of helix 6 occurred. The results show that molecular dynamics simulations can be helpful to investigate the conformational changes that occur in the hH4R structure as well as in the interaction patterns, which reveal different binding modes of partial and inverse agonists.

5.2.2. Epidermal growth factor receptor

Epidermal growth factor receptor (EGFR; ErbB-1; HER1) plays a role in several forms of cancer, including lung cancer. EGFR is a major target for cancer treatment because it is over-expressed in more than 60% of lung-cell carcinomas [276]. EGFR can be divided into three functional domains: an extracellular ligand-binding domain, a single transmembrane-spanning domain, and an intracellular kinase domain. The kinase domain is flanked at its N-terminal end by a juxtamembrane (JM) segment, and the C-terminal end contains an extended tail (C-tail). The kinase domain can also be divided into the N lobe, comprising a five-stranded β sheet and the C-helix, and a larger C lobe that is predominantly helical [277] (Fig. 5). The catalytic activity of EGFR is regulated by asymmetric dimerisation which permits one kinase domain to activate the other [278,279]. The extracellular domain contains two homologous domains I and III and cysteine-rich domains II and IV [280].
In the activation process the C-lobe of one kinase domain (the activator) forms close contacts with the other kinase domain (the receiver) to change the conformation of the C-helix in the receiver to an active state [278,279]. EGFR can also form heterodimers with other proteins from the ErbB family such as ErbB2 (HER2), ErbB3 (HER3), or ErbB4 (HER4) [281]. Crystal structures are available for several topologies of the extracellular domain of EGFR and ErbB2–4 and for the kinase domains of EGFR, ErbB3 and ErbB4.

There are two different classes of EGFR inhibitors. One targets the extracellular domain and blocks binding of EGF to the receptor. The other consists of small compounds that block the activity of the intracellular kinase domain by competing with ATP [282]. Mutations in the kinase domain of EGFR commonly arise in human cancer and can result in resistance to an inhibitor by altering the relative binding strengths of inhibitor and ATP. Mutations can also result in constitutive activation of EGFR [276,282].

The binding affinities of three ATP-competitive inhibitors (erlotinib, gefitinib, and AEE788) for wildtype EGFR and three mutants were investigated using molecular dynamics simulations, free-energy calculations (MM–GBSA method) and per-residue footprint analysis [282]. The results showed that free-energy changes calculated in silico for inhibitor–receptor interactions correlated with experimental activities. The fold resistance values (where fold resistance = ratio of activity) were calculated for each of the molecular dynamics simulations. The results were in general agreement with available experimental data (r² = 0.84). The authors suggest that drug resistance more likely involves disruption of favourable interactions for the inhibitors than a change in affinity for ATP, which could have consequences for the development of new drugs.
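The link between fold resistance and the reported correlation can be illustrated with a short calculation. The sketch below is not taken from the cited study [282]. It assumes that fold resistance is the ratio of mutant to wildtype activity (for example, IC50 values), uses the standard thermodynamic relation ΔΔG = RT ln(fold resistance) to place experimental values on an energy scale, and takes r² to be the squared Pearson correlation coefficient. All function names, activities and computed energies are hypothetical placeholders.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def fold_resistance(activity_mutant, activity_wildtype):
    """Fold resistance as a ratio of activities (assumed: mutant / wildtype)."""
    return activity_mutant / activity_wildtype


def ddg_from_fold_resistance(fr, temperature=298.15):
    """Free-energy difference (kcal/mol) from ddG = RT ln(FR)."""
    return R_KCAL * temperature * math.log(fr)


def r_squared(xs, ys):
    """Squared Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical activities in nM; these are NOT values from the cited work.
wildtype_activity = 2.0
mutant_activities = {"mutant_A": 1.0, "mutant_B": 20.0, "mutant_C": 400.0}

# Experimental side: activities -> fold resistance -> ddG
experimental_ddg = [
    ddg_from_fold_resistance(fold_resistance(a, wildtype_activity))
    for a in mutant_activities.values()
]

# Computed side: hypothetical binding free-energy differences (kcal/mol),
# standing in for mutant-minus-wildtype results of a free-energy method.
computed_ddg = [-0.2, 1.0, 3.5]

print("r^2 =", round(r_squared(computed_ddg, experimental_ddg), 3))
```

A fold resistance above 1 gives a positive ΔΔG (weaker inhibition of the mutant) and a value below 1 gives a negative ΔΔG, so computed and experimental values can be compared on the same scale.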
A similar study [283] employed molecular dynamics simulations and free-energy calculations to characterise the binding of gefitinib and AEE788 to EGFR. The interactions between these inhibitors and the EGFR kinase domain were analysed using multiple short (ensemble) simulations and the molecular mechanics/Poisson–Boltzmann surface area (MM/PBSA) method. The method successfully ranked the binding affinities of each inhibitor for multiple EGFR mutants as well as the activities of the inhibitors against the same mutant or wildtype. The r² value between simulated and experimental data was 0.92.

Fig. 5. A model for the EGF receptor. The extracellular domain was built using different crystal structures (PDB IDs: 1IVO, 1NQL and 3NJP). The transmembrane dimer (PDB ID: 2JWA) and the kinase domain (PDB IDs: 3GOP and 2JIU) were then manually modelled onto the extracellular domain. The model shows the overall structure of the EGF receptor and the functional features mentioned in the text. This depiction is based on the figure of Jura et al. [276].

Mustafa et al. [284] studied the activation of EGFR kinases using molecular dynamics simulations followed by principal component analysis. The simulations showed that during the activation process the N-lobe of the EGFR kinase domain opens and closes relative to the C-lobe, switching the C-helix from an inactive ‘out’ to an active ‘in’ conformation and changing the position of the flexible activation loop from the inactive to the active conformation. The simulations also showed that these conformational changes in the kinase domain are closely coupled with conformational changes in the JM segment and the C-tail.

Structural models based on a number of experimentally solved structures of the EGFR extracellular domain were investigated using molecular dynamics simulations focusing on the orientation, conformational plasticity and oligomerisation of the receptor [285]. The results showed that the EGFR dimer can align itself almost flat on the cell membrane, in contrast to the traditional view that EGFR dimers have an extracellular domain standing perpendicular to the membrane. These simulation results are consistent with the results from recent fluorescence resonance energy transfer (FRET) [286] and EM experiments [287].

In a study by Zhang and Wriggers [288], multiple molecular dynamics simulations of the extracellular domain of the EGFR dimer binding TGFα or EGF were performed in both solvated and unsolvated environments in a system comprising around half a million particles, using a parallel implementation of the GROMACS package with the GROMOS-96 force field. The simulations in solvent resulted in significant structural relaxation and more extensive contacts between the two EGFR monomers, whereas in the unsolvated simulations not all of these monomer–monomer contacts were formed. Moreover, the simulations suggested that the disordered domain IV in the extracellular domain of EGFR may act as a stabilised spacer in the dimer.

5.2.3. Cystic fibrosis transmembrane conductance regulator

In ~90% of cystic fibrosis (CF) patients, the cystic fibrosis transmembrane conductance regulator (CFTR) has a deletion of Phe508,
which lies in the first nucleotide-binding domain (NBD1) [289]. No significant differences in thermodynamic stability or structure have been observed in experimental studies between wildtype and the Phe508-deletion mutant [290]. However, experimental studies have shown differences in folding yields, suggesting that the Phe508 deletion alters the folding kinetics and dynamics of NBD1 [291–293]. Computational tools such as molecular dynamics simulations are able to simulate the folding pathway of proteins. Multiple folding simulations of NBD1 have shown a higher folding efficiency for wildtype than for the mutants, in agreement with experimental results [294]. Many single-domain proteins are able to self-chaperone their folding so that misfolded parts can be corrected. A reduction in the folding time can reduce the capacity of a protein to correct malformed contacts in the intermediate states. In the molecular dynamics simulations, the mutants folded faster than the wildtype, which suggests that the mutant NBD1 has diminished self-chaperoning activity, resulting in a lower overall folding efficiency. The simulations showed different intermediate folded states for mutants and wildtype during the folding process, some of which were accessible only to either the wildtype or the Phe508-deletion mutant. Moreover, critical interactions were observed that were specific to either mutants or wildtype, which could help in the design of small molecules to correct the folding pathway of NBD1.
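The argument above combines two quantities that are easily confused: how often a folding simulation succeeds and how quickly the successful runs fold. The following minimal sketch separates them. It assumes, purely for illustration, that folding efficiency is the fraction of independent runs that reach a native-like state within the simulated time; the function name and the run data are hypothetical and are not results from the cited work [294].

```python
from statistics import mean


def folding_summary(runs):
    """Summarise independent folding runs.

    runs: list of (reached_native, time_ns) tuples, where time_ns is the
    folding time for successful runs or the run length for failed ones.
    Returns (efficiency, mean folding time of the successful runs).
    """
    folded_times = [time_ns for reached, time_ns in runs if reached]
    efficiency = len(folded_times) / len(runs)
    mean_time = mean(folded_times) if folded_times else None
    return efficiency, mean_time


# Hypothetical outcomes of four runs per variant (illustrative values only).
wildtype_runs = [(True, 40.0), (True, 50.0), (True, 60.0), (False, 100.0)]
mutant_runs = [(True, 20.0), (False, 100.0), (False, 100.0), (True, 30.0)]

for label, runs in (("wildtype", wildtype_runs), ("mutant", mutant_runs)):
    efficiency, mean_time = folding_summary(runs)
    print(label, "efficiency:", efficiency, "mean folding time (ns):", mean_time)
```

In this toy data set the mutant folds faster on average when it folds at all, yet fewer of its runs reach the native-like state, which is the pattern the self-chaperoning interpretation describes.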

5.2.4. Antigen α4β1 (VLA-4, very late antigen-4)

Antigen α4β1 (VLA-4, very late antigen-4) is a member of the integrin family. α4β1 is involved in the migration of mononuclear leukocytes to sites of inflammation, and α4-dependent adhesion pathways are critical intervention points in several inflammatory and autoimmune pathologies including asthma, multiple sclerosis, and rheumatoid arthritis [295]. Singh et al. [296] defined a 3D pharmacophore model for the α4β1 ligand to search a chemical database for structures that satisfy the spatial and chemical constraints of the model. The compounds found were then synthesised and the most potent was then tested in an allergic sheep model of asthma. This compound was able to inhibit early and late airway responses, preventing the development of nonspecific airway hyperresponsiveness to carbachol.
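The idea of searching a database for structures that satisfy the spatial and chemical constraints of a pharmacophore can be sketched in a few lines. The example below is a deliberately simplified illustration and not the procedure of Singh et al. [296]: it assumes a hypothetical three-point pharmacophore defined only by feature types and pairwise distances, treats each compound as a single rigid conformer, and uses invented feature names, distances, tolerance and coordinates.

```python
import itertools
import math

# Hypothetical pharmacophore: required feature types and the target
# distances (in angstroms) between pairs of them, indexed by position.
PHARMACOPHORE_TYPES = ("acid", "hydrophobe", "donor")
PHARMACOPHORE_DISTANCES = {(0, 1): 6.0, (0, 2): 9.0, (1, 2): 5.0}
TOLERANCE = 1.0  # allowed deviation per distance, in angstroms


def matches_pharmacophore(features, types=PHARMACOPHORE_TYPES,
                          distances=PHARMACOPHORE_DISTANCES, tol=TOLERANCE):
    """Return True if some assignment of features satisfies all constraints.

    features: list of (feature_type, (x, y, z)) for one rigid conformer.
    """
    # Chemical constraint: candidate feature indices for each required type.
    candidates = [
        [k for k, (ftype, _) in enumerate(features) if ftype == wanted]
        for wanted in types
    ]
    for combo in itertools.product(*candidates):
        if len(set(combo)) < len(combo):
            continue  # the same feature cannot fill two pharmacophore points
        coords = [features[k][1] for k in combo]
        # Spatial constraint: every pairwise distance within tolerance.
        if all(abs(math.dist(coords[i], coords[j]) - target) <= tol
               for (i, j), target in distances.items()):
            return True
    return False


def screen(database):
    """Return the names of compounds that match the pharmacophore."""
    return [name for name, features in database.items()
            if matches_pharmacophore(features)]


# Hypothetical mini-database of single conformers.
DATABASE = {
    "compound_1": [("acid", (0.0, 0.0, 0.0)), ("hydrophobe", (6.0, 0.0, 0.0)),
                   ("donor", (7.67, 4.71, 0.0))],
    "compound_2": [("acid", (0.0, 0.0, 0.0)), ("hydrophobe", (6.0, 0.0, 0.0)),
                   ("donor", (20.0, 0.0, 0.0))],
    "compound_3": [("acid", (0.0, 0.0, 0.0)), ("donor", (9.0, 0.0, 0.0))],
}

print(screen(DATABASE))
```

A practical search would also enumerate conformers and usually include directional or excluded-volume constraints; the distance-only test is used here only to keep the example short.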

6. Conclusion

The application of computational modelling tools to proteins has become well established in the last few decades and has benefited greatly from increased computing power. 3D modelling can provide useful information not just on proteins whose 3D structures remain unsolved but also on high-resolution conformational dynamics during folding, ligand binding and activation, which can be very difficult to obtain using current experimental technologies. As the gap between the number of protein structures solved and the number of protein sequences available expands exponentially, increasing reliance is being placed on high-powered in silico approaches to provide relevant information on structure, function, and molecular interactions across all fields of biological and medical sciences as well as the biotechnology and pharmaceutical industries. Furthermore, strong synergies have developed between experimentally based structure determination and computer-based modelling approaches, as evidenced in this review by, for example, research undertaken on medically relevant lung proteins and the lipid-rich hypophase. In particular, computational modelling methods and bioinformatics approaches can detect information ‘hidden’ in the increasing volume of data provided by experimentalists, and this can be used to guide further experimental work.

Acknowledgements

We thank Emma Rath and Catherine Vacher for proofreading the article. TW acknowledges the receipt of a University of Sydney International Scholarship.

References

[1] M. Levitt, Growth of novel protein structural data, Proc. Natl. Acad. Sci. U.S.A. 104 (2007) 3183–3188. [2] R.M. Bill, P.J.F. Henderson, S. Iwata, E.R.S. Kunji, H. Michel, R. Neutze, S. Newstead, B. Poolman, C.G. Tate, H. Vogel, Overcoming barriers to membrane protein structure determination, Nat. Biotechnol. 29 (2011) 335–340. [3] M. Baker, Making membrane proteins for structures: a trillion tiny tweaks, Nat. Methods 7 (2010) 429–434. [4] M.A. Martí-Renom, A.C. Stuart, A. Fiser, R. Sánchez, F. Melo, A. Sali, Comparative protein structure modeling of genes and genomes, Annu. Rev. Biophys. Biomol. Struct. 29 (2000) 291–325. [5] K. Illergård, D.H. Ardell, A. Elofsson, Structure is three to ten times more conserved than sequence: a study of structural response in protein cores, Proteins 77 (2009) 499–508. [6] O. Carugo, S. Pongor, Protein fold similarity estimated by a probabilistic approach based on Cα–Cα distance comparison, J. Mol. Biol. 315 (2002) 887–898. [7] S. Govindarajan, R. Recabarren, R.A. Goldstein, Estimating the total number of protein folds, Proteins 35 (1999) 408–414. [8] Y.I. Wolf, N.V. Grishin, E.V. Koonin, Estimating the number of protein folds and families from complete genome data, J. Mol. Biol. 299 (2000) 897–905. [9] C.T. Zhang, Relations of the numbers of protein sequences, families and folds, Protein Eng. 10 (1997) 757–761. [10] C. Zhang, C. DeLisi, Estimating the number of protein folds, J. Mol. Biol. 284 (1998) 1301–1305. [11] X. Liu, K. Fan, W. Wang, The number of protein folds and their distribution over families in nature, Proteins 54 (2004) 491–499. [12] A.G. Murzin, S.E. Brenner, T. Hubbard, C. Chothia, SCOP: a structural classification of proteins database for the investigation of sequences and structures, J. Mol. Biol. 247 (1995) 536–540. [13] A. 
Andreeva, D. Howorth, J.M. Chandonia, S.E. Brenner, T.J.P. Hubbard, C. Chothia, G.A. Murzin, Data growth and its impact on the SCOP database: new developments, Nucleic Acids Res. 36 (2008) D419–D425. [14] D. Cozzetto, A. Kryshtafovych, K. Fidelis, J. Moult, B. Rost, A. Tramontano, Evaluation of template-based models in CASP8 with standard measures, Proteins 77 (Suppl 9) (2009) 18–28. [15] J. Moult, K. Fidelis, A. Zemla, Processing and evaluation of predictions in CASP4, Proteins 21 (2002) 13–21. [16] D. Vitkup, E. Melamud, J. Moult, C. Sander, Completeness in structural genomics, Nat. Struct. Biol. 8 (2001) 559–566. [17] M. Neumüller, F. Jähnig, Modeling of halorhodopsin and rhodopsin based on bacteriorhodopsin, Proteins 26 (1996) 146–156. [18] P.R. Daga, R.Y. Patel, R.J. Doerksen, Template-based protein modeling: recent methodological advances, Curr. Top. Med. Chem. 10 (2010) 84–94. [19] J. Espadaler, N. Fernandez-Fuentes, A. Hermoso, E. Querol, F.X. Aviles, M.J.E. Sternberg, B. Oliva, ArchDB: automated protein loop classification as a tool for structural genomics, Nucleic Acids Res. 32 (2004) D185–D188. [20] I.W. Davis, A. Leaver-Fay, V.B. Chen, J.N. Block, G.J. Kapral, X. Wang, et al., MolProbity: all-atom contacts and structure validation for proteins and nucleic acids, Nucleic Acids Res. 35 (2007) W375–W383. [21] R.A. Laskowski, M.W. MacArthur, D.S. Moss, J.M. Thornton, PROCHECK: a program to check the stereochemical quality of protein structures, J. Appl. Crystallogr. 26 (1993) 283–291. [22] R.W. Hooft, G. Vriend, C. Sander, E.E. Abola, Errors in protein structures, Nature 381 (1996) 272. [23] L.R. Forrest, C.L. Tang, B. Honig, On the accuracy of homology modeling and sequence alignment methods applied to membrane proteins, Biophys. J. 91 (2006) 508–517. [24] M. Brylinski, J. Skolnick, Comprehensive structural and functional characterization of the human kinome by protein structure modeling and ligand virtual screening, J. Chem. Inf. Model. 
50 (2010) 1839–1854. [25] R. Kiss, T. Polgár, A. Kirabo, J. Sayyah, N.C. Figueroa, A.F. List, L. Sokol, K.S. Zuckerman, M. Gali, K.S. Bisht, P.P. Sayeski, G.M. Keseru, Identification of a novel inhibitor of JAK2 tyrosine kinase by structure-based virtual screening, Bioorg. Med. Chem. Lett. 19 (2009) 3598–3601. [26] T.L. Nguyen, R. Gussio, J.A. Smith, D.A. Lannigan, S.M. Hecht, D.A. Scudiero, R.H. Shoemaker, D.W. Zaharevitz, Homology model of RSK2 N-terminal kinase domain, structure-based identification of novel RSK2 inhibitors, and preliminary common pharmacophore, Bioorg. Med. Chem. 14 (2006) 6097–6105. [27] C.N. Cavasotto, A.J.W. Orry, N.J. Murgolo, M.F. Czarniecki, S.A. Kocsi, B.E. Hawes, K.A. O'Neill, H. Hine, M.S. Burton, J.H. Voigt, R.A. Abagyan, M.L. Bayne, F.J. Monsma, Discovery of novel chemotypes to a G-protein-coupled receptor through ligand-steered homology modeling and structure-based virtual screening, J. Med. Chem. 51 (2008) 581–588. [28] S. Engel, A.P. Skoumbourdis, J. Childress, S. Neumann, J.R. Deschamps, C.J. Thomas, A.O. Colson, S. Constanzi, M.C. Gershengorn, A virtual screen for diverse ligands: discovery of selective G protein-coupled receptor antagonists, J. Am. Chem. Soc. 130 (2008) 5115–5123.

[29] R. Kiss, B. Kiss, Á. Könczöl, F. Szalai, I. Jelinek, V. László, B. Noszál, A. Falus, G.M. Keseru, Discovery of novel human histamine H4 receptor ligands by large-scale structure-based virtual screening, J. Med. Chem. 51 (2008) 3145–3153. [30] S. Radestock, T. Weil, S. Renner, Homology model-based virtual screening for GPCR ligands using docking and target-biased scoring, J. Chem. Inf. Model. 48 (2008) 1104–1117. [31] B.K. Rai, G.J. Tawa, A.H. Katz, C. Humblet, Modeling G protein-coupled receptors for structure-based drug discovery using low-frequency normal modes for refinement of homology models: application to H3 antagonists, Proteins 78 (2010) 457–473. [32] N. Renault, A. Gohier, P. Chavatte, A. Farce, Novel structural insights for drug design of selective 5-HT(2C) inverse agonists from a ligand-biased receptor model, Eur. J. Med. Chem. 45 (2010) 5086–5099. [33] A.L. Parrill, U. Echols, T. Nguyen, T.C.T. Pham, A. Hoeglund, D.L. Baker, Virtual screening approaches for the identification of non-lipid autotaxin inhibitors, Bioorg. Med. Chem. 16 (2008) 1784–1795. [34] P. Mukherjee, P.V. Desai, A. Srivastava, B.L. Tekwani, M.A. Avery, Probing the structures of leishmanial farnesyl pyrophosphate synthases: homology modeling and docking studies, J. Chem. Inf. Model. 48 (2008) 1026–1040. [35] G. Anupriya, K. Roopa, S. Basappa, Y.S. Chong, L. Annamalai, Homology modeling and in silico screening of inhibitors for the substrate binding domain of human Siah2: implications for hypoxia-induced cancers, J. Mol. Model. 17 (2011) 3325–3332. [36] C. Kalyanaraman, H.J. Imker, A.A. Fedorov, E.V. Fedorov, M.E. Glasner, P.C. Babbitt, S.C. Almo, J.A. Gerlt, M.P. Jacobson, Discovery of a dipeptide epimerase enzymatic function guided by homology modeling and virtual screening, Structure 16 (2008) 1668–1677. [37] D. Lipman, W. Pearson, Rapid and sensitive protein similarity searches, Science 227 (1985) 1435–1441. 
[38] W.R. Pearson, D.J. Lipman, Improved tools for biological sequence comparison, Proc. Natl. Acad. Sci. U.S.A. 85 (1988) 2444–2448. [39] S.F. Altschul, T.L. Madden, A.A. Schäffer, J. Zhang, Z. Zhang, W. Miller, D. Lipman, Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25 (1997) 3389–3402. [40] J. Söding, A. Biegert, A.N. Lupas, The HHpred interactive server for protein homology detection and structure prediction, Nucleic Acids Res. 33 (2005) W244–W248. [41] L.A. Kelley, M.J.E. Sternberg, Protein structure prediction on the Web: a case study using the Phyre server, Nat. Protoc. 4 (2009) 363–371. [42] T. Ohlson, B. Wallner, A. Elofsson, Profile–profile methods provide improved fold-recognition: a study of different profile–profile alignment methods, Proteins 57 (2004) 188–197. [43] G. Wang, R.L. Dunbrack, Scoring profile-to-profile sequence alignments, Protein Sci. 13 (2004) 1612–1626. [44] J. Söding, Protein homology detection by HMM–HMM comparison, Bioinformatics 21 (2005) 951–960. [45] B. Rost, Twilight zone of protein sequence alignments, Protein Eng. 12 (1999) 85–94. [46] R. Doolittle, Of Urfs and Orfs: A Primer on How to Analyze Derived Amino Acid Sequences, 1st ed University Science Books, 1986. [47] N. Fernandez-Fuentes, B.K. Rai, C.J. Madrid-Aliste, J.E. Fajardo, A. Fiser, Comparative protein structure modeling by combining multiple templates and optimizing sequence-to-structure alignments, Bioinformatics 23 (2007) 2558–2565. [48] R.N. Wallner, E. Lindahl, A. Elofsson, Using multiple templates to improve quality of homology models in automated homology modeling, Protein Sci. (2008) 990–1002. [49] J.A.R. Dalton, R.M. Jackson, An evaluation of automated homology modelling methods at low target template sequence similarity, Bioinformatics 23 (2007) 1901–1908. [50] M.A. Larkin, G. Blackshields, N.P. Brown, R. Chenna, P.A. McGettigan, H. McWilliam, F. Valentin, I.M. Wallace, A. Wilm, R. Lopez, J.D. 
Thompson, T.J. Gibson, D.G. Higgins, Clustal W and Clustal X version 2.0, Bioinformatics 23 (2007) 2947–2948. [51] C. Notredame, D.G. Higgins, J. Heringa, T-Coffee: a novel method for fast and accurate multiple sequence alignment, J. Mol. Biol. 302 (2000) 205–217. [52] O. O'Sullivan, K. Suhre, C. Abergel, D.G. Higgins, C. Notredame, 3DCoffee: combining protein sequences and structures within multiple sequence alignments, J. Mol. Biol. 340 (2004) 385–395. [53] R.C. Edgar, MUSCLE: multiple sequence alignment with high accuracy and high throughput, Nucleic Acids Res. 32 (2004) 1792–1797. [54] J.N.D. Battey, J. Kopp, L. Bordoli, R.J. Read, N.D. Clarke, T. Schwede, Automated server predictions in CASP7, Proteins 69 (Suppl 8) (2007) 68–82. [55] Y. Zhang, I-TASSER: fully automated protein structure prediction in CASP8, Proteins 77 (Suppl 9) (2009) 100–113. [56] Z. Xiang, Advances in homology protein structure modeling, Curr. Protein Pept. Sci. 7 (2006) 217–227. [57] P.A. Bates, L.A. Kelley, R.M. MacCallum, M.J.E. Sternberg, Enhancement of protein modeling by human intervention in applying the automatic programs 3D-JIGSAW and 3D-PSSM, Proteins Suppl 5 (2002) 39–46. [58] P. Koehl, M. Delarue, A self consistent mean field approach to simultaneous gap closure and side-chain positioning in homology modelling, Nat. Struct. Biol. 2 (1995) 163–170. [59] K. Arnold, L. Bordoli, J. Kopp, T. Schwede, The SWISS-MODEL workspace: a web-based environment for protein structure homology modelling, Bioinformatics 22 (2006) 195–201. [60] M. Levitt, Accurate modeling of protein conformation by automatic segment matching, J. Mol. Biol. 226 (1992) 507–533.

[61] N. Eswar, B. Webb, M.A. Marti-Renom, M.S. Madhusudhan, D. Eramian, M.-Y. Shen, et al., Comparative protein structure modeling using MODELLER, Curr. Protoc. Bioinformatics (Suppl. 15) (2006) 5.6.1–5.6.30. [62] A. Sali, T. Blundell, Comparative protein modelling by satisfaction of spatial restraints, J. Mol. Biol. 234 (1993) 779–815. [63] D. Petrey, Z. Xiang, C.L. Tang, L. Xie, M. Gimpelev, T. Mitros, C.S. Soto, S. Goldsmith-Fischman, A. Kernytsky, A. Schlessinger, I.Y.Y. Koh, E. Alexov, B. Honig, Using multiple structure alignments, fast model building, and energetic analysis in fold recognition and homology modeling, Proteins 53 (Suppl 6) (2003) 430–435. [64] B. Wallner, A. Elofsson, All are not equal: a benchmark of different homology modeling programs, Protein Sci. 14 (2005) 1315–1327. [65] J.L. MacCallum, L. Hua, M.J. Schnieders, V.S. Pande, M.P. Jacobson, K.A. Dill, Assessment of the protein-structure refinement category in CASP8, Proteins 77 (Suppl 9) (2009) 66–80. [66] G.G. Krivov, M.V. Shapovalov, R.L. Dunbrack, Improved prediction of protein side-chain conformations with SCWRL4, Proteins 77 (2009) 778–795. [67] N. Fernandez-Fuentes, J. Zhai, A. Fiser, ArchPRED: a template based loop structure prediction server, Nucleic Acids Res. 34 (2006) W173–W176. [68] C.M. Deane, T.L. Blundell, CODA: a combined algorithm for predicting the structurally variable regions of protein models, Protein Sci. 10 (2001) 599–612. [69] C.A. Rohl, C.E. Strauss, K.M. Misura, D. Baker, Protein structure prediction using Rosetta, Meth. Enzymol. 383 (2004) 66–93. [70] M.J. Sippl, Recognition of errors in three-dimensional structures of proteins, Proteins 17 (1993) 355–362. [71] M. Wiederstein, M.J. Sippl, ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins, Nucleic Acids Res. 35 (2007) W407–W410. [72] R. Lüthy, J.U. Bowie, D. Eisenberg, Assessment of protein models with three-dimensional profiles, Nature 356 (1992) 83–85. 
[73] J.U. Bowie, R. Lüthy, D. Eisenberg, A method to identify protein sequences that fold into a known three-dimensional structure, Science 253 (1991) 164–170. [74] C. Chothia, A.M. Lesk, The relation between the divergence of sequence and structure in proteins, EMBO J. 5 (1986) 823–826. [75] A. Roy, A. Kucukural, Y. Zhang, I-TASSER: a unified platform for automated protein structure and function prediction, Nat. Protoc. 5 (2010) 725–738. [76] S. Wu, Y. Zhang, LOMETS: a local meta-threading-server for protein structure prediction, Nucleic Acids Res. 35 (2007) 3375–3382. [77] J. Meller, R. Elber, Protein recognition by sequence-to-structure fitness: bridging efficiency and capacity of threading models, in: R.A. Friesner, I. Prigogine, S.A. Rice (Eds.), Computational Methods for Protein Folding, John Wiley & Sons, Inc., New York, 2002, pp. 77–130. [78] J. Xu, F. Jiao, L. Yu, Protein structure prediction using threading, Methods Mol. Biol. 413 (2008) 91–121. [79] D.T. Jones, W.R. Taylor, J.M. Thornton, A new approach to protein fold recognition, Nature 358 (1992) 86–89. [80] R.T. Miller, D.T. Jones, J.M. Thornton, Protein fold recognition by sequence threading: tools and assessment techniques, FASEB J. 10 (1996) 171–178. [81] J. Xu, M. Li, D. Kim, Y. Xu, RAPTOR: optimal protein threading by linear programming, J. Bioinform. Comput. Biol. 1 (2003) 95–117. [82] J.D. Bryngelson, P.G. Wolynes, Intermediates and barrier crossing in a random energy model (with applications to protein folding), J. Phys. Chem. 93 (1989) 6902–6915. [83] K.A. Dill, H.S. Chan, From Levinthal to pathways to funnels, Nat. Struct. Biol. 4 (1997) 10–19. [84] A. Liwo, J. Lee, D.R. Ripoll, J. Pillardy, H.A. Scheraga, Protein structure prediction by global optimization of a potential energy function, Proc. Natl. Acad. Sci. U.S.A. 96 (1999) 5482–5485. [85] S. Ołdziej, C. Czaplewski, A. Liwo, M. Chinchio, M. Nanias, J.A. Vila, M. Khalili, Y.A. Arnautova, A. Jagielska, M. Makowski, H.D. Schafroth, R. 
Kaźmierkiewicz, D.R. Ripoll, J. Pillardy, J.A. Saunders, Y.K. Kang, K.D. Gibson, H.A. Scheraga, Physics-based protein-structure prediction using a hierarchical protocol based on the UNRES force field: assessment in two blind tests, Proc. Natl. Acad. Sci. U.S.A. 102 (2005) 7547–7552. [86] J.L. Klepeis, Y. Wei, M.H. Hecht, C.A. Floudas, Ab initio prediction of the three-dimensional structure of a de novo designed protein: a double-blind case study, Proteins 58 (2005) 560–570. [87] J.L. Klepeis, C.A. Floudas, ASTRO-FOLD: a combinatorial and global optimization framework for ab initio prediction of three-dimensional structures of proteins from the amino acid sequence, Biophys. J. 85 (2003) 2119–2146. [88] S. Wu, J. Skolnick, Y. Zhang, Ab initio modeling of small proteins by iterative TASSER simulations, BMC Biol. 5 (2007). [89] Y. Zhang, I-TASSER server for protein 3D structure prediction, BMC Bioinformatics 9 (2008) 40. [90] R. Das, D. Baker, Macromolecular modeling with Rosetta, Annu. Rev. Biochem. 77 (2008) 363–382. [91] K.T. Simons, I. Ruczinski, C. Kooperberg, B.A. Fox, C. Bystroff, D. Baker, Improved recognition of native-like protein structures using a combination of sequence-dependent and sequence-independent features of proteins, Proteins 34 (1999) 82–95. [92] Y. Zhang, A. Kolinski, J. Skolnick, TOUCHSTONE II: a new approach to ab initio protein structure prediction, Biophys. J. 85 (2003) 1145–1164. [93] Y. Zhang, D. Kihara, J. Skolnick, Local energy landscape flattening: parallel hyperbolic Monte Carlo sampling of protein folding, Proteins 48 (2002) 192–201. [94] S. Wu, Y. Zhang, A comprehensive assessment of sequence-based and template-based methods for protein contact prediction, Bioinformatics 24 (2008) 924–931.

[95] Y. Zhang, J. Skolnick, SPICKER: a clustering approach to identify near-native protein folds, J. Comput. Chem. 25 (2004) 865–871. [96] Y. Zhang, J. Skolnick, TM-align: a protein structure alignment algorithm based on the TM-score, Nucleic Acids Res. 33 (2005) 2302–2309. [97] Y. Li, Y. Zhang, REMO: a new protocol to refine full atomic protein models from C-alpha traces by optimizing hydrogen-bonding networks, Proteins 76 (2009) 665–676. [98] Z. Wang, J. Eickholt, J. Cheng, MULTICOM: a multi-level combination approach to protein structure prediction and its assessments in CASP8, Bioinformatics 26 (2010) 882–888. [99] J. Zhang, Q. Wang, B. Barz, Z. He, I. Kosztin, Y. Shang, D. Xu, MUFOLD: a new solution for protein 3D structure prediction, Proteins 78 (2010) 1137–1152. [100] Y. Zhang, Protein structure prediction: when is it useful? Curr. Opin. Struct. Biol. 19 (2009) 145–155. [101] L. Malmström, M. Riffle, C.E.M. Strauss, D. Chivian, T.N. Davis, R. Bonneau, D. Baker, Superfamily assignments for the yeast proteome through integration of structure prediction with the gene ontology, PLoS Biol. 5 (2007) e76. [102] M. Ashburner, C.A. Ball, J.A. Blake, D. Botstein, H. Butler, J.M. Cherry, A.P. Davis, K. Dolinski, S.S. Dwight, J.T. Eppig, M.A. Harris, D.P. Hill, L. Issel-Tarver, A. Kasarskis, S. Lewis, J.C. Matese, J.E. Richardson, M. Ringwald, G.M. Rubin, G. Sherlock, Gene ontology: tool for the unification of biology. The Gene Ontology Consortium, Nat. Genet. 25 (2000) 25–29. [103] M. Ben-David, O. Noivirt-Brik, A. Paz, J. Prilusky, J.L. Sussman, Y. Levy, Assessment of CASP8 structure predictions for template free targets, Proteins 77 (Suppl 9) (2009) 50–65. [104] J.W. Ponder, D.A. Case, Force fields for protein simulations, Adv. Protein Chem. 66 (2003) 27–85. [105] B.R. Brooks, C.L. Brooks, A.D. Mackerell, L. Nilsson, R.J. Petrella, B. Roux, Y. Won, G. Archontis, C. Bartels, S. Boresch, A. Caflisch, L. Caves, Q. Cui, A.R. Dinner, M. Feig, S. Fischer, J. Gao, M. 
Hodoscek, W. Im, K. Kuczera, T. Lazaridis, J. Ma, V. Ovchinnikov, E. Paci, R.W. Pastor, C.B. Post, J.Z. Pu, M. Schaeffer, B. Tidor, R.M. Venable, H.L. Woodcock, X. Wu, W. Yang, D.M. York, M. Karplus, CHARMM: the biomolecular simulation program, J. Comput. Chem. 30 (2009) 1545–1614. [106] B.R. Brooks, R.E. Bruccoleri, D.J. Olafson, D.J. States, S. Swaminathan, M. Karplus, CHARMM: a program for macromolecular energy, minimization, and dynamics calculations, J. Comput. Chem. 4 (1983) 187–217. [107] A.D. MacKerell Jr., C.L. Brooks III, L. Nilsson, B. Roux, Y. Won, M. Karplus, CHARMM: The Energy Function and Its Parameterization with an Overview of the Program, in: P.v.R. Schleyer, et al., (Eds.), The Encyclopedia of Computational Chemistry, John Wiley & Sons, Chichester, Vol. 1, 1998, pp. 271–277. [108] D.A. Case, T.E. Cheatham, T. Darden, H. Gohlke, R. Luo, K.M. Merz, A. Onufriev, C. Simmerling, B. Wang, R.J. Woods, The Amber biomolecular simulation programs, J. Comput. Chem. 26 (2005) 1668–1688. [109] W.L. Jorgensen, D.S. Maxwell, J. Tirado-Rives, Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids, J. Am. Chem. Soc. 118 (1996) 11225–11236. [110] G.A. Kaminski, R.A. Friesner, J. Tirado-Rives, W.L. Jorgensen, Evaluation and reparametrization of the OPLS-AA force field for proteins via comparison with accurate quantum chemical calculations on peptides, J. Phys. Chem. B 105 (2001) 6474–6487. [111] M. Christen, P.H. Hünenberger, D. Bakowies, R. Baron, R. Bürgi, D.P. Geerke, et al., The GROMOS software for biomolecular simulation: GROMOS05, J. Comput. Chem. 26 (2005) 1719–1751. [112] O. Guvench, A.D. MacKerell, Comparison of protein force fields for molecular dynamics simulations, Methods Mol. Biol. 443 (2008) 63–88. [113] B.J. Alder, T.E. Wainwright, Phase transition for a hard sphere system, J. Chem. Phys. 27 (1957) 1208–1209. [114] B.J. Alder, T.E. Wainwright, Studies in molecular dynamics. I. 
General method, J. Chem. Phys. 31 (1959) 459–466. [115] J.C. Phillips, R. Braun, W. Wang, J. Gumbart, E. Tajkhorshid, E. Villa, C. Chipot, R.D. Skeel, L. Kalé, K. Schulten, Scalable molecular dynamics with NAMD, J. Comput. Chem. 26 (2005) 1781–1802. [116] B. Hess, C. Kutzner, D. van der Spoel, E. Lindahl, GROMACS 4: algorithms for highly efficient, load-balanced, and scalable molecular simulation, J. Chem. Theory. Comput. 4 (2008) 435–447. [117] D. van der Spoel, E. Lindahl, B. Hess, G. Groenhof, A.E. Mark, H.J.C. Berendsen, GROMACS: fast, flexible, and free, J. Comput. Chem. 26 (2005) 1701–1718. [118] J.A. McCammon, B.R. Gelin, M. Karplus, Dynamics of folded proteins, Nature 267 (1977) 585–590. [119] T.B. Woolf, B. Roux, Molecular dynamics simulation of the gramicidin channel in a phospholipid bilayer, Proc. Natl. Acad. Sci. U.S.A. 91 (1994) 11631–11635. [120] O. Edholm, O. Berger, F. Jähnig, Structure and fluctuations of bacteriorhodopsin in the purple membrane: a molecular dynamics study, J. Mol. Biol. 250 (1995) 94–111. [121] M. Karplus, J. Kuriyan, Molecular dynamics and protein function, Proc. Natl. Acad. Sci. U.S.A. 102 (2005) 6679–6685. [122] J. Dolenc, J.H. Missimer, M.O. Steinmetz, W.F. van Gunsteren, Methods of NMR structure refinement: molecular dynamics simulations improve the agreement with measured NMR data of a C-terminal peptide of GCN4-p1, J. Biomol. NMR 47 (2010) 221–235. [123] J. Subbotina, V. Yarov-Yarovoy, J. Lees-Miller, S. Durdagi, J. Guo, H.J. Duff, Y.S. Noskov, Structural refinement of the hERG1 pore and voltage-sensing domains with ROSETTA-membrane and molecular dynamics simulations, Proteins 78 (2010) 2922–2934. [124] A.K. Malde, A.E. Mark, Binding and enantiomeric selectivity of threonyl-tRNA synthetase, J. Am. Chem. Soc. 131 (2009) 3848–3849.

[125] J.D. Durrant, C.A.F. de Oliveira, J.A. McCammon, Including receptor flexibility and induced fit effects into the design of MMP-2 inhibitors, J. Mol. Recognit. 23 (2010) 173–182. [126] L.V. Schäfer, D.H. de Jong, A. Holt, A.J. Rzepiela, A.H. de Vries, B. Poolman, J.A. Killian, S.J. Marrink, Lipid packing drives the segregation of transmembrane helices into disordered lipid domains in model membranes, Proc. Natl. Acad. Sci. U.S.A. 108 (2011) 1343–1348. [127] S.J. Marrink, A.H. de Vries, D.P. Tieleman, Lipids on the move: simulations of membrane pores, domains, stalks and curves, Biochim. Biophys. Acta 1788 (2009) 149–168. [128] K. Klenin, B. Strodel, D.J. Wales, W. Wenzel, Modelling proteins: conformational sampling and reconstruction of folding kinetics, Biochim. Biophys. Acta 1814 (2010) 977–1000. [129] F. Khalili-Araghi, J. Gumbart, P.C. Wen, M. Sotomayor, E. Tajkhorshid, K. Schulten, Molecular dynamics simulations of membrane channels and transporters, Curr. Opin. Struct. Biol. 19 (2009) 128–137. [130] E. Lindahl, M.S.P. Sansom, Membrane proteins: molecular dynamics simulations, Curr. Opin. Struct. Biol. 18 (2008) 425–431. [131] J. Gumbart, Y. Wang, A. Aksimentiev, E. Tajkhorshid, K. Schulten, Molecular dynamics simulations of proteins in lipid bilayers, Curr. Opin. Struct. Biol. 15 (2005) 423–431. [132] H.A. Scheraga, M. Khalili, A. Liwo, Protein-folding dynamics: overview of molecular simulation techniques, Annu. Rev. Phys. Chem. 58 (2007) 57–83. [133] V.A. Voelz, G.R. Bowman, K. Beauchamp, V.S. Pande, Molecular simulation of ab initio protein folding for a millisecond folder NTL9(1–39), J. Am. Chem. Soc. 132 (2010) 1526–1528. [134] D.E. Shaw, R.O. Dror, J.K. Salmon, J.P. Grossman, K.M. Mackenzie, J.A. Bank, C. Young, M.M. Deneroff, B. Batson, K.J. Bowers, E. Chow, M.P. Eastwood, D.J. Ierardi, J.L. Klepeis, J.S. Kuskin, R.H. Larson, K. Lindorff-Larsen, P. Maragakis, M.A. Moraes, S. Piana, Y. Shan, B. 
Towles, Millisecond-scale molecular dynamics simulations on Anton, Proceedings of the Conference on High Performance Computing, Networking, Storage and Analysis (SC09), ACM, New York, 2009. [135] W.L. Jorgensen, J.M. Briggs, M.L. Contreras, Relative partition coefficients for organic solutes from fluid simulations, J. Phys. Chem. 94 (1990) 1683–1686. [136] T.A. Soares, P.H. Hünenberger, M.A. Kastenholz, V. Kräutler, T. Lenz, R.D. Lins, C. Oostenbrink, W.F. van Gunsteren, An improved nucleic acid parameter set for the GROMOS force field, J. Comput. Chem. 26 (2005) 725–737. [137] S.J. Marrink, A.H. de Vries, A.E. Mark, Coarse grained model for semiquantitative lipid simulations, J. Phys. Chem. B 108 (2004) 750–760. [138] S.J. Marrink, H.J. Risselada, S. Yefimov, D.P. Tieleman, A.H. de Vries, The MARTINI force field: coarse grained model for biomolecular simulations, J. Phys. Chem. B 111 (2007) 7812–7824. [139] L. Monticelli, S.K. Kandasamy, X. Periole, R.G. Larson, D.P. Tieleman, S.J. Marrink, The MARTINI coarse-grained force field: extension to proteins, J. Chem. Theory. Comput. 4 (2008) 819–834. [140] B.A. Hall, M.S.P. Sansom, Coarse-grained MD simulations and protein–protein interactions: the Cohesin–Dockerin system, J. Chem. Theory. Comput. 5 (2009) 2465–2471. [141] I. Tunbridge, R.B. Best, J. Gain, M.M. Kuttel, Simulation of coarse-grained protein–protein interactions with graphics processing units, J. Chem. Theory. Comput. 6 (2010) 3588–3600. [142] A.J. Markvoort, A.F. Smeijers, K. Pieterse, R.A. van Santen, P.A.J. Hilbers, Lipid-based mechanisms for vesicle fission, J. Phys. Chem. B 111 (2007) 5719–5725. [143] A.F. Smeijers, K. Pieterse, A.J. Markvoort, P.A.J. Hilbers, Coarse-grained transmembrane proteins: hydrophobic matching, aggregation, and their effect on fusion, J. Phys. Chem. B 110 (2006) 13614–13623. [144] C. Clementi, Coarse-grained models of protein folding: toy models or predictive tools? Curr. Opin. Struct. Biol. 18 (2008) 10–15. [145] M. Qin, J. 
Zhang, W. Wang, Effects of disulfide bonds on folding behavior and mechanism of the β-sheet protein tendamistat, Biophys. J. 90 (2006) 272–286. [146] I.F. Thorpe, J. Zhou, G.A. Voth, Peptide folding using multiscale coarse-grained models, J. Phys. Chem. B 112 (2008) 13079–13090. [147] P.J. Bond, J. Holyoake, A. Ivetac, S. Khalid, M.S.P. Sansom, Coarse-grained molecular dynamics simulations of membrane proteins and peptides, J. Struct. Biol. 157 (2007) 593–605. [148] K.A. Scott, P.J. Bond, A. Ivetac, A.P. Chetwynd, S. Khalid, M.S.P. Sansom, Coarse-grained MD simulations of membrane protein-bilayer self-assembly, Structure 16 (2008) 621–630. [149] J. Gumbart, K. Schulten, Structural determinants of lateral gate opening in the protein translocon, Biochemistry 46 (2007) 11147–11157. [150] W. Treptow, S.J. Marrink, M. Tarek, Gating motions in voltage-gated potassium channels revealed by coarse-grained molecular dynamics simulations, J. Phys. Chem. B 112 (2008) 3277–3282. [151] S. Yefimov, E. van der Giessen, P.R. Onck, S.J. Marrink, Mechanosensitive membrane channels in action, Biophys. J. 94 (2008) 2994–3002. [152] P. Cieplak, F.Y. Dupradeau, Y. Duan, J. Wang, Polarization effects in molecular mechanical force fields, J. Phys. Condens. Matter 21 (2009) 333102. [153] F.H. Stillinger, Improved simulation of liquid water by molecular dynamics, J. Chem. Phys. 60 (1974) 1545–1557. [154] H.J.C. Berendsen, J.R. Grigera, T.P. Straatsma, The missing term in effective pair potentials, J. Phys. Chem. 91 (1987) 6269–6271. [155] W.L. Jorgensen, J. Chandrasekhar, J.D. Madura, R.W. Impey, M.L. Klein, Comparison of simple potential functions for simulating liquid water, J. Chem. Phys. 79 (1983) 926–935. [156] C. Kandt, W.L. Ash, D.P. Tieleman, Setting up and running molecular dynamics simulations of membrane proteins, Methods 41 (2007) 475–488.

[157] W.C. Still, A. Tempczyk, R.C. Hawley, T. Hendrickson, Semianalytical treatment of solvation for molecular mechanics and dynamics, J. Am. Chem. Soc. 112 (1990) 6127–6129. [158] B.D. Bursulaya, C.L. Brooks, Comparative study of the folding free energy landscape of a three-stranded β-sheet protein with explicit and implicit solvent models, J. Phys. Chem. B 104 (2000) 12378–12383. [159] R. Zhou, Free energy landscape of protein folding in water: explicit vs. implicit solvent, Proteins 53 (2003) 148–161. [160] L. Bu, C.L. Brooks, De novo prediction of the structures of M. tuberculosis membrane proteins, J. Am. Chem. Soc. 130 (2008) 5384–5385. [161] L. Bu, W. Im, C.L. Brooks, Membrane assembly of simple helix homo-oligomers studied via molecular dynamics simulations, Biophys. J. 92 (2007) 854–863. [162] W. Im, M. Feig, C.L. Brooks, An implicit membrane generalized born theory for the study of structure, stability, and interactions of membrane proteins, Biophys. J. 85 (2003) 2900–2918. [163] A. Warshel, M. Levitt, Theoretical studies of enzymic reactions: dielectric, electrostatic and steric stabilization of the carbonium ion in the reaction of lysozyme, J. Mol. Biol. 103 (1976) 227–249. [164] Y. Sugita, Y. Okamoto, Replica-exchange molecular dynamics method for protein folding, Chem. Phys. Lett. 314 (1999) 141–151. [165] M. Neri, C. Anselmi, M. Cascella, A. Maritan, P. Carloni, Coarse-grained model of proteins incorporating atomistic detail of the active site, Phys. Rev. Lett. 95 (2005) 1–4. [166] M. Neri, M. Baaden, V. Carnevale, C. Anselmi, A. Maritan, P. Carloni, Microseconds dynamics simulations of the outer-membrane protease T, Biophys. J. 94 (2008) 71–78. [167] M. Neri, C. Anselmi, V. Carnevale, A.V. Vargiu, P. Carloni, Molecular dynamics simulations of outer-membrane protease T from E. coli based on a hybrid coarse-grained/atomistic potential, J. Phys. Condens. Matter 18 (2006) S347–S355. 
[168] Q. Shi, S. Izvekov, G.A. Voth, Mixed atomistic and coarse-grained molecular dynamics: simulation of a membrane-bound ion channel, J. Phys. Chem. B 110 (2006) 15045–15048. [169] A.P. Heath, L.E. Kavraki, C. Clementi, From coarse-grain to all-atom: toward multiscale analysis of protein landscapes, Proteins (2007) 646–661. [170] N. Koga, T. Kameda, K. Okazaki, S. Takada, Paddling mechanism for the substrate translocation by AAA+ motor revealed by multiscale molecular simulations, Proc. Natl. Acad. Sci. U.S.A. 106 (2009) 18237–18242. [171] W. Li, S. Takada, Self-learning multiscale simulation for achieving high accuracy and high efficiency simultaneously, J. Chem. Phys. 130 (2009) 214108. [172] A.K. Felts, Y. Harano, E. Gallicchio, R.M. Levy, Free energy surfaces of beta-hairpin and alpha-helical peptides generated by replica exchange molecular dynamics with the AGBNP implicit solvent model, Proteins 56 (2004) 310–321. [173] R. Zhou, B.J. Berne, R. Germain, The free energy landscape for beta hairpin folding in explicit water, Proc. Natl. Acad. Sci. U.S.A. 98 (2001) 14931–14936. [174] A.E. García, J.N. Onuchic, Folding a protein in a computer: an atomic description of the folding/unfolding of protein A, Proc. Natl. Acad. Sci. U.S.A. 100 (2003) 13898–13903. [175] S. Kannan, M. Zacharias, Enhanced sampling of peptide and protein conformations using replica exchange simulations with a peptide backbone biasing-potential, Proteins 66 (2007) 697–706. [176] S. Kannan, M. Zacharias, Application of biasing-potential replica-exchange simulations for loop modeling and refinement of proteins in explicit solvent, Proteins (2010) 2809–2819. [177] M. Congreve, C.W. Murray, T.L. Blundell, Structural biology and drug discovery, Drug Discov. Today 10 (2005) 895–907. [178] G.P. Brady, P.F. Stouten, Fast prediction and visualization of protein binding pockets with PASS, J. Comput. Aided Mol. Des. 14 (2000) 383–401. [179] J. Dundas, Z. Ouyang, J. Tseng, A. Binkowski, Y. Turpaz, J. 
Liang, CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues, Nucleic Acids Res. 34 (2006) W116–W118. [180] M. Hendlich, F. Rippmann, G. Barnickel, LIGSITE: automatic and efficient detection of potential small molecule-binding sites in proteins, J. Mol. Graph. Model. 15 (1997) 359–389. [181] B. Huang, M. Schroeder, LIGSITEcsc: predicting ligand binding sites using the Connolly surface and degree of conservation, BMC Struct. Biol. 6 (2006) 19. [182] R.A. Laskowski, SURFNET: a program for visualizing molecular surfaces, cavities, and intermolecular interactions, J. Mol. Graph. 13 (1995) 323–328. [183] D.G. Levitt, L.J. Banaszak, POCKET: a computer graphics method for identifying and displaying protein cavities and their surrounding amino acids, J. Mol. Graph. 10 (1992) 229–234. [184] J. Ren, L. Xie, W.W. Li, P.E. Bourne, SMAP-WS: a parallel web service for structural proteome-wide ligand-binding site comparison, Nucleic Acids Res. 38 (2010) W441–W444. [185] M. Weisel, E. Proschak, G. Schneider, PocketPicker: analysis of ligand binding-sites with shape descriptors, Chem. Cent. J. 1 (2007). [186] W.S.J. Valdar, Scoring residue conservation, Proteins 48 (2002) 227–241. [187] J.A. Capra, R.A. Laskowski, J.M. Thornton, M. Singh, T.A. Funkhouser, Predicting protein ligand binding sites by combining evolutionary sequence conservation and 3D structure, PLoS Comput. Biol. 5 (2009) e1000585. [188] T.J. Ewing, S. Makino, A.G. Skillman, I.D. Kuntz, DOCK 4.0: search strategies for automated molecular docking of flexible molecule databases, J. Comput. Aided Mol. Des. 15 (2001) 411–428. [189] D.S. Goodsell, G.M. Morris, A.J. Olson, Automated docking of flexible ligands: applications of AutoDock, J. Mol. Recognit. 9 (1996) 1–5.


[190] G.M. Morris, D.S. Goodsell, R.S. Halliday, R. Huey, W.E. Hart, R.K. Belew, A.J. Olson, Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function, J. Comput. Chem. 19 (1998) 1639–1662. [191] G. Jones, P. Willett, R.C. Glen, A.R. Leach, R. Taylor, Development and validation of a genetic algorithm for flexible docking, J. Mol. Biol. 267 (1997) 727–748. [192] G. Jones, P. Willett, R. Glen, Molecular recognition of receptor sites using a genetic algorithm with a description of desolvation, J. Mol. Biol. 245 (1995) 43–53. [193] R.A. Friesner, R.B. Murphy, M.P. Repasky, L.L. Frye, J.R. Greenwood, T.A. Halgren, P.C. Sanschagrin, D.T. Mainz, Extra precision glide: docking and scoring incorporating a model of hydrophobic enclosure for protein–ligand complexes, J. Med. Chem. 49 (2006) 6177–6196. [194] R.A. Friesner, J.L. Banks, R.B. Murphy, T.A. Halgren, J.J. Klicic, D.T. Mainz, M.P. Repasky, E.H. Knoll, M. Shelley, J.K. Perry, D.E. Shaw, P. Francis, P. Shenkin, Glide: a new approach for rapid, accurate docking and scoring. 1. Method and assessment of docking accuracy, J. Med. Chem. 47 (2004) 1739–1749. [195] M. Rarey, B. Kramer, T. Lengauer, G. Klebe, A fast flexible docking method using an incremental construction algorithm, J. Mol. Biol. 261 (1996) 470–489. [196] R. Abagyan, M. Totrov, D. Kuznetsov, ICM-A new method for protein modeling and design: applications to docking and structure prediction from the distorted native conformation, J. Comput. Chem. 15 (1994) 488–506. [197] P.J. Goodford, A computational procedure for determining energetically favorable binding sites on biologically important macromolecules, J. Med. Chem. 28 (1985) 849–857. [198] C. de Graaf, P. Pospisil, W. Pos, G. Folkers, N.P.E. Vermeulen, Binding mode prediction of cytochrome p450 and thymidine kinase protein–ligand complexes by consideration of water and rescoring in automated docking, J. Med. Chem. 48 (2005) 2308–2318. [199] M. Rarey, B. 
Kramer, T. Lengauer, The particle concept: placing discrete water molecules during protein–ligand docking predictions, Proteins 34 (1999) 17–28. [200] S.F. Sousa, P.A. Fernandes, M.J. Ramos, Protein–ligand docking: current status and future challenges, Proteins 65 (2006) 15–26. [201] T. Hou, J. Wang, Y. Li, W. Wang, Assessing the performance of the molecular mechanics/Poisson Boltzmann surface area and molecular mechanics/generalized Born surface area methods. II. The accuracy of ranking poses generated from docking, J. Comput. Chem. 32 (2010) 866–877. [202] G.L. Warren, C.W. Andrews, A.M. Capelli, B. Clarke, J. LaLonde, M.H. Lambert, M. Lindvall, N. Nevins, S.F. Semus, S. Senger, G. Tedesco, I.D. Wall, J.M. Woolven, C.E. Peishoff, M.S. Head, A critical assessment of docking programs and scoring functions, J. Med. Chem. 49 (2006) 5912–5931. [203] J.C. Cole, C.W. Murray, J.W.M. Nissink, R.D. Taylor, R. Taylor, Comparing protein–ligand docking programs is difficult, Proteins 60 (2005) 325–332. [204] J.B. Cross, D.C. Thompson, B.K. Rai, J.C. Baber, K.Y. Fan, Y. Hu, C. Humblet, Comparison of several molecular docking programs: pose prediction and virtual screening accuracy, J. Chem. Inf. Model. 49 (2009) 1455–1474. [205] E. Kellenberger, J. Rodrigo, P. Muller, D. Rognan, Comparative evaluation of eight docking tools for docking and virtual screening accuracy, Proteins 57 (2004) 225–242. [206] C.A. Lipinski, F. Lombardo, B.W. Dominy, P.J. Feeney, Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings, Adv. Drug Deliv. Rev. 46 (2001) 3–26. [207] J.J. Irwin, B.K. Shoichet, ZINC — a free database of commercially available compounds for virtual screening, J. Chem. Inf. Model. 45 (2005) 177–182. [208] S.-Y. Yang, Pharmacophore modeling and applications in drug discovery: challenges and recent advances, Drug Discov. Today 15 (2010) 444–450. [209] J.O. Ebalunode, W. Zheng, A. 
Tropsha, Application of QSAR and shape pharmacophore modeling approaches for targeted chemical library design, Methods Mol. Biol. 685 (2011) 111–133. [210] M.A. Lill, Multi-dimensional QSAR in drug discovery, Drug Discov. Today 12 (2007) 1013–1017. [211] Y. Tanrikulu, G. Schneider, Pseudoreceptor models in drug design: bridging ligand- and receptor-based virtual screening, Nat. Rev. Drug Discov. 7 (2008) 667–677. [212] J. Perez-Gil, T.E. Weaver, Pulmonary surfactant pathophysiology: current models and open questions, Physiology. 25 (2010) 132–141. [213] Y.Y. Zuo, R.A.W. Veldhuizen, A.W. Neumann, N.O. Petersen, F. Possmayer, Current perspectives in pulmonary surfactant — inhibition, enhancement and evaluation, Biochim. Biophys. Acta 1778 (2008) 1947–1977. [214] J. Goerke, Pulmonary surfactant: functions and molecular composition, Biochim. Biophys. Acta 1408 (1998) 79–89. [215] E.C. Crouch, Collectins and pulmonary host defense, Am. J. Respir. Cell Mol. Biol. 19 (1998) 177–201. [216] E.J. Veldhuizen, J.J. Batenburg, L.M. van Golde, H.P. Haagsman, The role of surfactant proteins in DPPC enrichment of surface films, Biophys. J. 79 (2000) 3164–3171. [217] A.D. Bangham, C.J. Morley, M.C. Phillips, The physical properties of an effective lung surfactant, Biochim. Biophys. Acta 573 (1979) 552–556. [218] B. Pastrana-Rios, C.R. Flach, J.W. Brauner, A.J. Mautone, R. Mendelsohn, A direct test of the “squeeze-out” hypothesis of lung surfactant function. External reflection FT-IR at the air/water interface, Biochemistry 33 (1994) 5121–5127. [219] J.C. Watkins, The surface properties of pure phospholipids in relation to those of lung extracts, Biochim. Biophys. Acta 152 (1968) 293–306. [220] H. Bachofen, U. Gerber, P. Gehr, M. Amrein, S. Schürch, Structures of pulmonary surfactant films adsorbed to an air–liquid interface in vitro, Biochim. Biophys. Acta 1720 (2005) 59–72. [221] R.V. Diemel, M.M.E. Snel, A.J. Waring, F.J. Walther, L.M.G. van Golde, G. Putz, H.P. Haagsman, J.J. 
Batenburg, Multilayer formation upon compression of surfactant monolayers depends on protein concentration as well as lipid composition. An atomic force microscopy study, J. Biol. Chem. 277 (2002) 21179–21188. [222] D. Follows, F. Tiberg, R.K. Thomas, M. Larsson, Multilayers at the surface of solutions of exogenous lung surfactant: direct observation by neutron reflection, Biochim. Biophys. Acta 1768 (2007) 228–235. [223] S. Schürch, R. Qanbar, H. Bachofen, F. Possmayer, The surface-associated surfactant reservoir in the alveolar lining, Biol. Neonate 67 (Suppl 1) (1995) 61–76. [224] C. Alonso, T. Alig, J. Yoon, F. Bringezu, H. Warriner, J.A. Zasadzinski, More than a monolayer: relating lung surfactant structure and mechanics to composition, Biophys. J. 87 (2004) 4188–4202. [225] F. Moya, S. Sinha, R.B. D'Agostino, Surfactant-replacement therapy for respiratory distress syndrome in the preterm and term neonate: congratulations and corrections, Pediatrics 121 (2008) 1290–1291. [226] W.A. Engle, Surfactant-replacement therapy for respiratory distress in the preterm and term neonate, Pediatrics 121 (2008) 419–432. [227] D. Rose, J. Rendell, D. Lee, K. Nag, V. Booth, Molecular dynamics simulations of lung surfactant lipid monolayers, Biophys. Chem. 138 (2008) 67–77. [228] C.D. Lorenz, A. Travesset, Atomistic simulations of Langmuir monolayer collapse, Langmuir 22 (2006) 10016–10024. [229] Y.N. Kaznessis, S. Kim, R.G. Larson, Specific mode of interaction between components of model pulmonary surfactants using computer simulations, J. Mol. Biol. 322 (2002) 569–582. [230] S.K. Kandasamy, R.G. Larson, Molecular dynamics study of the lung surfactant peptide SP-B1-25 with DPPC monolayers: insights into interactions and peptide position and orientation, Biophys. J. 88 (2005) 1577–1592. [231] H. Lee, S.K. Kandasamy, R.G. Larson, Molecular dynamics simulations of the anchoring and tilting of the lung-surfactant peptide SP-B1-25 in palmitic acid monolayers, Biophys. J. 89 (2005) 3807–3821. [232] J.A. Freites, Y. Choi, D.J. Tobias, Molecular dynamics simulations of a pulmonary surfactant protein B peptide in a lipid monolayer, Biophys. J. 84 (2003) 2169–2180. [233] S.O. Nielsen, C.F. Lopez, P.B. Moore, J.C. Shelley, M.L. Klein, Molecular dynamics investigations of lipid Langmuir monolayers using a coarse-grain model, J. Phys. Chem. B 107 (2003) 13911–13917. [234] S. Baoukina, L. Monticelli, M. Amrein, D.P. Tieleman, The molecular mechanism of monolayer–bilayer transformations of lung surfactant from molecular dynamics simulations, Biophys. J. 93 (2007) 3775–3782. [235] S. Baoukina, L. Monticelli, S.J. Marrink, D.P. Tieleman, Pressure-area isotherm of a lipid monolayer from molecular dynamics simulations, Langmuir 23 (2007) 12617–12623. [236] S. Baoukina, L. Monticelli, H.J. Risselada, S.J. Marrink, D.P. Tieleman, The molecular mechanism of lipid monolayer collapse, Proc. Natl. Acad. Sci. U.S.A. 105 (2008) 10803–10808. [237] C. Laing, S. Baoukina, D.P. Tieleman, Molecular dynamics study of the effect of cholesterol on the properties of lipid monolayers at low surface tensions, Phys. Chem. Chem. Phys. 11 (2009) 1916–1922. [238] S.L. Duncan, R.G. Larson, Folding of lipid monolayers containing lung surfactant proteins SP-B(1–25) and SP-C studied via coarse-grained molecular dynamics simulations, Biochim. Biophys. Acta 1798 (2010) 1632–1650. [239] R. Fredriksson, M.C. Lagerström, L.-G. Lundin, H.B. Schiöth, The G-protein-coupled receptors in the human genome form five main families. Phylogenetic analysis, paralogon groups, and fingerprints, Mol. Pharmacol. 63 (2003) 1256–1272. [240] V. Cherezov, D.M. Rosenbaum, M.A. Hanson, S.G.F. Rasmussen, F.S. Thian, T.S. Kobilka, H.J. Choi, P. Kuhn, W.I. Weis, B.K. Kobilka, R.C. Stevens, High-resolution crystal structure of an engineered human β2-adrenergic G protein-coupled receptor, Science 318 (2007) 1258–1265. [241] T. Warne, M.J. Serrano-Vega, J.G. Baker, R. Moukhametzianov, P.C. Edwards, R. Henderson, A.G.W. Leslie, C.G. Tate, G.F.X. Schertler, Structure of a β1-adrenergic G-protein-coupled receptor, Nature 454 (2008) 486–491. [242] K. Palczewski, T. Kumasaka, T. Hori, C.A. Behnke, H. Motoshima, B.A. Fox, I. Le Trong, D.C. Teller, T. Okada, R.E. Stenkamp, M. Yamamoto, M. Miyano, Crystal structure of rhodopsin: a G protein-coupled receptor, Science 289 (2000) 739–745. [243] V.P. Jaakola, M.T. Griffith, M.A. Hanson, V. Cherezov, E.Y.T. Chien, J.R. Lane, A.P. Ijzerman, R.C. Stevens, The 2.6 angstrom crystal structure of a human A2A adenosine receptor bound to an antagonist, Science 322 (2008) 1211–1217. [244] B. Wu, E.Y.T. Chien, C.D. Mol, G. Fenalti, W. Liu, V. Katritch, R. Abagyan, A. Brooun, P. Wells, F.C. Bi, D.J. Hamel, P. Kuhn, T.M. Handel, V. Cherezov, R.C. Stevens, Structures of the CXCR4 chemokine GPCR with small-molecule and cyclic peptide antagonists, Science 330 (2010) 1066–1071. [245] E.Y.T. Chien, W. Liu, Q. Zhao, V. Katritch, G.W. Han, M.A. Hanson, L. Shi, A.H. Newman, J.A. Javitch, V. Cherezov, R.C. Stevens, Structure of the human dopamine D3 receptor in complex with a D2/D3 selective antagonist, Science 330 (2010) 1091–1095. [246] J.H. Park, P. Scheerer, K.P. Hofmann, H.-W. Choe, O.P. Ernst, Crystal structure of the ligand-free G-protein-coupled receptor opsin, Nature 454 (2008) 183–187. [247] S.G.F. Rasmussen, H.-J. Choi, J.J. Fung, E. Pardon, P. Casarosa, P.S. Chae, B.T. DeVree, D.M. Rosenbaum, F.S. Thian, T.S. Kobilka, A. Schnapp, I. Konetzki, R.K. Sunahara, S.H. Gellman, A. Pautsch, J. Steyaert, W.I. Weis, B.K. Kobilka, Structure of a nanobody-stabilized active state of the β2 adrenoceptor, Nature 469 (2011) 175–180. [248] T. Warne, R. Moukhametzianov, J.G. Baker, R. Nehmé, P.C. Edwards, A.G.W. Leslie, G.F.X. Schertler, C.G. Tate, The structural basis for agonist and partial agonist action on a β1-adrenergic receptor, Nature 469 (2011) 241–244.

[249] T. Shimamura, M. Shiroishi, S. Weyand, H. Tsujimoto, G. Winter, V. Katritch, R. Abagyan, V. Cherezov, W. Liu, G.W. Han, T. Kobayashi, R.C. Stevens, S. Iwata, Structure of the human histamine H1 receptor complex with doxepin, Nature 475 (2011) 65–70. [250] F. Xu, H. Wu, V. Katritch, G.W. Han, K.A. Jacobson, Z.-G. Gao, V. Cherezov, R.C. Stevens, Structure of an agonist-bound human A2A adenosine receptor, Science 332 (2011) 322–327. [251] P.J. Barnes, New therapies for asthma: is there any progress? Trends Pharmacol. Sci. 31 (2010) 335–343. [252] L.M. Simpson, I.D. Wall, F.E. Blaney, C.A. Reynolds, Modeling GPCR active state conformations: the β2-adrenergic receptor, Proteins 79 (2011) 1441–1457. [253] D.M. Rosenbaum, C. Zhang, J.A. Lyons, R. Holl, D. Aragao, D.H. Arlow, S.G.F. Rasmussen, H.J. Choi, B.T. DeVree, R.K. Sunahara, P.S. Chae, S.H. Gellman, R.O. Dror, D.E. Shaw, W.I. Weis, M. Caffrey, P. Gmeiner, B.K. Kobilka, Structure and function of an irreversible agonist-β2 adrenoceptor complex, Nature 469 (2011) 236–240. [254] R.O. Dror, A.C. Pan, D.H. Arlow, D.W. Borhani, P. Maragakis, Y. Shan, et al., Pathway and mechanism of drug binding to G-protein-coupled receptors, Proc. Natl. Acad. Sci. U.S.A. (2011) 2–7. [255] J.A. Ballesteros, A.D. Jensen, G. Liapakis, S.G.F. Rasmussen, L. Shi, U. Gether, J.A. Javitch, Activation of the β2-adrenergic receptor involves disruption of an ionic lock between the cytoplasmic ends of transmembrane segments 3 and 6, J. Biol. Chem. 276 (2001) 29171–29177. [256] R.O. Dror, D.H. Arlow, D.W. Borhani, M.Ø. Jensen, S. Piana, D.E. Shaw, Identification of two distinct inactive conformations of the beta2-adrenergic receptor reconciles structural and biochemical observations, Proc. Natl. Acad. Sci. U.S.A. 106 (2009) 4689–4694. [257] S. Vanni, M. Neri, I. Tavernelli, U. Rothlisberger, Observation of “ionic lock” formation in molecular dynamics simulations of wild-type β1 and β2 adrenergic receptors, Biochemistry 48 (2009) 4789–4797. 
[258] M.P. Caulfield, N.J. Birdsall, International Union of Pharmacology. XVII. Classification of muscarinic acetylcholine receptors, Pharmacol. Rev. 50 (1998) 279–290. [259] R. Gosens, J. Zaagsma, H. Meurs, A.J. Halayko, Muscarinic receptor signaling in the pathophysiology of asthma and COPD, Respir. Res. 7 (2006). [260] A.F. Roffel, C.R. Elzinga, J. Zaagsma, Muscarinic M3 receptors mediate contraction of human central and peripheral airway smooth muscle, Pulm. Pharmacol. 3 (1990) 47–51. [261] P.A. Minette, J.W. Lammers, C.M. Dixon, M.T. McCusker, P.J. Barnes, A muscarinic agonist inhibits reflex bronchoconstriction in normal but not in asthmatic subjects, J. Appl. Physiol. 67 (1989) 2461–2465. [262] R.E. Ten Berge, M. Krikke, A.C. Teisman, A.F. Roffel, J. Zaagsma, Dysfunctional muscarinic M2 autoreceptors in vagally induced bronchoconstriction of conscious guinea pigs after the early allergic reaction, Eur. J. Pharmacol. 318 (1996) 131–139. [263] A. Pedretti, G. Vistoli, C. Marconi, B. Testa, Muscarinic receptors: a comparative analysis of structural features and binding modes through homology modelling and molecular docking, Chem. Biodivers. 3 (2006) 481–501. [264] A.K. Bhattacharjee, J.A. Gordon, E. Marek, A. Campbell, R.K. Gordon, 3D-QSAR studies of 2,2-diphenylpropionates to aid discovery of novel potent muscarinic antagonists, Bioorg. Med. Chem. 17 (2009) 3999–4012. [265] Z. Miao, K.E. Luker, B.C. Summers, R. Berahovich, M.S. Bhojani, A. Rehemtulla, C.G. Kleer, J.J. Essner, A. Nasevicius, G.D. Luker, M.C. Howard, T.J. Schall, CXCR7 (RDC1) promotes breast and lung tumor growth in vivo and is expressed on tumor-associated vasculature, Proc. Natl. Acad. Sci. U.S.A. 104 (2007) 15735–15740. [266] K. Balabanian, B. Lagane, S. Infantino, K.Y.C. Chow, J. Harriague, B. Moepps, F. Arenzana-Seisdedos, M. Thelen, F. Bachelerie, The chemokine SDF-1/CXCL12 binds to and signals through the orphan receptor RDC1 in T lymphocytes, J. Biol. Chem. 280 (2005) 35760–35766. 
[267] U. Naumann, E. Cameroni, M. Pruenster, H. Mahabaleshwar, E. Raz, H.-G. Zerwes, A. Tor, M. Thelen, CXCR7 functions as a scavenger for CXCL12 and CXCL11, PLoS One 5 (2010) e9175. [268] Y. Zhang, M.E. DeVries, J. Skolnick, Structure modeling of all identified G protein–coupled receptors in the human genome, PLoS Comput. Biol. 2 (2006) e13. [269] J.M. Burns, B.C. Summers, Y. Wang, A. Melikian, R. Berahovich, Z. Miao, M.E.T. Penfold, M.J. Sunshine, D.R. Littman, C.J. Kuo, K. Wei, B.E. McMaster, K. Wright, M.C. Howard, T.J. Schall, A novel chemokine receptor for SDF-1 and I-TAC involved in cell survival, cell adhesion, and tumor development, J. Exp. Med. 203 (2006) 2201–2213. [270] S.W. Jones, S.M.V. Brockbank, M.L. Mobbs, N.J. Le Good, S. Soma-Haddrick, A.J. Heuze, C.J. Langham, D. Timms, P. Newham, M.R.C. Needham, The orphan G-protein coupled receptor RDC1: evidence for a role in chondrocyte hypertrophy and articular cartilage matrix turnover, Osteoarthr. Cartil. 14 (2006) 597–608. [271] E. Zampeli, E. Tiligada, The role of histamine H4 receptor in immune and inflammatory disorders, Br. J. Pharmacol. 157 (2009) 24–33. [272] A. Jongejan, H.D. Lim, R.A. Smits, E. Haaksma, R. Leurs, Q. For, et al., Delineation of agonist binding to the human histamine H4 receptor using mutational analysis, homology modeling, and ab initio calculations, J. Chem. Inf. Model. 48 (2008) 1455–1463. [273] B. Jójárt, R. Kiss, B. Viskolcz, G.M. Keseru, Activation mechanism of the human histamine H4 receptor — an explicit membrane molecular dynamics simulation study, J. Chem. Inf. Model. 48 (2008) 1199–1210. [274] Y. Tanrikulu, E. Proschak, T. Werner, T. Geppert, N. Todoroff, A. Klenner, T. Kottke, K. Sander, E. Schneider, R. Seifert, H. Stark, T. Clark, G. Schneider, Homology model adjustment and ligand screening with a pseudoreceptor of the human histamine H4 receptor, ChemMedChem 4 (2009) 820–827.

[275] T. Werner, K. Sander, Y. Tanrikulu, T. Kottke, E. Proschak, H. Stark, G. Schneider, In silico characterization of ligand binding modes in the human histamine H4 Receptor and their impact on receptor activation, Chembiochem 11 (2010) 1850–1855. [276] S.V. Sharma, D.W. Bell, J. Settleman, D.A. Haber, Epidermal growth factor receptor mutations in lung cancer, Nat. Rev. Cancer 7 (2007) 169–181. [277] M. Huse, J. Kuriyan, The conformational plasticity of protein kinases, Cell 109 (2002) 275–282. [278] N. Jura, N.F. Endres, K. Engel, S. Deindl, R. Das, M.H. Lamers, D.E. Wemmer, X. Zhang, J. Kuriyan, Mechanism for activation of the EGF receptor catalytic domain by the juxtamembrane segment, Cell 137 (2009) 1293–1307. [279] X. Zhang, J. Gureasko, K. Shen, P.A. Cole, J. Kuriyan, An allosteric mechanism for activation of the kinase domain of epidermal growth factor receptor, Cell 125 (2006) 1137–1149. [280] K.M. Ferguson, M.B. Berger, J.M. Mendrola, H.-S. Cho, D.J. Leahy, M.A. Lemmon, EGF activates its receptor by removing interactions that autoinhibit ectodomain dimerization, Mol. Cell 11 (2003) 507–517. [281] N.E. Hynes, H.A. Lane, ERBB receptors and cancer: the complexity of targeted inhibitors, Nat. Rev. Cancer 5 (2005) 341–354. [282] T.E. Balius, R.C. Rizzo, Quantitative prediction of fold resistance for inhibitors of EGFR, Biochemistry 48 (2009) 8435–8448. [283] S. Wan, P.V. Coveney, Rapid and accurate ranking of binding affinities of epidermal growth factor receptor sequences with selected lung cancer drugs, J. R. Soc. Interface 8 (2011) 1114–1127. [284] M. Mustafa, A. Mirza, N. Kannan, Conformational regulation of the EGFR kinase core by the juxtamembrane and C-terminal tail: a molecular dynamics study, Proteins 79 (2011) 99–114. [285] J. Kästner, H.H. Loeffler, S.K. Roberts, M.L. Martin-Fernandez, M.D. 
Winn, Ectodomain orientation, conformational plasticity and oligomerization of ErbB1 receptors investigated by molecular dynamics, J. Struct. Biol. 167 (2009) 117–128. [286] S.E.D. Webb, S.K. Roberts, S.R. Needham, C.J. Tynan, D.J. Rolfe, M.D. Winn, D.T. Clarke, R. Barraclough, Single-molecule imaging and fluorescence lifetime imaging microscopy show different structures for high- and low-affinity epidermal growth factor receptors in A431 cells, Biophys. J. 94 (2008) 803–819.


[287] J.J. Lammerts van Bueren, W.K. Bleeker, A. Brännström, A. von Euler, M. Jansson, M. Peipp, T. Schneider-Merck, T. Valerius, J.G.J. van de Winkel, P.W.H.I. Parren, The antibody zalutumumab inhibits epidermal growth factor receptor signaling by limiting intra- and intermolecular flexibility, Proc. Natl. Acad. Sci. U.S.A. 105 (2008) 6109–6114. [288] Z. Zhang, W. Wriggers, Polymorphism of the epidermal growth factor receptor extracellular ligand binding domain: the dimer interface depends on domain stabilization, Biochemistry 50 (2011) 2144–2156. [289] J.L. Bobadilla, M. Macek, J.P. Fine, P.M. Farrell, Cystic fibrosis: a worldwide analysis of CFTR mutations — correlation with incidence data and application to screening, Hum. Mutat. 19 (2002) 575–606. [290] H.A. Lewis, X. Zhao, C. Wang, J.M. Sauder, I. Rooney, B.W. Noland, D. Lorimer, M.C. Kearins, K. Conners, B. Condon, P.C. Maloney, W.B. Guggino, J.F. Hunt, S. Emtage, Impact of the ΔF508 mutation in first nucleotide-binding domain of human cystic fibrosis transmembrane conductance regulator on domain folding and structure, J. Biol. Chem. 280 (2005) 1346–1353. [291] B.H. Qu, E.H. Strickland, P.J. Thomas, Localization and suppression of a kinetic defect in cystic fibrosis transmembrane conductance regulator folding, J. Biol. Chem. 272 (1997) 15739–15744. [292] B.H. Qu, P.J. Thomas, Alteration of the cystic fibrosis transmembrane conductance regulator folding pathway, J. Biol. Chem. 271 (1996) 7261–7264. [293] P.H. Thibodeau, C.A. Brautigam, M. Machius, P.J. Thomas, Side chain and backbone contributions of Phe508 to CFTR folding, Nat. Struct. Mol. Biol. 12 (2005) 10–16. [294] A.W.R. Serohijos, T. Hegedus, J.R. Riordan, N.V. Dokholyan, Diminished self-chaperoning activity of the ΔF508 mutant of CFTR results in protein misfolding, PLoS Comput. Biol. 4 (2008) e1000008. [295] D. Cox, M. Brennan, N. Moran, Integrins as therapeutic targets: lessons and opportunities, Nat. Rev. Drug Discov. 9 (2010) 804–820. [296] J. 
Singh, H. van Vlijmen, Y. Liao, W.C. Lee, M. Cornebise, M. Harris, I. Shu, A. Gill, J.H. Cuervo, W.M. Abraham, S.P. Adams, Identification of potent and novel α4β1 antagonists using in silico screening, J. Med. Chem. 45 (2002) 2988–2993.