If no experimental structural data are available, one must resort to modeled data in order to understand a molecule or to predict mutants with certain (altered) characteristics. We have set out to design a computer program that meets the following criteria, which can be divided into two classes: usage criteria and design criteria.
1. Usage criteria: It must be easy for novice users. It must be flexible enough to allow an expert user to perform even the most sophisticated studies. It should allow for all kinds of high-quality graphics in one, two or three dimensions. It must incorporate existing or new sequence and structure databases. It must be highly modular: a user should need to know only a few global commands plus the commands of the module of interest to be able to work. Transparent interfaces must be available to popular programs like GROMOS. On-line help must be available at several levels at every stage of user interaction. The program should be useful to molecular modelers, drug designers and crystallographers. One command should invoke only a single action; no subcommands or modes should be used. An experienced user, however, should not be limited by this requirement.
2. Design criteria: It is written in FORTRAN 77 and will therefore be easy to port to other computers. Even the HELP facility should be fully FORTRAN 77. It is highly modular. If a user wants to make a change or extension, only a very limited number of subroutines needs to be studied. Options that are not CPU intensive should be an integral part of the program. Options that are CPU intensive should be submittable as batch jobs. The idea that one command performs one option also extends to the source code: one library contains the routines needed for the options in one menu, and one subroutine should perform only one task.
WHAT IF options fall into several categories, including graphics, databases, molecular comparisons, crystallographic tools, protein mutations, drug docking, structure analysis, parameter correlation analysis, and atom, residue and molecule operations.
One of the modules in WHAT IF fully and automatically superimposes protein structures. Some of the more spectacular results and failures of the application of this method will be discussed.
Applications of dynamic programming to protein structure
W.R. Taylor and C.A. Orengo
Laboratory of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK
The technique of dynamic programming optimization (often better known as the Needleman-Wunsch method) has been applied for many years to the problem of biological sequence comparison. Until recently the method was restricted to comparing one-dimensional (1D), or sequential, objects but has now been extended by the authors to the comparison of three-dimensional (3D) objects, specifically, protein structures. This new method works by reducing the 3D problem to a series of 1D comparisons. The protein structures are first reduced to distance plots (two dimensions); then each column of the two distance plots is compared between the two proteins and a consensus alignment is built up. The method has been applied successfully to the comparison of protein structures, including some very remotely related pairs. Further developments of the method will be described including the incorporation of additional data (hydrogen bonds, torsion angles, etc.), a new fast algorithm and its extension to multiple structure comparison. New applications of the method, including the comparison of protein structure directly to NMR and crystallographic data, will also be briefly described.
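The reduction from 3D to 1D described above can be illustrated with a small sketch. It is not the authors' algorithm: the similarity function (1 minus the absolute difference of two distances), the gap penalty of -1, and the way column scores are combined into a final alignment are all assumptions chosen for brevity, and the function names are hypothetical. The sketch uses a standard Needleman-Wunsch alignment twice: once to compare each pair of distance-plot columns, and once more on the resulting column scores.

```python
import math


def needleman_wunsch(n, m, score, gap=-1.0):
    """Global alignment of two 1D objects of lengths n and m.

    score(i, j) is the reward for pairing position i with position j.
    Returns (total score, list of aligned (i, j) index pairs).
    """
    # best[i][j] = best score for aligning the first i and first j positions
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        best[i][0] = i * gap
    for j in range(1, m + 1):
        best[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(best[i - 1][j - 1] + score(i - 1, j - 1),
                             best[i - 1][j] + gap,
                             best[i][j - 1] + gap)
    # Trace back from the corner to recover which positions were paired.
    pairs, i, j = [], n, m
    while i > 0 and j > 0:
        if best[i][j] == best[i - 1][j - 1] + score(i - 1, j - 1):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return best[n][m], pairs


def align_sequences(a, b, match=1.0, mismatch=-1.0, gap=-1.0):
    """Classic 1D use: align two strings residue by residue."""
    return needleman_wunsch(
        len(a), len(b),
        lambda i, j: match if a[i] == b[j] else mismatch, gap)


def distance_plot(coords):
    """Matrix of pairwise distances between 3D points (one per residue)."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def compare_structures(coords_a, coords_b, gap=-1.0):
    """Two-level sketch: score every pair of distance-plot columns with a
    1D alignment, then align the residues using those column scores."""
    plot_a, plot_b = distance_plot(coords_a), distance_plot(coords_b)
    n, m = len(coords_a), len(coords_b)

    def column_score(i, j):
        # Assumed similarity: distances that agree closely score near 1.
        def similarity(k, l):
            return 1.0 - abs(plot_a[k][i] - plot_b[l][j])
        return needleman_wunsch(n, m, similarity, gap)[0]

    table = [[column_score(i, j) for j in range(m)] for i in range(n)]
    return needleman_wunsch(n, m, lambda i, j: table[i][j], gap)
```

Because distance plots depend only on distances within each structure, this kind of comparison needs no prior superposition of the two coordinate sets. The sketch costs on the order of n²m² operations, so it is suitable only for very small examples.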
From the comparative analysis of proteins to knowledge based modeling
Mark S. Johnson, John Overington, Andrej Sali, Pam Thomas and Tom L. Blundell
Imperial Cancer Research Fund Structural Molecular Biology Unit, Laboratory of Molecular Biology, Department of Crystallography, Birkbeck College, Malet Street, London WC1E 7HX, UK
The analysis of protein structures and sequences from homologous families provides a realistic route for the homology based modeling of proteins. The general approach taken by us at Birkbeck can be described by the following four steps:
1. The comparison of protein sequences and three-dimensional (3D) structures determined by experiment and organized into databases
2. The derivation of rules through the analyses of the comparisons
3. The use of these rules to define sequences that can adopt similar folds; the projection from 3D structure to one-dimensional sequence chiefly through structural templates
4. The extrapolation of the features of known structures to the sequence of the protein to be modeled
Two modeling procedures have been developed within this framework: a “Frankenstein” approach to modeling, COMPOSER, which is based on a rigid-body treatment of protein pieces and parts; and PROTEIN-MODELLER, which is based on distance constraints in a similar way to methods for structure refinement using two-dimensional NMR. These methods lead to models that are useful in the design of inhibitors, can provide suggestions for site-directed mutagenesis and may even help in the solution of x-ray crystal structures.
J. Mol. Graphics, 1990, Vol. 8, December 231