Abstracts / Journal of Biotechnology 136S (2008) S72–S74
I2-P-001
Reconstruction of phylogenies by mapping bio-sequences into a probability vector space
Jia F. Weng ∗ , I. Mareels, D.A. Thomas
Victoria Research Lab (VRL), National ICT Australia (NICTA) and Melbourne School of Engineering, The University of Melbourne, Melbourne, Australia
E-mail address: [email protected] (J.F. Weng).
I2-P-002
Research on the dissociation process of the hirudin/thrombin complex using steered molecular dynamics
Jun Zhao ∗ , Chunli Yan, Zhilong Xiu
Department of Bioscience and Biotechnology, School of Environmental and Biological Science and Technology, Dalian University of Technology, Dalian 116024, People’s Republic of China
E-mail address: [email protected] (J. Zhao).
Inferring a phylogenetic tree means not only finding its optimal topology but also reconstructing all ancestors of a given set of species. It also involves determining the branch lengths, which are defined as a function of the end-nodes of the branches. Many methods for reconstructing phylogenetic trees have been developed (Felsenstein, 2004). They focus mainly on predicting the graph structures of the trees, called topologies, and pay less attention to the ancestral states of the internal nodes, in particular the state of the common ancestor of the given species. However, it is precisely the latter, the existence of a common origin, that stands at the centre of evolutionary theory. Moreover, because of gaps in nucleotide/protein sequences (resulting from alignment) and because of the ambiguity in reconstructing phylogenetic trees, the ancestral states of internal nodes are often uncertain. A natural solution to these problems is to assume probability distributions at the internal nodes. For this reason, we propose a probability representation model of phylogenetic trees. In this model the discrete input bio-sequences (nucleotide/protein sequences) are mapped into a metric space of probability vectors. As a result, all internal nodes (ancestors) become points in a continuous space, and the phylogenetic tree problem is translated into a probability Steiner tree problem (Hwang et al., 1992; Ewens and Grant, 2005) in a high-dimensional space. (A probability tree is a tree whose nodes are all probability vectors.) The metric in this space is based on the (uncorrected/corrected) genetic distance between sequences; for example, it is the ℓ1 metric (possibly modified) if a (weighted/unweighted) maximum parsimony criterion is applied. In this paper we study the phylogenetic tree problem in this setting and propose an algorithm for constructing probability Steiner trees that are locally minimal under the (weighted/unweighted) maximum parsimony criterion.
Finally, a real example of 14 species whose sequences contain 532 nucleotides is given to show the advantages of our probability representation model and method over the maximum parsimony methods currently in use. The probability representation model might therefore help biologists investigate phylogenies from a new point of view that perhaps explains the evolutionary process better.
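The mapping of aligned sequences into a space of probability vectors can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names are hypothetical, the treatment of a gap as a uniform distribution is an assumption, and the halved ℓ1 distance is chosen only so that a mismatch between two definite bases costs 1, as under unweighted parsimony.

```python
ALPHABET = "ACGT"


def site_to_prob(symbol):
    """Map one aligned symbol to a probability vector over A, C, G, T."""
    if symbol in ALPHABET:
        return [1.0 if symbol == base else 0.0 for base in ALPHABET]
    # Assumption: a gap or unknown symbol becomes a uniform distribution.
    return [1.0 / len(ALPHABET)] * len(ALPHABET)


def seq_to_prob(seq):
    """Map an aligned sequence to a list of per-site probability vectors."""
    return [site_to_prob(symbol) for symbol in seq.upper()]


def l1_distance(p, q):
    """Halved l1 distance summed over sites (1 per definite mismatch)."""
    if len(p) != len(q):
        raise ValueError("sequences must be aligned to the same length")
    return 0.5 * sum(
        abs(a - b) for ps, qs in zip(p, q) for a, b in zip(ps, qs)
    )


def midpoint(p, q):
    """A candidate internal node: the site-wise average of two nodes."""
    return [[(a + b) / 2 for a, b in zip(ps, qs)] for ps, qs in zip(p, q)]


x = seq_to_prob("ACGT")
y = seq_to_prob("AC-A")
ancestor = midpoint(x, y)
print(l1_distance(x, y))
print(l1_distance(x, ancestor) + l1_distance(ancestor, y))
```

The midpoint is itself a valid point of the space (each site still sums to 1), which is what allows internal nodes to be placed anywhere in the continuous space rather than only at discrete sequences.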
Interactions between hirudin and thrombin have unique characteristics (Rydel et al., 1990). Firstly, the interaction domain between hirudin and thrombin in the complex is large, which probably accounts for the high affinity and selectivity of hirudin for thrombin. Secondly, the last 16 residues of hirudin are in an extended conformation and bind to an anion-binding exosite on the surface of thrombin that extends from the active site and is the secondary fibrinogen binding site. Steered molecular dynamics simulations (Leech et al., 1996; Grubmuller et al., 1996; Stepaniants et al., 1997) were performed to study the interactions between hirudin and thrombin. Our results showed that the two domains of hirudin interact with thrombin together to stabilize the hirudin/thrombin complex. Moreover, the C-domain of hirudin interacted with thrombin more tightly and effectively than the N-domain. Lys-47h, Ser-50h, Asn-52h, Asp-53h, Phe-56h and Glu-57h of hirudin were the key residues for the binding of the complex, which gives us effective and simple templates for designing anti-thrombosis medicines.
[Figure: Pulling directions of steered molecular dynamics. Yellow points represent the centres of mass (COM) of parts of the complex; arrows indicate the pulling directions.]
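The abstract does not state the pulling protocol, so the following Python sketch only illustrates the general idea of pulling along a direction defined by two centres of mass. It assumes the common constant-velocity form of steered molecular dynamics, in which a harmonic spring with constant k moves at speed v along a unit vector n and exerts the force F = k (v t - (x(t) - x(0)) · n). All function names and numerical values are hypothetical.

```python
import math


def center_of_mass(coords, masses):
    """Mass-weighted mean position of a group of atoms."""
    total = sum(masses)
    return tuple(
        sum(m * c[i] for c, m in zip(coords, masses)) / total
        for i in range(3)
    )


def unit_vector(start, end):
    """Unit vector pointing from one centre of mass to another."""
    diff = [e - s for s, e in zip(start, end)]
    norm = math.sqrt(sum(d * d for d in diff))
    if norm == 0.0:
        raise ValueError("start and end coincide; direction is undefined")
    return tuple(d / norm for d in diff)


def smd_force(k, v, t, com_now, com_start, direction):
    """Spring force magnitude in constant-velocity pulling (assumed form)."""
    displacement = sum(
        (c - s) * n for c, s, n in zip(com_now, com_start, direction)
    )
    return k * (v * t - displacement)


# Toy example: two equal-mass atoms, pulled along +x.
com = center_of_mass([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [1.0, 1.0])
n = unit_vector((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
print(com, n, smd_force(2.0, 0.5, 4.0, (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), n))
```

In an actual study the force would be evaluated by the simulation package at every time step; the sketch only makes explicit how a centre of mass and a pulling direction enter the expression.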
References
Grubmuller, H., Heymann, B., Tavan, P., 1996. Ligand binding: molecular mechanics calculation of the streptavidin-biotin rupture force. Science 271, 997–999.
Leech, J., Prins, J., Hermans, J., 1996. SMD: visual steering of molecular dynamics for protein design. IEEE Comp. Sci. Eng. 3, 38–45.
Rydel, T.J., Ravichandran, K.G., Tulinsky, A., Bode, W., Huber, R., Roitsch, C., Fenton, J.W., 1990. The structure of a complex of recombinant hirudin and human alpha-thrombin. Science 249, 277–280.
Stepaniants, S., Izrailev, S., Schulten, K., 1997. Extraction of lipids from phospholipid membranes by steered molecular dynamics. J. Mol. Model. 3, 473–475.
doi:10.1016/j.jbiotec.2008.07.162
References
Ewens, W.J., Grant, G.R., 2005. Statistical Methods in Bioinformatics: An Introduction. Springer, USA.
Felsenstein, J., 2004. Inferring Phylogenies. Sinauer Associates, Inc., Sunderland, MA, USA.
Hwang, F.K., Richards, D.S., Winter, P., 1992. The Steiner Tree Problem. Elsevier Science Publishers B.V., The Netherlands.
doi:10.1016/j.jbiotec.2008.07.161
I2-P-005
Novel methods for gene network study based on meta-analysis of microarray data
Joshua S. Yuan ∗ , Susie Y. Dai
Genomics Hub, University of Tennessee, Knoxville, TN, United States
E-mail address: [email protected] (J.S. Yuan).
Gene networks are important for integrating transcriptomics data to identify the key regulatory components in biological systems. Here we present our platform for building gene networks based on meta-analysis of microarray data. Different microarray datasets for the same or related biological treatments are first standardized. The datasets are then filtered to remove outliers and genes present in fewer than half of the microarrays. A weighted mean method is used to transform the datasets so that they can be combined for further analysis. The combined meta-dataset can then be analyzed by data-mining methods including clustering, decision trees, and neural networks. In particular, we developed an automatic package to calculate the correlation coefficients between genes and/or to group genes according to multivariate