Available online at www.sciencedirect.com
Protein–protein interactions: from global to local analyses
WP Kelly1 and MPH Stumpf1,2

For the increasing number of species with complete genome sequences, the task of elucidating their complete proteomes and interactomes has attracted much recent interest. Although the proteome describes the complete repertoire of proteins expressed, the interactome comprises the pairwise protein–protein interactions that occur, or could occur, within an organism, and forms a large-scale sparse network. Here we discuss the challenges provided by present data, and outline a route from global analysis to more detailed and focused studies of protein–protein interactions. Carefully using protein-interaction data allows us to explore its potential fully alongside the evaluation of mechanistic hypotheses about biological systems.

Addresses
1 Centre for Bioinformatics, Imperial College London, London, United Kingdom
2 Institute of Mathematical Sciences, Imperial College London, London, United Kingdom

Corresponding author: Kelly, WP ([email protected]) and Stumpf, MPH ([email protected])
Current Opinion in Biotechnology 2008, 19:396–403
This review comes from a themed issue on Systems biology
Edited by Jaroslav Stark
Available online 6th August 2008
0958-1669/$ – see front matter © 2008 Elsevier Ltd. All rights reserved.
DOI 10.1016/j.copbio.2008.06.010
Introduction
At the cellular and molecular level, biological structure and function are the product of complex interactions between proteins and other molecules. As a result, the study of biological networks, and the use of graph theory to represent these interactions, has become one of the central concepts for the description of biological systems [1–3]. Although protein–protein interaction data are notoriously noisy and incomplete [4,5], there have been numerous reports highlighting the use of protein-interaction network data (see Figure 1) in analyzing and understanding complex molecular phenotypes (e.g. [6,7]). Following several major experimental surveys prior to 2006, the past two years have seen a rise in more targeted, problem-specific experimental interaction studies. There has been a wealth of studies performing in-depth functional and evolutionary analyses of protein-interaction data, along with developments in statistical methodology for the analysis of such networks and their integration with other biological and biochemical data. This trend is likely to continue, and here we review this recent work, which we expect to integrate much more tightly with other systems biology approaches than has been the case thus far.

Below we discuss three related aspects of protein–protein interaction networks in particular: (i) issues surrounding the collation, curation, and verification of protein–protein interactions; (ii) detailed characterization of individual protein–protein interactions in light of the structure, biological role, and function of the proteins involved; and (iii) their use in systems biology for studying molecular (disease) processes and as scaffolding for studying dynamical properties of cellular systems.
Sources and challenges of protein-interaction data
Mapping the structure of biological networks is experimentally challenging and requires considerable resources and effort. Large-scale protein-interaction network data of model organisms became available only with the rise of high-throughput technologies, which now produce thousands of putative interactions each year. This is in contrast with the hundreds of individual interactions reported in total just a few years ago. For the well-developed prokaryotic and eukaryotic model organisms there are now several large protein-interaction network datasets. Saccharomyces cerevisiae in particular has been widely studied, with recent studies apparently reaching saturation [8,9] for the first time with available technology (see Figure 2). However, achieving this level of coverage of the network represents the work of dozens of groups over many years (e.g. [10–13]). Coimmunoprecipitation and mass spectrometry [12,14] and yeast two-hybrid [10,11] have been extensively used to identify large numbers of putative protein interactions in S. cerevisiae. The different assays for the detection of physical interactions and the relevant bioinformatics databases have been reviewed comprehensively by Shoemaker et al. [15]. However, the small overlap of these datasets was surprising, suggesting that the data should be treated with care [16,17]. In addition to experimental methodologies, a host of in silico analyses have been proposed and applied in order to predict novel protein–protein interactions. Computationally predicting protein–protein interactions from genome sequence data alone is extremely challenging [18,19], and even the most successful methods suffer from very high false-positive and false-negative rates [20].
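The small overlap between interaction datasets mentioned above can be quantified in several ways. A minimal sketch in Python, assuming each dataset is simply a list of unordered protein pairs and using the Jaccard index as one possible overlap measure (the protein names and the two toy datasets are hypothetical, not taken from any of the cited studies):

```python
def normalise(pairs):
    """Treat interactions as unordered pairs and drop self-interactions."""
    return {frozenset(p) for p in pairs if len(set(p)) == 2}


def jaccard_overlap(dataset_a, dataset_b):
    """Shared interactions divided by all distinct interactions reported."""
    a, b = normalise(dataset_a), normalise(dataset_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy datasets standing in for two high-throughput screens.
screen_1 = [("A", "B"), ("B", "C"), ("C", "D")]
screen_2 = [("B", "A"), ("C", "E"), ("D", "E")]

overlap = jaccard_overlap(screen_1, screen_2)
print(f"Jaccard overlap: {overlap:.2f}")
```

A low value on its own does not separate experimental error from differences in which part of the network each assay surveys, so a measure of this kind is only a starting point.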
Figure 1
Sample protein-interaction network. The nodes represent proteins, while each edge represents an interaction reported between the two proteins. The data are from [62,63] forming a subset of the BIOGRID database. Different colors represent different biological processes.
More promising methods employ state-of-the-art techniques from statistical learning theory, bioinformatics, and evolutionary biology, and base their predictions on the information available in extensively curated bioinformatics data resources. These typically aim to transfer knowledge of interactions in a model organism to another species of interest. For example, if two proteins A and B have been reported to interact in a model species (such as S. cerevisiae), and if orthologous proteins A′ and B′ can be found in a different species, then the interaction is also assigned to the new species if certain conditions, such as sequence similarity, are met [20–22]. This is clearly a sensible starting point, but the limitations are also evident: unreliability in the interaction data will be propagated across species, and it may be difficult to reconcile conflicting data. In addition to the well-studied eukaryotic (S. cerevisiae, Drosophila melanogaster, Caenorhabditis elegans, and Plasmodium falciparum) and prokaryotic (Escherichia coli and Helicobacter pylori) model organisms, large-scale protein-interaction network maps have recently been published for a range of other species, including Campylobacter jejuni [23] and Arabidopsis thaliana [24], through a mixture of experiments and model-based inference. The human interactome has seen a variety of comparative analyses [25] as well as new means of inference from model organisms [26], which complement the small number of large-scale assays, though the total coverage is thought to be around 0.2% [27]. S. cerevisiae physical interactions have
Figure 2
Accrual of yeast protein–protein interactions over time. The number of reported interactions found in the BIOGRID yeast database over the past 30 years shows the impact of a small number of high-throughput studies on the overall data. Red indicates novel interactions while yellow bars represent the reported interactions that have been published before.
Figure 3
been reported at the fastest rate, owing to the relative ease and maturity of high-throughput techniques, and there have been two extensive studies reporting sets of multiprotein complexes [8,28]. Several online repositories exist, often with slightly different foci, which make the various interaction data available [29–34]. On closer inspection, it appears that the high-throughput studies survey different subsamples of the overall network, intrinsically limiting the level of overlap that could be observed [35]. Problems with ignoring the coverage of each method are combined with different interpretations of mass spectrometry data, and ambiguous usage of the term ‘interaction’ [36]. These all hinder the separation of experimental error from experimental bias. Different interpretations of interactions in purified protein complexes [37] further complicate the interpretation of experimental protein-interaction data [38].

The growth in available protein-interaction network data is illustrated in Figure 2, which shows that over 10,000 novel interactions have been reported in S. cerevisiae in each of the past three years. Such a representation of the data and how they have been assembled over time offers a useful starting point for considering how our knowledge about interactomes grows. It is also invaluable if we want to evaluate the quality of the overall interactome data assembled to date. For example, the complete S. cerevisiae interactome is a sparse network on approximately 6000 nodes, with around 18 million possible pairwise interactions, and there are over 70 000 different interactions reported in the BIOGRID databank. However, the consensus view of the size of the yeast interactome is smaller than the number of distinct reported interactions [9,27,39]. The task is now to find effective means of differentiating between false-positive reports and true-positive pairwise interactions.
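The figure of around 18 million possible pairwise interactions follows directly from the number of unordered pairs among roughly 6000 proteins. A short sketch of the arithmetic, assuming self-interactions are excluded and using the rounded values quoted in the text (6000 proteins, 70 000 reported interactions):

```python
from math import comb


def possible_pairs(n_proteins: int) -> int:
    """Number of unordered pairs of distinct proteins, n * (n - 1) / 2."""
    return comb(n_proteins, 2)


def reported_fraction(n_reported: int, n_proteins: int) -> float:
    """Fraction of all possible pairs that have been reported as interacting."""
    return n_reported / possible_pairs(n_proteins)


n_yeast_proteins = 6000   # approximate value quoted in the text
n_reported = 70_000       # lower bound quoted in the text

pairs = possible_pairs(n_yeast_proteins)
fraction = reported_fraction(n_reported, n_yeast_proteins)
print(f"possible pairs: {pairs:,}")
print(f"fraction reported: {fraction:.4f}")
```

Even if every reported interaction were a true positive, well under 1% of all possible pairs would be connected, which is the sense in which the network is sparse.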
The early enthusiasm for the interpretation of network structure has waned as reliability problems have become more noticeable. Comparing multiply verified interactions with the whole dataset shows that the characteristics of the structure change [40]. Moreover, recent mathematical results and empirical studies have shown that, in general, a subnet sampled from the true interactome will have different properties and structural characteristics from the true network [5,41]. Studying available protein-interaction datasets therefore potentially provides only limited information about the structure of the interactome. Moreover, the practice of combining different datasets may ignore the limitations and potential inconsistencies in different biochemical assays.
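The effect of incomplete sampling can be illustrated with a toy simulation. The sketch below is not the analysis of [5,41]; it assumes a hypothetical random network (2000 nodes, 10 000 edges) and node sampling, where only interactions between two sampled proteins are observed, and it looks only at the simplest property, the mean degree:

```python
import random


def random_network(n_nodes, n_edges, rng):
    """Toy network: n_edges distinct edges chosen uniformly at random."""
    edges = set()
    while len(edges) < n_edges:
        u, v = rng.sample(range(n_nodes), 2)
        edges.add((min(u, v), max(u, v)))
    return edges


def induced_subnet(edges, kept_nodes):
    """Keep only the edges whose two endpoints were both sampled."""
    return {(u, v) for u, v in edges if u in kept_nodes and v in kept_nodes}


def mean_degree(edges, n_nodes):
    return 2 * len(edges) / n_nodes if n_nodes else 0.0


rng = random.Random(0)          # fixed seed so the sketch is reproducible
n_nodes, n_edges = 2000, 10_000
full_net = random_network(n_nodes, n_edges, rng)

kept = set(rng.sample(range(n_nodes), n_nodes // 2))   # sample half the proteins
sub_net = induced_subnet(full_net, kept)

print("mean degree, full network:", mean_degree(full_net, n_nodes))
print("mean degree, sampled subnet:", mean_degree(sub_net, len(kept)))
```

Sampling half of the nodes roughly halves the mean degree, because each neighbour of a sampled protein is itself retained only about half the time. Properties such as the shape of the degree distribution can be affected in less obvious ways, which is the point made in the text.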
Characterizing protein–protein interactions The global analysis of protein-interaction networks is increasingly being replaced by more detailed analyses of protein–protein interactions, their determinants, characteristics, and effects. Such analyses, firstly, can inform us about the functional role of particular interactions and secondly, will hopefully enable us to predict protein–protein interactions (and their roles) in silico. Given that prediction approaches are largely based on learning statistical patterns which distinguish pairs of interacting proteins from pairs of noninteracting proteins, in silico inference of protein–protein interactions is intimately linked to understanding the mechanisms and effects of protein interactions. Strictly, we should speak of protein–domain interactions rather than protein interactions, and the importance of structural bioinformatics for the prediction of protein– protein interaction continues to increase [42,43]. However, difficulties of protein structure prediction carry over to the structure-based prediction of protein–protein interactions, though some progress has been made in determining probable interaction surfaces of proteins and protein domains [44]. The analysis of pairs of interacting proteins in terms of the genomic and functional characteristics of the participant proteins is more straightforward. Measures of coexpression, shared transcription-factor-binding sites, gene ontology (GO) classifications, etc. have been used to (i) assess the influence of the network on the properties of the constituent nodes [45,46] and (ii) determine what distinguishes pairs of interacting proteins from pairs of noninteracting proteins [47]. In such studies, we typically compare the true network with networks that have similar statistical properties; here the definition of the random network ensemble may crucially influence the downstream statistical analysis [48]. 
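One commonly used random network ensemble keeps the degree of every protein fixed and randomizes who interacts with whom. The sketch below shows a degree-preserving edge swap on a hypothetical toy network; it is only one possible null model, and the source does not state which ensemble was used in the cited studies:

```python
import random
from collections import Counter


def degree_sequence(edges):
    """Map each node to its number of interaction partners."""
    degrees = Counter()
    for u, v in edges:
        degrees[u] += 1
        degrees[v] += 1
    return degrees


def rewire(edges, n_swaps, rng):
    """Randomize a simple undirected network by double edge swaps.

    Each swap replaces edges (a, b) and (c, d) with (a, d) and (c, b),
    so every node keeps its degree. Swaps that would create a
    self-interaction or a duplicate edge are rejected.
    """
    edge_list = [tuple(sorted(e)) for e in edges]
    edge_set = set(edge_list)
    done, attempts = 0, 0
    while done < n_swaps and attempts < 100 * n_swaps:
        attempts += 1
        i, j = rng.sample(range(len(edge_list)), 2)
        a, b = edge_list[i]
        c, d = edge_list[j]
        if rng.random() < 0.5:          # randomize the orientation of one edge
            c, d = d, c
        if a == d or c == b:
            continue
        new_1, new_2 = tuple(sorted((a, d))), tuple(sorted((c, b)))
        if new_1 == new_2 or new_1 in edge_set or new_2 in edge_set:
            continue
        edge_set -= {edge_list[i], edge_list[j]}
        edge_set |= {new_1, new_2}
        edge_list[i], edge_list[j] = new_1, new_2
        done += 1
    return edge_set


# Hypothetical toy network: a ring of 10 proteins plus two extra interactions.
toy_edges = [(i, (i + 1) % 10) for i in range(10)] + [(0, 5), (2, 7)]
randomised = rewire(toy_edges, n_swaps=50, rng=random.Random(1))
print(sorted(randomised))
```

A statistic measured on the observed network can then be compared with its distribution over many such randomized networks. A different choice of ensemble, for example one that fixes only the number of edges, can lead to different conclusions, which is the caveat raised in [48].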
Figure 3 shows one such measure for similarity among properties of interacting proteins: the fraction of reported interactions between proteins with the same or different GO annotations in S. cerevisiae. There is an expected bias [49] for interactions to occur between proteins with the same (or biologically similar) annotations. For molecular function over 20% of reported protein interactions with known functions fall into this class. This is similar to the proportion for components and process annotations. However, among the novel interactions reported in the BIOGRID data, only 36 097 (approximately half of the total distinct reported interactions) are found between proteins
Yeast interactome by GO annotations. These heatmaps show the fractions of reported interactions (out of the theoretical maximum of such interactions) between proteins with different GO annotations; histograms on the right show the total number of proteins within each annotation class (the GO slim annotations were used in order to deal with the otherwise hierarchical GO structure). Although we find a clear increase of within-category interactions, pairs of proteins with different annotations nevertheless predominate; knowledge of such across-category interactions can be biologically important but may compromise naive interaction prediction (or alternatively function prediction) approaches.
Figure 4
Illustration of interconnected networks. In this hypothetical example dissolving an interaction between protein products of genes A and B may lead to transcriptional activation of D and E which in turn act as enzymes for a metabolic reaction and transcriptional inhibitor for gene B, respectively. If, however, the proteins corresponding to genes C and D undergo a protein-protein interaction, D no longer inhibits the transcription of B. The two protein interactions, AB and CD, are, however, not observed simultaneously and depend strongly on regulatory interactions. Generally, protein-interaction, metabolic and gene-regulatory networks are all interlinked.
with known functions. Within this subset of the data, 8380 of the interactions are found between proteins of the same function. The remaining interactions involve proteins which, according to current annotation, have different functions, contribute to different biological processes, or are predominantly found in different cellular components (Figure 4). Such observations will necessarily have an impact on in silico predictors of protein–protein interactions. This has recently been noted by Ben-Hur and Stafford Noble [50], who compared choosing negative sets of interactors on the basis of cellular localization with randomly selecting pairs that do not appear in the databases. They found that the former did indeed produce high-quality sets of noninteracting proteins, as would essentially any set that involves different annotations (different function, different process, etc.). However, it also biased the output of protein–protein interaction predictors/classifiers, which may actually predict whether proteins are localized in different components rather than whether they truly interact. The danger of potential circular arguments in function and interaction prediction is, of course, also present when interpreting GO annotations that were based on protein–protein interactions in the first place.
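The second strategy discussed above, random selection of pairs that do not appear in the databases, can be sketched as follows. This is a minimal illustration rather than the procedure of [50]; the protein names and the list of known interactions are hypothetical, and a pair being absent from a database is only assumed, not known, to be noninteracting:

```python
import random
from math import comb


def sample_negative_pairs(proteins, known_interactions, n_pairs, rng):
    """Draw distinct unordered protein pairs that are not reported to interact."""
    proteins = sorted(set(proteins))
    protein_set = set(proteins)
    known = {frozenset(pair) for pair in known_interactions}
    # Known interactions between two distinct proteins from the given list.
    in_scope = {p for p in known if len(p) == 2 and p <= protein_set}
    available = comb(len(proteins), 2) - len(in_scope)
    if n_pairs > available:
        raise ValueError("not enough unreported pairs to sample from")
    negatives = set()
    while len(negatives) < n_pairs:
        pair = frozenset(rng.sample(proteins, 2))
        if pair not in known:
            negatives.add(pair)
    return negatives


# Hypothetical proteins and reported interactions.
proteins = ["P1", "P2", "P3", "P4", "P5", "P6"]
reported = [("P1", "P2"), ("P2", "P3"), ("P4", "P5")]

negative_set = sample_negative_pairs(proteins, reported, 5, random.Random(0))
print(sorted(tuple(sorted(pair)) for pair in negative_set))
```

A localization-based negative set would add a filter requiring the two proteins to carry different cellular-component annotations. As noted in the text, that filter is exactly what a classifier may then learn to detect.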
Using protein-interaction networks in systems biology
In the past, protein-interaction networks have been studied predominantly in isolation from other networks, but this appears to be changing rapidly. We know that transcription and transcriptional regulation, metabolic processes, and signaling cascades all require and invoke protein–protein interactions (see Figure 4). By this token, in order to make physiological sense, regulatory interaction networks, metabolic networks, signaling networks, and protein–protein interaction networks cannot be considered in isolation or as independent entities. Rather, we have to incorporate their intricate interwoven structure: proteins act alone or in combination as transcription factors and regulators of protein abundances, as enzymes they catalyze and coordinate the basic cellular metabolic processes, and they react to external and internal stimuli, activating other proteins in signaling cascades. All of these processes in turn provide cues that may lead to the formation or termination of protein interactions and complexes. At the moment, the nature of the available data does not allow us to elucidate such detail, but increasingly we are able to, for example, recognize the need for certain posttranslational modifications, such as phosphorylation, of the involved proteins before certain interactions can be realized. Additionally, experimental approaches are being developed which enable us to collect and analyze time-resolved, quantitative, and condition-dependent protein-interaction data [49,51].

In lieu of quantitative and time-resolved analysis of protein–protein interaction networks and their dynamics, one typically resorts to using protein-interaction networks in the analysis of system-level data. Generally, the nodes in the protein-interaction network are annotated using, for example, expression data, knockout or knockdown phenotype data, and gene ontology
annotations [52]. Such a combination of functional data with network data can be used to implicate additional genes in the cellular phenotype (e.g. those which are not significantly differentially expressed/regulated but interact with several other genes that are), or make predictions about as yet unassayed phenotypes. This may become interesting particularly when looking at complex phenotypes which are hard to study in, for example, humans: recent protein-interaction network studies have, for example, focused on analyses of interactions among proteins which are known to be involved in development or certain human diseases [53,54]. Even if these interactions themselves may not be frequently instrumental in modulating phenotypes, tracing them can highlight the involved components of biological systems. Moreover, their knowledge can inform experimental and theoretical analysis of small-scale biological systems such as signaling pathways [55,56], stress response mechanisms [57], and cell motility [58]. This trend toward more detailed and better resolved analyses is also apparent in the evolutionary analysis of protein–protein interactions. Global analyses are only able to detect some general but weak — and statistically often poorly defined — trends. However, a recent analysis of the bZIP transcription factor family [59], for example, has resulted in much more detailed insights into the evolution of protein interactions [60] than has previously been possible. Given the biological importance of protein–protein interactions in, for example, stress response and host– parasite interactions [60,61], we feel that we will see more such analyses of smaller but highly specific sets of protein– protein interactions in order to generate or test mechanistic hypotheses for disease-related, developmental, and signaling processes in model organisms and humans.
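The idea of implicating genes that are not themselves differentially expressed but interact with several genes that are can be sketched as a simple neighbour count. The example below is a hypothetical illustration: the gene names, the flagged set, and the threshold of two flagged neighbours are all assumptions, and the studies cited above use more elaborate statistical scoring:

```python
from collections import defaultdict


def implicate_by_neighbours(edges, flagged, min_flagged_neighbours=2):
    """Return unflagged genes with at least the given number of flagged partners.

    edges   : iterable of (gene, gene) interactions
    flagged : genes already implicated, e.g. differentially expressed ones
    """
    neighbours = defaultdict(set)
    for u, v in edges:
        if u != v:                      # ignore self-interactions
            neighbours[u].add(v)
            neighbours[v].add(u)
    flagged = set(flagged)
    candidates = {}
    for gene, partners in neighbours.items():
        n_flagged = len(partners & flagged)
        if gene not in flagged and n_flagged >= min_flagged_neighbours:
            candidates[gene] = n_flagged
    return candidates


# Hypothetical interaction data and expression results.
interactions = [("A", "B"), ("A", "C"), ("A", "D"), ("E", "B"), ("E", "F")]
differentially_expressed = {"B", "C", "D"}

candidates = implicate_by_neighbours(interactions, differentially_expressed)
print(candidates)
```

Here gene A is reported because three of its partners are flagged, while E, with a single flagged partner, is not. Any such list inherits the false positives and sampling biases of the underlying interaction data.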
Conclusions Present protein-interaction datasets appear to highlight a trade-off between data quality and quantity: protein– protein interaction data are still incomplete and plagued by high error rates. Nevertheless, given sufficient care and statistical sophistication, we can start to employ this information usefully. Although they may offer only limited descriptions of the true interactome, they do provide increasingly reliable and useful information, especially on a smaller scale. A crucial step in the future development of interactomes of model organisms will be the integration of protein–protein, protein–DNA, and protein– small molecule interactions.
References and recommended reading
Papers of particular interest, published within the period of review, have been highlighted as: of special interest; of outstanding interest.

1. Alm E, Arkin A: Biological networks. Curr Opin Struct Biol 2003, 13(2):193-202.
2. de Silva E, Stumpf MPH: Complex networks and simple models in biology. J R Soc Interf/R Soc 2005, 2(5):419-430.
3. Schlitt T, Brazma A: Modelling gene networks at different organisational levels. FEBS Lett 2005, 579:1859-1866.
4. Bader J, Chaudhuri A, Rothberg J, Chant J: Gaining confidence in high-throughput protein interaction networks. Nat Biotechnol 2004, 22(1):78-85.
5. de Silva E, Thorne T, Ingram PJ, Agrafioti I, Swire J, Wiuf C, Stumpf MPH: The effects of incomplete protein interaction data on structural and evolutionary inferences. BMC Biology 2006, 4(39).
6. Brun C, Chevenet F, Martin D, Wojcik J, Guénoche A, Jacq B: Functional classification of proteins for the prediction of cellular function from a protein–protein interaction network. Genome Biol 2003, 5(1):R6.
7. Drees BL, Thorsson V, Carter GW, Rives AW, Raymond MZ, Avila-Campillo I, Shannon P, Galitski T: Derivation of genetic interaction networks from quantitative phenotype data. Genome Biol 2005, 6(4):R38.
8. Gavin A-C, Aloy P, Grandi P, Krause R, Boesche M: Proteome survey reveals modularity of the yeast cell machinery. Nature 2006, 440:631-636.
9. Hart G, Ramani A, Marcotte EM: How complete are current yeast and human protein-interaction networks? Genome Biol 2006, 7:120.
10. Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, Knight JR, Lockshon D, Narayan V, Srinivasan M, Pochart P et al.: A comprehensive analysis of protein–protein interactions in Saccharomyces cerevisiae. Nature 2000, 403(6770):623-627.
11. Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, Sakaki Y: A comprehensive two-hybrid analysis to explore the yeast protein interactome. PNAS 2001, 98(8):4569-4574.
12. Gavin A-C, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick J, Michon A-M, Cruciat C-M et al.: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 2002, 415(6868):141-147.
13. Lappe M, Holm L: Unraveling protein interaction networks with near-optimal efficiency. Nat Biotechnol 2004, 22(1):98-103.
14. Ho Y, Gruhler A, Heilbut A, Bader G, Moore L, Adams S-L, Millar A, Taylor P, Bennett K, Boutilier K et al.: Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 2002, 415(6868):180-183.
15. Shoemaker BA, Panchenko AR: Deciphering protein–protein interactions. Part I. Experimental techniques and databases. PLoS Comput Biol 2007, 3(3):e42.
This review describes the different experimental techniques of protein interaction identification together with various databases, which attempt to classify the large array of experimental data.
16. von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P: Comparative assessment of large-scale data sets of protein–protein interactions. Nature 2002, 417(6887):399-403.
17. Bader GD, Hogue CWV: Analyzing yeast protein–protein interaction data obtained from different sources. Nat Biotechnol 2002, 20(10):991-997.
18. Lu L, Xia Y, Paccanaro A, Yu H, Gerstein M: Assessing the limits of genomic data integration for predicting protein networks. Gen Res 2005, 15:945-953.
19. Mika S, Rost B: Protein–protein interactions more conserved within species than across species. PLoS Comput Biol 2006, 2(7):e79.
The authors compare the evolutionary conservation of interologs, observing that homology-based inferences of physical protein–protein interactions appeared far less successful than expected.
20. Pazos F, Ranea J, Juan D, Sternberg M: Assessing protein co-evolution in the context of the tree of life assists in the prediction of the interactome. J Mol Biol 2005, 352:1002-1015.
21. Albert I, Albert R: Conserved network motifs allow protein–protein interaction prediction. Bioinformatics 2004, 20(18):3346-3352.
22. Gertz J, Elfond G, Shustrova A, Weisinger M, Pellegrini M, Cokus S, Rothschild B: Inferring protein interactions from phylogenetic distance matrices. Bioinformatics 2003, 19(16):2039-2045.
23. Parrish JR, Yu J, Liu G, Hines JA, Chan JE, Mangiola BA, Zhang H, Pacifico S, Fotouhi F, DiRita VJ et al.: A proteome-wide protein interaction map for Campylobacter jejuni. Genome Biol 2007, 8(7):R130.
24. Geisler-Lee J, O’Toole N, Ammar R, Provart N: A predicted interactome for Arabidopsis. Plant Physiol 2007, 145:317-329.
25. Futschik ME, Chaurasia G, Herzel H: Comparison of human protein–protein interaction maps. Bioinformatics 2007, 23(5):605-611.
The authors present a robust comparative analysis of eight large-scale maps with a total of over 10 000 unique proteins and 57 000 interactions included, revealing a small, significant overlap.
26. Ramani AK, Li Z, Hart GT, Carlson MW, Boutz DR, Marcotte EM: A map of human protein interactions derived from coexpression of human mRNAs and their orthologs. Mol Syst Biol 2008, 4:180 doi: 10.1038/msb.2008.19.
The authors report a set of 7000 physical associations among human proteins inferred from indirect evidence: the comparison of human mRNA co-expression patterns with those of orthologous genes in five other eukaryotes.
27. Stumpf MPH, Thorne T, de Silva E, Stewart R, An H, Lappe M, Wiuf C: Estimating the size of the human interactome. PNAS 2008, 105(19):6959-6964.
28. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP et al.: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 2006, 440(7084):637-643.
This presents a map of 547 different protein complexes alongside a map of 7123 protein–protein interactions.
29. Stark C, Breitkreutz B-J, Reguly T, Boucher L, Breitkreutz A, Tyers M: BioGRID: a general repository for interaction datasets. Nucleic Acids Res 2006, 34:D535-D539 doi: 10.1093/nar/gkj109.
30. Bader G, Donaldson I, Wolting C, Ouellette B: BIND: the biomolecular interaction network database. Nucleic Acids Res 2001, 29(1):242-245.
31. Salwinski L, Miller CS, Smith AJ, Pettit FK, Bowie JU, Eisenberg D: The database of interacting proteins: 2004 update. Nucleic Acids Res 2004, 32:D449-D451.
32. Mishra GR, Suresh M, Kumaran K, Kannabiran N, Suresh S, Bala P, Shivakumar K, Anuradha N, Reddy R, Raghavan TM et al.: Human protein reference database: 2006 update. Nucleic Acids Res 2006, 34:D411-D414.
33. Kerrien S, Alam-Faruque Y, Aranda B, Bancarz I: IntAct-open source resource for molecular interaction data. Nucleic Acids Res 2006, 00:D1-D5.
34. Chatr-aryamontri A, Ceol A, Montecchi Palazzi L, Nardelli G, Victoria Schneider M, Castagnoli L, Cesareni G: MINT: the molecular interaction database. Nucleic Acids Res 2007, 35:D572-D574.
35. Gentleman R, Huber W: Making the most of high throughput protein-interaction data. Genome Biol 2007, 8(10):112.
The authors review the estimation of coverage and error rate in HTP PPI data, arguing that reports of the low quality of such data are substantially based on misinterpretations.
36. Hakes L, Robertson DL, Oliver SG: Effect of dataset selection on the topological interpretation of protein interaction networks. BMC Genomics 2005, 6:131.
37. Goll J, Uetz P: The elusive yeast interactome. Genome Biol 2006, 7(6):223 URL: http://genomebiology.com/2006/7/6/223.
38. Hakes L, Robertson DL, Oliver SG, Lovell SC: Protein interactions from complexes: a structural perspective. Comp Funct Genomics 2007:49356.
39. Grigoriev A: On the number of protein–protein interactions in the yeast proteome. Nucleic Acids Res 2003, 31(14):4157-4161.
40. Hakes L, Pinney JW, Robertson DL, Lovell SC: Protein–protein interaction networks and biology: what’s the connection? Nat Biotechnol 2008, 26(1):69-72.
41. Stumpf MPH, Wiuf C, May R: Subnets of scale-free networks are not scale-free: sampling properties of networks. PNAS 2005, 102(12):4221-4224.
42. Lange OF, Lakomek N-A, Farès C, Schröder GF, Walter KFA, Becker S, Meiler J, Grubmüller H, Griesinger C, de Groot BL: Recognition dynamics up to microseconds revealed from an RDC-derived ubiquitin ensemble in solution. Science 2008, 320(5882):1471-1475.
43. Boehr DD, Wright PE: Biochemistry. How do proteins interact? Science 2008, 320(5882):1429-1430 doi: 10.1126/science.1158818 URL: http://www.sciencemag.org/cgi/content/full/320/5882/1429.
44. Hoskins J, Lovell S, Blundell TL: An algorithm for predicting protein–protein interaction sites: abnormally exposed amino acid residues and secondary structure elements. Protein Sci 2006, 15(5):1017-1029.
45. Agrafioti I, Swire J, Abbott J, Huntley D, Butcher S, Stumpf MPH: Comparative analysis of the Saccharomyces cerevisiae and Caenorhabditis elegans protein interaction networks. BMC Evol Biol 2005, 5(23).
46. Allan Drummond D, Raval A, Wilke CO: A single determinant dominates the rate of yeast protein evolution. Mol Biol Evol 2006, 23(2):327-337.
47. Salathe M, Ackermann M, Bonhoeffer S: The effect of multifunctionality on the rate of evolution in yeast. Mol Biol Evol 2006, 23(4):721-722.
48. Thorne T, Stumpf MPH: Generating confidence intervals on biological networks. BMC Bioinformatics 2007, 8(1):467.
49. Tarassov K, Messier V, Landry CR, Radinovic S, Molina MSM, Shames I, Malitskaya Y, Vogel J, Bussey H, Michnick SW: An in vivo map of the yeast protein interactome. Science 2008, 320(5882):1465-1470.
50. Ben-Hur A, Stafford Noble W: Choosing negative examples for the prediction of protein–protein interactions. BMC Bioinformatics 2006, 7(Suppl 1):S2.
This study focuses on the effects of a biological bias in creating negative interaction sets for interactome-modeling purposes.
51. Jones RB, Gordus A, Krall JA, MacBeath G: A quantitative protein interaction network for the ErbB receptors using protein microarrays. Nature 2006, 439(7073):168-174.
52. Linghu B, Snitkin ES, Holloway DT, Gustafson AM, Xia Y, DeLisi C: High-precision high-coverage functional inference from integrated data sources. BMC Bioinformatics 2008, 9:119.
53. Lim J, Hao T, Shaw C, Patel A, Szabó G, Rual J: A protein–protein interaction network for human inherited ataxias and disorders of Purkinje cell. Cell 2006, 125:801-814.
54. Calzone L, Gelay A, Zinovyev A, Radvanyl F, Barillot E: A comprehensive modular map of molecular interactions in RB/E2F pathway. Mol Syst Biol 2008, 4:173.
The study presents a detailed and curated map of molecular interactions taking place in the regulation of the cell cycle by the retinoblastoma protein (RB/RB1).
55. Schulze WX, Deng L, Mann M: Phosphotyrosine interactome of the ErbB-receptor kinase family. Mol Syst Biol 2005, 1:0008.
56. Uetz P, Stagljar I: The interactome of human EGF/ErbB receptors. Mol Syst Biol 2006, 2:0006.
57. Jovanovic G, Lloyd LJ, Stumpf MPH, Mayhew AJ, Buck M: Induction and function of the phage shock protein extracytoplasmic stress response in Escherichia coli. J Biol Chem 2006, 281(30):21147-21161.
58. Rajagopala SV, Titz B, Goll J, Parrish JR, Wohlbold K, McKevitt MT, Palzkill T, Mori H, Finley RL, Uetz P: The protein network of bacterial motility. Mol Syst Biol 2007, 3:128.
59. Pinney JW, Amoutzias GD, Rattray M, Robertson DL: Reconstruction of ancestral protein interaction networks for the bZIP transcription factors. PNAS 2007, 104(51):20449-20453.
This study shows how probabilistic modeling can provide a platform for the quantitative analysis of multiple protein interaction networks, being applied to the reconstruction of ancestral networks for the bZIP family of transcription factors.
60. Uetz P, Dong Y-A, Zeretzke C, Atzler C, Baiker A, Berger B, Rajagopala SV, Roupelieva M, Rose D, Fossum E, Haas J: Herpes viral protein networks and their interaction with the human proteome. Science 2006, 311(5758):239-242.
61. LaCount DJ, Vignali M, Chettier R, Phansalkar A, Bell R, Hesselberth JR, Schoenfeld LW, Ota I, Sahasrabudhe S, Kurschner C et al.: A protein interaction network of the malaria parasite Plasmodium falciparum. Nature 2005, 438(7064):103-107.
62. Yuan C, Yongkiettrakul S, Byeon IJ, Zhou S, Tsai MD: Solution structures of two FHA1-phosphothreonine peptide complexes provide insight into the structural basis of the ligand specificity of FHA1 from yeast Rad53. J Mol Biol 2001, 314(3):563-575.
63. Gurunathan S, David D, Gerst JE: Dynamin and clathrin are required for the biogenesis of a distinct class of secretory vesicles in yeast. EMBO J 2002, 21(4):602-614.