MISCELLANEA
The yeast two-hybrid system: prospects for protein linkage maps
Carlos Evangelista, Daniel Lockshon and Stanley Fields
The authors are at the Depts of Genetics and Medicine, Markey Molecular Medicine Center, Box 357360, University of Washington, Seattle, WA 98195-7360, USA.
Every cell contains a set of machines capable of carrying out processes such as replication or transcription, responding to the outside environment, generating usable energy or directing the traffic flow of macromolecules. These machines are like intricate three-dimensional jigsaw puzzles, forming arrays of interlocking protein components that assemble and disassemble over time and in response to complex signals. Understanding cellular function is advanced by knowledge of each assembly, which requires an inventory of its parts and a blueprint that details how these parts fit together. A simple genetic assay in yeast, the two-hybrid system1,2, detects protein-protein interactions and can be used to analyse many such assemblies. In principle, it may be possible to employ this assay to identify most of the components in these protein complexes, and the resulting networks of interactions are what we refer to as a ‘protein linkage map’.
Traditionally, assemblies of proteins have been analysed by using two complementary approaches. In the well-known analogy to understanding how a car runs, biochemists disassemble the engine, transmission and body, characterize all the pieces, and attempt to rebuild a working vehicle; geneticists, by contrast, break single components, turn the key and try to determine what effect the single missing part has on the car’s operation. These strategies have proved spectacularly successful in deciphering many of the essential processes of a cell, and undoubtedly will continue to be fruitful. However, the advent of genome sequencing projects of organisms ranging in complexity from mycoplasma to man (reviewed in Ref. 3) has brought thousands of new proteins to analyse, and such labour-intensive traditional procedures are difficult to apply to so many proteins.
This onslaught of genomic information calls for the development of efficient experimental strategies to gain insight into the organization and function of the encoded proteins.
© 1996 Elsevier Science Ltd. PII: S0962-8924(96)40002-2
Overview of the two-hybrid system
The two-hybrid system relies on the modular properties of eukaryotic site-specific transcriptional activators for the detection and analysis of protein-protein interactions (Fig. 1). One hybrid protein, consisting of a DNA-binding domain fused to some protein ‘X’, localizes to a reporter gene regulated by sites recognized by the DNA-binding domain. Another hybrid protein, consisting of a transcriptional activation domain fused to some protein ‘Y’, turns on expression of the reporter if an X-Y interaction occurs. In a search for proteins that bind to X (X is often termed the ‘bait’ protein), a vector encoding the transcriptional activation domain is used to construct a library of cDNA or genomic fragments fused to the DNA encoding this activation domain. Such a library is introduced into a yeast two-hybrid reporter strain containing the bait protein, and transformants are selected for growth due to expression of a reporter gene that encodes an essential enzyme (see Refs 4 and 5 for detailed protocols).
The two broad applications of this system as outlined above have been widely used (reviewed in Ref. 6). In the first, available genes are used in pairwise tests for protein-protein interaction. These reconstruction experiments, if they generate a positive transcriptional signal, allow various parameters of an interaction to be analysed (see below). Although the interactions must occur in the yeast nucleus for the assay to work, interactions that normally occur in other cellular compartments have been detected; even extracellular receptor-ligand combinations have been demonstrated7. Additionally, interactions that require a post-translational modification not normally occurring in yeast may be detected if the modifying activity - a protein kinase, for example - is supplied8.
In the second application, searches using bait proteins from a diverse set of organisms have identified novel partners for proteins that have an extensive range of functions6.
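The logic of the assay described above can be sketched as a small truth-table model. This is only an illustrative simplification, not a protocol: the names `Hybrid` and `reporter_active` are hypothetical, and the model assumes that neither hybrid activates the reporter on its own (it ignores, for instance, bait proteins that activate transcription by themselves).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hybrid:
    """A hybrid protein: a functional domain fused to a protein of interest."""
    domain: str   # "DBD" (DNA-binding domain) or "AD" (activation domain)
    partner: str  # the fused protein, e.g. "X" or "Y"


def reporter_active(hybrids, interactions):
    """Return True only if some DBD hybrid and some AD hybrid carry
    proteins that interact, which brings the AD to the reporter gene."""
    dbd_partners = {h.partner for h in hybrids if h.domain == "DBD"}
    ad_partners = {h.partner for h in hybrids if h.domain == "AD"}
    return any(
        frozenset((x, y)) in interactions
        for x in dbd_partners
        for y in ad_partners
    )


# Hypothetical interaction set: X binds Y, nothing else.
known = {frozenset(("X", "Y"))}

print(reporter_active([Hybrid("DBD", "X")], known))                     # False
print(reporter_active([Hybrid("AD", "Y")], known))                      # False
print(reporter_active([Hybrid("DBD", "X"), Hybrid("AD", "Y")], known))  # True
```

The three calls correspond to the three situations shown in Figure 1: a DNA-binding-domain hybrid alone, an activation-domain hybrid alone, and both hybrids together with an X-Y interaction.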
Reconstruction experiments with protein networks
What can be learned from applying the two-hybrid system to assemblies of known proteins? Perhaps the best example is the pheromone-response pathway in Saccharomyces cerevisiae, which includes a seven-transmembrane-spanning receptor, a G protein, a conserved set of protein kinases (homologous to the mitogen-activated protein (MAP) kinase module) and a transcription factor (reviewed in Ref. 9). Many of these components were identified initially from genetic experiments as being defective in strains unable to respond to the mating factors, and the corresponding genes and proteins have been extensively analysed. In order to understand better how this
FIGURE 1
The two-hybrid system. (a) A DNA-binding-domain-protein-X hybrid localizes to the reporter gene but does not activate transcription. (b) A transcriptional-activation-domain-protein-Y hybrid cannot localize to the reporter gene. (c) Protein-protein interaction between X and Y brings the activation domain into close proximity to the DNA-binding site that regulates the reporter gene and results in its transcriptional activation. If the gene for a candidate protein Y is available, it can be used in this assay to test directly whether Y binds to X. If it is not available, a library of cDNA or genomic fragments in the activation-domain vector can be used as the source of Y to identify new interactions of the bait protein X. The reporter gene commonly encodes an enzyme such as the yeast His3 protein, and transformants are selected on plates lacking histidine; many two-hybrid strains carry in addition the Escherichia coli lacZ gene as a reporter.
trends in CELL BIOLOGY (Vol. 6) May 1996
pathway operates, several laboratories have recently constructed sets of hybrids with the relevant proteins to assay for interaction. A summary of these findings is presented in Table 1 and an overview of the pathway derived from these and other data is shown in Figure 2. First, the two-hybrid analyses indicate which proteins can be in contact. For example, they indicate that Ste5p, a protein of unknown function, may simultaneously bind Fus3p or Kss1p (a MAP kinase), Ste7p (a MAP kinase kinase) and Ste11p (a MAP kinase kinase kinase), implying that Ste5p may act as a scaffold18-20. Second, the two-hybrid analyses delimit domains of interaction; different residues of Ste5p are involved in the interaction with Ste11p, Ste7p and Fus3p, while similar residues of Ste5p bind to the MAP kinases Fus3p and Kss1p18-20. Third, they appear to explain the phenotypic effects of previously identified point mutations. For example, Ste11pP279S is a mutated version of one of the protein kinases that is partially constitutive for the pheromone response, and it shows 16-fold increased reporter gene activity when paired with Ste5p19. Fourth, the two-hybrid studies examine the effect on protein interactions of deleting or overproducing other pathway components. Among these findings is the demonstration that the interaction of Ste7p with Ste5p does not require Ste11p18,20. However, overproduction of Ste5p increases reporter gene expression mediated by the Ste7p-Ste11p combination18,20. In considering a set of two-hybrid data such as that generated from the proteins of the pheromone-response pathway, the investigator should keep in mind that some of the interactions detected may be transitory in nature. The two-hybrid assay is highly sensitive, such that interactions identified by this method need not be reflective of a stable complex within the cell.
Additionally, when a given protein is involved in one interaction, it may not be able simultaneously to take part in a second interaction that is also detected with this method. Thus, evidence from other methodological approaches is important. In this regard, it is noteworthy that many of the interactions detected by the two-hybrid studies were also observed biochemically using glutathione-S-transferase-tagged versions of these proteins, thus validating the results from the genetic assay. A similar analysis of cell-cycle regulatory proteins21 has been carried
out using a mating procedure to test rapidly a large number of potential interactions. In this procedure, a collection of transformants of one yeast strain was obtained in which a defined set of different DNA-binding domain hybrids was present. A second collection, in a strain of opposite mating type, contained a defined set of different activation domain hybrids. The two collections were used to form all possible diploids, which contained one of each type of hybrid, and then these diploids were assayed for their expression of standard two-hybrid reporter genes. In the cell-cycle experiments, seven Drosophila proteins (termed CDIs, for cyclin-dependent protein kinase interactors) identified in two-hybrid searches as binding to either of two Drosophila cyclin-dependent kinases (CDKs) were analysed. The seven CDIs were paired with the two CDKs from Drosophila and a further five CDKs from human and S. cerevisiae. These pairings revealed several features of protein-protein interactions that are broadly relevant to the study of complexes. For example, a protein can contact conserved structural elements found in a family of proteins; one CDI interacted with six of the seven CDKs tested. Furthermore, the two-hybrid method can be used to further the understanding of the specificity of CDK-cyclin interactions; one of the CDIs is a novel cyclin that shows specificity for only two of the seven CDKs. Results from the two-hybrid studies can also provide the impetus to search for new homologues. For example, one of the Drosophila CDIs is a member of the cyclin-D family. Since the human cyclin D shows preference for CDK4, this result indicates that Drosophila may have a CDK4 homologue. Finally, a knowledge of multiprotein complexes can be built up by using the results showing interactions between different CDIs.
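The mating procedure amounts to filling in a matrix with one cell per diploid. The sketch below illustrates only that bookkeeping. The labels `CDK_1` ... `CDK_7` and `CDI_1` ... `CDI_7` are placeholders rather than real protein names, the assignment of CDKs to the DNA-binding-domain side is an assumption, and the toy interaction set is invented so that it merely echoes the two patterns mentioned above (one interactor positive with six of seven kinases, another positive with only two).

```python
from itertools import product


def mating_matrix(dbd_strains, ad_strains, interacts):
    """Pair every DNA-binding-domain hybrid with every activation-domain
    hybrid (one diploid per pair) and record the reporter read-out."""
    return {
        (bait, prey): interacts(bait, prey)
        for bait, prey in product(dbd_strains, ad_strains)
    }


def partners(matrix, prey):
    """List the baits that gave a positive signal with a given prey."""
    return sorted(bait for (bait, p), hit in matrix.items() if p == prey and hit)


# Placeholder labels, not real protein names.
cdks = [f"CDK_{i}" for i in range(1, 8)]
cdis = [f"CDI_{i}" for i in range(1, 8)]

# Invented read-outs for illustration only.
toy_positive = {(cdk, "CDI_1") for cdk in cdks[:6]}
toy_positive |= {("CDK_1", "CDI_2"), ("CDK_2", "CDI_2")}

matrix = mating_matrix(cdks, cdis, lambda bait, prey: (bait, prey) in toy_positive)

print(len(matrix))                      # 49 diploids for a 7 x 7 panel
print(len(partners(matrix, "CDI_1")))   # 6
print(partners(matrix, "CDI_2"))        # ['CDK_1', 'CDK_2']
```

In a real experiment the `interacts` callable would be replaced by the observed reporter phenotype of each diploid; the point of the sketch is that the number of diploids to score is the product of the sizes of the two collections.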
As the panel of known DNA-binding domain hybrids against which a given protein may be tested increases, the potential for new interactions to be revealed increases correspondingly21. Given that there are hundreds of strains carrying such hybrids available now, this pairwise approach is obviously powerful.
A bacteriophage T7 protein linkage map
Two limitations constrain the reconstruction approach when applied to ever larger networks of proteins: first, pairwise assays test only those proteins already identified, and, second,
TABLE 1 - INTERACTIONS OF COMPONENTS OF THE PHEROMONE-RESPONSE PATHWAY DETECTED BY THE TWO-HYBRID ASSAY
Protein-protein interactions    Refs
Gpa1p-Ste4p                     10
Ste4p-Ste18p                    10
Ste4p-Cdc24p                    11
Ste4p, Ste18p-Akr1p             12,13
Ste20p-Bem1p                    14
Ste20p-Cdc42p                   15
Cdc24p-Bem1p                    16
Ste4p-Ste5p                     17
Ste5p-Ste11p                    18-20
Ste5p-Ste7p                     18-20
Ste5p-Fus3p                     18-20
Ste11p-Fus3p                    18,19
Ste11p-Kss1p                    19
Ste7p-Fus3p                     18,19
Ste7p-Kss1p                     19
Kss1p-Ste12p                    19
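The interactions listed in Table 1 can be held as an undirected graph, which is one simple way to represent a protein linkage map. The sketch below is illustrative only: the function name `linkage_map` is hypothetical, the protein names are transcribed from Table 1, and the three-protein entry involving Akr1p is left out because the table does not state it as a simple pair.

```python
from collections import defaultdict

# Pairwise entries transcribed from Table 1 (the Ste4p, Ste18p-Akr1p
# entry is omitted because it is not given as a single pair).
TABLE_1_PAIRS = [
    ("Gpa1p", "Ste4p"), ("Ste4p", "Ste18p"), ("Ste4p", "Cdc24p"),
    ("Ste20p", "Bem1p"), ("Ste20p", "Cdc42p"), ("Cdc24p", "Bem1p"),
    ("Ste4p", "Ste5p"), ("Ste5p", "Ste11p"), ("Ste5p", "Ste7p"),
    ("Ste5p", "Fus3p"), ("Ste11p", "Fus3p"), ("Ste11p", "Kss1p"),
    ("Ste7p", "Fus3p"), ("Ste7p", "Kss1p"), ("Kss1p", "Ste12p"),
]


def linkage_map(pairs):
    """Build an undirected adjacency map: protein -> set of partners."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


graph = linkage_map(TABLE_1_PAIRS)

# Partners of the proposed scaffold protein Ste5p in this transcription.
print(sorted(graph["Ste5p"]))  # ['Fus3p', 'Ste11p', 'Ste4p', 'Ste7p']

# Proteins ranked by number of partners detected.
for protein in sorted(graph, key=lambda p: (-len(graph[p]), p))[:3]:
    print(protein, len(graph[protein]))
```

As the text cautions, an edge in such a graph records only that a two-hybrid signal was reported; it does not imply a stable complex or that all edges of one protein can be occupied at once.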
FIGURE 2
The pheromone-response system in Saccharomyces cerevisiae that leads to transcriptional induction of pheromone-responsive genes. The protein-protein interactions indicated are based on both genetic (including two-hybrid) and biochemical assays. The diagram implies neither that these are stable interactions nor that they all occur simultaneously.
constructing the appropriate hybrid genes entails considerable effort as the number of genes escalates. One means to circumvent these constraints is to construct libraries both of random DNA-binding domain hybrids and of random activation domain hybrids and use them to identify all interactions.
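A rough count shows why defined pairwise constructs become impractical as the number of genes grows. The sketch below is back-of-the-envelope arithmetic only; the function names are hypothetical, and the protein counts are the approximate figures quoted elsewhere in this article for phage T7 (about 55 proteins) and for yeast (about 7000 proteins).

```python
def ordered_pairs(n_proteins):
    """Number of DNA-binding-domain x activation-domain combinations
    when every protein is made in both hybrid forms (self-pairs included)."""
    return n_proteins * n_proteins


def unordered_pairs(n_proteins):
    """Number of distinct protein pairs, counting each pair once
    and including a protein paired with itself."""
    return n_proteins * (n_proteins + 1) // 2


for label, n in [("phage T7 (approx.)", 55), ("yeast (approx.)", 7000)]:
    print(label, ordered_pairs(n), unordered_pairs(n))
# phage T7 (approx.) 3025 1540
# yeast (approx.) 49000000 24503500
```

The quadratic growth in combinations, on top of the effort of building two hybrid constructs per gene, is the motivation for screening random libraries against each other instead of assembling every pair by hand.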
TABLE 2 - SOME INTERACTIONS DETECTED BETWEEN BACTERIOPHAGE T7 PROTEINSa
Gene products (rows): Gene 4 helicase/primase; Gene 4.7 protein; Gene 5 DNA polymerase; Gene 6.5 protein; Gene 19 terminase
Gene products (columns): 4; 4.7; 5; 6.5; 19
Matrix entries: six + and four - (cell positions are not recoverable in this copy)
aT7 proteins are indicated by their gene number, and function, if known. + indicates that a two-hybrid transcriptional signal was detected between either the full-length proteins or fragments of these proteins; - indicates that the pair was assayed in the two-hybrid system and did not yield a signal22. The gene 4.7 and 6.5 proteins are of unknown function.
This approach was applied to the Escherichia coli bacteriophage T7 (Ref. 22). The T7 genome of ~40 kb was used to generate the two libraries, and these libraries were transformed into yeast strains of opposite mating type to carry out the two-hybrid assay. Three strategies were used to identify interactions. First, 30 000 random DNA-binding domain hybrids were screened (ten at a time) against the entire activation domain library to identify 103 positives, which defined 19 different interactions among the ~55 proteins of the phage. Second, certain DNA-binding domain or activation domain hybrids were screened individually against the appropriate library of the other type; this strategy yielded three further interactions. Third, all DNA-binding domain hybrids characterized from the first two approaches that contained an in-frame T7 protein were paired with all similarly characterized activation domain hybrids, which identified three additional interactions. An example of some of these data can be seen in the matrix in Table 2; the total set of interactions identified constitutes what we term a protein linkage map. The T7 study revealed several interesting features of the phage. A set of six interactions was identified, connecting three proteins known to function in DNA replication or DNA packaging with two other T7 proteins (Table 2). These interactions suggest that assemblies of proteins can be constructed even though the function of some of the proteins was previously unknown. Another interesting result arising from this analysis is that two T7 proteins that interact, the gene 18.5 and gene 18.7 proteins, are encoded by the same DNA, using different reading frames, a finding that poses intriguing questions about their evolution. Unexpectedly, the results with T7 also have implications for the analysis of
protein structure: several interactions identified were between adjacent domains of the same polypeptide. These interactions are due presumably to the random fragmentation of protein-coding regions in the library construction, exposing interior surfaces that are then capable of finding their complementary surfaces, as must occur during protein folding. The conclusions from the T7 analysis are that a two-hybrid approach for screening a full-genome complement of proteins is viable, at least for organisms of this size, and that it can detect interactions that would not have been predicted from previous studies. These interactions can then be analysed by other approaches in order to determine what role they may play in vivo.
Future prospects
A strategy for using the two-hybrid system in the analysis of a fairly large set of proteins, for example, those involved in replication, transcription or cytoskeletal organization, might take advantage of several of the approaches described above. For proteins implicated in the process whose genes have been cloned, reconstruction experiments can be used to test all pairwise combinations for interaction. Proteins found to be implicated in interactions can be subjected to deletion analysis (or other mutagenic procedures) to delineate critical domains or residues; previously defined mutations can be constructed in the hybrids to determine if they affect interactions. Proteins that bridge, stabilize or interfere with interactions might be detected by their deliberate expression in yeast carrying the appropriate hybrid proteins. For the identification of new proteins involved in a given process, two-hybrid searches with known components as bait can be carried out. These new proteins can then be assayed in a matrix approach against
all the other proteins previously implicated, to build up a network of interactions. Specific proteins can also be tested against a large array of defined hybrids, containing proteins involved in any function and from any organism; if any positives are identified that derive from an organism different from that under study, potential homologues can be identified. A number of large DNA-sequencing projects are nearing completion; perhaps it is now feasible to contemplate a future when a protein linkage map of an entire cell, arraying the majority of proteins with their partners, is also available. This map might be derived by performing a scaled-up version of the experiments with bacteriophage T7, in which random libraries of DNA-binding and activation domain hybrids are screened against each other. What might such a map reveal? As novel proteins join known complexes or pathways, it may suggest functions for such proteins. New functions for well-studied proteins may be revealed by the identification of unexpected interactions. Cross-connections may become apparent as key proteins are found to relay information from one cellular process to another; new organizing principles of metabolic pathways may become evident. Certain protein-protein interactions may be identified as viable targets for therapeutic intervention in various diseases. The inability to assign a function to a substantial fraction of proteins encoded by even an organism as intensely analysed as Saccharomyces cerevisiae may mean that we do not know of the existence of some fundamental biological processes. In its headiest potential, a protein linkage map might uncover novel machines carrying out as yet unidentified functions. Problems of false positives and false negatives in the two-hybrid system clearly exist (outlined in Ref. 22), and the logistics of applying this assay to the set of ~7000 proteins, such as there are in yeast, are challenging.
With genome sequences in hand, however, the prospect of understanding proteins in a global fashion appears promising.

References
1 FIELDS, S. and SONG, O-K. (1989) Nature 340, 245-246
2 CHIEN, C-T., BARTEL, P. L., STERNGLANZ, R. and FIELDS, S. (1991) Proc. Natl Acad. Sci. USA 88, 9578-9582
3 JONES, S. J. M. (1995) Curr. Opin. Genet. Dev. 5, 349-353
4 BARTEL, P. L. and FIELDS, S. (1995) Methods Enzymol. 254, 241-263
5 VOJTEK, A. B. and HOLLENBERG, S. M. (1995) Methods Enzymol. 255, 331-342
6 FIELDS, S. and STERNGLANZ, R. (1994) Trends Genet. 10, 286-292
7 OZENBERGER, B. A. and YOUNG, K. H. (1995) Mol. Endocrinol. 9, 1321-1329
8 OSBORNE, M. A., DALTON, S. and KOCHAN, J. P. (1995) Bio/Technology 13, 1474-1476
9 HERSKOWITZ, I. (1995) Cell 80, 187-197
10 CLARK, K. L., DIGNARD, D., THOMAS, D. Y. and WHITEWAY, M. (1993) Mol. Cell. Biol. 13, 1-8
11 ZHAO, Z-S., LEUNG, T., MANSER, E. and LIM, L. (1995) Mol. Cell. Biol. 15, 5246-5257
12 KAO, L-R., PETERSON, J., JI, R., BENDER, L. and BENDER, A. (1996) Mol. Cell. Biol. 16, 168-178
13 PRYCIAK, P. M. and HARTWELL, L. H. Mol. Cell. Biol. (in press)
14 LEEUW, T. et al. (1995) Science 270, 1210-1213
15 SIMON, M-N., DEVIRGILIO, C., SOUZA, B., PRINGLE, J. R., ABO, A. and REED, S. I. (1995) Nature 376, 702-705
16 PETERSON, J., ZHENG, Y., BENDER, L., MYERS, A., CERIONE, R. and BENDER, A. (1994) J. Cell Biol. 127, 1395-1406
17 WHITEWAY, M. S. et al. (1995) Science 269, 1572-1575
18 MARCUS, S., POLVERINO, A., BARR, M. and WIGLER, M. (1994) Proc. Natl Acad. Sci. USA 91, 7762-7766
19 PRINTEN, J. A. and SPRAGUE, G. F., Jr (1994) Genetics 138, 609-619
20 CHOI, K-Y., SATTERBERG, B., LYONS, D. M. and ELION, E. A. (1994) Cell 78, 499-512
21 FINLEY, R. L., Jr and BRENT, R. (1994) Proc. Natl Acad. Sci. USA 91, 12980-12984
22 BARTEL, P. L., ROECKLEIN, J. A., SENGUPTA, D. and FIELDS, S. (1996) Nat. Genet. 12, 72-77

Acknowledgements
We thank Colin Manoil, Lee Hartwell and Peter Pryciak for comments on the manuscript. Work in the laboratory has been supported by grants from the NIH (CA28146 and GM54415) and Amgen, Inc.

Cells and viruses: an infectious relationship
Matthew Bui, Päivi M. Ojala and Gary Whittaker
The Keystone symposium on the ‘Cell Biology of Virus Entry, Replication and Pathogenesis’ provided a forum for an examination of the recurring and often fatal attraction between cells and viruses. Viruses have been useful tools to elucidate cell function, and, likewise, cell biology is central to the understanding of the virus life cycle. This symposium* was the third in a series of Keystone meetings held over the years to provide periodic updates on recent developments. The meeting attracted scientists from many disciplines, including the fields of virology, cell biology and immunology, creating an environment that allowed for a unique exchange of information and ideas. This multidisciplinary effort towards the study of cell-virus interactions provides the best hope for understanding the mechanisms of virus-induced disease.
Virus binding and penetration into cells
Viruses have evolved strategies to recognize and discriminate their target host cells by binding to existing cell-surface proteins, which may vary between tissue types. Identifying the virus receptor is essential to understanding the biology of the virus. Such information will be useful if viruses are to be used in tissue-specific gene targeting. If the contents of the symposium are any indication, the field of virology appears to have become reinvigorated in the search for cellular receptors. At least a dozen novel viral receptors were reported. One of the exciting new results presented was the identification of a secondary cellular receptor, in addition to CD4, used by human immunodeficiency virus (HIV-1) to infect cells. E. Berger (Bethesda, MD, USA) described an elegant cell fusion reporter system that identified a 45-kDa, seven-transmembrane-spanning (7TMS) protein belonging to the G-protein-coupled-receptor family. This protein behaves as a fusion accessory factor and determines the specific tropism of the HIV-1 strain that infects CD4+ T-cell lines1. Antibodies directed against a peptide of the 7TMS protein effectively block HIV-1 infection. Using the same procedure, the group is trying to find the cellular determinant for an HIV strain that has tropism for primary macrophages. The search for the herpes simplex virus (HSV) receptor has provided some hopeful candidates, as indicated by a few short presentations. R. Montgomery from the laboratory of P. Spear (Chicago, IL, USA) has identified and expressed a protein with characteristics of the tumour necrosis factor (TNF)/nerve growth factor (NGF) receptor family that allows HSV-1 to infect previously nonpermissive cells. In addition, G. Campadelli-Fiume (Bologna, Italy) used an anti-idiotypic antibody mimicking glycoprotein D (gD) to identify a 62-kDa cellular protein that is thought to interact with gD to enable spread of HSV-1 from cell to cell. Finally, R. Pietropaolo from the laboratory of T. Compton (Madison, WI, USA) demonstrated that glycoprotein B of the human β-herpesvirus, cytomegalovirus, binds specifically to cellular annexin II and incorporates it into the virus particle.
For enveloped viruses, there must be a molecular signal to induce membrane fusion after the virus has been bound to its receptor. One such trigger for fusion is acidic pH. Influenza virus haemagglutinin (HA) has served as the paradigm for viral glycoprotein-mediated, pH-dependent membrane fusion. D. Wiley (Boston, MA, USA) presented the crystal structure of the pH-dependent conformation of HA and showed the loop-to-helix transition that forms the triple-stranded coiled coil and delivers the fusion peptide 100 Å towards the target membrane2,3. The fusion-active conformation was suggested to be the most thermodynamically stable state of the HA molecule. For viruses that do not depend on pH for fusion, the molecular cue involves a conformational change in the viral envelope glycoprotein induced by interactions with the cellular receptor(s). J. White (Charlottesville, VA, USA), in collaboration with P. Bates (Philadelphia, PA, USA), showed that the Rous sarcoma virus envelope protein has an altered mobility on gel electrophoresis after interacting with the cellular receptor Tva (Ref. 4). The fusion properties appear to be receptor-, temperature- and time-dependent. Similarly, the HIV-1 gp120/gp41 envelope protein is also believed to undergo a conformational change via interaction with its receptor(s). E. Hunter (Baltimore, MD, USA) proposed that the two heptad repeats in HIV gp41 not only induce oligomerization but
© 1996 Elsevier Science Ltd. PII: S0962-8924(96)30005-6
*Cell Biology of Virus Entry, Replication and Pathogenesis. Santa Fe, NM, USA; 10-16 February 1996.
The authors are at the Dept of Cell Biology, Yale University School of Medicine, New Haven, CT 06520-8002, USA.