Gene 222 (1998) 213–222
Libraries of green fluorescent protein fusions generated by transposition in vitro G.V. Merkulov, J.D. Boeke * Department of Molecular Biology and Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA Received 27 April 1998; received in revised form 4 September 1998; accepted 8 September 1998; Received by I. Verma
Abstract
Two artificial transposons have been constructed that carry a gene encoding Green Fluorescent Protein and can be used for generating libraries of GFP fusions in a gene of interest. One such element, AT2GFP, can be used to generate GFP insertions in frame with the amino acid sequence of the protein of interest, with a stop codon at the end of the GFP coding sequence; AT2GFP also contains a selectable marker that confers trimethoprim resistance in bacteria. The second element, GS, can be used to generate tribrid GFP fusions because there is no stop codon in the GFP transposon, and the resulting fusion proteins contain the entire amino acid sequence encoded by the gene. The GS element consists of a gfp open reading frame and a supF amber suppressor tRNA gene; the supF portion of the GS transposon can be utilized as a selectable marker in bacteria. The supF sequence contains a fortuitous open reading frame, and thus it can be translated continuously with the gfp amino acid sequence. As a target for GFP insertions, we used a plasmid carrying the native Ty1 retrotransposon of the yeast Saccharomyces cerevisiae. The resulting multiple GFP fusions to Ty1 capsid protein Gag and Ty1 integrase were useful in determining the cellular localization of these proteins. Libraries of GFP fusions generated by transposition in vitro represent a novel and potentially powerful method to study the cell distribution and cellular localization signals of proteins. © 1998 Elsevier Science B.V. All rights reserved.
Keywords: Aequorea victoria; Tribrid proteins; Ty1; Yeast
* Corresponding author. Tel: +1 410 955 0398; Fax: +1 410 614 2987; e-mail: [email protected]
Abbreviations: AT, artificial transposon; BPTI, bovine pancreatic trypsin inhibitor; DAPI, 4,6-diamidino-2-phenylindole; EDTA, ethylene diamine tetra-acetate; GFP, green fluorescent protein; HA, hemagglutinin epitope tag; IN, integrase; ORF, open reading frame; PCR, polymerase chain reaction; PR, protease; RT, reverse transcriptase; Ty, transposon of yeast; VLP, virus-like particle.
1. Introduction
The Green Fluorescent Protein (GFP) from the jellyfish Aequorea victoria has been extensively used as a vital marker in studies of a number of biological processes (Cubitt et al., 1995). GFP is a 238-amino-acid-long protein encoded by the gfp gene of Aequorea victoria (Prasher et al., 1992). Excited by blue light at the wavelength of 395 nm, GFP emits green light. The emitted light can be observed in vivo since no substrates or cofactors are needed for the unique post-translational modification that generates the fluorescent chromophore (Cody et al., 1993). The post-translational modification producing the chromophore proceeds through cyclization of residues 65–67 and results in oxidation of tyrosine 66 by atmospheric oxygen. GFP has been successfully expressed in a number of organisms such as Caenorhabditis elegans (Chalfie et al., 1994), Drosophila (Wang and Hazelrigg, 1994), pathogenic mycobacteria (Dhandayuthapani et al., 1995; Kremer et al., 1995), budding yeast (Kahana et al., 1995; Niedenthal et al., 1996) and others, but the fluorescence intensity of wild-type GFP has been shown to be rather low. The selection of mutants with improved brightness, such as S65T GFP, helped solve this problem (Heim et al., 1995; Cormack et al., 1996, 1997). Proper folding of the GFP moiety may be another prerequisite for bright fluorescence (Crameri et al., 1996). The correct folding of GFP in the fusion protein may also depend on the site of fusion. In most cases, GFP has been fused either at the N- or at the C-terminus of the protein of interest, and often both approaches are used in the hope that one of them will generate a useful fusion protein. Sometimes, the GFP tag has to be inserted in the middle of the protein, between two domains, in order to retain the function of the protein into which GFP is inserted.
0378-1119/98/$ – see front matter © 1998 Elsevier Science B.V. All rights reserved. PII: S0378-1119(98)00503-4
For instance, the GFP tag can be inserted in a surface loop of the target protein, generating a tribrid. Another example of a protein that can be used for internal insertion is the bovine pancreatic trypsin inhibitor (BPTI ) (Borjigin and Nathans, 1993). BPTI has its N and C termini in close proximity in the three-dimensional structure and therefore is predicted not to disturb surrounding domains of the target protein. However, unlike GFP fusions, BPTI fusions must be detected indirectly in the form of complexes with trypsin and therefore require an external cofactor. Markers like GFP and BPTI could also be used to map functional topology of a protein. Constructing brightly fluorescent tagged derivatives while preserving the full functionality of the target protein may be a formidable task. One solution to this problem is to generate multiple gfp fusions to a gene of interest. A library of plasmids carrying gfp fusions could also be helpful in systematic characterization of protein localization on a genomic scale. Here, we describe a scheme for generating multiple insertions of gfp into a gene of interest by means of transposition in vitro. One of the two transposons designed for that purpose generates GFP fusions with a stop codon at the native C-terminus of GFP, whereas the other yields fusions without any stop codons in the GFP tag, generating an internal ‘tribrid’ fusion protein. The latter is advantageous since insertion does not truncate the protein to which GFP is fused, increasing the likelihood that the protein will function and will be localized correctly. We show that such tribrids can be obtained and are fluorescent. We selected six truncating GFP fusions to the Ty1 capsid protein, Gag, and two fusions to the Ty1 integrase, IN, from a library of insertions obtained by inspection of random colonies. 
The Ty1 Gag was localized in the cytoplasm, while Ty1 IN was localized predominantly to the nucleus with some integrase remaining in the cytoplasm, possibly in the form of Ty1 VLPs. Interestingly, one IN–GFP fusion with the GFP fusion point located 159 amino acid residues upstream of the integrase C-terminus did not localize to the nucleus, whereas the other IN–GFP fusion in which GFP was inserted 5 amino acids upstream of the C-terminus did translocate into the nucleus, consistent with the hypothesis that the nuclear localization signal is located in the 159 C-terminal amino acids of the Ty1 IN. This result agrees with other recent results on Ty1 IN localization obtained by independent means.
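The localization argument above is a simple deletion-mapping inference, and a minimal sketch may make the logic explicit. The function name `bracket_signal` and the tuple layout are illustrative assumptions, not part of the study; the only data used are the two IN–GFP fusion points quoted above (159 and 5 residues upstream of the integrase C-terminus). The sketch further assumes a single C-terminal signal and that loss of nuclear localization reflects loss of that signal rather than misfolding of the fusion.

```python
from typing import List, Optional, Tuple


def bracket_signal(fusions: List[Tuple[int, bool]]) -> Tuple[int, Optional[int]]:
    """Bracket a C-terminal localization signal from truncating fusions.

    Each entry is (residues_missing_from_C_terminus, localizes_correctly).
    Returns (lower, upper): losing `lower` residues is tolerated, losing
    `upper` residues is not, so part of the signal must lie between
    `lower` and `upper` residues from the C-terminus.
    """
    tolerated = [missing for missing, ok in fusions if ok]
    disruptive = [missing for missing, ok in fusions if not ok]
    lower = max(tolerated) if tolerated else 0
    upper = min(disruptive) if disruptive else None
    if upper is not None and upper <= lower:
        # A shorter truncation cannot be worse than a longer one under
        # the single-signal assumption.
        raise ValueError("data inconsistent with a single C-terminal signal")
    return lower, upper


# The two IN-GFP fusions described in the text (hypothetical encoding).
in_gfp_fusions = [(159, False), (5, True)]
lower, upper = bracket_signal(in_gfp_fusions)
print(f"Signal needs residues between {lower} and {upper} positions from the C-terminus")
```

More truncation points from a larger library would narrow the bracket in the same way.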
2. Materials and methods
2.1. Strains, plasmids and oligonucleotides
The bacterial strains, plasmids and oligonucleotides used in the study are listed in Table 1. The strains carrying plasmids with the bla marker (encoding ApR or ampicillin resistance) were propagated on LB plates containing ampicillin at a concentration of 50 µg/ml; the strains carrying plasmids with the dhfr marker (encoding TpR or trimethoprim resistance) were grown on LB plates containing trimethoprim at a concentration of 400 µg/ml. The MC1061/p3 cells containing supF on a plasmid were grown on LB plates with ampicillin at 20 µg/ml and tetracycline at 5 µg/ml. All antibiotic solutions were prepared fresh immediately before the plates were poured or liquid cultures were started. Ampicillin stock solution was made by dissolving ampicillin powder in water at a concentration of either 20 or 50 mg/ml; trimethoprim was dissolved in DMSO at 50 mg/ml, and tetracycline was dissolved in 50% ethanol at 5 mg/ml. Electroporation of bacterial cells was performed according to the instructions of the manufacturer of the MC1061/p3 strain (Clontech, Palo Alto, CA); after electroporation, the cells were incubated in LB with glucose and magnesium for longer periods of time to increase the recovery of transformants (1 h at 37°C and then 3 h at 25°C). Plasmids were purified either by the STET boiling method or by Triton X-100 lysis followed by centrifugation in a CsCl gradient. DNA was sequenced by the dideoxy chain termination technique using T7 DNA polymerase (Sequenase) from US Biochemicals (Cleveland, Ohio).
2.2. Construction of transposons AT2GFP and GS
To generate pAT2GFP, plasmid pJK6-3 (Table 1) was digested with BamHI, and a 0.7-kb fragment containing gfp was then ligated to pAT2 (Devine and Boeke, 1994) linearized with BamHI. This gfp gene contains the S65T mutation (Heim et al., 1995), which improves the signal in yeast and other cells. Plasmid pAT2GFP was selected by transformation of the DH5α strain to ampicillin resistance. To generate pBluescriptIIKS+GS, two PCR fragments were synthesized. One of the fragments was synthesized by using pAN7 as a template and oligonucleotide primers GM62 (JB1448) and GM63 (JB1449) in the PCR; the other was made by using pAT2GFP as a template and oligonucleotide primers GM64 (JB1450) and GM66 (JB1452) (Table 1). After these two PCR products were digested with SpeI, they were ligated, and the ligation product was PCR-amplified with oligonucleotides GM62 (JB1448) and GM66 (JB1452) (Table 1). The resulting 1-kb PCR fragment was digested with ClaI and XbaI and ligated with pBluescriptIIKS+ linearized with ClaI and XbaI. Plasmid pBluescriptIIKS+GS was selected by transformation of the DH5α strain to ampicillin resistance. To prepare transposon GS for the in-vitro transposition reaction, the MC1061/p3 strain was transformed with pBluescriptIIKS+GS to tetracycline resistance.
Table 1
Oligonucleotides, plasmids and strains
Name | Description | Reference
Oligonucleotides
GM61 (JB1436) | ggtgatccctgagcaggtgg | This study
GM62 (JB1448) | gcatcgatgggaacatgttcctgcagcccgggggatcc | This study
GM63 (JB1449) | gcactagtggatccttatatttgtatagttcatccat | This study
GM64 (JB1450) | gcactagtgtctttcggacttttgaaagtg | This study
GM66 (JB1452) | gctctagaccgaacatgttcccccaacgtaacactttacagcgg | This study
Plasmids
p3 | kanR ampam tetam, conjugative | Seed (1983)
pAN7 | supF | Lutz et al. (1987)
pAT2 | tmpR flanked by Ty1 U3 cassettes | Devine and Boeke (1994)
pAT2GFP | S65T mutant of gfp and tmpR flanked by Ty1 U3 cassettes | This study
pBluescriptIIKS+ | | Stratagene
pBluescriptIIKS+GS | S65T mutant of gfp and supF flanked by Ty1 U3 cassettes | This study
pJK6-3 | S65T mutant of gfp in pUC8 | D. Koshland
Bacterial strains
DH5α | F− deoR endA1 gyrA φ80dlacZΔM15 Δ(lacZYA-argF)U169 hsdR17 (rK− mK+) recA1 relA1 supE44 thi1 |
DH10B | F− araD139 Δ(ara, leu)7679 deoR endA1 galU galK ΔlacX74 λ− φ80dlacZΔM15 mcrA nupG Δ(mrr-hsdRMS-mcrBC) recA1 rpsL |
MC1061 | F− araD139 Δ(araABC-leu)7679 deoR+ galK galU kanR+ ΔlacX74 mcrB (rK− mK+) rpsL strA supo thi |
MC1061/p3 | MC1061 carrying plasmid p3 |
Yeast strain
YH8 | MAT a his3Δ200 leu2Δ1 trp1Δ1 ura3-167 |
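The cloning steps of Section 2.2 depend on restriction sites built into the primers of Table 1. As a quick check, the sketch below scans the primer sequences for those sites. The recognition sequences are the standard ones for each enzyme (XmnI cuts GAANN^NNTTC to leave blunt ends); the names `PRIMERS`, `SITES`, `find_sites` and `xmni_downstream_end` are illustrative, and the script is not part of the original methods.

```python
import re

# Primer sequences copied from Table 1 (5' to 3').
PRIMERS = {
    "GM61": "ggtgatccctgagcaggtgg",
    "GM62": "gcatcgatgggaacatgttcctgcagcccgggggatcc",
    "GM63": "gcactagtggatccttatatttgtatagttcatccat",
    "GM64": "gcactagtgtctttcggacttttgaaagtg",
    "GM66": "gctctagaccgaacatgttcccccaacgtaacactttacagcgg",
}

# Standard recognition sequences; N stands for any base.
# All five sites are palindromic, so scanning one strand is sufficient.
SITES = {
    "ClaI": "ATCGAT",
    "SpeI": "ACTAGT",
    "XbaI": "TCTAGA",
    "BamHI": "GGATCC",
    "XmnI": "GAANNNNTTC",
}


def find_sites(seq: str) -> dict:
    """Return {enzyme: [0-based start positions]} for sites present in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in SITES.items():
        # A lookahead reports overlapping matches; N becomes a base class.
        pattern = "(?=" + site.replace("N", "[ACGT]") + ")"
        positions = [m.start() for m in re.finditer(pattern, seq)]
        if positions:
            hits[enzyme] = positions
    return hits


def xmni_downstream_end(seq: str) -> str:
    """First 4 nt downstream of the blunt XmnI cut (GAANN^NNTTC)."""
    start = find_sites(seq)["XmnI"][0]
    return seq.upper()[start + 5:start + 9]


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

In this scan the XmnI site appears only in GM62 and GM66, the two outer primers, and the four bases downstream of the cut read TGTT, which matches the TGTT…AACA termini described for the excised transposons.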
2.3. Transposition of AT2GFP and GS into target plasmid pJEF1105 in vitro
Each in-vitro transposition reaction contained a target plasmid, an isolated artificial transposon consisting of an XmnI restriction fragment, and virus-like particles (VLPs, the source of Ty1 integrase). The target plasmid, pJEF1105 (Boeke et al., 1988), was isolated from DH5α cells by Triton X-100 lysis, followed by double banding in a CsCl/ethidium bromide gradient. Plasmid pAT2GFP was isolated from DH5α cells, and pBluescriptIIKS+GS was isolated from MC1061/p3 cells by the same method. Whereas E. coli cultures containing pAT2GFP could be grown to saturation, the cells carrying pBluescriptIIKS+GS had to be collected when they reached an A600 of 0.6. The cells carrying supF on a multicopy plasmid grow more slowly; even when grown on tetracycline, these cells may lose the supF marker at higher densities (Phadnis et al., 1989). Ten micrograms of pAT2GFP were digested with 20 units of XmnI and 20 units of BsaWI in a volume of 40 µl for 16 h at 37°C and then for 1 h at 65°C. BsaWI was added to digest a fragment that would otherwise comigrate with the transposon. Ten micrograms of pBluescriptIIKS+GS were digested with 20 units of XmnI and 20 units of SspI in a volume of 40 µl for 16 h at 37°C. SspI was added to digest a fragment that would otherwise comigrate with the transposon. The products of enzymatic digestion were separated on a 1% agarose/Tris–acetate gel (pH 8.0) using high-quality, low-melting-point (LMP) agarose. Some batches of LMP agarose might inhibit the in-vitro transposition reaction (Devine and Boeke, 1994); ultraPURE LMP agarose from Life Technologies (Gaithersburg, MD; Catalog number 5517UB) was used in this study. The 1603-bp AT2GFP transposon and 970-bp GS transposon were eluted from LMP agarose gel slices in an IBI Unidirectional Electroeluter (New Haven, CT; model UEA) using Tris–acetate buffer and a 75-µl 7.5 M ammonium acetate cushion with Bromophenol Blue. Electroelution was carried out at 120 V for 50 min; approximately 150 µl of ammonium acetate cushions containing the fragments (and Bromophenol marker dye) were collected, and DNA was precipitated by addition of three volumes of ethanol. After incubation at −70°C for 30 min, the DNAs were pelleted by centrifugation at 4°C, dried without washing and dissolved in 20 µl of water. The conditions for the in-vitro transposition reaction have been described elsewhere (Devine and Boeke, 1994). A reaction contained 2 µl of 10× reaction buffer (150 mM MgCl2, 100 mM Tris–HCl, pH 7.5, 100 mM KCl and 10 mM DTT), 4 µl of water, 100 ng of pJEF1105 in 1 µl, 3 µl of VLPs, prepared as described (Eichinger and Boeke, 1988), 5 µl of 20% PEG 8000 (w/v), and 100–500 ng of transposon in 5 µl. After incubation at 30°C for 1 h, the reaction was stopped by the addition of 1 µl of 0.5 M EDTA and 1 µl of 10 mg/ml of Proteinase K. Following incubation at 37°C for 15 min, the DNA was extracted with phenol/chloroform, precipitated with ethanol and resuspended in 20 µl of
water. One microliter was used to transform 10 µl of electrocompetent cells by electroporation. DH10B electrocompetent cells (Gibco/BRL) were used to select pJEF1105 with AT2GFP inserts in liquid media (complexity: approximately 2000 E. coli transformants) and MC1061/p3 electrocompetent cells (Clontech) were used to select pJEF1105 with GS inserts in liquid media (complexity: approximately 700 E. coli transformants). pJEF1105*AT2GFP and pJEF1105*GS libraries of plasmids were isolated by banding in CsCl. The libraries were then used to transform the YH8 strain of Saccharomyces cerevisiae (Table 1) to Ura+.
2.4. Immunoblotting
Cultures were grown at 30°C. The starting density was an A600 of 0.5, and the cells were collected when the density reached an A600 of 2 (about 16 h at 30°C). Immunoblotting was performed as described (Merkulov et al., 1996) using anti-Gag antibodies (Eichinger and Boeke, 1988) or JH695 antisera (Merkulov et al., 1996).
2.5. Microscopy
Images of live cells expressing GFP fusions were taken on a Zeiss Axioskop microscope (Carl Zeiss Inc., Thornwood, NY) using the appropriate excitation and emission filters.
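The composition of the transposition reaction in Section 2.3 can be checked with a little arithmetic. The sketch below sums the listed component volumes and derives the final buffer and PEG concentrations. It assumes that all component volumes are in microlitres and ignores any salts carried in with the VLP preparation or the DNA; the names `COMPONENTS_UL`, `BUFFER_10X_MM` and `final_concentration` are illustrative only.

```python
# Component volumes of one reaction, read as microlitres (assumption).
COMPONENTS_UL = {
    "10x reaction buffer": 2.0,
    "water": 4.0,
    "pJEF1105 target (100 ng)": 1.0,
    "VLPs": 3.0,
    "20% PEG 8000": 5.0,
    "transposon (100-500 ng)": 5.0,
}

# Composition of the 10x reaction buffer in mM, as listed in Section 2.3.
BUFFER_10X_MM = {
    "MgCl2": 150.0,
    "Tris-HCl pH 7.5": 100.0,
    "KCl": 100.0,
    "DTT": 10.0,
}


def final_concentration(stock: float, stock_volume: float, total_volume: float) -> float:
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock * stock_volume / total_volume


total_ul = sum(COMPONENTS_UL.values())
buffer_ul = COMPONENTS_UL["10x reaction buffer"]

# Final concentration of each buffer component in the assembled reaction.
final_mm = {
    name: final_concentration(stock, buffer_ul, total_ul)
    for name, stock in BUFFER_10X_MM.items()
}
# Final PEG 8000 concentration in % (w/v).
final_peg_percent = final_concentration(20.0, COMPONENTS_UL["20% PEG 8000"], total_ul)

print(f"total volume: {total_ul} ul")
print(final_mm)
print(f"PEG 8000: {final_peg_percent}% (w/v)")
```

Under these assumptions the listed volumes add up to a 20-µl reaction, so the 10× buffer is diluted exactly tenfold and PEG 8000 ends up at 5% (w/v).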
3. Results
3.1. Transposons AT2GFP and GS
Since the first genetic assay for the in-vitro transposition of Ty1 was developed, a number of artificial transposons have been designed, all sharing similar features such as blunt, inverted dinucleotide 5′-TG…CA-3′ ends and a selectable marker (Devine and Boeke, 1994; Devine et al., 1997; Kimmel et al., 1997). The blunt termini of specific sequence are directly involved in the integration reaction, whereas the selectable marker is necessary to select the products of reaction in bacteria, as the complete integration events in this system occur at a frequency of 10⁻⁴–10⁻³ of total target plasmids. The AT2GFP and GS transposons were constructed as described in Section 2. AT2GFP is a derivative of artificial transposon AT2 (Devine and Boeke, 1994), originally developed to make sets of random insertions into DNA to facilitate sequencing. Both AT2 and AT2GFP feature a dhfr gene that enables selection of DNAs with transposon insertions using trimethoprim, and two 4-bp TGTT…AACA inverted repeat termini generated by XmnI digestion. AT2GFP also carries a gfp inserted at the BamHI site, and there is a continuous open reading frame (ORF) extending from the transposon 5′ end through gfp (Fig. 1A). Thus, one in six AT2GFP insertions in any target ORF is expected to generate a productive GFP fusion; parts of the target gene downstream of the gfp insertion would not be translated because of the stop codon at the end of gfp. To facilitate the in-frame insertion of a non-truncating GFP tag at various positions in the middle of the protein coding region, a second transposon, GS, was made (Fig. 1B). GS contains the bacterial supF gene fused in-frame to gfp; supF can be used as a selectable marker in E. coli strains bearing the p3 Tetam Ampam plasmid. The in-frame fusion was possible because supF fortuitously contains an intact (but biologically meaningless) ORF in one of six possible reading frames. GS is 970 bp long; when inserted into target DNA, it acquires 5 bp due to the duplication that occurs at the integration site. Therefore, the target gene would have a 975-bp insertion (a multiple of 3 bp) that lacks stop codons; thus, the resulting protein would carry a simple insertion of 325 amino acid residues and would not be truncated. A number of fluorescent GFP fusions to Ty1 proteins have been selected from libraries of such insertions, as described below.
3.2. Generation of GAL-Ty1 plasmid libraries with gfp inserted at random positions
A plasmid (pJEF1105) based on the native Ty1 transposon was chosen as a target for transposon insertion, since little was known about intracellular distributions of the Ty1 proteins. This plasmid is useful because it can produce high levels of Ty1 proteins for immunoblot analysis (Fig. 1C). Recombinant products of AT2GFP integration into pJEF1105 were selected by transforming DH5α cells and plating them on to LB plates containing ampicillin and trimethoprim. GS transpositions were selected by transforming MC1061/p3 cells and plating them on to LB plates with ampicillin and tetracycline.
The recombinant colonies were counted and compared to total colonies on ampicillin plates to estimate the frequency of transposition. In both cases, the frequency of transposition was close to the transposition frequency of AT2, 10⁻⁴–10⁻³ (Devine and Boeke, 1994). Restriction analysis of individual clones was performed to verify in-vitro transposition. Almost all clones of the pJEF1105*GFP library contained plasmids of expected size, whereas only about 50% of the clones from the pJEF1105*GS library contained plasmids of expected size; the rest of the pJEF1105*GS clones were shorter, albeit with transposon inserted. The latter were not further characterized.
Fig. 1. (A) Map of the artificial transposon AT2GFP. Large arrow, direction of gfp translation; gray box, dhfr gene; small arrow, dhfr transcription start; x marks, stop codons; black boxes, U3 sequences. (B) Map and sequence of the artificial transposon GS. Large arrow, direction of translation; gray box, supF gene; small arrow, direction of supF transcription in E. coli; black boxes, Ty1 terminal TGTT…AACA sequences. (C) Genetic map of plasmid pJEF1105. Hatched box, GAL1 promoter; boxed triangles, LTR sequences; neo, Ty1 marker; URA3, vector selectable marker; 2 micron ori, yeast 2 µm plasmid origin of replication; PR, IN, and RT, regions encoding protease, integrase and reverse transcriptase; arrowheads, PR cleavage sites.
After YH8 yeast cells were transformed to Ura+ with the pJEF1105*AT2GFP library, they were replica-plated on to SC-Ura galactose plates. The plates were screened under a fluorescent microscope using a low-magnification objective (total magnification 63×) to detect colonies of fluorescing cells. Forty-four brightly fluorescing colonies were selected by screening a library estimated to carry about 2000 independent integration products. The 44 sequenced plasmids represented six classes. In all six of these plasmids, gfp was integrated into the gag ORF. Five fusion points out of six were located within 77 amino acid residues of the Gag C-terminus (the insertions fell between residues 363 and 433); the sixth fusion point was located just 72 amino acid residues downstream of the N-terminus. An immunoblot analysis of Gag–GFP fusions was carried out to verify the sequencing results; the sizes of Gag–GFP fusions determined by immunoblotting agree with the sequencing data (Fig. 3). No gfp fusions in Ty1 POL were detected in this library by this screening technique. Since the Gag:Pol ratio was maintained at about 20:1, it was possible that Gag–Pol–GFP fusions escaped attention merely because the colonies carrying Pol–GFP fusions were less bright than those expressing Gag–GFP. The clustering of the fluorescence-positive GAG insertions near the termini has two possible explanations: (1) insertions into GAG were non-random, or (2) insertions
into the middle of GAG resulted in unstable, and thus undetectable, fusion proteins. We strongly favor the latter explanation for three reasons: (1) a number of Ty1-based artificial transposons containing a variety of different internal sequences have now been shown to be highly random in their in-vitro integration pattern (Devine and Boeke, 1994; Devine et al., 1997), and thus we would expect these related artificial transposons to insert randomly as well; (2) previous studies in this laboratory have shown that insertion of even short sequences into the middle of the GAG region leads to loss of detectable Gag protein (presumably due to protein instability; D. Eichinger and JDB, unpublished data); and (3) Garraway et al. (1997) reported that a related AT2-GFP construct inserts pseudorandomly. We attempted to isolate additional plasmids expressing fluorescent GFP fusions with other parts of Ty1 ORFs. Approximately 100 additional colonies were screened by fluorescent microscopy with a high-magnification objective (total magnification 1000×). To
increase the likelihood of finding insertions in other parts of Ty1, only cells with a weaker fluorescence or unusual intracellular distribution of fluorescent material were chosen and further analysed. Some cells isolated in this second screen demonstrated an apparent nuclear localization of the fluorescent material. Seven colonies were picked; two of them represented insertions in the integrase (IN) region, and five were in the reverse transcriptase (RT) region (Fig. 2A).
3.3. Tribrid proteins
We next examined the library of GS insertions to try to identify insertions that produced tribrid GFP fusion proteins. Ten fluorescing colonies were isolated from the
pJEF1105*GS library, which was estimated (based on the number of transformants and the conditions of outgrowth) to carry roughly 300 independent integration products. Seven represented insertions in Ty1: two in Gag, three in protease (PR) and two in IN (Fig. 2A). The other three were found next to the 2µ origin of replication (data not shown). Sequencing of plasmid DNA (Fig. 2B) showed that all of the Gag–GS, PR–GS and IN–GS predicted fusion proteins were tribrids. The molecular weights of Gag–GS fusion proteins determined by immunoblot analysis using anti-Gag antisera agree with the sequencing data (Fig. 3). The Gag–GS fusion was identified by immunoblot analysis with JH695 antiserum raised against the 22 C-terminal amino acids of Gag (Fig. 3), indicating that the amino acids located downstream of the GS insertion were translated, proving that a tribrid protein was made. Ty1 Gag was not processed in the cells expressing PR–GS (Fig. 3), indicating that the GS insertion disrupted the function of PR. As judged by Gag processing, PR activity was not disrupted in the cells carrying the IN–GS fusion (Fig. 3). Thus, the GS transposon was integrated into the pJEF1105 plasmid in vitro, and the integration apparently produced fluorescent tribrid fusions.
Fig. 2. (A) Maps of Ty1 proteins fused to GFP. The Ty1 Gag and Pol processing products (Garfinkel et al., 1991; Merkulov et al., 1996) are shown on the top. Black boxes, GFP; gray boxes, amino acid sequence encoded by supF; numbers in white boxes, numbers of Ty1 amino acids preceding fusion points; numbers of fusions shown next to the maps. The Gag–GS fusion directs the synthesis of two proteins internally tagged with GFP, Gag and Gag–Pol; both are shown. (B) Nucleotide sequences surrounding GS insertion points. Black interrupted box, GS transposon; shaded triangles, U3 sequences.
Fig. 3. Immunoblots of Ty1 proteins fused to GFP. Cultures containing plasmids with Ty1–gfp fusions were grown at 30°C, and cells were pelleted, lysed, subjected to SDS–PAGE, transferred on to an Immobilon membrane, and incubated with anti-Gag or JH695 antiserum; the JH695 antiserum was raised against a synthetic peptide representing 28 C-terminal amino acids of Ty1 Gag. Numbers on the left show molecular masses (kDa) of marker proteins. Open circle, position of unprocessed Gag protein; filled circle, position of processed Gag protein.
3.4. Subcellular localization of fusion proteins
The intracellular localization of the N-terminal Gag–GFP fusion differs from that of the five C-terminal fusions. Whereas insertion 1 exhibits a uniform distribution of the fusion protein in the cytoplasm, insertions 2–6 show bright dots 12 h after induction with galactose. The distribution of the fusion protein does not change significantly with time in the cells of clone 1, but the Gag–GFP fusions from clones 2–6 can be observed as larger aggregates about 24 h post-induction (Fig. 4A, B). Furthermore, cells of clones 2–6, if broken, release a large number of bright fluorescent particles, whereas cells of clone 1 do not. The individual dots observed in the cells of clones 2–6 may represent individual particles that may form aggregates in the cytoplasm. The Gag–GFP fusion from clone 1 probably cannot assemble into particles, and this may explain why bright dots cannot be observed in the cytoplasm. The IN–GFP insertion 2, but not insertion 1, was observed in nuclei (Fig. 4C), in agreement with other studies from this lab and others showing that the nuclear localization signal of Ty1 IN is located in the C-terminal part of the protein (Kenna et al., 1998; Moore et al., 1998). RT/RH–GFP fusions showed cytoplasmic localization in a punctate pattern (data not shown), as did the
Gag–GS, PR–GS and IN–GS fusions. Gag–GS fusions appeared less bright than the Gag–GFP fusions. The GS insertion probably disrupted the nuclear localization signal in Ty1 IN because it was inserted in the middle of the NLS ( Kenna et al., 1998; Moore et al., 1998).
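The reading-frame reasoning behind both transposon designs (Section 3.1) is easy to verify numerically. The sketch below is a minimal illustration, assuming only the figures given in the text: two orientations and three frames per insertion site, a 970-bp GS element and a 5-bp target-site duplication. The function names `productive_fraction` and `insertion_footprint` are hypothetical.

```python
from typing import Optional, Tuple


def productive_fraction(orientations: int = 2, frames: int = 3) -> float:
    """Fraction of random insertions in an ORF that place the tag
    in the correct orientation and reading frame."""
    return 1.0 / (orientations * frames)


def insertion_footprint(
    transposon_bp: int, duplication_bp: int = 5
) -> Tuple[int, bool, Optional[int]]:
    """Total bp added to the target, whether the downstream reading
    frame is preserved, and the number of codons added (if preserved)."""
    total_bp = transposon_bp + duplication_bp
    preserves_frame = total_bp % 3 == 0
    added_codons = total_bp // 3 if preserves_frame else None
    return total_bp, preserves_frame, added_codons


print(productive_fraction())     # one in six insertions is in frame
print(insertion_footprint(970))  # GS element plus the 5-bp duplication
print(970 % 3)                   # the element alone is not a multiple of 3
```

The 5-bp duplication is what brings the GS footprint to a multiple of three; the 970-bp element on its own would shift the frame of everything downstream.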
Fig. 4. Cells expressing Ty1 proteins fused to GFP. Living cells were observed using an oil-immersion objective; they were excited with 365-nm light and photographed using a low-fluorescence filter passing wavelengths greater than 450 nm. (A) Cells expressing Gag–GFP 1 fusions. (B) Cells expressing Gag–GFP 2 fusions; panel on the left, 12 h of incubation; panel on the right, 24 h of incubation. (C) Cells expressing IN–GFP 2 fusions. First six panels, left to right: cells viewed in phase contrast; DAPI staining; GFP fluorescence. Last four panels, left to right: cells viewed in phase contrast; GFP fluorescence.
4. Discussion
The in-vitro transposition reaction driven by Ty1 integrase is a rapid method to generate multiple insertions of a DNA fragment into a plasmid. We report here two new artificial transposons used to generate expression libraries of GFP fusions. Transposon-based strategies for DNA sequencing have been developed using modified bacterial transposon Tn5 (Lutz et al., 1987; Phadnis et al., 1989; Kurnit and Seed, 1990), yeast retrotransposon Ty1 (Devine and Boeke, 1994; Devine et al., 1997; Kimmel et al., 1997) and other elements (Way et al., 1984; Ahmed, 1985; Adachi et al., 1987; Phadnis et al., 1989; Kleckner et al., 1991; Strathmann et al., 1991). Bacterial transposons are also being used as a tool for the systematic analysis of protein localization in the yeast Saccharomyces cerevisiae (Burns et al., 1994); random lacZ insertions throughout the genome were made by transposon mutagenesis in bacteria using a minitransposon derived from Tn3. The Tn3-derived transposons have also been engineered recently to introduce GFP and HA tags into yeast genes (Ross-Macdonald et al., 1997). In-vivo and in-vitro transposition techniques each have specific advantages and shortcomings. Whereas in-vivo mutagenesis does not require purification of recombination enzymes and has also been demonstrated to work on a genomic scale (Burns et al., 1994), the in-vitro approach currently offers reasonably high rates of transposition, great flexibility of transposon engineering, absence of interference with host factors, and control over the reaction (Devine and Boeke, 1994). Also, the enzymatic components are now commercially available, making it very accessible. Improvements to the in-vitro transposition technique might eventually raise the transposition frequency so that several per cent of targets are hit,
eliminating the need for selection of recombinants in bacteria and consequently eliminating the need for the selectable marker in the transposon; if so, the transposon could consist of only the tag gene such as gfp and the short flanking sequences involved in integration; colony hybridization with a gfp probe could be used to identify the insertions. Thus, the in-vitro transposition method has distinct advantages over the in-vivo technique in generating libraries of tag fusions. The two transposons tested in this study represent two approaches for generating libraries of tag fusions in vitro. One of them, AT2GFP, has a stop codon at the tag end and therefore generates truncated fusion proteins, whereas the other, the GS, lacks a stop codon and can be inserted in the beginning or in the midst of an ORF without truncation ( Fig. 1A and B). Both can be used for various applications; the truncating version, AT2GFP, can be employed to generate series of truncations and thereby map cellular localization signals, while the read-through transposon, GS, can be inserted in the middle of multi-domain proteins without jeopardizing their function, which may be especially useful in studies of essential genes. The GS transposon is capable of generating readthrough insertions thanks to the supF marker, which is probably the shortest selectable marker in bacteria; because supF is only 200–250 bp long, it was possible to find an open reading frame in it. SupF was fused to gfp, as shown in Fig. 1B. It would be possible to fuse supF with a tag gene in four different orientations/reading frames if stop codons in other frames were eliminated by site-directed mutagenesis. The relatively small size of supF may also improve the in-vitro transposition performance since shorter transposons might integrate with a slightly higher frequency than longer transposons. 
However, use of supF as a selectable marker also creates some problems; for instance, it has been reported previously that high-copy plasmids expressing supF may be lost in over-grown cultures. To overcome the problem, cells may be collected at lower densities, as we did in this study, or supF could be expressed from a low-copy vector. It is not known how the ‘unnatural’ polypeptide translated from supF folds, but several fluorescent GFP fusions were successfully generated with the GS transposon in this study. The GS fusion proteins appeared slightly less bright on average than similar AT2GFP insertions. This could be due to one of several factors: (1) GS fusion proteins are longer, (2) GS fusion proteins contain the unnatural supF peptide sequence, and (3) AT2GFP fusions have a native GFP C-terminus, whereas GS fusion proteins have a constrained GFP C-terminus. Six fluorescent Gag–GFP fusions were generated with the AT2GFP transposon. Interestingly, none of these fusions fell within the central 300 amino acids of the 440-residue-long Gag protein (Fig. 2A). The core of Gag is believed to be involved in protein–protein interactions and virus-like particle formation, and therefore, the GFP fusions in the Gag core probably had a misfolded GFP moiety or were unstable in cells and were not fluorescent as a result. Since the AT2 transposon, the prototype of AT2GFP, and several of its derivatives transpose near-randomly (Devine and Boeke, 1994; Devine et al., 1997), it is likely that GFP fusions to the central 300 amino acids of Gag were created but not isolated merely because they were not fluorescent. Future studies of GFP fusions may show whether specific parts of target proteins interfere with GFP folding. Our study shows that GFP fusion proteins and tribrids can be readily detected by visual inspection of colonies when expressed from a strong promoter in yeast. Even proteins expressed at a 20-fold lower level (POL insertions) could be easily detected microscopically. Other studies suggest that the low expression levels of many yeast genes may be sufficient for routine detection of GFP when expressed from their native promoters (R.K. Niedenthal, L. Riles, U. Gueldener, S. Klein, M. Johnston, J. Hegemann, pers. commun.). The system designed here should be generally helpful for overproduced proteins now and may become even more widely applicable as GFP detection technology improves. The gfp gene used in this study was an S65T mutant, resulting in brighter fluorescence (Heim et al., 1995). A number of GFP variants with shifted excitation and emission spectra have been reported (Heim et al., 1995; Cormack et al., 1996; Crameri et al., 1996; Heim and Tsien, 1996; Cormack et al., 1997) and can be exploited to make transposons analogous to AT2GFP and GS. Novel techniques in protein engineering, such as DNA shuffling (Crameri et al., 1996), are thought to be valuable aids in creating modified GFPs. GFPs with altered codon usage have also been tailored for expression in a variety of cell types and organisms (Cormack et al., 1997).
The two artificial transposons reported here contained only one type of GFP and can be easily modified to incorporate GFPs with improved characteristics.
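As a rough check on the argument above that the absence of fluorescent fusions in the Gag core is unlikely to reflect chance alone, the following sketch computes the probability that six independent insertions would all miss the central 300 residues of a 440-residue protein. It assumes perfectly uniform, independent insertion positions, which is an idealization of the near-random behaviour reported for AT2; the function name is hypothetical.

```python
def prob_all_outside(total_len, excluded_len, n_insertions):
    """Probability that n uniformly placed insertions all avoid a region.

    Assumes each insertion position is independent and uniformly
    distributed over total_len residues (an idealization).
    """
    fraction_outside = (total_len - excluded_len) / total_len
    return fraction_outside ** n_insertions


# Numbers taken from the text: 440-residue Gag, central 300 residues,
# six fluorescent Gag-GFP fusions.
p_miss_core = prob_all_outside(440, 300, 6)
print(f"P(all six outside the core) = {p_miss_core:.5f}")
```

Under these assumptions, the probability is about 0.001, which is consistent with the interpretation that core insertions occurred but were not recovered because they were not fluorescent.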
Acknowledgement We thank Dr S.E. Devine for the pAT2 vector and helpful discussions, and Dr C. Baker Brachmann for JH695 antiserum. This work was supported by NIH grants GM36481 and CA77812.
References Adachi, T., Mizuuchi, M., Robinson, E., Appella, E., O’Dea, M., Gellert, M., Mizuuchi, K., 1987. DNA sequence of the E. coli gyrB
gene: application of a new sequencing strategy. Nucleic Acids Res. 15, 771–784. Ahmed, A., 1985. A rapid procedure for DNA sequencing using transposon-promoted deletions in Escherichia coli. Gene 39, 305–310. Boeke, J.D., Xu, H., Fink, G.R., 1988. A general method for the chromosomal amplification of genes in yeast. Science 239, 280–282. Borjigin, J., Nathans, J., 1993. Bovine pancreatic trypsin inhibitor–trypsin complex as a detection system for recombinant proteins. Proc. Natl. Acad. Sci. USA 90, 337–341. Burns, N., Grimwade, B., Ross-Macdonald, P.B., Choi, E.Y., Finberg, K., Roeder, G.S., Snyder, M., 1994. Large-scale analysis of gene expression, protein localization, and gene disruption in Saccharomyces cerevisiae. Genes Dev. 8, 1087–1105. Chalfie, M., Tu, Y., Euskirchen, G., Ward, W.W., Prasher, D.C., 1994. Green fluorescent protein as a marker for gene expression. Science 263, 802–805. Cody, C.W., Prasher, D.C., Westler, W.M., Prendergast, F.G., Ward, W.W., 1993. Chemical structure of the hexapeptide chromophore of the Aequorea green-fluorescent protein. Biochemistry 32, 1212–1218. Cormack, B.P., Valdivia, R.H., Falkow, S., 1996. FACS-optimized mutants of the green fluorescent protein (GFP). Gene 173, 33–38. Cormack, B.P., Bertram, G., Egerton, M., Gow, N.A., Falkow, S., Brown, A.J., 1997. Yeast-enhanced green fluorescent protein (yEGFP) as a reporter of gene expression in Candida albicans. Microbiology 143, 303–311. Crameri, A., Whitehorn, E.A., Tate, E., Stemmer, W.P.C., 1996. Improved green fluorescent protein by molecular shuffling. Nature Biotechnology 14, 315–319. Cubitt, A.B., Heim, R., Adams, S.R., Boyd, A.E., Gross, L.A., Tsien, R.Y., 1995. Understanding, improving and using green fluorescent proteins. Trends Biochem. Sci. 20, 448–455. Devine, S.E., Boeke, J.D., 1994. Efficient integration of artificial transposons into plasmid targets in vitro: a useful tool for DNA mapping, sequencing and genetic analysis. Nucleic Acids Res. 22, 3765–3772.
Devine, S.E., Chissoe, S.L., Eby, Y., Wilson, R.K., Boeke, J.D., 1997. A transposon-based strategy for sequencing repetitive DNA in eucaryotic genomes. Genome Res. 7, 551–563. Dhandayuthapani, S., Via, L.E., Thomas, C.A., Horowitz, P.M., Deretic, D., Deretic, V., 1995. Green fluorescent protein as a marker for gene expression and cell biology of mycobacterial interactions with macrophages. Mol. Microbiol. 17, 901–912. Eichinger, D.J., Boeke, J.D., 1988. The DNA intermediate in yeast Ty1 element transposition copurifies with virus-like particles: cell-free Ty1 transposition. Cell 54, 955–966. Garfinkel, D.J., Hedge, A.M., Youngren, S.D., Copeland, T.D., 1991. Proteolytic processing of pol-TYB proteins from the yeast retrotransposon Ty1. J. Virol. 65, 4573–4581. Garraway, L.A., Tosi, L.R., Wang, Y., Moore, J.B., Dobson, D.E., Beverley, S.M., 1997. Insertional mutagenesis by a modified in vitro Ty1 transposition system. Gene 198, 27–35. Heim, R., Cubitt, A.B., Tsien, R.Y., 1995. Improved green fluorescence. Nature 373, 663–664. Heim, R., Tsien, R.Y., 1996. Engineering green fluorescent protein for improved brightness, longer wavelengths and fluorescence resonance energy transfer. Curr. Biol. 6, 178–182. Kahana, J.A., Schnapp, B.J., Silver, P.A., 1995. Kinetics of spindle pole body separation in budding yeast. Proc. Natl. Acad. Sci. USA 92, 9707–9711. Kenna, M.A., Brachmann, C.B., Devine, S.E., Boeke, J.D., 1998. Invading the yeast nucleus: a nuclear localization signal at the C terminus of integrase is required for transposition in vivo. Mol. Cell. Biol. 18, 1115–1124. Kimmel, B., Palazzolo, M.J., Martin, C., Boeke, J.D., Devine, S.E., Transposon-mediated DNA sequencing. In: Green, E., Birren, B., Myers, R., Hieter, P. (Eds.), Genome Analysis: A Laboratory
Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1997, pp. 455–532. Kleckner, N., Bender, J., Gottesman, S., 1991. Uses of transposons with emphasis on Tn10. Meth. Enzymol. 204, 139–180. Kremer, L., Baulard, A., Estaquier, J., Poulain-Godefroy, O., Locht, C., 1995. Green fluorescent protein as a new expression marker in mycobacteria. Mol. Microbiol. 17, 913–922. Kurnit, D.M., Seed, B., 1990. Improved genetic selection for screening bacteriophage libraries by homologous recombination in vivo. Proc. Natl. Acad. Sci. USA 87, 3166–3169. Lutz, C.T., Hollifield, W.C., Seed, B., Davie, J.M., Huang, H.V., 1987. Syrinx 2A: an improved lambda phage vector designed for screening DNA libraries by recombination in vivo. Proc. Natl. Acad. Sci. USA 84, 4379–4383. Merkulov, G.V., Swiderek, K.M., Baker Brachmann, C., Boeke, J.D., 1996. A critical proteolytic cleavage site near the C-terminus of the yeast retrotransposon Ty1 Gag protein. J. Virol. 70, 5548–5556. Moore, S.P., Rinckel, L.A., Garfinkel, D.J., 1998. A Ty1 integrase nuclear localization signal required for retrotransposition. Mol. Cell. Biol. 18, 1105–1114. Niedenthal, R.K., Riles, L., Johnston, M., Hegemann, J.H., 1996. Green fluorescent protein as a marker for gene expression and subcellular localization in budding yeast. Yeast 12, 773–786.
Phadnis, S.H., Huang, H.V., Berg, D.E., 1989. Tn5supF, a 264-base-pair transposon derived from Tn5 for insertion mutagenesis and sequencing DNAs cloned in phage lambda. Proc. Natl. Acad. Sci. USA 86, 5908–5912. Prasher, D.C., Eckenrode, V.K., Ward, W.W., Prendergast, F.G., Cormier, M.J., 1992. Primary structure of the Aequorea victoria green-fluorescent protein. Gene 111, 229–233. Ross-Macdonald, P., Sheehan, A., Roeder, G.S., Snyder, M., 1997. A multipurpose transposon system for analyzing protein production, localization, and function in Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 94, 190–195. Seed, B., 1983. Purification of genomic sequences from bacteriophage libraries by recombination and selection in vivo. Nucleic Acids Res. 11, 2427–2445. Strathmann, M., Hamilton, B.A., Mayeda, C.A., Simon, M.I., Meyerowitz, E.M., Palazzolo, M.J., 1991. Transposon-facilitated DNA sequencing. Proc. Natl. Acad. Sci. USA 88, 1247–1250. Wang, S., Hazelrigg, T., 1994. Implications for bcd mRNA localization from spatial distribution of exu protein in Drosophila oogenesis. Nature 369, 400–403. Way, J.C., Davis, M.A., Morisato, D., Roberts, D.E., Kleckner, N., 1984. Gene 32, 369–379.