Cell, Vol. 18, 257-266, October 1979, Copyright © 1979 by MIT

Overlapping Genes in RNA Phage: a New Protein Implicated in Lysis

Marian N. Beremand* and Thomas Blumenthal
Program in Molecular, Cellular and Developmental Biology and Department of Biology, Indiana University, Bloomington, Indiana 47405

We have identified a new 75 amino acid polypeptide (L protein) following f2 phage infection of E. coli. It is encoded by an out-of-phase overlapping gene which begins within the coat protein gene, ends in the replicase gene and covers the 36 base intercistronic space between them. A mutant f2 phage carrying a UGA mutation (op3), which complements mutations in the other three f2 genes (coat, A protein and replicase), fails to lyse cells (Model, Webster and Zinder, 1979) and also fails to produce L protein. Both lysis and L protein are restored following op3 infection of a UGA suppressor-containing strain or infection of wild-type bacteria with a revertant of op3. L protein is found in the insoluble fraction of artificially lysed cells. In this paper, we present the time course of its synthesis relative to the other f2-coded polypeptides: L protein synthesis increases as replicase synthesis decreases. We also report the discovery of another phage-coded polypeptide, which appears to be the product of a novel mode of translation: initiation at the coat protein initiation site, followed by translational frame shifting into the L protein frame and termination at the L protein terminus.

Introduction

The group I single-stranded RNA bacteriophages (including f2, MS2 and R17) have three known genes which code for the coat protein, the A protein and a subunit of the RNA replicase (Gussin, 1966; Horiuchi, Lodish and Zinder, 1966). These genes account for 90% of the genome. Most of the remaining 365 bases are divided between untranslated regions at the 3' and 5' ends of the RNA (Figure 1). However, the isolation and characterization of an f2 opal (UGA) mutant, op3, in a fourth complementation group suggests the existence of a fourth gene (accompanying paper by Model et al., 1979). The op3 phage, which complements mutations in the three known genes, is apparently physically identical to wild-type phage, but fails to lyse su- host cells (P. Model et al., quoted by Horiuchi, 1975).
* Present address: Department of Biological Chemistry, California College of Medicine, Irvine, California 92717.

Since the f2 RNA sequence lacks the capacity to code for a fourth protein in a conventional manner, an op3-defined lysis protein encoded by an overlapping gene seems probable. A review of the RNA bacteriophage literature provides some support for this hypothesis. Several potential overlapping reading sequences can be found in the MS2 RNA sequence (Fiers, 1975; Fiers et al., 1976), some of which are preceded by regions complementary to the 3' end of 16S rRNA, a property of most ribosome binding sites (Shine and Dalgarno, 1975a, 1975b; Steitz et al., 1977). Furthermore, unidentified small polypeptides have been detected following in vitro translation of f2, R17 and MS2 RNA (Lodish, 1970; Atkins et al., 1975).

Of the potential overlapping sequences identified, the one most likely to code for an op3-defined lysis protein is a region overlapping the 3' end of the coat gene, the 36 base intercistronic space and the 5' end of the replicase gene (Figure 2). The potential cistron has a strong ribosome binding site followed by a sequence coding for a 75 amino acid polypeptide terminating with a UAA codon. This site was suggested by the following observations. First, the first overlapping gene found was the E gene of φX174. This gene codes for a protein required for lysis (Barrell, Air and Hutchison, 1976). Comparison of the φX174 E protein sequence (89 residues) with the protein coded by the potential MS2 overlapping gene defined above (75 residues) reveals some homology with the region in the E protein thought to be directly involved in lysis (Barrell et al., 1976). Second, the op3 mutant phage showed altered regulation of replicase protein synthesis initiation (Horiuchi, 1975). Thus if the hypothesized gene location were correct, a single mutation near the beginning of the replicase cistron could result in both termination of the predicted protein and altered regulation of replicase synthesis.

The studies reported here show that f2 infection does induce the synthesis of a fourth polypeptide encoded by the overlapping gene described above.
The op3 mutation prevents the synthesis of this protein. Consistent with the lysis-deficient phenotype of f2 op3, this polypeptide is found in the insoluble fraction of artificially lysed cells. We also report the existence of another phage-coded polypeptide (protein 5) which we believe to be a hybrid of coat and L proteins.

Results

Synthesis of L Protein in Vivo

High resolution polyacrylamide gel electrophoresis reveals that, in addition to the three known phage-encoded proteins (replicase, A protein and coat), the synthesis of a fourth polypeptide is induced upon f2 infection of E. coli cells (Figure 3). This polypeptide, designated L protein, has an approximate (subunit) molecular weight of 11,000 daltons, based on its mobility relative to coat protein. Synthesis of L protein occurs in both the presence and absence of rifampicin (data not shown). Its synthesis is more readily observed in samples prepared from rifampicin-treated cells, due to the preferential reduction in the synthesis of host proteins. Rifampicin (final concentration 100 µg/ml) was therefore added to all cultures, both control and infected, 5 min after infection.

Origin of L Protein: Amino Acid Composition

The close genetic relationship of f2 and MS2 suggests that the probable amino acid sequence of the proteins encoded by potential overlapping genes could be predicted from the known MS2 RNA sequence (Fiers, 1975; Fiers et al., 1976). Fortuitously, the amino acid composition of these potential proteins varied in such a way that they could be distinguished from one another by labeling infected cells with different amino acids. If the L protein is the product of the RNA sequence between residues 1678 and 1905, as predicted (Figure 2), it should contain histidine but lack both glycine and tryptophan. This is the only possible phage-encoded protein of this size which could lack glycine. Our results indicate that L protein contains neither glycine nor tryptophan. On the other hand, it is labeled when infected cells are grown in the presence of 14C-histidine, which appears only once in the L protein amino acid sequence (data not shown). Figure 4 shows that L protein is not labeled when cells are grown in the presence of 14C-glycine (even though an average protein of 11,000 daltons would be expected to contain several glycine residues). From the distribution of glycine in the other three known phage proteins, the absence of glycine in L protein eliminates the possibility that it is a fragment of any of these proteins. The results obtained with labeled tryptophan are discussed below.

Atkins et al. (1979; accompanying paper) have identified a polypeptide which appears to be identical with L protein among the polypeptides synthesized in vitro by reticulocyte ribosomes in response to f2 RNA. They have determined that the proline residues in this polypeptide are at positions 6, 13, 21 and 28, as predicted for L protein (Figure 2). They have also identified a ribosome binding site which contains oligonucleotides corresponding to the region shown in Figure 2. Since the polypeptide they observe and L protein migrate together on sodium dodecylsulfate polyacrylamide gels (data not shown), and both are eliminated by the op3 mutation (see below), we conclude that they are the same protein and are coded by the RNA sequence which we have identified as the L protein gene.

Figure 1. Reading Frame Map of MS2
[Map of the MS2 genome, 5' to 3', showing the A protein, coat and replicase genes by reading frame.]
Reading frame I is defined by nucleotide number one at the 5' end. Reading frame II begins with nucleotide number two, and reading frame III begins with nucleotide number three. The map is based on the sequence determined by Fiers et al. (1976).
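
The frame convention in this legend reduces to modular arithmetic on the position of a codon's first base. The following Python sketch illustrates it under stated assumptions: `reading_frame` is a hypothetical helper, the L gene AUG at residue 1678 is taken from the text, and the replicase AUG at 1761 is inferred from the 47 and 36 base segments given in the Discussion rather than read off the map.

```python
# Sketch of the reading-frame convention in the Figure 1 legend.
# Positions are 1-based nucleotide numbers in the MS2 sequence.

ROMAN = {1: "I", 2: "II", 3: "III"}


def reading_frame(first_base: int) -> int:
    """Return 1, 2 or 3 for a codon whose first base is at `first_base`.

    Frame I holds codons starting at nucleotides 1, 4, 7, ...;
    frame II at 2, 5, 8, ...; frame III at 3, 6, 9, ...
    """
    if first_base < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (first_base - 1) % 3 + 1


# Assumed coordinates: L gene AUG from the text; replicase AUG inferred
# from the stated 47-base coat overlap and 36-base intercistronic space.
L_GENE_START = 1678
REPLICASE_START = L_GENE_START + 47 + 36  # 1761

l_frame = reading_frame(L_GENE_START)
replicase_frame = reading_frame(REPLICASE_START)

# Displacement of the L frame relative to the replicase frame, in bases.
offset = (L_GENE_START - REPLICASE_START) % 3

print(f"L gene: frame {ROMAN[l_frame]}")
print(f"Replicase gene: frame {ROMAN[replicase_frame]}")
print(f"L frame is displaced {offset} base(s) from the replicase frame")
```

With these coordinates the L gene falls in a different frame from the replicase gene, displaced by one base, which agrees with the description of the L frame as shifted one base to the right.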

Figure 2. MS2 RNA Sequence of L Gene Region and Predicted Amino Acid Sequence of L Protein
[Figure: MS2 RNA sequence of the L gene region (numbering marked from 1670 to 1900), shown with the amino acid sequences described in the legend.]
The hypothetical ribosome recognition sequences of the L gene are underlined. L gene coding sequence begins at nucleotide 1678. The position of the op3 mutation (Atkins et al., 1979) is indicated by (C→U). The upper amino acid sequence is the L protein; the lower amino acid sequences are for the portion of the coat and replicase proteins specified by this region.
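
The overlap can be made concrete by translating one stretch of RNA in two frames. The Python sketch below uses only the first 60 nucleotides printed in Figure 2 and the standard genetic code. It assumes that this segment begins at nucleotide 1662, which is inferred from the printed numbering and from the AUG at 1678; `translate`, `SEGMENT` and `SEGMENT_START` are illustrative names, not part of the original work.

```python
# Translate the first RNA segment of Figure 2 in the coat and L frames.
from itertools import product

BASES = "UCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, one-letter amino acids, '*' = termination.
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}

SEGMENT = "CAAGGUCUCCUAAAAGAUGGAAACCCGAUUCCCUCAGCAAUCGCAGCAAACUCCGGCAUC"
SEGMENT_START = 1662   # assumed position of SEGMENT[0]
L_GENE_START = 1678    # AUG of the L gene, as stated in the text


def translate(rna: str, offset: int) -> str:
    """Translate `rna` from 0-based `offset`, dropping a partial last codon."""
    codons = (rna[i:i + 3] for i in range(offset, len(rna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


l_offset = L_GENE_START - SEGMENT_START
assert SEGMENT[l_offset:l_offset + 3] == "AUG"

# The L frame lies one base to the right of the coat frame.
coat_offset = (l_offset - 1) % 3

l_peptide = translate(SEGMENT, l_offset)
coat_peptide = translate(SEGMENT, coat_offset)
proline_positions = [i + 1 for i, aa in enumerate(l_peptide) if aa == "P"]

print("L frame:   ", l_peptide)
print("coat frame:", coat_peptide)
print("prolines in L frame at residues", proline_positions)
print("glycine in L frame:", "G" in l_peptide,
      "| glycine in coat frame:", "G" in coat_peptide)
```

Only the first 14 codons of the L gene are covered by this segment, so the sketch can confirm the first two proline positions and the absence of glycine in that stretch, not the composition of the whole protein.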

Figure 3. Polypeptides Synthesized in f2-Infected and Uninfected E. coli
Strain CSH63 was grown and infected with f2 as described in Experimental Procedures. 5 ml cultures were treated with rifampicin 5 min after infection and labeled with 2.5 µCi 14C-leucine (final concentration 10 µM) 32 min after infection of one of the cultures. Cells were harvested at 43 min and prepared for gel electrophoresis as described in Experimental Procedures. (f2) Infected; (un) uninfected.

Figure 4. Polypeptides Labeled with Leucine and Glycine in f2-Infected and Uninfected E. coli
The experiment was performed as described in the legend to Figure 3, except that the cells were labeled from 12 min after infection and harvested at 60 min. In the glycine-labeled cultures (2.5 µCi 14C-glycine; final concentration 10 µM), 1 mM serine was added to dilute 14C-serine resulting from conversion of labeled glycine to serine.

Infection with f2-op3

The following experiments were performed to determine whether the op3 mutation affected the synthesis of L protein. Infection of an su- E. coli strain with op3 induces the synthesis of the three classical RNA phage proteins, but fails to induce L protein synthesis (Figure 5). However, growth of op3 on an su+UGA suppressor-carrying strain results in the synthesis of L protein (Figure 5). Infection of the su- strain with op3-R1, an op3 revertant capable of normal lysis on the su- strain, also restores synthesis of L protein (Figure 5). The L protein of the op3-R1 revertant displays a slight shift in its electrophoretic mobility as compared to that of f2 or suppressed op3. Analysis of nine additional independent revertants yields the same result: the L protein position on the gel is always shifted in the same fashion.

Figure 5. Polypeptides Synthesized in Su- and Su+ E. coli following Infection by f2, op3 and Revertant
Cultures were labeled with 14C-leucine as described in the legend to Figure 3, from 30 min after infection with the indicated phage, and were harvested at 80 min.

According to the predicted amino acid sequence, wild-type L protein should not contain trp. The op3 L protein synthesized in the trp-inserting UGA suppressor strain, however, should contain one trp residue per polypeptide. The data in Figure 6 are consistent with these predictions. The protein profile of trp-labeled su- cells infected with f2 in the presence of 3H-trp is identical in the L protein region with the profile of op3-infected su- cells in which no L protein is synthesized (see Figure 5). The band visible below the coat position is presumably due to the highly labeled coat protein. (Note that it is slightly above the L protein position.) On the other hand, L protein is observed following op3 infection of su+ bacteria grown in the presence of 3H-trp. While this region of the gel is labeled in uninfected bacteria as well as in op3- and f2-infected cells, the band is darkest in the op3-infected strain. These results provide further evidence that L protein is encoded by the RNA sequence presented in Figure 2, and that the op3 mutation resides in this sequence.

The titer of both op3 and f2 phage produced was determined for all the above experiments (data not shown). The titer of wild-type revertants was never more than 1% that of op3 in all experiments in which op3 phage were used. Thus the protein observed following op3 infection of the suppressor-containing bacteria cannot be a product of wild-type revertants.

Figure 6. Polypeptides Labeled with 3H-Tryptophan in Su- and Su+ E. coli following Infection with f2 and op3
The experiment was performed as described in the legend to Figure 5, except that the labeled amino acid was 3H-tryptophan (150 µCi/5 ml culture; final concentration 5 µM).

Subcellular Localization of L Protein

The insoluble fraction was separated from soluble cellular proteins by centrifugation of a cell suspension that had been gently lysed by freeze-thawing in the presence of EDTA, lysozyme, DNAase and RNAase. L protein sediments with the insoluble fraction as shown in Figure 7. All of the replicase and most of the A protein, as well as some of the coat protein, are also found in the insoluble material. None of the proteins found in the insoluble fraction were solubilized during subsequent resuspension in extraction buffer (data not shown).

Figure 7. Subcellular Fractionation of f2-Coded Polypeptides
5 ml cultures were labeled as described in the legend to Figure 4. These cultures were separated into soluble and insoluble fractions as described in Experimental Procedures. (I) Insoluble fraction; (S) soluble fraction.

Time Course of Synthesis of f2-Specific Proteins

The time of synthesis of coat, replicase and A protein in vivo has been determined in previous investigations. We repeated these experiments to determine when L protein is synthesized after infection in relationship to the other phage-encoded proteins. In this experiment, both f2-infected and uninfected cells were labeled with 14C-leucine during the following sequential time intervals after infection: 12-23 min, 22-33 min, 32-43 min, 42-63 min and 62-83 min. At the end of each labeling period, the cell cultures were immediately chilled on ice for 1 hr. The cells were then harvested by centrifugation and samples were prepared for electrophoresis as described in Experimental Procedures. The data, taken from gel scans and quantitated according to the number of leucine residues present in each polypeptide, are given in Figure 8. The ordinate represents relative numbers of molecules synthesized during each time interval. The L protein is not synthesized during the first 23 min after infection; it first appears during the 22-33 min time interval after infection, and its synthesis continues at almost the same level during the next 10 min and gradually decreases during the latter stages of infection. The replicase is synthesized early after infection; synthesis of L protein does not begin until the synthesis of replicase starts to decline. The number of L protein molecules synthesized exceeds the number of replicase molecules synthesized during most of the f2 life cycle. The rate and amounts of synthesis of replicase, A protein and coat protein shown in Figure 8 correspond well with those determined previously (Fromageot and Zinder, 1968; Robertson, 1975).

Figure 8. Time Course of f2-Coded Polypeptide Synthesis
[Plot: abscissa, Time After Infection (min), 0 to 80.]
5 ml cultures of E. coli infected with f2 and treated with rifampicin were labeled with 14C-leucine for 11 min intervals at the times indicated. The labeled proteins were analyzed by SDS gel electrophoresis. Following fluorography of the gels, the films were scanned and integrated with a Helena Quick Scan Microdensitometer. The raw data were divided by the number of leucine residues in each polypeptide, as determined from the MS2 RNA sequence (Fiers et al., 1976) or the f2 coat protein sequence (Weber and Konigsberg, 1975). The numbers used were: replicase, 50; A protein, 36; coat protein, 8; L protein, 12.

Another f2 Protein

A fifth phage-induced protein of approximately 25,000 daltons (protein 5) is also visible in Figures 3, 5 and 6. It is present in very low amounts. Surprisingly, protein 5 (like L protein) is not synthesized after infection with op3. Instead, a new polypeptide of approximately 15,000 daltons can be observed following op3 infection (Figure 6, at *). We interpret this band to be a nonsense fragment of protein 5. The op3 revertant op3-R1 restores both protein 5 and L protein. We believe that protein 5 is a "hybrid" of coat protein and L protein (see Discussion).

Discussion

Overlapping Genes-L Protein

The RNA of the group I RNA phage codes for three polypeptides (A protein, coat protein and replicase) which nearly saturate the linear coding capacity of the genome. However, a fourth and independent complementation group was defined upon isolation and characterization of op3, an f2 opal (UGA) mutant which failed to cause lysis of the su- host cell even though it was identical to the wild-type phage by infectivity and all other physical properties. The existence of a lysis protein coded by a fourth, overlapping cistron would appear to be the most plausible explanation for these properties of op3.

In this paper we demonstrate that the group I RNA phages do code for a fourth protein which is the product of an overlapping gene. Analysis of cell extracts by high resolution polyacrylamide gel electrophoretograms reveals that f2 infection of wild-type E. coli (su-) induces the synthesis of a fourth, low molecular weight protein (designated L protein) that is absent upon infection with op3. Infection with op3 revertants restores both lysis and synthesis of L protein.
Similarly, growth of op3 in an su+UGA suppressor strain allows lysis and results in the synthesis of a low level of L protein.

Identification of the genetic locus of the L protein was made possible by the availability of the entire RNA sequence of the closely related MS2 RNA phage. Analysis of this sequence indicates that there are several potential overlapping genes. A stretch of RNA which overlaps the end of the coat gene, the intercistronic region and the beginning of the replicase gene seems most likely to encode the L protein. Assuming that MS2 and f2 are identical in this region, L protein should not contain tryptophan or glycine but should have histidine. Experiments designed to test these predictions showed that the L protein has all of these properties. Thus we conclude that the sequence shown in Figure 2 is the gene for L protein. The findings in the accompanying paper (Atkins et al., 1979) further strengthen this conclusion.

In the MS2 RNA sequence, the L gene starts at residue 1678 with an AUG codon and ends at residue 1905 with a UAA termination codon. This sequence overlaps the last 47 bases of the coat gene, spans the 36 residues in the intercistronic region between the coat and replicase gene and extends 142 bases into the replicase gene. The coat and replicase reading frames occur in the same phase, while the L protein reading frame is shifted one base to the right (Figure 2).

Site of the op3 Mutation

The above correlations between the synthesis of the L protein and the op3 phenotype suggested that the corresponding op3 mutation occurs in the RNA sequence coding for the L protein. Another property of the op3 mutant, reduction in the rate of protein synthesis initiation at the replicase cistron, was not only consistent with this theory, but indicated that the mutation might be located in the portion of the L gene which overlaps the ribosome binding site of the replicase gene. Atkins et al. (1979) have indeed found that the op3 mutation is a C-to-U transition at position 1765 of the MS2 RNA sequence. This site codes for both the second amino acid in the replicase protein and the thirtieth amino acid of the L protein (see Figure 2).

The presence of op3 at this location may explain the finding that there are two classes of revertants of the op3 mutation (Model et al., 1979). One class presumably represents a return to the wild-type arginine residue at position 30 of L protein by a UGA → CGA transition mutation. These revertants have normal growth characteristics. The second class of revertants is characterized by a very small burst size and the production of an abnormally high percentage of defective particles. These could occur by a UGA → UGG transition mutation, which would result in an arginine → tryptophan change at position 30 of L protein. We know from the growth characteristics of f2 op3 in the UGA suppressor-containing strain that tryptophan is an acceptable amino acid at this site. Why then would these revertants be minute? We believe that the answer could lie, at least in part, in the resultant change in the sequence of amino acids at the N terminus of replicase, from met-ser-lys- to met-leu-glu-. The altered replicase could account for some or all of the altered growth characteristics of the minute revertants (Model et al., 1979).
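
The dual coding role of position 1765 can be checked with a few lines of arithmetic. The Python sketch below is illustrative only. It assumes that the replicase AUG lies at 1761 (1678 + 47 + 36, from the overlaps stated above), takes the first nine replicase bases (AUGUCGAAG, met-ser-lys) from Figure 2, and includes only the eight codons of the standard genetic code needed here. The helper names are hypothetical.

```python
# Map the op3 site (position 1765) onto the L and replicase reading frames.

L_START, L_END = 1678, 1905          # AUG ... UAA, from the text
COAT_OVERLAP, SPACER, REPLICASE_OVERLAP = 47, 36, 142
REPLICASE_START = L_START + COAT_OVERLAP + SPACER   # assumed: 1761
OP3_POSITION = 1765

# Only the codons needed for this example (standard genetic code).
CODONS = {"AUG": "met", "UCG": "ser", "AAG": "lys", "UUG": "leu",
          "GAG": "glu", "CGA": "arg", "UGG": "trp", "UGA": "stop"}

# First nine bases of the replicase gene, keyed by nucleotide position.
wild_type = {REPLICASE_START + i: b for i, b in enumerate("AUGUCGAAG")}


def codon_number(position, gene_start):
    """Return (1-based codon index, 0-based offset within that codon)."""
    index, offset = divmod(position - gene_start, 3)
    return index + 1, offset


def read(bases, start, n_codons):
    """Read `n_codons` codons from `bases`, beginning at position `start`."""
    triplets = ("".join(bases[start + 3 * k + j] for j in range(3))
                for k in range(n_codons))
    return [CODONS[t] for t in triplets]


# Length bookkeeping: 75 sense codons plus the UAA terminator.
assert L_END - L_START + 1 == 3 * (75 + 1)
assert COAT_OVERLAP + SPACER + REPLICASE_OVERLAP == 3 * 75

l_codon, l_offset = codon_number(OP3_POSITION, L_START)
rep_codon, rep_offset = codon_number(OP3_POSITION, REPLICASE_START)

op3 = {**wild_type, OP3_POSITION: "U"}        # C-to-U transition
minute = {**op3, OP3_POSITION + 2: "G"}       # UGA -> UGG revertant

for name, rna in (("wild type", wild_type), ("op3", op3),
                  ("UGG revertant", minute)):
    l_residue = read(rna, OP3_POSITION, 1)[0]
    replicase_n_terminus = "-".join(read(rna, REPLICASE_START, 3))
    print(f"{name}: L codon {l_codon} = {l_residue}; "
          f"replicase starts {replicase_n_terminus}")
```

Under these assumptions, position 1765 is the first base of L codon 30 and the second base of replicase codon 2, and the UGG revertant changes two replicase codons at once, which reproduces the met-leu-glu N terminus proposed for the minute revertants.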

Control of Protein Synthesis in f2-Infected Cells: Coordinate Translation of Overlapping Cistrons

Expression of the phage proteins is regulated during the infective cycle. In the case of wild-type f2, replicase synthesis ceases at approximately 20 min, while both coat and A protein are made throughout infection. Unlike the three classical f2 proteins, which all appear shortly after infection, production of L protein is not detectable until between 22 and 33 min. Thereafter, L protein synthesis closely parallels that of A protein.

Previous investigations have shown that synthesis of the three major f2 proteins is regulated by both RNA secondary structure and the translational repressor activities of coat protein and replicase (reviewed by Robertson, 1975). Until now, f2 phage RNA has been viewed as a classic example of a polycistronic mRNA, with three distinct structural genes initiated by ribosome binding sites in regions of the RNA not used for coding. The existence of the overlapping L gene complicates this picture, however; it is now clear that the ribosome binding sites for both replicase and L protein occur within coding regions. Use of the same RNA sequence for more than one polypeptide could entail new mechanisms of control of gene expression. Nevertheless, the very mechanisms mentioned above may also regulate the synthesis of L protein. Both of these mechanisms cause the repression of phage protein synthesis by controlling the availability of ribosome binding sites. In fact, current knowledge from other systems also implicates site availability as a central underlying factor dictating the selection of protein initiation sites (Steitz et al., 1977; Nakamoto and Vogl, 1978; Dunn, Buzash-Pollert and Studier, 1978). Our preliminary results indicate that the L protein is synthesized at a very low level in an in vitro protein synthesizing system directed by native f2 RNA, and at a markedly increased level when the RNA is pretreated with formaldehyde according to the procedure of Lodish (1970). Thus we suggest that regulation of ribosome binding site availability by RNA secondary structure may be an important mechanism in controlling L protein synthesis. Due to the position of the L gene, however, other considerations must also be taken into account.
It is possible that L protein synthesis could also be regulated by coat protein, since coat binds the replicase ribosome binding site in the middle of the L gene. In addition, since the L protein initiation site resides in the coat protein gene, the frequency of translation of the coat cistron could regulate the rate at which L protein synthesis is initiated. In turn, replicase, upon binding the coat ribosome binding site, could also affect L protein synthesis by modulating the rate of coat protein synthesis. It is not known which of these hypothesized modes of control are actually used to regulate L protein synthesis, but the question is currently under investigation in our laboratory.

Protein 5-Novel Mode of Translation: Internal Shift in mRNA Reading Frame?

A fifth phage-specific protein is visible on our gels of f2-infected E. coli. The occurrence of protein 5 parallels that of L protein. Synthesis of protein 5 following op3 infection is dependent upon the presence of a UGA suppressor. Absence of protein 5 in op3-infected su- cells is accompanied by the production of a new phage-specific protein which is slightly larger than the coat protein (Figure 6). It seems probable that this new protein is a nonsense fragment of protein 5. Following infection of either su- or su+UGA cells with f2 wild-type or any one of the ten op3 revertants, we observed the synthesis of protein 5 but not that of the presumed fragment. These results suggest the possibility that the reading frame used to code for the L gene is also involved in the specification of protein 5. This idea is further substantiated by the following observations. The reversion properties of op3 are consistent with the theory that it carries only a single nonsense mutation [see accompanying papers by Model et al. (1979) and Atkins et al. (1979)]. Moreover, in the op3 revertants, not only are both proteins restored, but both the L protein and protein 5 show a similar shift in electrophoretic mobility. (This shift in the mobility of protein 5 is not clear in the figures but is apparent on the actual fluorograms.) Protein 5 cannot be a dimer of the L protein, since it labels with both glycine and tryptophan, the two amino acids missing in L protein.

We suggest that protein 5 is the product of contiguous segments of both the coat and L genes, and that it arises from a shift in reading frames during protein synthesis. According to this hypothesis (suggested by J. Atkins), protein 5 consists of the N terminal portion of the coat protein and the C terminal portion of the L protein. The positions of termination codons in the coat and L gene reading frames are such that the proposed shift would have to occur between residues 1674 and 1725.
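
The lower bound of this window can be illustrated from the portion of the sequence printed in Figure 2. The Python sketch below scans the L reading frame upstream of the L gene AUG for the nearest termination codon. It assumes that the first 60 nucleotides of Figure 2 begin at position 1662, and it takes the upper bound from the end of the coat coding sequence (1678 + 47 - 1 = 1724) rather than from the sequence itself. Names such as `last_stop_before` are hypothetical.

```python
# Bound the region in which a coat-to-L frame shift could occur.

STOPS = {"UAA", "UAG", "UGA"}

SEGMENT = "CAAGGUCUCCUAAAAGAUGGAAACCCGAUUCCCUCAGCAAUCGCAGCAAACUCCGGCAUC"
SEGMENT_START = 1662                      # assumed position of SEGMENT[0]
SEGMENT_END = SEGMENT_START + len(SEGMENT) - 1
L_GENE_START = 1678                       # AUG of the L gene (from the text)
COAT_LAST_BASE = L_GENE_START + 47 - 1    # last coat sense base: 1724


def codon_at(position):
    """Return the three bases starting at 1-based `position`."""
    i = position - SEGMENT_START
    return SEGMENT[i:i + 3]


def last_stop_before(position):
    """Step upstream in frame from `position`; return the first base of the
    nearest termination codon, or None if the segment contains none."""
    p = position - 3
    while p >= SEGMENT_START:
        if codon_at(p) in STOPS:
            return p
        p -= 3
    return None


stop_start = last_stop_before(L_GENE_START)
assert stop_start is not None, "no in-frame stop in the available sequence"

# The L frame is closed through the last base of that stop codon; the coat
# frame is assumed to terminate immediately after its last sense base.
window = (stop_start + 2, COAT_LAST_BASE + 1)

# Check that the L frame stays open for the rest of the available segment.
open_downstream = all(
    codon_at(p) not in STOPS
    for p in range(stop_start + 3, SEGMENT_END - 1, 3)
)

print("nearest upstream stop in the L frame starts at", stop_start)
print("frame shift must occur between residues", window)
print("L frame open to end of segment:", open_downstream)
```

The sketch only covers nucleotides up to 1721, so it supports the lower limit directly and relies on the stated gene boundaries for the upper limit.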

Our interpretation of the mode of synthesis of protein 5 is strongly supported by our identification of a nonsense fragment approximately 10-15 amino acids longer than the coat protein following f2 op3 infection. A fragment 14 amino acids longer than the coat would be expected on the basis of the op3 mutation located at position 1765 (Atkins et al., 1979). Further support for the hybrid protein hypothesis is provided by the finding that a polypeptide made from f2 RNA in vitro which co-electrophoreses with protein 5 is not made from RNA containing nonsense mutations in either the L or coat genes (J. F. Atkins, R. F. Gesteland and B. R. Reid, personal communication).

How might protein 5 arise? Since the L gene reading frame is shifted one base to the right with respect to the coat gene, such a protein could be made during normal coat protein synthesis if a single codon were occasionally read as a quadruplet rather than a triplet (Riddle and Roth, 1972; Riddle and Carbon, 1973; Atkins and Ryce, 1974) or if the protein synthetic machinery infrequently skipped a base during translation of the coat protein gene. The latter mechanism might result from a ribosomal collision during translation of overlapping genes from a single mRNA.

At present, we do not know whether protein 5 performs a function during RNA phage infection. Complementation of chain termination mutations in coat and L genes indicates, however, that protein 5 is not required for either viable phage production or lysis.

Lysis-Role of L Protein

The correlation between the ability of op3 phage to lyse host cells and the presence or absence of the L protein suggests that this polypeptide is functional in lysis. This role is supported by the presence of L protein in the insoluble fraction of artificially lysed cells. Furthermore, an amino acid sequence in L protein (residues 37-63) shows some relatedness to a region of the φX174 protein E thought to be directly involved in lysis (residues 10-24). Long stretches of hydrophobic amino acids are found in both polypeptides. The hydrophobic region occurs near the N terminus of protein E (Barrell et al., 1976), however, while it is found closer to the C terminus of L protein.

Conclusion

The RNA bacteriophages have evolved to optimize their use of a very small genome by economizing in a variety of ways (Weber and Konigsberg, 1975). These include:
- Use of the phage-coded proteins to perform dual functions: coat and replicase are used to repress translation as well as to encapsulate and replicate the RNA, respectively.
- Gene regulation by RNA secondary structure: specific sequences are used to hydrogen bond to translation initiation sites as well as to code for proteins.
- Use of host-coded polypeptides to perform new functions: protein synthesis factors act as components of the RNA replicase.
- Multiple translation of the same genetic material: Qβ uses the coat sequence to encode both the coat protein and the A1 protein, by sometimes "reading through" the UGA termination codon at the end of the coat gene.
In addition, as we show here, the group I RNA phages have overlapping genes in two different reading frames. Two groups of organisms have now been shown to use overlapping genes. These are the DNA viruses φX174 and SV40 (Barrell et al., 1976; Sanger et al., 1977; Reddy, Dhar and Weissman, 1978; Ysebaert, Van de Voorde and Fiers, 1978) and the single-stranded RNA bacteriophages. We do not yet know whether additional groups of organisms use this strategy, or whether overlapping genes are a device specific to viruses that must make maximum use of very small genomes.

Experimental Procedures

Materials

3H-trp (6.0 Ci/mmole), 14C-leu (287 mCi/mmole) and 14C-gly (109 mCi/mmole) were from New England Nuclear. Bovine serum albumin was from Miles. DNAase I (bovine pancreas) and RNAase A (protease-free) were from Sigma. Lysozyme was from Calbiochem. Sodium dodecylsulfate (BDH biochemicals) was purchased from Gallard-Schlesinger. 2-mercaptoethanol was from Mallinckrodt. All other polyacrylamide chemicals were from BioRad.

Bacteria and Bacteriophage

Two strains of E. coli K12 were used: CSH63 (Hfr H, pro-, B1-, su-) and K223 (F'lac, su+UGA), which carries the CAJ70 su+UGA allele described by Sambrook, Fan and Brenner (1967). The following f2 RNA phages were used: wild-type; op3, an opal (UGA) mutant which produces viable phage but fails to lyse su- hosts (P. Model et al., reported by Horiuchi, 1975); and ten independent revertants of op3 capable of normal growth on CSH63.

In Vivo Labeling of Infected Cells

Cells were grown in modified synthetic MTPA medium (Viñuela, Algranati and Ochoa, 1967) containing only those amino acids (50 µg/ml) required for growth. Rifampicin, used as a 10 mg/ml solution in 100% methanol, was prepared immediately prior to use. Cells were grown and labeled according to the procedure outlined by Viñuela et al. (1967), as modified by Fromageot and Zinder (1968) for use with rifampicin. An overnight bacterial culture was diluted 81 1 fold into freshly prepared medium and grown with aeration at 36-37°C to a cell density of 2-3 X lO’/ml. Part of the cells (routinely 5 ml) were infected at a multiplicity of 5-10 pfu per cell. 5 min later, rifampicin was added to both the infected and uninfected cell cultures at a final concentration of 100 µg/ml (unless otherwise stated). Radioactive amino acids were then added at the times and concentrations indicated in each experiment. 50 µl samples were withdrawn from the culture at predetermined intervals and added to 4 ml of 5% trichloroacetic acid containing a drop of 0.5% bovine serum albumin.
These samples were filtered through Whatman GF/C filters, washed with three 3 ml rinses of 5% trichloroacetic acid, dried and counted. Phage production was determined on parallel unlabeled cultures as described by Fromageot and Zinder (1968). At the end of the labeling period, the cultures were immediately chilled on ice for 1 hr to allow completion of already initiated proteins (Friedman, Lu and Rich, 1971; Broeze, Solomon and Pope, 1978). They were then divided into 1-1.5 ml aliquots and subjected to centrifugation at 12,000 × g for 30 min at 4°C in a Brinkman 3200 centrifuge. The cell pellets were frozen at -20°C for later use.

Polyacrylamide Gel Electrophoresis
Samples were prepared as follows. Frozen cell pellets from 1 ml of culture were resuspended in 50 µl of buffer containing 2.0 mM Tris-HCl (pH 7.4) and 1.0 mM MgCl2. Then 2 µl of a solution containing DNAase and RNAase at 2 mg/ml each were added, and the mixture was incubated for 30 min at 0°C. The samples were then subjected to 6 freeze-thawing cycles (dry ice-acetone mixture for 1 min; 37°C for 1-2 min), followed by another 30 min incubation at 0°C. Triple-strength Laemmli sample buffer was then added to a final concentration of single strength. After careful mixing, the preparations were boiled for 5 min and stored at -20°C. Samples were electrophoresed on sodium dodecylsulfate polyacrylamide gels containing a linear gradient of 12.5-15% acrylamide and 0.33-0.4% bisacrylamide, as described by Laemmli (1970), except that electrophoresis was conducted with pulsed power current using an Ortec Model 4100 pulsed constant power supply at 320 V. The pulse rate was adjusted throughout the electrophoresis according to the following schedule: t = 0 min, pulse rate = 75; t = 20 min, pulse rate = 150; t = 25 min, pulse rate = 200, and so on until the dye reached the bottom of the gel (approximately t = 110 min). Fluorograms were prepared in the following manner.
Gels stained with Coomassie brilliant blue G250 according to the method of Blakesley and Boezi (1977) were treated with 2,5-diphenyloxazole in dimethylsulfoxide, dried and exposed to Kodak RP Royal X-Omat X-ray film at -70°C as described by Bonner and Laskey (1974). Densitometer tracings of fluorograms were obtained using a Helena Quick Scan Densitometer.
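The volumes in the sample preparation above follow from simple dilution arithmetic. The Python sketch below is only an illustration and is not part of the original procedure: the function names are hypothetical, and it assumes that the volume of the cell pellet itself is negligible compared with the 50 µl of resuspension buffer.

```python
def volume_of_concentrate(sample_ul: float, fold: float) -> float:
    """Volume of a `fold`-strength stock that brings the mixture to single strength.

    Solves fold * v = 1 * (sample_ul + v) for v.
    """
    if fold <= 1:
        raise ValueError("stock must be more concentrated than single strength")
    return sample_ul / (fold - 1)


def final_concentration(stock: float, stock_ul: float, sample_ul: float) -> float:
    """Concentration (in the units of `stock`) after adding stock_ul of stock to sample_ul of sample."""
    return stock * stock_ul / (stock_ul + sample_ul)


# Values taken from the procedure; the pellet volume is assumed to be negligible.
resuspension_ul = 50.0  # buffer used to resuspend the pellet
nuclease_ul = 2.0       # DNAase/RNAase solution, 2 mg/ml each

nuclease_mg_per_ml = final_concentration(2.0, nuclease_ul, resuspension_ul)
buffer_ul = volume_of_concentrate(resuspension_ul + nuclease_ul, fold=3.0)

print(f"nuclease: {nuclease_mg_per_ml * 1000:.0f} ug/ml each")
print(f"3x sample buffer to add: {buffer_ul:.0f} ul")
```

Under these assumptions each nuclease is present at roughly 77 µg/ml during the incubation, and about 26 µl of triple-strength sample buffer brings the 52 µl lysate to single strength.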
Cell Fractionation
Cell pellets, each from 1 ml of cell culture, were resuspended in 50 µl of buffer composed of 50 mM Tris-HCl (pH 8.0), 1.0 mM 2-mercaptoethanol, 50 mM EDTA and 2 µg/ml freshly dissolved lysozyme. After a 30 min incubation at 0°C, the samples were subjected to freeze-thawing as described above and then to centrifugation at 12,000 × g for 20 min. The supernatant fraction was carefully removed from the cell debris pellet, which was then resuspended in an equal volume of the original extraction buffer. The supernatant is referred to as the soluble fraction and the resuspended cell pellet as the insoluble fraction. Triple-strength Laemmli sample buffer was then added to both the soluble and insoluble fractions to a final concentration of single strength. The samples were treated as above for electrophoresis.

Acknowledgments
We thank P. Model for a sample of f2 op3; J. Atkins, J. A. Steitz and P. Model for communication of unpublished results; J. E. Walker, B. G. Barrell, G. Hegeman and B. Polisky for helpful discussions; M. Brawner and S. R. Jaskunas for help with the in vitro system; and B. Saari for technical assistance. This work was supported by an NIH grant and by an NIH Research Career Development Award to T.B. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received April 23, 1979; revised June 21, 1979
References
Atkins, J. F., Lewis, J. B., Anderson, C. W. and Gesteland, R. F. (1975). Enhanced differential synthesis of proteins in a mammalian cell-free system by additional polyamines. J. Biol. Chem. 250, 5688-5695.
Atkins, J. F. and Ryce, S. (1974). UGA and non-triplet suppressor reading of the genetic code. Nature 249, 527-529.
Atkins, J. F., Steitz, J. A., Anderson, C. W. and Model, P. (1979). Mammalian ribosome binding to MS2 phage RNA reveals an overlapping gene encoding a lysis function. Cell 18, 247-256.
Barrell, B. G., Air, G. M. and Hutchison, C. A., III. (1976). Overlapping genes in bacteriophage φX174. Nature 264, 34-41.
Blakesley, R. W. and Boezi, J. A. (1977). A new staining technique for proteins in polyacrylamide gels using Coomassie brilliant blue G250. Anal. Biochem. 82, 580-582.
Bonner, W. M. and Laskey, R. A. (1974). A film detection method for tritium-labelled proteins and nucleic acids in polyacrylamide gels. Eur. J. Biochem. 46, 83-88.
Broeze, R. J., Solomon, C. J. and Pope, D. H. (1978). Effects of low temperature on in vivo and in vitro protein synthesis in Escherichia coli and Pseudomonas fluorescens. J. Bacteriol. 134, 861-874.
Dunn, J. J., Buzash-Pollert, E. and Studier, F. W. (1978). Mutations of bacteriophage T7 that affect initiation of synthesis of the gene 0.3 protein. Proc. Nat. Acad. Sci. USA 75, 2741-2745.
Fiers, W. (1975). Chemical structure and biological activity of bacteriophage MS2 RNA. In RNA Phages, N. D. Zinder, ed. (New York: Cold Spring Harbor Laboratory), pp. 353-396.
Fiers, W., Contreras, R., Duerinck, F., Haegemann, G., Iserentant, D., Merregaert, J., Min Jou, W., Molemans, F., Raeymaekers, A., Van den Berghe, A., Volckaert, G. and Ysebaert, M. (1976). Complete nucleotide sequence of bacteriophage MS2 RNA: primary and secondary structure of the replicase gene. Nature 260, 500-507.
Friedman, H., Lu, P. and Rich, A. (1971). Temperature control of initiation of protein synthesis in Escherichia coli. J. Mol. Biol. 61, 105-121.
Fromageot, H. P. M. and Zinder, N. D. (1968). Growth of bacteriophage f2 in E. coli treated with rifampicin. Proc. Nat. Acad. Sci. USA 61, 184-191.
Gussin, G. (1966). Three complementation groups in bacteriophage R17. J. Mol. Biol. 21, 435-453.
Horiuchi, K. (1975). Genetic studies of RNA phages. In RNA Phages, N. D. Zinder, ed. (New York: Cold Spring Harbor Laboratory), pp. 29-50.
Horiuchi, K., Lodish, H. F. and Zinder, N. D. (1966). Mutants of the bacteriophage f2. VI. Homology of temperature-sensitive and host-dependent mutants. Virology 28, 438-447.
Laemmli, U. K. (1970). Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature 227, 680-685.
Lodish, H. F. (1970). Secondary structure of bacteriophage f2 ribonucleic acid and the initiation of in vitro protein biosynthesis. J. Mol. Biol. 50, 689-702.
Model, P., Webster, R. E. and Zinder, N. D. (1979). Characterization of op3, a lysis defective mutant of bacteriophage f2. Cell 18, 235-246.
Nakamoto, T. and Vogl, B. (1978). On the accessibility and selection of the initiator site of mRNA in protein synthesis. Biochim. Biophys. Acta 517, 367-377.
Reddy, V. B., Dhar, R. and Weissman, S. M. (1978). Nucleotide sequence of the genes for the simian virus 40 proteins VP2 and VP3. J. Biol. Chem. 253, 621-630.
Riddle, D. L. and Roth, J. R. (1972). Frameshift suppressors III. Effects of suppressor mutations on transfer RNA. J. Mol. Biol. 66, 495-506.
Riddle, D. L. and Carbon, J. (1973). Frameshift suppression: a nucleotide addition in the anticodon of a glycine tRNA. Nature New Biol. 242, 230-234.
Robertson, H. D. (1975). Functions of replicating RNA in cells infected by RNA bacteriophages. In RNA Phages, N. D. Zinder, ed. (New York: Cold Spring Harbor Laboratory), pp. 113-146.
Sambrook, J. F., Fan, D. P. and Brenner, S. (1967). A strong suppressor specific for UGA. Nature 214, 452-453.
Sanger, F., Air, G. M., Barrell, B. G., Brown, N. L., Coulson, A. R., Fiddes, J. C., Hutchison, C. A., III, Slocombe, P. M. and Smith, M. (1977). Nucleotide sequence of bacteriophage φX174 DNA. Nature 265, 687-698.
Shine, J. and Dalgarno, L. (1975a). Determinant of cistron specificity in bacterial ribosomes. Nature 254, 34-38.
Shine, J. and Dalgarno, L. (1975b). The 3' terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc. Nat. Acad. Sci. USA 71, 1342-1346.
Steitz, J. A., Sprague, K. U., Steege, D. A., Yuan, R. C., Laughrea, M., Moore, P. B. and Wahba, A. J. (1977). RNA·RNA and protein·RNA interactions during the initiation of protein synthesis. In Nucleic Acid-Protein Recognition, H. J. Vogel, ed. (New York: Academic Press), pp. 491-508.
Viñuela, E., Algranati, I. D. and Ochoa, S. (1967). Synthesis of virus-specific proteins in Escherichia coli infected with the RNA bacteriophage MS2. Eur. J. Biochem. 1, 3-11.
Weber, K. and Konigsberg, W. (1975). Proteins of the RNA phages. In RNA Phages, N. D. Zinder, ed. (New York: Cold Spring Harbor Laboratory), pp. 51-84.
Ysebaert, M., Van de Voorde, A. and Fiers, W. (1978). Nucleotide sequence of the simian virus 40 HindII + III restriction fragment D and the total amino acid sequence of the late proteins VP2 and VP3. Eur. J. Biochem. 91, 431-439.