Cell, Vol. 56, 891-904, March 10, 1989, Copyright © 1989 by Cell Press
Reverse Transcriptase-Dependent Synthesis of a Covalently Linked, Branched DNA-RNA Compound in E. coli B
Dongbin Lim and Werner K. Maas
Department of Microbiology
New York University Medical Center
New York, New York 10016
Summary
We have found a branched DNA-RNA compound in E. coli B that is similar in its secondary structure, but not in its nucleotide sequence, to the previously described branched DNA-RNA compounds in myxobacteria. This compound is not produced in E. coli K12. We have cloned a 3.5 kb chromosomal segment of E. coli B which, when transferred into E. coli K12, leads to the production of the DNA-RNA compound. We describe the isolation of the DNA-RNA compound, the determination of its nucleotide sequence, and the nucleotide sequence of the genes required for its formation. The sequence contains the coding regions for the DNA component, the RNA component, and an open reading frame encoding a reverse transcriptase. This reverse transcriptase is shown to be required for the formation of the DNA-RNA compound in vivo and in vitro.
Introduction
During studies on the regulatory gene argR of E. coli B, which codes for the arginine repressor, we measured argR mRNA by primer extension with a 17 bp primer complementary to part of the argR structural gene, using M-MuLV reverse transcriptase (Lim et al., 1987). Gel electrophoresis showed large amounts of extended products, about 200 to 350 nucleotides long. Unexpectedly, these were also produced in the absence of added primer, suggesting the presence of an endogenous primer and template. No such reactions occurred with E. coli K12. Following treatment with RNAase A, the extended bands became smaller, which suggested that they contained DNA linked covalently to RNA. Structures with this combination of nucleic acids have been described in myxobacteria (Yee et al., 1984; Furuichi et al., 1987a, 1987b). It has been shown that they consist of single-stranded DNA (msDNA) linked by a 2'-5' phosphodiester linkage to RNA (msdRNA). Previously, it was reported that E. coli does not produce msDNA (Dhundale et al., 1987). In this paper, we show that E. coli B produces an msDNA-RNA compound, but E. coli K12 does not. The structure of the msDNA-RNA compound in E. coli B was determined and its chromosomal determinants were characterized. One of the chromosomal determinants is an open reading frame encoding 320 amino acids (ORF320). Evidence is presented that the protein encoded by ORF320 is a reverse transcriptase. Reverse transcriptase-dependent in vitro msDNA synthesis is also presented, and the nature of this peculiar genetic element is discussed.
Results
An Endogenous Substrate for Reverse Transcriptase Is Present in E. coli B, but Not in E. coli K12
To measure argR mRNA in the E. coli B strain AC2514, we used a modified primer extension method employing labeled nucleotides, as done previously with argR mRNA of E. coli K12 (Lim et al., 1987). When the reaction products were analyzed on an 8% sequencing gel, large amounts of labeled compounds were observed. As shown in Figure 1A, such large amounts were not observed in extracts of the E. coli K12 strain JM103. Surprisingly, they were also formed in controls of E. coli B without added primer. In the experiment shown in Figure 1A, two extended bands (a and b) were formed, corresponding to about 260 and 220 nucleotides, respectively. In other experiments, the number and size of extended bands varied, ranging from 200 to 350 nucleotides. These variations did not depend on the particular RNA preparation or on the particular batch of reverse transcriptase. In all cases, when the material from the extended bands was treated with RNAase A (or alkali) and the residual DNA analyzed, only one band, 144 nucleotides long, was formed (Figure 1B, band c). The results of these experiments suggest that the products of the reverse transcriptase reaction consist of DNA linked to RNA, and that the DNA component is 144 nucleotides long.
Cloning of the Chromosomal Genes of the DNA-RNA Compound
To clone the chromosomal determinants of this nucleic acid, Southern hybridization was carried out with chromosomal DNA, using the reverse transcriptase-extended product as a probe. As shown in Figure 2, the probe hybridizes with a specific restriction enzyme fragment of DNA isolated from E. coli B, but not with any fragments from E. coli K12. Two different E. coli B strains (strain 814 and AC2514) gave the same hybridization pattern with three different restriction enzymes (EcoRI, PstI, and XhoI). This indicates that there is a unique locus in the E. coli B chromosome that is homologous to the probe nucleic acid. The 3.5 kb PstI fragment of AC2514 DNA, which hybridized with the probe, was isolated from a gel and cloned into the PstI site of the multipurpose cloning vector pTZ19U. By colony hybridization with the same probe, a positive clone, JM103 (pDB808), was isolated and analyzed further. When total RNA extracted from JM103 (pDB808) was treated with RNAase A and applied to a polyacrylamide gel, a large amount of a small satellite DNA, approximately 80 nucleotides long, was observed (Figure 3B). The same DNA band, but in a much smaller amount, was also observed in wild-type E. coli B (data not shown), but not in K12 strain JM103 (Figure 3B). Because this DNA is sensitive to S1 nuclease, it is concluded that it is a multicopy single-stranded DNA (msDNA), like the ones characterized in myxobacteria (Yee et al., 1984; Furuichi et al., 1987b; Dhundale et al., 1987). As shown in Figure
3A, if the total RNA is directly subjected to gel electrophoresis, two bands, which are specific for the pDB808 clone and larger than the msDNA, are observed. When these bands (a and b) were isolated and treated with RNAase A, the residual DNA comigrated with msDNA. We conclude that bands a and b are composed of msDNA linked to RNA.
Sequencing of msDNA
After RNAase A treatment, msDNA of E. coli B was purified from an acrylamide gel and labeled with 32P at its 5' end with polynucleotide kinase. This labeled msDNA was subjected to Maxam-Gilbert sequencing. As in the case of msDNA from myxobacteria, most of the 32P label was released by the piperidine reaction, suggesting that the label is in an alkali-sensitive RNA residue rather than in DNA. The 3' end of msDNA was therefore labeled with terminal deoxynucleotidyl transferase and subjected to sequencing. As shown in Figure 4A, there are two msDNA species differing in length by one nucleotide. Both species of msDNA were purified and sequenced. Sequencing showed that the longer one has one more base at its 3' end; otherwise, they are the same. One of the sequencing gels, with the sequence of the 5' end of the msDNA, is shown in Figure 4B. The msDNA sequence determined in the above experiments is shown in Figure 7, together with the covalently linked msdRNA sequence. msDNA of E. coli B is composed of 85 or 86 nucleotides. From its nucleotide sequence, a stable stem-loop structure can be predicted.
Figure 1. Presence of a Specific Nucleic Acid in E. coli B Which Serves as a Template and Primer for Reverse Transcriptase
Total RNA was extracted from E. coli B strain AC2514 and K12 strain JM103. The primer extension reaction was performed with or without added primer as described in Experimental Procedures. The reaction products were analyzed on an 8% sequencing gel before and after removal of RNA. (A) Reverse transcriptase extension products. Reverse transcriptase extension was performed with total RNA extracted from: E. coli AC2514 without primer (lane 1); E. coli AC2514 with argR-specific primer (lane 2); E. coli JM103 without primer (lane 3); E. coli JM103 with argR-specific primer (lane 4). Bands a and b correspond to polynucleotides approximately 260 and 220 bases long. (B) Reverse transcriptase extension products after removal of RNA. The extension reaction was performed with total RNA extracted from E. coli AC2514. Extension products were treated with RNAase A or with alkali. Residual DNA was analyzed on an 8% sequencing gel. Lane 1, molecular weight markers. Lane 2, extension products. Lane 3, extension products after RNAase A treatment. Lane 4, extension products after alkali treatment. Band c corresponds to a polynucleotide 144 bases long.
Figure 2. Southern Hybridization of Chromosomal DNA of E. coli B and E. coli K12 with the Probe Prepared by Reverse Transcriptase Extension of Total RNA of E. coli B
Chromosomal DNAs purified from E. coli strain AC2514 (lanes 1, 2, and 3), 814 (lanes 4, 6, and 9), and JM103 (lanes 5, 7, and 8) were digested with restriction enzymes, electrophoresed on an agarose gel, and transferred to a GeneScreen Plus membrane. Southern hybridization was performed with the probe prepared as described in Experimental Procedures. Lane M, molecular weight markers prepared by HindIII digestion of λ DNA. Molecular weights are in kilobase pairs. Lanes 1, 4, and 5 are EcoRI digests; lanes 2, 6, and 7 are PstI digests; and lanes 3, 9, and 8 are XhoI digests of AC2514, 814, and JM103 DNA, respectively.
Figure 3. Expression of msDNA in E. coli K12
Nucleic acids were extracted from E. coli JM103 or E. coli JM103 (pDB808) and applied to an 8% polyacrylamide gel before (A) or after (B) RNAase A treatment. Lane 1, total RNA extracted from JM103. Lane 2, total RNA extracted from JM103 (pDB808). Bands a and b are the intact msDNA-RNA compounds, and band c is msDNA.
Reverse Transcriptase-Dependent Synthesis of msDNA 893
Figure 4. Presence of Two Species of msDNA and Determination of the Sequence at the 5' End
msDNA of E. coli JM103 (pDB808) was purified by gel electrophoresis and labeled at the 3' end with terminal deoxynucleotidyl transferase. (A) shows the result of 8% urea-polyacrylamide gel electrophoresis of this labeled msDNA and (B) shows the Maxam-Gilbert sequencing products of the upper band of (A). Arrowheads in (A) indicate two different msDNAs, and the arrowhead in (B) indicates the 5' end of msDNA. The numbers in (B) indicate the corresponding nucleotide positions of msDNA shown in Figure 7.
Sequencing of msdRNA
To determine the msdRNA sequence, the intact msDNA-RNA compounds (bands a and b in Figure 3) were extracted from a polyacrylamide gel, dephosphorylated with alkaline phosphatase, and labeled with T4 polynucleotide kinase. When these labeled nucleic acids were analyzed on an 8% sequencing gel, band a of Figure 3A was separated into two bands (equivalent to 168 and 166 nucleotides), while band b of Figure 3A was equivalent to a polynucleotide 150 nucleotides long (data not shown). Nucleic acids in all three bands (henceforth called Band 168, Band 166, and Band 150, respectively) were isolated and their RNA sequences determined by base-specific RNAase digestions (Furuichi et al., 1987b). Only the sequencing gels obtained from Band 166 are shown. As shown in Figure 5A, a large gap, equivalent to about 85 nucleotides, is observed after the 11th residue starting from the 5' end of Band 166 (or after the 13th residue of Band 168, see lane L of Figure 5A) in partial alkaline digestion products as well as in partial RNAase reaction products. The presence of this gap suggests that an alkali (and RNAase) resistant molecule, such as DNA, equivalent to
Figure 5. Sequencing of msdRNA
Intact RNA-DNA compounds (bands a and b of Figure 3) were purified from a polyacrylamide gel and labeled with 32P with polynucleotide kinase. Labeled nucleic acids were repurified and subjected to RNA sequencing reactions as described in Experimental Procedures. Only the RNA sequencing gel obtained from reactions performed with the nucleic acid present in Band 166 (see text) is shown. The reaction products were analyzed against a partial alkaline hydrolysis ladder as a standard on a 20% (A) or 8% (B) sequencing gel. G, A, AU, CU, and C result from partial RNAase treatment with ribonucleases T1 (G), U2 (A), PhyM (AU), B. cereus (CU), and CL3 (C), respectively. OH-, L, and S represent partial alkaline hydrolysis products of Band 166 (OH-), Band 168 (L), and Band 150 (S) (see text), respectively. N, purified msDNA-RNA compound, no RNAase treatment. Arrows indicate the 85 nucleotide long gap. The numbers in (B) are the msdRNA nucleotide coordinates corresponding to the numbers in Figure 7.
Cell 694
85 nucleotides, is covalently linked to the 12th residue of the msdRNA of Band 166 (or the 14th residue of Band 168). To see the nucleotide sequence above the gap, the sequencing reaction products were analyzed on an 8% sequencing gel. The results are shown in Figure 5B. All bands above the gap in the RNAase reactions are observed as doublets, with a higher density in the upper band of each doublet. These results show that the DNA covalently linked to RNA contains two species. Since we showed in Figure 4A that msDNA of E. coli B consists of two species differing by one nucleotide and that the longer (the 86 nucleotide long species) is more abundant than the shorter (the 85 nucleotide long species), we believe that the doublet bands are due to the two different msDNAs linked to the RNA (msdRNA). From similar experiments done with the nucleic acids of Band 168 and Band 150, the following results were obtained. The sequence of msdRNA of Band 168 is the same as that of msdRNA of Band 166, except for two additional nucleotides at its 5' end (see Figure 5A, lane L). As shown in lane S of Figure 5B, the msdRNA of Band 150 is 18 nucleotides shorter at its 3' end than the msdRNA of Band 166. Otherwise, it is the same as that of Band 166. The sequence of msdRNA is shown in Figure 7, together with the sequence of the covalently linked msDNA.
The Nature of the RNA-DNA Linkage
It has been shown that msDNA of M. xanthus or S. aurantiaca is linked to msdRNA by a 2'-5' phosphodiester bond because it is sensitive to the HeLa cell debranching enzyme (Furuichi et al., 1987b; Dhundale et al., 1987). Therefore, the nature of the E. coli B msDNA-RNA linkage was examined as follows. The msDNA prepared after RNAase A treatment was labeled at the 3' end with terminal deoxynucleotidyl transferase and treated with various amounts of HeLa cell debranching enzyme. The reaction products were analyzed on an 8% sequencing gel together with the Maxam-Gilbert sequencing reaction products of the same DNA. As shown in Figure 6, the msDNA that is treated with debranching enzyme is three nucleotides shorter than the starting nucleic acid. From the msdRNA sequence and the base specificity of RNAase A, it was expected that msDNA prepared after RNAase A treatment would contain the triribonucleotide 5'AGC3' at its 5' end (see Figure 7). From these results, we conclude that the msDNA of E. coli B, like the msDNAs of myxobacteria, branches off from msdRNA by a 2'-5' phosphodiester bond. The structure of the msDNA-RNA compound of E. coli B is shown in Figure 7.
Figure 6. Sensitivity of the RNA-DNA Linkage to HeLa Cell Debranching Enzyme
msDNA purified after RNAase A treatment was labeled with terminal deoxynucleotidyl transferase and treated with HeLa cell debranching enzyme as described in Experimental Procedures. Reaction products were analyzed on an 8% sequencing gel together with Maxam-Gilbert reaction products of the same DNA. Lane 1, msDNA after RNAase A treatment, labeled with terminal deoxynucleotidyl transferase, no debranching enzyme treatment. Lanes 2, 3, and 4 are the reaction products after treatment with 300, 60, and 30 U of debranching enzyme. Arrow indicates the debranched product.
Figure 7. Structure of the msDNA-RNA Compound of E. coli B Strain AC2514
The numbers start from the 5' end of msDNA and msdRNA. The 3' end of the small msDNA ends at position 85; thus, it is one nucleotide shorter than the bigger one. The msdRNAs of Band 168, Band 166, and Band 150 are composed of 82 (from positions 1 to 82), 80 (from positions 3 to 82), and 62 (from positions 3 to 64) nucleotides, respectively. The linkage between RNA and DNA is a 2'-5' phosphodiester bond. Complementary base pairings are shown by bars between two bases.
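The band names used for the intact compounds can be checked against the component lengths given for Figure 7. The short Python sketch below is an illustration added here, not part of the original analysis. It assumes the longer, more abundant 86 nucleotide msDNA and the msdRNA coordinates 1 to 82, 3 to 82, and 3 to 64; the variable and function names are hypothetical.

```python
# Illustrative arithmetic only: add the msdRNA and msDNA lengths reported in
# the text to estimate the size of each intact msDNA-RNA compound.
MSDNA_LENGTH = 86  # assumption: the longer, more abundant msDNA species

# msdRNA coordinates (first, last position) read from the Figure 7 legend
msdrna_spans = {
    "Band 168": (1, 82),
    "Band 166": (3, 82),
    "Band 150": (3, 64),
}


def span_length(first, last):
    """Number of residues from first to last, inclusive."""
    return last - first + 1


compound_sizes = {
    band: span_length(first, last) + MSDNA_LENGTH
    for band, (first, last) in msdrna_spans.items()
}

for band, size in compound_sizes.items():
    rna = size - MSDNA_LENGTH
    print(f"{band}: msdRNA {rna} nt + msDNA {MSDNA_LENGTH} nt = {size} nt")
```

Under these assumptions the sums are 168, 166, and 148 nucleotides, which is close to the sizes of about 168, 166, and 150 nucleotides estimated from the gel.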
Figure 8. M-MuLV Reverse Transcriptase Extends the msDNA to the Branch Point
Preparative reverse transcriptase extension was performed with total RNA extracted from E. coli AC2514 without labeling. Reaction products were treated with RNAase A, and residual DNA was applied to a 5% polyacrylamide gel. The extended band was purified and labeled by polynucleotide kinase with [γ-32P]ATP. After repurification of labeled DNA from an 8% sequencing gel, Maxam-Gilbert sequencing was performed, and the products were analyzed on an 8% sequencing gel. Lane 1, msDNA before extension. Lane 2, extended DNA, purified after the reverse transcriptase reaction. Lanes 3, 4, 5, and 6 represent the G, G+A, T+C, and C cleavage products, respectively. The nucleotide sequence derived from ladders (from bottom to top) below the msDNA band, 5'CTCCGTTCCAACAAGG...3', is the same as that of msDNA (starting from position 51). The nucleotide sequence obtained from ladders above the msDNA band, 5'AATGCAGGAlGCC...3', is complementary to msdRNA (starting from position 71). The numbers indicate the corresponding positions of msDNA or msdRNA sequences.
An msDNA-RNA Compound Present in E. coli B Is the Template and Primer for Reverse Transcriptase
To ascertain that the msDNA-RNA compound shown in Figure 7 served as primer and substrate for the M-MuLV reverse transcriptase, the sequence of msDNA after the extension reaction (band c in Figure 1B) was determined. Toward this end, the extension reaction was carried out with total RNA extracted from the E. coli B strain AC2514 and cold deoxynucleotide triphosphates. After RNAase A treatment, the extended DNA was purified, labeled at the 5' end, and subjected to Maxam-Gilbert sequencing, as described in Experimental Procedures. Part of the sequencing gel is shown in Figure 8. The nucleotide sequence derived from the sequencing ladders below the msDNA band (lane 1) corresponds to the msDNA from position 51 to the 3' end (see Figure 7). The sequencing ladders above the msDNA bands show the complementary sequence of msdRNA, starting from position 71 of msdRNA. The observed size of the extended DNA (144 nucleotides) suggests that the extension ends at the branch point of msdRNA (position 15). It is thus clear that M-MuLV reverse transcriptase extends msDNA using msdRNA as a template and msDNA as primer.
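The size of the extended DNA can be related to the msdRNA coordinates by simple counting. The Python sketch below is an illustration added here under stated assumptions, not the authors' calculation: it takes the 86 nucleotide msDNA, an msdRNA ending at position 82, an 11 base duplex between the 3' ends of msDNA and msdRNA (the overlap reported for the two coding regions), and an extension that stops at position 15. All names are hypothetical.

```python
# Illustrative arithmetic only; every name here is hypothetical.
MSDNA_LENGTH = 86          # longer msDNA species
MSDRNA_3PRIME_END = 82     # last position of the msdRNA
OVERLAP = 11               # bases shared by the msDNA and msdRNA coding regions
LAST_COPIED_POSITION = 15  # extension stops next to the branch point

# The 3' end of msDNA is paired with the last OVERLAP residues of msdRNA,
# so the first residue left to copy lies just upstream of that duplex.
first_copied_position = MSDRNA_3PRIME_END - OVERLAP

# Reverse transcription proceeds toward the 5' end of the template,
# copying every residue down to LAST_COPIED_POSITION inclusive.
added = first_copied_position - LAST_COPIED_POSITION + 1
extended_length = MSDNA_LENGTH + added

print(f"first templated position: {first_copied_position}")
print(f"nucleotides added: {added}")
print(f"extended DNA length: {extended_length}")
```

With these inputs the first templated position is 71, as read from the sequencing ladder, and the extended DNA comes to 143 nucleotides, in reasonable agreement with the 144 nucleotides estimated from the gel.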
Structure of the Chromosomal Determinants
The original clone pDB808 contains a 3.5 kb PstI chromosomal fragment. To localize the chromosomal determinants of msDNA-RNA and to generate deletion clones for sequencing, the 3.5 kb PstI fragment was subcloned into M13mp19. Directional deletion clones were obtained from both ends by the ExoIII/S1 method (Henikoff, 1984), and each deletion clone was tested for the production of msDNA. The part of the insert of pDB808 that is sufficient for production of the msDNA-RNA compound in E. coli JM103 was sequenced by the Sanger method in both directions (Sanger et al., 1977). This sequence, together with a restriction enzyme map of the original 3.5 kb insert, is shown in Figure 9. The coding regions for msDNA and msdRNA are situated in opposite orientation and overlap by 11 bases. When the RNA sequence determined in Figure 5 is compared with its chromosomal coding region, the only residue not identified in the sequencing reactions is guanine at the 12th position from the 5' end of msdRNA of Band 166. It was shown that the 12th nucleotide is linked to msDNA (Figure 5A) and that the linkage is sensitive to the debranching enzyme (Figure 6). There are two 12 bp inverted repeats, one upstream of the branched guanine and the other upstream of the msDNA coding region (from positions 2 to 13, and from positions 159 to 170, Figure 9). Downstream of the RNA-DNA coding regions, there is an open reading frame encoding 320 amino acids (ORF320). It codes for a basic protein whose calculated molecular weight is 36.4 kd. The amino-terminal and the carboxy-terminal regions of ORF320 are lysine- and arginine-rich, but the central region is hydrophobic.
The Amino Acid Sequence of ORF320 Has Extensive Homology with Reverse Transcriptase
From a homology search, we found that the protein encoded by ORF320 is similar to the reverse transcriptase domain of retroviral polymerases.
Figure 10. Alignments of ORF320 with Retroviral Reverse Transcriptases
ORF320  CYKNLLPQGAPSSPKLANLICSK.LDYRIQGYAGSRGLIYTRYADDL
VISNA   YYWKVLPQGWKL8PAVYQ?TItQKILRG...WIECHPHIQFGIYI(DDI
HIV     YQYNVLPQGWKG8PAIPQSSHT-KILEP...FRKQNPDIVIYQYRDDL
HTLV1   YAWKVLPQGFKNSPTLFKRQLAHILQP...IRQAFPQCTILQYRDDI
BLV     FAWRVLPPOFINSPALFBRALQBPLRQ...VSMFSQSLLVSYMDDI
MoMLV   LTWTRLPOGFKNSPTLFDCALHRDLAD...FRIQRPDLILLQYVDDL
RSV     FQWKVLP~G,,TCBPTICQLWGQVLEP...LRLtiHPSLCHLNYRDDL
ORF320  TL.SAOSMKKWKARDFLFSIIPSBGLVINSKKTCISGPRSORKVTG
VISNA   YIGSDLGLECHRGIVNCLASYIAQYGFI4LPEDKRQCGYP..~AKWLG
HIV     YVGSDLBIGQHRTKIEELRQHLLttWGLTTPDKKHQKBPP...FLWRG
HTLV1   LLASPSHEDLL.LLSCATRASLISIiGLPVSSNKTQQ.TP.GTIKFLG
BLV     LYASPTCEQRB.QCYQALAARLRDLGFQVASCKTSQ.TP.SPVPFLG
MoMLV   LLAATSCLDCQ.QGTRALLQTLGNLGYRASAKKAQICQ..KQVKYLG
RSV     LLAASSHDGLB.AAGCEVISTLBRAGFTISPDKVQR.CP..GVQYLG
The amino acid sequence of ORF320 (from 154 to 246) was aligned with some known retroviral reverse transcriptases using the programs of the Genetics Computer Group of the University of Wisconsin, Madison, WI (Devereux et al., 1984). Standard single-letter abbreviations are used to designate amino acids. Amino acids that are the same in all seven sequences are starred. The sequences presented are: VISNA, Visna lentivirus (Sonigo et al., 1985); HIV, human immunodeficiency virus 1 (Ratner et al., 1985); HTLV1, human T cell leukemia virus 1 (Seiki et al., 1983); BLV, bovine leukemia virus (Rice et al., 1985); MoMLV, Moloney murine leukemia virus (Shinnick et al., 1981); RSV, Rous sarcoma virus (Schwartz et al., 1983).
Sequence alignment shows this similarity between ORF320 and the N-terminal part of retroviral polymerases (Figure 10). For instance, ORF320 and Visna lentivirus reverse transcriptase share 20% identical amino acids in the region shown in Figure 10. This in itself may not be significant, but some amino acid sequences that are conserved among the retroviral reverse transcriptases are also observed in ORF320, suggesting that the similarity is significant. The most conserved amino acid sequences among reverse transcriptases, the so-called diagnostic sequences LPQG and Y-DD, are found in the appropriate positions in ORF320.
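The presence of the two diagnostic sequences can be checked mechanically. The Python sketch below is an illustration added here, not the authors' homology search: it scans the ORF320 segment from the first block of the Figure 10 alignment, copied as it appears in this text (so it may carry transcription errors), for LPQG and for Y-DD, where the dash is taken to mean any single residue. The function and variable names are hypothetical.

```python
import re

# ORF320 segment copied from the first block of the Figure 10 alignment as it
# appears in this text; it may contain transcription errors.
orf320_segment = "CYKNLLPQGAPSSPKLANLICSK.LDYRIQGYAGSRGLIYTRYADDL"

# The two diagnostic sequences named in the text. In the pattern for Y-DD,
# "." matches any single character, standing in for the variable residue.
MOTIFS = {"LPQG": r"LPQG", "Y-DD": r"Y.DD"}


def find_motifs(sequence):
    """Return {motif name: (0-based offset, matched text)} for motifs found."""
    hits = {}
    for name, pattern in MOTIFS.items():
        match = re.search(pattern, sequence)
        if match:
            hits[name] = (match.start(), match.group())
    return hits


hits = find_motifs(orf320_segment)
for name, (offset, text) in hits.items():
    print(f"{name}: found {text} at offset {offset} of the segment")
```

Both motifs are found in this segment, with Y-DD appearing as YADD. The offsets are relative to the aligned segment, not to the full ORF320 sequence.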
Figure 9. Restriction Map of the 3.5 kb Insert of pDB808 and Nucleotide Sequence of Chromosomal Determinants of the msDNA-RNA Compound of E. coli B
(A) Restriction map of the 3.5 kb insert of clone pDB808. The solid bar represents the region whose sequence is presented in (B). Transcription is from left to right. Restriction enzymes are: P, PstI; H, HpaI; B, BglII; X, XhoI. (B) Nucleotide sequence of the chromosomal determinants. Only the strand corresponding to the transcript is shown. Nucleotides are numbered starting from the first base observed in the msdRNA. The msdRNA coding region is overlined, and the msDNA coding region is underlined. The msDNA sequence is complementary to the sequence shown in this figure. Inverted repeats are indicated by double-dashed lines. The G at position 14 is the branched guanylate of msdRNA in the msDNA-RNA compound. IR, 12 bp inverted repeat.
ORF320 Is Required for msDNA Synthesis
To determine the minimum sequence required for msDNA synthesis, upstream and downstream deletion clones were generated and analyzed. Three different upstream deletion clones (deletions up to -170, -18, and -14 of the sequence shown in Figure 9) were tested. As shown in Figure 11A, a deletion extending to -170 (lane 4) does not affect msDNA synthesis, but a deletion extending to -18 (lane 3) or to -14 (lane 2) decreases msDNA synthesis. Further deletion into the msdRNA coding region (deletions up to +37) totally abolishes msDNA synthesis (lane 5). This result shows that, although the deletions up to -14 reduce the level of msDNA, sequences upstream of position -14 are not absolutely required for the production of msDNA. Considering the results of S1 nuclease mapping, the lower level of msDNA in the clones M13del-18 and M13del-14 (lane 3 of Figure 11A) is believed to be due to the deletion of part of the promoter for precursor RNA (see below). To investigate the role of ORF320, present downstream of the RNA-DNA coding region, we isolated several clones deleted in the ORF320 coding region. Initially the production of msDNA in these deletion clones was tested by direct gel analysis. No msDNA was detected. Since it is difficult to see a low level of msDNA by this method, a more sensitive reverse transcriptase extension method was used to confirm that these clones did not produce msDNA. As shown in Figure 11B, no msDNA was detected. Since the method is sensitive enough to easily detect a five-fold lower level of msDNA than that present in wild-type E. coli B, as shown in lane 13, it is unlikely that these clones produce msDNA. Therefore, we conclude that ORF320 is essential for the synthesis of msDNA.
Figure 11. The Effects of 5' and 3' Deletions on msDNA Synthesis
Upstream and downstream deletion clones generated by the ExoIII/S1 method were analyzed for the production of msDNA by direct gel analysis of msDNA (A) or by the reverse transcriptase extension method (B), as described in Experimental Procedures. (A) Direct gel analysis of the msDNA of upstream deletion clones. Total RNA was extracted from JM103 infected with phage M13 deletion clones. After RNAase A treatment, the residual nucleic acids were analyzed on a 7% polyacrylamide gel. The fragments cloned into M13mp19 are as follows: Lane 1, from BglII to PstI of pDB808. Lane 2, from -14 to 1230. Lane 3, from -18 to 1230. Lane 4, from -170 to 1230. Lane 5, from 37 to 1230. Numbers correspond to the coordinates of the nucleotide sequence shown in Figure 9B. msDNA bands are indicated by an arrow. (B) The effects of 3' deletions on msDNA synthesis. Total RNA extracted from JM103 harboring different downstream deletion plasmids was subjected to the reverse transcriptase extension reaction. Reaction products were analyzed on an 8% sequencing gel before (lanes 1, 2, 3, 4, 5, 6, 7) and after (lanes 8, 9, 10, 11, 12, 13, 14, 15) RNAase A treatment. Lanes 1 and 8 are negative control reactions of JM103 (pTZ19U). Lanes 7 and 15 are positive control reactions of JM103 (pDB808). Lanes 6 and 14 are the extended products of E. coli B strain AC2514 without plasmid. In lane 13, one-fifth the amount of that present in lane 14 was loaded on the gel to determine the detection limit. The downstream deletion plasmids used in other lanes are pTZ19U derivatives containing the following fragments from the HpaI site to: Lanes 2 and 9, position 1023. Lanes 3 and 10, position 991. Lanes 4 and 11, position 775. Lanes 5 and 12, position 706.
Figure 12. Schematic Representation of the Chromosomal Determinants for the msDNA-RNA Compound and ORF-β-Galactosidase Fusion Expression
The coding sequences of msdRNA and msDNA go in opposite directions. IR and ORF indicate the 12 bp inverted repeats and the 320 amino acid long open reading frame, respectively. The numbers at the top designate the nucleotide coordinates, with the transcription start point at +1 (Figure 9B). In the plasmids, thin lines indicate the DNA derived from vectors and thick lines the inserts derived from the chromosomal determinants. The β-galactosidase activities of JM103 containing ORF-lacZ fusion plasmids are shown at the end of each plasmid structure (U/OD). The number shown in parentheses for plasmid ptacRTZ-14 is the activity before induction of the tac promoter. The β-galactosidase activity of the control JM103 (pRS) was 0.064 U. The unit of β-galactosidase is defined by Miller (1972).
ORF320 Is Translated
Three different ORF-lacZ translational fusion plasmids were constructed as described in Experimental Procedures (Figure 12). In the plasmids pRTZ-170 and pRTZ-14, the expression of the ORF-lacZ fusion protein is controlled by a natural promoter, but in plasmid ptacRTZ-14 it is controlled by the strong tac promoter. After transformation of the plasmids into E. coli JM103, β-galactosidase produced from each clone was measured. The data are shown in Figure 12, together with the structure of the fusion plasmids. The β-galactosidase activity (U/OD unit) of JM103 (pRTZ-170) is high (1.2 U) compared with the control JM103 (pRS) (0.08 U). In JM103 (pRTZ-14), which is deleted for the promoter region, it is low (0.2 U). In JM103 containing plasmid ptacRTZ-14, the activity increases when the tac promoter is induced (26.5 U). Therefore, we conclude that ORF320 is translated in vivo, but its expression is in the range of repressed lacZ (Miller, 1972). When it is controlled by a strong promoter, such as the tac promoter, the expression after induction is about 20-fold higher than in its natural state.
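The fold difference quoted above follows directly from the measured activities. The Python sketch below is an illustration added here using only the values given in the text; the dictionary keys and function name are hypothetical labels.

```python
# β-galactosidase activities (U/OD unit) quoted in the text for JM103
# carrying each fusion plasmid.
activities = {
    "pRTZ-170": 1.2,     # natural promoter
    "pRTZ-14": 0.2,      # promoter region deleted
    "ptacRTZ-14": 26.5,  # strong promoter, after induction
}


def fold_change(value, reference):
    """Ratio of an activity to a reference activity."""
    return value / reference


natural = activities["pRTZ-170"]
fold_vs_natural = {
    plasmid: round(fold_change(activity, natural), 1)
    for plasmid, activity in activities.items()
}

for plasmid, fold in fold_vs_natural.items():
    print(f"{plasmid}: {fold}-fold relative to pRTZ-170")
```

The induced ptacRTZ-14 value is about 22 times the pRTZ-170 value, consistent with the statement that expression from the strong promoter is about 20-fold higher than from the natural promoter.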
msdRNA, msDNA, and Reverse Transcriptase Are Synthesized from a Common Precursor RNA
As shown in Figure 12, deletions extending to nucleotide position -14 reduce ORF320 fusion expression. With the same deletion, as shown in Figure 11A, the production of msDNA is decreased. These effects of upstream deletions on both msDNA and ORF320-lacZ fusion expression suggested the presence of a long transcript covering msdRNA, msDNA, and ORF320. S1 nuclease mapping was carried out for the analysis of a transcript in this region. For this, a 553 bp fragment (from the XhoI site at +382 to a site at -170 of Figure 9B) was labeled at the XhoI site and hybridized with total RNA extracted from JM103 or JM103 (pDB808). After S1 nuclease treatment, residual nucleic acids were analyzed on a 5% sequencing gel. As shown in Figure 13A, the major protected product is about 385 nucleotides long (band a). This band matches the distance from the XhoI site to position +1 (see Figure 13B). Therefore, this transcript starts from the first nucleotide of msdRNA and covers the genes for msdRNA, msDNA, and ORF320. In addition to band a, several minor bands are observed in the range of 225 to 240 nucleotides. The 5' ends of these RNAs map in the msDNA coding region (positions 145 to 180 of the chromosomal sequence shown in Figure 9B). Considering the locations of their 5' ends in the msDNA coding region and the heterogeneity at the ends, we believe that these RNAs are not the primary transcripts, but that they are the RNAase H products of an RNA-DNA hybrid formed during msDNA synthesis (see Discussion).
Figure 13. S1 Nuclease Mapping of Transcripts
S1 nuclease mapping was performed with total RNA prepared from JM103 harboring pDB808 (lanes 3, 4, 5) and JM103 (lanes 6, 7, 8). The probe is a 553 bp fragment (from -170 to 382) labeled at the XhoI site. Reaction products were analyzed on a 5% sequencing gel. (A) S1 mapping results. Lane 1, size markers prepared by Sau3A digestion of pUC1R. Sizes are in base pairs. Lane 2, the end-labeled probe without any treatment. Lanes 3, 4, 5, total RNA extracted from JM103 (pDB808) treated with 100, 200, and 300 U of S1 nuclease, respectively. Lanes 6, 7, 8, total RNA extracted from JM103, treated with 100, 200, and 300 U of S1 nuclease, respectively. The major protected product, band a, is 385 nucleotides long, and the heterogeneous bands indicated by arrow b are in the range of 225 to 240 nucleotides. (B) Schematic representation of the RNA transcripts detected by S1 nuclease mapping. The 5' end of band a maps at the first nucleotide of the longer msdRNA (nucleotide position +1 in Figure 9B). The 5' ends of band b map at positions 145 to 180 of Figure 9B. Numbers refer to the coordinates of the nucleotide sequence shown in Figure 9B. IR indicates inverted repeats.
In Vitro Synthesis of msDNA
Since a biosynthetic model of the msDNA-RNA compound in M. xanthus involving reverse transcriptase has been proposed (Dhundale et al., 1987), we developed an in vitro system to test some predictions of this model. For this, plasmid pT-14 was constructed (see Figure 12). In this plasmid, the natural promoter of msdRNA, msDNA, and ORF320 was replaced with the strong, inducible tac promoter so that the formation of the msDNA-RNA compound could be controlled by the inducer IPTG. A sonic extract was prepared from JM103 (pT-14) after induction of the tac promoter and was incubated with deoxynucleotide triphosphates as described in Experimental Procedures. When the reaction products were analyzed on a sequencing gel after RNAase A treatment, one major band was found, which has the same mobility as the in vivo-produced msDNA purified after RNAase A treatment (Figure 14A, band a). This band was not observed in the control reactions done with a sonic extract of JM103 (Figure 14A, lane 7), nor in the reactions done with a sonic extract of JM103 with the addition of the nucleic acids prepared from the sonic extract of JM103 (pT-14) (Figure 14A, lane 9). When the reaction products were directly applied to a gel before RNAase treatment, several bands bigger than msDNA were observed (lanes 4 and 8).
From a comparison of the reaction products before and after RNAase A treatment, it is clear that msDNA produced in vitro is linked to an RNA, as observed also in vivo (lanes 4 and 8). Another interesting point of this result is the complete absence of free msDNA in lanes 4 or 8. This absence of a free msDNA band before RNAase treatment suggests that the production of msDNA is coupled to the formation of a covalent bond between the template RNA and newly synthesized msDNA. Since it has been proposed that msDNA is synthesized from a primary transcript RNA via reverse transcription (Dhundale et al., 1987), we asked whether the in vitro synthesis of msDNA is sensitive to pretreatment with RNAase. As shown in Figure 14B, pretreatment of a cell-free extract with RNAase A totally abolishes msDNA synthesis. In contrast to the msDNA band, the background bands, which are believed to be DNA polymerase products, are hardly affected. This result shows that an RNA component is essential for msDNA synthesis. In the next experiment, we asked whether the msDNA-RNA joining occurs after or during msDNA synthesis. In order to elucidate the intermediate stages of msDNA synthesis, dideoxynucleotides were added to the reaction mixture. When the reaction products were analyzed on a sequencing gel, all of the chain termination products were
Figure 14. In Vitro Synthesis of msDNA
In vitro synthesis of msDNA was performed with sonic extracts of JM103 or JM103 (pT-14) as described in Experimental Procedures. The reaction products were analyzed on an 8% sequencing gel before or after RNAase A treatment. (A) In vitro synthesis of msDNA. In vitro reactions were carried out with a sonic extract of JM103 (lanes 3, 5, 7, and 9) or a sonic extract of JM103 containing plasmid pT-14 (lanes 4, 6, 8, and 10). In lanes 5 and 9, and lanes 6 and 10, nucleic acids purified from the sonic extract of JM103 (pT-14) were added to the reaction mixture. Lanes 1 and 2 are msDNA extended by M-MuLV reverse transcriptase, after RNAase A treatment, and msDNA labeled with kinase after RNAase A treatment, respectively. (B) msDNA synthesis is sensitive to pretreatment with RNAase A. In vitro reactions were performed with a sonic extract of E. coli JM103 (lanes 3, 5, 7, and 9) or JM103 harboring plasmid pT-14 (lanes 4, 6, 8, and 10) with (lanes 5, 6, 9, and 10) or without (lanes 3, 4, 7, and 8) pretreatment with RNAase A, as described in Experimental Procedures. Reaction products were analyzed on an 8% sequencing gel. Lane 1, M-MuLV reverse transcriptase-extended msDNA after RNAase A treatment. Lane 2, msDNA labeled with polynucleotide kinase after RNAase A treatment. Lanes 3 and 5, reaction products of the JM103 extract without or with pretreatment with RNAase A. Lanes 4 and 6, reaction products of the JM103 (pT-14) extract without or with pretreatment with RNAase A. Bands a and b indicate the msDNA and the msDNA extended to the branch point of msdRNA, respectively.
found to be bigger than msDNA itself (Figure 15). Only after RNAase treatment did a typical sequencing ladder appear. These results show that msDNA is synthesized as part of an RNA-DNA compound. The formation of the RNA-DNA linkage between msdRNA and msDNA is not a postsynthetic reaction, but most likely the first step in the synthesis of msDNA.
ORF320 Encodes a Reverse Transcriptase
The above observations strongly suggest that ORF320 codes for a reverse transcriptase and that msDNA is synthesized by reverse transcription. In order to characterize the reactions of msDNA synthesis, the substrate and the proteins present in crude extracts of E. coli JM103 (pT-14) were separated as follows. For the preparation of the template and primer, total nucleic acids present in the sonic extract were purified by phenol extraction and ethanol precipitation. This nucleic acid preparation contains the msDNA-RNA compound as well as chromosomal DNA and RNA. For the preparation of enzyme fractions, the sonic extract was treated with RNAase A and DNAase I, applied to a Mono Q ion exchange column, and eluted with an NaCl gradient, as described in Experimental Procedures. The sonic extracts of E. coli JM103 and E. coli JM103 containing vector pKK223-2
Figure 15. Chain Termination Products of msDNA Synthesis
In vitro synthesis of msDNA was performed with a sonic extract of JM103 (pT-14) in the presence of dideoxynucleotide triphosphates as described in Experimental Procedures. The reaction products were analyzed on a 10% sequencing gel before (RNAase -) or after (RNAase +) RNAase A treatment. Lanes A, C, G, and T denote the chain termination products. Lane N denotes the reaction without addition of dideoxynucleotides. The major chain termination products are indicated by dots. The numbers designate the corresponding nucleotide positions of msDNA shown in Figure 7.
were also prepared and processed in the same way as the extracts of E. coli JM103 (pT-14). All fractions were incubated with the template and primer prepared from sonic extracts of JM103 (pT-14) as described in Experimental Procedures. After phenol extraction and ethanol precipitation, reaction products were analyzed on a polyacrylamide gel before and after RNAase A treatment. In the reaction with fractions eluted at 0.2-0.3 M NaCl, large amounts of labeled products, spread out over the lane, were observed. These products were observed with all three strains, JM103, JM103 (pKK223-2), and JM103 (pT-14), and RNAase A treatment did not change their mobility in gels (data not shown). Therefore, we conclude that this fraction contains DNA polymerase. In JM103 (pT-14), but not in JM103 or JM103 (pKK223-2), another polymerase activity was eluted at about 0.4 M NaCl. Before RNAase treatment, the product of this enzyme was observed as heterogeneous bands with sizes greater than 144 nucleotides. After RNAase A treatment, the product was one band whose size is the same as that of the msDNA extended by M-MuLV reverse transcriptase (Figure 16). No such reaction products were observed with nucleic acids prepared from JM103 or JM103 (pKK223-2). From these results, we conclude that not only the enzyme with reverse transcriptase activity, but also the template and primer, are derived from the 1244 bp insert present in plasmid pT-14.
To characterize the reaction, an aliquot of an active fraction was incubated with template that had been heated for 90 sec in boiling water and quickly cooled to denature any double-stranded DNA. Another aliquot was incubated with template treated with DNAase-free RNAase A (50 µg/ml for 20 min at 37°C) to eliminate RNA. The reaction products were analyzed on an 8% sequencing gel before and after RNAase A treatment (Figure 16). The results of these experiments show, first, that the size of the DNA component of the extended product is exactly the same as that of the DNA component extended by M-MuLV reverse transcriptase (lanes 2 and 9). This band is observed only after removal of RNA, suggesting that the synthesized DNA is covalently linked to RNA (lanes 5 and 9). Second, with the boiled and quickly cooled nucleic acid as template, most of the background bands disappear but, as in the extension of msDNA with M-MuLV reverse transcriptase (band c of Figure 16), the major band is still present (lane 10). This result suggests that the template and primer for the extension reaction can snap back quickly, indicating that they are linked. Third, when the nucleic acids were pretreated with DNAase-free RNAase A, the background bands were not significantly affected, but the major band was totally absent (lane 11). These results strongly suggest that the active fraction contains an enzyme that extends msDNA to the branch point of msdRNA, in the same manner as M-MuLV reverse transcriptase. To make sure that this is the case, the extension reaction was carried out with boiled template in the additional presence of dideoxynucleotide triphosphates. The chain termination products were analyzed on an 8% sequencing gel after RNAase treatment. As shown in Figure 17, the sequence deduced from the major chain termination products is complementary to the expected region of msdRNA (see Figure 7).
Figure 16. Demonstration of Reverse Transcriptase Activity in the Mono Q Eluate of the JM103 (pT-14) Extract
An active enzyme fraction eluted from the Mono Q column was incubated with template and primer as described in Experimental Procedures, and the reaction products were analyzed on an 8% sequencing gel before and after RNAase A treatment. Lane 1, molecular weight markers prepared by HpaII digestion of pUC18. Lane 2, msDNA extended to the branch point by M-MuLV reverse transcriptase after RNAase treatment. Lane 3, msDNA labeled by kinase after RNAase A treatment. Lanes 4 and 8, reaction products from Mono Q eluate without addition of nucleic acid, before and after RNAase A treatment. Lanes 5 and 9, reaction products from Mono Q eluate with addition of nucleic acid, before and after RNAase A treatment. Lanes 6 and 10, reaction products from Mono Q eluate plus boiled and quickly cooled nucleic acid, before and after RNAase A treatment. Lanes 7 and 11, reaction products from Mono Q eluate plus RNAase-treated nucleic acid, before and after RNAase A treatment. Arrows indicate the major extended product, before and after RNAase A treatment.
Figure 17. Chain Termination Products Formed during the Extension
The extension reaction was carried out with a rerun and concentrated Mono Q eluate plus boiled and quickly cooled nucleic acid prepared from sonic extracts. Dideoxynucleotide triphosphates were added to the reaction mixtures to obtain the chain termination products. The reaction products were analyzed on an 8% sequencing gel after RNAase A treatment. Lane 1, molecular weight markers prepared by HpaII digestion of pUC18. Lane 2, DNA extended by M-MuLV reverse transcriptase. Lane 3, msDNA labeled by kinase. N, extension reaction without addition of dideoxynucleotides. C, A, T, and G denote the chain termination reaction products. Dots indicate major chain termination products. The numbers are the corresponding nucleotide positions of the msdRNA shown in Figure 7.
Discussion
A Reverse Transcriptase Is Produced in E. coli B
We conclude from the evidence presented that an enzyme with reverse transcriptase activity is produced in E. coli B. Our conclusion is based on the following considerations. First, the active protein obtained in the Mono Q column fractionation is the product of ORF320. The deduced amino acid sequence of the protein encoded by ORF320 is similar to the N-terminal portion of other reverse transcriptases and has all the sequence earmarks of reverse transcriptases. The observed reverse transcriptase is present only in extracts of strains that contain the genetic determinant for ORF320. Second, there is evidence that the endogenous primer and template for the observed extension reaction are provided by the msDNA-RNA compound present in E. coli B. In this compound the primer is the 3' end of msDNA and the template is msdRNA from position 71 to the branch point at position 15. The evidence for these statements is based on the following observations. The extension reaction occurs only in extracts of strains that produce the msDNA-RNA compound and does not occur in extracts pretreated with RNAase. When M-MuLV reverse transcriptase is used, the product of the extension reaction has been shown to be the msDNA-RNA compound extended from the 3' end of msDNA to the branch point at position 15 (Figure 6). The reaction products obtained from the extension reaction with the E. coli B enzyme are very similar to those obtained with M-MuLV reverse transcriptase (compare Figures 8 and 17). For both M-MuLV and the E. coli B enzyme, the change observed in the extension products following RNAase A treatment shows that the template/primer is a covalently linked DNA-RNA compound. This is also shown by the rapid renaturation of substrate following heating and quick cooling. That the template is msdRNA, rather than msDNA, is shown in Figure 17 for the extended products of the E. coli B enzyme reaction.
Structure of msDNA in E. coli B
It is shown here that E. coli B has a multicopy, single-stranded linear DNA that branches off from RNA. Five different E. coli B strains (AC2514, AC2507, AC5645, AC5272, and 614) have been tested and all of them have this msDNA-RNA compound (unpublished data). The number of msDNA molecules in wild-type E. coli B is estimated to be approximately 500 per cell. The structure of the msDNA-RNA compound of E. coli B is shown in Figure 7. We have found some variations in this structure: three different msdRNAs and two different
msDNAs, forming six different msDNA-RNA compounds. However, it is likely that the smallest msdRNA (msdRNA composed of 62 nucleotides from position 3 to position 64 of Figure 7) is a degradation product of the bigger msdRNAs composed of 80 or 82 nucleotides. The two msDNAs of E. coli B are composed of 86 or 85 nucleotides. The msDNA composed of 86 nucleotides is produced slightly more often than the msDNA composed of 85 nucleotides. The linkage between msDNA and msdRNA is believed to be a 2'-5' phosphodiester bond, since it is sensitive to HeLa cell debranching enzyme. The 11 nucleotides at the 3' end of msdRNA are complementary to the 3' end of msDNA, suggesting the formation of a DNA-RNA hybrid in this region. Since the branch point of RNA is resistant to alkali or RNAase, the branched residue of msdRNA is not observed in the RNA sequencing gel, but a big gap is observed at the position of the branched residue. This branched nucleotide was determined by comparison of the RNA sequence obtained from gels with the chromosomal DNA sequence. This determination was supported by the sequencing of the reverse transcriptase extension product. Although the DNA-RNA compound of E. coli B has no significant sequence homology to those from M. xanthus and S. aurantiaca, the predicted secondary structures are remarkably similar. The similarity of the secondary structure, but not the primary sequence, suggests that they were either derived from the same source but evolved separately for a long time, or that they originated from different sources. All of the msDNAs that have been characterized branch off from a guanylate residue of msdRNA, which suggests that this guanylate is important for the formation of the branched structure. The branched guanylate is located at the end of an inverted repeat in the chromosomal sequence.
The inverted repeat probably plays an important role in the activation of this guanylate residue, which is located right after the inverted repeat and serves as a primer for msDNA synthesis (see below). The chromosomal determinants are, in order, msdRNA, msDNA, and a reverse transcriptase coding sequence. A promoter for the precursor RNA is located upstream of the msdRNA coding region. A primary transcript synthesized from this promoter is responsible for the synthesis of msdRNA and msDNA. It probably also serves as an mRNA for reverse transcriptase. A pair of inverted repeats is present upstream of the branched guanylate and the msDNA coding region. The reverse transcriptase coding sequence is located right after the inverted repeat.
Biosynthesis of the msDNA-RNA Compound
The proposed mechanism for msDNA biosynthesis in E. coli B, based on our in vitro experiments, is essentially the same as the model proposed by Dhundale et al. (1987) for the synthesis of the msDNA of M. xanthus, which was based on the in vivo effects of metabolic inhibitors on msDNA synthesis. This mechanism rests on the following observations from our in vitro and in vivo experiments. ORF320 is required for the production of msDNA in vivo, and we show that the product of ORF320 is a reverse transcriptase. It was shown by S1 nuclease mapping that there is a long primary transcript covering the msdRNA, msDNA, and reverse transcriptase coding regions. This RNA presumably functions as a precursor of msDNA and msdRNA, and as the mRNA for reverse transcriptase. The heterogeneous bands observed in S1 nuclease mapping (band b in Figure 13A) are regarded as digestion products generated by RNAase H from the DNA-RNA hybrid region of the precursor RNA during reverse transcription. The linking of msDNA to RNA is not a postsynthetic reaction; rather, msDNA is synthesized as a DNA-RNA compound. In further support of this, we found that a partially purified E. coli B reverse transcriptase synthesizes, from an in vitro-generated precursor RNA, a DNA that is covalently linked to RNA (Lim, Oppenheim, and Maas, unpublished data). The mechanism of termination of msDNA synthesis is unknown. More than 95% of in vitro-synthesized msDNA is correctly terminated; only about 1% of in vitro-synthesized DNA is extended to the branched guanylate residue of msdRNA (see Figure 14). Since the partially purified E. coli B reverse transcriptase continues DNA synthesis to the branch point of msdRNA (Figures 16 and 17), there must be other factors involved in the termination of msDNA synthesis. If any protein factor is involved in the termination of msDNA synthesis, it is probably encoded by the E. coli K12 chromosome, since the DNA fragment required for msDNA synthesis contains only one open reading frame (ORF320, for the reverse transcriptase).
The Function of the DNA-RNA Compound
The function of the DNA-RNA compound is unknown. Its absence in K12 indicates that it is not essential for growth. Most likely, it is part, or the product, of a larger parasitic genetic element, such as a defective virus or a transposon. Since we have shown that ORF320 codes for a reverse transcriptase, this element may be related to retroelements in eukaryotes.
The amplification (or replication) of only a part of the determinants, not the whole element, to form the DNA-RNA compound suggests that this compound stands in relation to the larger genetic element as the defective interfering particles produced in conjunction with some eukaryotic viruses stand in relation to the intact viruses. The presence of this genetic element at the same site in the chromosome of two different E. coli B strains suggests that it does not transpose frequently. However, we cannot rule out that, under certain conditions, the intact element in the chromosome, rather than a part of it, would be copied and would transpose to another site. If this is the case, the role of the msDNA-RNA compound may be similar to the role of primer tRNA in retroviral replication.
Experimental Procedures
Materials and Bacterial Strains
Restriction enzymes, T4 polynucleotide kinase, and T4 DNA ligase were obtained from New England Biolabs. Moloney murine leukemia virus reverse transcriptase (cloned M-MuLV reverse transcriptase), the RNA sequencing kit, and terminal deoxynucleotidyl transferase were purchased from Bethesda Research Laboratories. Radioactive materials were obtained from New England Nuclear or Amersham. E. coli
JM103 (Yanisch-Perron et al., 1985) was the host of all plasmids and M13 phages used in this study. E. coli B strains AC2514, AC2507, AC5645, AC5272, and 814 were obtained from Dr. B. Bachman of Yale University. HeLa cell debranching enzyme was a gift from Drs. J. Arenas and J. Hurwitz of the Memorial Sloan-Kettering Institute. DNAase-free RNAase A was prepared according to Maniatis et al. (1982).
Reverse Transcriptase Extension and Sequencing of the Extended Products
Total RNA preparation and primer extension were done as described in Lim et al. (1987). In the primer extension reaction, [α-32P]dCTP, rather than a labeled primer, was used to label the DNA that was synthesized during the extension reaction. Reaction products were analyzed on an 8% sequencing gel before or after removal of RNA by RNAase A (50 µg/ml for 20 min at 37°C) or alkali (0.1 N NaOH for 15 min at 65°C). To sequence the extended product, the extension reaction was performed with cold deoxynucleotide triphosphates under ten times scaled-up conditions. After RNAase A treatment, residual DNA was applied to a 5% polyacrylamide gel, and the extended DNA was identified by staining with ethidium bromide. Extended DNA was eluted from the gel, and its 5' end was labeled with 32P using polynucleotide kinase. The labeled DNA was repurified from an 8% sequencing gel, and the nucleotide sequence was determined by the Maxam-Gilbert method.
Structure Determination of the msDNA-RNA Compound
For the preparation of the intact DNA-RNA compounds, E. coli JM103 (pDB808) was grown in a tryptone-yeast extract-NaCl-glucose medium (TYE) with 100 µg/ml of ampicillin until OD550 = 0.6. Cells were collected by centrifugation and resuspended in ice-cold 10 mM Tris-HCl, 1 mM EDTA (pH 7.6), in 1/20 the volume of the culture. Five percent SDS was added to a final concentration of 0.5% and gently mixed. The suspension was boiled for 90 sec to disrupt the cells and immediately cooled on ice. One-quarter volume of cold 3 M potassium acetate (pH 5.5) was added and gently mixed. After 10 min incubation on ice, it was centrifuged for 10 min in a microfuge and the supernatant was collected. After phenol extraction and ethanol precipitation, the nucleic acids were dissolved in distilled water. For the preparation of msDNA, the nucleic acids, prepared as above, were treated with RNAase A (50 µg/ml for 20 min at 37°C) and applied to a 7% polyacrylamide gel ("direct gel analysis of msDNA"). To prepare the intact DNA-RNA compound, the nucleic acids were directly applied to a 7% polyacrylamide gel. msDNA and intact DNA-RNA compound were extracted from the gel and labeled with polynucleotide kinase using [γ-32P]ATP (5' end) or with terminal deoxynucleotidyl transferase using [α-32P]ddATP (3' end), under the conditions recommended by the supplier. Labeled nucleic acids were repurified from an 8% sequencing gel. msDNA was sequenced by the Maxam-Gilbert method (1980), and msdRNA was sequenced by base-specific RNAase digestion with an RNA sequencing kit (BRL), as recommended by the supplier. Reaction products were analyzed on a 20% acrylamide-6 M urea gel, or an 8% acrylamide-6 M urea gel. For the determination of the branch structure, msDNA purified after RNAase A treatment was labeled at the 3' end with terminal deoxynucleotidyl transferase and repurified from an 8% acrylamide-urea gel.
Aliquots of this labeled msDNA were incubated for 30 min at 37°C with 300 U, 60 U, or 30 U of HeLa cell debranching enzyme under the conditions described in Arenas and Hurwitz (1987), except for the addition of denatured salmon sperm DNA (20 µg/ml). It was found that the addition of excess salmon sperm DNA prevented the degradation of msDNA by DNAase present in the particular debranching enzyme preparation that we used. The reaction products were analyzed on an 8% sequencing gel, together with the Maxam-Gilbert sequencing products (1980).
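The lysis step earlier in this procedure specifies one reagent by final concentration (5% SDS to 0.5%) and another by relative volume (one-quarter volume of 3 M potassium acetate). A minimal Python sketch of the underlying dilution arithmetic follows; the 900 µl sample volume and the function names are hypothetical and serve only as an illustration.

```python
def stock_volume(sample_vol: float, stock_conc: float, final_conc: float) -> float:
    """Volume of stock to add to `sample_vol` so that the mixture reaches
    `final_conc` (mass balance: stock_conc * v = final_conc * (sample_vol + v))."""
    return sample_vol * final_conc / (stock_conc - final_conc)


def conc_after_adding(stock_conc: float, added_fraction: float) -> float:
    """Final concentration after adding `added_fraction` volumes of stock
    to one volume of sample that contains none of the solute."""
    return stock_conc * added_fraction / (1.0 + added_fraction)


# Hypothetical 900 µl cell suspension; 5% SDS stock, 0.5% final.
sds_ul = stock_volume(900.0, 5.0, 0.5)

# One-quarter volume of 3 M potassium acetate added to the lysate.
koac_molar = conc_after_adding(3.0, 0.25)

print(f"5% SDS to add: {sds_ul:.0f} µl")            # 100 µl
print(f"potassium acetate: {koac_molar:.2f} M")      # 0.60 M
```

In other words, bringing a suspension to 0.5% SDS from a 5% stock means adding one-ninth of its volume, and a one-quarter volume of 3 M potassium acetate gives about 0.6 M in the mixture.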
Cloning, Generation of Deletion Clones, and Sequencing of Chromosomal Determinants
Southern hybridization of chromosomal DNA isolated from E. coli AC2514, 814, or JM103 was performed as described in Maniatis et al. (1982). The probe was the msDNA extended to the branch point of msdRNA (see Figure 7). It was prepared by a ten times scaled-up extension reaction of total RNA extracted from E. coli AC2514, under the conditions described in Lim et al. (1987), except for the omission of a synthetic primer. The 3.5 kb PstI fragment of E. coli AC2514 DNA that hybridized to the probe was isolated from a gel and cloned into pTZ19U (US Biochemical Corp.). The positive clone, JM103 (pDB808), was selected by colony hybridization (Maas, 1983). The genes coding for the DNA-RNA product were located on a 2 kb BglII-PstI fragment by restriction analysis and subcloning, and the fragment was transferred into M13mp19 or M13mp18. Directional deletion clones were generated by the ExoIII/S1 method (Henikoff, 1984) in both directions within M13 clones. Each deletion clone was tested for the production of msDNA by direct gel electrophoresis of the RNA extract after removal of RNA (50 µg/ml of RNAase A for 20 min at 37°C). Some of the 5' deletion clones generated in the M13 phage by ExoIII/S1 nuclease were directly used for the analysis of msDNA production. They are M13del-170, M13del-16, M13del-14, and M13del+37. The inserts present in some of the M13 deletion clones (3' deletion clones) were transferred to plasmid vector pTZ19U. The resulting plasmids are pDB1023, pDB991, pDB775, and pDB706. All of these deletion clones have EcoRI and HindIII sites derived from the polylinker of M13mp19 or M13mp18. The number of each deletion clone indicates the deletion endpoint determined by sequencing (see Figure 9B). A 1.7 kb fragment that was sufficient for the production of msDNA in E. coli JM103 was sequenced by the Sanger method with the modified T7 DNA polymerase (Sequenase kit of US Biochemical Corp.) under the conditions recommended by the supplier. In some cases, dITP was used to prevent gel compression caused by secondary structure. The sequence was analyzed with the software provided by the Genetics Computer Group of the University of Wisconsin, Madison, WI (Devereux et al., 1984).
ORF-lacZ fusion plasmids pRTZ-170 and pRTZ-14 were constructed by insertion of a DNA fragment extending from the XhoI site in ORF320 to nucleotide positions -170 (pRTZ-170) or -14 (pRTZ-14) into a 6.2 kb EcoRI-SalI fragment of pORF1 (+1 of the nucleotide sequence is the first nucleotide of the polycistronic RNA, see Figure 12). Joining of the XhoI site of ORF320 with the SalI site of pORF1 (Weinstock et al., 1983) gives an in-frame fusion of ORF320 with lacZ of pORF1. Plasmid ptacRTZ-14 was constructed by joining a PstI-BamHI fragment of pDR540 (Pharmacia) containing the tac promoter and part of the bla gene with a PstI-EcoRI fragment of pRTZ-14. The negative control plasmid pRS was generated by self-ligation of an EcoRI-SalI fragment of pORF1 after fill-in with Klenow enzyme. The β-galactosidase expressed from these fusion constructions was measured according to Miller (1972), after transformation into E. coli JM103. Plasmid pT-14 was constructed by insertion of a 1.26 kb EcoRI-HindIII fragment of pDEL-14 into the EcoRI and HindIII sites of pKK223-2 (Pharmacia). The plasmid pDEL-14 is one of the deletion plasmids generated by the ExoIII/S1 nuclease method (Henikoff, 1984) for the analysis of the role of upstream sequences in the production of the msDNA-RNA compound. The 1.26 kb EcoRI-HindIII fragment of pDEL-14 (the restriction sites are derived from the polylinker of the vector) contains the genes for msdRNA, msDNA, and ORF320 (from nucleotide positions -14 to 1230, see Figure 9).
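The XhoI-SalI joining used for the ORF320-lacZ fusion works because the two enzymes leave identical cohesive ends. The short Python sketch below illustrates this with the standard recognition sequences (XhoI, C^TCGAG; SalI, G^TCGAC); the helper names are hypothetical, and the sketch says nothing about the reading frame, which depends on the flanking sequences of the actual constructs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhang(site: str, cut: int) -> str:
    """5' overhang left by an enzyme that cuts a palindromic `site`
    `cut` bases from the 5' end of each strand."""
    # The bottom strand is cut at the mirror position, so the overhang
    # is the stretch of the site lying between the two cuts.
    return site[cut:len(site) - cut]


XHOI = ("CTCGAG", 1)  # C^TCGAG
SALI = ("GTCGAC", 1)  # G^TCGAC

xho_end = five_prime_overhang(*XHOI)
sal_end = five_prime_overhang(*SALI)

# Two 5' overhangs anneal if one is the reverse complement of the other.
compatible = xho_end == reverse_complement(sal_end)

# Sequence at the ligated junction: XhoI left half + overhang + SalI right half.
junction = XHOI[0][:1] + xho_end + SALI[0][-1:]

print(xho_end, sal_end, compatible)  # TCGA TCGA True
print(junction)                      # CTCGAC
```

The hybrid junction CTCGAC matches neither recognition sequence, so the joined site is not expected to be recut by XhoI or SalI.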
S1 Nuclease Mapping
From an M13 deletion clone (M13del-170) generated by ExoIII/S1 nuclease, a 553 bp probe (from positions -170 to 365, labeled at the XhoI site with polynucleotide kinase) was purified from a gel. This probe was hybridized with total RNA extracted from E. coli JM103 or E. coli JM103 (pDB808), and S1 nuclease mapping was performed as described in Maniatis et al. (1982).
Assay of Reverse Transcriptase
E. coli JM103 (pT-14), JM103 (pKK223-2), and JM103 were grown in TYE medium with or without ampicillin (100 µg/ml). The tac promoters of plasmids pT-14 and pKK223-2 were induced by the addition of isopropyl-β-D-thiogalactoside (IPTG, final concentration 1 mM) at OD550 = 0.6, and incubation was continued for 1 hr. Cells were collected by centrifugation and resuspended in 3 ml of extraction buffer (20 mM Tris-HCl [pH 7.6], 10 mM MgCl2, 7 mM β-mercaptoethanol) per gram of wet weight. After sonication and centrifugation (30 min, 21,500 x g), supernatants were collected (sonic extract). To prepare the template and the primer, total nucleic acid present in the sonic extract of JM103 (pT-14) was purified by phenol extraction and ethanol precipitation. It was dissolved in 10 mM Tris-HCl (pH 8.0), 1 mM EDTA, and adjusted to the original volume. From gel electrophoresis, it was found that this solution of template and primer contains about 25 µg/ml of msDNA, together with large amounts of chromosomal DNA and RNA. To prepare the enzyme fraction, the sonic extracts were treated with DNAase I and RNAase A (3 µg/ml of each for 30 min at 37°C), and after centrifugation as above, the supernatant was fractionated on a Mono Q ion exchange column attached to an FPLC system (Pharmacia). Proteins were eluted with an NaCl gradient made up in the extraction buffer, and individual fractions were assayed for reverse transcriptase activity. The standard assay mixture contained 3 µl of column eluent, 3 µl of template and primer prepared as described in the previous paragraph, 300 µM each of dATP, dGTP, and dTTP, and 10 µCi of [α-32P]dCTP (3000 Ci/mmol) in 30 µl of 50 mM Tris-HCl (pH 7.5), 75 mM KCl, 10 mM DTT, and 3 mM MgCl2.
In Vitro Synthesis of msDNA
For the in vitro synthesis of msDNA, the reaction mixture contained 3 µl of crude sonic extract, rather than column eluent and purified substrate. The reaction mixture was incubated for 30 min at 37°C, and the nucleic acids were purified by phenol extraction and ethanol precipitation and dissolved in 20 µl of 10 mM Tris-HCl, 1 mM EDTA. Reaction products were analyzed on a sequencing gel before or after RNAase A treatment (50 µg/ml for 20 min at 37°C). To obtain the chain termination products, dideoxynucleotides were added to the standard mixture at a final concentration of 150 µM of either ddATP, ddGTP, or ddTTP, or 0.05 µM of ddCTP. The reaction products were analyzed on an 8% sequencing gel before or after RNAase A treatment (50 µg/ml for 20 min at 37°C).
Acknowledgments
The authors are grateful to Drs. Arenas and Hurwitz for the generous gift of the HeLa cell debranching enzyme, to Dr. B. Bachman for the gift of E.
coli B strains, to Dr. J. Oppenheim for protein purification, to Drs. S. Inouye and M. Inouye for communication of their unpublished data and helpful discussions, and to Mila S. Dela Torre for typing the manuscript. Computing was provided by the NYU Medical Center Department of Cell Biology VAX 11/750. This work was supported by Public Health Service Grant GM-08048 from the National Institute of General Medical Sciences to W. K. Maas, who is the holder of Public Health Service Career Award GM15129 from the National Institute of General Medical Sciences. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received February 1, 1989.
References
Aiba, H., Adhya, S., and Crombrugghe, B. (1981). Evidence for two functional gal promoters in intact Escherichia coli. J. Biol. Chem. 256, 11905-11910.
Arenas, J., and Hurwitz, J. (1987). Purification of an RNA debranching enzyme from HeLa cells. J. Biol. Chem. 262, 4274-4279.
Devereux, J., Haeberli, P., and Smithies, O. (1984). A comprehensive set of sequence analysis programs for VAX. Nucl. Acids Res. 12, 387-395.
Dhundale, A., Lampson, B., Furuichi, T., Inouye, M., and Inouye, S. (1987). Structure of msDNA from Myxococcus xanthus: evidence for a long, self-annealing RNA precursor for the covalently linked, branched RNA. Cell 51, 1105-1112.
Furuichi, T., Dhundale, A., Inouye, M., and Inouye, S. (1987a). Branched RNA covalently linked to the 5' end of a single-stranded DNA in Stigmatella aurantiaca: structure of msDNA. Cell 48, 47-53.
Furuichi, T., Inouye, S., and Inouye, M. (1987b). Biosynthesis and structure of stable branched RNA covalently linked to the 5' end of multicopy single-stranded DNA of Stigmatella aurantiaca. Cell 48, 55-62.
Henikoff, S. (1984). Unidirectional digestion with exonuclease III creates targeted breakpoints for DNA sequencing. Gene 28, 351-359.
Lim, D., Oppenheim, J., Eckardt, T., and Maas, W. K. (1987). Nucleotide sequence of the argR gene of Escherichia coli K12 and isolation of its product, the arginine repressor. Proc. Natl. Acad. Sci. USA 84, 6697-6701.
Maas, R. (1983). An improved colony hybridization method with significantly increased sensitivity for detection of single genes. Plasmid 10, 296-298.
Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982). Molecular Cloning: A Laboratory Manual (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory).
Maxam, A., and Gilbert, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. Meth. Enzymol. 65, 499-525.
Miller, J. (1972). Experiments in Molecular Genetics (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory).
Ratner, L., Haseltine, W., Patarca, R., Livak, K., Starcich, B., Josephs, S. F., Doran, E. R., Rafalski, J. A., Whitehorn, E. A., Baumeister, K., Ivanoff, L., Petteway, S. R., Jr., Pearson, M. L., Lautenberger, J. A., Papas, T. S., Ghrayeb, J., Chang, N. T., Gallo, R. C., and Wong-Staal, F. (1985). Complete nucleotide sequence of the AIDS virus HTLV-III. Nature 313, 277-284.
Rice, N. R., Stephens, R. M., Burny, A., and Gilden, R. V. (1985). The gag and pol genes of bovine leukemia virus: nucleotide sequence analysis. Virology 142, 357-377.
Sanger, F., Nicklen, S., and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.
Schwartz, D. E., Tizard, R., and Gilbert, W. (1983). Nucleotide sequence of Rous sarcoma virus. Cell 32, 853-869.
Shinnick, T. M., Lerner, R. A., and Sutcliffe, J. G. (1981). Nucleotide sequence of Moloney murine leukemia virus. Nature 293, 543-548.
Seiki, M., Hattori, S., Hirayama, Y., and Yoshida, M. (1983). Human adult T-cell leukemia virus: complete sequence of the provirus genome integrated in leukemia cell DNA. Proc. Natl. Acad. Sci. USA 80, 3618-3622.
Sonigo, P., Alizon, M., Staskus, K., Klatzmann, D., Cole, S., Danos, O., Retzel, E., Tiollais, P., Haase, A., and Wain-Hobson, S. (1985). Nucleotide sequence of visna lentivirus: relationship to the AIDS virus. Cell 42, 369-382.
Weinstock, G. M., Ap Rhys, C., Berman, M. L., Hampar, B., Jackson, D., Silhavy, T. J., Weisemann, J., and Zweig, M. (1983). Open reading frame expression vectors: a general method for antigen production in Escherichia coli using protein fusions to β-galactosidase. Proc. Natl. Acad. Sci. USA 80, 4432-4436.
Yanisch-Perron, C., Vieira, J., and Messing, J. (1985). Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene 33, 103-119.
Yee, T., Furuichi, T., Inouye, S., and Inouye, M. (1984). Multicopy single-stranded DNA isolated from a gram-negative bacterium, Myxococcus xanthus. Cell 38, 203-209.