The early region of SV40 DNA may have more than one gene


Cell, Vol. 11, 837-843, August 1977, Copyright © 1977 by MIT



Bayar Thimmappaya and Sherman M. Weissman
Department of Human Genetics
Yale University School of Medicine
New Haven, Connecticut 06510

Summary

The nucleotide sequence of 70 base pairs (bp) around 0.545 map units (Alu I C and B junction) of the genome from the single Eco RI cleavage site within SV40 DNA is presented. The mRNA transcribed from the early strand template from this stretch contains two copies of the nonsense triplet UAA in each of the three reading frames. Thus at least 25% of the early region of SV40 DNA does not code for the SV40 “A” protein, and the viral contribution to events in the lytic cycle and transformation may be more complex than is generally appreciated.

Introduction

Prior to viral DNA replication, only part of the genetic information of simian virus 40 (SV40) is expressed as protein. This early protein is necessary for the initiation of viral DNA replication, for the establishment and, at least partly, the maintenance of the transformed state of cells, and for the enhancement of growth of human adenoviruses in primary cultures of African green monkey kidney cells. Several virus-specific antigenic determinants appear in cells that produce or contain early viral mRNA. These antigens include the T and U antigens present in the nucleus of transformed or infected cells and TSTA, an antigen present on the cell surface which is responsible for rejection of SV40-induced tumors by immunized hosts (reviewed by Levine, 1976).

A single protein, the “A” protein, is detected in immunoprecipitates of early viral protein labeled in vivo or synthesized in cell-free systems programmed by early viral mRNA (reviewed by Rundell et al., 1977; Anderson et al., 1977). The molecular weight of this protein has been estimated by several laboratories to be over 85,000 daltons, on the basis of its electrophoretic mobility in acrylamide gels containing sodium dodecylsulfate and reducing agents. The early region of the viral DNA contains ~2600 bp, an amount of DNA barely sufficient to code for a protein of the size of the “A” protein.

In the present report, we show that one third of the way within the early region, the transcript of SV40 DNA has translation termination codons in all three phases. Thus the genetic structure of the early segment of SV40 DNA is more complex than was previously suspected.
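The remark that ~2600 bp is barely sufficient for a protein of over 85,000 daltons can be checked with a back-of-the-envelope calculation. The sketch below is only illustrative: the average residue mass of 110 daltons is a common rule of thumb assumed here, not a value given in the paper, and the function names are hypothetical.

```python
# Rough coding-capacity estimate for the SV40 early region.
# ASSUMPTION: an average amino acid residue mass of ~110 daltons
# (a rule of thumb, not a figure from the paper).
AVG_RESIDUE_MASS_DA = 110


def max_protein_mass(n_bp, residue_mass=AVG_RESIDUE_MASS_DA):
    """Upper bound on the mass (daltons) of a protein encoded by n_bp
    of DNA read continuously in a single reading frame."""
    codons = n_bp // 3
    return codons * residue_mass


def bp_required(mass_da, residue_mass=AVG_RESIDUE_MASS_DA):
    """Approximate length of coding DNA (bp) needed for a protein of
    the given mass."""
    residues = -(-mass_da // residue_mass)  # ceiling division
    return residues * 3


early_region_bp = 2600   # size of the early region quoted in the text
a_protein_mass = 85_000  # lower bound on the "A" protein mass quoted in the text

print(max_protein_mass(early_region_bp))  # 95260
print(bp_required(a_protein_mass))        # 2319
```

Under this assumption, a protein of 85,000 daltons would occupy roughly 2300 of the ~2600 bp available, so any substantial non-coding stretch inside the early region is hard to reconcile with the gel-based size estimate.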

Results

The restriction endonuclease cleavage map of the segment of the SV40 DNA early region extending from 0.49 to 0.66 of the fractional length of the genome from the single Eco RI restriction cleavage site within SV40 DNA is presented in Figure 1. The furthest extent of the 5’ end of early mRNA is located at approximately 0.67 on this restriction map (Khoury et al., 1975; Dhar et al., 1977a), and the 3’ end of early SV40 RNA is located at approximately 0.160 (Khoury et al., 1975; Dhar et al., 1974b, 1974c) or beyond (Reed, Stark and Alwine, 1976).

In the process of analysis of the nucleotide sequence of the viral DNA, we prepared the DNA fragment Alu C (Figure 1) (Yang, Van de Voorde and Fiers, 1976; Jay and Wu, 1977) extending from 0.49 to 0.545 map units, radioactively labeled with 32P-phosphate attached to the 5’-hydroxyl of the nucleoside residue at position 0.545. We also prepared Alu B with radioactivity introduced at the 5’-hydroxyl adjacent to the Alu I cleavage site at 0.545, and Hinf D with radioactivity introduced at the 5’-hydroxyl at position 0.535 (Subramanian et al., 1977). Two-dimensional chromatographic and electrophoretic fractionation was performed on the products of limited snake venom diesterase digestion of each of these labeled fragments. Various restriction cleavage sites in this area were sufficiently close to one another that the patterns obtained by venom diesterase digestion of labeled DNA fragments establish a sequence of over 70 nucleotides (Figures 2-4). The overlap was weak in the center of the fragment, falling between the Hinf cleavage site at 0.535 and the Alu cleavage site at 0.545 on the SV40 map. To confirm this sequence, we analyzed the DNA by limited nucleoside-specific degradation according to the elegant method of Maxam and Gilbert (1977). The resulting patterns are shown in Figures 5 and 6. These results were fully consistent with the interpretation of venom diesterase digests and confirmed that no bases were missing in the overlap between the patterns.

E. coli RNA polymerase is known to transcribe only the early strand of SV40 DNA (Westphal, 1970). To confirm the sequence further, we transcribed SV40 DNA with E. coli RNA polymerase, three nonradioactive nucleoside triphosphates and α-32P-ATP. This RNA was then annealed to the restriction endonuclease fragment Alu C, immobilized on a nitrocellulose filter. The RNA was digested with T1 ribonuclease to remove nonhybridized material, eluted and redigested with T1 ribonuclease. The oligonucleotides were fractionated by two-dimensional electrophoresis and chromatography. Each

Figure 1. Restriction Endonuclease Cleavage Map of SV40 DNA

(a) Hind II, III cleavage map of SV40 DNA. Eco RI refers to the location of the single Eco RI cleavage site of SV40 DNA. (A-K) indicate the Hind II, III cleavage fragments of SV40 DNA. The large horizontal arrow spans the portion of the DNA represented in early cytoplasmic mRNA of infected cells. (b) Hinf and Alu cleavage sites of a portion of the early region of SV40 DNA. The numbers refer to the fractional genome length away from the Eco RI site.

product was analyzed by further digestion of aliquots with pancreatic ribonuclease or U2 ribonuclease (Dhar et al., 1977a). The nucleotide sequence of spots 3, 6 and 29 in Figures 7A and 7B was that predicted from the DNA sequence. The remaining T1 products can be accounted for by nucleotides derived from other portions of the sequence of Alu C (unpublished results). This provides further strong evidence to support the sequence.

The most striking feature of the sequence is a stretch of 50 nucleotides in the DNA that would serve as template for early strand mRNA containing two copies of the nonsense triplet UAA in each of the three reading frames. In one reading frame, as many as four chain termination codons are present, one among them being UGA (see Figure 8). Another prominent feature is the relatively high AT content: there is a run of 18 consecutive AT or TA base pairs, and only 10 of 60 consecutive base pairs are GC or CG.

Figure 2. Autoradiograph of Two-Dimensional Fractionation of Limited Snake Venom Phosphodiesterase Digestion of the DNA Fragment Alu C Labeled with 32P at the Cleavage Site at 0.545 (See Figure 1)

Electrophoresis on Cellogel at pH 3.5 was from left to right, and homochromatography was from bottom to top. Letters indicate the successive mononucleotides removed by the exonuclease.

Discussion

The sequence of this region of SV40 DNA could be determined and confirmed with ease and confidence because of the frequency of restriction endonuclease cleavage sites. This removed any doubts about missing or inserted single bases that might arise when long sequences of DNA are determined in the absence of supporting protein data or convenient restriction cleavage sites.

The triplet UAA is the termination codon for human α- and β-globin mRNA, and UAG is the termination codon for a mutant hemoglobin mRNA (Forget et al., 1975; C. A. Marotta et al., manuscript submitted; J. T. Wilson et al., manuscript in preparation). UGA is the termination codon for the mRNA for SV40 VP-1, the major structural protein of the virus (T. Kempe et al., manuscript in preparation), and for rabbit β-globin mRNA (Efstratiadis, Kafatos and Maniatis, 1977; Browne et al., 1977). Thus each of the nonsense triplets can be singly effective in terminating translation in animal cells.

The initiation codon for the largest early protein must lie downstream from at least some of the termination codons within this segment of DNA. The “A” protein is shortened in an SV40 deletion mutant that lacks SV40 early sequences from 0.325 to 0.450 (Rundell et al., 1977). Sequences published previously by Dhar et al. (1974a, 1974b) show that there are termination codons in all three phases of early mRNA in the region of DNA at about 0.175 fractional genome length. The sequence for most of the SV40 DNA between 0.145 and 0.54 is already known, and the predicted amino acid composition of the translation product is not very aberrant (Volckaert et al., 1977; B. Thimmappaya, unpublished observations). Thus the estimated molecular weight of the large protein coded by the region between termination codons is <70,000 daltons. This is in contrast to the results obtained by several laboratories for the molecular weight of the “A” protein coded for by SV40. Del Villano and Defendi (1973), however, did obtain an “A” protein with an estimated molecular weight of 70,000 daltons.

Certain results are consistent with the suggestion that the region of the SV40 DNA to the right of 0.545 might not encode the “A” protein. The temperature-sensitive mutants of SV40 virus that have been mapped so far all lie within the restriction fragments to the left of 0.430 (Lai and Nathans, 1975; Mertz, 1975). With the related virus polyoma, Feunteun et al. (1976) have reported marker rescue experiments indicating that certain host range mutants lie within the 5’ terminal portion of the early region, while all the temperature-sensitive mutants for early function lie in positions beyond the region analogous to 0.545 on the SV40 genome (Miller and Fried, 1976). There are preliminary reports that the

Figure 3. Autoradiograph of Products of Limited Snake Venom Phosphodiesterase Digestion of the DNA Fragment Alu B Labeled with 32P at the Cleavage Site at 0.545

Fractionation and labeling were as in Figure 2.

host range mutants and the temperature-sensitive early mutants might complement each other in some functions (M. Fluck et al.; W. Eckhart, Abstracts of Tumor Virus Meeting, Cold Spring Harbor Laboratory, 1976). Such results imply that a product made by the temperature-sensitive mutants acts “in trans” to complement this class of host range mutants. This would be consistent with a single bifunctional, large early protein, but also with the existence of two separate genes within the early region. The sequence reported here favors the latter alternative. Shenk, Carbon and Berg (1976) have recently reported that a deletion mutant that extends upstream from 0.54 on the SV40 genome is viable and produces a T antigen that has a similar electrophoretic mobility to T antigen produced by wild-type SV40. This is consistent with

Figure 4. Autoradiograph of Two-Dimensional Fractionation of Products of Limited Snake Venom Phosphodiesterase Digestion of the DNA Fragment Hinf D Labeled with 32P at the Cleavage Site at 0.535

Labeling and fractionation were as in Figure 2.

the sequence, which indicates that the 5’ portion of the early region of SV40 DNA does not code for any part of the amino acid chain of the “A” protein. On the other hand, adenovirus 2-SV40 hybrid viruses lacking SV40 early sequences upstream from 0.54 direct the synthesis of U antigen and TSTA, but not of T antigen (Lewis, 1977). The efficiency with which suppression of termination of translation occurs in bacterial systems may vary manyfold with the particular site at which a termination codon occurs. It would be remarkable, however, to have

Figure 5. Autoradiograph of One-Dimensional Acrylamide Gel of Chemical Degradation of the DNA Fragment Alu C Labeled with 32P at the Cleavage Site at 0.545 (Figure 1)

Electrophoresis was from top to bottom. Columns labeled A, AG, CT and C represent results of selective cleavage at these bases. 6 x 10’ cpm were taken for each cleavage and electrophoresed as described by Maxam and Gilbert (1977). Letters in the right-hand column show the correspondence of the pattern with the sequence deduced.

highly efficient read-through of two termination codons. A simpler alternative would be that the sodium dodecylsulfate acrylamide gel electrophoretic mobility of the “A” protein is very misleading. This is curious, since this electrophoretic mobility has been noted for a protein synthesized in the wheat germ cell-free system under conditions where one might not expect to see extensive post-synthetic glycosylation or other modification of the primary translation product (Prives et al., 1977).

Figure 6. Products of Limited Chemical Degradation of the DNA Fragment Hinf D Labeled with 32P at the Cleavage Site at 0.535

Fractionation and labeling were as in Figure 5, except that 1.2 x 10^5 cpm were taken for each chemical cleavage. The patterns obtained in the A column were identical to the AG column, but were very faint and hence are not shown here.

There is no clear assignment of function to SV40 sequences in the early region upstream from 0.54. From the partial sequences available for this region (Dhar et al., 1977b), this DNA could code for a peptide. The deletion mutants show that at least a portion of these sequences is not necessary for viability of the virus in cultured cells. The T antigenic determinants appear to be associated with the “A” protein. In the related papovavirus polyoma, mutations in the region analogous to SV40 early DNA sequences upstream from 0.50 appear to affect the host range of the virus and its ability to transform cells without affecting its viability in appropriate hosts. Failure to find temperature-sensitive mutants in this region of the virus could be due to some combination of the nonessentiality of the gene product for replication of SV40 in conventional cell lines and the small size of the putative protein.

Only a single 19S mRNA species has been described from the early region of the SV40 genome. Possibly, the methods commonly used would not resolve two segments of RNA differing in length by a few hundred bases, particularly because of heterogeneity in the length of poly(A). In other examples of eucaryotic polycistronic mRNA, initiation codons for the downstream proteins may not be used as effectively as initiation codons near the 5’ end of the message (Kaesberg, 1977). If this applies here, one might speculate that the largest early RNA would direct principally the synthesis of a small protein of unknown antigenicity, and a message with its 5’ end near 0.55 would direct the synthesis of the “A” protein.
One could explain in vitro synthetic results by assuming that the internal initiation product was detected either because mRNA cleavage occurred in the in vitro synthetic systems or because the added mRNA included mRNA with a 5’ end near 0.54.

The deoxythymidylic and deoxyadenylic acid-rich sequences in the neighborhood of the termination codons are also of considerable interest. Similar AT-rich segments have also been noted preceding the 5’ end of the late mRNA of SV40 that codes for the minor structural protein VP-2 and also in the region of the late mRNA where there is the potential initiation codon for the second minor capsid protein VP-3 (V. B. Reddy, R. Dhar and S. M. Weissman, manuscript in preparation). Such regions could be responsible either for mRNA processing or for initiating transcription on the genome or both. A search for more than one species of SV40 early message RNA and for additional small SV40 early proteins in lytically infected or transformed cells would be informative.
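The AT-richness of the sequenced segment can be quantified with a short script. The sketch below assumes the 70-nucleotide strand reconstructed from Figure 8(A); one character that is illegible in the extracted figure was resolved from the complementary strand, so the string should be treated as an assumption. The function names are hypothetical.

```python
# ASSUMPTION: 70-nt strand as read from Figure 8(A); one illegible
# character was resolved from the complementary strand shown there.
SEGMENT = ("CCTACATATATTTAAAGCTATAAGGTAAATATAAAATTTTT"
           "AAGTGTATAATGTGTTAAACTACTGATTC")


def longest_at_run(dna):
    """Length of the longest uninterrupted run of A or T."""
    best = current = 0
    for base in dna:
        current = current + 1 if base in "AT" else 0
        best = max(best, current)
    return best


def min_gc_in_window(dna, window):
    """Smallest number of G or C bases found in any window of the given size."""
    counts = [
        sum(base in "GC" for base in dna[start:start + window])
        for start in range(len(dna) - window + 1)
    ]
    return min(counts)


print(len(SEGMENT))                   # 70
print(longest_at_run(SEGMENT))        # 18
print(min_gc_in_window(SEGMENT, 60))  # 10
```

With this reconstruction, the longest A/T run is 18 base pairs and the most AT-rich 60-bp window contains 10 G or C residues, in agreement with the figures quoted in Results.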

Figure 7A. Two-Dimensional Fractionation of T1 RNAase Digestion Products of SV40 cRNA Complementary to the DNA Fragment Alu C with Weak Homo B

Figure 7B. Two-Dimensional Fractionation of T1 RNAase Digestion Products with Strong Homo

Electrophoresis was on Cellogel at pH 3.5 from left to right, and chromatography was from bottom to top. Numbers 3, 6 and 29 indicate the oligonucleotides UAAAUAUAAAAUUUUUAAG, UUAAACUAAUG and UAUAAUG, respectively. Each oligonucleotide was digested further with pancreatic or U2 RNAase, and the predicted products were obtained. The radioactive label was α-32P-ATP.

Figure 8. Sequence of a Segment of SV40 DNA between 0.535 and 0.575 Fractional Genome Length (See Figure 1)

(a) SV40 DNA sequence

     Alu (0.545)                                                 Hinf (0.535)
     1        10        20        30        40        50        60
    pCCTACATATATTTAAAGCTATAAGGTAAATATAAAATTTTTAAGTGTATAATGTGTTAAACTACTGATTC
     GGATGTATATAAATTTCGATATTCCATTTATATTTTAAAAATTCACATATTACACAATTTGATGACTAAG

(b) Predicted complementary RNA sequence

    pCCUACAUAUAUUUAAAGCUAUAAGGUAAAUAUAAAAUUUUUAAGUGUAUAAUGUGUUAAACUACUGAUUC

(A) Sequence of DNA. Alu and Hinf indicate the respective restriction endonuclease cleavage sites. (B) Sequence predicted for early mRNA transcribed from the DNA sequence in (A). Numbers and short horizontal lines indicate translation termination. Codons 1, 6; 3, 4, 5, 8; and 2, 7 are three sets of termination codons that would block translation in all three phases. The numbers in parentheses refer to the sequence of the T1 products shown in Figures 7A and 7B.
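The reading-frame argument summarized in Figure 8 can be reproduced by scanning the predicted transcript in all three frames. This is a minimal sketch, not the authors' procedure: it assumes the 70-nucleotide mRNA-sense strand as reconstructed from Figure 8(A) (one illegible character resolved from the complementary strand), and the function names are hypothetical. Positions are 1-based from the Alu end, and the frame labels 0, 1 and 2 are arbitrary.

```python
# ASSUMPTION: mRNA-sense DNA strand as read from Figure 8(A).
SENSE_DNA = ("CCTACATATATTTAAAGCTATAAGGTAAATATAAAATTTTT"
             "AAGTGTATAATGTGTTAAACTACTGATTC")
STOP_CODONS = {"UAA", "UAG", "UGA"}


def transcribe(sense_dna):
    """RNA with the same sequence as the sense DNA strand (T replaced by U)."""
    return sense_dna.replace("T", "U")


def stops_by_frame(rna):
    """Map each reading frame (0, 1, 2) to a list of
    (1-based start position, codon) for every termination codon."""
    result = {0: [], 1: [], 2: []}
    for frame in range(3):
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if codon in STOP_CODONS:
                result[frame].append((i + 1, codon))
    return result


stops = stops_by_frame(transcribe(SENSE_DNA))
for frame, hits in stops.items():
    print(frame, hits)
# 0 [(13, 'UAA'), (49, 'UAA')]
# 1 [(26, 'UAA'), (32, 'UAA'), (41, 'UAA'), (65, 'UGA')]
# 2 [(21, 'UAA'), (57, 'UAA')]
```

Numbering the eight termination codons in order of position gives the groupings 1, 6; 3, 4, 5, 8; and 2, 7, which matches the three sets listed in the legend: two frames with two UAA codons each and one frame with three UAA codons plus UGA.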

Experimental Procedures

The enzyme from Hemophilus influenzae C strain (Hinf), the enzyme from Hemophilus aegyptius (Hae III) and the mixture of enzymes from Hemophilus influenzae strain d (Hind II, III) were prepared as previously described. The restriction endonuclease from Arthrobacter luteus (Alu I) was obtained from New England Biolabs. Polynucleotide kinase was prepared from T4-infected E. coli by the procedure of Richardson (1971). Snake venom phosphodiesterase and bacterial alkaline phosphatase were obtained from Worthington. T1 RNAase, U2 RNAase and pancreatic RNAase were obtained from Calbiochem. The sources of materials used for fractionation of oligonucleotides and DNA fragments have been described elsewhere (Marotta et al., 1974).

A small plaque isolate of SV40 virus was propagated in VERO or CV1 continuous lines of African green monkey cells. Form I DNA isolated in ethidium bromide cesium chloride density gradients was used for all analyses. The details of the procedures for virus propagation, DNA purification and preparation of restriction enzyme fragments by electrophoresis in polyacrylamide gels have been described by Maniatis, Jeffrey and Kleid (1975). Oligonucleotides were fractionated according to the Brownlee-Sanger procedure of electrophoresis and homochromatography (Brownlee and Sanger, 1969). Selective degradation of terminally labeled DNA fragments to cleave preferentially at guanylic, adenylic, cytidylic plus thymidylic acids and subsequent sequencing of such modified fragments were performed according to the procedures of Maxam and Gilbert (1977). Preparation of cRNA from SV40 form I DNA, annealing of the transcript to Alu C fragment and the analysis of the annealed transcript were performed as described by Dhar et al. (1977a).
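The T1 RNAase analysis of the annealed transcript relies on the enzyme cleaving after guanylic acid residues. As a simplified illustration, the sketch below performs an in silico complete T1 digest of the predicted transcript. It assumes the 70-nucleotide RNA reconstructed from Figure 8(B), ignores terminal phosphates and incomplete digestion, and uses a hypothetical function name.

```python
# ASSUMPTION: predicted early-strand transcript as reconstructed from Figure 8(B).
PREDICTED_RNA = ("CCUACAUAUAUUUAAAGCUAUAAGGUAAAUAUAAAAUUUUU"
                 "AAGUGUAUAAUGUGUUAAACUACUGAUUC")


def t1_digest(rna):
    """Split an RNA string after every G, mimicking complete
    digestion with ribonuclease T1 (cleavage 3' to guanosine)."""
    fragments = []
    start = 0
    for i, base in enumerate(rna):
        if base == "G":
            fragments.append(rna[start:i + 1])
            start = i + 1
    if start < len(rna):
        fragments.append(rna[start:])  # 3'-terminal piece lacking G
    return fragments


fragments = t1_digest(PREDICTED_RNA)
for fragment in fragments:
    print(len(fragment), fragment)
```

Two of the oligonucleotides named in the Figure 7 legend, UAAAUAUAAAAUUUUUAAG and UAUAAUG, appear among the fragments produced from this reconstructed sequence.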

This research was supported by a grant from the American Cancer Society and a fellowship from the James Hudson Brown Foundation. We are grateful for the excellent technical assistance of Alan McCluskey, Christine Gerhardt, Sharlene Ivory and Richard Wang, and to Marjorie Veronneau for typing the manuscript.

Received March 16, 1977; revised April 21, 1977

References

Anderson, J. L., Chang, C., Mora, P. and Martin, R. B. (1977). J. Virol. 27, 459-467.

Browne, J. K., Paddock, G. V., Liu, A., Clarke, P., Heindell, H. C. and Salser, W. (1977). Science 195, 369-391.

Brownlee, G. G. and Sanger, F. (1969). Eur. J. Biochem. 11, 395-399.

Del Villano, B. C. and Defendi, V. (1973). Virology 51, 34-46.

Dhar, R., Weissman, S. M., Zain, B. S., Pan, J. and Lewis, A. M., Jr. (1974a). Nucl. Acids Res. 1, 595-613.

Dhar, R., Zain, B. S., Weissman, S. M., Pan, J. and Subramanian, K. N. (1974b). Proc. Nat. Acad. Sci. USA 71, 371-375.

Dhar, R., Subramanian, K. N., Zain, B. S., Pan, J. and Weissman, S. M. (1974c). Cold Spring Harbor Symp. Quant. Biol. 39, 153-160.

Dhar, R., Subramanian, K. N., Pan, J. and Weissman, S. M. (1977a). J. Biol. Chem. 252, 368-376.

Dhar, R., Subramanian, K. N., Pan, J. and Weissman, S. M. (1977b). Proc. Nat. Acad. Sci. USA 74, 827-831.

Efstratiadis, A., Kafatos, F. C. and Maniatis, T. (1977). Cell 10, 571-585.

Feunteun, J., Sompayrac, L., Fluck, M. and Benjamin, T. (1976). Proc. Nat. Acad. Sci. USA 73, 4169-4173.

Forget, B., Marotta, C. A., Weissman, S. M. and Cohen-Solal, M. (1975). Proc. Nat. Acad. Sci. USA 72, 3614-3618.

Jay, E. and Wu, R. (1977). Biochemistry 15, 3612-3620.

Kaesberg, P. (1977). Prog. Nucl. Acid Res. Mol. Biol. 19, 465-470.

Khoury, G., Howley, P., Nathans, D. and Martin, M. (1975). J. Virol. 15, 433-437.

Lai, C. J. and Nathans, D. (1975). Virology 66, 70-81.

Levine, A. J. (1976). Biochim. Biophys. Acta 458, 213-241.

Lewis, A. M., Jr. (1977). Prog. Med. Virol., in press.

Maniatis, T., Jeffrey, D. and Kleid, D. G. (1975). Proc. Nat. Acad. Sci. USA 72, 1184-1188.

Marotta, C. A., Lebowitz, P., Dhar, R. and Weissman, S. M. (1974). Methods in Enzymology, 29 (New York: Academic Press), pp. 254-272.

Maxam, A. and Gilbert, W. (1977). Proc. Nat. Acad. Sci. USA 74, 560-564.

Mertz, J. E. (1975). Ph.D. thesis, Stanford University, Palo Alto, California.

Miller, L. K. and Fried, M. (1976). J. Virol. 18, 824-832.

Prives, C., Gilboa, E., Revel, M. and Winocour, E. (1977). Proc. Nat. Acad. Sci. USA 72, 456-461.

Reed, S. I., Stark, G. R. and Alwine, J. C. (1976). Proc. Nat. Acad. Sci. USA 73, 3083-3087.

Richardson, C. (1971). Procedures in Nucleic Acid Research, G. L. Cantoni and D. R. Davies, eds. (New York: Harper and Row), p. 815.

Rundell, K., Collins, J. K., Tegtmeyer, P., Ozer, H. L., Lai, C. J. and Nathans, D. J. (1977). Virology 27, 636-646.

Shenk, T., Carbon, J. and Berg, P. (1976). J. Virol. 18, 664-671.

Subramanian, K. N., Zain, B. S., Roberts, R. J. and Weissman, S. M. (1977). J. Mol. Biol. 110, 297-317.

Volckaert, G., Contreras, R., Soeda, E., Van de Voorde, A. and Fiers, W. (1977). J. Mol. Biol. 110, 467-510.

Westphal, H. (1970). J. Mol. Biol. 50, 407-420.

Yang, R., Van de Voorde, A. and Fiers, W. (1976). Eur. J. Biochem. 67, 119-138.