J. Mol. Biol. (1979) 131, 341-352
Sequences from the Beginning of the Fiber Messenger RNA of Adenovirus-2
B. SAYEEDA ZAIN AND RICHARD J. ROBERTS
Cold Spring Harbor Laboratory
Cold Spring Harbor, N.Y. 11724, U.S.A.
(Received 24 January 1979, and in revised form 2 March 1979)
Small restriction fragments, from around co-ordinate 86.6 on the adenovirus-2 genome, have been used as primers for direct DNA sequence analysis by Sanger's (Sanger et al., 1977) chain termination method with Ad2† DNA as template. The genomic sequences obtained have been compared with sequences deduced using fiber messenger RNA from Ad2 or Ad2+ND5-infected cells as template. With one primer, Hha 54, the sequences complementary to mRNA match those of the genome for 10 nucleotides but then differ from those found on the genome because this primer hybridizes near the point at which the leader sequence becomes joined to the main body of fiber mRNA. Using Ad2+ND5 fiber mRNA as template, the sequence beyond the point of divergence matches the known sequence of the third leader component of one of the late Ad2 mRNAs, that encoding the hexon polypeptide. With Ad2 fiber mRNA, a heterogeneous sequence continuation is found, in accordance with earlier findings that two major species of fiber mRNA are present, which differ in the nature of the leader component joined to the main body of fiber mRNA (Chow & Broker, 1978). Nevertheless, the data suggest that both leader components become joined to the same nucleotide at the start of the main body of fiber mRNA. The AUG codon, which probably encodes the N-terminus of the fiber polypeptide, occurs just two nucleotides beyond the point at which these leader segments become spliced to the main body of fiber mRNA.
† Abbreviation used: Ad2, adenovirus-2.
0022-2836/79/180341-12 $02.00/0 © 1979 Academic Press Inc. (London) Ltd.
1. Introduction
The messenger RNAs present late during adenovirus-2 infection of human KB cells are mosaic molecules which contain RNA segments transcribed from non-contiguous regions of the Ad2† genome. It has been shown, by electron microscopy, that three short sequences encoded at map positions 16.6, 19.6 and 26.6 on the R-strand become joined to form a leader sequence of 150 to 200 nucleotides at the 5'-end of most of the late messenger RNAs (Berget et al., 1977; Chow et al., 1977a). Since the initial discovery of this phenomenon in the Ad2 system, many other eukaryotic mRNAs have been shown to possess similar structures. Thus, mRNAs transcribed from simian virus 40 (Aloni et al., 1977; Hsu & Ford, 1977; Celma et al., 1977) and RNA tumor viruses (Rothenberg et al., 1978; Mellon & Duesberg, 1978; Krzyzek et al., 1978), as well as those for β-globin (Tilghman et al., 1977), immunoglobulin (Tonegawa et al., 1978) and ovalbumin (Breathnach et al., 1977), are all transcribed from genomes which contain intervening sequences. No clear function has yet been assigned to this mode of mRNA biosynthesis although the basic mechanism by which these discontinuous segments arise seems fairly certain. A primary transcript is produced which is a linear representation of genomic sequences and is then subjected to cutting and splicing to remove the intervening sequences with concomitant rejoining of those portions of the transcript destined to become mRNA sequences. Thus in the case of yeast tRNAPhe, such a precursor has been characterized and its transformation into a mature transfer RNA species accomplished in vitro (Knapp et al., 1978; O'Farrell et al., 1978).
Surprisingly, most of the intervening sequences found in eukaryotic mRNAs, such as that for β-globin, are located not at the extremities of the mRNA, but rather are found within the coding sequences. This is in contrast to the late Ad2 mRNAs. Here the main coding sequences appear to be transcribed intact from the genome and an untranslated 5'-leader region is constructed by the elimination of intervening sequences. The evidence suggesting that the leader portion is not translated rests with the observation that fragments of late Ad2 mRNAs which can be protected by binding to ribosomes do not hybridize to those regions of the genome from which the leader segments are transcribed. Rather, they are complementary to those regions of the genome previously identified as encoding the start of the late genes (J. Manley, personal communication). This fundamental difference in splicing patterns between the late Ad2 mRNAs and other eukaryotic mRNAs, including the early Ad2 mRNAs (Kitchingman et al., 1977; Berk & Sharp, 1978), is intriguing. For instance, the mechanism, which must be precise when it occurs within a coding region, may be imprecise when it only involves untranslated regions.
In order to study this matter further, we have concentrated our attention upon one individual mRNA, which is produced late during Ad2 infection and encodes the major structural protein, fiber. This particular mRNA, the body of which is located almost 25,000 nucleotides away from the sequences encoding the first section of the leader (Chow et al., 1977a,b; Klessig, 1977), differs from other late mRNAs in that several species are present, some of which contain extra leader components in addition to the common tripartite leader. About 25% of these fiber mRNA molecules contain a leader with one additional segment (the fourth leader component) transcribed from co-ordinates 78.6 to 79.1 (Chow & Broker, 1978), and even within this population minor variants occur in which additional sequences from 76.9 to 77.3 and/or 84.7 to 85.1 are present (Chow & Broker, 1978). The significance of these additional sequences is unknown; however, they do not appear to affect the translational properties of the mRNA (Dunn et al., 1978) and it is possible that they are processing intermediates.
In this paper we describe experiments which locate, at the nucleotide level, the 5'-end of the main body of fiber mRNA upon the Ad2 genome. We also derive the sequences adjoining these main body sequences on the two principal species of fiber mRNA, one containing the fourth leader component and the other containing only the common tripartite leader.
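The separation of "almost 25,000 nucleotides" quoted above can be checked roughly from the map co-ordinates, which are percentages of genome length. The sketch below is illustrative only: the genome length of 36,000 nucleotides is an assumed round figure that does not appear in this paper, and the function names are hypothetical.

```python
# Illustrative conversion between Ad2 map co-ordinates (percent of genome
# length) and nucleotide distances.
# ASSUMPTION: the genome length below is a round figure chosen for the
# example; it is not a value taken from this paper.
GENOME_LENGTH_NT = 36_000


def map_units_to_nt(map_units: float, genome_length: int = GENOME_LENGTH_NT) -> int:
    """Convert a span expressed in map units (0-100) to nucleotides."""
    return round(map_units / 100.0 * genome_length)


def distance_nt(coord_a: float, coord_b: float) -> int:
    """Distance in nucleotides between two map co-ordinates."""
    return map_units_to_nt(abs(coord_b - coord_a))


if __name__ == "__main__":
    first_leader = 16.6      # first leader segment (from the text)
    fiber_body_start = 86.3  # left end of the main coding region (from the text)
    print(distance_nt(first_leader, fiber_body_start))
```

Under that assumed length, 69.7 map units correspond to about 25,000 nucleotides, in line with the figure given in the text.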
2. Materials and Methods
(a) Viral DNAs and RNAs
Ad2 and Ad2+ND5 viruses were grown in suspension cultures of human KB cells. DNA was prepared from the purified virions as described previously (Pettersson & Sambrook, 1973; Pettersson et al., 1973). Late poly(A)-selected cytoplasmic RNA was prepared as described by Lewis et al. (1974).
(b) Enzymes
Restriction endonucleases EcoRI (Greene et al., 1974), HhaI (Roberts et al., 1976), HpaI and HpaII (Sharp et al., 1973) were purified as described. DNA polymerase I (Klenow fragment) was obtained from Boehringer-Mannheim. Escherichia coli exonuclease III was from New England Biolabs. Reverse transcriptase from avian myeloblastosis virus was a gift from Dr J. Beard. Phage T7 exonuclease was a gift from P. Sadowski. Dideoxynucleoside triphosphates were obtained from Collaborative Research.
(c) Restriction endonuclease digestion and gel electrophoresis
Restriction endonuclease digestion and gel electrophoresis were carried out as described by Zain & Roberts (1978) except for the use of an improved procedure (G. A. Wilson, personal communication), described below, for the recovery of restriction enzyme fragments from agarose gels. The agarose gel slice, containing the restriction fragment, was inserted into a dialysis bag, previously rinsed with a solution containing 50 µg of bovine serum albumin/ml, together with a small amount of E buffer (0.04 M-Tris-acetate (pH 7.9), 0.005 M-sodium acetate, 0.001 M-Na2EDTA) containing 5 µg bovine serum albumin/ml. In general, about 6 ml of this buffer were used for a gel slice 20 cm × 0.6 cm × 0.3 cm. The dialysis bag was then placed in a horizontal gel apparatus which contained E buffer in both reservoirs and on the plate which normally supports the gel, allowing a continuous path for the flow of ions between each electrode. The gel slice was completely immersed in the buffer within the dialysis bag and the bag was placed so that the gel slice lay parallel with the electrodes. Electrophoresis under a potential of 150 V was carried out for 30 to 60 min and the passage of the DNA from the gel slice into the buffer was monitored by ultraviolet irradiation. As soon as the DNA had migrated out of the gel slice and could be seen adhering to the wall of the dialysis bag, the polarity was reversed and electrophoresis continued for a further 2 min. The dialysis bag was removed from the apparatus and the buffer solution, which now contained the eluted DNA, was removed, extracted with an equal volume of phenol, and precipitated with ethanol.
DNA extracted in this way can usually be successfully recut with restriction enzymes; however, for use as template in the dideoxy sequencing procedure it was often found necessary to introduce an additional purification step. For this purpose, a procedure suggested to us by Dr W. Studier was used. Immediately after elution, the DNA solution was adjusted to 0.1 M-NaCl and loaded onto a small DEAE-cellulose column (1 ml) in a Pasteur pipette. After loading, the column was washed with 10 ml of a solution containing 0.1 M-NaCl, 0.01 M-Tris·HCl (pH 7.9), 0.001 M-Na2EDTA. Elution was carried out in 2 steps, first using 5 to 7 ml of the same buffer containing 0.4 M-NaCl and then with 5 to 7 ml of buffer containing 0.6 M-NaCl. One-ml fractions were collected. If the column is not overloaded (50 µg/column is maximum), the DNA elutes in the first 2 tubes of the 0.6 M-NaCl eluate. If overloaded, it may elute in the first 2 tubes of the 0.4 M-NaCl eluate. Fractions from this column were assayed on a 1.4% agarose gel by loading 20-µl samples of each fraction. Tubes containing DNA were pooled, adjusted to 0.1 M-NaCl by dilution with 0.01 M-Tris·HCl (pH 7.9), 0.001 M-Na2EDTA, and precipitated with ethanol. The DNA was recovered by centrifugation at 25 × 10³ revs/min for 30 min.
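The final dilution step (bringing pooled column fractions back to 0.1 M-NaCl with salt-free Tris/EDTA) is a simple C1·V1 = C2·V2 calculation. The following is a minimal sketch of that arithmetic; the function name is hypothetical, and the 2 ml pooled volume is an assumption based on the DNA eluting in the first 2 one-ml tubes.

```python
def diluent_volume_ml(sample_ml: float, start_molar: float, target_molar: float) -> float:
    """Volume of salt-free buffer needed to reach the target NaCl molarity.

    Uses C1*V1 = C2*V2 and assumes the diluent contributes no NaCl, as is the
    case for the Tris/EDTA solution described in the text.
    """
    if not 0 < target_molar <= start_molar:
        raise ValueError("target must be positive and not exceed the starting molarity")
    final_ml = sample_ml * start_molar / target_molar
    return final_ml - sample_ml


if __name__ == "__main__":
    # ASSUMPTION: two pooled 1-ml fractions from the 0.6 M-NaCl eluate.
    print(round(diluent_volume_ml(2.0, 0.6, 0.1), 2))
```

For two pooled fractions from the 0.6 M-NaCl step this gives about 10 ml of diluent (12 ml final volume); fractions from the 0.4 M-NaCl step would need about 6 ml.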
(d) DNA sequence analysis
The chain termination method (Sanger et al., 1977) was used with some slight modifications as indicated. All 4 [α-32P]dNTPs (4 µCi each, per reaction set; spec. act. about 300 Ci/mmol) were used and the concentrations of the dideoxynucleoside triphosphates (P-L Biochemicals) adjusted accordingly so that the final concentrations were 50 µM-ddATP, 20 µM-ddCTP, 20 µM-ddGTP, and 40 µM-ddTTP. The DNA polymerase I (Klenow fragment) was used at a final concentration of 0.05 unit/5 µl reaction. The initial elongations were carried out for 5 min at room temperature and the cold chase was carried out for 5 min at room temperature using 1 µl of a mixture of all 4 dNTPs at 2.5 mM each. Sequencing gels were the thin (0.35 mm) 8% (v/v) polyacrylamide/urea gels described by Sanger & Coulson (1978).
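To put the quantities above on a common molar scale, the amount of labeled nucleotide and of chain terminator can be estimated as follows. This is a rough sketch rather than part of the published protocol: the function names are hypothetical, and it takes the specific activity and the 5 µl reaction volume at face value.

```python
def label_pmol(activity_uci: float, specific_activity_ci_per_mmol: float) -> float:
    """Picomoles of nucleotide carrying the stated radioactivity.

    uCi / (Ci/mmol) gives 1e-6 mmol, i.e. 1e3 pmol, hence the factor of 1000.
    """
    return activity_uci / specific_activity_ci_per_mmol * 1_000.0


def solute_pmol(concentration_um: float, volume_ul: float) -> float:
    """Picomoles of solute in a given volume (uM x ul = pmol)."""
    return concentration_um * volume_ul


if __name__ == "__main__":
    # 4 uCi of each [alpha-32P]dNTP per reaction set at about 300 Ci/mmol
    print(round(label_pmol(4.0, 300.0), 1))
    # ddATP at a final concentration of 50 uM in a 5 ul reaction
    print(solute_pmol(50.0, 5.0))
```

On these figures each labeled dNTP amounts to roughly 13 pmol per reaction set, whereas ddATP at 50 µM in 5 µl is 250 pmol per reaction.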
(e) Templates
The HpaI D (co-ordinates 85.0 to 98.5) or EcoRI E (co-ordinates 83.4 to 89.7) fragments were isolated as described above. Terminal regions of these fragments were rendered single-stranded using the exonuclease procedure of A. J. H. Smith (personal communication). Both E. coli exonuclease III (3'→5') and phage T7 exonuclease (5'→3') were used. Typically, 5 µg of HpaI D in 60 mM-Tris·HCl (pH 7.5), 2 mM-MgCl2, 1 mM-2-mercaptoethanol, and 5 units exonuclease III in a total vol. of 50 µl were incubated at 37°C for 30 min. After the addition of 50 µl of 0.3 M-NaCl, 0.01 M-Tris·HCl (pH 7.9), 0.01 M-EDTA the solution was extracted with 100 µl of phenol. The aqueous phase was recovered by centrifugation and the DNA precipitated with 2 vol. ethanol. Following centrifugation, the DNA was resuspended in 50 µl of 0.01 M-Tris·HCl (pH 7.9), 0.1 mM-Na2EDTA and 5 µl (0.5 µg) used as template for each sequence set. For T7 exonuclease treatment, 5 µg HpaI D in 0.05 M-Tris·HCl (pH 8.1), 5 mM-MgCl2, 1 mM-dithiothreitol, 0.02 M-KCl, and 3 units of T7 exonuclease were incubated for 30 min at 37°C in a total vol. of 20 µl. A total of 30 µl of 0.3 M-NaCl, 0.01 M-Tris·HCl (pH 7.9), 0.01 M-EDTA was added and the solution extracted with 50 µl of phenol. The aqueous phase was precipitated with ethanol and the DNA redissolved in 50 µl of 0.01 M-Tris·HCl (pH 7.9), 0.1 mM-Na2EDTA. A sample of 5 µl (0.5 µg) of this template was used for each sequence set.
(f) Primers
Ad2 HpaI D or EcoRI E fragments, purified as described above, w
(g) mRNA sequencing
When mRNA (poly(U)-Sepharose-selected cytoplasmic RNA), rather than DNA, was used as a source of template, the sequencing procedure was identical with that used for DNA templates except for the following changes: the primer (5 µl), either with or without exonuclease III treatment, was placed in a capillary tube and boiled for 3 min. The capillary was broken and 2.5 µl of a solution containing 50 mM-Tris·HCl (pH 8.4), 66 mM-KCl, 66 mM-MgCl2, 100 mM-dithiothreitol, 0.1% Nonidet P40 (Shell) was added, and 2 µl (0.1 µg) mRNA was added. The contents were mixed, the capillary resealed, and hybridization carried out for 30 min at 68°C. Further manipulations were as described for DNA templates except that the final concentrations of stop solutions in the reaction mixtures were 0.8 µM-ddATP, 0.4 µM-ddCTP, 0.2 µM-ddGTP or 0.6 µM-ddTTP. The polymerase used was avian myeloblastosis virus reverse transcriptase at 4 units/5 µl reaction. Incubation was at 42°C for 5 min followed by a cold chase (2.5 mM, each dNTP) for 5 min at 42°C.
3. Results
(a) Genome sequence
The structures of the three viral genomes, Ad2, Ad2+ND1 and Ad2+ND5, which have been important in these studies are shown in Figure 1. Those regions of the genome from which the various segments of fiber mRNA are transcribed are indicated. The main coding region, which has been defined genetically (Mautner et al., 1975) and biochemically (Lewis et al., 1975), and by electron microscopy (Chow et al., 1977), lies between co-ordinates 86.3 and 91.2. A restriction enzyme map of the region of the genome around 86.3 for Ad2 and Ad2+ND1 was available from earlier studies (Zain & Roberts, 1978) and is shown in Figure 2. In this region Ad2+ND5 is identical with Ad2+ND1 (Zain, unpublished results). The approximate location, within this region, of the extreme 5'-end of the main body of fiber mRNA had been determined previously (Broker et al., 1978; Roberts et al., 1979). By hybridizing fiber mRNA to appropriate subfragments and fingerprinting the annealed RNA, it was found that fiber mRNA sequences extended no further than 100 nucleotides to the left of the HpaII site at co-ordinate 86.6.
Direct DNA sequence analysis of this region was undertaken. For this purpose the HpaI D fragment (co-ordinates 84.4 to 98.5) of Ad2 was prepared and, following treatment with either E. coli exonuclease III (A. J. H. Smith, personal communication) or phage T7 exonuclease, was used as a template in the dideoxy sequencing method (Sanger et al., 1977). Both HhaI 54 and HpaII 150, which flank the HpaII site at co-ordinate 86.6, were used as primers on exonuclease III-treated template and HhaI 54 was used on T7 exonuclease-treated template. By using primers which had themselves been resected with exonuclease III (Nathans & Roberts, unpublished results), it was possible to deduce the genome sequence extending 50 nucleotides to the left (5') of the HhaI site at 86.5 and 50 nucleotides to the right (3') of that site. Confirmation of this sequence was provided by experiments carried out using templates and primers from a cloned copy of a reverse transcript of fiber mRNA (Zain et al., 1979). The relevant part of the genome sequence is shown in Figure 5.
FIG. 1. Structure of Ad2+ND1, Ad2+ND5 and Ad2 and the location of the coding sequences for fiber mRNA. The approximate positions of the simian virus 40 sequences (- - - - -) present in Ad2+ND1 and Ad2+ND5 are taken from Kelly & Lewis (1973). The locations of the sequences present in the leaders (tripartite leader and 4th leader) and main body of fiber mRNA (solid bars) are taken from Chow et al. (1977a) and Chow & Broker (1978). The scale runs from 0 to 100 map units.
FIG. 2. Fine structure map of the region encoding the start of the main body of fiber mRNA. The restriction enzyme sites shown are taken from Mulder et al. (1974) and Zain & Roberts (1978). Ad2+ND1 and Ad2+ND5 are identical in this region (Zain, unpublished results). The HpaII site at co-ordinate 86.6 lies between HpaII fragments III and IV and provides the right terminus of the small restriction fragment Hha 54. The location of the beginning of the fiber mRNA is described in the text. SV40, simian virus 40. The scale is in number of nucleotides.
(b) Adenovirus-2 fiber messenger RNA sequences
Based upon the earlier observations (Broker et al., 1978; Roberts et al., 1979) that the beginning of fiber mRNA lay within the first 100 nucleotides to the left of the HpaII site at co-ordinate 86.6, it seemed likely that the fragment Hha 54 (Fig. 2) was wholly contained within the mRNA sequence. This was confirmed by showing that it could serve as a primer for reverse transcriptase when fiber mRNA was used as a template. Because fiber mRNA is the only major cytoplasmic mRNA known to contain sequences from around co-ordinates 86 to 91 (Lewis et al., 1975), it proved unnecessary to employ purified mRNA for this purpose. In this and all subsequent experiments, mixed oligo(dT)-selected cytoplasmic RNA isolated late (18 h) during viral infection was used as the source of template.
The fragment Hha 54 was labeled at its 5'-ends using T4 polynucleotide kinase and [γ-32P]ATP and used as a primer for avian myeloblastosis virus reverse transcriptase in the presence of fiber mRNA as template. Following elongation in the presence of unlabeled deoxynucleoside triphosphates, the complementary DNA synthesized was assayed for its sequence content by the blotting technique of Southern (1975). Using HindIII and BglII digests of Ad2 DNA, which had been resolved by agarose gel electrophoresis and transferred to cellulose nitrate membrane filters, it can be seen (Fig. 3) that the cDNA not only hybridizes to fragments containing co-ordinates 16.6, 19.6 and 26.6 (HindIII B and C and BglII A and B) but also to the fragments HindIII H (73.6 to 79.9) and BglII F (77.9 to 84.2). Thus species of fiber mRNA from Ad2 which contain all four leader components are copied when the Hha 54 fragment is used as primer.
In order to define the splice point and to analyze the sequence present in fiber mRNA, a set of reverse transcripts was made, using unlabeled Hha 54 as primer, but this time in the presence of the dideoxynucleoside triphosphate chain terminators and [α-32P]-labeled deoxynucleoside triphosphates.
The results of this experiment are shown in Figure 4(c), from which it can be seen that the first eight nucleotides T-T-T-C-A-T-C-T are clearly defined and are identical with the sequence found when the Ad2 genome serves as template. Beyond that point the genome sequence continues G-C-$-A-C-A-A-T, whereas the cDNA sequence becomes heterogeneous but contains C and T as the predominant candidates for the next base. This heterogeneity is expected on the basis of the multiple species of fiber mRNA known to be present. These data can be interpreted to show that the two major species (i.e. those with and without the fourth leader component) have identical junctions with main body sequences (see Discussion).
FIG. 3. Analysis of the products of reverse transcription of fiber mRNA using Hha 54 as primer. Hha 54, labeled at its 5'-end using polynucleotide kinase, was used to prime the reverse transcription of Ad2 mRNA and the products analyzed by hybridization to Southern (1975) blots of the HindIII and BglII fragments of Ad2 DNA. In each case the positions of the fragments present in a complete digest are shown alongside the cDNA results. In addition to HindIII E and BglII I, the fragments from which the primer is derived, HindIII fragments C, B and H and BglII fragments A, B, and F are also labeled when Ad2 mRNA is the template. The band labeled I' in the third channel (cDNA against Ad2 HindIII) is probably not due to hybridization to HindIII I (cf. marker), but may be due to the partial digestion products HindIII H and L. The restriction enzyme sites shown are taken from: Ad2 HindIII (Roberts, R. J. & Sambrook, J., unpublished data); Ad2 BglII (Lebeau, M. & Zain, B. S., unpublished data).
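The comparison described above, in which the genome read and the cDNA read agree for a stretch and then part company, amounts to finding the length of the shared prefix. A minimal sketch follows, assuming each cDNA position is recorded as the set of bases called at that position, so that a heterogeneous position such as "C or T" can be represented. The function name and data layout are illustrative and are not the authors' procedure; only the first two bases of the genome continuation (G-C) are used.

```python
from typing import Sequence


def matched_length(genome: str, cdna_calls: Sequence[str]) -> int:
    """Count leading positions where the genome base is among the cDNA calls.

    `cdna_calls` holds one string per position; a heterogeneous position is
    written as several letters, e.g. "CT" for "C or T".
    """
    matched = 0
    for base, calls in zip(genome.upper(), cdna_calls):
        if base not in calls.upper():
            break
        matched += 1
    return matched


# Genome read beyond the Hha 54 primer: the eight shared nucleotides followed
# by the first two bases of the genomic continuation given in the text.
GENOME_READ = "TTTCATCTGC"

# Ad2 fiber mRNA-templated read: eight clear positions, then a position where
# C and T are the predominant candidates.
AD2_CDNA_CALLS = ["T", "T", "T", "C", "A", "T", "C", "T", "CT"]

if __name__ == "__main__":
    print(matched_length(GENOME_READ, AD2_CDNA_CALLS))
```

With the reads as transcribed here, agreement stops after eight positions, because the genomic G at the ninth position is not among the bases called in the cDNA.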
[FIG. 4. Sequencing gels obtained with Hha 54 as primer. (a) Ad2+ND5 fiber mRNA as template; (b) Ad2 genome as template; (c) Ad2 fiber mRNA as template. The channels are labeled T, G, C and A, and the splice point is marked.]
(c) Ad2+ND5 fiber mRNA sequence
To try to overcome the difficulty imposed by the heterogeneous fiber mRNA species in Ad2, a similar experiment was performed using, as template, late mRNA derived from the hybrid virus Ad2+ND5 (Lewis et al., 1973). In this virus, certain Ad2 sequences are deleted and replaced by a large segment of the simian virus genome (Kelly & Lewis, 1973). The region of Ad2 deleted is quite large and extends close to or possibly within the region of the Ad2 genome known to encode the fourth component of the fiber mRNA leader. Thus, there was some possibility that the fiber mRNA produced by Ad2+ND5 might be missing the fourth leader component sequences. When this mRNA was used as template with Hha 54 as the primer, the results shown in Figure 4(a) were obtained, from which a unique sequence could be derived. In particular, the sequence T-T-T-C-A-T-C-T, previously found both in the Ad2 genome (Fig. 4(b)) and with Ad2 fiber mRNA as template (Fig. 4(c)), was present. However, this sequence continued N-G-C-G-A-C, thus differing significantly from the genome sequence but matching the predominant sequence continuation from Ad2 fiber mRNA. The nature of the nucleotide N which lies at the splice point cannot be unambiguously assigned from these data, since bands occur in all four channels at this position. However, the strongest band lies in the T channel and, since this channel is devoid of artifactual bands throughout the rest of the gel, we would favor the sequence continuation T-G-C-G-A-C, which matches the predominant sequence continuation from Ad2 fiber mRNA.
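Because the reads above are of cDNA, they are complementary to the mRNA and run in the opposite direction. Converting them to mRNA sense makes the position of the first AUG relative to the junction explicit. The sketch below assumes the favored assignment of T at position N and treats the eight shared nucleotides as main-body sequence; the function names are illustrative.

```python
# Complement table for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def cdna_to_mrna(cdna: str) -> str:
    """Reverse-complement a cDNA read and write it as RNA (T -> U)."""
    return cdna.upper().translate(COMPLEMENT)[::-1].replace("T", "U")


def aug_offset_from_junction(body_cdna: str, leader_cdna: str) -> int:
    """Nucleotides between the leader/body junction and the first AUG after it.

    The cDNA read runs body-then-leader, so in mRNA sense the leader comes
    first and the junction sits at index len(leader_cdna).
    Raises ValueError if no AUG is found downstream of the junction.
    """
    mrna = cdna_to_mrna(body_cdna + leader_cdna)
    junction = len(leader_cdna)
    return mrna.index("AUG", junction) - junction


BODY_CDNA = "TTTCATCT"  # the eight nucleotides shared with the genome
LEADER_CDNA = "TGCGAC"  # favored continuation into the leader (N taken as T)

if __name__ == "__main__":
    print(cdna_to_mrna(BODY_CDNA + LEADER_CDNA))
    print(aug_offset_from_junction(BODY_CDNA, LEADER_CDNA))
```

In mRNA sense the read becomes GUCGCAAGAUGAAA, with the AUG beginning two nucleotides past the junction as placed here.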
4. Discussion
The results presented in this paper have led to the derivation of the sequence present in the Ad2 genome at the beginning of the main body of fiber mRNA. By using a small primer fragment, Hha 54, it has also been possible to relate this sequence to the sequences present in fiber mRNA from both Ad2 and Ad2+ND5. In the latter case a unique sequence was obtained which matched the genome sequence for the first 10 nucleotides beyond the primer, but then diverged as the chain was extended into the leader sequence. The two sequences are compared in Figure 5 and serve to define the splice point at which the leader sequence becomes joined to the main body of fiber mRNA. Although in Ad2 multiple species of fiber mRNA are present (Chow & Broker, 1978), in Ad2+ND5 our experiments provide evidence for only one species (Fig. 4). The sequence derived is shown in Figure 5 and by comparison with the sequence derived in a similar manner from hexon mRNA (Akusjarvi & Pettersson, 1979), it can be seen that the sequences beyond the splice point are identical (Fig. 5). Hence, this new sequence is from the third leader component; however, our sequences do not extend sufficiently far to show that the first and second leader components also match those of hexon mRNA. It should be noted that in both hexon and fiber mRNAs the third leader sequence appears to be joined to the main body sequences at exactly the same position. This suggests that the end point of the third leader is likely to be the same in the other late mRNAs which contain the tripartite leader and argues that the fate of the initial transcript is not determined by splicing at different points within the third leader segment.
FIG. 5. Sequences derived from the Ad2 genome and Ad2+ND5 fiber mRNA compared with hexon mRNA. The sequences of the Ad2 genome and Ad2+ND5 fiber mRNA are from the data of Fig. 4 except that the continuation of the genome sequence is included. The hexon mRNA sequence, which shows the junction between the third leader sequence and the main body, is taken from Pettersson et al. (1979). These sequences show the strand complementary to the coding strand. Hyphens have been omitted to save space.
When fiber mRNA from Ad2, rather than Ad2+ND5, was used as template for the Hha 54 primer, it was only possible to deduce a unique sequence for the first nine nucleotides beyond the primer. This sequence was identical to that found in the genome and in Ad2+ND5 fiber mRNA. Beyond that point the pattern on the sequence gel resembled that found using the Ad2+ND5 mRNA template but contained additional bands indicating that two (or more) sequences were superimposed upon one another. This is to be expected as it is known that about 75% of the Ad2 fiber mRNA contains the usual tripartite leader, while the remaining 25% contains additional leader components (Chow & Broker, 1978). Within this minority population, one form predominates and contains a fourth leader component. Fortunately, the sequence of this extra component is available (Zain et al., 1979) and allowed the interpretation of the Ad2 fiber mRNA sequence data. The sequences obtained from fiber mRNA of Ad2+ND5 and Ad2 are compared with the fourth component leader sequences in Figure 6.
FIG. 6. (The splice point is marked |.)
Template                Sequence
Ad2+ND5 fiber mRNA      GCTCTTTCCGCAGATTGGTCAGTGTCAGCGT | TCTACTTT
Ad2 fiber mRNA          GCT GN C G T G CC C | TCTACTTT
Cloned fiber cDNA       GGTCTGTAATGAGGGTAAAAAGGTTTTGTCC | TCTACTTT
The sequences from the fiber mRNAs of Ad2 and Ad2+ND5 are from th
Although additional bands are present at a few positions in the sequence gel of Figure 4(c), the major bands closely match those to be expected by superimposition of third and fourth component leader sequences. The most important conclusion to be derived from these studies is that both leader sequences are spliced to exactly the same nucleotide at the start of the main body of fiber mRNA, even though the sequences within the leader segments themselves are quite different. This suggests that the same enzyme is probably responsible for both modes of splicing, although more detailed evaluation must await the elucidation of the appropriate intervening sequences.
Within the main body of fiber mRNA, just two nucleotides separate the splice point from an AUG codon which then has an open reading frame until the end of the genome sequence determined in these studies (Fig. 7). Both other reading frames are blocked by terminator codons. Although it has not been established that this AUG is the initiation codon for the fiber polypeptide, preliminary results indicate that amino acids 1 (methionine), 2 (lysine), 4 (alanine), 6 (proline), 8 (glutamic acid), 11 (phenylalanine), 13 (proline), 15 (tyrosine), 16 (proline), 17 (tyrosine) are present at the predicted positions within the fiber polypeptide (C. W. Anderson & J. B. Lewis, personal communication).
5' UCGAGAAAGGCGUCUAACCAGUCACAGUCGCAAG
AUG AAA CGC GCC AGA CCG UCU GAA GAU ACC UUC AAC CCC GUG UAU CCA UAU
Met Lys Arg Ala Arg Pro Ser Glu Asp Thr Phe Asn Pro Val Tyr Pro Tyr
FIG. 7. Predicted N-terminal sequence of fiber polypeptide. The nucleotide sequence contains the third leader component joined to the main body of fiber mRNA. The sequence shown is the complementary strand of the leader sequence derived from Fig. 4(a) plus its continuation into the main body sequence derived from the Ad2 genome. Out-of-frame terminators are indicated above the nucleotide sequence and the possible start of the fiber polypeptide is indicated below the sequence. Hyphens omitted to save space.
We thank P. Sadowski for a generous gift of T7 exonuclease, J. W. Beard for AMV reverse transcriptase, U. Pettersson and A. J. H. Smith for communicating their results prior to publication, Gail Wong for her excellent technical assistance, and M. Moschitta and R. Yaffe for their help in preparing this manuscript. This work was supported by a grant from the National Cancer Institute (CA13106) and by grants to one of us (B. S. Z.) from the National Science Foundation (PCM77-16480), the National Institute of Health (1 RO1 CA22898-01), and the American Cancer Society (ACS-VC-267).
REFERENCES
Akusjarvi, G. & Pettersson, U. (1979). Cell, in the press.
Aloni, Y., Dhar, R., Laub, O., Horowitz, M. & Khoury, G. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 3686-3690.
Berget, S. M., Moore, C. & Sharp, P. A. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 3171-3175.
Berk, A. J. & Sharp, P. A. (1978). Cell, 14, 695-711.
Breathnach, R., Mandel, J. L. & Chambon, P. (1977). Nature (London), 270, 314-319.
Broker, T. R., Chow, L. T., Dunn, A. R., Gelinas, R. E., Hassell, J. A., Klessig, D. F., Lewis, J. B., Roberts, R. J. & Zain, B. S. (1978). Cold Spring Harbor Symp. Quant. Biol. 42, 531-553.
Celma, M. L., Dhar, R., Pan, J. & Weissman, S. M. (1977). Nucl. Acids Res. 4, 2549-2559.
Chow, L. T. & Broker, T. R. (1978). Cell, 15, 497-510.
Chow, L. T., Gelinas, R. E., Broker, T. R. & Roberts, R. J. (1977a). Cell, 12, 1-8.
Chow, L. T., Roberts, J. M., Lewis, J. B. & Broker, T. R. (1977b). Cell, 11, 819-836.
Dunn, A. R., Mathews, M. B., Chow, L. T., Sambrook, J. & Keller, W. (1978). Cell, 15, 511-526.
Greene, P. J., Betlach, M. C., Boyer, H. W. & Goodman, H. M. (1974). Methods Mol. Biol. 7, 87-111.
Hsu, M. T. & Ford, J. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 4982-4985.
Kelly, T. J. Jr & Lewis, A. M. Jr (1973). J. Virol. 12, 643-652.
Kitchingman, G. R., Lai, S. P. & Westphal, H. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 4392-4395.
Klessig, D. F. (1977). Cell, 12, 9-21.
Knapp, G., Beckmann, J. S., Johnson, P. F., Fuhrman, S. A. & Abelson, J. (1978). Cell, 14, 221-236.
Krzyzek, R. A., Collett, M. S., Lau, A. F., Perdue, M. L., Leis, J. P. & Faras, A. J. (1978). Proc. Nat. Acad. Sci., U.S.A. 75, 1284-1288.
Lewis, A. M. Jr, Levine, A. S., Crumpacker, C. S., Levin, M. J., Samaha, R. J. & Henry, P. H. (1973). J. Virol. 11, 655-664.
Lewis, J. B., Atkins, J. F., Anderson, C. W., Baum, P. R. & Gesteland, R. F. (1975). Proc. Nat. Acad. Sci., U.S.A. 72, 1344-1348.
Mautner, V., Williams, J., Sambrook, J., Sharp, P. A. & Grodzicker, T. (1975). Cell, 5, 93-99.
Mellon, P. & Duesberg, P. H. (1978). Nature (London), 270, 631-634.
Mulder, C., Arrand, J. R., Delius, H., Keller, W., Pettersson, U., Roberts, R. J. & Sharp, P. A. (1974). Cold Spring Harbor Symp. Quant. Biol. 39, 397-400.
O'Farrell, P. Z., Cordell, B., Valenzuela, P., Rutter, W. J. & Goodman, H. M. (1978). Nature (London), 274, 438-445.
Pettersson, U. & Sambrook, J. (1973). J. Mol. Biol. 73, 125-130.
Pettersson, U., Mulder, C., Delius, H. & Sharp, P. A. (1973). Proc. Nat. Acad. Sci., U.S.A. 70, 200-204.
Roberts, R. J., Myers, P. A., Morrison, A. & Murray, K. (1976). J. Mol. Biol. 103, 199-208.
Roberts, R. J., Klessig, D. F., Manley, J. & Zain, B. S. (1979). Proc. 12th FEBS Symp., in the press.
Rothenberg, E., Donoghue, D. J. & Baltimore, D. (1978). Cell, 13, 435-451.
Sanger, F. & Coulson, A. R. (1978). FEBS Letters, 87, 107-110.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 5463-5467.
Sharp, P. A., Sugden, B. & Sambrook, J. (1973). Biochemistry, 12, 3055-3063.
Southern, E. M. (1975). J. Mol. Biol. 98, 503-517.
Tilghman, S. M., Tiemeier, D. C., Polsky, F., Edgell, M. H., Seidman, J. G., Leder, A., Enquist, L. W., Norman, B. & Leder, P. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 4406-4410.
Tonegawa, S., Maxam, A. M., Tizard, R., Bernard, O. & Gilbert, W. (1978). Proc. Nat. Acad. Sci., U.S.A. 75, 1485-1489.
Zain, B. S. & Roberts, R. J. (1978). J. Mol. Biol. 120, 13-31.
Zain, B. S., Sambrook, J., Roberts, R. J., Keller, W., Fried, M. & Dunn, A. R. (1979). Cell, in the press.