J. Mol. Biol. (1975) 99, 419-443
Bacteriophage T7 Early Promoters: Nucleotide Sequences of Two RNA Polymerase Binding Sites
DAVID PRIBNOW†
The Biological Laboratories, Harvard University, Cambridge, Mass. 02138, U.S.A.
(Received 24 April 1975, and in revised form 25 August 1975)
At two independent bacteriophage T7 early promoters, Escherichia coli RNA polymerase protects about 40 base-pairs of DNA from digestion by DNAase I. The DNA fragment from each promoter contains the information required to maintain a stable pre-initiation complex with RNA polymerase, and it codes for about 18 bases of specific T7 early mRNA. The nucleotide sequence of each promoter fragment and the sequence of the mRNA coded by each fragment are presented. Comparison of these and other promoter sequences has led to the formulation of a model for promoter structure and function, which is put forth.
1. Introduction
A promoter (Jacob et al., 1964; Epstein & Beckwith, 1968) is a start signal at the beginning of a gene or a gene cluster that directs the RNA polymerase to initiate RNA synthesis. Since genetic information is contained in a sequence of nucleotides, specific DNA sequences constitute promoters, and the interaction between RNA polymerase and these sequences causes the polymerase to initiate transcription. What is the basic promoter sequence, and how does the polymerase interact with it? I have been studying the DNA fragments from two promoters in the "early region" of bacteriophage T7 DNA that were isolated by DNAase treatment of RNA polymerase-DNA complexes.
The Escherichia coli RNA polymerase (holoenzyme) can form stable, binary complexes with DNA at promoter sites (Stead & Jones, 1967; Hinkle & Chamberlin, 1970,1972; Chamberlin & Ring, 1972; Bordier & Dubochet, 1974; Mangel & Chamberlin, 1974a). At each promoter, the DNA base-pairs that are engaged by the polymerase are physically "protected" from digestion by DNAase I; thus, DNAase treatment produces a complex containing an RNA polymerase molecule bound to a double-stranded promoter fragment (Heyden et al., 1972; Sugimoto et al., 1975; Walz & Pirrotta, 1975; Schaller et al., 1975; Pribnow, 1975). This DNA fragment, which can be readily separated from the polymerase, is about 40 base-pairs long; it contains the initiation point for mRNA synthesis from the promoter, and it codes for about 18 bases of RNA (Sugimoto et al., 1975; Walz & Pirrotta, 1975; Schaller et al., 1975; Pribnow, 1975). This paper reports the nucleotide sequences of the protected fragments from the T7 A2 and A3 promoter sites.
Bacteriophage T7 has a linear, non-permuted, double-stranded genome with a molecular weight of about 25 × 10^6 (Dubin et al., 1970). At the far left end of the DNA molecule, preceding the first of the "early" genes, are three strong promoters (A1, A2, and A3). In vivo and in vitro (Dunn & Studier, 1973a,b; Minkley & Pribnow, 1973), these promoters are the primary start sites for transcription of the "early region" (first 20%) of T7 DNA. In vitro, RNA polymerase normally binds tightly to all three of the strong T7 early promoters (Hinkle & Chamberlin, 1972; Chamberlin & Ring, 1972; Bordier & Dubochet, 1974; Mangel & Chamberlin, 1974a), and DNAase treatment of the polymerase-DNA complexes would consequently yield a mixture of protected promoter fragments. I wanted to isolate a DNA fragment separately from each T7 promoter; therefore, a way had to be found to direct RNA polymerase binding at a single, specified promoter site.
† Present address: Department of Molecular, Cellular and Developmental Biology, University of Colorado, Boulder, Colo. 80302, U.S.A.
Stable binary complexes between RNA polymerase and promoter DNA form readily at 37°C in the presence of low concentrations of salt (less than 0.1 M) (Hinkle & Chamberlin, 1972; Mangel & Chamberlin, 1974b). If the salt concentration is subsequently raised to 0.2 M or greater, these complexes are destabilized (Hinkle & Chamberlin, 1972; Mangel & Chamberlin, 1974b; Jones & Berg, 1966; Richardson, 1966), while ternary complexes in which the polymerase has initiated RNA synthesis are not perturbed by 0.2 M salt (Stead & Jones, 1967; Jones & Berg, 1966; Richardson, 1966; Sentenac et al., 1968). Thus, by forcing initiation to take place at a specified promoter, one can effectively "lock" RNA polymerase onto a DNA template at one chosen site.
A limited initiation event can be directed at a particular promoter with a dinucleotide (diribonucleoside monophosphate) (Minkley & Pribnow, 1973; Pribnow, 1975; Darlix & Dausse, 1975). In the presence of very low concentrations of ribonucleoside triphosphates (5 µM or lower), RNA polymerase will use a dinucleotide as the initiating substrate instead of ATP or GTP (Downey & So, 1970; Downey et al., 1971; Hoffman & Niyogi, 1973). (Essentially no synthesis takes place in the absence of the dinucleotide.) The dinucleotide acts as a sequence-specific primer, with different dinucleotides being used at different sets of promoters (Minkley & Pribnow, 1973; Darlix & Dausse, 1975). Although the polymerase cannot form an initiated complex using a dinucleotide alone, a dinucleotide and one or two triphosphates are sufficient to promote the formation of an initiated polymerase-DNA complex (Pribnow, 1975; Darlix & Dausse, 1975).
I have isolated polymerase-protected DNA fragments from the T7 A2 and A3 promoters by using dinucleotide-mediated initiation to specify the protected promoter, followed by DNAase treatment in the presence of high salt. By isolating RNA complementary to these DNA fragments, I have been able to sequence the fragments. A comparison of all of the known promoter sequences leads to a model for RNA polymerase-promoter interaction involving two specific, spatially related DNA sequences. That model is put forth at the end of the paper.
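The dinucleotide-priming scheme described above can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not a description of the enzyme: the function name `limited_initiation` is hypothetical, the primer is assumed to match the first two bases of the transcript exactly (the paper itself notes that a mismatched dinucleotide can prime at A2), and the two 5' sequences are those reported in Results and Table 1, written without hyphens.

```python
def limited_initiation(rna_5prime, dinucleotide, ntps):
    """Return the oligonucleotide formed when only the bases in `ntps` are supplied.

    Toy model (assumption): the dinucleotide primes only if it equals the
    first two bases of the transcript, and the chain is extended until a
    base is required whose triphosphate is absent.
    """
    if rna_5prime[:2] != dinucleotide:
        return ""  # no priming at this start site in this simplified model
    product = dinucleotide
    for base in rna_5prime[2:]:
        if base not in ntps:
            break  # polymerase stops: the next triphosphate is missing
        product += base
    return product

# 5' sequences of the primed transcripts as reported in Results (Table 1)
A2_RNA = "UCGCUAGGUAACACUAGCAGU"
A3_RNA = "CACAUGAAACGACAGUGAGU"

print(limited_initiation(A3_RNA, "CA", {"C"}))       # CAC
print(limited_initiation(A2_RNA, "UC", {"G"}))       # UCG
print(limited_initiation(A2_RNA, "UC", {"G", "C"}))  # UCGC
```

In this sketch CpA plus CTP gives CpApC at A3, and UpC plus GTP gives UpCpG at A2, which is the limited initiation used to "lock" the polymerase at one chosen promoter.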
2. Materials and Methods
(a) Reagents
Unlabeled ribonucleoside triphosphates (P-L Biochemicals) were suspended in 10 to 20 mM-Tris·HCl (pH 7.0) and their concentrations determined by absorbance. α-32P-labeled triphosphates (specific activities ranged from 90 to 140 Ci/mmol) and H₃³²PO₄ were purchased from New England Nuclear. Rifampicin and urea were purchased from Schwarz-Mann, Pentex brand bovine serum albumin was purchased from Miles, sodium dodecyl sulfate was Pierce "sequenal grade," and gel chemicals (acrylamide, bis-acrylamide, TEMED) were purchased from Bio-Rad. Phenol was redistilled before use. All other chemicals were standard laboratory reagent grade.
PHAGE T7 PROMOTER SEQUENCES
(b) Enzymes
E. coli RNA polymerase (holoenzyme) was purified from 3/4-log phase E. coli K12 (Grain Processing Corp.) according to the procedure of Berg et al. (1971) and was a generous gift of R. Simpson. E. coli core RNA polymerase (with less than 1% sigma content) was kindly provided by M. Chamberlin. DNAase I (RNAase-free) and RNAase A were purchased from Worthington. RNAase T1 and RNAase U2 (Sankyo) were purchased from Calbiochem. The polynucleotide kinase was generously provided by T. Maniatis.
(c) Test tubes
All reactions, precipitations, etc., were done in 10 mm × 75 mm disposable glass test tubes which had been coated on the inside with 5% dimethyldichlorosilane (in CCl4), washed several times with distilled water, and heated in a closed container overnight at 250°F.
(d) Phage T7 DNA
(i) Unlabeled DNA
Large amounts of T7 DNA were prepared from purified phage by SDS† extraction, as described previously (Minkley & Pribnow, 1973). The DNA was stored in DNA buffer (0.01 M-Tris·HCl (pH 7.9), 1.0 mM-EDTA, 0.1 M-NaCl).
(ii) 32P-labeled T7 DNA
E. coli B/r was grown at 37°C to an O.D.550 of 1.0 in 100 ml of low-phosphate/glucose medium (Minkley, 1974) containing untreated Casamino acids (see below). The cells were then pelleted by spinning them down for 20 min at 7000 rpm in a Sorvall SS34 rotor, and the supernatant was discarded. The cell pellet was resuspended in 100 ml of dephosphorylated medium (low-phosphate/glucose medium containing dephosphorylated Casamino acids (Minkley, 1974); phosphate concentration approx. 5 × 10^-5 M) and was placed in a shaking water bath at 37°C. After shaking for 5 min, 50 mCi of [32P]orthophosphate were added to the culture, and 1 min later wild-type T7 phage were added at a multiplicity of 5. Complete lysis occurred after about 1 h. From this point on, isolation of the DNA followed the procedure for isolation of unlabeled T7 DNA. Concentrations of stock solutions of T7 DNA were determined by absorbance at 260 nm. The specific activity of the [32P]DNA used in these experiments was 0.1 to 0.3 Ci/mmol of DNA phosphate.
(e) Isolation of 32P-labeled promoter fragments
(i) A3 initiated complexes
E. coli RNA polymerase (14 µg/ml) and 32P-labeled T7 DNA (50 µg/ml) (molar ratio approx. 15:1) were incubated together with 100 µM-CpA and 10 or 20 µM-CTP for 10 min at 37°C in 0.1 ml of binding buffer (10 mM-Tris·HCl (pH 7.9), 10 mM-magnesium acetate, 0.1 mM-EDTA, 0.1 mM-dithiothreitol, 5% glycerol, 200 µg bovine serum albumin/ml) containing 0.05 M-KCl, allowing initiation to take place at the specified promoter. 3.5 M-KCl was then added to a final concentration of 0.2 M, and 1 min later DNAase I (200 µg/ml) was added, and incubation was continued for 10 or 20 min at 37°C. The digestion was terminated by adding EDTA to a concentration of 20 mM. This mixture was immediately loaded onto a 2-ml Sephadex G100 column and was eluted in binding buffer (without bovine serum albumin and KCl) containing 20 mM-EDTA. While small [32P]DNA oligonucleotides were retarded in the column, the intact polymerase-[32P]DNA complexes were excluded from the column and were collected in the void volume (approx. 0.3 ml). The sample was then extracted once with 2/3 volume of phenol to remove the polymerase (and DNAase), and it was then extracted 3 times with 3 vol. of ether to remove residual phenol. At this point, the sample contained virtually nothing but the desired [32P]DNA fragments, and it was prepared for polyacrylamide gel electrophoresis (see section (f)). In control reactions, either RNA polymerase, the dinucleotide, or the triphosphate was withheld so as to preclude binding or initiation.
† Abbreviation used: SDS, sodium dodecyl sulfate.
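The quoted molar ratio and the salt adjustment in section (e)(i) can be checked with a few lines of arithmetic. The sketch below is illustrative only: the holoenzyme molecular weight (about 4.8 × 10^5) is an assumed round value that this section does not state, the T7 DNA molecular weight is the 25 × 10^6 given in the Introduction, and the helper names are hypothetical.

```python
# Assumed molecular weights; the holoenzyme value is NOT given in this section.
HOLOENZYME_MW = 4.8e5  # g/mol, approximate E. coli RNA polymerase holoenzyme (assumption)
T7_DNA_MW = 25e6       # g/mol, value quoted in the Introduction


def molar_conc(ug_per_ml, mw):
    """Convert a concentration in µg/ml to mol/l."""
    return ug_per_ml * 1e-6 / mw * 1e3


def stock_volume_ml(v0_ml, c0, c_stock, c_final):
    """Volume of stock (concentration c_stock) to add to v0_ml at c0 to reach c_final.

    Derived from the mass balance c0*v0 + c_stock*v = c_final*(v0 + v).
    """
    return v0_ml * (c_final - c0) / (c_stock - c_final)


ratio = molar_conc(14, HOLOENZYME_MW) / molar_conc(50, T7_DNA_MW)
kcl_ul = stock_volume_ml(0.1, 0.05, 3.5, 0.2) * 1000

print(f"polymerase:DNA molar ratio ~ {ratio:.1f}:1")  # close to the stated 15:1
print(f"3.5 M KCl to add to 0.1 ml ~ {kcl_ul:.1f} µl")
```

Under these assumptions the ratio comes out near 15:1, and raising 0.1 ml from 0.05 M to 0.2 M KCl takes roughly 4.5 µl of the 3.5 M stock, so the reaction volume changes very little.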
(ii) Non-initiated complexes
32P-labeled promoter fragments were obtained from non-initiated complexes by the procedures described above, except that no dinucleotides or triphosphates were present in the binding reaction, and the KCl concentration was maintained at 0.05 M throughout the isolation procedure.
(f) Non-denaturing DNA gels
(i) Preparation of samples
50 µg of E. coli B tRNA and MgCl2 (to a concentration of 20 mM) were added to each solution containing the [32P]DNA fragments. Then 2 vol. of ice-cold 95% ethanol were added, and each sample was frozen in a solid CO2/acetone bath (5 to 10 min) to precipitate the DNA. After being thawed on ice, each sample was spun for 15 min at 7000 rpm in a Sorvall SS34 rotor (4°C), the supernatant was discarded, and the pellet was dried either under N2 or in a vacuum desiccator. The [32P]DNA was then resuspended in 20 to 30 µl of 1/10 Tris/Mg buffer (90 mM-Tris-borate (pH 8.3), 5 mM-MgCl2) containing 15% glycerol and bromphenol blue and xylene cyanol as marker dyes. Each sample was then layered on a polyacrylamide gel for electrophoresis.
(ii) The gels
Samples were electrophoresed through 15 cm, 10% polyacrylamide-Tris/Mg buffer slab gels (Maniatis & Ptashne, 1973), with Tris/Mg running buffer. Double-stranded DNA does not denature in these gels. Samples were electrophoresed until the bromphenol blue dye reached the bottom of the gel; then the top glass plate was removed from the gel, and the wet gel was covered with Saran wrap and was exposed to film (Kodak screen). RF values for bands of [32P]DNA were computed relative to the position of the bromphenol blue dye. The sizes of the double-stranded DNA fragments were roughly estimated from a gel calibration curve for DNAs of known length provided by T. Maniatis.
(g) Selective RNA synthesis: stutter (Maizels, 1973) products
RNA polymerase (20 µg/ml) and T7 DNA (50 µg/ml) were preincubated together with one dinucleotide (100 µM) and one ribonucleoside triphosphate (2.0 µM) (UpC and GTP for A2; CpA and CTP for A3) for 5 min at 37°C in 0.1 ml of transcription buffer (20 mM-Tris·HCl (pH 7.9), 0.1 mM-dithiothreitol, 4 mM-magnesium acetate, 0.15 M-KCl), allowing initiation to take place at the specified promoter. Rifampicin was then added to 20 µg/ml to disable non-initiated polymerase molecules (Sippel & Hartmann, 1968; Chamberlin & Ring, 1972), and 2 min later the remaining triphosphates were added to a final concentration of 2.0 or 2.5 µM. One or more of the triphosphates were α-32P-labeled. After 10 min at 25°C (or 5 min at 37°C), SDS (to 0.5%) was added to terminate synthesis. The labeled RNA was then prepared for gel electrophoresis.
(h) "Run off" synthesis: stutter products
Initiated complexes were prepared according to the protocol outlined in section (e), (i), with the following exceptions. The dinucleotide UpC and 2 triphosphates, GTP and CTP, were employed for initiation at the A2 promoter. Unlabeled DNA was used in the reactions, and the 1 or 2 triphosphates required for specific initiation were present at 2.0 µM. After 10 min of digestion by DNAase I, the remaining triphosphates (final concn, 2.0 µM) were added to the reactions in 50 µl of transcription buffer (37°C), allowing polymerase molecules to "run off" the bound DNA fragments, synthesizing [32P]RNA. 5 min after the addition of the triphosphates, the reactions were treated with SDS (0.5%), and the RNA was prepared for gel electrophoresis. Addition of heparin (Zillig et al., 1970) to a reaction several minutes before addition of the triphosphates had very little effect on the yield of RNA products.
(i) The RNA gels
(i) Preparation of samples
E. coli B tRNA (to 200 µg/ml) and MgCl2 (to 10 mM) or an equal vol. of 4 M-NH4Ac were added to RNA-containing samples. Then, 2 vol. of ice-cold ethanol were added, and the samples were precipitated and pelleted as with the DNA samples. The pellets were resuspended in washing buffer (20 mM-Tris·HCl (pH 7.9), 10 mM-MgCl2, 0.5 M-NaCl, 0.5% SDS) and were reprecipitated and pelleted to help remove unincorporated label. The pellets were dried under N2 and were resuspended in 20 to 30 µl of 1/10 Tris/EDTA buffer (90 mM-Tris-borate (pH 8.3), 2.5 mM-EDTA) containing 4 to 5 M-urea, 15% glycerol, and bromphenol blue and xylene cyanol as marker dyes. Gel samples were placed in a boiling water bath for 2 min and then were chilled on ice before being layered onto a gel.
(ii) The gels
Depending on the expected lengths of the RNAs being analyzed, samples were run through 40-cm Tris/EDTA buffer slab gels (Gilbert & Maxam, 1973) varying from 12% to 17% polyacrylamide. All RNA gels contained 7 M-urea to help disrupt RNA secondary structure, but secondary structure was not eliminated completely. Samples were electrophoresed until the bromphenol blue dye was within 5 cm of the bottom of the gel; then, as with a DNA gel, a wet RNA gel was exposed to film, and RF values were computed for labeled RNA species that gave bands on the developed film. RNA lengths were estimated from a gel calibration curve provided by T. Maniatis (12% gels) or relative to RF values for RNA species whose nucleotide sequences were determined. Potentially interesting RNA products were cut out of the gels, eluted, and sequenced (see section (k)).
(j) Isolation of RNA complementary to the promoter fragments
(i) Protected DNA fragments
Unlabeled promoter fragments were isolated from dinucleotide-initiated complexes according to the protocol outlined in section (e), (i), using twice the concentrations of T7 DNA, RNA polymerase, and DNAase I. After the DNAase digestion was stopped with EDTA, each reaction was extracted once with 2/3 vol. of phenol and 3 times with 3 vol. of ether, and the DNA fragments were dialyzed overnight against 1 change of 500 ml of 2 × SSC (SSC is 0.15 M-NaCl, 0.015 M-sodium citrate). The fragments were typically prepared in 0.2 to 0.6 ml batches, depending on the number of hybridizations in which they were to be used (see below).
(ii) "Read through" RNA
"Read through" RNA was usually labeled with a single [α-32P]triphosphate. RNA polymerase (14 µg/ml) and T7 DNA (50 µg/ml) were mixed together with the dinucleotides CpA and CpC (100 µM), an [α-32P]triphosphate at 2.5 µM, and the remaining 3 triphosphates (unlabeled) at 5 or 6 µM, in 0.15 to 0.6 ml of transcription buffer at 0°C. The reaction mixture was placed at 37°C, and 3 min later rifampicin (to 20 µg/ml) was added to block further initiation. After another 2 or 3 min, SDS was added to 0.5%, and incubation was continued for 2 min to stop the reaction. The reaction mixture was then extracted once with 2/3 vol. of phenol and 3 times with ether, after which it was transferred to dialysis tubing and was dialyzed overnight against 1 change of 500 ml of 2 × SSC. The bulk of the unincorporated label was removed from each sample during dialysis.
(iii) DNA-RNA hybridization
After dialysis of the DNA fragments and the labeled "read through" RNA, the RNA solution was transferred to a silicated tube, and MgCl2 (to 10 mM) and DNAase I (to 25 µg/ml) were added. The mixture was incubated for 20 min at 37°C to eliminate endogenous template DNA, the digestion was then stopped with EDTA (40 mM), and the solution was extracted with phenol and ether. Next, portions of DNA fragments and "read through" RNA were mixed (typically 0.15 ml of DNA solution and 0.15 ml to 0.3 ml of RNA solution were mixed). (The estimated molar ratio of DNA to RNA molecules varied from 1:1 to 1:4.) A given mixture was placed in a boiling water bath for 2.5 min to denature the DNA fragments and was immediately transferred to a 60°C bath where incubation was continued for 1 h. Each hybridization mixture was then allowed to cool to room temperature, after which 8 µg of E. coli B tRNA and 1.6 µg of RNAase T1 were added to it, and the mixture was kept at room temperature for 1 h. 60 µg of tRNA and SDS (to 0.5%) were then added, and the sample was extracted with 0.5 vol. of phenol, after which MgCl2 was added to 10 mM, and the DNA-RNA hybrids were precipitated with 2 vol. of cold 95% ethanol. The precipitates were then prepared for electrophoresis on a 12% polyacrylamide-7 M-urea RNA gel. It was anticipated that the only large (40 to 50 nucleotides) RNAs remaining after RNAase T1 treatment would be the RNAase-resistant segments of the "read through" RNAs which were directly hybridized to promoter DNA fragments. RNAase T1 was used in the digestions primarily because it was found to be easy to eliminate by phenol extraction.
(k) RNA sequencing
(i) Elution of RNA from the gel
Radioautographs of RNA gels showed the positions of [32P]RNA bands. Each RNA to be sequenced was cut out of the gel, and the gel slice was placed in a silicated scintillation insert. 0.5 ml of elution buffer (0.5 M-ammonium acetate, 0.1 mM-EDTA, 10 mM-magnesium acetate, 0.1% SDS, 20 µg tRNA/ml) was added to each slice, the slice was crushed with a glass rod, and the insert was covered and kept at 37°C overnight. The crushed gel and the eluate were then separated by passing the contents of each insert through a glass wool filter. 40 to 50 µg of tRNA were added to each eluted RNA solution, and the [32P]RNA was precipitated (ethanol, solid CO2/acetone) and pelleted. The supernatants were removed, and each sample was dried down in a vacuum desiccator, resuspended in 40 µl of double-distilled water, spotted on Parafilm, and dried down again.
(ii) Sequencing procedures
Fingerprinting and subsequent steps of the RNA sequence analysis were done according to the procedures outlined in detail by Brownlee (1972).
First, an RNA product was digested with RNAase T1 or with RNAase A (usually with both, in separate reactions), and the products were fingerprinted. These products were then redigested with RNAase A, RNAase T1, or RNAase U2. Second digestion products were each hydrolyzed to 3'-mononucleotides in alkali, yielding information about oligonucleotide composition and nearest neighbors. Relative molar yields were determined for the various digestion products by counting the radioactivity in each of the products.
(iii) Phosphate transfer reaction
The RNA sample was eluted from the gel and precipitated; it was then resuspended in 0.1 ml of kinase buffer (40 mM-Tris·HCl (pH 8.5), 8 mM-MgCl2, 8 mM-dithiothreitol) to which 10 units of polynucleotide kinase (Richardson, 1971) and ATP to a concentration of 50 µM were added. The mixture was incubated for 1 h at 37°C, and then 40 µg of tRNA, MgCl2 (to 20 mM), and 0.1 ml of 2 M-ammonium acetate were added to it. The RNA was then precipitated with 2 vol. of ethanol, and it was digested subsequently with RNAase T1 and fingerprinted.
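The first digestion step of the sequencing procedure in section (k)(ii) can be mimicked in Python as a rough sketch. It assumes complete digestion and uses the standard specificities (RNAase T1 cuts 3' to G; RNAase A cuts 3' to C and U); terminal phosphates, partial products and RNAase U2 are ignored. The function names are hypothetical, and the test sequence is the A2 5' sequence from Results, written without hyphens.

```python
def digest(rna, cut_after):
    """Split `rna` on the 3' side of every base listed in `cut_after`."""
    pieces, current = [], ""
    for base in rna:
        current += base
        if base in cut_after:
            pieces.append(current)
            current = ""
    if current:
        pieces.append(current)  # 3'-terminal fragment with no cleavage site
    return pieces


def rnase_t1(rna):
    """Complete RNAase T1 digest (cleaves after G)."""
    return digest(rna, "G")


def rnase_a(rna):
    """Complete RNAase A digest (cleaves after pyrimidines C and U)."""
    return digest(rna, "CU")


A2_RNA = "UCGCUAGGUAACACUAGCAGU"  # UpC-primed A2 RNA, from Results

print(rnase_t1(A2_RNA))
# ['UCG', 'CUAG', 'G', 'UAACACUAG', 'CAG', 'U']
print([p for p in rnase_a(A2_RNA) if "G" in p])
# ['GC', 'AGGU', 'AGC', 'AGU']
```

The T1 products match the oligonucleotides listed for the longest A2 species in Table 1, and the G-containing RNAase A products GC, AGGU and AGC are the ones used there to overlap the T1 products (AGU is the 3' end of this sequence).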
3. Results
(a) Isolation of protected DNA fragments
Given the dinucleotide CpA and the triphosphate CTP, RNA polymerase "initiates" RNA synthesis on T7 DNA only at the A3 promoter, forming the oligonucleotide CpApC-OH (Minkley & Pribnow, 1973; Pribnow, 1975). With UpC and GTP,† I find that the polymerase initiates exclusively at the A2 promoter, forming UpCpG-OH. In either case, when the reaction is treated with rifampicin to block further initiation and then the remaining triphosphates are added, the only RNA which is synthesized is RNA derived from the specified promoter. Analysis by size of the end-labeled RNA products on low-per cent polyacrylamide-agarose gels assigns the promoters which initiate each RNA species, as shown by Minkley & Pribnow (1973). By forming the dinucleotide-directed, initiated complex at the A2 promoter or at the A3 promoter, raising the salt concentration in the reaction to disengage non-initiated polymerase molecules from other binding sites, and then treating the DNA with DNAase I, I have been able to isolate specific A2 or A3 promoter complexes.
† The polymerase also "initiates" at the A2 promoter with the dinucleotide CpC and GTP. The sequence of the CpC-primed transcript is identical to the sequence of the UpC-primed transcript, except that CpC is substituted for UpC at the 5'-end of the RNA (my unpublished results). The incorporation of CpC, which is at least as efficient as the incorporation of UpC, represents a dinucleotide "mismatch" with the complementary strand of the template, showing that one can be misled when trying to predict initiation site sequences from dinucleotide priming experiments.
Plate I shows a radioautograph indicating the positions of promoter fragments isolated from whole [32P]T7 DNA and electrophoresed through a 10% polyacrylamide gel. Sample A contained protected fragments from the A3 promoter. As detailed in Materials and Methods, the polymerase was allowed to initiate at the A3 site with CpA and CTP in 0.05 M-KCl, and the complexes were treated with DNAase I in 0.2 M-KCl. Band 1 represents the double-stranded A3 promoter fragment, which was about 40 base-pairs long. Band 2, which occurred with variable intensity relative to band 1, represents promoter DNA which was denatured, perhaps during phenol extraction or during preparation of the sample for gel electrophoresis. (When a sample was heated to 100°C and then quickly chilled on ice before being electrophoresed, all of the labeled DNA migrated in the position of band 2.) Sample B was a control, showing that relatively little protected fragment was obtained when initiation was prevented by withholding CTP from the initiation reaction.
Typically, the A3 promoter DNA obtained in a protection experiment was at least 80% pure, since no more than 20% as much [32P]DNA was recovered in the accompanying control sample; and the yield of radioactivity in isolated 32P-labeled promoter DNA showed that the A3 fragment was recovered with 10 to 20% efficiency. No protected DNA was recovered when polymerase was absent from a binding or initiation reaction (sample C). When promoter fragments were prepared from non-initiated complexes, where RNA polymerase was bound tightly at all T7 promoters (the KCl concentration was kept at 0.05 M during the isolation), the labeled DNA banded in the gel (sample D) in the same position as the A3 promoter DNA obtained from initiated complexes. This result indicated that RNA polymerase protects the same sized piece of DNA whether or not it has "initiated" synthesis, presumably at any T7 promoter.
I did not prepare A2 promoter fragments from labeled T7 DNA, since experiments designed to characterize and sequence the promoter fragments required unlabeled DNA. The A2 fragments were purified "blind" from complexes initiated with UpC, GTP, and CTP† on unlabeled DNA, following the procedures worked out for the isolation of the A3 fragments. (Results presented later attest to the purity of the A2 and A3 fragments obtained from unlabeled DNA.)
† Whereas RNA polymerase "initiates" at the A2 promoter with UpC and one triphosphate, GTP, I have found that this UpCpG-initiated complex is not as stable in 0.2 M-KCl as the complex formed with UpC, GTP, and CTP (where the oligonucleotide UpCpGpC is synthesized). The reason for the difference in stability is not known.
(b) Initial sequences of the A2 and A3 RNAs
Since promoter fragments from the A2 and A3 sites were each isolated from complexes in which a limited initiation event had taken place, it was likely that each fragment would contain the point of initiation of mRNA synthesis (or at least dinucleotide-primed RNA synthesis) for that promoter. In order to ask whether the initiation point actually was included in each promoter fragment, it was first necessary to determine the initial sequence of the RNA molecule transcribed from each promoter.
Specific dinucleotide-mediated initiation was the key to getting sequenceable RNA from the A2 and A3 promoters. In one reaction, RNA polymerase was initiated at the A2 site with UpC and GTP, and in another, the polymerase was initiated at the A3 site with CpA and CTP; then each reaction was treated with rifampicin, and very low concentrations of the remaining three ribonucleoside triphosphates were added. One or more of the triphosphates were α-32P-labeled. The ensuing synthesis was stopped with SDS after 10 min at 25°C (or 5 min at 37°C). Since in the presence of very low concentrations of triphosphates RNA polymerase pauses or "stutters" at certain points along a DNA template as it synthesizes RNA (Maizels, 1973), the product of each synthesis was a mixture of discrete, overlapping RNA species, all having a common 5'-end. When electrophoresed through a polyacrylamide gel, the RNA products from each promoter migrated in a series of well-defined bands, the RNAs ranging in size from about 5 to 80 nucleotides.
The first several A2 and A3 RNA products (up to an estimated length of 15 to 20 nucleotides) were eluted from the gel. Each A2 RNA was digested separately with RNAase T1 and with RNAase A, and the products were fingerprinted; each A3 RNA was digested with RNAase T1 only, and the products were fingerprinted. Additional RNAase T1 or RNAase A oligonucleotides appeared on the fingerprints as progressively longer RNA products were analyzed, thereby establishing the oligonucleotide order extending from the 5'-end of the RNA synthesized from each promoter.
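The ordering argument above, in which each new oligonucleotide is placed according to the first, shortest product whose fingerprint shows it, can be sketched as follows. This is a toy illustration rather than the procedure actually used: the function name and the set representation are assumptions, only the fully identified A3 RNAase T1 oligonucleotides from Table 1 are entered (hyphens omitted), and an oligonucleotide that occurs twice in a transcript would not be detected by this simple set difference.

```python
def order_by_first_appearance(fingerprints):
    """Order oligonucleotides by the first (shortest) product containing them.

    `fingerprints` lists, from shortest to longest RNA product, the set of
    fully identified RNAase T1 oligonucleotides found on each fingerprint.
    """
    order = []
    for oligos in fingerprints:
        new = sorted(set(oligos) - set(order))
        if len(new) > 1:
            # Several oligonucleotides appeared in one step: their relative
            # order needs extra evidence, such as RNAase A overlaps.
            raise ValueError(f"ambiguous order for {new}")
        order.extend(new)
    return order


# Fully identified T1 oligonucleotides for A3 gel bands 1, 4, 5, 6 and 7 (Table 1)
a3_bands = [
    {"CACAUG"},
    {"CACAUG", "AAACG"},
    {"CACAUG", "AAACG", "ACAG"},
    {"CACAUG", "AAACG", "ACAG", "UG"},
    {"CACAUG", "AAACG", "ACAG", "UG", "AG"},
]

a3_order = order_by_first_appearance(a3_bands)
print("".join(a3_order))  # CACAUGAAACGACAGUGAG
```

For the A2 products, two new oligonucleotides (C-U-A-G and G) appear between bands 1 and 2, so this sketch raises an error there; that is consistent with the use of overlaps with the G-containing RNAase A oligonucleotides to order the A2 products.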
Once the sequence of every RNAase T 1 and RNAase A oligonucleotide was determined by subsequent RNAase A (and sometimes RNAase U2) or RNAase T1 digestion and then alkaline hydrolysis, the sequences of the longest RNAs eluted from the gel and fingerprinted could be deduced. Table 1 shows the RNAase T 1 and RNAase A oligonucleotides contained in each of three eluted A2 RNA species, and it indicates the order of the RNAase T1 oligonucIeotides determined by their order of appearance on the fingerprints and by sequence overlaps with the G-containlng RNAase A oligonucleotides. Table 1 shows also the RNAase T~ oligonucleotides obtained from digestions of seven A3 RNA species. The order of the A3 T1 oligonucleotides was determined from their order of appearance on the fingerprints without reference to the sequences of the RNAase A oligonucleotides. The initial sequence of the UpC-primed RNA from the _42 promoter was determined to be 5' UGGCUAGGUAACACUAGCAGU-OH 3'~, and, as reported previously (Pribnow, 1975), the CpA-primed A3 RNA sequence was found to be 5' GACAUGAAACGACAGUGAGU-OH 3'. Do the A2 and A3 promoter fragments contain the initiation points for these RNA molecules? (c) "Run off" BNA sequences The gel profile in Plate II shows the RNA products synthesized by RNA polymerase molecules that were allowed to "run off" the ends of the DNA fragments to which they were bound. Sample A shows a series of A3 "run off" products obtained in the following way (see Materials and Methods for details): RNA polymerase was t Hyphens have been omitted.
A
B
C
D -O
-XC
2
PLATE I. Autoradiograph of a 10% polyacrylamide gel showing the positions of RNA polymerase-protected [32P]DNA fragments. Sample A, A3 promoter fragments obtained from complexes initiated with CpA and CTP. Sample B, control for A, in which CTP was withheld from the initiation reaction. Sample C, control that lacked polymerase during DNAase treatment. Sample D, promoter fragments obtained from non-initiated complexes. Bands 1 and 2 represent native and denatured fragments, respectively, xc, xylene cyanol dye marker; o, origin.
[facing p. 426
,
~
T?
CAc
__
~t.
,
W
L,LJ
CAc -
dl" ~
,~)
=
@2
03
PLATE II. The profile of the 15% polyaerylamide gel shows "run off" RNAs from the A3 (sample A) and A2 (sample B) promoter &agments. The RNA was labeled with [a-32P]GTP. On the left is an RNAase T1 fingerprint of the A3 band 1 RNA, showing the following oligonucleotides : (1) C-A-C-A-U-G(A)-which contained the initiating dinucleotide CpA, (2) U-G(A); (3) A-A-A-C-G(A); (4) A-C-A-G(U); (5) A-G(U). On the right is an RNAaso T1 fingerprint of A2 band 1 RNA, showing the following oligonueleotides: (1) U-A-A-C-A-C-U:A-G(C); ('2) C-U-A-G(G); (3) U-C-G(C)--whieh contained the initiating dinuelef)tide UpC. xe, xylene cyanol; b, bromphenol blue; d, fingerprint dye mariners; o, origin; CAc, cellulose acetate.
,
~.~::.:.~
i_ 3
~
ILl
"~'O
, i-1
,20
XC"
B
T
@4 dC 13
I-5
A
A
AB
A B ~ •
~
B O
-O
0
XCi
elB - 2
In
m
2-
-XC
-XC
2345--
-1
-2
b_
-b (o)
(b)
(c)
PLATE III. Autoradiographs of 12~o polyacrylamide gels showing the positions of the RNAs t h a t were complementary to the p r o m o t e r D N A fragments. (a) Sample B : I~NAs (bands 1 a n d 2) t h a t were c o m p l e m e n t a r y to the A2 p r o m o t e r D N A obtained from initiated complexes. Sample A was a control in which no D N A fragments were present during the hybridization. (b) Sample A, RNAs (bands 1 a n d 2) t h a t were c o m p l e m e n t a r y to t h e A3 p r o m o t e r D N A o b t a i n e d from initiated complexes. Sample B was a control in which no D N A fragments were present during the hybridization. (c) Sample A, t~NAs t h a t were complementary to the A3 (bands I a n d 2), A2 (bands 3 a n d 4), a n d A1 (band 5) promoter fragments obtained from non-initiated complexes. Sample B, R N A s t h a t were c o m p l e m e n t a r y to the A2 p r o m o t e r fragments obtained from initiated c o m p l e x e s - presented for comparison w i t h sample A. B a n d s of R N A t h a t migrated more slowly t h a n t h e h y b r i d R N A s were partial RNAase T1 products. The control p a t t e r n s were slightly different, depending on which [~-32P]triphosphate was used to label t h e RNA. xc, xylene eyanol; b, bromphenol blue.
PLATE IV. RNAase T1 fingerprint of the 41-nucleotide A2 hybrid RNA. The RNA was labeled with [α-32P]ATP and -GTP. The oligonucleotides are: (1) A-U-A-C-A-A-A-U-C-G(C); (2) U-A-A-C-A-C-U-A-G(C); (3) U-A-A-C-A-U-G(C); (4) U-A-A-G(A); (5) C-U-A-G(G); (6) C-A-G(U). The RNAase T1 product G(U) was not labeled in this experiment. The 44-nucleotide A2 hybrid RNA contained the additional oligonucleotide A-A-G(U). d, dye markers; CAc, cellulose acetate.
PLATE V. RNAase T1 fingerprint of the 49-nucleotide A3 hybrid RNA (labeled with [α-32P]ATP) which was treated with polynucleotide kinase and ATP before RNAase T1 digestion. The oligonucleotides are: (1) U-C-A-C-(C,A)-C-A-C-U-G(A); (2) U-A-C-C-A-C-A-U-G(A); (3*) approximate position of the normal T1 product, U-A-A-A-C-A-C-G(G); (3) the altered 5' T1 product, pU-A-A-A-C-A-C-G(G); (4) U-A-C-G(A); (5) U-G(A); (6) A-A-A-C-G(A); (7) A-C-A-G(U). Three products, A-G(U), A-U-G(U), and G(U), were not labeled by ATP and did not appear on the fingerprint. The 52-nucleotide A3 hybrid RNA contained the additional oligonucleotide A-A-G(U). d, marker dyes; CAc, cellulose acetate.
TABLE 1
Determination of the 5'-sequences of the dinucleotide-primed RNAs initiated at the A2 and A3 promoters

UpC-primed A2 RNA

Gel band   Est. RNA length†   RNAase T1 oligonucleotides‡
1          5-6                U-C-G, C-N(N)-OH
2          10                 U-C-G, C-U-A-G, G, U-N-OH
3          15                 U-C-G, C-U-A-G, G, U-A-A-C-A-C-U-A-G, C-A-G, U-OH

Gel band   Est. RNA length†   RNAase A oligonucleotides
1          5-6                G-C(U)
2          10                 G-C(U), A-G-G-U(A)
3          15                 G-C(U), A-G-G-U(A), A-G-C(A)

CpA-primed A3 RNA

Gel band   Est. RNA length†   RNAase T1 oligonucleotides‡
1          8-9                C-A-C-A-U-G, A-N(N)-OH
2          9-10               C-A-C-A-U-G, A-N-N(N)-OH
3          11                 C-A-C-A-U-G, A-N-N-N-N-OH
4          14                 C-A-C-A-U-G, A-A-A-C-G, A-N-N-OH
5          16                 C-A-C-A-U-G, A-A-A-C-G, A-C-A-G, U-OH
6          18                 C-A-C-A-U-G, A-A-A-C-G, A-C-A-G, U-G, A-OH
7          20                 C-A-C-A-U-G, A-A-A-C-G, A-C-A-G, U-G, A-G, U-OH

† Number of nucleotides, from gel calibration.
‡ "N" indicates an unidentified nucleotide in a particular RNA product.

initiated at the A3 promoter with CpA and CTP, and then the polymerase-DNA complexes were treated with DNAase I in high salt. After DNAase digestion, ATP, GTP, and UTP were added (one or more of the NTP were labeled), and the polymerase "ran off" the A3 DNA fragment, synthesizing "stutter" products (low NTP concentrations were used). The [32P]RNA was then precipitated and was later run on a 15% polyacrylamide gel. Each "run off" product was eluted from the gel, digested with RNAase T1 (or RNAase A), and fingerprinted. All of the A3 RNAs had the same 5'-sequence, with the fingerprints showing increasing complexity with increasing size of the RNA being analyzed. A T1 fingerprint of the longest prominent RNA product from the A3 complex (band 1) is shown next to the gel profile in Plate II. This particular band contained RNA that was 20 or 21 nucleotides long, and its sequence was determined to be 5' CACAUGAAACGACAGUGAGU(N)-OH 3'. (I previously reported the same sequence for the A3 "run off" RNA, but indicated that the longest product was 21 or 22 nucleotides long (Pribnow, 1975). The slight size ambiguity probably reflects a slight heterogeneity in the lengths of the DNA fragments in different experiments.) As expected, the "run off" sequence from the A3 DNA fragment is the same as the CpA-primed A3 RNA sequence transcribed from whole T7 DNA.
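As a quick consistency check on the A3 "run off" sequence quoted above, complete RNAase T1 digestion (cleavage after every G) can be simulated in a few lines. This is only an illustrative sketch: the function name `rnase_t1_digest` is hypothetical, and the model ignores terminal phosphates, the unidentified 3' nucleotide (N), and partial digestion.

```python
def rnase_t1_digest(rna: str) -> list[str]:
    """Cut an RNA string after every G (idealized complete RNAase T1 digestion)."""
    fragments, current = [], ""
    for base in rna:
        current += base
        if base == "G":
            fragments.append(current)
            current = ""
    if current:  # 3'-terminal piece that does not end in G
        fragments.append(current)
    return fragments


# A3 "run off" RNA as given in the text, without the unidentified (N).
a3_run_off = "CACAUGAAACGACAGUGAGU"
a3_products = rnase_t1_digest(a3_run_off)

print(a3_products)
print(sum(len(p) for p in a3_products), "nucleotides in total")
```

Under these assumptions the simulated products are CACAUG, AAACG, ACAG, UG, AG and a 3'-terminal U, the same set of oligonucleotides reported for the longest CpA-primed A3 RNA.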
Sample B in the gel profile in Plate II shows the positions of the RNA bands from an A2 "run off" experiment in which the polymerase was allowed to initiate at the A2 promoter with UpC, GTP, and CTP before the DNAase treatment. The RNAs in bands 1, 2, and 3 from the A2 sample gave well-defined fingerprints and were found to have the same initial sequence. The other bands from the A2 sample contained mixtures of several RNAs of unknown origin. (I was unable to eliminate the extraneous bands in 4 different experiments.) An RNAase T1 fingerprint of band 1 RNA, the longest homogeneous product, is shown next to the gel profile in Plate II. The sequence determined for A2 band 1 RNA, which was 18 or 19 bases long, was 5' UCGCUAGGUAACACUAGC(N)-OH 3'. This "run off" sequence, copied from the A2 DNA fragment, is the same as the UpC-primed A2 RNA sequence transcribed from whole T7 DNA. The A2 band 1 RNA migrated much faster through the gel than would have been expected, behaving like a molecule containing 12 or 13 bases, instead of 18 or 19 (compare its position on the gel with that of the A3 band 1 RNA, which was 20 or 21 nucleotides long). This anomalous behaviour is explained by the fact that the A2 RNA could form a stable hairpin loop (Tinoco et al., 1973), with the sequences 5' GCUAG 3' and 5' CUAGC 3' base-pairing in an antiparallel helix; the urea in the gel failed to disrupt this secondary structure completely.

The RNAs whose sequences are presented above were initiated with dinucleotides, but RNA polymerase normally initiates synthesis at the A2 promoter with GTP and at the A3 promoter with ATP (Dunn & Studier, 1973a,b; Minkley & Pribnow, 1973; Kramer et al., 1974). R. Kramer (personal communication) has obtained preliminary evidence that the initial sequences synthesized by RNA polymerase in vivo from the A2 and A3 promoters are pppGC-OH and pppAUG-OH, respectively.
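The hairpin invoked above for the A2 "run off" RNA rests on the two five-base stretches being exact antiparallel complements. A minimal sketch of that check follows; it assumes Watson-Crick pairing only, and the helper name `reverse_complement` and the variable names are illustrative. The loop it reports is simply whatever lies between the two stretches in the quoted sequence, not a claim from the gel data.

```python
# Watson-Crick partners for RNA (G-U wobble pairs are deliberately ignored).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the antiparallel Watson-Crick partner of an RNA string, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


a2_run_off = "UCGCUAGGUAACACUAGC"   # A2 band 1 RNA from the text, without (N)
stem_5p, stem_3p = "GCUAG", "CUAGC"  # the two stretches named in the text

start_5p = a2_run_off.find(stem_5p)    # first occurrence of the 5' stem
start_3p = a2_run_off.rfind(stem_3p)   # last occurrence of the 3' stem
loop = a2_run_off[start_5p + len(stem_5p):start_3p]

print("stems pair antiparallel:", reverse_complement(stem_5p) == stem_3p)
print("5' stem at", start_5p + 1, "3' stem at", start_3p + 1, "(1-based)")
print("unpaired bases between the stems:", loop)
```

With the sequence as quoted, the 3' stretch sits at the very end of the RNA, so the five-base-pair stem would involve almost the whole molecule.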
There is evidence from lac (Maizels, 1973) and T7 (my unpublished results) transcription experiments that RNA polymerase uses a dinucleotide to initiate RNA synthesis within one or two bases of the natural (ATP or GTP) initiation point. Therefore, it is likely that the initial A2 mRNA sequence is pppGCUAGGUAACACUAG . . . -OH 3' and that the initial A3 mRNA sequence is pppAUGAAACGACAGUGAG . . . -OH 3'. Both the A2 and A3 protected fragments, then, do contain mRNA initiation points, and they each code for about 18 bases of messenger RNA.

(d) Isolation of RNAs complementary to the entire A2 and A3 promoter DNAs

In order to determine the entire sequence of each promoter fragment, I decided to take advantage of the fact that the A1, A2, and A3 promoters all lie in close proximity (approx. 110 to 130 base-pairs separate A1 from A2† and A2 from A3) (Bordier & Dubochet, 1974; Darlix & Dausse, 1975) and all promote transcription from the same T7 DNA strand (Summers & Szybalski, 1968; Minkley & Pribnow, 1973). Figure 1 shows that transcripts initiated at the A1 promoter should contain complete RNA copies of the A2 and A3 promoter sequences, while A2 transcripts should contain the A3 sequence. I previously reported (Pribnow, 1975) that it was possible to isolate radioactive RNA which was complementary to the A3 promoter DNA by DNA-RNA hybridization; and the promoter DNA sequence was inferred from the sequence determined for the complementary RNA. The same approach has provided the sequence of the polymerase-protected DNA from the A2 promoter, as described below.

† A CpA-primed A1 transcript, 130 to 150 bases long, was eluted from a gel, digested with RNAase T1, and fingerprinted. The fingerprint showed that the transcript extended 20 to 25 bases beyond the A2 initiation point, so about 120 RNA bases separate the A1 and A2 initiation points (my unpublished results).

[Figure 1: early transcripts drawn above the T7 DNA, on which the promoters A1, A2 and A3 are marked in that order.]
FIG. 1. Schematic representation of the early promoter region of T7 DNA showing how the early transcripts "read through" downstream promoters.

First, RNA labeled with one of the four [α-32P]triphosphates was synthesized from the A1, A2 and A3 promoters. The RNA was primed with CpA and CpC (which stimulate mostly A1 transcription (Minkley & Pribnow, 1973)), and the synthesis was limited so that the products did not extend far beyond the early promoter region. Then, in a separate reaction, unlabeled A2 promoter fragments were prepared by initiating RNA polymerase at the A2 site with UpC, GTP, and CTP, raising the salt concentration, treating the initiated complexes with DNAase I, and then stopping the digestion with the addition of EDTA. The reactions containing the labeled "read through" RNA and the A2 DNA fragments were then each extracted with phenol (to eliminate polymerase, bovine serum albumin, and DNAase I) and were dialyzed separately into 2 × SSC. After dialysis, the RNA solution was treated with DNAase to destroy the endogenous template, and it was again extracted with phenol. Next, portions of [32P]RNA and promoter DNA were mixed together, and each mixture was heated briefly at 100°C to denature the DNA fragments. A one-hour incubation at 60°C followed in order to facilitate the formation of the DNA-RNA hybrids. After hybridization, the RNA was digested by RNAase T1, and then the DNA-RNA hybrids were precipitated and were prepared for electrophoresis through a 40-cm long 12% acrylamide-7 M-urea gel (Gilbert & Maxam, 1973). Since the protected DNA fragments themselves were at least 40 base-pairs long, it was anticipated that any [32P]RNA which had been directly hybridized to the promoter DNA and had therefore survived the RNAase treatment would be at least 40 bases long. Plate III(a) shows a radioautograph of a gel that contained A2 RNA obtained by the methods just described (sample B).
Bands 1 and 2 represent RNAs 44 and 41 bases long, which were complementary to the A2 promoter DNA. Band 1 RNA was almost always present in a substantially lower molar yield than band 2 RNA. Both bands were missing when protected DNA fragments were absent from the hybridization, as shown by the control (sample A). Other controls showed that no complementary RNA was obtained when the DNA fragments were prepared in the absence of an initiation event (in which case 0.2 M-KCl prevented polymerase binding and protection). Plate III(b) shows the RNA bands obtained in a similar hybridization experiment using protected fragments from the A3 promoter. Because the RNA used in the hybridization experiments was transcribed from one T7 DNA strand only, it was expected that the longer of the two A2 RNAs (band 1)
would contain the sequence of the shorter, more prominent RNA (band 2). Another consequence of the strand-specificity of transcription from the early promoters was the prediction that the A2 hybrid RNA would contain the A2 "run off" sequence toward its 3'-end. Similar findings were expected for the two A3 RNAs (Pribnow, 1975). Sequence analysis confirmed the predicted results, once all RNA species specifically homologous to the promoter fragments were extracted from the gels and were subjected to sequencing procedures.
(e) Sequence of the RNA complementary to the A2 promoter fragment

After being eluted from the gel, each A2 hybrid RNA, labeled with a single [α-32P]ribonucleoside triphosphate, was digested separately with RNAase T1 and RNAase A, and the sequences of the resulting oligonucleotide products were determined. Plate IV shows a radioautograph of the RNAase T1 products obtained from the 41 base-pair long A2 RNA. The 44 base-pair long A2 RNA contained only one additional T1 oligonucleotide, AAG, giving a total of nine T1 oligonucleotides (CAG was present twice). Each T1 oligonucleotide was digested with RNAase A, and then each RNAase A product was hydrolyzed in alkali. Table 2 lists the 5' to 3' nucleotide-specific phosphate transfers obtained upon hydrolysis of each RNAase A product derived from each A2 T1 oligonucleotide. The sequences of all but two of the T1 oligonucleotides were determined directly from the sequences of their constituent RNAase A products given by this nearest-neighbor analysis (see Table 2). Additional data were required to establish the sequences of T1 oligonucleotides 1 and 2. There were two possible permutations of the five RNAase A products of T1 oligonucleotide 2, UACUAAGAG(C) and UAACACUAG(C), that were consistent with the data in Table 2. (A nucleotide indicated in parentheses is not a part of the digestion product; it is the 3' neighbor of the last nucleotide in the product.) In order to determine the correct alternative, I subjected T1 oligonucleotide 2, labeled initially with [α-32P]GTP, to RNAase U2 digestion. Since RNAase U2 cleaves RNA on the 3' side of A and G leaving 3'-phosphates, either GA or CUA would be obtained as the labeled U2 product, depending on which sequence was correct. The labeled U2 product obtained in the digestion was CUA, as determined by its electrophoretic mobility on DEAE-cellulose paper relative to U2 products of known sequence (including CA and UCA). Thus, the sequence of T1 oligonucleotide 2 was UAACACUAG(C).
Two alternative sequences were also possible for T1 oligonucleotide 1: ACAUAAAUCG(C) or AUACAAAUCG(C). The second alternative was correct, since RNAase A digestion of the A2 hybrid RNA produced the oligonucleotide AAGAU(A), showing that one of the T1 oligonucleotides had AU(A) at its 5'-end. T1 oligonucleotide 1 was the only candidate among the T1 products that could have had AU(A) at its 5'-end. The oligonucleotides derived from the A2 hybrid RNAs by RNAase A digestion were sequenced by procedures analogous to those used to sequence the T1 oligonucleotides. The G-containing RNAase A oligonucleotides were digested with RNAase T1, and then the individual T1 products were hydrolyzed in alkali. The non-G-containing oligonucleotides were hydrolyzed directly. The resulting nearest-neighbor data established the sequences of all of the RNAase A oligonucleotides without further analysis. Each RNAase A oligonucleotide was found to be present in the relative molar yield predicted by the sequences already established for the T1 oligonucleotides.
TABLE 2
Determination of the sequences of the RNAase T1 digestion products of the 44 base-pair long A2 hybrid RNA

Columns: RNAase T1 product; RNAase A derivative; 32P transferred from A, G, C or U to each 3'-mononucleotide; Sequence.
[Table body not legible in the source scan.]

The Table lists the 32P-labeled 3'-mononucleotides generated upon hydrolysis of each RNAase A derivative of each RNAase T1 oligonucleotide. The sequence of each RNAase T1 oligonucleotide was deduced directly from the sequences of its RNAase A derivatives, except as noted in the text.
† RNAase A derivative I (U(A)) was present in twice the molar yield of the other RNAase A derivatives of RNAase T1 product 2.
‡ RNAase T1 product 6 was present in twice the molar yield of the other RNAase T1 products.

Table 3 presents the data obtained upon hydrolysis of the G-containing RNAase A oligonucleotides. The entire sequence of the longest A2 hybrid RNA was constructed from overlaps of the individual oligonucleotide sequences, as shown in Figure 2. For each adjacent pair of T1 oligonucleotides, there was a G-containing RNAase A oligonucleotide whose sequence extended from one T1 oligonucleotide into or beyond the neighboring oligonucleotide, a necessary consequence of the different specificities of RNAase T1 (cleavage after G) and RNAase A (cleavage after U and C). An apparent anomaly aided the A2 sequence construction tremendously. In two hybridization experiments, two RNAs were found in addition to A2 bands 1 and 2 (Plate III(a)) that were specifically complementary to the A2 promoter DNA. These two RNAs, 34 and 27 nucleotides in length, were fingerprinted and were discovered to be long partial (RNAase T1) products of the other A2 RNAs. (It is not clear why the partial products were obtained, as they did not appear in most experiments.) Thus, oligonucleotide
overlaps could first be worked out for the 27 base-pair A2 RNA, and then the sequence could be extended to 34, 41, and finally 44 base-pairs (see Fig. 2). As can be seen in Figure 2, the deduced A2 RNA sequence contains the sequence of the A2 mRNA (or UpC-primed RNA) at its 3'-end, proving that this long sequence represents the RNA polymerase-protected A2 DNA sequence.

TABLE 3
Determination of the sequences of the G-containing RNAase A digestion products of the 44 base-pair long A2 hybrid RNA

Columns: RNAase A product; RNAase T1 derivative; 32P transferred from A, G, C or U to each 3'-mononucleotide; Sequence.
[Table body not legible in the source scan.]

The Table lists the 32P-labeled 3'-mononucleotides generated upon hydrolysis of each RNAase T1 derivative of each G-containing RNAase A oligonucleotide.
† [A-G(U)]: RNAase A product from the 3'-end of the A2 hybrid RNA.

Sequence: 5' AAGUAACAUGCAGUAAGAUACAAAUCGCUAGGUAACACUAGCAGU 3'
RNAase T1 products (5' to 3'): AAG(U), UAACAUG(C), CAG(U), UAAG(A), AUACAAAUCG(C), CUAG(G), G(U), UAACACUAG(C), CAG(U)
RNAase A products (5' to 3'): AAGU(A), GC(A), AGU(A), AAGAU(A), GC(U), AGGU(A), AGC(A), [AG(U)]
[The staggered layout of the original figure and its 27, 34 and 41 end-point markers are not reproduced.]
FIG. 2. Construction of the A2 hybrid RNA sequence by overlaps of RNAase T1 and RNAase A products. The end-points of the 27, 34, and 41-nucleotide RNAase T1 partial products are indicated. Hyphens have been omitted.
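The overlap argument for the A2 hybrid RNA can be illustrated by digesting the deduced 44-nucleotide sequence in silico with both enzyme specificities. The sketch below is a simplification under stated assumptions: `digest` is a hypothetical helper, cleavage is treated as complete and strictly base-specific, and the sequence is the one read from Figure 2 without its 3' neighbor U.

```python
def digest(rna: str, cut_after: str) -> list[str]:
    """Cut an RNA string after every base listed in cut_after."""
    fragments, current = [], ""
    for base in rna:
        current += base
        if base in cut_after:
            fragments.append(current)
            current = ""
    if current:  # 3'-terminal piece with no cleavable base at its end
        fragments.append(current)
    return fragments


A2_HYBRID = "AAGUAACAUGCAGUAAGAUACAAAUCGCUAGGUAACACUAGCAG"  # 44 nucleotides
A2_RUN_OFF = "UCGCUAGGUAACACUAGC"                            # UpC-primed RNA

t1_products = digest(A2_HYBRID, "G")        # RNAase T1: after G
rnase_a_products = digest(A2_HYBRID, "CU")  # RNAase A: after C and U
g_containing = [frag for frag in rnase_a_products if "G" in frag]

# Every G-containing RNAase A product carries a G that ends one T1 product,
# so it links that T1 product to its 3' neighbor.
print("T1:", t1_products)
print("RNAase A, G-containing:", g_containing)

# Where does the run-off sequence sit within the hybrid RNA?
offset = A2_HYBRID.find(A2_RUN_OFF)
print("bases 3' of the run-off sequence:", len(A2_HYBRID) - offset - len(A2_RUN_OFF))
```

Under these assumptions the simulation returns nine T1 products (CAG twice) and eight G-containing RNAase A products, and it places the UpC-primed sequence two bases from the 3'-end of the hybrid RNA.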
(f) Sequence of the RNA complementary to the A3 promoter fragment

The sequence of the whole A3 promoter complement was deduced in essentially the same way as the sequence of the A2 complement. Tables 4 and 5 show the nearest-neighbor data establishing the sequences of the T1 oligonucleotides and the G-containing RNAase A oligonucleotides derived from the 52 bases long A3 hybrid RNA. Figure 3 shows the overlaps of the RNAase T1 and RNAase A oligonucleotides leading
to the complete sequence. The oligonucleotides comprising the A3 "run off" RNA sequence were identified among the various products, showing that the hybrid RNA really was complementary to the promoter fragment. These oligonucleotides were ordered according to the known run off sequence, and the rest of the sequence was built upon the run off sequence.

TABLE 4
Determination of the sequences of the RNAase T1 digestion products of the 52 base-pair long A3 hybrid RNA

Columns: RNAase T1 product; RNAase A derivative; 32P transferred from A, G, C or U to each 3'-mononucleotide; Sequence.
[Table body not legible in the source scan.]

† RNAase U2 digestion of RNAase T1 product 2 yielded the oligonucleotide C-A(U). This result eliminated the ambiguity in the order of the RNAase A derivatives of T1 product 2.
‡ RNAase A derivative II (C(A)) was present in twice the molar yield of the other RNAase A derivatives of T1 product 1.

TABLE 5
Determination of the sequences of the G-containing RNAase A digestion products of the 52 base-pair long A3 hybrid RNA

Columns: RNAase A product; RNAase T1 derivative; 32P transferred from A, G, C or U to each 3'-mononucleotide; Sequence.
[Table body not legible in the source scan.]

† [G(A)]: RNAase A product from the 3'-end of the A3 hybrid RNA.

A problem was encountered in the A3 sequence construction which involved the placement of the two T1 oligonucleotides, UAAACACG and UACG. Two alternative sequences were possible for the A3 RNA, with one or the other of these T1 products falling immediately after AAG at the 5'-end of the sequence. Since the 49 bases long A3 RNA (Plate III(b), band 2) contained all of the T1 products except the 5'-AAG, it was certain that either UAAACACG or UACG occupied the 5'-end of this RNA molecule; and since the RNA was essentially a large T1 partial product (the hybrids were treated with RNAase T1), it had a free 5'-OH end. I therefore decided to use polynucleotide kinase (Richardson, 1971) to transfer a phosphate (unlabeled) from ATP to the 5'-end of the 49 base-pair long A3 RNA, expecting that after subsequent RNAase T1 digestion the extra phosphate would affect the mobility of the T1 product from the
5'-end of the RNA in both dimensions of a fingerprint, thereby identifying it. I did the polynucleotide kinase transfer reaction, digested the RNA completely with RNAase T1, and fingerprinted the products. Over 90% of the oligonucleotide UAAACACG had been altered (phosphorylated), the change in its position on the fingerprint being visualized in Plate V. None of the other T1 products had been affected by the kinase reaction. This result showed that the sequence at the 5'-end of the 49 bases long A3 RNA was UAAACACG . . . -OH, and a unique sequence for the entire RNA molecule could then be pieced together (Fig. 3).

Sequence: 5' AAGUAAACACGGUACGAUGUACCACAUGAAACGACAGUGAGUCAC(C,A)CACUGA 3'
RNAase T1 products (5' to 3'): AAG(U), UAAACACG(G), G(U), UACG(A), AUG(U), UACCACAUG(A), AAACG(A), ACAG(U), UG(A), AG(U), UCAC(C,A)CACUG(A)
RNAase A products (5' to 3'): AAGU(A), GGU(A), GAU(G), GU(A), GAAAC(G), GAC(A), AGU(G), GAGU(C), [G(A)]
[The staggered layout of the original figure is not reproduced.]
FIG. 3. Construction of the A3 hybrid RNA sequence by overlaps of RNAase T1 and RNAase A products. Hyphens have been omitted.

A2
                     B           I
5' GAAGTAACATGCAG TAAGATA CAAATC G CTAGGTAACACTAGCAGT 3'
3' CTTCATTGTACGTC ATTCTAT GTTTAG C GATCCATTGTGATCGTCA 5'

A3
                    B           I
5' GAAGTAAACACGG TACGATG TACCAC A TGAAACGACAGTGAGTCAC(C,A)CACTGA 3'
3' CTTCATTTGTGCC ATGCTAC ATGGTG T ACTTTGCTGTCACTCAGTG(G,T)GTGACT 5'

[The arrows of the original figure are not reproduced.]
FIG. 4. DNA sequences at the A2 (top) and A3 (bottom) promoters. Arrows indicate the probable ends of the polymerase-protected sequences. "I" indicates the initiation point for RNA synthesis from each promoter. RNA synthesis proceeds to the right from "I". "B" indicates the DNA sequence that probably makes specific contacts with the RNA polymerase (see Discussion).

(g) The DNA fragments

The DNA sequences of the promoter fragments, inferred from the hybrid RNA sequences, are presented in Figure 4. These sequences are longer than the actual sequences of the A2 and A3 protected DNA fragments, because the RNAs that were hybridized to the promoter DNAs were digested with RNAase T1. RNA extending beyond either end of the hybrids would have remained intact up to the point of a T1 cleavage site at a G. A good estimate of the actual lengths of the A2 and A3 DNA fragments can, however, be obtained from the spectrum of RNA sequences. When the A2 RNA was hybridized to the complementary strand of the A2 DNA fragment, the 5'-oligonucleotide AAG was only partially protected from removal by RNAase T1. This suggested that the left end of the DNA strand was in the immediate vicinity of the G in the oligonucleotide AAG. Since the 3'-oligonucleotide CAG was completely inaccessible to RNAase T1, the right end of the DNA strand must have extended at least up to the G preceding the CAG. More importantly, the A2 "run off" RNA, which was complementary to the same DNA strand of the protected fragment as the hybrid RNA, extended at least as far as the C in CAG, and possibly up to the A, placing the right end of the DNA strand after the C or after the A in the 3'-oligonucleotide CAG.
Thus, the DNA strand was probably 40 or 41 nucleotides long, and the double-stranded A2 promoter fragment was probably about 40 or 41 base-pairs long (the approximate end-points are indicated in Fig. 4). Similar arguments have been made previously for the size of the A3 protected fragment (Pribnow, 1975), suggesting that it was 40 to 42 base-pairs long. The more recent estimate of the length of the CpA-primed A3 "run off" RNA (20 or 21 nucleotides) suggests that the fragment was actually one base-pair shorter (these end-points are also indicated in Fig. 4).

(h) Sequences from non-initiated complexes

The promoter fragments whose sequences have been presented were obtained from complexes in which RNA polymerase had carried out a limited initiation reaction. As stated earlier, the polymerase can form stable, non-initiated complexes at promoter sites. At a given promoter, how does the binding site occupied by a non-initiated polymerase molecule relate to the site occupied by a polymerase molecule which has "initiated" transcription? As shown earlier, non-initiated and initiated polymerases protect the same sized piece of DNA from digestion by DNAase I (Plate I, samples A and D), but do they protect the same DNA sequences? To answer this question, I did hybridization experiments with labeled "read through" RNA similar to those described earlier. In this case, however, I prepared unlabeled polymerase-protected DNA fragments by binding RNA polymerase to T7 DNA without initiation, followed
by DNAase I treatment in low salt to digest unprotected DNA. The DNA fragments were hybridized to [32P]RNA, the hybrids were treated with RNAase T1, and the surviving [32P]RNA was electrophoresed through a polyacrylamide gel. The RNA products obtained in one such experiment are pictured in the radioautograph in Plate III(c) (sample A). As reported previously (Pribnow, 1975), two of the RNAs (bands 1 and 2) were shown by fingerprint analysis to be identical to the two hybrid RNAs that were complementary to the A3 promoter DNA from initiated complexes. Thus, RNA polymerase protects the same DNA sequence at the A3 promoter (at least within a few base-pairs) in both its initiated (CpA + CTP) and non-initiated states. Two other RNAs (Plate III(c), bands 3 and 4) were eluted, treated with RNAase T1, and fingerprinted; and these two RNAs were found to be identical to the two A2 hybrid RNAs seen in the earlier experiments, except that they did not contain the oligonucleotide CAG at their 3'-ends. So, the polymerase protects essentially the same DNA sequence in its initiated (UpC + GTP + CTP) and non-initiated states at the A2 promoter. Apparently the complexed polymerase moves a base-pair or two or alters its conformation slightly when it "initiates" synthesis at the A2 promoter, forming the oligonucleotide UpCpGpC. One other RNA band obtained in this experiment suggested that the T7 A1 promoter, too, was protected by non-initiated polymerase from digestion by DNAase I. When the RNA in band 5 (Plate III(c)) was analyzed, it was discovered that the RNA contained oligonucleotides corresponding to the initial sequence of the CpA-primed A1 RNA (my unpublished results). The obvious inference was that some of the labeled "read through" RNA initiated at the A1 promoter had hybridized to DNA fragments from the A1 promoter and was therefore protected from digestion by RNAase T1.

(i) Protection by core polymerase

E. coli core RNA polymerase (Burgess, 1969; Berg et al., 1971) is capable of complexing with DNA. The core enzyme binds to T7 DNA with an affinity about 100-times lower than that for holoenzyme, and it is thought to bind non-specifically all over the DNA (Hinkle & Chamberlin, 1972). But does the core enzyme show a detectable preference for the promoter binding sites? Using core polymerase to protect 32P-labeled T7 DNA, I found that the core enzyme did protect some DNA from digestion by DNAase I, and that the protected fragments were roughly 35 base-pairs long (my unpublished results); but did they represent specific segments of the template? I did hybridization experiments like those described earlier, where core-protected DNA was hybridized to labeled "read through" RNA from the early promoter region. The yield of hybrid RNA in these experiments was low (~10%) relative to yields obtained in hybridizations with holoenzyme-protected DNA fragments; and the RNA migrated in gels in diffuse bands that were shown by fingerprint analysis to contain heterogeneous non-promoter sequences. (In early experiments, I did obtain low yields of the A2 hybrid RNAs; but they did not appear in later experiments in which I used core enzyme containing less than 1% sigma, kindly provided to me by Mike Chamberlin.) Promoter binding and protection, to the extent that it could be measured by these experiments, was totally sigma-dependent.

Work is now in progress in this laboratory to determine the DNA sequence engaged by RNA polymerase at the T7 A1 promoter and to expand the known sequences around the A2 and A3 binding sites.
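Returning briefly to the fragment-size argument of section (g), the inference of the A2 DNA duplex from the hybrid RNA and the 40 or 41 base-pair estimate can be written out as a short calculation. This is a sketch under explicit assumptions: the left end of the protected strand is placed exactly at the G of the 5'-AAG (the text says only "in the immediate vicinity"), and the helper name `duplex_from_rna` is hypothetical.

```python
RNA_TO_DNA = str.maketrans("U", "T")
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def duplex_from_rna(rna: str) -> tuple[str, str]:
    """Return (top, bottom) DNA strands for an RNA copy.

    top has the same sense as the RNA (5' to 3'); bottom is its complement,
    written 3' to 5' so that the two strings align base by base.
    """
    top = rna.translate(RNA_TO_DNA)
    bottom = top.translate(DNA_COMPLEMENT)
    return top, bottom


A2_HYBRID = "AAGUAACAUGCAGUAAGAUACAAAUCGCUAGGUAACACUAGCAG"  # 44 nucleotides
top, bottom = duplex_from_rna(A2_HYBRID)

# 1-based positions within the 44-nucleotide hybrid RNA.
left_end = A2_HYBRID.index("AAG") + 3   # assumed: the G of the 5'-AAG
right_end_c = len(A2_HYBRID) - 2        # the C of the 3'-terminal CAG
right_end_a = len(A2_HYBRID) - 1        # the A of the 3'-terminal CAG

fragment_lengths = (right_end_c - left_end + 1, right_end_a - left_end + 1)

print("5'", top, "3'")
print("3'", bottom, "5'")
print("estimated protected strand length:", fragment_lengths)
```

With the left end fixed in this way, the two candidate right ends give strand lengths of 40 and 41 nucleotides, matching the estimate quoted in section (g).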
4. Discussion
(a) The promoter fragments

Polymerase-protected DNA fragments from a bacteriophage fd promoter (Heyden et al., 1972; Schaller et al., 1975; Sugimoto et al., 1975), the phage lambda rightward promoter (Walz & Pirrotta, 1975), the E. coli lac UV5 promoter (J. Gralla, personal communication), and the T7 A2 and A3 promoters have all been isolated and sequenced. How do they compare with each other? Each promoter fragment is 40 to 42 base-pairs long, it contains the mRNA initiation point for the promoter, and it codes for 17 to 19 bases of RNA. The similarity among the promoter fragments shows that RNA polymerase binds to every promoter in the same position and orientation with respect to the mRNA initiation point. This suggests that RNA polymerase binds to a specific DNA sequence, common to all promoters, that lies in a prescribed region of the DNA near the initiation site.

Two facts suggest that all of the relevant sequence information required for the maintenance of a stable polymerase-DNA complex is contained within the sequence of each polymerase-protected DNA fragment. First, there is the simple fact that the DNA nucleotides that are directly bound by the polymerase are the least likely to be digested by DNAase. Second, the stability of a complex between a polymerase molecule and its bound promoter fragment is approximately the same as the stability of a binary complex on a whole DNA molecule (Schaller et al., 1975), where all binding information must be present. A comparison of promoter sequences where the initiation points in the sequences are aligned should expose the proposed binding sequence as a conspicuous region of homology among the promoter sequences. In fact, since it is apparent that the tight binding sequence must fall within roughly 20 base-pairs of the initiation point at any promoter (the protected region), any promoter sequence covering this region, whether or not it actually comes from a "protected fragment", should be included in the comparison.
(b) The binding sequence

Earlier (Pribnow, 1975), I made a comparison of known promoter sequences from T7 (A3 only), fd (Schaller et al., 1975; Sugimoto et al., 1975), simian virus 40 (Dhar et al., 1974), lambda PL (Maniatis et al., 1974), and E. coli tyr tRNA (Sekiya et al., 1975a,b), lac wild type (Dickson et al., 1975), and lac UV5 (J. Gralla, personal communication). With the initiation points vertically aligned (in one or the other of two adjacent columns) and with transcription oriented from left to right, the sequences showed a striking seven base-pair homology located to the left of the initiation points in the non-transcribed regions of the sequences. Schaller et al. (1975) made a similar discovery. This sequence,

5' T-A-T-Pu-A-T-G 3'
3' A-T-A-Py-T-A-C 5',

was therefore implicated in the formation of a stable binary complex with RNA polymerase. When the sequences from the lambda rightward promoter (Walz & Pirrotta, 1975) and the T7 A2 promoter are added to the comparison with the other seven sequences, the homology is retained. Not every promoter sequence contains the exact sequence written above, but none
differs by more than two base-pairs. Differences in the basic binding sequence from one promoter to the next probably determine differences in the binding constants for tight complexes at the various promoters and, consequently, some differences in the relative efficiencies of the promoters. As noted previously (Pribnow, 1975), there is one example of a change in the binding sequence that does bring about a significant difference in promoter activity. Independent of the protein activator, CAP (Beckwith et al., 1972), the lac UV5 promoter sequence (J. Gralla, personal communication), which exactly matches the basic binding sequence, is a far better promoter (about 50 times better) (Silverstone et al., 1970; Eron & Block, 1971) than the lac wild type sequence (Dickson et al., 1975), which does not exactly match the basic binding sequence.

The seven base-pair binding sequence is A-T rich. This is probably important, since the formation of a stable polymerase-DNA complex has been shown to involve a limited unwinding of the DNA strands (Saucier & Wang, 1972), and A·T base-pairs "melt" more readily than G·C base-pairs.

(c) The fragments are not promoters

Are the polymerase-protected DNA fragments themselves functional promoters? Each fragment contains a binding site for the polymerase and the mRNA initiation point for the promoter. In addition, "run off" experiments have been done with polymerase molecules bound in non-initiated complexes with fd DNA fragments (Schaller et al., 1975; Heyden et al., 1975), lambda PR fragments (Walz & Pirrotta, 1975), and lac UV5 fragments (J. Gralla, personal communication), and these experiments have shown conclusively that the DNA fragments contain all of the information that the polymerase requires to initiate RNA synthesis.
However, when RNA polymerase was added to purified DNA fragments from the T7 A2 (my unpublished results) or A3 (Pribnow, 1975) promoter, an fd promoter (Schaller et al., 1975; Heyden et al., 1975), or the lambda PR promoter (Walz & Pirrotta, 1975), the enzyme was not able to form a functional complex and initiate RNA synthesis. None of the protected fragments could direct the formation of stable pre-initiation complexes, even though they were capable of maintaining such complexes once they had been formed (complexes remaining after DNAase treatment).

(d) The recognition sequence

Apparently some signal that normally helps the polymerase to find the binding sequence is not protected by the polymerase from digestion by DNAase. This is not surprising. A protected fragment contains just over 20 base-pairs of the non-transcribed promoter sequence; but in lambda and lac there are promoter mutations that are located about 35 base-pairs from the lambda and lac initiation points in the non-transcribed regions of these promoters (Maniatis et al., 1974; Dickson et al., 1975). As suggested earlier (Pribnow, 1975), these mutations probably pinpoint the site of a "recognition" event that takes place between RNA polymerase and some specific DNA sequence. It would be this recognition sequence, situated about 35 base-pairs from the RNA initiation point, that is missing on the protected fragments and that must normally be present to guide the polymerase into a pre-initiation complex.

As with the binding sequence, the recognition sequence should appear as a region of homology among promoter sequences. In fact, Maniatis et al. (1975), in comparing
the sequences of lambda PL, lambda PR, and the simian virus 40 promoter, found an absolute five base-pair homology among these three sequences, a homology that is centered exactly 35 base-pairs from the initiation point in each promoter sequence. However, the lac and tyr tRNA sequences, which are the only other known promoter sequences extending into this region, do not contain the five base-pair sequence. Appropriate alignment of the five sequences together reveals some homology, but it is clear that much more information is needed to deduce the basic recognition sequence. If the recognition sequence is a promoter-regulatory element, as proposed below, then the recognition sequence might vary substantially at some promoters.
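The argument that a site about 35 base-pairs upstream cannot lie on a protected fragment is simple arithmetic, and a short Python sketch makes it explicit. It uses only figures quoted in this Discussion (fragments of 40 to 42 base-pairs coding for 17 to 19 bases of RNA, and a five base-pair homology centered 35 base-pairs from the initiation point). The function names and the distance convention (the first base-pair upstream of the initiation point is at distance 1) are assumptions introduced for illustration.

```python
def upstream_extent(fragment_len, transcribed_len):
    """Base-pairs of non-transcribed DNA carried by a protected fragment."""
    return fragment_len - transcribed_len


def protected_overlap(center, width, fragment_len, transcribed_len):
    """Number of base-pairs of a site that fall inside the protected region.

    center: distance (bp) of the site's middle base-pair upstream of the
            initiation point, counting the first upstream base-pair as 1.
    width:  site length in bp (odd, so that the site has a middle base-pair).
    """
    near = center - width // 2   # edge closest to the initiation point
    far = center + width // 2    # edge farthest from the initiation point
    extent = upstream_extent(fragment_len, transcribed_len)
    return max(0, min(far, extent) - near + 1)


# Range of upstream extents implied by the quoted fragment sizes.
shortest = upstream_extent(40, 19)
longest = upstream_extent(42, 17)

# A 5 bp site centered 35 bp upstream, tested against the longest extent.
recognition_overlap = protected_overlap(35, 5, 42, 17)
```

Under these assumptions even the most generous fragment carries only about 25 non-transcribed base-pairs, so a five base-pair site centered at 35 has no overlap with the protected region, in line with the text's "just over 20 base-pairs".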
(e) Promoter model

Coupling the foregoing considerations with the current detailed knowledge of polymerase-DNA interactions and the initiation of RNA synthesis (for a review, see Chamberlin, 1974), I have constructed a simple, testable model for promoter structure and function. Some essential features of the model have been put forth by others (Walter et al., 1967; Mangel & Chamberlin, 1974a,b), and I will concentrate here on the specific DNA sequences and their functional relationship. The model is schematized in Figure 5 and is described below.

According to the model, a promoter is comprised of three essential elements that
FIG. 5. The promoter model (see text). The 3 diagrammed steps in the initiation process are: (1) the polymerase forms a "recognition" complex at "R"; (2) the polymerase makes a transition to a stable pre-initiation complex at the exposed ("melted") binding sequence, "B", and (3) the polymerase initiates RNA synthesis, which is followed by elongation of the RNA chain. The polymerase is pictured as covering a total of 55 to 60 base-pairs of DNA.
cover about 40 base-pairs of DNA: (1) a "recognition" sequence (R); (2) a "binding" sequence (B), and (3) an RNA initiation point (I). Whereas it is known that the basic binding sequence is

5' T-A-T-Pu-A-T-G 3'
3' A-T-A-Py-T-A-C 5',

the recognition sequence is presently unknown. At the initiation point, the only requirement is that the nucleotide in the transcribed DNA strand be a T or a C, since all natural transcripts are initiated with ATP or GTP (Maitra et al., 1967). Five or six base-pairs separate the binding sequence from the initiation point, and about 20 base-pairs occupy the space between the binding and recognition sequences.

Operationally, the model includes three basic, sequential events. First, the polymerase forms a complex with the recognition sequence, R; second, the polymerase binds the nucleotides in the binding sequence, B, holding the two DNA strands apart; and, third, the polymerase initiates RNA synthesis, catalyzing the formation of a phosphodiester bond between ATP or GTP and the next encoded nucleotide. Beyond this third step, the polymerase is involved in elongation of the nascent RNA, and the promoter can probably be ignored.

(i) Recognition

When RNA polymerase forms a complex with the recognition sequence, it does not open the DNA strands, but binds directly to the outside of the helix. Recognition is therefore a rapid event. (Although it is possible that the recognition complex is the "closed" complex observed experimentally by Mangel & Chamberlin (1974a), the binding that they observed might have been largely non-specific.) The function proposed for the recognition sequence is that it orients the polymerase so that it has direct access to the binding sequence. This requires a fairly rigid spatial relationship between the recognition and binding sequences, since a displaced recognition sequence could not orient the polymerase properly.
The polymerase specificity subunit, sigma (Burgess, 1969), probably mediates binding at the recognition sequence (perhaps sigma itself "recognizes" the sequence) and therefore enables the enzyme to engage the binding sequence.

(ii) Pre-initiation complex

The transition from the recognition complex to the stable pre-initiation complex involves "local melting" within the promoter (Saucier & Wang, 1972). The transition is co-operative, as noted by Mangel & Chamberlin (1974a,b) and by Zillig et al. (1970), being strongly dependent on salt concentration and temperature, both of which affect the stability of the DNA helix. According to the model, the polymerase, poised in the recognition complex, "waits" for the DNA helix to open in the region of the binding sequence, and once the helix has opened, the polymerase engages the nucleotides in the binding sequence. The polymerase does not move laterally along the DNA helix during the transition. (Two polymerases cannot occupy the same promoter at one time.) The binding sequence itself is the same or nearly the same in all promoters, since the polymerase has to make specific contacts with the bases in this sequence, and because the pre-initiation complex must be formed before initiation can take place (Mangel & Chamberlin, 1974a).
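The statement that the binding sequence is "the same or nearly the same in all promoters" can be made concrete by counting mismatches against the basic binding sequence 5' T-A-T-Pu-A-T-G 3'. The following Python sketch is one way to do so. It writes Pu as R, the usual one-letter code for a purine, and the function names and test strings are hypothetical; it is an illustration of the comparison and not the procedure used to derive the homology.

```python
PURINES = {"A", "G"}
CONSENSUS = "TATRATG"  # R stands for Pu (A or G) in T-A-T-Pu-A-T-G


def mismatches(site):
    """Count positions where a 7-base site departs from the consensus."""
    if len(site) != len(CONSENSUS):
        raise ValueError("site must be 7 bases long")
    count = 0
    for base, want in zip(site.upper(), CONSENSUS):
        matches = base in PURINES if want == "R" else base == want
        if not matches:
            count += 1
    return count


def best_site(seq):
    """Return (mismatch count, start index, 7-mer) for the best window.

    Ties are resolved in favour of the leftmost window. Raises ValueError
    if seq is shorter than 7 bases, since no window exists.
    """
    windows = [
        (mismatches(seq[i:i + 7]), i, seq[i:i + 7])
        for i in range(len(seq) - 6)
    ]
    return min(windows)
```

With this scoring, the criterion stated in section (b), that no promoter differs from the basic sequence by more than two base-pairs, corresponds to `mismatches(site) <= 2` for the aligned heptamer of each promoter.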
(iii) Initiation

The binding sequence, by determining the location of the pre-initiation complex, determines the RNA initiation point. The DNA helix is probably held "open" up to the point of initiation, so that the first two RNA nucleotides can base-pair with the exposed template nucleotides. Evidence in support of these suggestions comes from the observation that dinucleotides can prime RNA chains by base-pairing with the promoter DNA in a limited region around the initiation point (Minkley & Pribnow, 1973; Maizels, 1973; Darlix & Dausse, 1975). The polymerase catalyzes the formation of a phosphodiester bond between the first and second template-encoded nucleotides as it begins to move away from the promoter, and initiation is effectively complete.
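The way in which the binding sequence fixes the initiation point can be sketched in Python from two rules stated in the model: five or six base-pairs separate the binding sequence from the initiation point, and the transcribed strand must carry a T or a C there, so that the RNA begins with A or G. The sketch assumes that the input is the non-transcribed strand written 5' to 3' (the same sense as the RNA), so that an A or G in the input corresponds to a T or C in the transcribed strand. The function name, the 0-based indexing, and the toy strings are hypothetical.

```python
def initiation_candidates(nontemplate, b_start, b_len=7, spacers=(5, 6)):
    """List positions where RNA synthesis could begin under the model.

    nontemplate: non-transcribed strand, 5' to 3' (assumed convention).
    b_start:     0-based index of the first base of the binding sequence.
    Returns (index, first RNA base) for each allowed spacer length whose
    position carries A or G, i.e. a chain starting with ATP or GTP.
    """
    found = []
    for gap in spacers:
        pos = b_start + b_len + gap
        if pos < len(nontemplate) and nontemplate[pos] in "AG":
            found.append((pos, nontemplate[pos]))
    return found


# Invented toy strings, NOT real promoter sequences.
permissive = initiation_candidates("GGCTATAATGCCGTCAGGA", b_start=3)
blocked = initiation_candidates("GGCTATAATGCCGTCTCGA", b_start=3)
```

In the first toy string both allowed spacings land on a purine, so the sketch reports two candidate positions; in the second, both land on a pyrimidine and no candidate is reported. The model itself does not say which of two permissible positions would be used.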
(iv) Regulation

Being the first step in polymerase-promoter interaction, "recognition" is the logical target for factors that modulate transcription. Repressors might block recognition by sterically preventing polymerase binding; and activators might facilitate recognition by mediating formation of a recognition complex where a weak recognition sequence or no recognition sequence is present at the promoter (the activator would have its own recognition sequence). In programmed transcription systems (e.g. bacteriophage T4), where RNA polymerase is responsible for transcribing different sets of genes at different times, the affinity that the polymerase has for one recognition sequence might be altered in time (by chemical modification of the polymerase, substitution of a new sigma-type subunit, etc.), so that the polymerase acquires an affinity for a different recognition sequence.

Whereas the promoter model is somewhat oversimplified functionally, it provides a tentative answer to the question: what is a promoter?

I wish to thank Bob Simpson and Mike Chamberlin for the RNA polymerases used in these experiments, and I thank Rick Kramer and Jay Gralla for allowing me to cite some of their unpublished results. I am particularly grateful to Allan Maxam, who skillfully taught me the RNA sequencing procedures; and I thank Allan Maxam, John Majors, Jay Gralla, and Wally Gilbert for helpful discussions. Wally Gilbert also provided valuable criticism of this manuscript. This work was supported by U.S. Public Health Service grant no. GM09541 from the National Institute of General Medical Sciences.

REFERENCES

Beckwith, J., Grodzicker, T. & Arditti, R. (1972). J. Mol. Biol. 69, 155-160.
Berg, D., Barrett, K. & Chamberlin, M. (1971). In Methods in Enzymology (Grossman, L. & Moldave, K., eds), vol. 21, part D, pp. 506-519, Academic Press, New York.
Bordier, C. & Dubochet, J. (1974). Eur. J. Biochem. 44, 617-624.
Brownlee, G. G. (1972).
Laboratory Techniques in Biochemistry and Molecular Biology: Determination of Sequences in RNA (Work, T. S. & Work, E., eds), North-Holland, Amsterdam/American Elsevier, New York.
Burgess, R. (1969). J. Biol. Chem. 244, 6168-6176.
Chamberlin, M. (1974). Annu. Rev. Biochem. 43, 721-775.
Chamberlin, M. & Ring, J. (1972). J. Mol. Biol. 70, 221-237.
Darlix, J.-L. & Dausse, J. P. (1975). FEBS Letters, 50, 214-218.
Dhar, R., Weissman, S. M., Zain, B. S. & Pan, J. (1974). Nucleic Acids Res. 1, 595-614.
Dickson, R. C., Abelson, J., Barnes, W. M. & Reznikoff, W. S. (1975). Science, 187, 27-35.
Downey, K. M. & So, A. G. (1970). Biochemistry, 9, 2520-2525.
Downey, K. M., Jurmark, B. S. & So, A. G. (1970). Biochemistry, 10, 4970-4975.
Dubin, S. B., Benedek, G. B., Bancroft, F. C. & Freifelder, D. (1970). J. Mol. Biol. 54, 547-556.
Dunn, J. J. & Studier, F. W. (1973a). Proc. Nat. Acad. Sci., U.S.A. 70, 1559-1563.
Dunn, J. J. & Studier, F. W. (1973b). Proc. Nat. Acad. Sci., U.S.A. 70, 3296-4001.
Epstein, W. & Beckwith, J. (1968). Annu. Rev. Biochem. 37, 411.
Eron, L. & Block, R. (1971). Proc. Nat. Acad. Sci., U.S.A. 68, 1828.
Gilbert, W. & Maxam, A. (1973). Proc. Nat. Acad. Sci., U.S.A. 70, 3581.
Heyden, B., Nusslein, Ch. & Schaller, H. (1972). Nature New Biol. 240, 9.
Heyden, B., Nusslein, Ch. & Schaller, H. (1975). Eur. J. Biochem. 55, 147-155.
Hinkle, D. & Chamberlin, M. (1970). Cold Spring Harbor Symp. Quant. Biol. 35, 65-72.
Hinkle, D. C. & Chamberlin, M. J. (1972). J. Mol. Biol. 70, 157-185.
Hoffman, D. J. & Niyogi, S. K. (1973). Proc. Nat. Acad. Sci., U.S.A. 70, 574-578.
Jacob, F., Ullman, A. & Monod, J. (1964). C.R. Acad. Sci., Paris, 258, 3125-3128.
Jones, O. W. & Berg, P. (1966). J. Mol. Biol. 22, 199-209.
Kramer, R., Rosenberg, M. & Steitz, J. A. (1974). J. Mol. Biol. 89, 767-776.
Maitra, U., Nakata, Y. & Hurwitz, J. (1967). J. Biol. Chem. 242, 4908-4918.
Maizels, N. M. (1973). Proc. Nat. Acad. Sci., U.S.A. 70, 3585-3589.
Maniatis, T. & Ptashne, M. (1973). Proc. Nat. Acad. Sci., U.S.A. 70, 1531-1535.
Maniatis, T., Ptashne, M., Barrell, B. G. & Donelson, J. E. (1974). Nature (London), 250, 395-397.
Maniatis, T., Ptashne, M., Backman, K., Kleid, D., Flashman, S., Jeffrey, A. & Maurer, R. (1975). Cell, in the press.
Mangel, W. F. & Chamberlin, M. (1974a). J. Biol. Chem. 249, 3002-3006.
Mangel, W. F. & Chamberlin, M. (1974b). J. Biol. Chem. 249, 3007-3013.
Minkley, E. G. (1974). J. Mol. Biol. 83, 289-304.
Minkley, E. G. & Pribnow, D. (1973). J. Mol. Biol. 77, 255-277.
Pribnow, D. (1975). Proc. Nat. Acad. Sci., U.S.A. 72, 784-788.
Richardson, C. (1971). In Procedures in Nucleic Acids Research (Cantoni, G. L. & Davies, D. R., eds), vol. 2, part H, pp. 815-828, Harper & Row, New York.
Richardson, J. P. (1966). J. Mol. Biol. 21, 83-112.
Saucier, J.-M. & Wang, J. C.
(1972). Nature New Biol. 239, 167-170.
Schaller, H., Gray, C. & Herrmann, K. (1975). Proc. Nat. Acad. Sci., U.S.A. 72, 737-741.
Sekiya, T., Ormandt, H. & Khorana, G. H. (1975a). J. Biol. Chem. 250, 1087-1098.
Sekiya, T., Takeya, T. & Khorana, G. H. (1975b). Fed. Proc. Abstracts, 34 (no. 2205), 608.
Sentenac, A., Ruet, A. & Fromageot, P. (1968). FEBS Letters, 2, 53-56.
Silverstone, A. E., Arditti, R. R. & Magasanik, B. (1970). Proc. Nat. Acad. Sci., U.S.A. 66, 773-779.
Sippel, A. & Hartman, G. (1968). Biochim. Biophys. Acta, 157, 218-219.
Stead, N. W. & Jones, O. W. (1967). J. Mol. Biol. 26, 131-135.
Sugimoto, K., Okamoto, T., Sugisaki, H. & Takanami, M. (1975). Nature (London), 258, 410-414.
Summers, W. C. & Szybalski, W. (1968). Virology, 34, 9-16.
Tinoco, I., Jr, Borer, P. N., Dengler, B., Levine, M. D., Uhlenbeck, O. C., Crothers, D. M. & Gralla, J. (1973). Nature New Biol. 246, 40-41.
Walter, G., Zillig, W., Palm, P. & Fuchs, R. (1967). Eur. J. Biochem. 3, 194-201.
Walz, A. & Pirrotta, V. (1975). Nature (London), 254, 118-121.
Zillig, W., Zechel, K., Rabussay, D., Schachner, M., Sethi, V. S., Palm, P., Heil, A. & Seifert, W. (1970). Cold Spring Harbor Symp. Quant. Biol. 35, 47-58.