The “initiator” as a transcription control element


Cell, Vol. 57, 103-113, April 7, 1989, Copyright © 1989 by Cell Press



Stephen T. Smale and David Baltimore Whitehead Institute for Biomedical Research 9 Cambridge Center Cambridge, Massachusetts 02142 and Department of Biology Massachusetts Institute of Technology Cambridge, Massachusetts 02139

Summary

Transcription of the lymphocyte-specific terminal deoxynucleotidyltransferase gene begins at a single nucleotide, but no TATA box is present. We have identified a 17 bp element that is sufficient for accurate basal transcription of this gene both in vitro and in vivo. This motif, the initiator (Inr), contains within itself the transcription start site. Homology to the Inr is found in many TATA-containing genes, and specific mutagenesis influences both the efficiency and accuracy of initiation. Moreover, in the presence of either a TATA box or the SV40 21 bp repeats, a greatly increased level of transcription initiates specifically at the Inr. Thus, the Inr constitutes the simplest functional promoter that has been identified and provides one explanation for how promoters that lack TATA elements direct transcription initiation.

Introduction

The biochemical mechanisms by which RNA polymerase II initiates transcription of a specific gene and from a specific start site are poorly understood. The cis- and trans-acting components required for transcription initiation have been approached in detail for only one class of protein-coding genes: those that contain a TATA element. It is generally accepted that the simplest promoter includes a TATA box and a transcription start site, or only a TATA box (reviewed in Breathnach and Chambon, 1981; Buratowski et al., 1988). Transcription begins about 30 bp downstream of the TATA box, and specific mutagenesis of the TATA sequence results in a decreased initiation frequency (Wasylyk et al., 1980; Grosveld et al., 1982) and/or initiation from more heterogeneous sites (Grosschedl and Birnstiel, 1980; Grosveld et al., 1981). The sequences immediately surrounding and including transcription start sites also have been implicated in transcription initiation. A weak consensus sequence, 5’-PyPyCAPyPyPyPyPy-3’, was identified in several mammalian TATA-containing genes, with transcription beginning at the A (Corden et al., 1980).
In a few cases, mutations in this region resulted in the use of alternative start sites and/or a reduction in promoter strength (Corden et al., 1980; Talkington and Leder, 1982; Dierks et al., 1983; Concino et al., 1984; Tokunaga et al., 1984; K. A. Jones et al., 1988). Furthermore, accurate initiation occasionally was observed following specific mutagenesis of the TATA box (Hen et al., 1982; Dierks et al., 1983; K. A. Jones et al., 1988). In yeast, where transcription does not initiate at a strictly defined distance from the TATA element, sequences near the start site have been shown to be important for accurate initiation (Chen and Struhl, 1985; Hahn et al., 1985; McNeil and Smith, 1985; Nagawa and Fink, 1985). However, in both yeast and larger eukaryotes the function of start site sequences remains a mystery: start site sequences are not important for transcription of many genes (e.g., McKnight and Kingsbury, 1982; Myers et al., 1986), a discrete start site element has not been identified, and accurate initiation has not been observed in the absence of distal promoter elements.

The study of RNA polymerase II initiation mechanisms is complicated further by the observation that many cellular genes do not contain obvious TATA boxes. The promoters for these genes can be divided into two classes. GC-rich promoters, found primarily in housekeeping genes (for review see Sehgal et al., 1988), usually contain several transcription initiation sites spread over a fairly large region and several potential binding sites for the transcription factor Sp1 (Dynan and Tjian, 1983). The second class includes the remaining promoters, which have no apparent TATA element and are not GC rich. Unlike GC-rich promoters, many of these promoters are not constitutively active but rather are regulated during differentiation or development and initiate transcription at only one or a few tightly clustered start sites.
Included in this class are Drosophila homeotic genes (e.g., the Ultrabithorax [Biggin and Tjian, 1988], engrailed [Soeller et al., 1988], and Antennapedia [Perkins et al., 1988] genes) and genes that are regulated during mammalian immunodifferentiation (e.g., the terminal deoxynucleotidyltransferase gene [TdT; Landau et al., 1984], the T cell receptor β-chain genes [Anderson et al., 1988], the lck gene [Garvin et al., 1988], the λ5 gene [Kudo et al., 1987], and the VpreB gene [Kudo and Melchers, 1987]).

The murine TdT gene (Landau et al., 1984) is tightly regulated during differentiation of B and T lymphocytes. Expression of the TdT gene is limited to precursor B and T lymphocytes (pre-B and pre-T cells), where its protein product, a template-independent DNA polymerase (Kato et al., 1967), is believed to increase diversity of the immunoglobulin and T cell receptor repertoires by adding nucleotides to junctions formed during gene rearrangement (Landau et al., 1987). In beginning to analyze regulated expression of the TdT gene, we were interested in defining the promoter element that acts in the absence of a TATA box to direct transcription to a specific start site. We demonstrate here that accurate basal transcription is directed largely by a 17 bp element that includes the RNA start site. Moreover, we show that this specific transcription can be activated strongly by a TATA box or by a heterologous promoter element in the absence of a TATA box.

Figure 1. Transcription Start Site for the Endogenous and Transfected Murine TdT Gene

(C) 5’-CACCAGGGTG GTACCTATGG GTCTGCTGGT GAGAGGACAT CAGAGCCCTC ATTCTGGAGA CACCACCTGA TGGCACAGAC AGAGCTAGAC TGTCTGCTTC CATGGATCC-3’

(A) Primer extension analysis was performed with three different 32P-labeled oligonucleotide primers, hybridizing 67 (lanes 1-3), 201 (lanes 4-6), and 221 (lanes 7-9) nucleotides from the transcription start site. Thirty micrograms of total cytoplasmic RNA was tested from cell lines WEHI (lanes 1, 4, and 7), 40E4 (lanes 2, 5, and 8), and RLm11 (lanes 3, 6, and 9). Markers were pBR322 digested with HpaII.

(B) COS7 cells were transfected via a DEAE-dextran method, and, after 48 hr, total cytoplasmic RNA was isolated. Primer extension analysis with an oligonucleotide complementary to HSV-TK mRNA was performed with 30 µg of RNA from each transfection. Transfected DNAs for lanes 2-10 were pTdT(-12000)TK, pTdT(-5000)TK, pTdT(-3600)TK, pTdT(-1700)TK, pTdT(-111)TK, pTdT(-41)TK, pTdT(-6)TK, pTdT(-6)TK2, and pTK(vector), respectively. The mock transfection is in lane 1, and G+A sequence markers (lane M) were from pTdT(-1300/+59) labeled at the BamHI site (nucleotide +56) and then cleaved with SacI (-111). For lanes 11-14, 2 µg of pTdT(+34)TK DNA was cotransfected with an equimolar amount of pTdT(-5000)TK, pTdT(-1700)TK, pTdT(-111)TK, or pTdT(-41)TK DNA, respectively. The expected size of the test cDNAs was 84 nucleotides and that of the control DNA, 68 nucleotides. For lanes 15-17, pTdT(+33)TK, pTdT(+11)TK, and pTdT(17mer)TK were transfected into COS7 cells and assayed by primer extension analysis. The expected size of the cDNA from pTdT(+33)TK was 68 nucleotides, and that from pTdT(+11)TK and pTdT(17mer)TK was 42 nucleotides. The relative intensities of the 68 and 42 nucleotide cDNAs varied greatly between experiments (see Results).

(C) The DNA sequence surrounding the transcription initiation site (+1) for the murine TdT gene reveals no identifiable promoter elements. The translation initiation codon is underlined.

Results

TdT Transcription Initiation Site

Genomic DNA homologous to the 5’ end of a full-length murine TdT cDNA was isolated previously from a phage λ library (N. Landau and D. B., unpublished results). The initiation site for TdT transcription was determined by primer extension analysis (Figure 1A) with three different 32P-labeled oligonucleotide primers hybridizing near the 5’ end of the coding sequence. In each case, the longest cDNA product (87 [lanes 2 and 3], 201 [lanes 5 and 6], or 221 [lanes 8 and 9] nucleotides) suggested that transcription initiates at an A found 51 nucleotides from the ATG. These cDNAs were found only in pre-B (40E4; lanes 2, 5, and 8) and early T (RLm11; lanes 3, 6, and 9) cells, which express TdT protein, and not in mature B cells (WEHI 231; lanes 1, 4, and 7), which lack TdT. S1 nuclease analysis (Figure 2, lane 1) and RNAase mapping (not shown) confirmed the location of the start site. The heterogeneity of the S1 nuclease-resistant products is believed to be an artifact of the assay, based on the RNAase and primer extension results.

The DNA sequence surrounding the transcription initiation site, from -420 to +254 (the location of the first exon-intron junction), was determined by analyzing both DNA strands with a combination of dideoxy (Sanger et al., 1977) and chemical-modification (Maxam and Gilbert, 1980) techniques (S. T. S., N. Landau, and D. B., unpublished results). The sequence upstream of the transcription initiation site revealed that no TATA element or AT-rich element was found 30 bp from the start site (Figure 1C). Also, no binding sites for common transcription factors, such as Sp1, CAAT factors, or NF-A (octamer binding protein), were present (for review see N. C. Jones et al., 1988), and the sequence was not GC rich. The only putative control element surrounded the transcription start site and matched a weak consensus, 5’-PyPyCAPyPyPyPyPy-3’, surrounding the start site of several TATA-containing

Figure 2. Mutational Analysis of the TdT Promoter in Nuclear Extracts Derived from HeLa Cells

HeLa cell extracts were prepared and in vitro transcription reactions performed as described in Experimental Procedures. S1 nuclease analysis (lanes 1-16) or primer extension analysis (lanes 19-27) was performed following in vitro transcription of TdT promoter mutant constructs. A deletion series from the 5’ end to nucleotide -6 (lanes 2-13) was compared by S1 nuclease analysis to endogenous TdT RNA from 40E4 cells (lane 1). The deletion endpoint is indicated above each lane. Bands larger than 56 bp correspond to nonspecific transcripts detected where the S1 nuclease probe diverges from each deletion construct. Internal deletions (lanes 16-16) also were analyzed by S1 nuclease analysis, and the two internal deletion endpoints are indicated above each lane. 5’ deletions to nucleotides -6, -3, and +15 (lanes 21-23) and 3’ deletions to nucleotides +59, +33, and +11 (lanes 25-27) were analyzed by primer extension analysis. The expected cDNA sizes are 113 (lanes 20-25), 97 (lane 26), and 71 (lane 27) nucleotides. Markers (lanes M) are as in Figure 1B or are HpaII-digested pBR322. Internal controls were not included because of technical difficulties arising from the weak signals (about one transcript per 20,000 template molecules in lane 3). The mutant constructs and approximate transcription percentages relative to pTdT(-1300/+59) are depicted at the bottom. However, because autoradiography was not performed to ensure linear signals and because internal controls were not included, the transcription percentages are not precise. Also, the number provided for the 5’ deletion to +15 assumes that the low level of heterogeneous transcripts is insignificant.

genes (Corden et al., 1980). However, the function of this element was believed to be dependent on a functional TATA box. Thus, although TdT transcription clearly initiates at a single nucleotide 51 nucleotides from the ATG, its putative promoter is unique.
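As a rough illustration of what matching the weak consensus means, the sketch below scores the TdT start-site region against 5’-PyPyCAPyPyPyPyPy-3’, aligning the consensus A with the +1 A. The 17 bp sequence is the TdT Inr (-6 to +11) as given in this paper; the one-letter encoding ("Y" for Py), the function names, and the scoring by simple position counting are assumptions made for illustration, not the authors' procedure.

```python
# Sketch: score a start-site region against 5'-PyPyCAPyPyPyPyPy-3'.
# Encoding and names are illustrative assumptions, not from the paper.

PYRIMIDINES = {"C", "T"}

# One symbol per consensus position; "Y" = Py. Index 3 is the start-site A.
CONSENSUS = "YYCAYYYYY"
CONSENSUS_A_INDEX = 3

# TdT Inr, nucleotides -6 to +11; the +1 A is at 0-based index 6.
TDT_INR = "GCCCTCATTCTGGAGAC"
TDT_A_INDEX = 6


def symbol_matches(symbol: str, base: str) -> bool:
    """Return True if a base satisfies one consensus symbol."""
    if symbol == "Y":
        return base in PYRIMIDINES
    return symbol == base


def score_consensus(seq: str, a_index: int):
    """Align the consensus A with seq[a_index]; return (matches, window)."""
    left = a_index - CONSENSUS_A_INDEX
    if left < 0 or left + len(CONSENSUS) > len(seq):
        raise ValueError("sequence does not cover the consensus window")
    window = seq[left:left + len(CONSENSUS)]
    hits = sum(symbol_matches(s, b) for s, b in zip(CONSENSUS, window))
    return hits, window


if __name__ == "__main__":
    hits, window = score_consensus(TDT_INR, TDT_A_INDEX)
    print(f"{window}: {hits} of {len(CONSENSUS)} positions match")
```

Under these assumptions the window spanning -3 to +6 is CTCATTCTG, which agrees with the consensus at 8 of 9 positions; the G at +6 is a purine.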

Transient Transfection into Nonlymphoid (TdT-Lacking) Cell Lines

To approach in detail the initiation mechanism for TdT transcription, both basal and regulated control elements must be analyzed. Basal promoter elements were defined as those that would activate specific TdT transcription in transfected nonlymphoid cell lines like COS7 and 3T3 (which do not express the endogenous TdT gene). In contrast, regulated control elements exist that act only in lymphoid cell lines (unpublished data). To identify basal promoter elements, a plasmid, pTdT(-12,000)TK, was constructed that contains TdT sequences extending from 12,000 bp upstream of the transcription start site (-12 kb) to the BamHI site at the ATG (+58 bp). This fragment was fused to herpes simplex virus thymidine kinase (HSV-TK) coding sequences in a vector that can replicate to high copy number in mouse cells and in COS7 monkey kidney cells (Gluzman, 1981). A minimal SV40 replication origin allowed replication in COS7 cells, which contain an integrated copy of the SV40 large T antigen gene. In addition, a complete polyoma virus early region and replication origin allowed plasmid replication following transfection into any mouse cell line.

COS7 cells were transfected with pTdT(-12,000)TK. After 48 hr, total cytoplasmic RNA was prepared and primer extension analysis was performed with a 32P-labeled primer that hybridizes to HSV-TK sequences. A single cDNA product of 84 nucleotides was detected (Figure 1B, lane 2), demonstrating highly specific transcription initiating at the start site expected for TdT transcription. In COS7 cells, transcription directed by the TdT promoter was much weaker than transcription directed by the HSV-TK and SV40 promoters inserted in the same vector (data not shown).

DNA sequences upstream of the transcription start site were deleted sequentially with either restriction endonucleases or BAL31 to define the sequence elements responsible for basal transcription initiation. Surprisingly, analysis of RNA from transfected COS7 cells revealed that sequences upstream of nucleotide -6 had little effect on the initiation frequency at the authentic TdT start site (Figure 1B, lanes 1-10).
When a plasmid containing TdT sequences from -1300 to +34 was included in equimolar amounts as an internal standard, the initiation frequency from the 5’ deletion mutants was found to vary by only 2-fold (Figure 1B, lanes 11-14). Similar results were obtained with 3T3 cells, but transcription in TdT-expressing RLm11 T cells was stimulated strongly by upstream sequences, revealing regulated promoter elements (unpublished data). Southern blot analysis of DpnI-digested low molecular weight DNA from transfected cells demonstrated that an equivalent amount of DNA was replicated from each plasmid except for pTdT(-12,000)TK, which replicated about 5-fold less efficiently (data not shown).

The only significant effect of removing upstream TdT sequences in nonlymphoid cells was that the level of improperly initiated transcripts increased. Because the background transcription level increased gradually rather than suddenly, we believe that movement of pBR322, SV40, or polyoma virus sequences closer to the region of interest is responsible. To eliminate the possibility that a fortuitous TATA-like element in the pBR322 DNA (which is not apparent in the sequence) was abutted to the TdT sequences in the deletion constructs, a 240 bp SV40 fragment (BamHI-BclI) was inserted upstream of the start site in pTdT(-6)TK. This new plasmid, pTdT(-6)TK2, directed specific transcription at a level similar to pTdT(-6)TK (Figure 1B, lanes 8 and 9).

The 5’ deletion analysis suggests that TdT sequences from -6 to +58 are sufficient to direct specific basal transcription in COS7 cells. Two deletion constructs from the 3’ side, pTdT(+34)TK and pTdT(+11)TK, both of which contained TdT sequences to -1300, also exhibited accurate initiation (Figure 1B, lanes 15 and 16). Finally, a double-stranded oligonucleotide that contained 17 bp of TdT sequence (from nucleotide -6 to +11) was inserted into the TK vector. This plasmid, pTdT(17mer)TK, also directed TK transcription that initiated at the authentic TdT start site (Figure 1B, lane 17). However, the 42 nucleotide cDNA product (lanes 16 and 17) cannot be directly compared with the longer cDNA products because the relative intensities of cDNA products of different lengths varied from experiment to experiment. This variation may have been due to an inconsistent extension efficiency by reverse transcriptase.

These results suggest that 17 nucleotides surrounding the transcription start site contain the information necessary to direct transcription initiation. However, because transcription in vivo is known to be responsive to distant enhancer elements, and because the SV40 and polyoma virus sequences in the TK vector contain transcriptional control elements, we could not determine with these experiments whether the 17-mer is sufficient to direct basal TdT transcription independently or was determining the start site for transcription that was activated by distant control elements.

TdT Transcription in Nuclear Extracts from HeLa Cells

In vitro transcription experiments in nuclear extracts derived from HeLa cells (which do not express TdT) were employed to analyze the basal TdT promoter further. Transcription in vitro is advantageous for these studies because it does not respond to distant enhancer elements.
Also, it allowed us to mutate the basal TdT promoter extensively and to remove it from its surrounding eukaryotic sequences without regard for proper mRNA polyadenylation, stability, and transport.

TdT sequences from -1300 to +59, pTdT(-1300/+59), were inserted into pUC19. By standard in vitro transcription procedures (see Experimental Procedures), this plasmid was found to direct accurate and specific TdT transcription initiation (Figure 2). The pattern of bands observed by S1 nuclease analysis of synthesized RNA (Figure 2, lane 3) was nearly identical to that observed with RNA isolated from TdT-expressing T cells (lane 1) and was sensitive to 2 µg/ml α-amanitin (not shown). Moreover, primer extension analysis of RNA synthesized from pTdT(-1300/+59) revealed the same specific transcription (Figure 2, lane 20), and linear and supercoiled templates were equally active (not shown).

To define precisely the DNA sequences required for basal transcription in vitro, extensive mutagenesis was performed. Beginning with pTdT(-1300/+59), 5’ and 3’ deletion mutants were generated with restriction endonucleases and BAL31. Internal deletion mutants were also constructed, removing variable sequences 5’ to the start site without affecting sequences further upstream. Each of these mutants was tested for specific transcription activity in nuclear extracts, using S1 nuclease and/or primer extension analyses (Figure 2). A decrease in initiation frequency of approximately 3-fold was consistently observed when sequences were removed upstream of nucleotide -45. However, even a 5’ deletion to nucleotide -3 (Figure 2, lane 22) retained levels of specific transcription that were about 30% of wild-type levels. Deletion to +15 left only what appear to be background bands. From the 3’ end, a deletion from +59 to +34 strongly reduced the level of RNA (Figure 2, lane 26), either by reducing the rate of transcription initiation or by destabilizing the RNA transcripts. Surprisingly, however, accurate initiation was still observed upon deletion to nucleotide +11 from the 3’ side (Figure 2, lane 27).

Figure 3. Specific Transcription In Vitro with a 17 Nucleotide Sequence Element

TdT17mer (SacI site, TdT -6 to +11, BamHI site): 5’-GAGCTC GCCCTCATTCTGGAGAC GGATCC-3’
TATA/17mer (AdML TATA, SacI site, TdT -6 to +11, BamHI site): 5’-CGGGCTATAAAAGGGGGTGGGGG GAGCTC GCCCTCATTCTGGAGAC GGATCC-3’

(A) 5’ deletion mutants with the 3’ end fixed at nucleotide +11 were analyzed by in vitro transcription followed by S1 nuclease analysis using a probe prepared from pTdT(-1300/+11) (see Experimental Procedures). Plasmids with 5’ ends at -1300, -111, and -6 were transcribed with equal efficiencies (lanes 2-4). The intense band in lane 4 corresponds to nonspecific transcripts detected where the S1 nuclease probe diverges from the template. Markers (lanes M) are G+A and C+T reactions as described in Figure 1B.

(B) TdT17mer oligonucleotide was inserted into pUC18 (lanes 1-4), pUC19 (lanes 5-8), pSP72 (lanes 9-12), and pTdT-TK (lanes 13-16). In vitro reactions were performed either with no template (lanes 1, 5, 9, and 13), plasmid containing TdT17mer (lanes 2, 3, 6, 7, 10, 11, 15, and 16), plasmid containing TdT17mer with an upstream TATA box (lanes 4, 8, and 12), or pTdT-TK vector alone (lane 14). Plasmids pTdT(17mer)TK and pTdT(17mer)TK2 (lanes 15 and 16) were described earlier (see Results). α-Amanitin (2 µg/ml) was included in some reactions (lanes 3, 7, and 11). RNA synthesized in vitro was analyzed by primer extension using commercial oligonucleotides (see Experimental Procedures). Sizes of expected products are indicated. (Bottom) Sequences of TdT17mer and TATA oligonucleotides with flanking restriction sites are indicated.

To define further the sequence requirements for accurate initiation, nucleotide +11 was maintained as the 3’ end of the promoter as 5’ sequences were removed (Figure 3A). S1 nuclease analysis with a probe prepared from pTdT(-1300/+11) revealed that the initiation frequency did not diminish significantly upon deleting first to nucleotide -111 (Figure 3A, lane 3) and then to nucleotide -6 (lane 4). The latter construct, containing TdT sequences only from nucleotide -6 to +11, was derived by inserting into pUC19 a double-stranded oligonucleotide (TdT17mer), with a SacI site at the 5’ end and a BamHI site at the 3’ end (pTdT17mer(pUC19)).

To determine whether pUC19 sequences adjacent to the start site were required for detection of a specific signal, TdT17mer was inserted into several different vectors, including pGEM-3, pSP72, pSP73, pZf3- (Promega), pUC18, and derivatives of these vectors. Also tested were pTdT(17mer)TK and pTdT(17mer)TK2 (see Figure 1), which fused the BamHI site at +11 directly to the HSV-TK BglII site, eliminating all polylinker sequences. The results from this analysis (partially shown in Figure 3B, lanes 2, 6, 10, 15, and 16) revealed that the frequency of initiation from the TdT start was nearly identical in all vectors. (The initiation frequency in all but the TK vectors was compared directly using a common S1 nuclease probe containing only polylinker and 17-mer sequences as in Figure 3A [data not shown].)

In Figures 2 through 4, the difference between significant, specific transcripts and bacterial vector transcripts is not easily apparent. Many of the background bands observed by primer extension analysis are artifactual and do not correlate with S1 nuclease experiments (data not shown). However, some of the signals appear to correspond to actual transcription initiation sites that originate in or near vector sequences. This background transcription is likely to be activated by cryptic binding sites for sequence-specific DNA binding proteins. The bacterial pBR322 and pUC sequences may have a greater distribution of cryptic sequences because they have not been subjected to selective pressure against recognition sequences for mammalian proteins. In contrast, the mammalian DNA may be devoid of cryptic binding sequences to facilitate the precise regulation of gene expression.

As another means of determining whether surrounding sequences influenced transcription from TdT(17mer), a second double-stranded oligonucleotide was inserted upstream of TdT(17mer) in several of the vectors. This oligonucleotide placed the adenovirus major late (AdML) promoter TATA box and surrounding sequences about 30 nucleotides from the TdT start site. It was expected that if cryptic TATA-like elements were influencing TdT transcription, then replacement of the cryptic element with the AdML TATA would have little effect or its effect would vary from vector to vector. The results demonstrate that the TATA element strongly stimulated transcription from the 17-mer in each vector to a similar extent (Figure 3B, lanes 4, 8, and 12). Moreover, when the TATA oligonucleotide was placed in vectors that lack TdT(17mer), only low levels of transcription were observed initiating from one or more start sites located about 30 bp from TATA (data not shown).

We conclude that 17 nucleotides surrounding the TdT start site are sufficient to independently direct specific initiation of transcription in vitro. As proposed by Chen and Struhl (1985), we will call this sequence the initiator element (Inr) because it overlaps the transcription initiation site.

Figure 4. Comparison of the TdT Inr with the AdML Promoter Inr

Oligonucleotides corresponding to the TdT, AdML, and TdT/AdML Inrs were inserted into pSP72 as described (Experimental Procedures) with or without the upstream TATA oligonucleotide. Oligonucleotides contained in the vectors are described above each lane, and some reactions were performed in the absence (lanes 1, 3, and 5) or presence (lanes 2, 4, and 6) of 2 µg/ml α-amanitin. In vitro transcription reactions were performed and analyzed by primer extension analysis using a commercial SP6 sequence oligonucleotide (Promega). The arrow indicates the size of the expected product (79 nucleotides), and the markers (lane M) are HpaII-digested pBR322. In the sequences below, boldface letters indicate nucleotides that differ from the TdT Inr sequence.

TdT: 5’-GCCCTCATTCTGGAGAC-3’
AdML: 5’-GTCCTCACTCTCTTCCG-3’
AdML/TdT: 5’-GTCCTCACTCTGGAGAC-3’
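The Inr sequences listed in Figure 4 can be compared position by position. The following sketch assumes the three 17-mers read as shown in its constants and uses hypothetical function names; it reports where each oligonucleotide differs from the TdT Inr, using promoter coordinates that run from -6 to +11 with no position 0.

```python
# Sketch: list where two 17 bp Inr oligonucleotides differ from the TdT Inr.
# Sequences are as read from Figure 4; function names are hypothetical.

TDT_INR = "GCCCTCATTCTGGAGAC"       # TdT, -6 to +11
ADML_INR = "GTCCTCACTCTCTTCCG"      # AdML, -6 to +11
ADML_TDT_INR = "GTCCTCACTCTGGAGAC"  # AdML -6 to +5 joined to TdT +6 to +11

FIRST_POSITION = -6  # coordinate of the first nucleotide in each 17-mer


def promoter_position(index: int) -> int:
    """Convert a 0-based index to a promoter coordinate (no position 0)."""
    pos = FIRST_POSITION + index
    return pos if pos < 0 else pos + 1


def substitutions(reference: str, other: str) -> list:
    """Return promoter coordinates at which two equal-length sequences differ."""
    if len(reference) != len(other):
        raise ValueError("sequences must have the same length")
    return [promoter_position(i)
            for i, (r, o) in enumerate(zip(reference, other)) if r != o]


def matches_in_range(reference: str, other: str, first: int, last: int) -> int:
    """Count identical positions with first <= coordinate <= last."""
    return sum(1 for i, (r, o) in enumerate(zip(reference, other))
               if first <= promoter_position(i) <= last and r == o)


if __name__ == "__main__":
    print("AdML vs TdT:", substitutions(TDT_INR, ADML_INR))
    print("AdML/TdT vs TdT:", substitutions(TDT_INR, ADML_TDT_INR))
    print("Matches from -6 to +5:", matches_in_range(TDT_INR, ADML_INR, -6, 5))
```

With these sequences, AdML differs from TdT at -5, +2, and +6 through +11 (eight positions), the AdML/TdT hybrid differs only at -5 and +2, and 9 of the 11 positions from -6 to +5 are identical between AdML and TdT.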
Comparison with the AdML Inr and Mutagenesis of the TdT Inr

The 17 bp TdT Inr is homologous to a weak consensus sequence that surrounds the transcription start sites for several TATA-containing promoters (Corden et al., 1980). In fact, the widely studied AdML promoter has a 9 of 11 nucleotide match to the TdT Inr between nucleotides -6 and +5, but nucleotides +6 to +11 are completely different (Figure 4). No specific transcription has been reported in vitro from the AdML start site in the absence of a TATA box (Hen et al., 1982), although specific transcription was observed in vivo in the presence of only the start site element and a distant enhancer (Hen et al., 1982).

To compare the activities of the TdT and AdML Inrs, two double-stranded oligonucleotides were synthesized and inserted into the SacI and BamHI sites of pUC18 and pSP72 (Promega). The first, AdMLInr, contained AdML sequences from nucleotides -6 to +11, and the second, AdML/TdTInr, contained AdML sequences from -6 to +5 and TdT sequences from +6 to +11. Thus, AdMLInr contains eight base substitutions (at -5, +2, and +6 through +11), and AdML/TdTInr contains two (at -5 and +2), relative to TdT17mer (Figure 4). In pSP72 (Figure 4) and pUC18 (data not shown), both AdML/TdTInr and AdMLInr were only slightly less active and exhibited slightly more background than the TdTInr. Also, the AdML TATA oligonucleotide stimulated transcription from each similarly (Figure 4, lanes 7-9).

To define further the functional sequences within the Inr, nucleotide substitution mutants were generated (Figure 5). A single-base mutation at nucleotide -5 had no effect on the initiation frequency (Figure 5, lane 3), consistent with the above results with the AdML Inr. In contrast, a single-base substitution at nucleotide -4 slightly decreased initiation (lane 4), and a double-base substitution at nucleotides -3 and -2 strongly decreased the initiation frequency (lane 5). Interestingly, single-base substitutions at nucleotides -2 or +7 did not strongly affect the total initiation frequency from within the Inr (Figure 5, lanes 6 and 7), but more heterogeneous start sites were observed. These results demonstrate that efficient recognition of the TdT Inr requires sequences other than the start site itself.

Figure 5. In Vitro Transcription of Inr Substitution Mutants

Single- and double-base substitution mutants were generated in the TdT Inr (see Experimental Procedures) and were tested by in vitro transcription and primer extension analysis. The expected cDNA product of 71 nucleotides (in the pZf3- vector [Promega]) is indicated by an arrow, and the mutated nucleotides are indicated above each lane. At the bottom, the sequences of the wild-type and mutant oligonucleotides are shown, with the mutant nucleotides underlined. An approximate percent transcription as determined by densitometry of four independent experiments with three different plasmid preparations is also indicated. However, as with Figure 2, these numbers are only approximate because autoradiography was performed under nonlinear conditions.

Wild-type TdT Inr: 5’-GCCCTCATTCTGGAGAC-3’

Mutated position(s)    Approx. % Trxn.
TdT (wild type)        100
-5                     100
-4                     80
-3, -2                 20
-2                     70
+7                     70

Activation of Transcription from the TdT Inr by the SV40 21 bp Repeats

We have demonstrated that the TdT Inr is independently functional and is functionally similar to the Inr from the TATA-containing AdML promoter. However, the requirements for activation of transcription above the basal level need to be addressed. Because the TdT Inr is similar to the AdML Inr, it is possible that a TATA analog is needed within the TdT promoter to activate transcription further. Alternatively, upstream promoter elements may be capable of interacting directly with the Inr. To distinguish between these possibilities, an SV40 DNA fragment containing a strong upstream promoter element, the 21 bp repeats (which bind to transcription factor Sp1 [Dynan and Tjian, 1983]), and a portion of the enhancer (which binds to transcription factor AP-1 [Lee et al., 1987]) was inserted upstream of the Inr and tested in vitro. Surprisingly, the SV40 sequences strongly and specifically activated transcription from the Inr, even though no TATA element was present (Figure 6A).

In addition, specific stimulation was strongly position dependent. When the first GC box was positioned 24 bp upstream of the Inr (pSp1-24-Inr), transcription was weakly activated (Figure 6A, lane 2) from highly heterogeneous start sites. When it was 30 bp upstream (pSp1-30-Inr), strong transcription was observed from the initiator and a few additional downstream sites (lane 3). When it was placed either 42 (pSp1-42-Inr) or 50 bp (pSp1-50-Inr) from the start site, virtually all of the transcription began in the Inr, from one or two nucleotides (Figure 6A, lanes 4 and 5). When it was farther from the Inr (66 bp), transcription again decreased and multiple upstream initiation sites appeared (lane 6). When the 21 bp repeats were inverted (pRev-Sp1-Inr) such that the GC boxes were 87 nucleotides from the Inr, strong stimulation from the Inr was still found with some upstream initiation (Figure 6A, lane 7). In this construct, however, the AP-1 binding site in the SV40 fragment was located between the 21 bp repeats and the Inr.
Several observations rule out the possibility that a cryptic TATA box was located 30 bp upstream of the Inr. First, in pSp1-30-Inr, where the GC boxes (and no Ts or As) were located at -30, reasonably strong transcription was activated from the Inr. Second, although an ATAT was located at -27 in pSp1-42-Inr, this sequence (part of an EcoRV site that is not believed to function as a TATA box) was split in two by the BglII linker insertion that created pSp1-50-Inr. Finally, when the SV40 sequences were placed in pSP72 lacking the Inr (which is replaced by the polylinker KpnI and SmaI sites), only a very low level of heterogeneous transcription was observed (Figure 6A, lane 8).

Comparison of pSp1-42-Inr, TATA/Inr, and the intact SV40 early promoter reveals that, in vitro, pSp1-42-Inr contains a highly active promoter (Figure 6B). The pSp1-42-Inr promoter (Figure 6B, lane 5) is about 20-fold stronger than the intact SV40 early promoter (lane 1) and about 10-fold stronger than TATA/Inr (lanes 3 and 4). In other words, replacement of the authentic SV40 TATA box and start site region (which lacks an Inr) with the TdT Inr results in the creation of a much stronger in vitro promoter. Moreover, the SV40 21 bp repeats activate Inr transcription in vitro to a greater extent than does the AdML TATA box. These results clearly demonstrate that, in the absence of a TATA box and in a position-dependent manner, the Inr can act in concert with the SV40 21 bp repeats to direct very high levels of transcription from a single start site.
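The two fold-differences quoted above also imply a third comparison that is not stated directly. The short sketch below works through that arithmetic; it treats the approximate values ("about 20-fold" and "about 10-fold") as exact ratios purely for illustration, and the function and constant names are hypothetical.

```python
from fractions import Fraction

# Approximate in vitro fold differences quoted in the text (Figure 6B),
# treated here as exact ratios only for the sake of the arithmetic.
SP1_INR_VS_SV40_EARLY = Fraction(20)  # pSp1-42-Inr / intact SV40 early promoter
SP1_INR_VS_TATA_INR = Fraction(10)    # pSp1-42-Inr / TATA/Inr


def implied_fold(a_over_c: Fraction, a_over_b: Fraction) -> Fraction:
    """Given the ratios A/C and A/B, return the implied ratio B/C."""
    return a_over_c / a_over_b


def relative_strengths() -> dict:
    """Express the three promoters relative to the SV40 early promoter (= 1)."""
    return {
        "SV40 early": Fraction(1),
        "TATA/Inr": implied_fold(SP1_INR_VS_SV40_EARLY, SP1_INR_VS_TATA_INR),
        "pSp1-42-Inr": SP1_INR_VS_SV40_EARLY,
    }


if __name__ == "__main__":
    for name, strength in relative_strengths().items():
        print(f"{name}: {strength}x")
```

Taken at face value, the quoted ratios would place TATA/Inr at roughly twice the strength of the intact SV40 early promoter in vitro, although the underlying measurements are approximate.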


Figure 6. Activation of Inr Transcription by the SV40 21 bp Repeats In Vitro

(A) A DNA fragment containing the SV40 21 bp repeats was inserted at variable distances upstream of the TdT Inr in pSP72 (see Experimental Procedures). Distances between the Inr start site and the first GC box were 24 (lane 2), 30 (lane 3), 42 (lane 4), 50 (lane 5), and 66 (lane 6) nucleotides. The fragment containing the 21 bp repeats was also reversed (lane 7), placing the closest GC box 67 bp from the Inr, but with an AP-1 binding site between the 21 bp repeats and the Inr. Finally, the 21 bp repeats were placed in the same location as in lane 7, except that the 17 bp Inr was absent and was replaced by 9 bp of pSP72 sequence (lane 8). Primer extension analysis of in vitro reactions was performed as in Figure 2 (with an SP6 promoter primer), and a control reaction was performed with the TATA/Inr construct (lane 1). The 79 nucleotide expected product is indicated.

(B) The in vitro promoter strengths of the intact SV40 early promoter (lane 1), TdT17mer (lane 2), TATA/Inr (in pUC18, lane 3; in pSP72, lane 4), and pSp1-42-Inr (lane 5) were compared. Primer extension analysis of in vitro reactions was as in Figure 2, with either a -40 sequencing primer (lanes 1-3) or an SP6 promoter primer (lanes 4 and 5). The TATA/Inr reactions (lanes 3 and 4) reveal that the two primers result in similar signals and are directly comparable. The expected cDNA products are 107 (lane 1), 67 (lanes 2 and 3), and 79 (lanes 4 and 5) nucleotides.

Discussion

We have identified a new type of mammalian RNA polymerase II promoter element, the Inr (initiator). By itself, the Inr directs a low level of transcription initiation from a single internal nucleotide position. Specific mutagenesis of the Inr can alter both the efficiency and accuracy of initiation, and although Inr transcription can be activated by a traditional TATA box, activation is not limited to or dependent on TATA. In the absence of a TATA box and in a position-dependent manner, the SV40 early promoter elements (21 bp repeats) activate transcription strongly and specifically from the Inr. Thus, the Inr is a discrete promoter element that can act alone or in concert with either a TATA box or upstream element to direct specific transcription initiation.

These results clarify earlier observations with genes containing TATA boxes. They establish that the start site consensus sequence (Corden et al., 1980) represents an independent recognition site for a component of the transcription machinery. They also suggest that previous start site mutations that inhibited transcription initiation probably disrupted this recognition event (Corden et al., 1980; Talkington and Leder, 1982; Dierks et al., 1983; Concino et al., 1984; Tokunaga et al., 1984; K. A. Jones et al., 1988). Furthermore, it now is not surprising that TATA mutations did not always alter RNA start sites (Hen et al., 1982; Dierks et al., 1983) because, in our experiments, upstream elements directly activated Inr-mediated transcription. Finally, yeast transcription, which allows a more flexible positioning of TATA (Struhl, 1987), may require an analogous element for start site placement.

We do not know if the Inr is unique or if there are multiple types of initiator elements. The TdT Inr exhibits little homology to the start site region for the SV40 major late promoter, which accurately initiates transcription in the absence of its weak TATA box or TATA analog (Nandi et al., 1985; Ayer and Dynan, 1988). In contrast, the TdT Inr is homologous to the start site regions for three mammalian promoters that initiated accurate transcription with an inactivated TATA box, but with active distal control elements: the AdML (Hen et al., 1982), rabbit β-globin (Dierks et al., 1983), and human immunodeficiency virus type 1 (K. A. Jones et al., 1988) promoters. In addition to the TdT Inr, the highly homologous AdML Inr was shown here to be independently active in vitro. This activity was not detected in previous studies, possibly because the initiation frequency was very low or because background transcription was high.

The direct stimulation of accurate transcription by an upstream activator suggests that Sp1, which interacts with the SV40 21 bp repeats (Dynan and Tjian, 1983), and TFIID, which interacts with TATA (Sawadogo and Roeder, 1985), are similar. In the presence of only their respective binding sites and an Inr element, both can direct high levels of accurate transcription initiation. However, to activate transcription specifically from an Inr, TFIID must be strictly positioned about 30 nucleotides away (data not shown). The position of Sp1 is more flexible, although a distance of 40 to 50 bp appears to be optimal. These differences become more dramatic in the absence of an Inr: TFIID continues to activate transcription from one or more tightly clustered start sites about 30 nucleotides away (data not shown), but Sp1 activates transcription from a wide variety of start sites.

The proteins involved in Inr-mediated transcription remain to be defined. This reaction may require only a subset of the proteins that have been found to reconstitute accurate initiation from the AdML TATA and Inr (Matsui et al., 1980; Samuels et al., 1982). Thus, the Inr could interact with TFIID, RNA polymerase II, or another component of the general transcription machinery. DNAase I footprinting experiments have demonstrated that, when bound to the AdML TATA box, purified TFIID protects the AdML Inr from DNAase digestion (Sawadogo and Roeder, 1985). However, studies by us (unpublished results) and others have not detected independent recognition of the TdT or AdML Inrs. Another issue is how Sp1 activates transcription from the Inr in the absence of the TATA element. Perhaps TFIID still mediates this reaction, but, equally likely, Sp1 may activate transcription through a direct interaction with another component of the general transcription apparatus.

For cellular genes that lack TATA boxes and are not GC rich, the mechanisms by which adjacent promoter elements activate transcription from an Inr (or similar start site element) may be fundamentally similar. Deletion analyses of the TdT, Ultrabithorax (Biggin et al., 1988), engrailed (Soeller et al., 1988), and Antennapedia (Perkins et al., 1988) promoters reveal that in all four cases, basal transcription in vitro is influenced by sequences near the transcription start site and within the transcribed leader (at about +40 bp).
In addition, regulated transcription of these genes depends largely on promoter sequences, and DNAase I footprinting analyses reveal that they all contain binding sites for an unusually large number of sequence-specific DNA binding proteins, both immediately upstream and downstream of the start site (Biggin et al., 1988; Soeller et al., 1988; S. T. S. and D. B., unpublished results). Possibly, the unique structure of these promoters is necessary for directing transcription of a subset of genes that require strict activation and inactivation during cellular differentiation and development.

Experimental Procedures

Plasmid DNAs

Subcloning and plasmid DNA manipulations were performed as described (Maniatis et al., 1982). pTdT(-12,000)TK was derived from pSVTK3 (Smale and Tjian, 1985), which contains a minimal SV40 replication origin downstream of promoter-lacking HSVTK coding sequences. A 3.5 kb BamHI fragment containing the polyomavirus early region (Grosschedl and Baltimore, 1985) was inserted at a BamHI site at the 5′ end of the HSV-TK gene. TdT sequences from -12,000 to +58 were reconstructed between the ClaI and BglII sites of pSVTK3, just upstream of the TK coding sequence. pTdT(-1300/+59) contains a 1359 bp EcoRI (-1300)-BamHI (+59) fragment in pUC19. Deletion mutagenesis of this plasmid was by standard procedures (Maniatis et al., 1982) to produce the plasmids described in the text. Oligonucleotides and SV40 fragments were inserted into Promega vectors pGEM-3, pGEM-4, pSP72, pSP73, and pZf3-, and additional vectors as described in the text. Plasmids containing oligonucleotide inserts were confirmed by DNA sequence analysis (Sanger et al., 1977; Maxam and Gilbert, 1980).

The SV40 21 bp early promoter was excised from SV40 DNA with HindIII (nucleotide 5173) and SphI (nucleotide 130) and inserted into pUC18 to produce pSV40 Early. EcoRI, ClaI, or BglII linkers (Collaborative Research) were then added at the SV40 NcoI site (nucleotide 39) adjacent to the 21 bp repeats. The GC boxes were then excised with EcoRI and BamHI, ClaI and BamHI, or BglII and BamHI, and inserted into appropriately digested pTdT17mer(SP72) to produce pSp1-24-Inr, pSp1-30-Inr, and pSp1-42-Inr (and pRev-Sp1-Inr), respectively. pSp1-50-Inr and pSp1-66-Inr were produced by inserting, respectively, one or three BglII linkers at the EcoRV site of pSp1-42-Inr. pSp1SP72 was produced by inserting the 21 bp repeats at the BglII site of pSP72.

Oligonucleotides were synthesized by S. Blackman (Whitehead Institute) with the following sequences: TdT17mer, 5′-GATCCGTCTCCAGAATGAGGGCCGAGCT-3′ and its complement to produce BamHI and SacI ends; AdML17mer, 5′-GATCCCGGAAGAGAGTGAGGACCGAGCT-3′ and its complement; AdML/TdT17mer, 5′-GATCCGTCTCCAGAGTGAGGACCGAGCT-3′ and its complement; TATA, 5′-AATTCGGGCTATAAAAGGGGGTGGGGGGAGCT-3′ and its complement to produce SacI and EcoRI ends. Inr substitution mutants were generated by having the two strands of TdT17mer oligonucleotide synthesized again, this time adding 4% random nucleotides to each position. The strands were then annealed and ligated into pZf3 (Promega), and the ligation mix was used to transform bacteria. Bacterial colonies were pooled and lysed, and bacteria were retransformed with the resulting DNA. DNA was prepared from individual colonies and sequenced.
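The three 17-mer oligonucleotides listed above can be compared position by position. The following is a minimal sketch, not part of the original methods: it assumes that the first 5 and last 6 nucleotides of each 28 nt strand are linker sequence for the BamHI and SacI ends, leaving a 17 nt core. That split, and the helper names `core`, `reverse_complement`, and `mismatches`, are illustrative assumptions.

```python
# Top strands exactly as listed in the text (5' -> 3').
OLIGOS = {
    "TdT17mer": "GATCCGTCTCCAGAATGAGGGCCGAGCT",
    "AdML17mer": "GATCCCGGAAGAGAGTGAGGACCGAGCT",
    "AdML/TdT17mer": "GATCCGTCTCCAGAGTGAGGACCGAGCT",
}

# Assumption (not stated in the text): 5 nt of BamHI-end linker on the left
# and 6 nt of SacI-end linker on the right flank a 17 nt core.
LEFT_FLANK = 5
RIGHT_FLANK = 6

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def core(oligo: str) -> str:
    """Return the oligo with the assumed linker flanks removed."""
    return oligo[LEFT_FLANK:len(oligo) - RIGHT_FLANK]


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> list[int]:
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    cores = {name: core(seq) for name, seq in OLIGOS.items()}
    for name, seq in cores.items():
        print(f"{name}: {seq} (complementary strand: {reverse_complement(seq)})")
    names = list(cores)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            print(first, "vs", second, "differ at", mismatches(cores[first], cores[second]))
```

Under these assumptions, the AdML/TdT hybrid differs from the TdT core at two positions and from the AdML core at six, all within the first six positions of the listed strand.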

RNA Analysis

Primer extension analysis of endogenous TdT RNA utilized three synthetic oligonucleotide primers hybridizing 87, 201, and 221 nucleotides from the transcription start site. The sequences of these probes were 5′-CCGAGGACCCAGGTGGACTGC-3′, 5′-GGCTCTTCGAGTTGTTCCCATC-3′, and 5′-CGGGCCAGCTCCATGAGGAA-3′. Analysis of HSVTK RNA from transfected cells required a 26 nucleotide primer complementary to TK sequences, 5′-GAGGTGCGGGAGTTTCACGCCACCAA-3′. Hybridization to 30 µg of total cytoplasmic RNA was at 60°C, and reaction conditions were as described (McKnight and Kingsbury, 1982). Primer extension analysis of RNA synthesized in vitro used commercially available synthetic oligonucleotides: -40 sequencing (45°C hybridization; New England Biolabs), reverse sequencing (37°C hybridization; New England Biolabs), and SP6 (45°C hybridization; Promega).

S1 nuclease analysis was performed as described (Hentschel et al., 1980) with 30 µg of total cytoplasmic RNA or with RNA transcribed in vitro. All hybridizations were for at least 8 hr in 80% formamide, and all S1 nuclease digestion reactions were at room temperature. Lower temperatures had no effect on the heterogeneity of S1 nuclease-resistant products. Two different probes were used. The first (42°C hybridization) was prepared from pTdT(-1300/+59) by digestion with BamHI (+59), treatment with alkaline phosphatase, labeling with T4 polynucleotide kinase, and digestion with SacI (-111). The second (37°C hybridization) was prepared from pTdT(-1300/+11) by digestion with HindIII (polylinker), treatment with phosphatase, labeling with kinase, and digestion with SacI (-111). Single-stranded probes were isolated at 4°C on 8% nondenaturing polyacrylamide gels, by standard procedures.

Transfection Experiments

COS7 monkey cells were grown on 10 cm dishes in Dulbecco's modified Eagle's medium supplemented with 5% fetal calf serum and antibiotics. Transient transfection in COS7 cells was performed with a DEAE-dextran-chloroquine procedure as described previously (Smale and Tjian, 1985). Forty-eight hours after transfection, total cytoplasmic RNA was isolated with NP40 lysis and phenol extraction, as previously described (Smale and Tjian, 1985). Low molecular weight DNA was isolated by the procedure of Hirt (1967) to compare replication efficiencies of the various plasmids.


In Vitro Transcription Experiments

HeLa cells were grown in spinner culture in minimal essential medium (Joklik's; Hazelton) supplemented with 7% horse serum and antibiotics. Nuclear extracts were prepared by a modification of the method of Dignam et al. (1983) as described (Briggs et al., 1986). Following precipitation with ammonium sulfate, the extract was suspended in TM.1 (50 mM Tris [pH 7.4], 12.5 mM MgCl2, 1 mM EDTA, 1 mM DTT, and 20% glycerol) at a protein concentration of about 20 mg/ml and dialyzed extensively against TM.1. Aliquots were stored at -60°C. Transcription reactions were performed in a total volume of 50 µl, containing 12.5 µl HeLa extract, 12.5 µl TM.1, 2% polyvinyl alcohol, 250 µM each ribonucleoside triphosphate, and 400 ng of supercoiled template DNA (Jones et al., 1985). Following incubation at 30°C for 60 min, reactions were prepared for primer extension or S1 nuclease analysis as described (Jones et al., 1985).
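The reaction recipe above implies final buffer concentrations that can be worked out by simple dilution. The sketch below is illustrative only. It assumes that the extract is in TM.1 (it was dialyzed against it), that the 12.5 µl of extract and 12.5 µl of TM.1 are the only sources of the TM.1 components, and that the remaining volume contributes none of them. The function names are hypothetical.

```python
# TM.1 composition as given in the text.
TM1 = {
    "Tris (mM)": 50.0,
    "MgCl2 (mM)": 12.5,
    "EDTA (mM)": 1.0,
    "DTT (mM)": 1.0,
    "glycerol (%)": 20.0,
}

REACTION_UL = 50.0        # total reaction volume
EXTRACT_UL = 12.5         # HeLa nuclear extract, dialyzed against TM.1
TM1_UL = 12.5             # additional TM.1 buffer
EXTRACT_MG_PER_ML = 20.0  # "about 20 mg/ml" protein
TEMPLATE_NG = 400.0       # supercoiled template DNA


def final_buffer_concentrations() -> dict[str, float]:
    """Dilute TM.1 components into the reaction.

    Assumption: extract and added buffer both count as TM.1, and nothing
    else in the reaction contributes these components.
    """
    fraction = (EXTRACT_UL + TM1_UL) / REACTION_UL
    return {name: value * fraction for name, value in TM1.items()}


def extract_protein_ug() -> float:
    """Protein per reaction; mg/ml is numerically equal to ug/ul."""
    return EXTRACT_MG_PER_ML * EXTRACT_UL


def template_ng_per_ul() -> float:
    """Template DNA concentration in the reaction."""
    return TEMPLATE_NG / REACTION_UL


if __name__ == "__main__":
    for name, value in final_buffer_concentrations().items():
        print(f"{name}: {value:g}")
    print(f"extract protein: about {extract_protein_ug():g} ug")
    print(f"template: {template_ng_per_ul():g} ng/ul")
```

Under these assumptions the TM.1 components are present at half strength in the reaction (for example, 6.25 mM MgCl2 and 10% glycerol), with roughly 250 µg of extract protein and 8 ng/µl of template.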

We thank Ned Landau for the isolation and characterization of the first TdT genomic clone, and Marc Learned, David Schatz, and Mark Schlissel for critical reading of the manuscript. This work was supported by a Helen Hay Whitney Foundation postdoctoral fellowship to S. T. S. and by grants from the American Cancer Society (#IM355T) and the Du Pont Center for Molecular Genetics to D. B. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC Section 1734 solely to indicate this fact.

Received October 26, 1988; revised January 6, 1989.

Garvin, A. M., Pawar, S., Marth, J. D., and Perlmutter, R. M. (1988). Structure of the murine lck gene and its rearrangement in a murine lymphoma cell line. Mol. Cell. Biol. 8, 3056-3064.

Gluzman, Y. (1981). SV40-transformed simian cells support the replication of early SV40 mutants. Cell 23, 175-182.

Grosschedl, R., and Baltimore, D. (1985). Cell-type specificity of immunoglobulin gene expression is regulated by at least three DNA sequence elements. Cell 41, 885-897.

Grosschedl, R., and Birnsteil, M. L. (1980). Identification of regulatory sequences in the prelude sequences of an H2A histone gene by the study of specific deletion mutants in vitro. Proc. Natl. Acad. Sci. USA 77, 1432-1436.

Grosveld, G. C., Shewmaker, C. K., Jat, P., and Flavell, R. A. (1981). Localization of DNA sequences necessary for transcription of the rabbit β-globin gene in vitro. Cell 25, 215-226.

Grosveld, G. C., deBoer, E., Shewmaker, C. K., and Flavell, R. A. (1982). DNA sequences necessary for transcription of the rabbit β-globin gene in vitro. Nature 295, 120-126.

Hahn, S., Hoar, E. T., and Guarente, L. (1985). Each of three “TATA elements” specifies a subset of the transcription initiation sites at the CYC1 promoter of Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 82, 8562-8566.

Hen, R., Sassone-Corsi, P., Corden, J., Gaub, M. P., and Chambon, P. (1982). Sequences upstream from the T-A-T-A box are required in vivo and in vitro for efficient transcription from the adenovirus serotype 2 major late promoter. Proc. Natl. Acad. Sci. USA 79, 7132-7136.

Hentschel, C. C., Imminger, J. C., Bucher, B., and Birnstiel, M. (1980). Sea urchin histone mRNA termini are located in gene regions downstream of putative regulatory sequences. Nature 285, 147-151.

Hirt, B. (1967). Selective extraction of polyoma DNA from infected mouse cell culture. J. Mol. Biol. 26, 365-369.

Anderson, S. J., Chou, H. S., and Loh, D. Y. (1988). A conserved sequence in the T-cell receptor β-chain promoter region. Proc. Natl. Acad. Sci. USA 85, 3551-3554.

Jones, K. A., Yamamoto, K. R., and Tjian, R. (1985). Two distinct transcription factors bind to the HSV thymidine kinase promoter in vitro. Cell 42, 559-572.

Ayer, D. A., and Dynan, W. S. (1988). Simian virus 40 major late promoter: a novel tripartite structure that includes intragenic sequences. Mol. Cell. Biol. 8, 2021-2033.

Jones, K. A., Luciw, P. A., and Duchange, N. (1988). Structural arrangements of transcription control domains within the 5′ untranslated leader regions of the HIV-1 and HIV-2 promoters. Genes Dev. 2, 1101-1114.

Biggin, M. D., and Tjian, R. (1988). Transcription factors that activate the Ultrabithorax promoter in developmentally staged extracts. Cell 53, 699-711.

Breathnach, R., and Chambon, P. (1981). Organization and expression of eukaryotic split genes coding for proteins. Annu. Rev. Biochem. 50, 349-363.

Briggs, M. R., Kadonaga, J. T., Bell, S. P., and Tjian, R. (1986). Purification and biochemical characterization of the promoter-specific transcription factor, Sp1. Science 234, 47-52.

Buratowski, S., Hahn, S., Sharp, P. A., and Guarente, L. (1988). Function of a yeast TATA element-binding protein in a mammalian transcription system. Nature 334, 37-42.

Chen, W., and Struhl, K. (1985). Yeast mRNA initiation sites are determined primarily by specific sequences, not by the distance from the TATA element. EMBO J. 4, 3273-3280.

Concino, M. F., Lee, R. F., Merryweather, J. P., and Weinmann, R. (1984). The adenovirus major late promoter TATA box and initiation site are both necessary for transcription in vitro. Nucl. Acids Res. 12, 7423-7433.

Corden, J., Wasylyk, B., Buchwalder, A., Sassone-Corsi, P., Kedinger, C., and Chambon, P. (1980). Promoter sequences of eukaryotic protein-coding genes. Science 209, 1405-1414.

Dierks, P., van Ooyen, A., Cochran, M. D., Dobkin, C., Reiser, J., and Weissmann, C. (1983). Three regions upstream from the cap site are required for efficient and accurate transcription of the rabbit β-globin gene in mouse 3T6 cells. Cell 32, 695-706.

Dignam, J. D., Lebovitz, R. M., and Roeder, R. G. (1983). Accurate transcription initiation by RNA polymerase II in a soluble extract from isolated mammalian nuclei. Nucl. Acids Res. 11, 1475-1489.

Dynan, W. S., and Tjian, R. (1983). The promoter-specific transcription factor Sp1 binds to upstream sequences in the SV40 early promoter. Cell 35, 79-87.

Jones, N. C., Rigby, P. W. J., and Ziff, E. B. (1988). Trans-acting protein factors and the regulation of eukaryotic transcription: lessons from studies on DNA tumor viruses. Genes Dev. 2, 267-281.

Kato, K., Goncalves, H. M., Gouts, G. E., and Bollum, F. J. (1967). Deoxynucleotide-polymerizing enzymes of calf thymus gland. J. Biol. Chem. 242, 2760-2769.

Kudo, A., and Melchers, F. (1987). A second gene, VpreB in the λ5 locus of the mouse, which appears to be selectively expressed in pre-B lymphocytes. EMBO J. 6, 2267-2272.

Kudo, A., Sakaguchi, N., and Melchers, F. (1987). Organization of the murine Ig-related λ5 gene transcribed selectively in pre-B lymphocytes. EMBO J. 6, 103-107.

Landau, N. R., St. John, T. P., Weissman, I. L., Wolf, S. C., Silverstone, A. E., and Baltimore, D. (1984). Cloning of terminal transferase cDNA by antibody screening. Proc. Natl. Acad. Sci. USA 81, 5636-5640.

Landau, N. R., Schatz, D. G., Rosa, M., and Baltimore, D. (1987). Increased frequency of N-region insertion in a murine pre-B-cell line infected with a terminal deoxynucleotidyl transferase retroviral expression vector. Mol. Cell. Biol. 7, 3237-3243.

Lee, W., Haslinger, A., Karin, M., and Tjian, R. (1987). Two factors that bind and activate the human metallothionein IIA gene in vitro also interact with the SV40 promoter and enhancer regions. Nature 325, 366-372.

Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982). Molecular Cloning: A Laboratory Manual (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory).

Matsui, T., Segall, J., Weil, P. A., and Roeder, R. G. (1980). Multiple factors required for accurate initiation of transcription by purified RNA polymerase II. J. Biol. Chem. 255, 11992-11996.

Maxam, A., and Gilbert, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. Meth. Enzymol. 65, 499-560.


McKnight, S. L., and Kingsbury, R. (1982). Transcriptional control signals of a eukaryotic protein-coding gene. Science 217, 316-324.

McNeil, J. B., and Smith, M. (1985). Saccharomyces cerevisiae CYC1 mRNA 5′-end positioning: analysis by in vitro mutagenesis, using synthetic duplexes with random mismatch base pairs. Mol. Cell. Biol. 5, 3545-3551.

Myers, R. M., Tilly, K., and Maniatis, T. (1986). Fine structure genetic analysis of a β-globin promoter. Science 232, 613-618.

Nagawa, F., and Fink, G. R. (1985). The relationship between the TATA sequence and transcription initiation sites at the HIS4 gene of Saccharomyces cerevisiae. Proc. Natl. Acad. Sci. USA 82, 8557-8561.

Nandi, A., Das, G., and Salzman, N. P. (1985). Characterization of a surrogate TATA box promoter that regulates in vitro transcription of the simian virus 40 major late gene. Mol. Cell. Biol. 5, 591-594.

Perkins, K. K., Dailey, G. M., and Tjian, R. (1988). In vitro analysis of the Antennapedia P2 promoter: identification of a new Drosophila transcription factor. Genes Dev. 2, 1615-1626.

Samuels, M., Fire, A., and Sharp, P. A. (1982). Separation and characterization of factors mediating accurate transcription by RNA polymerase II. J. Biol. Chem. 257, 14419-14427.

Sanger, F., Nicklen, S., and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467.

Sawadogo, M., and Roeder, R. G. (1985). Interaction of a gene-specific transcription factor with the adenovirus major late promoter upstream of the TATA box region. Cell 43, 165-175.

Sehgal, A., Patil, N., and Chao, M. (1988). A constitutive promoter directs expression of the nerve growth factor receptor gene. Mol. Cell. Biol. 8, 3160-3167.

Smale, S. T., and Tjian, R. (1985). Transcription of herpes simplex virus tk sequences under the control of wild-type and mutant human RNA polymerase I promoters. Mol. Cell. Biol. 5, 352-362.

Soeller, W. C., Poole, S. J., and Kornberg, T. (1988). In vitro transcription of the Drosophila engrailed gene. Genes Dev. 2, 68-81.

Struhl, K. (1987). Promoters, activator proteins, and the mechanism of transcriptional initiation in yeast. Cell 49, 295-297.

Talkington, C. A., and Leder, P. (1982). Rescuing the in vitro function of a globin pseudogene promoter. Nature 298, 192-195.

Tokunaga, K., Hirose, S., and Suzuki, Y. (1984). In monkey COS cells only the TATA box and the cap site region are required for faithful and efficient initiation of the fibroin gene transcription. Nucl. Acids Res. 12, 1543-1558.

Wasylyk, B., Derbyshire, R., Guy, A., Molko, D., Roget, A., Teoule, R., and Chambon, P. (1980). Specific in vitro transcription of conalbumin gene is drastically decreased by single-point mutation in TATA box homology sequence. Proc. Natl. Acad. Sci. USA 77, 7024-7028.

Zinn, K., DiMaio, D., and Maniatis, T. (1984). Identification of two distinct regulatory regions adjacent to the human β-interferon gene. Cell 34, 865-879.