TIBS 11 - January 1986
Reviews

Promoter-specific activation of RNA polymerase II transcription by Sp1

James T. Kadonaga, Katherine A. Jones and Robert Tjian

J. T. Kadonaga, K. A. Jones and R. Tjian are at the Department of Biochemistry, University of California, Berkeley, CA 94720, USA.

The RNA polymerase II transcription factor Sp1 is a protein that binds to specific DNA sequences and activates RNA synthesis from a select group of promoters. Sp1 and related factors appear to be important for modulation of gene expression in higher organisms.

Control of the rate of transcription initiation is one means by which the expression of genes can be varied. Regulation of transcription initiation is well characterized in prokaryotes such as Escherichia coli, but in higher organisms, such as humans, this phenomenon is only beginning to be clarified. One common approach to this problem has been the identification of important cis-acting DNA sequences in the region surrounding transcription initiation sites. For synthesis of mRNA by RNA polymerase II, these studies have revealed some important and distinct promoter elements, such as specific sequences within 100 nucleotides upstream of the start site that contribute to the efficiency of mRNA synthesis, and an AT-rich region of DNA (25-30 bp upstream of the RNA start site and sometimes called the TATA box) that appears to fix the site of transcription initiation1-6. Eukaryotic transcription can also be greatly influenced by control elements known as enhancers, which can increase the level of transcription from long distances (at least 2 kb) and from both orientations, when either upstream or downstream of the RNA start site3,7-11.

Promoter mapping studies of several eukaryotic genes in vivo and in vitro have revealed the importance of a GGGCGG hexanucleotide (GC box)3,4,12,13 and a CCAAT sequence (CAAT box)14, which are often found 40-100 nucleotides upstream of the start site of transcription. These elements appear to play a critical role in directing efficient transcription from a select class of mammalian promoters. Examples include: the
Simian Virus 40 early promoter3,4,12,13, which contains six tandemly arranged GC boxes; the mammalian β-globin promoters15-17, which each possess a single CAAT box; and the herpes simplex virus (HSV) thymidine kinase (TK) promoter5,6,18,19, which has two GC boxes that flank a single CAAT box.

To understand how the GC and CAAT boxes affect the level of transcription in the cell, it is necessary to complement the genetic mapping experiments with biochemical identification and purification of factors that interact specifically with GC and CAAT boxes to modulate the synthesis of RNA. By characterization of such factors, it should be possible to elucidate the mechanisms of promoter-specific variation of transcription. Two transcription factors, termed Sp1 and CTF (CAAT-binding transcription factor), have been isolated recently from mammalian cells19-21. Sp1 and CTF bind to GC and CAAT boxes, respectively19,22 (and S. McKnight, pers. commun.). Because Sp1 has been studied more extensively than CTF, we focus, in this review, mainly on the promoter-specific activation of transcription by Sp1.

Properties of Sp1

Transcription factor Sp1 is a sequence-specific, DNA-binding protein isolated from HeLa (human) cells. It enhances transcription by RNA polymerase II 10- to 50-fold from a select group of promoters that contain at least one properly positioned GC box. There appear to be 5000-10 000 Sp1 molecules per cell, and the protein has been recently purified 100 000-fold to an estimated 95% homogeneity (M. Briggs, J. Kadonaga and R. Tjian, unpublished). As shown in Fig. 1, both cellular and viral promoters
have been found to be responsive to this factor (Refs 19,21-24 and W. Dynan, S. Sazer, R. Tjian and R. Schimke, unpublished; W. Lee, M. Karin and R. Tjian, unpublished; K. Jones, J. Kadonaga, P. Luciw and R. Tjian, unpublished). Sp1 binds to some, but not all, sequences that contain the GGGCGG hexanucleotide, and, despite their marked asymmetry, binding sites for Sp1 are functional in either orientation. Promoters that are Sp1-responsive often contain multiple binding sites, and the most important site is usually the one closest to the gene, approximately 40-70 nucleotides upstream of the RNA start site.

To confirm that Sp1 is an important transcription factor, it was necessary to demonstrate a direct correlation between the binding and in-vitro transcriptional enhancement activities. By using a battery of promoter mutants, it was also possible to compare the action of Sp1 in vitro with the synthesis of RNA in vivo. Studies of this kind with the SV40 early and HSV TK promoters indicated that, in vivo and in vitro, Sp1 must bind to DNA in a sequence-specific manner to activate transcription of Sp1-responsive promoters19-22,25.
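Because Sp1 sites function in either orientation, a search for candidate GC boxes has to consider both GGGCGG and its inverted form, CCGCCC. The following is a minimal sketch of such a search, not a method from the original studies; the function name `find_gc_boxes` and the example sequence are hypothetical. As noted above, the presence of the hexanucleotide does not by itself mean that Sp1 binds, so the hits are candidates only.

```python
import re

GC_BOX = "GGGCGG"
GC_BOX_INVERTED = "CCGCCC"  # reverse complement of GGGCGG


def find_gc_boxes(sequence):
    """Return sorted (start, orientation) pairs for GC-box hexanucleotides.

    `start` is a 0-based index into `sequence`. Orientation "+" means
    GGGCGG reads 5'->3' on the given strand; "-" means it reads on the
    complementary strand (seen as CCGCCC on the given strand).
    """
    seq = sequence.upper()
    hits = []
    for motif, orientation in ((GC_BOX, "+"), (GC_BOX_INVERTED, "-")):
        # A zero-width lookahead reports overlapping occurrences as well.
        for match in re.finditer("(?=" + motif + ")", seq):
            hits.append((match.start(), orientation))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical test sequence, not taken from any promoter in the text.
    print(find_gc_boxes("ATGGGCGGTTCCGCCCAT"))
```

For the hypothetical sequence shown, the function reports one hexanucleotide in each orientation; deciding whether a candidate is a genuine Sp1 site still requires footprinting.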
Measurement of Sp1 activity

Sp1 can be assayed by monitoring either its DNA binding or its transcriptional enhancement activities. The most direct and specific assay is DNase I or dimethyl sulphate (DMS) footprinting26,27, in which a 32P-labeled DNA probe that contains at least one Sp1 binding site is incubated with the factor, lightly treated with DNase I or DMS, and then analysed by electrophoresis on a denaturing polyacrylamide gel. Regions of DNA that are bound by Sp1 appear as protected areas (Fig. 2a).

Alternatively, the transcriptional enhancement activity of Sp1 can be measured in vitro by reconstituted transcription reactions carried out in the presence or absence of Sp1. A convenient and specific assay for measuring transcription initiation is the primer extension assay5: a 32P-labeled oligonucleotide that is complementary to a portion of the newly synthesized RNA is hybridized to the transcripts, and a complementary DNA (cDNA) strand is synthesized with reverse transcriptase and subsequently analysed by electrophoresis on a denaturing polyacrylamide gel. The activation of transcription by Sp1 is estimated by
© 1986, Elsevier Science Publishers B.V., Amsterdam 0376-5067/86/$02.00
[Figure 1: schematic maps of Sp1-responsive promoters: SV40 early (GC boxes I-VI), HSV IE-3 (sites I-V), HSV TK (sites I and II), AIDS virus LTR (sites I-III), MT-IIA (site I) and mouse DHFR (sites I-IV), drawn against a scale of nucleotide position relative to the RNA start site (-200, -150, -100, -50, +1).]

Fig. 1. Whether promoters were responsive to Sp1 or not was determined by transcription in vitro in the presence or absence of Sp1. The filled ovals indicate Sp1 binding sites that were characterized by footprinting with Sp1. The filled rectangle represents a binding site for transcription factor CTF, which binds to CCAAT sequences (CAAT boxes). SV40 GC-box IV is shown as an open oval because Sp1 bound to GC-box V appears to prevent binding of the factor to GC-box IV. The orientation of each Sp1 recognition site is indicated by an arrow, defined relative to the sequence NGGGCGGNNN. The RNA start sites are designated +1. DHFR = dihydrofolate reductase; MT = metallothionein; IE = immediate-early.
[Figure 2: gel autoradiograms. Panel (a), DNase I footprinting, with lanes marked 0 or + for Sp1; panel (b), primer extension, with lanes marked - or +.]

Fig. 2. (a) DNase I footprinting assay. The binding of Sp1 to SV40 GC-boxes I-VI is depicted, and the footprint boundary is indicated by a bracket. The lanes labeled '0' are negative controls that show the DNase I digestion pattern in the absence of Sp1. In a typical footprint experiment, 5-50 fmol of 32P-labeled DNA is incubated with 50-500 fmol of Sp1 monomers before digestion with DNase I. (b) Primer extension assay. The symbols + and - indicate the presence or absence of Sp1. The arrows indicate the cDNA strands that derive from RNA transcripts generated in vitro.
comparing the amounts of 32P-labeled cDNA derived from RNA synthesized in the presence or absence of Sp1 (Fig. 2b).

Action of Sp1 and related factors on various promoters

The criteria for assessing whether a promoter is Sp1-responsive are (1) that transcription in vitro is enhanced at least 10-fold by Sp1 and (2) that the transcriptional activation correlates with the binding of Sp1 to a GC box in the promoter. To understand the mechanism of transcriptional activation by Sp1 and the role of Sp1 in the cell, we asked these questions: what is the minimum DNA sequence requirement for a promoter to be Sp1-responsive; is the GGGCGG hexanucleotide necessary and sufficient for Sp1 binding and transcriptional activation; and are there other transcription factors similar to Sp1?

Viral promoters responsive to Sp1
Sp1 was discovered by its ability in vitro to selectively increase transcription from the SV40 early promoter20-22. As shown in Fig. 1, there are six tandemly arranged GC boxes in the degenerate 21 bp repeats of this promoter. By using a series of clustered point mutations in each of the GC boxes, we have shown that GC boxes I, II and III are the most important for transcription in vitro in the early direction and that GC boxes III, V and VI are the most important for transcription in vitro in the late direction25.

The HSV IE-3 promoter and the AIDS virus long terminal repeat (LTR) promoter have also contributed to our understanding of Sp1. The IE-3 promoter, which contains five Sp1 binding sites ranging from position -75 to position -255, relative to the RNA start site, is strongly responsive to Sp1 (Ref. 23). Interestingly, a deletion mutant of this promoter that lacks viral sequences upstream of position -110 has only a single GC box, but its activity is still increased 20-fold by Sp1. Thus, a single GC box appears to be sufficient for Sp1 binding and activation of transcription.

The AIDS virus LTR promoter has three tandem Sp1 binding sites, defined by DNase footprinting and dimethyl sulfate methylation protection experiments with Sp1 of 95% purity (K. Jones, J. Kadonaga, P. Luciw and R. Tjian, unpublished). This finding was unexpected, however, because only the middle binding site has a GC box. The purified factor appears to bind to flanking sequences that are partially homologous to the decanucleotide consensus sequence for Sp1 (see later) but do not contain a perfect GC box. Thus, it appears that the GGGCGG hexanucleotide is not an absolute requirement for Sp1 binding.
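The two criteria for Sp1 responsiveness stated under "Action of Sp1 and related factors on various promoters" can be written as a small check. This is only an illustrative sketch: the function names are hypothetical, the transcription signals stand for whatever quantity is read from the primer extension assay, and reducing the binding correlation to a single yes/no flag is a simplifying assumption made here.

```python
def fold_activation(signal_with_sp1, signal_without_sp1):
    """Ratio of transcription signal measured with Sp1 to that without it."""
    if signal_without_sp1 <= 0:
        raise ValueError("signal without Sp1 must be positive")
    return signal_with_sp1 / signal_without_sp1


def is_sp1_responsive(signal_with_sp1, signal_without_sp1,
                      binding_correlates, min_fold=10.0):
    """Apply both criteria: at least `min_fold` activation in vitro, and
    activation that correlates with Sp1 binding (assumed here to be
    summarised by the boolean `binding_correlates`)."""
    fold = fold_activation(signal_with_sp1, signal_without_sp1)
    return binding_correlates and fold >= min_fold


if __name__ == "__main__":
    # Hypothetical signal values giving the 20-fold activation reported
    # for the IE-3 deletion mutant that retains a single GC box.
    print(is_sp1_responsive(200.0, 10.0, binding_correlates=True))
```

The 10-fold threshold is the one given in the text; everything else about how signals are quantified is left to the experimenter.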
Cellular Sp1-responsive promoters
Several cellular Sp1-responsive promoters have been studied by assaying transcription in vitro and by DNase footprinting experiments. These include the mouse dihydrofolate reductase promoter (W. Dynan, S. Sazer, R. Tjian and R. Schimke, unpublished), the human metallothionein IA and IIA promoters (W. Lee, M. Karin and R. Tjian, unpublished) and two promoters that were cloned from a monkey genomic library by sequence homology with the SV40 origin of replication22,24,28. The mouse dihydrofolate reductase promoter possesses four evenly spaced Sp1 binding sites ranging from position -50 to position -190, and each of the sites contains a single GC box. The human metallothionein IA and IIA promoters have Sp1 binding sites interspersed with other important sequences (such as metal regulatory elements, glucocorticoid response elements and basal level enhancer sequences29) that are believed to be recognition sites for specialized factors that affect transcription. The monkey promoter bears some similarity to the SV40 21 bp repeats in that it contains closely spaced, tandemly arranged GC boxes that appear to activate transcription bidirectionally.

It seems likely that a class of genes possessing Sp1-responsive promoters are functionally related in some manner, but there are, as yet, too few examples of Sp1-responsive and Sp1-independent promoters to reveal a common theme for the cellular requirement for Sp1.
A promoter that responds to Sp1 and CTF
Studies of the HSV TK promoter, which contains two Sp1 binding sites flanking one CAAT box, have revealed a transcription factor known as CTF that behaves like Sp1 but specifically recognizes the CAAT box rather than the GC-box element19. CTF also appears to bind to CAAT-box elements present in the human β-globin and murine sarcoma virus (MSV) LTR promoters (K. Jones and R. Tjian, unpublished) as well as the human heat shock hsp-70 gene (W. Morgan and R. Tjian, unpublished). In the TK promoter, it is likely that CTF acts in conjunction with Sp1 to modulate transcription.
Consensus sequence for Sp1 binding

Because the GGGCGG sequence (or its inverted form, CCGCCC) was present in all of the Sp1-responsive promoters that were first studied, it was inferred that this hexanucleotide is the recognition sequence for Sp1. Further studies revealed, however, that the GGGCGG hexanucleotide does not always specify a strong Sp1 binding site, and the decanucleotide consensus sequence shown at the top of Fig. 3 subsequently evolved. Except for the 5' G or T, which appear to be equivalent, the upper bases are preferred to the lower ones. Thus, the best Sp1 binding sequences are probably G4CG4C and TG3CG4C. As mentioned above, analysis of the AIDS virus LTR promoter suggests that other related sequences may also be high-affinity Sp1 binding sites. Hence, it is likely that this consensus represents only a subset of all Sp1 binding sites.

Interestingly, weak Sp1 binding sites, specifically SV40 GC box I and HSV TK box I, have been found to be important for transcriptional activation. Thus, in addition to the inherent affinity of the protein for the recognition site, other considerations, such as the positioning of the binding sites relative to the RNA start site and perhaps direct interactions with different factors, may be important for Sp1 activation of transcription.

Mechanism of transcriptional activation by Sp1

The mechanism by which Sp1 activates RNA polymerase II transcription can only, at this point, be a subject of speculation. The sequence-specific binding to DNA and transcriptional enhancement activities are probably related, but other details, such as whether alterations in DNA conformation or protein-protein interactions between Sp1 and the transcriptional machinery are important, are not yet clear.

Multiple binding sites appear to be one common feature of promoters that are Sp1-responsive. In the AIDS virus LTR and SV40 early promoters, tandem sites are aligned such that Sp1 binds once every 10-12 bp, which corresponds roughly to one complete turn of the DNA helix. Why are Sp1 binding sites arranged in such a manner? One answer could be simply that more is better than less: if many Sp1 molecules on a promoter can activate transcription to a greater extent than few Sp1 molecules, a promoter with multiple binding sites has greater potential for controlling transcription than a promoter with only a single site.

The interaction between Sp1 and DNA has been characterized by treatment of Sp1-DNA complexes with either DNase I or DMS, a reagent that methylates N-7 of guanine residues and N-3 of adenine residues. Like other DNA binding proteins, Sp1 both depresses and enhances digestion by DNase I and methylation by DMS. Sp1 protection of DNA is easily interpreted as the factor making the DNA inaccessible to DNase I or DMS, but in contrast, the mechanism of enhancement of DNA reactivity toward DNase I and DMS is not obvious. This phenomenon is likely to be the consequence of significant alterations in the structure of DNA and thus may indicate how Sp1 activates transcription.

As mentioned above, a second distinct transcription factor, CTF, appears to act in conjunction with Sp1 to activate transcription from the HSV TK promoter. If that is true, transcription from the TK promoter could be modulated by varying the cellular concentrations of Sp1 and CTF. Preliminary studies of the metallothionein and β-globin promoters (W. Lee, M. Karin and R. Tjian, unpublished; W. Morgan and R. Tjian, unpublished) suggest that there may be other sequence-specific DNA binding proteins that recognize distinct promoter elements and behave like Sp1 and CTF to activate transcription. Also, transcription factors from Drosophila that are similar to Sp1 and CTF have been identified and characterized30-32. These findings support the hypothesis that there is an entire class of DNA binding proteins that can act either alone or with other factors (and perhaps with other yet unidentified classes of transcription factors) to modulate transcription from a wide range of promoters.

Future studies

The most obvious function of Sp1 in the cell is to activate the expression of a select group of genes, but other potential functions should not be excluded. For example, an Sp1 binding site has been characterized in the first intron of the chicken thymidine kinase gene (K. Jones, S. McKnight and R. Tjian, unpublished). Also, Sp1 binding sites are near replication origins of SV40 and HSV. Furthermore, because of its DNA binding properties, Sp1 could conceivably be used as a repressor as well as an activator of RNA synthesis.

We are only beginning to understand how Sp1 works. The factor has been purified to homogeneity recently and it is likely that antibodies to Sp1 and the gene(s) encoding Sp1 will be obtained. Such problems as the regulation of expression of Sp1 and related factors could then be pursued. Future studies should lead to a greater insight into the mechanisms by which Sp1 and other related factors activate transcription and give a broader understanding of their role in the cell.

Consensus: 5' (G/T)GGGCGG(G/A)(G/A)(C/T) 3' (at each variable position the upper base of the figure is given first)

SEQUENCE      RELATIVE AFFINITY   SOURCE
GGGGCGGGGC    HIGH                HSV IE-3 (V); DHFR (I, III); MT IIA; CH-TK INTRON
TGGGCGGGGC    HIGH                HSV IE-3 (III, IV)
TGGGCGGAGT    HIGH                SV40 (III, V)
GGGGCGGAGC    HIGH                DHFR (II, IV)
GGGGCGGGGG    MEDIUM              HSV IE-3 (I)
GGGGCGGGGT    MEDIUM              HSV IE-3 (II)
TGGGCGGGGT    MEDIUM              HSV-TK (II)
TGGGCGGAAC    MEDIUM              SV40 (II)
GGGGCGGGAT    MEDIUM              SV40 (IV)
GGGGCGGGAC    MEDIUM              SV40 (VI)
GGGGCGGAGA    LOW                 SV40 (I)
GGGGCGGCGC    LOW                 HSV-TK (I)

Fig. 3. A tentative consensus sequence for Sp1 binding. This sequence was derived from the 19 binding sites listed. The relative affinity of Sp1 for each site was estimated by DNase footprinting.
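The decanucleotide consensus of Fig. 3 can be expressed as a list of allowed bases per position and used to scan a sequence on both strands. The sketch below is an illustration under stated assumptions rather than the authors' procedure: the names `CONSENSUS`, `matches_consensus` and `find_consensus_sites` are hypothetical, and the match is strict (every position must carry one of the two listed bases). A strict reading is narrower than the data: three of the footprinted sites in Fig. 3 (GGGGCGGGGG, GGGGCGGAGA and GGGGCGGCGC) fall outside it, consistent with the text's caution that the consensus covers only a subset of Sp1 binding sites.

```python
# Allowed bases at each of the ten positions, 5' to 3'
# (upper base of Fig. 3 first where two are permitted).
CONSENSUS = ["GT", "G", "G", "G", "C", "G", "G", "GA", "GA", "CT"]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def matches_consensus(site):
    """True if `site` is a decanucleotide fitting the consensus strictly."""
    site = site.upper()
    return len(site) == len(CONSENSUS) and all(
        base in allowed for base, allowed in zip(site, CONSENSUS)
    )


def find_consensus_sites(sequence):
    """Scan both strands; return (start, strand, site) for each match.

    `start` is the 0-based index of the window on the given strand and
    `site` is always written 5'->3' in the consensus orientation.
    """
    seq = sequence.upper()
    width = len(CONSENSUS)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        if matches_consensus(window):
            hits.append((i, "+", window))
        flipped = reverse_complement(window)
        if matches_consensus(flipped):
            hits.append((i, "-", flipped))
    return hits


if __name__ == "__main__":
    # Hypothetical flanking bases around one of the high-affinity sequences.
    print(find_consensus_sites("AAGGGGCGGGGCAA"))
```

A strict matcher of this kind says nothing about relative affinity; the high, medium and low classes in Fig. 3 were assigned by footprinting, and a graded score would need binding data that the consensus alone does not supply.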
Acknowledgements

We thank our colleagues for their helpful and informative comments on this manuscript. J. T. Kadonaga is a Fellow of the Miller Institute for Basic Research in Science. K. A. Jones is supported by a postdoctoral fellowship from the National Institutes of Health. We thank K. Ronan for typing the manuscript. This work was funded by grants from the National Institutes of Health to R. Tjian.

References

1 Breathnach, R. and Chambon, P. (1981) Annu. Rev. Biochem. 50, 349-383
2 Shenk, T. (1981) Curr. Top. Micro. Immun. 93, 25-40
3 Benoist, C. and Chambon, P. (1981) Nature 290, 304-310
4 Myers, R. M., Rio, D. C., Robbins, A. K. and Tjian, R. (1981) Cell 25, 373-384
5 McKnight, S. L. (1982) Cell 31, 355-366
6 McKnight, S. L. and Kingsbury, R. (1982) Science 217, 316-324
7 Yaniv, M. (1982) Nature 297, 17-18
8 Khoury, G. and Gruss, P. (1983) Cell 33, 313-314
9 Fromm, M. and Berg, P. (1982) J. Mol. Appl. Genet. 1, 457-481
10 Wasylyk, B., Wasylyk, C., Augereau, P. and Chambon, P. (1983) Cell 32, 503-514
11 Serfling, E., Jasin, M. and Schaffner, W. (1985) Trends Genet. 1, 224-230
12 Lebowitz, P. and Ghosh, P. K. (1982) J. Virol. 41, 449-461
13 Everett, R. D., Baty, D. and Chambon, P. (1983) Nucl. Acids Res. 11, 2447-2464
14 Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O'Connell, C., Spiritz, R. A., DeRiel, J. K., Forget, B. G., Slighton, L., Blechl, A. E., Smithies, O., Baralle, F. E., Shoulders, C. C. and Proudfoot, N. J. (1980) Cell 21, 653-668
15 Grosveld, G. C., Rosenthal, A. and Flavell, R. A. (1982) Nucl. Acids Res. 10, 4951-4971
16 Dierks, P., Van Ooyen, A., Cochran, M. D., Dobkin, C., Reiser, J. and Weissman, C. (1983) Cell 32, 695-706
17 Charnay, P., Mellon, P. and Maniatis, T. (1985) Mol. Cell. Biol. 5, 1498-1511
18 McKnight, S. L., Kingsbury, R., Spence, A. and Smith, M. (1984) Cell 37, 253-262
19 Jones, K. A., Yamamoto, K. R. and Tjian, R. (1985) Cell 42, 559-572
20 Dynan, W. S. and Tjian, R. (1983) Cell 32, 669-680
21 Dynan, W. S. and Tjian, R. (1983) Cell 35, 79-87
22 Gidoni, D., Dynan, W. S. and Tjian, R. (1984) Nature 312, 409-413
23 Jones, K. A. and Tjian, R. (1985) Nature 317, 179-182
24 Dynan, W. S., Saffer, J. D., Lee, W. S. and Tjian, R. (1985) Proc. Natl Acad. Sci. USA 82, 4915-4919
25 Gidoni, D., Kadonaga, J. T., Barrera-Saldaña, H., Takahashi, K., Chambon, P. and Tjian, R. (1985) Science 230, 511-517
26 Galas, D. and Schmitz, A. (1978) Nucl. Acids Res. 5, 3157-3170
27 Siebenlist, U. and Gilbert, W. (1980) Proc. Natl Acad. Sci. USA 77, 122-126
28 Saffer, J. D. and Singer, M. F. (1984) Nucl. Acids Res. 12, 4769-4788
29 Karin, M., Haslinger, A., Holtgreve, H., Richards, R. I., Krauter, P., Westphal, H. M. and Beato, M. (1984) Nature 308, 513-519
30 Parker, C. S. and Topol, J. (1984) Cell 36, 357-369
31 Parker, C. S. and Topol, J. (1984) Cell 37, 273-283
32 Heberlein, U., England, B. P. and Tjian, R. (1985) Cell 41, 965-977