Update Research Focus
Structural evolution of multisubunit RNA polymerases
Finn Werner
UCL Research Department of Structural and Molecular Biology, Darwin Building, Gower Street, London WC1E 6BT, UK
Evolutionarily related multisubunit RNA polymerases (RNAPs) facilitate gene transcription throughout the three domains of life. During the past seven years an increasing number of bacterial and eukaryotic RNAP structures have been solved; however, the archaeal enzyme remained elusive. Two reports from the Murakami and Cramer laboratories have now filled this gap in our knowledge and enable us to hypothesize about the evolution of the structure and function of RNAPs.
Corresponding author: Werner, F. ([email protected]).

Transcription engines

Many biological processes involve and depend on regulated gene expression. The first step of gene expression is transcription, and at the centre of transcription lie DNA-dependent RNA polymerases (RNAPs) – thus a thorough understanding of gene expression requires detailed knowledge of RNAP structure and function. Since 2000 an impressive number (>50) of bacterial and eukaryotic RNAP structures, and complexes of RNAPs with nucleic acid scaffolds or basal transcription factors, have been obtained at high resolution. However, the structural organization of RNAPs from the third domain, Archaea, remained uncharted territory. Two articles from the Murakami and Cramer laboratories now supply the missing information for our structural understanding of multisubunit RNAPs – the engines of transcription [1,2]. The novel structures were obtained by X-ray crystallography at medium-high resolution (3.4 Å) and by electron microscopy at low resolution (16 Å), respectively. The former structure especially, in conjunction with the recent identification of a genuine archaeal homologue of the eukaryotic RPB8 RNAP subunit [3], gives some intriguing clues on the evolutionary conservation of structure and function of RNAPs throughout the three domains of life.

Structural and functional evolution of the archaeal RNAP

The structural and functional organization of multisubunit RNAPs has been reviewed recently [4] (Box 1). I use the crystal structure of the Sulfolobus solfataricus RNAP to illustrate its functional architecture [1] (Figure 1). The three large subunits A′, A″ and B form a largely ellipsoid body often likened to the shape of a crab claw. They represent the bulk of the enzyme that harbours all of the structural domains and motifs necessary for RNA polymerization (Figure 1d, outlined in gold). The efficient assembly of the large subunits into RNAPs requires four small polypeptides that form a stable subcomplex, D–L–N–P [5]. This so-called assembly platform is located opposite to the opening of the claw (Figure 1d, outlined in red). The two subunits F and E form a stable subassembly coined the stalk because it protrudes from the RNAP body (Figure 1d, outlined in blue). The F–E complex probably interacts with the nascent transcript as it emerges from the RNA exit channel by means of two RNA-binding motifs, and is instrumental in DNA melting during transcription initiation [6–8] (Box 1 and Box 2).

How does this structure relate to bacterial and eukaryotic RNAPs? The conserved core of all RNAPs in the three domains of life can be described as a combination of large subunits and their cognate assembly platform – in essence the bacterial RNAP (Figure 1a–c). The archaeal and eukaryotic RNAPs harbour additional subunits that extend their functionality, such as the F–E signature module. Overall, the sequence, domain organization and structure of the large subunits are highly conserved between multisubunit RNAPs. The entire architecture of the largest subunits, A′, A″ and B, can be superimposed on their RNAPII counterparts RPB1 and 2 [1]. This structural scaffold includes the dock domain that mediates RNAP recruitment, the flexible clamp that closes over the DNA-binding channel, and the active site including the bridge helix that is instrumental in the nucleotide addition cycle and translocation mechanism. The latter mechanisms have been eloquently described in two recent structural determinations of the bacterial Thermus thermophilus RNAP and yeast RNAPII [9–11].

The assembly platform of the bacterial RNAP consists of a homodimer of α subunits. Archaeal RNAP and eukaryotic RNAPII each contain two polypeptides (respectively, D and L, and RPB3 and RPB11) that are homologous to α; they heterodimerize and incorporate two additional subunits (respectively N and P, and RPB10 and RPB12) to form an extended platform [5].
Eukaryotic RNAPI and RNAPIII use a common platform consisting of AC40 and AC19 (homologous to D and L), and RPB10 and RPB12 [12] (Figure 1g). The lower ‘jaw’ of the crab claw encompasses the large subunits A′ and A″ (homologous to RPB1) and the small subunit H (homologous to RPB5), which interacts with the downstream DNA template [13]. Interestingly, the bacterial RNAP carries an insertion in the β′ subunit that is present at the equivalent position and structurally mimics H (RPB5) [1].

The new S. solfataricus and Pyrococcus furiosus RNAP structures reveal the archaeal RNAP as a truncated version of RNAPII in which some of the small subunits or domains are missing [1,2]. Some of the main differences between the archaeal and eukaryotic RNAPII can be predicted by comparing the subunit composition and by amino acid sequence alignments of the individual subunits. The RPB8 and RPB9 homologues are absent in both structures. Archaeal subunit H lacks
Box 1. Functional dissection of RNAP subunits

Nucleic acid polymerases share a distant common ancestry that is reflected in the layout of the active site and a conserved catalytic mechanism involving two magnesium ions [29]. The catalytic centre of multisubunit RNAPs resides in two double-ψ β barrels that are found in the large subunits of the bacterial (β′ and β), archaeal (A′, A″ and B) and eukaryotic RNAPs (RPB1 and 2 in RNAPII) [4,30]. These closely related enzymes share a common ternary organization of conserved subunits exemplified by the α, β, β′ and ω subunits of the bacterial RNAP [31] (see Figure 1c,g in main text). This RNAP ‘core’ represents >75% of the mass of archaeo-eukaryotic RNAP. The largest archaeal RNAP subunit is split into two polypeptides, A′ and A″, that are homologous to the N- and C-terminal parts of β′ (and RPB1). A′, A″ and B harbour the catalytic centre (see Figure 1d in main text, gold outline), with two magnesium ions and the binding sites of substrate nucleoside triphosphates (NTPs), template DNA and the DNA–RNA hybrid. The archaeal subunits D, L, N and P (RPB3, 11, 10 and 12) form a stable subcomplex that serves as an assembly platform for the large A and B subunits (see Figure 1d, red outline, in main text). The archaeal RNAP (see Figure 1a in main text) and eukaryotic RNAPII (see Figure 1b in main text) contain another four homologous subunits that reflect their close kinship: the archaeal subunits F, H, E and G are homologous to eukaryotic RPB4, 5, 7 and 8 (see Figure 1g in main text). Subunit H forms an integral part of the jaw domain and interacts with the downstream DNA template [13]. Subunit K stabilizes the split A′ and A″ subunits and serves as anchor point for the F–E complex (see Figure 1d, blue outline, in main text). Subunits F and E modulate the position of the clamp (A′; see Figure 1d, green outline, in main text) and facilitate DNA melting and open complex formation. The F–E complex contains two RNA-binding domains, an OB (oligonucleotide/oligosaccharide binding) fold (E) and a HRDC (helicase and RNaseD C-terminal) motif (F), which are likely to bind the nascent RNA transcript as it emerges from the RNA exit channel [4].
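The subunit correspondences listed in Box 1 and in the main text can be written down as a small lookup table. The following is only an illustrative sketch: the identifiers (ARCHAEAL_TO_RNAPII, FUNCTIONAL_MODULES and the two helper functions) are hypothetical names introduced here, ASCII primes stand in for A′ and A″, and the table records only the archaeal–RNAPII homologies stated in this article.

```python
# Archaeal RNAP subunit -> RNAPII homologue, as stated in Box 1 and the main text.
# All identifiers are illustrative; A' and A'' use ASCII primes.
ARCHAEAL_TO_RNAPII = {
    "A'": "RPB1", "A''": "RPB1", "B": "RPB2",               # large catalytic subunits
    "D": "RPB3", "L": "RPB11", "N": "RPB10", "P": "RPB12",  # assembly platform
    "F": "RPB4", "E": "RPB7",                               # stalk
    "H": "RPB5", "K": "RPB6", "G": "RPB8",
}

# Functional modules of the archaeal enzyme as outlined in Figure 1d.
FUNCTIONAL_MODULES = {
    "catalytic core": ["A'", "A''", "B"],
    "assembly platform": ["D", "L", "N", "P"],
    "stalk": ["F", "E"],
}

# RNAPII comprises twelve subunits, RPB1 to RPB12.
RNAPII_SUBUNITS = [f"RPB{i}" for i in range(1, 13)]


def rnapii_homologues(module):
    """Return the RNAPII counterparts of one archaeal functional module."""
    return [ARCHAEAL_TO_RNAPII[s] for s in FUNCTIONAL_MODULES[module]]


def rnapii_without_archaeal_subunit():
    """List RNAPII subunits that have no archaeal RNAP subunit in the table."""
    covered = set(ARCHAEAL_TO_RNAPII.values())
    return [s for s in RNAPII_SUBUNITS if s not in covered]


if __name__ == "__main__":
    print(rnapii_homologues("assembly platform"))
    print(rnapii_without_archaeal_subunit())
```

With the table as given, the only RNAPII subunit left without an archaeal subunit counterpart is RPB9, whose archaeal relative, TFS, is discussed in the main text as a factor rather than a stable subunit.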
the N-terminal domain of its RNAPII homologue, RPB5. In a similar manner the archaeal RNAP subunit K lacks the unstructured N-terminal domain of its RNAPII homologue, RPB6 [14]. RNAPII activity is regulated during the transcription cycle by phosphorylation of the C-terminal domain of RPB1 (CTD) and possibly RPB6, but hitherto no posttranslational modifications have been identified in the archaeal RNAP [15–17]. The evolutionary relationship between archaeal RNAP and RNAPII is also mirrored in their use of basal factors – both enzymes ‘share’ three basal transcription factors that facilitate promoter-dependent transcription [18] (Box 2). By contrast, the molecular mechanisms of transcription initiation in the bacterial system are fundamentally different and the structural motifs of the large bacterial RNAP subunits that interface with initiation factors are greatly divergent from both archaeal and eukaryotic RNAPs [1,19,20]. In summary, eukaryotic RNAPs have evolved from an archaeal-like ancestral RNAP by the simple addition of motifs, domains and subunits, whereas the differences between the bacterial and archaeal RNAP include changes to the core structure.

S. solfataricus RNAP contains an iron–sulphur cluster

The most intriguing novel aspect of the S. solfataricus RNAP is the presence of an iron–sulphur (Fe–S) cluster in subunit D [1]. This 4Fe–4S cluster has a structural role and is strictly required for the stable assembly of the D–L heterodimer, the nucleation event of RNAP assembly. In
Trends in Microbiology Vol.16 No.6
Box 2. Molecular mechanisms of archaeal transcription initiation factors

The evolutionary relationship between the archaeal RNAP and eukaryotic RNAPII is also reflected in their requirements for the homologous transcription factors TATA element-binding protein (TBP, equivalent to eukaryotic TBP), transcription factor B (TFB, equivalent to eukaryotic TFIIB) and transcription factor E (TFE, equivalent to eukaryotic TFIIEα). In brief, TBP binding to the TATA element of the archaeal promoter nucleates the formation of the preinitiation complex by starting a short recruitment cascade. First, TFB binds to the TATA–TBP complex and subsequently RNAP is recruited to the TATA–TBP–TFB complex by multiple interactions between TFB and RNAP [18]. The S. solfataricus RNAP structure now demonstrates that the interfaces between the factors (e.g. TFB zinc ribbon) and the RNAP (e.g. the dock domain) are highly conserved. Initially the N-terminal zinc ribbon domain of TFB contacts the dock domain of RNAP; subsequently the TFB B-finger domain penetrates deep into the RNAP active site [32]. The latter interaction stimulates the catalytic activity of RNAP by stabilizing the template DNA strand or the initiating substrate NTPs (or both) [18]. Finally, TFE enters the DNA–TBP–TFB–RNAP initiation complex and promotes open complex formation, a process by which the two DNA strands of the template are separated around the transcription start site and the template strand is loaded into the active site. This process involves major topological changes in the initiation complex, and TFE is thought to stabilize it by direct physical interactions with the nontemplate DNA strand [33]. TFE action is strictly dependent on subunits F and E, which indicates that it modulates the RNAP clamp and thereby promotes DNA melting [18,33,34]. TFE is, like TFIIF, a component of initiation and elongation complexes and thus has multiple functions during the transcription cycle [33].
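The order of the recruitment cascade described in Box 2 can be captured as a simple dependency check. This is a deliberately simplified sketch, not a kinetic or structural model: the REQUIRES table and the assemble function are hypothetical names introduced here, and the only information encoded is the order of entry stated in the box (TBP on the TATA element, then TFB, then RNAP, then TFE).

```python
# Components that must already be present before each factor can join,
# following the order of entry described in Box 2. Names are illustrative.
REQUIRES = {
    "TBP": {"TATA"},
    "TFB": {"TATA", "TBP"},
    "RNAP": {"TATA", "TBP", "TFB"},
    "TFE": {"TATA", "TBP", "TFB", "RNAP"},
}


def assemble(order):
    """Add factors to the TATA element in the given order.

    Returns the complex as a hyphen-joined string, or raises ValueError
    if a factor is added before the components it depends on.
    """
    complex_ = ["TATA"]
    for factor in order:
        missing = REQUIRES[factor] - set(complex_)
        if missing:
            raise ValueError(
                f"{factor} cannot join yet; missing {sorted(missing)}"
            )
        complex_.append(factor)
    return "-".join(complex_)


if __name__ == "__main__":
    print(assemble(["TBP", "TFB", "RNAP", "TFE"]))
    try:
        assemble(["TBP", "RNAP"])
    except ValueError as err:
        print(err)
```

The strict linear order is an assumption of the sketch; it mirrors the sequence in which the box introduces the factors and says nothing about affinities, reversibility or alternative pathways.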
theory, the iron–sulphur cluster could enable a redox regulation of general transcription in S. solfataricus. However, the cluster is tightly bound in the context of the RNAP, it is not oxygen sensitive and persists on treatment with a strong iron chelator [1]. The Fe–S cluster-binding motif is not found in the P. furiosus RNAP but Fe–S cluster-binding motifs were predicted in the subunit D genes of 16 sequenced archaeal genomes and in the AC40 genes (homologue of subunit D in RNAPI and III) of 12 eukaryotic genomes (plants and protozoa [1]). Fe–S clusters have been previously identified as having a structural rather than a catalytic role in other RNAPs, including eukaryotic and archaeal primases [21].

Transcription factor capture

Recently, the structures of eukaryotic RNAPI and III have been explored by a combination of electron microscopy of whole RNAP at low resolution, X-ray crystallography of RNAP subcomplexes at high resolution and molecular modelling [7,22]. Both RNAPI and III contain polypeptides that have no homologues in RNAPII, but two of the RNAPI (A49 and A34.5) and RNAPIII (C37 and 53) subunits are structurally and functionally related to the RNAPII-specific basal transcription factor TFIIF, which consists of two subunits called RAP30 and RAP74 [22] (Figure 1g). TFIIF is involved in the initiation and elongation phase of transcription. This apparent structural and functional similarity enables the interesting speculation that (i) RNAPI and III have an ‘inbuilt’ transcription factor, and (ii) all three eukaryotic RNAPs could have similar requirements for ‘assistance’ during the transcription cycle aiding, for example, recruitment to the promoter and overcoming transcriptional pausing during elongation [22]. It also
Figure 1. RNAP subunits in the three domains of life. The X-ray structures of multisubunit RNAPs in the three domains of life. (a) Archaeal RNAP, Sulfolobus solfataricus (PDB code: 2PMZ); (b) eukaryotic RNA polymerase II, Saccharomyces cerevisiae (PDB code: 1NT9); and (c) bacterial RNAP, Thermus aquaticus (PDB code: 1I6V). (d) Functional organization of the archaeal RNAP into the large catalytic subunits (A′, A″ and B, gold trace), the assembly platform (D, L, N and P, red trace), the clamp (A′, green) and the stalk module (F and E, blue). (e) The archaeal RNAP subunit D (highlighted in red) contains a 4Fe–4S cluster, shown in (f) (highlighted in red). RNAP subunits are colour coded according to the key in (g). The unconnected densities of the bacterial Taq RNAP structure (1I6V) were ignored for simplicity. RNAP images are cartoon representations of the main chains with overlaid semitransparent surfaces generated using MacPyMol (www.pymol.org). (g) Subunit composition of RNA polymerases in the three domains of life. The subunits are arranged by RNAP (columns) and by sequence and structural homology (rows), and are grouped into subunits present in all multisubunit RNAPs and subunits specific for archaeal and eukaryotic RNAPs. The lower box emphasizes the structural and functional homology of RNAPI and III subunits that are related to the RNAPII-specific transcription factor TFIIF [22]. The five subunits RPB5, 6, 8, 10 and 12 are common to all three eukaryotic RNAPs. The coloured circles indicate the colour coding of the RNAP subunits in (a–c).
demonstrates that the boundary between RNAP subunits and exogenous transcription factors that tightly cooperate with RNAP is blurred, and that the contemporary multisubunit RNAPs are in varying states of transcription factor ‘capture’. This scenario is reminiscent of an archaeal polypeptide transcription factor S (TFS) that is homologous to both the RNAPII subunit RPB9 and the transcript cleavage factor TFIIS [23,24]. Archaeal TFS and eukaryotic TFIIS stimulate the endogenous transcript cleavage activity of their RNAP [23]. TFS is associated with its cognate RNAP in some but not all archaeal systems and therefore TFS could represent an exogenous factor that is in the process of getting stably incorporated into the enzyme.

Another likely example of this process is the recently identified archaeal homologue of the RNAPII subunit RPB8, coined subunit G [3]. RPB8 was believed to be the only RNAPII subunit without an archaeal homologue; however, rigorous sequence data mining identified RPB8 genes in hyperthermophilic Crenarchaea and the only sequenced species belonging to the Korarchaeota, but not Euryarchaea [3]. This finding is consistent with either of two alternative pathways for the evolution of multisubunit RNAPs, characterized by an early versus late origin of RPB8. RPB8 (G) could have evolved early as an integral part of the RNAP of the last archaeal common ancestor (LACA), but was lost during euryarchaeal evolution after the crenarchaeal and euryarchaeal lineages split. Alternatively, RPB8 was recruited and incorporated into the crenarchaeal RNAP (as described for TFS) following the split of the lineages. In both scenarios a crenarchaeal-like RNAP including RPB8 (G) would be the evolutionary predecessor of eukaryotic RNAPs. An RNAP preparation purified to near-homogeneity from the crenarchaeon Sulfolobus acidocaldarius contained RPB8 (G) [25]. However, the highly purified S. solfataricus RNAP used for crystallization does not include subunit RPB8 (G), which indicates that it only weakly or transiently associates with RNAP [1]. RPB8 is present in all three eukaryotic RNAPs but its significance and functional contribution to the molecular mechanisms of RNAPs is not yet understood [26].

Concluding remarks and future perspectives

The new S. solfataricus and P. furiosus RNAP structures confirm predictions about the ternary architecture and functional interactions between RNAP subunits that were based on sequence conservation and experimental results. In addition, they reveal unexpected new insights such as the presence of Fe–S clusters in multisubunit RNAPs. The identification of bona fide RPB8 homologues in Crenarchaea demonstrates that the full range of RNAPII subunits was present in the ancestral RNAP of the LACA. The structures also yield some clues about the evolution of multisubunit RNAPs, indicating that an archaeal-like RNAP precursor evolved into eukaryotic RNAPs by the insertion of polypeptide motifs, domains and subunits rather than by more fundamental changes of the RNAP structure. Now that structural information at a good resolution is available for multisubunit RNAPs from all three domains of life, an in-depth analysis of the evolution of structure and function of the transcription machinery can be carried out. The bacterial transcription systems are perhaps the best understood, based on the combination of structural information and recombinant in vitro reconstituted RNAPs. Although it has not been possible yet to develop a recombinant RNAPII system, the in vitro reconstitution of P. furiosus and Methanocaldococcus jannaschii RNAP from recombinant subunits has been successful [7,18,27]. The latter system has recently been adapted to a robotics platform that can produce a large number of recombinant mutant RNAPs in a high-throughput manner [28].
The high-quality structural information about archaeal RNAP obtained by the Murakami and Cramer laboratories paves the way for a comprehensive functional analysis of multisubunit RNAPs.

References

1 Hirata, A. et al. (2008) The X-ray structure of RNA polymerase from Archaea. Nature 452, 248
2 Kusser, A.G. et al. (2008) Structure of an archaeal RNA polymerase. J. Mol. Biol. 376, 303–307
3 Koonin, E.V. et al. (2007) Orthologs of the small RPB8 subunit of the eukaryotic RNA polymerases are conserved in hyperthermophilic Crenarchaeota and Korarchaeota. Biol. Direct 2, 38
4 Werner, F. (2007) Structure and function of archaeal RNA polymerases. Mol. Microbiol. 65, 1395–1404
5 Werner, F. et al. (2000) Archaeal RNA polymerase subunits F and P are bona fide homologs of eukaryotic RPB4 and RPB12. Nucleic Acids Res. 28, 4299–4305
6 Meka, H. et al. (2005) Crystal structure and RNA binding of the Rpb4/Rpb7 subunits of human RNA polymerase II. Nucleic Acids Res. 33, 6435–6444
7 Naji, S. et al. (2007) The Rpb7 orthologue E′ is required for transcriptional activity of a reconstituted archaeal core enzyme at low temperatures and stimulates open complex formation. J. Biol. Chem. 282, 11047–11057
8 Ujvari, A. and Luse, D.S. (2006) RNA emerging from the active site of RNA polymerase II interacts with the Rpb7 subunit. Nat. Struct. Mol. Biol. 13, 49–54
9 Vassylyev, D.G. et al. (2007) Structural basis for transcription elongation by bacterial RNA polymerase. Nature 448, 157–162
10 Vassylyev, D.G. et al. (2007) Structural basis for substrate loading in bacterial RNA polymerase. Nature 448, 163–168
11 Wang, D. et al. (2006) Structural basis of transcription: role of the trigger loop in substrate specificity and catalysis. Cell 127, 941–954
12 Eloranta, J.J. et al. (1998) In vitro assembly of an archaeal D-L-N RNA polymerase subunit complex reveals a eukaryote-like structural arrangement. Nucleic Acids Res. 26, 5562–5567
13 Bartlett, M.S. et al. (2004) Topography of the euryarchaeal transcription initiation complex. J. Biol. Chem. 279, 5894–5903
14 Dahmus, M.E. (1996) Phosphorylation of mammalian RNA polymerase II. Methods Enzymol. 273, 185–193
15 Kayukawa, K. et al. (1999) A serine residue in the N-terminal acidic region of rat RPB6, one of the common subunits of RNA polymerases, is exclusively phosphorylated by casein kinase II in vitro. Gene 234, 139–147
16 Kolodziej, P.A. et al. (1990) RNA polymerase II subunit composition, stoichiometry, and phosphorylation. Mol. Cell. Biol. 10, 1915–1920
17 Gerber, J. et al. (2008) Site specific phosphorylation of yeast RNA polymerase I. Nucleic Acids Res. 36, 793–802
18 Werner, F. and Weinzierl, R.O. (2005) Direct modulation of RNA polymerase core functions by basal transcription factors. Mol. Cell. Biol. 25, 8344–8355
19 Murakami, K.S. et al. (2002) Structural basis of transcription initiation: RNA polymerase holoenzyme at 4 Å resolution. Science 296, 1280–1284
20 Vassylyev, D.G. et al. (2002) Crystal structure of a bacterial RNA polymerase holoenzyme at 2.6 Å resolution. Nature 417, 712–719
21 Klinge, S. et al. (2007) An iron-sulfur domain of the eukaryotic primase is essential for RNA primer synthesis. Nat. Struct. Mol. Biol. 14, 875–877
22 Kuhn, C.D. et al. (2007) Functional architecture of RNA polymerase I. Cell 131, 1260–1272
23 Lange, U. and Hausner, W. (2004) Transcriptional fidelity and proofreading in Archaea and implications for the mechanism of TFS-induced RNA cleavage. Mol. Microbiol. 52, 1133–1143
24 Guglielmi, B. et al. (2007) TFIIS elongation factor and Mediator act in conjunction during transcription initiation in vivo. Proc. Natl. Acad. Sci. U. S. A. 104, 16062–16067
25 Langer, D. et al. (1995) Transcription in archaea: similarity to that in eucarya. Proc. Natl. Acad. Sci. U. S. A. 92, 5768–5772
26 Briand, J.F. et al. (2001) Partners of Rpb8p, a small subunit shared by yeast RNA polymerases I, II and III. Mol. Cell. Biol. 21, 6056–6065
27 Werner, F. and Weinzierl, R.O. (2002) A recombinant RNA polymerase II-like enzyme capable of promoter-specific transcription. Mol. Cell 10, 635–646
28 Nottebaum, S. et al. (2008) The RNA polymerase factory: a robotic in vitro assembly platform for high-throughput production of recombinant protein complexes. Nucleic Acids Res. 36, 245–252
29 Steitz, T.A. and Steitz, J.A. (1993) A general two-metal-ion mechanism for catalytic RNA. Proc. Natl. Acad. Sci. U. S. A. 90, 6498–6502
30 Iyer, L.M. et al. (2003) Evolutionary connection between the catalytic subunits of DNA-dependent RNA polymerases and eukaryotic RNA-dependent RNA polymerases and the origin of RNA polymerases. BMC Struct. Biol. 3, 1
31 Zhang, G. et al. (1999) Crystal structure of Thermus aquaticus core RNA polymerase at 3.3 Å resolution. Cell 98, 811–824
32 Bushnell, D.A. et al. (2004) Structural basis of transcription: an RNA polymerase II-TFIIB cocrystal at 4.5 Angstroms. Science 303, 983–988
33 Grunberg, S. et al. (2007) Transcription factor E is a part of transcription elongation complexes. J. Biol. Chem. 282, 35482–35490
34 Ouhammouch, M. et al. (2004) A fully recombinant system for activator-dependent archaeal transcription. J. Biol. Chem. 279, 51719–51721

0966-842X/$ – see front matter © 2008 Elsevier Ltd. All rights reserved. doi:10.1016/j.tim.2008.03.008