doi:10.1016/j.jmb.2005.10.083
J. Mol. Biol. (2006) 356, 179–188
Crystal Structure of Thermus aquaticus Gfh1, a Gre-factor Paralog that Inhibits rather than Stimulates Transcript Cleavage

Valerie Lamour¹, Brian P. Hogan², Dorothy A. Erie² and Seth A. Darst¹*

¹The Rockefeller University, 1230 York Avenue, New York, NY 10021, USA
²Department of Chemistry, CB #3290, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
Transcription elongation in bacteria is promoted by Gre-factors, which stimulate an endogenous, endonucleolytic transcript cleavage activity of the RNA polymerase. A GreA paralog, Gfh1, present in Thermus aquaticus and Thermus thermophilus, has the opposite effect on elongation complexes, inhibiting rather than stimulating transcript cleavage. We have determined the 3.3 Å-resolution X-ray crystal structure of T. aquaticus Gfh1. The structure reveals an N-terminal and a C-terminal domain with close structural similarity to the domains of GreA, but with an unexpected conformational change in terms of the orientation of the domains with respect to each other. However, structural and functional analysis suggests that when complexed with RNA polymerase, Gfh1 adopts a conformation similar to that of GreA. These results reveal considerable structural flexibility for Gfh1, and for Gre-factors in general, as suggested by structural modeling, and point to a possible role for the conformational switch in Gre-factor and Gfh1 regulation. The opposite functional effect of Gfh1 compared with GreA may be determined by three structural characteristics. First, Gfh1 lacks the basic patch present in Gre-factors that likely plays a role in anchoring the 3′-fragment of the backtracked RNA. Second, the loop at the tip of the N-terminal coiled-coil is highly flexible and contains extra acidic residues compared with GreA. Third, the N-terminal coiled-coil finger lacks a kink in the first α-helix, resulting in a straight coiled-coil compared with GreA. The latter two characteristics suggest that Gfh1 chelates a magnesium ion in the RNA polymerase active site (like GreA) but in a catalytically inactive configuration. © 2005 Elsevier Ltd. All rights reserved.
*Corresponding author
Keywords: Gfh1; Gre-factors; RNA polymerase; transcription elongation
Abbreviations used: RNAP, RNA polymerase; Ec, Escherichia coli; NTD, N-terminal extended coiled-coil domain; CTD, C-terminal globular domain; TEC, ternary elongation complex.
E-mail address of the corresponding author: [email protected]
0022-2836/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.

Introduction

While the cellular RNA polymerases (RNAPs) are absolutely processive, transcript elongation is not a smooth, continuous process. During elongation, some RNAP molecules may undergo reverse translocation (a process termed backtracking), leading to a transcriptionally inactive state in which the 3′-OH of the RNA transcript is disengaged from the RNAP active site.1–4 The resulting 3′-fragment of the RNA transcript threads through the RNAP secondary channel that bifurcates from the active site.5–11 Backtracking leads to transient pausing (where the backtracking is relieved by spontaneous forward translocation of the enzyme), or sometimes can generate arrested complexes that are trapped in the backtracked state. Backtracked complexes can be rescued by internal hydrolytic cleavage and release of the transcript 3′-fragment, generating a new 3′-OH in register with the RNAP active site, allowing for renewed RNA synthesis.12–14 This endonucleolytic cleavage activity is intrinsic to the RNAP enzyme,15,16 and is catalyzed by the RNAP active site itself.17,18

A class of transcription elongation factors, which includes prokaryotic GreA and GreB (Gre-factors), as well as eukaryotic TFIIS (SII), increases the overall elongation rate by mitigating pausing and reactivating backtracked, arrested complexes by greatly stimulating the intrinsic endonucleolytic transcript cleavage activity of the RNAP.12,19–22 While the requirement of the Gre-factors for the natural progression of RNAP in vivo has been established,23–25 their biological role remains unclear. In addition to rescuing arrested complexes and increasing the overall elongation rate, the Gre-factors may play a role in modulating RNAP behavior at pause signals,12,25–27 decreasing misincorporation,26 and in stimulating promoter clearance.28 Homologs of Gre-factors are well conserved throughout prokaryotes, which is indicative of a significant role in the regulation of gene expression.

The crystal structure of Escherichia coli (Ec) GreA revealed two distinct domains,29 an N-terminal extended coiled-coil domain (NTD, residues 1–75) and a C-terminal globular domain (CTD, residues 76–158). Biochemical characterization indicated that the NTD plays a critical role in stimulating transcript cleavage and crosslinks to the 3′-end of the RNA transcript in the backtracked ternary elongation complex (TEC), while the CTD plays an important role in RNAP binding.29–32 Structural33 and biochemical analyses34,35 have led to a model for the mechanism through which Gre-factors stimulate transcript cleavage. In this model, the Gre-CTD binds to the surface of the RNAP near the entrance to the secondary channel, positioning the Gre-NTD coiled-coil to extend 45 Å into the secondary channel. This places two crucial acidic residues at the tip of the coiled-coil (Asp41 and Glu44 of Ec GreA; Figure 1(a)) near the active site, where they participate in chelating the second magnesium ion necessary to catalyze the cleavage reaction. Consistent with biochemical analysis,36 conserved basic residues exposed on the surface of the coiled-coil facing the 3′-fragment of the backtracked RNA form a “basic patch” that serves as a “molecular ruler” to detect the length of the 3′-RNA fragment and determine the mode of transcript cleavage. Thus, GreA has a small basic patch near the distal end of the coiled-coil and tends to induce cleavage of short, 2–3 nt fragments of the backtracked RNA, while the GreB basic patch comprises the entire length of the coiled-coil, allowing it to induce cleavage of RNA fragments up to 18 nt.

Figure 1. Sequence and structure of Taq Gfh1. (a) Sequence alignment comparing Taq Gfh1 with Tth Gfh1 and Thermus and Ec Gre-factors. Highly conserved residues discussed in the text are shown in red. Conserved basic residues contributing to the Gre-factor basic patch are colored blue. Other structural features are labeled below the sequences. The secondary structure of the Taq Gfh1 structure is indicated above the sequences (β-strands, grey arrows; α-helices, red rectangles). (b) The X-ray structure of Taq Gfh1. Secondary structural elements are colored (α-helices, red; β-strands, blue), connecting loops are colored grey. The NTD comprises a long, intramolecular coiled-coil. The CTD comprises a six-stranded β-sheet flanked by a small α-helix.

Recently, homologs of GreA were identified and cloned from the related thermophiles Thermus thermophilus (Tth) and Thermus aquaticus (Taq).37 The Thermus GreA, like Ec GreA, crosslinks to the 3′-end of RNA and stimulates the intrinsic cleavage activity of Tth RNAP.37 In addition to GreA, a novel Thermus transcription factor, Gre factor homolog 1 (Gfh1), was identified.
Gfh1 shares a high degree of sequence similarity with GreA (Figure 1(a)), but appears to have the opposite effect on TECs compared with Gre-factors, acting to suppress intrinsic cleavage and competitively inhibiting GreA-induced transcript cleavage.37

Here, we present the 3.3 Å-resolution X-ray crystal structure of Taq Gfh1. The structure reveals an NTD and CTD with close structural similarity to GreA, but with an unexpected conformational change in terms of the orientation of the domains with respect to each other. However, structural and functional analysis indicates that when complexed with RNAP, Gfh1 must adopt a conformation similar to that of GreA. These results reveal considerable structural flexibility for Gfh1, and for Gre-factors in general, as suggested by structural modeling, and point to a possible role for the conformational switch in Gre-factor and Gfh1 regulation.
Results

Taq Gfh1 structure

The domain architecture of Taq Gfh1 is identical with that of Ec GreA, comprising an N-terminal anti-parallel coiled-coil finger, and a CTD composed of six β-strands flanked by a short α-helix (Figure 1(b)). The Taq Gfh1 NTD superimposes on the Ec GreA coiled-coil with a root-mean-square deviation (rmsd) of 4.1 Å over 75 α-carbon positions (including the coiled-coil tip; Figure 2(a)). The first α-helix of Ec GreA has a pronounced kink between residues Leu21 and Arg25, while the corresponding helix of Taq Gfh1 is straight (Figures 1(a) and 2(b)). Furthermore, the loop at the coiled-coil tip of Gfh1 is highly flexible. The tips of all three Gfh1 molecules in the crystallographic asymmetric unit are not constrained by crystallographic contacts and adopt different conformations.

The Taq Gfh1 CTD superimposes very closely onto the CTD of Ec GreA, with an rmsd of 1.34 Å over 60 α-carbon positions (excluding flexible loops connecting β-strands; Figure 2(c)). In Taq Gfh1, the loop connecting the last two strands in the CTD is well defined in all three molecules (residues 140–147), whereas it was disordered in the Ec GreA structure (residues 141–148). The loop connecting the NTD with the CTD (residues 73–87) is four residues shorter in Taq Gfh1 than in Ec GreA.

Although the individual domains of Gfh1 are very similar (NTD) or almost identical (CTD) to the corresponding domains of Ec GreA, superposition of the two proteins by their CTDs reveals a large conformational change (Figure 3(a)). The Taq Gfh1 NTD is rotated about 178° away from the position of the Ec GreA NTD. The rotation point is about the conserved Gly86 at the beginning of the first CTD β-strand (Figures 1(a) and 3(a)). This position is almost invariably Gly (98%) among Gre-factors, indicating an important role. In addition to numerous van der Waals contacts, the novel domain orientation seen in Taq Gfh1 is stabilized by favorable interdomain polar interactions. These include a salt-bridge between Arg14 and Glu148, and hydrogen bonding interactions all involving main-chain atoms (Thr8 N–Gly135 O, Arg14 NH1–Val137 O, Ser64 Oγ–Pro94 O). These interactions are seen in all three Gfh1 molecules in the crystallographic asymmetric unit.

Figure 2. Individual domain comparison. (a) Superimposition over the entire NTD of Taq Gfh1 (blue) and Ec GreA (orange). (b) Superimposition of the Taq Gfh1 (blue) and Ec GreA (orange) NTDs over only the first α-helix of Ec GreA (before the kink), illustrating the bend in the GreA coiled-coil finger introduced by the kink. (c) Superimposition over the entire CTD of Taq Gfh1 (blue) and Ec GreA (orange).

Figure 3. Overall structural comparison. The Ec GreA (orange) and Taq Gfh1 (blue) structures were superimposed by their CTDs only, revealing a large conformational change with respect to the NTD/CTD orientation. The conformational change corresponds to a rotation of the NTD by about 178° around the conserved Gly86.

The basic patch

Previous studies with Ec Gre-factors have identified conserved basic residues (Figure 1(a)) defining an N-terminal “basic patch” that plays a role in determining the mode of transcript cleavage (GreA- or GreB-like).36 Ec GreA has a small basic patch localized near the end of the NTD coiled-coil, while the basic patch of Ec GreB extends across the whole surface of the protein (Figure 4). This is correlated with the tendency for GreA to stimulate cleavage of 2–3 nt fragments of backtracked RNA, while GreB stimulates cleavage of fragments anywhere from 2–18 nt.12,36 Ala substitutions of these basic residues cause functional defects in vivo, and lead to a substantial reduction in crosslinking to the RNA 3′-end, suggesting a defect in anchoring the 3′-fragment of the backtracked RNA.36

The basic patch of Ec GreA is formed mainly by conserved basic residues Arg37 and Arg52 (Figure 1(a)). In an alignment of 59 GreA sequences, position 52 is absolutely conserved as a basic residue (Arg or Lys), while position 37 is conserved as a basic residue in 53 of the sequences (90%). Interestingly, the small handful of GreA sequences that do not have a basic residue at position 37 includes Taq and Tth (Figure 1(a)). As a consequence, a homology model of the Tth GreA structure suggests that its basic patch, although present, is less pronounced than that of Ec GreA (Figure 4). On the other hand, the molecular surface of the NTD coiled-coil of Taq Gfh1 is primarily acidic and does not appear to have a basic patch (Figure 4).

RNA crosslinking

Tth RNAP was used to generate defined ternary elongation complexes containing an RNA transcript bearing the photoreactive adenosine analog, 8-azido adenosine, at the 3′-end. The complexes were stalled at a position known to be susceptible to backtracking and Gre-mediated cleavage, and then exposed to UV radiation to induce RNA–protein crosslinking (see Materials and Methods).
Although the basic patch of Thermus GreA is less pronounced than that of Ec GreA (Figure 4), a relatively robust crosslink was observed between Thermus GreA and the RNA 3′-end (Figure 5), suggesting that, like Ec Gre-factors, Thermus GreA interacts with the backtracked RNA 3′-fragment. Although previous crosslinking experiments failed to reveal a significant crosslink between the 3′-end of RNA and Gfh1,37 with longer exposures, a weak crosslink between wild-type Gfh1 and the nascent transcript 3′-end is observed (Figure 5), despite the fact that Gfh1 lacks the characteristic basic patch of Gre-factors (Figure 4).

Figure 4. Basic patch. The electrostatic potentials on the molecular surfaces (calculated using the program GRASP)54 of the Ec GreA crystal structure,29 homology models of Ec GreB,32 and Tth GreA, and the Taq Gfh1 crystal structure are shown. Positive (basic) potential is colored blue; negative (acidic) potential is colored red, with neutral surfaces white.

Figure 5. RNA crosslinking. TECs stalled at C24 were walked to positions C26 and C27 with CTP and 8-N3ATP, purified, and subsequently exposed to UV light. Crosslinks to the RNA 3′-end were visualized by autoradiography after SDS-PAGE (15% acrylamide).
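The rmsd values quoted for the domain superpositions in this section (for example, 4.1 Å over 75 α-carbon positions for the NTD) summarize the distance between paired atoms after fitting. The following is a minimal sketch of that final step only, assuming the two coordinate sets have already been superimposed and paired one-to-one; the fitting itself is not shown, the function name is illustrative, and the coordinates are hypothetical toy values rather than data from either structure.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of (x, y, z)
    points that are assumed to be already superimposed and paired one-to-one."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy coordinates (in Å), not taken from either crystal structure.
gfh1_ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
grea_ca = [(0.0, 1.0, 0.0), (3.8, 1.0, 0.0), (7.6, 1.0, 0.0)]

# Every atom is displaced by exactly 1 Å, so the rmsd is 1.0.
print(round(rmsd(gfh1_ca, grea_ca), 2))
```

Because the deviations are squared before averaging, a few badly matched positions (such as a flexible loop or a kinked helix) can dominate the value, which is one reason the set of positions included in a superposition is normally stated alongside the rmsd.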
Discussion

Gre-factors can adopt different conformations

Our crosslinking experiments indicate that Gfh1 can contact the 3′-end of the backtracked RNA (Figure 5), implying that the Gfh1 coiled-coil enters the RNAP secondary channel and approaches the RNAP active site, as in the models for the NTD coiled-coil in Gre-factor function.33–35 This is consistent with the observed activity of Gfh1 in suppressing intrinsic transcript cleavage,37 as it is difficult to imagine how Gfh1 could achieve this otherwise.

In the models for Gre-factor interaction with the RNAP,33–35 the Gre-CTD interacts with two long α-helices of the RNAP β′ subunit that form a rim at the entrance to the RNAP secondary channel. Hydroxyl radical protein–protein footprinting identified a surface of the Ec GreB-CTD (residues 117–127) as being an important determinant for RNAP binding, and Ala substitution of two residues within this region (Asp121 and Pro123) caused binding defects.38 The Pro residue (corresponding to Gfh1 Pro124) is absolutely conserved, pointing to its importance in Gre-factor and Gfh1 function (Figure 1(a)). Because of the conserved functional role in stimulating transcript cleavage, we assume that Thermus GreA interacts with RNAP in the same manner as Ec GreA/B. The high level of sequence similarity between the CTDs of Taq GreA and Taq Gfh1 (43% identity, 65% homology), and the very high level of conservation of the sequence segment implicated in the RNAP interaction (corresponding to Taq Gfh1 118–128; 73% identity, 82% homology) lead us to conclude that the Gfh1 CTD interacts with RNAP in the same manner as the Gre-factor CTDs.

The common mode of Gre-factor and Gfh1-CTD interaction with RNAP, combined with the conformational change in the domain orientation of Gfh1 compared with GreA (i.e. the roughly 178° rotation of the NTD with respect to the CTD; Figure 3), leads to a paradox. Superimposition of the Gfh1-CTD onto the Gre-factor CTD within the model for the RNAP TEC/Gre-factor complex33 leads to a model where the Gfh1-NTD coiled-coil protrudes away from the RNAP into solution (which is incompatible with our crosslinking results), rather than extending into the secondary channel as predicted by the crosslinking and functional considerations.

To resolve this paradox, we propose that the conformation of the Gfh1-NTD and CTD seen in our crystal structure (Figure 1(b)) is one conformation available to Gfh1 (and possibly Gre-factors in general) but is not the functional conformation when complexed with RNAP. Rather, the functional conformation in complex with RNAP matches the domain orientation seen in the Ec GreA crystal structure (Figure 3).

In Ec GreA, numerous van der Waals contacts, hydrogen bonds, and salt-bridges maintain the interdomain conformation. Two hydrogen bond/salt-bridge networks stabilize the orientation of the CTD with respect to the NTD (Figure 6(a)). Asp111, located in a turn between two strands in the CTD, makes a network of polar contacts with Glu17, Arg61, Asp64, and Lys68, in the NTD, whereas Thr7 and Arg9 of the NTD make polar contacts with Arg106 and Glu112 of the CTD (Figure 6(a)).
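An interdomain rotation such as the one shown in Figure 3 can be expressed as a single angle once a rotation matrix relating the two NTD positions (after CTD superposition) is available. The sketch below is an illustration only, under the assumption that such a 3×3 rotation matrix has already been obtained from a fitting program; the text does not state how the angle was measured, and the helper names and the example matrix are hypothetical.

```python
import math

def rotation_angle_deg(R):
    """Rotation angle (degrees) of a proper 3x3 rotation matrix, from its trace:
    trace(R) = 1 + 2*cos(theta)."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rotation_about_z(deg):
    """Helper: rotation matrix for a turn of `deg` degrees about the z axis."""
    t = math.radians(deg)
    return [[math.cos(t), -math.sin(t), 0.0],
            [math.sin(t),  math.cos(t), 0.0],
            [0.0, 0.0, 1.0]]

# Hypothetical matrix standing in for one obtained from a real superposition.
R = rotation_about_z(178.0)
print(round(rotation_angle_deg(R), 1))  # 178.0
```

The trace formula returns the magnitude of the rotation only (0° to 180°); the rotation axis, which in the structural comparison passes near the hinge residue, would have to be extracted separately.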
To assess the feasibility of our proposal regarding the functional Gfh1 interdomain orientation, we generated a structural model of Gfh1 with the interdomain orientation of Ec GreA by superimposing the individual Gfh1 domains on the intact Ec GreA structure. Despite the close contact of the two domains, the modeled position of the Gfh1 domains does not generate severe steric clashes. Analysis of the model suggests that some of the interactions that stabilize the NTD/CTD orientation in Ec GreA are conserved or replaced by a similar interaction network in Gfh1 (Figure 6(b)). In particular, the conserved Glu109 in the CTD (corresponding to Ec GreA Glu112; Figure 1(a)) makes a potential hydrogen bond with the conserved Thr8 of the NTD (corresponding to Ec GreA Thr7). Moreover, Taq Gfh1 Arg14 of the NTD is ideally positioned to make a favorable polar interaction with Glu109. These interactions could be further stabilized by Lys118 and Glu114 (Figure 6(b)). Some residues participating in these interactions are strictly conserved between GreA and Gfh1, most notably Gfh1 Thr8 and Glu109 (Figure 1(a)), suggesting that these are important in maintaining the Gre-factor or Gfh1 conformation compatible with productive RNAP binding.

In summary, we propose that the interdomain orientation observed in the Gfh1 crystal structure (Figure 1(b)) does not correspond to the conformation that is compatible with productive RNAP binding. What, then, is the functional significance of the observed domain orientation? It is interesting to consider whether a switch in interdomain orientation could play a role in the regulation of Gfh1 function. Such a switch could be in response to intracellular conditions, for example, and would render Gfh1 unable to interact productively with RNAP. Moreover, structural modeling of Ec GreA in the “flipped” conformation of Taq Gfh1 also suggests that this conformation may be accessible to GreA.

Figure 6. GreA domain orientation modeled onto Gfh1. (a) Key polar interactions between the Ec GreA NTD and CTD. (b) The Taq Gfh1 structure (left, dark blue α-carbon backbone) was altered to match the domain orientation of Ec GreA (middle, light blue backbone). The CTD α-helix is shown in red to highlight the flipped orientation of the CTD with respect to the NTD. Potential interdomain interactions revealed by the model involving conserved Thr8 and Glu109 are shown at the right.

Gfh1 function as a transcript cleavage inhibitor

Despite the similarities in structure, Gfh1 and GreA have essentially opposite effects on TECs. Gfh1 competitively inhibits GreA function, since it competes for the same binding site on RNAP but does not stimulate transcript cleavage. Moreover, it suppresses the intrinsic, non-factor-induced transcript cleavage of the RNAP active site.37

We noted that instead of the normal basic patch seen at the NTD coiled-coil tip of GreA, Gfh1 actually has an acidic electrostatic surface distribution in this region (Figure 4). While the basic patch plays an important role in determining the functional characteristics of Gre-factors, it is not an absolutely crucial feature. Kulish et al.36 studied single and multiple Ala substitution mutants in Ec GreA and GreB expected to dramatically alter or completely remove the basic patch on the NTD surface. While many of these mutants displayed significant functional defects in various assays, none of them abolished the factor-induced, in vitro transcript cleavage activity observed in several stalled TECs. For instance, structural modeling of an Ec GreA double mutant (Ec GreA Arg37Ala/Arg52Ala) suggested that the basic patch would be abolished, and that an overall acidic surface would take its place (similar to the charge distribution seen for Gfh1; Figure 4). Nevertheless, this mutant GreA had, at most, a 50% reduction in transcript cleavage activity in certain TECs, while in others there was no observable defect in vitro.36 In any case, all of the basic patch mutants were able to stimulate the transcript cleavage reaction to a significant extent, and none of them was converted into a suppressor of the intrinsic cleavage reaction. Therefore, we cannot conclude that the lack of a Gfh1 basic patch is the primary determinant of its functional properties as an intrinsic transcript cleavage inhibitor.

In contrast to the basic patch, two strictly conserved acidic residues at the coiled-coil tip (Asp41 and Glu44 in Ec GreA/GreB) are crucial to the function of the Gre-factors.33–35 These residues likely play a role in chelating and positioning the essential second magnesium ion in the RNAP active site. For instance, the in vitro transcript cleavage activity of mutant Gre-factors with substitutions at these positions is reduced by many orders of magnitude. However, the corresponding positions in Gfh1 contain conserved acidic residues as well (Gfh1 Asp41 and Asp44; Figure 1(a)). The presence of Asp at position 44 (as opposed to the strictly conserved Glu of GreA) is not an important factor, as the GreA mutant Glu44Asp shows only slightly reduced transcript cleavage activity.34 Therefore, the coiled-coil tip of Gfh1 contains the essential acidic residues that are crucial for reconfiguring the RNAP active site to stimulate the transcript cleavage activity.

Interestingly, the Gfh1s have additional Asp residues in the coiled-coil tip (Asp42 and Asp45). The corresponding positions in the Gre-factors (corresponding to Ec GreA Leu42 and the strictly conserved Asn45) never contain acidic residues. The possibility exists that these additional acidic residues may alter the interactions with, or reposition, the magnesium ion in the RNAP active site in ways that are detrimental to the cleavage reaction, which could explain the Gfh1 function. This hypothesis will need to be tested through functional assays of Gfh1 and Gre-factor mutants.

A third potential factor important for the function of Gfh1 compared with the Gre-factors is the overall shape of the NTD coiled-coil.
Although the NTD of Gfh1 and Gre-factors has the same anti-parallel, intramolecular coiled-coil dimer with the same number of helical turns, the superposition of α-carbon positions in the NTD gives rise to a fairly high rmsd of 4.1 Å (Figure 2(a)). This is primarily due to a kink in the first α-helix of the Gre-factor coiled-coil, arising from an inserted residue (Leu21 in both Ec GreA and GreB) that interrupts the heptad repeat of the coiled-coil motif (Figures 1(a) and 2(b)).29 Although the identity of this residue is not absolutely conserved, the inserted residue is present in all known Gre-factors. In contrast, a gap exists in the alignment with the Gfh1s, making the first α-helix of the coiled-coil dimer continuous. This residue inserted in the heptad repeat of the Gre-factors causes a slight kink or bend in the trajectory of the coiled-coil dimer of Gre-factors that is absent from the Gfh1 coiled-coil, giving rise to the increased rmsd for the superposition (Figure 2(a)). The precise positioning of the two magnesium ions is crucial for the two-metal catalytic mechanism proposed for the RNAP active site.39,40 In addition to the extra Asp residues in the Gfh1 coiled-coil tip that could reposition the second magnesium ion, the altered trajectory of the coiled-coil finger of Gfh1 compared with the Gre-factors (due to the Gre-factor kink in the first α-helix) could also misalign the second magnesium ion.

In summary, then, the inhibition of intrinsic transcript cleavage by Gfh1 (as opposed to the stimulation by the Gre-factors) may be explained by a combination of three factors. The absence of a basic patch in the Gfh1 structure (Figure 4) could partly explain the inability of Gfh1 to stimulate the transcript cleavage reaction. Nevertheless, the Gfh1 structure appears to contain the necessary determinants for chelating the second magnesium ion, which is crucial to the reconfiguration of the RNAP active site to stimulate the transcript cleavage reaction. Probably most importantly, two aspects of the Gfh1 structure, the presence of extra Asp residues in the coiled-coil tip that could potentially alter the way the protein interacts with the second magnesium ion, and the altered trajectory of the coiled-coil finger due to the absence of the helical kink in the first α-helix, could misalign the second magnesium ion, resulting in inhibition of the RNAP active site cleavage activity.
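The argument above that a single inserted residue interrupts the heptad repeat can be made concrete with a toy register assignment. The sketch below is only an illustration of the bookkeeping, not the authors' analysis: the starting register, the helix length, and the treatment of the inserted residue as a "skip" that does not advance the register are all assumptions, and no real GreA or Gfh1 sequence is used.

```python
HEPTAD = "abcdefg"

def heptad_register(n_residues, skip_at=None):
    """Assign heptad positions a-g to residues 1..n_residues.

    If skip_at is given, that residue is treated as an insertion: it is
    labelled 'x' and does not advance the register.
    """
    labels, phase = [], 0
    for resnum in range(1, n_residues + 1):
        if resnum == skip_at:
            labels.append("x")
            continue
        labels.append(HEPTAD[phase % 7])
        phase += 1
    return labels

def core_positions(labels):
    """Residue numbers occupying the hydrophobic 'a' and 'd' positions."""
    return [i + 1 for i, lab in enumerate(labels) if lab in "ad"]

continuous = heptad_register(28)               # uninterrupted helix (Gfh1-like)
interrupted = heptad_register(28, skip_at=21)  # one inserted residue (Gre-like)
print(core_positions(continuous))   # [1, 4, 8, 11, 15, 18, 22, 25]
print(core_positions(interrupted))  # [1, 4, 8, 11, 15, 18, 23, 26]
```

In this toy model the core positions downstream of the insertion are displaced by one residue along the sequence, so the helix must distort locally to keep its hydrophobic seam facing the partner helix, which is the kind of local irregularity described for the Gre-factor coiled-coil.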
Materials and Methods

Cells and reagents

Tth HB8 cells were purchased from the University of Wisconsin Biotechnology Center. All chemicals, reagents, and radioactive nucleotides were as described.41 Tth RNAP was purified as described,41 or purchased from Epicentre (Madison, WI).

Formation of stalled TECs

Open promoter complexes were formed by incubating 100 nM Tth RNAP and 60 nM DE13 template DNA in high ionic strength transcription buffer (30 mM Hepes (pH 7.8), 200 mM potassium glutamate, 25 mg/ml of acylated bovine serum albumin, 1 mM DTT, 10 mM magnesium glutamate, supplemented with a micronutrient solution containing: 240 mM CuSO4·5H2O, 830 mM MgSO4·7H2O, 136 mM NaCl, 1 mM KNO3, 8.1 mM NaNO3, 414 mM Na2HPO4, 1.74 mM ZnSO4·7H2O, 1.2 mM Na2MoO4, 189 nM CoCl2·6H2O) at 55 °C for 15 min. DE13 template DNA was 5′-biotinylated and bound to streptavidin-coated magnetic beads prior to use. The DE13 DNA template contains the λPR promoter and encodes a transcript in which the first 30 nucleotides of the sequence are:

pppAUGUAGUAAGGAGGUUGUAU GGAAC24CAACGC

Following open complex formation, 50 mM ATP, 20 mM UTP, 10 mM [α-32P]GTP (200 Ci/mmol) were added to the reaction. Reactions were performed in the absence of CTP, thereby generating stalled TECs containing a C24 transcript. Stalled C24 TECs were purified by placing the reaction tube next to a strong magnet, and washing three times with 200 μl of transcription buffer without micronutrients.42 Stalled C24 TECs were resuspended in 30 μl of transcription buffer without micronutrients and incubated at 55 °C. Aliquots were removed over a period of 15 min and quenched with 95% (v/v) formamide. Transcription products were analyzed by electrophoresis on an 8 M urea/20% acrylamide gel.

UV crosslinking

Stalled TECs bearing the C24 transcript were prepared in high ionic strength buffer and purified as described for the transcript cleavage assays. After purification, complexes were extended to position C25 with the addition of 25 mM CTP and purified again. The C26/C27 RNA transcript bearing the photoreactive adenosine analog, 8-azido adenosine, at its 3′-end was generated with the addition of 50 mM 8-N3ATP (TriLink BioTechnologies, Inc., San Diego, CA). Then 5 mM Tth Gfh1 protein (mutant or wild-type) was added to 13.5 μl portions of the C26/C27-containing TEC in a 96-well microtiter plate. The reactions were exposed to long wavelength UV light (254 nm) for 10 min at a distance of 1 cm at 55 °C. Reactions were quenched with 15 μl of 2× J buffer (100 mM Tris–acetate (pH 6.8), 4% (w/v) SDS, 100 mM DTT) and analyzed by SDS-PAGE (15% (w/v) acrylamide) followed by autoradiography.

Taq Gfh1 crystallization

Taq Gfh1 protein was expressed and purified essentially as described,37 except that the protein was purified on a Superdex 75 column as a last step in 50 mM Tris–HCl (pH 7.5), 200 mM NaCl. The sample was concentrated to 10 mg/ml by centrifugal filtration, and the buffer exchanged against 10 mM Tris–HCl (pH 8.0), 50 mM NaCl, 0.1 mM EDTA, 1 mM DTT. Initial clusters of crystals were obtained by vapor-diffusion in two to three days using the Hampton Research Natrix screen at 22 °C with drops containing 1 μl of protein solution and 1 μl of well solution. The crystallization conditions were refined and crystals up to 0.7 mm × 0.4 mm × 0.3 mm grew in about a week from drops containing 3 μl of protein solution at 6 mg/ml and 3 μl of the reservoir solution with crystallization buffer (50 mM sodium cacodylate (pH 6.0), 10 mM MgSO4, 1.9 M Li2SO4, 1% (v/v) ethylene glycol). The crystals were transferred to crystallization buffer but with 20% ethylene glycol and flash-frozen in liquid ethane for cryocrystallography.

Crystals of selenomethionyl-substituted Taq Gfh1 grew in about a week.43 Single crystals were obtained by microseeding in drops composed of 2 μl of protein solution at 2.3 mg/ml plus 1 μl of well solution containing 50 mM sodium cacodylate (pH 6.0), 10 mM MgSO4, 1.8 M Li2SO4, 5% ethylene glycol. They were prepared and frozen for cryocrystallography using the same procedure as that used for the native crystals.

Structure determination

Native and multiwavelength anomalous dispersion (MAD) data (Table 1) were collected at the National Synchrotron Light Source (Brookhaven, NY) beamlines X9A and X25, respectively. MAD data were collected at three wavelengths corresponding to the peak, the inflection, and one remote value of the X-ray absorption spectrum (λ1, λ2, λ3 respectively, Table 1). The data were processed using DENZO and SCALEPACK.44 Although microseeding was used to produce large single crystals, none of the crystals diffracted better than 3.2 Å resolution for either the native or the selenomethionyl-substituted protein. Nine out of 12 possible Se sites in the asymmetric unit were located using the anomalous signal from SeMet(λ1) with the program SnB.45 Difference Fourier techniques located the additional three sites. Phases were calculated using MLPHARE,46 using SeMet(λ1) as a reference. Anomalous signals from the three MAD wavelengths were used to generate an electron density map. Density modification to 3.3 Å using SOLOMON47 yielded clear density for three molecules in the asymmetric unit (solvent content of 65%, v/v). The NTD coiled-coil of Ec GreA (PDB ID, 1GRJ)29 was fit manually into the density using the program O,48 using the Se sites as sequence position markers.
The CTD of Ec GreA was docked independently from the NTD because of the altered domain orientation. The maps were improved using cycles of refinement with CNS against the native amplitudes.49 Final rounds of refinement were performed using REFMAC,50 incorporating TLS restraints.51 PROCHECK revealed no residue in disallowed (φ, ψ) regions of the Ramachandran plot.52
Tth GreA homology model
The Tth GreA homology model was generated using the program Modeller,53 with the Ec GreA structure as template (PDB ID, 1GRJ).29

Table 1. Crystallographic analysis

Diffraction data
Data set    Wavelength (Å)  Resolution (Å)      Number of reflections (total/unique)  Completeness (%)  I/σ         Rsym [a] (%)  No. sites  Phasing power [b] (30–3.3 Å)
Native      0.9795          30–3.2 (3.31–3.20)  38,872/12,359                         99.2 (99.0)       18.5 (2.2)  6.4 (50.2)    -          -
SeMet (λ1)  0.97857         50–3.3 (3.42–3.30)  43,102/20,751                         96.1 (95.6)       21.3 (2.5)  4.4 (42.6)    9          1.31
SeMet (λ2)  0.97885         50–3.3 (3.42–3.30)  44,077/20,995                         97.3 (98.0)       20.6 (2.1)  4.6 (53.2)    9          1.25
SeMet (λ3)  0.96486         50–3.3 (3.42–3.30)  44,246/21,098                         97.4 (98.4)       20.5 (2.0)  4.6 (53.5)    9          1.24

Crystal space group                                                  C2
Unit cell parameters
  a (Å)                                                              191.2
  b (Å)                                                              76.4
  c (Å)                                                              53.1
  β (deg.)                                                           101.9
Solvent content (%, v/v) (three molecules in the asymmetric unit)    65.6
Figure of merit [b] (30–3.3 Å)                                       0.3591

Refinement (against native dataset)
  Resolution (Å)                                                     30.0–3.3
  Rcryst/Rfree [c] (%)                                               25.0/32.5

[a] $R_{\mathrm{sym}} = \sum |I - \langle I \rangle| / \sum I$, where $I$ is the observed intensity and $\langle I \rangle$ is the average intensity obtained from multiple observations of symmetry-related reflections.
[b] Phasing power and figure of merit as calculated by MLPHARE.46
[c] $R_{\mathrm{cryst}} = \sum \big| |F_{\mathrm{obs}}| - |F_{\mathrm{calc}}| \big| / \sum |F_{\mathrm{obs}}|$; $R_{\mathrm{free}}$ is $R_{\mathrm{cryst}}$ calculated using 9.7% random data omitted from the refinement.
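The residuals defined in the footnotes to Table 1 can be illustrated with a short sketch. It is not the code used by SCALEPACK, CNS, or REFMAC; the function names and the toy numbers are hypothetical, and the random selection of a 9.7% test set is only one plausible way to set such data aside.

```python
import random


def r_sym(groups):
    """Rsym = sum|I - <I>| / sum(I), summed over groups of
    symmetry-related observations of the same reflection."""
    numerator = 0.0
    denominator = 0.0
    for intensities in groups:
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum||Fobs| - |Fcalc|| / sum|Fobs| over the supplied reflections."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def split_work_free(n_reflections, free_fraction=0.097, seed=0):
    """Randomly set aside a test set (for Rfree); the rest is the working set."""
    rng = random.Random(seed)
    n_free = round(free_fraction * n_reflections)
    free = set(rng.sample(range(n_reflections), n_free))
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


# Toy data (hypothetical values, for illustration only).
toy_groups = [[100.0, 110.0, 90.0], [50.0, 50.0]]
toy_f_obs = [10.0, 20.0, 30.0]
toy_f_calc = [9.0, 22.0, 30.0]

work_idx, free_idx = split_work_free(1000)
print(r_sym(toy_groups), r_factor(toy_f_obs, toy_f_calc), len(free_idx))
```

Rcryst and Rfree use the same `r_factor` expression; they differ only in whether it is evaluated over the working reflections or over the reflections withheld from refinement.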
Protein Data Bank accession code
Atomic coordinates and structure factors for Taq Gfh1 have been deposited in the RCSB Protein Data Bank with accession code 2ETN.
Acknowledgements
We are indebted to the staff at the National Synchrotron Light Source beamlines X9A and X25 for support during data collection. Figures 1(b), 2, 3, and 6 were generated using the program DINO (http://www.dino3d.org). V.L. was supported by a Women in Science Postdoctoral Fellowship at The Rockefeller University. This work was supported, in part, by NIH grants GM54136 to D.A.E. and GM61898 to S.A.D.
References
1. Komissarova, N. & Kashlev, M. (1997). RNA polymerase switches between inactivated and activated states by translocating back and forth along the DNA and the RNA. J. Biol. Chem. 272, 15329–15338.
2. Komissarova, N. & Kashlev, M. (1997). Transcriptional arrest: Escherichia coli RNA polymerase translocates backward, leaving the 3′ end of the RNA intact and extruded. Proc. Natl Acad. Sci. USA, 94, 1755–1760.
3. Nudler, E., Mustaev, A., Lukhtanov, E. & Goldfarb, A. (1997). The RNA–DNA hybrid maintains the register of transcription by preventing backtracking of RNA polymerase. Cell, 89, 33–41.
4. Reeder, T. C. & Hawley, D. K. (1996). Promoter proximal sequences modulate RNA polymerase II elongation by a novel mechanism. Cell, 87, 767–777.
5. Cramer, P., Bushnell, D. A., Fu, J., Gnatt, A. L., Maier-Davis, B., Thompson, N. E. et al. (2000). Architecture of RNA polymerase II and implications for the transcription mechanism. Science, 288, 640–649.
6. Cramer, P., Bushnell, D. A. & Kornberg, R. D. (2001). Structural basis of transcription: RNA polymerase II at 2.8 Å resolution. Science, 292, 1863–1876.
7. Korzheva, N., Mustaev, A., Kozlov, M., Malhotra, A., Nikiforov, V., Goldfarb, A. & Darst, S. A. (2000). A structural model of transcription elongation. Science, 289, 619–625.
8. Zhang, G., Campbell, E. A., Minakhin, L., Richter, C., Severinov, K. & Darst, S. A. (1999). Crystal structure of Thermus aquaticus core RNA polymerase at 3.3 Å resolution. Cell, 98, 811–824.
9. Borukhov, S., Severinov, K., Kashlev, M., Lebedev, A., Bass, I., Rowland, G. C. et al. (1991). Mapping of trypsin cleavage and antibody-binding sites and delineation of a dispensable domain in the β subunit of Escherichia coli RNA polymerase. J. Biol. Chem. 266, 23921–23926.
10. Epshtein, V., Mustaev, A., Markovtsov, V., Bereshchenko, O., Nikiforov, V. & Goldfarb, A. (2002). Swing-gate model of nucleotide entry into the RNA polymerase active center. Mol. Cell. 10, 623–634.
11. Markovtsov, V., Mustaev, A. & Goldfarb, A. (1996). Protein–RNA interactions in the active center of transcription elongation complex. Proc. Natl Acad. Sci. USA, 93, 3221–3226.
12. Borukhov, S., Sagitov, V. & Goldfarb, A. (1993). Transcript cleavage factors from E. coli. Cell, 72, 459–466.
13. Izban, M. G. & Luse, D. S. (1992). The RNA polymerase II ternary complex cleaves the nascent transcript in a 3′–5′ direction in the presence of elongation factor SII. Genes Dev. 6, 1342–1356.
14. Surratt, C. K., Milan, S. C. & Chamberlin, M. J. (1991). Spontaneous cleavage of RNA in ternary complexes of Escherichia coli RNA polymerase and its significance for the mechanism of transcription. Proc. Natl Acad. Sci. USA, 88, 7983–7987.
15. Awrey, D. E., Weilbacher, R. G., Hemming, S. A., Orlicky, S. M., Kane, C. M. & Edwards, A. M. (1997). Transcription elongation through DNA arrest sites. A multistep process involving both RNA polymerase II subunit RPB9 and TFIIS. J. Biol. Chem. 272, 14747–14754.
16. Orlova, M. (1995). Intrinsic transcript cleavage activity of RNA polymerase. PhD thesis, Institute of Molecular Genetics, Russian Academy of Science, Moscow.
17. Rudd, M. D., Izban, M. G. & Luse, D. S. (1994). The active site of RNA polymerase II participates in transcript cleavage within arrested ternary complexes. Proc. Natl Acad. Sci. USA, 91, 8057–8061.
18. Wang, D. & Hawley, D. K. (1993). Identification of a 3′→5′ exonuclease activity associated with human RNA polymerase II. Proc. Natl Acad. Sci. USA, 90, 843–847.
19. Borukhov, S. & Goldfarb, A. (1996). Purification and assay of Escherichia coli transcript cleavage factors GreA and GreB. Methods Enzymol. 274, 315–326.
20. Borukhov, S., Polyakov, A., Nikiforov, V. & Goldfarb, A. (1992). GreA protein: a transcription elongation factor from Escherichia coli. Proc. Natl Acad. Sci. USA, 89, 8899–8902.
21. Fish, R. N. & Kane, C. M. (2002). Promoting elongation with transcript cleavage stimulatory factors. Biochim. Biophys. Acta, 1577, 287–307.
22. Reines, D., Chamberlin, M. J. & Kane, C. M. (1989). Transcription elongation factor SII (TFIIS) enables RNA polymerase II to elongate through a block to transcription in a human gene in vitro. J. Biol. Chem. 264, 10799–10809.
23. Trautinger, B. W., Jaktaji, R. P., Rusakova, E. & Lloyd, R. G. (2005). RNA polymerase modulators and DNA repair activities resolve conflicts between DNA replication and transcription. Mol. Cell. 19, 247–258.
24. Toulme, F., Mosrin-Huaman, C., Sparkowski, J., Das, A., Leng, M. & Rahmouni, A. R. (2000). GreA and GreB proteins revive backtracked RNA polymerase in vivo by promoting transcript trimming. EMBO J. 19, 6853–6859.
25. Marr, M. T. & Roberts, J. W. (2000). Function of transcription cleavage factors GreA and GreB at a regulatory pause site. Mol. Cell. 6, 1275–1285.
26. Erie, D. A., Hajiseyedjavadi, O., Young, M. C. & von Hippel, P. H. (1993). Multiple RNA polymerase conformations and GreA: control of the fidelity of transcription. Science, 262, 867–873.
27. Artsimovitch, I. & Landick, R. (2000). Pausing by bacterial RNA polymerase is mediated by mechanistically distinct classes of signals. Proc. Natl Acad. Sci. USA, 97, 7090–7095.
28. Hsu, L. H., Vo, N. V. & Chamberlin, M. J. (1995). Escherichia coli transcript cleavage factors GreA and GreB stimulate promoter escape and gene expression in vivo and in vitro. Proc. Natl Acad. Sci. USA, 92, 11588–11592.
29. Stebbins, C. E., Borukhov, S., Orlova, M., Polyakov, A., Goldfarb, A. & Darst, S. A. (1995). Crystal structure of the GreA transcript cleavage factor from Escherichia coli. Nature, 373, 636–640.
30. Koulich, D., Nikiforov, V. & Borukhov, S. (1998). Distinct functions of N- and C-terminal domains of GreA, an Escherichia coli transcript cleavage factor. J. Mol. Biol. 276, 379–389.
31. Polyakov, A., Richter, C., Malhotra, A., Koulich, D., Borukhov, S. & Darst, S. A. (1998). Visualization of the binding site for the transcript cleavage factor GreB on Escherichia coli RNA polymerase. J. Mol. Biol. 281, 262–266.
32. Koulich, D., Orlova, M., Malhotra, A., Sali, A., Darst, S. A., Goldfarb, A. & Borukhov, S. (1997). Domain organization of transcript cleavage factors GreA and GreB. J. Biol. Chem. 272, 7201–7210.
33. Opalka, N., Chlenov, M., Chacon, P., Rice, W. J., Wriggers, W. & Darst, S. A. (2003). Structure and function of the transcription elongation factor GreB bound to bacterial RNA polymerase. Cell, 114, 335–345.
34. Laptenko, O., Lee, J., Lomakin, I. & Borukhov, S. (2003). Transcript cleavage factors GreA and GreB act as transient catalytic components of RNA polymerase. EMBO J. 23, 6322–6334.
35. Sosunova, E., Sosunov, V., Kozlov, M., Nikiforov, V., Goldfarb, A. & Mustaev, A. (2003). Donation of catalytic residues to RNA polymerase active center by transcription factor Gre. Proc. Natl Acad. Sci. USA, 100, 15469–15474.
36. Kulish, D., Lee, J., Lomakin, I., Nowicka, B., Das, A., Darst, S. A. et al. (2000). The functional role of basic patch, a structural element of Escherichia coli transcript cleavage factors GreA and GreB. J. Biol. Chem. 275, 12789–12798.
37. Hogan, B. P., Hartsch, T. & Erie, D. A. (2002). Transcript cleavage by Thermus thermophilus RNA polymerase. Effects of GreA and anti-GreA factors. J. Biol. Chem. 277, 967–975.
38. Loizos, N. & Darst, S. A. (1999). Mapping interactions of Escherichia coli GreB with RNA polymerase and ternary elongation complexes. J. Biol. Chem. 274, 23378–23386.
39. Sosunov, V., Sosunova, E., Mustaev, A., Bass, I., Nikiforov, V. & Goldfarb, A. (2003). Unified two-metal mechanism of RNA synthesis and degradation by RNA polymerase. EMBO J. 22, 2234–2244.
40. Steitz, T. A. (1998). A mechanism for all polymerases. Nature, 391, 231–232.
41. Xue, Y., Hogan, B. P. & Erie, D. A. (2000). Purification and initial characterization of RNA polymerase from Thermus thermophilus. Biochemistry, 39, 14356–14362.
42. Foster, J. E., Holmes, S. F. & Erie, D. A. (2001). Allosteric binding of nucleoside triphosphates to RNA polymerase regulates transcription elongation. Cell, 106, 243–252.
43. Doublie, S. (1997). Preparation of selenomethionyl proteins for phase determination. Methods Enzymol. 276, 523–530.
44. Otwinowski, Z. & Minor, W. (1997). Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326.
45. Weeks, C. M. & Miller, R. (1999). The design and implementation of SnB v2.0. J. Appl. Crystallog. 32, 120–124.
46. Otwinowski, Z. (1991). Maximum likelihood refinement of heavy-atom parameters in isomorphous replacement and anomalous scattering. In Proceedings of the CCP4 Study Weekend (Wolf, W., Evans, P. R. & Leslie, A. G. W., eds), pp. 80–86, SERC Daresbury Laboratory, Warrington, UK.
47. Abrahams, J. P. & Leslie, A. G. W. (1996). Methods used in the structure determination of bovine mitochondrial F1 ATPase. Acta Crystallog. sect. D, 52, 30–42.
48. Jones, T. A., Zou, J.-Y., Cowan, S. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallog. sect. A, 47, 110–119.
49. Adams, P. D., Pannu, N. S., Read, R. J. & Brunger, A. T. (1997). Cross-validated maximum likelihood enhances crystallographic simulated annealing refinement. Proc. Natl Acad. Sci. USA, 94, 5018–5023.
50. Murshudov, G. N., Vagin, A. A. & Dodson, E. J. (1997). Refinement of macromolecular structures by the maximum-likelihood method. Acta Crystallog. sect. D, 53, 240–255.
51. Winn, M. D., Isupov, M. N. & Murshudov, G. N. (2001). Use of TLS parameters to model anisotropic displacements in macromolecular refinement. Acta Crystallog. sect. D, 57, 122–133.
52. Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallog. 26, 283–291.
53. Sali, A., Potterton, L., Yuan, F., van-Vlijmen, H. & Karplus, M. (1995). Evaluation of comparative protein modeling by MODELLER. Proteins: Struct. Funct. Genet. 23, 318–326.
54. Nicholls, A., Sharp, K. A. & Honig, B. (1991). Protein folding and association: insights from the interfacial and thermodynamic properties of hydrocarbons. Proteins: Struct. Funct. Genet. 11, 281–296.
Edited by K. Morikawa
(Received 17 October 2005; received in revised form 28 October 2005; accepted 30 October 2005)
Available online 17 November 2005
Note added in proof: After submission of this article, we became aware of a paper describing the structure of Thermus thermophilus Gfh1 (Symersky et al. (2005). J. Biol. Chem. in the press), which leads to many of the same conclusions as our article.