[ 12]
RNA-PROTEIN INTERACTIONS
261
Prediction of RNA Secondary Structure
Another thermodynamic property of RNA that can be predicted in principle is the equilibrium folding. This may or may not be the physiologically important folding, depending on the kinetics of folding. Attempts to predict RNA secondary structures on the basis of the thermodynamic data described above, with slight modifications, have been about 70% successful when compared with those determined by phylogenetic analysis and/or chemical mapping. In phylogenetic analysis, sequences for RNA molecules with similar functions are compared to find common folding.25 The cloverleaf secondary structure for tRNA is a typical example. In chemical mapping, the RNA is allowed to react with reagents selective for single- or double-stranded regions.26 The reactivity of each nucleotide then provides constraints on possible secondary structures. Presumably, both phylogenetic analysis and chemical mapping reflect physiologically important secondary structures. The similarity to structures predicted from thermodynamic considerations suggests that thermodynamics is at least a major determinant of secondary structure folding. In practice, the most powerful way to deduce the secondary structure from sequence is to combine all of these methods.
Acknowledgments
This work is supported by NIH Grant GM22939 (D. H. T.), NIH Grant GM49429 (M. J. S.), the Research Corporation (M. J. S.), and the DANA Foundation. D. H. T. is a Guggenheim Fellow and an American Cancer Society Scholar.
25 C. R. Woese and N. R. Pace, in "The RNA World" (R. F. Gesteland and J. F. Atkins, eds.), Chapter 4. Cold Spring Harbor Lab. Press, Plainview, NY, 1993.
26 R. Parker, this series, Vol. 180, p. 510.
[12] Thermodynamics and Mutations in RNA-Protein Interactions
By KATHLEEN B. HALL and JAMES K. KRANZ
To describe the association between an RNA and a protein, it is necessary to define the local interactions between nucleotides and amino acids and also to determine the energetics of the association. The local interactions will show how the specificity of the association is conferred; the energetics will provide the assembly parameters that encompass both the individual interactions and their interdependence.
METHODS IN ENZYMOLOGY, VOL. 259
Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.
ENERGETICS OF BIOLOGICAL MACROMOLECULES
To predict the properties of an RNA-protein interaction, it is necessary to know how the specificity and affinity of the interaction are controlled. Ideally, the details of the association might include how the RNA phosphate backbone is used in electrostatic interactions, where hydrogen bonds are formed between RNA and protein, whether the two molecules associate to form a hydrophobic core of aromatic amino acids and nucleotides, where water molecules and counterions are used in the interaction, and whether, in order to form these interactions, there is any conformational rearrangement of RNA or protein. While the thermodynamic parameters of the interaction will certainly not provide all these details, they can suggest which features are likely to be important for the interaction, and provide a framework in which to construct an accurate model of the complex.
One simple approach to uncover the interactions and energetics of RNA-protein complexes is to make a mutation in the RNA sequence, then measure the affinity of the protein for this RNA variant. Through a comparison of the affinity of the mutant and wild-type RNAs, and the corresponding free energy of association for these complexes, the contribution of a specific RNA nucleotide or structural element to the association can be assessed. This approach can identify sites of the RNA that participate in complex formation, as well as suggest how much those sites contribute to the total free energy of association. Naturally, the structural integrity (both secondary and tertiary) of any RNA mutant must be determined in order to ascribe an observed change in affinity accurately to the substitution.
With this caution in mind, however, the measurement of a ΔΔG° of complex association, as a function of mutations in the RNA, has proved to be a valuable means of describing several RNA-protein interactions: TFIIIA with 5S RNA,1 S4 with its mRNA pseudoknot,2 R17 coat protein with its RNA hairpin,3 and, as shown here, U1A and an RNA hairpin.4
Although the free energy (ΔG°) of the association provides the overall description of the system, it is usually desirable to have more information about the driving forces of complex formation. Defining the entropic and enthalpic contributions to the free energy will allow a more complete understanding of how the RNA and protein associate.5 As illustrated here, the determination of these thermodynamic parameters is extended to complexes formed with mutant RNA sequences, in an attempt to understand the origin of the observed differences in free energy of association. The methods used and the analysis of the data should be applicable to other RNA-protein interactions, and the complexity of the system is likely to be characteristic of these associations.
1 P. J. Romaniuk, Nucleic Acids Res. 13, 5369 (1985).
2 C. K. Tang and D. E. Draper, Biochemistry 29, 4434 (1990).
3 P. J. Romaniuk, P. Lowary, H. N. Wu, G. Stormo, and O. C. Uhlenbeck, Biochemistry 26, 1563 (1987).
4 K. B. Hall, Biochemistry 33, 10076 (1994).
5 J. H. Ha, R. S. Spolar, and M. T. Record, J. Mol. Biol. 209, 801 (1989).
RNA and U1A Protein
We have used the interaction of the human U1A protein and its RNA hairpin substrate as a model system to demonstrate the data and the analysis necessary to interpret the binding of RNA to protein. The human U1A protein is a 282-amino acid protein associated with the U1 snRNP (small nuclear ribonucleoprotein particle). It contains two domains, at the N terminus and the C terminus, that have been identified as RNA-binding domains (RBDs) or RNA recognition motifs (RRMs).6,10 The 102-amino acid N-terminal RBD binds specifically to stem-loop II of the U1 snRNA7 and, as an autonomous domain, can bind specifically to a short RNA hairpin containing the snRNA loop II sequence.9 The N-terminal 95-amino acid RBD has been crystallized8 and shown to consist of a βαβ-βαβ motif. We have used the U1A N-terminal 102-amino acid RBD (102A) together with a short 26-nucleotide RNA hairpin as a model system to describe the sequence dependence of this association and its energetics. The protein is purified from an Escherichia coli overexpression system9 and the RNAs are synthesized either chemically or enzymatically. This in vitro system is readily manipulated, as the protein is monomeric and stable, and the RNAs are relatively simple.
RNA Preparation and Characterization
RNA Synthesis
RNA molecules were synthesized enzymatically from short DNA oligonucleotide substrates by T7 RNA polymerase11 or by SP6 RNA polymerase,12 using enzymes purified in the laboratory. Figure 1 gives the sequences of the wild-type transcripts and the numbering system used to indicate substitutions. Molecules were also chemically synthesized using phosphoramidites from Milligen Biosearch (Burlington, MA) or Glen Research (Sterling, VA). RNAs were labeled in enzymatic syntheses with [α-32P]CTP and/or [α-32P]UTP for use in nitrocellulose filter-binding assays. Chemically synthesized RNAs were labeled at the 5' end with polynucleotide kinase and [γ-32P]ATP for use in binding assays. RNA was purified from 20% polyacrylamide-8 M urea gels by soaking in 0.3 M sodium acetate overnight for 32P-labeled samples and by electroelution (Schleicher & Schuell, Keene, NH) for unlabeled samples.
For binding experiments, the concentration of RNA was determined from the specific activity of the incorporated radiolabeled nucleotide. For melting experiments, the RNA concentration was determined spectrophotometrically, using the appropriate extinction coefficient. RNAs used for thermal melting analysis were dialyzed against MilliQ (Millipore, Bedford, MA) water, then lyophilized.
[Figure 1: secondary-structure diagram of the wild-type RNA hairpin (stem and loop, with nucleotide numbering)]
FIG. 1. Sequence and numbering scheme for the wild-type RNA hairpin.
6 R. J. Bandziulis, M. S. Swanson, and G. Dreyfuss, Genes Dev. 3, 431 (1989).
7 D. Scherly, W. Boelens, W. J. van Venrooij, N. A. Dathan, J. Hamm, and I. W. Mattaj, EMBO J. 8, 4163 (1989).
8 K. Nagai, C. Oubridge, T. H. Jessen, J. Li, and P. R. Evans, Nature (London) 348, 515 (1990).
9 K. B. Hall and W. T. Stump, Nucleic Acids Res. 20, 4283 (1992).
10 E. Birney, S. Kumar, and A. R. Krainer, Nucleic Acids Res. 21, 5805 (1993).
11 J. F. Milligan, D. R. Groebe, G. W. Witherell, and O. C. Uhlenbeck, Nucleic Acids Res. 15, 8783 (1987).
12 W. T. Stump and K. B. Hall, Nucleic Acids Res. 21, 5480 (1993).
Properties of RNA Hairpins
Each RNA hairpin used for binding assays was also used in thermal melting experiments to ensure that it formed a monomer in solution, because any potential monomer-dimer equilibrium would interfere with the interpretation of the binding results. The absorbance-vs-temperature data for the hairpins were measured at 260 nm in a Gilford 250 spectrophotometer (Oberlin, OH) interfaced to a PC. The salt concentration varied, consisting of 250 mM NaCl, 10 mM sodium cacodylate (pH 6 or 7), with or without MgCl2 (or 100 mM NaCl with or without MgCl2), to match the conditions of binding experiments. For a monomolecular transition, there should be no concentration dependence of the melting temperature, and deviations from this observation would indicate that the RNAs were forming dimers. The concentrations of the RNAs measured varied from 10⁻³ to 10⁻⁷ M, using cuvettes with pathlengths from 1 to 0.01 cm (NGS Precision Cells, Inc., Farmingdale, NY).
The absorbance-vs-temperature profiles of all RNAs were measured both to determine if they were in fact monomeric hairpins, and to observe any changes in the melting profile as a function of the substitutions. With one exception, all RNAs showed an upper melting temperature (Tm) that was independent of RNA concentration, indicating that these are monomeric species. The lower melting transition was dependent on the RNA sequence, and was generally quite broad. Because the filter-binding assays were done under conditions in which the RNA concentration was 10⁻¹¹ or 10⁻¹² M, the RNAs were certain to be monomers (hairpins), and thus the results obtained could not be ascribed to RNA dimers, which are not likely to be substrates for the protein. Melting profiles of the SP6 wild-type hairpin without MgCl2 and of several mutant T7 RNAs are shown in Fig. 2.
[Figure 2: absorbance at 260 nm versus temperature (°C, 10 to 90) for the wild-type (WT), C13U, and G12A RNAs]
FIG. 2. Thermal melting of the wild-type RNA, C13U, and G12A mutant RNAs. RNA concentration is 10⁻⁵ M; buffer is 250 mM NaCl, 10 mM sodium cacodylate (pH 7.0), 1 mM MgCl2.
These thermal melting data confirmed that the RNAs adopted the predicted conformation, which is a critical feature for interpretation of binding affinities. These experiments also showed that the hairpin is stable. Its high melting temperature makes it unlikely that the duplex is disrupted in association with the RBD.
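The concentration test used above can be motivated by a standard two-state argument, sketched here as an aside rather than as part of the original analysis. It assumes an all-or-none transition with temperature-independent ΔH° and ΔS°; the symbol C_T (total strand concentration) is introduced here for illustration only. For a hairpin the melting temperature contains no concentration term, whereas for a dimer of self-complementary strands 1/Tm varies linearly with ln C_T:

```latex
% Two-state sketch. Assumptions: \Delta H^\circ and \Delta S^\circ do not depend on T;
% C_T is the total strand concentration (a symbol introduced for this illustration).
\begin{align*}
\text{hairpin (monomolecular):}\quad
  & K(T_m) = 1
  \;\Rightarrow\; T_m = \frac{\Delta H^\circ}{\Delta S^\circ} \\
\text{self-complementary dimer:}\quad
  & K(T_m) = \frac{1}{C_T}
  \;\Rightarrow\; \frac{1}{T_m} = \frac{R}{\Delta H^\circ}\,\ln C_T
    + \frac{\Delta S^\circ}{\Delta H^\circ}
\end{align*}
```

Under these assumptions, a Tm that stays constant over the 10⁻³ to 10⁻⁷ M range examined is what a monomolecular hairpin transition would be expected to show.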
Method and Analysis of Binding Assays
Nitrocellulose filter binding was the method used to determine the binding affinity of the complexes. Duplicate experiments were repeated at least twice for every RNA or solution condition. For temperature experiments, a Schleicher & Schuell 0.2-μm pore size supported nitrocellulose membrane presoaked in the appropriate binding buffer was used in a modified dot-blot apparatus,13 which was either chilled (4°), warmed (30 or 40°), or left at room temperature (22°), and the samples were incubated in a polypropylene microtiter dish resting in the appropriate water bath. The membrane was lightly blotted with a Kimwipe after soaking to remove the excess buffer. Without such blotting, the samples applied tended to diffuse over the surface. After filtering, the underside of the membrane was blotted again to remove excess radioactive buffer that would otherwise diffuse across the surface and obscure the individual dots. With this method, a complete titration in the 96-well apparatus uses a single filter to collect all the points, eliminating the variability in individual nitrocellulose filters. This method can accommodate several binding isotherms on a single filter. For the 10° experiments, individual Schleicher & Schuell 0.45-μm pore size filters were presoaked at least 30 min in the binding buffer on ice, and the samples in microtubes were incubated in a Fotodyne Biochiller 2000 (Hartland, WI).
Bound radiolabeled RNA was quantified using a Betagen Betascope blot reader (model 603) (Mountain View, CA), and the retained counts (B) were normalized to the total RNA (T) present. The RNA bound to the filter in the absence of protein was designated as the background (O), and this value was subtracted from each data point, to give (B − O)/T = FB (fraction bound). Complex formation is assumed to be described by a bimolecular association, based on previous experiments that showed the stoichiometry of binding to be 1:1. The data were therefore fitted to a Langmuir isotherm to determine the equilibrium constant of the association. The association is bimolecular:
[RNA] + [102A] ⇌ [R:P]
13 I. Wong and T. M. Lohman, Proc. Natl. Acad. Sci. U.S.A. 90, 5428 (1993).
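The normalization of filter counts described above, and the simplest form of a Langmuir isotherm, can be sketched as follows. This is an illustrative sketch only: the function names and the example counts are hypothetical, and the hyperbolic form assumes that the RNA concentration is far below KD, so that free protein is approximately equal to total protein.

```python
def fraction_bound_from_counts(retained, total, background):
    """Return FB = (B - O) / T from raw filter counts.

    retained   -- counts retained on the filter at one protein concentration (B)
    total      -- total counts of RNA applied (T)
    background -- counts retained in the absence of protein (O)
    """
    return (retained - background) / total


def langmuir_fraction(protein, kd, plateau=1.0):
    """Hyperbolic (Langmuir) isotherm for 1:1 binding.

    Assumes [RNA] << KD so that free protein ~ total protein.
    `plateau` is a hypothetical scale factor for incomplete filter retention.
    """
    return plateau * protein / (kd + protein)


# Hypothetical counts for a single well: B = 650, T = 1000, O = 50.
fb = fraction_bound_from_counts(650.0, 1000.0, 50.0)

# At a protein concentration equal to KD, half of the RNA is bound.
half = langmuir_fraction(1e-10, 1e-10)
```

When the RNA concentration is not negligible relative to KD, the hyperbolic form is no longer adequate and the full mass-balance expression is needed.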
At equilibrium, the concentrations of free RNA and free 102A are each lower than the totals, [RNA] and [102A], by the amount [R:P], and the equilibrium constant can be written
\[ K_{eq} = K_D = \frac{([\mathrm{RNA}] - [\mathrm{R{:}P}])\,([\mathrm{102A}] - [\mathrm{R{:}P}])}{[\mathrm{R{:}P}]} \]
Multiplying out this expression leads to a quadratic equation in [R:P], which is normalized by dividing the expression by [RNA]. The expression is solved for [R:P] at each value of [102A], with Keq as the variable. Fitting is done by nonlinear regression using KaleidaGraph software (Synergy Software, Reading, PA) on an Apple Macintosh.
Retention efficiency in the filter-binding experiments varied with the salt concentration and temperature, and values typically ranged from 30 to 60%. Such low retention efficiencies seem to be typical of RNA-protein systems.1,14 For proper analysis of the binding data, the retentions are normalized.
To determine the binding affinity of the RNAs for the 102A RBD, the RNA concentration is held fixed at a low value, while the protein concentration is varied. The RNA is radiolabeled in these experiments, and for the experiments with the wild-type sequence, it was necessary to incorporate two [α-32P]NTPs in order to keep the RNA concentration below the measured KD. The use of both [α-32P]ATP and [α-32P]CTP in the transcription reaction at a concentration of 48 μCi/nmol produces RNA with sufficiently high specific activity for the binding reactions. For example, in a 200-μl reaction, 300 cpm of RNA is equivalent to 2.5 × 10⁻¹² M RNA. In addition to the RNA, the reactions contain 20 μg of bovine serum albumin (BSA) and 10 μg of tRNA. After incubation for 20 min, the samples are filtered, with no subsequent washing. All experiments are done in duplicate and repeated at least twice. Measured dissociation constants (KD) typically vary by less than 50%, and the values reported are averages. Variability in these values seems to be primarily due to the RNA preparation.1,9
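The quadratic binding expression described above can be sketched in a few lines. This is a minimal illustration, not the authors' fitting procedure: the function names are hypothetical, the `retention` factor is an assumed way of representing the normalized filter retention, and a simple grid search over log10(KD) stands in for the nonlinear regression performed in commercial software.

```python
import math


def complex_conc(rna_total, protein_total, kd):
    """Physical root of [R:P]^2 - (R + P + KD)[R:P] + R*P = 0.

    R and P are total concentrations (M). The root is written as
    2RP / (b + sqrt(b^2 - 4RP)), which equals (b - sqrt(...)) / 2 but
    avoids cancellation when R is much smaller than KD.
    """
    b = rna_total + protein_total + kd
    disc = b * b - 4.0 * rna_total * protein_total
    return 2.0 * rna_total * protein_total / (b + math.sqrt(disc))


def predicted_fraction_bound(rna_total, protein_total, kd, retention=1.0):
    """Fraction of RNA retained on the filter for a 1:1 complex."""
    return retention * complex_conc(rna_total, protein_total, kd) / rna_total


def fit_kd(protein_totals, observed, rna_total, retention=1.0,
           log_lo=-13.0, log_hi=-5.0, steps=1601):
    """Least-squares estimate of KD by grid search over log10(KD)."""
    best_kd, best_sse = None, float("inf")
    for i in range(steps):
        kd = 10.0 ** (log_lo + (log_hi - log_lo) * i / (steps - 1))
        sse = sum(
            (predicted_fraction_bound(rna_total, p, kd, retention) - y) ** 2
            for p, y in zip(protein_totals, observed)
        )
        if sse < best_sse:
            best_kd, best_sse = kd, sse
    return best_kd


# Synthetic titration (hypothetical values): 2.5e-12 M RNA, KD = 1e-10 M,
# 50% retention, protein from 1e-12 to 1e-7 M in half-log steps.
proteins = [10.0 ** (-12 + 0.5 * k) for k in range(11)]
data = [predicted_fraction_bound(2.5e-12, p, 1e-10, 0.5) for p in proteins]
kd_est = fit_kd(proteins, data, rna_total=2.5e-12, retention=0.5)
```

With noise-free synthetic data the grid search recovers the input KD to within the grid spacing; with real data, a proper nonlinear regression that also reports parameter uncertainties would be the more appropriate tool.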
For the experiments described here, the standard solution conditions were 200 mM NaCl, 10 mM sodium cacodylate, 1 mM MgCl2, pH 6.
Energetics of Association
Free Energy
Binding of the N-terminal RBD of U1A (102A) to the short wild-type RNA hairpin is extremely tight: previous nitrocellulose filter-binding experiments have shown that in 250 mM NaCl, 10 mM sodium cacodylate (pH 6), 1 mM MgCl2 at room temperature, the association constant (KA) is 2.5 (±1) × 10⁹ M⁻¹, with a free energy of complex formation of ΔG° = −12.8 kcal/mol. Given this tight association of wild-type RNA with 102A, we wanted to identify those elements of the RNA hairpin that conferred this affinity, and those that were responsible for the specificity.
Five RNAs are used for these binding experiments to 102A, to illustrate some of the complexity found with this system. These are the wild-type RNA hairpin, the G12A, C13U, and A9C substitutions, and an RNA in which all rCs are changed to dC. The G12A and C13U substitutions are in the conserved region of the loop, the A9C substitution at the base of the loop was designed to measure the effect of a nucleotide replacement outside of the conserved region, and binding of the dC RNA hairpin should suggest the contribution of some of the ribose hydroxyls to the interaction.
The binding curves of the RNAs with 102A at 22° are shown in Fig. 3. There is clearly a large change in the affinity for the protein as a function of single substitutions in the RNA. The loss of affinity is greatest for the G12A substitution, with a loss of 4.9 kcal/mol of binding free energy; a similar reduction in ΔG° is observed for the A14G substitution,4 which makes these two nucleotides the most critical ones identified for this interaction. The binding constants and the free energies are given in Table I. In contrast to the G12A mutation, the observed loss (ΔΔG°) of only 1.7 kcal/mol of binding free energy for the C13U substitution shows that not all nucleotides in this phylogenetically conserved region cause comparable disruption of the complex. In fact, the A9C mutation at the base of the loop is more disruptive of the association than is the C13U mutation. The dC substitutions cost only 1 kcal/mol of binding free energy on loss of these eight hydroxyl groups, suggesting that they do not play a substantial part in the affinity of the complex.
[Figure 3: fraction of RNA bound versus protein concentration (logarithmic [protein] axis) for the wild-type, dC, C13U, A9C, and G12A RNAs]
FIG. 3. Binding isotherms for the interaction between 102A and RNAs at 22° in 200 mM NaCl, 10 mM sodium cacodylate (pH 6), 1 mM MgCl2. Lines are the calculated fits to the raw data. Separate symbols denote the wild-type, dC, C13U, A9C, and G12A RNAs.
14 J. Carey and O. C. Uhlenbeck, Biochemistry 22, 2610 (1983).
RNA Mutations and Free Energy of Association. One idea behind the use of single nucleotide substitutions for comparisons of binding affinity is that the effect observed might reflect the contribution of that single nucleotide to the total affinity. In theory, then, the ΔG° of association (−13.4 kcal/mol for 102A-wild-type RNA association; Table I) can be reduced to the sum of the individual interactions. However, the magnitude of the ΔΔG° for the G12A mutant suggests that this substitution is affecting more than only interactions specific to this position in the RNA. In other words, the failure to make the (putative) normal G12 contacts compromises the formation of other contacts, such that there is additional loss of free energy or lower affinity. This structural interpretation invokes specific contacts between RNA and protein; however, the free energy loss on complex formation could also have its origin in the increased energetic cost of associating with an RNA loop that has adopted a new conformation, has different ion associations, or is structurally dynamic rather than rigid, and so offers fewer opportunities for association with the protein.
In any case, these data suggest that the introduction of substitutions in the RNA as a way to identify the sources of specificity and affinity for this system will suffer from the complication that the effect will be pleiotropic, affecting other potential contacts as well as the ones intentionally disturbed.

TABLE I
BINDING AFFINITIES OF RNAs TO 102A(a)

RNA          KD (M)              ΔG° (kcal/mol)    ΔΔG° (kcal/mol)
Wild type    1 (±1) × 10⁻¹⁰      −13.4
G12A         1 (±1) × 10⁻⁷       −8.5              4.9
C13U         2 (±1) × 10⁻⁹       −11.7             1.7
A9C          8 (±2) × 10⁻⁹       −10.9             2.5
dC           4 (±1) × 10⁻¹⁰      −12.5             0.9

(a) Binding measured by nitrocellulose filter binding, at 200 mM NaCl, 10 mM sodium cacodylate (pH 6), 1 mM MgCl2, 22°. ΔΔG° = ΔG°(wild type) − ΔG°(mutant).
Implicit in the idea of interdependence of interactions and subsequent nonadditivity of their free energies is that the association of the two components is accompanied by a conformational change in one or both molecules to form the complex. In this induced fit model of interacting RNA and protein, the energetically most favorable interaction might include a conformational change of the RNA loop to provide sequence-specific contacts between the nucleotides and amino acids. Energetically, this could involve burying the hydrophobic bases in the protein-RNA interface.5 A stunning example of such a phenomenon is found in the association between tRNAGln and its synthetase; the cocrystal shows that the anticodon loop is turned inside out to bury the three nucleotides of the anticodon in sequence-specific pockets in the synthetase.15 The small size of the 102A RBD precludes such a dramatic interaction with its RNA, but nuclear magnetic resonance (NMR) data indicate that there is a conformational change in the RNA loop when it is bound to the protein.4 These preliminary structural data support the induced fit model of this interaction.
All the data suggest that this association cannot be described as a simple docking of components with additive and separable energetic interactions. This observation of interdependent interactions is not novel; the same problem of interpretation of mutational effects has been observed with many DNA-protein systems, in which the loss of one contact means that others are also lost. Deconvoluting the interdependencies has been dealt with elegantly by Lesser et al.16 for the EcoRI-DNA complex. In those experiments, the gross substitution made here of a U for a C is taken to an unprecedented level of refinement, through the deletion of a single amino group on a single A, or a ring nitrogen on a purine.
One additional parameter in the analysis of an RNA-protein association is the effect of a mutation on the stability and conformation of the RNA structure. This potential for conformational flexibility is peculiar to RNA molecules (and proteins), and is conspicuously absent from double-stranded DNA.
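The free energies listed in Table I follow from the measured dissociation constants through ΔG° = RT ln(KD). The short sketch below illustrates the arithmetic; the function names are hypothetical, a 1 M standard state is assumed, and the free energy lost on mutation is returned as a positive number for weaker binding, matching the magnitudes listed in the table.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol K)


def delta_g_from_kd(kd_molar, temp_c=22.0):
    """Free energy of association (kcal/mol) from a dissociation constant.

    Uses dG = RT ln(KD) with a 1 M standard state; a tighter complex
    (smaller KD) gives a more negative value.
    """
    return R_KCAL * (temp_c + 273.15) * math.log(kd_molar)


def binding_energy_lost(kd_mutant, kd_wild_type, temp_c=22.0):
    """Free energy lost on mutation (kcal/mol), RT ln(KD_mut / KD_wt).

    Positive when the mutant binds more weakly than the wild type.
    """
    return R_KCAL * (temp_c + 273.15) * math.log(kd_mutant / kd_wild_type)


# Wild type (KD = 1e-10 M) and C13U (KD = 2e-9 M) at 22 degrees.
dg_wt = delta_g_from_kd(1e-10)
ddg_c13u = binding_energy_lost(2e-9, 1e-10)
```

For these two entries the recomputed values land within about 0.1 kcal/mol of the tabulated ones; exact agreement should not be expected, because the tabulated KD values are rounded to one significant figure.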
Enthalpy and Entropy
Although the free energy of association provides an indication of the stability of the interaction, to describe the system more completely the driving forces of the interaction need to be defined. The enthalpy of the association is readily obtained with the methods described, and the entropy can then be calculated.
To obtain the van't Hoff enthalpy of the association for 102A and RNA, the binding is measured as a function of temperature, and the slope of the line, in a plot of ln(Kobs) vs 1/T, is equal to −ΔH°/R. This analysis, when applied to the 102A-wild-type RNA complex, yields a nonlinear relation between temperature and ln(Kobs), making the simple determination of ΔH° inappropriate. The experimental results are shown in Fig. 4. As a first approximation, the results have been analyzed according to the interpretation applied to sequence-specific binding of DNA and proteins,5,17 which is to say that the nonlinearity is indicative of a large negative ΔCp. Should this interpretation be found to be accurate, the origin of the observed heat capacity must be found.
[Figure 4: ln(Kobs) (M⁻¹, left axis) and ΔG° (kcal/mol, right axis) versus 1/T (0.0031 to 0.0037 K⁻¹) for the 102A-wild-type RNA complex]
FIG. 4. Thermodynamic profile for the temperature dependence of 102A-wild-type RNA. The solid line is the calculated ΔH°(T); the hashed line is the calculated TΔS°(T). Symbols show ln(Kobs) and ΔG°. Errors in the ln(Kobs) data reflect 20% uncertainty due to the filter-binding measurements.
15 M. A. Rould, J. J. Perona, and T. A. Steitz, Nature (London) 352, 213 (1991).
16 D. R. Lesser, M. R. Kurpiewski, T. Waters, B. A. Connolly, and L. Jen-Jacobson, Proc. Natl. Acad. Sci. U.S.A. 90, 7548 (1993).
17 R. S. Spolar and M. T. Record, Science 263, 777 (1994).
18 R. L. Baldwin, Proc. Natl. Acad. Sci. U.S.A. 83, 8069 (1986).
Assuming no other processes are occurring to complicate the analysis further, these RNA-U1A data can be analyzed5,17,18 assuming a constant ΔC°p,obs and two characteristic temperatures, TH and TS. At TH, the enthalpy associated with complex formation is zero (ΔH° = 0) and similarly, at TS, the entropy is zero (ΔS° = 0). From the expressions for enthalpy
\[ \Delta H^{\circ}_{obs} = \Delta C^{\circ}_{p,obs}\,(T - T_H) \]
and entropy
\[ \Delta S^{\circ}_{obs} = \Delta C^{\circ}_{p,obs}\,\ln(T/T_S) \]
as described by Baldwin,18 the relation between Kobs and temperature is expressed by
\[ \ln(K_{obs}) = (\Delta C^{\circ}_{p,obs}/R)\,[(T_H/T) - \ln(T_S/T) - 1] \]
and so
\[ \Delta G^{\circ} = \Delta C^{\circ}_{p,obs}\,[(T - T_H) - T\ln(T/T_S)] \]
as presented in Ha et al.5 The enthalpy and entropy are temperature dependent, and can be calculated from the heat capacity and characteristic temperatures. Applying this analysis to the data for the wild-type RNA and 102A complex yields a large negative ΔC°p,obs = −1.43 (±0.54) kcal/mol-K, with TH = 283 (±5) K (10°) and TS = 292 (±2) K (19°) (Table II). Above 10°, the driving force for the association is clearly enthalpic, with ΔH° = −17 kcal/mol at 22°. The plots of ΔH° and TΔS° are calculated from the expressions given above.
Mutant RNAs. The binding affinity of each of the four mutant RNA-102A complexes was also measured as a function of temperature, to compare the driving forces in these associations to those of the wild-type RNA-102A complex. The results for several of these complexes are shown in Fig. 5. As for the complex with wild-type RNA, the C13U-102A and dC RNA-102A complexes also show nonlinear relations between ln(Kobs) and temperature. While the A9C data can also be modeled assuming a temperature-independent ΔCp,obs, the G12A data are best fit using the standard linear van't Hoff relation. The G12A data are the least accurate, given the low affinity of this RNA for 102A.
The energetic parameters of the RNA-102A complexes are compared in Table II. The heat capacity calculated for the wild-type, C13U, and dC RNA complexes is large and negative. That for the A9C mutant is significantly smaller, and that for the G12A RNA is zero. If a large negative heat capacity is indicative of binding that is accompanied by a change in the structure of the components, as suggested by Spolar and Record,17 then these data can be interpreted to suggest that the complexes formed with the C13U and dC RNAs are able to make sufficient contacts to induce the formation of a new structure in the RNA, protein, or both. In contrast, the G12A-102A complex is unable to undergo the adaptations necessary to form a tight association, and so might resemble a docking of RNA and protein.

TABLE II
CALCULATED THERMODYNAMIC PARAMETERS FOR RNA-PROTEIN COMPLEXES(a)

RNA              ΔCp,obs (kcal/mol-K)    TH (K)         TS (K)
Wild-type RNA    −1.43 (±0.54)           283 (±5)       292 (±2)
dC RNA           −2.05 (±0.64)           279 (±2)       285 (±1)
C13U             −1.94 (±0.43)           280 (±2)       286 (±1)
A9C              −0.43 (±0.17)           246 (±19)      273 (±9)
G12A             0.0 (±1.36)             236 (±237)     264 (±114)

(a) Values calculated from binding data for U1A 102A domain to RNA in 200 mM NaCl, 10 mM sodium cacodylate (pH 6), 1 mM MgCl2.

[Figure 5: ln(Kobs) (14 to 26) versus 1/T (0.0031 to 0.0038 K⁻¹) for the dC, A9C, and wild-type RNAs]
FIG. 5. van't Hoff plots of 102A association with (□) dC and (○) A9C RNAs compared to (△) wild-type. Lines are fits to the data points shown assuming a temperature-independent ΔCp. Errors in the data reflect 20% uncertainty in ln(KA).

All these RNA-protein associations are enthalpically driven above 10°. However, the magnitude of the favorable enthalpy varies. At room temperature, at which the binding affinities were measured (Table I), ΔH° for 102A association with wild-type RNA is −17 kcal/mol; with C13U, ΔH° = −33 kcal/mol; with dC RNA, ΔH° = −29 kcal/mol; with A9C RNA, ΔH° = −21 kcal/mol (calculated from ΔCp,obs, TH, and TS); and for G12A, ΔH° at all temperatures is −5 kcal/mol (calculated from the van't Hoff relation). Under these conditions, the enthalpy for the association of three mutant RNAs with 102A is actually more favorable than for the wild-type RNA complex. Because their corresponding free energy of association is less, however, there must be an unfavorable entropy term that dominates these associations.
The entropy associated with these complexes at 22° is small: ΔS°(wild type) = −17 eu; ΔS°(dC) = −58 eu; ΔS°(C13U) = −71 eu; ΔS°(A9C) = −33 eu; and ΔS°(G12A) = +11 eu. Relative to the wild-type RNA complex, there is in fact a significantly larger unfavorable entropy associated with binding, at 22°, for the first three mutant RNAs. Only for the weakest association, the G12A-102A complex, is there a favorable entropy term.
All four of these substituted RNAs bind to 102A with lower affinity than does the wild-type RNA. The loss of binding affinity may be due to loss of a specific contact with subsequent failure to form a stable RNA-protein interface, as postulated for the G12A mutant, or could be due to a change in the RNA loop structure that allows it to adopt a stable structure that is not suitable for binding or, alternatively, to adopt a structure that has increased conformational flexibility. A more flexible RNA would pay a higher cost in conformational entropy on formation of a complex that restricts its range of structures.
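The constant heat capacity model described above can be evaluated directly from the three fitted parameters. The sketch below is illustrative only: the function name is hypothetical, and the inputs are the wild-type values of ΔCp,obs, TH, and TS quoted in the text, whose stated uncertainties are not propagated.

```python
import math

R_KCAL = 1.987e-3  # gas constant, kcal/(mol K)


def thermo_profile(temp_k, dcp, t_h, t_s):
    """Evaluate the constant-heat-capacity model at one temperature.

    dcp  -- observed heat capacity change, kcal/(mol K)
    t_h  -- temperature (K) at which the enthalpy is zero
    t_s  -- temperature (K) at which the entropy is zero
    Returns (dH, dS, dG, ln K) with dH, dG in kcal/mol and dS in kcal/(mol K).
    """
    dh = dcp * (temp_k - t_h)             # dH = dCp (T - TH)
    ds = dcp * math.log(temp_k / t_s)     # dS = dCp ln(T / TS)
    dg = dh - temp_k * ds                 # dG = dH - T dS
    ln_k = -dg / (R_KCAL * temp_k)        # ln K = -dG / RT
    return dh, ds, dg, ln_k


# Wild-type parameters from the text, evaluated at 22 degrees (295.15 K).
dh, ds, dg, ln_k = thermo_profile(295.15, dcp=-1.43, t_h=283.0, t_s=292.0)
ds_eu = ds * 1000.0  # entropy in cal/(mol K), i.e., eu
```

With these inputs the enthalpy comes out near −17 kcal/mol, as quoted in the text. The entropy (about −15 eu) and free energy (about −12.8 kcal/mol) are in the neighborhood of the reported values but do not match them exactly, which is unsurprising given the size of the uncertainties on the fitted parameters.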
It is also possible that substitution of a particular nucleotide does not alter the sequence-specific recognition pattern of the association, but interferes with the ability of the R N A or protein to adopt the correct interaction interface, thus reducing the binding affinity. The single measurement of the binding free energy cannot distinguish between these cases, but the comparison of the enthalpy and entropy contributions is suggestive. RNA M u t a n t s a n d I n t e r p r e t a t i o n of Energetics of RNA-Protein Interactions Expanding the binding experiments to look at the temperature dependence of the complexes formed with the mutant RNAs has indicated where the major energetic changes have taken place. Rather surprisingly, several of these complexes have more favorable enthalpies. If the enthalpy is a reflection of the burial of hydrophobic surfaces, then it appears that this is easier with the mutant RNAs. However, the mutant RNAs cost the association a large penalty in entropy, which is the cause of the reduced
free energy of association and the weaker binding constants. This entropic cost may come from the number of counterions that are associated with the mutant RNAs, from increased flexibility of these RNA loops that costs conformational entropy to the association, or from the association of the RNA and protein where both components are unable to adopt the most favorable surface for the interaction. With respect to the interpretation of the large negative ΔCp that these RNA-102A complexes apparently exhibit, it is important to note here that the conformational changes (folding transitions) in the DNA-protein complexes that may account for the heat capacity term have been ascribed only to protein-folding transitions, and do not treat the contribution of DNA conformational changes.5,17 For these RNA-protein interactions, the RNA is equally likely to exhibit tertiary folding transitions in the loop, so the conformational changes invoked could apply to either molecule. Thus it is tempting to attribute the increased entropic penalty that lowers the affinity of the protein for the mutant RNAs to conformational entropy due to an altered RNA loop structure.

Comparison with Other RNA-Protein Interactions

A similar method of analysis, to identify the source of the affinity and specificity of a protein for a given RNA molecule, has been employed for several systems; for this discussion, two complexes are of particular interest. The R17 coat protein-RNA hairpin interaction has been extensively studied, both using RNA mutations and measurement of the energetics of the association. This complex should be rather similar to the one discussed here. In contrast, the interaction between TFIIIA and 5S RNA is apparently quite different. It too has been extensively probed with the use of mutations in the RNA.

R17 Coat Protein
The U1A-RNA complex may have some similarities to the interaction of the phage R17 coat protein with a small RNA hairpin. Both RNAs are hairpins, and both interactions use nucleotides in the loop to make sequence-specific contacts; the proteins are not related, however. As with 102A and mutant RNAs, the R17 coat protein binds only weakly to its RNA hairpin substrate when specific nucleotides are mutated.3 The R17 hairpin contains only four nucleotides in the loop, and substitution at two of the four positions results in an at least 1000-fold reduction in affinity for the protein,19 while
19 O. C. Uhlenbeck, J. Carey, P. J. Romaniuk, P. T. Lowary, and D. Beckett, J. Biomol. Struct. Dyn. 1, 539 (1983).
FIG. 6. The RNA hairpin used for binding to the R17 coat protein. Values are the ΔΔG° for the single-nucleotide substitutions shown. (Adapted from Uhlenbeck et al.19)
substitution at a third causes a 10-fold loss in affinity, and at the fourth costs almost nothing.20 Deletion of the single-base bulge also results in a significant loss of affinity for the protein. These results are summarized in Fig. 6, with the ΔΔG° calculated as the free energy difference between mutant and wild type at 2° in 0.1 M Tris-HCl (pH 8.5), 80 mM KCl, 10 mM MgCl2.19

One apparent difference between these two RNA-protein complexes is in the temperature dependence of the association. While the 102A-RNA interaction is best described by temperature-dependent ΔH° and ΔS°, the R17 interaction gives a linear van't Hoff plot in the range of 4-30°, with a temperature-independent ΔH° = -19 kcal/mol.14 The free energy of association is correspondingly not constant over this temperature range. The entropy of this association, like that of 102A-RNA, is unfavorable, with ΔS° = -30 cal/mol-deg.

With the R17 RNA-protein interaction, the nucleotides in the loop are clearly responsible in part for the specificity of the interaction, and the same pattern is observed here for U1A-RNA interactions. This means of making sequence-specific contact with the nucleic acid is unique to RNA-protein interactions, and no doubt results from the inaccessibility of the base-specific moieties in the major groove of the duplex regions. Instead, loop regions of the RNA can present the bases to a protein to facilitate sequence-specific contacts, although disrupting an existing RNA structure may cost the association something in terms of conformational entropy, as well as enthalpically if the aromatic bases must be unstacked and reburied in the protein-RNA interface. One difference between the R17 and U1A RNA hairpins is the size of the RNA loop; the 10-nucleotide loop of the
20 H. N. Wu and O. C. Uhlenbeck, Biochemistry 26, 8221 (1987).
U1 hairpin is anticipated to have greater propensity for conformational heterogeneity. Using an inherently flexible region of the RNA as the primary site of interaction in this complex may have consequences for the interpretation of the energetics, first because there is almost certain to be a rearrangement of the RNA component on formation of the complex, and second because any intrinsic flexibility of the RNA may appear as a distinct energetic contribution to the system, which may be difficult to detect and resolve.

TFIIIA
The interaction between TFIIIA and 5S RNA provides a contrasting picture of how this zinc finger protein recognizes a complex RNA. TFIIIA associates with both DNA and RNA: it binds to the 5S gene, where it acts as a transcription factor to modulate expression of 5S RNA, and it also binds to 5S RNA itself. Binding to the RNA is tight, with a KD of 1 × 10⁻⁹ M at 24°, ΔG° = -12.1 kcal/mol. Of the nine zinc fingers in TFIIIA, it seems that only fingers 4-7 are required for recognition of 5S RNA,21 and these (zf4-7) bind with an affinity equal to or greater than that of the entire protein (J. Gottesfeld, personal communication, 1995). The structure of 5S RNA is complex, as illustrated in Fig. 7, and contains five duplex regions as well as several single-stranded regions and bulged nucleotides that are frequently found (or proposed) to be sites of interaction with proteins. However, after analyzing TFIIIA binding to an enormous number of RNA mutants,22 it appears that almost none of these likely sites in fact contributes in any substantial way to the association. Mutation data show that in the case of stems II and V, the double strand, not the sequence, is the critical feature; deletion of bulged nucleotides had no effect on binding;23 substitutions of nucleotides 10-13, 41-44, and the 66/109 base pair have the greatest effect, which is to reduce the affinity to 20-40% of normal, a loss of free energy of association (ΔΔG°) of 0.6-1 kcal/mol.22 The greatest reduction in binding free energy is seen with RNAs that contain combinations of mutations in the double-stranded regions, and notably, the effect of the multiple disruptions is nearly additive.24 Again, these extensive substitutions cost only about 1 kcal/mol of binding free energy, which is far below the effect of single-base substitutions in either the U1A-RNA or R17-RNA association. The incremental loss of TFIIIA
21 K. R. Clemens, V. Wolf, S. J. McBryant, P. Zhang, X. Liao, P. E. Wright, and J. M. Gottesfeld, Science 260, 530 (1993).
22 Q. You, N. Veldhoen, F. Baudin, and P. J. Romaniuk, Biochemistry 30, 2495 (1991).
23 F. Baudin and P. Romaniuk, Nucleic Acids Res. 17, 2043 (1989).
24 Q. You and P. J. Romaniuk, Nucleic Acids Res. 18, 5055 (1990).
FIG. 7. Xenopus oocyte 5S RNA secondary structure with the putative binding sites of TFIIIA zinc fingers (zf) marked. Regions of the RNA that reduce the affinity up to eightfold are in bold; deletion of G75 causes the most severe loss of TFIIIA binding. (Adapted from Clemens et al.21)
binding free energy for mutations in the 5S RNA has led to a model of the interaction in which each of the four zinc fingers contacts an RNA duplex, as illustrated in Fig. 7. Because there is apparently little sequence specificity in these interactions, it is likely that most of the RNA-protein contacts occur through the ribose 2'-OH, the phosphate backbone, and the base moieties in the minor groove (such as the guanosine NH2). With the discovery that only zinc fingers 4-7 are necessary for RNA binding, some of these binding studies have been repeated with this truncated protein. These experiments were done with gel shift methods, but despite the difference in methodology, the results are in agreement with the previous work. The major difference is in the magnitude of the effect of mutations on the affinity. For example, with the zf4-7-RNA complex, the substitution at nucleotides 10-13 reduces the affinity to 9% of wild-type affinity (ΔΔG° = 1.4 kcal/mol). The most severe loss of affinity comes from the deletion of G75, the nucleotide shown by NMR experiments to be bulged out of the helix.25 Loss of this nucleotide costs the association 2.56 kcal/mol of binding free energy, suggesting that this is an important contact point, or that its bulged conformation alters the structure of the
25 B. Wimberly, G. Varani, and I. Tinoco, Jr., Biochemistry 32, 1078 (1993).
surrounding RNA in such a way that it becomes a unique element for recognition (positioning a number of phosphates or hydroxyls in the correct orientation). The energetics of this system were originally characterized with the whole TFIIIA protein, in 20 mM Tris (pH 7.5), 5 mM MgCl2, 100 mM KCl, measuring the binding by nitrocellulose filter binding. The temperature dependence of the binding could be described by a linear van't Hoff relation, giving ΔH° = -8.3 kcal/mol and a favorable ΔS° = 13.1 cal mol⁻¹ deg⁻¹ at 24°. Like the 102A-RNA and R17-RNA associations, formation of this complex is enthalpically favorable, but this one is also entropically favored. It would be interesting to repeat these experiments with zf4-7, to determine how the extra fingers have contributed to the thermodynamics. These data suggest that although the 5S RNA is structurally more complex than the U1A hairpin RNA, the energetics of this system are apparently simpler. This could reflect the mode of interaction between RNA and protein, which for TFIIIA-5S RNA relies on the unique geometric arrangement of RNA helices to provide a matrix for the protein to orient correctly; there is a minimum of nucleotide-specific interactions, and the total interaction free energy is the sum of many small contributions. This RNA-protein association may thus represent an example of additivity of interaction free energies, implying that the bound and free conformations of the two molecules should be similar.

Complex Energetics of RNA-Protein Interactions

In the DNA-protein complexes analyzed by Record and colleagues,5,17 the DNA is a duplex, and the interaction between it and the proteins is sequence specific. The examples analyzed in detail are lac repressor and EcoRI.
Both show similar nonlinear relations between temperature and equilibrium constants, and analysis of the data assumes that there is a large temperature-independent heat capacity, ΔCp. There are some substantial differences between those DNA-protein data and the data shown here. First, while the DNA data show a bell-shaped curve, with the maximum Kobs near 20° clearly flanked by lower affinity binding, the RNA data only suggest a maximum, near 10°. Assuming similar behavior, we are looking at only one side of the RNA curve. Second, while the maximum Kobs for the DNA systems is vaguely biological (between 20 and 25°), the RNA-U1A interaction has a maximum near 10° under these conditions. Because this RNA-protein interaction occurs at 37°, the 10° point is hardly relevant for in vivo function. Finally, the RNA-U1A interaction is far more sensitive to temperature than are the DNA systems, with a range of measured affinities 100 times that of the DNA-protein complexes. However, a common feature of the RNA- and DNA-protein associations is that over this temperature range, the free energy of association is relatively constant, indicating that the entropy and enthalpy are compensating.

This example of 102A-RNA energetics may be typical of the complexity of RNA-protein interactions. While the large negative ΔCp,obs of the association may in fact reflect the particulars of the RNA-102A complex formation, it is also possible that this heat capacity arises from the coupling of two simultaneous processes.26 An illustration of this phenomenon comes from the binding of E. coli SSB to dA70.27 In that interaction, the van't Hoff plot is nonlinear and therefore the heat capacity is nonzero, whereas for SSB binding to dC70 or dT70, the van't Hoff plot is linear, indicating ΔCp,obs = 0. Calorimetric measurements showed that there was no heat capacity associated with the stacking/unstacking transition of poly(A).28 One interpretation of these results is that the stacking-unstacking equilibrium of the dA70 is linked to the binding of SSB to the unstacked polynucleotide, and the two processes thus become energetically coupled, giving rise to an apparent ΔCp. This system may be analogous to the 102A-RNA interaction, in which the conformation of the RNA loop may be flexible. Because RNA molecules, especially those that have more complex tertiary structures, are likely to have some conformational heterogeneity, it seems probable that some of these conformations will be more accessible to a protein than others, and thus the analogy to poly(A) becomes appropriate. While we have no direct evidence that such conformational flexibility is at the origin of the ΔCp, analysis of the temperature dependence of the binding for the complexes with mutant RNAs suggests that these associations have gained an unfavorable entropy term.
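The bell-shaped temperature dependence discussed above follows directly from a temperature-independent heat capacity. Under that assumption ΔH°(T) = ΔCp(T - TH) and ΔS°(T) = ΔCp ln(T/TS), and because d ln K/dT = ΔH°/RT², a negative ΔCp places the maximum of K at T = TH. The sketch below illustrates this behavior; the function names are hypothetical, and the parameter values are placeholders chosen only to put the maximum near 10°, not fitted values from this chapter.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol K)


def delta_h(temp_k, dcp, t_h):
    """ΔH°(T) = ΔCp (T - TH), in kcal/mol, for a constant ΔCp."""
    return dcp * (temp_k - t_h)


def delta_s(temp_k, dcp, t_s):
    """ΔS°(T) = ΔCp ln(T / TS), in kcal/(mol K), for a constant ΔCp."""
    return dcp * math.log(temp_k / t_s)


def ln_k(temp_k, dcp, t_h, t_s):
    """ln K = -ΔG°/RT, with ΔG° = ΔH° - TΔS°."""
    dg = delta_h(temp_k, dcp, t_h) - temp_k * delta_s(temp_k, dcp, t_s)
    return -dg / (R * temp_k)


if __name__ == "__main__":
    # Hypothetical placeholder parameters, NOT values reported in the text:
    # ΔCp in kcal/(mol K), TH and TS in kelvin.
    DCP, T_H, T_S = -1.4, 283.15, 291.15
    for celsius in (0, 5, 10, 15, 22, 30, 37):
        t = celsius + 273.15
        print(f"{celsius:3d} C  dH = {delta_h(t, DCP, T_H):7.1f} kcal/mol  "
              f"ln K = {ln_k(t, DCP, T_H, T_S):6.2f}")
```

In this model the enthalpy changes sign at TH and the entropy changes sign at TS, so ΔH° and TΔS° largely offset each other over a range of temperatures around these two points. That is the compensation that keeps the free energy comparatively flat while the individual terms vary strongly.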
Because the protein is the constant component in these interactions, the variability must arise from the RNA. The lower binding affinity could arise from the lack of a complete set of protein-RNA contacts such that the interface is incorrectly formed, from a change in the counterion compensation required for the new RNA loop, or from an increase in the conformational entropy of the RNA. It is clear from NMR data that as the temperature increases, the wild-type RNA loop becomes more flexible. We suggest, therefore, that a likely cause of the loss of binding free energy is the increased conformational entropy associated with these mutant RNA sequences.

We suggest that the association of a protein with an RNA that contains a structure with potential for conformational flexibility will produce a pattern of interdependent interactions and complex energetics. Possible sources of energetic contributions to the association of such an RNA and a protein may include ion release from the RNA, with subsequent replacement by a direct or indirect contact with the protein; divalent counterions are especially likely to be important. In addition, the conformational diversity of the RNA may contribute to the entropic cost of association, if not all structures are equally accessible to the protein, or if, in order to bind, the protein must disrupt a low energy (favorable) structure. The sequence-specific recognition by the protein may involve a conformational change of the RNA or protein, or both, to bury hydrophobic groups or to make ionic contacts or hydrogen bonds. This behavior has been implicated in DNA-protein interactions, but may be especially common for proteins that recognize a single-stranded region of RNA. Finally, several processes may be coupled in the association, such as the binding of the protein to a preferred conformation of the RNA, in the case in which the RNA conformation is dynamic.26 Such RNA-protein interactions are intrinsically more complex than double-stranded DNA-protein interactions, and are more analogous to protein-protein interactions in the range of contributions from the two components that must be considered to describe the association completely.

Acknowledgments

We thank Professor Joel Gottesfeld for communication of the zf4-7:TFIIIA results, and Professors Tim Lohman and Enrico Di Cera for critical reading of the manuscript. K. B. H. is a Markey Scholar, and the research is supported by the Lucille P. Markey Charitable Trust (#90-47) and the NIH (GM46318).

26 M. R. Eftink, A. C. Anusiem, and R. L. Biltonen, Biochemistry 22, 3884 (1983).
27 M. Ferrari and T. M. Lohman, Biochemistry 33, 12896 (1994).
28 V. Filimonov and P. L. Privalov, J. Mol. Biol. 122, 465 (1978).
[13] Melting Studies of RNA Unfolding and RNA-Ligand Interactions
By DAVID E. DRAPER and THOMAS C. GLUICK
Introduction

Many RNAs fold into three-dimensional structures that carry out specific functions, such as catalysis or recognition of regulatory proteins. How the primary sequence of an RNA encodes its functional structure is analogous to the "protein-folding problem." As with proteins, two levels of structure can be distinguished: the repeating base pair structure of simple helix segments (secondary structure), and additional interactions that con
METHODS IN ENZYMOLOGY, VOL. 259
Copyright © 1995 by Academic Press, Inc. All rights of reproduction in any form reserved.