doi:10.1016/S0022-2836(02)01476-6
J. Mol. Biol. (2003) 326, 999–1004
COMMUNICATION
Inherent Protein Structural Flexibility at the RNA-binding Interface of L30e
Jeffrey A. Chao1, G. S. Prasad2, Susan A. White3, C. David Stout2 and James R. Williamson1*
1 Department of Molecular Biology, Department of Chemistry and The Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA
2 Department of Molecular Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037, USA
The Saccharomyces cerevisiae ribosomal protein L30 autoregulates its expression by binding to a purine-rich internal loop in its pre-mRNA and mRNA. NMR studies of L30 and its RNA complex showed that both the internal loop of the RNA and a region of the protein become substantially more ordered upon binding. A crystal structure of a maltose-binding protein (MBP)-L30 fusion protein with two copies in the asymmetric unit has been determined. The flexible RNA-binding region in the L30 copies has two distinct conformations: one resembles the RNA-bound form solved by NMR and the other is unique. Structure prediction algorithms also had difficulty accurately predicting this region, which is consistent with the conformational flexibility seen in the NMR and X-ray crystallography studies. Inherent conformational flexibility may be a hallmark of regions involved in intermolecular interactions.
3 Department of Chemistry, Bryn Mawr College, Bryn Mawr, PA 19010, USA
*Corresponding author
© 2003 Elsevier Science Ltd. All rights reserved
Keywords: protein structural flexibility; ribosomal protein L30; autoregulation; MBP-L30 fusion protein; induced fit
The ribosomal protein L30 from the yeast Saccharomyces cerevisiae regulates its own splicing and translation by binding to an internal loop structure found in both its pre-mRNA and processed mRNA.1–3 Auto-regulation of splicing and translation provides an elegant mechanism for controlling L30 levels, and consequently ribosome assembly, in both the nucleus and cytoplasm. It has been proposed that L30 participates in a key bridging interaction between the large and small ribosomal subunits in eukaryotes.4 L30 is a relatively small protein (105 residues) whose structure was solved by NMR spectroscopy both in its free state and bound to the L30 mRNA site.5,6 The secondary structure consists of eight alternating α-helix and β-strand segments that fold into a three-layer α/β/α sandwich. The RNA-binding interface of L30 is composed of three loops that connect α-helix and β-strand segments on one face of the protein.
Present address: G. S. Prasad, Syrrx, 10410 Science Center Drive, San Diego, CA 92121, USA.
Abbreviation used: MBP, maltose-binding protein.
E-mail address of the corresponding author:
[email protected]
Here, we describe the X-ray crystal structure of a maltose-binding protein (MBP)-L30 fusion protein in which a region of the L30 protein adopts two distinct conformations. The structure was solved using molecular replacement with a known structure of MBP (Table 1). The structure was refined to a crystallographic R factor of 22.25% and a free R of 25.44% for 65,330 reflections in the resolution range 18–2.31 Å. The overall structures of the two MBP molecules in the asymmetric unit are very similar, with a root-mean-square deviation (rmsd) for the Cα positions of 0.27 Å. The L30 portions of the fusion proteins both have the α/β/α topology that was previously determined in the NMR structures.5,6 While the topologies of the two L30 copies in the asymmetric unit are similar, their Cα positions (residues 3–105) superimpose with an rmsd of only 2.85 Å. This difference is mainly due to a conformational rearrangement that occurs between residues 74 and 88 in the two L30 molecules. The rmsd of the Cα positions for this region is 4.80 Å. If this region is omitted, the rmsd of the Cα positions for the rest of L30 (residues 3–73 and 89–105) improves to 1.85 Å. The two copies of L30 in the asymmetric unit have been designated L30(1) and L30(2) for clarity. In L30(1),
Protein Structural Flexibility
Table 1. X-ray diffraction data and refinement statistics

A. Data collection
  Space group                  P2₁2₁2₁
  Cell parameters
    a (Å)                      79.65
    b (Å)                      118.39
    c (Å)                      153.78
  Resolution (Å)               18–2.31 (2.38–2.31)
  No. reflections              708,695
  Unique reflections           65,330
  Completeness (%)             99.2 (99.8)
  Rmerge (%)                   8.6 (35.9)

B. Refinement statistics
  Resolution (Å)               18–2.31
  No. residues                 944
  No. water molecules          233
  R-value (%)                  22.25
  Rfree value (%)              25.44
  Average B-factor (Å²)        50.75
  rmsd bond lengths (Å)        0.0066
  rmsd bond angles (deg.)      1.426
L30 was expressed as an amino-terminal maltose-binding protein (MBP) fusion and purified as described.5 MBP-L30 crystals were grown at 22 °C using the hanging-drop, vapor-diffusion method by mixing equal volumes of protein solution (5 mg/ml stock of MBP-L30 in 10 mM sodium citrate (pH 6.2), 1 mM maltose, 0.02% (w/v) sodium azide) with the reservoir solution (50% saturated sodium citrate, 0.1 M Tris (pH 7.0), 0.2 mM NaCl). Data were collected from a single crystal at −180 °C in a cryoprotectant that consisted of the reservoir solution with 20% (w/v) glycerol at the Stanford Synchrotron Radiation Laboratory beamline 9-2 at a wavelength of 0.9795 Å. The data were processed and scaled with MOSFLM and the CCP4 suite of programs.18 The structure of MBP-L30 was solved by molecular replacement with AMoRe using a previously reported MBP structure (PDB accession number 1ANF).18 The crystals contain two copies of the fusion protein in the asymmetric unit. Using a resolution range between 30 Å and 4 Å, the rotation function calculated two unambiguous solutions. These two solutions were then subjected to translation searches and rigid body refinement, resulting in an Rcryst = 39.2% and a correlation coefficient of 65.1. The difference Fourier electron density maps clearly show the position of both copies of L30. Rounds of model building and refinement were performed with X-fit and CNS.19,20 Non-crystallographic symmetry (NCS) restraints were applied to the MBP molecules during the early stages of refinement. NCS restraints were not used for the L30 molecules because packing forces and crystal contacts resulted in distinct conformations for the two L30 copies. The backbones of both fusion proteins were traced unambiguously except for the linker regions between the MBP and L30 molecules and several N and C-terminal residues of one L30 copy, where the electron density was weak and/or discontinuous.
Stereochemical values are all within or better than the expected values for a 2.3 Å structure, as determined by PROCHECK.21 During refinement, examination of an |Fo| − |Fc| difference electron density map indicated that there were four glycopyranoside rings present in the MBP ligand binding pocket. The refined occupancy of each of the sugar rings was found to be close to 1.0. The presence of four sugar rings was somewhat surprising, since the fusion protein was eluted from an amylose column with the disaccharide maltose during purification. The maltotetraose seen in the MBP-L30 fusion structure can likely be attributed to degradation of the amylose column. The maltotetraose is bound in a manner similar to what has been reported.22
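Table 1 and its legend quote R-value, Rfree and Rmerge without defining them. The block below gives the conventional crystallographic definitions as a reading aid; these are assumed here, not taken from the text, and the programs cited in the legend may differ in detail (for example in scaling or weighting). The symbols are illustrative: F_o and F_c are observed and calculated structure-factor amplitudes, I_i(hkl) is the i-th measurement of a reflection, and the angle brackets denote the mean over its repeated measurements.

```latex
\[
R = \frac{\sum_{hkl} \bigl|\, |F_o(hkl)| - |F_c(hkl)| \,\bigr|}{\sum_{hkl} |F_o(hkl)|},
\qquad
R_{\mathrm{merge}} = \frac{\sum_{hkl} \sum_{i} \bigl| I_i(hkl) - \langle I(hkl) \rangle \bigr|}{\sum_{hkl} \sum_{i} I_i(hkl)}
\]
```

Under the same convention, Rfree is the first expression evaluated only over a test set of reflections that was excluded from refinement.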
residues 74–78 form a coiled strand and residues 79–88 form an α-helix (Figure 1A). This situation is reversed in L30(2), where residues 74–81 make up the α-helix and residues 82–88 form the coiled strand (Figure 1B). Residue F85 in L30(2) makes a crystal contact to a symmetry-related copy of MBP. The corresponding F85 in L30(1) is located within the α-helix and is positioned so that the phenyl side-chain is partially buried by other hydrophobic residues in the region. A loop (residues 47–51) adjacent to the flexible region from residues 74–88 also has a slightly different structure in L30(1) and L30(2) (Figure 1A and B). The position of this loop accommodates the secondary structure change in the neighboring region. The conformational plasticity observed for residues 74–88, where the same sequence adopts different conformations in identical conditions, has previously been observed, although only rarely. In the structure of the yeast MATα2/MCM1/DNA ternary complex, residues 121–131 of one MATα2 copy in the asymmetric unit adopted a β-strand/β-strand topology, while the other copy was found to be α-helix/β-strand.7 This region was important for dimerization with MCM1. Other natural “chameleon” sequences have been reported for EF-Tu in its GTP and GDP forms, serpins before and after cleavage, and JDV Tat when bound to different RNA sequences.8–11 In the NMR structure of L30 in complex with mRNA (L30(bound)), an α-helix is formed by residues 74–81 and F85 makes a van der Waals contact to continue a purine stack in the RNA internal loop (Figure 1D).5 This interaction is critical for binding and an F85A mutation reduces binding by 20-fold (S.A.W., unpublished result). Other hydrophobic residues, L84 and V87, are buried upon complex formation.5 The residues that form the α-helix do not make specific contacts with the RNA, yet many of these residues are highly conserved in L30 from Archaea to Eukarya.4 These residues must be necessary for ensuring the local backbone geometry needed for positioning the loop residues that do make specific contacts to the RNA. Residues 74–88 were previously shown to undergo conformational changes upon binding of the mRNA.
In the NMR structure of the free L30 protein (L30(free)), the conformation of this region is poorly defined (Figure 1C). The amide proton resonances of residues 74–88 are broad or missing from 15N heteronuclear single quantum coherence (HSQC) spectra, which indicates that chemical exchange between multiple conformations is occurring in solution.6 This region becomes more ordered upon RNA binding, with the broad or missing amide proton resonances sharpening and the average NOE density increasing from 5.7 to 18.7 restraints per amino acid residue. The structures of the region from residues 74 to 88 in L30(2) and L30(bound) are nearly identical (Figure 1B and D). The α-helix contains residues 74–81 in both of the structures, and residue F85 makes an intermolecular interaction, which is the RNA contact in the NMR structure and a protein crystal contact in the X-ray structure. In addition to F85, there are other residues that make crystal contacts
Figure 1. Comparison of the L30 structures solved by X-ray crystallography with the free and bound forms of L30 determined by NMR. A, In L30(1), residues 74–78 form a loop (green) and residues 79–88 form an α-helix (yellow). B, In L30(2), the α-helix is formed by residues 74–81 (green) and the loop consists of residues 82–88 (yellow). C, In L30(free), the region 74–88 does not have a preferred conformation (residues 74–81 are green and residues 82–88 are yellow). D, In L30(bound), the α-helix is formed by residues 74–81 (green) and the loop is made of residues 82–88 (yellow). Residue F85 is shown in red in all structures for reference. The F85 in L30(2) makes a crystal contact to a symmetry-related copy of MBP. In L30(bound), residue F85 stacks upon a guanosine residue in the RNA. The region 74–88 is nearly identical in L30(2) and L30(bound). L30(1), which does not make a crystal contact, adopts a structure that is unique compared to other L30 structures. Residue F85 in L30(1) is located within an α-helix that partially buries the F85 side-chain. The loop of residues 47–51 (pink in A and B) that is adjacent to the region 74–88 has a slightly different conformation in L30(1) and L30(2).
(Figure 2B). Residue L84, which makes an RNA base contact in the bound structure, also makes a van der Waals contact to a leucine residue from the symmetry-related MBP. Residue R86 makes an electrostatic interaction with an MBP glutamate residue. A superposition of the Cα positions of the two structures in this region results in an rmsd of 1.09 Å (Figure 3B). The van der Waals interactions
Figure 2. Crystal packing of L30(1) and L30(2) in the asymmetric unit. A, The crystal packing of L30(1) (blue) against MBP (green) and symmetry-related MBP molecules (brown) is shown. The α-helix composed of residues 79–88 runs parallel with an α-helix in a symmetry-related MBP. Residue F85 is shown in red and is located within the α-helix, and the phenyl side-chain is partially buried by other hydrophobic residues in the region. Residues K83 (orange) and R86 (pink) interact with the symmetry-related MBP α-helix. B, The crystal packing of L30(2) (blue) against MBP (green) and symmetry-related MBP molecules (brown) is shown. The α-helix formed by residues 74–81 positions F85 (shown in red) to make a crystal contact with a symmetry-related MBP. Residues L84 (orange) and R86 (pink) make contacts with the symmetry-related MBP.
among several residues in the loop differ greatly between the L30(bound) and L30(2) structures, yet the burial of hydrophobic residues is crucial for inducing the entire local fold of this region. In the L30(1) structure, the α-helix and loop are clearly defined by the electron density and are well ordered, with B-factors that are comparable to those of the entire structure. This conformation of L30(1) maximizes burial of hydrophobic residues in the absence of the intermolecular interactions that are observed for L30(2) and L30(bound) (Figure 2A). Both residues L84 and F85 are positioned within a hydrophobic pocket. One face of the α-helix runs parallel with a symmetry-related MBP copy, and residues K83 and R86 make electrostatic contacts with the MBP. While the conformation of this region is different from that of L30(2), it also differs from that of L30(free). NMR studies showed that this region is flexible and does not adopt a preferred conformation in solution.6 It is possible that the structure of L30(1) that is captured in the crystal exists in solution in equilibrium with the bound form of L30. The observed broadening of NMR resonances in this region could reflect the local conformational change between these two structures on the micro- to millisecond timescale. At low ionic strength, the L30 protein has limited solubility and precipitates from solution over a period of a few days. It was necessary to record the NMR spectra of L30(free) at a low concentration of protein (<1 mM) and at 10 °C with 300 mM NaCl in order to maintain solubility. It is possible that residue F85 and other hydrophobic residues in this region are responsible for the poor solubility of the free L30 protein. L30e belongs to a homologous family of RNA-binding proteins (Figure 3A).
Structures of two homologs, L7Ae and the 15.5 kDa spliceosomal protein, in complex with their respective RNA targets have been solved by X-ray crystallography.12,13 All three of these proteins possess α/β/α topologies and bind to similar structural elements in their RNA targets. These RNAs possess purine-rich internal loops that adopt regular structures and belong to a class of RNA secondary structure motifs known as K-turns.14 The regions in L7Ae and the 15.5 kDa protein that are homologous to residues 74–88 in L30(bound) have Cα positions that superimpose with an rmsd of about 1 Å with the L30 region and are similarly involved in RNA binding (Figure 3B). This local structure forms part of the pocket that accommodates the kink in the phosphate backbone and the flipped-out nucleotide that are essential features of the K-turn. Alignment of these regions shows that almost all of the conserved residues are located in the α-helical portion. There are two conserved glycine residues, G78 and G82 in L30, that may be important for the conformational flexibility in this region. In L30(2) and L30(bound), G78 is located within the α-helix and G82 is positioned at the end of the helix to facilitate the reversal of the backbone. In L30(1),
G78 is located just before the α-helix and G82 is within it. The conserved leucine residue is likely necessary for stabilizing the α-helix through hydrophobic interactions with the β-sheet core and the conserved alanine residue may be required for steric reasons. The residues that make specific contacts to the RNA are located in the loop, where there is little sequence conservation. This variability is likely important for specificity, since the global fold of the K-turn is very similar. Studies of the free and bound forms of the other members of this family may provide further insights into the role of induced fit in the RNA-binding mechanism. The L30(bound) structure was submitted as a target (T0077) in the third Critical Assessment of Structure Prediction (CASP3) competition. A few groups accurately predicted a significant portion of the secondary structure and tertiary packing. Two predictions, in particular, aligned with over 50% of the target with an rmsd of less than 4 Å (TS035_4 aligned 67 residues with an rmsd of 3.8 Å and TS163_2 aligned 63 residues with an rmsd of 3.48 Å).15,16 Both of these predictions, however, had difficulty with the region spanning residues 74–88. This region has been shown to be flexible and to undergo induced fit upon binding, so it is not unexpected that the bound conformation could not be predicted properly from amino acid sequence alone. Since L30(1) has been determined to adopt a conformation different from that of the bound L30, it seemed worthwhile to re-evaluate the CASP3 predictions on their ability to predict the local structure of the flexible region as well as the global fold. About half of the predictions contained a continuous backbone in the flexible region. Four predictions, which contained the entire stretch from residues 74–88, were found to predict the loop-helix conformation found in L30(1) reasonably well.
One of these predictions, AL033_2, was especially good, showing only a slight change in the α-helical angle and superimposing with an rmsd of 2.73 Å. There was also a prediction that correlates similarly well with the L30(bound) structure. The TS035_5 entry was able to predict the helix-loop conformation and superimposes with an rmsd of 2.50 Å. When the CASP3 predictions are analyzed with L30(free) as the target, the correlations become weaker. Analysis of the global folds of the CASP3 predictions using L30(1) as the target did not differ significantly from the results that were determined when L30(bound) was used as the target. While a few groups had success predicting the local conformation of the flexible region of L30(1), this did not drastically improve their prediction of the entire L30(1) structure. It is unclear whether the slight local prediction improvement observed for the L30(1) target over L30(bound) is significant. Distinct conformations for the region from residues 74 to 88 in L30 have been observed
Figure 3. Comparison of homologous RNA-binding proteins. A, Alignment of several members of a homologous RNA-binding family. Alignments were done with MultiAlign.17 B, Superposition of the flexible region in L30(2), L30(bound), L7Ae and the 15.5 kDa protein. The pairwise rmsd values between the Cα positions of L30(bound) (green) and L30(2) (red), L7Ae (yellow) and the 15.5 kDa protein (blue) are 1.09 Å, 1.05 Å, and 1.16 Å. This region of all three proteins adopts a similar backbone conformation in order to accommodate the canonical stem of the K-turn. Two conserved glycine residues, G78 and G82 in L30, appear to facilitate the conversion from α-helix to loop.
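The Cα rmsd values quoted in the text and in Figure 3 are rmsds after superposition. As an illustration of what such a number measures, here is a minimal sketch in plain Python that computes the rmsd of two equal-length coordinate lists after optimal rigid-body superposition. It uses Horn's quaternion formulation with a shifted power iteration to find the largest eigenvalue; this is a stand-in technique chosen for self-containment, not the procedure or software the authors used, which the text does not specify. The function names `centered` and `superposed_rmsd` and the `iterations` parameter are hypothetical.

```python
import math


def centered(coords):
    """Translate a list of (x, y, z) tuples so that the centroid is at the origin."""
    n = len(coords)
    cx = sum(p[0] for p in coords) / n
    cy = sum(p[1] for p in coords) / n
    cz = sum(p[2] for p in coords) / n
    return [(x - cx, y - cy, z - cz) for x, y, z in coords]


def superposed_rmsd(coords_a, coords_b, iterations=1000):
    """Rmsd between paired points after optimal rotation and translation.

    Assumes the two lists are already paired atom by atom (for example,
    C-alpha atoms of equivalent residues) and have the same length.
    """
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    a, b = centered(coords_a), centered(coords_b)
    n = len(a)

    # Sum of squared distances from the centroid, over both sets.
    g = sum(x * x + y * y + z * z for x, y, z in a)
    g += sum(x * x + y * y + z * z for x, y, z in b)

    # 3x3 correlation matrix s[i][j] = sum_k a[k][i] * b[k][j].
    s = [[sum(p[i] * q[j] for p, q in zip(a, b)) for j in range(3)] for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s

    # Horn's symmetric 4x4 matrix; its largest eigenvalue is the maximum of
    # sum_k (R a_k) . b_k over all rotations R.
    key = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]

    # Shift by the Frobenius norm so all eigenvalues are non-negative, then
    # run power iteration toward the eigenvector of the largest eigenvalue.
    shift = math.sqrt(sum(v * v for row in key for v in row))
    vec = [1.0, 0.3, 0.2, 0.1]
    for _ in range(iterations):
        nxt = [sum(key[i][j] * vec[j] for j in range(4)) + shift * vec[i] for i in range(4)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break  # degenerate input: every point sits on its centroid
        vec = [v / norm for v in nxt]

    # Rayleigh quotient of the unshifted matrix gives the largest eigenvalue.
    lam = sum(vec[i] * sum(key[i][j] * vec[j] for j in range(4)) for i in range(4))
    lam /= sum(v * v for v in vec)

    # Minimum summed squared deviation is g - 2 * lam; clamp rounding noise.
    return math.sqrt(max(0.0, (g - 2.0 * lam) / n))
```

To mirror the comparisons in the text, one would pass only the Cα coordinates of the residues of interest, for example residues 74–88 alone, or residues 3–73 and 89–105 with the flexible region left out. Reading those coordinates from a deposited structure file is outside the scope of this sketch.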
experimentally. This inherent flexibility is likely crucial for proper function. L30 is capable of binding to RNA structures in its pre-mRNA, mRNA and the 80 S ribosome, and the flexible region is likely important for regulating assembly and, in some cases, disassembly of these RNPs. Structure prediction algorithms also had difficulty with the region of L30 that changes conformation upon RNA binding. The conformations of the flexible region are probably only marginally stable and are extremely sensitive to the chemical environment. As structural prediction algorithms improve, it should be possible to recognize these structurally ambiguous regions as potential sites of intermolecular interactions.
Protein Data Bank accession number
The coordinates have been deposited with the RCSB Protein Data Bank with accession code 1NMU.
Acknowledgments
We thank the staff at the Stanford Synchrotron Radiation Laboratory for their assistance. We thank Alexey Murzin for his helpful discussion of chameleon sequences in other protein structures. This work was supported by the ARCS Foundation, the Skaggs Institute for Chemical Biology and by a grant from the NIH (GM-53320 to J.R.W.).
References
1. Li, B., Vilardell, J. & Warner, J. R. (1996). An RNA structure involved in feedback regulation of splicing and of translation is critical for biological fitness. Proc. Natl Acad. Sci. USA, 93, 1596–1600.
2. Vilardell, J. & Warner, J. R. (1994). Regulation of splicing at an intermediate step in the formation of the spliceosome. Genes Dev. 8, 211–220.
3. Eng, F. J. & Warner, J. R. (1991). Structural basis for the regulation of splicing of a yeast messenger RNA. Cell, 65, 797–804.
4. Vilardell, J., Yu, S. J. & Warner, J. R. (2000). Multiple functions of an evolutionarily conserved RNA binding domain. Mol. Cell, 5, 761–766.
5. Mao, H., White, S. A. & Williamson, J. R. (1999). A novel loop-loop recognition motif in the yeast ribosomal protein L30 autoregulatory RNA complex. Nature Struct. Biol. 6, 1139–1147.
6. Mao, H. & Williamson, J. R. (1999). Local folding coupled to RNA binding in the yeast ribosomal protein L30. J. Mol. Biol. 292, 345–359.
7. Tan, S. & Richmond, T. J. (1998). Crystal structure of the yeast MATalpha2/MCM1/DNA ternary complex. Nature, 391, 660–666.
8. Abel, K., Yoder, M. D., Hilgenfeld, R. & Jurnak, F. (1996). An alpha to beta conformational switch in EF-Tu. Structure, 4, 1153–1159.
9. Polekhina, G., Thirup, S., Kjeldgaard, M., Nissen, P., Lippmann, C. & Nyborg, J. (1996). Helix unwinding in the effector region of elongation factor EF-Tu-GDP. Structure, 4, 1141–1151.
10. Wright, H. T. (1996). The structural puzzle of how serpin serine proteinase inhibitors work. Bioessays, 18, 453–464.
11. Smith, C. A., Calabro, A. D. & Frankel, A. D. (2000). An RNA-binding chameleon. Mol. Cell, 6, 1067–1076.
12. Vidovic, I., Nottrott, S., Hartmuth, K., Luhrmann, R. & Ficner, R. (2000). Crystal structure of the spliceosomal 15.5 kD protein bound to a U4 snRNA fragment. Mol. Cell, 6, 1331–1342.
13. Ban, N., Nissen, P., Hansen, J., Moore, P. B. & Steitz, T. A. (2000). The complete atomic structure of the large ribosomal subunit at 2.4 Å resolution. Science, 289(5481), 905–920.
14. Klein, D. J., Schmeing, T. M., Moore, P. B. & Steitz, T. A. (2001). The kink-turn: a new RNA secondary structure motif. EMBO J. 20, 4214–4221.
15. Ortiz, A. R., Kolinski, A., Rotkiewicz, P., Ilkowski, B. & Skolnick, J. (1999). Ab initio folding of proteins using restraints derived from evolutionary information. Proteins: Struct. Funct. Genet., 3(suppl.), 177–185.
16. Simons, K. T., Bonneau, R., Ruczinski, I. & Baker, D. (1999). Ab initio protein structure prediction of CASP III targets using ROSETTA. Proteins: Struct. Funct. Genet., 3(suppl.), 171–176.
17. Corpet, F. (1988). Multiple sequence alignment with hierarchical clustering. Nucl. Acids Res. 16, 10881–10890.
18. Collaborative Computational Project, Number 4 (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallog. sect. D, 50, 760–763.
19. McRee, D. E. (1999). XtalView/Xfit - a versatile program for manipulating atomic coordinates and electron density. J. Struct. Biol. 125, 156–165.
20. Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W. et al. (1998). Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallog. sect. D, 54, 905–921.
21. Laskowski, R. A., Moss, D. S. & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallog. 26, 283–291.
22. Quiocho, F. A., Spurlino, L. E. & Rodseth, L. E. (1997). Extensive features of tight oligosaccharide binding revealed in high-resolution structures of the maltodextrin transport/chemosensory receptor. Structure, 5, 997–1015.
Edited by D. E. Draper (Received 20 September 2002; received in revised form 5 December 2002; accepted 5 December 2002)