Structural Basis for Polypyrimidine Tract Recognition by the Essential Pre-mRNA Splicing Factor U2AF65

Structural Basis for Polypyrimidine Tract Recognition by the Essential Pre-mRNA Splicing Factor U2AF65

Molecular Cell 23, 49–59, July 7, 2006 ª2006 Elsevier Inc. DOI 10.1016/j.molcel.2006.05.025 Structural Basis for Polypyrimidine Tract Recognition by...

796KB Sizes 0 Downloads 53 Views

Molecular Cell 23, 49–59, July 7, 2006 ª2006 Elsevier Inc.

DOI 10.1016/j.molcel.2006.05.025

Structural Basis for Polypyrimidine Tract Recognition by the Essential Pre-mRNA Splicing Factor U2AF65 E. Allen Sickmier,1,3 Katherine E. Frato,1 Haihong Shen,2 Shanthi R. Paranawithana,1,4 Michael R. Green,2 and Clara L. Kielkopf1,* 1 Department of Biochemistry and Molecular Biology Johns Hopkins University Bloomberg School of Public Health Baltimore, Maryland 21205 2 Howard Hughes Medical Institute Programs in Gene Function and Expression and Molecular Medicine University of Massachusetts Medical School Worcester, Massachusetts 01605

Summary The essential pre-mRNA splicing factor, U2AF65, guides the early stages of splice site choice by recognizing a polypyrimidine (Py) tract consensus sequence near the 30 splice site. Since Py tracts are relatively poorly conserved in higher eukaryotes, U2AF65 is faced with the problem of specifying uridine-rich sequences, yet tolerating a variety of nucleotide substitutions found in natural Py tracts. To better understand these apparently contradictory RNA binding characteristics, the X-ray structure of the U2AF65 RNA binding domain bound to a Py tract composed of seven uri˚ resolution. Specific dines has been determined at 2.5 A hydrogen bonds between U2AF65 and the uracil bases provide an explanation for polyuridine recognition. Flexible side chains and bound water molecules form the majority of the base contacts and potentially could rearrange when the U2AF65 structure adapts to different Py tract sequences. The energetic importance of conserved residues for Py tract binding is established by analysis of site-directed mutant U2AF65 proteins using surface plasmon resonance. Introduction Most transcripts of higher eukaryotes contain intervening sequences (introns) between the protein coding regions (exons) that must be excised by pre-mRNA splicing before nuclear export and translation of the mRNA product. The task of pre-mRNA splicing is accomplished through a series of ATP-dependent conformational rearrangements among constitutive splicing factors and small nuclear (sn)RNAs called the spliceosome (Jurica and Moore, 2003). In addition, alternative splicing factors generate transcript diversity for cell growth and differentiation by incorporating different exons into the final mRNA (Maniatis and Tasic, 2002). An essential splicing factor, U2 auxiliary factor (U2AF), recognizes consensus 30 splice site sequences in the pre-mRNA and coordi*Correspondence: [email protected] 3 Present address: Department of Molecular Structure, Amgen Inc., One Amgen Center Drive, Thousand Oaks, California 91320. 4 Present address: Department of Chemistry, Hampton University, Hampton, Virginia 23668.

nates the initial states of spliceosome assembly. Since formation of the U2AF complex commits the pre-mRNA to be spliced (Michaud and Reed, 1991), U2AF/premRNA interactions present a key target for regulation during alternative splicing, for example by Sex-lethal (SXL) (Valcarcel et al., 1993) or polypyrimidine tract binding protein (PTB) (Sharma et al., 2005). Accurate recognition of the 30 splice site by U2AF is critical for premRNA splicing, as demonstrated by the association of an estimated half of human genetic diseases with errors in splice site recognition (Garcia-Blanco et al., 2004). U2AF is a heterodimer of two subunits. The large subunit (U2AF65) recognizes an essential, polypyrimidine (Py) tract pre-mRNA consensus that is composed predominantly of uridines (Zamore et al., 1992). The small subunit (U2AF35) associates tightly with a region near the U2AF65 N terminus, and contacts an adjacent ‘‘AG’’ consensus dinucleotide at the nearby intron-exon boundary (Merendino et al., 1999; Wu et al., 1999; Zorio and Blumenthal, 1999). Initially, the U2AF heterodimer binds the pre-mRNA as a ternary complex with a third protein, splicing factor 1 (SF1) (Abovich and Rosbash, 1997). SF1 recognizes the branchpoint consensus sequence (BPS) of the pre-mRNA where the first step of the splicing reaction ultimately takes place. In parallel with assembly of the U2AF/SF1/30 splice site complex, the U1 small nuclear ribonucleoprotein (snRNP) associates with the 50 splice site. Formation of this early splicing complex brings the BPS, 50 , and 30 splice sites together in a structured conformation (Kent and MacMillan, 2002; Kent et al., 2003). Next, the U2 snRNP component of the spliceosome forms an ATP-dependent complex with the BPS and U2AF as SF1 dissociates. Stable association of the U2 snRNP requires an N-terminal, arginineserine-rich (RS) domain of U2AF65 (Shen and Green, 2004; Valcarcel et al., 1996), and RNP-unwindases such as the U2AF-Associated-Protein-56KD (UAP56) (Fleckner et al., 1997). Ultimately, U2AF is released from the pre-mRNA before the splicing reaction is catalyzed by the active spliceosome. U2AF65 preferentially binds uridine-rich RNA sequences, as shown by in vitro genetic selection experiments with U2AF65 that enrich polyuridine sequences (Singh et al., 1995), and chemical modification of the uridine-N3 or O4 atoms inhibits U2AF65 binding by w100fold (Singh et al., 2000). Accordingly, Py tracts composed of long uridine stretches promote use of adjacent 30 splice sites (Coolidge et al., 1997; Reed, 1989). However, natural mammalian Py tracts vary in length and sequence composition (Senapathy et al., 1990). U2AF65 universally recognizes these diverse natural Py tracts, which are frequently interrupted with cytosines or purines, albeit with a broad (200-fold) range of affinities (Zamore et al., 1992). In contrast, the alternative splicing factors SXL and PTB bind specific Py tract sequences of regulated splice sites (guanosine-containing uridine tracts or alternating [CU] tracts, respectively) (Perez et al., 1997; Singh et al., 1995; Sosnowski et al., 1989; Valcarcel et al., 1993). Despite their distinct Py tract specificities, the RNA binding domains of U2AF65 (Ito et al., 1999), SXL (Handa

Molecular Cell 50

et al., 1999), and PTB (Conte et al., 2000; Oberstrass et al., 2005; Simpson et al., 2004) are composed of a similar structural scaffold of consecutive RNA recognition motifs (RRM). The RRM, one of the most common types of eukaryotic RNA binding domains, is characterized by two ribonucleoprotein consensus motifs (RNP1 and RNP2) with aromatic and basic residues that interact with single-stranded RNAs (Maris et al., 2005). Specifically, the two central RRMs (RRM1 and RRM2) of U2AF65 comprise the minimal Py tract binding domain (U2AF651,2) (Banerjee et al., 2003, 2004; Zamore et al., 1992). To investigate how these two U2AF65 RRMs accomplish versatile Py tract recognition, we present the X-ray structure and a complementary mutational analysis of the U2AF65 RNA binding domain in complex with polyuridine RNA. The structure reveals that U2AF65 recognizes uridines through a network of hydrogen bond interactions with the base edges rather than shape selection of the smaller pyrimidine compared with purine bases. A significant number of side chain and watermediated hydrogen bonds may explain the ability of U2AF65 to bind a variety of natural Py tract sequences. Results X-Ray Structure Determination of a U2AF65 Variant with Polyuridine RNA Initial attempts to cocrystallize U2AF65 or U2AF651,2 with polyuridine RNAs of 12–20 nucleotides were unsuccessful. One strategy to engineer protein crystallization without interfering with functional activity is to shorten poorly conserved loop regions (Mazza et al., 2002; Nolen et al., 2001). The length and sequence composition of the 30 residue linker between RRM1 and RRM2 of human U2AF65 are phylogenetically variable (Figure 1), and this linker region is predicted to lack a well-defined structure (http://bioinf.cs.ucl.ac.uk/disopred/ and Shamoo et al. [1995]). Accordingly, several U2AF65 variants with shortened linkers were screened for cocrystallization with single-stranded polyuridine oligonucleotides of various lengths (Sickmier et al., 2006). The best crystals were obtained from a human U2AF65 fragment containing RRM1 and RRM2 (dU2AF651,2; residues 148–336 lacking residues 238–257), in complex with a polyuridine RNA dodecamer (rU12). The structure of the dU2AF651,2 complex with polyuridine RNA was determined at 2.5 A˚ resolution by molecular replacement using the X-ray structure of the isolated RRM1 domain (PDB code 2FZR) as the search model. Electron density calculated from the initial molecular replacement solution unambiguously revealed three nucleotides bound to RRM1. Following iterations of model building, refinement, and density modification, a complete structure was built for RRM1 and RRM2 of one polypeptide and 7 of the 12 uridines in the asymmetric unit (rU7). No electron density was observed for the remaining five uridines of the oligonucleotide, indicating that these were either disordered or degraded during crystallization. Crystallographic data and refinement statistics are given in Table 1. Characterization of U2AF65 Variant To confirm that residues 238–257 of the U2AF65 RRM1RRM2 linker were dispensable for pre-mRNA splicing,

the activity of a full-length U2AF65 variant containing the linker deletion (dU2AF65) was tested in vitro (Figure 2). A titration of the wild-type U2AF65 or dU2AF65 linker variant with U2AF65-depleted nuclear extract showed that splicing of the adenovirus major late premRNA (Ad ML) was restored following addition of comparable levels of both proteins (between 0.25 and 0.50 mM), regardless of the linker composition. Thus, the deleted linker residues were dispensable for U2AF65 to function during pre-mRNA splicing in vitro. The RNA binding properties of the wild-type U2AF651,2 and the variant dU2AF651,2 fragments were measured using nitrocellulose filter binding assays with single-stranded RNAs. Both U2AF651,2 and dU2AF651,2 proteins bound an RNA oligonucleotide composed of 20 uridines (rU20) with comparable apparent equilibrium dissociation constants (KD 0.37 6 0.06 mM or 0.41 6 0.04 mM respectively, data not shown). These binding constants were on the order of those previously measured for a U2AF65 fragment binding to natural Py tracts using electrophoretic mobility shift assays (EMSAs) (Zamore et al., 1992). Neither U2AF651,2 nor dU2AF651,2 detectably bound a polyguanosine RNA (rG20), indicating that the sequence preference for polyuridine over polyguanosine is >200-fold assuming a w100 mM KD limit for nitrocellulose filter binding (Hall and Kranz, 1999). In summary, our modification of the RRM1-RRM2 linker region had no apparent effect on the in vitro splicing activity or RNA binding characteristics of U2AF65, establishing that the dU2AF651,2/rU7 complex is a reliable model system in which to study Py tract recognition by U2AF65. The RNA affinity of U2AF651,2 is relatively low compared with the subnanomolar affinity of the corresponding SXL RRMs for its regulated tra Py tract (KD w5 3 10211 M for an SXL fragment containing RRM1 and RRM2) (Kanaar et al., 1995), although in practice the relative affinities of the full-length U2AF65 and SXL proteins for this sequence are closer (KD 1028 M and 1029 M, respectively) (Valcarcel et al., 1993). Association of the U2AF65/pre-mRNA complex with additional splicing factors such as SF1 (Berglund et al., 1998) and U2AF35 (Merendino et al., 1999; Wu et al., 1999; Zorio and Blumenthal, 1999) is functionally important to ensure the affinity and specificity of U2AF65 for the entire 30 splice site consensus (BPS-Py-tract-AG). The binding constants of the isolated U2AF65 RRMs for rU20 (KD 4.0 6 2.0 mM and >100 mM for RRM2 and RRM1, respectively, data not shown), were similar to the RNA affinities of other isolated RRMs (Shamoo et al., 1995). Using these values, a binding constant can be predicted for U2AF651,2/ rU20, assuming a 30 residue unstructured linker separating two RRMs. The predicted KD (0.8 mM, calculated using equations derived by Shamoo and coworkers [Shamoo et al., 1995]) is consistent with the observed KD of the U2AF651,2 fragment, which provides further evidence that the U2AF651,2 interdomain linker does not significantly contribute to RNA binding. Overall Structure The repeating unit of the crystal is a 1:1 stoichiometric dU2AF651,2/rU7 complex, although the dU2AF65 RRM1 and RRM2 that interact with the same rU7 strand are contributed by distinct polypeptide chains (Figure 3A

Polypyrimidine Tract Recognition by U2AF65 51

Figure 1. Sequence Alignment of Py Tract Binding Domains from Representative U2AF65 Homologs Human (accession code X64044), mouse (AAH43071), frog (AAH44032), fly (L23404), worm (AAL00879), plant (NP_176287), and fission yeast (NP_595396) sequences are colored in a gradient from white (%50% identity) to dark green (100% identity). Human U2AF65 secondary structure elements are indicated above the aligned sequences, where primed italics distinguish RRM2 secondary structures from those belonging to RRM1. The U2AF65 ribonucleoprotein consensus motifs (RNP1 and RNP2) are shown with red font, and consensus sequences are given below. Residues below dashes were absent from the dU2AF651,2 variant used for crystallization. Protein/RNA interactions are indicated above the sequences as follows: s, sidechain contact; m, mainchain contact; w, water-mediated contact. Underscores distinguish residues tested for contribution to RNA affinity by site-directed mutagenesis.

and see Figure S1 in the Supplemental Data available with this article online). We have demonstrated that dU2AF65 recognizes the Py tract and functions comparably to unmodified U2AF65 during pre-mRNA splicing, and we used RNA binding experiments with five different site-directed U2AF651,2 mutants (presented below) to establish the energetic importance of the U2AF65/uridine interactions observed in the structure. Furthermore, the dU2AF651,2/rU7 structure is consistent with and explains extensive biochemical literature (Kent et al., 2003; Shen and Green, 2004; Singh et al., 1995, 2000; Valcarcel et al., 1996). Thus, both published work and assays presented here support the relevance

of the dU2AF651,2/rU7 structure for interpreting uridine recognition by RRM1 and RRM2 of U2AF65. The overall dimensions of the integrated structural unit containing U2AF65 RRM1 and RRM2 bound to the same RNA strand (henceforth called dU2AF651,2/rU7) are 55 3 37 3 20 A˚ (Figure 3B). Each RRM has the expected topology of four b strands and two a helices, with an additional b extension of the second a helix (b4) that forms a short b hairpin with the C-terminal b5 strand. The topologies of the dU2AF651,2 RRMs are essentially the same as the isolated, RNA-free U2AF65 RRMs (rmsd 2.0 A˚ for 78 matching RRM1 Ca atoms, and 2.2 A˚ for 84 matching RRM2 Ca atoms), which

Molecular Cell 52

Table 1. Summary of Data Collection and Refinement Statistics Crystallographic Data X-ray source Wavelength Space group Unit cell

Diffraction limit Redundancy Completenessa Rsyma,b a

NSLS X8C 1.0 A˚ P6522 a = b = 129.4 A˚ c = 62.6A˚ a = b = 90º g = 120º 30–2.5 A˚ 7.8 99.6 (99.1)% 7.2 (47.6)% 11.0 (4.8)

Refinement Statistics Number of atoms

R factorc,d Rms deviations Average temperature factors

Protein RNA Water Dioxane (C4H8O2) Rcryst Rfree Bond lengths Bond angles Protein RNA

1333 140 31 6 26.0% 27.6% 0.008 A˚ 1.5º 65.8 A˚2 77.1 A˚2

a

Values in parenthesis are for the highest-resolution shell, 2.59–2.50 A˚. P P P P b Rsym = hkl ijIi 2 j/ hkl i Ii where Ii is an intensity I for the ith measurement of a reflection with indices hkl and is the weighted mean of all measurements of I. c All data used for refinement. P P d Rcryst = hklkFobs(hkl)j 2 kjFcalc(hkl)k/ hkljFobs(hkl)j for the working set of reflections, Rfree is Rcryst for 10% of the reflections randomly excluded from the refinement.

were previously determined using nuclear magnetic resonance (NMR) (Ito et al., 1999). The largest differences between these structures are due to crystal contacts that rearrange an unusually long a1-b2 loop of RRM1. This potentially flexible region is required for the association of U2AF65 with the RNP-dependent ATPase, UAP56 (Fleckner et al., 1997). Likewise, comparison with the X-ray structure of unliganded RRM1 shows that the structure remains largely unchanged by association with RNA (0.6 A˚ rmsd for 82 matching Ca atoms). The RNA-induced twisting of the RRM2 b20 -b30 loop (where primed italics distinguish RRM2 secondary structure) is the most notable difference between the unliganded and RNA bound X-ray structures. The dU2AF651,2/rU7 complex buries a significant amount of surface area (2200 A˚2), which is comparable to the surface area buried in other structures of tandem RRMs bound to RNA (2400 A˚2–2600 A˚2) (Deo et al., 1999; Handa et al., 1999). The b sheet surfaces of the two RRMs face one another across a deep cleft surrounding the oligonucleotide bases, and loop regions encircle the RNA strand. The RNP1 and RNP2 signatures of the RRM fold are located on the central b strands (b3/b30 and b1/ b10 , respectively) at the RNA interface. Although the theoretical isoelectric point of the U2AF65 Py tract binding domain is slightly acidic, the RNA binding groove is highly electropositive and complements the negative charge of the phosphodiester backbone (Figure 3C). The relative orientations of RRM1 and RRM2 from distinct dU2AF651,2 molecules are indirectly stabilized by

Figure 2. The In Vitro Splicing Activity of the dU2AF65 Variant Used for Structural Analysis Is Comparable to that of Wild-Type U2AF65 The concentrations of U2AF65 proteins added to the splicing extract were 0, 0.062, 0.125, 0.25, 0.50, and 1.0 mM, increasing left to right. The products of the splicing reaction are shown schematically to the right.

the shared interface with the bound RNA, in particular by interactions with opposite faces of Uri4 and flanking nucleotides as the RNA threads through the RRM1/ RRM2 cleft (Figure 3C and Figure 5E). A few direct RRM1/RRM2 interactions bury 504 A˚2 of the dU2AF651,2 protein’s surface area, consistent with other complexes of tandem RRMs with RNA; for example, 550 A˚2 of surface area is buried between RRM1 and RRM2 of polyadenosine binding protein (PAB) (Deo et al., 1999). Encircling the central Uri4 nucleotide, RRM1/RRM2 contacts include hydrogen bonds (w2.8 A˚) between the carbonyl of Lys195 (RRM1 b2-b3 loop) and Lys260 side chain (RRM2 b10 ), and weak contacts (w4 A˚) between the sidechains of Asp194 (RRM1 b2-b3 loop) and Asn289 (RRM2 b20 ). Around Uri3, the RRM2 C terminus is located near the sidechain of Gln222 in the loop before RRM1 b5; around Uri5, an alternate conformation of His230 near the RRM1 C terminus contacts Ser294 in the RRM2 b20 /b30 loop. In contrast, no contacts are observed between dU2AF651,2 RRM domains connected by the same polypeptide chain but bound to distinct RNA oligonucleotides. At the Uri4 nucleotide, the RNA strand makes an abrupt turn so that the 50 and 30 halves, respectively, interact with RRM2 and RRM1 of symmetry-related polypeptides (Figures 3A and 3B). Each RRM recognizes four nucleotides: RRM2 recognizes Uri1-Uri4, RRM1 recognizes Uri4-Uri7, and Uri4 is bound at the interface between RRM1 and RRM2. The location of Uri4 at the RRM1/RRM2 interface is consistent with simultaneous crosslinking of both U2AF65 RRMs with central uridines of the Ad ML Py tract (Banerjee et al., 2003). The 50 -to-30 orientation of the RNA strand is universally shared among previously characterized RRMs, and the binding site size of four nucleotides per U2AF65 RRM is the most common site size among canonical RRMs (Maris et al., 2005). RNA Backbone Conformation and Interactions with U2AF65 The RNA strand displays a 120º kink between Uri4 and Uri5, with an overall bend of 100º in the RNA axis due

Polypyrimidine Tract Recognition by U2AF65 53

Figure 3. Overall Structure of the Human dU2AF651,2/rU7 Complex (A) View of the crystal packing interactions of the dU2AF651,2/rU7 complex. The symmetry-related protein molecules are shown as green or blue backbone worms, and the bound nucleic acids are shown as purple or yellow ball-and-stick models. One asymmetric unit contains one protein molecule and one rU7 oligonucleotide, so that each rU7 oligonucleotide is bound by RRM1 and RRM2 donated by distinct, symmetry-related protein molecules. (B) View of the integrated structural unit of RRM1 and RRM2 interacting with the rU7 strand, into the RNA binding surface (top panel) or rotated 180º about the vertical axis (lower panel). Secondary structure elements of the dU2AF651,2 protein are labeled (primed italics distinguish RRM2 secondary structures) on a ribbon model that is colored by domain as follows: RRM1 (residues 148–237), green; and RRM2 (residues 258–336), blue. A composite omit electron density map at 1s contour level surrounds a stick representation of the oligonucleotides (purple). (C) Electrostatic surface representation of dU2AF651,2, shown bound to a stick representation of the RNA in a similar orientation as the lower panel of (B) was generated using the program Pymol (http://www.pymol.org). Red areas, highly negatively charged residues; blue areas, highly positively charged residues.

to additional curvature contributed by flanking nucleotides. The distance between the 50 and 30 ends of the rU7 strand is shortened by 66% relative to the fully extended conformation. Several other structures of RNA binding proteins composed of tandem RRMs bound to single-stranded RNA ligands are available for comparison with the dU2AF651,2/rU7 structure (Figure 4). These structures include SXL bound to the tra Py tract, which is contains a 50 terminal guanosine (Handa et al., 1999), PAB bound to an polyadenosine tract (Deo et al., 1999), and HuD bound to a polyuridine tract containing an internal adenosine (Wang and Tanaka-Hall, 2001). The RNA ligands of these different proteins bend to var-

ious extents, as measured by the shortening of the 50 -to30 distance in the bound relative to an ideal, fully extended conformation. The bound polyadenosine ligand of PAB is only slightly bent (43% shortening), whereas the Py tract ligands of SXL and HuD are more bent (60% shortening). The 66% shortening between the polyuridine termini observed in the dU2AF651,2/rU7 complex is slightly greater than previous examples, although the conformation of the Py tract is likely to differ and/or be flexible when bound by unmodified U2AF65. In contrast with the intramolecular hydrogen bonds and base-base stacking interactions that stabilize the bent SXL and HuD bound RNA conformations (Handa

Molecular Cell 54

consistent with the abundant O20 contacts in the SXL/ tra structure (Handa et al., 1999).

Figure 4. Comparison of dU2AF651,2/rU7 RNA Conformation with Available Structures of Tandem RRMs Bound to Continuous, Single RNA Strands Superposition of N-terminal RRMs shows a range of bound RNA conformations (proteins not shown for clarity). Coil representations of the RNA ligands are colored by structure: HuD/(AU)-rich RNA (ARE) (Wang and Tanaka-Hall, 2001), blue; SXL/tra Py tract (Handa et al., 1999), purple; PAB/polyadenosine (polyA) (Deo et al., 1999), pink; and dU2AF651,2/polyuridine (polyU), green.

et al., 1999; Wang and Tanaka-Hall, 2001), no intramolecular RNA/RNA interactions are observed within the U2AF65 bound rU7 strand. Instead, conserved hydrophobic residues of the U2AF65-RNP1 or RNP2 motifs stabilize many of the rU7 backbone positions by packing against the ribose sugars (RRM2-RNP1-Gly301/Uri2, RRM2-RNP1-Tyr302/Uri3, and RRM1-RNP1-Phe197/ Uri5) (Figures 5C, 5D, and 5F). The sugar-phosphate backbones of the relatively straight, 30 -terminal nucleotides (Uri6 and Uri7) lack extensive U2AF65 contacts and are exposed to solvent. Only two of the phosphates engage in direct, electrostatic interactions with the protein (RRM2-Lys328/Uri1 and RRM2-Lys225/Uri4), consistent with the barely detectable interference with U2AF651,2/ Py tract interactions by chemical modification of the oligonucleotide phosphates (Singh et al., 2000). One direct, and one water-mediated hydrogen bond are observed between U2AF65 and the ribose hydroxyls (RRM2-RNP1-Lys300/Uri1-O20 and RRM1-RNP1-Asn196/ H2O/Uri3-O20 ) (Figures 5B and 5D). The presence of a single direct hydrogen bond between U2AF65 and the ribose hydroxyls distinctly differs from the SXL/tra Py tract structure, in which six of the nine O20 hydroxyls are involved in inter- or intramolecular hydrogen bonds (Handa et al., 1999). The dU2AF651,2/rU7 structure explains the 4- to 9-fold difference in the affinities of wild-type U2AF651,2 for DNA and RNA, observed here using surface plasmon resonance (SPR) (Table 2), and previously by EMSAs with U2AF65 and tra sequences substituted with ribouridine compared with deoxythymidine (Singh et al., 2000). Assuming a penalty of 3 kcal/mol for the unsatisfied hydrogen bond potential of Lys300 in the absence of a ribose hydroxyl group (Pace et al., 1996), a 4.5-fold predicted preference for RNA over DNA agrees with the observed specificity. In contrast, SXL displays a >10,000-fold preference for RNA over DNA (Kanaar et al., 1995; Singh et al., 2000),

Py Tract Recognition by U2AF65 A schematic representation and detailed views of the dU2AF651,2 interactions with each of the uridines are shown in Figure 5A and Figures 5B–5G, respectively. The aromatic side chains of the U2AF65-RNP1 and RNP2 consensus motifs stack parallel to the uracils (RRM2-RNP2-Phe262/Uri3, RRM2-RNP1-Phe304/Uri4, RRM1-RNP2-Tyr152/Uri5, and RRM1-RNP1-Phe199/ Uri6). Additional base stacking interactions are observed between conserved Uri2 and RRM2-RNP1 glycines (Gly264 and Gly265). In contrast, no restrictive contacts are observed with the hydrocarbon edges (C5 and C6 atoms) of the uracil rings (Figures 5B–5F). Hence, U2AF65 allocates space to accommodate larger purine nucleotides that often interrupt natural mammalian Py tract sequences without a requirement for global conformational rearrangements (Senapathy et al., 1990). Further, the open environment of the C5 carbons of the uracils allows U2AF65 to tolerate the extra methyl group of thymine, as demonstrated by the similar U2AF65 affinities for thymidine compared with uridine oligonucleotides (Singh et al., 2000). dU2AF651,2 forms sequence-specific hydrogen bonds with up to two of the three polar uracil atoms (Figures 5B–5G) and directly engages all three potential hydrogen bond donor and acceptor atoms at Uri1. These include the uracil-O2 positions (RRM2-RNP1-Lys300/ Uri1-O2, RRM1-RNP1-Lys195/Uri4-O2, RRM2-Asn289/ Uri4-O2, RRM1-His230[main chain-N]/Uri5-O2, and RRM1-Arg150/Uri6-O2) and the uracil-N3 or O4 atoms of several nucleotides (RRM2-Asp293/Uri1-N3, RRM2Ala335[main chain-O]/Uri3-N3, RRM1-Arg228[main chain-O]/Uri5-N3 and RRM2-Thr296/Uri1-O4, RRM2Lys329/Uri2-O4, RRM2-Gln333/Uri3-O4, RRM2-Lys260/ Uri4-O4, and RRM1-Asp231[main chain-N]/Uri6-O4). Additionally, two water molecules indirectly mediate hydrogen bonds between the side chains of RRM1RNP2-Asn155 or RRM2-Asn289 and the Uri3-O2 or Uri4-N3 atoms, respectively. The extensive interactions of U2AF65 explain the w100-fold reduced affinity following methylation of the N3 or O4 positions, which introduces a bulky hydrophobic group and removes the iminohydrogen bond donor (Singh et al., 2000). With the exception of Lys300 and Lys195, the sequence-specific hydrogen bonds of the dU2AF651,2/ rU7 complex are primarily mediated by residues outside the consensus RNP motifs, particularly in the last b strands of RRM1 or RRM2. Strikingly, flexible side chains rather than main chain atoms are responsible for 66% of the sequence-specific hydrogen bonds between U2AF65 and the uracils. Like the U2AF65 RNP motifs, residues that mediate specific hydrogen bonds with the uridines are phylogenetically conserved among U2AF65 homologs (Figure 1). As observed here for U2AF65, specific hydrogen bonds with residues outside the shared RNP consensus motifs often contribute to the sequence selectivity of RRM-containing proteins (Maris et al., 2005; Wang and Tanaka-Hall, 2001), providing one means for the w500 different human RRM-containing proteins to distinguish among the many RNAs in the cellular milieu (Maris et al., 2005).

Polypyrimidine Tract Recognition by U2AF65 55

Figure 5. Uridine Recognition in the dU2AF651,2/rU7 Complex (A) Schematic diagram of dU2AF651,2/rU7 interactions. Residue names are boxed and colored as in (B)–(H). Filled boxes indicate main chain contacts, and open boxes indicate side chain interactions. Dashed lines represent hydrogen bonds, dashed lines ending with a filled circle represent stacking interactions, and solid lines represent ionic interactions. (B–H) Ball-and-stick representations of the interactions between dU2AF651,2 and each bound uridine are shown. Residues belonging to RRM1 are colored green, residues from RRM2 are colored blue, and residues altered by site-directed mutagenesis are colored yellow. RNA bonds are colored purple, with a lighter shade distinguishing the C5-C6 bond of the base edges. The oxygen atoms of the stick representations are colored red, and nitrogen atoms are blue. Dashed lines indicate hydrogen bonds. Recognition of (B) Uri1, (C) Uri2, (D) Uri3, (E) Uri4, (F) Uri5, (G) Uri6, and (H) Uri7.

Mutational Analysis of RNA Binding by U2AF65 Variants To test the energetic contribution of the interactions observed in the dU2AF651,2/rU7 structure to Py tract binding, we analyzed a series of site-directed mutant U2AF651,2 proteins using SPR (Table 2). The affinities of the wild-type U2AF651,2 and variant dU2AF651,2 proteins for an immobilized RNA strand composed of 20 uridines (rU20) were comparable when determined using SPR or nitrocellulose filter binding assays. Since the dU2AF651,2 and U2AF651,2 proteins displayed only

a 4- to 9-fold preference for the ribose-containing rU20 over a deoxyribose-counterpart (dU20), the more stable dU20 oligonucleotide was chosen for subsequent assays. The wild-type U2AF651,2 construct was modified by site-directed mutagenesis to evaluate the energetic contribution of the structurally observed interactions to Py tract binding by the natural U2AF65 protein. Altered residues were chosen to span the length and engage in different types of interactions with the RNA strand (Figure S2A), ranging from aromatic stacking to hydrogen bond formation. Alanine was substituted for most

Molecular Cell 56

Table 2. Surface Plasmon Resonance Analysis of U2AF65 Variants Proteina

Location of Mutation

Nucleic Acidb

(mM)c

Fold Effectd

wtU2AF651,2



dU2AF651,2

Deletion between RRM1 and RRM2

Phe199Ala-U2AF651,2 Lys260Ala/Asn289Ala-U2AF651,2 Gly301Ile-U2AF651,2 Phe304Ala-U2AF651,2 Phe262Ala-U2AF651,2

RRM1-RNP1 RRM1/RRM2 interface RRM2-RNP1 RRM2-RNP1 RRM2-RNP2

rU20 dU20 rU20 dU20 dU20 dU20 dU20 dU20 dU20

3.4 6 1.8 14.1 6 0.2 1.8 6 0.1 15.8 6 1.9 129 6 21 123 6 7.8 73.3 6 0.6 250 6 2 223 6 1

0.2 1.0 0.1 1.1 9.0 8.7 5.2 17.7 15.8

a

The indicated site-directed mutations were made in the wild-type U2AF65 Py tract binding domain (wtU2AF651,2). Polyuridine oligonucleotides were 20 nucleotides in length, with either ribose (rU20) or deoxyribose backbones (dU20). c Average apparent dissociation constants (KD) from at least two SPR experiments. d Fold reduction in binding affinity (increase in KD) relative to wtU2AF651,2/dU20. b

residues, except the replacement of Gly301 with a bulkier isoleucine residue. All of the U2AF651,2 site-directed mutations reduced affinity for dU20 by several-fold (Table 2). The conserved phenylalanines in the RNP motifs were most important for RNA affinity and displayed an energetic contribution (w5 kcal/mol each) that was slightly greater than expected based solely on loss of the buried hydrophobic surface area (3.2 kcal/mol). The penalties for substituting either RRM2-RNP2-Phe262 or RRM2-RNP1-Phe304 with alanine were greater by 0.4 kcal/mol than the penalty for RRM1-RNP1-Phe199 (Ito et al., 1999). Accordingly, Phe262 and Phe304, respectively, interact with the centrally located Uri3 and Uri4 and may stabilize the global RNA conformation beyond local base-stacking interactions, whereas Phe199 stacks solely with Uri6 near the relatively straight 30 end of the oligonucleotide. A double Lys260Ala/Asn289Ala mutation that removed U2AF65 hydrogen bond donors and acceptors with Uri4 also dramatically reduced RNA binding (9-fold). The smallest effect (5-fold) was observed for the Gly301Ile-substitution, which packs against the sugar-phosphate backbone of Uri2 rather than extensively interacting with the base. Consistent with their structural and energetic importance for Py tract interactions, the U2AF65 RNP1 and RNP2 residues are identical among homologs from diverse organisms ranging from plants to mammals, with the exception of conservative substitutions in distantly related fission yeast (Figure 1). The excellent correlation between the dU2AF651,2/rU7 structure and the RNA binding properties of the mutant U2AF651,2 proteins, coupled with previous biochemical experiments (Kent et al., 2003; Shen and Green, 2004; Singh et al., 2000; Singh et al., 1995; Valcarcel et al., 1996) and the results of our functional assays, confirms that the dU2AF651,2/rU7 structure provides a reliable tool for interpreting U2AF65/Py tract recognition. Discussion Versatile Py Tract Recognition by U2AF65 In theory, U2AF65 could distinguish purines (adenine and guanine) from pyrimidines (uracil and cytosine) on the basis of size, or all four bases could be distinguished by their unique patterns of hydrogen bond donors and acceptors. The structure reveals that dU2AF651,2 adopts the latter mechanism, recognizing the rU7 strand

by forming unique hydrogen bonds with the uracil edges rather than shape selective recognition. Frequently, the hydrogen bonds with the uracil bases are formed with side chains, which have the flexibility to retreat from unfavorable contacts if faced with a different base. For example, lysine residues, such as U2AF65 Lys260, Lys300, or Lys329, generally undergo side chain rearrangements upon ligand binding (Najmanovich et al., 2000). Additional hydrogen bond donors and acceptors on several of the interacting side chains could easily interconvert by torsional rotations (including Asn289, Gln333, and Thr296), as observed for residues in the RNP motifs of hnRNP A1 (Vitali et al., 2002). This provides one structural basis for the apparently contradictory abilities of U2AF65 to specify uridine-rich splice sites, yet tolerate a variety of relatively divergent metazoan Py tract sequences. Two bound water molecules mediate interactions between U2AF65 side chains and sequence-specific atoms of the uracil bases. Although prominent roles have yet to be established for water molecules during protein/RNA recognition, the importance of bound water molecules has been established for many protein/DNA complexes, including the trp repressor, paired homeodomain, and Smad3 MH1 (Jayaram and Jain, 2004). Most of the bound water molecules in protein-DNA complexes separate electrostatically repulsive polar atoms or extend short side chains to achieve hydrogen bonds with the nucleic acid, as observed for two water-mediated contacts in the dU2AF651,2/rU7 complex. In addition to flexible side chain rearrangements, relocation of bound water molecules is a second possible mechanism for the U2AF65 structure to accommodate base substitutions within natural Py tract sequences. Comparison with Py Tract-Specific Splicing Regulators Comparison of Py tract interactions by U2AF65, which is an essential splicing factor, with those of the specific splicing regulators SXL and PTB, reveals guidelines for general versus sequence-specific RNA recognition by RRMs. First, only three polar atoms of the seven uracils form hydrogen bonds with U2AF65 main chain peptide bonds. In contrast, SXL uses eight main chain hydrogen bonds to recognize nine bases of the tra Py tract (50 UGUUUUUUU-30 ), including one specifying the distinctive guanine exocyclic amine (Handa et al., 1999). The

Polypyrimidine Tract Recognition by U2AF65 57

PTB RRMs also display several mainchain interactions with an alternating (CU) tract found in regulated splice sites, such as a- or b-tropomyosin (Oberstrass et al., 2005). Second, U2AF65 recognizes the uracils with three water-mediated hydrogen bonds, which could rearrange easily to accommodate other sequences. Only one water molecule is observed mediating protein/ RNA contacts in the SXL structure (Handa et al., 1999). Third, the hydrocarbon edges of the U2AF65-bound uracils are free of inter- or intramolecular contacts that would confer shape-selective recognition, whereas the pyrimidines of the SXL and PTB structures are engaged in tight inter- and intramolecular contacts (Handa et al., 1999; Oberstrass et al., 2005). Thus, despite similar use of tandem RRM folds, U2AF65 tolerates base substitutions by adopting clearly different structural strategies from the sequence-specific recognition modes of SXL and PTB.

Data Collection, Structure Determination, and Refinement A native dataset was collected at the National Synchrotron Light Source (NSLS) Beamline X8C and processed using DENZO/SCALEPACK (Otwinowski and Minor, 1997). The structure was determined by molecular replacement by using the X-ray coordinates of the unbound RRM1 (PDB code 2FZR, K.R. Thickman and C.L.K., unpublished data) as the search model. The NMR structure of the unbound RRM2 facilitated building the structure using O (Jones et al., 1991), which was refined using CNS (Brunger et al., 1998) (Table 1). The final model includes U2AF65 residues 148–336 (excluding deleted residues 238–257), with three additional N-terminal residues from the protease site, seven uridines plus a 50 -phosphate, one dioxane molecule from the crystallization conditions, and 31 water molecules. Analysis of the structure using PROCHECK (Laskowski et al., 1993) established that none of the (f,c) combinations lie in the disallowed regions of the Ramachandran plot and that the overall G factor (0.23) is significantly better than average for structures of comparable resolution. RNA helical parameters were calculated using the program Curves (Lavery and Sklenar, 1988), and figures were prepared using Bobscript (Esnouf, 1999), Molscript (Kuralis, 1991), Pymol (www.pymol.org), and Raster3D (Merritt and Bacon, 1997).

Significance of Py Tract Bending Polyuridine sequences are inherently more flexible than other RNA polymers, due to the lower stability of intramolecular uracil-uracil stacking (Inners and Felsenfeld, 1970; Norberg and Nilsson, 1995). The rU7 strand of the dU2AF651,2 structure is the most curved among known structures of single-stranded RNAs bound by tandem RRMs, and the conformations of Py tracts bound by SXL and HuD are also unstacked and bent (Handa et al., 1999; Wang and Tanaka-Hall, 2001). Directed hydroxyl radical footprinting experiments show that the BPS and 30 splice sites are brought close together by association of U2AF65 with a functional Py tract (Kent and MacMillan, 2002; Kent et al., 2003), and these and UV-crosslinking experiments show that the N-terminal U2AF65 RS domain concurrently interacts with the BPS and the downstream 30 splice site (Kent et al., 2003; Shen et al., 2004; Valcarcel et al., 1996). Moreover, events at the 50 splice site influence association of U2AF65 with the Py tract (Cote et al., 1995; Sharma et al., 2005). A through-space explanation for U2AF65 interactions with linearly distant RNA sequences was proposed previously (Kent et al., 2003), based on the established bends of Py tracts in the presence of other RRM-containing proteins (Handa et al., 1999; Wang and Tanaka-Hall, 2001). With the reservation that the bend of the Py tract and arrangements of the wild-type U2AF65 RRMs may be variable in solution and are likely to differ from those of the dU2AF651,2 variant, this hypothesis is further supported by the bent conformation of the rU7 site observed here.

RNA Splicing Assays HeLa nuclear extract was depleted of U2AF as described (Valcarcel et al., 1997) by chromatography on oligo(dT)-cellulose in the presence of 1 M KCl. Splicing reactions were performed essentially as described previously (Kan and Green, 1999), except that 40% HeLa nuclear extract was used. Spliced products were resolved on 10% denaturing polyacrylamide gels (19:1) in 8 M urea in TrisBorate-EDTA buffer.

Experimental Procedures Protein and RNA Preparation and Crystallization Human U2AF65 and shorter variants were expressed and purified as glutathione S-transferase (GST) fusion proteins in the vector pGEX6P-1 using standard protocols (GE Healthcare). The GST tag was cleaved using PreScission Protease and removed by subtractive glutathione-Sepharose affinity and anion-exchange chromatography followed by a final size-exclusion chromatography step. Details of the crystallization are described elsewhere (Sickmier et al., 2006). In brief, RNA oligonucleotides (Dharmacon) were incubated in a 1:1 ratio with purified proteins, then equilibrated by the hanging drop method with a reservoir solution containing 1.6 M ammonium sulfate, 10% dioxane, and 0.1 M MES (pH 6.0). Crystals were flash cooled to 2180ºC for data collection.

RNA Binding Experiments Preliminary nitrocellulose filter binding experiments with U2AF651,2 ordU2AF651,2 and radiolabeled RNA (50 -UUUUUUUUUUUUUUUU UUUU-30 or 50 -GGGGGGGGGGGGGGGGGGGG-30 ) were performed as described (Hall and Kranz, 1999). Affinities of site-directed U2AF651,2 mutants for 20 nucleotide deoxyuridine strands (dU20) were measured using a BIAcore3000 instrument. dU20 oligonucleotides with 50 biotin labels were immobilized on preconditioned streptavidin sensor chips to a density of 20 response units (RU). Since the off rate (1 s21) for the interaction was beyond the resolution of the instrument (<0.1 s21) (Karlsson, 1999), equilibrium binding experiments were used to obtain the KD. Two-fold serial dilutions of the various U2AF651,2 proteins, covering two orders of magnitude around the KDs, were injected over this surface, at 20 mL/min flow rate in room temperature buffer (containing 100 mM NaCl, 10 mM 4-[2-hydroxyethyl]-1-piperazineethanesulfonate [pH 6.8], 0.05% surfactant-P20). No additional treatments other than running buffer were required to regenerate the baseline. The protein injections were performed in duplicate in random order and were interspersed with periodic injections of buffer to monitor baseline stability. The RU during the equilibrium binding phase was averaged over 100 s and plotted against peptide concentration, and curves were fit using BIAevaluation-v4.1 (Figure S2B). Supplemental Data Supplemental Data include two figures and Supplemental References and can be found with this article online at http://www. molecule.org/cgi/content/full/23/1/49/DC1/. Acknowledgments We deeply appreciate S.K. Burley’s assistance developing this project. We are grateful to R. McMacken for advice and J. Bender, S. Prigge, and J. Wedekind for critically reading the manuscript. M. Morelli and S. Lindley assisted with protein purification. We thank the NSLS staff for use of Beamline X8C at the Brookhaven National Laboratory, which is supported by the U.S. Department of Energy under contract number DE-AC02-98CH10886. H.S. was supported in part by a Charles A. King Trust Fellowship, and work in the M.R.G laboratory was supported by the National Institutes of Health (NIH, grant GM035490). K.E.F. was supported in part by a training grant (T32 GM08403) from the NIH, and E.A.S. was a Lang Fellow.

Molecular Cell 58

Work in the C.L.K. laboratory was supported by the NIH (grant GM070503). Received: January 14, 2006 Revised: April 13, 2006 Accepted: May 8, 2006 Published: July 6, 2006 References Abovich, N., and Rosbash, M. (1997). Crossintron bridging interactions in the yeast commitment complex are conserved in mammals. Cell 89, 403–412. Banerjee, H., Rahn, A., Davis, W., and Singh, R. (2003). Sex lethal and U2 small nuclear ribonucleoprotein auxiliary factor (U2AF65) recognize polypyrimidine tracts using multiple modes of binding. RNA 9, 88–99. Banerjee, H., Rahn, A., Gawande, B., Guth, S., Valcarcel, J., and Singh, R. (2004). The conserved RNA recognition motif 3 of U2 snRNA auxiliary factor (U2AF65) is essential in vivo but dispensable for activity in vitro. RNA 10, 240–253. Berglund, J.A., Abovich, N., and Rosbash, M. (1998). A cooperative interaction between U2AF65 and mBBP/SF1 facilitates branchpoint region recognition. Genes Dev. 12, 858–867. Brunger, A.T., Adams, P.D., Clore, G.M., DeLano, W.L., Gros, P., Grosse-Kunstleve, R.W., Jiang, J.S., Kuszewski, J., Nilges, M., Pannu, N.S., et al. (1998). Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr. D54, 905–921. Conte, M.R., Grune, T., Ghuman, J., Kelly, G., Ladas, A., Matthews, S., and Curry, S. (2000). Structure of tandem RNA recognition motifs from polypyrimidine tract binding protein reveals novel features of the RRM fold. EMBO J. 19, 3132–3141. Coolidge, C.J., Seely, R.J., and Patton, J.G. (1997). Functional analysis of the polypyrimidine tract in pre-mRNA splicing. Nucleic Acids Res. 25, 888–896. Cote, J., Beaudoin, J., Tacke, R., and Chabot, B. (1995). The U1 small nuclear ribonucleoprotein/50 splice site interaction affects U2AF65 binding to the downstream 30 splice site. J. Biol. Chem. 270, 4031– 4036. Deo, R.C., Bonanno, J.B., Sonenberg, N., and Burley, S.K. (1999). Recognition of polyadenylate RNA by the poly(A)-binding protein. Cell 98, 835–845. Esnouf, R.M. (1999). Further additions to MolScript version 1.4, including reading and contouring of electron-density maps. Acta Crystallogr. D55, 938–940. Fleckner, J., Zhang, M., Valcarcel, J., and Green, M.R. (1997). U2AF65 recruits a novel human DEAD box protein required for the U2 snRNPbranchpoint interaction. Genes Dev. 11, 1864–1872. Garcia-Blanco, M.A., Baraniak, A.P., and Lasda, E.L. (2004). Alternative splicing in disease and therapy. Nat. Biotechnol. 22, 535–546. Hall, K.B., and Kranz, J.K. (1999). Nitrocellulose filter binding for determination of dissociation constants. In RNA-Protein Interaction Protocols, S.R. Haynes, ed. (Totowa, New Jersey: Humana Press). Handa, N., Nureki, O., Kurimoto, K., Kim, I., Sakamoto, H., Shimura, Y., Muto, Y., and Yokoyama, S. (1999). Structural basis for recognition of the tra mRNA precursor by the Sex-lethal protein. Nature 398, 579–585. Inners, L.D., and Felsenfeld, G. (1970). Conformation of polyribouridylic acid in solution. J. Mol. Biol. 50, 373–389. Ito, T., Muto, Y., Green, M.R., and Yokoyama, S. (1999). Solution structures of the first and second RNA-binding domains of human U2 small nuclear ribonucleoprotein particle auxiliary factor (U2AF65). EMBO J. 18, 4523–4534. Jayaram, B., and Jain, T. (2004). The role of water in protein-DNA recognition. Annu. Rev. Biophys. Biomol. Struct. 33, 343–361. Jones, T.A., Zou, J.Y., Cowan, S.W., and Kjeldgaard. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr. A47, 110–119.

Jurica, M.S., and Moore, M.J. (2003). Pre-mRNA splicing: awash in a sea of proteins. Mol. Cell 12, 5–14. Kan, J.L., and Green, M.R. (1999). Pre-mRNA splicing of IgM exons M1 and M2 is directed by a juxtaposed splicing enhancer and inhibitor. Genes Dev. 13, 462–471. Kanaar, R., Lee, A.L., Rudner, D.Z., Wemmer, D.E., and Rio, D.C. (1995). Interaction of the sex-lethal RNA binding domains with RNA. EMBO J. 14, 4530–4539. Karlsson, R. (1999). Affinity analysis of non-steady-state data obtained under mass transport limited conditions using BIAcore technology. J. Mol. Recognit. 12, 285–292. Kent, O.A., and MacMillan, A.M. (2002). Early organization of premRNA during spliceosome assembly. Nat. Struct. Biol. 9, 576–581. Kent, O.A., Reayi, A., Foong, L., Chilibeck, K.A., and MacMillan, A.M. (2003). Structuring of the 30 splice site by U2AF65. J. Biol. Chem. 278, 50572–50577. Kuralis, P.J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24, 946–950. Laskowski, R.A., MacArthur, M.W., Moss, D.S., and Thornton, J.M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283–291. Lavery, R., and Sklenar, H. (1988). The definition of generalized helicoidal parameters and of axis curvature for irregular nucleic acids. J. Biomol. Struct. Dyn. 6, 63–91. Maniatis, T., and Tasic, B. (2002). Alternative pre-mRNA splicing and proteome expansion in metazoans. Nature 418, 236–243. Maris, C., Dominguez, C., and Allain, F.H. (2005). The RNA recognition motif, a plastic RNA-binding platform to regulate post-transcriptional gene expression. FEBS J 272, 2118–2131. Mazza, C., Segref, A., Mattaj, I.W., and Cusack, S. (2002). Co-crystallization of the human nuclear cap-binding complex with a m7GpppG cap analogue using protein engineering. Acta Crystallogr. D58, 2194–2197. Merendino, L., Guth, S., Bilbao, D., Martinez, C., and Valcarcel, J. (1999). Inhibition of msl-2 splicing by Sex-lethal reveals interaction between U2AF35 and the 30 splice site AG. Nature 402, 838–841. Merritt, E.A., and Bacon, D.J. (1997). Raster3D: photorealistic molecular graphics. Methods Enzymol. 277, 505–524. Michaud, S., and Reed, R. (1991). An ATP-independent complex commits pre-mRNA to the mammalian spliceosome assembly pathway. Genes Dev. 5, 2534–2546. Najmanovich, R., Kuttner, J., Sobolev, V., and Edelman, M. (2000). Side-chain flexibility in proteins upon ligand binding. Proteins 39, 261–268. Nolen, B., Yun, C.Y., Wong, C.F., McCammon, J.A., Fu, X.D., and Ghosh, G. (2001). The structure of Sky1p reveals a novel mechanism for constitutive activity. Nat. Struct. Biol. 8, 176–183. Norberg, J., and Nilsson, L. (1995). Stacking free energy profiles for all 16 natural ribodinucleoside monophosphates in aqueous solution. J. Am. Chem. Soc. 117, 10832–10840. Oberstrass, F.C., Auweter, S.D., Erat, M., Hargous, Y., Henning, A., Wenter, P., Reymond, L., Amir-Ahmady, B., Pitsch, S., Black, D.L., and Allain, F.H. (2005). Structure of PTB bound to RNA: specific binding and implications for splicing regulation. Science 309, 2054–2057. Otwinowski, Z., and Minor, W. (1997). Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326. Pace, C.N., Shirley, B.A., McNutt, M., and Gajiwala, K. (1996). Forces contributing to the conformational stability of proteins. FASEB J. 10, 75–83. Perez, I., Lin, C.H., McAfee, J.G., and Patton, J.G. (1997). Mutation of PTB binding sites causes misregulation of alternative 30 splice site selection in vivo. RNA 3, 764–778. Reed, R. (1989). The organization of 30 splice-site sequences in mammalian introns. Genes Dev. 3, 2113–2123. Senapathy, P., Shapiro, M.B., and Harris, N.L. (1990). Splice junctions, branch point sites, and exons: sequence statistics,

Polypyrimidine Tract Recognition by U2AF65 59

identification, and applications to genome project. Methods Enzymol. 183, 252–278. Shamoo, Y., Abdul-Manan, N., and Williams, K.R. (1995). Multiple RNA binding domains (RBDs) just don’t add up. Nucleic Acids Res. 23, 725–728. Sharma, S., Falick, A.M., and Black, D.L. (2005). Polypyrimidine tract binding protein blocks the 50 splice site-dependent assembly of U2AF and the prespliceosomal E complex. Mol. Cell 19, 485–496. Shen, H., and Green, M.R. (2004). A pathway of sequential arginineserine-rich domain-splicing signal interactions during mammalian spliceosome assembly. Mol. Cell 16, 363–373. Shen, H., Kan, J.L., and Green, M.R. (2004). Arginine-serine-rich domains bound at splicing enhancers contact the branchpoint to promote prespliceosome assembly. Mol. Cell 13, 367–376. Sickmier, E.A., Frato, K.E., and Kielkopf, C.L. (2006). Crystallization and preliminary X-ray analysis of U2AF65 variant in complex with a polypyrimidine tract analogue by use of protein engineering. Acta Crystallogr. F62, 457–459. Simpson, P.J., Monie, T.P., Szendroi, A., Davydova, N., Tyzack, J.K., Conte, M.R., Read, C.M., Cary, P.D., Svergun, D.I., Konarev, P.V., et al. (2004). Structure and RNA interactions of the N-terminal RRM domains of PTB. Structure 12, 1631–1643. Singh, R., Valcarcel, J., and Green, M.R. (1995). Distinct binding specificities and functions of higher eukaryotic polypyrimidine tract-binding proteins. Science 268, 1173–1176. Singh, R., Banerjee, H., and Green, M.R. (2000). Differential recognition of the polypyrimidine-tract by the general splicing factor U2AF65 and the splicing repressor sex-lethal. RNA 6, 901–911. Sosnowski, B.A., Belote, J.M., and McKeown, M. (1989). Sex-specific alternative splicing of RNA from the transformer gene results from sequence-dependent splice site blockage. Cell 58, 449–459. Valcarcel, J., Singh, R., Zamore, P.D., and Green, M.R. (1993). The protein Sex-lethal antagonizes the splicing factor U2AF to regulate alternative splicing of transformer pre-mRNA. Nature 362, 171–175. Valcarcel, J., Gaur, R.K., Singh, R., and Green, M.R. (1996). Interaction of U2AF65 RS region with pre-mRNA branch point and promotion of base pairing with U2 snRNA. Science 273, 1706–1709. Valcarcel, J., Martinez, C., and Green, M.R. (1997). Functional analysis of splicing factors and regulators. In mRNA Formation and Function, J.D. Richter, ed. (New York: Academic Press), pp. 31–53. Vitali, J., Ding, J., Jiang, J., Zhang, Y., Krainer, A.R., and Xu, R.M. (2002). Correlated alternative side chain conformations in the RNA recognition motif of heterogeneous nuclear ribonucleoprotein A1. Nucleic Acids Res. 30, 1531–1538. Wang, X., and Tanaka-Hall, T.M. (2001). Structural basis for recognition of AU-rich element RNA by the HuD protein. Nat. Struct. Biol. 8, 141–145. Wu, S., Romfo, C.M., Nilsen, T.W., and Green, M.R. (1999). Functional recognition of the 30 splice site AG by the splicing factor U2AF35. Nature 402, 832–835. Zamore, P.D., Patton, J.G., and Green, M.R. (1992). Cloning and domain structure of the mammalian splicing factor U2AF. Nature 355, 609–614. Zorio, D.A., and Blumenthal, T. (1999). Both subunits of U2AF recognize the 30 splice site in Caenorhabditis elegans. Nature 402, 835– 838.

Accession Numbers The dU2AF651,2/rU7 and U2AF65-RRM1 structures have been deposited in the Protein Data Bank under ID codes 2G4B and 2FZR, respectively.