Deciphering 3′ss Selection in the Yeast Genome Reveals an RNA Thermosensor that Mediates Alternative Splicing

Deciphering 3′ss Selection in the Yeast Genome Reveals an RNA Thermosensor that Mediates Alternative Splicing

Molecular Cell Short Article Deciphering 30ss Selection in the Yeast Genome Reveals an RNA Thermosensor that Mediates Alternative Splicing Markus Mey...

462KB Sizes 5 Downloads 45 Views

Molecular Cell

Short Article Deciphering 30ss Selection in the Yeast Genome Reveals an RNA Thermosensor that Mediates Alternative Splicing Markus Meyer,1,4 Mireya Plass,2,5 Jorge Pe´rez-Valle,4,5 Eduardo Eyras,2,3 and Josep Vilardell1,3,6,* 1Centre

de Regulacio´ Geno`mica, Dr. Aiguader 88, 08003 Barcelona, Spain Genomics Group, Universitat Pompeu Fabra, Dr. Aiguader 88, 08003 Barcelona, Spain 3Institucio ´ Catalana de Recerca i Estudis Avanc¸ats (ICREA), Dr. Aiguader 88, 08003 Barcelona, Spain 4Molecular Biology Institute of Barcelona (IBMB), Baldiri Reixach 10-12, 08028 Barcelona, Spain 5These authors contributed equally to this work 6Present address: Molecular Biology Institute of Barcelona (IBMB), Baldiri Reixach 10-12, 08028 Barcelona, Spain *Correspondence: [email protected] DOI 10.1016/j.molcel.2011.07.030 2Computational

SUMMARY

Poor understanding of the spliceosomal mechanisms to select intronic 30 ends (30 ss) is a major obstacle to deciphering eukaryotic genomes. Here, we discern the rules for global 30 ss selection in yeast. We show that, in contrast to the uniformity of yeast splicing, the spliceosome uses all available 30 ss within a distance window from the intronic branch site (BS), and that in 70% of all possible 30 ss this is likely to be mediated by pre-mRNA structures. Our results reveal that one of these RNA folds acts as an RNA thermosensor, modulating alternative splicing in response to heat shock by controlling alternate 30 ss availability. Thus, our data point to a deeper role for the pre-mRNA in the control of its own fate, and to a simple mechanism for some alternative splicing. INTRODUCTION Precise identification of intron boundaries (splice sites [ss]) in eukaryotic pre-mRNAs is critical. Errors result in incoherent and possibly deleterious transcripts and are conductive to diseases (Cooper et al., 2009). The 50 ss selection is assisted by its interactions with U1 snRNP (Wahl et al., 2009). In contrast, the identification of the 30 ss, apparently encoded only by the dinucleotide AG, is a greater challenge because of its multiple occurrence and the lack of a clear selector comparable to U1. Several mechanisms for 30 ss selection have been proposed. They include scanning from the intron branch site (BS) or from a downstream pyrimidine-rich track (Patterson and Guthrie, 1991; Smith et al., 1993); interactions with the 50 ss (Goguel and Rosbash, 1993) or with the downstream exon (Crotti and Horowitz, 2009); distance from the BS (Cellini et al., 1986; Luukkonen and Se´raphin, 1997); and interactions with splicing factors (Smith et al., 2008; Umen and Guthrie, 1995). But despite these findings, global and accurate prediction of the 30 ss remain elusive; however, it is necessary to fully decode eukaryotic genomes.

The relative simplicity of the Saccharomyces cerevisiae genome, with 5% genes containing an intron (Spingola et al., 1999) and readily identifiable BS, provides a unique opportunity for a global analysis of 30 ss selection. Moreover, alternative splicing is underrepresented (Yassour et al., 2009), indicating a predominantly straightforward 30 ss selection. Yet over a third of introns contain up to nine additional 30 ss choices (sequence HAG, H: [UCA]) between the BS and the annotated 30 ss, and nothing apparently distinguishes them from their annotated counterparts. Here, we undertook bioinformatics and molecular approaches to document a set of fundamental rules that explain 30 ss selection in a model genome. Remarkably, a balance between pre-mRNA folding and spliceosome flexibility is required, suggesting important evolutionary implications and alternative ways of splicing regulation. We verified this possibility by showing that in one intron the pre-mRNA structure acts to modulate splicing in response to temperature changes. RESULTS AND DISCUSSION Intronic Features Downstream from the Branch Site Yeast introns show a variable distance between the BS adenosine (BS-A) and the 30 ss, from 10 to 165 nucleotides (Figure S1). This is surprising, as previous reports indicate that such a large distance drastically hinders splicing efficiency (Cellini et al., 1986); but this must be overcome, as most yeast transcripts are efficiently spliced (Clark et al., 2002). Interestingly, the number of potentially cryptic 30 ss between the BS and the annotated 30 ss is significantly lower than expected by chance (c2 p = 1.71 3 1017). This suggests a selective pressure against HAGs upstream of the annotated 30 ss. However, we have found that inactivation by mutation of an upstream HAG reduces VMA10 splicing (Figure S1). Thus, apparently facilitating 30 ss selection can have a negative impact. Interestingly, this mutation destabilizes a putative secondary structure that would occlude this HAG, suggesting the possibility that RNA folding would have a dual role by preventing spliceosomal targeting of occluded HAGs and promoting the use of one located downstream. To test the generality of this hypothesis we used RNAfold (Hofacker, 2009) to predict an optimal secondary structure (i.e., the most likely one) between the BS and the annotated 30 ss using a dataset

Molecular Cell 43, 1033–1039, September 16, 2011 ª2011 Elsevier Inc. 1033

Molecular Cell An RNA Thermosensor Controls APE2 30 ss Selection

A

Density

0.08

structure effective dist. no structure

0.06 0.04 0.02 0.00 0

50

100

150

distance BS to 3’ss (nt) B

no structure 5’SS BS

structure

3’SS 5’SS BS

distance C

3’SS intron

effective distance

accessibility

1.0 0.8 0.6 0.4 0.2 0.0 ann. intron exon cryptic

3’ss

Figure 1. BS-30 ss Distances of Yeast Introns (A and B) Introns are separated into two categories: those that have potential for secondary structure in this region (black), and those that do not (light gray). When the number of nucleotides in the secondary structures of the former is removed from the actual distance, schematized in (B), the distribution of these effective distances (dark gray) does not exceed 45 nt, as with the introns without structure (Wilcoxon signed-rank p = 3.4 3 101). Density expresses the proportion of cases at a given distance from the 30 ss (see the Supplemental Information for details on calculations). (C) Boxplot diagram showing accessibility values for annotated (ann) and cryptic (intronic and exonic) 30 ss. Boxes indicate the quartiles, from bottom to top: Q1 (25% of the data points), Q2 (the median—indicated by a thick line), and Q3 (75% of the data points). The dashed lines, which are limited by the thin lines, indicate the distribution of values outside the Q1 and Q3 values. Outliers, which are shown as open circles, correspond to those values that lie outside the interval [Q1  1.5(Q3  Q1), Q3 + 1.5(Q3  Q1)]. Annotated 30 ss are more accessible than intronic HAGs (Wilcoxon signed-rank p [ann versus intronic] = 5.5 3 105, p [ann versus exonic] = 2.2 3 1016).

of 282 S. cerevisiae introns (SGD, July 2009) (see Experimental Procedures and Supplemental Information). The results of these predictions, summarized in Figures 1 and S1, are available at http://regulatorygenomics.upf.edu/Yeast_Introns. Their analysis reveals several features of the 30 end of yeast introns.

First, 40% (113) of S. cerevisiae introns can potentially form a structure in the BS-30 ss region, and these include all those with a BS-30 ss distance larger than 45 nucleotides (nt) (Figure 1A, black line). Second, the effective BS-30 ss distance distribution of those introns, calculated by subtracting the number of nt contained in the structure (Figure 1B), is not significantly distinct from that of introns without structures (Figure 1A, light gray line). Notably, the effective BS-30 ss distance of all introns is predicted to be never larger than 45 nt. This may represent the maximum distance for efficient splicing in S. cerevisiae, or indeed in yeasts, since a similar result is also observed for other species (Figure S1). It has been shown that suboptimal RNA structure predictions can be relevant as well to determine the biological effects of RNA folds (Rogic et al., 2008). Thus, to assess the consistency of our results, we evaluated 1000 suboptimal structure predictions for each BS-30 ss region and the variations in the distributions of effective distances that they introduce. The results from this approach (Figure S1) are consistent with our previous analysis, supporting our conclusions. In addition, to test whether the properties of the effective distance are specific to real introns, we examined sets of randomized intron sequences (Supplemental Information) with the same BS-30 ss length distribution as that of the set of yeast introns. We observed that in randomized sequences RNA structures also shorten the BS-30 ss effective distance to values similar to those found in sequences without a predicted secondary structure. However, the maximum effective distance observed is significantly longer (60 nt) than in real introns (Figure S1). To evaluate the possibility that RNA folds occlude cryptic 30 ss from the spliceosome, we calculated for each HAG (potential 30 ss) the probability of being unpaired (accessibility) in the BS30 ss region (intronic HAG) and in the downstream exon (exonic HAG). We found that accessibility values are significantly higher for real than for cryptic 30 ss (Figure 1C). In particular, real 30 ss are predicted to be more accessible than intronic HAGs, which is not expected in randomized sequences, as longer segments are more likely to fold, increasing the probability of occluding HAGs. These results, replicated in other yeast species analyzed (Figure S1), are consistent with a role for RNA structure in 30 ss selection. 30 ss Selection Mediated by RNA Folding To validate our predictions, we assessed 30 ss usage in vivo in transcripts bearing mutations expected to disrupt the BS-30 ss structure. Subsequently we asked whether restoring the original structure by additional mutations would re-establish proper 30 ss selection. This strategy was tested in the RPS23B intron where the 30 ss is at 60 nt from the BS (Figure S2). We predicted that a cryptic 30 ss in this region is occluded by an RNA structure and that disruption of this structure should activate it. Thus, the RPS23B intron was inserted into a CUP1 reporter (Lesser and Guthrie, 1993) and splicing was monitored by primer extension analyses (Figure 2). The results show that in the WT RPS23B only the annotated splice site is used (RPS23B-1) (Figure 2B, lane 1). This pattern switches when the structure is disrupted and the alternative AG becomes the sole target of the spliceosome (RPS23B-2) (Figure 2B, lane 2). Importantly, the WT

1034 Molecular Cell 43, 1033–1039, September 16, 2011 ª2011 Elsevier Inc.

Molecular Cell An RNA Thermosensor Controls APE2 30 ss Selection

A

RPS23B

B

RPS23B-1 (wt)

dbr1Δ

_ _ _ _ _ _ _ _

A RPS23B-2

1 1

x x x x x

3

4

1

4

RPS23B-

2

AG

1

1 2

2

A RPS23B-3

2

AG x __ x x_x x x x_ x x_x _ _ _

A RPS23B-4

1

2

AG

x x x x x

U6 2

A

AG

C

1

2

3

4

5

6

VMA10

D

VMA10-1 (wt)

_ _ _ _ _ _ _ _ _ _

d=57 nt A

1 2 5 6 VMA1057 57 51 45 d (nt)

A1 A2

3

AG

1

VMA10-2 xx

1

x d=57 nt xx

2

A

3

3

AG

VMA10-5 d=51 nt

x x x x x

*

1 2

A

3

AG

VMA10-6 d=45 nt A

U6 1

x x x x x

2

3

4

pattern is restored when the structure is reinstated by complementary mutations (RPS23B-3) (Figure 2B, lane 3). In this case the splicing pattern is undistinguishable from that of the WT, in agreement with our predictions. The loss of splicing to the annotated 30 ss when the structure is disrupted could be attributed to a preference by the spliceosome for the first AG from the BS (Smith et al., 1993), or to an excessive (>45 nt) distance of the second AG from the BS, as suggested by our bioinformatics analyses. To distinguish between these two possibilities, we mutated the upstream 30 ss (AG to AU) in RPS23B-2. This mutation blocks the second step of splicing, as shown by the lack of accumulation of pre-mRNA and of mRNA and by the buildup of first step lariat intermediate in a dbr1D strain (RPS23B-4) (Figure 2B, lane 6) (dbr1D cells lack debranching enzyme, required for the proper turnover of first step lariat intermediates). These observations are consistent with the possibility that the annotated 30 ss cannot be reached by the spliceosome without a structure that brings it closer to the BS. Thus, we conclude that the predicted RPS23B structure forms in vivo and allows for the proper selection of the 30 ss. This was further validated in the VMA10 intron (Figure S2). Accordingly, we used it to determine the range for optimal 30 ss selection by bringing the first available 30 ss closer to the BS. We found that distances greater than 45 nt diminish splicing efficiency (Figures 2C and 2D), in agreement with our in silico findings. While the existence of an optimal range for 30 ss selection has been previously shown (Chua and Reed, 2001 and references therein), the suggested scope (20–30 nt from the BS) does not explain many naturally occurring introns. Our conclusions are consistent with an alternate spliceosomal range for the second step of catalysis and help to explain S. cerevisiae’s global 30 ss usage. We envision two compatible explanations for the spliceosomal range: it could reflect a balance between a search for a splice site and the activation of a proofreading mechanism, or it could represent the physical distance that can be reached by the spliceosome active center. While further research is necessary to evaluate these two models, we speculate that a pure kinetic model could allow for changes in scope, for example by stabilizing the conformation of the spliceosome required for the second step (Konarska et al., 2006).

1 2

3

AG

Figure 2. 30 ss Selection Mediated by RNA Structures (A and B) Validation of the RPS23B fold (Figure S2). (A) Diagrams of the tested RPS23B constructs. Possible 30 ss are numbered, and the used one is shown in black. Paired and mutated regions are indicated by hyphens and small marks, respectively. (B) Primer extension analyses of RNA from cells containing the constructs shown in (A), as indicated. Extension products are shown on the right, with numbers pointing to the 30 ss being used. Lanes 5–6 indicate samples from a dbr1D strain, lacking debranching enzyme. (C and D) The VMA10 intron (Figure S2) was used to assess the BS-30 ss distance effect on 30 ss efficiency, measured by primer extension. (C) Diagrams of the VMA10 constructs used, formatted as in (A). The distance (in nt) between the BS-A and the first 30 ss is shown. (D) Primer extension analyses of RNA from cells containing the constructs shown in (C), as indicated, formatted as in (B). Lane 1, VMA10-1 (WT); lane 2, VMA10-2 with mutated stem; lanes 3–4, VMA10-5 and 6 (respectively) constructs, based on VMA10-2 but with AG-1 increasingly closer to the BS, as indicated. The ‘‘*’’ indicates an uncharacterized band.

All Available 30 ss Can Be Used According to our predictions, 81 out of the 134 HAGs located between the BS and the annotated 30 ss are occluded from the spliceosome by an RNA structure (http://regulatorygenomics. upf.edu/Yeast_Introns). Out of the remaining 53 HAGs, 32 are located at a distance of 9 nt or less from the BS, whereas the shortest natural occurrence is of 10 nt. As for the rest (21), they are all AAGs. Therefore, we made two additional predictions, First, that AGs located closer than 10 nt to the BS cannot be used by the spliceosome, as was indicated for the ACT1 intron (Patterson and Guthrie, 1991); second, that the yeast spliceosome selects C/UAG over AAG, as reported in other systems (Smith et al., 1993). These predictions were tested in vivo with constructs based on the DMC1 intron with an AAG and a UAG at 16 and 28 nt from the BS, respectively. In the WT construct the spliceosome mostly selects the downstream 30 ss (DMC1-1) (Figure 3B, lane 1). This preference is not due to the sequence

Molecular Cell 43, 1033–1039, September 16, 2011 ª2011 Elsevier Inc. 1035

Molecular Cell An RNA Thermosensor Controls APE2 30 ss Selection

A

16 nt 12 nt DMC1-1 A AAG UAG 1 2 DMC1-2 A UAG UAG DMC1-3 A

UAG

AAG

UAG 9 nt

AAG

DMC1-5 A

UAG

UAG

DMC1-6 A

UAG

DMC1-4

A

x x x xx

B

UAG

Figure 3. Distance Constraints for Structures and 30 ss

C VMA10-1 VMA10-7 to 9 1 2 3 4 5 6 DMC11 1 2

A

D

_ _ _ _ _ _ _ _ _ _

A1 G 3 A2 AG G

1 7 8 20 14 8

A

d

_ _ _ _ _ _ _ _ _ _

A1 G 3 A2 AG G

9 VMA105 d (nt)

2 3

U6 1 2 3 4 5 6

U6 1 2 3 4

context, as the UAG is always selected, irrespective of its position, and AG’s are equally used when they are both UAG (Figure 3B, lanes 2 and 3). However, to be selected an AG must be at more than 9 nt from the BS (Figure 3B, lane 4). To determine whether the presence of an RNA fold between two available HAGs interferes with their selection we introduced the RPS23B stem into DMC1-2, with two active UAG. Remarkably, there is no change in the ratio of usage of either 30 ss (DMC1-5) (Figures 3B, lane 5, and S3), indicating that the spliceosome is able to probe for potential 30 ss upstream as well as downstream from the stem (but not inside). Consistent with our model, disrupting the stem (construct DMC1-6) renders the downstream AG inaccessible, now too far from the BS (Figure 3B, lane 6). Taken together, these results enable us to describe how the spliceosome selects a 30 ss. To be used, a 30 ss has to be located within a window of 10 to 45 nt from the BS, either linearly or with the help of an RNA fold. Moreover, if there are two splice sites within this window, but outside a structure, both will be equally used, unless one is stronger than the other (PyAG > AAG). The Spliceosome Will Open a Stem Located Too Close to the Branch Site We next asked whether a structure can occur anywhere between the BS and the selected 30 ss. Thus, we decreased the distance between the BS and the stem loop of the VMA10 intron, from 20 to 5 nt. Our data show that a stem is not fully formed when predicted to be close to the BS. In fact, positions closer than 10 nt from the BS will remain unpaired. Thus, an HAG that would otherwise be occluded becomes available if a stem is moved closer to the BS (Figure 3D, lane 3). Moreover, moving the stem even closer can in fact put out of spliceosomal reach a downstream HAG previously available, since the lack of folding leads to an increased effective distance to the BS (Figure 3D, lane 4). This, taken into account in our in silico predictions, indicates that while the spliceosome shows a remarkable tolerance to a diversity of RNA folds, a region of 9 nt downstream the BS will not harbor any structure, and likewise AGs located here will be ignored. Thus, it appears that the spliceosome needs to have access to this region before catalyzing exon ligation. This is

(A and B) Splice site strength and minimal distance between BS and 30 ss. (A) Diagrams of the different DMC1 constructs (see also Figure S3) monitored by primer extension analyses shown in (B). Constructs DMC1-5 and 6 include the RPS23B stem loop (WT and open, respectively) between the UAGs in DMC1-2. ‘‘A’’ denotes BS-A. (C and D) Requirement for a minimal distance between the BS and the beginning of a structure. Splicing patterns of constructs depicted in (C), based on VMA10 but with a decreasing distance between the BS-A and the base of the stem, are shown in (D), as analyzed by primer extension. Top row numbers indicate the VMA10 construct (1 is WT), showing below the distance to the stem from the BS-A. Primer extension products are indicated on the right of panels (B) and (D), with numbers indicating the selected AG.

consistent with both substrate requirements for in vitro spliceosome assembly (Fabrizio et al., 2009) and second-step blocking by a strong hairpin downstream of the BS (Liu et al., 1997). Our results argue against an exclusively scanning mechanism to explain 30 ss selection. It is difficult to accommodate scanning with 30 ss selection by RNA folding, and we show that the spliceosome uses all equivalent 30 ss in range (Figure 3). Consistent with this, scanning does not explain splicing patterns of some human introns (Chen et al., 2000). Interestingly, the minor U12-dependent spliceosome removes introns with a small BS-30 ss region, similar to that of yeast (Will and Lu¨hrmann, 2005). Our data indicate that RNA folds are more than a particularity of some transcripts. They are critical for splicing and widespread in several fungi species (this work and Deshler and Rossi, 1991). Moreover, 30 ss selection by RNA folding tends to be conserved in homologous genes (http://regulatorygenomics.upf.edu/ Yeast_Introns), arguing for a selective pressure to keep them. Notably, our defined set of rules for 30 ss selection (schematized in Figure S3) explains the splice site choice of all but one (REC102, in Figure S3) yeast intron. Comparison of sequence elements inside and outside secondary structures shows an enrichment of U-rich motifs outside structures, which reflects the higher propensity of sequences containing purines to be in a fold (Tables S2 and S3). Thus, if a factor were to be required for the function of these diverse structures, it would be unlikely that this is mediated by a particular sequence motif. Likewise, there is no apparent correlation between the presence of an RNA structure and the function of factors involved in 30 ss selection, such as prp18 (Umen and Guthrie, 1995) and slu7 (Frank et al., 1992) (Figure S3). Taken together, these results suggest that the RNA structures in the BS-30 ss region control 30 ss selection, although we cannot exclude additional factors that could modulate this role. The APE2 BS-30 ss Fold Is a Thermosensor that Modulates Alternative Splicing Given that 40% of S. cerevisiae introns potentially harbor a structure between the BS and the 30 ss, it is possible that some of them could have a more complex role in 30 ss selection. This

1036 Molecular Cell 43, 1033–1039, September 16, 2011 ª2011 Elsevier Inc.

Molecular Cell An RNA Thermosensor Controls APE2 30 ss Selection

A

wt

B 23

A A C

A

C

mut A mut B mut C 37 23 37 23 37 23 37 (T ºC)

A C G A

30

U

U A G C

20

40

U G 5' A

*

50

AAG

U A A G G A C A A C U A A A A U G A C C A G 3'

wt

17

21

39

A

C

A

C

C

A

U A

G C

A

C

A

G

A

A

U G A

A

A

A

mut A

A

+T

U

U A

A

C

A C G

A

U

C

A

C G

CAG

A A

A A

A A C

A

mut B

A C G

-T

A

U

U A

A

A

U

A

U

U A

A

mut C

U6

A A A

C A 20

5'

A 17

0.7

A C G

30

0.6

U

U

A

G

C

U

G

U

A A G G A C A A C U A A A A U G A C C A G 3'

Isoform fraction

C

0.5

40

29

50

47

0.4

CAG

0.3

AAG

0.2 0.1 0.0

1 2 3 4 5 6 7 8

Figure 4. The APE2 RNA Thermosensor

(A) RNA structure that controls selection of the 30 ss (boxes). Numbers refer to the BS-A, and the effective distance is shown in italics. Two possible alternate thermosensor states are depicted in balance, indicating factors promoting either state. Mutations (in circles) introduced to disrupt (gray) and restore (black) the structure are indicated. Even in the ‘‘open’’ state, G30 is paired to C23, as the AAG30 only gets activated by mutating the stem up to A22 or to C23 (Figure S4). (B) Primer extension analyses monitoring splicing of APE2 constructs at 23 and 37 C, as indicated (extension products indicated on the right; ‘‘*’’ indicates a cryptic 30 ss upstream of the stem). Below each lane a quantification of three independent experiments is shown ± SEM. Bars indicate the AAG (white) and CAG (black) fraction of transcripts.

could be mediated by binding regulatory factors, by working as a riboswitch (Wachter, 2010), or by acting directly as a thermosensor (Kaempfer, 2003; Neupert and Bock, 2009). In prokaryotes, it has been shown that RNA structures can block ribosome access in response to temperature changes (Neupert and Bock, 2009). Hence, we theorized that an analogous mechanism could act on 30 ss selection, as weakening of a stem can have a direct effect on 30 ss usage. Thus, in a search for yeast transcripts with a 30 ss selection likely to be both mediated by a structure and affected by temperature change (heat shock), we identified APE2, encoding an amino peptidase. Splicing of APE2 premRNAs switches 30 ss upon heat shock (Yassour et al., 2009), and our predictions show that their the BS-30 ss region folds into a small stem loop occluding two AAG and bringing the annotated CAG into spliceosomal range (Figures 4A and S4). Thus, according to our model, formation of this stem will trigger CAG selection. Our data are consistent with this and the previous report (Yassour et al., 2009), although the dual AAG/CAG usage suggests that the stem only forms completely in a fraction of transcripts. Upon heat shock the stem becomes less stable, and selection of the distant CAG is diminished (Figure 4B, lane 2).

Accordingly, splicing to the upstream AAG is promoted, leading to the addition of six residues to the Ape2 N-terminus region. We predicted that mutations weakening the APE2 stem will mimic the heat-shock response at low temperatures, possibly up to the point of blocking the change in 30 ss induced by heat. We tested this by sequentially opening the stem. Notably, a singlenucleotide mutation at the bottom of the stem is enough to switch the 30 ss prevalence toward the heat-shock pattern, leaving a residual heat sensitivity (Figure 4B, lanes 3–4). Further disruption of the stem leads to an APE2 splicing pattern that favors mostly AAG selection and is insensitive to heat (Figure 4C, lanes 5–6). Importantly, reinstating the stem by compensatory mutations restores the splicing switch induced by heat (Figure 4C, lanes 7–8). We argue that the increased AAG/CAG ratio in this construct is caused by the weaker reconstituted stem (AU versus GC base pair). Altogether, our data are consistent with the APE2 intron folding into a small stem loop that acts as an RNA thermosensor by detecting temperature changes and modulating 30 ss selection accordingly. While in prokaryotes, thermosensors and riboswitches are relatively common, controlling transcription and translation, in eukaryotes only one class

Molecular Cell 43, 1033–1039, September 16, 2011 ª2011 Elsevier Inc. 1037

Molecular Cell An RNA Thermosensor Controls APE2 30 ss Selection

of riboswitches has been described. They modulate pre-mRNA splicing by sophisticated conformational changes induced upon thiamine pyrophosphate (TPP) binding (Neupert and Bock, 2009; Wachter, 2010). Our findings for the APE2 thermosensor indicate that simpler switching strategies, analogous to the prokaryotic thermosensors that control ribosome binding, are also present in eukaryotes to control spliceosome access. This presents additional possibilities of splicing control, and it will need to be taken into account when deciphering eukaryotic genomes. EXPERIMENTAL PROCEDURES Bioinformatics Analyses A total of 282 introns, defined as unambiguous sequences with canonical splice sites in chromosomal genes, were extracted from the S. cerevisiae genome (SGD, July 2009). To predict BS, introns were scanned for NNNTRACNN motifs up to 200 nt upstream from the 30 ss. Those with the smallest Hamming distance to the TACTRACNN motif were predicted as BS. When several motifs with identical Hamming distances were found, an additional selection based on the potential base pairing to U2 snRNA was applied using RNAcofold (Hofacker, 2009), forcing the BS-A to be unpaired. When several motifs had the same potential, the closest to the 30 ss was selected. For secondary structure predictions, we used RNAfold (Hofacker, 2009) with default parameters, using the sequence region between 8 nt downstream from the BS-A and the 30 ss and discarding both signals. To test the depletion of triplets in introns we compared the observed and expected frequencies, which were calculated assuming that nucleotide positions are independent. Individual nucleotide frequencies were calculated in the region between the BS and the 30 ss, discarding the first 7 nt after the BS-A. The expected probability of each triplet was then calculated as the product of the individual observed frequencies. Accessibility of a nucleotide was defined as one minus the probability of its being paired. Pairing probabilities were calculated using RNAfold (Hofacker, 2009). Given the same sequence and absence of bias, a larger window is expected to give lower accessibility values, as the folding probability increases. The predictions for all yeast species analyzed (Table S1), pair probabilities for the structures, and accessibility values can be found at http:// regulatorygenomics.upf.edu/Yeast_Introns/. Further details on the bioinformatics analyses can be found in the Supplemental Information.

Spanish Ministry of Science (BIO2008-01091 and 363); MC #510183; EURASNET (LSHG-CT-2005-518238); CSIC (200920I195); and Agaur. M.M. is supported in part by a Training of Researchers (FI) fellowship (Agaur) and M.P. by a Carlos III Ph.D. fellowship. Received: March 21, 2011 Revised: June 6, 2011 Accepted: July 27, 2011 Published: September 15, 2011 REFERENCES Cellini, A., Felder, E., and Rossi, J.J. (1986). Yeast pre-messenger RNA splicing efficiency depends on critical spacing requirements between the branch point and 30 splice site. EMBO J. 5, 1023–1030. Chen, S., Anderson, K., and Moore, M.J. (2000). Evidence for a linear search in bimolecular 30 splice site AG selection. Proc. Natl. Acad. Sci. USA 97, 593–598. Chua, K., and Reed, R. (2001). An upstream AG determines whether a downstream AG is selected during catalytic step II of splicing. Mol. Cell. Biol. 21, 1509–1514. Clark, T.A., Sugnet, C.W., and Ares, M., Jr. (2002). Genomewide analysis of mRNA processing in yeast using splicing-specific microarrays. Science 296, 907–910. Collins, C.A., and Guthrie, C. (1999). Allele-specific genetic interactions between Prp8 and RNA active site residues suggest a function for Prp8 at the catalytic core of the spliceosome. Genes Dev. 13, 1970–1982. Cooper, T.A., Wan, L., and Dreyfuss, G. (2009). RNA and disease. Cell 136, 777–793. Crotti, L.B., and Horowitz, D.S. (2009). Exon sequences at the splice junctions affect splicing fidelity and alternative splicing. Proc. Natl. Acad. Sci. USA 106, 18954–18959. Deshler, J.O., and Rossi, J.J. (1991). Unexpected point mutations activate cryptic 30 splice sites by perturbing a natural secondary structure within a yeast intron. Genes Dev. 5, 1252–1263. Fabrizio, P., Dannenberg, J., Dube, P., Kastner, B., Stark, H., Urlaub, H., and Lu¨hrmann, R. (2009). The evolutionarily conserved core design of the catalytic activation step of the yeast spliceosome. Mol. Cell 36, 593–608. Frank, D., Patterson, B., and Guthrie, C. (1992). Synthetic lethal mutations suggest interactions between U5 small nuclear RNA and four proteins required for the second step of splicing. Mol. Cell. Biol. 12, 5197–5205.

Strains and Reporter Plasmids S.cerevisiae strains are derived from BY4741. Reporter plasmids are based on pCC71 (Collins and Guthrie, 1999), where the ACT1 intron has been replaced by the various versions of the different introns studied. As a result, the introns are flanked by 50 nt of ACT1 leader and the CUP1 gene. Cloning strategies and oligonucleotides are available in Supplemental Information.

Goguel, V., and Rosbash, M. (1993). Splice site choice and splicing efficiency are positively influenced by pre-mRNA intramolecular base pairing in yeast. Cell 72, 893–901.

Primer Extension This was performed as described in Siatecka et al., 1999, on RNA from strain BY4741 upf1D carrying the reporter plasmid, unless stated otherwise. Primers used are complementary to CUP1 and to U6, used as loading control. Quantifications are of three independent experiments ± SEM.

Konarska, M.M., Vilardell, J., and Query, C.C. (2006). Repositioning of the reaction intermediate within the catalytic center of the spliceosome. Mol. Cell 21, 543–553.

SUPPLEMENTAL INFORMATION Supplemental Information includes four figures, three tables, Supplemental Experimental Procedures, and Supplemental References and can be found with this article online at doi:10.1016/j.molcel.2011.07.030.

Hofacker, I.L. (2009). RNA secondary structure analysis using the Vienna RNA package. Curr. Protoc. Bioinformatics, Chapter 12:Unit12.12. Kaempfer, R. (2003). RNA sensors: novel regulators of gene expression. EMBO Rep. 4, 1043–1047.

Lesser, C.F., and Guthrie, C. (1993). Mutational analysis of pre-mRNA splicing in Saccharomyces cerevisiae using a sensitive new reporter gene, CUP1. Genetics 133, 851–863. Liu, Z.R., Laggerbauer, B., Lu¨hrmann, R., and Smith, C.W. (1997). Crosslinking of the U5 snRNP-specific 116-kDa protein to RNA hairpins that block step 2 of splicing. RNA 3, 1207–1219.

ACKNOWLEDGMENTS

Luukkonen, B.G., and Se´raphin, B. (1997). The role of branchpoint-30 splice site spacing and interaction between intron terminal nucleotides in 30 splice site selection in Saccharomyces cerevisiae. EMBO J. 16, 779–792.

We thank C. Query, J. Valca´rcel, and J. Warner for helpful discussions and comments on the manuscript. This research has been supported by the

Neupert, J., and Bock, R. (2009). Designing and using synthetic RNA thermometers for temperature-controlled gene expression in bacteria. Nat. Protoc. 4, 1262–1273.

1038 Molecular Cell 43, 1033–1039, September 16, 2011 ª2011 Elsevier Inc.

Molecular Cell An RNA Thermosensor Controls APE2 30 ss Selection

Patterson, B., and Guthrie, C. (1991). A U-rich tract enhances usage of an alternative 30 splice site in yeast. Cell 64, 181–187. Rogic, S., Montpetit, B., Hoos, H.H., Mackworth, A.K., Ouellette, B.F., and Hieter, P. (2008). Correlation between the secondary structure of pre-mRNA introns and the efficiency of splicing in Saccharomyces cerevisiae. BMC Genomics 9, 355–373. Siatecka, M., Reyes, J.L., and Konarska, M.M. (1999). Functional interactions of Prp8 with both splice sites at the spliceosomal catalytic center. Genes Dev. 13, 1983–1993. Smith, C.W., Chu, T.T., and Nadal-Ginard, B. (1993). Scanning and competition between AGs are involved in 30 splice site selection in mammalian introns. Mol. Cell. Biol. 13, 4939–4952. Smith, D.J., Query, C.C., and Konarska, M.M. (2008). ‘‘Nought may endure but mutability’’: spliceosome dynamics and the regulation of splicing. Mol. Cell 30, 657–666.

Spingola, M., Grate, L., Haussler, D., and Ares, M., Jr. (1999). Genome-wide bioinformatic and molecular analysis of introns in Saccharomyces cerevisiae. RNA 5, 221–234. Umen, J.G., and Guthrie, C. (1995). The second catalytic step of pre-mRNA splicing. RNA 1, 869–885. Wachter, A. (2010). Riboswitch-mediated control of gene expression in eukaryotes. RNA Biol. 7, 67–76. Wahl, M.C., Will, C.L., and Lu¨hrmann, R. (2009). The spliceosome: design principles of a dynamic RNP machine. Cell 136, 701–718. Will, C.L., and Lu¨hrmann, R. (2005). Splicing of a rare class of introns by the U12-dependent spliceosome. Biol. Chem. 386, 713–724. Yassour, M., Kaplan, T., Fraser, H.B., Levin, J.Z., Pfiffner, J., Adiconis, X., Schroth, G., Luo, S., Khrebtukova, I., Gnirke, A., et al. (2009). Ab initio construction of a eukaryotic transcriptome by massively parallel mRNA sequencing. Proc. Natl. Acad. Sci. USA 106, 3264–3269.

Molecular Cell 43, 1033–1039, September 16, 2011 ª2011 Elsevier Inc. 1039