Accepted Manuscript Activation of the intronic cryptic 5′ splice site depends on its distance to the upstream cassette exon
Wei Liu, Xia Li, Shengjie Liao, Kefeng Dou, Yi Zhang
PII: S0378-1119(17)30192-0
DOI: 10.1016/j.gene.2017.03.023
Reference: GENE 41829
To appear in: Gene
Received date: 11 January 2017
Revised date: 13 March 2017
Accepted date: 17 March 2017
Please cite this article as: Wei Liu, Xia Li, Shengjie Liao, Kefeng Dou, Yi Zhang, Activation of the intronic cryptic 5′ splice site depends on its distance to the upstream cassette exon. Gene (2017), doi: 10.1016/j.gene.2017.03.023
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Activation of the intronic cryptic 5' splice site depends on its distance to the upstream cassette exon

Wei Liu*1, Xia Li*1, Shengjie Liao†‡, Kefeng Dou*2 and Yi Zhang†‡2

* Department of Hepatobiliary Surgery, Xijing Hospital, Fourth Military Medical University, Xi’an, Shaanxi Province, China
† Center for Genome Analysis, ‡ Laboratory for Genome Regulation and Human Health, ABLife Inc., Optics Valley International Biomedical Park, Building 9-4, East Lake High-Tech Development Zone, 388 Gaoxin 2nd Road, Wuhan, Hubei 430075, China.

1 These authors contributed equally to this work.
2 To whom correspondence should be addressed. Tel: 86-27-81779056; Fax: 86-27-81779056; Email: [email protected]
Correspondence may also be addressed to Tel: 86-13571890758; Email: [email protected]
Abstract
Splice site selection is a key step that determines the mRNA isoforms generated from a single transcript. The large diversity in splice site sequences emphasizes the plasticity of splice site recognition and selection. In this report, a cell-based reporter system using a SMN1/2 cassette exon was applied to study the rules governing the activation of a cryptic 5’SS from intron 4 of the CT/CGRP gene. We found that the cryptic site was activated when placed within 124 nt downstream of the cassette exon, and the level of activation was negatively correlated with its distance from the exon. In addition, activation was not affected by PTB but was eliminated by an insertion extending the exon length. Activated cryptic 5’SSs in the intron or exon could override the original alternative 5’SS, obeying the U1 base-pairing rule. These results suggest that the exon length itself could represent a factor in determining splice site selection.
Keywords: Alternative Splicing, Splice Site Selection, ESE, Positional Effect, Cryptic Splice Site
Abbreviations
ESE, exonic splicing enhancer; 5’SS, 5’ splice site; 3’SS, 3’ splice site; ORF, open reading frame; A5SS, alternative 5’ splice site; A3SS, alternative 3’ splice site.
1. Introduction
Pre-mRNA alternative splicing is an essential post-transcriptional mechanism that is extensively used in higher eukaryotes to expand proteomic diversity. For each gene, altering the choice of splice site produces diverse mature mRNAs composed of different exons that encode different protein isoforms (Chen & Manley, 2009, Nilsen & Graveley, 2010). In contrast to yeast, metazoan pre-mRNAs are often extremely long and contain short exons and very long introns. The precise recognition of a splice site is a prerequisite for alternative splicing decisions, and the selection of short exons from long introns requires sophisticated mechanisms for splice site recognition (Reed, 1996, Roca, Krainer et al., 2013).
The consensus sequence elements of splice sites are extremely short and poorly conserved, and the mechanism by which the spliceosome distinguishes authentic splice sites
from a huge number of cryptic splice sites remains a long-standing question (Black, 1995,
Sun & Chasin, 2000). The exon definition model proposed more than two decades ago well explains the recognition of internal exons among long introns in vertebrates, and it invokes
the concerted recognition of upstream 3’SSs and downstream 5’SSs across an exon
(Robberson, Cote et al., 1990). The first step of pre-mRNA splicing is the formation of the E complex, in which U1 snRNA forms a short duplex with the 5’SS, with the U2AF65 protein binding to the polypyrimidine tract at the 3’SS (Reed, 1996, Wahl, Will et al., 2009). The exon definition model relies on the crosstalk between 5’SS and 3’SS through protein-protein interaction, and the initial evidence was obtained from the binding of U1 snRNP at both 5’and 3’SSs (Ruby & Abelson, 1988). Indeed, the binding of U1 snRNP to the 5’SS of an internal exon enhances upstream 3’SS recognition (Robberson et al., 1990), and a mutation in
the 5’SS enhances or decreases U1 snRNA binding to a degree that is sufficient to cause the inclusion or exclusion of the entire exon (Kuo, Nasim et al., 1991). U1 snRNP-promoted 3’SS recognition and usage are at least partially explained by its ability to target U2AF65 to the upstream 3’SS (Hoffman & Grabowski, 1992). However, the role of exon definition in the selection of authentic splice sites is not yet well understood.
The recognition of the 5’SS in a pre-mRNA is a critical event in an alternative splicing
decision and is initiated by the formation of a base-pairing interaction between the splice site sequence and particular sequences in the U1 snRNA of the spliceosome (Roca et al., 2013,
Wahl et al., 2009). In general, greater sequence complementarity leads to U1 snRNP binding
with a higher affinity and thus defines a stronger splice site (Roca et al., 2013). However, the compilation of 5’SS sequences in humans has revealed more than 9000 sequence variants in
the -3 to +6 region, indicating a large complexity in the mechanisms underlying 5’SS
recognition in addition to its sequence complementarity to U1 snRNA (Roca, Akerman et al., 2012). Consistently, many non-selected cryptic splice sites contain sequences that match the
5’SS consensus as well as or better than the actual 5’SS (Roca et al., 2013), and a
cryptic 5’SS could be selected if a natural 5’SS is inactivated (Treisman, Green et al., 1983).
In this report, we used a cell-based reporter system to study the factors governing the selection of cryptic 5’SSs. A known cryptic splice site was inserted at different positions in the intron downstream of the SMN1 and SMN2 cassette exons. The results revealed that this cryptic splice site was activated when located within 124 nt of the cassette exon. This activation was increased when the cryptic splice site was placed nearer to and even inside the short cassette exon. Our results suggest that the selection of a strong 5’SS during an alternative splicing decision could largely depend on its distance from the upstream 3’SS and/or the exon length, supporting the exon definition model in which the highly dynamic crosstalk between 5’- and 3’SSs established during splice site recognition determines 5’SS selection.
2. Materials and methods
2.1. Plasmid constructs and mutagenesis
The minigenes SMN1 and SMN2 containing exon 7 (54 bp) and portions of the adjacent introns 6 and 7 from PCI-SMN1 and PCI-SMN2 (gifts from Dr. J. Zhou, Arizona State University) were cloned into the GFP-coding sequence of pZW8 (a gift from Dr. C. B. Burge, Massachusetts Institute of Technology) to replace the original SIRT1 minigene and thus yield the SMN1 and SMN2 reporters.
The two constructs were used as the backbone of the test system. To insert
cis-elements at different positions in the intron, two restriction enzyme sites, namely, SalI and SacI, were cloned at different positions of the minigene introns (see Figure 4A, lower) using
the overlap PCR strategy (primer sequences are listed in Supplementary Data S4). In total, 12
working constructs were obtained in both the SMN1 and SMN2 reporter backgrounds. Based on the position of the insertion and the backbone reporter, the resulting constructs are named 1.1-1.13 and 2.1-2.13, with the former series referring to the SMN1 backbone and the latter referring to the SMN2 backbone. Insertions at position 8 led to splice failure, likely due to an interruption of the original splice signal; thus, constructs 1.8 and 2.8 were not included in this study. To exclude the possibility that the insertion interrupted the original cis-elements or created a new enhancer motif, non-functional sequences (5’-ACCTCAGGCG) (Fairbrother,
Yeh et al., 2002) were inserted at each chosen position to construct control plasmids. All of the control constructs showed a splicing pattern that was similar to that of the parental construct (Supplementary Figure S1).
To insert the test sequence into the exon sequence, two restriction enzyme sites,
namely, XhoI and EcoRI, were cloned into exon 7 of original reporter 1 to replace the
Tra2-ESE sequence (5’ AAAAGAAGGAAGGTG). To prove the function of the reporter, a
different sequence was inserted. The insertion of the known ESE sequences into the exon yielded a reporter exon inclusion product, whereas the insertion of a nonsense sequence
resulted in exon exclusion (data not shown). All of the EX series plasmids used in Figure 5
were constructed in the same manner.
To prepare all series of constructs, sense and antisense oligonucleotides containing
full or truncated SalI and SacI restriction sites (in pZW8-SMN1 series) or XhoI and EcoRI
restriction sites (in exon test reporter) were synthesized (Shanghai GeneChem Co., Ltd.) and diluted in linker buffer (50 mM Tris-HCl pH 8.0, 100 mM NaCl, and 1 mM EDTA) to a final
concentration of 100 nM. To avoid splicing-induced frameshift and consequent
nonsense-mediated mRNA decay, some oligonucleotides contained truncated restriction sites. Each pair of sense and antisense oligonucleotides was mixed and annealed by denaturation at 95°C for 2 min and annealing at 52°C for 10 min, and then chilled on ice. The product was diluted to a concentration of 10 nM and then ligated into the digested Original Reporter 1 or 2. To construct the 1.10P (Del3’) 3’ insertion and 1.10P (Del3’) 5’ insertion plasmids, the inserted sequences were amplified from HeLa genomic DNA using RNP-F/RNP-R and RPN-F/RPN-R primers (see Supplementary data). The 151-nt amplicon originated from SMN1 intron 7, which is not included in the minigene sequence used. The fragments were then digested with SalI and SacI and inserted into the enzyme-treated SMN1 reporter.
2.2. Cell culture and plasmid/RNA transfection
HeLa (human cervical carcinoma) cells were maintained in Dulbecco’s modified
Eagle medium supplemented with 10% foetal calf serum. The cells were plated in a 24-well
plate at a concentration of 1 × 10^5 cells per well and then transfected when they reached 60% confluence. Transfection was performed following the standard protocol with Lipofectamine
2000, with 500 ng of plasmid per well. The cells were maintained for 24 h and then harvested
for total RNA preparation or protein preparation. For PTB knockdown, several siRNAs were synthesized by Gene Pharma Co. (PTB1-HOMO-1234 siRNA,
5’-CCCAAAGCCUCUUUAUUCUTT; PTB1-HOMO-1618 siRNA,
5’-GCGUCGUCAAAGGAUUCAATT; and PTB1-HOMO-1279 siRNA, 5’-GCGUAAGAUCCUGUUCAATT). Among these, the most efficient was
PTB1-HOMO-1234, and this siRNA was thus selected for most of the experiments. siRNA
transfection was performed following the standard protocol with Lipofectamine 2000 at a final concentration of 40 nM. The effect of PTB knockdown was assessed by western blotting and Q-PCR (Figure 4). Endogenous genes (EIF4G2 and FAM38A) responding to the level of PTB were investigated using the primers listed in Supplementary Data. For the PTB response experiments, plasmid transfection was performed at 24 h after siRNA transfection.
2.3. RNA preparation and RT-PCR analysis
The total RNA from the transfected cells was extracted using a one-step extraction method with TRIzol (Invitrogen). To assess the RNA quantity and quality, absorbance at 260 nm and 280 nm was measured using a microplate reader (Epoch, BioTek). For each sample, 500 ng of RNA was used as the template in a 10-µl cDNA transcription system. A standard reverse-transcription (RT) protocol supplied by Promega (utilizing M-MLV) was used with
oligo (dT)18. This procedure was followed by PCR amplification and product separation on
2% agarose gels or Q-PCR amplification using an IQ5 (Bio-Rad). In all the PCR or Q-PCR
systems, 1 µl of cDNA was used as the template. Gel images were captured using a GeneGenius Gel Imaging System (SynGene) and quantified with the associated GeneTools
analysis software. All the results were confirmed at least three independent times.
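As an illustration of how band intensities from a gel lane can be converted into relative isoform levels, the following minimal Python sketch normalizes each band to the total signal in its lane. The function name `isoform_fractions`, the isoform labels and the intensity values are hypothetical; they are not taken from the GeneTools software or from the measurements in this study, and the sketch ignores the stronger staining of longer products.

```python
def isoform_fractions(band_intensities):
    """Return each band's share of the total lane signal.

    band_intensities: dict mapping an isoform label to its raw band
    intensity (arbitrary units) for a single gel lane.
    """
    total = sum(band_intensities.values())
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return {name: value / total for name, value in band_intensities.items()}


# Hypothetical intensities (arbitrary units), not data from this study.
example_lane = {
    "exon_exclusion": 200.0,
    "exon_inclusion": 600.0,
    "extended_inclusion": 200.0,
}
fractions = isoform_fractions(example_lane)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.2f}")
```

Normalizing within a lane makes lanes with different loading comparable, which is the only property this sketch relies on.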
2.4. Western analysis
All of the samples for RT-PCR were harvested at the same time. The samples were lysed using RIPA buffer with a protease inhibitor cocktail at a dilution of 1:100 (Calbiochem, Cat. No. 539134). The samples were heated at 100°C for 10 min and subjected to 12% SDS-PAGE. An anti-PTB antibody from ProteinTech Co. was used at 1:1000. All the results were confirmed in three independent experiments.
3. Results
3.1. An intronic cryptic 5’SS was activated when it was cloned in close proximity to a competing 5’SS of a cassette exon
To study the rules governing the selection of a cryptic splice site, we constructed a CMV promoter-driven plasmid containing an EGFP ORF that was interrupted by the SMN1/2 cassette exon, as previously reported (Liu, Zhou et al., 2010). The splicing efficiency of the reporter gene transcripts in HeLa cells was assayed by RT-PCR 24 h after transfection. The cassette exon was included in the original SMN1 reporter, whereas it was excluded from the SMN2 reporter, as previously reported (Figure 1B). A cryptic 5’SS plus the adjacent sequence of 42 nt located in intron 4 of the CT/CGRP gene was cloned at different positions of the
intron downstream of the SMN1/2 cassette exon in our reporters (Figure 1A) (Lou, Yang et
al., 1995). The sequence of the cryptic 5’SS forms canonical base pairs at the -3 to +5
positions with the +4 to +11 positions of U1 snRNA, indicating that the cryptic 5’SS represents a strong 5’SS. In contrast, the alternative 5’SS of the exon 7 of SMN1/2 was much
weaker. Interestingly, the cryptic 5’SS is silent in human cells, and it has been identified as
part of the intronic enhancer that increases the inclusion of upstream exon 4 (Lou, Cote et al., 1994, Lou et al., 1995). The enhancer sequence contains a strong polypyrimidine tract bound
by PTB, and this binding is proposed to inhibit the usage of the cryptic splice site inside the
enhancer sequence and to promote the usage of the upstream polyadenylation site (Lou,
Gagel et al., 1996).
To obtain proper sites for analyzing the activation of the cryptic splice site, a
non-functional sequence (10 nt) was inserted into 8 different positions in the upstream intron and 5 positions in the downstream intron of the SMN1/2 cassette exon. We found that the insertion at position 8 completely abolished SMN1/2 inclusion (data not shown), likely due to its interruption of the branch point function. All of the other 12 insertion sites were suitable for intronic insertion (Supplementary Figure S1). A series of splicing enhancer sequences for SR proteins Tra2, SRSF5, SRSF2, SRSF7, and SRSF6 (Goren, Ram et al., 2006) were cloned and inserted at these positions within the SMN2 splicing context, and most showed no obvious enhancing effect (Supplementary Figure S2), with a few exceptions. In fact, the Tra2 and SRSF6 (SRp55) enhancer sequences demonstrated an enhancing function at the -84-nt intronic position upstream of the SMN2 exon, whereas the SRSF5 (SRp40) enhancer sequence at the +24-nt intronic position downstream of the SMN2 exon resulted in near-complete exon inclusion. These results indicate
that the enhancer sequence in intronic regions close to an alternative exon could be
recognized and bound by SR proteins to promote exon inclusion.
We then cloned the 42-nt insert containing the cryptic 5’SS at the 12 suitable intronic positions (Figure 1A). Insertions of the cryptic site at positions 9, 10 and 11 of the SMN1 splicing reporter, which correspond to intronic insertion locations of 24 nt, 64 nt and 124 nt, resulted in a spliced product that was larger than that obtained with the inclusion of the SMN1 exon, and the length of the spliced product varied in a manner corresponding to activation of the cryptic
5’SS (Figure 1B, upper panel). The RT-PCR products were sequenced, and the results showed
that the new spliced products originated from the selection of the cryptic 5’SS. The sequencing result of the spliced product at position 10 is shown in Supplementary Figure S3.
The splicing pattern of the reporter pre-mRNA resulting from the 42-nt insertion at all of the other intronic sites was similar to that of the background SMN1 reporter (Figure 1B, upper panel). Interestingly, the activation of the cryptic splice site was also evident for the insertion at locations of 24 nt, 64 nt and 124 nt within the SMN2 splicing context (Figure 1B, lower panel). In contrast to its activation when placed in close proximity to the alternative 5’SS, the cryptic 5’SS close to the upstream constitutive 5’SS was silent (Figure 1, positions 1-3). This result could be explained by the constitutive 5’SS possessing either a highly conserved splice signal sequence of CAGGUAA or a slightly stronger affinity for U1 snRNA than the cryptic 5’SS does. Alternatively, splice site selection in long terminal exons could be different from that in short internal exons.
The activation of the cryptic 5’SS as an intron-proximal splice site has demonstrated
some interesting features. First, the extended inclusion of SMN1 resulting from the use of the cryptic 5’SS was the exclusive spliced product when the cryptic 5’SS was located at positions 24 nt and 64 nt (Figure 1B, upper panel), consistent with its much higher strength in base pairing with the U1 snRNA than the authentic 5’SS of exon 7 in SMN. As the SMN exon is 54 nt in length, the choice of the inserted cryptic 5’SS from intronic positions 24 nt and 64 nt resulted in an extended exon of 100 nt and 140 nt (taking the 17-nt sequence from the 5-nt insert and restriction sites into consideration), respectively, which is within the most commonly observed length of human alternative exons. Second, when this intron-proximal splice site was located 124 nt from the distal splice site, three spliced products corresponding to SMN1 exon exclusion, SMN1 exon inclusion and extended SMN1 inclusion were obtained, with SMN1 exon inclusion being the major splice form. These results together suggest a positive correlation between the activation of the cryptic 5’SS and its proximity to the upstream exon.
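The exon lengths quoted above can be checked with simple arithmetic. The sketch below, written in Python purely for illustration, back-calculates the contribution of the insert and restriction sites implied by the two reported lengths (100 nt and 140 nt) and confirms that it is the same in both cases. The helper names are hypothetical, and the value computed for the 124-nt position is an extrapolation under the assumption that the same offset applies there; it is not a length reported in the text.

```python
SMN_EXON_NT = 54  # length of SMN exon 7 as stated in the text

# Reported values: intronic insertion position (nt) -> extended exon length (nt)
REPORTED_EXTENDED_EXON = {24: 100, 64: 140}


def implied_offset(position_nt, extended_exon_nt, exon_nt=SMN_EXON_NT):
    """Nucleotides not explained by the exon plus the intronic position."""
    return extended_exon_nt - exon_nt - position_nt


def extended_exon_length(position_nt, offset_nt, exon_nt=SMN_EXON_NT):
    """Exon length when the cryptic 5'SS at position_nt is used."""
    return exon_nt + position_nt + offset_nt


offsets = {
    position: implied_offset(position, length)
    for position, length in REPORTED_EXTENDED_EXON.items()
}
print(offsets)  # both reported lengths imply the same offset

# Extrapolation only: assumes the offset above also applies at position 124.
common_offset = offsets[24]
print(extended_exon_length(124, common_offset))
```

Both reported lengths imply the same 22-nt contribution, so the two figures are internally consistent.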
3.2. Activation depends on the splice site strength rather than the adjacent sequence context
To study whether the cryptic 5’SS activation depends on the sequence context of the 42-nt insert, we constructed a set of reporter plasmids carrying deletions and/or mutations in the insert at position +64 nt (the 10th insertion site) downstream of the SMN1 cassette exon (Figure 2A). Figure 2B shows that the deletion of the 5’ PTB-binding site (1.10-Δ5’), the 3’ part of the enhancer sequence (1.10-Δ3’), or both (1.10-Δ5’Δ3’) did not inhibit the activation of the cryptic 5’SS, suggesting that activation does not depend on the enhancer sequence context. Consistently, mutation of the PTB-binding sites did not alter the activation (1.10-m5’ and 1.10-m5’Δ3’).
Consistent with the model that the activation of cryptic 5’SSs depends on the strength
of the splice site itself, mutation of the critical CAG 5’SS consensus to CUG within the
context of the full-length 42-nt insertion (1.10-mSS) resulted in a substantial decrease in the usage of the cryptic 5’SS and an increase in the usage of the upstream competing 5’SS
(Figure 2B). A similar shift in 5’SS usage was also evident when a CAG-to-CUG mutation
was created in the 1.10-Δ3’ background (1.10-Δ3’mSS). Interestingly, the CAG-to-CUG mutation in the 1.10-m5’Δ3’ background (1.10-m5’Δ3’mSS) decreased the usage of the
cryptic 5’SS but did not result in evident usage of the upstream competing 5’SS. Instead, we
observed an increase in the usage of the constitutive 5’SS, which resulted in a skipped exon.
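The effect of the CAG-to-CUG mutation on U1 complementarity can be illustrated with a minimal Python sketch that counts Watson-Crick pairs between the -3 to +5 positions of a 5’SS and positions 11 to 4 of U1 snRNA. This is a crude proxy for splice site strength and an assumption of this example rather than the scoring used in this study: it ignores G-U wobble pairs, pseudouridines and base-pairing energetics. The function name `u1_pairs` is hypothetical, and the example sequences are the 5’SS consensus and point mutants of it, not the actual CT/CGRP cryptic site, whose sequence is not given in the text.

```python
# 5' end of human U1 snRNA, positions 1-11 (pseudouridines written as U).
U1_5P_END = "AUACUUACCUG"

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def u1_pairs(five_ss):
    """Count Watson-Crick pairs between a 5'SS and U1 snRNA.

    five_ss: the 8 nt spanning positions -3 to +5 of the splice site
    (DNA or RNA letters). Position -3 pairs with U1 position 11 and
    position +5 pairs with U1 position 4.
    """
    site = five_ss.upper().replace("T", "U")
    if len(site) != 8:
        raise ValueError("expected the 8 nt from position -3 to +5")
    u1_segment = U1_5P_END[3:11][::-1]  # U1 positions 11 down to 4
    return sum((s, u) in WATSON_CRICK for s, u in zip(site, u1_segment))


consensus = "CAGGUAAG"   # CAG|GUAAG, fully complementary to U1
mutant_cug = "CUGGUAAG"  # A-to-U change at position -2 (CAG to CUG)
print(u1_pairs(consensus), u1_pairs(mutant_cug))
```

Under this simple count, the CAG-to-CUG change removes one of the eight possible pairs, in line with the reduced usage of the mutated site.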
3.3. Activation depends on the proximity of the splice site to the upstream alternative exon
The above-described results are consistent with the exon length-dependent model in which the distance of an intronic cryptic 5’SS to the 3’SS across an exon plays a role in determining its selection efficiency. To further validate this model, we inserted a 151-nt sequence that originated from intron 7 of the SMN1 gene either upstream or downstream of the activated cryptic 5’SS in the 1.10-Δ3’ construct to generate two reporter plasmids denoted 1.10-Δ3’-5’ins and 1.10-Δ3’-3’ins (Figure 3A). In the 1.10-Δ3’-5’ins plasmid, the distances from the cryptic 5’SS to the upstream exon and to the across-exon 3’SS were increased to 215 nt and 293 nt, respectively, whereas the distance to the downstream exon was not changed. In contrast, in the 3’ insertion plasmid, the distance from the cryptic 5’SS to the upstream exon was not changed, whereas the distance to the downstream exon was increased by 151 nt. Figure 3B demonstrates that cryptic 5’SS
activation was abolished when the internal exon length was increased to 293 nt, whereas a
change in the length of the downstream intron sequence had no effect on the activation of the
cryptic 5’SS. These results further support a negative correlation between the length of the
spliced internal exon and the selection of cryptic 5’SSs.
3.4. PTB is not required for the cryptic 5’SS activation
The PTB-binding site in the 42-nt insert from intron 4 of the CT/CGRP gene aids in
the binding of U1 snRNP to the cryptic 5’SS, which is essential for the activity of this
intronic enhancer (Lou, Helfman et al., 1999). However, PTB is generally known as a splicing repressor, the binding of which correlates with the negative regulation of the
recruitment of spliceosomal components to their nearby splice sites (Wagner &
Garcia-Blanco, 2001). Our recent recognition of the enhancer function of PTB is attributed to its binding to the polypyrimidine tracts near the constitutive splice site, which allows PTB to modulate the competitiveness of alternative and constitutive splice sites (Xue, Zhou et al., 2009). A recent study showed that PTB binding to a pyrimidine-rich sequence between a pair of alternative 5’SSs promotes usage of the upstream 5’SS (Hamid & Makeyev, 2014). These previous studies prompted us to further analyse the role of PTB in the activation of cryptic 5’SSs in this study.
The results presented in Figure 2 show that the PTB-binding site is not required for the activation of cryptic 5’SSs. To address whether PTB proteins have any function in splicing activation by binding to some other intronic elements in the constructs used, HeLa cells were pre-treated with siRNA for 24 h to downregulate the expression level of the PTB
protein, and we then transfected different reporter plasmids to assay the choice of 5’SS
(Figure 4A). Western blot and RT-PCR analyses showed that PTB expression was
successfully downregulated (Figure 4A). The splicing of two endogenous genes, namely, EIF4G2 and RBM27, which have been shown to be regulated by PTB (Xue et al., 2009), was
analysed and shown to respond to PTB knockdown (Figure 4B). Nevertheless, the splicing
patterns of all of the tested reporter mRNAs with or without the cryptic 5’SS did not respond to PTB knockdown (Figures 4A and 4C). Taken together, the results obtained in this study
suggest that PTB does not play a role in regulating the alternative splicing of the two
competitive 5’SSs investigated, as will be discussed below.
3.5. The intronic cryptic 5’SS is activated in the cassette exon
The results obtained in this study generally obey the previously reported proximity rule, whereby the intron-proximal 5’SS is preferentially selected (Hicks, Mueller et al., 2010, Reed & Maniatis, 1986). Nevertheless, there are substantial differences. According to our results, the distance between the two competing 5’SSs could not extend to the 200-300 nt previously reported. Indeed, the closer the two splice sites, the stronger the selection of the intron-proximal splice site (Figure 1).
To further investigate the intron-proximal rule, we inserted cryptic 5’SS-containing sequences into the reporter SMN1 exon to replace the original Tra2 ESE, allowing the construction of new plasmids, with one harbouring the 3’-deleted cryptic 5’SS (Ex-Δ3’) and the other harbouring a splice site truncation (Ex-Δ3’ΔSS) (Figure 5A). Strikingly, the cryptic 5’SS was exclusively activated during splicing after transfection of the plasmid containing an
intact splice site into HeLa cells, whereas transfection of the plasmid containing the splice
site truncation abolished this activation (Figure 5B). The spliced products were confirmed by
sequencing, and mutation of the CAG consensus of the cryptic splice site to CUG or CAC completely abolished the activation of the cryptic 5’SS (Figure 5C). These results argue
against the intron-proximal rule, indicating that the application of this rule could be restricted.
For example, the rule may be more applicable to the splicing of terminal exons (Hicks et al.,
2010, Yu, Maroney et al., 2008), but not the internal cassette exon used in this study.
4. Discussion
In mammals, the precise selection of a splice site is a challenging procedure under
highly organized regulation. In fact, the activation of an inappropriate splice site can trigger
severe disease (Cooper, Wan et al., 2009, Muntoni & Wood, 2011, Roca, Olson et al., 2008). A large number of cryptic splice sites with no sequence deficiency are repressed, but the underlying mechanisms remain obscure. It has recently been reported that various cis-acting splicing regulatory elements and noncanonical uridine-pseudouridine interactions in the 5’SS/U1 helix can contribute to the recognition of some 5’ splice sites (Brillen, Schoneweis et al., 2016, Kralovicova, Hwang et al., 2011, Tan, Ho et al., 2016). Consistently, a recent structural study also demonstrated additional contacts for the U1 recognition of 5’ splice sites in addition to the known base pairing (Kondo, Oubridge et al., 2015).
We have shown that a naturally unused cryptic 5’SS can be activated when it is placed in an intron close to the upstream cassette exon. The cryptic 5’SS used in this study is a well-characterized enhancer sequence involved in PTB binding and the activation of the
upstream 5’SS and polyadenylation site (Lou et al., 1996, Lou et al., 1999, Lou, Neugebauer
et al., 1998, Lou et al., 1995). Interestingly, activation of the cryptic 5’SS strongly depends on
the consensus sequence of the splice site but not on the adjacent sequences that bind PTB or other factors (Figure 2). In addition, knockdown of the PTB level does not affect the
activation of the cryptic 5’SS (Figure 4). We showed that shortening the exon sequence from
54 nt to 24 nt and increasing the exon length from 54 nt to 118 nt using a pure intronic sequence have no effect on the exclusive activation of a cryptic 5’SS in reporter plasmids
when they are expressed in HeLa cells. However, an increase in the distance beyond 200 nt
(by different sequences) completely abolishes its activation (Figures 1 and 3). These results suggest that, in addition to the well-recognized roles of the exon sequence in splice site
selection, exon length itself might function as a determinant as well.
Recent studies have linked nucleosome structure and modifications to alternative splicing control in metazoans (Keren, Lev-Maor et al., 2010, Schwartz, Meshorer et al., 2009). In humans, nucleosome occupancy at different regions of a gene is correlated with the level of inclusion of these regions in mature mRNAs: constitutive exons have the highest occupancy, followed by alternative exons that are both frequently and rarely used; introns have the lowest nucleosome occupancy (Schwartz et al., 2009, Tilgner, Nikolaou et al., 2009). The nucleosome could act as a fluctuating barrier that pauses RNAPII, a process termed “speed bump”, which may create a time window for splice site recognition and selection (Hodges, Bintu et al., 2009, Keren et al., 2010). Exonic nucleosomes have a specific set of histone modifications that lead to interactions with the splicing machinery and enable more efficient recognition of the exon. In both Caenorhabditis elegans and humans, exons are
preferentially marked by histone 3 lysine 36 trimethylation (H3K36me3) (Kolasinska-Zwierz,
Down et al., 2009, Schwartz et al., 2009, Tilgner et al., 2009). This H3K36me3 modification
could be bound by a multifunctional protein that also binds splicing factors to impact splice site recognition and alternative splicing (Luco, Pan et al., 2010).
The link between nucleosome structure and alternative splicing could represent
mechanisms for the exon definition model. The requirement shown in this study that the distance from a strong cryptic 5’SS to its upstream 3’SS be within a nucleosome is consistent with both the exon definition model and the impact of nucleosomes on pre-mRNA splicing in metazoans. Nevertheless, we cannot rule out the possibility that the distance-related regulation of the cryptic 5’SS selection is caused by the presence of unknown splicing regulatory
sequences or structures residing in the adjacent sequence.
Acknowledgements
We thank Dr. Chris Burge (MIT) for sending us the PCI-SMN1 plasmid as a gift; Dr. Tao Sun (Wuhan University) for construction of the PCI-SMN1 backbone plasmid; and Dr. Qian Li (Wuhan University) for assisting with the transfection experiments.
Funding
This work was supported by a grant from the China 973 Program (2015CB554100) awarded
to K.D. and grants from the China 973 Program (2005CB724604) and ABL2015-AS001 awarded to Y.Z.
Conflicts of Interest:
The authors have no conflicts of interest to declare.
18
ACCEPTED MANUSCRIPT References Black DL (1995) Finding splice sites within a wilderness of RNA. RNA 1: 763-71 Brillen AL, Schoneweis K, Walotka L, Hartmann L, Muller L, Ptok J, Kaisers W, Poschmann G, Stuhler K, Buratti E, Theiss S, Schaal H (2016) Succession of splicing regulatory elements
PT
determines cryptic 5'ss functionality. Nucleic Acids Res
RI
Chen M, Manley JL (2009) Mechanisms of alternative splicing regulation: insights from
SC
molecular and genomics approaches. Nat Rev Mol Cell Biol 10: 741-54 Cooper TA, Wan L, Dreyfuss G (2009) RNA and disease. Cell 136: 777-93
NU
Fairbrother WG, Yeh RF, Sharp PA, Burge CB (2002) Predictive identification of exonic splicing enhancers in human genes. Science 297: 1007-13
Goren A, Ram O, Amit M, Keren H, Lev-Maor G, Vig I, Pupko T, Ast G (2006) Comparative analysis identifies exonic splicing regulatory sequences--The complex definition of enhancers and silencers. Mol Cell 22: 769-81
Hamid FM, Makeyev EV (2014) Regulation of mRNA abundance by polypyrimidine tract-binding protein-controlled alternate 5' splice site choice. PLoS Genet 10: e1004771
Hicks MJ, Mueller WF, Shepard PJ, Hertel KJ (2010) Competing upstream 5' splice sites enhance the rate of proximal splicing. Mol Cell Biol 30: 1878-86
Hodges C, Bintu L, Lubkowska L, Kashlev M, Bustamante C (2009) Nucleosomal fluctuations govern the transcription dynamics of RNA polymerase II. Science 325: 626-8
Hoffman BE, Grabowski PJ (1992) U1 snRNP targets an essential splicing factor, U2AF65, to the 3' splice site by a network of interactions spanning the exon. Genes Dev 6: 2554-68
Keren H, Lev-Maor G, Ast G (2010) Alternative splicing and evolution: diversification, exon definition and function. Nat Rev Genet 11: 345-55
Kolasinska-Zwierz P, Down T, Latorre I, Liu T, Liu XS, Ahringer J (2009) Differential chromatin marking of introns and expressed exons by H3K36me3. Nat Genet 41: 376-81
Kondo Y, Oubridge C, van Roon AM, Nagai K (2015) Crystal structure of human U1 snRNP, a small nuclear ribonucleoprotein particle, reveals the mechanism of 5' splice site recognition. eLife 4
Kralovicova J, Hwang G, Asplund AC, Churbanov A, Smith CI, Vorechovsky I (2011) Compensatory signals associated with the activation of human GC 5' splice sites. Nucleic Acids Res 39: 7077-91
Kuo HC, Nasim FH, Grabowski PJ (1991) Control of alternative splicing by the differential binding of U1 small nuclear ribonucleoprotein particle. Science 251: 1045-50
Liu W, Zhou Y, Hu Z, Sun T, Denise A, Fu XD, Zhang Y (2010) Regulation of splicing enhancer activities by RNA secondary structures. FEBS Lett 584: 4401-7
Lou H, Cote GJ, Gagel RF (1994) The calcitonin exon and its flanking intronic sequences are sufficient for the regulation of human calcitonin/calcitonin gene-related peptide alternative RNA splicing. Mol Endocrinol 8: 1618-26
Lou H, Gagel RF, Berget SM (1996) An intron enhancer recognized by splicing factors activates polyadenylation. Genes Dev 10: 208-19
Lou H, Helfman DM, Gagel RF, Berget SM (1999) Polypyrimidine tract-binding protein positively regulates inclusion of an alternative 3'-terminal exon. Mol Cell Biol 19: 78-85
Lou H, Neugebauer KM, Gagel RF, Berget SM (1998) Regulation of alternative polyadenylation by U1 snRNPs and SRp20. Mol Cell Biol 18: 4977-85
Lou H, Yang Y, Cote GJ, Berget SM, Gagel RF (1995) An intron enhancer containing a 5' splice site sequence in the human calcitonin/calcitonin gene-related peptide gene. Mol Cell Biol 15: 7135-42
Luco RF, Pan Q, Tominaga K, Blencowe BJ, Pereira-Smith OM, Misteli T (2010) Regulation of alternative splicing by histone modifications. Science 327: 996-1000
Muntoni F, Wood MJ (2011) Targeting RNA to treat neuromuscular disease. Nat Rev Drug Discov 10: 621-37
Nilsen TW, Graveley BR (2010) Expansion of the eukaryotic proteome by alternative splicing. Nature 463: 457-63
Reed R (1996) Initial splice-site recognition and pairing during pre-mRNA splicing. Curr Opin Genet Dev 6: 215-20
Reed R, Maniatis T (1986) A role for exon sequences and splice-site proximity in splice-site selection. Cell 46: 681-90
Robberson BL, Cote GJ, Berget SM (1990) Exon definition may facilitate splice site selection in RNAs with multiple exons. Mol Cell Biol 10: 84-94
Roca X, Akerman M, Gaus H, Berdeja A, Bennett CF, Krainer AR (2012) Widespread recognition of 5' splice sites by noncanonical base-pairing to U1 snRNA involving bulged nucleotides. Genes Dev 26: 1098-109
Roca X, Krainer AR, Eperon IC (2013) Pick one, but be quick: 5' splice sites and the problems of too many choices. Genes Dev 27: 129-44
Roca X, Olson AJ, Rao AR, Enerly E, Kristensen VN, Borresen-Dale AL, Andresen BS, Krainer AR, Sachidanandam R (2008) Features of 5'-splice-site efficiency derived from disease-causing mutations and comparative genomics. Genome Res 18: 77-87
Ruby SW, Abelson J (1988) An early hierarchic role of U1 small nuclear ribonucleoprotein in spliceosome assembly. Science 242: 1028-35
Schwartz S, Meshorer E, Ast G (2009) Chromatin organization marks exon-intron structure. Nat Struct Mol Biol 16: 990-5
Sun H, Chasin LA (2000) Multiple splicing defects in an intronic false exon. Mol Cell Biol 20: 6414-25
Tan J, Ho JX, Zhong Z, Luo S, Chen G, Roca X (2016) Noncanonical registers and base pairs in human 5' splice-site selection. Nucleic Acids Res 44: 3908-21
Tilgner H, Nikolaou C, Althammer S, Sammeth M, Beato M, Valcarcel J, Guigo R (2009) Nucleosome positioning as a determinant of exon recognition. Nat Struct Mol Biol 16: 996-1001
Treisman R, Green MR, Maniatis T (1983) cis and trans activation of globin gene transcription in transient assays. Proc Natl Acad Sci U S A 80: 7428-32
Wagner EJ, Garcia-Blanco MA (2001) Polypyrimidine tract binding protein antagonizes exon definition. Mol Cell Biol 21: 3281-8
Wahl MC, Will CL, Luhrmann R (2009) The spliceosome: design principles of a dynamic RNP machine. Cell 136: 701-18
Xue Y, Zhou Y, Wu T, Zhu T, Ji X, Kwon YS, Zhang C, Yeo G, Black DL, Sun H, Fu XD, Zhang Y (2009) Genome-wide analysis of PTB-RNA interactions reveals a strategy used by the general splicing repressor to modulate exon inclusion or skipping. Mol Cell 36: 996-1006
Yu Y, Maroney PA, Denker JA, Zhang XH, Dybkov O, Luhrmann R, Jankowsky E, Chasin LA, Nilsen TW (2008) Dynamic regulation of alternative splicing by silencers that modulate 5' splice site competition. Cell 135: 1224-36
Figure Legends
Figure 1. Activation of an intronic cryptic 5’ splice site. (A) Diagram of the 42-nt insert sequence components and its location in the CT/CGRP gene (upper); diagram of the splice site sequence in the reporter plasmid containing the SMN1/2 cassette exon (middle); diagram of the insertion positions of the 42-nt insert (lower). (B) RT-PCR analysis of the spliced products from the constructs containing the PTB binding site and cryptic splice site shown in (A). The spliced products with different exon compositions are diagrammed on the right. ‘In’ indicates the product containing the SMN1/2 cassette exon. ‘Ex’ indicates the skipping of this cassette exon. ‘*’ indicates the extended cassette exon resulting from the selection of the cryptic splice site.
Figure 2. Sequence components required for the activation of the cryptic 5’SS. (A) Diagram of the deletions or mutations. Δ3’: the 3’ part was deleted; Δ5’: the 5’ part was deleted; s3’: the 3’ part sequence was mutated; m5’: the 5’ PTB binding site was mutated; mSS: the splice site was mutated. (B) RT-PCR results of the spliced products of the constructs.
Figure 3. Influence of an unrelated 151-nt insert on the activation of the cryptic 5’SS. (A) Diagram of constructs with the 151-nt insert before or after the 42-nt insert. (B) RT-PCR results of the spliced products of the constructs shown in (A). “ins” indicates the insertion mutation.
Figure 4. Effect of PTB expression on the activation of cryptic splice sites. (A) Western blot (left lower) and Q-PCR (right) results obtained after PTB knockdown. RT-PCR results of the spliced product of the 1.10-Δ3’ construct without and with PTB knockdown (left upper). (B) Response of the PTB-sensitive genes EIF4G2 and RBM27 to PTB knockdown. (C) RT-PCR results showing the selection of the cryptic splice site in different constructs with or without PTB knockdown.
Figure 5. The intronic cryptic 5’SS is activated when placed in the cassette exon. (A) Diagram of the constructs containing different types of inserts replacing the Tra2-binding sequence in the SMN1 reporter. (B) RT-PCR results of the spliced product obtained from constructs containing an intact or mutant cryptic 5’SS (upper). Sequencing results of the spliced product are also shown (lower). (C) RT-PCR results of the spliced product of the different constructs shown in (A).
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Highlights
1. An intronic cryptic 5’SS was activated when placed close to the upstream cassette exon;
2. The activation depends on the splice site strength rather than the adjacent sequence context;
3. The level of activation was negatively correlated with its distance from the exon;
4. PTB is not required for the cryptic 5’SS activation.