Relationship between changes in the exon-recognition machinery and SLC22A1 alternative splicing in hepatocellular carcinoma

Meraris Soto, Maria Reviejo, Ruba Al-Abdulla, Marta R. Romero, Rocio I.R. Macias, Loreto Boix, Jordi Bruix, Maria A. Serrano, Jose J.G. Marin

PII: S0925-4439(20)30026-0
DOI: https://doi.org/10.1016/j.bbadis.2020.165687
Reference: BBADIS 165687
To appear in: BBA - Molecular Basis of Disease
Received date: 29 May 2019
Revised date: 22 December 2019
Accepted date: 12 January 2020

Please cite this article as: M. Soto, M. Reviejo, R. Al-Abdulla, et al., Relationship between changes in the exon-recognition machinery and SLC22A1 alternative splicing in hepatocellular carcinoma, BBA - Molecular Basis of Disease (2020), https://doi.org/10.1016/j.bbadis.2020.165687

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2020 Published by Elsevier.

Journal Pre-proof Aberrant splicing in hepatocellular carcinoma

Meraris Soto1, Maria Reviejo1, Ruba Al-Abdulla1, Marta R. Romero1,3, Rocio I.R. Macias1,3, Loreto Boix2,3, Jordi Bruix2,3, Maria A. Serrano1,3, Jose J.G. Marin1,3

(1) HEVEFARM Group, University of Salamanca, IBSAL, Salamanca, Spain.
(2) BCLC Group, Hospital Clinic-IDIBAPS, Barcelona, Spain.
(3) Center for the Study of Liver and Gastrointestinal Diseases (CIBERehd), Carlos III National Institute of Health, Madrid, Spain.

Short Title: Aberrant SLC22A1 splicing in hepatocellular carcinoma

Keywords: Cancer; Chemoresistance; Chemotherapy; Liver; OCT1; Spliceosome

Corresponding Author: Jose J.G. Marin, Department of Physiology and Pharmacology, University of Salamanca, Salamanca, Spain. Email: [email protected]; Tel.: +34-663182872; Fax: +34-923-294669

Abbreviations: A, adjacent non tumor tissue; ASGF, alternative splicing global frequency; DB, database; HCC, hepatocellular carcinoma; OCT1, organic cation transporter type 1; PSI, percent-splice-in; T, tumor; TCGA, The Cancer Genome Atlas; TLDA, Taqman Low Density Arrays.

Acknowledgments: This study was supported by the CIBERehd (EHD15PI05/2016) and Fondo de Investigaciones Sanitarias, Instituto de Salud Carlos III, Spain (PI16/00598, co-funded by European Regional Development Fund/European Social Fund, "Investing in your future"); Spanish Ministry of Economy, Industry and Competitiveness (SAF2016-75197-R); Junta de Castilla y Leon (SA063P17); Fundación Samuel Solórzano Barruso, Spain (FS/7-2016, FS/8-2017 and FS/13-2017); AECC Scientific Foundation (2017/2020), Spain; and “Centro Internacional sobre el Envejecimiento” (OLD-HEPAMARKER, 0348_CIE_6_E), Spain. Ruba Al Abdulla was supported by a pre-doctoral contract funded by the “Junta de Castilla y León” and the “Fondo Social Europeo”, Spain (EDU/828/2014). Jordi Bruix is supported by Fondo de Investigaciones Sanitarias, Instituto de Salud Carlos III, Spain (PI18/00768, co-funded by European Regional Development Fund/European Social Fund, "Investing in your future"), AECC PI044031, and Secretaria d’Universitats i Recerca del Departament d’Economia i Coneixement 2014 SGR 605. Meraris Soto was supported by a pre-doctoral scholarship funded by the “Becas Internacionales Universidad de Salamanca-Banco Santander para la Movilidad de Estudios de Doctorado”, Spain (B.O.C.y.L 4/1/18)

Abstract

Changes in the phenotype that characterizes cancer cells are partly due to altered processing of pre-mRNA by the spliceosome. We have previously reported that aberrant splicing plays an essential role in the impaired response of hepatocellular carcinoma (HCC) to sorafenib by reducing the expression of functional organic cation transporter type 1 (OCT1, gene SLC22A1) that constitutes the primary way for HCC cells to take up this and other drugs. The present study includes an in silico analysis of publicly available databases to investigate the relationship between alternative splicing of SLC22A1 pre-mRNA and the expression of genes involved in the exon-recognition machinery in HCC and adjacent non-tumor tissue. Using Taqman Low-Density Arrays, the findings were validated in 25 tumors that were resected without neoadjuvant chemotherapy. The results supported previous reports showing that there was a considerable degree of alternative splicing of SLC22A1 in adjacent non-tumor tissue, which was further increased in the tumor in a stage-unrelated manner. Splicing perturbation was associated with changes in the profile of proteins determining exon recognition. The results revealed the importance of using paired samples for splicing analysis in HCC and confirmed that aberrant splicing plays an essential role in the expression of functional OCT1. Changes in the exon recognition machinery may also affect the expression of other proteins in HCC. Moreover, these results pave the way to further investigations on the mechanistic bases of the relationship between the expression of spliceosome-associated genes and its repercussion on the appearance of alternative and aberrant splicing in HCC.

Introduction

Profound changes in the phenotype that characterize cancer cells are due in part to altered processing of immature mRNA (pre-mRNA) by the machinery involved in this process [1]. The result is the generation of aberrant mRNA that: i) is not translated into proteins; ii) is the origin of proteins with partial or complete loss of function; or iii) is translated into proteins with a completely different function. Although the repercussions could be relevant for cancer biology and clinical outcome, the underlying mechanism affecting mRNA maturation in cancer cells is still poorly understood. Recently, several highly common alterations across different cancer types have been reported, suggesting the existence of pan-cancer splicing perturbations [2].

In eukaryotic cells, after transcription, pre-mRNA undergoes three main processing steps. These are splicing, capping at the 5' end and addition of a poly A tail at the 3' end. Splicing, which occurs in more than 95% of expressed pre-mRNA, is carried out by removal of generally long sequences (up to 100 kb) constituted by the introns and ligation of generally shorter sequences (100-300 bp) corresponding to the exons that are flanked by introns in the pre-mRNA [3]. The existence of alternative splicing enhances cell complexity by increasing the number of different proteins that can be synthesized from a single gene [4]. The crucial process in splicing is carried out by a very complex macromolecular structure, the spliceosome, in whose dynamic step-wise function approximately 300 proteins are directly or indirectly involved in mammalian cells. The U2-dependent spliceosome (major spliceosome) catalyzes the removal of the most numerous introns (U2-type), whereas a small number of genes containing rare U12-type introns are processed by a different structure called the U12-dependent minor spliceosome [5], which is not considered further in the present study.

Hepatocellular carcinoma (HCC) is the most common primary liver cancer and one of the five types of tumors with the worst prognosis [6]. Single-molecule real-time long-read RNA sequencing in HCC has revealed that alternative variants and tumor-specific variants that arise from aberrant splicing are common during liver carcinogenesis [7]. Transcriptome-wide analyses have revealed that differential alternative splicing events, mainly affecting metabolic pathways, are prevalent in HCC [8]. Moreover, the potential of alternative splicing events as biomarkers in HCC has been recently explored [9]. In previous studies we have reported that alternative splicing of SLC22A1 pre-mRNA plays an essential role in the poor response of HCC to sorafenib [10, 11], the first-line drug in the treatment of advanced HCC [12], because the high frequency of aberrant splicing results in the generation of short non-functional peptides that account for an overall reduced expression of functional organic cation transporter type 1 (OCT1) [10, 11, 13]. This transporter constitutes the primary way for HCC cells to take up sorafenib, which is required for further interaction with its intracellular targets [10, 13], i.e., the catalytic site of receptors with tyrosine kinase activity.

As a first step in attempting to unravel the cause of enhanced aberrant splicing of SLC22A1 in HCC, we have searched for changes in the expression of major spliceosome genes that may affect this process. In most genes of higher eukaryotes, whose intron length exceeds 200-250 nucleotides, splicing starts by the assembly of the E complex across designated regions of pre-mRNA to achieve exon recognition [14]. This step involves binding to the target sequence of two small nuclear ribonucleoprotein particles (snRNPs), U1 and U2, to generate the early E complex. First, U1 binds the 5' splicing site (5'SS), which is characterized by a consensus splice-site sequence formed by A/CAG at the 3' end of the exon followed by GURAGU at the 5' end of the adjacent intron in a typical metazoan pre-mRNA, where R is any purine nucleotide. Exon recognition is completed by U2 snRNP. Proteins Sf1 and the heterodimer U2af (U2af1 and U2af2) identify: i) the branch point site (BPS), whose sequence is YNYURAY, where A is mandatory, Y is any pyrimidine nucleotide and N is any nucleotide; ii) the polypyrimidine tract (PPT); and iii) the 3' splicing site (3'SS), which is located at the 3' end of the intron, and is characterized by the sequence AG. The recognition process occurs because Sf1 binds to BPS and U2af1 binds to 3'SS, while U2af2 binds to PPT and interacts with both Sf1 and U2af1.
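The consensus sequences quoted above (GURAGU for the intronic side of the 5'SS and YNYURAY for the BPS) use degenerate nucleotide codes, and a short sketch can show how such motifs are located in a sequence. The code below is illustrative only: the toy RNA fragment and the helper names `consensus_to_regex` and `find_sites` are assumptions made for this example and are not part of the study.

```python
import re

# Degenerate codes used in the text: R = purine, Y = pyrimidine, N = any nucleotide.
IUPAC = {"A": "A", "C": "C", "G": "G", "U": "U",
         "R": "[AG]", "Y": "[CU]", "N": "[ACGU]"}

def consensus_to_regex(consensus):
    """Translate a consensus string such as 'GURAGU' into a regex fragment."""
    return "".join(IUPAC[base] for base in consensus)

def find_sites(consensus, rna):
    """Return 0-based start positions of all matches, overlapping ones included."""
    pattern = re.compile("(?=" + consensus_to_regex(consensus) + ")")
    return [m.start() for m in pattern.finditer(rna)]

# Hypothetical toy fragment: exon end, 5' end of the intron, branch point,
# polypyrimidine tract, terminal AG of the intron, start of the next exon.
toy = "AUCCAG" + "GUAAGU" + "CCAUACUAACUUUCUCUUCCAG" + "GCAU"

five_ss = find_sites("GURAGU", toy)   # candidate 5' splice sites
bps = find_sites("YNYURAY", toy)      # candidate branch point sites
print(five_ss, bps)
```

A motif scan of this kind only flags candidate sites. Recognition in the cell also depends on the protein factors described above.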

In the present descriptive study, the expression of proteins associated with the exon recognition machinery, either because they are involved in spliceosome formation, such as those forming the E complex, or in its regulation, has been investigated. Several factors, mainly Ser/Arg-rich proteins (SR-proteins), bind to exonic (ESE) and intronic (ISE) splicing enhancers to stimulate the binding of U2AF to a weak 3' splice site and U1 snRNP to the downstream 5' splice site. The effect of SR-proteins favoring short splicing is antagonized by heterogeneous ribonucleoprotein particles (hnRNPs) able to interact with exonic (ESS) and intronic (ISS) splicing silencers. In addition to elements structurally involved in the spliceosome and its function in exon recognition, post-translational modifications of these proteins affect their rearrangement, which is crucial for determining the way splicing occurs. Thus, several nuclear and cytoplasmic protein kinases and phosphatases play an important role in the regulation of the mRNA maturation machinery [15].

To investigate the role of spliceosome machinery expression changes in the splicing of SLC22A1, we have used an event-driven approach, which has previously been used to detect individual events correlated with cancer development or its prognosis [16-18]. Thus, we have evaluated data available in public databases (DB) regarding gene expression in HCC and adjacent non-tumor tissue. We have included in our analysis the expression levels of genes involved in the formation, activity, and regulation of the spliceosome. Moreover, the study contains information provided by three DB regarding the relationship between the expression of selected genes and patient survival. Finally, as some discrepancies among DB were found, a more robust paired t-test analysis of data collected from 50 paired samples of the tumor (T) and adjacent (A) non-tumor tissue included in "The Cancer Genome Atlas" (TCGA) was carried out. The SpliceSeq data set was used to correlate changes in the exon-recognition machinery and the rate of alternative splicing of the SLC22A1 gene in HCC. The findings were later validated in a cohort of 25 paired biopsies of HCC using custom-designed Taqman Low Density Arrays (TLDAs).

Materials and Methods

Spliceosome gene expression data

In a first step, the study was expanded using information available at the Integrative Molecular Database of HCC (HCCDB), which has been developed by Lian et al. [19], who have curated 15 public HCC expression datasets, including TCGA and the Genotype-Tissue Expression (GTEx), with up to nearly 4000 clinical samples. This resource is contributed by Tsinghua University, National Center for Liver Cancer & Shanghai Eastern Hepatobiliary Surgery Hospital and is publicly available at http://lifeome.net/database/hccdb/home.html. Among the different DB integrated in the HCCDB, only HCCDB3 contains additional data concerning healthy (H, n=6) and cirrhotic (C, n=40) liver tissue. The number of samples (T/A) per database was as follows: HCCDB1: 100/97; HCCDB3: 268/243; HCCDB4: 240/193; HCCDB6: 225/220; HCCDB7: 80/82; HCCDB11: 88/48; HCCDB12: 81/80; HCCDB13: 228/168; HCCDB15 (TCGA): 351/49; HCCDB16: 60/60; HCCDB17: 115/52; HCCDB18: 212/177. A t-test followed by Benjamini-Hochberg correction was applied to detect significant differences in gene expression in HCC as compared with adjacent non-tumor tissue in each dataset. Values were normalized by log2 transformation [20].
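As an illustration of the multiple-testing step mentioned above, the following minimal sketch applies the Benjamini-Hochberg procedure to a set of per-gene p-values. The gene labels and p-values are hypothetical placeholders, and the function name `benjamini_hochberg` is introduced here for illustration only. It is not the code used by HCCDB or by the authors.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotonic (each one is capped by the value ranked above it).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values from tumor vs adjacent-tissue t-tests.
raw_p = {"GENE_A": 0.001, "GENE_B": 0.008, "GENE_C": 0.039,
         "GENE_D": 0.041, "GENE_E": 0.60}

q_values = dict(zip(raw_p, benjamini_hochberg(list(raw_p.values()))))
significant = [gene for gene, q in q_values.items() if q < 0.05]
print(q_values, significant)
```

In this toy example, GENE_C and GENE_D have raw p-values below 0.05 but adjusted values slightly above it, which is the kind of borderline call the correction is meant to filter out.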

Information regarding the survival of HCC patients, who were stratified depending on whether the tumor had high or low expression of each analyzed gene, was available at HCCDB6, HCCDB15, and HCCDB18. A Kaplan-Meier analysis was applied to evaluate the differences in survival rates. A P value lower than 0.05 was considered significant.

Paired samples data analysis

In a second step, we completed the in silico study using information from patients with HCC available at TCGA (https://tcga-data.nci.nih.gov/). Although 427 cases are included in this database, we used data only from the 50 patients for whom information regarding paired T and A tissues was available. The expression values for the selected genes were double normalized using the expression of glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and β-actin (ACTB). A paired t-test was used to calculate the statistical significance between T and A in this analysis.

Validation cohort

To rule out any potential effect of pharmacological treatments on the expression of splicing-related proteins, specimens from HCC patients who had not received any neoadjuvant anticancer treatment before the surgical resection of the tumor were used to separately analyze tumor (T) and adjacent non-tumor tissue (A). These samples were collected at Salamanca University Hospitals and Barcelona Clinic Hospital (Spain) and then donated for this project by the Tumor Biobanks of these hospitals. The Ethical Committees for Clinical Research of both institutions approved the research protocols, and all patients signed written consents for the use of their samples for biomedical research. The Declaration of Helsinki was followed to collect patient samples. Patient and tumor information is included in Supplementary Table 1. Gene expression was determined by custom-designed Taqman Low Density Arrays that included optimized Thermo Fisher reagents (Supplementary Table 2).

SLC22A1 pre-mRNA splicing data

In the third phase of this study, we analyzed the abundance of splicing events undergone by SLC22A1 pre-mRNA and its association with changes in the expression of proteins involved in the mRNA maturation machinery described above. To carry out this analysis, we used as a computational tool the on-line SpliceSeq package [21] available at the University of Texas MD Anderson Cancer Center (https://bioinformatics.mdanderson.org/TCGASGFliceSeq). For each sample and every possible splice event (e.g. an exon skip event), this web-based resource calculates a percent-splice-in (PSI) value, which is defined as the ratio of normalized read counts indicating inclusion of a transcript element to the total normalized reads for that event (both inclusion and exclusion reads). For instance, a PSI value of 0.25 for a splice event means that this event occurs in approximately 75% of the transcripts present in that sample.
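The PSI definition above can be written out as a small sketch. The read counts are hypothetical, the function name `percent_spliced_in` is introduced only for illustration, and reading the frequency of the event as 1 - PSI simply follows the worked example in the text. This is not the SpliceSeq implementation.

```python
def percent_spliced_in(inclusion_reads, exclusion_reads):
    """PSI = inclusion / (inclusion + exclusion); None when no reads cover the event."""
    total = inclusion_reads + exclusion_reads
    if total == 0:
        return None  # event could not be quantified in this sample
    return inclusion_reads / total

# Hypothetical normalized read counts for one exon-skip event in one sample.
psi = percent_spliced_in(inclusion_reads=30.0, exclusion_reads=90.0)

# Following the example in the text, a PSI of 0.25 corresponds to the
# event being present in about 75% of the transcripts.
event_share = 1.0 - psi
print(psi, event_share)
```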

Values shown in this study have been calculated from the PSI of six splice events occurring in SLC22A1 mRNA maturation that were due to either exon skip (ES) or alternate donor site (AD) (Figure 1). We have defined alternative splicing global frequency (ASGF) as the percentage of mRNA molecules that contain at least one splicing event. ASGF was calculated as 100*(PSI1*…*PSIn), where PSIx was the PSI for individual splicing events in the sample (in the case of SLC22A1, x=1…6). This calculation has the limitation of assuming that all splicing events were non-excluding independent events, which in the case of SLC22A1 pre-mRNA is not entirely correct, because exon 9 or 10 skips cannot occur if exons 9+10 have already been skipped together. Samples with splicing events that had failed to be quantified (null) were eliminated from the study. The Pearson correlation coefficient was calculated by regression analysis between ASGF affecting SLC22A1 pre-mRNA (SLC22A1-ASGF) and expression levels (mRNA) of factors involved in exon recognition, using the least squares method.

Complete information regarding selected genes is summarized in Tables 1 to 4, which have been organized by their function in pre-mRNA maturation. Data concerning genes with the highest expression and undergoing the most relevant changes are depicted in Figures 2, 4, 6, 8 and 9. The rest are available in Supplementary Figures 1 to 26.
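The calculation described above can be sketched as follows, using hypothetical PSI and expression values. The code evaluates the expression as written, 100*(PSI1*…*PSIn), and also its complement. Under the independence assumption stated in the text, the product of inclusion ratios is the share of transcripts in which every element is retained, and the complement is the share carrying at least one event. Which of the two corresponds to the reported SLC22A1-ASGF values depends on how PSI is oriented for each event type, so that choice is an assumption of this sketch. The function names are illustrative, not taken from the study.

```python
from math import prod, sqrt

def psi_product_percent(psi_values):
    """100 * (PSI1 * ... * PSIn), the expression as written in the text."""
    return 100.0 * prod(psi_values)

def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical samples: six PSI values per sample plus a log2 expression value
# for one splicing factor. None marks an event that could not be quantified.
samples = {
    "S1": {"psi": [0.95, 0.90, 1.00, 0.98, 0.99, 0.97], "expr": 10.2},
    "S2": {"psi": [0.80, 0.85, 0.95, 0.90, 0.92, 0.88], "expr": 11.5},
    "S3": {"psi": [0.70, None, 0.90, 0.85, 0.90, 0.80], "expr": 12.1},
    "S4": {"psi": [0.60, 0.75, 0.90, 0.80, 0.85, 0.78], "expr": 12.9},
}

# Samples with any unquantified (null) event are removed, as in the text.
kept = {name: s for name, s in samples.items() if None not in s["psi"]}

products = [psi_product_percent(s["psi"]) for s in kept.values()]
complements = [100.0 - p for p in products]  # share with at least one event
expression = [s["expr"] for s in kept.values()]

r = pearson_r(complements, expression)
print(sorted(kept), round(r, 3))
```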

Results

Determination of splicing probability

Normal splicing of SLC22A1 pre-mRNA leads to the formation of mature mRNA constituted by 11 exons (Figure 1A). However, up to seven alternative splicing events have been described (Figures 1B and 1C). Here we have considered together splicing events whose PSI values were available at the SpliceSeq DB (Figure 1B). This approach underestimates total splicing, because additional splicing events, previously described by us [10], have not been considered in that DB (Figure 1C). The analysis indicated that the percentage of SLC22A1 mRNA molecules affected by alternative splicing was higher in the tumor than in the adjacent tissue (Figures 1D and 1E) without significant differences among the stages of tumor development (Figure 1F).

The U1 element of the E complex

A paired analysis of data included in TCGA showed a significant change in HCC regarding the expression of proteins forming the U1 element of the spliceosome (Table 1). The SNRNP70 gene, encoding the U1 snRNP 70 kDa protein, was similarly expressed in HCC and adjacent tissue (Supplementary Figure 1A) and correlated with SLC22A1-ASGF (Supplementary Figure 2A). This gene was upregulated in 5 DB, not significantly altered in another 5 DB and downregulated in 2 DB (Supplementary Figure 3). The analysis of its relationship with patient survival showed no difference between patients bearing tumors with high and low SNRNP70 expression (Supplementary Figure 4).

A more general consensus among different DB was found regarding the expression of SNRPA, which encodes U1A snRNP (Table 1). This gene was found slightly upregulated in the paired analysis (Supplementary Figure 1B) and in 10 DB, whereas no change was described in 2 DB (Supplementary Figure 3). A significant correlation with SLC22A1-ASGF was found (Supplementary Figure 2B). SNRPA expression levels had no impact on survival (Supplementary Figure 4).

Similar results were obtained for SNRPC, which was found upregulated in HCC in all DB except one (Supplementary Figure 3). The paired analysis confirmed enhanced SNRPC expression in tumor versus adjacent tissue (Supplementary Figure 1C). No significant correlation between SNRPC expression and SLC22A1-ASGF was found (Supplementary Figure 2C). Although some trend to shorter survival time in patients with higher tumor SNRPC expression was observed, this was not statistically significant (Supplementary Figure 4). Results obtained in the validation cohort showed no significant changes in the expression of SNRPA and SNRPC (Figure 2).

The Sm-proteins of the E complex

The analysis of six Sm-proteins involved in the E complex (Table 1) revealed that the most significant changes concerned Sm-B/B1 (SNRPB), whose expression was high in the adjacent tissue and was further increased in HCC (Figure 3A). This was confirmed by the validation cohort (Figure 2). A significant correlation with SLC22A1-ASGF was found (Figure 3B). Data on this gene were available only in 10 DB. Among them, in 8 DB SNRPB mRNA was significantly higher in tumor than in adjacent tissue (Figure 3C). Interestingly, a significant relationship between high SNRPB expression and shorter survival was found (Figure 3D).

Expression levels of SNRPD1 were 10-fold lower than those of SNRPB (Supplementary Figure 1D). The paired analysis of TCGA data showed slightly higher SNRPD1 expression in tumors than in adjacent tissue (Supplementary Figure 1D). However, the TLDA analysis did not confirm the increased expression of this gene in HCC (Figure 2). No correlation between SNRPD1 expression and SLC22A1-ASGF was found (Supplementary Figure 2D). In all DB, except one, SNRPD1 was upregulated (Supplementary Figure 3). A significant association between higher SNRPD1 expression and poorer prognosis was found (Supplementary Figure 4).

The expression of SNRPE was higher in tumor than in adjacent tissue (Supplementary Figure 1E). There was no correlation with SLC22A1-ASGF (Supplementary Figure 2E). In all DB, except 2, SNRPE was found upregulated in HCC (Supplementary Figure 3). No significant prognostic value for this gene was found (Supplementary Figure 4).

The expression of other genes encoding Sm-proteins, such as SNRPF, SNRPG, and SNRPN, was similarly low in HCC and adjacent tissue (Supplementary Figure 5A-5C). Only SNRPN expression showed a significant correlation with SLC22A1-ASGF (Supplementary Figure 6C). Although SNRPF and SNRPG were reported to be upregulated in most DB, some discrepancy concerning SNRPN expression was seen (Supplementary Figure 7). Neither SNRPF nor SNRPN was associated with patient survival, whereas higher expression of SNRPG was associated with a poorer prognosis (Supplementary Figure 8). Surprisingly, SNRPF demonstrated high expression levels in the validation cohort with moderate downregulation in HCC (Figure 2).

Exon recognition factors of the E complex

The analysis of three genes involved in the critical role of the E complex in exon recognition during splicing (Table 1) revealed that the expression of SF1 was lower in HCC than in adjacent tissue (Supplementary Figure 5D). A significant correlation between SF1 expression and SLC22A1-ASGF was found (Supplementary Figure 6D). Data of HCCDB were heterogeneous. Thus, a moderate upregulation was found in 3 DB, whereas there was no significant change in 4 DB and a moderate but significant downregulation in the remaining 5 DB (Supplementary Figure 7). Analysis of the validation cohort supported SF1 downregulation in HCC (Figure 2).

The expression of U2AF1 was also reduced in HCC (Supplementary Figure 5E), which was consistent with values determined by us in the validation cohort (Figure 2). A significant correlation with SLC22A1-ASGF was found (Supplementary Figure 6E). Surprisingly, only 1 DB reported U2AF1 downregulation in HCC, whereas in the rest no change or upregulation was found (Supplementary Figure 7).

The paired analysis of TCGA data revealed similar U2AF2 expression in HCC and adjacent tissue (Supplementary Figure 5F). Likewise, no significant change in the expression of this gene in HCC was observed in the validation cohort (Figure 2). In contrast, non-paired analysis indicated that U2AF2 was upregulated in 10 DB (Supplementary Figure 7). A significant correlation with SLC22A1-ASGF was found (Supplementary Figure 6F). The expression of SF1, U2AF1 and U2AF2 was not associated with patient prognosis (Supplementary Figure 8).

Alternative splicing factors inhibiting short splicing

Nine genes encoding proteins able to inhibit short splicing have been analyzed (Table 2). Although the paired analysis indicated that the expression of HNRNPA1 was not enhanced (Supplementary Figure 9A), non-paired analysis reported HNRNPA1 upregulation in 9 DB (Supplementary Figure 11). No correlation of HNRNPA1 expression with SLC22A1-ASGF (Supplementary Figure 10A) or patient survival (Supplementary Figure 12) was found.

No significant changes in the expression of HNRNPA2B1, HNRNPD and HNRNPH1 (Supplementary Figures 9B, 9C and 9D) were detected in the paired analysis of these genes, whereas conflicting results of upregulation and downregulation were described in different DB (Supplementary Figure 11). Except for HNRNPH1, no significant correlation between SLC22A1-ASGF and the expression of these genes was found (Supplementary Figures 10B, 10C and 10D). Also, except for HNRNPH1, no association with patient survival was reported (Supplementary Figure 12).

Other genes of this subgroup, which included HNRNPK, PCBP2 (Supplementary Figures 9E and 9F) and HNRNPL (Figure 3E), were downregulated. In contrast, the expression of HNRNPF (Figure 4A) and HNRNPI (Figure 4E) was significantly higher in HCC than in adjacent non-tumor tissue. A correlation with SLC22A1-ASGF was found for HNRNPL (Figure 3F), HNRNPF (Figure 4B) and HNRNPI (Figure 4F). Although there was some consensus with the results of the paired analysis, several discrepancies regarding the expression of these genes in HCCDB were found (Supplementary Figure 11 and Figures 3G, 4C and 4G). A significant association between shorter survival and higher expression of HNRNPK, PCBP2 (Supplementary Figure 12) and HNRNPL (Figure 3H) was found. Regarding PCBP2 and HNRNPFR expression, results found in DB were consistent with values determined by us in the validation cohort (Figure 5).

Alternative splicing factors favoring short splicing

Fourteen SR-proteins were analyzed (Table 3). The expression of SRSF1 was not significantly changed in HCC (Supplementary Figure 13A) and was not significantly correlated with SLC22A1-ASGF (Supplementary Figure 14A). SRSF1 expression was found slightly increased in 6 DB and unchanged in another 6 DB (Supplementary Figure 15). No association with patient survival was found (Supplementary Figure 16).

The paired analysis showed a modest downregulation of SRSF2 in HCC (Supplementary Figure 13B), together with the absence of its correlation with SLC22A1-ASGF (Supplementary Figure 14B). SRSF2 expression was increased in 6 DB, unchanged in 3 DB and reduced in 1 DB (Supplementary Figure 15). Tumors with lower SRSF2 expression were associated with longer patient survival (Supplementary Figure 16).

The expression of SRSF3, SRSF4 (Supplementary Figures 13C and 13D), SRSF7, SRSF8, SRSF10 (Supplementary Figures 17A, 17B and 17D) and SRRM1 (Supplementary Figure 21A) was decreased in HCC, whereas that of SRSF9, SRSF11, SRSF12 (Supplementary Figures 17C, 17E and 17F) and RBM5 (Supplementary Figure 21B) was not significantly modified. The highest expression and most marked changes were those of SRSF5 (Figure 6A) and SRSF6 (Figure 6E). Among these genes, the expression of SRSF5 (Figure 6B), SRSF6 (Figure 6F), SRSF4 (Supplementary Figure 14D), SRSF7, SRSF8, SRSF9, SRSF11 (Supplementary Figures 18A, 18B, 18C and 18E) and RBM5 (Supplementary Figure 21D) correlated with SLC22A1-ASGF. Both upregulation and downregulation were found in different DB (Figures 6C and 6G; Supplementary Figures 15, 19 and 22). Regarding the expression levels of SRSF3-5, SRSF7-10 and SRRM1, results found in DB were consistent with values determined by us in the validation cohort (Figure 7). None of these genes was associated with patient survival (Figures 6D and 6H; Supplementary Figures 16, 20 and 22).

Kinases and phosphatases involved in spliceosome protein activation

The paired analysis revealed altered expression of protein kinases and phosphatases involved in controlling the activity of several of the proteins discussed above (Table 4). Thus, nuclear kinases were either downregulated, such as CLK1 (Figure 8A), or upregulated, such as CLK2 (Figure 8E), whereas low expression and slight changes were seen for CLK3 and CLK4 (Supplementary Figures 23A and 23B). These findings were consistent with results reported in most DB (Figures 8C, 8G and Supplementary Figure 25). Correlation with SLC22A1-ASGF was seen for CLK1 (Figure 8B), CLK2 (Figure 8F) and CLK3 (Supplementary Figure 24A). Association with patient survival was not found for any of these nuclear kinases (Figures 8D, 8H and Supplementary Figure 26).

Regarding cytoplasmic protein kinases, only SRPK1 was considerably expressed in liver tissue and moderately upregulated in HCC (Supplementary Figure 23C). These findings were consistent in almost all DB (Supplementary Figure 25). The expression of SRPK2 was very low in both adjacent and tumor tissue (Supplementary Figure 23D). No correlation between SRPK1 expression and SLC22A1-ASGF (Supplementary Figure 24C) or patient survival (Supplementary Figure 26) was found.

Protein phosphatases PP1A (PPP1CA) and PP2A (PPP2CA) were found upregulated by the paired analysis (Figures 9A and 9E), which was consistent with data from most DB, except for PPP1CA in HCCDB11 (Figure 9C) and PPP2CA in HCCDB3 (Figure 9G). The expression level of PPP1CA (Figure 9B) and PPP2CA (Figure 9F) correlated with SLC22A1-ASGF. Patient survival was not associated with the expression of any of these genes (Figures 9D and 9H).

In the validation cohort, a significant reduction in the expression of CLK1 and CLK4 was observed in HCC when compared with adjacent tissue. The remaining genes encoding protein kinases and phosphatases did not show a significant change in HCC (Figure 10).

Discussion


An important observation in this study was the marked discrepancy among different cohorts regarding HCC-associated changes in the expression of proteins involved in the mRNA maturation machinery. In many cases, different DB reported statistically significant changes in opposite directions. This was probably due to the great interindividual variability described for these genes in both adjacent and tumor tissues. Thus, analyzing all data from the same cohort as a global set can generate artifactual conclusions. This was clearly seen when data from the same cohort (TCGA) were analyzed using a paired t-test comparing adjacent and tumor tissue from each patient. The results of this analysis were more robust and, for some genes, even contradictory to those obtained when TCGA expression data were analyzed globally. The study of our validation cohort further supported this concept.
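As a minimal sketch of why a paired comparison can be more robust when interindividual variability is large, the following Python snippet contrasts a paired t statistic with an unpaired (Welch) statistic. The expression values and the function names `paired_t` and `welch_t` are hypothetical illustrations, not data or code from this study, and the unpaired statistic merely stands in for a global comparison; it is not necessarily the test applied by the individual databases.

```python
import math
import statistics


def paired_t(adjacent, tumor):
    """Paired t statistic computed on per-patient differences (tumor - adjacent)."""
    diffs = [t - a for a, t in zip(adjacent, tumor)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def welch_t(adjacent, tumor):
    """Unpaired (Welch) t statistic that treats each tissue as one pooled group."""
    se = math.sqrt(
        statistics.variance(adjacent) / len(adjacent)
        + statistics.variance(tumor) / len(tumor)
    )
    return (statistics.mean(tumor) - statistics.mean(adjacent)) / se


# Hypothetical expression values (arbitrary units) for six patients.
# Baseline levels differ widely between patients, but every tumor sample
# is slightly lower than its own adjacent tissue.
adjacent = [2.0, 5.0, 9.0, 14.0, 20.0, 27.0]
tumor = [1.5, 4.2, 8.4, 13.1, 19.3, 26.0]

print("paired t statistic:", round(paired_t(adjacent, tumor), 2))
print("Welch t statistic: ", round(welch_t(adjacent, tumor), 2))
```

In this toy example the small but consistent per-patient decrease yields a large paired statistic, whereas the unpaired statistic stays close to zero because the between-patient spread dominates the pooled variance.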


From the clinical point of view, it is relevant to mention that some changes observed in three DB could be associated with the duration of patient survival. Owing to the heterogeneity of the cohorts included in the available DB and the lack of clinical information regarding the treatment received by these patients, it is difficult to establish any cause-effect relationship between changes in gene expression and the mechanisms accounting for their impact on patient survival. Nevertheless, some changes associated with enhanced SLC22A1-ASGF, such as downregulation of genes encoding proteins favoring short splicing and upregulation of genes encoding proteins inhibiting it, are expected to favor the generation of aberrant mRNA variants. These can lead to a decrease in the level of functional OCT1 expressed in cancer cells and hence also in their ability to take up and respond to sorafenib.


To discuss the changes observed in gene expression, we will follow the temporal sequence of exon recognition. This process starts with the assembly of the so-called pre-spliceosomal complex that defines the intron. This involves U1, which is composed of U1 snRNA, three U1-specific proteins and seven Sm proteins, some of which, such as SNRPA, SNRPB, SNRPC, SNRPD1 and SNRPE, were upregulated in HCC (Table 1). Whether these changes result in a more active spliceosome is unknown, but it must be highlighted that SNRPB and SNRPD1 were among the genes whose high expression was associated with shorter patient survival. The next step in exon recognition involves the interaction with the U2 snRNP proteins Sf1, U2af1 and U2af2, which are encoded by the SF1, U2AF1 and U2AF2 genes. Both SF1 and U2AF1 were downregulated in HCC (Table 1), which suggests a reduced ability to identify both BPS and 3’SS. Regarding the exon-recognition function, it is of note that several SR-proteins (SRSF2-8, SRSF10 and SRRM1) involved in potentiating the identification of weak splicing sites, by binding to ESE and ISE, were also downregulated in HCC (Table 3). These results are consistent with previous reports indicating a reduced SRSF3 expression in HCC [22].


Downregulation was particularly marked for SRSF5 and SRSF6. The expected functional consequence of the lower abundance of SR-proteins is a higher probability of exon skipping or aberrant alternative splicing. It is of note that experimental data in mice have shown that SRSF3 loss in hepatocytes favors liver carcinogenesis [23]. On the other hand, the enhanced expression of several HNRNP genes encoding proteins able to bind to ESS and ISS and displace SR-proteins from their interaction with pre-mRNA, such as HNRNPF and HNRNPI, might have similar consequences. In contrast, downregulation of HNRNPK, HNRNPL and PCBP2 would shift the balance towards shorter splicing (Table 2).


Another important finding of this study was the marked change affecting the expression of regulatory enzymes accounting for phosphorylation and dephosphorylation of proteins involved in the exon-recognition machinery (Table 4). Whereas both diminished and increased expression was seen for protein kinases, protein phosphatases were consistently upregulated. These results suggest a balance in favor of dephosphorylated proteins, which could affect the overall function of the exon-recognition machinery. This opens the door to further mechanistic investigations that should include quantitative and functional analysis of individual regulatory enzymes and their target proteins.


In sum, these results support previous reports [10, 13] showing that in tissue adjacent to HCC there is already a considerable degree of alternative splicing, which is consistent with the frequent observation of altered splicing in several pathological liver conditions accompanied by metabolic perturbation but in the absence of cancer [24, 25]. For instance, in the liver of patients with insulin-resistant obesity, downregulation of several splicing factors encoded by SR and HNRNP genes has been reported [26]. In the liver of patients with either mild or advanced non-alcoholic fatty liver disease (NAFLD), the expression of thirty splicing factors has been found to be altered [27, 28]. Moreover, SRSF4 downregulation has been associated with non-alcoholic steatohepatitis (NASH) [29]. Nevertheless, the frequency of altered SLC22A1 splicing events was further increased in the tumor, which is associated with profound and complex changes in the profile of proteins determining exon recognition during the maturation of pre-mRNA. This may play an important role in the overall expression of functional OCT1, and presumably other proteins, which account for HCC-characteristic phenotypic traits, such as the reduced ability to take up and respond to sorafenib [10]. The present descriptive study has identified changes in the expression of key elements involved in the exon-recognition machinery, which will be a necessary starting point for further mechanistic investigations aimed at elucidating the functional repercussions of changes in individual proteins on the splicing of SLC22A1 and other genes that play a critical role in HCC pathogenesis and treatment.


Figure Legends

Figure 1. SLC22A1 pre-mRNA and splicing variants. A. Scheme of exons (boxes 1-11), introns (horizontal lines) and normal splicing of the wild-type SLC22A1 variant. Alternative splicing events considered by SpliceSeq from RNASeq data available at TCGA (B) or previously described by us and others based on RT-PCR analysis (C). Individual paired values (D) and statistical comparison by paired t-test of average values (mean±SEM) (E) of SLC22A1 alternative splicing global frequency (SLC22A1-ASGF), defined as the percent of mRNA molecules that contain at least one splicing event. ASGF was calculated in SLC22A1 mRNA by SpliceSeq in paired samples of hepatocellular carcinoma tumor (T) and adjacent non-tumor tissue (A). Relationship between ASGF and the tumor stage (F). *, p<0.05.

Figure 2. Experimental validation of changes in the expression of genes involved in Complex E formation. Schematic representation of genes with statistically significant tumor-associated changes in mRNA levels according to RNASeq data available at TCGA (A). Results from the validation of gene expression in a cohort of 25 paired samples of hepatocellular carcinoma tumor (T) and adjacent non-tumor tissue (A) as determined by TLDA (B). *, p<0.05, on comparing A vs. T.

Figure 3. Relevant changes in the expression of SNRP and HNRNP genes. Abundance of SNRPB (A) and HNRNPL (E) mRNA in paired samples of hepatocellular carcinoma (HCC) tumor (T) and adjacent non-tumor tissue (A) from RNASeq data available at TCGA. Statistical comparison by paired t-test of average values (mean±SEM). Correlation between SNRPB (B) and HNRNPL (F) mRNA and SLC22A1 alternative splicing global frequency (SLC22A1-ASGF). Relative SNRPB (C) and HNRNPL (G) expression levels from data derived from different databases encompassed in HCC database (HCCDB). Values are mean±SEM, from samples collected from tumor (T, n=2048; from 60 to 351 per database), adjacent non-tumor (A, n=1469; from 48 to 243 per database). Kaplan-Meier survival curves of patients with HCC stratified according to the level of SNRPB (D) and HNRNPL (H) expression, as reported in three databases (HCCDB6, HCCDB15 and HCCDB18). †, p<0.05; prognostic value when high vs. low gene expression was compared. *, p<0.05, on comparing A vs. T.

Figure 4. Relevant changes in the expression of HNRNP genes. Abundance of HNRNPF (A) and HNRNPI (E) mRNA in paired samples of hepatocellular carcinoma tumor (T) and adjacent non-tumor tissue (A) from RNASeq data available at TCGA. Statistical comparison by paired t-test of average values (mean±SEM). Correlation between HNRNPF (B) and HNRNPI (F) mRNA and SLC22A1 alternative splicing global frequency (SLC22A1-ASGF). Relative HNRNPF (C) and HNRNPI (G) expression levels from data derived from different databases encompassed in hepatocellular carcinoma (HCC) database (HCCDB). Values are mean±SEM, from samples collected from tumor (T, n=2048; from 60 to 351 per database), adjacent non-tumor (A, n=1469; from 48 to 243 per database). Kaplan-Meier survival curves of patients with HCC stratified according to the level of HNRNPF (D) and HNRNPI (H) expression, as reported in three databases (HCCDB6, HCCDB15 and HCCDB18). †, p<0.05; prognostic value when high vs. low gene expression was compared. *, p<0.05, on comparing A vs. T.

Figure 5. Experimental validation of changes in the expression of genes favoring long splicing by inhibition of short splicing due to interaction with exonic (ESS) and intronic (ISS) silencers. Schematic representation of genes with statistically significant tumor-associated changes in mRNA levels according to RNASeq data available at TCGA (A). Results from the validation of gene expression in a cohort of 25 paired samples of hepatocellular carcinoma tumor (T) and adjacent non-tumor tissue (A) as determined by TLDA (B). *, p<0.05, on comparing A vs. T.

Figure 6. Relevant changes in the expression of SRSF genes. Abundance of SRSF5 (A) and SRSF6 (E) mRNA in paired samples of hepatocellular carcinoma tumor (T) and adjacent non-tumor tissue (A) from RNASeq data available at TCGA. Statistical comparison by paired t-test of average values (mean±SEM). Correlation between SRSF5 (B) and SRSF6 (F) mRNA and SLC22A1 alternative splicing global frequency (SLC22A1-ASGF). Relative SRSF5 (C) and SRSF6 (G) expression levels from data derived from different databases encompassed in hepatocellular carcinoma (HCC) database (HCCDB). Values are mean±SEM, from samples collected from tumor (T, n=2048; from 60 to 351 per database), adjacent non-tumor (A, n=1469; from 48 to 243 per database). Kaplan-Meier survival curves of patients with HCC stratified according to the level of SRSF5 (D) and SRSF6 (H) expression, as reported in three databases (HCCDB6, HCCDB15 and HCCDB18). †, p<0.05; prognostic value when high vs. low gene expression was compared. *, p<0.05, on comparing A vs. T.

Figure 7. Experimental validation of changes in the expression of genes favoring short splicing by interaction with exonic (ESE) and intronic (ISE) splicing enhancers. Schematic representation of genes with statistically significant tumor-associated changes in mRNA levels according to RNASeq data available at TCGA (A). Results from the validation of gene expression in a cohort of 25 paired samples of hepatocellular carcinoma tumor (T) and adjacent non-tumor tissue (A) as determined by TLDA (B). *, p<0.05, on comparing A vs. T.

Figure 8. Relevant changes in the expression of CLK genes. Abundance of CLK1 (A) and CLK2 (E) mRNA in paired samples of hepatocellular carcinoma tumor (T) and adjacent non-tumor tissue (A) from RNASeq data available at TCGA. Statistical comparison by paired t-test of average values (mean±SEM). Correlation between CLK1 (B) and CLK2 (F) mRNA and SLC22A1 alternative splicing global frequency (SLC22A1-ASGF). Relative CLK1 (C) and CLK2 (G) expression levels from data derived from different databases encompassed in hepatocellular carcinoma (HCC) database (HCCDB). Values are mean±SEM, from samples collected from tumor (T, n=2048; from 60 to 351 per database), adjacent non-tumor (A, n=1469; from 48 to 243 per database). Kaplan-Meier survival curves of patients with HCC stratified according to the level of CLK1 (D) and CLK2 (H) expression, as reported in three databases (HCCDB6, HCCDB15 and HCCDB18). †, p<0.05; prognostic value when high vs. low gene expression was compared. *, p<0.05, on comparing A vs. T.

Figure 9. Relevant changes in the expression of phosphatases. Abundance of PPP1CA (A) and PPP2CA (E) mRNA in paired samples of hepatocellular carcinoma tumor (T) and adjacent non-tumor tissue (A) from RNASeq data available at TCGA. Statistical comparison by paired t-test of average values (mean±SEM). Correlation between PPP1CA (B) and PPP2CA (F) mRNA and SLC22A1 alternative splicing global frequency (SLC22A1-ASGF). Relative PPP1CA (C) and PPP2CA (G) expression levels from data derived from different databases encompassed in hepatocellular carcinoma (HCC) database (HCCDB). Values are mean±SEM, from samples collected from tumor (T, n=2048; from 60 to 351 per database), adjacent non-tumor (A, n=1469; from 48 to 243 per database). Kaplan-Meier survival curves of patients with HCC stratified according to the level of PPP1CA (D) and PPP2CA (H) expression, as reported in three databases (HCCDB6, HCCDB15 and HCCDB18). †, p<0.05; prognostic value when high vs. low gene expression was compared. *, p<0.05, on comparing A vs. T.

Figure 10. Experimental validation of changes in the expression of genes encoding nuclear and cytoplasmic kinases and phosphatases. Schematic representation of genes with statistically significant tumor-associated changes in mRNA levels according to RNASeq data available at TCGA (A). Results from the validation of gene expression in a cohort of 25 paired samples of hepatocellular carcinoma tumor (T) and adjacent non-tumor tissue (A) as determined by TLDA (B). *, p<0.05, on comparing A vs. T.


REFERENCES

[1] C.J. David, J.L. Manley, Alternative pre-mRNA splicing regulation in cancer: pathways and programs unhinged, Genes Dev, 24 (2010) 2343-2364.
[2] Y. Du, S. Li, R. Du, N. Shi, S. Arai, S. Chen, A. Wang, Y. Zhang, Z. Fang, T. Zhang, W. Ma, Modularized Perturbation of Alternative Splicing Across Human Cancers, Front Genet, 10 (2019) 246.
[3] Y. Lee, D.C. Rio, Mechanisms and Regulation of Alternative Pre-mRNA Splicing, Annu Rev Biochem, 84 (2015) 291-323.
[4] C.L. Will, R. Luhrmann, Spliceosome structure and function, Cold Spring Harb Perspect Biol, 3 (2011).
[5] A.A. Patel, J.A. Steitz, Splicing double: insights from the second spliceosome, Nat Rev Mol Cell Biol, 4 (2003) 960-970.
[6] J. Ferlay, M. Colombet, I. Soerjomataram, C. Mathers, D.M. Parkin, M. Pineros, A. Znaor, F. Bray, Estimating the global cancer incidence and mortality in 2018: GLOBOCAN sources and methods, Int J Cancer, (2018).
[7] H. Chen, F. Gao, M. He, X.F. Ding, A.M. Wong, S.C. Sze, A.C. Yu, T. Sun, A.W. Chan, X. Wang, N. Wong, Long-Read RNA Sequencing Identifies Alternative Splice Variants in Hepatocellular Carcinoma and Tumor-Specific Isoforms, Hepatology, (2019).
[8] S. Li, Z. Hu, Y. Zhao, S. Huang, X. He, Transcriptome-Wide Analysis Reveals the Landscape of Aberrant Alternative Splicing Events in Liver Cancer, Hepatology, 69 (2019) 359-375.
[9] P. Wu, D. Zhou, Y. Wang, W. Lin, A. Sun, H. Wei, Y. Fang, X. Cong, Y. Jiang, Identification and validation of alternative splicing isoforms as novel biomarker candidates in hepatocellular carcinoma, Oncol Rep, 41 (2019) 1929-1937.
[10] E. Herraez, E. Lozano, R.I. Macias, J. Vaquero, L. Bujanda, J.M. Banales, J.J. Marin, O. Briz, Expression of SLC22A1 variants may affect the response of hepatocellular carcinoma and cholangiocarcinoma to sorafenib, Hepatology, 58 (2013) 1065-1073.
[11] A. Geier, R.I. Macias, D. Bettinger, J. Weiss, H. Bantel, D. Jahn, R. Al-Abdulla, J.J. Marin, The lack of the organic cation transporter OCT1 at the plasma membrane of tumor cells precludes a positive response to sorafenib in patients with hepatocellular carcinoma, Oncotarget, 8 (2017) 15846-15857.
[12] J.M. Llovet, S. Ricci, V. Mazzaferro, P. Hilgard, E. Gane, J.F. Blanc, A.C. de Oliveira, A. Santoro, J.L. Raoul, A. Forner, M. Schwartz, C. Porta, S. Zeuzem, L. Bolondi, T.F. Greten, P.R. Galle, J.F. Seitz, I. Borbath, D. Haussinger, T. Giannaris, M. Shan, M. Moscovici, D. Voliotis, J. Bruix, S.I.S. Group, Sorafenib in advanced hepatocellular carcinoma, N Engl J Med, 359 (2008) 378-390.
[13] E. Lozano, R.I.R. Macias, M.J. Monte, M. Asensio, S. Del Carmen, L. Sanchez-Vicente, M. Alonso-Pena, R. Al-Abdulla, P. Munoz-Garrido, L. Satriano, C.J. O'Rourke, J.M. Banales, M.A. Avila, M.L. Martinez-Chantar, J.B. Andersen, O. Briz, J.J.G. Marin, Causes of hOCT1-dependent cholangiocarcinoma resistance to sorafenib and sensitization by tumor-selective gene therapy, Hepatology, (2019).
[14] S.M. Berget, Exon recognition in vertebrate splicing, J Biol Chem, 270 (1995) 2411-2414.
[15] J. Soret, J. Tazi, Phosphorylation-dependent control of the pre-mRNA splicing machinery, Prog Mol Subcell Biol, 31 (2003) 89-126.
[16] M. Danan-Gotthold, R. Golan-Gerstl, E. Eisenberg, K. Meir, R. Karni, E.Y. Levanon, Identification of recurrent regulated alternative splicing events across human solid tumors, Nucleic Acids Res, 43 (2015) 5130-5144.
[17] S. Shen, Y. Wang, C. Wang, Y.N. Wu, Y. Xing, SURVIV for survival analysis of mRNA isoform variation, Nat Commun, 7 (2016) 11548.
[18] H. Dvinge, R.K. Bradley, Widespread intron retention diversifies most cancer transcriptomes, Genome Med, 7 (2015) 45.
[19] Q. Lian, S. Wang, G. Zhang, D. Wang, G. Luo, J. Tang, L. Chen, J. Gu, HCCDB: A Database of Hepatocellular Carcinoma Expression Atlas, Genomics Proteomics Bioinformatics, 16 (2018) 269-275.
[20] J.A. Ferreira, The Benjamini-Hochberg method in the case of discrete test statistics, Int J Biostat, 3 (2007) Article 11.
[21] M. Ryan, W.C. Wong, R. Brown, R. Akbani, X. Su, B. Broom, J. Melott, J. Weinstein, TCGASpliceSeq a compendium of alternative mRNA splicing in cancer, Nucleic Acids Res, 44 (2016) D1018-1022.
[22] M. Elizalde, R. Urtasun, M. Azkona, M.U. Latasa, S. Goni, O. Garcia-Irigoyen, I. Uriarte, V. Segura, M. Collantes, M. Di Scala, A. Lujambio, J. Prieto, M.A. Avila, C. Berasain, Splicing regulator SLU7 is essential for maintaining liver homeostasis, J Clin Invest, 124 (2014) 2909-2920.
[23] S. Sen, M. Langiewicz, H. Jumaa, N.J. Webster, Deletion of serine/arginine-rich splicing factor 3 in hepatocytes predisposes to hepatocellular carcinoma in mice, Hepatology, 61 (2015) 171-183.
[24] C. Berasain, S. Goni, J. Castillo, M.U. Latasa, J. Prieto, M.A. Avila, Impairment of pre-mRNA splicing in liver disease: mechanisms and consequences, World J Gastroenterol, 16 (2010) 3091-3102.
[25] N.J.G. Webster, Alternative RNA Splicing in the Pathogenesis of Liver Disease, Front Endocrinol (Lausanne), 8 (2017) 133.
[26] J. Pihlajamaki, C. Lerin, P. Itkonen, T. Boes, T. Floss, J. Schroeder, F. Dearie, S. Crunkhorn, F. Burak, J.C. Jimenez-Chillaron, T. Kuulasmaa, P. Miettinen, P.J. Park, I. Nasser, Z. Zhao, Z. Zhang, Y. Xu, W. Wurst, H. Ren, A.J. Morris, S. Stamm, A.B. Goldfine, M. Laakso, M.E. Patti, Expression of the splicing factor gene SFRS10 is reduced in human obesity and contributes to enhanced lipogenesis, Cell Metab, 14 (2011) 208-218.
[27] S.K. Murphy, H. Yang, C.A. Moylan, H. Pang, A. Dellinger, M.F. Abdelmalek, M.E. Garrett, A. Ashley-Koch, A. Suzuki, H.L. Tillmann, M.A. Hauser, A.M. Diehl, Relationship between methylome and transcriptome in patients with nonalcoholic fatty liver disease, Gastroenterology, 145 (2013) 1076-1087.
[28] R. Zhu, S.S. Baker, C.A. Moylan, M.F. Abdelmalek, C.D. Guy, F. Zamboni, D. Wu, W. Lin, W. Liu, R.D. Baker, S. Govindarajan, Z. Cao, P. Farci, A.M. Diehl, L. Zhu, Systematic transcriptome analysis reveals elevated expression of alcohol-metabolizing genes in NAFLD livers, J Pathol, 238 (2016) 531-542.
[29] C. Lopez-Vicario, A. Gonzalez-Periz, B. Rius, E. Moran-Salvador, V. Garcia-Alonso, J.J. Lozano, R. Bataller, M. Cofan, J.X. Kang, V. Arroyo, J. Claria, E. Titos, Molecular interplay between Delta5/Delta6 desaturases and long-chain fatty acids in the pathogenesis of non-alcoholic steatohepatitis, Gut, 63 (2014) 344-355.


Table 1. Genes involved in Complex E of the spliceosome that have been analyzed.

| Function | Protein | Gene | Change in expression | Correlation with SLC22A1 alternative splicing | Clinical prognostic value | ENSEMBL |
|---|---|---|---|---|---|---|
| Complex E U1 Element | U1 snRNP 70 kDa | SNRNP70 | None | Yes | No | ENSG00 |
| Complex E U1 Element | U1A snRNP | SNRPA | Up | Yes | No | ENSG00 |
| Complex E U1 Element | U1C snRNP | SNRPC | Up | No | No | ENSG00 |
| Complex E Sm Proteins | Sm-B/B1 | SNRPB | Up | Yes | Yes | ENSG00 |
| Complex E Sm Proteins | Sm-D1 | SNRPD1 | Up | No | Yes | ENSG00 |
| Complex E Sm Proteins | Sm-E | SNRPE | Up | No | No | ENSG00 |
| Complex E Sm Proteins | Sm-F | SNRPF | None | No | No | ENSG00 |
| Complex E Sm Proteins | Sm-G | SNRPG | None | No | Yes | ENSG00 |
| Complex E Sm Proteins | Sm-N | SNRPN | None | Yes | No | ENSG00 |
| Binding to BPS | Sf1 | SF1 | Down | Yes | No | ENSG00 |
| Binding to 3’SS | U2af1 | U2AF1 | Down | Yes | No | ENSG00 |
| Binding to PPT | U2af2 | U2AF2 | None | Yes | No | ENSG00 |

BPS, Branch point site; 3’SS, 3’ splicing site; PPT, polypyrimidine tract.


Table 2. Genes encoding alternative splicing factors inhibiting short splicing.

Function: Interaction with exonic (ESS) and intronic (ISS) silencers results in inhibition of short splicing

| Protein | Gene | Change in expression | Correlation with SLC22A1 alternative splicing | Clinical prognostic value | ENSEMBL |
|---|---|---|---|---|---|
| hnRNP A1 | HNRNPA1 | None | No | No | ENSG000001 |
| hnRNP B1 | HNRNPA2B1 | None | No | No | ENSG000001 |
| hnRNP D | HNRNPD | None | Yes | No | ENSG000001 |
| hnRNP F | HNRNPF | Up | No | No | ENSG000001 |
| hnRNP H | HNRNPH1 | None | Yes | Yes | ENSG000001 |
| hnRNP K | HNRNPK | Down | No | Yes | ENSG000001 |
| hnRNP L | HNRNPL | Down | Yes | Yes | ENSG000001 |
| hnRNP E2 | PCBP2 | Down | No | Yes | ENSG000001 |
| PTBP1 | HNRNPI | Up | Yes | No | ENSG000000 |


Table 3. Genes encoding factors favoring short splicing.

Function: Interaction with exonic (ESE) and intronic (ISE) splicing enhancers favors short splicing

| Protein | Gene | Change in expression | Correlation with SLC22A1 alternative splicing | Clinical prognostic value | ENSEMBL |
|---|---|---|---|---|---|
| SF2/ASF | SRSF1 | None | No | No | ENSG000001 |
| SC35 | SRSF2 | Down | No | Yes | ENSG000001 |
| SRp20 | SRSF3 | Down | No | No | ENSG000001 |
| SRp75 | SRSF4 | Down | Yes | No | ENSG000001 |
| SRp40 | SRSF5 | Down | Yes | No | ENSG000001 |
| SRp55 | SRSF6 | Down | Yes | No | ENSG000001 |
| 9G8 | SRSF7 | Down | Yes | No | ENSG000001 |
| SRp46 | SRSF8 | Down | Yes | No | ENSG000002 |
| SRp30c | SRSF9 | None | Yes | No | ENSG000001 |
| TRA2- | SRSF10 | Down | No | No | ENSG000001 |
| NET2 | SRSF11 | None | Yes | No | ENSG000001 |
| SREK1 | SRSF12 | None | Yes | No | ENSG000001 |
| SRm160 | SRRM1 | Down | No | No | ENSG000001 |
| LUCA15 | RBM5 | None | Yes | No | ENSG000000 |


Table 4. Genes encoding proteins involved in the regulation of SR proteins.

| Function | Protein | Gene | Change in expression | Correlation with SLC22A1 alternative splicing | Clinical prognostic value | ENSEMBL |
|---|---|---|---|---|---|---|
| Nuclear Kinases | CDC like kinase 1 | CLK1 | Down | Yes | No | ENSG0 |
| Nuclear Kinases | CDC like kinase 2 | CLK2 | Up | Yes | No | ENSG0 |
| Nuclear Kinases | CDC like kinase 3 | CLK3 | Up | Yes | No | ENSG0 |
| Nuclear Kinases | CDC like kinase 4 | CLK4 | None | No | No | ENSG0 |
| Cytoplasmic Kinases | SRSF protein kinase 1 | SRPK1 | Up | No | No | ENSG0 |
| Cytoplasmic Kinases | SRSF protein kinase 2 | SRPK2 | None | No | No | ENSG0 |
| Phosphatases | PP1A | PPP1CA | Up | Yes | No | ENSG0 |
| Phosphatases | PP2A | PPP2CA | Up | Yes | No | ENSG0 |


Highlights

• The alternative splicing of SLC22A1 pre-mRNA is increased in HCC
• The expression of key genes favoring short splicing is decreased in HCC
• The expression of several proteins that inhibit short splicing is increased in HCC
• Spliceosome-related protein kinases and phosphatases expression is altered in HCC

Figure 1

Figure 2

Figure 3

Figure 4

Figure 5

Figure 6

Figure 7

Figure 8

Figure 9

Figure 10