Novel alternative splice variants of the human protein arginine methyltransferase 1 (PRMT1) gene, discovered using next-generation sequencing



Gene 699 (2019) 135–144

Contents lists available at ScienceDirect

Gene journal homepage: www.elsevier.com/locate/gene

Panagiotis G. Adamopoulos, Adamantios V. Mavrogiannis, Christos K. Kontos, Andreas Scorilas




Department of Biochemistry and Molecular Biology, National and Kapodistrian University of Athens, 15701 Athens, Greece

ARTICLE INFO

Keywords: Alternative splicing; Novel exon; 3′ RACE; Semiconductor sequencing; Nonsense-mediated mRNA decay (NMD); Histone methylation

ABSTRACT

Next-generation sequencing (NGS) technology is expected to help researchers uncover the complexity of alternative splicing and understand its association with carcinogenesis. Alterations in alternative splicing are firmly associated with multiple malignancies, with functional roles in malignant transformation, motility, and/or metastasis of cancer cells. A prime example of the connection between alternative splicing and cancer is the human protein arginine methyltransferase 1 (PRMT1) gene, previously cloned by members of our research group and involved in a variety of processes including transcription, DNA repair, and signal transduction. Two splice variants of PRMT1 (variants v.1 and v.2) are downregulated in breast cancer. In addition, PRMT1 v.2 promotes the survival and invasiveness of breast cancer cells, and it could serve as a biomarker of unfavorable prognosis in colon cancer patients. The aim of this study was the molecular cloning of novel alternative splice variants of PRMT1 using 3′ RACE coupled with NGS technology. Extensive bioinformatic and computational analysis revealed 19 novel alternative splicing events between annotated exons of PRMT1 as well as one novel exon, resulting in the discovery of multiple PRMT1 transcripts. To validate the full sequence of the novel transcripts, RT-PCR was carried out with variant-specific primers. As a result, 58 novel PRMT1 transcripts were identified, 34 of which are mRNAs encoding new protein isoforms, whereas the remaining 24 transcripts are candidates for nonsense-mediated mRNA decay (NMD).

1. Introduction

Eukaryotic mRNAs are transcribed as precursors containing intervening sequences, called introns. These sequences are subsequently removed by RNA splicing during maturation of the primary transcript into the final mRNA molecule, which consists of exons (Smith and Valcarcel, 2000). Alternative splicing is one of the most important mechanisms generating a large number of mRNA and protein isoforms from the surprisingly low number of human genes, changing the structure of transcripts as well as of their encoded proteins (Stamm et al., 2005). Notably, genome-wide analyses of alternative splicing have shown that about 40–60% of human genes have alternative splice forms, suggesting that alternative splicing is one of the major causes of the functional complexity of the human genome (Modrek and Lee, 2002). The investigation of the

alternative splicing mechanism as well as its impact on genomics and transcriptomics has been a significant subfield of molecular biology for many years, but has not received much research attention compared with other fundamental fields, such as the discovery of new genes or transcriptional regulation. Nevertheless, extensive analyses of ESTs and microarray data have identified several main types of alternative splicing (Blencowe, 2006). The most prevalent pattern, exon skipping or cassette exon, corresponds to splicing events where an exon is spliced out of the primary RNA transcript (Sammeth et al., 2008), whereas additional major splicing mechanisms include mutually exclusive exons that cannot coexist in the mature mRNA sequence, and intron retentions, where an intron is retained in the final mature mRNA sequence (Graveley, 2001; Sanchez-Pla et al., 2012). Alternative selection of 5′ or 3′ splice sites within exon sequences has also been characterized as a dominant alternative splicing mechanism, as it may lead to subtle

Abbreviations: cDNA, complementary DNA; EST, expressed sequence tag; hnRNP, heterogeneous nuclear ribonucleoprotein; mRNA, messenger RNA; NGS, next-generation sequencing; NMD, nonsense-mediated mRNA decay; ORF, open reading frame; PTC, premature translation termination codon; RACE, rapid amplification of cDNA ends; RT-PCR, reverse-transcription polymerase chain reaction; UTR, untranslated region
⁎ Corresponding author at: Department of Biochemistry and Molecular Biology, National and Kapodistrian University of Athens, Panepistimiopolis, 15701 Athens, Greece. E-mail address: [email protected] (A. Scorilas).
https://doi.org/10.1016/j.gene.2019.02.072
Received 21 August 2018; Received in revised form 24 January 2019; Accepted 17 February 2019; Available online 05 March 2019
0378-1119/© 2019 Published by Elsevier B.V.


Table 1. List of primers (a) that were used for the validation of the NGS results.

Name       Direction  Sequence (5′→3′)             Length (nt)  Tm (°C) (b)
1/4F       Forward    TGCATCATGGAGGTGTCCTG         20           59.8
1/5F       Forward    CGAACTGCATCATGGAGGAGATG      23           61.3
1/7F       Forward    CTGCATCATGGAGTGGTGAC         20           58.6
1/8F       Forward    CATCATGGAGGCGCCCG            17           60.7
2/4F       Forward    CGCCTCTTGAAGAAGTGTCCTG       22           61.2
4/6F       Forward    TTGGCATCCACGAGATCGAG         20           59.9
4/7F       Forward    TGGCATCCACGAGTGGTGA          19           61.2
4/8F       Forward    ATCCACGAGGCGCCCGAT           18           63.9
4/10F      Forward    GCATCCACGAGGAGGTGG           18           59.8
5/7F       Forward    CAAGGTCATCGGGTGGTGAC         20           60.7
5/8F       Forward    AAGGTCATCGGGGCGCC            17           62.2
5/10F      Forward    AAGGTCATCGGGGAGGTGG          19           61.0
6/8F       Forward    TTAGACCACGGCGCCCGA           18           63.7
6/9F       Forward    AGTTAGACCACGGGTGGGAG         20           60.9
6/10F      Forward    AAGTTAGACCACGGAGGTGGA        21           60.5
7/9F       Forward    ACAAGTGGCTGGGTGGGA           18           60.8
4/6R       Reverse    TGGAACACTCGATCTCGTGGA        21           60.9
4/7R       Reverse    TGATGGTCACCACTCGTGGA         20           60.8
4/10R      Reverse    GTATAGATGTCCACCTCCTCGTG      23           59.7
5/7R       Reverse    TGATGGTCACCACCCGATGA         20           60.9
5/10R      Reverse    GTATAGATGTCCACCTCCCCGAT      23           60.8
6/9R       Reverse    GTTCTCCCACCCGTGGTC           18           59.7
6/12R      Reverse    CAGGTCCCGCGTGGTCTA           18           61.0
7/9R       Reverse    CGTTCTCCCACCCAGCCA           18           61.6
6/10R      Reverse    TAGATGTCCACCTCCGTGGTC        21           61.0
8/10R      Reverse    GGTATAGATGTCCACCTCAGTGGAT    25           61.2
9/11R      Reverse    GGGACTCGGGGCCTTTATG          19           59.9
9/12R      Reverse    AAGTCCAGGTCCCGCTTTATG        21           60.1
N/12R (c)  Reverse    GGTCCCGCCCCTTGTCTA           18           60.7
11/12R     Reverse    CCAGGTCCCGGTTGTTCTTG         20           60.8

(a) Each primer was designed to target a specific splice junction between two annotated exons of the PRMT1 gene.
(b) Melting temperature (Tm) was calculated by Primer-BLAST.
(c) The reverse primer “N/12R” was designed to target the splice junction between exon 12 and the newly discovered exon.
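A quick consistency check of Table 1 can be sketched in a few lines of Python. The snippet below is only an illustration and not part of the authors' workflow: it takes three primers from the table, confirms that each sequence has the length reported in the "Length (nt)" column, and computes the GC fraction. The function names are hypothetical. Tm values are deliberately not recomputed, because they come from Primer-BLAST and its thermodynamic parameters are not stated in the text.

```python
# Three primers copied from Table 1, paired with the length the table reports.
PRIMERS = {
    "1/4F": ("TGCATCATGGAGGTGTCCTG", 20),
    "4/8F": ("ATCCACGAGGCGCCCGAT", 18),
    "9/12R": ("AAGTCCAGGTCCCGCTTTATG", 21),
}


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a primer sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def check_primers(primers):
    """Return, per primer, whether the length matches and the GC fraction."""
    report = {}
    for name, (seq, reported_length) in primers.items():
        report[name] = {
            "length_ok": len(seq) == reported_length,
            "gc": gc_fraction(seq),
        }
    return report


report = check_primers(PRIMERS)
for name, result in report.items():
    print(name, result["length_ok"], f"{result['gc']:.2f}")
```

The same dictionary can be extended to all 30 primers to catch transcription errors when the table is copied by hand.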

single genes and have been applied successfully in several approaches including microarrays (Yeakley et al., 2002; Johnson et al., 2003; Pan et al., 2004; Fehlbaum et al., 2005). However, tremendous progress has been made using the newly introduced high-throughput sequencing, which not only led to the discovery of novel alternative splicing events (Blencowe, 2006; Pan et al., 2008; Wang et al., 2008), but also has accelerated the process of understanding the regulation of alternative splicing in different tissues by examining the expression levels of protein regulators and helping to define cis-regulatory elements (Castle et al., 2008; Sultan et al., 2008). Next-generation sequencing (NGS) has emerged as a powerful and increasingly cost-effective technology for genomic and transcriptomic analyses. NGS has several significant advantages, including its high sensitivity and accuracy, broad dynamic range, nucleotide-level resolution as well as the ability to analyze pre-mRNA alternative splicing and detect novel mRNA transcripts (Morozova and Marra, 2008; Bryant Jr. et al., 2012; Sanchez-Pla et al., 2012; Park et al., 2013; Mangul et al., 2014). NGS has been widely coupled with other molecular cloning techniques, including RT-PCR (Sun et al., 2014), 5′ RACE (Lagarde et al., 2016; Leenen et al., 2016; Nashed et al., 2016) as well as 3′ RACE (Adamopoulos et al., 2016; Lagarde et al., 2016), in order to further investigate not only the variability and complexity of transcriptomes, but also the discovery of alternative transcription start sites and poly(A) sites. More specifically, since RACE followed by NGS can unveil the full sequences of both untranslated regions (5′- and 3′-UTRs), new insights into the complexity of these regions are highly expected. In fact, it is known that a remarkable number of human genes comprise several alternative first exons that constitute the 5′-UTR of the respective transcripts, each one having its own alternative promoter. 
It has also been estimated that almost 58% of the transcribed genes have multiple promoters (Carninci et al., 2006). On the other hand, identification of novel alternative 3′-UTRs is of high importance, as these regions

changes in the coding sequence. The recognition of exon-intron boundaries depends also on sequence elements within exons and on intronic elements that are distinct from the splice sites. These sequence elements and the factors that recognize them are employed for the constitutive definition of introns and exons (Garcia-Blanco et al., 2004). The preservation or removal of exon sequences in the mature RNA transcript enhances the generation of transcript variants and, consequently, of protein isoforms that may differ in structural and functional properties (Kriventseva et al., 2003; Sanchez et al., 2011). Therefore, alternative splicing requires accurate regulation in order to guarantee plasticity, while still displaying high specificity and fidelity. However, despite the fact that alternative splicing events are tightly regulated by epigenetic DNA- and histone-modifying proteins under normal cellular conditions, they are often dysregulated in cancer cells and may lead to the production of aberrantly expressed or even cancer-specific transcripts (Tazi et al., 2009). Recently, a number of studies identified that dysregulation of alternative splicing plays a major pathogenic role in cancer (Skotheim and Nees, 2007; David and Manley, 2010; Ward and Cooper, 2010). Tumor-specific splice variants are being discovered at an increasing rate and their functions are also investigated in cancer progression. The tumor-specific splice variants, the expression patterns and activities of which are successfully characterized, may become attractive targets for ablation or splicing modification (Blair and Zi, 2011). 
In addition, it should be noted that about 50% of human genetic disorders are associated with mutations in splice sites and regulatory elements, including enhancers and silencers, resulting in alternative exon composition in the mature RNA sequence (Caceres and Kornblihtt, 2002; Cartegni et al., 2002; Matlin et al., 2005). For this reason, alternative transcripts produced by aberrant splicing or cancer-specific splice variants could be used as diagnostic, prognostic, and/or predictive biomarkers as well as therapeutic targets (Pajares et al., 2007). Until recently, splicing studies have traditionally focused on


Fig. 1. Schematic demonstration of representative sequencing reads for every novel splicing event detected in the present study. The nucleotides of each exon are shown with different colors (blue and red) for visual purposes. In total, 19 novel splice junctions were identified. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

proteins by PRMTs plays an important regulatory role in many biological processes, such as signaling (Berthet et al., 2002), DNA repair (Lake and Bedford, 2007), maturation and nucleocytoplasmic transport of RNA (Boisvert et al., 2002), protein protection (Fackelmayer, 2005), ribosomal assembly (Bachand and Silver, 2004), and the regulation of gene expression (An et al., 2004). The PRMTs comprise a family of nine protein members that can be divided into three distinct groups. Type-I enzymes catalyze the formation of NG-monomethylarginine and asymmetric NG, NG-dimethylarginine residues, type-II enzymes catalyze the formation of NG-monomethylarginine and symmetric NG, NG-dimethylarginine residues, whereas type-III enzymes catalyze monomethylation of the internal guanidine nitrogen atom to form ω-NG-monomethylarginine (Zobel-Thropp et al., 1998). The predominant mammalian arginine methyltransferase is PRMT1.

frequently contain regulatory regions that post-transcriptionally influence gene expression, polyadenylation, translation efficiency, localization, and stability of mRNAs (Barrett et al., 2012; Pichon et al., 2012). The ability of NGS to uncover novel alternative splicing events is even more important with regard to cancer-related genes, as particular alternative splice variants have been shown to possess clinical relevance in human malignancies (Alexopoulou et al., 2014; Christodoulou et al., 2014). One such example of cancer-related genes is the family of protein arginine methyltransferases (PRMTs), the products of which catalyze the methylation of arginine residues, one of the most widespread post-translational modifications of proteins. These proteins methylate arginine residues by transferring methyl groups from S-adenosyl-L-methionine to terminal guanidine nitrogen atoms (Herrmann et al., 2009). Post-translational modification of target


Fig. 2. Schematic demonstration of one representative sequencing read confirming the existence of the novel exon of the PRMT1 gene, located between exons 11 and 12. Nucleotides of exon 11 are shown in blue color and nucleotides of exon 12 in red; the middle exon (in black color) is the new one. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

The human PRMT1 gene is located on chromosome 19q13.3, close to the RRAS and IRF3 genes; it spans a region of 11.3 kb of genomic DNA and comprises 12 exons and 11 intervening introns. The encoded protein is a type-I PRMT, which is involved in various processes, including transcription, DNA repair, and signal transduction (Wolf, 2009). PRMT1 methylates arginine residues in glycine-rich regions, found in many proteins that bind RNA and are involved in various aspects of RNA metabolism, such as heterogeneous nuclear ribonucleoproteins (hnRNPs), histone H4, and signal transducers and activators of transcription (STATs). Additionally, there is strong evidence that PRMT1 interacts with ILF3 and IFNAR1 (Papadokostopoulou et al., 2009). In the past, members of our lab discovered three splice variants of PRMT1, encoding proteins of 343 (isoform 1), 361 (isoform 2), and 347 (isoform 3) amino acid residues, with a molecular mass of 39.6, 41.5, and 39.9 kDa, respectively (Scorilas et al., 2000). Isoform 2 was predicted to contain 3 hydrophobic stretches, one of which may be a signal peptide, whereas variant 3 contains an in-frame translation stop codon in the middle of exon 3, as well as a second (downstream) translation start codon. In the present study, we used NGS technology to search for, identify, and characterize novel alternatively spliced variants of the PRMT1 gene in many human cell lines. After in-depth computational analysis, we detected novel alternative splicing events that support the existence of novel PRMT1 transcripts, and verified these novel splice junctions.

2. Materials and methods

2.1. Human cell line culture

A total of 55 human cell lines derived from 19 different tissues were used in this study: MCF-7, SK-BR-3, BT-20, MDA-MB-231, MDA-MB-468 (breast adenocarcinoma), BT-474, T-47D, ZR-75-1 (ductal carcinoma of the breast), OVCAR-3, SK-OV-3, ES-2, MDAH-2774 (ovarian cancer), Ishikawa, SK-UT-1B (endometrial adenocarcinoma), HeLa, SiHa (cervical carcinoma), PC-3, DU 145, LNCaP (prostate cancer), T24, RT4 (urinary bladder cancer), ACHN, 786-O, Caki-1 (renal cell carcinoma), Caco-2, DLD-1, HT-29, HCT 116, SW 620, COLO 205, RKO (colorectal cancer), AGS (gastric adenocarcinoma), Hep G2, HuH-7 (hepatocellular carcinoma), U-87 MG, U-251 MG, D54, H4, SH-SY5Y (brain cancer), A549 (lung adenocarcinoma), FM3, MDA-MB-435S (melanoma), Raji, Daudi, U-937 (lymphoma), K-562, HL-60, Jurkat, REC-1, SU-DHL-1, GRANTA-519 (leukemia), HEK293 (normal embryonic kidney), 1.2B4 (normal pancreas), and BB49-SCCHN, CAL-33 (head and neck squamous cell carcinoma). Each cell line was cultured according to the American Type Culture Collection instructions.

2.2. Total RNA extraction and reverse transcription

Total RNA was extracted from the 55 human cell lines using the TRIzol Reagent (Ambion, Austin, TX, USA) and stored in THE RNA Storage Solution (Ambion) at −80 °C until use. The purity and concentration of each RNA sample were assessed spectrophotometrically at 260 and 280 nm. Reverse transcription was then performed in 20 μL reaction volumes, using 5 μg of total RNA, SuperScript II Reverse Transcriptase (Invitrogen, Carlsbad, CA, USA), and an oligo-dT–adaptor (5′-GCGAGCACAGAATTAATACGACTCACTATAGGTTTTTTTTTTTTVN-3′, where V = G, A, C and N = G, A, T, C) as primer. After first-strand cDNA synthesis, each sample was diluted 5-fold and all cDNAs were organized in 19 groups, based on their tissue of origin. The cDNAs of each group were then mixed equimolarly, generating 19 cDNA pools.

2.3. Amplification of the novel splice variants using nested 3′ RACE

Next, 3′ RACE was performed for the selective amplification of the PRMT1 transcripts. In brief, a forward gene-specific primer (sequence: 5′-GTAGGTGCGGGTGAAGATGG-3′) was designed to target the region of the annotated translation start codon and was used along with a universal reverse primer (sequence: 5′-GCGAGCACAGAATTAATACGACT-3′) that was designed to anneal to the oligo-dT–adaptor primer sequence. 3′ RACE products were diluted 50-fold in nuclease-free water and used as templates for nested 3′ RACE, which was performed with an internal gene-specific forward primer (sequence: 5′-AGGCCGCGAACTGCATCAT-3′) and another universal reverse primer (sequence: 5′-AGCACAGAATTAATACGACTCACTATAGG-3′). The purpose of nested 3′ RACE was to achieve higher sensitivity and specificity for the targeted transcripts.

Both 3′ RACE and nested 3′ RACE reactions were performed in 25 μL reaction volumes containing MgCl2-free KAPA Taq Buffer C (Kapa Biosystems Inc., Woburn, MA, USA), 1.5 mM MgCl2, 0.2 mM dNTPs, 10 pmol of each primer, and 2 units of KAPA Taq DNA Polymerase (Kapa Biosystems Inc.), in a Veriti 96-Well Fast Thermal Cycler (Applied Biosystems, Foster City, CA, USA). The DNA was denatured at 95 °C for 3 min, followed by 35 cycles of amplification (denaturation, 95 °C for 30 s; annealing, 60 °C for 30 s; extension, 72 °C for 1 min). After 35 cycles, the final extension was done at 72 °C for 1 min. Next, nested 3′ RACE products were cleaned-up using the


Fig. 3. Agarose gel electrophoresis results of all the RT-PCR products that were produced for the validation of our NGS results. The primer pairs used for each RT-PCR reaction are shown at the top of each lane. In addition, the expected PCR product length as well as the transcript variant number that is validated is shown below each lane. Finally, the 100 bp DNA ladder was used for the assessment of each PCR product size.


Fig. 4. Structure of the 34 novel PRMT1 splice variants that comprise an open reading frame (ORF). Exons are depicted as boxes and introns as lines; gray and white boxes represent coding and non-coding exons, respectively. Numbers inside boxes and above lines indicate the length of each exon or intron in nucleotides. Arrows (↓) show the position of the ATG codon, asterisks (*) indicate the position of the stop codon, and question marks (?) represent an undetermined 5′ UTR. For each new PRMT1 transcript, the splice variant number, the GenBank® accession number and the protein isoform number are shown next to each transcript.
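The GenBank accession numbers behind Figs. 4 and 5 are reported in Section 3.3 as hyphenated ranges, which makes the stated totals (34 coding, 24 NMD candidates, 58 overall) awkward to verify by eye. The following Python sketch expands such ranges and tallies them. The helper name `expand_accessions` is hypothetical, and the two strings are simply the accession lists quoted in the text.

```python
import re


def expand_accessions(spec: str) -> list[str]:
    """Expand a comma-separated list such as 'KX421008-KX421010, KX421041'."""
    accessions = []
    for item in (part.strip() for part in spec.split(",")):
        if not item:
            continue
        if "-" in item:
            first, last = item.split("-")
            prefix = re.match(r"[A-Z]+", first).group()
            width = len(first) - len(prefix)
            low, high = int(first[len(prefix):]), int(last[len(prefix):])
            accessions.extend(f"{prefix}{n:0{width}d}" for n in range(low, high + 1))
        else:
            accessions.append(item)
    return accessions


# Accession lists as quoted in Section 3.3 of the text.
CODING = ("KX421008-KX421010, KX421012-KX421021, KX421025-KX421027, "
          "KX421029-KX421038, KX421041, KX421042, KX421044-KX421049")
NMD_CANDIDATES = ("KX421050, KX421053, KX421056, KX421058-KX421062, KX421065, "
                  "KX421070-KX421076, KX421078-KX421080, KX421082-KX421085, KX421090")

coding = expand_accessions(CODING)
nmd_candidates = expand_accessions(NMD_CANDIDATES)
print(len(coding), len(nmd_candidates), len(coding) + len(nmd_candidates))
```

The expanded lists could be passed to a sequence-retrieval tool of choice; that step is left out here because it depends on network access.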

PRMT1 gene. Three of these transcripts have open reading frames (ORFs) and are mRNAs (NM_001536.5, NM_198318.4, NM_001207042.2), whereas the remaining transcript contains a premature translation termination codon (PTC) and is hence characterized as non-coding RNA (NR_033397.4). In total, 12 exons are identified in all annotated PRMT1 transcripts; however, the three protein-coding transcripts lack exon 3. In addition, PRMT1 transcript v.1 (NM_001536.5) contains all the remaining exons, transcript v.2 (NM_198318.4) also lacks exon 2, and transcript v.3 (NM_001207042.2) lacks exons 8 and 9 in addition to exons 2 and 3 (Suppl. Fig. 1). Our bioinformatic analysis confirmed all annotated splice junctions, each covered by thousands of sequencing reads of the FASTQ file. Apart from the annotated alternative splicing events, bioinformatic analysis and visualization using IGV revealed a total of 19 novel splice junctions between the annotated exons of the PRMT1 gene (Fig. 1 and Suppl. Fig. 2), as well as one novel exon, as described in the next paragraph. Based on the analyzed NGS data, most of the 19 novel splice junctions are much less frequent than the previously annotated ones, since they were covered by fewer sequencing reads. All novel splice junctions were validated using RT-PCR and electrophoresis on agarose gels. On the other hand, no alternative 3′ UTRs were identified by our analysis.

NucleoSpin Gel and PCR Clean-up kit (Macherey-Nagel GmbH & Co. KG, Duren, Germany), following the manufacturer's guidelines. After PCR clean-up, the concentration of each sample was spectrophotometrically determined. All cleaned-up nested 3′ RACE products were mixed to generate a single sample, which was stored at −20 °C until further use.

2.4. NGS library preparation and semi-conductor sequencing

NGS library preparation and quantification, followed by template preparation and enrichment, were performed as previously described (Adamopoulos et al., 2017; Adamopoulos et al., 2018). Next, NGS based on the semi-conductor sequencing technology was performed with the Ion PGM™ Sequencing 400 Kit (Ion Torrent) in an Ion PGM™ System (Ion Torrent), according to the manufacturer's instructions.

2.5. Bioinformatic analysis of the NGS data

The Torrent Suite™ Software (Ion Torrent) was used for basecalling and alignment of raw data to the human genome (hg38). The generated FASTQ file was uploaded to the publicly available online GALAXY suite of software tools for NGS data analysis (https://usegalaxy.org/). TopHat2 was used to align the obtained sequencing reads and to identify novel splice sites with direct mapping to known transcripts (Kim et al., 2013). Several output files were hence created, including the “accepted hits” (BAM) file containing the reads aligned to the reference genome and the “splice junctions” (BED) file comprising the detected splice junctions. Both files were used for visualization, processing, and analysis of the NGS data. The Integrative Genomics Viewer (IGV) (Thorvaldsdottir et al., 2013) was used for interactive visual exploration of our results, uncovering alternative splicing events in read alignments.
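As an illustration of how a “splice junctions” BED file can be screened for unannotated junctions, the Python sketch below parses TopHat-style junction records, in which each junction is written as two BED blocks (the read overhangs on either side) and the score column holds the number of spanning alignments. It is a minimal sketch under stated assumptions, not the analysis performed in the study: the coordinates, read counts, and the `ANNOTATED` set are invented for demonstration, and the function names are hypothetical.

```python
def parse_junctions(bed_text: str):
    """Return (chrom, intron_start, intron_end, strand, reads) per junction.

    Coordinates are 0-based, half-open. The intron starts where the first
    block ends and finishes where the second block begins.
    """
    junctions = []
    for line in bed_text.splitlines():
        if not line.strip() or line.startswith(("track", "#")):
            continue
        fields = line.split("\t")
        chrom, chrom_start = fields[0], int(fields[1])
        reads, strand = int(fields[4]), fields[5]
        block_sizes = [int(x) for x in fields[10].rstrip(",").split(",")]
        block_starts = [int(x) for x in fields[11].rstrip(",").split(",")]
        intron_start = chrom_start + block_sizes[0]
        intron_end = chrom_start + block_starts[1]
        junctions.append((chrom, intron_start, intron_end, strand, reads))
    return junctions


def novel_junctions(junctions, annotated):
    """Keep junctions whose intron coordinates are absent from the annotation."""
    return [j for j in junctions if (j[0], j[1], j[2]) not in annotated]


# Invented example records (not real PRMT1 coordinates).
EXAMPLE_BED = "\n".join([
    "\t".join(["chr19", "1000", "1300", "JUNC00000001", "4200", "+",
               "1000", "1300", "255,0,0", "2", "50,60", "0,240"]),
    "\t".join(["chr19", "1000", "1900", "JUNC00000002", "12", "+",
               "1000", "1900", "255,0,0", "2", "45,55", "0,845"]),
])
ANNOTATED = {("chr19", 1050, 1240)}  # hypothetical known intron

all_junctions = parse_junctions(EXAMPLE_BED)
novel = novel_junctions(all_junctions, ANNOTATED)
print(novel)
```

In this toy input the second junction is reported as novel and is supported by far fewer reads than the annotated one, which mirrors the pattern described in Section 3.1.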

3.2. Discovery of a novel exon of the PRMT1 gene

Apart from the 19 novel splice junctions, a new exon, located between exons 11 and 12, was discovered, based on 808 sequencing reads of the FASTQ file. This exon of 90 nucleotides is located 203 nucleotides downstream of exon 11 and 1142 nucleotides upstream of exon 12, and has the following sequence: 5′-CAGGATTGCTTGAGCATTCACCGGAAGTTATGGGCCTGGGGACCCAGGCTGAAACCCTGCCCTCCTGGTGCTTCCACTCTAGACAAGGGG-3′. This novel exon is exclusively spliced with exons 11 and 12 of the PRMT1 gene (Fig. 2) and was validated using RT-PCR and electrophoresis on an agarose gel.
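The figures given for the novel exon can be cross-checked against each other and against the N/12R primer of Table 1. The short Python sketch below confirms that the quoted sequence is 90 nt long, derives the length of the intron between exons 11 and 12 when the novel exon is skipped (203 + 90 + 1142 nt, an inference from the stated distances), and measures how many bases at the 5′ end of the reverse-complemented N/12R primer match the 3′ end of the novel exon. The remaining primer bases presumably anneal to exon 12, as the table footnote implies, but the exon 12 sequence is not given in the text, so that part is not checked. Function names are hypothetical.

```python
NOVEL_EXON = (
    "CAGGATTGCTTGAGCATTCACCGGAAGTTATGGGCCTGGGGACCCAGGCTGAAACCCTGCC"
    "CTCCTGGTGCTTCCACTCTAGACAAGGGG"
)
N12R_PRIMER = "GGTCCCGCCCCTTGTCTA"  # reverse primer N/12R from Table 1

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def exon_end_overlap(exon: str, sense_primer: str) -> int:
    """Longest prefix of the sense-strand primer that matches the exon's 3' end."""
    for k in range(len(sense_primer), 0, -1):
        if exon.endswith(sense_primer[:k]):
            return k
    return 0


exon_length = len(NOVEL_EXON)
# Intron length when the novel exon is skipped, from the distances in the text.
skipped_intron_length = 203 + exon_length + 1142
primer_sense = reverse_complement(N12R_PRIMER)
overlap = exon_end_overlap(NOVEL_EXON, primer_sense)

print(exon_length, skipped_intron_length, primer_sense, overlap)
```

Under these inputs, 11 of the 18 primer bases pair with the end of the novel exon, which is consistent with a primer that straddles the junction between the novel exon and exon 12.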

2.6. Determination of the full exon structure of each novel transcript with nested PCR

In order to define the full exon structure of the novel PRMT1 transcripts comprising at least one of the novel splice junctions, we designed splice-junction–specific PCR primers using Primer-BLAST (Ye et al., 2012) (Table 1) and performed a first PCR using primers spanning the first and the last splice junctions of each putative cDNA of PRMT1. Next, we performed nested PCR. All reactions were carried out in a total volume of 25 μL containing MgCl2-free KAPA Taq Buffer C (Kapa Biosystems Inc.), 1.5 mM MgCl2, 0.2 mM dNTPs, 10 pmol of each primer, and 2 units of KAPA Taq DNA Polymerase (Kapa Biosystems Inc.), in a Veriti 96-Well Fast Thermal Cycler (Applied Biosystems), under the following cycling conditions: a denaturation step at 95 °C for 3 min, followed by 35 cycles of 95 °C for 30 s, 60 °C for 30 s, 72 °C for 1 min, and a final extension step at 72 °C for 1 min. PCR products were diluted 50-fold in nuclease-free water before being used as templates for nested PCR, under the same amplification conditions. Equal volumes of PCR products were electrophoresed on 2–3% NuSieve GTG Agarose (Cambrex Bio Science Rockland Inc., Rockland, ME, USA) gels and visualized under UV light by ethidium bromide staining.
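The cycling conditions and reagent amounts stated above can be written down as a small data structure, which makes derived quantities easy to compute. The Python sketch below is illustrative only: it encodes the program described in the text, sums the programmed hold times (ramping between temperatures is ignored, so real run time will be longer), and converts 10 pmol of primer in 25 μL to a molar concentration. Class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


# Program as described in the text.
INITIAL = Step("initial denaturation", 95, 180)
CYCLE = (
    Step("denaturation", 95, 30),
    Step("annealing", 60, 30),
    Step("extension", 72, 60),
)
FINAL = Step("final extension", 72, 60)
N_CYCLES = 35


def hold_time_seconds() -> int:
    """Sum of programmed hold times, excluding temperature ramping."""
    per_cycle = sum(step.seconds for step in CYCLE)
    return INITIAL.seconds + N_CYCLES * per_cycle + FINAL.seconds


def primer_micromolar(pmol: float, volume_ul: float) -> float:
    """pmol per microliter is numerically equal to micromolar."""
    return pmol / volume_ul


total = hold_time_seconds()
print(f"{total} s ({total / 60:.0f} min) of programmed holds per PCR round")
print(f"{primer_micromolar(10, 25)} uM of each primer")
```

Because the first and nested rounds use the same conditions, the programmed hold time doubles for a complete nested amplification.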

3.3. Identification of novel PRMT1 transcripts

Since the size of the cDNA sequences of all PRMT1 transcripts exceeds the maximum DNA fragment size (~400 bp) that can be sequenced with the NGS chemistry used here, we performed reverse transcription-polymerase chain reaction (RT-PCR) using variant-specific sets of primers (Table 1), designed to target the newly discovered splice junctions. For this purpose, 17 amplicons were generated in a first PCR, using primers spanning the first and the last splice junctions of each putative cDNA of PRMT1. Thus, each PCR reaction included a mix of amplified transcripts. Further discrimination of these transcripts was performed using nested PCR with distinct sets of internal primers (Table 1). The expected nested-PCR products were revealed after agarose electrophoresis (Fig. 3). Based on RT-PCR, a total of 58 novel alternative PRMT1 transcripts were identified and their sequences were deposited in GenBank® (accession numbers: KX421008-KX421010, KX421012-KX421021, KX421025-KX421027, KX421029-KX421038, KX421041, KX421042, KX421044-KX421050, KX421053, KX421056, KX421058-KX421062, KX421065, KX421070-KX421076, KX421078-KX421080, KX421082-KX421085, and KX421090). All newly discovered PRMT1 transcripts are 5′-partial, since their sequences start from the target region (translation start site) of the forward gene-specific primer that was used for the molecular cloning with nested 3′ RACE. Therefore, their 5′-UTR

3. Results

3.1. Identification of novel splice junctions in PRMT1 mRNAs

In brief, there are currently four annotated transcripts of the human


Fig. 5. Structure of the 24 novel PRMT1 splice variants that do not possess any open reading frame (ORF). Non-coding exons are depicted as white boxes and introns as lines. Numbers inside boxes and above lines indicate the length of each exon or intron in nucleotides. Question marks (?) represent an undetermined 5′ end of each transcript. For each new PRMT1 transcript, the splice variant number and the GenBank® accession number are provided.
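The criterion used in Section 3.3 to flag these 24 transcripts as NMD candidates, namely a premature termination codon lying more than roughly 50 nt upstream of the last exon–exon junction, can be expressed as a short Python sketch. This is a simplified illustration and not the authors' pipeline: the exon lengths and toy cDNA sequences are invented, the start codon position is supplied by the caller, and the 50 nt threshold is the approximate value quoted in the text.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_end(cdna: str, start: int):
    """Index just past the first in-frame stop codon, or None if absent."""
    for i in range(start, len(cdna) - 2, 3):
        if cdna[i:i + 3] in STOP_CODONS:
            return i + 3
    return None


def is_nmd_candidate(exon_lengths, stop_end, threshold: int = 50) -> bool:
    """True if the stop codon ends more than `threshold` nt upstream of the
    last exon-exon junction (transcript coordinates, 0-based)."""
    if stop_end is None or len(exon_lengths) < 2:
        return False
    last_junction = sum(exon_lengths[:-1])
    return last_junction - stop_end > threshold


# Invented toy transcripts: three exons of 30, 90 and 40 nt (160 nt in total).
EXONS = [30, 90, 40]
early_stop = "ATG" + "GCT" * 5 + "TGA" + "A" * 139   # stop ends at position 21
late_stop = "ATG" + "GCT" * 41 + "TAA" + "A" * 31    # stop ends at position 129

early_flag = is_nmd_candidate(EXONS, first_stop_end(early_stop, 0))
late_flag = is_nmd_candidate(EXONS, first_stop_end(late_stop, 0))
print(early_flag, late_flag)
```

In the first toy transcript the stop codon ends 99 nt upstream of the last junction and is flagged; in the second it lies in the last exon and is not.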

discovered biomarkers are well established and applied for routine use, the need for the discovery of new ones is quite urgent. The large number of splicing events detected in the present study, leading to a correspondingly large number of PRMT1 splice variants, is not surprising, since PRMTs exert various functional roles in the maintenance of cellular homeostasis. In our previous study regarding the human BCL2L12 gene (Adamopoulos et al., 2016), a member of the BCL2 family and therefore an apoptosis-related gene exerting its important function via multiple protein-protein interactions and thus determining cell fate, NGS technology revealed a significant number of novel alternative splice variants in human cancer cells. In recent years, it has become clear that methyl groups represent major controlling elements in protein functions and that arginine residues are among the most common targets for methylation in mammalian cells (Yoshimatsu et al., 2011). Although PRMT1 is known to methylate proteins such as histone H4 and hnRNPs, its physiological role in the cell has not been thoroughly elucidated yet. Modifications in the expression pattern of certain genes are strongly associated with cancer incidence. A number of studies have provided evidence linking arginine methylation to anti-proliferative signaling and/or a protective role against cancer. However, these observations may not be applicable to all PRMTs and all cancer types. Alternative splicing can lead to the production of PRMT1 isoforms with a wide array of different biochemical and regulatory properties (Goulet et al., 2007). The different structures of the predicted PRMT1 protein isoforms presented in this study suggest distinct properties. Last but not least, the major limitation of the current study is that it is restricted to the experimental validation of PRMT1 transcripts (putative mRNA or noncoding RNA molecules).
Validation of the existence of the predicted PRMT1 protein isoforms in a functional study would undoubtedly strengthen the results of our work. Moreover, the potential role of the predicted PRMT1 protein isoforms could be verified experimentally, to examine their involvement in the methylation process or in other functions of these proteins. Supplementary data to this article can be found online at https://doi.org/10.1016/j.gene.2019.02.072.

remains unknown and merits further investigation. In addition, we searched for an ORF in the sequence of each new transcript and found that 34 out of 58 novel PRMT1 splice variants (those with accession numbers: KX421008-KX421010, KX421012-KX421021, KX421025-KX421027, KX421029-KX421038, KX421041, KX421042, KX421044-KX421049) have such an ORF; thus, these transcripts are predicted to encode new PRMT1 protein isoforms (Fig. 4). On the other hand, the remaining 24 transcripts (those with accession numbers: KX421050, KX421053, KX421056, KX421058-KX421062, KX421065, KX421070-KX421076, KX421078-KX421080, KX421082-KX421085, and KX421090) are candidates for nonsense-mediated mRNA decay (NMD), as they are all predicted to contain at least one PTC located more than approximately 50 nt upstream of the last exon–exon junction (Zhang et al., 1998a, b; Thermann et al., 1998) (Fig. 5).

4. Discussion

The newly introduced NGS technology can be used to deeply survey alternative splicing complexity in the human transcriptome and to identify novel transcripts resulting from alternative splicing, even those with very low expression levels (Pan et al., 2008). This ability of NGS is of high significance, because the identification of novel transcripts could provide functional or clinical information for many diseases, including cancer. Splice variants that are expressed solely in cancerous tissues or aberrantly expressed in several human malignancies could constitute major mediators of carcinogenesis or cancer progression. In fact, alternative splicing has a critical impact on many fundamental aspects of cell biology, including cell cycle control and apoptosis (Kim et al., 2008).
Although in most cases the relationship between a specific splicing event and the etiology of cancer is not yet fully understood, novel alternatively spliced transcripts could serve as new molecular biomarkers for diagnostic or prognostic purposes, and/or as potential targets for therapeutic strategies (Finn et al., 1994). Splicing patterns of several genes have been reported to be altered in cancers, including those encoding the prolactin receptor, RON receptor tyrosine kinase, the small signaling G protein RAC1, fibronectin, fibroblast growth factor receptors (FGFRs), CD44, and the MDM2 proto-oncogene. For instance, particular protein isoforms of RON and RAC1 can accumulate in tumors, and over-expression of the tumor-associated isoforms is sufficient to transform cells in culture (Srebrow and Kornblihtt, 2006).

In the present work, the application of NGS technology led to the discovery of 19 previously unknown splice junctions of the human PRMT1 gene, as well as one novel exon of this gene. Although expression of PRMT1 has been detected in various human tissues, the relative prevalence of alternatively spliced transcripts of the gene differs between normal and cancerous breast tissues. Although PRMT1 was not found to be regulated by steroid hormones in human breast cancer cells, the results indicated that two splice variants of this gene are downregulated in breast cancer (Scorilas et al., 2000). Another study investigating the expression pattern of PRMT1 in colon cancer patients showed that only PRMT1 splice variants v.1 and v.2 were often expressed, while PRMT1 v.2 overexpression was associated with nodal status and histological grade of the tumor (Mathioudaki et al., 2008), suggesting that PRMT1 v.2 could serve as a biomarker of unfavorable prognosis in colon cancer patients.
Undoubtedly, further investigation of the novel PRMT1 transcripts presented in this study is needed, in order for their potential biomarker properties to be clarified. Altered splicing patterns can serve as markers of the altered cellular state associated with disease, even when they are not involved in the key pathway(s) of the disease pathobiology. Since only a few of the

Disclosure

The authors have no conflicts of interest to declare.

References

Adamopoulos, P.G., Kontos, C.K., Tsiakanikas, P., Scorilas, A., 2016. Identification of novel alternative splice variants of the BCL2L12 gene in human cancer cells using next-generation sequencing methodology. Cancer Lett. 373, 119–129.
Adamopoulos, P.G., Kontos, C.K., Scorilas, A., 2017. Molecular cloning of novel transcripts of human kallikrein-related peptidases 5, 6, 7, 8 and 9 (KLK5 - KLK9), using next-generation sequencing. Sci. Rep. 7, 17299.
Adamopoulos, P.G., Kontos, C.K., Diamantopoulos, M.A., Scorilas, A., 2018. Molecular cloning of novel transcripts of the adaptor-related protein complex 2 alpha 1 subunit (AP2A1) gene, using next-generation sequencing. Gene 678, 55–64.
Alexopoulou, D.K., Kontos, C.K., Christodoulou, S., Papadopoulos, I.N., Scorilas, A., 2014. KLK11 mRNA expression predicts poor disease-free and overall survival in colorectal adenocarcinoma patients. Biomark. Med. 8, 671–685.
An, W., Kim, J., Roeder, R.G., 2004. Ordered cooperative functions of PRMT1, p300, and CARM1 in transcriptional activation by p53. Cell 117, 735–748.
Bachand, F., Silver, P.A., 2004. PRMT3 is a ribosomal protein methyltransferase that affects the cellular levels of ribosomal subunits. EMBO J. 23, 2641–2650.
Barrett, L.W., Fletcher, S., Wilton, S.D., 2012. Regulation of eukaryotic gene expression by the untranslated gene regions and other non-coding elements. Cell. Mol. Life Sci. 69, 3613–3634.
Berthet, C., Guehenneux, F., Revol, V., Samarut, C., Lukaszewicz, A., Dehay, C., Dumontet, C., Magaud, J.P., Rouault, J.P., 2002. Interaction of PRMT1 with BTG/TOB proteins in cell signalling: molecular analysis and functional aspects. Genes Cells 7, 29–39.
Blair, C.A., Zi, X., 2011. Potential molecular targeting of splice variants for cancer treatment. Indian J. Exp. Biol. 49, 836–839.
Blencowe, B.J., 2006. Alternative splicing: new insights from global analyses. Cell 126, 37–47.
Boisvert, F.M., Cote, J., Boulanger, M.C., Cleroux, P., Bachand, F., Autexier, C., Richard, S., 2002. Symmetrical dimethylarginine methylation is required for the localization of SMN in Cajal bodies and pre-mRNA splicing. J. Cell Biol. 159, 957–969.
Bryant Jr., D.W., Priest, H.D., Mockler, T.C., 2012. Detection and quantification of alternative splicing variants using RNA-seq. Methods Mol. Biol. 883, 97–110.
Caceres, J.F., Kornblihtt, A.R., 2002. Alternative splicing: multiple control mechanisms and involvement in human disease. Trends Genet. 18, 186–193.
Carninci, P., Sandelin, A., Lenhard, B., Katayama, S., Shimokawa, K., Ponjavic, J., Semple, C.A., Taylor, M.S., Engstrom, P.G., Frith, M.C., Forrest, A.R., Alkema, W.B., Tan, S.L., Plessy, C., Kodzius, R., Ravasi, T., Kasukawa, T., Fukuda, S., Kanamori-Katayama, M., Kitazume, Y., Kawaji, H., Kai, C., Nakamura, M., Konno, H., Nakano, K., Mottagui-Tabar, S., Arner, P., Chesi, A., Gustincich, S., Persichetti, F., Suzuki, H., Grimmond, S.M., Wells, C.A., Orlando, V., Wahlestedt, C., Liu, E.T., Harbers, M., Kawai, J., Bajic, V.B., Hume, D.A., Hayashizaki, Y., 2006. Genome-wide analysis of mammalian promoter architecture and evolution. Nat. Genet. 38, 626–635.
Cartegni, L., Chew, S.L., Krainer, A.R., 2002. Listening to silence and understanding nonsense: exonic mutations that affect splicing. Nat. Rev. Genet. 3, 285–298.
Castle, J.C., Zhang, C., Shah, J.K., Kulkarni, A.V., Kalsotra, A., Cooper, T.A., Johnson, J.M., 2008. Expression of 24,426 human alternative splicing events and predicted cis regulation in 48 tissues and cell lines. Nat. Genet. 40, 1416–1425.
Christodoulou, S., Alexopoulou, D.K., Kontos, C.K., Scorilas, A., Papadopoulos, I.N., 2014. Kallikrein-related peptidase-6 (KLK6) mRNA expression is an independent prognostic tissue biomarker of poor disease-free and overall survival in colorectal adenocarcinoma. Tumour Biol. 35, 4673–4685.
David, C.J., Manley, J.L., 2010. Alternative pre-mRNA splicing regulation in cancer: pathways and programs unhinged. Genes Dev. 24, 2343–2364.
Fackelmayer, F.O., 2005. Protein arginine methyltransferases: guardians of the Arg? Trends Biochem. Sci. 30, 666–671.
Fehlbaum, P., Guihal, C., Bracco, L., Cochet, O., 2005. A microarray configuration to quantify expression levels and relative abundance of splice variants. Nucleic Acids Res. 33, e47.
Finn, L., Dougherty, G., Finley, G., Meisler, A., Becich, M., Cooper, D.L., 1994. Alternative splicing of CD44 pre-mRNA in human colorectal tumors. Biochem. Biophys. Res. Commun. 200, 1015–1022.
Garcia-Blanco, M.A., Baraniak, A.P., Lasda, E.L., 2004. Alternative splicing in disease and therapy. Nat. Biotechnol. 22, 535–546.
Goulet, I., Gauvin, G., Boisvenue, S., Cote, J., 2007. Alternative splicing yields protein arginine methyltransferase 1 isoforms with distinct activity, substrate specificity, and subcellular localization. J. Biol. Chem. 282, 33009–33021.
Graveley, B.R., 2001. Alternative splicing: increasing diversity in the proteomic world. Trends Genet. 17, 100–107.
Herrmann, F., Pably, P., Eckerich, C., Bedford, M.T., Fackelmayer, F.O., 2009. Human protein arginine methyltransferases in vivo–distinct properties of eight canonical members of the PRMT family. J. Cell Sci. 122, 667–677.
Johnson, J.M., Castle, J., Garrett-Engele, P., Kan, Z., Loerch, P.M., Armour, C.D., Santos, R., Schadt, E.E., Stoughton, R., Shoemaker, D.D., 2003. Genome-wide survey of human alternative pre-mRNA splicing with exon junction microarrays. Science 302, 2141–2144.
Kim, E., Goren, A., Ast, G., 2008. Insights into the connection between cancer and alternative splicing. Trends Genet. 24, 7–10.
Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., Salzberg, S.L., 2013. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 14, R36.
Kriventseva, E.V., Koch, I., Apweiler, R., Vingron, M., Bork, P., Gelfand, M.S., Sunyaev, S., 2003. Increase of functional diversity by alternative splicing. Trends Genet. 19, 124–128.
Lagarde, J., Uszczynska-Ratajczak, B., Santoyo-Lopez, J., Gonzalez, J.M., Tapanari, E., Mudge, J.M., Steward, C.A., Wilming, L., Tanzer, A., Howald, C., Chrast, J., Vela-Boza, A., Rueda, A., Lopez-Domingo, F.J., Dopazo, J., Reymond, A., Guigo, R., Harrow, J., 2016. Extension of human lncRNA transcripts by RACE coupled with long-read high-throughput sequencing (RACE-Seq). Nat. Commun. 7, 12339.
Lake, A.N., Bedford, M.T., 2007. Protein methylation and DNA repair. Mutat. Res. 618, 91–101.
Leenen, F.A., Vernocchi, S., Hunewald, O.E., Schmitz, S., Molitor, A.M., Muller, C.P., Turner, J.D., 2016. Where does transcription start? 5'-RACE adapted to next-generation sequencing. Nucleic Acids Res. 44, 2628–2645.
Mangul, S., Caciula, A., Al Seesi, S., Brinza, D., Mandoiu, I., Zelikovsky, A., 2014. Transcriptome assembly and quantification from ion torrent RNA-Seq data. BMC Genomics 15 (Suppl. 5), S7.
Mathioudaki, K., Papadokostopoulou, A., Scorilas, A., Xynopoulos, D., Agnanti, N., Talieri, M., 2008. The PRMT1 gene expression pattern in colon cancer. Br. J. Cancer 99, 2094–2099.
Matlin, A.J., Clark, F., Smith, C.W., 2005. Understanding alternative splicing: towards a cellular code. Nat. Rev. Mol. Cell Biol. 6, 386–398.
Modrek, B., Lee, C., 2002. A genomic view of alternative splicing. Nat. Genet. 30, 13–19.
Morozova, O., Marra, M.A., 2008. Applications of next-generation sequencing technologies in functional genomics. Genomics 92, 255–264.
Nashed, M.G., Linher-Melville, K., Frey, B.N., Singh, G., 2016. RNA-sequencing profiles hippocampal gene expression in a validated model of cancer-induced depression. Genes Brain Behav. 15, 711–721.
Pajares, M.J., Ezponda, T., Catena, R., Calvo, A., Pio, R., Montuenga, L.M., 2007. Alternative splicing: an emerging topic in molecular and clinical oncology. Lancet Oncol. 8, 349–357.
Pan, Q., Shai, O., Misquitta, C., Zhang, W., Saltzman, A.L., Mohammad, N., Babak, T., Siu, H., Hughes, T.R., Morris, Q.D., Frey, B.J., Blencowe, B.J., 2004. Revealing global regulatory features of mammalian alternative splicing using a quantitative microarray platform. Mol. Cell 16, 929–941.
Pan, Q., Shai, O., Lee, L.J., Frey, B.J., Blencowe, B.J., 2008. Deep surveying of alternative splicing complexity in the human transcriptome by high-throughput sequencing. Nat. Genet. 40, 1413–1415.
Papadokostopoulou, A., Mathioudaki, K., Scorilas, A., Xynopoulos, D., Ardavanis, A., Kouroumalis, E., Talieri, M., 2009. Colon cancer and protein arginine methyltransferase 1 gene expression. Anticancer Res. 29, 1361–1366.
Park, J.W., Tokheim, C., Shen, S., Xing, Y., 2013. Identifying differential alternative splicing events from RNA sequencing data using RNASeq-MATS. Methods Mol. Biol. 1038, 171–179.
Pichon, X., Wilson, L.A., Stoneley, M., Bastide, A., King, H.A., Somers, J., Willis, A.E., 2012. RNA binding protein/RNA element interactions and the control of translation. Curr. Protein Pept. Sci. 13, 294–304.
Sammeth, M., Foissac, S., Guigo, R., 2008. A general definition and nomenclature for alternative splicing events. PLoS Comput. Biol. 4, e1000147.
Sanchez, S.E., Petrillo, E., Kornblihtt, A.R., Yanovsky, M.J., 2011. Alternative splicing at the right time. RNA Biol. 8, 954–959.
Sanchez-Pla, A., Reverter, F., Ruiz de Villa, M.C., Comabella, M., 2012. Transcriptomics: mRNA and alternative splicing. J. Neuroimmunol. 248, 23–31.
Scorilas, A., Black, M.H., Talieri, M., Diamandis, E.P., 2000. Genomic organization, physical mapping, and expression analysis of the human protein arginine methyltransferase 1 gene. Biochem. Biophys. Res. Commun. 278, 349–359.
Skotheim, R.I., Nees, M., 2007. Alternative splicing in cancer: noise, functional, or systematic? Int. J. Biochem. Cell Biol. 39, 1432–1449.
Smith, C.W., Valcarcel, J., 2000. Alternative pre-mRNA splicing: the logic of combinatorial control. Trends Biochem. Sci. 25, 381–388.
Srebrow, A., Kornblihtt, A.R., 2006. The connection between splicing and cancer. J. Cell Sci. 119, 2635–2641.
Stamm, S., Ben-Ari, S., Rafalska, I., Tang, Y., Zhang, Z., Toiber, D., Thanaraj, T.A., Soreq, H., 2005. Function of alternative splicing. Gene 344, 1–20.
Sultan, M., Schulz, M.H., Richard, H., Magen, A., Klingenhoff, A., Scherf, M., Seifert, M., Borodina, T., Soldatov, A., Parkhomchuk, D., Schmidt, D., O'Keeffe, S., Haas, S., Vingron, M., Lehrach, H., Yaspo, M.L., 2008. A global view of gene activity and alternative splicing by deep sequencing of the human transcriptome. Science 321, 956–960.
Sun, W., Li, D., Su, R., Musa, H.H., Chen, L., Zhou, H., 2014. Construction, characterization and expression of full length cDNA clone of sheep YAP1 gene. Mol. Biol. Rep. 41, 947–956.
Tazi, J., Bakkour, N., Stamm, S., 2009. Alternative splicing and disease. Biochim. Biophys. Acta 1792, 14–26.
Thermann, R., Neu-Yilik, G., Deters, A., Frede, U., Wehr, K., Hagemeier, C., Hentze, M.W., Kulozik, A.E., 1998. Binary specification of nonsense codons by splicing and cytoplasmic translation. EMBO J. 17, 3484–3494.
Thorvaldsdottir, H., Robinson, J.T., Mesirov, J.P., 2013. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief. Bioinform. 14, 178–192.
Wang, E.T., Sandberg, R., Luo, S., Khrebtukova, I., Zhang, L., Mayr, C., Kingsmore, S.F., Schroth, G.P., Burge, C.B., 2008. Alternative isoform regulation in human tissue transcriptomes. Nature 456, 470–476.
Ward, A.J., Cooper, T.A., 2010. The pathobiology of splicing. J. Pathol. 220, 152–163.
Wolf, S.S., 2009. The protein arginine methyltransferase family: an update about function, new perspectives and the physiological role in humans. Cell. Mol. Life Sci. 66, 2109–2121.
Ye, J., Coulouris, G., Zaretskaya, I., Cutcutache, I., Rozen, S., Madden, T.L., 2012. Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction. BMC Bioinformatics 13, 134.
Yeakley, J.M., Fan, J.B., Doucet, D., Luo, L., Wickham, E., Ye, Z., Chee, M.S., Fu, X.D., 2002. Profiling alternative splicing on fiber-optic arrays. Nat. Biotechnol. 20, 353–358.
Yoshimatsu, M., Toyokawa, G., Hayami, S., Unoki, M., Tsunoda, T., Field, H.I., Kelly, J.D., Neal, D.E., Maehara, Y., Ponder, B.A., Nakamura, Y., Hamamoto, R., 2011. Dysregulation of PRMT1 and PRMT6, type I arginine methyltransferases, is involved in various types of human cancers. Int. J. Cancer 128, 562–573.
Zhang, J., Sun, X., Qian, Y., LaDuca, J.P., Maquat, L.E., 1998a. At least one intron is required for the nonsense-mediated decay of triosephosphate isomerase mRNA: a possible link between nuclear splicing and cytoplasmic translation. Mol. Cell. Biol. 18, 5272–5283.
Zhang, J., Sun, X., Qian, Y., Maquat, L.E., 1998b. Intron function in the nonsense-mediated decay of beta-globin mRNA: indications that pre-mRNA splicing in the nucleus can influence mRNA translation in the cytoplasm. RNA 4, 801–815.
Zobel-Thropp, P., Gary, J.D., Clarke, S., 1998. delta-N-methylarginine is a novel posttranslational modification of arginine residues in yeast proteins. J. Biol. Chem. 273, 29283–29286.
