A novel exon generates ubiquitously expressed alternatively spliced new transcript of mouse Abcc4 gene
Sayeed Ur Rehman, Hassan Mubarak Ishqi, Mohammed Amir Husain, Tarique Sarwar, Mohammad Tabish
PII: S0378-1119(16)30704-1
DOI: doi:10.1016/j.gene.2016.08.058
Reference: GENE 41562
To appear in: Gene
Received date: 19 April 2016
Revised date: 30 July 2016
Accepted date: 31 August 2016
Please cite this article as: Rehman, Sayeed Ur, Ishqi, Hassan Mubarak, Husain, Mohammed Amir, Sarwar, Tarique, Tabish, Mohammad, A novel exon generates ubiquitously expressed alternatively spliced new transcript of mouse Abcc4 gene, Gene (2016), doi:10.1016/j.gene.2016.08.058
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT A novel exon generates ubiquitously expressed alternatively spliced new transcript of mouse Abcc4 gene
Sayeed Ur Rehman, Hassan Mubarak Ishqi, Mohammed Amir Husain, Tarique Sarwar and Mohammad Tabish*
Department of Biochemistry, Faculty of Life Sciences, A.M. University, Aligarh, U.P. 202002, India
*Corresponding author: Department of Biochemistry
Faculty of Life Sciences
A.M. University, Aligarh U.P. 202002, India
Email: [email protected]; Tel: +91-9634780818
Running Title: A novel transcript isoform of mouse Abcc4 gene
No. of Figures: 6
No. of Tables: 3
Conflict of Interest: The authors declare that there is no conflict of interest in this work.
Abstract
The Abcc4 gene codes for a protein (ABCC4) involved in the transport of different classes of drugs out of cells. Important substrates of ABCC4 include antiviral and anticancer drugs as well as endogenous molecules such as bile acids, cyclic nucleotides, folates, prostaglandins and steroids. Alternative splicing generates multiple mRNAs that encode protein isoforms with diverse functions. In this study, we identified a novel transcript of the mouse Abcc4 gene using a combination of bioinformatics and molecular biology techniques. This transcript differs from the reported transcript in having a different first exon, which is located within the previously identified first intron. The newly identified transcript was expressed in all the tissues and developmental stages we studied. Expression levels of the novel and reported transcripts were studied using quantitative real-time PCR. After conceptual translation of the novel transcript, various post-translational modifications were studied. Translation efficiency and predicted half-life of the encoded protein isoforms were analysed in silico. Molecular modelling was performed to compare the structural differences between the two isoforms. The diversity at the N-termini of these protein isoforms explains the diverse functions of ABCC4 in mouse.
Keywords: Alternative splicing; mouse; Abcc4 gene; bioinformatics; molecular biology
Introduction
Multidrug resistance protein 4 (MRP4/ABCC4) is an ATP-binding cassette (ABC) transporter belonging to subfamily C of the ABC transporter superfamily. MRP4 is encoded by the Abcc4 gene and is involved in the transport of diverse sets of drugs out of the cell (Zhao et al., 2014). Anticancer drugs transported by ABCC4 include 6-mercaptopurine, 6-thioguanine, camptothecins and others (Lin et al., 2013; Tian et al., 2005; Leggas et al., 2004). ABCC4 is reported to be overexpressed in retinoblastomas, neuroblastomas, gliomas, melanomas, colon cancer and colorectal cancer cells (Hendig et al., 2009; Holla et al., 2008). Moreover, ABCC4 expression is associated with drug resistance in leukaemia and ovarian cancer cells (Beretta et al., 2010). Recently, a study showed that downregulation of ABCC4 increased apoptosis in drug-resistant human gastric cancer cells, thereby restoring their sensitivity to 5-FU (Zhang et al., 2015), while another study showed a protective effect of ABCC4 against cytarabine-mediated damage in leukemic and host myeloid cells (Drenberg et al., 2016). ABCC4 is also reported to play an important role in cAMP homeostasis and related pathways (Belleville-Rolland et al., 2016). Such diverse roles make ABCC4 an interesting target for research.
ABC transporters can be classified by the presence of three sequence motifs located in their cytoplasmic ATP-binding domains, also called nucleotide-binding domains (NBDs): the Walker A motif, the Walker B motif and the ABC signature motif (Haimeur et al., 2004). The core functional structure of most ABC transporters includes two NBDs and two sets of membrane-spanning domains (MSDs), each typically containing six membrane-spanning alpha helices (Haimeur et al., 2004).
Alternative splicing plays an important role in creating diversity and regulating cellular functions (Kelemen et al., 2013; Nilsen and Graveley, 2010; Stamm et al., 2005). Recently, alternative splicing was reported to play an important role in creating functional diversity in ABCC1 (MRP1)
of sea urchin (Gökirmak et al., 2016). For ABCC4, no study in recent years has reported alternatively spliced isoforms. In this study, we analysed alternative splicing of the mouse Abcc4 gene because of its important and diverse roles inside the cell. The mouse Abcc4 gene is located on chromosome 14 and contains 31 exons that code for a protein of 1325 amino acids. According to the Consensus CDS (CCDS) project, two CCDS entries are reported, CCDS27335.1 and CCDS49565.1, which differ in the exclusion of exon 4 from one of them. To study alternative splicing of the Abcc4 gene, we followed a novel methodology that combines various bioinformatics tools and molecular biology techniques to predict and then confirm the existence of novel isoforms, as described earlier (Banday et al., 2012). Using this methodology, we predicted and confirmed one exon, located in the intronic region between exon 1 and exon 2 of the Abcc4 gene, that could participate in alternative splicing with other internal exons. With the help of various molecular biology techniques, we confirmed the expression of one new transcript in different tissues of mouse. Further, several post-translational modifications of the novel isoform were predicted using various bioinformatics tools.
Materials and Methods
Computational analysis and prediction of novel exons
Our novel methodology involves the extensive use of various databases and gene/exon finding tools. The mouse Abcc4 gene was studied to predict exons that could participate in alternative splicing with existing exons, leading to the generation of novel isoforms. Genomic DNA, cDNA and protein sequences of Abcc4 were downloaded from Mouse Genome Informatics (MGI). Published exons and introns were first located on the genomic sequence. The upstream region of the gene was also analysed for putative novel exons using gene and exon finding tools such as GENSCAN (Burge and Karlin, 1997), FGENESH (Solovyev et al., 2006) and FEX (Solovyev et al., 1994). These tools identified the published exons with high scores, which served as a positive control. Several false positives arising in the prediction process were filtered out using knowledge of comparative genomics and alternative splicing. Redundant data were removed carefully by manual curation. The new exons identified were then subjected to TestCode analysis available in the Sequence Manipulation Suite (SMS). TestCode assesses the protein-coding nature of DNA sequences and is based on a statistical correlation of nucleotide sequences with being coding or non-coding (Fickett, 1982). Based on the results of the TestCode analysis, exons were chosen for confirmation by wet lab experiments involving RT-PCR and sequencing. The online tools used in our study were GENSCAN (http://genes.mit.edu/GENSCAN.html), FGENESH (http://linux1.softberry.com/berry.phtml?topic=fgenesh&group=programs&subgroup=gfind), FEX (http://linux1.softberry.com/berry.phtml?topic=fex&group=programs&subgroup=gfind), ClustalOmega (http://www.ebi.ac.uk/Tools/msa/clustalo/), BLAST (http://blast.ncbi.nlm.nih.gov/Blast.cgi), Sequence Manipulation Suite (http://www.bioinformatics.org/sms/), ExPASy translate tools (http://web.expasy.org/translate), ExPASy (http://www.expasy.org/proteomics), TermiNator (http://www.isv.cnrs-gif.fr/terminator3/index.html) and TFBIND (http://tfbind.hgc.jp/). Exon-specific primers were designed and synthesized (Sigma Aldrich, Bangalore, India) and are listed below.
| Primer | Sequence (5'-3') | Direction | Primer location |
|---|---|---|---|
| FC | CTG GTC ATA AGC GGA GAC TGG AA | Forward | Exon E2 |
| FE1A | GTG TTT TCA CCA ACG TCC AGG AAG | Forward | Exon E1A |
| FE1B | ATG CTG CCG AGT GAG GTG GTG AA | Forward | Exon E1B |
| R1 | GCA GAG GCA GAA GAA TAA CCA GAA C | Reverse | Exon E6 |
| R2 | CTG CCC ACA GAA AGT GCA AGA AG | Reverse | Exon E6 |
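As a quick sanity check on the primers listed above, the following sketch computes the length and GC content of each sequence. The sequences are copied from the primer list with the spaces removed; the helper names `gc_fraction` and `summarise` are illustrative assumptions and are not part of the original protocol.

```python
# Primer sequences from the primer list above (spaces removed).
PRIMERS = {
    "FC": "CTGGTCATAAGCGGAGACTGGAA",
    "FE1A": "GTGTTTTCACCAACGTCCAGGAAG",
    "FE1B": "ATGCTGCCGAGTGAGGTGGTGAA",
    "R1": "GCAGAGGCAGAAGAATAACCAGAAC",
    "R2": "CTGCCCACAGAAAGTGCAAGAAG",
}


def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarise(primers: dict) -> dict:
    """Map each primer name to (length in nt, GC content in percent)."""
    return {
        name: (len(seq), round(100 * gc_fraction(seq), 1))
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc) in summarise(PRIMERS).items():
        print(f"{name}: {length} nt, {gc}% GC")
```

Length and GC content are only rough indicators of primer behaviour; they do not replace a proper melting temperature or specificity check.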
Preparation of RNA from different tissues of mouse
Total cellular RNA was isolated from different mouse tissues using an RNA extraction kit (Intron Biotechnology, Gyeonggi-do, Korea) according to the manufacturer’s instructions. Isolated RNA was dissolved in diethyl pyrocarbonate treated water and quantitated spectrophotometrically. Integrity of the RNA was confirmed by denaturing agarose gel electrophoresis. The RNA was either used immediately or stored at -80 °C. Animal experimentation was permitted by the Ministry of Environment and Forests, Government of India, under registration no. 714/02/a/CPCSEA, issued by the Committee for the Purpose of Control and Supervision of Experiments on Animals (CPCSEA) and dated 25th October, 2012, and was approved by the Institutional Animal Ethics Committee (IAEC) of the Department of Biochemistry, Faculty of Life Sciences, Aligarh Muslim University, Aligarh, India (Order no: D.No. 4165).
Single strand cDNA synthesis
Splicing patterns of newly predicted exons with internal exons were characterized by reverse transcription of Abcc4 mRNA from mouse tissues (brain, heart, kidney and liver) into cDNA, followed by PCR amplification, cloning and sequencing. Tissue-specific total RNA (2 μg) was reverse transcribed into cDNA using an oligo(dT)18 primer in a total volume of 20 µL with RevertAid™ H Minus Reverse Transcriptase (Fermentas Life Sciences, USA) at 42°C for 1 h.
Touchdown PCR
To amplify the single-stranded cDNA prepared from different tissues of mouse, touchdown PCR was performed using PCR master mix (Fermentas Life Sciences, USA) in a total volume of 50 μL. Each PCR tube contained a fixed amount of single-stranded cDNA as template, a common reverse primer (R1) and one of the forward primers. PCR amplification was performed under the following conditions: denaturation for 30 s at 93°C; annealing for 30 s at 70°C with a decrease of 0.2°C per cycle; and extension for 40 s at 72°C. This was repeated for 35 cycles, followed by a final extension at 72°C for 8 min.
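The touchdown annealing schedule described above can be written out cycle by cycle. The sketch below assumes that the first cycle anneals at 70°C and that the 0.2°C decrement is applied from the second cycle onward; the text does not state this explicitly, and the function name is hypothetical.

```python
def touchdown_annealing_temps(start_c: float = 70.0,
                              step_c: float = 0.2,
                              cycles: int = 35) -> list:
    """Annealing temperature (in degrees C) for each cycle.

    Assumption: cycle 1 runs at start_c and every later cycle is
    step_c cooler than the one before it.
    """
    return [round(start_c - step_c * i, 1) for i in range(cycles)]


if __name__ == "__main__":
    temps = touchdown_annealing_temps()
    print(f"first cycle: {temps[0]} C, last cycle: {temps[-1]} C")
```

Under this reading, the annealing temperature falls by 6.8°C in total over the 35 cycles.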
Semi-nested PCR
To further verify that the PCR product was genuine, 1 μL of the product obtained in touchdown PCR was used as a template for further amplification by PCR for 30 cycles using the same upstream forward primer but a different common reverse primer (R2), which is located internal to R1. The resulting semi-nested PCR product (8 μL) was then subjected to electrophoresis on a 1.2 % (w/v) agarose gel, stained with ethidium bromide and photographed on a UV transilluminator.
Sub cloning and sequencing
The identity of PCR products obtained after semi-nested PCR was confirmed by DNA sequencing. The PCR-amplified products were electrophoresed on 1.2 % (w/v) agarose gels. The anticipated ethidium bromide stained bands were cut out from the gels and DNA was eluted using a Qiaquick PCR gel purification kit (Qiagen, Santa Clarita, CA). The purified DNA was subcloned into TOPO vector (Invitrogen, Carlsbad, CA) and transformed into competent E. coli JM109 cells. Transformed colonies were grown overnight at 37°C for miniprep plasmid DNA purification. The purified plasmid containing the insert was sequenced on an automated sequencer using either M13 forward or reverse primers.
Quantitative real time PCR (qRT-PCR)
qRT-PCR was used to study the expression of control and novel transcript variants of the Abcc4 gene (Ishqi et al., 2016). Total RNA from mouse kidney was isolated as described above. After determining the RNA concentration using a NanoDrop UV spectrophotometer (Thermo Scientific, USA), 2.5 μg of RNA was converted to cDNA as described above. Different preparations of cDNA samples were quantified in duplicate for the Abcc4 gene, with the actin gene as an internal control. All experiments were performed according to the protocol described in the manual (Thermo Scientific, USA) using an Eppendorf realplex4 Mastercycler. In brief, each 20 µl PCR reaction mixture included primers (300 nM each), 1 µl of 1:5 diluted template cDNA and 10 µl of 2X PCR master mix (SYBR Green). Initial denaturation was performed for 7 min at 95°C, followed by 40 rounds of 15 s at 95°C and 60 s at 60°C. Formation of multiple products by a set of primers was also ruled out. Real-time detection was carried out at the annealing stage. A common reverse primer (A4R) was used to amplify the transcripts along with an isoform-specific forward primer. The primers were designed from the unique regions of the two isoforms. The following primer sequences were used for amplification of the transcripts:
| Primer | Sequence (5'-3') | Direction | Primer location |
|---|---|---|---|
| A4CF | GTG TTC TTC TGG TGG CTC AAC | Forward | Junction of exon 1 and exon 2 of control transcript |
| A4NF | GAG AGG AAC GGT GGC TCA AC | Forward | Junction of exon 1 and exon 2 of novel transcript |
| A4R | CTT CCT CGA GTC CTT CTT GG | Reverse | 3rd exon, common to both transcripts |
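The text does not state how threshold cycle (Ct) values were converted into relative expression. A common choice is the comparative Ct method, sketched below on the assumption of roughly 100% amplification efficiency and with actin as the reference gene. The function names and the Ct values in the usage example are hypothetical and are not taken from the study.

```python
import math


def relative_expression(ct_target: float, ct_ref_target: float,
                        ct_control: float, ct_ref_control: float) -> float:
    """Expression of a target transcript relative to a control transcript.

    Each Ct is first normalised to the reference gene (for example actin)
    measured in the same sample, then the comparative Ct formula
    2 ** -(delta delta Ct) is applied. Assumes ~100% PCR efficiency.
    """
    delta_target = ct_target - ct_ref_target
    delta_control = ct_control - ct_ref_control
    return 2.0 ** -(delta_target - delta_control)


def delta_delta_ct_for_ratio(ratio: float) -> float:
    """Delta delta Ct that corresponds to a given expression ratio."""
    return -math.log2(ratio)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold one cycle later.
    print(relative_expression(25.0, 18.0, 24.0, 18.0))
    # A ratio of 0.36 corresponds to a shift of about 1.47 cycles.
    print(round(delta_delta_ct_for_ratio(0.36), 2))
```

Under these assumptions, an expression level of 36% relative to the control transcript, as reported for Abcc4-N1 in the Results, corresponds to a difference of roughly 1.5 cycles after normalisation.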
In silico analysis for promoter regions and post translational modifications
Sequences upstream of the existing first exon and the novel first exon were analysed and compared for possible promoter regions using tools such as TFSEARCH and TFBIND (Tsunoda and Takagi, 1999). The putative transcription start site was predicted using the Promoter 2.0 prediction server (Knudsen, 1999). The sequences of the novel transcript isoforms were conceptually translated using the ExPASy Translate tool. Protein sequences were then analysed with various post-translational modification prediction tools available at the ExPASy resource portal. Amino acid sequences of the N-termini were compared using a multiple sequence alignment tool (ClustalW). With the help of the TermiNator tool, N-terminal methionine excision, translation efficiency and predicted half-life of the N-termini of the different isoforms were studied (Martinez et al., 2008; Frottin et al., 2006).
Results and Discussion
Two new coding exons of mouse Abcc4 predicted within first intron
With the help of bioinformatics tools, we predicted two new exons of the Abcc4 gene in mouse (Figure 1A). The new exons, E1A and E1B, are located between E1 and E2 and are capable of participating in alternative splicing with known exons. Each new exon can potentially be spliced so as to exclude E1, or both E1 and E2, and generate different transcript isoforms differing at their 5' end (Figure 1B). Both E1A and E1B have high FEX scores, as does the published exon E1, which served as a positive control (Table 1). Hits with low scores and several false positives were filtered out using splicing rule information and comparative genomics. The presence of the start codon ‘ATG’ in the novel exons shows that these exons are capable of coding for novel N-termini of the encoded proteins. These exons were further subjected to TestCode analysis, available in the Sequence Manipulation Suite, which predicts the coding or non-coding nature of exons. As seen in Table 1, E1 (positive control) and E1B were found to be “coding” while E1A was “May be coding”. In addition, the nucleotide and encoded amino acid sequences of these new exons were used to search the available databases and ESTs with BLAST, and none of them yielded any positive hits.
Confirmation of predicted exon by RT-PCR and semi-nested PCR
To confirm the existence of the predicted exons, RT-PCR followed by semi-nested PCR was performed. The PCR products obtained were fractionated by agarose gel electrophoresis and are shown in Figure 2. Lane “a” in Figure 2 represents the control band (550 bp) in kidney, while lanes “b-e” represent a product (391 bp) corresponding to exon E1B in kidney, brain, heart and liver respectively. For the positive control, a band corresponding to CCDS27335.1 was obtained, which was further confirmed by sequencing. After sequencing of the PCR products, the existence of one novel transcript containing exon E1B was confirmed. The sequence of the Abcc4-N1 transcript was submitted to GenBank with accession number KX013261.
The novel transcript isoform containing the new exon E1B at the 5' end, replacing the known exon E1, was named Abcc4-N1; its amplicon included exons E1B, E2, E3, E5 and E6. Exon 4 (E4) was found to be missing from Abcc4-N1. Since the isoform corresponding to CCDS49565.1 also lacks exon 4, we compared the novel isoform with this known isoform. We refer to this transcript isoform as Abcc4 and to the corresponding protein as ABCC4.
Expression of Abcc4 transcripts in different tissues during development
Expression of the Abcc4-N1 transcript was studied in different tissues of mouse across three developmental stages. The postnatal stages studied were PN3 (postnatal 3 days), PN15 (postnatal 15 days) and PN60 (postnatal 60 days). The published transcript, Abcc4, as well as the novel transcript, Abcc4-N1, was found to be ubiquitously expressed in all four tissues across all developmental stages (Table 2).
Quantitative real time RT-PCR
Expression levels of the control (Abcc4) and novel (Abcc4-N1) transcripts were studied by real-time RT-PCR using appropriate primers. For both variants, the forward primer was designed to span the exon junction region, while the reverse primer was the same. Expression levels of these transcripts in kidney are shown in Figure 3. Under normal conditions, expression of Abcc4-N1 was 36% of that of Abcc4. Expression of Abcc4-N1 was thus found to be significantly lower, which might be one of the reasons it was not detected earlier.
Promoter region analysis
With the help of the web-based tool TFBIND (Tsunoda and Takagi, 1999), various transcription factor (TF) binding sites were predicted in the region upstream of the first exon of each transcript. Putative TF binding sites with high scores are presented in Figure 4. Compared with Abcc4, the promoter region of Abcc4-N1 showed some similarities (GATA-1 and MZF-1) as well as differences in the potential TF binding sites. TF binding sites (high prediction score) present in the promoter region of Abcc4 are GATA-1, HSF2, AML-1a, c-Rel, Sp1, USF, MZF-1, STATx and c-ETS, while those of Abcc4-N1 are CdxA, MyoD, HNF-1, CP2, GATA-1, Nkx-2, GATA-2, deltaE and MZF-1. The existence of these highly diverse sets of TF binding sites may fulfil the need for differential expression of these isoforms in different cell types and/or across different developmental stages.
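The comparison of the two predicted TF binding site lists can be reproduced with simple set operations. This is a minimal sketch using only the site names given in the text; the variable names are illustrative, and the spelling "MZF1" used once in the text is treated as the same factor as "MZF-1", which is an assumption.

```python
# High-scoring predicted TF binding sites, as listed in the text.
ABCC4_SITES = {
    "GATA-1", "HSF2", "AML-1a", "c-Rel", "Sp1",
    "USF", "MZF-1", "STATx", "c-ETS",
}
ABCC4_N1_SITES = {
    "CdxA", "MyoD", "HNF-1", "CP2", "GATA-1",
    "Nkx-2", "GATA-2", "deltaE", "MZF-1",  # "MZF1" normalised to "MZF-1"
}

shared = ABCC4_SITES & ABCC4_N1_SITES       # predicted for both promoters
only_abcc4 = ABCC4_SITES - ABCC4_N1_SITES   # predicted for Abcc4 only
only_n1 = ABCC4_N1_SITES - ABCC4_SITES      # predicted for Abcc4-N1 only

if __name__ == "__main__":
    print("shared:", sorted(shared))
    print("Abcc4 only:", sorted(only_abcc4))
    print("Abcc4-N1 only:", sorted(only_n1))
```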
Post translational modifications
In order to understand the function of different isoforms of the Abcc4 gene, we performed comparative post-translational studies in silico with the help of ExPASy tools. The in silico study of the mouse Abcc4 gene revealed that the encoded N-termini of the new and published isoforms have unique properties and can undergo co-translational or post-translational modifications, which are important for imparting different functions to these protein isoforms. The nucleotide sequences of the first exons of each transcript were conceptually translated and studied for various post-translational modifications. The amino acid sequence obtained was compared with that of the published isoform using the multiple sequence alignment tool ClustalW (Figure 5). These isoforms were found to differ only at their N-terminus, in the region encoded by their first exon. All the properties mentioned hereafter are for the amino acid sequence encoded by the first exon only and not for the complete protein. The molecular weight (MW) and isoelectric point (pI) of the sequences encoded by the first exons were calculated using tools available at the ExPASy site (Bjellqvist et al., 1994). In the published Abcc4 isoform, the first exon E1 codes for a sequence of 2.9 kDa (MLPVHTEVKPNPLQDANLCSRVFFW), whereas the sequence encoded by the first exon (E1B) of Abcc4-N1 is 1.8 kDa (MEMLPSEVVKPREER). The isoelectric points of these sequences also differed considerably: the pI is close to 6.59 for ABCC4 and was calculated to be 4.94 for ABCC4-N1. Neither the published isoform ABCC4 nor the novel isoform ABCC4-N1 had a predicted signal peptide. The first exon encoded sequence of the published ABCC4 isoform was found to have one predicted O-glycosylation site (Steentoft et al., 2013) in the region studied, whereas ABCC4-N1 had no such site. N-glycosylation, phosphorylation, PKA-dependent phosphorylation (Blom et al., 2004) and sulfation sites (Chang et al., 2009) were found to be absent in both these sequences (results not shown).
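The molecular weights quoted for the two first-exon peptides can be checked with a short calculation. The sketch below sums standard average residue masses and adds one water molecule; the ExPASy tool used in the study may apply slightly different values, so this is an independent approximation and not the authors' procedure. The function and variable names are illustrative.

```python
# Standard average residue masses in daltons (residue = amino acid minus water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594,
    "N": 114.1038, "D": 115.0886, "Q": 128.1307, "K": 128.1741,
    "E": 129.1155, "M": 131.1926, "H": 137.1411, "F": 147.1766,
    "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N- and C-termini

# Peptide sequences encoded by the first exons, as given in the text.
E1_PEPTIDE = "MLPVHTEVKPNPLQDANLCSRVFFW"   # published isoform (ABCC4)
E1B_PEPTIDE = "MEMLPSEVVKPREER"            # novel isoform (ABCC4-N1)


def molecular_weight(peptide: str) -> float:
    """Approximate average molecular weight of a peptide in daltons."""
    return sum(RESIDUE_MASS[aa] for aa in peptide.upper()) + WATER_MASS


if __name__ == "__main__":
    for name, pep in (("E1", E1_PEPTIDE), ("E1B", E1B_PEPTIDE)):
        kda = molecular_weight(pep) / 1000
        print(f"{name}: {len(pep)} residues, {kda:.1f} kDa")
```

Rounded to one decimal place, both values agree with the 2.9 kDa and 1.8 kDa reported above.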
The fate and stability of a protein depend on the N-terminal sequence of the polypeptide, which therefore influences the half-life of the protein (Giglione and Meinnel, 2001). The N-termini of these isoforms were studied for characteristic properties including N-terminal methionine excision (NME), translation efficiency and predicted half-life of the protein sequence. These studies were conducted with the help of the TermiNator tool, which also predicts N-terminal acetylation, N-terminal myristoylation and S-palmitoylation. The isoforms ABCC4 and ABCC4-N1 were both predicted to retain the N-terminal methionine and not to undergo N-terminal excision. ABCC4-N1 was predicted to be acetylated at the N-terminus. Further, ABCC4-N1 was predicted to be translated more efficiently than ABCC4. The predicted half-lives of these protein isoforms are shown in Table 3.
Functional significance of alternatively spliced isoform
Alternative splicing generates different proteins from the same gene. In the mouse Abcc4 gene, we identified an isoform that contains a novel first exon that splices with internal exon 2 to form the mature mRNA. In order to understand the consequence of alternative splicing of the first exon, we performed BLAST-P using the 25 amino acids encoded by exon 1. The 25 amino acid residues present at the N-terminus of ABCC4 were found to be highly conserved across different organisms. N-terminal amino acid sequences of ABCC4 from four organisms, namely Mus musculus, Rattus norvegicus, Canis lupus familiaris and Homo sapiens, were aligned using ClustalW and are shown in Figure 6A. The conserved 25 amino acid sequence of ABCC4 is replaced by a novel 15 amino acid sequence at the N-terminus of the ABCC4-N1 isoform. This is expected to alter the functional aspects of ABCC4. Further, BLAST-P was performed using the novel 15 amino acid sequence of ABCC4-N1. As seen in Figure 6B, a few hits were obtained that showed some level of sequence similarity to the middle or C-terminal region of bacterial ABC transporter ATP-binding proteins (identity: 69-75%). The change in the ABCC4-N1 sequence may assign a different role to this ABC transporter. Further downstream experiments can be performed to evaluate the functioning of the novel isoform.
Conclusion
Using a combination of bioinformatics and molecular biology techniques, we have identified a novel transcript of the Abcc4 gene in mouse. The novel transcript was found to be expressed in all four tissues studied, across different developmental stages. In silico analysis was performed with the help of different tools available online. Various post-translational modifications in the novel as well as published isoforms are reported. ABCC4 and ABCC4-N1 were predicted to show no N-terminal methionine cleavage. Promoter analysis showed diverse sets of transcription factor binding sites. The amino acid sequence encoded by exon 1 was found to be highly conserved in different organisms and was replaced in the novel isoform by a 15 amino acid sequence that shows some homology with ABC transporters of lower organisms. However, further studies are required to characterise the functional aspects of the novel isoform at the protein level and its expression in the presence of different drugs, since various drugs have been reported to modulate alternative splicing (Rehman et al., 2015).
Acknowledgements
The authors are thankful to C.S.I.R, New Delhi, for the award of a CSIR-SRF fellowship to SUR (File no - 09/112(0470)/2011-EMR1). We are thankful to the Department of Biotechnology, New Delhi, India for providing generous funding to MT (Grant No: BT/PR5271/BID/7/395/2012). The necessary facilities provided by the Department of Biochemistry, A.M.U., Aligarh, India are also acknowledged.
References
Banday, A.R., Azim, S., Rehman, S.U., Tabish, M., 2012. Two novel N-terminal coding exons of Prkar1b gene of mouse: identified using a novel approach of in silico and molecular biology techniques. Gene 500, 73-79.
Belleville-Rolland, T., Sassi, Y., Decouture, B., Dreano, E., Hulot, J.S., Gaussem, P., Bachelot-Loza, C., 2016. MRP4 (ABCC4) as a potential pharmacologic target for cardiovascular disease. Pharmacol. Res. 107, 381-389.
Beretta, G.L., Benedetti, V., Cossa, G., Assaraf, Y.G., Bram, E., Gatti, L., Corna, E., Carenini, N., Colangelo, D., Howell, S.B., Zunino, F., Perego, P., 2010. Increased levels and defective glycosylation of MRPs in ovarian carcinoma cells resistant to oxaliplatin. Biochem. Pharmacol. 79, 1108-1117.
Bjellqvist, B., Basse, B., Olsen, E., Celis, J.E., 1994. Reference points for comparisons of two-dimensional maps of proteins from different human cell types defined in a pH scale where isoelectric points correlate with polypeptide compositions. Electrophoresis 15, 529-539.
Blom, N., Sicheritz-Pontén, T., Gupta, R., Gammeltoft, S., Brunak, S., 2004. Prediction of post-translational glycosylation and phosphorylation of proteins from the amino acid sequence. Proteomics 4, 1633-1649.
Burge, C., Karlin, S., 1997. Prediction of complete gene structures in human genomic DNA. J. Mol. Biol. 268, 78-94.
Chang, W.C., Lee, T.Y., Shien, D.M., Hsu, J.B., Horng, J.T., Hsu, P.C., Wang, T.Y., Huang, H.D., Pan, R.L., 2009. Incorporating support vector machine for identifying protein tyrosine sulfation sites. J. Comput. Chem. 30, 2526-2537.
Drenberg, C.D., Hu, S., Li, L., Buelow, D.R., Orwick, S.J., Gibson, A.A., Schuetz, J.D., Sparreboom, A., Baker, S.D., 2016. ABCC4 Is a Determinant of Cytarabine-Induced Cytotoxicity and Myelosuppression. Clin. Transl. Sci. 9, 51-59.
Fickett, J.W., 1982. Recognition of protein coding regions in DNA sequences. Nucleic Acids Res. 10, 5303-5318.
Frottin, F., Martinez, A., Peynot, P., Mitra, S., Holz, R.C., Giglione, C., Meinnel, T., 2006. The proteomics of N-terminal methionine cleavage. Mol. Cell Proteomics 5, 2336-2349.
Giglione, C., Meinnel, T., 2001. Organellar peptide deformylases: universality of the N-terminal methionine cleavage mechanism. Trends Plant Sci. 6, 566-572.
Gökirmak, T., Campanale, J.P., Reitzel, A.M., Shipp, L.E., Moy, G.W., Hamdoun, A., 2016. Functional diversification of sea urchin ABCC1 (MRP1) by alternative splicing. Am. J. Physiol. Cell Physiol. 310, C911-C920.
Haimeur, A., Conseil, G., Deeley, R.G., Cole, S.P., 2004. The MRP-related and BCRP/ABCG2 multidrug resistance proteins: biology, substrate specificity and regulation. Current Drug Metab. 5, 21-53.
Hendig, D., Langmann, T., Zarbock, R., Schmitz, G., Kleesiek, K., Götting, C., 2009. Characterization of the ATP-binding cassette transporter gene expression profile in Y79: a retinoblastoma cell line. Mol. Cell. Biochem. 328, 85-92.
Holla, V.R., Backlund, M.G., Yang, P., Newman, R.A., DuBois, R.N., 2008. Regulation of prostaglandin transporters in colorectal neoplasia. Cancer Prev. Res. 1, 93-99.
Ishqi, H.M., Rehman, S.U., Sarwar, T., Husain, M.A., Tabish, M., 2016. Identification of differentially expressed three novel transcript variants of mouse ARNT gene. IUBMB Life 68, 122-135.
Kelemen, O., Convertini, P., Zhang, Z., Wen, Y., Shen, M., Falaleeva, M., Stamm, S., 2013. Function of alternative splicing. Gene 514, 1-30.
Knudsen, S., 1999. Promoter 2.0: for the recognition of PolII promoter sequences. Bioinformatics 15, 356-361.
Leggas, M., Adachi, M., Scheffer, G.L., Sun, D., Wielinga, P., Du, G., Mercer, K.E., Zhuang, Y., Panetta, J.C., Johnston, B., Scheper, R.J., Stewart, C.F., Schuetz, J.D., 2004. Mrp4 confers resistance to topotecan and protects the brain from chemotherapy. Mol. Cell. Biol. 24, 7612-7621.
Lin, F., Marchetti, S., Pluim, D., Iusuf, D., Mazzanti, R., Schellens, J.H., Beijnen, J.H., van Tellingen, O., 2013. Abcc4 together with abcb1 and abcg2 form a robust cooperative drug efflux system that restricts the brain entry of camptothecin analogues. Clin. Cancer Res. 19, 2084-2095.
Martinez, A., Traverso, J.A., Valot, B., Ferro, M., Espagne, C., Ephritikhine, G., Zivy, M., Giglione, C., Meinnel, T., 2008. Extent of N-terminal modifications in cytosolic proteins from eukaryotes. Proteomics 8, 2809-2831.
Nilsen, T.W., Graveley, B.R., 2010. Expansion of the eukaryotic proteome by alternative splicing. Nature 463, 457-463.
Rehman, S.U., Husain, M.A., Sarwar, T., Ishqi, H.M., Tabish, M., 2015. Modulation of alternative splicing by anticancer drugs. Wiley Interdiscip. Rev. RNA 6, 369-379.
Solovyev, V., Kosarev, P., Seledsov, I., Vorobyev, D., 2006. Automatic annotation of eukaryotic genes, pseudogenes and promoters. Genome Biol. 7, 10.1-10.12.
Solovyev, V.V., Salamov, A.A., Lawrence, C.B., 1994. Predicting internal exons by oligonucleotide composition and discriminant analysis of spliceable open reading frames. Nucleic Acids Res. 22, 5156-5163.
Stamm, S., Ben-Ari, S., Rafalska, I., Tang, Y., Zhang, Z., Toiber, D., Thanaraj, T.A., Soreq, H., 2005. Function of alternative splicing. Gene 344, 1-20.
Steentoft, C., Vakhrushev, S.Y., Joshi, H.J., Kong, Y., Vester-Christensen, M.B., Schjoldager, K.T., Lavrsen, K., Dabelsteen, S., Pedersen, N.B., Marcos-Silva, L., Gupta, R., Bennett, E.P., Mandel, U., Brunak, S., Wandall, H.H., Levery, S.B., Clausen, H., 2013. Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology. The EMBO Journal 32, 1478-1488.
Tian, Q., Zhang, J., Tan, T.M., Chan, E., Duan, W., Chan, S.Y., Boelsterli, U.A., Ho, P.C., Yang, H., Bian, J.S., Huang, M., Zhu, Y.Z., Xiong, W., Li, X., Zhou, S., 2005. Human multidrug resistance associated protein 4 confers resistance to camptothecins. Pharm. Res. 22, 1837-1853.
Tsunoda, T., Takagi, T., 1999. Estimating transcription factor bindability on DNA. Bioinformatics 15, 622-630.
Zhang, G., Wang, Z., Qian, F., Zhao, C., Sun, C., 2015. Silencing of the ABCC4 gene by RNA interference reverses multidrug resistance in human gastric cancer. Oncol. Rep. 33, 1147-1154.
Zhao, X., Guo, Y., Yue, W., Zhang, L., Gu, M., Wang, Y., 2014. ABCC4 is required for cell proliferation and tumorigenesis in non-small cell lung cancer. Onco. Targets Ther. 21, 343-351.
Figure Legends
Figure 1: Exon-intron organisation and alternative splicing pattern of published and newly predicted Abcc4 transcripts of mouse. (A) Exons are represented by rectangular boxes and interconnecting lines represent introns. Known exons (E1 to E31) are shown as black boxes, whereas coloured boxes represent newly predicted exons. Sizes of the exons/introns are not to scale. Dashed lines show the splicing pattern predicted to generate two new transcripts having either exon E1A or exon E1B as the first exon at the 5′ end, along with a known transcript having exon E1 as the first exon at the 5′ end. TSS is the putative transcription start site for the known variant containing E1, while TSS-N and TSS-N′ are the putative transcription start sites for the novel variants containing E1A and E1B, respectively. (B) Design of forward and reverse primers. The control forward primer (FC) was designed from the published exon 2 region. Reverse primer R1 was from exon 6, while reverse primer R2 was located internal to R1 (also within exon 6). The R2 primer was designed for semi-nested PCR. Forward primers FE1A and FE1B were designed from the newly predicted exons E1A and E1B, respectively.
Figure 2. RT-PCR followed by semi-nested PCR products fractionated by agarose gel electrophoresis. Total RNA isolated from kidney was reverse transcribed and amplified by RT-PCR followed by semi-nested PCR using a common reverse primer R2. Lane ‘a’ corresponds to the control product of 550 bp obtained using forward primer FC, while lanes ‘b-e’ show a product of the anticipated size (391 bp) obtained using forward primer FE1B in kidney, brain, heart and liver, respectively. No PCR product was obtained in the tissues tested when forward primer FE1A was used. Sizes of the PCR ladder bands are shown on the left. The expected sizes of PCR products using forward primer FE1B and reverse primer R2 depend on the splicing pattern as follows. (1) 616 bp when exon E1B splices with E2 and E4 is included in the transcript, and 391 bp when E4 is spliced out of the transcript. (2) 505 bp when E1B splices directly with E3 and E4 is included in the transcript, and 280 bp when E4 is spliced out.
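The four expected FE1B/R2 product sizes in the Figure 2 legend can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the original study: it assumes each amplicon length is a fixed base length plus the contribution of each optional exon, derives those contributions from the stated sizes, and confirms that all four sizes are mutually consistent. All function and variable names are hypothetical.

```python
# Expected FE1B/R2 product sizes (bp) as stated in the Figure 2 legend.
# Key: (exon that E1B splices to, whether E4 is included).
STATED_SIZES = {
    ("E2", True): 616,
    ("E2", False): 391,
    ("E3", True): 505,
    ("E3", False): 280,
}


def implied_contributions(sizes):
    """Return (base, e2, e4): the shortest product length and the number
    of base pairs that E2 and E4 each add to the amplicon (assumed additive)."""
    base = sizes[("E3", False)]               # neither E2 nor E4 present
    e2 = sizes[("E2", False)] - base          # extra length when E2 is present
    e4 = sizes[("E2", True)] - sizes[("E2", False)]  # extra length from E4
    return base, e2, e4


def predicted_size(base, e2, e4, acceptor, includes_e4):
    """Rebuild an amplicon size from the additive contributions."""
    size = base
    if acceptor == "E2":
        size += e2
    if includes_e4:
        size += e4
    return size


def is_consistent(sizes):
    """True if every stated size equals the size rebuilt from the contributions."""
    base, e2, e4 = implied_contributions(sizes)
    return all(
        predicted_size(base, e2, e4, acceptor, includes_e4) == stated
        for (acceptor, includes_e4), stated in sizes.items()
    )


if __name__ == "__main__":
    print(implied_contributions(STATED_SIZES))
    print(is_consistent(STATED_SIZES))
```

Under this additive assumption, the stated sizes imply that E4 adds 225 bp and E2 adds 111 bp to the product, so the observed 391 bp band corresponds to E1B joined to E2 with E4 absent.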
Figure 3. Relative expression of Abcc4 transcript variants. Oligo(dT)18-primed cDNA was synthesized from mRNA obtained from adult mouse kidney. Quantitative real-time RT-PCR was used to study the expression level of the isoforms using exon-specific primers. Forward primers were designed to include the exon junction region, while the same reverse primer was used throughout. Fold change with respect to control is reported.
Figure 4. Comparative in silico prediction of different transcription factor binding sites. Regions located up to 600 bp upstream of the first exon of each isoform were analyzed for possible transcription factor binding sites. Only results with high probabilities are reported here.
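Selecting the region to scan, as described for Figure 4, amounts to taking a fixed window immediately upstream of each first exon. The following is a minimal sketch of that step, not the authors' pipeline: it assumes a sequence string already oriented 5′ to 3′ in the direction of transcription and a 0-based exon start index. The function name `upstream_region` is hypothetical.

```python
def upstream_region(sequence, exon_start, window=600):
    """Return up to `window` bases immediately upstream of `exon_start`.

    `exon_start` is the 0-based index of the first base of the exon.
    The window is truncated if fewer than `window` bases are available.
    """
    if not 0 <= exon_start <= len(sequence):
        raise ValueError("exon_start lies outside the sequence")
    begin = max(0, exon_start - window)
    return sequence[begin:exon_start]


if __name__ == "__main__":
    # Toy sequence: 1000 placeholder bases followed by a mock exon.
    toy = "A" * 1000 + "ATGGAGATG"
    region = upstream_region(toy, exon_start=1000)
    print(len(region))
```

The extracted window would then be passed to whichever binding-site prediction tool is in use; that step is outside the scope of this sketch.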
Figure 5. Multiple sequence alignment (Clustal W) of amino acid sequences encoded by the first two exons of the published ABCC4 and the new isoform ABCC4-N1 in mouse. A marked difference is seen at the N-terminus of these isoforms. The arrow marks the sequence corresponding to the start of exon 2. Upstream of this are the sequences of the alternatively spliced first exons. Asterisks indicate residues that are identical in both isoforms.
Figure 6. Amino acid sequence comparison with other organisms using BLAST-P. (A) Amino acid sequence encoded by exon 1 of mouse Abcc4 shows a high similarity with higher organisms. (B) Amino acid sequence encoded by first exon of Abcc4-N1 (query) shows (1) 75% and (2) 69% identity with ABC transporters of some lower organisms.
Figures
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Table 1: List of published and predicted exons with their FEX and TestCode scores.
Exons                 FEX score   TestCode score   Coding/Non-coding
Known exon E1         23.09       1.324            Coding
Predicted exon E1A    4.38        0.785            May be coding
Predicted exon E1B    6.33        1.241            Coding
Exon 1 (E1) was taken as the positive control for both FEX and TestCode analyses.
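The coding calls in Table 1 can be reproduced from the TestCode scores with a simple threshold rule. The sketch below is an assumption rather than the authors' stated procedure: it uses the commonly cited TestCode cut-offs (below 0.74 non-coding, above 0.95 coding, no firm call in between), which are not given in this section. The function name and constants are hypothetical, and the intermediate label follows the wording of Table 1.

```python
# Assumed TestCode cut-offs; these values are not stated in the source text.
NONCODING_CUTOFF = 0.74
CODING_CUTOFF = 0.95


def classify_testcode(score):
    """Map a TestCode score to a coding call using the assumed cut-offs."""
    if score > CODING_CUTOFF:
        return "Coding"
    if score < NONCODING_CUTOFF:
        return "Non-coding"
    return "May be coding"


if __name__ == "__main__":
    # TestCode scores from Table 1.
    table1_scores = {"E1": 1.324, "E1A": 0.785, "E1B": 1.241}
    for exon, score in table1_scores.items():
        print(exon, classify_testcode(score))
```

With these assumed cut-offs, the calls for E1, E1A and E1B agree with the Coding/Non-coding column of Table 1.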
Table 2. Expression profile of novel isoforms in different tissues across different developmental stages.
Tissue    PN3   PN15   PN60
Brain     +     +      +
Heart     +     +      +
Kidney    +     +      +
Liver     +     +      +
Three different postnatal (PN) stages were studied: PN3, PN15 and PN60. The appearance of RT-PCR products of the anticipated size is represented by a (+) sign and denotes expression of the control and novel isoform. cDNA was prepared independently from three different mice of each group mentioned.
Table 3. Various properties of the N-terminal amino acid sequences of the published and novel protein isoforms as predicted by the TermiNator tool.
*M(1)
ABCC4-N1
MEMLPSEVVKPREERWLNPL
Ac-M(1)
MLPVHTEVKPNPLQDANLCS
Translation efficiencyb
ABCC4
Likelihood %
Predicted half life (h)
100
1
?
100
5
5-31
Predicted N-terminus of mature proteina
Amino acid sequence (20 aa at N- terminal)
a (1) or (2) indicates the position of the amino acid in the protein sequence starting with M(1).
b Translation efficiency refers to the capacity of the translation machinery to translate the mRNA into protein. A higher value denotes more efficient translation on a scale of 1-5.
Abbreviations:
RT-PCR, Reverse transcriptase polymerase chain reaction
EST, Expressed sequence tags
PN, Postnatal
SMS, Sequence Manipulation Suite
qRT-PCR, Quantitative real-time PCR
Highlights
The mouse Abcc4 gene was studied to identify novel alternatively spliced transcripts.
Expression of a novel transcript was confirmed using molecular biology techniques.
The novel transcript has a different promoter region and transcription start site.
The conceptually translated transcript shows different post-translational modifications.