Biochimica et Biophysica Acta 1492 (2000) 191–195
www.elsevier.com/locate/bba
Short sequence-paper
Cloning and characterization of a novel RNA-binding protein SRL300 with RS domains
Yukiharu Sawada a,*, Yutaka Miura b,c, Kazumi Umeki d, Taiki Tamaoki c, Kei Fujinaga a,e, Sachiya Ohtaki d
a Department of Molecular Biology, Cancer Research Institute, Sapporo Medical University School of Medicine, Central Ward, Sapporo 060-8556, Japan
b Department of Bio-regulation and Molecular Neurobiology, Institute for Molecular Medicine, Nagoya City University School of Medicine, Nagoya 467-0001, Japan
c Department of Medical Biochemistry, University of Calgary, Calgary, Alta., Canada T2N 4N1
d Department of Laboratory Medicine, Miyazaki Medical College, Kiyotake 889-1692, Japan
e Biotechnology Research Laboratory, Takara Shuzo Co. Ltd., Kusatsu 525-0055, Japan
Received 22 December 1999; received in revised form 21 February 2000; accepted 2 March 2000
Abstract AT-rich element binding factor 1 (ATBF1) mRNA encodes a transcription factor implicated in neuronal differentiation. A cDNA for a protein that can bind the 5′-noncoding sequence of the ATBF1 mRNA was cloned. The deduced protein, termed SRL300, contains a unique RNA-binding region, two large RS domains and many phosphorylation sites. SRL300 protein was detected in both human and rat cells. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: RNA-binding protein; cDNA cloning; RNA-binding region; RS motif
Abbreviations: ATBF1, AT-rich element binding factor 1; BCIP, 5-bromo-4-chloro-3-indolyl-phosphate; GST, glutathione S-transferase; IPTG, isopropyl-β-D-(−)-thiogalactopyranoside; ORF, open reading frame; RBD, RNA-binding domain; RBP, RNA-binding protein; RS motif, protein sequence motif rich in alternating arginine and serine residues
* Corresponding author. Fax: +81-11-618-3313; E-mail: [email protected]
The ATBF1 cDNA was first isolated as one whose product binds an AT-rich regulatory element of the human α-fetoprotein (AFP) gene [1]. Transient transfection assays showed that ATBF1 suppressed the activity of the AT-rich element of the enhancer/promoter region of the AFP gene, but not that of the albumin gene [2]. The ATBF1-A mRNA encodes a transcriptional regulatory protein that is characterized by a large size (404 kDa) and the presence of four homeodomains and 23 zinc finger motifs [3]. It is expressed in human and mouse embryonal carcinoma cells in a neuronal differentiation-dependent manner [3,4]. The ATBF1-A mRNA has a long 5′-noncoding region that spans approximately 670 nucleotides and has the potential to form extensive stem-loop structures. These features suggest a possible interaction with cellular proteins
through which processing of the mRNA may be regulated in a differentiation-dependent manner. In this report, we performed RNA-ligand screening of a cDNA expression library and identified a cDNA clone that encodes a novel RNA-binding protein characterized by the presence of two clusters of RS motifs.
The 5′-end of the ATBF1-A cDNA in a partial cDNA clone, pMD14 [3], was used to synthesize the A5 RNA probe (404 nucleotides in length), corresponding to the 5′-noncoding region, with T3 RNA polymerase (Boehringer Mannheim) after linearization with StuI. Escherichia coli Y1090 was infected with 1.3 × 10⁶ pfu of a HeLa cDNA library in λgt11 (Clontech), induced with IPTG and blotted onto nitrocellulose membranes. After denaturation in and renaturation from 6 M guanidine HCl [5], the blots were pre-incubated in solution A (10 mM Tris–HCl, pH 7.5, 0.15 M KCl, 0.05 M NaCl, 1 mM MgCl₂) containing 50 µg/ml tRNA for 2 h at 4°C, and then incubated in solution A containing the ³²P-labeled A5 RNA probe (1 × 10⁶ cpm/ml) and 50 µg/ml tRNA for 8 h at 4°C. After washing in solution B (10 mM Tris–HCl, pH 7.5, 0.2 M KCl, 0.1 M NaCl, 0.1% Tween 20), they were autoradiographed for 10 h at −70°C. Nine positive clones (R1–R9) were isolated in the first screening and purified through second and third screenings using the A5 RNA probe labeled with digoxigenin (Boehringer Mannheim). The cDNA inserts were amplified by PCR and recloned in pT7Blue T (Novagen) or pBlueScript II (Stratagene).
Sequence analysis showed that all the clones were essentially identical except for a few base changes. One of the reading frames is open over the whole sequence, indicating that it represents a partial cDNA of a longer mRNA species. The cDNA clone R2, which has the consensus sequence, was used as the starting probe to screen another HeLa cell cDNA library in λgt10. Additional 5′-sequences were generated with a 5′-RACE (rapid amplification of cDNA ends) reaction kit (Clontech) from the same cDNA library. Compilation of the overlapping cDNA clones resulted in a 9027-bp cDNA with an ORF that can encode a protein of 2752 amino acid residues with a calculated molecular mass of 299 617 (Fig. 1). The putative initiation codon is compatible with Kozak's consensus for efficient translation initiation [6], and is preceded by an in-frame stop codon.
0167-4781/00/$ – see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0167-4781(00)00065-8
BBAEXP 91386 6-6-00
Fig. 1. Deduced amino acid sequence of human SRL300. The RNA-binding domain (RBD) is doubly underlined. Two large RS motif clusters (RSD-1 and RSD-2) are boxed. Two regions boxed and marked a and b were used to raise antibodies. Two polyserine stretches are underlined. A pair of hemibrackets delimits the amino acid sequence encoded by the cDNA clone R2. The cDNA sequences have been deposited in DDBJ/EMBL/GenBank under the accession numbers AB015644, AB016087, AB016089, AB016090, AB016091 and AB016092.
The deduced protein contains numerous repeats of the arginine-serine dipeptide (RS motif), with a particularly large number of repeats clustered in two regions, named RSD-1 and RSD-2, in the amino and carboxyl terminal halves, respectively, of the molecule (Fig. 1). The two RS domains are composed of repeats of similar peptides (21 repeats of
RRGRSRSRT/SPA/Q for RSD-1 consensus and 19 repeats of RRRSRSRTP/SPVT/S for RSD-2 consensus). There are multiple copies of a related sequence motif (RRSRSGSSP/SE/D for consensus) in the region between RSD-1 and RSD-2. Thus, the protein was named SRL300 as a possible member of the RS domain proteins that are involved in RNA processing, including pre-mRNA splicing and mRNA transport [7,8]. SRL300 also contains multiple stretches of basic amino acid residues that could function as nuclear localization signals, suggesting that it is localized in the nucleus. In addition, there are multiple consensus phosphorylation sites for cAMP-dependent protein kinase (PKA), protein kinase C (PKC), casein kinase II (CKII) and cyclin-dependent kinase (CDK). These features suggest that SRL300 is involved in RNA metabolism in the nucleus and that the function of SRL300 is regulated by phosphorylation.
To examine whether SRL300 is expressed in cells, we raised rabbit polyclonal antibodies against SRL300. Two polypeptide stretches, a and b, located on either side of the RBD (Fig. 1), were expressed as fusion proteins with GST and used as immunogens. Both antibodies detected broad bands with molecular masses above 300 000 (Fig. 2, lanes 1 and 3). Similar results were obtained with rat 3Y1 cells (Fig. 2, lanes 5 and 6), indicating that SRL300 is conserved among different mammalian species. The amount of rat protein reacting with the anti-human SRL300-b antibody was less than that with the anti-human SRL300-a antibody (Fig. 2, lanes 8 and 9), suggesting some sequence diversity between the two species within the peptide sequence recognized by the anti-b antibody. Treatment of the immunoprecipitates with calf intestine alkaline phosphatase increased the mobility of the broad band in gel electrophoresis and resulted in part in the formation of bands with approximately the same mobility as that of the E1A-associated p300 (Fig. 2, lanes 2 and 4). These results indicate that SRL300 exists in HeLa cells primarily in the form of phosphorylated proteins.
Fig. 2. Detection of SRL300 protein. HeLa and rat 3Y1 cells were labeled with [³⁵S]methionine and [³⁵S]cysteine (110 mCi/ml, Amersham) for 2 h. The cells were lysed in the lysis buffer composed of 20 mM sodium phosphate (pH 7.0), 250 mM NaCl, 5 mM EDTA, 30 mM pyrophosphate, 0.1 mM sodium vanadate, 10 mM sodium fluoride, 5 mM dithiothreitol, 0.5% NP40, 1 mM phenylmethylsulfonyl fluoride, 10 µg/ml aprotinin and 1 µg/ml pepstatin A. Aliquots (3 × 10⁷ cpm) of the extracts were incubated in duplicate with antibodies raised against GSTSRp300-a and GSTSRp300-b fusion proteins for 90 min at 4°C, and the immunocomplexes formed were collected on protein A-agarose beads (Pierce) and washed four times with the lysis buffer. The washed immunocomplexes were equilibrated with the phosphatase buffer composed of 50 mM Tris–HCl (pH 9.0) and 1 mM MgCl₂, and incubated in 50 µl of the buffer with (+) or without (−) 20 U of calf intestine alkaline phosphatase (CIAP) (Takara) for 1 h at 37°C. YN-1 cells, a 3Y1 cell clone transformed by a plasmid carrying adenovirus (Ad) 5/12 chimeric E1A and Ad5 E1B genes, were labeled and immunoprecipitated with anti-Ad2 E1A protein antibody (Calbiochem) to detect p300, p107 and E1A proteins as labeled size markers (lane 7). Samples were subjected to SDS–polyacrylamide gel electrophoresis and autoradiography.
Fig. 3. Determination of the RNA-binding region. (A) Diagram of GST-R2 fusions. Various subfragments of the R2 cDNA produced by restriction digestion were recloned to produce GST fusions of the truncated R2 segments. The R2dlAP construct carries an extra 12 bp derived from a multi-cloning region, encoding amino acid residues CMP, in place of the deletion. The dlRS mutations were constructed by PCR with mutated primers, deleting 12 bp encoding the amino acid residues SRSR at 277–280 and replacing S281 with G. Coding sequences are shown by open boxes and the numerals on the ends indicate the corresponding amino acid positions in SRL300. The results of RNA-binding assays are summarized on the right. (B) RNA-binding assay. GST fusion proteins were induced by 0.6 mM IPTG and purified on glutathione-Sepharose 4B beads (Pharmacia) as described previously [14]. The purified GSTR2 fusion proteins were blotted on nitrocellulose membranes. The blots were incubated in solution C (10 mM Tris–HCl, pH 7.5, 0.2 M KCl, 0.1 M NaCl, 1 mM MgCl₂, 5× Denhardt's solution) at 4°C for 8–16 h, and then in solution C containing digoxigenin-labeled RNA probes at 4°C for 8–16 h. They were washed in solution B at 4°C three times for 30 min each and blocked in solution B (10 mM Tris–HCl, pH 7.5, 0.2 M KCl, 0.1 M NaCl, 0.1% Tween 20) containing 3% dry milk and 1% BSA at 4°C for 30 min. The bound digoxigenin-labeled A5 RNA probe was visualized by the reaction of alkaline phosphatase conjugated with anti-digoxigenin antibody (Boehringer Mannheim) using BCIP as a substrate and nitroblue tetrazolium as a chromophore [5].
Detection of SRL300 as a broad band like a smear
further suggests that SRL300 proteins are phosphorylated at many sites to a wide variety of phosphorylation levels. Interestingly, the distribution of consensus phosphorylation sites on SRL300 has a characteristic feature. For example, the sites for PKA (R/K-R/K-X-S/T) and PKC (S/T-X-R/K) are associated with the RS motif consensus for RSD-2 and most frequently map to RSD-2. The CDK (S/T-P-X-R/K) sites are associated with the RS motif consensus for RSD-1, while CKII (S/T-X-X-D/E) sites are often included in the RS motif found in the region between RSD-1 and RSD-2. Therefore, it is possible that SRL300 is phosphorylated on different domains by those kinases in response to different signals.
Since the partial cDNA R2 was cloned by the RNA-ligand screening procedure, the RBD of SRL300 is thought to reside within the polypeptide encoded by the R2 cDNA. To map the RBD, the R2 polypeptide (amino acids 197–425) and its various fragments were expressed as a series of GST fusion proteins (Fig. 3A). The A5 RNA probe bound to the fusion protein carrying the intact R2 (GSTR2) and to those carrying truncated R2 that contain 63 or more N-terminal amino acids (R2EP, R2EA and R2dlAP) (Fig. 3B). On the other hand, fusions containing only 26 N-terminal amino acids (R2Ef), or lacking nine or more amino acids at the N-terminus of R2 (R2TT, R2ff, R2AE and R2PE), were devoid of binding capacity. These results indicate that the first 63 residues at the N-terminus of the R2 polypeptide are required for RNA binding. This region is rich in basic amino acids (14 arginines and 17 lysines) and serine (15 residues) and has several RS motifs (Fig. 1). The RBD identified in SRL300 lacks the RNP motif or any other well-defined RNA-binding motifs, but is somewhat similar to the arginine-rich motif (ARM) [9].
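The kinase consensus patterns quoted in the phosphorylation paragraph above map directly onto regular expressions, which makes the stated association between kinases and repeat types easy to check. The following is a minimal sketch, not the authors' procedure. It assumes one particular reading of each repeat consensus (the first residue is taken wherever the text gives an alternative such as T/S) and it joins two copies of each unit in tandem so that sites spanning a repeat junction are visible. The names `KINASE_PATTERNS`, `REPEAT_UNITS`, `find_sites` and `scan_tandem_repeats` are hypothetical and introduced only for illustration.

```python
import re

# Kinase consensus patterns as quoted in the text (X = any residue).
KINASE_PATTERNS = {
    "PKA": r"[RK][RK].[ST]",   # R/K-R/K-X-S/T
    "PKC": r"[ST].[RK]",       # S/T-X-R/K
    "CDK": r"[ST]P.[RK]",      # S/T-P-X-R/K
    "CKII": r"[ST]..[DE]",     # S/T-X-X-D/E
}

# ASSUMPTION: one concrete reading of each repeat consensus, taking the
# first residue wherever the text offers an alternative (e.g. T/S -> T).
REPEAT_UNITS = {
    "RSD-1": "RRGRSRSRTPA",
    "RSD-2": "RRRSRSRTPPVT",
    "inter-domain": "RRSRSGSSPE",
}


def find_sites(sequence):
    """Return {kinase: [0-based start positions]}, overlapping matches included."""
    hits = {}
    for kinase, pattern in KINASE_PATTERNS.items():
        # A zero-width lookahead lets successive matches overlap.
        lookahead = re.compile("(?=" + pattern + ")")
        hits[kinase] = [m.start() for m in lookahead.finditer(sequence)]
    return hits


def scan_tandem_repeats(copies=2):
    """Scan each assumed repeat unit after joining `copies` of it in tandem."""
    return {name: find_sites(unit * copies) for name, unit in REPEAT_UNITS.items()}


if __name__ == "__main__":
    for region, hits in scan_tandem_repeats().items():
        print(region, hits)
```

Under these assumptions, PKA and PKC patterns match only the RSD-2 unit and the CKII pattern matches only the inter-domain unit, while the CDK pattern matches the RSD-1 unit and also the artificial junction between two inter-domain units. Junction hits depend entirely on the tandem arrangement assumed here; scanning the deposited sequence would be needed to confirm the actual distribution.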
The structures of the ARMs are diverse in different proteins: the ARM peptides of HIV Rev form an α-helix and bind specifically to the Rev-responsive element (RRE) [10], while those of HIV Tat have no specific conformation, but form a stable structure upon binding to the transacting-responsive element (TAR) [11]. Given this diversity, the RBD of SRL300 may represent a new member of the ARM family. No typical RNA- or DNA-binding motif was found throughout the protein sequence.
Northern blot analysis of SRL300 mRNA was done on total RNAs from several human tissues (liver, placenta, white blood cells) and cell lines (A549, HeLa, HepG2) using the R2 cDNA as a probe. A band of 9–10 kb was detected in all samples examined, indicating that SRL300 is expressed in diverse tissues and cell lines (Fig. 4). The size of SRL300 mRNA is similar to that of the SRL300 cDNA, indicating that our cDNA represents most, if not all, of the human SRL300 mRNA.
Fig. 4. Expression of SRL300 mRNA in human tissues and cell lines. Total RNA was prepared from human liver, placenta, white blood cells and cultured human cells, A549, HeLa and HepG2, by the guanidinium–CsCl method [5]. Ten micrograms of RNA were run on a 1% formaldehyde–agarose gel, blotted to a nylon membrane, and probed with a ³²P-labeled DNA probe prepared from R2 by the Klenow fragment of DNA polymerase I (Takara) and random primers. mRNA for β-actin was also detected using a specific hybridization probe as a control of RNA quantity. Lanes: 1, liver; 2, placenta; 3, white blood cells; 4, A549; 5, HeLa; 6, HepG2.
Searches of the DDBJ/EMBL/GenBank databases have revealed extensive homology to two cDNA clones with the accession numbers AB002322 and AF201422 and a genomic DNA clone with the accession number AC004493, in addition to ESTs. The ubiquitous expression of the partial cDNA clone AB002322 [12] is consistent with our result of the Northern analysis. The sequence AF201422 has recently been deposited for a nuclear matrix protein, SRm300, a component of a pre-mRNA splicing coactivator complex [13]. Comparison of these cDNA sequences suggests that SRL300 represents the first complete sequence of SRm300. The genomic clone (AC004493) seems to include the whole sequence of the SRL300 cDNA. The detailed structure and chromosome mapping of the SRL300 gene will be described elsewhere.
Y.S. thanks Mr. I. Yamamoto for technical assistance and Drs. T. Tokino and T. Kotani for support and discussion. Y.S. was supported in part by the Medical Exchange Program between Sapporo Medical University and the University of Calgary.
References
[1] T. Morinaga, H. Yasuda, T. Hashimoto, K. Higashio, T. Tamaoki, Mol. Cell. Biol. 11 (1991) 6041–6049.
[2] H. Yasuda, A. Mizuno, T. Tamaoki, T. Morinaga, Mol. Cell. Biol. 14 (1994) 1395–1401.
[3] Y. Miura, T. Tam, A. Ido, T. Morinaga, T. Miki, T. Hashimoto, T. Tamaoki, J. Biol. Chem. 270 (1995) 26840–26848.
[4] A. Ido, Y. Miura, T. Tamaoki, Dev. Biol. 163 (1994) 184–187.
[5] J. Sambrook, E.F. Fritsch, T. Maniatis, Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York, 1989.
[6] M. Kozak, Nucleic Acids Res. 15 (1987) 8125–8133.
[7] A.M. Zahler, W.S. Lane, J.A. Stolk, M.B. Roth, Genes Dev. 6 (1992) 837–847.
[8] J.F. Caceres, G.R. Screaton, A.R. Krainer, Genes Dev. 12 (1998) 55–66.
[9] C.G. Burd, G. Dreyfuss, Science 265 (1994) 615–621.
[10] R. Tan, L. Chen, J.A. Buettner, D. Hudson, A.D. Frankel, Cell 73 (1993) 1031–1040.
[11] R. Tan, A.D. Frankel, Proc. Natl. Acad. Sci. USA 92 (1995) 5282–5286.
[12] T. Nagase, K. Ishikawa, D. Nakajima, M. Ohira, N. Seki, N. Miyajima, A. Tanaka, H. Kotani, N. Nomura, O. Ohara, DNA Res. 4 (1997) 141–150.
[13] B.J. Blencowe, R. Issner, J.A. Nickerson, P.A. Sharp, Genes Dev. 12 (1998) 996–1009.
[14] Y. Sawada, M. Ishino, K. Miura, E. Ohtsuka, K. Fujinaga, Virus Genes 15 (1997) 161–170.