Biochimica et Biophysica Acta 1261 (1995) 290-292
Short Sequence-Paper
cDNA encoding a chicken protein (CRP1) with homology to hnRNP type A/B
A. Cvekl, J.B. McDermott, J. Piatigorsky *
Laboratory of Molecular and Developmental Biology, National Eye Institute, National Institutes of Health, Bethesda, MD 20892-2730, USA
Received 17 October 1994; revised 11 January 1995; accepted 19 January 1995
Abstract
The sequence of a cDNA encoding a putative chicken RNA-binding protein is reported. The C-terminal portion of the predicted protein is similar to a family of nucleic acid-binding proteins that includes murine CArG box-binding factor CBF-A, human hnRNP A/B, hepatitis B enhancer-binding protein E2BP, and AU-rich RNA-binding protein AUF1. These proteins all have two consecutive RNA recognition motifs. However, the N-terminal 72 amino acids of the deduced chicken protein show no relation to the N-terminal sequences of the other proteins. We call this protein chicken ribonucleoprotein, CRP1.
Keywords: Gene cloning; RNP protein family; (Chicken)
RNA-binding proteins are involved in various aspects of RNA metabolism. They play specific roles in mRNA processing, turnover, transport and translation [1]. Many RNA-binding proteins contain at least one RNA recognition motif (RRM, or RNP motif), which has been shown to mediate RNA binding [1-3]. The 80- to 90-amino acid RRM includes two conserved regions, an octapeptide RNP1 and a hexapeptide RNP2. Based on the sequence of these motifs, RRM-containing proteins have been grouped into classes that also share function [4]. One such class is the heterogeneous nuclear ribonucleoproteins (hnRNPs), which are defined as proteins that associate with hnRNA (pre-mRNA) in the nucleus [1-3]. Expression screening [5] of a chicken embryonic heart library with double-stranded DNA probes derived from an enhancer element of the chicken αA-crystallin gene [6] resulted in isolation of a 1250-bp cDNA (clone pF4).
The nucleotide sequence data reported in this paper have been submitted to the EMBL/GenBank Data Libraries under the accession number U14942.
* Corresponding author. Fax: +1 (301) 4020781.
0167-4781/95/$09.50 © 1995 Elsevier Science B.V. All rights reserved
SSDI 0167-4781(95)00021-6
Three different double-stranded target sequences were generated in separate plasmids by insertion of protein-binding sites DE2A, DE2B, and a consensus of DE2A and DE2B into the BamHI site of pBS-SK+. The inserted oligonucleotides in each plasmid contained four copies of the target sequences (underlined), and BamHI ends (lower case): DE2A, 5'-gatc(AGAGACCAGACTGTCATCCCCAGACTGTCAT)2; DE2B, 5'-gatc(CCCCAGGTCAGTCTCCGCAAGGTCAGTCT)2; consensus, 5'-gatc(AGAGACCAGACTGACATCCCCAGACTGACAT)2. The three radioactive probes were generated by polymerase chain reaction with standard T3 and T7 primers, and mixed for screening. A single cDNA, pF4, was obtained by this screening.

The sequence of pF4 has a single open reading frame that encodes a predicted protein of 285 amino acid residues (33.6 kDa). In vitro transcription/translation of pF4 using reticulocyte lysate generated a protein with an apparent molecular mass of 41-43 kDa (Fig. 1). Although the experimental value is higher than the predicted molecular mass, a similar aberrant migration of the human hnRNP A/B protein has been observed [7]. We call the protein encoded in pF4 chicken ribonucleoprotein 1 (CRP1).

The carboxy-terminal three-fourths of CRP1 identify this protein as a member of a subfamily of nucleic acid-binding proteins that includes the murine (called CArG box-binding factor A, CBF-A) [8] and human [7] hnRNP A/B proteins, AU-rich RNA-binding protein (AUF1) [9] and hepatitis B enhancer-binding protein (E2BP) [10]. The amino acid sequence from Gly-73 to Gln-242 of CRP1 includes two RRMs and is 90% and 89% identical to the CBF-A and hnRNP A/B proteins, respectively (Fig. 2). The C-terminal domains of CRP1 and the hnRNP A/B proteins are rich in glycine, serine, asparagine, glutamine and arginine residues and contain a putative ATP/GTP binding site (boxed, Fig. 2). The sequence homology between CRP1 and RNA-binding proteins suggests that CRP1 may bind RNA as well as DNA. The biological role of CRP1 and its recognition specificity remain to be determined.

The N-terminal sequence of CRP1 is distinct from those of the mammalian hnRNP A/B proteins. Whereas the murine and human proteins share regions of identity in the N-terminal domain, amino acids 1 to 72 of CRP1 show no relation to other members of the hnRNP families, or to other sequences in the database. This region of CRP1 is very rich in arginine (25 of 72 residues). However, the N-terminal domains of all three proteins are rich in glycine, serine and threonine, suggesting that the function of the N-terminal domain may be conserved. The overall sequence similarity suggests that CRP1 is the chicken homolog of the mammalian hnRNP A/B proteins. This relationship is supported by conservation of 190 bp of the crp1 3' untranslated region (UTR) with the mammalian hnRNP A/B 3' UTRs (not shown). Conservation in the 3' UTRs of vertebrate genes for hnRNP A1, hnRNP A2 and poly(A)-binding proteins has been noted [12].

We thank Dr. Jay Potts for his generous gift of the chicken 14-day embryonic heart cDNA expression library. We are very grateful to Dr. Peter Good for critical reading of this manuscript, and to Drs. Fatah Kashanchi and Paul Driggers for advice in screening the library.

Fig. 1. SDS-PAGE analysis of [35S]methionine-labeled proteins generated by in vitro transcription-translation using a rabbit reticulocyte lysate (Promega). Lane 1, pBS-KS+ (vector alone); lane 2, pBS-KS/F4 cDNA. The arrow indicates translated protein. The positions of molecular mass standards (kDa) are indicated. A sodium dodecylsulfate-4 to 20% polyacrylamide gradient gel was used (Novex, San Diego, USA).
[Fig. 2: sequence alignment of CRP1, CBF-A and hu A/B. Labeled regions: N-TERM, RRM1, RRM2, ATP/GTP, C-TERM; RNP1 and RNP2 are marked within the RRMs. CRP1 N-terminal row: MDTKRLRAPGSSSRPTPAGPRRLRRARRQGRQRHRERRRRREPACGGHGGGGRQPERGRRRPDQRQQQRGGR]
Fig. 2. Alignment of the chicken CRP1, murine CBF-A [8], and human hnRNP A/B [7] protein sequences. The sequences were aligned with the GCG Gap program [11]. The domains of the proteins are indicated to the right. RNP1 and RNP2 are boxed in each RRM. A putative ATP/GTP-binding site in the C-terminal domain is boxed. The accession number for CRP1 is U14942.
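The arginine content quoted in the text for the N-terminal region of CRP1 (25 of 72 residues) can be checked with a simple residue count. The Python sketch below is an illustration only, not part of the authors' analysis. It assumes that the CRP1 N-terminal row of Fig. 2, as transcribed here, is an accurate copy of residues 1 to 72; the names `CRP1_N_TERM` and `residue_fraction` are hypothetical.

```python
from collections import Counter

# Assumption: residues 1-72 of CRP1, transcribed from the CRP1 row of Fig. 2.
# The authoritative sequence is the database entry (accession U14942).
CRP1_N_TERM = (
    "MDTKRLRAPGSSSRPTPAGPRRLRRARRQGRQRHRERRRRREPACGGHGGGGRQPERGRRRPDQRQQQRGGR"
)


def residue_fraction(sequence: str, residue: str) -> float:
    """Return the fraction of positions in `sequence` occupied by `residue`."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sequence.count(residue) / len(sequence)


# Tally every residue type once, then read off arginine (one-letter code R).
composition = Counter(CRP1_N_TERM)
arginine_count = composition["R"]
arginine_fraction = residue_fraction(CRP1_N_TERM, "R")

print(f"length: {len(CRP1_N_TERM)}")
print(f"arginine: {arginine_count} ({arginine_fraction:.1%})")
print("most common residues:", composition.most_common(3))
```

For the sequence as transcribed here, the count comes to 25 arginines in 72 residues, which agrees with the value given in the text. The same `composition` tally can be used to inspect the glycine, serine and threonine content discussed for the N-terminal domains.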
References
[1] Haynes, S.R. (1992) New Biol. 4, 421-429.
[2] Birney, E., Kumar, S. and Krainer, A.R. (1993) Nucleic Acids Res. 21, 5803-5816.
[3] Burd, C.G. and Dreyfuss, G. (1994) Science 265, 615-621.
[4] Kim, Y.-J. and Baker, B.S. (1993) Mol. Cell. Biol. 13, 174-183.
[5] Singh, H., LeBowitz, J.H., Baldwin, A.S. and Sharp, P.A. (1988) Cell 52, 415-423.
[6] Klement, J.F., Cvekl, A. and Piatigorsky, J. (1993) J. Biol. Chem. 268, 6777-6784.
[7] Khan, F.A., Jaiswal, A.K. and Szer, W. (1991) FEBS Lett. 290, 159-161.
[8] Kamada, S. and Miwa, T. (1992) Gene 119, 229-236.
[9] Zhang, W., Wagner, B.J., Ehrenman, K., Schaefer, A.W., DeMaria, C.T., Crater, D., DeHaven, K., Long, L. and Brewer, G. (1993) Mol. Cell. Biol. 13, 7652-7665.
[10] Tay, N., Chan, S.-H. and Ren, E.-C. (1992) J. Virol. 66, 6841-6848.
[11] Genetics Computer Group (1991) Program Manual for the GCG Package, Madison.
[12] Good, P.J., Rebbert, M.L. and Dawid, I.B. (1993) Nucleic Acids Res. 21, 999-1006.