Sequence of a chicken erythroblast mono(ADP-ribosyl)transferase-encoding gene and its upstream region


Gene, 164 (1995) 371-372 © 1995 Elsevier Science B.V. All rights reserved. 0378-1119/95/$09.50

GENE 09220

Sequence of a chicken erythroblast mono(ADP-ribosyl)transferase-encoding gene and its upstream region *

(Cloning; active site region; gene promoter; chicken repeat-1 element)

Terence Davis and Sydney Shall

Cell and Molecular Biology Laboratory, University of Sussex, Brighton BN1 9QG, UK

Received by R.W. Davies: 7 April 1995; Accepted: 20 June 1995; Received at publishers: 20 July 1995

SUMMARY

We have cloned the MADPRT gene encoding the 300-amino-acid mono(ADP-ribosyl)transferase (MADPRT) from chicken erythroblasts. The protein has homology to the rabbit and human skeletal muscle (50% identity) and two chicken heterophil (52% identity) NAD+:arginine MADPRT. The active site region is particularly conserved. The upstream region of the MADPRT gene from erythroblasts has several features characteristic of promoter sequences.

Correspondence to: Dr. T. Davis, School of Biology, University of Sussex, Brighton BN1 9QG, UK. Tel. (44-1273) 678-303; Fax (44-1273) 678-433; e-mail: [email protected]

* On request, the authors will supply detailed experimental evidence for the conclusions reached in this Brief Note.

Abbreviations: aa, amino acid(s); AEV, avian erythroleukemia virus; bp, base pair(s); kb, kilobases or 1000 bp; cDNA, DNA complementary to RNA; CR1 element, chicken repeat 1 retroviral-like element; MADPRT, mono(ADP-ribosyl)transferase(s), also used for the gene encoding MADPRT; NAD, nicotinamide adenine dinucleotide; nt, nucleotide(s).

SSDI 0378-1119(95)00504-8

Mono-ADP-ribosylation is a post-translational modification of proteins involving the transfer of an ADP-ribose moiety from NAD+ to specific aa. This modification is the mode of action of certain bacterial toxins and is involved in the regulation of cellular signalling processes (Lowery and Ludden, 1990; Williamson and Moss, 1990). MADPRT have also been isolated from many eukaryotic tissues and cells including liver, erythrocytes and muscle (Zolkiewska et al., 1994). Fig. 1 shows the sequence of the clones p411 and p1812. The sequence has a 300-aa ORF (33.5 kDa) followed by a stop codon and a polyadenylation signal. The intron shown has the consensus splice acceptor and donor sites. The deduced protein has 50-52% identity with the rabbit, human and two chicken heterophil MADPRT. This degree of similarity is the same as that between the chicken heterophil and the mammalian muscle proteins. The active site Glu of the mammalian skeletal muscle MADPRT and its adjacent aa are particularly conserved in the chicken proteins; the rabbit and human sequence is EEEVLIPP (Zolkiewska et al., 1992; Okazaki et al., 1994) and the chicken sequence is EDEVLIPP (Tsuchiya et al., 1994) (the active site Glu is underlined in the printed version). The different chicken genes give evidence for a family of these enzymes.

[Fig. 1: nucleotide sequence, numbered to nt 2639, with the deduced 300-aa sequence shown in single-letter code beneath the coding region.]

Fig. 1. The genomic and cDNA sequences have been combined (GenBank accession Nos. X83676 and X82397). The CCAAT and TATA motifs, splice acceptor and donor sequences, start and stop codons, and the polyadenylation sequence are in bold. The CR1 element homology is in italics. The Sp1 homologies are singly underlined and the Ap1 homologies doubly underlined. Methods: An AEV-transformed chicken erythroblast cDNA library in λgt10 was screened with a probe made from nt 51-300 and 684-800 of the coding sequence of the rabbit MADPRT sequence (Zolkiewska et al., 1992) and two identical clones, λ411 and λ412, were obtained. The 5' part of clone λ411 was used to screen a chicken genomic library in λGEM11 (Promega, Madison, WI, USA) and the clone λG114 was isolated. The cDNA insert from λ411 (clone p411) was sequenced (Applied Biosystems Model 373A automated sequencer). A 1.8-kb SstI fragment of the genomic clone (clone p1812) was also sequenced.

In addition to the coding region we have sequenced 900 bp upstream from the Met start codon into the promoter region (Fig. 1). This sequence has paired CCAAT (nt 452) and TATA (nt 819) motifs. There is a sequence with approx. 80% identity to a chicken repeat 1 (CR1) element at nt 76-305. The function of these repeats is unknown, but they occur frequently at the 5' ends of many genes in the chicken (Stumph et al., 1984). Four other motifs (nt 210, 256, 697 and 824) resemble Sp1-binding motifs found in the 5' region of the rat poly(ADP-ribose) polymerase-encoding gene (Potvin et al., 1993) and the mouse p12 gene (Guerin et al., 1990). Interestingly, two of these are in the CR1 homology. There are also two putative Ap1 sites (nt 463 and 516), the first of which is identical to the Ap1 consensus sequence, while the second has two mismatches (Lee et al., 1987). Whether these sequences have any significance in the MADPRT gene is unknown, but they show that this sequence does have several features characteristic of promoter sequences.
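The comparisons reported in this note (identity between aligned motifs, mismatches against a consensus, and the positions of short promoter motifs) can be illustrated with a minimal Python sketch. It is not the authors' method, which is not described here. The function names `mismatches`, `percent_identity` and `find_motif` are hypothetical helpers, and `toy_upstream` is an invented 20-nt string used only to show the mechanics, not part of the MADPRT upstream region. Only the two active-site motifs are taken from the text.

```python
import re


def mismatches(a: str, b: str) -> int:
    """Count positions at which two pre-aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions in two pre-aligned sequences."""
    if not a:
        raise ValueError("sequences must be non-empty")
    return 100.0 * (len(a) - mismatches(a, b)) / len(a)


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 1-based start positions of every exact match, overlaps included."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = f"(?={re.escape(motif.upper())})"
    return [m.start() + 1 for m in re.finditer(pattern, seq.upper())]


# Active-site motifs quoted in the text (mammalian vs. chicken).
mammalian_motif = "EEEVLIPP"
chicken_motif = "EDEVLIPP"
motif_identity = percent_identity(mammalian_motif, chicken_motif)

# Hypothetical toy sequence, NOT the MADPRT upstream region.
toy_upstream = "GGCCAATCTGACTATAAAGG"
ccaat_positions = find_motif(toy_upstream, "CCAAT")
tata_positions = find_motif(toy_upstream, "TATA")

print(f"active-site motif identity: {motif_identity:.1f}%")
print("CCAAT starts at nt", ccaat_positions, "and TATA at nt", tata_positions)
```

The two octapeptides differ at one position out of eight, which is why the active-site region stands out against the 50-52% identity of the full-length proteins. Applying `find_motif` to the deposited GenBank records would be the natural next step, but the coordinates it returns depend on how those records are numbered relative to Fig. 1.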

ACKNOWLEDGEMENTS

We would like to thank J. Moss (NIH, Bethesda, MD, USA) for giving us the sequence of the rabbit MADPRT before publication and for providing the clone, and the British Cancer Research Campaign for funds.

REFERENCES

Guerin, S.L., Pothier, F., Robidoux, S., Gosselin, P. and Parker, M.G.: Identification of a DNA binding site for the transcription factor GC2 in the promoter region of the p12 gene and repression of its positive activity by upstream regulatory elements. J. Biol. Chem. 265 (1990) 22035-22043.

Lee, W., Mitchell, P.J. and Tjian, R.: Purified transcription factor AP-1 interacts with TPA-inducible enhancer elements. Cell 49 (1987) 741-752.

Lowery, R.G. and Ludden, P.W.: Endogenous ADP ribosylation in procaryotes. In: Moss, J. and Vaughan, M. (Eds.) ADP-ribosylating Toxins and G Proteins: Insights into Signal Transduction. American Society for Microbiology, Washington, DC, 1990, pp. 458-477.

Okazaki, I.J., Zolkiewska, A., Nightingale, M.S. and Moss, J.: Immunological and structural conservation of mammalian skeletal muscle glycosylphosphatidylinositol-linked ADP-ribosyltransferases. Biochemistry 33 (1994) 12828-12836.

Potvin, F., Roy, R.J., Poirier, G.G. and Guerin, S.L.: The US-1 element from the gene encoding rat poly(ADP-ribose)polymerase binds the transcription factor Sp1. Eur. J. Biochem. 215 (1993) 73-80.

Stumph, W.E., Hodgson, C.P., Tsai, M.-J. and O'Malley, B.W.: Genomic structure and possible retroviral origin of the chicken CR1 repetitive DNA sequence family. Proc. Natl. Acad. Sci. USA 81 (1984) 6667-6671.

Tsuchiya, M., Hara, N., Yamada, K., Osago, H. and Shimoyama, M.: Cloning and expression of a cDNA for arginine-specific ADP-ribosyltransferase from chicken bone marrow cells. J. Biol. Chem. 269 (1994) 27451-27457.

Williamson, K.C. and Moss, J.: Mono-ADP-ribosyltransferases and ADP-ribosylarginine hydrolases: a mono-ADP ribosylation cycle in animal cells. In: Moss, J. and Vaughan, M. (Eds.) ADP-ribosylating Toxins and G Proteins: Insights into Signal Transduction. American Society for Microbiology, Washington, DC, 1990, pp. 493-510.

Zolkiewska, A., Okazaki, I.J. and Moss, J.: Vertebrate mono-ADP-ribosyltransferases. Mol. Cell. Biochem. 138 (1994) 107-112.

Zolkiewska, A., Nightingale, M.S. and Moss, J.: Molecular characterization of NAD:arginine ADP-ribosyltransferase from rabbit skeletal muscle. Proc. Natl. Acad. Sci. USA 89 (1992) 11352-11356.