The primary structure of rat ribosomal protein S19


Biochimie (1990) 72, 299-302 © Société française de biochimie et biologie moléculaire / Elsevier, Paris

Short communication

K Suzuki, J Olvera, IG Wool*

Department of Biochemistry and Molecular Biology, University of Chicago, 920 East 58th Street, Chicago, IL 60637, USA

(Received 27 February 1990; accepted 15 March 1990)

Summary - The covalent structure of rat ribosomal protein S19 was deduced from the sequence of nucleotides in a recombinant cDNA and confirmed from the NH2-terminal amino acid sequence of the protein. Ribosomal protein S19 contains 144 amino acids (the NH2-terminal methionine is removed after translation of the mRNA) and has a molecular weight of 15944. Hybridization of the cDNA to digests of nuclear DNA suggests that there are 15-18 copies of the S19 gene. The mRNA for the protein is about 640 nucleotides in length. Rat S19 is related to Saccharomyces cerevisiae S16A and to Halobacterium marismortui S12.

recombinant DNA / nucleotide sequences / amino acid sequence comparisons / number of genes / mRNA size

Introduction

A solution of the structure of eukaryotic ribosomes is deemed important since it is believed, with cause, to be essential (if not sufficient) for a rational, molecular account of the function of the organelle in protein synthesis. A prime requirement for solving the structure is knowledge of the sequence of nucleotides and amino acids in the constituent nucleic acids and proteins. A commitment has been made to the acquisition of these data for mammalian (rat) ribosomes [1]. We report here the structure of rat ribosomal protein S19, which we have inferred from the sequence of nucleotides in a recombinant cDNA and which we have confirmed by sequencing portions of the protein.

Materials and Methods

The recombinant DNA procedures employed, the method used to determine the sequences of nucleotides, and the strategy adopted to design probes to identify cDNAs encoding rat ribosomal proteins were cited or described before [2-4]. Three probes were made, each based on a sequence in rat ribosomal protein S19: probe 1 was a mixture of 256 different oligodeoxynucleotides, each 26 nucleotides in length and related to the sequence KDVNQQEFV (residues 6-14); probe 2, a mixture of 64 different oligodeoxynucleotides, each 20 nucleotides in length, was based on the sequence KVPEWVD (residues 28-34); probe 3, a mixture of 128 different oligodeoxynucleotides, each 23 nucleotides in length, was based on the sequence KLAPYDEN (residues 43-50). The oligodeoxynucleotides were synthesized on a solid support by the methoxyphosphoramidite method using an Applied Biosystems 380B DNA synthesizer [5].

*Correspondence and reprints

Results and Discussion

The sequence of nucleotides in a cDNA encoding rat ribosomal protein S19

A random selection of 15000 colonies from two cDNA libraries of 30000 and of 20000 independent transformants was screened for clones hybridizing to three oligonucleotide probes that were synthesized to be complementary to the sequence of nucleotides predicted to be present in a portion of the mRNA for rat ribosomal protein S19. Six colonies gave a positive hybridization signal with the probes. The DNA from the plasmids of the six transformants was isolated and digested with restriction endonucleases. These clones had inserts of 0.35 to 0.55 kilobases, and Southern blot hybridization with the probes indicated that all might contain cDNA for S19. The anticipated length of the S19 coding sequence, calculated from the molecular weight of the protein [6], is 465 nucleotides. Two clones, pS19-5 and pS19-7, 0.55 and 0.35 kb in length respectively, were selected, and the sequences of nucleotides from both strands of the cDNA and overlapping sequences for each restriction site were obtained.

The cDNA insert in pS19-5 contains 520 nucleotides and includes a 5' noncoding sequence of 28 nucleotides, a single open reading frame and a 3' noncoding sequence of 54 nucleotides with a terminal poly(A) stretch (fig 1). In the other two reading frames, the sequence is interrupted by termination codons. The open reading frame of 438 nucleotides begins at an ATG codon at a position that we designate +1 and ends with a termination codon TAG at position 436; it encodes 145 amino acids (fig 1). The initiation codon occurs in the context ACGAUGC, which differs from the optimum ACCAUGG [7]. The recognition sequence, AATAAA, directing post-transcriptional cleavage-polyadenylation of the 3' end of pre-mRNA [8] is located at position 457-462, 13 nucleotides upstream of the start of a poly(A) stretch of 17 nucleotides.

The cDNA insert in pS19-7 contains 286 nucleotides. The sequence of nucleotides in pS19-7 is identical to that of pS19-5 except that it contains 6 additional nucleotides in the 5' noncoding region and lacks a good portion of the 3' part of the open reading frame (fig 1). The 5' noncoding sequence of pS19-7 has 10 consecutive pyrimidines, CCTTTCCCCT, starting at position -34 (fig 1). Sequences of pyrimidines have been reported to be present at the 5' end of many eukaryotic ribosomal protein mRNAs and may play a role in the regulation of their translation (cf [1] for references and discussion). Since the parts of pS19-5 and pS19-7 that overlap have the same nucleotide sequences, they are likely to be derived from the same gene.
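The coordinates quoted for the two inserts can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the numbers given in the text; the constant and function names are illustrative, and the 110 Da average residue mass used to reproduce the anticipated coding length is an assumption, since the text does not say how the figure of 465 nucleotides was obtained.

```python
# Coordinates quoted in the text for pS19-5 (position +1 is the A of the ATG codon).
FIVE_PRIME_NONCODING = 28
OPEN_READING_FRAME = 438        # includes the TAG termination codon
THREE_PRIME_NONCODING = 54      # includes the terminal poly(A) stretch
POLY_A_LENGTH = 17
POLY_A_SIGNAL = (457, 462)      # AATAAA
SIGNAL_TO_POLY_A_GAP = 13       # nucleotides between AATAAA and the poly(A) stretch

# Values quoted for pS19-7.
PS19_7_LENGTH = 286
PS19_7_EXTRA_FIVE_PRIME = 6


def insert_length():
    """Total length of the pS19-5 insert."""
    return FIVE_PRIME_NONCODING + OPEN_READING_FRAME + THREE_PRIME_NONCODING


def stop_codon_start():
    """First position of the termination codon within the reading frame."""
    return OPEN_READING_FRAME - 2


def encoded_residues():
    """Number of amino acids encoded (all codons except the termination codon)."""
    return OPEN_READING_FRAME // 3 - 1


def poly_a_start():
    """First position of the poly(A) stretch."""
    return POLY_A_SIGNAL[1] + SIGNAL_TO_POLY_A_GAP + 1


def ps19_7_coding_nucleotides():
    """Coding nucleotides in pS19-7, assuming the insert holds only 5' noncoding and coding sequence."""
    return PS19_7_LENGTH - (FIVE_PRIME_NONCODING + PS19_7_EXTRA_FIVE_PRIME)


def anticipated_coding_length(molecular_weight, average_residue_mass=110.0):
    """Coding length estimated from a protein mass (assumed average residue mass)."""
    return int(molecular_weight / average_residue_mass) * 3
```

Under these assumptions the poly(A) stretch ends at the last nucleotide of the 3' noncoding sequence, and the gel estimate of 17100 for the protein [6] gives the anticipated coding length of 465 nucleotides.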

-34 CCTTTCCCCTGGCTGGCAGCGCGGAGGCCGCACG -1

1 ATGCCTGGAGTTACTGTAAAAGACGTTAACCAGCAGGAGTTCGTCAGAGCTCTGGCAGCCTTCCTCAAAAAGTCTGGGAAGCTGAAAGTC 90
1 MPGVTVKDVNQQEFVRALAAFLKKSGKLKV 30

91 CCCGAATGGGTGGACACAGTCAAGTTGGCCAAACATAAAGAGCTTGCCCCATATGATGAGAACTGGTTCTACACACGAGCTGCTTCCACA 180
31 PEWVDTVKLAKHKELAPYDENWFYTRAAST 60

181 GCACGGCACCTGTACCTCCGTGGTGGTGCAGGGGTTGGTTCCATGACCAAGATCTACGGAGGACGGCAGAGAAACGGTGTCAGGCCCAGC 270
61 ARHLYLRGGAGVGSMTKIYGGRQRNGVRPS 90

271 CACTTCAGCAGAGGCTCTAAGAGTGTGGCCCGCCGGGTCCTCCAAGCCCTGGAGGGGCTGAAAATGGTGGAAAAGGACCAAGATGGGGGC 360
91 HFSRGSKSVARRVLQALEGLKMVEKDQDGG 120

361 CGCAAGCTAACACCTCAGGGACAGAGAGATCTGGACAGGATCGCNGGACAGGTGGCAGCTGCCAACAAGAAGCATTAGAACAAAGGATGN 450
121 RKLTPQGQRDLDRIAGQVAAANKKH* 145

451 TGGGTTAATAAATTGCCTCATTCAT(A)17 492

Fig 1. The sequence of nucleotides in the cDNA inserts in plasmid pS19-5 and pS19-7 and the amino acid sequence encoded in the open reading frame. The position of the nucleotides in the cDNA inserts is given above the residue; the position of amino acids in the protein derived from the nucleotide sequence is designated below the residue; the start and the termination sites for pS19-5 and pS19-7 are designated by arrows.

The primary structure of rat ribosomal protein S19

The rat ribosomal protein specified in pS19-5 was identified as S19 in the following manner. The recombinant cDNA clones pS19-5 and pS19-7 were selected using three oligodeoxynucleotide probes that were complementary to the codons for sequences near the NH2 terminus of S19. The amino acid composition inferred from the cDNA is very close to that previously derived [6] from a hydrolysate of purified S19 (table I). What is most convincing, however, is that the sequence of amino acids deduced from the sequence of nucleotides in pS19-5 corresponds to the NH2-terminal 50 residues determined directly from protein S19 by Edman degradation using an automated gas phase sequencer (data not shown).

The molecular weight of rat ribosomal protein S19, calculated from the sequence of amino acids deduced from pS19-5, is 16075. However, the NH2-terminal methionine encoded in the S19 mRNA is removed after translation since it is not found in the amino acid sequence derived from the protein. The residue next to the initial methionyl in S19 is prolyl, which has been reported to favor NH2-terminal processing [9]. Thus, the number of residues in the mature protein is 144 and the molecular weight is 15944, close to that of 17100 estimated from the migration of the purified protein in sodium dodecyl sulfate gels [6]. Protein S19 lacks cysteine and has an excess of basic residues (13 arginyl, 15 lysyl, and 4 histidyl) over acidic ones (7 aspartyl and 6 glutamyl) (table I).
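The residue counts and molecular weights quoted above can be reproduced approximately from the amino acid sequence in fig 1. The following is a minimal sketch, not the authors' calculation: the residue masses are standard average values chosen here as an assumption, so the result is expected to agree with the published figures only to within a few tens of daltons, and the names used are illustrative.

```python
from collections import Counter

# Amino acid sequence encoded in the open reading frame (fig 1), one-letter code.
S19_PRECURSOR = (
    "MPGVTVKDVNQQEFVRALAAFLKKSGKLKV"
    "PEWVDTVKLAKHKELAPYDENWFYTRAAST"
    "ARHLYLRGGAGVGSMTKIYGGRQRNGVRPS"
    "HFSRGSKSVARRVLQALEGLKMVEKDQDGG"
    "RKLTPQGQRDLDRIAGQVAAANKKH"
)

# The NH2-terminal methionine is removed after translation.
S19_MATURE = S19_PRECURSOR[1:]

# Assumed average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153


def molecular_weight(sequence):
    """Sum of residue masses plus one water for the free chain termini."""
    return sum(AVERAGE_RESIDUE_MASS[residue] for residue in sequence) + WATER_MASS


def basic_and_acidic(sequence):
    """Counts of basic (R, K, H) and acidic (D, E) residues."""
    counts = Counter(sequence)
    basic = counts["R"] + counts["K"] + counts["H"]
    acidic = counts["D"] + counts["E"]
    return basic, acidic


if __name__ == "__main__":
    print(len(S19_PRECURSOR), len(S19_MATURE))
    print(round(molecular_weight(S19_PRECURSOR)), round(molecular_weight(S19_MATURE)))
    print(basic_and_acidic(S19_MATURE))
```

Whatever mass table is used, the difference between the precursor and the mature protein is one methionyl residue (about 131), which matches the difference between the two published values.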

Table I. Amino acid composition of rat ribosomal protein S19. The amino acid composition (in numbers of residues) was determined either (A) from an analysis of a hydrolysate of purified S19 [6] or inferred (B) from the sequence of nucleotides in a recombinant cDNA.

Amino acid                      A      B
Alanine                         14     15
Arginine                        13     13
Aspartic acid and asparagine    11     7 + 4
Cysteine                        -      0
Glutamic acid and glutamine     15     6 + 8
Glycine                         15     15
Histidine                       4      4
Isoleucine                      4      2
Leucine                         12     12
Lysine                          16     15
Methionine                      2      3a
Phenylalanine                   4      4
Proline                         6      5
Serine                          7      7
Threonine                       6      6
Tryptophan                      -      2
Tyrosine                        4      4
Valine                          12     13
Residues                        145    145

a The NH2-terminal methionine is removed after translation of the mRNA.
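The agreement between the two columns of table I can be summarized programmatically. This is a small sketch using the values from the table; the three-letter keys, the use of None for residues not determined in the hydrolysate (cysteine and tryptophan), and the function names are choices made here for illustration.

```python
# Table I as (A, B): A from the hydrolysate of purified S19, B from the cDNA.
# Asx = aspartic acid + asparagine; Glx = glutamic acid + glutamine.
# None marks a residue not determined in the hydrolysate.
TABLE_I = {
    "Ala": (14, 15), "Arg": (13, 13), "Asx": (11, 7 + 4), "Cys": (None, 0),
    "Glx": (15, 6 + 8), "Gly": (15, 15), "His": (4, 4), "Ile": (4, 2),
    "Leu": (12, 12), "Lys": (16, 15), "Met": (2, 3), "Phe": (4, 4),
    "Pro": (6, 5), "Ser": (7, 7), "Thr": (6, 6), "Trp": (None, 2),
    "Tyr": (4, 4), "Val": (12, 13),
}


def column_total(index):
    """Total residues in column A (index 0) or B (index 1); undetermined counts as 0."""
    return sum(row[index] or 0 for row in TABLE_I.values())


def discrepancies():
    """Residues whose determined count differs between columns, as B minus A."""
    return {
        name: b - a
        for name, (a, b) in TABLE_I.items()
        if a is not None and a != b
    }


if __name__ == "__main__":
    print(column_total(0), column_total(1))
    print(discrepancies())
```

Both columns total 145 residues, and no determined residue differs by more than two between the hydrolysate and the cDNA-derived composition.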

The number of copies of the S19 gene

The cDNA insert in pS19-5 was made radioactive and used to probe digests of rat liver nuclear DNA made with the restriction endonucleases BamHI, EcoRI, or HindIII [3]. The number of hybridization bands suggests that there are 15-18 copies of the S19 gene (results not shown). Many other mammalian ribosomal protein genes have been found to be present in multiple copies. However, in no instance has it been shown that more than one of the genes is functional; the presumption is that the other copies are pseudogenes (cf [1] for references and discussion).

The size of the mRNA for rat ribosomal protein S19

To determine the size of the mRNA coding for S19, glyoxylated total poly(A)+ mRNA from rat liver was separated by electrophoresis and screened for hybridization bands using radioactive pS19-5 cDNA [4]. One distinct band of about 640 bases was detected (results not shown).

Comparison of the sequence of amino acids in rat S19 with that of ribosomal proteins from other species

The sequence of amino acids in rat ribosomal protein S19 was compared, using the computer programs RELATE and ALIGN [10], to more than 500 other ribosomal proteins contained in a library that we have compiled. The comparison that yielded the highest score (22.4 SD units) was with Saccharomyces cerevisiae S16A [11]. An alignment of the amino acid sequences reveals 72 identities and 13 conservative changes (isoleucine / leucine / valine, serine / threonine, glutamic acid / aspartic acid, arginine / lysine) of 142 possible matches for an ALIGN score of 47.3. It is near certain that these two proteins are homologous, ie derived from a common ancestral gene. Rat S19 is also related to Halobacterium marismortui S12 [12]; the RELATE score is 7.9 and in the alignment there are 45 identities and 9 conservative changes in 136 possible matches for an ALIGN score of 19.1. No internal duplications were found in the amino acid sequence of S19.
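The counting of identities and conservative changes can be illustrated with a short routine that applies the substitution groups listed above to a pair of already aligned sequences. This is a sketch only: it does not reproduce the RELATE or ALIGN programs or their scores, it assumes that "possible matches" means aligned positions without a gap in either sequence, and the function names and the short example strings in the usage note are hypothetical.

```python
# Conservative substitution groups named in the text.
CONSERVATIVE_GROUPS = [set("ILV"), set("ST"), set("ED"), set("RK")]


def compare_aligned(seq_a, seq_b, gap="-"):
    """Return (identities, conservative changes, possible matches) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    identities = conservative = possible = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # positions opposite a gap are not possible matches
        possible += 1
        if a == b:
            identities += 1
        elif any(a in group and b in group for group in CONSERVATIVE_GROUPS):
            conservative += 1
    return identities, conservative, possible


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


if __name__ == "__main__":
    # Counts quoted in the text: rat S19 against yeast S16A and H. marismortui S12.
    print(percent(72, 142), percent(45, 136))
```

For example, `compare_aligned("KDVNQ-EF", "RDINQAEF")` counts five identities and two conservative changes in seven possible matches. Expressed as simple ratios, the counts in the text correspond to about 51% identity with yeast S16A and about 33% with H. marismortui S12.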


The determination of the sequence of amino acids in rat S19 is a contribution to a set of data which, it is hoped, will eventually encompass the structure of all the proteins in the ribosomes of this mammalian species. The primary purpose for the accumulation of these data is to use them to arrive at a solution of the structure of the organelle. However, the information may also help in understanding the evolution of ribosomes, in unraveling the function of the proteins, in defining the rules that govern the interaction of the proteins and the rRNAs, and in uncovering the amino acid sequences that direct the proteins to the nucleolus for assembly on nascent rRNA.

Acknowledgments

This work was supported by the National Institutes of Health, Grant GM 21769. We are grateful to YL Chan for advice and fruitful discussions and to A Timosciek for the preparation of the manuscript. The sequence data will appear in the EMBL/GenBank/DDBJ Nucleotide Sequence Database under the accession number X51707.

References

1 Wool IG, Endo Y, Chan YL, Glück A (1990) Studies of the structure, function, and evolution of mammalian ribosomes. In: The Structure, Function and Evolution of Ribosomes (Hill WE, ed) American Society for Microbiology, Washington DC, (in press)
2 Chan YL, Lin A, McNally J, Wool IG (1987) The primary structure of rat ribosomal protein L5: a comparison of the sequence of amino acids in the proteins that interact with 5S rRNA. J Biol Chem 262, 12879-12886
3 Chan YL, Wool IG (1988) The primary structure of rat ribosomal protein S6. J Biol Chem 263, 2891-2896
4 Glück A, Chan YL, Lin A, Wool IG (1989) The primary structure of rat ribosomal protein S10. Eur J Biochem 182, 105-109
5 Beaucage SL, Caruthers MH (1981) Deoxynucleoside phosphoramidites - A new class of key intermediates for deoxypolynucleotide synthesis. Tetrahedron Lett 22, 1859-1862
6 Collatz E, Ulbrich N, Tsurugi K, Lightfoot HN, MacKinlay W, Lin A, Wool IG (1977) Isolation of eukaryotic ribosomal proteins. Purification and characterization of the 40 S ribosomal subunit proteins Sa, Sc, S3a, S3b, S5', S9, S10, S11, S12, S14, S15, S15', S16, S17, S18, S19, S20, S21, S26, S27' and S29. J Biol Chem 252, 9071-9080
7 Kozak M (1986) Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44, 283-292
8 Proudfoot NJ, Brownlee GG (1976) 3' Non-coding region sequences in eukaryotic messenger RNA. Nature 263, 211-214
9 Flinta C, Persson B, Jörnvall H, von Heijne G (1986) Sequence determinants of cytosolic N-terminal protein processing. Eur J Biochem 154, 193-196
10 Dayhoff MO (1978) Survey of new data and computer methods of analysis. In: Atlas of Protein Sequence and Structure (Dayhoff MO, ed), 5, suppl 3, National Biomedical Research Foundation, Washington DC, 1-8
11 Molenaar CMT, Woudt LP, Jansen AEM, Mager WH, Planta RJ, Donovan DM, Pearson NJ (1984) Structure and organization of two linked ribosomal protein genes in yeast. Nucleic Acids Res 12, 7345-7358
12 Kimura J, Arndt E, Kimura M (1987) Primary structures of three highly acidic ribosomal proteins S6, S12, and S15 from the archaebacterium Halobacterium marismortui. FEBS Lett 224, 65-70