Biochimica et Biophysica Acta 1492 (2000) 537–542
www.elsevier.com/locate/bba
Short sequence-paper
The 5S rRNA genes in Macaca fascicularis are organized in two large tandem repeats¹
Lars Riff Jensen, Sune Frederiksen *
Biochemistry Laboratory B, Department of Medical Biochemistry and Genetics, The Panum Institute, University of Copenhagen, Blegdamsvej 3, DK-2200 Copenhagen N, Denmark
Received 14 March 2000; received in revised form 8 May 2000; accepted 15 May 2000
Abstract
The 5S rRNA genes in Macaca fascicularis are organized in tandem repeats which are unusually large and complex. The tandem repeats consist of a 7.3 kb DNA fragment with two 5S rRNA genes linked to a 4.3 kb fragment with one gene. The total number of genes in the repeats is 50–100 per haploid genome. The 5S rDNA has an external promoter, the D box, in the same position relative to transcription start as the human gene but is transcribed less efficiently than a human 5S rRNA gene in a HeLa cell extract. © 2000 Elsevier Science B.V. All rights reserved.
RNA polymerase III transcribes a number of genes encoding small RNAs. They are divided into three types according to their promoter structure [1]. Type 1 contains only the 5S rRNA genes, defined by an internal promoter region which contains the A box, the C box and an intermediate element. The genes in human cells were later found to have a promoter region in the 5′-flanking sequence which stimulated transcription up to 10-fold [2]. The 5S rRNA genes belong to a multigene family which occurs in tandem repeats in mammalian organisms [3–7]. The 5S rRNA sequence is highly conserved in higher eucaryotic cells [8]. The non-transcribed spacer in the repeats is not as well conserved and has been used to study the evolution of different mouse strains [9]. In order to understand how the multigene families develop and maintain the proper genes we have studied the 5S rRNA gene organization and the presence of an external promoter in a selected number of mammalian organisms (man [2,4], mouse [5], rat [6] and guinea pig, unpublished results). In this paper we report on the structure and organization of the Macaca fascicularis 5S rRNA encoding genes. DNA was extracted from liver cells and digested with restriction enzymes for Southern blot analysis (Fig. 1). The
* Corresponding author. Fax: +45-35-32-77-32; E-mail: [email protected]
¹ GenBank accession numbers AF193580–AF193592.
0167-4781/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0167-4781(00)00139-1
BBAEXP 91432 5-7-00
Southern blot analysis was carried out using nylon membranes (Hybond N+, Amersham) and a ³²P-labeled run-off transcript of the coding sequence of a 5S rRNA gene [5]. Digestion with either ScaI (lane 1) or AccI (lane 2) gave rise to two bands at 3.0 and 4.3 kb (Fig. 1). ScaI recognizes a sequence in the coding region of the mammalian 5S rRNA genes and AccI a sequence overlapping the transcription start site, and the results therefore suggest a repeat structure consisting of 3.0 and 4.3 kb units. Measurements of band intensities (Fig. 1) indicated that there is one 3.0 kb unit for every two 4.3 kb units. The 3.0 and 4.3 kb units are physically connected, as seen from digestion with PstI, which gave rise to 7.3 and 4.3 kb bands (lane 8). Genomic DNA was digested with other restriction enzymes and 5S rDNA from isolated bands was further analyzed by restriction analysis. The repeat structures of the 7.3 and 4.3 kb repeats are shown in Fig. 2A,B. Digestion with XbaI likewise gives 7.3 and 4.3 kb fragments and the XbaI site is located about 100 bp from the PstI site (results not shown). The band pattern obtained with PvuII suggested that the 7.3 kb unit is followed by one 4.3 kb unit (Fig. 2C). The restriction maps have been confirmed by restriction analysis of the cloned 5S rDNA fragments. A repeat structure consisting of two different repeat units within one 5S rDNA tandem repeat has not previously been observed in mammalian organisms. In these organisms the genes are present in only one major repeat (human, 2.3 kb [4]; mouse, 1.6 kb [5]; rat, 1.8 kb plus a 2.5 kb pseudogene repeat [6] and hamster, 2.2 kb [3]). In Xenopus oocyte 5S rDNA, however, the 5S rDNA repeat contains a varying number of 5S rRNA pseudogenes within the gene repeat units [10,11].
Fig. 1. 5S rDNA tandem repeats in M. fascicularis. Southern blot analysis of genomic liver DNA. DNA (5 μg) was digested with ScaI (lane 1), AccI (lane 2), BamHI (lane 4), SacI (lane 6) and PstI (lane 8). Molecular markers of 1.1 and 2.2 kb were obtained by digestion of rat DNA with SacI (lane 3) [6] and human DNA with BamHI (lane 7) [4]. BstEII-digested lambda DNA served as a further marker (lane 5). The point of sample application is marked App. The restriction fragments were resolved on a 0.7% agarose gel, transferred onto a nylon membrane (Hybond N+, Amersham) and hybridized with a ³²P-labelled run-off transcript from a human 5S rRNA gene as described previously [5].
Based on molecular analysis of mitochondrial DNA it has been estimated that M. fascicularis diverged from the human lineage more than 50 million years ago [12], so during this time two very different repeat patterns have evolved. In human, the major repeat of 2.3 kb contains about 95% of the tandemly repeated 5S rRNA genes and a minor repeat of 1.6 kb only about 5% [4]. A possible explanation for a repeat consisting of two units of different sizes could be that the M. fascicularis 5S rDNA repeat is being homogenized and has been caught in an intermediate state. One more interesting difference between the human and the macaque is that the 5S rRNA genes in M. fascicularis map to chromosome 1p at three different loci [13] whereas the human genes map to chromosome 1 at only two loci [14].
The 5S rRNA genes from the macaque were cloned from the repeats in order to increase the percentage of gene-containing material and to avoid pseudogenes that may be dispersed in the genome. DNA was digested with restriction endonucleases (BamHI, SacI or PstI), separated on preparative agarose gels and the desired repeat fragments were eluted using a HSB E-51 elutor (Biometra). DNA was then ligated into a pBluescript KS vector and the transformation of Escherichia coli DH5α was carried out with Rb-chloride competent cells prepared by the procedure of Hanahan [15] or by electroporation on a Biorad Genepulser according to the manual (Biorad). Clones with 5S rDNA sequences were found by a colony
hybridization technique using a ³²P-labelled 5S RNA made by run-off transcription [5] as the probe. Screening of the M. fascicularis library resulted in 11 clones containing 5S rRNA genes (Fig. 2D). Six of the clones (pMf 1–6) were isolated as 2.3 kb BamHI fragments, two clones (pMf 7 and 8) as SacI fragments of 2.8 and 4.3 kb, respectively, and three clones as PstI fragments. Two of the PstI clones (pMf 9 and 10) were 4.3 kb fragments containing one 5S rRNA gene (Fig. 2B) and one was a 7.3 kb fragment (pMf 11) containing two 5S rRNA genes (Fig. 2A). The six 2.3 kb BamHI fragments (pMf 1–6) (Fig. 2D) originate in either the 4.3 kb fragment from the 7.3 kb repeat or in the 4.3 kb repeat (Fig. 2A,B). The 2.8 kb SacI fragment (pMf 7) (Fig. 2D) originates in the 7.3 kb repeat and contains gene 1 (Fig. 2A). The 4.3 kb SacI fragment (pMf 8) (Fig. 2D) originates in either the 4.3 kb fragment from the 7.3 kb repeat or in the 4.3 kb PstI repeat (Fig. 2A,B). The 7.3 kb PstI clone was further digested with AccI or SacI and the fragments were subcloned (pMf 11.1, 11.2 and 11.3) (Fig. 2D) in order to study the structure of the 7.3 kb repeat in more detail. The clones (pMf 1–11) were partially sequenced and three representative sequences (pMf 6, 9 and 11, gene 1) are shown in Fig. 3 and compared with a human 5S rDNA sequence. The upstream regions of the macaque 5S rRNA genes are identical up to position −103. Two clones were sequenced further upstream. One (pMf 11.1, gene 1) is located in the 3.0 kb unit of the 7.3 kb repeat and the other clone (pMf 9) is from the 4.3 kb repeat. From transcription start to position −346 the macaque sequences show a high degree of similarity and a high GC content, as also seen in the human sequences [2,7]. Upstream of position −346 there is no similarity between the pMf 9 and the pMf 11 gene 1 sequences and pMf 9 has a much lower GC content.
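The comparison of upstream regions described above rests on two simple quantities: the GC content of a stretch of sequence and the point, counted upstream from transcription start, at which two sequences stop matching. A minimal sketch of both is given below. It is an illustration only: the function names and the toy sequences are hypothetical and are not the pMf 9 or pMf 11 data, and the comparison assumes ungapped, exactly matching sequences, whereas the real clones are highly similar rather than identical and the alignment in Fig. 3 contains deletions.

```python
def gc_content(seq):
    """Fraction of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def shared_upstream_length(a, b):
    """Number of identical bases counted upstream from transcription start.

    Both sequences are written 5'->3' and end at position -1, so the
    comparison runs from the right-hand end towards the left and stops
    at the first mismatch.
    """
    n = 0
    for x, y in zip(reversed(a.upper()), reversed(b.upper())):
        if x != y:
            break
        n += 1
    return n


# Toy upstream sequences (hypothetical, NOT taken from the GenBank entries):
# a shared GC-rich proximal part preceded by unrelated distal parts.
toy_a = "ATTATAATTA" + "GCGGCCGCGG"
toy_b = "CGGCAGGCTG" + "GCGGCCGCGG"

if __name__ == "__main__":
    k = shared_upstream_length(toy_a, toy_b)
    print("sequences agree up to position", -k)
    print("GC content upstream of the break:",
          gc_content(toy_a[:-k]), gc_content(toy_b[:-k]))
```

Applied to real data, the same idea would have to run on aligned sequences and tolerate single-base substitutions, for example by scoring similarity in a sliding window instead of stopping at the first mismatch.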
These results suggest that a strand break event has occurred between positions −347 and −346 since human and M. fascicularis diverged, resulting in two different repeats in M. fascicularis. Most likely a 3.0 kb unit has been removed from the 7.3 kb repeat, giving rise to a new 4.3 kb repeat, and from Fig. 1 (lanes 1 and 2) it is seen that the intensities of the bands correspond to two 4.3 kb units for each 3.0 kb unit. Clones pMf 2, 4, 5, 8, 9, 10 and 11, gene 2, have the same 5′-flanking and 3′-flanking sequences (pMf 9 shown in Fig. 3). Clones pMf 7 and 11, gene 1, have the same 5′- and 3′-flanking sequences (pMf 11, gene 1, shown in Fig. 3). Three of the BamHI clones (pMf 1, 3 and 6) have the same 5′-flanking region as the other macaque clones but a 3′-flanking sequence which differs from the other clones. This 3′-flanking sequence is similar to that in pMf 9 (Fig. 3) except for one 19 bp and one 39 bp deletion. Clone pMf 10 (AF193589) is a gene variant with an extra T nucleotide inserted between positions 47 and 48 in the coding region, in exactly the same position as in a human gene variant, pHU5S2 (ACX 71798), isolated previously [2]. The sequence downstream of the back-up termination site in
M. fascicularis clones is very rich in pyrimidines, in a pattern that consists almost entirely of one or two C's followed by one to three T's. The insertions/deletions shown in the 3′-flanking sequences (Fig. 3) could very likely have been introduced by unequal crossing over or by slipped strand mispairing. The latter process introduces small repeating sequence elements and, combined with transition mutations, probably C to T [16,17], this could
give rise to the sequence pattern seen in the non-transcribed downstream spacer region (Fig. 3). The 12 bp external promoter element, the D box, is located at the same position in the macaque 5S rRNA genes (−21 to −32 bp) as in the human genes, but four single base substitutions were found between the coding region and the D box close to transcription start. ATF and Sp1 binding sites are found upstream of the D box
Fig. 2. Restriction maps of the 5S rRNA genes in M. fascicularis genomic DNA. The 7.3 kb PstI repeat contains two 5S rRNA genes (A) and the 4.3 kb PstI repeat one 5S rRNA gene (B). The relative position of the 7.3 and 4.3 kb repeats was further investigated by means of other restriction enzymes and the results are shown in (C). The relative positions of the 11 isolated clones and three subclones are shown in (D). The question mark indicates that the clone may originate in either the 4.3 kb repeat or the 4.3 kb fragment of the 7.3 kb repeat. The positions of the 5S rRNA genes are indicated by arrows and the sizes of the restriction fragments are indicated.
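The repeat organization summarized in Fig. 2 can be checked with a little arithmetic. The sketch below models one copy of the tandem array as an 11.6 kb period (a 3.0 kb unit followed by two 4.3 kb units) and derives the fragment sizes expected from an enzyme that cuts once in every gene (as ScaI does) and from an enzyme that cuts twice per period (as assumed here for PstI). The coordinates of the cut sites are hypothetical values chosen only to be consistent with the fragment sizes reported in the text; they are not positions read from the restriction maps.

```python
from collections import Counter

PERIOD = 11600  # bp; assumed super-repeat: 3.0 kb + 4.3 kb + 4.3 kb

# Hypothetical cut-site coordinates within one period (bp).
# ScaI cuts inside every 5S rRNA gene, so these also mark the genes.
SCAI_SITES = [0, 3000, 7300]
# PstI is assumed to cut twice per period, 7.3 kb and 4.3 kb apart.
PSTI_SITES = [500, 7800]


def fragments_per_period(sites, period=PERIOD):
    """Fragment lengths released from one period of an endless tandem array."""
    s = sorted(sites)
    # The last fragment wraps around to the first site of the next copy.
    return [(s[(i + 1) % len(s)] - s[i]) % period or period
            for i in range(len(s))]


def genes_per_fragment(cut_sites, gene_sites, period=PERIOD):
    """List (fragment length, number of genes inside) for each fragment."""
    s = sorted(cut_sites)
    result = []
    for i, start in enumerate(s):
        length = (s[(i + 1) % len(s)] - start) % period or period
        inside = sum(1 for g in gene_sites
                     if 0 < (g - start) % period < length)
        result.append((length, inside))
    return result


if __name__ == "__main__":
    print(Counter(fragments_per_period(SCAI_SITES)))   # 3.0 kb : 4.3 kb bands
    print(genes_per_fragment(PSTI_SITES, SCAI_SITES))  # genes per PstI fragment
```

Under these assumptions the gene-cutting enzyme yields one 3.0 kb fragment for every two 4.3 kb fragments, and the 7.3 kb and 4.3 kb PstI fragments carry two genes and one gene, respectively, which is the stoichiometry described in the text.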
Fig. 3. Alignment of three 5S rDNA sequences from M. fascicularis and one human 5S rDNA [4]. Identical bases are indicated by (-) and deletions by (*). Numbers refer to positions relative to transcription start of the 5S rRNA gene, which is written in bold. The D box positioned around bp −25 has been marked and so have the ATF binding sites (bp −323 and −196) and the Sp1 binding site (bp −228). A back-up termination site of four T's around bp +150 is underlined. pMf All means that the sequences of the three different clones are identical. The clones were sequenced on both strands using the Cycle sequence kit (Perkin Elmer), appropriate primers (Pharmacia Biotech) and ³³P-dATP or ³³P-dTTP (Amersham) to visualize the sequences on X-ray film.
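Two sequence features discussed in connection with Fig. 3 are easy to express as patterns: runs of four or more T's, which act as termination signals, and the pyrimidine-rich downstream pattern of one or two C's followed by one to three T's. The sketch below shows one way to scan for them with regular expressions. The function names and the example strings are hypothetical and serve only as an illustration; they are not the sequences of the clones.

```python
import re


def t_runs(seq, min_len=4, start_offset=1):
    """Return (position, length) for every run of at least `min_len` T's.

    Positions are 1-based; `start_offset` is the coordinate assigned to
    the first base of `seq` (1 if `seq` begins at transcription start).
    """
    pattern = re.compile(r"T{%d,}" % min_len)
    return [(m.start() + start_offset, len(m.group()))
            for m in pattern.finditer(seq.upper())]


def is_ct_pattern(seq):
    """True if `seq` is built only from units of 1-2 C's followed by 1-3 T's."""
    return re.fullmatch(r"(?:C{1,2}T{1,3})+", seq.upper()) is not None


if __name__ == "__main__":
    # Hypothetical example strings, not clone sequences.
    print(t_runs("ACGTTTTACGTTTTTGG"))
    print(is_ct_pattern("CTTCCTCTTT"), is_ct_pattern("CTTTT"))
```

A real downstream spacer is described as following the C/T pattern only approximately, so in practice one would report the fraction of the region covered by matches rather than require a full match.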
in macaque 5S rDNA but not at identical positions in the different clones (Fig. 3). The RNA polymerase III-transcribed EBER1 and EBER2 genes contain Sp1 and ATF sites which influence transcription in vivo but not in vitro [18]. The coding sequence of the gene is the same in human and M. fascicularis clones but the termination sequence contains only four T's in M. fascicularis instead of five. This seemed to result in less accurate termination since about 5% of the transcripts read through the first four T's and stopped at the back-up termination sequence at position +150 (Figs. 3 and 4). A back-up termination signal has also been found in Xenopus [19] and in the mammalian genome [3–6,20]. Immediately downstream of this back-up termination sequence the similarity between the human and M. fascicularis sequences vanishes and the human sequence further downstream of position +200 is not shown.
Fig. 4. In vitro transcription of cloned 5S rRNA genes. The assay contained, in a final volume of 20 μl, 0.5 μg cloned 5S rDNA in a pBluescript KS vector and 10 μl HeLa S-100 extract. The assay conditions were as described previously [2]. The incorporation of radioactive UTP was measured with a Phosphor Imager. The sources of 5S rRNA genes were: human 5S rDNA, clone pHU5S3.1 [2] (lane 1); mouse 5S rDNA, clone pMO5S1.1 [5] (lane 2); M. fascicularis DNA, clone pMf7 (lane 3); M. fascicularis DNA, clone pMf9 (lane 4) and a control without DNA (lane 5).
Transcriptional activity of the 11 isolated M. fascicularis clones was measured in a HeLa S-100 extract as described previously [2]. At comparable gene doses no difference in transcriptional activity was observed among the clones. The 210 nucleotide internal standard is a T7 RNA polymerase run-off transcript and this was added to the reaction mixture in order to correct for possible loss of material during later extractions. The transcriptional efficiency of the human clone, pHU5S3.1 (Fig. 4, lane 1), was set to 100%. The transcriptional efficiency of mouse 5S rDNA, pMO5S1.1, was found to be 50% (Fig. 4, lane 2) [5]. The 2.8 kb clone, pMf 7 (SacI fragment) (Fig. 4, lane 3), and the 4.3 kb clone, pMf 9 (PstI fragment) (lane 4), were transcribed with 60 to 75% efficiency relative to the human gene. Similar efficiencies were obtained with the other macaque clones. The results obtained with the pMf 7 clone (2.8 kb SacI fragment) and the pMf 9 clone (4.3 kb PstI fragment) are shown in Fig. 4 because they represent a gene from the 7.3 kb repeat (gene 1) and from the 4.3 kb repeat, respectively. Transcription of the macaque genes resulted in two bands, one with a normal size 5S rRNA and one RNA of approximately 150 nucleotides corresponding to termination at the back-up termination site (Fig. 4, lanes 3 and 4). Transcription was carried out with ³²P-UTP and therefore the 150 nucleotide
molecule contains about two times the radioactivity found in the 121 nucleotide 5S rRNA, which has been corrected for when transcriptional efficiency is calculated. The amount of read-through RNA (150 nt) is about 5% for both M. fascicularis clones. It was an intriguing finding that the macaque 5S rRNA genes were transcribed less efficiently than the human genes, because the macaque and human 5S rRNA genes contain the same coding sequence and the same D box sequence, and the position of the D box is the same in the two species. Fig. 5 shows a comparison of upstream 5S rDNA sequences from different mammalian species. The 5′-flanking sequences up to bp −267 are very similar in human and macaque 5S rDNA. Only a few base substitutions are present, the GC content is high and the sequence contains no repeating motifs. The lower transcriptional activity of the macaque genes must therefore be due to other sequence differences. Sequences downstream of the gene have not been investigated in mammalian cells for their influence on transcription of 5S rDNA and it is possible that these sequences, combined with only four T's in the M. fascicularis termination signal, were responsible for the lower transcriptional activity. Interestingly, a recent paper has demonstrated that in the Xenopus somatic 5S rRNA gene the sequence from the gene to a position about +160 binds a protein which enhances transcription [21]. To investigate the possible influence of the 3′-flank on transcription, a hybrid clone was constructed. ScaI cuts in the middle of the coding region, and the human clone pHU5S3 [2] and the macaque clone pMf 8 (a SacI 4.3 kb fragment containing 144 bp upstream of transcription start) were cut with ScaI and fractionated on an agarose gel. The fragments with the human downstream region of the gene and the M. fascicularis upstream region of the gene were ligated and transformed, and the hybrid gene was purified before transcription in a HeLa extract was carried out.
The read-through band of 150 nucleotides disappeared but the transcriptional efficiency was not improved (results not shown). This indicated that the efficiency might be influenced by sequences upstream of the coding region, possibly the four substitutions between the D box and transcription start, because in vitro transcription is not influenced by sequences upstream of the D box [2].
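The efficiency figures quoted above involve three corrections: normalization to the recovered internal standard, division by a labelling factor because the 150 nucleotide read-through RNA carries about twice the radioactivity of the 121 nucleotide 5S rRNA, and expression relative to the human clone set to 100%. A minimal sketch of such a calculation follows. It is an assumption-laden illustration: the band counts are invented numbers, the labelling factors of 1 and 2 simply encode the "about two times" statement in the text, and summing both transcripts into the efficiency is a choice made here, not a procedure documented in the paper.

```python
# Relative radioactivity per molecule (assumption taken from the text:
# the read-through RNA carries about twice the label of 5S rRNA).
LABEL_FACTOR = {"5S": 1.0, "readthrough": 2.0}


def normalized_transcripts(lane):
    """Relative molar amount of each transcript in one lane.

    Each band is divided by its labelling factor and by the signal of the
    internal standard recovered in the same lane.
    """
    return {band: lane[band] / LABEL_FACTOR[band] / lane["standard"]
            for band in LABEL_FACTOR}


def efficiency_percent(lane, reference):
    """Total transcript yield of `lane` relative to `reference` (= 100%)."""
    total = sum(normalized_transcripts(lane).values())
    ref_total = sum(normalized_transcripts(reference).values())
    return 100.0 * total / ref_total


def readthrough_percent(lane):
    """Share of transcripts that end at the back-up termination site."""
    t = normalized_transcripts(lane)
    return 100.0 * t["readthrough"] / (t["5S"] + t["readthrough"])


# Invented Phosphor Imager counts, for illustration only.
human = {"5S": 10000.0, "readthrough": 0.0, "standard": 1000.0}
macaque = {"5S": 5700.0, "readthrough": 600.0, "standard": 900.0}

if __name__ == "__main__":
    print(round(efficiency_percent(macaque, human), 1))
    print(round(readthrough_percent(macaque), 1))
```

With these invented counts the macaque lane comes out at roughly two thirds of the human reference with about 5% read-through, which is in the range reported in the text, but the numbers demonstrate the arithmetic and nothing more.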
Fig. 5. Position of the D box promoter in the 5′-flanking region of 5S rRNA genes from different mammalian cells. The 12 bp long D box is conserved about 25 bp from the transcription start in M. fascicularis, human [2], mouse [5], rat [6] and hamster [3]. GC boxes (double underlined) are found in M. fascicularis, mouse, rat and human. The Sp1 binding site located at about bp −40 in human is not present in M. fascicularis. Identical bases are indicated by (-) and deletions by (*).
The importance of the D box for transcription in vitro has been studied with human [2], mouse [5] and rat 5S rDNA [6]. Deletion of parts of the D box in human genes lowers the transcriptional efficiency by up to 90% [2]. The 5S rRNA genes from mouse and rat are transcribed with only 50% efficiency and deletion of the D box almost abolished transcription [5,6]. In these two rodents the D box is located 1 bp further upstream when compared with the human gene. This is most likely the reason for the decreased activity since the sequence between the D box and gene start differs in 13 positions (mouse) and 14 positions (rat) compared with the human sequence and the mouse and rat sequences differ in three positions. The presence of a D box in 11 different clones isolated from M. fascicularis cells is therefore strong evidence for the importance of this external promoter in the transcription of 5S rRNA genes. The GC-rich box of seven bases containing an Sp1-like sequence is likewise conserved in the mammalian clones shown in Fig. 5.
Finally, the copy number of 5S rRNA genes in the M. fascicularis genome and repeats was determined. This was done in a dot-spot apparatus essentially as described in [4] and with the same probe used for Southern blotting as described above. The radioactivity was measured with a Phosphor Imager (Molecular Dynamics) and the total number of 5S rRNA genes and 5S rRNA-like genes was determined to be 200 to 400 per haploid genome. The bands in Southern blots (Fig. 1) were quantified and the number of genes in the different repeats was determined. The results showed that 50 to 100 5S rRNA genes in M. fascicularis are present in repeats, compared with 100 to 150 5S rRNA genes in human 5S rDNA repeats [4].
We thank Rita Jensen and Irene Jørgensen for excellent technical assistance. The work was supported by the Novo Nordisk Foundation and the Danish Natural Science Research Council.
References
[1] I.M. Willis, Eur. J. Biochem. 212 (1993) 1–11.
[2] J. Nederby Nielsen, C. Hallenberg, S. Frederiksen, P.D. Sørensen, B. Lomholt, Nucleic Acids Res. 21 (1993) 3631–3636.
[3] R.P. Hart, W.R. Folk, J. Biol. Chem. 257 (1982) 11706–11711.
[4] P.D. Sørensen, S. Frederiksen, Nucleic Acids Res. 19 (1991) 4147–4151.
[5] C. Hallenberg, J. Nederby Nielsen, S. Frederiksen, Gene 142 (1994) 291–295.
[6] S. Frederiksen, H. Cao, B. Lomholt, G. Levan, C. Hallenberg, Cytogenet. Cell Genet. 76 (1997) 101–106.
[7] R.D. Little, D.C. Braaten, Genomics 4 (1989) 376–383.
[8] M. Szymanski, M.Z. Barciszewska, J. Barciszewski, T. Specht, V.A. Erdmann, Biochim. Biophys. Acta 1350 (1997) 75–79.
[9] H. Suzuki, K. Moriwaki, S. Sakurai, Mol. Biol. Evol. 11 (1994) 704–710.
[10] L.J. Korn, Nature 295 (1982) 101–105.
[11] W. Nietfeld, M. Digweed, H. Mentzel, W. Meyerhof, M. Koster, W. Knochel, V.A. Erdmann, T. Pieler, Nucleic Acids Res. 16 (1988) 8803–8815.
[12] U. Arnason, A. Gullberg, A. Janke, J. Mol. Evol. 47 (1998) 714–727.
[13] B. Lomholt, S. Frederiksen, L.R. Jensen, K. Christensen, C. Hallenberg, Mamm. Genome 7 (1996) 451–453.
[14] B. Lomholt, S. Frederiksen, J. Nederby Nielsen, C. Hallenberg, Cytogenet. Cell Genet. 70 (1995) 76–79.
[15] D. Hanahan, J. Mol. Biol. 166 (1983) 557–580.
[16] D. Baltimore, Cell 24 (1981) 592–594.
[17] G.A. Dover, Curr. Opin. Genet. Dev. 3 (1993) 902–910.
[18] J.G. Howe, M.D. Shu, Cell 57 (1989) 825–834.
[19] L.J. Korn, D.D. Brown, Cell 15 (1978) 1145–1156.
[20] J.D. Dinman, R.B. Wickner, Genetics 141 (1995) 95–105.
[21] M.R. Sturges, M. Bartilson, L.J. Peck, Nucleic Acids Res. 27 (1999) 690–694.