Novel structure of a human U6 snRNA pseudogene


Gene. 36 (1985) 195-199 Elsevier GENE 1352

(Recombinant DNA; plasmid pBR325; phage λ Charon vectors; placenta; genomic DNA evolution)

Hubert Theissen, Jutta Rinke, Christopher N. Traver *, Reinhard Lührmann and Bernd Appel ** Otto-Warburg-Laboratorium, Max-Planck-Institut für molekulare Genetik, Berlin (Dahlem) (F.R.G.) Tel. (30)8307354, and * Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT 06510 (U.S.A.) Tel. (203)?854585 (Received March 5th, 1985) (Revision received and accepted April 19th, 1985)

SUMMARY

A genomic DNA library containing human placental DNA cloned into phage λ Charon 4A was screened for snRNA U6 genes. In vitro 32P-labeled U6 snRNA isolated from HeLa cells was used as a hybridization probe. A positive clone containing a 4.6-kb EcoRI fragment of human chromosomal DNA was recloned into the EcoRI site of pBR325 and mapped by restriction endonuclease digestion. Restriction fragments containing U6 RNA sequences were identified by hybridization with isolated U6 [32P]RNA. The sequence analysis revealed a novel structure of a U6 RNA pseudogene, bearing two 17-nucleotide (nt)-long direct repeats of genuine U6 RNA sequences arranged in a head-to-tail fashion within the 5’ part of the molecule. Hypothetical models as to how this type of snRNA U6 pseudogene might have been generated during evolution of the human genome are presented. When compared to mammalian U6 RNA sequences the pseudogene accounts for a 77% overall sequence homology and contains the authentic 5’- and 3’-ends of the U6 RNA.

INTRODUCTION

All eukaryotic cells so far examined are known to contain a class of abundant, metabolically stable small RNAs of nuclear origin, designated snRNAs U1 to U6 (for a review see Reddy and Busch, 1983).

** To whom correspondence and reprint requests should be addressed.

Abbreviations: bp, base pair(s); EtBr, ethidium bromide; kb, kilobases or 1000 bp; nt, nucleotide(s); pCp, 3’,5’-cytidine diphosphate; ψ, pseudo; SDS, sodium dodecylsulfate; snRNA, small nuclear RNA. 0378-1119/85/$03.30 © 1985 Elsevier Science Publishers

The primary structures of the snRNAs are highly conserved throughout evolution. There have been numerous reports on the cloning of genes for human snRNAs. Functional genes have so far been found for U1 RNA (Manser and Gesteland, 1982; Monstein et al., 1982; Lund and Dahlberg, 1984) and U2 RNA (van Arsdell and Weiner, 1984; Westin et al., 1984). Besides those genes, a large number of pseudogenes for U1 and U2 RNAs have been described. Up to now only pseudogenes have been found in man for U3 RNA (Bernstein et al., 1983), U4 RNA (Hammarström et al., 1982) and U6 RNA (Hayashi, 1981). No DNA sequence has been published for U5 RNA. The snRNAs U1 to U5 are transcribed by RNA polymerase II. This was found by in vitro transcription of the U1 snRNA gene (Murphy et al., 1982) and by transcription inhibition studies of in vivo transcribed snRNAs (Chandrasekharappa et al., 1983; Savouret et al., 1984). However, in these studies the drug α-amanitin did not inhibit the transcription of U6 RNA at concentrations which inhibited the expression of RNAs U1 to U5. This finding might indicate that U6 RNA is transcribed by RNA polymerase III. The cloning of U6 RNA genes and their expression should finally give an answer on how transcription of U6 RNA is regulated.

While screening a human genomic DNA library for U6 RNA genes, we found a U6 RNA pseudogene which exhibits a remarkable structure.

Fig. 1. Map and sequence of U6 pseudogene. (A) Restriction map of plasmid pJU6 consisting of a 4.6-kb human DNA fragment inserted into the EcoRI site of pBR325. The insert contains a U6 RNA pseudogene and two Alu-sequences, whose relative locations and directions are indicated by bars. The following symbols are used: BII-BstEII; ER-EcoRI; FI-FokI; HI-HinfI; PI-PstI; RI-RsaI; SI-StuI; TI-TaqI; XI-XhoI. The dark bar (ψU6) in the expanded section of the diagram indicates the DNA fragment whose sequence is shown in Fig. 1B. (B) The DNA sequence of the noncoding strand of the U6 RNA pseudogene and the sequence of mammalian U6 snRNA are aligned for maximal sequence homology, as indicated by asterisks. Positions lacking an asterisk indicate either single nt mismatches or a 13-nt-long DNA segment differing in length and sequence from genuine U6 RNA, respectively. A gap had to be introduced between nt 17 and 18 of the RNA sequence. Two direct repeats of 17 and 18 nt, respectively, are underlined with heavy arrows. The structure at the 5’-end of U6 RNA has not yet been identified. The orientation of the sequence (B) is the reverse of that in the map (A).


EXPERIMENTAL

(a) Construction of plasmid pJU6

A human genomic DNA library in the vector λ Charon 4A (a gift of Dr. S.M. Weissman) was screened with [32P]pCp end-labeled U6 RNA from HeLa cells as a hybridization probe. The DNA library consists of about 15-kb fragments of human placental DNA partially cut with EcoRI. The screening of 50 000 plaques by the in situ plaque-hybridization technique of Benton and Davis (1977) revealed a total of four positive clones, all of which carried different chromosomal fragments of inserted human DNA. To establish whether the positive clones contained pseudogenes or true genes, the respective phage DNA was hybridized to in vivo 32P-labeled U6 RNA from HeLa cells and heteroduplexes were assayed for the degree of protection towards digestion of the U6 RNA by RNase T1 (Weiner, 1980). None of the clones protected the full length of the U6 RNA sequence (not shown), indicating that they were all pseudogenes. The clone λU6.7 showed the best protection of U6 RNA and was therefore used for a detailed analysis. Digestion of λU6.7 with EcoRI led to a 4.6-kb DNA fragment which proved to carry sequences complementary to U6 RNA. This fragment was recloned into the EcoRI site of pBR325 (Bolivar, 1978) and termed pJU6. A detailed restriction map of pJU6 is shown in Fig. 1A. To identify the location of the pseudogene, fragments of pJU6 were hybridized with [32P]pCp-labeled U6 RNA (Southern, 1975). Overlapping DNA fragments, which proved to be positive in hybridization, were sequenced according to Maxam and Gilbert (1980). The relative location and orientation of the pseudogene are shown in Fig. 1A. In vitro transcription studies with pJU6 and derivatives containing subfragments of the DNA insert, in nuclear extracts of Xenopus laevis oocytes or of HeLa cells, showed that the U6 RNA pseudogene was inactive. These experiments furthermore revealed the existence of two human repetitive sequences of the Alu-type whose relative locations and transcription directions are indicated by arrows in Fig. 1A (our unpublished results).

(b) Structure of a U6 RNA pseudogene

Fig. 1B shows the sequence of a 278-bp fragment of pJU6 containing the pseudogene and the sequence of mammalian U6 RNA, aligned in such a way as to give maximal sequence homology. The U6 RNA sequence is identical for mouse (Harada et al., 1980; Reddy and Busch, 1983), rat (Epstein et al., 1980; Reddy and Busch, 1983), and man (Rinke and Steitz, 1985). The pseudogene contains the sequences of both the 5’ and 3’ ends of genuine U6 RNA. On the basis of 25 nt differences with respect to the 107-nt-long U6 RNA, the overall sequence homology is 77%. The majority of the point mutations occur in the 3’ part of the pseudogene, while the middle part shows contiguous homology to U6 RNA with only one mismatch in a stretch of 43 nt. The 5’ part of the pseudogene contains one point mutation and a contiguous stretch of 13 nt with the sequence GTCTAAAATTGGA (DNA nt 144-156 in Fig. 1B) which has replaced the sequence UCGGCAGC at U6 RNA nt 10-17. This sequence heterology leads to a pseudogene that is 5 nt longer than genuine U6 RNA. Interestingly, 11 of these 13 nt, together with the 3’ adjacent 7 nt that are homologous to the RNA sequence, generate an 18-nt-long sequence which is nearly perfectly repeated in a head-to-tail fashion within the pseudogene.
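The figures quoted for the pseudogene (25 nt differences over the 107-nt U6 RNA, and a 13-nt stretch replacing 8 nt) can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that homology is counted as the fraction of the 107 U6 RNA positions that are unchanged, and the function and variable names are hypothetical.

```python
def percent_identity(length: int, differences: int) -> float:
    """Percentage of positions that match, given the number that differ."""
    return 100.0 * (length - differences) / length


U6_LENGTH = 107        # length of genuine U6 RNA, as stated in the text
DIFFERENCES = 25       # nt differences counted in the alignment, as stated
INSERTED_STRETCH = 13  # GTCTAAAATTGGA in the pseudogene
REPLACED_STRETCH = 8   # UCGGCAGC, U6 RNA nt 10-17

homology = percent_identity(U6_LENGTH, DIFFERENCES)
length_gain = INSERTED_STRETCH - REPLACED_STRETCH

print(f"homology: {homology:.1f}%")
print(f"pseudogene is {length_gain} nt longer than U6 RNA")
```

Under that assumption the value rounds to the 77% quoted in the text, and the length difference is the stated 5 nt.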

DISCUSSION

We have shown that the 4.6-kb human DNA insert of pJU6 contains a U6 RNA pseudogene. The screening for human U6 RNA genes has up to now only identified the structures of pseudogenes. Hayashi (1981) estimated the number of human U6 RNA gene loci to be at least 200 per haploid genome, which included a large number of pseudogenes. For human U1 RNA, Denison et al. (1981) calculated that there are at least ten times as many pseudogenes as genes. In mouse one true gene and two pseudogenes of U6 RNA were found by Ohshima et al. (1981). Multiples of ten gene loci in mouse were calculated by these authors, again most of them being pseudogenes. Denison and Weiner (1982) have proposed a classification system for pseudogenes of U1 RNA which subdivides them into four classes.

Fig. 2. Hypothetical models for the generation of a U6 RNA pseudogene during evolution of the human genome. (A) Reverse transcription model: An incomplete cDNA of U6 RNA partially melts and reprimes at an RNA site already reverse-transcribed once. (B) Double crossing-over model: two crossing-overs between two U6 DNA sequences (I) would lead to two rearranged DNA sequences, one of which is shown in II. An insertion of 4 nt (III) results in the pseudogene sequence presented in Fig. 1B. For more details see DISCUSSION.

Following this classification the published U6 RNA pseudogenes would fall into classes II and III. However, the novel structure of the U6 RNA pseudogene presented in this paper demonstrates the existence of further classes of pseudogenes. It contains both the 5’ and 3’ ends of genuine U6 RNA and, apart from several mutations at single nt positions mostly in the 3’ half of the pseudogene, there is a contiguous stretch of 13 nt near the 5’ end that differs both in sequence and length from mammalian U6 RNA. As shown in Fig. 1B this sequence alteration of the U6 RNA generates two direct repeats in a head-to-tail arrangement within the 5’ part of the pseudogene, which is a novel feature for snRNA pseudogenes. The question could be raised as to how this pseudogene was generated during evolution. Two models are proposed which would lead to the internal direct repeat structure found in the pseudogene. Model A is based on the reverse transcription of U6 RNA. As shown in Fig. 2A (line I) an incomplete cDNA of U6 RNA is reverse-transcribed up to RNA nt 22 which is located next to a proposed stable hairpin structure at the 5’ end of the RNA (Epstein et al., 1980; Harada et al., 1980). After partial remelting, the cDNA’s 5’ end reprimes at an RNA position (nt 39-42) that has already been reverse-transcribed once (line II), resulting in an internal direct repeat as indicated by arrows in line III. Since the pseudogene in pJU6 displays the genuine 5’ end of U6 RNA, it is reasonable to propose that reverse transcription is continued at RNA nt 9 which is in the loop of the hairpin structure (line II). It remains obscure why 8 nt of the U6 sequence are not reverse-transcribed, but instead two new nt (GT) are found in this part of the pseudogene.

Fig. 2B shows a hypothetical model of how the pseudogene of pJU6 might have been generated by double crossover during evolution of the human genome. U6 RNA genes from two homologous sister chromatids might have aligned themselves in such a way as to result in a shift from the correct homologous gene alignment into a position where new sequence homologies could be created. The first line in Fig. 2B shows the 5’ half of a U6 RNA pseudogene on one chromatid with two G to A transitions as they are actually found in the pJU6 pseudogene (indicated by asterisks). The lower line in panel I shows part of the homologous U6 pseudogene from the sister chromatid. On the basis of unequal sister-chromatid exchange the model proposes a double crossover between the two chromatids which would lead to two rearranged pseudogenes. One of these is shown in line II. This pseudogene has exchanged the genuine U6 RNA sequence TCGGCAGC for the sequence AAAATTGGA. As a result of this recombination, two direct repeats of 18 and 17 nt, respectively, would be generated in a head-to-tail arrangement (underlined by two arrows in line III). The sequence CTGT could have been inserted into this pseudogene sequence in a separate recombination event which would result in precisely the pseudogene structure that is presented in this paper.
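The two steps of the double-crossover model can be traced on a short stretch of sequence. The sketch below is an illustration under stated assumptions, not a reconstruction of Fig. 2B: the 40-nt DNA-sense 5’ portion of U6 RNA is taken from the commonly cited mammalian U6 sequence rather than read from the figure, the two G to A transitions are ignored, and the insertion point of CTGT (after nt 7) is a hypothetical choice, picked because it reproduces the 13-nt stretch GTCTAAAATTGGA described for the pseudogene. All names are illustrative.

```python
# Assumed DNA-sense sequence of U6 RNA nt 1-40 (not taken from Fig. 1B).
U6_5PRIME = "GTGCTCGCTTCGGCAGCACATATACTAAAATTGGAACGAT"

REPLACED = "TCGGCAGC"   # genuine U6 sequence, nt 10-17 (from the text)
ACQUIRED = "AAAATTGGA"  # sequence acquired in the exchange (from the text)
INSERT = "CTGT"         # 4-nt insertion (from the text)
INSERT_AFTER = 7        # assumed insertion point, chosen for illustration


def model_b(u6: str) -> tuple[str, str]:
    """Return the sequence after the exchange and after the 4-nt insertion."""
    # nt 10-17 in 1-based numbering are indices 9..16 in Python.
    if u6[9:17] != REPLACED:
        raise ValueError("nt 10-17 do not match the expected U6 sequence")
    exchanged = u6[:9] + ACQUIRED + u6[17:]
    inserted = exchanged[:INSERT_AFTER] + INSERT + exchanged[INSERT_AFTER:]
    return exchanged, inserted


exchanged, inserted = model_b(U6_5PRIME)
print(exchanged)
print(inserted)
print("13-nt stretch:", inserted[9:22])
print("length gain:", len(inserted) - len(U6_5PRIME))
```

With these assumptions the result is 5 nt longer than the starting sequence, carries GTCTAAAATTGGA immediately after the first 9 nt, and contains the segment CTAAAATTGGAAC twice in direct orientation, which is the core of the head-to-tail repeat discussed above.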

ACKNOWLEDGEMENTS

We thank Dr. S. Weissman for generously providing a human genomic DNA library. For her support and helpful discussions we thank Dr. J. Steitz. We are grateful to Dr. R. Brimacombe for critically reading and H. Markert for typing the manuscript. This work was supported by NIH grants CA 16038 and GM 26154 to J. Steitz and by the DFG grant Lu 294-2 to R. Lührmann.

REFERENCES

Benton, W.D. and Davis, R.W.: Screening λgt recombinant clones by hybridization to single plaques in situ. Science 196 (1977) 180-182.
Bolivar, F.: Construction and characterization of new cloning vehicles, III. Derivatives of plasmid pBR322 carrying unique EcoRI sites for selection of EcoRI-generated recombinant molecules. Gene 4 (1978) 121-136.
Chandrasekharappa, S., Smith, J.A. and Eliceiri, G.L.: Biosynthesis of small nuclear RNAs in human cells. J. Cell. Phys. 117 (1983) 169-174.
Denison, R.A., van Arsdell, S.W., Bernstein, L.B. and Weiner, A.M.: Abundant pseudogenes for small nuclear RNAs are dispersed in the human genome. Proc. Natl. Acad. Sci. USA 78 (1981) 810-814.
Denison, R.A. and Weiner, A.M.: Human U1 RNA pseudogenes may be generated by both DNA- and RNA-mediated mechanisms. Mol. Cell. Biol. 2 (1982) 815-828.
Epstein, P., Reddy, R., Henning, D. and Busch, H.: The nucleotide sequence of nuclear U6 (4.7S) RNA. J. Biol. Chem. 255 (1980) 8901-8906.
Hammarström, K., Westin, G. and Pettersson, U.: A pseudogene for human U4 RNA with a remarkable structure. EMBO J. 1 (1982) 737-739.
Harada, F., Kato, N. and Nishimura, S.: The nucleotide sequence of nuclear 4.8S RNA of mouse cells. Biochem. Biophys. Res. Commun. 95 (1980) 1332-1340.
Hayashi, K.: Organization of sequences related to U6 RNA in the human genome. Nucl. Acids Res. 9 (1981) 3379-3388.
Lerner, M.R., Andrews, N.C., Miller, G. and Steitz, J.A.: Two small RNAs encoded by Epstein-Barr virus and complexed with protein are precipitated by antibodies from patients with systemic lupus erythematosus. Proc. Natl. Acad. Sci. USA 78 (1981) 805-809.
Lund, E. and Dahlberg, J.E.: True genes for human U1 small nuclear RNA. J. Biol. Chem. 259 (1984) 2013-2021.
Manser, T. and Gesteland, R.F.: Human U1 loci: genes for human U1 RNA have dramatically similar genomic environments. Cell 29 (1982) 257-264.
Maxam, A.M. and Gilbert, W.: Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65 (1980) 499-560.
Monstein, H.-J., Westin, G., Philipson, L. and Pettersson, U.: A candidate gene for human U1 RNA. EMBO J. 1 (1982) 133-137.
Murphy, J.T., Burgess, R.R., Dahlberg, J.E. and Lund, E.: Transcription of a gene for human U1 small nuclear RNA. Cell 29 (1982) 265-274.
Ohshima, Y., Okada, N., Tani, T., Itoh, Y. and Itoh, M.: Nucleotide sequences of mouse genomic loci including a gene or pseudogene for U6 (4.8S) nuclear RNA. Nucl. Acids Res. 9 (1981) 5145-5158.
Reddy, R. and Busch, H.: Small nuclear RNAs and RNA processing, in Progress in Nucleic Acid Research and Molecular Biology, Vol. 30, 1983, pp. 127-162.
Rinke, J. and Steitz, J.A.: Association of the lupus antigen La with a subset of U6 snRNA molecules. Nucl. Acids Res. 13 (1985) 2617-2629.
Southern, E.M.: Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98 (1975) 501-517.
Savouret, J.-F., Cathala, G., Eberhardt, N.L., Miller, W.L. and Baxter, J.D.: Interaction of small nuclear ribonucleoproteins with SV40 in CV-1 cells: is U2 snRNA involved in regulating replication? DNA 3 (1984) 365-376.
Weiner, A.M.: An abundant cytoplasmic 7S RNA is complementary to the dominant interspersed middle repetitive DNA sequence family in the human genome. Cell 22 (1980) 209-218.
Westin, G., Zabielski, J., Hammarström, K., Monstein, H.-J., Bark, C. and Pettersson, U.: Clustered genes for human U2 RNA. Proc. Natl. Acad. Sci. USA 81 (1984) 3811-3815.
Van Arsdell, S.W. and Weiner, A.M.: Human genes for U2 small nuclear RNA are tandemly repeated. Mol. Cell. Biol. 4 (1984) 492-499.

Communicated by H.G. Zachau.