More Sm snRNAs from Vertebrate Cells

EXPERIMENTAL CELL RESEARCH 229, 276–281 (1996)
ARTICLE NO. 0372

YI-TAO YU, WOAN-YUH TARN, THERESE A. YARIO, AND JOAN A. STEITZ¹

Department of Molecular Biophysics and Biochemistry, Howard Hughes Medical Institute, Yale University School of Medicine, New Haven, Connecticut 06536-0812

There are a number of low-abundance small nuclear RNAs (snRNAs) in eukaryotic cells. Many of them have been assigned functions in the biogenesis of cellular RNAs, such as splicing and 3′ end processing. Here, we present the sequence of Xenopus U12 snRNA and compare the secondary structures of the low-abundance U11 and U12 with those of the high-abundance U1 and U2, respectively. The data suggest functional parallels between these two pairs of snRNAs in pre-mRNA splicing. Using a highly sensitive method, we have identified several new low-abundance snRNAs from HeLa cells. These include five U7 snRNA variants and six novel snRNAs. One of the six novel RNAs is an Sm snRNA, whereas the rest are not immunoprecipitable by either anti-Sm antibodies or anti-trimethylguanosine antibodies. The discovery of these new RNAs suggests that there may be yet more low-abundance snRNAs in the nuclei of eukaryotic cells. © 1996 Academic Press, Inc.

INTRODUCTION

The nuclei of all metazoan cells harbor a variety of small RNA molecules (chain lengths from 60 to 300 nucleotides) that play distinct roles in gene expression. The most abundant of these small nuclear RNAs (snRNAs), called U1, U2, U4, U5, and U6 (the nomenclature was originally derived from the U-richness of their sequences), are present in approximately a million copies in the nucleoplasm of mammalian cells [1]. Each is tightly associated with a number of proteins, some of which bear epitopes recognized by patient Sm autoantibodies, thereby conferring the name Sm small nuclear ribonucleoproteins (Sm snRNPs). These Sm snRNPs assemble together with non-snRNP protein factors on the introns of protein-coding gene transcripts to form the spliceosomes that excise the introns and generate mature mRNAs.

Another class of snRNPs that also contain members of the U-series of snRNAs resides in the nucleolus. These snRNAs (U3, U8, and U13 ... U60) are somewhat less abundant (10⁴–10⁵/cell) and are associated with a different set of proteins than the splicing snRNPs, including fibrillarin (Fb) [2]. The Fb snRNPs play roles in the maturation of ribosomal RNAs. Some are essential for cleavages in the multistep pathway that carves 18S, 28S, and 5.8S RNAs from the long pre-RNA transcript [2]. Many others direct 2′-O-methylation of specific residues within the 18S and 28S sequences [3].

The question of whether there might exist lower abundance members of the Sm snRNP family that participate in other nuclear RNA processing events was fueled by the discovery of mammalian U7 [4, 5, 5a]. This low-abundance snRNP (about 10³/cell) is essential for fashioning the 3′ ends of histone mRNAs. Its abundance relative to the spliceosomal snRNPs makes sense in that histone mRNAs constitute about 1% of cellular messages and an average pre-mRNA contains about 10 introns, so that roughly 1000-fold fewer processing events are required.

In 1988, a screen for other low-abundance Sm snRNPs revealed the existence of U11 and U12 [6]. They exist at about 10⁴ copies in HeLa cell nuclei. Although the physical properties of these snRNPs were characterized [7], no function was assigned to them until 1996. Based on sequence complementarity, Hall and Padgett [8] suggested that U11 and U12 might function in the removal of a rare class of metazoan introns having AU at their 5′ end and AC at their 3′ end (AT-AC in the DNA). This hypothesis was confirmed for U12 by both in vivo and in vitro experiments: splicing-defective mutations close to the branch site A residue near the 3′ end of an AT-AC intron were suppressed by compensatory mutations engineered into the 5′ end of U12 [9], and psoralen cross-links were formed involving complementary regions in U12 and a splicing substrate containing an AT-AC intron [10].

Initially, the AT-AC spliceosome was determined to contain U11, U12, and U5 snRNPs, but no U4 or U6 snRNAs were detected [10]. Circumstantial evidence that U11 participates in AT-AC intron removal came from its selective association with U12 [7] and with the AT-AC spliceosome [10]. Nonetheless, it is probably reasonable to speculate that, in the AT-AC versus the major spliceosome, U11 carries out functions of U1, just as U12 mimics the role of U2. The lack of U4 and U6 in the AT-AC spliceosome prompted Tarn and Steitz [11] to look for other snRNP components. An intensive search led to the discovery of U4atac and U6atac [11]. These snRNAs, like U11 and U12, are present at about 1/100 the abundance of their major spliceosome counterparts (U4 and U6) in the HeLa cell nucleus (<10⁴/cell).

Here, we present additional sequences of low-abundance Sm snRNAs. We have determined a Xenopus U12 sequence, which confirms the conservation of certain regions known to be important for U12 function. A screen for low-abundance human HeLa cell Sm snRNAs has uncovered the existence of U7 variants, as well as novel RNAs whose functions pose tantalizing problems for future investigation.

Data presented at a Nobel Symposium on “The Functional Organization of the Eukaryotic Cell Nucleus,” Saltsjöbaden and Stockholm, September 3–6, 1996.

¹ To whom correspondence and reprint requests should be addressed at Department of Molecular Biophysics and Biochemistry, Howard Hughes Medical Institute, Yale University School of Medicine, 295 Congress Avenue, New Haven, CT 06536-0812. Fax: (203) 624-8213.

0014-4827/96 $18.00 Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

MATERIALS AND METHODS

Preparation of Sm/TMG RNAs. Xenopus tissue culture (XTC) cells were sonicated and immunoprecipitated with monoclonal anti-trimethylguanosine (TMG) antibodies [12]. After fractionation on a 6% polyacrylamide–8 M urea gel, RNA comigrating with the 147-nt marker was excised and subjected to sequence analysis (see below). For preparation of HeLa cell RNAs, nuclear extracts (10 ml) were first immunoprecipitated [13] with monoclonal anti-Sm antibodies (Y12) [14]. The recovered RNAs were then immunoprecipitated again with TMG antibodies. The retrieved RNAs were split into two fractions. One portion (about 1/10 of the total sample) was 3′ end labeled with [³²P]Cp and T4 RNA ligase (Pharmacia) [15] and fractionated in parallel with the remainder (about 9/10) of the sample on a 6% polyacrylamide–8 M urea gel.
Regions that did not contain any known Sm snRNAs were excised from the lane that contained the unlabeled sample. RNAs subsequently eluted from these gel slices were subjected to sequence analysis.

RNA sequencing (tailing–RT-PCR–cloning–sequencing). Except for minor modifications, the procedure was as described by O'Brien and Wolin [16]. The recovered HeLa RNAs (above) were ligated to a DNA oligonucleotide (anti-T3) which contains a 3′-dA at its 3′ end (5′ TTTAGTGAGGGTTAATdA 3′) [17]. Complementary DNAs were then generated by reverse transcription using AMV reverse transcriptase (Boehringer Mannheim) and the oligodeoxynucleotide T3 (5′ ATTAACCCTCACTAAA 3′). The resulting cDNAs were poly(dA) tailed at their 3′ ends with dATP and terminal transferase (Gibco BRL). Using PCR, the cDNA sequences were amplified with the primers polydT and T3 (ATTAACCCTCACTAAA). An RNA sequence library was then constructed by insertion of the total PCR products into a SmaI-digested pGEM 3Z vector (Stratagene). After transformation, cDNA clones were isolated and sequences were determined. For Xenopus U12 analysis, the cDNA (without polydA tailing) was amplified using T3 and an oligonucleotide corresponding to nucleotides 2–23 of human U12 (Fig. 1). After sequencing, the extreme 5′ end of U12 was determined by primer extension sequencing [18].
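The logic of the tailing–RT-PCR procedure can be followed with a few string operations. The Python sketch below is illustrative only: the two oligonucleotide sequences are those given above, whereas the helper names (`reverse_complement`, `clone_insert`, `recover_rna`), the toy RNA sequence, and the poly(dA) tail length are assumptions introduced for the example. It checks that the T3 primer is the exact reverse complement of the ligated anti-T3 oligonucleotide and shows that, under these assumptions, a cloned insert read in the RNA sense should consist of a poly(T) leader, the RNA sequence, and the anti-T3 sequence.

```python
# Complement table for DNA written 5'->3' (RNA is written with T in place of U).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Oligonucleotides as printed in Materials and Methods (5'->3').
ANTI_T3 = "TTTAGTGAGGGTTAAT"  # the blocking 3'-dA is left out of the string
T3 = "ATTAACCCTCACTAAA"


def clone_insert(rna: str, tail_len: int = 12) -> str:
    """RNA-sense strand of the PCR product for one RNA (hypothetical helper).

    tail_len is an assumed poly(dA) tail length; the paper does not state one.
    """
    ligated = rna + ANTI_T3                # anti-T3 ligated to the RNA 3' end
    cdna = reverse_complement(ligated)     # reverse transcription primed by T3
    assert cdna.startswith(T3)             # the cDNA begins with the T3 primer
    tailed = cdna + "A" * tail_len         # terminal transferase adds poly(dA)
    return reverse_complement(tailed)      # strand extended from the poly(dT) primer


def recover_rna(insert: str) -> str:
    """Strip the poly(T) leader and the anti-T3 trailer from a sequenced insert."""
    assert insert.endswith(ANTI_T3)
    # Caveat: U residues at the extreme 5' end of the RNA cannot be told apart
    # from the poly(T) leader by this simple stripping.
    return insert[: -len(ANTI_T3)].lstrip("T")


toy_rna = "GCATTGCAGGCTTACG"  # made-up sequence, not a real snRNA
print(reverse_complement(ANTI_T3) == T3)           # True
print(recover_rna(clone_insert(toy_rna)) == toy_rna)  # True
```

Because both ends of every insert are defined by known sequences, the boundaries of an unknown RNA can in principle be read directly from the clone, which is the basis of the full-length claim made for the method.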


Northern blot analysis. RNA samples (either anti-Sm precipitated, anti-TMG precipitated, or total nuclear RNAs) were fractionated on 6% polyacrylamide–8 M urea gels and transferred electrophoretically onto Zeta-probe GT membranes (Bio-Rad). Using plasmids generated from the respective cDNA clones (above) as templates, antisense X8 and antisense C26 RNA probes uniformly labeled with [α-³²P]UTP were produced by in vitro transcription. After hybridization at 42°C for 16 h [18], blots were washed sequentially in 2× SSC and 0.1% SDS, then in 0.5× SSC and 0.1% SDS at room temperature (30 min for each step), and finally in 0.5× SSC and 0.1% SDS at 42°C for 15 min.

RESULTS AND DISCUSSION

snRNA Components of the AT-AC Spliceosome

In Fig. 1 the primary and secondary structures of human U11 and U12 are compared with those of human U1 and U2, respectively. Despite the significant differences in primary sequence, the overall secondary structures exhibit striking similarities between the two pairs. The Sm binding sites in U11 and U12 are both located in single-stranded regions and are almost identical to those in U1 and U2, respectively. In contrast, the proposed 5′ splice-site pairing sequence of U11 (nucleotides 4–11) [8] and the branch-site recognition sequence of U12 (nucleotides 18–24) [8, 18] are pictured in double-stranded regions (Fig. 1 and see below). Chemical and enzymatic probing of the U11 and U12 snRNP structures indicated that U11 nucleotides 4–11 are largely inaccessible [7] whereas U12 nucleotides 18–24 are available for base pairing [6].

All information gathered so far suggests that U11 substitutes for U1 in the splicing of AT-AC introns. Recently, we carried out cross-linking experiments using the P120 substrate [10] containing a single 4-thiouridine (4SU) near the 5′ splice site. As seen in the splicing of the major class of introns, where the 5′ end of U1 cross-links to the 5′ splice site at very early times [19], we observed two early cross-links when the AT-AC substrate carrying the single 4SU substitution was used (Y.-T. Yu, unpublished data). It is not yet clear whether U11 is involved in the cross-linked species and, further, whether the proposed 5′ splice-site binding sequences located in the double-stranded stem of U11 (Fig. 1) become cross-linked to the 5′ splice site. Alternatively, genetic suppression should provide a powerful tool for testing this proposal. In comparison, U12 has been much more extensively studied.
Both in vivo compensatory analyses, where changes in U12 reversed the deleterious effects of branch-site mutations [9], and in vitro psoralen cross-linking experiments [10] have demonstrated that U12 fulfills the role of U2 in the splicing of an AT-AC intron. Thus, the proposed branch-site recognition sequence of U12 indeed base pairs with the branch site of the intron.

As a complement to functional studies, we recently sequenced U12 snRNA derived from a divergent vertebrate species, Xenopus. Previously, U12 snRNAs derived from two other vertebrate species (mouse and chicken) had been analyzed [18]. All these U12 sequences can be folded into a secondary structure identical to that in human ([18], and see Fig. 1). Despite some base changes indicated in Fig. 1, the Xenopus sequence that base pairs with the branch site (underlined) [10] and the sequence that likely forms helix I with U6atac in the spliceosome (outlined letters) [11] remain unchanged.

FIG. 1. The primary and secondary structures of U1, U2, U11, and U12 snRNAs. The structures of human U1 and U2 snRNAs are adapted from Baserga and Steitz [1]; the structures of human U11 and U12 are as deduced by Montzka and Steitz [6]. The Sm binding sites in all four snRNAs are indicated by shaded boxes. The predicted 5′ splice-site interaction sequences in U1 and U11, as well as the branch-site binding sequences in U2 and U12, are underlined. Sequences in U2 and U12 that are involved in the formation of U2/U6 helix I or U12/U6atac helix I, respectively, in the active spliceosome [11, 23, 24] are designated by outlined letters. Bases altered in chicken [18], mouse [18], and/or Xenopus U12 snRNAs relative to human U12 are indicated by letters with a black square background. Arrows show the nucleotide alterations of Xenopus U12 snRNA.

An unresolved issue regarding the splicing of pre-mRNAs containing AT-AC introns is the involvement of the U5 snRNP. It has long been known that there are at least seven different U5 snRNAs in the cell nucleus [20]. Many of them are in low abundance (<10⁴ copies/cell). All known U5 snRNPs have been found in spliceosomes formed on pre-mRNAs containing GU-AG introns (the major class of introns) [20]. Are all these different U5's also involved in AT-AC intron splicing? Since all the other AT-AC spliceosomal snRNPs (U11, U12, U4atac, and U6atac) are in low abundance, it is possible that only one of the low-abundance U5 variants is selected to function in AT-AC intron splicing. More work is needed to clarify this issue.

In summary, the above comparisons have revealed remarkable structural similarities between U1 and U11 and between U2 and U12 (Fig. 1). Similarly, the U4atac/U6atac di-snRNP has previously been compared with its counterpart U4/U6 in the major spliceosome [11]; the secondary structures and the positions of conserved sequences within the structures are strikingly similar for each pair. That common structural features are shared by the two sets of spliceosomal snRNPs argues strongly that the mechanism of the two pre-mRNA splicing machines is similar, if not identical. In addition, the conservation of limited sequences between the two independent sets of spliceosomal snRNAs emphasizes the functional importance of these particular regions to the splicing process.

Novel Low-Abundance snRNAs from HeLa Cells

The success in identifying U4atac and U6atac [11], which cannot be detected by direct [³²P]Cp labeling in either a total RNA or Sm RNA preparation, prompted us to investigate whether there are additional low-abundance snRNAs in vertebrate cell nuclei. We explored this possibility for human HeLa cells. Our focus was on the Sm snRNPs. We first used anti-Sm (Y12) antibodies to immunoprecipitate from HeLa nuclear extracts the snRNPs containing proteins carrying Sm epitopes. Because most Sm snRNAs also possess a trimethylguanosine (TMG) cap at their 5′ ends, the Y12 immunoprecipitates were subjected to a further selection step with anti-TMG antibodies. RNAs recovered from these two rounds of immunoprecipitation were resolved on a denaturing polyacrylamide gel. Regions of the gel that did not contain known snRNAs were excised for further sequence analysis.

To sequence previously uncharacterized low-abundance RNAs, we used a tailing–RT-PCR–cloning–sequencing procedure (see Materials and Methods), which has a number of advantages over other available methods. First, the tailing step, which involves the addition of two known oligonucleotides to the two ends of an unknown RNA, theoretically permits determination of the full-length sequence of each RNA. Second, the method includes a cloning step, which allows generation of a library; thus, even without prepurification, many different RNA sequences can simultaneously be stored in the library. Third and most important, because there is a PCR amplification step, the method is highly sensitive. Therefore, it is extremely well suited to the detection and sequencing of low-abundance RNAs.
To date, approximately 50 insert-containing plasmids have been sequenced. Not unexpectedly, many of them turned out to be derived from breakdown products of U1, U2, U4, U11, or U5 snRNAs. We were surprised to identify many intact U7 snRNAs which have base changes in positions near the 5′ and 3′ ends and in the loop. Five of these are presented in Fig. 2. Previously, no human U7 snRNA variants had been described. Yet, there is one report describing four human U7-like genes that were assumed to be pseudogenes [21]. In that work, the differences in sequence between the U7-like genes and the published U7 RNA sequence [5] also fell mainly into the 5′ and 3′ ends and the loop region. Taken together, these results suggest that, in human, there may exist several variant U7 genes that are active. Alternatively, the base changes we found could have been introduced during PCR amplification, but we consider this possibility to be unlikely for several reasons. First, the U7 sequences we amplified are very short, and we ran only 20 cycles of PCR amplification. Thus, the frequency of PCR error should be very low. Second, in the sequences of the major snRNA breakdown products mentioned above, we found no base changes. It seems unlikely that PCR errors would be preferentially introduced into U7 snRNA. Third, PCR errors are usually generated randomly at many positions throughout the entire sequence, whereas the base changes we observed in the U7 snRNA were located exclusively near the ends and in the loop. To confirm the authenticity of these U7 variants, further investigation is underway.

FIG. 2. Sequence variants of human U7 snRNA. The published sequence and secondary structure of human U7 snRNA [5] are shown, with the Sm site indicated by a shaded box. Five clones containing U7 sequences with base alterations at positions 2, 44, 49, and/or 56 are shown (small arrows). Clone 1 also has a nucleotide deletion at position 62, indicated by Δ. Clone 4 contains a nucleotide insertion between positions 43 and 44, indicated by a large arrow.

Among the sequenced clones, there were six new RNA sequences (X8, C7, C11, C15, C26, and C27) which did not match any in the database. While X8 RNA is 155 nucleotides long, all the rest are in the size range of 60–90 nucleotides. Surprisingly, except for X8, which contains an Sm binding sequence identical to that of U7 (Fig. 2), none of the others possesses an obvious Sm binding site (Fig. 3 and data not shown). To confirm that the six sequences represent discrete snRNA molecules, we performed Northern analyses with antisense RNA probes. Two of them, X8 and C26, gave the results shown in Fig. 4. The data show that X8 RNA is anti-Sm and anti-TMG precipitable, whereas the C26 RNA is not precipitable by either anti-Sm or anti-TMG antibodies. Based on Northern analysis, the abundance of both RNAs can be estimated to be about 1/100 that of U4atac (or about 10¹ to 10² per cell). The fact that five of the sequences do not have Sm binding sites is puzzling. One interpretation for this observation is that these RNAs might be associated with Sm RNAs, similar to U6's association with U4. Because the detection method is highly sensitive, very small amounts of an associated RNA could have been amplified.

Why do these low-abundance RNAs exist in the nucleoplasm? What functional roles might they play? It is known that all cellular Sm snRNPs identified so far are involved in pre-mRNA processing. U1, U2, U4, and U5 are required for splicing of the major class of introns; U11, U12, U4atac, and U6atac are required for splicing of rare AT-AC introns; and the U7 snRNP is essential for the maturation of the 3′ end of histone mRNAs. Recently, two new Sm RNPs (X and Y) have been discovered in the nematode Ascaris [22]. Based on their copurification with spliceosomes assembled on a GU-AG pre-mRNA, it was proposed that these two RNAs may function in the splicing of the major class of introns [22]. Since X8 is an Sm snRNP, it may also be involved in some aspect of pre-mRNA processing.
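The copy numbers quoted in this paper are order-of-magnitude estimates, and their internal consistency can be checked with simple arithmetic. The Python sketch below uses only figures stated in the text (about 10⁶ copies for the major spliceosomal snRNAs, histone mRNAs as about 1% of messages, about 10 introns per pre-mRNA, and the two 1/100 ratios); the function and variable names are assumptions made for illustration, and the X8/C26 value is an upper bound because U4atac itself is given only as fewer than 10⁴ copies per cell.

```python
from fractions import Fraction

# Approximate copies per cell of the major spliceosomal snRNAs (from the text).
MAJOR_SNRNA_COPIES = 10**6


def expected_u7_copies(major=MAJOR_SNRNA_COPIES,
                       histone_fraction=Fraction(1, 100),
                       introns_per_pre_mrna=10):
    """Scale the major snRNA level by the relative number of processing events.

    One histone 3' end is processed per histone mRNA, whereas an average
    pre-mRNA requires about 10 splicing events.
    """
    return major * histone_fraction / introns_per_pre_mrna


def scaled(copies, ratio):
    """Apply an abundance ratio such as 1/100 to a copy number."""
    return copies * ratio


u7 = expected_u7_copies()                              # compare: about 10^3/cell
minor = scaled(MAJOR_SNRNA_COPIES, Fraction(1, 100))   # U11, U12, U4atac, U6atac
x8_upper = scaled(minor, Fraction(1, 100))             # X8 and C26, upper bound

print(f"U7 expected: {int(u7)} copies/cell")
print(f"Minor spliceosomal snRNAs: about {int(minor)} copies/cell")
print(f"X8/C26: at most about {int(x8_upper)} copies/cell")
```

Under these stated figures the three tiers come out at 10³, 10⁴, and at most 10² copies per cell, in line with the values given for U7, for the AT-AC spliceosomal snRNAs, and for X8 and C26.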

FIG. 4. Northern blot analyses of low-abundance RNAs derived from HeLa nuclear extracts. RNAs were fractionated on a 6% polyacrylamide–8 M urea gel, blotted, and probed for the X8 (A) or C26 (B) sequences. Antisense X8 RNA (A) and antisense C26 RNA (B), uniformly labeled with [³²P]UTP, were used as probes (see Materials and Methods). Sm, TMG, and Pre represent RNAs immunoprecipitated with anti-Sm antibodies, anti-TMG antibodies, and serum taken from a healthy male, respectively. Each RNA sample was prepared from 30 ml HeLa nuclear extract (see Materials and Methods). Total represents the total RNAs prepared from 15 ml nuclear extract. The X8 band (A) and the C26 band (B) are indicated by arrows. The numbers on the left are the sizes in nucleotides of MspI-digested pBR322 DNA fragments. Compared to the markers, both RNAs migrate slightly more slowly, probably because of unique nucleotide composition and/or modification.

Notably, X8 RNA has an Sm binding site identical to that of U7. Perhaps its function is related to that of the U7 snRNA, or X8 might be a component of yet another low-abundance spliceosome. In the case of the U11 and U12 snRNPs, their discovery [6] predated definition of their function by many years [9, 10]. This might also be true for X8 and the other non-Sm snRNAs that we have identified.

We thank J. Mermoud, A. Parker, and T. Taylor for critical reading of the manuscript. Y.-T.Y. was supported by the Cancer Research Fund of the Damon Runyon-Walter Winchell Foundation Fellowship, DRG-1353. This work was supported by a grant from the U.S. Public Health Service.

FIG. 3. Sequences and potential secondary structures of X8 and C26 snRNAs. The structures were generated by the Mulfold program. For X8, folding was performed with the constraint that the Sm binding site (covered by a shaded box) be kept unpaired. C26 was folded with no constraints. The free energy (ΔG, kcal/mol) values calculated for these two structures are: X8, −17.4; C26, −22.4.

REFERENCES

1. Baserga, S. J., and Steitz, J. A. (1993) in The RNA World (Gesteland, R. F., and Atkins, J. F., Eds.), pp. 359–381, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
2. Maxwell, E. S., and Fournier, M. J. (1995) Annu. Rev. Biochem. 64, 897–934.
3. Kiss-Laszlo, Z., Henry, Y., Bachellerie, J.-P., Caizergues-Ferrer, M., and Kiss, T. (1996) Cell 85, 1077–1088.
4. Soldati, D., and Schumperli, D. (1988) Mol. Cell. Biol. 8, 1518–1524.
5. Mowry, K. L., and Steitz, J. A. (1987) Science 238, 1682–1687.
5a. Cotten, M., Gick, O., Vasserot, A., Schaffner, G., and Birnstiel, M. L. (1988) EMBO J. 7, 801–808.
6. Montzka, K., and Steitz, J. A. (1988) Proc. Natl. Acad. Sci. USA 85, 8885–8889.
7. Montzka Wassarman, K., and Steitz, J. A. (1992) Mol. Cell. Biol. 12, 1276–1285.
8. Hall, S. L., and Padgett, R. A. (1994) J. Mol. Biol. 239, 357–365.
9. Hall, S. L., and Padgett, R. A. (1996) Science 271, 1716–1718.
10. Tarn, W.-Y., and Steitz, J. A. (1996) Cell 84, 801–811.
11. Tarn, W.-Y., and Steitz, J. A. (1996) Science 273, 1824–1832.
12. Krainer, A. R. (1988) Nucleic Acids Res. 16, 9415–9429.
13. Lerner, M. R., Boyle, J. A., Hardin, J. A., and Steitz, J. A. (1981) Science 211, 400–402.
14. Lerner, E. A., Lerner, M. R., Janeway, C. A., and Steitz, J. A. (1981) Proc. Natl. Acad. Sci. USA 78, 2737–2741.
15. England, T. E., and Uhlenbeck, O. C. (1978) Nature (London) 275, 560–561.
16. O'Brien, C. A., and Wolin, S. L. (1994) Genes Dev. 8, 2891–2903.
17. Tessier, D. C., Brousseau, R., and Vernet, T. (1986) Anal. Biochem. 158, 171–178.
18. Tarn, W.-Y., Yario, T. A., and Steitz, J. A. (1995) RNA 1, 644–656.
19. Wyatt, J. R., Sontheimer, E. J., and Steitz, J. A. (1992) Genes Dev. 6, 2542–2553.
20. Sontheimer, E. J., and Steitz, J. A. (1992) Mol. Cell. Biol. 12, 734–746.
21. Soldati, D., and Schumperli, D. (1990) Gene 95, 305–306.
22. Maroney, P. A., Yu, Y.-T., Jankowska, M., and Nilsen, T. W. (1996) RNA 2, 735–745.
23. Madhani, H. D., and Guthrie, C. (1992) Cell 71, 803–817.
24. Nilsen, T. W. (1996) Science 273, 1813.

Received September 3, 1996
