Gene, 154 (1995) 255-257 © 1995 Elsevier Science B.V. All rights reserved. 0378-1119/95/$09.50
GENE 08596
Primary structure of the canine U1 snRNA
(Gene structure; nucleotide sequence; splicing; transcription)
Mukesh Verma a, Matthew J. Olnes b, Rabinder N. Kurl b and Eugene A. Davidson a
a Department of Biochemistry and Molecular Biology, Georgetown University Medical Center, Washington, DC 20007, USA; and b Department of Medicine and Graduate Program in Pathobiology, Brown University School of Medicine, Providence, RI 02912, USA. Tel. (1-401) 444-8093
Received by J.A. Engler: 3 August 1994; Accepted: 15 September 1994; Received at publishers: 11 November 1994
SUMMARY
The nucleotide (nt) sequence of the canine U1 snRNA was determined. It exhibited significant homology (90-98%) with known U1 sequences. The RNA can be folded according to the secondary structure previously proposed for the U1 snRNA. It contained the conserved sequence UUACCUG in loop A (nt 6-12), required for the recognition of the 5' splice site, and the sequence UGCACU in loop B (nt 68-73), required for recognition of the U1 70K protein. The U1 snRNA was localized in the nucleus and its transcription was sensitive to α-amanitin, suggesting that it is transcribed by RNA polymerase II. Southern analysis revealed that the canine genome possesses 5-10 copies of U1 snRNA-encoding genes.
INTRODUCTION
Four snRNPs, U1, U2, U4/U6 and U5, are essential components of the spliceosome, the complex that catalyzes the splicing of precursor RNA. Base-pairing between the 5' end of U1 snRNA and the 5'-splice site is known to be required, but not sufficient, for proper recognition of the splicing substrate (Maniatis and Reed, 1987). In addition to the U1 snRNP, the U5 snRNP influences 5' splice-site selection by base pairing with regions in the exon adjacent to the 5' splice site, and U6 snRNP has been reported to associate with this region of the exon (Kohtz et al., 1994). The critical role of the U1 snRNP-pre-mRNA interaction early in spliceosome formation is highlighted by the presence of U1 in both the commitment complex and the pre-spliceosomal complexes (Ruby and Abelson, 1988). Recently, Kohtz et al. (1994) identified in vitro a human protein called alternative splicing factor. Wassarmann and Steitz (1993) have suggested a role for U1 snRNP in polyadenylation of mRNA. Seven Drosophila melanogaster genes for U1 snRNA variants and their expression during development have been described (Lo and Mount, 1990).
While studying the regulation of canine tracheal mucin gene expression, we observed that the message encoded by this gene was polydisperse (Verma and Davidson, 1993). In diseases such as cystic fibrosis, chronic bronchitis and asthma, this gene is apparently over-expressed. To understand the regulation of tracheal mucin expression at the pre-mRNA and mRNA levels, we decided to study the splicing machinery in the canine system. Therefore, we have characterized the U1 snRNA-encoding gene (U1). Here we have determined the nt sequence of the canine U1 clone, elucidated its localization in the cell, and confirmed the observed conservation of U1 snRNA coding sequences among species.
Correspondence to: Dr. M. Verma, Department of Biochemistry and Molecular Biology, 302 Basic Science Building, Georgetown University Medical Center, 3900 Reservoir Road NW, Washington, DC 20007, USA. Tel. (1-202) 687-6468; Fax (1-202) 687-7186; e-mail: mverma01@gumedlib.georgetown.edu
Abbreviations: aa, amino acid(s); bp, base pair(s); kb, kilobase(s) or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; rDNA, gene encoding rRNA; re-, recombinant; sn, small nuclear; U1, gene encoding U1 snRNA.
SSDI 0378-1119(94)00807-8
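The base pairing between the 5' end of U1 snRNA and the 5' splice site described in the Introduction can be illustrated with a short sketch. It reverse-complements a donor-site sequence and looks for the result near the 5' end of U1. Two inputs are assumptions that do not come from the Introduction: the 12-nt canine 5' end is read from the first legible blocks of the canine row in Fig. 2 (and should be checked against GenBank L33345), and `CAGGUAAGU` is a textbook-style consensus 5' splice site used only for illustration. Only Watson-Crick pairs are considered, and the function names are hypothetical.

```python
# Sketch: locate the stretch of the U1 5' end that can pair with a 5' splice site.
# Assumptions (not from the paper's text): U1_5P_END is transcribed from the
# legible start of the canine row in Fig. 2; DONOR_SITE is an illustrative
# consensus 5' splice site. Only Watson-Crick pairing is modelled.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

U1_5P_END = "CAUACUUACCUG"   # canine U1 nt 1-12 (assumed; verify against L33345)
DONOR_SITE = "CAGGUAAGU"     # hypothetical example: exon CAG | intron GUAAGU


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA string (A, C, G, U only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def pairing_span(snrna: str, target: str):
    """Return the 1-based inclusive (start, end) in snrna that pairs with target.

    Pairing is antiparallel, so the partner of target is its reverse
    complement. Returns None when no perfectly complementary stretch exists.
    """
    partner = reverse_complement(target)
    start = snrna.find(partner)
    if start < 0:
        return None
    return start + 1, start + len(partner)


if __name__ == "__main__":
    print(reverse_complement(DONOR_SITE))        # pairing partner within U1
    print(pairing_span(U1_5P_END, DONOR_SITE))   # span within the U1 5' end
```

Under these assumptions the complementary stretch covers nt 4-12 of the canine sequence, which contains the conserved UUACCUG at nt 6-12 reported in the Summary.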
EXPERIMENTAL AND DISCUSSION
(a) Cloning of the canine U1
A canine U1 clone, designated CU1-5, was isolated from a canine genomic library (Clontech, Palo Alto, CA, USA) using a 40-nt oligo based on the 5' end of the U1 coding region (nt 5-44) conserved in human and other mammalian species. A number of plaques encompassing two times the size of the dog genome were screened using standard methods (Sambrook et al., 1989). After a primary screening of the genomic library, four positive clones were identified, which were found to be identical based on DNA sequence analysis. One of the clones, CU1-5, was used for these studies. The localization of U1 snRNA in the nucleus, sensitivity to α-amanitin and gene copy number were determined as described previously (Verma and Davidson, 1994).
In order to confirm that the selected clone (CU1-5) did not hybridize with ribosomal DNA (rDNA), 2 µg of re-DNA digested with EcoRI were separated in duplicate on an agarose gel and blotted onto a nylon membrane (Boehringer-Mannheim, Indianapolis, IN, USA). One half of the membrane was hybridized with a rDNA probe (Verma and Dutta, 1987), and the other half was hybridized with the 40-nt oligo used to screen the library. No hybridization was observed with the rDNA probe, while the 40-nt oligo probe exhibited significant hybridization (data not shown), demonstrating that the canine U1 clone did not carry rRNA sequences. Furthermore, the nt sequence of the canine U1 snRNA did not show any homology with the rRNA sequence.
(b) U1 nucleotide sequence and characterization
The snRNAs U1, U2, U4 and U5 are transcribed by RNA polymerase II, based on the α-amanitin sensitivity assay, while U6 is insensitive in this assay. To determine if the canine U1 snRNA is transcribed by RNA polymerase II, RNA hybridization experiments carried out in the presence or absence of α-amanitin were performed as described previously (Verma and Davidson, 1994).
Based on this analysis, the canine U1 gene is transcribed by RNA polymerase II (data not shown). To determine the cellular localization of the canine U1 snRNA, Northern hybridization experiments using canine epithelial RNA isolated from nucleus and cytoplasm were performed as described previously (Verma and Davidson, 1994). The canine U1 snRNA was localized exclusively to the nucleus (data not shown).
The secondary structure of U1 snRNA has been previously studied by different approaches, and a model verified by phylogenetic studies has been established (Branlant et al., 1980; Mount and Steitz, 1981; Myslinski et al., 1989). As shown in Fig. 1, this model fits well with the sequence determined for canine U1 snRNA. The base substitutions and deletions observed in the DNA do not disturb the potential for secondary structure formation of the encoded RNA (Figs. 1 and 2). In canine U1 snRNA, four loops (A, B, C, D) were present and their free energy was in the range reported for other U1 snRNAs. The RNA can be folded into a secondary structure similar to that of the other mammalian U1 snRNAs (Fig. 1). It contains the conserved sequence UUACCUG in loop A, from nt 6 to 12, required for the recognition of the 5' splice site, and the sequence UGCACU in loop B, from nt 68 to 73, required for the recognition of U1-70K protein (Tazi et al., 1993).
Fig. 1. A model for the secondary structure of the canine U1 snRNA. The structure was derived using the M-FOLD program (Genetics Computer Group, Madison, WI, USA) and plotted with LoopViewer version 1.0d59 (Gilbert, 1990). Complementary base-pairs are indicated by a dash (-), and non-complementary base-pairs are indicated by a bullet (•).
Mammalian U1 snRNA has been reported to be 150-165 nt in length, whereas its size varies in plants, algae and yeast (Guthrie and Patterson, 1988). The canine U1 snRNA is 165 nt long and shows high homology with the chicken, Drosophila, human, mouse and rat U1 snRNAs (Fig. 2). In Drosophila, seven U1 genes have been reported, some of which are transcribed during late blastula-early gastrula embryogenesis whereas others are transcribed during embryogenesis (Lo and Mount, 1990).
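The dash-and-dot notation used in Fig. 2 lends itself to a simple identity calculation. The sketch below scores one aligned row against the canine reference row: a dash counts as identical, a dot in either row is treated as a gap and the column is skipped, and any letter is compared with the reference nt. The treatment of gaps is an assumption (the paper does not state how its 90-98% figures were computed), the function name is hypothetical, and the comparison row in the example is invented for illustration rather than taken from the figure.

```python
# Sketch: percent identity for a Fig. 2-style alignment row.
# Assumptions: '-' = same nt as the reference, '.' = alignment gap, and columns
# with a gap in either row are left out of the denominator.


def percent_identity(reference: str, row: str) -> float:
    """Return percent identity of row against reference over gap-free columns."""
    if len(reference) != len(row):
        raise ValueError("aligned rows must have the same length")
    compared = 0
    identical = 0
    for ref_nt, row_nt in zip(reference, row):
        if ref_nt == "." or row_nt == ".":
            continue  # gap column: not counted
        compared += 1
        if row_nt == "-" or row_nt == ref_nt:
            identical += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    canine_block = "CAUACUUACC"   # first block of the canine row as legible in Fig. 2
    made_up_row = "--G-----.-"    # hypothetical row: one substitution, one gap
    print(f"{percent_identity(canine_block, made_up_row):.1f}% identical")
```

Other conventions, such as counting gap columns as mismatches, would give lower values for the same alignment, so the convention should be stated alongside any reported percentage.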
Fig. 2. Comparison of the U1 snRNA sequence from canine and other species. The canine (C) sequence has been aligned with Drosophila melanogaster (DM), chicken (CH), mouse (M), rat (R) and human (H) U1 snRNAs. A dash (-) represents an identical nt, a dot (.) represents a gap inserted to improve sequence homology. GenBank accession No. L33345.
The Saccharomyces cerevisiae U1 snRNA sequence is different from most of the U1 RNA sequences reported (Guthrie and Patterson, 1988).
In order to estimate the number of U1 genes in the canine genome, we performed genomic reconstruction experiments (Alonso et al., 1983) in which we titrated the amount of U1 sequences contained in 10 µg of canine DNA with increasing amounts of subclone CU1-5, using the same subclone as a hybridization probe. From these experiments, we conclude that there are 5-10 copies of U1 in the canine genome.
The experiments described above were undertaken to identify molecular components likely to be involved in the formation and functioning of the pre-mRNA splicing apparatus in canine tracheal epithelial cells. Our interest is to understand the splicing events involved in mucin-encoding gene transcription. Since the canine U1 snRNA sequence reported here exhibits high homology with other U1 sequences, the data may be significant for evolutionary studies.
(c) Conclusions
(1) The nt sequence of the canine U1 snRNA was found to exhibit high homology with known U1 sequences.
(2) The U1 snRNA was localized in the nucleus and was transcribed by RNA polymerase II.
(3) The canine genome possesses 5-10 copies of U1 genes.
ACKNOWLEDGEMENTS
This work was supported by grants HL28650 and ES05303 from the National Institutes of Health to E.A.D. and R.N.K., respectively, and the Cystic Fibrosis Foundation CF6147 to M.V. M.J.O. was a recipient of a pre-doctoral fellowship from the NIH (NIEHS). We are grateful to Tania Smith, Amy Wilson and Claudia Blass for technical assistance. We are also thankful to Drs. Peter Burbello, J.P. Richardson and Ram Shukla for suggestions.
REFERENCES
Alonso, A., Jorcano, J.L., Beck, E. and Spiess, E.: Isolation and characterization of Drosophila melanogaster U2 small nuclear RNA genes. J. Mol. Biol. 169 (1983) 691-705.
Branlant, C., Krol, A., Ebel, J.P., Gallinaro, B., Lazar, E., Jacob, M., Sri-Wadada, J. and Jeanteur, P.: Nucleotide sequences of nuclear U1A RNA from chicken, rat and man. Nucleic Acids Res. 9 (1980) 841-858.
Gilbert, D.G.: Loopviewer, a Macintosh program for visualizing RNA secondary structure. Published electronically on the Internet, available via anonymous ftp to iubio.bio.indiana.edu, 1990.
Guthrie, C. and Patterson, B.: Spliceosomal snRNAs. Annu. Rev. Genet. 22 (1988) 387-419.
Kohtz, J.D., Jamison, S.F., Will, C.L., Zuo, P., Luhrmann, R., Garcia-Blanco, M.A. and Manley, J.L.: Protein-protein interactions and 5'-splice recognition in mammalian mRNA precursor. Nature 368 (1994) 119-124.
Lo, P.C.H. and Mount, S.M.: Drosophila melanogaster genes for U1 snRNA variants and their expression during development. Nucleic Acids Res. 18 (1990) 6971-6977.
Maniatis, T. and Reed, R.: The role of small nuclear ribonucleoprotein particles in pre-mRNA splicing. Nature 325 (1987) 673-678.
Mount, S.M. and Steitz, J.A.: Sequence of U1 RNA from Drosophila melanogaster: implications for U1 secondary structure and possible involvement in splicing. Nucleic Acids Res. 9 (1981) 6351-6368.
Myslinski, E., Wilhelm, F. and Branlant, C.: A structural analysis of P. polycephalum U1 RNA at the RNA and gene levels. Are there differentially expressed U1 RNA genes in P. polycephalum U1 RNA evolution? Nucleic Acids Res. 17 (1989) 1019-1034.
Ruby, S. and Abelson, J.: An early hierarchic role of U1 small nuclear ribonucleoprotein in spliceosome assembly. Science 242 (1988) 1028-1035.
Sambrook, J., Fritsch, E.F. and Maniatis, T.: Molecular Cloning. A Laboratory Manual, 2nd ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Tazi, J., Kornstadt, U., Rossi, F., Jeanteur, P., Cathala, G., Brunel, C. and Luhrmann, R.: Thiophosphorylation of U1-70K protein inhibits pre-mRNA splicing. Nature 363 (1993) 283-286.
Verma, M., Madhu, M., Marrota, C., Lakshmi, C.V. and Davidson, E.A.: Mucin coding sequences are remarkably conserved. Cancer Biochem. Biophys. 14 (1993) 41-51.
Verma, M. and Davidson, E.A.: Molecular cloning and sequencing of a canine tracheobronchial mucin cDNA containing a cysteine rich domain. Proc. Natl. Acad. Sci. USA 90 (1993) 7144-7148.
Verma, M. and Davidson, E.A.: Canine U2 snRNA gene: nucleotide sequence, characterization and implications in RNA processing and cancer biology. Cancer Biochem. Biophys. 14 (1994) 123-131.
Verma, M. and Dutta, S.K.: Phylogenetic implications of heterogeneity of the non-transcribed spacer region of rDNA repeating unit in various Neurospora and related fungal species. Curr. Genet. 11 (1987) 309-314.
Wassarmann, K.M. and Steitz, J.A.: Association with terminal exons in pre-mRNAs: a new role for the U1 snRNP? Genes Dev. 7 (1993) 647-651.