42
Review
TRENDS in Plant Science
Vol.8 No.1 January 2003
Plant snoRNAs: functional evolution and new modes of gene expression
John W.S. Brown1, Manuel Echeverria2 and Liang-Hu Qu3
1 Gene Expression Programme, Scottish Crop Research Institute, Invergowrie, Dundee DD2 5DA, UK
2 Laboratoire Génome et Développement des Plantes, UMR CNRS 5096, Université de Perpignan, 66860 Perpignan, France
3 Key Laboratory of Gene Engineering of the Ministry of Education, Biotechnology Research Centre, Zhongshan University, Guangzhou 510275, People’s Republic of China
Small nucleolar RNAs (snoRNAs) are a well-characterized family of non-coding RNAs whose main function is rRNA modification. The diversity and complexity of this gene family continues to expand with the discovery of snoRNAs with non-rRNA or unknown targets. Plants contain more snoRNAs than other eukaryotes and have developed novel expression and processing strategies. The increased number of modifications, which will influence ribosome function, and the novel modes of expression might reflect the environmental conditions to which plants are exposed. Polyploidy and chromosomal rearrangements have generated multiple copies of snoRNA genes, allowing the generation of new snoRNAs for selection. The large snoRNA family in plants is an ideal model for investigation of mechanisms of evolution of gene families in plants.

Small nucleolar RNAs are one of an ever-increasing number of families of small RNAs involved in RNA metabolism and gene expression in eukaryotes. Their major function is in processing and modification of ribosomal RNA (rRNA). Plant snoRNA genes are distinct from those of animals and yeast in their organization and expression. Recent genomic analyses from Arabidopsis and rice highlight their unique and novel gene organization, and provide insights into gene and genome evolution in plants with unexpected differences between dicots and monocots.

SnoRNP structure and function
In plants, as in all eukaryotes, the 5.8S, 18S and 25/26S rRNAs of the cytoplasmic ribosomes are produced by processing a precursor rRNA (pre-rRNA) in the nucleolus [1]. This requires endo- and exonucleolytic pre-rRNA cleavages and extensive modifications: mainly 2′-O-ribose methylation and conversion of uridine to pseudouridine (Ψ) [1–4]. During the past decade, small nucleolar RNAs (snoRNAs) have been identified that guide each of these modifications in eukaryotes and Archaea [5–11].
Corresponding author: John W.S. Brown ([email protected]).

These snoRNAs fall into two main classes: box-C/D snoRNAs, which direct 2′-O-ribose methylation (Fig. 1a), and box-H/ACA snoRNAs, which direct pseudouridylation (Fig. 1b). The guide function of snoRNAs distinguishes eukaryotes from eubacteria, in which only a few such modifications occur, each requiring a specific enzyme (Table 1). In vertebrate and yeast systems, only a few snoRNAs are essential for cell viability and most of these are required for specific cleavage of pre-rRNA and production of mature rRNA [1]. For example, the abundant U3 and U14 snoRNAs are required for 18S rRNA production. U3 and U14, like the unique RNase MRP snoRNA (which is involved in 25S–28S rRNA production), are conserved in all eukaryotes including plants [1,4,12]. In vertebrates, U8 (a box-C/D snoRNA) is implicated in 5.8S and 28S processing, but this snoRNA has not been found in yeast or Arabidopsis. Finally, in vertebrates and yeast, certain box-H/ACA snoRNAs are also required for some pre-rRNA cleavages [1], but these are not conserved and orthologues have not yet been found in plants.

In vivo, all snoRNAs associate with specific sets of nucleolar proteins to form functional small nucleolar ribonucleoprotein particles (snoRNPs) (Fig. 1). In yeast and vertebrates, box-C/D snoRNAs associate with Snu13p, Nop56p, Nop58p and Nop1p (fibrillarin in animals and plants) [1,9–11,13]. Nop1p is an essential protein required for some pre-rRNA cleavages, rRNA methylation and ribosome assembly [1]. Yeast and vertebrate box-H/ACA snoRNAs associate with Cbf5p (NAP57 in vertebrates), Gar1p, Nhp2p and Nop10p [9–11]. In these complexes, Nop1p/fibrillarin is the rRNA methylase and Cbf5/Nap57 is the rRNA pseudouridine synthase [14,15].

In plants, early biochemical data revealed more than 120 each of 2′-O-ribose methyl and Ψ residues in rRNAs [16,17]. Until recently, only a few plant snoRNAs had been identified. Completion of the Arabidopsis genome sequence allowed us to screen the genome independently and to identify 99 different box-C/D snoRNA genes, from a total of 175 genes (accounting for gene variants) [18–21] (http://www.scri.sari.ac.uk/plant_snoRNA/).
Although this list is not yet complete, plants have the most known snoRNAs among eukaryotes (Table 1). The Arabidopsis box-C/D snoRNAs are similar to their metazoan and yeast counterparts in size and structure but, in alignments, only the box-C and -D elements, and the RNA-complementary regions are conserved. One-quarter of Arabidopsis snoRNAs have two rRNA antisense elements, many of them targeting neighbouring residues or residues brought together by rRNA folding [19]. This is rarely the case for snoRNAs in
http://plants.trends.com 1360-1385/02/$ - see front matter © 2002 Elsevier Science Ltd. All rights reserved. PII: S1360-1385(02)00007-9
[Figure 1: schematic of (a) a C/D snoRNP, showing the C, C′, D′ and D boxes, the rRNA target methylated (CH3) at the fifth position from the D or D′ box, and the proteins Nop1p/fibrillarin, Nop56p, Nop58p and Snu13p; and (b) an H/ACA snoRNP, showing the H box (ANANNA), the ACA box, the target pseudouridines (Ψ), and the proteins Cbf5p/Nap57, Gar1p, Nhp2p and Nop10p.]
Fig. 1. Box-C/D and box-H/ACA guide snoRNAs and the core associated proteins. (a) The box-C/D snoRNAs have two conserved boxes, C and D, flanked by short inverted repeats at the 5′ and 3′ snoRNA ends, respectively (arrows). Adjacent to box D or to an internal box D′, there is an rRNA antisense element of 10–20 nucleotides that base-pairs to a specific region of the rRNA. The methylated nucleotide is always the fifth residue from the D or D′ box. An internal C′ box is required for guide function of the internal D′ box. All box-C/D snoRNAs associate with four core proteins that have been identified in yeast [9–11,13]. Nop1p (fibrillarin in vertebrates and Arabidopsis [23]) is probably the rRNA methylase [14]. (b) The box-H/ACA snoRNAs have an ACA motif at the 3′ end of the snoRNA and a Hinge (H) box linking two stem structures. The Ψ nucleotide is determined by an internal loop in the stems forming short snoRNA–rRNA duplexes of 4–6 bp flanking the target residue. All box-H/ACA snoRNAs associate with the four proteins identified in yeast [9–11]. Cbf5p (NAP57 in vertebrates and Arabidopsis [24]) is the Ψ synthase [15].
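The box-C/D targeting rule described in (a) can be illustrated with a short Python sketch. This is a toy model under stated assumptions, not a published prediction tool: the antisense element is assumed to lie immediately upstream of the D (or D′) box, only Watson–Crick pairs are allowed (no G–U wobble, no mismatches), and the sequences are invented for illustration. The function names `reverse_complement` and `predict_methylation_site` are hypothetical.

```python
# Toy model of box-C/D guide targeting (assumptions: perfect Watson-Crick
# pairing, antisense element directly adjacent to the D or D' box).
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def predict_methylation_site(antisense, rrna, offset=5):
    """Return the 1-based rRNA position predicted to be 2'-O-methylated.

    `antisense` is the snoRNA guide element (5'->3') that ends directly
    before the D or D' box. The rRNA nucleotide paired with the guide
    residue `offset` positions upstream of the box is the target.
    Returns None if the rRNA has no perfectly complementary site.
    """
    if len(antisense) < offset:
        raise ValueError("antisense element shorter than the offset")
    site = reverse_complement(antisense)  # rRNA site read 5'->3'
    start = rrna.find(site)               # 0-based start of the duplex
    if start == -1:
        return None
    # The guide residue `offset` nt upstream of the box pairs with the
    # `offset`-th nucleotide of the rRNA site (antiparallel duplex).
    return start + offset


# Invented example sequences (not real snoRNA or rRNA sequences).
toy_rrna = "GGAUCCAUGGCAUCGAUCGGAUUACG"
toy_guide = "AUCGAUGCCAUG"
print(predict_methylation_site(toy_guide, toy_rrna))  # 10
```

With these toy sequences the guide pairs with rRNA positions 6–17, and the residue opposite the fifth guide nucleotide upstream of the box is position 10.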
other eukaryotes but is common in Archaea [7,8]. The identification of box-H/ACA snoRNA genes by computer is precluded by their short conserved motifs and antisense rRNA elements. Thus, although Ψ residues in plants are as numerous as methylated residues [16], only two box-H/ACA snoRNA genes, snoR2 and snoR5, have so far been found through linkage to box-C/D snoRNA genes [20,22]. This gap in our knowledge is currently being addressed by the direct isolation of non-coding RNAs from Arabidopsis, including ~50 box-H/ACA snoRNAs (A. Hüttenhofer, pers. commun.).

The 99 Arabidopsis box-C/D snoRNAs are predicted to target 118 rRNA residues, consistent with the biochemical data on the high number of methylation sites in plant rRNAs [18–20]. Around two-thirds of these sites are novel to plants but, as in other systems, most are clustered on the conserved eukaryotic core and the functional rRNA domains surrounding the peptidyl-transferase centre [2,19]. However, some modified residues (e.g. 25S:U784,
targeted by snoR69) are situated on new expansion domains of the plant rRNAs, revealing that the cognate guide snoRNAs evolved along with the expansion domain [19]. In Arabidopsis, genes and expressed sequence tags encoding proteins with high homology to yeast and vertebrate snoRNP core proteins have been identified, including in particular the probable modifying enzymes, fibrillarin and NAP57 [23,24]. Nevertheless, only fibrillarin and Nop58 have been functionally characterized. In Arabidopsis, fibrillarin is encoded by two genes that produce similar proteins, each of which rescues a yeast nop1p mutant [23]. Arabidopsis Nop58 is found in the nucleolus and Cajal bodies (nuclear structures often associated with the nucleolus), and binds box-C/D snoRNAs (J.C. Gray, pers. commun.). In addition, when vertebrate snoRNAs were expressed in plant cells, they were accurately processed [25]. These results indicate that the function of these proteins in stability and accumulation of snoRNPs is conserved, and that the basic mechanisms
Table 1. rRNA modification in eukaryotes and prokaryotes

| Organism | Methylated nucleotides | Known box-C/D snoRNAs | Pseudouridines | Known box-H/ACA snoRNAs | Methylase | Pseudouridine synthase |
|---|---|---|---|---|---|---|
| Escherichia coli | 4 [2] | 0 | 10 | 0 | Specific methylases | Specific synthases |
| Archaea | 67 | 46 [7] | 4 | ND | Nop1p h [14] | ND |
| Saccharomyces cerevisiae | ~55 [2] | 41 [35] | ~45 [2] | 19 | Nop1p [1,14] | Cbf5p [19] |
| Homo sapiens | ~107 [2] | 47 | 93 | 13 | Fibrillarin [9–11] | NAP57 [9–11] |
| Arabidopsis | ~120 (a) | 97 | >100 (a) | 2 (b) | AtFib [22] | AtNAP57 [23] |

Abbreviations: ND, not determined; snoRNA, small nucleolar RNA.
(a) The number of modified residues in Arabidopsis is estimated based on data obtained in wheat and Acer [16,17].
(b) Several box-H/ACA snoRNAs have now been isolated (A. Hüttenhofer, pers. commun.).
[Figure 2: (a) schematic of snoRNA gene organization: independent genes (plant promoters with USE and TATA; metazoan promoters with DSE and PSE; yeast promoters with RAP1 and TATA), intronic genes (animal, yeast or plant), gene clusters (yeast and plant) and intronic gene clusters (plant, rice); (b) schematic of splicing and non-splicing processing pathways, in which the intron lariat or linear precursor is processed by endonucleases and exonucleases to give mature snoRNAs/snoRNPs.]
Fig. 2. Gene organization, expression and processing. (a) Genes encoding snoRNAs are transcribed as independent units or found in introns (intronic) or as gene clusters (polycistronic). Plants uniquely contain intronic polycistrons. Exons and potential transcription signals are indicated. Abbreviations: DSE, distal sequence element; PSE, proximal sequence element; USE, upstream sequence element. (b) Polycistronic and intronic pre-snoRNA transcripts are processed by either a splicing or a non-splicing pathway. Linear polycistrons require endonucleolytic cleavage followed by exonucleolytic trimming. The binding of snoRNAs by core proteins at the earliest stages of processing ultimately blocks the exonucleolytic activity to generate mature snoRNPs.
guiding rRNA methylation and pre-rRNA cleavages (which depend on box-C/D snoRNAs and fibrillarin) are conserved in plants.

Genomic organization of snoRNA genes
The genomic organization of snoRNA genes displays great diversity in different eukaryotes (Fig. 2a). In vertebrates, all guide snoRNAs are intronic, nested within introns of protein-coding genes, but some snoRNA genes (e.g. U3) are independently transcribed. Most snoRNAs in yeast are encoded by independent genes, but there are also seven intronic genes and five polycistronic snoRNA gene clusters. By contrast, in plants, most snoRNAs are found in polycistronic clusters (Figs 2a,3a) [18–20,22]. Remarkably, of the 175 Arabidopsis snoRNA genes discovered so
far, 133 are organized into 49 gene clusters (including duplicated and triplicated clusters) scattered over all five chromosomes. The clusters are composed of two to five homologous or heterologous snoRNA genes, and four are intronic (Figs 2a,3a). Such clusters have not been found in vertebrates but do occur in trypanosomes and yeast, although none is intronic. The genome of rice, the model monocot, is four times larger and more complex than the Arabidopsis genome, and its snoRNA gene content and organization are currently being analysed. To date, most known rice snoRNA genes are also polycistronic but, unlike in Arabidopsis, more than half of the rice clusters are intronic (Figs 2a,3a) [26] (J.B., unpublished). This adds to the original discovery of an intronic snoRNA gene cluster encoding four box-C/D
[Figure 3: (a) schematic of related Arabidopsis and rice gene clusters (U31, snoR4, U33, U51 and snoR5 variants, with the rice cluster in intron 1 of the hsc70 gene; snoR28 variants, with the rice cluster in intron 2 of the NADH dehydrogenase gene; and a rice cluster in intron 4 of the ribosomal protein gene L18); (b) alignments of the snoR20.1/snoR20.2 and snoR16.1/snoR16.2 variants with their target sites (18S:Um1010, 18S:Cm1011, 25S:Um2445, 25S:Um36, 25S:Um48); (c) alignment of U15 sequences (AtU15-1a, AtU15-1b, AtU15-2, MtU15-1P, MtU15-2P, LeU15, GmU15, TaU15a, TaU15b, OsU15-1P, OsU15-2a, OsU15-2b, SbU15, ZmU15) with target sites 25S:Gm2278 and 25S:Am2271; (d) organization and alignment of U55, snoR15 and U16.]
and two box-H/ACA snoRNAs in an intron of rice hsp70 [27], and points to a fundamental difference between monocots and dicots.

Polycistronic clusters, whether they are intronic or non-intronic, add further diversity to plant snoRNA gene organization and expression (Fig. 2b). Intronic snoRNA genes in animals and yeast are only found as a single snoRNA gene in any one intron, although host genes can contain many single snoRNA genes nested in different introns (Fig. 2a) [6]. An evolutionary extension of this principle is the several ‘host’ genes that harbour snoRNAs in multiple introns but whose mRNA does not code for proteins [5,11]. Most intronic snoRNA genes of vertebrates and yeast are nested in genes encoding proteins involved in ribosome biogenesis. This particular organization suggested coordinate expression of components implicated in the same cellular processes. This link was also seen with some of the rare intronic snoRNAs in Arabidopsis, such as U60 (snoR60) in fibrillarin genes [23], and three of the rice intronic snoRNA gene clusters were present in the largest introns of ribosomal protein genes (Fig. 3a). Thus, plants might have evolved the operon-like gene clusters, both intronic and non-intronic, for coordinated expression, primarily of the snoRNA genes but also of some ribosomal and nucleolar proteins.

Transcription and processing of snoRNAs
The different modes of expression of snoRNAs used by eukaryotes have consequences for transcript processing [6,12,20]. Vertebrates, plants and yeast all contain independently transcribed snoRNA genes that represent an archetypal gene structure, with the RNA coding sequence flanked by promoter, enhancer and termination sequences (Fig. 2a). For example, U3 and RNase MRP exist as independent transcription units in all eukaryotes.
In animals and yeast, U3 is transcribed from RNA polymerase II (PolII) promoters, whereas plant U3 is transcribed from an RNA polymerase III (PolIII) promoter, consisting of an upstream sequence element and TATA box, first characterized for spliceosomal small nuclear RNA (snRNA) genes. This important change of polymerase specificity in evolution revealed the structural relatedness of the PolII and PolIII promoters in plants [28]. Yeast independent and polycistronic snoRNAs are transcribed from promoters containing a TATA box and binding sites for Rap1p, a transcription factor that is also involved in ribosomal-protein gene expression [6,29]. In plants, there is currently no information on the nature of the promoters driving the expression of snoRNA gene clusters
or of termination signals, except that the gene clusters do not contain the recognized upstream sequence element and TATA signals of PolII and PolIII snRNAs and U3 snoRNA. Thus, plant snoRNA gene clusters appear to have different promoters from spliceosomal snRNAs.

The biosynthesis of all intronic snoRNA genes depends on transcription from the host gene promoter and processing of the pre-mRNA. The different gene organization of intronic snoRNAs in animals/yeast and plants reflects different processing pathways to produce mature snoRNAs. In vertebrates and yeast, having only a single snoRNA in any particular intron is consistent with largely splicing-dependent processing by exonucleases from linearized, debranched intron lariats [6] (Fig. 2b), although there are rare examples of snoRNAs that can be processed independently of splicing [30]. In yeast, polycistronic snoRNA precursors are processed by RNase III, which cuts duplex RNA with AGNN tetraloops formed in intergenic regions [29,31]. Trimming of the 5′ and 3′ flanking sequences involves the 5′→3′ exonucleases Rat1p and Xrn1p, the 3′→5′ exosome, and Rrp6p [1,29].

In plants, non-intronic snoRNA clusters are expressed from upstream promoters as pre-snoRNA transcripts and are processed to individual snoRNAs (Fig. 2b) [18,22]. The novel intronic polycistrons found in Arabidopsis and in preponderance in rice are in stark contrast to the one-snoRNA-per-intron organization of vertebrates and yeast. Processing of plant intronic clusters requires endonucleolytic activity on the pre-mRNA, spliced intron or both (Fig. 2b). Single and polycistronic U14 snoRNAs from both intronic and non-intronic transcripts were successfully processed in plant cells, demonstrating splicing-independent processing such that snoRNA and mRNA production from these transcripts can be mutually exclusive [22,32]. Although nothing is yet known about the processing enzymes, plants contain 5′→3′ and 3′→5′ exonucleases similar to components of the yeast RNA-degradation machinery and a complex family of RNase-III-like proteins, such that some similarity in processing pathways is expected [33–36]. The fact that plant pre-snoRNA processing does not require splicing allows snoRNAs, and thereby rRNA and ribosomes, to be produced under the various extreme conditions that plants generally have to tolerate, and when splicing activity is reduced or shut down. A parallel is found in vertebrates, in which the expression of U14 snoRNA, hosted in the hsc70 heat-shock gene, is modulated by heat shock [37].

In animals, snoRNA processing and snoRNP assembly are likely to occur in Cajal bodies before transport to the nucleolus [9–11]. Plant pre-snoRNAs have been detected
Fig. 3. Plant snoRNA gene organization and evolution. (a) Heterogeneous and homogeneous, non-intronic and intronic gene clusters. Related clusters that are non-intronic in Arabidopsis and intronic in rice have been reported [18–20,26]. (b) Generation of novel complementary regions and methylation sites (m). Two Arabidopsis snoR20 gene variants have an insertion or deletion of a nucleotide (arrow) between the complementary region and D box, resulting in the methylation of adjacent sites in 18S rRNA [18]. Sequence changes in the region adjacent to the D′ box in the two variants of snoR16 lead to two different complementary sequences, methylating different rRNA sites [19]. (c) Sequence alignments of U15 from different plant species and different gene variants showing many deletions, insertions and nucleotide substitutions. Abbreviations: At, Arabidopsis thaliana; Mt, Medicago truncatula (Barrel medic); Le, Lycopersicon esculentum (tomato); Gm, Glycine max (soybean); Ta, Triticum aestivum (wheat); Os, Oryza sativa (rice); Sb, Sorghum bicolor (sorghum); Zm, Zea mays (maize); P, partial expressed-sequence-tag sequence. Dotted lines indicate regions of complementarity to rRNAs. (d) At two chromosomal locations in Arabidopsis, there are either U16 and U55, or snoR15 genes, all related in sequence (identity to snoR15 is indicated by asterisks) [20]. SnoR15 contains the complementary regions of U16 and U55. SnoR15 genes have been found in five different plant species, suggesting that snoR15 has been duplicated at one site in Arabidopsis and accumulation of mutations has generated the individual U16 and U55 genes. In U55, there are three nucleotide substitutions (arrows) in the U16 complementary region, which destroy complementarity to rRNA. Complementary regions are shaded green (U16) and pink (U55); arrows at the 5′ and 3′ ends indicate inverted repeats. Figure modified from Ref. [20].
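The asterisk notation used in the alignments of (b) and (d) marks columns in which the aligned sequences are identical. As a minimal sketch, assuming two pre-aligned sequences of equal length with `-` for gaps, such an identity line can be generated as follows. The helper names `identity_line` and `percent_identity` are hypothetical, and the two short sequences are invented rather than taken from the figure.

```python
def identity_line(seq_a, seq_b):
    """Return a string with '*' under identical, non-gap alignment columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    return "".join(
        "*" if a == b and a != "-" else " "
        for a, b in zip(seq_a, seq_b)
    )


def percent_identity(seq_a, seq_b):
    """Percentage of identical columns among columns without a gap."""
    matches = identity_line(seq_a, seq_b).count("*")
    compared = sum(1 for a, b in zip(seq_a, seq_b) if a != "-" and b != "-")
    return 100.0 * matches / compared if compared else 0.0


# Invented toy alignment: one substitution and one single-nucleotide gap.
variant_1 = "UGAUGA-CUGA"
variant_2 = "UGAAGAUCUGA"
print(variant_1)
print(variant_2)
print(identity_line(variant_1, variant_2))
print(percent_identity(variant_1, variant_2))  # 90.0
```

Gap columns are excluded from the percentage here; other conventions (for example, counting gaps as mismatches) would give lower values for the same alignment.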
in Cajal bodies, supporting a role in processing [38]. In addition, there is the unexplained observation of the accumulation of plant snoRNAs in the nucleolar cavity (a central, transcriptionally inactive region of plant nucleoli). The absence from this region of the core protein and methyltransferase, fibrillarin, suggests storage of pre-snoRNPs [38,39].

Evolution of plant snoRNA genes
Prior to the divergence of Archaea from eukaryotes, their common ancestors already contained multiple snoRNA genes [7,8]. The evolution of snoRNAs in Archaea, yeast and vertebrates is thought to have occurred through a repeated series of duplications, mutations and selections for their ability to associate into stable snoRNPs and to influence ribosome assembly and function [9,10,40]. Owing to the prevalence of polyploidy in plants, there is a high degree of gene duplication and potential gene redundancy in plant snoRNAs, providing more opportunity to accumulate mutations. Thus, plant snoRNA genes provide a useful model for observing mechanisms of gene evolution.

In plants, at least 50% of the snoRNA genes have two to four allelic variants. These have arisen through the duplication of gene clusters on the same or different chromosomes, or through tandem gene duplication within a cluster (Fig. 3a). Sequence alignments of variants reveal many sequence changes and small insertions and deletions (Fig. 3b,c). Mutations can alter rRNA-complementary regions to produce novel sequences that, under selection, could lead to novel methylation sites. For example, two variants of Arabidopsis snoR20 differ in their complementary region adjacent to the D box and guide the methylation of adjacent sites in rRNA (Fig. 3b) [18,20]. Similarly, two variants of snoR16 have one common antisense element next to the D box but different antisense elements adjacent to their D′ boxes owing to sequence divergence of a few nucleotides (Fig. 3b). This has created two different guide sequences in the two genes (Fig. 3b).

The degree of sequence variation seen among the snoRNA variants is likely to reflect several features of snoRNAs. First, snoRNA genes do not encode proteins and so do not have the constraints of maintaining open reading frames. Second, for most snoRNAs, their individual function might be non-essential. Third, only boxes C, D, C′ and D′, and their complementary regions are required for stability and modification functions. This is most obvious when comparing snoRNA orthologues from different species (Fig. 3c). It is therefore likely that snoRNA genes evolve more rapidly than protein-coding genes.

At the level of gene-cluster organization, more drastic rearrangements have also occurred, leading to the loss of conserved sequences and giving rise to gene fragments or non-functional genes, which ultimately lead to loss of the entire gene [18–20]. Unequal-crossing-over or gene-conversion events might also have added to the diversity of plant snoRNAs and generated the examples of different genes containing the same complementary regions. A clear example of gene duplication, amplification and accumulation of mutations giving rise to two different genes is snoR15, which has generated U16 and U55 in one site in Arabidopsis (Fig. 3d) [20].
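How a few substitutions can remove guide function, as proposed for the U16-complementary region of U55, can be illustrated with a toy calculation. The Python sketch below is illustrative only: it counts non-Watson–Crick positions in an antiparallel guide–target duplex, treats G–U wobble pairs as mismatches (a simplifying assumption), and uses invented sequences. The helpers `count_mismatches` and `substitute` are hypothetical names.

```python
# Watson-Crick pairs only; G-U wobble is deliberately treated as a mismatch
# in this simplified model.
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def count_mismatches(guide, target):
    """Count unpaired positions in an antiparallel guide-target duplex.

    Both sequences are written 5'->3' and must have the same length, so
    guide position i is compared with target position len - 1 - i.
    """
    if len(guide) != len(target):
        raise ValueError("guide and target must have the same length")
    return sum(
        1 for g, t in zip(guide, reversed(target))
        if (g, t) not in WATSON_CRICK
    )


def substitute(seq, changes):
    """Return a copy of `seq` with the given {0-based position: base} changes."""
    bases = list(seq)
    for position, base in changes.items():
        bases[position] = base
    return "".join(bases)


# Invented ancestral guide and its rRNA target (not real sequences).
rrna_site = "CAUGGCAUCGAU"
ancestral_guide = "AUCGAUGCCAUG"
diverged_guide = substitute(ancestral_guide, {2: "A", 6: "U", 9: "C"})

print(count_mismatches(ancestral_guide, rrna_site))  # 0
print(count_mismatches(diverged_guide, rrna_site))   # 3
```

In this toy case, three substitutions interrupt a 12 bp duplex in three places, leaving no uninterrupted run of pairs longer than three, which gives a sense of how readily a duplicated guide region could lose its original target.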
Initial analyses of gene organization in different plant species have shown both strong conservation of gene order of some gene clusters and mixing and dispersal of other gene combinations. For example, Arabidopsis contains non-intronic gene clusters with five different genes: U31, snoR4, U33, U51 and snoR5 (Fig. 3a). The last three are found in the same order in an intron of the hsc70 gene in rice [27]. This suggests that the initial gene order was established before the divergence of dicots and monocots. By contrast, in cereals (monocots), U14 gene clusters are linked to up to five other different snoRNA genes [22,26], whereas, in Arabidopsis, the same genes are organized in eight clusters spread across all five chromosomes. The major differences between monocots and dicots suggest that snoRNA genes have undergone substantial reorganization and transposition during the evolution of different plant species from primitive progenitor plants.

The examples of strong conservation of order and arrangement suggest that particular gene clusters have a selective advantage in their co-expression, becoming ‘fixed’ relatively early. The contrasting situation, in which gene arrangements are different among different species, suggests that other groups of genes are actively evolving. Thus, plant and other higher-eukaryotic box-C/D snoRNA genes can be viewed as a large multigene family encoding small RNAs with related structure and function that have evolved from a set of related ancestral genes.

Functional consequences of gene expansion in plants
An important unanswered question concerns the function of extensive rRNA modifications in eukaryotes compared with the few modified nucleotides in prokaryotes (Table 1). Maintenance of the impressive nucleolar machinery directing modifications clustered in the functional core of all eukaryotic rRNA indicates that these must be essential for ribosome biogenesis or function.
Although genetic depletion of individual and multiple guide snoRNAs in yeast had no obvious effect on cell growth [29,41], depletion of 2′-O-ribose methyl groups or pseudouridines in yeast nuclear and mitochondrial rRNA, and E. coli rRNA has detrimental effects on ribosome activity [15,42–44]. Modifications of rRNA can contribute globally to stabilizing functional rRNA structure, to determining the structural and functional interactions between the ribosomal subunits, and to influencing the binding of tRNA and mRNA, and thus fine-tune the translational activity of the ribosome [45]. In addition to their guide function, snoRNAs might also have a chaperone function in the proper folding of the nascent rRNA or might stabilize rRNA tertiary structure during ribosome biogenesis [1–3,6].

The increased number of plant rRNA methylations (Table 1) reflected in the expansion of snoRNA genes points to subtle but fundamental differences in ribosome activity. Plants are particularly exposed to large temperature changes, during which ribosomes must be produced and remain active. This observation is paralleled in prokaryotic systems. In E. coli, the well-known heat-shock protein FtsJ has been recently identified as a specific rRNA methylase. Mutation of the FtsJ gene dramatically affects the ribosome profile and generates a temperature-sensitive phenotype [46]. In hyperthermophilic Archaea,
increasing the temperature of a culture significantly increases the level of rRNA methylation [47]. Thus, rRNA methylation is regulated in prokaryotes, and this could be related to sustained ribosomal activity under adverse conditions. It would be particularly interesting to see whether plant rRNA methylation is regulated by temperature stress conditions, perhaps by inducible snoRNA gene expression.

snoRNAs and ‘non-nucleolar’ RNA targets
In Arabidopsis, four distinct box-C/D snoRNAs (snoR6, snoR26, snoR27 and snoR28) were found to lack complementarity to rRNA and might target other RNA substrates [20]. In vertebrates and yeast, box-C/D, -H/ACA and hybrid C/D–H/ACA snoRNAs have been identified that guide the modification of the spliceosomal U5 and U6 [9–11,48,49], and others have significant complementarity to U1, U2, U4 and U5 snRNAs [50]. In Archaea, box-C/D snoRNAs have been identified that direct the methylation of specific residues in some tRNAs [51], and many novel box-C/D and box-H/ACA snoRNAs in mouse lack significant complementarity to rRNAs, snRNAs or tRNAs [50]. Some of these box-C/D snoRNAs have been shown to be expressed specifically in the brain and could target the methylation of mRNAs [52].

Many of these modifications occur in the nucleolus or Cajal bodies, which are often associated with the nucleolus. The nucleolus is emerging as a major RNA-modification factory where, in addition to rRNA and snoRNAs, many other RNAs including some tRNAs, U6 snRNA, signal-recognition particle or telomerase RNAs and even some mRNAs are transiently localized [53]. Cajal bodies also contain diverse RNA populations including snRNAs, snoRNAs and small Cajal-body-specific snoRNAs [9–11,48,54], consistent with them being sites for the assembly and modification of spliceosomal snRNAs [9–11,55].

Future prospects
The genomic harvest of snoRNA genes in Arabidopsis and rice has shed new light on gene evolution and regulation in plants.
Further investigation into snoRNA gene organization in these genomes and in those of lower plants will increase our knowledge of snoRNA evolution and gene expression, not only in plants but also in other eukaryotes. An important aim is to grasp the unique opportunity that plants offer to address the regulation of rRNA modification in different cellular environments.

The dynamics and complex cellular trafficking required for snoRNP assembly and function are an intriguing area. In vivo, nearly all modifications occur in the nascent transcript, which is synthesized within a few minutes. This implies an extraordinary coordination of assembly and binding to pre-rRNA of the nearly 200 snoRNPs in the nucleolus. The plant nucleolus, with more distinct subnucleolar domains than that of animals, is particularly suited to RNA and protein localization studies [38,39]. The combination of established cell biology and the ease of plant manipulation for overproduction of tagged snoRNAs or proteins makes plants a good model to elucidate the protein–protein and protein–snoRNA interactions that regulate these processes.
Finally, the identification in a range of organisms of small, stable non-coding RNAs opens a new era in RNA biology and gene regulation. In vertebrates, Drosophila and nematodes, microRNAs play crucial roles in the regulation of gene expression at the transcriptional (dosage compensation) and post-transcriptional (RNA processing and mRNA translation) levels, and have great implications for development [56,57]. Recently, non-coding RNAs, microRNAs and snoRNAs with unknown target RNAs have been identified in plants [58–60] (A. Hüttenhofer, pers. commun.). This is likely to be the tip of the iceberg of small non-coding RNAs in plants. The next challenge will be the investigation of the expression and regulation of these new RNAs, the structural or functional complexes they form, their mechanisms of action, and their roles in plant growth and development.

Acknowledgements

Our research was supported by a grant-in-aid to the Scottish Crop Research Institute from the Scottish Executive Environment and Rural Affairs Department (J.B.), the key project of the National Natural Science Foundation of China (L.-H.Q.) and the Centre National de la Recherche Scientifique (CNRS) (M.E.).

References

1 Venema, J. and Tollervey, D. (1999) Ribosome synthesis in Saccharomyces cerevisiae. Annu. Rev. Genet. 33, 261–311
2 Maden, B.E.H. (1990) The numerous modified nucleotides in eukaryotic ribosomal RNA. Prog. Nucleic Acid Res. Mol. Biol. 39, 241–301
3 Ofengand, J. and Fournier, M.J. (1999) The pseudouridine residues of rRNA: number, location, biosynthesis and function. In Modification and Editing of RNA (Grosjean, H., Benne, R., eds), pp. 229–253, ASM Press
4 Lafontaine, D.L.J. and Tollervey, D. (2001) The function and synthesis of ribosomes. Nat. Rev. Mol. Cell Biol. 2, 514–520
5 Weinstein, L.B. and Steitz, J.A. (1999) Guided tours: from precursor snoRNAs to functional snoRNP. Curr. Opin. Cell Biol. 11, 378–384
6 Bachellerie, J.P. et al. (2000) Nucleotide modifications of eukaryotic rRNAs: the world of small nucleolar RNA guides revisited. In The Ribosome: Structure, Function, Antibiotics and Cellular Interactions (Garrett, R.A. et al., eds), pp. 191–203, ASM Press
7 Gaspin, C. et al. (2000) Archaeal homologues of eukaryotic methylation guide small nucleolar RNAs: lessons from Pyrococcus genomes. J. Mol. Biol. 297, 895–906
8 Omer, A.D. et al. (2000) Homologs of small nucleolar RNAs in Archaea. Science 288, 517–522
9 Kiss, T. (2001) Small nucleolar RNA-guided post-transcriptional modification of cellular RNAs. EMBO J. 20, 3617–3622
10 Kiss, T. (2002) Small nucleolar RNAs: an abundant group of noncoding RNAs with diverse cellular functions. Cell 109, 145–148
11 Filipowicz, W. and Pogačić, V. (2002) Biogenesis of small nucleolar ribonucleoproteins. Curr. Opin. Cell Biol. 14, 319–327
12 Brown, J.W.S. and Shaw, P.J. (1998) Small nucleolar RNAs and pre-rRNA processing in plants. Plant Cell 10, 649–657
13 Watkins, N.J. et al. Conserved stem II of the box C/D motif is essential for nucleolar localisation and is required, along with the 15.5 k protein, for the hierarchical assembly of the box C/D snoRNP. Cell (in press)
14 Wang, H. et al. (2000) Crystal structure of a fibrillarin homologue from Methanococcus jannaschii, a hyperthermophile, at 1.6 Å resolution. EMBO J. 19, 317–323
15 Zebarjadian, Y. et al. (1999) Point mutations in yeast CBF5 can abolish in vivo pseudouridylation of rRNA. Mol. Cell. Biol. 19, 7461–7472
16 Lau, R.Y. et al. (1974) Wheat embryo ribonucleates: III. Modified nucleotide constituents in each of the 5.8S, 18S and 26S ribonucleates. Can. J. Biochem. 52, 1110–1123
17 Cecchini, J.P. and Miassod, R. (1979) Studies on the methylation of cytoplasmic ribosomal RNA from cultured higher plant cells. Eur. J. Biochem. 98, 203–214
18 Qu, L.H. et al. (2001) Identification of 10 novel snoRNA gene clusters from Arabidopsis thaliana. Nucleic Acids Res. 29, 1623–1630
19 Barneche, F. et al. (2001) Identification of 66 box C/D snoRNAs in Arabidopsis thaliana: extensive gene duplications generated multiple isoforms predicting new ribosomal RNA 2′-O-methylation sites. J. Mol. Biol. 311, 57–73
20 Brown, J.W.S. et al. (2001) Multiple snoRNA gene clusters from Arabidopsis. RNA 7, 5718–5732
21 Brown, J.W.S. et al. Plant snoRNA database. Nucleic Acids Res. (in press)
22 Leader, D.J. et al. (1997) Clusters of multiple different small nucleolar RNA genes in plants are expressed as and processed from polycistronic pre-snoRNAs. EMBO J. 16, 5742–5751
23 Barneche, F. et al. (2000) Fibrillarin genes encode both a conserved nucleolar protein and a novel small nucleolar RNA involved in ribosomal RNA methylation in Arabidopsis thaliana. J. Biol. Chem. 275, 27212–27220
24 Maceluch, J. et al. (2001) Cloning and characterization of Arabidopsis thaliana AtNAP57 – a homologue of yeast pseudouridine synthase Cbf5p. Acta Biochim. Pol. 48, 699–709
25 Leader, D.J. et al. (1998) Processing of vertebrate box C/D small nucleolar RNAs in plants. Eur. J. Biochem. 253, 154–160
26 Liang, D. et al. (2002) A novel gene organisation: intronic snoRNA gene clusters from Oryza sativa. Nucleic Acids Res. 30, 3262–3272
27 Qu, L.H. et al. (1997) Two snoRNAs are encoded in the first intron of the rice hsp70 gene. Prog. Nat. Sci. 7, 371–377
28 Kiss, T. et al. (1991) Alteration of the RNA polymerase specificity of U3 snRNA genes during evolution and in vitro. Cell 65, 517–526
29 Qu, L.H. et al. (1999) Seven novel methylation guide small nucleolar RNAs are processed from a common polycistronic transcript by Rat1p and RNase III in yeast. Mol. Cell. Biol. 19, 1144–1158
30 Giorgi, C. et al. (2001) Release of U18 snoRNA from its host intron requires interaction of Nop1p with the Rnt1p endonuclease. EMBO J. 20, 6856–6865
31 Chanfreau, G. et al. (1998) Yeast RNase III as a key processing enzyme in small nucleolar RNA metabolism. J. Mol. Biol. 284, 975–988
32 Leader, D.J. (1999) Splicing-independent processing of plant box C/D and box H/ACA small nucleolar RNAs. Plant Mol. Biol. 39, 1091–1100
33 Kastenmayer, J.P. and Green, P.J. (2000) Novel features of the XRN-family in Arabidopsis: evidence that AtXRN4, one of several orthologs of nuclear Xrn2p/Rat1p, functions in the cytoplasm. Proc. Natl Acad. Sci. U.S.A. 97, 13985–13990
34 Chekanova, J.A. et al. (2000) Poly(A) tail-dependent exonuclease AtRrp41p from Arabidopsis thaliana rescues 5.8S rRNA processing and mRNA decay defects of the yeast ski6 mutant and is found in an exosome-sized complex in plant and yeast cells. J. Biol. Chem. 275, 33158–33166
35 Chekanova, J.A. et al. (2002) Arabidopsis thaliana exosome subunit AtRrp41p is a hydrolytic 3′→5′ exonuclease containing S1 and KH RNA-binding domains. Nucleic Acids Res. 30, 695–700
36 Jacobsen et al. (1999) Disruption of an RNA helicase/RNAaseIII gene in Arabidopsis causes unregulated cell division in floral meristems. Development 126, 5231–5240
37 Chen, M.S. (2002) Differential accumulation of U14 snoRNA and hsc70 mRNA in Chinese hamster cells after exposure to various stress conditions. Cell Stress Chaperones 7, 65–72
38 Shaw, P.J. et al. (1998) Localization and processing from a polycistronic precursor of novel snoRNAs in maize. J. Cell Sci. 111, 2121–2128
39 Beven, A.F. et al. (1996) The organisation of ribosomal RNA processing correlates with the distribution of nucleolar snRNAs. J. Cell Sci. 108, 509–518
40 Lafontaine, D.L.J. and Tollervey, D. (1998) Birth of the snoRNPs: the evolution of the modification-guide snoRNAs. Trends Biochem. Sci. 23, 383–388
41 Lowe, T.M. and Eddy, S.R. (1999) A computational screen for methylation guide snoRNAs in yeast. Science 283, 1168–1171
42 Tollervey, D. et al. (1993) Temperature-sensitive mutations demonstrate roles for yeast fibrillarin in pre-rRNA processing, pre-rRNA methylation, and ribosome assembly. Cell 72, 443–457
43 Mason, T.L. (1998) Functional aspects of the three modified nucleotides in yeast mitochondrial large-subunit rRNA. In Modification and Editing of RNA (Grosjean, H., Benne, R., eds), ASM Press
44 Raychaudhuri, S. et al. (1999) Functional effect of deletion and mutation of the Escherichia coli ribosomal RNA and tRNA pseudouridine synthase RluA. J. Biol. Chem. 274, 18880–18886
45 Decatur, W.A. and Fournier, M.J. (2002) rRNA modifications and ribosome function. Trends Biochem. Sci. 27, 344–351
46 Bügl, H. et al. (2000) RNA methylation under heat shock control. Mol. Cell 6, 349–360
47 Noon, K.R. (1998) Post-transcriptional modifications in 16S and 23S rRNAs of the archaeal hyperthermophile Sulfolobus solfataricus. J. Bacteriol. 180, 2883–2888
48 Darzacq, X. et al. (2002) Cajal body-specific small nuclear RNAs: a novel class of 2′-O-methylation and pseudouridylation guide RNAs. EMBO J. 21, 2746–2756
49 Zhou, H. et al. (2002) Schizosaccharomyces pombe mgU6-47 snoRNA is required for the methylation of U6 snRNA at 41. Nucleic Acids Res. 30, 894–902
50 Hüttenhofer, A. et al. (2001) Rnomics: an experimental approach that identifies 201 candidates for novel, small, non-messenger RNAs in mouse. EMBO J. 20, 2943–2953
51 Clouet d’Orval, B. et al. (2001) Box C/D RNA guides for the ribose methylation of archaeal tRNAs. The tRNATrp intron guides the formation of two ribose-methylated nucleosides in the mature tRNATrp. Nucleic Acids Res. 29, 4518–4529
52 Cavaillé, J. et al. (2000) Identification of brain-specific and imprinted small nucleolar RNA genes exhibiting an unusual genomic organisation. Proc. Natl Acad. Sci. U.S.A. 97, 14311–14316
53 Pederson, T. (1998) The plurifunctional nucleolus. Nucleic Acids Res. 26, 3871–3876
54 Jady, B. and Kiss, T. (2001) A small nucleolar guide RNA functions both in 2′-O-ribose methylation and pseudouridylation of the U5 spliceosomal RNA. EMBO J. 20, 541–551
55 Matera, A.G. and Frey, M.R. (1998) Nuclear structure ’98: coiled bodies and gems: Janus or Gemini? Am. J. Hum. Genet. 63, 317–321
56 Eddy, S.R. (2001) Non-coding RNA genes and the modern RNA world. Nat. Rev. Genet. 2, 919–929
57 Pasquinelli, A.E. (2002) MicroRNAs: deviants no longer. Trends Genet. 18, 171–173
58 MacIntosh et al. (2001) Identification and analysis of Arabidopsis expressed sequence tags characteristic of non-coding RNAs. Plant Physiol. 127, 765–776
59 Llave, C. et al. (2002) Endogenous and silencing-associated small RNAs in plants. Plant Cell 14, 1605–1619
60 Reinhart, B.J. et al. (2002) MicroRNAs in plants. Genes Dev. 16, 1616–1626