Nucleus and gene expression
Editorial overview
Arnie Berk* and Iain W Mattaj†
Addresses
*Molecular Biology Institute, University of California at Los Angeles, CA 90024-1570, USA; e-mail: [email protected]
†European Molecular Biology Laboratory, Meyerhofstrasse 1, Postfach 1022-40, D-69012 Heidelberg, Germany; e-mail: [email protected]
Current Opinion in Cell Biology 1997, 9:307-309
http://biomednet.com/elecref/0955067400900307
© Current Biology Ltd ISSN 0955-0674
Abbreviations
NLS nuclear localization sequence
NPC nuclear pore complex
The genetic environment
The eukaryotic nucleus is a masterpiece of organization, but one whose principles we still do not understand. Organization in a mammalian cell begins with the nontrivial problem of packing 200 cm or so of DNA into the 5 µm diameter of the nucleus. Of course, the DNA backbone is highly charged, so basic (histone) proteins are required for charge neutralization to allow compact packing. The packaged DNA is then anything but inert. Transcription of the DNA template as well as its replication and repair all have to take place, and in an environment whose protein concentration exceeds that of most protein crystals. Once a transcript is made, it has to be processed. Every single transcript in eukaryotic cells, with the exception of 5S rRNA in some species, is made as a precursor that requires nuclear processing to produce its mature form. These DNA- and RNA-based transactions are almost universally carried out by complex multimolecular machines whose relative distributions within the nucleus are just beginning to be understood.
This issue of Current Opinion in Cell Biology starts with reviews that describe advances in our descriptions of some of these machines, then moves up through successive layers of complexity, regulation and organization, to end with discussion of the communication and transport between the nucleus and the rest of the cell through the pore complexes in the nuclear envelope. As an aid to the reader, we provide below a brief introduction to each of the reviews.
Basic mechanisms in gene expression
The most fundamental level of gene expression is transcription. Greenblatt (pp 310-319) describes the RNA polymerase II holoenzyme, thought to be the form of RNA polymerase II that is involved in mRNA precursor (pre-mRNA) synthesis. In addition to the basal factors required for promoter recognition and transcription initiation, and the adapters required to allow response to regulators, the holoenzyme also seems to contain activities that are involved in dislodging nucleosomes, in order to allow polymerase access to the DNA, and even factors required for processing the nascent pre-mRNA.
A second complex of similar complexity is the spliceosome, in which introns (noncoding intervening sequences) are removed from a pre-mRNA. Spliceosomes contain upwards of 100 different proteins, some in multiple copies. At first sight, this complexity seems contradictory to the fact that intron removal is chemically identical to the RNA-catalyzed self-splicing of one class (group II) of autocatalytic introns. Will and Lührmann (pp 320-328) discuss potential functions of the many spliceosomal proteins, roles which include aiding accurate recognition of the rather minimal sequence information that specifies splice sites and making and breaking the many protein-protein and protein-RNA interactions that have to occur sequentially through the several steps of intron recognition, the two catalytic steps of splicing and, finally, spliceosome disassembly.
The 3' ends of most eukaryotic transcripts produced by RNA polymerase II are formed in a two-step reaction in which the RNA is cleaved at a specific site and then polyadenylated. Unlike splicing, these reactions are almost certainly catalyzed by proteins (not RNA), and, although simpler, the reactions still require roughly a megadalton of protein. Recent interest in this field has been largely focused on the identification and characterization of the factors required for the two-step reaction. Comparison between the mammalian and yeast factors, the major focus of the Keller and Minvielle-Sebastia review (pp 329-336), leads to a fascinating conclusion. Most of the known factors, not surprisingly, are conserved between the two species. However, it appears that the subcomplexes of the machinery have been mixed and matched differently during evolution, such that the detailed function of some factors, that is, their participation in either the cleavage or the polyadenylation step, has not been conserved.
Pre-mRNA splicing requires five trans-acting small nuclear (sn) RNA cofactors. Preribosomal RNA processing, in contrast, involves interaction between the substrate and roughly 150 small nucleolar (sno) RNAs (see Tollervey and Kiss, pp 337-342). These RNAs are diverse, but almost all are synthesized in unusual ways, being either cleaved from polycistronic transcripts or excised from what is normally junk RNA, the introns of pre-mRNAs. Although a handful have functions analogous to those of the spliceosomal RNAs, that is, they are involved in the definition of processing sites in the pre-rRNA, most have turned out to be guides for the accurate placement of base or nucleotide modifications at specific and often conserved places in the mature rRNA sequence. The function of these modifications is mysterious, and most are nonessential. The discovery of how the modifications are made opens obvious doors to their manipulation, and therefore functional study.
A much less ubiquitous, but no less fascinating, aspect of post-transcriptional gene expression is RNA editing. In this group of related processes, bases are inserted into or deleted from a mature RNA, or have their nucleotide identity changed. Maas, Melcher and Seeburg (pp 343-349) discuss two different mechanisms of RNA editing in vertebrates. Both involve base deamination, resulting in either cytidine-to-uridine or adenosine-to-inosine changes. The cases they cover result in the production of two functionally different proteins from the edited and unedited mRNA transcripts. They also discuss the identification of the enzymes involved in the deamination reactions, and current ideas of how those enzymes are targeted to their specific substrates.
Nuclear regulation and pathology
Except for the deamination reactions involved in editing, all the basic mechanisms described above involve large multisubunit complexes. However, the T7 RNA polymerase is, at 97 kDa, capable of specific promoter recognition as well as initiation, elongation and termination of transcription, and RNase III is capable of accurately processing the 5' and 3' ends of Escherichia coli rRNA, so what is all the complexity for? One answer is clearly accuracy. Given the size of the eukaryotic genome, and the necessity for RNA-processing reactions, in particular intron removal, to be accurate to the individual base, accuracy becomes a prime concern. However, it is likely that much of the complexity exists in order to provide targets for regulation. Many, if not all, eukaryotic genes are subjected to regulation at multiple levels.
This theme is expanded upon by Hertel, Lynch and Maniatis (pp 350-357). They discuss enhancers of transcription and of pre-mRNA splicing, that is, cis-acting DNA or RNA sequence elements that exert stimulating effects at a distance, and make insightful and thought-provoking comparisons between the two. A common theme is argued to be the assembly of multiprotein complexes on the enhancers, favoured by cooperative protein-protein and protein-nucleic acid interactions. How the enhancer-bound protein complexes target the basal transcription and splicing machineries is also discussed. Interactions between regulatory and basal factors are often themselves regulated in response to extracellular or intracellular signalling cascades, most frequently by changes in the phosphorylation state of the targets of the signalling pathway.
Jallepalli and Kelly (pp 358-363) discuss how the DNA-replication apparatus, yet another example of a giant multiprotein complex, is induced to initiate replication once and only once per cell cycle. Regulators of the replication complex are the cyclin-dependent kinases, and a combination of negative regulation, to prevent replication firing at inappropriate times, and positive regulation, to allow appropriate initiation, produces the desired tight control.
A few genetic diseases are known to result from the disruption of normal gene regulation. Reddy and Housman (pp 364-372) review current understanding of how triplet nucleotide repeats in specific alleles of human genes greatly expand in number in certain individuals, causing diseases such as myotonic dystrophy, Huntington's disease and fragile X syndrome. Remarkably, the tendency for certain repeats to expand increases in successive generations, resulting in the unusual genetic phenomenon termed 'anticipation' in which a specific allele produces a phenotype only after it has been passed to additional generations of individuals. Recent results suggest that, in some cases, triplet repeat expansion inhibits gene expression by interfering with normal RNA processing or transport out of the nucleus.
Manipulative regulation of gene expression is a critical tool both in basic biological research and in biotechnology. A popular method to achieve repression of a specific gene in organisms that are not genetically manipulable has been antisense technology, where an RNA is artificially produced that is complementary to the mRNA of the target gene. These RNAs form a Watson-Crick duplex with their target mRNA, preventing its normal expression. This approach has, for unknown reasons, been particularly successful in plants. Plant molecular biologists were, however, dumbfounded when they discovered that their control experiments, involving overexpression of the sense mRNA, were often equally or more effective as a repressive mechanism. This fascinating post-transcriptional regulatory mechanism is called gene silencing. It is thought to derive from a cellular antiviral defence mechanism, and models for how it may occur are the theme of Depicker and Van Montagu's review (pp 373-382).
Spatial and structural organization of gene expression
Quite a different type of silencing, or inhibition, of both transcription and the timing of replication occurs close to the telomeric ends of chromosomes. This is best studied in yeast, where the formation of (what else?) a large multimeric protein complex via DNA-protein and protein-protein interactions occurs. The proteins involved, and their interactions with nucleosomes that organize the subtelomeric chromatin into a repressive structure, are discussed by Grunstein (pp 383-387). In yeast, telomeres on different chromosomes pair during interphase, and occupy a few distinct peripheral locations in the nucleus. It is not clear if this spatial organization is required for silencing, but examples of nontelomeric regions of chromosomes pairing during interphase, and of this pairing being significant for regulation, are known. Henikoff (pp 388-395) discusses examples of pairing between euchromatic regions of homologous chromosomes, and the effects of this pairing on the expression of genes in the paired regions. These effects are often called transvection, and their existence points to the need for mechanisms to ensure that at least some genes are transcribed in a nonrandom spatial way with respect to their homologue within the nucleus. Why this should be so is not understood.
It is, however, more obvious what the nonrandom distribution of an mRNA in the cytoplasm can be good for. If an asymmetrical compartment within a cell must be built up, one way to do this is to sequester an mRNA to a particular point in the cytoplasm, and thereby deliver its translation product directly to a particular location. Examples include mRNAs involved in germ cell formation in various organisms, both vertebrates and invertebrates. Nasmyth and Jansen (pp 396-400) discuss the role(s) of cytoskeletal proteins in organizing mRNA movement and localization. In particular, they describe the use of genetics in both Drosophila and yeast to identify components required for generating asymmetric mRNA distributions.
Nucleocytoplasmic traffic
A Goliath among all nuclear multiprotein complexes is the nuclear pore complex (NPC) which, at 125 MDa, weighs in at roughly 30 times the mass of a ribosome. The NPCs penetrate the double bilayer of membranes that surrounds the nucleoplasm, and all known trafficking between the nucleus and the cytoplasm occurs via the NPCs. These remarkable structures, estimated to contain 50-100 different proteins, form diffusion channels with a diameter of roughly 9 nm, which is equivalent to a globular protein of 50-60 kDa, but somehow change conformation to allow active transport of substrates of greater than 25 nm in diameter, which is equivalent to a ribosome. We are far from understanding how this can happen, but Doye and Hurt (pp 401-411) describe the breathtaking pace at which the composition of both yeast and vertebrate NPCs is being elucidated. They also talk about models for NPC assembly from subcomplexes.
NPC proteins are clearly also involved in nucleocytoplasmic transport events, and our last two reviews cover this topic. Görlich (pp 412-419) mainly covers the mechanism of import of a major class of nuclear proteins, those that contain short basic nuclear localization signals (NLSs). He talks about the importin heterodimer, which is responsible for NLS recognition and for targeting the NLS-containing protein to the NPC. Importin translocates through the NPC with the NLS-containing substrate, and a second focus of this review is how transport asymmetry is brought about, that is, why importin carries NLS-containing proteins into, rather than out of, the nucleus. Finally, Nakielny and Dreyfuss (pp 420-429) discuss other mediators of nucleocytoplasmic transport, in particular those involved in export from the nucleus. Major substrates for export are the many RNAs transcribed in the nucleus. Although much less mechanistic detail is available about export than about import, the authors discuss similarities and differences between the two processes, and reinforce the message of the Görlich review, that export and import are not only closely related mechanistically, but are also inextricably intertwined functionally.
We can now appreciate that the complexity of biological function stems from the elaborate intricacy of the macromolecular assemblies that perform various biological functions. The reviews in this issue of Current Opinion in Cell Biology describe the impressive recent progress that molecular biologists have made in identifying the macromolecular complexes that execute essential nuclear functions, progress in characterizing the individual proteins and nucleic acids from which these multicomponent complexes are assembled, and initial advances in deciphering how these components interact to regulate and coordinate essential nuclear processes.