The Brix domain protein family – a key to the ribosomal biogenesis pathway?

The Brix domain protein family – a key to the ribosomal biogenesis pathway?

Research Update TRENDS in Biochemical Sciences Vol.26 No.6 June 2001 345 Protein Sequence Motif The Brix domain protein family – a key to the ribo...

317KB Sizes 45 Downloads 26 Views

Research Update

TRENDS in Biochemical Sciences Vol.26 No.6 June 2001

345

Protein Sequence Motif

The Brix domain protein family – a key to the ribosomal biogenesis pathway? Frank Eisenhaber, Christian Wechselberger and Günther Kreil Six (one archaean and five eukaryotic) protein families have similar domain architecture that includes a central globular Brix domain, and optional N- and obligatory C-terminal segments, both with charged low-complexity regions. Biological data for some proteins in this superfamily suggest a role in ribosome biogenesis and rRNA binding.

The biogenesis of ribosomes is a multistep process requiring dozens of proteins and small nucleolar RNAs as essential supporting factors1. Recently, a previously uncharacterized transcript isolated from stage 11 Xenopus laevis embryos and later named Brix (biogenesis of ribosomes in Xenopus, AF319877) was linked with the pathway of ribosome formation. In HeLa cells transiently transfected with a cDNA encoding a green fluorescent protein–Brix fusion protein, this protein was found exclusively in nucleoli and coiled (Cajal) bodies, showing complete colocalization with fibrillarin. Binding assays indicate an association of Brix with rRNAs (preferentially of the large ribosomal subunit) but not with U3 or U8 snoRNAs. Additionally, the knockout of yol077c/brx1 (S66770), the yeast orthologue of Brix, was lethal and displayed defects in the synthesis of large ribosomal subunits. Publication of these experimental data is currently in preparation (Kaser, A., Bogengruber, E. et al., unpublished). Six families of homologous proteins with … similar domain architecture…

A thorough analysis of the Brix protein sequence (339 residues) yielded several surprising findings. After applying a homologue-searching strategy in the non-redundant protein sequence database including iterative profile searches with the PSI-BLAST tool (with standard sequence inclusion condition E<0.002 and filters for compositional bias2) and a fan-like search heuristic (starting individual new searches with each of the significant hits),

we collected a closed set of ~50 protein sequences. The common region of similarity among all superfamily members includes a sequence domain of 150–180 residues length that we propose to call the Brix domain (Fig. 1, Brix homepage http://mendel.imp.univie.ac.at/ SEQUENCES/BRIX/). Based on the degree of sequence similarity, the superfamily can be divided into six families: one archaean family (I) including hypothetical proteins (one per genome); and five eukaryote families, each named according to a representative member and including close homologues of this prototype: (II) Peter Pan (D. melanogaster) and Ssf1/2 (S. cerevisiae); (III) yhr088wp/Rpf1p (S. cerevisiae); (IV) IMP4 (S. cerevisiae); (V) brix (X. laevis) and yol077c/brx1 (S. cerevisiae); and (VI) ykr081cp (S. cerevisiae). In database searches, each sequence was found to identify its own family first and then, with a gap in E-value of many orders of magnitude, encounters the first non-family hit. The nature of the first non-family hit reveals the relative positions of the families in the sequence space and possible evolutionary relationships: groups I and IV are closely related; searches with sequences from groups III and V find members of the IMP4 family (IV) to be the closest non-family neighbours; and similarly, groups II and VI get linked to groups III and V, respectively. Interestingly, each of the extensively sequenced eukaryote genomes (baker’s and fission yeasts, worm, fly, human and Arabidopsis thaliana) has at least one representative in each of the six families. We compared the Brix domain with available domain libraries and found that the PFAM entry PF01945 (Ref. 3) unifies groups I, III and IV into one class but does not recognize the other three groups. Sequences of subfamily II have, except for the L. major sequence, already been collected by Migeon et al.4 While this work was in press, an independently collected subset of the Brix domain family, including members

from all six groups, has been published by Mayer et al.5 Typically, a protein sequence belonging to the Brix domain superfamily contains a highly charged N-terminal segment (∼50 residues) followed by a single copy of the Brix domain and another highly charged C-terminal region (∼100 residues). Generally, positive charges are more frequent (the content of Lys and Arg is usually >20%). In the regions flanking the Brix domain, low-complexity segments are common6. The archaean sequences have two unique characteristics: (1) the charged regions are totally absent at the N-terminus and are reduced in number to ∼10 residues at the C-terminus; and (2) the C-terminal part of the Brix domain itself is minimal. Two eukaryote groups have large insertions within the C-terminal region: ∼70 residues in the group III of yhro88w/Rpf1p-like proteins and ∼120 residues in the Peter Pan group II (Fig. 1). This finding is in agreement with the results of similarity searches suggesting that group III is the closest neighbour of group II. Three worm sequences [T32923 (family II), and T19409 and P54073 (both family III)] are much longer (∼700 residues) in contrast to the strong conservation of sequence architecture between other species; it is possible that these are in part gene prediction or genome assembly artifacts. Noticeably, the hypothetical human protein CAB77112.1 (CAC18877.1, group II) has also a similarly excessive length7. Additional information on Brix family members, family collection history, analysis of terminal charged regions and sequence artifacts is given in great detail on the Brix homepage (http://mendel.imp.univie.ac.at/ SEQUENCE/BRIX/). …with a suggested role in ribosomal biogenesis

More than 80% of all superfamily members (including essentially all family I, III and VI proteins) lack any functional

http://tibs.trends.com 0968-0004/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0968-0004(01)01851-5

346

Research Update

TRENDS in Biochemical Sciences Vol.26 No.6 June 2001

Ti BS

Fig. 1. Alignment of selected Brix domain segment representatives from all six families. Sequence names are composed of species name, NCBI accession numbers and gene names. The secondary structural (Sec. Str.) assignments (h, helix; e, strand) have been generated with JPRED2 (Ref. 16) for Brix. Conservation of hydrophobic and charged residues is indicated in green and red, respectively. Other types of conservation are highlighted in blue. There are two loop regions with positively charged residues in many sequences. Towards the C-terminus, the alignment with archaean sequences becomes increasingly difficult (the P. abyssi sequences passes over to the C-terminal charged region). More information is available at the Brix homepage http://mendel.nt.imp.univie.ac.at/ SEQUENCES/BRIX/. Abbreviations: Dm, Drosophila melanogaster; Pa, Pyrococcus abyssii; Sc, Saccharomyces cerevisiae; Tc, Trypanosoma cruzi; Xl, Xenopus laevis.

characterization and are described solely as, for example, hypothetical proteins and conceptual translations. Several family representatives have a characterized phenotype. The protein Peter Pan (AAD16459.1) from D. melanogaster4, T. cruzi’s stage-specific protein 24 (annotation of AAD14602.1) and yeast’s Ssf1/2 (Refs 8–10) are essential for general functions such as cellular growth, cell division, mating and genetically programmed ontogenetic transformations (e.g. imago formation, metacyclogenesis). Notably, all of these processes require extensive synthesis of new proteins. Also, all individual knockouts of the genes encoding Brix domain proteins (in the case of Ssf1/Ssf2, the double knockout) in baker’s yeast are lethal8–11. Thus, each of the five eukaryotic classes is responsible for an essential function. The requirement of the Ssf1/Ssf2 double knockout also indicates that the observation of multiple, possibly functionally redundant genes within a http://tibs.trends.com

particular sequence family of one species could indeed be real variants (i.e. not the result of sequencing errors). The RNAi-mediated inhibition of the expression of the C. elegans protein K12H4.3 (P34524, Brix family) leads to early larval morphological defects and is eventually lethal12,13. All phenotypic data agree with the hypothesis that the number of mature ribosomes, besides those of maternal origin, becomes too low (limiting for further development). In addition to Brix and yol077c/brx1, a few other proteins have also more than just a phenotypic functional characterization: Imp4 (P53941, family IV) from S. cerevisiae is a component of the U3 snoRNP complex that is crucial for correct maturation of the 18S rRNA and, hence, for generation of the small ribosomal subunit14; the L. major protein AAF39738.1 (family II) is characterized as an RNA-binding protein in its database annotation (direct experimental data are not

available). The S. pombe BAA87188 protein has nuclear localization15 and yeast Ssf1/Ssf2 are described as nucleolar proteins9. In summary, the functional characteristics for selected proteins out of three classes (Brix, Imp4 and Peter Pan), the lethal loss-of-function phenotypes for representative members of all five eukaryote groups and the conserved sequence architecture throughout the superfamily, support the notion that proteins of this superfamily appear essential for the production of an adequate supply of mature ribosomes. This finding suggests that: • The archaeobacterial hits and members of groups III and VI of hypothetical proteins might also have a role in ribosome synthesis. • This role is possibly connected with binding of rRNA at different levels of maturation and/or its conformational stabilization (RNA chaperone) for ribosome assembly10. In this context, the (positively) charged terminal segments might have a role in protein–rRNA interactions. • Whereas a single protein might suffice for Archaeabacteria, the predicted function in ribosome synthesis has possibly evolved into five subfunctions each guided by one out of the five eukaryote classes of proteins.

Research Update

We anticipate that further experimental investigation of the specific role of each of the five classes of Brix domain proteins in eukaryote model organisms, especially in yeast, will yield valuable new insights about ribosome synthesis and be key to our understanding of the ribosome biogenesis pathway. Acknowledgements

We are grateful to E. Bogengruber, M. Breitenbach, F.M. Jantsch and G. Lepperdinger for supplying experimental data on sequence and function of Brix (AF319877) and yol077c before publication and for extensive discussion of the Brix sequence analysis results. This research was supported by Boehringer-Ingelheim International. References 1 Kressler, D. et al. (1999) Protein trans-acting factors involved in ribosome biogenesis in Saccharomyces cerevisiae. Mol. Cell. Biol. 19, 7897–7912 2 Altschul, S.F. et al. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl. Acids Res. 25, 3389–3402 3 PFAM: http://pfam.wustl.edu

TRENDS in Biochemical Sciences Vol.26 No.6 June 2001

4 Migeon, J.C. et al. (1999) Cloning and characterization of Peter Pan, a novel Drosophila gene required for larval growth. Mol. Biol. Cell 10, 1733–1744 5 Mayer, C. et al. (2001) The archaeal homologue of the IMP4 protein, a eukaryotic U3 snoRNP component. Trends Biochem. Sci. 26, 143–144 6 Wootton, J.C. and Federhen, S. (1996) Analysis of compositionally biased regions in sequence databases. Methods Enzymol. 266, 554–571 7 Suarez-Huerta, N. et al. (2000) Cloning, genomic organization, and tissue distribution of human Ssf-1. Biochem. Biophys. Res. Commun. 275, 37–42 8 Yu, Y. and Hirsch, J.P. (1995) An essential gene pair in Saccharomyces cerevisiae with a potential role in mating. DNA Cell Biol. 14, 411–418 9 Kim, J. and Hirsch, J.P. (1998) A nucleolar protein that affects mating efficiency in Saccharomyces cerevisiae by altering the morphological response to pheromone. Genetics 149, 795–805 10 Nathan, D.F. et al. (1999) Identification of Ssf1, Cns1, and Hch1 as multicopy suppressors of a Saccharomyces cerevisiae Hsp90 loss-of-function mutation. Proc. Natl. Acad. Sci. U. S. A. 96, 1409–1414 11 Eurofan: http://www.mips.biochem.mpg.de/ proj/yeast/search/code_search.htm 12 Gönczy, P. et al. (2000) Functional genomic analysis of cell division in C. elegans using RNAi of genes on chromosome III. Nature 408, 331–336

347

13 http://mpi-web.embl-heidelberg.de/dbScreen/ 14 Lee, S.J. and Baserga, S.J. (1999) Imp3p and Imp4p, two specific components of the U3 small nucleolar ribonucleoprotein that are essential for pre-18S rRNA processing. Mol. Cell. Biol. 19, 5441–5452 15 Ding, D.Q. et al. (2000) Large-scale screening of intracellular protein localization in living fission yeast cells by the use of a GFP-fusion genomic library. Genes Cells 5, 169–190 16 Cuff, J.A.and Barton, G.J. (2000) Application of multiple sequence alignment profiles to improve protein secondary structure prediction. Proteins 40, 502–511

Frank Eisenhaber Research Institute of Molecular Pathology, Dr. Bohr-Gasse 7, A-1030 Vienna, Rep. Austria. e-mail: [email protected] Christian Wechselberger NIH-NCI, Tumor Growth Factor Section, Bldg. 10, R. 5B43, 10 Center Dr., MSC-1750, Bethesda, MD 20892, USA. Günther Kreil Institute of Molecular Biology, Austrian Academy of Sciences, Billrothstraße 11, A-5020 Salzburg, Austria.

A ubiquitin-interacting motif conserved in components of the proteasomal and lysosomal protein degradation systems Kay Hofmann and Laurent Falquet Ubiquitination generally serves as a signal for targeting cytoplasmic and nuclear proteins to the proteasome for subsequent degradation. Recently, evidence has accumulated indicating that ubiquitination also plays an important role in targeting integral membrane proteins for degradation by the lytic vacuole or the lysosome. This article describes a conserved protein motif, based on a sequence of the proteasomal component Rpn10/S5a, that is known to recognize ubiquitin. The presence of this motif in Eps15, Epsin and HRS, proteins involved in ligand-activated receptor endocytosis and degradation, suggest a more general role in ubiquitin recognition.

Ubiquitin is a small protein, highly conserved among eukaryotes, that becomes covalently attached to both itself and a variety of cellular proteins1,2. The role of

this ubiquitination is mostly to target proteins to the 26S proteasome degradation pathway3. In some cases, monoubiquitination (e.g. of histones) does not lead to degradation, but instead regulates other cellular processes such as chromatin remodeling4. Recently, several reports have described a role for monoubiquitination in a different pathway of protein degradation – the endocytosis and subsequent proteolysis of receptors and other transmembrane proteins by the vacuole or the lysosome5,6. According to the current model, the decisions about which protein is to be degraded at a specific time is made by the ubiquitination machinery, often in response to a prior event such as phosphorylation. Consequently, both the proteasome and the endocytosis machinery need a mechanism by which to faithfully recognize ubiquitinated proteins.

The 26S proteasome comprises two main particles: the 20S core proteasome and the 19S regulatory complex. Subunit S5a (also known as Rpn10) of the 19S regulator binds polyubiquitin chains and has a preference for chains containing four or more ubiquitin monomers. The ubiquitin-interacting region has been mapped to two short, related motifs that are found in all members of the S5a family7. Using these regions, which comprise ~20 residues, as a starting point, we searched for other potential ubiquitin-binding sequences. Specifically, we used a combination of iterative database searches with generalized profiles, and Hidden Markov Models (profile-HMMs)8. Only sequences that matched a profile or an HMM derived from previously established family members, with error probabilities of p < 0.01, were used for subsequent iteration cycles. After eight cycles, the sequence motif

http://tibs.trends.com 0968-0004/01/$ – see front matter © 2001 Elsevier Science Ltd. All rights reserved. PII: S0968-0004(01)01835-7