THUMP – a predicted RNA-binding domain shared by 4-thiouridine, pseudouridine synthases and RNA methylases


Research Update

TRENDS in Biochemical Sciences Vol.26 No.4 April 2001

215

L. Aravind and Eugene V. Koonin

Sequence profile searches were used to identify an ancient domain in ThiI-like thiouridine synthases, conserved RNA methylases, archaeal pseudouridine synthases and several uncharacterized proteins. We predict that this domain is an RNA-binding domain that adopts an α/β fold similar to that found in the C-terminal domain of translation initiation factor 3 and ribosomal protein S8.

The biochemical diversity, and hence the functional versatility, of cellular RNAs is vastly extended by numerous nucleotide modifications. These modifications range from simple methylation of the primary bases to the formation of a variety of atypical bases such as pseudouridine, archaeosine, thiouridines, wyosine and queuosine (Refs 1–3). A combination of sequence analysis and experimental studies has identified several classes of enzymes involved in base modification (Refs 4–8). A common feature of many of these in situ RNA-base-modifying enzymes is the fusion of specific RNA-binding domains, such as S4, PUA and NusB, to the respective catalytic domains (Ref. 8). These RNA-binding domains have also been detected in other proteins involved in translation and RNA processing (Ref. 8). Here, we show that the 4-thiouridine biosynthesis enzyme ThiI (Ref. 9) shares a previously unknown domain with other RNA-modifying enzymes, including pseudouridine synthases and RNA methylases, and predict that this domain binds RNA.

4-Thiouridine (s⁴U) is a modified base that is present in bacterial and archaeal tRNAs; in Escherichia coli, the formation of s⁴U is catalyzed by the IscS and ThiI enzymes, which also participate in thiamine biosynthesis (Refs 10–12). The ThiI protein contains a central PP-loop ATPase domain (Ref. 13) and a C-terminal rhodanese-like domain (Refs 14,15), which together catalyze the formation of s⁴U. This occurs via adenylylation of the 4-carbonyl group of uridine followed by sulfur insertion by nucleophilic attack on the adenylylated carbonyl group (Ref. 12). No function has been assigned to the N-terminal portion of the ThiI protein.

A search of the non-redundant protein database (NCBI) using the PSI-BLAST program (Ref. 16), with an expect (E)-value threshold of 0.01 for inclusion of sequences into a profile and the N-terminal region of ThiI (GenBank GI:1773107, residues 57–169) as the query, detected similar sequences in a variety of predicted RNA methylases from archaea, eukaryotes and bacteria, and in several uncharacterized proteins. Further PSI-BLAST searches exploiting a profile constructed from all these sequences additionally detected a similar region in an archaea-specific family of predicted pseudouridine synthases (PSUSs) typified by the MJ0421 protein from Methanococcus jannaschii. However, this conserved region was not detectable in other thiouridine synthases such as TrmU or MiaB, which is involved in the synthesis of thiolated adenine derivatives. Thus, this conserved region is shared by enzymes that are predicted to carry out at least three unrelated types of RNA modification, namely methylation, pseudouridylation and thiouridylation, and can be predicted to define a previously undetected domain involved in RNA metabolism. We named this domain THUMP after thiouridine synthases, methylases and PSUSs.

The THUMP domain consists of 100–110 amino acid residues, and a multiple-alignment-based secondary-structure prediction using the PHD program (Ref. 17) revealed an α/β fold (Fig. 1). The succession of the predicted secondary-structure elements in the THUMP domain is identical to that in the experimentally determined structures of two proteins, the C-terminal domain of the translation initiation factor 3 (IF3-C) and the N-terminal domain of ribosomal protein S8. Sequence-structure threading using the hybrid fold recognition method (Ref. 18) and the ThiI THUMP domain as query recovers the IF3-C (PDB:1TIG) as the best hit. Thus, despite the lack of detectable sequence similarity, the THUMP domain probably shares a common structural fold with these proteins that are involved in translation. The architectures of the THUMP-domain-containing proteins are generally analogous to those of the S4-, PUA- and NusB-domain-containing proteins, which combine a variety of catalytic domains with an RNA-binding domain (Ref. 8). Together, these observations suggest that THUMP is an RNA-binding domain that mediates the delivery of various RNA-modifying activities to their target RNAs.

Fig. 1 (alignment rows omitted).

Sequences in the alignment: ThiI_Ec_1773107, ThiI_Bs_3915122, TP0559_Tp_3322853, ThiI_Tm_4982270, ThiI_Mj_2128072, ThiI_Af_2649723, PH1313_Ph_3257736, MTH1685_Mta_2622815, YbcY_Ec_2501582, PA3048_Pa_9949153, XF2651_Xf_9107885, slr0064_Ssp_1001151, YpsC_Bs_1730958, PH0644_Ph_3257052, AF2178_Af_2648346, MJ0438_Mj_2128006, ROSA26AS_Mm_1778861, M04B2.2_Ce_3878583, MTH724_Mta_2621813, AF1257_Af_2649321, MJ0710_Mj_2128028, PH0338_Ph_3256729, APE0557_Ap_5104210, MJ1257_Mj_1591891, Y71F9AL.1_Ce_7105605, BC25H2.10c_Sp_2104459, YGL232w_Sc_1723972, CG15014_Dm_10727309, W02G9.3_Ce_3165569, FLJ20274_Hs_7020255, MTH909_Mta_2622005, PH0084_Ph_3256470, c40_008_Sso_6015704, APE0465_Ap_5104112, APE2431_Ap_5106135, APE1156_Ap_5104827, AF2226_Af_2648298, MJ0421_Mj_2128312, PH0080_Ph_3256466, MJ0041_Mj_2495757, PH0962_Ph_3257376, APE0546_Ap_5104199, AF1152_Af_2649435, MTH1322_Mta_2622428.

Secondary structure: ..eeeeee.eeee...eEEEee.........HHHHHHHHHHHh...............EEEEEe...............hHHHHHHHHHHHHH.........eEEe....eeEEEEEEe...eeEEEEe.

Consensus/80%: ................p.h............hpph...h..h...............sh.l..p.................pl...hs..l.p...........hplppsc..l.lpl.....h.l....

Architecture labels shown beside the alignment: THUMP+PP-loop, THUMP+~PP-loop, THUMP+2*methylase, THUMP+methylase (two groups), THUMP (two groups), THUMP+X, C4+THUMP+PSUS.

Fig. 1. Multiple alignment of the THUMP domains. The alignment was constructed by parsing PSI-BLAST-generated highest-scoring pairs of sequence segments and realigning them using the ClustalW program (Ref. 21) followed by manual corrections. A search initiated with the THUMP domain of the 4-thiouridine biosynthesis enzyme ThiI from Escherichia coli recovers its ortholog from Archaeoglobus (E = 3 × 10⁻⁶; iteration 2), the methyltransferases MJ1257 (E = 10⁻⁴; iteration 3) and M04B2.2 (E = 2 × 10⁻⁵; iteration 4), and the solo THUMP protein (MTH909, E = 10⁻⁵; iteration 5). A search with the Aeropyrum pernix predicted pseudouridine synthase (PSUS) THUMP domain recovers ThiI from Archaeoglobus in the second iteration (E = 10⁻³), thus establishing links between the most diverse family members. The secondary structure predicted using the PHD program is shown above the alignment. E indicates a β strand, H indicates an α helix, and the uppercase notation is used to denote the most confident prediction (>82% accuracy). The 90% consensus shown below the alignment was derived using the following amino acid classes: polar (p: KRHEDQNST) colored blue; hydrophobic (h: ALICVMYFW) and the aliphatic subset of these (l: ALIVMC) are all shaded yellow; small (s: ACDGNPSTV) colored green; charged (c: DEHKR) colored pink. The limits of the domains are indicated by the position numbers on each side, and the long inserts have been replaced by numbers shown in red. The THUMP domains have been grouped according to domain architectures and these are indicated to the right of the alignment. The ThiI from E. coli has an additional rhodanese domain C-terminal to the PP-loop, whereas in Pyrococcus horikoshii the PP-loop domain is highly divergent. The sequences are denoted by their gene name followed by the species abbreviation and GenBank Identifier. Abbreviations: Af, Archaeoglobus fulgidus; Ap, Aeropyrum pernix; Bs, Bacillus subtilis; C4 is a predicted metal-chelating domain with four conserved Cys residues; Ce, Caenorhabditis elegans; Dm, Drosophila melanogaster; Ec, E. coli; Hs, Homo sapiens; Mj, Methanococcus jannaschii; Mm, Mus musculus; Mta, Methanobacterium thermoautotrophicum; Pa, Pseudomonas aeruginosa; Ph, Pyrococcus horikoshii; Sc, Saccharomyces cerevisiae; Sp, Schizosaccharomyces pombe; Sso, Sulfolobus solfataricus; Ssp, Synechocystis sp.; Tm, Thermotoga maritima; Tp, Treponema pallidum; X denotes a conserved globular region that is found in a family of archaeal proteins typified by AF2226; Xf, Xylella fastidiosa.

ThiI-like s⁴U-synthases are common in bacteria and archaea, but are apparently absent from eukaryotes, which is consistent with the presence of this modification only in prokaryotic tRNAs (Ref. 2). The C-terminal rhodanese-like domain is present only in the ThiI proteins from proteobacteria and Thermoplasma; in other organisms this activity is probably supplied by a distinct, stand-alone version of this domain (Fig. 2). The PSUSs that contain the THUMP domain are found only in the archaea, and the THUMP domains in these proteins differ from all other versions in having a C4-Zn-finger insertion near the N-terminus (Figs 1,2). In most archaea, the gene encoding this PSUS is adjacent to a large operon that contains genes for several ribosomal proteins, which suggests that this PSUS modifies rRNA.

The THUMP-domain-containing predicted RNA methylases comprise two families, one of which (typified by mouse ROSA26AS and YpsC from Bacillus subtilis) is conserved in bacteria, archaea and eukaryotes. Together with the presence, in the methyltransferase domain, of an (ND)PPY signature characteristic of adenine methylases, this suggests that these enzymes methylate an adenine in the conserved core of rRNA. In proteobacteria, this methylase is fused to a second predicted purine methylase domain (Fig. 2), which indicates that these proteins catalyze two distinct RNA modifications. The other family of THUMP-domain-containing predicted RNA methylases (typified by Methanococcus MJ0710) is restricted to archaea and eukaryotes and also contains the purine-methylase-specific motif. Another archaeal–eukaryotic protein family typified by yeast YGL232w consists of stand-alone THUMP domains. This phyletic pattern is typical of components of the translation and RNA processing machinery (Refs 19,20), which suggests that this protein is an RNA-binding factor involved in one of these processes.

Fig. 2 (schemes omitted). Proteins shown, with phyletic distribution: ThiI_Ec (proteobacteria, Thermoplasma); ThiI_Bs (other bacteria and archaea); MJ0421_Mj (archaea); YGL232w_Sc (archaea, eukaryotes); AF2226_Af (archaea); YbcY_Ec (proteobacteria); ROSA26AS_Mm (archaea, eukaryotes, bacteria). Domain labels in the schemes: THUMP, PP-loop, Rhodanese, Methylase, C4, C4-finger, Pseudo U-synthase, X.

Fig. 2. Architectures of the THUMP-domain-containing proteins. The schemes are drawn approximately to scale. C4 and C4-finger are two predicted metal-chelating domains with four conserved Cys residues. X denotes a conserved globular region that is found in a family of archaeal proteins typified by AF2226. The phyletic distribution of the proteins with a particular architecture is shown next to each scheme. Abbreviations: Af, Archaeoglobus fulgidus; Bs, Bacillus subtilis; Ec, Escherichia coli; Mj, Methanococcus jannaschii; Mm, Mus musculus; Sc, Saccharomyces cerevisiae.

In conclusion, THUMP is an ancient domain with predicted RNA-binding capacity that probably functions by delivering a variety of RNA modification enzymes to their targets. Like S4 and PUA, the THUMP domain apparently evolved prior to the divergence of the three primary divisions of life. Thus, the last common ancestor of all extant life forms probably already had a versatile RNA modification system.

http://tibs.trends.com 0968-0004/01/$ – see front matter. Published by Elsevier Science Ltd. PII: S0968-0004(01)01826-6

References

1 Ofengand, J. et al. (1995) The pseudouridine residues of ribosomal RNA. Biochem. Cell Biol. 73, 915–924

2 Limbach, P.A. et al. (1994) Summary: the modified nucleosides of RNA. Nucl. Acids Res. 22, 2183–2196

3 Maden, B.E.H. and Hughes, J.M.X. (1997) Eukaryotic ribosomal RNA: the recent excitement in the nucleotide modification problem. Chromosoma 105, 391–400

4 Koonin, E.V. (1996) Pseudouridine synthases: four families of enzymes containing a putative uridine-binding motif also conserved in dUTPases and dCTP deaminases. Nucl. Acids Res. 24, 2411–2415

5 Gustafsson, C. et al. (1996) Identification of new RNA modifying enzymes by iterative genome search using known modifying enzymes as probes. Nucl. Acids Res. 24, 3756–3762

6 Mueller, E.G. and Palenchar, P.M. (1999) Using genomic information to investigate the function of ThiI, an enzyme shared between thiamin and 4-thiouridine biosynthesis. Protein Sci. 8, 2424–2427

7 Korber, P. et al. (1999) A new heat shock protein that binds nucleic acids. J. Biol. Chem. 274, 249–256

8 Aravind, L. and Koonin, E.V. (1999) Novel predicted RNA-binding domains associated with the translation machinery. J. Mol. Evol. 48, 291–302

9 Mueller, E.G. et al. (1998) Identification of a gene involved in the generation of 4-thiouridine in tRNA. Nucl. Acids Res. 26, 2606–2610

10 Lauhon, C.T. and Kambampati, R. (2000) The iscS gene in Escherichia coli is required for the biosynthesis of 4-thiouridine, thiamin, and NAD. J. Biol. Chem. 275, 20096–20103

11 Kambampati, R. and Lauhon, C.T. (2000) Evidence for the transfer of sulfane sulfur from IscS to ThiI during the in vitro biosynthesis of 4-thiouridine in Escherichia coli tRNA. J. Biol. Chem. 275, 10727–10730

12 Palenchar, P.M. et al. (2000) Evidence that ThiI, an enzyme shared between thiamin and 4-thiouridine biosynthesis, may be a sulfurtransferase that proceeds through a persulfide intermediate. J. Biol. Chem. 275, 8283–8286

13 Bork, P. and Koonin, E.V. (1994) A P-loop-like motif in a widespread ATP pyrophosphatase domain: implications for the evolution of sequence motifs and enzyme activity. Proteins 20, 347–355

14 Hofmann, K. et al. (1998) A model of Cdc25 phosphatase catalytic domain and Cdk-interaction surface based on the presence of a rhodanese homology domain. J. Mol. Biol. 282, 195–208

15 Koonin, E.V. et al. (2000) A comparative-genomic view of the microbial stress response. In Bacterial Stress Responses (Storz, G. and Hengge-Aronis, R., eds), pp. 417–444, ASM Press

16 Altschul, S.F. et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl. Acids Res. 25, 3389–3402

17 Rost, B. and Sander, C. (1993) Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol. 232, 584–599

18 Fischer, D. (2000) Hybrid fold recognition: combining sequence derived properties with evolutionary information. Pac. Symp. Biocomput. 119–130

19 Makarova, K.S. et al. (1999) Comparative genomics of the Archaea (Euryarchaeota): evolution of conserved protein families, the stable core, and the variable shell. Genome Res. 9, 608–628

20 Koonin, E.V. et al. Prediction of the archaeal exosome and its connections with the proteasome and the translation and transcription machineries by a comparative–genomic approach. Genome Res. (in press)

21 Thompson, J.D. et al. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucl. Acids Res. 22, 4673–4680
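The two annotation rows described in the Fig. 1 legend (the PHD secondary-structure row and the class-based consensus row) follow simple rules that can be illustrated in code. The Python sketch below is only an illustration under stated assumptions and is not the authors' procedure: the function names, the order in which residue classes are tested (smallest class first), the treatment of gaps as non-members, and the toy alignment rows are all hypothetical. The residue classes and the E/H notation are taken from the legend. The threshold is left as an explicit parameter because the figure label reads "Consensus/80%" whereas the legend states 90%.

```python
from itertools import groupby

# Residue classes exactly as listed in the Fig. 1 legend.
# Testing the smallest class first (so the most specific symbol wins)
# is an assumption of this sketch; the legend does not state an order.
CLASSES = [
    ("c", frozenset("DEHKR")),      # charged
    ("l", frozenset("ALIVMC")),     # aliphatic
    ("s", frozenset("ACDGNPSTV")),  # small
    ("h", frozenset("ALICVMYFW")),  # hydrophobic
    ("p", frozenset("KRHEDQNST")),  # polar
]


def class_consensus(rows, percent):
    """Return one class symbol per alignment column, or '.' when no class
    reaches `percent` of the rows. Gaps count as non-members (assumption)."""
    if not rows:
        return ""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all alignment rows must have the same length")
    symbols = []
    for column in zip(*rows):
        symbol = "."
        for letter, members in CLASSES:
            hits = sum(1 for residue in column if residue.upper() in members)
            # Integer comparison avoids floating-point trouble at the threshold.
            if hits * 100 >= percent * len(column):
                symbol = letter
                break
        symbols.append(symbol)
    return "".join(symbols)


def secondary_structure_elements(prediction):
    """Split a PHD-style row (E/e = strand, H/h = helix, '.' = other) into
    (kind, start, end, confident) tuples; positions are 0-based, end-exclusive.
    An element is 'confident' if it contains at least one uppercase letter."""
    elements = []
    position = 0
    for key, run in groupby(prediction, key=str.upper):
        run = "".join(run)
        if key in ("E", "H"):
            kind = "strand" if key == "E" else "helix"
            confident = any(ch.isupper() for ch in run)
            elements.append((kind, position, position + len(run), confident))
        position += len(run)
    return elements


# Toy alignment (hypothetical rows, not taken from Fig. 1).
toy_rows = ["KAL-", "RAIG", "DGVG", "EAMW", "HSCG"]
toy_consensus = class_consensus(toy_rows, percent=80)

# Secondary-structure row as printed above the Fig. 1 alignment.
FIG1_PREDICTION = (
    "..eeeeee.eeee...eEEEee.........HHHHHHHHHHHh...............EEEEEe"
    "...............hHHHHHHHHHHHHH.........eEEe....eeEEEEEEe...eeEEEEe."
)
fig1_elements = secondary_structure_elements(FIG1_PREDICTION)
fig1_order = "".join("E" if kind == "strand" else "H" for kind, *_ in fig1_elements)

print(toy_consensus)
print(fig1_order)
```

For the toy rows the sketch prints `csl.`; for the Fig. 1 row it prints `EEEHEHEEE`, that is, three strands, a helix, a strand, a helix and three strands, with the first two strands predicted in lowercase only.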

L. Aravind*† and Eugene V. Koonin†

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.

*e-mail: [email protected]

†Acknowledgement

We thank an anonymous reviewer for suggesting a suitable acronym for this domain.