consequence of a lack or excess of DMPK protein. Resolution of the alternative mechanisms underlying the devtlopment of DM requites the production~of transgenic mouse models that include an exoanded CfG ;Meaa;hmust be the next goal of
235-238
6 Hoffmann-Radvayi, H. et al. (1993) Hum. Mol. Gene;. 2.
_ 1263-1266 Krahe. R. et al. (1335) Genomics 28. ’ l-14 D Wang.J.Z. etal. (199j) Hmn. Mol. O Genet. 4,X%606 9. Jansen, G. et al. (1996~ Nat. Genet. 13,316-323 10 Reddy,S. etal. (1996) Nat. Genet.
Admowledgements 13.3i4-335 We thank Gill Bates and Sita Reddy for helpful discussions. Work in our II Justice, R.W. et al. (1995) Genes Dev. 3. i34-546 Iabontory is funded by the Muscular 12 Wang, Y.H. et al. (1994) Science Dystrophy Association. ReferenceS f Harper. P. (1989)Mrotonic L@stmply. Majorproblems in 2
:&r~rnlo~r21. Saunders Brook, I.D. etal.(1992) CeN68.
799-noi3
3 Carango, P., Noble,J.E.. Marks. H.G. and Funanage.V.L. (1993)
Gerronrics18.340-348 etal. t 19933 Nat. Gene:. 4. 233-238 5 Fu. Y.H.et al. (1993)Science 260,. 4 Sabouri, LA.
18
Jansen, G. et nl. (1995)Hum.
Mol.
Genet. 4,843-852
19 Shaw. DJ. ct al. (1993)Genomics 18.67%679 20
StJohnston. D. (1995) Ce//81. 161-170.
21 Rastinejad. F., Conboy, M.J.. Rando.
T.A. and Blau. H.M. (1993) Ce1175, 1107-1117 22 Rastineiad. F. and Blau. H.M.(1993) Cell72,‘903-917
Tanela. K.L.et al. C1WjJ.l. Cell Ho!. 128.995-1002 24 Willems. P. (1994)Nut. Genet. 8. 213-215 25 Zeitlin. S. et nl. (1995)NN/. Genet 265.669-671 11.155-163 13 Pearson, C.E. and Sinden. R.R. 26 Duyao. M.P. et al. (1995)Scieuce (1996)Biochemistty35, 5041-5053 269.407-410 14 Timchenko, L.T., Timchenko. N.A.. Caskey.CT. and Roberts, R. (1996) 27 Nasir, J. et nl. (1995) Celi81.81l-823 28 BurriRht.E.N. et al. (1995)Cc//82. Hum. MO/. Genet. 5. 115-121 937-948 IS Boucher. C.A. er al. (1995)Hum. 29 Ike& H. etal. (1996)NW. Genet. Mol. Gem?. 4. 1919-1925 13.196-202 16 Otten. A.D. and Tapscott. SJ. (1995) Pmt. Nat/. Acad. Sci. U. S. A. 12. 30 Bingham. P.M. etal.(1’9%) Nat. Genet. 9. 191-196 54654469 31 The Dutch-Be&n FragileX 17 Shaw, DJ. et al. (1993)J. Med. Consortium (1994)Cell78 23-33 Genet.30. 189-192 23
LETTER
Non-orthologous genedisplacement ,- _+N Unrelated proteins with the same biochemical activity can evolve independentlyt. Without complete genome information, however. it is not possible to ascertain whether a particular function is encoded by homologous or non-homologous genes in two different species. The chance always remains that the counterpart to a given gene in one species has not been sequenced in the other. In an attempt to prove and to assess quantitatively the postulate that dis:inct proteins in different species can be responsible for rhc same hmcticn, we compared the first two genomes to be sequenced completely, those of the parasitic bacteria Huemop~iltcs inJluenzae2 and M>roplasma genituhrm:‘. For this problem to be addressed, the function of the gene in question has to be essential in both organisms. Despite this ]iIIIitatiOn, we found a considerdhle number of cases where apparently indispensable cell functions are encoded by non-onhologous genes. 8y definition, orthologs are genes that are related by vertical
descent from a common ancestor and encode proteins with the same function in different species. By contrast, paralogs are homologous genes that have evolved by duplication and code for proteins with similar, but not identical, functions’. Comparison of the 1703 putative proteins encoded by the H. inJlfrmtzaegenome2.s with the 469 M. genitalitrm protein sequences, revealed that 233 showed sufficiently high similarity for them to be considered s;C~ologs (the criteria for distin_+tishing orthologs from ~an!ogs a.p= _ discussed in Ref. 5, in which the genome of H. hlfltrenrue was compared with the available 75% of the ficberfchia co/i genome). In addition, by comparing the remaining M. genitulitrm and H. ittflttrertzae proteins with those in the non-redundant sequence database, we identified 12 clear-cut cases where essential cellular functions in the two bacteria were encoded by non-orthologous (i.e. unrelated or paralogous) genes (Table 1). We call this phenomenon non-orthologous gene TIC SEPTEMBER1996 VOL. 12 No. 9
displacement, although its origins can be quite diverse. Nucleoside diphosphate kinase (NDK) appears to be a case where an essential function has been taken over by an unrelated protein. This example shows the impact of the concept of non-orthologous gene displacement on the analysis of complete genome sequences. Knowing that a function, in this case the formation of nucleoside triphosphates from diphosphates catalyzed by NDK, is indispensable but that the given genome does not encode homologs of the respective proteins from other organisms, one shot&J sccecn functionally unassigned gene products for likely candidates. In M. genifulitrm, we found two plausible candidates for a novel NDK, both of which contain sequence motifs typical of nucleoside and nucleotide kinases. The case of the repair DNA polymerase shows the complexity of the scenarios leading to non-orthologous gene displacement. It appears that a duplication of the DNA polymense 111(Pal 111)gene in the common ancestor of Gram-positive and Gram-negative bacteria twitich also encoded the
LETTER TABLE 1.Non-orthobgous genes coding forthe samefunction in Mycopbzma genftulfum Ifaemopbilus in~ae
and
H. fnJluarrne
M. g4?mwiunl
- -._ No fnqwmce similarity between M. genitahn Phosphoglycerate
MG43O
PMGI_BACSU PMGI_ECOLI PMGI_MAIZE
and H fn~errzae~ieirr~ HI0757 PMGl_ECOLI PMGMJIUMAN Q@mA) not in G(+)
MG46iI
LDH_BACSU ~HM_HU~
tfctDor ffd~~
LLDDJXOLI G(+f The HI enzyme is distantly related to eukatyotic thee BZ
LPLAJXOLI seu~Lo46W_l
HI0027 tl@EJ
LIPB_ECOLI 551458 (yeast)
E. coli and yeast encode both
None
N10876 ( rrdk)
NDK_ECOLI NDKB_HUMAN
The two predicted ldnases in MG are candidates for this indispensable activity
DPOl_ECOLI DPO1J~CIX.Z
MG encodes two homoloe of DNA polymense III. MG261 is the likely repair polymemse as it belongs ta a putative repair opero&
mm&e
L-lactate dehydrogenase
H11739B
Escheticbia coli encode5 both types of enzymes
Lipoate-protein ligase
MG270
Nucleoside diphosphate kinase
MG?&F’
DNA polymerase. repair
MG261 (dna.8
DP3A_~IN DP3A_ECOLI
HI0856
RNase H
MG262?d
DPOI_BACCA DPOI_HAEIN
HI0138 ( rf?bA): HI1059 ( nrhBl
RNH_ECOLI RNHl_YFAST RNH2_ECOLI MC326_1 @I. caprtcoi..) SC23CDS_13 (yeast)
MG262 is homologous to the 5’-3’ exonuclease domain of DNA polymerase I. It is predicted m replace the two un~l~ted RNases H of HI in primer removal during DNA replicadon
Glycyl-tRNA syntbewse
MG251
SYG_HUMAN
NIO927 0 HIQ924 Q$vs1
SYGA_ECOLI SYGB_ECOLI CFUZOj47_I (Chl3mydia) G(-1
The MG enzyme conrains one subunit. the HI counterpart two
Yeast encodes both types of enzymes
MG26@
fpolA)
types of enzymes
l?uw@s hi AI genitatkm and H. infiwnzae Prolyl-tRNA synthetase
MG283
Cytidine deaminase
MG052
Pyruvate
MG273
dehydrogenase El component LInnd p subunits
MG274
YHIO_YEAST
CDD_BACSU CDD_HU~
HI0729
SYP_ECOLI
@rvs,
YER7_YEAST
HIl3;O
CDD_ECOLI
The MG cytidine deaminase is more closely related to euka~otlc enzymes thzn to those from Gt+) bacteria
ODPl_ECOLI G(-1
Two subunits in C(*) barrerla (including MG) and eukaryotes, once
(cd&
ODPA_BACSU HI1233 ODPA_HUMAN (aceJ.?l ODPB_BACSiJ ODPB_HUMAN
subunit in CC-) bacteria
3 The M. genil~iirf~ and H. fjt~t~~zae proteins are designated as in Refs 3 and 2. respectively. The onhol~ous E. cofl @ne is indicated in parentheses where available. bThe SWISS-PROT code is given wherever available. GW and Gf.+) indicate the species range of the orthologs CGramnegative and Gram-Positive, respectively). C Abbreviations: MG. M. genitalittm, HI, H. inJ7trenzac. d? indicates that there are no detectable ortholop, but the functions are evidently essential for the organisms: thus. the detected similarity to other proteins with similar functions is suggestive but nerds IO be vpritied experimentally.
repair enzyme Pol If. was foIlowed by 3 subsequent elimination of one of the Pal 111pamlogs in the Gram-negative lineage and of Pol I in the Mycoptasmalineage’). As a result, one of the Poi 111pamlogs, which is encoded in the same
operon as two pu~tive repair enzymes, has probably ovenaken the repair polymemse functinn in the mycoplasmas (Ref. 6 and Table 1). Theoreticaliy. one may imagine that in a minimal organism. a single polymerase would he responsible
TIG SEPTEMBER 1996 VOL. 12 No. 9
335
for repli~tion and repair DNA synthesis-. The anatysis of the actual
dt. geuitdimn genes. hcnvcqxz makes a compelling cxse for non-onholo~gous displacement. The situation with IWase H is even mcxe inrtic;rte. fl. it1JllIPilzClL~
and E. colieach encode two RNases H. These enzymes are involved in the removal of RNA primers during DNA replication. which is evidently essential for the ~mpletion of replication. In M. genitnlirtm. there is no homolog of any know RNase H. and the activity is likely to be provided by the ~G262 prorein that is homologous to the N-terminal. Y-3 exonuclease domain of Pof I (Ref. 6 and Table I). This example shows the limitations of the classical defmition of 0~010~ as even though the MG262 protein is more closeiy related to PO! I than to any other protein in H. j?I~~~~zueor other bzicteiia. it is obvious that the functions of these proteins only partially overlap. FiialIy. as two ws of aminoacyl-tRNA synthetases of J*Lgenifalitrm have only eukaryotic orthvIogs, horizontal gene transfer with subsequent displacement of the bacterial enzymes seems to be a plausible mechanjsm (TabIe Il. In addition to the I1 instances of non-orthologous gene displacement unrelated
summarized in Table 1. we found more than IO cases in which it remains unclear whether the ,&I.genitalitrm and H. ittJ7ttmzae genes responsible for a particular function are orthologs or paralogs (data not shown). Even though the respective proteins are highly similar in M. ~enif~lft~~~ and H. inJi’trenzae, they show approximately the same level OF similarity to their eukaryotic or archaeal counterparts, suggesting the possibility of non-orthologous displacement. However, alternative hypotheses. such as an unusual ‘burst-like’ mode of evolution of these genes, cat-m& be ruled out. It is remarkable that as many as 11 non-onhologous genes with the same functions have been found compared with the 233 apparent orthologs in H. itlflrcerrzae and M.genilalirrm. In larger genomes. there are likely to he hundreds of such cases, and they will require special considemtion in the reconstruction of metabolic
pathways and in comparative genome analysis. EUGENE V.KOONIN*, ARUDYIR MUSHEGIAN’ ANDF’EER BORK~ ~rk~embl-heidelberg.de
References 1 Dcolittie. R.F.(19941Tnwh Biochem. Sci. 19, Ii-18 2
Fleischmann, R.D. etal.(199%
3
Science 269. ;iclc,--?12 Fmser, C.M.et al. (1995) Science 270. 397-403
4
Fitch. W.M.f 1970)$@. Zeal. 19, 99-106
6 Koonin. E.V.and Eork. P. (1996) TrendsBiochent. Sci. 21. 128-129 7 MusheRian.A.R.and Koonin. E.V. Pfvc. A&f!.Acad Sri. C! S. A.
r
The 1997 Bfst IEI~onic Chmf&xmce in Cd l3io~ogy dfororgarnizers
7he September 1996issue of TisSis a thematic issue on the RNApoiymen.se II uanscription machinery.Soreral review articles have been specially commissioned on key areas OFthis rapidly expanding topic, including: RNApoIynaerase II triu~~Iptian @ R. Korniqy
’ This September issueof
trends in CELL BIOLOGY i cat&s an interview with Barry Hardy on virtual c0nfetences. In this in~view, Dr Ha&~ issues an invi~& to scienti~s working in cell biology and i .nzki*&?d&Ul*#ines t;J ‘&come involved in the 1997 Fii International Ceil Biology Electmnic Conference. This is a new venture, and requires a cell biological advisory ~~~ to assist r&h organizing the meeting sessions and inviting peters. Anyone interested in participating, at any level, should contact either Barry Hardy (
[email protected]) or the W&S fn CELLBIOLOGY editorial team (at tcb~e~~ier.co.uk or the editorial address listed at the front of thjs journal>.
TAFs mediate tramzriptional activation and ~~~~ 4 C.P. Z%i@zerti.%dR. 7Jfati The
RNA polymcrww Bl gene?& elon@ion Iactors by D. Reines, J. W. Conuway and R.C. ~o~~u~~v Thehumangencralc0&sors b K. Miser and M. Mefrferentst
The
multiple rotas of tramicdptiad~ factor TFnH 19J.Q. S~~tn~, P. Vi& and&M. i@y Mediator of trans&flonnl rqulation t.y S. BjcWdundand Y-S,Ktm ~e~le~~~~~~f~~ mnscription by RNApol~merase II @ R.G. Roeder
TIG SEMIZMIWI 1996 Vol.. 12 No. 9
336
an&o1