Mechanisms for multiple intracellular localization of human mitochondrial proteins


Mitochondrion 3 (2004) 315–325 www.elsevier.com/locate/mito

Review

Jakob Christian Mueller*, Christophe Andreoli, Holger Prokisch, Thomas Meitinger

Institute of Human Genetics, GSF - National Research Center for Environment and Health, Ingolstaedter Landstrasse 1, D-85764 Neuherberg, Germany

Received 6 May 2003; received in revised form 13 September 2003; accepted 5 February 2004

Abstract

There is an increasing number of reports that some single gene products function in more than one cellular compartment. This review lists and categorizes the targeting mechanisms of 31 human mitochondrial proteins that have multiple localizations. Further, genetic disorders based on mislocalization are described, and prediction algorithms for multilocalized proteins are proposed. A high diversity of experimentally verified targeting mechanisms exists, ranging from single-protein to multi-protein mechanisms, with a combination of multiple transcription starting points and alternative splicing being the most frequent. This observation stresses the individuality of the evolutionary histories of such mechanisms. We did not find specific localization strategies to cluster with certain protein functions. There was also no bias with respect to the evolutionary origin of the multicompartmentalized mitochondrial proteins. Genes of both bacterial and eukaryotic origin show multiple localization, which does not corroborate the hypothesis that the development of multiple targeting is coupled predominantly with the recruitment of nuclear eukaryotic genes for novel mitochondrial functions.

© 2004 Elsevier B.V. and Mitochondria Research Society. All rights reserved.

Keywords: Subcellular localization; Dual localization; Alternative splicing; Protein isoforms; Targeting sequence; Evolution of the mitochondrial proteome; Bacterial origin; Eukaryotic origin; Mitochondrial disorders

1. Introduction

A characteristic of eukaryotic cells is that specific metabolic reactions are sequestered within discrete intracellular compartments. It is therefore unsurprising that the products of most coding genes are allocated to only a single location within the cell.

* Corresponding author. Tel.: +49-89-3187-3464; fax: +49-89-3187-3297. E-mail address: [email protected] (J.C. Mueller).

Most enzymes of the citric acid cycle, for instance, are restricted to mitochondria. According to the current annotations, about 92% of the human mitochondrial proteins are solely localized in mitochondria (MITOP2, Andreoli et al., 2004). However, some metabolic processes take place in more than one compartment. Well-known examples include DNA/RNA metabolism in the nucleus, mitochondria and chloroplasts; protein synthesis in the cytosol, mitochondria and chloroplasts; fatty acid β-oxidation in the peroxisomes and mitochondria;

1567-7249/$20.00 © 2004 Elsevier B.V. and Mitochondria Research Society. All rights reserved. doi:10.1016/j.mito.2004.02.002


J.C. Mueller et al. / Mitochondrion 3 (2004) 315–325

and antioxidant defense in the cytosol, mitochondria and peroxisomes. By far the most common mechanism for multiple localization is for functionally similar gene products in different compartments to originate from different gene loci, which evolved either as gene duplications (paralogs) or by convergent evolution (analogs) (Karlberg et al., 2000). But there is also an increasing number of observations that some proteins encoded by single genes occur and function in more than one compartment, and it may be suspected that the actual number of such double locations is vastly underestimated because of the unequal proportions at which proteins are distributed in different compartments (Danpure, 1995; Otterlei et al., 1998; Soltys and Gupta, 1999a). The maintenance of multiple genes for similar functions in different compartments has the advantage of completely independent expression control (Lynch and Force, 2000). In contrast, multilocalized functions, whereby the protein is encoded by a single gene locus, require sophisticated regulation mechanisms for differential targeting into different intracellular locations. The evolution of dual localizations of single gene products is particularly interesting in the context of the mitochondrion because of the concerted evolution of two genomes. Since the endosymbiotic event giving rise to the mitochondrion, massive gene transfer from the proto-mitochondrion to the nucleus and the subsequent gene reduction in the nucleus and the mitochondrion are ongoing processes in certain lineages and probably indicate such a trend of decreasing redundancy and specialization of targeting regulation (Adams et al., 2000; Kurland and Andersson, 2000; Gray et al., 2001). Two hypotheses can be proposed for the origin of nuclear genes that encode multilocalized mitochondrial proteins: (1) Genes of eubacterial (endosymbiont) origin take over additional, but similar, functions outside the mitochondrion.
The gene products thereby have to acquire the ability to target into more than one compartment. (2) Genes of eukaryotic (host) origin include mitochondrial functions by multicompartmentalization. It has been argued that proteins involved in novel functions of regulation and transport processes following the endosymbiotic event tend to be recruited from the eukaryotic genome (Karlberg et al., 2000). These eukaryotic proteins, which expanded the mitochondrial proteome but still retained their original functions, are probably the most prominent candidates for dual localization.

This review focuses on the various mechanisms for intracellular sorting of single-gene proteins that localize to more than one cellular compartment, with an emphasis on mitochondrial proteins. We scanned several databases (PubMed, LocusLink, Swiss-Prot, MitoP2, OMIM) using queries with different combinations of keywords from compartment terminology and the field of subcellular localization and found 31 experimentally confirmed examples of multilocalized mitochondrial proteins. Cross-references within the articles were an additional valuable source of information. The compilation of mechanisms sought to answer the following questions: Are there a few consistent mechanisms or many unique strategies? Do specific strategies cluster with certain functions or proposed evolutionary origin? What will be the relative success of finding new candidates of multiple localization by searching either at the DNA/RNA level or at the protein level and using combinatorial tools for the prediction? In the long run, knowledge of the variability of targeting mechanisms may help in understanding disorders based on mistargeted proteins and in designing location-specific drugs (Soltys and Gupta, 1999a; Spooner et al., 2001).

2. Multiple sorting by multiple gene products

The majority of the estimated 1000–2000 mitochondrially compartmentalized proteins are encoded by nuclear genes and are imported into the organelle by specific signals (Jaussi, 1995; Truscott et al., 2001; Lightowlers and Lill, 2001). In humans, only 13 proteins, all subunits of the respiratory chain complexes, remain encoded by the mitochondrial genome. Because the nuclear-encoded proteins are synthesized at cytosolic ribosomes, the cytosol may be considered the default compartment. In general, all proteins directed to compartments other than the cytosol require some structural information for their correct targeting. Targeting information inherent to the DNA sequence can produce a specific segment of the peptide that enables the recognition of compartment-specific receptors and initiates the transport across or the insertion into the relevant membrane (Neupert, 1997). Several such characteristic targeting sequences (TS) are known for different compartments. Mitochondrial and endoplasmic reticulum TSs (MTS and ERTS, respectively) are usually localized N-terminally, nuclear TSs (NTS) are often distributed internally, and peroxisomal TSs (PTS) are usually located at the C-termini. It has been argued that the N-terminus of the pre-protein that bears a targeting signal attaches to specific membrane receptors and initiates the translocation before the complete protein can fold and expose the other TSs (Neupert, 1997; Truscott et al., 2001). This mechanism produces dominance of N-terminal TSs over more downstream TSs. The crucial question for multiply localized proteins is how different targeting signals of the isoforms are alternatively expressed or used. Potential strategies are categorized in Table 1, and reported experimental examples are sorted according to these strategies in Table 2. Based on current knowledge of gene expression, sequence-based targeting information can be expressed differentially on three levels: transcription (type I), mRNA splicing (type II) or translation (type III). These fundamental types of sequence-based mechanisms are shown in Fig. 1. If a gene has more than one promoter, transcripts with different 5′ ends will inevitably arise. If these transcripts have different first in-frame start codons, then they have the potential to encode polypeptides that differ in an N-terminal TS only (Fig. 1: type I). Splicing of the mRNA has even more possibilities in producing proteins with different TSs.
Table 1. Strategies for alternative subcellular targeting of mitochondrial proteins

Multiple genes–multiple proteins
Single gene–multiple proteins
    Multiple transcription starts: Type I
    Alternative splicing: Type II
    Multiple translation starts: Type III
    Combinations of the above mentioned
Single gene–single protein
    Post-translational modification: Type IV
    Inefficient/chimeric targeting: Type V
    Protein–protein interactions: Type VI
    Shuttle or export from mitochondrion: Type VII

Alternative splicing at the 5′ end can skip an otherwise extant exon that bears TS information for the N-terminus (Fig. 1: type II). Exon skipping can accomplish this also for internal signals, and alternative 3′ cleavage can change the C-terminal information. Based on the 31 listed examples in Table 2, there is only one example of a pure type I (FPGS; for references see Table 2) and two examples of a pure type II strategy (BCAT2 and OGG1). In most cases (seven examples), type I is combined with type II. This combination can control a hierarchically dominant 5′ TS by alternative 5′ exons (Fig. 2). Known examples comprise not only proteins with mitochondrial–cytosolic distribution (PANK2 and GPX4) but also proteins with mitochondrial–nuclear distribution (UNG and DUT). In both cases, it is the mutually exclusive MTS information that is responsible for the dual localization. A well-characterized example is uracil-DNA glycosylase (UNG; Nilsen et al., 1997). Whereas the MTS on the second exon is completely exchanged in the two isoforms, the NTS is excluded only partly in the mitochondrial isoform (Fig. 2). Dominant effects of the N-terminal MTS over the internal NTS have also been reported for the gene products of the human MutY homolog (MUTYH; Takao et al., 1999). Protein isoforms that differ only in their N-terminal sequence can also be generated by alternative translation starts (Fig. 1: type III). This simple mechanism is one of the most frequent (5 cases). Examples include GARS, NFS1 and GSR. In the case of NFS1, differential start codon utilization has been shown to be pH-dependent (Land and Rouault, 1998). This strategy has the potential to react quickly to environmental changes and further enables antagonistic regulation. If one product is up-regulated, the other is automatically down-regulated owing to the efficiency of the translation machinery.
Other supposed mechanisms for the control of multiple start codons in a single transcript are suboptimal first translation starts according to the characteristics of Kozak (1989) or ribosomal shunting (Fütterer et al., 1993). Type III is frequently combined with alternative splicing (six examples). Almost all the mechanisms of substituting targeting sequences require two different start codons, but, in contrast to the strategy of alternative 5′ exons (type I combined with II), here both alternative start codons are

Table 2. Examples of multiple localization (mitochondria and others) of single gene products with experimental evidence. Proteins are sorted according to the types in Table 1.

SwissProt ID | Appr. abbr. | Protein name | Proposed function | Localization | Targeting sequences (verified) | Transcription starts | Alt. splic. | Translation starts | Type (see Table 1) | Reference
Q05932 | FPGS | Folylpolyglutamate synthase | Cell proliferation | Mito–cyto | MTS | >2 | n | 2 | I | [6]
O15382 | BCAT2 | Branched-chain aminotransferase | Amino acid metabolism | Mito–cyto | MTS | 1 | y | 1 | II | [32]
O15527 | OGG1 | 8-oxoguanine DNA glycosylase | DNA repair | Mito–nucl | MTS?, NTS | 1 | y | 2? | II, but internal III | [16–19]
P41250 | GARS | Glycyl-tRNA synthetase | Protein biosynthesis | Mito–cyto | MTS | 1 | n | 2 | III | [7,8]
Q9Y697 | NFS1 | Iron-sulfur cluster assembly enzyme | Inorganic sulphur supply | Mito–cyto–nucl | MTS | 1 | n | 2 | III | [11]
P00390 | GSR | Glutathione reductase | Defense against oxidation | Mito–cyto | MTS | 1 | n | 2 | III | [34]
P30044 | PRDX5 | Peroxiredoxin 5 | Antioxidant defense | Mito–perox | MTS, PTS | 1 | n | 2 | III? | [27]
O43402 | NOC4 | Neighbor of cox4 | ? | Mito–cyto–nucl | weak MTS, NTS | 1 | n | 2 | III or V | [10]
Q9BZ23 | PANK2 | Pantothenate kinase 2 | CoA biosynthesis | Mito–cyto–nucl | MTS, NTS | 2? | y | 2 | I + II | [39]
Q93028 | UNG | Uracil DNA glycosylase | DNA repair | Mito–nucl | MTS, NTS | 2 | y | 2 | I + II | [16,20]
P33316 | DUT | dUTP pyrophosphatase | DNA metabolism | Mito–nucl | MTS | 2 | y | 2 | I + II | [14]
P43155 | CRAT | Carnitine acetyltransferase | Intermediary metabolism | Mito–perox–ER | MTS | 2 | y | 2 | ~I + II | [25,26]
P36969 | GPX4 | Phospholipid hydroperoxide glutathione peroxidase | Defense against oxidation | Mito–cyto | MTS | >2 | y | 2 | I + II? | [36,37]
Q9UIF7 | MUTYH | MutY homolog (DNA glycosylase) | DNA metabolism | Mito–nucl | MTS, NTS | 1 or 2 | y | 2 | I + II or II + III | [13]
P29372 | MPG | N-methylpurine-DNA glycosylase | DNA repair | Mito–nucl? | ? | 1 or 2 | y | 2 | I + II or II + III | [15,16]
Q15046 | KARS | Lysyl-tRNA synthetase | Protein biosynthesis | Mito–cyto | MTS | 1 | y | 2 | II + III | [5]
Q9UBK8 | MTRR | Methionine synthase reductase | Amino acid metabolism | Mito–cyto | MTS | 1 | y | 2 | II + III | [35]
Q92667 | AKAP1 | A kinase anchoring protein 1 | Intermediary metabolism | Mito–ER | MTS, ERTS | 1? | y | 2 | II + III | [12]
Q9UMS0 | HIRIP5 | HIRA interacting protein 5 | Fe–S cluster assembly | Mito–cyto–nucl | MTS | 1 | y | 2 | II + III | [40]
P35670 | WND | P-type ATPase | Copper transport | Mito–cyto–golgi | ? | ? | ? | ? | IV | [38]
P35914 | HMGCL | 3-Hydroxymethyl-3-methylglutaryl-Coenzyme A lyase | Lipid metabolism | Mito–perox | MTS, PTS | 1 | n | 1 | V | [23]
Q9UHK6 | AMACR | Alpha-methylacyl-CoA racemase | Fatty acid metabolism | Mito–perox | MTS, PTS | 1 | n | 1 | V | [24]
Q9NUL7 | DDX28 | DEAD/H box polypeptide 28 | RNA structuring, splicing | Mito–nucl | MTS, NTS | 1 | n | 1 | VII | [22]
P00505 | GOT2 | Aspartate aminotransferase | Amino and fatty acid metabolism | Mito–cellsurface | MTS | 1 | n | 1 | VII? | [28,29]
Q07021 | C1QBP | P32 protein, gC1q receptor | Receptor (and signaling) | Mito–cellsurface–ER–nucl | MTS | ? | ? | ? | VII? | [30]
Q9Y3B8 | SFN | Small fragment nuclease | DNA/RNA metabolism | Mito–nucl | MTS, NTS | 1 | y | 2 | ? | [21]
P50440 | GATM | Glycine amidinotransferase | Energy/creatine metabolism | Mito–cyto | MTS | ? | y | 2 | ? | [9]
P07954 | FH | Fumarase | Intermediary metabolism | Mito–cyto | MTS | ? | y? | 2? | ? | [1,2,3]
P32019 | INPP5B | Inositol polyphosphate-5-phosphatase, 75 kD | Signal terminating | Mito–cyto | ? | ? | y? | ? | ? | [4]
? | SLIT3 | Slit homolog 3 | Endocrine system | Mito–cellsurface | MTS | ? | ? | ? | ? | [31]
P31415 | CASQ | Calsequestrine/Calmitine | Ca2+ regulation | Mito–SR | ? | ? | ? | ? | ? | [33]

The table does not include marginal mitochondrial proteins, such as ligands to the outer side of the outer membrane and proteins involved in the mitochondrial-mediated pathway of apoptosis (for a review see Pollack and Leewenburgh, 2001). mito, mitochondrial; cyto, cytosolic; nucl, nuclear; perox, peroxisomal; ER, endoplasmic reticulum; SR, sarcoplasmic reticulum; golgi, golgi apparatus. [1] Petrova-Benedict et al., 1987, [2] Jaussi 1995, [3] Kinsella and Doonan, 1986, [4] Speed et al., 1995, [5] Tolkunova et al., 2000, [6] Freemantle et al., 1995, [7] Mudge et al., 1998, [8] Shiba et al., 1994, [9] Humm et al., 1997, [10] Bachman et al., 1999, [11] Land and Rouault, 1998, [12] Huang et al., 1997, 1999, [13] Takao et al., 1999, [14] Ladner et al., 1996; Ladner and Caradonna, 1997, [15] Pendlebury et al., 1994, [16] Otterlei et al., 1998, [17] Aburatani et al., 1997, [18] Locuslink, [19] Bjoras et al., 1997, [20] Nilsen et al., 1997, [21] Nguyen et al., 2000, [22] Valgardsdottir et al., 2001, [23] Ashmarina et al., 1999, [24] Amery et al., 2000, [25] Corti et al., 1994, [26] van der Leij et al., 2000, [27] Knoops et al., 1999, [28] Bradbury and Berk, 2000, [29] Soltys and Gupta, 1999b, [30] Soltys et al., 2000, [31] Little et al., 2001, [32] Than et al., 2001, [33] Bataille et al., 1994, [34] Kelner and Montoya, 2000, [35] Leclerc et al., 1999, [36] Pushpa-Rekha et al., 1995, [37] Kelner and Montoya, 1998, [38] Lutsenko and Cooper, 1998, [39] Hörtnagel et al., 2003, [40] Tong et al., 2003.

Fig. 1. Fundamental types of sequence-based mechanisms for dual localisation. For each type the genome is depicted in the middle, with the alternative mRNAs at the top and at the bottom; the exon–intron structure is shown by black boxes connected with lines. Further symbols mark transcription starts, translation starts and possible MTSs.

supplied simultaneously on the longer transcripts (Fig. 3). Either of the start codons serves as the translation starting point (MTRR). One elegant mechanism to switch between the potential start codons is the introduction of a stop codon by alternative splicing (KARS, AKAP1, HIRIP5). This leads to the termination of the product initiated from the first translation start. The second downstream start codon then becomes important in initiating the complete protein (Fig. 3). The AKAP1 protein is another example of overlapping TSs and N-terminal dominance (Huang et al., 1997, 1999). Here, the ERTS is split into two parts. The 5′ segment serves two functions: it suppresses the downstream MTS and exposes the second part of the ERTS that overlaps with the MTS. An exception to the general N-terminal regulation are the splice variants of OGG1 (type II), in which alternative splicing at the C-terminus optionally includes or excludes a nuclear translocation signal (Aburatani et al., 1997), whereas the MTS-containing N-terminus seems to be invariant. Although a variety of potential control mechanisms for alternative splice sites are discussed in Caceres and Kornblihtt (2002), little is known in specific cases. The same holds for different transcription initiation sites. The need for fine-tuned regulation is expected from the tissue- or development-dependent differences in subcellular distribution ratios (Corti et al., 1994). Instead of regulating both isoforms, examples exist where only one is regulated whereas the other is expressed in a constitutive fashion. For instance, the expression of the nuclear form of dUTP pyrophosphatase (DUT) is correlated tightly to the nuclear DNA replication status, whereas the mitochondrial form seems to be constitutive (Ladner and Caradonna, 1997).
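The interplay of alternative translation starts (type III) and N-terminal dominance described in this section can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not a published algorithm: the transcript, the signal positions, the helper names (in_frame_starts, predicted_compartment) and the strict rule that the most N-terminal retained signal always wins are all hypothetical simplifications of the argument cited from Neupert (1997).

```python
from dataclasses import dataclass

STOP_CODONS = {"TAA", "TAG", "TGA"}

# Assumed mapping from targeting-sequence class to compartment.
COMPARTMENT = {
    "MTS": "mitochondrion",
    "ERTS": "endoplasmic reticulum",
    "NTS": "nucleus",
    "PTS": "peroxisome",
}


@dataclass(frozen=True)
class Signal:
    kind: str   # key of COMPARTMENT
    start: int  # nucleotide offset of the signal's first codon in the transcript


def in_frame_starts(mrna: str, max_starts: int = 2) -> list[tuple[int, str]]:
    """Return (offset, coding sequence) for the first ATG and for later ATG
    codons in the same reading frame, up to the first in-frame stop codon."""
    mrna = mrna.upper()
    first = mrna.find("ATG")
    if first == -1:
        return []
    codon_positions = range(first, len(mrna) - 2, 3)
    stop = next((i for i in codon_positions if mrna[i:i + 3] in STOP_CODONS), None)
    # Without a stop codon, read to the end of the last complete codon.
    end = stop if stop is not None else first + 3 * len(codon_positions)
    starts = [i for i in range(first, end, 3) if mrna[i:i + 3] == "ATG"]
    return [(s, mrna[s:end]) for s in starts[:max_starts]]


def predicted_compartment(isoform_start: int, signals: list[Signal]) -> str:
    """Apply strict N-terminal dominance: the most upstream signal that is
    still encoded by the isoform decides; no signal means the cytosol."""
    retained = [s for s in signals if s.start >= isoform_start]
    if not retained:
        return "cytosol"
    return COMPARTMENT[min(retained, key=lambda s: s.start).kind]


# Hypothetical transcript: 5' UTR, first ATG followed by an MTS-coding
# stretch, second in-frame ATG, body with an internal NTS, stop codon.
TOY_TRANSCRIPT = "GGC" + "ATG" + "CTGCGTCGT" + "ATG" + "GCTGAAAAA" + "TAA" + "GG"
TOY_SIGNALS = [Signal("MTS", 3), Signal("NTS", 18)]

for start, cds in in_frame_starts(TOY_TRANSCRIPT):
    print(start, len(cds) // 3, predicted_compartment(start, TOY_SIGNALS))
```

In this toy case the long isoform keeps the N-terminal MTS and is assigned to the mitochondrion, whereas the isoform initiated at the second start codon lacks the MTS and follows its internal NTS. Real signals are far less clear-cut, so the rule should be read as an illustration of the hierarchy, not as a predictor.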

Fig. 2. Mechanism with alternative 5′ exons, i.e. a combination of types I and II. For symbols see Fig. 1. Example gene: UNG. Here the NTS is complex: in addition to 44 optionally included N-terminal residues, complete nuclear translocation requires at least another 60 residues further downstream (Otterlei et al., 1998).


Fig. 3. Mechanism with a premature stop codon, i.e. a combination of types II and III. For symbols see Fig. 1. Example gene: KARS; # = stop codon.

3. Multiple sorting of single gene products

Dual or multiple localization of a single translational product is less well documented than mechanisms based on differential transcription or splicing. We have found one example where differential localization is managed by post-translational modification (Table 1: type IV). The WND protein (a membrane protein of the Golgi apparatus) is cleaved proteolytically and the product without the N-terminal portion is localized to mitochondrial membranes (Lutsenko and Cooper, 1998). Even without any modification, multiple targeting can be accomplished by inefficient or chimeric TSs (type V; HMGCL and AMACR) or by protein–protein interactions that modify targeting signals (type VI; see Vongsamphanh et al., 2001). Unique or repeated import and export across the mitochondrial membranes (type VII) has been proposed for aspartate aminotransferase (GOT2), which also serves as a fatty-acid-binding protein at the plasma membrane, for the complement component 1 q subcomponent binding protein (C1QBP), and for DEAD/H box polypeptide 28 (DDX28). Several mechanisms, such as the maintenance of the uncleaved MTS within the mitochondrion and nuclear export signals, are suspected to be involved, but are unsubstantiated in most specific cases (Soltys and Gupta, 1999b).
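The sequence-based types of Table 1 can be read off three attributes that Table 2 records for each protein: the number of transcription starts, the presence of alternative splicing and the number of translation starts. The following Python sketch makes that reading explicit. The function name targeting_type and the decision rule are assumptions made for illustration; in particular, when several transcription starts exist, differing start codons are attributed to type I rather than type III, which matches the examples quoted here but is not stated as a rule in the review.

```python
def targeting_type(transcription_starts: int,
                   alternative_splicing: bool,
                   translation_starts: int) -> str:
    """Assign a Table 1 category from three Table 2 attributes (sketch)."""
    parts = []
    if transcription_starts > 1:
        parts.append("I")
    if alternative_splicing:
        parts.append("II")
    # Assumption: multiple start codons count as type III only when they
    # sit on a single primary transcript.
    if translation_starts > 1 and transcription_starts <= 1:
        parts.append("III")
    if not parts:
        # One transcript, one product: types IV-VII cannot be told apart
        # from these three attributes alone.
        return "single protein (types IV-VII)"
    return " + ".join(parts)


# Attribute values as listed in Table 2; ">2" transcription starts for FPGS
# is represented by 3 for the purpose of this sketch.
EXAMPLES = {
    "FPGS": (3, False, 2),
    "BCAT2": (1, True, 1),
    "NFS1": (1, False, 2),
    "UNG": (2, True, 2),
    "KARS": (1, True, 2),
    "HMGCL": (1, False, 1),
}

for gene, attributes in EXAMPLES.items():
    print(gene, targeting_type(*attributes))
```

For these six proteins the sketch reproduces the type column of Table 2 (HMGCL, listed as type V, falls into the single-protein group). Entries with question marks in Table 2 cannot be classified this way.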

4. Disorders arising from multicompartmentalization

Some rare diseases are associated with the mislocalization of normally non-mitochondrial gene products to the mitochondrion. It was recently found that a mutation in the DJ-1 gene caused its gene product to localize to the nucleus and mitochondrion instead of to the nucleus and cytoplasm (Bonifati et al., 2002). It was argued that the mislocalization leads to a loss in oxidative stress response in the cytosol. However, whether this causes the PARK7 form of Parkinson's disease remains to be clarified. Another example is the mistargeting of the peroxisomal alanine:glyoxylate aminotransferase to the mitochondrion in humans with primary hyperoxaluria type 1 (Danpure et al., 1993). This aminotransferase is localized in the mitochondrion and the cytosol in amphibians, whereas it is variably targeted to the mitochondrion and/or peroxisome in mammals, apparently according to the diet of the organism (Oatey et al., 1996; Holbrook and Danpure, 2002). A single substitution probably restored the MTS that was lost during the evolution of humans. Mitochondrial deficiency of a protein that is otherwise distributed between mitochondria and cytosol can also produce a disorder. For example, in pantothenate kinase-associated neurodegeneration, the long mitochondrial isoform of PANK2 is missing (Hörtnagel et al., 2003).

5. Evolutionary perspectives

Given the 656 mitochondrial proteins of MitoP2, single gene products with multiple localization account for at least 5%. To obtain a roughly comparable estimate of how many mitochondrial proteins might have accomplished multicompartmentalization by multiple genes, we searched the CluSTr database (Kriventseva et al., 2001) for non-mitochondrial paralogs. About 20% of the mitochondrial reference set (MitoP2) have a non-mitochondrial paralog in the most restrictive clusters of CluSTr. The higher proportion of multiple loci in multicompartmentalized functions that we observed may reflect an early evolutionary stage or the constraint to retain the advantage of completely independent expression control by independent loci (Lynch and Force, 2000). We have found multiple targeting of single products within all commonly known functional groups that occur in more than one compartment (see Table 2): DNA/RNA metabolism (7 cases), protein metabolism (5 cases), lipid oxidation (3 cases) and antioxidant defense (3 cases). Interestingly, a rich diversity of mechanisms for multiple targeting can be observed, reflecting the individual evolutionary histories of the different genes. As mentioned, the process of mitochondrion-to-nucleus gene transfer is supposed to be relatively young or ongoing, and thus could have provided the possibility for multicompartmentalization for some genes only recently (Gray et al., 2001; Karlberg and Andersson, 2003). There is no correlation between most of the functional groups and the various mechanisms of multiple targeting. The DNA/RNA metabolism group, for example, utilizes strategies of type II, the combination of type I and II, the combination of type II and III, and type VI. The one conspicuous exception is lipid metabolism: all three proteins involved (HMGCL, AMACR, GOT2) belong to the group with a single translational product. The breakdown of fatty acids is an oxidative process carried out mainly in peroxisomes, but also in mitochondria. The close phylogenetic relationship found between the peroxisomal and mitochondrial proteome was attributed to a proposed displacement from the mitochondrial progenitor to the peroxisome or vice versa (Marcotte et al., 2000). The chimeric targeting mechanism of the three proteins, which offers little regulatory potential, might reflect the close relationship of mitochondria and peroxisomes. For direct export/import pathways between the mitochondrion and peroxisomes see also Soltys and Gupta (1999a,b). An estimation of the evolutionary origin of the multicompartmentalized mitochondrial proteins was performed by BLAST similarity searches against 10 prokaryotic organisms including two α-proteobacteria, a group that is the proposed closest relative to the eubacterial ancestor of mitochondria (Kurland and Andersson, 2000; Karlberg et al., 2000; Gray et al., 2001). We found six proteins in our list with highest similarity to the α-proteobacteria, indicating strong candidates of bacterial origin (Table 3).
When analysing the functional categories we found an even higher ratio of about 19 proteins involved in bioenergetics and biosynthesis, which are proposed roles for proteins of bacterial origin (Karlberg et al., 2000; Karlberg and Andersson, 2003) and only about 6 proteins with characteristic eukaryote-specific functions like regulation, signaling and transport. These proportions do not

Table 3. Protein sequence similarity (expectation value) with α-proteobacteria (Rickettsia prowazekii, Bradyrhizobium japonicum) and other prokaryotes (Aquifex aeolicus, Bacillus subtilis, Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Mycobacterium tuberculosis, Staphylococcus aureus, Synechocystis sp.)

SwissProt ID | Appr. abbr. | α-proteobacteria | other prokaryotes

Q05932 O15382 O15527 P41250 Q9Y697 P00390 P30044 O43402 Q9BZ23 Q93028 P33316 P43155 P36969 Q9UIF7 P29372 Q15046 Q9UBK8 Q92667 P35670 P35914 Q9UHK6 Q9NUL7 P00505 Q07021 Q9Y3B8 P50440 P07954 P32019 ? P31415

FPGS BCAT2 OGG1 GARS NFS1 GSR PRDX5 NOC4 PANK2 UNG DUT CRAT GPX4 MUTYH MPG KARS MTRR AKAP1 WND HMGCL AMACR DDX28 GOT2 C1QBP SFN GATM FH INPP5B SLIT3 CASQ

1e-25 (Bj) 2e-53 (Bj)

2e-35 (Bs) 5e-70 (Bs)

1e-156 (Rp) 5e-74 (Bj) 8e-23 (Bj)

1e-146 (Ec) 1e-126 (Ec) 1e-22 (Ssp)

6e-23 (Bj)

2e-70 (Ec) 6e-23 (Aa)

3e-32 (Bj) 3e-57 (Bj) 1e-16 (Bj) 1e-21 (Bj) 8e-33 (Bj)

1e-33 (Ssp) 4e-53 (Bs) 7e-17 (Bs) 1e-111 (Aa, Ec) 4e-40 (Bs)

1e-116 (Bj) 9e-83 (Bj) 2e-80 (Bj) 1e-35 (Bj)

1e-169 (Sa) 5e-68 (Bs) 7e-87 (Mt) 2e-38 (Ec) 9e-96 (Hi) 3e-50 (Hi)

3e-26 (Bj) 1e-164 (Bj)

1e-154 (Ec, Hi)

For each group, the lowest expectation value (if < 1e-10) is shown, with the species indicated in parentheses. The expectation value represents the statistical significance of sequence similarity; proteins with highest similarity to α-proteobacteria are marked in italics. Rp, Rickettsia prowazekii; Bj, Bradyrhizobium japonicum; Bs, Bacillus subtilis; Ec, Escherichia coli; Ssp, Synechocystis sp.; Aa, Aquifex aeolicus; Sa, Staphylococcus aureus; Mt, Mycobacterium tuberculosis; Hi, Haemophilus influenzae.

corroborate the hypothesis that the development of multiple localization in mitochondrial proteins is coupled predominantly with the recruitment of nuclear, eukaryote-specific genes for novel mitochondrial functions. Instead, multicompartmentalization occurs equally in both these groups of different evolutionary origin.
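The comparison underlying Table 3 can be sketched in a few lines of Python: for each protein, the lowest BLAST expectation value against the α-proteobacteria is compared with the lowest value against the other prokaryotes, and only hits below 1e-10 are considered. The function name likely_origin, the handling of ties and the example E-values below are illustrative assumptions; they are not assigned to particular proteins of Table 3.

```python
from typing import Optional

E_VALUE_CUTOFF = 1e-10  # significance threshold stated in the Table 3 legend


def likely_origin(alpha_evalue: Optional[float],
                  other_evalue: Optional[float],
                  cutoff: float = E_VALUE_CUTOFF) -> str:
    """Compare the best (lowest) E-values of the two prokaryotic groups.

    None means that no hit was found in that group. A lower E-value means
    a more significant similarity.
    """
    alpha = alpha_evalue if alpha_evalue is not None and alpha_evalue < cutoff else None
    other = other_evalue if other_evalue is not None and other_evalue < cutoff else None
    if alpha is None and other is None:
        return "no significant prokaryotic hit"
    # Assumption: a tie is not counted as highest similarity to the
    # alpha-proteobacteria.
    if alpha is not None and (other is None or alpha < other):
        return "highest similarity to alpha-proteobacteria"
    return "highest similarity to other prokaryotes"


# Illustrative E-value pairs (alpha-proteobacteria, other prokaryotes).
print(likely_origin(1e-156, 1e-146))
print(likely_origin(1e-25, 2e-35))
print(likely_origin(None, 1e-8))
```

A protein whose best hit lies in the α-proteobacteria is a candidate of bacterial (endosymbiont) origin in the sense used in the text; the absence of any significant hit is compatible with, but does not prove, a eukaryotic origin.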


6. Prediction algorithms

References

The current list of proteins is undoubtedly not complete and could be extended by combinatorial predictions based on DNA and protein databases. Literature mining and computer predictions promote each other mutually, and could help us to assess the significance of multiple localization in specific genomes (Valencia, 2002). More than half of the described examples (17 out of 31) use a mechanism of differential splicing or transcription, and therefore will be detectable at the cDNA level. Predictions for proteins with these mechanisms are promising because they can be based on the relatively complete nucleic acid databases. Alternative splicing or transcription at the 50 end, an interruption of the open reading frame by an introduced stop codon, multiple potential start codons at the 50 end, multiple organellar TSs, non-cleavable TSs and posttranslational modification influencing the TS, all indicate multiple localization. Single prediction tools for these patterns, however, are weak, because of incomplete sequence information at the 50 ends of cDNAs and variability in TSs (Claros and Vincens, 1996; Emanuelsson and von Heijne, 2001). An algorithm that allows for the combined analysis of all expected patterns including a phylogenetic profile approach (Marcotte et al., 2000) could promote the detection of new candidates of multiple localization.

Aburatani, H., Hippo, Y., Ishida, T., Takashima, R., Matsuba, C., Kodama, T., et al., 1997. Cloning and characterization of mammalian 8-hydroxyguanine-specific DNA glycosylase/ apurinic, apyrimidinic lyase, a functional mutM homologue. Cancer Res. 57, 2151–2156. Adams, K.L., Daley, D.O., Qiu, Y.L., Whelan, J., Palmer, J.D., 2000. Repeated, recent and diverse transfers of a mitochondrial gene to the nucleus in flowering plants. Nature 408, 354 –357. Amery, L., Fransen, M., De Nys, K., Mannaerts, G.P., van Veldhoven, P.P., 2000. Mitochondrial and peroxisomal targeting of 2-methylacyl-CoA racemase in humans. J. Lipid Res. 41, 1752–1759. Andreoli, C., Prokisch, H., Ho¨ rtnapel, K., Mueller, J.C., Mu¨nsterko¨tter, M., Scharfe, C., Meitinger, T., 2004. MitoP2, an integrated database on mitochondrial proteins in yeast and man. Nucleic Acids Res. 32, D459– D462. Ashmarina, L.I., Pshezhetsky, A.V., Branda, S.S., Isaya, G., Mitchell, G.A., 1999. 3-hydroxy-3-methylglutary coenzyme A lyase: targeting and processing in peroxisomes and mitochondria. J. Lipid Res. 40, 70– 75. Bachman, N.J., Wu, W., Schmidt, T.R., Grossman, L.I., Lomax, M.I., 1999. The 50 region of the COX4 gene contains a novel overlapping gene, NOC4. Mamm. Genome 10, 506 –512. Bataille, N., Schmitt, N., Aumercier-Maes, P., Ollivier, B., Lucas-Heron, B., Lestienne, P., 1994. Molecular cloning of human calmitine, a mitochondrial calcium binding protein, reveals identity with calsequestrine. Biochem. Biophys. Res. Comm. 203, 1477–1482. Bjoras, M., Luna, L., Johnsen, B., Hoff, E., Haug, T., Rognes, T., Seeberg, E., 1997. Opposite base-dependent reactions of a human base excision repair enzyme on DNA containing 7, 8-dihydro-8-oxoguanine and abasic sites. Eur. Mol. Biol. Org. J. 16, 6314–6322. Bonifati, V., Rizzu, P., van Baren, M., Schaap, O., Breedveld, G.J., Krieger, E., et al., 2002. Mutations in the DJ-1 gene associated with autosomal recessive early-onset parkinsonism. Science 299, 256–259. 
Bradbury, M., Berk, P.D., 2000. Mitochondrial aspartate aminotransferase: direction of a single protein with two distinct functions to two subcellular sites does not require alternative splicing of the mRNA. Biochem. J. 345, 423–427.
Caceres, J.F., Kornblihtt, A.R., 2002. Alternative splicing: multiple control mechanisms and involvement in human disease. Trends Genet. 18, 186–193.
Claros, M.G., Vincens, P., 1996. Computational method to predict mitochondrially imported proteins and their targeting sequences. Eur. J. Biochem. 241, 779–786.
Corti, O., DiDonato, S., Finocchiaro, G., 1994. Divergent sequences in the 5′ region of cDNA suggest alternative splicing as a mechanism for the generation of carnitine acetyltransferases with different subcellular localizations. Biochem. J. 303, 37–41.

7. Electronic databases

LocusLink: http://www.ncbi.nlm.nih.gov/LocusLink/
OMIM: http://www.ncbi.nlm.nih.gov/omim/
PubMed: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi
SwissProt: http://www.expasy.org/sprot/
MitoP2: http://ihg.gsf.de/mitop2

Acknowledgements

We thank Olaf Bininda-Emonds for valuable discussions and linguistic amendments. Two anonymous reviewers and the editor helped to focus this review. This work was supported by the BMBF funded project “Bioinformatics for the functional analysis of mammalian genomes” (BFAM 031U112C).


J.C. Mueller et al. / Mitochondrion 3 (2004) 315–325

Danpure, C.J., 1995. How can the products of a single gene be localized to more than one intracellular compartment? Trends Cell Biol. 5, 230–238.
Danpure, C.J., Purdue, P.E., Fryer, P., Griffiths, S., Allsop, J., Lumb, M.J., et al., 1993. Enzymological and mutational analysis of a complex primary hyperoxaluria type I phenotype involving alanine:glyoxylate aminotransferase peroxisome-to-mitochondrion mistargeting and intraperoxisomal aggregation. Am. J. Hum. Genet. 53, 417–432.
Emanuelsson, O., von Heijne, G., 2001. Prediction of organellar targeting signals. Biochim. Biophys. Acta 1541, 114–119.
Freemantle, S.J., Taylor, S.M., Krystal, G., Moran, R.G., 1995. Upstream organization of and multiple transcripts from the human folylpoly-gamma-glutamate synthetase gene. J. Biol. Chem. 270, 9579–9584.
Fütterer, J., Kiss-Laszlo, Z., Hohn, T., 1993. Non-linear ribosome migration on cauliflower mosaic virus 35S RNA. Cell 73, 789–802.
Gray, M.W., Burger, G., Lang, B.F., 2001. The origin and early evolution of mitochondria. Genome Biol. 2, 1–5.
Holbrook, J.D., Danpure, C.J., 2002. Molecular basis for the dual mitochondrial and cytosolic localization of alanine:glyoxylate aminotransferase in amphibian liver cells. J. Biol. Chem. 277, 2336–2344.
Hörtnagel, K., Prokisch, H., Meitinger, T., 2003. An isoform of hPANK2, deficient in pantothenate kinase-associated neurodegeneration, localizes to mitochondria. Hum. Mol. Genet. 12, 321–327.
Huang, L.J., Durick, K., Weiner, J.A., Chun, J., Taylor, S.S., 1997. Identification of a novel protein kinase A anchoring protein that binds both type I and type II regulatory subunits. J. Biol. Chem. 272, 8057–8064.
Huang, L.J., Wang, L., Ma, Y., Durick, K., Perkins, G., Deerinck, T.J., et al., 1999. NH2-terminal targeting motifs direct dual specificity A-kinase-anchoring protein 1 (D-AKAP1) to either mitochondria or endoplasmic reticulum. J. Cell Biol. 145, 951–959.
Humm, A., Fritsche, E., Mann, K., Göhl, M., Huber, R., 1997. Recombinant expression and isolation of human L-arginine:glycine amidinotransferase and identification of its active-site cysteine residue. Biochem. J. 322, 771–776.
Jaussi, R., 1995. Homologous nuclear-encoded mitochondrial and cytosolic isoproteins: a review of structure, biosynthesis and genes. Eur. J. Biochem. 228, 551–561.
Karlberg, E.O.L., Andersson, S.G.E., 2003. Mitochondrial gene history and mRNA localization: is there a correlation? Nat. Rev. Gen. 4, 391–397.
Karlberg, O., Canbäck, B., Kurland, C.G., Andersson, S.G.E., 2000. The dual origin of the yeast mitochondrial proteome. Yeast 17, 170–187.
Kelner, M.J., Montoya, M.A., 1998. Structural organization of the human selenium-dependent phospholipid hydroperoxide glutathione peroxidase gene (GPX4): chromosomal localization to 19p13.3. Biochem. Biophys. Res. Comm. 249, 53–55.
Kelner, M.J., Montoya, M.A., 2000. Structural organization of the human glutathione reductase gene: determination of correct cDNA sequence and identification of a mitochondrial leader sequence. Biochem. Biophys. Res. Comm. 269, 366–368.
Kinsella, B.T., Doonan, S., 1986. Nucleotide sequence of a cDNA coding for mitochondrial fumarase from human liver. Biosci. Rep. 6, 921–929.
Knoops, B., Clippe, A., Bogard, C., Arsalane, K., Wattiez, R., Hermans, C., et al., 1999. Cloning and characterisation of AOEB166, a novel mammalian antioxidant enzyme of the peroxiredoxin family. J. Biol. Chem. 274, 30451–30458.
Kozak, M., 1989. The scanning model for translation: an update. J. Cell Biol. 108, 229–241.
Kriventseva, E.V., Fleischmann, W., Zdobnov, E.M., Apweiler, R., 2001. CluSTr: a database of clusters of SWISS-PROT + TrEMBL proteins. Nucleic Acids Res. 29, 33–36.
Kurland, C.G., Andersson, S.G.E., 2000. Origin and evolution of the mitochondrial proteome. Microbiol. Mol. Biol. Rev. 64, 786–820.
Ladner, R.D., Caradonna, S.J., 1997. The human dUTPase gene encodes both nuclear and mitochondrial isoforms. J. Biol. Chem. 272, 19072–19080.
Ladner, R.D., McNulty, D.E., Carr, S.A., Roberts, G.D., Caradonna, S.J., 1996. Characterisation of distinct nuclear and mitochondrial forms of human deoxyuridine triphosphate nucleotidohydrolase. J. Biol. Chem. 271, 7745–7751.
Land, T., Rouault, T.A., 1998. Targeting of a human iron-sulfur cluster assembly enzyme, nifs, to different subcellular compartments is regulated through alternative AUG utilization. Mol. Cell 2, 807–815.
Leclerc, D., Odievre, M.-H., Wu, Q., Wilson, A., Huizenga, J.J., Rozen, R., Scherer, S.W., Gravel, R.A., 1999. Molecular cloning, expression and physical mapping of the human methionine synthase reductase gene. Gene 240, 75–88.
Lightowlers, R.N., Lill, R., 2001. High-level mitochondriology at high altitude. Eur. Mol. Biol. Org. Rep. 2, 1074–1077.
Little, M.H., Wilkinson, L., Brown, D.L., Piper, M., Yamada, T., Stow, J.L., 2001. Dual trafficking of Slit3 to mitochondria and cell surface demonstrates novel localization for Slit protein. Am. J. Physiol. Cell Physiol. 281, C486–C495.
Lutsenko, S., Cooper, M.J., 1998. Localization of the Wilson’s disease protein product to mitochondria. Proc. Natl Acad. Sci. 95, 6004–6009.
Lynch, M., Force, A., 2000. The probability of duplicate gene preservation by subfunctionalization. Genetics 154, 459–473.
Marcotte, E.M., Xenarios, I., van der Bliek, A.M., Eisenberg, D., 2000. Localizing proteins in the cell from their phylogenetic profiles. Proc. Natl Acad. Sci. 97, 12115–12120.
Mudge, S.J., Williams, J.H., Eyre, H.J., Sutherland, G.R., Cowan, P.J., Power, D.A., 1998. Complex organisation of the 5′-end of the human glycine tRNA synthetase gene. Gene 209, 45–50.
Neupert, W., 1997. Protein import into mitochondria. Annu. Rev. Biochem. 66, 863–917.
Nguyen, L.H., Erzberger, J.P., Root, J., Wilson, D.M. III, 2000. The human homolog of Escherichia coli Orn degrades small single-stranded RNA and DNA oligomers. J. Biol. Chem. 275, 25900–25906.

Nilsen, H., Otterlei, M., Haug, T., Solum, K., Nagelhus, T.A., Skorpen, F., Krokan, H.E., 1997. Nuclear and mitochondrial uracil-DNA glycosylases are generated by alternative splicing and transcription from different positions in the UNG gene. Nucleic Acids Res. 25, 750–755.
Oatey, P.B., Lumb, M.J., Danpure, C.J., 1996. Molecular basis of the variable mitochondrial and peroxisomal localisation of alanine-glyoxylate aminotransferase. Eur. J. Biochem. 241, 374–385.
Otterlei, M., Haug, T., Nagelhus, T.A., Slupphaug, G., Lindmo, T., Krokan, H.E., 1998. Nuclear and mitochondrial splice forms of human uracil-DNA glycosylase contain a complex nuclear localisation signal and a strong classical mitochondrial localisation signal, respectively. Nucleic Acids Res. 26, 4611–4617.
Pendlebury, A., Frayling, I.M., Santibanez Koref, M.F., Margison, G.P., Rafferty, J.A., 1994. Evidence for the simultaneous expression of alternatively spliced alkylpurine N-glycosylase transcripts in human tissues and cells. Carcinogenesis 15, 2957–2960.
Petrova-Benedict, R., Robinson, B.H., Stacey, T.E., Mistry, J., Chalmers, R.A., 1987. Deficient fumarase activity in an infant with fumaricacidemia and its distribution between the different forms of the enzyme seen on isoelectric focusing. Am. J. Hum. Genet. 40, 257–266.
Pollack, M., Leeuwenburgh, C., 2001. Apoptosis and aging: role of the mitochondria. J. Gerontol.: Biol. Sci. 56A, B475–B482.
Pushpa-Rekha, T.R., Burdsall, A.L., Oleksa, L.M., Chisolm, G.M., Driscoll, D.M., 1995. Rat phospholipid-hydroperoxide glutathione peroxidase. J. Biol. Chem. 270, 26993–26999.
Shiba, K., Schimmel, P., Motegi, H., Noda, T., 1994. Human glycyl-tRNA synthetase: Wide divergence of primary structure from bacterial counterpart and species-specific aminoacylation. J. Biol. Chem. 269, 30049–30055.
Soltys, B.J., Gupta, R.S., 1999a. Mitochondrial-matrix proteins at unexpected locations: are they exported? Trends Biochem. Sci. 24, 174–177.
Soltys, B.J., Gupta, R.S., 1999b. Mitochondrial proteins at unexpected cellular locations: export of proteins from mitochondria from an evolutionary perspective. Int. Rev. Cytol. 194, 133–196.
Soltys, B.J., Kang, D., Gupta, R.S., 2000. Localization of P32 protein (gC1q-R) in mitochondria and at specific extramitochondrial locations in normal tissues. Histochem. Cell Biol. 114, 245–255.
Speed, C.J., Matzaris, M., Bird, P.I., Mitchell, C.A., 1995. Tissue distribution and intracellular localisation of the 75-kDa inositol polyphosphate 5-phosphatase. Eur. J. Biochem. 234, 216–224.
Spooner, R.A., Maycroft, K.A., Paterson, H., Friedlos, F., Springer, C.J., Marais, R., 2001. Appropriate subcellular localisation of prodrug-activating enzymes has important consequences for suicide gene therapy. Int. J. Cancer 93, 123–130.
Takao, M., Zhang, Q.-M., Yonei, S., Yasui, A., 1999. Differential subcellular localization of human MutY homolog (hMYH) and the functional activity of adenine:8-oxoguanine DNA glycosylase. Nucleic Acids Res. 27, 3638–3644.
Than, N.G., Sümegi, B., Than, G.N., Bellyei, Sz., Bohn, H., 2001. Molecular cloning and characterization of placental tissue protein 18 (PP18a)/human mitochondrial Branched-chain aminotransferase (BCATm) and its novel alternatively spliced PP18b variant. Placenta 22, 235–243.
Tolkunova, E., Park, H., Xia, J., King, M.P., Davidson, E., 2000. The human Lysyl-tRNA synthetase gene encodes both the cytoplasmic and mitochondrial enzymes by means of an unusual alternative splicing of the primary transcript. J. Biol. Chem. 275, 35063–35069.
Tong, W.-H., Jameson, G.N.L., Huynh, B.H., Rouault, T.A., 2003. Subcellular compartmentalization of human Nfu, an iron-sulfur cluster scaffold protein, and its ability to assemble a [4Fe-4S] cluster. Proc. Natl Acad. Sci. 100, 9762–9767.
Truscott, K.N., Pfanner, N., Voos, W., 2001. Transport of proteins into mitochondria. Rev. Physiol. Biochem. Pharmacol. 143, 81–136.
Valencia, A., 2002. Search and retrieve. Eur. Mol. Biol. Org. Rep. 3, 396–400.
Valgardsdottir, R., Brede, G., Eide, L.G., Frengen, E., Prydz, H., 2001. Cloning and characterization of MDDX28, a putative DEAD-box helicase with mitochondrial and nuclear localization. J. Biol. Chem. 276, 32056–32063.
Van der Leij, F.R., Huijkman, N.C.A., Boomsma, C., Kuipers, J.R.G., Bartelds, B., 2000. Genomics of the human carnitine acyltransferase genes. Mol. Gen. Metab. 71, 139–153.
Vongsamphanh, R., Fortier, P.-K., Ramotar, D., 2001. Pir1p mediates translocation of the yeast Apn1p endonuclease into the mitochondria to maintain genomic stability. Mol. Cell. Biol. 21, 1647–1655.