Mitochondrial DNA repeats constrain the life span of mammals


TRENDS in Genetics Vol.20 No.5 May 2004 | Genome Analysis

David C. Samuels

Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA 24061, USA

Mitochondrial DNA deletions, which are often flanked by repeats, are found in elderly organisms of many species. In this article, I describe the analysis of the mitochondrial genomes of 61 mammalian species and show that the number of the longer repeats constrains the typical life span. I also show that the number of repeats that occur in randomly shuffled sequences is a rough lower limit to the number in the actual sequences. These two constraints imply a maximum life expectancy for mammals of 80–100 years.

Mitochondria are the organelles that are responsible for the generation of most of the energy used by eukaryotic cells. They are the descendants of α-proteobacteria, which developed a symbiotic relationship with an ancestor of today’s eukaryotic life [1]. Mitochondria retain some vestiges of their independent ancestors, most notably the presence of DNA molecules in each organelle [2]. The mitochondrial DNA (mtDNA) encodes some of the proteins of the electron-transfer chain, the primary system for the conversion of adenosine diphosphate (ADP) to adenosine triphosphate (ATP) in most mammalian cells. Mutations in mtDNA can deprive the cell of energy and are correlated with the aging process [3–5] in a wide range of species, for example, humans [6], rats [7], Caenorhabditis elegans [8] and various fungi [9]. Theories of aging caused by mitochondrial damage are controversial [10] but there is an association between mtDNA mutations and aging in many species.

The most common age-related mtDNA mutation in humans is a large rearrangement called the ‘common’ or the ‘4977’ deletion [6,11–13] and is usually found in humans >40 years old [3,14]. It is a deletion of 4977 bp that removes all or part of seven of the 13 protein-encoding mtDNA genes and five of the 22 tRNA genes. Individual cells containing this deletion have a mixture of wild-type mtDNA and the deletion mutation, a condition known as heteroplasmy. The rise in the level of heteroplasmy with age is tissue-dependent [15] and different tissue dependences have been found in different species [16,17]. Cells with many copies of the common deletion have a loss of mitochondrial function [18], and the increase in mtDNA deletions correlates strongly with the decline in sensory nerve function in rats [19].

Supplementary data associated with this article can be found at doi:10.1016/j.tig.2004.03.003

Corresponding author: David C. Samuels ([email protected]).


Deletions and direct repeats

The common deletion in human mtDNA is flanked by a 13-bp repeat (Box 1). Other less-common age-related deletions are observed in humans and most of these are also flanked by shorter direct repeats or imperfect repeats, although some deletions are not associated with repeats [11,20]. Deletions can be formed by errors in the repair of double-strand breaks [21,22], replication slippage [23] or other replication errors [24]. The presence of direct repeats renders the DNA susceptible to rearrangement mutations.

The 13-bp direct repeat responsible for the human common deletion is clearly a small detail in an mtDNA sequence of >16 500 bp. It is reasonable to expect that sequence features of such a small size, compared with the genome length, could vary greatly across the mitochondrial genomes of different species. In fact, different common age-associated mtDNA deletions are found in many species and these are flanked by a pair of 13–17 bp direct repeats [7,20,25,26].

Is there a relationship between the number and the length of direct repeats in the mitochondrial genome of a species and the typical life span of an organism within that species? To answer this question, I analyzed the mitochondrial genomes of 61 mammalian species available from the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov). This study was restricted to mammalian species, to compare species with the same general physiology but with a wide range of life spans [27–29]. For species without a well-established life expectancy, their recorded life spans in captivity or in the wild were used (details are available in the supplementary table online http://archive.bmn.com/supp/tig/tge200503.pdf).

Programs were written in Fortran to measure the number of direct repeats in each mtDNA sequence. Only direct repeat pairs that flank at least part of one or more mtDNA genes, including tRNA and rRNA genes, were analyzed. These direct repeat pairs could cause deletions that remove the gene products necessary for the formation of the electron-transfer chain. I did not include any direct repeat pairs if both members of the pair were located in the small non-coding section of mtDNA known as the D-loop. In human mtDNA, no large direct repeats with both members of the pair located in the D-loop were found, but in many species such repeats are common as a result of a highly repetitive section in their D-loop. The apes, most rodents, whales and a few other mammalian species in this dataset lack this highly repetitive non-coding mtDNA segment.

Box 1. Clonal expansions of mitochondrial DNA deletions and the location of the flanking direct repeats

Most eukaryotic cells contain many copies of mitochondrial DNA (mtDNA); cells with mutated mtDNA usually contain a mixture of the mutated and wild-type mtDNA. Mutated mtDNA can be inherited from the mother or acquired through mtDNA damage. In elderly humans, cells with high levels of acquired mtDNA mutation usually contain high levels of a single mutation, although different mutations can be contained in different cells [32]. It is assumed that this is the result of a single mutation event that occurred at some time in the past and then ‘clonally expanded’ to high levels in the cell through repeated replication of the mutated mtDNA [32,33].

mtDNA contains two origins of replication, one for each strand. These replication origins are separated by approximately one-third of the genome in vertebrates and can be used to divide the circular genome into a major arc (the larger section) and a minor arc. The major arc is more prone to deletions [3], probably because this arc remains single stranded for a long time during the replication process. However, deletions are found throughout mtDNA [20]; therefore, the entire coding region was included in the analysis.

The human mitochondrial genome analyzed here (NCBI GenBank accession number NC_001807) contains one direct 15-bp repeat and four repeats that are 13 bp. The human common deletion is flanked by one of these 13-bp repeats. Another of the 13-bp repeats lies in the minor arc, where deletions are rare. The other two 13-bp repeats and the 15-bp repeat flank one of the two origins of replication; a deletion caused by these direct repeat pairs would result in the loss of one of the replication origins. This loss would remove or severely limit the ability of the deleted mtDNA to replicate (except through duplication intermediates [34]) and thus prevent clonal expansion of the mutant mtDNA. However, simulations indicate that clonal expansions of mtDNA mutants require a long time to occur through random genetic drift [33,35], implying that the process might only be important in the long-lived species and might not have an important role in short-lived species.

Direct repeats in mtDNA constrain life span

There is a clear relationship in these data between the typical life span and the number of direct repeats ≥ 12 bp in the mitochondrial genome of the species (Figure 1). This is a constraint relationship, not a direct causal relationship.
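The repeat counts behind this relationship come from the search described above: direct repeat pairs of a given length that flank at least part of a gene. The following Python sketch illustrates one way such a count could be made. It is not the Fortran program used in the study, and several details are assumptions: a repeat of length k is taken to be a pair of identical, non-overlapping k-mers that cannot be extended in either direction, the genome is treated as linear rather than circular, and `coding` is a hypothetical list of half-open gene intervals supplied by the caller.

```python
from collections import defaultdict


def direct_repeat_pairs(seq, k):
    """Return sorted (i, j) start positions of direct repeats of length exactly k.

    Assumption: "length exactly k" means the two copies match over k bases,
    do not overlap, and the match cannot be extended to the left or right.
    """
    seq = seq.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    pairs = []
    for starts in positions.values():
        for a, i in enumerate(starts):
            for j in starts[a + 1:]:
                if j < i + k:
                    continue  # the two copies overlap
                left_ext = i > 0 and seq[i - 1] == seq[j - 1]
                right_ext = j + k < len(seq) and seq[i + k] == seq[j + k]
                if not left_ext and not right_ext:
                    pairs.append((i, j))
    return sorted(pairs)


def flanks_coding(i, j, coding):
    """True if a deletion between the copies, seq[i:j], would hit a gene.

    `coding` is a hypothetical list of half-open (start, end) gene intervals.
    """
    return any(i < c_end and c_start < j for c_start, c_end in coding)


def count_flanking_pairs(seq, k, coding):
    """Count direct repeat pairs of length k that flank coding sequence."""
    return sum(flanks_coding(i, j, coding) for i, j in direct_repeat_pairs(seq, k))


toy = "CGATTACATTTTTGATTACAG"  # toy sequence with one 7-bp direct repeat
print(direct_repeat_pairs(toy, 7))
print(count_flanking_pairs(toy, 7, coding=[(8, 12)]))
```

Under these assumptions, a pair whose two copies both lie in a non-coding stretch such as the D-loop flanks no gene interval and is therefore excluded automatically, which mirrors the exclusion rule stated in the text.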
The data from almost all species, with a single exception, lie below a surprisingly linear limit. Long-lived species do not have many of the longer direct repeats in their mitochondrial genomes, whereas the numbers of longer direct repeats for the short-lived species appear to be randomly distributed below this constraint line. There are no apparent patterns within groups of related species, for example, primates and rodents. The sole exception to the >12 bp constraint is the finback whale. Because the other two whale species and the other marine mammals obey the constraint, this exception is probably not due to the marine environment of this species, a possibility raised by the different respiration physiology of diving mammals. The same linear constraint seems to apply to these disparate groups of species, indicating that this constraint is due to some fundamental property of mammalian mtDNA. The fact that this is a constraint relationship, not a direct correlation, reflects the reasonable conclusion that aging is a complicated process dependent on a multitude of factors and that mtDNA deletions are only one part of this process. One interpretation of this analysis is that the importance of mtDNA deletions in the aging process of a species might be indicated by the closeness of that species' mtDNA to the upper constraint line. For both short-lived and long-lived species near this constraint, such as the hedgehog and humans, mtDNA deletions can be expected to have an important role in their aging process. For the species that lie well below the constraint, such as the guinea pig and the domestic dog, mtDNA deletions might not be important in the aging process.

[Figure 1: eight panels plotting the number of direct repeat pairs flanking coding mtDNA against life span (years); panels (a–d) show the actual sequences and panels (e–h) the shuffled sequences.]

Figure 1. The number of direct-repeat pairs in mitochondrial DNA (mtDNA) compared with the life span for 61 mammalian species. The species are grouped as primates (red stars), rodents (green triangles), cetaceans (blue squares), other marine mammals (blue triangles) and other mammals (black circles). The values are shown for direct repeats of length: (a) >12 bp, (b) 12 bp, (c) 11 bp and (d) 10 bp. The corresponding measurements on randomly shuffled sequences are shown in (e–h). The solid lines are the mean values of the distributions, which were calculated over sliding windows of 30 years width (a sliding window of ten years width was used for life spans of <20 years, where the data density is highest). The broken lines in (a) and (b) are linear upper limits to the data distribution, determined by a least squares fit through the data points on the upper edge of the distribution. From the least squares fit the intercept I and the slope s of the upper constraint lines are: (a) I = 22.7 ± 0.3, s = −0.241 ± 0.005 yr⁻¹ and (b) I = 45.0 ± 0.5, s = −0.37 ± 0.01 yr⁻¹. The correlation coefficients for the means of the distributions as a function of life span for the actual sequences are: (a) −0.93, (b) −0.87, (c) −0.83 and (d) −0.60. The correlation coefficients for the shuffled sequences are: (e) −0.99, (f) −0.69, (g) −0.54 and (h) −0.78. Averaged over all species, the mean values of the number of direct repeats in the shuffled sequences are: (e) 4.1 with 95% confidence intervals (1.9–6.3), (f) 11.9 (7.7–16.1), (g) 43.2 (32.7–53.7) and (h) 160 (127–192). The number of repeats in the randomly shuffled sequences (e–h) is a rough lower limit to the distribution of the number of repeats in the actual sequences (a–d). The distribution for the longer repeats (a–b) also has a linear upper limit that decreases with increasing life span.
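The Figure 1 legend quotes both the fitted upper constraint lines and the mean repeat counts of the shuffled sequences, so the life span at which each constraint line falls to the shuffled-sequence level can be worked out directly. The Python sketch below does this arithmetic. The function and variable names are illustrative, and the slope for panel (a) is taken as negative on the assumption that the legend's decreasing upper limit implies that sign.

```python
def crossing_life_span(intercept, slope, repeats):
    """Life span t (years) at which intercept + slope * t equals `repeats`."""
    return (repeats - intercept) / slope


# Values quoted in the Figure 1 legend:
#   (intercept, slope per year) of the upper constraint line, and
#   (mean, 95% CI low, 95% CI high) of the shuffled-sequence repeat counts.
# The negative slope for ">12 bp" is an assumption based on the legend's
# description of a limit that decreases with increasing life span.
panels = {
    ">12 bp": ((22.7, -0.241), (4.1, 1.9, 6.3)),
    "12 bp": ((45.0, -0.37), (11.9, 7.7, 16.1)),
}

for label, ((intercept, slope), (mean, low, high)) in panels.items():
    t_mean = crossing_life_span(intercept, slope, mean)
    t_early = crossing_life_span(intercept, slope, high)  # upper CI bound is reached first
    t_late = crossing_life_span(intercept, slope, low)    # lower CI bound is reached last
    print(f"{label}: crossing near {t_mean:.0f} years "
          f"(range {t_early:.0f} to {t_late:.0f})")
```

With the quoted values, the constraint lines meet the shuffled means at about 77 years for repeats >12 bp and about 89 years for 12-bp repeats, which is the same order as the 80–100 year limit stated in the article.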


Repeats in randomly shuffled sequences: a second constraint

The variation across species in the number of direct repeats in the mtDNA genomes might be due to either the order in the sequence structure (avoiding or enhancing the number of direct repeats) or global properties (such as the sequence length or the nucleotide abundances). The length of the coding sequence is uniform across the mammalian species, with a range of 15 404–15 492 bp (aside from the cane rat mtDNA, which has a coding length of 15 681 bp, possibly indicating a sequencing error in this genome). The total mtDNA length is more variable, with a range of 16 295–17 734 bp. The longer mtDNA genomes belong to shorter-lived species.

To separate out the effects of the sequence order and the global sequence properties, I re-ran the repeat analysis after randomly shuffling the mtDNA sequence of each species. The shuffled sequences have a random order but have the same global properties as the original genomes. The results are shown in Figure 1e–h. The results for the shuffled sequences show some dependence on species life span, with higher values occurring in the shorter-lived species compared with the longer-lived species. However, the range of direct repeats is higher in the actual sequences than in the shuffled sequences. These results indicate that the difference in the number of direct repeats across species is due to both the global properties of the mtDNA sequences and the order in the actual sequence structure. The sequence structure enhances the number of direct repeats compared with the number in the shuffled sequence, with greater enhancement in the shorter-lived species. The mean number of repeats in the shuffled sequences (shown as unbroken lines in Figure 1e–h) gives an approximate lower limit to the number of direct repeats in the actual sequences.
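The shuffling control described above can be illustrated with a short Python sketch. Shuffling the bases keeps the sequence length and nucleotide abundances fixed while destroying the order, so any repeats that remain reflect only those global properties. The names here are hypothetical, the number of shuffles per genome is an assumption (the article does not state it), and the repeat counter is a deliberately simple stand-in that counts every pair of identical k-mers rather than applying the flanking and D-loop rules of the actual analysis.

```python
import random
from collections import Counter


def shuffled_copy(seq, rng):
    """Return seq with its bases in random order (same length and composition)."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


def count_repeat_pairs(seq, k):
    """Count pairs of positions that carry an identical k-mer.

    Simplified stand-in for the repeat search: overlapping copies are not
    excluded and no gene-flanking filter is applied.
    """
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return sum(n * (n - 1) // 2 for n in counts.values())


def shuffled_baseline(seq, k, trials=100, seed=0):
    """Mean repeat count over `trials` shuffles (trial count is an assumption)."""
    rng = random.Random(seed)
    values = [count_repeat_pairs(shuffled_copy(seq, rng), k) for _ in range(trials)]
    return sum(values) / len(values)


toy = "ACGTACGTTTGACAACGTTGCA" * 5  # toy sequence, not real mtDNA
print("actual:", count_repeat_pairs(toy, 6))
print("shuffled mean:", shuffled_baseline(toy, 6))
```

Comparing the count for the real order with the shuffled mean separates the contribution of sequence order from that of length and base composition, which is the comparison drawn between panels (a–d) and (e–h) of Figure 1.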
The values for the short-lived species tend to be distributed above the mean values for the shuffled sequences, and the upper constraint pushes the values for the long-lived species down to this lower boundary at ~75–80 years (the approximate human life span).

The numbers of direct repeats of length 11 and 10 bp (Figure 1c,d) are clustered slightly above the values calculated for the shuffled sequences, aside from four short-lived species with higher values (Erinaceus europaeus, Echinosorex gymnura, Didelphis virginiana and Isoodon macrourus). There is no indication in these data of any relationship between the number of direct repeats <12 bp and life span (aside from those four species). The lack of a relationship between the number of shorter direct repeats and life span might be due to the reduced hybridization probability for shorter repeats [30,31]. It is also consistent with the observation that the common age-related deletions are flanked by direct repeats of 13 bp or more [20].

The implication of the two constraints

The data suggest that the mean values for the randomly shuffled sequences act as an approximate lower limit to the actual number of direct repeats of a given length in a mitochondrial genome. Perhaps this lower limit is caused by mutation processes that continually randomize the mtDNA sequence over evolutionary time scales. The implication of the crossing point of the upper and lower constraints is that there is a maximum life expectancy for mammals due to mtDNA susceptibility to deletions. This maximum life expectancy is 80–100 years (Figure 1a,b), based on the crossing point and the confidence intervals of the mean number of repeats in the shuffled sequences. For a species to have a life expectancy beyond this limit, the upper constraint would require the number of direct repeats in its mtDNA to be much less than the number present in a random sequence. We can speculate that over evolutionary timescales, randomizing mutation effects would then increase the number of direct repeats in the mtDNA of that species. Therefore, the susceptibility of the mtDNA to deletions would be increased and the life expectancy of the organism would decrease.

References

1 Gray, M.W. et al. (1999) Mitochondrial evolution. Science 283, 1476–1481
2 Burger, G. et al. (2003) Mitochondrial genomes: anything goes. Trends Genet. 19, 709–716
3 Wei, Y.H. (1992) Mitochondrial-DNA alterations as aging-associated molecular events. Mutat. Res. 275, 145–155
4 Cortopassi, G.A. and Wong, A. (1999) Mitochondria in organismal aging and degeneration. Biochim. Biophys. Acta 1410, 183–193
5 Attardi, G. (2002) Role of mitochondrial DNA in human aging. Mitochondrion 2, 27–37
6 Cortopassi, G.A. and Arnheim, N. (1990) Detection of a specific mitochondrial-DNA deletion in tissues of older humans. Nucleic Acids Res. 18, 6927–6933
7 Gadaleta, M.N. et al. (1992) Mitochondrial-DNA copy number and mitochondrial-DNA deletion in adult and senescent rats. Mutat. Res. 275, 181–193
8 Melov, S. et al. (1995) Increased frequency of deletions in the mitochondrial genome with age of Caenorhabditis elegans. Nucleic Acids Res. 23, 1419–1425
9 Osiewacz, H.D. (2002) Genes, mitochondria and aging in filamentous fungi. Ageing Res. Rev. 1, 425–442
10 Jacobs, H.T. (2003) The mitochondrial theory of aging: dead or alive? Aging Cell 2, 11–17
11 Mita, S. et al. (1990) Recombination via flanking direct repeats is a major cause of large-scale deletions of human mitochondrial DNA. Nucleic Acids Res. 18, 561–567
12 Zhang, C. et al. (1992) Multiple mitochondrial-DNA deletions in an elderly human individual. FEBS Lett. 297, 34–38
13 Lee, H.C. et al. (1994) Differential accumulations of 4,977 bp deletion in mitochondrial-DNA of various tissues in human aging. Biochim. Biophys. Acta 1226, 37–43
14 Pesce, V. et al. (2001) Age-related mitochondrial genotypic and phenotypic alterations in human skeletal muscle. Free Radic. Biol. Med. 30, 1223–1233
15 Arnheim, N. and Cortopassi, G. (1992) Deleterious mitochondrial DNA mutations accumulate in aging human tissue. Mutat. Res. 275, 157–167
16 Zhang, C. et al. (1997) Varied prevalence of age-associated mitochondrial DNA deletions in different species and tissues: a comparison between human and rat. Biochem. Biophys. Res. Commun. 230, 630–635
17 Yowe, D.L. and Ames, B.N. (1998) Quantitation of age-related mitochondrial DNA deletions in rat tissues shows that their pattern of accumulation differs from that of humans. Gene 209, 23–30
18 Porteous, W.K. et al. (1998) Bioenergetic consequences of accumulating the common 4977-bp mitochondrial DNA deletion. Eur. J. Biochem. 257, 192–201
19 Nagley, P. et al. (2001) Mitochondrial DNA deletions parallel age-linked decline in rat sensory nerve function. Neurobiol. Aging 22, 635–643
20 Lee, C.M. et al. (1997) Age-associated alterations of the mitochondrial genome. Free Radic. Biol. Med. 22, 1259–1269
21 Lakshmipathy, U. and Campbell, C. (1999) Double strand break rejoining by mammalian mitochondrial extracts. Nucleic Acids Res. 27, 1198–1204
22 Thacker, J. et al. (1992) A mechanism for deletion formation in DNA by human cell extracts: the involvement of short sequence repeats. Nucleic Acids Res. 20, 6183–6188
23 Larsson, N.G. and Holme, E. (1992) Multiple short direct repeats associated with single mtDNA deletions. Biochim. Biophys. Acta 1139, 311–314
24 Chung, S.S. et al. (1996) Analysis of age-associated mitochondrial DNA deletion breakpoint regions from mice suggests a novel model of deletion formation. Age 19, 117–128
25 Brossas, J.Y. et al. (1994) Multiple deletions in mitochondrial DNA are present in senescent mouse brain. Biochem. Biophys. Res. Commun. 202, 654–659
26 Wang, E. et al. (1997) The rate of mitochondrial mutagenesis is faster in mice than humans. Mutat. Res. 377, 157–166
27 Nowak, R.M. (1999) Walker’s Mammals of the World (Vol. 1–2), 6th edn, Johns Hopkins University Press
28 Parker, S.P. ed. (1990) Grzimek’s Encyclopedia of Mammals (Vol. 1–5), McGraw-Hill
29 Perrin, W.F. et al. eds (2002) Encyclopedia of Marine Mammals, Academic Press
30 Bi, X. and Liu, L.F. (1996) A replicational model for DNA recombination between direct repeats. J. Mol. Biol. 256, 849–858
31 Rocha, E.P.C. (2003) An appraisal of the potential for illegitimate recombination in bacterial genomes and its consequences: from duplications to genome reduction. Genome Res. 13, 1123–1132
32 Khrapko, K. et al. (1999) Cell-by-cell scanning of whole mitochondrial genomes in aged human heart reveals a significant fraction of myocytes with clonally expanded deletions. Nucleic Acids Res. 27, 2434–2441
33 Elson, J.L. et al. (2001) Random intracellular drift explains the clonal expansion of mitochondrial DNA mutations with age. Am. J. Hum. Genet. 68, 802–806
34 Tang, Y. et al. (2000) Maintenance of human rearranged mitochondrial DNAs in long-term cultured transmitochondrial cell lines. Mol. Biol. Cell 11, 2349–2358
35 Khrapko, K. et al. (2003) Clonal expansions of mitochondrial genomes: implications for in vivo mutational spectra. Mutat. Res. 522, 13–19

0168-9525/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.tig.2004.03.003

Evidence that functional transcription units cover at least half of the human genome

Marie Sémon and Laurent Duret

Laboratoire de Biométrie et Biologie Evolutive, UMR CNRS 5558 Université Claude Bernard Lyon 1, 16 rue Raphaël Dubois, 69622 Villeurbanne Cedex, France

Corresponding author: Laurent Duret ([email protected]).

Transcriptome analyses have revealed that a large proportion of the human genome is transcribed. However, many of these transcripts might be functionless. To distinguish functional transcription units (FTUs) from spurious transcripts, we searched for the hallmarks of selective pressure against mutations that impair transcription. We analyzed the distribution of transposable elements, which are counterselected within FTUs. We show that these features are sufficiently informative to predict whether a sequence is transcribed and, if transcribed, in which orientation. Our results indicate that FTUs constitute at least 50% of the genome and that approximately one-third of these transcripts apparently do not encode proteins.

Analyses of the human genome sequence demonstrated that protein-coding regions constitute ~1.5% of human chromosomes [1]. Given the estimated number and the average length of protein-coding genes, protein-coding transcription units should comprise 30%–40% of our genome [1,2]. However, it is much more difficult to estimate the number of transcription units corresponding to non-coding RNA (ncRNA) genes. On chromosome 7, >200 putative ncRNA genes have been identified, comprising ~2% of the chromosome; however, it is possible that many others remain undiscovered [2]. Large-scale cDNA sequencing projects have been established to provide a complete picture of transcriptomes, and recently new methods have been developed to detect rare transcripts and longer cDNAs [3,4]. These studies revealed that ncRNAs are a major component of the mammalian transcriptome [3,4]. However, it is not clear whether all of these transcripts are functional. Some spurious transcripts might result from the activity of cryptic promoters [e.g. originating from transposable elements (TEs) or from recent pseudogenes] or from the illegitimate extension of transcription downstream of genes with weak polyadenylation signals. Contrary to functional transcription units (FTUs), these spurious transcripts are unnecessary for the proper functioning of genomes and hence are not subject to selective pressure. Thus, one possible way to distinguish FTUs from spurious transcripts is to find evidence that they are under selective pressure to be transcribed.

Interestingly, comparisons of the distribution of TEs within introns and intergenic regions have indicated that there is a selective pressure against insertions of TEs within FTUs [5,6]. This is probably because the regulatory elements of such TEs (e.g. polyadenylation signals and promoters) might interfere with the proper expression of FTUs [5,6]. In this article, we describe how we took advantage of this peculiar distribution of TEs to build a model to predict FTUs, and thus evaluated the