Human Mitochondrial Genetics and Evolution A Salas, Universidade de Santiago de Compostela, Santiago de Compostela, Galicia, Spain © 2013 Elsevier Inc. All rights reserved.
Glossary Haplogroup A group (clade) of mtDNA haplotypes (sequences) that are phylogenetically closely related. Haplotype A combination of genetic variants located at different places (loci) in the genome that are transmitted together. In the case of mtDNA, the entire molecule or any part of it can be treated as a single haplotype, given the maternal inheritance pattern of this genome. Heteroplasmy mtDNA status indicating that different mtDNA molecules, generally differing by single point mutations, coexist in the same mitochondrion, cell, tissue type, or organ of an individual. Homoplasmy mtDNA status indicating that all mtDNA copies of an individual are identical.
Mitochondrial DNA: Characteristics Mitochondria are small organelles that generate energy for the cell to use. Each mitochondrion contains several mitochondrial DNA (mtDNA) molecules. There are roughly 100–10 000 copies of the mtDNA genome per cell; the number depends greatly on the energy demands of the cell or tissue. Each mtDNA molecule is circular and about 16 569 base pairs (bp) long; the length varies slightly due to the presence of (generally small) insertions or deletions in particular mtDNA lineages or in disease conditions (where large deletions can be lethal). Inheritance of mtDNA is non-Mendelian, unlike that of autosomal chromosomes; mtDNA is inherited exclusively from the mother. From a functional point of view, the human mtDNA genome can be divided into two main parts: the control and the coding regions. The control region is about 1200 bp long and harbors regulatory elements that are related to mtDNA replication or gene expression. Within the control region, two segments are distinguished by their high average mutation rate; they are generally known as the first and second hypervariable segments (abbreviated as HVS-I and HVS-II or HV1 and HV2). The coding region occupies the rest of the molecule and encodes 37 densely packed genes: 13 for proteins, 22 for transfer RNA (tRNA), and 2 for ribosomal RNA (rRNA) subunits. All of them are essential for the correct functioning of the mitochondrion. In the past, it was commonly thought that all mtDNA molecules of the cell were identical, a condition known as homoplasmy. With recent advances in sequencing techniques and chemistries, it is now known that different mtDNA molecules can coexist within the same individual, a condition known as heteroplasmy. In the past few years, intraindividual mtDNA diversity has also been observed; that is, the presence of different mtDNA molecules in different tissues of the same individual.
These differences are often very subtle and affect
Brenner’s Encyclopedia of Genetics, 2nd edition, Volume 3
Maternal Inheritance The pattern of inheritance in which mtDNA passes to the next generation through the mother (matrilineally), with no contribution from the father’s mtDNA. Mitochondrial DNA (mtDNA) Genetic material that is found only in mitochondria, small organelles located in the cytoplasm of the cell, outside the nucleus. Phylogeny With reference to the mtDNA genome, the multifurcating tree-shaped diagram that describes the evolutionary relationships among mtDNA haplotypes. Phylogeography The area of research that interprets the observed distribution of mtDNA variants in geographical space and within a historical framework in order to reconstruct the processes that created the observed divergence.
point positions of the whole molecule. Some positions seem to be more prone to heteroplasmy than others, usually coinciding with a high positional mutation rate. Heteroplasmy can be relevant in forensic casework, for example, when comparing two pieces of evidence from the same biological source that differ in their levels of heteroplasmy, or in clinical genetics, since certain pathogenic mutations manifest phenotypically only if they are present in the carrier above a given threshold. Mitochondrial DNA variation is generally reported as differences from the Cambridge Reference Sequence (CRS), the first human mtDNA genome to be fully sequenced, reported in 1981. In practice, it is the revised CRS (rCRS) that serves as the contemporary standard because the original CRS contained a few sequencing errors that were corrected in the rCRS in 1999. It is important to note that the rCRS is not the most ancestral sequence of the mtDNA phylogeny, but a European mtDNA that belongs to a derived haplogroup known as H2a2a (haplogroup nomenclature according to the most updated phylogeny; Phylotree Build 12).
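Reporting a haplotype as differences from a reference sequence can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `variants_vs_reference`, the toy sequences, and the starting coordinate are invented for the example (they are not the rCRS), the two sequences are assumed to be already aligned, and insertions and deletions are ignored.

```python
def variants_vs_reference(reference: str, sample: str, start: int = 1) -> list[str]:
    """List substitutions of an aligned sample relative to a reference.

    Each difference is written as <reference base><position><observed base>.
    `start` is the coordinate of the first base of the aligned fragment.
    Assumption: equal-length, pre-aligned sequences; indels are not handled.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    for offset, (ref_base, obs_base) in enumerate(zip(reference.upper(), sample.upper())):
        if ref_base != obs_base:
            variants.append(f"{ref_base}{start + offset}{obs_base}")
    return variants


# Made-up fragments for illustration only; these are not real mtDNA sequences.
toy_reference = "ACGTACGTAC"
toy_sample = "ACGTGCGTAT"
print(variants_vs_reference(toy_reference, toy_sample, start=101))
# → ['A105G', 'C110T']
```

A sample identical to the reference over the compared range yields an empty list, which is why only the differences need to be reported.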
Molecular Variation in Human mtDNA Different methodologies have been used to screen molecular variation in mtDNA in different human populations. Initially, the analysis of restriction fragment length polymorphism (RFLP) sites and screening procedures (e.g., heteroduplex analysis, single-strand conformation polymorphism analysis, etc.) were used in pioneering disease and population studies. Sequencing of HVS-I has been the most popular procedure since the mid-1990s, sometimes complemented with HVS-II data. Today, more than 150 000 individuals representing populations worldwide have been screened for this mtDNA segment, and this information is scattered across the huge number of scientific
articles dealing with human population (and also disease and forensic) studies. In the last decade, coinciding with a reduction in the cost of standard sequencing analysis coupled with important improvements in methodologies and chemistries, sequencing of entire mtDNA genomes has become more popular. Today (October 2011), there are more than 9400 genomes available, most of them uploaded to GenBank. Next-generation sequencing is beginning to emerge in the field of mtDNA research; however, its quality standards are still below those of standard Sanger sequencing. Given that mtDNA variation is inherited as a block from the mother to the offspring, mtDNA lineages are commonly referred to as haplotypes (the mtDNA sequence in this case is synonymous with haplotype). Haplotypes that are evolutionarily closely related can be grouped into clades or, more technically, haplogroups (the term refers to the grouping of the lineage ancestor and its living and/or deceased descendants). Today, a good amount of knowledge exists on the global mtDNA phylogeny, to the point where some particular haplogroups are known to be characteristic of local regions or ethnic groups. This has occurred because mtDNA variation accumulates by way of intergenerational mutations in the germ line at the same time as populations move from one region to another. Given that mtDNA is transmitted through the matriline, its effective population size (technically defined by Sewall Wright as “the number of breeding individuals in an idealized population that would show the same amount of inbreeding as the population under consideration”) is much lower than that of nuclear DNA. Therefore, mtDNA variation is more exposed to the effect of genetic drift, generating patterns of variation that are more geographically stratified than those of average nuclear markers. The mtDNA molecule is also a good model to use for understanding positional mutation rates.
It has been known for several decades that, on average, the mtDNA molecule mutates much faster than the nuclear genome. Today, it is known that some positions in the molecule are stable, while others behave as mutational hotspots. Why this mutational heterogeneity exists in the genome is still unknown. Site-specific mutation rates can be relevant not just in evolutionary studies but also in disease studies and forensic casework (e.g., when aiming to establish the probability of an identity based on mtDNA profiles).
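The statement that the effective population size of mtDNA is much lower than that of nuclear DNA can be made concrete with the textbook idealized model. The sketch below is not taken from the article: it assumes Wright's standard formula for a population with separate sexes, strict maternal inheritance, and homoplasmy, and the function names and census numbers are hypothetical.

```python
def autosomal_effective_size(n_females: float, n_males: float) -> float:
    """Wright's effective size for a population with separate sexes."""
    return 4.0 * n_females * n_males / (n_females + n_males)


def gene_copies(n_females: float, n_males: float) -> tuple[float, float]:
    """Effective number of gene copies for an autosomal locus and for mtDNA.

    Autosomal loci are diploid (two copies per breeding individual).
    mtDNA is treated as haploid and transmitted by females only.
    """
    autosomal = 2.0 * autosomal_effective_size(n_females, n_males)
    mtdna = float(n_females)
    return autosomal, mtdna


def retained_diversity(copies: float, generations: int) -> float:
    """Expected fraction of gene diversity left after drift alone (no mutation)."""
    return (1.0 - 1.0 / copies) ** generations


# Hypothetical population with 500 breeding females and 500 breeding males.
autosomal, mtdna = gene_copies(500, 500)
print(mtdna / autosomal)  # → 0.25
print(retained_diversity(autosomal, 100), retained_diversity(mtdna, 100))
```

With an equal sex ratio the mtDNA value is one quarter of the autosomal one, so under this model drift removes mtDNA diversity faster, which is consistent with the stronger geographic stratification described above.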
Mitochondrial DNA Phylogeography An important body of knowledge on human evolution has been built on the phylogeographic analysis of mtDNA variation. Some limitations exist, since mtDNA represents a single locus within the whole human genome and, given its maternal inheritance, it records only the history of female lineages. These arguments have been used by some researchers to question the power of the mtDNA molecule to infer demographic information. However, it must be understood that the observed patterns of mtDNA variation can be explained only under certain hypotheses, so that many other hypotheses can be discarded. The ‘Out of Africa’ theory can serve as a paradigmatic example (Figure 1). The mtDNA tree can be seen as a large molecular jigsaw puzzle where all of the pieces (mtDNA sequences) find
Figure 1 (map labels: ‘Out of Africa’, ~60 000–70 000 years ago) The map illustrates the skeleton of European phylogeny in a nutshell. The letters in boxes indicate the nomenclature of the main European haplogroups; all of them derived from macro-haplogroup N, which in turn was derived from L3, one of the genuine sub-Saharan macro-haplogroups. Each haplogroup is defined by specific mutations (not indicated in the figure), and all of them can be further hierarchically subdivided along a multifurcating tree. Thus, for instance, one could move from the rCRS to sub-haplogroup J1c3a1 by ‘traveling’ along the phylogeny and following the nodes: rCRS > H2a2a(3) > H2a(1) > H2(1) > H(1) > HV(2) > R0(1) > R(2) > R2′JT(1) > JT(3) > J(6) > J1(2) > J1c(3) > J1c3(1) > J1c3a(1) > J1c3a1(1); the numbers in brackets indicate the diagnostic mutations that have to be present on all mtDNAs that belong to J1c3a1 (unless back mutation occurs). For instance, the mutations that separate haplogroup H from HV are two transitions occurring at positions 2706 and 7028 (nomenclature according to the rCRS).
their place in the ‘picture’. When reconstructing the global mtDNA phylogeny, it can be observed that all non-African haplotypes derive from just two macro-haplogroups, named M and N. The immediate mtDNA ancestors of M and N are found in sub-Saharan Africa; both derive from macro-haplogroup L3, which is undoubtedly of sub-Saharan origin and closely related to other L haplogroups that are only observed in this continental region (excluding the lineages that arrived in the Americas and other parts of the world as a consequence of the forced trans-Atlantic slave trade or modern migrations). Such a pattern of mtDNA variation is completely incompatible with the multiregional hypothesis for the origin of modern humans; this hypothesis holds that humans first arose near the beginning of the Pleistocene about 2 million years ago from
archaic human forms in different geographical (continental) locations and evolved worldwide into the diverse populations of modern Homo sapiens sapiens. This hypothesis has been supported by many physical anthropologists for decades. Patterns of worldwide mtDNA variation are incompatible with it, but fit perfectly with the alternative known as the ‘Recent African Replacement Theory’, which proposes that modern humans arose as a new species in sub-Saharan Africa around 100 000–200 000 years ago and moved out of Africa around 60 000 years ago (‘Out of Africa’ theory) to replace other existing archaic human species (Figure 1). The analysis of mtDNA from a phylogeographic perspective, complemented by several other statistical approaches, has helped to shed light on the colonization of Southeast Asia, the Pacific, the American continent, and Europe, among others. The analysis of mtDNAs carefully selected for entire genome sequencing (generally representing specific haplogroups) has provided a good amount of knowledge about postglacial expansions and hunter–gatherer relationships, for example.
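The path given in the caption of Figure 1 can be written down as a small data structure, which makes the hierarchical bookkeeping explicit. The following Python sketch only transcribes the node names and bracketed counts from that caption; the names `PATH_RCRS_TO_J1C3A1` and `mutations_along` are invented for the example, and reading each count as the number of mutations on the branch leading to that node is an assumption.

```python
# Nodes and bracketed counts copied from the caption of Figure 1
# (rCRS is the starting point and carries no count of its own).
PATH_RCRS_TO_J1C3A1 = [
    ("H2a2a", 3), ("H2a", 1), ("H2", 1), ("H", 1), ("HV", 2),
    ("R0", 1), ("R", 2), ("R2'JT", 1), ("JT", 3), ("J", 6),
    ("J1", 2), ("J1c", 3), ("J1c3", 1), ("J1c3a", 1), ("J1c3a1", 1),
]


def mutations_along(path, stop_at=None):
    """Sum the per-branch counts from the rCRS up to and including `stop_at`.

    With stop_at=None the whole path is summed.
    """
    total = 0
    for name, count in path:
        total += count
        if name == stop_at:
            return total
    if stop_at is not None:
        raise KeyError(f"node {stop_at!r} is not on the path")
    return total


print(mutations_along(PATH_RCRS_TO_J1C3A1))                # → 29
print(mutations_along(PATH_RCRS_TO_J1C3A1, stop_at="HV"))  # → 8
```

Under this reading, a J1c3a1 mtDNA would be expected to differ from the rCRS at 29 diagnostic positions, barring back mutation.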
Pseudo-Mitochondrial Genome NUMT, short for ‘nuclear mitochondrial DNA’, refers to mtDNA sequences that have been naturally transferred (‘transposed’) from the cytoplasmic mtDNA into the nuclear genome. Most (if not all) NUMTs are transcriptionally inactive. Analysis of the human genome has revealed a large number of mtDNA segments transferred into nuclear DNA. With standard polymerase chain reaction (PCR)-based techniques, NUMTs can be unintentionally amplified instead of the targeted mtDNA because of the sequence similarity between the two. Although patterns of variation in NUMTs differ significantly from those observed in mtDNA, some NUMTs have escaped the notice of researchers, and a number of disease and evolutionary studies have unfortunately been led to wrong conclusions.
Phylogeny as a Tool for Data Quality Control The mtDNA phylogeny is the skeleton into which all existing human mtDNA variation should fit. A pattern of genetic variants that does not fit in the mtDNA tree has most likely been generated artificially, by way of contamination, sample mix-up, or documentation errors. The strategy of fitting newly generated data into the known phylogeny has often been used in the literature to expose the deficiencies that commonly arise in mtDNA studies. Many studies that challenged the nonrecombining nature of mtDNA were severely affected by erroneous datasets and have therefore been refuted using phylogeny as an a posteriori data quality control method. Errors can have important consequences for research studies, but they can be even more relevant in the forensic field (see below), where errors can be
instrumental in swaying the decision of a judge in favor of or against a given suspect.
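The idea of checking new data against the known phylogeny can be illustrated with a very small consistency test. The sketch below is a simplification under stated assumptions: it uses only the two positions that the caption of Figure 1 gives as separating haplogroup H from HV (2706 and 7028), the function name and the sample profiles are invented, and a real check would use the full set of diagnostic mutations from a curated tree.

```python
# From the text: two transitions, at positions 2706 and 7028, separate
# haplogroup H from HV. Because the rCRS itself belongs to H, a lineage
# outside H is expected to differ from the rCRS at both positions.
EXPECTED_OUTSIDE_H = frozenset({2706, 7028})


def missing_diagnostic_positions(observed_positions, expected_positions=EXPECTED_OUTSIDE_H):
    """Return the expected diagnostic positions that a profile does not show.

    An empty list means the profile is consistent with the expectation.
    A non-empty list flags the profile for review (possible contamination,
    sample mix-up, documentation error, or a genuine back mutation).
    """
    return sorted(set(expected_positions) - set(observed_positions))


# Invented profiles (positions that differ from the rCRS), both reported
# as belonging to a haplogroup outside H. Position 1111 is a placeholder.
consistent_profile = {1111, 2706, 7028}
suspicious_profile = {1111, 2706}

print(missing_diagnostic_positions(consistent_profile))  # → []
print(missing_diagnostic_positions(suspicious_profile))  # → [7028]
```

A flagged profile is not automatically an error, since back mutation does occur, but it marks the sample for resequencing or a check of the laboratory records before the data are used.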
Human Population Considerations in Biomedicine Although the interplay between different areas of research has generally been scarce, many common issues are attracting the interest of different specialists. Forensic geneticists are often interested in establishing a maternal relationship between two donors, or between a suspect and an evidentiary sample from a crime scene, for example. The mtDNA test is often the only choice available because evidentiary samples might be highly degraded or contain low amounts of DNA, so that only mtDNA can be PCR amplified, given the large number of mtDNA molecules in the cell compared to the limited number of nuclear DNA copies. In addition, hair shafts (a common evidentiary sample at crime scenes) contain little or no nuclear DNA, and therefore analysis of mtDNA is mandatory. Estimates of haplotype frequencies in populations are important in order to weigh the alternative hypotheses that are generally considered in a forensic context: is the mtDNA evidence better explained if it came from a given suspect or from another person (i.e., somebody else in the population)? Estimating mtDNA haplotype frequencies is not as straightforward as it is for nuclear polymorphisms, given that most mtDNA haplotypes are rare in populations and that mtDNA variation is more stratified in populations than most genetic markers of the nuclear genome. Population stratification is also a concern for researchers interested in mtDNA as a predisposing factor in complex diseases, because it can generate false-positive associations in such studies. Variation in mitochondrial DNA has also been analyzed in the context of tumorigenesis under the hypothesis that mtDNA mutations could be responsible for tumor development. Most of these studies have been the subject of intense debate, given that many of the results could have been generated as a consequence of methodological artifacts.
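The weighing of the two forensic hypotheses can be sketched numerically. The article does not prescribe an estimator, so everything below is an assumption chosen for illustration: a simple corrected count that adds the case haplotype to the database, an exact one-sided upper bound for a haplotype never seen in the database, and a likelihood ratio equal to the inverse of the frequency (which ignores heteroplasmy, laboratory error, maternal relatives, and population substructure). The function names and the database size are hypothetical.

```python
def corrected_frequency(matches: int, database_size: int) -> float:
    """Frequency estimate that counts the case haplotype as one extra observation."""
    if matches < 0 or database_size < 0 or matches > database_size:
        raise ValueError("need 0 <= matches <= database_size")
    return (matches + 1) / (database_size + 1)


def upper_bound_unobserved(database_size: int, alpha: float = 0.05) -> float:
    """One-sided upper confidence bound for a haplotype with zero database matches.

    Solves (1 - p) ** n = alpha for p.
    """
    if database_size <= 0:
        raise ValueError("database_size must be positive")
    return 1.0 - alpha ** (1.0 / database_size)


def likelihood_ratio(frequency: float) -> float:
    """Weight of a match: same source versus an unrelated random person."""
    if not 0.0 < frequency <= 1.0:
        raise ValueError("frequency must be in (0, 1]")
    return 1.0 / frequency


# Hypothetical case: the evidence haplotype is absent from a database of 999.
freq = corrected_frequency(matches=0, database_size=999)
print(freq)                         # → 0.001
print(likelihood_ratio(freq))
print(upper_bound_unobserved(999))
```

Because most mtDNA haplotypes are rare, the estimate is driven largely by the size of the database rather than by the observed count, and a database drawn from the wrong population can be misleading given the stratification discussed above.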
See also: mtDNA; Population Genetics.
Relevant Websites http://www.phylotree.org/ – Phylotree http://www.mitomap.org/MITOMAP – Mitomap http://empop.org/ – Empop