Heterozygosity: An Expanding Role in Proteomics


Molecular Genetics and Metabolism 74, 51–63 (2001) doi:10.1006/mgme.2001.3240, available online at http://www.idealibrary.com


MINIREVIEW

Georgirene D. Vladutiu¹

Department of Pediatrics, Division of Genetics, School of Medicine & Biomedical Sciences, University at Buffalo, 936 Delaware Avenue, Buffalo, New York 14209

Received July 16, 2001, and in revised form July 26, 2001

The human genome sequence provides the framework for understanding the biology of human cell function. The next step is to intensify the investigation of protein function in the context of complex biological systems. Cellular functions are carried out by molecular complexes acting in concert rather than by single molecules or single reactions. Parallels have been drawn between scale-free nonbiologic networks and functionally interconnected metabolic pathways in the cell. Modeling of metabolic networks, in which functional modules or subnetworks represent individual related pathways, will lead to the prediction of protein function in the larger context of a complex system. Depending on the robustness of these metabolic networks, single-gene defects alone or in combination with other gene defects and the environment have the potential for invoking a spectrum of alterations in the integrity of a given network. The overall purpose of this review is to highlight the importance of simple heterozygosity for one pathogenic mutation or combinatorial heterozygosity for two or more mutations within or between individual genes in altering the stability of metabolic networks. Several forms of heterozygosity are considered, e.g., intra- and interallelic heterozygosity and double heterozygosity. The concepts of synergistic heterozygosity, loss of heterozygosity, and mitochondrial DNA heteroplasmy also are discussed in relation to the quantitative effects of coexisting mutations on the phenotypic expression of disease.

© 2001 Academic Press

Key Words: heterozygosity; metabolic networks; proteomics; genotype-phenotype correlation; carnitine palmitoyltransferase II deficiency; myophosphorylase deficiency; myoadenylate deaminase deficiency; complex phenotypes.


The recent publication of a rough draft of the human genome (1) is a crucial first step toward understanding the intricacies of human cell function and provides the necessary catapult for exploring new directions in the prediction of protein function. From this accomplishment comes the promise of new therapies, innovations in molecular diagnostics, and personalized drugs (2). While the latest estimate of the number of genes (~32,000) is only a fraction of the expected figure and represents less than 5% of the entire genome (2), we know that extensive use of alternative splicing generates considerable combinatorial diversity at the protein level, demonstrating that the complexity of an organism depends in part on the diverse capabilities of individual genes. Therefore, we can no longer rely on the one gene/one protein hypothesis. The living cell is not just a bag of metabolic reactions; it is an extremely complex system with many variables including location, motion, separation in space, molecular conformation, chemical composition, and the regulation of electrical potentials (3). The recent combination of computational science and biology into computer modeling of cellular functions has resulted in several powerful E-based programs (3, for review). One of the most promising is the “Virtual Cell,” a program that brings together the perspectives of the biochemist, geneticist, and microscopist to define biological compartments and their

¹ Fax: (716) 878-7980. E-mail: [email protected].

1096-7192/01 $35.00 Copyright © 2001 by Academic Press All rights of reproduction in any form reserved.


contents in space. An investigator can input data to study specific metabolic systems, adding complexity as needed and using the outcomes to make functional predictions. The program is funded by the NIH through a national resource center and is available on the world wide web (http://www.nrcam.uchc.edu). While these innovative methods are being used to develop a bigger picture of cell physiology from the ground up, metabolic physicians and geneticists are trying to understand inherited metabolic disease at the organismal level by dissecting disease phenotypes from the top down. Their tools require integration of molecular, biochemical, and clinical information in order to provide accurate diagnoses for and medical management of their patients. In the near future, protein function will be predictable in the context of higher order processes such as the regulation of gene expression and signaling cascades. Knowledge of protein function will enable the prediction of genotype-phenotype correlations (4). The purpose of this review is to provide an overview of the complexities that must be considered in the analysis of metabolic systems, and then to assess the consequences for an individual’s phenotype of gene mutations present in single copy (simple heterozygosity) or together with single copies of other mutated genes in an array of multigenic combinations (multiple gene heterozygosity).

Genotype-Phenotype Correlation

Understanding the relationship between genotype and phenotype has been an evolving process ever since the principles of Mendelian inheritance were applied to the human condition by physician investigators such as Sir Archibald Garrod at the turn of the 20th century (5). As with so many challenging biologically based processes, genotype-phenotype correlation turns out to be considerably more complex than the initial expectation of a direct and measurable impact of a single mutation on the expression of a gene.
The realization is that for many disorders only a subset of causative mutations can reliably predict the phenotype. A two-threshold model has been proposed to explain the relationship between gene mutation and mutant protein function (6). In this model, a severe phenotype will always be observed below a certain threshold of protein function and a mild phenotype will always be observed above a second, higher threshold, with an indeterminate range between the two in which there is no correlation between mutation and phenotype (6).
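The two-threshold model can be pictured as a simple classifier over residual protein function. The sketch below is only an illustration: the function name `classify_phenotype` and the default thresholds (10% and 50% of normal function) are hypothetical placeholders chosen for the example, not values taken from reference (6).

```python
def classify_phenotype(residual_function, severe_below=0.10, mild_above=0.50):
    """Classify the expected phenotype from residual protein function (0.0-1.0).

    The default thresholds are illustrative assumptions, not measured values.
    """
    if not 0.0 <= residual_function <= 1.0:
        raise ValueError("residual_function must be a fraction between 0 and 1")
    if severe_below >= mild_above:
        raise ValueError("severe_below must be lower than mild_above")
    if residual_function < severe_below:
        return "severe"         # below the lower threshold: reliably severe
    if residual_function > mild_above:
        return "mild"           # above the upper threshold: reliably mild
    return "indeterminate"      # between thresholds: mutation does not predict phenotype


for level in (0.05, 0.30, 0.80):
    print(level, classify_phenotype(level))
```

In such a picture, other genes and environmental factors would act by shifting either the residual function or the thresholds themselves, which is why the middle band cannot be resolved from the mutation alone.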

It is the indeterminate range that is most intriguing and ultimately may prove to be the most telling component of the model in understanding which combination of factors makes an essential contribution to a particular phenotype. One must contemplate the existence of other gene interactions, environmental factors, and timing of expression that undoubtedly contribute to the less predictable phenotypes.

The Proteomics Era

The proteome is simply defined as the protein complement of the genome (7). It represents the complete set of proteins expressed by a cell during its lifetime. The proteome is much more complex than the genome. More and more it is being realized that cellular functions are carried out by molecular complexes acting in concert rather than by single molecules (8). A dramatic change in biological research is occurring from “reductionist dissection to systems integration” (9). Subproteomes may be defined as individual sets of related proteins in the cell that have a common purpose. The existence of 250 different human cell types, each expressing different subproteomes at various times and under different conditions, and the enormous amount of structural diversity found among the proteins expressed will provide a major challenge in unraveling the mysteries of the human proteome for decades to come (10). The study of the proteome, termed proteomics, includes evaluation of pretranslational processing of mRNAs, posttranslational processing of proteins, structural folding, targeting, and interactions within and between individual components of metabolic pathways, to name a few (Fig. 1). The slow pace of developments in protein chemistry technology needed to study the proteome has paled by comparison with the exponential growth of technologies that were developed in the race to sequence the genome (11). In the seventies, geneticists believed that knowing the genomic sequence would reveal the understanding of all hereditary diseases.
However, in the nineties, it became clear that while the DNA sequence of individual genes is essential, it is not adequate for determining the extent and consequences of gene alterations in the causation of disease. The analysis of gene products, including mRNAs, also would prove to be critical to the understanding of hereditary disease (11). Due to the complexities of pre- and posttranslational modifications of mRNAs and proteins, respectively, and the numerous overlapping metabolic pathways involved in complex biological processes, knowing the entire genome of a subject will not take into account the environmental modulation of genetic predisposition and genome expression (11). In fact, only about 2% of our total disease load is related to monogenic causality and the “rules governing physiological regulation and cellular and higher levels of organization are located not in the genome, but in interactive epigenetic networks which themselves organize genomic response to environmental signaling” (12).

FIG. 1. The complexity of proteomics.

Metabolic Networks

The concept of interconnected cellular metabolic networks is not new, especially in the study of simple organisms (13,14). Metabolic networks are composed of cellular constituents and reactions that generate mass, energy, information transfer, and cell-fate specification (15). Metabolic networks have been systematically studied in a variety of organisms across all levels of life and found to be very similar in their topological properties with strikingly similar characteristics to the inherent organization of nonbiological systems (15). There are basically two types of networks displayed by complex systems: exponential (homogeneous, with most nodes having the same number of links) and scale-free (heterogeneous, with most nodes having one or two links and only a few having a large number of links) (16). Biological networks are generally of the scale-free type. The question arises as to the


stability of these networks. Their perturbation has the potential for affecting a multitude of cellular processes. They may be so finely tuned that any alterations, e.g., in rate constants or analyte concentrations, may compromise their performance. Alternatively, some may be relatively insensitive to change revealing a robustness that allows a network to adapt easily to environmental stressors or even to genetic alteration. Bacterial chemotaxis, for example, represents a robust network that adapts to different stimuli without relying on the fine-tuning of individual parameters (13). The individual subnetworks within metabolic networks can be thought of as “functional modules” that are discrete entities with functions distinct from those of other modules (17). The function of individual modules relies on interactions among their components and may include proteins, RNA, DNA, and small molecules. The analysis of metabolic networks, their function, productivity, and vulnerability will rely not only on the structure of the individual reactions in the modules, but also on governing factors such as the kinetics and regulation of the reactions (9). Some modules are spatially isolated in structure and function, e.g., the ribosome acts as a single module to synthesize proteins. Others represent extended modules, e.g., signal transduction systems, such as that controlling chemotaxis in bacteria; there is the initial binding of a chemical signal followed by interactions between signaling proteins throughout the cell (17). It is the specific connectivity between selected nodes or modules that constitutes a metabolic network and individual networks make up subproteomes of the cell (Fig. 2).

FIG. 2. The evolution of a subproteome. OXPHOS, oxidative phosphorylation.
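The contrast between homogeneous (exponential) and heterogeneous (scale-free) networks can be made concrete by comparing how links are distributed over nodes. The following sketch is a toy illustration under stated assumptions: it uses preferential attachment (new nodes link preferentially to already well-connected nodes) as a stand-in for a scale-free topology and uniformly random links for the homogeneous case. All function names, the node count, and the random seed are hypothetical choices for the example and are not drawn from the cited studies.

```python
import random
from collections import Counter


def random_network(n_nodes, n_edges, rng):
    """Homogeneous network: edges join uniformly chosen pairs of nodes."""
    edges = set()
    while len(edges) < n_edges:
        a, b = rng.sample(range(n_nodes), 2)
        edges.add((min(a, b), max(a, b)))
    return edges


def preferential_network(n_nodes, rng):
    """Heterogeneous network: each new node links to one existing node,
    chosen with probability proportional to that node's current degree."""
    edges = {(0, 1)}
    endpoints = [0, 1]          # each node is listed once per link it holds
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)
        edges.add((target, new))
        endpoints += [target, new]
    return edges


def degrees(edges):
    """Count the number of links held by each node."""
    count = Counter()
    for a, b in edges:
        count[a] += 1
        count[b] += 1
    return count


rng = random.Random(0)
n = 500
hub_net = degrees(preferential_network(n, rng))
flat_net = degrees(random_network(n, n - 1, rng))
print("largest hub, preferential attachment:", max(hub_net.values()))
print("largest hub, uniform random:", max(flat_net.values()))
```

Both networks have the same number of nodes and links, so any difference in the size of the largest hub reflects topology alone. In a hub-dominated network, loss of a randomly chosen node is usually tolerated, whereas loss of a hub is not, which is one way to think about why mutations in different proteins of the same network can differ so much in their consequences.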


The survival of living systems throughout evolution requires that critical patterns of essential functional modules, such as the accuracy of chromosome segregation, be robust and insensitive to many environmental and genetic alterations. On the other hand, the ability to evolve requires sensitivity to change including genetic alterations. With respect to reconciling differences in genotype-phenotype correlations in metabolic disease, it is important to understand the different characteristics of network robustness and the flexibility in response to change that is inherent in each metabolic module. A single metabolic pathway, as a functional module, is capable of interacting with other pathway modules that collectively have a far-reaching influence on cell physiology. The example in Fig. 3 shows the purine metabolism pathway depicted as a four-step enzymatic module that produces fumarate and ammonia. The Krebs cycle, another multistep module, uses fumarate as a substrate ultimately leading to the production of ATP through interaction with the electron transport chain. Ammonia from the purine pathway activates phosphofructokinase in the glycolytic pathway and in turn, pyruvate is produced and used in the production of acetyl-CoA, yet another substrate in the Krebs cycle. Fatty acid oxidation also contributes to the production of acetyl-CoA through the systematic degradation of variable length fatty acids. All of these pathways are separate modules with different sets of substrates yet they act interdependently toward the common goal of energy production. Mutations in any of the contributing proteins can impact not only on the function of the individual pathways but also on the collective function of the interactive partners. As the proteomics era unfolds, increasing attention must be given to the modeling of metabolic networks, prediction of their vulnerability, and enhancement of their chances for survival in the face of attack.

FIG. 3. An interactive metabolic network. Schematic representation of metabolic pathways that are interconnected functionally with the common goal of energy production. Individual pathways function as nodes or modules in a network that forms a subproteome in the cell. FA, fatty acid; CARN, carnitine; CPT, carnitine palmitoyltransferase; CAT, carnitine/acylcarnitine translocase; OM, outer mitochondrial membrane; IM, inner mitochondrial membrane; AK, adenylate kinase; AMPD, myoadenylate deaminase; ASS, adenylosuccinate synthetase; ASL, adenylosuccinate lyase; AS, adenylosuccinate; ASP, aspartate; IMP, inosine monophosphate; AMP, adenosine monophosphate; PPL, phosphorylase; PYR, pyruvate; ETC, electron transport chain.

Sources of Protein Diversity: Scratching the Surface

Alternative Splicing

One of the largest tasks in the postgenomic era will be to describe and to characterize functionally the components of the proteome. Many genetic mechanisms contribute to making this task more complicated including DNA recombination, RNA editing, exon shuffling, alternative splicing, and translational regulation. Variations within these mechanisms are fascinating tributes to nature’s dedication to maintaining biologic diversity. Alternative splicing is the most widely used mechanism for enhancing protein diversity in terms of the number of genes affected and the variety of organisms in which it occurs (18). Furthermore, it helps to explain the apparent discrepancy between gene number and the complexity of the human organism (18). Exons have multiple 5′ and 3′ splice sites that are alternatively used. Both constitutive (included in all mRNAs) and alternative (included in only some mRNAs) exons exist, offering the splicing machinery choices for inclusion or exclusion when generating a mature mRNA transcript (18). While the number of human genes that are alternatively spliced has been estimated to be 35% based on the analysis of expressed sequence tags (ESTs) (19,20), this number is likely to be an underestimate since the EST collection (http://www.ncbi.nlm.nih.gov/dbEST) does not represent all protein coding sequences and only covers a portion of the transcript for most genes (18). Therefore, it is likely that the majority of transcripts expressed from human genes are alternatively spliced (18). Some genes are known to encode alternatively


spliced transcripts that produce anywhere from hundreds to tens of thousands of different mRNAs, e.g., the neurexin family of neural proteins that function as receptors of neuropeptides and as adhesion molecules participating in synaptogenesis (18, for review). Over 1000 different neurexin mRNAs could potentially be synthesized from three genes in the family via alternative promoter usage and alternative splicing. Interestingly, the resultant proteins from some alternatively spliced mRNAs have altered specificities for their ligands, and the diversity in the neurexin proteins has been proposed to be a component of a code for neural connectivity (21). Determining the functional differences among proteins encoded by alternatively spliced mRNAs is a very important area of functional genomics and of particular interest to the pharmaceutical industry as these variants may mediate different pharmacological responses in the specific tissues in which they are located (2). Furthermore, many alternatively spliced proteins are likely to represent a large number of ligand-binding domains that could be therapeutic targets (2).

Exon Shuffling

Exon shuffling is the natural process of creating new combinations of exons by intronic recombination resulting in rearranged genes with altered functions (22). Analyses of protein sequences and three-dimensional structures have revealed that many proteins generated by exon shuffling are composed of discrete domains that are considered to be “evolutionarily mobile,” meaning that they have moved around during evolution and are now included in otherwise unrelated proteins (23, for review). Mobile domains are able to fold independently, which prevents misfolding when they are inserted into a new protein environment. Examples include kringle domains found in unrelated proteins such as apolipoprotein A and hemostatic proteases, e.g., zymogen.
A discussion of the proposed mechanisms for the formation and shuffling of domain-encoding exons is beyond the purview of this review; however, it is clear that this unique process contributes to protein diversity in a well-conserved manner.

Posttranscriptional Control

Expression levels of a protein depend not only on transcriptional mechanisms but also on additional control mechanisms including, among others, translational regulation and posttranslational modifications (24). The regulation of protein concentrations by translational control has often been either neglected or underestimated. It is particularly important during development, in response to environmental stimuli, and following viral infection or in disease. Comparison of the steady-state levels of proteins with those of their corresponding mRNAs shows that mRNA abundance is a poor indicator of the level of the corresponding protein, with variances between the two of up to 30-fold. Since the proteome determines the phenotype at the cellular level, the disparity between protein and transcript levels might lead to misinterpretation of mRNA profiling. The more appropriate approach of translational profiling can be accomplished by isolating polysome-bound mRNAs. This approach integrates every level of regulation from transcription to translation and promises to make an important contribution to characterization of the proteome, helping to bridge the gap between genomics and proteomics (24). All of these factors contribute in varying degrees to the diversity of phenotypes among individuals. Therefore, inheritance of one or more mutations in any step of an interactive set of metabolic modules, or network, is bound to have an impact at many levels in the proteome. With this in mind, it is no wonder that specific and reproducible genotype-phenotype correlations can be made in only a small fraction of so-called single-gene disorders with consistently severe or mild phenotypes. Single rate-limiting steps do not control metabolic pathways. Control is shared among all steps and regulation occurs at the systems level mediated by effectors internal or external to the regulated system (6, for review). The activities of particular enzymes in a given metabolic pathway may be influenced by single mutations or combinations of intra- and interallelic mutations affecting one gene, or affecting multiple genes.
Therefore, in this complex set of metabolic networks constituting the proteome, it is reasonable to look more closely at genetic alteration of all types no matter how subtle and assess the possible significance it may have in variably altering the robustness of networks within the proteome.

The Expanding Definition of Heterozygosity

The standard definition of simple heterozygosity is the existence of two different alleles at a designated locus, where a locus is the place occupied by a gene in a chromosome. With the increased understanding of genetic alteration at the molecular level,


this basic definition serves as a platform for expanded definitions of heterozygosity. In autosomal recessive inheritance of disease, heterozygotes are usually asymptomatic, as in cystic fibrosis or Tay-Sachs disease, whereas in autosomal dominant inheritance, heterozygotes are clinically symptomatic, as in familial hypercholesterolemia type II. However, the terms “dominant” and “recessive” describe the phenotype, not the genotype. Manifesting carriers of typically autosomal recessive disorders have been reported, e.g., in the triggerable myopathies (McArdle disease and carnitine palmitoyltransferase deficiency) (25–27). In these situations, it appears that inherited traits interact with environmental factors (e.g., strenuous exercise, fasting, extremes in temperature), bringing the individual to a threshold at which symptoms of the disease become manifest. Within families, there is evidence that individuals with the same recessive disorder have differing degrees of manifestations, suggesting that additional genetic factors may also impact the underlying disorder together with the environment (28,29). A rather extreme example of the impact of environment on phenotypic expression in heterozygotes is the well-documented occurrence of complications during the pregnancies of heterozygous mothers carrying fetuses affected with long-chain 3-hydroxyacyl-CoA dehydrogenase (LCHAD) deficiency. As many as 79% of pregnancies with LCHAD-deficient fetuses were shown to be complicated by acute fatty liver of pregnancy (AFLP) or hemolysis, elevated liver enzymes, and low platelets (HELLP syndrome) (30, for review). The complications are believed to be due mainly to the liver toxicity of accumulated 3-hydroxy-fatty acids generated by the fetus.
However, the fact that obligate carrier mothers have reduced capacity to oxidize long-chain fatty acids may greatly reduce their threshold for manifesting clinical symptoms under the stress of the unfavorable metabolic microenvironment generated by the fetus. If it were possible experimentally for a noncarrier mother to carry a LCHAD-deficient fetus, the question arises as to whether this mother would have the same risk of liver-related complications during pregnancy as a heterozygous mother. Mouse models are currently under development to study these questions (30). Figure 4 schematically depicts examples of heterozygosity from the simple heterozygote, around which the basic definition was formed, to various models for multiple gene heterozygosity, all of which have the potential to contribute, however subtly, to the variation observed in

FIG. 4. The spectrum of heterozygosity. Solid bars denote single mutant loci; hatched bars denote wild-type loci; dotted bars represent loss of an allele.

individual phenotypes of disease. The possibilities for combinations and interactions between multiple genes are unlimited. An individual with two different mutant alleles at a given locus is a compound heterozygote. Compound heterozygosity can lead to a variety of phenotypes depending on the effect of each mutation on the integrity of the affected protein. One of the most studied examples is the effect of different missense mutations in the phenylalanine hydroxylase (PAH) gene on the integrity of the PAH protein (31). Most of the PAH missense mutations appear to reduce the intracellular stability of the enzyme protein, making it more susceptible to misfolding. The degradation of mutant PAH proteins by proteases appears to be triggered by conformational alterations of the molecules that promote PAH aggregation (31). When considering the biallelic nature of inherited traits, one must look closely at the subunit structure of a given protein and determine the impact a single gene mutation may have on all possible configurations of the subunits. For example, in an individual who is a simple heterozygote for an autosomal recessive missense mutation in a gene encoding a tetrameric protein, 16 subunit configurations are possible, in a ratio of 1:4:6:4:1 according to the number of mutant subunits, and 15 of these contain at least one altered subunit (27). Hence, if the mutation is appropriately severe, it could alter the function of >90% of the tetramers. Conversely, we suspect that nonsense mutations, though often more severe in the homozygous state than missense mutations, result in truncated subunit proteins that are degraded and, hence, not incorporated into tetramers. Therefore, the quality of residual enzyme in a nonsense mutation carrier may be normal while the total activity would be reduced, and these heterozygotes would be at lower risk for manifesting symptoms compared with missense mutation carriers, who would produce altered but functional protein. Another example demonstrating the difference between heterozygosity for missense versus nonsense mutations is the phenotypic heterogeneity found in congenital adrenal hyperplasia secondary to 21-hydroxylase deficiency caused by mutations in the 21-hydroxylase gene (CYP21). In a study of 30 adolescent girls with hyperandrogenism, 14 healthy controls, and 15 obligate female carriers for CYP21 gene mutations, heterozygosity for CYP21 mutations was increased in women with hyperandrogenism (10/30) compared to healthy controls (1/14). Furthermore, CYP21 nonsense mutation carriers tended to be asymptomatic while missense mutation carriers manifested polycystic ovary syndrome (32). Another state of heterozygosity with potential for significant phenotypic impact is the double heterozygote, an individual with one mutant allele at each of two different gene loci. The occurrence of double heterozygosity for two pathological mutations in different genes can lead to a more severe phenotype as in the case of familial hypertrophic cardiomyopathy. This is a genetically heterogeneous autosomal dominant disease caused by mutations in several sarcomeric protein genes.
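The tetramer figures quoted above (16 configurations, a 1:4:6:4:1 ratio, and more than 90% of tetramers affected) follow from simple binomial counting. The short sketch below reproduces that arithmetic under two assumptions that are made here only for illustration: the wild-type and mutant alleles are expressed in equal amounts, and subunits assemble into tetramers at random.

```python
from math import comb

SUBUNITS = 4  # a tetramer

# Number of distinct subunit arrangements containing k mutant subunits (k = 0..4).
arrangements = [comb(SUBUNITS, k) for k in range(SUBUNITS + 1)]
total = sum(arrangements)                # 2**4 possible configurations
with_mutant = total - arrangements[0]    # everything except the fully wild-type tetramer
fraction_altered = with_mutant / total

print(arrangements)        # [1, 4, 6, 4, 1]
print(total, with_mutant)  # 16 15
print(f"{fraction_altered:.1%} of tetramers carry at least one mutant subunit")
```

Under these assumptions only 1 tetramer in 16 is built entirely from wild-type subunits, so 15/16 (93.75%) contain at least one mutant subunit. If a truncated subunit is degraded before assembly, as suggested for nonsense mutations, every assembled tetramer would instead be wild type and only the total amount of enzyme would fall.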
Among seven genes associated with the disease, the β-myosin heavy chain gene (MYH7) and the cardiac myosin binding protein C (MYBPC3) gene are the most frequent. Eight individuals with hypertrophic cardiomyopathy were investigated for mutations in the MYH7 and MYBPC3 genes. Four carried a substitution mutation in the MYH7 gene, two carried a truncation mutation in the MYBPC3 gene, and two were doubly heterozygous for both mutations, representing the first report of the presence of two pathologic mutations in different genes associated with hypertrophic cardiomyopathy (33). The double heterozygotes exhibited marked left ventricular hypertrophy that was greater than that found in the other affected subjects, suggesting a synergistic effect between the two mutations. Another complex disease category in which double heterozygosity contributes significantly to the severity of disease is in the hemoglobinopathies where double heterozygosity for certain structural hemoglobin variants and/or thalassemia syndromes leads to severe disease (34). Disorders also exist in which double heterozygosity is required to produce clinically significant disease. One such case is digenic retinitis pigmentosa (RP). The two genes involved are RDS, which encodes a glycosylated dimer found in the rims of rod and cone outer segment disks, and ROM1, a gene encoding a protein similar in sequence and structure to the RDS protein. The two proteins form a tetrameric complex at the rim region of photoreceptor outer segment disk membranes that is believed to maintain the structure of the disks (35, for review). Among patients with RP, RDS gene mutations account for about 3 to 4% of cases. While there is no strong evidence that defects in ROM1 alone can cause monogenic RP, there are reports of families in which all affected members have defects in both RDS and ROM1. The mutation in RDS is always the missense mutation L185P. Heterozygosity for this mutation alone has no clinical effect. On the other hand, all affected individuals heterozygous for this mutation also are heterozygous for a mutation in ROM1 that is either a frameshift or missense mutation. Since the two genes are on separate chromosomes, the mutations segregate independently in a digenic inheritance pattern. Affected individuals who are double heterozygotes can pass both mutations on to their offspring. The resulting transmission mimics a dominant pattern except that 25% of offspring will be affected instead of 50%.
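The 25% figure for digenic transmission can be checked by enumerating gametes for two independently segregating loci. The sketch below is a simplified illustration: the genotype encoding and the function names are hypothetical, and an offspring is counted as affected whenever it carries at least one mutant allele at both loci, which is an assumption standing in for the double-heterozygous state described above.

```python
from fractions import Fraction
from itertools import product

# Alleles: "M" = mutant, "+" = wild type. A genotype is (RDS allele pair, ROM1 allele pair).


def gametes(genotype):
    """All equally likely gametes when the two loci segregate independently."""
    rds, rom1 = genotype
    return list(product(rds, rom1))


def affected_fraction(parent_a, parent_b):
    """Fraction of offspring with at least one mutant allele at BOTH loci."""
    outcomes = list(product(gametes(parent_a), gametes(parent_b)))
    affected = sum(
        1
        for (rds_a, rom_a), (rds_b, rom_b) in outcomes
        if "M" in (rds_a, rds_b) and "M" in (rom_a, rom_b)
    )
    return Fraction(affected, len(outcomes))


double_het = ("M+", "M+")     # affected parent
noncarrier = ("++", "++")     # unaffected partner with no mutation
rds_carrier = ("M+", "++")    # unaffected carrier of the RDS mutation only
rom1_carrier = ("++", "M+")   # unaffected carrier of the ROM1 mutation only

print(affected_fraction(double_het, noncarrier))     # 1/4
print(affected_fraction(rds_carrier, rom1_carrier))  # 1/4
```

The same one-in-four risk arises both when an affected double heterozygote has children with a noncarrier and when two unaffected single carriers, one for each gene, have children together.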
Another distinctive feature in families is that the disease first appears in the offspring of a mating between unaffected individuals, each parent being a carrier for one of the mutations. The first two generations simulate autosomal recessive inheritance where the parents are unaffected and have a 25% risk of producing affected offspring. Nonallelic noncomplementation occurs in digenic RP because the mutant RDS-L185P protein is unable to form homotetramers and requires ROM1 subunits in order to form a functional complex. As long as wild-type ROM1 subunits are available in an RDS-L185P heterozygote, enough functional complexes will be produced to


support photoreceptors. However, if only one functional ROM1 allele is present, there is a deficit of functional complexes and photoreceptor degeneration occurs. Digenic inheritance has been found in other instances where subjects with mutations in two different genes develop a phenotype not associated with either gene mutation alone. An example is the occurrence of spina bifida in mice where homozygosity for a recessive undulated mutation in one gene (which by itself causes kinky tail) is required together with heterozygosity for a dominant Patch mutation in a second gene (which causes patchy coat) for phenotypic expression of disease (36).

A New Look at Old Definitions

Refinement in techniques for evaluating individual genes has led to a closer look at how mutations are examined quantitatively and qualitatively. For example, the presence of coexisting mutations within a single gene locus that demonstrate intra- as opposed to interallelic heterozygosity is being observed. Theoretically, heterozygous intraallelic alterations should be less detrimental to the total gene product than interallelic alterations because only one allele is affected in the former and both are affected in the latter (Fig. 4I, b and c). Furthermore, one would expect that one mutation would dominate over the other with coexisting mutations in cis. Usually one member of the paired alterations is a seemingly harmless polymorphism when found alone in a gene. Some of the variable phenotypes of short-chain acyl-CoA dehydrogenase (SCAD) deficiency provide examples of intraallelic heterozygosity which contribute to the variable disease phenotype. Several pathogenic mutations exist in the SCAD gene and two variants, 625G>A and 511C>T, have been found in either the homozygous or compound heterozygous state. The variants were found to be overrepresented among a clinically heterogeneous group of 133 patients with ethylmalonic aciduria.
This led to the hypothesis that the variants may be susceptibility alleles (37,38). It was further hypothesized that two categories of patients with ethylmalonic aciduria exist with respect to SCAD gene mutations: a small group with pathogenic mutations in both alleles and a larger group with a functional SCAD deficiency who carry combinations of the 625A and 511T variants (39). To determine the validity of this hypothesis, 10 patients with ethylmalonic aciduria and SCAD deficiency in cultured skin fibroblasts were evaluated for mutations in the SCAD gene. Six novel and one known pathogenic mutations were found as well as various combinations of the 625A and 511T variants; every patient had at least one and usually two of the variants, and 82% of the variant alleles were of the 625A type. Some patients had one mutation and one variant in the same (schematically in Fig. 4I, b) or different (schematically in Fig. 4I, c) alleles. One patient actually had a different mutation in each allele (compound heterozygote) and was homozygous for the 625A variant resulting in a total of four alterations between the two alleles. This patient also appeared to be one of the more severely affected infants since this individual was one of only two who had seizures together with hypotonia and developmental delay with onset in the newborn period. While it would appear that the more severe mutations would eliminate the effect of the coexisting variant, as suggested by expression studies, the investigators state that additional studies of similar alleles in SCAD-deficient patients are necessary before drawing such a conclusion.

Heteroplasmy of the mitochondrial genome (mtDNA), i.e., transmission of mutant and wild-type genomes together in the same or different mitochondria, could be considered the ultimate state of heterozygosity. Heteroplasmy adds a quantitative dimension to the original definition of heterozygosity that is not possible with nuclear-encoded gene mutations; i.e., it provides for multiple copy numbers of single mutant genes within and among individual mitochondria in cells and, hence, within and among different tissues of the body. The inherent flexibility of the heteroplasmic state, in terms of dosage of aberrant genomes, allows for a continuum of phenotypes that can be evaluated correlating dosage with the quantity and quality of altered gene product.
Heteroplasmy for mtDNA mutations could also act synergistically with other mtDNA mutations or with nDNA mutations (heterozygous or homozygous) to increase the clinical severity of a disease phenotype.

Leber hereditary optic neuropathy (LHON) was the first human disease found to be associated with a mtDNA point mutation (40). LHON is characterized by painless loss of central visual acuity with onset between 12 and 30 years of age. About 80–90% of affected males and only 8–32% of females have visual loss. At least 18 different causative missense mutations have been described in genes encoding polypeptides of the four protein complexes (41). The three most common mutations are 11778G>A in ND4, 3460G>A in ND1, and 14484T>C in ND6, all subunits of complex I (42). The most common mutation, 11778G>A, is usually homoplasmic (found in all mtDNAs in a given individual). In certain mtDNA lineages, various combinations of mildly deleterious mutations accumulate over time and are directly proportional to the development of blindness; e.g., the mutation at 13708G>A in the ND5 gene, causing a substitution of threonine for alanine at amino acid 458 (A458T), is found in 5% of the general population and is not associated with visual loss when present alone. However, it is present in 26% of LHON patients with the 11778G>A mutation and in 19% of patients with other mutations causative for LHON. These variants are thought to be "enhancing mutations" or "premutations" that act synergistically with the more pathogenic mutations (43, for review). Hence, the concept of susceptibility variants or enhancing mutations coexisting with pathogenic mutations, often in the heterozygous state, is apparent in both genomes.

There also can be an enhancement of pathogenicity of one mutation by another via the concept of biallelic inactivation. This phenomenon explains tumor-suppressor gene (e.g., the p53 gene) inactivation. Both alleles of a tumor suppressor gene must be inactivated in order to abolish the function of a protein involved in growth inhibition. One of the most common mechanisms is the occurrence of an intragenic mutation in one allele, coupled with the loss of the other allele, also known as loss of heterozygosity (LOH; Fig. 4III, a and b). The question arises as to what comes first, the intragenic mutation or LOH? Theories have been presented in support of both mechanisms. One of the more intriguing theories, with relevance to the present discussion of intra- and interallelic heterozygosity, is the possibility that two genes on the same chromosome arm are coinactivated as a requirement for the abolition of tumor suppression. A relationship may exist between the timing of intragenic mutation in these genes and LOH leading to functional advantages of the intergenic heterozygous state in tumors. Theoretically, when an intragenic mutation is the first hit, the altered protein product of this allele may interfere with the function of the normal protein produced by the remaining intact allele. This would provide a growth advantage to those cells with a mutated allele that would not be evident if LOH occurred as the first hit (44).
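As a rough illustration of the enrichment described above for the 13708G>A variant, the quoted frequencies (5% in the general population, 26% among LHON patients with 11778G>A, and 19% among patients with other causative mutations) can be converted into a fold enrichment and an odds ratio. The sketch below is an arithmetic aid only. The choice of the odds ratio as a summary statistic and all function names are assumptions introduced here, not part of the cited studies, and no confidence intervals can be derived because the underlying patient counts are not given in the text.

```python
# Frequencies of the 13708G>A variant as quoted in the text (proportions).
# Only these three percentages come from the review; the function names and
# the use of an odds ratio are illustrative assumptions.
FREQ_GENERAL = 0.05        # general population
FREQ_LHON_11778 = 0.26     # LHON patients carrying 11778G>A
FREQ_LHON_OTHER = 0.19     # LHON patients with other causative mutations


def fold_enrichment(freq_cases: float, freq_reference: float) -> float:
    """Ratio of the variant frequency in cases to the reference frequency."""
    return freq_cases / freq_reference


def odds_ratio(freq_cases: float, freq_reference: float) -> float:
    """Odds of carrying the variant in cases divided by the odds in the reference."""
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_reference = freq_reference / (1.0 - freq_reference)
    return odds_cases / odds_reference


for label, freq in (("11778G>A", FREQ_LHON_11778), ("other", FREQ_LHON_OTHER)):
    print(
        f"{label}: fold enrichment {fold_enrichment(freq, FREQ_GENERAL):.1f}, "
        f"odds ratio {odds_ratio(freq, FREQ_GENERAL):.1f}"
    )
```

On these figures the variant is roughly five times as frequent among LHON patients with 11778G>A as in the general population, and roughly four times as frequent among patients with other causative mutations. This agrees with its description as an enhancing mutation, although enrichment of this kind does not by itself establish synergy.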


Patient Studies

The occurrence of intra- and interallelic heterozygosity for disease-causing or disease-enhancing/susceptibility mutations is proving not to be uncommon. In fact, these events may occur more frequently as a cause of clinically relevant disease than homozygosity or compound heterozygosity for rare pathogenic mutations. We have termed the compounding of these partial defects synergistic heterozygosity (45). Increasing numbers of unexpected phenotypes are being observed among individuals with single-gene disorders. The coexistence of multiple single-gene mutations in an individual may result in complex phenotypes with the appearance of new symptoms that are atypical of the individual underlying disorders or with earlier onset of severe symptoms (46).

To determine if coexisting single-gene disorders commonly exist, we performed mutation analysis in 50 individuals diagnosed biochemically with one of three exercise intolerance disorders with overlapping clinical characteristics. Manifesting carriers have been reported for all three disorders; therefore, we hypothesized that combinations of mutations causing these disorders may produce more severe symptoms than found with any one disorder alone. Genomic DNA was studied from 26 patients with adult-onset carnitine palmitoyltransferase (CPT; EC 2.3.1.21) II deficiency (MIM 255110), 9 patients with myophosphorylase (EC 2.4.1.1) deficiency (McArdle disease; MIM 232600), and 11 patients with myoadenylate deaminase (EC 3.5.4.6) deficiency (MIM 102770). A total of eight mutations (CPT2 gene, S113L, 413fs, P50H, R503C; PYGM gene, R49X, G204S; and AMPD1 gene, Q12X, P48L) were evaluated in patient DNA by allele-specific oligonucleotide analysis using established methods (27). The mutations evaluated collectively account for 80–95% of the mutant alleles causative for these disorders.
Corresponding disease mutations were detected in 73% of CPT II-deficient patients, 76% of McArdle disease patients, and 94% of myoadenylate deaminase-deficient patients. A total of 11% of these patients had molecular evidence for a coexisting disorder. In four of five cases, heterozygosity for myoadenylate deaminase deficiency was the coexisting disorder as may be expected since homozygosity for the common Q12X mutation has been estimated to be approximately 2% in the general population, with a carrier frequency of ~20% (47). Of particular interest was the finding of one of nine with biochemically diagnosed McArdle disease (0.04% of normal phosphorylase activity in skeletal muscle) who was heterozygous for the common S113L mutation causative for CPT II deficiency and heterozygous for the R49X mutation causing McArdle disease (a second mutation in the PYGM gene has not yet been found; Table 1). The young woman was diagnosed with McArdle disease at the age of 25. CPT enzyme activity was not measured at that time because the clinical and histopathologic suspicion was for a glycogen storage disease. Seven years after diagnosis, the patient reported to clinic with severe disability, unable to write or walk without assistance. She has since become lost to follow-up; therefore, it is not possible to determine if other factors have contributed to the progressive nature of her clinical findings in addition to the three mutations responsible for her coexisting myopathies.

TABLE 1
Progression toward Phenotype and Genotype Complexity

Patient | Age at diagnosis/gender | Clinical symptoms | Biochemical findings in skeletal muscle | Molecular findings | Diagnostic category
1 | 40/M | Severe muscle pain and cramps with rigorous exercise during armed services training | CPT II 40% of normal | CPT2 gene: S113L heterozygote | (1) 1 Disorder; (2) 1 Mutation
2 | 17/M | Severe muscle pain and cramps with vomiting during football practice; myoglobinuria | CPT II activity 46% of normal; NADH dehydrogenase: 26% of normal; AMPD activity: ND | CPT2 gene: no mutations detected; AMPD1 gene: Q12X heterozygote; Complex I: ND | (1) 3 Disorders; (2) 1 Mutation; (3) Presumed 2 additional mutations in CPT2 and NADH dehydrogenase
3 | 68/F | Acute onset muscle pain, weakness, myoglobinuria; serum CK 50,000 U/L; apparent trigger: cholesterol-lowering drugs | Myoglobinuria screen (a): normal; Myoadenylate deaminase: absent activity | AMPD1 gene: Q12X homozygote | (1) 1 Disorder; (2) 2 Mutations
4 | 25/F | Muscle pain and cramps with exertion; progression to inability to walk and write | Myophosphorylase: 0.04% of normal; CPT II activity: ND | PYGM gene: R49X heterozygote; CPT2 gene: S113L heterozygote | (1) 2 Disorders; (2) 2 Mutations; (3) Presumed third mutation in PYGM
5 (b) | Birth/F | Severe congenital hypotonia; absence of spontaneous movement; mild dysmorphic features; ventilator dependent; death at 43 days | Absence of myophosphorylase and brancher enzyme activities; polyglucosan bodies throughout tissues (autopsy finding) | ND | (1) 2 Disorders; (2) Presumed 4 mutations: 2 each in PYGM and brancher genes

Note. M, male; F, female; CPT, carnitine palmitoyltransferase; ND, not determined; S113L, a mutation in the CPT2 gene causative for CPT II deficiency and accounting for ~60% of mutant alleles; Q12X, a mutation in the AMPD1 gene causative for myoadenylate deaminase deficiency and accounting for ~95% of mutant alleles; R49X, a mutation in the PYGM gene causative for myophosphorylase deficiency (McArdle's disease) and accounting for ~70% of mutant alleles in Western Europeans.
(a) The myoglobinuria screen comprises the enzymatic analysis of myophosphorylase (PPL), phosphofructokinase, phosphorylase b kinase, phosphoglycerate kinase, phosphoglycerate mutase, and carnitine palmitoyltransferase.
(b) See reference 48.

FIG. 5. Building a complex phenotype. The complexity of genotypes and consequent phenotypes through the accumulation of coexisting gene defects and interaction with environmental factors.

We have since found a number of patients with biochemical and/or molecular evidence for coexisting metabolic myopathies. We propose that a progressive increase in genetic load leads to complex phenotypes among single-gene disorders and that environmental factors are particularly contributory to the disease phenotype in cases with a lower genetic load (Fig. 5). Additional examples of individuals in each disease category with increasingly complex genotypes are presented in Table 1. Coexisting disorders are rarely sought in patients once a primary defect is found as in the case of the young woman with McArdle disease and CPT II deficiency. It is important for physicians to be aware that the potential for combinations of disorders exists and that heterozygosity for two or more disorders can act synergistically to augment the severity of disease symptoms. As pointed out in the discussion of Fig. 3, different metabolic pathways are interdependent, often interacting toward a common goal. Partial defects within and among these pathways can additively impair attainment of the common goal. Therefore, biochemical and/or molecular testing should be considered for multiple candidate disorders in cases of complex phenotypes.
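The population figures quoted above for the AMPD1 Q12X mutation (approximately 2% homozygotes and a carrier frequency of ~20%; 47) can be checked for rough internal consistency. The sketch below assumes Hardy-Weinberg proportions, an assumption that the text does not state, and the function names are illustrative only.

```python
import math

# Figures quoted in the text for the AMPD1 Q12X mutation (reference 47).
HOMOZYGOTE_FREQ = 0.02   # approximately 2% of the general population
CARRIER_FREQ = 0.20      # carrier frequency of about 20%


def allele_freq_from_homozygotes(homozygote_freq: float) -> float:
    """Under Hardy-Weinberg, homozygotes occur at q**2, so q = sqrt(freq)."""
    return math.sqrt(homozygote_freq)


def allele_freq_from_carriers(carrier_freq: float) -> float:
    """Solve 2*q*(1 - q) = carrier_freq for the smaller root q."""
    discriminant = 1.0 - 2.0 * carrier_freq
    if discriminant < 0.0:
        raise ValueError("carrier frequency above 0.5 is impossible under Hardy-Weinberg")
    return (1.0 - math.sqrt(discriminant)) / 2.0


def expected_carriers(q: float) -> float:
    """Expected heterozygote (carrier) frequency, 2*q*(1 - q)."""
    return 2.0 * q * (1.0 - q)


def expected_homozygotes(q: float) -> float:
    """Expected homozygote frequency, q**2."""
    return q * q


q_hom = allele_freq_from_homozygotes(HOMOZYGOTE_FREQ)
q_car = allele_freq_from_carriers(CARRIER_FREQ)
print(f"q from homozygotes: {q_hom:.3f} -> expected carriers {expected_carriers(q_hom):.1%}")
print(f"q from carriers:    {q_car:.3f} -> expected homozygotes {expected_homozygotes(q_car):.1%}")
```

Under this assumption the two quoted figures imply allele frequencies of about 0.14 and 0.11, respectively, so they agree to within the rounding implied by "approximately". Either estimate makes a coincidental Q12X carrier state in a patient with another metabolic myopathy unsurprising.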

ACKNOWLEDGMENTS

I am grateful to Drs. Charles Scriver and Edward McCabe for introducing me to parallels in nonbiologic networks and to Dr. Adrian Vladutiu for reading the manuscript and providing helpful suggestions. I also gratefully acknowledge the creative artwork of Ms. Barbara Evans. This work was supported in part by the Muscular Dystrophy Association and The Children's Guild of Buffalo.

REFERENCES

1. Consortium IHGS. Initial sequencing and analysis of the human genome. Nature 409:860–921, 2001.
2. Bailey D, Zanders E, Dean P. The end of the beginning for genomic medicine. Nature Biotechnol 19:207–209, 2001.
3. Gross M. Putting cells together again. Curr Biol 11:R452–R453, 2001.
4. Bork P, Dandekar T, Diaz-Lazcoz Y, Eisenhaber F, Huynen M, Yuan Y. Predicting function: From genes to genomes and back. J Molec Biol 283:707–725, 1998.
5. Garrod AE. The Croonian lectures on inborn errors of metabolism. Lancet 2:1–7, 73–79, 142–148, 200–214, 1908.
6. Dipple KM, McCabe ERB. Phenotypes of patients with "simple" Mendelian disorders are complex traits: Thresholds, modifiers, and systems dynamics. Am J Hum Genet 66:1729–1735, 2000.
7. Wilkins MR, Sanchez JC, Gooley AA, Appel RD, Humphery-Smith I, Hochstrasser DF, Williams KL. Progress with proteome projects: Why all proteins expressed by a genome should be identified and how to do it. Biotechnol Genet Eng Rev 13:19–50, 1996.
8. Crowther RA, Harrison SC. Macromolecular assemblages. Curr Opin Struct Biol 11:141–143, 2001.
9. Bailey JE. Complex biology with no parameters. Nature Biotechnol 19:503–504, 2001.
10. Broder S, Venter C. Whole genomes: The foundation of new biology and medicine. Curr Opin Biotechnol 11:581–585, 2000.
11. Hochstrasser DF. Proteomes in perspective. Clin Chem Lab Med 36:825–836, 1998.
12. Strohman R. Epigenesis: The missing beat in biotechnology? Biotechnology 12:156–164, 1994.
13. Barkai N, Leibler S. Robustness in simple biochemical networks. Nature 387:913–917, 1997.
14. Dipple KM, Phelan JK, McCabe ERB. Consequences of complexity within biological networks: Robustness and health, or vulnerability and disease. Molec Genet Metab 74:45–50, 2001.
15. Jeong H, Tombor B, Albert R, Oltvai ZN, Barabasi AL. The large-scale organization of metabolic networks. Nature 407:651–654, 2000.
16. Albert R, Jeong H, Barabasi A-L. Error and attack tolerance of complex networks. Nature 406:378–381, 2000.
17. Hartwell LH, Hopfield JJ, Leibler S, Murray AW. From molecular to modular cell biology. Nature 402:C47–C52, 1999.
18. Graveley BR. Alternative splicing: Increasing diversity in the proteomic world. Trends Genet 17:100–107, 2001.
19. Mironov AA, Fickett JW, Gelfand MS. Frequent alternative splicing of human genes. Genome Res 9:1288–1293, 1999.
20. Hanke J, Brett D, Zastrow I, Aydin A, Delbruck S, Lehman G, Luft F, Reich J, Bork P. Alternative splicing of human genes: More the rule than the exception? Trends Genet 15:389–390, 1999.
21. Caillat-Zucman S, Garchon HJ, Timsit J, Assan R, Boitard C, Djilali-Saiah I, Bougneres P, Bach JF. Age-dependent HLA genetic heterogeneity of type 1 insulin-dependent diabetes mellitus. J Clin Invest 90:2242–2250, 1992.
22. Gilbert W. Why genes in pieces? Nature 271:501, 1978.
23. Kolkman JA, Stemmer WPC. Directed evolution of proteins by exon shuffling. Nature Biotechnol 19:423–428, 2001.
24. Pradet-Balade B, Boulme F, Beug H, Mullner EW, Garcia-Sanz JA. Translation control: Bridging the gap between genomics and proteomics? Trends Biochem Sci 26:225–229, 2001.
25. Papadimitriou A, Manta P, Divari R, Karabetsos A, Papadimitriou E, Bresolin N. McArdle's disease: Two clinical expressions in the same pedigree. J Neurol 237:267–270, 1990.
26. Schmidt B, Servidei MD, Gabbai MD, Silva MD, de Sousa Bulle de Oliveira MD, DiMauro S. McArdle's disease in two generations: Autosomal recessive transmission with manifesting heterozygote. Neurology 37:1558–1561, 1987.
27. Taggart RT, Smail D, Apolito C, Vladutiu GD. Novel mutations associated with carnitine palmitoyltransferase II deficiency. Hum Mutat 13:210–220, 1999.
28. Handig I, Dams E, Taroni F, Van Laere S, de Barsy T, Willems PJ. Inheritance of the S113L mutation within an inbred family with carnitine palmitoyltransferase enzyme deficiency. Hum Genet 97:291–293, 1996.
29. Vladutiu GD, Bennett MJ, Smail D, Wong L-J, Taggart RT, Lindsley HB. A variable myopathy associated with heterozygosity for the R503C mutation in the carnitine palmitoyltransferase II gene. Molec Genet Metab 70:134–141, 2000.
30. Ibdah JA, Yang Z, Bennett MJ. Liver disease in pregnancy and fetal fatty acid oxidation defects. Molec Genet Metab 71:182–189, 2000.
31. Scriver C, Waters P. Monogenic traits are not simple: Lessons from phenylketonuria. Trends Genet 15:267–272, 1999.
32. Witchel SF, Aston CE. The role of heterozygosity for CYP21 in the polycystic ovary syndrome. J Pediatr Endocrinol Metab 13:1315–1317, 2000.
33. Richard P, Isnard R, Carrier L, Dubourg O, Donatien Y, Mathieu B, Bonne G, Gary F, Charron P, Hagege A, Komajda M, Schwartz K, Hainque B. Double heterozygosity for mutations in the beta-myosin heavy chain and in the cardiac myosin binding protein C genes in a family with hypertrophic cardiomyopathy. J Med Genet 36:542–545, 1999.
34. Clarke GM, Higgins TN. Laboratory investigation of hemoglobinopathies and thalassemias: Review and update. Clin Chem 46:1284–1290, 2000.
35. Dryja T. Retinitis pigmentosa and stationary night blindness. In The Metabolic and Molecular Bases of Inherited Disease (Scriver CR, Sly WS, Childs B, Beaudet A, Eds.). New York: McGraw-Hill, pp 5920–5921, 2000.
36. Helwig U, Imai K, Schmahl W, Thomas BE, Varnum DS, Nadeau JH, Balling R. Interaction between undulated and patch leads to an extreme form of spina bifida in double-mutant mice. Nature Genet 11:60–63, 1995.
37. Gregersen N, Winter VS, Corydon MJ, Corydon TJ. Identification of four new mutations in the short-chain acyl-CoA dehydrogenase (SCAD) gene in two patients: One of the variant alleles, 511C→T, is present at an unexpectedly high frequency in the general population, as was the case for 625G→A, together conferring susceptibility to ethylmalonic aciduria. Hum Molec Genet 7:619–627, 1998.
38. Corydon M, Gregersen N, Lehnert W, Ribes A. Ethylmalonic aciduria is associated with an amino acid variant of short chain acyl-coenzyme A dehydrogenase. Pediatr Res 39:1059–1066, 1996.
39. Corydon MJ, Vockley J, Rinaldo P, Rhead WJ, Kjeldsen M, Winter V, Riggs C, Babovic-Vuksanovic D, Smeitink J, De Jong J, Levy H, Sewell AC, Roe C, Matern D, Dasouki M, Gregersen N. Role of common gene variations in the molecular pathogenesis of short-chain acyl-CoA dehydrogenase deficiency. Pediatr Res 49:18–23, 2001.
40. Wallace DC, Singh G, Lott MT, Hodge JA, Schurr TG, Lezza AMS, Elsas LJI, Nikoskelainen EK. Mitochondrial DNA mutation associated with Leber's hereditary optic neuropathy. Science 242:1427–1430, 1988.
41. Fischel-Ghodsian N. Homoplasmic mitochondrial DNA diseases as the paradigm to understand the tissue specificity and variable clinical severity of mitochondrial disorders. Molec Genet Metab 71:93–99, 2000.
42. Database MHMG. Center for Molecular Medicine, Emory University, Atlanta, GA. World wide web (http://www.gen.emory.edu/mitomap.html) 2000.
43. Wallace DC, Shoffner JM. Oxidative phosphorylation in disease. In The Metabolic and Molecular Bases of Inherited Disease (Scriver CR, Beaudet AL, Sly WS, Valle D, Eds.). New York: McGraw-Hill, pp 1556–1560, 1995.
44. Wilentz RE, Argani P, Hruban RH. Loss of heterozygosity or intragenic mutation, which comes first? Am J Pathol 158:1561–1563, 2001.
45. Vockley J, Rinaldo P, Bennett MJ, Matern D, Vladutiu GD. Synergistic heterozygosity: Disease resulting from multiple partial defects in one or more metabolic pathways. Molec Genet Metab 71:10–18, 2000.
46. Vladutiu G. Complex phenotypes in metabolic muscle diseases. Muscle Nerve 23:1157–1159, 2000.
47. Morisaki T, Gross M, Morisaki H. Molecular basis of AMP deaminase deficiency in skeletal muscle. Proc Natl Acad Sci USA 89:6457–6461, 1992.
48. Herrick MK, Twiss JL, Vladutiu GD, Glasscock GF, Horoupian DS. Concomitant branching enzyme and phosphorylase deficiencies. An unusual glycogenosis with extensive neuronal polyglucosan storage. J Neuropathol Exp Neurol 53:239–246, 1994.