Mutational Spectrometry: Means and Ends
K. KHRAPKO, P. ANDRÉ, R. CHA, G. HU, AND W. G. THILLY¹
Center for Environmental Health Sciences
Massachusetts Institute of Technology
Cambridge, Massachusetts 02139

I. Goals and Problems
II. Allele-specific PCR (ASP)
III. High-efficiency Restriction Assay (HERA)
IV. Methods Using Differential DNA Melting to Separate Mutants
References
I. Goals and Problems

A. Definition

A mutational spectrum is the distribution of mutations within a defined DNA sequence with regard to position and kind. Seymour Benzer and the late Ernst Freese showed that different mutagens give different spectra of point-mutations in the rII region of bacteriophage T4 (1). Most researchers who have used mutational spectra have been interested in the molecular mechanisms of spontaneous and chemically induced mutation. A few have been interested in finding the primary causes of mutation in the various organs of humans.

Getting the point-mutational spectrum from a population of phage, bacteria, or human cells in culture is straightforward. One isolates independent mutant colonies and sequences each until one has enough information for the intended use. This "clone-by-clone" method was what Benzer and Freese used, and it has been good enough for many mechanistic studies in the field. Much useful information has been obtained in this way.

Analysis of the literature from 1958 to 1979 suggested to us, however, that clone-by-clone spectra simply would not be good enough for the next level of mutational mechanism studies or for finding the causes of human germinal and somatic mutation. Our work and this essay thus focus on our attempts to measure mutational spectra in whole human cell populations and in human tissue samples.

¹ To whom correspondence may be addressed.
Progress in Nucleic Acid Research and Molecular Biology, Vol. 49
Copyright © 1994 by Academic Press, Inc. All rights of reproduction in any form reserved.
B. Appropriate Data Sets

It is worthwhile to consider how the intended use of mutational spectra influences the criteria applied to define an appropriate data set. If we want to know whether a mutagen causes G-to-A transitions (as many alkylating agents do), we could just isolate and sequence a few dozen colonies and reach a supportable conclusion. If, however, we want to know whether two alkylating agents cause significantly different spectra, considerably more work must be done. For example, Coulondre and Miller compared the mutations found in the lacI gene of E. coli after methylnitronitrosoguanidine (MNNG) treatment to those seen after ethyl methanesulfonate (EMS) treatment (2). More than 900 mutants were scored for the former, and more than 600 for the latter mutagen. Pairwise comparison showed that both mutagens caused G-to-A transitions at the same set of base-pairs. Statistical analysis (χ²) showed the spectra of the two chemicals to be different at the 99% confidence limit. However, if the data sets were simply divided in half, the two spectra were not significantly different, even at the 95% confidence limit. That is a lot of work (>750 clones) to conclude that two spectra are not significantly different. The result is especially troubling in this case, in which the spectra are, in fact, different. But one could not be sure of this with fewer than the 1500 colonies studied.
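The effect of halving a data set can be illustrated with a short calculation. The sketch below computes the Pearson χ² statistic for two spectra tabulated as mutant counts per site. The counts are hypothetical and were chosen only so that the totals (900 and 600) resemble the study sizes quoted above; they are not the Coulondre and Miller data. Because the statistic scales linearly with sample size when the proportions are fixed, halving every count halves χ².

```python
def chi_square_statistic(counts_a, counts_b):
    """Pearson chi-square statistic for a 2 x k table of mutant counts per site."""
    total_a, total_b = sum(counts_a), sum(counts_b)
    grand_total = total_a + total_b
    statistic = 0.0
    for a, b in zip(counts_a, counts_b):
        site_total = a + b
        if site_total == 0:
            continue  # a site with no mutants in either spectrum adds nothing
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * site_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts at four sites (illustration only, not published data).
mutagen_a = [412, 234, 160, 94]   # 900 mutants in total
mutagen_b = [228, 186, 100, 86]   # 600 mutants in total

# Standard chi-square critical values for k - 1 = 3 degrees of freedom.
CRITICAL_95 = 7.815
CRITICAL_99 = 11.345

full = chi_square_statistic(mutagen_a, mutagen_b)
half = chi_square_statistic([c // 2 for c in mutagen_a],
                            [c // 2 for c in mutagen_b])

print(f"full data set: chi2 = {full:.2f}, significant at 99%: {full > CRITICAL_99}")
print(f"half data set: chi2 = {half:.2f}, significant at 95%: {half > CRITICAL_95}")
```

With these illustrative counts the full data set exceeds the 99% critical value while the halved set falls below the 95% value, which mirrors the situation described in the text.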
C. Bulk Approach to Mutant Analysis: Selectable Genes
We were not thrilled by the idea of sequencing hundreds of human cell mutant colonies to obtain results, so we set out to find a less arduous way. We focused our attention on a selectable gene and combined mass selection of mutants with separation of mutant sequences by denaturing gradient gel electrophoresis (DGGE; see Section IV). We enumerated the mutants by their band intensities on the gel, isolated and sequenced the mutant bands, and published the spectra (3). In such experiments, the reproducibility among spectra obtained from independent human cell cultures is excellent because the number of mutants induced by mutagen treatment is large enough. For instance, in a typical experiment using exon 3 of the hprt gene as the DNA sequence of interest, we make sure that more than 10,000 hprt⁻ mutants survive treatment in each replicate culture. In this way, any particular mutant that represents 1% or more of the hprt mutants will occur at least 100 times among the surviving cells. Among independent experiments, the 95% confidence limits on the expectation of 100 will be 80 and 120.

Counting and isolating mutants by DGGE have greatly simplified the job. One intermediate goal was reached: we can now obtain efficiently the kind of spectra previously obtained by clone-by-clone analysis in simple cell systems. For example, the method was applicable in human cell culture even at the level of sensitivity required for studying spontaneous mutation (4). The method is general for any selectable gene in viruses, bacteria, yeast, or mammalian cells. However, we have not yet reached our ultimate technical goal of measuring spectra in human tissues.
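The limits of 80 and 120 quoted above follow from treating the number of copies of a given mutant as a Poisson variable. A minimal sketch, assuming the usual normal approximation (mean ± 1.96 standard deviations, with the standard deviation equal to the square root of the mean); the function name is ours:

```python
import math


def poisson_95_limits(expected_count):
    """Approximate 95% limits for a Poisson count (normal approximation)."""
    half_width = 1.96 * math.sqrt(expected_count)
    return expected_count - half_width, expected_count + half_width


low, high = poisson_95_limits(100)
print(f"95% limits on an expectation of 100: {low:.1f} to {high:.1f}")
```

The relative width of the interval shrinks as the square root of the count grows, which is why about 100 copies of each hot-spot mutant are needed to hold the uncertainty near ±20%.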
D. Requirements for Measuring Mutational Spectra in Human Tissue
Point-mutant fractions for genes such as hprt in T cells in middle-aged humans were reported to be about 10⁻⁵ (5, 6). Since mutational spectra that include "hot spots" consisting of 1% or more of all mutants in a gene are very useful for mechanistic or causation studies, it is reasonable to assume that a typical study of mutational spectra in humans could require a means to measure mutations at a frequency of 10⁻⁷ or higher for single-copy nuclear genes. Such a frequency requires that 10⁹ cells be used to produce a spectrum, which ensures that each "1% hot spot" is represented by a statistically significant number of copies (100).

It is worth noting here that 10⁹ mammalian cells contain about 3000 µg of DNA. This is an enormous amount, which is very difficult to process. For example, DNA concentration in a polymerase chain reaction (PCR) should not exceed 50 µg/ml (7; R. Cha, unpublished), or only 5 µg per standard reaction. This is a common challenge for human mutational spectrometry regardless of which technique is utilized. One approach would be to restriction-digest the DNA and to isolate the size-fraction containing the desired sequence. This would reduce the amount of DNA to 1% of the starting material.

The use of multi-copy genes may simplify the task. For ribosomal RNA genes at 400 copies per cell, 2.5 × 10⁶ cells would suffice to produce a reproducible spectrum. Mitochondrial genes exist at 4000 copies per cell (8). Moreover, mitochondrial mutation rates appear to be some 20 times higher than those for nuclear genes (9). These two facts together mean that a means to measure mitochondrial mutations at a frequency of 2 × 10⁻⁶ in a sample of about 10⁴ cells would be suitable for human tissue mitochondrial mutational spectrometry.
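The cell numbers quoted in this section all come from the same arithmetic: the number of target copies needed is the desired number of hot-spot copies divided by the hot-spot mutant fraction, and the number of cells is that figure divided by the gene copies per cell. A minimal sketch (function and parameter names are ours; the numerical inputs are those given in the text):

```python
def cells_required(hotspot_fraction, copies_per_cell=1, copies_per_hotspot=100):
    """Cells needed so that a hot spot at the given mutant fraction
    is represented by the requested number of mutant copies."""
    target_copies = copies_per_hotspot / hotspot_fraction
    return target_copies / copies_per_cell


nuclear = cells_required(1e-7)                                # single-copy nuclear gene
ribosomal = cells_required(1e-7, copies_per_cell=400)         # rRNA genes
mitochondrial = cells_required(2e-6, copies_per_cell=4000)    # 20-fold higher fraction

print(f"nuclear: {nuclear:.3g} cells")
print(f"ribosomal: {ribosomal:.3g} cells")
print(f"mitochondrial: {mitochondrial:.3g} cells")
```

The mitochondrial case comes out slightly above 10⁴ cells, consistent with the "about 10⁴" stated above.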
E. Unselected Approaches

To obtain useful mutational spectra for human tissues, one thus must deal with rather low mutant fractions. Unfortunately, phenotypic selection cannot be used to enrich for mutants in most tissues, as opposed to bacteria or cell culture. Another drawback of phenotypic selection is that the approach is limited to selectable genes. This limitation matters not only because mutations at some DNA loci of particular interest cannot be selected; even more important, selection generally rules out the use of multi-copy genes. Phenotypic selection must therefore be replaced by other processes that remove wild-type sequences.

Our early attempts in the field were based on DGGE separations with radioactive label detection. It was shown that a simple combination of PCR with high-fidelity DNA polymerase and DGGE enabled one to detect a mutant at a fraction of about 10⁻³ (4). A more advanced approach included a DGGE separation of mutants from a mixture of restriction fragments, followed by PCR of eluted DNA and another DGGE of the resulting mixture of PCR fragments. This approach enabled us to detect mutants down to less than 10⁻⁶. However, it suffered from a reproducible but still unexplained non-linearity of response at fractions below 10⁻⁴ (10, 11).

In the sections below, we discuss three separate approaches we have taken; they appear promising either alone or in combination.

Allele-specific PCR (ASP), especially our favorite variant, the mismatch amplification mutation assay (MAMA) described in Section II, allows us to measure a specific point-mutation by constructing a PCR primer and using conditions that support the amplification of the mutant but not of the wild-type sequence (12).

With our high-efficiency restriction-enzyme digestion assay, or HERA, we select for mutants in "six-cutter" restriction sites (G. Hu, unpublished). The restriction enzyme chosen cuts the wild-type recognition sequence but none of its possible mutants. The uncut mutant sequences are amplified by PCR and studied further (Section III).
With constant denaturing capillary electrophoresis (CDCE), which was derived from Fischer and Lerman's DGGE, we make use of the differences in melting temperatures of DNA molecules caused by a single base change (13). These melting differences are translated into lower electrophoretic mobility of mutant/wild-type heteroduplexes as compared to wild-type/wild-type homoduplexes, which enables efficient separation by electrophoresis.

Table I summarizes the scope and the efficiency of the aforementioned approaches. The methods are considered as single steps of mutant purification, which enables us to compare them in terms of their ability to enrich mutants in a mutant/wild-type mixture. Note that for each base-pair there exist several formal possibilities for a mutation: three substitutions, a deletion, or an insertion of any number of base-pairs 5' to the base-pair in question. For the purpose of discussion, we thus consider five formal measurable mutations as possible for each base-pair.
TABLE I
APPROACHES TO OBTAIN USEFUL MUTATIONAL SPECTRA FOR HUMAN TISSUESᵃ

Approach                              Number of possible    Enrichment    Limitation
                                      mutations screened    of mutants
Phenotypic selection (1000-bp gene)   5000                  10⁵           Selectable mutants
CDCE (100-bp domain)                  500                   10¹           100-150 bp
HERA (6-bp site)                      30                    10⁵           Restriction sites
MAMA (one formal mutant)              1                     10⁵           A single known mutant

ᵃ CDCE, constant denaturing capillary electrophoresis; HERA, high-efficiency restriction assay; MAMA, mismatch amplification mutation assay.
We may reasonably anticipate combining several purification steps into a complete mutant detection and/or isolation procedure. For example, a combination of phenotypic selection and DGGE yielded an enrichment of 10⁷, which, as mentioned earlier, enabled us to investigate spontaneous mutation in cell culture (4). The total enrichment of 10⁷ is the product of the enrichment of mutants by phenotypic selection (10⁵) and by DGGE (10²).

Although many combinations are possible, some of them require a PCR as a link between consecutive enrichment steps. Since PCR itself generates mutations (14), it should be used only under the condition that the fraction of PCR-associated mutants is less than the fraction of the original mutants. This means that the original mutants must be enriched above a certain threshold prior to PCR. Since the threshold decreases as the fidelity of the polymerase increases, the need for high-fidelity PCR is obvious.

Here, it is worth pointing out the role of DGGE-like methods, including CDCE. Although not very efficient in enriching mutants, they are able to pick up almost any mutant and to display a mutational spectrum as a series of bands or peaks, which allows easy isolation and subsequent sequencing of individual mutants.

Mutational spectrometry is not the only application for the approaches discussed here. Detection of mutants at low fractions is of special interest in population screening and in early detection of cancer cells.
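The way enrichment steps combine can be written out explicitly. The sketch below multiplies the single-step enrichments and applies the result to the mutant/wild-type ratio. It assumes that each step acts independently on that ratio, which is our simplification; the starting fraction of 10⁻⁷ is a hypothetical value used only for illustration.

```python
import math


def combined_enrichment(*step_enrichments):
    """Total enrichment of consecutive, independent purification steps."""
    return math.prod(step_enrichments)


def fraction_after_enrichment(initial_fraction, enrichment):
    """Mutant fraction after the mutant/wild-type ratio is multiplied by `enrichment`."""
    ratio = initial_fraction / (1.0 - initial_fraction)
    enriched_ratio = ratio * enrichment
    return enriched_ratio / (1.0 + enriched_ratio)


# Phenotypic selection (10^5) followed by DGGE (10^2), as in the text.
total = combined_enrichment(10**5, 10**2)

# Hypothetical starting mutant fraction of 1e-7.
final_fraction = fraction_after_enrichment(1e-7, total)
print(f"total enrichment: {total:.0e}, final mutant fraction: {final_fraction:.2f}")
```

Working with the ratio rather than the fraction keeps the result below 1 even when the enrichment is as large as the inverse of the starting fraction.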
II. Allele-specific PCR (ASP)

ASP is a modification of the PCR (15, 16) that permits specific amplification of sequences differing by as little as a single base-pair (for a review, see 17). The method is based on the observation that 3' mismatch(es) in a primer/template complex interfere with efficient extension by DNA polymerases. Allelic specificity is obtained by designing a primer that, unless it is annealed to the desired allele, will make 3' mismatch(es) with the template (Fig. 1).

[Figure 1 schematic: wild-type (GGA/CCT) and mutant (GAA/CTT) templates; PCR utilizing one mismatch primer; gel electrophoresis following 30-40 cycles of ASP.]

FIG. 1. Allele-specific PCR (ASP). ASP is a modification of PCR that permits specific amplification of sequences differing by as little as a single base-pair. Specificity of amplification is obtained by using a primer that, unless it is annealed to the desired allele, forms 3' mismatch(es) with the template. Shown is the double-mismatch primer utilized for the detection of a transforming rat H-ras allele [GGA-to-GAA transition at the 12th codon (12)]. The mutant allele, which forms one penultimate mismatch with the primer, is efficiently amplified by the polymerase; on the other hand, the extension from the wild-type allele is greatly hindered by the additional (ultimate) mismatch introduced by the double-mismatch primer.

Terms synonymous with ASP in the literature include PCR amplification of specific allele (PASA; 18), amplification refractory mutation system (ARMS; 19), and MAMA (12). The procedure has been most widely utilized in human population studies, for instance, to identify carriers of various human genetic disorders, including α1-antitrypsin deficiency (19, 20), sickle-cell anemia (21), familial amyloidotic polyneuropathy (22, 23), and phenylketonuria (24). In each of these cases, the desired allele constitutes either 50% or 100% of the sample, and the specific allele is readily detected by nonisotopic ASP methods (17).

ASP also has a number of potential applications, including short-term in vivo and in vitro mutagenicity tests, human mutational spectrometry, and elucidation of genetic events that are involved during early stages of tumorigenesis. In order to carry out such analyses, we believe that mutation assays with a sensitivity of 10⁻⁵ or better are required. The sensitivity of a mutational assay is defined here as the lowest mutant fraction measurable by the assay. However, except for MAMA, the limit of sensitivity of currently available ASP is around 1% (18), and this is why the use of ASP has been limited to human population studies.

MAMA is an ASP that has been optimized in regard to its sensitivity (12). By exploring double-mismatch primers, altering the duration and the temperature of the primer-annealing and extension step in the PCR, and modifying the solvent composition of the reaction mixture, we reproducibly measure a specific mutation (GGA-to-GAA mutation at codon 12 of the rat H-ras gene) at a fraction somewhat below 10⁻⁵.

MAMA is limited in that it is designed to detect one specific mutation at a time. Its power, however, stems from its simplicity and speed. This, in turn, makes MAMA the technique of choice in certain cases in which rapid screening of a large number of samples is desired. For example, MAMA for the G-to-A transition in the 12th codon of the rat H-ras gene allowed us to screen hundreds of organ sectors efficiently. In the case of mutational spectrometry, one could use multiple MAMAs as a rapid screening tool for mutational hot spots once other procedures have provided the mutational spectrum. In such cases, a simple MAMA screening may be sufficient to assess whether certain individuals have been exposed to a particular mutagen.
A. Development of a Mismatch Amplification Mutation Assay (MAMA)

The overall objective here is to define PCR conditions that allow efficient amplification of the desired mutant allele, but minimize amplification of the wild-type allele. Development of a MAMA involves several variables: (1) the mismatch primer sequence, (2) the temperature of the primer extension step, (3) the time permitted for extension, and (4) the composition of the reaction mixture, particularly the concentrations of dNTP, MgCl₂, primer, and glycerol.

1. NUMBER, POSITION, AND NATURE OF MISMATCHES IN THE PRIMER
Despite a large number of reports regarding efficiencies of primer extension from matched versus mismatched primers, it is still extremely difficult to predict which mismatches will be extended and which will not. This is largely due to the fact that the efficiency of primer extension is greatly influenced by many parameters in PCR, such as the type of DNA polymerase, the local context of the DNA, the reaction conditions (including concentrations of primers, dNTPs, and MgCl₂, and the pH), and the time allowed for extension. This point is illustrated in Table II. Whereas Newton et al. (19) and Kwok et al. (25) reported reduced amplification for specific single mismatches, all of the single-mismatch primers tested by Cha et al. (12) were amplified as efficiently as the perfect-match primer. In the latter study, reduced amplification was observed only when double mismatches were introduced at the 3' end of the primer. Even then, one example was found in which a primer that created AG/CT double mismatches gave efficient amplification. Whereas Newton et al. (19) observed reduced amplification from T/T mismatches (primer/template), both Kwok et al. (25) and Cha et al. (12) reported efficiencies that were comparable to the perfect match.

These differences can be attributed to several factors. As summarized in Table II, each study was carried out using a different gene, using a different mutation, and under different reaction conditions (including the length of the mismatch primers, the concentrations of various components of the reaction mixture, and the steps involved in the PCR cycle).

One general "rule" in designing mismatch primers for MAMA is that, in order to see allele-specific amplification, the mismatches in a primer must be positioned at the 3' ultimate or the penultimate position. A single mismatch or double mismatches placed at least two positions away from the 3' ultimate position are not as effective as mismatches at the 3' end in reducing the amplification efficiency of undesired alleles (12, 17, 19, 25). Also, for the purpose of detecting rare mutations (e.g., mutant fractions of less than or equal to 10⁻⁵), a single mismatch has not yet been found to provide sufficient specificity. For an A-to-T transversion in codon 61 of the mouse H-ras gene, Nelson et al. (26) found a limit of 10⁻⁴, whereas Sarkar et al. (18) reported a limit of 2.5 × 10⁻³ for a TA-to-AT polymorphism of the phenylalanine hydroxylase gene.

With regard to the nature of double mismatches to be chosen to optimize specificity, there are no general rules except that mismatches involving T residues appear to be more permissive to extension than others (12, 25) and should be avoided. Obviously, double-mismatch primers that permit looping-out of one of the mismatches are also undesirable.
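The positional rule can be made concrete with a small check of where a candidate primer mismatches each allele. The sketch below is illustrative only: the flanking sequence and the primer are hypothetical, not the published H-ras primer, and the helper name is ours. The primer is written in the same sense as the strand it copies, so any base that differs from that strand corresponds to a primer/template mismatch.

```python
def three_prime_mismatches(primer, sense_target):
    """Return the 3' positions (1 = ultimate, 2 = penultimate) at which the
    primer differs from the same-sense target sequence ending at the same base."""
    if len(sense_target) < len(primer):
        raise ValueError("target is shorter than the primer")
    window = sense_target[-len(primer):]
    return [pos for pos in (1, 2) if primer[-pos] != window[-pos]]


FLANK = "ACGTTGCAGT"          # hypothetical sequence upstream of codon 12

# Sense strands ending at the second base of the codon (GGA wild type, GAA mutant).
wild_type = FLANK + "GG"
mutant = FLANK + "GA"

# Hypothetical double-mismatch primer: the 3' ultimate base matches the mutant,
# and a deliberate penultimate change (G to C) avoids a T-containing mismatch.
primer = FLANK + "CA"

print("mismatches with mutant:   ", three_prime_mismatches(primer, mutant))
print("mismatches with wild type:", three_prime_mismatches(primer, wild_type))
```

The mutant template carries only the penultimate mismatch, while the wild type carries both, which is the arrangement described in the legend to Fig. 1.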
2. REACTION CONDITIONS

Reaction conditions play a critical role in determining whether or not a particular mismatch will be extended. For example, Kwok et al. (25) noted that a G/G mismatch was extended as well as a perfect-match primer/template when the dNTP concentration in the reaction mixture was 800 µM, but not at 50 µM. Each component of the reaction mixture must be optimized so that it will allow for efficient amplification of the desired allele, but at the same time minimize amplification of the wild type.
This overall objective of MAMA is similar to that of high-fidelity PCR in that both require a high degree of specificity. In general, high-fidelity PCR conditions (e.g., a reduction in mismatch primer extension) are achieved by lowering the pH and the concentrations of dNTP and primers (27, 28). Several researchers found that lowering the MgCl₂ concentration also reduces the extension of mismatch primers in ASP (18, 27). On the other hand, for Taq and Vent polymerases, Ling et al. (28) reported that when both the pH and the concentration of dNTP are lowered, increasing the MgCl₂ concentration improves the fidelity of PCR. It is also important to note that increasing the fidelity of PCR in certain cases reduces the efficiency of PCR (28). In fact, for the MAMA of the G-to-A transition at the 12th codon of the rat H-ras gene, no detectable amount of amplification product was generated from either the wild-type allele or the mutant allele when the pH was below 7.0 (8.4 in the original buffer), or the concentration of MgCl₂ was below 0.5 mM (2.25 mM in the original buffer).
3. TEMPERATURE AND DURATION OF THE ANNEALING AND EXTENSION STEPS

In general, shorter extension periods provide the conditions for high-fidelity PCR. For this reason, in MAMA, we eliminated the separate extension step, which greatly decreased double-mismatch primer extension. To find the optimal annealing temperature, we tested temperatures ranging from 50°C to 66°C. No amplification product was observed from either the mutant allele (the desired allele) or the wild-type allele when the temperature was above 63°C. Below this temperature, efficient amplification (65-70% per cycle) of the mutant allele was observed. Minimum amplification occurred from the wild-type allele when the annealing step was carried out at 50°C. At the same time, a few aberrant bands also appeared. Thus, it appears that as the temperature is lowered, the double-mismatch primers can hybridize to other regions of the DNA. The optimal temperature and extension period must be determined individually for each MAMA developed. The use of capillary PCR permits tighter control over time/temperature parameters and could increase the sensitivity of MAMA.

We do not know whether all single base-pair alterations in genomic DNA are amenable to MAMA (i.e., with a sensitivity of 10⁻⁵). Thus far, three different loci have been subjected to MAMA optimization: the GC-to-AT transition at the 12th codon of the rat H-ras gene, the TA-to-AT transversion in codon 664 of the rat c-neu gene, and the AT-to-TA transversion at codon 61 of the rat H-ras gene. MAMA for the first two achieved a sensitivity of 10⁻⁵; the current sensitivity for the third mutation is about 5 × 10⁻⁵. Even in these few cases, the optimal MAMA conditions for each sequence were significantly different. For example, for the A-to-T transversion of the c-neu gene, 15 µM of each dNTP (versus 37 µM for the G-to-A mutation of the H-ras gene) and 5% glycerol (versus 10%) were used.

In general, obtaining a sensitivity of 10⁻⁵ by MAMA required the following three features: (i) introduction of double mismatches; (ii) reduction of the extension time; and (iii) addition of glycerol. Our experience indicates that by simply implementing these three conditions, one can achieve a sensitivity of 10⁻² to 10⁻³. However, in order to increase the sensitivity to 10⁻⁵, optimization of various parameters in MAMA using a matrix approach is required.
B. Achieving Higher Sensitivity
The current limit of sensitivity of MAMA is 10⁻⁵. This is based on the observation that 15 copies of a mutant allele mixed with 1.5 × 10⁶ copies of a wild-type allele gave rise to a signal that was reproducibly discernible from that of the 1.5 × 10⁶ copies of the wild-type DNA alone. The limit of sensitivity stems from the fact that, despite the double mismatches, a small fraction of the wild-type allele is still extended by polymerase. Currently, it is not known precisely how frequently such double-mismatch extension takes place. Our experience with the GGA-to-GAA transition at the 12th codon of the rat H-ras gene indicates that the number of copies generated from 1.5 × 10⁶ copies of the wild-type allele is slightly lower than the number generated from 15 copies of the mutant allele (i.e., 10⁻⁵).

There are many possibilities that should reduce the background signal from the wild-type DNA. One can eliminate the wild-type DNA by first running a preparative DGGE. Since it is possible to eliminate at least about 99% of the wild-type DNA by DGGE, this in turn would reduce the background signal by a factor of 100. An alternative method of removing the wild type in some cases is to utilize a specific restriction enzyme that cleaves the wild-type but not the mutant DNA. In this way (see Section III), it may be possible to degrade over 99.99% of the wild-type DNA.

In addition to these methods in which the source of background noise (i.e., the wild-type allele) is physically removed from the sample, there are other means to reduce the background, for example, by making extension of the double-mismatch primers more difficult. Taq DNA polymerase is not an enzyme of choice for high-fidelity PCR due to its relatively high error rate (14). Taq was utilized in our initial studies because at the time it was the only thermostable enzyme that was also exonuclease negative (exo⁻).
It was reasoned that exonuclease-positive (exo⁺) DNA polymerases would correct the terminal mismatch and extend the corrected primer, thereby eliminating the specificity that was conferred by introducing mismatch(es) at the 3' end. More recently, additional exo⁻ thermostable enzymes have been identified, including those derived from Pfu and Vent DNA polymerases. Although the fidelity of these additional exo⁻ thermostable enzymes remains to be determined, they could easily be tested by MAMA to see whether they could reduce the background noise from the wild-type allele.

Finally, one could combine the principle of differential oligonucleotide hybridization (DOH) and MAMA to reduce the background signal (a suggestion by H. Zarbl). DOH is a technique that has been utilized extensively in identifying oncogenic mutations in tumors. In a typical assay, a short synthetic DNA fragment (10-20 bases long) encompassing the region of mutation is used to probe for a specific point-mutation. By optimizing the hybridization conditions, the technique can be successfully utilized in characterizing single point-mutations (29). The principle behind the combination is to design a synthetic oligonucleotide fragment ("blocker") that will hybridize to the wild-type but not the mutant allele. By occupying the wild-type DNA, the blocker will presumably prevent wild-type DNA from annealing to the MAMA mismatch primer. In order to ensure that the blocker does not become extended by Taq polymerase, the 3' end of the blocker will be synthesized with a dideoxynucleotide or some other synthetic nucleotide that prevents chain elongation.

In summary, an ASP in the form of MAMA has been demonstrated to permit measurements of single base-pair mutants at a fraction of 10⁻⁵. A similar sensitivity has been found for a single base-pair deletion in the human hprt gene (R. Okinaka, personal communication). It seems probable that MAMA sensitivity can be improved to measure mutant fractions down to 10⁻⁸.
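The expected gain from physically removing wild-type DNA before MAMA can be estimated directly. The sketch below assumes that the background signal is strictly proportional to the number of wild-type copies remaining, which is our simplifying assumption; the removal efficiencies are those quoted in this section for preparative DGGE and for restriction digestion.

```python
def background_after_removal(background_fraction, wild_type_removed):
    """Background-equivalent mutant fraction after removing a given share
    of wild-type copies, assuming background scales with wild-type copy number."""
    return background_fraction * (1.0 - wild_type_removed)


CURRENT_BACKGROUND = 1e-5   # current MAMA sensitivity limit from the text

after_dgge = background_after_removal(CURRENT_BACKGROUND, 0.99)
after_digestion = background_after_removal(CURRENT_BACKGROUND, 0.9999)

print(f"after preparative DGGE:      {after_dgge:.1e}")
print(f"after restriction digestion: {after_digestion:.1e}")
```

Under this assumption either pre-purification step alone would move the limit well below 10⁻⁵; whether the proportionality holds in practice would have to be established experimentally.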
III. High-efficiency Restriction Assay (HERA)

A. Introduction

HERA detects DNA point-mutations located in restriction recognition sites by eliminating wild-type DNA copies using high-efficiency restriction digestion. Restriction endonucleases are used to digest cellular DNA and eliminate wild-type DNA copies of the sequence studied. Mutants in the restriction recognition sites will be undigested. With high-fidelity DNA amplification, the mutants can be amplified and subsequently separated, enumerated, and isolated by DGGE or another suitable separation technique.

Several groups have also been trying to use restriction endonucleases to eliminate wild-type DNA, with varied success. Processes based on restriction digestion have been used to detect point-mutations in oncogenes (30-35).
However, a general design problem in these efforts is that PCR is used to amplify target sequences that are mixed with too many residual undigested wild-type DNA copies (30-34, 36). In these experiments, DNA amplification before a high-efficiency restriction digestion would be expected to create PCR-induced mutants at a level that would obscure expected in vivo mutations at fractions of 10⁻⁶ to 10⁻⁷ (14). Another problem with many of the experiments reported is a lack of sufficient initial mutant copy number to achieve useful data (30, 34, 35). In order to achieve 95% confidence limits of ±20%, 100 or more mutants must exist in any sample assayed. Reconstruction experiments, such as mixing "one copy" of a mutant with 10⁵ copies of wild-type DNA, are not a suitable means to demonstrate a mutational detection sensitivity of 10⁻⁵; 100 mutants should be mixed with 10⁷ copies of wild-type sequences for such a demonstration.

The HERA method has the following advantages and characteristics:
(i) The sensitivity for mutation detection by this method is about 10⁻⁷. Thus, it should be possible to measure human somatic mutations using HERA.
(ii) HERA can screen 4- to 8-bp DNA sequences each time for any point-mutation related to these sequences; therefore, it can be used in a limited way to establish mutational spectra.
(iii) HERA measures mutation within palindromic sequences that show a higher proportion of mutational hot spots than random sequences (37).
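The concern about PCR-induced mutants can be put in numbers using the relation given in the legend to Fig. 2: the fraction of PCR-generated mutants is approximately the target length (b) times the polymerase error rate per base-pair per duplication (f) times the number of duplications (d). The sketch below is a rough illustration only. The six-base-pair target (a "six-cutter" site) and the Taq error rate of 2 × 10⁻⁴ are assumptions made for this example, and the linear form is valid only while the product stays well below 1.

```python
import math


def duplications(start_copies, final_copies):
    """Number of doublings needed to go from start_copies to final_copies."""
    return math.log2(final_copies / start_copies)


def pcr_mutant_fraction(target_length_bp, error_rate, n_duplications):
    """Approximate fraction of PCR products carrying a polymerase error
    within the target (linear approximation b * f * d)."""
    return target_length_bp * error_rate * n_duplications


# Amplifying 1e4 residual copies to 1e12 copies, as in the legend to Fig. 2.
d = duplications(1e4, 1e12)

# Assumed values for illustration: 6-bp target, error rate of 2e-4 per bp per duplication.
mf_pcr = pcr_mutant_fraction(6, 2e-4, d)

IN_VIVO_FRACTION = 1e-6     # upper end of the in vivo range quoted in the text
print(f"duplications: {d:.1f}")
print(f"PCR-induced fraction exceeds in vivo fraction: {mf_pcr > IN_VIVO_FRACTION}")
```

Under these assumptions the PCR-induced fraction is several orders of magnitude above an in vivo fraction of 10⁻⁶, which is why the wild-type sequence has to be depleted before, not after, amplification.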
B. Methodology The major steps of the H E M procedure are shown in Fig. 2.
1. CELL/DNA ISOLATION DNA should be isolated from tissue or cells without being exposed to elements that may react with DNA or cause DNA adducts. Many uncontrolled factors, especially heating and UV light from normal fluorescent lamps, generate DNA adducts that can be clearly separated and distinguished from wild-type DNA on DGGE (8). 2. ELIMINATION OF HETEROGENEOUS DNA WITH REGARD TO ENDONUCLEOLYTIC DIGESTION A DNA fragment several hundred bases in length carrying the target sequence is cut from cellular DNA at two restriction recognition sites flanking the region of interest, and purified on a polyacrylamide gel. In order to eliminate wild-type DNA by restriction digestion, the efficiency of restriction digestion must be sufficiently high so that very few copies of wild-type DNA will remain undigested. However, the efficiency of restriction digestion is limited by heterogeneity in the preparation of the
MUTATIONAL SPECTROMETRY
297
DGGEIXCOCE FIG.2. Illustration of H E M . A D N A population of 10" copies carrying hot-spot mutations with a niutatioiial fraction (MF) of 10-7 is cut from a cellular D N A preparation and digested by restriction eiidonuclease. A 10-5 fraction of wild-type D N A reliiains undigested. The MF of these inlitatits is increased to 10-2. The fraction of the tnutants generated in PCR (MFrYcR) is ecliial to the length of the target (h)tiiiies the error rate ofthe D N A p)lyiiierase used in PCR (fl (2 X lO~'/l)p/duplicatioti for Tay DNA polytiierilse), times the nuiiilwr of duplications made in PCR (d),which is almit 26, to produce a iiiaxiiiiuiii of 10'2 copies of D N A from a iiiiiiiiiirim of 1 0 4 wild-type D N A copies. Thus, MF,,,,, is calculated to be 1.6 X lo-?, which is LY)lllpdrdble to the MF. PCH errors lociited outside the restriction recognition site, as well as a large portion (90%) of residual wild-type sequences, are eliminated by another round of digestion. Reamplificatioii with a internal primer carrying a CC clamp eliiniiiated the noii-specific amplification signals generated in the first round of PCR and enabled the target seqrienct' to be analyzed b y I X G E or CIICE.
DNA, with regard to endonucleolytic digestion (Fig. 3). As shown in Fig. 3, the digestion efficiencies at the EagI site at bp 2567 of the mitochondrial DNA (mtDNA) and at the KpnI site at bp 2574 of the mtDNA were both 90% when cellular DNA was digested. However, double digestion with both KpnI and EagI also left an undigested residue of 10%, instead of the 1% that would be expected for independent action of the endonucleases on homogeneous DNA (Fig. 3a). A portion of the DNA was thus determined to be indigestible, probably because it is incompletely dissolved as microprecipitates. To improve the digestion efficiency, target sequences were first cut from cellular DNA and purified on a polyacrylamide gel. Indigestible heterogeneous cellular DNA was removed by this gel-purification process. DNA thus purified and eluted from the gel can be digested to near completion. Only 10⁻⁵ or less of the wild-type DNA remains undigested, as determined by quantitative PCR (Fig. 3b).

FIG. 3. Improvement of restriction digestion efficiency. Lanes in panel a: pBR322/MspI, 250 ng; pBR322/MspI, 500 ng; 10⁹ copies mtDNA, 6 × 10²-fold; 10⁹ copies mtDNA/KpnI, 10⁴-fold; 10⁹ copies mtDNA/EagI, 10⁴-fold; 10⁹ copies mtDNA/KpnI + EagI, 10⁴-fold. Lanes in panel b: pBR322/MspI, 250 ng; pBR322/MspI, 500 ng; 10⁹ copies mtDNA/EagI, 10⁸-fold (two lanes). (a) Heterogeneity of cellular DNA. Cellular DNA containing 10⁹ copies of mtDNA isolated by phenol extraction was digested with EagI, KpnI, or EagI plus KpnI, respectively, and subsequently amplified using Taq DNA polymerase and primers 1 and 2, which are complementary to the 2457- to 2476-bp and 2613- to 2594-bp regions of mtDNA (45b), which carries an EagI recognition site at 2567 bp and a KpnI recognition site at 2584 bp. The amplification-fold of each sample is indicated. The residual mtDNA copy number in the restriction-digested cellular DNA samples before PCR was calculated to be about 10⁸ copies. A restriction digestion efficiency of 90% was thus concluded. (b) High-efficiency EagI digestion. Cellular DNA was first digested with SphI and PvuII; their recognition sites are located at 2436 bp and 2653 bp of mtDNA, respectively. Undigested DNA was then removed by purifying DNA fragments on a 6% polyacrylamide gel. The portion of the DNA fragments (length 217 ± 40 bp) was recovered by electroelution. EagI digestion was carried out on 10⁹ copies of these purified DNA fragments. After about 10⁸-fold amplification, ~10¹² PCR products were observed as compared to the pBR322/MspI standard. This method indicated an undigested residue of 10⁻⁵ or less in replicate experiments with EagI digestion.
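The argument for an indigestible subpopulation can be made concrete with a small calculation. The sketch below is only an illustration: the function name and the simple model (a fixed indigestible fraction plus independent enzyme action on the accessible remainder) are assumptions, not a description of how the measurements in Fig. 3 were analyzed.

```python
def residual_fraction(indigestible, efficiencies):
    """Fraction of target copies left uncut after digestion.

    indigestible: assumed fraction of copies that no enzyme can reach
                  (for example, DNA trapped in microprecipitates).
    efficiencies: cutting efficiency of each enzyme on the accessible
                  copies, assumed to act independently.
    """
    accessible_uncut = 1.0 - indigestible
    for efficiency in efficiencies:
        accessible_uncut *= 1.0 - efficiency
    return indigestible + accessible_uncut


# Homogeneous DNA, two independent enzymes that are each 90% efficient:
# the double digest would be expected to leave about 1%.
homogeneous_double = residual_fraction(0.0, [0.9, 0.9])

# Heterogeneous DNA, 10% indigestible, enzymes complete on the rest:
# single and double digests both leave about 10%, as in Fig. 3a.
hetero_single = residual_fraction(0.1, [1.0])
hetero_double = residual_fraction(0.1, [1.0, 1.0])

print(homogeneous_double, hetero_single, hetero_double)
```

Under these assumptions, adding a second enzyme cannot lower the residue below the indigestible fraction, which is why removing that fraction by gel purification is what improves the digestion.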
3. HIGH-EFFICIENCY RESTRICTION DIGESTION OF DNA
This is the key step contributing to high sensitivity. Wild-type DNA will be digested at the unique recognition site (i.e., the target sequence) to near completion so that only about 10⁻⁵ of the wild-type DNA copies remain
undigested. A typical nuclear DNA hot spot that occurs at a fraction of 10⁻⁷ (Section I,D) will thus be enriched to about 10⁻² by a high-efficiency digestion step.
4. HIGH-FIDELITY DNA AMPLIFICATION
Undigested DNA, including mutants and residual undigested wild-type copies, is amplified to generate 10¹² total copies. Two points should be considered at this amplification step. First, some DNA polymerases may add an extra nucleotide to the 3' end of the PCR products (38) during PCR and thereby affect their behavior in the subsequent DGGE steps (39); DNA polymerases that create blunt-ended PCR products, such as T4 DNA polymerase, are preferred in this step. Second, DNA polymerases make mistakes during amplification, and these may be mistaken for sample mutants. The PCR reaction should therefore be optimized with respect to fidelity (28). The mutant fraction (MF) generated during PCR can be predicted by the following equation:
MF = bfd/2
where b is the length of the target sequence, f is the error rate of the DNA polymerase, and d is the number of duplications of the sequence. If a 6-bp restriction recognition site is screened and Taq DNA polymerase is used to amplify DNA 10⁸-fold, the expected mutant fraction generated in PCR should be 1.6 × 10⁻². This is because there are 6 bp in the target restriction recognition site, f for Taq DNA polymerase is about 2 × 10⁻⁴, and amplification from 10⁴ to 10¹² copies requires about 26 duplications; 6 × 2 × 10⁻⁴ × 26/2 = 1.6 × 10⁻². Since the sample mutant fraction of 10⁻² is comparable with the PCR noise, sample mutants should be visible and distinguishable on a denaturing gradient gel from PCR noise, as observed in a simultaneous control containing PCR errors.
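The arithmetic behind this estimate can be checked with a short sketch. The function names are illustrative only; the numerical inputs (6 bp, 2 × 10⁻⁴ errors per bp per duplication, amplification from 10⁴ to 10¹² copies) are the ones quoted in the text.

```python
import math


def duplications(start_copies, end_copies):
    """Number of doublings needed to go from start_copies to end_copies."""
    return math.log2(end_copies / start_copies)


def pcr_mutant_fraction(b, f, d):
    """Mutant fraction generated by PCR: MF = b * f * d / 2.

    b: length of the target sequence in bp
    f: polymerase error rate per bp per duplication
    d: number of duplications
    """
    return b * f * d / 2.0


# Exact doubling count for a 10^8-fold amplification (a little above 26).
d_exact = duplications(1e4, 1e12)

# Value used in the text: 6-bp site, f = 2e-4, d = 26.
mf_pcr = pcr_mutant_fraction(b=6, f=2e-4, d=26)

print(f"duplications needed: {d_exact:.2f}")
print(f"PCR-generated mutant fraction: {mf_pcr:.2e}")
```

With a higher-fidelity polymerase (a smaller f), the same function shows how far the PCR noise would drop for a fixed target length and amplification.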
5. ELIMINATION OF MOST PCR-GENERATED MUTANT SEQUENCES AND NON-SPECIFIC AMPLIFICATION SIGNALS
The PCR product is redigested with the same restriction endonuclease as used in step 3 to eliminate the PCR errors generated outside of the 6-bp target sequence in the amplified DNA fragment. In vitro DNA amplification generates PCR errors within and outside the target sequence, all of which are detected as signals during the later separation of mutants on DGGE. Considering that the total length of the amplified sequence is usually about 100 bp to facilitate separation of the mutants on DGGE, only 6% of the PCR errors will be located in the target region. Redigestion of the PCR product
eliminates most of the total PCR errors not located in the restriction recognition site. The digested PCR product is then reamplified 100-fold with an internal primer to eliminate non-specific amplification signals. In step 4, some non-specific amplification occurs because the selected primers anneal to other regions of the genomic DNA. These sequences may represent noise in the system. PCR with an internal primer removes almost all of these non-specific amplification signals.
6. ANALYSIS OF THE MUTANTS
Since the sample mutational fraction has been raised to at least 10⁻², there are several ways to separate and enumerate these mutants. One of the most reliable methods is DGGE (40). The PCR product generated in step 5 can be attached to an artificial high-melting domain, and the purified PCR product can be run on a DGGE. Since the sensitivity of DGGE detection is around 10⁻² to 10⁻³ (8), it is fully applicable in this case. CDCE (13) and single-strand conformation polymorphism (41) may be alternative choices.
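The claim that digestion raises a 10⁻⁷ hot spot to roughly 10⁻² can be sketched as follows. This is a minimal illustration that assumes the mutation destroys the recognition site, so that every mutant copy survives digestion; the function name and the `mutant_survival` parameter are hypothetical.

```python
def enriched_fraction(mf, wt_residual, mutant_survival=1.0):
    """Mutant fraction after restriction digestion of wild-type DNA.

    mf:              mutant fraction before digestion
    wt_residual:     fraction of wild-type copies left undigested
    mutant_survival: fraction of mutant copies that survive (assumed 1.0)
    """
    mutants = mf * mutant_survival
    wild_type = (1.0 - mf) * wt_residual
    return mutants / (mutants + wild_type)


# Hot spot at 1e-7, wild-type residue of 1e-5 (values from steps 2 and 3).
mf_after = enriched_fraction(1e-7, 1e-5)

# Lower bound of the DGGE sensitivity range quoted in the text.
dgge_limit = 1e-3
detectable = mf_after >= dgge_limit

print(f"mutant fraction after digestion: {mf_after:.2e}")
print(f"above the 1e-3 DGGE limit: {detectable}")
```

The result sits just under 10⁻², consistent with the "about 10⁻²" stated in step 3 and comfortably above the 10⁻³ end of the DGGE sensitivity range.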
C. Application of High-efficiency Restriction Assay (HERA) to Mitochondrial Mutational Assay
HERA has recently been used to measure mutations in the human mitochondrial genome (41b). mtDNA has several advantages for mutational research. There are 10³ to 10⁴ copies per cell, so smaller tissue samples yield the necessary number of mutants. mtDNA has an evolutionary rate 20 times that of nuclear single-copy genes (9) and appears to be more sensitive to chemical mutagens than is nuclear chromosomal DNA. Mitochondrial mutants may also play important roles in carcinogenesis, degenerative diseases, and aging (42-44). mtDNA is a convenient target to detect hot-spot mutations and to establish a mutational spectrum from a normal healthy human. According to our calculation, 3 × 10⁵ T cells from peripheral blood samples should provide enough mutant copies to detect hot-spot mutations at a fraction of approximately 10⁻⁷ in a 6-bp restriction recognition site. Nuclear multi-copy sequences such as ribosomal DNA genes may also be suitable for mutational spectra studies. However, there is a difficulty that must be overcome in order to measure mtDNA mutations: the interference from nuclear pseudogenes of mtDNA. mtDNA has frequently been inserted into the nuclear genome during evolution, and these insertion events now appear as a series of pseudogenes (45). When using a total genomic DNA preparation, these pseudogenes represent “noise” in mtDNA mutational assays. Single to multiple copies of pseudogenes represent mutant fractions of 10⁻² to 10⁻⁴ relative to
FIG. 4. DGGE display of mtDNA mutants in tissue samples. Cellular DNA from one lung, two normal colon, and two colon tumor samples was examined for mtDNA mutations in the EagI and KpnI sites. A chromium-treated human lymphoblast line, TK6, which carried no detectable mutations in the examined EagI and KpnI sites (data not shown), was used as a concurrent control. Normal colon sample 2 has a clear signal not found in the other tissues or cell samples (Arrow b). All normal tissue and tumor samples show a band not seen in the cell culture sample (Arrow a).
wild-type mtDNA, that is, one to 100 copies of a particular nuclear mitochondrial pseudogene per cell. We first chose the EagI site (2567 bp) and KpnI site (2574 bp) in the 16S ribosomal RNA coding sequence of the human mtDNA as target sequences of HERA. A series of mtDNA pseudogenes homologous to the 2457- to 2594-bp region of the mtDNA, at fractions of 10⁻² to 10⁻³ compared to the wild-type mtDNA, were found and sequenced (45b). Knowing the nuclear pseudogene sequences, we designed a protocol to eliminate all of the pseudogenes homologous to target mtDNA sequences and to screen for rare mtDNA mutations (41b). When HERA was used to search for unselected mtDNA mutations in the
EagI site (2567 bp) and the KpnI site (2574 bp) from human tissue samples, one colon sample was found to have a hot-spot mutation at a frequency of approximately 10⁻⁶ (Fig. 4) (41b). While more investigation is needed to standardize the HERA technique and more restriction recognition sites are needed as target sequences for mutational spectrometry, the strategy has shown its potential to achieve the goal of direct measurement of DNA mutations in tissue.
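The sample-size and pseudogene-noise estimates given in this section follow from simple copy-number arithmetic, sketched below. The function names are illustrative; the inputs (3 × 10⁵ T cells, 10³ to 10⁴ mtDNA copies per cell, one to 100 pseudogene copies per cell) come from the text, and the hot-spot fraction of 10⁻⁷ is taken as the typical value quoted for HERA and should be read as an assumption here.

```python
def expected_mutant_copies(cells, mtdna_per_cell, mutant_fraction):
    """Expected number of mutant mtDNA copies in a sample."""
    return cells * mtdna_per_cell * mutant_fraction


def pseudogene_fraction(pseudogene_copies_per_cell, mtdna_per_cell):
    """Nuclear pseudogene copies expressed as a fraction of mtDNA copies."""
    return pseudogene_copies_per_cell / mtdna_per_cell


# Mutant copies available from 3e5 T cells at a hot-spot fraction of 1e-7.
low = expected_mutant_copies(3e5, 1e3, 1e-7)    # about 30 copies
high = expected_mutant_copies(3e5, 1e4, 1e-7)   # about 300 copies

# Pseudogene "noise" relative to wild-type mtDNA at 1e4 copies per cell.
noise_min = pseudogene_fraction(1, 1e4)     # about 1e-4
noise_max = pseudogene_fraction(100, 1e4)   # about 1e-2

print(f"mutant copies: {low:.0f} to {high:.0f}")
print(f"pseudogene fraction: {noise_min:.0e} to {noise_max:.0e}")
```

The comparison makes the difficulty explicit: the pseudogene background is several orders of magnitude above the mutant fraction being sought, so it has to be removed before rare mtDNA mutants can be scored.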
IV. Methods Using Differential DNA Melting to Separate Mutants
A. Principles of Separation
In the past few years, mutational spectrometric research on DNA has been accelerated by the invention of methods based on cooperative melting equilibria of DNA. These approaches include DGGE (46), constant denaturant gel electrophoresis (CDGE) (47), and a capillary-based variant of the latter, CDCE (13). All include electrophoresis of DNA under partially denaturing conditions (elevated temperature and/or media containing urea and/or formamide). Under these conditions, it is possible to separate mutants differing by only a single nucleotide as individual bands or peaks. The separation is based on the following facts. It has been shown that melting of DNA fragments is a discontinuous process (40). In fact, most naturally occurring DNA consists of well-bounded melting domains, each of which melts as a single unit at a specific temperature, melting being a rather sharp transition. This conclusion is based on calculations following Poland’s algorithm for DNA melting (48) as later modified (49). This algorithm yields the probability for any base-pair of a DNA fragment to be either in a helical or in a disordered state as a function of temperature. The parameters used in these calculations, which characterize the cooperativity of melting and the probability of loop formation as well as the intrinsic stability of a base-pair as a function of its nearest neighbors, were obtained in independent experiments (50, 51, and 52, respectively). The behavior of melting domains stems primarily from two factors: high cooperativity of melting (i.e., high probability for a base-pair to be in the same state, melted or helical, as the neighboring one), and low probability of the formation of melted loops (53). The results of such calculations are usually presented in the form of “melting maps,” which are plots of melting temperature against position in the DNA sequence.
Melting domains show up on a melting map as horizontal portions of the plot.
If a DNA fragment consists of two domains, one melting at a lower and the other at a higher temperature, the melting course of such a fragment would include, within a certain range of temperatures and/or denaturant concentrations, a stable intermediate comprising the fully melted low-melting domain and the completely helical high-melting domain. The electrophoretic mobility of such a partially melted intermediate is inversely proportional to the exponent of the length of the melted portion and is usually only a fraction of the mobility of a completely double-stranded species (53). Apparently, the partially melted intermediate is in rapid equilibrium with the non-melted form of the DNA fragment. Hence, the apparent mobility of the fragment may be considered as a weighted average of the mobilities of its non-melted and partially melted forms at a particular temperature (13). Important for the separation of mutants is the fact that the melting temperature of a domain is strongly affected by most base-pair changes (transitions, transversions, deletions, insertions, and mismatches) within that domain. If the change is located in the low-melting domain, the equilibrium between the partially melted intermediate and the non-melted form is shifted. Thus, within the appropriate range of temperature, the apparent mobility of the corresponding mutated fragment is changed as compared to the wild type, and the two are efficiently separated. For example, as much as 95% of base-pair substitutions may be separated from the wild type in a sample fragment of a β-globin promoter (54). Thus, an efficient separation of mutants depends on a number of requirements. The stretch of DNA to be screened for mutants should be located within the low-melting domain of a low-melting/high-melting domain combination with sharp domain boundaries and a sharp melting transition.
The melting temperature of the high-melting domain should be high enough so that strand dissociation is negligible (otherwise the bands decay and a higher-mobility smear consisting of single strands is formed). In case such a combination does not occur naturally, an artificial high-melting domain, or “clamp,” can be attached to an arbitrary sequence via PCR (54). Moreover, in many cases, the separation of mutant homoduplexes from the wild type is either impossible or the extent of separation is not sufficient for the needs of mutational spectrometry. The ability of the method to detect mutations is significantly improved (in the sense of detecting essentially all mutations and increasing the separation from the wild type) by converting the mutants into heteroduplexes with the wild-type sequence (55). The improvement results from the fact that a base-pair to mismatch change, as a rule, destabilizes DNA much more than any base-pair to base-pair change. The heteroduplexes are generated by simply boiling and reannealing a sample containing a predominance of wild-type sequences over mutants. By mass action, nearly all mutant homoduplexes are converted to heteroduplexes containing one wild-type strand. This procedure is particularly feasible in mutational spectrometry, because samples usually contain a large excess of wild-type DNA.
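The mass-action argument for heteroduplex formation can be illustrated with a simple random-pairing model. The sketch below assumes a single mutant species at fraction m and completely random reannealing of complementary strands; the function names and the model itself are illustrative assumptions rather than a measured result.

```python
def duplex_composition(m):
    """Expected duplex species after boiling and random reannealing.

    m: fraction of mutant molecules in the sample before boiling.
    Each top strand pairs with a bottom strand drawn at random, so the
    two heteroduplexes (wild-type top/mutant bottom and the reverse)
    each form with probability (1 - m) * m.
    """
    w = 1.0 - m
    return {
        "wild_type_homoduplex": w * w,
        "mutant_homoduplex": m * m,
        "heteroduplex_wt_top": w * m,
        "heteroduplex_mutant_top": m * w,
    }


def mutant_strands_in_heteroduplex(m):
    """Share of mutant strands that end up paired with a wild-type strand."""
    return 1.0 - m


# An equimolar mixture gives four species in equal amounts.
equimolar = duplex_composition(0.5)

# A typical mutational-spectrometry sample: large wild-type excess.
rare = mutant_strands_in_heteroduplex(1e-4)

print(equimolar)
print(f"mutant strands in heteroduplexes at m = 1e-4: {rare:.4f}")
```

In this model the conversion to heteroduplexes is incomplete only to the extent m itself, which is why a large wild-type excess makes the procedure effective.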
B. Comparison of DGGE, CDGE, and CDCE Approaches
1. EXPERIMENTAL SET-UP
Although the physical principles underlying the separations are similar, experimental set-ups are quite different for slab-gel procedures (DGGE and CDGE) and for the capillary polymer network format (CDCE). In DGGE, a DNA fragment is run in a polyacrylamide slab gel with an ascending gradient of denaturant (urea and formamide). The gel is submerged in electrophoresis buffer of controlled temperature (usually around 60°C). The running time is typically 8-16 hours, at 8 V/cm (56). CDGE is performed in much the same way as DGGE, except for the absence of a gradient of denaturant. The running time depends on the resolution to be achieved and the concentration of denaturant used (3-8 hours) (47). For the detection of DNA, both radioactive labeling and ethidium bromide staining have been used. CDCE (13) is a newly developed technique that combines the constant denaturant approach with the polymer network capillary electrophoresis format introduced for the high-resolution separation of single-stranded DNA (57). In CDCE, fluorescently labeled DNA fragments are run through a capillary 75 µm in diameter filled with a viscous non-cross-linked linear polyacrylamide solution, rather than polyacrylamide gel. The capillary can be used many hundreds of times, while the polyacrylamide filling must be replaced after each run (a %minute procedure). A portion of the capillary where the separation takes place is inserted into a water jacket with a variable temperature. DNA is detected at a single point where a laser beam is focused on the capillary. The fluorescence of labeled DNA, induced by the laser, is detected by an optical system with a photomultiplier, and the data are transmitted to a computerized data acquisition system. There are several advantages of CDCE over slab-gel formats. Microcapillaries enable us to increase the speed of separation about 30 times as compared to CDGE and DGGE (the usual field strength in CDCE is 250 V/cm).
The speed of separation in both DGGE and CDGE is limited by heat production, which is not significant in capillaries. Typically, a capillary separation of mutants takes less than 30 minutes. Moreover, laser-induced fluorescence detection gives very high sensitivity and dynamic range, both features being of special importance in mutational spectrometry. In our system, it is possible to measure DNA peaks containing as few as 3 × 10⁴ and as many as 10¹¹ molecules. The miniature format itself is an advantage, since in working with low numbers of DNA molecules, as required by mutational
spectrometry, it is better to keep volumes as small as possible. Moreover, with CDCE, fractions are taken simply by directing the material being electroeluted from the anode end of the capillary into separate tubes, while in a slab gel, one must cut out gel slices and elute the DNA from each slice.
2. SEPARATION EFFICIENCY
Examples of separations of a mixture of four sequence variants and a single-stranded DNA (ssDNA) by the three methods are shown in Fig. 5. The sequence shown is an example of a well-behaved DNA fragment, containing both high- and low-melting domains. The melting temperature of the wild-type low-melting domain was predicted by Lerman’s algorithm to be 63°C. The differences between sequence variants are limited to changes in one base-pair in the low-melting domain. This base-pair is a GC in the wild type (labeled “GC”); in the variants, this base-pair was changed to AT or to the mismatches GT and AC. The comparison of separations by DGGE, CDGE, and CDCE shown in Fig. 5 demonstrates that CDCE is superior with regard to resolution. Most likely, this advantage should be attributed to the much higher speed of separation in CDCE, which makes diffusion insignificant. In fact, the speed of separation in CDCE is so high that the resolution appears to be limited by the relatively slow kinetics of the melting-reannealing process; by increasing the speed of separation even further, one actually sacrifices resolution (13). Considerable differences in the relative peak positions result from the different modes of separation in the three methods. In DGGE, a DNA fragment is supposed to reach a denaturant concentration at which the low-melting domain is almost completely melted; hence, the mobility becomes so low that the band essentially stops. It appears, therefore, that the final positions of the bands, corresponding to different sequence variants, are spatially linked to the specific denaturant concentrations. Note that the dsDNA bands in Fig. 5A are sharper than the ssDNA band, due to so-called “focusing,” which refers to the compression of a band as its mobility decreases. The mobility of the ssDNA band does not decrease and it passes the dsDNA bands by the time separation is complete.
In contrast to DGGE, in CDGE and CDCE the conditions are constant throughout the region where separation takes place. The mobilities of dsDNA fragments are thus constant and depend on the states of melting equilibria displayed by each of them under those conditions. This principle is illustrated in Fig. 6, which shows CDCE runs of the same sample as in Fig. 5, except for the absence of single-stranded fragments, at different temperatures. At 31°C, a single peak is observed. This peak contains all four sequence variants in the unmelted form. A temperature of 35°C appears sufficient to partially melt the two sequence variants
FIG. 5. Comparison of separation by DGGE, CDGE, and CDCE. [Panels: DGGE and CDGE, signal against distance from the top of the gel (cm); CDCE, signal against time (minutes).] A 206-bp amplified labeled human mtDNA sequence with 112-bp low-melting and 94-bp high-melting domains was used in our model experiments. Of two variants of this sequence, one (designated GC) was identical to the wild type, while in the other (designated AT) the wild-type GC pair 30 bp deep into the low-melting domain was artificially replaced by an AT base-pair. To prepare a sample for separation, a mixture of GC and AT homoduplex sequences was boiled and reannealed, which created, by cross-hybridization, a pair of heteroduplexes, designated here GT and AC, according to the mismatches they bear. A single-stranded (ss) fragment was also included in the sample. DGGE and CDGE: The ³²P-labeled sample was run in slab gels according to refs. 40 and 47, respectively, under optimal conditions for the separation of the components. CDCE: The 5' fluorescein-labeled sample was run on a 75-µm capillary filled with 5.5% polyacrylamide in TBE (89 mM Tris-borate, pH 8.4, 1 mM EDTA) buffer at 63.5°C, 125 V/cm. One V·sec of peak area corresponds to about 10" DNA molecules. For more details, see ref. 13.

FIG. 6. CDCE separation as a function of temperature. [Panels: electropherograms at 31°C, 35°C, and 36°C, signal against time (minutes); peaks labeled GC, AT, GT, and AC.] The same sample as in Fig. 5, except for the absence of single-stranded DNA, was run on a capillary filled with 6% polyacrylamide, 3.3 M urea, 20% formamide in TBE buffer at 250 V/cm at the temperatures listed on each electrophoretogram. Peaks are labeled as in Fig. 5.
with the most unstable low-melting domains that contain mismatches. Fragment GT shows higher mobility than AC, since the melting equilibrium for GT is shifted toward the unmelted form as compared to the less stable AC. At 38°C, both AC and GT fragments appear to be almost purely in the melted state, so that their mobilities do not differ significantly and the corresponding peaks almost comigrate. By changing temperature, one may selectively improve the resolution within the narrower range of stabilities of particular interest. For example, the mutant homoduplexes, which may be both more and less stable than the wild type, are better resolved at 38°C, when the wild type is in the middle of the separation range. On the other hand, the heteroduplexes, all of which are much less stable than the wild type, are resolved at 36°C, when the wild type is still almost unmelted. It appears, therefore, that, given an unknown mixture of sequence variants to be separated, the only parameter one needs to know in advance is the melting temperature of the wild type, which can be roughly predicted by Lerman's algorithm (58) and further refined in test runs.
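The temperature dependence described here follows from treating the apparent mobility as a weighted average of the unmelted and partially melted forms (Section IV,A). The sketch below is a toy model under stated assumptions: a two-state van't Hoff melting transition, an arbitrary transition enthalpy, a characteristic length of 85 bases for the exponential mobility reduction, and invented melting temperatures for the four variants in denaturant. None of these numbers are taken from the measurements in Figs. 5 and 6.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol K)

# Hypothetical melting temperatures (deg C) of the low-melting domain in
# denaturant; chosen only so that GC > AT > GT > AC in stability.
ASSUMED_TM_C = {"GC": 38.0, "AT": 37.6, "GT": 35.6, "AC": 35.2}


def melted_fraction(temp_c, tm_c, delta_h=4.0e5):
    """Two-state fraction of molecules with the low-melting domain melted.

    delta_h is an assumed van't Hoff enthalpy in J/mol.
    """
    t = temp_c + 273.15
    tm = tm_c + 273.15
    return 1.0 / (1.0 + math.exp((delta_h / GAS_CONSTANT) * (1.0 / t - 1.0 / tm)))


def partially_melted_mobility(mu_helix, melted_len, length_scale=85.0):
    """Mobility falls exponentially with melted length (assumed length scale)."""
    return mu_helix * math.exp(-melted_len / length_scale)


def apparent_mobility(temp_c, tm_c, mu_helix=1.0, melted_len=112):
    """Weighted average of the unmelted and partially melted mobilities."""
    theta = melted_fraction(temp_c, tm_c)
    mu_melted = partially_melted_mobility(mu_helix, melted_len)
    return (1.0 - theta) * mu_helix + theta * mu_melted


def mobilities_at(temp_c):
    """Apparent mobility of each variant at one run temperature."""
    return {name: apparent_mobility(temp_c, tm) for name, tm in ASSUMED_TM_C.items()}


for temp in (25.0, 36.0, 45.0):
    row = ", ".join(f"{k}={v:.3f}" for k, v in mobilities_at(temp).items())
    print(f"{temp:.0f} C: {row}")
```

In this toy model the variants comigrate when the run temperature is well below or well above all melting temperatures and spread apart only in between, which mirrors the qualitative behavior described for Fig. 6 and explains why the wild-type melting temperature is the key parameter to know in advance.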
C. Detection of Low-Frequency Mutations by CDCE
The advantages of CDCE over DGGE and CDGE, discussed above, convinced us to choose it as the mutational spectrometry tool. The feasibility of CDCE for the detection of low-frequency mutations is illustrated by a model experiment aimed at detecting mutant PCR fragments that were admixed to wild-type fragments at fractions as low as 10⁻⁶. The experiment was based on the idea that, although the current efficiency of CDCE in enriching mutant sequences is about 10³, a procedure consisting of two sequential CDCE purifications might provide the necessary sensitivity. This principle was illustrated earlier using consecutive DGGE separations (10). In the course of the experiment, four mixtures (mutant fractions of 10⁻⁴, 10⁻⁵, and 10⁻⁶, and a negative control) were prepared from purified GC, GT, and AC fragments by sequential dilution and subjected to two CDCE purifications, each followed by PCR amplification. Presented in Fig. 7 are the CDCE separations that characterize the mixtures at each step. To make the picture simpler, only two of four separations are shown for each purification step: one in which the mutants may already be detected, and one with the next lower mutant fraction, in which the mutants cannot yet be seen. Figure 7A and B shows CDCE separations of the initial wild-type/mutant mixtures (mutant fractions 10⁻⁴ and 10⁻⁵, respectively). Note that the full scale of the two panels is only 1/10,000 of the wild-type peak height. This demonstrates the impressive dynamic range of CDCE, which in this case is 10⁴ within one run. Indeed, the wild-type peak in Fig. 7A contains 10⁹
FIG. 7. Detection of low-frequency mutations by CDCE. [Panels A to F: signal against time (minutes).] Samples taken at different stages of a reconstruction experiment were run through a capillary at 200 V/cm, 63.5°C, 5.5% polyacrylamide in TBE buffer. The amount of wild-type homoduplex (GC) was kept at about 10⁹ copies per sample, which corresponds to a peak 5 V high, which is far off-scale. Due to slight differences between the runs, the charts had to be aligned along the time axis to make the heteroduplexes coincide. Peaks are labeled as in Fig. 5. (A and B) Initial mixtures of purified heteroduplexes (GT) and (AC) and wild-type homoduplex (heteroduplex fractions of 10⁻⁴ and 10⁻⁵, respectively). (C and D) Mixtures after one CDCE/PCR cycle (heteroduplex fractions 10⁻⁵ and 10⁻⁶, respectively). X, An unidentified peak of PCR-associated noise (see text). (E and F) Mixtures after two CDCE/PCR cycles (fraction 10⁻⁶ and pure wild type, respectively).
copies, while mutant peaks (each of 10⁵ copies) are still well above the background noise. Critical to such a high sensitivity is the quality of the purified DNA, which should not contain any admixtures that may appear in the heteroduplex region of separation as noise peaks. In the first CDCE/PCR cycle of mutant purification, fractions that belong to the heteroduplex region between minutes 14 and 20 were collected, pooled, and amplified by Pfu DNA polymerase. The PCR reactions were subjected to a second CDCE separation, shown in Fig. 7C and D (for mutant fractions 10⁻⁵ and 10⁻⁶, respectively). The full scale of the panels is now
1/10 of the wild-type peak height. Two important conclusions may be derived from Fig. 7C and D. First, as measurements show, the mutant fraction of the 10⁻⁵ mixture was increased to more than 10⁻², which is more than a 10³-fold enrichment of mutant sequences. Since mutant peaks in the 10⁻⁶ mixture have not yet appeared above the background, 10⁻⁵ may be considered the detection limit for a single CDCE/PCR procedure. The enrichment at this step is limited by the carryover of wild-type sequences into those regions of separation where only mutants are supposed to be. Our preliminary observations (8) indicate that the carryover consists of two kinds of DNA molecules. Some of them fall behind the main peak for non-specific reasons, such as adsorption. The others may bear a chemical modification that destabilizes their low-melting domains. Second, the background (relative to the wild-type peak) in PCR-amplified samples is at least 100-fold above that in the initial mixtures of purified DNA fragments. Hence, PCR generates some fraction of “modified” DNA molecules, which show up in the heteroduplex region of separation. Some of these molecules may be the well-known true PCR-associated mutants (14) that result from polymerase mistakes. However, some of them are definitely of different origin, for example, peak X in Fig. 7C and D, which disappears in the next cycle (cf. Fig. 7E and F). The second cycle of purification was identical to the first one. CDCE separations of the resulting PCR reactions are shown in Fig. 7E and F, the full scale being 1/20 of the wild-type peak height. The enrichment of mutants at this cycle is only about 25-fold, which, however, is enough to detect mutants that originally were at a fraction of 10⁻⁶. The reason for such a low enrichment apparently is the aforementioned PCR-associated noise, which coelutes with mutant peaks and is amplified along with the original mutants in subsequent PCR cycles.
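The bookkeeping across the two CDCE/PCR cycles can be followed with a short calculation. This is a sketch only: the function names are illustrative, and the enrichment factors (10³ for the first cycle, about 25 for the second) are the rounded values quoted in the text, applied as mutant-to-wild-type ratio multipliers.

```python
def enrich(mutant_fraction, factor):
    """Mutant fraction after enriching mutants 'factor'-fold over wild type."""
    mutants = mutant_fraction * factor
    wild_type = 1.0 - mutant_fraction
    return mutants / (mutants + wild_type)


def run_cycles(initial_fraction, factors):
    """Mutant fraction before the first cycle and after each later cycle."""
    history = [initial_fraction]
    for factor in factors:
        history.append(enrich(history[-1], factor))
    return history


# Mixture that starts at 1e-6: first cycle about 1e3-fold, second about 25-fold.
history = run_cycles(1e-6, [1e3, 25])

for cycle, fraction in enumerate(history):
    print(f"after {cycle} cycle(s): mutant fraction = {fraction:.2e}")
```

Under these assumptions the 10⁻⁶ mixture reaches roughly 10⁻³ after the first cycle, still below what is visible against the PCR-associated background, and a few percent after the second, which agrees with the mutants first becoming detectable in Fig. 7E.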
D. Conclusion
The principles of mutant separation by electrophoresis of cooperatively melting DNA molecules under partially denaturing conditions have been used to develop a new separation approach, CDCE. It has been demonstrated that CDCE has several important advantages that make it the technique of choice for mutational spectrometry. Namely, it is a very rapid method of high resolution and high dynamic range. Combining two consecutive CDCE separations with intermediate PCR has provided a sensitivity of 10⁻⁶, which may be enough to detect mitochondrial mutations in human tissues. However, this result was achieved in a model system and it is still necessary to confirm that such a sensitivity can be reproduced on cellular DNA.
ACKNOWLEDGMENTS
We gratefully acknowledge John H. Hannekamp for communicating results and ideas prior to publication, Hilary Coller for critical reading of the manuscript, and Cindy Flannery for help in manuscript preparation.
REFERENCES 1. 2. 3. 4. 5. 6. 7. 8. 9. 10. 11. 12. 13.
14. 15. 16. 17. 18. 19. 20. 21. 22. 23. 24.
S. Benxer and E. Freese, PNAS 44, 112 (19.58). C. Coulondre and J. H. Miller. JMB 117, 577 (1977). P. Keohavong and W. C. Thilly, PNAS 89, 4623 (1992). A. R. Oller and W. G. Thilly, JMB 228, 813 (1992). A. A. Morley, K. J. Trainor, R. Seshadri and R. C. Ryall, Nature 303, 155 (1983). R. J. Albertini, Mufat. Res. 150,411 (1985). I. Kdin, S. Shephard and U. Candrian, Mutot. Res. 283, 119 (1992). J. S. Hanekamp, Ph.D. thesis. Massachusetts Institute of Technology, Cambridge, 1993. W. M. Brown, M. George, Jr., and A. C. Wilson, PNAS 76, 1967 (1979). W. G. Thilly and P. Keohavong, U.S. Patent 5,045,450 (1991). A. Kat, Ph.D. thesis. Massachusetts Institute of Technology, Cambridge, 1993. R. Cha, H. Zarbl, P. Keohavong and W. G. Thilly, PCR Methods Appl. 2, 14 (1992). K. Khrapko, J.S. Kanekamp. W. C. Thilly, A. Belenkii, F. Foret and B. L. Karger, NARes 22, 364 (1994). P. Keohavong and W. C. Thilly, PNAS 86,9253 (1989). K. Kleppe, E. Ohtsuka, R. Kleppe, I. Molineux and H. G. Khorana, ]MB !56,341(1971). R. K. Saiki, S. Scharf, F. Falwna, K. B. Mullis, G. T. Horn, H. A. Erlichand A. Arnheim, Science 230, 1350 (1985). C. D. K. Bottema and S. S. Sommer, Mutat. Res. 288, 93 (1993). C. Sarkar, J. Cassady, C. Bottema and S . Sommer, Anal. Biochem. 186,64 (1990). C. R. Newton, A. Graham, L. E. Heptinstd, S. J. Powell, C. Summers, N. Kalsheker, J. C. Smith and A. F. Markham, NARes 17, 2503 (1989). H. Okayama, D. T. Curiel, M. L. Brantly, M. D. Holmes and R. G. Crystid,]. Lab. C h . Med. 114, 105 (1989). D. Y. Wu, L. Ugozzoli, B. K. PI11 and R. B. Wdltule, PNAS 86, 2757 (1989). W. C. Nichols, J. J. Liepnieks, V. A. McKusick and M. D. Benson, Genomics 5,535 (1989). S. Li, J. L. Sobell and S. S. Sommer, Am. J . Hum.Genet. 50, 29 (1992). S. S. Sommer, J. D. Cassady, J. L.Sobell and C. D. K. Bottemil,Mayo Clin. Proc. 64,1361
(1989). 25. S. Kwok, D. E. Kellogg, N. McKinney, D. Spasic, L. Godaand J. J. Sninsky, NARes 18,999 (1990). 26. M.A. Nelson, B. W. Futocher, T. Kinsella, J. Wymer and C . T. Bowden, PNAS 89, 6398 (1992). 27. K. A. Eckert and T. A. Kunkel, NARes 18,3739 (1990). 28. L. L. Ling, P. Keohavong, C. Dim and W. G . Thilly. PCR Methodp Appl. 1,63 (1991). 29. B. J. Conner, A. A. Reyes, C. Morin, K. Itukura, R. L. Teplitz and R. B. Wallace, PNAS 80, 278 (1963). 30. E. Felley-Bosm, C. Poumrd. J. Zijlstra, P. Amstad and P. Cerutti, NARes 19,2913 (1991). 31. R. Kumar and M. Barbacid. Oncogene 3,647 (1988).
32. R. Kumar, S. Sukumar and M. Barbacid, Science 248, 1101 (1990).
33. S. M. Kahn, W. Jiang, T. A. Culbertson, I. B. Weinstein, G. M. Williams, N. Tomita and Z. Ronai, Oncogene 6, 1079 (1991).
34. M. S. Sandy, S. M. Chiocca and P. A. Cerutti, PNAS 89, 890 (1992).
35. S.-J. Lu and M. C. Archer, PNAS 89, 1001 (1992).
36. A. Haliassos, J. C. Chomel, L. Tesson, M. Baudis, J. Kruh, J. C. Kaplan and A. Kitzis, NARes 17, 3606 (1988).
37. G. G. Hillebrand and K. L. Beattie, JBC 260, 3116 (1985).
38. G. Hu, DNA Cell Biol. 12, 763 (1993).
39. P. Pfeiffer and G. Hu, in "Denaturant Gradient Gel Electrophoresis: A Laboratory Manual" (L. Lerman, ed.). In press. 1994.
40. S. G. Fischer and L. S. Lerman, PNAS 80, 1579 (1983).
41. K. Hayashi, PCR Methods Appl. 1, 34 (1991).
41b. G. Hu, H. Coller, X. Li and W. G. Thilly, in preparation.
42. B. Bandy and A. J. Davison, Free Radicals Biol. Med. 8, 523 (1990).
43. D. C. Wallace, Science 256, 628 (1992).
44. J. W. Shay and H. Werbin, Mutat. Res. 186, 149 (1987).
45. T. Tsuzuki, H. Nomiyama, C. Setoyama, S. Maeda and K. Shimada, Gene 25, 223 (1983).
45a. G. Hu and W. G. Thilly, Gene in press (1994).
46. S. G. Fischer and L. S. Lerman, Cell 16, 191 (1979).
47. E. Hovig, B. Smith-Sorensen, A. Brogger and A.-L. Borresen, Mutat. Res. 262, 63 (1991).
48. D. Poland, Biopolymers 13, 1859 (1974).
49. M. Fixman and J. J. Freire, Biopolymers 16, 2693 (1977).
50. B. R. Amirikyan, I. L. Vologodskii and Y. L. Lyubchenko, NARes 9, 5469 (1981).
51. R. D. Blake and J. R. Fresco, Biopolymers 12, 775 (1973).
52. O. Gotoh and Y. Tagashira, Biopolymers 20, 1033 (1981).
53. L. S. Lerman, S. G. Fischer, I. Hurley, K. Silverstein and N. Lumelsky, Annu. Rev. Biophys. Bioeng. 13, 399 (1984).
54. R. M. Myers, S. G. Fischer, L. S. Lerman and T. Maniatis, NARes 13, 3131 (1985).
55. W. G. Thilly, Carcinogenesis 10, 511 (1985).
56. R. M. Myers, T. Maniatis and L. S. Lerman, Methods Enzymol. 155, 501 (1987).
57. A. S. Cohen, D. R. Najarian, A. Paulus, A. Guttman, J. A. Smith and B. L. Karger, PNAS 85, 9660 (1988).
58. L. S. Lerman and K. Silverstein, Methods Enzymol. 155, 482 (1987).