[1] Sequencing of in Vitro Amplified DNA
By ULF B. GYLLENSTEN and MARIE ALLEN
Introduction
The polymerase chain reaction (PCR) 1,2 method for in vitro amplification of specific DNA fragments has opened up a number of fields in molecular biology that were previously inaccessible for lack of sufficiently sensitive analytical methods. The PCR is based on the use of two oligonucleotides to prime DNA polymerase-catalyzed synthesis from opposite strands across a region flanked by the priming sites of the two oligonucleotides. By repeated cycles of DNA denaturation, annealing of oligonucleotide primers, and primer extension, an exponential increase in copy number of a discrete DNA fragment can be achieved. Many applications of PCR, including diagnosis of heritable disorders, screening for susceptibility to disease, and identification of bacterial and viral pathogens, require determination of the nucleotide sequence of amplified DNA fragments. In this chapter we review alternative methods for the generation of sequencing templates from amplified DNA and sequencing by the method of Sanger. 3
Generation of Sequencing Template for Direct Sequencing
Traditionally, templates for DNA sequencing have been generated by inserting the target DNA into bacterial or viral vectors for multiplication of the inserts in bacterial host cells. These cloning methods have been simplified, but are still subject to inherent problems associated with the maintenance and use of systems dependent on living cells, such as de novo mutations in vector and host cell genomes. By using PCR, templates for sequencing can be generated more efficiently than with cell-dependent methods, either from genomic targets or from DNA inserts cloned into vectors. Amplification of cloned inserts of unknown sequence can be achieved using oligonucleotides that prime inside, or close to, the polylinker of the cloning vector. 2 Sequencing the PCR products directly has two advantages over sequencing of cloned PCR products. First, it is readily standardized because it is a simple enzymatic process that does not depend on the use of living cells. Second, only a single sequence needs to be determined for each sample (for each allele). By contrast, when PCR products are cloned, a consensus sequence based on several cloned PCR products must be determined for each sample, in order to distinguish mutations present in the original genomic sequence from random misincorporated nucleotides introduced by the Taq polymerase during PCR.
1 K. B. Mullis and F. Faloona, this series, Vol. 155, p. 335.
2 R. K. Saiki, D. H. Gelfand, S. Stoffel, S. J. Scharf, R. Higuchi, G. T. Horn, K. B. Mullis, and H. A. Erlich, Science 239, 487 (1988).
3 F. Sanger, S. Nicklen, and A. R. Coulson, Proc. Natl. Acad. Sci. U.S.A. 74, 5463 (1977).
METHODS IN ENZYMOLOGY, VOL. 218
Copyright © 1993 by Academic Press, Inc. All rights of reproduction in any form reserved.
Optimization of Polymerase Chain Reaction Conditions for Direct Sequencing
The ease with which clear and reliable sequences can be obtained by direct sequencing depends on the ability of the PCR primers to amplify only the target sequence (usually called the specificity of the PCR), and on the method used to obtain a template suitable for sequencing. The specificity of the PCR is to a large extent determined by the sequence of the oligonucleotides used to prime the reaction. For an individual pair of primers the specificity of the PCR can be optimized by changing the ramp conditions, the annealing temperature, and the MgCl2 concentration in the PCR buffer. A titration, in 0.2 mM increments, of MgCl2 concentrations from 1.0 to 3.0 mM in the final reaction is advised if the standard 1.5 mM concentration fails to produce the necessary specificity of the PCR. In cases in which optimization of PCR conditions fails to produce the desired priming specificity, either new oligonucleotides are required or the different PCR products can be separated by gel electrophoresis and reamplified individually for sequencing. When the PCR primers amplify several related sequences of the same length (for example, the same exon from several recently duplicated genes, or repetitive or conserved signal sequences), electrophoretic separation of the different products can be achieved in one of two ways: by the use of restriction enzymes that cut only certain templates, with subsequent gel purification of the intact PCR products, or by the use of an electrophoretic system (denaturing gradient gel electrophoresis, temperature gradient gel electrophoresis) that differentiates between the products on the basis of their nucleotide sequence difference. 4,5
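The MgCl2 titration described above can be laid out as a short pipetting table. The following Python sketch is only an illustration: the 1.0-3.0 mM range and 0.2 mM step come from the text, whereas the 100-µl reaction volume, the 25 mM MgCl2 stock, and the use of a magnesium-free buffer are assumptions made for the example, and the function names are hypothetical.

```python
def mgcl2_titration(start_mm=1.0, stop_mm=3.0, step_mm=0.2):
    """Final MgCl2 concentrations (mM) for the titration series."""
    steps = round((stop_mm - start_mm) / step_mm)
    # Round to one decimal so the series reads 1.0, 1.2, ..., 3.0.
    return [round(start_mm + i * step_mm, 1) for i in range(steps + 1)]


def stock_volume_ul(final_mm, reaction_ul=100.0, stock_mm=25.0):
    """Volume of MgCl2 stock needed to reach final_mm in the reaction.

    Assumes a magnesium-free PCR buffer and a 25 mM stock; both are
    illustrative assumptions, not values taken from the chapter.
    """
    return final_mm * reaction_ul / stock_mm


series = mgcl2_titration()
for conc in series:
    print(f"{conc:.1f} mM MgCl2: add {stock_volume_ul(conc):.1f} ul of stock")
```

With these assumptions the series contains 11 reactions; a different stock concentration or reaction volume only rescales the volumes.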
4 R. M. Myers, V. C. Sheffield, and D. R. Cox, in "Genome Analysis: A Practical Approach" (K. E. Davies, ed.), p. 95. IRL Press, Oxford, 1988.
5 V. C. Sheffield, D. R. Cox, L. S. Lerman, and R. M. Myers, Proc. Natl. Acad. Sci. U.S.A. 86, 232 (1989).
Double-Stranded DNA Templates
Many of the problems associated with direct sequencing of PCR products are not due to lack of specificity, but result from the ability of the two strands of the linear amplified product to reassociate rapidly after denaturation, thereby either blocking the primer-template complex from extending or preventing the sequencing oligonucleotide from annealing efficiently. 6 This problem is more severe for longer PCR products. To circumvent the strand reassociation of double-stranded DNA (dsDNA), a number of alternative methods have been developed.
Precipitation of Denatured DNA
Denature the template in 0.2 M NaOH for 5 min at room temperature, transfer the tube to ice, neutralize the reaction by adding 0.4 vol of 5 M ammonium acetate (pH 7.5), and immediately precipitate the DNA with 4 vol of ethanol. Resuspend the DNA in sequencing buffer and primer at the desired annealing temperature. 7
Snap-Cooling of Template DNA
Denature the template by heating (95°) for 5 min. Quickly freeze the tube by putting it in a dry ice-ethanol bath to slow down the reassociation of strands. Add sequencing primer either prior to or after denaturation and bring the reaction to the proper temperature. 8
Cycling of Polymerase Chain Reactions
A third method for generating enough sequencing template is to cycle the sequencing reaction, using Taq polymerase as the enzyme for both amplification and sequencing. Even though only a small fraction of the templates will be utilized in each round of extension-termination, the amount of specific terminations will accumulate with the number of cycles. 8-10
6 U. B. Gyllensten and H. A. Erlich, Proc. Natl. Acad. Sci. U.S.A. 85, 7652 (1988).
7 L. A. Wrischnik, R. G. Higuchi, M. Stoneking, H. A. Erlich, N. Arnheim, and A. C. Wilson, Nucleic Acids Res. 15, 529 (1987).
8 N. Kusukawa, T. Uemori, K. Asada, and I. Kato, BioTechniques 9, 66 (1990).
9 M. Craxton, Methods: Companion Methods Enzymol. 3, 20 (1991).
10 J.-S. Lee, DNA 10, 67 (1991).
Single-Stranded DNA Templates
Sequencing problems derived from strand reassociation can be avoided by preparing single-stranded DNA (ssDNA) templates by any of the following methods.
Strand-Separating Gels
Agarose strand-separating gels may be successfully employed to obtain ssDNA of fragments of more than about 500 bp. 11 This method is suitable primarily for long products, or where other methods may not give sufficient yields of ssDNA.
Blocking Primer Polymerase Chain Reaction
An alternative way of generating ssDNA in the PCR, without the inherently lower efficiency of an asymmetric PCR, is to use blocking primer PCR. In this method, an excess of a third primer that is complementary to one of the PCR primers is added during the PCR (after about 15-20 cycles). In each cycle the third oligonucleotide will outcompete the newly synthesized target molecules as annealing sites for the PCR primer and thereby prevent synthesis of one of the DNA strands. The PCR is thereby transformed at any suitable stage into a primer-extension reaction.
Solid-State Sequencing
In this procedure, one of the oligonucleotide primers is labeled with biotin prior to the PCR. After a balanced synthesis of dsDNA, the strands are denatured and put through a streptavidin-agarose column, 12 or mixed with magnetic beads to which streptavidin has been attached. 13 The strand labeled through the incorporated PCR primer will be bound to the solid support, and the unbound strand can be removed. The bound ssDNA is subsequently eluted for direct sequencing, or sequencing is performed with the templates still bound to the matrix. The magnetic beads do not interfere with the sequencing reagents, and can even be loaded on the sequencing gel without distorting the migration of termination products. The benefit of this method is that the reaction is cleaned up for sequencing at the same time as the ssDNA template is generated.
11 T. Maniatis, E. F. Fritsch, and J. Sambrook, "Molecular Cloning: A Laboratory Manual," p. 179. Cold Spring Harbor Press, Cold Spring Harbor, New York, 1982.
12 L. G. Mitchell and C. R. Merril, Anal. Biochem. 178, 239 (1989).
13 J. Wahlberg, J. Lundeberg, T. Hultman, and M. Uhlen, Proc. Natl. Acad. Sci. U.S.A. 87, 6569 (1990).
Exonuclease-Generated Single-Stranded DNA
In this procedure one of the oligonucleotide primers is treated with polynucleotide kinase to introduce a 5'-phosphate prior to the PCR. After a symmetric PCR, the products are exposed to λ 5' → 3'-exonuclease, and the strand containing the 5'-phosphorylated primer will be digested. The ssDNA is then purified from the reaction mix and used for sequencing. 14 The efficiency of this method in generating ssDNA depends to a large extent on the proportion of primers that have been successfully kinased.
Transcript Sequencing
A radically different approach for template generation is to combine PCR with in vitro transcription, using a phage promoter sequence attached to one of the PCR primers. 15 A standard PCR is performed initially to generate dsDNA. The PCR product is subsequently used in a transcription reaction that will yield a further increase in copy number of the desired single-stranded (RNA) template. This transcript is then sequenced using reverse transcriptase. Either a thermolabile reverse transcriptase, with a temperature range of 37-45°, or a thermostable recombinant reverse transcriptase (rTth; Perkin-Elmer Cetus, Norwalk, CT), with a temperature optimum of 75°, is available for the sequencing.
14 R. G. Higuchi and H. Ochman, Nucleic Acids Res. 17, 5865 (1989).
15 E. S. Stoflet, D. D. Koeberl, G. Sarkar, and S. S. Sommer, Science 239, 491 (1988).
Asymmetric Polymerase Chain Reaction
In this procedure an asymmetric, or unequal, ratio of the two amplification primers is used in the PCR 6 (Fig. 1). During the first 20-25 cycles dsDNA is generated, but when the limiting primer is exhausted ssDNA is produced for the next 5-10 cycles by primer extension. The accumulation of dsDNA and ssDNA during a typical amplification of a genomic sequence, using an initial ratio of 50 pmol of one primer to 0.5 pmol of the other primer in a 100-µl PCR, is shown in Fig. 2. The amount of dsDNA accumulates exponentially to the point at which the limiting primer is almost exhausted, and thereafter essentially stops. The ssDNA generation starts at about cycle 25, the point at which the limiting primer is almost depleted. Following a short (one or two cycles) initial phase of rapid increase, the ssDNA accumulates linearly, as expected when only one primer is present (primer extension). In general, a ratio of 50 pmol : 1-5 pmol for a 100-µl PCR will result in about 1-3 pmol of ssDNA after 30 cycles of PCR. The yield of ssDNA can be estimated by adding 0.1 µl of [α-32P]dCTP (3000 Ci/mmol) to the PCR and examining the reaction products on a gel. The ssDNA yield cannot be consistently quantified from staining with ethidium bromide, because the tendency of ssDNA to form secondary structures may vary between templates. However, we routinely obtain a qualitative estimate by assaying 10 µl on a 3% (w/v) NuSieve (FMC, Rockland, ME), 1% (w/v) regular agarose gel. The ssDNA is visible after the bromphenol blue has migrated about 2 cm as a discrete fraction migrating ahead of the dsDNA. If a ssDNA fraction is visible by ethidium staining, the asymmetric PCR contains enough material for one to four sequencing reactions.
[Fig. 1: schematic. Labels recovered from the figure: primers at 50 pmol and 1-5 pmol; 30-cycle PCR; 1-5 pmol dsDNA; 5 pmol ssDNA; sequencing reaction; PCR primer for sequencing; internal primer for sequencing.]
FIG. 1. The principle of asymmetric PCR. When the primer in limited concentration is exhausted, ssDNA is produced. The ssDNA produced can be sequenced using either the limiting PCR primer or an internal primer complementary to the ssDNA.
The overall efficiency of amplification is lower when an asymmetric primer ratio is used than when both primers are present in vast excess. This can usually be compensated for by increasing the number of PCR cycles. In addition, titrations may be needed to find the optimal primer ratio for each strand. An example of such a titration is shown in Fig. 3. In this case the most asymmetric ratios did not produce sufficient amounts of ssDNA. Instead, large amounts of high molecular weight, nonspecific PCR products were obtained. The optimal ratios for this primer pair were found to be 50 : 5 for one strand and 5 : 50 for the other. Low yields of ssDNA using the asymmetric PCR may reflect either too little of the limiting primer, preventing the accumulation of enough dsDNA as a template for the primer-extension reaction, or too much of the limiting primer, saturating the reaction with dsDNA before any ssDNA is produced.
The ssDNA generated can then be sequenced using either the PCR primer that is limiting or an internal primer, applying conventional protocols for incorporation sequencing or labeled primer sequencing. 16 The population of ssDNA strands produced should have discrete 5' ends but may be truncated at various points close to the 3' end due to premature termination of extension. However, for any primer used in the sequencing reaction, only full-length ssDNA can be recruited as template. The ssDNA of choice can be generated either directly in the original PCR, by using an asymmetric molar ratio of the two oligonucleotide primers, or in a second PCR with an excess of one PCR primer, using as target either a gel-purified fragment from an initial regular (symmetric) PCR or a 1/100 dilution of a previous symmetric PCR. 6,17 The asymmetric PCR has the advantage that, because the limiting primer is exhausted, there is no need to remove excess primers prior to initiating the sequencing reaction.
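The two phases described above (exponential accumulation of dsDNA until the limiting primer runs out, followed by linear accumulation of ssDNA) can be reproduced with a very simple bookkeeping model. The Python sketch below is a toy illustration, not a fit to the data in Fig. 2: the primer amounts (50 and 0.5 pmol) and the 43 cycles come from the text, while the starting template amount (5e-7 pmol) and the per-cycle efficiency (0.8) are assumed values chosen only for the example, and the function name is hypothetical.

```python
def simulate_asymmetric_pcr(excess_pmol=50.0, limiting_pmol=0.5,
                            template_pmol=5e-7, cycles=43, efficiency=0.8):
    """Toy model of asymmetric PCR.

    Returns a list of (cycle, dsDNA_pmol, ssDNA_pmol). The template
    amount and efficiency are illustrative assumptions.
    """
    ds, ss = template_pmol, 0.0
    history = []
    for cycle in range(1, cycles + 1):
        # Strands extended from the excess primer on existing duplexes.
        from_excess = min(ds * efficiency, excess_pmol)
        # Complementary strands, made only while limiting primer remains.
        from_limiting = min(from_excess, limiting_pmol)
        excess_pmol -= from_excess
        limiting_pmol -= from_limiting
        ds += from_limiting                # matched strands give new duplexes
        ss += from_excess - from_limiting  # unmatched strands remain single
        history.append((cycle, ds, ss))
    return history


history = simulate_asymmetric_pcr()
for cycle, ds, ss in history[4::6]:
    print(f"cycle {cycle:2d}: dsDNA {ds:.3g} pmol, ssDNA {ss:.3g} pmol")
```

In this sketch the dsDNA can never exceed the starting template plus the amount of limiting primer, and once that primer is gone each cycle adds the same amount of ssDNA. The model ignores plateau effects, strand reassociation, and nonspecific products, all of which matter in practice.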
16 U. Gyllensten, in "PCR Technology: Principles and Applications for DNA Amplification" (H. A. Erlich, ed.), p. 45. Stockton Press, New York, 1989.
17 T. D. Kocher, W. K. Thomas, A. Meyer, S. V. Edwards, S. Pääbo, F. X. Villablanca, and A. C. Wilson, Proc. Natl. Acad. Sci. U.S.A. 86, 6196 (1989).
[Fig. 2: gel and blots, panels a-c, lanes 1-14, with the positions of dsDNA and ssDNA marked.]
FIG. 2. The accumulation of PCR products during an asymmetric PCR. A 242-bp product from the second exon of the HLA-DQA1 gene was amplified using primers GH26 and GH27. 6 Lanes 1 and 14 contain the size standard φX174 cut with HaeIII. Lanes 2-13 contain samples amplified for 5, 10, 13, 16, 19, 25, 28, 31, 34, 37, 40, and 43 cycles, respectively. (a) Genomic DNA was amplified with 50 pmol of primer GH26 and 0.5 pmol of primer GH27. (b) Southern blot of the agarose gel hybridized with an oligonucleotide complementary to both the dsDNA and ssDNA. (c) Same blot reprobed with an oligonucleotide with the same sequence as the ssDNA generated.
[Fig. 3: autoradiograph. Lane labels (primer ratios): 50/1, 50/2, 50/3, 50/4, 50/5, 1/50, 2/50, 3/50, 4/50, 5/50, 50/50.]
FIG. 3. Titration of optimal primer concentrations in the asymmetric PCR. Exon 13 of the human CFTR gene [J. R. Riordan, J. M. Rommens, B.-S. Kerem, N. Alon, R. Rozmahel, Z. Grzelczak, J. Zielenski, S. Lok, N. Plavsic, J.-L. Chou, M. L. Drumm, M. C. Iannuzzi, F. S. Collins, and L.-C. Tsui, Science 245, 1066 (1989)] was amplified using primer A (5'-CTGTGTCTGTAAACTGATGGCTA-3') and primer B (5'-GTCTTCTTCGTTAATTTCTTCAC-3'). The PCR mix included 0.1 µl [α-32P]dCTP (3000 Ci/mmol); the reaction products were separated on a 3% NuSieve, 1% regular agarose gel, and the gel was dried and autoradiographed.
Protocol for Generation of Templates by Asymmetric Polymerase Chain Reaction
This protocol is suitable for generation of templates from a previous successful symmetric PCR.
1. Mix 80 µl distilled H2O, 10 µl 10× PCR buffer (500 mM KCl, 100 mM Tris, pH 8.3, 15 mM MgCl2), 5 µl premixed primers (50 pmol of one primer and 1-5 pmol of the other primer in a total of 5 µl), 5 µl mix of nonionic detergents [10% (v/v) each of Nonidet P-40 (NP-40) and Tween 20], 0.8 µl deoxynucleoside triphosphate (dNTP) mix (25 mM with respect to each dNTP), 2.5 units Taq polymerase, and 2 drops of mineral oil.
2. Dilute the previous symmetric PCR 1/100.
3. Add 1 µl of diluted PCR to the asymmetric PCR mix and cap the tubes.
4. Run 40 PCR cycles.
5. After completion of PCR, assay for the presence of single-stranded DNA by running out 10 µl of the reaction on a 3% NuSieve, 1% regular agarose gel. Run the bromphenol blue about 2 cm into the gel before examining the fluorescence. A successful reaction should have two bands, the ssDNA migrating slightly ahead of the dsDNA.
6. If ssDNA can be seen, remove the oil from the rest of the PCR by a single chloroform extraction.
7. Remove the buffer components and residual dNTPs from the ssDNA templates using centrifuge-driven dialysis [either Centricon 30 (Amicon, Danvers, MA) or Millipore (Bedford, MA)]. Collect the retentate (40 µl).
8. Use 10-25 µl for the sequencing reaction. [As an alternative to dialysis, precipitate the DNA in 4 M ammonium acetate to remove excess dNTPs and buffer components. Combine 100 µl PCR reaction and 100 µl 4 M ammonium acetate and mix. Add 200 µl 2-propanol, mix, leave at room temperature for 10 min, and then spin for 10 min. Remove the supernatant and wash the pellet carefully with propanol, mix, leave at room temperature for 10 min, and then spin for 10 min. Remove the supernatant and wash the pellet carefully with 500 µl 70% (v/v) ethanol. Dry down the pellet and dissolve in 10 µl TE (10 mM Tris-HCl, pH 7.5, 0.5 mM EDTA) buffer.]
Direct Sequencing with T7 DNA Polymerase
The sequencing protocol consists of two steps: labeling and termination.
1. Use 20-60% of the PCR reaction (purified) in a total volume of 7 µl.
2. Add 2 µl 5× sequencing buffer (1×: 40 mM Tris-HCl, pH 7.5, 20 mM MgCl2, 50 mM NaCl).
3. Add 1 µl (1-10 pmol) sequencing primer (in an asymmetric PCR use either the limiting primer or an internal primer complementary to the ssDNA generated).
4. Heat the primer-template mix to 65°, leave for 4 min, and then allow it to cool to 30° over a period of 5 min.
5. Mix 2 µl labeling mix with 50 µl distilled water. When the yield of ssDNA template is low the labeling mix can be diluted to 1 : 100. [Note: The undiluted labeling mix is 750 µM (with respect to dTTP, dCTP, and dGTP) and lacks dATP.]
6. Add 1 µl of 0.1 M dithiothreitol (DTT) to the primer-template mix.
7. Add 2 µl of diluted labeling mix to the primer-template mix.
8. Add 0.5 µl of [α-35S]thio-dATP (>1000 mCi/mmol).
9. Dilute T7 DNA polymerase to 1.6 units/µl in 7 µl enzyme dilution buffer [enzyme dilution buffer: 10 mM Tris-HCl, pH 7.5, 5 mM DTT, 0.5 mg/ml bovine serum albumin (BSA)].
10. Add 2.0 µl of diluted T7 DNA polymerase (3.2 units).
11. Incubate the mixture at room temperature for 5 min.
12. Add 3.5 µl of the labeling reaction to each of the four tubes, or wells of a microtiter plate, containing 2.5 µl of each termination mix [each containing 80 µM concentrations of each dNTP and an 8 µM concentration of the appropriate dideoxynucleoside triphosphate (ddNTP)], and incubate the reaction at 37° for 5 min.
13. Stop the reaction by adding 4 µl formamide-dye stop solution [90% (v/v) formamide, 20 mM ethylenediaminetetraacetic acid (EDTA), pH 8.0, and 0.05% (v/v) each of the dyes xylene cyanol and bromphenol blue].
14. Store the reaction at -20° until loading onto a sequencing gel.
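The volumes in the two protocols above can be cross-checked with a few lines of arithmetic. The Python sketch below is an illustration only: all volumes are taken from the protocol steps, but the volume of Taq polymerase is not stated in the protocol and is therefore left out (an assumption that it is negligible), the mineral oil is treated as a separate phase, and the helper name final_conc is hypothetical.

```python
def final_conc(stock_conc, stock_ul, total_ul):
    """Concentration after diluting stock_ul of stock into total_ul."""
    return stock_conc * stock_ul / total_ul


# Asymmetric PCR mix: step 1 plus the 1 ul of diluted PCR from step 3.
# Taq polymerase volume is unspecified in the protocol and is ignored here.
ASYM_MIX_UL = {
    "distilled water": 80.0,
    "10x PCR buffer": 10.0,
    "primer premix": 5.0,
    "detergent mix": 5.0,
    "dNTP mix": 0.8,
    "diluted symmetric PCR": 1.0,
}
asym_total_ul = sum(ASYM_MIX_UL.values())
mgcl2_mm = final_conc(15.0, ASYM_MIX_UL["10x PCR buffer"], asym_total_ul)
dntp_um = final_conc(25_000.0, ASYM_MIX_UL["dNTP mix"], asym_total_ul)

# T7 labeling reaction: steps 1-3, 6-8, and 10 of the sequencing protocol.
T7_LABELING_UL = {
    "template": 7.0,
    "5x sequencing buffer": 2.0,
    "sequencing primer": 1.0,
    "0.1 M DTT": 1.0,
    "diluted labeling mix": 2.0,
    "[35S]thio-dATP": 0.5,
    "diluted T7 polymerase": 2.0,
}
t7_total_ul = sum(T7_LABELING_UL.values())
labeling_mix_dilution = (2.0 + 50.0) / 2.0   # step 5: 2 ul into 50 ul water
t7_units = 1.6 * T7_LABELING_UL["diluted T7 polymerase"]
dispensed_ul = 4 * 3.5                        # step 12: four termination tubes

print(f"asymmetric PCR volume: {asym_total_ul:.1f} ul")
print(f"final MgCl2: {mgcl2_mm:.2f} mM, each dNTP: {dntp_um:.0f} uM")
print(f"T7 labeling volume: {t7_total_ul:.1f} ul, dispensed: {dispensed_ul:.1f} ul")
print(f"labeling mix dilution: 1:{labeling_mix_dilution:.0f}, T7 units: {t7_units:.1f}")
```

Under these assumptions the PCR mix comes out close to the standard 1.5 mM MgCl2 and roughly 200 µM of each dNTP, and the labeling reaction is large enough to supply the four termination reactions.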
Direct Sequencing with Taq Polymerase
Taq polymerase is well suited to DNA sequencing because it has high processivity and no detectable 3' → 5'-exonuclease activity, which help to avoid false terminations. 18 In addition to these properties, which it shares with the thermolabile T7 DNA polymerase, it permits reaction temperatures between 55 and 85°, which will melt the secondary structure of most templates.
Protocol for Sequencing of Amplified DNA Using Taq Polymerase
1. In a 0.5-ml microfuge tube, prepare one labeling reaction mixture per sample by adding in the following order: 4 µl distilled H2O, 1 µl sequencing primer (1 pmol/µl), 1 µl [α-35S]thio-dATP (>1000 mCi/mmol), 4 µl labeling mix (the labeling mix contains 0.57 units/µl Taq DNA polymerase, 0.86 µM dGTP, 0.86 µM dCTP, 0.86 µM dTTP, 143 mM Tris-HCl, pH 8.8, 20 mM MgCl2), and 10 µl DNA template.
2. Cap the tube and mix.
3. Incubate the tube for 5 min at 45°.
4. Dispense 4 µl of the labeling reaction into each of four tubes, or wells of one microtiter plate, with 4 µl of the four termination mixes A, T, C, and G (G termination mix: 20 µM dGTP, 20 µM dATP, 20 µM dTTP, 20 µM dCTP, 60 µM ddGTP; A termination mix: 20 µM dGTP, 20 µM dATP, 20 µM dTTP, 20 µM dCTP, 800 µM ddATP; T termination mix: 20 µM dGTP, 20 µM dATP, 20 µM dTTP, 20 µM dCTP, 1200 µM ddTTP; C termination mix: 20 µM dGTP, 20 µM dATP, 20 µM dTTP, 20 µM dCTP, 400 µM ddCTP).
5. Cap the tubes and incubate at 72° for 5 min.
6. Remove the plate or tubes and add 4 µl stop solution (see above) to all samples.
7. Cover the plate or cap the tubes. If the samples cannot be analyzed immediately, they can be stored up to 1 week at -20°.
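The four Taq termination mixes in step 4 differ only in the ratio of dideoxynucleotide to the matching deoxynucleotide. The short Python sketch below tabulates these ratios and checks that the labeling reaction covers the four aliquots. It uses only the concentrations and volumes listed in the protocol and ignores the small amount of dNTP carried over from the labeling mix, which is a simplifying assumption; the names are hypothetical.

```python
# Stock concentrations (uM) in each termination mix, from step 4.
TERMINATION_MIXES_UM = {
    "G": {"dNTP": 20.0, "ddNTP": 60.0},
    "A": {"dNTP": 20.0, "ddNTP": 800.0},
    "T": {"dNTP": 20.0, "ddNTP": 1200.0},
    "C": {"dNTP": 20.0, "ddNTP": 400.0},
}


def dd_to_d_ratio(mix):
    """Ratio of ddNTP to the matching dNTP in one termination mix."""
    return mix["ddNTP"] / mix["dNTP"]


ratios = {base: dd_to_d_ratio(mix) for base, mix in TERMINATION_MIXES_UM.items()}

# Labeling reaction volume from step 1: water, primer, label, mix, template.
labeling_total_ul = 4.0 + 1.0 + 1.0 + 4.0 + 10.0
dispensed_ul = 4 * 4.0  # step 4: 4 ul into each of four termination mixes

for base, ratio in ratios.items():
    print(f"{base} mix: ddNTP/dNTP = {ratio:.0f}")
print(f"labeling reaction {labeling_total_ul:.0f} ul, dispensed {dispensed_ul:.0f} ul")
```

The ratios span a 20-fold range (3 for G up to 60 for T), in contrast to the uniform 1 : 10 ddNTP : dNTP ratio quoted for the T7 termination mixes.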
18 M. A. Innis, K. B. Myambo, D. H. Gelfand, and M. A. D. Brow, Proc. Natl. Acad. Sci. U.S.A. 85, 9436 (1988).
Sequencing of Regions with Strong Secondary Structure
Regions of DNA with strong secondary structure may give rise to two problems: (1) low efficiency of the PCR, due to a high frequency of templates that are not being fully extended by the Taq polymerase, and (2) compression of the DNA sequences in the sequencing reactions. It appears that the high reaction temperature of PCR using Taq polymerase (50-75°) should be sufficient to resolve most short secondary structures. However, strong inhibition of more complex regions has been observed, and efficient PCR of these can be achieved only after the addition of the base analog c7dGTP in the appropriate ratio relative to dGTP. 19 Similarly, base analogs may have to be used in the sequencing reactions to avoid compression problems. Taq polymerase will incorporate c7dGTP but not inosine efficiently. 18
Direct Sequencing of Heterozygous Individuals
When two alleles differ by a single point mutation, direct sequencing using a PCR primer will display the heterozygous position. However, when the allelic templates differ by more than one mutation, direct sequencing will not resolve the phase of the mutations. In addition, the presence of short insertions or deletions in one of the alleles will generate compound sequencing ladders. There are four ways to resolve the phase of point mutations and obtain sequences of individual alleles from heterozygotes: (1) separating the alleles by cloning, (2) separating the different templates on the basis of their nucleotide sequence prior to sequencing, using a gradient gel electrophoretic system, (3) priming only one allele in the sequencing reaction, and (4) amplifying only one allele at a time. 6 Approaches 3 and 4 are applicable only to loci where the sequence of some of the alleles is known. In the sequencing reaction, oligonucleotides made to known allele-specific regions are used to selectively prime only one of the two allelic templates in a heterozygote.
Errors Involved in Sequencing of Polymerase Chain Reaction Products
Individual PCR products can differ from the sequence to be amplified by point mutations (Fig. 4) and by events of in vitro recombination in the PCR.
Based on a fidelity assay for phage M13, the frequency of base substitution errors (1/10,000) and frameshift errors (1/40,000) of Taq polymerase was found to be considerably higher than for Klenow polymerase (1/29,000 base substitution errors, 1/65,000 frameshift errors) and T4 DNA polymerase (1/160,000 base substitution errors, 1/280,000 frameshift errors). 20 These assays were not performed under the same conditions as a standard PCR, and because the processivity and rate of synthesis by DNA polymerase are affected by MgCl2 and dNTP concentration, buffer components, and the temperature profile of the cycle, these absolute numbers may not apply directly to PCR. The error rate in the PCR, estimated by sequencing of individual PCR products after 30 cycles (starting with 100-1000 ng of genomic target DNA), suggested that two random PCR products may be expected to differ once every 400-4000 bp.
19 L. McConlogue, M. A. D. Brow, and M. A. Innis, Nucleic Acids Res. 16, 9869 (1988).
20 K. R. Tindall and T. A. Kunkel, Biochemistry 27, 6008 (1988).
[Fig. 4: sequencing ladders; lanes G, A, T, C for T7 and G, A, T, C for Taq.]
FIG. 4. Comparison of the sequencing ladders obtained by sequencing of asymmetric PCR templates with either T7 DNA polymerase (left four lanes) or Taq DNA polymerase (right four lanes). A portion (450 bp) of the human mitochondrial D loop [S. Anderson, A. T. Bankier, B. G. Barrell, M. H. L. de Bruijn, A. R. Coulson, J. Drouin, I. C. Eperon, D. P. Nierlich, A. Roe, F. Sanger, P. H. Schreier, A. J. H. Smith, R. Staden, and I. G. Young, Nature (London) 290, 457 (1981)] was amplified using the primers UG142 (5'-GGTCTATCACCCTATTAACCAC-3') and UG143 (5'-CTGTTAAAAGTGCATACCGCCA-3') and sequenced using UG142. The arrows indicate the location of point mutational differences between the two individuals.
The mosaic, or in vitro recombinant, PCR products are the result of partially extended DNA strands that can act as primers on other allelic templates in later cycles. Both of these artifact products are likely to accumulate primarily at the end point of PCR because of insufficient enzyme to extend all available templates and an abundance of DNA strands for annealing. These artifact products have been seen primarily in studies of highly degraded DNA, or in studies of archaeological remains. 21,22 In PCR analyses of high molecular weight samples, these products are likely to constitute less than 1% of all templates. Both these types of errors must be considered when PCR products are cloned and allelic sequences inferred from individual PCR products. In direct sequencing, by contrast, these artifact PCR products will not be visible against the consensus sequence on the gel. Even when starting from a single DNA copy, such as that found in a single sperm, a misincorporation that arises in the first PCR cycle will appear only with, at the most, 25% of the intensity of the consensus nucleotide, given that all templates have an equal probability of being replicated. 6 Thus, direct sequencing is to be preferred, unless the primer sequences do not allow sufficient specificity to amplify only a single target, or the individual allelic sequences cannot be determined due to genetic polymorphism at multiple positions between the primers. The relatively high error rate of Taq polymerase may, however, create problems when individual products are to be used for expression studies, or analysis of mutation frequencies.
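The 25% figure quoted above follows from counting strands. The Python sketch below is a minimal illustration under idealized assumptions that are not part of the original text: a single double-stranded starting molecule, every strand copied exactly once per cycle, and exactly one misincorporation on one newly made strand in the chosen cycle. The function name is hypothetical.

```python
from fractions import Fraction


def mutant_fraction(error_cycle, total_cycles):
    """Fraction of strands carrying a misincorporation made in error_cycle.

    Idealized: one duplex at the start, every strand copied once per
    cycle, and a single error on one new strand in error_cycle.
    """
    wild, mutant = 2, 0  # two strands of the single starting duplex
    for cycle in range(1, total_cycles + 1):
        # Each strand templates one new strand of the same status.
        new_wild, new_mutant = wild, mutant
        if cycle == error_cycle:
            new_wild -= 1    # one copy of a correct strand goes wrong
            new_mutant += 1
        wild += new_wild
        mutant += new_mutant
    return Fraction(mutant, wild + mutant)


for k in (1, 2, 3):
    print(f"error in cycle {k}: mutant fraction {mutant_fraction(k, 30)}")
```

Under these assumptions an error in the first cycle ends up in one of every four strands, and an error arising in any later cycle is halved again for each cycle of delay, which is why such errors stay below the consensus signal in direct sequencing.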
Unless a population of linear PCR products can be used in the expression system, several molecules must be cloned and sequenced to identify the unmodified clones.
Acknowledgments
U.B.G. was supported by a Fellowship from the Knut and Alice Wallenberg Foundation and a grant from the Swedish Natural Science Research Council.
21 S. Pääbo, J. A. Gifford, and A. C. Wilson, Nucleic Acids Res. 16, 9775 (1988).
22 S. Pääbo, Proc. Natl. Acad. Sci. U.S.A. 86, 1939 (1989).