ANALYTICAL BIOCHEMISTRY 154, 353-360 (1986)

Specific-Primer-Directed DNA Sequencing

ERICH C. STRAUSS, JOAN A. KOBORI, GERALD SIU, AND LEROY E. HOOD
Received November 29, 1985

A simple and rapid strategy for DNA sequence analysis based on the Sanger chain-termination method is described. This procedure utilizes full-sized inserts of 1 to 4 kb of DNA cloned into M13 bacteriophage vectors. After the sequence of the first 600-650 bp of the insert DNA has been determined with the commercially available universal vector primer, a specific oligonucleotide is synthesized utilizing the sequence data obtained from the 3' end of the sequence and used as a primer to extend the sequence analysis for another 600-650 nucleotides. Additional primers are synthesized in a similar manner until the nucleotide sequence of the entire insert DNA has been determined. General guidelines for the selection of oligonucleotide length and composition and the use of unpurified primers are discussed. The use of the specific-primer-directed approach to dideoxynucleotide sequence analysis, in association with highly purified single-stranded template DNA, reduces considerably the time required for the analysis of large segments of DNA. © 1986 Academic Press, Inc.

KEY WORDS: dideoxynucleotide sequencing; specific primers; CsCl-banded M13 phage.
The dideoxynucleotide chain-termination sequencing procedure of Sanger et al. (1-3) is a widely used method for determining the nucleotide sequence of a single-stranded cloned fragment of DNA. The chain-termination sequencing procedure is based on the enzymatic synthesis of a radioactive complementary strand from the single-stranded template DNA utilizing an oligonucleotide primer to initiate synthesis and dideoxynucleotides to randomly terminate synthesis. The primer, annealed to a complementary region of the M13 vector directly flanking the cloned DNA, is extended by the large fragment of Escherichia coli DNA polymerase I, which synthesizes the complementary strand of the cloned DNA adjacent to the primer site. The resulting reaction products are then separated by electrophoresis on polyacrylamide gels.

Although the dideoxynucleotide sequencing method has been used successfully, limitations exist with the strategies that have been developed to sequence large regions of DNA. The random-cloning procedure uses different frequent-cutting enzymes to generate smaller overlapping fragments of the region to be sequenced. The fragments are subcloned into the M13 vector, sequenced, and analyzed by computer for overlaps in order to reconstruct the entire DNA sequence (4,5). This method requires the generation and sequencing of many clones to obtain the complete DNA sequence, which can be very time consuming. A second strategy, the partial-nuclease-digestion method, involves the use of DNase I to generate overlapping deletion recombinant M13 clones (6,7). In this method, the complete region is cloned into the M13 vector, and shorter inserts generated at random are obtained. This places the primer site of the M13 vector adjacent to different positions within the original insert, permitting the sequencing of regions that are many base pairs away from the original primer site. Similar methods using exonucleases such as Bal31 have also been described (8). Although these procedures do not require the generation of as many clones as the random-cloning method, a large number are still required, and the clone construction is often difficult.

0003-2697/86 $3.00. Copyright © 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.
In this paper we describe a simple, rapid strategy for sequencing large regions of DNA directly utilizing the dideoxynucleotide chain-termination method. A large stretch of DNA is cloned into the bacteriophage vector M13 and sequenced initially in both orientations using the universal vector primer. Oligonucleotides homologous with the 3' end of these sequences are then synthesized and used to extend the sequence farther. In this manner, we have been able to obtain over 600 nucleotides of sequence per primer. We have analyzed the requirements of length, base composition, and purity of the specific primers and found that the length of the primer, but not the composition, affects the quality of the sequencing reactions. Other recent papers have also reported the use of specific oligonucleotide primers for rapid DNA sequence determination (9,10).

MATERIALS AND METHODS
Oligonucleotides. Specific oligonucleotide primers were synthesized in the Caltech Microchemical Facility (11) with an Applied Biosystems 380A synthesizer. Oligonucleotides were purified on vertical denaturing 20% acrylamide gels (19% acrylamide, 1% N,N'-methylene bisacrylamide, 7 M urea; 40 cm × 15.5 cm × 4 mm with 2.5-cm-wide gel slots) in Tris-borate buffer (TBE: 90 mM Tris, 100 mM boric acid, 1 mM EDTA,¹ pH 8.3). A denaturing 5% acrylamide stacking gel was used to facilitate the removal of the comb. Approximately 0.2 µmol of oligonucleotide were denatured at 90°C for 3 min in a formamide-dye mix (90% deionized formamide, 10 mM EDTA, 10 mM NaOH, 0.[illegible]% xylene cyanol, 0.2% bromphenol blue) and loaded onto the gel. Electrophoresis was carried out at 500-600 V for 16-24 h, until the bromphenol blue dye migrated 40 cm. The gel was then transferred to plastic wrap. The oligonucleotides were visualized by placing the wrapped gel onto a silica gel 60F-254 precoated thin-layer chromatography plate (EM Reagents) and shining a short-wave (254 nm) ultraviolet light on it. The full-sized oligonucleotide and shorter partial-synthesis products were easily visualized. The band corresponding to the full-sized oligonucleotide was cut out with a razor blade, crushed with a glass rod, and eluted for 3 h with agitation at 37°C in 1.5-3 ml of 0.5 M ammonium acetate, 10 mM magnesium acetate, 1 mM EDTA, and 0.1% sodium dodecyl sulfate. The acrylamide was removed by centrifugation through a disposable Quik-Sep column with plastic filter disk (Isolab). The supernatant was then extracted with isobutanol repeatedly until the volume was approximately 200 µl. This was then desalted by gel filtration in a 10-ml column with Sephadex G-25 (medium) in water. One-half-milliliter fractions were collected, and the fractions containing oligonucleotide were identified by absorbance at 260 nm. The fractions containing the DNA were then lyophilized in a Speed Vac Concentrator (Savant) and the pellet was resuspended in water to a final concentration of 10 ng per microliter. If desired, the purity of the oligonucleotide primers after synthesis and purification can be tested. Unpurified and purified samples of the oligonucleotide primers were labeled on the 5' end using [γ-32P]ATP and T4 polynucleotide kinase (12) and analyzed on a 20% urea-acrylamide gel (Fig. 1).

¹ Abbreviations used: bp, base pairs; kb, kilobases; EDTA, ethylenediaminetetraacetic acid; A600, absorbance at 600 nm; TE, 10 mM Tris, 1 mM EDTA, pH 8.0; PEG, polyethylene glycol 8000; dATP, 2'-deoxyadenosine triphosphate; dCTP, 2'-deoxycytosine triphosphate; dGTP, 2'-deoxyguanosine triphosphate; dTTP, 2'-deoxythymidine triphosphate; ddNTP, 2',3'-dideoxynucleoside triphosphate; ddATP, 2',3'-dideoxyadenosine triphosphate; ddCTP, 2',3'-dideoxycytosine triphosphate; ddGTP, 2',3'-dideoxyguanosine triphosphate; ddTTP, 2',3'-dideoxythymidine triphosphate; MHC, major histocompatibility complex; cDNA, complementary DNA.

Preparation of template DNA for sequencing.
Highly purified single-stranded M13 DNA template for sequencing was prepared as follows: 100 ml of an early log-phase culture (A600 = 0.2) of JM101 or JM103 cells (13) are infected with 10^[exponent illegible] M13 phage and incubated with agitation for 12-18 h at 37°C. The bacterial cells are removed by centrifugation at 7000g for 10 min and the M13 phage in the supernatant are concentrated by precipitation in 0.5 M NaCl and 3% PEG. After 30 min at room temperature, the phage are pelleted by centrifugation at 12,000g for 20 min, 4°C. The phage are resuspended in approximately 5 ml of TE buffer and centrifuged again at 17,000g for 10 min, 4°C, to remove residual PEG. The concentrated phage stock is adjusted to 10.0 g with TE, followed by the addition of 4.2 g of ultrapure CsCl (BRL). The sample is centrifuged in a Beckman 50Ti rotor at 145,000g, 24 h, 30°C. The purified phage are removed from the CsCl, dialyzed extensively against TE buffer, extracted twice with phenol saturated with Tris-HCl, pH 8, and three times with diethyl ether. The resulting DNA is precipitated from solution by the addition of 1/10th volume of 3 M sodium acetate (pH 5.0) and 2.5 vol of absolute ethanol and stored at -20°C overnight. The DNA is then pelleted by centrifugation and resuspended in TE buffer to a final concentration of approximately 300 µg per milliliter.

FIG. 1. Purified and unpurified oligonucleotide primers analyzed on a 20% acrylamide gel after kinase labeling (12). Lanes: Vβ2.12, Vβ2.16, Vβ2.20, Vβ9.16, Vβ10.16.

DNA sequencing reactions. Our DNA sequencing protocol is a modification of that described by Sanger et al. (1,2). We have separated reaction components to facilitate the varying of reaction parameters (DNA, primer, nucleotide concentrations, and radioactive nucleotide). Approximately 2.5 µg of template DNA (3-10 µl), 10 ng of specific oligonucleotide primer (1 µl), and 1 µl of 10X buffer (10X = 0.5 M NaCl, 0.07 M Tris-HCl, pH 7.5, 0.1 M MgCl2, 0.03 M dithiothreitol) are brought to 14 µl with water in a 1.5-ml microcentrifuge tube. The mixture is heated at 55-60°C for 10 min and allowed to cool for 30 min at room temperature. This procedure results in the annealing of the primer to the template DNA.

During the annealing step, 1 µl of the appropriate deoxynucleotide mixture (A, C, G, T) is added to each of four chilled microcentrifuge tubes on ice. These mixtures are prepared from TE buffer and 0.5 mM stocks of dATP, dCTP, dGTP, and dTTP (PL Biochemicals), and vary depending on which radioactive nucleotide is used. Either 32P-labeled or 35S-labeled nucleotides may be used; the procedure for using 32P-labeled nucleotides will be described. For [α-32P]dATP, the deoxynucleotide ratios (where N = dCTP:dGTP:dTTP:TE buffer) are A = 20:20:20:20, C = 1:20:20:20, G = 20:1:20:20, and T = 20:20:1:20. The dideoxynucleotides are diluted in water from 0.5 mM stocks prior to each sequencing experiment. The dideoxynucleotide concentrations we use for these reactions are 0.25 mM ddATP, 0.20 mM ddCTP, 0.07 mM ddGTP, and 0.40 mM ddTTP (PL Biochemicals). One microliter of each ddNTP is added to the appropriate tube on ice. We have found that the deoxynucleotide mixtures and the 0.5 mM dideoxynucleotide solutions are reliable for months if refrozen immediately on dry ice and stored at -20°C. Concentrated (10 mM) stocks of nucleotides are stored in aliquots at -70°C. Two microliters of [α-32P]dATP (Amersham, 2400 Ci/mmol, aqueous) and 2 µl of
the large fragment of DNA polymerase I (Boehringer-Mannheim, 5 units/µl) are added to the microcentrifuge tube containing the annealed template DNA-primer solution and mixed. Three microliters of the DNA-enzyme-[32P]dATP solution is then added to each of the four reaction tubes. The volume of each reaction is 5 µl. The tubes are mixed quickly and incubated at 30°C for 15 min. One microliter of 0.5 mM dATP is then added to each reaction, and the reactions are incubated at 30°C for an additional 15 min. The reactions are stopped with the addition of 11 µl of the formamide-dye mix. By using these conditions, 300-350 nucleotides of sequence can be obtained.

In order to sequence further, modified reaction conditions are used. Approximately 1.5 µg of template DNA and 5 ng of specific primer are annealed as described previously. The dideoxynucleotide concentrations used for these reaction conditions are 0.10 mM ddATP, 0.07 mM ddCTP, 0.025 mM ddGTP, and 0.20 mM ddTTP. The sequencing reactions are then conducted as described above. These reaction conditions can be used to determine up to 600-650 nucleotides of sequence.

Before loading the gels, 3-4 µl of each reaction are added to different microcentrifuge tubes, denatured by heating for 3 min at 90-95°C, and chilled on ice. The remaining solutions are stored at -20°C; best results, however, are obtained with samples used within 24 h. We routinely run denaturing 5% acrylamide gels (80 cm long, 0.2 mm thick) in TBE buffer. The gels are run at 1000 V for 30 min and the wells subsequently cleared of urea by squirting buffer into them before loading the samples. One to two microliters of the denatured samples are added to the gels, and the gels are run at 1500-3000 V. Usually, three gels are loaded: the first is run until the xylene cyanol dye migrates 35-40 cm; the second is run until the xylene cyanol dye migrates 80-85 cm; and the third is run until the xylene cyanol runs a total of 160-170 cm. This will permit the resolution of approximately 20-650 bp of sequence away from the primer.

RESULTS

Using the Applied Biosystems 380A DNA synthesizer and initiating synthesis with a controlled-pore-glass-attached nucleotide, a repetitive yield of 98% is generally obtained. Figure 1 shows a comparison of purified and unpurified samples of the specific oligonucleotides Vβ2.12, Vβ2.16, Vβ2.20, Vβ9.16, and Vβ10.16 (Table 1). Of these, Vβ2.20 and Vβ10.16 are sufficiently pure even in the unpurified state to be used for sequence analysis. The oligonucleotides Vβ2.12, Vβ2.16, and Vβ9.16, however, contain a significant amount of shorter partial-synthesis products in the unpurified sample. The effect of the contaminating fragments can be significant. A comparison of sequencing reactions conducted with both purified and unpurified specific oligonucleotide primers Vβ9.16, Vβ10.16, and Vβ2.20 indicated that all three work well in sequencing reactions after purification and only Vβ10.16 and Vβ2.20 can be utilized unpurified. Sequence analysis with the unpurified Vβ9.16 results in additional bands due to priming by the partial-synthesis products (data not shown). Thus, although many of the specific oligonucleotides can be utilized directly after synthesis and deprotection, further purification may be necessary in order to remove the shorter partial-synthesis products to ensure artifact-free sequence analysis. If the machine lines are kept clean, however, a repetitive yield of 98% is usually sufficient to permit the utilization of unpurified oligonucleotides as sequencing primers.

TABLE 1
SEQUENCE OF SEVEN SPECIFIC OLIGONUCLEOTIDE PRIMERS

Primer      Sequence (5' to 3')       %GC
Vβ2.12      AACTCCAGCATC              50
Vβ2.16      AACTCCAGCATCTGTG          50
Vβ2.20      TAACTCCAGCATCTGTGTGC      50
Vβ9.16      AATAAGGAAAATATAT          13
Vβ10.16     GTCAGAATAAGGAAAA          31
Vβ8.16      CAACTCCAGTCCCCGC          69
Dβ22.16     GGCCCCCCCAGTCCCC          88
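As a rough check on the figures above, the following sketch recomputes the %GC column of Table 1 and estimates what fraction of synthesized chains would reach full length at a 98% repetitive yield. The function names are hypothetical, and the yield model is an assumption: it treats every coupling as independent and counts n - 1 couplings for an n-base primer because synthesis starts from a support-bound nucleotide. It is not a description of how the synthesizer reports yield.

```python
# Primer sequences transcribed from Table 1 (5' to 3').
PRIMERS = {
    "Vβ2.12": "AACTCCAGCATC",
    "Vβ2.16": "AACTCCAGCATCTGTG",
    "Vβ2.20": "TAACTCCAGCATCTGTGTGC",
    "Vβ9.16": "AATAAGGAAAATATAT",
    "Vβ10.16": "GTCAGAATAAGGAAAA",
    "Vβ8.16": "CAACTCCAGTCCCCGC",
    "Dβ22.16": "GGCCCCCCCAGTCCCC",
}


def percent_gc(seq: str) -> float:
    """Percentage of G and C bases in a primer sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def full_length_fraction(length: int, stepwise_yield: float = 0.98) -> float:
    """Expected fraction of chains that reach full length.

    Assumption: the first nucleotide is already attached to the support,
    so an n-mer needs n - 1 couplings, each succeeding independently
    with probability `stepwise_yield`.
    """
    return stepwise_yield ** (length - 1)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(
            f"{name:8s} {len(seq):2d} nt  "
            f"%GC = {percent_gc(seq):5.2f}  "
            f"full-length fraction = {full_length_fraction(len(seq)):.2f}"
        )
```

The computed compositions (12.5, 31.25, 68.75, and 87.5% for the four primers that are not 50% GC) round to the values listed in Table 1. Under this simple model roughly 70 to 80% of chains are full length for primers of 12 to 20 bases, which gives only an expected average; the purity actually observed varied from one synthesis to another (Fig. 1).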
To test the effect of primer length, we have used primers of differing length from the same region of DNA in sequencing reactions using the same template and identical reaction conditions. Figure 2 shows sequencing reactions using the oligonucleotide primers Vβ2.12, Vβ2.16, and Vβ2.20, which are 12, 16, and 20 bases in length, respectively. These three primers have identical %GC content and, with the exception of the additional nucleotides, identical sequences (Table 1). Each was purified on a 20% acrylamide gel, as described above. Although the sequences obtained with Vβ2.16 and Vβ2.20 can be read without ambiguity, the sequence obtained with Vβ2.12 is difficult to read due to artifactual bands that appear in all four reactions and occur with greater frequency as the sequence progresses away from the primer.
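One way to see why primer length could matter is to estimate how many chance near-matches a primer of a given length would find on the template. The sketch below is a crude illustration under loudly stated assumptions: the template is treated as a random sequence with equal base frequencies, its length (10,000 nucleotides, roughly an M13 vector plus a 3-kb insert) is a hypothetical round number, and the tolerance of two mismatches is arbitrary. It does not model hybridization thermodynamics, and the function names are invented for this example.

```python
from math import comb


def match_probability(primer_len: int, max_mismatches: int = 0) -> float:
    """Probability that a random site pairs with the primer with at most
    `max_mismatches` mismatches (independent, equiprobable bases assumed)."""
    # Count the sequences within the given Hamming distance of the primer.
    near_matches = sum(
        comb(primer_len, k) * 3 ** k for k in range(max_mismatches + 1)
    )
    return near_matches / 4 ** primer_len


def expected_sites(
    template_len: int, primer_len: int, max_mismatches: int = 0
) -> float:
    """Expected number of chance priming sites on a single-stranded template."""
    positions = template_len - primer_len + 1
    return positions * match_probability(primer_len, max_mismatches)


if __name__ == "__main__":
    TEMPLATE_LEN = 10_000  # hypothetical: vector plus insert, in nucleotides
    for n in (12, 16, 20):
        sites = expected_sites(TEMPLATE_LEN, n, max_mismatches=2)
        print(f"{n}-mer: {sites:.2e} expected sites with <= 2 mismatches")
```

In this model each four-base increase in primer length lowers the expected number of chance near-match sites by about two orders of magnitude, which is at least consistent with artifactual bands appearing with the 12-base primer but not with the 16- and 20-base primers.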
Oligonucleotide primers of identical length but differing in their percentage GC content were synthesized and used in sequencing reactions with identical reaction conditions. The specific oligonucleotides Vβ9.16, Vβ10.16, Vβ2.16, Vβ8.16, and Dβ22.16, each 16 bases in length and having 13, 31, 50, 69, and 88% GC content, respectively, were used in the reactions (Table 1). The reactions were comparable in quality: approximately 650 bp was obtained from each set of reactions. The sequencing reactions obtained using these primers were indistinguishable in quality from the other specific primers and from the commercially available primers (data not shown). Base composition does not appear to have an effect on the quality of the sequencing reactions.

FIG. 2. Oligonucleotide primers of differing lengths used in sequencing reactions. Vβ2.12 is 12 bases, Vβ2.16 is 16 bases, and Vβ2.20 is 20 bases in length. Sequencing reactions using these oligonucleotides as primers are indicated. Artifactual bands are found in the reactions using Vβ2.12 as a primer. No artifactual bands can be seen in the reactions using either Vβ2.16 or Vβ2.20 as primers.

DNA Template Preparation

Using the CsCl-banding procedure described above, we routinely obtain 300-500 µg of purified, single-stranded DNA template for sequence analysis. As the sequencing reaction conditions we describe require only 1.5-2.5 µg of DNA for use with each primer, only one large phage preparation is necessary to complete the sequence of an insert. The use of large quantities of DNA permits short film exposure times of 2-6 h in the absence of an intensifying screen. In contrast, the "miniprep" DNA procedure provides only 2-3 µg of DNA, which necessitates the use of less DNA per reaction and subsequently results in much longer film exposure times. In addition, the amount of readable sequence obtained with miniprep DNA may vary considerably depending on the purity of the preparations. Although the miniprep procedure can provide DNA sufficiently pure for sequencing, the DNA quality can be inconsistent. The CsCl-banding procedure guarantees large amounts of reliable, highly purified template DNA. We recommend the use of the CsCl purification procedure with the specific-primer-directed sequencing strategy.

DISCUSSION
We have described a rapid and efficient method for sequencing large stretches of DNA based on the dideoxynucleotide chain-termination method of Sanger et al. (1,2). The specific-primer-directed sequencing strategy uses insert DNA as large as 3-4 kb cloned into M13 bacteriophage vectors; however, the method may also be used for sequence analysis in plasmid vectors (14) when insert DNA is not easily cloned into the M13 vectors. Our laboratory has utilized over 100 specific oligonucleotides to sequence DNA fragments varying in size from 900 to 3000 base pairs. Using our sequencing protocol, we have consistently obtained 600-650 bp of sequence per specific primer. We have observed no differences in quality or in length of obtainable sequence data between specific primers and the commercially available universal primer. Both appear to prime sequence reactions at equal efficiencies. Usually, the oligonucleotides are pure enough to be used immediately after synthesis; occasionally, however, the preparation is contaminated with partial-synthesis products and requires purification on a preparative 20% acrylamide gel.

In practice, there are two strategies with which to sequence the recombinant clones using this technique (Fig. 3). In the first strategy, oligonucleotides are synthesized to complete the first strand sequence only; the second strand sequence is determined by making oligonucleotide primers complementary to the sequence obtained on the first strand, and is completed by carrying out sequencing reactions with all of these primers at once. In this manner, the second strand is completed in a short period of time after the first strand is finished. In the second strategy, the oligonucleotides synthesized at each step are made both to continue the sequence and to sequence the second strand of the newly obtained sequence. The second strategy is more rapid than the first since the sequence of the second strand is being determined concurrently with the first strand. We have used both strategies successfully.

Our experience suggests that certain guidelines should be followed in selecting primer sequences. The length of the primer appears to be important: consistent, artifact-free sequences were obtained with 15-, 16-, and 20-base oligonucleotides, but the use of 12-base oligonucleotide primers often resulted in additional bands, presumably due to secondary-site hybridization. Base composition of the oligonucleotide primers does not appear to affect the quality of the sequencing reactions: we have utilized primers that are 13, 31, 50, 69, and 88% in GC content successfully (Table 1).

The specific oligonucleotide primer-directed sequencing procedure has one major advantage over the random-cloning and partial-nuclease-digestion procedures (4-8). Our technique requires the construction of only one recombinant M13 phage clone, in both orientations; previous procedures required constructions of multiple recombinant clones (15-17). The specific-primer-directed sequencing method permits the rapid, efficient analysis of large stretches of DNA. Kobori et al. (18) sequenced 3.0 kb for each of seven different mouse strains in the I region of the murine MHC. Initially, many Bal31 exonuclease-generated deletion clones were constructed in M13 mp10 to permit 3.0 kb of DNA sequence analysis on both strands of the first strain. Upon synthesis of specific oligonucleotides based on the sequence of the first strain, I region DNA from six additional mouse strains was rapidly sequenced using only two clones each (one fragment in two orientations). Due to sequence homology, the same set of oligonucleotides could be used for all mouse strains. The analysis of five strains (15 kb) was completed in approximately 2 months using specific oligonucleotide primers. The extent of the DNA sequence analysis was a direct result of the ease and rapidity of this specific-primer-directed DNA sequencing approach. Upon availability of specific oligonucleotide primers and full-sized M13 clones in both orientations, 3.0 kb of DNA sequence can be determined in less than 1 week.

The specific-primer-directed sequencing method has been extremely useful in several other applications (19-21). This technique can

FIG. 3. Two different strategies for sequencing using the specific-primer-directed DNA sequencing technique.

  Clone into mp18 and mp19 to obtain both orientations of the insert DNA.
  Obtain sequence from both clones using the universal M13 vector primer (□).

  Option 1
    Use specific oligonucleotide primers (○) to complete the sequence of one clone.
    Use complementary oligonucleotide primers (△) derived from data (▽) of the first clone's sequence to obtain the sequence of the second clone (opposite strand).

  Option 2
    Use specific oligonucleotide primers (○/●) to extend the sequence of each clone.
    Use complementary oligonucleotide primers (△/▲) derived from data (▽/▼) of the opposite-orientation clone to complete the sequence of each clone.

The open and filled arrows indicate the opposite strands of the DNA to be sequenced. Open boxes indicate the positions of the M13 universal primer; open and closed circles and triangles indicate the positions of the various specific primers. Thin arrows indicate the direction of the sequencing reactions. Both options have been used successfully. Option 2 is significantly more rapid.
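The planning arithmetic behind the primer-directed strategy can be sketched as follows. The read length (600 nucleotides) and template requirement (2.5 µg per primer, the upper figure quoted above) come from the text; the 50-nucleotide overlap between successive reads and both function names are hypothetical choices made only for illustration.

```python
from math import ceil


def specific_primers_per_strand(
    insert_len: int, read_len: int = 600, overlap: int = 50
) -> int:
    """Custom primers needed on one strand after the universal-primer read.

    Assumption: each new primer is placed so that its read overlaps the
    previous read by `overlap` nucleotides (a hypothetical value).
    """
    if insert_len <= read_len:
        return 0
    return ceil((insert_len - read_len) / (read_len - overlap))


def template_needed_ug(
    insert_len: int,
    ug_per_primer: float = 2.5,
    read_len: int = 600,
    overlap: int = 50,
) -> float:
    """Template DNA (µg) consumed when both strands are sequenced."""
    reads_per_strand = 1 + specific_primers_per_strand(
        insert_len, read_len, overlap
    )
    return 2 * reads_per_strand * ug_per_primer


if __name__ == "__main__":
    for insert_len in (1000, 3000, 4000):
        n = specific_primers_per_strand(insert_len)
        print(
            f"{insert_len} bp insert: {n} specific primers per strand, "
            f"{template_needed_ug(insert_len):.0f} µg template in total"
        )
```

Under these assumptions a 3-kb insert needs five specific primers per strand and about 30 µg of template for both strands, well within the 300-500 µg obtained from a single CsCl-banded phage preparation.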
9. ... S. (1984) DNA 3, 339-343.
10. Brenner, D. G., and Shaw, W. V. (1985) EMBO J. 4, 561-568.
11. Hunkapiller, M., Kent, S., Caruthers, M., Dreyer, W., Firca, J., Giffin, C., Horvath, S., Hunkapiller, T., Tempst, P., and Hood, L. (1984) Nature (London) 310, 105-111.
12. Maxam, A., and Gilbert, W. (1980) in Methods in Enzymology (Grossman, L., and Moldave, K., eds.), Vol. 65, pp. 499-560, Academic Press, New York.
13. Messing, J., Crea, R., and Seeburg, P. (1981) Nucleic Acids Res. 9, 309-321.
14. Chen, E. Y., and Seeburg, P. H. (1985) DNA 4, 165-170.
15. Steinmetz, M., Moore, K. W., Frelinger, J., Taylor Sher, B., Shen, F.-W., Boyse, E., and Hood, L. (1981) Cell 25, 683-693.
16. McNicholas, J., Steinmetz, M., Hunkapiller, T., Jones, P., and Hood, L. (1982) Science 218, 1229-1232.
17. Fisher, D., Hunt, S. W., and Hood, L. (1985) J. Exp. Med. 162, 528-545.
18. Kobori, J., Strauss, E., Minard, K., and Hood, L., submitted for publication.
19. Siu, G., Strauss, E., and Hood, L., submitted for publication.
20. Arden, B., Klotz, J. L., Siu, G., and Hood, L. E. (1985) Nature (London) 316, 783-787.
21. Siu, G., Kronenberg, M., Strauss, E., Haars, R., Mak, T., and Hood, L. E. (1984) Nature (London) 311, 344-350.
22. Smith, L. M., Fung, S., Hunkapiller, M. W., Hunkapiller, T. J., and Hood, L. E. (1985) Nucleic Acids Res. 13, 2399-2412.
23. Smith, L. M., Sanders, J. Z., Kaiser, R. J., Hughes, P., Dodd, C., Kent, S. B. H., and Hood, L. E. (1986) Nature (London), in press.