A new method of sequencing DNA


ANALYTICAL BIOCHEMISTRY 174, 423-436 (1988)

EDWARD DAVID HYMAN

Sybtrel Biotechnology, 3037C McNaughton Drive, Columbia, South Carolina 29223

Received April 15, 1988

An entirely new method of sequencing DNA has been devised that does not use electrophoresis, radioactivity, or fluorescence. The method works by measuring pyrophosphate generated by the DNA polymerization reaction. DNA and DNA polymerase are held by a DEAE-Sepharose column and solutions containing different dNTPs are pumped through. The pyrophosphate generated is measured continuously by a device consisting of a series of columns containing enzymes covalently attached to Sepharose. The alternating copolymer poly(dA·dT) is sequenced as an illustration of the method. Future improvements that will facilitate automation are discussed. © 1988 Academic Press, Inc.

KEY WORDS: DNA sequencing; luciferase; ATP sulfurylase; pyrophosphate; Sepharose; DNA polymerase.
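The principle summarized in the abstract (pump one dNTP at a time past the template and read the pyrophosphate released) can be illustrated with a toy simulation. This is only a sketch under idealized assumptions that are not claims of the paper: extension is taken to be complete on every pulse, no misincorporation occurs, and the function names, the template string, and the pulse order are all hypothetical.

```python
# Toy model of sequencing by pyrophosphate (PPi) detection.
# Assumptions (not from the paper): ideal polymerase, complete extension on
# every pulse, no misincorporation. All names here are illustrative.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def ppi_per_pulse(template, pulses):
    """Return the PPi released per template copy for each dNTP pulse.

    template: bases still to be copied, in the order the polymerase reads them.
    pulses:   dNTP letters in the order they are pumped ("A" means dATP, etc.).
    """
    position = 0
    yields = []
    for dntp in pulses:
        incorporated = 0
        # Extend the primer for as long as the next template base pairs
        # with the dNTP that is currently flowing through the column.
        while position < len(template) and COMPLEMENT[template[position]] == dntp:
            incorporated += 1
            position += 1
        yields.append(incorporated)  # one PPi per nucleotide incorporated
    return yields


def call_sequence(pulses, yields):
    """Rebuild the newly synthesized strand from pulse order and PPi yields."""
    return "".join(dntp * count for dntp, count in zip(pulses, yields))


if __name__ == "__main__":
    template = "TATATA"      # hypothetical stretch of an alternating template
    pulses = "GCATATATA"     # hypothetical pulse order
    yields = ppi_per_pulse(template, pulses)
    print(yields)
    print(call_sequence(pulses, yields))
```

A pulse that yields no PPi means the template requires a different dNTP, and a yield of two or more corresponds to a run of identical bases, which is why the PPi measurement has to be quantitative rather than a simple yes/no signal.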

Current methods of sequencing nucleic acids involve polyacrylamide gel electrophoresis of single-stranded nucleic acid fragments generated by chain-terminating dideoxynucleotides (1), selective enzymatic fragmentation of RNA (2), or selective chemical degradation of DNA (3). New DNA sequencing machines use fluorescence instead of radioactivity to detect the DNA fragments (4). However, these machines still require intensive labor and careful technique and are inappropriate for large-scale sequencing.

I have taken a completely different approach to the problem of sequencing DNA. My approach is based on the precise measurement of the PPi generated by the polymerization reaction of a dNTP with a template-primer catalyzed by DNA polymerase:

template-primer + dNTP → template-(primer + dNMP) + PPi

By precisely measuring PPi it is possible to determine whether or not a polymerization reaction has occurred, and if so, to determine how many nucleotides have been incorporated into the growing primer chain.

The sequencing process is schematized in Fig. 1. It consists of a series of precisely ordered capillary columns, each containing an enzyme covalently attached to Sepharose 4B. A solution containing APS,¹ glucose, glycerol, luciferin, and one of the four dNTPs (dATP, dGTP, dCTP, dTTP) is pumped initially through a column of PPase-Sepharose. The PPase selectively hydrolyzes residual PPi in the buffer solution; APS, glucose, glycerol, luciferin, and dNTP pass through freely. The solution then passes through a column containing template-primer-DNA polymerase noncovalently attached to DEAE-Sepharose 6B. As a result of the selectivity of DNA polymerases, polymerization takes place only if the dNTP can base pair properly. The mixture of APS, glucose, glycerol, luciferin, dNTP, and PPi then passes through a column of glycerokinase-Sepharose and hexokinase-Sepharose. These kinases selectively break down dATP to dADP (if dATP is chosen) and contaminating ATP to ADP. PPi, APS, and luciferin are unaffected by the kinases and enter the ATP sulfurylase-Sepharose column. This enzyme catalyzes the reaction

PPi + APS → ATP

This reaction is fast and runs to completion (5). The ATP formed in this reaction along with luciferin passes into the firefly luciferase-Sepharose column, which catalyzes the reaction (6)

luciferin + ATP + O2 → oxyluciferin + AMP + PPi + CO2 + light

The light emitted is detected by a photomultiplier tube. ATP is quantitated with high sensitivity to determine how long the primer chains have been extended. Absence of ATP indicates that no polymerization has occurred and that the template-primer requires a different dNTP for polymerization to proceed. After thorough column wash, buffer containing a different dNTP is introduced. Cycles of dNTP buffer followed by wash buffer are used in any desired deoxynucleotide order until the sequence is complete.

FIG. 1. Schematic diagram of DNA sequencer. [Flow path: APS, dNTP, luciferin, glucose, glycerol → PPase-Sepharose → DNA-DNA polymerase-DEAE-Sepharose → glycerokinase-Sepharose → hexokinase-Sepharose → ATP sulfurylase-Sepharose → luciferase-Sepharose → light.]

¹ Abbreviations used: APS, adenosine 5'-phosphosulfate; DTE, dithioerythritol; Ac-, acetyl-; BSA, bovine serum albumin; TMN buffer, 100 mM Tris-OAc, 10 mM Mg(OAc)2, 0.05% NaN3, pH 7.75; PPase, pyrophosphatase; buffer L, 50 mM NaH2PO4, 10 mM NaCl, pH 7.75, containing 25% glycerol; AMV, avian myeloblastosis virus.

0003-2697/88 $3.00 Copyright © 1988 by Academic Press, Inc. All rights of reproduction in any form reserved.

METHODS AND MATERIALS

Chemicals

Purified dATP, dCTP, dGTP, and dTTP solutions were obtained from Pharmacia. Glycerokinase was obtained from Boehringer Mannheim Biochemicals. All other chemicals were obtained from Sigma.

Coupling buffer is 0.1 M NaHCO3, 0.5 M NaCl, pH 8.3; buffer L is 50 mM NaH2PO4, 10 mM NaCl, pH 7.75, containing 25% glycerol; TMN buffer is 100 mM Tris-OAc, 10 mM Mg(OAc)2, 0.05% NaN3, pH 7.75; assay buffer is TMN buffer + 1 mM DTE, 0.1% BSA, pH 7.75; sequencing buffer is 25 mM Tris-OAc, 10 mM Mg(OAc)2, pH 7.75, 1 mM DTE, 0.05% NaN3, 10 mM glucose, 10 mM glycerol, 5 × 10⁻⁷ M APS. Solutions were prefiltered through a 0.2-µm sterile filter. Tris-OAc buffers were brought to the correct pH with concentrated NaOH.

Coupling Reactions

Enzymes are coupled to tresyl-activated Sepharose 4B, prewashed with 1 mM HCl, in a sterile, rubber-stoppered test tube. Gels are centrifuged briefly to change the buffer. Enzymes are added to the gel using a 0.2-µm sterile filter.

Firefly luciferase. Firefly luciferase (Photinus pyralis, Sigma L9009) is coupled using two methods. In Method A, add luciferase in coupling buffer to gel. Rotate at room temperature for 2 h at 30 rpm. Add 10 ml 1.0 M ethanolamine, pH 8.0, and rotate an additional 2 h. Wash with 10 ml coupling buffer, 10 ml 0.1 M NaOAc, 0.5 M NaCl, pH 4.0, and 10 ml TMN buffer. Store in 4 ml TMN buffer + 2.5 mg/ml NaN3 at 4°C. This method was modified to allow more protein coupling. In Method B, add luciferase dissolved in buffer L to the gel. Rotate at 2 rpm at room temperature for 9 h. Add NaN3 to 2.5 mg/ml, and store at 4°C.

Hexokinase. Add 2000 units hexokinase (yeast, Sigma H5875) in 3 ml coupling buffer to 0.380 g gel. Rotate at room temperature 2 h. Add 10 ml of 1.0 M ethanolamine, pH 8.0; continue rotation for 2 h. Wash gel with 10 ml coupling buffer, 10 ml of 0.1 M NaOAc, 0.5 M NaCl, pH 4.0, 10 ml coupling buffer, 10 ml TMN buffer, and store in 4 ml TMN buffer + 2.5 mg/ml NaN3 at 4°C. Hexokinase is unstable in coupling buffer, but the short duration of coupling minimizes the loss of activity.

Glycerokinase. Add 100 units glycerokinase (Bacillus stearothermophilus, Boehringer Mannheim 691-836) in 3.0 ml coupling buffer to 0.380 g gel. The rest of the procedure is identical to that for hexokinase.

ATP sulfurylase. Heat 50 units ATP sulfurylase (yeast, Sigma) in 1 ml coupling buffer at 51°C for 5 min, chill in ice water 1 min, dilute with 2 ml coupling buffer, and add to 0.190 g of gel. The heating procedure removes most hexokinase contamination. The rest of the procedure is identical to that for hexokinase.

PPase. Add 10 units pyrophosphatase (inorganic, Sigma) in 3 ml coupling buffer to 0.190 g gel. After ethanolamine block, wash in 10 ml TMN buffer, and store in 4 ml TMN buffer + 2.5 mg/ml NaN3 at 4°C.

Column Construction

A glass capillary tube (1.1-mm i.d., 1.5-mm o.d.) is plugged at one end with glass wool and is sterilized with consecutive washes of concentrated NaOH, sterile deionized water, and sterile buffer. Enzyme-Sepharose is loaded to the specified height. The glass tube is cut several millimeters above the top of the Sepharose and connected as instructed using silicone tubing (0.04-in. i.d., 0.085-in. o.d.). Silicone and Teflon tubing is prewashed in concentrated NaOH, sterile deionized water, and buffer before use. Columns are stored at 4°C in TMN buffer containing 25% glycerol and 2.5 mg/ml NaN3.

Flow Cell Construction

The flow cell device consists of two 18-gauge stainless-steel needles that penetrate a partially hollowed type 00 black rubber stopper. On the top side both needles are bent to an angle slightly greater than 90° (to block external light); one needle is connected to a peristaltic econocolumn pump (Bio-Rad), the other to a waste reservoir. The other ends of the needles are connected to both ends of the luciferase column, which is then enclosed in a plastic cuvette. The flow cell device fits snugly into the counting chamber of an LB9500C luminometer (Berthold Analytical).

Determination of ATP Equivalence

The method used is a slight modification of the method of Nyren and Lundin (7). In a total volume of 200 µl assay buffer + 0.1 mg D-luciferin + 6.3 × 10⁻⁵ unit luciferase, add the following:

APS: 10 mM glucose, 10 units hexokinase, and 5 × 10⁻⁶ M APS.
ADP: 10 mM glucose, 10 units hexokinase, 10⁻⁴ M ADP. Add 10 units myokinase (chicken muscle) to observe a small drop in luminescence rate.
AtetraP: 10 mM glucose, 10 units hexokinase, and 10⁻⁶ M AtetraP.
dNTP: 10⁻⁴ M dNTP. For dATP, allow several minutes for the rate to reach a steady level (see Fig. 6). For dCTP, dGTP, and dTTP, add 10 mM glucose + 10 units hexokinase and luminescence completely disappears.

ATP equivalence is obtained either by addition of a known quantity of ATP to the reaction mixture or by comparison with a similar assay of known ATP concentration.

Inhibition of ATP Luminescence by Analogs

In a total volume of 200 µl assay buffer + 0.1 mg D-luciferin + 6.3 × 10⁻⁵ unit luciferase, add the following: (A) 10⁻⁴ M dNTP. Add 10⁻⁶ M ATP for dATP, lo-’ M for dCTP, dGTP, and dTTP. (B) 5.0 × 10⁻⁷ M APS. Add lo-‘ M ATP. Inhibition is determined by comparison with the luminescence rate obtained in the absence of the substrate analog.

Sequence of Poly(dA·dT)

DNA-DNA polymerase-DEAE-Sepharose is prepared as follows: Add 0.26 unit of poly(dA·dT) dissolved in 10 µl of 10 mM Tris-HCl, 5 mM NaCl, pH 7.5, to 20 µl of sequencing buffer. Add 50 units of AMV reverse transcriptase, incubate at room temperature for 1 min, dilute with 500 µl sequencing buffer, and inject through a 10-mm DEAE-Sepharose 6B column preequilibrated in sequencing buffer. Place DNA column in sequencer; equilibrate in sequencing buffer. Add dNTP at lo-’ M to sequencing buffer to make dNTP sequencing buffer. The sequence is performed using a 1-min pulse of the indicated dNTP sequencing buffer followed by a 12-min wash with sequencing buffer; flow rate = 75 µl/min. Column lengths are (PPase) 15 mm, (glycerokinase) 40 mm, (hexokinase) 20 mm, (ATP sulfurylase) 15 mm, and (luciferase) 10 mm. Luciferase-Sepharose is prepared by method B using 0.190 g gel + 0.4 mg luciferase in 3 ml buffer L. Concentrations on the chart recorder are determined by using the background luminescence of APS and assuming an ATP equivalence of 2.4 × 10⁻⁴.

RESULTS

The luciferase-Sepharose column is a key component of the DNA sequencer and is described first. Luciferase-Sepharose has been described previously (8). I have carefully studied and refined its characteristics for this application. In the following experiments solutions of ATP and luciferin are pumped continuously through a capillary tube column of luciferase-Sepharose. The column is held by a simple device, referred to as a flow cell, constructed of a rubber stopper and needles, which fits snugly in the counting chamber of the luminometer. Turning off overhead lights eliminates all external light noise.

The rate of light emitted in the luciferase-Sepharose column is a function of the following variables: ATP concentration, amount of luciferase coupled to the gel, luciferin concentration, and flow rate. Each of these variables is examined individually.

A log-log plot of the rate of luminescence versus ATP concentration is shown in Fig. 2A. The light rate is linearly proportional to the ATP concentration from 10⁻¹² to 10⁻⁷ M ATP. The total volume of mobile phase in the column, approximately 10 µl, makes the detection limit about 10⁻¹⁷ mol ATP. This high sensitivity is probably due to a high density of enzymatically active luciferase per microliter of gel. At a flow rate of 53 µl/min this column produced a completely stable light rate for at least 20 min at ATP concentrations less than 10⁻⁸ M, and only an average 0.3% decline per minute at 10⁻⁷ M ATP. These stable rates are achieved without the use of L-luciferin, a substrate analog of D-luciferin that is used to stabilize rates in the soluble-type assay (6).

FIG. 2. Luminescence rate versus ATP concentration. The luciferase-Sepharose is made by method B using (A) 0.040 g gel + 0.5 mg luciferase in 250 µl buffer L, (B) 0.190 g gel + 0.1 mg luciferase in 3.0 ml buffer L, (C) 0.190 g gel + 0.02 mg luciferase in 3.0 ml buffer L. All columns are 15 mm long. ATP measurements are made in TMN buffer + 1 mM DTE + 100 µg/ml luciferin. Flow rate = 95 µl/min.

Figures 2B and C represent columns differing in the amount of luciferase coupled to the gel. Columns with smaller luciferase densities still give a linear luminescence rate-ATP concentration relationship but have less sensitivity. Rates still become nonlinear with ATP concentrations exceeding 10⁻⁶ M, in accordance with Michaelis-Menten kinetics (9). No advantage is observed in using small luciferase densities as in Figs. 2B and C.

Luminescence rate is plotted versus luciferin concentration in Fig. 3. The curves consist of roughly two regions: Below 50 µg/ml luciferin, the rate rises approximately linearly with luciferin concentration; above 50 µg/ml, the rate begins to plateau. At very high concentrations of luciferin, 400 µg/ml, the rate declines. This is in good agreement with the conventional assay procedure (9). For luciferin, 100 µg/ml is a routinely used concentration. Figure 3 also confirms that luminescence is weakly inhibited by Tris-OAc (10). I have also noted that a change in Tris-OAc concentration affects the column such that on return to a previous concentration, the rates are not always consistent.

FIG. 3. Luminescence rate versus luciferin concentration (µg/ml) at different concentrations of Tris-OAc. Buffer is (0) 100 mM Tris-OAc, (A) 50 mM Tris-OAc, (0) 25 mM Tris-OAc, or (A) 10 mM Tris-OAc containing 10 mM MgOAc, pH 7.75, 0.05% NaN3, 1 mM DTE, 10 mM glucose, 10 mM glycerol, 5 × 10⁻⁷ M APS, 10⁻⁸ M ATP. Flow rate = 67 µl/min. The luciferase-Sepharose is made by method A using 0.190 g gel + 0.5 mg luciferase in 3.0 ml coupling buffer. Column height = 15 mm.

Luminescence rate is plotted versus flow rate at 10⁻⁷ and 10⁻⁸ M ATP in Fig. 4. Initially the luminescence rate rises sharply, but gradually reaches a plateau level at higher flow rates. At 10⁻⁹, 10⁻¹⁰, and 10⁻¹¹ M ATP, the luminescence rate does not show such a large response and tends to plateau at a flow rate of 50 µl/min. Slight pulsations in the flow rate of the peristaltic pump produce small oscillations in the rate of luminescence. However, connecting additional Sepharose columns before the luciferase column (as in the DNA sequencer) effectively buffers pulsations in the flow rate and gives a steady luminescence rate. The flow rate used routinely is about 75 µl/min, which is safely in the plateau region of both curves in Fig. 4. No damage was observed in columns at this flow rate after many hours of continuous use.

FIG. 4. Luminescence rate versus flow rate (µl/min). Buffer is TMN + 1 mM DTE + 100 µg/ml luciferin + (0) 10⁻⁷ M ATP, left side, (0) 10⁻⁸ M ATP, right side. (A) Theoretical curve for 10⁻⁷ M ATP obtained by matching curves at 10 µl/min and assuming a luminescence of 200 at infinite flow rate. The luciferase-Sepharose is made by method B using 0.190 g gel + 0.4 mg luciferase in 3.0 ml buffer L. Luciferase column height = 15 mm.

A simple mathematical model will help explain the curve in Fig. 4. If the flow rate is suddenly stopped in the luciferase column of Fig. 4, the rate falls initially in an exponential-like decline for the first few minutes similar to soluble assay light kinetics (11). This demonstrates that a constant flow rate through the luciferase-Sepharose is necessary to achieve a stable light rate. Imagine that a luciferase column, height $h_0$, is broken into cross sections $dx$ in thickness. Assume that $L(t)$ is the function that gives the luminescence rate as a function of time when there is no flow through the column. The luminescence contribution of each $dx$ slice is $[L(x/v)]\,dx/h_0$, where $v$ = linear velocity of flow in the column and $x$ = distance of the slice to the top of the column. Summing all the $dx$ contributions and introducing the variable substitution $t = x/v$ give

$$\text{total luminescence} = \int_0^{h_0} \frac{L(x/v)}{h_0}\,dx = \frac{v}{h_0}\int_0^{h_0/v} L(t)\,dt.$$

Assuming that $L(t) = l_0 e^{-kt}$, this integral can be solved:

$$\text{total luminescence} = \frac{l_0 v}{k h_0}\left(1 - e^{-k h_0/v}\right).$$

Total luminescence is a function of two variables: flow rate ($v$) and column height ($h_0$). Note that $l_0/h_0$ is a constant which equals the maximum luminescence achievable per unit length at a specified ATP concentration; this is an inherent property related to the luciferase density. The plot of a theoretical curve for 10⁻⁷ M ATP shown in Fig. 4 fits the experimental data well and confirms that the mathematical model is a workable approximation. This model does not apply to luciferase columns with small luciferase densities, since these columns lack an initial exponential-type decline in luminescence when the flow rate is stopped.

Covalently bound luciferase is stable at room temperature, giving reproducible results for many hours. Storage for a month in glycerol and NaN3 at 4°C results in only a small loss of activity. One column is adequate for making hundreds of ATP determinations, in agreement with the observations of Kricka et al. (8). These properties make luciferase-Sepharose ideal for continuous monitoring of ATP concentration and well suited for automation, a necessary requirement for the DNA sequencer.

The subcomponent of the DNA sequencer which continuously monitors PPi concentration comprises the last four columns of the DNA sequencer (Fig. 1): glycerokinase → hexokinase → ATP sulfurylase → luciferase. In the process, PPi is mixed with a solution containing a fixed quantity of APS, glucose, glycerol, and luciferin. The glycerokinase-Sepharose and hexokinase-Sepharose columns selectively degrade contaminating ATP to ADP. ATP is a common contaminant of commercial preparations of APS (about 0.01%). The PPi reacts readily with APS in the ATP sulfurylase-Sepharose column to form ATP. The ATP and luciferin enter the luciferase column and the ATP is quantitated as described earlier. A chart recorder provides a continuous record of PPi concentration versus time. This method of measuring PPi concentration is a modification of the procedure developed by Nyren and Lundin (7), which employs soluble enzymes. As expected, the luminescence rate is also linearly proportional to PPi concentration, in agreement with the results of Nyren and Lundin (7).

The luminescence rate due to PPi added to a sample is obtained after subtracting the background luminescence contributed from two sources: APS and PPi contamination. APS itself can serve as a weak substrate of luciferase to produce light. The background luminescence of APS can be minimized by utilizing a small excess while still allowing for complete conversion of PPi to ATP. At a flow rate of 32 µl/min, 5 × 10⁻⁷ M APS is adequate for measuring 2 × 10⁻⁸ M PPi. The high density of ATP sulfurylase activity per microliter of gel allows a small excess to be used. It is anticipated that addition of another enzyme column immediately following the ATP sulfurylase column will eliminate the APS background luminescence. This enzyme would specifically degrade APS without hydrolyzing ATP. Two types of enzymes with this property are reported in the literature: an APSase hydrolyzes APS → AMP + SO4 without hydrolyzing ATP (12), and ADP sulfurylase catalyzes APS + PO4 → ADP (13). Both of these products, AMP and ADP, are practically inert to luciferase.

The second source of background luminescence is due to contaminating PPi in the buffer, usually about 10⁻⁹ M. The major sources of PPi contamination are Tris-OAc and the commercial deionized water. The PPase-Sepharose column in the DNA sequencer efficiently hydrolyzes this residual PPi and removes it as a source of concern when sequencing DNA.
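The two background sources described above can be put on a common scale with a few lines of arithmetic. The sketch below uses only figures quoted in the text: 5 × 10⁻⁷ M APS, the APS ATP equivalence of 2.4 × 10⁻⁴ given in Methods, roughly 10⁻⁹ M contaminating PPi, and a 2 × 10⁻⁸ M PPi signal. It treats ATP equivalence as the factor that converts an analog concentration into the ATP concentration giving the same light, and it assumes each PPi is converted to one ATP, in line with the statement that the sulfurylase reaction runs to completion. The function and variable names are illustrative only.

```python
# Rough background budget for the PPi detector, in ATP-equivalent molarity.
# Numbers are those quoted in the text; names are illustrative.

APS_CONC = 5e-7           # M, APS in the buffer
K_APS = 2.4e-4            # ATP equivalence of APS (Methods)
PPI_CONTAMINATION = 1e-9  # M, typical PPi contamination of the buffer
PPI_SIGNAL = 2e-8         # M, PPi concentration measured in the example


def atp_equivalents(k, analog_conc):
    """ATP concentration giving the same light as the analog: k * [analog]."""
    return k * analog_conc


def background_budget(aps_conc=APS_CONC, k_aps=K_APS,
                      ppi_contamination=PPI_CONTAMINATION):
    """Return the APS and PPi-contamination backgrounds and their sum.

    Assumes one ATP is formed per contaminating PPi.
    """
    aps_part = atp_equivalents(k_aps, aps_conc)
    return {
        "aps": aps_part,
        "ppi_contamination": ppi_contamination,
        "total": aps_part + ppi_contamination,
    }


if __name__ == "__main__":
    budget = background_budget()
    for name, value in budget.items():
        print(f"{name}: {value:.2e} M ATP equivalents")
    print(f"APS excess over PPi signal: {APS_CONC / PPI_SIGNAL:.0f}-fold")
```

With these inputs the APS term is about 1.2 × 10⁻¹⁰ M ATP equivalents, roughly an order of magnitude below the untreated PPi contamination, which is consistent with the text's emphasis on removing buffer PPi with the PPase column.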

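Returning to the flow-rate model of the luciferase column given earlier in this section, the closed-form expression is easy to evaluate numerically. The sketch below is an illustration, not the author's calculation: it rewrites the model in terms of volumetric flow rate Q by lumping k, h0, and the column cross section into a single constant `a` (an assumption that Q is proportional to the linear velocity v), and it calibrates `a` the way the Fig. 4 caption describes, by fixing the plateau value and matching one measured point. The plateau of 200 comes from the caption; the value 120 used for the matched point at 10 µl/min is hypothetical, since the measured value is not stated in the text.

```python
import math


def luminescence(q, a, l_inf):
    """Model luminescence at flow rate q.

    l_inf * (q / a) * (1 - exp(-a / q)), where a lumps k, h0 and the column
    cross section (assumption: flow rate is proportional to linear velocity).
    """
    u = a / q
    return l_inf * (-math.expm1(-u)) / u


def fit_lumped_constant(q_match, observed, l_inf):
    """Solve for a so that the model passes through (q_match, observed).

    Uses bisection on u = a / q_match; (1 - exp(-u)) / u falls steadily
    from 1 toward 0 as u grows, so the root is unique.
    """
    target = observed / l_inf
    if not 1e-3 < target < 1.0:
        raise ValueError("observed must lie between 0.1% and 100% of l_inf")
    lo, hi = 1e-9, 1e3
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if (-math.expm1(-mid)) / mid > target:
            lo = mid   # model still too bright: increase u
        else:
            hi = mid
    return 0.5 * (lo + hi) * q_match


if __name__ == "__main__":
    L_INF = 200.0          # plateau assumed in the Fig. 4 caption
    Q_MATCH = 10.0         # ul/min, matching point named in the caption
    OBSERVED = 120.0       # hypothetical reading at Q_MATCH
    a = fit_lumped_constant(Q_MATCH, OBSERVED, L_INF)
    for q in (10, 25, 50, 75, 100, 125):
        print(q, round(luminescence(q, a, L_INF), 1))
```

In this form the model rises almost linearly at low flow rates and approaches the plateau `l_inf` at high flow rates, which is the qualitative shape described for Fig. 4.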

Figure 5 shows a chart recording of a 1-min pulse of PPi through the column system, illustrating the continuous measurement of PPi concentration versus time. The experimental yield of 6.8 × 10⁻¹³ mol PPi (obtained by integration) approximates the theoretical value of 6.4 × 10⁻¹³ mol PPi (error = 6%). The peak is approximately Gaussian in shape, is slightly skewed to the left, and is much broader than 1 min. The upward shift of the entire recording in Fig. 5 is due to the background luminescence of APS and PPi contamination.

FIG. 5. One-minute pulse of 2 × 10⁻⁸ M PPi in sequencing buffer containing 100 mM Tris-OAc. Flow rate = 32 µl/min. Column lengths are (glycerokinase) 25 mm, (hexokinase) 20 mm, (ATP sulfurylase) 15 mm, and (luciferase) 10 mm. The luciferase-Sepharose is prepared by method A using 0.190 g gel + 0.5 mg luciferase in 3 ml coupling buffer.

An important consideration is the potential effect of the dNTPs in the luciferase-Sepharose column, which could interfere with ATP measurements during DNA sequencing. Luciferase accepts analogs of ATP which produce luminescence in linear proportion to their concentration (14). This is conveniently defined, in terms of ATP equivalence, as the concentration of ATP that will produce the same luminescence rate as the analog, and is expressed by the formula:

$$[\text{ATP}]_{\text{equivalents}} = k\,[\text{substrate analog}]$$

k values of analogs are given in Table 1. These k values were obtained using the soluble luciferase assay; however, I have observed that the k value of APS with luciferase-Sepharose is identical to the soluble assay value. Light rates produced by dGTP, dCTP, and dTTP are trivial and disappear rapidly in the presence of hexokinase and glucose (14). This strongly suggests that the luminescence is due to minute ATP contamination (less than 0.001%) since yeast hexokinase has little activity for dGTP and dCTP (15). The k values for dGTP, dCTP, and dTTP are given as upper limits. dGTP, dCTP, and dTTP at 10⁻⁴ M do not inhibit luminescence of ATP, nor does APS at 5 × 10⁻⁷ M. These data correlate with those of Moyer and co-workers (14) and support the conclusion that luciferase shows high specificity for ATP.

TABLE 1
k VALUES OF SUBSTRATE ANALOGS

Substrate    k
APS          2.4 × 10⁻⁴
dATP         1.8 × 10⁻⁴
dGTP         ≤1.9 × 10⁻⁶
dCTP         ≤2.7 × lo-’
dTTP         ≤7.1 × lo+
ADP          ≤5.5 × 10⁻⁶
AtetraP      4.9 × lo-?

In contrast, dATP serves as a good substrate for luciferase and displays the peculiar reaction kinetics shown in Fig. 6. On initial addition of dATP, the luminescence rate jumps instantly to a high level, then declines exponentially until it reaches a steady-state level. I use this steady-state level to calculate the ATP equivalence. Subsequent additions of dATP result in only linearly proportional rate increases. I cannot explain this phenomenon. The elevated k value of 0.017 obtained by Moyer et al. (14) is the result of their use of the peak light rate. dATP at 10⁻⁴ M inhibits 98% of the expected luminescence of 10⁻⁶ M ATP, an inhibition much larger than that reported by Moyer et al. (14).

FIG. 6. Luminescence rate of dATP versus time (minutes). To 200 µl assay buffer + 0.1 mg luciferin + 6.3 × 10⁻⁵ units luciferase, dATP is added four times, each time increasing its concentration by 10⁻⁴ M. The last addition of 10⁻⁶ M ATP produces only a small increase in luminescence.

The logic of using the kinase columns in the DNA sequencer is to eliminate the interference of dATP on the measurement of ATP by converting it to dADP, which is inert to luciferase. This is beneficial if the dATP concentration used for sequencing is lo-’ M, but at 10⁻⁷ M dATP the benefits are marginal because the inhibitory and contributory effects of the dATP at this concentration are negligible. Glycerokinase contains most of the dATPase activity. The kinases also remove the inevitable ATP contamination of APS preparations. The activity of the kinases for dGTP, dCTP, and dTTP is inconsequential since none of these deoxynucleotides interact with luciferase.

As a simple illustration of the DNA sequencing protocol, the alternating copolymer poly(dA·dT) was sequenced using AMV reverse transcriptase. At 25 mM Tris-OAc, the retardation effects of the DEAE-Sepharose for the anions PPi and dNTP are minimal, and the concentration is well tolerated by reverse transcriptase. One-minute pulses of the specified deoxynucleotide solution are followed by 12-min washes. Peak heights are used as an estimate of peak area to determine the amount of PPi formed. Figure 7 summarizes peak height versus deoxynucleotide. A portion of the chart recording of PPi concentration versus time is shown in Fig. 8. The sequence ATATAT . . . is readily confirmed from the figures.

FIG. 7. Peak height versus dNTP for sequence of poly(dA·dT). The sudden increase in peak height observed at dNTP 41 is due to use of a new luciferin batch. Average yield of PPi for each polymerization is about 9.4 × 10⁻¹⁴ mol. See Methods and Materials for details.

FIG. 8. PPi concentration versus time (minutes) of poly(dA·dT) sequence of deoxynucleotides 6-25. Peaks ride on top of APS background luminescence.

Several observations are made:

1. The first dATP pulse gave only about one-half the expected PPi yield; random cleavage of the primer strand by the reverse polymerization reaction during sample preparation accounts for this effect. The first pulse of dATP makes dTTP the next nucleotide required for all templates.

2. dATP pulse 16, which immediately follows a dATP pulse, gave a small PPi yield. This is expected since polymerization with the first pulse goes to near completion and the next nucleotide required is dTTP. A similar result is observed with consecutive pulses of dTTP. The small response in pulse 16 of dATP appears to have three causes: (i) misincorporation of dATP in place of dTTP (minor); (ii) degradation of the primer strand during the wash cycle due to PPi contamination in the column (major); and (iii) incomplete polymerization during the preceding pulse 15 (major). Elimination of problems (i) and (iii) may entail the use of polymerases with greater specificity or the use of lower temperatures to maximize the efficiency of the polymerization reaction and minimize the number of errors committed. The solution of problem (ii) may involve a more thorough removal of PPi before the buffer reaches the polymerization column by, for example, including an ATP sulfurylase column immediately following the PPase column. An alternative approach would be to remove the kinase columns from the system and combine the ATP sulfurylase and DNA polymerization columns or include ATP sulfurylase in the buffer instead of immobilizing it. AMV reverse transcriptase is devoid of 3'→5' exonuclease activity (16). Sequencing of more complicated templates would be limited if the polymerase had residual 3'→5' exonuclease activity. The use of deoxynucleoside [1-thio]triphosphates, which prevent proofreading, could eliminate this problem and increase the number of DNA polymerases that can be considered for the method (17).

3. dGTP and dCTP give practically no polymerization, as expected, since neither can base pair properly with the template DNA. This exemplifies the specificity of AMV reverse transcriptase.

4. The length of the sequence, 42 base pairs, and the small decrease in PPi yield during the sequence are attributed to the high processivity of AMV reverse transcriptase. Processivity is defined as the ability of a single enzyme molecule to polymerize nucleotides on a DNA chain without dissociating. AMV reverse transcriptase incorporates an average of several hundred nucleotides before dissociating (18). Experiments with Klenow fragment of Escherichia coli DNA polymerase I, possessing a low processivity, have confirmed that it washes off its DNA substrate much faster. The sequence was discontinued due to time constraints.


5. This sequence does not address the potential problem of limitations in sequencing due to the accumulation of incomplete polymerization reactions or errors.

The rate of polymerization versus dATP concentration is explored further in Fig. 9. The peaks are obtained by a continuous flow of dATP instead of a 1-min pulse. At 10⁻⁹ M dATP, the polymerization reaction is slow but emphasizes that a dNTP must be washed out of the DNA column thoroughly before giving a pulse of another dNTP. At 10⁻⁸ M dATP, the peak is broad with a slow return to baseline level, indicating the completion of polymerization. At 10⁻⁷ M dATP, polymerization is faster and the peak is narrower. The decline, however, never quite returns to baseline level. I postulate that this is due to a low-frequency error of dATP polymerization in place of dTTP. At 10⁻⁶ and 10⁻⁵ M dATP, this error rate increases substantially and confirms what is intuitively obvious: the larger the dATP concentration, the larger the error rate. This relationship of error rate versus dNTP concentration holds true for the other dNTPs as well; e.g., increasing the concentration of dGTP will also result in increasing polymerization in the poly(dA·dT).

DISCUSSION

There are several advantages to keeping the dNTP concentration as low as possible: (i) it reduces the error frequency, (ii) it reduces the time required to wash the dNTP completely off the DNA column, and (iii) it reduces PPi generated in the ATP sulfurylase column by the reaction dNTP → dNMP(?) + PPi, resulting either from a side reaction of the enzyme itself or a contaminating enzyme activity. The order of reactivity observed is dTTP > dGTP > dCTP. dNTP concentration must be kept large enough, though, to allow complete polymerization in a short period. Further research is needed to explore the variations in selectivity and enzymatic activity of polymerases of different organisms.

FIG. 9. PPi concentration versus time (minutes) of continuous reaction of dATP, at different concentrations in sequencing buffer, with poly(dA·dT). The method is the same as that for the sequence of poly(dA·dT) except a continuous flow of dATP sequencing buffer is used. All peaks are obtained after a 1-min pulse of dTTP sequencing buffer and a 12-min wash with sequencing buffer to ensure that dATP is the next deoxynucleotide required by all templates. Graphs of 10⁻⁸ to 10⁻⁵ M dATP begin at the same PPi concentration as 10⁻⁹ M dATP but are staggered above for easier visualization.

Other areas of research can improve this method. Solid support matrixes such as silica gel or glass beads may allow faster flow rates that will decrease the sequencing time. An analog of dATP that is normally incorporated into the DNA by the polymerase but is unable to bind to luciferase, or development of a luciferase that shows greater substrate specificity, would clearly be useful. Another possible area of investigation is development of a solid support, coupled to a mixture of luciferase and ATP sulfurylase, which is also able to bind the DNA sample. The entire sequencing could be carried out in one column. The use of chain-terminating nucleotides to keep sequencing in phase for more complicated templates should be investigated. Experiments are currently in progress to sequence M13 templates. With automation, this new method may provide an economical means of sequencing nucleic acids on a large scale, one step closer to sequencing of the human genome.

ACKNOWLEDGMENTS

I thank Environmental Profile Labs (NJ) and Rim Brandeis of Community Memorial Hospital (Toms River, NJ). In Columbia, South Carolina, I thank Mark Simmons, Chamber of Commerce; Julie Wilson, Spacemakers; Carl Mohan, Automatic Sprinkler; Gary Angel, General Contractors; and Ed Huggins and Bob Lauther, University of South Carolina, for their help over the past year. I thank Danette Blackwell, Executive Concepts, and Kane Office Technologies for preparation of this manuscript, and Brumbaugh, Graves, Donahue & Raymond (New York) for patent work. This research was conducted with private funds.

REFERENCES

1. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
2. Donnis-Keller, H. (1980) Nucl. Acids Res. 8, 3133-3142.
3. Maxam, A. M., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. USA 74, 560-564.
4. DuPont (Wilmington, Delaware) and Applied Biosystems (Foster City, CA).
5. Robbins, P. W., and Lipmann, F. (1958) J. Biol. Chem. 233, 686-690; Wilson, L. G., and Bandurski, R. S. (1958) J. Biol. Chem. 233, 975-981.
6. Lundin, A. (1982) in Luminescent Assays: Perspective in Endocrinology and Clinical Chemistry (Serio, M., and Pazzagli, M., Eds.), pp. 29-45, Raven Press, New York.
7. Nyren, P., and Lundin, A. (1985) Anal. Biochem. 151, 504-509.
8. Kricka, L. J., Wienhausen, G. K., Hinkley, J. E., and De Luca, M. (1983) Anal. Biochem. 129, 392-397; Ugarova, N. N., Brovko, L. Y., Ivanova, L. V., Shekhovtsova, T. N., and Dolmanova, I. F. (1986) Anal. Biochem. 158, 1-5; Brovko, L. Y., Ugarova, N. N., Vasil'eva, T. E., Dombrovskii, V. A., and Berezin, I. V. (1978) Biokhimiya 43, 798-805; Brovko, L. Y., Kost, N. V., and Ugarova, N. N. (1980) Biokhimiya 45, 1582-1588.
9. Lundin, A., Rickardsson, A., and Thore, A. (1976) Anal. Biochem. 75, 611-620.
10. Nichols, W. W., Curtis, G. D. W., and Johnston, H. H. (1981) Anal. Biochem. 114, 396-397.
11. DeLuca, M., and McElroy, W. D. (1978) in Methods in Enzymology (DeLuca, M. A., Ed.), Vol. 57, pp. 3-15, Academic Press, New York.
12. Sawhney, S. K., and Nicholas, D. J. D. (1976) Plant Sci. Lett. 6, 103-110.
13. Nicholls, R. G. (1977) Biochem. J. 165, 149-155; Guranowski, A., and Blanquet, S. (1986) J. Biol. Chem. 261, 5943-5946.
14. Moyer, J. D., and Henderson, J. F. (1983) Anal. Biochem. 131, 187-189; McElroy, W. D., and Green, A. (1956) Arch. Biochem. Biophys. 64, 257-271.
15. Darrow, R. A., and Colowick, S. P. (1962) in Methods in Enzymology (Colowick, S. P., and Kaplan, N. O., Eds.), Vol. 5, pp. 226-235, Academic Press, New York.
16. Seal, G., and Loeb, L. A. (1976) J. Biol. Chem. 251, 975-981.
17. Kunkel, T. A., Eckstein, F., Mildvan, A. S., Koplitz, R. M., and Loeb, L. A. (1981) Proc. Natl. Acad. Sci. USA 78, 6734-6738.
18. Tabor, S., and Richardson, C. C. (1987) Proc. Natl. Acad. Sci. USA 84, 4767-4771.