Sequencing Regular and Labeled Oligonucleotides Using Enzymatic Digestion and Ionspray Mass Spectrometry

ANALYTICAL BIOCHEMISTRY 263, 129–138 (1998)
Article No. AB982812

Huaiqin Wu,1 Clifford Chan, and Hoda Aboleneen

Abbott Laboratories, Diagnostics Division, 100 Abbott Park Road, Abbott Park, Illinois 60064

Received November 26, 1997

A method using a combination of enzymatic digestion and ionspray mass spectrometry (MS) was developed for sequencing oligodeoxynucleotides (ODNs) containing more than 20 bases. Phosphodiesterase (PDE) digestion of ODNs produced truncated ODNs whose molecular weights (MWs) were determined by ionspray MS. It was demonstrated that reconstruction of MW spectra over a large MW range produced easy-to-read sequence ladders similar to those obtained using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI–TOF-MS). Sample and enzyme cleanup, digestion control, and MW reconstruction were found to be crucial factors. For regular ODNs, both 5′- and 3′-PDE digestions are needed for complete sequencing. Late in the time course of PDE digestions, 5′-nucleoside monophosphates were found to produce artifactual peaks in the reconstructed MW spectra, and a table correlating base compositions and MS ions was compiled to handle such situations. For labeled ODNs, it is necessary to use collision-induced dissociation–tandem mass spectrometry (CID–MS/MS) for complete sequence determination. Sequencing of regular 22-mer and labeled 18-mer ODNs was demonstrated in this work. © 1998 Academic Press

1 To whom correspondence should be addressed at Abbott Laboratories, Department 90T, Bldg. AP20, 100 Abbott Park Rd., Abbott Park, IL 60064-3500.

2 Abbreviations used: ODN, oligodeoxynucleotide; ES, electrospray; MALDI, matrix-assisted laser desorption ionization; DE, delayed extraction; PSD, postsource decay; CID, collision-induced dissociation; RMW, reconstructed molecular weight; NMP, nucleoside monophosphate; B-C, biotin-C.

0003-2697/98 $25.00 Copyright © 1998 by Academic Press. All rights of reproduction in any form reserved.

Advances in molecular biology have resulted in the wide application of oligodeoxynucleotides (ODNs)2 in antisense therapeutics and DNA probe technology (1, 2). Although automated ODN synthesis has become routine, the analytical methods have been limited for many years to purity examination using chromatographic/electrophoretic techniques and base composition analysis (3). The full characterization of ODNs requires complete sequence determination or verification.

Traditionally, sequencing of small ODNs has mainly been accomplished by the Maxam–Gilbert method (4), which relies on chemical digestions of the ODNs of interest. This method, involving four different chemical digestions and electrophoresis, is highly labor intensive and time consuming and requires handling of radioactive materials. Recently, electrospray (ES) and matrix-assisted laser desorption/ionization (MALDI)–mass spectrometry (MS) have emerged as powerful tools for the characterization of ODNs and their analogs (5, 6), and MALDI has been found to be the most promising for ODN sequence analysis. Three approaches have been widely studied in the literature (5).

The first approach is MS analysis of Sanger sequencing (7) reaction products, in an effort to replace gel electrophoresis with MS for high throughput, sensitivity, and simplicity. Fitzgerald et al. (8) analyzed mock Sanger sequencing reactions for a 24-mer DNA fragment. Shaler et al. (9) analyzed actual Sanger reaction products from a 45-mer template and 12-mer primer and obtained a sequence up to 19-mer past the primer at a picomole level of template and primer. Similarly, with delayed-extraction (DE) MALDI, Roskey et al. (10) analyzed cycle sequencing products up to 37 bases past the 13-mer primer. More recently, Mouradian et al. (11) successfully analyzed DNA fragments generated from the sequencing vector M13-mp19.

The second approach is ladder sequencing using phosphodiesterases (PDEs). These exonucleases sequentially remove nucleotides from either the 3′- or 5′-terminus; the MWs of the shortened products are determined by MS and sequence information is obtained. Pieles et al. (12) demonstrated the complete sequencing of a DNA 13-mer using both 3′-PDE and 5′-PDE. Later, Bentzley et al.
(13) analyzed a 24-mer using this method. With DE-MALDI, Smirnov et al. (14) demonstrated that much longer lengths of DNA can be sequenced, including a 33-mer.

The third approach for sequence analysis is based on gas-phase fragmentation: postsource decay (PSD), prompt fragmentation (in-source), or collision-induced dissociation (CID). Nordhoff et al. (15) obtained sequence information for ODNs of up to 21 bases by in-source fragmentation of deprotonated ODNs using infrared (IR) MALDI–TOF-MS. Similar results have also been obtained by Zhu et al. (16) using UV-MALDI and two-component matrices for ODNs up to 35-mer. Talbo et al. (17) demonstrated the application of PSD for ODN analysis.

In contrast, the literature contains only a limited number of publications on ODN sequencing with electrospray and ionspray MS, and the sizes of ODNs sequenced are much smaller. With CID, MS/MS spectra from ion-trap (18), FTMS (19), or triple-quadrupole MS (20, 21) have been found to provide useful sequence information. However, sequencing ODNs greater than 15-mer can be difficult since the MS/MS spectra are often complicated by multiply charged ions and by the lack of complete ion series (20, 21). With the ladder sequencing method, Limbach et al. (22) obtained a partial sequence of a 10-mer using PDE digestion and ES–MS. Glover et al. (23) sequenced a 10-mer employing PDE digestion coupled with an off-line HPLC–MS method. Sequencing of larger ODNs has not been reported.

In our investigation of problems associated with ladder sequencing of larger ODNs by electrospray/ionspray, we found that the cleanup of enzymes and ODN samples and the control of digestion conditions are crucial. Here we report a method for the enzymatic sequencing of ODNs containing more than 20 bases using the combination of enzymatic digestion and ionspray mass spectrometry. Reconstructed molecular weight (RMW) spectra obtained by this method provide easy-to-read sequence ladders comparable to those obtained from MALDI–TOF-MS. Examples of sequencing a regular ODN 22-mer and a biotin-labeled ODN 18-mer are illustrated.

MATERIALS AND METHODS

Materials

All ODNs were synthesized on an ABI 380B DNA synthesizer using the phosphoramidite approach. Phosphoramidites were purchased from Clontech Laboratories, Inc. (Palo Alto, CA). All solvents were HPLC grade or better. Snake venom phosphodiesterase (3′-PDE) and calf spleen phosphodiesterase (5′-PDE) were purchased from Boehringer-Mannheim Biochemicals (Indianapolis, IN), purified by using Microcon microconcentrators (Amicon, Inc., Beverly, MA) as described below, stored at 2–8°C in a refrigerator, and kept on wet ice before use on the bench.

Purification of PDEs

Commercial 5′-PDE suspended in 3.0 M (NH4)2SO4 solution was purified as follows: 40 µl of 5′-PDE was first centrifuged to precipitate the enzyme. The enzyme pellet was redissolved in deionized water and loaded onto a Microcon-50 microconcentrator with a membrane of 50 kDa MW cutoff. The solution was spun to dryness in a Speedvac, washed three times with 80 µl deionized water, and reconstituted in 80 µl H2O. The final concentration was 1.5 U/ml. To purify 3′-PDE, 40 µl of 3′-PDE in 50% glycerol was loaded onto a Microcon-50, diluted with an equal volume of H2O, and spun to dryness. The enzyme was then washed three times with 80 µl of deionized water, spun dry, and reconstituted in 80 µl H2O. The final concentration was 2 U/ml. All enzymes were stored at 2–8°C and placed on wet ice before use.

Purification of DNA Samples by HPLC and Ion Exchange

DNA samples were loaded onto a 3.9 × 300-mm µBondapak C18 column and cleaned by running 5 mM ammonium acetate (NH4OAc) at 0.5 ml/min for 40 min to replace ubiquitous Na+ ions with NH4+ ions. Deionized water was run for 5 min to remove unbound ions in the column. DNA was eluted with 1:1 acetonitrile/H2O, dried in a Speedvac, and reconstituted in water to a desired concentration (~200 µM).

Snake Venom Phosphodiesterase (3′-PDE) Digestion

Ten microliters of 1:1 acetonitrile/2 mM NH4HCO3 (or, for faster digestions, with 0.1% triethylamine in place of NH4HCO3) and 1–2 µl of 0.2 U/ml of the enzyme were added to an ODN sample (~500 pmol). One microliter of the digestion mixture (~40 pmol) was analyzed by MS approximately every 5 min.

Calf Spleen Phosphodiesterase (5′-PDE) Digestion

Ten microliters of 20 mM HCOONH4 (pH 6.6) and 1–2 µl of 2 U/ml of the enzyme were added to an ODN sample (~500 pmol). One microliter of the digestion mixture was analyzed by MS approximately every 5 min.
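As a rough consistency check on the quantities in the digestion protocols, the amount of ODN in each 1-µl aliquot can be estimated from the stated volumes. The sketch below assumes, which the text does not state, that the ~500 pmol of ODN is pipetted from the ~200 µM stock and that 1.5 µl of enzyme is added; the function and parameter names are illustrative only.

```python
def pmol_per_ul(odn_pmol, stock_um, buffer_ul, enzyme_ul):
    """Estimate the ODN content (pmol per µl) of the digestion mixture."""
    # 1 µM equals 1 pmol/µl, so the stock volume in µl is pmol / µM.
    odn_ul = odn_pmol / stock_um
    total_ul = odn_ul + buffer_ul + enzyme_ul
    return odn_pmol / total_ul


# Assumed inputs: 500 pmol from a 200 µM stock, 10 µl buffer, 1.5 µl enzyme.
estimate = pmol_per_ul(odn_pmol=500, stock_um=200, buffer_ul=10, enzyme_ul=1.5)
print(f"{estimate:.0f} pmol of ODN per µl of digestion mixture")
```

Under these assumptions the estimate comes out at about 36 pmol per microliter, in line with the ~40 pmol quoted for each injection.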
Mass Spectrometry

A PE Sciex API III triple-quadrupole mass spectrometer (Sciex, Thornhill, Ontario, Canada) was used for all experiments. The instrument has a mass-to-charge ratio (m/z) range of 0–2400 and is fitted with a pneumatically assisted electrospray (ionspray) interface. Multiply charged ODN ions were generated by spraying the solution through a stainless steel capillary held at −4000 V. The ODN sample was loop-injected into a carrier solution (1:1 acetonitrile/water with 2 mM NH4HCO3) that was delivered by a syringe infusion pump (Model 22, Harvard Apparatus, South Natick, MA) through a fused silica capillary of 100 µm inner diameter at a flow rate of 20 µl/min. The potential on the sampling orifice was set at 35 V during calibration and was changed to −65 V for DNA signal enhancement. The instrument m/z scale was calibrated with the singly charged ammonium adduct ions of poly(propylene glycols) (PPGs) under unit resolution (50% valley definition). For sequence analysis, count control (CC) 1 was used; this setting is required to enhance the sensitivity of the MS and to counterbalance the signal decrease caused by the lower concentrations of the digestion products. The Sciex data system programs MacSpec 3.22 and Biomultiview 1.2 were used for data processing, and the overlaid spectra were obtained by using Grams/32 (Galactic Industries Corp., Salem, NH).

RESULTS AND DISCUSSIONS

Method Optimization

In the method described below, the purity and MW of the ODN sample were determined first by MS. The sample was then subjected to PDE digestion, leading to sequential loss of nucleotide residues (A, G, C, and T), each equivalent in MW to a dehydrated nucleoside monophosphate, and truncated ODNs were formed. The MWs of these truncated ODNs were determined by IS–MS via MW reconstruction, and the arrangement of the MW peaks generated a sequence ladder. The nucleotide residue difference between two adjacent peaks (indicated by the MW difference) therefore provided sequence information. For regular ODNs, the MW difference between two adjacent peaks representing two ODNs can only be 289 for C, 304 for T, 313 for A, and 329 for G.

With MALDI–TOF-MS, singly charged ions are predominantly formed, and the spectrum of a PDE digestion product provides a convenient way of displaying sequence ladders (8–14). With IS–MS, at least three complications arise due to the nature of the ionization technique and its low tolerance to detergents and salts. First, due to the dominance of multiply charged ions for ODNs, the regular IS–MS spectrum of a PDE digestion mixture must be transformed into an RMW spectrum to obtain a readable sequence ladder. However, MW reconstruction over a range greater than 700 Da has not been reported in the literature (19) for sequencing ODNs, presumably due to the poor quality of the resulting RMW spectra (such as harmonic peaks and chemical noise from 5′-monophosphate nucleosides). Second, metal ion adducts such as Na+ produce additional peaks in an MS spectrum when enzymes and samples are not properly purified, e.g., M + Na+ − H+ (M + 22) and M + 2Na+ − 2H+ (M + 44). This degrades the sensitivity by dividing the ion current among multiple forms of the ODN and complicates spectral interpretation. Third, as PDE digestion proceeds, the concentrations of the truncated ODNs decrease gradually, and higher sensitivity is required for MW determination. At the same time, the concentrations of nucleoside 5′-monophosphates (5′-NMPs) increase rapidly. These 5′-NMPs produce very strong signals in the m/z range of 600–700 due to the formation of nonspecific dimer ions (and relatively weak signals for trimers at m/z 900–1000), which severely interfere with the MW reconstruction. Proper selection of solvents and experimental conditions is necessary to minimize the interference from 5′-NMPs. Therefore, successful enzymatic sequencing using ES/IS–MS requires purification of all reagents, proper selection of solvents, and reconstruction of MW spectra.

Successful sequencing also requires fine control of the digestion rate. Slow digestion yields few or no truncated ODNs for sequence analysis, while very fast digestion may produce only 5′-NMPs, the final digestion products. By controlling the substrate (ODN) and enzyme concentrations and the digestion buffer, the digestion of ODNs can be made to proceed at a rate suitable for MS analysis. The rate can be easily monitored by the shift in the MWs of the shortened ODNs determined by MS. At the beginning of PDE digestion, the reaction proceeds smoothly; it then slows down gradually, presumably due to denaturing of the enzyme by acetonitrile. Addition of more enzyme is necessary to force the digestion to proceed further. It was also found that 3′-PDE digestion in a more basic solution such as 1:1 acetonitrile/0.2% triethylamine (TEA) is much faster than that in 2 mM NH4HCO3. When sample quantity is not a concern, digestion under both conditions may provide complementary data.

ODN Sequencing with 3′- or 5′-PDE

In one example, a 22-mer ODN, 5′-GAC GTC AAG TCG TCA TGG CCC T-3′, was synthesized and submitted for MS analysis. However, the MW determination indicated that the ODN synthesized was 40 Da smaller than expected.
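The 40-Da discrepancy can be checked with a short calculation. The following is a minimal sketch, not the authors' procedure: it assumes standard average residue masses (not given in the article), the usual terminal correction for an ODN with free 5′- and 3′-hydroxyl groups, and the reconstructed MW of 6672 Da reported in the text for the starting ODN. It considers only single-base substitutions, and it cannot say where in the sequence a substitution lies; that requires the ladder sequencing described in this section.

```python
# Average residue masses (nucleoside monophosphate minus water), in Da.
# These are standard textbook values, not taken from the article.
RESIDUE = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
# Linear ODN with free 5'- and 3'-OH ends: add H2O, remove one HPO3.
TERMINAL_CORRECTION = -61.96


def odn_mw(seq):
    """Average molecular weight of an unmodified ODN."""
    return sum(RESIDUE[base] for base in seq) + TERMINAL_CORRECTION


def single_substitutions(shift, tol=1.0):
    """List (old, new) base changes whose mass shift is within tol of shift."""
    return [
        (old, new)
        for old in RESIDUE
        for new in RESIDUE
        if old != new and abs((RESIDUE[new] - RESIDUE[old]) - shift) <= tol
    ]


expected = odn_mw("GACGTCAAGTCGTCATGGCCCT")  # sequence as ordered
observed = 6672.0                            # RMW value reported in the text
shift = observed - expected
print(f"expected {expected:.1f} Da, observed shift {shift:+.1f} Da")
print(single_substitutions(shift))
```

With these assumed masses, the only single-base change compatible with a shift of about −40 Da is G replaced by C.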
The sequence of the ODN was then analyzed using the method described above. Figures 1a and 1b show the ES–MS spectrum and the RMW spectrum of the 22-mer ODN at an early stage of 3′-PDE digestion. As can be seen, it would be extremely difficult to retrieve any information directly from the MS spectrum (Fig. 1a). In contrast, the RMW spectrum (Fig. 1b) obtained using Biomultiview clearly shows eight to nine ODNs. Although the low MW region of the RMW spectrum (at late stages of digestion) is relatively noisy, these noise peaks can be ignored for sequencing purposes because only four mass losses are logically possible. The peaks in the RMW spectrum clearly demonstrate a sequence ladder with the MW of the starting ODN at 6672 Da. For example (Fig. 1b), a MW difference of 304 between the peaks at 6672 and 6368 corresponds to T, which is the first base at the 3′-terminus; a MW difference of 289 between the peaks at 6368 and 6079 corresponds to C, and so on. A partial sequence, 5′-···AT GGC CCT-3′, is therefore readily obtained. Similarly, the sequence can be extended to 5′-···CG TCA ACT CGT CAT GGC CCT-3′, based on the sequence ladders obtained in Fig. 1c. When these spectra are overlaid (using Grams/32), a long sequence ladder can be seen, as shown in Fig. 1d.

FIG. 1. (a) MS spectrum of a 3′-PDE digestion of the ODN 22-mer at early stage (5 min). (b) RMW spectrum from Biomultiview. (c) Stacked RMW spectrum from 3′-PDE digestion of the ODN 22-mer at 15, 30, and 50 min of digestion. (d) Overlaid RMW spectrum of 3′-PDE digestion of the ODN 22-mer.

When 3′-PDE digestion approaches the 5′-terminus of an ODN (or when 5′-PDE digestion approaches the 3′-terminus), ODNs with two, three, four, five, and six bases are constantly produced, and each ODN produces only one or two ions in the commonly measured m/z range of 480 to 1200. This prevents the computer from performing MW reconstruction, which normally requires two or more ions in Biomultiview. However, the small sizes make it possible to calculate directly the m/z values of the IS–MS ions for ODNs of any given base composition. To facilitate the interpretation of ES–MS spectra of ODNs containing five bases or fewer, a table (Table 1) correlating the base compositions of ODNs with their ions in ES–MS was compiled.

TABLE 1

MS Ions from ODNs Containing up to Five Bases

Comp(a) [M–H]−   [M–2H]2−   Comp(a) [M–H]−   [M–2H]2−   Comp(a) [M–H]−   [M–2H]2−
C2      515.2    –          A2CT    1156.4   577.7      T5      1457.5   728.25
CT      530.2    –          C2AG    1157.4   578.2      T2A2C   1460.5   729.75
CA      539.2    –          T3A     1162.4   580.7      C2TAG   1461.5   730.25
T2      545.2    –          T2CG    1163.4   581.2      C3G2    1462.5   730.75
TA      554.2    –          A3C     1165.4   582.2      T4A     1466.5   732.75
CG      555.2    –          CTAG    1172.4   585.7      T3CG    1467.5   733.25
A2      563.2    –          C2G2    1173.4   586.2      A3CT    1469.5   734.25
TG      570.2    –          G2C2    1173.4   586.2      C2A2G   1470.5   734.75
AG      579.2    –          T3G     1178.4   588.7      T3A2    1475.5   737.25
G2      595.2    –          A3T     1180.4   589.7      T2CAG   1476.5   737.75
C3      804.3    401.65     A2CG    1181.4   590.2      C2G2T   1477.5   738.25
C2T     819.3    409.15     T2AG    1187.4   593.2      A4C     1478.5   738.75
C2A     828.3    413.65     G2CT    1188.4   593.7      T4G     1482.5   740.75
T2C     834.3    416.65     A4      1189.4   594.2      A3T2    1484.5   741.75
CTA     843.3    421.15     A2TG    1196.4   597.7      A2CTG   1485.5   742.25
C2G     844.3    421.65     G2CA    1197.4   598.2      C2G2A   1486.5   742.75
T3      849.3    424.15     T2G2    1203.4   601.2      T3AG    1491.5   745.25
A2C     852.3    425.65     A3G     1205.4   602.2      T2G2C   1492.5   745.75
T2A     858.3    428.65     G2TA    1212.4   605.7      A4T     1493.5   746.25
CTG     859.3    429.15     G3C     1213.4   606.2      A3CG    1494.5   746.75
A2T     867.3    433.15     G2A2    1221.4   610.2      T2A2G   1500.5   749.75
CAG     868.3    433.65     G3T     1228.4   613.7      G2CTA   1501.5   750.25
T2G     874.3    436.65     G3A     1237.4   618.2      A5      1502.5   750.75
A3      876.3    437.65     G4      1253.4   626.2      G3C2    1502.5   750.75
TAG     883.3    441.15     C5      1382.5   690.75     T3G2    1507.5   753.25
G2C     884.3    441.65     C4T     1397.5   698.25     A3TG    1509.5   754.25
A2G     892.3    445.65     C4A     1406.5   702.75     A2G2C   1510.5   754.75
G2T     899.3    449.15     C3T2    1412.5   705.75     T2G2A   1516.5   757.75
G2A     908.3    453.65     C3TA    1421.5   710.25     G3CT    1517.5   758.25
G3      924.3    461.65     C4G     1422.5   710.75     A4G     1518.5   758.75
C4      1093.4   546.2      T3C2    1427.5   713.25     A2G2T   1525.5   762.25
C3T     1108.4   553.7      C3A2    1430.5   714.75     G3CA    1526.5   762.75
C3A     1117.4   558.2      C2T2A   1436.5   717.75     G3T2    1532.5   765.75
C2T2    1123.4   561.2      C3TG    1437.5   718.25     A3G2    1534.5   766.75
C2TA    1132.4   565.7      T4C     1442.5   720.75     G3TA    1541.5   770.25
C3G     1133.4   566.2      C2A2T   1445.5   722.25     G4C     1542.5   770.75
T3C     1138.4   568.7      C3AG    1446.5   722.75     G3A2    1550.5   774.75
A2C2    1141.4   570.2      T3CA    1451.5   725.25     G4T     1557.5   778.25
T2CA    1147.4   573.2      C2T2G   1452.5   725.75     G4A     1566.5   782.75
C2TG    1148.4   573.7      A3C2    1454.5   726.75     G5      1582.5   790.75
T4      1153.4   576.2

(a) Comp, composition.

When the MWs of the shortened ODNs are below 2000 Da, any prominent peak, especially one away from the crowded peak areas in the mass spectrum of a digestion mixture, can be compared against ions in the


FIG. 2. MS spectrum of an enzymatic digestion at a later stage showing small ODNs as well as noncovalent 5′-NMP dimers.
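The direct calculation of m/z values for short ODNs, as tabulated in Table 1, can be sketched as follows. This is an illustrative reconstruction, not the authors' procedure: the residue masses (289.1, 304.1, 313.1, and 329.1) and the terminal term of −62.0 are assumptions inferred from the increments between entries in Table 1, and the function names are hypothetical. Masses are held as integers in hundredths of a dalton so that comparisons are exact.

```python
from itertools import combinations_with_replacement

# Masses in hundredths of a Da. Residue values and the terminal term are
# inferred from the increments in Table 1; they are not stated in the text.
RESIDUE = {"A": 31310, "C": 28910, "G": 32910, "T": 30410}
TERMINAL = -6200
PROTON = 100


def ion_table(max_len=5):
    """Map m/z (hundredths) -> list of (composition, charge) for short ODNs."""
    table = {}
    for n in range(2, max_len + 1):
        for comp in combinations_with_replacement("ACGT", n):
            neutral = sum(RESIDUE[base] for base in comp) + TERMINAL
            # Table 1 lists no doubly charged ion for two-base ODNs.
            charges = (1,) if n == 2 else (1, 2)
            for z in charges:
                mz = (neutral - z * PROTON) // z
                table.setdefault(mz, []).append(("".join(comp), z))
    return table


def lookup(mz, table, tol=0.05):
    """Return (composition, charge) candidates within tol of an observed m/z."""
    target = round(mz * 100)
    limit = round(tol * 100)
    hits = []
    for key in sorted(table):
        if abs(key - target) <= limit:
            hits.extend(table[key])
    return hits


TABLE = ion_table()
for peak in (1197.4, 598.2, 868.3, 579.2, 1502.5):
    print(peak, lookup(peak, TABLE))
```

Compositions are reported as sorted base strings, so "ACGG" corresponds to the table entry G2CA. A lookup at m/z 1502.5 returns two candidates (A5 and G3C2), which illustrates that a single ion does not always fix the composition.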

table to obtain its base composition. With repeated analysis of the digestion mixture, a sequence ladder can be built to obtain sequence information. For example, peaks at m/z 1197.4 and m/z 598.2 (Fig. 2) indicate a composition of ACG2, a peak at m/z 868.3 indicates a base composition of CAG, a peak at m/z 579.2 indicates a base composition of AG, and the partial sequence 5′-···CG···-3′ is obtained.

When 3′-PDE digestion approaches the 5′-terminus of an ODN, the difficulty in reconstructing the MW spectrum is also increased by the appearance of dominant MS ions in the m/z range of 600 to 700. These peaks were found to be cluster ions from noncovalent dimers of 5′-NMPs (Fig. 2 and Table 2), arising from the increasing concentration of 5′-NMPs during 3′-PDE digestion. Ions at m/z 900–1000 due to 5′-NMP trimers were also observed but were much less abundant. These noncovalent complex peaks were often of high abundance at late stages of digestion. When MW reconstruction is performed by the computer, these ions are not distinguished from those of ODNs, and artifactual peaks are produced. Therefore, RMW spectra from such MS spectra can be misleading.

Sequencing with Both 3′- and 5′-PDE

With 3′-PDE, the sequence of the last two bases at the 5′-terminus cannot be determined, because further digestion produces a nucleotide which has already been produced earlier during digestion (m/z 306, 321, 330, and 346) and a nucleoside which does not show up in negative-ion mode. For example, ions with m/z 579.2 indicate a sequence of AG or GA, but further digestion does not reveal which is correct. Similarly, with 5′-PDE digestion, the sequence of the last two bases at the 3′-terminus cannot be determined. Because of this, 3′-PDE and 5′-PDE are frequently used, in separate experiments, to obtain the complete sequence information. Sequencing using both enzymes also has other advantages. As 3′-PDE digestion proceeds to the 5′-terminus of

TABLE 2

MS Ions from Noncovalent 5′-NMP Dimers

Complex [M]   m/z [M–H]−
C·C           613.2
C·T           628.2
C·A           637.2
C·G           653.2
T·T           643.2
T·A           652.2
T·G           668.2
A·A           661.2
A·G           677.2
G·G           693.2
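The dimer ions in Table 2 follow from adding two neutral 5′-NMP masses and removing one proton, so they can be computed and used to flag cluster peaks before MW reconstruction. The sketch below is illustrative only: the neutral NMP masses (307.1, 322.1, 331.1, and 347.1) are assumptions inferred from Table 2, the 0.3 m/z tolerance is arbitrary, and the function names are hypothetical. Masses are held as integers in tenths of a dalton.

```python
from itertools import combinations_with_replacement

# Neutral 5'-NMP masses in tenths of a Da, inferred from Table 2
# (for example, C·C at 613.2 implies 307.1 for the C monophosphate).
NMP = {"C": 3071, "T": 3221, "A": 3311, "G": 3471}
PROTON = 10


def cluster_ions(size=2):
    """Map m/z (tenths) -> label for singly charged noncovalent NMP clusters."""
    ions = {}
    for combo in combinations_with_replacement("CTAG", size):
        mz = sum(NMP[base] for base in combo) - PROTON
        ions[mz] = "·".join(combo)
    return ions


def flag_clusters(peaks, tol=0.3):
    """Split observed m/z values into (cluster hits, remaining peaks)."""
    known = {**cluster_ions(2), **cluster_ions(3)}
    limit = round(tol * 10)
    hits, rest = [], []
    for mz in peaks:
        target = round(mz * 10)
        labels = [label for key, label in known.items()
                  if abs(key - target) <= limit]
        if labels:
            hits.append((mz, labels[0]))
        else:
            rest.append(mz)
    return hits, rest


hits, rest = flag_clusters([579.2, 661.2, 677.2, 868.3])
print("cluster ions:", hits)
print("kept for interpretation:", rest)
```

In this example the peaks at m/z 661.2 and 677.2 are flagged as A·A and A·G dimers, while 579.2 and 868.3 are kept as possible short-ODN ions.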


FIG. 3. Overlaid RMW spectrum (three RMW spectra from Biomultiview) of 5′-PDE digestion of the 22-mer.
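Reading a sequence ladder from an RMW spectrum such as those in Figs. 1 and 3 amounts to assigning each peak-to-peak mass difference to the nearest nucleotide residue. The sketch below is a minimal illustration under stated assumptions: it uses standard average residue masses (not given in the article) and a 1-Da tolerance, and the simulated ladders are ideal computed masses for the 22-mer sequence reported in this article, not measured data. The function names are hypothetical.

```python
# Average residue masses in Da (standard values, not taken from the article).
RESIDUE = {"C": 289.18, "T": 304.20, "A": 313.21, "G": 329.21}


def read_ladder(peaks, tol=1.0):
    """Bases lost between consecutive RMW peaks, heaviest peak first."""
    ordered = sorted(peaks, reverse=True)
    bases = []
    for heavy, light in zip(ordered, ordered[1:]):
        delta = heavy - light
        base = min(RESIDUE, key=lambda b: abs(RESIDUE[b] - delta))
        bases.append(base if abs(RESIDUE[base] - delta) <= tol else "?")
    return "".join(bases)


def simulate_ladder(seq, from_end, shortest=2):
    """Ideal MWs of the truncated ODNs for digestion from the '3' or '5' end."""
    masses = []
    for n in range(len(seq), shortest - 1, -1):
        piece = seq[:n] if from_end == "3" else seq[len(seq) - n:]
        masses.append(sum(RESIDUE[b] for b in piece) - 61.96)
    return masses


# Peaks quoted in the text for the 3'-PDE digestion (Fig. 1b).
print(read_ladder([6672, 6368, 6079]))  # order of loss from the 3'-end

seq = "GACGTCAACTCGTCATGGCCCT"          # 22-mer sequence reported in the article
lost_3 = read_ladder(simulate_ladder(seq, "3"))  # bases in 3'->5' order
lost_5 = read_ladder(simulate_ladder(seq, "5"))  # bases in 5'->3' order
print(lost_3[::-1], lost_5)
```

Because 3′-PDE removes bases from the 3′-end, the string read from that ladder is reversed before it is written in the 5′→3′ direction. Stopping the simulated ladders at the two-base stage mirrors the observation that neither enzyme alone resolves the final two bases at the far terminus.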

the oligomer, digestion slows down gradually and it takes much more time to obtain sequence information for the 5′-terminus. In addition, noise becomes significant, presumably due to decreased ODN concentrations and the increasing concentration of 5′-NMPs. When a separate digestion is performed from the 5′-terminus using 5′-PDE, 5′-terminal sequence information can be obtained much faster, since it is available from the early stage of digestion. Most importantly, digestions from both termini provide overlapping sequences, which verify the sequence obtained from each digestion. As illustrated in Fig. 3, a partial sequence 5′-GAC GTC AAC TCG T···-3′ was obtained from the 5′-PDE digestion of the aforementioned ODN 22-mer. The 5′-terminal sequence information, which was difficult to obtain from 3′-PDE digestion, was readily obtained by 5′-PDE digestion. An overlapping sequence of 11 bases was obtained.

5′-· ·CG TCA ACT CGT CAT GGC CCT-3′   (3′ → 5′ digestion)
5′-GACG TCA ACT CGT···········-3′   (5′ → 3′ digestion)

Based on the complementary sequence information, the complete ODN sequence is therefore derived from this method: 5′-G ACG TCA ACT CGT CAT GGC CCT-3′. The sequence determined differs from the sequence provided in that a G is replaced by a C, which accounts for the 40-Da mass difference. This result strongly emphasizes the importance of sequence verification.

Sequencing Labeled ODNs

When ODNs are labeled at the 5′- or 3′-terminus, as is the case for DNA probes, PDE digestion can only be performed using one of the two PDEs. For instance, with a biotinylated ODN 18-mer, 5′-B CGT ATG AGT GAT TCC TCC-3′, in which the biotin moiety (B) is joined to the ODN by a regular phosphodiester bond, only 3′-PDE can be used for sequencing. Figure 4a shows the overlaid MW spectrum from 3′-PDE digestion of the 5′-labeled 18-mer, showing the sequence of 17 bases in the ODN: 5′-···GT ATG AGT GAT TCC TCC-3′. The MW spectra were obtained from digestion mixtures at 3, 6, 10, 15, and 25 min of the digestion, respectively. Digestion stopped when it was one base away from the label, producing biotin-C (B-C) at m/z 661.2. To determine the base attached to the biotin moiety, the CID–MS/MS spectrum of the deprotonated B-C (Fig. 4b) was obtained. The base attached to the label can easily be identified by the presence of a deprotonated purine or pyrimidine base at m/z 110,


FIG. 4. (a) Overlaid RMW spectrum (four RMW spectra) of 3′-PDE digestion products of the 5′-biotin-labeled 18-mer. (b) CID–MS/MS spectrum of the final digestion product containing the biotin label.

125, 134, and 150 for C, T, A, and G, respectively. In this case, both the ion at m/z 110 from deprotonated cytosine and the ion at m/z 550 from the loss of cytosine from the deprotonated digestion product indicate that the base was C, and the complete sequence of the ODN was obtained.
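The identification of the label-adjacent base from the CID–MS/MS spectrum can be expressed as a simple two-ion check. The following sketch is illustrative rather than the authors' procedure: it uses the nominal base-anion m/z values given in the text, assumes the neutral base is one mass unit heavier than its anion, and applies an arbitrary 0.5 m/z tolerance; the function name is hypothetical.

```python
# Nominal m/z of the deprotonated bases, as listed in the text.
BASE_ANION = {110: "C", 125: "T", 134: "A", 150: "G"}


def identify_base(precursor_mz, fragment_mzs, tol=0.5):
    """Return bases supported by both the base anion and the base-loss ion."""
    supported = []
    for anion_mz, base in BASE_ANION.items():
        neutral_base = anion_mz + 1            # neutral base = anion + one proton
        loss_mz = precursor_mz - neutral_base  # precursor after loss of the base
        has_anion = any(abs(f - anion_mz) <= tol for f in fragment_mzs)
        has_loss = any(abs(f - loss_mz) <= tol for f in fragment_mzs)
        if has_anion and has_loss:
            supported.append(base)
    return supported


# B-C precursor at m/z 661.2 with the two fragment ions named in the text.
print(identify_base(661.2, [110, 550]))
```

Requiring both ions, rather than the base anion alone, mirrors the reasoning in the text, where m/z 110 and m/z 550 together indicate C.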


CONCLUSIONS

We have developed an IS–MS method for enzymatic sequencing of ODNs with more than 20 bases. Sequencing of both regular and labeled ODNs has been demonstrated using a combination of 5′-PDE and 3′-PDE digestions and CID–MS/MS. We demonstrated that reconstruction of MW spectra over a large MW range produced an easy-to-read sequence ladder similar to those in MALDI–MS spectra. The cleanup of both samples and enzymes, the control of enzymatic digestion, and MW reconstruction were found to be crucial for sequencing with ES/IS–MS. For regular ODNs, both 5′- and 3′-PDEs must be used in separate experiments for complete and faster sequencing. Late in the time course of digestion, the increasing concentrations of 5′-NMPs interfere with MW reconstruction. A table correlating ODNs up to 5 bases long with their MS ions has been compiled to handle such situations. Complete sequencing of a 22-mer ODN was demonstrated and a base error in the sequence was identified. For labeled ODNs, CID–MS/MS must be used along with PDE digestions. With the 5′-biotin-labeled 18-mer ODN, the sequence of 17 bases was determined by 3′-PDE digestion, and the last base adjacent to the biotin was determined by CID–MS/MS. These examples demonstrate that the method is suitable for sequencing both regular and labeled ODNs more than 20 bases long and that ESI–MS provides a viable option for sequencing ODNs, especially when a MALDI instrument is not available. The high mass accuracy, high resolution, and ease of MS/MS operation permit sequencing of terminally labeled ODNs, as demonstrated in this work.

REFERENCES

1. Zamecnik, P. C. (1996) in Antisense Therapeutics (Agrawal, S., Ed.), pp. 1–11, Humana Press, Totowa, NJ.
2. Kricka, L. J. (1992) in Nonisotopic DNA Probe Techniques (Kricka, L. J., Ed.), pp. 1–28, Academic Press, San Diego.
3. Gaffney, B. L., Marky, L. A., and Jones, R. A. (1984) Biochemistry 23, 5686–5691.
4. Maxam, A. M., and Gilbert, W. (1980) in Methods in Enzymology (Colowick, S. P., and Kaplan, N. O., Eds.), Vol. 65, pp. 499–580, Academic Press, New York.
5. Limbach, P. A. (1996) Mass Spectrom. Rev. 15, 297–336.
6. Murray, K. K. (1996) J. Mass Spectrom. 31, 1203–1215.
7. Sanger, F., Nicklen, S., and Coulson, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463–5467.
8. Fitzgerald, M. C., Zhu, L., and Smith, L. M. (1993) Rapid Commun. Mass Spectrom. 7, 895–897.
9. Shaler, T. A., Tan, Y., Wickham, J. N., Wu, K. J., and Becker, C. H. (1995) Rapid Commun. Mass Spectrom. 9, 942–947.
10. Roskey, M. T., Juhasz, P., Smirnov, I. P., Takach, E. J., Martin, S. A., and Haff, L. A. (1996) Proc. Natl. Acad. Sci. USA 93(10), 4724–4729.
11. Mouradian, S., Rank, D. R., and Smith, L. M. (1996) Rapid Commun. Mass Spectrom. 10(12), 1475–1478.
12. Pieles, U., Zurcher, W., Schar, M., and Moser, H. E. (1993) Nucleic Acids Res. 21, 3191–3196.
13. Bentzley, C. M., Johnson, M. V., Larsen, B. S., and Gutteridge, S. (1996) Anal. Chem. 68, 2141–2146.
14. Smirnov, I. P., Roskey, M. T., Juhasz, P., Takach, E. J., Martin, S. A., and Haff, L. A. (1996) Anal. Biochem. 238(1), 19–25.
15. Nordhoff, E., Karas, M., Cramer, R., Hahner, S., Hillenkamp, F., Kirpekar, F., Lezius, A., Muth, J., Meier, J., and Engels, J. W. (1995) J. Mass Spectrom. 30, 99–112.
16. Zhu, Y. F., Taranenko, N. I., Allman, S. L., Taranenko, N. V., Martin, S. A., Haff, L. A., and Chen, C. H. (1997) Rapid Commun. Mass Spectrom. 11, 897–903.
17. Talbo, G., and Mann, M. (1996) Rapid Commun. Mass Spectrom. 10, 100–103.
18. McLuckey, S. A., and Habibi-Goudarzi, S. (1994) J. Am. Soc. Mass Spectrom. 5, 740–747.
19. Little, D. P., Chorush, R. A., Speir, J. P., Senko, M. W., Kelleher, N. L., and McLafferty, F. W. (1994) J. Am. Chem. Soc. 116, 4893–4897.
20. Ni, J., Pomerantz, S. C., Rozenski, J., Zhang, Y., and McCloskey, J. A. (1996) Anal. Chem. 68, 1989–1999.
21. Wang, P., Bartlett, M. G., and Marin, L. B. (1997) Rapid Commun. Mass Spectrom. 11, 846–856.
22. Limbach, P. A., McCloskey, J. A., and Crain, P. F. (1993) Nucleic Acids Symp. Ser. 31, 127–128.
23. Glover, R. P., Sweetman, G. M. A., Farmer, P. B., and Roberts, G. C. K. (1995) Rapid Commun. Mass Spectrom. 9, 897–901.