Analytical Biochemistry 377 (2008) 95–104
Picomole-level mapping of protein disulfides by mass spectrometry following partial reduction and alkylation Susan F. Foley 1, Yaping Sun 1, Timothy S. Zheng, Dingyi Wen * Biogen Idec, 14 Cambridge Center, Cambridge, MA 02142, USA
Article info

Article history: Received 15 January 2008. Available online 4 March 2008.

Keywords: Disulfide mapping; Disulfide structure determination; Partial reduction and alkylation; TIM-1 IgV; Fn14; TNF receptor family; TIM family

Abstract

We have deduced the disulfide bond linkage patterns, at very low protein levels (<0.5 nmol), in two cysteine-rich polypeptide domains using a new strategy involving partial reduction/alkylation of the protein, followed by peptide mapping and tandem mass spectrometry (MS/MS) sequencing on a nanoflow liquid chromatography-MS/MS system. The substrates for our work were the cysteine-rich ectodomain of human Fn14, a member of the tumor necrosis factor receptor family, and the IgV domain of murine TIM-1 (T-cell, Ig domain, and mucin domain-1). We have successfully determined the disulfide linkages for Fn14 and independently confirmed those of the IgV domain of TIM-1, whose crystal structure was published recently. The procedures that we describe here can be used to determine the disulfide structures for proteins with complex characteristics. They will also provide a means to obtain important information for structure–function studies and to ensure correct protein folding and batch-to-batch consistency in commercially produced recombinant proteins. © 2008 Elsevier Inc. All rights reserved.
* Corresponding author. Fax: +1 (617)-679-3635. E-mail address: dingyi.wen@biogenidec.com (D. Wen).
1 These authors contributed equally to this work.
0003-2697/$ – see front matter © 2008 Elsevier Inc. All rights reserved. doi:10.1016/j.ab.2008.02.025

Protein structure–function relationships are studied for a variety of reasons, ranging from the elucidation of basic biochemical pathways to the identification of therapeutic targeting strategies. The preferred approach to these studies utilizes analytical tools such as X-ray crystallography and NMR for determining the structure of a protein or a protein complex. Although successes using this approach have increased in recent years, there remains a significant unmet need for structural information, due mainly to the intrinsic requirements of both of the aforementioned techniques for relatively large amounts or high concentrations of homogeneous protein [1]. In cases where neither NMR nor crystallography is feasible or where the data from these techniques are incomplete, disulfide mapping, often in combination with structural models, proves invaluable. Disulfide mapping is also required to ensure that recombinant proteins have folded correctly from batch to batch.

The traditional approach to disulfide mapping is to enzymatically cleave the protein between each Cys residue and then define the disulfide-linked peptides by mass spectrometry or N-terminal sequencing. The limitations of this approach are obvious: even the most nonspecific proteases cannot easily cleave “cysteine knots” or between adjacent cysteine residues. In early studies, digestion with papain and pepsin was carried out after partial reduction of disulfide bonds with cysteine to fully elucidate the disulfide linkages in antibodies [2]. More recently, a method developed to avoid issues such as steric hindrance of enzymes and to resolve even difficult disulfide arrangements is partial reduction and cyanylation using 1-cyano-4-dimethylaminopyridinium tetrafluoroborate [3,4] followed by treatment with a strong base for cleavage at the N terminus of cyanylated cysteine. While this approach offers high cleavage site specificity, there are many drawbacks. For example, protein backbone modification [3–5] and complex side reactions caused by exposure to a strong base (e.g., β-elimination, neutralization of carboxyl groups) can result in significantly lower yields of cleavage products [4,5] and can complicate or confound interpretation of the mass spectra. The time required for optimization and consumption of sometimes limited amounts of sample are also potential issues because the reactivity of cyanylation products is sequence dependent [3,5,6]. An alternative method for handling even difficult disulfide motifs is partial reduction and alkylation, which works at acidic pH, essentially eliminating the potential for disulfide scrambling and generating few side reactions.

Recently, enhanced mass detection instrumentation and nanoflow liquid chromatography have provided a new opportunity to analyze a wide range of substrates at femtomole to low-picomole levels, but application of these methods to disulfide mapping has yet to take advantage of these improvements. In fact, conventional disulfide mapping using partial reduction and alkylation routinely consumes more than 20 nmol of a protein per experimental cycle [6,7], which is due mainly to the material required for optimization of
Picomole-level mapping of protein disulfides / S.F. Foley et al. / Anal. Biochem. 377 (2008) 95–104
enzymatic digestion, partial reduction, chromatographic separations, and mass analyses of the singly reduced and alkylated products. As a result, this method is particularly time consuming and its application is limited mostly to peptides or very small proteins.

Here we describe an approach that combines specific enzymatic cleavage, a generally applicable recipe for partial reduction and alkylation, and MS/MS sequencing coupled with a nanoflow LC for separation. A major advantage of this approach is its high sensitivity because it generally requires only picomole quantities of the protein in microliter volumes. A complicated disulfide structure often can be solved with as little as 0.5 nmol of a protein. The substrates that we used to demonstrate this approach are human fibroblast growth factor inducible 14 (Fn14) [8,9] and the N-terminal IgV domain of murine T cell, Ig domain, and mucin domain-1 (TIM-1) [10,11]. Fn14 is the cognate receptor of tumor necrosis factor (TNF)-like weak inducer of apoptosis (TWEAK) [8,9]. Until the present studies, the disulfide structure of Fn14 was uncertain, having been described as possessing either of two distinct TNFR (TNF receptor) disulfide architectures [12,13]. The IgV domain of muTIM-1 has a unique disulfide structure among all of the IgV superfamily (IgSF) [11]. Both examples highlight the advantage of bringing specificity and sensitivity to techniques for disulfide mapping.

Materials and methods

Expression and purification of Fn14-Myc-His and murine TIM-1-IgV-Fc

Fn14-Myc-His was expressed in Pichia pastoris yeast cells and purified by metal chelate chromatography. The IgV domain of murine TIM-1 fused with the Fc portion of human IgG1 (referred to as murine TIM-1-IgV-Fc) was expressed in Chinese hamster ovary (CHO) cells and purified as described by Sizing et al. [14].

Alkylation of Fn14-Myc-His and murine TIM-1-IgV-Fc

Alkylation was carried out under denaturing but nonreducing conditions.
Approximately 0.5 nmol of the protein in 25 µL phosphate-buffered saline was treated with 2.5 µL of 1 M 4-vinylpyridine (4-vp) followed by 25 mg solid guanidine hydrochloride (GuHCl) to ensure denaturation. The volume was then increased to 100 µL by the addition of 6 M GuHCl, 0.5 M Tris–HCl, pH 7.6. The solution was held at room temperature in the dark for 1 h. The 4-vp-treated protein was subsequently recovered by precipitation with 40 volumes of chilled ethanol [15]. Specifically, after dilution of the sample in ethanol, the solution was held at −20 °C for 1 h and then centrifuged at ~6700g for 12 min at 4 °C. The supernatant was discarded and the pellet was washed once with chilled ethanol.

N-deglycosylation of murine TIM-1-IgV-Fc

N-linked glycans were removed from the 4-vp-treated protein with PNGase F (Roche). Briefly, the pellet from the preceding procedure (containing approximately 18 µg of 4-vp-treated protein) was dissolved in 2 M urea, 0.06 M methylamine HCl, 0.2 M Tris–HCl, pH 6.5, to give a protein concentration of about 0.35 mg/mL. Approximately 12.5 milliunits of PNGase F was then added and the solution was held at room temperature for 24 h.
2 Abbreviations used: TNF, tumor necrosis factor; TWEAK, TNF-like weak inducer of apoptosis; TNFR, TNF receptor; IgSF, IgV superfamily; 4-vp, 4-vinylpyridine; TFA, trifluoroacetic acid; DTT, dithiothreitol; MALDI-TOF MS, matrix-assisted laser desorption ionization time-of-flight mass spectrometry; CRD, cysteine-rich domain; CHO, Chinese hamster ovary; TNFRSF, TNFR superfamily.
Intact mass measurement

The molecular masses of the proteins were assessed under reducing conditions using an LC-MS system composed of an HPLC (2695 Alliance Separations Module), a 2487 dual-wavelength UV detector, and a ZMD mass spectrometer (Waters Corp.). Salts were removed from the sample on a Vydac 2.1 × 10-mm C4 guard column (214GD52). The protein was eluted with a linear gradient (from 0 to 70% acetonitrile in 0.03% TFA) at a flow rate of 100 µL/min. The column temperature was 30 °C. The resulting mass spectra were deconvoluted using the MaxEnt 1 program.

Endo-AspN, Endo-LysC, and tryptic digestion, and separation of digested Fn14-Myc-His fragments

Approximately 0.5 nmol of 4-vp-treated protein was digested with 8% (w/w) endoprotease Asp-N (endo-AspN; Roche) in a solution containing 2.5 M urea, 20 mM methylamine, 5 mM MgCl2, 0.2 M Tris–HCl, pH 6.5, for 8 h at room temperature, followed by addition of another aliquot of endo-AspN (8% w/w). After an additional 8 h, about 15% (w/w) endoprotease LysC (endo-LysC; WACO) was added, and the solution was held at room temperature for 20 h. Finally, 5% (w/w) of trypsin (Promega) was added and the solution was held at room temperature for 8 h more. Prior to analysis of the digest on an LC-MS system, freshly prepared 8 M urea in 0.2 M methylamine HCl was added to improve peptide solubility. The solution was split into two parts: one part was analyzed after reduction with 50 mM DTT for 1 h, and the other part was directly analyzed without reduction. The reduced and nonreduced digests were assessed using an LC-MS system composed of an HPLC (2695 Alliance Separations Module), a 2487 dual-wavelength UV detector, and an LCT mass spectrometer (Waters Corp.). The HPLC was equipped with a Vydac 1 × 250-mm C18 column (218TP51). Peptides were eluted with a 150-min linear gradient (from 0 to 70% acetonitrile in 0.03% TFA) at a flow rate of 70 µL/min. The column temperature was 30 °C.
The disulfide-linked peptide cluster containing Cys residues was identified by mass spectrometry and collected for further partial reduction experiments.

Tryptic digestion and separation of digested murine TIM-1-IgV-Fc fragments

Approximately 0.5 nmol of alkylated, N-deglycosylated protein in buffer containing 2 M urea, 0.06 M methylamine HCl, 0.2 M Tris–HCl, pH 6.5, was digested with ~5% (w/w) endo-LysC for 4 h at room temperature; then a second aliquot of endo-LysC was added (5%, w/w) and the solution was incubated for an additional 4 h. Trypsin (5%, w/w; Promega) was added and the digestion was allowed to proceed for 16 h at room temperature. A second aliquot of trypsin (5%, w/w) was then added and the digestion was allowed to proceed for 6 h more. Additions of endo-LysC and trypsin were repeated, as described, over another 30-h period. An enzyme blank was prepared as described above for the sample, except that TIM-1-IgV-Fc was not added to the vial. Both the digest and the enzyme blank were analyzed on the LC-MS system as described above for Fn14-Myc-His. The disulfide-linked IgV peptide cluster was collected for further analysis.

Partial reduction and alkylation of the disulfide-linked peptides

Conditions for the partial reduction and alkylation of disulfide-linked tryptic peptides were optimized as follows. The disulfide-linked peptide cluster isolated by C18 chromatography was dried under vacuum and dissolved in buffer containing 0.1 M sodium citrate, pH 3.0, 6 M GuHCl at an estimated concentration of 10 pmol/µL. Approximately 2 µL (estimated at 20 pmol protein/peptide)
Fig. 1. Schematic summary of human Fn14 sequence. Brackets represent disulfide bonds determined in this study.
was mixed with an aliquot of a stock solution containing 2 mM Tris(2-carboxyethyl)phosphine hydrochloride (TCEP; Pierce) in 0.1 M citrate, 6 M GuHCl buffer, pH 3.0. The amounts of TCEP tested were 0.025, 0.10, 0.25, and 0.5 nmol. The final volume of the reaction was adjusted to 2.5 µL with the sample buffer. Reduction was carried out at 37 °C for 15 min and was stopped by alkylating the reduced sulfhydryls with 20 mM N-ethylmaleimide (NEM; Pierce) at 37 °C for 1 h in the dark. Once the optimal conditions for partial reduction were established, i.e., 0.1 nmol TCEP/pmol of Fn14 and 0.25 nmol TCEP/pmol of TIM-1 IgV, the remaining sample was subjected to the partial reduction/alkylation protocol and the resulting peptide mixture was fractionated on a 1.0 × 250-mm Vydac C18 column. The extent of reduction and the identities of the peaks were determined by mass spectrometry on a nanoflow LC-MS/MS system or a MALDI-TOF MS spectrometer, as described below.
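The TCEP titration above can be restated as a molar excess over the peptide cluster and as a final reductant concentration. The short Python sketch below does only that arithmetic from the quantities given in the text (about 20 pmol of cluster, 2.5 µL final volume, four TCEP amounts); the function and variable names are illustrative and are not part of the published protocol, and the peptide amount is itself an estimate.

```python
# Illustrative arithmetic only; names are hypothetical, values come from the text.
PEPTIDE_PMOL = 20.0       # ~2 uL at an estimated 10 pmol/uL
FINAL_VOLUME_UL = 2.5     # final reaction volume
TCEP_NMOL_TESTED = [0.025, 0.10, 0.25, 0.5]


def tcep_conditions(tcep_nmol, peptide_pmol=PEPTIDE_PMOL, volume_ul=FINAL_VOLUME_UL):
    """Return (molar excess of TCEP over the peptide cluster, final TCEP in uM)."""
    molar_excess = tcep_nmol * 1000.0 / peptide_pmol   # nmol -> pmol
    final_um = tcep_nmol / volume_ul * 1000.0          # nmol/uL is mM; x1000 gives uM
    return molar_excess, final_um


for amount in TCEP_NMOL_TESTED:
    excess, conc = tcep_conditions(amount)
    print(f"{amount:5.3f} nmol TCEP: {excess:5.2f}-fold over cluster, {conc:6.1f} uM final")
```

Note that the excess is expressed per peptide cluster, not per disulfide bond; a cluster with two or three disulfides sees a proportionally lower excess per bond.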
Endo-AspN digestion of the 4751.4-Da disulfide-linked peptide cluster of murine TIM-1-IgV-Fc

The fraction containing the disulfide-linked peptide cluster with a molecular mass of 4751.4 Da was evaporated to dryness under vacuum, and the residue was redissolved in 10 µL of 2 M urea, 0.125 M Tris–HCl, pH 6.5, 5 mM MgCl2 and digested with ~0.3 µg of endo-AspN (Roche) overnight at room temperature.

Identification of peptides and specific alkylation sites by mass spectrometry

Identification of peptides was done either on a nanoflow LC-MS/MS system [specifically, a Waters Nano-Acquity UPLC (Ultra Performance LC) system in line with a Waters QTOF Premier
Table 1
Disulfide-linked peptides detected in a combined AspN and tryptic digest of nonreduced human Fn14-Myc-His

Disulfide-linked peptide^a          Residue numbers^a           Observed molecular mass (Da)^b   Calculated molecular mass (Da)^b
TD1/TD2 with one disulfide bond     1–11 (TD1), 22–23 (TD2)     1365.57                          1365.55
TD3/TD4 with two disulfide bonds    24–29 (TD3), 35–49 (TD4)    2183.98                          2183.88

a TD designations denote predicted AspN/tryptic peptides from huFn14-myc-His tag.
b Monoisotopic masses.
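As a quick check on how closely the observed and calculated masses in Table 1 agree, the mass error can be expressed in parts per million. The sketch below is a generic ppm calculation applied to the two rows of the table; the helper name `ppm_error` is hypothetical and no acceptance threshold is implied by the source.

```python
def ppm_error(observed, calculated):
    """Relative mass error in parts per million."""
    return (observed - calculated) / calculated * 1e6


# (observed, calculated) monoisotopic masses in Da, taken from Table 1
table1 = {
    "TD1/TD2": (1365.57, 1365.55),
    "TD3/TD4": (2183.98, 2183.88),
}

for name, (obs, calc) in table1.items():
    print(f"{name}: {obs - calc:+.2f} Da ({ppm_error(obs, calc):+.1f} ppm)")
```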
Fig. 2. C18 reversed-phase HPLC profile of partially reduced and alkylated TD3/TD4 from human Fn14. The partially reduced and alkylated peptide clusters, A and B (two peptides linked by a single disulfide bond), were collected for MS/MS analysis. The doublet peak for reduced peptides TD3 and TD4 containing two NES groups is due to stereoisomers generated by NEM alkylation.
Fig. 3. MS/MS spectrum of reduced, AspN/tryptic peptide TD3 (reduced from peptide cluster TD3/TD4 in peak A, Fig. 2) from human Fn14, residues 24–29 containing a NES group. The sequence of the peptide, the fragmentation pattern, and detected fragment ions are shown at the top. “y” designates ions that contain the C-terminal region of the peptide with one or more amino acid residues generated by collision-induced dissociation (CID). “b” designates ions that contain the N-terminal region of the peptide with one or more amino acid residues generated by CID. Calculated masses for some critical ions are as follows: b2 = 219.02, b3 = 290.05, internal fragment ion CA = 175.05, and SC = 316.02.
Fig. 4. MS/MS spectrum of reduced, AspN/tryptic peptide TD4 (reduced from peptide cluster TD3/TD4 in peak A, Fig. 2) from human Fn14, residues 35–49 containing a NES group. The sequence of the peptide, the fragmentation pattern, and detected fragment ions are shown at the top. Designation of “y” and “b” ions is as described in Fig. 3. Calculated masses for some critical ions are as follows: b3 = 491.16, b4 = 604.25, b5 = 661.19, y10 = 1000.50, y11 = 1057.53, y12 = 1170.61, and internal fragment ion CA = 175.05.
mass spectrometer] or on an Applied Biosystems Voyager STR DE MALDI-TOF mass spectrometer using 2,5-dihydroxybenzoic acid as the matrix. MS/MS spectra were acquired using the data-dependent acquisition function on the nanoflow LC-MS/MS system.
Peptides were separated on a 0.1 × 100-mm Atlantis dC18 column (186002831; Waters Corp.), eluted with a 70-min gradient (from 0 to 70% acetonitrile in 0.1% formic acid) at a flow rate of 400 nL/min. The column temperature was maintained at 35 °C. The sample
Fig. 5. Schematic summary of key structural elements in the murine TIM-1-IgV-Fc sequence: murine Ig domain, black; human Fc domain, blue. Black lines represent disulfide bonds determined in this study. Blue lines represent the predicted disulfide bond pattern in the Fc domain that was also confirmed in this study. Cys residues forming the interchain disulfides are identified with blue arrows.
cone voltage was 35 V. A ramped collision energy of 25–40 eV was used for MS/MS experiments, and MS/MS spectra were collected in the m/z range 50–1500, with sampling every 0.5 s and a 0.05-s separation between consecutive scans. The MS and MS/MS spectra were deconvoluted using the MaxEnt 3 program that combines multiple m/z peaks into a single MH+ peak.

Results

Analysis of the recombinant Fn14-Myc-His

A soluble form of the human Fn14 protein consisting of its entire ectodomain (residues 1–53 not accounting for the signal peptide) plus Myc and His tags (Fig. 1) was successfully expressed in P. pastoris yeast cells. The observed molecular mass of 7617 Da agreed well with the predicted mass of 7616.6 Da. The molecular mass did not shift after treatment of the protein with 4-vp under nonreducing but denaturing conditions, indicating that all of the cysteine residues in the protein are involved in disulfide bonds. Biochemical analyses of this material also demonstrated that the P. pastoris-derived Fn14-Myc-His protein is monomeric and, most importantly, that it adopts its natural conformation, based on its ability to bind TWEAK with an affinity similar to that of Fn14 proteins produced by other expression systems (data not shown).

Analysis of disulfide linkages in the CRD domain of human Fn14

The extracellular cysteine-rich domain of human Fn14 contains six Cys residues and all are involved in disulfide bonds, as demonstrated above. To assess the disulfide connectivity in this domain, we carried out an AspN/tryptic digestion on a nonreduced sample and were able to separate two pairs of disulfide-linked peptides by C18 reverse-phase chromatography (data not shown). As summarized in Table 1, disulfide-linked peptides correspond to TD1 (residues 1–11) linked to TD2 (residues 22–23) and TD3 (residues 24–29) linked to TD4 (residues 35–49). Because peptides TD1 and TD2 each contain only a single cysteine residue, the linkage is clearly Cys9 (C1) to Cys22 (C2).
On the other hand, peptides TD3 and TD4 are linked by two interpeptide disulfide bonds, so additional work was required to determine exact disulfide linkages. The technique of partial reduction was used to generate forms of TD3/TD4 that contain only a single disulfide bond; however, separation of these forms by HPLC proved to be a challenging task, as indicated by the chromatographic profile (Fig. 2). Even under optimized conditions, the relevant peaks, corresponding to products containing a single disulfide bond and 2 N-ethyl succinimido (NES) groups, were only ~75% resolved (peak A at ~146 min and peak B at ~148 min). By collecting a portion of each peak, i.e., the leading edge of the peak at 146 min and the trailing edge of the peak at 148 min, we obtained representative samples of both forms.

Sequencing results for the reduced constituents of the TD3/TD4 peptide cluster from peak A (Fig. 2) are shown in Figs. 3 and 4. In the MS/MS spectrum of peptide TD3 (observed m/z = 779.32) (Fig. 3), the observed fragment ions, b2, m/z = 219.02, and b3, m/z = 290.05, are consistent with a NES group at Cys28 (C4) because if it were at Cys25 (C3), the m/z would be 344.09 for b2 and 415.13 for b3. Neither of these ions was detected in the MS/MS spectrum of the reduced TD3 from peak A. For peptide TD4 (observed m/z = 1660.82) (Fig. 4), the observed fragment ions, b3, m/z = 491.16, b4, m/z = 604.25, y10, m/z = 1000.50, y11, m/z = 1057.59, and y12, m/z = 1170.71, all indicate that the NES group is on Cys37 (C5), not Cys40 (C6). If the NES group were at Cys40 (C6), the m/z values would be 366.11, 479.20, 536.22, 1182.57, and 1296.66 Da for b3, b4, y10, y11, and y12, respectively. None of these ions was detected in the MS/MS spectrum of the reduced TD4 from peak A. That Cys28 (C4) in TD3 and Cys37 (C5) in TD4 are alkylated with NEM indicates that these two Cys residues were linked by a disulfide bond before the partial reduction/alkylation.
We can also conclude that Cys25 (C3) and Cys40 (C6) remain disulfide linked, holding peptides TD3 and TD4 together after the partial reduction/alkylation. The data from the MS/MS sequencing of peak B (not shown) are complementary, showing that the NES groups in the later-eluting peak are at Cys25 (C3) and Cys40 (C6), so TD3 and TD4 must be disulfide-linked through Cys28 (C4) and Cys37 (C5). In summary, we have experimentally determined the disulfide connectivity of the Fn14 CRD to be C1-C2, C3-C6, and C4-C5.
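The inference used for TD3/TD4 can be written down as a small set operation: in a cluster held together by exactly two disulfides, the two NES-labelled cysteines in a singly reduced product were the reduced pair, and the two unlabelled cysteines form the bond that is still intact. The sketch below is a minimal illustration of that logic with the residue numbers reported above; the function name is hypothetical and the code assumes the alkylation sites have already been assigned by MS/MS.

```python
def pairs_from_partial_reduction(cluster_cysteines, nes_sites):
    """For a cluster held by exactly two disulfides, return (reduced_pair, intact_pair).

    cluster_cysteines: residue numbers of the four Cys in the cluster.
    nes_sites: residue numbers of the two Cys found carrying an NES group.
    """
    cys = set(cluster_cysteines)
    nes = set(nes_sites)
    if len(cys) != 4 or len(nes) != 2 or not nes <= cys:
        raise ValueError("expected four cysteines, two of them NES-labelled")
    reduced_pair = tuple(sorted(nes))        # bond opened by TCEP, then capped by NEM
    intact_pair = tuple(sorted(cys - nes))   # bond still holding the peptides together
    return reduced_pair, intact_pair


td3_td4 = [25, 28, 37, 40]  # Cys25, Cys28 in TD3; Cys37, Cys40 in TD4
peak_a = pairs_from_partial_reduction(td3_td4, [28, 37])
peak_b = pairs_from_partial_reduction(td3_td4, [25, 40])

# The two singly reduced products should be mirror images of each other.
assert peak_a[0] == peak_b[1] and peak_a[1] == peak_b[0]
print("peak A:", peak_a)
print("peak B:", peak_b)
```

The final consistency check mirrors the argument in the text: peak A and peak B each identify the same two bonds, one as the reduced pair and the other as the intact pair.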
Analysis of the full-length murine TIM-1-IgV-Fc
Confirmation of disulfide linkages in the Fc portion of murine TIM-1-IgV-Fc
Murine TIM-1 IgV domain was expressed in CHO cells as a fusion protein linked to the Fc portion of human IgG1, forming a soluble dimer. The predicted sequence of a single chain contains 337 residues with one potential N-linked glycosylation site at Asn187 (Fig. 5). The expressed protein was evaluated by measuring the molecular mass of N-deglycosylated protein under reducing conditions. The observed molecular mass (37,539 Da) agreed well with that predicted for residues 2–336 (37,537.6 Da). The protein was also analyzed by tryptic peptide mapping after reduction (Table 2). About 90% of the predicted sequence was confirmed by peptide masses; sequence coverage was limited by the number of small, hydrophilic peptides that were not retained on the reverse-phase HPLC column. The primary structure showed no heterogeneity that could confound mass assignments during disulfide mapping.

To determine whether any Cys residues were in the free thiol form, native protein was treated with the alkylating reagent, NEM, under denaturing conditions. The molecular mass of the NEM-treated native protein, measured after reduction of disulfides with DTT, is consistent with that of the N-deglycosylated, but nonalkylated protein, residues 2–336 (observed, 37,537 Da). This confirms that all of the cysteine residues in the protein are involved in disulfide bonds.
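The free-thiol test described above rests on simple arithmetic: each alkylated thiol adds one reagent mass, so the number of free cysteines is the intact-mass shift divided by the adduct mass. The sketch below illustrates this with the intact masses quoted in the text. The adduct masses are average masses computed from the reagent formulas (NEM, C6H7NO2; 4-vinylpyridine, C7H7N), which is an assumption made here because the deconvoluted intact masses are treated as average masses; the function name is hypothetical.

```python
NEM_AVG = 125.13   # average mass added per NEM-alkylated thiol (C6H7NO2)
VP_AVG = 105.14    # average mass added per pyridylethylated thiol (C7H7N)


def free_thiols(observed_mass, unmodified_mass, adduct_mass):
    """Estimate the number of alkylated (free) thiols from an intact-mass shift."""
    return round((observed_mass - unmodified_mass) / adduct_mass)


# TIM-1-IgV-Fc: NEM-treated, then reduced; predicted mass for residues 2-336
print(free_thiols(37537.0, 37537.6, NEM_AVG))
# Fn14-Myc-His: 4-vp-treated under denaturing, nonreducing conditions
print(free_thiols(7617.0, 7616.6, VP_AVG))
```

Both cases give a shift far smaller than one adduct mass, which is the basis for concluding that no cysteine was present as a free thiol.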
Table 2
LC-MS analysis of peptides from a tryptic digest of N-deglycosylated and reduced murine TIM-1-IgV-Fc

Tryptic peptide^a   Residue numbers   Retention time (min)   Observed molecular mass (Da)^b   Calculated molecular mass (Da)^b
T1′                 2–6               36.3                   636.30                           636.35
T2                  7–23              47.9                   1848.89                          1848.92
T3                  24–32             45.1                   993.42                           993.47
T4                  33–51             47.1                   2071.88                          2071.93
T5                  52–56             32.6                   637.29                           637.34
T6                  57–59             —                      —                                348.18
T7                  60–63             40.0                   536.26                           536.30
T8                  64–89             52.8                   2769.22                          2769.22
T9                  90–100            51.6                   1331.65                          1331.65
T10                 101–112           47.4                   1359.77                          1359.78
T11                 113–138           58.2                   2729.42                          2729.41
T11′                113–136           58.5                   2504.26                          2504.26
T12                 139–145           42.8                   834.37                           834.43
T13                 146–164           49.6                   2080.92                          2081.00
T14                 165–178           49.6                   1676.72                          1676.79
T15^c               179–182           —                      —                                500.31
T16^d               183–191           38.0                   1189.48                          1189.49
T17                 192–207           57.9                   1806.92                          1807.00
T18^c               208–210           —                      —                                438.21
T19^c               211–212           —                      —                                249.11
T20^c               213–216           —                      —                                446.25
T21                 217–224           42.2                   837.43                           837.50
T22^c               225–228           —                      —                                447.27
T23^c               229–230           —                      —                                217.14
T24^c               231–234           —                      —                                456.24
T25                 235–245           44.0                   1285.66                          1285.67
T26                 246–250           30.9                   604.24                           604.31
T27                 251–260           49.1                   1103.57                          1103.60
T28                 261–282           52.8                   2543.12                          2543.12
T29                 283–299           53.7                   1872.96                          1872.91
T30                 300–304           35.4                   574.25                           574.33
T31^c               305–306           —                      —                                261.14
T32                 307–329           50.4                   2743.22                          2743.24
T33                 330–336           43.5                   659.28                           659.35

a T designations denote predicted tryptic peptides from the murine TIM-1 Ig mucin minus murine Fc sequence where T1′ is the observed N-terminal peptide and T33 is the C-terminal peptide. Cys-containing peptides are in boldface.
b Monoisotopic masses.
c Only the major components are listed. Small hydrophilic peptides were not retained on the column.
d The predicted mass corresponds to the N-deglycosylated peptide T16.
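The calculated masses of disulfide-linked clusters reported in this paper follow from the reduced-peptide masses in Table 2: each disulfide removes two hydrogens, and each NES group adds one NEM. The sketch below is a minimal check of that bookkeeping against values quoted in the Results (the Fc clusters and the singly alkylated T3/T4 and T2/T8 clusters). The function name and constants are illustrative; the constants are standard monoisotopic masses and not values stated in the source.

```python
H_MONO = 1.007825     # monoisotopic mass of hydrogen
NEM_MONO = 125.04768  # monoisotopic mass of N-ethylmaleimide (C6H7NO2)


def linked_mass(reduced_peptide_masses, n_disulfides, n_nes=0):
    """Monoisotopic mass of a disulfide-linked cluster built from reduced peptides."""
    return sum(reduced_peptide_masses) - 2 * H_MONO * n_disulfides + NEM_MONO * n_nes


# Calculated reduced-peptide masses (Da) from Table 2
T2, T3, T4, T8 = 1848.92, 993.47, 2071.93, 2769.22
T13, T19, T27, T32 = 2081.00, 249.11, 1103.60, 2743.24
HINGE = 2504.26  # residues 113-136

print(f"T13/T19:         {linked_mass([T13, T19], 1):.2f}")
print(f"T27/T32:         {linked_mass([T27, T32], 1):.2f}")
print(f"hinge dimer:     {linked_mass([HINGE, HINGE], 2):.2f}")
print(f"T3/T4 + 1 NES:   {linked_mass([T3, T4], 1, n_nes=1):.2f}")
print(f"T2/T8 + 1 NES:   {linked_mass([T2, T8], 1, n_nes=1):.2f}")
```

The four-peptide IgV cluster is left out on purpose: its mass is reported as an average mass, so it cannot be compared directly with a sum of monoisotopic values.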
Disulfide-linked peptides from the murine TIM-1-IgV-Fc fusion protein were identified by comparative peptide mapping of the reduced and nonreduced tryptic digest. Autolysis peptides derived from trypsin and endo-LysC were also identified on the maps (labeled “Enzyme” in Fig. 6) but none of these peptides interfered with identification of disulfide-linked peptides or with determination of disulfide linkages for murine TIM-1-IgV-Fc. The resulting chromatographic profiles are shown in Fig. 6: Fig. 6A shows the nonreduced digest and Fig. 6B the reduced digest. Four peaks are unique to the map of the nonreduced digest, and three of those contain disulfide-linked peptides from the Fc region. Predicted disulfide-linked peptides from the hinge region (T11′/T11′; observed mass, 5004.52 Da; calculated mass, 5004.49 Da), the CH2 region (T13/T19; observed mass, 2328.12 Da; calculated mass, 2328.10 Da), and the CH3 domain (T27/T32; observed mass, 3844.79 Da; calculated mass, 3844.82 Da) were all observed. All of the detected linkages were as expected and no disulfide scrambling was observed. These results are summarized in Table 3.

Analysis of disulfide linkages in the IgV domain of murine TIM-1-IgV-Fc

The IgV domain of TIM-1 contains six Cys residues. In the nonreduced tryptic map, the peptides composing the IgV domain eluted as a disulfide-linked peptide cluster in a peak at 51.5 min; the peptide cluster contained the four peptides, T2 (residues 7–23), T3 (residues 24–32), T4 (residues 33–51), and T8 (residues 64–89) linked by three disulfide bonds (observed mass, 7682.40 Da; calculated mass, 7682.52 Da; Fig. 6A, Table 3). The identity of this peptide cluster was confirmed by reduction with DTT, which resulted in the appearance of the four constituent peptides, T2, T3, T4, and T8 (Fig. 6B).
The precise disulfide linkages could not be elucidated directly by mapping because there are no enzymes that cleave between individual cysteine residues in this region to generate peptides containing one cysteine. To overcome this problem, the disulfide-linked IgV peptide cluster was collected and subjected to partial reduction with TCEP and alkylation with NEM followed by nano-LC-MS. The resulting map is shown in Fig. 7. By optimizing the conditions of partial reduction, a balance was struck between the completely nonreduced form of the disulfide-linked IgV peptide cluster (T2/T3/T4/T8 at 52.5 min) and the completely reduced and alkylated forms of each of the peptides. This was evidenced by two significant partially reduced, NEM-alkylated, disulfide-linked clusters, each of which contained only two peptides. The first of these clusters (46 min) had a mass of 3188.48 Da, which matches the predicted mass of T3 linked to T4 with one NES group (calculated mass, 3188.43 Da); the second partially reduced cluster (55 min) had a mass of 4741.23 Da, which matches the predicted mass of T2 linked to T8 with one NES group (calculated mass, 4741.17 Da). These results show that Cys17 (C1) in T2 is linked to either Cys87 (C5) or Cys88 (C6) in T8 and that Cys29 (C2) in T3 is linked to either Cys35 (C3) or Cys40 (C4) in T4. To assign the exact disulfide connectivity within peptide clusters T2/T8 and T3/T4, there remained only the task of defining the NEM alkylation site in the constituent peptides containing two cysteine residues, i.e., T4 and T8.

To identify the alkylation sites within tryptic peptides T4 and T8, the corresponding peptide clusters were collected and reduced with DTT. The reduced constituents were then subjected to mass spectrometric analysis. MS/MS sequencing of the reduced T4 with one NES group (T4-NES) from the T3/T4 cluster showed that the NES group is at Cys35 (C3), not Cys40 (C4). In the MS/MS spectrum of T4-NES shown in Fig.
8, the b3 ion has an observed m/z of 414.14, the y12 ion of 1442.66, and the y16 ion of 1784.87, which match the calculated m/z values very well if Cys35 (C3) is alkylated with NEM (the calculated m/z for b3 is 414.14, that for y12 is 1442.70, and that for y16 is 1784.85). If Cys40 (C4) were alkylated, b3 would be 289.10, y12 would be 1567.74, and y16 would be 1909.82. There are no traces of any of these ions in the MS/MS spectrum of T4-NES; therefore, Cys40 (C4) is not alkylated and, because T4-NES was generated from the peptide cluster T3/T4, Cys40 (C4) in T4 must be linked to Cys29 (C2) in T3.

MS/MS sequencing of T8 with one NES group (T8-NES) did not generate sufficient ions to identify the alkylation site, so the peak was treated with endoprotease AspN to produce the shorter, alkylated peptide DSGLYCCR (T8′-NES, calculated mass, 1041.41 Da). MS/MS sequencing of this peptide showed that the NES group is located at Cys87 (C5). In the MS/MS spectrum shown in Fig. 9, the y2 ion has an observed m/z of 278.10, which could be generated only if Cys87 is alkylated with NEM (calculated m/z for y2 is 278.13). If Cys88 had been alkylated with NEM, y2 would have an m/z of 403.18. Additionally, there are two internal fragments at m/z of 364.11 and 392.11, which match the calculated m/z of TyrCys(NES) (YC5), with and without the loss of CO (calculated m/z is 364.15 and 392.13, respectively). Because T8-NES was generated from full reduction of the partially reduced T2/T8 cluster, Cys17 (C1) in T2 must be linked to (the unmodified) Cys88 (C6) in T8 and, consequently, the only two NES-modified Cys residues in the IgV domain, Cys35 (C3) in T4 and Cys87 (C5) in T8, must form a disulfide bond. In summary, we have experimentally determined that the disulfide connectivity in the IgV domain of TIM-1 is C1-C6, C2-C4, and C3-C5.

Fig. 6. Tryptic peptide maps of murine TIM-1-IgV-Fc. Digests were separated by HPLC on a Vydac C18 column and analyzed on-line with an LCT mass spectrometer. (A) Nonreduced digest; (B) reduced digest. Identified peak characteristics are summarized in Table 1. “E” designates an enzyme-derived peptide.

Table 3
Major disulfide-linked peptides detected in a tryptic peptide map of the nonreduced digest of pyridylethylated murine TIM-1-IgV-Fc

Disulfide-linked tryptic peptide^a         Residue numbers                 Retention time (min)   Observed molecular mass (Da)   Calculated molecular mass (Da)
T2/T3/T4/T8 with three disulfide bonds     7–23, 24–32, 33–51, 64–89       51.8                   7682.40^b                      7682.52^b
T11′/T11′ with two disulfide bonds         113–136 (interchain)            61.2                   5004.52                        5004.49
T13/T19 with one disulfide bond            146–164, 211–212                46.2                   2328.12                        2328.10
T27/T32 with one disulfide bond            251–260, 307–329                50.3                   3844.79                        3844.82

a T designations denote predicted tryptic peptides from the murine TIM-1-IgV-Fc sequence where T11′ is a nonspecific tryptic peptide, residues 113–136.
b Average mass.

Discussion
We have successfully demonstrated the application of high-sensitivity disulfide mapping methods using two recombinant proteins: one a fusion protein of ~40 kDa that contains the IgV domain of murine TIM-1 linked to human Fc, and the other a soluble version of the CRD of human Fn14 with a molecular weight of ~8 kDa. The TIM-1 IgV domain corresponds to the N-terminal domain of a type 1 transmembrane protein originally identified in ischemic kidney models (as KIM-1 [16]) and later ascribed important functions in immunomodulation [17–20]. Potential ligands have been proposed but none has been identified unambiguously [11,19,21]. Our original goal was to advance the investigations into structure–function relationships by better defining the ligand binding site, the IgV domain [11,14]. At the outset of our work, the available information assigning the TIM-1 IgV domain to the IgSF was based solely on sequence homology, and the six cysteine residues with the potential for three disulfide bonds represented a significant departure from the typical IgSF structure. We approached this with no preconceptions, except that the region of sequence assigned to IgV was a distinct domain, i.e., with no interdomain disulfide bonds. Because ours was an Fc fusion protein, we could use the Fc disulfides essentially as an internal control. Our results clearly defined three disulfide bonds in the IgV region as Cys1-Cys6, Cys2-Cys4, and Cys3-Cys5, and the Fc bonds were as expected with no apparent scrambling. These results have now been confirmed by the recently published crystal structure of muTIM-1 [11]. Our success also highlights the sensitivity of the method.

Disulfide mapping of the extracellular CRD of Fn14 provides another example of the general utility of our approach. Fn14 is the receptor for TWEAK [8] and plays important roles in inflammation and tumorigenesis [9,12]. Like TIM-1, Fn14 is a type 1
Fig. 7. C18 reversed-phase HPLC profile of partially reduced, NEM-alkylated, disulfide-linked peptide clusters from murine TIM-1-IgV-Fc. Identities of components in each peak are shown.
Fig. 8. MS/MS spectrum of tryptic peptide T4 containing a NES group from murine TIM-1-IgV-Fc. (T4 was released from the peptide cluster T3/T4 by reduction of “T3/T4 with 1 NES” shown in Fig. 7.) The sequence of the peptide, the fragmentation pattern, and the detected fragment ions are shown at the top. Designation of “y” and “b” ions is as described in Fig. 3. Calculated masses for some critical ions are as follows: y3 = 369.20, y4 = 483.24, y5 = 584.29, y6 = 770.37, y7 = 883.45, y8 = 996.54, y9 = 1097.59, y10 = 1211.63, y12 = 1442.70, y16 = 1784.85, y17 = 2012.91, b2 = 186.09, and b3 = 414.14.
transmembrane protein, but it belongs to a subset of recently recognized atypical TNFR superfamily (TNFRSF) members because it contains only a single CRD, whereas traditional TNFRSF members contain three or more CRDs [13,22]. A typical CRD in the TNFRSF
has six conserved cysteine residues that form three disulfide bonds. Each CRD is composed of two structural modules from a list including A1, A2, B1, B2, C2, D2, etc., where the letter refers to a module of characteristic consensus sequence and tertiary fold and the number indicates the number of disulfide bonds within the module. The two modules in a CRD are linked in tandem in various combinations, e.g., A1→B2 (an A1 module followed by a B2 module), A2→B1, A1→C2, or A1→D2 [13,23]. Based on sequence alignment, it was originally predicted that the atypical TNFRSF members, such as TACI, BCMA, and Fn14, would likely have the A1→C2 module structure, like the fourth (and last) CRD in TNF-R1 [22]. However, recently available crystal structures of BCMA and TACI show that the CRDs in both proteins have an A1→D2 module structure [24]. A major difference between C2 and D2 modules is the connectivity of the four Cys residues in the module, which results in different tertiary structures even though their secondary structures are similar [23]. In a C2 module, the four cysteine residues (C3, C4, C5, and C6) are linked as C3-C6 and C4-C5, whereas in the D2 module, the linkages are C3-C5 and C4-C6.

With no available crystal structure for Fn14, Brown et al. [12] recently developed a three-dimensional structure model of the protein based on the crystal structure of BCMA, namely a model having an A1→D2 module structure with C1-C2, C3-C5, and C4-C6 disulfide linkages. The model was used as the basis for evaluating the results from site-directed mutagenesis studies [12]. Our experimental determination of the disulfide structure for Fn14 reveals that the disulfide linkages in the Fn14 CRD are C1-C2, C3-C6, and C4-C5. In other words, the Fn14 CRD has an A1→C2 module pair structure, not an A1→D2 one. Our finding supports the A1→C2 structure originally predicted for Fn14 by Bodmer et al. [22] and indicates that the model proposed in [12] on the basis of the BCMA structure is incorrect.

Fig. 9. MS/MS spectrum of the AspN/tryptic peptide (residues 82–89) containing a NES group from murine TIM-1-IgV-Fc. This peptide was generated from T8, which was released by reduction from the “T2/T8 with 1 NES” peptide cluster shown in Fig. 7. The sequence of the peptide, the fragmentation pattern, and the detected fragment ions are shown at the top. Designation of “y” and “b” ions is as described in Fig. 3. Calculated masses for some critical ions are as follows: y2 = 278.13, y3 = 506.19, y4 = 669.25, y5 = 782.33, y6 = 839.35, internal fragment ions YC(NES) (a) = 364.15 and (b) = 392.13, C(NES)C (a) = 304.08, and b3 = 260.09.
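The fragment-ion values quoted for the AspN/tryptic peptide DSGLYCCR (Fig. 9) can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' software: it assumes standard monoisotopic residue masses, singly protonated ions, and an NES group that adds the mass of intact NEM (C6H7NO2). The function names are hypothetical.

```python
# Monoisotopic residue masses (Da); standard values for the residues in DSGLYCCR.
RESIDUE = {
    "D": 115.02694, "S": 87.03203, "G": 57.02146, "L": 113.08406,
    "Y": 163.06333, "C": 103.00919, "R": 156.10111,
}
WATER = 18.01056
PROTON = 1.00728
NEM = 125.04768  # assumption: intact N-ethylmaleimide adduct, C6H7NO2


def residue_masses(seq, nem_sites=()):
    """Residue masses with NEM added at the 0-based positions in nem_sites."""
    for i in nem_sites:
        if seq[i] != "C":
            raise ValueError(f"position {i} is {seq[i]}, not Cys")
    return [RESIDUE[aa] + (NEM if i in nem_sites else 0.0)
            for i, aa in enumerate(seq)]


def y_ion(seq, n, nem_sites=()):
    """Singly protonated y_n: last n residues plus water plus a proton."""
    return sum(residue_masses(seq, nem_sites)[-n:]) + WATER + PROTON


def b_ion(seq, n, nem_sites=()):
    """Singly protonated b_n: first n residues plus a proton."""
    return sum(residue_masses(seq, nem_sites)[:n]) + PROTON


def precursor_mh(seq, nem_sites=()):
    """Singly protonated intact peptide, [M+H]+."""
    return sum(residue_masses(seq, nem_sites)) + WATER + PROTON


PEPTIDE = "DSGLYCCR"   # residues 82-89
CYS87, CYS88 = 5, 6    # 0-based positions within PEPTIDE

if __name__ == "__main__":
    for label, site in (("Cys87", CYS87), ("Cys88", CYS88)):
        print("%s alkylated: y2 = %.2f, y3 = %.2f"
              % (label, y_ion(PEPTIDE, 2, (site,)), y_ion(PEPTIDE, 3, (site,))))
```

Under these assumptions, y2 is the discriminating ion: it contains only Cys88 and Arg89, so its mass shifts by the NEM increment only when Cys88 carries the label. The y3 ion contains both cysteines and is identical for the two candidate sites.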
The disulfide linkage pattern can have a significant influence on the overall tertiary structure of a protein. In the case of TACI and BCMA, the A1→D2 modular structure provides a “saddle-like” binding site for the ligand TALL-1 [23,24]. In contrast, the fourth CRD of TNF-R1, composed of an A1→C2 modular substructure, is partly “disordered” and may be important for its dimer formation [25]. Conclusions about function based on incorrect protein structures have limited validity at best. This new information on the disulfide linkages of Fn14 should be helpful in reevaluating the results of in vitro studies of surface topography and the mechanism of ligand binding.

The two proteins described herein, both successfully characterized by high-sensitivity disulfide mapping, are typical of the analytes targeted in structure–function studies. We have shown that disulfide linkages of proteins having high molecular weight, multiple domains, and glycans can be successfully analyzed without large amounts of protein and that a single experimental strategy can be applied to many others without lengthy optimization. The physical characteristics of these proteins are routinely encountered and normally interfere with crystallization. Disulfide mapping can be valuable even when the crystal structure is available. In cases where the X-ray data are of low resolution, disulfide bonds may be unassignable without disulfide mapping. If a protein domain is truncated in the interest of crystallization, care must be taken not to design away cysteine residues that are critical to the structural integrity, as was the case for a truncated version of human NoGo-66 receptor, NoGoR1(310) (residues 1–310) [26]. Our disulfide mapping analysis of full-length human NoGo-66 receptor 1, a large (~55-kDa), multidomain, cysteine-rich glycoprotein, showed that the conclusions drawn from crystallography of the truncated NoGoR1(310) were in error due to the elimination of two cysteine residues, Cys-334 and Cys-346, from the recombinant construct [27].

In summary, disulfide structures of proteins with complex characteristics can be determined using the method described here. The method is sensitive, requiring less than 0.5 nmol of protein, and efficient compared with other methods.
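As a compact restatement of the module assignment discussed above, the C2-versus-D2 distinction reduces to which pairs the cysteines C3 to C6 form. The following is a minimal sketch under that reading; the function name and the tuple representation of a disulfide bond are illustrative assumptions, not part of the published method.

```python
# Pairings of C3-C6 that define the two module types described in the text.
C2_PAIRS = {frozenset((3, 6)), frozenset((4, 5))}  # C3-C6 and C4-C5
D2_PAIRS = {frozenset((3, 5)), frozenset((4, 6))}  # C3-C5 and C4-C6


def classify_second_module(bonds):
    """Classify the second module of a CRD from its disulfide pairs.

    bonds: iterable of (i, j) cysteine indices, e.g. (3, 6) for C3-C6.
    Returns "C2", "D2", or "unclassified".
    """
    pairs = {frozenset(bond) for bond in bonds}
    if C2_PAIRS <= pairs:
        return "C2"
    if D2_PAIRS <= pairs:
        return "D2"
    return "unclassified"


# Linkages determined experimentally for the Fn14 CRD in this work.
FN14_OBSERVED = [(1, 2), (3, 6), (4, 5)]
# Linkages in the BCMA-based model of Fn14 [12].
FN14_MODEL = [(1, 2), (3, 5), (4, 6)]

if __name__ == "__main__":
    print("observed:", classify_second_module(FN14_OBSERVED))
    print("model:   ", classify_second_module(FN14_MODEL))
```

Representing each bond as an unordered pair means (3, 6) and (6, 3) are treated identically, so the order in which a mapping experiment reports the two cysteines does not affect the classification.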
Information on the disulfide linkages of proteins is not only important for structure–function studies and for resolving ambiguity in crystal or NMR structures or in structure modeling, but also invaluable for ensuring correct protein folding and batch-to-batch consistency in the manufacture of therapeutic proteins.

Acknowledgments

We thank Veronique Bailly, Paul Rennert, Alexey Lugovskoy, Dennis Krushinskie, and Kathy Strauch for their contributions.

References

[1] J.H. Prestegard, H. Valafar, J. Glushka, F. Tian, Nuclear magnetic resonance in the era of structural genomics, Biochemistry 40 (2001) 8677–8685.
[2] J. Rousseaux, R. Rousseaux-Prevost, H. Bazin, Optimal conditions for the preparation of Fab and F(ab’)2 fragments from monoclonal IgG of different rat IgG subclasses, J. Immunol. Methods 64 (1983) 141–146.
[3] J.L. Gallegos-Perez, L. Rangel-Ordonez, S.R. Bowman, C.O. Ngowe, J.T. Watson, Study of primary amines for nucleophilic cleavage of cyanylated cystinyl proteins in disulfide mass mapping methodology, Anal. Biochem. 346 (2005) 311–319.
[4] J. Wu, J.T. Watson, A novel methodology for assignment of disulfide bond pairings in proteins, Protein Sci. 6 (1997) 391–398.
[5] C.G. Schutte, T. Lemm, G.J. Glombitza, K. Sandhoff, Complete localization of disulfide bonds in GM2 activator protein, Protein Sci. 7 (1998) 1039–1045.
[6] J. Qi, J. Wu, G.A. Somkuti, J.T. Watson, Determination of the disulfide structure of sillucin, a highly knotted, cysteine-rich peptide, by cyanylation/cleavage mass mapping, Biochemistry 40 (2001) 4531–4538.
[7] U. Goransson, D.J. Craik, Disulfide mapping of the cyclotide kalata B1. Chemical proof of the cyclic cystine knot motif, J. Biol. Chem. 278 (2003) 48188–48196.
[8] S.R. Wiley, L. Cassiano, T. Lofton, T. Davis-Smith, J.A. Winkles, V. Lindner, H. Liu, T.O. Daniel, C.A. Smith, W.C. Fanslow, A novel TNF receptor family member binds TWEAK and is implicated in angiogenesis, Immunity 15 (2001) 837–846.
[9] J.A. Winkles, N.L. Tran, M.E. Berens, TWEAK and Fn14: new molecular targets for cancer therapy?, Cancer Lett. 235 (2006) 11–17.
[10] V.K. Kuchroo, D.T. Umetsu, R.H. DeKruyff, G.J. Freeman, The TIM gene family: emerging roles in immunity and disease, Nat. Rev. Immunol. 3 (2003) 454–462.
[11] C. Santiago, A. Ballesteros, C. Tami, L. Martinez-Munoz, G.G. Kaplan, J.M. Casasnovas, Structures of T cell immunoglobulin mucin receptors 1 and 2 reveal mechanisms for regulation of immune responses by the TIM receptor family, Immunity 26 (2007) 299–310.
[12] S.A. Brown, H.N. Hanscom, H. Vu, S.A. Brew, J.A. Winkles, TWEAK binding to the Fn14 cysteine-rich domain depends on charged residues located in both the A1 and D2 modules, Biochem. J. 397 (2006) 297–304.
[13] J.H. Naismith, S.R. Sprang, Modularity in the TNF-receptor family, Trends Biochem. Sci. 23 (1998) 74–79.
[14] I.D. Sizing, V. Bailly, P. McCoon, W. Chang, S. Rao, L. Pablo, R. Rennard, M. Walsh, Z. Li, M. Zafari, M. Dobles, L. Tarilonte, S. Miklasz, G. Majeau, K. Godbout, M.L. Scott, P.D. Rennert, Epitope-dependent effect of anti-murine TIM-1 monoclonal antibodies on T cell activity and lung immune responses, J. Immunol. 178 (2007) 2249–2261.
[15] R.B. Pepinsky, Selective precipitation of proteins from guanidine hydrochloride-containing solutions with ethanol, Anal. Biochem. 195 (1991) 177–181.
[16] T. Ichimura, J.V. Bonventre, V. Bailly, H. Wei, C.A. Hession, R.L. Cate, M. Sanicola, Kidney injury molecule-1 (KIM-1), a putative epithelial cell adhesion molecule containing a novel immunoglobulin domain, is up-regulated in renal cells after injury, J. Biol. Chem. 273 (1998) 4135–4142.
[17] W.K. Han, A. Alinani, C.L. Wu, D. Michaelson, M. Loda, F.J. McGovern, R. Thadhani, J.V. Bonventre, Human kidney injury molecule-1 is a tissue and urinary tumor marker of renal cell carcinoma, J. Am. Soc. Nephrol. 16 (2005) 1126–1134.
[18] M. Khademi, Z. Illes, A.W. Gielen, M. Marta, N. Takazawa, C. Baecher-Allan, L. Brundin, J. Hannerz, C. Martin, R.A. Harris, D.A. Hafler, V.K. Kuchroo, T. Olsson, F. Piehl, E. Wallstrom, T cell Ig- and mucin-domain-containing molecule-3 (TIM-3) and TIM-1 molecules are differentially expressed on human Th1 and Th2 cells and in cerebrospinal fluid-derived mononuclear cells in multiple sclerosis, J. Immunol. 172 (2004) 7169–7176.
[19] J.H. Meyers, S. Chakravarti, D. Schlesinger, Z. Illes, H. Waldner, S.E. Umetsu, J. Kenny, X.X. Zheng, D.T. Umetsu, R.H. DeKruyff, T.B. Strom, V.K. Kuchroo, TIM-4 is the ligand for TIM-1, and the TIM-1-TIM-4 interaction regulates T cell proliferation, Nat. Immunol. 6 (2005) 455–464.
[20] T.T. Chen, L. Li, D.H. Chung, C.D. Allen, S.V. Torti, F.M. Torti, J.G. Cyster, C.Y. Chen, F.M. Brodsky, E.C. Niemi, M.C. Nakamura, W.E. Seaman, M.R. Daws, TIM-2 is expressed on B cells and in liver and kidney and is a receptor for H-ferritin endocytosis, J. Exp. Med. 202 (2005) 955–965.
[21] S.E. Umetsu, W.L. Lee, J.J. McIntire, L. Downey, B. Sanjanwala, O. Akbari, G.J. Berry, H. Nagumo, G.J. Freeman, D.T. Umetsu, R.H. DeKruyff, TIM-1 induces T cell activation and inhibits the development of peripheral tolerance, Nat. Immunol. 6 (2005) 447–454.
[22] J.L. Bodmer, P. Schneider, J. Tschopp, The molecular architecture of the TNF superfamily, Trends Biochem. Sci. 27 (2002) 19–26.
[23] G. Zhang, Tumor necrosis factor family ligand-receptor binding, Curr. Opin. Struct. Biol. 14 (2004) 154–160.
[24] Y. Liu, X. Hong, J. Kappler, L. Jiang, R. Zhang, L. Xu, C.H. Pan, W.E. Martin, R.C. Murphy, H.-B. Shu, S. Dai, G. Zhang, Ligand-receptor binding revealed by the TNF family member TALL-1, Nature 423 (2003) 49–56.
[25] J.H. Naismith, T.Q. Devine, T. Kohno, S.R. Sprang, Structures of the extracellular domain of the type I tumor necrosis factor receptor, Structure 4 (1996) 1251–1262.
[26] W.A. Barton, B.P. Liu, D. Tzvetkova, P.D. Jeffrey, A.E. Fournier, D. Sah, R. Cate, S.M. Strittmatter, D.B. Nikolov, Structure and axon outgrowth inhibitor binding of the Nogo-66 receptor and related proteins, EMBO J. 22 (2003) 3291–3302.
[27] D. Wen, C.P. Wildes, L. Silvian, L. Walus, S. Mi, D.H. Lee, W. Meier, R.B. Pepinsky, Disulfide structure of the leucine-rich repeat C-terminal cap and C-terminal stalk region of Nogo-66 receptor, Biochemistry 44 (2005) 16491–16501.