Disulfide bond characterization of human factor Xa by mass spectrometry through protein-level partial reduction


Accepted Manuscript

Title: Disulfide Bond Characterization of Human Factor Xa by Mass Spectrometry through Protein-Level Partial Reduction

Author: Song Klapoetke, Michael Hongwei Xie

PII: S0731-7085(16)30797-X
DOI: http://dx.doi.org/doi:10.1016/j.jpba.2016.10.005
Reference: PBA 10889

To appear in: Journal of Pharmaceutical and Biomedical Analysis

Received date: 2-6-2016
Revised date: 28-9-2016
Accepted date: 5-10-2016

Please cite this article as: Song Klapoetke, Michael Hongwei Xie, Disulfide Bond Characterization of Human Factor Xa by Mass Spectrometry through Protein-Level Partial Reduction, Journal of Pharmaceutical and Biomedical Analysis, http://dx.doi.org/10.1016/j.jpba.2016.10.005

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Disulfide Bond Characterization of Human Factor Xa by Mass Spectrometry through Protein-Level Partial Reduction

Song Klapoetke1* and Michael Hongwei Xie2

1 Mass Spectrometry Core Facility, KBI Biopharma, 1101 Hamlin Road, Durham, NC 27704, United States

2 Suncadia Biopharmaceuticals, a subsidiary of Hengrui Medicine Corporation, 218 Xinghu Street, B8-401, Suzhou, Jiangsu, 215000, China

* Corresponding author

Highlights

• Protein-level partial reduction is proposed for the first time for the study of complex disulfide linkages.

• Disulfide bonds in Human Factor Xa are fully characterized for the first time by an MS-based method.

• Partial reduction can be used for disulfide bond mapping of recombinant proteins by tandem MS.

Abstract: Protein-level partial reduction was investigated as a novel sample preparation technique to characterize proteins with cystine knots or complex disulfide linkages. Human Factor Xa, which contains twelve disulfide bonds, was selected as a model protein to demonstrate this methodology. Five of the twelve disulfide linkages were characterized with conventional non-reduced samples, while the other seven disulfide linkages, which include cystine knots, were successfully characterized with partially reduced samples. Each disulfide linkage was confirmed through product ions generated by a UPLC-ESI QTOF MS system equipped with data-independent collision-induced dissociation (CID) acquisition. Free cysteines in the sample were also determined in this study.

Keywords: Disulfide bond characterization; Cystine knots; Partial protein reduction; Human Factor Xa; LC-TOF MS; Mass spectrometry

1. Introduction

The study of disulfide linkages is an important part of the quality assessment of biopharmaceutical products. Because the formation of disulfide bonds is critical for stabilizing protein structures and maintaining protein functions, it is important to understand and maintain the linkages between multiple cysteine residues within a protein. Conventional methods such as X-ray crystallography [1], nuclear magnetic resonance (NMR) [1-2] and Edman degradation [3] have been widely used to analyze disulfide linkages. However, these methods require a high concentration or a relatively large amount of pure material. Mass spectrometry (MS), with the different ionization and fragmentation techniques developed for characterization of biomolecules, is an alternative option for disulfide linkage analysis [4]. Disulfide bond formation results in a 2-Da reduction in molecular weight, which can be distinguished by advanced mass spectrometers such as modern quadrupole time-of-flight (QTOF) and Orbitrap instruments through accurate mass measurement. As a result, accurate-mass MS-based disulfide bond analysis has become common practice in the biopharmaceutical industry. For example, enzymatic digestion of a non-reduced and alkylated protein product followed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) is routinely applied for characterization of disulfide linkages of monoclonal antibodies and other protein therapeutics. This technique is often referred to as non-reduced peptide mapping or disulfide bond mapping in the biopharmaceutical industry. It is used for 1) confirmation of product disulfide linkages, 2) monitoring or control of disulfide linkage consistency during development, and 3) elucidation of the disulfide bond structures of a new product. In biopharmaceutical disulfide bond characterization, the ultraviolet (UV) or MS profiles of LC-eluted components from enzymatic digests of reduced and non-reduced samples are compared [5].
Peptides observed only in the non-reduced sample are considered to be disulfide related; they are assigned by precursor masses (MS) and further confirmed by MS/MS product ions. This method is sufficient when only species linked by a single disulfide bond are present in the digestion mixture of the studied protein, but further experiments are needed to assign or confirm disulfide linkage connections when species linked by multiple disulfide bonds exist. Different strategies can be deployed depending on the nature of the protein product [6]. Using an alternative enzyme digestion to avoid species linked by multiple disulfide bonds is a straightforward solution if feasible. Unlike collision-induced dissociation (CID), which usually cleaves neither the disulfide bond nor the backbone between residues enclosed by a disulfide bond, electron-transfer dissociation (ETD) and electron capture dissociation (ECD) cleave disulfide bonds. Therefore, ETD or ECD can potentially provide an alternative means of MS/MS [7-9] for disulfide bond mapping. Partial reduction with differential alkylation [10-12], in situ reduction, and on-line reduction of a digest mixture during sample analysis [13] are among the most popular methods to study complex disulfide linkages such as nested disulfide linkages or cystine knots (a structural motif with three disulfides: six cysteine residues in close proximity in a protein backbone). As described in recent reviews [6, 14], partial reduction as applied so far has focused on post-digestion or post-column manipulation. For example, the partial reduction for disulfide bond mapping of the Cyclotide Kalata B1 [15] was performed by adding the reductant tris(2-carboxyethyl)phosphine (TCEP) to enzyme-digested peptides, and the partially reduced species were then manually collected as fractions separated on a C18 column. Protein-level partial reduction has been attempted only to assess the susceptibility of intrachain and interchain disulfide bonds of IgG1 molecules [16], a class of biotherapeutic proteins with relatively simple disulfide bond connections.
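The 2-Da mass reduction per disulfide bond and the accurate-mass matching described in the Introduction can be illustrated with a minimal Python sketch. The peptide masses below are hypothetical values chosen only for illustration, and the function names are not part of any software used in this study.

```python
# Monoisotopic mass of hydrogen in Da (standard value).
HYDROGEN = 1.007825


def linked_mass(reduced_peptide_masses, n_disulfides):
    """Neutral monoisotopic mass of peptides joined by disulfide bonds.

    Each S-S bond removes two hydrogen atoms (about 2.016 Da) from the
    summed masses of the fully reduced peptides.
    """
    return sum(reduced_peptide_masses) - 2 * HYDROGEN * n_disulfides


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Hypothetical reduced-peptide masses, for illustration only.
peptide_a, peptide_b = 1500.7000, 2200.9000
theoretical = linked_mass([peptide_a, peptide_b], n_disulfides=1)

# A hypothetical observed mass is accepted if it lies within a chosen
# tolerance of the theoretical value.
observed = theoretical + 0.0185
within_tolerance = abs(ppm_error(observed, theoretical)) <= 10.0
```

In this sketch the tolerance is a free parameter; a tighter or looser window would simply change the last comparison.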
In this study, we investigated protein-level partial reduction followed by specific enzyme digestion and LC-MS/MS analysis on a Waters QTOF instrument as a new approach to characterize complex disulfide linkages. All collected MS and MS/MS data were processed by Waters BiopharmaLynx™, software that can automatically assign disulfide bond-linked peptides based on deconvoluted precursor masses; the assignment can be further validated by product ions if available [17]. Human Factor Xa (HFXa) was used as the model protein to demonstrate this methodology because HFXa is a well-known and highly disulfide-bridged protein with a molecular weight of about 44 kDa, and has two chains and 12 disulfide bonds including two cystine knots [18-20]. The hypothetical workflow of this approach is shown in Fig. 1. The protein is first alkylated to cap all free cysteines (Protein A in Fig. 1) and then digested to peptides with the selected enzyme (Lys-C and/or trypsin). Disulfide bond-linked species can be readily determined in the subsequent LC-MS/MS experiment (Peptide A in Fig. 1). The disulfide bond linkage can also be determined if there is only one disulfide bond in the linked peptides. For species with complex disulfide bonds such as cystine knots, reduction should be conducted first (at the protein level) to partially reduce the disulfide bonds in the protein (Protein B in Fig. 1). The partially reduced protein can then be digested by the selected enzyme after all free cysteines have been capped to prevent disulfide bond re-formation or scrambling. Disulfide bond linkages can be determined in the subsequent LC-MS/MS experiment (Peptide B in Fig. 1). Free cysteines can also be determined from reduced samples (Peptide C in Fig. 1). The details of how to implement this approach and how to identify disulfide linkages in HFXa are demonstrated in this study, with all 12 HFXa disulfide bonds and their connections unambiguously determined.

2. Materials and methods

2.1. Materials

All reagents were of analytical reagent grade and purchased from Sigma (St. Louis, MO) unless stated otherwise. LC-MS grade acetonitrile (ACN) and water were used. Bond-Breaker™ TCEP was purchased from Thermo (Rockford, IL). Endoproteinase Lys-C (Lys-C) was purchased from Wako (Richmond, VA) and trypsin was purchased from Promega (Madison, WI). Human Factor Xa was purchased from Enzyme Research Laboratories (South Bend, IN). Sequencing grade trifluoroacetic acid (TFA) was purchased from Thermo.

2.2. Sample preparation

Method development for sample preparation focused on optimizing the parameters of partial reduction and alkylation. Partial reduction was performed at TCEP concentrations of 0.1 mM, 0.5 mM, 1.0 mM, and 5.0 mM. No reduction was observed in the 0.1 mM TCEP samples and full reduction (≥ 99%) was observed in both the 1.0 mM and 5.0 mM TCEP samples. The alkylation agent N-ethylmaleimide (NEM) was used to cap free cysteines in the non-reduced sample and in the partially reduced samples (when a disulfide bond is reduced, the related cysteines become free cysteines). The amount of alkylation agent was evaluated with samples reduced at both 1.0 mM and 5.0 mM TCEP to ensure that all free cysteines were alkylated. Three partial reduction conditions ranging from 0.25 mM to 0.75 mM TCEP were used in this study to ensure that a range of reduction rates was achieved for each disulfide bond-linked species. To eliminate experimental bias, each set of samples was prepared at the same time with the same reagents, and analyzed side by side.

Two sets of samples were prepared using different enzymes. Each set consisted of control, low-level partial reduction, mid-level partial reduction, and high-level partial reduction samples. The control sample was prepared by reconstituting a 42-µg aliquot of vacuum-dried sample with 10 µL of tris buffer (6 M guanidine hydrochloride (GuHCl), 250 mM tris, pH 7.5). The low-level, mid-level, and high-level partial reduction samples were prepared in the same way, by reconstituting a 42-µg aliquot of vacuum-dried sample with 10 µL of tris buffer containing 0.25 mM, 0.50 mM, and 0.75 mM TCEP, respectively. Both sets of samples were incubated in the dark at room temperature for one hour. The samples were then treated with 1 µL of 100 mM NEM and incubated in the dark at room temperature for another hour. After NEM alkylation, all samples were diluted by adding 90 µL of H2O and digested by adding 2 µL of 4 mAU/µL Lys-C (unit definition: one amidase unit (AU) is the amount of enzyme that will produce 1 micromole of p-nitroaniline per minute at 30°C, pH 9.5). The first set of samples was incubated at 37°C for 5.5 hours. The second set of samples was incubated at 37°C for one hour and then further digested with 4 µL of 1 µg/µL trypsin overnight at 37°C. After enzyme digestion, 50 µL of the Lys-C control sample (from the first set) was completely reduced by adding 10 µL of 0.5 M TCEP and heating at 60°C for 10 minutes. All samples were diluted to a final concentration of ~0.2 mg/mL with water and 0.5% TFA.

2.3. HPLC instrument and conditions

The experiment was performed on a Waters UPLC® H-class system with a UV detector. Liquid chromatographic separations were achieved on a C18 column (2.1 mm × 150 mm, 1.7 µm, 300 Å, product code: 186003687, Waters, Milford, MA) in a column heater set at 65°C. The peptides were eluted using a gradient program. The starting eluent was a 99:1 mixture of 0.05% TFA in H2O (Mobile Phase A) and 0.04% TFA in ACN (Mobile Phase B), held for 5 minutes. The proportion of Mobile Phase B was increased linearly to 41% over 90 minutes and then increased to 50% in 10 minutes. The column was washed with 80% Mobile Phase B for 4 minutes before the eluent was returned to its initial composition. The column was allowed to re-equilibrate for 10 minutes prior to starting the next analysis. The flow rate was 0.2 mL/min and the sample injection volume was 40 µL (8 µg of protein). The fully reduced sample was analyzed before the non-reduced and partially reduced samples. The UV detection wavelength was 214 nm.

2.4. ESI-QTOF MS instrument and conditions

The mass spectrometer (Xevo G2-S QTof MS, Waters), equipped with an electrospray source and lockspray, was run in sensitivity and positive mode (ES+). The sample eluted from the UV detector was directed to the MS system. Mass spectrometric data were acquired in MSE scan mode (data-independent fragmentation by CID). Data acquisition and analysis were performed with MassLynx® 4.1 and BiopharmaLynx™ 1.3.4 software from Waters. Mass spectrometer settings for MS analyses were as follows: capillary voltage 1.6 kV, cone voltage 40 V, source temperature 130°C, desolvation temperature 350°C, low collision energy 6.0 V, scan range m/z 100-2500, and high collision energy ramped from 20 to 55 eV.

3. Results and Discussion

In this study, acquired data were processed by BiopharmaLynx™ 1.3.4 software. BiopharmaLynx™ allows the user to specify protein sequence information, known modifications, and enzyme selection when processing data. Peak assignments using the provided sequence (Fig. 2) and enzyme cleavage specificity were made on the basis of accurate mass (within 10 ppm of the theoretical value) and confirmed by product ions collected in parallel (b and y ions, which are the main product ions in CID mode). Since the expected disulfide bond linkage information was not used during data processing, all possible disulfide bond-linked peptides could be identified by the software and further confirmed by manual verification. Usually, neither the disulfide bond nor the backbone between residues enclosed by a disulfide bond is cleaved in low-energy CID mode. As a result, no product ion should be found corresponding to cleavages among the residues within a disulfide bond. The amino acid sequence and expected disulfide linkages are presented in Fig. 2 [18]. The known PTMs are listed in Table 1 according to a previous study [19] and were confirmed in this study with the reduced sample (data not shown). For reference, each expected disulfide bond has been assigned a unique number from 1 to 12. Twelve disulfide bonds were expected in HFXa, with No. 1 to 7 within the light chain (LC), No. 9 to 12 within the heavy chain (HC), and No. 8 connecting the light and heavy chains.

3.1. Disulfide Linkages Identified by Non-reduced Sample

The theoretical masses and amino acid sequences of the peptide(s) connected by one or more disulfide bonds from Lys-C digestion are presented in Table 1. The extracted ion current (XIC) and total ion current (TIC) of the non-reduced protein digest are presented in Fig. 3. All 6 disulfide-linked species were detected, as presented in Figs. 3a to 3f, and their masses and intensities are listed in Table 2. No additional species due to scrambled disulfide bond linkage was detected in the non-reduced sample. In principle, identification of a single disulfide bond linkage is straightforward because there is only one possibility for connection. On the other hand, there are 15 ways to form a cystine knot [7]. Three single-disulfide-bonded linkages were assigned by precursor masses as containing disulfide bonds 1, 11, and 12. As expected, their identities were further confirmed by the corresponding product ions (included in section 1 of supplementary data for reference). The heavy chain disulfide-linked peptides K1 and K2 were assigned as containing two disulfide bonds (disulfide bonds 9 and 10) by precursor ion in Fig. 4a, and were also confirmed by MS/MS product ions (Fig. 4b) to be C7-C12 (disulfide bond 9) and C27-C43 (disulfide bond 10). The 2-Da mass reduction of product ions y26 to y31 was due to the intra-peptide disulfide bond 10. These identifications were also supported by comparing the TIC profile of the non-reduced sample (Fig. 5c) with that of the fully reduced digest (Fig. 5a). Peaks at 48.85 min and 61.52 min in the TIC profile of the non-reduced digest (Fig. 5c) can be assigned by precursor masses as disulfide bonds 2-4 and disulfide bonds 5-8, respectively, as indicated in Fig. 2. However, the specific connection of each disulfide bond in these two groups of peptides cannot be elucidated or confirmed due to the existence of cystine knots. The following sections demonstrate how to determine the specific connections of the disulfide bonds in these two cystine knots through protein-level partial reduction followed by alkylation, enzymatic digestion and LC-MS/MS analysis.

3.2. Elucidation of Disulfide Bonds 2-4 with Partially Reduced Samples

Partially reduced and Lys-C digested samples were used to elucidate disulfide linkages of disulfide bonds 2-4. Disulfide bond reduction rates were estimated to be 50% to 95% in three different partial reduction conditions as reported in Table 3. To elucidate disulfide bonds 2-4, each related peptide was processed to identify alkylated cysteine in order to establish disulfide linkages in the sample. All free cysteines generated from disulfide bond reduction were in alkylated form with a 125-Da mass increase. Peptides with alkylated free cysteine(s) generated by partial reduction were identified by MS and confirmed by MS/MS. The data are reported in Table 4 for results from mid-level partial reduction condition as an example.
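The 125-Da mass increase on each NEM-capped cysteine can be folded into the expected mass of a partially reduced species. The following is a minimal Python sketch of that bookkeeping; the NEM and hydrogen masses are standard monoisotopic values, whereas the peptide masses and the function name are hypothetical and used only for illustration.

```python
# Standard monoisotopic masses in Da.
HYDROGEN = 1.007825
NEM = 125.047679  # N-ethylmaleimide adduct, C6H7NO2


def species_mass(reduced_peptide_masses, n_disulfides, n_nem_cysteines):
    """Expected neutral mass of a (partially reduced) linked species.

    Starts from the summed masses of the fully reduced, unmodified
    peptides, subtracts two hydrogens per remaining disulfide bond,
    and adds one NEM per alkylated free cysteine.
    """
    return (sum(reduced_peptide_masses)
            - 2 * HYDROGEN * n_disulfides
            + NEM * n_nem_cysteines)


# Hypothetical example: two peptides held together by one remaining
# disulfide bond, with one cysteine freed by reduction and capped.
mass_one_bond_one_nem = species_mass([1000.0, 2000.0], 1, 1)

# Reducing one bond in an intact species and capping both cysteines
# shifts the mass by 2 * (HYDROGEN + NEM), about 252.11 Da.
shift_per_reduced_bond = (species_mass([3000.0], 0, 2)
                          - species_mass([3000.0], 1, 0))
```

Matching observed precursor masses against such expected values is one way to shortlist candidate species before MS/MS confirmation of the alkylation site.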

Six species arising from partial reduction of the linkage of disulfide bonds 2-4 were detected and identified, as listed in Table 4. The light chain disulfide bond-linked peptides K6 and K7, with one disulfide bond and one free cysteine resulting from partial reduction, were observed at two retention times (RT), 21.8 and 22.9 min, corresponding to two isoforms with different disulfide bond connections. For the isoform eluting at 21.8 min, LC-MS/MS product ions confirmed that C55 and C61 were connected to form a disulfide bond while C50 was a free cysteine. Meanwhile, the light chain disulfide bond-linked peptides K8 and K9 (peak at 58.2 min) were also determined to have one disulfide bond and one free cysteine. MS/MS product ions confirmed that C72 and C81 were connected to form a disulfide bond while C70 was a free cysteine. Therefore, C50 and C70 should be connected to form a disulfide bond. The first cystine knot linkage of the light chain peptides (K6=K7=K8=K9, where “=” means peptides linked by a disulfide bond) is thus confirmed to be C50-C70 (disulfide bond 2), C55-C61 (disulfide bond 3), and C72-C81 (disulfide bond 4), as expected in Fig. 2. Secondly, the isoform eluting at 22.9 min consists of light chain peptides K6 and K7 linked by a disulfide bond between C50 and C61, with C55 as a free cysteine. Therefore, a second cystine knot linkage of the light chain peptides (K6=K7=K8=K9) is confirmed to be C50-C61 (disulfide bond S1), C55-C70 (disulfide bond S2), and C72-C81 (disulfide bond 4). Finally, based on the identified linked peptides K6=K7=K8 at 44.6 min and the individual peptides K8 and K9, a third cystine knot linkage of the light chain peptides (K6=K7=K8=K9) is also confirmed to be C50-C81 (disulfide bond S3), C55-C72 (disulfide bond S4), and C61-C70 (disulfide bond S5).
Additional support for this linkage confirmation was that C70 and C72 in light chain K8 (peak at 62.9 min) and C81 in K9 (peak at 42.9 min) were alkylated in the partially reduced samples but not in the non-reduced sample. The related representative mass spectra and results from all three partial reduction conditions are presented in section 2 of supplementary data for reference. The latter two cystine knot formations of the linked peptides (K6=K7=K8=K9) were not expected, indicating that the approach can be used to identify unknown disulfide bonds and complex disulfide bond connections. In summary, 3 out of 15 possible cystine knot linkages were determined for the 6 cysteines located in light chain peptides K6 to K9 from the partially reduced samples. The relative amounts estimated by XIC peak intensity were 75.6%, 15.4%, and 9.0% for the first to third linkages, respectively. The disulfide bond connections of the first linkage were the same as those expected in Fig. 2 and were the dominant form (75.6%). However, five scrambled disulfide bonds were identified as forming the second and third linkages. As explained in the Discussion section below, the scrambling is likely inherent to the protein and not introduced during sample processing and handling.

3.3. Elucidation of Disulfide Bonds 5-8 by Partially Reduced Samples

The results from all three partial reduction conditions for the Lys-C peptides related to disulfide bonds 5-8 are presented in section 3 of supplementary data. Although the data can confirm the expected disulfide bond 8, connecting LC C132 and HC C108 (Fig. 2), the other disulfide bond connections in the linkages related to disulfide bonds 5-8 could not be determined because Lys-C digestion generated larger peptides that yielded fewer product ions useful for site-specific cysteine determination. Since large peptides (LC K10 and HC K7) were generated by Lys-C digestion, it was necessary to reduce the size of these two peptides. Trypsin can remove 9 C-terminal residues from LC K10, and 6 N-terminal and 11 C-terminal residues from HC K7. Therefore, Lys-C/trypsin sequential digests of partially reduced samples were used to elucidate the disulfide linkages in disulfide bonds 5-8. As in the elucidation of disulfide bonds 2-4, each species related to disulfide bonds 5-8 was processed to identify alkylated cysteines in order to establish the disulfide linkages in the sample. As discussed above, all free cysteines generated from disulfide bond reduction were in alkylated form with a 125-Da mass increase. Again, as an example, the results from the mid-level partial reduction condition are summarized in Table 5. In Table 5, the light chain linked peptides K10c9 (9 C-terminal residues clipped by trypsin digestion) and K11, with a peak at 50.4 min, were determined to have three disulfide bonds and one free cysteine as the result of one disulfide bond reduction in the sample. Product ions determined that C132 was a free cysteine. Two K10c9-related peaks at 45.6 min and 54.8 min were detected as the result of partial reduction. The peak at 45.6 min was identified as having two disulfide bonds and one free cysteine (C111), while the peak at 54.8 min was identified as having one disulfide bond and three free cysteines (C96, C109, and C111). The identification of free cysteine C96 was inferred since no product ion was detected between C96 and C100 or between C89 and C96. Therefore, the confirmed linkages were C89-C100 (disulfide bond 5), C96-C109 (disulfide bond 6), and C111-C124 (disulfide bond 7) as intra-chain disulfide bonds in the light chain, and LC C132-HC C108 (disulfide bond 8) as an inter-chain disulfide bond. The cysteine in heavy chain K7n6c11 (6 N-terminal residues and 11 C-terminal residues clipped by trypsin digestion) was confirmed by MS and MS/MS data to be alkylated in the partially reduced sample (peak at 47.8 min) but not in the non-reduced sample.

The confirmed disulfide bond connections in the linkage of disulfide bonds 5-8 were the same as expected in Fig. 2, and no additional linkage was identified for the linked peptides (LC K10c9=K11=HC K7n6c11). This indicated that no disulfide bond scrambling occurred during sample digestion and handling. The representative mass spectra and results from all three partial reduction conditions are included in section 3 of supplementary data for reference.

3.4. Free Cysteine Quantification

The fully reduced sample was used to quantify the free cysteines in the HFXa sample, and the results are reported in Table 6. All free cysteines were alkylated by NEM, giving a 125-Da mass increase, whereas cysteines involved in disulfide bond formation were in reduced form. Each free cysteine site was identified by MS/MS and is labeled in red in Table 6. In light chain peptide K6, two free cysteine sites were identified as C50 and C55 [21]. Two isomeric forms of alkylated C50 were detected with slightly shifted retention times (NEM alkylation isomers were formed during the alkylation reaction), but only one peak was detected for C55 alkylation. The isomer formation of NEM alkylation and its mechanism are beyond the scope of this study. Overall, only low levels (≤ 5%) of free cysteines were detected in the protein. These low levels of free cysteines were fully alkylated before digestion to avoid any potential introduction of disulfide bond scrambling during sample processing.

3.5. Discussion

In this study, accurate-mass LC-MS peptide mapping and sample preparation techniques were applied to assess the disulfide-linked peptides and disulfide bond connections of the HFXa protein. As usual, LC-MS analyses of non-reduced Lys-C digests and fully reduced Lys-C digests can be used to determine the disulfide bond-linked peptides (Table 1 and Fig. 3) and free cysteines (Table 6) of the protein, respectively. With the help of BiopharmaLynx™ software, linkages (or peptide complexes linked by multiple disulfide bonds) could also be assigned from non-reduced Lys-C digests based on deconvoluted and charge-reduced precursor masses, such as the two cystine knots of LC K6=K7=K8=K9 and LC K10=K11=HC K7 (connected by 3 and 4 disulfide bonds, respectively, as shown in Table 2 and Fig. 2). However, the MS/MS fragments generated from these cystine knots or complex linkages are not sufficient to locate the site-specific cysteine connections. For this purpose, partial reduction of the protein prior to digestion was conducted to reduce the complexity of the disulfide bond connections. At optimized partial reduction conditions for the HFXa protein, Lys-C digestion was initially attempted to elucidate the disulfide bond connections in the two cystine knots. It successfully determined the site-specific cysteine connections in the LC K6=K7=K8=K9 complex, including an expected linkage with disulfide bonds 2-4 and two unexpected linkages with 5 scrambled disulfide bond connections. For the complex LC K10=K11=HC K7, Lys-C/trypsin sequential digestion was used to generate smaller tryptic peptides because the linked peptides of this complex generated by Lys-C alone were too large to fragment well. The results (Table 5) of LC-MS/MS analysis of Lys-C/trypsin sequential digests of partially reduced HFXa confirmed disulfide bonds 5-8 of this complex, and no additional scrambled disulfide bonds were detected.
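The deduction used for the LC K6=K7=K8=K9 knot (15 possible pairings of six cysteines, narrowed down by the disulfide bonds observed in partially reduced species) can be sketched in Python as follows. This is only an illustration of the counting and filtering logic under the assumption that each observed bond is taken as given; it is not the procedure implemented in BiopharmaLynx™, and the function names are hypothetical.

```python
def pairings(items):
    """Yield every way to split an even-sized list into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail


def consistent(pattern, observed_bonds):
    """True if the pattern contains every experimentally observed bond."""
    bonds = {frozenset(pair) for pair in pattern}
    return all(frozenset(pair) in bonds for pair in observed_bonds)


# Cysteines of light chain peptides K6 to K9, as named in the text.
cysteines = ["C50", "C55", "C61", "C70", "C72", "C81"]
all_patterns = list(pairings(cysteines))  # 5 * 3 * 1 = 15 patterns

# Bonds confirmed by MS/MS in the species at 21.8 min and 58.2 min.
observed = [("C55", "C61"), ("C72", "C81")]
matches = [p for p in all_patterns if consistent(p, observed)]
```

With two of the three bonds observed, only one pattern remains, which pairs the two leftover cysteines (C50 and C70), in line with the first linkage described in section 3.2.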
The methodology demonstrated here using HFXa, which combines protein-level partial reduction with accurate-mass LC-MS peptide mapping, should be adaptable to the elucidation of complex disulfide-bonded structures of other proteins, such as cystine knots with known or unknown disulfide bond connections. This was illustrated in this study by the confirmation of the expected disulfide bonds 2-4 in cystine knot LC K6=K7=K8=K9 and 5-8 in complex LC K10=K11=HC K7, and by the determination of two unexpected linkages of cystine knot LC K6=K7=K8=K9. Five scrambled disulfide bonds were identified in the formation of these two unexpected linkages. As a general approach, an initial LC-MS disulfide bond mapping experiment should be performed using a non-reduced sample to identify all disulfide-related species. A peptide mapping experiment should also be performed using a fully reduced sample for determination of potential free cysteines and post-translational modifications (PTMs), and for profile comparison between reduced and non-reduced digests. Suitable enzyme(s) should be screened to achieve the most desirable cleavages between cysteines and to generate peptides of suitable size for the best fragmentation. A single-site cleavage enzyme such as Lys-C (a protease that cleaves proteins on the C-terminal side of lysine residues) is suggested as a starting point to assess possible linked peptides if feasible. After initial mapping, more enzymes can be screened to provide additional cleavage specificity. Finally, optimized protein partial reduction levels should be evaluated to reduce the complexity of disulfide linkages such as cystine knots. This method can be used in most biopharmaceutical industrial laboratories since recombinant proteins are generated with known sequences and the type of QTOF LC-MS system used for this study (an accurate-mass MS system with CID capability) is widely available. The experiment is designed to study cystine knots or complex disulfide linkages. All sample preparation conditions were tuned to achieve the most efficient alkylation and enzyme digestion.
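The suggestion to start with a single-site enzyme such as Lys-C can be explored in silico before any experiment. The short Python sketch below splits a sequence after every lysine and counts the cysteines per peptide, using the K1, K2, ... peptide labels adopted in this study. The sequence is a made-up toy string (not HFXa), the function names are hypothetical, and missed cleavages are ignored.

```python
def lysc_digest(sequence):
    """Split a one-letter protein sequence after every lysine (K)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        if residue == "K":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide without K
    return peptides


def cysteine_counts(peptides):
    """Label peptides K1, K2, ... and count the cysteines in each."""
    return [(f"K{n}", pep, pep.count("C"))
            for n, pep in enumerate(peptides, start=1)]


# Toy sequence, for illustration only.
toy_sequence = "ACDKGCCEKLMNKPCK"
toy_peptides = lysc_digest(toy_sequence)
toy_table = cysteine_counts(toy_peptides)

# Peptides with more than one cysteine can take part in more than one
# disulfide bond, so they flag where a second enzyme may be useful.
multi_cys = [label for label, _, n_cys in toy_table if n_cys > 1]
```

Repeating the same count for another enzyme's cleavage rule gives a quick comparison of which digestion separates the cysteines best.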
If disulfide bond scrambling is a major concern, the method should be further optimized by adjusting the denaturation, alkylation, and enzyme digestion conditions. In this study, potential free cysteines were fully capped to prevent any disulfide bond scrambling during sample preparation. The scrambled disulfide bonds identified in the two unexpected linkages of cystine knot LC K6=K7=K8=K9 in this study are therefore likely inherent to the protein. This was further supported by one additional experiment performed with low-pH digestion (pH 5.2) of partially reduced HFXa, in which similar LC-MS/MS identification results were obtained (data not shown).

4. Conclusions

In this study, we have demonstrated how protein-level partial reduction can be used to study the disulfide linkages of a protein with cystine knots, using HFXa as a model protein. With this method, simple disulfide linkages (such as disulfide bonds 1, 9-10, 11, and 12) can be determined from a non-reduced sample, while cystine knots (disulfide bonds 2-4 and 5-8) can be further identified from partially reduced samples. Free cysteine content can be determined from a fully reduced sample. In addition, scrambled disulfide bonds can be monitored, and therefore unknown disulfide bonds, if any exist, can be elucidated for complex disulfide linkages such as cystine knots. In this study, TCEP was used in the range of 0.25 mM to 0.75 mM, which is in line with previous studies [16]. Multiple reduction conditions provided a range of partially reduced species for confident data elucidation. The estimated method sensitivity is at the mid-picomole level.
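The volumes and amounts quoted in sections 2.2 and 2.3 can be cross-checked with a few lines of arithmetic. The Python sketch below uses only values stated in the text for the first (Lys-C only) sample set and the approximate molecular weight of 44 kDa; the variable names are arbitrary, and the calculation assumes the volumes are simply additive.

```python
# Values taken from sections 2.2 and 2.3 (first sample set, Lys-C only).
protein_ug = 42.0
buffer_ul = 10.0
nem_ul, nem_stock_mm = 1.0, 100.0
water_ul, lysc_ul = 90.0, 2.0

# NEM concentration during the alkylation step (mM).
nem_mm = nem_stock_mm * nem_ul / (buffer_ul + nem_ul)

# Protein concentration in the digest; ug/uL is numerically mg/mL.
digest_ul = buffer_ul + nem_ul + water_ul + lysc_ul
digest_mg_ml = protein_ug / digest_ul

# Amount loaded on column from a 40-uL injection at ~0.2 mg/mL.
inject_ul, final_mg_ml = 40.0, 0.2
on_column_ug = inject_ul * final_mg_ml

# Convert to picomoles using the approximate molecular weight (g/mol):
# ug / (g/mol) gives umol, and 1 umol = 1e6 pmol.
mw_da = 44_000.0
on_column_pmol = on_column_ug / mw_da * 1e6
```

Under these assumptions the digest is roughly 0.4 mg/mL before the final dilution, and the 8-µg injection corresponds to a load on the order of 180 pmol of protein.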

Acknowledgements

The authors wish to thank Michael Nold, Director, MSCF, KBI Biopharma, for providing the funding and facilities for this study.

References [1]

[2]

[3] [4]

[5]

[6]

[7]

[8]

[9]

[10]

[1] D. Liu and D. Cowburn, Combining biophysical methods to analyze the disulfide bond in SH2 domain of C-terminal Src kinase, Biophys. Rep. (2016). doi:10.1007/s41048-016-0025-4
[2] L. Poppe, J. O. Hui, J. Ligutti, J. K. Murray, and P. D. Schnier, PADLOC: A powerful tool to assign disulfide bond connectivities in peptides and proteins by NMR spectroscopy, Anal. Chem. 84 (2012) 262-266.
[3] B. Zhang and S. L. Cockrill, Methodology for determining disulfide linkage patterns of closely spaced cysteine residues, Anal. Chem. 81 (2009) 7314-7320.
[4] J. S. Andersen, B. Svensson, and P. Roepstorff, Electrospray ionization and matrix assisted laser desorption/ionization mass spectrometry: powerful analytical tools in recombinant protein chemistry, Nat. Biotechnol. 14 (1996) 449-457.
[5] Y. Wang, Q. Lu, S-L. Wu, B. L. Karger, and W. S. Hancock, Characterization and comparison of disulfide linkages and scrambling patterns in therapeutic monoclonal antibodies: using LC-MS with electron transfer dissociation, Anal. Chem. 83 (2011) 3133-3140.
[6] M. S. Goyder, F. Rebeaud, M. E. Pfeifer, and F. Kalman, Strategies in mass spectrometry for the assignment of Cys-Cys disulfide connectivities in proteins, Expert Rev. Proteomics 10.5 (2013) 489-501.
[7] W. Ni, M. Lin, P. Salinas, P. Savickas, S-L. Wu, and B. L. Karger, Complete mapping of a cysteine knot and nested disulfides of recombinant human arylsulfatase A by multi-enzyme digestion and LC-MS analysis using CID and ETD, J. Am. Soc. Mass Spectrom. 24 (2013) 125-133.
[8] S-L. Wu, H. Jiang, Q. Lu, S. Dai, W. S. Hancock, and B. L. Karger, Mass spectrometric determination of disulfide linkages in recombinant therapeutic proteins using online LC-MS with electron-transfer dissociation, Anal. Chem. 81 (2009) 112-122.
[9] W. Zhang, L. A. Marzilli, J. C. Rouse, and M. J. Czupryn, Complete disulfide bond assignment of a recombinant immunoglobulin G4 monoclonal antibody, Anal. Biochem. 311 (2002) 1-9.
[10] V. Schnaible, S. Wefing, A. Bucker, S. Wolf-Kummeth, and D. Hoffmann, Partial reduction and two-step modification of proteins for identification of disulfide bonds, Anal. Chem. 74 (2002) 2386-2393.
[11] B. Seiwert, H. Hayer, and U. Karst, Differential labeling of free and disulfide-bound thiol functions in proteins, J. Am. Soc. Mass Spectrom. 19 (2008) 1-7.
[12] C. Chumsae, G. Gaza-Bulseco, and H. Liu, Identification and localization of unpaired cysteine residues in monoclonal antibodies by fluorescence labeling and mass spectrometry, Anal. Chem. 81 (2009) 6449-6457.
[13] Y. Zhang, W. Cui, H. Zhang, H. D. Dewald, and H. Chen, Electrochemistry-assisted top-down characterization of disulfide-containing proteins, Anal. Chem. 84 (2012) 3838-3842.
[14] P. L. Tsai, S-F. Chen, and S. Y. Huang, Mass spectrometry-based strategies for protein disulfide bond identification, Rev. Anal. Chem. 32 (2013) 257-268.
[15] U. Goransson and D. J. Craik, Disulfide mapping of the Cyclotide Kalata B1, J. Biol. Chem. 278 (2003) 48188-48190.
[16] H. Liu, C. Chumsae, G. Gaza-Bulseco, K. Hurkmans, and C. H. Radziejewski, Ranking the susceptibility of disulfide bonds in human IgG1 antibodies by reduction, differential alkylation, and LC-MS analysis, Anal. Chem. 82 (2010) 5219-5226.
[17] H. Xie and W. Chen, Fast and automatic mapping of disulfide bonds in a monoclonal antibody using SYNAPT G2 HDMS and BiopharmaLynx 1.3, Waters application note 134626518, 2011.
[18] J. S. Joseph and R. M. Kini, Snake venom prothrombin activators homologous to blood coagulation Factor Xa, Haemostasis 31 (2001) 234-240.
[19] B. A. McMullen, K. Fujikawa, W. Kisiel, T. Sasagawa, W. N. Howald, E. Y. Kwa, and B. Weinstein, Complete amino acid sequence of the light chain of human blood coagulation factor X: evidence for identification of Residue 63 as β-hydroxyaspartic acid, Biochemistry 22 (1983) 2875-2884.
[20] J. S. Joseph, M. C. M. Chung, K. Jeyaseelan, and R. M. Kini, Amino acid sequence of trocarin, a prothrombin activator from Tropidechis carinatus venom: its structural similarity to coagulation factor Xa, Blood 94 (1999) 621-631.
[21] P. Martinez-Acedo, V. Gupta, and K. S. Carroll, Proteomic analysis of peptides tagged with dimedone and related probes, J. Mass Spectrom. 49 (2014) 257-265.

[Figure 1: workflow flowchart with the following branches]
Protein -> Alkylation -> Protein A: free cysteines alkylated -> Enzyme digestion -> Peptide A: disulfide bonds intact -> LC-MS/MS analysis -> Disulfide Bond Mapping
Protein -> Partial reduction and alkylation -> Protein B: free and reduced cysteines alkylated -> Enzyme digestion -> Peptide B: disulfide bonds partially reduced -> LC-MS/MS analysis -> Disulfide Bond Mapping
Reduction -> Peptide C: disulfide bonds reduced -> LC-MS/MS analysis -> Free cysteine determination

Fig. 1. Disulfide Bond Mapping Workflow.

Human Factor Xa
Average Mass = 44210.9008, Monoisotopic Mass = 44181.8774
N-Terminus = H, C-Terminus = OH

Light Chain
1 ANSFL EEMKK GHLER ECMEE TCSYE EAREV FEDSD KTNEF WNKYK DGDQC ETSPC QNQGK CKDGL GEYTC TCLEG
76 FEGKN CELFT RKLCS LDNGD CDQFC HEEQN SVVCS CARGY TLADN GKACI PTGPY PCGKQ TLER

Heavy Chain
1 IVGGQ ECKDG ECPWQ ALLIN EENEG FCGGT ILSEF YILTA AHCLY QAKRF KVRVG DRNTE QEEGG EAVHE VEVVI
76 KHNRF TKETY DFDIA VLRLK TPITF RMNVA PACLP ERDWA ESTLM TQKTG IVSGF GRTHE KGRQS TRLKM LEVPY
151 VDRNS CKLSS SFIIT QNMFC AGYDT KQEDA CQGDS GGPHV TRFKD TYFVT GIVSW GEGCA RKGKY GIYTK VTAFL
226 KWIDR SMKTR GLPKA KSHAP EVITS SPLK

(In the original figure each cysteine is marked and labeled with its disulfide bond number, 1 to 12; the bond groups are listed by peptide in Table 1.)

Fig. 2. Human Factor Xa Sequence with 12 Expected Disulfide Bonds [18]
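The Lys-C fragment labels used in the tables (K1, K2, ...) follow from cleaving the Fig. 2 sequence on the C-terminal side of every lysine. The sketch below is a minimal illustration for the light chain, assuming complete cleavage with no missed sites; the function name `lys_c_digest` is hypothetical and is not taken from the software used in the study.

```python
# Light chain sequence as printed in Fig. 2 (139 residues).
LIGHT_CHAIN = (
    "ANSFLEEMKKGHLERECMEETCSYEEAREVFEDSDKTNEFWNKYKDGDQCETSPCQNQGK"
    "CKDGLGEYTCTCLEGFEGKNCELFTRKLCSLDNGDCDQFCHEEQNSVVCSCARGYTLADNGK"
    "ACIPTGPYPCGKQTLER"
)


def lys_c_digest(sequence):
    """Cleave after every K (assumed complete digestion).

    Returns a list of (label, start, end, peptide) with 1-based residue numbers.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue == "K" or is_last:
            label = f"K{len(fragments) + 1}"
            fragments.append((label, start + 1, i + 1, sequence[start:i + 1]))
            start = i + 1
    return fragments


fragments = lys_c_digest(LIGHT_CHAIN)
for label, first, last, peptide in fragments:
    # Report only the cysteine-containing fragments, which carry the disulfide bonds.
    if "C" in peptide:
        print(label, f"{first}-{last}", peptide)
```

Under these assumptions the cysteine-containing light chain fragments are K3, K6 to K9, K10, and K11, which are the peptides grouped under disulfide bonds 1, 2-4, and 5-8 in Table 1.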

[Figure 3: stacked chromatograms of the non-reduced Lys-C digest (data file 2015_11_24_FXa_b14, TOF MS ES+). x-axis: Time, 45.00 to 90.00 min; y-axis: relative intensity (%). XIC mass window: 0.2000 Da.]
(a) XIC m/z 1158.1 (3.86e6): peak at 44.25 min, disulfide bond 1
(b) XIC m/z 1176 (2.30e6): peak at 48.85 min, disulfide bonds 2-4
(c) XIC m/z 1673.7 (1.11e6): peak at 61.52 min, disulfide bonds 5-8
(d) XIC m/z 1320.4 (5.19e6): peak at 84.52 min, disulfide bonds 9-10
(e) XIC m/z 1226.9 (1.10e7): peak at 64.47 min, disulfide bond 11
(f) XIC m/z 1307.3 (6.49e6): peak at 55.34 min, disulfide bond 12
(g) TIC (9.70e7)

Fig. 3. Extracted ion current (XIC) and total ion current (TIC) from non-reduced Lys-C digest: (a) to (f) XIC of disulfide linkages, with disulfide bonds as labeled on each peak; (g) TIC of the non-reduced digest.

[Figure 4: LC/Q-TOF mass spectra, sample "0.2 mg/mL NRE Lys-5B" (data file 2015_11_24_FXa_b14). x-axis: m/z; y-axis: relative intensity (%). Both panels are annotated with the peptides IVGGQECK and DGECPWQALLINEENEGFCGGTILSEFYILTAAHCLYQAK joined by disulfide bonds 9 and 10.]
(a) 1: TOF MS ES+, scan 2232 (84.523 min), 1.40e7, m/z 1319 to 1325. Isotope cluster, m/z (intensity): 1319.6163 (2003857), 1319.8691 (6860816), 1320.1071 (11874296), 1320.3599 (14042747), 1320.6128 (12218939), 1320.8656 (8204612), 1321.1187 (4455564), 1321.3716 (1994240), 1321.6096 (886314).
(b) 2: TOF MS ES+, scan 2231 (84.506 min), 1.57e5, m/z 1420 to 1740. Labeled product ions: y26-2, y27-2, y28-2, y29-2, y30-2, y31-2.

Fig. 4. LC/Q-TOF mass spectra of (a) the quadruply charged precursor ion of heavy chain K1 and K2 linked by disulfide bonds 9-10, and (b) product ions of the peptide in (a). The expected disulfide bond connections can be assigned in (a) and confirmed in (b).
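The charge state of the precursor in Fig. 4(a) can be checked from the spacing of its isotope peaks, since neighboring isotopologues differ by about 1.00335 Da divided by the charge. The following is a rough sketch only: it assumes the nine labeled peaks are consecutive isotopologues and that the lowest one (m/z 1319.6163) is the monoisotopic peak. The helper names are hypothetical.

```python
C13_SPACING = 1.003355  # approximate 13C - 12C mass difference, Da
PROTON = 1.007276       # proton mass, Da

# Labeled isotope peaks (m/z) read from Fig. 4(a).
peaks = [1319.6163, 1319.8691, 1320.1071, 1320.3599, 1320.6128,
         1320.8656, 1321.1187, 1321.3716, 1321.6096]


def charge_from_isotopes(peaks_mz):
    """Estimate charge from the mean spacing between consecutive isotope peaks."""
    ordered = sorted(peaks_mz)
    mean_spacing = (ordered[-1] - ordered[0]) / (len(ordered) - 1)
    return round(C13_SPACING / mean_spacing)


def neutral_mass(mz, charge):
    """Neutral monoisotopic mass of an [M+zH]z+ ion."""
    return charge * (mz - PROTON)


z = charge_from_isotopes(peaks)
mass = neutral_mass(min(peaks), z)
print(z, round(mass, 4))
```

With these inputs the estimated charge is 4, and the resulting neutral mass agrees with the theoretical value of 5274.4448 Da for HC K1=K2 in Table 2 to within about 0.01 Da.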

[Figure 5: stacked TIC chromatograms (TOF MS ES+). x-axis: Time, 40.00 to 90.00 min; y-axis: relative intensity (%).]
(a) Reduced (data file 2015_11_24_FXa-b06, TIC 6.28e7)
(b) Partially reduced (data file 2015_11_24_FXa_b16, TIC 6.79e7)
(c) Non-reduced (data file 2015_11_24_FXa_b14, TIC 9.70e7), with peaks labeled for disulfide bonds 1, 2-4, 12, 5-8, 11, and 9-10

Fig. 5. TIC profile comparison of three different Lys-C digests: (a) fully reduced, (b) partially reduced, (c) non-reduced. All disulfide-linked peptides are labeled in (c).

Table 1. Disulfide bond related peptides from in-silico Lys-C digestion

Masses for a disulfide-linked group are given once, on the first peptide of the group.

Chain | Disulfide Bond | Frag# | Res# | Sequence | Theoretical Mass (Da) | [M+H] | [M+2H] | [M+3H] | [M+4H]
Light | | K1 | 1-9 | (-)ANSFLEEMK(K) | 1067.50 | 1068.50 | 534.76 | 356.84 | 267.88
Light | | K2 | 10-10 | (K)K(G) | 146.11 | 147.11 | 74.06 | 49.71 | 37.53
Light | 1 | K3 | 11-36 | (K)GHLERECMEETCSYEEAREVFEDSDK(T) | 3118.26 | 3119.27 | 1560.14 | 1040.43 | 780.57
Light | | K4 | 37-43 | (K)TNEFWNK(Y) | 937.43 | 938.44 | 469.72 | 313.48 | 235.37
Light | | K5 | 44-45 | (K)YK(D) | 309.17 | 310.18 | 155.59 | 104.06 | 78.30
Light | 2-4 | K6 | 46-60 | (K)DGDQCETSPCQNQGK(C) | 4681.94 | 4682.95 | 2341.98 | 1561.66 | 1171.49
Light | 2-4 | K7 | 61-62 | (K)CK(D)
Light | 2-4 | K8 | 63-79 | (K)DGLGEYTCTCLEGFEGK(N)
Light | 2-4 | K9 | 80-87 | (K)NCELFTRK(L)
Light | 5-8 | K10 | 88-122 | (K)LCSLDNGDCDQFCHEEQNSVVCSCARGYTLADNGK(A) | 8202.29 | 8203.30 | 4102.15 | 2735.10 | 2051.58
Light | 5-8 | K11 | 123-134 | (K)ACIPTGPYPCGK(Q)
Heavy | 5-8 | K7 | 96-123 | (K)TPITFRMNVAPACLPERDWAESTLMTQK(T)
Light | | K12 | 135-139 | (K)QTLER(-) | 645.34 | 646.35 | 323.68 | 216.12 | 162.34
Heavy | 9-10 | K1 | 1-8 | (-)IVGGQECK(D) | 5274.44 | 5275.45 | 2638.23 | 1759.16 | 1319.62
Heavy | 9-10 | K2 | 9-48 | (K)DGECPWQALLINEENEGFCGGTILSEFYILTAAHCLYQAK(R)
Heavy | | K3 | 49-51 | (K)RFK(V) | 449.28 | 450.28 | 225.65 | 150.77 | 113.33
Heavy | | K4 | 52-76 | (K)VRVGDRNTEQEEGGEAVHEVEVVIK(H) | 2777.39 | 2778.40 | 1389.71 | 926.81 | 695.36
Heavy | | K5 | 77-82 | (K)HNRFTK(E) | 801.42 | 802.43 | 401.72 | 268.15 | 201.36
Heavy | | K6 | 83-95 | (K)ETYDFDIAVLRLK(T) | 1581.84 | 1582.85 | 791.93 | 528.29 | 396.47
Heavy | | K8 | 124-136 | (K)TGIVSGFGRTHEK(G) | 1387.72 | 1388.73 | 694.87 | 463.58 | 347.94
Heavy | | K9 | 137-144 | (K)GRQSTRLK(M) | 944.55 | 945.56 | 473.28 | 315.86 | 237.15
Heavy | 11 | K10 | 145-157 | (K)MLEVPYVDRNSCK(L) | 3675.71 | 3676.72 | 1838.86 | 1226.24 | 919.93
Heavy | 11 | K11 | 158-176 | (K)LSSSFIITQNMFCAGYDTK(Q)
Heavy | 12 | K12 | 177-194 | (K)QEDACQGDSGGPHVTRFK(D) | 3916.79 | 3917.80 | 1959.40 | 1306.60 | 980.21
Heavy | 12 | K13 | 195-212 | (K)DTYFVTGIVSWGEGCARK(G)
Heavy | | K14 | 213-214 | (K)GK(Y) | 203.13 | 204.13 | 102.57 | 68.72 | 51.79
Heavy | | K15 | 215-220 | (K)YGIYTK(V) | 743.39 | 744.39 | 372.70 | 248.80 | 186.85
Heavy | | K16 | 221-226 | (K)VTAFLK(W) | 677.41 | 678.42 | 339.71 | 226.81 | 170.36
Heavy | | K17 | 227-233 | (K)WIDRSMK(T) | 934.47 | 935.48 | 468.24 | 312.50 | 234.63
Heavy | | K18 | 234-239 | (K)TRGLPK(A) | 670.41 | 671.42 | 336.21 | 224.48 | 168.61

Note:
1. Theoretical masses of each disulfide bond linked group are listed according to the amino acid sequence and the expected disulfide bond linkage (Fig. 2) in a Lys-C digestion sample. For example, light chain peptide K3 contains disulfide bond 1, with a theoretical mass of 3118.26 Da and an [M+H]+ of m/z 3119.27 (without considering any modification).
2. According to a previous study [19] and our experiments, the following modifications were used in this study:
• Carboxy-E: glutamic acid carboxylation (+43.9898 Da) at light chain E14, E16, E19, E20, E25, E26, E29, and E32.
• Hydroxyl-D: aspartic acid hydroxylation (+15.9949 Da) at light chain D63.
• Glycosylation: glycosylation (+162.0528 Da) at light chain S106 (the modification site S106was assumed, not verified, since we only confirmed that S90 was not the modification site). The modification was identified through the reduced sample (data not shown).
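The charge-state columns of Table 1 and the modification-adjusted masses used in the later tables follow from simple arithmetic on the neutral theoretical mass. The sketch below illustrates this for light chain K3, assuming a proton mass of 1.007276 Da; the function and dictionary names are hypothetical, and the mass shifts are the ones quoted in the Table 1 note.

```python
PROTON = 1.007276  # proton mass in Da (assumed constant for this sketch)

# Mass shifts quoted in the note to Table 1 (Da).
MOD_DELTAS = {
    "carboxy_E": 43.9898,
    "hydroxyl_D": 15.9949,
    "glycosylation": 162.0528,
}


def mz(neutral_mass, charge):
    """m/z of the [M+zH]z+ ion for a given neutral mass."""
    return (neutral_mass + charge * PROTON) / charge


def modified_mass(base_mass, mods):
    """Add modification mass shifts; mods maps a modification name to its count."""
    return base_mass + sum(MOD_DELTAS[name] * count for name, count in mods.items())


k3 = 3118.26  # light chain K3, unmodified, from Table 1
print([round(mz(k3, z), 2) for z in (1, 2, 3, 4)])

# K3 carries eight carboxylated glutamic acids (E14 to E32).
print(round(modified_mass(k3, {"carboxy_E": 8}), 2))
```

The first line reproduces the [M+H] to [M+4H] entries for K3 (3119.27, 1560.14, 1040.43, 780.57), and the second gives about 3470.18 Da, matching the theoretical mass listed for LC K3 in Table 2.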

Table 2. Disulfide linked peptides observed in non-reduced Lys-C digests

Linked Peptides (a) | Assigned Disulfide Bond (c) | RT (Min) | Theoretical Mass (Da) (b) | Measured Mass (Da) | Mass Error (ppm) | Intensity (Counts)
LC K3 | 1 | 44.27 | 3470.1775 | 3470.1843 | 2.0 | 3623429
LC K6=K7=K8=K9 | 2-4 | 48.86 | 4697.9383 | 4697.9521 | 2.9 | 1834231
LC K10=K11= HC K7 | 5-8 | 61.56 | 8358.6732 | 8358.7031 | 3.6 | 1209618
HC K1=K2 | 9-10 | 84.57 | 5274.4448 | 5274.4639 | 3.6 | 9597412
HC K1-K2 | 9-10 | 87.42 | 5256.4341 | 5256.4531 | 3.6 | 707585
HC K10=K11 | 11 | 64.51 | 3675.7080 | 3675.7117 | 1.0 | 14417080
HC K12=K13 | 12 | 55.37 | 3916.7896 | 3916.7983 | 2.2 | 10186410

a: HC K1=K2 means peptides are linked by disulfide bond; HC K1-K2 means peptides are linked by both disulfide bond and peptide bond.
b: Theoretical mass was based on the modifications listed in Table 1.
c: Refer to Table 1 and Fig. 2 for details.
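The mass error column in Table 2 is consistent with the usual parts-per-million definition, the measured minus theoretical mass divided by the theoretical mass. A minimal sketch, assuming that definition (the source does not state the formula explicitly) and using a hypothetical function name:

```python
def ppm_error(measured, theoretical):
    """Mass error in parts per million relative to the theoretical mass."""
    return 1e6 * (measured - theoretical) / theoretical


# (linked peptides, theoretical mass, measured mass) taken from Table 2.
rows = [
    ("LC K3", 3470.1775, 3470.1843),
    ("LC K6=K7=K8=K9", 4697.9383, 4697.9521),
    ("HC K10=K11", 3675.7080, 3675.7117),
]

for name, theoretical, measured in rows:
    print(name, round(ppm_error(measured, theoretical), 1))
```

For these three rows the sketch returns 2.0, 2.9, and 1.0 ppm, the values reported in the table.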

Table 3. Disulfide bonds 2-4 linked peptides observed in Lys-C digests

Sample (a) | Linked Peptides (b) | RT (Min) | Theoretical Mass (Da) (c) | Measured Mass (Da) | Mass Error (ppm) | Intensity (Counts) | % Reduction (d)
Non-reduced | LC K6=K7=K8=K9 | 48.9 | 4697.9383 | 4697.9521 | 2.9 | 1834231 | NA
Partially-reduced: L | LC K6=K7=K8=K9 | 48.8 | 4697.9383 | 4697.9517 | 2.9 | 918647 | 50
Partially-reduced: M | LC K6=K7=K8=K9 | 48.9 | 4697.9383 | 4697.9487 | 2.2 | 482926 | 74
Partially-reduced: H | LC K6=K7=K8=K9 | 48.9 | 4697.9383 | 4697.9473 | 1.9 | 91877 | 95

a: L = low level, M = medium level, H = high level.
b: LC K6=K7=K8=K9 means peptides are linked by disulfide bond.
c: Theoretical mass was based on the modifications listed in Table 1.
d: % Reduction, relative to non-reduced = 100 * (non-reduced - reduced) / non-reduced.
Note: NA = not applicable
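Footnote d of Table 3 can be applied directly to the intensity column. The short sketch below does so for the three partial-reduction levels; the function name is hypothetical and the intensities are the counts listed in Table 3.

```python
def percent_reduction(non_reduced, reduced):
    """% Reduction = 100 * (non-reduced - reduced) / non-reduced (Table 3, footnote d)."""
    return 100.0 * (non_reduced - reduced) / non_reduced


NON_REDUCED = 1834231  # intensity of LC K6=K7=K8=K9 in the non-reduced digest

# Intensities of the same linked peptide at low, medium, and high reduction levels.
partially_reduced = {"L": 918647, "M": 482926, "H": 91877}

for level, intensity in partially_reduced.items():
    print(level, round(percent_reduction(NON_REDUCED, intensity)))
```

Rounded to whole percent, this gives 50, 74, and 95 for the L, M, and H samples, as reported.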

Table 4. Disulfide bonds 2-4 related peptides observed in Lys-C digests

Linked Peptides (a) | Disulfide Bond | Identified Free C | RT (Min) | Theoretical Mass (Da) (b) | Measured Mass (Da) | Mass Error (ppm) | Intensity (Counts)
LC K6=K7 | one | LC C50 | 21.8 | 1980.7611 | 1980.7595 | -0.8 | 246440
LC K6=K7 | one | LC C55 | 22.9 | 1980.7611 | 1980.7589 | -1.1 | 26798
LC K8=K9 | one | LC C70 | 58.2 | 2969.2882 | 2969.2927 | 1.5 | 664844
LC K6=K7=K8 | two | LC C50 | 44.6 | 3815.4999 | 3815.5051 | 1.4 | 249560
LC K8 | none | LC C70 and C72 | 62.9 | 2086.8499 | 2086.8462 | -1.8 | 116405
LC K9 | none | LC C81 | 42.9 | 1134.5492 | 1134.5472 | -1.8 | 471687

a: LC K6=K7 means peptides are linked by disulfide bond.
b: Theoretical mass was based on hydroxyl-D (+15.9949 Da) in LC K8=K9 (see Table 1 for details); one C-NEM in LC K6=K7, LC K8=K9, and LC K9; and two C-NEM in LC K8. The mass change for each cysteine alkylation is +125.0477 Da.
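The theoretical masses in Table 4 can be rebuilt from the peptide sequences by summing monoisotopic residue masses, adding one water per peptide, subtracting two hydrogens per disulfide bond, and adding +125.0477 Da per NEM-alkylated cysteine (footnote b). The sketch below is an illustration under those assumptions; the residue masses are standard monoisotopic values rounded to five decimals, and the function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da), limited to the residues needed here.
MONO = {
    "G": 57.02146, "S": 87.03203, "P": 97.05276, "T": 101.04768,
    "C": 103.00919, "L": 113.08406, "N": 114.04293, "D": 115.02694,
    "Q": 128.05858, "K": 128.09496, "E": 129.04259, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333,
}
WATER = 18.01056
HYDROGEN = 1.007825
NEM = 125.0477  # mass change per cysteine alkylation, Table 4 footnote b


def peptide_mass(sequence):
    """Monoisotopic mass of a linear, unmodified peptide."""
    return sum(MONO[residue] for residue in sequence) + WATER


def linked_mass(peptides, n_disulfide=0, n_nem=0):
    """Mass of peptides joined by disulfide bonds, with NEM-labeled free cysteines."""
    total = sum(peptide_mass(p) for p in peptides)
    return total - 2 * HYDROGEN * n_disulfide + NEM * n_nem


# LC K6=K7: one disulfide bond, one free cysteine labeled with NEM.
print(round(linked_mass(["DGDQCETSPCQNQGK", "CK"], n_disulfide=1, n_nem=1), 4))
# LC K9: no disulfide bond, C81 labeled with NEM.
print(round(linked_mass(["NCELFTRK"], n_nem=1), 4))
```

Both results agree with the Table 4 theoretical masses (1980.7611 Da and 1134.5492 Da) to within 0.001 Da. Entries that also carry hydroxyl-D or glycosylation would need the corresponding shift from Table 1 added on top.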

Table 5. Disulfide bonds 5-8 related peptides observed in Lys-C/trypsin digests

Linked Peptides (a) | Disulfide Bond | Identified Free C | RT (Min) | Theoretical Mass (Da) (b) | Measured Mass (Da) | Mass Error (ppm) | Intensity (Counts)
LC K10c9=K11 | three | LC C132 | 50.4 | 4360.7243 | 4360.7075 | -3.9 | 59867
LC K10c9 | two | LC C111 | 45.6 | 3157.1827 | 3157.1726 | -3.3 | 20554
LC K10c9 | one | LC C109 and C111 | 54.8 | 3409.2937 | 3409.2798 | -4.1 | 73640
LC K11 | none | LC C124 and C132 | 51.5 | 1455.6527 | 1455.6531 | 0.3 | 676629
HC K7n6c11 | none | HC C108 | 47.8 | 1324.6268 | 1324.6237 | -2.3 | 901482

a: LC K10c9=K11 means peptides are linked by disulfide bond. LC K10c9 = 9 C-terminal residues clipped; HC K7n6c11 = 6 N-terminal residues and 11 C-terminal residues clipped. Both peptides were generated by trypsin digestion.
b: Theoretical mass was based on one glycosylation (+162.0528 Da) in LC K10c9; one C-NEM in LC K10c9=K11, LC K10c9 (with two disulfide bonds), and HC K7n6c11; two C-NEM in LC K11; and three C-NEM in LC K10c9 (with one disulfide bond). The mass change for each cysteine alkylation is +125.0477 Da (the possible glycosylation site is S106; the modification was previously determined with the reduced sample).

Table 6. Free cysteines observed from a fully reduced Lys-C digest

Peptide | Sequence (a) | Start | End | Modification | RT (Min) | Theoretical Mass (Da) | Measured Mass (Da) | Mass Error (ppm) | Intensity (Counts) | % Intensity (b)
LC K6 | DGDQCETSPCQNQGK | 46 | 60 | none | 19.33 | 1608.6144 | 1608.6169 | 1.6 | 779291 |
LC K6 | DGDQCETSPCQNQGK | 46 | 60 | C-NEM | 25.86 | 1733.6621 | 1733.6606 | -0.9 | 27003 | 3.3
LC K6 | DGDQCETSPCQNQGK | 46 | 60 | C-NEM | 25.43 | 1733.6621 | 1733.6586 | -2.0 | 12301 | 1.5
LC K6 | DGDQCETSPCQNQGK | 46 | 60 | C-NEM | 25.12 | 1733.6621 | 1733.6606 | -0.9 | 10988 | 1.3
LC K9 | NCELFTRK | 80 | 87 | none | 34.97 | 1009.5015 | 1009.5044 | 2.9 | 1159303 |
LC K9 | NCELFTRK | 80 | 87 | C-NEM | 42.89 | 1134.5492 | 1134.5500 | 0.7 | 48700 | 3.9
LC K9 | NCELFTRK | 80 | 87 | C-NEM | 42.33 | 1134.5492 | 1134.5494 | 0.2 | 38012 | 3.1
LC K11 | ACIPTGPYPCGK | 123 | 134 | none | 40.96 | 1205.5573 | 1205.5594 | 1.7 | 2291299 |
LC K11 | ACIPTGPYPCGK | 123 | 134 | C-NEM | 46.60 | 1330.6050 | 1330.6061 | 0.8 | 82857 | 3.5
HC K1 | IVGGQECK | 1 | 8 | none | 20.26 | 832.4113 | 832.4093 | -2.4 | 616082 |
HC K1 | IVGGQECK | 1 | 8 | C-NEM | 27.50 | 957.4590 | 957.4586 | -0.4 | 6374 | 1.0
HC K2 | DGECPWQALLINEENEGFCGGTILSEFYILTAAHCLYQAK | 9 | 48 | none | 95.95 | 4446.0645 | 4446.0820 | 3.9 | 519092 |
HC K2 | DGECPWQALLINEENEGFCGGTILSEFYILTAAHCLYQAK | 9 | 48 | C-NEM | 97.54 | 4571.1123 | 4571.1255 | 2.9 | 15367 | 2.8
HC K7 | TPITFRMNVAPACLPERDWAESTLMTQK | 96 | 123 | none | 66.14 | 3205.5723 | 3205.5774 | 1.6 | 6871224 |
HC K7 | TPITFRMNVAPACLPERDWAESTLMTQK | 96 | 123 | C-NEM | 68.07 | 3330.6199 | 3330.6235 | 1.1 | 148843 | 2.1
HC K10 | MLEVPYVDRNSCK | 145 | 157 | none | 46.08 | 1552.7378 | 1552.7390 | 0.8 | 3598414 |
HC K10 | MLEVPYVDRNSCK | 145 | 157 | C-NEM | 49.58 | 1677.7855 | 1677.7831 | -1.4 | 64293 | 1.8
HC K11 | LSSSFIITQNMFCAGYDTK | 158 | 176 | none | 64.65 | 2124.9861 | 2124.9871 | 0.5 | 4934018 |
HC K11 | LSSSFIITQNMFCAGYDTK | 158 | 176 | C-NEM | 67.67 | 2250.0337 | 2250.0305 | -1.4 | 82120 | 1.6
HC K12 | QEDACQGDSGGPHVTRFK | 177 | 194 | none | 30.65 | 1930.8591 | 1930.8586 | -0.3 | 2411390 |
HC K12 | QEDACQGDSGGPHVTRFK | 177 | 194 | C-NEM | 34.82 | 2055.9067 | 2055.9055 | -0.6 | 64194 | 2.6
HC K13 | DTYFVTGIVSWGEGCARK | 195 | 212 | none | 61.74 | 1987.9462 | 1987.9460 | -0.1 | 3899963 |
HC K13 | DTYFVTGIVSWGEGCARK | 195 | 212 | C-NEM | 64.31 | 2112.9939 | 2112.9919 | -0.9 | 63863 | 1.6

a: Modification site is in red. C-NEM isomers were detected in LC K6 and LC K9. Therefore, free cysteines should be the sum of the isomers.
b: % Intensity = 100 x modified / (unmodified + sum of modified species). The reporting limit is 1%.
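Footnote b of Table 6 defines % Intensity for each modified species relative to the unmodified peptide plus all of its modified forms. The sketch below applies that definition to the LC K6 rows; the function name is hypothetical and the intensities are the counts listed in Table 6.

```python
def percent_intensity(unmodified, modified):
    """% Intensity = 100 x modified / (unmodified + sum of modified species)."""
    total = unmodified + sum(modified)
    return [100.0 * m / total for m in modified]


# LC K6 (DGDQCETSPCQNQGK): unmodified peptide and its three C-NEM species.
lc_k6_unmodified = 779291
lc_k6_nem = [27003, 12301, 10988]

per_species = percent_intensity(lc_k6_unmodified, lc_k6_nem)
print([round(p, 1) for p in per_species])

# Footnote a: the free cysteine level for a peptide is the sum over its C-NEM species.
print(round(sum(per_species), 1))
```

The per-species values come out as 3.3, 1.5, and 1.3, matching the LC K6 rows, and their sum (about 6.1%) is the total free cysteine level for that peptide under footnote a.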