Biochimica et Biophysica Acta 1834 (2013) 1474–1483
Mass spectrometry investigation of glycosylation on the NXS/T sites in recombinant glycoproteins
Izabela Sokolowska, Armand G. Ngounou Wetie, Urmi Roy, Alisa G. Woods, Costel C. Darie ⁎
Biochemistry & Proteomics Group, Department of Chemistry & Biomolecular Science, Clarkson University, 8 Clarkson Avenue, Potsdam, NY 13699-5810, USA
Article info
Article history: Received 24 February 2013; received in revised form 15 April 2013; accepted 22 April 2013; available online 28 April 2013.
Keywords: Recombinant protein; Post-translational modification; Glycosylation; Mass spectrometry; Targeted proteomics
Abstract
We used a targeted proteomics approach to investigate whether the introduction of new N-linked glycosylation sites in a chimeric protein influences the glycosylation of the existing glycosylation sites. To this end, we over-expressed and purified a chimeric construct that contained the Fc region of the IgG fused to exons 7 & 8 of mouse ZP3 (IgG-Fc-ZP3E7 protein). Immunoglobulin heavy chain (IgG-HC protein) was used as control. We then analyzed the IgG-HC and IgG-Fc-ZP3E7 proteins by liquid chromatography-tandem mass spectrometry (LC–MS/MS) and by Western blotting (WB). We concluded that in control experiments, the glycosylation site was occupied as expected. However, in the IgG-Fc-ZP3E7 protein, we concluded that only one out of three NXS/T glycosylation sites is occupied by N-linked oligosaccharides. We also concluded that in the IgG-Fc-ZP3E7 protein, upon introduction of additional potential NXS/T glycosylation sites within its sequence, the original NST glycosylation site from the Fc region of the IgG-Fc-ZP3E7 protein is no longer glycosylated. The biomedical significance of our findings is discussed. © 2013 Elsevier B.V. All rights reserved.
1. Introduction
To regulate the solubility of recombinant therapeutic proteins or of immunoglobulin (IgG)-based recombinant chimeric proteins used as therapeutics, artificial glycosylation such as modification of proteins by polyethylene glycol (PEG) or PEG-ylation is a standard procedure. However, the reproducibility of batches of recombinant proteins or chimeric proteins produced for therapeutic use is critical for their commercialization. Therefore, intense efforts have been made by protein chemists, biochemists and mass spectrometrists, focused on the full structural characterization of these recombinant chimeric proteins. The methods of choice are usually biochemical and, whenever possible, mass spectrometry [1–9]. One important feature of the IgG-based chimeric proteins is the formation of the correct disulfide bridges within the chimeric protein. For a protein, failure to have correct disulfide connectivities may lead to changes in the three-dimensional structure, as well as formation of
Abbreviations: ZP, zona pellucida; SP, signal peptide; FLCS, furin-like cleavage site; TM, transmembrane; IgG-HC protein, immunoglobulin heavy chain protein; IgG-Fc-ZP3E7 protein, the protein product of the Fc region of the IgG fused to the exons 7 & 8 of mouse ZP3; ZP3E7, polypeptide that corresponds to exon 7 from ZP3; SDS-PAGE, sodium dodecyl sulfate-polyacrylamide gel electrophoresis; WB, Western blotting; ECL, enhanced chemiluminescence; MS, mass spectrometry; LC–MS/MS, liquid chromatography tandem mass spectrometry; m/z, mass/charge; CID, collision-induced dissociation; DDA, data dependent analysis; IDA, information dependent analysis (DDA with inclusion list); TIC, total ion chromatogram; XIC, extracted ion chromatogram
⁎ Corresponding author. Tel.: +1 315 268 7763; fax: +1 315 268 6610. E-mail address:
[email protected] (C.C. Darie). 1570-9639/$ – see front matter © 2013 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.bbapap.2013.04.022
disulfide-linked multimeric proteins [4,6,8,10–13]. As such, changes in disulfide linkages can lead to changes in the solubility, half-life and renal clearance, and hence the therapeutic effectiveness of the chimeric protein. Another important feature of the IgG-based chimeric proteins is the NXS/T N-linked glycosylation site in the IgG part of the protein. The asparagine residue of this site is almost always glycosylated in the recombinant chimeric protein, which also increases the protein's solubility. Therefore, full characterization of the chimeric protein and its NXS/T glycosylation site is a mandatory step. A fair assumption is that the NXS/T glycosylation site is occupied by an oligosaccharide and that one must determine the structure of the oligosaccharide. However, this is not always the case, and the site may also be unoccupied. In addition, if the chimeric protein sequence contains more than one NXS/T glycosylation site, then one should consider that, within the three-dimensional structure of the chimeric protein, the sites that are expected to be glycosylated may no longer be glycosylated. Here, we performed a targeted MS analysis of the IgG-Fc-ZP3E7 protein to investigate the glycosylation sites for N-linked oligosaccharides. IgG-Fc-ZP3E7 is a fusion between the IgG Fc region and exon 7 of the zona pellucida protein 3; zona pellucida 3 is the protein responsible for egg–sperm interaction and fertilization [2,10,11,14–16]. We overexpressed the chimeric construct IgG-Fc-ZP3E7 and purified the protein product, and then analyzed it by LC–MS/MS using data dependent analysis (DDA) and information dependent analysis (IDA; DDA with an inclusion list). We also analyzed the control IgG-HC by the same approach.
This experimental approach allowed us to identify the N-linked glycosylation sites in both IgG-HC and IgG-Fc-ZP3E7 proteins. We concluded that in IgG-HC, the single predicted NST glycosylation site is occupied by an N-linked oligosaccharide. We also concluded that in IgG-Fc-ZP3E7 protein, one out of three NXS/T glycosylation sites is occupied by an N-linked oligosaccharide. These studies also allowed us to conclude that addition of new potential NXS/T glycosylation sites within the sequence of IgG-Fc-ZP3E7 protein prevents the glycosylation of the original NST glycosylation site from the Fc region of the IgG-Fc-ZP3E7 protein. The significance of this finding for biotechnology and for pharmaceuticals is discussed.

2. Materials and methods

2.1. Reagents
Reagents were obtained from the following commercial sources. Cell culture reagents were from Invitrogen (Carlsbad, CA). AspN, goat anti-mouse IgG-HRP, formic acid, and acetonitrile were from Sigma-Aldrich (St. Louis, MO), trypsin from Roche Applied Science (Indianapolis, IN), peptide-N-glycosidase (PNGase F) from New England Biolabs (Beverly, MA), SDS-PAGE gels from Bio-Rad (Hercules, CA), Mr markers from Bio-Rad, Invitrogen and New England Biolabs, and nitrocellulose membranes and enhanced chemiluminescence (ECL) kits from Amersham Pharmacia Biotech (Piscataway, NJ).

2.2. Construction of IgG-Fc-ZP3E7 chimeric construct
A chimeric construct was designed in which the gene responsible for the Fc portion of human IgG1 heavy chain was joined to exons 7 and 8 of the mouse ZP3 (mZP3). The chimeric construct was named IgG-Fc-ZP3E7 [Note: Polypeptide encoded by exon 8 is removed by excision at the FLCS and is not found in the mature, secreted ZP3 or in IgG-Fc-ZP3 protein product, so for simplicity, we named the chimeric construct IgG-Fc-ZP3E7 not IgG-Fc-ZP3E78 (although it contains exon 8)].
An IgG fusion strategy that facilitates purification and characterization of the fusion proteins has been used successfully by other investigators [17]. The extensive procedure for building the IgG-Fc-ZP3E7 chimeric construct was described elsewhere [18]. For stable transfection, the IgG-Fc-ZP3E7 construct was linearized with HindIII/EcoRI and then introduced into EC cells by electroporation, and a stably transfected EC cell line was obtained.

2.3. Secretion, purification, and characterization of IgG-Fc-ZP3E7
EC cells, stably transfected with IgG-Fc-ZP3E7, were cultured under standard conditions and the serum-free supernatant was collected and purified over an IgG-affinity column. The full procedure is described elsewhere [18]. Briefly, supernatant containing recombinant protein was collected after culturing EC cells in serum-free culture medium for 20–24 h. Supernatant was screened for IgG-Fc-ZP3E7 by purification over an ImmunoPure Immobilized Protein G (on agarose) column (Pierce) as described by the manufacturer; 50 ml supernatant was centrifuged at low speed at room temperature and filtered (Millipore), 50 ml ImmunoPure (A/G) IgG binding buffer (Pierce) was added, and the mixture was run over a column packed with 0.5 ml of settled Protein G (~1 ml of gel slurry). The flow-through was collected, applied once more to the Protein G column and then discarded. To remove unbound protein, the column was washed with ImmunoPure (A/G) IgG binding buffer and bound proteins were eluted with ImmunoPure IgG elution buffer (Pierce). Neutralizing buffer (1 M Tris, pH 9) was added (100 μl/ml) immediately upon elution to adjust eluted fractions to physiological pH. Eluted fractions were pooled, dialyzed extensively against H2O at 4 °C, lyophilized, and stored frozen. Aliquots of bound IgG protein were analyzed by SDS-PAGE and Western immunoblotting to determine whether EC
cells synthesized and secreted IgG-Fc-ZP3E7 protein. Here we will use the term IgG-Fc-ZP3E7 protein for the protein product of the IgG-Fc-ZP3E7 construct. The heavy chain of the IgG1 (IgG-HC) used as a control was purchased from Sigma.

2.4. N-glycanase digestion
Aliquots of IgG-Fc-ZP3E7 (100–200 ng) and IgG-HC (500 ng) were dissolved in denaturing buffer containing 5% SDS and 10% β-mercaptoethanol (final concentrations of 0.5% SDS and 1% β-mercaptoethanol). The sample was boiled for 10 min and cooled, and one-tenth volume of 0.5 M sodium phosphate, pH 7.5, one-tenth volume of 50 mM Chaps, and 500 U of Peptide:N-Glycosidase F (New England BioLabs) were added (total volume ~15 μl). The sample was incubated at 37 °C for 24 h, then placed at −20 °C, and stored frozen prior to analysis by SDS-PAGE and WB.

2.5. SDS-PAGE and WB
The glycosylated and de-glycosylated proteins were separated by SDS-PAGE and then stained by Coomassie, according to published procedures. The protein gel bands of interest were excised and subjected to enzymatic digestion. SDS-PAGE gels were also electroblotted on PVDF membrane (Millipore, Bedford, MA) and probed with antibodies against human IgG (Sigma) or polypeptide that corresponds to exon 7 of mouse ZP3 (ZP3E7) [18] or ConA-HRP (ConA, Sigma). The immune reaction was visualized by ECL reaction kit (Pierce, Rockford, IL).

2.6. Enzymatic digestion of proteins for MS-based analysis
Digestion of gel pieces containing individual proteins with proteases was carried out using published protocols [3,5,19–21]. Gel pieces containing 0.05–1 μg of protein were excised from the gel and incubated with 60% (v/v) acetonitrile for 20 min, dried completely in a SpeedVac evaporator, and rehydrated for 10 min with digestion buffer (25 mM ammonium bicarbonate, pH 8.0). This procedure was repeated three times. After drying, gel pieces were again rehydrated in digestion buffer containing 10 mM DTT and incubated for 1 h at 56 °C.
Reduced cysteine residues in the proteins were blocked by replacing the DTT solution with 100 mM iodoacetamide in 25 mM ammonium bicarbonate, pH 8.0, for 45 min, in the dark, at room temperature, with occasional vortexing. Gel pieces were then dehydrated, dried and subjected to digestion for 16–18 h at 37 °C in digestion buffer containing 15 ng/μl of trypsin (cleaves at the C-termini of R and K residues). The gel pieces were then dehydrated, dried and subjected to a second digestion for 16–18 h at 37 °C in digestion buffer containing 15 ng/μl of AspN (cleaves at the N-termini of D and E residues). Following digestion, peptides were extracted twice from gel pieces by addition of 200 μl of 60% acetonitrile/5% formic acid in 25 mM ammonium bicarbonate, pH 8.0, and shaking for 40–60 min at room temperature. Solutions containing peptide mixtures were then dried in a SpeedVac concentrator, solubilized in 20 μl of 0.1% formic acid/2% acetonitrile and cleaned with a P10 ZipTip μ-C18 (Millipore Corporation, Billerica, MA). For LC–MS/MS analysis, the samples were resuspended in 10 μl of 0.1% formic acid in 2% acetonitrile.

2.7. MS, protein identification and data analysis
The resulting peptide mixture was analyzed by reverse phase liquid chromatography (LC) and MS (LC–MS/MS) using a nanoACQUITY UPLC coupled directly to a Q-Tof Premier MS or Q-Tof Micro MS (Waters, Milford, MA). The procedure used was previously described [3,7,22,23]. Briefly, the peptides were loaded onto a 100 μm × 10 mm nanoACQUITY BEH130 C18 1.7 μm UPLC column (Waters, Milford, MA) and eluted over a 120-min gradient of 10–85% acetonitrile in 0.1% formic acid at a flow rate of 250 nl/min. The column was coupled to a Picotip
Emitter Silicatip nano-electrospray needle (New Objective, Woburn, MA). MS data acquisition involved survey MS scans and automatic data dependent MS/MS of 2+, 3+ or 4+ ions (data dependent analysis, DDA). The MS/MS was triggered when the MS signal intensity exceeded 10 counts/s. In survey MS scans, the three most intense peaks were selected for CID and fragmented until the total MS/MS ion counts reached 10,000 or for up to 6 s each. Initially, the peptide mixtures were analyzed using the above described procedure for identification of most peptides from the peptide mixture. Subsequently, information dependent analysis (IDA) was targeted towards specific peptides that contain the N-linked glycosylation sites in either the NXS/T or the DXS/T state. If an NXS/T site is occupied by an N-linked oligosaccharide, then upon PNGaseF digestion, the asparagine residue of the NXS/T site is converted to aspartate and the NXS/T sequence becomes a DXS/T sequence. In MS terms, this conversion represents an increase of approximately 1 Da (0.984 Da monoisotopic) for the peptides that contain such a sequence. Therefore, we created an inclusion list where we instructed the mass spectrometer to preferentially select and fragment the peaks that may correspond to the peptides with the NXS/T or DXS/T sequences. In addition, we analyzed peptide mixtures where we instructed the mass spectrometer to select and fragment only the peaks listed in the inclusion list. Additional LC–MS/MS experiments, using the same peptide mixture as previously used, were performed using an Alliance 2695 HPLC and a nanoACQUITY UPLC (both from Waters Corp) coupled with a Q-TOF Micro-MS. The MS parameters were the same as described in the previous experiments.
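The inclusion-list idea described above can be illustrated with a short Python sketch. It is not part of the published workflow: the function names (`peptide_mass`, `mz`, `inclusion_entries`) are hypothetical, the residue masses are standard monoisotopic values, and the target peptide 92EEQYNSTYR100 is taken from Section 3.5. The sketch computes the expected m/z of the unmodified (N) and PNGaseF-converted (D) forms of a peptide at the charge states of interest.

```python
# Standard monoisotopic residue masses in Da (assumed reference values).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O added to the sum of residue masses
PROTON = 1.007276   # mass of one charge-carrying proton


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(MONO[aa] for aa in seq) + WATER


def mz(seq, charge):
    """m/z of the [M + zH]z+ ion."""
    return (peptide_mass(seq) + charge * PROTON) / charge


def inclusion_entries(seq, site_index, charges=(2, 3)):
    """m/z values for the N form and the D form (Asn -> Asp at site_index, 0-based)."""
    if seq[site_index] != "N":
        raise ValueError("site_index must point at an Asn residue")
    d_form = seq[:site_index] + "D" + seq[site_index + 1:]
    return {
        (label, z): round(mz(s, z), 4)
        for label, s in (("N", seq), ("D", d_form))
        for z in charges
    }


entries = inclusion_entries("EEQYNSTYR", 4)
for (label, z), value in sorted(entries.items()):
    print(f"{label} form, {z}+: m/z {value}")
```

For the doubly charged ion, the D form comes out near m/z 595.75 and the N form near m/z 595.26, which can be compared with the m/z 595.70 peak reported in Section 3.5. The two forms differ by only about 0.49 m/z units at 2+, which is why both values are listed separately.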
The column that we used for the Alliance 2695 HPLC was an XBridge™ C18 3.5 μm, 2.1 × 100 mm column (Waters Corporation) and the HPLC was operated at a flow rate of 200 μl/min, while the column used for the nanoACQUITY UPLC was a 100 μm × 10 mm nanoACQUITY BEH130 C18 1.7 μm UPLC column (Waters Corporation) and the UPLC was operated at 400 nl/min. The raw data were processed using ProteinLynx software version 2.2.5 with the following parameters: background subtraction of polynomial order 5 adaptive with a threshold of 35%, two smoothings with a window of three channels in Savitzky–Golay mode and centroid calculation of top 80% of peaks based on a minimum peak width of 4 channels at half height. The resulting pkl files were submitted for database searching and protein identification to an in-house Mascot server (version 2.2.1, Matrix Science, London, UK) using the following parameters: human database from NCBI, parent mass error of 50 ppm, product ion error of 0.15 Da, enzyme used: trypsin and AspN, one missed cleavage, and cysteines modified to carbamidomethyl as fixed modification. To identify whether the N-linked glycosylation sites are occupied by oligosaccharides, we created a ZP database where the IgG-Fc-ZP3E7 protein contained the sequence NXS/T or DXS/T. The ZP database was also used for database search using PLGS software, version 2.2.5 (Waters Corporation). The MS peaks corresponding to peptides containing glycosylation sites, and the MS/MS spectra produced by their fragmentation, that were identified by Mascot and PLGS searches in the ZP database were further confirmed by manual inspection of the MS/MS spectra within the raw data using MassLynx software (version 4.1, Waters Corporation). In addition, the raw data were also processed using PLGS software version 2.4 (Waters Corporation), utilizing the same processing parameters described above.
The processed data were further analyzed using a customized ZP database where the IgG-Fc-ZP3E7 protein contained the sequence NXS/T or DXS/T. The results from the searches with the PLGS 2.4 software using the customized ZP database were further confirmed by manual inspection of the raw MS/MS spectra using MassLynx software (version 4.1, Waters Corporation).

3. Results & discussion

3.1. Polypeptide encoded by the exon 7 of ZP3 contains potential glycosylation sites for N-linked oligosaccharides
All zona pellucida (ZP) proteins are modular proteins that are intensely post-translationally modified (Fig. 1A). One of these modifications is
glycosylation, and the protein product of exon 7 of ZP3 (ZP3E7) contains two glycosylation sites. Therefore, we investigated whether the glycosylation sites within ZP3E7 influence the glycosylation of the Fc region of the IgG (or vice versa). To do this, we built the IgG-Fc-ZP3E7 construct and analyzed its protein product. We also analyzed the control IgG-HC protein. A schematic of the theoretical and processed masses of the IgG-HC and IgG-Fc-ZP3E7 polypeptides is shown in Fig. 1B. The IgG-HC protein has a theoretical molecular mass of 52 kDa and is post-translationally processed: upon removal of the signal sequence and glycosylation at the N residue of the NST sequence, it becomes the mature IgG-HC protein. Here, we use IgG-HC to refer to the fully processed IgG-HC, unless otherwise specified. The molecular mass of the naked, mature IgG-HC protein is 50 kDa. The IgG-Fc-ZP3E7 construct encodes a 42.6 kDa IgG-Fc-ZP3E7 protein which is post-translationally processed: upon removal of the signal sequence and the hydrophobic domain, the molecular mass of the naked, mature IgG-Fc-ZP3E7 protein becomes 32.5 kDa. The actual molecular mass of this protein is higher, due to the oligosaccharide residues at the N- and/or O-glycosylation sites. While the IgG-HC protein has only one potential glycosylation site at the Asn residue within the NST sequence (96NST98), the IgG-Fc-ZP3E7 protein has three potential glycosylation sites for N-linked oligosaccharides: 96NST98, 288NCS290 and 291NSS293 (Fig. 1C).
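The three potential sites listed above can be recovered directly from the sequence in Fig. 1C. The following Python sketch is an illustration only, not the software used in this study; the function name `find_sequons` is hypothetical, and the pattern follows the common convention that X in NXS/T is any residue except proline (an assumption that does not change the result for this sequence).

```python
import re

# IgG-Fc-ZP3E7 sequence as printed in Fig. 1C (320 residues, numbering
# includes the signal sequence); whitespace is stripped on load.
IGG_FC_ZP3E7 = "".join("""
MPMGSLQPLA TLYLLGMLVA SVLGAAATMA PELLGGPSVF LFPPKPKDTL MISRTPEVTC
VVVDVSHEDP EVKFNWYVDG VEVHNAKTKP REEQYNSTYR VVSVLTVLHQ DWLNGKEYKC
KVSNKALPAP IEKTISKAKG QPREPQVYTL PPSRDELTKN QVSLTCLVKG FYPSDIAVEW
ESNGQPENNY KTTPPVLDSD GSFFLYSKLT VDKSRWQQGN VFSCSVMHEA LHNHYTQKSL
SLSPGLQLDE TCAEAQDGEL DGLWTTDPPS WLPVEGDADI CDCCSHGNCS NSSSSQFQIH
GPRQWSKLVS RNRRHVTDEA
""".split())


def find_sequons(seq):
    """Return (1-based position of Asn, three-residue motif) for each N-X-S/T motif.

    A zero-width lookahead is used so that overlapping motifs would all be found.
    """
    return [
        (m.start() + 1, seq[m.start():m.start() + 3])
        for m in re.finditer(r"(?=N[^P][ST])", seq)
    ]


for position, motif in find_sequons(IGG_FC_ZP3E7):
    print(f"{position}{motif}{position + 2}")
# prints 96NST98, 288NCS290 and 291NSS293, one per line
```

The other Asn residues in the sequence (for example N75, N85 and N114) are not followed by Ser or Thr at the +2 position and are therefore not reported.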
3.2. IgG-HC and IgG-Fc-ZP3E7 proteins are glycosylated
In order to investigate whether IgG-HC and IgG-Fc-ZP3E7 proteins are glycosylated, we treated them with PNGaseF, separated them by SDS-PAGE and analyzed them by WB using different antibodies. PNGaseF cleaves N-linked oligosaccharides from the glycosylation sites that have the consensus sequence NXS/T, where X may be any amino acid. Control, PNGaseF-untreated IgG-HC and IgG-Fc-ZP3E7 proteins were also analyzed. The outcome of this experiment is shown in Fig. 2. When the WB membrane was incubated with ConA, the PNGaseF-untreated IgG-HC and IgG-Fc-ZP3E7 proteins produced a chemiluminescent reaction at around 56–58 kDa. ConA (Concanavalin A) recognizes mannose-type oligosaccharides. However, the PNGaseF-treated IgG-HC and IgG-Fc-ZP3E7 proteins did not produce any reaction. The 32 kDa band in the gel lane containing the PNGaseF-treated proteins is due to the PNGaseF enzyme and is not observed in the gel lanes of PNGaseF-untreated proteins. This suggests that IgG-HC and IgG-Fc-ZP3E7 proteins are glycosylated and contain N-linked oligosaccharides that are removed upon PNGaseF treatment. To confirm our findings, we stripped the WB membrane and re-probed it with anti-human IgG antibodies. As observed in Fig. 2, anti-human IgG antibodies stained both IgG-HC and IgG-Fc-ZP3E7 proteins, either treated or untreated with PNGaseF. The difference in the molecular mass between the untreated (56 kDa) and treated (53 kDa) IgG-HC protein (about 3 kDa) suggests that IgG-HC is glycosylated at the only potential 96NST98 N-linked glycosylation site, and the oligosaccharide group is removed upon PNGaseF treatment (as shown by ConA immunoblot). Additional bands were also observed in the WB lane with PNGaseF-treated IgG-HC protein, but since there are no additional potential N-linked glycosylation sites, these additional bands may be degradation products of PNGaseF-treated IgG-HC protein.
Within the same WB, in the gel lane containing the PNGaseF-untreated IgG-Fc-ZP3E7, we identified this protein as a single band at about 60–65 kDa, while in the gel lane containing the PNGaseF-treated IgG-Fc-ZP3E7, we identified this protein as a main band at around 48 kDa; two additional faint bands were also observed at around 38 and 32 kDa. IgG-Fc-ZP3E7 protein contains three potential glycosylation sites for N-linked oligosaccharides. Therefore, the difference in the molecular mass between the untreated (60–65 kDa) and PNGaseF-treated (48 kDa) IgG-Fc-ZP3E7 (about 12–17 kDa) is due to the PNGaseF treatment. This suggests that at least
[Fig. 1, panels A–C. Panels A and B are schematics; panel C is the amino acid sequence below, with the number of the last residue of each row at the right.]

MPMGSLQPLA TLYLLGMLVA SVLGAAATMA PELLGGPSVF LFPPKPKDTL MISRTPEVTC  60
VVVDVSHEDP EVKFNWYVDG VEVHNAKTKP REEQYNSTYR VVSVLTVLHQ DWLNGKEYKC 120
KVSNKALPAP IEKTISKAKG QPREPQVYTL PPSRDELTKN QVSLTCLVKG FYPSDIAVEW 180
ESNGQPENNY KTTPPVLDSD GSFFLYSKLT VDKSRWQQGN VFSCSVMHEA LHNHYTQKSL 240
SLSPGLQLDE TCAEAQDGEL DGLWTTDPPS WLPVEGDADI CDCCSHGNCS NSSSSQFQIH 300
GPRQWSKLVS RNRRHVTDEA 320

Fig. 1. A: Schematic of the processing of mammalian ZP3 protein and its fish homologue VE gamma. The ZP3 protein contains a signal sequence (SS), a zona pellucida (ZP) domain, a furin-like cleavage site (FLCS) and a hydrophobic domain (HD). Downstream of ZP domain, the amino acid sequence containing FLCS and HD is encoded by exons 7 and 8. Upon synthesis, ZP3 loses its signal sequence, forms disulfide bridges, is glycosylated, and then loses its HD. B: Schematic of the IgG-HC protein and of the polypeptide encoded by IgG-Fc-ZP3E7 construct. IgG-HC has a theoretical mass of 52 kDa that, upon cleavage of the signal peptide, produces a 50 kDa protein. IgG-Fc-ZP3E7 protein contains the Fc region of the IgG fused with exons 7 and 8 of the ZP3 (IgG-Fc-ZP3E7 construct). The unprocessed form of IgG-Fc-ZP3E7 protein has a theoretical mass of 42.6 kDa that, upon processing into the mature form, has a mass of 32.5 kDa. The theoretical masses of IgG-HC and IgG-Fc-ZP3E7 proteins are for their amino acid sequences only, and do not include any N- and/or O-glycosylations or other post-translational modifications such as phosphorylation or acetylation. C: Amino acid sequence of the IgG-Fc-ZP3E7 protein. The signal sequence is shown in orange. The sequence of the IgG-Fc is colored in black. The amino acid sequence encoded by ZP3 exon 7 is colored in blue. The potential N-linked glycosylation sites are colored in green.
one of the three potential glycosylation sites for N-linked oligosaccharides is occupied by N-linked oligosaccharides. To further confirm the results from the previous two WB experiments, we stripped the blot and re-incubated it with anti-ZP3 antibodies. Here we expected no reaction of anti-ZP3 antibodies with either PNGaseF untreated or treated IgG-HC. Instead, we expected an immune
reaction of both untreated and PNGaseF-treated IgG-Fc-ZP3E7. As observed, anti-ZP3 antibodies did not react with the IgG-HC, but reacted with both untreated and PNGaseF-treated IgG-Fc-ZP3E7. The 32 kDa band in the gel lane containing the PNGaseF-treated proteins is due to the PNGaseF enzyme, observed in the ConA-stained WB but not in the gel lanes of PNGaseF-untreated proteins. The difference in the molecular
Fig. 2. Analysis of IgG-HC and IgG-Fc-ZP3E7 by WB. IgG-HC and IgG-Fc-ZP3E7 were treated (+) or not treated (−) with PNGaseF and then separated by SDS-PAGE, followed by WB using antibodies against ConA (Anti-ConA antibodies). The IgG-Fc-ZP3E7 protein was first purified and then treated or not with PNGaseF and then analyzed by WB (2–4 ng of IgG-Fc-ZP3E7 was loaded per lane). IgG-HC was pure protein, purchased from Sigma (8–10 ng of IgG-HC was loaded per lane). The immune reaction was visualized by ECL reaction. The blot was stripped and re-probed with antibodies against IgG (Anti-human IgG Ab) and then stripped again and re-probed with antibodies against ZP3E7 (Anti-ZP3 Ab). The molecular weight markers are shown.
mass between untreated (60–65 kDa) and PNGaseF-treated (48 kDa) IgG-Fc-ZP3E7 is therefore due to the PNGaseF-induced deglycosylation of at least one of the three potential N-linked glycosylation sites. Taken together, these data suggest that the IgG-HC and IgG-Fc-ZP3E7 proteins are glycosylated and their N-linked oligosaccharides are removed upon PNGaseF treatment.

3.3. Analysis of potential glycosylation sites in IgG-HC and IgG-Fc-ZP3E7 proteins by SDS-PAGE
So far, using WB with anti-ConA, anti-human IgG, and anti-ZP3 antibodies, we were able to determine that IgG-HC is glycosylated at the 96NST98 glycosylation site, while IgG-Fc-ZP3E7 is glycosylated at one or more of the three potential glycosylation sites for N-linked oligosaccharides: 96NST98, 288NCS290 and 291NSS293. To further investigate whether these glycosylation sites are occupied by N-linked oligosaccharides, we performed a large scale over-expression and purification of IgG-Fc-ZP3E7 protein. We then treated IgG-Fc-ZP3E7 and IgG-HC proteins with PNGaseF, separated them by SDS-PAGE and stained them with Coomassie dye. PNGaseF-untreated protein samples were used as controls (Fig. 3). As observed, there was a shift in the Coomassie-stained gel bands of both IgG-HC and IgG-Fc-ZP3E7 proteins upon PNGaseF treatment, confirming the previous WB experiments (Fig. 2) and again suggesting that both IgG-HC and IgG-Fc-ZP3E7 proteins are glycosylated and contain N-linked oligosaccharides. To further investigate which glycosylation sites are occupied and to obtain direct evidence for glycosylation (sequence information), we cut the gel bands labeled 1 to 5 (plus sample 1a) from the SDS-PAGE gel and digested them, and the resulting peptide mixtures were analyzed by LC–MS/MS.

3.4. Rational experimental design for identification of potential glycosylation sites in IgG-HC and IgG-Fc-ZP3E7 proteins
The rationale for choosing the best strategy for identification of the potential glycosylation sites lay in the amino acid sequences of the IgG-HC and IgG-Fc-ZP3E7 proteins. After computer-based analysis of the amino acid sequences of these proteins, we concluded that identification of the peptides that contain the 96NST98 glycosylation site should be straightforward in both IgG-HC and IgG-Fc-ZP3E7 proteins, using trypsin digestion and LC–MS/MS analysis using DDA. However, identification of the peptides that contain the other two
[Fig. 3: Coomassie-stained gel. Lanes: IgG-HC without (−) and with (+) PNGaseF; IgG-Fc-ZP3E7 without (−) and with (+) PNGaseF. Excised bands are labeled 1a and 1–5. Molecular weight markers: 230, 130, 95, 72, 56, 36 and 28 kDa.]
Fig. 3. Analysis of IgG-HC and IgG-Fc-ZP3E7 by SDS-PAGE. IgG-HC and IgG-Fc-ZP3E7 were untreated (−) or treated (+) with PNGaseF and then separated by SDS-PAGE, followed by Coomassie staining. The IgG-Fc-ZP3E7 protein was first purified and then treated or not with PNGaseF and then analyzed by SDS-PAGE and Coomassie staining (100–200 ng of IgG-Fc-ZP3E7 was loaded per lane). IgG-HC was pure protein, purchased from Sigma (500 ng of IgG-HC was loaded per lane). Bands marked with a red rectangle in the gel and numbered 1–5 were used for further analysis by LC–MS/MS. The molecular weight markers are shown.
potential glycosylation sites for N-linked oligosaccharides within the IgG-Fc-ZP3E7 protein (288NCS290 and 291NSS293) required a special strategy, presented in Fig. 4. The amino acid sequence encoded by exon 7 of ZP3 that contains the 288NCS290 and 291NSS293 potential glycosylation sites does not contain enough trypsin cleavage sites (trypsin cleaves at the C-termini of Arg and Lys residues; Fig. 4A), and the shortest tryptic peptide that contains these sites is about 7 kDa (from K238 to R303; Fig. 4B), too large for LC–MS/MS analysis using our current MS instruments (Q-TOF Premier and Q-TOF Micro). Therefore, we looked for Asp and Glu residues within this amino acid sequence, in the hope that a trypsin-AspN double digestion would produce a peptide of acceptable size and suitable for MS analysis. AspN cleaves at the N-termini of Asp and Glu residues. As observed, there were many additional AspN cleavage sites within the 7 kDa peptide (Fig. 4C). Therefore, a trypsin-AspN double digestion was our first choice for LC–MS/MS analysis. In addition, using a trypsin-AspN double digestion, we could focus on a shorter peptide that still contained the two potential glycosylation sites for N-linked oligosaccharides (288NCS290 and 291NSS293) within the IgG-Fc-ZP3E7 (from E275 to R303; Fig. 4D). Cleavage of an N-linked oligosaccharide by PNGaseF converts the Asn residue to an Asp residue. Therefore, if the 288NCS290 and 291NSS293 potential glycosylation sites are occupied by an N-linked oligosaccharide, then upon PNGaseF treatment, their sequences should become 288DCS290 and 291DSS293 (Fig. 4E), creating additional AspN cleavage sites.
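The digestion strategy described above can be sketched as an in silico digest of residues 238–315 (the sequence of Fig. 4A). The Python code below is a simplified illustration and not the software used in this study. The helper names (`cleave`, `peptide_covering`, `convert_asn`) are hypothetical, cleavage is assumed to be complete, and the rule that trypsin does not cut before proline is ignored (no K-P or R-P pair occurs in this region).

```python
# Residues 238-315 of IgG-Fc-ZP3E7, as shown in Fig. 4A.
REGION = ("KSLSLSPGLQLDETCAEAQDGELDGLWTTDPPSWLPVEGDADICDCCSHGNCSNSSSSQFQIHGPR"
          "QWSKLVSRNRRH")
START = 238  # residue number of the first letter of REGION


def cleave(seq, offset, after="", before=""):
    """Complete digest: cut C-terminal to residues in `after` and
    N-terminal to residues in `before`. Returns (start, end, peptide)."""
    cuts = {0, len(seq)}
    for i, aa in enumerate(seq):
        if aa in after:
            cuts.add(i + 1)
        if aa in before:
            cuts.add(i)
    ordered = sorted(cuts)
    return [(offset + a, offset + b - 1, seq[a:b])
            for a, b in zip(ordered, ordered[1:])]


def peptide_covering(peptides, position):
    """Return the peptide that contains the given residue number."""
    return next(p for p in peptides if p[0] <= position <= p[1])


def convert_asn(seq, offset, positions):
    """Mimic PNGaseF deglycosylation: Asn -> Asp at the given residue numbers."""
    residues = list(seq)
    for pos in positions:
        assert residues[pos - offset] == "N"
        residues[pos - offset] = "D"
    return "".join(residues)


tryptic = cleave(REGION, START, after="KR")
double = cleave(REGION, START, after="KR", before="DE")
converted = cleave(convert_asn(REGION, START, [288, 291]), START,
                   after="KR", before="DE")

print(peptide_covering(tryptic, 288)[:2])   # tryptic peptide spanning both sites
print(peptide_covering(double, 288))        # trypsin + AspN, sites unoccupied
print(peptide_covering(converted, 291))     # trypsin + AspN after N -> D conversion
```

Under these assumptions the tryptic peptide spans residues 239–303 (the strict rule cuts after K238, whereas Fig. 4B draws the peptide from K238). With complete AspN cleavage, the peptide covering N288 starts at D282; the longer E275–R303 peptide of Fig. 4D would then correspond to missed AspN cleavages, which is an interpretation rather than a statement from the text. After conversion of N291 to D, the digest yields 291DSSSSQFQIHGPR303, the peptide labeled in Fig. 5B.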
3.5. Analysis of the potential 96NST98 glycosylation site in IgG-HC protein by LC–MS/MS
Initially we focused on identifying peptides that contain the single potential N-linked glycosylation site of the IgG-HC protein.
[Fig. 4, panels A–E: peptide sequences.]
A: 238KSLSLSPGLQLDETCAEAQDGELDGLWTTDPPSWLPVEGDADICDCCSHGNCSNSSSSQFQIHGPRQWSKLVSRNRRH315
B: 238KSLSLSPGLQLDETCAEAQDGELDGLWTTDPPSWLPVEGDADICDCCSHGNCSNSSSSQFQIHGPR303
C: 238KSLSLSPGLQLDETCAEAQDGELDGLWTTDPPSWLPVEGDADICDCCSHGNCSNSSSSQFQIHGPR303
D: 275EGDADICDCCSHGNCSNSSSSQFQIHGPR303
E: 275EGDADICDCCSHGDCSDSSSSQFQIHGPR303
Fig. 4. Rational planning for identification of potential N-glycosylation sites in ZP3E7 using enzymatic digestion and LC–MS/MS. A: Amino acid sequence of part of IgG-Fc-ZP3E7, showing the theoretical trypsin cleavage sites (K and R, in red), and potential glycosylation sites for N-linked oligosaccharides (NCS and NSS, in green) within ZP3E7. B: The theoretical peptide that could result from a tryptic digestion and contains the N-linked glycosylation sites (NCSNSS); this peptide is not suitable for MS analysis (too big). C: The size of the peptide shown in (B) that contains the N-linked glycosylation sites (NCSNSS) could be shortened by trypsin-AspN double digestion. AspN enzyme cleaves at the N-termini of Asp and Glu. The theoretical AspN cleavage sites are colored in blue (D and E). D: For analysis of the potential N-linked glycosylation sites on ZP3E7, we focused our efforts on the theoretical peptide that should result from a trypsin-AspN double digestion of the IgG-Fc-ZP3E7 protein, shown here. E: If the peptide shown in (D) has the N-linked glycosylation sites (NCSNSS) occupied by N-linked oligosaccharides, then upon PNGaseF treatment of this peptide, the N residues of the glycosylation sites (NCSNSS) should be converted to D residues. The net change between the peptides shown in (D) and (E) due to glycosylation of N residues will be a gain of 1 Da per site (conversion of N to D). The new sequence (DCSDSS) becomes a substrate for AspN and creates additional cleavage sites for this enzyme. If one site is occupied, then upon PNGaseF treatment of this peptide, only one N residue will be converted to D and the net change will be a gain of 1 Da.
We analyzed both PNGaseF-treated and untreated IgG-HC. We performed a trypsin-AspN double digestion and analyzed the resulting peptide mixtures by LC–MS/MS. The protein that was analyzed is shown in Fig. 3, band 1. LC–MS/MS analysis of the peptide mixture and Mascot database search led to correct identification of the IgG-HC protein (Supplemental Table 1). In addition, we identified a doubly charged peak of m/z 595.70 (2+) that corresponds to peptide 92EEQYDSTYR100. MS/MS fragmentation of this peak produced a series of b and y ions that led to identification of the peptide with the sequence 92EEQYDSTYR100 (Fig. 5A). The total ion chromatogram (TIC), extracted ion chromatogram (XIC), the MS and MS/MS for this peptide are shown in Supplemental Fig. 1. If the 96NST98 glycosylation site had been occupied by an oligosaccharide, then upon PNGaseF digestion the 96NST98 sequence would have been converted to 96DST98. Conversely, if the 96NST98 glycosylation site was not occupied by an oligosaccharide, then upon PNGaseF digestion the 96NST98 sequence would remain unchanged. As observed, the peptide identified by MS/MS contains the sequence 92EEQYDSTYR100. Therefore, the original Asn residue within this peptide was converted by PNGaseF into an Asp residue and the 96NST98 glycosylation site within the IgG-HC was occupied by an oligosaccharide. We also analyzed the PNGaseF-untreated IgG-HC (Fig. 3, band 1a), but did not identify any peptide that contains either the unmodified 96NST98 glycosylation site or the site modified to 96DST98, most likely because the site was occupied by the oligosaccharide residue. Taken together, these data suggest that the 96NST98 glycosylation site is occupied by an oligosaccharide residue which is removed upon PNGaseF digestion.
Fig. 5. (A) LC–MS/MS analysis of the PNGaseF-treated (deglycosylated) IgG-HC for identification of the potential glycosylation sites for N-linked oligosaccharides. PNGaseF-treated IgG-HC (band 1 in Coomassie gel; Fig. 3) was digested by trypsin and then by AspN (trypsin-AspN double digestion) and the resulting peptide mixture was analyzed by LC–MS/MS. A doubly charged peak with m/z of 595.70 (2+), was fragmented by MSMS and produced a series of peaks (product b and y ions) that led to identification of the peptide with the sequence 92EEQYDSTYR100. The MS precursor peak with m/z 595.70 (2+) is also shown (expanded). (B) LC–MS/MS analysis of the PNGaseF-treated (deglycosylated) IgG-Fc-ZP3E7 for identification of the potential glycosylation sites for N-linked oligosaccharides. PNGaseF-treated IgG-Fc-ZP3E7 (band 5 in Coomassie gel; Fig. 3) was digested by trypsin and then by AspN (trypsin-AspN double digestion) and the resulting peptide mixture was analyzed by LC–MS/MS. A doubly charged peak with m/z of 723.29 (2+) was fragmented by MSMS and produced a series of peaks (product y ions) that led to identification of the peptide with the sequence 291DSSSSQFQIHGPR303. The MS precursor peak with m/z 723.29 (2+) is also shown (expanded). (C) LC–MS/MS analysis of the PNGaseF-untreated (glycosylated) IgG-Fc-ZP3E7 for identification of the potential glycosylation sites for N-linked oligosaccharides. PNGaseF-untreated IgG-Fc-ZP3E7 (band 2 in Coomassie gel; Fig. 3) was digested by trypsin and then by AspN (trypsin-AspN double digestion) and the resulting peptide mixture was analyzed by LC–MS/MS. A doubly charged peak with m/z of 595.21 (2+), that corresponds to a peptide with the sequence 92EEQYNSTYR100 was selected for further fragmentation by MSMS. MSMS fragmentation of this peak (precursor ion) produced a series of peaks (product b and y ions) that led to identification of the peptide with the sequence 92EEQYNSTYR100. 
The MS precursor peak with m/z 595.21 (2+) is also shown (expanded).
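The precursor m/z values quoted in this caption can be compared with theoretical monoisotopic values. The sketch below is only an illustration, not part of the authors' analysis: it uses standard monoisotopic residue masses, unmodified peptides are assumed, and the helper name `mz` is hypothetical.

```python
# Monoisotopic residue masses (Da) for the amino acids in the three peptides.
RESIDUE = {
    "G": 57.02146, "S": 87.03203, "P": 97.05276, "T": 101.04768,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "E": 129.04259, "H": 137.05891, "F": 147.06841, "R": 156.10111,
    "Y": 163.06333,
}
WATER = 18.01056   # H2O accounting for the free N- and C-termini
PROTON = 1.00728   # mass of one proton


def mz(peptide, charge=2):
    """Theoretical m/z of an unmodified peptide at the given charge state."""
    neutral = sum(RESIDUE[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


# Precursor values as reported in the Fig. 5 caption.
REPORTED = {"EEQYDSTYR": 595.70, "EEQYNSTYR": 595.21, "DSSSSQFQIHGPR": 723.29}

if __name__ == "__main__":
    for pep, obs in REPORTED.items():
        print(f"{pep:>14}  theoretical {mz(pep):.3f}  reported {obs:.2f}")
```

With these masses the N to D conversion adds about 0.984 Da to the neutral peptide, that is about 0.49 m/z units for a doubly charged ion, which matches the reported spacing between 595.21 and 595.70.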
3.6. Analysis of potential glycosylation sites in IgG-Fc-ZP3E7 protein by LC–MS/MS
We then focused on identifying peptides that contain the three potential N-linked glycosylation sites 96NST98, 288NCS290 and 291NSS293 on the IgG-Fc-ZP3E7 protein. We analyzed both PNGaseF-treated and untreated IgG-Fc-ZP3E7 protein. We performed a trypsin-AspN double digestion and analyzed the resulting peptide mixtures by LC–MS/MS. In a Mascot database search, we identified IgG-Fc-ZP3E7 protein in LC–MS/MS runs of both PNGaseF-treated and untreated protein (Supplemental Table 1). However, in the Mascot database search of the PNGaseF-untreated IgG-Fc-ZP3E7 sample (band 2 from Fig. 3), the first hit was bovine serum albumin, a contaminant that was in the medium and was not entirely removed (Supplemental Table 1). In the search of the PNGaseF-treated IgG-Fc-ZP3E7 sample (band 3 from Fig. 3), bovine serum albumin was the only hit (Supplemental Table 1). In the search of the PNGaseF-treated IgG-Fc-ZP3E7 sample (bands 4 & 5 from Fig. 3), we did identify IgG-Fc-ZP3E7. We also looked for peptides that are part of the polypeptide encoded by exon 8 of ZP3, but we did not identify such peptides, not even when we created a special IDA method in which we instructed the mass spectrometer to look for these peptides (inclusion list; Supplemental Table 1).
3.7. The 291NSS293 glycosylation site on IgG-Fc-ZP3E7 protein is occupied by an oligosaccharide
We analyzed protein gel bands that contained PNGaseF-treated IgG-Fc-ZP3E7 protein, marked in the gel from Fig. 3 as bands #3, #4 and #5. We analyzed bands #4 and #5 because we saw a shift in the WB of IgG-Fc-ZP3E7 protein upon PNGaseF treatment. We also analyzed band #3, because we reasoned that if there is no shift in the molecular mass of IgG-Fc-ZP3E7 protein upon PNGaseF treatment, then we should identify this protein in that band by LC–MS/MS.
In addition, we analyzed band #2, which contains IgG-Fc-ZP3E7 protein without PNGaseF treatment, because we reasoned that if a glycosylation site is not occupied by an oligosaccharide residue, then LC–MS/MS should identify unglycosylated peptides that contain the unmodified site 96NST98, 288NCS290 or 291NSS293. LC–MS/MS analysis of PNGaseF-treated IgG-Fc-ZP3E7 protein (band #5 in SDS-PAGE gel from Fig. 3) identified a peak with m/z of 723.29 (2+) that corresponds to peptide 291DSSSSQFQIHGPR303. MSMS fragmentation of this peak produced a series of b and y peaks that led to identification of peptide 291DSSSSQFQIHGPR303 (Fig. 5B, Supplemental Table 1). The TIC, XIC, MS and MS/MS for this peptide are shown in Supplemental Fig. 2. This peptide contains the potential 291NSS293 glycosylation site, with the Asn residue converted to an Asp residue. Similar results were obtained in LC–MS/MS analysis of PNGaseF-treated IgG-Fc-ZP3E7 protein (band #4 in SDS-PAGE gel from Fig. 3, Supplemental Table 1), but not in LC–MS/MS analysis of PNGaseF-untreated IgG-Fc-ZP3E7 protein (band #2 in SDS-PAGE gel from Fig. 3, Supplemental Table 1). Since this peptide was identified only when the IgG-Fc-ZP3E7 protein was treated with PNGaseF prior to the LC–MS/MS experiments, these data suggest that the 291NSS293 glycosylation site is occupied by an oligosaccharide residue which is removed upon PNGaseF digestion.
3.8. The 96NST98 glycosylation site on IgG-Fc-ZP3E7 protein is NOT occupied by any oligosaccharide
We continued to analyze gel bands for identification and characterization of the other two potential glycosylation sites 96NST98 and 288NCS290. LC–MS/MS analysis of band #2 (shown in Fig. 3), which contained the PNGaseF-untreated IgG-Fc-ZP3E7 protein, identified a peak with m/z of 595.21 (2+), and its MS/MS fragmentation produced a series of b and y peaks that led to identification of
peptide 92EEQYNSTYR100 (Fig. 5C). The TIC, XIC, MS and MS/MS for this peptide are shown in Supplemental Fig. 3. This peptide contains the potential 96NST98 glycosylation site unmodified. Because this unmodified peptide was identified in the PNGaseF-untreated IgG-Fc-ZP3E7 protein, the data suggest that the 96NST98 glycosylation site is not occupied by an oligosaccharide group. To confirm that the 96NST98 glycosylation site is not occupied, we also analyzed the PNGaseF-treated IgG-Fc-ZP3E7 protein. We expected that if the 96NST98 glycosylation site is not occupied by any oligosaccharide, then LC–MS/MS analysis of the PNGaseF-treated IgG-Fc-ZP3E7 protein should identify the 96NST98 glycosylation site unmodified. We therefore looked for peptides that contained either the 96NST98 or the 96DST98 sequence. LC–MS/MS analysis of PNGaseF-treated IgG-Fc-ZP3E7 protein (bands 4 & 5 from Fig. 3; Supplemental Table 1) led to identification of a peptide with sequence 92EEQYNSTYR100 (but not 92EEQYDSTYR100), confirming that the 96NST98 glycosylation site is not occupied by an oligosaccharide group and further suggesting that this site is not glycosylated at all. In a separate deglycosylation experiment similar to the one shown in Fig. 2, we cut out a band equivalent to band 5 (band 6 in Supplemental Table 1), analyzed it by LC–MS/MS and obtained the same results.
3.9. The 288NCS290 glycosylation site on IgG-Fc-ZP3E7 protein is NOT occupied by an oligosaccharide
We were not able to identify peptides that contain the third potential glycosylation site 288NCS290 using LC–MS/MS analysis. However, we determined by LC–MS/MS analysis that the peptide 291DSSSSQFQIHGPR303 has the 291NSS293 site occupied by an oligosaccharide (Fig. 5B) and the peptide 92EEQYNSTYR100 has the 96NST98 site unoccupied (Fig. 5C and data not shown). These data, corroborated with the WB data using anti-human IgG Ab, anti-ZP3 Ab and ConA (Fig.
2), where the removal of N-linked oligosaccharides by PNGaseF was complete, suggest that the 288NCS290 site is not occupied. We also determined by WB that, upon PNGaseF treatment, the IgG-Fc-ZP3E7 protein produced a 38 kDa band (Supplemental Fig. 4) in addition to the 48 kDa band (Fig. 2). However, since ConA recognizes only one strong band in the WB presented in Fig. 2, the 38 kDa band presented in Supplemental Fig. 4 is most likely a degradation product of the IgG-Fc-ZP3E7 protein and not a result of the PNGaseF treatment. We also used structural biology analysis to predict whether the NCS glycosylation site could be occupied by an oligosaccharide when the NSS glycosylation site is occupied by an oligosaccharide (Supplemental Fig. 5). However, the results were inconclusive.
3.10. Additional glycosylation sites in the IgG-Fc-ZP3E7 protein prevent the glycosylation of the standard 96NST98 site of the IgG part of the IgG-Fc-ZP3E7 molecule
Correct glycosylation of IgG molecules has profound implications for using IgGs as carriers for therapeutic molecules. In addition, from a regulatory point of view, it is imperative that production of a batch of antibodies is uniform. In our experiments, we concluded that the IgG-HC protein is always correctly glycosylated at the 96NST98 glycosylation site, as expected for a regular IgG molecule. However, surprisingly, when we analyzed the IgG-Fc-ZP3E7 protein, we discovered that the same 96NST98 glycosylation site is not occupied. Since there were two additional potential glycosylation sites for N-linked oligosaccharides in the IgG-Fc-ZP3E7 protein (and one of them is glycosylated as determined by LC–MS/MS), this suggests that introduction of additional glycosylation sites for N-linked oligosaccharides within the IgG-Fc-ZP3E7 protein may prevent glycosylation of the standard glycosylation site of IgG. Fig. 6 shows a comparison of the MSMS spectra of 1) a precursor peak with m/z of
595.70 (2+) that corresponds to peptide 92EEQYDSTYR100 from PNGaseF-treated IgG-HC protein (Fig. 6A); 2) a precursor peak with m/z of 595.21 (2+) that corresponds to peptide 92EEQYNSTYR100 from PNGaseF-untreated IgG-Fc-ZP3E7 protein (Fig. 6B); and 3) a
precursor peak with m/z of 595.21 (2 +) that corresponds to peptide 92EEQYNSTYR100 from PNGaseF-treated IgG-Fc-ZP3E7 protein (Fig. 6C). In all three MSMS spectra, we identified a series of peaks that corresponded to b and y ions (b2-H2O, b2, b3-H2O, b3, y1, y2, y3, y4)
Fig. 6. Comparison of the MSMS spectra that correspond to peptides 92EEQYNSTYR100 (from IgG-Fc-ZP3E7) and 92EEQYDSTYR100 (from IgG-HC). A: MSMS spectrum of the m/z = 595.70 (2+) that resulted from analysis of the PNGaseF-treated (deglycosylated) IgG-HC that corresponds to peptide 92EEQYDSTYR100 (band 1 in Coomassie gel; Fig. 3). B: MSMS spectrum of the m/z = 595.21 (2+) that resulted from analysis of the PNGaseF-untreated (glycosylated) IgG-Fc-ZP3E7 that corresponds to peptide 92EEQYNSTYR100 (band 2 in Coomassie gel; Fig. 3). C: MSMS spectrum of the m/z = 595.21 (2+) that resulted from analysis of the PNGaseF-treated (deglycosylated) IgG-Fc-ZP3E7 that corresponds to peptide 92EEQYNSTYR100 (band 5 in Coomassie gel; Fig. 3). The peaks that are marked by green circles in A–C are common to all three spectra: 92EEQYDSTYR100 (PNGaseF-treated, deglycosylated IgG-HC; A), 92EEQYNSTYR100 (PNGaseF-untreated, glycosylated IgG-Fc-ZP3E7; B), and 92EEQYNSTYR100 (PNGaseF-treated, deglycosylated IgG-Fc-ZP3E7; C). The peaks that are marked by red circles in B–C are common to peptide sequence 92EEQYNSTYR100 from both PNGaseF-untreated, glycosylated IgG-Fc-ZP3E7 (B) and PNGaseF-treated, deglycosylated IgG-Fc-ZP3E7 (C); these peaks are specific for the peptide 92EEQYNSTYR100. The peaks that are marked by blue circles in A are specific for peptide 92EEQYDSTYR100 from the PNGaseF-treated, deglycosylated IgG-HC (A). The difference between the MSMS spectra of peptides with the sequences 92EEQYDSTYR100 and 92EEQYNSTYR100 is reflected by the peaks marked by the blue and red circles. Ax, Bx and Cx (shown as an inset in each spectrum) are enhanced views of the MSMS spectra shown in A, B and C, respectively.
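Why some fragment ions are shared by the two peptides and others are peptide-specific can be seen from theoretical singly charged b and y ions. The following is a minimal sketch under stated assumptions: standard monoisotopic residue masses, no neutral-loss ions (the H2O and NH3 losses annotated in the spectra are not modeled), and hypothetical helper names.

```python
# Monoisotopic residue masses (Da) for the residues in EEQY[N/D]STYR.
RESIDUE = {"E": 129.04259, "Q": 128.05858, "Y": 163.06333, "N": 114.04293,
           "D": 115.02694, "S": 87.03203, "T": 101.04768, "R": 156.10111}
WATER, PROTON = 18.01056, 1.00728


def b_ions(pep):
    """Singly charged b1..b(n-1): N-terminal prefix plus one proton."""
    total, out = PROTON, []
    for aa in pep[:-1]:
        total += RESIDUE[aa]
        out.append(round(total, 3))
    return out


def y_ions(pep):
    """Singly charged y1..y(n-1): C-terminal suffix plus water and a proton."""
    total, out = WATER + PROTON, []
    for aa in reversed(pep[1:]):
        total += RESIDUE[aa]
        out.append(round(total, 3))
    return out


def compare(pep_a, pep_b):
    """Label each ion 'shared' or 'specific' for two equal-length peptides."""
    labels = {}
    for name, fn in (("b", b_ions), ("y", y_ions)):
        for i, (a, b) in enumerate(zip(fn(pep_a), fn(pep_b)), start=1):
            labels[f"{name}{i}"] = "shared" if abs(a - b) < 0.01 else "specific"
    return labels


if __name__ == "__main__":
    for ion, label in compare("EEQYDSTYR", "EEQYNSTYR").items():
        print(ion, label)
```

Because residue 96 is the fifth residue from either end of the nine-residue peptide, b1 to b4 and y1 to y4 do not contain it and come out identical, while y5 and larger y ions (and b5 and larger b ions) shift by about 0.984 Da. This is consistent with the shared b2, b3 and y1 to y4 ions and the peptide-specific y5 to y7 ions described for Fig. 6.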
that are common to both the 92EEQYDSTYR100 and 92EEQYNSTYR100 peptide sequences (Fig. 6A–C, green circled peaks). However, we also identified a series of y ions (y5-NH3, y5, y6-NH3, y6, y7-NH3, y7) that are peptide-specific. These y ions are specific either to peptide 92EEQYDSTYR100 (Fig. 6A) or to peptide 92EEQYNSTYR100 (Fig. 6B & C), but not to both. Enhanced spectra that highlight the differences between the spectra from Fig. 6A–C are shown in Fig. 6Ax, Bx & Cx. Taken together, these data suggest that in the IgG-HC protein the 96NST98 glycosylation site is occupied and, upon PNGaseF treatment, is converted to 96DST98. Conversely, when additional glycosylation sites are inserted within the IgG-Fc-ZP3E7 protein, the 96NST98 site is no longer glycosylated. A summary of the glycosylation of IgG-Fc-ZP3E7 protein is presented in Fig. 7. Overall, the findings that 1) a glycosylation site that should be occupied by an N-linked oligosaccharide residue is not glycosylated (the NST site from the Fc part of the IgG-Fc-ZP3E7) and 2) an additional NXS glycosylation site is glycosylated suggest that glycosylation would have an impact on protein structure and function and that, within the three dimensional structure, the glycosylation sites are almost indistinguishable by the glycosylation enzymes, even though they lie at two different positions within the polypeptide chain (one towards the N-terminal end and the other towards the C-terminal end). In
A (IgG-HC): un-glycosylated protein, 50 kDa, 92EEQYNSTYR100 (IgG-Fc); glycosylated protein, 53 kDa, 92EEQYNSTYR100 (IgG-Fc); PNGaseF-treated protein, 50 kDa, 92EEQYDSTYR100 (IgG-Fc).
B (IgG-Fc-ZP3E7): un-glycosylated protein, 32.5 kDa, 92EEQYNSTYR100 (IgG-Fc) and 284SHGNCSNSSSSQFQIHGPR303 (ZP3-Exon 7); glycosylated protein, 65 kDa, 92EEQYNSTYR100 (IgG-Fc) and 284SHGNCSNSSSSQFQIHGPR303 (ZP3-Exon 7); PNGaseF-treated protein, 48 kDa, 92EEQYNSTYR100 (IgG-Fc) and 284SHGNCSDSSSSQFQIHGPR303 (ZP3-Exon 7).
Fig. 7. (A) Summary of the assignment of the glycosylation sites in IgG-HC. The mature IgG-HC has a mass of 50 kDa and has one potential N-glycosylation site, NST, which is occupied in the secreted, mature protein. In the mature IgG-HC, the N-linked glycosylation site is occupied and the oligosaccharide is removed upon PNGaseF treatment, as demonstrated by WB (Fig. 2) and LC–MS/MS (Fig. 5A). (B) Summary of the assignment of the glycosylation sites in IgG-Fc-ZP3E7. The mature IgG-Fc-ZP3E7 polypeptide has a mass of 32.5 kDa and has three potential N-glycosylation sites, NST, NCS and NSS, marked in green. Upon glycosylation and secretion, IgG-Fc-ZP3E7 has a mass of 60–65 kDa. When additional glycosylation sites are present within the recombinant IgG-Fc-ZP3E7 protein, the NST site from the IgG-Fc region (that should be glycosylated) is no longer glycosylated. Instead, the NSS site from the ZP3E7 is glycosylated.
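The reasoning behind these site assignments (an Asn-containing peptide indicates an unoccupied site, an Asp-containing peptide found only after PNGaseF indicates an occupied site) can be written down as a small decision rule. This is a sketch of the logic as we read it from Sections 3.5 to 3.9, not the authors' software; the function name and the "partially occupied" branch are hypothetical, and a peptide that the text does not report is encoded as not observed, which is an assumption.

```python
def classify_site(n_untreated, n_treated, d_treated):
    """Infer occupancy of one Asn site from three peptide observations.

    n_untreated: Asn-form peptide identified without PNGaseF treatment
    n_treated:   Asn-form peptide identified after PNGaseF treatment
    d_treated:   Asp-form peptide identified after PNGaseF treatment
    """
    if d_treated and not n_untreated and not n_treated:
        return "occupied"
    if (n_untreated or n_treated) and not d_treated:
        return "not occupied"
    if d_treated:
        # Both forms seen: hypothetical case, not reported in the text.
        return "partially occupied"
    return "undetermined by MS"


# (protein, site) -> (n_untreated, n_treated, d_treated), as read from the text.
EVIDENCE = {
    ("IgG-HC", "96NST98"): (False, False, True),
    ("IgG-Fc-ZP3E7", "96NST98"): (True, True, False),
    ("IgG-Fc-ZP3E7", "291NSS293"): (False, False, True),
    ("IgG-Fc-ZP3E7", "288NCS290"): (False, False, False),
}

if __name__ == "__main__":
    for (protein, site), obs in EVIDENCE.items():
        print(f"{protein:>13} {site:>10}: {classify_site(*obs)}")
```

For 288NCS290 the rule returns "undetermined by MS", which reflects that no peptide containing this site was identified; the assignment of this site as unoccupied rests on the WB data rather than on LC–MS/MS.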
addition, the oligosaccharides are polar and hydrophilic, and their attachment to a protein will significantly alter its physicochemical properties (pI, solubility, charge, increased stability). This will also affect the protein structure and thus its functions (folding, cell–cell communication, protein–protein interaction). It is not uncommon to have post-translational modifications such as glycosylation at different sites within a protein. For example, various degrees of phosphorylation within protein kinase A, described in biochemistry textbooks (e.g. Lehninger Principles of Biochemistry), modulate the kinase activity of this enzyme. In addition, phosphorylation–dephosphorylation events are the activators or inhibitors (or just modulators) of many signaling pathways, such as receptor tyrosine kinase pathways or the regulation of the levels of glucose in the blood. The regulation of these proteins' enzymatic activity through these post-translational modifications is not due to phosphorylation at one particular phosphorylation site, but rather to the properties and conformation of the three dimensional structure of the protein. Likewise, one should expect the same of glycosylation and glycosylation sites, which would affect both the three dimensional structure of the proteins and their solubility. As such, the finding that a traditionally occupied glycosylation site (the IgG-Fc glycosylation site in the IgG-Fc-ZP3E7 protein) is not occupied, while a newly introduced glycosylation site is occupied by N-linked oligosaccharides, suggests that this process (modulation of the physico-chemical properties and biological activities) applies not only to phosphorylation, but also to glycosylation. The novelty of our manuscript is the demonstration that such modulation applies to glycosylation as well.
4. Conclusions
In conclusion, in our experiments we observed that, although the N-linked glycosylation site in the sequence of IgG-Fc is always occupied by an oligosaccharide, introduction of new glycosylation sites within a recombinant chimeric protein that contains the IgG-Fc sequence fused to the protein of interest prevents glycosylation of this original N-linked glycosylation site. Therefore, if the modification is truly due to introduction of a new glycosylation site and not a cell culture artifact, this could open an unexpected avenue for new approaches for controlling the solubility of recombinant proteins not only by PEG-ylation, but also by addition of N-linked glycosylation sites and prevention of glycosylation of other N-linked glycosylation sites.
Acknowledgements
C.C.D. thanks Dr. Paul M. Wassarman for advice and discussions and Dr. Eveline S. Litscher for initiating the project (WB). C.C.D. thanks Dr. Thomas A. Neubert, Skirball Institute of Biomolecular Medicine, New York University for allowing the authors to perform the preliminary studies and for donation of the TofSpec2E MALDI-MS, and Dr. Guoan Zhang for help with the initial database search for analysis of the raw data. C.C.D. also thanks Ms. Laura Mulderig and her colleagues (Waters Corporation) for their generous support in setting up the Proteomics Center within the Biochemistry & Proteomics Group at Clarkson University. This work was supported in part by Clarkson University (start-up to C.C.D.) and by the Army Research Office through the Defense University Research Instrumentation Program (DURIP grant #W911NF-11-1-0304 to C.C.D.).
Appendix A. Supplementary data
Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.bbapap.2013.04.022.
References
[1] R. Aebersold, M. Mann, Mass spectrometry-based proteomics, Nature 422 (2003) 198–207.
[2] C.C. Darie, M.L. Biniossek, M.A. Gawinowicz, Y. Milgrom, J.O. Thumfart, L. Jovine, E.S. Litscher, P.M. Wassarman, Mass spectrometric evidence that proteolytic processing of rainbow trout egg vitelline envelope proteins takes place on the egg, J. Biol. Chem. 280 (2005) 37585–37598.
[3] I. Sokolowska, C. Dorobantu, A.G. Woods, A. Macovei, N. Branza-Nichita, C.C. Darie, Proteomic analysis of plasma membranes isolated from undifferentiated and differentiated HepaRG cells, Proteome Sci. 10 (2012) 47.
[4] I. Sokolowska, M.A. Gawinowicz, A.G. Ngounou Wetie, C.C. Darie, Disulfide proteomics for identification of extracellular or secreted proteins, Electrophoresis 33 (2012) 2527–2536.
[5] I. Sokolowska, A.G. Woods, M.A. Gawinowicz, U. Roy, C.C. Darie, Identification of potential tumor differentiation factor (TDF) receptor from steroid-responsive and steroid-resistant breast cancer cells, J. Biol. Chem. 287 (2012) 1719–1733.
[6] I. Sokolowska, A.G. Woods, J. Wagner, J. Dorler, K. Wormwood, J. Thome, C.C. Darie, Mass spectrometry for proteomics-based investigation of oxidative stress and heat shock proteins, Oxidative Stress: Diagnostics and Therapy, 2011.
[7] A.G. Woods, I. Sokolowska, C.C. Darie, Identification of consistent alkylation of cysteine-less peptides in a proteomics experiment, Biochem. Biophys. Res. Commun. 419 (2012) 305–308.
[8] A.G. Woods, I. Sokolowska, R. Yakubu, M. Butkiewicz, M. LaFleur, C. Talbot, C.C. Darie, Blue native page and mass spectrometry as an approach for the investigation of stable and transient protein-protein interactions, in: S. Andreescu, M. Hepel (Eds.), Oxidative Stress: Diagnostics, Prevention, and Therapy, American Chemical Society, Washington, D.C., 2011.
[9] I. Sokolowska, A.G. Woods, M.A. Gawinowicz, U. Roy, C.C. Darie, Characterization of tumor differentiation factor (TDF) and its receptor (TDF-R), Cell Mol. Life Sci. (2012), http://dx.doi.org/10.1007/s00018-012-1185-0 [PMID:23076253].
[10] C.C. Darie, M.L. Biniossek, L. Jovine, E.S. Litscher, P.M. Wassarman, Structural characterization of fish egg vitelline envelope proteins by mass spectrometry, Biochemistry 43 (2004) 7459–7478.
[11] C.C. Darie, E.S. Litscher, P.M. Wassarman, Structure, Processing, and Polymerization of Rainbow Trout Egg Vitelline Envelope Proteins, Springer-Verlag, Düsseldorf, Germany, 2008.
[12] A.G. Ngounou Wetie, I. Sokolowska, A.G. Woods, K.L. Wormwood, S. Dao, S. Patel, B.D. Clarkson, C.C. Darie, Automated mass spectrometry-based functional assay for the routine analysis of the secretome, J. Lab. Autom. 18 (2013) 19–29.
[13] I. Sokolowska, A.G. Ngounou Wetie, A.G. Woods, C.C. Darie, Automatic determination of disulfide bridges in proteins, J. Lab. Autom. 17 (2012) 408–416.
[14] C.C. Darie, W.G. Janssen, E.S. Litscher, P.M. Wassarman, Purified trout egg vitelline envelope proteins VEbeta and VEgamma polymerize into homomeric fibrils from dimers in vitro, Biochim. Biophys. Acta 1784 (2008) 385–392.
[15] L. Jovine, C.C. Darie, E.S. Litscher, P.M. Wassarman, Zona pellucida domain proteins, Annu. Rev. Biochem. 74 (2005) 83–114.
[16] E.S. Litscher, W.G. Janssen, C.C. Darie, P.M. Wassarman, Purified mouse egg zona pellucida glycoproteins polymerize into homomeric fibrils under non-denaturing conditions, J. Cell. Physiol. 214 (2008) 153–157.
[17] D.J. Capon, S.M. Chamow, J. Mordenti, S.A. Marsters, T. Gregory, H. Mitsuya, R.A. Byrn, C. Lucas, F.M. Wurm, J.E. Groopman, et al., Designing CD4 immunoadhesins for AIDS therapy, Nature 337 (1989) 525–531.
[18] Z. Williams, E.S. Litscher, L. Jovine, P.M. Wassarman, Polypeptide encoded by mouse ZP3 exon-7 is necessary and sufficient for binding of mouse sperm in vitro, J. Cell. Physiol. 207 (2006) 30–39.
[19] U. Roy, I. Sokolowska, A.G. Woods, C.C. Darie, Structural investigation of tumor differentiation factor (TDF), Biotechnol. Appl. Biochem. 59 (2012) 445–450.
[20] A. Shevchenko, M. Wilm, O. Vorm, O.N. Jensen, A.V. Podtelejnikov, G. Neubauer, P. Mortensen, M. Mann, A strategy for identifying gel-separated proteins in sequence databases by MS alone, Biochem. Soc. Trans. 24 (1996) 893–896.
[21] D.S. Spellman, K. Deinhardt, C.C. Darie, M.V. Chao, T.A. Neubert, Stable isotopic labeling by amino acids in cultured primary neurons: application to brain-derived neurotrophic factor-dependent phosphotyrosine-associated signaling, Mol. Cell Proteomics 7 (2008) 1067–1076.
[22] C.C. Darie, V. Shetty, D.S. Spellman, G. Zhang, C. Xu, H.L. Cardasis, S. Blais, D. Fenyo, T.A. Neubert, Blue Native PAGE and Mass Spectrometry Analysis of the Ephrin Stimulation-Dependent Protein–Protein Interactions in NG108-EphB2 Cells, Springer-Verlag, Düsseldorf, Germany, 2008.
[23] C.C. Darie, K. Deinhardt, G. Zhang, H.S. Cardasis, M.V. Chao, T.A. Neubert, Identifying transient protein–protein interactions in EphB2 signaling by blue native PAGE and mass spectrometry, Proteomics 11 (2011) 4514–4528.
[24] A. Roy, A. Kucukural, Y. Zhang, I-TASSER: a unified platform for automated protein structure and function prediction, Nat. Protoc. 5 (2010) 725–738.
[25] Y. Zhang, I-TASSER server for protein 3D structure prediction, BMC Bioinforma. 9 (2008) 40.
[26] D. Xu, Y. Zhang, Ab initio protein structure assembly using continuous structure fragments and optimized knowledge-based force field, Proteins 80 (7) (2012) 1715–1735, http://dx.doi.org/10.1002/prot.24065.
[27] N. Author, http://zhanglab.ccmb.med.umich.edu/QUARK.
[28] Accelrys Software Inc., Discovery Studio Modeling Environment, Release 3.1, Accelrys Software Inc., San Diego, 2012.