Bioinformatics of glycosaminoglycans

Bioinformatics of glycosaminoglycans

+Model PISC-393; No. of Pages 5 ARTICLE IN PRESS Perspectives in Science (2016) xxx, xxx—xxx Available online at www.sciencedirect.com ScienceDi...

1MB Sizes 3 Downloads 151 Views

+Model

PISC-393;

No. of Pages 5

ARTICLE IN PRESS

Perspectives in Science (2016) xxx, xxx—xxx

Available online at www.sciencedirect.com

ScienceDirect journal homepage: www.elsevier.com/pisc

Bioinformatics of glycosaminoglycans夽 Han Hu a,b, Yang Mao b,c, Yu Huang b,c, Cheng Lin b,c, Joseph Zaia a,b,c,∗ a

Bioinformatics Program, Boston University, Boston, MA, USA Center for Biomedical Mass Spectrometry, Boston University, Boston, MA, USA c Department of Biochemistry, Boston University, Boston, MA, USA b

Received 2 November 2015; accepted 22 January 2016

KEYWORDS Glycosaminoglycan; Heparan sulfate; Tandem mass spectrometry; Electron transfer dissociation; Divide-and-conquer

Summary Cell surface heparan sulfates modulate many signalling pathways by binding growth factors and growth factor receptors. Expressed in a spatially and temporally regulated manner, these highly sulfated polysaccharides play important roles in all aspects of animal physiology. To understand heparan sulfate-protein binding, it is necessary to develop instrumental sequencing methods. Towards this end, we and others have demonstrated the effectiveness of activated electron dissociation (ExD) tandem mass spectrometry. The value in the ExD approach is that extremely rich tandem mass spectra are produced. The challenge is that bioinformatics methods are needed to convert the raw data into HS saccharide sequences. In this article we describe HS—SEQ, an algorithm developed for this purpose. © 2016 Published by Elsevier GmbH. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Introduction Heparan sulfate (HS) is a linear sulfated polysaccharide that consists of repeating uronic acid-hexosamine disac-



This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. This article is part of a special issue entitled Proceedings of the Beilstein Glyco-Bioinformatics Symposium 2015 with copyright © 2017 Beilstein-lnstitut. Published by Elsevier GmbH. All rights reserved. ∗ Corresponding author at: Dept. of Biochemistry, Boston University, Boston, MA, USA. E-mail address: [email protected] (J. Zaia).

charide units. Biosynthesized in the endoplasmic reticulum and Golgi apparatus as a nascent co-polymer, HS chains are acted upon by a series of biosynthetic enzymes (Bulow and Hobert, 2006) as shown in Fig. 1. The HS domain structure results from the actions of N-deactylase/Nsulfotransferase enzymes that remove selected glucosamine N-acetate groups and replace them with N-sulfate groups. A glucuronic acid C5 epimerase then coverts a subset of uronic acid residues to iduronic acid. A series of O-sulfotransferases then adds sulfate groups to specific sites on the monosaccharide residues see Fig. 2. These enzymes add sulfate to targets primarily in the N-sulfated domains. The resulting mature chains have domains of high sulfation interspersed with those with low sulfation and those with intermediate sulfation.

http://dx.doi.org/10.1016/j.pisc.2016.01.014 2213-0209/© 2016 Published by Elsevier GmbH. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).

Please cite this article in press as: Hu, H., et al., Bioinformatics of glycosaminoglycans. Perspect. Sci. (2016), http://dx.doi.org/10.1016/j.pisc.2016.01.014

+Model

PISC-393; No. of Pages 5

ARTICLE IN PRESS

2

H. Hu et al.

Figure 1

Biosynthesis of heparan sulfate.

Figure 2 Overview of heparan sulfate structure. (A) Repeating disaccharide units with positions of sulfation indicated; (B) illustration of binding among HS chains growth factors and growth factor receptors.

Heparan sulfate is found in metazoans (DeAngelis, 2002) and is required for embryogenesis, development (Perrimon and Bernfield, 2000), and normal functioning of all physiological systems (Bishop et al., 2007). Heparin is a highly sulfated form of HS found in granulated hematopoietic lineage cells. Heparin derived from porcine intestinal mucosa mast cells is used as an anticoagulant in hospital surgical suites worldwide. Low molecular weight heparins (LMWH), including enoxaparin and dalteparin, are produced by limited depolymerization of heparin and used to treat deep vein thrombosis and other blood clotting disorders (Samama, 1992). The recent approval of biogeneric forms of LMWH reflected the power of analytical methods in characterizing these extremely complex biopolymeric mixtures so as

to demonstrate acceptably low residual risk in the reverse engineered products. Heparan sulfate is found on the surfaces of most animal cell types where it binds to many families of growth factors and growth factor receptors (Bernfield et al., 1999). The structure of HS expression varies among cell types and as a function of developmental and disease states. In normal tissue, cells receive signals from their surroundings that depend on interactions among cell surface HS, growth factors and growth factor receptors. In pathophysiological states including cancers, cellular responses to growth factor signals become dysregulated. These dysregulated states are often associated with changes in the structures of HS chains and the signaling proteins to which they bind. Thus,

Please cite this article in press as: Hu, H., et al., Bioinformatics of glycosaminoglycans. Perspect. Sci. (2016), http://dx.doi.org/10.1016/j.pisc.2016.01.014

+Model

PISC-393;

No. of Pages 5

ARTICLE IN PRESS

Bioinformatics of glycosaminoglycans

3

Figure 3 Tandem mass spectrometric dissociation of Arixtra, an octasulfated pentasaccharide. (A) Collisional induced dissociation (CID) tandem MS showing abundant losses of SO3 from the precursor ion; (B) structure of Arixtra with activated electron detachment dissociation (EDD) product ions shown; (C) EDD tandem mass spectrum of Arixtra; (D) negative electron transfer dissociation (NETD) tandem mass spectrum of Arixtra.

there is strong interest in determining the protein binding HS sequences that underlie these binding events. We and others have sought to develop mass spectrometric methods for sequencing HS saccharides. As shown in Fig. 3A, collisional induced dissociation (CID) tandem MS of a highly sulfated HS-like saccharide results in a series of SO3 losses from the precursor ion. Such losses can be minimized by maximizing the precursor ion charge state (Naggar et al., 2004). Unfortunately, it is not possible to produce sufficient negative charge density to prevent all losses of SO3 for highly sulfated HS saccharides. Alternatively, such losses can be minimized by analyzing metal adducted HS saccharide ions (Naggar et al., 2004; Kailemia et al., 2012). In an effort to avoid the use of metal salts in the spray solution, we developed a chemical method to replace HS N-sulfate groups with deuterated N-acetate groups so as to reduce losses of SO3 . When combined with supercharging of the electrospray through the addition of sulfolane, this approach was effective for minimizing losses of SO3 and favouring glycosidic bond dissociation (Shi et al., 2012). Sharp et al. permethylate the sulfated saccharides, then chemically remove sulfate groups, then acetylate the newly exposed hydroxyl groups (Huang et al., 2011a,b; R. Huang et al., 2013). In doing so, they render the saccharides hydrophobic, stable, and amenable to reversed phase LC-tandem MS analysis. Amster et al. demonstrated the effectiveness of electron detachment dissociation (Wolff et al., 2007a,b, 2008) and

negative electron transfer dissociation (NETD) for tandem MS of HS saccharides. These approaches produced abundant backbone dissociation by glycosidic bond and cross-ring cleavages and showed acceptably low abundances of SO3 losses. As shown in Fig. 3 (Y. Huang et al., 2013), the EDD and NETD tandem mass spectra are extremely rich, with several cross-ring cleavages occurring to each residue. This complexity drives the need for an effective, non-biased method for interpretation of the tandem mass spectrometric data. We developed software to meet this need as described below.

Results The steps necessary to process HS saccharide tandem mass spectra are shown in Fig. 4. The first step is to assign product ion m/z, charge state and intensity values. Our data were produced using a Bruker SolariX FTMS instrument and the vendor SNAPP algorithm was used to make these assignments. The monoisotopic peak list was then converted to neutral mass values. The next step was to generate a list of theoretical HS saccharide sequences that match the composition [HexA, HexA, GlcN, Acetate, SO3 ] indicated by the exact mass of the precursor ion. This list considered all possible glycan product ion types, A, B, C, X, Y, Z that contain either the reducing or non-reducing end, according to the Domon—Costello nomenclature (Domon and Costello, 1988).

Please cite this article in press as: Hu, H., et al., Bioinformatics of glycosaminoglycans. Perspect. Sci. (2016), http://dx.doi.org/10.1016/j.pisc.2016.01.014

+Model

PISC-393; No. of Pages 5

ARTICLE IN PRESS

4

H. Hu et al.

Figure 4

Informatic pipeline for analysis of ExD tandem mass spectra of HS saccharides.

The algorithm also accounted for neutral water losses and hydrogen transfer reactions. Next, the algorithm matched the observed product ion masses against those in the theoretical list. It then clustered product ion compositions according to the number of acetate groups. The algorithm then considered the ambiguity of the product ion assignments based on the number of possible compositions for each. The sequence was built from the collection of assigned terminal product ions using an assignment graph as shown in Fig. 5. In this example, the graph began with two dummy nodes, the null and the precursor. An assignment was added to the graph, consistent with three sulfate groups and zero acetate groups. Thus, this assignment divided the precursor ion into two fragments. The terminal assignments were added to the graph in descending order of their uniqueness values. The distribution was determined and then used to localize the positions of sulfate groups on the saccharide backbone. The assignment graph was updated after the addition of each assignment node.

Conclusions The HS-SEQ program uses a divide-and-conquer strategy to reduce the problem of HS saccharide sequencing into sub-problems. To perform well, the algorithm requires high mass accuracy tandem mass spectra in which cleavages to most glycosidic bonds are present. Although ExD tandem mass spectra contain a degree of SO3 losses, the algorithm requires that at ions with no sulfate losses are detected. The algorithm performs best for saccharides in which the reducing end has been modified so as to remove the mass symmetry between product ions that contain the reducing end versus the non-reducing end. The combination of high resolution data and reducing end labelling keeps the degree of product ion ambiguity sufficiently low to enable the correct HS saccharide sequence to be determined.

Figure 5 (A) Subtasks in HS sequencing. In HS-SEQ, HS sequencing consists of two basic steps: identification of Ac positions and identification of sulfate positions. Data ambiguity is considered for each step, and sulfate loss is considered when identifying sulfate positions. (B) An assignment graph connects assignments and generates the modification distribution. The program is available at https://github.com/hh1985/glycan-pipeline.

Please cite this article in press as: Hu, H., et al., Bioinformatics of glycosaminoglycans. Perspect. Sci. (2016), http://dx.doi.org/10.1016/j.pisc.2016.01.014

+Model

PISC-393;

No. of Pages 5

ARTICLE IN PRESS

Bioinformatics of glycosaminoglycans Although in theory uronic acid epimers can be differentiated using ExD tandem MS, the phenomenon requires additional study before it is understood well enough to be exploited by the algorithm to assign epimeric positions.

Acknowledgement This work was supported by US National Institutes for Health grant number P41GM104603.

References Bernfield, M., Gotte, M., Park, P.W., Reizes, O., Fitzgerald, M.L., Lincecum, J., Zako, M., 1999. Functions of cell surface heparan sulfate proteoglycans. Annu. Rev. Biochem. 68, 729—777, http://dx.doi.org/10.1146/annurev.biochem.68.1.729. Bishop, J.R., Schuksz, M., Esko, J.D., 2007. Heparan sulphate proteoglycans fine-tune mammalian physiology. Nature 446, 1030—1037, http://dx.doi.org/10.1038/nature05817. Bulow, H.E., Hobert, O., 2006. The molecular diversity of glycosaminoglycans shapes animal development. Annu. Rev. Cell Dev. Biol. 22, 375—407, http://dx.doi.org/10.1146/annurev.cellbio.22.010605.093433. DeAngelis, P.L., 2002. Evolution of glycosaminoglycans and their glycosyltransferases: implications for the extracellular matrices of animals and the capsules of pathogenic bacteria. Anat. Rec. 268, 317—326, http://dx.doi.org/10.1002/ar.10163. Domon, B., Costello, C.E., 1988. A systematic nomenclature for carbohydrate fragmentations in FAB-MS/MS spectra of glycoconjugates. Glycoconj. J. 5, 397—409, http://dx.doi.org/10.1007/bf01049915. Huang, R., Pomin, V.H., Sharp, J.S., 2011a. LC/MSn analysis of isomeric chondroitin sulfate oligosaccharides using a chemical derivatization strategy. J. Am. Soc. Mass Spectrom. 22, 1577—1587, http://dx.doi.org/10.1007/s13361-011-0174-0. Huang, R., Pomin, V.H., Sharp, J.S., 2011b. A chemical derivatization strategy for structural analysis of isomeric glycosaminoglycan oligosaccharides using reverse phase LC—MSn. Proceedings of the 59th ASMS Conference on Mass Spectrometry and Allied Topics.

5 Huang, R., Liu, J., Sharp, J.S., 2013. An approach for separation and complete structural sequencing of heparin/heparan sulfate-like oligosaccharides. Anal. Chem. 85, 5787—5795, http://dx.doi.org/10.1021/ac400439a. Huang, Y., Yu, X., Mao, Y., Costello, C.E., Zaia, J., Lin, C., 2013. De novo sequencing of heparan sulfate oligosaccharides by electron-activated dissociation. Anal. Chem. 85, 11979—11986, http://dx.doi.org/10.1021/ac402931j. Kailemia, M.J., Li, L., Ly, M., Linhardt, R.J., Amster, I.J., 2012. Complete mass spectral characterization of a synthetic ultralow-molecular-weight heparin using collision-induced dissociation. Anal. Chem. 84, 5475—5478, http://dx.doi.org/10.1021/ac3015824. Naggar, E.F., Costello, C.E., Zaia, J., 2004. Competing fragmentation processes in tandem mass spectra of heparin-like glycosaminoglycans. J. Am. Soc. Mass Spectrom. 15, 1534—1544, http://dx.doi.org/10.1016/j.jasms.2004.06.019. Perrimon, N., Bernfield, M., 2000. Specificities of heparan sulphate proteoglycans in developmental processes. Nature 404, 725—728, http://dx.doi.org/10.1038/35008000. Samama, M.-M., 1992. Treatment of deep vein thrombosis (DVT) with low molecular weight heparins (LMWH). Adv. Exp. Med. Biol. 313, 275—281, http://dx.doi.org/10.1007/978-1-4899-2444-5 27. Shi, X., Huang, Y., Mao, Y., Naimy, H., Zaia, J., 2012. Tandem mass spectrometry of heparan sulfate negative ions: sulfate loss patterns and chemical modification methods for improvement of product ion profiles. J. Am. Soc. Mass Spectrom. 23, 1498—1511, http://dx.doi.org/10.1007/s13361-012-0429-4. Wolff, J.J., Amster, I.J., Chi, L., Linhardt, R.J., 2007a. Electron detachment dissociation of glycosaminoglycan tetrasaccharides. J. Am. Soc. Mass Spectrom. 18, 234—244, http://dx.doi.org/10.1016/j.jasms.2006.09.020. Wolff, J.J., Chi, L., Linhardt, R.J., Amster, I.J., 2007b. Distinguishing glucuronic from iduronic acid in glycosaminoglycan tetrasaccharides by using electron detachment dissociation. Anal. Chem. 79, 2015—2022, http://dx.doi.org/10.1021/ac061636x. Wolff, J.J., Laremore, T.N., Busch, A.M., Linhardt, R.J., Amster, I.J., 2008. Influence of charge state and sodium cationization on the electron detachment dissociation and infrared multiphoton dissociation of glycosaminoglycan oligosaccharides. J. Am. Soc. Mass Spectrom. 19, 790—798, http://dx.doi.org/10.1016/j.jasms.2008.03.010.

Please cite this article in press as: Hu, H., et al., Bioinformatics of glycosaminoglycans. Perspect. Sci. (2016), http://dx.doi.org/10.1016/j.pisc.2016.01.014