Serum N-glycan analysis in breast cancer patients – Relation to tumour biology and clinical outcome


Molecular Oncology 10 (2016) 59–72

available at www.sciencedirect.com

ScienceDirect www.elsevier.com/locate/molonc

Vilde D. Haakensen (a,b), Israel Steinfeld (c,d), Radka Saldova (e), Akram Asadi Shehni (e), Ilona Kifer (d), Bjørn Naume (f), Pauline M. Rudd (e), Anne-Lise Børresen-Dale (a,b,*), Zohar Yakhini (c,d,**)

a Department of Genetics, Institute for Cancer Research, Oslo University Hospital, The Norwegian Radium Hospital, Oslo, Norway
b The K.G. Jebsen Center for Breast Cancer Research, Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
c Department of Computer Science, Technion, Haifa, Israel
d Agilent Laboratories, Agilent Technologies, Tel-Aviv, Israel
e NIBRT GlycoScience Group, National Institute for Bioprocessing Research and Training, Fosters Avenue, Mount Merrion, Blackrock, Dublin 4, Ireland
f Department of Oncology, Oslo University Hospital, The Norwegian Radium Hospital, Oslo, Norway

ARTICLE INFO

Article history:
Received 19 December 2014
Received in revised form 2 August 2015
Accepted 3 August 2015
Available online 19 August 2015

Keywords:
Serum N-glycans
mRNA
Gene expression
Breast cancer
Integrated data analysis

ABSTRACT

Glycosylation and related processes play important roles in cancer development and progression, including metastasis. Several studies have shown that N-glycans have potential diagnostic value as cancer serum biomarkers. We have explored the significance of the abundance of particular serum N-glycan structures as important features of breast tumour biology by studying the serum glycome and tumour transcriptome (mRNA and miRNA) of 104 breast cancer patients. Integration of these types of molecular data allows us to study the relationship between serum glycans and transcripts representing functional pathways, such as metabolic pathways or DNA damage response. We identified triantennary trigalactosylated trisialylated glycans in serum as being associated with lower levels of tumour transcripts involved in focal adhesion and integrin-mediated cell adhesion. These glycan structures were also linked to poor prognosis in patients with ER negative tumours. High abundance of simple monoantennary glycan structures was associated with increased survival, particularly in the basal-like subgroup. The presence of circulating tumour cells was found to be significantly associated with several serum glycome structures, such as bi- and triantennary, di- and trigalactosylated, and di- and trisialylated glycans. The link between tumour miRNA expression levels and N-glycan production is also examined.

© 2015 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

Abbreviations: 2AB, 2-aminobenzamide; GP, glycan peak; HPLC, high performance liquid chromatography; mHG, minimal hypergeometric; MicMa, Micrometastasis in Mammary carcinomas; sLex, sialyl Lewis x; TTGE, Temporal temperature gradient gel electrophoresis; UPLC, Ultra Performance Liquid Chromatography.

* Corresponding author. Institute for Cancer Research, Oslo University Hospital, Postboks 4950 Nydalen, 0424 Oslo, Norway. Tel.: +47 92854455.
** Corresponding author. Agilent Laboratories, 94 Em haMoshavot, 49527 Petah Tiqva, Israel. Tel.: +972 (3) 9288 575.
E-mail addresses: [email protected] (A.-L. Børresen-Dale), [email protected] (Z. Yakhini).

http://dx.doi.org/10.1016/j.molonc.2015.08.002
1574-7891/© 2015 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.

1. Introduction

Molecular profiling of tumours is increasingly recognized as a guide to cancer care and is used to classify tumours into several main categories (Sorlie et al., 2001; Weigelt et al., 2008) that differ in biology and response to treatment. Clinical practice continues to base decisions on the molecular profiles of the tumour and on the tumour environment, but tumour molecular profiling requires access to tumour samples and therefore relies on an invasive procedure. It is highly desirable, mainly for the purpose of supporting early detection of breast cancer, to develop approaches that use more accessible patient samples such as plasma or serum.

Because glycoproteins and glycolipids present on the surface of tumour cells and host response cells have a demonstrated involvement in molecular signalling, glycans are good potential candidates to provide indications pertaining to cancer pathogenesis and prognosis, including the metastatic potential (Saldova et al., 2011). Recent studies point to glycosylation events that differ between tumour cells and healthy cells and tissues (Chandler and Goldman, 2013; Potapenko et al., 2009), including differences in many N- and O-glycans attached to cell surface proteins. Alterations in synthesis, modifications, transfer and degradation of glycans are all processes that affect glycosylation. Thus, glycosylation alterations on cell surface proteins affect tumour cell properties such as cell mobility, intercellular signalling and the capacity to metastasize (Julien et al., 2011; Ugorski and Laskowska, 2002). As some glycoproteins are secreted or shed from the tumours, tumour-related glycan profiles as well as protein glycosylation changes reflecting the host response may also be detected in serum. Indeed, changes in protein glycosylation in serum have been detected in various cancer types, including breast cancer (Saldova et al., 2011).

The glycan composition of serum glycoproteins can be measured using several techniques developed in recent decades. Mass spectrometry based measurement approaches to glycan biomarkers are reviewed in Lebrilla (2013) and Ruhaak et al. (2013). Ruhaak et al. (2013) describe a method that begins with the enzymatic release of N-glycans using PNGaseF. Upon release, the N-glycans are purified using a cartridge containing PGC, from which the glycans are eluted in three fractions, and are subsequently identified by comparison with pre-measured standards. Lectins and other recognition molecules are also used to distinguish the binding profiles of samples from cancer patients versus samples from healthy controls (Li et al., 2009; Bicker et al., 2012; Cross et al., 2005; Parker et al., 2005). Serum N-glycome analysis based on high performance liquid chromatography (HPLC) with fluorescence detection is described in Royle et al. (2008). HPLC is used to quantify the relative abundances of different glycans in the serum glycome, yielding a simple and quantitative approach. The detailed analysis of the serum glycome using orthogonal technologies such as LC, WAX and mass spectrometry is necessary to provide a secure basis for structural assignments (Saldova et al., 2014). Transitioning the method to Ultra Performance Liquid Chromatography (UPLC) enables greater resolution, as applied in our recently published study of a breast cancer cohort (Saldova et al., 2014). This allowed the identification of additional glycan structures associated with breast cancer, mammographic density and serum estradiol.

The relationship between physiologically important activities in tumour cells and serum glycome features has not previously been characterized systematically. By combining transcriptome profiling information from tumour samples with matching serum glycome profiling data in the extensively characterized MicMa breast cancer cohort (Enerly et al., 2011; Aure et al., 2013a), we took a systematic approach to understand how processes that occur within the tumour manifest themselves in the serum N-glycome. We analyzed the N-glycome to identify glycan groups in serum that are associated, in a statistically significant manner, with survival and with other prognostic properties of the tumour. Furthermore, we investigated the association of N-glycan levels with tumour molecular properties such as expression levels of mRNA and microRNA (miRNA). Associations were also assessed at the pathway level, in order to obtain better and more useful interpretations of serum N-glycome differences when they potentially reflect an associated activity in the tumour. We identified serum N-glycans associated with biological processes in breast carcinomas as well as serum glycans associated with the presence of circulating tumour cells, primary tumour size and survival.

2. Material and methods

2.1. Subjects

A study of serum glycan profiling to compare serum samples from breast cancer patients to serum samples from cancer free subjects was reported in Saldova et al. (2014). To understand what tumour activity is correlated with or reflected in the serum glycan profiles, we focused on a cohort of breast cancer patients. Serum samples from a total of 104 women were included in the current study. The women were recruited during 1995–1998 as part of a large consecutive study (Micrometastases in Mammary cancer, MicMa) designed to explore the biology of micrometastases in breast cancer as previously described (Naume et al., 2007; Wiedswang et al., 2003). The study was approved by the Norwegian Regional Committee for Medical Research Ethics, Health Region II (reference number S-97103). The average clinical follow-up time is 9.5 years (range 1 month–15 years). An overview of the subjects, including clinical variables and tumour characteristics, is shown in Appendix C, Table C.1.

2.2. Serum samples

Serum was collected in serum-separating tubes (SST) and was left on the bench for 30 min before centrifuging for 10 min at 2000 × g. The samples were then stored at −20 °C before shipment from Oslo to Dublin for N-glycan analysis. N-glycans were released from 5 µL of serum samples (in duplicates) using the high-throughput method previously described (Royle et al., 2008). Briefly, serum samples were reduced and alkylated in 96-well plates before they were immobilized in sodium dodecyl sulphate (SDS)-gel blocks and washed. The N-linked glycans were released using peptide N-glycanase F (1000 U/mL; EC 3.5.1.52) as described previously (Kuster et al., 1997). Glycans were fluorescently labelled with 2-aminobenzamide (2AB) by reductive amination (Royle et al., 2008). Excess 2AB reagent was removed on Whatman 3 MM paper (Clifton, NJ) in acetonitrile (Royle et al., 2008, 2006).

2.3. Ultra Performance Liquid Chromatography (UPLC)

UPLC was performed as previously described (Saldova et al., 2014) using a BEH Glycan column (1.7 µm particles, 2.1 × 150 mm; Waters, Milford, MA) on an Acquity UPLC (Waters, Milford, MA) equipped with a Waters temperature control module and a Waters Acquity fluorescence detector. A 60 min method was used, with 50 mM formic acid adjusted to pH 4.4 with ammonia solution as solvent A and acetonitrile as solvent B. Samples were injected in 70% acetonitrile. Fluorescence was measured at 420 nm with excitation at 330 nm. The system was calibrated using an external standard of hydrolyzed and 2AB-labelled glucose oligomers to create a dextran ladder, as described previously (Royle et al., 2006).
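To illustrate how a dextran ladder can be used for calibration, the following minimal sketch converts a retention time into glucose units (GU), the scale used in Table 1. It assumes simple linear interpolation between neighbouring ladder peaks, which is not necessarily the fitting procedure of Royle et al. (2006); the ladder values and the function name `retention_to_gu` are hypothetical.

```python
from bisect import bisect_right

# Hypothetical dextran ladder: (retention time in min, glucose units).
# These numbers are invented for illustration only.
LADDER = [(10.0, 2.0), (14.0, 3.0), (18.5, 4.0), (23.0, 5.0), (27.0, 6.0), (30.5, 7.0)]


def retention_to_gu(rt, ladder=LADDER):
    """Convert a retention time to glucose units by linear interpolation
    between the two neighbouring ladder peaks (an assumed, simplified fit)."""
    times = [t for t, _ in ladder]
    if not times[0] <= rt <= times[-1]:
        raise ValueError("retention time outside the calibrated range")
    # Index of the upper end of the ladder segment that contains rt.
    i = min(bisect_right(times, rt), len(ladder) - 1)
    (t0, g0), (t1, g1) = ladder[i - 1], ladder[i]
    return g0 + (g1 - g0) * (rt - t0) / (t1 - t0)
```

Expressing peaks in GU rather than in minutes makes them comparable between runs, because the ladder is re-measured under the same conditions as the samples.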

2.4. Detection of circulating tumour cells

Detection of circulating tumour cells (CTCs) in peripheral blood was performed as previously described (Naume et al., 2001; Wiedswang et al., 2006). Briefly, Ficoll density centrifugation through Lymphoprep™ (Axis-Shield PoC, Oslo, Norway) was used to isolate mononuclear cells, including tumour cells, from whole blood. Circulating tumour cells were isolated as described in Molloy et al. (2011). Briefly, mononuclear cells were separated from the whole blood samples using Ficoll Hypaque (BD, Breda, Netherlands) and frozen in liquid nitrogen until tumour cell enrichment. Tumour cells were isolated from the mononuclear cells using anti-EpCAM (clone HEA125) and anti-ERBB2 Micro Beads (MACS; Miltenyi Biotec, Leiden, Netherlands). Quadratic discriminant analysis was used to calculate a CTC score indicating presence or absence of CTCs for each sample based on gene expression data of the four best marker genes (CK19, p1B, EGP and MmG1) as previously described (Bosma et al., 2002; Hand, 1992).

2.5. Molecular characterization of breast cancer tissue

Breast carcinomas from the women included in the study were fresh frozen and later used for whole genome molecular profiling of mRNA, miRNA and DNA. Total RNA was isolated using TRIZOL (Invitrogen) as previously described (Sorlie et al., 2006).

mRNA gene expression profiling was performed using Agilent catalogue design whole human genome 4 × 44K one colour oligo arrays (Agilent Technologies, Santa Clara, CA) as described previously (Enerly et al., 2011). Scanning was performed using Agilent Scanner G2565A and signals were extracted using Feature Extraction v9.5. Data were log2-transformed. Non-uniform spots and probes with missing values on more than 10 arrays were excluded. Quantile normalization was performed in R using NormalizeBetweenArrays from the LIMMA library (Smyth, 2005). Missing values were imputed using LLS imputation (R: LLSimpute from the pcaMethod library with k = 20) (Kim et al., 2005). The data are compliant with Minimum Information About a Microarray Experiment (MIAME) and have been submitted to the Gene Expression Omnibus (GEO) with accession number GSE19783.

microRNA profiling from total RNA was performed using Agilent's "Human miRNA microarray Kit (V2)" (Agilent Technologies, Santa Clara, CA) according to the manufacturer's protocol and as previously described (Enerly et al., 2011). The arrays were scanned on Agilent Scanner G2565A and Feature Extraction v9.5 was used to extract signals. Samples were run in duplicates and signal intensities for replicate samples were averaged, log2-transformed and normalized to the 75th percentile. miRNAs present in less than 10% of the samples were filtered out, leaving 489 miRNAs for further analysis. The data are MIAME compliant and have been submitted to the GEO with accession number GSE19536.

TP53 mutation analysis covering exons 2 through 11 was performed as previously described (Naume et al., 2007).
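As an illustration of the quantile normalization step mentioned above, the sketch below shows the basic idea in plain Python: every array is forced onto the same reference distribution, namely the mean of the sorted values across arrays. It is a simplified stand-in and not the LIMMA implementation; in particular it assumes no missing values and breaks ties by position. The function name `quantile_normalize` is hypothetical.

```python
def quantile_normalize(columns):
    """Quantile-normalize a list of equally long columns (one per array).

    Each value is replaced by the mean, across arrays, of the values that
    share its rank. Ties are broken by position, which is a simplification.
    """
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all arrays must have the same number of probes")
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value over all arrays.
    reference = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # probe indices by increasing value
        out = [0.0] * n
        for rank, i in enumerate(order):
            out[i] = reference[rank]
        normalized.append(out)
    return normalized
```

After this step all arrays share identical value distributions, so remaining differences between samples lie only in which probes occupy which ranks.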

2.6. Statistical analyses

The glycan UPLC output gives the relative area of each glycan peak in the spectrum. Hence, the data are compositional, conveying the relative amounts of glycan structures in a sample rather than the absolute quantities. The sum of all glycan peaks is constant. Raw compositional data are not suitable for most statistical analyses, and a logit transformation was performed to map the data into real space:

$$\operatorname{logit}(\mathrm{peak}) = \log\left(\frac{\mathrm{peak}}{1 - \mathrm{peak}}\right)$$

A linear regression model with analysis of variance (ANOVA) was fitted to identify glycan peaks and features associated with clinical traits. All analyses were corrected for multiple testing by use of false discovery rate (FDR). An FDR of less than 10% was considered significant. Age adjustment was used for ANOVA analyses and log-rank tests. Data not adjusted for age were used for association analyses (gene expression related to serum glycans). Analyses were redone with age-adjusted data to ensure that the results obtained were not dependent upon age.

Survival analysis was performed on the logit-transformed and age-corrected data. Age correction was performed by computing the residual of the glycan abundance, based on a linear regression with age as the explanatory variable. Each GP was used to partition the samples in the cohort into two classes: samples where the GP was measured to have high abundance (top 33% of the samples) and samples where the GP was measured to have low abundance (bottom 33% of the samples). A log-rank test was used for survival analysis, comparing these two classes, and a threshold of p < 0.05 was considered significant. To show the trends in survival, top and bottom populations need to be compared, as above. Cut-offs of 33% were chosen to represent the top and bottom populations while still keeping a grey zone in the middle. We do expect differences to be more pronounced in the distal percentiles, with middle values possibly introducing noise. Using smaller percentage cut-offs, such as 10%, leaves us with much less statistical power. Using 25% cut-offs we get results that are similar to those reported here.

To associate GPs with pathways or miRNA targets, an integrative analysis was performed using ENViz (Steinfeld et al., 2014, 2015) and as described in Aure et al. (2013b). ENViz associates a GP with a set of genes, representing a biological pathway, by assessing the enrichment level of the set of genes among genes highly correlated (or anti-correlated) to the GP. In brief, for a particular GP all genes are ranked according to their expression similarity to the GP abundance levels across the cohort of samples. Using the mHG statistic (Eden et al., 2007), the enrichment of the analysed set of genes is assessed at the top of this ranked list. We used Pearson correlation to obtain the ranking and considered both directions for enrichment analysis (correlated and anti-correlated). To associate GPs with pathways we used WikiPathways (Kelder et al., 2012) gene sets. To associate GPs with miRNA targets we used the Targetscan target prediction database (Lewis et al., 2005) and defined each miRNA target set as its top 2000 mRNA targets (or fewer if Targetscan reports fewer than 2000 scored targets).

In our analysis, we compared each result to 10,000 random results, as facilitated by ENViz (Steinfeld et al., 2015), to assess the empirical statistical strength. In each randomization instance the glycan profile is shuffled and the subsequent enrichment results are compared to those obtained in the actual data. The reported results are those GP-pathway or GP-miRNA-target-set associations that were found to be better than all randomized results. The overall schema for this analysis is described in Figure 1.
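The enrichment procedure described above can be sketched as follows. This is a minimal illustration and not the ENViz implementation: it ranks genes by Pearson correlation to a GP, computes the minimum hypergeometric (mHG) score as the smallest hypergeometric tail probability over all prefixes of the ranked list, and checks the observed score against shuffled glycan profiles. All function names are hypothetical, the default number of shuffles is far smaller than the 10,000 used in the study, and the exact mHG p-value computation of Eden et al. (2007) is not reproduced.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def hg_tail(N, B, n, b):
    """P(X >= b) when n of N genes are drawn and B of the N belong to the set."""
    hits = sum(math.comb(B, k) * math.comb(N - B, n - k)
               for k in range(b, min(B, n) + 1))
    return hits / math.comb(N, n)


def mhg(flags):
    """Minimum hypergeometric tail over all prefixes of a ranked 0/1 list."""
    N, B = len(flags), sum(flags)
    best, b = 1.0, 0
    for n, flag in enumerate(flags, start=1):
        b += flag
        if flag:  # with b fixed the tail grows with n, so minima occur right after a hit
            best = min(best, hg_tail(N, B, n, b))
    return best


def enrichment(gp, expression, gene_set, anti=False):
    """mHG score of gene_set at the top of the genes ranked by correlation
    (or, with anti=True, by anti-correlation) to the GP profile."""
    ranked = sorted(expression, key=lambda g: pearson(gp, expression[g]),
                    reverse=not anti)
    return mhg([1 if g in gene_set else 0 for g in ranked])


def beats_all_shuffles(gp, expression, gene_set, n_shuffles=100, seed=0):
    """True if the observed score is better than every shuffled-profile score."""
    rng = random.Random(seed)
    observed = enrichment(gp, expression, gene_set)
    for _ in range(n_shuffles):
        shuffled = list(gp)
        rng.shuffle(shuffled)
        if enrichment(shuffled, expression, gene_set) <= observed:
            return False
    return True
```

Here `expression` is assumed to be a dictionary mapping gene names to per-sample expression values in the same sample order as `gp`. Shuffling the glycan profile, rather than the gene labels, preserves the correlation structure among genes, which is why it gives a conservative empirical reference.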

3. Results

3.1. Serum N-glycan abundance is associated with tumour activity as reflected in gene expression profiles

3.1.1. Association of serum N-glycome with mRNA transcripts representing distinct pathways

A total of six glycan peaks (GP2, GP11, GP32, GP34, GP36 and GP37) were significantly associated with mRNA transcripts representing eight different WikiPathways gene sets with a false discovery rate (FDR) of 10%. In Table 1 the various glycan structures represented by each glycan peak (GP) are shown, with visualisation of the predominant structures. In Figure 2 and Table 2 an overview of the GPs significantly correlated with transcripts in associated pathways is presented. The biological functions of the significant transcripts fall mainly into two categories: 1. Cell activity (replication, repair and metabolism) and 2. Interaction with extracellular matrix (adhesion, integrins and angiogenesis).

Figure 1 – Work-flow (glycan peak and mRNA; correlation analysis; ranked list of genes (Figure 4); enrichment analysis: WikiPathways). Knowledge about function of mRNA transcripts can be used to infer biological function of serum glycans.

GP2 (mostly core fucosylated monoantennary glycan FA1 with contributions of high mannosylated glycan M4 and biantennary glycan A2) was associated with decreased DNA replication and mismatch repair. FA1 is an unusually short and simple glycan and might be only partially processed. GP11 (mostly high mannosylated glycan M6 with contribution of core fucosylated biantennary bisecting glycan FA2[3]BG1) was correlated with decreased mitochondrial fatty acid beta-oxidation. In Figure 3 a heatmap is shown representing the expression levels of all 19,240 genes after ranking them according to their anti-correlation to GP2 and after sorting the samples (columns) according to GP2 abundance levels. The vertical graph shows the significance level of mismatch repair genes in this ranking according to anti-correlation, with optimal enrichment attained at the top 227 genes (Figure 3).

The glycan peaks GP32, GP34, GP36 and GP37 are all mainly composed of trisialylated, trigalactosylated triantennary glycans. GP36 is the only one of these peaks dominated by core fucosylated glycans (mainly FA3G3S[3,6,6]3), while GP37 has a high proportion of outer arm fucosylated glycans (mainly A3F1G3S[3,3,3]3). Higher serum abundance of GP32 (mainly A3G3S[3,3,3]3) was correlated with increased triacylglyceride synthesis in the breast carcinoma. GP34 and GP36 shared a strong anti-correlation with tissue expression of transcripts involved in integrin-mediated cell adhesion, while GP34 was also correlated with decreased angiogenesis and GP36 was also correlated with decreased adipocyte activity. GP37 was strongly correlated with decreased tissue expression of transcripts involved in focal adhesion. In summary, GP34, GP36 and GP37 were associated with reduced cell adhesion (Figure 2).

When luminal A and basal-like subtypes were analysed separately, GP43 (tetraantennary, tetragalactosylated tetrasialylated glycans A4G4S4) was found to be positively correlated with tumour gene expression related to the class A Rhodopsin-like pathway within the luminal A samples (Figure 4A) and

Table 1 – Glycan peaks and their N-glycan composition. Detailed composition of all other N-glycans from human serum is in Saldova et al. (2014). Predominant structures are highlighted in bold with illustration of the structure.

Glycan label

GU

GP1

4.85

Glycans

Structure (highlighted glycans)

Table 1 (continued) Glycan label

GU

GP12

7.20

A1[3]G1S[3]1

7.20

A2G2

GP13

7.38 7.38

A1[3]G1S[6]1 A2BG2

GP14

7.62 7.62 7.62 7.62 7.62 7.62 7.76 7.76 7.76 7.76

M5A1G1 FA2G2 A2[3]G1S[3]1 A2[3]G1S[6]1 FA1G1S[3]1 FA1G1S[6]1 FA2BG2 M7 D3 A2[6]BG1S[3]1 A2[6]BG1S[6]1

5.34 5.41 5.41

M4 FA1 A2

GP3

5.61

A1[6]G1

GP4

5.78 5.78

A2B A1[3]G1

GP5

5.88

FA2

7.92 7.92 7.92 7.92 7.92 7.92 7.92 8.03 8.03 8.03 8.03

A2[3]BG1S[3]1 A2[3]BG1S[6]1 FA2[6]G1S[3]1 FA2[6]G1S[6]1 M4A1G1S[3]1 M4A1G1S[6]1 M7 D1 FA2[3]G1S[3]1 FA2[3]G1S[6]1 FA2[6]BG1S[3]1 FA2[6]BG1S[6]1

GP18

8.20 8.20

FA2[3]BG1S[3]1 FA2[3]BG1S[6]1

GP19

8.38 8.38 8.38

A2G2S[3]1 A2G2S[6]1 A3G3

GP20

8.53 8.53

A2BG2S[3]1 A2BG2S[6]1

GP15

GP7

6.18 6.24 6.24 6.24 6.38 6.38 6.55

M5 FA2B FA1[6]G1 A2[6]G1 FA1[3]G1 A2[3]G1 A2[6]BG1

GP16

GP17

GP8

GP9

GP10

GP11

6.71 6.71 6.71

A2[3]BG1 FA2[6]G1 M4A1G1

6.84

FA2[3]G1

6.95

7.08 7.08 7.08

Structure (highlighted glycans)

A1

GP2

GP6

Glycans

FA2[6]BG1

FA2[3]BG1 M6 D1,D2 M6 D3 (continued on next page)


Glycan label

GU

Glycans

GP21

8.63 8.63 8.63

M5A1G1S[3]1 M5A1G1S[6]1 FA3G3

8.63 8.63

M8 D2,D3 *A2G2S[3]1

GP22

8.80 8.80 8.80

FA2G2S[3]1 FA2G2S[6]1 M8 D1,D3

GP23

9.02 9.02

FA2BG2S[3]1 FA2BG2S[6]1

Structure (highlighted glycans)

Glycan label

9.21 9.21 9.21 9.21 9.21

A2F1G2S[3]1 A2F1G2S[6]1 *A2G2S[3,3]2 *A2G2S[3,6]2 *FA2G2S[3,3]2

GP25

9.43 9.43 9.43 9.62 9.62 9.62 9.62 9.62 9.79 9.79 9.79 9.79 9.79 9.79 10.04 10.04 10.04 10.04

A3G3S[3]1 A3G3S[6]1 M9 A3BG3S[3]1 A3BG3S[6]1 A2G2S[3,3]2 A2G2S[3,6]2 A2G2S[6,6]2 FA3G3S[3]1 FA3G3S[6]1 FA3BG3S[3]1 A2BG2S[3,3]2 A2BG2S[3,6]2 A2BG2S[6,6]2 A3F1G3S[3]1 FA2G2S[3,3]2 FA2G2S[3,6]2 FA2G2S[6,6]2

10.17 10.17 10.17 10.17 10.17 10.17 10.17 10.31 10.31 10.31 10.43 10.43 10.43

FA2BG2S[3,3]2 FA2BG2S[3,6]2 FA2BG2S[6,6]2 A2F1G2S[3,3]2 A2F1G2S[3,6]2 A2F1G2S[6,6]2 M9Glc A3G3S[3,3]2 A3G3S[3,6]2 A3G3S[6,6]2 A3BG3S[3,3]2 A3BG3S[3,6]2 A3BG3S[6,6]2

GP26

GP27

GP28

GP29

Glycans

GP30

10.60

A4G4S[3]1

GP31

10.77 10.77 10.77 10.77

FA3G3S[3,3]2 FA3G3S[3,6]2 FA3G3S[6,6]2 *A3G3S[3,3]2

GP32

10.96 10.96 10.96

*A3G3S[3,3,3]3 A3F1G3S[3,3]2 A4G4S[6]1

GP33

11.14 11.14 11.14 11.14 11.14 11.28 11.28 11.54 11.54 11.54 11.54 11.54 11.72 11.72 11.72

*A3G3S[3,3,6]3 *A3G3S[3,6,6]3 *A3BG3S[3,3,3]3 *A3BG3S[3,3,6]3 *A3BG3S[3,6,6]3 *FA3G3S[3,3,3]3 *FA3BG3S[3,3,3]3 A4G4S[3,6]2 A3G3S[3,3,3]3 A3G3S[3,3,6]3 A3G3S[3,6,6]3 A3G3S[6,6,6]3 A3BG3S[3,3,3]3 A3BG3S[3,3,6]3 A3BG3S[6,6,6]3

GP36

11.89 11.89 11.89 11.89

FA3G3S[3,3,3]3 FA3G3S[3,3,6]3 FA3G3S[3,6,6]3 FA3G3S[6,6,6]3

GP37

12.03 12.03 12.03 12.03 12.03 12.03 12.03 12.15 12.15

*A3G3S[3,3,3]3 *A3G3S[3,3,6]3 *A3G3S[3,6,6]3 A3F1G3S[3,3,3]3 A3F1G3S[3,3,6]3 FA3BG3S[3,3,3]3 FA3BG3S[6,6,6]3 A4G4S[3,3,3]3 A3F1G3S[3,6,6]3

12.33

A4G4S[3,3,6]3

12.33

A4G4S[3,6,6]3

GP34 GP24

GU

GP35

GP38

GP39

Structure (highlighted glycans)


Table 1 (continued) Glycan label GP40

GU

Glycans

12.48 12.48

A4F1G3S[3,3,3]3 A3F2G3S[3,3,3]3

12.48

A4F1G3S[3,3,6]3

12.48

A4F1G3S[3,6,6]3

GP41

12.67 12.67 12.67 12.67 12.78

A3F2G3S[3,3,6]3 A4F2G3S[3,3,3]3 A4F2G3S[3,3,6]3 *A4G4S[3,3,3,3]4 A4G4S[3,3,3,3]4

GP42

12.96

A4G4S[3,3,3,6]4

GP43

13.27 13.27 13.27

Structure (highlighted glycans)

*A4G4S[3,3,3,6]4 A4G4S[3,3,6,6]4 A4G4S[3,6,6,6]4

GP44

13.47 13.47 13.47 13.47

*A4G4S[3,3,3,3]4 FA4G4S[3,3,3,3]4 FA4G4S[3,3,3,6]4 A4BG4S[3,3,6,6]4

GP45

13.82 13.82 13.82 13.82

A4F1G4S[3,3,3,3]4 A4F1G4S[3,3,3,6]4 A4F1G4S[3,3,6,6]4 A4F1G4S[3,6,6,6]4

GP46

13.99 13.99 13.99 13.99 14.43

A4G4LacS[3,3,3,3]4 A4G4LacS[3,3,3,6]4 A4F2G4S[3,3,3,3]4 A4F2G4S[3,3,6,6]4 A4F3G4S[3,3,3,3]4

Structure abbreviations: all N-glycans have two core GlcNAcs; F at the start of the abbreviation indicates a core fucose α1,6-linked to the inner GlcNAc; Mx, number (x) of mannose on core GlcNAcs; Ax, number of antenna (GlcNAc) on trimannosyl core; A2, biantennary with both GlcNAcs as β1,2-linked; A3, triantennary with a GlcNAc linked β1,2 to both mannose and the third GlcNAc linked β1,4 to the α1,3 linked mannose; A4, GlcNAcs linked as A3 with additional GlcNAc β1,6 linked to α1,6 mannose; B, bisecting GlcNAc linked β1,4 to β1,3 mannose; Gx, number (x) of β1,4 linked galactose on antenna; F(x), number (x) of fucose linked α1,3 to antenna GlcNAc; Sx, number (x) of sialic acids linked to galactose; Dx, isoforms with different mannose-binding. *Sialic acid isomers (same composition but different sialic acid linkage arrangements resulting in different GUs from the original structures).

with mismatch repair in basal-like samples (Figure 4B). In the basal-like subgroup, we found that the "alpha 6 beta 4 integrin signalling pathway" was positively correlated with GP30 (tetraantennary, tetragalactosylated monosialylated glycan A4G4S[3]1) and negatively correlated with GP6 (mostly high mannosylated glycan M5 but also mono- and biantennary glycans with or without core fucose) (Figure 4B). This pathway mainly consists of integrins and laminins in addition to cancer related genes like PI3K-receptors and MAP-kinases. GP2 (mostly core fucosylated monoantennary glycan FA1) was negatively correlated with DNA replication as well as cell cycle regulation in basal-like samples (Figure 4B).

3.1.2. Association of the serum N-glycome with tumour miRNA levels

We analysed the correlation between GP abundance levels and miRNA expression levels in this cohort and observed no statistically significant correlation (see Figure A.2). We also performed an integrated analysis, similar to that described above for pathways, to study the association of glycan peaks with miRNA targets (mRNA transcripts). The results are described in Appendix B.

3.2. Serum glycan abundance is associated with clinical parameters

Associations between glycan peaks and clinical parameters are summarized in Figure 5.

3.2.1. Association of serum N-glycome with tumour size, CTCs and other clinical parameters

Increased tumour size was associated with low abundance of GP12 (biantennary digalactosylated glycan A2G2 and monoantennary monogalactosylated monosialylated glycan A1[3]G1S[3]1) and GP14 (mostly core fucosylated biantennary glycan FA2G2, but also other mono- and biantennary glycans) (Figure 5). These two glycan peaks were not significant in the luminal A and basal-like subtypes separately.

Occurrence of CTCs at time of diagnosis was associated with decreased abundance of GP25 (mainly biantennary digalactosylated disialylated glycans without fucosylation, A2G2S[6,6]2), GP34 (mainly triantennary trigalactosylated trisialylated glycans, A3G3S[3,3,6]3) and GP36 (mainly core fucosylated triantennary trigalactosylated trisialylated glycans, FA3G3S[3,6,6]3) (Figure 5 and Table 1). GP36 was also found to be significantly associated with CTCs in women with basal-like carcinomas only (p = 0.014). GP25 and GP34 were not significant in the luminal A or basal-like subtypes separately. GP34 and GP36 are both also associated with tumour transcript activity related to cell adhesion, as indicated above.

Estrogen receptor status was associated with GP22 (mostly core fucosylated biantennary digalactosylated monosialylated glycans) with an FDR of 0.092, but the direction of alteration between the different levels of estrogen staining was not consistent, so we concluded the result was not significant. We did not find significant associations between serum glycans and lymph node status, progesterone receptor status, gene expression subtype, HER2 status or TP53 mutation status (Appendix C, Table C.2). When patients with luminal A (n = 29) and basal-like (n = 21) cancers were analysed separately, no significant associations were found with clinical parameters after correction for multiple testing.

Figure 2 – GP-pathways enrichment networks. Each network is composed of glycan peaks (GP) indicated by oval nodes and WikiPathways gene sets indicated by rectangle nodes. An edge in the network represents a significant association between the GP and the connected pathway. Red edges indicate a significant positive correlation between the GP and the pathway genes, and blue edges indicate a significant anti-correlation between the GP and the pathway genes, all at FDR <10%. The width of an edge is correlated to the associated −log(enrichment p-value). Each pathway node is coloured according to the sum of −log(enrichment p-value) of edges connected to it. Glycan structures contributing to each peak are listed, underscoring the structure contributing most to the peak.

3.2.2. Association of serum N-glycome with survival

Using the log-rank test, three glycan peaks were significantly associated with breast cancer specific survival: GP1 (monoantennary glycans, A1), GP3 (monoantennary monogalactosylated glycans, A1[6]G1) and GP22 (mostly core fucosylated biantennary digalactosylated monosialylated glycan, FA2G2S[6]1). Improved survival was observed with higher serum levels of these glycans. Kaplan–Meier plots are shown in Figure 6, comparing the top 33% of the samples to the bottom 33%. Furthermore, glycans with complex structures (more than 2 antennas) are more abundant in samples taken from patients with poor survival (Figure 7).

For patients with ER negative tumours, simple glycan structures GP1 (monoantennary glycans, A1) and GP3 (monoantennary monogalactosylated glycans, A1[6]G1) were still correlated with good prognosis, while more complex glycan structures GP34 (mainly triantennary trigalactosylated trisialylated glycans, A3G3S[3,3,6]3) and GP37 (mostly outer arm fucosylated triantennary trigalactosylated trisialylated glycans, A3F1G3S[3,3,3]3) were correlated with poor prognosis (log-rank FDR = 0.047, Appendix C, Table C.3 and Appendix A, Figure A.3). For ER-positive tumours, GP1 was still associated with favourable prognosis, as were GP22 (mostly core-fucosylated biantennary glycans, FA2G2S[6]1) and GP29 (mostly triantennary trigalactosylated disialylated glycans, A3G3S[3,6]2) (Appendix A, Figure A.3 and Appendix C, Table C.3).
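The survival comparison reported here follows the steps given in Section 2.6: logit transformation, age correction by regression residuals, a split into top and bottom thirds, and a log-rank test. The sketch below is a hedged, standard-library illustration of those steps and not the code used in the study; the function names are hypothetical and the log-rank p-value uses the usual chi-square approximation with one degree of freedom.

```python
import math


def logit(p):
    """Map a relative peak area in (0, 1) to the real line."""
    return math.log(p / (1.0 - p))


def age_residuals(values, ages):
    """Residuals of the least-squares line values ~ age (age correction)."""
    n = len(values)
    ma, mv = sum(ages) / n, sum(values) / n
    slope = (sum((a - ma) * (v - mv) for a, v in zip(ages, values))
             / sum((a - ma) ** 2 for a in ages))
    intercept = mv - slope * ma
    return [v - (intercept + slope * a) for a, v in zip(ages, values)]


def tertile_groups(scores):
    """Sample indices of the bottom third and the top third by score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    k = len(scores) // 3
    return order[:k], order[-k:]


def logrank(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; events are 1 (death observed) or 0 (censored).

    Returns (chi-square statistic, p-value from chi-square with 1 d.f.).
    Assumes at least one event occurs while both groups are still at risk.
    """
    data = ([(t, e, 0) for t, e in zip(times_a, events_a)]
            + [(t, e, 1) for t, e in zip(times_b, events_b)])
    obs_minus_exp, var = 0.0, 0.0
    for t in sorted({t for t, e, _ in data if e}):
        at_risk = [g for (u, _, g) in data if u >= t]
        n, n_a = len(at_risk), at_risk.count(0)
        d = sum(1 for (u, e, _) in data if u == t and e)
        d_a = sum(1 for (u, e, g) in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            var += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / var
    return chi2, math.erfc(math.sqrt(chi2 / 2))
```

In use, the relative areas of one GP would be passed through `logit`, then `age_residuals`, and the two index lists from `tertile_groups` would select the follow-up times and event indicators handed to `logrank`. The middle third of the samples is deliberately left out, matching the grey zone described in the methods.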

4. Discussion

Associations between serum glycans and mRNA transcripts in breast carcinomas have not previously been studied. We found significant associations between levels of serum glycans and expression levels of mRNA transcripts, and of the transcription of specific sets of functionally related genes, all in the actual breast carcinomas (Figure 2 and Table 2). The associations we report are statistically significant both from the perspective of a null model and when tested against data shuffling. Overall, our findings demonstrate that molecular characteristics of breast tumours manifest themselves, or leave distinguishable traces, in the patient’s serum glycome.

Core fucosylated monoantennary glycans (GP2) were associated with low expression of transcripts related to DNA-replication and mismatch repair. GP2 was not significantly associated with survival, although a trend was observed. These observations may be in line with reduced DNA-replication activity in the tumour, and with a low level of mismatch repair acting through increased lethality of tumour cells in response to therapy. Transcripts involved in mismatch repair were positively associated with tetraantennary tetragalactosylated tetrasialylated glycans (GP43) in basal-like samples only, but no association with survival was observed here. We have previously found that core fucosylated monoantennary glycans (GP2) are associated with low serum estradiol and with postmenopausal status, which is also in line with reduced replication (Saldova et al., 2014).

High mannosylated glycan M6 (GP11) was associated with a decrease in breast carcinoma mitochondrial fatty acid beta-oxidation. In normal cellular metabolism the fatty acids are degraded into acetyl-CoA, which enters the citric acid cycle, and into NADH and FADH2, which are both used by the electron transport chain. An increase in mitochondrial fatty acid beta-oxidation occurs during febrile illness, fasting and increased muscular activity and increases the cell’s energy production. At all times, this process contributes as much as 80% of energy for heart and liver functions (Rinaldo et al., 2002; Eaton et al., 1996). During carcinogenesis, breast cancer cells increase their energy production through the aerobic glycolytic pathway (“the Warburg effect”). This has been thought to occur instead of energy production from mitochondrial beta-oxidation, but recent research has shown that both processes may occur at the same time and represent a total increase in energy production (Koppenol et al., 2011). In line with this, we have previously found M6 (GP11) to be lower in the serum of breast cancer patients than in healthy women (Saldova et al., 2014). Put together with the statistical association we observe in the current study, these findings support a role for higher mitochondrial beta-oxidation of fatty acids as contributing to an increase in total energy production by cancer cells. Oestrogen receptor alpha (ESR1) is reported to induce a protein necessary for fatty acid beta-oxidation in the mitochondria, which may explain the role of ESR1 in lipid metabolism and add to its role in carcinogenesis (Zhou et al., 2012). Indeed, we see in our data that women with oestrogen receptor positive tumours tend to have lower serum levels of high mannosylated glycan M6 (GP11) than women with oestrogen receptor negative tumours (t-test, p = 0.07), which in turn corresponds to higher levels of mitochondrial beta-oxidation in the tumour.

Glycan structures modify membrane-bound proteins that interact with the extracellular matrix and stroma. We identified associations between integrins, cell adhesion and other related functions and the serum abundance of several highly branched and sialylated glycan structures (GP30, GP34, GP36, GP37 and GP43). Triantennary trigalactosylated trisialylated glycans (GP34, GP36 and GP37) are all strongly associated with low levels of transcripts related to adhesion. Triantennary glycans have previously been linked to accelerated tumour growth and metastasis (Miwa et al., 2013). This may be caused by low levels of integrin-mediated cell adhesion resulting in increased mobility of tumour cells. We found that high serum abundance of GP34 is correlated with shorter survival in patients with ER negative tumours only (Appendix C, Table C.3 and Appendix A, Figure A.3). The association with poor survival in ER negative tumours may also be linked to previous observations that increased sialylation of integrins increases tumour cell aggressiveness (Lesniak et al., 2013; Miwa et al., 2013), as high serum GP34 may be indicative of a high ratio of sialic acid to integrins. We emphasize, however, that we do not have information about integrin-specific sialylation in this study. Both GP34 and GP36 were associated with increased BMI, and GP34 also with low mammographic density, in a previous study (Saldova et al., 2014). Both these parameters are associated with poor prognosis in previous studies (Chan et al., 2014; Olsen et al., 2009). N-glycosylation of integrins has previously been linked to trastuzumab resistance (Lesniak et al., 2013). However, GP34 and GP36 were not associated with survival in patients with HER2+ tumours. The association of GP34 and GP36 with low levels of CTCs is, however, not in line with tumour aggressiveness.

Outer arm fucosylated triantennary trigalactosylated trisialylated glycans (GP37, containing the sLex epitope) were correlated with poor prognosis in ER negative tumours only (log rank FDR = 0.047, Appendix C, Table C.3). Increased serum levels of sLex have been associated with cancer and with the metastatic process through interaction with E-selectin on endothelial cells (Julien et al., 2011; reviewed in Potapenko et al., 2009). In line with this, the expression of the gene carbohydrate sulfotransferase 1 (CHST1), which is predicted to be involved in the synthesis of 6′-sulfated sLex, was associated with poor prognosis in these patients (Potapenko et al., 2015). We have also previously found the sLex epitope to be associated with the presence of CTCs (Saldova et al., 2011), which is in line with poor prognosis in ER negative tumours, but the association with CTCs was not confirmed in this study. The study populations of the two studies were quite different, as the previous study included metastatic patients only, whereas this study includes non-metastatic (stage I–II) patients only. Disseminated disease is expected to affect the serum N-glycome differently from early-stage disease. In addition, in our previous studies on breast cancer we have described that sLex-containing glycans are associated with breast cancer progression and originate from acute phase proteins (Abd Hamid et al., 2008). An increase in sLex-containing glycans correlates not only with a higher number of CTCs in advanced breast cancer (Saldova et al., 2011), but also with lymph node positive early breast cancer (Pierce et al., 2010). Therefore the CTCs most likely indicate advanced disease, which is then reflected in the serum glycomic profile, rather than the detected glycans being directly derived from CTCs.

Table 2 – Overview of GPs correlated with transcripts and associated pathways. Other traits associated with the GPs are indicated. Glycan structure contributing most to each peak is indicated in bold.

Column headings: Glycan structures; Transcripts associated with pathways (Cell activity; Extracellular interaction); Other associations.

Glycan peaks: GP2, GP11, GP30, GP32, GP34, GP36, GP37, GP43.

Glycan structures: M4 FA1 A2 FA2[3]BG1 M6 D1,D2 M6 D3 A4G4S[3]1 *A3G3S[3,3,3]3 A3F1G3S[3,3]2 A4G4S[6]1 *FA3G3S[3,3,3]3 *FA3BG3S[3,3,3]3 A4G4S[3,6]2 A3G3S[3,3,3]3 A3G3S[3,3,6]3 A3G3S[3,6,6]3 A3G3S[6,6,6]3 FA3G3S[3,3,3]3 FA3G3S[3,3,6]3 FA3G3S[3,6,6]3 FA3G3S[6,6,6]3 *A3G3S[3,3,3]3 *A3G3S[3,3,6]3 *A3G3S[3,6,6]3 A3F1G3S[3,3,3]3 A3F1G3S[3,3,6]3 FA3BG3S[3,3,3]3 FA3BG3S[6,6,6]3 *A4G4S[3,3,3,6]4 A4G4S[3,3,6,6]4 A4G4S[3,6,6,6]4

Pathway and trait entries: ↓DNA-replication; ↓Mismatch repair; ↓Cell cycle (basal); ↓Mitochondrial fatty acid beta-oxidation; ↑Triglyceride synthesis; postmenopausal status1); ↓serum estradiol1); normal breast tissue1); ↑α6β4 signaling pathway (in basal-like tumours); breast carcinomas1); ↓Integrin-mediated cell adhesion; ↓Angiogenesis; ↓Adipocyte; ↓Integrin-mediated cell adhesion; ↓Focal adhesion; ↑Mismatch repair (basal); ↓survival in ER neg; ↓CTCs; ↓miR588-targets; ↑BMI1); ↓mammographic density1); ↓CTCs; ↑BMI1); ↓miR642a-targets; ↑GPCRs, Rhodopsin-like G-protein receptors (in luminal A tumours)

1) From Saldova et al. (2014).

Figure 3 – Enrichment of genes in the Mismatch Repair Pathway in genes anti-correlated with GP2. The heatmap represents the expression levels of all 19,240 genes measured, after ranking them according to their anti-correlation to GP2 (vertical direction) and after sorting the samples according to serum GP2 abundance levels (horizontal direction). The top panel, in blue, presents the abundance levels of GP2 across the 103 samples. The vertical graph shows the significance level, in −log(hypergeometric p-value), of Mismatch Repair genes in the ranked list of genes. Optimal enrichment is attained at the top 227 genes with an enrichment p-value of < 7E-8.
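The Figure 3 caption describes scanning a ranked gene list for the cutoff at which a pathway's genes are most enriched, scored by a hypergeometric p-value. A minimal sketch of such a scan is given below, assuming a plain hypergeometric tail evaluated at every cutoff; the function names and the toy gene list are hypothetical, and the sketch does not reproduce the authors' software. The minimum over cutoffs is a statistic rather than a final p-value, and it would still need correcting for the scan over cutoffs.

```python
from math import comb, log10


def hypergeom_tail(N, B, n, b):
    """P(X >= b) when n genes are drawn from N, of which B are in the gene set."""
    upper = min(n, B)
    favourable = sum(comb(B, k) * comb(N - B, n - k) for k in range(b, upper + 1))
    return favourable / comb(N, n)


def best_enrichment(ranked_genes, gene_set):
    """Scan cutoffs down a ranked list; return (cutoff, smallest tail p-value)."""
    N = len(ranked_genes)
    B = sum(1 for g in ranked_genes if g in gene_set)
    best_n, best_p, hits = 0, 1.0, 0
    for n, gene in enumerate(ranked_genes, start=1):
        if gene in gene_set:
            hits += 1
            # The tail probability can only improve right after a hit,
            # so it is enough to evaluate it at these positions.
            p = hypergeom_tail(N, B, n, hits)
            if p < best_p:
                best_n, best_p = n, p
    return best_n, best_p


# Toy example: 10 genes ranked by anti-correlation to a glycan peak,
# with a hypothetical 3-gene pathway.
ranked = [f"g{i}" for i in range(1, 11)]
pathway = {"g1", "g2", "g4"}
cutoff, p_value = best_enrichment(ranked, pathway)
score = -log10(p_value)  # the quantity plotted as -log(p-value)
```

In the toy example the best cutoff is the top four genes, where all three pathway genes have been seen.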
CTCs are not highly abundant in serum; most of the glycans we detect therefore probably originate from acute phase proteins.

We found that a high abundance of monoantennary glycan structures (GP1 and GP3) was associated with increased survival (Figure 6). This was true for the basal-like and ER negative subtypes alone, but less pronounced in the luminal A and ER positive subtypes (Appendix C, Table C.3 and Appendix A, Figure A.3). GP1 was also associated with targets of the let-7 miRNA family. These targets include several oncogenes, such as RAS (Johnson et al., 2005), and have been found associated with favourable prognosis (Iorio et al., 2005). The only large glycans positively associated with favourable prognosis are triantennary trigalactosylated disialylated glycans, which are significant in ER-positive tumours only.

The tumour transcripts associated with serum glycans were not enriched in glycan biosynthesis (Figures 2 and 4). Similarly, few enzymes responsible for glycan modification were significantly associated with the serum glycans. We have previously found indications that serum glycan levels represent systemic features rather than tumour-specific features (Saldova et al., 2014). We do not expect the serum glycans to be produced by the tumour itself, but rather to reflect systemic changes related to the tumour phenotype. Tumour aggressiveness (represented by tumour size and presence of CTCs) may stimulate a systemic host response that is reflected in the serum glycome. Similarly, hormone status of a tumour may be related to the systemic hormone level, which in turn will affect the serum glycome.

Our study is the first to combine RNA high throughput profiling with serum glycomics analysis in a large cancer cohort. Several technology improvements can make future results even stronger. The UPLC analysis, while sensitive and accurate, only provides information about relative abundances. In addition, our current analysis, using PNGaseF, only addresses N-glycans. Another limitation of the UPLC technique used herein is that we may miss information from certain glycan structures that constitute only minor portions of the glycan peak they belong to, as the variations of the structures constituting the major portions dominate the signal. As gene expression profiling is a stable and comprehensive technology, it can help extract biological knowledge from the measured glycomics. Identification of the target proteins of the glycan alterations would greatly aid the search for specific biomarkers and the understanding of mechanisms that underlie the observed phenomena. In this sense, enabling the characterization of intact glycopeptides, rather than just released glycans, will provide much deeper insight into disease mechanisms.

Figure 4 – Subtype-specific GP pathway enrichment networks. A) Luminal A carcinomas only. B) Basal-like carcinomas only. An edge in the network represents a significant association between the GP and the connected pathway. A red edge indicates a significant correlation between the GP and the pathway genes, and a blue edge indicates a significant anti-correlation between the GP and the pathway genes. Glycan structures contributing to each peak are listed, underscoring the structure contributing most to the peak.

Figure 5 – Overview of glycan peak associations. Glycan peaks significantly associated with clinical parameters such as survival, tumour size (T size), estrogen receptor status (ER) and circulating tumour cells (CTC) and gene expression: mRNA-expression (purple arrows), miRNA-targets (orange arrows). Glycan structures contributing to each peak are listed, underscoring the structure contributing most to the peak.

Figure 5 panel labels: trait rows survival, surv - ER+, surv - ER-, surv - basal, T size, CTC; horizontal axis GU, with peaks GP1 to GP46; legend entries "Significantly higher abundance with increase in trait", "Significantly lower abundance with increase in trait" and "GPs associated with mRNA transcripts levels in the tumour". Glycan structures shown: GP1: A1; GP3: A1[6]G1; GP12: A1[3]G1S[3]1, A2G2; GP14: M5A1G1, FA2G2, A2[3]G1S[3]1, A2[3]G1S[6]1, FA1G1S[3]1, FA1G1S[6]1; GP15: FA2BG2, M7 D3, A2[6]BG1S[3]1, A2[6]BG1S[6]1; GP25: A3G3S[3]1, A3G3S[6]1, M9, A3BG3S[3]1, A3BG3S[6]1, A2G2S[3,3]2, A2G2S[3,6]2, A2G2S[6,6]2; GP34: FA3G3S[3,3,3]3, FA3BG3S[3,3,3]3, A4G4S[3,6]2, A3G3S[3,3,3]3, A3G3S[3,3,6]3, A3G3S[3,6,6]3, A3G3S[6,6,6]3; GP36: FA3G3S[3,3,3]3, FA3G3S[3,3,6]3, FA3G3S[3,6,6]3, FA3G3S[6,6,6]3.

Figure 6 – Glycan peaks significantly associated with survival. High abundance of the glycan structures (red line, representing the top 33% of samples, see Methods) is associated with favourable prognosis. Glycan values are adjusted for age (see Methods). Red: high serum abundance. Green: low serum abundance (representing the bottom 33% of samples, see Methods).
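The survival curves summarised in Figure 6 are Kaplan-Meier estimates for the top and bottom thirds of samples. As a minimal illustration of how one such curve is computed, the sketch below estimates the survival function for a single group in plain Python. The function name `kaplan_meier` and the follow-up data are hypothetical, and the age adjustment of glycan values described in the Methods is not reproduced.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is 1 if death was observed at times[i], 0 if the patient
    was censored (still alive or lost to follow-up) at that time.
    """
    survival = 1.0
    curve = []
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        at_risk = sum(1 for ti in times if ti >= t)
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        survival *= 1.0 - deaths / at_risk  # product-limit update
        curve.append((t, survival))
    return curve


# Hypothetical follow-up times (months) and event flags for one group.
times = [5, 8, 12, 12, 20, 30]
events = [1, 0, 1, 1, 0, 0]
curve = kaplan_meier(times, events)
```

Censored patients contribute to the number at risk up to their last follow-up but never trigger a drop in the curve, which is why the estimate steps down only at observed deaths.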

Figure 7 – Overabundance of highly branched glycans in patients who died of breast cancer. The cohort of breast cancer samples was divided into samples taken from patients that were reported to have died from the disease (Class A) and samples taken from patients that did not die from breast cancer (Class B). Boxplots represent the GP abundance fold change levels between the average in Class A and the average in Class B. The Y-axis is log fold change. The left boxplot represents glycans with 2 antennae or less and the right boxplot represents glycans with more than 2 antennae. The relative abundance of glycans in patients in Class A (dead of breast cancer) compared with patients alive (Class B) is generally higher for glycans with more than two antennae. The box is coloured light grey from the median up to the 75th percentile and dark grey from the median down to the 25th percentile.

5. Conclusions

We have demonstrated how integrating serum glycomics with gene expression from breast carcinomas may be used to identify serum glycans related to breast carcinogenesis and functional processes in the tumour. This approach may improve the search for biologically relevant serum markers of malignant disease. The strongest associations between tumour gene expression and serum glycan structures were seen for trisialylated trigalactosylated triantennary glycans with or without fucose (GP34, GP36 and GP37), which are correlated with lower adhesion. Reduced adhesion may facilitate invasion and migration of the cancer cells, and we do see a trend in correlation between higher serum levels of GP34 and GP37 and poor prognosis in patients with ER negative tumours (p = 0.03 and 0.02 respectively).

We have also identified serum glycan structures related to breast cancer clinical traits, such as survival and primary tumour size. High serum abundance of simple glycan structures is associated with increased survival, particularly in the basal-like subgroup, and high-antennary structures are associated with poor prognosis. Presence of CTCs is the clinical trait that is most strongly associated with the serum N-glycome, supporting previous observations that systemic traits influence the serum glycome more than organ-specific traits (Saldova et al., 2014).

Acknowledgements

We would like to thank MD Ivan Potapenko for help with R-scripts. The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement n° 260600 (“GlycoHIT”, coordinator Professor Lokesh Joshi, National University of Ireland, Galway).

Appendix A. Supplementary data

Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.molonc.2015.08.002.

Conflict of interest

The authors declare no conflict of interest.

References

Abd Hamid, U.M., Royle, L., Saldova, R., Radcliffe, C.M., Harvey, D.J., Storr, S.J., et al., 2008. A strategy to reveal potential glycan markers from serum glycoproteins associated with breast cancer progression. Glycobiology 18, 1105–1118.
Aure, M.R., Leivonen, S.K., Fleischer, T., Zhu, Q., Overgaard, J., Alsner, J., et al., 2013a. Individual and combined effects of DNA methylation and copy number alterations on miRNA expression in breast tumors. Genome Biol. 14, R126.
Aure, M.R., Steinfeld, I., Baumbusch, L.O., Liestol, K., Lipson, D., Nyberg, S., et al., 2013b. Identifying in-trans process associated genes in breast cancer by integrated analysis of copy number and expression data. PLoS One 8, e53014.
Bicker, K.L., Sun, J., Harrell, M., Zhang, Y., Pena, M., Thompson, P., et al., 2012. Synthetic lectin arrays for the detection and discrimination of cancer associated glycans and cell lines. Chem. Sci. 3, 1147–1156.
Bosma, A.J., Weigelt, B., Lambrechts, A.C., Verhagen, O.J., Pruntel, R., Hart, A.A., et al., 2002. Detection of circulating breast tumor cells by differential expression of marker genes. Clin. Cancer Res. 8, 1871–1877.
Chan, D.S., Vieira, A.R., Aune, D., Bandera, E.V., Greenwood, D.C., McTiernan, A., et al., 2014. Body mass index and survival in women with breast cancer – systematic literature review and meta-analysis of 82 follow-up studies. Ann. Oncol. 25 (10), 1901–1914.
Chandler, K., Goldman, R., 2013. Glycoprotein disease markers and single protein-omics. Mol. Cell Proteomics 12, 836–845.
Cross, S.S., Hamdy, F.C., Deloulme, J.C., Rehman, I., 2005. Expression of S100 proteins in normal human tissues and common cancers using tissue microarrays: S100A6, S100A8, S100A9 and S100A11 are all overexpressed in common cancers. Histopathology 46, 256–269.
Eaton, S., Bartlett, K., Pourfarzam, M., 1996. Mammalian mitochondrial beta-oxidation. Biochem. J. 320 (Pt 2), 345–357.
Eden, E., Lipson, D., Yogev, S., Yakhini, Z., 2007. Discovering motifs in ranked lists of DNA sequences. PLoS Comput. Biol. 3, e39.
Enerly, E., Steinfeld, I., Kleivi, K., Leivonen, S.K., Aure, M.R., Russnes, H.G., et al., 2011. miRNA-mRNA integrated analysis reveals roles for miRNAs in primary breast tumors. PLoS One 6, e16915.
Hand, D.J., 1992. Statistical methods in diagnosis. Stat. Methods Med. Res. 1, 49–67.
Iorio, M.V., Ferracin, M., Liu, C.G., Veronese, A., Spizzo, R., Sabbioni, S., et al., 2005. MicroRNA gene expression deregulation in human breast cancer. Cancer Res. 65, 7065–7070.
Johnson, S.M., Grosshans, H., Shingara, J., Byrom, M., Jarvis, R., Cheng, A., et al., 2005. RAS is regulated by the let-7 microRNA family. Cell 120, 635–647.
Julien, S., Ivetic, A., Grigoriadis, A., QiZe, D., Burford, B., Sproviero, D., et al., 2011. Selectin ligand sialyl-Lewis x antigen drives metastasis of hormone-dependent breast cancers. Cancer Res. 71, 7683–7693.
Kelder, T., van Iersel, M.P., Hanspers, K., Kutmon, M., Conklin, B.R., Evelo, C.T., et al., 2012. WikiPathways: building research communities on biological pathways. Nucleic Acids Res. 40, D1301–D1307.
Kim, H., Golub, G.H., Park, H., 2005. Missing value estimation for DNA microarray gene expression data: local least squares imputation. Bioinformatics 21, 187–198.
Koppenol, W.H., Bounds, P.L., Dang, C.V., 2011. Otto Warburg’s contributions to current concepts of cancer metabolism. Nat. Rev. Cancer 11, 325–337.
Kuster, B., Wheeler, S.F., Hunter, A.P., Dwek, R.A., Harvey, D.J., 1997. Sequencing of N-linked oligosaccharides directly from protein gels: in-gel deglycosylation followed by matrix-assisted laser desorption/ionization mass spectrometry and normal-phase high-performance liquid chromatography. Anal. Biochem. 250, 82–101.
Lebrilla, C.B., 2013. Is high throughput glycomics possible? Mass Spectrom. 2, S0016.
Lesniak, D., Sabri, S., Xu, Y., Graham, K., Bhatnagar, P., Suresh, M., et al., 2013. Spontaneous epithelial-mesenchymal transition and resistance to HER-2-targeted therapies in HER-2-positive luminal breast cancer. PLoS One 8, e71987.
Lewis, B.P., Burge, C.B., Bartel, D.P., 2005. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15–20.
Li, C., Simeone, D.M., Brenner, D.E., Anderson, M.A., Shedden, K.A., Ruffin, M.T., et al., 2009. Pancreatic cancer serum detection using a lectin/glyco-antibody array method. J. Proteome Res. 8, 483–492.
Miwa, H.E., Koba, W.R., Fine, E.J., Giricz, O., Kenny, P.A., Stanley, P., 2013. Bisected, complex N-glycans and galectins in mouse mammary tumor progression and human breast cancer. Glycobiology 23, 1477–1490.
Molloy, T.J., Bosma, A.J., Baumbusch, L.O., Synnestvedt, M., Borgen, E., Russnes, H.G., et al., 2011. The prognostic significance of tumour cell detection in the peripheral blood versus the bone marrow in 733 early-stage breast cancer patients. Breast Cancer Res. 13, R61.
Naume, B., Borgen, E., Kvalheim, G., Karesen, R., Qvist, H., Sauer, T., et al., 2001. Detection of isolated tumor cells in bone marrow in early-stage breast carcinoma patients: comparison with preoperative clinical parameters and primary tumor characteristics. Clin. Cancer Res. 7, 4122–4129.
Naume, B., Zhao, X., Synnestvedt, M., Borgen, E., Russnes, H.G., Lingjaerde, O.C., et al., 2007. Presence of bone marrow micrometastasis is associated with different recurrence risk within molecular subtypes of breast cancer. Mol. Oncol. 1, 160–171.
Olsen, A.H., Bihrmann, K., Jensen, M.B., Vejborg, I., Lynge, E., 2009. Breast density and outcome of mammography screening: a cohort study. Br. J. Cancer 100, 1205–1208.
Parker, N., Turk, M.J., Westrick, E., Lewis, J.D., Low, P.S., Leamon, C.P., 2005. Folate receptor expression in carcinomas and normal tissues determined by a quantitative radioligand binding assay. Anal. Biochem. 338, 284–293.
Pierce, A., Saldova, R., Abd Hamid, U.M., Abrahams, J.L., McDermott, E.W., Evoy, D., et al., 2010. Levels of specific glycans significantly distinguish lymph node-positive from lymph node-negative breast cancer patients. Glycobiology 20, 1283–1288.
Potapenko, I.O., Haakensen, V.D., Lüders, T., Helland, A., Bukholm, I., Sørlie, T., et al., 2010. Glycan gene expression signatures in normal and malignant breast tissue; possible role in diagnosis and progression. Mol. Oncol. 4 (2), 98–118.
Potapenko, I.O., Lüders, T., Russnes, H.G., Helland, A., Sørlie, T., Kristensen, V.N., et al., 2015. Glycan-related gene expression signatures in breast cancer subtypes; relation to survival. Mol. Oncol. 9 (4), 861–876.
Rinaldo, P., Matern, D., Bennett, M.J., 2002. Fatty acid oxidation disorders. Annu. Rev. Physiol. 64, 477–502.
Royle, L., Campbell, M.P., Radcliffe, C.M., White, D.M., Harvey, D.J., Abrahams, J.L., et al., 2008. HPLC-based analysis of serum N-glycans on a 96-well plate platform with dedicated database software. Anal. Biochem. 376, 1–12.
Royle, L., Radcliffe, C.M., Dwek, R.A., Rudd, P.M., 2006. Detailed structural analysis of N-glycans released from glycoproteins in SDS-PAGE gel bands using HPLC combined with exoglycosidase array digestions. Methods Mol. Biol. 347, 125–143.
Ruhaak, L.R., Miyamoto, S., Lebrilla, C.B., 2013. Developments in the identification of glycan biomarkers for the detection of cancer. Mol. Cell Proteomics 12, 846–855.
Saldova, R., Asadi Shehni, A., Haakensen, V.D., Steinfeld, I., Hilliard, M., Kifer, I., et al., 2014. Association of N-glycosylation with breast carcinoma and systemic features using high-resolution quantitative UPLC. J. Proteome Res. 13 (5), 2314–2327.
Saldova, R., Reuben, J.M., Abd Hamid, U.M., Rudd, P.M., Cristofanilli, M., 2011. Levels of specific serum N-glycans identify breast cancer patients with higher circulating tumor cell counts. Ann. Oncol. 22, 1113–1119.
Smyth, G.K., 2005. Limma: linear models for microarray data. In: Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer, New York, pp. 397–420.
Sorlie, T., Perou, C.M., Tibshirani, R., Aas, T., Geisler, S., Johnsen, H., et al., 2001. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc. Natl. Acad. Sci. U. S. A. 98, 10869–10874.
Sorlie, T., Wang, Y., Xiao, C., Johnsen, H., Naume, B., Samaha, R.R., et al., 2006. Distinct molecular mechanisms underlying clinically relevant subtypes of breast cancer: gene expression analyses across three different platforms. BMC Genomics 7, 127.
Steinfeld, I., Navon, R., Creech, M., Yakhini, Z., Tsalenko, A., 2014. ENViz: A Cytoscape App for Integrated Statistical Analysis and Visualization of Sample-matched Data with Multiple Data Types. http://apps.cytoscape.org/apps/enviz.
Steinfeld, I., Navon, R., Creech, M.L., Yakhini, Z., Tsalenko, A., 2015. ENViz: a Cytoscape App for integrated statistical analysis and visualization of sample-matched data with multiple data types. Bioinformatics 31 (10), 1683–1685.
Ugorski, M., Laskowska, A., 2002. Sialyl Lewis(a): a tumor-associated carbohydrate antigen involved in adhesion and metastatic potential of cancer cells. Acta Biochim. Pol. 49, 303–311.
Weigelt, B., Horlings, H.M., Kreike, B., Hayes, M.M., Hauptmann, M., Wessels, L.F., et al., 2008. Refinement of breast cancer classification by molecular characterization of histological special types. J. Pathol. 216, 141–150.
Wiedswang, G., Borgen, E., Karesen, R., Kvalheim, G., Nesland, J.M., Qvist, H., et al., 2003. Detection of isolated tumor cells in bone marrow is an independent prognostic factor in breast cancer. J. Clin. Oncol. 21, 3469–3478.
Wiedswang, G., Borgen, E., Schirmer, C., Karesen, R., Kvalheim, G., Nesland, J.M., et al., 2006. Comparison of the clinical significance of occult tumor cells in blood and bone marrow in breast cancer. Int. J. Cancer 118, 2013–2019.
Zhou, Z., Zhou, J., Du, Y., 2012. Estrogen receptor alpha interacts with mitochondrial protein HADHB and affects beta-oxidation activity. Mol. Cell Proteomics 11, M111.