Plasma proteomics-based identification of novel biomarkers in early gastric cancer


Journal Pre-proofs

Bin Zhou, Zhe Zhou, Yuling Chen, Haiteng Deng, Yunlong Cai, Xiaolong Rao, Yuxin Yin, Long Rong

PII: S0009-9120(19)30899-9
DOI: https://doi.org/10.1016/j.clinbiochem.2019.11.001
Reference: CLB 10038

To appear in: Clinical Biochemistry

Received Date: 14 August 2019
Revised Date: 27 October 2019
Accepted Date: 2 November 2019

Please cite this article as: B. Zhou, Z. Zhou, Y. Chen, H. Deng, Y. Cai, X. Rao, Y. Yin, L. Rong, Plasma proteomics-based identification of novel biomarkers in early gastric cancer, Clinical Biochemistry (2019), doi: https://doi.org/10.1016/j.clinbiochem.2019.11.001

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier Inc. on behalf of The Canadian Society of Clinical Chemists.

Bin Zhou (a,1), Zhe Zhou (b,1), Yuling Chen (c,d), Haiteng Deng (c), Yunlong Cai (a), Xiaolong Rao (a), Yuxin Yin (b,*), Long Rong (a,*)

a Department of Endoscopy Center, Peking University First Hospital, Beijing, China
b Institute of Systems Biomedicine, Department of Pathology, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
c MOE Key Laboratory of Bioinformatics, Center for Synthetic and Systematic Biology, School of Life Sciences, Tsinghua University, Beijing, China
d Tsinghua University-Peking University Joint Center for Life Sciences, Beijing, China

* Corresponding authors.
1 Authors contributed equally.

Abstract

Background
Identification and treatment at an early stage can significantly improve the prognosis of gastric cancer (GC). However, there is still no ideal biomarker for the screening of early-stage GC (EGC). Mass spectrometry-based proteomics offers more possibilities for discovering tumor biomarkers. The aim of this study was to explore candidate protein biomarkers for EGC screening using mass spectrometry and bioinformatics.

Methods
Plasma samples were collected from 15 EGC patients and 15 healthy controls. After selective immunodepletion to remove high-abundance proteins, plasma samples were analyzed by liquid chromatography-tandem mass spectrometry (LC-MS/MS) combined with tandem mass tag (TMT) labeling.

Results
A total of 2040 proteins were identified, and 11 proteins were found to be differentially expressed. The results of the logistic regression model and the orthogonal signal correction-partial least squares discriminant analysis (OPLS-DA) model showed that the changed proteins identified by plasma proteomics could help distinguish EGC patients from healthy controls.

Conclusion
The proteins identified by plasma proteomics using LC-MS/MS combined with TMT labeling could help distinguish EGC patients from healthy controls.

Keywords: Biomarkers; Early screening; Gastric cancer; Plasma proteomics; LC-MS/MS

1. Introduction

Gastric cancer (GC) is the fourth most common cancer worldwide, and its five-year overall survival rate is less than 25%[1]. The poor outcome is partly due to nonspecific symptoms, which make the disease difficult to diagnose at an early stage. Early-stage GC (EGC) is confined to the mucosa and/or submucosa regardless of the presence of lymph node metastasis, and its five-year survival exceeds 90% after curative surgery. In comparison, the five-year survival of stage IV GC is less than 5%[2, 3]. Moreover, owing to the low risk of lymph node metastasis in the early stage, EGC can be cured with endoscopic surgery, which is less painful and less risky than laparotomy or laparoscopic surgery.

Clinically, the detection of tumor markers in blood has assisted tumor screening and surveillance for years. Currently, serum tumor markers, including CA 19-9, CA 125 and CEA, can be used in GC[4]. However, these traditional tumor markers have relatively low sensitivity and specificity, which limits their value in EGC screening[5, 6]. In addition, some serum biomarkers, such as serum pepsinogen (PG), have been used for the screening of EGC. A low serum PG I concentration and a low PG I/II concentration ratio suggest gastric mucosal atrophy, which is the main risk factor for GC, so that patients at high risk of GC can be identified. However, a meta-analysis showed that the overall screening sensitivity and specificity for GC were only 77% and 73%, respectively[7]. In short, ideal serum or plasma screening methods for EGC are still lacking, and novel biomarkers need to be explored.

The development of plasma proteomics supported by mass spectrometry and bioinformatics provides high-resolution, high-accuracy tools for tumor biomarker research. To explore candidate protein biomarkers for EGC screening, we used liquid chromatography-tandem mass spectrometry (LC-MS/MS) combined with tandem mass tag (TMT) labeling to compare the plasma proteome between EGC patients and healthy volunteers.

2. Materials and methods

2.1 Patients and plasma samples

Fifteen plasma samples were collected from EGC patients. EGC was defined as gastric cancer confined to the mucosa and/or submucosa regardless of the presence of lymph node metastasis, including high-grade intraepithelial neoplasia (HGIN). All patients underwent endoscopic submucosal dissection (ESD) at Peking University First Hospital and were diagnosed with EGC pathologically. Fasting blood samples were collected in EDTA tubes before ESD. Blood samples from 15 age- and sex-matched healthy volunteers were collected as controls. All healthy controls underwent endoscopic examination to rule out gastric or colorectal cancer. The study was approved by the Ethics Committee of Peking University First Hospital, and all participants provided informed consent.

Plasma was chosen as the sample type following the recommendation of the Human Proteome Organization Plasma Proteome Project (HUPO PPP)[8]. Plasma was obtained from the blood samples by centrifugation at 3,000 × g for 10 min at 4 ℃ and immediately stored at -80 ℃.

2.2 Plasma sample preparation

Equal amounts of plasma from the 15 healthy volunteers were pooled to form a mixed plasma sample, which served as the internal standard. Plasma samples were arranged and grouped as shown in Figure 1. An IgY 14 column (Sigma, St. Louis, MO, USA) was used according to the column instructions to deplete the 14 abundant plasma proteins, including albumin, IgG, α1-antitrypsin, IgA, IgM, transferrin, haptoglobin, α2-macroglobulin, fibrinogen, complement C3, orosomucoid, HDL, and LDL. After depletion, urea was added to the protein sample to a concentration of 2 M, followed by reduction and alkylation. Proteins were digested with trypsin at a protease/protein ratio of 1:50 at 37 ℃ overnight. The samples were desalted on Sep-Pak columns (Waters, Milford, MA, USA). Peptides from different samples were labeled with tandem mass tag (TMT) reagents (Thermo, Pierce Biotechnology, Rockford, IL, USA) according to the manufacturer's instructions. In each group, the TMT-labeled peptides were mixed and desalted on a Sep-Pak column.

2.3 Off-line HPLC fractionation

The peptides were fractionated on a UPLC3000 system (Dionex, Sunnyvale, CA, USA) with an XBridge™ BEH300 C18 column (Waters, Milford, MA, USA). Mobile phase A was H2O adjusted to pH 10 with ammonium hydroxide, and mobile phase B was acetonitrile adjusted to pH 10 with ammonium hydroxide. Peptides were separated with the following gradient: 8% to 18% phase B over 30 min, then 18% to 32% phase B over 22 min. Forty-eight fractions were collected, dried in a SpeedVac, combined into 12 fractions, and re-dissolved in 0.1% formic acid.

2.4 LC-MS/MS analysis

The TMT-labeled peptides were separated by a 120-min gradient elution at a flow rate of 0.250 µL/min on an EASY-nLC 1000 system (Thermo Fisher Scientific, Waltham, MA, USA) directly interfaced with a Q Exactive HF-X mass spectrometer (Thermo Fisher Scientific, Waltham, MA, USA). The analytical column was a fused silica capillary column (75 µm ID, 150 mm length; packed with C-18 resin, Lexington, MA, USA). Mobile phase A consisted of 0.1% formic acid, and mobile phase B consisted of 100% acetonitrile and 0.1% formic acid.

For the quantitative proteomics analysis, the Q Exactive HF-X spectrometer was operated in data-dependent acquisition mode using Xcalibur 3.0.63 software (Thermo Fisher Scientific, Waltham, MA, USA). A single full-scan mass spectrum was acquired in the Orbitrap (350–1550 m/z, 120,000 resolution) with an automatic gain control (AGC) target value of 2e6. MS/MS spectra were then collected at a resolution of 17,500 with an AGC target of 1e6 and a maximum injection time (IT) of 50 ms for the top 40 ions observed in each mass spectrum. The isolation window was set to 1.2 Da, the dynamic exclusion time was 20 s, and the collision energy was set to 38%.

The generated MS/MS spectra were searched against the UniProt Human database (https://www.uniprot.org; August 10, 2016; 89,105 sequences) using the SEQUEST search engine in Proteome Discoverer 2.1 (PD, Thermo Fisher Scientific, Waltham, MA, USA). The search criteria were as follows: full tryptic specificity was required; one missed cleavage was allowed; carbamidomethylation (C) and TMT sixplex (K and N-terminal) were set as fixed modifications; oxidation (M) was set as a variable modification; the precursor ion mass tolerance was set to 10 ppm for all MS spectra acquired in the Orbitrap mass analyzer; and the fragment ion mass tolerance was set to 20 mmu for all MS2 spectra acquired.

The peptide false discovery rate was calculated using Percolator as provided by PD. False discoveries were determined from peptide spectrum matches obtained when searching against the reversed decoy database, and a peptide spectrum match was considered correct when its q-value was smaller than 1%. Peptides assigned to only one protein group were considered unique. The false discovery rate for protein identification was also set to 0.01.

Relative protein quantification was performed in PD 2.1 according to the manufacturer's instructions, based on the intensities of the six reporter ions per peptide. Quantification was carried out only for proteins with two or more unique peptide matches. Protein ratios were calculated as the median of all peptide hits belonging to a protein. Quantitative precision was expressed as the protein ratio variability.
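The protein-level summarisation described above (median of peptide ratios, at least two unique peptides) can be illustrated with a minimal sketch. This is not the Proteome Discoverer implementation: the function name, the peptide sequences and the ratios are hypothetical, and every peptide passed in is assumed to be already unique to its protein.

```python
from statistics import median


def protein_ratios(peptide_hits, min_unique_peptides=2):
    """Summarise peptide-level reporter-ion ratios to protein-level ratios.

    peptide_hits: iterable of (protein_id, peptide_sequence, ratio) tuples,
    where ratio is a sample/reference reporter-ion intensity ratio.
    All peptides are assumed to be unique to their protein (hypothetical input).
    """
    by_protein = {}
    for protein_id, peptide, ratio in peptide_hits:
        by_protein.setdefault(protein_id, {}).setdefault(peptide, []).append(ratio)

    result = {}
    for protein_id, peptides in by_protein.items():
        # Quantify only proteins supported by enough distinct peptides.
        if len(peptides) < min_unique_peptides:
            continue
        all_ratios = [r for ratios in peptides.values() for r in ratios]
        result[protein_id] = median(all_ratios)
    return result


# Invented peptide hits, for illustration only.
hits = [
    ("P08493", "PEPTIDEA", 1.30),
    ("P08493", "PEPTIDEB", 1.50),
    ("P08493", "PEPTIDEB", 1.45),
    ("P00441", "PEPTIDEC", 0.70),  # single unique peptide: not quantified
]
ratios = protein_ratios(hits)
```

Using the median rather than the mean keeps a single outlying peptide hit from dominating the protein ratio.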

2.5 Identification of differentially expressed proteins

Data analyses were performed using R software (www.r-project.org). Proteins with more than 50% missing values in either the EGC group or the control group were removed from the analysis. The remaining missing values were imputed using the k-nearest neighbors (KNN) method. The data matrix was then log2 transformed to approximate a Gaussian distribution. Differentially expressed proteins between the EGC group and the control group were identified by Student's t test with a raw p-value cutoff of 0.05.
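The filtering, transformation and testing steps can be sketched as follows. The study used R; this Python sketch uses only the standard library and invented toy data. Two simplifications are assumptions of the sketch, not the authors' procedure: missing values that survive the filter are dropped rather than KNN-imputed, and significance is judged by comparing |t| with the two-sided 5% critical value (2.447 for 6 degrees of freedom), which is equivalent to p < 0.05 for this toy group size.

```python
import math
from statistics import mean, variance


def too_many_missing(values, max_missing_fraction=0.5):
    """True if more than the allowed fraction of values is missing (None)."""
    missing = sum(v is None for v in values)
    return missing / len(values) > max_missing_fraction


def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Toy protein ratios (hypothetical); None marks a missing value.
egc = {"A": [1.4, 1.5, 1.3, 1.6], "B": [1.0, None, None, None], "C": [1.0, 1.1, 0.9, 1.0]}
ctrl = {"A": [1.0, 0.9, 1.1, 1.0], "B": [1.0, 1.0, 1.1, 0.9], "C": [1.0, 0.9, 1.1, 1.0]}

T_CRIT_DF6 = 2.447  # two-sided 5% critical value of Student's t with 6 df

results = {}
for protein in egc:
    # Remove proteins with >50% missing values in either group.
    if too_many_missing(egc[protein]) or too_many_missing(ctrl[protein]):
        continue
    # Simplification: drop remaining missing values instead of imputing them.
    a = [math.log2(v) for v in egc[protein] if v is not None]
    b = [math.log2(v) for v in ctrl[protein] if v is not None]
    results[protein] = student_t(a, b)

significant = [p for p, (t, df) in results.items() if abs(t) > T_CRIT_DF6]
```

With 15 samples per group, as in the study, the test has 28 degrees of freedom and the corresponding critical value is about 2.048.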

2.6 Logistic model and cross validation

A logistic regression model containing the differentially expressed proteins was constructed to discriminate between cases and controls. Receiver operating characteristic (ROC) curves were constructed to evaluate the diagnostic performance of the model. The area under the ROC curve (AUC), optimal cutoff, sensitivity, specificity and accuracy were calculated using the R package pROC[9]. To evaluate the predictive performance of the model, leave-one-out cross-validation (LOOCV) was performed: a logistic model was fitted on 29 samples, and the remaining sample was used as the test set. This procedure was repeated 30 times so that every sample served once as the test set, and the mean accuracy was calculated.
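A minimal sketch of the two evaluation steps, AUC and LOOCV, is given below. It is not the pROC-based R workflow used in the study: it uses a single hypothetical predictor, a plain gradient-descent logistic fit, a fixed probability cutoff of 0.5 (an assumption), and invented toy scores.

```python
import math


def auc(case_scores, control_scores):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by batch gradient descent (one predictor)."""
    b0 = b1 = 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y
            g1 += (p - y) * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def predict(model, x):
    b0, b1 = model
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def loocv(xs, ys, cutoff=0.5):
    """Leave each sample out once, refit on the rest, and classify the held-out sample."""
    preds = []
    for i in range(len(xs)):
        model = fit_logistic(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        preds.append(1 if predict(model, xs[i]) >= cutoff else 0)
    return preds


# Toy log2 abundances of one hypothetical protein.
controls = [-1.2, -0.8, -1.0, -0.5, 0.3]
cases = [1.1, 0.9, 0.6, 1.4, -0.2]
xs, ys = controls + cases, [0] * len(controls) + [1] * len(cases)

auc_value = auc(cases, controls)
preds = loocv(xs, ys)
accuracy = sum(p == y for p, y in zip(preds, ys)) / len(ys)
```

Because each held-out sample is never seen during fitting, the LOOCV accuracy is a less optimistic estimate than the accuracy obtained by refitting and testing on all samples.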

2.7 Orthogonal signal correction-partial least squares discriminant analysis (OPLS-DA)

The OPLS-DA model was constructed using the R packages MetaboAnalystR and ropls[10-12]. The R2Y and Q2Y metrics were used to evaluate the performance of the model. The variable importance in projection (VIP) score reflects both the loading weights for each component and the variability of the response explained by that component, and can therefore be used for feature selection[13].
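For orientation, the quantities named above can be written out using the classical PLS definitions. This is a sketch of the standard formulas, not necessarily the exact variant computed by ropls or MetaboAnalystR; OPLS implementations may use a closely related VIP restricted to the predictive component. The symbols are assumptions introduced here: p is the number of proteins, A the number of components, w_a the weight vector of component a, SSY_a the sum of squares of the response explained by component a, and PRESS the cross-validated prediction error sum of squares.

```latex
\begin{align}
R^2Y &= 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}, \\
Q^2Y &= 1 - \frac{\mathrm{PRESS}}{\sum_i (y_i - \bar{y})^2}, \\
\mathrm{VIP}_j &= \sqrt{\, p \,
  \frac{\sum_{a=1}^{A} \mathrm{SSY}_a \left( w_{aj} / \lVert w_a \rVert \right)^2}
       {\sum_{a=1}^{A} \mathrm{SSY}_a} }
\end{align}
```

Under this definition the squared VIP scores average to 1 across variables, which is why VIP > 1 is a common selection threshold and why a cutoff above 1 is comparatively strict.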

3. Results

3.1 Characteristics of EGC patients and healthy controls

Altogether, we used 15 plasma samples from EGC patients and 15 plasma samples from age- and sex-matched healthy controls (details in Table 1). Thirteen of the 15 patients were diagnosed with adenocarcinoma, and two were diagnosed with HGIN pathologically. The adenocarcinomas were mainly well or moderately differentiated. The invasion depth of the adenocarcinomas was mainly limited to the mucosa. Details of the tumors are shown in Table 2.

3.2 Identification of differentially expressed proteins

A total of 2040 proteins were identified, and 11 proteins were found to be differentially expressed between EGC patients and healthy controls, with 7 upregulated and 4 downregulated. The list of differentially expressed proteins is shown in Table 3.

3.3 Logistic regression model for distinguishing EGC patients from controls

We first constructed a logistic regression model for each differentially expressed protein. The AUCs, specificities and sensitivities are shown in Table 4. The results indicated that individual proteins were insufficient to distinguish EGC patients from controls. Hence, we constructed a multivariate logistic regression model on all 11 differentially expressed proteins. The combined diagnostic model was:

$$
\begin{aligned}
y = {} & -40.06 - 57.9 \times \text{Q8NBP7} + 58.36 \times \text{P00441} + 68.38 \times \text{Q86UD1} \\
& - 32.84 \times \text{A0A2R8Y7X9} - 24.54 \times \text{P62979} + 65.72 \times \text{A0A0G2JMC9} \\
& + 216.11 \times \text{P08493} - 114.04 \times \text{P16157} - 155.88 \times \text{A0A087WTY6} \\
& + 7.02 \times \text{P14207} + 26.76 \times \text{Q9H939}
\end{aligned}
$$

The sensitivity and specificity of the combined model both reached 100%, clearly outperforming each of the individual proteins. However, this result might derive from overfitting, since 11 predictors were fitted to only 30 samples. To evaluate the predictive performance of the model, we performed LOOCV, in which each sample in turn served as the test set while the remaining 29 samples formed the training set. Under LOOCV, the sensitivity and specificity were 66.7% and 86.7%, respectively.
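As a worked check of the cross-validated figures, the reported rates can be converted back into sample counts. This is a sketch under the assumption that the rates were computed over all 15 patients and all 15 controls; the overall accuracy it derives is not stated in the text and follows only from that assumption.

```python
def rate_to_count(rate_percent, group_size):
    """Recover the number of correctly classified samples from a reported rate."""
    return round(rate_percent / 100 * group_size)


n_cases = n_controls = 15

true_pos = rate_to_count(66.7, n_cases)      # correctly classified EGC patients
true_neg = rate_to_count(86.7, n_controls)   # correctly classified healthy controls

# Overall LOOCV accuracy implied by the two counts.
accuracy = (true_pos + true_neg) / (n_cases + n_controls)
```

Under this assumption, 10 of 15 patients and 13 of 15 controls were classified correctly when held out, that is 23 of 30 samples, or roughly 77% overall.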

3.4 OPLS-DA model and KEGG pathway analysis

To further select potential biomarkers for EGC, we used OPLS-DA, a supervised multivariate technique. A score plot of the OPLS-DA model is shown in Figure 2A. The EGC group is clearly separated from the controls, indicating the potential of plasma proteomics for detecting EGC. We then used the VIP score to select biomarkers. With a cutoff of VIP > 2, 43 proteins were selected (Supplementary Table S1). KEGG pathway analysis of these biomarkers revealed enriched terms, including porphyrin and chlorophyll metabolism and nitrogen metabolism (Figure 2B).

4. Discussion

Numerous proteins have been found to be differentially expressed in cell lines, tissues or body fluids in GC and are considered potential GC biomarkers[14-19]. However, studies focusing on EGC are still lacking. It has been shown that protein expression profiles may change during tumor development[20, 21]. Consequently, the protein profile of EGC differs significantly from that of advanced-stage GC, and biomarkers found in advanced GC have limited reference value for EGC screening[22]. This case-control study compared the plasma proteome between EGC patients and healthy volunteers to discover candidate EGC protein biomarkers. To the best of our knowledge, this is the first study to apply a proteomic method to EGC plasma to discover novel biomarkers.

In plasma proteome research, the core step is accurate quantification of proteins. LC-MS/MS is a rapid and sensitive technique capable of analyzing complex protein mixtures such as plasma. For mass spectrometry-based protein quantification, there are two main strategies: labeling and label-free proteomics. Although the label-free technique is widely used in proteomic studies involving a large number of samples, quantification is more accurate with a labeling strategy because the labeled samples are analyzed at the same time, which eliminates instrument error between runs. Considering that the change in protein expression in EGC is not as pronounced as that in advanced-stage GC, we chose LC-MS/MS combined with TMT labeling to analyze plasma from EGC patients and healthy controls and to explore potential protein biomarkers for EGC screening. In each group, we included an internal control so that the quantification results would be comparable among groups.

Eleven proteins were differentially expressed between plasma samples from EGC patients and healthy volunteers. Folate receptor beta (FR2) and proprotein convertase subtilisin/kexin type 9 (PCSK9) have been reported to be associated with GC in previous studies.
FR2 is a member of the folate receptor family and is expressed in certain epithelial tissues and myeloid cells. FR2 was found to be expressed in tumor-associated macrophages (TAM), which are recruited into the tumor microenvironment and have an immunosuppressive function[23]. A recent study using an antibody array assay showed that FR2 was elevated in serum samples from GC patients[24].

PCSK9 is extensively expressed in the liver and plays an important role in the regulation of circulating cholesterol. A recent study showed that PCSK9 was highly overexpressed in the GC secretome and that its overexpression could be validated in tumor tissues using immunohistochemical methods. In contrast, it was nearly undetectable in normal gastric epithelial cells, indicating that it could be a novel GC biomarker[16]. However, the function of PCSK9 in tumorigenesis and cancer progression remains unknown.

Matrix Gla protein (MGP) was also found to be upregulated in our study. There have been no reports indicating that MGP is involved in human gastric cancer, but it was found to be overexpressed in ovarian cancer cell lines, suggesting a potential role in ovarian cancer pathogenesis[25]. The molecular mechanism of its overexpression in EGC plasma requires further elucidation.

Superoxide dismutase 1 (SOD1), a member of the superoxide dismutase (SOD) family, was found to be downregulated in our study. SOD is one of the most important antioxidant enzymes in the human body and may play an important role in cancer[26]. Changes in the activity and content of SOD and its members in GC have been examined in previous studies. Some studies showed that SOD activity was significantly reduced in malignant tissues compared with adjacent nonmalignant tissues or normal mucosa from healthy controls[27, 28]. As for SOD1, decreased activity was found in GC tissues[29].
In addition to the change in activity, one study using enzyme-linked immunosorbent assay (ELISA) and immunochemistry showed that the SOD1 content was lower in malignant tissues, which supports the finding of our study[30]. In peripheral blood samples, a meta-analysis showed significantly decreased SOD activity in patients with GC, providing a potential diagnostic parameter[31]. Our study indicates that a decreased level of SOD1 might be a valuable parameter in the diagnosis of EGC.

Owing to the genetic diversity of GC, a single protein is unlikely to have adequate diagnostic power, so a panel composed of several protein biomarkers is necessary for the diagnosis of EGC. In this study, we used a logistic regression model and an OPLS-DA model to assess whether the proteins identified by mass spectrometry could distinguish EGC patients from healthy volunteers. Differentially expressed proteins selected by t test were used to construct a logistic regression model. The sensitivity and specificity of this combined model were better than those of any individual protein. The sensitivity and specificity of LOOCV were 66.7% and 86.7%, respectively. The sample size of the present study was small. We believe that, with a larger sample size, the method we used could yield a more accurate diagnostic tool for EGC. Apart from the logistic regression model, the results of OPLS-DA also showed that EGC patients and healthy controls could be clearly separated by plasma proteins. Forty-three potential biomarkers were further selected by this method, and KEGG pathway analysis revealed enriched terms, including porphyrin and chlorophyll metabolism and nitrogen metabolism.

5. Conclusion

Using LC-MS/MS combined with TMT labeling, we identified 11 proteins that were differentially expressed in plasma between EGC patients and healthy controls. Although limited by the small sample size, the results of the logistic regression model and the OPLS-DA model showed that proteins identified by plasma proteomics could help distinguish EGC patients from healthy controls. Our results provide a resource of potential biomarkers for the diagnosis of EGC, and clues for further study of EGC pathogenesis.

Conflict of interests

None.

Acknowledgements

We thank the Protein Chemistry Facility at the Center for Biomedical Analysis of Tsinghua University for sample analysis.

Funding

This study was supported by the youth clinical research project of Peking University First Hospital (2018CR28).

References

[1] Brenner H, Rothenbacher D, Arndt V. Epidemiology of stomach cancer[J]. Methods Mol Biol, 2009,472:467-477.
[2] Matsuda T, Ajiki W, Marugame T, et al. Population-based survival of cancer patients diagnosed between 1993 and 1999 in Japan: a chronological and international comparative study[J]. Jpn J Clin Oncol, 2011,41(1):40-51.
[3] Ajani J A, Bentrem D J, Besh S, et al. Gastric cancer, version 2.2013: featured updates to the NCCN Guidelines[J]. J Natl Compr Canc Netw, 2013,11(5):531-546.
[4] Sakamoto K, Haga Y, Yoshimura R, et al. Comparative effectiveness of the tumour diagnostics, CA 19-9, CA 125 and carcinoembryonic antigen in patients with diseases of the digestive system[J]. Gut, 1987,28(3):323-329.
[5] Haglund C, Kuusela P, Roberts P, et al. Tumour marker CA 125 in patients with digestive tract malignancies[J]. Scand J Clin Lab Invest, 1991,51(3):265-270.
[6] Ychou M, Duffour J, Kramar A, et al. Clinical significance and prognostic value of CA72-4 compared with CEA and CA19-9 in patients with gastric cancer[J]. Dis Markers, 2000,16(3-4):105-110.
[7] Miki K. Gastric cancer screening using the serum pepsinogen test method[J]. Gastric Cancer, 2006,9(4):245-253.
[8] Rai A J, Gelfand C A, Haywood B C, et al. HUPO Plasma Proteome Project specimen collection and handling: towards the standardization of parameters for plasma proteome samples[J]. Proteomics, 2005,5(13):3262-3277.
[9] Robin X, Turck N, Hainard A, et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves[J]. BMC Bioinformatics, 2011,12(1):77.
[10] Chong J, Yamamoto M, Xia J. MetaboAnalystR 2.0: From Raw Spectra to Biological Insights[J]. Metabolites, 2019,9(3):57.
[11] Thévenot E A, Roux A, Xu Y, et al. Analysis of the Human Adult Urinary Metabolome Variations with Age, Body Mass Index, and Gender by Implementing a Comprehensive Workflow for Univariate and OPLS Statistical Analyses[J]. J Proteome Res, 2015,14(8):3322-3335.
[12] Bylesjö M, Rantalainen M, Cloarec O, et al. OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification[J]. Journal of Chemometrics, 2006,20(8-10):341-351.
[13] Farrés M, Platikanov S, Tsakovski S, et al. Comparison of the variable importance in projection (VIP) and of the selectivity ratio (SR) methods for variable selection and interpretation[J]. Journal of Chemometrics, 2015,29(10):528-536.
[14] Leal M F, Chung J, Calcagno D Q, et al. Differential proteomic analysis of noncardia gastric cancer from individuals of northern Brazil[J]. PLoS One, 2012,7(7):e42255.
[15] Wu C, Luo Z, Chen X, et al. Two-dimensional differential in-gel electrophoresis for identification of gastric cancer-specific protein markers[J]. Oncol Rep, 2009,21(6):1429-1437.
[16] Marimuthu A, Subbannayya Y, Sahasrabuddhe N A, et al. SILAC-based quantitative proteomic analysis of gastric cancer secretome[J]. Proteomics Clin Appl, 2013,7(5-6):355-366.
[17] Subbannayya Y, Mir S A, Renuse S, et al. Identification of differentially expressed serum proteins in gastric adenocarcinoma[J]. J Proteomics, 2015,127(Pt A):80-88.
[18] Wu W, Juan W C, Liang C R, et al. S100A9, GIF and AAT as potential combinatorial biomarkers in gastric cancer diagnosis and prognosis[J]. Proteomics Clin Appl, 2012,6(3-4):152-162.
[19] Umemura H, Togawa A, Sogawa K, et al. Identification of a high molecular weight kininogen fragment as a marker for early gastric cancer by serum proteome analysis[J]. J Gastroenterol, 2011,46(5):577-585.
[20] Abramowicz A, Wojakowska A, Gdowicz-Klosok A, et al. Identification of serum proteome signatures of locally advanced and metastatic gastric cancer: a pilot study[J]. J Transl Med, 2015,13:304.
[21] Lu H B, Zhou J H, Ma Y Y, et al. Five serum proteins identified using SELDI-TOF-MS as potential biomarkers of gastric cancer[J]. Jpn J Clin Oncol, 2010,40(4):336-342.
[22] Kim H K, Reyzer M L, Choi I J, et al. Gastric cancer-specific protein profile identified using endoscopic biopsy samples via MALDI mass spectrometry[J]. J Proteome Res, 2010,9(8):4123-4130.
[23] Puig-Kroger A, Sierra-Filardi E, Dominguez-Soto A, et al. Folate receptor beta is expressed by tumor-associated macrophages and constitutes a marker for M2 anti-inflammatory/regulatory macrophages[J]. Cancer Res, 2009,69(24):9395-9403.
[24] Wu D, Zhang P, Ma J, et al. Serum biomarker panels for the diagnosis of gastric cancer[J]. Cancer Med, 2019,8(4):1576-1583.
[25] Sterzynska K, Klejewski A, Wojtowicz K, et al. The Role of Matrix Gla Protein (MGP) Expression in Paclitaxel and Topotecan Resistant Ovarian Cancer Cell Lines[J]. Int J Mol Sci, 2018,19(10).
[26] Che M, Wang R, Li X, et al. Expanding roles of superoxide dismutases in cell regulation and cancer[J]. Drug Discov Today, 2016,21(1):143-149.
[27] Batcioglu K, Mehmet N, Ozturk I C, et al. Lipid peroxidation and antioxidant status in stomach cancer[J]. Cancer Invest, 2006,24(1):18-21.
[28] Wang S H, Wang Y Z, Zhang K Y, et al. Effect of superoxide dismutase and malondialdehyde metabolic changes on carcinogenesis of gastric carcinoma[J]. World J Gastroenterol, 2005,11(28):4305-4310.
[29] Monari M, Trinchero A, Calabrese C, et al. Superoxide dismutase in gastric adenocarcinoma: is it a clinical biomarker in the development of cancer?[J]. Biomarkers, 2006,11(6):574-584.
[30] Janssen A M, Bosman C B, van Duijn W, et al. Superoxide dismutases in gastric and esophageal cancer and the prognostic impact in gastric cancer[J]. Clin Cancer Res, 2000,6(8):3183-3192.
[31] Li J, Lei J, He L, et al. Evaluation and Monitoring of Superoxide Dismutase (SOD) Activity and its Clinical Significance in Gastric Cancer: A Systematic Review and Meta-Analysis[J]. Med Sci Monit, 2019,25:2032-2042.

Fig. 1. The arrangement of plasma samples. N: plasma samples from healthy volunteers. T: plasma samples from EGC patients. M: mixed plasma sample as an internal standard.

Fig. 2. (A) Score plot of the OPLS-DA model. T: early stage gastric cancer. N: healthy control. (B) KEGG pathway analysis.

Table 1. Demographics of patients and healthy controls

Demographics              Patients      Healthy controls   p value
Sex, n (%)                                                 1
  Male                    12 (80)       12 (80)
  Female                  3 (20)        3 (20)
Age, years, mean ± SD     65.4 ± 9.6    63.7 ± 7.5         0.60

Fisher's exact test for sex, Student's t test for age; p < 0.05 was considered statistically significant.

Table 2. Characteristics of tumors

Characteristic                                   n (%)
Tumor size, mm, mean ± SD                        21.9 ± 17.9
Tumor location in stomach
  Upper-third                                    5 (33.3)
  Middle-third                                   4 (26.7)
  Lower-third                                    6 (40.0)
Histology type-predominant
  HGIN                                           2 (13.3)
  Well differentiated adenocarcinoma             6 (40.0)
  Moderately differentiated adenocarcinoma       6 (40.0)
  Signet cell carcinoma                          1 (6.7)
Depth of tumor invasion in carcinoma (n=13)
  Mucosal                                        8 (61.5)
  Submucosal (≤500 μm)                           3 (23.2)
  Submucosal (>500 μm)                           2 (15.3)

Table 3. List of differentially expressed proteins

UniProt Accession   Description                                                  Gene name   Fold change   p value
Up-regulated
P08493              Matrix Gla protein                                           MGP         1.42          0.005
Q9H939              Proline-serine-threonine phosphatase-interacting protein 2   PSTPIP2     1.41          0.036
A0A087WTY6          Neuroblastoma suppressor of tumorigenicity 1                 NBL1        1.35          0.044
A0A0G2JMC9          Leukocyte immunoglobulin-like receptor subfamily A member 2  LILRA2      1.34          0.047
P14207              Folate receptor beta                                         FOLR2       1.33          0.016
Q86UD1              Out at first protein homolog                                 OAF         1.27          0.036
Q8NBP7              Proprotein convertase subtilisin/kexin type 9                PCSK9       1.25          0.041
Down-regulated
P00441              Superoxide dismutase [Cu-Zn]                                 SOD1        0.73          0.041
P16157              Ankyrin-1                                                    ANK1        0.71          0.049
P62979              Ubiquitin-40S ribosomal protein S27a                         RPS27A      0.70          0.039
A0A2R8Y7X9          Uncharacterized protein                                                  0.60          0.009

Table 4. AUC, sensitivity and specificity of the logistic regression model

Models           AUC     Sensitivity (%)   Specificity (%)
Q8NBP7           0.702   60.0              80.0
P00441           0.707   93.3              46.7
Q86UD1           0.711   46.7              100
A0A2R8Y7X9       0.796   80.0              86.7
P62979           0.724   86.7              66.7
A0A0G2JMC9       0.747   60.0              93.3
P08493           0.813   73.3              80.0
P16157           0.653   46.7              86.7
A0A087WTY6       0.64    80.0              53.3
P14207           0.689   60.0              80.0
Q9H939           0.573   73.3              53.3
Combined model   1       100               100
LOOCV            0.711   66.7              86.7