Journal Pre-proof
Identification and validation of immune-related lncRNA prognostic signature for breast cancer
Yong Shen, Xiaowei Peng, Chuanlu Shen
PII: S0888-7543(20)30065-3
DOI: https://doi.org/10.1016/j.ygeno.2020.02.015
Reference: YGENO 9479
To appear in: Genomics
Received date: 22 January 2020
Revised date: 10 February 2020
Accepted date: 18 February 2020
Please cite this article as: Y. Shen, X. Peng and C. Shen, Identification and validation of immune-related lncRNA prognostic signature for breast cancer, Genomics (2020), https://doi.org/10.1016/j.ygeno.2020.02.015
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
© 2020 Published by Elsevier.
Identification and validation of immune-related lncRNA prognostic signature for breast cancer
Yong Shen a,#, Xiaowei Peng a,#, Chuanlu Shen a,*
a Department of Pathology and Pathophysiology, School of Medicine, Southeast University, 210009, Nanjing, China
Keywords: TCGA, LncRNA, Breast Cancer, Risk score, Immune Prognostic Model
* Corresponding author. Email addresses: [email protected] (C Shen) and [email protected] (Y Shen)
# Yong Shen and Xiaowei Peng contributed equally to this work.
ABSTRACT
The prognosis of patients with breast cancer is closely related to both the infiltration of immune cells and the expression of lncRNAs. In this study, we evaluated the infiltration of immune cells in 1109 breast cancer samples obtained from TCGA by applying ssGSEA to the transcriptomes of these samples, thereby generating a high immune cell infiltration group and a low immune cell infiltration group. On the basis of these groupings, we found 696 differentially expressed lncRNAs, which were sequentially subjected to univariate Cox regression and stepwise multiple Cox regression analysis; 11 lncRNAs were identified as a prognostic signature for breast cancer. Kaplan-Meier analysis, univariate Cox regression, multivariate Cox regression, and ROC analyses further revealed that this 11-lncRNA signature was a novel and important prognostic factor independent of multiple clinicopathological parameters. The TIMER database showed that this 11-lncRNA prognostic signature for breast cancer was associated with the infiltration of immune cell subtypes.
1. Introduction
Breast cancer is one of the most common malignancies in women worldwide[1]. Its incidence is increasing year by year, and its mortality rate ranks second among female malignant tumors[2]. Fortunately, owing to improvements in diagnosis and treatment in recent years, the mortality rate of breast cancer has been greatly reduced[3]. Breast cancer is a highly heterogeneous tumor, and its etiology and pathological manifestations vary from person to person[4]. However, the prognosis of patients with breast cancer is mostly related to immunity[5]. A large number of inflammatory cells infiltrate breast cancer, not only around the tumor but also in the tumor stroma[6]. Some studies have shown that the density of CD8+ T cells (cytotoxic T cells) is highly correlated with immune escape in breast cancer, and the infiltration of CD8+ T and CD4+ T cells is also significantly related to the prognosis of breast cancer[7]. Macrophages are another important component of breast cancer tumor-infiltrating immune cells, reaching about 50%[8]. Cleaning up the wreckage of cells and conducting antigenic reactions are their main functions[8].
Antigen-presenting cells (APC) and dendritic cells (DC) also play an important role in antigen presentation and cytotoxicity to tumor antigens[9, 10]. Therefore, to improve the prognosis of breast cancer and to provide reliable information to guide individualized treatment strategies, we urgently need to screen reliable immune predictors and prognostic indicators.
Long non-coding RNAs (lncRNAs) are a class of RNA molecules with transcripts longer than 200 nt[11]. They do not encode proteins but regulate gene expression at various levels (epigenetic, transcriptional or post-transcriptional regulation, etc.) in the form of RNA[11]. As a new type of gene regulator, lncRNA is associated with the development, progression, and prognosis of human diseases, especially cancer. The abnormal expression of some lncRNAs may be related to excessive cell growth, repressed apoptosis, invasion, metastasis, epithelial-mesenchymal transition (EMT) and poor prognosis of breast cancer[12]. For example, lncRNA-Hh strengthens cancer stem cell generation in Twist-positive breast cancer via activation of the hedgehog signaling pathway[13]. LncRNA-HOXA11-AS has been found to be overexpressed in breast cancer, contributing to the invasion and metastasis of breast cancer cells[14]. LncRNAs regulating the immune microenvironment of human breast cancer have become a research hot spot. Some studies showed that a large number of different types of immune cells infiltrate breast cancer, not only in the cancer nest but also in the tumor stroma, and that the prognosis of breast cancer was closely related to the type and number of immune cells infiltrating around the neoplasm[15, 16].
Therefore, the establishment of tools to accurately predict the prognosis of breast cancer patients is very important to guide clinical diagnosis and treatment. Because abnormal phenotypes are closely related to the poor prognosis of breast cancer, it is reasonable to identify lncRNAs related to breast cancer phenotype to predict breast cancer prognosis[17]. In the present study, we analyzed the lncRNA expression data set in The Cancer Genome Atlas (TCGA) and screened the lncRNAs related to tumor phenotype by single-sample gene set enrichment analysis (ssGSEA), ESTIMATE, Cox and other analysis methods. We demonstrated that 11 survival-related and grade-related lncRNAs were closely related to the prognosis of breast cancer.
2. Materials and methods
2.1 Collection and grouping of breast cancer data
The fragments per kilobase of transcript per million mapped reads (FPKM) values of the breast cancer transcriptome, the lncRNA count data and the corresponding clinical data were downloaded from the TCGA program (https://portal.gdc.cancer.gov). The breast cancer transcriptome data from TCGA were grouped by ssGSEA. We obtained a set of marker genes for immune cell types from Bindea et al. Using 29 immune data sets, including immune cell types, immune-related pathways, and immune-related functions, we applied the ssGSEA method of the R package Gene Set Variation Analysis (GSVA) to analyze the infiltration level of different immune cells and the activity of immune-related pathways and functions in the breast cancer expression profile data. ssGSEA applies the gene signatures expressed by immune cell populations to individual cancer samples. According to the ssGSEA results, breast cancer samples in TCGA were classified into a high immune cell infiltration group and a low immune cell infiltration group by hierarchical clustering using “hclust” (R function).
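The scoring idea behind ssGSEA can be illustrated with a minimal sketch. This is not the GSVA implementation used in the study: it is a simplified, unnormalized rank-based score written in Python for illustration, and the function name, the weighting exponent `alpha = 0.25`, the gene names and the expression values are all assumptions. Genes of one sample are ranked by expression, and the score integrates the difference between the weighted cumulative fraction of marker genes and the cumulative fraction of all other genes.

```python
def ssgsea_score(expression, gene_set, alpha=0.25):
    """Simplified single-sample enrichment score (illustrative, not GSVA).

    expression: dict mapping gene -> expression value for ONE sample.
    gene_set:   iterable of marker genes (e.g. an immune cell signature).
    """
    # Rank genes from highest to lowest expression.
    ranked = sorted(expression, key=expression.get, reverse=True)
    n = len(ranked)
    members = set(gene_set) & set(ranked)
    if not members or len(members) == n:
        raise ValueError("gene set must overlap the sample without covering it")

    # Marker genes are weighted by rank**alpha (rank n = highest expression).
    weights = {g: (n - i) ** alpha for i, g in enumerate(ranked) if g in members}
    total_hit = sum(weights.values())
    miss_step = 1.0 / (n - len(members))

    p_hit = p_miss = score = 0.0
    for gene in ranked:
        if gene in members:
            p_hit += weights[gene] / total_hit
        else:
            p_miss += miss_step
        # Sum the running difference over the whole ranked list.
        score += p_hit - p_miss
    return score


# Hypothetical toy samples: marker genes high in one, low in the other.
t_cell_markers = ["CD8A", "CD3E"]
hot = {"CD8A": 9.0, "CD3E": 8.0, "GZMB": 7.5, "ACTB": 5.0, "GAPDH": 4.0, "KRT18": 1.0}
cold = {"CD8A": 1.0, "CD3E": 0.5, "GZMB": 7.5, "ACTB": 5.0, "GAPDH": 4.0, "KRT18": 3.0}
hot_score = ssgsea_score(hot, t_cell_markers)
cold_score = ssgsea_score(cold, t_cell_markers)
```

In this toy case the sample with highly expressed marker genes receives a positive score and the other a negative one. A matrix of such scores (samples by immune terms) is the kind of input that hierarchical clustering would then split into the two infiltration groups.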
2.2 Verification of the effectiveness of immune grouping
The analysis of differentially expressed genes (DEGs) in the expression profile data was carried out using the ESTIMATE algorithm. The Stromal Score, Immune Score, ESTIMATE Score, and Tumor Purity were also calculated by the ESTIMATE algorithm from the transcriptome expression profiles of breast cancer to verify the effect of the ssGSEA grouping and to draw the clustering heat map and statistical plots. The expression levels of human leukocyte antigen (HLA) genes and CD274 (PD-L1) were used to verify the differences between the two groups. The CIBERSORT deconvolution algorithm was used to determine the composition of immune cells in bulk tumor samples made up of mixed cell types, and the difference between the two groups was verified again.
2.3 Identification of immune-related lncRNAs in breast cancer
According to the above-mentioned groups, the TCGA lncRNA count expression profile data were divided into a high immune cell infiltration group and a low immune cell infiltration group. Differentially expressed lncRNAs were identified with the edgeR package according to the criteria of |log2FC|>1 and p<0.05. lncRNAs related to immunity and affecting tumorigenesis were screened out after a second differential expression analysis, carried out with the same criteria, between the cancer group and the paracancerous group. Venn analysis was used to identify the immune-related lncRNAs shared by the two analyses above.
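The filtering and Venn steps described in this subsection amount to two threshold filters followed by a set intersection. The sketch below shows that logic only; it does not reproduce edgeR's statistical testing, and the lncRNA identifiers, fold changes and p-values are hypothetical.

```python
def significant(results, lfc_cutoff=1.0, p_cutoff=0.05):
    """Return the names passing |log2FC| > lfc_cutoff and p < p_cutoff.

    results: dict mapping name -> (log2_fold_change, p_value).
    """
    return {
        name
        for name, (log2_fc, p_value) in results.items()
        if abs(log2_fc) > lfc_cutoff and p_value < p_cutoff
    }


# Hypothetical toy results for the two comparisons.
tumor_vs_paracancerous = {
    "lncA": (2.3, 0.001),
    "lncB": (-1.4, 0.02),
    "lncC": (0.4, 0.0001),   # fails the fold-change criterion
    "lncD": (1.8, 0.20),     # fails the p-value criterion
}
high_vs_low_infiltration = {
    "lncA": (-1.2, 0.03),
    "lncB": (0.9, 0.01),     # fails the fold-change criterion
    "lncC": (1.5, 0.004),
    "lncD": (2.0, 0.01),
}

# Two-way Venn analysis: keep lncRNAs significant in both comparisons.
immune_related = significant(tumor_vs_paracancerous) & significant(high_vs_low_infiltration)
```

Only `lncA` passes both filters here, so it is the only member of the intersection.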
2.4 Identification of immune-related lncRNA prognostic signature for breast cancer
According to the clinical data of breast cancer cases in TCGA, univariate Cox proportional hazards regression (PHR) analysis was used to screen survival-related lncRNAs from the immune-related lncRNAs, with p<0.001 as the criterion. LASSO Cox analysis identified the lncRNAs most correlated with overall survival, and 10-fold cross-validation was performed to prevent overfitting. Multivariate Cox PHR analysis was then used to construct a prognostic signature, and the risk score for each patient was calculated based on the expression levels of the lncRNAs. According to the median risk score, breast cancer patients were divided into a high-risk group and a low-risk group, and Kaplan-Meier survival analysis was performed to compare the survival of the two groups. The risk score was calculated using the following formula[18]:
$$\text{Risk score} = \sum_{i=1}^{n} \mathrm{coef}_i \times \mathrm{id}_i$$
where $\mathrm{coef}_i$ is the regression coefficient and $\mathrm{id}_i$ the expression level of the $i$-th lncRNA in the signature.
Univariate and multivariate Cox regression analyses were used to evaluate the prognostic value of the risk score relative to age, sex, grade, clinical stage and T stage (N stage and M stage had a large number of uncertain values and were not included in the study).
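The Kaplan-Meier comparison mentioned above rests on the product-limit estimator, which can be sketched in a few lines. This is an illustrative Python version, not the R survival code used in the study; the function name and the follow-up times are hypothetical, and the log-rank test that yields the p-value is not included.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times:  follow-up time for each patient.
    events: 1 if death was observed at that time, 0 if the patient was censored.
    Returns a list of (time, survival_probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Group all patients that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Both deaths and censored patients leave the risk set.
        at_risk -= removed
    return curve


# Hypothetical follow-up times (years) for five patients in one risk group.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
```

Computing one such curve per risk group gives the two step functions that a Kaplan-Meier plot compares.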
2.5 Correlation analysis of immune cell infiltration
Immune infiltration data for B cells, CD4+ T cells, CD8+ T cells, dendritic cells, macrophages, and neutrophils were downloaded from the Tumor Immune Estimation Resource (TIMER) database (https://cistrome.shinyapps.io/timer/). The correlation between risk scores and immune infiltration was calculated by Pearson correlation.
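For reference, the Pearson coefficient used here is the covariance of the two variables divided by the product of their standard deviations. The following is a minimal Python sketch; the risk scores and infiltration values are hypothetical and are not taken from TIMER.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / (spread_x * spread_y)


# Hypothetical values: infiltration falls as the risk score rises.
risk_scores = [0.2, 0.8, 1.1, 1.9, 2.5]
cd8_infiltration = [0.30, 0.26, 0.22, 0.15, 0.10]
r = pearson_r(risk_scores, cd8_infiltration)
```

A negative `r`, as in this toy example, means that samples with higher risk scores tend to show lower estimated infiltration.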
2.6 Statistical analysis
All statistical analyses were performed with R version 3.6.1 (Institute for Statistics and Mathematics, Vienna, Austria; https://www.r-project.org) (packages: impute, UpSetR, ggplot2, rms, glmnet, preprocessCore, forestplot, survminer, survivalROC, beeswarm)[18]. For descriptive statistics, mean ± standard deviation was used for normally distributed continuous variables, while the median (range) was used for continuous variables that were not normally distributed. Categorical variables were described by counts and percentages. A two-tailed p<0.05 was regarded as statistically significant[18].
3. Results
3.1 Construction and verification of breast cancer groupings
We obtained 1109 breast cancer samples and 113 paracancerous samples from TCGA. The ssGSEA method was applied to the transcriptomes of the breast cancer samples to evaluate the infiltration of immune cells. Twenty-four immune-related terms were included to estimate the abundance of multiple immune cell types in breast cancer. Using an unsupervised hierarchical clustering algorithm, the breast cancer samples were divided into two groups according to immune infiltration: the high immune cell infiltration group (n = 943) and the low immune cell infiltration group (n = 166) (Figure 1a). To verify the feasibility of this grouping strategy, the ESTIMATE algorithm was used to calculate Tumor Purity, ESTIMATE Score, Immune Score, and Stromal Score from the expression profile of breast cancer. Compared with the low immune cell infiltration group, the high immune cell infiltration group had lower Tumor Purity but higher ESTIMATE Score, Immune Score and Stromal Score (Figure 1a). The box plots likewise showed that the high immune cell infiltration group (Immunity-H) had significantly higher ESTIMATE Score, Immune Score and Stromal Score, whereas the low immune cell infiltration group (Immunity-L) had higher Tumor Purity (Figure 1b). In other words, the high immune cell infiltration group had more immune components and lower tumor purity than the low immune cell infiltration group (p<0.05). We also found that the expression of HLA family genes and CD274 (PD-L1) in the high immune cell infiltration group was significantly higher than that in the low immune cell infiltration group (p<0.01) (Figure 1c and 1d). In addition, we used the CIBERSORT method to verify the above groups and found that the high immune cell infiltration group contained larger amounts of multiple immune cell types (Figure 1e). In aggregate, these results indicate that this breast cancer grouping can be used for follow-up analysis.
3.2 Analysis of differentially expressed lncRNAs between tumor group and paracancerous group and between high immune cell infiltration group and low immune cell infiltration group
According to the criteria of |log2FC|>1 and FDR<0.05, we analyzed the difference between the breast cancer group (1109 cases) and the paracancerous group (113 cases). We found 2999 differentially expressed lncRNAs, of which 2208 were up-regulated and 791 were down-regulated (Figure 2a). According to the same criteria, 1422 differentially expressed lncRNAs were identified in the high immune cell infiltration group compared with the low immune cell infiltration group, with 455 up-regulated and 967 down-regulated (Figure 2b). A two-way Venn analysis showed that a total of 696 lncRNAs were differentially expressed both between the tumor group and the paracancerous group and between the high and low immune cell infiltration groups (Figure 2c). Together, these results suggest that there were immune-related lncRNAs in breast cancer tissue.
3.3 Identification and assessment of 11 immune-related lncRNA prognostic signature for breast cancer
Based on the survival data set of breast cancer samples, we applied univariate Cox regression to the expression profiles of the 696 lncRNAs. A total of 18 differentially expressed lncRNAs related to survival were identified according to the criterion of p<0.001 (Figure 3a). To avoid overfitting the prognostic signature, we performed LASSO regression on these lncRNAs and retained 17 differentially expressed lncRNAs related to immune cell infiltration in breast cancer (Figure 3b); the optimal value of the penalty parameter was determined by 10-fold cross-validation (Figure 3c). By stepwise multiple Cox regression analysis, 11 lncRNAs (LINC00668, LINC02418, AL356515.1, LINC01010, AP005131.6, AL772337.1, AC027514.1, AL161646.2, AC004847.1, AC243773.2 and AL591686.1) were further identified from the above 17 lncRNAs (Table 1). The risk score for each sample was then calculated based on the expression levels of these 11 lncRNAs:
Risk score = 0.06*LINC00668 + 0.13*LINC02418 + 0.24*AL356515.1 - 0.23*LINC01010 - 0.15*AP005131.6 + 0.18*AL772337.1 + 0.21*AC027514.1 + 0.17*AL161646.2 - 0.13*AC004847.1 + 0.07*AC243773.2 - 0.13*AL591686.1 (Table 1).
According to the median risk score, breast cancer samples were divided into a high-risk group and a low-risk group. Kaplan-Meier curves showed that samples in the high-risk group had worse overall survival (OS) than those in the low-risk group (p = 2.493e-10), indicating that the risk score is an effective prognostic signature (Figure 3d). The risk curve and scatterplot were generated to show the risk score and survival status of each breast cancer sample. The risk score and mortality of samples in the high-risk group were higher than those in the low-risk group (Figure 3e and 3f). The heatmap of the expression profiles of these 11 lncRNAs in breast cancer samples showed that LINC01010, AP005131.6, AC004847.1, and AL591686.1 were highly expressed in the low-risk group, while LINC00668, LINC02418, AL356515.1, AC027514.1, AL772337.1, AL161646.2, and AC243773.2 were highly expressed in the high-risk group (Figure 3g). Collectively, these results identify 11 immune-related lncRNAs as a prognostic signature for breast cancer.
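The risk score and median split reported in this subsection can be written out as a short Python sketch. The coefficients are those stated in the text; everything else is an assumption made for illustration, including the function names, the sample identifiers, the toy scores, and the choice to place samples exactly at the median in the low-risk group (the text does not say how ties are handled or how expression values are scaled).

```python
from statistics import median

# Coefficients of the 11-lncRNA signature as reported in the text.
COEFFICIENTS = {
    "LINC00668": 0.06,
    "LINC02418": 0.13,
    "AL356515.1": 0.24,
    "LINC01010": -0.23,
    "AP005131.6": -0.15,
    "AL772337.1": 0.18,
    "AC027514.1": 0.21,
    "AL161646.2": 0.17,
    "AC004847.1": -0.13,
    "AC243773.2": 0.07,
    "AL591686.1": -0.13,
}


def risk_score(expression):
    """Weighted sum of the signature lncRNA expression values for one sample."""
    return sum(coef * expression[name] for name, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Label each sample 'high' or 'low' relative to the cohort median score."""
    cutoff = median(scores.values())
    # Assumption: samples exactly at the median are labelled 'low'.
    return {sample: ("high" if value > cutoff else "low")
            for sample, value in scores.items()}


# Hypothetical sample in which every lncRNA has expression 1.0.
unit_sample = {name: 1.0 for name in COEFFICIENTS}
unit_score = risk_score(unit_sample)

# Hypothetical cohort of four precomputed risk scores.
groups = split_by_median({"s1": 0.1, "s2": 0.5, "s3": 0.9, "s4": 1.2})
```

Because four of the coefficients are negative, higher expression of LINC01010, AP005131.6, AC004847.1 or AL591686.1 lowers the score, which agrees with the heatmap observation that these lncRNAs are highly expressed in the low-risk group.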
3.4 Evaluation of 11 immune-related lncRNAs as independent prognostic factors in patients with breast cancer
Univariate and multivariate Cox regression analyses were used to explore whether the above 11 immune-related lncRNAs were prognostic factors for breast cancer independent of clinicopathological factors such as age, gender, and pathological stage. The hazard ratio (HR) of the risk score was 1.328 (95% CI 1.256-1.404; p<0.001) in univariate Cox regression analysis and 1.266 (95% CI 1.188-1.349; p<0.001) in multivariate Cox regression analysis, suggesting that the 11 lncRNAs were independent prognostic factors in patients with breast cancer (Figure 4a and 4b). To assess the sensitivity and specificity of the risk score for predicting the prognosis of patients with breast cancer, time-dependent receiver operating characteristic (ROC) analysis was performed. The area under the ROC curve (AUC) of the risk score was 0.836 (Figure 4c), suggesting that the 11-lncRNA prognostic signature for breast cancer was highly reliable. In aggregate, these results indicate that the 11 immune-related lncRNAs were independent prognostic factors in patients with breast cancer.
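The reported hazard ratios and confidence intervals can be cross-checked with simple arithmetic. The sketch below assumes a Wald-type interval, that is, HR = exp(beta) and 95% CI = exp(beta ± 1.96 × SE); this is the usual convention for Cox models but is an assumption here, since the text does not state how the intervals were computed. The function name is hypothetical.

```python
import math


def cox_summary(hr, ci_low, ci_high, z_crit=1.96):
    """Recover the log-hazard coefficient, its standard error and the Wald z.

    Assumes a Wald interval: CI = exp(beta -/+ z_crit * se).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    # Under the Wald assumption, log(HR) sits at the centre of the log-scale CI.
    midpoint = (math.log(ci_low) + math.log(ci_high)) / 2
    return {"beta": beta, "se": se, "z": beta / se, "midpoint": midpoint}


univariate = cox_summary(1.328, 1.256, 1.404)
multivariate = cox_summary(1.266, 1.188, 1.349)
```

In both models the log hazard ratio lies at the centre of the log-scale interval to within rounding, and the implied Wald z statistic is well above 3.29, the two-sided threshold for p<0.001, so the reported figures are internally consistent under this assumption.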
3.5 Correlation between 11 immune-related lncRNA prognostic signature for breast cancer and the infiltration of immune cell subtypes
Given that these 11 lncRNAs were related to tumor immunity, we next analyzed the correlation between the 11-lncRNA prognostic signature and the infiltration of immune cell subtypes in breast cancer using data from the TIMER database. As shown in Figure 5a-5f, the correlation coefficients of B cells, CD4+ T cells, CD8+ T cells, DC, neutrophils, and macrophages with the risk score were -0.111, -0.205, -0.169, -0.208, -0.204 and -0.097, respectively, suggesting that the infiltration of these immune cell subtypes was significantly negatively correlated with the risk score. Taken together, these results indicate that the 11-lncRNA prognostic signature for breast cancer was associated with the infiltration of these immune cell subtypes.
4. Discussion
Breast cancer is the most common and fatal malignant tumor among women in the world, with highly heterogeneous biological and clinical features[19]. The high heterogeneity of breast cancer exists not only in the genotypes and phenotypes of tumor cells but also in the tumor microenvironment[20]. Breast cancer tissue is composed not only of breast cancer cells but also of many kinds of normal cells, such as immune cells, stromal cells and fibroblasts[21]. These different types of cells interact with each other, evolve together, and eventually form a complex whole. In the current study, therefore, we focused on the heterogeneity of breast cancer and the interaction between tumor-infiltrating immune cells and tumor cells, which is of great significance for studying the mechanism of tumor development and progression and for developing new diagnostic and therapeutic approaches. In addition, with the wide application of high-throughput technology and the continuing maturation of data sharing mechanisms, unprecedented large-scale multi-omics tumor data have accumulated in international public databases, and tumor research has entered the era of "big data"[22]. Using the transcriptome sequencing data, especially on lncRNAs, and the clinicopathological features of breast cancer obtained from TCGA, we identified and verified the 11-lncRNA prognostic signature related to immune cell infiltration in this study.
The heterogeneity of the immune microenvironment in breast cancer is very high, and the type and number of infiltrating immune cells vary greatly in different locations[23]. In this study, there were significant differences in Tumor Purity, ESTIMATE Score, Immune Score, and Stromal Score between the high immune cell infiltration group and the low immune cell infiltration group. Furthermore, the heterogeneity of the immune microenvironment in breast cancer was verified by the expression of HLA and CD274 as well as by the CIBERSORT algorithm.
In recent years, deep sequencing studies of the transcriptome have found that about 4/5 of the transcripts in the human genome are non-protein-coding, including lncRNAs[24]. lncRNAs have been shown to participate in the development, progression, invasion, and metastasis of breast cancer in a variety of ways[12, 25]. In this study, we identified 11 lncRNAs, including LINC01010, LINC00668, and LINC02418, as a prognostic signature for breast cancer. Similarly, some studies have shown that LINC01010 was significantly related to the survival and prognosis of patients with neuroblastoma[26]. LINC00668 promotes the development of breast cancer by inhibiting apoptosis and accelerating the cell cycle[27]. LINC02418 was a promising new tumor marker for the diagnosis and prognosis of colorectal cancer[28]. To explore the feasibility of the prognostic signature in clinical application, we compared it with clinical indexes of breast cancer patients, such as gender, age and pathological stage, using univariate and multivariate Cox analyses as well as ROC analysis, and confirmed that the 11-lncRNA prognostic signature could be an independent prognostic factor in patients with breast cancer.
Based on lncRNA sequencing data, tumor-infiltrating immune cells account for a high proportion in many kinds of tumors, such as breast cancer, skin melanoma, non-small cell lung cancer and colon cancer[29, 30]. They are the key to tumor immunotherapy[31]. The complementarity-determining regions of T cell and B cell receptors play a decisive role in their recognition of tumor-specific antigens[32]. Therefore, the study of the sequence characteristics of tumor-infiltrating T cell and B cell surface receptors is helpful for analyzing the interaction between tumor cells and T cells or B cells and for developing new methods for tumor diagnosis and treatment. Postoperative tumor tissue usually contains a certain amount of infiltrating immune cells, so that RNA sequencing data from tumor tissue are mixed with all kinds of information from the tumor immune microenvironment[33]. In this study, we found that the 11-lncRNA prognostic signature for breast cancer was associated with the infiltration of immune cell subtypes using data from the TIMER database.
In conclusion, this study identifies 11 lncRNAs as a prognostic signature for breast cancer. The 11-lncRNA prognostic signature for breast cancer was associated with the infiltration of immune cell subtypes.
Acknowledgements This work was supported by grants (Nos. 81071803, 81272261, and 30971144) from the National Natural Science Foundation of China (http://www.nsfc.gov.cn/).
References
[1] A. Dumas, I. Vaz Luis, T. Bovagnet, M. El Mouhebb, A. Di Meglio, S. Pinto, C. Charles, S. Dauchy, S. Delaloge, P. Arveux, C. Coutant, P. Cottu, A. Lesur, F. Lerebours, O. Tredan, L. Vanlemmens, C. Levy, J. Lemonnier, C. Mesleard, F. Andre, G. Menvielle, Impact of Breast Cancer Treatment on Employment: Results of a Multicenter Prospective Cohort Study (CANTO), J Clin Oncol, (2019) JCO1901726.
[2] A.M. Afifi, A.M. Saad, M.J. Al-Husseini, A.O. Elmehrath, D.W. Northfelt, M.B. Sonbol, Causes of death after breast cancer diagnosis: A US population-based analysis, Cancer, (2019).
[3] T. Wang, L.E. McCullough, A.J. White, P.T. Bradshaw, X. Xu, Y.H. Cho, M.B. Terry, S.L. Teitelbaum, A.I. Neugut, R.M. Santella, J. Chen, M.D. Gammon, Prediagnosis aspirin use, DNA methylation, and mortality after breast cancer: A population-based study, Cancer, 125 (2019) 3836-3844.
[4] V. Cremasco, J.L. Astarita, A.L. Grauel, S. Keerthivasan, K. MacIsaac, M.C. Woodruff, M. Wu, L. Spel, S. Santoro, Z. Amoozgar, T. Laszewski, S.C. Migoni, K. Knoblich, A.L. Fletcher, M. LaFleur, K.W. Wucherpfennig, E. Pure, G. Dranoff, M.C. Carroll, S.J. Turley, FAP Delineates Heterogeneous and Functionally Divergent Stromal Cells in Immune-Excluded Breast Tumors, Cancer Immunol Res, 6 (2018) 1472-1485.
[5] E. Mamessier, F. Bertucci, R. Sabatier, D. Birnbaum, D. Olive, "Stealth" tumors: Breast cancer cells shun NK-cells anti-tumor immunity, Oncoimmunology, 1 (2012) 366-368.
[6] N. Eiro, B. Fernandez-Garcia, L.O. Gonzalez, F.J. Vizoso, Cytokines related to MMP-11 expression by inflammatory cells and breast cancer metastasis, Oncoimmunology, 2 (2013) e24010.
[7] M. Harao, M.A. Forget, J. Roszik, H. Gao, G.V. Babiera, S. Krishnamurthy, J.A. Chacon, S. Li, E.A. Mittendorf, S.M. DeSnyder, K.F. Rockwood, C. Bernatchez, N.T. Ueno, L.G. Radvanyi, L. Vence, C. Haymaker, J.M. Reuben, 4-1BB-Enhanced Expansion of CD8(+) TIL from Triple-Negative Breast Cancer Unveils Mutation-Specific CD8(+) T Cells, Cancer Immunol Res, 5 (2017) 439-445.
[8] P. Bieniasz-Krzywiec, R. Martin-Perez, M. Ehling, M. Garcia-Caballero, S. Pinioti, S. Pretto, R. Kroes, C. Aldeni, M. Di Matteo, H. Prenen, M.V. Tribulatti, O. Campetella, A. Smeets, A. Noel, G. Floris, J.A. Van Ginderachter, M. Mazzone, Podoplanin-Expressing Macrophages Promote Lymphangiogenesis and Lymphoinvasion in Breast Cancer, Cell Metab, 30 (2019) 917-936 e910.
[9] C.D. Stefanski, K. Keffler, S. McClintock, L. Milac, J.R. Prosperi, APC loss affects DNA damage repair causing doxorubicin resistance in breast cancer cells, Neoplasia, 21 (2019) 1143-1150.
[10] P. Michea, F. Noel, E. Zakine, U. Czerwinska, P. Sirven, O. Abouzid, C. Goudot, A. Scholer-Dahirel, A. Vincent-Salomon, F. Reyal, S. Amigorena, M. Guillot-Delost, E. Segura, V. Soumelis, Adjustment of dendritic cells to the breast-cancer microenvironment is subset specific, Nat Immunol, 19 (2018) 885-897.
[11] P. Cai, A.B. Otten, B. Cheng, M.A. Ishii, W. Zhang, B. Huang, K. Qu, B.K. Sun, A genome-wide long noncoding RNA CRISPRi screen identifies PRANCR as a novel regulator of epidermal homeostasis, Genome Res, (2019).
[12] Q.Y. Huang, G.F. Liu, X.L. Qian, L.B. Tang, Q.Y. Huang, L.X. Xiong, Long Non-Coding RNA: Dual Effects on Breast Cancer Metastasis and Clinical Applications, Cancers (Basel), 11 (2019).
[13] M. Zhou, Y. Hou, G. Yang, H. Zhang, G. Tu, Y.E. Du, S. Wen, L. Xu, X. Tang, S. Tang, L. Yang, X. Cui, M. Liu, LncRNA-Hh Strengthen Cancer Stem Cells Generation in Twist-Positive Breast Cancer via Activation of Hedgehog Signaling Pathway, Stem Cells, 34 (2016) 55-66.
[14] W. Li, G. Jia, Y. Qu, Q. Du, B. Liu, B. Liu, Long Non-Coding RNA (LncRNA) HOXA11-AS Promotes Breast Cancer Invasion and Metastasis by Regulating Epithelial-Mesenchymal Transition, Med Sci Monit, 23 (2017) 3393-3403.
[15] I. Bar, I. Theate, S. Haussy, G. Beniuga, J. Carrasco, J.L. Canon, P. Delree, A. Merhi, MiR-210 Is Overexpressed in Tumor-infiltrating Plasma Cells in Triple-negative Breast Cancer, J Histochem Cytochem, (2019) 22155419892965.
[16] F. Pages, J. Galon, M.C. Dieu-Nosjean, E. Tartour, C. Sautes-Fridman, W.H. Fridman, Immune infiltration in human tumors: a prognostic factor that should not be ignored, Oncogene, 29 (2010) 1093-1102.
[17] F.O. Beltran-Anaya, S. Romero-Cordoba, R. Rebollar-Vega, O. Arrieta, V. Bautista-Pina, C. Dominguez-Reyes, F. Villegas-Carlos, A. Tenorio-Torres, L. Alfaro-Riuz, S. Jimenez-Morales, A. Cedro-Tanda, M. Rios-Romero, J.P. Reyes-Grajeda, E. Tagliabue, M.V. Iorio, A. Hidalgo-Miranda, Expression of long non-coding RNA ENSG00000226738 (LncKLHDC7B) is enriched in the immunomodulatory triple-negative breast cancer subtype and its alteration promotes cell migration, invasion, and resistance to cell death, Mol Oncol, 13 (2019) 909-927.
[18] T. Meng, R. Huang, Z. Zeng, Z. Huang, H. Yin, C. Jiao, P. Yan, P. Hu, X. Zhu, Z. Li, D. Song, J. Zhang, L. Cheng, Identification of Prognostic and Metastatic Alternative Splicing Signatures in Kidney Renal Clear Cell Carcinoma, Front Bioeng Biotechnol, 7 (2019) 270.
[19] B. Sousa, A.S. Ribeiro, J. Paredes, Heterogeneity and Plasticity of Breast Cancer Stem Cells, Adv Exp Med Biol, 1139 (2019) 83-103.
[20] J.H.E. Baker, A.H. Kyle, S.A. Reinsberg, F. Moosvi, H.M. Patrick, J. Cran, K. Saatchi, U. Hafeli, A.I. Minchinton, Heterogeneous distribution of trastuzumab in HER2-positive xenografts and metastases: role of the tumor microenvironment, Clin Exp Metastasis, 35 (2018) 691-705.
[21] F. Bai, Y. Jin, P. Zhang, H. Chen, Y. Fu, M. Zhang, Z. Weng, K. Wu, Bioinformatic profiling of prognosis-related genes in the breast cancer immune microenvironment, Aging (Albany NY), 11 (2019) 9328-9347.
[22] H. Li, C. Gao, L. Liu, J. Zhuang, J. Yang, C. Liu, C. Zhou, F. Feng, C. Sun, 7-lncRNA Assessment Model for Monitoring and Prognosis of Breast Cancer Patients: Based on Cox Regression and Co-expression Analysis, Front Oncol, 9 (2019) 1348.
[23] A.S. Dias, C.R. Almeida, L.A. Helguero, I.F. Duarte, Metabolic crosstalk in the breast cancer microenvironment, Eur J Cancer, 121 (2019) 154-171.
[24] A. Piovesan, F. Antonaros, L. Vitale, P. Strippoli, M.C. Pelleri, M. Caracausi, Human protein-coding genes and gene feature statistics in 2019, BMC Res Notes, 12 (2019) 315.
[25] Q. Guo, S. Lv, B. Wang, Y. Li, N. Cha, R. Zhao, W. Bao, B. Jia, Long non-coding RNA PRNCR1 has an oncogenic role in breast cancer, Exp Ther Med, 18 (2019) 4547-4554.
[26] L. Gao, P. Lin, P. Chen, R.Z. Gao, H. Yang, Y. He, J.B. Chen, Y.G. Luo, Q.Q. Xu, S.W. Liang, J.H. Gu, Z.G. Huang, Y.W. Dang, G. Chen, A novel risk signature that combines 10 long noncoding RNAs to predict neuroblastoma prognosis, J Cell Physiol, (2019).
[27] X. Qiu, J. Dong, Z. Zhao, J. Li, X. Cai, LncRNA LINC00668 promotes the progression of breast cancer by inhibiting apoptosis and accelerating cell cycle, Onco Targets Ther, 12 (2019) 5615-5625.
[28] Y. Zhao, T. Du, L. Du, P. Li, J. Li, W. Duan, Y. Wang, C. Wang, Long noncoding RNA LINC02418 regulates MELK expression by acting as a ceRNA and may serve as a diagnostic marker for colorectal cancer, Cell Death Dis, 10 (2019) 568.
[29] R. Huang, Z. Zeng, G. Li, D. Song, P. Yan, H. Yin, P. Hu, X. Zhu, R. Chang, X. Zhang, J. Zhang, T. Meng, Z. Huang, The Construction and Comprehensive Analysis of ceRNA Networks and Tumor-Infiltrating Immune Cells in Bone Metastatic Melanoma, Front Genet, 10 (2019) 828.
[30] W.D. Yu, H. Wang, Q.F. He, Y. Xu, X.C. Wang, Long noncoding RNAs in cancer-immunity cycle, J Cell Physiol, 233 (2018) 6518-6523.
[31] R. Jiang, J. Tang, Y. Chen, L. Deng, J. Ji, Y. Xie, K. Wang, W. Jia, W.M. Chu, B. Sun, The long noncoding RNA lnc-EGFR stimulates T-regulatory cells differentiation thus promoting hepatocellular carcinoma immune evasion, Nat Commun, 8 (2017) 15129.
[32] I. Sela-Culang, M.R. Benhnia, M.H. Matho, T. Kaever, M. Maybeno, A. Schlossman, G. Nimrod, S. Li, Y. Xiang, D. Zajonc, S. Crotty, Y. Ofran, B. Peters, Using a combined computational-experimental approach to predict antibody-specific B cell epitopes, Structure, 22 (2014) 646-657.
[33] Z.Q. Zhou, J.J. Zhao, Q.Z. Pan, C.L. Chen, Y. Liu, Y. Tang, Q. Zhu, D.S. Weng, J.C. Xia, PD-L1 expression is a predictive biomarker for CIK cell-based immunotherapy in postoperative patients with breast cancer, J Immunother Cancer, 7 (2019) 228.
Figure 1. Construction and verification of breast cancer grouping. a. Immune cell scores were high in the Cluster1 group, which was named the high immune cell infiltration group (Immunity_H), and low in the Cluster2 group, which was named the low immune cell infiltration group (Immunity_L). The Tumor Purity, ESTIMATE Score, Immune Score and Stromal Score of each sample, calculated with the ESTIMATE algorithm, are displayed together with the grouping information. b. Box plots showing statistically significant differences in Tumor Purity, ESTIMATE Score, Immune Score and Stromal Score between the two groups (p<0.01). c and d. The expression of HLA family genes and CD274 was significantly higher in the high immune cell infiltration group (red) than in the low immune cell infiltration group (green) (p<0.01). e. CIBERSORT results showing the difference in the proportion of each immune cell type between the high immune cell infiltration group (red) and the low immune cell infiltration group (green).
Figure 2. Analysis of differentially expressed lncRNAs. a. The volcano plot showed that 2208 genes were up-regulated and 791 down-regulated between breast cancer and paracancerous tissues. Each red dot represents an up-regulated gene and each green dot represents a down-regulated gene (fold change >4, p= 0.001). b. As in panel a, the volcano plot showed that 455 genes were up-regulated and 967 down-regulated between the high and low immune cell infiltration groups of breast cancer. c. Taking the intersection of the two gene sets with an R software package yielded a total of 696 differentially expressed genes.
Figure 3. Identification and assessment of immune-related lncRNA prognostic signature for breast cancer. a. Hazard ratios (HR) and p-values from the univariate Cox regression of the selected immune-related genes (criterion: p-value<0.001). b. The LASSO Cox analysis identified the 17 lncRNAs most strongly correlated with prognosis. c. The optimal value of the penalty parameter was determined by 10-round cross-validation. d. Patients in the high-risk group (red) exhibited worse overall survival (OS) than those in the low-risk group (blue). e. The risk curve of the samples, ordered by risk score. f. Scatter plot of the survival status of each sample. The green and red dots represent survival and death, respectively. g. Heatmap showing the expression profiles of the signature in the low-risk and high-risk groups. The pink bar represents the low-risk group, and the blue bar represents the high-risk group. Gene expression levels from 0 to 4 are shown on a gradient from green to red.
Figure 4. The Cox regression analysis for evaluating the independent prognostic value of the risk score. The univariate (a) and multivariate (b) Cox regression analysis of risk score, age, gender, grade, and TNM stage. c. The AUC of risk score, age, gender, grade, and TNM stage for predicting overall survival, calculated from the ROC curves.
Figure 5. Correlation between the 11-lncRNA prognostic signature for breast cancer and the infiltration of immune cell subtypes. The six most significant correlations of the risk score with the immune cell infiltration ssGSEA scores are shown. a. B cells. b. CD4+ T cells. c. CD8+ T cells. d. Dendritic cells. e. Neutrophils. f. Macrophages.
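The intersection step described in the Figure 2 caption can be illustrated with a minimal Python sketch. It is not the authors' R code: the gene names and values below are invented toy data, the function `significant_genes` is hypothetical, and the thresholds (fold change of at least 4 in either direction, p-value of at most 0.001) are one assumed reading of the criteria quoted in the caption.

```python
from typing import Dict, Set, Tuple

# gene name -> (fold change, p-value); toy values, NOT data from the paper
DiffResult = Dict[str, Tuple[float, float]]


def significant_genes(results: DiffResult,
                      min_fold_change: float = 4.0,
                      max_p: float = 0.001) -> Set[str]:
    """Return genes passing the (assumed) fold-change and p-value cut-offs.

    A gene counts as up-regulated if its fold change is >= min_fold_change
    and as down-regulated if it is <= 1 / min_fold_change.
    """
    selected = set()
    for gene, (fold_change, p_value) in results.items():
        changed = (fold_change >= min_fold_change
                   or fold_change <= 1.0 / min_fold_change)
        if changed and p_value <= max_p:
            selected.add(gene)
    return selected


# Comparison 1: tumor vs. paracancerous tissue (hypothetical values)
tumor_vs_normal: DiffResult = {
    "LNC_A": (8.0, 0.0001),
    "LNC_B": (1.5, 0.0001),   # fold change too small
    "LNC_C": (0.1, 0.0005),   # down-regulated
    "LNC_D": (6.0, 0.02),     # p-value too large
}

# Comparison 2: high vs. low immune cell infiltration (hypothetical values)
high_vs_low: DiffResult = {
    "LNC_A": (5.0, 0.0002),
    "LNC_C": (0.2, 0.0001),
    "LNC_D": (9.0, 0.0001),
    "LNC_E": (4.5, 0.0001),
}

# Genes that are differentially expressed in both comparisons
shared = significant_genes(tumor_vs_normal) & significant_genes(high_vs_low)
print(sorted(shared))
```

Only genes that pass the cut-offs in both comparisons survive the intersection, which is why a gene such as the toy `LNC_D` (significant in one comparison only) is dropped.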
Author Statement
Chuanlu Shen and Yong Shen conceived the study design. Yong Shen drafted the manuscript. Yong Shen and Xiaowei Peng performed statistical analysis. All the authors participated in the discussion, provided conceptual comments, and have read and approved the final manuscript.
Table 1. The expression levels of these 11 lncRNAs

id            coef        HR          HR.95L      HR.95H      p-value
LINC00668     0.05743     1.059112    1.000718    1.120913    0.047172
LINC02418     0.131969    1.141073    1.042715    1.248708    0.004112
AL356515.1    0.242501    1.274432    1.109549    1.463818    0.000602
LINC01010     -0.22616    0.797594    0.70333     0.904491    0.000425
AP005131.6    -0.15287    0.858243    0.75443     0.976341    0.020128
AL772337.1    0.180778    1.198149    1.057565    1.357421    0.004527
AC027514.1    0.206782    1.229715    1.046417    1.44512     0.012042
AL161646.2    0.168444    1.183462    1.03545     1.35263     0.013474
AC004847.1    -0.12587    0.881727    0.76218     1.020025    0.090409
AC243773.2    0.072279    1.074956    0.995029    1.161303    0.066723
AL591686.1    -0.1254     0.882147    0.769362    1.011466    0.072397
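To show how the columns of Table 1 relate to each other, the following Python sketch recomputes the hazard ratio of each lncRNA as exp(coef), which is the standard relationship in a Cox model, and combines the coefficients into a per-sample risk score. The linear form of the score (sum of coefficient times expression) is the conventional one and is assumed here rather than quoted from the text above; the function names and the example expression values are hypothetical.

```python
import math
from typing import Dict

# Cox coefficients copied from Table 1
COEFFICIENTS: Dict[str, float] = {
    "LINC00668": 0.05743,
    "LINC02418": 0.131969,
    "AL356515.1": 0.242501,
    "LINC01010": -0.22616,
    "AP005131.6": -0.15287,
    "AL772337.1": 0.180778,
    "AC027514.1": 0.206782,
    "AL161646.2": 0.168444,
    "AC004847.1": -0.12587,
    "AC243773.2": 0.072279,
    "AL591686.1": -0.1254,
}


def hazard_ratio(coef: float) -> float:
    """Hazard ratio implied by a Cox coefficient: HR = exp(coef)."""
    return math.exp(coef)


def risk_score(expression: Dict[str, float]) -> float:
    """Assumed linear risk score: sum of coef_i * expression_i.

    Raises KeyError if a signature lncRNA is missing from `expression`.
    """
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Recompute the HR column from the coef column
for gene, coef in COEFFICIENTS.items():
    print(f"{gene:12s} coef={coef:+.6f} HR={hazard_ratio(coef):.6f}")

# Hypothetical sample: every signature lncRNA expressed at level 1.0
example_expression = {gene: 1.0 for gene in COEFFICIENTS}
print(f"risk score = {risk_score(example_expression):.6f}")
```

A positive coefficient (HR above 1) raises the score as expression increases, while a negative coefficient (HR below 1) lowers it, which matches the direction of the hazard ratios listed in the table.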
Highlights
A prognostic model based on 11 differentially expressed lncRNAs was established for breast cancer.
Our prognostic model of breast cancer was correlated with immune cell infiltration.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6