Lung Adenocarcinoma and Squamous Cell Carcinoma Gene Expression Subtypes Demonstrate Significant Differences in Tumor Immune Landscape


Accepted Manuscript

Hawazin Faruki, Gregory M. Mayhew, Jonathan S. Serody, D Neil Hayes, Charles M. Perou, Myla Lai-Goldman

PII: S1556-0864(17)30214-9
DOI: 10.1016/j.jtho.2017.03.010
Reference: JTHO 540
To appear in: Journal of Thoracic Oncology
Received Date: 27 September 2016
Revised Date: 4 February 2017
Accepted Date: 7 March 2017

Please cite this article as: Faruki H, Mayhew GM, Serody JS, Hayes DN, Perou CM, Lai-Goldman M, Lung Adenocarcinoma and Squamous Cell Carcinoma Gene Expression Subtypes Demonstrate Significant Differences in Tumor Immune Landscape, Journal of Thoracic Oncology (2017), doi: 10.1016/j.jtho.2017.03.010. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT 1

Lung Adenocarcinoma and Squamous Cell Carcinoma Gene Expression Subtypes Demonstrate Significant Differences in Tumor Immune Landscape


Hawazin Faruki(1), Gregory M Mayhew(1), Jonathan S Serody(2), D Neil Hayes(2), Charles M Perou(2,3), Myla Lai-Goldman(1)


*Corresponding Author Hawazin Faruki, DrPH GeneCentric Diagnostics 280 South Mangum Street Suite 350 Durham, NC 27701 919-451-0400 [email protected]

(1) GeneCentric Diagnostics, Durham, NC; (2) Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC; (3) Department of Genetics, University of North Carolina, Chapel Hill, NC


Keywords: Lung Cancer, Adenocarcinoma, Squamous cell carcinoma, Non Small Cell Lung Cancer, NSCLC, Subtypes, Immune response, PD-L1, Gene Expression.

Disclosures/Conflicts of Interest: HF, GM, and MLG are employees of GeneCentric Diagnostics and are named in 3 pending patents for lung subtyping. CMP and DNH are Board of Directors members, equity stockholders, and consultants for GeneCentric Diagnostics. They are each also named as inventors on an issued US patent and 3 other pending patents for lung cancer subtyping. JSS is a consultant for GeneCentric Diagnostics.

Abstract

Introduction: Molecular subtyping of lung adenocarcinoma (AD) and lung squamous cell carcinoma (SQ) reveals biologically diverse tumors that vary in their genomic and clinical attributes.

Methods: Using published immune cell signatures and several lung AD and SQ gene expression datasets including The Cancer Genome Atlas (TCGA), immune response in relation to AD and SQ expression subtypes was examined. Expression of immune cell populations and other immune related genes including CD274 (PD-L1) was investigated in the tumor microenvironment relative to the expression subtypes of AD (Terminal Respiratory Unit (TRU), Proximal Proliferative (PP), and Proximal Inflammatory (PI)) and SQ (Primitive, Classical, Secretory, Basal).

Results: Lung AD and SQ expression subtypes demonstrated significant differences in tumor immune landscape. The PP subtype of AD demonstrated low immune cell expression among ADs, while the secretory subtype showed elevated immune cell expression among SQs. Tumor expression subtype was a better predictor of immune cell expression than CD274 (PD-L1) in SQ tumors but was comparable in AD tumors. Non-silent mutation burden was not correlated with immune cell expression across subtypes; however, MHC class II gene expression was highly correlated with immune cell expression. Increased immune and MHC II gene expression was associated with improved survival in the TRU and PI subtypes of AD and in the primitive subtype of SQ.

Conclusions: Molecular expression subtypes of lung AD and SQ demonstrate key and reproducible differences in immune host response. Evaluation of tumor expression subtypes as potential biomarkers for immunotherapy should be investigated.

Introduction

Non-small cell lung cancer (NSCLC) is a heterogeneous disease typically classified into two broad subtypes, adenocarcinoma (AD) or squamous cell carcinoma (SQ), using standard pathology methods. Beyond the morphologic differentiation, multiple gene expression subtypes that differ in their prognosis, underlying genomic alterations, and potential response to treatment have been identified within AD and SQ tumors.1-4 The three gene expression AD subtypes include Terminal Respiratory Unit (TRU), Proximal Proliferative (PP), and Proximal Inflammatory (PI),2 formerly referred to as “bronchioid”, “magnoid” and “squamoid” subtypes, respectively.4 SQ includes four subtypes: primitive, classical, basal, and secretory.1,3 Lung AD and SQ expression subtypes are not discernible by standard morphology-based diagnoses; however, they demonstrate significant differences in key genomic, genetic, and clinical characteristics including tumor differentiation, stage-specific survival, underlying drivers, and likely response to various therapies.1-4 The AD subtype TRU is characterized by enrichment for EGFR and ALK alterations, a higher proportion of nonsmokers, and a better prognosis. STK11 deletion, high proliferation, brain metastases, and poor prognosis characterize the PP subtype, and TP53 mutations are more characteristic of the PI subtype.2,4,5 Among the SQ subtypes, the primitive subtype is enriched for RB1 loss, the classical subtype for KEAP1/NFE2L2 oxidative stress alterations, the secretory subtype for greater inflammatory response, and the basal subtype for NF1 alterations.1,3

Investigation of the tumor immune response across multiple tumor types using genomic data has been valuable in identifying the role of tumor infiltrating lymphocytes (TILs) in prognosis and immunotherapy response.6-8 Bremnes et al. recently reviewed the role of TILs and the associated composition of the immune microenvironment as key determinants of NSCLC patient outcomes.9 Immune checkpoint inhibitors targeting the PD-1/PD-L1 interaction have been shown to reverse the lung tumor induced immunosuppressive microenvironment, releasing an effective host anti-tumor immune response and leading to remarkable improvements in survival.10-12

Biomarkers for predicting immunotherapy response have included a variety of anti-PD-L1 antibodies assayed using immunohistochemistry (IHC). In melanoma and lung adenocarcinoma, an increase in tumor cells expressing PD-L1 has been associated with an increase in response;10,12,13 however, variable cutoffs, multiple antibodies with differing affinities, and lack of method standardization have resulted in conflicting findings regarding the value of PD-L1 IHC testing.11,12,14 Other predictors of response, including non-silent mutation and neo-antigen burden15 and gene expression of a variety of immune response genes,16 have been studied but remain investigational tools at this time. In this study, we explore the immune landscape of genomic subtypes of lung AD and SQ to characterize the immune microenvironment using publicly available genomic datasets. Given the intrinsic biologic differences of gene expression subtypes of lung AD and SQ tumors, we were interested in investigating subtype-specific immune characteristics and associated immune cell/marker expression differences that might contribute to our understanding of prognosis and/or response to immunotherapy.

Materials and Methods


Sample Datasets

Multiple publicly available datasets were assembled to evaluate tumor subtype infiltration of immune cells and associated survival differences. The AD and SQ datasets included several publicly available lung cancer gene expression data sets. Data sources, sample numbers, and sample characteristics are provided in Supplemental Table 1.

Publicly Available Expression Data Sets and Subtype Assignments

We used 4 previously published adenocarcinoma data sets with a total of 1190 patient samples. The published data sets included TCGA,2 Shedden et al.,17 Tomida et al.,18 and Wilkerson et al.,4 all of which were derived from fresh frozen specimens. The published TCGA data included expression profiles from 58 tumor adjacent normal lung tissue samples. For TCGA, upper quantile normalized RSEM data were downloaded from Firehose19 and log2 transformed. Affymetrix CEL files from Shedden et al.17 were downloaded from the NCI website (https://caintegrator.nci.nih.gov/caintegrator/), and robust multi-array average expression measures were generated using the affy package in R. Normalized Agilent array data were downloaded from the Gene Expression Omnibus (GEO) website for Tomida et al.18 (GSE13213) and Wilkerson et al.4 (GSE26939).

We used 4 published gene expression data sets of lung squamous cell carcinoma samples having a total of 761 patients, including TCGA,1 Lee et al.,16 Raponi et al.,20 and Wilkerson et al.3 The published TCGA data included expression profiles from 51 tumor adjacent normal lung tissue samples. For TCGA, upper quantile normalized RSEM data were downloaded from Firehose19 and log2 transformed. Normalized Affymetrix array data were downloaded from GEO for Lee et al.16 (GSE8894) and Raponi et al.20 (GSE4573), and normalized Agilent array data were downloaded from GEO for Wilkerson et al.3 (GSE17710). To determine AD subtype (TRU, PP, and PI) and SQ subtype (basal, classical, primitive, secretory), we applied the published AD 506-gene nearest centroid classifier and the SQ 208-gene classifier as described previously in Wilkerson et al.3,4 The full list of datasets used is included in Supplementary Table S1.
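As an illustration of how a nearest centroid classifier of this kind assigns a subtype, the following is a minimal sketch and not the published classifier. It assumes Pearson correlation as the similarity measure, which the text above does not specify, and the gene names (G1 to G4) and centroid values are hypothetical toy data; the published classifiers use 506 and 208 genes.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)

def assign_subtype(sample, centroids):
    """Return the best-matching subtype label and all similarity scores.

    sample:    dict mapping gene -> expression value for one tumor
    centroids: dict mapping subtype -> dict of gene -> centroid value
    Only genes present in both the sample and the centroids are used
    (an assumption about how missing genes are handled).
    """
    reference = next(iter(centroids.values()))
    genes = [g for g in reference if g in sample]
    x = [sample[g] for g in genes]
    scores = {
        name: pearson(x, [centroid[g] for g in genes])
        for name, centroid in centroids.items()
    }
    return max(scores, key=scores.get), scores

# Hypothetical toy centroids, not the published values.
toy_centroids = {
    "TRU": {"G1": 1.0, "G2": -0.5, "G3": -1.0, "G4": 0.2},
    "PP": {"G1": -1.0, "G2": 1.0, "G3": 0.1, "G4": -0.3},
    "PI": {"G1": -0.2, "G2": -0.8, "G3": 1.2, "G4": 0.5},
}
toy_sample = {"G1": 0.9, "G2": -0.4, "G3": -0.7, "G4": 0.1}

label, scores = assign_subtype(toy_sample, toy_centroids)
print(label)
```

The toy sample follows the TRU pattern most closely, so it receives that label; in practice the same comparison would be run for every tumor against each subtype centroid.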

Gene Sets

Our investigation of immune differences by subtype used the 24 immune cell gene signatures from Bindea et al.,6 each of which had a varying number of genes and was classified as an adaptive or innate immunity cell signature. Adaptive Immune Cell (AIC) signatures included T cells, Central Memory T cells (Tcm), Effector Memory T cells (Tem), T helper cells (Th), Type 1 T helper cells (Th1), Type 2 T helper cells (Th2), T follicular helper cells (Tfh), T helper 17 cells (Th17), T Regulatory Cells (Treg), Gamma Delta T cells (Tgd), CD8 T cells, Cytotoxic T cells, and B cells. Innate Immune Cell (IIC) signatures included Natural Killer (NK) cells, NK CD56dim cells, NK CD56bright cells, Dendritic cells (DC), Immature Dendritic Cells (iDC), Plasmacytoid Dendritic Cells (pDC), Activated Dendritic Cells (aDC), Mast cells, Eosinophils, Macrophages, and Neutrophils. For each signature we assigned a score to each sample by calculating the average expression value of all genes in the list. The IFN signature was a new signature developed by us for this analysis and included 13 interferon signaling pathway genes selected to investigate IFN-pathway expression; the IFN gene list can be found in Supplementary Table S2. A 13-gene MHC class II signature score21 was also included, as well as 4 additional individual gene immunity markers: PDCD1, PD-L1 (CD274), CTLA4, and PD-L2 (PDCD1LG2). For all signatures, a summary of missing genes in various datasets is included in Supplementary Table S2.
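The signature score described above (the average expression of all genes in a signature) can be sketched as follows. This is an illustrative sketch only: the gene names are hypothetical placeholders, and averaging over only the genes present in a dataset is an assumption about how missing genes might be handled, not a method stated in the text.

```python
def signature_score(expression, signature_genes):
    """Mean expression of the signature genes for one sample.

    expression:      dict mapping gene -> (log2) expression value
    signature_genes: list of gene names that define the signature
    Genes absent from the sample are skipped (an assumption).
    """
    present = [g for g in signature_genes if g in expression]
    if not present:
        raise ValueError("no signature genes found in this sample")
    return sum(expression[g] for g in present) / len(present)

# Hypothetical sample and signature; GENE_C is missing from the sample.
toy_expression = {"GENE_A": 5.0, "GENE_B": 7.0, "GENE_D": 3.0}
toy_signature = ["GENE_A", "GENE_B", "GENE_C"]

score = signature_score(toy_expression, toy_signature)
print(score)  # 6.0
```

Because every signature is reduced to a single number per sample, signatures with different numbers of genes can be compared across subtypes on the same footing.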

Immune Cell Genomic Evaluations

Using the TCGA lung cancer data, separately for adenocarcinoma and squamous cell carcinoma, we investigated overall immunity marker trends by subtype by plotting expression heatmaps in which samples were arranged by subtype and markers were grouped according to the ordering in Bindea et al.6 Marker-subtype association test p-values (Kruskal-Wallis) and Mann-Whitney test p-values for marker distribution differences between each pair of subtypes were calculated. Correlations among the 30 immune markers were examined by plotting matrices of pairwise Spearman rank correlation coefficients, where markers were ordered by hierarchical clustering. We compared the MHC class II signature in tumor versus tumor adjacent normal lung tissue using the Mann-Whitney test. To evaluate the reproducibility of immunity marker differences among the subtypes, we plotted normalized T cell signatures by subtype for each data set.
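For readers unfamiliar with the Spearman rank correlation used for the pairwise marker comparisons, a minimal sketch follows. It computes the coefficient as the Pearson correlation of average ranks, which is the standard definition in the presence of ties; the input vectors are hypothetical, and the analyses in this study were run in R rather than with this code.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical marker scores across five samples.
marker_a = [1.0, 2.0, 3.0, 4.0, 5.0]
marker_b = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman(marker_a, marker_b)
print(round(rho, 3))  # 0.8
```

Because only ranks are used, the coefficient is unaffected by monotone transformations such as the log2 transform applied to the expression data.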

Prediction Strength

To assess the prediction strength of subtype as a predictor of immune markers relative to that of PD-L1, a linear regression model of each signature with subtype as the sole predictor, and again with PD-L1 as the sole predictor, was fit using the TCGA dataset. PD-L1 expression was treated as a low/high categorical variable with equal proportions in each group. Scatter plots of adjusted R-squared when subtype was the predictor against adjusted R-squared when PD-L1 was the predictor were inspected for overall trends.
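The comparison above can be illustrated with a short sketch. With a single categorical predictor, the least-squares fitted value for each sample is its group mean, so adjusted R-squared can be computed directly from within-group and total sums of squares. The data below are hypothetical, and the median split is one possible way to obtain equal-sized low/high groups; the exact cut used in the study is not stated.

```python
import statistics

def adjusted_r2_categorical(values, groups):
    """Adjusted R-squared of a linear model with one categorical predictor.

    Fitted values are the group means, so
    R^2 = 1 - SSE/SST and adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k),
    where k is the number of groups (intercept plus k - 1 indicators).
    """
    n = len(values)
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    k = len(by_group)
    grand_mean = sum(values) / n
    sst = sum((v - grand_mean) ** 2 for v in values)
    sse = 0.0
    for vs in by_group.values():
        m = sum(vs) / len(vs)
        sse += sum((v - m) ** 2 for v in vs)
    r2 = 1 - sse / sst
    return 1 - (1 - r2) * (n - 1) / (n - k)

def median_split(values):
    """Label each value low/high around the median (assumed cut point)."""
    cutoff = statistics.median(values)
    return ["high" if v > cutoff else "low" for v in values]

# Hypothetical T cell signature scores, subtype labels, and PD-L1 values.
tcell_score = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
subtype = ["A", "A", "A", "B", "B", "B"]
pdl1 = [2.0, 5.0, 3.0, 4.0, 1.0, 6.0]

adj_subtype = adjusted_r2_categorical(tcell_score, subtype)
adj_pdl1 = adjusted_r2_categorical(tcell_score, median_split(pdl1))
print(round(adj_subtype, 3), round(adj_pdl1, 3))
```

In this toy data subtype explains most of the variance in the signature while the PD-L1 split explains almost none, which is the kind of contrast the scatter plots of the two adjusted R-squared values are meant to reveal.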

Genomic Associations with T cell Expression

Several genetic and genomic alterations characteristic of AD and SQ subtypes, including EGFR, TP53, and STK11 inactivation in AD and RB1, NFE2L2, and NF1 expression in SQ, were examined for association with T cell expression with and without adjustment for subtype using linear regression. Mutation and CNV data were downloaded from Firehose;22 for STK11, samples were called inactive when reported as deleted and/or mutated. When association with T cells was strong, we plotted the marker distribution by subtype and evaluated association evidence using Fisher’s exact test and the Kruskal-Wallis test for binary and continuous markers, respectively, and compared every pair of subtypes for marker distribution differences using Fisher’s exact test or the Mann-Whitney test. Using non-silent mutation burden per Mb data available in the supplementary TCGA information,1,2 we investigated association with T cell expression using linear regression. Association between mutation burden and subtype was evaluated overall using the Kruskal-Wallis test and between each pair of subtypes using the Mann-Whitney test.

Subtype and immune signature associations with a 13-gene MHC class II signature,21 calculated as an average of all genes in the list (gene list in Supplementary Table S2), were investigated using the Kruskal-Wallis test for overall differences and the Mann-Whitney test for comparing two subtypes. For immune signature-MHC class II associations in both tumor and tumor adjacent normal tissue, Spearman correlation coefficients and p-values were calculated.


Survival Analysis

We tested for immune marker-survival associations in the TCGA data sets, overall and separately within each subtype, using Cox proportional hazards models. Immune markers were left as continuous variables after being centered and scaled to have mean 0 and variance 1. Stage IV patients were excluded from the analysis because of heterogeneity in their clinical management and poor representation (i.e. small sample size) in the genomic datasets. Evaluations within a specific subtype were adjusted for stage, and overall evaluations were adjusted for both stage and subtype. Forest plots showing hazard ratios and confidence intervals for significant signatures were made. All statistical analyses were conducted using R 3.2.0 software (http://www.R-project.org).

Results

Immune cell evaluations by subtype

Examination of immune cell gene signatures, including both adaptive and innate immune cells as well as individual immune gene markers, revealed clear differences among the AD and SQ subtypes (see Figure 1). In AD, immune expression was consistently lower in the PP subtype for most cell types examined. Expression was similar in TRU and PI for most T cells, but the two subtypes could be differentiated by greater expression of some innate immune cells (dendritic cells, NK CD56bright, mast cells, eosinophils) and several adaptive immune cells (B cells, TFH, Tcm, Th17, CD8 T cells) in the TRU subtype, while the PI subtype showed higher expression of Th1, Th2, Treg, cytotoxic T cells, and NK CD56dim cells. Box plots of all the immune cells and markers by AD and SQ subtype can be found in Supplemental Figure S1. Marker-subtype association test p-values (Kruskal-Wallis) as well as Mann-Whitney test p-values for marker distribution differences between each pair of subtypes can be found in Supplemental Table S3.

Immunotherapy targets CTLA4 and CD274 (PD-L1) demonstrated consistently higher expression in the PI subtype across multiple datasets (Supplemental Figure S1). In the PP tumors, both adaptive and innate immune cell expression as well as immunotherapy target expression were low relative to other AD subtypes (Figure S1). Among the SQ subtypes, the secretory subtype showed consistently higher expression of both innate and adaptive immune cells with one exception, the Th2 signature, where primitive and secretory had comparable expression (Figure S1). The classical subtype demonstrated the lowest immune cell expression of all the SQ subtypes. Unlike the case for AD subtypes, CD274 (PD-L1) expression did not correlate with other immune cell expression in SQ subtypes. This is especially obvious in the classical subtype, where CD274 (PD-L1) expression was high despite relatively low expression of other immune cells (Figure 1 & Supplemental Figure S1).

Hierarchical clustering analysis grouped adaptive immune cells together and innate immune cells together. In AD, adaptive immune cells such as T cells, cytotoxic cells, CD8 cells, Th1 cells, PDCD1, CTLA4, and Tregs had high pairwise correlations (Spearman correlation coefficient >0.53), as did innate immune cells such as iDC, DC, macrophages, neutrophils, mast cells, and eosinophils (Spearman correlation coefficient >0.52). In SQ, patterns were similar (Spearman correlation coefficient >0.68 for adaptive immune cells and >0.55 for innate immune cells). NK cells were not consistently correlated with innate immune cells. In particular, NK CD56dim cells (cytolytic activity) were more strongly correlated with adaptive immune cells than with innate immune cells (Spearman correlation coefficient >0.50 for adaptive immune cells vs. >0.09 for innate cells in AD and >0.70 for adaptive immune cells vs. >0.42 for innate cells in SQ). Hierarchical clustering is shown in Supplemental Figure S2, and all correlation coefficients and p-values for each pair of immune markers are included in Supplemental Table S4.


Strength of Association with Adaptive Immune Cell Expression

Strength of association between CD274 (PD-L1) expression and adaptive immune cell signatures, as compared to AD or SQ subtype, was investigated. Figure 2 shows that CD274-signature associations (measured by adjusted R-squared) were stronger than those of AD subtype for some cells (T cells, Th1, Treg, cytotoxic cells, T helper, Tem, Tgd) but not for others (TFH, Th2, CD8, Th17, and Tcm). In AD, the median F-test p-value and adjusted R-squared for association with adaptive immune cells were 5.97e-13 and 0.10 for subtype versus 1.18e-10 and 0.08 for CD274. In SQ tumors, subtype was a better predictor of adaptive immune cell expression for all cells examined, and the median F-test p-value and adjusted R-squared were 2.16e-24 and 0.20 for subtype versus 4.36e-5 and 0.03 for CD274 (Figure 2).

Evaluation of T cell signature in Multiple Datasets

To evaluate the reproducibility of our findings, we examined individual signatures across each data set separately, with typical results shown in Figure 3. T cell immune signature expression patterns by subtype in AD and SQ were remarkably reproducible across a variety of gene expression datasets and gene expression platforms, including RNAseq (Illumina, San Diego, CA) and microarrays from both Affymetrix (Santa Clara, CA) and Agilent (Santa Clara, CA) (Figure 3). AD and SQ subtypes showed similar T cell expression patterns independent of platform. Similar observations were made for other immune cell types and expression biomarkers including B cells, NK cells, dendritic cells, macrophages, PDCD1 (PD-1), and CD274 (PD-L1), which can be seen in Supplemental Figure S3.

Somatic Genetic and Genomic Associations with T cell Expression

Evaluation of mutation burden and its association with T cell expression in the AD and SQ subtypes was conducted. Non-silent mutation burden in the TCGA AD data differed by subtype, with PI showing the highest burden and TRU the lowest burden (Figure 4A). The PI subtype, which is enriched for TP53 mutations23 and had greater mutation burden, was associated with higher immune cell expression; however, TRU had the lowest mutation burden (Figure 4A) and rarely harbors TP53 mutations,23 yet demonstrated high immune cell expression features (Figure 1 & Figure S1). In SQ, non-silent mutation burden was not significantly different across subtypes (p=0.54; Figure 4B) despite significant differences in T cell expression (Figure 1 & Figure S1). Using linear regression, mutation burden was not found to be correlated with T cell immune cell expression in either AD or SQ datasets (p=0.24 in AD and p=0.9 in SQ, Figure S4-D,H).

Several AD and SQ gene expression subtype-enriched genetic and genomic alterations were investigated for their association with T cell expression, including EGFR, TP53, and STK11 inactivation in AD, and RB1, NF1, and NFE2L2 expression in SQ. In AD, only STK11 inactivation was markedly associated with T cell gene expression signatures (p=0.0007, Supplemental Figure S4-A), but the association was lost after adjustment for expression subtype (p=0.43). In SQ, NFE2L2 expression was associated with T cells (p=1.2E-07, Supplemental Figure S4-G), as was NF1 expression (p=0.01, Supplemental Figure S4-F), but adjustment for expression subtype eliminated significance (p=0.47 and p=0.26, respectively). Loss of STK11 in AD24-26 and KEAP1/NFE2L2 alterations in SQ27 have been associated with reduced immune response in NSCLC. Our evaluation showed enrichment of STK11 inactivation in the low immune response adenocarcinoma PP subtype (Figure 4C) and of KEAP1/NFE2L2 alterations, impacting the oxidative stress pathway, in the low immune response SQ classical subtype (Figure 4D); however, neither STK11 nor NFE2L2 was a significant predictor following adjustment for expression subtype (Supplemental Figure S4-A,G).

MHC class II

Given recent interest in MHC class II gene expression association with immune infiltration in triple negative breast cancer21 and in colon cancer,6 in combination with our own observations of differential expression of several MHC class II genes across the subtypes, we investigated the association of immune cell expression in AD and SQ lung cancer with MHC class II genes using a published 13-gene MHC class II signature.21 MHC class II gene expression varied significantly across tumor subtypes (Figures 4E and 4F; p=2.7E-45 and 3.1E-39 in AD and SQ, respectively). We next examined tumor adjacent normal lung tissue expression profiles from the AD and SQ datasets (n=58 and 51, respectively) to evaluate whether MHC II expression was lung tissue specific versus tumor specific. Higher MHC II expression in tumor adjacent normal lung tissue as compared to lung tumor tissue suggested tumor-specific decreased expression in the tumor microenvironment as compared to normal lung tissues taken from lung cancer patients (p=4.8E-16 and p<2.2E-16 in AD and SQ, respectively). MHC class II gene expression was strongly correlated with several immune cells in both AD and SQ, including T cell expression (Spearman correlation=0.66 in AD; 0.86 in SQ), B cell expression (Spearman correlation=0.5 in AD; 0.69 in SQ), and DC expression (Spearman correlation=0.69 in AD; 0.76 in SQ). Scatter plots of MHC II and immune cell expression in AD and SQ subtypes with correlation coefficients and associated p-values can be found in Supplemental Figure S5. Tumor adjacent normal MHC II expression was uniformly high relative to tumor samples (Figure S5), and correlations with immune cell expression were lower. In contrast with results for other subtype-specific genomic alterations, MHC II expression remained a significant predictor of T cell immune cell expression in linear regression models following adjustment for subtype in AD and SQ data sets (p<1E-50).


Survival analysis

Immune infiltrates have been associated with improved survival in NSCLC;9,28-31 however, analyses have typically not considered the subtypes of AD or SQ. Using Cox proportional hazards models, we calculated subtype-specific hazard ratios per unit increase in normalized expression. Hazard ratios and confidence intervals for markers that were significant (nominal p-value<0.05) for at least one subtype following adjustment for pathologic stage are shown in Figure 5. For AD subtypes, a unit increase in expression for many innate and adaptive immune cells, as well as CD274 (PD-L1), CTLA4, and the MHC class II signature, was significantly associated with improved survival in the PI subtype of AD but not in other AD subtypes (Figure 5A).

Among the SQ subtypes, a unit increase in expression for Th1, Th2, TFH, DC, macrophages, and MHC class II was significantly associated with improved survival in the primitive subtype (Figure 5B). We demonstrate that increased immune cell expression is not consistently associated with improved survival and appears to be expression subtype dependent. Curiously, the inflammatory SQ secretory subtype, expected to demonstrate improved survival with increased expression of immune cells, did not show significant associations with survival (except for mast cells). The secretory SQ subtype demonstrated uniformly high expression of immune cells, which may have prevented detection of a significant survival benefit. Alternatively, immune expression in the SQ secretory subtype may not be associated with improved survival. Subtype-specific and overall hazard ratios and confidence intervals (adjusted for subtype and pathologic stage) for all the signatures and genes evaluated are included in Supplemental Table S5.

Discussion

The mortality rate associated with lung cancer remains high, and despite the presence of several potentially targetable mutations in lung AD, many lung tumors do not carry a known driver mutation that can be targeted with existing therapies;32 therefore, there is a significant need to improve management for this large group of lung cancer patients. Recent evidence from the Cancer Genome Atlas Network confirms earlier findings that genomic analysis of lung AD and SQ tumors defines unique biologic subtypes with potential to inform treatment and management decisions.1-4 Given the lack of a reproducible biomarker for immune therapy response in NSCLC and the potential for clinical differentiation by gene expression subtypes to inform prognosis and treatment outcomes, the immune landscape of lung AD and SQ molecular subtypes was explored.

We demonstrate consistent, but variable, immune features of lung AD and SQ expression subtypes, with decreased immune cell expression in the PP subtype of AD and elevated expression in the secretory and, to a lesser extent, primitive subtypes of SQ. CD274 (PD-L1) gene expression was not always correlated with immune cell expression, particularly in SQ, where gene expression subtypes were more strongly predictive of T cell gene expression than CD274. This is consistent with earlier observations demonstrating lack of association of PD-L1 IHC expression with response to checkpoint inhibitors in SQ patients11 but potentially in conflict with another study suggesting association of PD-L1 expression and treatment response in NSCLC patients.12

Evaluation of mutation burden did not confirm a strong independent association of mutation burden with immune cell expression in either AD or SQ, despite earlier observations of mutation burden and neoantigen associations with response to checkpoint inhibitors in NSCLC.15 In AD, the PI subtype, enriched for TP53 mutations, was associated with elevated immune cell expression and a higher mutation burden; however, the TRU subtype was associated with the lowest mutation burden despite relatively high immune expression. As mutation burden association with immune response is investigated, it will be important to consider the distribution of AD subtypes in the sample population, as the relative proportion of various subtypes within a given cohort could bias conclusions of mutation burden association with immune response for AD as a whole. This study did not investigate neoantigens, and it is possible that neoantigen burden might yield different results; however, previous observations have demonstrated a high correlation of neoantigens with non-silent mutations.15 In SQ, there was no association of non-silent mutation burden and T cell immune expression, agreeing with an earlier study of immune cytolytic activity across tumor types.8

Other potential genomic associations with lack of immune cell expression, such as STK11 loss/inactivation24-26 in AD or KEAP1/NFE2L2 oxidative stress pathway alterations in SQ,27 were reflective of differences in enrichment across the subtypes, but the association of these alterations with immune cell expression was not independent of subtype. Association of T cell immune expression with MHC class II genes and with improved survival in the AD PI subtype as well as the primitive subtype of SQ was observed. Expression of MHC class II genes with increased immune cell expression may be reflective of ongoing immune cell infiltration or may be tumor associated, as suggested recently in triple negative breast cancer.21 More importantly, it is associated with improved survival in some of the subtypes but not others, thus providing potentially valuable clinical justification for additional investigation of expression subtypes of AD and SQ. This adds to a growing body of literature linking improved prognosis to infiltrating immune cells in NSCLC9,28-31 and in a variety of other tumors7 but is unique in identifying prognostic differences of immune cell expression across expression subtypes that may not be discernible when evaluating AD or SQ tumors as a whole.

Recent studies suggest that adaptive immune responses in the lung may be initiated in the tumor environment in node-like “tertiary lymphoid structures”,33 thus providing an explanation for the relatively high expression of MHC class II genes in the normal lung as compared to the tumor microenvironment. Despite earlier observations of loss of MHC class I gene mutations on chromosome 6 in SQ as a potential mechanism of immune suppression in lung SQ tumors,1 preliminary evaluation of chromosome 6 deletions in our dataset did not confirm an association of loss of MHC class II with lack of immune expression in AD or in SQ. Nonetheless, downregulation of MHC class II genes by other means may be an important mechanism in evading host response, and further investigation of MHC class II genes and the role they play in variable immune infiltration in the tumor microenvironment of AD and SQ subtypes is warranted.

This work is significant for presenting the differential immune response associated with intrinsic lung AD and SQ gene expression subtypes. Reproducible differences in immune response and associated survival implications likely reflect the underlying tumor biology of the gene expression subtypes, with potential to inform immunotherapy response and reduce selection bias in drug trials. This study is limited by the use of genomic analyses in publicly available gene expression datasets. The immune signatures used6 have been applied in multiple tumor types to characterize the tumor immune response;7 however, to our knowledge, they have not been specifically evaluated in various normal lung anatomic sites (proximal vs distal airway), where immune differences may exist and where future evaluations will be needed to further elucidate the lung tumor immune landscape. Publicly available gene expression datasets are disproportionately derived from early stage fresh frozen surgical resection samples with sufficient residual tissue for genomic study. Genomic advances have recently established the reliability of gene expression analysis in formalin-fixed paraffin-embedded (FFPE) tumor tissues, facilitating the use of archived tissue for such analyses, and preliminary data suggest similar immune landscape findings. Advanced tumors, however, particularly Stage IV tumors, are rarely well represented in these archives despite frequent diagnoses of such patients. We anticipate that this limitation will be overcome as genomic technology improvements permit reliable analyses from more limited samples and/or from circulating nucleic acids.

This study reflects an analysis at a single point in time prior to the clinical introduction of immunotherapy. Changes in the immune landscape following initiation of immunotherapy have been observed, at least in melanoma,34 and suggest new areas for exploration of the tumor immune microenvironment in serial longitudinal samples, which could not be evaluated here. In addition, acquired resistance involving alterations in IFN receptor signaling and/or antigen presentation, as noted recently in melanoma patients who develop resistance to checkpoint inhibitors,35 will likely reveal new insights into the immune landscape. Nonetheless, at a time when reliable immuno-oncology biomarkers are so urgently needed, we believe this work is significant for identifying and characterizing these immune differences by expression subtype, while acknowledging the need for additional studies in immunotherapy-treated patient cohorts as such data become available.

In conclusion, expanded use of molecular testing to better characterize lung tumors is likely and, we feel, desirable, as genomics drives improved therapeutics and more personalized oncology treatment plans. Gene expression-based tumor subtyping of lung AD and SQ provides valuable information regarding differential immune features of NSCLC tumors, with potential to inform immunotherapy drug development and treatment selection and to improve outcomes.

Acknowledgements

We thank the many patients who donated their tissues for medical research, including datasets in the public domain.

References:

1. Cancer Genome Atlas Research Network. Comprehensive genomic characterization of squamous cell lung cancers. Nature. 2012;489(7417):519-525.
2. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014;511(7511):543-550.
3. Wilkerson MD, Yin X, Hoadley KA, et al. Lung squamous cell carcinoma mRNA expression subtypes are reproducible, clinically important, and correspond to normal cell types. Clin Cancer Res. 2010;16(19):4864-4875.
4. Wilkerson MD, Yin X, Walter V, et al. Differential pathogenesis of lung adenocarcinoma subtypes involving sequence mutations, copy number, chromosomal instability, and methylation. PLoS One. 2012;7(5):e36530.
5. Hayes DN, Monti S, Parmigiani G, et al. Gene expression profiling reveals reproducible human lung adenocarcinoma subtypes in multiple independent patient cohorts. J Clin Oncol. 2006;24(31):5079-5090.
6. Bindea G, Mlecnik B, Tosolini M, et al. Spatiotemporal dynamics of intratumoral immune cells reveal the immune landscape in human cancer. Immunity. 2013;39(4):782-795.
7. Iglesia MD, Parker JS, Hoadley KA, Serody JS, Perou CM, Vincent BG. Genomic Analysis of Immune Cell Infiltrates Across 11 Tumor Types. J Natl Cancer Inst. 2016;108(11).
8. Rooney MS, Shukla SA, Wu CJ, Getz G, Hacohen N. Molecular and genetic properties of tumors associated with local immune cytolytic activity. Cell. 2015;160(1-2):48-61.
9. Bremnes RM, Busund LT, Kilvaer TL, et al. The Role of Tumor-Infiltrating Lymphocytes in Development, Progression, and Prognosis of Non-Small Cell Lung Cancer. J Thorac Oncol. 2016;11(6):789-800.
10. Borghaei H, Paz-Ares L, Horn L, et al. Nivolumab versus Docetaxel in Advanced Nonsquamous Non-Small-Cell Lung Cancer. N Engl J Med. 2015;373(17):1627-1639.
11. Brahmer J, Reckamp KL, Baas P, et al. Nivolumab versus Docetaxel in Advanced Squamous-Cell Non-Small-Cell Lung Cancer. N Engl J Med. 2015;373(2):123-135.
12. Garon EB, Rizvi NA, Hui R, et al. Pembrolizumab for the treatment of non-small-cell lung cancer. N Engl J Med. 2015;372(21):2018-2028.
13. Gettinger SN, Horn L, Gandhi L, et al. Overall Survival and Long-Term Safety of Nivolumab (Anti-Programmed Death 1 Antibody, BMS-936558, ONO-4538) in Patients With Previously Treated Advanced Non-Small-Cell Lung Cancer. J Clin Oncol. 2015;33(18):2004-2012.
14. Shukuya T, Carbone DP. Predictive Markers for the Efficacy of Anti-PD-1/PD-L1 Antibodies in Lung Cancer. J Thorac Oncol. 2016;11(7):976-988.
15. Rizvi NA, Hellmann MD, Snyder A, et al. Cancer immunology. Mutational landscape determines sensitivity to PD-1 blockade in non-small cell lung cancer. Science. 2015;348(6230):124-128.
16. Lee HJ, Lee JJ, Song IH, et al. Prognostic and predictive value of NanoString-based immune-related gene signatures in a neoadjuvant setting of triple-negative breast cancer: relationship to tumor-infiltrating lymphocytes. Breast Cancer Res Treat. 2015;151(3):619-627.
17. Director's Challenge Consortium for the Molecular Classification of Lung Adenocarcinoma, Shedden K, Taylor JM, et al. Gene expression-based survival prediction in lung adenocarcinoma: a multi-site, blinded validation study. Nat Med. 2008;14(8):822-827.
18. Tomida S, Takeuchi T, Shimada Y, et al. Relapse-related molecular signature in lung adenocarcinomas identifies patients with dismal prognosis. J Clin Oncol. 2009;27(17):2793-2799.
19. Broad Institute TCGA Genome Data Analysis Center (2015): Analysis-ready standardized TCGA data from Broad GDAC Firehose stddata__2015_02_04 run. 2015.
20. Raponi M, Zhang Y, Yu J, et al. Gene expression signatures for predicting prognosis of squamous cell and adenocarcinomas of the lung. Cancer Res. 2006;66(15):7466-7472.
21. Forero A, Li Y, Chen D, et al. Expression of the MHC Class II Pathway in Triple-Negative Breast Cancer Tumor Cells Is Associated with a Good Prognosis and Infiltrating Lymphocytes. Cancer Immunol Res. 2016;4(5):390-399.
22. Broad Institute TCGA Genome Data Analysis Center (2015): Analysis-ready standardized TCGA data from Broad GDAC Firehose stddata__2015_04_02 run. 2015.
23. Cancer Genome Atlas Research Network. Comprehensive molecular profiling of lung adenocarcinoma. Nature. 2014;511:543-550.
24. Cao C, Gao R, Zhang M, et al. Role of LKB1-CRTC1 on glycosylated COX-2 and response to COX-2 inhibition in lung cancer. J Natl Cancer Inst. 2015;107(1):358.
25. Koyama S, Akbay EA, Li YY, et al. STK11/LKB1 Deficiency Promotes Neutrophil Recruitment and Proinflammatory Cytokine Production to Suppress T-cell Activity in the Lung Tumor Microenvironment. Cancer Res. 2016;76(5):999-1008.
26. Schabath MB, Welsh EA, Fulp WJ, et al. Differential association of STK11 and TP53 with KRAS mutation-associated gene expression, proliferation and immune surveillance in lung adenocarcinoma. Oncogene. 2016;35(24):3209-3216.
27. Hast BE, Cloer EW, Goldfarb D, et al. Cancer-derived mutations in KEAP1 impair NRF2 degradation but not ubiquitination. Cancer Res. 2014;74(3):808-817.
28. Kayser G, Schulte-Uentrop L, Sienel W, et al. Stromal CD4/CD25 positive T-cells are a strong and independent prognostic factor in non-small cell lung cancer patients, especially with adenocarcinomas. Lung Cancer. 2012;76(3):445-451.
29. Liu H, Zhang T, Ye J, et al. Tumor-infiltrating lymphocytes predict response to chemotherapy in patients with advance non-small cell lung cancer. Cancer Immunol Immunother. 2012;61(10):1849-1856.
30. Schalper KA, Brown J, Carvajal-Hausdorf D, et al. Objective measurement and clinical significance of TILs in non-small cell lung cancer. J Natl Cancer Inst. 2015;107(3).
31. Yuan A, Hsiao YJ, Chen HY, et al. Opposite Effects of M1 and M2 Macrophage Subtypes on Lung Cancer Progression. Sci Rep. 2015;5:14273.
32. National Comprehensive Cancer Network. Clinical Practice Guideline in Oncology. Non-small Cell Lung Cancer. https://www.nccn.org/professionals/physician_gls/f_guidelines.asp. Accessed 15.9.16.
33. Remark R, Becker C, Gomez JE, et al. The non-small cell lung cancer immune contexture. A major determinant of tumor characteristics and patient outcome. Am J Respir Crit Care Med. 2015;191(4):377-390.
34. Chen PL, Roh W, Reuben A, et al. Analysis of Immune Signatures in Longitudinal Tumor Samples Yields Insight into Biomarkers of Response and Mechanisms of Resistance to Immune Checkpoint Blockade. Cancer Discov. 2016;6(8):827-837.
35. Zaretsky JM, Garcia-Diaz A, Shin DS, et al. Mutations Associated with Acquired Resistance to PD-1 Blockade in Melanoma. N Engl J Med. 2016;375(9):819-829.

Figure 1. Heatmaps of Bindea et al.6 immune cell signature expression, other immune signatures, and individual immune markers in lung adenocarcinoma and squamous cell carcinoma gene expression datasets.1,2 TRU = Terminal Respiratory Unit, PP = Proximal Proliferative, PI = Proximal Inflammatory, MHC II = Major Histocompatibility Class II gene signature.

Figure 2. Association strength (adjusted R-squared) between CD274 (PD-L1) expression and immune signature, versus strength between subtype and immune signature, for 13 Adaptive Immune Cell expression (AIC) signatures in adenocarcinoma and squamous cell carcinoma datasets. Association between subtype and AIC was greater for some AICs in adenocarcinoma and for all AICs tested in squamous cell carcinoma. Tcm = central memory T cells, Th = T helper cells, Th1 = Type 1 T helper cells, Th2 = Type 2 T helper cells, TFH = T follicular helper cells, Th17 = T helper 17 cells, Treg = T regulatory cells, Tgd = gamma delta T cells.
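As a rough illustration of the comparison summarized in Figure 2, the sketch below computes adjusted R-squared for two single-predictor models of one immune signature: one with PD-L1 expression as a continuous predictor and one with subtype as a categorical predictor. The data, variable names, and helper functions are hypothetical and are not taken from the study, and the exact model specification used by the authors is an assumption here.

```python
from statistics import mean


def adjusted_r2(r2, n, n_params):
    """Adjusted R-squared for a model with n_params predictors (intercept excluded)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_params - 1)


def r2_continuous(x, y):
    """R-squared of a simple linear regression of y on one continuous predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def r2_categorical(groups, y):
    """R-squared of y regressed on group indicators: between-group SS / total SS."""
    my = mean(y)
    total = sum((b - my) ** 2 for b in y)
    by_group = {}
    for g, b in zip(groups, y):
        by_group.setdefault(g, []).append(b)
    between = sum(len(v) * (mean(v) - my) ** 2 for v in by_group.values())
    return between / total


# Hypothetical toy data: normalized T cell signature, PD-L1 expression, subtype label.
subtype = ["TRU", "TRU", "TRU", "PP", "PP", "PP", "PI", "PI", "PI"]
pdl1 = [0.2, -0.1, 0.4, -0.3, 0.1, -0.5, 0.6, 0.3, 0.9]
tcell = [0.1, 0.0, 0.3, -1.0, -0.8, -1.2, 1.1, 0.9, 1.3]

n = len(tcell)
k = len(set(subtype))  # a k-level factor uses k - 1 indicator predictors
adj_pdl1 = adjusted_r2(r2_continuous(pdl1, tcell), n, 1)
adj_subtype = adjusted_r2(r2_categorical(subtype, tcell), n, k - 1)
print(f"PD-L1: {adj_pdl1:.3f}  subtype: {adj_subtype:.3f}")
```

Adjusted R-squared is used rather than raw R-squared because the subtype model spends more degrees of freedom than the PD-L1 model; the adjustment keeps the two strengths comparable.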

Figure 3. Reproducibility of T cell signature gene expression subtype patterns across multiple Adenocarcinoma (AD) datasets2,4,17,18 and Squamous Cell Carcinoma (SQ) datasets.1,3,16,20 TRU = Terminal Respiratory Unit, PP = Proximal Proliferative, PI = Proximal Inflammatory. Platforms: RNAseq (Illumina, San Diego, CA) and microarrays from both Affymetrix (Santa Clara, CA) and Agilent (Santa Clara, CA).

Figure 4. Adenocarcinoma (AD) and Squamous cell carcinoma (SQ) subtype non-silent mutation burden (A,B), STK11 inactivation (mutation and/or deletion) in AD (C), NFE2L2 expression in SQ (D), and MHC class II signature (E,F), with association test p-values. TRU = Terminal Respiratory Unit, PP = Proximal Proliferative, PI = Proximal Inflammatory, MHC II = Major Histocompatibility Class II gene signature.

Figure 5. Subtype-specific immune marker hazard ratios and 95% confidence intervals for 5-year overall survival in stage I-III AD (A) and stage I-III SQ (B). Hazard ratios (HR) correspond to a unit increase in the normalized immune marker and were adjusted for pathological stage using Cox models. Only markers that were significant (nominal p-value<0.05) for at least one subtype are shown. TRU = Terminal Respiratory Unit, PP = Proximal Proliferative, PI = Proximal Inflammatory, MHC II = Major Histocompatibility Class II gene signature, Th1 = Type 1 T helper cells, Th2 = Type 2 T helper cells, TFH = T follicular helper cells, Th17 = T helper 17 cells, Treg = T regulatory cells, DC = Dendritic cells, iDC = Immature Dendritic Cells.
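To make the hazard ratio convention in Figure 5 concrete, the following sketch converts a Cox model log-hazard coefficient and its standard error into a hazard ratio per unit increase with a Wald 95% confidence interval. The coefficient and standard error are hypothetical inputs chosen for illustration, not study results, and fitting the Cox model itself is outside the scope of the sketch.

```python
import math


def hazard_ratio_ci(beta, se, z=1.96):
    """Return (HR, lower, upper) for a log-hazard coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def interval_excludes_one(lower, upper):
    """True when the confidence interval does not contain HR = 1 (no effect)."""
    return upper < 1.0 or lower > 1.0


# Hypothetical coefficient for a one-unit increase in a normalized immune marker,
# taken from a model that also includes pathological stage as a covariate.
beta, se = -0.34, 0.15
hr, lower, upper = hazard_ratio_ci(beta, se)
print(f"HR {hr:.2f} (95% CI {lower:.2f}, {upper:.2f})")
print("excludes 1:", interval_excludes_one(lower, upper))
```

Under this convention an HR below 1 means that higher marker expression is associated with lower hazard of death, and a 95% interval that excludes 1 corresponds to a nominal two-sided Wald p-value below 0.05.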

Supplemental Figures and Tables

Figure S1. Lung Adenocarcinoma (AD) and Squamous Cell Carcinoma (SQ) gene expression box plots by subtype for each of the gene signatures/individual genes examined and associated Kruskal-Wallis test p-values.
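The Kruskal-Wallis test named above compares a continuous marker across subtypes using ranks. The sketch below is a minimal standard-library version restricted to three groups (as in the three AD subtypes), where the chi-square reference distribution with 2 degrees of freedom has the closed-form tail probability exp(-H/2). The function name and the toy values are hypothetical; a statistics package would normally be used in practice.

```python
import math
from collections import Counter


def kruskal_wallis_three_groups(groups):
    """Tie-corrected Kruskal-Wallis H and chi-square p-value for exactly 3 groups."""
    if len(groups) != 3:
        raise ValueError("the closed-form p-value assumes exactly three groups")
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    counts = Counter(pooled)
    # Mid-rank for each distinct value: first position plus half the tie run.
    first_pos = {}
    for i, v in enumerate(pooled):
        first_pos.setdefault(v, i + 1)
    rank = {v: first_pos[v] + (counts[v] - 1) / 2 for v in counts}
    rank_term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    tie_correction = 1.0 - sum(c ** 3 - c for c in counts.values()) / (n ** 3 - n)
    h /= tie_correction
    # Chi-square survival function with 2 degrees of freedom is exp(-h / 2).
    return h, math.exp(-h / 2.0)


# Hypothetical normalized T cell signature values for three subtypes.
tru = [0.9, 1.2, 0.4, 1.5, 0.7]
pp = [-1.1, -0.6, -1.4, -0.2, -0.9]
pi = [1.8, 2.3, 1.6, 2.9, 2.0]
h_stat, p_value = kruskal_wallis_three_groups([tru, pp, pi])
print(f"H = {h_stat:.2f}, p = {p_value:.4g}")
```

The four SQ subtypes would need the chi-square distribution with 3 degrees of freedom, which has no equally simple closed form and is left out of this sketch.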

Figure S2. Correlation of immune cell signatures and other immune markers in Lung Adenocarcinoma (AD) and Squamous Cell Carcinoma (SQ) in The Cancer Genome Atlas datasets (n=515 AD and n=501 SQ). Signatures are arranged by hierarchical clustering.

Figure S3. Reproducibility of Bcell, Dendritic cell (DC), macrophage, PDCD1 (PD-1), and CD274 (PD-L1) gene expression subtype patterns across multiple Adenocarcinoma (AD)2,4,17,18 and Squamous Cell Carcinoma (SQ) datasets.1,3,16,20 TRU = Terminal Respiratory Unit, PP = Proximal Proliferative, PI = Proximal Inflammatory. Platforms: RNAseq (Illumina, San Diego, CA) and microarrays from both Affymetrix (Santa Clara, CA) and Agilent (Santa Clara, CA).

Figure S4. Box plots or scatterplots of Tcells versus STK11 inactivation (A), TP53 mutation (B), EGFR mutation (C), and non-silent mutation burden (D) in AD, as well as RB1 mutation (E), NF1 expression (F), NFE2L2 expression (G), and non-silent mutation burden (H) in SQ. Tcells, NF1, and NFE2L2 were normalized to have mean 0 and variance 1, and mutation burden was on the log scale. Association p-values shown from linear regression models include p, when the marker was the sole predictor of Tcells, and p’, when the model was adjusted for subtype.
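The distinction between p (marker alone) and p’ (marker adjusted for subtype) can be illustrated with regression slopes. The sketch below contrasts an unadjusted slope with a subtype-adjusted slope, using the fact that in a linear model with subtype indicators the marker coefficient equals the slope obtained after centering both variables within each subtype. All names and values are hypothetical, and p-values are deliberately omitted to keep the example within the standard library.

```python
from statistics import mean


def slope(x, y):
    """Least-squares slope of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def center_within_groups(values, groups):
    """Subtract each group's mean from its members."""
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    means = {g: mean(v) for g, v in by_group.items()}
    return [v - means[g] for v, g in zip(values, groups)]


def subtype_adjusted_slope(x, y, groups):
    """Slope of y on x in a model that also includes group indicator terms."""
    return slope(center_within_groups(x, groups), center_within_groups(y, groups))


# Hypothetical data: the marker and T cell signature both differ between two
# subtypes, but are unrelated to each other within a subtype.
subtype = ["TRU"] * 4 + ["PP"] * 4
marker = [0.0, 1.0, 0.0, 1.0, 4.0, 5.0, 4.0, 5.0]
tcells = [1.0, 1.0, 2.0, 2.0, 5.0, 5.0, 6.0, 6.0]

unadjusted = slope(marker, tcells)
adjusted = subtype_adjusted_slope(marker, tcells, subtype)
print(f"unadjusted slope: {unadjusted:.3f}, subtype-adjusted slope: {adjusted:.3f}")
```

In this toy case the apparent association comes entirely from differences between subtypes, which is the pattern one would expect when a marker association is significant alone (p) but not after adjustment (p’).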

Figure S5. Scatterplots of MHCII versus Tcells, Bcells, and DC in tumor and in tumor-adjacent normal tissue separately in TCGA lung Adenocarcinoma (AD) and Squamous Cell Carcinoma (SQ) datasets (AD: n=515 tumor and n=58 normal; SQ: n=501 tumor and n=51 normal).

Table S1. Characteristics of Lung Adenocarcinoma (AD)2,4,17,18 and Squamous Cell Carcinoma (SQ)1,3,16,20 datasets used.

Table S2. Genes included in immune gene signatures evaluated (Bindea et al.,6 Interferon, MHC class II, and individual genes), missing signature genes by specific dataset, and modifications to Bindea et al.6 signatures.

Table S3. Subtype-marker and subtype-alteration association test p-values overall and between each pair of subtypes. The Kruskal-Wallis and Mann-Whitney tests were used for continuous markers and genomic alterations. Fisher’s exact test was used for categorical genomic alterations.
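For a pairwise comparison of a categorical alteration between two subtypes, Fisher's exact test operates on a 2x2 table of altered versus unaltered counts. The sketch below is a minimal standard-library implementation of the two-sided test, summing the probabilities of all tables no more likely than the observed one. The function name and the counts are hypothetical and are not taken from the study.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x altered samples in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so tables tied with the observed one are counted.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts: altered / unaltered samples in two subtypes.
p_value = fisher_exact_2x2(30, 70, 10, 90)
print(f"two-sided p = {p_value:.3g}")
```

The overall three- or four-subtype comparison requires the generalization of the test to larger tables, which this sketch does not cover.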

Table S4. Spearman rank correlation and association test p-value between each pair of immune markers, separately in The Cancer Genome Atlas (TCGA) lung Adenocarcinoma (AD) and Squamous Cell Carcinoma (SQ) cohorts (n=515 AD and n=501 SQ).
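Spearman rank correlation, as used in Table S4, is the Pearson correlation of the ranks of two variables, so it captures monotone rather than strictly linear association. A minimal standard-library sketch follows; the function names and the toy marker values are hypothetical, and the association test p-value is not computed here.

```python
from statistics import mean


def average_ranks(values):
    """Ranks starting at 1, with tied values given the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical expression values for two immune markers in six samples.
marker_a = [0.1, 0.4, 0.35, 0.8, 1.2, 2.5]
marker_b = [1.0, 1.9, 1.5, 3.0, 2.8, 9.0]
rho = spearman_rho(marker_a, marker_b)
print(f"Spearman rho = {rho:.3f}")
```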

Table S5. Survival Hazard Ratios (HR) and Confidence Intervals (CI) for all signatures and individual genes analyzed in The Cancer Genome Atlas (TCGA) lung Adenocarcinoma (AD) and Squamous Cell Carcinoma (SQ) cohorts (n=515 AD and n=501 SQ).

[Figure 1 image. Panels: Adenocarcinoma (n=515) and Squamous Cell Carcinoma (n=501), samples grouped by subtype; color scale from -1.5 to 1.5. Rows: B cells, T cells, T helper cells, Tcm, Tem, Th1 cells, Th2 cells, TFH, Th17 cells, TReg, CD8 T cells, Tgd, Cytotoxic cells, NK cells, NK CD56dim cells, NK CD56bright cells, DC, iDC, aDC, pDC, Eosinophils, Macrophages, Mast cells, Neutrophils, IFN, PDL1, PDL2, PDCD1, CTLA4, MHCII.]

[Figure 2 image. Panels: TCGA AD n=515 and TCGA SQ n=501. Axes: Subtype Association Strength and PDL1 Association Strength. Signatures plotted: B cells, T cells, T helper cells, Tcm, Tem, Th1 cells, Th2 cells, TFH, Th17 cells, TReg, CD8 T cells, Tgd, Cytotoxic cells.]

[Figure 3 image. Axes: T cells by Subtype. Adenocarcinoma panels (TRU, PP, PI): TCGA RNAseq n=515, Kruskal-Wallis p=2.7e-16; Shedden Affy n=442, Kruskal-Wallis p=2.1e-16; Tomida Agilent n=117, Kruskal-Wallis p=1.4e-06; UNC Agilent n=116, Kruskal-Wallis p=6.7e-08. Squamous Cell Carcinoma panels (basal, classical, primitive, secretory): TCGA RNAseq n=501, Kruskal-Wallis p=1.1e-27; Raponi Affy n=129, Kruskal-Wallis p=5.6e-06; Lee Affy n=75, Kruskal-Wallis p=3.1e-06; UNC Agilent n=56, Kruskal-Wallis p=0.00022.]

[Figure 4 image. Panels: (A) AD n=230, non-silent mutation burden per Mb (log scale), Kruskal-Wallis p=6.6e-06; (B) SQ n=178, non-silent mutation burden per Mb (log scale), Kruskal-Wallis p=0.54; (C) AD n=515, Proportion STK11 Inactive (28/188, 67/154, and 24/173 across subtypes), Fisher's Exact p=3.5e-11; (D) SQ n=501, NFE2L2 expression, Kruskal-Wallis p=1e-49; (E) AD n=515, MHC class II, Kruskal-Wallis p=2.7e-45; (F) SQ n=501, MHC class II, Kruskal-Wallis p=3.1e-39.]

[Figure 5 image. Forest plots of HR, 95% CI (axis 0.5 to 2.0): (A) Adenocarcinoma by TRU, PP, and PI subtype; (B) Squamous Cell Carcinoma by basal, classical, primitive, and secretory subtype. Markers labeled: Th1 cells, Th2 cells, Th17 cells, TFH, TReg, DC, iDC, Eosinophils, Macrophages, Mast cells, PDL1, CTLA4, MHCII.]