Original Study
Estimation of the Survival of Patients With Lung Squamous Cell Carcinoma Using Genomic Copy Number Aberrations
Yan Cao,1 Yu Liu,2 Xin Yang,3,4 XiangYang Liu,5 Naijun Han,2 Kaitai Zhang,2 Dongmei Lin4
Abstract
Because sole reliance on morphology and the tumor, node, metastases stage is inadequate for estimating the survival of patients with lung squamous cell carcinoma, we investigated the predictive accuracy of DNA copy number aberrations (CNAs) in these patients. A number of CNA regions were found to be significantly associated with survival, which indicates that assessment of CNAs could add value for estimating patients’ prognosis.
Background: Estimation of the survival of patients with lung squamous cell carcinoma (SCC) on the basis of histopathology is inadequate. The aim of this study was to identify genomic regions with potential value for estimating the prognosis of these patients.
Patients and Methods: Depending on their survival time, 100 patients with primary lung SCC were separated into high- or low-risk prognostic groups, and their copy number aberrations (CNAs) were analyzed using array-comparative genomic hybridization (array-CGH).
Results: We identified 123 CNA regions that were significantly associated with survival. Among these regions, some have been reported previously (eg, amplifications of 8p12 and 3q27.1, and loss of 9p21.3 and 13q34) but others have never been reported. For example, gains of 3q27.1, 5p13.2, and 5p13.3 were found to be associated with a favorable prognosis, but patients harboring gains of 11q23.3, 11q13.1, and 14q32.3, and deletions of 3p21.3 and 9p21.3 tended to have poor survival. Among the 123 CNA regions, 41 were further selected to construct a survival estimation model that could effectively separate SCC patients into high- or low-risk groups with an accuracy of 92%, sensitivity of 90%, and specificity of 94%.
The results of the array-CGH were further validated in an independent cohort of 45 formalin-fixed, paraffin-embedded specimens using real-time polymerase chain reaction.
Conclusion: A number of CNA regions were found to be associated with the survival of SCC patients, and we were able to construct a model to estimate prognosis on the basis of these regions. Assessment of these CNAs could potentially assist in clinical decision-making regarding adjuvant therapy after surgery.
Clinical Lung Cancer, Vol. 17, No. 1, 68-74 © 2016 Elsevier Inc. All rights reserved.
Keywords: Copy number aberrations, Prognosis, Squamous cell carcinomas of the lung
Yan Cao and Yu Liu contributed equally to this work.
1 Department of Pathology, Plastic Surgery Hospital, Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, P.R. China
2 State Key Laboratory of Molecular Oncology, Department of Etiology and Carcinogenesis, Cancer Institute (Hospital), Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, P.R. China
3 Department of Pathology, Cancer Institute (Hospital), Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, P.R. China
4 Key laboratory of Carcinogenesis and Translational Research (Ministry of Education), Department of Pathology, Peking University Cancer Hospital and Institute, Beijing, P.R. China
5 Department of Thoracic Surgical Oncology, Cancer Institute (Hospital), Peking Union Medical College and Chinese Academy of Medical Sciences, Beijing, P.R. China
Submitted: Jun 1, 2015; Revised: Aug 4, 2015; Accepted: Aug 11, 2015; Epub: Aug 22, 2015
Address for correspondence: Dongmei Lin, MD, Key laboratory of Carcinogenesis and Translational Research (Ministry of Education), Department of Pathology, Peking University Cancer Hospital and Institute, Beijing 100142, P.R. China Fax: 86-01088196667; e-mail contact: [email protected]
1525-7304/$ - see frontmatter © 2016 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.cllc.2015.08.005
Clinical Lung Cancer January 2016

Introduction
Lung cancer is one of the most common causes of cancer death worldwide.1 Among the various subtypes of lung cancer, squamous cell carcinoma (SCC) constitutes approximately 40% of the total number of cases (on the basis of the World Health Organization Classification of Tumors).2 Currently, histopathological diagnosis and staging are the main standards used to estimate the prognosis of patients with lung SCC in the clinic. However, the heterogeneity of
the tumors makes prediction of the outcome solely on the basis of morphology and the tumor, node, metastases (TNM) stage inadequate, and the use of molecular biomarkers could add value for this purpose. Because DNA copy number aberrations (CNAs) are a primary feature of cancer cells at a molecular level, we set out to validate the predictive accuracy of this feature of lung cancer.3 Investigation of recurrent regions of gain or loss is an effective way to locate the target genes, and this could help to disclose the mechanisms of cancer initiation and progression. Furthermore, the investigation of DNA CNAs could improve the diagnosis, prognosis, and treatment of cancer. The aim of this study was to establish a prognostic genomic profile in patients with lung SCC. Chromosomal instability is one of the hallmarks of tumors, which can result in the gain or loss of specific genomic regions or even entire chromosomes. These CNAs play an important role in carcinogenesis and malignant progression through CNA-induced gene expression alterations and, subsequently, key cancer-specific processes.4 Many tumors possess characteristic CNAs, and a better understanding of these CNAs might lead to an improved prognosis and the ability to personalize treatment. Studies of lung cancer have previously identified a number of CNAs, and some of these have been associated with tumor generation, metastasis, and recurrence, among other things.4-7 However, to our knowledge, only a few studies on direct correlation of these CNAs with the survival of patients with lung SCC have been performed.4,5 To identify potentially prognostic CNA markers that could assist in estimating the clinical prognosis, we investigated CNAs in 100 formalin-fixed, paraffin-embedded (FFPE) specimens of primary SCCs using array-comparative genomic hybridization (CGH). The results obtained were further validated using quantitative real-time polymerase chain reaction (PCR) in an independent cohort of patients with lung SCC.
Patients and Methods
Patients
Among a total of 525 patients with lung SCC who underwent radical surgery at our institution but who had not received previous adjuvant systemic therapy, adequate follow-up records were available for 420 patients. Of these, 105 patients died of lung cancer within 2 years after surgery, 187 died of lung cancer between 2 years and 5 years, and 128 survived for more than 5 years or had a disease-free survival time longer than 4 years after surgery. The diagnosis of each tumor specimen, including the original histopathological diagnosis, differentiation, lymph node metastasis, and stage, was determined and confirmed by senior pathologists according to the World Health Organization Classification of Tumors2 and the 7th edition of the Cancer Staging Manual of the American Joint Committee on Cancer.8 Patients were further selected for the study using the following criteria: (1) survival > 3 months after surgery; (2) > 8 lymph nodes resected at surgery in each patient, to avoid a false-negative diagnosis regarding the lymph node status; (3) a high percentage of tumor cells (> 70%) in the surgically resected tumor; and (4) genomic DNA (gDNA) extracted from FFPE specimens that met the quality requirements for array-CGH hybridization. The patients were then further divided according to
their survival times into a low-risk (LR) group (ie, those who survived longer than 5 years or had a disease-free survival time longer than 4 years after surgery), and a high-risk (HR) group (ie, those who died of lung cancer within 2 years after surgery). A total of 145 patients met the criteria for the study, and their median survival time was 42 months (range, 3.5-99 months). FFPE tissue specimens of the primary lung tumors from these patients were collected from the Department of Pathology at our institution. One hundred patients (50 rated as LR and 50 as HR) were selected for array-CGH analysis, and the remaining 45 (23 rated as LR and 22 as HR) were used as an independent validation cohort. The demographic characteristics of the patients included in the study are shown in Supplemental Table 1 in the online version. This study was approved by the institutional ethics committees of the Cancer Institute and Hospital, Chinese Academy of Medical Sciences, which waived the need for informed consent because of the observational nature of the study.
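The selection criteria and risk-group definitions above can be sketched as a small filter. This is only an illustrative sketch: the `Patient` fields, the `dna_passes_qc` flag, and the treatment of exactly 24 months as "within 2 years" are assumptions made for the example, not details reported in the study.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    # All field names are hypothetical; the study does not publish a data schema.
    survival_months: float       # overall survival after surgery
    disease_free_months: float   # disease-free survival after surgery
    died_of_lung_cancer: bool
    lymph_nodes_resected: int
    tumor_cell_fraction: float   # fraction of tumor cells in the specimen, 0..1
    dna_passes_qc: bool          # assumed flag: gDNA suitable for array-CGH


def eligible(p: Patient) -> bool:
    """Apply inclusion criteria (1) to (4) from the Patients section."""
    return (
        p.survival_months > 3
        and p.lymph_nodes_resected > 8
        and p.tumor_cell_fraction > 0.70
        and p.dna_passes_qc
    )


def risk_group(p: Patient) -> Optional[str]:
    """Return 'HR', 'LR', or None for patients outside both groups."""
    if not eligible(p):
        return None
    # High risk: died of lung cancer within 2 years (boundary handling assumed).
    if p.died_of_lung_cancer and p.survival_months <= 24:
        return "HR"
    # Low risk: survived > 5 years or disease-free > 4 years.
    if p.survival_months > 60 or p.disease_free_months > 48:
        return "LR"
    return None  # intermediate survivors are not assigned to either group
```

Patients who died between 2 and 5 years fall into neither group under these rules, which is consistent with 145 of the 420 followed patients entering the study.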
Genomic DNA Preparation
Tumor cells were first scraped using a scalpel and then collected under the direction of pathologists from ten 10-μm thick FFPE slides for each histological lesion. gDNA was extracted from the tumor cells using a DNeasy Blood & Tissue Kit (Qiagen Inc, Hilden, Germany). DNA concentrations were measured using a NanoDrop ND-1000 Spectrophotometer (NanoDrop Technologies, Wilmington, DE).
Array-CGH Analysis
Array-CGH analysis was performed using the Agilent Human Genome Microarray Kit 4×44K, with an overall median probe spacing of 43 kb (Agilent Technologies), and comparative genomic hybridization was performed according to the standard Agilent protocol (Protocol v5.0, June 2007; available at: http://www.agilent.com). Briefly, the tumor and reference gDNA (Promega Inc, Madison, WI) were labeled with Cy5 and Cy3, respectively, using the Agilent Genomic DNA USL Labeling Kit (Agilent Technologies) for FFPE samples. A total of 500 ng of gDNA samples and human Cot-I DNA was dissolved in the hybridization mixture supplied within the Agilent Oligo Array-CGH Hybridization Kit (Agilent Technologies). After denaturation at 95°C for 3 minutes and incubation at 37°C for 30 minutes, the mixtures were slowly dispensed onto the gasket. A microarray slide was then placed onto the gasket slide. The samples were hybridized in a hybridization oven at 65°C and 20 rpm for 40 hours. Finally, the slides were scanned with the Agilent Scanner System, and Feature Extraction 12.0 (Agilent Technologies) was used for data extraction.
Real-Time PCR
Quantitative real-time PCR was carried out on the Stratagene Mx3005P QPCR System (Agilent Technologies). The primers used in this study are shown in Supplemental Table 2 in the online version. The reactions were performed using the TAKARA SYBR Premix EX Taq (TaKaRa Inc, Tokyo, Japan). Thermal cycling consisted of an initial denaturing step at 95°C for 10 seconds, followed by 45 cycles at 95°C for 15 seconds and 60°C for 30 seconds. All samples were analyzed in triplicate, and the hemoglobin beta gene (HBB) was used as the input reference gene.
Estimated Survival of Lung SCC Using CNAs

Statistical Analysis
Data from the array-CGH were analyzed using the DNAcopy (version 1.24.0) package in R9 (version 2.12.0). After normalization, a moving window-based method was used to replace outlier values with the mean of the window. The data were then smoothed and a circular binary segmentation algorithm10 was applied to perform segmentation. For this study, a log2 ratio value > 0.848 (relative copy number of 3.6) was recognized as a gain, and a value less than −0.737 (relative copy number of 1.2) was considered a loss, as described previously.11 Copy number variation regions from the HapMap collection12 were excluded from further analysis to avoid a potential false-positive CNA call because of the reference gDNA used. Survival analysis was carried out using the survival (version 2.35-8) package in R, and the P value was adjusted according to the false discovery rate. Common copy number alteration fragments associated with the survival of patients with lung SCC were screened out. Cluster analysis was then applied to these fragments, and a prediction model was constructed using a generalized linear model applied to representative fragments from each cluster. Gene set enrichment analysis (GSEA) was also applied to genes located within these fragments.

Accession Numbers
Data from the array-CGH analysis in this study are available at Gene Expression Omnibus under the accession number GSE43131.

Results
Gross View of CNAs in Lung SCC
As a first step, regions that fell within common variation regions for normal populations were excluded from further analysis by filtering with CNA data sets from the HapMap collection, but no probe was filtered out in this step. CNA regions were first detected across all of the 100 SCC specimens contained in the array-CGH data set, as described above, and frequencies were then calculated. Gains of 3q27.1 (37 of 100), 5p13 (13 of 100), and 5p13.33 (10 of 100), and deletions of 9p21.3 (19 of 100), 4q (19 of 100), 10p15.1 (13 of 100), and 3p (12 of 100) were found to be the most frequent events. In Figure 1A, track A (in red) represents the frequencies for copy number gains, and track B (in green) represents the frequencies for copy number losses. Among these variations, some well characterized CNA regions in non–small-cell lung cancer (NSCLC) were also found in this study, such as a gain of chromosome 3q and deletions of chromosomes 3p and 4q, which were also among the most frequent events in the 100 SCC specimens studied.
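The gain and loss calls behind these frequencies follow from the log2 ratio thresholds given under Statistical Analysis. The sketch below assumes a diploid reference, so that relative copy number is 2 × 2^(log2 ratio); under that assumption the stated copy numbers of 3.6 and 1.2 correspond to log2 ratios of 0.848 and −0.737 (the loss threshold is necessarily negative). The function names are illustrative and are not taken from the DNAcopy package.

```python
GAIN_LOG2 = 0.848    # log2 ratio above which a segment is called a gain
LOSS_LOG2 = -0.737   # log2 ratio below which a segment is called a loss


def relative_copy_number(log2_ratio: float, reference_ploidy: int = 2) -> float:
    """Convert a tumor/reference log2 ratio to a relative copy number."""
    return reference_ploidy * 2 ** log2_ratio


def call_cna(log2_ratio: float) -> str:
    """Classify one segment value as 'gain', 'loss', or 'neutral'."""
    if log2_ratio > GAIN_LOG2:
        return "gain"
    if log2_ratio < LOSS_LOG2:
        return "loss"
    return "neutral"


def event_frequency(log2_ratios: list[float], event: str) -> float:
    """Fraction of specimens carrying the given event in one region."""
    calls = [call_cna(x) for x in log2_ratios]
    return calls.count(event) / len(calls)


# The two thresholds reproduce the copy numbers quoted in the text.
print(round(relative_copy_number(GAIN_LOG2), 1))  # 3.6
print(round(relative_copy_number(LOSS_LOG2), 1))  # 1.2
```

A frequency such as "37 of 100" for a gain of 3q27.1 would correspond to `event_frequency(values, "gain")` returning 0.37 over the 100 per-specimen segment values for that region.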
Copy Number Aberrations Associated With the Survival of Patients With Lung SCC
The log rank test was applied to detect CNA regions significantly associated with survival. A total of 123 fragments were found and are listed in Supplemental Table 3 in the online version. The frequencies of CNAs across the whole genome for the HR and LR SCC groups are shown separately in Figure 1A (tracks C and D show the copy number gains and losses, respectively, in 50 HR SCC patients, and tracks E and F show the copy number gains and losses in 50 LR SCC patients). Some CNA regions were more frequently aberrant in one group relative to the other. For example, we found that SCC patients in the LR group more frequently had copy number gains in chromosome 5p13.2-13.3 and 5p15.33 and copy number losses in chromosome 4q, 9p21.3, and 10p15.1, whereas patients in the HR group tended to harbor copy number gains in chromosome 11q13.1 and 14q32.33 and a copy number loss in chromosome 3p24.3. A comparison of the major CNA regions between the 2 groups is shown in Table 1. Among these CNAs, some have previously been reported to be associated with survival in NSCLC, such as amplifications of chromosome 8q24, 3q27.1, and 8p12, and deletions of 9p21.3 and 13q34; in contrast, some CNAs have not been reported before, such as gains of chromosome 3q27.1 and 5p13.3 and deletions of 3p21.3 and 9p21.3. Genes located in these CNA regions might contribute to either a good or bad prognosis. GSEA was applied to investigate whether specific pathways were involved in progression. As shown in Figure 1B, genes in the CNA regions were significantly enriched in a series of gene categories, encompassing several hallmarks of cancer, including genome instability, cell migration, angiogenesis, and regulation of cell growth.

Figure 1 (A) Circos Plot for the Gross View of Copy Number Aberration Frequencies Across the Whole Genome in the Entire Array-Comparative Genomic Hybridization Cohort (Tracks A and B), the High-Risk Cohort (Tracks C and D), and the Low-Risk Cohort (Tracks E and F). Tracks in Red Represent the Frequencies of Copy Number Gains, and Tracks in Green Represent the Copy Number Losses. (B) Histogram for P Values From the Gene Set Enrichment Analysis, With Gene Categories on the X-Axis and −Log10 (P Value) on the Y-Axis. The Horizontal Red Line in the Plot Represents P = .05
Abbreviations: BP = Biological Process; MF = Molecular Function; CC = Cellular Component.
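The screening step in this section (a log rank test per CNA region, followed by false discovery rate adjustment) can be illustrated with a dependency-free sketch. The study used the R survival package; the code below is not that implementation. It assumes a two-group comparison (for example, carriers versus non-carriers of one CNA) and assumes the Benjamini-Hochberg procedure for the adjustment, since the text does not name the specific method.

```python
import math


def logrank_p(times_a, events_a, times_b, events_b):
    """Two-group log rank test P value; events are 1 (death) or 0 (censored)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        # Numbers at risk in each group just before time t.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Deaths at time t in group A and overall.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d = sum(1 for ti, e, _ in data if ti == t and e)
        n = n_a + n_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi_square = observed_minus_expected ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi_square / 2))


def benjamini_hochberg(p_values):
    """Return FDR-adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):          # walk from largest to smallest P
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, one P value would be computed per candidate region and the whole list passed to `benjamini_hochberg` before applying a significance cutoff.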
Construction of a Prediction Model With CNAs to Evaluate Survival of Lung SCC Patients
To construct a prediction model, cluster analysis was first applied and the 123 fragments that were associated with survival of SCC were separated into 41 groups. We chose the one representative fragment that was most significantly associated with survival from among the fragments in each cluster, and used a generalized linear model to construct the prediction model. Information on the 41 fragments used is also listed in Supplemental Table 3 in the online version. As shown in the multidimensional scaling plot in Figure 2A, patients with lung SCC with a high or low survival risk could be successfully separated by the model. The receiver operating characteristic curve in Figure 2B shows that the overall accuracy of our model in this training set was 92.0%, and the sensitivity and specificity were 90.0% and 94.0%, respectively. The area under the curve was 0.945 for the model.
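Two steps in this section lend themselves to a short worked example: picking the most significant fragment per cluster, and checking that the reported accuracy, sensitivity, and specificity are mutually consistent. The sketch below assumes the HR group is the positive class; the counts (45 of 50 HR and 47 of 50 LR patients classified correctly) are not reported in the text but are the ones implied by the 50/50 training set and the stated percentages. Fragment names in the usage line are hypothetical.

```python
def representatives(fragments):
    """Pick the fragment with the smallest P value in each cluster.

    fragments: iterable of (fragment_name, cluster_id, p_value) tuples.
    Returns a dict mapping cluster_id to the chosen fragment_name.
    """
    best = {}
    for name, cluster, p_value in fragments:
        if cluster not in best or p_value < best[cluster][1]:
            best[cluster] = (name, p_value)
    return {cluster: name for cluster, (name, _) in best.items()}


def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity, and accuracy from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical fragments: two clusters, one representative each.
chosen = representatives([("frag_a", 1, 0.02), ("frag_b", 1, 0.001), ("frag_c", 2, 0.01)])

# Implied counts for 50 HR (positive) and 50 LR (negative) training patients.
metrics = classification_metrics(tp=45, fn=5, tn=47, fp=3)
```

With these counts, sensitivity is 45/50 = 90%, specificity is 47/50 = 94%, and accuracy is 92/100 = 92%, matching the figures reported for the training set.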
Table 1 Comparison of CNA Regions Between Lung SCC Patients in the HR and LR Groups
Group | Copy Number Gain | Copy Number Loss
LR | 3q27.1, 5p15.33, 5p13.3, 5p13.2, 7q22.1, 9p13.3, 13q34 | 4q35.1, 4q32.3, 4q21.23, 8p12, 9p21.3, 10p15.1, 14q22.3
HR | 1q21.2, 2p23.3, 3q27.1, 4q35.1, 4q32.3, 4q21.23, 8q24, 8p12, 9p13.3, 11q23.3, 11q13.1, 12p13.1, 14q32.3, 17q11.2, 19q13.11 | 1p36.11, 3p25.3, 3p24.3, 3p21.3, 5p15.33, 5p13.3, 5p13.2, 6q23.2, 8p12, 9p21.3, 9p13.3, 10q26, 10p15.1, 7q22.1, 11p15.4, 13q34
Abbreviations: CNA = copy number aberration; HR = high-risk; LR = low-risk; SCC = squamous cell carcinoma.

Validation of CNAs in the Independent Cohort of Lung SCC Patients Using Real-Time PCR
For the purposes of validation, 5 fragments were randomly selected from the 41 fragments used for the prediction model construction, and their copy numbers were evaluated using real-time PCR in the independent cohort of 45 lung SCC patients. The copy number status (gain or loss) for each fragment in each patient was determined using the same criteria as in the array-CGH analysis, as described previously herein. After applying the log rank test, fragment Frag-36 (chr5:36645000-36706000 5p13.2) was found to be significantly associated with the survival of lung SCC patients in the validation cohort, but the other 4 fragments were not. Solute carrier family 1, member 3 (SLC1A3) was the only gene located within this fragment. As shown in Figure 3, patients who harbored a copy number gain of this gene tended to have a better prognosis compared with patients with a neutral copy number or even a copy number loss in this region in the array-CGH data set and the independent validation data set. The full validation results for the 5 randomly selected fragments are shown in Table 2.
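The text does not say how the real-time PCR readings were converted to a copy number status, only that the array-CGH criteria were reused. One common approach is the comparative Ct method, sketched below purely as an assumption: it presumes roughly 100% amplification efficiency, a normal diploid calibrator sample, HBB as the reference gene, and the log2 thresholds of 0.848 and −0.737 from the array-CGH analysis. The function names and all Ct values are hypothetical.

```python
from statistics import mean


def log2_ratio_from_ct(target_tumor, reference_tumor, target_normal, reference_normal):
    """Estimate a tumor/normal log2 copy ratio from triplicate Ct readings.

    Each argument is a list of Ct values. The result is -ddCt, where
    dCt = mean(target Ct) - mean(reference Ct) within each sample.
    """
    delta_tumor = mean(target_tumor) - mean(reference_tumor)
    delta_normal = mean(target_normal) - mean(reference_normal)
    return -(delta_tumor - delta_normal)


def copy_number_status(log2_ratio, gain=0.848, loss=-0.737):
    """Apply the array-CGH gain and loss thresholds to a qPCR log2 ratio."""
    if log2_ratio > gain:
        return "gain"
    if log2_ratio < loss:
        return "loss"
    return "neutral"


# Hypothetical triplicates: the target amplifies 2 cycles earlier in the tumor.
ratio = log2_ratio_from_ct(
    target_tumor=[24.1, 24.0, 23.9],
    reference_tumor=[26.0, 26.1, 25.9],
    target_normal=[26.0, 26.0, 26.0],
    reference_normal=[26.0, 26.0, 26.0],
)
status = copy_number_status(ratio)
```

A target that crosses threshold about 2 cycles earlier than expected corresponds to a log2 ratio near 2, which exceeds the gain threshold.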
Figure 2 The Performance of the Prediction Model in Evaluation of the Survival of Lung Squamous Cell Carcinoma Patients. (A) Scatter Plot Shows That Patients in the High-Risk (HR) Group (Red Points) and the Low-Risk (LR) Group (Blue Points) Could Be Separated Successfully Using the Prediction Model. (B) The Receiver Operating Characteristic Curve Shows the Sensitivity (Sens) (90%) and Specificity (Spec) (94%) of This Model in the Training Set
Abbreviations: PV− = Negative Predictive Value; PV+ = Positive Predictive Value.

Figure 3 Kaplan–Meier Plot of Frag-36 in the Array-Comparative Genomic Hybridization (aCGH) Cohort (A) and in the Independent Validation Cohort (B). Patients Who Harbored Copy Number Gains in this Region Tended to Have a Better Prognosis Compared With Patients Who Carried a Neutral Copy Number or a Copy Number Loss

Table 2 The Validation of 5 Randomly Selected Fragments in the Independent Lung SCC Cohort
Fragment | Chr | Start, kb | End, kb | P a (Array-CGH Data Set) | P a (Validation Data Set)
Frag-28 | 3 | 49437 | 49612 | .001 | .723
Frag-33 | 4 | 186305 | 186525 | <.001 | .398
Frag-36 | 5 | 36645 | 36706 | <.001 | .025
Frag-95 | 14 | 104465 | 104589 | .001 | .237
Frag-117 | 19 | 19557 | 19598 | .001 | .336
Abbreviations: CGH = comparative genomic hybridization; Chr = chromosome; kb = kilobase; SCC = squamous cell carcinoma.
a Processed using log rank test with R software.

Discussion
Squamous cell carcinoma is one of the most common subtypes of lung cancer. Currently, a histopathology-based staging system is still the dominant standard used to estimate the outcome of lung cancer patients in the clinic. However, because of the heterogeneity of SCC, patient outcomes vary considerably, even for those whose tumors have similar clinical and pathological features.13 In these circumstances, the current staging system for SCC might have reached its limitation for estimation of prognosis, and this is why molecular methods could add value. In this study, we aimed to identify potential prognostic markers from a genomic viewpoint using array-CGH analysis. All patients included in the study had complete follow-up information. Patients who died within 3 months after surgery were excluded, because they could have died from postoperative complications rather than SCC. We also carefully checked the lymph node metastasis status of each patient because metastasis is highly associated with patient survival. To avoid a false-negative result regarding the lymph node status, 5 patients with no lymph node metastases were randomly selected from the HR group and hematoxylin and eosin-stained slides of lymph nodes were examined using serial sectioning with an average interval of 20 μm. All the slides were examined by a senior pathologist, and no metastases were found. These stringent criteria made our further analyses more reliable.
We identified 123 genomic regions with CNAs that were associated with survival of patients with lung SCC in this study. Among them, some are frequently reported aberrations, such as amplifications of 2p23,14 8q24,15-17 3q27.1,18-20 and 8p12,16,21 and the loss of 3p,22 9p21.3,4 and 13q34.20 In the study of Salido et al,14 anaplastic lymphoma kinase (which is located in 2p23) copy number amplification was poorly characteristic of NSCLC, and in the study of Danner et al,22 deletion of 3p was reported to be a frequent chromosome imbalance in SCC and was associated with decreased overall survival. These findings were also evident in our research. Furthermore, amplification of 10q23 and 17q23-24 has been identified as being significantly associated with the development of early brain metastasis of lung adenocarcinoma,23 and our results were in accordance with this in that patients who exhibited a gain of 10q23 and 17q23-24 had a poor prognosis. In addition, amplification of 19q13.3 has been found to occur frequently in lung SCC, but the association of gain of 19q13.3 and prognosis in these patients needs further investigation.24 Our research revealed that patients with a gain of 19q13.3 have poor survival. This might be because CNAs in these regions lead to aberrations of some oncogenes and tumor suppressor genes. For example, the gain of 8q24, where the well known myc oncogene is located, has been considered to play an important role in tumor progression in many studies, and is associated with a poor
prognosis in NSCLC.25,26 Furthermore, the loss of 9p21.3, where a tumor suppressor gene and cell cycle regulators localize, is also frequently reported to be associated with a poor survival outcome in patients with NSCLC.3,4 Some variations such as gain of 8p12 and loss of 18q21.32 were also revealed to be poor survival factors in our study. Nearby fragments of these sections have been found to be related to prognosis in other research. For example, amplification of 8p11.23-11.22 where the fibroblast growth factor receptor gene is located has been reported to be an independent negative prognostic factor in lung SCC,27 and deletions of 18q22.3 have been found to be associated with recurrences of SCC.28 It might be speculated that adjacent segments on chromosomes have some similar functions or have a synergistic effect during performance of certain functions. Some variations previously reported to be associated with the prognosis of NSCLC, such as a gain of 10p and 16q,21 were not detected in our research. This might be a clue that SCC, a subtype of NSCLC, has its own genetic CNA profile, and NSCLC should not be treated as a whole in these types of studies. We also detected some CNA regions that have not previously been reported. For example, patients with amplifications of 5p13.2 and 5p13.3 had a favorable survival outcome, and those with gains of 11q23.3, 11q13.1, and 14q32.3 and a deletion of 3p21.3 had poor survival. Further investigation of genes localized in these regions might reveal more oncogenes and tumor suppressor genes that have significant roles in the progression of SCC. The results of the GSEA indicated that genes involved in the 123 CNA regions studied were significantly enriched in several gene ontology categories that cover most aspects of the hallmarks of cancer, such as cell growth, angiogenesis, cell motility, and the cell’s response to a drug. 
This finding suggests that the prognosis of SCC is determined by a combination of cell behaviors, and we need to cover all of these aspects to construct a robust model to predict patient survival. In this study, we first screened 41 fragments from the 123 CNA regions studied using a cluster method and then constructed a prediction model using a generalized linear model. This model performed well in the array-CGH data set, but it still needed to be validated in an independent cohort. We therefore randomly selected 5 fragments from the 41 CNA regions for further validation using real-time PCR, and the results showed that Frag-36 (5p13.2) was significantly associated with the survival of lung SCC patients in the independent cohort. In many studies, a gain of 5p has frequently been detected in NSCLC and SCC,16,29 and Garnis et al30 reported multiple early genetic events on chromosome 5p in lung cancer progression. However, a CNA in this region has not been reported to be associated with survival in cancer until now.
The SLC1A3 gene, which is a member of the solute carrier family, is located in Frag-36. Many members of this family are considered to play important roles in cancer progression. Some of them play the roles of oncogenes and some function as tumor suppressor genes. Other authors at our institution have reported that high expression of SLC39A6 (solute carrier family 39, member 6) protein is associated with short survival times in individuals with advanced esophageal SCC.31 SLC5A8 (solute carrier family 5, member 8) was the most frequently reported gene among this family, and it has been proposed to be a potential tumor suppressor gene that is silenced by epigenetic changes in various tumors, such as prostate tumors,32 colorectal cancers,33,34 and pancreatic cancer.35 Other members of the gene family have also been reported to be potential tumor suppressor genes besides SLC5A8, such as SLC38A3 (solute carrier family 38, member 3)36 and SLC3A2 (solute carrier family 3, member 2).37 However, there have only been a few reports about the relationship between SLC1A3 and cancer. Although it has been reported that overexpression of SLC1A3 is associated with the migration of glioblastoma38 and pediatric osteosarcoma,37 no studies have reported on the relationship between SLC1A3 and lung cancer until now. In our study, we found that the genomic region in which SLC1A3 was located was more frequently amplified in SCC patients in the LR group, suggesting that amplification of SLC1A3 is associated with a favorable outcome in lung SCC and that it might also act as a tumor suppressor gene in SCC. Further studies of the molecular function of SLC1A3 are needed to fully investigate its role in lung cancer progression. Using array-CGH, we observed a number of genomic CNA regions in lung SCC patients in this study and identified some potentially prognostic CNAs related to their survival. Furthermore, we constructed a model to estimate the survival of patients on the basis of these CNAs. 
The results of the array-CGH were then further validated using real-time PCR in an independent cohort. Although our prediction model still requires further validation, these CNAs could be useful markers to evaluate patients’ prognosis, and they might play an important role in assisting clinicians to make decisions regarding appropriate adjuvant therapy after surgery.
Conclusions
In this study, genomic CNAs in patients with primary lung SCC were investigated using array-CGH. We found 123 CNA regions that were significantly associated with survival in these patients, and a prognosis estimation model was constructed using 41 regions selected from the 123 regions. In addition, 5 of these CNAs were selected for validation in an independent cohort of 45 primary lung SCCs using real-time PCR. The CNA regions that we identified could potentially be useful molecular biomarkers to assist in estimating the prognosis of lung SCC, especially for patients whose tumors have similar clinical and pathological features, and they could also help clinicians make appropriate decisions concerning adjuvant therapy after surgery.
Clinical Practice Points
• Sole reliance on morphology and the TNM stage is inadequate to estimate the survival of patients with lung SCC.
• Some CNA regions in patients with lung SCC have been found to be significantly associated with survival.
• Assessment of CNAs could potentially be useful for estimating the prognosis of patients with lung SCC, and assist in clinical decision-making regarding adjuvant therapy.
Acknowledgments
This study was supported by grants from the National Natural Science Foundation of China, No. 30770932 and 81272414 (Dongmei Lin), and the Natural Science Foundation of Beijing, No. 7082058 (Dongmei Lin). These organizations had no involvement in the conduct of the study, the writing of the study report, or the decision to submit it for publication. Editorial assistance with the manuscript was provided by Content Ed Net, Shanghai Co Ltd.
Disclosure
The authors have stated that they have no conflicts of interest.
Supplemental Data
Supplemental tables accompanying this article can be found in the online version at http://dx.doi.org/10.1016/j.cllc.2015.08.005.
References
1. Jemal A, Siegel R, Ward E, et al. Cancer statistics, 2006. CA Cancer J Clin 2006; 56:106-30.
2. Travis WD, Brambilla E, Müller-Hermelink HK, et al. World Health Organization classification of tumours: pathology and genetics of tumours of the lung, pleura, thymus, and heart. Lyon: IARC Press, 2004, 20-27.
3. Van Boerdonk RA, Daniels JM, Snijders PJ, et al. DNA copy number aberrations in endobronchial lesions: a validated predictor for cancer. Thorax 2014; 69:451-7.
4. Zhao Y, Li Y, Lu H, et al. Association of copy number loss of CDKN2B and PTCH1 with poor overall survival in patients with pulmonary squamous cell carcinoma. Clin Lung Cancer 2011; 12:328-34.
5. Tai AL, Yan WS, Fang Y, et al. Recurrent chromosomal imbalances in non–small-cell lung carcinoma: the association between 1q amplification and tumor recurrence. Cancer 2004; 100:1918-27.
6. Chujo M, Noguchi T, Miura T, et al. Comparative genomic hybridization analysis detected frequent overrepresentation of chromosome 3q in squamous cell carcinoma of the lung. Lung Cancer 2002; 38:23-9.
7. Rydzanicz M, Giefing M, Ziolkowski A, et al. Nonrandom DNA copy number changes related to lymph node metastases in squamous cell carcinoma of the lung. Neoplasma 2008; 55:493-500.
8. Greene FL, Trotti A (eds). AJCC Cancer Staging Manual 7th edn. American Joint Committee on Cancer, Springer: Chicago, 2009, pp 253-270.
9. R Development Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing; 2010.
10. Venkatraman ES, Olshen AB. A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 2007; 23:657-63.
74
-
Clinical Lung Cancer January 2016
11. Weir BA, Woo MS, Getz G, et al. Characterizing the cancer genome in lung adenocarcinoma. Nature 2007; 450:893-8. 12. Frazer KA, Ballinger DG, Cox DR, et al. A second generation human haplotype map of over 3.1 million SNPs. Nature 2007; 449:851-61. 13. Chen HY, Yu SL, Chen CH, et al. A five-gene signature and clinical outcome in nonesmall-cell lung cancer. N Engl J Med 2007; 356:11-20. 14. Salido M, Pijuan L, Martínez-Avilés L, et al. Increased ALK gene copy number and amplification are frequent in nonesmall-cell lung cancer. J Thorac Oncol 2011; 6: 21-7. 15. Micke P, Edlund K, Holmberg L, et al. Gene copy number aberrations are associated with survival in histologic subgroups of non-small cell lung cancer. J Thorac Oncol 2011; 6:1833-40. 16. Boelens MC, Kok K, van der Vlies P, et al. Genomic aberrations in squamous cell lung carcinoma related to lymph node or distant metastasis. Lung Cancer 2009; 66:372-8. 17. Kang JU, Koo SH, Kwon KC, et al. High frequency of genetic alterations in nonsmall cell lung cancer detected by multi-target fluorescence in situ hybridization. J Korean Med Sci 2007; 22(suppl):S47-51. 18. Kang JU, Koo SH, Kwon KC, et al. Identification of novel candidate target genes, including EPHB3, MASP1 and SST at 3q26.2-q29 in squamous cell carcinoma of the lung. BMC Cancer 2009; 9:237. 19. Comtesse N, Keller A, Diesinger I, et al. Frequent overexpression of the genes FXR1, CLAPM1 and EIF4G located on amplicon 3q26-27 in squamous cell carcinoma of the lung. Int J Cancer 2007; 120:2538-44. 20. Son JW, Jeong KJ, Jean WS, et al. Genome-wide combination profiling of DNA copy number and methylation for deciphering biomarkers in non-small cell lung cancer patients. Cancer Lett 2011; 311:29-37. 21. Lockwood WW, Chari R, Coe BP, et al. Integrative genomic analyses identify BRF2 as a novel lineage-specific oncogene in lung squamous cell carcinoma. PLoS Med 2010; 7:e1000315. 22. Danner BC, Hellms T, Jung K, et al. 
Prognostic value of chromosomal imbalances in squamous cell carcinoma and adenocarcinoma of the lung. Ann Thorac Surg 2011; 92:1038-43. 23. Lee HW, Seol HJ, Choi YL, et al. Genomic copy number alterations associated with the early brain metastasis of nonesmall-cell lung cancer. Int J Oncol 2012; 41: 2013-20. 24. Vanhecke E, Valent A, Tang X, et al. 19q13-ERCC1 gene copy number increase in nonesmall-cell lung cancer. Clin Lung Cancer 2013; 14:549-57. 25. Iwakawa R, Kohno T, Kato M, et al. MYC amplification as a prognostic marker of early-stage lung adenocarcinoma identified by whole genome copy number analysis. Clin Cancer Res 2011; 17:1481-9. 26. Massion PP, Zou Y, Uner H, et al. Recurrent genomic gains in preinvasive lesions as a biomarker of risk for lung cancer. PLoS One 2009; 4:e5611. 27. Kim HR, Kim DJ, Kang DR, et al. Fibroblast growth factor receptor 1 gene amplification is associated with poor survival and cigarette smoking dosage in patients with resected squamous cell lung cancer. J Clin Oncol 2013; 31:731-7. 28. Sriram KB, Larsen JE, Savarimuthu Francis SM, et al. Array-comparative genomic hybridization reveals loss of SOCS6 is associated with poor prognosis in primary lung squamous cell carcinoma. PLoS One 2012; 7:e30398. 29. Huang YT, Heist RS, Chirieac LR, et al. Genome-wide analysis of survival in earlystage nonesmall-cell lung cancer. J Clin Oncol 2009; 27:2660-7. 30. Garnis C, Davies JJ, Buys TP, et al. Chromosome 5p aberrations are early events in lung cancer: implication of glial cell line-derived neurotrophic factor in disease progression. Oncogene 2005; 24:4806-12. 31. Wu C, Li D, Jia W, et al. Genome-wide association study identifies common variants in SLC39A6 associated with length of survival in esophageal squamous-cell carcinoma. Nat Genet 2013; 45:632-8. 32. Park JY, Zheng W, Kim D, et al. Candidate tumor suppressor gene SLC5A8 is frequently down-regulated by promoter hypermethylation in prostate tumor. Cancer Detect Prev 2007; 31:359-65. 
33. Ueno M, Toyota M, Akino K, et al. Aberrant methylation and histone deacetylation associated with silencing of SLC5A8 in gastric cancer. Tumour Biol 2004; 25:134-40. 34. Kakizaki F, Aoki K, Miyoshi H, et al. CDX transcription factors positively regulate expression of solute carrier family 5, member 8 in the colonic epithelium. Gastroenterology 2010; 138:627-35. 35. Park JY, Helm JF, Zheng W, et al. Silencing of the candidate tumor suppressor gene solute carrier family 5 member 8 (SLC5A8) in human pancreatic cancer. Pancreas 2008; 36:e32-9. 36. Kholodnyuk ID, Kozireva S, Kost-Alimova M, et al. Down regulation of 3p genes, LTF, SLC38A3 and DRR1, upon growth of human chromosome 3-mouse fibrosarcoma hybrids in severe combined immunodeficiency mice. Int J Cancer 2006; 119:99-107. 37. Uemura T, Yerushalmi HF, Tsaprailis G, et al. Identification and characterization of a diamine exporter in colon epithelial cells. J Biol Chem 2008; 283:26428-35. 38. Tatenhorst L, Senner V, Püttmann S, et al. Regulators of G-protein signaling 3 and 4 (RGS3, RGS4) are associated with glioma cell motility. J Neuropathol Exp Neurol 2004; 63:210-22.
Supplemental Table 1 Demographic Information of Patients Involved in This Study

| Characteristic | High Risk | Low Risk |
|---|---|---|
| Array-CGH Data Set | | |
| Age: ≤55 years | 16 | 17 |
| Age: >55 years | 34 | 33 |
| Sex: M | 44 | 46 |
| Sex: F | 6 | 4 |
| LNM^a: + | 30 | 20 |
| LNM^a: − | 20 | 30 |
| Differentiation: Well differentiated | 3 | 2 |
| Differentiation: Moderately differentiated | 23 | 31 |
| Differentiation: Poorly differentiated | 24 | 17 |
| Stage: I | 14 | 26 |
| Stage: II | 10 | 14 |
| Stage: III and IV | 26 | 10 |
| Tumor Size: ≤3 cm | 11 | 17 |
| Tumor Size: >3 cm | 39 | 33 |
| Real-Time PCR Data Set | | |
| Age: ≤55 years | 8 | 9 |
| Age: >55 years | 14 | 14 |
| Sex: M | 21 | 21 |
| Sex: F | 1 | 2 |
| LNM^a: + | 17 | 7 |
| LNM^a: − | 5 | 16 |
| Differentiation: Well differentiated | 2 | 1 |
| Differentiation: Moderately differentiated | 16 | 16 |
| Differentiation: Poorly differentiated | 4 | 6 |
| Stage: I | 6 | 11 |
| Stage: II | 4 | 5 |
| Stage: III | 12 | 7 |
| Tumor Size: ≤3 cm | 1 | 0 |
| Tumor Size: >3 cm | 21 | 23 |

Abbreviations: CGH = comparative genomic hybridization; F = female; LNM = lymph node metastasis; M = male; PCR = polymerase chain reaction.
^a + indicates presence of lymph node metastasis; − indicates no lymph node metastasis.
Supplemental Table 2 Primers Used in Real-Time Polymerase Chain Reaction Analysis

| Fragment | GSYM | Primer Sequence (F) | Primer Sequence (R) | Product Size |
|---|---|---|---|---|
| Frag-28 | NICN1 | ACTCCTCCGTGAGGTAAGC | TGTTGGACACTAGAGCCATT | 66 |
| Frag-28 | BSN | CCAAAGATGCTGGAAGA | GGGAGTGACCCTTGTTA | 122 |
| Frag-28 | BSN | GTACACCAGCCCCTTCT | TACCCACCCAGGAAAAT | 150 |
| Frag-33 | SLC25A4 | TAGACTATTCCTAGGGGAAG | TGAACACTCAATGAAGCAT | 87 |
| Frag-33 | SNX25 | AGCCAATTTATTATCCTGAAGC | ATTCCCTACGAACAACTAACCTAT | 121 |
| Frag-33 | SNX25 | TACGATCTCCCTTGTCAGC | CCCAGGAGGAAACAAATAA | 90 |
| Frag-36 | SLC1A3 | TGGGAAAGGTAGATTGGG | TGCTGTCAACCCTGCTAA | 131 |
| Frag-36 | SLC1A3 | CATGCACTAAAGCAAAGC | AGGTCAAGGAACCCAAAA | 131 |
| Frag-36 | SLC1A3 | TCTTCTGTTTGCCTCCAC | TGATGACAATGATTATGCC | 132 |
| Frag-95 | PLD4 | GTTGCCTATACCTTAAACCCTTACCC | CCAGCCCACCTCACATCACC | 120 |
| Frag-95 | C14orf79 | CCCGCAGTCACTCGCAGATG | GACCCACTCCGCAGACCACA | 200 |
| Frag-95 | CDCA4 | AGTCGTGCTGACCTCCTGTA | GACCCAAACCAAATCCTCTT | 139 |
| Frag-95 | GPR132 | CCATGACGCTGCTAGGTT | CAAGAGTGAGGACGGGAGT | 175 |
| Frag-117 | PBX4 | ATAACTTTCTTAGTCCCTCC | CACCTTGCCTCTTCAGT | 129 |
| Frag-117 | EDG4 | ACTGTTGTCATCATCCTGG | CTGCTCACATACATTATCACCT | 177 |

Abbreviations: F = forward; GSYM = gene symbol; R = reverse.
Supplemental Table 3 Detailed Information on the 123 Fragments

| Fragment | CHR | Start | Stop | Representative Fragment |
|---|---|---|---|---|
| Frag-1 | 1 | 24868 | 24952 | 0 |
| Frag-2 | 1 | 25257 | 25632 | 0 |
| Frag-3 | 1 | 32758 | 32791 | 0 |
| Frag-4 | 1 | 34465 | 35804 | 1 |
| Frag-5 | 1 | 148141 | 148185 | 0 |
| Frag-6 | 1 | 157950 | 158190 | 0 |
| Frag-7 | 1 | 159356 | 159360 | 0 |
| Frag-8 | 1 | 159435 | 159465 | 0 |
| Frag-9 | 1 | 167699 | 167969 | 0 |
| Frag-10 | 1 | 171968 | 172454 | 0 |
| Frag-11 | 1 | 183507 | 183544 | 0 |
| Frag-12 | 1 | 191109 | 191357 | 0 |
| Frag-13 | 1 | 208179 | 208753 | 1 |
| Frag-14 | 2 | 4534 | 4831 | 0 |
| Frag-15 | 2 | 6001 | 8321 | 0 |
| Frag-16 | 2 | 10780 | 11240 | 0 |
| Frag-17 | 2 | 16102 | 16668 | 0 |
| Frag-18 | 2 | 24222 | 25243 | 1 |
| Frag-19 | 2 | 25818 | 26604 | 0 |
| Frag-20 | 2 | 27024 | 27136 | 0 |
| Frag-21 | 2 | 27455 | 27519 | 1 |
| Frag-22 | 2 | 86147 | 86180 | 0 |
| Frag-23 | 2 | 135749 | 135942 | 0 |
| Frag-24 | 2 | 231067 | 231331 | 1 |
| Frag-25 | 3 | 10700 | 10950 | 0 |
| Frag-26 | 3 | 20007 | 20787 | 0 |
| Frag-27 | 3 | 45564 | 45663 | 0 |
| Frag-28 | 3 | 49437 | 49612 | 1 |
| Frag-29 | 3 | 185538 | 185547 | 0 |
| Frag-30 | 4 | 2099 | 2225 | 0 |
| Frag-31 | 4 | 86206 | 86615 | 0 |
| Frag-32 | 4 | 169297 | 169433 | 0 |
| Frag-33 | 4 | 186305 | 186525 | 1 |
| Frag-34 | 5 | 2209 | 2807 | 0 |
| Frag-35 | 5 | 36100 | 36188 | 0 |
| Frag-36 | 5 | 36645 | 36706 | 1 |
| Frag-37 | 5 | 139203 | 139660 | 1 |
| Frag-38 | 6 | 130072 | 130409 | 0 |
| Frag-39 | 6 | 132311 | 132311 | 0 |
| Frag-40 | 6 | 134258 | 135098 | 1 |
| Frag-41 | 7 | 20761 | 21292 | 0 |
| Frag-42 | 7 | 33121 | 33279 | 1 |
| Frag-43 | 7 | 53762 | 54939 | 0 |
| Frag-44 | 7 | 80267 | 80384 | 0 |
| Frag-45 | 7 | 95052 | 95267 | 0 |
| Frag-46 | 7 | 95676 | 96152 | 1 |
| Frag-47 | 7 | 100074 | 100116 | 0 |
| Frag-48 | 7 | 129469 | 129635 | 0 |
| Frag-49 | 7 | 137804 | 137929 | 1 |
| Frag-50 | 7 | 156490 | 156702 | 0 |
| Frag-51 | 7 | 157852 | 158137 | 1 |
| Frag-52 | 8 | 8229 | 9674 | 0 |

GSYM (in fragment order): SRRM1 CLIC4 TMEM57 C1orf63 SYF2 ZBTB8 ZMYM4 KIAA0319L C1orf212 ZMYM1 SFPQ ZMYM6 NCDN SF3B4 OTUD7B MTMR11 SV2A TAGLN2 SLAMF9 CRP SLAMF8 DUSP23 CCDC19 IGSF9 DEDD NIT1 NDUFS2 ADAMTS4 APOA2 TOMM40L FCER1G SELE SELL SERPINC1 DARS2 RABGAP1L ZBTB37 KLHL20 IVNS1ABP C1orf26 CDC73 TROVE2 UCHL5 GLRX2 HHAT SERTAD4 SYT14 RNF144 RSAD2 ATP6V1C2 PDIA6 PQLC3 C2orf50 ROCK2 FAM49A ITSN2 POMC ADCY3 RBJ RAB10 C2orf39 HADHA HADHB OTOF SELI ASXL2 GPR113 KIF3C MAPRE3 DPYSL5 AGBL5 FLJ20254 KRTCAP3 ZNF513 NRBP1 PPM1G POLR1A ZRANB3 CAB39 SP100 SLC6A11 SGOL1 PCAF C3orf48 LARS2 LIMD1 DAG1 BSN NICN1 FAM131A CLCN2 MXD4 C4orf15 POLN ARHGAP24 FLJ20035 ANXA10 SLC25A4 LRP2BP SNX25 C5orf38 UGT3A2 SKP2 LMBRD2 SLC1A3 NRG2 C5orf32 PURA PFDN1 PSD2 ARHGAP18 L3MBTL3 CTGF SGK SLC2A12 TBPL1 SP8 ABCB5 BBS9 FLJ45974 SEC61G VSTM2 SEMA3C PDK4 DYNC1I1 SLC25A13 GNB2 TFR2 ACTL6B PERQ1 ZC3HC1 C7orf45 FLJ14803 KIAA0265 SVOPL TRIM24 HLXB9 UBE3C NCAPG2 PTPRN2 MFHAS1 CLDN23 TNKS PPP1R3B THEX1
Supplemental Table 3 Continued

| Fragment | CHR | Start | Stop | Representative Fragment |
|---|---|---|---|---|
| Frag-53 | 8 | 22602 | 22842 | 1 |
| Frag-54 | 8 | 32705 | 33429 | 0 |
| Frag-55 | 8 | 33568 | 34493 | 1 |
| Frag-56 | 8 | 96326 | 96338 | 0 |
| Frag-57 | 8 | 118612 | 121775 | 0 |
| Frag-58 | 8 | 124331 | 126130 | 0 |
| Frag-59 | 8 | 130833 | 131088 | 0 |
| Frag-60 | 8 | 131534 | 134164 | 0 |
| Frag-61 | 8 | 134921 | 139360 | 1 |
| Frag-62 | 9 | 21795 | 21957 | 1 |
| Frag-63 | 9 | 34177 | 34462 | 1 |
| Frag-64 | 10 | 3398 | 4386 | 0 |
| Frag-65 | 10 | 4880 | 5405 | 1 |
| Frag-66 | 10 | 85995 | 86002 | 0 |
| Frag-67 | 10 | 120343 | 120552 | 0 |
| Frag-68 | 10 | 122924 | 123723 | 0 |
| Frag-69 | 10 | 132610 | 133599 | 1 |
| Frag-70 | 11 | 770 | 974 | 0 |
| Frag-71 | 11 | 3030 | 3200 | 1 |
| Frag-72 | 11 | 6772 | 7027 | 0 |
| Frag-73 | 11 | 10064 | 10187 | 1 |
| Frag-74 | 11 | 65240 | 65310 | 0 |
| Frag-75 | 11 | 65870 | 65891 | 0 |
| Frag-76 | 11 | 66320 | 66384 | 0 |
| Frag-77 | 11 | 112337 | 112616 | 0 |
| Frag-78 | 11 | 116572 | 117234 | 0 |
| Frag-79 | 11 | 117633 | 117685 | 0 |
| Frag-80 | 11 | 119487 | 119627 | 0 |
| Frag-81 | 11 | 124123 | 124815 | 1 |
| Frag-82 | 12 | 49 | 303 | 0 |
| Frag-83 | 12 | 624 | 864 | 0 |
| Frag-84 | 12 | 12657 | 12763 | 0 |
| Frag-85 | 12 | 12832 | 14850 | 0 |
| Frag-86 | 12 | 14997 | 16053 | 1 |
| Frag-87 | 12 | 61084 | 61722 | 0 |
| Frag-88 | 12 | 69827 | 69837 | 1 |
| Frag-89 | 13 | 112187 | 112610 | 0 |
| Frag-90 | 13 | 112809 | 112842 | 1 |
| Frag-91 | 14 | 54602 | 55128 | 0 |
| Frag-92 | 14 | 60046 | 60251 | 0 |
| Frag-93 | 14 | 76950 | 77022 | 0 |
| Frag-94 | 14 | 102516 | 102671 | 0 |
| Frag-95 | 14 | 104465 | 104589 | 1 |
| Frag-96 | 15 | 61125 | 61233 | 0 |
| Frag-97 | 15 | 64628 | 64834 | 1 |
| Frag-98 | 15 | 72751 | 72915 | 0 |
| Frag-99 | 15 | 73006 | 73531 | 0 |

GSYM (in fragment order): PEBP4 EGR3 FUT10 NRG1 DUSP26 C8orf37 DCC1 NOV MAL2 MTBP ENPP2 SNTB1 TNFRSF11B EXT1 SAMD12 MRPL13 TAF2 COLEC10 THRAP6 DEPDC6 SQLE KIAA0196 ATAD2 TMEM65 ZNF572 TRMT12 FER1L6 NDUFB9 RNF139 TATDN1 FAM91A1 ANXA13 ZHX1 FBXO32 C8ORFK36 C8orf32 MTSS1 MLZE FAM49B TG TMEM71 PHF20L1 ADCY8 KCNQ3 LRRC6 C8ORFK32 KHDRBS3 ZFAT1 MTAP C9orf25 NUDT2 DNAI1 C9orf48 C9orf24 UBAP1 KLF6 AKR1CL2 UCN3 AKR1C3 AKR1C4 AKR1C2 AKR1C1 RGR C10orf46 PRLHR NSMCE4A ATE1 FGFR2 PPP2R2D TCERG1L CD151 POLR2L EFCAB4A AP2A2 CHID1 TSPAN4 LRDD PNPLA2 RPLP2 CEND1 SLC25A22 C11orf36 OSBPL5 CARS ZNF215 ZNF214 NLRP14 OR6A2 SBF2 DKFZp761E198 HTATIP B3GNT1 SLC29A2 PC C11orf80 RCE1 NCAM1 BACE1 SIDT2 FXYD2 DSCAML1 CEP164 TAGLN PCSK7 FXYD6 RNF214 EVA1 CD3E POU2F3 OAF TRIM29 ESAM CCDC15 VSIG2 HEPACAM PKNOX2 SLC37A2 ROBO4 C11orf61 JARID1A SLC6A12 SLC6A13 NINJ2 WNK1 GPR19 CREBL2 CDKN1B EMP1 HIST4H4 WBP11 GUCY2C APOLD1 C12orf36 ATF7IP GPRC5D HEBP1 DDX47 GPRC5A GRIN2B C12orf60 GSG1 FLJ22662 H2AFJ RERG PDE6H DERA PTPRO STRAP EPS8 ARHGDIB USP15 MON2 C12orf61 TSPAN8 TUBGCP3 F7 F10 KTN1 TBPL2 FBXO34 LGALS3 C14orf32 KIAA0831 DLG7 SIX4 SIX1 SIX6 AHSA1 C14orf148 THSD3 C14orf133 CDC42BPB TNFAIP2 PLD4 GPR132 C14orf79 CDCA4 TPM1 LACTB RPS27L ZWILCH SMAD6 CYP1A2 CYP1A1 LMAN1L CSK EDC3 COX5A C15orf39 SCAMP5 SIN3A NEIL1 RPP25 MAN2C1 COMMD4 PPCDC
Supplemental Table 3 Continued

| Fragment | CHR | Start | Stop | Representative Fragment |
|---|---|---|---|---|
| Frag-100 | 16 | 10926 | 11353 | 0 |
| Frag-101 | 16 | 15867 | 16056 | 1 |
| Frag-102 | 16 | 31446 | 31601 | 1 |
| Frag-103 | 16 | 55159 | 55217 | 0 |
| Frag-104 | 16 | 65767 | 65787 | 0 |
| Frag-105 | 16 | 66254 | 66317 | 0 |
| Frag-106 | 16 | 80159 | 80301 | 1 |
| Frag-107 | 17 | 23671 | 23697 | 0 |
| Frag-108 | 17 | 24920 | 24950 | 0 |
| Frag-109 | 17 | 26339 | 26507 | 0 |
| Frag-110 | 17 | 26782 | 27956 | 1 |
| Frag-111 | 17 | 41331 | 41458 | 1 |
| Frag-112 | 17 | 55331 | 55395 | 1 |
| Frag-113 | 18 | 22026 | 22401 | 0 |
| Frag-114 | 18 | 30177 | 31202 | 0 |
| Frag-115 | 18 | 41744 | 42903 | 0 |
| Frag-116 | 18 | 55859 | 56189 | 1 |
| Frag-117 | 19 | 19557 | 19598 | 1 |
| Frag-118 | 19 | 41802 | 41837 | 0 |
| Frag-119 | 19 | 49346 | 49424 | 1 |
| Frag-120 | 20 | 7709 | 7915 | 1 |
| Frag-121 | 20 | 17401 | 17536 | 1 |
| Frag-122 | 21 | 35486 | 35580 | 0 |
| Frag-123 | 21 | 39605 | 40480 | 1 |

GSYM (in fragment order): DEXI PRM1 TNP2 KIAA0350 C16orf75 PRM2 CIITA ABCC1 C16orf63 ERAF MT1E MT2A MT4 MT3 LOC653319 EXOC3L E2F4 NOL3 RANBP10 PARD6A C16orf48 GFOD2 CMIP TMEM97 TNFAIP1 IFT20 TP53I13 ANKRD13B GIT1 RNF135 NF1 SUZ12 CDK5R1 RHBDL3 PSMD11 ZNF207 MYO1D LRRC37B UTP6 C17orf79 C17orf75 RAB11FIP4 RHOT1 MAPT RPS6KB1 LOC51136 KCTD1 PSMA8 ZNF397 MAPRE2 ZNF396 ZNF271 DTNA ZNF24 KIAA1632 ST8SIA5 RNF165 CCDC5 ATP5A1 HDHD2 LOXHD1 C18orf25 PSTPIP2 MC4R EDG4 PBX4 GIOT-1 ZNF382 ZNF227 ZNF234 ZNF226 TXNDC13 HAO1 PCSK2 DSTN BFSP1 LOC150084 C21orf88 LOC728776 C21orf87 SH3BGR PCP4 C21orf13 BRWD1 WRB DSCAM B3GALT5

Abbreviations: CHR = chromosome; GSYM = gene symbol.