Oral Oncology (2007) 43, 455– 462
available at www.sciencedirect.com
journal homepage: http://intl.elsevierhealth.com/journals/oron/
Gene expression signatures that can discriminate oral leukoplakia subtypes and squamous cell carcinoma Nobuo Kondoh a,*, Shuri Ohkura a, Masaaki Arai a, Akiyuki Hada a, Toshio Ishikawa b, Yutaka Yamazaki c, Masanobu Shindoh d, Masayuki Takahashi e, Yoshimasa Kitagawa c, Osamu Matsubara f, Mikio Yamamoto a a
Department of Biochemistry II, National Defense Medical College, 3-2 Namiki, Tokorozawa-shi 359-8513, Japan Ishihara Sangyo Kaisha, LTD., Nishi-Shibukawa, Kusatu-shi 525-0025, Japan c Department of Oral Diagnosis, Division of Oral Pathobiological Science, Hokkaido University Graduate School of Dental Medicine N13 W7, Kita-ku, Sapporo 060-8586, Japan d Department of Oral Pathology and Biology, Division of Oral Pathobiological Science, Hokkaido University Graduate School of Dental Medicine N13 W7, Kita-ku, Sapporo 060-8586, Japan e Department of Oral surgery, National Defense Medical College, 3-2 Namiki, Tokorozawa-shi 359-8513, Japan f Department of Pathology II, National Defense Medical College, 3-2 Namiki, Tokorozawa-shi 359-8513, Japan b
Received 10 March 2006; accepted 25 April 2006 Available online 18 September 2006
KEYWORDS
Summary The purpose of this study is to generate a classifier for oral squamous cell carcinoma (OSCC) and leukoplakias (LPs), and evaluate its diagnostic potential. In order to identify marker gene candidates, differential gene expression between LPs and OSCCs were examined by cDNA microarray. The expression of 118 marker gene candidates was further evaluated by quantitative reverse transcription-PCR (QRT-PCR) analyses of 27 OSCC and 19 LP tissues. We identified 12 up-regulated and 15 down-regulated marker genes in OSCCs compared to LPs. Using Fisher’s linear discriminant analysis (LDA), we demonstrated that 11-gene predictors among this novel marker set could best distinguish OSCCs from LPs (>97% accuracy), whereas a further seven of these gene predictors could be utilized to distinguish higher grade (higher than moderate) from lower grade (lower than mild) dysplasias (>95% accuracy). These predictor
Head and neck/ oral cancer; Gene expression profiling; Leukoplakia; Microarray; Quantitative reverse transcription PCR
Abbreviations: OSCC, Oral squamous cell carcinoma; LP, Leukoplakia; LDA, Fisher’s linear discriminant analysis; loo, Leave-one-out cross validation; QRT-PCR, Quantitative reverse transcription-polymerase chain reaction. * Corresponding author. Tel.: +81 4 2995 1469; fax: +81 4 2996 5190. E-mail address:
[email protected] (N. Kondoh).
1368-8375/$ - see front matter c 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.oraloncology.2006.04.012
456
N. Kondoh et al. gene sets provide multigene classifiers for the diagnosis of pre-cancerous to cancerous transition of oral malignancy. c 2006 Elsevier Ltd. All rights reserved.
Introduction Oral leukoplakias (LPs) are white lesions that include hyperplasias and dysplasias of the oral mucosa, and often undergo malignant transformation to oral squamous cell carcinoma (OSCC).1 The histological features associated with LP, and with OSCC, have been described previously.2 The dysplasias are classified as either mild, moderate or severe, based on histopathological findings, and these designations are thought to be the sequential phases of oral carcinogenesis. However, we sometimes experience a discordance between the pathological diagnosis of these lesions and the corresponding prognosis in an individual patient. It has been suggested that the detection of clonal genetic changes, such as loss of heterozygosity or microsatellite instability, in both primary OSCC and pre-malignant lesions could be a more informative method of monitoring these cancer patients.3 However, the staging and grading of OSCCs are still, for the most part, dependent on traditional clinicopathological observations.3 In line with our previous approach,4 there are several attempts for the identification of molecular biomarkers associated with specific phenotypes of head and neck squamous cell carcinomas.5–7 Although, these approaches are effective to observe the wide variety of genetic alterations in oral malignancy, in order to accurately diagnose cancer subtypes, supervised learning methods are the most suitable.8–11 In this study, we attempted to generate a multigene classifier for the diagnosis of oral malignancies. Using both cDNA microarray and QRT-PCR techniques, a comprehensive gene expression profile was generated and compared among OSCC and leukoplakia tissues. We subsequently defined a list of 27 marker genes that are either significantly elevated or down-regulated in OSCCs, compared with LPs. Among these genes, predictor gene sets for OSCC-LP classification were determined by Fisher’s discriminant analysis (LDA) and validated by the leave-one-out method. Our present results suggest that these differentially expressed genes can not only provide a valuable prognostic tool for OSCC patients, but could also provide important information concerning the molecular mechanisms underlying the development of oral squamous cell carcinoma.
Materials and methods Tissue samples Twenty-seven OSCCs and 19 leukoplakias, including four hyperplastic, three mild, seven moderate and five severely dysplastic tissues, were incorporated into this study (Table 1). These samples had been surgically resected in the dental hospital of Hokkaido University from February 1998 to
April 2004. Each of the samples was then macrodissected, and the stromal tissues were carefully removed as much as possible by a pathologist and stored at 100 °C. All of these procedures were undertaken following informed consent from each patient, and adhered to the ethical guidelines of the Dental hospital of the Hokkaido University School of Dentistry (Sapporo, Japan). The main histoclinical characteristics of each patient are summarized in Table 1.
RNA extraction and microarray analysis RNA extractions were performed using ISOGEN (Nippon Gene, Japan), according to the manufacturer’s instructions. Total RNA isolates extracted from five LPs (one hyperplasia, and one mild, one moderate and two severe dysplasias), and from five OSCC tissues, were aliquoted as mixtures. cDNA microarray analysis was performed using the IntelliGene HS Human expression chip, containing 16,600 probe sets (Takara, Japan). Briefly, for each mixture, 4 lg of total RNA was used for single-stranded cDNA synthesis using a T7-promoter linked oligo-dT primer. Subsequently, amplified (a)-RNA synthesis was performed in the presence of aminoallyl-UTP (Ambion, USA), then coupled with either a 30 Cy3- or 30 Cy5 N-hydroxysuccinimide ester. After purification, the labeled a-RNAs were combined and hybridized to the IntelliGene HS Human expression chips. After overnight hybridization at 70 °C, a fluorescence scan was performed using a ScanArray Express microarray reader (Perkin–Elmer, CA, USA), and the signals were quantified using the accompanying software. To ensure the reliability of the data, genes were considered to be differentially expressed if the P values were <0.001 (Student t-test), and had a fold change >3.0.
Quantitative RT-PCR For validation of the microarray analysis, QRT-PCR, using RNA extracts from several OSCC and leukoplakia samples (Table 1), was performed as previously described.12 Each RNA sample (1 lg) was reverse-transcribed using Superscript II (Invitrogen). For each gene showing a 3-fold or higher difference in expression on the microarrays, PCR primer pairs were designed (see Table 2) using Primer Express software (Perkin–Elmer). Each amplification reaction was carried out using Sybr Green Mastermix in an ABI Prism 5700 Sequence Detection System (Applied Biosystems, CA, USA) for 15 min at 95 °C for the initial denaturing, followed by 40 cycles of 95 °C for 30 s and 60 °C for 1 min. The expression values for each gene were normalized to the expression levels of the S5 ribosomal protein transcript (Genbank accession no. NM_001009; primer pairs, 50 -GAG CGC CTC ACT AAC TCC ATG ATG A-30 and 50 -CAC TGT TGA TGA TGG CGT TCA CCA-30 ), which was used as an internal control.
Gene expression signatures that can discriminate oral LP subtypes and SCC Table 1
Pathoclinical appearances of tissue samples
Histologya
Siteb
Sexc
Age
TNM classification
WHO grade
OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC OSCC
1 2 3 5 6 7 8 10 11 12 13 14 15 16 17 19 22 23 55 56 57 58 59 60 61 64 65
P T Si UG Bu LG LG LG T T FM P UG T LG FM T T T T T Si FM UG Bu LG Bu
F M M M M M M M F M M F F F F M M F F M M M F F F M M
66 74 54 50 73 40 71 65 71 66 70 84 58 64 71 63 57 71 46 90 90 65 83 77 81 79 79
T2N0M0 T2N0M0 T1N0M0 T4N0M0 T3N0M0 T4N2M0 T4N0M0 T3N0M0 T2N0M0 T2N0M0 T2N0M0 T3N0M0 T4N0M0 T1N0M0 T1N0M0 T4N2cM0 T3N0M0 T2N1M0 T3N0M0 T3N0M0 – T3N2bM0 T2N1M0 T1N0M0 T3N0M0 T4N0M0 T3N0M0
1 1 3 1 1 1 1 1 2 2 2 1 1 2 1 1 1 2 1 2 – 2 1 1 1 2 2
S S S S S
25 26 27 29 49,50e
T T LG Bu T
M F F F M
57 58 87 70 54
NA NA NA NA NA
NA NA NA NA NA
T Bu T T UG T T
F M M F F F F
60 82 50 71 98 48 61
NA NA NA NA NA NA NA
NA NA NA NA NA NA NA
Mild dys 28 Mild dys 32,33e Mild dys 46
T T T
M M M
67 67 39
NA NA NA
NA NA NA
Hyperplas Hyperplas Hyperplas Hyperplas
T LG LG T
M F M F
60 55 42 68
NA NA NA NA
NA NA NA NA
dys dys dys dys dys
Mod Mod Mod Mod Mod Mod Mod
dys dys dys dys dys dys dys
21 30 36–40d 42–44d 33 52 47,48e
24 31 32 35
NA, not applicable. a HP, hyperplasia; Mid/Mod/S, mild-/moderate-/severedysplasia. b T, tongue; P, palate; LG/UG, lower/upper gingiva; FM, floor of mouth; Bu, buccal mucosa; Si, sinus. c M, male; F, female. d Mixture of three samples derived from different portions in the same lesion. e Mixture of two samples derived from different portions in the same lesion.
457
To evaluate the statistical significance of the differentially expressed genes, a Student t-test (two-sided) was performed. We performed the direct sequencing analyses and examined that each PCR product was correctly amplified (not shown).
Hierarchical clustering Unsupervised hierarchical clustering analysis was performed using DNASIS Pro software (Hitachi Software Engineering, Tokyo, Japan). In this study, a combination of the clustering algorithms Pearson correlation (for dissimilarity) and Ward (for cluster consolidation) was used.
Supervised classification The supervised classification of the data was examined by the use of Fisher’s linear discriminant analysis (LDA).13 Given a set of independent variables (pij), the weight vector of linear discriminant function is determined to maximize the ratio between the interclass distance and the sum of variance within each class: gi ¼ Rj w j pij
linear discriminant function
mA ¼ Ri2A gi =NA r2A ¼ Ri2A ðgi mA Þ2 =NA F ¼ jmA mB j2 =fr2A þ r2B g Fisher’s ratio A; B : classes i ¼ 1 Ni : an index for sample ðcaseÞ j ¼ 1 Nj : an index for variable ðgeneÞ In the present study, a set of independent variables (a ‘‘model’’) was selected by the use of a stepwise increment method and genetic algorithm.14 The stability of the model was examined by leave-one-out cross validation (loo). Each sample was masked one after another. An LDA model was then build from residual Ni 1 samples and the class of the masked sample was predicted from this model.
Results The identification of potential marker genes between leukoplakia and squamous cell carcinoma To identify potential marker genes that are differentially expressed between LP and OSCC, cDNA microarray analyses were performed using RNA mixtures of five OSCC and of five leukoplakias. Among the 16,600 target cDNAs on the chip arrays, 4600 genes showed a detectable signal of which 63 were highly expressed (3-fold or more) in the OSCC mixture, compared with the LP mixture. In addition, 53 genes were preferentially expressed (3-fold or more) in the LP mixture (data not shown). Since the cDNA array tip does not contain probes for keratin-4 and transglutaminase three genes, whose differential expression between LPs and OSCCs has already been verified in our previous study,4 the expression of these genes was additionally examined in the following QRT-PCR analysis. QRT-PCR analyses using 27 OSCC and 19
458
Table 2
Leukoplakia (LP)- and OSCC (SC)-dominant marker genes
Classifierb
Gene name (symbol)
Category
Accession no.
PCR primers (forward/reverse)
LP1 LP8
A A
Extracellular matrix Nuclear receptor
NM_000900 NM_021005
GTGCTGCTACACAAGAACCCTGAG/CACAAAGTTACTACCGCTAAGGCG GGTGGTCGCCTTTATGGACCAC/GACCACAGGCATCTGAGGTGAA
LP12 LP21 LP29 LP17
A A A B
Secretory protein Epithelial/Cytoskeletal Epithelial/Cytoskeletal Extracellular matrix
NM_003062 NM_006121 X14640 NM_002725
GGCATCGTCGAAATACGCCTAGA/TGAGTGATTTCAGGCCCTGGAAG TGCCCTCAAGGATGCCAAGAAC/GGAGGGTCCTGTAGGTGGCAAT GCAGAGCCCAGCTAGCCCTGAGC/TGGGGTGGCATCCATCTCCACGTTG GGCGTTCCATGACTTCTCCTCG/GCGGAGTAGGGCCTAGATGACC
LP19
B
ACAGTGGTTGGTGAGGAATGGGT/GCTCTCTAGCTGTGATGCCTCGC
B
Epithelial/ Protein modification Trans cription factor
NM_016190
LP27
NM_006732
CCAATGCTCCAGCTGTCGTCTG/GCAGGAGCAAGCCCTGCTCAC
LP5 LP28
B A, B
Matrix Gla protein (MGP) Nuclear receptor subfamily 2, group F, member 2 (NR2F2) Slit homolog 3 (Drosophila) (SLIT3) Keratin 1 (KRT1) Keratin 13 (KRT13) Proline arginine-rich end leucine-rich repeat protein (PRELP) Chromosome 1 open reading frame 10 (C1orfl0) FBJ murine osteosarcoma viral oncogene homolog B (FOSB) Dermatopontin (DPT) Transglutaminase 3 (TG3)
NM_001937 NM_003245
LP4
A, B
Extracellular matrix Epithelial/ Protein modification Membrane protein/Receptor Extracellular matrix
NM_022003
CAGGCAGCATGGACCTCAGTC/TAGCTGAAGCCTTGCCGGTTC CGCTGCTACTGATGTGGCCAGCAGA/ GCTTCTTTCCTGCGGGGTAGCATCCAAC AGCAATAACTCCATGGGCTCTGG/CAACGGAACAGGAGTCCTTCAACT
NM_002404
GATCCGAGGAGATGCTCTGGAGA/GGTGGTCATGTCACAGAAGACGG
Metabolism Anti-apoptotic Immune response/ Development
NM_001387 NM_001831 NM_001311
GCGAGGTCACCTCTCACTCTTCTG/GGATGTGCCAGGATCGAGAGAAT AGGCTCAGCAGGCCATGGA/TCACACTGGTCCTTCATCCGC ACTTCGCCGAGAGGGTGACC/CCAAACATGGCTGCGTAGCAG
Interferon-induced
NM_001565
AAGTGGCATTCAAGGAGTACCTCTC/CATTGTAGCAATGATCTCAACACGT
Extracellular matrix
NM_005562
CTACGGATCACAGCTCCCTTGATG/TGAGATTCCGCAGTAACCTTCGATAC
Secretory protein
NM_006350
ACCGCAATGAATGTGCACTCCTAA/CAGTAGGCATTATTGGTCTGGTCCAC
Secretory protein
NM_144646
AGTGTGCCCGGATTACTTCCAGG/CAATGGTGAGGTGGGATCAGAGAT
Nuclear receptor
NM_002899
GATCTGACAGGCATAGATGACCG/CTTCCACTCTCATCTCTAGGTGCA
Secretory protein
NM_198966
CGGTTAGCCCTGTTCCACGAAC/CAAATCAGAGGCGCTTCCTCTG
Secretory protein
NM_000358
TACAAGGAATTTGCTTCGGAACCA/CGCGATGCAGCTGTTCTCAATG
Epidermal specific
NM_033255
ATTACTGGAACTGAAACGGCAGCA/CATATTCCAACAGCCTCCAGATTG
Interferon-induced
NM_006417
GGAAGCGGCTTAGCCTTCTC/TCCTGGTAACTCTCTTCCGCAT
Interferon-induced
NM_001549
GGAGACGGAATGTTATCAGACACCA/CATGTTGCACAGCAGTGTCTTCAG
Ubiquitine proteasome Metabolism
NM_017414 NM_000067
CAGAGCTTGGATTTCAGCCAGAT/CAGCATTCCGGATGTAGACACAG CTGATGGACTGGCCGTTCTAGGTA/GATTCAGGAAGGAGGCCACGAG
LP15 LP22 LP2 LP16 SC1
A
SC5
A
SC13
A
SC43
A, B
SC3 SC27 SC41 SC9 SC6 SC44 SC10 SC7 a b
FXYD domain containing ion transport regulator 6 (FXYD6) Microfibrillar-associated protein 4 (MFAP4). Dihydropyrimidinase-like 3 (DPYSL3) Clusterin (CLU) Cysteine-rich protein 1 (intestinal) (CRIP1) Chemokine (C-X-C motif) ligand 10 (CXCL10) Laminin, gamma 2 (LAMC2), transcript variant 1 Follistatin (FST), transcript variant FST317 Immunoglobulin J polypeptide, linker protein (IGJ) Retinol binding protein 1, cellular (RBP1) Parathyroid hormone-like hormone (PTHLH) Transforming growth factor, beta-induced, 68 kDa (TGFBI) Epithelial stromal interaction 1 (EPSTI1) Interferon-induced protein 44 (IFI44) Interferon-induced protein with tetratricopeptide repeats 3 (IFIT3) Ubiquitin specific protease 18 (USP18) Carbonic anhydrase II (CA2)
Genes up-regulated in leukoplakia (LP) or oral squamous cell carcinoma (SC). A – 11 predictor genes that can discriminate between LPs and OSCCs; B – 7 predictor genes that can discriminate between mild- and moderate-dysplasias.
N. Kondoh et al.
Gene IDa
Gene expression signatures that can discriminate oral LP subtypes and SCC LP tissues (see Table 1) further demonstrated that among these differentially expressed marker gene candidates, 33 were significantly overexpressed in OSCC, whereas 30 genes were up-regulated in LP tissues (P < 0.06). In order to minimize the potential effects of any anatomical differences in the oral cavity upon gene expression, we confirmed the differential expression levels from the microarrays and QRTPCR experiments in a further 9 OSCC and 12 LP tissues derived from tongue only (data not shown). We focused finally on 27 differentially expressed marker genes, among which 15 (denoted as LP) were overexpressed in LPs compared to OSCCs, whilst 12 genes were up-regulated in OSCCs and in some moderate- to severedysplasias (denoted as SC) (Table 2). Among these, the expression of 24 marker genes was significantly altered between OSCC and LP tissues (P < 0.05), irrespective of any anatomical differences. Although, the expression of SC 43 was not significantly altered between OSCCs and LPs (P = 0.98), we also took the expression of this gene into account, since it was strongly suppressed in all hyperplasias and mild dysplasias and may enable the classification of dysplasias into subgroups. We also took the expression of SC1 and SC44 into account as the SC1 transcripts seem to be significantly elevated in a specific population of OSCC and because both SC44 and SC1 mRNAs are significantly suppressed in most LPs except for Mo dys 47, 48. The expression patterns of these 27 marker genes are summarized in Figure 1.
459
Unsupervised classification of oral squamous cell carcinoma and leukoplakia tissue using our newly identified marker genes Based on the expression profiles of the 27 marker genes that we identified, unsupervised hierarchical clustering was undertaken to classify OSCC and LP tissue samples. As shown in Figure 1, 27 OSCC and 19 LP sample tissues were separated into two distinct clusters (I and II). Most LPs were classified as cluster I, which was further divided into subclusters Ia and Ib. Subcluster Ia was mainly comprised of LPs of a relatively well differentiated phenotype, including hyperplasia and mild dysplasia, whereas subcluster Ib consisted of a mixture of both OSCC and moderate- to severe-dysplasias. With the exception of Mo dys 33, cluster II consisted almost entirely of OSCCs. These results of the unsupervised clustering demonstrate that 27 marker genes could harbor principal components essential for the classification of the dysplasia–carcinoma transition.
Supervised classification The aim of our present study is to establish a clear distinction between LP and OSCC, and among dysplasia grades, based upon a molecular classification. In order to identify marker gene sets that could discriminate between OSCCs and LPs, a supervised classification approach using LDA13 was performed. The expression of these 27 marker genes
Figure 1 Unsupervised clustering of OSCC and LP tumors, based on marker gene expression data. The clustered tissue groups Ia, Ib and II are indicated at the bottom. Gene IDs corresponding to Table 2 are listed on the right. The tumor types (on the top) correspond to the information in Table 1. The relative gene expression levels are represented as color intensity on a logarithmic scale (high, red; low, green; medium, black). The statistical significance (Student t-test on both sides) of the differential gene expression levels between OSCCs and LPs is denoted on the right. The asterisk indicates that the OSCC tumors 56 and 57 were derived from the same patient, before and after 5-FU treatment, respectively.
460
N. Kondoh et al.
was analyzed among 27 OSCCs and 19 LPs, including hyperplasias and dysplasias (Table 1). This approach involves parameter (gene) selection by the use of a stepwise increment and genetic algorithm.14 In the former method, starting from a null model, the most effective variable is added to a previous candidate set of variables in each step. When the score of the model reaches a maximum, the increment stops. When the Fisher’s ratio was employed as a score, a model with 11 parameters was selected as the best model. The stability of this model was examined by the leave-oneout cross validation (loo) method and compared with that of a stepwise increment method and a genetic algorithm (fit) (Fig. 2). The LDA score for each sample is given as the following linear discrimination function: Score = 0.231 (LP1) + 0.223 (LP4) 0.0537 (LP28) 0.0734 (LP21) 0.892 (LP12) 0.0617 (LP29) 0.282 (LP8) + 0.0122 (SC1) + 0.0669 (SC13) 0.0684 (SC43) 0.0366 (SC5). The gene expression and LDA scores are summarized in Figure 3. With the exception of OSCC16 and Mo dys 33, the sample sets were correctly discriminated by the 11 marker genes selected by the Fisher’s ratio. The optimal prediction accuracy with this set of 11 genes was 97.8% (fit) and 97.8% (loo), respectively. Using the same approach, we also attempted to distinguish mild dysplasias and hyperplasias from higher grade (>moderate) dysplasias and OSCCs. As shown in Figure 4, we reached the optimal prediction using a set of seven genes with an accuracy of 97.8% (fit) and 95.6% (loo), respectively. The LDA score for each sample is given as the following linear discrimination function: Score = 0.372 (LP5) + 0.495 (LP4) 0.576 (LP19) + 0.0380 (LP28) 0.246 (LP17) 0.464 (LP27) + 0.0889 (SC43). The predictor genes are summarized in Table 2.
Discussion In order to identify marker gene candidates, we first screened differential gene expression between OSCCs and LPs by using cDNA microarray. Although microarrays are a powerful technology for measuring genome-wide expression, there is the difficulty in reproducibly quantifying sources. We, therefore, verified the expression of all 118 marker gene candidates by using QRT-PCR of 27 OSCC and 19 LP tissues. We first identified 27 marker genes representing the differential expression between LPs and OSCCs. To further identify marker gene sets and establish appropriate algorism that can sufficiently discriminate between OSCCs and LPs, a supervised classification approach using LDA was performed. After intensive parameter selection and cross validation, we reached an optimal prediction with a set of 11 genes. According to our classification, however, a LP sample, Mo dys 33, was classified as an OSCC. Interestingly, Mo dys 33 had been clinically diagnosed and treated as an OSCC, because of its cancerous macroscopic appearances and history (data not shown). Therefore, a molecular diagnosis based on these 11 genes may, in part, predict clinical features or genetical background of the patient rather than histological grades. Using the same approach, we could also distinguish mild dysplasias, hyperplasias, and perhaps normal mucosa from higher grade (>moderate) dysplasias and OSCCs with a further set of seven genes. Using differential display and northern blot analyses, we previously identified several differentially expressed genes between LPs and OSCCs.4 Among these genes, keratin-13 and TG-3 are also included in the present 27 marker genes, however in this experiment, the differential expression of keratin-14 and -17 remained undetectable level in the cDNA microarray analysis; the expression of keratin-4 did not show significant difference in the QRT-PCR analysis using tissues derived from tongue only (not shown). These discrepancies in the results are mainly attributable to the technical difference, and in part, the increased sample
(%) 100
Accuracy of prediction
fit loo 90
80
70
1
6
11 16 Parameter number
21
26
Figure 2 Stability of the model examined by leave-one-out cross validation (loo) methods. The results were compared with those of both a stepwise increment method and genetic algorithm (fit). Each parameter represents marker gene.
Gene expression signatures that can discriminate oral LP subtypes and SCC
461
Figure 3 Expression profiles of 11 predictor genes and discrimination by LDA scores. The relative gene expression levels are as indicated in Fig. 1. The LDA score for each sample is given by the linear discrimination function (see results): d – OSCCs; s – LPs.
Figure 4 Expression profile of seven predictor genes and discrimination by LDA scores. The gene expression and LDA scores are as described for Fig. 3: d – OSCCs and severe- and moderate-dysplasias; s – mild dysplasias and hyperplasias.
number. Anyhow, among the verified 27 marker genes, 11and 7-predictor sets are optimal to keep the accuracy of diagnosis, demonstrating that the combination usage, but not simply the number, of marker genes is essential to construct appropriate predictor sets. The rationality of some predictor genes is supported by the following observations. The down-regulation of TG-3
expression is a common event in the development of human esophageal, and head and neck cancers.8,9,15 The loss of the C1orf10 gene (LP19) product, also known as a member of the fused gene family, cornulin,16 has also been suggested to contribute to the development of OSCC.17 Furthermore, the matrix Gla protein (MGP; LP1), encoding a vitamin K-dependent extracellular matrix (ECM) protein,
462 is associated with transformation and de-differentiation of colonic epithelial cells.18 The expression of MGP is inversely correlated with tumor size, lymph-node metastasis and tumor grade in renal cell carcinoma, though the expression is significantly elevated in the tumor compared to matched normal tissue.19 The expression of MGP is also elevated in breast cancer.20 Therefore, MGP may set up scaffolding for primary tumor cells, by contrast acquisition of metastatic phenotype may be correlated with downregulation of this ECM molecule. The decreased expression of dermatopontin (DPT; LP5) is associated with the development of leiomyoma and keloids21 and FosB (LP27) is down-regulated in poorly differentiated breast carcinomas.22 Hence, the reduction of these genes may be an important manifestation of the pre-malignant to malignant transition boundary. On the other hand, Laminin gamma 2 (LAMC2; SC5) is overexpressed in head-and-neck carcinomas.23 Follistatin (FST; SC13) is overexpressed in liver tumors,24 and is suggested to be an antagonist of activin-mediated growth inhibition.25 It has also been reported that the chemokine ligand 10 (CXCL10; SC1) may confer anti-malignant properties in breast cancer.[26] Moreover, in response to IFNgamma, CXCL10 is induced via the NF-kappa B pathway in OSCC cells.[27] In conclusion, our present results demonstrate that the expression profile of the marker genes that we have identified through cDNA microarrays and QRT-PCR could prove to be a useful diagnostic marker and also provide valuable information for our further understanding of oral malignancy.
References 1. Lopez M, Aguirre JM, Cuevas N, Anzola M, Videgain J, Aguirregaviria J, Castro A, de Pancorbo MM. Gene promoter hypermethylation in oral rinses of leukoplakia patients – a diagnostic and/or prognostic tool? Eur J Cancer 2003;39: 2306–9. 2. Bloor BK, Seddon SV, Morgan PR. Gene expression of differentiation-specific keratins in oral epithelial dysplasia and squamous cell carcinoma. Oral Oncol 2001;37:251–61. 3. Forastiere A, Koch W, Trotti A, Sidransky D. Head and neck cancer. N Engl J Med 2001;345:1890–900. 4. Ohkura S, Kondoh N, Hada A, Arai M, Yamazaki Y, Sindoh M, et al. Differential expression of the keratin-4, -13, -14, -17 and transglutaminase 3 genes during the development of oral squamous cell carcinoma from leukoplakia. Oral Oncol 2005;41:607–13. 5. Schmalbach CE, Chepeha DB, Giordano TJ, Rubin MA, Teknos CR, Bradford CR, et al. molecular profiling and the identification of genes associated with metastatic oral cavity/pharynx squamous cell carcinoma. Arc Otolaryngol Head Neck Surg 2004;130:295–302. 6. Carinci F, Lo Muzio L, Piattelli A, Rubini C, Palmieri A, Stabellini G, et al. Genetic portrait of mild and severe lingual dysplasia. Oral Oncol 2005;41:365–74. 7. Ginos MA, Page GP, Michalowicz BS, Patel LJ, Volker SE, Pambuccian SE, et al. Identification of a gene expression signature associated with recurrent disease in squamous cell carcinoma of the head and neck. Cancer Res 2004;64:55–63. 8. Whipple ME, Mendez E, Farwell DG, Agoff SN, Chen C. A genomic predictor of oral squamous cell carcinoma. The Laryngoscope 2004;114:1354–64.
N. Kondoh et al. 9. Roepman P, Weesels LFA, Kettelarij N, Kemmeren P, Miles AJ, Lijinzaad P, et al. An expression profile for diagnosis of lymph node metastases from primary head and neck squamous cell carcinoma. Nature Genet 2005;37(2):182–6. 10. Iizuka N, Oka M, Yamada-Okabe H, Mori N, Tamesa T, Okada T, et al. Comparison of gene expression profiles between hepatitis B virus- and hepatitis C virus-infected hepatocellular carcinoma by oligonucleotide microarray data on the basis of a supervised learning method. Cancer Res 2002;62(14): 3939–44. 11. Ramaswamy S, Tamayo P, Rifkin R, Mukherjee S, Yeang C-H, Angelo M, et al. Multiclass cancer diagnosis using tumor gene expression signature. Proc Natl Acad Sci USA 2001;98:15149–54. 12. Arai M, Kondoh N, Imazeki N, Hada A, Hatsuse K, Kimura F, et al. Transformation-associated gene regulation by ATF6 during hepatocarcinogenesis. FEBS Lett 2006;580:184–90. 13. Fisher NA. The use of multiple measurements in taxonomic problems. Ann Eugen 1936;7(Part II):179–88. 14. Goldberg DE. Genetic algorithm in search, optimization and machine learning. Reading (MA): Addison-Wesley; 1989., p. 1– 26. 15. Chen BS, Wang MR, Xu X. Transglutaminase-3, an esophageal cancer-related gene. Int J Cancer 2000;88(6):862–5. 16. Contzler R, Favre B, Huber M, Hohl D. Cornulin, a new member of the ‘‘Fused gene’’ family, is expressed during epidermal differentiation. J Invest Dermatol 2005;124:990–7. 17. Imai FL, Uzawa K, Nimura Y, Moriya T, Imai MA, Shiiba M, et al. Chromosome 1 open reading frame 10 (C1orf10) gene is frequently down-regulated and inhibits cell proliferation in oral squamous cell carcinoma. Int J Biochem Cell Biol 2005;37: 1641–55. 18. Fan C, Sheu D, Fan H, Hsu K, Allen Chang C, Chan E. Down-regulation of matrix Gla protein messenger RNA in human colorectal adenocarcinomas. Cancer Lett 2001;165: 63–9. 19. Levedakou EN, Strohmeyer TG, Effert PJ, Liu ET. Expression of the matrix Gla protein in urogenital malignancies. Int J Cancer 1992;52:534–7. 20. Chen L, O’Bryan JP, Smith HS, Liu E. Overexpression of matrix Gla protein mRNA in malignant human breast cells: isolation by differential cDNA hybridization. Oncogene 1990;5:1391–5. 21. Catherino WH, Leppert PC, Stenmark MH, Payson M, PotlogNahari C, Nieman LK, et al. Reduced dermatopontin expression is a molecular link between uterine leiomyomas and keloids. Genes Chrom. Cancer 2004;40:204–17. 22. Milde-Langosch K, Kappes H, Riethdorf S, Loning T, Bamberger AM. FosB is highly expressed in normal mammary epithelia, but down-regulated in poorly differentiated breast carcinomas. Breast Cancer Res Treat 2003;77:265–75. 23. Patel V, Aldridge K, Ensley JF, Odell E, Boyd A, Jones J, et al. Laminin-gamma 2 overexpression in head-and-necksquamous cell carcinoma. Int J Cancer 2002;99:583–8. 24. Rossmanith W, Chabicovsky M, Grasl-Kraupp B, Peter B, Schausberger E, Schulte-Hermann R. Follistatin overexpression in rodent liver tumors: a possible mechanism to overcome activin growth control. Mol Carcinog 2002;35:1–5. 25. Stove C, Vanrobaeys F, Devreese B, Van Beeumen J, Mareel M, Bracke M. Melanoma cells secrete follistatin, an antagonist of activin-mediated growth inhibition. Oncogene 2004;23:5330–9. 26. Goldberg-Bittman L, Neumark E, Sagi-Assif O, Azenshtein E, Meshel T, Witz IP, et al. The expression of the chemokine receptor CXCR3 and its ligand, CXCL10, in human breast adenocarcinoma cell lines. Immunol Lett 2004;92:171–8. 27. Hiroi M, Ohmori Y. Constitutive nuclear factor kappa B activity is required to elicit interferon-gamma-induced expression of chemokine CXC ligand 9 (CXCL9) and CXCL10 in human tumour cell lines. Biochem J 2003;376:393–402.