Rapid Communication Pancreatology 2003;3:169–178 DOI: 10.1159/000070087
Received: November 29, 2002 Accepted after revision: February 7, 2003
Systematic Isolation of Genes Differentially Expressed in Normal and Cancerous Tissue of the Pancreas
Robert Grützmann (a), Christian Pilarsky (a), Eike Staub (b), Armin O. Schmitt (b), Melanie Foerder (a), Thomas Specht (b), Bernd Hinzmann (b), Edgar Dahl (b), Ingo Alldinger (a), Andre Rosenthal (b), Detlef Ockert (a), Hans-Detlev Saeger (a)
(a) Department of Visceral, Thoracic and Vascular Surgery, University Clinic Carl Gustav Carus, Technical University Dresden, Dresden, and (b) metaGen Pharmaceuticals, Berlin, Germany
Key Words Pancreatic cancer · Gene · Expression profiling
Abstract Background: There is increasing knowledge about the genetic basis of pancreatic cancer (PaCa). Tumor suppressor genes (TSGs; e.g. p53 and DPC4) and oncogenes (e.g. K-ras) have been shown to be involved in the development of PaCa. However, the extent of chromosomal changes (gains and losses) implies that many more genes may be involved in the multistep progression of PaCa. Identification of these genes is essential for understanding the molecular events in the development of PaCa. Methods: We assembled public and proprietary libraries of more than 4 million expressed sequence tags using newly developed software tools. Results: We identified a total of 249 genes with specific expression patterns in normal and cancerous tissue of the pancreas. Of these, 27 genes were found to be preferentially expressed in normal tissue of the pancreas, while 222 genes showed significant upregulation of expression in
R.G. and C.P. contributed equally to this work.
© 2003 S. Karger AG, Basel and IAP 1424–3903/03/0032–0169$19.50/0
PaCa. Of the 249 genes, 232 (93.2%) were found to represent known human genes or putative human homologues of genes characterized previously in other species, while 17 (6.8%) represent putative new genes. Conclusion: These genes may represent a valuable source to identify novel TSGs and oncogenes involved in the carcinogenesis of PaCa. Copyright © 2003 S. Karger AG, Basel and IAP
Introduction
Pancreatic cancer (PaCa) is an important cause of malignancy-related death. In the United States it is the fifth leading cause of cancer death, accounting for approximately 30,000 deaths annually [1]. The incidence rate has increased to 8.9/100,000 within the last 30 years. Today, it is the eighth most common cancer, with the lowest overall 5-year relative survival rate of any tumor type [2]. Besides surgery, there is no effective therapy, and 96% of patients with PaCa die in the first 6 months after diagnosis. The discovery of novel therapeutically useful genes and diagnostic markers may fuel medical progress and eventually lead to novel drugs and diagnostic strategies.
Dr. Robert Grützmann, University Clinic Carl Gustav Carus, Department of Visceral, Thoracic and Vascular Surgery, Fetscherstrasse 74, D–01307 Dresden (Germany), Tel. +49 351 458 4395, Fax +49 351 458 2742
Fig. 1. Schematic drawing of our strategy of database mining and in silico expression analysis in PaCa.
Known molecular changes in PaCa include loss of function of the tumor suppressor genes (TSGs) p53, p16 and DPC4. Other known TSGs, in some cases deleted or mutated in PaCa, include the retinoblastoma gene and the BRCA2 gene. Activating mutations of the proto-oncogene K-ras have been found in up to 90% of cases of PaCa, which is the highest incidence among various tumor types [for a review, see ref. 3]. Gain and loss of entire chromosomes have frequently been observed in PaCa cells, including additional copies of chromosomes 20 and 7 as well as complete loss of chromosomes 18, 13, 12, 17 and 6 [4, 5]. In addition, chromosomal aberrations are also common in PaCa; amplifications often affect 20q, 11q, 17p, 19q, 8q, 12p, 4q and 20p, while allelic losses have been found for 3p, 6p, 6q, 8p, 10q, 12q, 13q, 16p, 18p, 21q and 22q [6–8]. These data suggest the existence of as yet unidentified TSGs and oncogenes in these regions. Identification of these genes is essential as the basis for the development of new treatments and diagnostic modalities, as well as for an understanding of the development of PaCa. A number of methods have been developed to detect changes in gene expression between normal and disease tissue, including differential display PCR [9], representational difference analysis [10], serial analysis of gene expression [11, 12] and subtractive hybridization [13], as well as various RNA profiling methods using high-density gene chips [14]. The expressed sequence tag (EST) approach is another powerful method of measuring gene expression profiles in different tissues [15]. ESTs are single-pass reads from randomly selected cDNA clones [16]. EST libraries of many human tissues, both normal and cancerous, have been established, and more than 3 million ESTs have been deposited in the public domain (dbEST), while more than 6 million ESTs are contained in
proprietary EST databases (Human Genome Sciences, Incyte Genomics). Database entries of EST sequences carry header information that identifies their source tissue. Recently, we developed an in silico method for RNA expression profiling which uses EST libraries and allows us to derive electronic Northern blot profiles over about 20 tissue pairs [17]. For each cDNA sequence, the method counts the ESTs originating from benign or cancer tissue and automatically corrects for the EST pool size of each library. The method was designed to identify genes which are significantly up- or downregulated in cancer tissue. Here, we applied this method to EST libraries obtained from normal and cancer tissue of the pancreas. By profiling 4 million ESTs, of which approximately 60,000 were derived from normal pancreatic tissue and 20,000 from several tumors of the pancreas, we identified 249 genes differentially expressed in PaCa. The panel of genes resulting from this analysis may serve as a unique resource for therapeutic or diagnostic candidate targets [18].
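The counting step described above can be illustrated with a minimal Python sketch. It is not the authors' software: the record layout (dictionaries with `tissue` and `state` keys) and the function names are hypothetical stand-ins for whatever annotation the EST headers actually carry. The pancreas pool sizes and hit counts are the ones listed for metalloproteinase 11 in table 1.

```python
from collections import Counter

def tally_hits(matching_ests):
    """Count ESTs matching one cDNA per (tissue, state) pair.

    Each record is a dict with hypothetical keys 'tissue' and 'state'
    ('normal' or 'tumor'); real EST headers are formatted differently.
    """
    return Counter((est["tissue"], est["state"]) for est in matching_ests)

def relative_abundance(hits, pool_sizes):
    """Normalize raw hit counts to hits per million ESTs in each pool."""
    return {
        key: 1e6 * hits.get(key, 0) / size
        for key, size in pool_sizes.items()
        if size > 0
    }

# Pancreas values for metalloproteinase 11 as given in table 1.
matches = [{"tissue": "pancreas", "state": "tumor"}] * 4
pools = {("pancreas", "normal"): 60513, ("pancreas", "tumor"): 21810}

hits = tally_hits(matches)
abundance = relative_abundance(hits, pools)
print(abundance)
```

Expressing abundance per million ESTs is only a convenience for reading the numbers; any common scale gives the same comparison between pools.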
Materials and Methods
EST Databases
We used two EST databases for in silico profiling of expression patterns: dbEST (http://www.ncbi.nlm.nih.gov/dbEST) and the proprietary database LifeSeq® (Incyte Pharmaceuticals, Palo Alto, Calif., USA). These databases contained approximately 1,000,000 (dbEST) and 3,000,000 (Incyte) ESTs at the time of the study.
In silico Profiling of Expression Pattern
We applied a method for expression profiling of genes based on EST counting and normalization developed recently in our laboratory [17]. The procedure has been named AUTEX (automated extension of partial DNA sequences). Figure 1 summarizes the rationale of
Grützmann et al.
Table 1. Electronic Northern blot of metalloproteinase 11

| Tissue | Normal tissue: hits | Normal tissue: pool size | Tumor tissue: hits | Tumor tissue: pool size | p value | Expression ratio |
| --- | --- | --- | --- | --- | --- | --- |
| Bladder | 0 | 25,643 | 12 | 42,553 | 0.0051 | – |
| Brain | 1 | 184,386 | 2 | 100,222 | 0.285 | 0.27 |
| Breast | 1 | 120,725 | 14 | 67,582 | 5.86 × 10⁻⁶ | 0.04 |
| Colon | 3 | 52,193 | 3 | 35,112 | 0.69 | 0.67 |
| Endocrine tissue | 0 | 62,283 | 2 | 61,769 | 0.248 | – |
| Heart | 0 | 98,508 | 0 | 7,275 | – | – |
| Kidney | 0 | 50,214 | 2 | 20,741 | 0.0854 | – |
| Liver | 0 | 21,510 | 0 | 15,763 | – | – |
| Lung | 1 | 102,742 | 2 | 54,085 | 0.275 | 0.26 |
| Skeletal muscle | 0 | 58,318 | 0 | 27,070 | – | – |
| Ovary | 0 | 33,687 | 1 | 41,736 | 1 | – |
| Peripheral blood leukocytes | 1 | 127,720 | 0 | 941 | 1 | – |
| Pancreas | 0 | 60,513 | 4 | 21,810 | 0.00493 | – |
| Prostate | 0 | 106,099 | 0 | 76,769 | – | – |
| Skin | 0 | 27,206 | 0 | 1,180 | – | – |
| Small intestine | 0 | 36,435 | 2 | 9,383 | 0.0419 | – |
| Stomach/esophagus | 0 | 13,800 | 0 | 12,120 | – | – |
| Testis | 0 | 24,903 | 0 | 16,899 | – | – |
| Uterus | 5 | 73,176 | 1 | 21,735 | 1 | 1.49 |
| All tissues | 17 | 1,449,323 | 45 | 659,418 | 2.88 × 10⁻¹¹ | 0.17 |
Expression levels in 19 normal and tumor tissues are shown for metalloproteinase 11. The abundance is given as the number of matching ESTs found in a pool of tissue-specific ESTs. The expression ratio is defined as the ratio of the relative abundances (number of matching ESTs divided by the pool size). The relative abundance of normal tissue is divided by that of the tumor. Ratios smaller than 1 indicate upregulation in tumor tissue, while those larger than 1 indicate downregulation in tumor tissue. The p value, computed by Fisher’s exact test, expresses the statistical significance of the differential expression.
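The expression ratio defined in this legend can be recomputed from the hit counts and pool sizes of table 1. The short Python sketch below is an illustration only; the function name is ours, and returning `None` when either hit count is zero is an assumption inferred from the pattern of missing ratios in the table, not a rule stated by the authors.

```python
def expression_ratio(normal_hits, normal_pool, tumor_hits, tumor_pool):
    """Relative abundance in normal tissue divided by that in tumor tissue.

    Returns None when either hit count is zero (assumption: table 1 lists
    no ratio for such tissues).
    """
    if normal_hits == 0 or tumor_hits == 0:
        return None
    return (normal_hits / normal_pool) / (tumor_hits / tumor_pool)

# Values taken from table 1 (metalloproteinase 11).
brain = expression_ratio(1, 184386, 2, 100222)
breast = expression_ratio(1, 120725, 14, 67582)
uterus = expression_ratio(5, 73176, 1, 21735)
pancreas = expression_ratio(0, 60513, 4, 21810)

print(round(brain, 2), round(breast, 2), round(uterus, 2), pancreas)
```

Ratios below 1 (brain, breast) correspond to upregulation in tumor tissue, and the value above 1 (uterus) to downregulation, in line with the legend.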
this in silico approach. First, an iterative BLAST search and assembly procedure was used to form cDNA contigs of maximal length from single EST sequences using the dbEST and LifeSeq databases. More than a hundred thousand different cDNA contigs were obtained. For each cDNA contig, we analyzed the tissue distribution of the individual ESTs by comparing them to all 4 million ESTs sorted into pairs of benign and cancer tissue across 19 different human tissues (bladder, brain, breast, colon, endocrine tissue, heart, kidney, liver, lung, skeletal muscle, ovary, peripheral blood leukocytes, pancreas, prostate, skin, small intestine, stomach/esophagus, testis and uterus). The result is an electronic Northern blot in which the EST counts for individual tissues are normalized by the EST pool size for each tissue. Fisher’s exact test was used to assess the significance of differential expression, and a cDNA sequence was regarded as differentially expressed if the p value was smaller than 0.05 for a particular tissue. Table 1 shows one example, metalloproteinase 11, which is overexpressed in cancer of the pancreas, bladder and breast. Finally, all cDNA sequences showing differential expression in benign and cancer tissue of the pancreas, defined by p values smaller than 0.05, were analyzed by BLAST [19]. The stringency parameters defining sequence homology were an E value of 10⁻⁴ and 95% sequence identity.
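The significance test described above can be sketched in plain Python as follows. This is a minimal illustration, not the authors' implementation: it assumes a two-sided Fisher's exact test on a 2 × 2 table of matching versus non-matching ESTs in the tumor and normal pools, with the number of non-matching ESTs taken as pool size minus hits. Function names are ours.

```python
from math import exp, lgamma

def _log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)

def fisher_exact_two_sided(tumor_hits, tumor_pool, normal_hits, normal_pool):
    """Two-sided Fisher's exact test for hits in two EST pools.

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed one.
    """
    total_hits = tumor_hits + normal_hits
    total_pool = tumor_pool + normal_pool

    def prob(k):
        # Probability that k of the total_hits ESTs fall into the tumor pool.
        return exp(
            _log_comb(tumor_pool, k)
            + _log_comb(normal_pool, total_hits - k)
            - _log_comb(total_pool, total_hits)
        )

    k_min = max(0, total_hits - normal_pool)
    k_max = min(total_hits, tumor_pool)
    p_observed = prob(tumor_hits)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    p_value = sum(
        p for p in (prob(k) for k in range(k_min, k_max + 1))
        if p <= p_observed * tolerance
    )
    return min(1.0, p_value)

# Metalloproteinase 11 counts from table 1.
p_pancreas = fisher_exact_two_sided(4, 21810, 0, 60513)
p_kidney = fisher_exact_two_sided(2, 20741, 0, 50214)
print(f"pancreas p = {p_pancreas:.5f}, kidney p = {p_kidney:.4f}")
```

Under these assumptions the pancreas and kidney values come out close to the p values listed in table 1 (0.00493 and 0.0854), and only the pancreas value passes the 0.05 threshold used in the study.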
Isolation of Genes Expressed in Pancreatic Cancer
Results
Using the methods described above, 249 candidate genes which are differentially expressed between normal and cancerous tissue of the pancreas in electronic Northern blots were identified. Twenty-seven of these genes (10.8%) were found to be downregulated in PaCa, while 222 genes (89.2%) were upregulated in pancreatic cancer tissue. All genes exhibited an expression ratio of at least 2, and their p value computed by Fisher’s exact test was below 0.05. The 249 differentially expressed cDNAs were compared with a recent Unigene database of mRNA sequences, EST assemblies and EST singletons (build #143). Based on BLAST sequence similarity searches followed by visual inspection of the results, 232 out of the 249 cDNAs (93.2%) were identified in the Unigene database. The remaining 17 sequences (6.8%) represent putative unknown genes. We grouped the known genes according
Fig. 2. Distribution of the genes which were differentially expressed in PaCa and normal tissue of the pancreas.
to their cellular function (fig. 2), either based on direct information about the gene from the literature or indirect evidence from known protein domains which we detected in 6-frame translations of the cDNAs using the Pfam database (http://www.sanger.ac.uk/Software/Pfam/). A complete classification of all known candidate genes is shown in table 2.
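The 6-frame translation mentioned above, which precedes the protein domain search, can be sketched as follows. This is a generic illustration using the standard genetic code and is not the authors' pipeline; the function names are ours, unknown codons are rendered as `X` by our own convention, and the Pfam search itself is not shown.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )

def six_frame_translations(seq):
    """Return the three forward and three reverse-strand translations."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames

frames = six_frame_translations("ATGGCCTAA")
for name in sorted(frames):
    print(name, frames[name])
```

Translating all six frames is useful for EST-derived contigs because neither the strand nor the reading frame of a partial cDNA is known in advance.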
Discussion
Knowledge about the genetic basis of PaCa has increased in recent years [20]. Several genetic alterations have been described. The most frequent ones are mutations of K-ras, p16, p53, DPC4 and BRCA2 and the overexpression of HER-2/neu [3]. Furthermore, it was shown that more than 40% of the chromosomal arms suffer loss of heterozygosity in PaCa [6]. Therefore, many more genes may be involved in the carcinogenesis of PaCa. Different methods, such as differential display PCR [21] and representational difference analysis [10], as well as spotted cDNA arrays [22, 23], have been applied to search for differentially expressed genes in PaCa. Our high-throughput in silico profiling method enables the identification of complex alterations of gene expression which might be responsible for the development of the phenotype of PaCa cells. In addition, our method does not require pancreatic tissue samples, which are difficult to obtain at a quality suitable for RNA profiling. The majority of the genes (222) were found to be upregulated in PaCa, while only 27 genes were downregulated in the tumor. 232 genes
showed significant similarity to known human genes or genes from other species, while 17 genes represent novel genes with unknown function. To our knowledge, this is the largest number of differentially expressed genes identified by one single method in PaCa so far. Some of the genes have already been shown by other groups to be associated with PaCa. These include alpha 1 actinin (ACTN1), osteonectin (SPARC), lectin 4 (LGALS4), thymosin beta-10, syndecan 1 (SDC1) and galectin-3. Integrin beta 4 binding protein has likewise been described by other groups as associated with PaCa. It was found to be upregulated in several other carcinomas, and alterations in integrins seem to be early events in the transition from benign to malignant colorectal phenotypes. This gene is also abundantly expressed in epithelia [24, 25]. Our data show that in PaCa, this gene was overexpressed more than 11-fold compared to normal pancreatic cells (p < 0.05). Matrix metalloproteinase 11 (stromelysin 3) is another gene that is strongly upregulated in PaCa, as shown by our electronic Northern blots (table 1). Stromelysin 3 is an extracellular proteinase which is predominantly expressed in a variety of invasive human carcinomas, including cancer of the breast and colon and PaCa. This protease can modulate cancer progression and invasion by remodeling extracellular matrix and probably by inducing the release of the necessary microenvironmental factors [26–28]. These genes represent cell matrix-associated molecules and seem to play an important role in tumor growth and cancer invasion.
Table 2. Human genes differentially expressed in cancerous and normal tissue of the pancreas with proposed function (according to BLAST searches), functional category and direction of differential expression in PaCa

| Putative function | Direction of expression |
| --- | --- |
| Signal transduction | |
| Homo sapiens estrogen regulated gene 1 (ERG-1) | down |
| Homo sapiens gastric inhibitory polypeptide receptor (GIPR) | down |
| Homo sapiens glycoprotein 2 (zymogen granule membrane) (GP2) | down |
| Homo sapiens protein serine kinase (PSKH1 gene) | down |
| Homo sapiens regenerating islet-derived 1 beta (pancreatic stone protein, pancreatic thread protein) (REG1B) | down |
| Homo sapiens secretagogin (SECRET) | down |
| Homo sapiens adenylate cyclase 3 (ADCY3) | up |
| Homo sapiens BAI1-associated protein 2 (BAIAP2), transcript variant 1 | up |
| Homo sapiens CAAX box 1 (CXX1) | up |
| Homo sapiens casein kinase 1 epsilon (CSNK1E) | up |
| Homo sapiens CD20-like precursor (LOC64166) | up |
| Homo sapiens diacylglycerol kinase delta (130 kD) (DGKD) | up |
| Homo sapiens differentially expressed in hematopoietic lineages (GW112) | up |
| Homo sapiens FLJ22356 fis, clone HRC06345 | up |
| Homo sapiens G protein beta subunit-like (GBL) | up |
| Homo sapiens gamma-aminobutyric acid (GABA) A receptor, pi (GABRP) | up |
| Homo sapiens guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 2 (GNAI2) | up |
| Homo sapiens guanine nucleotide-releasing factor 2 (specific for crk proto-oncogene) (GRF2) | up |
| Homo sapiens hepatocellular carcinoma-associated antigen 112 (HCA112) | up |
| Homo sapiens KIAA0620 protein | up |
| Homo sapiens KIAA1178 protein | up |
| Homo sapiens KIAA1437 protein | up |
| Homo sapiens KIAA1663 protein | up |
| Homo sapiens lectin, galactoside-binding, soluble 3 (galectin 3) (LGALS3) | up |
| Homo sapiens LIM and SH3 protein 1 (LASP1) | up |
| Homo sapiens melanoma-associated antigen MG50 | up |
| Homo sapiens misshapen/NIK-related kinase (MINK) | up |
| Homo sapiens multiple PDZ domain protein (MPDZ) | up |
| Homo sapiens myristoylated alanine-rich protein kinase C substrate (MARCKS, 80K-L) (MACS) | up |
| Homo sapiens nucleobindin 1 (NUCB1) | up |
| Homo sapiens pancreas tumor-related protein (FKSG12) | up |
| Homo sapiens protein kinase C substrate 80K-H (PRKCSH) | up |
| Homo sapiens RAB7, member RAS oncogene family (RAB7) | up |
| Homo sapiens regulator of G-protein signalling 5 (RGS5) | up |
| Homo sapiens S100 calcium-binding protein A4 (calcium protein, calvasculin, metastasin, murine placental homology) (S100A4) | up |
| Homo sapiens SH3 domain binding glutamic acid-rich protein like (SH3BGRL) | up |
| Homo sapiens similar to Caenorhabditis elegans protein C42C1.9 (KEO4) | up |
| Homo sapiens similar to protein kinase, cAMP-dependent, regulatory, type II, alpha | up |
| Homo sapiens syndecan 1 (SDC1) | up |
| Homo sapiens thyroid specific PTB domain protein | up |
| Homo sapiens TRK-fused gene (TFG) | up |
| Homo sapiens tumor endothelial marker 1 precursor (TEM1) | up |
| Homo sapiens tumor protein D52-like 2 (TPD52L2) | up |
| Homo sapiens tumor suppressor deleted in oral cancer-related 1 (DOC-1R) | up |
| Metabolism | |
| Homo sapiens colipase, pancreatic (CLPS) | down |
| Homo sapiens phospholipase A2, group IB (pancreas) (PLA2G1B) | down |
| Homo sapiens superoxide dismutase 2, mitochondrial (SOD2) | down |
| Homo sapiens alpha 1,2-mannosidase | up |
| Homo sapiens CGI 83 protein | up |
| Homo sapiens clone MGC:3339 | up |
| Homo sapiens dimethylarginine dimethylaminohydrolase 2 (DDAH2) | up |
| Homo sapiens F1F0-type ATPase subunit d (ATP5JD) | up |
| Homo sapiens fructose-1,6-bisphosphatase 1 (FBP1) | up |
| Homo sapiens inorganic pyrophosphatase (SID6-306) | up |
| Homo sapiens isocitrate dehydrogenase 3 (NAD+) beta (IDH3B) | up |
| Homo sapiens meningioma expressed antigen 5 (hyaluronidase) (MGEA5) | up |
| Homo sapiens mitochondria solute carrier protein (MSCP) | up |
| Homo sapiens NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 8 (19 kD, ASHI) (NDUFB8) | up |
| Homo sapiens ornithine decarboxylase antizyme 2 (OAZ2) | up |
| Homo sapiens phospholipase A2, group IIA (platelets, synovial fluid) (PLA2G2A) | up |
| Homo sapiens pyruvate dehydrogenase kinase 4 | up |
| Homo sapiens solute carrier family 9 (sodium/hydrogen exchanger), isoform 3 regulatory factor 2 (SLC9A3R2) | up |
| Structural cytoskeleton | |
| Homo sapiens annexin A4 (ANXA4) | down |
| Homo sapiens destrin (actin depolymerizing factor) | down |
| Homo sapiens actinin, alpha 1 (ACTN1) | up |
| Homo sapiens actinin, alpha 4 (ACTN4) | up |
| Homo sapiens capping protein (actin filament), gelsolin-like (CAPG) | up |
| Homo sapiens dynactin 1 (p150, Glued (Drosophila) homology) (DCTN1), transcript variant 1 | up |
| Homo sapiens epithelial protein lost in neoplasm beta (EPLIN) | up |
| Homo sapiens filamin A, alpha (actin-binding protein-280) (FLNA) | up |
| Homo sapiens integrin beta 4 binding protein (ITGB4BP) | up |
| Homo sapiens supervillin (SVIL), transcript variant 2 | up |
| Homo sapiens synaptopodin (KIAA1029) | up |
| Homo sapiens thymosin beta-10 gene (TMSB10) | up |
| Homo sapiens thymosin beta-4 (TMSB4X) | up |
| Homo sapiens transforming growth factor beta 1 induced transcript 1 (TGFB1I1) | up |
| Homo sapiens transforming, acidic coiled-coil containing protein 1 (TACC1) | up |
| Homo sapiens VAMP (vesicle-associated membrane protein)-associated protein B and C (VAPB) | up |
| Homo sapiens zyxin (ZYX) | up |
| Transcriptional regulation | |
| Homo sapiens heat shock 90-kD protein 1, alpha (HSPCA) | down |
| Homo sapiens cellular retinoic acid-binding protein 2 (CRABP2) | up |
| Homo sapiens clone IMAGE:3622356 | up |
| Homo sapiens DR1-associated protein 1 (negative cofactor 2 alpha) (DRAP1) | up |
| Homo sapiens FOXP1 | up |
| Homo sapiens hematopoietic PBX-interacting protein (HPIP) | up |
| Homo sapiens high-mobility group (nonhistone chromosomal) protein 1 (HMG1) | up |
| Homo sapiens high-mobility group (nonhistone chromosomal) protein 17 (HMG17) | up |
| Homo sapiens methyl-CpG binding domain protein 2 (MBD2) | up |
| Homo sapiens nuclear matrix protein NMP200 related to splicing factor PRP19 (NMP200) | up |
| Homo sapiens nuclease sensitive element binding protein 1 (NSEP1) | up |
| Homo sapiens retinoid X receptor alpha mRNA | up |
| Homo sapiens suppressor of Ty (S. cerevisiae) 5 homolog (SUPT5H) | up |
| Homo sapiens SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 4 (SMARCA4) | up |
| Homo sapiens tripartite motif-containing 28 (TRIM28) | up |
| Homo sapiens zinc finger protein 220 (ZNF220) | up |
| Protein synthesis and modification | |
| Homo sapiens heat shock 90-kD protein 1, alpha (HSPCA) | down |
| Homo sapiens translocating chain-associating membrane protein (TRAMP protein) | down |
| Homo sapiens clone MGC:9947 IMAGE:3876105 | up |
| Homo sapiens eukaryotic translation initiation factor 3, subunit 5 (epsilon, 47 kD) (EIF3S5) | up |
| Homo sapiens eukaryotic translation initiation factor 4 gamma 2 (EIF4G2) | up |
| Homo sapiens KIAA0905 protein (yeast Sec31p homolog) | up |
| Homo sapiens low-density lipoprotein-related protein-associated protein 1 (alpha-2-macroglobulin receptor-associated protein 1) (LRPAP1) | up |
| Homo sapiens methionine aminopeptidase; eIF-2-associated p67 (MNPEP) | up |
| Homo sapiens MRJ gene for a member of the DNAJ protein family | up |
| Homo sapiens putative mitochondrial outer membrane protein import receptor (hTOM), nuclear gene encoding mitochondrial protein | up |
| Homo sapiens ribosomal protein S15 (RPS15) | up |
| Homo sapiens Sec61 gamma | up |
| Homo sapiens sialyltransferase 1 (beta-galactoside alpha-2,6-sialyltransferase) (SIAT1) | up |
| Homo sapiens translation initiation factor IF2 (IF2) | up |
| Homo sapiens UDP-Gal:betaGlcNAc beta 1,4-galactosyltransferase, polypeptide 5 (B4GALT5) | up |
| Extracellular ligands | |
| Homo sapiens serine protease inhibitor, Kazal type 1 (SPINK1) | down |
| Homo sapiens somatostatin (SST) | down |
| Homo sapiens agrin precursor | up |
| Homo sapiens high-density lipoprotein binding protein (vigilin) (HDLBP) | up |
| Homo sapiens insulin-like growth factor binding protein 4 (IGFBP4) | up |
| Homo sapiens lectin, galactoside-binding, soluble 3 binding protein (LGALS3BP) | up |
| Homo sapiens lectin, galactoside-binding, soluble 4 (galectin 4) (LGALS4) | up |
| Homo sapiens midkine (neurite growth-promoting factor 2) (MDK) | up |
| Homo sapiens NJAC protein (NJAC) | up |
| Homo sapiens secreted protein, acidic, cysteine-rich (osteonectin) (SPARC) | up |
| Homo sapiens serine protease inhibitor, Kunitz type, 2 (SPINT2) | up |
| Homo sapiens tumor necrosis factor (ligand) superfamily, member 13 (TNFSF13) | up |
| Extracellular protein degradation | |
| Homo sapiens chymotrypsin C (caldecrin) (CTRC) | down |
| Homo sapiens elastase 3B (ELA3B) | down |
| Homo sapiens kallikrein 1, renal/pancreas/salivary (KLK1) | down |
| Homo sapiens pancreatic carboxypeptidase A1 (CPA1) | down |
| Homo sapiens protease, serine 2 (trypsin 2) (PRSS2) | down |
| Homo sapiens matrix metalloproteinase 11 (stromelysin 3) (MMP11) | up |
| Homo sapiens progastricsin (pepsinogen C) (PGC) | up |
| Homo sapiens serine protease 11 (IGF binding) (PRSS11) | up |
| RNA processing | |
| Homo sapiens DEAD/H (Asp-Glu-Ala-Asp/His) box polypeptide 16 (DDX16) | up |
| Homo sapiens DEAD-box protein abstract (ABS) | up |
| Homo sapiens heterogeneous nuclear ribonucleoprotein C (C1/C2) (HNRPC) | up |
| Homo sapiens HMT1 (hnRNP methyltransferase, S. cerevisiae)-like 2 (HRMT1L2) | up |
| Homo sapiens KIAA0052 protein | up |
| Homo sapiens NS1-associated protein 1 (NSAP1) | up |
| Homo sapiens PAI-1 mRNA-binding protein (PAI-RBP1) | up |
| Homo sapiens putative nucleolar RNA helicase (NOH61) | up |
| Cell cycle | |
| Homo sapiens (E. coli homolog)-like (RUVBL1) | up |
| Homo sapiens chromosome segregation 1 (yeast homolog)-like (CSE1L) | up |
| Homo sapiens cyclin K (CCNK) | up |
| Homo sapiens cyclin-dependent kinase inhibitor 1A (p21, Cip1) (CDKN1A) | up |
| Homo sapiens PRAD1 mRNA for cyclin | up |
| Homo sapiens TRF2-interacting telomeric RAP1 protein (RAP1) | up |
| Immunity | |
| Homo sapiens immunoglobulin heavy chain variable region (V4-31) gene | up |
| Homo sapiens CD74 antigen (invariant polypeptide of major histocompatibility complex, class II antigen-associated) (CD74) | up |
| Homo sapiens interferon induced transmembrane protein 1 (IFITM1) | up |
| Homo sapiens major histocompatibility complex, class I, F (HLA-F) | up |
| Homo sapiens mRNA for single-chain antibody | up |
| Homo sapiens polymeric immunoglobulin receptor (PIGR) | up |
| Cytoplasmatic protein degradation | |
| Homo sapiens ariadne-2 (D. melanogaster) homolog (all-trans retinoic acid inducible RING finger) (ARIH2) | up |
| Homo sapiens hypothetical protein FLJ23251 (FLJ23251) | up |
| Homo sapiens ORF (LOC51035) | up |
| Homo sapiens proteasome (prosome, macropain) 26S subunit, non-ATPase, 3 (PSMD3) | up |
| Homo sapiens proteasome (prosome, macropain) subunit, alpha type, 5 (PSMA5) | up |
| Structural extracellular matrix | |
| Homo sapiens collagen, type I, alpha 1 (COL1A1) | up |
| Homo sapiens collagen, type I, alpha 2 (COL1A2) | up |
| Homo sapiens keratin 7 (KRT7) | up |
| Homo sapiens vimentin (VIM) | up |
| DNA repair | |
| Homo sapiens hypothetical protein FLJ22402 (FLJ22402) | up |
| Other | |
| Homo sapiens 601679608F1 (cDNA) | down |
| Homo sapiens chromosome 8 open reading frame 4 (C8orf4) | down |
| Homo sapiens islet amyloid polypeptide (IAPP) | down |
| Homo sapiens wh37d12.x1 (cDNA) | down |
| Homo sapiens 602412375F1 (cDNA) | up |
| Homo sapiens 602812513F1 (cDNA) | up |
| Homo sapiens adlican | up |
| Homo sapiens apurinic/apyrimidinic endonuclease (APEX nuclease)-like 2 protein | up |
| Homo sapiens AV695345 (cDNA) | up |
| Homo sapiens B-cell translocation protein 1 (BTG1) | up |
| Homo sapiens biglycan (BGN) | up |
| Homo sapiens cDNA FLJ14237 fis | up |
| Homo sapiens cDNA: FLJ21321 fis, clone COL02335, highly similar to HSA010442 mRNA for immunoglobulin kappa light chain | up |
| Homo sapiens CGI 40 protein | up |
| Homo sapiens CGI-06 protein | up |
| Homo sapiens chromosome 21 open reading frame 59 (C21ORF59) | up |
| Homo sapiens clone 24751 | up |
| Homo sapiens clone IMAGE:3351295 | up |
| Homo sapiens clone IMAGE:3604336 | up |
| Homo sapiens clone IMAGE:4127796 | up |
| Homo sapiens clone IMAGE:4309957 | up |
| Homo sapiens DKFZp564H2023 (from clone DKFZp564H2023) | up |
| Homo sapiens DKFZp586D0918 (from clone DKFZp586D0918) | up |
| Homo sapiens DNA sequence from clone RP4-718J7 on chromosome 20q13.31–13.33, contains the PCK1 gene for soluble phosphoenolpyruvate carboxykinase 1, part of a novel gene similar to mouse DLM-1 (tumour stroma and activated macrophage protein), the 3' end of the TMEPAI gene encoding an androgen induced 1b transmembrane protein (PMEPA1), two putative novel genes | up |
| Homo sapiens DNA sequence from clone RP5-1100H13 on chromosome 20q11.2, contains the 3' end of gene KIAA1219, a putative novel gene, a DC5 pseudogene, the gene for a putative RhoGAP domain containing protein | up |
| Homo sapiens epithelial membrane protein 3 (EMP3) | up |
| Homo sapiens fer-1 (C. elegans)-like 3 (myoferlin) (FER1L3) | up |
| Homo sapiens FLJ00085 protein | up |
| Homo sapiens FLJ10960 fis, clone PLACE1000564 | up |
| Homo sapiens FLJ21300 fis, clone COL02062 (cDNA) | up |
| Homo sapiens FLJ22667 fis, clone HSI08385 (cDNA) | up |
| Homo sapiens FLJ23055 fis, clone LNG03262 (cDNA) | up |
| Homo sapiens full length insert cDNA clone EUROIMAGE 1967720 | up |
| Homo sapiens golgi membrane protein GP73 (LOC51280) | up |
| Homo sapiens GW128 | up |
| Homo sapiens HSPC166 protein (HSPC166) | up |
| Homo sapiens hypothetical protein (BM-009) | up |
| Homo sapiens hypothetical protein (HSPC152) | up |
| Homo sapiens hypothetical protein CAB56184 (CAB56184) | up |
| Homo sapiens hypothetical protein DKFZp761F241 (DKFZP761F241) | up |
| Homo sapiens hypothetical protein FLJ10439 (FLJ10439) | up |
| Homo sapiens hypothetical protein FLJ12150 (FLJ12150) | up |
| Homo sapiens hypothetical protein FLJ13465 (FLJ13465) | up |
| Homo sapiens hypothetical protein FLJ14825 (FLJ14825) | up |
| Homo sapiens hypothetical protein FLJ20113 (FLJ20113) | up |
| Homo sapiens hypothetical protein FLJ20920 (FLJ20920) | up |
| Homo sapiens hypothetical protein FLJ22439 (FLJ22439) | up |
| Homo sapiens hypothetical protein FLJ22439 (FLJ22439) | up |
| Homo sapiens hypothetical protein MGC15429 (MGC15429) | up |
| Homo sapiens hypothetical protein MGC15429 (MGC15429) | up |
| Homo sapiens hypothetical protein MGC2477 (MGC2477) | up |
| Homo sapiens hypothetical protein MGC4342 (MGC4342) | up |
| Homo sapiens hypothetical protein PRO1855 (PRO1855) | up |
| Homo sapiens JM4 protein (JM4) | up |
| Homo sapiens KIAA0560 protein | up |
| Homo sapiens KIAA0747 protein | up |
| Homo sapiens KIAA0956 protein | up |
| Homo sapiens KIAA1691 protein | up |
| Homo sapiens KIAA1693 protein | up |
| Homo sapiens lambda-crystallin | up |
| Homo sapiens neural proliferation, differentiation and control 1 (NPDC1) | up |
| Homo sapiens NifU-like protein (hNifU) | up |
| Homo sapiens nonspecific crossreacting antigen | up |
| Homo sapiens novel protein AHNAK | up |
| Homo sapiens semenogelin 1 (SEMG1) | up |
| Homo sapiens similar to RIKEN cDNA 5830420C20 gene | up |
| Homo sapiens transforming, acidic coiled-coil containing protein 1 (TACC1) | up |
| Homo sapiens TTF-I interacting peptide 12 | up |
| Homo sapiens UDP-N-acetylglucosamine: alpha-1,3-D-mannoside beta-1,4-N-acetylglucosaminyltransferase IV | up |
| Homo sapiens wd91e02.x1 (cDNA) | up |
| Homo sapiens wv04c02.x1 (cDNA) | up |
| Homo sapiens zinc finger protein (ZNF-U69274) | up |
We also found the insulin-like growth factor (IGF) binding protein 4 (IGFBP-4) to be highly and significantly upregulated in PaCa (ratio 7.1, p < 0.0001), supporting similar findings by Gress et al. [10] using representational difference analysis. IGFBPs are postulated to carry out several functions, including control of IGF, interactions with cell surface receptors and modulation of biological actions of the IGFs [29]. IGFBPs are potent inhibitors of the action of IGF in various cell types. Inhibitory IGFBPs may significantly delay the growth of malignant prostate epithelial cells and enhance the sensitivity of these cells to apoptosis. IGFBP-4 proteolysis by cell-bound plasmin can promote autocrine/paracrine IGF-II bioavailability in colon cancer cells. This may have important consequences for the behavior of cancer cells at the interface between stroma and malignant cells in carcinomas of the colon in vivo. IGFBPs may also play a more active, IGF-independent role in growth regulation of cancer cells [30, 31]. Our data suggest that the IGFBP-4 gene is upregulated in PaCa. This finding seems surprising, because previous studies have indicated that the gene represents a potential TSG. However, our data are in accordance with the expression profile of p53, a well-known TSG, which has been found to be overexpressed in 50–60% of all PaCa cases. This is intriguingly close to the detected gene mutation rate of 45–80% [32, 33]. Another gene which may be of pathophysiological relevance for pancreatic carcinogenesis is serine protease 2 (trypsin 2) (PRSS2). The gene is involved in metabolism and has been shown to be downregulated in some tumors [34]. We also found that PRSS2 was downregulated in PaCa, as shown by our electronic Northern blot. A number of additional genes known to be associated with the malignant phenotype of PaCa or other tumors were also identified by us as differentially expressed in PaCa.
Among these genes are casein kinase 1 epsilon (CSNK1E), tumor protein D52-like 2 (TPD52L2), glycoprotein 2 (GP2), S100 calcium-binding protein A4 (calvasculin, metastasin; S100A4) and somatostatin. Most of our genes (222) were found to be significantly upregulated in PaCa, suggesting that the EST libraries obtained from tumor tissue were enriched for upregulated genes. This finding is supported by a comparative genomic hybridization study on a panel of 27 PaCas showing that gains of chromosome material were more frequent than chromosome losses [7]. There are some limitations to our in silico approach. The result depends on the quality and the number of available EST libraries. Known oncogenes like K-ras, which is mutated in PaCa and contributes to carcinogenesis, can only be identified with our profiling method if their expression pattern changes significantly. Our analysis shows that K-ras is slightly upregulated in PaCa EST libraries; however, with a p value of 0.26, the upregulation is not statistically significant. Moreover, genes whose expression level is very low will not be found by the EST approach. Most EST libraries used in this study have been sequenced to a depth of approximately 5,000 individual clones. This depth is not sufficient to faithfully represent the abundance of all RNA molecules in a particular tissue [35]. It has been estimated that a cell expresses between 10,000 and 30,000 different genes and an average of about 300,000 mRNA molecules. Therefore, a single EST library cannot represent a perfect picture of the mRNA composition of a certain cell type. Low-abundance genes in particular are often absent from an EST library. We mitigated this problem by pooling individual EST libraries that originated from the same benign or cancer tissue. These library pools then comprise tens of thousands of individual sequences and may show a better proportional representation of abundant as well as moderately expressed genes. In summary, our in silico RNA profiling approach is fast and reliable and has provided a unique resource of 249 genes differentially expressed in PaCa compared with normal tissue of the pancreas. Some of the genes have also been identified by other profiling methods, confirming the validity of our in silico approach. The data support the concept that gene expression profiles change dramatically during tumorigenesis. We have not been able to dissect expression profiles according to individual cancer stages, including grading and TNM classification. This would require carefully designed prospective profiling studies involving hundreds of patients and microdissected normal and cancerous tissue of the pancreas.
Moreover, the approach only analyzes changes in expression of the genes, but not in the activity of the protein. Therefore, interesting candidate genes have to be validated by means of immunohistochemistry, for example. Nevertheless, this bioinformatic tool offers new possibilities in in silico analysis of gene expression and can be used to develop gene networks and pathways in different diseased tissues.
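The sampling-depth limitation discussed above can be illustrated with a short calculation. The sketch below is not part of the original analysis; it assumes that each sequenced clone is an independent random draw from the mRNA pool of roughly 300,000 molecules per cell, and it uses a hypothetical pooled depth of 50,000 clones to stand in for "tens of thousands" of sequences. The function name `prob_missed` and the chosen copy numbers are illustrative only.

```python
# Illustrative sketch: chance that a transcript is absent from an EST library.
# Assumption (not from the paper): clones are independent random draws from
# the cellular mRNA pool, so absence follows a simple binomial model.

MRNA_PER_CELL = 300_000   # approximate number of mRNA molecules per cell (from the text)
SINGLE_LIBRARY = 5_000    # approximate clones per EST library (from the text)
POOLED_LIBRARY = 50_000   # hypothetical pooled depth, assumed for illustration


def prob_missed(copies_per_cell: float, clones_sequenced: int) -> float:
    """Probability that none of the sequenced clones derives from the transcript."""
    fraction = copies_per_cell / MRNA_PER_CELL
    return (1.0 - fraction) ** clones_sequenced


# Compare a single library with a pooled one for rare to moderately abundant transcripts.
for copies in (1, 10, 100):
    single = prob_missed(copies, SINGLE_LIBRARY)
    pooled = prob_missed(copies, POOLED_LIBRARY)
    print(f"{copies:>3} copies/cell: missed in single library {single:.3f}, "
          f"missed in pooled library {pooled:.3f}")
```

Under these assumptions, a transcript present at about 10 copies per cell would be missed by most single libraries of 5,000 clones but by only a minority of pools ten times as deep, which is consistent with the rationale for pooling equivalent libraries.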
Acknowledgments
We thank S. Pistorius (Dresden), H. Kalthoff (Kiel) and S. Gelling (Berlin) for critical reading of the manuscript as well as Alfred E. Neumann, whose comments were always appreciated. This work was supported by the Deutsche Krebshilfe (70-2937-SaI).
References
1 Parker SL, Tong T, Bolden S, Wingo PA: Cancer statistics, 1997. CA Cancer J Clin 1997;47:5–27.
2 Murr MM, Sarr MG, Oishi AJ, van Heerden JA: Pancreatic cancer. CA Cancer J Clin 1994;44:304–318.
3 Hilgers W, Kern SE: Molecular genetic basis of pancreatic adenocarcinoma. Genes Chromosomes Cancer 1999;26:1–12.
4 Bardi G, Johansson B, Pandis N, Mandahl N, Bak-Jensen E, Andren-Sandberg A, Mitelman F, Heim S: Karyotypic abnormalities in tumours of the pancreas. Br J Cancer 1993;67:1106–1112.
5 Griffin CA, Hruban RH, Morsberger LA, Ellingham T, Long PP, Jaffee EM, Hauda KM, Bohlander SK, Yeo CJ: Consistent chromosome abnormalities in adenocarcinoma of the pancreas. Cancer Res 1995;55:2394–2399.
6 Hahn SA, Seymour AB, Hoque AT, Schutte M, da Costa LT, Redston MS, Caldas C, Weinstein CL, Fischer A, Yeo CJ, et al: Allelotype of pancreatic adenocarcinoma using xenograft enrichment. Cancer Res 1995;55:4670–4675.
7 Solinas-Toldo S, Wallrapp C, Müller-Pillasch F, Bentz M, Gress T, Lichter P: Mapping of chromosomal imbalances in pancreatic carcinoma by comparative genomic hybridization. Cancer Res 1996;56:3803–3807.
8 Brat DJ, Hahn SA, Griffin CA, Yeo CJ, Kern SE, Hruban RH: The structural basis of molecular genetic deletions. An integration of classical cytogenetic and molecular analyses in pancreatic adenocarcinoma. Am J Pathol 1997;150:383–391.
9 Liang P, Pardee AB: Differential display of eukaryotic messenger RNA by means of the polymerase chain reaction. Science 1992;257:967–971.
10 Gress TM, Wallrapp C, Frohme M, Müller-Pillasch F, Lacher U, Friess H, Büchler M, Adler G, Hoheisel JD: Identification of genes with specific expression in pancreatic cancer by cDNA representational difference analysis. Genes Chromosomes Cancer 1997;19:97–103.
11 Argani P, Iacobuzio-Donahue C, Ryu B, Rosty C, Goggins M, Wilentz RE, Murugesan SR, Leach SD, Jaffee E, Yeo CJ, Cameron JL, Kern SE, Hruban RH: Mesothelin is overexpressed in the vast majority of ductal adenocarcinomas of the pancreas: Identification of a new pancreatic cancer marker by serial analysis of gene expression (SAGE). Clin Cancer Res 2001;7:3862–3868.
12 Ryu B, Jones J, Blades NJ, Parmigiani G, Hollingsworth MA, Hruban RH, Kern SE: Relationships and differentially expressed genes among pancreatic cancers examined by large-scale serial analysis of gene expression. Cancer Res 2002;62:819–826.
13 Zuber J, Tchernitsa OI, Hinzmann B, Schmitz AC, Grips M, Hellriegel M, Sers C, Rosenthal A, Schafer R: A genome-wide survey of RAS transformation targets. Nat Genet 2000;24:144–152.
14 Friess H, Ding J, Kleeff J, Liao Q, Berberat PO, Hammer J, Büchler MW: Identification of disease-specific genes in chronic pancreatitis using DNA array technology. Ann Surg 2001;234:769–779.
15 Scheurle D, DeYoung MP, Binninger DM, Page H, Jahanzeb M, Narayanan R: Cancer gene discovery using digital differential display. Cancer Res 2000;60:4037–4043.
16 Adams MD, Kelley JM, Gocayne JD, Dubnick M, Polymeropoulos MH, Xiao H, Merril CR, Wu A, Olde B, Moreno RF, et al: Complementary DNA sequencing: Expressed sequence tags and human genome project. Science 1991;252:1651–1656.
17 Schmitt AO, Specht T, Beckmann G, Dahl E, Pilarsky CP, Hinzmann B, Rosenthal A: Exhaustive mining of EST libraries for genes differentially expressed in normal and tumour tissues. Nucleic Acids Res 1999;27:4251–4260.
18 Fannon MR: Gene expression in normal and disease states – identification of therapeutic targets. Trends Biotechnol 1996;14:294–298.
19 Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990;215:403–410.
20 Kern SE: Molecular genetic alterations in ductal pancreatic adenocarcinomas. Med Clin North Am 2000;84:691–695, xi.
21 Ozaki K, Nagata M, Suzuki M, Fujiwara T, Miyoshi Y, Ishikawa O, Ohigashi H, Imaoka S, Takahashi E, Nakamura Y: Isolation and characterization of a novel human pancreas-specific gene, pancpin, that is down-regulated in pancreatic cancer cells. Genes Chromosomes Cancer 1998;22:179–185.
22 Gress TM, Müller-Pillasch F, Geng M, Zimmerhackl F, Zehetner G, Friess H, Büchler M, Adler G, Lehrach H: A pancreatic cancer-specific expression profile. Oncogene 1996;13:1819–1830.
23 Iacobuzio-Donahue CA, Maitra A, Shen-Ong GL, van Heek T, Ashfaq R, Meyer R, Walter K, Berg K, Hollingsworth MA, Cameron JL, Yeo CJ, Kern SE, Goggins M, Hruban RH: Discovery of novel tumor markers of pancreatic cancer using global gene expression technology. Am J Pathol 2002;160:1239–1249.
24 Sanvito F, Vivoli F, Gambini S, Santambrogio G, Catena M, Viale E, Veglia F, Donadini A, Biffo S, Marchisio PC: Expression of a highly conserved protein, p27BBP, during the progression of human colorectal cancer. Cancer Res 2000;60:510–516.
25 Biffo S, Sanvito F, Costa S, Preve L, Pignatelli R, Spinardi L, Marchisio PC: Isolation of a novel beta4 integrin-binding protein (p27(BBP)) highly expressed in epithelial cells. J Biol Chem 1997;272:30314–30321.
26 Noel A, Boulay A, Kebers F, Kannan R, Hajitou A, Calberg-Bacq CM, Basset P, Rio MC, Foidart JM: Demonstration in vivo that stromelysin-3 functions through its proteolytic activity. Oncogene 2000;19:1605–1612.
27 Rio MC, Lefebvre O, Santavicca M, Noel A, Chenard MP, Anglard P, Byrne JA, Okada A, Regnier CH, Masson R, Bellocq JP, Basset P: Stromelysin-3 in the biology of the normal and neoplastic mammary gland. J Mammary Gland Biol Neoplasia 1996;1:231–240.
28 von Marschall Z, Riecken EO, Rosewicz S: Stromelysin 3 is overexpressed in human pancreatic carcinoma and regulated by retinoic acid in pancreatic carcinoma cell lines. Gut 1998;43:692–698.
29 Clemmons DR, Jones JI, Busby WH, Wright G: Role of insulin-like growth factor binding proteins in modifying IGF actions. Ann NY Acad Sci 1993;692:10–21.
30 Glantschnig H, Varga F, Luegmayr E, Klaushofer K: Characterization of the mouse insulin-like growth factor binding protein 4 gene regulatory region and expression studies. DNA Cell Biol 1998;17:51–60.
31 Remacle-Bonnet MM, Garrouste FL, Pommier GJ: Surface-bound plasmin induces selective proteolysis of insulin-like-growth-factor (IGF)-binding protein-4 (IGFBP-4) and promotes autocrine IGF-II bio-availability in human colon-carcinoma cells. Int J Cancer 1997;72:835–843.
32 Barton CM, Staddon SL, Hughes CM, Hall PA, O’Sullivan C, Kloppel G, Theis B, Russell RC, Neoptolemos J, Williamson RC, et al: Abnormalities of the p53 tumour suppressor gene in human pancreatic cancer. Br J Cancer 1991;64:1076–1082.
33 Casey G, Yamanaka Y, Friess H, Kobrin MS, Lopez ME, Büchler M, Beger HG, Korc M: p53 mutations are common in pancreatic cancer and are absent in chronic pancreatitis. Cancer Lett 1993;69:151–160.
34 Lukkonen A, Sorsa T, Salo T, Tervahartiala T, Koivunen E, Golub L, Simon S, Stenman UH: Down-regulation of trypsinogen-2 expression by chemically modified tetracyclines: Association with reduced cancer cell migration. Int J Cancer 2000;86:577–581.
35 Vingron M, Hoheisel J: Computational aspects of expression data. J Mol Med 1999;77:3–7.