Experimental Hematology 2009;37:215–224
Genome-wide DNA-mapping of CD34+ cells from patients with myelodysplastic syndrome using 500K SNP arrays identifies significant regions of deletion and uniparental disomy
Daniel Nowak^a, Florian Nolte^a, Maximilian Mossner, Verena Nowak^a, Claudia D. Baldus^a, Olaf Hopfer^a, Stefanie Noll^a, Eckhard Thiel^a, Florian Wagner^b,*, and Wolf-Karsten Hofmann^a
^a Department of Hematology and Oncology, Charité University Hospital, Campus Benjamin Franklin, Berlin, Germany; ^b German Resource Center for Genome Research (RZPD), Berlin, Germany
(Received 28 June 2008; revised 22 September 2008; accepted 21 October 2008)
Objective. Identification of genomic lesions in progenitor cells of patients with myelodysplastic syndrome (MDS) could lead to the discovery of new disease-specific genes and may be of prognostic value.
Materials and Methods. We carried out a genome-wide mapping of DNA from CD34+ cells of MDS patients with high-resolution 500K single nucleotide polymorphism arrays and integrated the results with global gene expression analysis. Thirteen MDS patients were analyzed.
Results. Copy number and loss of heterozygosity analyses detected heterozygous deletions on chromosomes 2, 9, 13, 16, 17, and 20 ranging in size from 0.1 megabases (Mba) to 2.1 Mba. Additionally, numerous regions with significant uniparental disomy were detected. Integration of the genomic data with gene expression analysis showed that genes that were downregulated at least 1.5-fold in regions of significant deletion and uniparental disomy were exclusively downregulated in those samples displaying the aberration. Genomics and gene expression data were confirmed by real-time polymerase chain reaction and variable number tandem repeat analysis.
Conclusion. High-density genomic mapping of CD34+ bone marrow cells from patients with MDS identifies cryptic genetic lesions and offers new opportunities for the discovery of target genes in MDS by integration with gene expression analysis.
© 2009 ISEH - Society for Hematology and Stem Cells. Published by Elsevier Inc.
Myelodysplastic syndrome (MDS) is a clonal disorder that is characterized by ineffective hematopoiesis and in later stages often transforms into acute myeloid leukemia (AML) [1]. It is hypothesized that MDS develops in a multistep process involving genetic insults in hematological stem cells. This is supported by the fact that approximately 50% of MDS patients display karyotype aberrations in cytogenetic analyses [2,3]. However, conventional cytogenetic studies by nature of the method often fail to detect submicroscopic abnormalities [4].
*Dr. Wagner is currently at Atlas Biolabs GmbH, Berlin, Germany.
Offprint requests to: Daniel Nowak, M.D., Cedars Sinai Medical Center, Department of Hematology and Oncology, 8700 Beverly Boulevard, Los Angeles, CA 90048; E-mail:
[email protected]
500K single nucleotide polymorphism (SNP) microarrays offer the opportunity to perform a global analysis of genomic DNA [5]. These arrays interrogate approximately 500,000 SNPs throughout the genome, which corresponds to a mean inter-SNP distance of 5.8 kb. This allows a detailed genotyping analysis at a sub-megabase level. Analysis of genomic DNA with SNP arrays provides two different types of information. One is a data set comprising the intensity data of all SNPs. Because the human genome is diploid, the intensity values are set to two after normalization, which represents the normal copy number of SNPs on autosomes. A homozygous deletion results in an expression value of zero and a heterozygous deletion in an expression value of one. Amplifications result
0301-472X/09 $–see front matter. Copyright © 2009 ISEH - Society for Hematology and Stem Cells. Published by Elsevier Inc. doi: 10.1016/j.exphem.2008.10.012
in expression values of ≥3 (integer multiples). Apart from copy number data, the method also yields a genotype data set that contains the SNP calls of AA, AB, or BB standing for the alleles of the SNPs. This allows detection of loss of heterozygosity (LOH), uniparental disomy (UPD), or allelic imbalance. LOH is defined as the loss of one of the two alleles in the genome, which results in exclusively homozygous SNP calls. This usually occurs in heterozygous deletions, where one allele is lost and the remaining allele only produces homozygous SNP calls, as there are no corresponding SNPs on the other allele. However, a state of LOH can also be detected without a corresponding loss of genetic material. These events are referred to as UPD. UPD can arise from a number of different mechanisms, such as trisomic rescue, incomplete segregation of chromosomes, or mitotic recombination, and results in the duplication of one parental allele and loss of the other, leading to the detection of a copy number neutral state of LOH [6]. Finally, LOH can also be detected in regions of genomic amplification. The mechanisms leading to UPD can also lead to allelic imbalances, in which single alleles are present in multiple copies while the other corresponding allele is lost. Several studies using high-density genomic mapping methods, such as comparative genomic hybridization or SNP arrays, have demonstrated that these methods are superior to conventional metaphase cytogenetics and lead to the discovery of previously hidden genomic lesions in the bone marrow (BM) or CD34+ cells of MDS patients [7–12]. Integrative approaches combining the power of high-density SNP arrays with global gene expression analysis have, among others, been carried out in leukemia [13], melanoma [14], multiple myeloma [15], or lymphoma [16] and led to the discovery of functionally relevant genes for these diseases.
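The relationship between copy number and genotype calls described above can be sketched in a few lines of Python. This is only an illustration of the definitions in the text, not the probability model used in this study: the function names, the state labels, and the `max_het_fraction` tolerance are hypothetical, and a region is treated as showing LOH simply when no heterozygous (AB) calls remain.

```python
def has_loh(calls, max_het_fraction=0.0):
    """Return True if the genotype calls of a region are (almost) all homozygous.

    `calls` is a list of SNP calls ("AA", "AB", "BB"); anything else is ignored.
    `max_het_fraction` is a hypothetical tolerance for genotyping errors.
    """
    informative = [c for c in calls if c in ("AA", "AB", "BB")]
    if not informative:
        raise ValueError("no genotype calls in region")
    het_fraction = informative.count("AB") / len(informative)
    return het_fraction <= max_het_fraction


def classify_region(copy_number, calls, max_het_fraction=0.0):
    """Map an integer copy number plus SNP calls to one of the states in the text."""
    if copy_number < 0:
        raise ValueError("copy number cannot be negative")
    if copy_number == 0:
        return "homozygous deletion"          # no alleles left to genotype
    if copy_number == 1:
        return "heterozygous deletion"        # one allele lost, LOH expected
    loh = has_loh(calls, max_het_fraction)
    if copy_number == 2:
        return "UPD (copy-number-neutral LOH)" if loh else "normal"
    return "amplification with LOH" if loh else "amplification"


if __name__ == "__main__":
    print(classify_region(2, ["AA", "BB", "AA", "BB"]))
    print(classify_region(2, ["AA", "AB", "BB"]))
    print(classify_region(3, ["AA", "AA", "BB"]))
```

In practice a run of homozygous calls can also occur by chance or through shared ancestry, which is why a probability model with length and significance limits is used rather than a rule as simple as this one.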
An identification and detailed analysis of genomic lesions in hematological progenitor cells of MDS patients could lead to the discovery of new disease-specific target genes [17,18], which may be of value for development of future therapies or establishing new prognostic systems. We have used the 500K SNP mapping arrays to identify cryptic genetic lesions present in highly purified CD34+ hematopoietic progenitor cells of MDS patients. Furthermore, we have integrated the obtained genotyping data with global gene expression analyses of the same CD34+ cells and corresponding low-density BM cells with Affymetrix HG-U133 Plus 2.0 microarrays in order to assess whether identified genomic lesions have an impact on gene expression.
Materials and Methods
Patients and sample preparation
Heparinized BM aspirate samples were obtained after informed consent from 14 MDS patients upon initial diagnosis. For controls, BM aspirates were obtained from six healthy volunteers after informed consent. Mononuclear cells were separated by density gradient centrifugation through Ficoll-Hypaque (Biochrom, Berlin, Germany) and subsequently CD34+ hematopoietic progenitor cells were isolated by magnetic cell separation (Miltenyi Biotec, Bergisch Gladbach, Germany). Genomic DNA and RNA were extracted from CD34+ cells and mononuclear BM cells using TRIZOL reagent (Invitrogen, Life Technologies, Grand Island, NY, USA), according to the manufacturer's protocol. Genomic DNA was additionally purified with SureClean reagent (Bioline, Luckenwalde, Germany) and quality controlled by gel electrophoresis and the ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA) system. RNA was controlled with the Bioanalyzer 2100 (Agilent, Palo Alto, CA, USA) and ND-1000 spectrophotometer systems, respectively. Patient characteristics, including World Health Organization disease classification and cytogenetics, are summarized in Table 1.
Table 1. Patient characteristics of analyzed samples
No. | Age (y) | Gender | Diagnosis | Cytogenetics [metaphases] | IPSS
0029 | 68 | M | RA | 46,XY [23] | Low risk
0055 | 75 | M | RA | 46,XY [25] | Low risk
0061 | 73 | F | RAEB II | 46,XX,t(1;17)(p36.3;q23) [17], 46,XX [2] | High risk
0071 | 65 | F | RAEB I | 46,XX [14] | High risk
0085 | 69 | M | RA | 46,XY [6], 45,X,-Y [14] | Low risk
0110 | 75 | F | RA | 46,XX [27] | Low risk
0227 | 80 | F | RA | 46,XX [NA] | Low risk
0303 | 62 | M | RAEB | 46,XY [25] | High risk
0309 | 59 | F | RAEB II | 46,XX [25] | High risk
0319 | 77 | F | RAEB II | 46,XX [22] | High risk
0336 | 77 | M | CMML | 46,XY [20] | High risk
0386 | 44 | F | RAEB I | 46,XX [20] | High risk
0389 | 78 | M | RAEB II | 46,XY,del(20)(q11q13) [15], 47,XY,der(8),del(20)(q11q13) [5] | High risk
N0093 | 30 | M | Healthy | NA | NA
N0096 | 22 | M | Healthy | NA | NA
N0099 | 34 | F | Healthy | NA | NA
N0102 | 25 | M | Healthy | NA | NA
CMML = chronic myelomonocytic leukemia; F = female; IPSS = International Prognostic Scoring System; M = male; NA = not available; RA = refractory anemia; RAEB = refractory anemia with excess blasts (I = <10% bone marrow blasts, II = 10–20% bone marrow blasts).
Genome mapping and gene expression analysis
Five hundred nanograms of genomic DNA were processed according to the instructions of the GeneChip Mapping 500K Assay Manual (Affymetrix, Santa Clara, CA, USA), hybridized to 500K NspI/StyI chip sets and scanned on an Affymetrix GeneChip scanner 3000. For gene expression analyses, 1 to 8 µg of quality controlled RNA from CD34+ cells as well as unselected mononuclear BM cells was processed according to the standard Affymetrix target amplification protocol following the manufacturer's instructions. Fifteen µg of amplified and fragmented cRNA was hybridized to Human Genome U133 Plus 2.0 microarrays according to the manufacturer's protocol and scanned on an Affymetrix GeneChip scanner 3000.
Data analysis
Raw data were generated by GeneChip Operating System (GCOS) 1.4 (Affymetrix) and SNP genotype calls were calculated by GeneChip Genotyping Analysis Software (GTYPE) 4.1 with the Affymetrix BRLMM (Bayesian Robust Linear Model with Mahalanobis distance classifier) Analysis Tool 2.0 (BAT) algorithm. The mean SNP call rate of analyzed samples was 93.3% (range, 81.35–97.82%). One MDS sample and two samples from the healthy donors were excluded from analysis due to insufficient signal quality. For copy number analysis, intensity data files generated in GCOS were processed in the Affymetrix CNAT (copy number analysis tool). Quantile normalization was performed using a baseline of the normals [19]. CNAT copy number files were imported into the PARTEK Genomics Suite software (version 6.2) [20]. LOH analysis was performed on the control and MDS samples without paired normals using an Affymetrix reference set of SNP frequencies derived from 48 normals according to the algorithm of Huang et al. [21], which is implemented in the PARTEK software.
In reference to this algorithm and previous studies that employed SNP arrays to analyze MDS [11,12], we applied stringent limits and defined regions of LOH as regions that reached a significance score (negative decadic logarithm of the p value) of greater than 6 and were larger than 2 megabases (Mba). LOH probability models allow a calculation of LOH states with the use of distribution reference sets of normals, which achieve an accuracy matching traditional LOH determination methods with paired normal samples. This has been affirmed in several comparison studies [21–25]. The raw-intensity data generated on the Human Genome U133 Plus 2.0 microarrays were also imported into the PARTEK software. The expression values of the four controls from healthy individuals were used to create a baseline with an expression value of 1, which was used to normalize the MDS samples. A schematic overview of the applied workflow is presented in Figure 1.
Validation of genomic copy number change, LOH, and gene expression analysis
The gene expression patterns of interesting genes and copy number changes of selected genomic regions were confirmed by quantitative real-time polymerase chain reaction (PCR). Primers and fluorochrome-conjugated probes were obtained from Metabion GmbH (Martinsried, Germany). Sequences of primers and probes are available upon request. Real-time PCR was performed on the Rotorgene System (Corbett Life Science, Sydney, Australia). For gene expression validation, RNA was processed with QuantiTect reverse transcriptase kits (QIAGEN, Hilden, Germany) to produce
template cDNA. Expression values were calculated using the delta-delta CT method [26] with glucose phosphate isomerase as housekeeping gene and based on triplicates. For confirmation of genomic copy number changes, quantitative real-time PCR was performed on the genomic DNA of CD34+ cells and peripheral blood DNA of patients harboring a putative copy number alteration and of normal controls according to the method described by Weksberg et al. [27]. Thereby, regions of copy number change on chromosome 20 in patient 0389 were validated with the target genes phosphatidylinositol 3,4,5-triphosphate-dependent RAC exchanger (PREX1) and sulfatase 2 (SULF2), and the deletion on chromosome 9 in patient 0386 was validated with the target genes chromosome 9 open reading frame 102 (C9orf102) and zinc finger protein 510 (ZNF510). The target genes were positioned within the putatively altered region and were compared to the reference genes glucose-6-phosphate dehydrogenase (G6PDH) and hydroxymethylbilane synthase (HMBS). LOH was confirmed in these selected regions by analysis of variable number tandem repeats (VNTR) D9S1786, D9S1809, and D9S1145 for chromosome 9 and D20S213, D20S178, and D20S209 for chromosome 20, mapping to regions with putative LOH as described previously [28,29].
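As a rough illustration of the delta-delta CT calculation mentioned above, the following Python sketch computes a relative quantity from triplicate CT values. It assumes ideal amplification efficiency (a doubling per cycle); the CT values and function names are hypothetical. The last helper, which reads a relative DNA dose as a copy number against a diploid control, is a simplifying assumption and may differ in detail from the procedure of Weksberg et al. [27].

```python
from statistics import mean


def delta_delta_ct(target_sample, reference_sample, target_control, reference_control):
    """Delta-delta CT from replicate CT values (lists of floats).

    Each delta CT is target minus reference gene; the control delta CT is then
    subtracted from the sample delta CT.
    """
    d_ct_sample = mean(target_sample) - mean(reference_sample)
    d_ct_control = mean(target_control) - mean(reference_control)
    return d_ct_sample - d_ct_control


def relative_quantity(dd_ct):
    """Relative quantity 2^(-ddCT), assuming 100% amplification efficiency."""
    return 2.0 ** (-dd_ct)


def estimated_copy_number(dd_ct, control_copies=2):
    """Assumption: the control carries `control_copies` copies of the target."""
    return control_copies * relative_quantity(dd_ct)


if __name__ == "__main__":
    # Hypothetical triplicate CT values, not taken from the study.
    dd = delta_delta_ct(
        target_sample=[26.1, 26.0, 25.9],
        reference_sample=[24.0, 24.1, 23.9],
        target_control=[25.0, 25.1, 24.9],
        reference_control=[24.0, 23.9, 24.1],
    )
    print(round(relative_quantity(dd), 3))
    print(round(estimated_copy_number(dd), 3))
```

With these made-up values the target amplifies one cycle later in the sample than in the control after normalization, which corresponds to about half the template and is the pattern expected for a heterozygous deletion.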
Results
Quality of 500K SNP array data
In this study, the genomic DNA from CD34+ cells from 13 MDS patients and 4 healthy volunteers was subjected to the processing protocol for the Affymetrix 500K NspI/StyI SNP Array set. Enrichment of CD34+ cells from BM aspirations led to a purity of >95% as determined by fluorescence-activated cell sorting analysis. Several parameters were monitored to assess data quality. The mean call rate of all samples was 93.3%. The SNP call accuracy was gauged by calculation of heterozygous SNP calls of male X-chromosomal SNPs and was 99.99% in all male samples (n = 9).
Copy number analyses
A first analysis of the copy number data set revealed no homozygous deletions. Consequently, we searched for regions of heterozygous deletions. This was carried out by calculation of significant regions of copy number change using the PARTEK Genomics software. Amplifications were defined as regions with a mean log2 ratio of >0.5, deletions as regions with a mean log2 ratio below -0.5, and contiguous regions were required to span at least 0.1 Mba and reach a p value <0.01. This method reliably identified a known deletion on chromosome 20, del(20)(q11q13), in patient 0389 (Fig. 1), which was confirmed by metaphase cytogenetics. Furthermore, it led to the detection of cryptic deletions of various sizes in other samples on chromosomes 2, 9, 13, 16, 17, and 20, which are summarized in Table 2. Interestingly, no regions of amplification were detected. To validate copy number variations detected by the SNP arrays, we performed quantitative real-time PCR in selected regions of genomic
Figure 1. Workflow of genomic mapping with 500K arrays and subsequent integration with gene expression analysis from the same individuals. (A) Chromosome view of copy number data (log2 ratio) of all chromosomes in CD34+ cells of patient 0389. The red circle marks a heterozygous deletion on chromosome 20q, which was confirmed by metaphase cytogenetics. (B) Chromosome view of copy number data of chromosome 20 (all patients) imported into the PARTEK genomics software. The heterozygous deletion in patient 0389 can be found in the bottom chromosome view as a deviation from the normal copy number of 2 (log2 ratio = 0). (C) Chromosome view of chromosome 20 of patient 0389 after the calculation of significant regions of deletion (blue bars). (D) Chromosome view of chromosome 20 of patient 0389 after calculation of loss of heterozygosity (LOH). Red bars represent significant regions of LOH with an LOH score >6 (negative decadic logarithm of the p value). (E) Chromosome view of the gene expression profile of chromosome 20 in CD34+ cells of patient 0389. Red lines display upregulated genes and blue lines downregulated genes. CNAT = copy number analysis tool.
Table 2. Significant regions of heterozygous deletion
Sample IDs (top to bottom): 0029, 0386, 0389
Chromosome | Position (Mba) | Length (Mba) | Copy number value (log2 ratio) | LOH score
2q21.1 | 130.759–130.859 | 0.100 | -0.7621 | 1.23
16p11.2 | 31.172–31.306 | 0.134 | -0.5358 | 4.38
9q21.33 | 87.389–87.737 | 0.348 | -0.6255 | 3.25
9q22.1–q22.2 | 90.671–91.191 | 0.520 | -0.5910 | 2.64
9q22.31–q22.32 | 95.224–96.048 | 0.824 | -0.6021 | 1.33
9q22.32–q22.33 | 96.970–99.070 | 2.100 | -0.6069 | 3.55
9q31.3–q32 | 113.300–114.500 | 1.200 | -0.6935 | 2.88
13q12.3 | 30.276–30.466 | 0.190 | -0.5247 | 2.26
13q14.11 | 43.771–43.904 | 0.133 | -0.5259 | 2.80
13q14.3 | 50.483–50.903 | 0.430 | -0.6517 | 1.35
13q14.3 | 51.372–51.619 | 0.247 | -0.5265 | 0.95
17p13.3 | 19–286 | 0.267 | -0.6985 | 2.77
20q13.12 | 41.770–42.770 | 1.000 | -0.5799 | 4.40
20q13.12 | 43.964–44.312 | 0.348 | -0.6627 | 7.90
20q13.12–q13.13 | 45.540–46.880 | 1.340 | -0.6180 | 23.75
20q13.13 | 47.433–48.415 | 0.982 | -0.6161 | 7.44
20q13.2 | 49.446–49.647 | 0.201 | -0.5912 | 0.46
LOH = loss of heterozygosity; Mba = megabases. LOH score represents the negative decadic logarithm of the p value of the probability for LOH in the tested region by calculation in a LOH probability model.
DNA [27]. We selected the heterozygous deletion on chromosome 20 in patient 0389 and the heterozygous deletion on chromosome 9 in patient 0386 for validation. Both validations confirmed a hemizygous loss of genomic material and LOH in the tested regions of these samples (Fig. 2A, B, C, and D).
LOH, UPD, and allelic imbalance
Besides confirmation of detected deletions, the calculation of LOH led to the identification of highly significant regions of LOH that did not display corresponding loss of genetic material in the copy number analysis. When these regions fulfilled the criteria of having LOH probabilities with a p value <10⁻⁶ (= LOH score >6) and were >2 Mba, they were defined as regions of UPD [11,12]. By this method, a total of nine regions of UPD were detected across all samples (Table 3). It is noteworthy that a common region of UPD on chromosome 3 was present in two samples. This region contains several potentially MDS-relevant genes, such as 5-aminolevulinate synthase (ALAS1), an important player in heme synthesis [30], or RNA binding motif 6 (RBM6), recently discovered in a new gene fusion in acute megakaryoblastic leukemia [31].
Integration of genomic SNP chip copy number data with global gene expression analysis from CD34+ and corresponding low-density BM cells
In order to assess whether genomic alterations translate into changes of gene expression, we performed global gene expression analyses on the GeneChip HG U133 Plus 2.0 array using RNA from CD34+ cells. To analyze corresponding regions of genomic change, data from these arrays were
processed with the PARTEK software. Significant regions of genomic copy number alterations were overlaid onto the gene expression data. We first examined whether changes of genomic copy number had any effect on the mean gene expression in these regions. However, no changes of mean gene expression beyond 1.1-fold differences were detected in any of the areas. Consequently, we carried out a more detailed approach and analyzed single genes in the significant regions. Thereby, genes that were downregulated at least 1.5-fold in the respective regions were assessed. This procedure yielded a number of genes in the regions of deletion on chromosome 9 in patient 0386 and chromosome 20 in patient 0389. In CD34+ cells of patient 0386 on chromosome 9, three genes were downregulated (Fig. 3A) and in patient 0389 on chromosome 20, six genes were downregulated (Fig. 3B). To track whether differential gene expression detected in CD34+ cells is also retained in hematopoietic cells differentiating from them, we also performed a gene expression analysis of unselected mononuclear BM cells of the same patients. Upon analysis of differential gene expression in the regions of significant copy number change, we found that more genes were downregulated in unselected BM cells than in CD34+ cells. In BM cells of patient 0386 in the region of deletion on chromosome 9, twenty genes were downregulated (Fig. 3C and Suppl. Table S1) and in patient 0389 in the region of deletion on chromosome 20, eight genes were downregulated (Fig. 3D and Suppl. Table S2). We found that in the samples featuring the deletion, the identified genes were markedly downregulated as compared to the mean gene expression of these genes in the other MDS and normal samples without deletions in these regions (Fig. 3A, B, C, and D). Moreover, in patient 0386
[Figure 2, panels: (A) Validation of deletion on chromosome 20q (y-axis: copy number; samples 0389 PB, 0389 CD34+, NC1, NC2). (B) Validation of LOH on chromosome 20q (samples 0389 PB, 0389 CD34+). (C) Validation of deletion on chromosome 9q (y-axis: copy number; samples 0386 PB, 0386 CD34+, NC1, NC2). (D) Validation of LOH on chromosome 9q (samples 0386 PB, 0386 CD34+).]
Figure 2. Validation of copy number change and loss of heterozygosity (LOH). (A) Validation of copy number change on chromosome 20q in patient 0389 with quantitative real-time polymerase chain reaction (PCR): '0389 CD34+' = DNA from CD34+ bone marrow cells of patient 0389, '0389 PB' = DNA from peripheral blood (PB) mononuclear cells from patient 0389, 'NC1' and 'NC2' = DNA from PB mononuclear cells from normal controls. (B) Validation of LOH by variable number tandem repeats (VNTR) (D20S213) analysis in the putatively deleted region on chromosome 20q in patient 0389: '0389 PB' shows biallelic PCR product from peripheral blood DNA from this patient, '0389 CD34+' shows PCR product from DNA from CD34+ bone marrow with the loss of one allele. (C) Validation of copy number change on chromosome 9q in patient 0386 with quantitative real-time PCR: '0386 CD34+' = DNA from CD34+ bone marrow cells of patient 0386, '0386 PB' = DNA from peripheral blood mononuclear cells from patient 0386, 'NC1' and 'NC2' = DNA from peripheral blood mononuclear cells from normal controls. (D) Validation of LOH by VNTR (D9S1786) analysis in the putatively deleted region on chromosome 9q in patient 0386: '0386 PB' shows biallelic PCR product from peripheral blood DNA from this patient, '0386 CD34+' shows PCR product from DNA from CD34+ bone marrow with the loss of one allele.
Table 3. Detailed list of all detected regions of uniparental disomy >2 megabases
Sample ID | Chromosome | Position | Length (Mba)
0303 | 3p21.31–3p21.2 | 49566925–51950950 | 2.38
0336 | 3p21.31–3p21.1 | 48700446–52869179 | 4.16
0110 | 4q31.21 | 143163763–145281296 | 2.11
N0096 | 5p15.1 | 17843139–18085829 | 2.42
0071 | 7q11.21 | 63930623–66428652 | 2.49
0071 | 11p11.23–11p11.22 | 31178200–33211796 | 2.03
0386 | 11q21.3 | 67349937–70124250 | 2.77
0029 | 16q22.1 | 64970721–67400252 | 2.42
0055 | 20q11.1–20q11.21 | 28179417–30490782 | 2.31
Mba = megabases.
downregulation of the three genes CDC28 protein kinase regulatory subunit 2 (CKS2), SECIS binding protein 2 (SECISBP2), and UDP-glucose ceramide glucosyltransferase (UGCG) in CD34+ cells was also retained in mononuclear BM (Suppl. Table S1). The same applied for the gene adenosine deaminase (ADA), which was downregulated in CD34+ cells and in unselected BM of patient 0389 (Suppl. Table S2).
Integration of genomic SNP chip UPD data with global gene expression analysis from CD34+ cells and mononuclear BM cells
After analysis of gene expression in regions of copy number change, we followed this strategy for all regions of calculated UPD. Similar to the areas of copy number alterations, the mean expression of genes in the calculated areas of UPD was not influenced beyond 1.1-fold differences. However, the detailed analysis of genes that were downregulated more than 1.5-fold yielded a group of three genes from CD34+ and four genes from unselected BM samples (Table 4). The selected genes were more strongly downregulated in the samples containing the region of UPD than the mean expression of the other MDS samples (Figs. 3E and F).
Data mining and validation of identified genes by real-time PCR
Analysis of genomic DNA from CD34+ cells from MDS patients and integration with gene expression data led to identification of several genes that were downregulated in the affected regions. To further converge on candidate genes, we searched for genes with a potential pathogenetic impact in MDS. Thereby, genes were of special interest if they were concomitantly downregulated in CD34+ cells and unselected BM. In this analysis, we deemed two genes to be of significance for MDS: adenosine deaminase (ADA) and calreticulin (CALR). ADA was concomitantly downregulated in CD34+ cells (Fig. 3B and Suppl. Fig. S1A) and mononuclear BM cells (Suppl. Table S2). We analyzed its expression pattern in all samples by real-time PCR and confirmed its twofold
downregulation in patient 0389 and a concomitant downregulation in other MDS samples in comparison to the healthy individuals (Suppl. Fig. S1B). Another gene selected for further analysis was Calreticulin (CALR), as it was markedly downregulated in all MDS samples (Suppl. Fig. S1C). Validation of the expression pattern of CALR by real-time RT PCR confirmed this observation (Suppl. Fig. S1D).
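The gene selection step used throughout the Results (expression normalized to a baseline of the healthy controls, then selection of genes downregulated at least 1.5-fold) can be sketched as follows. This is a minimal illustration under stated assumptions: the gene names and intensity values are invented placeholders, the function names are hypothetical, and the actual analysis was performed in the PARTEK software.

```python
from statistics import mean


def normalize_to_controls(sample_value, control_values):
    """Express a sample value relative to the mean of the healthy controls (baseline = 1)."""
    baseline = mean(control_values)
    if baseline <= 0:
        raise ValueError("control baseline must be positive")
    return sample_value / baseline


def downregulated_genes(sample_expr, control_expr, min_fold=1.5):
    """Return genes whose normalized expression is reduced at least `min_fold`-fold.

    sample_expr:  dict gene -> expression value in the sample carrying the lesion
    control_expr: dict gene -> list of expression values in healthy controls
    """
    selected = []
    for gene, value in sample_expr.items():
        ratio = normalize_to_controls(value, control_expr[gene])
        if ratio <= 1.0 / min_fold:
            selected.append(gene)
    return sorted(selected)


if __name__ == "__main__":
    # Placeholder genes and values for genes located inside one deleted region.
    sample = {"GENE_A": 40.0, "GENE_B": 90.0, "GENE_C": 55.0}
    controls = {
        "GENE_A": [100.0, 120.0, 80.0, 100.0],
        "GENE_B": [95.0, 105.0, 100.0, 100.0],
        "GENE_C": [110.0, 90.0, 100.0, 100.0],
    }
    print(downregulated_genes(sample, controls))
```

Restricting the input to genes that map inside a region of deletion or UPD, as done in this study, is what keeps the resulting candidate list short.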
Discussion
The availability of high-density 500K SNP arrays represents a new, powerful, and feasible method to detect genomic copy number lesions and moreover to perform additional genotype analyses for the detection of LOH, UPD [22], or genome-wide association analyses [32]. Studies employing high-resolution methods, such as comparative genomic hybridization or array-based high-density SNP genotyping, to identify small cryptic chromosomal lesions in MDS have indeed shown that these methods are important supplements to traditional metaphase cytogenetics and unveil numerous new copy number lesions and moreover copy number neutral aberrations such as UPD [8–12]. Because it is a prevalent theory that the dysplastic BM phenotype in MDS originates from and is sustained by a small number of putative MDS stem cells [1,33,34], the objective of this study was to employ high-density 500K SNP arrays to assess highly purified CD34+ hematopoietic progenitor cells of MDS patients for unidentified cryptic genomic lesions. The use of CD34+ selected cells as opposed to unselected BM for the SNP array analysis is supported by the observation that MDS blast cells in the BM express the CD34 marker, but the proportion of CD34+ cells in MDS marrow is low [35–37]. Furthermore, a recent study comparing genomic alterations between purified CD34+ cells and the CD34-negative BM fraction of MDS patients found considerable differences between these two cell compartments in the same patients [38]. By using high-density 500K SNP arrays, we have discovered numerous previously cryptic heterozygous deletions and regions of UPD in our MDS samples. UPD can exist as heterodisomy or isodisomy, depending on the mechanism of formation. UPD can either arise through several mechanisms on the level of the gametes, such as trisomy rescue, compensatory UPD, or gametic complementation, or it can develop due to a somatic recombinational event [6].
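The criteria used in this study to call a region of UPD (an LOH score above 6, a length above 2 Mba, and no corresponding loss of genetic material) can be written down as a short filter. The sketch below is illustrative only: the `Region` type, the example values, and the use of a mean log2 ratio within ±0.5 as the test for copy number neutrality are assumptions made here, and the LOH p values themselves would come from the LOH probability model, which is not reproduced.

```python
import math
from dataclasses import dataclass


@dataclass
class Region:
    start_mb: float         # region start in megabases
    end_mb: float           # region end in megabases
    loh_p_value: float      # p value from an LOH probability model (input, not computed here)
    mean_log2_ratio: float  # mean copy number log2 ratio across the region


def loh_score(p_value):
    """LOH score = negative decadic logarithm of the p value."""
    return -math.log10(p_value)


def is_upd(region, min_score=6.0, min_length_mb=2.0, neutral_log2=0.5):
    """True if the region shows significant LOH, is long enough, and is copy number neutral."""
    length_mb = region.end_mb - region.start_mb
    significant = loh_score(region.loh_p_value) > min_score
    long_enough = length_mb > min_length_mb
    copy_neutral = abs(region.mean_log2_ratio) < neutral_log2  # assumed neutrality test
    return significant and long_enough and copy_neutral


if __name__ == "__main__":
    # Hypothetical regions, not taken from the study.
    neutral_loh = Region(49.5, 51.9, 1e-8, 0.02)
    deletion = Region(45.5, 46.9, 1e-20, -0.62)
    print(is_upd(neutral_loh), is_upd(deletion))
```

The second region fails the filter even though its LOH is highly significant, because the accompanying loss of signal marks it as a heterozygous deletion rather than a copy number neutral event.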
Because matched normal DNA for our MDS samples was not readily available, we made use of an LOH probability algorithm to calculate regions of UPD using nonmatched reference distributions [21–25]. Similar to copy number polymorphisms, which are found also in healthy populations [39], tracts of homozygosity have been detected in healthy individuals [40]. Due to this circumstance, we have applied stringent criteria and only included regions of UPD >2 Mba to identify disease-associated regions of
[Figure 3, bar charts; y-axis: relative expression (arbitrary units); bars compare the index patient with mean MDS (n=12) and mean normals (n=4). Panels: (A) Genes downregulated >1.5-fold in CD34+ cells of patient 0386 in region of deletion on chromosome 9 (CKS2, SECISBP2, UGCG). (B) Genes downregulated >1.5-fold in CD34+ cells of patient 0389 in region of deletion on chromosome 20 (MYBL2, SERINC3, PKIG, ADA, PLTP, NCOA5). (C) Genes downregulated >1.5-fold in whole bone marrow of patient 0386 in region of deletion on chromosome 9. (D) Genes downregulated >1.5-fold in whole bone marrow of patient 0389 in region of deletion on chromosome 20. (E) Genes downregulated >1.5-fold in CD34+ cells in regions of UPD in all patients. (F) Genes downregulated >1.5-fold in whole bone marrow in regions of UPD in all patients.]
Figure 3. Graphs displaying the differential gene expression of selected genes in regions of genomic lesions. (A) and (B) Genes contained in the regions of deletion on chromosome 9 in patient 0386 and chromosome 20 in patient 0389, which are downregulated at least 1.5-fold in CD34+ cells. (C) and (D) Genes contained in the regions of deletion on chromosome 9 in patient 0386 (n = 20 genes) and chromosome 20 in patient 0389 (n = 8 genes), which were downregulated at least 1.5-fold in unselected bone marrow cells. (E) Genes contained in the regions of uniparental disomy (UPD) summarized from all patients that are downregulated at least 1.5-fold in CD34+ cells (n = 3 genes). (F) Genes contained in the regions of UPD summarized from all patients that are downregulated at least 1.5-fold in unselected bone marrow cells (n = 4 genes). The graphs demonstrate that the selected genes are exclusively downregulated in the samples containing the genomic deletion as compared to the other myelodysplastic syndrome (MDS) samples or healthy individuals.
UPD in MDS as done elsewhere [11,12]. Due to the increased detection of UPD in cells of patients with MDS as compared to the healthy individuals used as controls, we suggest UPD as a possible new important pathogenic factor in MDS. Our findings of UPD are in accordance with recent studies that have detected increased numbers of UPD regions in a large fraction of MDS and secondary AML [9–12]. Similar to chromosomal abnormalities
Table 4. Genes contained in the regions of uniparental disomy that were downregulated at least 1.5-fold in CD34+ cells and unselected bone marrow cells

Genes downregulated >1.5-fold in regions of UPD in CD34+ cells of all patients
Location | Gene symbol | Gene title
3p21.3 | RBM6 | RNA binding motif protein 6
3p21 | GNAI2 | Guanine nucleotide binding protein (G protein), alpha inhibiting activity polypeptide 2
20q11.21 | TM9SF4 | Transmembrane 9 superfamily protein member 4
RefSeq (as listed): AI418892, AI190489, NM_002070

Genes downregulated >1.5-fold in regions of UPD in unselected bone marrow cells of all patients
Location | Gene symbol | Gene title | RefSeq
3p21.31 | APEH | N-acylaminoacyl-peptide hydrolase | NM_001640
3p21.3 | HYAL3 | Hyaluronoglucosaminidase 3 | BC005896
3p21.3 | MAPKAPK3 | Mitogen-activated protein kinase-activated protein kinase 3 | U43784
7q11.21 | RABGEF1 | RAB guanine nucleotide exchange factor (GEF) 1 | AU159357

RefSeq = reference sequence (NCBI); UPD = uniparental disomy.
detected by conventional cytogenetics, which are an integral part of MDS prognostic systems, UPD detected by SNP arrays may also become valuable for the development of advanced prognostic systems. Furthermore, UPD that occurs in common regions in the genomes of MDS BM cells may give leads to new target genes relevant for new therapies. An excellent example for the relevance of UPD is the recent discovery of the activating mutation of the Janus kinase 2 gene in myeloproliferative disorders [41], which is closely connected to UPD on chromosome 9p [24]. Analyzing malignant hematopoietic cells by SNP arrays, however, is a global screening method that produces large lists of interesting, potentially disease-relevant candidate genes. A difficulty lies in narrowing these down to a few genes that have the highest possible probability of being relevant for the disease. By sequentially analyzing gene expression of CD34+ cells followed by assessment of the gene expression in the differentiated hematopoietic cells arising from these stem cells, we aimed to identify those genes that were selected in the CD34+ cells and still retained their differential expression in the descending bone marrow, and that therefore have a high probability of playing a role in the disrupted hematopoiesis of MDS. The selection of genes encoded in regions of genomic lesions led to a great reduction of data complexity. In addition, the integration of genomic data with global gene expression data helped to narrow down the candidate genes in these regions further. Interestingly, we found that the mean gene expression of genes in the affected regions was not influenced beyond 1.1-fold differences in comparison to samples that did not feature a lesion. Thus, they were not markedly downregulated on average, as might be expected due to a gene dosage effect. Little information exists on the correlation between genomic array and global gene expression data.
While single genes or groups of genes located in a region of a genomic alteration have been shown to be differentially expressed, an obligatory change in the mean expression of all genes in these areas has not been clearly demonstrated [14,15]. Therefore, we performed a supervised analysis of the genes in the detected regions of heterozygous deletion or UPD and selected those genes that were downregulated at least 1.5-fold in the samples with the genomic lesions. This procedure demonstrated that the selected genes were exclusively downregulated in samples containing genomic lesions. It also showed that fewer genes were differentially regulated in CD34+ cells than in unselected BM. This strengthened the significance of genes found to be differentially regulated in CD34+ cells when they were concomitantly downregulated in the unselected BM samples. This further refinement of MDS candidate genes yielded gene groups with a focus on transcription, apoptosis, and cell cycle control.
In conclusion, this is the first study to genotype CD34+ BM cells from MDS patients with high-density 500K SNP arrays. It led to the detection of new cryptic genomic lesions and highlighted an increased occurrence of UPD in MDS. Moreover, we show that integrating the genomic data with global gene expression data from the same cells is a powerful method to identify new potential candidate genes for ongoing research in MDS pathogenesis and therapy.
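The supervised selection described above can be illustrated with a minimal Python sketch. It is not the software used in the study: the gene names, sample names, and expression values are hypothetical, and fold downregulation is assumed here to mean the mean expression in samples without the lesion divided by the mean expression in samples with it. Only the 1.5-fold threshold and the comparison of CD34+ cells with unselected bone marrow are taken from the text.

```python
from math import exp, log
from statistics import mean

THRESHOLD = 1.5  # minimum fold downregulation, as stated in the text


def fold_down(values, lesion_samples):
    """Mean expression without the lesion divided by mean expression with it
    (an assumed definition, chosen for illustration)."""
    with_lesion = [v for s, v in values.items() if s in lesion_samples]
    without = [v for s, v in values.items() if s not in lesion_samples]
    return mean(without) / mean(with_lesion)


def downregulated(expression, lesion_samples, threshold=THRESHOLD):
    """Genes downregulated at least `threshold`-fold in the lesion samples."""
    return {gene for gene, values in expression.items()
            if fold_down(values, lesion_samples) >= threshold}


def region_mean_fold(expression, lesion_samples):
    """Geometric mean of the per-gene fold changes over all genes in a region."""
    logs = [log(fold_down(v, lesion_samples)) for v in expression.values()]
    return exp(mean(logs))


# Hypothetical expression values (arbitrary units); not data from the study.
cd34 = {
    "GENE_A": {"p1": 40.0, "p2": 100.0, "p3": 100.0},
    "GENE_B": {"p1": 95.0, "p2": 100.0, "p3": 90.0},
    "GENE_C": {"p1": 120.0, "p2": 100.0, "p3": 80.0},
}
bm = {
    "GENE_A": {"p1": 50.0, "p2": 90.0, "p3": 110.0},
    "GENE_B": {"p1": 30.0, "p2": 100.0, "p3": 80.0},
    "GENE_C": {"p1": 100.0, "p2": 100.0, "p3": 100.0},
}
lesion = {"p1"}  # hypothetical sample carrying the genomic lesion

cd34_hits = downregulated(cd34, lesion)
bm_hits = downregulated(bm, lesion)
concomitant = cd34_hits & bm_hits  # downregulated in both cell populations

print(sorted(cd34_hits), sorted(bm_hits), sorted(concomitant))
print(round(region_mean_fold(cd34, lesion), 2))
```

In this toy example a single strongly downregulated gene passes the per-gene filter even though the region-wide mean fold change stays modest, which mirrors the distinction the discussion draws between mean expression across a region and supervised selection of individual genes.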
Acknowledgments
This work was supported by a grant from the DFG (HO2207/3-1) and by the Gutermuth Foundation. The authors declare no conflicts of interest.
References
1. Corey SJ, Minden MD, Barber DL, Kantarjian H, Wang JC, Schimmer AD. Myelodysplastic syndromes: the complexity of stem-cell diseases. Nat Rev Cancer. 2007;7:118–129.
2. Bernasconi P, Boni M, Cavigliano PM, et al. Clinical relevance of cytogenetics in myelodysplastic syndromes. Ann N Y Acad Sci. 2006;1089:395–410.
3. Hofmann WK, Koeffler HP. Myelodysplastic syndrome. Annu Rev Med. 2005;56:1–16.
4. Tiu R, Gondek L, O’Keefe C, Maciejewski JP. Clonality of the stem cell compartment during evolution of myelodysplastic syndromes and other bone marrow failure syndromes. Leukemia. 2007;21:1648–1657.
5. Komura D, Shen F, Ishikawa S, et al. Genome-wide detection of human copy number variations using high-density DNA oligonucleotide arrays. Genome Res. 2006;16:1575–1584.
6. Robinson WP. Mechanisms leading to uniparental disomy and their clinical consequences. Bioessays. 2000;22:452–459.
7. Gondek LP, Tiu R, Haddad AS, et al. Single nucleotide polymorphism arrays complement metaphase cytogenetics in detection of new chromosomal lesions in MDS. Leukemia. 2007;21:2058–2061.
8. O’Keefe CL, Tiu R, Gondek LP, et al. High-resolution genomic arrays facilitate detection of novel cryptic chromosomal lesions in myelodysplastic syndromes. Exp Hematol. 2007;35:240–251.
9. Gondek LP, Dunbar AJ, Szpurka H, McDevitt MA, Maciejewski JP. SNP array karyotyping allows for the detection of uniparental disomy and cryptic chromosomal abnormalities in MDS/MPD-U and MPD. PLoS One. 2007;2:e1225.
10. Gondek LP, Tiu R, O’Keefe CL, Sekeres MA, Theil KS, Maciejewski JP. Chromosomal lesions and uniparental disomy detected by SNP arrays in MDS, MDS/MPD, and MDS-derived AML. Blood. 2008;111:1534–1542.
11. Mohamedali A, Gaken J, Twine NA, et al. Prevalence and prognostic significance of allelic imbalance by single-nucleotide polymorphism analysis in low-risk myelodysplastic syndromes. Blood. 2007;110:3365–3373.
12. Wang L, Fidler C, Nadig N, et al. Genome-wide analysis of copy number changes and loss of heterozygosity in myelodysplastic syndrome with del(5q) using high-density single nucleotide polymorphism arrays. Haematologica. 2008;93:994–1000.
13. Mullighan CG, Goorha S, Radtke I, et al. Genome-wide analysis of genetic alterations in acute lymphoblastic leukaemia. Nature. 2007;446:758–764.
14. Garraway LA, Widlund HR, Rubin MA, et al. Integrative genomic analyses identify MITF as a lineage survival oncogene amplified in malignant melanoma. Nature. 2005;436:117–122.
15. Walker BA, Leone PE, Jenner MW, et al. Integration of global SNP-based mapping and expression arrays reveals key regions, mechanisms, and genes important in the pathogenesis of multiple myeloma. Blood. 2006;108:1733–1743.
16. Takeyama K, Monti S, Manis JP, et al. Integrative analysis reveals 53BP1 copy loss and decreased expression in a subset of human diffuse large B-cell lymphomas. Oncogene. 2007;27:318–322.
17. Hofmann WK, de Vos S, Komor M, Hoelzer D, Wachsman W, Koeffler HP. Characterization of gene expression of CD34+ cells from normal and myelodysplastic bone marrow. Blood. 2002;100:3553–3560.
18. Komor M, Guller S, Baldus CD, et al. Transcriptional profiling of human hematopoiesis during in vitro lineage-specific differentiation. Stem Cells. 2005;23:1154–1169.
19. Carvalho B, Bengtsson H, Speed TP, Irizarry RA. Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data. Biostatistics. 2007;8:485–499.
20. Downey T. Analysis of a multifactor microarray study using Partek genomics solution. Methods Enzymol. 2006;411:256–270.
21. Huang J, Wei W, Zhang J, et al. Whole genome DNA copy number changes identified by high density oligonucleotide arrays. Hum Genomics. 2004;1:287–299.
22. Dutt A, Beroukhim R. Single nucleotide polymorphism array analysis of cancer. Curr Opin Oncol. 2007;19:43–49.
23. Nannya Y, Sanada M, Nakazaki K, et al. A robust algorithm for copy number detection using high-density oligonucleotide single nucleotide polymorphism genotyping arrays. Cancer Res. 2005;65:6071–6079.
24. Yamamoto G, Nannya Y, Kato M, et al. Highly sensitive method for genomewide detection of allelic composition in nonpaired, primary tumor specimens by use of Affymetrix single-nucleotide-polymorphism genotyping microarrays. Am J Hum Genet. 2007;81:114–126.
25. Beroukhim R, Lin M, Park Y, et al. Inferring loss-of-heterozygosity from unpaired tumors using high-density oligonucleotide SNP arrays. PLoS Comput Biol. 2006;2:e41.
26. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method. Methods. 2001;25:402–408.
27. Weksberg R, Hughes S, Moldovan L, Bassett AS, Chow EW, Squire JA. A method for accurate detection of genomic microdeletions using real-time quantitative PCR. BMC Genomics. 2005;6:180.
28. Xie D, Hofmann WK, Mori N, Miller CW, Hoelzer D, Koeffler HP. Allelotype analysis of the myelodysplastic syndrome. Leukemia. 2000;14:805–810.
29. Hofmann WK, Takeuchi S, Xie D, Miller CW, Hoelzer D, Koeffler HP. Frequent loss of heterozygosity in the region of D1S450 at 1p36.2 in myelodysplastic syndromes. Leuk Res. 2001;25:855–858.
30. Sadlon TJ, Dell’Oso T, Surinya KH, May BK. Regulation of erythroid 5-aminolevulinate synthase expression during erythropoiesis. Int J Biochem Cell Biol. 1999;31:1153–1167.
31. Gu TL, Mercher T, Tyner JW, et al. A novel fusion of RBM6 to CSF1R in acute megakaryoblastic leukemia. Blood. 2007;110:323–333.
32. Samani NJ, Erdmann J, Hall AS, et al. Genomewide association analysis of coronary artery disease. N Engl J Med. 2007;357:443–453.
33. Nilsson L, Astrand-Grundstrom I, Anderson K, et al. Involvement and functional impairment of the CD34(+)CD38(-)Thy-1(+) hematopoietic stem cell pool in myelodysplastic syndromes with trisomy 8. Blood. 2002;100:259–267.
34. Nilsson L, Eden P, Olsson E, et al. The molecular signature of MDS stem cells supports a stem-cell origin of 5q myelodysplastic syndromes. Blood. 2007;110:3005–3014.
35. Ogata K, Nakamura K, Yokose N, et al. Clinical significance of phenotypic features of blasts in patients with myelodysplastic syndrome. Blood. 2002;100:3887–3896.
36. Ogata K, Kishikawa Y, Satoh C, Tamura H, Dan K, Hayashi A. Diagnostic application of flow cytometric characteristics of CD34+ cells in low-grade myelodysplastic syndromes. Blood. 2006;108:1037–1044.
37. Kanter-Lewensohn L, Hellstrom-Lindberg E, Kock Y, Elmhorn-Rosenborg A, Ost A. Analysis of CD34-positive cells in bone marrow from patients with myelodysplastic syndromes and acute myeloid leukemia and in normal individuals: a comparison between FACS analysis and immunohistochemistry. Eur J Haematol. 1996;56:124–129.
38. Starczynowski DT, Vercauteren S, Telenius A, et al. High-resolution whole genome tiling path array CGH analysis of CD34+ cells from patients with low-risk myelodysplastic syndromes reveals cryptic copy number alterations and predicts overall and leukemia-free survival. Blood. 2008;28:28.
39. Feuk L, Carson AR, Scherer SW. Structural variation in the human genome. Nat Rev Genet. 2006;7:85–97.
40. Gibson J, Morton NE, Collins A. Extended tracts of homozygosity in outbred human populations. Hum Mol Genet. 2006;15:789–795.
41. Baxter EJ, Scott LM, Campbell PJ, et al. Acquired mutation of the tyrosine kinase JAK2 in human myeloproliferative disorders. Lancet. 2005;365:1054–1061.
Appendix
[Supplementary Figure S1, panels A–D. Panel titles: ADA 204639_at, ADA real-time PCR, CALR 200935_at, CALR real-time PCR. Y-axes: relative expression [arbitrary units]. X-axes: samples 0029, 0055, 0061, 0071, 0085, 0110, 0227, 0303, 0309, 0319, 0336, 0386, 0389, n0093, n0096, n0099, n0102. One sample in each panel is marked with an asterisk.]
Supplementary Figure S1. Gene expression profiles. (A) Expression of adenosine deaminase (ADA) as detected by microarray in CD34+ cells and (B) confirmed by real-time polymerase chain reaction (PCR). (C) Expression of calreticulin (CALR) as detected by microarray and (D) confirmed by real-time PCR. *Marks the sample with the genomic lesion.
Supplementary Table S1. Genes contained in the regions of deletion on chromosome 9 in patient 0386 that were downregulated at least 1.5-fold in CD34+ cells (upper table) and unselected bone marrow cells (lower table)
Location | Gene symbol | Gene title | RefSeq
Genes downregulated ≥1.5-fold in region of deletion in CD34+ cells of patient 0386
9q22 | CKS2 | CDC28 protein kinase regulatory subunit 2 | NM_001827
9q22.2 | SECISBP2 | SECIS binding protein 2 | NM_024077
9q31 | UGCG | UDP-glucose ceramide glucosyltransferase | NM_003358
Genes downregulated ≥1.5-fold in region of deletion in unselected bone marrow cells of patient 0386
9q22.31 | FAM120A | Family with sequence similarity 120A | BF591199
9q22-q31 | SEMA4D | Sema domain, immunoglobulin domain (Ig), transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4D | NM_006378
9q22 | CKS2 | CDC28 protein kinase regulatory subunit 2 | NM_001827
9q21.33 | AGTPBP1 | ATP/GTP binding protein 1 | AB028958
9q22.3 | PTCH1 | Patched homolog 1 (Drosophila) | BC043542
9q22.31 | PHF2 | Regulation of transcription, DNA-dependent | NM_005392
9q22.2 | SECISBP2 | SECIS binding protein 2 | NM_024077
9q22.33 | CDC14B | CDC14 cell division cycle 14 homolog B (S. cerevisiae) | NM_003671
9q22.33 | HIATL2 | Hippocampus abundant gene transcript-like 2 | BC023631
9q22.32 | C9orf21 | Chromosome 9 open reading frame 21 | AI655189
9q22.2 | C9orf164 | Chromosome 9 open reading frame 164 | AF289567
9q22|9q22.32 | ZNF367 | Zinc finger protein 367 | N62196
9q21.33 | LOC389765 | Similar to KIF27C | AI684760
9q31 | UGCG | UDP-glucose ceramide glucosyltransferase | NM_003358
9q32 | LOC158402 | Hypothetical protein LOC158402 | BE504242
9q31.3-q33.1 | SUSD1 | Sushi domain containing 1 | AL137432
9q32 | ROD1 | ROD1 regulator of differentiation 1 (S. pombe) | AF085953
9q32 | HSDL2 | Hydroxysteroid dehydrogenase like 2 | AF086494
9q32 | KIAA1958 | KIAA1958 | BG026236
9q32 | C9orf80 | Chromosome 9 open reading frame 80 | BC013097
Genes that were downregulated concomitantly in CD34+ cells and unselected bone marrow are marked bold. RefSeq = reference sequence (NCBI); UPD = uniparental disomy.
Supplementary Table S2. Genes contained in the regions of deletion on chromosome 20 in patient 0389 that were downregulated at least 1.5-fold in CD34+ cells (upper table) and unselected bone marrow cells (lower table)
Location | Gene symbol | Gene title | RefSeq
Genes downregulated ≥1.5-fold in region of deletion in CD34+ cells of patient 0389
20q13.1 | MYBL2 | V-myb myeloblastosis viral oncogene homolog (avian)-like 2 | NM_002466
20q13.1-13.3 | SERINC3 | Serine incorporator 3 | BC006088
20q12-q13.1 | PKIG | Protein kinase (cAMP-dependent, catalytic) inhibitor γ | NM_007066
20q12-q13.11 | ADA | Adenosine deaminase | AK090842
20q12-q13.1 | PLTP | Phospholipid transfer protein | NM_006227
20q12-q13.12 | NCOA5 | Nuclear receptor coactivator 5 | AB046857
Genes downregulated ≥1.5-fold in region of deletion in unselected bone marrow cells of patient 0389
20q13.1-q13.2 | B4GALT5 | UDP-Gal:betaGlcNAc β 1,4-galactosyltransferase, polypeptide 5 | AI084451
20q13.11 | C20orf111 | Chromosome 20 open reading frame 111 | AF217514
20q13.1 | CEBPB | CCAAT/enhancer binding protein (C/EBP), β | AL564683
20q11.2-q13.1 | MMP9 | Matrix metallopeptidase 9 (gelatinase B, 92 kDa gelatinase, 92 kDa type IV collagenase) | NM_004994
20q13.13 | PREX1 | Phosphatidylinositol 3,4,5-trisphosphate-dependent RAC exchanger 1 | BF308645
20q12-q13.2 | SULF2 | Sulfatase 2 | AL133001
20q13.13 | ZNF313 | Zinc finger protein 313 | AL031685
20q12-q13.11 | ADA | Adenosine deaminase | AK090842
Genes that were downregulated concomitantly in CD34+ cells and unselected bone marrow are marked bold. RefSeq = reference sequence (NCBI).