Biochimica et Biophysica Acta 1779 (2008) 347–355
Genomic structure, alternative splicing and expression of TG-interacting factor, in human myeloid leukemia blasts and cell lines
Rizwan Hamid a,⁎, Johnequia Patterson a, Stephen J. Brandt b,c,d,e,f
a Department of Pediatrics, Vanderbilt University Medical Center, Nashville, Tennessee, USA
b Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, USA
c Department of Cell and Developmental Biology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
d Department of Cancer Biology, Vanderbilt University Medical Center, Nashville, Tennessee, USA
e Vanderbilt-Ingram Cancer Center of Vanderbilt University Medical Center, Nashville, Tennessee, USA
f VA Tennessee Valley Healthcare System, Nashville, Tennessee, USA
ARTICLE INFO
Article history: Received 26 February 2008; Received in revised form 26 March 2008; Accepted 4 April 2008; Available online 13 April 2008
Keywords: TGIF; Leukemia; AML; Splice isoforms; Expression; Acute myelogenous leukemia
ABSTRACT
TG-interacting factor (TGIF) is a homeobox transcriptional repressor that has been implicated in holoprosencephaly and various types of cancer, including leukemias. In this study, we provide the first detailed description of the TGIF locus, characterizing 12 TGIF splice isoforms. These isoforms have similar open reading frames but different 5′ untranslated regions. TGIF expression data are presented from multiple tissues, cell lines and primary leukemia cells. Isoform-specific real-time PCR analysis showed that, even though these isoforms were broadly expressed, all except isoform 4 had very low levels of expression. In fact, isoform 4 was the predominant TGIF isoform expressed in all tissues analyzed. Since TGIF levels have recently been implicated in acute myelogenous leukemia, we proceeded to characterize the minimal promoter region of isoform 4 as a first step in understanding mechanisms of TGIF expression. As expected for homeobox genes, the minimal promoter region for isoform 4 has multiple Sp1 binding sites and a CpG island, raising the possibility that the low TGIF expression seen in some AML patients and leukemia cell lines may be secondary to methylation. Further characterization of expression from this promoter using 5-Aza-2′-deoxycytidine treatment and transient expression assays showed that decreased TGIF expression is likely secondary to active repression and not due to promoter methylation. A detailed characterization of this complex locus is important as it may help to clarify the functions of this gene in brain development and leukemia biology. © 2008 Elsevier B.V. All rights reserved.
⁎ Corresponding author. Division of Medical Genetics, Room DD-2205, Medical Center North, Vanderbilt University Medical Center, Nashville, TN 37232, USA. Tel.: +1 615 322 7601; fax: +1 615 343 9951. E-mail address: [email protected] (R. Hamid).
1874-9399/$ – see front matter © 2008 Elsevier B.V. All rights reserved. doi:10.1016/j.bbagrm.2008.04.003
1. Introduction
TG-interacting factor (TGIF) is a transcriptional repressor and member of the three amino acid loop extension (TALE) class of homeodomain proteins [1]. TGIF was initially identified as a nuclear protein that bound to a retinoid X receptor (RXR) response element in the retinol binding protein II promoter [2]. Our laboratory first identified a role for TGIF in acute myelogenous leukemia (AML) from the observation that expression levels of TGIF were an independent predictor of overall survival in AML. AML patients with lower TGIF levels in their leukemia cells had a worse prognosis than patients with higher TGIF levels [3,4] (submitted). TGIF has also been implicated in other types of human cancer such as esophageal carcinoma where it was shown to be over-expressed and bladder transitional cell carcinoma where its amplification was noticed [5,6]. TGIF also plays a role in early
brain development, since specific mutations and deletions of the TGIF gene are associated with holoprosencephaly (HPE), a common structural abnormality of the forebrain in humans [7]. HPE is an autosomal dominant disorder, and TGIF is one of several genes including Sonic Hedgehog [8], SIX3 [9], ZIC2 [10], GLI2 [11], PATCHED-1 [12], TDGF-1 [13] that have been associated with haploinsufficiency [14–17]. TGIF appears to have complex functions and can interact with DNA through its own consensus-binding site or indirectly through Smad2. Transcriptional repression by TGIF likely involves multiple mechanisms, including its competition with retinoid X receptor (RXR) for the RXR recognition elements, interaction with the ligand-binding domain of RXR [18], and targeting of Smad2 for degradation or sequestration [19,20]. Additionally, TGIF can act as a Smad2 corepressor by recruiting mSin3A and histone deacetylases (HDACs) to the TGF-β-activated Smad complex [21–23]. The human TGIF gene has been localized to chromosome 18p11.2 [2]. Although a human TGIF cDNA sequence has been previously reported [2], little is known about its genomic organization. Bioinformatics analysis indicates that 35–65% of human genes utilize alternative splicing, which not only contributes significantly to
Table 1. Primer sequences used for RT-PCR analysis, cloning and verification of splice junctions
Isoform   Forward primer (5′→3′)    Reverse primer (5′→3′)
1         TTTCCACGGCTTTTCTGG        GGCGTTGATGAACCAGTTAC
2         TGTTTCTCTTTGGAGTGCC       GGCGTTGATGAACCAGTTAC
3         GGTGTCCTTTTTTCCATCTG      GGCGTTGATGAACCAGTTAC
4         TTCGCTTATCCCCTGTGTC       GGCGTTGATGAACCAGTTAC
5         GGTGTCCTTTTTTCCATCTG      GGCGTTGATGAACCAGTTAC
6         TTCCTTCGGCTGCGTTTCTG      GGCGTTGATGAACCAGTTAC
7         TGGATAGCGTGAAAGAAGC       GGCGTTGATGAACCAGTTAC
8         GGACAACAGTGATTTGCCTC      GGCGTTGATGAACCAGTTAC
9         CAGGACAGAAAACACTGCTC      GGCGTTGATGAACCAGTTAC
10        GCTGTGTCTCATCATTCCC       GGCGTTGATGAACCAGTTAC
11        ACAAAACGAGGCTCTGTCC       GGCGTTGATGAACCAGTTAC
12        CATTCGGAGTAGCCTGCTGT      GGCGTTGATGAACCAGTTAC
human proteome complexity, but also makes experimental manipulation of a particular gene more complicated [24,25]. Eight 5′-splice isoforms of TGIF are listed in the National Center for Biotechnology Information (NCBI) database; however, the full range of isoforms and their exact expression patterns have not yet been determined. Sequence variation in the 5′ ends of mRNA can affect transcriptional initiation, transcript stability, and translation efficiency. It may also lead to differential promoter use resulting in tissue-specific or developmental stage-specific expression [26–37]. Elucidation of the number of transcripts, their expression patterns, and regulatory regions can provide insights into a gene's role in different pathways and cellular processes. In investigating the potential role of TGIF sequence variations in high myopia, Scavello et al. listed the eight isoforms already reported in the NCBI database but provided neither independent verification of those isoforms nor evidence of their expression [38]. In this study, we undertook the first complete analysis of TGIF genomic organization. Our data suggest that TGIF has a complex genomic structure with multiple 5′-splice isoforms. We have characterized the expression of these isoforms in normal tissues, leukemia cell lines, and primary AML cells. To our surprise, one isoform was the predominantly expressed isoform in all tissues analyzed. Using RACE analysis and reporter constructs, we proceeded to identify its transcriptional start site and proximal promoter region. Finally, we present data that suggest mechanisms that control TGIF expression in myeloid leukemia cells.
2. Materials and methods
2.1. EST analysis
The complete genomic sequence of TGIF was used to search the human expressed sequence tag (EST) database (http://www.ncbi.nlm.nih.gov/dbEST/) for sequence similarities [39]. Using this approach a total of 355 ESTs were identified.
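The Results state that only ESTs matching at greater than 99% over their length were carried forward. A minimal sketch of such a filter is shown below; the record fields, the function names and the 0.99 coverage cut-off are illustrative assumptions, not details reported in this study.

```python
from dataclasses import dataclass


@dataclass
class EstHit:
    """One EST-to-genome alignment (field names are hypothetical)."""
    est_id: str
    est_length: int          # full length of the EST, in bases
    aligned_length: int      # number of EST bases covered by the alignment
    percent_identity: float  # identity within the aligned region, 0-100


def passes_filter(hit, min_identity=99.0, min_coverage=0.99):
    """Keep a hit only if it matches at high identity over nearly its whole length.

    min_coverage is an assumed reading of "over their length".
    """
    coverage = hit.aligned_length / hit.est_length
    return hit.percent_identity > min_identity and coverage >= min_coverage


def filter_hits(hits, min_identity=99.0, min_coverage=0.99):
    """Return the subset of hits that pass both thresholds."""
    return [h for h in hits if passes_filter(h, min_identity, min_coverage)]


# Made-up example records, for illustration only.
example_hits = [
    EstHit("est_a", 500, 500, 99.8),   # kept
    EstHit("est_b", 500, 500, 98.0),   # identity too low
    EstHit("est_c", 500, 200, 100.0),  # aligned over only part of the EST
]
print([h.est_id for h in filter_hits(example_hits)])
```

Requiring coverage as well as identity avoids retaining ESTs that match perfectly over only a short segment, which would be weak evidence for an exon boundary.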
The location and genomic organization of TGIF isoforms were provisionally determined by comparison of EST sequences and genomic sequences using the Basic Local Alignment Search Tool (BLAST) of the NCBI (www.ncbi.nlm.nih.gov/BLAST/) [40–44].
2.2. Isolation of TGIF isoforms and construction of plasmids
Based on the sequence information acquired, complementary primers were designed for PCR amplification and cloning of the identified isoforms. Total RNA was isolated from HeLa, HL-60, TF-1 and AML-193 cells using silica gel membranes with on-column deoxyribonuclease treatment (RNeasy Total RNA Isolation Kit, Qiagen, Valencia, CA). Total RNA from the above-mentioned cell lines, as well as the Human Total RNA Master Panel (Clontech, Mountain View, CA), was used as template for cDNA synthesis by reverse transcription. First-strand cDNA synthesis was performed using the Superscript-III First-Strand System (Invitrogen Life Technologies, Carlsbad, CA) with 1 μg of total RNA and an oligo (dT) primer. One-tenth volume of the first-strand reaction was used as a template for PCR amplification using the Elongase Amplification System (Invitrogen Life Technologies, Carlsbad, CA) and the primers listed in Table 1. The PCR reaction mixture was denatured for 30 s at 94 °C, cycled 35 times (94 °C, 30 s; 56 °C, 30 s; 68 °C, 1 min 30 s), followed by a 5 min extension at 68 °C. The resulting PCR products were purified by filtration with a Microcon 50 microconcentrator (Amicon Corp., Danvers, MA) and then visualized by ethidium bromide staining on a 1% agarose gel. The purified PCR products were subcloned into the pCR2.1 TA cloning vector according to the manufacturer's instructions (Invitrogen). Sequence analysis to confirm the exon boundaries and the basic structure of the individual isoforms was completed using the BigDye Terminator v3.1 sequencing kit (Applied Biosystems).
2.3. Analysis of TGIF isoform expression profiles by real-time PCR
cDNA was synthesized from total cellular RNA (1 µg) using oligo (dT) primers (Superscript-III cDNA Synthesis Kit, Invitrogen). Taqman real-time assays were designed for isoforms 1, 3, 4, 5, 6, 9, 10, 11 and 12 (Table 2) using the Primer Express software package. These assays were optimized using the standard curve method as previously described [45,46]; all had amplification efficiencies of greater than 99.5%.
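Amplification efficiency is commonly derived from the slope of a standard curve (Ct plotted against log10 of input amount) as E = 10^(−1/slope) − 1. The sketch below illustrates that general relationship; it is not the authors' analysis script, and the dilution series and Ct values are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def amplification_efficiency(quantities, ct_values):
    """Efficiency from a dilution series: E = 10 ** (-1 / slope) - 1.

    A value of 1.0 corresponds to perfect doubling in every cycle.
    """
    log_q = [math.log10(q) for q in quantities]
    slope, _ = fit_line(log_q, ct_values)
    return 10 ** (-1.0 / slope) - 1.0


# Hypothetical ten-fold dilution series (relative input amounts and Ct values).
quantities = [1000, 100, 10, 1]
ct_values = [20.0, 23.32, 26.64, 29.96]
print(f"efficiency = {amplification_efficiency(quantities, ct_values):.3f}")
```

A slope close to −3.32 cycles per ten-fold dilution corresponds to an efficiency close to 100%, which is the condition under which the comparative Ct method can be applied without an efficiency correction.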
Pre-designed Taqman (Applied Biosystems, ABI) assays were used for Isoforms 2 (Hs00545014_m1), 7 (Hs00545017_m1), 8 (Hs00545233_m1) and Total TGIF (Hs00820148_g1). Real-time PCR analysis was carried out using Taqman Universal Master Mix and the 7500 Real-Time PCR system according to the manufacturer's instructions (Applied Biosystems, Foster City, CA). We used the TaqMan human endogenous control plate (catalog number 4309199) to analyze 13 genes as potential housekeeping genes for data normalization. These included 18S rRNA, Acidic ribosomal protein, Beta-actin, Cyclophilin, Glyceraldehyde-3-phosphate dehydrogenase, Phosphoglycerokinase, β2-Microglobulin, β-Glucuronidase, Hypoxanthine phosphoribosyltransferase (HPRT), Transcription factor IID, TATA binding protein, Transferrin receptor, and ABL. This revealed that HPRT showed the least variability between tissues and was, in addition, similar in abundance to TGIF isoforms. We thus used a pre-designed Taqman assay (ABI) for HPRT (Hs99999909_m1) as a housekeeping gene for normalization. Amplification parameters consisted of initial denaturation at 95 °C for 10 min followed by 40 cycles of denaturation at 95 °C for 15 s and annealing and extension at 60 °C for 1 min. Relative expression levels were calculated using the comparative Ct method [45,46].
2.4. Determination of transcriptional start site of the major TGIF isoform
To determine the 5′-end of the major TGIF isoform (Isoform-4), 5′ rapid amplification of cDNA ends (RACE) was performed using the GeneRacer Kit (Invitrogen), according to the manufacturer's recommended protocol. This system ensures that only mature capped mRNA transcripts participate in the reaction. Total RNA was isolated from multiple sources, including HL-60, AML-193, TF-1 and HeLa cells. This total RNA was then incubated with calf intestinal phosphatase (CIP) in order to eliminate truncated mRNAs and with tobacco acid pyrophosphatase to remove the 5′-cap.
The full-length mRNA was ligated to the GeneRacer RNA oligonucleotide (5′-CGACUGGAGCACGAGGACACUGACAUGGACUGAAGGAGUAGAAA-3′) using T4 RNA ligase. First-strand cDNA synthesis was performed using SuperScript III reverse transcriptase and a manufacturer-supplied oligo (dT) primer. Double-stranded cDNA was then prepared and amplified in a PCR reaction using an Isoform 4-specific primer (5′-CCGACTCTCCCGTAACTTGTAG-3′) designed to be within the Isoform 4 coding region, and the GeneRacer 5′ primer (5′-GCACGAGGACACUGACAUGGACUGA-3′), which is homologous to the previously ligated RNA oligonucleotide. A second, nested PCR was performed using a manufacturer-supplied 5′-nested primer (5′-GGACACTGACATGGACTGAAGGAGTA-3′) and a reverse nested gene-specific primer (5′-GCTCCAGCCGTTATTGCTAAAC-3′). In both assays, PCR products were verified by agarose gel electrophoresis and were further confirmed by sequence analysis.
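The comparative Ct calculation named in Section 2.3 can be sketched as follows. This is the generic 2^(−ΔΔCt) form, which assumes near-100% amplification efficiency for both target and reference assays; the function names and Ct values are hypothetical, and the choice of calibrator (for example, the lowest-expressing isoform in a sample) is left to the caller.

```python
def delta_ct(ct_target, ct_reference):
    """Ct of the target assay minus Ct of the housekeeping (reference) assay."""
    return ct_target - ct_reference


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Fold expression of a target relative to a calibrator: 2 ** (-ddCt)."""
    ddct = delta_ct(ct_target, ct_reference) - delta_ct(ct_target_cal, ct_reference_cal)
    return 2.0 ** (-ddct)


# Hypothetical Ct values: the target isoform crosses threshold two cycles
# earlier (after normalization to the reference gene) than the calibrator.
fold = relative_expression(
    ct_target=24.0, ct_reference=20.0,          # sample of interest
    ct_target_cal=26.0, ct_reference_cal=20.0,  # calibrator
)
print(fold)
```

Each cycle of difference in normalized Ct corresponds to a two-fold difference in starting template, so small Ct shifts between low-abundance isoforms translate into modest fold changes.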
Table 2. Primers and Taqman probe sequences used for real-time PCR analysis (all sequences written 5′→3′)
Isoform   Forward primer                          Probe                             Reverse primer
1         CCG CAT CGG TGG GAA CT                  6FAM-CGG TCC CCA TCC CA           TGT CCA TGC TGT CCT CAT CCT
4         CCC GGC GGG AGG AA                      6FAM-CCC AAG TGT CAC TTG AA       ATT GCT AAA CAG GAC GCT CTG ATC
3         GCG CTC GGT CCA GTC TTC                 6FAM-CGC TCC TTC CAG C            CGA CTC TCC CGT AAC TTG TAG GA
5         TCC CGG CTG GAA GGT ATT G               6FAM-TCT CAC TGC CAG ATG CT       AAA GGT CCA AGG GAA TGT CCA T
6         TGG CAG TCG TTG TTG GTA TTG             6FAM-CAT CTG GCA GTG AGA CT       TGT CCA TGC TGT CCT CAT CCT
9         GTT GGG AAA GCA TGG TTA CAT TG          6FAM-CAC TTA ACG CTG ACC A        AGC ACA GCA ACT ATT GCA AAG C
10        TTC GTT TAG GCT GTG TCT CAT CA          6FAM-CGA CCT CCC ACT GTC          AAG GTC CAA GGG AAT GTC CAT
11        CTG TAG TTT TTG AAA GGT ATT GTT GC      6FAM-GTC TCA CTG CCA GAT G        TGC CCC TTC TCC TTC TCT TG
12        CCT ATC TAC TTG GGA AGC TGA GAC A       6FAM-TCT TGA ACT CCT GAG CTC A    CAA CCT GGA AGA GCC TTG TGA
2.5. Construction of TGIF isoform-4 promoter luciferase reporter constructs and their truncated forms
Sequences upstream of the determined Isoform 4 transcriptional start site (TSS) were then analyzed for regulatory regions using multiple bioinformatics tools at http://www-bimas.cit.nih.gov/molbio/proscan/ [47], http://www.fruitfly.org/seq_tools/promoter.html [48], http://www.genomatix.de/ [49–51] and http://rulai.cshl.org/software/index1.htm [52–54]. This analysis identified three putative promoter regions in ~2000 bp immediately upstream of the TSS. This region was amplified from genomic DNA using primers 5′-GAATTGTGCCAGTGTTTCTCTTTG-3′ (forward) and 5′-CGGCGCTGTCAGAGTGAGAGAGGC-3′ (reverse). The putative promoter regions were subsequently cloned into the promoter-less luciferase vector pGL4.1 basic, upstream of the firefly luciferase-coding region (Promega Corp., Madison, WI, USA). The sequence of the cloned region was confirmed in both directions. Truncated forms of the TGIF Isoform-4 promoter reporter construct were generated with the Erase-A-Base system (Promega). Briefly, the full-length construct was first digested with KpnI, resulting in an end that could not be degraded by Exonuclease III, and then digested with EcoRI; the resulting product could be degraded by Exonuclease III in a unidirectional manner. The duration of the Exonuclease III digestion was adjusted to generate a series of truncated isoform 4 promoter inserts. After blunt end generation, self-ligation was performed and the ligation product was transformed into competent E. coli cells. Clones with different-sized inserts for the Isoform 4 promoter were isolated, and the exact promoter sequence was confirmed by DNA sequence analysis using a pGL4 sequencing primer.
2.6. 5-Aza-2′-deoxycytidine (5-Aza) treatment of the cells
Exponentially growing TF-1 cells were exposed to the methylation inhibitor 5-Aza at a concentration of 5 µM for up to 96 h. Cells were mock treated with an identical volume of the vehicle dimethyl sulfoxide (DMSO). Cells were harvested every 24 h and RNA extracted as described above. Real-time PCR analysis was performed as described above.
2.7. Transfection of Isoform-4 promoter luciferase constructs and assay for promoter activity
Full-length and truncated TGIF promoter constructs were transfected into 293-T cells using Lipofectamine-2000 (Invitrogen, Carlsbad, CA). We used a dual luciferase assay system (Promega). In this system the test sequence sits in front of a Firefly luciferase gene, while a control plasmid (to control for transfection efficiency) contains a universal promoter that drives the expression of Renilla luciferase. Briefly, 293-T cells were cultured in 6-well plates to 70% confluency. Transfections were carried out with 3.95 μg of each promoter construct, 0.05 μg of pGL4.74 control vector (containing the coding region of Renilla luciferase driven by the thymidine kinase promoter) and 10 μl of Lipofectamine-2000. Transient transfections in TF-1 and AML-193 cells were done using the 358 bp promoter plasmid, 1 × 10^6 exponentially growing cells and 8 μl of DMRIE-C (Invitrogen). Twenty-four hours after transfection, cells were harvested, washed with PBS, and suspended in passive lysis buffer (Promega). Cleared cell lysates were prepared by centrifugation, and firefly and Renilla luciferase activities were measured using a dual luciferase assay system (Promega). The relative activity (ratio between firefly and Renilla luciferase activity) was calculated and plotted as relative luciferase units (RLU).
2.8. Cell culture
Human cell lines HL-60, TF-1, AML-193, U-937, K-562, Kasumi-6 and 293-T cells were purchased from American Type Culture Collection (ATCC, Rockville, MD) [55–61]. HL-60, AML-193 and TF-1 are human AML cell lines and were maintained in 80% Iscove's Modified Dulbecco's Medium (IMDM) + 20% FBS, 70% IMDM + 10% FBS + 2 ng/ml recombinant GM-CSF and 70% RPMI 1640 + 20% FBS + 10% + 5 ng/ml recombinant GM-CSF, respectively. 293-T is a human kidney cell line and was maintained in 90% DMEM and 10% FBS.
Fig. 1. Genomic organization of TGIF locus. Exons are indicated by shaded areas and their position relative to the genomic region is shown. Introns are not drawn to scale. The length of the exons, in base pairs, is indicated by the number on top of the boxes while the length of the introns is indicated by the numbers below. All TGIF isoforms identified have two common 3′ exons containing the translated region. Isoforms 1 to 8 are listed as such in the NCBI databases. Arrow identifies the major expressed TGIF isoform in our analysis. The vertical box highlights the common coding region of TGIF isoforms.
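The firefly/Renilla normalization described in Section 2.7 can be sketched as below. This is a minimal illustration of the ratio calculation only; the function names and the raw luminescence readings are hypothetical and are not taken from this study.

```python
from statistics import mean


def relative_luciferase_units(firefly, renilla):
    """Normalize a firefly reading to the co-transfected Renilla control."""
    if renilla <= 0:
        raise ValueError("Renilla signal must be positive to normalize")
    return firefly / renilla


def mean_rlu(readings):
    """Average RLU over replicate wells given as (firefly, renilla) pairs."""
    return mean(relative_luciferase_units(f, r) for f, r in readings)


def fold_difference(readings_a, readings_b):
    """Ratio of mean RLU between two conditions (for example, two cell lines)."""
    return mean_rlu(readings_a) / mean_rlu(readings_b)


# Hypothetical raw luminescence readings for two replicate wells per condition.
construct_wells = [(5000.0, 250.0), (6000.0, 300.0)]
empty_vector_wells = [(100.0, 100.0), (120.0, 120.0)]
print(mean_rlu(construct_wells))
print(fold_difference(construct_wells, empty_vector_wells))
```

Dividing by the Renilla signal removes well-to-well differences in transfection efficiency, so that promoter constructs of different lengths, or the same construct in different cell lines, can be compared on a common scale.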
3. Results
3.1. Genomic organization of TGIF
We undertook a comprehensive analysis of available cDNA and EST sequences using NCBI databases. An earlier EST-based analysis by Scavello et al. listed eight potential TGIF isoforms [38], whereas our analysis of 355 TGIF EST sequences using alignment tools at the NCBI BLAST site revealed the possible presence of multiple additional 5′ alternative exons spread over ~41 kb of genomic sequence (Fig. 1). We only evaluated ESTs matching with greater than 99% accuracy over their length and then proceeded to validate these EST data and confirm exon boundaries of these isoforms by RT-PCR analysis. RNA from a variety of cell lines and tissues was subjected to RT-PCR analysis and then directly sequenced. All of the proposed isoforms shown in Fig. 1 have multiple full-length ESTs in support, and their exon boundaries were independently validated by RT-PCR analysis (data not shown) using the primers listed in Table 1. This analysis suggested that TGIF has a complex genomic structure with at least 12 isoforms, multiple alternate 5′ transcriptional start sites or exons, and similar coding regions. All of the isoforms shown in Fig. 1 had two common 3′ exons and were predicted to encode proteins ranging from 252 aa to 401 aa. Our EST analysis suggested that TGIF could potentially have several additional isoforms; however, we were not able to confirm them independently by our RT-PCR analysis.
3.2. Expression of TGIF and its splice variants
The presence of multiple TGIF isoforms that differ in their 5′ regions raised the possibility that they could be differentially expressed, either in a tissue- or developmental-specific manner. However, prior to analyzing individual isoform expression, we needed to assess the range of total TGIF expression in human tissues.
For this purpose, we used a Taqman-based relative real-time PCR analysis, which showed that levels of total TGIF differed in various tissues, with lung tissue showing the highest level of expression and brain showing the lowest (Fig. 2A). TGIF levels were higher in fetal brain than in adult brain, which is not surprising considering TGIF's role in early brain development. We then proceeded to analyze total TGIF abundance in a number of hematopoietic progenitor cell lines ranging from the most undifferentiated (TF-1) to the most differentiated (AML-193). These cell lines have been studied extensively as models of normal as well as leukemic hematopoiesis. This analysis demonstrated that in these cell lines TGIF levels varied over a large range (Fig. 2B). The least differentiated cell line, TF-1, had the lowest level of total TGIF expression, while AML-193, the most differentiated line, had the highest level, nearly 22-fold higher than TF-1. Western blot analysis confirmed these results (data not shown). We then proceeded to analyze the TGIF isoform-specific expression patterns in both tissue samples and hematopoietic cell lines. Taqman-based real-time PCR assays specific for each isoform were either purchased or developed and then validated as described in Materials and methods. This relative analysis demonstrated that Isoform 4 was the predominant TGIF isoform present in all TGIF-expressing tissues tested, including primary hematopoietic stem cells (HSC) and hematopoietic cell lines (Fig. 3). Minor differences in expression of other isoforms were detected between various tissues, but they were not quantitatively significant when related to Isoform 4. In the one exception, Isoform 8 expression was only detected in the lung tissue, albeit at very low levels. In TF-1 cells it appeared that isoforms 2 and 6 had relatively higher levels of expression compared to the more
Fig. 2. Distribution of TGIF expression. Total TGIF expression was determined using Taqman real-time PCR chemistry in the indicated tissues (A) and hematopoietic cell lines (B). Analysis was repeated at least 3 times in triplicate. Data from a single representative experiment are shown as the mean plus/minus SD.
differentiated cell line AML-193. This apparent over-expression of isoforms 2 and 6 is exaggerated by the relatively low expression of isoform 4 in these cells, which magnifies the small differences between the lower-expressing isoforms.
3.3. Expression of TGIF splice variants in primary leukemic blasts
We previously showed that TGIF was expressed variably (a difference of 20–30 fold between the low and high expressors) in primary AML blasts, and that this expression correlates with disease prognosis [3,4]. In light of our identification of TGIF isoforms and the discovery of Isoform 4's predominance, we investigated whether Isoform 4 was also predominant in primary AML blasts. Using our relative real-time PCR assay we analyzed a number of such samples that expressed a range of levels of TGIF. The results indicate that isoform 4 was the predominant TGIF isoform in both low- and high-expressing blasts (Fig. 4). As was the case in the cell line expression analysis (Fig. 3), a relative reduction in isoform 4 magnified the ordinarily small differences in expression between the less abundant isoforms.
3.4. Cloning and characterization of the Isoform-4 5′-flanking sequences
Since Isoform 4 was the predominant TGIF isoform identified and total TGIF cellular expression levels appeared to be primarily secondary to the expression of Isoform 4 (Fig. 5), we proceeded to characterize its proximal 5′ regulatory regions in order to understand the mechanisms that might lead to low TGIF expression. Initially a 5′ RACE analysis was done to characterize the Isoform 4 TSS. This analysis also included isoforms 1 and 2 as controls. Our data (Fig. 6) show that Isoforms 4 and 1 had TSSs that matched the start sites reported in the NCBI database (NM_003244), while RACE analysis of isoform 2 identified an additional 16 base pairs.
We then proceeded to analyze sequences 10,000 base pairs (bp) upstream of the isoform 4 TSS for putative promoter regions with several software packages that are listed in Materials and methods. We identified several potential regulatory sequences in the 2200 bp immediately upstream of the TSS. This region was then cloned into the promoter-less luciferase reporter construct pGL4. Transient transfection assays showed (Fig. 7) that
Fig. 3. Relative expression of TGIF isoforms in tissues and hematopoietic cell lines. Isoform 4 was the predominant isoform expressed in all tissues and cell lines analyzed, including primary hematopoietic stem cells (HSC). The numbers on Y-axis simply indicate the amount of each isoform relative to the lowest expressing isoform for a particular sample. Analysis was repeated at least 3 times. Data from one such representative experiment is shown as the mean plus/minus SD.
this 2200 bp region was, indeed, able to drive luciferase reporter expression. A unidirectional deletion analysis of this sequence was carried out to identify the minimal promoter region, and these promoter deletion mutants were analyzed in transient transfection assays in 293-T cells. As shown in Fig. 7, the minimal proximal promoter region of TGIF isoform 4 was localized to the 358 bp immediately upstream of the TSS. A bioinformatics analysis of this region identified multiple Sp1 binding sites and a CpG island. Deletion of an additional 110 bp, up to bp −248 relative to the TSS, removed two Sp1 binding sites and reduced the promoter region's activity by half.
3.5. Functional characterization of expression of isoform 4 promoter in hematopoietic cell lines
The presence of a CpG island in the minimal promoter region raised the possibility that reduced TGIF expression may be secondary to promoter methylation, a common mechanism of transcriptional silencing involving CpG islands. Since our analysis of human hematopoietic cell lines suggested that TGIF was expressed at quite
variable levels, with the TF-1 line expressing the lowest amount of TGIF and AML-193 cells the highest, we decided to use these cell lines to test our hypothesis. To that end, TF-1 cells were treated with the demethylating agent 5-Aza for up to 96 h and TGIF levels analyzed in a Taqman-based real-time PCR assay. The results showed (Fig. 8) that after a 96 h treatment with 5-Aza the TGIF level increased by only about three-fold. This increase was not sufficient to account for the nearly 22-fold difference in expression between AML-193 and TF-1 cells (Fig. 2B), suggesting that the lower TGIF expression in TF-1 was unlikely to be solely due to methylation. Furthermore, methyl-specific PCR analysis of this region did not show any evidence of methylation (data not shown). We concluded therefore that the TGIF promoter was being affected in trans (either due to active repression or lack of an activator) in TF-1 cells. If this hypothesis were correct, then we would expect that a promoter–reporter construct containing the minimal TGIF (Isoform 4) promoter would show decreased luciferase expression by transient transfection analysis in TF-1 cells and higher luciferase expression in AML-193 cells. In fact, as shown by the dual luciferase assays, TGIF minimal promoter (358 bp promoter) activity
Fig. 5. Comparison of relative total TGIF and isoform 4 expression in tissues and cell lines. Total tissue and cell line TGIF expression was also shown in Fig. 2. Comparison of total TGIF expression with Isoform 4 expression suggests that isoform 4 is the main contributor to total cellular TGIF levels. The numbers on Y-axis simply indicate the amount of each isoform relative to the lowest expressing isoform for a particular sample. Data from one such representative experiment is shown as the mean plus/minus SD.
Fig. 6. RLM-RACE results for TGIF isoforms 1, 2 and 4. 5′ RACE was carried out for TGIF isoforms 1, 2 and 4. PCR products from the second round of amplification were run on an agarose gel to confirm amplification (B). These products were then sequenced to precisely determine their length. As shown in (A), black arrows denote TSSs as predicted by the NCBI databases, while grey arrows show start sites determined after sequencing of the RACE products.
was nearly 20-fold higher in AML-193 cells than in TF-1 cells (Fig. 9), very similar to the 22-fold difference in TGIF RNA levels between these cell lines as determined by real-time PCR analysis (Fig. 2B), suggesting that the low expression of TGIF in hematopoietic cell lines may be secondary to an active mechanism.
4. Discussion
TGIF is a homeobox transcriptional repressor implicated in hematopoietic stem cell function, brain development and cancer including leukemias. However, despite this expanding biological role, prior to this study, virtually no information existed regarding the genomic structure, expression pattern and regulation of TGIF in
human tissues. Thus, to understand the underlying mechanisms controlling TGIF expression, in this study we undertook a detailed analysis of the genomic structure of this gene. Our data suggest that TGIF has a more complex structure than previously realized, with 12 unique 5′ TSSs or exons and 2 common 3′ exons. Several studies have shown that eukaryotic genes can be alternatively spliced or transcribed from different promoters, both important mechanisms for generating message and protein diversity [26,30,36,62,63]. Individual isoforms can play important and unique roles in development and disease. In general, 5′ splice isoforms are associated with alternative promoter usage, subserving tissue- or development-specific control of gene expression, but also affecting the efficiency with which the mRNA is translated [27,32]. The data presented in this study
Fig. 4. Relative expression of TGIF isoforms in primary AML blasts. Total cellular RNA from 16 AML patients, 8 with high TGIF expression and 8 with low expression, was analyzed for expression of specific TGIF isoforms using Taqman-based real-time PCR assays as described in Materials and methods. Results from seven representative patients from each group are shown in the figure. For all eight patients in each group, Isoform 4 was the predominant TGIF isoform. The numbers on the Y-axis simply indicate the amount of each isoform relative to the lowest expressing isoform for a particular sample. Analysis was repeated at least 3 times in triplicate. Data from one such representative experiment are shown as the mean plus/minus SD.
Fig. 7. Relative promoter activities of TGIF isoform 4 promoter constructs. Transient transfections were done using the dual luciferase system to correct for differences in transfection efficiencies. The test plasmid expressed Firefly luciferase while the control plasmid expressed Renilla luciferase. Values on X-axis represent the length of the 5′ sequence driving Firefly luciferase expression. Values on Y-axis show a relative number, relative luciferase unit (RLU), that represent Firefly luciferase expression normalized against Renilla luciferase expression. Minimal proximal promoter elements were localized within the 358 bp immediately upstream to the TSS. Experiment was repeated multiple times and results of one representative experiment are shown.
show that TGIF may belong to the group of genes that possess 5′ splice isoforms. Our analysis of isoform expression showed that Isoform 4 was the predominant isoform expressed. This was true for every tissue analyzed including normal HSC, AML blasts, the peripheral lymphocytes and a number hematopoietic cell lines. In addition, Isoform 4 was the predominant isoform in tissues with both low as well as high total TGIF expression. Finally, Isoform 4 levels correlated with total TGIF levels, suggesting that it is the major contributor to the total cellular TGIF mRNA levels (Fig. 5). Some studies looking into TGIF's role in holoprosencephaly have used Isoform 3 cDNA as the reference cDNA, however our data would suggest that, in fact, Isoform 4 would be more appropriate since it represents the predominant isoform in both adult as well as fetal brain. The presence of multiple isoforms that, nevertheless, did not show significant variation in expression between tissues was unanticipated. It may be that in some tissues some of these isoforms are subject to developmental regulation of expression. Our comparison of fetal brain and adult brain did not show a significant difference in expression of TGIF isoforms, showing the predominance of isoform 4 in both. However, a more detailed development-specific analysis of TGIF isoform expression would need to be done to clarify whether TGIF isoforms are expressed differentially, based on developmental stage of a particular tissue. Interestingly, isoform-8 expression was only noted in lung tissue albeit at very low levels, suggesting that TGIF isoform expression may be subjected to both developmental and tissuespecific variation in expression.
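As an illustration of how such isoform comparisons can be computed, the sketch below expresses each isoform relative to the lowest-expressing isoform in a sample, the scale used for Fig. 4. It is a minimal sketch under stated assumptions, not the procedure from Materials and methods: it assumes threshold cycle (Ct) values as input, equal amplification efficiency for every assay, and a doubling of product per cycle. The function name `relative_isoform_levels` and all Ct values are hypothetical.

```python
def relative_isoform_levels(ct_by_isoform, efficiency=2.0):
    """Express each isoform relative to the lowest-expressing isoform.

    A higher Ct means less starting template, so the isoform with the
    highest Ct is the lowest-expressing one and is assigned a value of 1.0.
    Assumption: every assay amplifies with the same efficiency
    (2.0 means perfect doubling per cycle).
    """
    if not ct_by_isoform:
        raise ValueError("no Ct values supplied")
    reference_ct = max(ct_by_isoform.values())
    return {
        isoform: efficiency ** (reference_ct - ct)
        for isoform, ct in ct_by_isoform.items()
    }


# Hypothetical Ct values for a single sample; these are not data from the study.
example_ct = {"isoform 3": 31.0, "isoform 4": 24.0, "isoform 7": 32.0}

levels = relative_isoform_levels(example_ct)
predominant = max(levels, key=levels.get)

for isoform, level in sorted(levels.items()):
    print(f"{isoform}: {level:g}")
print("predominant:", predominant)
```

Because every value is scaled within one sample, numbers obtained this way describe the isoform mix of that sample and are not directly comparable between samples.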
Fig. 9. Relative activity of the TGIF promoter reporter construct in hematopoietic cell lines. Activity of the TGIF promoter luciferase reporter in transiently transfected AML-193 and TF-1 cells. Activity of this construct was nearly 18-fold higher in AML-193 cells than in TF-1 cells, paralleling total TGIF expression in these two cell lines. As stated in Materials and methods, transient transfections were done using the dual luciferase system to correct for differences in transfection efficiency. Values on the Y-axis are relative luciferase units (RLU), which represent Firefly luciferase expression normalized against Renilla luciferase expression. The experiment was done with five replicates and repeated. Data from one representative experiment are shown as the mean ± SD.
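The normalization described in this legend can be sketched as follows. This is a minimal illustration, not the authors' analysis script: each Firefly reading is divided by the Renilla reading from the same well to give an RLU value, and the fold difference between two cell lines is taken as the ratio of their mean RLU. The function names and all readings are hypothetical.

```python
from statistics import mean, stdev


def relative_luciferase_units(firefly, renilla):
    """Normalize each Firefly reading to its paired Renilla reading."""
    if len(firefly) != len(renilla):
        raise ValueError("Firefly and Renilla readings must be paired")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_difference(rlu_a, rlu_b):
    """Ratio of mean RLU in condition A to mean RLU in condition B."""
    return mean(rlu_a) / mean(rlu_b)


# Hypothetical replicate readings (arbitrary units); not data from the study.
high_line_rlu = relative_luciferase_units(
    firefly=[800.0, 1000.0, 1200.0], renilla=[10.0, 10.0, 10.0]
)
low_line_rlu = relative_luciferase_units(
    firefly=[80.0, 100.0, 120.0], renilla=[10.0, 10.0, 10.0]
)

fold = fold_difference(high_line_rlu, low_line_rlu)
print(f"high line: {mean(high_line_rlu):.1f} ± {stdev(high_line_rlu):.1f} RLU")
print(f"low line:  {mean(low_line_rlu):.1f} ± {stdev(low_line_rlu):.1f} RLU")
print(f"fold difference: {fold:.1f}")
```

Dividing by the co-transfected Renilla signal removes well-to-well differences in transfection efficiency, which is why RLU values rather than raw Firefly readings are compared between cell lines.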
The presence of multiple TSSs and unique 5′ exons suggests the possibility that multiple unique promoter regions drive the expression of these isoforms. As shown by our EST analysis, many of the isoforms have transcriptional start sites that are within 50 bp of each other, and it is likely that these isoforms have either the same or overlapping regulatory regions. However, it is interesting to note that the position of the isoform 4 TSS is significantly different from that of the other isoforms (Fig. 1). A bioinformatics analysis of the upstream sequences of the TGIF isoforms failed to identify strong promoter sequences for most of the isoforms, the exception being isoform 4, suggesting one explanation for why isoform 4 might be the predominantly expressed TGIF isoform. Our analysis using truncations of the 5′ region immediately upstream of the isoform 4 TSS allowed us to identify a GC-rich region (within ~358 bp 5′ to the TSS) that displayed strong luciferase activity. Many genes contain CpG islands, and the relevance of these sequences with respect to their methylation status and gene inactivation is well described. However, we found no evidence of methylation of the TGIF promoter; in fact, our studies suggest that, at least for the leukemic cell lines studied, reduced TGIF expression is the result of regulation in trans rather than epigenetic silencing. A luciferase construct containing the promoter region showed low reporter expression in the low-TGIF-expressing cell line TF-1 compared to the high-expressing line AML-193, and 5-Aza treatment of TF-1 cells did not increase TGIF expression significantly (Figs. 8 and 9). It is interesting to note that, for the leukemic cell lines studied, TGIF expression was highest in the most-differentiated cell line and lowest in the least-differentiated line, which may suggest either a role for TGIF in hematopoietic cell differentiation or its control by the state of differentiation.
In fact, preliminary results from ongoing studies suggest that TGIF may be necessary for hematopoietic differentiation, as its knockdown in hematopoietic progenitor cell lines leads to a growth and differentiation block.
In conclusion, our data suggest that the TGIF gene has a more complicated structure than previously realized and that, although it expresses multiple isoforms, these appear not to vary significantly in relative amounts between tissues. Finally, isoform 4 is the major contributor to total cellular TGIF expression. This work provides an essential platform for further investigation into the biological role of TGIF in hematopoiesis and leukemogenesis.
Fig. 8. Relative total TGIF expression in TF-1 cells following 5-Aza treatment. Total TGIF expression levels were analyzed by real-time PCR at the indicated times following addition of 5 μM 5-Aza (dark columns) or DMSO (grey columns) to exponentially growing TF-1 cells.
Acknowledgements
The work presented in this application was supported by an intramural grant from the American Cancer Society and the Vanderbilt Physician Scientist Program (VPSD).
References
[1] T.R. Burglin, Analysis of TALE superclass homeobox genes (MEIS, PBC, KNOX, Iroquois, TGIF) reveals a novel domain conserved between plants and animals, Nucleic Acids Res. 25 (21) (1997) 4173–4180.
[2] E. Bertolino, et al., A novel homeobox protein which recognizes a TGT core and functionally interferes with a retinoid-responsive motif, J. Biol. Chem. 270 (52) (1995) 31178–31188.
[3] J.A. Means-Powell, R. Hamid, D. Martincic, V.D. Kravtsov, Y. Shyr, J.P. Greer, D.W. Byrne, M.J. Koury, S.J. Brandt, Expression of the TG-interacting factor gene in blast cells at diagnosis correlates with patient survival in acute myeloid leukemia, submitted for publication.
[4] J.A. Means-Powell, V.D. Kravtsov, Y. Shyr, S.E. Levy, J.P. Greer, M.J. Koury, S.J. Brandt, Expression of homeobox gene TG-interaction factor is an independent predictor of survival in acute myelogenous leukemia, Platform Presentation 45th ASH meeting, 2003.
[5] K. Nakakuki, et al., Novel targets for the 18p11.3 amplification frequently observed in esophageal squamous cell carcinomas, Carcinogenesis 23 (1) (2002) 19–24.
[6] C. Voorter, et al., Detection of chromosomal imbalances in transitional cell carcinoma of the bladder by comparative genomic hybridization, Am. J. Pathol. 146 (6) (1995) 1341–1354.
[7] K.W. Gripp, et al., Mutations in TGIF cause holoprosencephaly and link NODAL signalling to human neural axis determination, Nat. Genet. 25 (2) (2000) 205–208.
[8] E. Belloni, et al., Identification of Sonic hedgehog as a candidate gene responsible for holoprosencephaly, Nat. Genet. 14 (3) (1996) 353–356.
[9] D.E. Wallis, et al., Mutations in the homeodomain of the human SIX3 gene cause holoprosencephaly, Nat. Genet. 22 (2) (1999) 196–198.
[10] S.A. Brown, et al., Holoprosencephaly due to mutations in ZIC2, a homologue of Drosophila odd-paired, Nat. Genet. 20 (2) (1998) 180–183.
[11] E. Roessler, et al., Loss-of-function mutations in the human GLI2 gene are associated with pituitary anomalies and holoprosencephaly-like features, Proc. Natl. Acad. Sci. U. S. A. 100 (23) (2003) 13424–13429.
[12] J.E. Ming, et al., Mutations in PATCHED-1, the receptor for SONIC HEDGEHOG, are associated with holoprosencephaly, Hum. Genet. 110 (4) (2002) 297–301.
[13] J.M. de la Cruz, et al., A loss-of-function mutation in the CFC domain of TDGF1 is associated with human forebrain defects, Hum. Genet. 110 (5) (2002) 422–428.
[14] K.B. El-Jaick, et al., Functional analysis of mutations in TGIF associated with holoprosencephaly, Mol. Genet. Metab. 90 (1) (2007) 97–111.
[15] J.E. Ming, M. Muenke, Multiple hits during early embryonic development: digenic diseases and holoprosencephaly, Am. J. Hum. Genet. 71 (5) (2002) 1017–1032.
[16] D. Wallis, M. Muenke, Mutations in holoprosencephaly, Hum. Mutat. 16 (2) (2000) 99–108.
[17] D.E. Wallis, M. Muenke, Molecular mechanisms of holoprosencephaly, Mol. Genet. Metab. 68 (2) (1999) 126–138.
[18] L. Bartholin, et al., TGIF inhibits retinoid signaling, Mol. Cell. Biol. 26 (3) (2006) 990–1001.
[19] S.R. Seo, et al., The novel E3 ubiquitin ligase Tiul1 associates with TGIF to target Smad2 for degradation, EMBO J. 23 (19) (2004) 3780–3792.
[20] S.R. Seo, et al., Nuclear retention of the tumor suppressor cPML by the homeodomain protein TGIF restricts TGF-beta signaling, Mol. Cell 23 (4) (2006) 547–559.
[21] D. Wotton, et al., A Smad transcriptional corepressor, Cell 97 (1) (1999) 29–39.
[22] D. Wotton, et al., Multiple modes of repression by the Smad transcriptional corepressor TGIF, J. Biol. Chem. 274 (52) (1999) 37105–37110.
[23] D. Wotton, et al., The Smad transcriptional corepressor TGIF recruits mSin3, Cell Growth Differ. 12 (9) (2001) 457–463.
[24] A.A. Mironov, J.W. Fickett, M.S. Gelfand, Frequent alternative splicing of human genes, Genome Res. 9 (12) (1999) 1288–1293.
[25] B. Modrek, C. Lee, A genomic view of alternative splicing, Nat. Genet. 30 (1) (2002) 13–19.
[26] D. Baek, et al., Characterization and predictive discovery of evolutionarily conserved mammalian alternative promoters, Genome Res. 17 (2) (2007) 145–155.
[27] Y. Barak, et al., Regulation of mdm2 expression by p53: alternative promoters produce transcripts with nonidentical translation potential, Genes Dev. 8 (15) (1994) 1739–1749.
[28] A. Bevilacqua, et al., Post-transcriptional regulation of gene expression by degradation of messenger RNAs, J. Cell. Physiol. 195 (3) (2003) 356–372.
[29] N.J. Butcher, et al., Genomic organization of human arylamine N-acetyltransferase Type I reveals alternative promoters that generate different 5′-UTR splice variants with altered translational activities, Biochem. J. 387 (Pt 1) (2005) 119–127.
[30] F. Denoeud, et al., Prominent use of distal 5′ transcription start sites and discovery of a large number of additional exons in ENCODE regions, Genome Res. 17 (6) (2007) 746–759.
[31] P. Gascard, et al., Putative tumor suppressor protein 4.1B is differentially expressed in kidney and brain via alternative promoters and 5′ alternative splicing, Biochim. Biophys. Acta 1680 (2) (2004) 71–82.
[32] M. Kozak, Structural features in eukaryotic mRNAs that modulate the initiation of translation, J. Biol. Chem. 266 (30) (1991) 19867–19870.
[33] F. Mignone, et al., Untranslated regions of mRNAs, Genome Biol. 3 (3) (2002) REVIEWS0004.
[34] J. Ross, mRNA stability in mammalian cells, Microbiol. Rev. 59 (3) (1995) 423–450.
[35] L.L. Shang, S.C. Dudley Jr., Tandem promoters and developmentally regulated 5′- and 3′-mRNA untranslated regions of the mouse Scn5a cardiac sodium channel, J. Biol. Chem. 280 (2) (2005) 933–940.
[36] J.S. Tan, N. Mohandas, J.G. Conboy, Evolutionarily conserved coupling of transcription and alternative splicing in the EPB41 (protein 4.1R) and EPB41L3 (protein 4.1B) genes, Genomics 86 (6) (2005) 701–707.
[37] B.T. Wilhelm, et al., Transcriptional control of murine CD94 gene: differential usage of dual promoters by lymphoid cell types, J. Immunol. 171 (8) (2003) 4219–4226.
[38] G.S. Scavello, et al., Sequence variants in the transforming growth beta-induced factor (TGIF) gene are not associated with high myopia, Invest. Ophthalmol. Vis. Sci. 45 (7) (2004) 2091–2097.
[39] M.S. Boguski, T.M. Lowe, C.M. Tolstoshev, dbEST—database for “expressed sequence tags”, Nat. Genet. 4 (4) (1993) 332–333.
[40] S.F. Altschul, et al., Basic local alignment search tool, J. Mol. Biol. 215 (3) (1990) 403–410.
[41] W. Gish, D.J. States, Identification of protein coding regions by database similarity search, Nat. Genet. 3 (3) (1993) 266–272.
[42] J. Zhang, T.L. Madden, PowerBLAST: a new network BLAST application for interactive or automated sequence analysis and annotation, Genome Res. 7 (6) (1997) 649–656.
[43] S.F. Altschul, et al., Gapped BLAST and PSI-BLAST: a new generation of protein database search programs, Nucleic Acids Res. 25 (17) (1997) 3389–3402.
[44] Z. Zhang, et al., A greedy algorithm for aligning DNA sequences, J. Comput. Biol. 7 (1–2) (2000) 203–214.
[45] K.J. Livak, T.D. Schmittgen, Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) method, Methods 25 (4) (2001) 402–408.
[46] S.A. Bustin, Quantification of mRNA using real-time reverse transcription PCR (RT-PCR): trends and problems, J. Mol. Endocrinol. 29 (1) (2002) 23–39.
[47] D.S. Prestridge, Predicting Pol II promoter sequences using transcription factor binding sites, J. Mol. Biol. 249 (5) (1995) 923–932.
[48] M.G. Reese, Application of a time-delay neural network to promoter annotation in the Drosophila melanogaster genome, Comput. Chem. 26 (1) (2001) 51–56.
[49] T. Werner, Computer-assisted analysis of transcription control regions. MatInspector and other programs, Methods Mol. Biol. 132 (2000) 337–349.
[50] T. Werner, Identification and functional modelling of DNA sequence elements of transcription, Brief Bioinform. 1 (4) (2000) 372–380.
[51] M. Scherf, A. Klingenhoff, T. Werner, Highly specific localization of promoter regions in large genomic sequences by PromoterInspector: a novel context analysis approach, J. Mol. Biol. 297 (3) (2000) 599–606.
[52] M.Q. Zhang, Prediction, annotation, and analysis of human promoters, Cold Spring Harb. Symp. Quant. Biol. 68 (2003) 217–225.
[53] Z. Xuan, et al., Genome-wide promoter extraction and analysis in human, mouse, and rat, Genome Biol. 6 (8) (2005) R72.
[54] R.V. Davuluri, I. Grosse, M.Q. Zhang, Computational identification of promoters and first exons in the human genome, Nat. Genet. 29 (4) (2001) 412–417.
[55] T. Kitamura, et al., Identification and analysis of human erythropoietin receptors on a factor-dependent cell line, TF-1, Blood 73 (2) (1989) 375–380.
[56] S.J. Collins, The HL-60 promyelocytic leukemia cell line: proliferation, differentiation, and cellular oncogene expression, Blood 70 (5) (1987) 1233–1244.
[57] L. Diamond, et al., HL-60 variant reversibly resistant to induction of differentiation by phorbol esters, Carcinog. Compr. Surv. 10 (1985) 287–302.
[58] S. Adams, et al., The proliferation of AML-193 is regulated by multiple hematopoietic growth factors and cytokines, Leukemia 3 (4) (1989) 314–315.
[59] P. Ralph, M.A. Moore, K. Nilsson, Lysozyme synthesis by established human and murine histiocytic lymphoma cell lines, J. Exp. Med. 143 (6) (1976) 1528–1533.
[60] H. Asou, et al., Establishment of the acute myeloid leukemia cell line Kasumi-6 from a patient with a dominant-negative mutation in the DNA-binding region of the C/EBPalpha gene, Genes Chromosom. Cancer 36 (2) (2003) 167–174.
[61] C.B. Lozzio, B.B. Lozzio, Human chronic myelogenous leukemia cell-line with positive Philadelphia chromosome, Blood 45 (3) (1975) 321–334.
[62] S.J. Cooper, et al., Comprehensive analysis of transcriptional promoter structure and function in 1% of the human genome, Genome Res. 16 (1) (2006) 1–10.
[63] T.H. Kim, et al., Direct isolation and identification of promoters in the human genome, Genome Res. 15 (6) (2005) 830–839.