doi:10.1016/S0022-2836(02)01011-2 available online at http://www.idealibrary.com on
J. Mol. Biol. (2002) 323, 601–611
Definition of Transcriptional Promoters in the Human β Globin Locus Control Region
S. J. E. Routledge and N. J. Proudfoot*
Sir William Dunn School of Pathology, South Parks Road, University of Oxford, Oxford OX1 3RE, UK
Our previous studies on the human β globin gene cluster revealed the presence of intergenic transcripts throughout the locus, and demonstrated that transcription of the locus control region (LCR) initiates within an ERV9 endogenous retroviral long-terminal repeat (LTR) upstream of DNase I hypersensitive site 5. We show, using a combination of assays, that there are additional sites of transcription initiation within the LCR at hypersensitive sites 2 and 3. We have defined sites of transcription initiation, which occurs at discrete positions in a direction towards the globin genes. In addition, we show that mutation of specific transcription factor binding sites within HS2 leads to a reduction in transcription levels from within this site. We propose that these initiation events within the LCR can account for the observed orientation dependence of LCR function, and contribute to the open chromatin configuration of the β globin locus. In addition, transcription from within the LCR hypersensitive sites could compensate for the absence of the ERV9 LTR in many transgenic mice lines, which nevertheless regulate their globin clusters correctly. © 2002 Elsevier Science Ltd. All rights reserved
*Corresponding author
Keywords: LCR; globin; transcription; hypersensitive site
Present address: S. J. E. Routledge, Division of Medical and Molecular Genetics, GKT School of Medicine, Guy’s Tower, Guy’s Hospital, London SE1 9RT, UK.
Abbreviations used: LCR, locus control region; ERV, endogenous retrovirus; LTR, long-terminal repeat; HS, hypersensitive site; TF, transcription factor; DLR™, dual luciferase reporter system; pol, polymerase.
E-mail address of the corresponding author: [email protected]
Introduction
The human β globin locus, which extends over 70 kb on chromosome 11, contains the five β globin genes (5′-ε-Gγ-Aγ-δ-β-3′) in the same orientation and in the order of their expression in erythroid development. The entire β globin locus is maintained in an open chromatin conformation in erythroid cells, as shown by increased sensitivity to DNase I digestion when compared to total genomic DNA.1 Upstream of the globin genes is the locus control region (LCR), a major regulatory element containing four erythroid-specific hypersensitive sites present at all stages of erythroid development (HS1–4) and a constitutive site (HS5) located further upstream (Figure 1). Two further HSs have been described recently; HS6 and HS7, upstream of HS5.2 The LCR confers high-level, tissue-specific
and copy number-dependent but position-independent expression on a linked gene in transgenic mice.3 A 6.5 kb microlocus containing HS1–4 is sufficient for LCR activity.4 The naturally occurring Hispanic deletion removes HS2–5 plus an additional 25 kb upstream and causes β thalassaemia.5 The globin genes are silent, HS1 does not form, and the locus resides in a closed chromatin configuration. In contrast, when the region encompassing HS2–5 is deleted from human chromosome 11 by homologous recombination, although expression of the genes is reduced greatly, the locus retains its DNase I sensitivity, and HS1 still forms, suggesting the chromatin is in an open configuration.6 We have shown that an ERV9 long-terminal repeat (LTR) element within the region of difference between these two deletions drives transcription into the LCR, and hence may contribute to the open chromatin structure of the locus.7 HS1 and HS5 appear to be dispensable for LCR function. A naturally occurring deletion of a 3030 bp region upstream of the ε gene, including HS1, has no effect on globin gene expression8 and a deletion of HS5 in its natural context has minimal effect on globin expression, suggesting that it is not required to shield the locus from neighbouring chromatin effects,6 although it has been postulated to have insulator activity.9,10 HS2 and HS3 appear
to contain the major enhancing activity of the LCR, as a deletion of one or the other from the LCR microlocus linked to a β gene in transgenic mice causes reduction in globin gene expression of ~50%.11 The dominant chromatin-opening activity may reside in HS3, since this fragment can reproducibly direct expression of a single copy transgene in mice,12 whereas experiments with HS2 constructs revealed a need for multiple copies to be integrated to allow detection of linked globin gene expression.13 The major activities of HS2, HS3 and HS4 have been defined to core elements of 100–300 bp that enhance transcription of a linked globin gene in transgenic mice.14–16 Each core region contains a similar arrangement of transcription factor binding sites for both erythroid specific and ubiquitous TFs.17,18 A conserved arrangement of an NF-E2 site and tandem GATA-1 sites is required for the formation of HS2, HS3,19 and HS420 in vitro, although other factors including EKLF and Sp1 are required for HS4 formation in vivo.21 HS2 and HS3 can form in vitro on an immobilised template after incubation with erythroid cell nuclear extract, even without prior assembly into chromatin.22 EKLF is important also for HS3 formation.23,24 Deletions of individual HSs in the context of the whole locus result in mild to deleterious phenotypes depending on whether the core HS is deleted or a larger fragment including flanking regions.25–28 Sequences flanking the HS cores are required for synergistic enhancement of a linked gene: the presence of the flanking sequences produced an enhancement of more than the sum of the parts when more than one HS was present in the same construct.29 Li et al. provided evidence that HSs act cooperatively.30 The prevailing theory is that the HSs function as part of a holocomplex; the activation domains reside in the core regions, but flanking regions are required to stabilise the complex.
The holocomplex may adopt different conformations to interact with different globin promoters, since there is some evidence for a stage specificity of interaction between the LCR and globin genes.26,27,31 LCR function is orientation-dependent. Experiments with transgenic mice containing a YAC with the complete human β globin locus containing an inverted LCR show that expression of all globin genes is affected severely at all stages in development.32 HS5 may be responsible for some of the effects seen, since it has been postulated to contain enhancer-blocking function.10 In support of this hypothesis, a marked ε gene positioned 5′ to HS5 is not expressed,32 although in another experiment, a β gene positioned 5′ to the LCR was expressed.33 We have previously shown that, in addition to the erythroid-specific, developmentally regulated genic transcripts, transcription occurs in an erythroid-specific manner throughout the intergenic regions of the human β globin cluster and across the LCR.34 Furthermore, transcription of the
Promoters in the b Globin LCR
LCR initiates within an ERV9 LTR element upstream of HS5.7 Our model for activation of the β globin locus is that LCR and intergenic transcription is involved in opening the chromatin structure of the locus, or for the maintenance of an open chromatin structure, in order for genic transcription to occur at the appropriate developmental stages. This model is aided by the recent discoveries that chromatin remodelling enzymes such as histone acetyltransferases (HATs)35 and members of the SWI/SNF complex36 have been found to be associated with the transcribing RNA polymerase (pol) II complex. However, this model fails to explain transcription of the mouse LCR, which lacks the ERV9 LTR or an analogous element. It is possible that other transcriptional events within the LCR have a role to play. Transcription of an HS2 reporter construct has been described in erythroid cells37,38 and in vitro transcription of HS2 and HS3 has been demonstrated recently.22 Chromatin immunoprecipitation (ChIP) studies have found hyperphosphorylated RNA pol II localised to HS1, HS2 and HS3 in the mouse β globin LCR.39 Here, we present a transcriptional analysis of the LCR hypersensitive sites. We show that HS2 and HS3 are capable of promoting transcription of a reporter gene, but that HS1, HS4 and HS5 are not. We show that this transcription can occur in either orientation in erythroid cells and at a greatly reduced level in the non-erythroid HeLa cell line. Steady-state RNA analysis indicates that HS2 contains stronger promoter activity than HS3. 5′ RACE analysis defines transcription initiation sites within HS2 and HS3, and suggests that transcription is significant only in a direction towards the globin genes. This transcription may provide a reason for the observed orientation-specificity of LCR function.32 In addition, we show that the integrity of specific TF-binding sites is necessary for full levels of transcription.
Results
Analysis of transcription from the LCR hypersensitive sites using the Dual Luciferase™ Reporter Assay System
Our initial approach to determine the ability of individual HSs to promote transcription was to employ the Promega Dual Luciferase™ Reporter Assay System (DLR™; http://www.shpromega.com.cn/57-5573_02.html). This uses two different luciferase enzymes, firefly and Renilla, which produce a luminescence proportional to the amount of luciferase present in the cell lysate over a range of several orders of magnitude. The diagram in Figure 1 shows the positions of hypersensitive sites in the β globin LCR. Marked below are the test fragments cloned into the plasmid
Figure 1. Hypersensitive sites in the human β globin LCR. The diagram of the human β globin locus shows the positions of the globin genes and hypersensitive sites in the LCR (marked with arrows). The LCR is expanded below to show the HSs and the ERV9 LTR element, which drives transcription of the LCR.7 The positions of HS DNA fragments tested for promoter activity in the dual luciferase reporter system (DLR™) are shown below, expanded to show co-ordinates of inserts. Co-ordinates for F+luc and HS5+luc refer to GenBank sequence AF064190; co-ordinates for HS1–4+luc refer to GenBank sequence U01317. The open bars show the relative positions of core HS fragments used.
pGL3-basic, which contains the firefly luciferase gene. All fragments were cloned in both the positive and negative orientation relative to the firefly luciferase gene. Our strategy utilised fragments ranging from ~0.8 kb to ~1.2 kb in size, encompassing the HSs in order to be sure that any sequences necessary for transcription were included. The HS2 fragment contains the same sequences as that used previously, since this has been shown to promote transcription in a reporter assay.37 Smaller fragments corresponding to the core elements of HS2–4 were tested (HS2c–HS4c). Their position relative to the whole fragments is shown in Figure 1 (open bars). A Renilla luciferase
Figure 2. Analysis of promoter activity of the β globin LCR HSs by DLR™ in erythroid and non-erythroid cells. Activity was corrected for transfection efficiency relative to the co-transfectional control plasmid pRL-TK (Promega #TB240) and is shown relative to the pGL3-control plasmid (Promega #TM033). Promoter activity in transiently transfected K562 cells is shown by filled bars; activity in transiently transfected HeLa cells is shown by hatched bars. The graph shows the averaged results of five separate assays; error bars show one standard deviation away from the mean. The construct transfected is labelled on the x-axis. The symbols + and − denote the orientation of the insert relative to the luciferase gene. F+ contains the promoter of the ERV9 LTR element located upstream of HS5 and is a positive control. The inset graph shows promoter activity in HeLa cells on a larger scale.
expressing plasmid (pRL-TK, driven by the herpes simplex virus thymidine kinase promoter) was used as an internal control for transfection efficiency. Figure 2 shows the results of DLR™ assays after transient transfection of test constructs into K562 or HeLa cells, corrected for transfection efficiency, relative to the pGL3-control plasmid, which contains the firefly luciferase gene driven by the simian virus (SV40) promoter with its associated enhancer. The main graph shows the averaged results of five experiments using K562 (filled) or HeLa (hatched) cell extracts. Error bars show one standard deviation either side of the mean. F+luc contains the promoter region of the ERV9 LTR element located upstream of HS57 and gives a promoter activity twofold higher than that of the control in K562 cells. This is a useful positive control, as previous results demonstrate that this fragment has promoter activity in HeLa cells (non-erythroid) as well as in the erythroid cell line K562.7 The HS2luc constructs show appreciable levels of promoter activity in erythroid cells, HS2+luc at ~40% of the control activity, and HS2−luc at ~10% of the control, confirming that transcripts derive from within HS2 of the human β LCR in erythroid cells. The core fragments are able to promote transcription, HS2c+luc at ~20% of the control, and HS2c−luc at ~45%. There is a low level of promoter activity in HS3, but only from the core fragment in the reverse orientation. In HeLa cells, F+luc has a level of promoter activity that is ~60% that of the control. The inset graph shows the HeLa results on a larger scale, indicating that the same constructs are able to promote transcription in this cell line. HS1, HS4 and HS5 do not appear to be able to promote transcription in either cell line or in either orientation in this assay.
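The correction for transfection efficiency described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in this study: it assumes that each lysate gives one firefly and one Renilla reading, and that relative promoter activity is the firefly/Renilla ratio of a test construct divided by the same ratio for pGL3-control. The function names and all readings are hypothetical.

```python
from statistics import mean, stdev


def relative_promoter_activity(firefly, renilla, control_firefly, control_renilla):
    """Firefly/Renilla ratio of a test construct, as a fraction of the
    firefly/Renilla ratio of the control plasmid (assumed normalisation)."""
    test_ratio = firefly / renilla
    control_ratio = control_firefly / control_renilla
    return test_ratio / control_ratio


def summarise(replicates):
    """Mean and sample standard deviation over independent assays."""
    return mean(replicates), stdev(replicates)


if __name__ == "__main__":
    # Hypothetical luminometer readings (arbitrary units), invented for illustration:
    # (test firefly, test Renilla, control firefly, control Renilla)
    assays = [
        (380.0, 1000.0, 1000.0, 1000.0),
        (420.0, 1050.0, 980.0, 990.0),
        (400.0, 990.0, 1010.0, 1005.0),
    ]
    activities = [relative_promoter_activity(*a) for a in assays]
    avg, sd = summarise(activities)
    print(f"relative activity: {avg:.2f} +/- {sd:.2f}")
```

Dividing by the Renilla signal first removes well-to-well differences in transfection efficiency, so that the remaining ratio to the control reflects promoter strength only to the extent that translation of the two reporters is comparable.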
It is difficult to compare promoter activity between cell lines, as the control firefly reporter plasmid, pGL3-control, may show different levels of expression in K562 cells or HeLa cells. However, since its expression is driven by a strong promoter, differences between cell types can be assumed to be minor in comparison to the differences seen for the test constructs. It can be seen, therefore, that the LCR constructs have a substantially higher promoter activity in K562 cells than in HeLa cells. Also, the ERV9 promoter has a much higher activity than the HS2 and HS3 fragments in both erythroid and non-erythroid cells. These results confirm previous reports on transcription of HS2,38 and show that it is not unique among the LCR HSs in the ability to promote transcription. The low level of promoter activity seen by the HS2 constructs in HeLa cells is interesting, as HS2, although referred to in the literature as erythroid-specific,37 is partially present in HeLa cells.40 Transcription of HS2 may relate to the extent of DNase I sensitivity of the region, and hence chromatin structure.
There are differences seen in the level of promoter activity between whole and core HS fragments tested in either orientation. The pattern of activity is similar for both cell types, but overall activity in HeLa cells is more than tenfold lower than in K562 cells. In each case, the HS2c+ fragment is less active than the whole insert, suggesting that an activator element may be present in the region deleted. Deletion analysis has been performed in the past on the HS core regions to determine possible effects on enhancer function.41 However, although there are regions of conserved sequence outside of the HS cores, these do not appear to have any enhancer function. It has been suggested that these additional regions may function through protein binding to synergise interactions between the HSs or to interact with the core-bound proteins to aid in their correct orientation.29 It is possible that this effect is seen in this deletion. Conversely, the HS2c− fragment is more active than the whole insert. Also, the activity in HS3 is detected only in the core fragment in the reverse orientation, and not in the larger insert. These observations are likely due to a feature of the reporter system used. The fragments tested are fairly large, which could lead to problems with translation of the firefly luciferase open reading frame (ORF), depending on the site of transcription and consequently translation initiation. The initiation codon AUG at the start of the firefly luciferase ORF is located within a Kozak consensus sequence to enable efficient translation initiation.42 However, there are many other potential translation initiation codons as well as multiple translation termination codons throughout the HS inserts in all reading frames. These sequence features could lead to reduced luciferase protein production due to impaired translation, since there may be several potential AUG codons upstream of the true luciferase initiation site.
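The kind of sequence inspection referred to here, counting potential initiation and termination codons in each reading frame of an insert, can be illustrated with a short Python sketch. It is an assumption-laden example rather than the procedure used in this study: the function name is hypothetical, the frames are numbered from the first base of the supplied sequence rather than from the luciferase AUG, and the DNA string is invented.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def scan_frames(insert_dna):
    """Count ATG and stop codons in each of the three forward reading frames.

    Frame 0 starts at the first base of the supplied sequence, frame 1 at the
    second and frame 2 at the third. Incomplete trailing codons are ignored.
    """
    seq = insert_dna.upper()
    summary = {}
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        summary[frame] = {
            "ATG": codons.count("ATG"),
            "stop": sum(1 for codon in codons if codon in STOP_CODONS),
        }
    return summary


if __name__ == "__main__":
    # Invented sequence, for illustration only (not taken from U01317).
    for frame, counts in scan_frames("ATGAAATAGCATGA").items():
        print(frame, counts)
```

In practice only the sequence between the mapped transcription start site and the luciferase AUG is relevant, so a scan of this kind is most informative once the initiation site within the insert is known.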
The luciferase enzyme may be produced from the correct initiation codon by leaky-scanning or re-initiation (due to the presence of multiple STOP codons in all reading frames), but the amount of protein produced may not reflect the true amount of transcription from within an HS. In addition, since the initiation codons are present in all reading frames, there is no guarantee that translation would occur in the correct reading frame for active luciferase production. Hence, the presence of upstream AUGs or some other effect prevented detection of HS3 promoter activity in the whole insert construct, HS3−luc. This could account for the lack of detectable transcription in the other orientation, and the increase in activity seen from the HS2c− construct.
Analysis of transcription from HS2 and HS3 by S1 nuclease protection
The luciferase assays described above suggest that both HS2 and HS3 of the human β globin LCR contain promoter activity and are hence able
Figure 3. Analysis of transcription from HS2 and HS3 reporter constructs by S1 nuclease protection assay. (a) The general structure of a reporter plasmid consisting of an HS fragment fused to the human β-globin gene. Constructs HS2β, HS2c+β, HS3c+β and HS3c−β contain the fragments HS2, HS2c or HS3c as described in this Figure. The symbols + and − denote the orientation of the insert relative to the β gene. The EcoRI restriction site used in the production of the S1 probe is marked below. The predicted size of an S1 nuclease-protected fragment corresponding to use of the β globin poly(A) site (and hence transcription from the plasmid) is 212 nt. (b) An autoradiograph showing the results of S1 nuclease protection carried out on cytoplasmic RNA extracted from K562 cells transiently transfected with the constructs described above. The constructs used are labelled above each lane. M is the marker lane. Mock refers to untransfected K562 cells. The positions of bands corresponding to use of the β globin poly(A) site are marked, as are the sizes of markers. Additional bands are seen below the major globin band (asterisks). These are caused by partial breathing of the DNA:RNA hybrid during S1 digestion and are typical for S1 protection across the β globin 3′-end. The band corresponding to the VA transfection control is indicated. (c) Transcription levels were quantified by phosphorimager analysis relative to the VA control.
to drive transcription of a downstream reporter gene. New constructs were synthesised for RNA analysis, since steady-state transcripts produced from luciferase constructs are low level, due to a lack of introns in the DNA template. The general plasmid structure employed is shown in the diagram of Figure 3(a). Each HS insert was fused to exon 1 of a promoterless β globin gene, to make the constructs pHS2β, pHS2cβ and pHS3c±β. Any transcripts produced should be stabilised by the processes of splicing and polyadenylation. The autoradiograph in Figure 3(b) shows the results of an S1 nuclease protection assay carried out on cytoplasmic RNA extracted from K562 cells transiently transfected with the reporter constructs. A 212 nt band indicating use of the β globin poly(A) site is seen for the transfected but not untransfected cells. Interestingly, HS3c+β promotes transcription of the β globin gene, in contrast to the result seen in the DLR™ assay. This is likely explained by the continued presence of several upstream translation initiation codons in the HS3c fragment, interfering with translation but not affecting transcription. In contrast, no transcription is detected from HS3c−β in this assay. This may be due to the continued presence of additional STOP codons downstream of the transcription initiation site, leading to nonsense-mediated decay.43 The results were quantified relative to the VA co-transfection control (lower panel) and are shown graphically in Figure 3(c), relative to transcription from HS2β. Promoter activity of HS2 is two to threefold stronger than that of HS3. The same constructs were transfected into HeLa cells. S1 nuclease analysis was carried out on the RNA extracted (data not shown). A similar pattern of transcription was seen, although again at greatly reduced levels. These results show that both HS2 and HS3 are able to promote transcription in both erythroid and at low level in non-erythroid cells.
A mutational analysis of HS2
It is well documented that the tandem NF-E2-binding sites located within HS2 are essential for its enhancer functions14,44 and chromatin remodelling activity,45 although NF-E2 null mice are viable.46 In addition, a conserved arrangement of an NF-E2 site and tandem GATA-1 sites is required for the formation of HS2, HS319 and HS420 in vitro. We reasoned that the tandem NF-E2 sites within HS2 may be important for transcription from within the site. Tuan et al. have shown that cleaving the HS2 enhancer between the tandem binding sites led to a drop in enhancer function and a drop in transcription from within the enhancer.37 We made constructs with mutations in the NF-E2-binding sites or the tandem GATA-1-binding sites downstream, previously shown to abolish TF binding in an in vitro assay,44 and assayed for promoter activity using S1 nuclease protection as before. Figure 4(a) shows the mutations made in the sequence to form the
Figure 4. Mutational analysis of HS2. (a) The sequence of part of HS2 from co-ordinates 8654–8738 (GenBank U01317). NF-E2-binding sites are shown in grey boxes. GATA-1 sites are shown in open boxes. Base changes introduced by mutation are indicated below. The names of mutant constructs are indicated to the right. (b) An autoradiograph showing the results of S1 nuclease protection carried out on cytoplasmic RNA extracted from K562 cells transiently transfected with the constructs described above. The constructs used are labelled above each lane. The VA co-transfection control is shown below. Transcription levels were quantified by phosphorimager analysis relative to the VA control. (c) The averaged results of three experiments. Transcription from HS2c+β (wild-type) was taken to be 100%.
constructs NmtGmt or Gmt. Figure 4(b) shows the result of an S1 protection assay performed on cytoplasmic RNA extracted from transiently transfected K562 cells. The graph in Figure 4(c) shows averaged results from three experiments, corrected for transfection efficiency relative to VA, as a percentage of wild-type (HS2c+). Mutation of the tandem GATA-1-binding sites leads to a 38% reduction in transcription from within HS2. Additional mutation of the NF-E2-binding sites leads to a 46% reduction in transcription. It is likely that the lack of GATA-1 or NF-E2 binding can be compensated for partially by the binding of other transcription factors to the vast array of sites within the HS2 core region.
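The arithmetic behind a figure such as "38% reduction" can be made explicit with a short Python sketch. It assumes, as an illustration only, that each lane yields one phosphorimager value for the globin band and one for the VA band, and that wild-type is set to 100% after dividing each globin signal by its own VA signal. The function names are hypothetical and the band intensities are invented.

```python
def percent_of_wild_type(signal, va, wt_signal, wt_va):
    """VA-corrected signal of a mutant lane as a percentage of the
    VA-corrected wild-type signal (assumed normalisation)."""
    return 100.0 * (signal / va) / (wt_signal / wt_va)


def percent_reduction(signal, va, wt_signal, wt_va):
    """Reduction in transcription relative to wild-type, in percent."""
    return 100.0 - percent_of_wild_type(signal, va, wt_signal, wt_va)


if __name__ == "__main__":
    # Invented band intensities (arbitrary units), for illustration only.
    wild_type = {"globin": 5000.0, "va": 2000.0}
    mutant = {"globin": 2790.0, "va": 1800.0}
    reduction = percent_reduction(
        mutant["globin"], mutant["va"], wild_type["globin"], wild_type["va"]
    )
    print(f"reduction relative to wild-type: {reduction:.0f}%")
```

Correcting each lane by its own VA signal before comparing lanes means that a weaker globin band in a poorly transfected sample is not mistaken for reduced promoter activity.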
Analysis of transcription initiation sites in HS2 and HS3 by 5′ RACE
The DLR™ and S1 nuclease assays indicate that HS2 and HS3 are capable of promoting transcription. 5′ RACE analysis was carried out on cytoplasmic RNA extracted from K562 cells, using the SMART™ RACE cDNA amplification kit (Clontech). Random hexamers were used to prime the reverse transcription step. Primary RACE PCR was performed using HS3D-PCR as the 3′ primer; the nested primer was HS3D. A single band of ~240 bp was seen after nested PCR (data not shown). The product was cloned and sequenced to determine the transcription initiation site
Figure 5. Sites of transcription initiation within HS2 and HS3. (a) HS2. Positions of TF-binding sites as determined by experimental data and cross-species comparisons47 are indicated. NF-E2-binding sites are shown as grey boxes. GATA-1-binding sites are shown as striped boxes. E boxes are shown as shaded boxes. A CAC box is shown as an open box. The arrow denotes the site of transcription initiation as determined by 5′ RACE analysis (position 8819, GenBank U01317). (b) HS3. TF-binding sites are shown as for HS2. The 5′ CAC box has been shown to bind EKLF in vivo;24 this is indicated below. The arrow denotes the site of transcription initiation as determined by 5′ RACE (position 4584, GenBank U01317).
(position 4584, GenBank U01317). As indicated in Figure 5(b), endogenous transcription initiates within HS3 in erythroid cells, downstream of the NF-E2 site and tandem GATA-1-binding sites, but upstream of the E box and EKLF-binding site (5′ CAC box24). The presence of an AT-rich region downstream of the HS2 TF-binding sites, leading to difficulty with primer design, made it impossible to determine endogenous in vivo 5′ RACE data on transcription initiation sites within HS2. Hence 5′ RACE was performed on cytoplasmic RNA from K562 cells transiently transfected with the HS2c+β construct, in order to increase the amount of stable RNA template present. We reasoned that transcription was likely to initiate from the same sites within this construct as in the endogenous HS2, since we found this to be the case for HS3 (data not shown). Primary RACE PCR was performed using HS2D as the 3′ primer; the nested PCR used HS2Dnew as the 3′ primer. A single band of ~100 bp was seen after nested PCR and was cloned and sequenced to determine the transcription initiation site. The initiation site is indicated in Figure 5(a) with a bold arrow, at position 8819 (GenBank U01317). Positions of transcription factor binding sites as determined by experimental data and cross-species comparisons47 are marked as for HS3. In contrast to HS3, transcription initiates towards the 3′-end of HS2, downstream of the array of TF-binding sites. The 5′ RACE data described here thus indicate that transcription initiates within hypersensitive sites 2 and 3 at discrete sites in a direction towards the globin genes. 5′ RACE was used in an attempt to determine sites of transcription initiation from HS2 and HS3 in a direction away from the globin genes. In contrast to the discrete PCR products seen in the forward orientation, a smear of products is seen from both HS2 and HS3 in the negative orientation (data not shown).
On cloning, these were found to result from non-specific primer hybridisation, and hence no discrete initiation sites were identified within HS2 or HS3 in the reverse orientation. Transcription in this orientation, away from the globin genes, does not appear to be important in vivo.
Discussion
The human β globin gene cluster has been studied extensively and various models have been proposed for how the LCR functions in regulation of the genes.48–50 It remains a possibility that LCR transcription plays a role in its function. We have shown that transcription of the β globin LCR in human erythroid cells initiates within an LTR of the human endogenous retrovirus ERV9, delineating the 5′ extent of transcription within the LCR in erythroid cells.7 However, it was not known whether the LCR is transcribed as a single 23 kb
transcription unit or whether multiple initiation events occur within the LCR. We have now addressed this issue by analysing the LCR hypersensitive sites for evidence of promoter activity, detecting and characterising transcription initiation sites within HS2 and HS3.
Possible mechanisms for transcription initiation within HS2 and HS3
5′ RACE analysis revealed that transcription initiation towards the globin genes occurs at a discrete site for both HS2 and HS3. Both HS2 and HS3 contain an extensive array of transcription factor binding sites47 (see also Figure 5), including sites for the erythroid TFs GATA-1, NF-E2 and (in the case of HS3) EKLF, and sites for ubiquitous TFs such as basic helix–loop–helix proteins, which bind the E box sequences. Elnitski et al. proposed that the tight spacing of transcription factor binding sites within HS2 could lead to the formation of a very large protein–DNA complex across HS2.51 The similar organisation of TF-binding sites within HS3 suggests that the same may hold true here. Comparison of sequences around the transcription start sites within HS2 and HS3 does not shed much light on possible sequence requirements for transcription initiation. Initiation within HS2 occurs downstream of the array of TF-binding sites, whereas the major start site within HS3 falls between binding sites, although downstream of the essential NF-E2 and tandem GATA-1 sites.
There is no consensus TATA box upstream of either start site, although both initiation sites fall within pyrimidine-rich regions, a feature of initiator elements (Inr).52 Promoter activity has been detected from reporter constructs containing HS2.37,38 This activity was disrupted by cleavage between the tandem NF-E2 sites, effectively cutting HS2 in half.37 In contrast, we show that mutations that abolish binding of NF-E2 and GATA-1 to their sites in HS2, but do not reduce the overall size of the HS, lead to a 40–50% reduction in transcription, suggesting that these TFs play a partial role, but that their binding is not an absolute requirement for initiation of transcription from HS2. We show that transcription can initiate from within both HS2 and HS3 in the non-erythroid HeLa cell line, although at a lower level than in the erythroid K562 cell line (Figure 2). This is in contrast to the previous study, which claims that HS2 transcription is erythroid-specific.37 HS2, although described in the literature as erythroid-specific, is partially open in HeLa cells.40 Consequently, transcription of HS2 may relate to its DNase I sensitivity in some way, possibly as factors binding open the chromatin and allow TF binding and RNA pol II recruitment. A recent study described HS formation and transcription of both HS2 and HS3 in vitro on an immobilised template.22 Surprisingly, HS formation could be detected after incubation with erythroid
cell extracts without prior assembly into chromatin, suggesting that HS formation involves processes in addition to nucleosome remodelling. In vitro transcription reactions were carried out on the immobilised templates, and the RNA products detected indicated the presence of several transcription initiation sites. Our results partly correlate with these data, especially for HS3. However, the in vitro nature of these experiments may result in unphysiological initiation of transcription. Our in vivo analysis is more likely to reflect real transcriptional events.
The role of transcription initiation within the LCR hypersensitive sites
We previously proposed that the process of LCR transcription plays a role in opening the chromatin structure of the globin cluster in erythroid cells.7 The yeast elongator complex carries histone acetylase activity, which may disrupt the chromatin structure, facilitating many rounds of transcription.53 Chromatin remodelling factors have been isolated as part of the pol II complex in mammalian and yeast cells.35,36 A link between transcription and chromatin remodelling is also suggested by the fact that an SWI2/SNF2 family member has been found to be a human pol II transcription termination factor.54 Such data suggest a role for intergenic transcription within the β globin locus, either by maintaining or extending the DNase I sensitivity initiated by erythroid factors interacting with the HSs of the LCR.22 We think it likely that transcription initiation within HS2 and HS3 is involved in determining the orientation specificity of LCR function, as shown by Tanimoto et al.32 Transcription of LCRs may be a general feature of such regulatory elements, since there are other examples of transcribed LCRs.55–58 In addition, there are examples where non-coding RNAs play a regulatory role, whether by the process of transcription (for example, Alu transcription in the keratin 18 LCR)58 or the transcript itself, for example non-coding RNAs involved
in dosage compensation59,60 and imprinting.61,62 Intergenic transcription may also be a feature of gene clusters in general; it has been detected, for example, within the IL-4/IL-13 cluster on chromosome 5.63

Our demonstration that, in addition to the ERV9 LTR promoter, other promoters within the LCR contribute to its transcription is important. Although the ERV9 LTR could account for all LCR transcription in humans and higher apes,64 it is not present in the mouse β globin cluster. Nevertheless, fluorescence in situ hybridisation (FISH) analysis shows that the mouse LCR is transcribed.34 Many of the human β globin transgenes integrated into mice also lack the ERV9 LTR. Nuclear run-on (NRO) analysis has been performed on one such line (72) of transgenic mice.7 Transcription of the human LCR was detected up to the 5′-most probe present in the transgene and we suggested that in
this mouse line the LCR activates a nearby mouse promoter, which then serves as an upstream transcription initiation site. However, it seems unlikely that transcription of all human transgenes lacking the ERV9 LTR would be initiated by such a serendipitous event. It may be that transcription initiation 5′ to HS5 in the human LCR is not essential for transcription of the LCR and its function in vivo, and that initiation at sites within the LCR, such as those described here within HS2 and HS3, provides the sole means of LCR transcription. A recent study has shown that hyperphosphorylated RNA pol II is localised to HS1, HS2 and HS3 in the mouse β globin LCR.39 This localisation was not dependent on the presence of NF-E2, supporting our results, which show that mutation of the NF-E2 sites in the HS2 core region diminishes but does not abolish transcription from within HS2. We suggest that LCR transcription is a general feature of β globin gene clusters in mammals.

Definitive proof of a functional role for LCR and intergenic transcription has yet to be obtained. Deletion or mutation of promoters will probably have no effect, as there appears to be some redundancy of function, with transcription initiation now described from at least three sites within the LCR. A crucial experiment will be to introduce a transcription terminator, potentially downstream of HS2 (the 3′-most promoter determined to date), to determine whether this has any effect on chromatin structure or globin gene transcription.
Materials and Methods

Plasmid constructions

Luciferase constructs were made by blunt-end ligating fragments 12934–14086 (HS1 + luc), 8486–9218 (HS2 + luc), 8561–8877 (HS2c + luc), 4309–5124 (HS3 + luc), 4427–4776 (HS3c + luc), 532–1338 (HS4 + luc), 1036–1314 (HS4c + luc), 5468–6844 (HS5 + luc) or 3222–3971 (F + luc) into the pGL-3 basic vector (Promega #TM033) cut at the Nhe I site and blunt-ended. Co-ordinates for HS1–4 correspond to GenBank sequence U01317; co-ordinates for HS5 and F correspond to GenBank sequence AF064190. HS1–5 − luc and F − luc contain the same inserts but in the reverse orientation with respect to the luciferase gene. The pGL-3 control vector has been described (Promega #TM033). pHS2β, pHS2cβ and pHS3cβ consist of the same fragments as above, blunt-ended and cloned into pβ65 digested with Acc65I and Nco I to remove the β globin promoter region.

Tissue culture

Adherent HeLa cells were grown in modified Eagle’s medium (MEM) supplemented with 10% (v/v) foetal calf serum (FCS) and 2 mM L-glutamine. K562 cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% FCS and 2 mM L-glutamine. Penicillin and streptomycin were included (50 units/ml and 50 μg/ml, respectively). Transfections were carried out using Qiagen Effectene according to
the manufacturer’s instructions. For luciferase reactions, 0.2 μg of test plasmid and 0.2 μg of pRL-TK (co-transfection control) were transfected. For RNA analysis, 5 μg of test plasmid and 0.5 μg of VA plasmid were transfected 48 hours prior to extraction of cytoplasmic RNA.
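As a quick consistency check on the co-ordinates listed under Plasmid constructions, the insert lengths can be computed directly. The sketch below assumes that the co-ordinates are 1-based and inclusive (so that length = end − start + 1), which the text does not state, and that the "c" fragments are the core regions of the corresponding sites. The names FRAGMENTS, insert_length and contains are illustrative only.

```python
# Fragment co-ordinates as listed under "Plasmid constructions".
# HS1-4 refer to GenBank U01317; HS5 and F refer to GenBank AF064190.
FRAGMENTS = {
    "HS1": (12934, 14086),
    "HS2": (8486, 9218),
    "HS2c": (8561, 8877),
    "HS3": (4309, 5124),
    "HS3c": (4427, 4776),
    "HS4": (532, 1338),
    "HS4c": (1036, 1314),
    "HS5": (5468, 6844),
    "F": (3222, 3971),
}


def insert_length(start, end):
    """Insert length in bp, ASSUMING 1-based inclusive co-ordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


def contains(outer, inner):
    """True if the interval `inner` lies entirely within `outer`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


if __name__ == "__main__":
    for name, (start, end) in FRAGMENTS.items():
        print(f"{name}: {insert_length(start, end)} bp")
    # Each "c" fragment should nest inside its parent fragment.
    for site in ("HS2", "HS3", "HS4"):
        print(site, contains(FRAGMENTS[site], FRAGMENTS[site + "c"]))
```

Under the inclusive assumption the HS2 insert is 733 bp and the HS2c insert 317 bp; with half-open co-ordinates each value would be one base shorter.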
Dual luciferase assay

Cell extracts were prepared 48 hours after transfection according to the manufacturer’s instructions. Luciferase reactions were carried out using an LKB-Wallac 1250 luminometer. Readings were taken at ten-second intervals; the value for luciferase activity was taken to be the average of the 20-second and 30-second time-points.
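The read-out rule above can be written down as a short calculation. This is a minimal sketch only: the averaging of the 20-second and 30-second readings follows the text, whereas dividing the test signal by the pRL-TK co-transfection control signal is an assumption about how the control was used, and all numerical readings are hypothetical.

```python
def luciferase_activity(readings):
    """Mean of the 20 s and 30 s readings, as described in the text.

    `readings` maps time after the start of the reaction (seconds)
    to the luminometer value at that time-point.
    """
    return (readings[20] + readings[30]) / 2


def relative_activity(test_readings, control_readings):
    """Test activity divided by co-transfection control activity.

    ASSUMPTION: normalisation is a simple ratio of the test signal
    to the pRL-TK control signal; the text does not spell this out.
    """
    control = luciferase_activity(control_readings)
    if control == 0:
        raise ValueError("control activity is zero; cannot normalise")
    return luciferase_activity(test_readings) / control


if __name__ == "__main__":
    # Hypothetical readings for illustration only (not data from the paper).
    test = {10: 100.0, 20: 120.0, 30: 140.0}
    control = {10: 60.0, 20: 64.0, 30: 66.0}
    print(luciferase_activity(test))
    print(relative_activity(test, control))
```

Only the 20 s and 30 s values enter the calculation; the 10 s reading is recorded but ignored.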
S1 nuclease protection

Cytoplasmic RNA was extracted 48 hours after transfection as described.66 The S1 probe for the β globin 3′-end consisted of the plasmid pβ, digested with Eco RI, filled in with Klenow and [α-32P]dATP, and purified using a Nick Column (Promega). A sample (50–100 counts/second) of probe was hybridised overnight to 20 μg of cytoplasmic RNA in 80% (v/v) formamide, 40 mM Pipes (pH 6.4), 1 mM EDTA, 400 mM NaCl at 52 °C. S1 nuclease protection analysis was performed as described.67
Rapid amplification of cDNA ends (5′ RACE)

5′ RACE was carried out using the SMART RACE cDNA amplification kit (Clontech) according to the manufacturer’s instructions. A sample (50–250 ng) of random hexamers was used for reverse transcription, and amplification was performed with primer HS2D or HS3D-PCR. Genuine RACE products were identified by blotting and hybridisation to fragments corresponding to HS2c or HS3c, and were re-amplified using nested primers prior to cloning and sequencing.
Primer sequences

Primers are listed below (5′–3′): HS2D, GAAATAATATATTCTAGAATATGTC; HS3D, GCTGCTATGCTGTGCCTCC; HS2Dnew, CACATTCTGTCTCAGGCATCCATT; HS3D-PCR, GCAGTCCCATGTAGTAGTAGAATG; NFE2mut, GCATGAAGAATCAGACCTCAGCAT; GATAmut, GACCCATCGCGGAGTCATTACTCT.
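When re-ordering or re-using these oligonucleotides it can help to check basic properties of each sequence. The sketch below computes length, G+C count and reverse complement for the primers as printed above; the helper names (PRIMERS, gc_count, reverse_complement) are illustrative and are not part of the original methods.

```python
# Primer sequences exactly as listed in the text, written 5'-3'.
PRIMERS = {
    "HS2D": "GAAATAATATATTCTAGAATATGTC",
    "HS3D": "GCTGCTATGCTGTGCCTCC",
    "HS2Dnew": "CACATTCTGTCTCAGGCATCCATT",
    "HS3D-PCR": "GCAGTCCCATGTAGTAGTAGAATG",
    "NFE2mut": "GCATGAAGAATCAGACCTCAGCAT",
    "GATAmut": "GACCCATCGCGGAGTCATTACTCT",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_count(seq):
    """Number of G and C residues in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence, returned 5'-3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        gc_percent = 100 * gc_count(seq) / len(seq)
        print(f"{name}: {len(seq)} nt, {gc_percent:.0f}% GC, "
              f"reverse complement {reverse_complement(seq)}")
```

The reverse complement gives the sequence of the opposite strand, which is convenient when locating a primer's binding site in a GenBank record that shows only one strand.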
Acknowledgements

The authors thank Dr K. Plant for helpful discussions. This work was supported by a Wellcome Project grant (051855) and a Wellcome Prize Studentship (052044) to S.J.E.R.
References

1. Groudine, M., Kohwi-Shigematsu, T., Gelinas, R., Stamatoyannopoulos, G. & Papayannopoulou, T. (1983). Human fetal to adult hemoglobin switching: changes in chromatin structure of the beta-globin gene locus. Proc. Natl Acad. Sci. USA, 80, 7551–7555.
2. Bulger, M., Hikke von Doorninck, J., Saitoh, N., Telling, A., Farrell, C., Bender, M. A. et al. (1999). Conservation of sequence and structure flanking the mouse and human β-globin loci: the β-globin genes are embedded within an array of odorant receptor genes. Proc. Natl Acad. Sci. USA, 96, 5129–5134.
3. Grosveld, F., van Assendelft, G. B., Greaves, D. R. & Kollias, G. (1987). Position-independent, high-level expression of the human β-globin gene in transgenic mice. Cell, 51, 975–985.
4. Talbot, D., Collis, P., Antoniou, M., Vidal, M., Grosveld, F. & Greaves, D. R. (1989). A dominant control region from the human beta-globin locus conferring integration site-independent gene expression. Nature, 338, 352–355.
5. Forrester, W. C., Epner, E., Driscoll, M. C., Enver, T., Brice, M., Papayannopoulou, T. & Groudine, M. (1990). A deletion of the human beta-globin locus activation region causes a major alteration in chromatin structure and replication across the entire beta-globin locus. Genes Dev. 4, 1637–1649.
6. Reik, A., Telling, A., Zitnik, G., Cimbora, D., Epner, E. & Groudine, M. (1998). The locus control region is necessary for gene expression in the human β-globin locus but not the maintenance of an open chromatin structure in erythroid cells. Mol. Cell. Biol. 18, 5992–6000.
7. Plant, K. E., Routledge, S. J. E. & Proudfoot, N. J. (2001). Intergenic transcription in the human β-globin gene cluster. Mol. Cell. Biol. 21, 6507–6514.
8. Kulozik, A. E., Bail, S., Bellan-Koch, A., Bartram, C. R., Kohne, E. & Kleihauer, E. (1991). The proximal element of the beta globin locus control region is not functionally required in vivo. J. Clin. Invest. 6, 2142–2146.
9. Yu, J., Bock, J. H., Slightom, J. L. & Villeponteau, B. (1994). A 5′ beta-globin matrix-attachment region and the polyoma enhancer together confer position-independent transcription. Gene, 139, 139–145.
10. Li, Q. & Stamatoyannopoulos, G. (1994). Hypersensitive site 5 of the human beta locus control region functions as a chromatin insulator. Blood, 84, 1399–1401.
11. Collis, P., Antoniou, M. & Grosveld, F. (1990). Definition of the minimal requirements within the human β-globin gene and the dominant control region for high level expression. EMBO J. 9, 233–240.
12. Ellis, J., Tan-Un, K. C., Harper, A., Michalovich, D., Yannoutsos, N., Philipsen, S. & Grosveld, F. (1996). A dominant chromatin-opening activity in 5′ hypersensitive site 3 of the human beta-globin locus control region. EMBO J. 15, 562–568.
13. Ellis, J., Talbot, D., Dillon, N. & Grosveld, F. (1993). Synthetic human beta-globin 5′ HS2 constructs function as locus-control regions only in multicopy transgene concatamers. EMBO J. 12, 127–134.
14. Talbot, D., Philipsen, S., Fraser, P. & Grosveld, F. (1990). Detailed analysis of the site 3 region of the human beta-globin dominant control region. EMBO J. 9, 2169–2178.
15. Philipsen, S., Talbot, D., Fraser, P. & Grosveld, F. (1990). The β-globin dominant control region: hypersensitive site 2. EMBO J. 9, 2159–2167.
16. Lowery, C. H., Bodine, D. M. & Nienhuis, A. W. (1992). Mechanism of DNase I hypersensitive site formation within the human globin locus control region. Proc. Natl Acad. Sci. USA, 89, 1143–1147.
17. Orkin, S. H. (1995). Regulation of globin gene expression in erythroid cells. Eur. J. Biochem. 231, 271–281.
18. Baron, M. H. (1997). Transcriptional control of globin gene switching during vertebrate development. Biochim. Biophys. Acta, 1351, 51–72.
19. Pomerantz, O., Goodwin, A. J., Joyce, T. & Lowery, C. H. (1998). Conserved elements containing NF-E2 and tandem GATA binding sites are required for erythroid-specific chromatin structure reorganization within the human β-globin locus control region. Nucl. Acids Res. 26, 5684–5691.
20. Stamatoyannopoulos, J. A., Goodwin, A., Joyce, T. & Lowery, C. H. (1995). NF-E2 and GATA binding motifs are required for the formation of DNase I hypersensitive site 4 of the human beta-globin locus control region. EMBO J. 14, 106–116.
21. Goodwin, A. J., McInerney, J. M., Glander, M. A., Pomerantz, O. & Lowery, C. H. (2001). In vivo formation of a human beta-globin locus control region core element requires binding sites for multiple factors including GATA-1, NF-E2, EKLF, and Sp1. J. Biol. Chem. 276, 26883–26892.
22. Leach, K. M., Nightingale, K., Igarashi, K., Levings, P. P., Engel, J. D., Becker, P. B. & Bungert, J. (2001). Reconstitution of human β-globin locus control region hypersensitive sites in the absence of chromatin assembly. Mol. Cell. Biol. 21, 2629–2640.
23. Wijgerde, M., Gribnau, J., Trimborn, T., Nuez, B., Philipsen, S., Grosveld, F. & Fraser, P. (1996). The role of EKLF in human beta-globin gene competition. Genes Dev. 10, 2894–2902.
24. Gillemans, N., Tewari, R., Lindeboom, F., Rottier, R., deWit, T., Wijgerde, M. et al. (1998). Altered DNA-binding specificity mutants of EKLF and Sp1 show that EKLF is an activator of the beta-globin locus control region in vivo. Genes Dev. 12, 2863–2873.
25. Peterson, K. R., Clegg, C. H., Navas, P. A., Norton, E. J., Kimbrough, T. G. & Stamatoyannopoulos, G. (1996). Effect of deletion of 5′ HS3 or 5′ HS2 of the human beta-globin locus control region on the developmental regulation of globin gene expression in beta-globin locus yeast artificial chromosome transgenic mice. Proc. Natl Acad. Sci. USA, 93, 6605–6609.
26. Navas, P. A., Peterson, K. R., Li, Q., Skarpidi, E., Rohde, A., Shaw, S. E. et al. (1998). Developmental specificity of the interaction between the locus control region and embryonic or fetal globin genes in transgenic mice with an HS3 core deletion. Mol. Cell. Biol. 18, 4188–4196.
27. Bungert, J., Dave, U., Lim, K. C., Lieuw, K. H., Shavit, J. A., Liu, Q. H. & Engel, J. D. (1995). Synergistic regulation of human beta-globin gene switching by locus control region elements HS3 and HS4. Genes Dev. 9, 3083–3096.
28. Bungert, J., Tanimoto, K., Patel, S., Liu, Q., Fear, M. & Engel, J. D. (1999). Hypersensitive site 2 specifies a unique function within the human β-globin locus control region to stimulate globin gene transcription. Mol. Cell. Biol. 19, 3062–3072.
29. Molete, J. M., Petrykowska, H., Bouhassira, E. E., Feng, Y.-Q., Miller, W. & Hardison, R. C. (2001). Sequences flanking hypersensitive sites of the
β-globin locus control region are required for synergistic enhancement. Mol. Cell. Biol. 21, 2969–2980.
30. Li, G. L., Lim, K. C., Engel, J. D. & Bungert, J. (1998). Individual LCR hypersensitive sites cooperate to generate an open chromatin domain spanning the human beta-globin locus. Genes Cells, 3, 415–429.
31. Fraser, P., Pruzina, S., Antoniou, M. & Grosveld, F. (1993). Each hypersensitive site of the human beta-globin locus-control region confers a different developmental pattern of expression on the globin genes. Genes Dev. 7, 106–113.
32. Tanimoto, K., Liu, Q., Bungert, J. & Engel, J. D. (1999). Effects of altered gene order or orientation of the locus control region on human β-globin gene expression. Nature, 398, 344–347.
33. Zafarana, G., Raguz, S., Pruzina, S., Grosveld, F. & Meijer, D. (1994). The regulation of human β-globin expression: the analysis of hypersensitive site 5 (HS5) in the LCR. In Molecular Biology of Hemoglobin Switching (Stamatoyannopoulos, G., ed.), pp. 39–44, Intercept, Andover, UK.
34. Ashe, H., Monks, J., Wijgerde, M., Fraser, P. & Proudfoot, N. J. (1997). Intergenic transcription and transinduction in the human β-globin locus. Genes Dev. 11, 2494–2509.
35. Cho, H., Orphanides, G., Sun, X., Yang, X.-J., Ogrysko, V., Lees, E. et al. (1998). A human RNA polymerase II complex containing factors that modify chromatin structure. Mol. Cell. Biol. 18, 5355–5363.
36. Wilson, C. J., Chao, D. M., Imbalzano, A. M., Schnitzler, G. R., Kingston, R. E. & Young, R. A. (1996). RNA polymerase II holoenzyme contains SWI/SNF regulators involved in chromatin remodelling. Cell, 84, 235–244.
37. Tuan, D., Kong, S. & Hu, K. (1992). Transcription of the hypersensitive site HS2 enhancer in erythroid cells. Proc. Natl Acad. Sci. USA, 89, 11219–11223.
38. Kong, S., Bohl, D., Li, C. & Tuan, D. (1997). Transcription of the HS2 enhancer toward a cis-linked gene is independent of the orientation, position and distance of the enhancer relative to the gene. Mol. Cell. Biol. 17, 3955–3965.
39. Johnson, K. D., Christensen, H. M., Zhao, B. & Bresnick, E. H. (2001). Distinct mechanisms control RNA polymerase II recruitment to a tissue-specific locus control region and a downstream promoter. Mol. Cell, 8, 465–471.
40. Dhar, V., Nandi, A., Schildkraut, C. L. & Skoultchi, A. I. (1990). Erythroid-specific nuclease-hypersensitive sites flanking the human beta-globin domain. Mol. Cell. Biol. 10, 4324–4333.
41. Caterina, J. J., Ryan, T. M., Pawlik, K. M., Palmiter, R. D., Brinster, R. L., Behringer, R. R. & Townes, T. M. (1991). Human β-globin locus control region: analysis of the 5′ DNase I hypersensitive site HS2 in transgenic mice. Proc. Natl Acad. Sci. USA, 88, 1626–1630.
42. Kozak, M. (1987). An analysis of 5′-noncoding sequences from 699 vertebrate messenger RNAs. Nucl. Acids Res. 15, 8125–8148.
43. Nagy, E. & Maquat, L. E. (1998). A rule for termination-codon position within intron-containing genes: when nonsense affects RNA abundance. Trends Biochem. 23, 198–199.
44. Gong, Q. & Dean, A. (1993). Enhancer-dependent transcription of the epsilon-globin promoter requires promoter-bound GATA-1 and enhancer-bound AP-1/NF-E2. Mol. Cell. Biol. 13, 911–917.
45. Gong, Q. & Dean, A. (1996). Essential role of NF-E2 in remodelling of chromatin structure and transcriptional activation of the epsilon-globin gene in vivo by 5′ hypersensitive site 2 of the beta-globin locus control region. Mol. Cell. Biol. 16, 6055–6064.
46. Shivdasani, R. A. & Orkin, S. H. (1995). Erythropoiesis and globin gene-expression in mice lacking the transcription factor NF-E2. Proc. Natl Acad. Sci. USA, 92, 8690–8694.
47. Hardison, R., Slightom, J. L., Gumucio, D. L., Goodman, M., Stojanovic, N. & Miller, W. (1997). Locus control regions of mammalian beta-globin gene clusters: combining phylogenetic analyses and experimental results to gain functional insights. Gene, 205, 73–94.
48. Bulger, M. & Groudine, M. (1999). Looping versus linking: toward a model for long-distance gene activation. Genes Dev. 13, 2465–2477.
49. Engel, J. D. & Tanimoto, K. (2000). Looping, linking, and chromatin activity: new insights into β-globin locus regulation. Cell, 100, 499–502.
50. Li, Q., Harju, S. & Peterson, K. (1999). Locus control regions: coming of age at a decade plus. Trends Genet. 15, 403–408.
51. Elnitski, L., Miller, W. & Hardison, R. (1997). Conserved E-boxes function as part of the enhancer in hypersensitive site 2 of the beta-globin locus control region: role of basic helix–loop–helix proteins. J. Biol. Chem. 272, 369–378.
52. Roeder, R. G. (1996). The role of general initiation factors in transcription by RNA polymerase II. Trends Biochem. 21, 327–335.
53. Wittschieben, B. O., Otero, G., deBizemont, T., Fellows, J., Erdjument-Bromage, H., Ohba, R. et al. (1999). A novel histone acetyltransferase is an integral subunit of elongating RNA polymerase II holoenzyme. Mol. Cell, 4, 123–128.
54. Liu, M., Xie, Z. & Price, D. H. (1998). A human RNA polymerase II transcription termination factor is a SWI2/SNF2 family member. J. Biol. Chem. 273, 25541–25544.
55. Vyas, P., Vickers, M. A., Simmons, D. L., Ayyub, H., Craddock, C. F. & Higgs, D. R. (1992). cis-Acting sequences regulating expression of the human alpha-globin cluster lie within constitutively open chromatin. Cell, 69, 781–793.
56. Bennani-Baiti, I. M., Jones, B. K., Liebhaber, S. A. & Cooke, N. E. (1995). Physical linkage of the human growth hormone gene and the skeletal muscle sodium channel α-subunit gene (SCN4A) on chromosome 17. Genomics, 29, 647–652.
57. Aronow, B., Lattier, D., Silbiger, R., Dusing, M., Hutton, J., Jones, G. et al. (1989). Evidence for a complex regulatory array in the first intron of the human adenosine deaminase gene. Genes Dev. 3, 1384–1400.
58. Neznanov, N. S. & Oshima, R. G. (1993). cis Regulation of the keratin 18 gene in transgenic mice. Mol. Cell. Biol. 13, 1815–1823.
59. Kelley, R. L. & Kuroda, M. I. (2000). Noncoding RNA genes in dosage compensation and imprinting. Cell, 103, 9–12.
60. Meller, V. H., Gordadze, P. R., Park, Y., Chu, X., Stuckenholz, C., Kelley, R. L. & Kuroda, M. I. (2000). Ordered assembly of roX RNAs into MSL complexes on the dosage-compensated X chromosome in Drosophila. Curr. Biol. 10, 136–143.
61. Bartolomei, M. S., Zemel, S. & Tilghman, S. M. (1991). Parental imprinting of the mouse H19 gene. Nature, 351, 153–155.
62. Schmidt, J. V., Matteson, P. G., Jones, B. K., Guan, X.-J. & Tilghman, S. M. (2000). The Dlk1 and Gtl2 genes are linked and reciprocally imprinted. Genes Dev. 14, 1997–2002.
63. Rogan, D. F., Cousins, D. J. & Staynov, D. Z. (1999). Intergenic transcription occurs throughout the human IL-4/IL-13 gene cluster. Biochem. Biophys. Res. Commun. 255, 556–561.
64. Di Cristofano, A., Strazzullo, M., Parisi, T. & La Mantia, G. (1995). Mobilization of an ERV9 human endogenous retroviral element during primate evolution. Virology, 213, 271–275.
65. Proudfoot, N. J., Lee, B. A. & Monks, J. (1992). Multiple Sp1 binding-sites confer enhancer-independent, replication-activated transcription of HIV-1 and globin gene promoters. New Biol. 4, 369–381.
66. Eggermont, J. & Proudfoot, N. J. (1993). Poly(A) signals and transcriptional pause sites combine to prevent interference between RNA polymerase-II promoters. EMBO J. 12, 2539–2548.
67. Sambrook, J., Fritsch, E. F. & Maniatis, T. (1989). Molecular Cloning: A Laboratory Manual, 2nd edit., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Edited by J. Karn (Received 3 May 2002; received in revised form 12 September 2002; accepted 17 September 2002)