Structural Analysis and Mapping of DNase I Hypersensitivity of HS5 of the β-Globin Locus Control Region

Genomics 61, 183–193 (1999) Article ID geno.1999.5954, available online at http://www.idealibrary.com on

Qiliang Li, Miaohua Zhang, Zhijun Duan, and George Stamatoyannopoulos¹

Division of Medical Genetics, Box 357720, School of Medicine, University of Washington, Seattle, Washington 98195

Received January 28, 1999; August 2, 1999

The β-globin locus control region (LCR) is a cis regulatory element that is located in the 5′ part of the locus and confers high-level erythroid lineage-specific and position-independent expression of the globin genes. The LCR is composed of five DNase I hypersensitive sites (HSs), four of which are formed in erythroid cells. The function of the 5′-most site, HS5, remains unknown. To gain insights into its function, mouse HS5 was cloned and sequenced. Comparison of the HS5 sequences of mouse, human, and galago revealed two extensively conserved regions, designated HS5A and HS5B. DNase I hypersensitivity mapping revealed that two hypersensitive sites are located within the HS5A region (designated HS5A major and HS5A minor), and two are located within the HS5B region (HS5B major, HS5B minor). The positions of each of these HSs colocalize with either GATA-1 or Ap1/NF-E2 motifs, suggesting that these protein binding sites are implicated in the formation of HS5. Gel retardation assays indicated that the Ap1/NF-E2 motifs identified in murine HS5A and HS5B interact with NF-E2 or similar proteins. Studies of primary murine cells showed that HS5 is formed in all hemopoietic tissues tested (fetal liver, adult thymus, and spleen), indicating that this HS is not erythroid lineage specific. HS5 was detected in murine brain but not in murine kidney or adult liver, suggesting that this site is not ubiquitous. The presence of GATA-1 and NF-E2 motifs (which are common features of the DNase I hypersensitive sites of the LCR) suggests that HS5 is organized in a manner similar to that of the other HSs. Taken together, our results suggest that HS5 is an inherent component of the β-globin locus control region. © 1999 Academic Press

Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession No. AF037169.

¹ To whom correspondence should be addressed at Division of Medical Genetics, Markey Molecular Medicine Center, K253 HSB, Box 357720, University of Washington, Seattle, WA 98195. Telephone: (206) 543-3526. Fax: (206) 543-3050. E-mail: gstam@u.washington.edu.

INTRODUCTION

The human β-globin locus is controlled by a distant cis-acting regulatory element termed the locus control region (LCR). The LCR is capable of opening the β locus domain, enhancing transcription of linked genes, and controlling the timing and origin of DNA replication of the locus (reviewed by Orkin, 1990, 1995; Townes and Behringer, 1990; Felsenfeld, 1992; Grosveld et al., 1993; Engel, 1993; Stamatoyannopoulos and Nienhuis, 1994; Martin et al., 1996; Wood, 1996). The LCR is characterized by the presence of five DNase I hypersensitive sites (HSs) located at positions −6.1, −10.9, −14.7, −18, and −21.5 kb relative to the ε-globin cap site and designated HS1 to HS5, respectively (Tuan et al., 1985; Forrester et al., 1986; Grosveld et al., 1987). The enhancing activity of the LCR is confined mainly to HS2, HS3, and HS4 (Forrester et al., 1989; Collis et al., 1990; Ryan et al., 1989); these sites have been intensively investigated in the past decade. Each HS contains a 200- to 300-bp core sequence (Talbot et al., 1990; Philipsen et al., 1990; Caterina et al., 1991; Liu et al., 1992), which is highly conserved during evolution (Li et al., 1990; Moon and Ley, 1990; Hardison et al., 1997). The cores contain several ubiquitous and erythroid-specific motifs, the most prominent of which are GATA-1, NF-E2, and the GT box (Talbot et al., 1990; Philipsen et al., 1990; Pruzina et al., 1991; Reddy and Shen, 1991; Strauss and Orkin, 1992; Ikuta et al., 1996). Each HS possesses only a portion of the LCR enhancing activity in transgenic mice, while full activity of the LCR is obtained by combining multiple HSs in one construct (Talbot et al., 1989). Although individual HSs of the LCR are able to confer copy number-dependent expression of a linked gene (Fraser et al., 1990; Li and Stamatoyannopoulos, 1994b), deletion of any HS in the context of a 70-kb cosmid encompassing the entire β locus results in position effects on globin gene expression (Milot et al., 1996). Such results have led to the suggestion that the LCR acts as a holocomplex formed by the HSs (Wijgerde et al., 1995).
There is developmental specificity of HS function (Fraser et al., 1993; Li and Stamatoyannopoulos, 1994b), exemplified by studies of transgenic mice carrying a βYAC containing a 234-bp deletion of the HS3 core (Navas et al., 1998). The function of the individual HSs is partially redundant, as shown by the finding that deletions of 1.9 kb of HS2 or 2.3 kb of HS3 in the context of a 248-kb βYAC result in minor effects on γ- or β-globin expression (Peterson et al., 1996). Similarly, deletions of the endogenous murine HS2 or HS3 produce only minor decreases of β-globin gene expression (Fiering et al., 1995; Hug et al., 1996). In contrast, deletions of the core sequences of HS2, HS3, or HS4 have profound effects on globin gene expression (Bungert et al., 1995, 1999; Navas et al., 1998).

In contrast to HS2, HS3, and HS4, the most upstream site of the LCR, HS5, has not been thoroughly studied, and its function remains elusive. It is not clear whether HS5 is an essential component of the LCR, and there are contradictory reports regarding the lineage specificity of this site. In contrast to HS1 to HS4, which are erythroid lineage specific, HS5 was initially considered to be formed constitutively (Tuan et al., 1985; Dhar et al., 1990). Later, it was reported that HS5 is formed only in erythroid cells of transgenic mice carrying the human β locus, suggesting that this site is erythroid lineage specific (Zafarana et al., 1994). HS5 has insulating activity in enhancer blocking assays in K562 and murine erythroleukemia (MEL) cells (Chung et al., 1993; Li and Stamatoyannopoulos, 1994a) and can confer position-independent transcription (Yu et al., 1994). In transgenic mice, however, HS5 does not function as a chromatin insulator (Zafarana et al., 1994).

To gain insights into the organization of HS5, we cloned and sequenced mouse HS5 and performed DNase I hypersensitivity mapping. Comparisons of the sequences of murine, human, and galago HS5 indicated that this site is composed of two conserved regions, designated HS5A and HS5B. Each region contains GATA-1 and NF-E2 motifs. DNase I hypersensitivity studies indicated that HS5 is composed of four subbands, HS5A major, HS5A minor, HS5B major, and HS5B minor, colocalizing with either the GATA-1 or the NF-E2 motifs of HS5A or HS5B. Studies of primary cells from various murine tissues showed that, in contrast to previous reports, HS5 is not erythroid lineage specific because it is formed in all hemopoietic tissues we tested. HS5, however, is not a ubiquitously formed DNase I hypersensitive site because it was absent in two of the three nonhemopoietic tissues we analyzed.

0888-7543/99 $30.00 Copyright © 1999 by Academic Press. All rights of reproduction in any form reserved.

FIG. 1. Physical map of the six lambda clones containing the murine HS5 sequence. The physical map of the murine LCR (top line) was assembled based on this study and the data in the literature (Moon and Ley, 1990; Jimenez et al., 1992). The vertical arrows indicate the locations of the HSs. The inserts in the six lambda clones are represented by the three thin lines under the physical map. The closed boxes are the probes used in lambda library screening. The open box shows the region that was sequenced. B, BamHI; E, EcoRI; K, KpnI; X, XbaI.

MATERIALS AND METHODS

Cloning and sequencing. A 129sv strain mouse genomic library constructed in phage Lambda FIX II (Stratagene) was screened with a 300-bp probe corresponding to mouse HS4 (Jimenez et al., 1992). Positive clones were further identified by hybridization with a human HS5 probe (Li and Stamatoyannopoulos, 1994a) at low stringency. The clones that hybridized to human HS5 were subjected to restriction enzyme mapping. The DNA partially digested by a restriction enzyme was fractionated on a 0.8% agarose gel, blotted onto a nylon membrane, and hybridized with probes that were located at either the 5′ or the 3′ end of the insert DNA. Based on the map, the insert DNA was subcloned into a plasmid vector (pBluescript, Stratagene) and used for nucleotide sequencing. DNA sequence was determined from both directions using the dideoxynucleotide termination method. Sequence was analyzed using the GCG program.

Cells and tissues. Murine cell lines (MEL, NIH3T3, and E427) were cultured in RPMI 1640 medium with 2 mM L-glutamine and 10% fetal calf serum in a humidified 5% CO₂ atmosphere at 37°C. Kidney, thymus, liver, and brain were collected from normal adult mice (strain B6D2). Fetal liver was collected from day 12 fetuses.

Isolation of nuclei. Nuclei were isolated as described by Stamatoyannopoulos et al. (1995). Briefly, cell suspensions were prepared by passing small pieces of tissue through a needle several times and filtering through a nylon mesh. Cells were collected by centrifugation at 400g for 5 min and washed once in phosphate-buffered saline. The pellets were resuspended by tapping to a final cell density of 2 × 10⁷ to 4 × 10⁷ per milliliter in ice-cold RSB (10 mM Tris–HCl, pH 7.5, 10 mM NaCl, 3 mM MgCl₂). Cell membranes were disrupted by adding, dropwise, 10% NP-40 to a final concentration of 0.25%. The resulting nuclei were pelleted at 400g for 5 to 10 min and resuspended in RSB at a DNA concentration of 1 mg/ml.

FIG. 2. Dot matrix analysis of the murine and human HS5 sequences. Murine and human DNA sequences were compared using the compare program of the University of Wisconsin Genetics Computer Group. Dots were recorded whenever the two sequences used in the comparison had 14 or more identical nucleotides of the 21 nucleotides compared. Repetitive sequences were not excluded from these comparisons. The locations of HS4 and HS5 are shown. The human sequence coordinate is L22754 in the GenBank database and continues to J00179; the number 2688 in L22754 corresponds to number 1 in J00179.

Determination of DNase I hypersensitive sites. Nuclei from cultured cells and mouse fetal liver were digested with increasing amounts of DNase I (Washington) for 10 min at 37°C. Nuclei from adult tissues were incubated without exogenous nuclease for increasing lengths of time at 37°C. The reaction was stopped by adding EDTA and proteinase K to final concentrations of 50 mM and 200 µg/ml, respectively. The mixture was incubated overnight at 55°C. The DNA was purified by phenol/chloroform extraction and precipitated with ethanol. The DNase I-treated mouse DNA was digested with KpnI. Ten micrograms of DNA from each sample was fractionated by electrophoresis in a 0.7% agarose gel and transferred to Zeta-Probe membranes (Bio-Rad) by Southern blotting. The blots were probed with a ³²P-labeled probe (KpnI/EcoRI fragment, 1–499 in this paper). After hybridization, the membranes were washed twice in 2× SSC, 0.1% SDS and once in 0.1× SSC, 0.1% SDS at 65°C for 20 min. Blots were autoradiographed with an intensifying screen at −70°C for 1 to 7 days. The size of the hybridizing bands appearing in the autoradiogram was determined by using a 1-kb ladder standard (Gibco BRL), which was diluted in 10 µg of mouse genomic DNA and run alongside the samples in each gel.

Gel retardation assays. Nuclear extracts used for DNA binding assays were prepared as described by Andrews and Faller (1991). About 3 × 10⁷ MEL cells were harvested and washed once with 40 ml of cold PBS. The pellet was then resuspended in 2 ml of cold buffer A (10 mM Hepes–KOH, pH 7.9, 1.5 mM MgCl₂, 10 mM KCl, 0.5 mM DTT, 0.2 mM PMSF). After incubation for 10 min on ice, the suspension was centrifuged at 10,000g for 10 s. The pellet was resuspended in 500 µl of cold buffer C (20 mM Hepes–KOH, pH 7.9, 25% glycerol, 1.5 mM MgCl₂, 420 mM NaCl, 0.2 mM EDTA, 0.5 mM DTT, 0.2 mM PMSF) and incubated on ice for 30 min. The crude nuclear extract was obtained after the cellular debris was removed by centrifugation at 12,000g for 30 min at 4°C. Protein concentrations were determined using the Bio-Rad protein assay kit. Aliquots were stored at −70°C.

The synthetic double-stranded oligonucleotides were end-labeled using T4 polynucleotide kinase. The binding reaction mixture (10 µl final volume) contained labeled oligonucleotide probes (20,000 cpm) in binding buffer (10 mM Tris–HCl, pH 7.5, 50 mM NaCl, 0.5 mM DTT), 10% glycerol, 1 µg poly(dI–dC), 0.05% NP-40, and various concentrations of cold probe competitors. After 7 µg of nuclear extract was added, the binding reactions were performed for 20 min at room temperature. Samples were electrophoresed in a 4.5% polyacrylamide gel in 0.5× Tris–borate–EDTA buffer containing 4 mM Mg²⁺ at 4°C. The DNA sequences of the oligonucleotides used in the studies were as follows: human HS2/NF-E2 probe, 5′-GAGTCATGATGAGTCATGCT-3′; mouse probe A, 5′-CAATTTTTGACACATTAGCT-3′; mouse probe B, 5′-CTATTAACTGACTCATATCT-3′.
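The band sizing step described under "Determination of DNase I hypersensitive sites" (reading subband sizes against a 1-kb ladder run in the same gel) can be illustrated with a short sketch. This is not the authors' procedure; it assumes the common approximation that, over a limited size range, the logarithm of fragment size falls roughly linearly with migration distance, and it interpolates between the two flanking marker bands. The function name `estimate_size_kb` and the ladder distances are hypothetical values chosen only for illustration.

```python
import math


def estimate_size_kb(distance_mm, ladder):
    """Estimate a fragment size (kb) from its migration distance (mm).

    ladder is a list of (size_kb, distance_mm) pairs read from the marker
    lane. log10(size) is interpolated linearly between the two marker bands
    that flank the unknown band.
    """
    points = sorted(ladder, key=lambda p: p[1])  # order by distance migrated
    for (size1, dist1), (size2, dist2) in zip(points, points[1:]):
        if dist1 <= distance_mm <= dist2:
            frac = (distance_mm - dist1) / (dist2 - dist1)
            log_size = math.log10(size1) + frac * (
                math.log10(size2) - math.log10(size1)
            )
            return 10 ** log_size
    raise ValueError("band lies outside the range covered by the ladder")


# Hypothetical marker lane: (size in kb, distance migrated in mm).
LADDER = [(10.0, 20.0), (5.0, 35.0), (3.0, 46.0), (2.0, 55.0), (1.0, 70.0)]

print(round(estimate_size_kb(50.0, LADDER), 2))
```

Real gels deviate from a log-linear relationship at the extremes of the size range, so an estimate of this kind is only as good as the nearest marker bands.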

RESULTS

FIG. 3. Alignments of the HS5 sequences of mouse (M), human (H), and galago (G). The GenBank accession numbers are AF037169 for mouse, L22754 for human, and U60902 for galago. To avoid interference of insertions, deletions, and low-homology regions on long-range alignment, the sequences were divided into several segments based on the results of dot matrix analysis. Alignments were first performed with the short DNA segments with higher homology. Then the entire alignments were completed by assembling the small pieces of alignments. The gaps were introduced to obtain maximal matching. (A) Alignment for the HS5A region. (B) Alignment for the HS5B region. The GATA-1 and NF-E2-like motifs are shown in boldface type. Blocks of sequence of 6 bp or more that were identified as "phylogenetic footprints" (see Results) are shown in the shaded areas.

Molecular Cloning of Mouse HS5

Human HS5 is located about 3 kb 5′ to HS4. We reasoned that if the distance between HS4 and HS5 is conserved between human and mouse, we should be able to clone mouse HS5 by "walking" a λ phage library using an HS4 probe. A 129sv strain mouse genomic library constructed in λ phage FIX II was screened with a 300-bp probe corresponding to mouse HS4 (Jimenez et al., 1992). Six positive clones were isolated, and they were further hybridized with a human HS5 probe under low stringency. All six clones hybridized to the human HS5 probe. Physical mapping was performed on the positive clones (Fig. 1). Of these, four phage clones (λ1661, λ1131, λ1132, and λ1133) contained an identical insert, which is about 8 kb long and spans HS4 and HS5. The insert in phage λ2731 extends about 4.5 kb upstream from clone λ1661, while the insert in phage λ1662 contains 8 kb of additional downstream sequence. A 5.8-kb nucleotide sequence was determined (GenBank Accession No. AF037169), which at its 3′ end has a 610-bp overlap with the published mouse HS4 sequence (Jimenez et al., 1992) and extends through HS5.
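As an aside, an end overlap of the kind reported above (the 3′ end of the new sequence running into the published HS4 sequence) can be detected in principle by a suffix-to-prefix comparison. The sketch below is a minimal illustration and not the method used in the paper: the function name `overlap_length` is hypothetical, and it requires an exact match, whereas independently determined sequences may differ at a few positions and would need a mismatch-tolerant alignment.

```python
def overlap_length(upstream, downstream, min_len=1):
    """Length of the longest suffix of `upstream` that exactly equals a
    prefix of `downstream`; returns 0 if no overlap of at least min_len."""
    max_len = min(len(upstream), len(downstream))
    # Try the longest candidate overlap first and shrink until one matches.
    for n in range(max_len, max(min_len, 1) - 1, -1):
        if upstream[-n:] == downstream[:n]:
            return n
    return 0


# Toy sequences (not from the paper): the last 4 bases of the first string
# are the first 4 bases of the second.
print(overlap_length("AAACCGT", "CCGTTT"))
```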

Long et al. (1998) have published additional sequences 5′ to human HS5 that include a 1.8-kb transposon element (ERV-9). They proposed that enhancer sequences within this element may play a role in the function of the LCR (Long et al., 1998). Dot matrix analysis using the GCG program failed to depict any homology between the murine HS5 sequence we report here and the ERV-9 element (which is human specific) or the sequences immediately flanking this element.

Sequence Comparisons

The mouse sequence was compared with the sequence of human HS5 (GenBank Accession No. L22754) and HS4 (Accession No. J00179) using the GCG program. Dot matrix analysis was performed using a window of 21 nucleotides and a threshold of 14 matches (Fig. 2). Three conserved regions appear in the dot matrix graph. The first region corresponds to HS4, and it is represented by the rightmost diagonal line centered at 5.5 kb 3′ to the KpnI site, the starting point in this sequence. The other two conserved regions are easily located and reside approximately 1.8 and 3.6 kb upstream of HS4. These two regions contain DNase I hypersensitive sites (see below) and are referred to as HS5A and HS5B, respectively. HS5A corresponds to HS4.2, and HS5B corresponds to HS5 of Bender et al. (1998). Since the galago HS5 sequence (GenBank Accession No. U60902) is available (Slightom et al., 1997), we compared the galago sequence with the mouse sequence by dot matrix analysis. Two conserved regions 5′ to HS4 were also found in galago, located at positions similar to those in human.

Figures 3A and 3B show the alignment of HS5 sequences in mouse, human, and galago. Because deletions and insertions frequently occur during the course of evolution, gaps were introduced to reach maximum matching in alignments. The alignment in the HS5A region (Fig. 3A) spans about 0.7 kb, starting from about 3240 through about 4300 of the mouse sequence. Four homologous subregions were identified. The first, around 3250, is short and characterized by the presence of multiple GATA-1 sites. The second stretches from 3490 to 3645; within this 155-bp stretch, the homology between mouse and human is 65%, while the homology between mouse and galago is 61%. At the end of this homologous island, there is a putative Ap1/NF-E2 motif (see below). Neither GATA-1 nor NF-E2 motifs were present in homologous regions 3 (extending from 3750 to 4076) and 4 (extending from 4093 to 4313).

The conserved region HS5B (Fig. 3B) is divided into two subregions based on the fact that it is interrupted by a 244-bp deletion in the galago sequence. The 5′ homologous subregion extends from 1475 to 1990 of the mouse sequence. An Ap1/NF-E2 motif resides in the middle of this subregion. The 3′ subregion extends from 2220 to 2470. This region contains a GATA-1 site that is present in the same position in the three species.
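The dot matrix criterion used above (a dot wherever 14 or more of 21 compared nucleotides are identical) can be sketched in a few lines. This is a simplified illustration and not the GCG compare program itself: the function name `dot_matrix` is hypothetical, dots are reported at window start positions, and the brute-force loop is only practical for short sequences.

```python
def dot_matrix(seq_a, seq_b, window=21, min_matches=14):
    """Return (i, j) pairs where the window starting at seq_a[i] and the
    window starting at seq_b[j] share at least min_matches identical bases."""
    dots = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            # Count position-by-position identities within the two windows.
            matches = sum(x == y for x, y in zip(win_a, win_b))
            if matches >= min_matches:
                dots.append((i, j))
    return dots


# Toy example with a small window so the result is easy to inspect.
print(dot_matrix("ACGTACGT", "ACGTACGT", window=4, min_matches=4))
```

Conserved regions appear as runs of dots along a diagonal, which is how the HS4, HS5A, and HS5B regions stand out in Fig. 2.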

GATA-1 and NF-E2-like Motifs

GATA-1 and NF-E2 motifs are characteristic of the DNase I hypersensitive sites of the LCR (Orkin, 1995). Therefore, we first inspected the sequences of murine HS5 for the existence of GATA-1 and Ap1/NF-E2 motifs, and the motifs described below were identified. The subsequent alignment of the HS5 sequences of the three species (Figs. 3A and 3B) showed that such motifs were also present in homologous regions of human and galago HS5. There is one important difference between the arrangement of GATA-1 and NF-E2-like sites in HS5 and that in HS2, HS3, and HS4. The GATA-1 and NF-E2 motifs in HS2, HS3, and HS4 are approximately 50 bp apart, and this distance appears to have been conserved during evolution (Stamatoyannopoulos et al., 1995). In contrast, the GATA-1 and NF-E2 motifs in HS5A are 370 bp apart, and those in HS5B are 450 bp apart. Thus, while the presence of the two motifs is retained in HS5, the distance between them differs between HS5 and the other DNase I hypersensitive sites of the LCR.

GATA-1. A remarkable feature of HS5 is the presence of multiple GATA-1 motifs that cluster in the first homologous island of the HS5A region (Fig. 3A). In human HS5A, five GATA-1 motifs are arranged in an array in the same orientation. In galago HS5A, two GATA-1 sites reside next to each other. There are two GATA-1 motifs in mouse HS5A, but they are separated by a 31-bp insert. In the HS5B region (Fig. 3B), a GATA-1 motif resides at the same position in all three species.

Ap1/NF-E2-like motifs. NF-E2 motifs have been identified in all HSs of the LCR (Stamatoyannopoulos et al., 1995). HS5A contains an NF-E2-like motif (TTTTGACACAT) that matches in 8 of 11 nucleotides with the consensus sequence of Ap1/NF-E2 (YGCTGASTCAY) (Andrews et al., 1993). The term TRE (TPA responsive element) has been proposed to designate the core of the NF-E2 sequence. The NF-E2-like site of HS5A contains a motif (TGACACA) that has a 1-bp mismatch from the TRE motif (TGACTCA). HS5B contains an NF-E2-like motif (AACTGACTCAT) that matches in 9 of 11 nucleotides with the consensus sequence of Ap1/NF-E2. The NF-E2-like site of HS5B contains a motif (TGACTCA) that has a perfect match with the TRE motif.
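The match counts quoted above can be checked by scoring each site against the degenerate consensus, reading Y as C or T and S as C or G under the standard IUPAC nucleotide codes. The sketch below is an illustration only; the function name `count_matches` is hypothetical, and the sequences are the ones given in the text.

```python
# Standard IUPAC nucleotide codes needed for the consensus used in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}


def count_matches(site, consensus):
    """Count positions where `site` is compatible with the IUPAC `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base in IUPAC[code] for base, code in zip(site, consensus))


NFE2_CONSENSUS = "YGCTGASTCAY"  # Ap1/NF-E2 consensus quoted in the text
TRE_CORE = "TGACTCA"            # TRE motif quoted in the text

print(count_matches("TTTTGACACAT", NFE2_CONSENSUS))  # HS5A site: 8 of 11
print(count_matches("AACTGACTCAT", NFE2_CONSENSUS))  # HS5B site: 9 of 11
print(count_matches("TGACACA", TRE_CORE))            # HS5A core: 6 of 7
```

The HS5A site scores 8 of 11 and the HS5B site 9 of 11, in agreement with the values stated in the text, and the HS5A core differs from the TRE at a single position.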

FIG. 4. Gel retardation assay for NF-E2 binding. (A) The NF-E2 probe was a 20-bp double-stranded DNA containing the NF-E2 motif in human HS2 (from 8663 to 8682). The labeled probe was incubated with MEL nuclear extract and fractionated on a polyacrylamide gel (lane 1). (Lanes 2–4) The labeled NF-E2 probe was competed by the cold NF-E2 probe. The amount of the cold competitor was increased from lane 2 to lane 4 as indicated by the wedge symbol. (Lanes 5–7) The labeled NF-E2 probe was competed with cold probe A (20 bp) that contains the NF-E2-like motif of HS5A. The amount of probe A in lanes 5–7 is comparable to that of the cold NF-E2 probe in lanes 2–4. (Lanes 8–10) The labeled NF-E2 probe was competed with cold probe B that contained the NF-E2-like motif of HS5B. The amounts of the competitor are comparable to those of lanes 2 to 4. (B) Probe A (20 bp) containing an NF-E2-like sequence in the mouse HS5A region was labeled and competed with no cold competitor (lane 1), with itself (lanes 2–5), or with the human HS2/NF-E2 probe. (C) Probe B (20 bp) containing an NF-E2-like sequence in the mouse HS5B region was labeled and competed with no cold competitor (lane 1), with itself (lanes 2–5), or with the human HS2/NF-E2 probe.

To test whether the NF-E2-like motifs identified in HS5A and HS5B interact with proteins related to NF-E2, we performed gel retardation assays. A 20-bp probe containing the NF-E2 binding site of human HS2 was incubated with nuclear extracts from MEL cells. A band was detected (Fig. 4, lane 1) that could be competed out by increasing amounts of the cold probe (Fig. 4, lanes 2–4), indicating that it was specific. To assess the relationship between NF-E2 and the NF-E2-like motifs of murine HS5A and HS5B, two probes were used as cold competitors: probe A (20 bp), spanning from bp 3588 to 3607 of HS5A and containing three mismatches with the NF-E2 consensus sequence and one mismatch with the TRE consensus sequence; and probe B (20 bp), spanning from 1792 to 1811 of the HS5B sequence and containing two mismatches with the NF-E2 consensus sequence and a perfectly matched TRE motif. Results are shown in Fig. 4A. The relative amounts of the competitors in lanes 5–7 (probe A) and lanes 8–10 (probe B) are the same as that of the cold NF-E2 probe in lanes 2–4. With increasing amounts of competitor, the NF-E2 band is competed out both by probe A and by probe B. Although probe B contains a perfectly matched TRE motif, it competed with NF-E2 less efficiently than probe A did. Probes A and B were also tested using binding assays. Probe A produced a single retarded band that migrated at the same position as, but with less intensity than, the NF-E2 probe (Fig. 4B). The complex could be competed out by itself (Fig. 4B, lanes 2–4) or by the cold NF-E2 probe (Fig. 4B, lanes 5–7). Probe B yielded multiple bands (Fig. 4C). The most retarded band migrated at the same position as the NF-E2 band and could be competed out by the cold NF-E2 probe (Fig. 4C, lanes 5–7). These results suggest that both the TRE sequences of HS5A and those of HS5B bind NF-E2 or NF-E2-like factors.

Other Putative Motifs

To detect other putative motifs for transcriptional factors, we performed phylogenetic footprinting analysis (Tagle et al., 1988). Phylogenetic footprints (defined as sequence blocks that show 100% conservation over a region of ≥6 contiguous base pairs in alignments of the three orthologous sequences of our study) were identified using a computer program described by Hardison et al. (1998). These phylogenetic footprints are shown in Figs. 3A and 3B as shaded areas. Of 14 phylogenetic footprints fitting these criteria, 11 did not match any known transcription motifs using the TRANSFAC program in the GenomeNet Server. Of the remaining 3, the longest footprint was ACCACTAGAGGG, starting at bp 1845 of the mouse sequence. It contains an AML-1 motif (TGTGGT) (Meyers et al., 1993). AML-1 encodes a nuclear transcription factor that shows homology in its 5′ part with the Drosophila melanogaster segmentation gene, runt. A heat shock factor motif (AGAAN, Fernandes et al., 1994) was found in the footprint AGCCTTCT starting at bp 3863 in the mouse sequence. However, a typical heat shock protein binding site is composed of multiple, contiguous units of this 5-bp sequence arranged in alternating orientation (i.e., NGAANNTTCNNGAAN), and at least two adjacent 5-bp units are required for stable binding in vitro. A footprint starting at bp 2422 in the mouse sequence matched the CdxA motif (WWTWMTR, Margalit et al., 1993). CdxA is a member of the caudal family of the homeobox genes. The functional significance, if any, of these putative motifs is unclear.
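The footprint definition given above (blocks of at least 6 contiguous columns that are identical in all three aligned sequences) lends itself to a short sketch. This is not the program of Hardison et al. (1998); the function name `phylogenetic_footprints` and the toy alignment are hypothetical, and gaps are assumed to be written as "-".

```python
def phylogenetic_footprints(aligned, min_len=6):
    """Find runs of fully conserved, ungapped alignment columns.

    aligned is a list of equal-length aligned sequences with '-' for gaps.
    Returns (start, end, block) tuples in 0-based alignment coordinates,
    with end exclusive, for runs of at least min_len columns.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    footprints = []
    run_start = None
    # Iterate one column past the end so a run reaching the end is closed.
    for col in range(length + 1):
        conserved = (
            col < length
            and aligned[0][col] != "-"
            and all(seq[col] == aligned[0][col] for seq in aligned)
        )
        if conserved:
            if run_start is None:
                run_start = col
        elif run_start is not None:
            if col - run_start >= min_len:
                footprints.append((run_start, col, aligned[0][run_start:col]))
            run_start = None
    return footprints


# Toy three-way alignment (not taken from Fig. 3).
toy = [
    "ACGTTGACTCATAA",
    "ACCTTGACTCATGA",
    "AC-TTGACTCATTA",
]
print(phylogenetic_footprints(toy))
```

Coordinates returned here are alignment columns; mapping them back to positions in the mouse sequence would require subtracting the gaps that precede each block.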

Mapping of DNase I Hypersensitive Sites

Nuclei were isolated from uninduced MEL cells and subjected to DNase I treatment. DNA was purified from the treated nuclei and digested with KpnI to completion. Southern hybridizations were performed using a radioactively labeled KpnI/EcoRI DNA fragment (nucleotides 1–511 of the mouse HS5 sequence, GenBank Accession No. AF037169). Figure 5A shows that six subbands are present in addition to the parent band. The 10- and 5.5-kb subbands correspond to mouse HS3 and HS4. The other four subbands are 1.8, 2.2, 3.1, and 3.6 kb, and they are designated, respectively, HS5B major, HS5B minor, HS5A major, and HS5A minor. Of these, HS5A major (the 3.1-kb subband) is the strongest and corresponds to HS4.2 of Bender et al. (1998). HS5B minor (the 2.2-kb subband) is the weakest, and it is visible only on an overexposed autoradiograph. HS5B corresponds to HS5 of Bender et al. (1998). The positions of HS5B minor and HS5A major coincide with the GATA-1 sites. The position of HS5B major colocalizes with the Ap1/NF-E2 or the AML-1 site of HS5B, while the position of HS5A minor colocalizes with the Ap1/NF-E2 site of HS5A.

To test whether these HSs are also present in primary erythroid cells, we performed DNase I hypersensitivity assays in cell suspensions of day 12 mouse fetal liver. At this stage of mouse development, the fetal liver is an erythropoietic organ, and over 90% of the nucleated cells are erythroblasts or erythroid progenitors. As shown in Fig. 5B, in addition to the 10- and 5.5-kb subbands (HS3 and HS4), four subbands are detected in the HS5 region at the same positions as in the MEL cells. Therefore, multiple DNase I hypersensitive sites are formed in the HS5 region of primary murine erythroid cells as well as in the cells of an established murine erythroleukemia cell line.

FIG. 5. DNase I hypersensitivity mapping of HS5 in (A) MEL cells and (B) mouse fetal liver cells. The DNase I concentrations increase from left to right as indicated by solid wedges on the top. The 1-kb molecular mass marker shown in the leftmost lane was run along with the samples, but exposed for a shorter period of time. The designation of HSs identified by the subbands is indicated in the right margin of the figure.

Cell and Tissue Specificity

HS5 has been considered to be constitutive because it was detected in all human cell lines that had been tested, including erythroid, lymphoid, fibroblastic, hepatoma, epithelioid carcinoma, and multipotent embryonal carcinoma cells (Dhar et al., 1990). This interpretation was challenged by Zafarana et al. (1994), who suggested that HS5 is erythroid specific because it is present in the fetal liver, but not in the adult thymus, of transgenic mice carrying a 70-kb human β locus. To clarify the cell and tissue specificity of HS5, we performed DNase I hypersensitivity assays in various murine tissues and in established murine cell lines. Nuclei were prepared from E427 (lymphocytic) and NIH3T3 (fibroblastic) cell lines. HS5A major and HS5B major were easily detectable in both lines; HS5A minor was present in the E427 line but absent in 3T3 cells (Figs. 6A and 6B). The 10- and 5.5-kb subbands of the erythroid-specific HS3 and HS4 were absent in both NIH3T3 and E427 cells.

To test tissue specificity, nuclei were isolated from adult mouse brain, kidney, liver, spleen, and thymus, and DNase I hypersensitivity assays were performed. None of the subbands corresponding to HS5 (or HS3 and HS4) was present in the adult kidney and liver (which in the adult stage of mouse development is composed of only hepatic tissue). HS5A major and HS5B major could be detected in the spleen, in the brain, and, with less intensity, in the thymus (Figs. 6C–6E). The 10- and 5.5-kb subbands (HS3 and HS4) were also present in the spleen, whereas they were absent in the brain and thymus. In conclusion, all the subbands of HS5 can be detected only in the cells of the erythroid lineage (fetal liver). However, HS5 cannot be considered to be exclusively erythroid, because the major subbands of HS5A and HS5B are also present in other hemopoietic tissues (spleen and thymus). These major subbands are not ubiquitous, because, of the three adult nonhemopoietic tissues tested (brain, adult liver, and kidney), they were absent in two. Hence, in contrast to the strictly erythroid-specific formation of HS1 to HS4, HS5 is not an erythroid-specific site because it appears in the brain; HS5 is not ubiquitous because it cannot be detected in the adult liver and kidney.

DISCUSSION

FIG. 6. DNase I hypersensitivity analysis in cell lines and adult murine tissues. DNase I partially digested MEL DNA was run along with the samples in each assay and served as a reference, as shown in the rightmost lane. (A) E427 cells. (B) NIH3T3 cells. (C) Murine spleen. (D) Murine thymus. (E) Murine brain.

It has been well established that the β-globin LCR is highly conserved in mammals (Li et al., 1990; Moon and Ley, 1990; Jimenez et al., 1992; Hardison et al., 1997). The most conserved regions of HS1, 2, 3, and 4 are confined to core sequences composed of 200–300 nucleotides. The cores coincide with the DNase I hypersensitive sites, and they are functional equivalents of the HSs (reviewed by Grosveld et al., 1993). The goal of the work described in this paper was to sequence mouse HS5 and identify regions of potential functional significance. Sequence comparisons indicated that, like the other HSs of the human β-globin locus LCR, HS5 is highly conserved in mouse, human, and galago. HS5 consists of two conserved subregions, HS5A and HS5B, occurring around 1.8 and 3.6 kb, respectively, upstream of HS4. HS5A and HS5B are separated by a 1.8-kb nonhomologous sequence, and therefore it is unclear whether they represent one or two functional units.

DNase I hypersensitivity mapping revealed that two DNase I hypersensitive sites are formed in each conserved subregion of HS5. The positions of each of these HSs coincide with either a GATA-1 or an Ap1/NF-E2 motif, suggesting that the binding of GATA-1 or Ap1/NF-E2-related factors may be implicated in the formation of HS5. It has been previously shown that binding of NF-E2 and GATA-1 is required for formation of human HS4 (Stamatoyannopoulos et al., 1995). Furthermore, binding of NF-E2 alone is sufficient to restore DNase I hypersensitive site 2 in in vitro reconstituted chromatin (Armstrong and Emerson, 1996; Gong et al., 1996). The colocalization of the four DNase I hypersensitive sites composing HS5 with either GATA-1 or NF-E2 motifs provides indirect evidence that HS5 is organized in a manner similar to that of the other DNase I hypersensitive sites of the LCR. It can be argued that since both the GATA-1 and the Ap1/NF-E2 recognition sequences are short and are present thousands of times in the genome, they can be localized in a DNA segment just by chance. The GATA-1 and Ap1/NF-E2 motifs of HS5, however, are found within highly conserved regions, making it unlikely that their colocalization is due to chance.

An understanding of the structure–function relationship of the LCR has been attempted with studies in transgenic mice that carry human globin gene constructs with deletions of LCR elements.
Deletions of 1.9 or 2.3 kb including the core elements and flanking sequences of HS2 or HS3 in the context of a β locus YAC produced small effects on globin gene expression in transgenic mice (Peterson et al., 1996). Similarly, deletions of the endogenous HS2 or HS3 of the murine β locus LCR had only a minor effect on globin gene expression in cis (Fiering et al., 1995; Hug et al., 1996). These results should be contrasted with the phenotypes of transgenic mice carrying deletions of the core elements of HSs. Deletion of 375 bp of the core region of HS2 results in severe depression of expression of all the β-globin genes at every developmental stage (Bungert et al., 1999). When 234 bp containing the core of HS3 were deleted, distinct abnormalities in the developmental expression of globin genes were produced (Navas et al., 1998). In the embryonic yolk sac cells of these mice with an HS3 core deletion, there was a total absence of expression of the human ε-globin gene but normal expression of the γ genes. When, however, erythropoiesis switched from the yolk sac to the fetal liver, γ gene expression was totally silenced. The latter finding should be contrasted with the abundant expression of the γ genes at the beginning of fetal liver erythropoiesis in mice carrying the wild-type β locus YACs (Peterson et al., 1996). In addition, in these mice with HS3 core deletions, the expression of the human adult globin gene became position dependent (Navas et al., 1998). The differing phenotypes obtained in transgenic mice carrying either large deletions of HS sequences or deletions of only the HS core elements indicate that the two approaches yield different insights into the function of the LCR. Mutations of core elements may provide more direct information about HS function as well as the role of the core elements in the formation of the LCR holocomplex (Bungert et al., 1995, 1999; Peterson et al., 1996; Navas et al., 1998). Recently, Bender et al. (1998) reported the sequence of murine HS5 and presented evidence that a deletion of the endogenous murine HS5 has no effect on the function of the genes of the β locus.
This result is consistent with the larger deletions of HS2 and HS3, none of which produced a significant phenotype (Peterson et al., 1996; Fiering et al., 1995; Hug et al., 1996). As in the case of HS2, HS3, and HS4, insights about the function of HS5 could be obtained by studying the phenotypes of transgenic mice carrying deletions of the HS5 core elements. The main reason we did the study described here was to define whether HS5 contains elements that could be considered equivalent or similar to the core elements of the other HSs of the LCR. If they exist, such elements could be the targets of mutagenesis of a β locus YAC for the purpose of producing deletions and studying their functional effects in transgenic mice. Such “core” elements were provisionally identified in the regions of HS5A and HS5B containing the GATA-1 and NF-E2-like motifs. The possible functional significance of the so-identified core elements will await the results of experiments in β locus YAC transgenic mice.

ACKNOWLEDGMENTS

We thank Drs. Tariq Enver and Ming Hu for the mouse HS4 probe, Dr. Christopher H. Lowrey for providing us with his protocol for the DNase I hypersensitivity assay, and Alex Rhode for mouse tissue samples. This work was supported by grants from the National Institutes of Health (DK45365, HL20889, and HL53750).

REFERENCES

Andrews, N. C., and Faller, D. V. (1991). A rapid micropreparation technique for extraction of DNA-binding proteins from limiting numbers of mammalian cells. Nucleic Acids Res. 19: 2499.
Andrews, N. C., Erdjument-Bromage, H., Davidson, M. B., Tempst, P., and Orkin, S. H. (1993). Erythroid transcription factor NF-E2 is a hematopoietic-specific basic-leucine zipper protein. Nature 362: 722–728.
Armstrong, J. A., and Emerson, B. M. (1996). NF-E2 disrupts chromatin structure at human β-globin locus control region hypersensitive site 2 in vitro. Mol. Cell. Biol. 16: 5634–5644.
Bender, M. A., Reik, A., Close, J., Telling, A., Epner, E., Fiering, S., Hardison, R., and Groudine, M. (1998). Description and targeted deletion of 5′ hypersensitive site 5 and 6 of the mouse β-globin locus control region. Blood 92: 4394–4403.
Bungert, J., Dave, U., Lim, K-C., Lieuw, K. H., Shavit, J. A., Liu, Q., and Engel, J. D. (1995). Synergistic regulation of human β-globin gene switching by locus control region elements HS3 and HS4. Genes Dev. 9: 3083–3096.
Bungert, J., Tanimoto, K., Patel, S., Liu, Q., Fear, M., and Engel, J. D. (1999). Hypersensitive site 2 specifies a unique function within the human β-globin locus control region to stimulate globin gene transcription. Mol. Cell. Biol. 19: 3062–3072.
Caterina, J. J., Ryan, T. M., Pawlik, K. M., Palmiter, R. D., Brinster, R. L., Behringer, R. R., and Townes, T. M. (1991). Human β-globin locus control region: Analysis of the 5′ DNase I hypersensitive site HS2 in transgenic mice. Proc. Natl. Acad. Sci. USA 88: 1626–1630.
Chung, J. H., Whiteley, M., and Felsenfeld, G. (1993). A 5′ element of the chicken β-globin domain serves as an insulator in human erythroid cells and protects against position effect in Drosophila. Cell 74: 505–514.
Collis, P., Antoniou, M., and Grosveld, F. (1990). Definition of the minimal requirements within the human β-globin gene and the dominant control region for high level expression. EMBO J. 9: 233–240.
Dhar, V., Nandi, A., Schildkraut, C. L., and Skoutchi, A. (1990). Erythroid-specific nuclease-hypersensitive sites flanking the human β-globin domain. Mol. Cell. Biol. 10: 4324–4333.
Engel, J. D. (1993). Developmental regulation of human β-globin gene transcription: A switch of loyalties? Trends Genet. 9: 304–309.
Felsenfeld, G. (1992). Chromatin as an essential part of the transcriptional mechanism. Nature 355: 219–224.

Fernandes, M., Xiao, H., and Lis, J. T. (1994). Fine structure analysis of the Drosophila and Saccharomyces heat shock factor–heat shock element interaction. Nucleic Acids Res. 22: 167–173.
Fiering, S., Epner, E., Robinson, K., Zhuang, Y., Telling, A., Hu, M., Martin, D. I. K., Enver, T., Ley, T. J., and Groudine, M. (1995). Targeted deletion of 5′HS2 of the murine β-globin LCR reveals that it is not essential for proper regulation of the β-globin locus. Genes Dev. 9: 2203–2213.
Forrester, W. C., Thompson, C., Elder, J. T., and Groudine, M. (1986). A developmentally stable chromatin structure in the human β-globin gene cluster. Proc. Natl. Acad. Sci. USA 83: 1359–1363.
Forrester, W. C., Novak, U., Gelinas, R., and Groudine, M. (1989). Molecular analysis of the human β-globin locus activation region. Proc. Natl. Acad. Sci. USA 86: 5439–5443.
Fraser, P., Hurst, J., Collis, P., and Grosveld, F. (1990). DNase I hypersensitive sites 1, 2 and 3 of the human β-globin dominant control region direct position-independent expression. Nucleic Acids Res. 18: 3503–3508.
Fraser, P., Pruzina, S., Antoniou, M., and Grosveld, F. (1993). Each hypersensitive site of the human β-globin locus control region confers a different developmental pattern of expression on the globin genes. Genes Dev. 7: 106–113.
Gong, Q. H., McDowell, J. C., and Dean, A. (1996). Essential role of NF-E2 in remodeling of chromatin structure and transcriptional activation of the epsilon-globin gene in vivo by 5′ hypersensitive site 2 of the beta-globin locus control region. Mol. Cell. Biol. 16: 6055–6064.
Grosveld, F., van Assendelft, G. B., Greaves, D. R., and Kollias, G. (1987). Position-independent, high-level expression of the human β-globin gene in transgenic mice. Cell 51: 975–985.
Grosveld, F., Dillon, N., and Higgs, D. (1993). The regulation of human globin gene expression. Bailieres Clin. Haematol. 6: 31–55.
Hardison, R. C., Slightom, J. L., Gumucio, D. L., Goodman, M., Stojanovic, N., and Miller, W. (1997). Locus control regions of mammalian β-globin gene clusters: Combining phylogenetic analyses and experimental results to gain functional insights. Gene 205: 73–94.
Hardison, R., Riemer, C., Chui, D. H. K., Huisman, T. H. J., and Miller, W. (1998). Electronic access to sequence alignments, experimental results, and human mutations as an aid to studying globin gene regulation. Genomics 47: 429–437.
Hug, B. A., Wesselschmidt, R. L., Fiering, S., Bender, M. A., Epner, E., Groudine, M., and Ley, T. J. (1996). Analysis of mice containing a targeted deletion of β-globin locus control region 5′ hypersensitive site 3. Mol. Cell. Biol. 16: 2906–2912.
Ikuta, T., Papayannopoulou, T., Stamatoyannopoulos, G., and Kan, Y. W. (1996). Globin gene switching: In vivo protein–DNA interactions of the human β-globin locus in erythroid cells expressing the fetal or the adult globin gene program. J. Biol. Chem. 271: 14082–14091.
Jimenez, G., Gale, K. B., and Enver, T. (1992). The mouse β-globin locus control region: Hypersensitive sites 3 and 4. Nucleic Acids Res. 20: 5797–5803.
Li, Q., Zhou, B., Powers, P., Enver, T., and Stamatoyannopoulos, G. (1990). β-globin locus activation regions: Conservation of organization, structure, and function. Proc. Natl. Acad. Sci. USA 87: 8207–8211.
Li, Q., and Stamatoyannopoulos, G. (1994a). Hypersensitive site 5 of the human β locus control region functions as a chromatin insulator. Blood 84: 1399–1401.
Li, Q., and Stamatoyannopoulos, J. A. (1994b). Position independence and proper developmental control of γ-globin gene expression require both a 5′ locus control region and a downstream sequence element. Mol. Cell. Biol. 14: 6087–6096.
Liu, D., Chang, J. C., Moi, P., Liu, W., Kan, Y. W., and Curtin, P. T. (1992). Dissection of the enhancer activity of β-globin 5′ DNase I-hypersensitive site 2 in transgenic mice. Proc. Natl. Acad. Sci. USA 89: 3899–3903.
Long, Q., Bengra, C., Li, C., Kutlar, F., and Tuan, D. (1998). A long terminal repeat of the human endogenous retrovirus ERV-9 is located in the 5′ boundary area of the human β-globin locus control region. Genomics 54: 542–555.
Margalit, Y., Yarus, S., Shapira, E., Gruenbaum, Y., and Fainsod, A. (1993). Isolation and characterization of target sequences of the chicken CdxA homeobox gene. Nucleic Acids Res. 21: 4915–4922.
Martin, D. I., Fiering, S., and Groudine, M. (1996). Regulation of β-globin gene expression: Straightening out the locus. Curr. Opin. Genet. Dev. 6: 488–495.
Meyers, S., Downing, J. R., and Hiebert, S. W. (1993). Identification of AML-1 and the (8;21) translocation protein (AML-1/ETO) as sequence-specific DNA-binding protein: The runt homology domain is required for DNA binding and protein–protein interaction. Mol. Cell. Biol. 13: 6336–6345.
Milot, E., Strouboulis, J., Trimborn, T., Wijgerde, M., de Boer, E., Langeveld, A., Tan-Un, K., Vergeer, W., Yannoutsos, N., Grosveld, F., and Fraser, P. (1996). Heterochromatin effects on the frequency and duration of LCR-mediated gene transcription. Cell 87: 105–114.
Moon, A. M., and Ley, T. J. (1990). Conservation of the primary structure, organization, and function of the human and mouse β-globin locus-activating region. Proc. Natl. Acad. Sci. USA 87: 7693–7697.
Navas, P. A., Peterson, K. R., Li, Q., Skarpidi, E., Rohde, A., Shaw, S. E., Clegg, C. H., Asano, H., and Stamatoyannopoulos, G. (1998). Transgenic mice with HS3 core deletion: Direct evidence for developmental specificity of the interaction between the LCR and the embryonic or fetal globin genes. Mol. Cell. Biol. 18: 4188–4196.
Orkin, S. H. (1990). Globin gene regulation and switching: Circa 1990. Cell 63: 665–672.
Orkin, S. H. (1995). Regulation of globin gene expression in erythroid cells. Eur. J. Biochem. 231: 271–281.
Peterson, K. R., Clegg, C. H., Navas, P. A., Norton, E. J., Kimbrough, T. G., and Stamatoyannopoulos, G. (1996). Effect of deletion of 5′HS3 or 5′HS2 of the human β-globin locus control region on the developmental regulation of globin gene expression in β-globin locus yeast artificial chromosome transgenic mice. Proc. Natl. Acad. Sci. USA 93: 6605–6609.
Philipsen, S., Talbot, D., Fraser, P., and Grosveld, F. (1990). The β-globin dominant control region: Hypersensitive site 2. EMBO J. 9: 2159–2167.
Pruzina, S., Hanscombe, O., Whyatt, D., Grosveld, F., and Philipsen, S. (1991). Hypersensitive site 4 of the human β-globin locus control region. Nucleic Acids Res. 19: 1413–1419.
Reddy, P. M. S., and Shen, C.-K. J. (1991). Protein–DNA interaction in vivo of an erythroid-specific, human β-globin locus enhancer. Proc. Natl. Acad. Sci. USA 88: 8676–8680.
Ryan, T. M., Behringer, R. R., Martin, N. C., Townes, T. M., Palmiter, R. D., and Brinster, R. L. (1989). A single erythroid-specific DNase I super-hypersensitive site activates high levels of human β-globin gene expression in transgenic mice. Genes Dev. 3: 314–323.
Slightom, J. L., Bock, J. H., Tagle, D. A., Gumucio, D. L., Goodman, M., Stojanovic, N., Jackson, J., Miller, W., and Hardison, R. (1997). The complete sequences of the galago and rabbit β-globin locus control regions: Extended sequence and functional conservation outside the cores of DNase hypersensitive sites. Genomics 39: 90–94.
Stamatoyannopoulos, G., and Nienhuis, A. W. (1994). Hemoglobin switching. In “The Molecular Basis of Blood Diseases” (G. Stamatoyannopoulos, A. W. Nienhuis, P. Majerus, and H. Varmus, Eds.), pp. 107–155, Saunders, Philadelphia, PA.
Stamatoyannopoulos, J. A., Goodwin, A., Joyce, T., and Lowrey, C. H. (1995). NF-E2 and GATA binding motifs are required for the formation of DNase I hypersensitive site 4 of the human β-globin locus control region. EMBO J. 14: 106–116.
Strauss, E. C., and Orkin, S. H. (1992). In vivo protein–DNA interaction at hypersensitive site 3 of the human β-globin locus control region. Proc. Natl. Acad. Sci. USA 89: 5809–5813.
Tagle, D. A., Koop, B. F., Goodman, M., Slightom, J. L., Hess, D., and Jones, R. T. (1988). Embryonic ε and γ globin genes of a prosimian primate (Galago crassicaudatis): Nucleotide and amino acid sequence, developmental regulation, and phylogenetic footprints. J. Mol. Biol. 203: 439–455.
Talbot, D., Collis, P., Antoniou, M., Vidal, M., Grosveld, F., and Greaves, D. R. (1989). A dominant control region from the human β-globin locus conferring integration site-independent gene expression. Nature 338: 352–355.
Talbot, D., Philipsen, S., Fraser, P., and Grosveld, F. (1990). Detailed analysis of the site 3 region of the human β-globin dominant control region. EMBO J. 9: 2169–2178.
Townes, T. M., and Behringer, R. R. (1990). Human globin locus activation region (LAR): Role in temporal control. Trends Genet. 6: 219–223.
Tuan, D., Solomon, W., Li, Q., and London, I. M. (1985). The β-like globin gene domain in human erythroid cells. Proc. Natl. Acad. Sci. USA 82: 6384–6388.
Wijgerde, M., Grosveld, F., and Fraser, P. (1995). Transcription complex stability and chromatin dynamics in vivo. Nature 377: 209–213.
Wood, W. G. (1996). The complexities of β-globin gene regulation. Trends Genet. 12: 204–206.
Yu, J., Bock, J. H., Slightom, J. L., and Villeponteau, B. (1994). A 5′ β-globin matrix-attachment region and the polyoma enhancer together confer position-independent transcription. Gene 139: 139–154.
Zafarana, G., Raguz, S., Pruzina, S., Grosveld, F., and Meijer, D. (1994). The regulation of human β-globin gene expression: The analysis of hypersensitive site 5 (HS5) in the LCR. In “Hemoglobin Switching” (G. Stamatoyannopoulos, Ed.), pp. 39–44, Intercept Ltd., Hampshire, UK.