Analysis of the GATA-1 Gene Promoter and Globin Locus Control Region Elements by in Vivo Footprinting


Erich C. Strauss and Stuart H. Orkin

Abstract
I. Introduction
II. Characterization of the GATA-1 Promoter
III. Guanine/Adenine-LMPCR In Vivo Footprinting
IV. Analysis of Human Locus Control Region Elements
    A. The α-Globin LCR Element
    B. The β-Globin LCR Subregion HS-3
V. Conclusions
Acknowledgments
References

Advances in Molecular and Cell Biology, Volume 21, pages 135-158. Copyright © 1997 by JAI Press Inc. All rights of reproduction in any form reserved. ISBN 0-7623-0145-7


ABSTRACT

Identification of relevant DNA regulatory elements involved in transcriptional control is essential to determining which proteins establish and maintain cell-specific gene expression. As an approach to understanding the mechanisms of gene regulation, in vivo footprinting reveals protein-DNA interactions as they actually occur in situ. In this chapter, we discuss the application of in vivo footprinting to complement functional and in vitro studies of the GATA-1 gene promoter and α- and β-globin locus control regions in erythroid cells. In addition, we describe a modification of the in vivo footprinting technique that expands the analysis of DNA contacts to include adenine residues.

I. INTRODUCTION

Hematopoietic cells provide an attractive biological system for the study of molecular mechanisms that control lineage- and development-specific gene expression. To give rise to the various hematopoietic lineages, common pluripotent stem cells express distinct sets of genes during cellular commitment and maturation. Activation of lineage-specific programs of gene expression in hematopoietic cells is presumed to be mediated by the interaction of cell-specific and ubiquitous transcriptional factors with their cognate cis-elements. The identification of regulatory motifs and the characterization of DNA-binding proteins that recognize these sequences provide a basis for understanding the mechanisms involved in cell-specific gene expression.

Of the various methods available for the analysis of transcriptional control elements, only in vivo footprinting permits the detection of protein-DNA interactions as they exist in living cells. In vivo footprinting studies discriminate active from nonfunctional consensus binding sites, demonstrate cell-specific binding of ubiquitous factors, reveal changing profiles of protein occupancy at overlapping regulatory sequences, and identify active chromatin structures. For these reasons, we have used ligation-mediated polymerase chain reaction (LMPCR) in vivo footprinting (Mueller and Wold, 1989) in our studies of cell-specific gene expression in the erythroid lineage of hematopoietic cells.

In this chapter we discuss the use of in vivo footprinting to complement functional studies in the identification of regulatory elements in the promoter of the erythroid transcriptional factor GATA-1 (Tsai et al., 1991). In addition, we describe a modification to the dimethyl sulfate (DMS)-based LMPCR in vivo footprinting procedure that extends the analytical potential of this technique (Strauss et al., 1992). Finally, we present the results of our in vivo footprinting analysis of the distant, upstream regulatory regions of the human α- and β-globin gene clusters (Strauss et al., 1992; Strauss and Orkin, 1992). These sequence elements, referred to as locus control regions (LCRs), are critical for the expression of globin genes in developing erythroid cells. In these studies, we define active motifs within the α- and β-globin LCRs and suggest a temporal relationship between subregions of the human β-LCR element in response to induced maturation of erythroid cells.

II. CHARACTERIZATION OF THE GATA-1 PROMOTER

In erythroid cells, lineage-specific gene expression appears to be mediated, in part, by the transcriptional factor GATA-1. GATA-1 recognizes a consensus sequence, (T/A)GATA(A/G), that is present in the promoters and enhancers of all characterized erythroid-specific genes (Orkin, 1990). Expression of GATA-1 is restricted at the transcriptional level to erythroid progenitors and two additional hematopoietic lineages: megakaryocytes and bone marrow-derived mast cells (Evans and Felsenfeld, 1989; Tsai et al., 1989; Martin et al., 1990; Romeo et al., 1990). These lineages are presumed to be derived from a common progenitor. The functional significance of GATA motifs has been revealed in mutagenesis studies of regulatory regions in erythroid-expressed genes (Evans et al., 1988; Reitman and Felsenfeld, 1988; Martin et al., 1989; Mignotte et al., 1989; Plumb et al., 1989; Watt et al., 1990). Through gene targeting in embryonic stem (ES) cells, GATA-1 has been shown to be essential for normal erythroid development, as GATA-1-deficient ES cells fail to contribute to erythropoiesis in chimeric animals (Pevny et al., 1991). GATA-1 is present during all stages of erythroid development, and the cellular content of GATA-1 transcripts and protein increases during erythroid maturation. Thus expression of GATA-1 is regulated with regard to both cell-type specificity and stage of development.

As an approach to understanding how these aspects of GATA-1 expression are accomplished, we have isolated and characterized the mouse GATA-1 gene and initiated analysis of the mechanisms that restrict its expression to erythroid cells. The structure of the mouse GATA-1 gene was determined by analysis of two overlapping bacteriophage clones (Tsai et al., 1991). The GATA-1 gene consists of six exons distributed over a region of 8 kb (Tsai et al., 1991). Exon I is noncoding; exon II contains the initiation codon for the mature protein.
The two homologous zinc-finger domains of the GATA-1 protein are encoded separately in exons IV and V. Expression of a marked, intact GATA-1 gene was examined in mouse erythroleukemia (MEL) cells. These initial experiments with stable transfectants indicated that 7.5 kb of 5' sequence and 1 kb of 3' sequence are sufficient to direct expression of the transgene. Additional studies in human erythroleukemia cells suggest that similar transgene expression is obtained with the wild-type GATA-1 gene and 2.7 kb of the upstream sequence.

The study was next directed to the identification and analysis of the GATA-1 gene promoter. Cloning of GATA-1 cDNAs from various libraries demonstrated 5'-end heterogeneity of RNA transcripts. S1 nuclease mapping established the presence of the putative promoter region immediately upstream of the 5' termini of the heterogeneous GATA-1 transcripts (Tsai et al., 1991). DNA sequence analysis of the putative promoter region of the GATA-1 gene, numbered with the last nucleotide of exon I as -1, is shown in Figure 1. There are several prominent features of the promoter sequence. First, the region corresponding to the heterogeneous 5' ends of GATA-1 transcripts contains highly GA-rich sequences (positions -163 to -93) that include multiple, simple repeats. Second, consistent with the lack of a discrete 5' end for GATA-1 transcripts, no TATA-like motifs are present in the putative promoter region. Third, two CACCC boxes are located upstream of the purine-rich region. These elements are frequently associated with erythroid-expressed genes (deBoer et al., 1988; Antoniou and Grosveld, 1990; Frampton et al., 1990; Watt et al., 1990). Fourth, an atypical, double GATA consensus binding site (positions -687 to -673) resides approximately 450 bp upstream of the distal CACCC box; the GATA core sequences of this double element are in opposed orientations. Finally, a nonconsensus GATA site (positions -277 to -272) is located approximately 60 bp upstream of the distal CACCC box.

Figure 1. DNA sequence of the GATA-1 promoter. The sequence is numbered from the last nucleotide of exon I as position -1. Sequence motifs discussed in the text are indicated.

-873 TTTGTGATCT TATCCCAATC CTCTGGACTC CCAGGGGAGT CCACTCTGGG
-823 TGTCACCTCA GTTTCCCGCC TCTAACGTAG TATGGCGGGC AAGAAGTTGA
-773 GGCACCGTCC CTGTGCATCC CCTACCCTGC CCCCCAGCCC CAAGACAGCC
-723 [illegible]CTGCG GCACCAACAG CCACAGTCGA GTCCAT[illegible]GA TAAGACTTAT   (myb consensus and double GATA marked on this line)
-673 [illegible]CTGCCCC AGAGCAGGCC AGAGCTGGCG TAAGCCCCAG GCACGAGCCG
-623 AAGCACTAAA GAAGTGTATG TACCCTTACC CACTAGTCCT GGCCTAGTAC
-573 CCCAGACTGC TTCATAGAGG TGCCTGCAGC CTCTGCTTGA AATGCTCCCA
-523 AAACTCTGAG CCTCATTCTT CTCACCTGGA AATGGGTACA GCTATATCCC
-473 CCTTTCTCCC AGCATTCAGG AGGGCTCACG CGCATACAGG TCCAACCCAC
-423 ACATAGCCTG GTACACAGTA GGGCTTTCCT CACTGAAAGA AACTAGTAGT
-373 AAAACATGAA ACTTAGATCT TGACTAATTG CTCATATGAC TTGACTGGAC
-323 ACTGGACTCC ACAGAAGCAA AGGCAAAGGG GATCCAACAA CCTGCAG[illegible]
-273 [illegible]CAGGAAG GGCGGAGGGA CTAGAGCCTA AAAGGTCCTC CACAAGGAGG
-223 CGCCTCCCCTGC ACTGCC-GGC ACCAGCCACT   (line partially illegible; two CACCC boxes marked on this line)
-173 CCCTGGGGAG GAAAGAGGAG GGAGAAGGTG AGTGGGAGGG AGGGAGGGCG
-123 GGCGGGCTGG CAGGAGGGAG AGAAGGGAGA CTCAGAGGCC AAGGCCAGTG
 -73 AGGACTCCCT TGGGATCACC CTGAACTCGT CATACCACTA AGGTGGCTGA
 -23 ATCCTCTGCA TCAACAAGCC CAG GTCAGT CTTGATTCCC A[illegible]AAAACCC   (position -1 marked on this line)

To examine the potential for protein binding to the putative promoter, in vitro assays were used with erythroid and nonerythroid nuclear protein extracts (Tsai et al., 1991). The upstream double GATA element was protected in DNase I footprinting experiments using MEL cell extract and Escherichia coli-expressed GATA-1, but no footprints were detected with nonerythroid (HeLa cell) extract. Moreover, gel-shift and methylation interference studies suggested an asymmetric binding of a single GATA-1 molecule to the 5' GATA motif (Tsai et al., 1991). As anticipated by the presence of ubiquitous CACCC-binding proteins (Xiao et al., 1987; Schule et al., 1988; Philipsen et al., 1990; Talbot et al., 1990), the duplicated CACCC region binds proteins from both erythroid and nonerythroid extracts. The nonconsensus GATA site failed to bind GATA-1 protein in vitro.

To investigate GATA-1 promoter function, sequences spanning -874 to -20 were linked to the human growth hormone (GH) gene as a reporter in transient expression assays. The wild-type GATA-1/GH construct was shown to be preferentially active in an erythroid environment (Tsai et al., 1991), suggesting a role for the promoter region in the specificity of GATA-1 expression. Site-specific mutations of the double GATA and CACCC motifs were examined in transient transfection experiments to determine the relevance of these elements to GATA-1 promoter function. Isogenic constructions, with specific mutations, were studied to provide the most significant comparison of expression levels. As shown in Figure 2, mutation of the distal GATA motif (construct 5' (G-T)) and the combination mutation of the distal and proximal GATA core sequences (construct 5' (G-T)/3' (C-A)) reduce promoter activity to approximately 28% of the wild-type level.
Mutation of the proximal GATA core alone (construct 3' (C-A)) had a more modest effect, reducing promoter activity to 48% of the wild-type construct. Deletion of the double GATA elements produced results similar to constructs 5' (G-T) and 5' (G-T)/3' (C-A). Two different constructions, with mutations of the nonconsensus GATA sequence at positions -277 to -272, produced no distinguishable effect on promoter activity (data not shown). The introduction of clustered substitutions in both CACCC boxes (construct mCACCC) decreased promoter activity to 22.5% of the wild-type level. Finally, a deletion of the promoter to position -127 reduced promoter activity to 15%.

Figure 2. Analysis of GATA-1 promoter activity in MEL cells (figure title: Functional Elements of GATA-1 Gene Promoter). Construct designations are shown at left. Deleted regions are shown by a dashed line. The wild-type GATA-1/GH construct was active as a positive control. The schematic marks the double GATA element at -686 to -672 and the CACCC boxes at -218 and -192.

Construct                        Promoter activity
Wild-type                        100 ± 17.1
5' (G-T)                         27.4 ± 3.4
3' (C-A)                         48.0 ± 7.2
5' (G-T)/3' (C-A)                28.3 ± 5.1
Double GATA deletion             [illegible]
mCACCC                           [illegible]
Deletion to -127                 15.0 ± 5.2
HSV-TK promoter                  94 ± 7

The transient transfection studies with site-specific mutants implicate both the double GATA and CACCC elements in GATA-1 promoter function. These results suggest that full promoter activity requires the presence of the proximal CACCC box sequences and the upstream, double GATA motif that apparently binds a single GATA-1 molecule in an asymmetric fashion.

The results described above reveal the functional significance of the double GATA element and CACCC boxes in the context of in vitro binding experiments and transient transfection studies. However, given the inherent limitations of in vitro assays and the modest effects of mutation on transient promoter activity, these results do not necessarily establish the relevance of the double GATA element and CACCC boxes to the expressed, in situ gene. To examine in situ protein occupancy of these motifs, we used DMS-based LMPCR in vivo footprinting (Mueller and Wold, 1989). These experiments were performed with uninduced and dimethyl sulfoxide (DMSO)-induced MEL cells, which express abundant GATA-1, and in nonexpressing NIH-3T3 cells. As shown in Figure 3A (left), protections of G residues are present within and immediately downstream of the 5'-GATA core sequence in both uninduced and DMSO-treated MEL cells. However, no footprints were detected in the region of the 3'-GATA motif (Figure 3A, right). In addition, analysis of the CACCC box elements reveals footprints at both motifs (Figure 3B). In vivo footprints were identical in uninduced and DMSO-induced MEL cells, with one exception: at the 5' CACCC box, a single G residue is enhanced in uninduced and protected in induced MEL cells. In the nonexpressing NIH-3T3 cells, no in vivo footprints were observed in either the CACCC or double GATA regions.

The results from these in vivo footprinting experiments provide persuasive evidence for a critical role of the double GATA element and CACCC motifs in GATA-1 promoter function. Moreover, the in vivo studies parallel both in vitro protein binding and transient promoter results. On the basis of these complementary findings, we proposed that a positive feedback loop, mediated by GATA-1 in association with CACCC-binding proteins, serves to increase the expression of GATA-1 protein during erythroid maturation and maintain the differentiated state.
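The motif definitions used in this section, the GATA-1 consensus (T/A)GATA(A/G) and the CACCC box, lend themselves to a simple computational scan. The following is a minimal sketch, not the procedure used by the authors: it searches both strands of a sequence and reports hits in promoter coordinates of the kind used in Figure 1. The function names (`scan`, `reverse_complement`), the 26-nt test sequence, and the starting coordinate of -100 are all hypothetical and chosen only for illustration.

```python
import re

# Consensus motifs quoted in the text, written as regular expressions.
# The lookahead form (?=(...)) lets overlapping sites be reported.
MOTIFS = {
    "GATA": re.compile(r"(?=([TA]GATA[AG]))"),   # (T/A)GATA(A/G)
    "CACCC": re.compile(r"(?=(CACCC))"),
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def scan(seq, first_coord):
    """List motif hits on both strands as (motif, strand, start, end, site).

    first_coord is the promoter coordinate of seq[0]; start and end are
    top-strand coordinates. The sequence is assumed to lie entirely
    upstream of position -1, so plain addition is enough.
    """
    seq = seq.upper()
    n = len(seq)
    rc = reverse_complement(seq)
    hits = []
    for name, pattern in MOTIFS.items():
        for m in pattern.finditer(seq):
            site = m.group(1)
            start = m.start(1)
            hits.append((name, "+", first_coord + start,
                         first_coord + start + len(site) - 1, site))
        for m in pattern.finditer(rc):
            site = m.group(1)
            # Map the match on the reverse complement back to the top strand.
            start = n - m.start(1) - len(site)
            hits.append((name, "-", first_coord + start,
                         first_coord + start + len(site) - 1, site))
    return sorted(hits, key=lambda h: h[2])


# Hypothetical 26-nt sequence, NOT taken from Figure 1.
toy = "CCTGATAAGACTTATCTGCCACCCAG"
hits = scan(toy, first_coord=-100)
for hit in hits:
    print(hit)
```

In this made-up example the scan reports one GATA core on each strand, that is, two cores in opposed orientations, followed by a CACCC box. A sequence match of this kind only identifies a potential site; as the footprinting results above show, occupancy in the cell has to be established experimentally.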

Figure 3. In vivo DMS footprinting of the double GATA element and CACCC region. (A) DMS reactivity of the coding (left) and noncoding (right) strands of the double GATA element. (B) DMS reactivity of the noncoding strand of the CACCC region. (Lanes 1) In vitro methylated protein-free MEL DNA; (lanes 2) in vivo methylated NIH-3T3 DNA; (lanes 3) in vivo methylated MEL DNA (uninduced cells); (lanes 4) in vivo methylated MEL DNA (DMSO-induced cells). Protections (open circles) and enhancements (closed circles) of guanine residues are indicated. Summaries of altered DMS reactivities of guanines at the double GATA element and CACCC region in MEL DNA are displayed below.

III. GUANINE/ADENINE-LMPCR IN VIVO FOOTPRINTING

In vivo footprinting has generally been used in association with the alkylating agent dimethyl sulfate (DMS). DMS acts as a chemical probe that penetrates the nucleus of intact cells to methylate the N-7 position of guanines and the N-3 position of adenines in genomic DNA. Purine residues that interact with transcriptional factors in vivo display a pattern of either decreased or increased frequency of methylation relative to control, protein-free DNA. Since guanines reside in the DNA major groove, a frequent binding site for transcriptional factors, and methylation of adenine residues is less efficient than that of guanines, in vivo DMS footprinting studies of complex genomes have relied exclusively on the assessment of guanine reactivities. However, an analysis restricted to guanine residues may exclude the detection of selected regulatory binding sites.

In vivo footprinting results may also be compromised by heterogeneity in the cell population under investigation. A population of cells that is heterogeneous with regard to cell type, degree of maturation, or level of gene expression may compromise or obscure the ability to detect protein-DNA interactions.

Two features of our in vivo footprinting experiments were critical for a complete analysis of the human α- and β-globin LCR elements described in the following section. First, we modified the LMPCR in vivo footprinting procedure of Mueller and Wold (1989) to permit the analysis of adenine as well as guanine residues; we refer to this modified method as GA-LMPCR in vivo footprinting. The GA-LMPCR footprinting procedure is described in detail in Strauss and Orkin (1997). Second, we analyzed and compared the same chromatin region in different cellular environments. To study the α-globin LCR element on chromosome 16, we used human K562 cells that show erythroid, megakaryocytic, and myeloid characteristics (Rutherford et al., 1979; Lumelsky and Forget, 1991); human-MEL cell hybrids, line J3-8B, which contain a single human chromosome 16; and nonerythroid hepatoma cells (HepG2).

A comparative example of in vivo footprinting with G and GA cleavage chemistry is shown in Figure 4. As indicated in Figure 4A, no in vivo footprints were detected at a potential GATA binding site in the human α-LCR element with G cleavage chemistry. In contrast, in vivo footprinting with GA cleavage chemistry reveals protections at two adenines within the core of the GATA motif in J3-8B cells (Figure 4B). From these results we conclude that GA cleavage chemistry provides information about protein-DNA interactions that cannot be obtained by G cleavage chemistry alone. Furthermore, although K562 cells show a partial erythroid phenotype, these cells may, in some instances, be inadequate for the analysis of active erythroid regulatory elements. Finally, the lack of any detectable in vivo footprints in HepG2 cells is consistent with the absence of GATA-1 protein in hepatic cells (Tsai et al., 1989; Zon et al., 1989) and the inaccessibility of the α-LCR in non-globin-expressing cells. Similar results were obtained in our analysis of the β-LCR element with HU-11, K562, and HepG2 cells. The HU-11 line is an interspecies human-mouse somatic cell hybrid containing a segment of human chromosome 11 in a MEL cell environment (Dhar et al., 1990).

Figure 4. Analysis of in situ, protein-DNA interactions at the nonconsensus GATA binding site in the α-LCR element, using guanine (A) and guanine/adenine (B) LMPCR in vivo footprinting. Expressing cell lines include in vivo methylated J3-8B and K562; in vivo methylated HepG2 cells were used as a nonexpressing control. The same preparations of methylated DNA were used for the two experiments.
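The footprinting logic described in this section, in which a purine is scored as protected or enhanced according to whether its methylation in vivo is lower or higher than in protein-free DNA, can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the authors' analysis, which was read from sequencing-gel lanes: the band intensities, the position labels, the normalization to total lane signal, and the cutoff ratios of 0.5 and 2.0 are all hypothetical.

```python
def classify_reactivity(in_vivo, naked, protect_below=0.5, enhance_above=2.0):
    """Classify each purine by its in vivo / protein-free intensity ratio.

    in_vivo and naked map a position label to a band intensity measured in
    the in vivo methylated lane and the in vitro methylated, protein-free
    lane. Each lane is first scaled to its total signal so that loading
    differences cancel. The cutoffs are arbitrary illustrative values.
    """
    total_vivo = sum(in_vivo.values())
    total_naked = sum(naked.values())
    calls = {}
    for pos, naked_signal in naked.items():
        ratio = (in_vivo[pos] / total_vivo) / (naked_signal / total_naked)
        if ratio < protect_below:
            calls[pos] = "protected"      # open circle in the figures
        elif ratio > enhance_above:
            calls[pos] = "enhanced"       # closed circle in the figures
        else:
            calls[pos] = "unchanged"
    return calls


# Hypothetical band intensities (arbitrary units), not data from the chapter.
# The adenine band is weaker because adenines are methylated less efficiently.
naked = {"G1": 100, "G2": 100, "A3": 40, "G4": 100}
in_vivo = {"G1": 20, "G2": 105, "A3": 38, "G4": 300}

calls = classify_reactivity(in_vivo, naked)
for pos, call in calls.items():
    print(pos, call)
```

Scaling to total lane signal is only one possible normalization; a strongly enhanced band inflates the lane total and lowers every other ratio, so a reference band known to be unaffected could be a better choice in practice.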

IV. ANALYSIS OF HUMAN LOCUS CONTROL REGION ELEMENTS

The expression of α- and β-like globin genes in developing erythroid cells is dependent on distant, upstream regulatory sequences, referred to as locus control regions (LCRs) (Grosveld et al., 1987; Higgs et al., 1990). These regulatory elements are coincident with DNase I hypersensitive regions (Tuan et al., 1985; Forrester et al., 1987) and serve to maintain chromatin in an open, active configuration (Felsenfeld, 1992) to influence the transcription of globin genes. Linkage of LCR elements to globin and nonglobin genes permits consistent, position-independent expression of the linked gene in transgenic mice or cultured cells (Grosveld et al., 1987; Higgs et al., 1990). In select α- and β-thalassemic individuals, rare natural deletions of LCR regulatory sequences result in the inactivation of intact globin genes (Driscoll et al., 1989; Hatton et al., 1990). Current models postulate that transcription of individual globin genes results from interactions between proteins bound to chromatin at the proximal promoter and at the LCR.

The activity associated with the human α-LCR element has been localized to about 350 bp of sequence that resides 40 kb upstream of the embryonic ζ-globin gene (Higgs et al., 1990). This region coincides with erythroid-specific DNase I hypersensitivity (Higgs et al., 1990). The β-LCR is characterized by defined regions located 6-18 kb upstream of the embryonic ε-globin gene (Tuan et al., 1985; Forrester et al., 1987). The activity of the β-LCR has been subdivided into discrete core subregions of 200-300 bp, each corresponding to DNase I hypersensitive sites designated HS-1, HS-2, HS-3, and HS-4. Most of the β-LCR activity has been associated with HS-2, HS-3, and HS-4 (Philipsen et al., 1990; Talbot et al., 1990; Pruzina et al., 1991). The remarkably compact α-LCR element and the β-LCR HS-2 subregion show classical enhancer activity in erythroid cells (Tuan et al., 1989; Ney et al., 1990b; Moon and Ley, 1991; Pondel et al., 1992). Although β-LCR subregions HS-3 and HS-4 make substantial contributions to position-independent expression of genes in transgenic mice, they lack enhancer activity.

LCR function is presumed to result from the binding of both cell-specific and ubiquitous proteins. Studies using in vitro assays have revealed multiple binding sites for the erythroid transcriptional factor GATA-1 (Evans and Felsenfeld, 1989; Tsai et al., 1989), AP-1 or NF-E2 (Moi and Kan, 1990; Ney et al., 1990a,b; Philipsen et al., 1990; Talbot et al., 1990; Jarman et al., 1991; Pruzina et al., 1991; Talbot and Grosveld, 1991) and proteins that recognize CACCC or GGTGG sequences (Philipsen et al., 1990; Talbot and Grosveld, 1991). As protein binding in vitro may not accurately reflect the activity of regulatory elements in situ, we have used in vivo footprinting as a complementary approach in the analysis of LCR-mediated cell-specific gene expression. In the following sections, we present our analysis of the α-LCR element and the HS-3 subregion of the β-LCR.

A. The α-Globin LCR Element

The major enhancer activity of the α-globin LCR element has been localized to 350 bp of sequence (Higgs et al., 1990). We have analyzed protein binding in this region with GA-LMPCR in vivo footprinting. The investigation of α-LCR regulatory sites was performed in uninduced and DMSO-induced J3-8B cells, and compared to in vivo methylated HepG2 DNA and in vitro-methylated, protein-free DNA. Analysis of the human α-LCR element by in vitro assays has demonstrated that several motifs bind proteins from nuclear extracts of erythroid and nonerythroid cells (Jarman et al., 1991). These regulatory elements include four potential binding sites for the erythroid transcription factor GATA-1, two potential sites for AP-1 and/or the erythroid factor NF-E2, and four potential sites for factors that recognize CACCC/GGTGG motifs. The active elements within the α-LCR region, detected by in vivo footprinting, are displayed in Figures 5, 6, and 7, summarized in Figure 8A, and compared to in vitro results in Figure 8B. In nonerythroid HepG2 cells, no in vivo footprints were observed in the α-LCR element. The protein binding of each motif is described below.

GATA Elements

Of the four potential GATA-1 motifs in the α-LCR element, including a nonconsensus site (positions 211 to 216), three are active in J3-8B cells (Figures 5, 6, and 8). The upstream GATA element (positions 10 to 15), which binds protein in vitro (Jarman et al., 1991), was not occupied in vivo.

Figure 5. In vivo DMS footprinting of the top strand of the human α-LCR element. Lanes: 1, in vitro methylated protein-free K562 DNA; 2, in vivo methylated HepG2 DNA; 3, in vivo methylated J3-8B DNA (uninduced cells); 4, in vivo methylated J3-8B DNA (DMSO-induced cells). Protections are indicated by open circles; enhancements are represented by closed circles. A solid arrow corresponds to a hypersensitive guanine between two downstream GATA elements.

AP-1/NF-E2 Elements

In the central region of the α-LCR element, there are two AP-1 consensus sites [TGA(C/G)TCA]. These motifs bind a variety of proteins in vitro, including the erythroid-restricted protein NF-E2 (Mignotte et al., 1989). Examination of this region by in vitro studies identified protein binding in both erythroid and nonerythroid cells (Jarman et al., 1991). In vivo footprinting analysis of the AP-1/NF-E2 binding site reveals extensive protein contacts over both motifs in J3-8B cells; however, no footprints were detected in nonerythroid HepG2 cells (Figures 5, 6, and 8). We have demonstrated by gel-shift studies that these sites are appropriate targets for purified mouse NF-E2 protein (Strauss et al., 1992).
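One property of the AP-1 consensus TGA(C/G)TCA quoted above is worth making explicit: its two variants are reverse complements of each other, so a site that matches on one strand also matches on the other, and a single-strand search is enough to locate it. The short sketch below checks this and lists matches in a made-up sequence; the helper names and the test sequence are hypothetical, and the search says nothing about which protein (AP-1 or NF-E2) occupies a site.

```python
import re

AP1 = re.compile(r"TGA[CG]TCA")           # TGA(C/G)TCA
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def ap1_sites(seq):
    """Return (0-based start, matched site) for each AP-1 consensus match."""
    return [(m.start(), m.group()) for m in AP1.finditer(seq.upper())]


# The two sequences covered by the consensus map onto each other.
variants = {"TGACTCA", "TGAGTCA"}
is_self_complementary = {reverse_complement(v) for v in variants} == variants

# Hypothetical sequence containing one copy of each variant.
toy = "GCTGAGTCATGATGACTCAGCA"
sites = ap1_sites(toy)
print(is_self_complementary, sites)
```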

CACCC/GGTGG Elements

CACCC/GGTGG sequences are frequently observed in the promoters of globin genes and LCR elements, and in association with GATA sites. These motifs are bound by a variety of proteins, including the ubiquitous transcription factor SP-1; however, none of these factors has been shown to be erythroid specific (Talbot and Grosveld, 1991). Of the four potential sites in the α-LCR, only one is occupied by protein in vivo in J3-8B cells (Figures 7 and 8). Protein binding to this element was not detected in vitro; conversely, binding to CACCC/GGTGG sequences upstream of this motif was observed in vitro (Jarman et al., 1991).

Figure 6. In vivo DMS footprinting of the bottom strand of the human α-LCR element. Lanes and designations of altered DMS reactivities are as indicated in Figure 5. A protected adenine between the downstream GATA motifs is indicated with an open arrow; an inducible hypersensitivity in this region is indicated with a closed arrow.

Figure 7. In vivo DMS footprinting of the top strand of the active CACCC/GGTGG element in the α-LCR. Lanes and designations of altered DMS reactivities are as indicated in Figure 5.

Figure 8. (A) Summary of altered DMS reactivities in the human α-LCR element. Protections are indicated by open circles; enhancements are represented by closed circles. Hypersensitive sites between the two GATA-1 binding sites are represented by closed arrows; an associated protection is indicated by an open arrow. (B) Schematic comparison of in vitro and in vivo footprinting analyses of the α-LCR element. Potential binding sites where no footprints were observed are represented by open boxes; protein occupancy of regulatory motifs is indicated by closed boxes. A protection and two hypersensitivities detected by in vivo footprinting are designated with open and closed arrows, respectively.

Induced Hypersensitivity Between Two GATA Elements

An objective of our in vivo footprinting studies was the identification of novel protein-DNA contacts that may provide a basis for understanding the distinctive features of LCR elements: potent enhancer activity and relative position independence. In this regard, we have detected a discrete region characterized by a protection and weak hypersensitivity on the bottom strand and a strong hypersensitivity on the top strand in either uninduced or induced J3-8B cells (Figures 5, 6, and 8). In DMSO-treated J3-8B cells the weak hypersensitivity on the bottom strand becomes distinctly enhanced (Figure 6, lane 4). Gel-shift experiments failed to reveal specific protein binding in this region (data not shown). Consequently, the stable protection and hypersensitivity, and the induced hypersensitivity in J3-8B cells, may reflect local chromatin structure, perhaps resulting from the interaction of proteins bound to the adjacent GATA sites or their interaction with proteins that do not contact DNA. Alternatively, these footprints may represent protein binding in vivo that cannot be detected in vitro, presumably because of low abundance or instability of the relevant protein in erythroid nuclear extract.

The results of our analysis suggest that GATA-1, NF-E2, and proteins that bind CACCC/GGTGG sequences are minimally required for LCR function. In addition, the altered methylation pattern detected between the two downstream GATA elements may reflect the interaction of the bound proteins with each other and with accessory proteins not directly contacting DNA. These results may assist in resolving the paradox that the same repertoire of protein-DNA interactions is present throughout differentiation. Modest alterations in protein-protein interactions or post-translational modifications of preexisting nuclear proteins may transduce signals for differentiation and, in this instance, high-level α-globin gene expression. These changes might activate chromatin without directly altering protein-DNA contacts.

B. The β-Globin LCR Subregion HS-3

The copy number-dependent, integration site-independent activity of HS-3 has been defined by a region of sequence spanning 225 bp (Philipsen et al., 1990). Protein binding within the HS-3 subregion and additional upstream sequences containing an AP-1/NF-E2 site were examined by GA-LMPCR in vivo footprinting. The investigation of the HS-3 subregion was conducted in an adult erythroid environment using mouse erythroleukemia cells containing a segment of human chromosome 11. This cell line, designated HU-11, exhibits a prominent, erythroid-specific hypersensitive site coincident with the HS-3 subregion (Dhar et al., 1990). In contrast, the human erythroleukemia cell line K562, which expresses an embryonic and fetal globin program, fails to form HS-3 (Tuan et al., 1985). Consistent with this finding, no in vivo protein binding was observed in HS-3 with K562 cells (data not shown). The results of our in vivo footprinting analysis of the human HS-3 subregion are displayed in Figures 9-12, summarized in Figure 13A, and compared with in vitro studies in Figure 13B. In nonerythroid HepG2 cells, no in vivo footprints were detected in HS-3.

Figure 9. In vivo DMS footprinting of the GATA sites in the human HS-3 region. Top-strand analyses are shown in the left panel of A, B, C, and D; bottom strand analyses are shown in the right panel of A, B, C, and D. Lanes: 1, In v i m methylated protein-free K562 DNA; 2 , in vivo methylated HepG2 DNA; 3, in vivo methylated HU-11 DNA (uninduced cells); 4, in vivo methylated HU-11 DNA (DMSO-induced cells). Protections and enhancements are represented by open and closed circles, respectively. Numbers in parentheses refer to nucleotides as shown in Figure 13A.


Figure 10. In vivo DMS footprinting of the top strand of the two active CACC/GT sites in the human HS-3 core element. The upstream motif is shown in (A); the downstream motif is displayed in (B). Lanes and designations of altered DMS reactivities are as indicated in Figure 9.

Figure 11. In vivo DMS footprinting of the top (left) and bottom (right) strands of the upstream HS-3 AP-1/NF-E2 site. Lanes and designations of altered DMS reactivities are as indicated in Figure 9.

Figure 12. In vivo DMS footprinting of the top (left) and bottom (right) strands of the central A + T-rich region of the human HS-3 region. Lanes and designations of altered DMS reactivities are as indicated in Figure 9.

GATA Elements

Four consensus GATA motifs are present within the core HS-3 region. Of these elements, three bind protein in vitro; potential binding to the remaining GATA site was obscured in vitro by protein binding to one of the overlapping CACCC/GGTGG motifs (Philipsen et al., 1990). Analysis of these sites by in vivo footprinting demonstrates that all four consensus GATA elements are active in uninduced and DMSO-treated HU-11 cells (Figures 9 and 13). Additional protections were evident at two GATA motifs following DMSO-induced maturation of HU-11 cells (Figure 9B, lane 4, A-119, A-121, and G-122; Figure 9C, lane 4, G-156). Protein binding in vivo was not detected at the nonconsensus GATA site (data not shown).

CACCC/GGTGG Elements

The HS-3 core element contains seven CACCC/GGTGG motifs; six of these sites have been shown to bind protein in vitro (Philipsen et al., 1990). Two of these motifs are conserved between humans and mice (Moon and Ley, 1991A); however, only a single site is conserved among humans, mice, and goats (Li et al., 1990). Analysis of these motifs in uninduced HU-11 cells reveals only single contacts at two of the CACCC/GGTGG sequences (Figure 10A, lane 3, G-100; Figure 10B, lane 3, G-191). After induced maturation with DMSO, additional contacts are evident at both of these elements (Figures 10 and 13). Of the two active CACCC/GGTGG motifs, one (positions 187 to 191) corresponds to the highly conserved element; the other functional site is unique to the human β-LCR HS-3 core sequence.

AP-1/NF-E2 Element

Located upstream of the defined HS-3 core element (Philipsen et al., 1990) is a single AP-1/NF-E2 motif (Mignotte et al., 1989). This site binds purified murine NF-E2 in vitro (Andrews et al., 1993). In uninduced HU-11 cells, modest hypersensitivities are observed at positions A-9 and A-10 (Figure 11, right panel, and Figure 13). Following DMSO treatment of HU-11 cells, additional protein-DNA interactions were detected, including a protection in the core AP-1/NF-E2 binding site (Figure 11, left panel, and Figure 13). In HS-3, the absence of in vivo contacts at this site in HepG2 cells parallels our studies in the α-globin LCR (Figures 5, 6, and 8) and the human and murine β-LCR HS-2 core element (data not shown). These results contrast with previous reports that show binding in both erythroid and nonerythroid HeLa cells at functional motifs in the human HS-2, including the AP-1/NF-E2 sites (Ikuta and Kan, 1991; Reddy and Shen, 1991). Our findings are consistent with a chromatin structure that precludes protein binding in nonerythroid, non-globin-expressing cells.

Figure 13. (A) Summary of altered DMS reactivities in human β-LCR HS-3. Protections and enhancements are represented by open and closed circles, respectively. Arrows denote protections that do not correspond to a known or identified protein binding site. (B) Schematic comparison of in vitro and in vivo footprinting analysis of HS-3. Potential binding sites, where footprints were not detected, are indicated by open boxes. Protein binding of regulatory motifs is indicated by closed boxes. The hatched box depicting the AP-1/NF-E2 motif indicates that binding to this element was not assessed in previous in vitro studies (Philipsen et al., 1990).


Adenine/Thymine-rich Sequence

In the central region of the HS-3 core element, in vivo footprinting revealed a discrete set of erythroid-specific protections at an A + T-rich sequence (Figure 12). This sequence does not contain any characterized regulatory binding sites. An examination of this region by gel-shift assay failed to reveal specific protein binding in vitro (data not shown). We have detected similar patterns of altered DMS reactivity at sequences that do not resemble classical protein binding motifs in the α-LCR (Figures 5, 6, and 8) and the human β-LCR HS-2 subregion. In the human HS-2 core element, hypersensitivities are evident upstream of the active GATA site (Strauss and Orkin, 1992). As in the α-LCR and HS-3, the region containing the hypersensitive nucleotides in HS-2 does not exhibit specific protein binding in vitro. Thus the apparent contacts in the A + T-rich region of HS-3, and similar interactions in the α-LCR and HS-2, may represent altered chromatin structure resulting from the mutual interactions of proteins bound to their respective regulatory motifs.

Our in vivo footprinting analysis of HS-3 reveals differences in protein-DNA interactions when compared to the findings inferred from previous in vitro binding studies (Figure 13B). In functional studies, the site-independent activity of HS-3 has been localized to 225 bp of sequence downstream of the AP-1/NF-E2 motif (Philipsen et al., 1990). In vivo footprinting results demonstrate that the HS-3 core contains active GATA and CACCC/GGTGG elements arranged symmetrically with respect to a central A + T-rich region. We speculate that the configuration and interactions of bound GATA-1 and CACCC/GGTGG proteins contribute to the distinctive LCR properties. This conclusion is consistent with functional studies that have localized the integration site-independent activity of HS-2 to sequences downstream of the double AP-1/NF-E2 elements (Talbot and Grosveld, 1991); this region contains single GATA and CACCC/GGTGG sites that are contacted by protein in vivo (Ikuta and Kan, 1991; Reddy and Shen, 1991). Although the defined HS-3 core region retained the distinctive features associated with LCR elements, it was expressed at a reduced level in transgenic mice (Philipsen et al., 1990). The decreased expression is presumably attributable to the exclusion of the active HS-3 AP-1/NF-E2 site.

In vivo footprinting of HS-3 also shows that this subregion of the β-LCR is regulated with respect to stage of erythroid maturation. In human K562 cells, which express embryonic and fetal hemoglobins (Tuan et al., 1985; Forrester et al., 1987), formation of HS-3 does not occur (Tuan et al., 1985). In the adult erythroid environment of HU-11 cells, protein binding at HS-3 is evident in uninduced cells and becomes more complex following DMSO-induced maturation. These results contrast with our findings in the human HS-2 core element. In HS-2 in vivo footprinting studies, no distinctions in the degree or extent of protein binding are detected between uninduced and DMSO-treated HU-11 cells (data not shown). These findings suggest that the activity of HS-2 precedes that of HS-3 during cellular maturation in HU-11 cells. Consequently, the individual core elements of the β-LCR may influence the complex developmental pattern of globin gene expression.

Finally, in HS-3, the array of regulatory elements occupied in vivo is the same as that described in HS-2 and the α-LCR (Ikuta and Kan, 1991; Reddy and Shen, 1991; Strauss et al., 1992). These results suggest that the distinctive properties of LCR elements may be related to the organization of these motifs and the consequent protein interactions, rather than the action of unique LCR factors.

V. CONCLUSIONS

The functional activity of erythroid regulatory elements is presumed to be mediated through their interaction with cell-specific and ubiquitous proteins. To investigate the properties of erythroid regulatory regions and the basis of cell-specific gene expression, DNA binding of nuclear proteins has been examined in vitro (Martin et al., 1989; Moi and Kan, 1990; Ney et al., 1990a, b; Philipsen et al., 1990; Talbot et al., 1990; Jarman et al., 1991; Pruzina et al., 1991; Talbot and Grosveld, 1991; Tsai et al., 1991). However, in vitro studies are limited in several respects. First, in vitro techniques may detect protein binding of ubiquitous factors to sites that are unavailable in native chromatin. Second, they may fail to reveal functional regulatory sites because of low concentration or instability of relevant proteins in nuclear extracts. Third, in vitro binding at sites with overlapping specificities may reflect relative protein abundance rather than the actual in situ state. Finally, in vitro analysis is insensitive to chromatin structure. The results from the studies discussed in this chapter demonstrate that in vivo footprinting permits high-resolution analysis of protein-DNA interactions as they occur in intact cells. In vivo footprinting experiments may reveal tissue-specific binding of ubiquitous proteins, discriminate the functional status of potential binding sites, monitor changing profiles of protein binding at overlapping regulatory motifs, and detect active chromatin structures.

ACKNOWLEDGMENTS

We are grateful to D. Higgs, W. Wood, and J. Sharpe for providing the interspecific somatic cell hybrid lines HU-11 and J3-8B. This work was supported in part by a grant from the National Institutes of Health to S. H. O. and a grant from Johnson and Johnson Research Awards to E. C. S. and S. H. O. through the Harvard-Massachusetts Institute of Technology Division of Health Sciences and Technology Program. S. H. O. is an Investigator of the Howard Hughes Medical Institute.

REFERENCES

Andrews, N.C., Erdjument-Bromage, H., Davidson, M.B., Tempst, P., & Orkin, S.H. (1993). Erythroid transcription factor NF-E2 is a haematopoietic-specific basic-leucine zipper protein. Nature 362, 722-728.

Antoniou, M., & Grosveld, F. (1990). β-Globin dominant control region interacts differently with distal and proximal promoter elements. Genes & Dev. 4, 1007-1013.

deBoer, E., Antoniou, M., Mignotte, V., Wall, L., & Grosveld, F. (1988). The human β-globin promoter: nuclear protein factors and erythroid specific induction of transcription. EMBO J. 7, 4203-4212.

Dhar, V., Nandi, A., Schildkraut, C.L., & Skoultchi, A.I. (1990). Erythroid-specific nuclease-hypersensitive site flanking the human β-globin domain. Mol. Cell. Biol. 10, 4324-4333.

Driscoll, M.C., Dobkin, C.S., & Alter, B.P. (1989). Gamma-delta-beta thalassemia due to de novo mutation deleting the 5’ β-globin gene activation-region hypersensitive sites. Proc. Natl. Acad. Sci. USA 86, 7470-7474.

Evans, T., Reitman, M., & Felsenfeld, G. (1988). An erythrocyte-specific DNA-binding factor recognizes a regulatory sequence common to all chicken globin genes. Proc. Natl. Acad. Sci. USA 85, 5976-5980.

Evans, T., & Felsenfeld, G. (1989). The erythroid-specific transcriptional factor eryf1: A new finger protein. Cell 58, 877-885.

Felsenfeld, G. (1992). Chromatin as an essential part of the transcriptional mechanism. Nature 355, 219-224.

Forrester, W.C., Takegawa, S., Papayannopoulou, T., Stamatoyannopoulos, G., & Groudine, M. (1987). Evidence for a locus activation region: The formation of developmentally stable hypersensitive sites in globin expressing hybrids. Nucleic Acids Res. 15, 10159-10177.

Frampton, J., Walker, M., Plumb, M., & Harrison, P.R. (1990). Synergy between the NF-E1 erythroid-specific transcription factor and the CACCC factor in the erythroid-specific promoter of the human porphobilinogen deaminase gene. Mol. Cell. Biol. 10, 3838-3842.

Grosveld, F., van Assendelft, G.B., Greaves, D.R., & Kollias, G. (1987). Position-independent, high-level expression of the human beta-globin gene in transgenic mice. Cell 51, 975-985.

Hatton, C.S.R., Wilkie, A.O.M., Drysdale, H.C., Wood, W.G., Vickers, M.A., Sharpe, J., Ayyub, H., Pretorius, I.M., Buckle, V.J., & Higgs, D.R. (1990). α-Thalassemia caused by a large (62 kb) deletion upstream of the human α-globin gene cluster. Blood 76, 221-227.

Higgs, D.R., Wood, W.G., Jarman, A.P., Sharpe, J., Lida, J., Pretorius, I.-M., & Ayyub, H. (1990). A major positive regulatory region is located far upstream of the human α-globin gene locus. Genes & Dev. 4, 1588-1601.

Ikuta, T., & Kan, Y.W. (1991). In vivo protein-DNA interactions at the β-globin gene locus. Proc. Natl. Acad. Sci. USA 88, 10188-10192.

Jarman, A.P., Wood, W.G., Sharpe, J.A., Gourdon, G., Ayyub, H., & Higgs, D.R. (1991). Characterization of the major regulatory element upstream of the human α-globin gene cluster. Mol. Cell. Biol. 11, 4679-4689.

Li, Q., Zhou, B., Powers, P., Enver, T., & Stamatoyannopoulos, G. (1990). Beta-globin locus activation regions: conservation of organization, structure, and function. Proc. Natl. Acad. Sci. USA 87, 8207-8211.

Lumelsky, N.L., & Forget, B.G. (1991). Negative regulation of globin gene expression during megakaryocytic differentiation of a human erythroleukemic cell line. Mol. Cell. Biol. 11, 3528-3536.

Martin, D.I.K., Tsai, S.-F., & Orkin, S.H. (1989). Increased gamma-globin expression in a nondeletion HPFH mediated by an erythroid-specific DNA-binding factor. Nature 338, 435-438.

Martin, D.I.K., Zon, L.I., & Orkin, S.H. (1990). Expression of an erythroid transcription factor in megakaryocytic and mast cell lineages. Nature 344, 444-446.

Mignotte, V., Wall, L., deBoer, E., Grosveld, F., & Romeo, P.-H. (1989). Two tissue-specific factors bind the erythroid promoter of the human porphobilinogen deaminase gene. Nucleic Acids Res. 17, 37-54.

Moi, P., & Kan, Y.W. (1990). Synergistic enhancement of globin gene expression by activator protein-1-like proteins. Proc. Natl. Acad. Sci. USA 87, 9000-9004.

Moon, A.M., & Ley, T.J. (1991A). Conservation of the primary structure, organization, and function of the human and mouse β-globin activating regions. Proc. Natl. Acad. Sci. USA 87, 7693-7697.

Moon, A.M., & Ley, T.J. (1991B). Functional properties of the β-locus control region in K562 erythroleukemia cells. Blood 77, 2272-2284.

Mueller, P.R., & Wold, B. (1989). In vivo footprinting of a muscle specific enhancer by ligation mediated PCR. Science 246, 780-786.

Ney, P.A., Sorrentino, B.P., Lowrey, C.H., & Nienhuis, A.W. (1990a). Inducibility of the HS II enhancer depends on binding of an erythroid specific nuclear protein. Nucleic Acids Res. 18, 6011-6017.

Ney, P.A., Sorrentino, B.P., McDonagh, K.T., & Nienhuis, A.W. (1990b). Tandem AP-1-binding sites within the human β-globin dominant control region function as an inducible enhancer in erythroid cells. Genes & Dev. 4, 993-1006.

Orkin, S.H. (1990). Globin gene regulation and switching: Circa 1990. Cell 59, 1115-1125.

Penvy, L., Simon, M.C., Robertson, E., Klein, W.H., Tsai, S.-F., D’Agati, V., Orkin, S.H., & Constantini, F. (1991). Erythroid differentiation in chimeric mice blocked by a targeted mutation in the gene for transcription factor GATA-1. Nature 349, 257-260.

Philipsen, S., Talbot, D., Fraser, P., & Grosveld, F. (1990). The β-globin dominant control region: hypersensitive site 2. EMBO J. 9, 2159-2167.

Plumb, M., Frampton, J., Wainwright, H., Walker, M., Macleod, K., Goodwin, G., & Harrison, P. (1989). GATAAG: A cis-control region binding an erythroid-specific nuclear factor with a role in globin and non-globin gene expression. Nucleic Acids Res. 17, 73-92.

Pondel, M.D., George, M., & Proudfoot, N.J. (1992). The LCR-like α-globin positive regulatory element functions as an enhancer in transient transfected cells during erythroid differentiation. Nucleic Acids Res. 20, 237-243.

Pruzina, S., Hanscombe, O., Whyatt, D., Grosveld, F., & Philipsen, S. (1991). Hypersensitive site 4 of the human β-globin locus control region. Nucleic Acids Res. 19, 1413-1419.

Reddy, P.M.S., & Shen, C.-K.J. (1991). Protein-DNA interactions in vivo of an erythroid-specific, human β-globin locus enhancer. Proc. Natl. Acad. Sci. USA 88, 8676-8680.

Reitman, M., & Felsenfeld, G. (1988). Mutational analysis of the chicken β-globin enhancer reveals two positive-acting domains. Proc. Natl. Acad. Sci. USA 85, 6267-6271.

Romeo, P.-H., Prandini, M.-H., Joulin, V., Mignotte, V., Prenant, M., Vainchenker, W., Marguerie, G., & Uzan, G. (1990). Megakaryocytic and erythroid lineages share specific transcription factors. Nature 344, 447-449.

Rutherford, T.R., Clegg, J.B., & Weatherall, D.J. (1979). K562 human leukemic cells synthesize embryonic hemoglobin in response to hemin. Nature 280, 164-165.

Schule, R., Muller, M., Otsuka-Murakami, H., & Renkawitz, R. (1988). Cooperativity of the glucocorticoid receptor and the CACCC-box binding factor. Nature 332, 87-90.

Strauss, E.C., Andrews, N.C., Higgs, D.R., & Orkin, S.H. (1992). In vivo footprinting of the human α-globin locus upstream regulatory element by guanine and adenine ligation-mediated polymerase chain reaction. Mol. Cell. Biol. 12, 2135-2142.

Strauss, E.C., & Orkin, S.H. (1992). In vivo protein-DNA interactions at hypersensitive site 3 of the β-globin locus control region. Proc. Natl. Acad. Sci. USA 89, 5809-5813.

Strauss, E.C., & Orkin, S.H. (1997). Guanine-adenine ligation-mediated PCR in vivo footprinting. Methods 11, 164-170.

Talbot, D., Philipsen, S., Fraser, P., & Grosveld, F. (1990). Detailed analysis of the site 3 region of the human β-globin dominant control region. EMBO J. 9, 2169-2178.

Talbot, D., & Grosveld, F. (1991). The 5’HS 2 of the globin locus control region enhances transcription through the interaction of a multimeric complex binding two functionally distinct NF-E2 binding sites. EMBO J. 10, 1391-1398.

Tsai, S.-F., Martin, D.I., Zon, L.I., D’Andrea, A.D., Wong, G.G., & Orkin, S.H. (1989). Cloning of cDNA for the major DNA-binding protein of the erythroid lineage through expression in mammalian cells. Nature 339, 446-451.

Tsai, S.-F., Strauss, E., & Orkin, S.H. (1991). Functional analysis and in vivo footprinting implicate the erythroid transcription factor GATA-1 as a positive regulator of its own promoter. Genes & Dev. 5, 919-931.

Tuan, D., Solomon, W., Li, Q., & London, I.M. (1985). The "beta-like globin" gene domain in human erythroid cells. Proc. Natl. Acad. Sci. USA 82, 6384-6388.

Tuan, D., Solomon, W.B., London, I.M., & Lee, D.P. (1989). An erythroid-specific, developmental-stage-independent enhancer far upstream of the human "β-like globin" genes. Proc. Natl. Acad. Sci. USA 86, 2554-2558.

Watt, P., Lamb, P., Squire, L., & Proudfoot, N.J. (1990). A factor binding GATAAG confers tissue specificity on the promoter of the human epsilon-globin gene. Nucleic Acids Res. 18, 1339-1350.

Xiao, J., Davidson, I., Macchi, M., Rosales, R., Vigeron, M., Staub, A., & Chambon, P. (1987). In vitro binding of several cell-specific and ubiquitous nuclear proteins to the GT-1 motif of the SV40 enhancer. Genes & Dev. 1, 794-807.

Zon, L.I., Tsai, S.-F., Burgess, S., Matsudaira, P., Bruns, G., & Orkin, S.H. (1989). The major human erythroid DNA-binding protein (GF-1; NF-E1; Eryf-1): Primary sequence and localization of the gene to the X chromosome. Proc. Natl. Acad. Sci. USA 87, 668-672.