Chromosomal arrangement of the chicken β-type globin genes


Cell, Vol. 24, 669-677, June 1981, Copyright © 1981 by MIT



Chromosomal Arrangement of the Chicken β-Type Globin Genes

Maureen Dolan,* Barry J. Sugarman,* Jerry B. Dodgson† and James Douglas Engel*
* Department of Biochemistry and Molecular Biology, Northwestern University, Evanston, Illinois 60201
† Department of Microbiology and Public Health and Department of Biochemistry, Michigan State University, E. Lansing, Michigan 48824

Summary

We have isolated the chicken β-type globin genes from a library of chicken DNA-λ Charon 4A recombinant bacteriophage. There are four β-type genes within this segment of the genome; we believe this represents all of the β-type genes of the chicken. The recombinant λCβG1 contains the embryonic ε- and adult β-globin genes. The hatching βH- and embryonic ρ-globin genes are found in the recombinant λCβG2. Although λCβG1 and λCβG2 do not physically overlap, we present evidence that all four genes are closely linked and transcribed from the same DNA strand. These experiments demonstrate that the chromosomal regions represented by λCβG1 and λCβG2 lie approximately 1.6 kb apart in the chicken genome. A third recombinant, λCβG3, extends the genomic locus studied in the vicinity of the β-type globin genes to approximately 39 kb. The physical order of the chicken β-type globin genes within this segment of the chromosome is 5'...ρ-βH-β-ε...3'. This arrangement is unique among the vertebrate β-type globin gene clusters thus far examined, in that embryonic genes are located at the 5' and 3' ends of the cluster while the hatching and adult genes occupy central positions.

Introduction

The globin genes of vertebrates comprise a multigene family whose expression is both developmentally and temporally regulated. To date, the best studied of these are the β-type genes of the mouse (Jahn et al., 1980) and human (see Maniatis et al., 1980), but reports on the β-type globin genes of chickens (Dodgson et al., 1979; Ginder et al., 1979), rabbits (Lacy et al., 1979; Hardison et al., 1979) and frogs (Jeffreys et al., 1980; Patient et al., 1980) have also appeared. The globin gene families are attractive for molecular genetic studies because of their extensive previous genetic and biochemical analysis and their relative ease of isolation (once appropriate cloning procedures were developed).

The embryonic and adult globin genes of chickens represent one such gene family whose members are temporally expressed during development. Of the β-type globin polypeptides, two (ρ and ε) are specific embryonic (primitive red cell) globins, while two others (β and βH) are expressed in the adult or definitive red cell line (Bruns and Ingram, 1973). Two other β-type globin polypeptides, ε' and βD, have been proposed to exist in the primitive and definitive red cell lines, respectively (Brown and Ingram, 1974; Moss and Hamilton, 1974). In an attempt to define the actual number of members of the β-globin gene family and to decipher the possible control sequences that govern the expression of these genes, we have isolated the chicken β-type globin genes from a library of chicken DNA-λ Charon 4A recombinant bacteriophage.

We previously reported the isolation and characterization of a recombinant, λCβG1, that contains the chicken adult β-globin gene and, in close physical proximity, an embryonic β-like globin gene (Dodgson et al., 1979). We herein report the isolation and characterization of a second independent recombinant, λCβG2, that contains two additional closely linked β-type globin genes. We have identified all of the β-type globin genes in λCβG1 and λCβG2 by DNA-sequence analysis.

Mammalian globin genes are physically arranged in clusters within the genome of a particular organism (Lacy et al., 1979; Hardison et al., 1979; Jahn et al., 1980; Maniatis et al., 1980). Similarly, we have demonstrated the close physical linkage of the chicken β-type globin genes by genomic cloning and blotting experiments. However, the physical arrangement of embryonic versus adult β-type chicken globin genes is distinct from that observed in mammals. These findings suggest that either the physical arrangement of the genes within a chromosomal cluster has no causal relationship to the temporal pattern of expression of these genes, or that the mechanism of globin-gene activation in chickens differs from that in mammals.

Results

Isolation of λCβG1, λCβG2 and λCβG3

The chromosomal library of chicken DNA-λ Charon 4A recombinant bacteriophage (Dodgson et al., 1979) was screened for recombinants containing β-type globin genes with 32P-labeled embryonic and adult chicken-globin cDNA as probe. The isolation and characterization of λCβG1, a recombinant containing the adult chicken β-globin gene, has already been described (Figure 1A; Dodgson et al., 1979). Although λCβG2 and λCβG3 were detected by their hybridization to this mixed-globin probe, these recombinants were found to hybridize most strongly to embryonic cDNA, and only to β-globin-specific probes. Blots of these recombinants, when hybridized with radiolabeled adult or embryonic globin cDNA and then repeatedly washed at increasingly greater stringency,


showed that embryonic globin cDNA hybridized strongly to all three recombinants at high criterion, whereas under these conditions, adult globin cDNA hybridized primarily to λCβG1. We therefore tentatively assumed that λCβG2 and λCβG3 contained embryonic β-type globin genes.

λCβG2 Contains Two Closely Linked β-Type Globin Genes

λCβG2 DNA was cleaved with restriction enzymes Eco RI, Bam HI, Kpn I and Hind III (and all pairwise combinations of these), subjected to electrophoresis on agarose gels, stained with ethidium bromide and photographed (Engel and Dodgson, 1980). DNA was then blotted onto nitrocellulose filters and hybridized to 32P-labeled chicken adult β-globin cDNA cloned into pBR322 (P. Chambon, personal communication). Analysis of the band patterns on the gel photographs and autoradiographic exposures of the blots (data not shown) allowed direct deduction of the restriction enzyme cleavage map for λCβG2 (Figure 1). The chromosomal chicken DNA insert in λCβG2 is 14.1 kb in length. Two distinct regions within this recombinant hybridize to the chicken adult β-globin cDNA probe at low stringency. The first globin-complementary region is contained entirely within a 4.5 kb Hind III fragment and lies approximately 6.5 kb to the right of the chicken DNA-λCh4A left arm junction (heavy lines in Figure 1C). Approximately 2.2 kb farther to the right is the second globin-complementary region. This locus is contained within a 910 bp Bam HI-Eco RI fragment, directly adjacent to the junction between the chicken chromosomal DNA and the λCh4A right arm. For convenience the leftmost globin-complementary locus is denoted β1 and the rightmost region β2. The β-type globin genes from vertebrates (mouse, chicken, rabbit, human) have been shown to be structurally similar with respect to the size and location of intervening sequences (Dodgson et al., 1979; Hardison et al., 1979; Jahn et al., 1980; Maniatis et al., 1980).
Since the largest known intervening sequence in β-type globin genes is 904 bp (the human Gγ gene; Slightom et al., 1980), we made the tentative assumption that these regions (separated by at least 2.2 kb) represented two distinct genes. Furthermore, at high criterion, the β2 region hybridized more strongly in blotting experiments to adult-globin cDNA than did β1, suggesting that β2 is more closely related to the adult β-globin gene at the nucleotide sequence level. The recombinant λCβG3 was also found to hybridize to the chicken β-globin-specific probe. Restriction enzyme mapping and hybridization analysis of λCβG3 demonstrated that it shares 9.2 kb of chromosomal overlap with λCβG2 (Figure 1E). This region of overlap encompasses the left end of the λCβG2 chicken DNA insert and includes part of the β1 locus. λCβG3 extends the region of contiguous chromosomal DNA

Figure 1. Isolated β-Like Chicken Globin Recombinants and Subclones

(A) Restriction enzyme map of λCβG1. Positions of genes, including exon and intron sequences, are marked by closed and open boxes, respectively. Arrows: transcriptional direction. (B) Plasmid subclones derived from λCβG1. (C) Restriction enzyme map of λCβG2. Heavy black lines: regions that hybridize to β-globin-specific probes (see text). (D) Plasmid subclones derived from λCβG2. (E) Restriction enzyme map of λCβG3 (showing region of overlap with λCβG2).
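
The overlap between λCβG2 and λCβG3 in panel E can be checked with simple interval arithmetic. The sketch below is only an illustration and is not part of the original analysis: it places the left end of the 14.1 kb λCβG2 insert at coordinate 0 (an arbitrary choice) and assumes that λCβG3 covers the 9.2 kb of overlap plus the 6.2 kb it adds to the left, as quoted in the text. The names `LCBG2`, `LCBG3`, `overlap_kb` and `contig_kb` are hypothetical.

```python
# Intervals are (left, right) positions in kb along the chromosome.
# Coordinate 0 is arbitrarily set at the left end of the lambda-C-beta-G2 insert.
LCBG2 = (0.0, 14.1)   # 14.1 kb insert, as stated in the text
LCBG3 = (-6.2, 9.2)   # assumed: 6.2 kb of new DNA to the left plus 9.2 kb of overlap


def overlap_kb(a, b):
    """Length shared by two intervals (0.0 if they do not overlap)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def contig_kb(a, b):
    """Total length covered by two overlapping intervals."""
    if overlap_kb(a, b) <= 0.0:
        raise ValueError("intervals do not overlap, so they form no contig")
    return max(a[1], b[1]) - min(a[0], b[0])


if __name__ == "__main__":
    print(f"overlap: {overlap_kb(LCBG2, LCBG3):.1f} kb")
    print(f"contig:  {contig_kb(LCBG2, LCBG3):.1f} kb")
```

Under these assumptions the two inserts together cover a little over 20 kb, in line with the "approximately 20 kb" quoted in the text.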

studied 6.2 kb to the left of that represented by λCβG2, for a total of approximately 20 kb. Other than the loci β1 and β2, no regions within this 20 kb segment of the chromosome hybridize to either embryonic or adult globin-specific probes, even at very low criteria (for example, approximately 50% below homologous Tm) that might be expected to detect pseudogene sequences (Lacy et al., 1979).

The Genes in λCβG2 Are Transcribed in the Same Direction

The transcriptional direction of the two β-globin-complementary loci in λCβG2 was determined by hybridization of 32P-labeled probes, specific for the 5' or 3' sequence of the chicken adult β-globin gene, to blots of restriction-endonuclease-cleaved λCβG2. A Hind III site lies within the major intron of the chicken adult β-globin gene (locus β3 in λCβG1; Figure 1A; Dodgson et al., 1979; Ginder et al., 1979). This site can therefore be used to create 5' and 3' globin-specific probes. The 2.4 kb Hind III-Bam HI fragment (containing the 3' portion of the chicken adult β-globin gene) was subcloned into pBR322 to generate a 3'-specific probe, pβ1HB28. Likewise, the 1.9 kb Eco RI-Hind III fragment containing the 5' and central coding regions was subcloned into pBR322 to generate pβ1HR16, a probe specific for 5' β-globin sequences (Figure 1B; see Experimental Procedures). Figure 2 shows the hybridization pattern obtained when these 3'- and 5'-specific probes are separately hybridized to identical mapping blots of λCβG2 DNA. A more extensive restriction enzyme map of λCβG2 with hybridizing regions is indicated schematically in Figure 3. The 5'-specific probe hybridized strongly to a 470 bp Bam HI-Sac I fragment containing a portion of the β2 locus (Figure 2 lane C; Figure 3). An adjacent 440 bp Sac I-Eco RI fragment of the β2 gene weakly hybridizes to the 5' probe, indicating that the Sac I site in locus β2 probably does not lie within the large intron.
Both the 380 bp Sma I-Sac I fragment (Figure 2 lane D) and the 780 bp Sac I fragment of locus β1


Figure 2. Hybridization of 5'- and 3'-specific β-Globin Probes to λCβG2

DNA was digested with Sac I (lanes A and F), Sac I and Eco RI (lanes B and G), Sac I and Bam HI (lanes C and H), Sac I and Sma I (lanes D and I) and Sma I (lanes E and J). DNA was subjected to electrophoresis on 1.5% agarose gels and transferred to nitrocellulose filters. Lanes A-E were hybridized to 32P-labeled pβ1HR16 (10’ cpm/μg); lanes F-J were hybridized to 32P-labeled pβ1HB28 (10’ cpm/μg). Hybridization and blot-washing procedures are described in Experimental Procedures. Fragment sizes marked on the figure (kb): 0.78, 0.64, 0.47, 0.44, 0.38.

Figure 3. Transcriptional Direction of the Genes in λCβG2

λCβG2 DNA was digested with the indicated enzymes (singly and in all pairwise combinations), subjected to electrophoresis on agarose gels (0.5-2%), stained and photographed. The restriction map was deduced from the gel photograph, as well as from subclone mapping (for example, see Figure 4). Heavy lines flanked by restriction enzyme symbols: fragments that hybridized to either or both of the 5'- and 3'-specific β-globin sequence probes (see Figure 2). Figure key: restriction sites for Eco RI, Eco RI linker, Hind III, Bam HI, Sac I and Sma I; hybridization to the (5' + central) probe (pβ1HR16), to the 3' probe (pβ1HB28), or to none. Scale bar: 2 kbp.

hybridize to pβ1HR16, indicating that the central Sac I site of locus β1 lies 5' to the large intron. An identical blot was hybridized to the 3'-specific probe, pβ1HB28. This probe hybridized only to those restriction fragments that include the 780 bp Sac I fragment of locus β1 (Figure 2 lanes F-J). These results demonstrate that both β-type globin genes in λCβG2 are transcribed from the same DNA strand in the left-to-right direction relative to the map in Figure 3. In addition, these studies show that the β1 locus represents an entire gene, whereas the β2 region lacks coding sequences homologous to the 3' portion of the adult β-globin gene. DNA-sequence analysis has shown that the Sac I site of locus β2 lies approximately 100 bp 5' to the large intron (at codons specifying amino acids 68 and 69) and that the Eco RI linker interrupts this gene within its major (3') intron (see below and Figure 5).

Identification of the β-Type Globin Genes of λCβG1 and λCβG2

While the adult chicken β-globin gene had been identified by hybridization and R-loop analysis (Dodgson et al., 1979; Ginder et al., 1979), the remaining β-type globin genes could not be definitively identified by these procedures. We have used the DNA-sequencing technique of Maxam and Gilbert (1980) and comparison of the resulting nucleotide sequences to the known chicken β-type globin protein sequences (Matsuda et al., 1973; Chapman et al., 1981; B. Chapman, personal communication) to identify the β-type globin genes in λCβG1 and λCβG2. Restriction maps of each recombinant subclone

were generated by the procedure of Smith and Birnstiel (1976). Figure 4 shows the sequencing strategy used to obtain the identifying (coding) DNA sequence for each of the β-type globin genes, in regions of the amino acid sequence where the β-type genes were most distinct. Subclone pβ2H2 contains the entire β1 locus. The central exon DNA sequence of β1 was determined in order to identify this gene. The β2 gene was similarly identified by analysis of the central exon in subclone pβ2BR10. Plasmid pβ1HR16 contains the 5' and central exons of the adult β-globin gene (locus β3), and central exon DNA sequence was generated to confirm this gene's identity. pβ1HR3 contains 3' exon and 3' noncoding DNA sequences of an embryonic β-type gene (locus β4); these 3' exon sequences were determined for comparison to the chicken embryonic β-type polypeptide sequences.

The nucleotide sequence of the identifying region of each gene is shown in Figure 5. Comparison of the amino acid sequences predicted from these nucleotide sequences to the amino acid sequences of the ρ-, ε- and β-globin polypeptides identified β3 as the adult β-globin gene, β4 as the ε gene, and β1 as the ρ gene. The amino acid sequence predicted from the nucleotide sequence of locus β2 identifies this gene as encoding a β-type globin polypeptide that is distinct from the β, ε and ρ globins. The conservation of the nucleotide sequences at the intron-exon splice junctions (Efstratiadis et al., 1980) and at the codons specifying amino acids involved in heme contacts (Eaton, 1980) leads us to believe that β2 encodes an expressed β-type globin polypeptide. The literature supports the existence of a β-globin polypeptide, denoted βH, that is maximally expressed around hatching (Bruns and Ingram, 1973); we therefore tentatively identify locus β2 as the structural gene for this polypeptide (see Discussion). Formally, however, locus β2 cannot be assigned as the coding sequence for βH, since neither mRNA nor protein sequence data exist for this globin species.
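
The identification step described above (translate a sequenced coding region, then compare the predicted residues with each known polypeptide) can be sketched in a few lines. This is a minimal illustration rather than the procedure actually used: it applies the standard genetic code, assumes the reading frame and alignment are already known, and uses short made-up sequences. The names `toy_adult` and `toy_embryonic` and their residues are hypothetical placeholders, not chicken globin data.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def identity(a, b):
    """Fraction of identical residues over the shared, ungapped length."""
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / n


def best_match(dna, references):
    """Name of the reference peptide most similar to the translated DNA."""
    predicted = translate(dna)
    return max(references, key=lambda name: identity(predicted, references[name]))


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    toy_references = {"toy_adult": "GKVN", "toy_embryonic": "GRVH"}
    toy_dna = "GGTAAAGTGAAC"  # translates to GKVN
    print(translate(toy_dna), best_match(toy_dna, toy_references))
```

A locus whose best identity score is still clearly below 1.0 for every reference, as the text describes for β2, would be flagged as encoding a distinct polypeptide rather than assigned to one of the known chains.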


Figure 4. Sequencing Strategy of β-Type Globin Genes

Plasmid subclones were constructed as described (see Experimental Procedures). pβ1HR16 (locus β3) and pβ1HR3 (locus β4) were derived from λCβG1 (see Figure 1A). Plasmids pβ2H2 (locus β1) and pβ2BR10 (locus β2) were derived from λCβG2 (see Figure 1C). Arrows: direction and the extent of DNA sequence determination. Figure key (restriction sites): Hind III, Bam HI, Eco RI linker, Sac I, HnaI, Pvu II, Ava I, Taq I, MwI, Sma I, Bgl II, Sau3A I, Hae III, Hinf I, Pst I.

Chromosomal Linkage of β-Type Globin Genes

It has been previously demonstrated that the β-type globin genes of rabbit, human and mouse are closely linked within a small fraction of a single chromosome of the organism (Lacy et al., 1979; Jahn et al., 1980; Maniatis et al., 1980). We wished to determine if the chicken β-type globin genes were similarly physically contiguous in the genome. Initially, we sought to demonstrate such linkage through the analysis of sets of overlapping β-type-globin-gene-containing clones. However, after several screenings of the library (approximately 2 × 10^6 pfu), no clones were found that overlapped the chromosomal region between λCβG1 and λCβG2. Since it is possible to map single-copy genes in total chromosomal DNA with the Southern blotting technique with specific hybridization probes (Flavell et al., 1978), we have investigated the physical arrangement of the chicken β-globin genes in this manner.

The restriction enzyme maps of λCβG1 and λCβG2 permit this analysis, since we can prepare hybridization probes specific for λCβG1 and λCβG2 sequences. Under the conditions used, each probe detects genomic DNA fragments containing β-type globin gene sequences, but hybridization is most intense at the completely homologous locus, and less intense when due only to cross-hybridization between β-like coding sequences. This strategy presumes that, if the β-type globin genes of λCβG1 and λCβG2 are closely linked on a single chicken chromosome, restriction mapping with chromosomal DNA Southern blots should indicate such linkage. The linkage of the λCβG1 and λCβG2 β-type globin gene recombinants would be evident if both of the specific hybridization probes derived from these recombinants hybridized equally well to a single genomic DNA fragment (which would logically result from the absence of a restriction enzyme site within the missing chromosomal DNA region).
The hybridization of distinct DNA fragments would be observed on the blots if one or more of a

Figure 5. Comparison of the Nucleotide Sequences of Loci β1, β2, β3 and β4 to the ε-, ρ- and β-Globin Polypeptides

Nucleotide sequences obtained from each subclone (see Figure 4) were aligned with the amino acid sequences of the ε-, ρ- and β-globin polypeptides to identify each locus. β3 DNA sequence from pβ1HR16 was aligned with the β-globin polypeptide (Matsuda et al., 1973) from codons 31-84. β4 DNA sequence was aligned with amino acids 105-144 of the ε-globin polypeptide (B. Chapman, personal communication). β1 DNA sequence was obtained from pβ2H2 and was aligned with amino acids 40-69 of the ρ-globin polypeptide (Chapman et al., 1981). The central-exon amino acid sequence predicted from the nucleotide sequence of locus β2 (derived from subclone pβ2BR10) is shown. The six nucleotides forming each intron-exon splice junction determined are shown in lower case letters. The differences in amino acid sequence between the adult β-globin polypeptide and the remaining β-type globin polypeptides are shown in large type; identities with the adult β-globin protein are not shown.

particular restriction enzyme site existed within this region. Figure 6 shows the results of such an experiment. Calf thymus DNA containing λCβG1 and λCβG2 at a copy number equivalent to that present in total chicken chromosomal DNA was digested with Eco RI and hybridized separately to 32P-labeled pβ1HR16 (the 1.9 kb fragment containing the 5' half of the adult β-globin gene; Figure 6 lane A) and 32P-labeled pβ2BR10 (the 910 bp Bam HI-Eco RI fragment containing the β2 locus; Figure 6 lane G). Each probe detects the expected Eco RI fragments produced from the parental recombinants λCβG1 and λCβG2. Through the adjustment of the blot-washing stringency, the relative hybridization intensities can be varied depending on whether the fragment contains sequence more or less homologous to the adult β-globin or β2-locus probe. Thus the 6.0 kb λCβG1 Eco RI fragment hybridizes strongly to the adult β-globin probe (lane A) and weakly to the β2-locus probe (lane G). Conversely, the 8.9 kb Eco RI fragment of λCβG2 hybridizes strongly when the β2-locus subclone is used as probe (lane G) and weakly when the adult β-globin probe is used (lane A).

The chromosomal Eco RI DNA fragments, as detected by both probes, are both larger than the corresponding Eco RI fragments of the recombinants (Figure 6 lanes A and B, G and H). The intensely hybridizing chromosomal DNA fragment (Figure 6 lane B) corresponding to the 6.0 kb λCβG1 Eco RI fragment is 6.2 kb in length. The chromosomal Eco RI fragment that hybridizes strongly to the β2-locus probe (lane H) is 9.6 kb in length, in comparison with the 8.9 kb λCβG2 Eco RI fragment. These size differences are consistent with the methodology used to construct the library (Maniatis et al., 1978; Dodgson et al., 1979) and the size of the known chromosomal Eco RI fragment containing the adult β-globin gene (6.2 kb; Ginder et al., 1979).

Figure 6. Hybridization of λCβG1- and λCβG2-specific Probes to Chromosomal DNA

λCβG1 and λCβG2 DNAs were digested with Eco RI (lanes A and G), and chromosomal chicken DNA was digested with Eco RI (lanes B and H), Hind III (lanes C and I), Bam HI (lanes D and J), Kpn I (lanes E and K) and Xba I (lanes F and L), subjected to electrophoresis on 0.7% agarose gels and blotted onto nitrocellulose filters. Lanes A-F were hybridized to pβ1HR16 (λCβG1; λ right-arm-adjacent probe). Lanes G-L were hybridized to the λCβG2 λ right-arm-adjacent probe, pβ2BR10. Each probe was labeled with 32P by nick translation to a specific activity of ~10^8 cpm/μg (see Experimental Procedures).

A single hybridizing fragment is observed using Hind III-digested (lanes C and I), Kpn I-digested (lanes E and K) or Xba I-digested (lanes F and L) chromosomal DNA. Since both probes hybridize equally well to these single DNA fragments, DNA sequence homologous to each probe must be present within the fragment represented in each of these digests. The chromosomal Hind III fragment that hybridizes to both probes is 6.5 kb in length (Figure 6 lanes C and I). For this DNA fragment to contain sequences homologous to both pβ2BR10 and pβ1HR16, the fragment endpoints must be the Hind III sites fortuitously found closest to the right linkers in both recombinants (1.9 kb from the λCh4A right arm junction in λCβG1 and 2.9 kb from the λCh4A right arm junction

in λCβG2; Figure 1). This arrangement would place 1.6 kb of chromosomal DNA between λCβG1 and λCβG2, which is missing from the contiguous DNA of the β-type globin gene cluster we report here. The Kpn I and Xba I fragments could not be accurately sized. From the restriction enzyme maps of λCβG1, λCβG2 and λCβG3, we can infer that the Kpn I fragment must be greater than 35 kb in length. Similarly, the Xba I fragment length must exceed 50 kb.

Two bands are observed in Bam HI-digested chromosomal DNA (Figure 6 lanes D and J). These fragments hybridize with inverse intensity depending on the probe used (as seen in the analogous blots with Eco RI). From the restriction maps of λCβG1 and λCβG2 and the hybridization intensities, the 4.5 kb band must contain part of the λCβG1 sequence and the 2.1 kb fragment must contain part of the λCβG2 insert. These bands represent the natural Bam HI fragments, which contain the λCβG1 4.1 kb Bam HI-Eco RI fragment and the 0.91 kb λCβG2 Bam HI-Eco RI fragment present on the subclones used as probes.

From the sizes of the chromosomal Eco RI and Bam HI DNA fragments, there appear to be two Eco RI sites and one Bam HI site within the 1.6 kb of chromosomal DNA between the λCβG1 and λCβG2 inserts. These sites were confirmed and their positions further defined by double digests of chromosomal DNA with Hind III and Eco RI, Hind III and Bam HI, Bam HI and Eco RI and Hind III alone (data not shown). When hybridized to Hind III-Eco RI-digested chromosomal DNA, pβ2BR10 and pβ1HR16 detect two bands of 3.6 and 2.2 kb, in agreement with the Hind III-Eco RI fragment sizes predicted from the restriction map shown in Figure 7. Similarly, these probes detect a doublet of Bam HI-Hind III chromosomal DNA fragments, both approximately 2.2 kb in length, thus placing the single Bam HI site in this region equidistant from the Bam HI site 5' to the βH gene and the Hind III site within the adult β-globin gene.
Although these data support the initial restriction map of this region (Figure 7), we cannot exclude the possibility that additional Eco RI sites exist between the two Eco RI sites identified. The physical linkage of the chicken β-globin genes is shown in Figure 7. These experiments demonstrate that the β-globin genes are closely linked within a region of 39 kb in the physical order 5'...ρ-βH-β-ε...3' and that all four genes are transcribed from the same DNA strand.
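
The logic of the chromosomal blotting experiment can be restated as a small sketch: an enzyme with no site in the uncloned region should give one fragment that both probes detect, and the size of that fragment bounds the gap. The code below is illustrative only. The fragment sizes are the ones quoted in the text, while the dictionary layout, the function names and the 0.1 kb tolerance are assumptions made here. Equal size alone does not prove identity of two fragments; the text additionally relies on both probes hybridizing equally well.

```python
# Most strongly hybridizing chromosomal fragment (kb) for each probe, by enzyme,
# as quoted in the text. Kpn I and Xba I are left out because their single
# shared fragments could not be sized accurately.
BLOT_FRAGMENTS_KB = {
    "Eco RI": {"adult_beta_probe": 6.2, "beta2_probe": 9.6},
    "Hind III": {"adult_beta_probe": 6.5, "beta2_probe": 6.5},
    "Bam HI": {"adult_beta_probe": 4.5, "beta2_probe": 2.1},
}


def shares_fragment(sizes, tolerance_kb=0.1):
    """True if both probes detect a fragment of the same size (assumed tolerance)."""
    return abs(sizes["adult_beta_probe"] - sizes["beta2_probe"]) <= tolerance_kb


def linking_enzymes(fragments):
    """Enzymes whose digest gives one fragment seen by both probes."""
    return [enzyme for enzyme, sizes in fragments.items() if shares_fragment(sizes)]


def gap_kb(shared_fragment_kb, site_to_end_a_kb, site_to_end_b_kb):
    """Uncloned DNA implied by a shared fragment that spans two cloned ends."""
    return round(shared_fragment_kb - site_to_end_a_kb - site_to_end_b_kb, 1)


if __name__ == "__main__":
    print("enzymes with no site in the gap:", linking_enzymes(BLOT_FRAGMENTS_KB))
    # 6.5 kb Hind III fragment; Hind III sites 1.9 kb and 2.9 kb from the clone ends.
    print("estimated gap (kb):", gap_kb(6.5, 1.9, 2.9))
```

With the rounded values quoted in the text, the plain subtraction gives about 1.7 kb, close to the 1.6 kb reported; the small difference presumably reflects rounding of the fragment sizes, which is an assumption here and not a statement from the source. Enzymes that do not appear in the linking list (Eco RI and Bam HI) are the ones inferred to cut within the gap.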

Discussion

Identification of the β-Type Globin Genes

Biochemical and immunological analysis of the embryonic and adult chicken hemoglobins has previously provided a direct estimate of the number of chicken globin genes (Table 1; Bruns and Ingram, 1973; Matsuda et al., 1973; Brown and Ingram, 1974; Moss and

Table 1. Chicken Hemoglobin Polypeptides*

Primitive Cell (Embryonic Globin)      Definitive Cell (Adult Globin)
Hb    α Type    β Type                 Hb    α Type    β Type
P     π         ρ                      A     αA        βA†
P'    π'        ρ
E     αA        ε                      D     αD        βD†
M     αD        (ε')                   H†    αA        βH

* Nomenclature follows that of Brown and Ingram (1974); see also Chapman and Tobin (1979).
† Nomenclature follows that of Moss and Hamilton (1974).

Figure 7. Summary of the Chromosomal Linkage of Chicken β-Type Globin Genes

The restriction enzyme map of the entire β-type globin gene cluster was derived from maps of λCβG1, λCβG2 and λCβG3 and the blot hybridization analysis of chromosomal DNA shown in Figure 6. Closed and open boxes: positions of exon and intron sequences, respectively. Arrows: transcriptional direction of each gene. In the middle are indicated the regions of the β-type globin gene cluster contained in the three genomic recombinants. Figure key (restriction sites): Eco RI linker, Eco RI, Bam HI, Hind III, Kpn I.

Hamilton, 1974; Vandecasserie et al., 1975). From these summarized data, there appear to be four to six β-type globin polypeptides temporally expressed during various stages of avian red cell development, and therefore four to six structural genes. In their analysis of two embryonic hemoglobins, HbM and HbE, Brown and Ingram (1974) were unable to obtain sufficient quantities of separate HbM chains for biochemical analysis. However, by comparative studies of the HbD and HbM peptide maps and those of the separated ε, β and ρ chains, they were able to demonstrate that HbM contains an αD chain and a β-like chain very similar, if not identical, to ε, which they denoted ε' (Table 1). Since separate HbM chains were unavailable, they could not conclusively state whether the embryonic β-like globins ε and ε' were identical.

Earlier studies had also indicated that the β-globin chains of adult hemoglobins HbA and HbD were distinct (Moss and Hamilton, 1974). However, partial amino acid sequence analysis of these polypeptides demonstrated the probable identity of the adult βA and βD chain(s) (Matsuda et al., 1973; Vandecasserie et al., 1975). Moss and Hamilton (1974) analyzed HbH, a trace hemoglobin maximally expressed around hatching time in the adult or definitive red cell line (days 18-22 postfertilization; Bruns and Ingram, 1973). They determined HbH to have a chain composition of (αA)2(βH)2. This is in agreement with the concurrent immunological analysis of Brown and Ingram (1974), who found HbH to be very similar, but not identical, to HbE ((αA)2ε2). Taken together, these several biochemical analyses most strongly support the existence of only four β-like genes (ρ, ε, βH and β).

Comparison of the amino acid sequences predicted from the nucleotide sequences of select regions of the β1, β2, β3 and β4 loci to the amino acid sequences of the ε-, ρ- and β-globin polypeptides (Figure 5) has identified the β1 locus as the ρ gene, the β3 locus as the adult β gene and the β4 locus as the ε

gene. The amino acid sequence predicted from the β2 central-exon DNA sequence indicates that locus β2 encodes a β-type globin polypeptide distinct from ε, ρ and β. Comparison of the central-exon amino acid sequences of ε, ρ and β to those predicted from β2 DNA sequence suggests that the β2 locus encodes a polypeptide more like adult β globin than either of the embryonic globins. This is consistent with our previous blot hybridization analysis (Engel and Dodgson, 1978; described above), which indicates that locus β2 encodes an adult-like β-globin gene (see also Stalder et al., 1980). Taken together, these findings suggest that locus β2 encodes the βH polypeptide.

The amino acid sequence of βH is unknown, and our identification of locus β2 as the structural gene for βH must therefore be considered tentative. However, the alternative, that the β2 locus encodes the ε' polypeptide, is even less likely, since locus β2 would then be expected to show more stringent hybridization to embryonic-globin cDNA than to adult-globin cDNA; in fact, locus β2 hybridizes more strongly to adult β-globin probes (see Stalder et al., 1980). Furthermore, if the β2 locus were the structural gene for ε', from the immunological data of Brown and Ingram (1974) one might expect to see more DNA sequence or amino acid sequence homology to locus β4 (encoding embryonic gene ε). In fact, the DNA sequence of β2 predicts an amino acid sequence more similar to β than to ε (Figure 5).

However, due to the lack of structural data on globin βH, we cannot rule out the possibility that locus β2 is a pseudogene. DNA-sequence analysis has demonstrated that the nucleotide sequences at both splice junctions surrounding the β2 locus central exon and at those codons specifying amino acids involved in heme contacts are conserved (Figure 5; Eaton, 1980; Efstratiadis et al., 1980).
Although there are several nucleotide substitutions that result in amino acid changes within the central exon, only one of these changes results in an amino acid charge change between the β2-predicted polypeptide sequence and the adult β-globin polypeptide. Thus it is likely that β2 represents an expressed adult-like β-globin gene, such as βH,


rather than a pseudogene. βH might be more analogous to a fetal gene than an adult gene because of its elevation in relative concentration at about the time of hatching (Bruns and Ingram, 1973) and its subsequent depletion in the definitive red cell population.

We have also compared the nucleotide sequence of the adult β-globin gene (codons 31-84; Figure 5) to the corresponding sequences in reported β-globin cDNA clones. This DNA sequence corresponds most closely to that of the recombinant pCGβ3, differing at only one nucleotide (3rd position in codon 76; Richards et al., 1979). In comparing it with pHb1001, we find 12 nucleotide differences (Salser et al., 1979). Four of these nucleotide differences occur at positions in pHb1001 that specify isoleucines, at positions 49 and 50. In the amino acid sequence of the ε, ρ and β polypeptides these residues are serines (Matsuda et al., 1973; B. Chapman, personal communication); furthermore, our DNA sequence of the βH gene indicates that these two residues would also be serines (Figure 5). Since we have identified only four β-like globin genes in the gene cluster, these nucleotide differences between pHb1001 and locus β3 must be due to sequencing errors, errors in reverse transcription or, most unlikely, differences in sources of hens and nucleic acid samples.

Physical Linkage of Chicken β-Like Globin Genes

We have analyzed three independent genomic chicken DNA recombinants, λCβG1, λCβG2 and λCβG3 (see above; Dodgson et al., 1979). λCβG1 contains the adult β-globin gene and a closely linked embryonic β-type globin gene, which we here have identified as the ε gene (Figure 7). The adult β-globin gene is located 5' (in transcriptional sense) to the embryonic ε gene. No other regions of β-type globin gene complementarity were found in the approximately 10 kb of DNA distal to the 3' end of the ε gene, suggesting that this gene forms the 3' end of the chicken β-globin gene cluster.
Our analysis of λCβG2 demonstrated that it also contains two closely linked β-type globin genes that are transcribed from the same DNA strand (Figure 3). In this case, an embryonic β-type gene, ρ, is located 5' to an adult-like β-globin gene, presumably the βH gene (Figure 5). We have analyzed approximately 15 kb of DNA 5' to the ρ gene (in λCβG3; Figure 1E) and find that this adjacent DNA is devoid of any additional β-globin-complementary sequences. In genomic blotting experiments (Figure 6), we have demonstrated that the chicken β-type globin genes are physically contiguous within a single chicken chromosome. However, unlike mammalian β-globin gene clusters, the chicken β-type globin genes are physically arranged such that the embryonic ρ gene is at the 5' end of the cluster, the βH and adult β genes are centrally located and the embryonic ε gene occupies the extreme 3' end of the cluster. If the same mechanism for control of globin gene expression is operating in the activation of the β-globin genes of vertebrates, the physical arrangement of chicken embryonic and adult β-like globin genes would imply that the 5' ... embryonic-fetal-adult ... 3' β-type gene arrangement observed in mammals is not necessary for their proper temporal order of expression. Alternatively, the evolved mechanism for gene activation may differ in avians and mammals. At the present time we cannot distinguish between these possibilities.

Finally, our attempts to isolate recombinants that overlapped λCβG1 and λCβG2 (and thus directly demonstrate physical linkage of these sets of β-type globin genes) were repeatedly unsuccessful. This failure is possibly due to the methodology used to create the chicken genomic library. As a consequence of using partial Hae III and Alu I restriction enzyme digestion of chicken chromosomal DNA to generate 15-21 kb random DNA fragments, regions of chicken DNA that contain a preponderance of Alu I and Hae III sites could be underrepresented in this library. Alternatively, recombinants that contain the overlapping region could be unusually unstable and therefore lost during the recombinant phage amplification. Other investigators have reported difficulty in obtaining certain recombinants at the statistically expected frequency from our library (Perrin et al., 1979), and, in particular, in isolating the 1.6 kb of DNA represented by the missing chromosomal segment between λCβG1 and λCβG2 in other chromosomal libraries (H. Martinson, personal communication).

Formally, there could exist a fifth β-type globin gene (possibly encoding the putative ε' polypeptide). However, we feel it is unlikely that the 1.6 kb of the β-globin gene cluster not represented in our clones encodes another β-type globin gene, for the following reason.
The remaining portion of the βH gene (consisting of large intron, 3' coding sequence and 3' untranslated sequences abutting the right linker of λCβG2; Figure 1) would occupy approximately 500 to 600 bp of this missing region, leaving 1.0 to 1.1 kb to encode another β-globin polypeptide. However, there is actually even less available DNA than this suggests, since Ginder et al. (1979) have analyzed 275 bp of this region (chromosomal DNA overlapping but not present in λCβG1) and do not find any evidence for the existence of another globin-complementary locus. Thus approximately 0.7 kb of DNA (the 1.0 to 1.1 kb above less these 275 bp) is available to encode a fifth gene within this region; this space is obviously not sufficient if the putative gene assumes the average size (approximately 1.5 kb) of the other chicken β-type globin genes.

Experimental Procedures

DNA Preparation
Phage DNA was prepared as described previously (Dodgson et al., 1979). Bacteria harboring recombinant subclone plasmids were grown to 0.5 A600/ml in L broth plus ampicillin (10 µg/ml), and amplified overnight with chloramphenicol (250 µg/ml). Plasmid DNAs were prepared by Hirt lysis (Hirt, 1967) and by banding in CsCl-ethidium bromide (Radloff et al., 1967). The band of supercoiled DNA was collected, extracted 6 times with n-butanol, diluted 1:3 with water and precipitated with an equal volume of isopropanol. The pellet was resuspended in 0.25 M NaCl, 10 mM Tris-HCl (pH 7.5) and 1 mM EDTA and was maintained as an ethanol precipitate at -20°C until further use.

Preparation of Plasmid Subclones Containing Fragments of β-Type Globin Genes
λCβG1 and λCβG2 DNA were digested to completion with a variety of restriction endonucleases. For λCβG1, these were Hind III and Eco RI or Hind III and Bam HI. λCβG2 DNA was cleaved with Hind III or Bam HI and Eco RI. A fourfold excess of restriction nuclease-cleaved λ recombinant DNA was mixed with pBR322 DNA (Sutcliffe, 1978), cleaved with the same enzyme(s) (and subsequently treated with calf intestine alkaline phosphatase), in 0.03 M Tris-HCl (pH 8.0), 4 mM MgCl2, 1.2 mM EDTA, 10 mM dithiothreitol and 0.06 mg/ml gelatin to a final DNA concentration of 10 µg/ml. Fresh ATP was added to a final concentration of 1 mM. T4 DNA ligase was then added and the mixture was incubated at 15°C overnight. E. coli HB101 was transformed with a portion of the ligation mixture by a modification of the Hanahan procedure initially developed for χ1776 (D. Hanahan, personal communication). Transformants were selected by differential drug-resistance characteristics (in all cases here, ampicillin resistance and tetracycline sensitivity). Positive transformants were analyzed by restriction enzyme digestion and blot hybridization of DNA prepared by the rapid plasmid isolation technique of Klein et al. (1980).

DNA Sequencing
The chemical-degradation procedure of Maxam and Gilbert (1980) was used without modification. Banded plasmid DNA was further purified by exclusion on Sepharose 2B prior to restriction enzyme cleavage and kinase labeling. Labeling with T4 polynucleotide kinase and isolation of singly end-labeled DNA fragments were as described by Maxam and Gilbert (1980). In some cases, DNA fragments were isolated by a modification of the procedure of Tabak and Flavell (1978).

Miscellaneous
Restriction enzyme digests were performed according to suppliers' instructions. Agarose gel electrophoresis, blotting, hybridizations and nick translations were as previously described (Engel and Dodgson, 1978, 1980). Recombinant plasmids were mapped by the partial digest procedure of Smith and Birnstiel (1976). All manipulations of bacteria and phage containing recombinant DNA molecules were in accordance with current NIH guidelines.

Materials
Restriction enzymes were obtained from New England Biolabs, Bethesda Research Labs and Biotec. T4 polynucleotide kinase was obtained from P. L. Biochemicals, Inc. or New England Nuclear. T4 DNA ligase was obtained from New England Biolabs.
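The partial digest mapping cited under Miscellaneous above (Smith and Birnstiel, 1976) rests on a simple readout: when a fragment labeled at one end only is partially digested, every labeled product extends from the labeled end to one restriction site, so the sizes of the labeled bands give the site positions directly. The Python sketch below illustrates only that logic. The function names and the band sizes are hypothetical and are not data from this study.

```python
def site_positions_from_labeled_bands(band_sizes_bp, fragment_length_bp):
    """Infer restriction-site positions (bp from the labeled end).

    Each labeled partial-digest product spans the labeled end to one site,
    so its size equals that site's position. The undigested full-length
    band is excluded because it marks the far end, not a site.
    """
    return sorted({s for s in band_sizes_bp if 0 < s < fragment_length_bp})


def complete_digest_fragments(site_positions_bp, fragment_length_bp):
    """Fragment sizes expected if every mapped site were cut."""
    bounds = [0] + list(site_positions_bp) + [fragment_length_bp]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical labeled band sizes from a 2000 bp end-labeled fragment.
bands = [2000, 350, 1200, 900]
sites = site_positions_from_labeled_bands(bands, 2000)
print(sites)
print(complete_digest_fragments(sites, 2000))
```

Comparing the second list with an actual complete digest is one way to check a map built this way, since the interval sizes must sum to the fragment length.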

Acknowledgments

We thank B. Chapman, A. Maxam and T. Maniatis for access to work prior to publication and especially thank H. Martinson and B. Chapman for informative discussions. We gratefully acknowledge the gift of an adult β-globin cDNA clone prepared from nonanemic hens from P. Chambon. M. D. was partially supported by a Northwestern University Fellowship. J. D. E. acknowledges support by NIH and the Leukemia Research Foundation. J. B. D. received grant support from NIH. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received January 1, 1981; revised March 16, 1981

References

Brown, J. L. and Ingram, V. M. (1974). Structural studies on chick embryonic hemoglobins. J. Biol. Chem. 249, 3960-3972.

Bruns, G. A. and Ingram, V. M. (1973). The erythroid cells and haemoglobins of the chick embryo. Phil. Trans. Royal Soc. B 266, 225-305.

Chapman, B. and Tobin, A. J. (1979). Distribution of developmentally regulated hemoglobins in embryonic erythroid populations. Dev. Biol. 69, 375-387.

Chapman, B., Tobin, A. J. and Hood, L. E. (1981). Complete amino acid sequence of the major early embryonic beta-like globin in chickens. J. Biol. Chem., in press.

Dodgson, J. B., Strommer, J. and Engel, J. D. (1979). Isolation of the chicken β-globin gene and a linked embryonic β-like globin gene from a chicken DNA recombinant library. Cell 17, 879-887.

Eaton, W. A. (1980). The relationship between coding sequences and function in haemoglobin. Nature 284, 183-185.

Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O'Connell, C., Spritz, R. A., DeRiel, J. K., Forget, B. G., Weissman, S. M., Slightom, J. L., Blechl, A. E., Smithies, O., Baralle, F. E., Shoulders, C. C. and Proudfoot, N. J. (1980). The structure and evolution of the human β-globin gene family. Cell 21, 653-668.

Engel, J. D. and Dodgson, J. B. (1978). Analysis of the adult and embryonic chicken globin genes in chromosomal DNA. J. Biol. Chem. 253, 8239-8246.

Engel, J. D. and Dodgson, J. B. (1980). Analysis of the closely linked adult chicken α-globin genes in recombinant DNAs. Proc. Nat. Acad. Sci. USA 77, 2596-2600.

Flavell, R. A., Kooter, J. M., DeBoer, E., Little, P. F. R. and Williamson, R. (1978). Analysis of the β-δ-globin gene loci in normal and Hb Lepore DNA: direct determination of gene linkage and intergene distance. Cell 15, 25-41.

Ginder, G., Wood, W. I. and Felsenfeld, G. (1979). Isolation and characterization of recombinant clones containing the chicken adult β-globin genes. J. Biol. Chem. 254, 8099-8102.

Hardison, R. C., Butler, E. T., III, Lacy, E., Maniatis, T., Rosenthal, N. and Efstratiadis, A. (1979). The structure and transcription of four linked rabbit β-like globin genes. Cell 18, 1285-1297.

Hirt, B. (1967). Selective extraction of polyoma DNA from infected mouse cell cultures. J. Mol. Biol. 26, 365-369.

Jahn, C. L., Hutchison, C. A., III, Phillips, S. J., Weaver, S., Haigwood, N. L., Voliva, C. F. and Edgell, M. H. (1980). DNA sequence organization of the β-globin complex in the BALB/c mouse. Cell 21, 159-168.

Jeffreys, A. J., Wilson, V., Wood, D., Simons, J. P., Kay, R. M. and Williams, J. G. (1980). Linkage of adult α- and β-globin genes in X. laevis and gene duplication by tetraploidization. Cell 21, 555-564.

Klein, R. D., Selsing, E. and Wells, R. D. (1980). A rapid microscale technique for isolation of recombinant plasmid DNA suitable for restriction enzyme analysis. Plasmid 3, 88-91.

Lacy, E., Hardison, R. C., Quon, D. and Maniatis, T. (1979). The linkage arrangement of four rabbit β-like globin genes. Cell 18, 1273-1283.

Maniatis, T., Hardison, R. C., Lacy, E., Lauer, J., O'Connell, C., Quon, D., Sim, G. K. and Efstratiadis, A. (1978). The isolation of structural genes from libraries of eucaryotic DNA. Cell 15, 687-701.

Maniatis, T., Fritsch, E. F., Lauer, J. and Lawn, R. M. (1980). The molecular genetics of human hemoglobins. Ann. Rev. Genet. 14, 145-178.

Matsuda, G., Maita, T., Mizuno, K. and Ota, H. (1973). Amino acid sequence of a β chain of AII component of adult chicken haemoglobin. Nature New Biol. 244, 244.

Maxam, A. M. and Gilbert, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. Meth. Enzymol. 65, 499-560.

Moss, B. A. and Hamilton, E. A. (1974). Chicken definitive erythrocyte haemoglobins. Biochim. Biophys. Acta 371, 379-391.

Patient, R. K., Elkington, J. A., Kay, R. M. and Williams, J. G. (1980). Internal organization of the major adult α- and β-globin genes of X. laevis. Cell 21, 565-573.

Perrin, F., Cochet, M., Gerlinger, P., Cami, B., Le Pennec, J. P. and Chambon, P. (1979). The chicken conalbumin gene: studies of the organization of cloned DNAs. Nucl. Acids Res. 6, 2731-2748.

Radloff, R., Bauer, W. and Vinograd, J. (1967). A dye-buoyant-density method for the detection and isolation of closed circular duplex DNA: the closed circular DNA in HeLa cells. Proc. Nat. Acad. Sci. USA 57, 1514-1521.

Richards, R. I., Shine, J., Ullrich, A., Wells, J. R. E. and Goodman, H. M. (1979). Molecular cloning and sequence analysis of adult chicken β-globin cDNA. Nucl. Acids Res. 7, 1137-1146.

Salser, W. A., Cummings, I., Liu, A., Strommer, J., Padayatty, J. and Clarke, P. (1979). Analysis of chicken globin cDNA clones: discovery of a novel chicken alpha globin gene induced by stress in young chickens. In Cellular and Molecular Regulation of Hemoglobin Switching, G. Stamatoyannopoulos and A. Nienhuis, eds. (New York: Grune and Stratton), pp. 621-643.

Slightom, J. L., Blechl, A. E. and Smithies, O. (1980). Human fetal Gγ- and Aγ-globin genes: complete nucleotide sequences suggest that DNA can be exchanged between these duplicated genes. Cell 21, 627-638.

Smith, H. O. and Birnstiel, M. L. (1976). A simple method for DNA restriction site mapping. Nucl. Acids Res. 3, 2387-2399.

Stalder, J., Groudine, M., Dodgson, J. B., Engel, J. D. and Weintraub, H. (1980). Hb switching in chickens. Cell 19, 973-980.

Sutcliffe, J. G. (1978). Complete nucleotide sequence of the Escherichia coli plasmid pBR322. Cold Spring Harbor Symp. Quant. Biol. 43, 77-90.

Tabak, H. F. and Flavell, R. A. (1978). A method for the recovery of DNA from agarose gels. Nucl. Acids Res. 5, 2321-2332.

Vandecasserie, C., Paul, C., Schnek, A. G. and Leonis, J. (1975). Probable identity of the β chain from the two chicken hemoglobins. Biochimie 57, 843-844.

Note Added in Proof

In Figure 3, two Sac I sites were omitted from the map of λCβG2. These lie 170 bp and 820 bp to the left of the Bam HI site in locus β2.
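The two offsets in this note also fix the spacing of the omitted sites. The short Python sketch below works through that arithmetic. It assumes that no further Sac I or Bam HI sites lie between these three positions, which the note does not state, and the names used are illustrative only.

```python
# Offsets (bp) of the two omitted Sac I sites, measured leftward from the
# Bam HI site in locus beta-2, as given in the note.
SAC_I_OFFSETS_BP = (170, 820)


def expected_fragments(offsets_bp):
    """Fragment sizes implied by two Sac I sites left of one Bam HI site.

    Assumes no additional Sac I or Bam HI sites fall between them.
    """
    near, far = sorted(offsets_bp)
    return {
        "Sac I to Sac I": far - near,   # released by Sac I alone
        "Sac I to Bam HI": near,        # released by a Sac I + Bam HI digest
    }


fragments = expected_fragments(SAC_I_OFFSETS_BP)
print(fragments)
```

Under that assumption the two sites are 650 bp apart, and the nearer one lies 170 bp from the Bam HI site.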