Structural analysis of templates and RNA polymerase III transcripts of Alu family sequences interspersed among the human β-like globin genes


Gene, 13 (1981) 185-196


Elsevier/North-Holland Biomedical Press

(Cloned hemoglobin genes; in vitro transcription; recombinant DNA; repetitive DNA)

Craig H. Duncan, Pudur Jagadeeswaran, Richard R.C. Wang and Sherman M. Weissman Department of Human Genetics, Yale University School of Medicine, 333 Cedar Street, New Haven, CT 06511 (U.S.A.)

(Received November 24th, 1980) (Accepted December 19th, 1980)

SUMMARY

Cloned DNA fragments from the human β-like globin genomic region can be transcribed in vitro by RNA polymerase III. We have investigated the structure of two templates and their transcripts by DNA sequencing, size fractionation of ribonuclease T1-generated oligonucleotides, and ribonuclease H digestion of RNA:DNA duplexes. The data indicate that repetitive DNA sequences, members of the Alu family of interspersed 300 bp reiterated DNA, are imbedded in both templates. The RNAs transcribed from them are composed of an entire Alu family sequence at their 5' ends linked to 3' ends of non-repetitive sequence.

INTRODUCTION

DNA from higher organisms, unlike bacterial DNA, contains sequences which are reiterated many times in the genome (Britten and Kohne, 1968). One class of repeated sequence elements is of particular interest because its members are interspersed among the unique DNA sequences which compose genes and because its members are homologous to sequences found in hnRNA (for review see Davidson et al., 1977). By using recombinant DNA technology, it has become possible to isolate members of

Abbreviations: bp, base pairs; hnRNA, heterogeneous nuclear RNA; HPFH, hereditary persistence of fetal hemoglobin; kb, kilobase pairs; T1, RNase T1.

this group and analyse their structure by hybridization or sequence analysis (Scheller et al., 1977). Because these elements are interspersed and repeated more than 10^5 times per genome, it was expected that they would be found in the neighborhoods of most, if not all, genes, perhaps playing some role in gene regulation. Our studies are directed towards investigating the structure and function of intergenic repeated sequences in the human globin genetic complex. These loci have emerged as major model systems for studies of chromosomal structure and gene action (Bunn et al., 1977; Weatherall and Clegg, 1979). Because of the abundance and accessibility of globins and because of the clinical significance of hemoglobin disorders, a number of genetic lesions have been discovered at these loci (for review see Bank et al.,

0378-1119/81/0000-0000/$02.50 © Elsevier/North-Holland Biomedical Press

1980). The molecular consequences of these mutations have been defined by biochemical studies of globin protein structure and more recently by hybridization of purified nucleic acid probes to DNA from affected individuals (Van der Ploeg et al., 1980; Goossens et al., 1980). In this manner disorders in the relative abundance or temporal expression of globins have been linked to specific deletions of DNA sequences within the regions coding for the globin genes. Most recently, it has been possible to clone regions of chromosomal DNA containing the globin genes, making possible their detailed structural analysis (Fritsch et al., 1980; Kaufman et al., 1980; Lauer et al., 1980).

Using the cloned DNA molecules as templates, we have discovered that some of these sequences function as RNA polymerase III templates in an in vitro transcription system prepared from extracts of cultured human cells (Duncan et al., 1979). The templates are interspersed among the globin genes and flank clusters of co-temporally expressed genes. The structure of these templates was investigated by sequence analysis of the region 5' to the Gγ (fetal) gene. While this work was in progress, we learned of the studies of Houck et al. (1979) concerning the Alu family, the predominant class of interspersed, repetitive DNA in the human genome. Comparison of DNA sequence data from our laboratories demonstrated conclusively that the RNA polymerase III template adjacent to the Gγ gene contained a member of this class of 300 bp repeated elements (Jelinek et al., 1980). Other Alu family elements are found throughout the human β-globin-like region. Their locations have been determined by hybridization studies and by nucleotide sequence analysis (Baralle et al., 1981; Fritsch et al., 1980; Coggins et al., 1980; P. Jagadeeswaran, unpublished data). These repeats are also found in the proximity of other human genes. Alu family members are located 6000 bp from the 3' end of the human insulin gene (Bell et al., 1980), and flank pseudogenes for the small nuclear RNAs U1, U2, and U6 (Denison et al., 1981; C.H. Duncan, unpublished data).

We have completed sequence studies on two templates, one 5' to the Gγ globin gene and one 5' to the δ globin gene. Using these DNA sequence data, it was possible to design experiments to localize, to within a few bases, the initiation and termination points of the in vitro transcripts. As presented below, these RNAs are composites of two different kinds of sequence, consisting of an entire Alu family sequence at their 5' ends coupled to a 3' portion which is not conserved in length or sequence.

MATERIALS AND METHODS

(a) Cells

The human HeLa cell line was obtained from J. Bertino and was maintained as a suspension culture at 37°C in Joklik's modified minimal essential medium (F13, GIBCO) supplemented with 7% horse serum. Escherichia coli strain HB101, harboring recombinant DNA plasmids, was cultured in L broth (10 g tryptone, 5 g yeast extract and 5 g NaCl per liter) containing tetracycline (15 mg/l) or ampicillin (50 mg/l).

(b) DNA

The E. coli strain harboring the plasmid R3.1 was supplied by T. Maniatis. The plasmid A36 has been described previously (Duncan et al., 1979). Plasmid DNA was purified as previously described (Duncan et al., 1979).

(c) Nucleases

Restriction endonucleases were purchased from New England Biolabs and Bethesda Research Labs and used in accordance with the manufacturers' instructions. Ribonuclease H from E. coli was purchased from Enzo Biochemical Corp. Ribonuclease T1 was purchased from Seikagaku Kogyo Co. Ltd.

(d) Nucleic acid analysis

DNA from plasmids was digested with restriction endonucleases and fractionated on polyacrylamide or agarose gels. The fragments were eluted from gel slices and labeled at their 5'-termini with [32P]ATP according to established procedures (Lillehaug and Kleppe, 1975). The fragments were then cleaved by a second restriction enzyme, fractionated on gels and eluted. Sequence analysis was

carried out by two methods: chemical degradation (Maxam and Gilbert, 1977) or nick translation in the presence of dideoxynucleotide triphosphates as chain terminators (Maat and Smith, 1978).

In vitro RNA transcripts were prepared as previously described (Duncan et al., 1979) using the transcriptional system developed by Wu (1978).

Hybridization of RNA to purified DNA restriction fragments was performed for 1-3 h either at 50°C in a solvent composed of 80% formamide, 0.4 M NaCl and 0.04 M Pipes buffer, pH 6.5 (Casey and Davidson, 1977) or at 65°C in 0.3 M NaCl and 0.03 M Na citrate. Hybridization reactions were then diluted 10-fold with water, and nucleic acids were precipitated by the addition of 2 vols. of ethanol and chilling at -70°C. After a second ethanol precipitation, the nucleic acids were dissolved in 5 µl of buffer and treated with RNase H as described (Donis-Keller, 1979). The products of DNA sequencing reactions or RNase H digestions were fractionated on gels containing 7 M urea and 5 to 10% polyacrylamide in 0.04 M Tris-borate buffer (pH 8.3).

RESULTS

(a) Localization of templates and DNA sequence analysis

A map of the DNA region containing the human γ, β, and δ globin genes, along with details of two RNA polymerase III templates, is shown in Fig. 1. Our previous study (Duncan et al., 1979) indicated that the A36 plasmid (containing a 7.1 kb EcoRI fragment which includes the 5' portion of the Gγ gene) contained one template. Hybridization experiments indicated that the template was bounded by BamHI and SacI sites which are 870 bp apart. The nucleotide sequence established for this fragment is shown in Fig. 2A.

Analysis of the template region 5' to the δ gene was made more difficult by the presence of two repeated sequences in the 3.1 kb EcoRI fragment positioned 5' to the δ coding sequence (Fritsch et al., 1980; P. Jagadeeswaran, unpublished data). When R3.1, a plasmid containing this fragment, was used as a template for in vitro transcription, only one major RNA product was synthesized. When the EcoRI-BglII fragment of this plasmid was subcloned, the resulting plasmid directed the synthesis of long RNAs which were not terminated in the same manner as they were when the intact EcoRI fragment was used as a template (data not shown). This implies that the BglII site lies within the template. This finding, coupled with the results described below, allowed us to pinpoint the template to a region of 490 bp containing the repeated sequence which is closest to the δ gene. This sequence is shown in Fig. 2B. The flanking DNA sequences were also determined and will be presented in a future publication (P. Jagadeeswaran and S.M. Weissman, manuscript in preparation).

[Fig. 1: map with rows labeled Kb, Genes, Alu Family and RNA Templates.]

Fig. 1. Map of human DNA including the fetal and adult β-like globin genes. The top scale denotes DNA length in kb. The second line shows the locations of the Gγ and Aγ fetal globin genes along with the δ and β adult globin genes (black rectangles). The third line (black rectangles and arrows) shows the locations of Alu family repetitive DNA in this region. The length of these sequences has been exaggerated 2-fold for clarity. The bottom line represents an expanded view of the two RNA polymerase III templates described herein. The cross-hatched area represents Alu family DNA while the solid black area of the rectangles represents non-repetitive DNA sequences. A few selected restriction enzyme sites are shown. Abbreviations: H, [illegible]III; Alu, AluI; Sac, SacI.

(A)

-50 -40 -30 -20 -10 1 10 20 30 40
GGATCCTAGATATTCCTTAGTCTGAGGAGGAGCAATTAAGATTCACTTGTTTAGAGGCTGGGAGTGGTGGCTCACGCCTGTAATCCCAGAATTTTGGGAG
[Start transcript; Start Alu family sequence]

50 60 70 80 90 100 110 120 130 140
GCCAAGGCAGGCAGATCACCTGAGGTCAAGAGTTCAAGACCAACCTGGCCAACATGGTGAAATCCCATCTCTACAAAAATACAAAAATTAGACAGGCATG

150 160 170 180 190 200 210 220 230 240
ATGGCAAGTGCCTGTAATCCCAGCTACTTGGGAGGCTGAGGAAGGAGAATTGCTTAAACCTGGAAGGCAGGAGTTGCAGTGAGCCGAGATCATACCACTG

250 260 270 280 290 300 310 320 330 340
CA•TCCAGCCTGGGTGA•AGAACAAGACTCTGTCTCAAAAAAAAAAAAGAGAGATTCAAAAGATTCACTTGTTTAGGCCTTAGCGGGCTTGAACACCAGT
[End Alu family sequence]

350 360 370 380 390 400 410 420 430 440
CT•TGACACATTCTTAAAGGTCAGGCTCTACAAATGGAACCCAACCAGACTCTCAGATATGGCCAAAGATCTATACACACCCATCTCACAGATCCCCTAT

450 460 470 480 490 500 510 520 530 540
CTTAAAGAGACCCTAATTTGGGTTCACCTCAGTCTCTATAATCTGTACCAGCATACCAATAAAAATCTTTCTCACCCATCCTTAGATTGAGAGAAGTCAC

550 560 570 580 590 600 610 620 630 640
TTATTATTATGTGAGTAACTGGAAGATACTGATAAGTTGACAAATCTTTTTCTTTCCTTTCTTATTCAACTTTTATTTTAACTTCCAAAGAACAAGTGCA
[End transcript]

650 660 670 680 690 700 710 720 730 740
ATATGTGCAGCTTTGTTGCGCAGGTCAACATGTATCTTTCTGGTCTTTTAGCCGCCTAACACTTTGAGCAGATATAAGCCTTACACAGGATTATGAAGTC

750 760 770 780 790 800 810
TGAAAGGATTCCACCAATATTATTATAATTCCTGTCAACCTGATAGGTTAGGGGAAGGTAGAGCTC

(B)

10 20 30 40 50 60 70 80 90 100
TGGCTGGATGCGGTGGCTCAGGCTTGTAAACCCAGCACTTTGGGAGGCCAAGGCAGGCAGATCACTTGAGGTCAGGAGTTCAAGACCAGCCTGACCAACA
[Start transcript; Start Alu family sequence]

110 120 130 140 150 160 170 180 190 200
TGGTGAAACCCCAT•TCTACTAAAAATACAAAATCAGCCGGG•GTGTGGTGCATGC•TGCAGTCC•AGCTATT•AGGTGGCTGAGGCAGGAGACTTGCTT

210 220 230 240 250 260 270 280 290 300
GAAC•CAGGAGGCAGAGGTTGCGGTGAGCCTAGATTGCAC•ATTGCA•TCTAGCTTGGGCAATAGGGATGAAACTCCAT•TCAGAAGAGAAAAGAAAAAA
[End Alu family sequence]

310 320 330 340 350 360 370 380 390 400
AGACCTTATTCTGTTATACAAATCCTCTCAATGCAATCCATATAGAATAAACATGTAACCAGATCTCCCAATGTGTAAAACCATTTCAGGTAGAACAGAA

410 420 430 440 450 460 470 480 490
TTAAAGTGAAAAGCCAAGTCTTTGGAATTAACAGACAAAGATCAAATAACAGTCCTCATGGCCTTAAGAATTTACCTAACATTTTTTTT
[End transcript]

Fig. 2. (A) DNA sequence of the RNA polymerase III template and of flanking sequences positioned 5' to the Gγ gene (A36 plasmid). The sequence reads 5' to 3' from left to right. Initiation and termination areas of RNA transcription, and the boundaries (start and end) of the Alu family sequences are marked. The 17 bp direct repeats flanking Alu family DNA are boldly underlined. (B) DNA sequence of the RNA polymerase III template positioned 5' to the δ gene.
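The 17 bp direct repeats mentioned in this caption can be located by a simple string comparison. The following Python sketch is illustrative only and is not the authors' procedure: it reports the longest exact substring shared by a stretch of 5' flank and a stretch of 3' flank. The function name `longest_common_substring`, the flank boundaries, and the two excerpt strings (copied from the Fig. 2A sequence as transcribed here) are assumptions made for this example.

```python
def longest_common_substring(a: str, b: str) -> str:
    """Return the longest exact substring shared by a and b (first one found)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)  # match lengths for the previous row of the DP table
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                cur[j] = prev[j - 1] + 1  # extend the diagonal run of matches
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]


# Assumed excerpts of Fig. 2A (A36 plasmid):
FLANK_5 = "GGAGCAATTAAGATTCACTTGTTTAG"  # 26 bases immediately 5' to position 1
FLANK_3 = "GAGAGATTCAAAAGATTCACTTGTTTAGGCCTTAGCGGGC"  # starts after the 12-adenosine run

repeat = longest_common_substring(FLANK_5, FLANK_3)
print(repeat, len(repeat))
```

With these excerpts the shared substring is 17 bases long, which agrees with the repeat length stated in the caption. An exact-match search like this would miss imperfect repeats such as the 19-base repeat with one mismatch reported for the insulin gene region.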

(b) Alu family DNA sequences

As previously reported, the sequence in Fig. 2A contains a region which is 80% homologous to the sequence of the Alu family DNA clone BLUR8 (Jelinek et al., 1980; Rubin et al., 1980). The sequence of the R3.1 DNA (Fig. 2B) also contains a region which is homologous to BLUR8 DNA. The two sequences share no discernible homology 3' to the oligo(A) tracts at positions 283-294 (Fig. 2A) and 290-300 (Fig. 2B). Neither is there any homology between the DNA sequences 5' to position 1. As shown in Fig. 3, the sequences are 80% homologous and match well throughout a region of 294 bp. In their study of the DNA sequences flanking the human

[Match markers between the paired rows are not reproduced.]

10 20 30 40 50 60 70 80 90 100
(A) TGGCTGGATGCGGTGGCTCAGGCTTGTAAACCCAGCACTTTGGGAGGCCAAGGCAGGCAGATCACTTGAGGTCAGGAGTTCAAGACCAGCCTGACCAACA
(B) AGGCTGGGAGTGGTGGCTCACGCCTGTAATCCCAGAATTTTGGGAGGCCAAGGCAGGCAGATCACCTGAGGTCAAGAGTTCAAGACCAACCTGGCCAACA
10 20 30 40 50 60 70 80 90 100

110 120 130 140 150 160 170 180 190
(A) TGGTGAAACCCCATCTCTACTAAAAATACAAAA•TCAGCCGGGCGTGTGGTGCA••TGCCTGCAGTCCCAGCTATTCAGGTGGCTGAGGCAGGAGACTTG
(B) TGGTGAAATCCCATCTCTAC-AAAAATACAAAAATTAGACAGGCATGATG-GCAAGTGCCTGTAATCCcAGCTACTTGGGAGGCTGAGGAAGGAGAATTG
110 120 130 140 150 160 170 180 190

200 210 220 230 240 250 260 270 280 290
(A) CTTGAACCCAGGAGGCAGAGGTTGCGGTGAGCCTAGATTGCACCATTGCACTCTAGCTTGGGCAATAGGGATGAAACTCCATCTCAGAAGAGAAAAG
(B) CTTAAAC•TGGAAGGC•GGAGTTGCAGTGAGCCGAGATCATACCACTGCACTCCAGCCTGGGTGACAGA•ACAAGACTCTGTCTCAAAAAAAAAAAA
200 210 220 230 240 250 260 270 280 290

Fig. 3. Comparison of the two Alu family DNA sequences found in RNA templates. The sequences read from left to right, using the same numbering system as in Fig. 2. Matches are marked by asterisks. Blanks have been inserted to maximise regions of homology. (A) is R3.1 DNA (see Fig. 2B). (B) is A36 DNA (see Fig. 2A).
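A figure such as "80% homologous" can be checked against an alignment of this kind by counting matching columns. The text does not state how the percentage was computed, so the Python sketch below uses one common convention as an assumption: identity is the number of identical, non-gap columns divided by the total number of alignment columns. The function name `percent_identity` and the gap characters are hypothetical choices for illustration; the two 20-base strings are the first 20 positions of rows (A) and (B) in Fig. 3.

```python
def percent_identity(a: str, b: str, gaps: str = "-•") -> float:
    """Percent of alignment columns in which both rows carry the same base.

    Columns containing a gap character in either row count as mismatches
    (an assumed convention; other definitions exclude gap columns).
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("alignment is empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x not in gaps)
    return 100.0 * matches / len(a)


# Positions 1-20 of the two Alu family sequences as transcribed in Fig. 3:
R31_1_20 = "TGGCTGGATGCGGTGGCTCA"  # row (A), R3.1 DNA
A36_1_20 = "AGGCTGGGAGTGGTGGCTCA"  # row (B), A36 DNA

print(percent_identity(R31_1_20, A36_1_20))  # 16 of 20 columns match
```

Over these 20 columns the two sequences differ at positions 1, 8, 9 and 11. A short window is only a spot check; the value quoted in the text refers to the full 294 bp region.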

insulin gene, Bell et al. (1980) presented DNA sequence data of an Alu family repeat. Their analysis of the boundaries of the repeat unit is roughly compatible with the above conclusions (see DISCUSSION for details). In addition, they observed 19 bp direct repeats flanking the Alu family sequence. As shown in Fig. 2A, there are 17 bp direct repeats at analogous locations in the A36 plasmid DNA sequence. The 17 and 19 bp repeats share no homology in nucleotide sequence. J.T. Elder, J. Pan, C.H. Duncan, and S.M. Weissman (manuscript submitted) have analyzed another Alu family repeat, which is unrelated to the globin locus, with results which are also in agreement with this definition of the repeated unit. Their data show a 10 bp repeat flanking the Alu family DNA sequence. Again, there is no homology with the flanking sequences discussed above.

(c) Determination of the 5' and 3' ends of the RNAs

The sizes of these transcripts were previously estimated as 575 and 515 bases. Thus, the transcripts must include portions which were not derived from Alu family sequences. Two methods were used for further analysis. First, the RNA was digested with RNase T1, which cleaves only after guanosine residues, and the resulting oligonucleotides were fractionated by electrophoresis on a gel containing 10% polyacrylamide and 7 M urea. Autoradiography of the gel (Fig. 4) revealed a pattern which correlated well with that predicted by inspection of the known DNA sequence. In the A36 DNA (Fig. 2A), oligonucleotides of 32, 23, and 34 bases were predicted by the sequence at positions 106-137, 415-437, and 498-531, respectively. In the R3.1 DNA (Fig. 2B), oligonucleotides of 32 and 20 bases were predicted by the sequence at positions 106-137 and 314-333. Oligonucleotides of all 5 sizes were seen in digests of the in vitro transcripts (Fig. 4). The only large oligonucleotide which was not resolved in this system was the one of 17 bases corresponding to positions 279-295 in Fig. 2A. No significant band migrating with a chain length of 17 bases was seen (Fig. 4, lanes 3 and 4). This anomalous electrophoretic behavior probably resulted from the internal oligo(A) present in this oligonucleotide. This one exception aside, the data confirmed both the previous assignment of transcriptional polarities and the tentative localization of RNA templates. In addition, we could localize the 3'-end of the A36 transcript. RNase T1-generated oligonucleotides of 34 bases and 15 bases were predicted from the DNA sequences at positions 498-531 and 543-557 in Fig. 2A. Only the 34-base oligonucleotide, and not the 15-base oligonucleotide, was actually found in the RNA. Thus, the 3' end(s) of the A36 transcript lay between positions 532 and 557.

[Fig. 4: autoradiograph, lanes 1-4; marked chain lengths 34, 32, 23, 20 and 15.]

Fig. 4. Size fractionation of RNase T1-digestion products of in vitro RNAs. RNAs were digested with RNase T1, the reaction mixtures diluted 2-fold with deionized formamide, heated 2 min at 80°C and subjected to electrophoresis on a 10% polyacrylamide gel in 7 M urea. The numbers represent chain lengths of the oligonucleotides. Lane 1: RNA transcribed from R3.1 DNA using [α-32P]UTP. Lane 2: RNA transcribed from R3.1 DNA using [α-32P]GTP. Lane 3: RNA transcribed from A36 DNA using [α-32P]UTP. Lane 4: RNA transcribed from A36 DNA using [α-32P]GTP.

The method developed by Donis-Keller (1979) was employed to further characterize the RNAs. 32P-labeled RNA was annealed to unlabeled DNA restriction fragments. Then RNase H (from E. coli) was added and allowed to digest the RNA in RNA:DNA hybrid structures. The RNA fragments thus generated were fractionated on 5% or 10% polyacrylamide gels in 7 M urea (Fig. 5). By comparison of gel mobilities with those of appropriate size markers, the sizes of the RNase H-generated fragments could be estimated and, by extrapolation, the ends of the RNA molecules determined. The restriction fragments used in this study were the HaeIII-HaeIII fragment spanning positions 48 to 94, the BamHI-BglII fragment spanning positions -48 to 418, and the BglII-DdeI fragment spanning positions 418-478 in Fig. 2A. These coordinates refer to cleavages on the strand complementary to that represented in the figure, because that was the strand which actually hybridized to the RNA in these experiments. We also used the XbaI to BglII fragment from beyond the 5'-end of the sequence to position 360 and the HpaII-HpaII fragment spanning position 140 to beyond the 3' end of the sequence in Fig. 2B. Because of the homology between transcripts, the HaeIII fragment could be used to determine the 5' ends of both RNAs. RNase H cleavage of RNAs hybridized to the HaeIII-HaeIII fragment was followed by electrophoresis in a 10% polyacrylamide gel. The gel was exposed to X-ray film and the resulting autoradiograph showed one prominent band for both A36 and R3.1 RNA (Fig. 5A, lanes 2 and 8). A semi-logarithmic plot, constructed by use of RNA standards, indicated a chain length of 46-48 bases. By inference then, the 5' ends of the RNAs lay 46-48 bases upstream from the HaeIII restriction sites (GGCC) at positions 48 in Fig. 2A and 48 in Fig. 2B. Similar analysis indicated that the R3.1 transcript had a 5'-end 140 bases upstream from the HpaII site at position 140 (Fig. 2B).

The 3'-ends of the RNAs were determined in a like manner (Fig. 5B). The data indicated that R3.1 RNA terminates 121-125 bases 3' to the BglII cleavage at position 365. The 3'-end of the A36 transcript could be assigned with less certainty. Two restriction fragments were used to promote cleavage at sites close to the RNA terminus. One of these sites was the BglII cleavage at position 418, while the other was the DdeI site at position 478 (Fig. 2A). In neither case were there definite single bands, but instead disperse sets of fragments ranging in size from 70-80 and 130-140 bases in length for the DdeI and BglII sites, respectively. Both these estimates placed the 3' end of the RNA between positions 546 and 556, in agreement with the results of T1 oligonucleotide size analysis.

[Fig. 5: autoradiographs, panels A and B; marked chain lengths 158, 121, 79, 34, 32 and 23.]

Fig. 5. RNase H digestion of RNA:DNA duplexes. 32P-labeled RNA was annealed to unlabeled DNA, digested with RNase H, and the products fractionated by polyacrylamide gel electrophoresis as described in Fig. 4. The chain lengths of RNA size markers are shown. (A) 10% polyacrylamide gel: (1) A36 RNA without DNA; (2) A36 RNA hybridized with the HaeIII-HaeIII fragment (position 47-93, Fig. 2A); (3) A36 RNA hybridized to the BglII-DdeI fragment (position 417-477, Fig. 2A); (4) A36 RNA without DNA and without RNase H; (5) T1 digest of A36 RNA (size markers); (6) in vivo labeled HeLa cell RNA (size markers); (7) A36 RNA; (8) R3.1 RNA hybridized with the HaeIII-HaeIII fragment (position 47-93, Fig. 2A); (9) R3.1 RNA hybridized to the HpaII-HpaII fragment (position 140 to beyond the 3'-end, Fig. 2B); (10) R3.1 RNA without DNA. (B) 5% polyacrylamide gel: (1) A36 RNA hybridized to the HinfI-HinfI fragment (position 275-301, Fig. 2A); (2) A36 RNA hybridized to the BamHI-BglII fragment (position -51 to 417, Fig. 2A); (3) same as lane 2 except sample was not heated before loading on gel; (4) A36 RNA without DNA; (5) in vivo RNA as size markers; (6) R3.1 RNA hybridized to the XbaI-BglII fragment (beyond 5'-end to position 365, Fig. 2B).

We have found no evidence for any splicing of the RNAs. However, yeast tRNA precursors were spliced when they were incubated with crude cellular extracts similar to those used in these transcriptions (Knapp et al., 1978). Therefore, we could not absolutely rule out the presence of a short splice in these RNAs.
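The RNase T1 predictions used in section (c) follow directly from the rule that the enzyme cleaves after every guanosine. The Python sketch below illustrates that rule under simplifying assumptions (complete digestion, and an RNA that is a direct copy of the DNA strand shown in Fig. 2); it is not the authors' procedure. The function names `rnase_t1_fragments` and `transcribe` are hypothetical, and the segment is copied from positions 497-537 of the Fig. 2A sequence as transcribed here.

```python
def transcribe(dna_sense: str) -> str:
    """Convert the displayed (RNA-like) DNA strand into RNA by replacing T with U."""
    return dna_sense.upper().replace("T", "U")


def rnase_t1_fragments(rna: str) -> list[str]:
    """Split an RNA after every G, modelling a complete RNase T1 digest."""
    fragments, current = [], []
    for base in rna.upper():
        current.append(base)
        if base == "G":  # T1 cuts on the 3' side of each guanosine
            fragments.append("".join(current))
            current = []
    if current:  # trailing piece with no terminal G (the 3' end of the RNA)
        fragments.append("".join(current))
    return fragments


# Assumed excerpt of Fig. 2A, positions 497-537 (A36 plasmid):
SEGMENT_497_537 = "GCATACCAATAAAAATCTTTCTCACCCATCCTTAGATTGAG"

fragments = rnase_t1_fragments(transcribe(SEGMENT_497_537))
print([len(f) for f in fragments])
```

The second fragment is the 34-base oligonucleotide assigned to positions 498-531 in the text. Running the same function over a whole transcript and comparing the predicted long fragments with the bands observed is the kind of reasoning used above to bracket the 3' end of the A36 RNA.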

DISCUSSION

The data presented here define the structures of the RNA polymerase III in vitro transcripts coded by the two templates adjacent to the human Gγ and δ globin genes. In our original report on these molecules we demonstrated that the two transcripts possessed sufficient homology to hybridize with each other's templates. The present findings establish that this was due to the presence of repetitive sequences imbedded in both templates. These repeats were identified as members of the Alu family of repeated sequences, the predominant class of interspersed repetitive DNA in the human genome. Alu family sequences constituted the 300 5'-nucleotides of each template. At the 3'-extreme of these repeats, oligo(A) runs were found in the templates, 12 adenosine residues in one case and adjacent runs of 4 and 7 adenosines in the other. After this point, there was no homology between templates. The RNAs terminated about 190 and 260 bases downstream from the oligo(A) for the templates 5' to the δ and Gγ genes, respectively.

Rapid progress has been made in defining the DNA sequences which control transcription initiation and termination by RNA polymerase III. This has been made possible by the use of cell-free extracts from cultured cells (Wu, 1978) or from frog oocytes (Birkenmeyer et al., 1978) for faithful and accurate in vitro transcription by RNA polymerase III. These initial studies employed the genes for the adenovirus-associated (VA) RNAs and cloned genes for Xenopus 5S ribosomal RNAs. Using similar methods, tRNA genes from the genera Xenopus (Telford et al., 1979), Bombyx (Garber and Gage, 1979; Hagenbüchle et al., 1979), Drosophila (Schmidt et al., 1978) and Saccharomyces (Ogden et al., 1979) were transcribed in vitro.

Transcriptional control sequences of the Xenopus 5S RNA genes have been thoroughly studied. By constructing deletions of cloned genes in vitro, Sakonju et al. (1980) and Bogenhagen et al. (1980) found that the DNA region corresponding to nucleotides +50 to +80 of the mature 5S RNA is the only area essential for initiation of transcription. In the cases of tRNA genes and the VA RNA genes the situation appears to be more complex, but results from both systems were compatible with the notion that transcriptional control regions were situated internally within the coding sequences for mature RNA (Kressman et al., 1979; Fowlkes and Shenk, 1980). In addition, 5'-flanking sequences could alter the efficiency of transcription of adjacent templates (DeFranco et al., 1980; Thimmappaya et al., 1979; Kressman et al., 1979).

These findings and the present report raise the possibility that Alu family DNA contains an internal RNA polymerase III initiation signal. The sequence data are suggestive. Ohe and Weissman (1970) and Fowlkes and Shenk (1980) observed homologies between nucleotide sequences of VA RNAs and tRNAs. These similarities fell into two regions: one near the 5'-end of the RNA molecules and the other 50 to 60 bases 3' to the initiation point of the RNAs. Both of the Alu family sequences presented here contained regions homologous to those sequences (Fig. 6A), although the latter homology occurred at position +75 instead of +60. The termination sequences of these templates also

A)
Consensus        GUGGPyNNPuGUGG .. 30-35 Nuc .. GGGUUCGAANCC
Mouse 4.5S RNA   GUGGC ........... 46 Nuc ..... GAGUUCGAGGCC
A36 RNA          GUGGC ........... 58 Nuc ..... GAGUUCAAGACC
R3.1 RNA         GUGGC ........... 60 Nuc ..... GAGUUCAAGACC

B)
[Match markers between the paired rows are not reproduced.]

Alu family RNA   3 10 20 30 40
                 GCUGGGAGUGGUGGCUCACGCCUGUAAUCCCAGAAUUU--UG--GGAGGCCAAGGC
Mouse 4.5S RNA   GCCGGUAGUGGUGGCGCACGCCGGUA......GGAUUUGCUGAAGGAGGCAGAGGC
                 1 10 20 30 40

Alu family RNA   50 60
                 AGGCAGAUCACC
Mouse 4.5S RNA   AGAGGGAUCAC
                 50 60

Alu family RNA   70 80 90 100
                 UGAGGUCAAGAGUU CAAGACCAACCUGGCCAACAUGGUGAA
Mouse 4.5S RNA   ..........GAGUUCGAGGCCAGCCUGGGCUACACAUUUU U
                 70 80 90

Fig. 6. Comparison of in vitro transcripts with known in vivo RNA sequences. (A) Consensus sequence described by Fowlkes and Shenk (1980) was compared with the mouse 4.5S RNA and the inferred sequence of the in vitro transcripts. The sequences shown begin at positions 13 and end at positions 87 and 87 for the sequences in Figs. 2A and 2B, respectively (Nuc, any of four nucleotides). (B) Mouse 4.5S RNA sequence (Harada and Kato, 1980) compared to inferred A36 RNA sequence.
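The similarity of the 3' element in Fig. 6A to the consensus can be expressed as a simple mismatch count. The Python sketch below is an illustration only: it compares each 12-base element with the consensus GGGUUCGAANCC, treating N as matching any nucleotide. The function name `mismatches` and the dictionary layout are hypothetical; the sequences are copied from panel A as transcribed here.

```python
def mismatches(consensus: str, seq: str, wildcard: str = "N") -> int:
    """Count positions where seq differs from consensus, ignoring wildcard positions."""
    if len(consensus) != len(seq):
        raise ValueError("sequences must have equal length")
    return sum(1 for c, s in zip(consensus, seq) if c != wildcard and c != s)


# 3' element of the consensus of Fowlkes and Shenk (1980), as shown in Fig. 6A
CONSENSUS_3P = "GGGUUCGAANCC"

# Corresponding elements listed in Fig. 6A
CANDIDATES = {
    "Mouse 4.5S RNA": "GAGUUCGAGGCC",
    "A36 RNA": "GAGUUCAAGACC",
    "R3.1 RNA": "GAGUUCAAGACC",
}

scores = {name: mismatches(CONSENSUS_3P, seq) for name, seq in CANDIDATES.items()}
print(scores)
```

By this count the mouse 4.5S RNA element differs from the consensus at 2 of the 11 specified positions and the two Alu family transcripts at 3. A plain mismatch count ignores the spacing between the two consensus elements, which the figure reports separately (46, 58 and 60 nucleotides).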

share features in common with known RNA polymerase III templates. Transcription studies with the Xenopus 5S genes (Bogenhagen et al., 1980), DNA sequence analysis of the regions 3' to known RNA polymerase III templates (Akusjarvi et al., 1980; Korn and Brown, 1978) and in vivo studies of mutations in the sup4A tRNA gene of yeast (Kurjan et al., 1980) all indicated that a series of 4 or more thymidine residues was sufficient for transcription termination. The R3.1 plasmid template was compatible with this conclusion, because it contained the sequence TTTTTTTT where the transcript terminated. The A36 transcript appeared to terminate within the sequence TTATTATT, even though a run of 5 thymidines occurred 40 bases downstream in the template (positions 593-597). Our data indicate that termination of this transcript was not precise. Our data also showed that 4 adjacent thymidine residues occurring in this template (positions 38-41, Fig. 2A) did not cause termination in a significant portion of transcripts.

The existence of these RNA molecules in vivo remains an open question. It is possible that these templates are examples of pseudogenes, sequences which resemble normal templates but which are themselves never transcribed in vivo. Both frog 5S pseudogenes (Korn and Brown, 1978) and a human α-globin pseudogene (Proudfoot et al., 1980) are transcribed in vitro by crude cell-free extracts similar to those used in this study. RNA molecules corresponding to these pseudogene transcripts are not detected in cellular RNA. Alternatively, it could be argued that Alu family DNA sequences contain regions which can fortuitously mimic the normal recognition sequences for RNA polymerase III when deproteinized cloned DNAs are used as templates. We have two reasons for doubting this last hypothesis. Inspection of the nucleotide sequence data shows that the RNA transcripts initiate at points close to, if not exactly matching, the ends of the repeated sequences. This correlation is too striking to be dismissed as a coincidence; rather it implies that in vitro template activity has some corollary in living cells. Furthermore, these transcripts do possess homologies to RNA molecules which are found in vivo. Fowlkes and Shenk (1980) observed that a murine RNA species, known as the 4.5S RNA, also contained a putative RNA polymerase III control sequence. Comparison of the published sequence of this RNA (Harada and Kato, 1980) with the sequence in Fig. 2 reveals extensive homology between the 4.5S RNA and the 5'-end of the Alu family sequences in the RNAs (Fig. 6B). Again, the location of the 5'-end of the in vitro transcripts matches very closely with that of the 4.5S RNA. The 4.5S RNA contains a triphosphate at its 5'-terminus, as do the 5S and VA RNAs, instead of the 7-methyl-guanosine pyrophosphate (cap) structure characteristic of RNA polymerase II transcripts. Alu family sequences will also hybridize to the 7S RNA (Weiner, 1980). The sequence of this RNA is not known. This molecule is found in mouse and human cells and in oncornaviral particles (Zieve and Penman, 1976).
Thus, although there is no direct evidence that either of the sequences presented here is transcribed in vivo, they do share structural similarities with sequences which are known to be expressed as RNA in cells.

In vitro transcription by RNA polymerase III is not a special property of the Alu family sequences interspersed among the human globin genes. In the study of J. Pan, J.T. Elder, C.H. Duncan, and S.M. Weissman (manuscript submitted) eight of eleven clones selected only for the presence of Alu family sequences were active templates for RNA polymerase III. Wu (1978) found that unfractionated human DNA directed the synthesis, in his transcription system, of a disperse set of RNA molecules in the 9-11S size range, similar in size to the molecules described here. However, a significant fraction of Alu family sequences were not transcribed in vitro. At least two such examples were found in the human β-like globin genes, one 3' to the β gene and one of the pair situated 5' to the δ gene. Similar examples were readily observed in randomly selected clones of human DNA (C.H. Duncan, unpublished data). The structural basis of this differential transcriptional capability merits further study.

These sequence studies provide a second example of an Alu family member that is flanked by short direct repeats. Bell et al. (1980) suggested that these repeats form the boundaries of the Alu family repetitive sequence. We agree with this definition, with some reservations. First, there may be members of the Alu family which are not flanked by direct repeats. An alternative convention must be applied in such cases. Second, it is somewhat difficult to distinguish the exact endpoints of the direct repeats. For example, in the sequence presented (Fig. 7), Bell et al. (1980) accepted one mismatched base (their position 68) in order to extend the direct repeat from a 14-base perfect match to a 19-base imperfect match. Interpretation of the sequences is more harmonious if the 14-base repeat is used instead. Under this convention, their Alu family repeat begins with the sequence NGGCTGG exactly as do the two sequences presented here. DNA sequences 5' to this position share no detectable homologies. Furthermore, the 5'-ends of the A36 and R3.1 RNA polymerase III templates correspond very closely, if not exactly, to this position.
Finally, the 5' ends of Alu family DNA abut the flanking direct repeats for the cases of A36 DNA and the Alu family DNA in the vicinity of the insulin gene. The flanking sequences of the R3.1 Alu family DNA will be discussed in a future publication (P. Jagadeeswaran and S.M. Weissman, manuscript in preparation). The 3' end of Alu family DNA is marked by clusters of adenosine residues in every case we have examined. In the sequence of Bell et al. (1980), the 3' flanking direct repeat abuts the oligo(A) sequence. Further studies will reveal whether this sequence arrangement is a general rule. At first glance, the sequence of A36 DNA appears to violate this rule, because of the sequence GAGAGATTCAA found between the 12-adenosine cluster at positions 283-294 (Fig. 2A) and the 3' direct repeat at positions 306-322 (Fig. 2A). However, this region still has 5 adenosine residues in 11 bases, so it can be considered an extension of an adenosine-rich sequence.

It should be noted that Alu family members will, in general, have different lengths: in these cases 318 bases for the repeat 3' to the insulin gene vs. 305 bases for the repeat in A36 DNA. This heterogeneity in length is due primarily to differences in the length of the oligo(A) tracts found at the 3' ends of Alu family DNA.

The proposed sequence arrangement is reminiscent of that reported for another example of repetitive DNA, the δ sequence of yeast. Available data indicate several structural analogies between Alu family members and δ sequences. In both cases, the elements are repetitive, interspersed among unique sequences (Cameron et al., 1979), about 300 bp in length, and flanked by short direct repeats (Farabaugh and Fink, 1980). The comparison is certainly not exact. Based on estimates of copy number (Cameron et al., 1979; Houck et al., 1979), Alu family members are at least one order of magnitude higher than δ sequences in repetition frequency per unit length of genomic DNA. Nor is there any discernible homology between Alu family sequence and the published DNA sequences for δ elements (Farabaugh and Fink, 1980; Gafner and Philippsen, 1980). The in vitro template activity of δ sequence is not known. However, RNA polymerase III, in isolated yeast nuclei, produced a heterogeneous collection of RNA molecules which were longer than any of its known in vivo products (Schultz, 1978).

Yeast δ sequences are mobile. They can be transposed as part of a larger repetitive element known as Ty1.
Insertion and subsequent excision of Ty1 results in the deposition of a δ sequence in the yeast genome, flanked by 5 bp direct repeats derived from duplication of preexisting DNA at the target site (Farabaugh and Fink, 1980). Movement of Alu family DNA is a subject which requires further investigation. Two instances of Alu family involvement in DNA sequence rearrangements in vivo have been reported. In one case, a portion of a human α globin gene was deleted and the remainder fused to Alu family DNA (Orkin and Michelson, 1980). The other case involved a monkey DNA sequence which is very similar to the Alu family. This sequence was inserted in the DNA of a viable mutant of simian virus 40 (Dhruva et al., 1980). However, in both cases, only a portion of an Alu family sequence was present. There is a strong possibility that both instances were the results of random fusions of DNA.

Functional roles for Alu family elements are more difficult to assess. At present, structural studies of DNA from patients with disorders in hemoglobin synthesis provide the only clue. In particular, the syndrome of HPFH is often associated with deletion of DNA sequences located 5' to the δ globin gene, including one of the RNA polymerase III templates described in this report (Tuan et al., 1979, 1980; Fritsch et al., 1979). With the function of Alu family DNA still unknown, any speculation about the existence or functions of RNAs transcribed from them can be little better than guesswork. In spite of this, one recent hypothesis concerning the mechanism of globin gene regulation should be mentioned (Stalder et al., 1980; Bernards and Flavell, 1980). This proposal invokes the existence of chromosomal domains, regions of chromatin which somehow modulate the transcription of DNA which they encompass. Expression of genes is presumed to be influenced by external factors interacting with the domain of which they are a member. It is conceivable that the transcripts of Alu family play some role in demarcating, establishing or maintaining these domains.
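
The adenosine count quoted in the Discussion for the A36 spacer (GAGAGATTCAA, lying between the oligo(A) cluster and the 3' direct repeat) can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `base_count` is hypothetical and is not part of the original analysis.

```python
def base_count(seq: str, base: str = "A") -> int:
    """Count occurrences of a single base in a DNA sequence (case-insensitive)."""
    return seq.upper().count(base.upper())


# Spacer sequence quoted in the text for A36 DNA.
spacer = "GAGAGATTCAA"

n_a = base_count(spacer)          # number of adenosine residues
fraction = n_a / len(spacer)      # adenosine fraction of the spacer

print(f"{n_a} A residues in {len(spacer)} bases ({fraction:.0%})")
```

Run on the quoted spacer, the count agrees with the figure of 5 adenosine residues in 11 bases given in the text.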

ACKNOWLEDGEMENTS

We thank Francisco Baralle, Dana Fowlkes and Thomas Shenk for communicating their results prior to publication. We are also grateful to Elio Vanin and Oliver Smithies who provided us with their independently determined sequence of the template 5' to the Gγ gene. The E. coli strain harboring the plasmid R3.1 was provided by T. Maniatis. Finally, we wish to thank J.K. deRiel and B. Forget for their kind gift of a portion of the DNA used in this study. Discussions of this work with J.T. Elder and J. Pan were greatly appreciated. C.H.D. is supported by post-doctoral fellowship No. CA 06532-02 from the National Institutes of Health. This research was also supported by Grant No. 2 PO1 CA 16038-06 awarded by the National Cancer Institute, Department of Health and Human Services, and by the Comprehensive Cancer Center for Connecticut at Yale, Grant No. CA 16359.

REFERENCES

Akusjarvi, G., Mathews, M.B., Andersson, P., Vennstrom, B. and Pettersson, U.: Structure of genes for virus-associated RNAI and RNAII of adenovirus type 2. Proc. Natl. Acad. Sci. USA 77 (1980) 2424-2428.
Bank, A., Mears, J.G. and Ramirez, F.: Disorders of human hemoglobin. Science 207 (1980) 486-493.
Baralle, F.E., Shoulders, C.C., Goodbourn, S., Jeffreys, A. and Proudfoot, N.J.: The 5' flanking region of human epsilon globin gene. Nucl. Acids Res. (1981) in press.
Bell, G.I., Pictet, R. and Rutter, W.J.: Analysis of the regions flanking the human insulin gene and sequence of an Alu family member. Nucl. Acids Res. 8 (1980) 4091-4109.
Bernards, R. and Flavell, R.A.: Physical mapping of the globin gene deletion in hereditary persistence of fetal hemoglobin. Nucl. Acids Res. 8 (1980) 1521-1543.
Birkenmeyer, E.H., Brown, D.D. and Jordan, E.: A nuclear extract of Xenopus laevis oocytes that accurately transcribes 5S RNA genes. Cell 15 (1978) 1077-1086.
Bogenhagen, D.F., Sakonju, S. and Brown, D.D.: A control region in the center of the 5S RNA gene directs specific initiation of transcription, II. The 3' border of the region. Cell 19 (1980) 27-35.
Britten, R.J. and Kohne, D.E.: Repeated sequences in DNA. Science 161 (1968) 529-540.
Bunn, H.F., Forget, B.G. and Raney, H.M.: in Human Hemoglobins. Saunders, Philadelphia, 1977, pp. 101-112.
Cameron, J.R., Loh, E.Y. and Davis, R.W.: Evidence for transposition of dispersed repetitive DNA families in yeast. Cell 16 (1979) 739-751.
Casey, J. and Davidson, N.: Rates of formation and thermal stabilities of RNA : DNA and DNA : DNA duplexes at high concentrations of formamide. Nucl. Acids Res. 4 (1977) 1539-1552.
Coggins, L.W., Grindlay, G.W., Vass, J.K., Slater, A.A., Montague, P., Stinson, M.A. and Paul, J.: Repetitive DNA sequences near three human β-type globin genes. Nucl. Acids Res. 8 (1980) 3319-3334.
Davidson, E.H., Klein, W.H. and Britten, R.J.: Sequence organization in animal DNA and a speculation on hnRNA as a coordinate regulatory transcript. Develop. Biol. 55 (1977) 69-84.
DeFranco, D., Schmidt, O. and Soll, D.: Two control regions for eukaryotic tRNA gene transcription. Proc. Natl. Acad. Sci. USA 77 (1980) 3365-3368.

Denison, R.A., Van Arsdell, S.W., Bernstein, L.B. and Weiner, A.M.: Abundant pseudogenes for small nuclear RNAs are dispersed in the human genome. Proc. Natl. Acad. Sci. USA (1981) in press.
Dhruva, B.R., Schenk, T. and Subramanian, K.N.: Integration in vivo into simian virus 40 DNA of a sequence that resembles a certain family of genomic interspersed repeated sequences. Proc. Natl. Acad. Sci. USA 77 (1980) 4514-4518.
Donis-Keller, H.: Site specific enzymatic cleavage of RNA. Nucl. Acids Res. 7 (1979) 179-191.
Duncan, C., Biro, P.A., Choudary, P.V., Elder, J.T., Wang, R.R.C., Forget, B.G., deRiel, J.K. and Weissman, S.M.: RNA polymerase III transcriptional units are interspersed among human non-α globin genes. Proc. Natl. Acad. Sci. USA 76 (1979) 5095-5099.
Farabaugh, P.J. and Fink, G.R.: Insertion of the eukaryotic transposable element Ty1 creates a 5 base pair duplication. Nature 286 (1980) 352-356.
Fowlkes, D.M. and Shenk, T.: Transcriptional control regions of the adenovirus VA1 RNA gene. Cell 22 (1980) 405-414.
Fritsch, E.F., Lawn, R.M. and Maniatis, T.: Characterization of deletions which affect the expression of fetal globin genes in man. Nature 279 (1979) 598-603.
Fritsch, E.F., Lawn, R.M. and Maniatis, T.: Molecular cloning and characterization of the human β-like globin gene cluster. Cell 19 (1980) 959-972.
Gafner, J. and Philippsen, P.: The yeast transposon Ty1 generates duplications of target DNA on insertion. Nature 286 (1980) 414-418.
Garber, R.L. and Gage, L.P.: Transcription of a cloned Bombyx mori tRNA Ala gene: Nucleotide sequence of the tRNA precursor and its processing in vitro. Cell 18 (1979) 817-828.
Goossens, M., Dozy, A.M., Embury, S.H., Zachariades, Z., Hadjiminas, M.G., Stamatoyannopoulos, G. and Kan, Y.W.: Triplicated α globin loci in humans. Proc. Natl. Acad. Sci. USA 77 (1980) 518-521.
Hagenbuckle, O., Larsen, D., Hall, G.I. and Sprague, K.U.: The primary transcription product of a silkworm alanine tRNA gene; identification of in vitro sites of initiation, termination and processing. Cell 18 (1979) 1217-1229.
Harada, F. and Kato, N.: Nucleotide sequences of 4.5S RNAs associated with poly(A) containing RNAs of mouse and hamster cells. Nucl. Acids Res. 8 (1980) 1273-1283.
Houck, C.M., Rinehart, F.P. and Schmid, C.W.: An ubiquitous family of repeated DNA sequences in the human genome. J. Mol. Biol. 132 (1979) 289-306.
Jelinek, W.R., Toomey, T.P., Leinwand, L., Duncan, C.H., Choudary, P.V., Biro, P.A., Weissman, S.M., Rubin, C.M., Houck, C.M., Deininger, P.L. and Schmid, C.W.: Ubiquitous interspersed repeated sequences in mammalian genomes. Proc. Natl. Acad. Sci. USA 77 (1980) 1398-1402.
Kaufman, R.E., Kretschmer, P.J., Adams, J.W., Coon, H.C., Anderson, W.F. and Nienhuis, A.W.: Cloning and characterization of DNA sequences surrounding the human γ, δ and β globin genes. Proc. Natl. Acad. Sci. USA 77 (1980) 4229-4233.
Knapp, G., Beckman, J.S., Johnson, P.F., Fuhrman, S.A. and Abelson, J.: Transcription and processing of intervening sequences in yeast tRNA genes. Cell 14 (1978) 221-236.
Korn, L.J. and Brown, D.D.: Nucleotide sequence of Xenopus borealis oocyte 5S DNA: comparison of sequences that flank several related eukaryotic genes. Cell 15 (1978) 1145-1156.
Kressman, A., Hofstetter, H., DiCapua, E., Grosschedl, R. and Birnstiel, M.L.: A tRNA gene of Xenopus laevis contains at least two sites promoting transcription. Nucl. Acids Res. 7 (1979) 1749-1763.
Kurjan, J., Hall, B.D., Gilliam, S. and Smith, M.: Mutations at the yeast SUP4 tRNA Tyr locus: DNA sequence changes in mutants lacking suppressor activity. Cell 20 (1980) 701-709.
Lauer, J., Shen, C.K.J. and Maniatis, T.: The chromosomal arrangement of human α-like globin genes: Sequence homology and α globin gene deletion. Cell 20 (1980) 119-130.
Lillehaug, J.R. and Kleppe, K.: Effects of salts and polyamines on T4 polynucleotide kinase. Biochemistry 14 (1975) 1225-1229.
Maat, J. and Smith, A.J.H.: A method for sequencing restriction fragments with dideoxynucleoside triphosphates. Nucl. Acids Res. 5 (1978) 4537-4545.
Maxam, A.M. and Gilbert, W.: A new method for sequencing DNA. Proc. Natl. Acad. Sci. USA 74 (1977) 560-564.
Ogden, R.C., Beckman, J.S., Abelson, J., Kang, H.S., Soll, D. and Schmidt, O.: In vitro transcription and processing of a yeast tRNA gene containing an intervening sequence. Cell 17 (1979) 399-406.
Ohe, K. and Weissman, S.M.: Nucleotide sequence of an RNA from cells infected with adenovirus 2. Science 167 (1970) 879-880.
Orkin, S.H. and Michelson, A.: Partial deletion of the α globin structural gene in human α-thalassaemia. Nature 286 (1980) 538-541.
Proudfoot, N.J., Shandler, M.H.M., Manley, J.L., Gefter, M.L. and Maniatis, T.: Structure and in vitro transcription of human globin genes. Science 209 (1980) 1329-1336.
Rubin, C.M., Houck, C.M., Deininger, P.L., Friedmann, T. and Schmid, C.W.: Partial nucleotide sequence of the 300-nucleotide interspersed repeated human DNA sequences. Nature 284 (1980) 372-375.
Sakonju, S., Bogenhagen, D.F. and Brown, D.D.: A control region in the center of the 5S RNA gene directs specific initiation of transcription, I. The 5' border of the region. Cell 19 (1980) 13-25.

Scheller, R.H., Thomas, T.L., Lee, A.S., Klein, W.H., Niles, W.D., Britten, R.J. and Davidson, E.H.: Clones of individual repetitive sequences from sea urchin DNA constructed with synthetic EcoRI sites. Science 196 (1977) 197-202.
Schmidt, O., Mao, J.I., Silverman, S., Hovemann, B. and Soll, D.: Specific transcription of eukaryotic tRNA genes in Xenopus germinal vesicle extracts. Proc. Natl. Acad. Sci. USA 75 (1978) 4819-4823.
Schultz, L.D.: Transcriptional role of yeast deoxyribonucleic acid dependent ribonucleic acid polymerase III. Biochemistry 17 (1978) 750-758.
Stalder, J., Larsen, A., Engel, J.D., Dolan, M., Groudine, M. and Weintraub, H.: Tissue-specific DNA cleavages in the globin chromatin domain introduced by DNase I. Cell 20 (1980) 451-460.
Telford, J.L., Kressman, A., Koski, R.A., Grosschedl, R., Muller, F., Clarkson, S.G. and Birnstiel, M.L.: Delimitation of a promoter for RNA polymerase III by means of a functional test. Proc. Natl. Acad. Sci. USA 76 (1979) 2590-2594.
Thimmappaya, B., Jones, N. and Schenk, T.: A mutation which alters initiation of transcription by RNA polymerase III on the Ad5 chromosome. Cell 18 (1979) 947-954.
Tuan, D., Biro, P.A., deRiel, J.K., Lazarus, H. and Forget, B.G.: Restriction endonuclease mapping of the human γ globin gene loci. Nucl. Acids Res. 6 (1979) 2519-2544.
Tuan, D., Murnane, M.J., deRiel, J.K. and Forget, B.G.: Heterogeneity in the molecular basis of hereditary persistence of fetal hemoglobin. Nature 285 (1980) 335-338.
Van der Ploeg, L.H.T., Konings, A., Oort, M., Roos, D., Bernini, L. and Flavell, R.A.: γβ-Thalassaemia studies showing that deletion of the γ and δ genes influences β globin gene expression in man. Nature 283 (1980) 637-642.
Weatherall, D.J. and Clegg, J.B.: Recent developments in the molecular genetics of human hemoglobin. Cell 16 (1979) 467-479.
Weiner, A.: An abundant cytoplasmic 7S RNA is partially complementary to the dominant interspersed middle repetitive DNA sequence family in the human genome. Cell 22 (1980) 209-218.
Wu, G.J.: Adenovirus DNA-directed transcription of 5.5S RNA in vitro. Proc. Natl. Acad. Sci. USA 75 (1978) 2175-2179.
Zieve, G. and Penman, S.: Small RNA species of the HeLa cells: metabolism and subcellular localization. Cell 8 (1976) 19-31.

Communicated by F.E. Young.