A ubiquitous family of repeated DNA sequences in the human genome


J. Mol. Biol. (1979) 132, 289-306

CATHERINE M. HOUCK¹, FRANK P. RINEHART² AND CARL W. SCHMID¹

¹ Department of Chemistry, University of California, Davis, CA 95616, U.S.A.
² College of the Virgin Islands, Division of Science and Mathematics, St. Thomas, VI 00801, U.S.A.

(Received 27 November 1978, and in revised form 2 April 1979)

Renatured DNA from human and many other eukaryotes is known to contain 300-nucleotide duplex regions formed from renatured repeated sequences. These short repeated DNA sequences are widely believed to be interspersed with single copy DNA sequences. In this work we show that at least half of these 300-nucleotide duplexes share a cleavage site for the restriction enzyme AluI. This site is located 170 nucleotides from one end. This Alu family of repeated sequences makes up at least 3% of the genome and is present in several hundred thousand copies. Inverted repeated sequences are also known to contain a short 300-nucleotide duplex region. We find that at least half of the 300-nucleotide duplex regions in inverted repeated sequences also have an AluI restriction site located 170 nucleotides from one end. By driven renaturation techniques, the Alu family is shown to be distributed over a minimum range of 30% to 60% of the genome. (The breadth of this range reflects the presence of inverted repeated sequences which, in part, include the Alu family.) These findings imply that the interspersion pattern of repeated and single copy sequences in human DNA is largely dominated by one family of repeated sequences.
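The two headline numbers in this summary can be reproduced with simple arithmetic. The sketch below is illustrative only: the three fractions are the ones reported in Results, section (a) (40% of the DNA S1 resistant, 12% of that material in the 300-nucleotide band, about 60% of the band cleaved by AluI), whereas the haploid genome size of 3 × 10⁹ nucleotide pairs is an assumed round figure that does not appear in this text, and the function names are hypothetical.

```python
# Illustrative arithmetic only; function names are hypothetical.

def alu_genome_fraction(s1_resistant=0.40, in_300nt_band=0.12, cut_by_alu=0.60):
    """Minimum fraction of the genome in the Alu family (product of three fractions)."""
    return s1_resistant * in_300nt_band * cut_by_alu


def copy_number(genome_fraction, repeat_length=300, genome_size=3.0e9):
    """Approximate copies per haploid genome.

    genome_size is an ASSUMED round value (nucleotide pairs), not taken from the paper.
    """
    return genome_fraction * genome_size / repeat_length


fraction = alu_genome_fraction()
print(f"Alu family: at least {fraction:.1%} of the genome")
print(f"Approximate copies: {copy_number(fraction):,.0f}")
```

With these inputs the product is about 0.029, which the text rounds to 3%, and the copy number falls near 3 × 10⁵, consistent with "several hundred thousand" under the assumed genome size.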

0022-2836/79/230289-18 $02.00/0 © 1979 Academic Press Inc. (London) Ltd.

1. Introduction

The interspersion of single copy DNA sequences with repeated DNA sequences is evidently a general feature of the sequence organization of the eukaryotic genome (Davidson et al., 1975). In most eukaryotes it is thought that 300-nucleotide repeated sequences are interspersed with longer (approx. 1000 nucleotides) single copy sequences (Davidson et al., 1975). The length of the repeated DNA sequence is estimated by determining the length of duplex structures formed when repeated DNA sequences renature. Such length determinations of renatured duplex DNA have been performed by use of a wide variety of physical methods. For example, by electron microscopy 300-nucleotide duplex regions are readily observed in both renatured Xenopus DNA (Chamberlin et al., 1975) and human DNA (Deininger & Schmid, 1976a). Since these 300-nucleotide sequences, as well as their interspersed unique sequences, occupy such a large fraction of the genome in widely diver [...]

[...] sequences are interspersed and are highly repetitive as expected (i.e. they renature about 50,000 times faster than single copy sequences).

2. Materials and Methods

(a) Preparation of DNA

Human DNA was isolated from placenta as described by Schmid & Deininger (1975). The DNA was either lightly sheared in a Virtis homogenizer to about 2000 nucleotides in length (Houck et al., 1978) or was sheared by 3 passes through a French press to a length of about 500 nucleotides (Schmid & Deininger, 1975). HeLa cells were grown for 3 days on Eagle's minimum essential medium with 15% fetal calf serum and 0.05 mCi of [3H]thymidine/ml. The 3H-labeled DNA was extracted and purified as previously described (Schmid & Deininger, 1975). Final purification of [3H]DNA was effected by density-gradient centrifugation in CsCl, ρ = 1.70 g cm⁻³. The [3H]DNA was sheared by sonication in a Quigley-Rochester Dismembranator equipped with a microtip.

(b) Preparation of 125I-labeled human DNA

Radioiodinated DNA was prepared using the Commerford procedure as modified by Orosz & Wetmur (1974). The molar ratio of carrier iodide to radioactive iodide was 10:1. After reaction, the iodination mixture was diluted into 0.05 M-sodium phosphate (an equimolar solution of NaH2PO4 and Na2HPO4) and heated to 60°C for 1 h. The DNA was then loaded onto hydroxylapatite in 0.05 M-sodium phosphate and washed to remove unreacted iodide (Houck et al., 1978). The DNA eluted with 0.3 M-sodium phosphate had a specific activity of approximately 10⁷ cts/min per µg.

(c) Second-order renaturation of DNA followed by treatment with S1 nuclease

DNA that was to be treated with S1 nuclease was dialyzed into 0.01 M-PIPES (pH 6.8). The DNA was denatured by boiling for several minutes, adjusted to 0.3 M-NaCl and renatured at 60°C to C₀t = 68†. At this C₀t value, most of the repetitive DNA has renatured (Schmid & Deininger, 1975). After renaturation, an equal volume of 0.05 M-sodium acetate buffer (pH 4.5) and 2 × 10⁻⁴ M-ZnCl2 was added as well as mercaptoethanol to a final concentration of 0.025 M (Britten et al., 1976). The DNA was then treated with deoxyribonuclease S1 (Calbiochem) for 1 h at 37°C under conditions where single-stranded DNA is digested by the enzyme and double-stranded DNA is not (Houck et al., 1978). The reaction was terminated by adding enough 2 M-sodium phosphate to make the solution 0.12 M-sodium phosphate. The double-stranded, enzyme-resistant DNA was bound to hydroxylapatite at 60°C in 0.12 M-sodium phosphate and eluted in 0.3 M-sodium phosphate. It was concentrated by extraction with butanol (Stafford & Bieber, 1975) and dialyzed into 0.01 M-Tris (pH 7.4) for electrophoresis or restriction.

† C₀t, the second-order renaturation parameter in units of s·mole/l. C₀t½, the C₀t value at which half of a component has renatured.

(d) First-order renaturation of DNA followed by treatment with S1 nuclease

HeLa [3H]DNA was denatured in 0.01 M-PIPES as described above. The solution was adjusted to 0.3 M-NaCl and inverted repeated sequences were allowed to renature for 15 s at 65°C, followed by quenching in ice (C₀t < 10⁻⁴). The DNA was then treated with S1 nuclease, followed by hydroxylapatite chromatography at 65°C and dialysis into 0.01 M-Tris (pH 7.4) as described above. Approximately 4.7% of the DNA was found to be S1-resistant under these conditions, in agreement with other studies on inverted repeated sequences (Wilson & Thomas, 1974; Schmid & Deininger, 1975; Dott et al., 1976).

(e) Driven renaturations of DNA

[3H]DNA fragments of various lengths were mixed with a large excess of driver DNA, denatured in alkali and renatured at 65°C in 0.24 M-sodium phosphate as described earlier (Houck et al., 1978). The extent of renaturation was assayed by hydroxylapatite chromatography followed by scintillation counting (Schmid & Deininger, 1975).

Radioiodinated DNA in 0.05 M-sodium phosphate was denatured by boiling. (This was done to avoid alkaline breakage at points where the DNA had presumably been depurinated by the heating at low pH required for the iodination reaction.) The solutions were adjusted to 0.24 M-sodium phosphate and renatured at 50°C. For the purpose of comparison, [3H]DNA isolated from inverted repeated sequences was renatured in the same way as iodinated DNA.

(f) Restriction enzyme cleavage

Restriction with AluI was performed by adding 0.125 volume of a solution of 0.06 M-MgCl2 and 0.06 M-mercaptoethanol to DNA in 0.01 M-Tris in a siliconized glass tube. An additional 0.125 volume of autoclaved gelatin (1 mg/ml) was added except in preparative reactions (25 µg of DNA or more). AluI (Roberts et al., 1976) was purchased from New England Biolabs, and 2 units were added for each µg of DNA. The mixture was incubated at 37°C for 5 to 16 h. Reactions using other restriction enzymes were done in the buffer system recommended by the vendor (New England Biolabs or Bethesda Research Labs) using at least 1 unit of enzyme per µg of DNA for at least 6 h at 37°C.

(g) Gel electrophoresis

Electrophoresis was performed in horizontal agarose gels using the apparatus described by McDonnell et al. (1977). The neutral buffer system used was 0.089 M-Tris-borate (pH 8.2), 0.0025 M-EDTA. Ethidium bromide (1 µg/ml) was included in all the gels and in the running buffers. The gels were run at 4.5 W (about 140 V). Alkaline gels were poured and run in 0.03 M-NaOH, 0.003 M-EDTA and were neutralized to visualize ethidium bromide fluorescence (McDonnell et al., 1977). Size markers were an EcoRI digest of bacteriophage λ DNA or a HpaII digest of polyoma DNA. We also used an HaeIII digest of bonnet monkey DNA, which gives prominent bands at 320 and 640 nucleotides (C. M. Rubin, C. M. Houck & C. W. Schmid, unpublished observations).

DNA was extracted from electrophoresis gels by either electrophoretic elution or buffer elution of the mashed gel followed by hydroxylapatite chromatography to remove residual agarose and ethidium bromide.

Gel patterns were visualized by ethidium bromide fluorescence under ultraviolet illumination. Photographs were taken on Polaroid type 55 P/N film using a Wratten no. 25 red filter (Manuelidis, 1977a). Densitometer scans of the photograph negative were performed on a Joyce-Loebl microdensitometer. Radioactive DNA patterns were obtained by slicing the gel and dissolving the individual slices in 1 ml of hot water. The 3H was then assayed in a Beckman LS-250 scintillation counter using either Instagel, Aquasol, Quantafluor, Scintiverse, Scintisolv or one of several other samples generously supplied by the vendors.

(h) Size fractionation of HeLa [3H]DNA

[3H]DNA in 0.05 M-sodium phosphate was alkaline denatured, neutralized with H3PO4 and sonicated as described above. The DNA was applied to a preparative alkaline agarose gel and fractionated according to size. The gel concentration varied between 0.7% and 1.2% agarose, depending on the size range desired. Portions of the isolated size fractions were rerun in analytical alkaline gels to determine their length.

(i) Isolation and renaturation of [3H]DNA complementary to 300-nucleotide repeats

Sonicated HeLa [3H]DNA was stripped of inverted repeated sequences as described above. The remaining DNA was then sized on a preparative agarose gel. DNA of the desired length was isolated by electrophoretic elution. The renaturation of the [3H]DNA was driven by at least a 5-fold mass excess of 300-nucleotide repetitive DNA to C₀t = 12 in 0.24 M-sodium phosphate at 65°C. The double-stranded DNA was isolated on hydroxylapatite. A portion of the 3H-labeled complement to the 300-nucleotide DNA was shortened by heating to 100°C for 15 min followed by treatment with 0.1 M-NaOH. The sizes of the [3H]DNA fragments were determined on a 1.2% alkaline agarose gel and were found to be 420 and 264 nucleotides. Following alkaline denaturation, the renaturation of the [3H]DNA complementary to 300-nucleotide repeats was driven with a large excess of short nuclear DNA in 0.24 M-sodium phosphate at 65°C, as described above.

3. Results

(a) Isolation of the Alu family of repeated DNA

We wished to determine whether the 300-nucleotide renatured repeated sequences in humans are composed of one family or a number of families of repeated sequences. The logic of our approach is to subject the 300-nucleotide renatured repeated sequences to cleavage by a number of restriction enzymes. If there are many families of sequences in the repeated DNA, then one might expect the sample to be degraded into a large number of DNA fragment lengths. Conversely, if there are only a small number of sequence families, one might expect the restriction enzyme to cut the DNA into discrete fragment lengths or not to cut the DNA at all.

To isolate 300-nucleotide repeated sequences, human placental DNA that had been sheared to 2000 nucleotides in length was denatured, renatured to C₀t = 68 and treated with S1 nuclease. The double-stranded DNA was fractionated according to length by electrophoresis on a 1.4% neutral agarose gel. Typical length profiles from two different digestions are shown in Figure 1. The exact distribution of lengths seems to vary from preparation to preparation. However, the profile always shows a prominent band at a length of about 300 nucleotides and multiples of that length (Fig. 1). The presence of the 300-nucleotide band is expected from the results of our previous electron microscope study of human DNA (Deininger & Schmid, 1976a). The relationship of the 300-nucleotide band to the 600-nucleotide and 900-nucleotide bands is not certain and is currently under investigation.

[Figure 1: densitometer scans, panels (a) and (b); abscissa: Length (nucleotides).]

FIG. 1. Length profiles of repetitive human DNA. Repetitive, S1-treated DNA was electrophoresed on 1.4% neutral agarose gels containing ethidium bromide. The length profiles are densitometer scans of the DNA-ethidium bromide fluorescence. In one preparation, 40% of the DNA was S1 resistant (a). The 300-nucleotide band, which corresponds to approx. 12% of the S1-resistant material or 5% of the mass of the genome, is used in subsequent studies. The profile in (b) is from a different preparation, treated under the same conditions, in which 23% of the DNA was S1 resistant. The scale shown gives approximate DNA lengths and was calibrated using an HaeIII digest of bonnet monkey DNA (see Materials and Methods).

We have also previously examined S1-digested, renatured repeated human DNA by the less sensitive method of exclusion chromatography (Houck et al., 1978). In that study we did not resolve a sharp 300-nucleotide band nor did we observe multiples (e.g. 600 and 900 nucleotides) of that length. We attribute this difference to the superior resolution of gel electrophoresis.

The preparation shown in Figure 1(a) was used in the following investigations. In this preparation 40% of the DNA was S1 resistant. Approximately 12% of the S1-resistant material, corresponding to 5% of the genome (40% × 0.12), is found under the 300-nucleotide peak (Fig. 1(a)). In a second study, only 23% of the genome was S1 resistant, and this material displayed a more structured length profile (Fig. 1(b)). While the total amount of S1-resistant material, as well as its apparent length profile, is variable, we find that in all cases a constant fraction of the genome, 5%, occurs as 300-nucleotide repeats (Fig. 1(a) and (b)). Heterogeneity in the lengths of single-strand tails that remain attached to otherwise duplex molecules after S1 digestion is expected to alter the apparent duplex length profiles obtained by gel electrophoresis. Because of these uncertainties in the S1 length studies, we are limited to the observation of a prominent 300-nucleotide duplex band, which corresponds to 5% of the genome. The width of this band (~35 nucleotides, HWHH†) is approximately twice the width of an homogeneous 340-nucleotide restriction fragment of the α-satellite component of bonnet monkey (~20 nucleotides, HWHH). Allowing again for possible dangling single-strand tails on the 300-nucleotide duplexes, which would cause some heterogeneity in the mobility of the 300-nucleotide duplexes, we can estimate the maximum length heterogeneity of the duplexes to be about ±15 nucleotides (35 - 20 nucleotides).

† HWHH, half width at half height.

The DNA contained in the 300-nucleotide band, 12% of the duplex (Fig. 1(a)), was isolated from the gel as described in Materials and Methods. The purified 300-nucleotide repetitive DNA (Fig. 2(a)) was then treated with a variety of restriction enzymes, including HaeIII, HpaII, HhaI, EcoRI, HindIII, HpaI, BglII, BamHI, XbaI, PstI, HindII and HaeII. None of these enzymes gave any discernible cutting of the DNA. When the 300-nucleotide DNA was treated with AluI, however, it was cut into two fragments with approximate lengths of 170 and 120 nucleotides (Fig. 2(b)). From the areas under the densitometer peaks (Fig. 2(b)), it can be estimated that about 25% of the DNA remains in the 300-nucleotide band after restriction and as much as 62% of the DNA is in the two Alu fragments (about 36% in the 170-nucleotide band and about 26% in the 120-nucleotide band). Since the 300-nucleotide band is cut into only two pieces, whose lengths (170 + 120) add up to approximately 300, there is probably a single family of repeated sequences that contain the AluI site 170 nucleotides from one end. The ratio of the areas of the 170-nucleotide and 120-nucleotide fragments (36/26 = 1.38) is approximately what would be expected for two fragments with relative sizes of 170 and 120 (170/120 = 1.42). For this reason, we think it likely that the 170 and 120-nucleotide fragments are the opposite ends of a single 300-nucleotide repeat.

It is interesting to estimate the fraction of the genome contained in the Alu family. In the digestion shown in Figure 2(b), about 60% of the 300-nucleotide repeated sequence was specifically cleaved by AluI. This 300-nucleotide sequence is about 12% of the S1-resistant DNA, which is 40% of the genome. A minimum of 3% of the genome (0.60 × 0.12 × 0.40) must therefore include the Alu family. We regard this as a minimum value for two reasons. First, repeated sequences are not exact copies of each other, but merely similar sequences that have diverged by mutation from an ancestral sequence. It is reasonable to suppose that some of the uncut 300-nucleotide repeats may have lost their Alu site by divergence. Second, this value includes only those Alu family members found in the 300-nucleotide repeated sequences. We have also observed similar AluI cleavage products in the 600-nucleotide repeated sequences (Fig. 1) and conceivably this family is also found in DNA fragments of other lengths.

[Figure 2: densitometer traces, panels (a), (b) and (c); abscissa: Length (nucleotides).]

FIG. 2. Length profile of AluI-restricted 300-nucleotide repeats. (a) A densitometer trace of the 300-nucleotide repetitive DNA isolated from the 1.4% agarose gel shown in Fig. 1(a) and rerun on 2.5% agarose. This 300-nucleotide DNA was cut with AluI as described in Materials and Methods and run on 2.5% agarose as shown in (b). From the relative areas under the peaks, it is estimated that approx. 25% of the mass of the 300-nucleotide DNA remains uncut by AluI. The bulk of the DNA is cut into 2 bands, a 170-nucleotide band, which contains 36% of the mass of DNA, and a 120-nucleotide band containing 26% of the DNA. The lengths of the bands were determined with reference to an HpaII digest of polyoma DNA. The scale shown on the abscissa is approximate. The DNA from the 170-nucleotide band was isolated from the gel and re-electrophoresed on 3% agarose as shown in (c). This sample is also used in subsequent renaturation studies.

DNA that was isolated from the 170-nucleotide peak (Fig. 2(b)) is designated as the Alu fragment. This fragment was re-electrophoresed on a 3% agarose gel to verify its homogeneity (Fig. 2(c)). Subsequent experiments, described below, have been performed on this preparation.

(b) Isolation of the Alu family from inverted repeated DNA

We previously observed that a large fraction of the inverted repeated sequences in human DNA contained a 300-nucleotide duplex region (Deininger & Schmid, 1976a). In addition, we found that the single-strand spacer loop on this duplex had the same length as the interspersed single copy sequences. These similarities led us to propose that these inverted repeated sequences are a subclass of 300-nucleotide interspersed repeated sequences. If this proposal is correct, the inverted repeated sequences should also contain the Alu family.

Inverted repeated sequences, labeled with 3H, were isolated and treated with the single-strand-specific nuclease S1 as described in Materials and Methods. In agreement with our electron microscope study (Deininger & Schmid, 1976a), approximately half of the DNA had a length of 300 nucleotides (Fig. 3(a)). The 300-nucleotide DNA was isolated from the gel (Fig. 3(b)) and treated with the restriction enzyme AluI (Fig. 3(c)). As an internal control, unlabeled 300-nucleotide repeated sequences were included with the 3H-labeled, inverted repeated sequences during the restriction (Fig. 3(d)). The inverted repeated sequences are found to contain the same AluI restriction site located in the same position as that found in the bulk of the 300-nucleotide repeats (Fig. 3(c)). The fraction of the 300-nucleotide inverted repeats that is cut by AluI (47%) is similar to the fraction of the total 300-nucleotide repeats cut by AluI (48% to 62% in different experiments). It is especially interesting to compare the length distribution of the restricted inverted repeated sequences (Fig. 3(c)) and the admixed 300-nucleotide interspersed repeats (Fig. 3(d)). We regard these distributions as being identical. These results confirm our proposal that 300-nucleotide inverted repeats are a subclass of the interspersed repeated sequences.

[Figure 3: length profiles, panels (a) to (d); abscissa: Length (nucleotides).]

FIG. 3. Length profiles of inverted repeated DNA. 3H-labeled DNA was briefly renatured and treated with S1 nuclease as described in Materials and Methods. The inverted repeated duplex DNA was then electrophoresed on 1.4% agarose, cut into 2 mm slices and counted to obtain the length profile shown in (a). The 300-nucleotide duplex, as shown by the hatched area, was isolated and a portion rerun on 3% agarose (b). The 3H-labeled 300-nucleotide inverted repeated DNA was mixed with unlabeled 300-nucleotide repetitive duplex DNA and restricted with AluI. The length profile of the inverted repeats (radioactivity) is shown in (c), and the ethidium bromide fluorescence of the carrier DNA is shown in (d). The lengths were determined using an HaeIII digest of bonnet monkey DNA as the standard. The length scale is approximate.

(c) Driven renaturations of repeated sequence families

(i) Renaturation of 300-nucleotide repeats

Our initial motivation for searching for an Alu family of repeated sequences was the observation of "50,000"-fold repetitive sequences. We wished to test 300-nucleotide interspersed repeated sequences, the Alu fragment, and 300-nucleotide inverted repeated sequences for the presence of sequences that renature 50,000 times faster than single copy sequences.


The 300-nucleotide repeated sequences (as shown in Fig. 2(a)) were radioactively labeled with 125I. The renaturation of the labeled DNA was then driven with a large excess (approx. 10,000-fold) of unlabeled 600-nucleotide nuclear DNA as described in Materials and Methods (Fig. 4(a)). In other experiments (not shown here) the renatured DNA resulting from this radioiodinated preparation was found to have a low melting temperature (64°C in 0.12 M-sodium phosphate) so that the renaturations were performed at 50°C in 0.24 M-sodium phosphate. The renaturation rate of these sequences was analyzed by assuming they contained two kinetic components (Fig. 4(a)). The faster component of the DNA renatures with a C₀t½ value of 0.0154. The C₀t½ value of unique DNA at 50°C in 0.24 M-sodium phosphate is about 800 (Deininger & Schmid, 1979) so the fast component of 300-nucleotide repeats is at least 52,000-fold repetitive (800/0.0154). Radioiodination results in some strand breakage, so that we believe the slower component is the result, in part, of length heterogeneity. Our subsequent results on 3H-labeled, inverted repeated sequences (below) reinforce this belief.
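The repetition estimate in this paragraph is the ratio of the single-copy C₀t½ to the C₀t½ of the repeated component. As a minimal sketch (the function name is hypothetical, and the calculation assumes ideal second-order kinetics with no correction for fragment length or sequence mismatch), the ratio can be reproduced for the fast component quoted here and for the inverted-repeat value of 0.0125 given in the caption of Fig. 4(c):

```python
def repetition_frequency(cot_half_unique, cot_half_repeat):
    """Fold repetition implied by the ratio of two C0t(1/2) values."""
    return cot_half_unique / cot_half_repeat


COT_HALF_UNIQUE = 800.0  # single-copy value quoted in the text

samples = [
    ("300-nucleotide repeats, fast component", 0.0154),
    ("300-nucleotide inverted repeats", 0.0125),
]

for label, cot_half in samples:
    fold = repetition_frequency(COT_HALF_UNIQUE, cot_half)
    print(f"{label}: about {fold:,.0f}-fold")
```

Rounded to the nearest thousand, the two ratios are 52,000 and 64,000, which match the values stated in the text.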

[Figure 4: renaturation curves, panels (a), (b) and (c); abscissa: log C₀t.]

FIG. 4. Driven renaturations of short repeated DNA.
(a) The renaturation of 300-nucleotide 125I-labeled repetitive DNA driven by at least a 7600-fold mass excess of 600-nucleotide unfractionated nuclear DNA. A least-squares fit to 2 second-order components gave 32% renaturing with C₀t½ = 0.0154, 20% renaturing with C₀t½ = 1.4 and 32% unrenaturable DNA.
(b) The 170-nucleotide Alu fragment of repetitive DNA was labeled with 125I and its renaturation was driven by at least a 14,000-fold mass excess of 600-nucleotide nuclear DNA. The least-squares fit to 2 components showed 32% renaturing with C₀t½ = 0.043, and 9% renaturing with C₀t½ = 3.0, and 48% unrenaturable.
(c) The 300-nucleotide 3H-labeled inverted repeated DNA was renatured with at least a 540-fold mass excess of 600-nucleotide unlabeled nuclear DNA. The renaturation profile could be fitted with a single component containing 78% of the DNA with C₀t½ = 0.0125, and 11% unrenaturable DNA.
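The fits quoted in this caption can be explored numerically. The sketch below is an illustration, not the authors' fitting procedure: it assumes each component follows the ideal second-order form, in which the fraction renatured at a given C₀t is f · C₀t / (C₀t + C₀t½), and it takes "renaturable DNA" to mean everything except the unrenaturable fraction, which is an interpretation. The function and variable names are hypothetical.

```python
def renatured_fraction(cot, components):
    """Fraction of tracer renatured at a given C0t.

    components: list of (fraction, cot_half) pairs, each ASSUMED to follow
    ideal second-order kinetics.
    """
    return sum(f * cot / (cot + cot_half) for f, cot_half in components)


def fast_share_of_renaturable(fast_fraction, unrenaturable_fraction):
    """Share of the renaturable DNA found in the fast component."""
    return fast_fraction / (1.0 - unrenaturable_fraction)


# Parameters as quoted in the caption of Fig. 4 (fast component listed first).
FIG4 = {
    "a": {"components": [(0.32, 0.0154), (0.20, 1.4)], "unrenaturable": 0.32},
    "b": {"components": [(0.32, 0.043), (0.09, 3.0)], "unrenaturable": 0.48},
    "c": {"components": [(0.78, 0.0125)], "unrenaturable": 0.11},
}

for panel, fit in FIG4.items():
    fast = fit["components"][0][0]
    share = fast_share_of_renaturable(fast, fit["unrenaturable"])
    at_one = renatured_fraction(1.0, fit["components"])
    print(f"({panel}) fast share of renaturable DNA: {share:.1%}; "
          f"fraction renatured at C0t = 1: {at_one:.2f}")
```

Under this interpretation the fast component accounts for about 47% of the renaturable DNA in (a) and about 61.5% in (b), in line with the 47% and 61% cited in the text for the 300-nucleotide DNA and the Alu fragment.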

The longer of the Alu restriction fragments of 300-nucleotide repeats (Fig. 2(c)) was isolated and labeled with 125I. Its renaturation was driven with at least a 14,000-fold excess of unlabeled nuclear DNA as described above. The renaturation of this repetitive sequence is shown in Figure 4(b). A least-squares fit to the data shows that a larger fraction of the renaturable DNA in this purified fraction of DNA is contained in the faster renaturing component, 61% of the renaturable DNA compared with 47% for the 300-nucleotide DNA. The C₀t½ value of the rapidly renaturing component (C₀t½ = 0.043) is somewhat higher than in the 300-nucleotide DNA C₀t plot, but this can be partially attributed to the effect of the different lengths (300 nucleotides versus 170 nucleotides) on the rate of renaturation (Hinnebusch et al., 1978).

(iii) Renaturation of 300-nucleotide inverted repeated sequences

We have also determined the renaturation rate of the 300-nucleotide 3H-labeled, inverted repeated sequences (Fig. 3(b)) by the driven renaturation approach described above. We find that the driven renaturation rate profile of these sequences can be accurately described by a single kinetic component (C₀t½ = 0.0125) that renatures 64,000 times faster (800/0.0125) than single copy sequences (Fig. 4(c)). In addition the 3H-labeled inverted repeated sequences are completely renaturable (Fig. 4(c)) as compared to the relatively poor renaturability of the two radioiodinated preparations described above (Fig. 4(a) and (b)). These observations imply that the radioiodinated samples are partially degraded as a result of the iodination. Because of this technical problem we do not regard the slowly renaturing component in the iodinated samples as being necessarily meaningful. We do think it is significant that (allowing for the shorter length of the Alu fragment) all three preparations contain a class of sequences that renature about 50,000 times faster than single copy sequences as implied in the Introduction. The actual repetition frequency of the class is likely to be much greater than 50,000-fold and will be estimated in the Discussion.

It should be noted that the driven renaturation of inverted repeated sequences from human DNA has been previously studied by two laboratories (Schmid & Deininger, 1975; Dott et al., 1976). In both studies the inverted repeated sequence preparations were found to contain a class of slowly renaturing sequences. The absence of this slowly renaturing class of sequences in this preparation of 300-nucleotide inverted repeated sequences (Fig. 4(c)) is presumably the result of the selective length fractionation used in this study. It is likely that there are other classes of inverted repeated sequences that are unrelated to the Alu family or 300-nucleotide interspersed repeated sequences. Since about half of the mass of inverted repeated sequences is contained in the short 300-nucleotide duplex regions (Fig. 3(a)), we expect this class to be especially numerous, a view that is consistent with our previous electron microscope results (Deininger & Schmid, 1976a).

(iv) Test of interspersion of 300-nucleotide repeated sequences

DNA that is 300 nucleotides long and has been isolated by S1 cleavage of 2000-nucleotide renatured repetitive DNA might be assumed to be DNA that was interspersed with unique sequences. There are other possibilities, however. For example, if the DNA were present in clusters of several types of 300-nucleotide repeated sequences, then the renaturation and S1 cleavage might free 300-nucleotide repeated sequences that were not interspersed with unique sequences. We have performed two types of experiments to decide whether the 300-nucleotide repeated sequences are interspersed with unique sequences.

One experiment tests the fraction of the genome that can be driven by the 300-nucleotide sequences as a function of the fragment length (Davidson et al., 1973). This experiment is also performed on the 170-nucleotide Alu fragment of repetitive DNA.

The other type of experiment involves testing whether a short DNA that contains a repeated sequence also contains a single copy sequence. The logic of this experiment is to first isolate short (approx. 500 nucleotides) fragments of radiolabeled repetitive DNA. These short fragments are subsequently degraded to an even smaller size (approx. 250 nucleotides) to liberate any single copy sequences that are covalently linked to repeated sequences. This approach has been used in the case of mouse DNA to demonstrate the interspersion of short repeated sequences with single copy DNA (Britten, 1972). Short fragment lengths are used in this experiment to test for closely interspersed sequences. In a previous study, relatively long DNA fragments were used in a similar experiment to merely test for the interspersion of repeated and single copy sequences (Schmid & Deininger, 1975).

(v) Renaturation of length-fractionated DNA

HeLa [3H]DNA was length-fractionated by gel electrophoresis as described in Materials and Methods. The fractions were re-electrophoresed to determine the actual lengths (Table 1). The length fractions were then renatured with an excess of various selected driver DNAs, and the extent of renaturation of the radiolabeled tracer was assayed by hydroxylapatite chromatography. The driver DNAs were either 600-nucleotide total nuclear DNA, 300-nucleotide repeated DNA (isolated as described above), or the 170-nucleotide Alu fragment of repetitive DNA (also as described above). To control for self-renaturation of the 3H-labeled length fractions, samples were allowed to self-renature for a short period to form renatured inverted repeated sequences or for one hour to test for the background self-renaturation of the tracer DNA. The results of all these renaturations are shown in Table 1. The zero-time binding and the 600-nucleotide DNA renatured to C₀t = 4 are similar to data published previously (Schmid & Deininger, 1975). The length fractions were driven by the 300 and 170-nucleotide repeated sequences for one hour so that the results could be compared to the control in which the length fractions were allowed to self-renature under the same conditions for one hour. For the 300-nucleotide repetitive driver, this corresponds to C₀t = 0.3, and for the 170-nucleotide repetitive driver, it corresponds to C₀t = 0.15. As a control on possible strand breakage during the one hour renaturation period described above, we re-examined the lengths of some samples after they were subjected to the entire renaturation protocol (Table 1). We believe this control will overestimate strand breakage, since heating DNA at 65°C results in depurination. Depurinated sites are hydrolyzed at the pH used for alkaline gel electrophoresis, but are reasonably stable at neutral pH.
The extent of strand breakage observed is reasonably small (Table 1) and is completely consistent with the depurination of DNA (Greer & Zamenhof, 1962). We conclude that DNA fragment length measured before renaturation is a reliable measure of the fragment length during renaturation. The controls as well as the results for the 300-nucleotide repetitive driver and the 170-nucleotide repetitive Alu fragment driver are plotted in Figure 5. The 300-nucleotide repetitive fraction drives the renaturation of about 41% of the genome

1500-5000

1000-5500

2700-6800

3500

3600

4900

22.4

1841

14.9 28.3

26.7

24.7

15.7

16.9

8.0

12.8

12.0

4.5

Control 1 h No driverᵈ

length fractions

1

47.6 58.2

69.1

50.7

44.6

38.5

29.6

22.5

32.2

17.8

1 h 170-nt Alu repeated DNA driverᶠ

to hydroxylapatiteʰ

driver DNAs

57.4

53.4

48.5

42.3

40.6

28.1

29.2

24.1

1 h 300-nt repeated DNA driverᵉ

% Bound

with various

59.2

56.3

04.7

43.5

Z39.0

39.8

27.5

21.5

23.7

Cot = 4 short whole cell DNA driverᵍ

81.7

64.7

65.7

61.1

49.0

40.2

33.8

Cot = 40 short whole cell DNA driverᵍ

ᵃ Lengths were determined by alkaline gel electrophoresis as described in the text with an EcoRI digest of λ DNA as size markers.
ᵇ The range of lengths is taken to be the range at half the peak maximum.
ᶜ Zero time refers to the percentage of the [3H]DNA length fractions bound to hydroxylapatite after a brief renaturation, as described in the text.
ᵈ The 1 h control is to account for any self-renaturation of the length-fractionated DNA that might have occurred during the 1 h incubation required for the driven renaturations with 300 and 170-nucleotide repeated sequences.
ᵉ The 300-nucleotide (nt) DNA driver in this 1 h renaturation results in a driver Cot value of 0.3.
ᶠ The 170-nucleotide (nt) Alu fragment of DNA used as a driver in this reaction results in a driver Cot value of 0.15.
ᵍ The “short whole cell” driver DNA is 600-nucleotide total nuclear DNA.
ʰ All renaturations were performed at 65°C in 0.24 M-sodium phosphate.

3670

4500

3550

10.6

1500-4400

2700

2660

6.5

1550

800-2500

5.7

5.3

4.1

timeᶜ

1800

-

Zero

7.7

500-1000

830

575

Median length after “renaturation”

of DNA

1050

300-700

490-730

610

Range of lengths at half-heightᵇ

500

Median length (nucleotides)ᵃ

Renaturation

TABLE


FIG. 5. Renaturation of DNA length fractions driven by short repeated DNAs. 3H-labeled DNA of various length fractions (see Table 1) was renatured with driver DNAs for 1 h at 65°C or was self-renatured. The fraction binding to hydroxylapatite is plotted as a function of length (nucleotides) for 300-nucleotide repetitive driver (+); 170-nucleotide Alu fragment driver (x); 1 h self-renaturation control (○); and zero-time binding control (△). Lines drawn through the data are not curve-fitted but are drawn to lend continuity to the data.

above background at a length of 4900 nucleotides. The 170-nucleotide Alu fragment drives the renaturation of about 30% of the genome above background. In addition, we have shown that a large portion of the zero-time binding DNA (which appears as background in this experiment) can be classed with the Alu family, so the Alu family is distributed over 30% to 52% of the genome at a length of 4900 nucleotides (Table 1). This is evidence that the sequences of the Alu family are interspersed with other types of sequences in the human genome. It does not, however, exclude the possibility that some of the sequences are present in a clustered arrangement elsewhere in the genome. It should also be noted that the extent of renaturation of the labeled tracer with both the Alu fragment and the 300-nucleotide repeated sequence compares well with the extent of renaturation found at Cot = 4 using total nuclear DNA as the driver (Table 1). For long DNA fragments (approx. 3000 nucleotides), we expect most of the repeated sequences to renature by Cot = 4. (By Cot = 40, long single copy sequences have begun to renature appreciably.) We conclude that both the Alu family and the 300-nucleotide repeated sequence must be widely distributed throughout the genome, and probably represent the dominant class of interspersed repeated sequences within the genome.

(vi) Renaturation of sheared DNA adjacent to 300-nucleotide repeats

We wished to examine the types of sequences found next to the 300-nucleotide repetitive sequences to test whether unique sequences are found in close interspersion with the 300-nucleotide repeats. To do this, we stripped HeLa [3H]DNA of inverted repeated sequences and isolated DNA of the desired length on a preparative alkaline agarose gel (see Materials and Methods). The 3H-labeled DNA was then renatured with an excess of 300-nucleotide repeats and the duplex fraction was isolated on hydroxylapatite. A portion of the 3H-labeled DNA was then degraded by heating to


reduce its size (Materials and Methods). The lengths of both the degraded and undegraded DNAs were determined by agarose gel electrophoresis. The undegraded complement was 420 nucleotides in length, whereas the degraded complement was 264 nucleotides long. The renaturation of the long and short [3H]DNA fractions was then driven with a large excess of short nuclear DNA. The unique portion of the renaturation curve is shown in Figure 6. A least-squares fit for both curves shows that the unsheared DNA contains no more than 2% unique sequences, while reducing the length of the DNA from 420 to 264 nucleotides released an additional 12% of the DNA as free unique sequences. This demonstrates that some of the 300-nucleotide repeated sequences are located next to unique sequences. Based on an approximate theory of renaturation (Schmid & Deininger, 1975), sample calculations suggest that the amount of unique DNA liberated on shearing the tracer to 264 nucleotides is consistent with an average repeated sequence length of about 500 nucleotides. This calculation is subject to a variety of uncertainties, so that we do not regard it as being quantitatively convincing and we will accordingly not outline the calculation here. We wish to note that the results of this interspersion experiment are in very good agreement with the results of an analogous experiment performed on mouse DNA (Britten, 1972). These results give qualitative support for the interspersion of short repeated sequences with single copy sequences rather than, for example, the interspersion of long clusters of 300-nucleotide repeated sequences with single copy sequences. This does not prove that all the 300-nucleotide repeated sequences are interspersed with unique sequences, but it is consistent with a large fraction of the 300-nucleotide repeats being directly interspersed with unique sequences.
Thus, both experimental tests for the interspersion of the 300-nucleotide sequence are consistent with a close interspersion of 300-nucleotide repeated sequences with

FIG. 6. Renaturation of DNA adjacent to 300-nucleotide repeated sequences. 3H-labeled DNA 420 nucleotides in length complementary to 300-nucleotide repeated sequences was isolated as described in the text. A portion of the DNA was degraded to an average length of 264 nucleotides. The renaturation of degraded and undegraded DNAs was driven with an excess of total nuclear DNA. The unique portion of the renaturation is shown as a function of log Cot for the 264-nucleotide (x) and the 420-nucleotide (+) DNAs. The data were fitted using Cot1/2 = 660 for unique human DNA (unpublished observations). The least-squares fit for the shortened DNA showed 14% renaturing with Cot1/2 = 550 and, for the longer DNA, 2% renaturing with Cot1/2 = MO.


unique sequences. The Alu family, which makes up the bulk of the 300-nucleotide repeated sequences, also appears to be interspersed with unique sequences.
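The bookkeeping behind the shearing experiment in section (vi) can be sketched in a few lines. This is a minimal illustration, not the fitting procedure used in this work: it assumes ideal second-order renaturation kinetics, in which the fraction reacted is A x Cot/(Cot + Cot1/2), and it reads the fitted value of 550 in the Figure 6 legend as a half-reaction Cot. The function name, constant names and the grid of log Cot values are hypothetical.

```python
def unique_fraction_renatured(cot, amplitude, cot_half):
    """Fraction of the tracer renatured as unique sequence at driver Cot.

    Assumes ideal second-order kinetics (an assumption of this sketch,
    not a statement of the fitting function used in the paper):
        f = amplitude * cot / (cot + cot_half)
    """
    return amplitude * cot / (cot + cot_half)


# Unique-sequence fractions quoted in the text for the two tracer lengths.
UNIQUE_420 = 0.02              # 420-nucleotide tracer (no more than 2%)
RELEASED_BY_SHORTENING = 0.12  # additional unique DNA freed at 264 nucleotides
UNIQUE_264 = UNIQUE_420 + RELEASED_BY_SHORTENING

# Half-reaction value taken from the Figure 6 legend for the shortened
# tracer (interpreting the legend value as Cot1/2 is an assumption).
COT_HALF_264 = 550.0

if __name__ == "__main__":
    print(f"unique fraction of the 264-nucleotide tracer: {UNIQUE_264:.2f}")
    # Illustrative grid of log Cot values; not the experimental time points.
    for log_cot in (1.5, 2.0, 2.5, 3.0, 3.5):
        cot = 10 ** log_cot
        f = unique_fraction_renatured(cot, UNIQUE_264, COT_HALF_264)
        print(f"log Cot = {log_cot:.1f}: {100 * f:.1f}% of tracer renatured as unique DNA")
```

At Cot equal to the half-reaction value the function returns half of the amplitude, which is the property a least-squares fit of this form would exploit to estimate both parameters.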

4. Discussion

We have isolated a family of 300-nucleotide renatured repeated sequences from human DNA, which can be identified by the presence of a site for the restriction enzyme AluI located 170 nucleotides from one end. The family includes 300-nucleotide inverted repeated sequences as well as other 300-nucleotide repeats. The Alu family of repeated sequences makes up a large fraction of the repeated DNA in the human genome. As reported in Results, we have observed a minimum of 3% of the genome in the Alu family. To fully appreciate the significance of this value, it is useful to consider the overall sequence organization of the human genome. We have presented evidence that about 60% of the human genome is occupied by an interspersion of 300-nucleotide repeated sequences with 2000-nucleotide single copy sequences (Schmid & Deininger, 1975; Deininger & Schmid, 1976a). According to this model, the interspersed repeated sequences would constitute about 8% of the genome (0.6 x 300/2300). As a minimum, the Alu family we observed here could conceivably account for about one-half (3% versus 8%) of the major interspersion pattern of the human genome. The repetition frequency of such a prevalent family would be very high. Assuming the haploid human genome has a complexity of roughly 2.5 x 10⁹ base-pairs (Lewin, 1974), the repetition frequency of the 300-nucleotide members of the Alu family must be at least 250,000 (>0.03 x 2.5 x 10⁹ base-pairs/300 base-pairs). This value is qualitatively consistent with our renaturation rate studies. Considering only our results on the inverted repeated sequences, which avoids the technical complications described for the radioiodinated samples, we find a single kinetic component that renatures 64,000 times faster than single copy sequences. It is well-known that the renaturation rate is dependent on the base-pairing fidelity of the duplex that forms (Bonner et al., 1973).
As a rough rule, the renaturation rate is decreased twofold for every 10 deg. C depression in the melting temperature of the duplex that is forming (Bonner et al., 1973). The melting temperature of renatured repeated sequences in human is about 10 to 20 deg. C lower than that of single copy sequences (Schmid & Deininger, 1975; Deininger & Schmid, 1976b, 1979; and unpublished studies on the samples reported in this work). We therefore estimate that the renaturation rate of these sequences is depressed by a factor of two to four. To be definite, we will assume the rate is reduced by a factor of three. The experiment indicating that the inverted repeated sequences renature 64,000 times faster than single copy sequences includes a comparison of the renaturation rates for a 300-nucleotide sequence and a 500 to 600-nucleotide single copy sequence. This difference in length also gives rise to a reduction in the relative rate of renaturation of the repeated sequence (Hinnebusch et al., 1978). For the lengths described above, we estimate this reduction to be 1.5-fold. Applying both corrections, we estimate the repetition frequency of the 300-nucleotide inverted repeated sequences to be roughly 290,000 (64,000 x 3 x 1.5). These corrections are approximate, but they demonstrate that the repetition frequency determined by DNA renaturation rate studies (approx. 290,000) is certainly consistent with the minimum number of copies of the Alu family we are able to isolate biochemically (approx. 250,000 copies).
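The two copy-number estimates above rest on short chains of arithmetic, which the following sketch reproduces using only the figures quoted in the text. The variable and function names are illustrative, and the twofold-per-10-degree rule is applied as a simple power law, which is an assumption of this sketch about how the rough rule extrapolates between 10 and 20 deg. C.

```python
# Figures quoted in the Discussion.
GENOME_BP = 2.5e9                   # haploid human genome complexity (base-pairs)
REPEAT_LEN = 300                    # interspersed repeat length (nucleotides)
SINGLE_COPY_LEN = 2000              # interspersed single copy length (nucleotides)
INTERSPERSED_GENOME_FRACTION = 0.6  # genome fraction showing this pattern
ALU_MIN_GENOME_FRACTION = 0.03      # minimum genome fraction in the Alu family

# Fraction of the genome made up of interspersed 300-nucleotide repeats.
interspersed_repeat_fraction = (
    INTERSPERSED_GENOME_FRACTION * REPEAT_LEN / (REPEAT_LEN + SINGLE_COPY_LEN)
)

# Minimum number of 300-nucleotide Alu family members (biochemical estimate).
min_copies_biochemical = ALU_MIN_GENOME_FRACTION * GENOME_BP / REPEAT_LEN


def mismatch_rate_factor(delta_tm_deg_c):
    """Rate reduction for a melting-temperature depression, assuming the
    rough rule of a twofold decrease per 10 deg. C applies as a power law."""
    return 2 ** (delta_tm_deg_c / 10)


# Kinetic estimate: observed rate ratio times the two corrections in the text.
OBSERVED_RATE_RATIO = 64_000
MISMATCH_CORRECTION = 3    # chosen in the text from the two- to fourfold range
LENGTH_CORRECTION = 1.5    # 300-nucleotide repeat versus 500 to 600-nucleotide tracer
copies_kinetic = OBSERVED_RATE_RATIO * MISMATCH_CORRECTION * LENGTH_CORRECTION

if __name__ == "__main__":
    print(f"interspersed repeats: {100 * interspersed_repeat_fraction:.1f}% of genome")
    print(f"minimum Alu copies (biochemical): {min_copies_biochemical:,.0f}")
    print(f"mismatch factor at 10 and 20 deg. C: "
          f"{mismatch_rate_factor(10):.0f} to {mismatch_rate_factor(20):.0f}")
    print(f"copies from renaturation kinetics: {copies_kinetic:,.0f}")
```

The kinetic product is 288,000, which the text rounds to roughly 290,000; the agreement with the biochemical minimum of 250,000 is within the stated uncertainty of the corrections.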


In contrast to these results on human DNA, most other higher organisms are thought to have interspersed repeated sequences with relatively low repetition frequencies (Davidson et al., 1975). One reported exception to this rule is calf DNA, which contains a component that renatures 64,000 times faster than single copy DNA (Davidson et al., 1975). It is conceivable, therefore, that the widespread interspersion of one family of sequences observed here might be a peculiarity of certain mammalian genomes. An important question is whether the Alu family of repeated sequences is interspersed with single copy sequences, especially since human DNA is known to contain 340-nucleotide repeated sequences arranged in tandem (Manuelidis, 1977a,b; Manuelidis & Wu, 1978; Maio et al., 1977). This tandem repeat is liberated as 340-nucleotide sequences by digestion with HaeIII and for purposes of discussion will be referred to as the HaeIII family. Because of their coincidentally similar lengths, it is conceivable that the HaeIII family and the Alu family are related sequences. This does not seem to be the case. The HaeIII family includes an EcoRI restriction site located 50 nucleotides from the HaeIII cleavage site (Manuelidis & Wu, 1978). Since we are unable to detect either an HaeIII site or an EcoRI site in the 300-nucleotide renatured repeated sequences, it would seem that the HaeIII and Alu families are different sequences. In addition, the base sequence of the HaeIII family is now available (Manuelidis & Wu, 1978). This sequence does not predict the observed spacing of the Alu sites at intervals of 170 and 120 nucleotides. These comparisons suggest that the Alu family is unrelated to the tandemly repeated sequences of the HaeIII family. They do not prove that the Alu family is unrelated to some other unknown families of clustered repeats. For this reason we wish to review the evidence that the Alu family is interspersed with single copy sequences.
The most direct experimental evidence for the interspersion of the Alu family with single copy DNA is the driven renaturation of long radioactive tracer sequences with both the 300-nucleotide repeated sequences and the Alu fragment. The results in both cases show these two samples contain interspersed repeats. The only possible flaw that we can envision in these experiments is that the Alu fragment is contaminated with some other class of interspersed repeats that are also found in the 300-nucleotide repeated sequence. While we cannot rigorously disprove this possibility, we think it unlikely. The length distribution of the Alu fragment is relatively homogeneous, so that these hypothetical contaminants would themselves be cut by AluI into a limited number of fragments. In other words, the putative contaminant would have to be essentially a single family of sequences. Perhaps the best evidence for the interspersion of these sequences in human DNA is our electron microscope study of renatured human DNA (Deininger & Schmid, 1976a). In that study, we found that 300-nucleotide repeated and inverted repeated duplex structures were distributed throughout at least half of the genome. Since at least half of the 300-nucleotide repeated sequences are members of the Alu family, it is again reasonable to conclude that they are widely distributed throughout the genome. It has been proposed that the 300-nucleotide interspersed repeated sequences perform a regulatory function (Davidson & Britten, 1973) either at the DNA or RNA level (Davidson et al., 1977). The inclusion of over half of these 300-nucleotide sequences in a single family of repetitive sequences (the Alu family) would limit their ability to function as complex regulatory elements.


It is especially interesting to compare these results on repetitive DNA sequences to results on repetitive heterogeneous nuclear RNA sequences. HeLa heterogeneous nuclear RNA forms duplex structures containing both intramolecular (inverted repeated) duplexes and intermolecular duplex structures (Federoff et al., 1977). There is a remarkable similarity between the structures observed in heterogeneous nuclear RNA and the structures observed in renatured DNA (Deininger & Schmid, 1976a). The average length of the duplex regions in heterogeneous nuclear RNA is again about 300 nucleotides (Federoff et al., 1977). These RNA-RNA duplexes are complementary to repeated sequences in human DNA (Jelinek et al., 1974). In addition, the double-stranded regions of heterogeneous nuclear RNA have been shown by fingerprint analysis to have a complexity of about 1000 nucleotides (Robertson et al., 1977; Jelinek et al., 1978). The fingerprint of inverted repeated DNA (Jelinek, 1977) is identical to the fingerprint observed for duplex heterogeneous nuclear RNA. Thus, Jelinek has found that duplex regions of heterogeneous nuclear RNA and inverted repeated sequences in DNA share indistinguishable fingerprint patterns, and this pattern is dominated by a relatively simple sequence. We have found in this work that at least half of 300-nucleotide inverted repeated DNA sequences and half of all other 300-nucleotide repeated sequences belong to one family. Comparing our independent results on inverted repeated DNA sequences, it seems likely that the heterogeneous nuclear RNA duplexes studied by Jelinek are transcribed from the Alu family of repeated sequences. We are currently testing this hypothesis by RNA-DNA hybridization and DNA sequencing. This hypothesis suggests that the function of the Alu family occurs at the level of the heterogeneous nuclear RNA.
It has been proposed that such repeated sequences might be processing sites for heterogeneous nuclear RNA (Robertson et al., 1977; Jelinek et al., 1978). Although other possibilities cannot be ruled out at this time, we find this to be an especially attractive proposal for the function of a single simple class of repeated sequences that are so widely distributed throughout the genome.

We thank Dr Prescott Deininger and Dr Theodore Friedmann for a gift of polyoma DNA and Ms Carol Rubin for valuable assistance. This work was supported by grant GM 21346 from the National Institutes of Health and by a graduate research award from the University of California, Davis.

REFERENCES

Bonner, T. I., Brenner, D. J., Neufeld, B. R. & Britten, R. J. (1973). J. Mol. Biol. 81, 123-135.
Britten, R. J. (1972). In Evolution of Genetic Systems (Smith, H. H., ed.), pp. 80-94, Gordon and Breach, New York.
Britten, R. J., Graham, D. E., Eden, F. C., Painchaud, D. M. & Davidson, E. H. (1976). J. Mol. Evol. 9, 1-23.
Chamberlin, M. E., Britten, R. J. & Davidson, E. H. (1975). J. Mol. Biol. 96, 317-333.
Davidson, E. H. & Britten, R. J. (1973). Quart. Rev. Biol. 48, 565-613.
Davidson, E. H., Hough, B., Amenson, C. & Britten, R. J. (1973). J. Mol. Biol. 77, 1-23.
Davidson, E. H., Galau, G. A., Angerer, R. C. & Britten, R. J. (1975). Chromosoma, 51, 253-259.
Davidson, E. H., Klein, W. H. & Britten, R. J. (1977). Develop. Biol. 55, 69-84.
Deininger, P. L. & Schmid, C. W. (1976a). J. Mol. Biol. 106, 773-790.
Deininger, P. L. & Schmid, C. W. (1976b). Science, 194, 846-848.
Deininger, P. L. & Schmid, C. W. (1979). J. Mol. Biol. 127, 437-460.


l&d,t, P. J., C:hmmg. I’. K. & Scrunders: 0. F’. 1IR?B), B?;osi~,r,;~j,ssry~ .L$, .$I %---Uilj.
Fedoroff, N., Wellauer, P. K. & Wall, R. (1977). Cell, 10, 597-610.
Greer, S. & Zamenhof, S. (1962). J. Mol. Biol. 4, 123-143.
Hinnebusch, A. G., Clark, V. E. & Klotz, L. C. (1978). Biochemistry, 17, 1521-1529.
Houck, C. M., Rinehart, F. P. & Schmid, C. W. (1978). Biochim. Biophys. Acta, 518, 37-52.
Jelinek, W. (1977). J. Mol. Biol. 115, 591-601.
Jelinek, W., Molloy, G., Fernandez-Munoz, R., Salditt, M. & Darnell, J. E. (1974). J. Mol. Biol. 82, 361-370.
Jelinek, W., Evans, R., Wilson, M., Salditt-Georgieff, M. & Darnell, J. E. (1978). Biochemistry, 17, 2776-2783.
Lewin, B. (1974). Gene Expression 2: Eucaryotic Chromosomes, John Wiley and Sons, New York.
Maio, J. J., Brown, F. L. & Musich, P. R. (1977). J. Mol. Biol. 117, 637-655.
Manuelidis, L. (1977a). Nucl. Acids Res. 3, 3063-3076.
Manuelidis, L. (1977b). Chromosoma, 66, 1-21.
Manuelidis, L. & Wu, J. C. (1978). Nature (London), 276, 92-94.
McDonell, M. W., Simon, M. N. & Studier, F. W. (1977). J. Mol. Biol. 110, 119-146.
Orosz, J. M. & Wetmur, J. G. (1974). Biochemistry, 13, 5467-5473.
Roberts, R. J., Myers, P. A., Morrison, A. & Murray, K. (1976). J. Mol. Biol. 102, 157-165.
Robertson, H. D., Dickson, E. & Jelinek, W. (1977). J. Mol. Biol. 115, 571-589.
Schmid, C. W. & Deininger, P. L. (1975). Cell, 6, 345-358.
Southern, E. M. (1970). Nature (London), 227, 794-798.
Stafford, D. W. & Bieber, D. (1975). Biochim. Biophys. Acta, 378, 18-21.
Wetmur, J. G. & Davidson, N. (1968). J. Mol. Biol. 31, 349-370.
Wilson, D. & Thomas, C. (1974). J. Mol. Biol. 84, 115-144.