The promoters and spacers in the rDNAs of the melanogaster species subgroup of Drosophila


Gene, 77 (1989) 271-285 Elsevier GEN 02942

(Recombinant DNA; simulans; mauritiana; teissieri; yakuba; transcription start point)

David C. Hayward* and David M. Glover

Cancer Research Campaign, Eukaryotic Molecular Genetics Research Group, Department of Biochemistry, Imperial College of Science and Technology, London SW72AZ (U.K.)

Received by S.G. Oliver: 25 August 1988
Revised: 15 November 1988
Accepted: 18 November 1988

SUMMARY

The spacer sequences of rDNAs of members of the melanogaster species subgroup of Drosophila (melanogaster, simulans, mauritiana, teissieri, and yakuba) have been compared. The external transcribed spacers (ETSs; the region encoding the 5’ end of the primary transcript, upstream from the 18S sequences) are highly conserved between D. melanogaster, D. simulans and D. mauritiana, whereas the more distantly related D. yakuba and D. teissieri differ in having apparent deletions of 22 and 27 bp, respectively, in this region. The divergence of nucleotide sequence upstream from the transcription start points is consistent with the established phylogeny of the five species. The sequences between bp positions -47 and +24 from the primary transcription start point show extremely little variation between species. This is also the case for sequences between the approximate bp positions -140 to -125 and -85 to -70. This could indicate a functional importance not only of the sequences next to the transcription start point, but also of these upstream regions. An array of 240-bp repeats can be found at a comparable distance upstream from the transcription start point in each species. Matrix homology comparisons indicate that for each species not only is the sequence at the primary transcription start point duplicated within the 240-bp repeats, as previously reported for D. melanogaster, but that this is part of a longer interrupted duplication which includes a region of strong similarity with the sequence between the approximate positions -105 to -65. This region is contained within one of the regions upstream from the transcription start point that is strongly conserved between the species. This sequence may therefore have functional significance not only for the transcription of the rRNA precursor, but also for transcription of the so-called NTS sequences, which is now known to occur. The 240-bp arrays are themselves highly conserved within a species, indicating that homogenisation mechanisms are operative. The divergence of these arrays between species is consistent with the phylogenetic tree. The 3’ sequences of the primary transcription unit, now known to be RNA-processing sites, are also highly similar between the species. Immediately downstream from these sites there is little homology between the rDNA of the different species, until 95-bp tandem arrays are reached in each case.

Correspondence to: Dr. D.M. Glover, Department of Biochemistry, Imperial College, London SW72AE (U.K.) Tel. (44)1.5895111, ext. 4109; Fax (44)15847596.
* Present address: Research School for Biological Sciences, Australian National University, Canberra, A.C.T. (Australia) Tel. (61)62494488.

Abbreviations: bp, base pair(s); D., Drosophila; ETS, external transcribed spacer; Myr, one million years; nt, nucleotide(s); NTS, non-transcribed spacer; PolIk, Klenow (large) fragment of E. coli DNA polymerase I; rDNA, DNA coding for rRNA; RF, replicative form; rRNA, ribosomal RNA.

0378-1119/89/$03.50 © 1989 Elsevier Science Publishers B.V. (Biomedical Division)

INTRODUCTION

In most eukaryotes, the genes (rDNA) coding for the major rRNAs are arranged as a series of tandem repeats located on one or more chromosomes. Comparisons of the rDNA from several organisms have made it clear that different regions of the rDNA repeat have undergone varying degrees of conservation during evolution. The regions corresponding to the mature rRNAs have been most conserved while the spacer sequences between the genes have diverged more rapidly, such that there may be little resemblance between the spacers of organisms taxonomically classified in the same genus (for reviews, see Fedoroff, 1979; Long and Dawid, 1980; Dover, 1982). Several laboratories have shown that the region immediately surrounding the transcription start point, together with sequences further upstream in the NTS, is responsible for directing efficient transcription of the rDNA genes. However, the sequences around the transcription start point, including those shown to be necessary for accurate transcription, show little similarity when comparisons are made across the genera. This difference in the rate of divergence reflects the functional constraints imposed upon the different regions. On the one hand, since the rRNAs interact with scores of ribosomal proteins, there is strong selection against changes in these regions. On the other hand, there is considerable variation in copy number of complete rDNA units and of repetitive elements within the rDNA spacers that is believed to result from unequal crossing over.

The NTS of D. melanogaster contains three principal groups of sub-repeats (Coen and Dover, 1982). These are the 95-bp, 330-bp and 240-bp tandemly repeating elements, which are found 5’ to 3’, in that order, in the spacer. A 60-bp sequence within the 240-bp repeats has previously been pointed out as having homology to the sequence from 25 bp upstream to 35 bp downstream from the transcription start point (Coen and Dover, 1982; Simeone et al., 1982; Miller et al., 1983).

The degree of conservation of rDNA promoter sequences differs from that found with many genes coding for proteins, where sequence elements upstream from the transcription start point such as the TATA box (Corden et al., 1980) and the CAAT box (Benoist et al., 1980) are found to be conserved throughout evolution. The difference between the two types of gene reflects their organisation in the genome and their modes of evolution. The rDNA promoter shows variation to the extent that the RNA polymerase I complex of one species cannot transcribe the rDNA of another (see Grummt, 1972; Reeder, 1984; Dover and Flavell, 1984). Presumably the genes encoding the proteins that have to interact with the rDNA promoter are undergoing parallel evolution.

The high degree of divergence of the transcriptional control regions of the rRNA genes means that the information gained by making comparisons between distantly related species is limited. By comparing these sequences in a group of closely related species, however, it should be possible to detect fast and slow diverging regions, the latter presumably having some function in the control of transcription. Dover and co-workers have approached this problem by analysing the divergence of rDNA promoters and spacers in two species from the subgenus Sophophora (D. melanogaster and Drosophila orena), and two species from the subgenus Drosophila (Drosophila hydei and Drosophila virilis; Tautz et al., 1987). These subgenera are thought to have diverged between about 30 Myr and 60 Myr ago. We have chosen to study the promoters and spacers from five sibling species in the melanogaster species subgroup of Drosophila (Throckmorton, 1975). The evolutionary time scale in which these species have diverged has been estimated from studies of electrophoretic polymorphisms (Eisses et al., 1979; Gonzales et al., 1982), and from the frequency of synonymous codon substitutions (substitutions in the protein-coding region which do not give rise to an amino acid change) in the alcohol dehydrogenase gene (Bodmer and Ashburner, 1984). These estimates vary within an order of magnitude, ranging from 0.8 Myr to 3.9 Myr for the divergence of D. melanogaster and D. simulans and between 2 Myr and 13 Myr for the divergence of D. melanogaster and D. orena. This degree of variation between estimates reflects the difficulty in putting absolute values on evolutionary distance. However, the phylogenetic tree shown in Fig. 1 is supported by a number of criteria, including studies of alloenzyme alleles (Eisses et al., 1979); heat-shock genes (Leigh Brown and Ish-Horowitz, 1981); mitochondrial genomes (Fauron and Wolstenholme, 1980); satellite nucleotide sequences (Barnes et al., 1978); rRNA and histone genes (Coen et al., 1982a); rDNA insertion elements (Roiha et al., 1983); the ‘500’ and ‘300’ noncoding families (Strachan et al., 1982); and polytene chromosome banding patterns (Lemeunier and Ashburner, 1976). In this paper we identify some slowly evolving sequences within the rDNA promoter regions of five of these species, and discuss their functional significance.

MATERIALS AND METHODS

(a) Cosmids

Cosmid clones containing rDNA from D. simulans and D. mauritiana were supplied by Dr. M.J. Browne. They were constructed by cloning partial EcoRI* digests of genomic DNA into the cosmid vector Homer I (Chia et al., 1982), and were isolated by virtue of their homology to the D. melanogaster rDNA repeat in the clone pDm238 (Roiha et al., 1981). EcoRI* digests were performed in 10 mM NaCl/10 mM Tris·HCl pH 7.4/1 mM EDTA containing 10% dimethylsulfoxide.

Fig. 1. Phylogenetic tree of Drosophila. (A) Phylogeny of the melanogaster subgroup (tree showing melanogaster, simulans, sechellia, mauritiana, orena, erecta, teissieri and yakuba). (B) Percentage homology of the sequences surrounding the transcription start point. The numbers in parentheses represent the corresponding homologies between the 240-bp spacer repeats in the three species where these have been compared. The numbers were obtained using the homology search option of the MICROGENIE program, which aligns sequences and counts the number of positions where they are matched, dividing by the total length. Single bp mismatches and small deletions or insertions are therefore scored equivalently. The analysis was confined to the region for which all five sequences were available, i.e., from nt -173 to 101. In the comparisons of the D. yakuba (yak) and D. teissieri (tei) sequences, the regions spanning the sequence deleted from the ETS (see RESULTS AND DISCUSSION, section b) are omitted from the analysis. Sequences from D. melanogaster and D. simulans are referred to as mel and sim, respectively.

        sim     mel            yak            tei
mau     95.5    91.0           85.3           87.0
sim             88.4 (88.4)    82.0 (81.9)    83.8
mel                            80.9 (82.2)    84.3
yak                                           92.5
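The percentage homology scoring described in the legend to Fig. 1 (count the aligned positions that match, then divide by the total length) can be sketched in a few lines of Python. This is a minimal illustration and not the MICROGENIE implementation: the function name `percent_homology`, the use of '-' for gaps, the choice of the alignment length as the denominator, and the toy sequences are all assumptions made for the example.

```python
def percent_homology(a: str, b: str) -> float:
    """Percentage of aligned positions at which two sequences match.

    Both inputs must already be aligned to the same length, with '-'
    marking gaps (an assumption of this sketch). A gap opposite a base
    counts as one mismatch, so single-bp substitutions and small
    insertions/deletions are scored alike. A column with gaps in both
    sequences is not counted as a match.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for x, y in zip(a.upper(), b.upper()) if x == y and x != "-"
    )
    return 100.0 * matches / len(a)


# Toy aligned sequences (hypothetical, not taken from the paper):
# one substitution and one single-base gap in ten aligned positions.
seq_1 = "ACGTACGTAC"
seq_2 = "ACGTTCG-AC"
print(f"{percent_homology(seq_1, seq_2):.1f}")  # prints 80.0
```

Dividing by the alignment length rather than by the length of either ungapped sequence keeps the score symmetric in its two arguments, which suits a pairwise table such as Fig. 1B.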

(b) Clones of Drosophila simulans

Fig. 2A shows a restriction map of part of the D. simulans cosmid SB15 insert and indicates the fragments which were subsequently subcloned. The subclones B15.24 and B15.10 were constructed by ligating an EcoRI digest of the D. simulans cosmid SB15 into pBR325. The subclone pDs18S was constructed by ligating the BglII-HindIII fragment from the 18S gene in B15.10 into pBR322 cut with HindIII + BamHI. Clones pDs90HR and pDs90TR were made by cloning gel-purified EcoRI-HindIII and EcoRI-TaqI fragments of the predicted size from B15.24 into pEMBL8+ cut with EcoRI + HindIII and EcoRI + AccI, respectively. To generate the M13 clones M13srRT500 and M13srTT660, the purified 1.6-kb EcoRI-BglII fragment from B15.10 was digested with TaqI, and the sticky ends were subsequently filled using PolIk. After gel purification the 660-bp TaqI fragment was ligated into SmaI-cut M13mp8, and the 500-bp EcoRI-TaqI fragment was cloned into the BamHI site of M13mp8 using BglII linkers. This process reconstructed the filled-in EcoRI site. The plasmid clone pDsrTT660 was made by excising the insert from an RF preparation of M13srTT660 with EcoRI + HindIII and cloning it into pEMBL8+. Similarly, pDsRT500 was made from M13rRT500 RF, but the insert was excised with EcoRI + BglII and cloned into pEMBL8+ cut with EcoRI + BamHI. This removed several tandem BglII sites which were present at one end of the insert as a consequence of the cloning strategy for M13srRT500. The construction of clone pDsRB was carried out by ligating the 1.6-kb EcoRI-BglII fragment from B15.10 into EcoRI + BamHI-cut pEMBL8+.

(c) Clones of Drosophila mauritiana

The D. mauritiana subclone MA89C was made by inserting a purified 4.4-kb EcoRI fragment from the cosmid MA89 into pBR325. This fragment was identified by hybridization to labelled 18S RNA and a restriction map is shown in Fig. 2B. The M13 clones M138marRT and M138marTT660 were made by digesting the 3.1-kb EcoRI-BglII fragment from MA89C with TaqI and filling the ends with PolIk. The 2-kb EcoRI-TaqI fragment and the 660-bp TaqI fragment were then gel purified and inserted into M13mp8 which had been cut with BamHI and blunted with PolIk. In order to obtain M13 bacteriophages with inserts in the opposite orientation, RFs of M138marRT and M138marTT660 were grown, and the insert was cut out with EcoRI + HindIII and inserted into EcoRI + HindIII-cut M13mp9, giving M139marRT and M139marTT660. pDmarRT was made later by cutting M139marRT RF with EcoRI + HindIII and cloning the insert into pEMBL9+ cut with the same enzymes.

(d) Clones of Drosophila yakuba and Drosophila teissieri

D. teissieri genomic libraries were made by partially digesting genomic DNA isolated from adults with Sau3A and ligating fragments between 15 and 20 kb into λEMBL4 cut with BamHI. D. yakuba genomic libraries were constructed by ligating partial MboI digests into the BamHI site of λEMBL4. These libraries were screened using pDs18S as a probe. Fig. 2 (panels C and D) shows the restriction maps of the promoter-containing HindIII fragments from D. teissieri and D. yakuba. The clone pDyLBH was made by inserting a gel-purified 4.5-kb HindIII-BglII fragment from a plasmid which contained the 5.6-kb HindIII spacer fragment into HindIII + BamHI-cut pEMBL9+. M13 clones M13y240A and M13y240B were constructed by ligating an EcoRI* digest of the purified 4.5-kb HindIII-BglII fragment into M13mp8 cut with EcoRI, while M13y240C, D and E were made by cloning blunt-ended, size-selected EcoRI* fragments from pDyLBH into end-repaired, BamHI-cut M13mp8. The clones were identified by virtue of their homology with the M13srRT500 insert. The plasmid subclones pDy240F, G and H and pDyPr contain TaqI fragments from a digest of the purified 4.5-kb EcoRI-BglII fragment inserted into pEMBL8+ cut with AccI. The D. teissieri clone pDtr3.3 contains the 5-kb HindIII fragment cloned into the HindIII site of pBR325; pDtLBH was subcloned from this and contains the 4-kb HindIII-BglII fragment inserted into HindIII + BglII-cut pEMBL9+. Subclone pDtPr was made by inserting a size-selected fragment from a TaqI digest of pDtLBH into the AccI site of pEMBL8+.

Fig. 2. Cleavage maps and sequencing strategies for the D. simulans, D. mauritiana, D. teissieri, and D. yakuba rDNA clones. (A) Part of the insert of the D. simulans cosmid SB15. (B) Part of the insert of the D. mauritiana cosmid MA89. (C) The HindIII fragment containing the transcription start point from D. yakuba. (D) The HindIII fragment containing the transcription start point from D. teissieri. Sites for the restriction enzymes EcoRI* (R), HindIII (H), BglII (B) and TaqI (T) are indicated (not all BglII, EcoRI* and TaqI sites are shown). Fragments which were subcloned are indicated by horizontal double-ended arrows. Dotted arrows indicate the sites from which sequence analysis was carried out. The subclones in panels A and B are: (1) B15.24; (2) B15.10; (3) pDs90HR; (4) pDs90TR; (5) pDsRB; (6) M13rRT500; (7) M13srTT660; (8) pDs18S; (9) MA89C; (10) M138marRT, and (11) M138marTT660. The subclones in panels C and D are: (1) pDyLBH; (2) pDyPr; (3) pDtr3.3; (4) pDtLBH; and (5) pDtPr. The expanded portion of the map in C represents one of the 240-bp repeats. At least six separate repeats were cloned and sequenced either from the TaqI sites or the EcoRI* sites shown. These repeats cannot be unambiguously assigned to a specific place in the spacer.

RESULTS AND DISCUSSION

(a) The region around the 3’ end of the 28S gene

In D. melanogaster the 3’ end of the 28S rRNA is also the 3’ end of the 40S precursor molecule, and it was thought that this was the site at which transcription terminated. However, recent evidence indicates that transcription can read through into the spacer from the preceding rDNA repeat (Tautz and Dover, 1986). Thus the 3’ end of the 28S gene may now be regarded as a site of rapid post-transcriptional processing rather than termination. It was of interest to examine the sequence of this region and the organisation of the 5’ end of the NTS sequences in the melanogaster species subgroup, since one might expect to find conserved regions which may have some function in RNA processing. Fig. 3 shows the sequence distal to a conserved HindIII site close to the 3’ end of the 28S gene in D. simulans, D. yakuba and D. teissieri, aligned with the previously determined D. melanogaster sequence (Mandal and Dawid, 1981). The precise end point of the gene has been determined for D. melanogaster by an S1 protection experiment (Mandal and Dawid, 1981), and the 28S gene sequences are indicated by asterisks. The sequences of the 3’ end of the 28S gene are identical in the four species shown and also in D. orena and D. virilis (Tautz et al., 1987), reflecting the functional constraints imposed upon this region.

This is not surprising, considering the high degree of sequence conservation in this region of the rRNAs between species less closely related than these (Gerbi, 1985). The sequences immediately downstream from the gene, however, show limited homology, and only when comparisons are made between the closely related species. Six out of the first 9 nt downstream from the gene are conserved between D. melanogaster and D. simulans, and 7 out of the first 8 between D. yakuba and D. teissieri. There is no obvious conservation between the sequences of all four species. Therefore, apart from the fact that the first 10 or so nt positions are rich in A residues, it is difficult to speculate about putative sequences which might act as signals for post-transcriptional processing. Such signals may be within the 28S gene sequences, or may depend upon RNA secondary structure involving regions of sequence farther downstream.

Fig. 3. Sequences around the 3’ end of the 28S rRNA gene. The sequence from a conserved HindIII site 27 bp upstream from the 3’ end of the 28S gene is shown for D. melanogaster (mel), D. simulans (sim), D. teissieri (tei) and D. yakuba (yak). Asterisks above the diagram indicate 28S rRNA gene sequences. The D. mauritiana (mau) sequence is from the EcoRI site in the clone M138marRT and is shown aligned with the other sequences. Positions where the sequences are the same as the D. melanogaster sequence are indicated by dashes; differences are shown by lower-case letters or gaps.

In the D. melanogaster spacer, the 95-bp repeats start 13 bp downstream from the end of the 28S gene. D. simulans and D. teissieri also contain 95-bp repeats showing good homology to a D. melanogaster consensus sequence. However, the distance between the end of the 28S gene and the start of the 95-bp arrays differs in each species. In D. simulans, 20 bp separate the 3’ end of the gene from the start of the repeats, while in D. teissieri this distance is 99 bp. The D. yakuba sequence could only be read with certainty to nt position 100, but this species also contains sequences homologous to the D. melanogaster 95-bp repeat further downstream, since these were cloned and sequenced during the cloning of the M13y240A and M13y240B inserts (not shown). Thus, a distance of at least 70 bp separates the end of the 28S gene from the 95-bp repeats in D. yakuba. The sequence of the D. orena NTS (Tautz et al., 1987) shows that the analogous region in this species is 71 bp long. We have determined the sequence of the D. mauritiana NTS from the EcoRI site (clone pDmarRT). This has good homology with the D. melanogaster 95-bp repeat. It is shown in Fig. 3 aligned with the D. melanogaster sequence from nt position 49, the beginning of the 95-bp array. We have not determined the length of sequence separating the 95-bp repeat from the 3’ end of the 28S gene in this species.

The only significant conservation found in the region separating the 28S gene from the 95-bp repeats occurs between the D. yakuba (yak) and D. teissieri (tei) sequences and is shown below:

tei 46 GTACTT$TT$TTGTTGGTT$~TGATGGTT;AGCGC
yak 58 GTACTTGTT TTGTTGGTTTTTGA~GGTT 84 AGCGC 89

The numbers refer to the nt positions in Fig. 3. Asterisks indicate mismatches or insertions/deletions. This region has clearly diverged more rapidly than the 95-bp repeats. Indeed, whereas the D. teissieri

and D. orena 95-bp repeats share 71% homology, no significant homology can be found between the sequences separating these repeats from the 28S gene in these species. The 95-bp repeats in D. simulans, D. mauritiana and D. teissieri show 95%, 93% and 75% homology, respectively, with a D. melanogaster consensus sequence, consistent with the established phylogeny. A high degree of sequence divergence in the region between the 28S gene and the 95-bp repeats has also been noted by Dover and his colleagues following their observations with D. virilis, D. hydei and D. orena (Tautz et al., 1987). The reason for this extraordinarily high level of divergence is unclear, but it may well point towards a fundamental turnover mechanism for the sequences at the end of the spacer repeats.

(b) Comparison of the regions surrounding the transcription start points

Most rDNA units in D. melanogaster have seven or eight copies of a 240-bp repeat preceding the promoter region. This repeat contains a region of homology with the sequence around the major site for the initiation of transcription (Coen and Dover, 1982; Simeone et al., 1982; Miller et al., 1983). The end of the last 240-bp repeat in the D. melanogaster NTS is about 140 bp upstream from the transcription start point. There is, therefore, a region between the 3’ end of the last repeat and the sequences around the transcription start point which is ‘unique’ within the spacer. To determine whether the organisation of this region is similar between the melanogaster sibling species, we carried out nucleotide sequence analysis as indicated in Fig. 2. The sequences are shown in Fig. 4 alongside the D. melanogaster sequence determined by Long et al. (1981). The transcription start points in the sibling species are inferred from the strong homology to the site in D. melanogaster which has been accurately mapped (Long et al., 1981), and is assigned to nt position +1. To facilitate comparisons, the sequences have been aligned to maximise homology. Consequently gaps have been introduced into the sequences and the position numbers shown do not necessarily correspond to the distance from the transcription start point.

The sequences shown in Fig. 4 exhibit good overall homology. The exceptions are the D. yakuba and D. teissieri sequences which, in comparison with the other species, have relatively large deletions of 22 bp and 27 bp, respectively, in the ETS (the sequence transcribed to give the 5’ end of the primary transcript, subsequently removed from mature rRNA by processing enzymes). A deletion of 40 bp has been found at a comparable site in the ETS of D. orena rDNA by Tautz et al. (1987). Inspection of these sequences reveals that the ETS of D. mauritiana, D. simulans and D. melanogaster rDNAs each contain a 30-bp inverted repeat at the corresponding position of the D. yakuba and D. teissieri deletions. In D. melanogaster this sequence from nt position 28 to 57 is:

28 CATTGTTCGAAATAT 42
   |||| | ||||||||
57 GTAATATGCTTTATA 43

Although this sequence could represent an insertion in the rDNA of the melanogaster complex, we show below that these sequences are present in the duplicated promoter sequences within the 240-bp spacer repeats of D. yakuba. One explanation for this arrangement would be that the archetypal promoter first became duplicated within the 240-bp unit, which itself became reiterated in the spacer. This would have been followed by a deletion of the inverted repeat in the ETS, occurring after the divergence of the yakuba and melanogaster complexes but before the divergence of the individual yakuba complex species. If this were the case, we would speculate that the inverted repeat could represent the recognition site for an enzyme system responsible for its excision, although we note that the ‘deletion’ in D. orena extends beyond the inverted repeat.

The % homology between all pairs of species from nt position -173 to 101, disregarding the mismatch resulting from the ETS deletion in the yakuba subgroup, is given in Fig. 1. The data are in agreement with the phylogenetic relationships, the D. mauritiana, D. melanogaster and D. simulans sequences showing higher homology to each other than they do to the D. yakuba and D. teissieri sequences, with D. mauritiana and D. simulans being most similar. The overall conservation of the sequences around the transcription start points is comparable with that found for synonymous codon changes in

178

-220

Illa” sim

---_----

lllel

T'PGGCAATTATATGAGTAAATTAA _c-a____at___at_g____--_c_a____a____at-g----___

yak

tei

IllA" sim me1 yak tei

-200 -190 -180 -170 ---____---_-_a-__a_---_______-~_____9--__---______--~-_~~~____

_ _

-90 -80 -70 __a____t______------__-------___----______ __a____t______-----___-------_------____--

sim me1

-150 ---_--_____---------___

-140 -130 _~~----__----

-120 -110 __- aaa---_a ---_a----_c_ __-__________~___~~~____aa--__----___-----

-60

'TGG a_=_-

AAAA TGAAGTG'TTCAT ---ta -t--tataat----ta -c--tataat--

-50 -40 -30 -20 -10 ~___~____----___------_---__---____---__-----___----c_--c---__--_---_____--_____ c__----____----___-----_-

TATTCTCGTAATATA'rAAGAGAATAGCCCG'rATGTTGGGTGGTAAATGGAATTGAAAATACCCGCT'PTGAGGA~GCGGGTTCAAAAAC'rACTAT A t_ctg_______t-g----_____-__---t-_t_------__~_~~~__-~~-__-------_____-t______--______________________ t_ctg_______t----____----_____t__ta---_-____-~---~___~~___------__------___-----_____-________________

10 20 _____-----__~------~-----__~~-----------~~-----

30 40 50 60 70 80 90 ~__-_______~-----__------_~___--__-----_~__~---_______________------cc_____ cc_----a__-------_ ~______-9__~-__~-----______-_a--c----_---_______________-_-____

1

ma" sim me1 yak tei

-160 _---

ATCATATACA'rATGAAAATGAATA'rTTATTATAI'GTATA'rAGGGGAAAAAATAATCATATAATATA'rATGAATAA __a__-at-_____g----______-___cg___-a--_t_-A_---t-_-t_ -- -t-____---___--_~___~----_______-_______---c-_---a-_~_-- -t__a____t___t_ -- -t_-----___----

-100 sim me1 yak tei

-210 ~---___-_____---

AGG'TAGGCAGTGGTTGCCGACCTC ----_ a_-----_____t______ --__~------------~-----110 ----_---_---------____-----__------

120 a_a

100

GCATTG'I.TCGAAATATA'rAT'PTCGTATAATGATTA'rACTTATAATAAAGTA'~ATTA'~rA'PCC(;TACA _-----_--_-___--- =___ ------~~~~-- _------c_-t--a-at t-g---- c__---____---~-------------- _-------

130 140 150 160 ______ _____-------~~-----~-_________g______________---g--__-------_------

170 180 __a___________-----____________________---------

190

200 _______

AAT'PTG'rT'rC'rCAGT'rCTT.rT'L.GAACACGGGAC'rTGGCrCCGCGAA~rAATAGGAATATACGC.rA'rT'rTAGATAHTATCGTTGAAA~AAA~rCAA~~~TC _ -_--- _--

tei 210

Ill?." sim me1 yak tei

220

230

240

__-____~_-_-____---________C--__----______________

'PATTATACATAGAATAACAAA'rCGTTTTCATATA'PTATCGT'rAA'PT'rT

Fig. 4. Sequences

around

the transcription

start point. The sequence

around

the start point for D. simuluns (sim), D. mauritiana (mau),

D. teissieri (tei) and D. yakuba (yak) are shown aligned with that for D. melunoguster (mel) shown in upper-case the sequences

are the same as the D. melanogaster sequence

Transcription

begins at nt position

placed over the nt position

1 and continues

to which the number

are indicated

rightward.

refers, whereas

the alcohol dehydrogenase genes (adh) of D. melanogaster, D. simulans, D. mauritiana and D. orena (Bodmer and Ashburner, 1984). If it is assumed that such codon changes represent mutation in the absence of selection, then this would imply that at least some parts of these sequences might not have been subjected to functional constraint. However, this assumption could be challenged, as there may well be constraints upon codon usage. Furthermore, single-copy genes are likely to evolve under an entirely different set of constraints from tandemly arranged multi-copy genes, such as rDNA, which are subject to ‘homogenisation mechanisms’. Nevertheless, certain regions of the spacer sequences around the start point are conserved to a higher degree (see below).

Fig. 5 shows a graphical representation of the sequence conservation between the sequences around the transcription start point from nt position -173 to +101. This was obtained by first determining a sequence of the most frequently occurring nucleotide at each position for the five species (gaps were included in the analysis). Nucleotide divergence from this ‘consensus’ was then plotted against nucleotide position. It can be seen that the graph shows a nonrandom distribution of nucleotide divergence. There is a large conserved region between nt -47 and +24, in which there are only four nt positions where all five sequences are not identical. This region contains the transcription start point and the region which is reiterated in the spacer repeats (region D). It also contains the region found by Kohorn and Rae (1983a,b) to be important for directing efficient transcription in vitro (region C). Two other smaller regions of conservation can be seen, labelled A and B, from approx. -140 to -125 and from approx. -85 to -70. The second of these falls within a region from approx. -105 to -65 which is found to be conserved between the sequences around the transcription start point and the 240-bp spacer repeats (see section d below). The region marked E represents the portion of the ETS which is deleted in D. yakuba and D. teissieri. In this region the analysis has only been carried out on the sequences of the other three species.

(Legend fragment of the preceding sequence figure; text incomplete.) [...] letters. Positions where [...] by dashes; differences are shown by lower-case letters or gaps. In the region upstream from the start point, the minus sign has been [...]; in the transcribed region the last digit of the number is in this position.

Fig. 5. Conserved sequences in the sequences surrounding the transcription start point. A consensus sequence was derived from the five sequences in Fig. 3 by taking the most frequently occurring nt at each position (gaps were included in the analysis). The number of times the sequences diverged from this consensus was then plotted against the nt position. Various regions of the sequence are indicated above the graph: A and B, two small regions of conservation; C, the region found by Kohorn and Rae (1983a,b) to be important for directing transcription in vitro; D, the region found to be reiterated in the D. melanogaster spacer; E, the region of the ETS which is deleted in D. teissieri and D. yakuba. In this region the analysis only concerns the sequences of the other three species. The transcription start point is at nt position 1.

(c) Comparison of the 240-bp spacer repeats

To determine whether the overall organisation of the 240-bp spacer repeat is similar in the other species, we cloned and sequenced such sequences from D. simulans, D. yakuba and D. teissieri. Fig. 6 shows these sequences compared to a consensus D. melanogaster sequence derived from a comparison of nine sequenced clones. Since the 240-bp repeats are in a tandem array, this consensus is circularly permutable but for clarity has been arranged so that nt position 243 in Fig. 6 is equivalent to the 3’ end of the 3’-most repeat. The D. yakuba repeat is also a consensus, derived from eight clones, whereas the D. simulans sequence is that of the last whole spacer repeat in the clone B15.10 (Fig. 3). The D. teissieri sequence shown is that of the longest available clone with spacer repeat homology. It has been arranged to give the best alignment with the other repeats, resulting in a gap between nt positions 90 and 172, for which the sequence has not been determined. The % homologies between these repeats shown in Fig. 1 indicate amounts of divergence similar to those of the sequences around the transcription start point. The most highly conserved region of about 50 bp (underlined in Fig. 6) coincides roughly with the duplication of the transcription start point within the D. melanogaster 240-bp spacer repeat.
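The consensus-and-divergence procedure described for Fig. 5, which also underlies the repeat consensus sequences of Fig. 6, can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length with `-` marking gaps. The function names, the tie-breaking rule and the toy alignment are hypothetical and are not taken from the paper.

```python
from collections import Counter


def consensus(aligned):
    """Most frequent symbol in each alignment column ('-' counts as a symbol)."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("all sequences must be aligned to the same length")
    # Counter.most_common breaks ties by first occurrence, i.e. by input order;
    # this tie rule is an assumption, not something stated in the text.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def divergence_profile(aligned):
    """Number of sequences that differ from the consensus at each column."""
    cons = consensus(aligned)
    return [sum(s[i] != c for s in aligned) for i, c in enumerate(cons)]


def percent_divergence(seq, cons):
    """Percentage of aligned positions at which seq differs from cons."""
    return 100.0 * sum(a != b for a, b in zip(seq, cons)) / len(cons)


# Hypothetical toy alignment of five sequences (not data from the paper).
toy = ["ACGT-A", "ACGTTA", "ACCT-A", "ACGT-A", "TCGT-A"]
print(consensus(toy))           # ACGT-A
print(divergence_profile(toy))  # [1, 0, 1, 0, 1, 0]
```

Plotting the divergence profile against nt position gives a graph of the kind described for Fig. 5, and averaging `percent_divergence` over a set of cloned repeats gives a figure comparable to the average divergence from consensus quoted for the 240-bp repeats.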

Fig. 6. The sequence of the 240-bp repeats. (A) The variation in nine independently sequenced D. melanogaster 240-bp spacer repeats and the consensus sequence derived from them. Repeats m240.a-c are taken from Miller et al. (1983), m240.d-f are from Coen and Dover (1982), m240.g and m240.h are from Simeone et al. (1986) and m240.i is from Simeone et al. (1982). (B) The variation in the sequences of eight cloned D. yakuba 240-bp spacer repeats together with the consensus sequence derived from them. Repeats y240.a-c are cloned EcoRI* fragments. Repeats y240.b and y240.c are incomplete since part of the sequence could not be read with certainty. Repeats y240.d-f are cloned TaqI fragments and are all truncated due to the presence of a TaqI site at nt position 22. Positions where the sequences match the consensus are indicated by upper-case letters. Positions where the sequences differ are indicated by lower-case letters or gaps. The highly conserved region corresponding to the reiteration of the transcription start point in D. melanogaster is heavily underlined. (C) The sequences from D. simulans, D. yakuba and D. teissieri are shown aligned with the D. melanogaster sequence. The D. melanogaster and D. yakuba sequences shown are the consensus sequences shown in panels A and B, respectively. The D. simulans sequence is that of the last whole spacer repeat in the clone B15.10. The D. teissieri sequence which is shown is that of the longest available clone with spacer repeat homology. It has been arranged to give the best alignment with the other repeats, resulting in a gap between nt positions 90 and 172, for which the sequence has not been determined. Positions which are the same as in D. melanogaster are indicated by a dash; differences are shown by lower-case letters or gaps.

The sequences of all the 240-bp repeats from D. melanogaster and D. yakuba which were used to derive the consensus sequences are also shown in Fig. 6. Two points are notable from this figure. First, the sequences of all the repeats are very homogeneous, diverging on average by only 2% from the consensus. Thus the processes responsible for maintaining homogeneity among the repeats act relatively swiftly compared to the rate of mutation. Second, some mutations are present in more than one repeat, and different repeats contain different subsets of the mutations. This reflects the action of the homogenising mechanism, most probably unequal crossing-over, which works throughout the length of the repeat. The frequency of occurrence of any particular mutation can be viewed as a transition stage in its eventual elimination or fixation in the population of repeats. A similar distribution of mutations has been observed in the spacer repeats of D. hydei and D. virilis (Tautz et al., 1987) and also in noncoding, tandemly repetitive DNA families (Strachan et al., 1985).

(d) Promoter duplications of the 240-bp repeats are part of a larger interrupted duplication

A comparison of the 240-bp spacer repeats and the sequences surrounding the transcription start point for D. melanogaster, D. simulans, D. yakuba and D. teissieri indicates that the highly conserved sequence at the transcription start point is duplicated within the spacer repeats for each of these species (Fig. 7). The first 75 or so bp of the lower sequences presented in Fig. 7A correspond to the end of the last spacer repeat, indicating that the array of 240-bp repeat elements ends approx. 150 bp upstream from the transcription start point, most clearly seen here with D. melanogaster, D. yakuba and D. teissieri. Furthermore, the 240-bp repeats are in the same phase with respect to the ‘unique’ sequence 3’ to the last repeating unit. Thus the overall organisation of the region of the NTS from the last spacer repeat to the transcription start site has been conserved.

For the three species for which we have determined the complete sequence of the 240-bp spacer repeat, D. simulans, D. yakuba and D. melanogaster, it is evident that the repeats contain a duplication of the transcription start point (Fig. 7B). The duplicated region in D. yakuba is slightly shorter than in the other species, the homology ending 24 bp downstream from the transcription start point, corresponding to the start of the ‘deletion’ in the ETS referred to above. Consequently, the D. yakuba repeat has a longer region of homology with the D. melanogaster transcription initiation site, extending beyond nt position 24, than with the D. yakuba start point (Fig. 7C).

Fig. 7. Homologies between the sequences surrounding the transcription start point and the spacer repeats. The homology search option of the MICROGENIE program was set to find stretches of sequence 15 nt long which showed at least 80% match. In all cases the upper sequence is from the spacer repeat, and the lower sequence is from the region surrounding the transcription start point. The sequence numbers shown are the same as those used in Figs. 4 and 6. Mismatches are indicated by asterisks above the sequence. (A) The homology between the ends of the spacer repeats and the sequences surrounding the transcription start points, shown for D. melanogaster, D. simulans, D. yakuba and D. teissieri. (B) The homology between the spacer repeats and the transcription start point, shown for D. melanogaster, D. simulans and D. yakuba. (C) The homology between the D. yakuba spacer repeat and the D. melanogaster transcription start point.

Matrix comparisons of these sequences indicate that the promoter duplications are part of a larger interrupted duplication for each of these species (Fig. 8). In such comparisons, diagonal lines are plotted wherever regions of homology occur between the sequences being compared. In Fig. 8, the abscissae represent the sequences around the transcription start points, and the ordinates the 240-bp spacer repeats. In each panel, the diagonal labelled 1 represents the end of the last 240-bp spacer repeat as shown in Fig. 7A. The diagonal labelled 3 represents the duplication of the region around the transcription start point within the 240-bp repeats as shown in Fig. 7B. In addition, for each of the three species there is a stretch of weaker homology labelled diagonal 2. This is in direct line with diagonal 3, indicating that the spatial relationships between these sequences are the same in the spacer repeats and the major rDNA promoter. This suggests that the reiterations of the transcription start points contained in the 240-bp spacer repeats, shown in Fig. 7B, are actually part of a larger duplication which starts about 105 nt upstream from the transcription start point. An alignment of the 240-bp repeat and promoter sequences showing these features is presented in Fig. 9. It can be seen that within the three species there is a block of homology which stretches from about -105 to -65. This region contains one of the sequence blocks, from approx. -85 to -70, that has been conserved between the sibling species (region B in Fig. 5). An analysis of the promoter and spacer repeat sequences from D. orena (Tautz et al., 1987) shows that a similar situation exists in this species. Tautz et al. (1987) also sequenced this region from D. virilis and D. hydei, which are classified in a different subgenus from the melanogaster species subgroup. In D. virilis the spacer repeats are 220 bp long and end almost immediately upstream from the transcription start point. They are therefore effectively duplications of the first 220 bp upstream from the start point. The situation in D. hydei is similar but there are three short regions which are peculiar to the repeat nearest the real promoter.

Fig. 8. Matrix comparisons of the sequences surrounding the transcription start point with the spacer repeats. The parameters of the matrix option of the MICROGENIE program were set so that a dot is plotted for each nt in any run of 13 nt which has at least eleven matches. Panels A, B and C show these comparisons for D. melanogaster, D. simulans and D. yakuba, respectively, in which the sequences around the transcription start point (abscissa) are taken from Fig. 4 and the 240-bp repeats (ordinates) from Fig. 6. In each case the diagonal lines labelled 1 and 3 represent the homology shown in panels A and B of Fig. 7. The diagonal labelled 2 represents a short stretch of homology upstream from that represented by diagonal 3.

Fig. 9. Alignment of the sequences upstream of the transcription start point and the spacer repeat sequences for D. melanogaster, D. simulans and D. yakuba. The labels 2 and 3 above the sequences indicate the regions of homology indicated by diagonals 2 and 3 in Fig. 8. The upper sequence is from the spacer repeat and the lower sequence is from the region around the transcription start point in each case. Boxes are drawn around matched nt within these regions.

In all of the 240-bp spacer repeats discussed in this paper, and also those from D. orena (Tautz et al., 1987), there is an A residue at nt position -19 with respect to the transcription start point duplications in the spacer repeats. In the case of the true transcription start point, however, there is a C residue at nt position -19 in the five sequences presented in Fig. 3 and also in D. orena (Tautz et al., 1987). The significance of this is unknown and the observation does not preclude the existence of spacer repeats, as yet unsequenced, which do not conform to this rule.
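The matrix comparisons of Fig. 8 can be sketched in a few lines. This is only an illustration of the stated rule (a dot for each nt in any run of 13 nt with at least eleven matches), not a reimplementation of the MICROGENIE program. The function names and the toy sequences are hypothetical, and marking every position of a qualifying run (rather than only the matching positions) is an assumed reading of the legend.

```python
def dot_matrix(x_seq, y_seq, window=13, min_matches=11):
    """Return the set of (x, y) index pairs at which a dot would be plotted.

    Every position of a window is marked when at least `min_matches` of its
    `window` aligned positions are identical (an assumed reading of the rule).
    """
    dots = set()
    for i in range(len(x_seq) - window + 1):
        for j in range(len(y_seq) - window + 1):
            matches = sum(x_seq[i + k] == y_seq[j + k] for k in range(window))
            if matches >= min_matches:
                dots.update((i + k, j + k) for k in range(window))
    return dots


def diagonal_offsets(dots):
    """Distinct diagonals (y index minus x index) on which dots fall."""
    return sorted({j - i for i, j in dots})


# Hypothetical toy sequences (not data from the paper).
x = "ACGTACGTACGTA"
y = "ACGTTCGTACGAA"  # differs from x at two positions: 11 of 13 match
z = "TCGTTCGTACGAA"  # differs from x at three positions: 10 of 13 match

print(diagonal_offsets(dot_matrix(x, y)))  # [0]
print(len(dot_matrix(x, z)))               # 0
```

Two stretches of homology that share the same offset lie on one straight line in the plot, which is the sense in which diagonal 2 is described as being in direct line with diagonal 3.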

(e) Concluding remarks

The conservation of the transcription start point duplications in the spacer suggests that they have functional significance. Indeed, the D. melanogaster repeats have been shown to direct transcription in vivo (Miller et al., 1983), and mapping of the 5’ end of the spacer transcripts shows that they initiate within the duplicated region at the nt position which is analogous to the transcription start point (Murtif and Rae, 1985). However, these transcripts are found at relatively low levels compared to the full-length rRNA precursor molecule and they are not found in the cytoplasm. One possibility is that the spacer transcripts themselves do not serve any function, but that the transcription start point duplications somehow affect the efficiency of rDNA transcription, perhaps by increasing the local concentration of RNA polymerase or transcription factors. It is possible that in an ancestral population the spacer repeats contained duplications extending 100 bp or more upstream from the transcription start point. If the mechanisms responsible for maintaining homogeneity among the repeats failed to act on the sequences immediately upstream from the transcription start point, a gradual divergence of these sequences from those in the repeats would occur. The observed homology (from approx. 105 bp to 65 bp upstream from the transcription start point) in D. melanogaster, D. simulans and D. yakuba suggests that these sequences may have a functional significance. This region contains one of two sequence blocks (-85 to -70) found to be conserved around the transcription start point in species comparisons (Fig. 5). The other conserved block is between nt -140 and -125. Neither of these sequence blocks is apparently necessary for efficient transcription in vitro, for which only sequences extending upstream to nt -43 are required (Kohorn and Rae, 1983).

We have recently carried out experiments with templates having varying lengths of upstream sequences introduced into cultured cells by transfection (Hayward and Glover, 1988). We found that, whilst the expression of the reporter gene downstream from the rDNA promoter could be observed with a construct having 43 bp of upstream rDNA sequences, we could no longer detect expression with constructs having either 60 bp or 72 bp of upstream sequences. However, constructs with 180 bp or 306 bp of upstream sequences were efficiently transcribed. The requirements for efficient transcription in vivo are therefore clearly more complex than with in vitro systems. It thus seems likely that the regions which are identified in this study as being conserved during evolution do have functional significance. The continued development of in vivo and in vitro systems to study Drosophila rDNA transcription will allow the role of these sequences to be tested.

ACKNOWLEDGEMENTS

We are grateful to the Cancer Research Campaign for supporting this work and providing a Studentship for D.C.H. and a Career Development Award for D.M.G.

REFERENCES

Barnes, S.R., Webb, D.A. and Dover, G.A.: The distributions of satellite and main-band DNA components in the melanogaster species subgroup of Drosophila. Chromosoma 67 (1978) 341-363.
Benoist, C., O’Hare, K., Breathnach, R. and Chambon, P.: The ovalbumin gene sequence of putative control regions. Nucleic Acids Res. 8 (1980) 127-142.
Bodmer, M. and Ashburner, M.: Conservation and change in the DNA sequences coding for alcohol dehydrogenase in sibling species of Drosophila. Nature 309 (1984) 425-430.
Chia, W., Scott, M.R.D. and Rigby, P.W.J.: The construction of cosmid libraries of eukaryotic DNA using the Homer series of vectors. Nucleic Acids Res. 10 (1982) 2503-2520.
Coen, E.S. and Dover, G.A.: Multiple Pol I initiation sequences in rDNA spacers of Drosophila melanogaster. Nucleic Acids Res. 10 (1982) 7017-7026.
Coen, E.S. and Dover, G.A.: Unequal exchanges and the coevolution of X and Y rDNA arrays in Drosophila melanogaster. Cell 33 (1983) 849-855.
Coen, E.S., Strachan, T. and Dover, G.A.: Dynamics of concerted evolution of ribosomal DNA and histone gene families in the melanogaster species subgroup of Drosophila. J. Mol. Biol. 158 (1982a) 17-35.
Coen, E.S., Thoday, J.M. and Dover, G.A.: Rate of turnover of structural variants in the rDNA gene family of Drosophila melanogaster. Nature 295 (1982b) 564-568.
Corden, J., Wasylyk, B., Buchwalder, A., Sassone-Corsi, P., Kedinger, C. and Chambon, P.: Promoter sequences of eukaryotic protein-coding genes. Science 209 (1980) 1406-1414.
Dover, G.A. and Flavell, R.B.: Molecular coevolution: DNA divergence and the maintenance of function. Cell 38 (1984) 622-623.
Eisses, K.I., Van Dijk, H. and Van Delden, W.: Genetic differentiation within the melanogaster species group of the genus Drosophila (Sophophora). Evolution 33 (1979) 1063-1068.
Fauron, L.M.R. and Wolstenholme, D.R.: Extensive diversity among Drosophila species with respect to nucleotide sequences within the adenine and thymine-rich region of mitochondrial DNA molecules. Nucleic Acids Res. 11 (1980) 2439-2452.
Fedoroff, N.: On spacers. Cell 16 (1979) 697-710.
Gerbi, S.A.: Evolution of ribosomal DNA. In MacIntyre, R. (Ed.), Molecular Evolutionary Genetics. Plenum, New York, 1985, pp. 419-517.
Gonzales, A.M., Cabrera, V.M., Larruya, J.M. and Gullan, A.: Genetic distance in the sibling species Drosophila melanogaster, Drosophila simulans and Drosophila mauritiana. Evolution 36 (1982) 517-522.
Grummt, I.: Nucleotide sequence requirements for specific initiation of transcription by RNA polymerase I. Proc. Natl. Acad. Sci. USA 81 (1982) 6908-6911.
Hayward, D.C. and Glover, D.M.: Analysis of the Drosophila rDNA promoter by transient expression. Nucleic Acids Res. 16 (1988) 4253-4268.
Kohorn, B.D. and Rae, P.M.M.: Localization of DNA sequences promoting RNA polymerase I activity in Drosophila. Proc. Natl. Acad. Sci. USA 80 (1983a) 3265-3268.
Kohorn, B.D. and Rae, P.M.M.: A component of the Drosophila RNA polymerase I promoter lies within the rRNA transcription unit. Nature 304 (1983b) 179-181.
Leigh-Brown, A.J. and Ish-Horowicz, D.: Evolution of the 87A and 87C heat shock loci in Drosophila. Nature 290 (1981) 677-682.
Lemeunier, F. and Ashburner, M.: Relationships within the melanogaster species group of the genus Drosophila (Sophophora), II. Phylogenetic relationships between six species based on polytene chromosome banding sequences. Proc. Roy. Soc. London Ser. B. 193 (1976) 275-294.
Long, E.O. and Dawid, I.B.: Repeated genes in eukaryotes. Annu. Rev. Biochem. 49 (1980) 727-764.
Long, E.O., Rebbert, M.L. and Dawid, I.B.: Nucleotide sequence of the initiation site for ribosomal DNA transcription in Drosophila melanogaster: comparison of genes with and without insertions. Proc. Natl. Acad. Sci. USA 78 (1981) 1513-1517.
Mandal, R.K. and Dawid, I.B.: The nucleotide sequence at the transcription termination site of ribosomal RNA in Drosophila melanogaster. Nucleic Acids Res. 9 (1981) 1801-1811.
Miller, J.R., Hayward, D.C. and Glover, D.M.: Transcription of the ‘non-transcribed’ spacer of Drosophila melanogaster rDNA. Nucleic Acids Res. 11 (1983) 11-19.
Murtif, V.L. and Rae, P.M.M.: In vivo transcription of rDNA spacers in Drosophila. Nucleic Acids Res. 13 (1985) 3221-3239.
Reeder, R.H.: Enhancers and ribosomal gene spacers. Cell 38 (1984) 349-351.
Roiha, H., Miller, J.R., Woods, L.C. and Glover, D.M.: Arrangements and rearrangements of sequences flanking the two types of rDNA insertion in D. melanogaster. Nature 290 (1981) 749-753.
Roiha, H., Read, C.A., Browne, M.J. and Glover, D.M.: Widely differing degrees of sequence conservation of the two types of rDNA insertion within the melanogaster species subgroup of Drosophila. EMBO J. 2 (1983) 721-726.
Simeone, A., de Falco, A., Macino, G. and Boncinelli, E.: Sequence organisation of the ribosomal spacer of Drosophila melanogaster. Nucleic Acids Res. 10 (1982) 8263-8272.
Simeone, A., La Volpe, A. and Boncinelli, E.: Nucleotide sequence of a complete ribosomal spacer of D. melanogaster. Nucleic Acids Res. 13 (1985) 1089-1101.
Strachan, T., Webb, D. and Dover, G.A.: Transition stages of molecular drive in multiple-copy DNA families in Drosophila. EMBO J. 4 (1985) 1701-1708.
Tautz, D. and Dover, G.A.: Transcription of the tandem array of ribosomal DNA in Drosophila melanogaster does not terminate at any fixed point. EMBO J. 5 (1986) 1267-1273.
Tautz, D., Tautz, C., Webb, D. and Dover, G.A.: Evolutionary divergence of promoters and spacers in the rDNA family of four Drosophila species. Implications for molecular coevolution in multigene families. J. Mol. Biol. 195 (1987) 525-542.
Throckmorton, L.H.: The phylogeny, ecology and geography of Drosophila. In King, R. (Ed.), Handbook of Genetics, Vol. 3. Plenum, New York, 1975, pp. 421-469.