Gene, 77 (1989) 271-285 Elsevier GEN 02942
The promoters and spacers in the rDNAs of the melanogaster species subgroup of Drosophila
(Recombinant DNA; simulans; mauritiana; teissieri; yakuba; transcription start point)
David C. Hayward* and David M. Glover
Cancer Research Campaign, Eukaryotic Molecular Genetics Research Group, Department of Biochemistry, Imperial College of Science and Technology, London SW72AZ (U.K.) Received by S.G. Oliver: 25 August 1988 Revised: 15 November 1988 Accepted: 18 November 1988
SUMMARY
The spacer sequences of rDNAs of members of the melanogaster species subgroup of Drosophila (melanogaster, simulans, mauritiana, teissieri, and yakuba) have been compared. The external transcribed spacers (ETSs; the region encoding the 5’ end of the primary transcript, upstream from the 18S sequences) are highly conserved in D. melanogaster, D. simulans and D. mauritiana, whereas the more distantly related D. yakuba and D. teissieri differ in having apparent deletions of 22 and 27 bp, respectively, in this region. The divergence of nucleotide sequence upstream from the transcription start points is consistent with the established phylogeny of the five species. The sequences between bp positions -47 and +24 from the primary transcription start point show extremely little variation between the species. This is also the case for sequences between the approximate bp positions -140 to -125 and -85 to -70. This could indicate a functional importance not only of the sequences next to the transcription start point, but also of these upstream regions. An array of 240-bp repeats can be found at a comparable distance upstream from the transcription start point in each species. Matrix homology comparisons indicate that for each species not only is the sequence at the primary transcription start point duplicated within the 240-bp repeats as previously reported for D. melanogaster, but that this is part of a longer interrupted duplication which includes a region of strong similarity with the sequence between the approximate positions -105 to -65. This region is contained within one of the regions upstream from the transcription start point that is strongly conserved between the species. This sequence may therefore have functional significance not only for the transcription of the rRNA precursor, but also for transcription of the so-called NTS sequences which is now known to occur. The 240-bp arrays are themselves highly conserved within a species, indicating that homogenisation mechanisms are operative. The divergence of these arrays between species is consistent with the phylogenetic tree. The 3’ sequences of the primary transcription unit, now known to be RNA-processing sites, are also highly similar between the species. Immediately downstream
Correspondence to: Dr. D.M. Glover, Department of Biochemistry, Imperial College, London SW72AE (U.K.) Tel. (44)1.5895111, ext.4109; Fax(44)15847596.
* Present address: Research School for Biological Sciences, Australian National University, Canberra, A.C.T. (Australia) Tel. (61)62494488.
Abbreviations: bp, base pair(s); D., Drosophila; ETS, external transcribed spacer; Myr, one million years; nt, nucleotide(s); NTS, non-transcribed spacer; PolIk, Klenow (large) fragment of E. coli DNA polymerase I; rDNA, DNA coding for rRNA; RF, replicative form; rRNA, ribosomal RNA.
0378-1119/89/$03.50 © 1989 Elsevier Science Publishers B.V. (Biomedical Division)
from these sites there is little homology between the rDNA of the different species, until 95-bp tandem arrays are reached in each case.
INTRODUCTION
In most eukaryotes, the genes (rDNA) coding for the major rRNAs, are arranged as a series of tandem repeats located on one or more chromosomes. Comparisons of the rDNA from several organisms have made it clear that different regions of the rDNA repeat have undergone varying degrees of conservation during evolution. The regions corresponding to the mature rRNAs have been most conserved while the spacer sequences between the genes have diverged more rapidly, such that there may be little resemblance between the spacers of organisms taxonomically classified in the same genus (for reviews, see Fedoroff, 1979; Long and Dawid, 1980; Dover, 1982). Several laboratories have shown that the region immediately surrounding the transcription start point, together with sequences further upstream in the NTS, is responsible for directing efficient transcription of the rDNA genes. However, the sequences around the transcription start point, including those shown to be necessary for accurate transcription show little similarity when comparisons are made across the genera. This difference in the rate of divergence reflects the functional constraints imposed upon the different regions. On the one hand, since the rRNAs interact with scores of ribosomal proteins, there is strong selection against changes in these regions. On the other hand there is considerable variation in copy number of complete rDNA units and of repetitive elements within the rDNA spacers that is believed to result from unequal crossing over.

The NTS of D. melanogaster contains three principal groups of sub-repeats (Coen and Dover, 1982). These are the 95 bp, 330 bp and 240 bp tandemly repeating elements which are found 5’ to 3’, in that order, in the spacer. A 60-bp sequence within the 240-bp repeats has previously been pointed out as having homology to the sequence from 25 bp upstream to 35 bp downstream from the transcription start point (Coen and Dover, 1982; Simeone et al., 1982; Miller et al., 1983).

The degree of conservation of rDNA promoter sequences differs from that found with many genes coding for proteins, where sequence elements upstream from the transcription start point such as the TATA box (Corden et al., 1980), and the CAAT box (Benoist et al., 1980) are found to be conserved throughout evolution. The difference between the two types of gene reflects their organisation in the genome and their modes of evolution. The rDNA promoter shows variation to the extent that the RNA polymerase I complex of one species cannot transcribe the rDNA of another (see Grummt, 1972; Reeder, 1984; Dover and Flavell, 1984). Presumably the genes encoding the proteins that have to interact with the rDNA promoter are undergoing parallel evolution.

The high degree of divergence of the transcriptional control regions of the rRNA genes means that the information gained by making comparisons between distantly related species is limited. By comparing these sequences in a group of closely related species, however, it should be possible to detect fast and slow diverging regions, the latter presumably having some function in the control of transcription. Dover and co-workers have approached this problem by analysing the divergence of rDNA promoters and spacers in two species from the subgenus Sophophora (D. melanogaster and Drosophila orena), and two species from the subgenus Drosophila (Drosophila hydei and Drosophila virilis; Tautz et al., 1987). These subgenera are thought to have an evolution of between about 30 Myr and 60 Myr. We have chosen to study the promoters and spacers from five sibling species in the melanogaster species subgroup of Drosophila (Throckmorton, 1975). The evolutionary time scale in which these species have diverged has been estimated from studies of electrophoretic polymorphisms (Eisses et al., 1979; Gonzales et al., 1982), and on the frequency of synonymous codon substitutions (substitutions in the protein-coding region which do not give rise to an amino acid change) in the alcohol dehydrogenase gene (Bodmer and Ashburner, 1984). These estimates vary within an order of magnitude ranging from 0.8 Myr to 3.9 Myr for the divergence of D. melanogaster and D. simulans and between 2 Myr and 13 Myr for the divergence of D. melanogaster and D. orena. This degree of variation between estimates reflects the difficulty in putting absolute values on evolutionary distance. However, the phylogenetic tree shown in Fig. 1 is supported by a number of criteria, including studies of alloenzyme alleles (Eisses et al., 1979); heat-shock genes (Leigh Brown and Ish-Horowicz, 1981); mitochondrial genomes (Fauron and Wolstenholme, 1980); satellite nucleotide sequences (Barnes et al., 1978); rRNA and histone genes (Coen et al., 1982a); rDNA insertion elements (Roiha et al., 1983); the ‘500’ and ‘300’ noncoding families (Strachan et al., 1982); and polytene chromosome banding patterns (Lemeunier and Ashburner, 1976). In this paper we identify some slowly evolving sequences within the rDNA promoter regions of five of these species, and discuss their functional significance.

MATERIALS AND METHODS

(a) Cosmids
Cosmid clones containing rDNA from D. simulans and D. mauritiana were supplied by Dr. M.J. Browne. They were constructed by cloning partial EcoRI* digests of genomic DNA into the cosmid vector Homer I (Chia et al., 1982), and were isolated
Fig. 1. Phylogenetic tree of Drosophila. (A) Phylogeny of the melanogaster subgroup. [Tree diagram; species shown: simulans, melanogaster, sechellia, mauritiana, orena, erecta, teissieri, yakuba.] (B) Percentage homologies between the sequences surrounding the transcription start point. The corresponding numbers in parentheses represent the % homology of the 240-bp spacer repeats in the three species where these have been compared.

        sim     mel            yak            tei
mau     95.5    91.0           85.3           87.0
sim             88.4 (88.4)    82.0 (81.9)    83.8
mel                            80.9 (82.2)    84.3
yak                                           92.5

The numbers were obtained using the homology search option of the MICROGENIE program which aligns sequences and counts the number of positions where they are matched, dividing by the total length. Single bp mismatches and small deletions or insertions are therefore scored equivalently. The analysis was confined to the region for which all five sequences were available, i.e., from nt -173 to 101. In the comparisons of the D. yakuba (yak) and D. teissieri (tei) sequences, the regions spanning the sequence deleted from the ETS (see RESULTS AND DISCUSSION, section b) are omitted from the analysis. Sequences from D. melanogaster and D. simulans are referred to as mel and sim, respectively.
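The scoring rule described in the legend to Fig. 1 (count the aligned positions that match and divide by the total length, so that a mismatch and a gap cost the same) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it is not the MICROGENIE implementation, it assumes the two sequences have already been aligned and padded with `-` for gaps, and the function name `percent_homology` is hypothetical.

```python
def percent_homology(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of aligned positions at which two pre-aligned sequences match.

    Assumption: both strings come from the same alignment, so they have equal
    length and use `gap` to mark insertions/deletions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")

    # Ignore columns where both rows are gaps; these can appear when two rows
    # are taken out of a larger multiple alignment.
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("no comparable positions")

    # A base opposite a gap never matches, so single-base mismatches and
    # single-position insertions/deletions are scored equivalently.
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy example (invented sequences, not data from the paper):
# 10 positions, one mismatch and one gap, so 8 of 10 positions match.
example = percent_homology("ACGTACGTAC", "ACGAAC-TAC")
```

For the yakuba/teissieri comparisons described in the legend, the columns spanning the ETS deletion would have to be removed from both rows before calling the function.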
by virtue of their homology to the D. melanogaster rDNA repeat in the clone pDm238 (Roiha et al., 1981). EcoRI* digests were performed in 10 mM NaCl/10 mM Tris·HCl pH 7.4/1 mM EDTA containing 10% dimethylsulfoxide.
(b) Clones of Drosophila simulans
Fig. 2A shows a restriction map of part of the D. simulans cosmid SB15 insert and indicates the fragments which were subsequently subcloned. The subclones B15.24 and B15.10 were constructed by ligating an EcoRI digest of the D. simulans cosmid SB15 into pBR325. The subclone pDs18S was constructed by ligating the BglII-HindIII fragment from the 18S gene in B15.10 into pBR322 cut with HindIII + BamHI. Clones pDs90HR and pDs90TR were made by cloning gel-purified EcoRI-HindIII and EcoRI-TaqI fragments of the predicted size from B15.24 into pEMBL8+ cut with EcoRI + HindIII and EcoRI + AccI, respectively. To generate the M13 clones M13srRT500 and M13srTT660, the purified 1.6-kb EcoRI-BglII fragment from B15.10 was digested with TaqI, and the sticky ends subsequently filled using PolIk. After gel purification the 660-bp TaqI fragment was ligated into SmaI-cut M13mp8, and the 500-bp EcoRI-TaqI fragment was cloned into the BamHI site of M13mp8 using BglII linkers. This process reconstructed the filled-in EcoRI site. The plasmid clone pDsrTT660 was made by excising the insert from an RF preparation of M13srTT660 with EcoRI + HindIII and cloning it into pEMBL8+. Similarly, pDsRT500 was made from M13rRT500 RF but the insert was excised with EcoRI + BglII and cloned into pEMBL8+ cut with EcoRI + BamHI. This removed several tandem BglII sites which were present at one end of the insert as a consequence of the cloning strategy for M13srRT500. The construction of clone pDsRB was carried out by ligating the 1.6-kb EcoRI-BglII fragment from B15.10 into EcoRI + BamHI-cut pEMBL8+.

(c) Clones of Drosophila mauritiana
The D. mauritiana subclone MA89C was made by inserting a purified 4.4-kb EcoRI fragment from the cosmid MA89 into pBR325. This fragment was identified by hybridization to labelled 18S RNA and a restriction map is shown in Fig. 2B. The M13 clones M138marRT and M138marTT660 were made by digesting the 3.1-kb EcoRI-BglII fragment from MA89C with TaqI and PolIk filling the ends. The 2-kb EcoRI-TaqI fragment and the 660-bp TaqI fragment were then gel purified and inserted into M13mp8 which had been cut with BamHI and blunted with PolIk. In order to obtain M13 bacteriophages with inserts in the opposite orientation, RFs of M138marRT and M138marTT660 were grown, the insert cut out with EcoRI + HindIII, and inserted into EcoRI + HindIII-cut M13mp9, giving M139marRT and M139marTT660. pDmarRT was made later by cutting M139marRT RF with EcoRI + HindIII and cloning the insert into pEMBL9+ cut with the same enzymes.

(d) Clones of Drosophila yakuba and Drosophila teissieri
D. teissieri genomic libraries were made by partially digesting genomic DNA isolated from adults with Sau3A and ligating fragments between 15 and 20 kb into λEMBL4 cut with BamHI. D. yakuba genomic libraries were constructed by ligating partial MboI digests into the BamHI site of λEMBL4. These libraries were screened using pDs18S as a probe. Fig. 2 (panels C and D) shows the restriction maps of the promoter containing HindIII fragments from D. teissieri and D. yakuba. The clone pDyLBH was made by inserting a gel-purified 4.5-kb HindIII-BglII fragment from a plasmid which contained the 5.6-kb HindIII spacer fragment into HindIII + BamHI-cut pEMBL9+. M13 clones M13y240A and M13y240B were constructed by ligating an EcoRI* digest of the purified 4.5-kb HindIII-BglII fragment into M13mp8 cut with EcoRI, while M13y240C, D and E were made by cloning blunt-ended, size-selected EcoRI* fragments from pDyLBH into end-repaired, BamHI-cut M13mp8. The clones were identified by virtue of their homology with the M13srRT500 insert. The plasmid subclones pDy240F, G and H and pDyPr contain TaqI fragments from a digest of the purified 4.5-kb EcoRI-BglII fragment inserted into pEMBL8+ cut with AccI. The D. teissieri clone pDtr3.3 contains the 5-kb HindIII fragment cloned into the HindIII site of pBR325; pDtLBH was subcloned from this and
contains the 4-kb HindIII-BglII fragment inserted into HindIII + BglII-cut pEMBL9+. Subclone pDtPr was made by inserting a size selected fragment from a TaqI digest of pDtLBH into the AccI site of pEMBL8+.

Fig. 2. Cleavage maps and sequencing strategies for the D. simulans, D. mauritiana, D. teissieri, and D. yakuba rDNA clones. (A) Part of the insert of the D. simulans cosmid SB15. (B) Part of the insert of the D. mauritiana cosmid MA89. (C) The HindIII fragment containing the transcription start point from D. yakuba. (D) The HindIII fragment containing the transcription start point from D. teissieri. Sites for the restriction enzymes EcoRI* (R), HindIII (H), BglII (B) and TaqI (T) are indicated (not all BglII, EcoRI* and TaqI sites are shown). Fragments which were subcloned are indicated by horizontal double-ended arrows. Dotted arrows indicate the sites from which sequence analysis was carried out. The subclones in panels A and B are: (1) B15.24; (2) B15.10; (3) pDs90HR; (4) pDs90TR; (5) pDsRB; (6) M13rRT500; (7) M13srTT660; (8) pDs18S; (9) MA89C; (10) M138marRT, and (11) M138marTT660. The subclones in panels C and D are: (1) pDyLBH; (2) pDyPr; (3) pDtr3.3; (4) pDtLBH; and (5) pDtPr. The expanded portion of the map in C represents one of the 240-bp repeats. At least six separate repeats were cloned and sequenced either from the TaqI sites or EcoRI* sites shown. These repeats cannot be unambiguously assigned to a specific place in the spacer.

RESULTS AND DISCUSSION

(a) The region around the 3’ end of the 28S gene
In D. melanogaster the 3’ end of the 28S rRNA is also the 3’ end of the 40S precursor molecule, and it was thought that this was the site at which transcription terminated. However, recent evidence indicates that transcription can read through into the spacer from the preceding rDNA repeat (Tautz and Dover, 1986). Thus the 3’ end of the 28S gene may now be regarded as a site of rapid post-transcriptional processing rather than termination. It was of interest to examine the sequence of this region and the organisation of the 5’ end of NTS sequences in the melanogaster species subgroup, since one might expect to find conserved regions which may have some function in RNA processing. Fig. 3 shows the sequence distal to a conserved HindIII site close to the 3’ end of the 28S gene in D. simulans, D. yakuba and D. teissieri, aligned with the previously determined D. melanogaster sequence (Mandal and Dawid, 1981). The precise end point of the gene has been determined for D. melanogaster by an S1 protection experiment (Mandal and Dawid, 1981), and the 28S gene sequences are indicated by asterisks. The sequences of the 3’ end of the 28S gene are identical in the four species shown and also in D. orena and D. virilis (Tautz et al., 1987), reflecting the functional constraints imposed upon this region.
[Fig. 3 alignment: sequences of mau, mel, sim, tei and yak, nt positions 1 to 250, with asterisks above the 28S rRNA gene sequences.]

Fig. 3. Sequences around the 3’ end of the 28S rRNA gene. The sequence from a conserved HindIII site 27 bp upstream from the 3’ end of the 28S gene is shown for D. melanogaster (mel), D. simulans (sim), D. teissieri (tei) and D. yakuba (yak). Asterisks above the diagram indicate 28S rRNA gene sequences. The D. mauritiana (mau) sequence is from the EcoRI site in the clone M138marRT and is shown aligned with the other sequences. Positions where the sequences are the same as the D. melanogaster sequence are indicated by dashes; differences are shown by lower-case letters or gaps.

This is not surprising, considering the high degree of sequence conservation in this region of the rRNAs between species less closely related than these (Gerbi, 1985). The sequences immediately downstream from the gene, however, show limited homology, and only when comparisons are made between the closely related species. Six out of the first 9 nt downstream from the gene are conserved between D. melanogaster and D. simulans, and 7 out of the first 8, between D. yakuba and D. teissieri. There is no obvious conservation between the sequences of all four species. Therefore, apart from the fact that the first 10 or so nt positions are rich in A residues, it is difficult to speculate about putative sequences which might act as signals for post-transcriptional processing. Such signals may be within the 28S gene sequences, or may depend upon RNA secondary structure, involving regions of sequence farther downstream.

In the D. melanogaster spacer, the 95-bp repeats start 13 bp downstream from the end of the 28S gene. D. simulans and D. teissieri also contain 95-bp repeats showing good homology to a D. melanogaster consensus sequence. However, the distance between the end of the 28S gene and the start of the 95-bp arrays differs in each species. In D. simulans, 20 bp separate the 3’ end of the gene from the start of the repeats, while in D. teissieri this distance is 99 bp. The D. yakuba sequence could only be read with certainty to nt position 100, but this species also
contains sequences homologous to the D. melanogaster 95-bp repeat further downstream, since these were cloned and sequenced during the cloning of the M13y240A and M13y240B inserts (not shown). Thus, a distance of at least 70 bp separates the end of the 28S gene from the 95-bp repeats in D. yakuba. The sequence of the D. orena NTS (Tautz et al., 1987) shows that the analogous region in this species is 71 bp long. We have determined the sequence of the D. mauritiana NTS from the EcoRI site (clone pDmarRT). This has good homology with the D. melanogaster 95-bp repeat. It is shown in Fig. 3 aligned with the D. melanogaster sequence from nt position 49, the beginning of the 95-bp array. We have not determined the length of sequence separating the 95-bp repeat from the 3’ end of the 28S gene in this species. The only significant conservation found in the region separating the 28S gene from the 95-bp repeats occurs between the D. yakuba (yak) and D. teissieri (tei) sequences and is shown below:
tei
46
GTACTT$TT$TTGTTGGTT$~TGATGGTT;AGCGC
yak 58 GTACTTGTT
TTGTTGGTTTTTGA~GGTT
84 AGCGC 89
The numbers refer to the nt positions in Fig. 3. Asterisks indicate mismatches or insertions/deletions. This region has clearly diverged more rapidly than the 95-bp repeats. Indeed, whereas the D. teissieri
and D. orena 95-bp repeats share 71% homology, no significant homology can be found between the sequences separating these repeats from the 28S gene in these species. The 95-bp repeats in D. simulans, D. mauritiana and D. teissieri show 95%, 93% and 75% homology, respectively, with a D. melanogaster consensus sequence consistent with the established phylogeny. A high degree of sequence divergence in the region between the 28S gene and the 95-bp repeats has also been noted by Dover and his colleagues following their observations with D. virilis, D. hydei and D. orena (Tautz et al., 1987). The reason for this extraordinarily high level of divergence is unclear, but it may well point towards a fundamental turnover mechanism for the sequences at the end of the spacer repeats.

(b) Comparison of the regions surrounding the transcription start points
Most rDNA units in D. melanogaster have seven or eight copies of a 240-bp repeat preceding the promoter region. This repeat contains a region of homology with the sequence around the major site for the initiation of transcription (Coen and Dover, 1982; Simeone et al., 1982; Miller et al., 1983). The end of the last 240-bp repeat in the D. melanogaster NTS is about 140 bp upstream from the transcription start point. There is, therefore, a region between the 3’ end of the last repeat and the sequences around the transcription start point which is ‘unique’ within the spacer. To determine whether the organisation of this region is similar between the melanogaster sibling-species, we carried out nucleotide sequence analysis as indicated in Fig. 2. The sequences are shown in Fig. 4 alongside the D. melanogaster sequence determined by Long et al. (1981). The transcription start points in the sibling-species are inferred from the strong homology to the site in D. melanogaster which has been accurately mapped (Long et al., 1981), and is assigned to nt position +1. To facilitate comparisons, the sequences have been aligned to maximise homology. Consequently gaps have been introduced into the sequences and the position numbers shown do not necessarily correspond to the distance from the transcription start point.

The sequences shown in Fig. 4 exhibit good overall homology. The exceptions are the D. yakuba and D. teissieri sequences which, in comparison with the other species, have relatively large deletions of 22 bp and 27 bp, respectively, in the ETS (the sequence transcribed to give the 5’ end of the primary transcript, subsequently removed from mature rRNA by processing enzymes). A deletion of 40 bp has been found at a comparable site in the ETS of D. orena rDNA by Tautz et al. (1987). Inspection of these sequences reveals that the ETS of D. mauritiana, D. simulans and D. melanogaster rDNAs each contain a 30-bp inverted repeat at the corresponding position of the D. yakuba and D. teissieri deletions. In D. melanogaster this sequence from nt position 28 to 57 is:

28 CATTGTTCGAAATAT 42
   |||| | ||||||||
57 GTAATATGCTTTATA 43
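The pairing pattern of the inverted repeat shown above can be checked mechanically. The sketch below is only an illustration, not part of the original analysis: it takes the two 15-nt arms as printed (nt 28 to 42, and nt 43 to 57 written 5’ to 3’), folds the second arm back on the first, and marks each position where the facing bases are Watson-Crick complements. The helper name `arm_pairing` is hypothetical.

```python
# Watson-Crick complements for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def arm_pairing(left_arm: str, right_arm: str) -> str:
    """Return a string with '|' where the two arms of an inverted repeat pair.

    Both arms are given 5'->3'. The right arm is reversed so that its last
    base faces the first base of the left arm, as in a hairpin diagram.
    """
    if len(left_arm) != len(right_arm):
        raise ValueError("arms must have equal length")
    facing = right_arm[::-1]
    return "".join(
        "|" if COMPLEMENT[a] == b else " "
        for a, b in zip(left_arm.upper(), facing.upper())
    )


left = "CATTGTTCGAAATAT"   # nt 28-42 of the D. melanogaster ETS, as printed above
right = "ATATTTCGTATAATG"  # nt 43-57, i.e. the lower row of the diagram read 5'->3'

pattern = arm_pairing(left, right)
paired = pattern.count("|")
```

With these two arms, 13 of the 15 facing positions are complementary; the two unpaired positions are the fifth and seventh from the 5’ end of the left arm.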
Although this sequence could represent an insertion in the rDNA of the melanogaster complex, we show below that these sequences are present in the duplicated promoter sequences within the 240-bp spacer repeats of D. yakuba. One explanation for this arrangement would be that the archetypal promoter first became duplicated within the 240-bp unit which itself became reiterated in the spacer. This would have been followed by a deletion event of the inverted repeat in the ETS occurring after the divergence of the yakuba and melanogaster complexes, but before the divergence of the individual yakuba complex species. If this were the case, we would speculate that the inverted repeat could represent the recognition site for an enzyme system responsible for its excision, although we note that the ‘deletion’ in D. orena extends beyond the inverted repeat. The % homology between all pairs of species from nt position -173 to 101, disregarding the mismatch resulting from the ETS deletion in the yakuba subgroup, is given in Fig. 1. The data are in agreement with the phylogenetic relationships, the D. mauritiana, D. melanogaster and D. simulans sequences showing higher homology to each other than they do to the D. yakuba and D. teissieri sequences, with D. mauritiana and D. simulans being most similar. The overall conservation of the sequences around the transcription start points is comparable with that found for synonymous codon changes in
the alcohol dehydrogenase genes (adh) of D. melanogaster, D. simulans, D. mauritiana and D. orena (Bodmer and Ashburner, 1984). If it is assumed that such codon changes represent mutation in the absence of selection, then this would imply that at least some parts of these sequences might not have been subjected to functional constraint. However, this assumption could be challenged as there may well be constraints upon codon usage. Furthermore, single-copy genes are likely to evolve under an entirely different set of constraints than tandemly arranged multi-copy genes, such as rDNA, which are subject to ‘homogenisation mechanisms’. Nevertheless, certain regions of the spacer sequences around the start point are conserved to a higher degree (see below).

[Fig. 4 alignment: sequences of mau, sim, mel, yak and tei, nt positions -220 to 240.]

Fig. 4. Sequences around the transcription start point. The sequence around the start point for D. simulans (sim), D. mauritiana (mau), D. teissieri (tei) and D. yakuba (yak) are shown aligned with that for D. melanogaster (mel) shown in upper-case letters. Positions where the sequences are the same as the D. melanogaster sequence are indicated by dashes; differences are shown by lower-case letters or gaps. Transcription begins at nt position 1 and continues rightward. In the region upstream from the start point, the minus sign has been placed over the nt position to which the number refers, whereas in the transcribed region the last digit of the number is in this position.

Fig. 5 shows a graphical representation of the sequence conservation between the sequences around the transcription start point from nt position -173 to 101. This was obtained by first determining a sequence of the most frequently occurring nucleotide at each position for the five species (gaps were included in the analysis). Nucleotide divergence from this ‘consensus’ was then plotted against nucleotide position. It can be seen that the graph shows a nonrandom distribution of nucleotide divergence. There is a large conserved region between nt -47 and 24, in which there are only four nt positions where all five sequences are not identical. This region contains the transcription start point, and the region which is reiterated in the spacer repeats (region D). It also contains the region found by Kohorn and Rae (1983a,b) to be important for directing efficient transcription in vitro (region C). Two other smaller regions of conservation can be seen, labelled A and B, from approx. -140 to -125 and from approx. -85
219
c Po*ition
Nucl.otid.
Fig. 5. Conserved sequences
the five sequences above the graph: directing
regions
in the sequences
surrounding
in Fig. 3 by taking the most frequently diverged
from this consensus
A and B, two small regions
transcription
occurring
the transcription nt at each position
was then plotted of conservation;
start point is at nt position
(gaps were included
against the nt position.
the sequences
was derived
and Rae (1983a,b)
in the D. melunoguster spacer;
only concerns
sequence
in the analysis).
The number
Various regions of the sequence
C, the region found by Kohorn
in vitro; D, the region found to be reiterated
deleted in D. teissieriand D. yakuba. In this region the analysis
start point. A consensus
from the of times
are indicated
to be important
for
E, the region of the ETS which is
of the other three species. The transcription
1.
to -70. The second of these falls within a region from approx. -10.5 to -65 which is found to be conserved between the sequences around the transcription start point and the 240-bp spacer repeats (see section c below). The region marked E represents the portion of the ETS which is deleted in D. yakuba and D. teissieri. In this region the analysis has only been carried out on the sequences of the other three species. (c) Comparison of the 240-bp spacer repeats To determine whether the overall organisation of the 240-bp spacer repeat is similar in the other species, we cloned and sequenced such sequences from D. simulans, D. yakuba and D. teissieri. Fig. 6 shows these sequences compared to a consensus D. melanogaster sequence derived from a comparison of nine sequenced clones. Since the 240-bp re-
peats are in a tandem array, this consensus is circularly permutable but for clarity has been arranged so that nt position 243 in Fig. 6 is equivalent to the 3’ end of the 3’-most repeat. The D. yakuba repeat is also a consensus derived from eight clones, whereas the D. simulans sequence is that of the last whole spacer repeat in the clone B15.10 (Fig. 3). The D. teissieri sequence shown is that of the longest available clone with spacer repeat homology. It has been arranged to give the best alignment with the other repeats resulting in a gap between nt positions 90 and 172, for which the sequence has not been determined. The y0 homologies between these repeats shown in Fig. 1 indicate similar amounts of divergence as the sequences around the transcription start point. The most highly conserved region of about 50 bp (underlined in Fig. 6) coincides roughly with the duplication of the transcription start point within the D. melanogaster 240-bp spacer repeat.
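As an illustration only, the consensus-and-divergence procedure described above for Fig. 5 can be sketched in Python. The function name, the use of '-' to mark a gap, the tie-breaking rule and the toy alignment are assumptions made for this sketch; the sequences are invented and are not those of the five species.

```python
from collections import Counter


def consensus_and_divergence(aligned):
    """Return (consensus, divergence) for a list of aligned sequences.

    Gaps ('-') are treated as a fifth symbol, as in the analysis where
    gaps were included. divergence[i] is the number of sequences that
    differ from the consensus symbol at column i.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")

    consensus = []
    divergence = []
    for column in zip(*aligned):
        # Most frequent symbol in the column; ties go to the symbol seen
        # first (an assumption, the text does not say how ties were handled).
        symbol, count = Counter(column).most_common(1)[0]
        consensus.append(symbol)
        divergence.append(len(column) - count)
    return "".join(consensus), divergence


# Hypothetical five-sequence alignment, for illustration only.
toy_alignment = ["ACGT-A", "ACGTTA", "ACCT-A", "ACGT-A", "TCGT-A"]
cons, div = consensus_and_divergence(toy_alignment)
print(cons)  # ACGT-A
print(div)   # [1, 0, 1, 0, 1, 0]
```

Plotting the divergence list against nt position would give a graph of the kind shown in Fig. 5, in which conserved regions appear as runs of zeros.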
Fig. 6. The sequence of the 240-bp repeats. (A) The variation in nine independently sequenced D. melanogaster 240-bp spacer repeats together with the consensus sequence derived from them. Repeats m240.a-c are taken from Miller et al. (1983), m240.d-f are from Coen and Dover (1982), m240.g and m240.h are from Simeone et al. (1986) and m240.i is from Simeone et al. (1982). (B) The variation of eight cloned D. yakuba 240-bp spacer repeats together with the consensus sequence derived from them. Repeats y240.a-c are cloned EcoRI* fragments. Repeats y240.d-f are cloned TaqI fragments and are all truncated due to the presence of a TaqI site at nt position 22. Repeats y240.b and y240.c are incomplete since part of the sequence could not be read with certainty. Positions where the sequences match the consensus are indicated by dashes; differences are shown by lower-case letters or gaps. The highly conserved region corresponding to the reiteration of the transcription start point in D. melanogaster is heavily underlined. (C) The sequences of the spacer repeats from D. simulans, D. yakuba and D. teissieri are shown aligned with the D. melanogaster sequence. The D. melanogaster and D. yakuba sequences are the consensus sequences shown in panels A and B, respectively. The D. simulans sequence is that of the last whole spacer repeat in the clone B15.10. The D. teissieri sequence is that of the longest available clone with spacer repeat homology. It has been arranged to give the best alignment with the other repeats, resulting in a gap between nt positions 90 and 172, for which the sequence has not been determined. Positions where the sequences are the same as in D. melanogaster are indicated by a dash; differences are shown by lower-case letters or gaps.

The sequences of all the 240-bp repeats from D. melanogaster and D. yakuba which were used to derive the consensus sequences are also shown in Fig. 6. Two points are notable from this figure. First, the sequences of all the repeats are very homogeneous, diverging on average by only 2% from the consensus. Thus the processes responsible for maintaining homogeneity among the repeats act relatively swiftly compared to the rate of mutation. Second, some mutations are present in more than one repeat, and different repeats contain different subsets of the mutations. This reflects the action of the homogenising mechanism, most probably unequal crossing-over, which works throughout the length of the repeat. The frequency of occurrence of any particular mutation can be viewed as a transition stage in its eventual elimination or fixation in the population of repeats. A similar distribution of mutations has been observed in the spacer repeats of D. hydei and D. virilis (Tautz et al., 1987) and also in noncoding, tandemly repetitive DNA families (Strachan et al., 1985).

(d) Promoter duplications of the 240-bp repeats are part of a larger interrupted duplication

A comparison of the 240-bp spacer repeats and the sequences surrounding the transcription start point for D. melanogaster, D. simulans, D. yakuba and D. teissieri indicates that the highly conserved sequence at the transcription start point is duplicated within the spacer repeats for each of these species (Fig. 7). The first 75 or so bp of the lower sequences presented in Fig. 7A correspond to the end of the last spacer repeat, indicating that the array of 240-bp repeat elements ends approx. 150 bp upstream from the transcription start point, most clearly seen here with D. melanogaster, D. yakuba and D. teissieri. Furthermore, the 240-bp repeats are in the same phase with respect to the ‘unique’ sequence 3’ to the last repeating unit. Thus the overall organisation of the region of the NTS from the last spacer repeat to the transcription start site has been conserved. For the three species for which we have determined the complete sequence of the 240-bp spacer repeat, D. simulans, D. yakuba, and D. melanogaster, it is evident that the repeats contain a duplication of the transcription start point (Fig. 7B). The duplicated region in D. yakuba is slightly shorter than in the other species, the homology ending 24 bp downstream from the transcription start point, corresponding to the start of the ‘deletion’ in the ETS referred to above. Consequently, the D. yakuba repeat has a longer region of homology with the D. melanogaster transcription initiation site than with the D. yakuba start point, extending beyond nt position 24 (Fig. 7C).

Fig. 7. Homologies between the sequences surrounding the transcription start point and the spacer repeats. The homology search option of the MICROGENIE program was set to find stretches of sequence 15 nt long, which showed at least 80% match. In all cases the upper sequence is from the spacer repeat, and the lower sequence is from the region surrounding the transcription start point. The numbers shown are the same as those used in Figs. 4 and 6. Mismatches are indicated by asterisks above the sequence. (A) The homology between the ends of the spacer repeats and the sequences surrounding the transcription start points, shown for D. melanogaster, D. simulans, D. yakuba and D. teissieri. (B) The homology between the spacer repeats and the transcription start point, shown for D. melanogaster, D. simulans and D. yakuba. (C) The homology between the D. yakuba spacer repeat and the D. melanogaster transcription start point.

Matrix comparisons of these sequences indicate that the promoter duplications are part of a larger interrupted duplication for each of these species (Fig. 8). In such comparisons, diagonal lines are plotted wherever regions of homology occur between the sequences being compared. In Fig. 8, the abscissae represent the sequences around the transcription start points, and the ordinates the 240-bp spacer repeats. In each panel, the diagonal labelled 1 represents the end of the last 240-bp spacer repeat as shown in Fig. 7A. The diagonal labelled 3 represents the duplication of the region around the transcription start point within the 240-bp repeats as shown in Fig. 7B. In addition, for each of the three species there is a stretch of weaker homology labelled diagonal 2. This is in direct line with diagonal 3, indicating that the spatial relationships between these sequences are the same in the spacer repeats and the major rDNA promoter. This suggests that the reiterations of the transcription start points contained in the 240-bp spacer repeats, shown in Fig. 7B, are actually part of a larger duplication which starts at about 105 nt upstream from the transcription start point. An alignment of the 240-bp repeat and promoter sequences showing these features is presented in Fig. 9. It can be seen that within the three species there is a block of homology, which stretches from about -105 to -65. This region contains one of the sequence blocks, from approx. -85 to -70, that has been conserved between the sibling species (region B in Fig. 5). An analysis of the promoter and spacer repeat sequences from D. orena (Tautz et al., 1987) shows that a similar situation exists in this species. Tautz et al. (1987) also sequenced this region from D. virilis and D. hydei, which are classified in a different subgenus from the melanogaster species subgroup. In D. virilis the spacer repeats are 220 bp long and end almost immediately upstream from the transcription start point. They are therefore effectively duplications of the first 220 bp upstream from the start point. The situation in D. hydei is similar but there are three short regions which are peculiar to the repeat nearest the real promoter.

Fig. 8. Matrix comparisons of the sequences surrounding the transcription start point with the spacer repeats. The parameters of the matrix option of the MICROGENIE program were set so that a dot is plotted for each nt in any run of 13 nt which has at least eleven matches. Panels A, B, and C show these comparisons for D. melanogaster, D. simulans, and D. yakuba, respectively, in which the sequences around the transcription start point (abscissa) are taken from Fig. 4 and the 240-bp repeats (ordinates) from Fig. 6. In each case the diagonal lines labelled 1 and 3 represent the homology shown in panels A and B of Fig. 7. The diagonal labelled 2 represents a short stretch of homology upstream from that represented by diagonal 3.

Fig. 9. Alignment of the sequences upstream of the transcription start point and the spacer repeat sequences for D. melanogaster, D. simulans and D. yakuba. The letters 2 and 3 above the sequences indicate the regions of homology indicated by diagonals 2 and 3 in Fig. 8. The upper sequence is from the spacer repeat and the lower sequence is from the region around the transcription start point in each case. Boxes are drawn around matched nt within these regions.

In all of the 240-bp spacer repeats discussed in this paper, and also those from D. orena (Tautz et al., 1987), there is an A residue at nt position -19 with respect to the transcription start point duplications in the spacer repeats. In the case of the true transcription start point, however, there is a C residue at nt position -19 in the five sequences presented in Fig. 3 and also in D. orena (Tautz et al., 1987). The significance of this is unknown and the observation does not preclude the existence of spacer repeats, as yet unsequenced, which do not conform to this rule.
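The matrix comparisons used in this section (Fig. 8) can be illustrated with a generic windowed dot-matrix sketch in Python. Only the two parameters, a run of 13 nt with at least eleven matches, come from the legend to Fig. 8; the function name, the choice to mark every position of a qualifying run, and the toy sequences are assumptions, and the code makes no claim to reproduce the MICROGENIE program.

```python
def dot_matrix(seq_x, seq_y, window=13, min_matches=11):
    """Return the set of (x, y) cells at which a dot would be plotted.

    A dot is placed at every position of any ungapped run of `window` nt
    in which at least `min_matches` positions are identical between the
    two sequences. Regions of homology therefore appear as diagonals,
    along which x - y is constant.
    """
    dots = set()
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq_x[i:i + window], seq_y[j:j + window])
            )
            if matches >= min_matches:
                dots.update((i + k, j + k) for k in range(window))
    return dots


# Hypothetical sequences sharing one 13-nt block, for illustration only.
core = "AGCTTGCATCGGA"
promoter_like = "TTT" + core + "CCC"   # plays the role of the abscissa
repeat_like = "GG" + core + "A"        # plays the role of the ordinate
dots = dot_matrix(promoter_like, repeat_like)
offsets = {x - y for x, y in dots}
print(1 in offsets)  # True: the shared block lies on the diagonal x - y = 1
```

Two stretches of homology that fall on the same diagonal, like those labelled 2 and 3 in Fig. 8, have the same offset and hence the same spacing in both sequences.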
(e) Concluding remarks
The conservation of the transcription start point duplications in the spacer suggests that they have functional significance. Indeed, the D. melanogaster repeats have been shown to direct transcription in vivo (Miller et al., 1983), and mapping of the 5’ end of the spacer transcripts shows that they initiate within the duplicated region at the nt position which is analogous to the transcription start point (Murtif and Rae, 1985). However, these transcripts are found at relatively low levels compared to the full-length rRNA precursor molecule and they are not found in the cytoplasm. One possibility is that the spacer transcripts themselves do not serve any function, but that the transcription start point duplications somehow affect the efficiency of rDNA transcription, perhaps by increasing the local concentration of RNA polymerase or transcription factors.

It is possible that in an ancestral population the spacer repeats contained duplications extending 100 bp or more upstream from the transcription start point. If the mechanisms responsible for maintaining homogeneity among the repeats failed to act on the sequences immediately upstream from the transcription start point, a gradual divergence of these sequences from those in the repeats would occur. The observed homology (from approx. 105 bp to 65 bp upstream from the transcription start point) in D. melanogaster, D. simulans and D. yakuba suggests that these sequences may have a functional significance. This region contains one of two sequence blocks (-85 to -70) found to be conserved around the transcription start point in species comparisons (Fig. 5). The other conserved block is between nt -140 and -125. Neither of these sequence blocks is apparently necessary for efficient transcription in vitro, for which only sequences downstream from nt -43 are required (Kohorn and Rae, 1983).

We have recently carried out experiments with templates having varying lengths of upstream sequences introduced into cultured cells by transfection (Hayward and Glover, 1988). We found that, whilst the expression of the reporter gene downstream from the rDNA promoter could be observed with a construct having 43 bp of upstream rDNA sequences, we could no longer detect expression with constructs having either 60 bp or 72 bp of upstream sequences. However, constructs with 180 bp or 306 bp of upstream sequences were efficiently transcribed. The requirements for efficient transcription in vivo are therefore clearly more complex than with in vitro systems. It thus seems likely that the regions which are identified in this study as being conserved during evolution do have functional significance. The continued development of in vivo and in vitro systems to study Drosophila rDNA transcription will allow the role of these sequences to be tested.
ACKNOWLEDGEMENTS
We are grateful to the Cancer Research Campaign for supporting this work and providing a Studentship for D.C.H. and a Career Development Award for D.M.G.
REFERENCES
Barnes, S.R., Webb, D.A. and Dover, G.A.: The distributions of satellite and main-band DNA components in the melanogaster species subgroup of Drosophila. Chromosoma 67 (1978) 341-363.
Benoist, C., O’Hare, K., Breathnach, R. and Chambon, P.: The ovalbumin gene sequence of putative control regions. Nucleic Acids Res. 8 (1980) 127-142.
Bodmer, M. and Ashburner, M.: Conservation and change in the DNA sequences coding for alcohol dehydrogenase in sibling species of Drosophila. Nature 309 (1984) 425-430.
Chia, W., Scott, M.R.D. and Rigby, P.W.J.: The construction of cosmid libraries of eukaryotic DNA using the Homer series of vectors. Nucleic Acids Res. 10 (1982) 2503-2520.
Coen, E.S. and Dover, G.A.: Multiple Pol I initiation sequences in rDNA spacers of Drosophila melanogaster. Nucleic Acids Res. 10 (1982) 7017-7026.
Coen, E.S. and Dover, G.A.: Unequal exchanges and the coevolution of X and Y rDNA arrays in Drosophila melanogaster. Cell 33 (1983) 849-855.
Coen, E.S., Strachan, T. and Dover, G.A.: Dynamics of concerted evolution of ribosomal DNA and histone gene families in the melanogaster species subgroup of Drosophila. J. Mol. Biol. 158 (1982a) 17-35.
Coen, E.S., Thoday, J.M. and Dover, G.A.: Rate of turnover of structural variants in the rDNA gene family of Drosophila melanogaster. Nature 295 (1982b) 564-568.
Corden, J., Wasylyk, B., Buchwalder, A., Sassone-Corsi, P., Kedinger, C. and Chambon, P.: Promoter sequences of eukaryotic protein-coding genes. Science 209 (1980) 1406-1414.
Dover, G.A. and Flavell, R.B.: Molecular coevolution: DNA divergence and the maintenance of function. Cell 38 (1984) 622-623.
Eisses, K.I., Van Dijk, H. and Van Delden, W.: Genetic differentiation within the melanogaster species group of the genus Drosophila (Sophophora). Evolution 33 (1979) 1063-1068.
Fauron, L.M.R. and Wolstenholme, D.R.: Extensive diversity among Drosophila species with respect to nucleotide sequences within the adenine and thymine-rich region of mitochondrial DNA molecules. Nucleic Acids Res. 11 (1980) 2439-2452.
Fedoroff, N.: On spacers. Cell 16 (1979) 697-710.
Gerbi, S.A.: Evolution of ribosomal DNA. In MacIntyre, R. (Ed.), Molecular Evolutionary Genetics. Plenum, New York, 1985, pp. 419-517.
Gonzales, A.M., Cabrera, V.M., Larruya, J.M. and Gullan, A.: Genetic distance in the sibling species Drosophila melanogaster, Drosophila simulans and Drosophila mauritiana. Evolution 36 (1982) 517-522.
Grummt, I.: Nucleotide sequence requirements for specific initiation of transcription by RNA polymerase I. Proc. Natl. Acad. Sci. USA 81 (1982) 6908-6911.
Hayward, D.C. and Glover, D.M.: Analysis of the Drosophila rDNA promoter by transient expression. Nucleic Acids Res. 16 (1988) 4253-4268.
Kohorn, B.D. and Rae, P.M.M.: Localization of DNA sequences promoting RNA polymerase I activity in Drosophila. Proc. Natl. Acad. Sci. USA 80 (1983a) 3265-3268.
Kohorn, B.D. and Rae, P.M.M.: A component of the Drosophila RNA polymerase I promoter lies within the rRNA transcription unit. Nature 304 (1983b) 179-181.
Leigh-Brown, A.J. and Ish-Horowitz, D.: Evolution of the 87A and 87C heat shock loci in Drosophila. Nature 290 (1981) 677-682.
Lemeunier, F. and Ashburner, M.: Relationships within the melanogaster species group of the genus Drosophila (Sophophora), II. Phylogenetic relationships between six species based on polytene chromosome banding sequences. Proc. Roy. Soc. London Ser. B. 193 (1976) 275-294.
Long, E.O. and Dawid, I.B.: Repeated genes in eukaryotes. Annu. Rev. Biochem. 49 (1980) 727-764.
Long, E.O., Rebbert, M.L. and Dawid, I.B.: Nucleotide sequence of the initiation site for ribosomal DNA transcription in Drosophila melanogaster: comparison of genes with and without insertions. Proc. Natl. Acad. Sci. USA 78 (1981) 1513-1517.
Mandal, R.K. and Dawid, I.B.: The nucleotide sequence at the transcription termination site of ribosomal RNA in Drosophila melanogaster. Nucleic Acids Res. 9 (1981) 1801-1811.
Miller, J.R., Hayward, D.C. and Glover, D.M.: Transcription of the ‘non-transcribed’ spacer of Drosophila melanogaster rDNA. Nucleic Acids Res. 11 (1983) 11-19.
Murtif, V.L. and Rae, P.M.M.: In vivo transcription of rDNA spacers in Drosophila. Nucleic Acids Res. 13 (1985) 3221-3239.
Reeder, R.H.: Enhancers and ribosomal gene spacers. Cell 38 (1984) 349-351.
Roiha, H., Miller, J.R., Woods, L.C. and Glover, D.M.: Arrangements and rearrangements of sequences flanking the two types of rDNA insertion in D. melanogaster. Nature 290 (1981) 749-753.
Roiha, H., Read, C.A., Browne, M.J. and Glover, D.M.: Widely differing degrees of sequence conservation of the two types of rDNA insertion within the melanogaster species subgroup of Drosophila. EMBO J. 2 (1983) 721-726.
Simeone, A., de Falco, A., Macino, G. and Boncinelli, E.: Sequence organisation of the ribosomal spacer of Drosophila melanogaster. Nucleic Acids Res. 10 (1982) 8263-8272.
Simeone, A., La Volpe, A. and Boncinelli, E.: Nucleotide sequence of a complete ribosomal spacer of D. melanogaster. Nucleic Acids Res. 13 (1985) 1089-1101.
Strachan, T., Webb, D. and Dover, G.A.: Transition stages of molecular drive in multiple-copy DNA families in Drosophila. EMBO J. 4 (1985) 1701-1708.
Tautz, D. and Dover, G.A.: Transcription of the tandem array of ribosomal DNA in Drosophila melanogaster does not terminate at any fixed point. EMBO J. 5 (1986) 1267-1273.
Tautz, D., Tautz, C., Webb, D. and Dover, G.A.: Evolutionary divergence of promoters and spacers in the rDNA family of four Drosophila species. Implications for molecular coevolution in multigene families. J. Mol. Biol. 195 (1987) 525-542.
Throckmorton, L.H.: The phylogeny, ecology and geography of Drosophila. In R. King (Ed.), Handbook of Genetics, Vol. 3. Plenum, New York, 1975, pp. 421-469.