Cell, Vol. 34, 415-419, September 1983, Copyright © 1983 by MIT 0092-8674/83/090415-05 $02.00/0
The Origin of Extrachromosomal Circular copia Elements
Andrew J. Flavell and David Ish-Horowicz
Imperial Cancer Research Fund Laboratories
Burtonhole Lane, Mill Hill
London NW7 1AD, England
Summary
Cloned extrachromosomal circular copia elements were studied by nucleotide sequence and restriction enzyme analysis to determine their mechanisms of formation and their possible roles in copia transposition. Rearranged circular copias containing inverted segments flanked by a 5 bp sequence duplication were observed, suggesting that circular copias are capable of integrating into their own sequences. Such copia circles are analogous to similarly rearranged retrovirus circles, strengthening the relationship between the copia-like elements and retroviruses. There is marked sequence heterogeneity at the junction between the fused terminal direct repeats of seven copia circles. The junctions contain 0-15 bp inserts, the sequences of which are inconsistent with the creation of these particular molecules by reverse transcription.
Introduction
The Drosophila genetic element copia is one of the most thoroughly studied eucaryotic mobile sequences. Its general structural features (Fig. 1) are very similar to those of the integrated forms of the retrovirus proviruses (for review, see Varmus, 1982), although to date retroviruses have only been described in the higher vertebrates. These similarities include the presence of an internal DNA segment several kilobases long, flanked by a pair of identical direct terminal repeats (LTRs) of several hundred base pairs. These repeats themselves contain small inverted terminal repeats several base pairs long. Each element is flanked by an identical pair of host DNA sequences, usually 4-6 bp long, that are present only once at the target site for integration. Most integrated retrovirus proviruses and copia-like elements have limited sequence homology between the extreme ends. There is also sequence homology at both ends of the internal DNA segment next to the LTRs. In retroviruses, these regions are the primer sites for synthesis of the two strands of the linear proviral DNA from the virion RNA template.
All copia-like elements and retrovirus proviruses studied to date possess cytoplasmic polyadenylated transcripts covering approximately the full length of the elements. In retroviruses this transcript functions as the virion RNA. These similarities have led to speculation that copia and the copia-like elements (elements sharing these common features) may transpose as viruses or virus-like particles (Flavell and Ish-Horowicz, 1981; Shiba and Saigo, 1983). In retrovirus-infected cells, the virion RNA is converted into unintegrated DNA provirus by reverse transcription.
Three distinct types of extrachromosomal proviral DNA exist in infected cells: linear double-stranded DNA and closed circular molecules containing either one or two LTRs (for reviews see Weinberg, 1977; Coffin, 1979; Varmus, 1982). Some or all of these are intermediates in the creation of integrated genomic proviral DNA. We have previously described extrachromosomal circular copia molecules in Drosophila cultured cells (Flavell and Ish-Horowicz, 1981). The structure of these molecules resembles that of extrachromosomal circular retrovirus proviruses. Here we present sequence analysis of two of these cloned extrachromosomal copias, which suggests that circular copias can integrate into their own sequences. The sequences of several other circular copias suggest that these particular molecules are not derived by reverse transcription. These results suggest that circular copias are transposition intermediates, but that there may be several independent ways in which they are created: by reverse transcription of a copia retrovirus-like entity, by excision of genomic copia elements, or by semiconservative replication as plasmids.
Results
Models for the Generation of 2-Direct Repeat Circular Copias
There are two major classes of circular copia molecules in cultured Drosophila cells (Flavell and Ish-Horowicz, 1981). These consist of complete copia elements circularized by fusion of the LTRs and smaller molecules containing only one LTR. We have cloned 39 circular copias. Of these, 30 contain one direct repeat and nine contain two repeats. These two types of structures are strikingly similar to the circular DNA provirus copies of retroviruses produced by reverse transcription. However, such molecules might also be created by excision of genomic copias. The two models predict different structures at the LTR-LTR junction of 2LTR copia circles (those containing two terminal repeats). We therefore examined the nucleotide sequences of such molecules to test which model was supported by their detailed structure.
Figure 1. General Structural Features Shared by copia and Integrated Retrovirus Proviruses
An internal segment of several kilobases is flanked by a pair of identical terminal repeats (LTRs). The LTRs carry small, imperfect terminal repeats. The whole element is flanked by an identical pair of host DNA sequences of several base pairs. The surrounding genomic DNA is indicated.
Reverse Transcription Model
Reverse transcription of retrovirus RNAs yields 2LTR circular proviruses that are 4 bp longer than the integrated proviruses (Varmus, 1982; Ju and Skalka, 1980; Swanstrom et al., 1981; Donehower et al., 1981). These four nucleotides are located at the junction between the long terminal direct repeats and derive from the 2 bp flanking the internal ends of the LTRs (Figure 2). If 2LTR copia circles were derived by reverse transcription of full-length copia RNA, they would also contain extra nucleotides at the direct repeat junction, derived from the LTR-interior junctions. There would presumably be at least 5 bp extra, since there are 5 bp between the end of the copia LTR and the TGG complementary to all tRNA 3’ ends (Figure 2). Other bp might derive from the sequence of the internal segments of copia that flank the 3’ direct repeat, although their number would depend on the exact position at which (+) strand synthesis began. The flanking internal sequences for copia are shown in Figure 2.
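The arithmetic behind this prediction can be written out as a short Python sketch. It is an illustration only, not the authors' procedure: the function name and its arguments are hypothetical, the only real sequence data are the TT and AA dinucleotides that the Figure 2 legend gives for MuLV, and the copia bases are represented by the placeholder N because their sequence is not reproduced in the text.

```python
def predicted_rt_junction(minus_start: str, plus_start: str) -> str:
    """Extra bases expected at the LTR-LTR junction of a 2LTR circle
    under the reverse transcription model.

    minus_start: bases lying between the end of the 5' LTR and the
                 minus-strand primer site (hypothetical argument name)
    plus_start:  bases lying between the plus-strand start site and
                 the beginning of the 3' LTR (hypothetical argument name)
    """
    # The junction insert is simply the two internal flanks joined together.
    return minus_start + plus_start


# MuLV: TT and AA are the dinucleotides named in the Figure 2 legend.
mulv_junction = predicted_rt_junction("TT", "AA")
print(mulv_junction, len(mulv_junction))  # TTAA 4

# copia: at least the 5 bp between the LTR end and the TGG would be added.
# "N" is a placeholder; the real bases are shown only in Figure 2.
copia_minimum = predicted_rt_junction("N" * 5, "")
print(len(copia_minimum))  # 5
```

Any bases contributed from the 3’ side of copia would add to the 5 bp minimum, which is why the model predicts a lower bound rather than a single length.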
Genomic Excision Model
It is more difficult to predict the LTR-LTR junctions derived by a genomic excision model. Exact excision by homologous recombination between the 5 bp duplicated host flanking sequences would generate circles containing a 5 bp insertion at the junction. Since copia has no apparent sequence specificity for insertion, and there are about 150 copias per cultured cell genome (Potter et al., 1979), this insert would be a random sequence. Nonhomologous recombination would yield different structures depending on the genomic positions at which the recombination occurred. Figure 3 shows how recombination could yield 1LTR copia circles, or 2LTR copias containing insertions or deletions at the LTR-LTR junction.
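The exact-excision case can be made concrete with a minimal Python sketch. Everything in it is an assumption made for illustration: the function name is hypothetical, and the LTR, internal segment, host flank and surrounding genomic sequences are invented placeholders, not copia or Drosophila sequences.

```python
def excise_by_flank_recombination(genome: str, element: str, dup_len: int = 5) -> str:
    """Circle released by homologous recombination between the duplicated
    host flanks of an integrated element.

    The circle is returned as a linear string opened at the start of the
    element, so the bases after the element are the LTR-LTR junction insert.
    """
    start = genome.find(element)
    if start < dup_len:  # also true when find() returns -1
        raise ValueError("element not found with a complete left flank")
    end = start + len(element)
    left_flank = genome[start - dup_len:start]
    right_flank = genome[end:end + dup_len]
    if left_flank != right_flank:
        raise ValueError("flanking sequences are not a direct duplication")
    # One copy of the duplication leaves with the circle; one stays in the genome.
    return element + right_flank


# Placeholder sequences (hypothetical, for illustration only).
ltr, internal, host_dup = "CCCGGG", "ATATAT", "GACTA"
element = ltr + internal + ltr
genome = "TTTT" + host_dup + element + host_dup + "AAAA"

circle = excise_by_flank_recombination(genome, element)
junction_insert = circle[len(element):]
print(junction_insert, len(junction_insert))  # GACTA 5
```

Because the insert is whatever host sequence happened to be duplicated at that insertion site, circles excised from different genomic copias would carry different 5 bp inserts.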
Figure 2. Extrachromosomal DNA Intermediates in the Retrovirus Life Cycle
A part of the life cycle of a typical retrovirus, Moloney MuLV, is shown (for review see Varmus, 1982). Virion RNA is reverse-transcribed in the cytoplasm into a linear, double-stranded DNA that is somewhat longer, due to duplication of LTR sequence at each end of the DNA. After several hours, two forms of circular provirus (derived from the linear molecule) accumulate in the nucleus. These contain one or two LTRs. After a day or two, effectively all the extrachromosomal proviral DNA disappears and only integrated provirus is seen. It is not known which extrachromosomal DNA is the precursor(s) to the integrated provirus. The TT and AA dinucleotides marked on the virion RNA are the first bases synthesized of the minus (-) and plus (+) DNA strands, respectively, during reverse transcription. They form the end dinucleotides of the linear provirus and the junction sequence of the 2LTR circular provirus. The sequence surrounding the plus and minus strand primer sites for MuLV is compared with the corresponding region of copia (Van Beveren et al., 1980; Sutcliffe et al., 1980; Flavell et al., 1981; R. Levis, unpublished data). Sequences shared between the MuLV minus strand primer site and the corresponding copia region are underlined.
Sequence Heterogeneity at Terminal LTR-LTR Junctions in Cloned Circular copias
We have determined the LTR-LTR junction sequence of seven individual cloned 2LTR copia circles (Figure 4). The junction sequence is heterogeneous. One recombinant (pBB58) contains perfectly fused repeats with no intervening sequence. Others have insertions of between one and fifteen nucleotides between the direct repeats. If these insertions were generated by reverse transcription they would be a subset of the flanking internal segment sequences shown in Figure 2. In no case do the inserted nucleotides correspond to sequences at the junctions between the direct repeats and the interior of the copia element. It is possible that the insertions are due to sequence variability of the interior junctions. However, the sequence of the 5’ junction of pBB5 (unpublished data) is identical to that of cDm 2056 (Flavell et al., 1981; 5’ and 3’ are defined by the copia full-length transcript). The 3’ LTR-interior junctions of pBB2 and cDm 2056 are also identical (R. Levis and A. Flavell, unpublished data). The 15 bp insertions in pBB30 and pBB54 are not present in the parts of copia whose sequences have been determined (1.1 kb and 0.35 kb into the element from the 5’ and 3’ ends, respectively, and a 0.63 kb internal segment; Levis et al., 1980; Flavell et al., 1981; Flavell, unpublished data; Fouts and Manning, 1981). The most economical interpretation of these data is the model of imprecise recombinative excision of genomic copia, shown in Figure 3. It is noteworthy that, of all the retroviral LTR-LTR junctions sequenced so far (Varmus, 1982), only one clone contains any extra nucleotides over the normal 4 bp (TTAAA instead of TTAA; Van Beveren et al., 1982).
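The test applied to each junction insert can be sketched in Python as follows. This is a hedged illustration rather than the authors' method: the function and parameter names are hypothetical, and the example uses only the MuLV dinucleotides TT and AA and the TTAA and TTAAA junctions quoted in the text, since the copia flanking sequences appear only in Figure 2.

```python
def consistent_with_rt(insert: str, flank_5prime: str, flank_3prime: str,
                       min_len: int = 0) -> bool:
    """True if the junction insert could arise by reverse transcription,
    i.e. it is a prefix of the sequence following the 5' LTR joined to a
    suffix of the sequence preceding the 3' LTR.

    flank_5prime: internal sequence read inward from the end of the 5' LTR
    flank_3prime: internal sequence ending where the 3' LTR begins
    min_len:      shortest insert the model allows (hypothetical parameter)
    """
    if len(insert) < min_len:
        return False
    # Try every way of splitting the insert into a 5' part and a 3' part.
    for split in range(len(insert) + 1):
        head, tail = insert[:split], insert[split:]
        if flank_5prime.startswith(head) and flank_3prime.endswith(tail):
            return True
    return False


# MuLV flanks reduced to the dinucleotides named in the Figure 2 legend.
print(consistent_with_rt("TTAA", "TT", "AA"))   # True
print(consistent_with_rt("TTAAA", "TT", "AA"))  # False
```

An empty insert passes trivially when `min_len` is 0, so a perfectly fused junction has to be judged against the minimum length the model predicts, which is what the `min_len` argument is for.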
Copia Circles Containing Sequence Inversion and Deletion
Two circular copias that we isolated (pBB24 and pBB42) carry very specific DNA sequence inversions. The structures of these variant circular copias are shown in Figure 5. Both carry sequence inversions, with one breakpoint at the direct repeat junction of a 2LTR copia circle. In pBB24, the other breakpoint is in the internal copia segment, approximately 1.8 kb in from the 3’ end of the element. In pBB42, the other breakpoint is in the 5’ direct repeat, 199 bp from the direct repeat junction. In both cases, 5 bp represented once in the unrearranged circular element are duplicated at either end of the inverted segment. Such a configuration is consistent with their creation when a 2-direct repeat copia circle or linear molecule integrates into its own sequence. Analogous rearranged molecules have been observed in circular retrovirus proviruses (Shoemaker et al., 1980, 1981; Van Beveren et al., 1982) and models for their creation were first proposed by Shoemaker et al. (1980). Interestingly, self-integration within a long terminal direct repeat has not been reported in any retrovirus system to date. Shoemaker et al. (1980) proposed two alternative models for the generation of inversion provirus circles, involving self-integration of 1LTR and 2LTR circles, respectively. pBB42 could only be generated by the second of these two models or by self-integration of a linear extrachromosomal copia (as proposed for retroviruses by Swanstrom et al., 1981). We also isolated a 1LTR copia circle carrying a deletion of approximately 500 bp (Figure 5). Again, one breakpoint for the rearrangement is at the external end of a terminal direct repeat. Such a molecule could be derived either by the above models or by imprecise excision of a genomic copia, leaving part of the element behind in the Drosophila genome.
Figure 3. Postulated Structures of Extrachromosomal Circular copias Derived by Excision of Genomic Elements
(A) Homologous recombination between flanking 5 bp sequences generates 2LTR copia circles containing a 5 bp insertion at the LTR-LTR junction. (B) Homologous recombination between the LTRs yields 1LTR copia circles. (C) Nonhomologous recombination between the end of one LTR and genomic DNA flanking the other LTR generates copia circles with insertions at the LTR-LTR junction. (D) Nonhomologous recombination between the end of one LTR and the interior of the copia element yields a 1LTR copia circle carrying a deletion of sequence flanking either LTR.
Figure 4. The LTR-LTR Junction Sequence of Cloned 2LTR copia Circles
The sequences of seven cloned copia circles at the LTR-LTR junctions were determined. The sequence at each end of the copia LTR is shown (Levis et al., 1980). The sequence of pBB58 corresponds to a perfect fusion of the two LTRs; all the other clones contain insertions at the LTR-LTR junction. pBB30 does not contain the last A/T nucleotide pair of the left LTR. Maxam-Gilbert sequence analysis of 5’-end-labeled fragments was from the restriction sites shown.
[Figure 4 image not recoverable. Legible clone labels: pBB58, pBB1, pBB2, pBB5, pBB29, pBB54. Legible 15 bp insert, both strands: AGGTGAAAAGGTTTC / TCCACTTTTCCAAAG. Legible restriction sites: BalI, RsaI.]
Discussion
To demonstrate that copia can transpose via extrachromosomal circular intermediates, it is necessary to show that the circles are derived from the genome and that they are capable of integration. The sequence data for the two cloned circular copias carrying inversions suggest that extrachromosomal copia DNAs might be able to insert into their own sequences. We believe that circular copias are similarly capable of genomic integration. The origin of these circles, however, is still unclear. Are they derived from a copia retrovirus-like particle, by excision of genomic copias, or are they replicating plasmids? The sequences at the direct repeat junctions of full-sized copia circles argue against the creation of these particular molecules by reverse transcription. We believe that at least the majority of 2LTR circular copias are derived from the imprecise excision of genomic copias. However, one line of evidence argues against copia transposing exclusively via imprecise excision and reintegration. Imprecise excision would lead, in some cases, to fragments of copia being scattered throughout the Drosophila genome. Such fragments are very rare (Carlson and Brutlag, 1978; Levis et al., 1980).
A second way in which extrachromosomal copia might be derived is by reverse transcription. Indeed, the many structural similarities between copia-like elements and retroviruses strongly support this hypothesis. Retrovirus-like entities have been reported in cultured Drosophila cells (Heine et al., 1980) and Shiba and Saigo (1983) have recently described the isolation from Drosophila cell nuclei of a virus-like particle that contains copia RNA. These authors and we ourselves (unpublished data) have failed to detect such particles in tissue culture supernatants or in cytoplasmic extracts. It therefore seems likely that copia can exist as an intracellular virus-like particle but rarely, if ever, as an extracellular virus.
Why do we not observe 2LTR circular copias whose detailed structure is predicted from a reverse transcription model? We propose that such molecules have a short half-life before integrating into the Drosophila genome. Indeed, we believe that pBB2 and pBB5 are incapable of genomic integration, as we have been unable to introduce recombinant plasmids based on them into the germ line of Drosophila eggs (unpublished data).
A third way in which circular copias might propagate is as plasmids. Recombinant molecules based upon pBB5, containing the E. coli gpt selectable marker, when transfected into cultured Drosophila cells, failed to integrate but persisted under selection as plasmids at low copy number (J. H. Sinclair, J. H. Sang, J. Burke, and D. Ish-Horowicz, submitted). We are currently determining the relative importance of these three possible mechanisms in the generation of copia circles in cultured cells.
The existence of copia circles containing sequence deletions and inversions is paralleled by analogous variant extrachromosomal retrovirus proviruses (Shoemaker et al., 1980, 1981; Ju and Skalka, 1980; Van Beveren et al., 1982). This further extends the similarity between copia and retroviral circles. It has been assumed that rearranged circular proviruses are generated by aberrant self-integration of extrachromosomal elements. Shoemaker et al. (1980) and Swanstrom et al. (1981) proposed models for the generation of variants based upon self-integration of circular and linear proviruses, respectively. The structures of the deletion copia circle pBB6 and one of the inversion circles are consistent with each of these models, but pBB42, which contains an inversion entirely within one LTR, could only be generated from an extrachromosomal linear or circular copia containing two LTRs.
Figure 5. The Structure of Inversion Variants and a Deletion Variant copia Circles
(A) The relationship between the 2LTR copia circle pBB2 (shown in linear form for descriptive purposes) and the variant cloned copia circles pBB24, pBB42, and pBB6. In pBB24, a 1.9 kb DNA sequence flanked by the LTR-LTR junction and an apparently random 5 bp sequence is inverted with duplication of the 5 bp sequence. In pBB42, a 199 bp sequence contained inside an LTR is inverted with duplication of another 5 bp sequence. pBB6 carries a deletion of approximately 500 bp relative to a 2LTR copia circle. Positions 1-9 are regions whose sequence is shown in (B). The sequencing strategy is shown; all fragments were 5’-end-labeled. (B) The sequence at the inversion and deletion breakpoints of pBB24, pBB42, and pBB6. The sequence of regions 1-9 in (A) is shown. N corresponds to T in the case of pBB2.
Experimental Procedures
Extrachromosomal circular copias from Kc cultured Drosophila cells were cloned in bacteriophage λ Charon 21a and subcloned into plasmid pAT153 as described by Flavell and Ish-Horowicz (1981). Plasmid DNA was prepared by the method of Ish-Horowicz and Burke (1981). DNA sequence analysis was by the methods of Maxam and Gilbert (1980).
Acknowledgments
We thank Jeff Williams, John Wyke, and Brenda Marriott for help in the preparation of this manuscript. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Received May 23, 1983
References
Carlson, M., and Brutlag, D. (1978). One of the copia genes is adjacent to satellite DNA in Drosophila melanogaster. Cell 15, 733-742.
Coffin, J. M. (1979). Structure, replication and recombination of retrovirus genomes: some unifying hypotheses. J. Gen. Virol. 42, 1-26.
Donehower, L., Huang, A., and Hager, G. (1981). Regulatory and coding potential of the mouse mammary tumour virus long terminal redundancy. J. Virol. 37, 226-238.
Flavell, A. J., and Ish-Horowicz, D. (1981). Extrachromosomal circular copies of the eukaryotic transposable element copia in cultured Drosophila cells. Nature 292, 591-595.
Flavell, A. J., Levis, R., Simon, M. A., and Rubin, G. M. (1981). The 5’ termini of RNAs encoded by the transposable element copia. Nucl. Acids Res. 9, 6279-6291.
Fouts, D. L., and Manning, J. E. (1981). A complex repeated DNA sequence within the Drosophila transposable element copia. Nucl. Acids Res. 9, 7053-7064.
Heine, C., Kelly, D., and Avery, R. (1980). The detection of intracellular retrovirus-like entities in Drosophila melanogaster cell cultures. J. Gen. Virol. 49, 385-395.
Ish-Horowicz, D., and Burke, J. F. (1981). Rapid and efficient cosmid cloning. Nucl. Acids Res. 9, 2989-2992.
Ju, G., and Skalka, A. M. (1980). Nucleotide sequence analysis of the long terminal repeat (LTR) of avian retroviruses: structural similarities with transposable elements. Cell 22, 379-386.
Levis, R., Dunsmuir, P., and Rubin, G. M. (1980). Terminal repeats of the Drosophila transposable element copia: nucleotide sequence and genomic organization. Cell 21, 581-588.
Maxam, A. M., and Gilbert, W. (1980). Sequencing end-labeled DNA with base-specific chemical cleavages. Meth. Enzymol. 65, 499-560.
Potter, S. S., Brorein, W. J., Jr., Dunsmuir, P., and Rubin, G. M. (1979). Transposition of elements of the 412, copia and 297 dispersed repeated gene families in Drosophila. Cell 17, 415-427.
Rubin, G. M., Brorein, W. J., Dunsmuir, P., Flavell, A. J., Levis, R., Strobel, E., Toole, J. J., and Young, E. (1980). copia-like elements in the Drosophila genome. Cold Spring Harbor Symp. Quant. Biol. 45, 619-628.
Scott, M. L., McKereghan, K., Kaplan, H. S., and Fry, K. E. (1981). Molecular cloning and partial characterization of unintegrated linear DNA from gibbon ape leukemia virus. Proc. Nat. Acad. Sci. USA 78, 4213-4217.
Shiba, T., and Saigo, K. (1983). Origin of eukaryotic movable genetic elements: identification of retrovirus-like particles containing copia RNA in Drosophila melanogaster. Nature 302, 119-124.
Shoemaker, C., Goff, S., Gilboa, E., Paskind, M., Mitra, S., and Baltimore, D. (1980). Structure of cloned retroviral circular DNAs: implication for virus integration. Cold Spring Harbor Symp. Quant. Biol. 45, 711-717.
Shoemaker, C., Hoffman, J., Goff, S. P., and Baltimore, D. (1981). Intramolecular integration within Moloney murine leukemia virus DNA. J. Virol. 40, 164-172.
Swanstrom, R., De Lorbe, W. J., Bishop, J. M., and Varmus, H. E. (1981). Nucleotide sequence of cloned unintegrated avian sarcoma virus DNA: viral RNA contains direct and inverted repeats similar to those in transposable elements. Proc. Nat. Acad. Sci. USA 78, 124-128.
Van Beveren, C., Goddard, J., Berns, A., and Verma, I. M. (1980). Structure of Moloney murine leukemia viral DNA: nucleotide sequence of the 5’ long terminal repeat and adjacent cellular sequence. Proc. Nat. Acad. Sci. USA 77, 3307-3311.
Van Beveren, C., Rands, E., Chattopadhyay, S., Lowy, D., and Verma, I. (1982). Long terminal repeat of murine retroviral DNAs: sequence analysis, host-proviral junctions and the preintegration site. J. Virol. 41, 542-556.
Varmus, H. E. (1982). Retroviruses. In Mobile Genetic Elements, J. A. Shapiro, ed. (New York: Academic Press).
Weinberg, R. A. (1977). Structure of the intermediates leading to the integrated provirus. Biochim. Biophys. Acta 473, 39-55.