Deletion formation in bacteriophage T4

J. Mol. Biol. (1988) 202, 233-243

Britta Swebilius Singer† and Jane Westlye

Department of Molecular, Cellular, and Developmental Biology, Box 347, University of Colorado, Boulder, CO 80309, U.S.A.

(Received 17 June 1987, and in revised form 15 October 1987)

We have manipulated the dispensable region of the rIIB gene of bacteriophage T4 in order to study the generation of deletions involving direct repeats. We show that recombination between different parental chromosomes is one source of the deletions we have studied. We have also investigated the effects of structure, base composition and distance on deletion formation. We demonstrate that the potential to form structure in single-stranded DNA has variable effects on the frequency of deletion formation and conclude that, in some cases, slipped mispairing during DNA synthesis can make a substantial contribution to deletion frequencies. The G+C richness of the direct repeats involved in deletion formation is an important parameter of the frequency of deletion formation. We have confirmed that increasing the distance between direct repeats decreases deletion frequency.

1. Introduction

The role of direct repeats in deletion formation has been amply demonstrated by sequence analysis of mutants (e.g. see Farabaugh et al., 1978; Studier et al., 1979; Efstratiadis et al., 1980; Pribnow et al., 1981; Albertini et al., 1982a). The vast majority of deletions arise between short repeated sequences; the deletion mutant contains only one of the pair of repeats‡ and has lost all of the DNA in between. Inspection of a large number of sequences of deletion mutations has led to several hypotheses concerning the parameters that affect deletion formation. Statistical analysis of deletions isolated in the lacI gene suggests that deletion frequencies are affected by the distance between the direct repeats (Galas, 1978). This was confirmed experimentally by Albertini et al. (1982b). They reduced the distance between a pair of direct repeats from 987 to 107 base-pairs, with the result that the deletion frequency increased more than tenfold. The length of the repeat is also important; the longer the repeat, the more likely the deletion. Albertini et al. (1982a) demonstrated that manipulating one of the repeats to introduce a discontinuity diminishes the likelihood of recovering a deletion. It has been proposed that the potential to form secondary structure in single-stranded DNA may influence the rate of deletion formation (Albertini et al., 1982a; Ripley & Glickman, 1982; Schaaper et al., 1986; see also Foster et al., 1981; Egner & Berg, 1981). This has been demonstrated experimentally for the deletion/excision of Tn5-related inserts studied by DasGupta et al. (1987). They created palindromes of varying lengths, flanked by direct repeats, and found that the longer the palindrome between the direct repeats, the more likely the deletion.

There has been considerable speculation about the mechanisms of deletion formation. The proposed mechanisms fall into two broad classes: replication and recombination. Both lagging-strand synthesis at the replication fork and repair synthesis fall into the former category. Replication-based models (e.g. see Efstratiadis et al., 1980; Albertini et al., 1982a,b; DasGupta et al., 1987) are adaptations of the slipped mispairing hypothesis suggested by Streisinger et al. (1966) to account for frameshift mutagenesis. In the replication-based models, the newly replicated repeat duplex melts, the newly synthesized strand anneals to the other (unreplicated) repeat, and replication proceeds (see Fig. 1). The newly synthesized strand has only one half of the repeat pair and none of the sequence in between. The next round of replication gives rise to one deleted and one wild-type duplex. Alternatively, repair may remove the looped-out single-stranded DNA. Figure 1(c) and (d) shows how the potential for secondary structure in single-stranded DNA can enhance the probability of deletion formation by slipped mispairing. The fact that DNA synthesis can generate deletions has been demonstrated by the finding of Kunkel (1985) that among the mutations created during in-vitro DNA synthesis by rat or chicken DNA polymerase beta are deletions involving direct repeats.

Recombination is another possible mechanism of deletion formation. A number of groups have proposed breakage-reunion mechanisms of deletion formation, involving gyrase, topoisomerase and other DNA-breaking and DNA-joining enzymes (Marvo et al., 1983; Ikeda et al., 1984; Bullock et al., 1984; Michel & Ehrlich, 1986a,b). On the other hand, it is possible that deletions might arise through unequal crossing over. In this model, direct repeats misalign in a process that is dependent on sequence similarity. Mispairing could be either interstrand or intrastrand (Fig. 2). Consistent with this notion is the finding that the frequency of deletions is elevated in some strains that are hyperrecombinogenic (Coukell & Yanofsky, 1970; Konrad, 1977). However, the role of recA in deletion formation is ambiguous (Franklin, 1971; Albertini et al., 1982a; DasGupta et al., 1987).

Figure 1. Generation of deletions during DNA synthesis. Deletions may arise by slipped mispairing during DNA synthesis. In (a) and (c), the lower strand is single-stranded template; it could be either the lagging strand in replication or the repair tract during DNA repair. The filled bar represents repeats. In (a), the newly synthesized half repeat duplex melts. In (b), the newly synthesized single-stranded half repeat misaligns on its downstream complement and replication proceeds. The top strand is the template for a deleted chromosome. In (c), the arrows along the bars indicate that there is an inverted repeat between the direct repeats and adjacent to the downstream half repeat. Melting is as in (a). In (d), the loop is stabilized by base-pairing between the inverted repeat and the upstream half repeat. As in (b), the upper strand contains the deletion.

Figure 2. Generation of deletions during recombination. (a) Repeats are represented with filled bars. The downstream half repeat on one chromosome misaligns with the upstream half repeat of another chromosome. Recombination generates a deletion and perhaps a reciprocal duplication. (b) When the misaligned half repeats are on the same chromosome, recombination yields only the deletion.

† Author to whom correspondence should be sent.
‡ "Pair of repeats" is used to refer to 2 like sequences. It is convenient to use the term "repeat" to refer to either sequence or both. We also refer to a "half repeat", focusing on 1 member of the pair of repeats.

0022-2836/88/140233-11 $03.00/0    © 1988 Academic Press Limited

In order to study deletion formation, we manipulated the dispensable region of the rIIB gene of bacteriophage T4 so that it contains a pair of direct repeats that flank a DNA fragment to be deleted. The strain containing the repeats is mutant; the deletion is pseudowild. We have tested a number of parameters that affect deletion formation, including the composition of the repeats, the distance between the repeats, and the potential to form structure in single-stranded DNA.

In order to address the question of whether recombination can generate deletions, we also created pairs of strains that are analogous to these deletion-yielding constructs, except that the "repeats" are on different chromosomes. When these two phage coinfect a host, only recombination can generate a pseudowild strain with the deletion sequence. Figure 3 shows how recombination between these two phages is analogous to recombination that yields a deletion when a single phage infects a cell, replicates and recombines.

We showed earlier (Singer et al., 1982) that the primary pathway for recombination in T4 requires approximately 50 base-pairs of homology. Lambda-plasmid recombination systems yield similar estimates (74 base-pairs; Watt et al., 1985; 27 base-pairs; Shen & Huang, 1986; 40 base-pairs; King & Richardson, 1986). Even when the homologous sequences are shorter, recombination is easily detectable in all of these systems. We suggested that a second recombination system may operate when the extent of homology is small and that this system may be responsible for deletion formation.
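The outcome described in the Introduction, where the deletion mutant keeps one copy of the repeat and loses everything in between, can be illustrated with a short sketch. This is only a string-level model under stated assumptions: the function name `delete_between_repeats` and the toy sequence are hypothetical, and the code says nothing about whether the deletion arises by slipped mispairing or by recombination.

```python
def delete_between_repeats(seq, repeat):
    """Return (deletion_product, distance) for the first pair of direct repeats.

    The product keeps a single copy of the repeat and loses the intervening
    DNA. Distance is counted from the 1st base of one repeat to the 1st base
    of the other, which is also the number of bases removed.
    """
    first = seq.find(repeat)
    if first == -1:
        raise ValueError("repeat not found")
    # Look for a second, non-overlapping copy downstream of the first.
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        raise ValueError("no second copy of the repeat")
    distance = second - first
    product = seq[:first] + seq[second:]
    return product, distance


# Hypothetical toy sequence: flank, repeat, spacer, repeat, flank.
toy = "AAA" + "GATTACA" + "CCCCC" + "GATTACA" + "TTT"
product, distance = delete_between_repeats(toy, "GATTACA")
print(product, distance)
```

Because the same product results whichever copy of the repeat is regarded as retained, the sequence of the deletion alone cannot identify where within the repeat the event occurred.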

Table 1
Sequences of strains used in constructions

[Sequence diagrams not reproduced. The panels of the table are:]

A. XbaI-SalI linker inserted in HincII site (CTTGAC) of rIIB (pBSS8)
B. Left polylinker inserted in XbaI site (pBSS24)
C. Right polylinker inserted in HincII site (pBSS21)
D. (i) G+C-rich oligomer duplex; (ii) A+T-rich oligomer duplex
E. Left half construct with repeat inserted (pBSS27 and pBSS33)
F. Right half construct with repeat inserted (pBSS32 and pBSS29)
G. Right half with 46 base-pair palindrome (pBSS44 and pBSS41)
H. Left half joined with right half (mp9 fragments are inserted into either of the indicated sites)
I. Left half joined with right half that has inverted repeat (mp9 fragments are inserted into the indicated site)
J. Sequence of deletion



We chose to begin our study of deletion formation using 24 base-pair direct repeats. These repeats are long enough to allow facile recovery of deletions, yet are well below the minimum site size we had earlier determined for T4 general recombination. Our results indicate that both recombination and replication generate deletions, at least when the direct repeats are of this length. It may be that when the repeats are shorter, other pathways come into play. It is difficult to assess the relevance of the lambda-plasmid studies to the current work, since, as far as is known, none of the host recombination systems have any influence on recombination in T4. These studies may, however, be relevant to deletion formation in Escherichia coli. For instance, at all lengths of homologous DNA measured, RecA enhances recombination; enhancement is less when the extent of homology is less (King & Richardson, 1986). This observation may, at least in part, explain why deletion formation sometimes does and sometimes does not depend on the availability of RecA.

2. Materials and Methods

(a) Strain construction: plasmids

The starting plasmid for our constructions was pNC82 (from Nancy Casna). This pBR322 derivative contains the HindIII fragment that spans the rIIA-rIIB intercistronic region (Pribnow et al., 1981). The vector was manipulated so that the HincII site in rIIB is unique. We cut the plasmid with HincII and inserted an 11 base-pair oligomer that contains an XbaI site and regenerates a HincII site (see Table 1A). Only the desired orientation of the insert yields a SalI site. This orientation was ascertained by restriction analysis and confirmed with sequence analysis. This plasmid is designated pBSS8.

The plasmid pBSS8 was cut with XbaI and the left polylinker was inserted (Table 1B). The left polylinker contains a PvuII and a BamHI site; in addition, it regenerates one XbaI site that is on the right in the desired orientation (pBSS24). The DraI site also included in this polylinker is not used in this set of constructions. The right polylinker was inserted into the HincII site of pBSS8 (Table 1C). In the desired orientation, the polylinker has a BglII site on the left followed by StuI, BclI and MluI sites (pBSS21). The MluI site is not used in this series of constructions. The orientation of the polylinkers in pBSS21 and pBSS24 was determined by restriction analysis and confirmed by sequencing.

Each half of the pair of repeats was constructed separately. Each polylinker contains a site that results in blunt ends when cut with the appropriate enzyme (PvuII or StuI), followed by a site that yields a 5'GATC overhang when cut (BamHI or BclI). All 4 sites are unique in these plasmids. The half repeat oligomer duplexes contain 1 blunt end and 1 5'GATC overhang (Table 1D). One of these oligomer duplexes is G+C-rich and the other is A+T-rich. Each was ligated to PvuII-BamHI-digested pBSS24, regenerating the BamHI site but destroying the PvuII site. These now comprise the left halves of the repeat pairs. pBSS27 contains the G+C-rich left half and pBSS33 contains the A+T-rich left half. The A+T-rich and G+C-rich oligomer duplexes were also ligated to StuI-BclI-digested pBSS21, regenerating the StuI site and destroying the BclI site. These are the right halves of the repeat pairs. pBSS32 contains the G+C-rich right half and pBSS29 contains the A+T-rich right half.

The mispairing in the A+T- and G+C-rich oligomers was designed to allow us to use 1 oligomer duplex to select 2 variants for a future study. All of the strains used here contain TAG on the sense strand at the site of the mismatch. This was confirmed by sequence analysis. The amber codon TAG is in frame in the left repeat (Table 1E) and in the deletion sequence (Table 1J); thus the "pseudowild" deletion strain is in fact an amber mutant and phenotypically pseudowild only when growing on an amber suppressor.

Replicative form M13mp9, generously provided by David S. McPheeters, was cut with Sau3AI to generate fragments with 5'GATC overhangs. It may be of interest to note that the Sau3AI site at position 2221 that is present in M13 is apparently missing in M13mp9, since repeated, otherwise complete, digestions failed to yield the predicted fragment of 507 base-pairs. (This fragment was present in comparable digests of f1 replicative form.) Only the 332 and 1698 base-pair fragments have been used in this study. These were cloned into the BamHI sites of the left half constructs and the BglII sites of the right constructs. The orientation of the inserts was determined by restriction analysis and, in some cases, confirmed by sequencing.
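As an illustration of how fragment sizes are predicted from a Sau3AI digest of a circular replicative form, the sketch below locates GATC sites on a circular sequence and returns the fragment lengths. The function name `circular_fragment_lengths` and the toy sequence are hypothetical; the real M13mp9 sequence is not reproduced here, so this does not recompute the 332, 507 or 1698 base-pair fragments.

```python
def circular_fragment_lengths(seq, site="GATC"):
    """Fragment lengths from a complete digest of a circular molecule.

    Sau3AI recognizes GATC and cuts just before the G, so each cut position
    is taken as the index where the site starts. Sites that span the
    arbitrary origin of the circular sequence are found as well.
    """
    n = len(seq)
    # Append the first len(site) - 1 bases so wrap-around sites are visible.
    wrapped = seq + seq[: len(site) - 1]
    cuts = [i for i in range(n) if wrapped.startswith(site, i)]
    if not cuts:
        return [n]  # uncut circle
    lengths = []
    for k, start in enumerate(cuts):
        nxt = cuts[(k + 1) % len(cuts)]
        # A single site linearizes the circle into one full-length fragment.
        lengths.append((nxt - start) % n or n)
    return sorted(lengths)


# Hypothetical 14-base circle with two GATC sites.
print(circular_fragment_lengths("GATCAAAAGATCCC"))
```

A site that is lost, as reported above for position 2221 in M13mp9, would show up in such a calculation as two neighbouring fragments merging into one.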

As noted before, each right half repeat is bounded on its left side by a StuI site that is preceded by a BglII site. Cutting with StuI and BglII yields a blunt end and a 5'GATC overhang in the orientation opposite to that used before to insert the A+T- or G+C-rich repeat oligomer duplex. Ligating the G+C-rich repeat oligomer duplex to StuI-BglII-cut pBSS32 yields pBSS44, which has the G+C-rich repeats immediately adjacent to each other in opposite orientations with a StuI site in the middle. The identical manipulation with pBSS29 and the A+T-rich repeat oligomer duplex yields pBSS41.

Each left half construct has an XbaI site to the right of the left half repeat and inserted mp9 fragment (if any). Each right half construct has an XbaI site to the left of the right half repeat and inserted mp9 fragment (if any). Constructs with both left and right halves (Table 1H and I) were created by cutting all plasmids with XbaI and EcoRI (resident asymmetrically in the vector portion of the plasmids), gel-purifying the appropriate fragments and ligating. Gel purification was accomplished with SeaPlaque low-melting-point agarose and elutions were effected with Schleicher and Schuell Elutip-D.

When the 332 base-pair mp9 fragment is inserted into the BamHI site in the + orientation, it puts an ATG in-frame with the downstream sequence. This ATG is preceded by a sequence that, by eye, looks very similar to a ribosome binding site. Indeed, when the entire sequence is resident in T4, the strain is rII− but it has a very high reversion frequency. In order to circumvent this difficulty, we cut the plasmid with XbaI, filled-in with Klenow and all 4 deoxynucleotide triphosphates, and ligated. The resulting sequence, when resident in T4, is rII− and has a very low reversion frequency. This was the strain we used for the crosses reported in Table 4.

(b) Sequencing

We used dideoxy sequencing using AMV reverse transcriptase from Life Sciences to sequence strains of interest. The template was either HindIII-digested plasmid or RNA from T4-infected cells. These procedures have been described in detail by Shinedling et al. (1987).

(c) Synthetic oligonucleotides

Oligonucleotides (including primers for sequence analysis) were synthesized on an Applied Biosystems model 380a DNA synthesizer and purified as described by Shinedling et al. (1987).

(d) Strain construction: bacteriophage

The rIIB mutations constructed on plasmids were recombined into bacteriophage T4 as follows. rII mutations phenotypically suppress mutations in T4 gene 30, which encodes DNA ligase. Amber-suppressing, plasmid-bearing strains are permissive for growth of 30,,ci04 (rII+) and allow recombination to occur between the plasmid and the phage chromosome. Plating the progeny phage on a non-suppressing host, Bb, which is permissive for rII 30− double mutants but not for the rII+ 30− parent, selects for recombinants that have acquired the mutation constructed on the plasmid. Mutants were verified by spot test crosses. The 30− mutation was removed by crossing to wild-type and selecting mutants that make r plaques on S/6, a host that is non-permissive for rII− 30− double mutants. Again, rII mutants were verified by spot test crosses. This procedure has been described in detail (Singer et al., 1981).

The bacteriophage T4 genome contains 166 × 10³ bases of DNA with about 1 to 2% terminal redundancy. Since the T4 chromosome can accommodate only limited amounts of additional DNA, we used saΔ9 (Depew et al., 1976) as a compensatory deletion to allow for the additional sequences inserted into the rIIB gene. All of the mutants that were recombined into the 30− background were also recombined into a comparable saΔ9 30− background. These strains were backcrossed to saΔ9 in a manner entirely analogous to the procedure above. The double mutant saΔ9 30,,,$104 was provided by Jonatha Gott and David Shub. The deletions saΔ9 and 30,,,$104 were isolated in the T4D genetic background, whereas most rII mutants were isolated in T4B. The strains reported here have been backcrossed extensively to ensure that the background is T4B.

(e) Determination of deletion frequencies

(i) Uniparental

Phage were plated on Bb and, after 5 h at 37°C, plaques were plucked into 0.5 ml of saline in an 18 mm × 150 mm tube. The next day, 1 ml of Bb at 2 × 10⁷/ml, growing exponentially in H broth, was added. The tube was rolled for 75 min at 37°C. This constitutes an initial multiplicity of infection of about 0.1 phage/cell in liquid growth. Overall, there were about 5 to 6 rounds of growth from single phage particle to the lysate in which deletion frequencies were measured. Deletion frequencies were determined by plating on CR63 to determine total phage count and on CR63(λ) to determine the number of deletions.

Strains with 400 base-pairs more than the wild-type amount of DNA are at a growth disadvantage (Singer & Parma, 1987). In order to avoid selecting for mutants that have acquired a deletion, we used strains with saΔ9 in the background to determine deletion frequencies. The deletion saΔ9 does not affect the rate of deletion formation (Singer & Parma, 1987). We also collected extensive data for deletion frequencies of strains with 30− and/or T4D/T4B hybrid backgrounds. Neither of these factors affects deletion frequencies (data not shown).

(ii) Biparental

Bb was grown in H broth at 37°C to a density of 2 × 10⁸/ml. A sample (5 ml) of bacteria in a 50-ml flask was infected at a multiplicity of 5 of each parent, added in a volume of 1 ml. Crosses were aerated at 37°C for 1 h. Deletion frequencies were determined as for the uniparental crosses.

(f) Statistics

We used the Minitab statistical package that is available on the University of Colorado Computing Center DEC VAX 8550 computer to compare deletion frequencies in uniparental crosses. We used Minitab's Mann-Whitney test (McKean & Ryan, 1977) to determine the likelihood that 2 sets of data are drawn from populations with the same distributions. Because the events that happen early in the growth of a lysate contribute disproportionately to deletion frequencies, it is appropriate to report the median, rather than the mean, deletion frequency. Thus, we could not calculate standard deviations. The Mann-Whitney U test is a nonparametric statistical test that compares 2 sets of data by merging the data in rank order. The test then evaluates the randomness of the interspersion of the 2 data sets.
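The rank-merging step of the Mann-Whitney U test described above can be sketched in a few lines. This is a minimal illustration, not the Minitab procedure the authors ran: the function names and the example frequencies are hypothetical, ties receive the average of the ranks they span, and the conversion of U into a significance level is omitted.

```python
from statistics import median


def midranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b); a small U for either sample means little interspersion."""
    n_a, n_b = len(sample_a), len(sample_b)
    ranks = midranks(list(sample_a) + list(sample_b))  # merge, then rank
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    return u_a, n_a * n_b - u_a


# Hypothetical deletion frequencies from two sets of independent lysates.
set_1 = [3.1, 3.5, 2.8, 4.0, 3.6]
set_2 = [38.0, 41.0, 52.0, 35.0, 44.0]
print(median(set_1), median(set_2), mann_whitney_u(set_1, set_2))
```

When the two sets do not overlap at all, one of the U values is zero, which is the least random interspersion possible for the given sample sizes.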

Table 2
Median deletion frequencies when composition of repeats and distance between direct repeats are varied

                                       Median deletion frequency (× 10⁷)
Insert   Orientation   Distance (bp)   A+T-rich direct repeats   G+C-rich direct repeats   Ratio G+C/A+T
None                   44              780                       8500                      11
332      +             376             15                        1950                      130
332      −             376             10                        n.d.
1698     −             1742            3.5                       41                        12

Deletion frequencies were measured as described in Materials and Methods. Each median reported represents at least 5 independent measurements. The + orientation designates a fragment inserted in the clockwise direction relative to the mp9 map, when rIIA-rIIB is considered to be clockwise. As described in the text, the distance between the repeated sequences is calculated from the 1st base-pair of one repeat to the 1st base-pair of the other. n.d., not done; bp, base-pairs.
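The last column of Table 2 is the quotient of the two medians in each row. The sketch below recomputes it from the tabulated values; the variable names are illustrative only, and rounding to whole numbers is an assumption about how the published ratios were presented.

```python
# (insert, distance in bp, A+T-rich median, G+C-rich median), all x 10^7,
# copied from the rows of Table 2 in which both strains were measured.
rows = [
    ("None", 44, 780, 8500),
    ("332", 376, 15, 1950),
    ("1698", 1742, 3.5, 41),
]

ratios = {}
for insert, distance, at_rich, gc_rich in rows:
    # Fold enhancement of deletion formation by G+C-rich repeats.
    ratios[(insert, distance)] = gc_rich / at_rich

for key, value in ratios.items():
    print(key, round(value))
```

Read this way, the enhancement by G+C richness is about tenfold at the shortest and longest distances and considerably larger for the 332 base-pair insert.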

Recombinants arising in a single growth round are thought to be independent (Steinberg & Stahl, 1961). Hence for biparental crosses we determined means and standard deviations (Table 4) using a Casio fx-8000 calculator. As expected for a normal distribution, the means and the medians are the same in the biparental crosses. We used the xyplo program of the Delila DNA analysis package to do linear regressions (Schneider et al., 1982).

In order to determine the structure of the strains shown in Table 6, we used an RNA folding program (Martinez, 1984). This program uses a Monte Carlo method to assess the population of foldings. Multiple foldings allow the evaluation of the relative importance of a structure. Because of the asymmetry of folding of Watson versus Crick (that is, G and T on one strand may participate in pairing but A and C on the complement cannot), we examined both strands for available structures. (This program can be used for reasonably long pieces of DNA, but only local stem-loop structures are predicted. Hence, we did not use it in attempting to discover structures that involve distant complements.)

3. Results

(a) Uniparental infections of strains with direct repeats

(i) Effect of base composition of the repeats

In order to find out if the base composition of the direct repeats affects the rate of deletion formation, we made pairs of strains that are identical except for the A+T richness or G+C richness of the repeats. As described in Materials and Methods, we made synthetic oligonucleotide duplexes to serve as the repeated sequences. The length of the repeated sequence is 24 base-pairs. The A+T-rich repeat has 17 A·T base-pairs and the G+C-rich repeat has 19 G·C base-pairs. Twelve nucleotide positions are invariant between the A+T- and G+C-rich repeats (Table 1E and F). The balance of the nucleotides is variable but the positions of purines and pyrimidines are fixed. Each variable pyrimidine is cytosine in the G+C-rich insert and thymine in the A+T-rich insert. Similarly, all variable purines are adenine in the A+T-rich insert and guanine in the G+C-rich insert. The sequence of the variable nucleotides was generated by a random sequence generator.

Table 2 shows the deletion frequencies of these strains. Deletion formation is enhanced when the direct repeats are G+C-rich. The ratio of deletion formation in the G+C-rich strains relative to the A+T-rich strains is shown in the last column of Table 2. This ratio varies from 10 to 130.

Table 3
Deletions sequenced

Insert   Orientation   Without inverted repeat   With inverted repeat
A. A+T-rich direct repeats
None                   2                         4†
332      +             1                         1
332      −             1                         1
1698     −             2                         2†
B. G+C-rich direct repeats
None                   2                         2
1698     −             2                         2†

We sequenced a number of each deletion strain as indicated. We used AMV reverse transcriptase to sequence RNA from cells infected with independently isolated pseudorevertants (Shinedling et al., 1987). Each sequencing gel contained all 4 dideoxynucleotide sequencing reactions for 1 pseudorevertant. We determined only A or G for the balance of pseudorevertants analyzed on a particular gel, since that was adequate to identify the deletion. All strains appear to be identical with the completely sequenced pseudorevertant and have the predicted sequence precisely.
† One of these was sequenced completely.

Figure 3. Comparison of deletion formation generated through recombination in the uniparental and biparental configurations. (a) Schematic of chromosome with left repeat (filled bar) and an mp9 fragment (hatched bar) inserted in the BamHI site between the repeat and the XbaI site (X). Other than XbaI, restriction sites are represented with a dotted line. The sequence upstream from the direct repeat is wild-type rIIB. (b) Schematic of chromosome with right repeat and the same mp9 fragment as in (a) inserted into the BglII site between the repeat and the XbaI site. The sequence downstream from the direct repeat is wild-type rIIB. (c) Chromosomes depicted in (a) and (b) aligned for the recombination event that yields the deletion sequence. (d) Chromosome with both repeats and the same mp9 fragment as in (a) and (b). Here the insert is in the BamHI site adjacent to the left repeat. The portion of this chromosome that is to the left of the XbaI site derives from the plasmid that corresponds to the phage chromosome depicted in (a). The portion of the chromosome to the right of the XbaI site derives from the plasmid sequence shown in Table 1F. This is the same as the chromosome depicted in (b) above, except that it does not have an mp9 insert in the BglII site. The relevant plasmids were cut with XbaI and EcoRI (resident in the vector); the fragments were purified and ligated to yield the plasmid that corresponds to the phage chromosome shown. (e) At low multiplicity, 1 phage chromosome is initially present in the infected cell. After replication generates copies, 2 chromosomes misalign in a fashion analogous to the misalignment shown in (c). This misalignment generates the deletion.

(ii) Effect of distance between the directly repeated sequences

The smallest distance between the directly repeated sequences in our constructs is 44 base-pairs, counting from the first base-pair of the first sequence to the first base-pair of its repeat. In order to vary this distance, we inserted Sau3AI fragments from M13mp9 into the BamHI site adjacent to the left member of the repeat pair (Table 1E and H). The results shown in Table 2 are all from strains with the mp9 fragments inserted in the BamHI site adjacent to the left half of the pair of repeated sequences. We have also inserted the 1698 base-pair fragment in the minus orientation into the BglII site next to the right A+T-rich repeat. The median value of deletion frequencies obtained for this strain is 2.3 × 10⁻⁷. We also made a strain with the 332 base-pair fragment inserted in the minus orientation in the BglII site adjacent to the right A+T-rich repeat. This strain reverts with a median frequency of 1.3 × 10⁻⁶. In both cases, the deletion frequency is the same (within calculated experimental error) as the comparable strain with the fragment inserted into the BamHI site.
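The distances listed in Tables 2 and 5 follow from simple addition: the 44 base-pair minimum, plus the length of any mp9 insert, plus the 14 base-pair net increase reported for strains carrying the inverted repeat. The helper below is a hypothetical convenience for checking those figures; its name and signature are not from the paper.

```python
BASE_DISTANCE_BP = 44        # 1st base-pair of one repeat to 1st of the other
INVERTED_REPEAT_EXTRA_BP = 14  # net increase when the inverted repeat is added


def repeat_distance(insert_bp=0, inverted_repeat=False):
    """Distance between direct repeats for a given construct, in base-pairs."""
    extra = INVERTED_REPEAT_EXTRA_BP if inverted_repeat else 0
    return BASE_DISTANCE_BP + insert_bp + extra


for insert in (0, 332, 1698):
    print(insert, repeat_distance(insert), repeat_distance(insert, inverted_repeat=True))
```

Because the distance is counted from first base-pair to first base-pair, it includes the 24 base-pairs of one repeat, and it equals the amount of DNA removed by the deletion.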

(iii) Proof that the "deletion frequencies" reported do not include other sorts of mutations

It was possible that mutations other than deletions could contribute to the calculated "deletion frequencies". For instance, any duplication that fuses the distal portion of rIIB in-frame to a viable ribosome binding site will result in a pseudorevertant (Freedman & Brenner, 1972). In order to assess the contribution of events other than those expected to the pseudowild frequencies we measured, we sequenced several independent pseudorevertants (Table 3). All of the sequenced mutations had precisely the deletion sequence predicted.

(b) Recombination

The design of the starting plasmids permits insertion of Sau3AI fragments into either the BamHI site adjacent to the left direct repeat (Table 1E) or into the BglII site adjacent to the right repeat (Table 1F). This feature makes it possible to create pairs of strains with the left repeat on one chromosome and the right repeat on another chromosome, each with the same mp9 fragment. Figure 3 shows such a pair. One chromosome contains the left repeat and an insert in the BamHI site. The other chromosome contains the right repeat and the same fragment in the same orientation inserted into the BglII site. When these two strains coinfect a host, only recombination can yield pseudowild progeny with the deletion sequence. Table 4 shows the recombination frequencies measured in three such crosses and compares the deletion frequencies obtained in the uniparental configuration.

Table 4
Deletion frequencies in biparental and uniparental crosses

                                 Deletion frequencies (× 10⁷)
Repeat   Insert   Orientation   Biparental     Uniparental
AT       332      +             4.3 ± 1.4      15
AT       332      −             4.7 ± 0.8      10
GC       1698     −             76.0 ± 16.0    41

The deletion frequencies shown for the uniparental configurations are for strains with the M13mp9 fragment inserted into the BamHI site. As noted in the text, one parent in the biparental configuration has the cognate fragment inserted into the BamHI site and the other has it inserted into the BglII site. Median frequencies are reported for the uniparental configuration and mean frequencies for the biparental configuration (see Materials and Methods). The deletion frequencies reported for the biparental crosses are significantly greater than the reversion frequencies observed in the parental self-crosses.

(c) Effect of an inverted repeat between the direct repeat sequences

Strains that include an inverted repeat between the direct repeats have the potential to form DNA structures in single-stranded regions. These structures cannot be formed in strains without the inverted repeats. Pairwise comparison of strains with and without inverted repeats allows us to assess the effect of the potential to form structure on deletion frequency. Restriction sites were placed so that it was possible to insert the A+T- and G+C-rich oligonucleotide duplexes adjacent to the right halves of the repeat sequences and in the opposite orientation (Table 1G). This construct results in the creation of a palindrome of overall length 46 bases. The left half of the palindrome can form a stem with either of the direct repeats (Table IT). In contrast with some reports (e.g. see Leach & Stahl, 1983; Goodchild et al., 1985), there is no evidence that this palindrome is genetically unstable in the rIIB gene, either in the A+T- or G+C-rich configuration. The net increase in distance between direct repeats in strains with the inverted repeats added is 14 base-pairs. Table 5 shows the deletion frequencies of these strains and the ratio of deletion frequencies in strains containing inverted repeats relative to the corresponding strains lacking inverted repeats. We also sequenced a number of independent pseudorevertants of these strains (Table 3). All have the predicted sequence.

4. Discussion

(a) Mechanisms of deletion formation

(i) Recombination In order to find out if recombination can generate deletions, we measured deletion formation in a biparental cross in which the repeats are on

Table 5
Median deletion frequencies in strains with inverted repeats

Insert   Orientation   Distance (bp)   Median deletion frequency (x 10’)   Ratio (with/without inverted repeat)

A. A+T-rich repeats
None                   58              20,500                              26.0 (0.02)†
332      +             390             38                                  2.5 (0.02)
332      -             390             57                                  5.7 (0.005)
1698     -             1756            39                                  1.1 (0.07)

B. G+C-rich repeats
None                   58              9900                                1.2 (0.65)
332      +             390             1500                                0.3 (0.30)
1698     -             1756            250                                 6.1 (0.005)

Deletion frequencies were measured as described for Table 2. As described in the text, the distance between the repeated sequences is calculated from the 1st base-pair of one repeat to the 1st base-pair of the other. bp, base-pairs.
† The value in parentheses next to the ratio is the probability that the 2 median values used to calculate the ratio are drawn from populations with the same distribution, i.e. the significance. See Materials and Methods.
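The ratio and significance columns of Table 5 can be illustrated with a short calculation. The sketch below is not the procedure described in Materials and Methods (which is not reproduced in this section); it swaps in an exact permutation test on the difference of medians as one simple way to ask whether two sets of measurements could come from the same distribution. The function names and the replicate values are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from statistics import median


def median_ratio(with_ir, without_ir):
    """Ratio of median deletion frequencies (with / without inverted repeat)."""
    return median(with_ir) / median(without_ir)


def exact_permutation_p(sample_a, sample_b):
    """Two-sided exact permutation p-value for the difference of medians.

    Every way of splitting the pooled values into groups of the original
    sizes is enumerated, so this is only practical for small samples.
    """
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    observed = abs(median(sample_a) - median(sample_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Count splits at least as extreme as the observed one.
        if abs(median(group_a) - median(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical replicate frequencies (arbitrary units), not data from Table 5.
with_inverted_repeat = [50, 60, 70, 80, 90]
without_inverted_repeat = [5, 6, 7, 8, 9]

ratio = median_ratio(with_inverted_repeat, without_inverted_repeat)
p_value = exact_permutation_p(with_inverted_repeat, without_inverted_repeat)
print(f"ratio = {ratio:.1f}, p = {p_value:.3f}")
```

With only a handful of replicates per strain the smallest attainable p-value is limited by the number of possible splits, which is one reason to report the probability alongside each ratio rather than a simple significant/not significant call.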


different chromosomes (see Fig. 3). Table 4 compares each biparental cross with the cognate uniparental “cross”. It is appropriate to consider this a cross, since, after the first round of replication, there are many copies of the chromosome available to participate in recombination. In all three biparental crosses, recombination yields deletions. Although direct comparison is inappropriate because the two sets of experiments were done differently (see Materials and Methods), the biparental crosses serve as a rough indicator of the relative amount of deletion formation attributable to recombination in the uniparental configuration. In crosses in which the repeats are G+C-rich, there is more deletion formation in the biparental than in the uniparental configuration. It is possible that one consequence of high multiplicity of infection (which is necessary to ensure that all cells are infected with both parents) is to stimulate recombination. When the repeats are G+C-rich, it seems likely that recombination is responsible for all or virtually all of the deletion formation observed in the uniparental configuration. This result also indicates that this recombination is an interchromosomal rather than an intrachromosomal event.

Recombination yields deletions to a lesser extent in the biparental crosses in which the repeats are A+T-rich and the inserts are 332 base-pairs. If we assume, on the basis of the results from the cross with G+C-rich repeats, that the recombination frequency measured in the biparental crosses is about twofold higher than the recombination frequency in the uniparental crosses, then the amount of deletion formation attributable to recombination when the repeats are A+T-rich and the distance is 376 base-pairs is about 15% of the total. Slipped mispairing and intrachromosomal recombination must account for the balance of deletion formation in this case.

It should be noted that part of the G+C-rich repeat has alternating C and G. In T4 DNA, cytosine bases are substituted in their 5 position with glucosylated hydroxymethyl groups. Thus, part of the G+C-rich repeat may assume Z form (Behe & Felsenfeld, 1981) and this in turn may stimulate recombination (Klysik et al., 1982; Murphy & Stringer, 1986). It will be interesting to test this hypothesis. Even if this is true, it is unlikely to be the whole story (see below).
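The estimate of the recombination share made above is a two-step calculation: scale the biparental frequency down by the assumed twofold inflation, then divide by the uniparental deletion frequency. A minimal sketch follows; the function name and the input frequencies are hypothetical (the Table 4 values are not reproduced here), and the twofold factor is the assumption stated in the text.

```python
def recombination_share(biparental_freq, uniparental_freq, fold_inflation=2.0):
    """Fraction of uniparental deletions attributable to interchromosomal
    recombination, assuming the biparental cross overestimates the
    recombination frequency by `fold_inflation`."""
    if uniparental_freq <= 0 or fold_inflation <= 0:
        raise ValueError("frequencies and fold_inflation must be positive")
    expected_recombinants = biparental_freq / fold_inflation
    # A share cannot exceed the whole, so cap the estimate at 1.0.
    return min(expected_recombinants / uniparental_freq, 1.0)


# Hypothetical frequencies in arbitrary but matching units.
# Biparental frequency twice the uniparental one: recombination explains all.
print(recombination_share(200.0, 100.0))
# Biparental frequency 0.3 times the uniparental one: about 15% of the total.
print(recombination_share(30.0, 100.0))
```

Because the two sets of crosses were done under different conditions, the result is best read as an order-of-magnitude indicator rather than a measured fraction.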

(ii) Slipped mispairing It has been suggested that the potential to form structure in single-stranded DNA enhances deletion formation (Ripley & Glickman, 1982). Such an enhancement of deletion formation would be indicative of the role of slipped mispairing because structure would bring the repeats closer together and tend to stabilize the deletion loop (Fig. 1). In order to determine the effect on deletion formation of the kind of structure proposed by Ripley & Glickman (1982), we constructed pairs of strains that either included or did not include a long inverted repeat adjacent to the right direct repeat. The inverted repeat forms a palindrome with the right repeat; it also has the potential to form a stem with the left repeat. In one of the pairs that we tested, the inverted repeat causes a dramatic (26-fold) stimulation of deletion formation (Table 5). Whereas an inverted repeat greatly stimulates deletion formation in the strain with A+T-rich repeats and no mp9 fragment insert, the same is not true of the cognate strain with G+C-rich repeats. Moreover, the strain with an A+T-rich inverted repeat is the only case in which the deletion frequency of the A+T-rich strain is higher than the deletion frequency of the comparable G+C-rich strain.

Evaluation of the structures that are formed in these four strains (for details, see Materials and Methods) leads to the conclusion that the strains with inverted repeats do indeed have the structure that is predicted to enhance slipped mispairing (Fig. 3). Only local and rather weak structures are formed by the sequence with A+T-rich direct repeats and no inverted repeat. No one structure is particularly favored over the others. The strain with the G+C-rich repeats formed two different, very stable structures, depending on which strand was analyzed, but the structures formed do not readily lend themselves to deletion formation by slipped mispairing. We propose that melting of the duplex is the limiting step in slipped mispairing. Melting of the first of the pair of direct repeats would be far more likely in the strain with the A+T-rich repeats; hence, the structure made possible by the inverted repeat would have the opportunity to form, thus stimulating deletion formation by slipped mispairing. The 26-fold stimulation attributable to the inverted repeat attests to how important slipped mispairing can be for deletion formation, at least in some cases. (It should be noted that replication that reads past a strong hairpin structure in the template strand is not a predominant mechanism of deletion formation. If it were, deletions would be associated with hairpin structures, and there would be no requirement for direct repeats.) Among the other pairs of strains tested, the potential for structure either stimulates or has no effect on the rate of deletion formation.

(b) Parameters that affect deletion formation

(i) Effect of base composition of the repeats The most striking parameter that affects deletion formation is the base composition of the half repeats. With the one exception noted above, the strains in which the repeats are G+C-rich have deletion frequencies at least ten times that of the comparable strains with A+T-rich repeats. Stimulation of deletion formation by the G+C richness of the direct repeats is not limited to our system. DasGupta et al. (1987) analyzed deletion formation in pBR322. The direct repeats in their starting strains were generated by the duplication

Table 6
ΔG of direct repeat versus logarithm of deletion frequency: correlation coefficients

Correlation coefficients for various configurations

                                Size of palindrome (bp)
                IS50/Lambda     22       32       90
All data        0.75            0.92     0.61     0.36
A 10 omitted    030             0.95     0.92     0.63

Using the values reported by Breslauer et al. (1986), we have calculated the ΔG values of the direct repeats involved in the deletions studied by DasGupta et al. (1987). While duplication generates 9 base-pair repeats, in a few instances, the direct repeats are actually 10 base-pairs long because the base-pair adjacent to the duplication is the same as the base-pair at the opposite end of the transposon. We regressed those values on the logarithms of the deletion frequencies they report. The values reported here are the correlation coefficients (r). The 1st line includes all data reported. The next line shows the correlation coefficients when deletion frequencies from one anomalous strain (and its derivatives) are eliminated from the data set. bp, base-pairs.

event associated with insertion of a Tn5-related transposon into the beta-lactamase gene of pBR322. The transposon contains 45 × 10³ bases between 1534 base-pair IS50 inverted repeat sequences, flanked by the duplicated base-pairs. Precise excision of the inserted sequences, including the duplicated nine base-pairs, regenerates the wild-type beta-lactamase gene. They studied the deletion of 14 insertions at different sites within the gene and found that the frequencies varied more than 1000-fold. They point out that deletion frequencies are strongly dependent on features of the DNA sequence of the insertion site. We calculated the ΔG value of the direct repeat at each insertion site using the values given by Breslauer et al. (1986). A linear regression of the free energies on the logarithms of the deletion frequencies gives a reasonably good fit (r = 0.75). DasGupta et al. (1987) used restriction sites present in the 1534 base-pair inverted repeat sequences to delete most of the inverted repeat sequences and all of the 42 × 10³ base-pairs of lambda chromosome that had been located between the inverted repeats. This procedure generated 22, 32 and 90 base-pair palindromes between the direct repeats. They measured deletion frequencies in these strains also. The correlation coefficient between the ΔG value of the repeat and the logarithm of the frequency of deleting the 22 base-pair palindrome (and the duplicated 9 base-pairs) is 0.92. Table 6 shows that the correlation coefficients calculated for strains with the 32 and 90 base-pair palindromes are much smaller (0.61 and 0.36). As they note, the data from one of their insertion sites are anomalous. When data from this set of strains are eliminated, there is a good correlation between ΔG values and the logarithm of the deletion frequency in all of the configurations tested. When one regresses ΔG on the deletion frequency instead of on the logarithm of the deletion frequency, the fit is substantially worse (not shown). This implies that the relationship may reflect a binding constant, and suggests that formation of the misaligned duplex is limiting in deletion formation.

We have suggested that, in slipped mispairing, melting of the newly replicated repeat is limiting. If this is true, deletion formation through slipped mispairing is most probable when the direct repeat has a small ΔG of melting. In recombination, single-stranded DNA is already available. Hence, we expect deletion formation via recombination to be enhanced by large negative ΔG values. The stimulation of deletion formation by large negative ΔG in both T4 and pBR322 is striking. Our data suggest that both recombination and slipped mispairing are important mechanisms of deletion formation. It will be interesting to determine the relative contributions of each.

(ii) Effect of distance on deletion formation The data of Table 2 (with mp9 fragments in the BamHI site) and the two additional cases cited (with mp9 fragments in the BglII site) show that there is an inverse relationship between the frequency of deletions and the distance between the direct repeats when the same direct repeats are used. These findings substantiate earlier results (Galas, 1978; Albertini et al., 1982b) but there are too few data to determine the precise relationship. It is obvious that increasing the distance would diminish the frequency of deletion formation via slipped mispairing, since the closer the repeats the greater the probability that both will occur in the same replication fork or single-stranded repair tract. Indeed, with the A+T-rich repeats (which we have argued are most likely to delete through slipped mispairing), the stimulation afforded by the potential for structure diminishes as distance increases and disappears when the distance between repeats is great. The reason for a distance effect on deletion formation via recombination is less clear. After all, if one 24 base-pair sequence has to find another, what difference should it make how much DNA there is in between?
We did crosses like those shown in Table 4 except that the parent with the right repeat had no mp9 fragment inserted. (This strain is rII- because of a frameshift; Table 1F.) The recombination (deletion) frequencies in these crosses are several hundred to several thousand times higher than in cognate crosses with mp9 inserts in both parents (data not shown). In these crosses, the chromosomes can align by the homologous T4 sequence upstream from the repeat sequences. This alignment puts the right repeat and left repeat in close proximity. In the crosses in which both parents have inserts, the comparable alignment has the left and right repeats 400 or 1800 base-pairs apart. We suggest that, in the uniparental situation, correct alignment of the DNA between the repeats competes with the misalignment, hence the inverse relationship between distance and deletion frequency. The importance of distance and alignment strongly implies that the “recombination” responsible for the deletions we have studied is attributable to homology-dependent recombination and not recombination effected by DNA breaking and joining enzymes.

We thank Rhonda Kalil and Jose Parra for technical assistance. David McPheeters, David Parma, Tom Schneider, Sidney Shinedling and Gary Stormo have helped in a variety of ways, primarily by giving freely of their experience, expertise, and intelligence. We especially thank Larry Gold for all of the above (except technical assistance) and for providing space and facilities. This work was supported by NIH grant GM28634.

References

Albertini, A. M., Hofer, M., Calos, M. P. & Miller, J. H. (1982a). Cell, 29, 319-328.
Albertini, A. M., Hofer, M., Calos, M. P., Tlsty, T. D. & Miller, J. H. (1982b). Cold Spring Harbor Symp. Quant. Biol. 47, 841-850.
Behe, M. & Felsenfeld, G. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 1619-1623.
Breslauer, K. J., Frank, R., Blocker, H. & Marky, L. A. (1986). Proc. Nat. Acad. Sci., U.S.A. 83, 3746-3750.
Bullock, P., Forrester, W. & Botchan, M. (1984). J. Mol. Biol. 174, 55-84.
Coukell, M. B. & Yanofsky, C. (1970). Nature (London), 228, 633-635.
DasGupta, U., Weston-Hafer, K. & Berg, D. E. (1987). Genetics, 115, 41-49.
Depew, R. E., Snopek, T. J. & Cozzarelli, N. R. (1975). Virology, 64, 144-152.
Efstratiadis, A., Posakony, J. W., Maniatis, T., Lawn, R. M., O’Connell, C., Spritz, R. A., DeRiel, J. K., Forget, B. G., Weissman, S. M., Slightom, J. L., Blechl, A. E., Smithies, O., Baralle, F. E., Shoulders, C. C. & Proudfoot, N. J. (1980). Cell, 21, 653-668.
Egner, C. & Berg, D. E. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 459-463.
Farabaugh, P. J., Schmeissner, U., Hofer, M. & Miller, J. H. (1978). J. Mol. Biol. 126, 847-857.
Foster, T. J., Lundblad, V., Hanley-Way, S., Halling, S. M. & Kleckner, N. (1981). Cell, 23, 215-227.
Franklin, N. C. (1971). In The Bacteriophage Lambda (Hershey, A. D., ed.), pp. 175-194, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Freedman, R. & Brenner, S. (1972). J. Mol. Biol. 69, 409-419.
Galas, D. J. (1978). J. Mol. Biol. 126, 858-863.
Goodchild, J., Michniewicz, J., Seto-Young, D. & Narang, S. (1985). Gene, 33, 367-371.
Ikeda, H., Kawasaki, I. & Gellert, M. (1984). Mol. Gen. Genet. 196, 546-549.
King, S. R. & Richardson, J. P. (1986). Mol. Gen. Genet. 204, 141-147.
Klysik, J., Stirdivant, S. M. & Wells, R. D. (1982). J. Biol. Chem. 257, 10152-10158.
Konrad, E. B. (1977). J. Bacteriol. 130, 167-172.
Kunkel, T. A. (1985). J. Biol. Chem. 260, 5787-5796.
Leach, D. R. F. & Stahl, F. W. (1983). Nature (London), 305, 448-451.
Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982). In Molecular Cloning: A Laboratory Manual, pp. 156-170, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Martinez, H. M. (1984). Nucl. Acids Res. 12, 323-334.
Marvo, S. L., King, S. R. & Jaskunas, S. R. (1983). Proc. Nat. Acad. Sci., U.S.A. 80, 2452-2456.
McKean, J. W. & Ryan, T. A., Jr (1977). ACM Trans. Math. Software, 3, 183-185.
Michel, B. & Ehrlich, S. D. (1986a). Proc. Nat. Acad. Sci., U.S.A. 83, 3386-3390.
Michel, B. & Ehrlich, S. D. (1986b). EMBO J. 5, 3691-3696.
Murphy, K. E. & Stringer, J. R. (1986). Nucl. Acids Res. 14, 7325-7340.
Pribnow, D., Sigurdson, D. C., Gold, L., Singer, B. S., Napoli, C., Brosius, J., Dull, T. J. & Noller, H. F. (1981). J. Mol. Biol. 149, 337-376.
Ripley, L. S. & Glickman, B. W. (1982). Cold Spring Harbor Symp. Quant. Biol. 47, 851-861.
Schaaper, R. M., Danforth, B. N. & Glickman, B. W. (1986). J. Mol. Biol. 189, 273-284.
Schneider, T. D., Stormo, G. D., Haemer, J. S. & Gold, L. (1982). Nucl. Acids Res. 10, 3013-3024.
Shen, P. & Huang, H. V. (1986). Genetics, 112, 441-457.
Shinedling, S. T., Singer, B. S., Gayle, M., Pribnow, D., Jarvis, E., Edgar, B. & Gold, L. (1987). J. Mol. Biol. 195, 471-480.
Singer, B. S. & Parma, D. (1987). T4 Notes, 1, 6-7.
Singer, B. S., Gold, L., Shinedling, S. T., Colkitt, M., Hunter, L. R., Pribnow, D. & Nelson, M. A. (1981). J. Mol. Biol. 149, 405-432.
Singer, B. S., Gold, L., Gauss, P. & Doherty, D. H. (1982). Cell, 31, 25-33.
Steinberg, C. & Stahl, F. (1961). J. Theoret. Biol. 1, 488-497.
Streisinger, G., Okada, Y., Emrich, J., Newton, J., Tsugita, A., Terzaghi, E. & Inouye, M. (1966). Cold Spring Harbor Symp. Quant. Biol. 31, 77-84.
Studier, F. W., Rosenberg, A. H., Simon, M. N. & Dunn, J.
Edited by I. Herskowitz

Proc. Nat. Acad. Sci., U.S.A. 82, 4768-4772.