[26]
METHODS FOR in Vitro DNA RECOMBINATION
[26] Methods for in Vitro DNA Recombination and Random Chimeragenesis
By ALEXANDER A. VOLKOV and FRANCES H. ARNOLD

Introduction
In vitro polymerase chain reaction (PCR)-based methods for recombining homologous DNA sequences are capable of creating highly mosaic chimeric sequences. Several different methods have been reported for in vitro recombination or "DNA shuffling": the original Stemmer method of DNase I fragmentation and reassembly,1 the staggered extension process (StEP),2,3 and random-priming recombination.4 We have found that slight variations in the shuffling protocols can affect the outcome of the experiment. Furthermore, different gene sequences recombine most efficiently under different conditions. Here we provide protocols that are designed to give a high likelihood of success. The protocols presented here are known to work for recombining sequences of ≥85% identity.
Methods

Stemmer Method
The Stemmer method involves digestion of parental DNA molecules into small fragments, using DNase I, and reassembly by overlap extension PCR. In the presence of Mg²⁺ or Mn²⁺, DNase I exhibits no obvious sequence specificity and generates DNA fragments randomly distributed over the gene length. Fragments are assembled in a cyclic PCR-like reaction with denaturation, annealing, and extension steps. During the annealing step single-stranded molecules associate with complementary molecules from any of the parent DNAs present to create novel sequence combinations.

Stemmer Protocol Using DNase I Fragmentation
1. Prepare DNA templates. Templates can be plasmids carrying target sequences, sequences excised by restriction endonucleases, or sequences amplified by PCR. Prepare 2-5 µg of DNA templates mixed in equal proportions in a volume not exceeding 44 µl.
2. Add 2.5 µl of 1 M Tris-HCl (pH 7.5) and 2.5 µl of 200 mM MnCl2, and bring the volume to 49 µl with deionized water. Equilibrate the mixture at 15° for 5 min.
3. Add 1 µl of DNase I (10 U/µl; Boehringer Mannheim, Indianapolis, IN) freshly diluted 1:100 in deionized water and perform digestion at 15°. Take 10-µl aliquots after 30 sec and after 1, 2, 3, and 5 min of incubation and immediately mix them with 5 µl of ice-cold stop buffer containing 50 mM EDTA and 30% (v/v) glycerol.
4. Separate the fragments by electrophoresis in a 2% (w/v) agarose gel. Figure 1 shows an example of a DNase I digest separated on a gel. It is important not to use standard loading buffers with their high concentration of tracking dyes, as dyes can mask fluorescence from DNA fragments. The stop buffer already contains glycerol and does not require additional loading buffer; however, a small amount of bromphenol blue can be added to it to simplify loading samples on the gel.
5. Cut DNA fragments in the desired size range from the gel and extract by an appropriate elution protocol.
6. Combine 10 µl of purified fragments, 5 µl of 10× Pfu buffer, 5 µl of 10× dNTP mix (each dNTP 2 mM), and 0.5 µl of Pfu polymerase (Stratagene, La Jolla, CA) in a total volume of 50 µl.
7. Run the assembly reaction, using the following thermocycler program: 3 min at 94° followed by 40 cycles of 30 sec at 94°, 1 min at 55°, and 1 min + 5 sec/cycle at 72°. The number of cycles depends on the fragment size. Assembly from small fragments may require more cycles than assembly from large fragments. Small fragments also may require a lower annealing temperature, at least during the initial cycles. Extension time depends on the gene size and should be adjusted accordingly for genes larger than 1-2 kb.
8. Amplify recombinant genes in a standard PCR, using serial dilutions of the assembly reaction (1 µl each of undiluted reaction, 1:10 dilution, and 1:50 dilution).
9. Run a small aliquot of the amplified products on an agarose gel to determine the yield and quality of amplification. If amplification produces a smear with low yield of full-length sequence, reamplify these products with nested primers separated from the previously used primers by 50-100 bp. Run a small aliquot on an agarose gel.
10. Select the reaction with high yield and low amount of nonspecific products. Purify the reaction products, digest with appropriate restriction endonucleases, and ligate into the cloning vector.

FIG. 1. Gel electrophoresis of DNase I digestion products. Lanes 1-5 are aliquots taken 30 sec and 1, 2, 3, and 5 min after the beginning of the reaction. M, Molecular weight marker, 100-bp ladder (Life Technologies, Bethesda, MD).

1 W. P. C. Stemmer, Nature (London) 370, 389 (1994).
2 H. Zhao, L. Giver, Z. Shao, J. A. Affholter, and F. H. Arnold, Nature Biotechnol. 16, 258 (1998).
3 M. S. B. Judo, A. B. Wedel, and C. Wilson, Nucleic Acids Res. 26, 1819 (1998).
4 Z. Shao, H. Zhao, L. Giver, and F. H. Arnold, Nucleic Acids Res. 26, 681 (1998).

METHODS IN ENZYMOLOGY, VOL. 328
Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved. 0076-6879/00 $30.00

Notes
1. DNA fragmentation with DNase I requires the presence of either Mg²⁺ or Mn²⁺ ions. The original version of the DNA shuffling protocol1 recommended using MgCl2. Experience indicates that MnCl2 may be more suitable to maintain low point mutagenesis rates.5,6 Mutation rate can also be reduced by using high-fidelity polymerases, Pfu or Vent, instead of Taq polymerase.6
2. Manganese and magnesium ions affect DNase I digestion differently: Mg²⁺ stimulates formation of single-stranded cuts, while Mn²⁺ stimulates cleavage of both strands.7,8 Both metal ions promote random fragmentation, but some Mg²⁺-generated nicks may remain undetected by agarose gel electrophoresis in nondenaturing conditions and lead to overestimation of the real fragment sizes.
3. Another important feature of this protocol is the use of EDTA rather than thermal inactivation to stop the DNase digestion. The presence of either Mg²⁺ or Mn²⁺ makes DNase I thermostable.8-10 A high temperature does eventually inactivate the enzyme, but it remains active up to about 60°, continuing digestion with increasing speed. This high thermostability of DNase I is not by itself a problem, as long as the same inactivation protocol is used in all experiments. Switching from one inactivation method to another may require adjustment of DNase concentration or incubation time. EDTA inactivation is recommended because it is technically simpler and more reproducible.
4. Gel electrophoresis of DNase products is not absolutely necessary for successful shuffling. It is possible to adjust reaction conditions and stop the reaction at a selected time. This approach may work well for some templates, but in some cases, even after digestion for extended periods of time, there is a significant amount of full-length template remaining in the reaction. Using the unfractionated mixture for the assembly reaction would generate an unacceptably large fraction of parental, nonrecombinant molecules. Other advantages of gel separation are visual control of the reaction and the ability to collect only the fragments in the desired size range.

5 I. A. J. Lorimer and I. Pastan, Nucleic Acids Res. 23, 3067 (1995).
6 H. Zhao and F. H. Arnold, Nucleic Acids Res. 25, 1307 (1997).
7 V. W. Campbell and D. A. Jackson, J. Biol. Chem. 255, 3726 (1980).
8 H.-M. Eun, "Enzymology Primer for Recombinant DNA Technology." Academic Press, San Diego, California, 1996.
9 S. W. Bickler, M. C. Heinrich, and G. C. Bagby, BioTechniques 13, 64 (1992).
10 F. M. Pohl, R. Thomae, and A. Karst, Eur. J. Biochem. 123, 141 (1982).
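The digestion and stop conditions in steps 2 and 3 of the Stemmer protocol can be checked with the dilution rule C1V1 = C2V2. The following is a minimal sketch, not part of the original protocol: the helper name final_conc and the variable names are hypothetical, and the calculation ignores any metal ions or chelators carried over with the DNA sample. Under those assumptions it suggests that the stop buffer supplies EDTA in molar excess over the Mn²⁺ present in each aliquot, which is the basis for the EDTA inactivation described in note 3.

```python
# Hypothetical helper and variable names; volumes and stock concentrations
# are taken from steps 2 and 3 of the Stemmer protocol.

def final_conc(stock_conc, stock_vol_ul, total_vol_ul):
    """Dilution rule C1*V1 = C2*V2, solved for the final concentration C2."""
    return stock_conc * stock_vol_ul / total_vol_ul

TOTAL_UL = 50.0  # 49 µl of mixture plus 1 µl of diluted DNase I

tris_mM = final_conc(1000.0, 2.5, TOTAL_UL)  # 1 M Tris-HCl stock
mn_mM = final_conc(200.0, 2.5, TOTAL_UL)     # 200 mM MnCl2 stock
dnase_units = (10.0 / 100.0) * 1.0           # 10 U/µl diluted 1:100, 1 µl added

# Stopped aliquot: 10 µl of digest mixed with 5 µl of 50 mM EDTA stop buffer.
# Concentration in mM times volume in µl gives an amount in nmol.
mn_nmol = mn_mM * 10.0
edta_nmol = 50.0 * 5.0
edta_to_mn = edta_nmol / mn_nmol

print(f"Tris-HCl: {tris_mM:.1f} mM, MnCl2: {mn_mM:.1f} mM")
print(f"DNase I in the reaction: {dnase_units:.2f} U")
print(f"EDTA:Mn molar ratio in a stopped aliquot: {edta_to_mn:.1f}")
```

If the DNA preparation itself contains Mg²⁺ or EDTA, the ratio will shift accordingly and should be recalculated.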
Staggered Extension Process
Staggered extension process (StEP) recombination is based on template switching during polymerase-catalyzed primer extension. The abbreviated denaturation and annealing cycles limit the primer extension in a single cycle. Extension interrupted by denaturation resumes during the next annealing step, where the partially extended primers can anneal to different parent sequences present in the reaction. Multiple cycles of partial extension then create a library of chimeric sequences.2 With no template digestion or fragment reassembly, this protocol is simple.
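The template-switching idea can be illustrated with a toy simulation. This is only a sketch under strong assumptions and not a model used by the authors: parents are aligned and of equal length, the growing strand anneals to a randomly chosen parent in every cycle, and the number of nucleotides added per cycle is drawn from an exponential distribution whose mean (mean_extension) is a free parameter. The function names step_chimera and count_crossovers are hypothetical.

```python
import random

def step_chimera(parents, mean_extension=60, rng=None):
    """Build one chimeric strand by repeated partial extension.

    parents: equal-length, aligned parent sequences.
    mean_extension: assumed average number of nucleotides added per cycle
        (a free parameter of this toy model, not a measured value).
    Returns (chimera, origin, cycles); origin[i] is the index of the parent
    that templated position i.
    """
    rng = rng or random.Random()
    length = len(parents[0])
    chimera, origin = [], []
    cycles = 0
    while len(chimera) < length:
        template = rng.randrange(len(parents))  # anneal to any parent
        step = max(1, round(rng.expovariate(1.0 / mean_extension)))
        start = len(chimera)
        stop = min(length, start + step)
        chimera.extend(parents[template][start:stop])
        origin.extend([template] * (stop - start))
        cycles += 1
    return "".join(chimera), origin, cycles

def count_crossovers(origin):
    """Count positions where the templating parent changes."""
    return sum(1 for a, b in zip(origin, origin[1:]) if a != b)

# Two marker "genes" of 300 nt make the mosaic pattern easy to read.
rng = random.Random(0)
parents = ["A" * 300, "B" * 300]
chimera, origin, cycles = step_chimera(parents, mean_extension=60, rng=rng)
print(cycles, "cycles,", count_crossovers(origin), "crossovers")
```

In this sketch a shorter mean extension per cycle gives more cycles before the strand reaches full length and therefore more opportunities to switch templates.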
Staggered Extension Process Protocol
1. Prepare DNA templates. Templates can be plasmids carrying target sequences, sequences excised by restriction endonucleases, or sequences amplified by PCR.
2. Combine 5 µl of 10× Taq buffer, 5 µl of 10× dNTP mix (each dNTP 2 mM), 1-20 ng of each template DNA, 30-50 pmol of each primer, and 0.5 µl of Taq polymerase in a total volume of 50 µl.
3. Run 80-100 extension cycles: 94° for 30 sec and 55° for 5-15 sec.
4. Run a small aliquot of the reaction on an agarose gel. Possible reaction products are full-length amplified sequence, a smear, or a combination of both.
5. If plasmids were purified from a dam methylation-positive strain (DH5α, XL1-Blue), the extension reaction can be incubated with DpnI endonuclease to remove parent DNA and decrease the background of nonrecombinant clones. Combine 2 µl of the extension reaction, 1 µl of DpnI reaction buffer, 6 µl of H2O, and 1 µl of DpnI restriction endonuclease (5-10 U/µl). Incubate at 37° for 1 hr.
6. Amplify the target sequence in a standard PCR, using serial dilutions of the previous reaction (1 µl each of undiluted reaction, 1:10 dilution, and 1:50 dilution).
7. Run a small aliquot of the amplified products on an agarose gel to determine the yield and quality of amplification. If amplification produces a smear with a low yield of full-length sequence, reamplify these products with nested primers separated from the previously used primers by 50-100 bp. Run a small aliquot on an agarose gel.
8. Select the reaction with high yield and a low amount of nonspecific products. Purify the reaction products, digest with appropriate restriction endonucleases, and ligate into the cloning vector.

Notes
1. Appearance of the extension products in step 4 may depend on the specific sequences recombined or the type of template used. Using whole plasmids in StEP recombination may result in nonspecific annealing of primers and their extension products all over the vector sequence, which would appear as a smear on the gel. As the number of cycles increases these products may be extended further, and the smear would shift up on the gel. A similar effect may be observed for large targets, even in the absence of any vector sequences. Small genes prepared by PCR amplification or endonuclease digestion are most likely to show gradual accumulation of the full-length product with increasing number of cycles.
2. DNA polymerases currently used in DNA amplification are fast enzymes. Even brief cycles of denaturation and annealing provide time for these enzymes to extend primers for hundreds of nucleotides. Therefore, it is not unusual for the full-length product to appear after only 10-15 cycles.
3. The faster the full-length product appears in the extension reaction, the fewer the template switches that have occurred and the lower the recombination frequency. Everything possible should be done to minimize time spent in each cycle: selecting a faster thermocycler, using smaller test tubes with thin walls, and, if necessary, reducing the reaction volume. Polymerases are not all equally fast. The proofreading activity of Pfu and Vent polymerases slows them down, offering another way to increase recombination frequency.3 Polymerases with proofreading activity are also recommended during the amplification step to keep the mutagenic rate to a minimum.
4. As a general rule, annealing temperature should be decreased when higher recombination frequency is required or when templates have low GC content. Genes with GC pairs unevenly distributed along the gene length may present significant problems due to nonspecific annealing. These sequences should be amplified by PCR or excised from their cloning vectors prior to recombination to minimize the amount of nonspecific DNA (vector) present in the reaction.
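The reasoning in notes 2 and 3 can be made concrete with simple arithmetic. The sketch below is illustrative only: the 1000-nt gene length is a hypothetical value, the function name step_bounds is an assumption, and the expected-crossover figure assumes that every cycle involves re-annealing to a parent chosen uniformly at random. A strand that reaches full length in n cycles has had at most n - 1 chances to change template, so early appearance of full-length product caps the number of crossovers.

```python
def step_bounds(gene_length_nt, cycles_to_full_length, n_parents=2):
    """Rough limits for a strand completed in a given number of cycles.

    Returns (average extension per cycle in nt,
             maximum number of template switches,
             expected crossovers if each re-annealing picks a parent
             uniformly at random, which is an assumption of this sketch).
    """
    avg_extension = gene_length_nt / cycles_to_full_length
    max_switches = cycles_to_full_length - 1
    expected_crossovers = max_switches * (1 - 1 / n_parents)
    return avg_extension, max_switches, expected_crossovers

GENE_NT = 1000  # hypothetical gene length
for cycles in (10, 15, 80):
    avg, switches, crossovers = step_bounds(GENE_NT, cycles)
    print(f"{cycles} cycles: {avg:.1f} nt/cycle, "
          f"<= {switches} switches, ~{crossovers:.1f} crossovers at most")
```

These numbers are upper bounds; real extension lengths vary from cycle to cycle and from strand to strand.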
Random-Priming Recombination
Random-priming recombination uses extension of random primers to generate fragments for reassembly.4 Random primers are annealed to the template DNAs and are extended by a DNA polymerase at room temperature or below. The low temperature provides enough stabilization of the annealed primers to allow the use of random hexamers. Hexanucleotides are long enough to form stable duplexes with template DNA and short enough to ensure random annealing. Although longer primers can also be used, annealing may not remain random for short genes.
Random-Priming Protocol
1. Prepare DNA templates. Templates can be plasmids carrying target sequences, sequences excised by restriction endonucleases, or sequences amplified by PCR.
2. Combine 0.2-0.5 pmol of each template DNA and 7 nmol of dp(N)6 random primers (Pharmacia Biotech, Piscataway, NJ) in a total volume of 65 µl.
3. Incubate for 5 min at 100° and transfer on ice.
4. Add 10 µl of 20 mM dithiothreitol (DTT), 10 µl of 10× buffer [0.9 M HEPES (pH 6.6), 0.1 M MgCl2], 10 µl of dNTP mix (5 mM each), and 5 µl of Klenow fragment (2 U/µl).
5. Incubate for 3-6 hr at 22°.
6. Run a small aliquot of the reaction on an agarose gel. A faint, low molecular weight smear should be visible.
7. Add 100 µl of deionized water and purify extension products from the template by passing the reaction mixture through a Microcon-100 filter (Amicon, Beverly, MA) at 500g for 10-15 min at 25°. Do not exceed the recommended centrifugation speed, as a significant amount of template may pass through the filter.
8. Concentrate the flowthrough fraction on a Microcon-3 or -10 filter at 14,000g for 30 min at 25° to remove primers and small fragments.
9. Recover the retentate fraction and continue with the assembly reaction (steps 6-10 of the Stemmer protocol).
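Step 2 specifies template in picomoles, whereas DNA is usually quantified by mass. The following sketch converts between the two and reports the primer-to-template molar ratio. It is not part of the original protocol: it assumes double-stranded template with an average mass of about 650 g/mol per base pair (a common approximation), the 1000-bp length is a hypothetical example, and the function names are assumptions. For a plasmid template, the length of the whole plasmid should be used.

```python
AVG_BP_MASS = 650.0  # g/mol per base pair of dsDNA; an approximation

def pmol_to_ng(pmol, length_bp):
    """Mass of double-stranded DNA for a given molar amount.

    pmol * (g/mol) gives picograms; dividing by 1000 gives nanograms.
    """
    return pmol * length_bp * AVG_BP_MASS / 1000.0

def primer_to_template_ratio(primer_nmol, template_pmol):
    """Molar ratio of random primers to one template."""
    return primer_nmol * 1000.0 / template_pmol

TEMPLATE_BP = 1000  # hypothetical template length
for pmol in (0.2, 0.5):
    ng = pmol_to_ng(pmol, TEMPLATE_BP)
    ratio = primer_to_template_ratio(7.0, pmol)
    print(f"{pmol} pmol of a {TEMPLATE_BP}-bp template: about {ng:.0f} ng, "
          f"primer:template about {ratio:.0f}:1")
```

Because the protocol calls for 0.2-0.5 pmol of each template, the ratio to total template is lower when several parents are combined.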
Notes
1. Random primer extension is a versatile method for generating random fragments for recombination. Unlike DNase I fragmentation in Stemmer shuffling, this method does not require double-stranded DNA and can be used on both double-stranded and single-stranded substrates. Moreover, random primers can also be used with RNA substrates. In fact, because the stability of RNA-DNA duplexes is higher than that of DNA-DNA duplexes, RNA templates have an advantage over DNA templates that may allow extension at higher temperatures.
2. Another important feature of this method is the ability to use any DNA polymerase. Because the primer extension step does not involve thermal cycling, polymerase choice is not limited to thermophilic enzymes. This feature gives more options for selecting the best polymerase.
Applications of in Vitro Recombination
The original Stemmer protocol, first introduced in 1994, has been successfully applied to the directed evolution of a large number of proteins. For example, protein folding and solubility were improved for the green fluorescent protein from Aequorea victoria11 and single-chain antibody fragments (scFv) produced in Escherichia coli.12 Enzyme substrate specificity has been changed: for example, a galactosidase was converted to a fucosidase,13 while the substrate specificity of a biphenyl dioxygenase was modified and extended.14 Enzyme thermostability has been improved.15,16 Enzymatic activity of cephalosporinases was dramatically increased by shuffling homologous naturally occurring genes.17 Sequencing of the evolved genes has proved the ability of this method to recombine closely spaced mutations and create highly mosaic genes. A relatively new DNA shuffling method, StEP recombination, has been used to improve the catalytic activity and thermostability of subtilisin.2,18 In these studies, mutations as close as 34 bp were recombined. The random-priming method has been successfully used to recombine mutants of subtilisin E.18 Mutations separated by as few as 12 bp were recombined. An efficient, hybrid in vitro-in vivo recombination method is described in the next chapter.19

11 A. C. Crameri, E. A. Whitehorn, E. Tate, and W. P. C. Stemmer, Nature Biotechnol. 15, 315 (1996).
12 K. Proba, A. Worn, A. Honegger, and A. Pluckthun, J. Mol. Biol. 275, 245 (1998).
13 J.-H. Zhang, G. Dawes, and W. P. C. Stemmer, Proc. Natl. Acad. Sci. U.S.A. 94, 4504 (1997).
14 T. Kumamaru, H. Suenaga, M. Mitsuoka, T. Watanabe, and K. Furukawa, Nature Biotechnol. 16, 663 (1998).
15 F. Buchholz, P.-O. Angrand, and A. F. Steward, Nature Biotechnol. 16, 657 (1998).
16 L. Giver, A. Gershenson, P.-O. Freskgard, and F. H. Arnold, Proc. Natl. Acad. Sci. U.S.A. 95, 12809 (1998).
17 A. Crameri, S.-A. Raillard, E. Bermudez, and W. P. C. Stemmer, Nature (London) 391, 288 (1998).
18 H. Zhao and F. H. Arnold, Protein Eng. 12, 47 (1999).
19 A. A. Volkov, Z. Shao, and F. H. Arnold, Methods Enzymol. 328, Chap. 27, 2000 (this volume).
Recombination Results
All three PCR-based methods presented can create libraries of recombined sequences. Each method has its own advantages and disadvantages, and their relative performance will probably differ for different templates. We compared the three methods for their ability to recombine the truncated green fluorescent protein (GFP) genes in the recombination test system described in the next chapter.19 Table I presents the results of that comparison. In these experiments, the Stemmer protocol using DNase I fragmentation and StEP show the highest recombination efficiency. With DNase fragmentation, using smaller fragments (<100 bp) generates a slightly higher efficiency than larger fragments (100-200 bp).

TABLE I
FLUORESCENT Escherichia coli COLONIES OBTAINED BY RECOMBINING TWO GREEN FLUORESCENT PROTEIN TEMPLATES CONTAINING MUTATIONS AT THE INDICATED SITESᵃ

Fraction of fluorescent colonies (%)

Distance between    Stemmer protocol,    Stemmer protocol,           StEP             Random-priming
mutations (bp)      <100-bp fragments    100- to 200-bp fragments    recombination    recombination
423                 20.5                 19.2                        18.5             5.0
315                 14.5                 9.7                         13.1             4.8
207                 11.5                 8.3                         9.8              3.1
99                  9.6                  8.4                         8.2              1.2
24                  5.8                  5.1                         4.8              0.9
99 + 99             6.1                  3.3                         1.8              0.2

ᵃ See [27] in this volume19 for detailed explanation of the templates. Generation of fluorescence requires recombination between the sites to restore the wild-type sequence. The last row shows the results of recombining one single and one double mutant with 99-bp distances between each mutation.

DNA recombination may differ from sequence to sequence and be affected by base composition or other specific features of the sequence. For example, secondary structure formation in single-stranded DNA may adversely affect the performance of all the methods. However, the extent of this effect is probably not equal for all of them. The assembly step in the Stemmer protocol and random-priming recombination can be affected by secondary structure. StEP recombination, with its short annealing times, may be even more sensitive to secondary structure. Random priming can also be affected by secondary structure at the most crucial step of the reaction, annealing and extension of the random primers. Small primers will be more sensitive to secondary structure than longer primers at this annealing step. The low temperature used in the extension of the random primers also stabilizes secondary structure. Increasing extension temperature would force the use of longer primers, which would probably lead to less efficient recombination, at least for short genes. Termination of elongation of both short and long primers is sensitive to secondary structure. Formation of stable stem-loop structures may be the most important factor causing nonrandom distribution of extended fragments in random-priming recombination.

An important parameter for determining the utility of any given method is the average number of recombination events per gene (crossover frequency) that can be achieved. The Stemmer protocol probably offers the best possible crossover frequency. The number of recombination events per gene is inversely proportional to the fragment size. Preparing shorter fragments increases recombination frequency, as demonstrated in Table I, although the effect is not large. Small genes, however, will have limited capacity for such an improvement, because small fragments will be inefficiently reassembled. In random priming, the fragment sizes can be controlled by adjusting the primer-to-template molar ratio. High primer concentration limits extension of all primers.20 There is one more yet-unexplored opportunity for size control during primer extension: shortening the incubation time may significantly decrease fragment size. Optimal incubation times will have to be determined experimentally, and they will vary from gene to gene.

DNA is not the only nucleic acid that can be used in recombination experiments. Certainly cDNA copies can always be synthesized from RNA templates, but the random-priming and StEP recombination protocols can potentially use RNA directly.

Detailed information on recombinants is, of course, provided by DNA sequencing. Restriction digests are also useful, provided restriction sites change on recombination. Screening the functional properties of some number of clones may also give useful information about recombination and about associated point mutation rates. Figure 2 shows the results of a thermostability assay carried out on a library of recombined thermostable subtilisin E mutants.21 The data are sorted and plotted in descending order. When the mutations are additive, one can clearly distinguish the recombinant clones from the parental ones. The relative fraction of highly thermostable recombinants indicates the efficiency of the recombination process (in this example, perfect recombination with no point mutation would yield 25% highly thermostable clones). Point mutation rates can be deduced from the fraction of inactive clones.6

[Figure 2: plot of thermostability against clones (0-400), with plateaus labeled N181D + N218S-type, N181D- or N218S-type, and wild type.]
FIG. 2. Results of screening a library of recombined thermostable subtilisin E mutants.21 Data are sorted and plotted in descending order of thermostability (residual/initial activity). Arrows indicate thermostability of wild type, parents N181D and N218S, and recombinant N181D + N218S. The plateau regions corresponding to the parent and recombined sequences are characteristic of successful recombination of a limited number of improved mutants.
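The 25% figure quoted for the subtilisin E library follows from simple enumeration. The sketch below is illustrative and rests on stated assumptions: the two sites recombine freely, each site is copied from either parent with equal probability, and no point mutations arise. Parent A carries N181D and parent B carries N218S; the names MUTANT_SOURCE and classify are hypothetical.

```python
from fractions import Fraction
from itertools import product

# Which parent carries the mutant allele at each site.
MUTANT_SOURCE = {"N181D": "A", "N218S": "B"}

def classify(sources):
    """sources maps each site to the parent ('A' or 'B') it was copied from."""
    mutated = [site for site, parent in sources.items()
               if MUTANT_SOURCE[site] == parent]
    return " + ".join(mutated) if mutated else "wild type"

sites = list(MUTANT_SOURCE)
combos = list(product("AB", repeat=len(sites)))  # all equally likely here

fractions = {}
for combo in combos:
    label = classify(dict(zip(sites, combo)))
    fractions[label] = fractions.get(label, 0) + Fraction(1, len(combos))

for label, frac in fractions.items():
    print(f"{label}: {float(frac):.0%}")
```

Under these assumptions one quarter of the clones carry both mutations, one quarter are wild type, and half resemble one of the parents, which matches the three plateaus described for Fig. 2. An observed double-mutant fraction below one quarter would point to incomplete recombination.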
20 C. P. Hodson and R. Z. Fisk, Nucleic Acids Res. 15, 6295 (1987).
21 H. M. Zhao and F. H. Arnold, Proc. Natl. Acad. Sci. U.S.A. 94, 7997 (1997).
[27] Random Chimeragenesis by Heteroduplex Recombination
By ALEXANDER A. VOLKOV, ZHIXIN SHAO, and FRANCES H. ARNOLD
Introduction
DNA recombination is an important tool for directed evolution of proteins and nucleic acids. Genetic variations existing in nature or created in the laboratory can be recombined to generate libraries of molecules containing novel combinations of sequence information from any or all of the parent sequences. By combining beneficial mutations and removing deleterious ones, recombination may help to accelerate the evolution of single molecules toward a specified function. Novel chimeric sequences