NUCLEIC ACID LIBRARIES
Such conditions not only increase the opportunity for the deletion/poly(A) variants to increase in the population but also increase the number of random mutations that are generated by reverse transcriptase and DNA polymerase. The selection scheme is currently being repeated using more stringent aminoacylation conditions and with the additional step of gel purifying the PCR products. These modifications should obviate the need for the in vivo screen and should enrich the library for SRS-active species in fewer rounds than in the work described in this chapter. This will enable us to answer our primary question of whether unique combinations of acceptor stem nucleotides that can functionally mimic the E. coli serine tRNA sequence exist.
Acknowledgments
We are very grateful to John Abelson, in whose laboratory this work was performed, for his insights and encouragement on this project. We also thank Olke Uhlenbeck for helpful discussions and Andrea Ghetti for critically reading the manuscript. This project was supported by NIH Grant GM 48560.
[23] In Vitro Evolution of Randomized Ribozymes
By JOYCE TSANG and GERALD F. JOYCE
METHODS IN ENZYMOLOGY, VOL. 267. Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.

Darwinian evolution can be performed in vitro to obtain ribozymes with optimized function or novel catalytic abilities. Molecules with the desired characteristics are selected from a pool of randomized sequences, amplified, and mutated. This process is repeated until the ribozymes of interest are obtained. The key difference between in vitro evolution and in vitro selection is that the former involves the repeated introduction of new mutations. Thus, with in vitro evolution, even if the desired ribozyme does not exist in the initial pool, suboptimal variants can be selected and, over the course of evolution, gain new mutations that result in the behavior of interest. As a consequence, the number of possible sequences that can be surveyed is greatly increased compared to what can be represented in an initial pool.

There are two types of initial pools. One is derived from a preexisting ribozyme sequence and the other is composed of completely random sequences. The former allows for a thorough search of the sequences that are closely related to a prototype and is particularly useful when optimizing
or subtly altering the function of an existing ribozyme. The latter allows for a broader but less comprehensive search of RNA sequences, which may be necessary when attempting to develop ribozymes with entirely new catalytic abilities.

This chapter discusses in vitro evolution techniques as applied to an initial pool that is derived from a preexisting ribozyme sequence. We describe basic methods for creating a pool of ribozyme variants, selecting molecules with the desired properties, and amplifying and mutating the chosen sequences to generate a new population of variants. Strategies for performing randomization, selection, and mutagenesis of an ensemble of RNAs, as well as practical considerations for carrying out a successful in vitro evolution experiment, are discussed.

Creating Pools of Randomized Ribozymes

There are two basic strategies for creating a pool of randomized ribozymes. One strategy employs degenerate, synthetic oligodeoxynucleotides, which can be either directly transcribed to RNA or incorporated into a DNA template that encodes the ribozyme. Such degenerate oligonucleotides can be produced on an automated DNA synthesizer using nucleoside 3'-phosphoramidite solutions that have been doped with a small percentage of the three incorrect monomers,¹ correcting for the differential coupling efficiency of the various phosphoramidites. Another strategy employs an error-prone polymerase to introduce mutations into the ribozyme-encoding DNA template. One such method, mutagenic polymerase chain reaction (PCR), exploits the highly error-prone activity of Taq DNA polymerase.

Synthetic Oligonucleotides

Direct Transcription. Our standard protocol for transcribing synthetic oligodeoxynucleotides to RNA is adapted from a procedure developed by Uhlenbeck and colleagues, employing T7 RNA polymerase.²
¹ J. D. Hermes, S. C. Blacklow, and J. R. Knowles, Proc. Natl. Acad. Sci. U.S.A. 87, 696 (1990).
² J. F. Milligan, D. R. Groebe, G. W. Witherell, and O. C. Uhlenbeck, Nucleic Acids Res. 15, 8783 (1987).

For the synthetic oligonucleotide to serve as a template for transcription, its 3' end must contain one strand of the T7 promoter, ATTATGCTGAGTGATAT, followed by CC (T7 RNA polymerase initiates transcription with GG). This template oligonucleotide is hybridized to a synthetic 18-mer oligonucleotide, ATATCACTCAGCATAATGG, that contains the other strand of the T7 promoter, enabling recognition by T7 RNA polymerase. The template and 18-mer oligonucleotides are mixed in a 1:2 ratio and incubated
at 65° for 5 min. This solution is allowed to cool to 50°, a 50-μl volume of 10× transcription buffer containing 150 mM MgCl₂, 20 mM spermidine, 500 mM Tris (pH 7.5), and 50 mM dithiothreitol (DTT) is added, and the mixture is held at 25° for 5 min. Transcription occurs in a 500-μl volume containing ~4 μM annealed template, 15 mM MgCl₂, 2 mM spermidine, 50 mM Tris (pH 7.5), 5 mM DTT, 2 mM of each NTP, 0.005 U/μl inorganic pyrophosphatase, and 25 U/μl T7 RNA polymerase, incubated at 37° for 2 hr. All reagents should be added at (or above) room temperature to prevent precipitation of the DNA. Full-length transcription products should be purified, either by polyacrylamide gel electrophoresis or high-performance liquid chromatography (HPLC).

At present, the maximum length of DNA that can be prepared synthetically is about 100-120 residues, which limits the maximum length of the resulting transcript to about 80-100 nucleotides. If longer transcripts are required, then it is necessary to prepare two or more synthetic DNAs that can be stitched together. This can be done by preparing oligonucleotides with overlapping sequences, which are then annealed and filled in with a DNA-dependent DNA polymerase. A disadvantage of this method is that the regions of overlap cannot be randomized, resulting in templates that are interrupted by stretches of 10-20 nucleotides of fixed sequence. Alternatively, the individual oligonucleotides can be made double-stranded by a few rounds of PCR amplification, cut at their ends with an appropriate restriction enzyme, and then ligated together. This reduces the number of intervening, fixed nucleotides to 4-6.

Template-Directed Mutagenesis. Template-directed mutagenesis involves the hybridization of one or more mutagenic oligonucleotides to a single-stranded DNA that is of the opposite orientation relative to the template strand, followed by completion of a new template strand using T4 DNA polymerase and T4 DNA ligase.
The resulting partial duplex structure is then transcribed directly with T7 RNA polymerase to provide a pool of mutagenized RNAs.³ The transcribed RNAs are complementary to whatever sequence has been incorporated into the template strand. This technique is especially useful for segmental randomization within a long RNA. It can be initiated with either plasmid DNA or double-stranded PCR products that encode the RNA of interest.

³ G. F. Joyce and T. Inoue, Nucleic Acids Res. 17, 711 (1989).

One method of preparing the nontemplate strand involves exonucleolytic digestion of the template strand. Beginning with plasmid DNA that contains a T7 promoter and the ribozyme-encoding gene, single-stranded DNA of the appropriate orientation is prepared by cutting at a unique restriction site that lies downstream from the target gene, then digesting with T7 gene 6 exonuclease (a nonprocessive 5' → 3' exonuclease) to
remove the template strand. We typically digest 100 μg of linearized ribozyme-encoding plasmid DNA with 500 units of gene 6 exonuclease at 37° for 1 hr in a 500-μl volume containing 5 mM MgCl₂, 20 mM KCl, 50 mM Tris (pH 8.0), and 1 mM DTT. The exonuclease is removed by phenol extraction, and the DNA is purified by ethanol precipitation.

Another way of preparing the nontemplate strand is to amplify the ribozyme-encoding gene in a PCR reaction using a standard primer for the nontemplate strand and a biotinylated primer for the template strand. The amplified material is then bound to a streptavidin matrix (most conveniently, a streptavidin-containing pipette tip) and is washed with 0.2 N NaOH to denature and elute the nonbiotinylated, nontemplate strand. The eluate is neutralized with an equal volume of 0.2 N HCl, and the nontemplate strand is concentrated and purified by ethanol precipitation.

In constructing the randomized template, two types of oligodeoxynucleotides are annealed to the nontemplate strand: the "terminator oligonucleotide," which is complementary to a site near the 3' end of the target gene and defines the 3' end of the transcript, and the "mutator oligonucleotides," which introduce defined or random mutations. The first and last nucleotides of the mutator oligonucleotides are not randomized in order to facilitate their incorporation into newly synthesized template molecules. The mutator oligonucleotides must be 5'-phosphorylated so that they can serve as a substrate for T4 DNA ligase. The terminator and mutator oligonucleotides are added in fivefold excess over template to an annealing reaction containing ~3 nM template, 2 mM MgCl₂, 20 mM Tris (pH 7.5), and 2 mM DTT, which is incubated at 70° for 5 min, then slow cooled to 30° over 45 min.
The gaps between the oligonucleotides are filled in and closed in a reaction mixture containing the annealed DNA, 0.5 mM of each dNTP, 1 mM ATP, 5 mM MgCl₂, 10 mM Tris (pH 7.5), 2 mM DTT, 5 U/μl T4 DNA polymerase, and 10 U/μl T4 DNA ligase. T4 DNA polymerase is much preferred over other DNA-dependent DNA polymerases because of its lack of strand displacement activity. We find that incubation at 0° for 5 min, then 25° for 5 min, and then 37° for 90 min facilitates the synthesis of full-length templates. The polymerase and ligase are removed by phenol extraction and the reconstituted template is purified by ethanol precipitation. Finally, RNA is transcribed from the randomized template, as described earlier. Starting with 6 pmol of single-stranded DNA, one can expect to obtain 100-200 pmol of purified, mutagenized RNA.

⁴ R. C. Cadwell and G. F. Joyce, PCR Methods Appl. 2, 28 (1992).
⁵ R. C. Cadwell and G. F. Joyce, PCR Methods Appl. 3, S136 (1994).

Mutagenic PCR. As described in detail by Cadwell and Joyce,⁴,⁵ the polymerase chain reaction can be carried out under mutagenic conditions
that further decrease the already low fidelity of Taq DNA polymerase, enabling introduction of random point mutations within a ribozyme-encoding gene. As part of this procedure, the concentrations of the four deoxynucleotide triphosphates are adjusted to ensure that the various types of base substitutions are produced with roughly equal probability. Our standard protocol for mutagenic PCR is as follows: 20 fmol of cDNA, produced by reverse transcription of the target ribozyme, is added to a 100-μl reaction mixture containing 7 mM MgCl₂, 50 mM KCl, 10 mM Tris (pH 8.3), 0.012% (w/v) gelatin, 0.2 mM dGTP, 0.2 mM dATP, 1 mM dCTP, 1 mM dTTP, 30 pmol of each primer, 0.5 mM MnCl₂, and 1 unit Taq polymerase, which is incubated for 30 cycles of 92° for 1 min, 45° for 1 min, and 72° for 1 min. MnCl₂ must be added next to last, just prior to Taq polymerase, to prevent precipitation. Over the 30 temperature cycles only about 10 doublings occur, converting 20 fmol of input cDNA to about 20 pmol of double-stranded DNA product, with a mutation rate of 0.66% (±0.13%, 95% confidence interval) per nucleotide position. The upstream primer must contain the T7 promoter sequence so that the resulting mutagenized DNA can be transcribed directly, as described earlier.

A higher mutation rate can be obtained by carrying out a second mutagenic PCR, seeded with 20 fmol of the products from a first mutagenic PCR. The overall mutation rate thus becomes about 1.3% per nucleotide position. Because the amount of material used to seed the PCR is only 20 fmol (~10¹⁰ molecules), there is a risk that successive PCR amplification reactions will result in a loss of sequence diversity (i.e., rare sequences in the population may not be carried over into the next PCR). This is especially a concern if one attempts to carry out more than two successive mutagenic PCRs.
One way to partially offset this loss is to mix the various mutated pools (e.g., mix equal portions of the 1.3, 1.9, and 2.7% mutated pools). It may be necessary to purify the input DNA prior to seeding the next mutagenic PCR, either by isolating the material from an agarose gel or by synthesizing cDNA from RNA that has been transcribed from the PCR products. It is possible to carry out mutagenic PCR for fewer than 30 cycles, for less than 10 doublings, or with higher concentrations of input cDNA to diminish the loss of sequence diversity and/or prevent mispriming events that tend to occur after repeated rounds of mutagenic amplification. However, such manipulations are likely to result in biased mutations and an altered mutation rate compared to what is obtained with the standardized protocol.
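The arithmetic behind the mutagenic PCR figures quoted above (seed size in molecules, number of doublings, and the cumulative mutation rate after serial mutagenic PCRs) can be checked with a short calculation. The following Python sketch is an illustration only and is not part of the published protocol; the function names are hypothetical, and the cumulative rate assumes that each round hits positions independently at 0.66% per position.

```python
import math

AVOGADRO = 6.022e23  # molecules per mole


def molecules_from_fmol(fmol):
    """Convert an amount in femtomoles to a number of molecules."""
    return fmol * 1e-15 * AVOGADRO


def doublings(input_fmol, output_fmol):
    """Number of doublings implied by the net fold amplification."""
    return math.log2(output_fmol / input_fmol)


def cumulative_mutation_rate(rate_per_pcr, n_pcrs):
    """Per-position probability of at least one mutation after n serial PCRs.

    Assumption (not from the chapter): rounds act independently on each position.
    """
    return 1.0 - (1.0 - rate_per_pcr) ** n_pcrs


seed_molecules = molecules_from_fmol(20)             # 20 fmol seed
n_doublings = doublings(20, 20_000)                  # 20 fmol -> 20 pmol
rate_two_rounds = cumulative_mutation_rate(0.0066, 2)

print(f"seed: {seed_molecules:.2e} molecules")
print(f"doublings: {n_doublings:.1f}")
print(f"rate after two mutagenic PCRs: {100 * rate_two_rounds:.2f}%")
```

Under these assumptions the seed is on the order of 10¹⁰ molecules, the 1000-fold yield corresponds to roughly 10 doublings, and two serial reactions give about 1.3% per position, in line with the values stated in the text.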
Randomization Strategies

Our basic strategy for randomizing a ribozyme is to generate a population of variants that is both diverse and comprehensive, i.e., a population
containing variants with the greatest possible diversity yet including all possible variants that are closely related to the prototype sequence. For a sequence of length n, mutagenized at a degeneracy d, the average number of mutations per sequence is nd. The mean number of copies per sequence for a sequence that has k mutations is given by S[P(k, n, d)]/N(k), where S is the size of the population, P is the proportion of the population having k mutations, and N is the number of distinct sequences having k mutations. P and N are given by the equations:

$$P(k, n, d) = \frac{n!}{(n-k)!\,k!}\, d^{k} (1-d)^{n-k} \tag{1}$$

$$N(k) = \frac{n!}{(n-k)!\,k!}\, 3^{k} \tag{2}$$
In one study, for example, we wished to randomize a total of 140 positions within an existing ribozyme (n = 140). For a population size of 20 pmol (S ≈ 1.2 × 10¹³ molecules), we chose a degeneracy of 5% (d = 0.05). In this way, more than half of the mutagenized population contained seven or more mutations, yet all possible 1-, 2-, 3-, 4-, and 5-mutation variants were represented (Fig. 1). If we had introduced mutations at a lower frequency, say 2.5% (d = 0.025), then variants with 4 or fewer mutations would have been represented comprehensively, but the vast majority of individuals would have remained very closely related to the original ribozyme. If we had introduced mutations at a higher frequency, say 10% (d = 0.10), then the variants would have contained, on average, a high number of mutations, but sequences that are closely related to the original ribozyme would not have been well represented. The degree of randomization should be based on the level of confidence one has that the starting ribozyme is a good lead compound in the search for the desired catalytic function.
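The quantities defined in Eqs. (1) and (2) are easy to tabulate when planning a randomization. The Python sketch below is offered as an illustration under the values given in the text (n = 140, d = 0.05, S = 1.2 × 10¹³); the function names are hypothetical, and "represented" is taken here to mean a mean copy number of at least 1, matching the expectation value marked in Fig. 1.

```python
from math import comb


def fraction_with_k(k, n, d):
    """Eq. (1): proportion of the population carrying exactly k mutations."""
    return comb(n, k) * d**k * (1 - d) ** (n - k)


def distinct_with_k(k, n):
    """Eq. (2): number of distinct sequences with exactly k mutations
    (each mutated position can take any of the 3 other nucleotides)."""
    return comb(n, k) * 3**k


def mean_copies(k, n, d, pool_size):
    """Mean number of copies per distinct k-mutation sequence: S * P / N."""
    return pool_size * fraction_with_k(k, n, d) / distinct_with_k(k, n)


n, d, S = 140, 0.05, 1.2e13  # values from the example in the text

for k in range(9):
    print(f"k={k}: fraction={fraction_with_k(k, n, d):.4f}, "
          f"mean copies={mean_copies(k, n, d, S):.3g}")

# Fraction of the population carrying seven or more mutations
frac_7_or_more = 1.0 - sum(fraction_with_k(k, n, d) for k in range(7))
print(f"fraction with >= 7 mutations: {frac_7_or_more:.3f}")
```

Rerunning the loop with d = 0.025 or d = 0.10 gives a quick view of how the trade-off between diversity and comprehensive coverage shifts with the chosen degeneracy.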
[Figure 1 plot: fraction of population (vertical axis, 0 to 0.25) versus mean copies/sequence with k mutations (horizontal axis, 10¹² down to 10⁻³⁰).]
FIG. 1. Fraction of the population and mean number of copies per sequence for sequences having k mutations, assuming the population has been randomized over a stretch of 140 nucleotides with a degeneracy d. For each level of degeneracy (d = 0.025, d = 0.05, and d = 0.1), the number of mutations is shown in italics adjacent to the corresponding data point. The vertical dashed line indicates an expectation value of 1 for the mean number of copies per sequence.

Selection Techniques

This section discusses three techniques for selecting randomized ribozymes: selective cDNA synthesis, immobilization on a solid support, and differential electrophoretic mobility. All three techniques require that the desired ribozyme become distinguishably altered as a consequence of carrying out the target reaction.

Selective cDNA synthesis places a significant restriction on the difference between unreacted and reacted molecules, requiring that reacted molecules become extended at their 3' end via a linkage that can be read through by reverse transcriptase (e.g., a phosphodiester linkage). The advantage of this technique is that it is both highly sensitive and highly selective, allowing amplification of as few as 1000 reactive molecules, while excluding a 10⁶-fold excess of unreactive molecules.⁶ The other two techniques, while less stringent, are applicable to a much broader range of ribozyme-catalyzed reactions, including those that produce extensions at either the 3' or the 5' end or result in a cleavage event. For each technique, we describe the basic procedure and discuss its sensitivity and specificity. In addition, we consider potential problems and suggest ways in which they might be overcome.
Selective cDNA Synthesis

Basic Technique. Selective cDNA synthesis requires that the 3' end of the ribozyme become attached to an oligonucleotide or oligonucleotide analog, forming a linkage that can be read through by reverse transcriptase.

⁶ G. F. Joyce, in "Antisense RNA and DNA" (J. A. H. Murray, ed.), p. 353. Wiley-Liss, New York, 1992.

FIG. 2. Selective cDNA synthesis. An RNA molecule, represented by the curved line, catalyzes linkage of a functional group X, located at its own 3' end, to another functional group Y, located at the 5' end of an oligonucleotide substrate. An oligonucleotide primer, indicated by the thick line, hybridizes to the 3' end of the substrate and initiates selective cDNA synthesis of the reacted molecules.
An oligodeoxynucleotide primer is designed such that it does not bind to unreacted variants, but is able to hybridize to the extended 3' terminus of reacted molecules and initiate selective reverse transcription (Fig. 2). Our standard protocol for selective cDNA synthesis is as follows: ~200 fmol of RNA is added to a 20-μl solution containing 1 μM primer, 0.2 mM of each dNTP, 10 mM MgCl₂, 50 mM Tris (pH 7.5), 5 mM DTT, and 2 U/μl avian myeloblastosis virus (AMV) reverse transcriptase, which is incubated at 42° for 1 hr. For greatest specificity, the primer should have an annealing temperature at or slightly below the incubation temperature for the cDNA synthesis reaction. The specificity of this technique can be quite high: as few as 10⁴ molecules out of a population of 10¹³ can be selected. We often carry out selective cDNA synthesis as part of an isothermal amplification procedure, as described below.

Troubleshooting. Incomplete reverse transcription may occur as a result of primer mishybridization or difficulty of the polymerase in reading through highly structured regions of the RNA template. Mishybridization can be ameliorated by using a primer with a lower annealing temperature and/or carrying out cDNA synthesis at a higher temperature [up to 50° with Superscript (BRL, Gaithersburg, MD) reverse transcriptase]. It may
be advantageous to anneal the primer to the RNA template prior to carrying out cDNA synthesis in order to outcompete the tertiary structure that might prevent proper primer hybridization. Over the course of an in vitro evolution experiment, sequences may arise that contain an internal region that is complementary to the primer. Such molecules can be excluded by a gel purification step, assuming that they have significantly different mobility compared to the desired RNA. However, these spurious products are a burden on the amplification system and reduce the efficiency of recovery of the functional RNAs. Thus, it may be necessary to change the primer sequences, perhaps alternating between two different primer sequences with successive amplification reactions.
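When choosing or alternating primers for selective cDNA synthesis, the guidance above (an annealing temperature at or slightly below the incubation temperature) can be applied as a rough screen. The Python sketch below is an assumption-laden illustration: the chapter does not say how annealing temperature is estimated, so the Wallace rule (2° per A/T plus 4° per G/C, a coarse approximation for short oligonucleotides) is swapped in here, and the candidate sequences and the 4° window are hypothetical.

```python
def wallace_tm(primer):
    """Rough melting temperature (degrees C) by the Wallace rule:
    2 * (A + T) + 4 * (G + C). Only a coarse estimate for short primers."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if at + gc != len(seq):
        raise ValueError("primer must contain only A, C, G, T")
    return 2 * at + 4 * gc


def suitable(primer, incubation_temp=42, window=4):
    """True if the estimated Tm is at or slightly below the incubation temperature.

    The window of 4 degrees is an arbitrary choice for this sketch.
    """
    tm = wallace_tm(primer)
    return incubation_temp - window <= tm <= incubation_temp


# Hypothetical candidate primers, for illustration only
candidates = ["ACGTTGCATGACCA", "GGCCGGCCGGCCGGCC", "ATATATATAT"]
for p in candidates:
    print(p, wallace_tm(p), suitable(p))
```

A nearest-neighbor estimate would be more accurate than the Wallace rule; the simpler rule is used only to keep the example self-contained.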
Immobilization on Solid Support

FIG. 3. Selection by immobilization on a solid support. (a) Ribozymes that catalyze linkage of a functional group X to another functional group Y become attached to a matrix. (b) Immobilized ribozymes that catalyze cleavage of the linkage X·Y are released from a matrix. cDNA synthesis of the reacted ribozymes is carried out using either the matrix-bound material (a) or material that is collected in the eluate (b).

Basic Technique. This method requires that the ribozyme become either linked to or released from a solid support in order to be selected. Linking experiments begin with the substrate attached to the matrix and the ribozyme free in solution. The substrate may be attached covalently via a chemical linker or noncovalently via the tight interaction between a biotin group on the substrate and streptavidin on the support. The ribozyme, upon forming a covalent adduct with the immobilized substrate, becomes bound to the matrix (Fig. 3a). Nonreactive ribozymes are washed away
and those that remain bound are amplified, typically by reverse transcription and PCR.

Release experiments begin with the ribozyme immobilized, either joined to the substrate or with the substrate free in solution. Under the appropriate reaction conditions, e.g., following addition of a divalent metal cofactor, the immobilized ribozyme catalyzes its removal from the matrix via a bond-breaking reaction (Fig. 3b). There are several methods for attaching the ribozyme to the matrix, including (1) enzymatic ligation to an already immobilized oligonucleotide or oligonucleotide analog; (2) Watson-Crick base pairing to a complementary, matrix-bound oligonucleotide; and (3) chemical cross-linking between a reactive group that has been incorporated into the ribozyme (e.g., 5'-thiophosphate) and an appropriate partner on the solid support (e.g., thiolate).

In the linking experiments, high salt washes are necessary to remove nonspecific binders from the column (typically with a buffered solution containing 1 M NaCl and 1 mM EDTA). Even in the best of circumstances, we find that about 0.02% of unreacted molecules remain bound nonspecifically to the matrix following several high salt washes. Thus, the specificity of this technique may be several orders of magnitude lower than that of selective cDNA synthesis. Reverse transcription of bound ribozymes can be carried out directly on the solid support and the resulting cDNAs can be collected in the eluate.

Compared to the situation in linking experiments, it is more difficult for unreacted molecules to circumvent a selection scheme that requires ribozymes to catalyze their own release from a solid support. Ribozymes are first attached to the matrix, either covalently or noncovalently, and the nonspecifically bound RNAs are removed with high salt washes.
In this case, however, the ~0.02% of ribozymes that remain nonspecifically bound to the matrix after the washes are unlikely to elute under the comparatively mild conditions of most ribozyme-catalyzed reactions. Thus, with very high selectivity, only ribozymes that are capable of catalyzing their removal from the matrix are eluted after the ribozyme-catalyzed reaction is allowed to occur. The sensitivity of this technique depends on how efficiently reacted ribozymes can be eluted from the matrix and recovered by subsequent ethanol precipitation. If the number of selected molecules is extremely low, then carrier nucleic acid should be added to facilitate precipitation.

Troubleshooting. A potential problem when selecting for ribozymes that become linked to a solid support by reacting with a bound substrate is the existence of molecules that bind tightly to the matrix but do not carry out the target reaction. If this noncovalent interaction is dependent on the folded structure of the RNA, then it is unlikely to withstand washing under denaturing conditions (e.g., excess EDTA and saturating urea). Very tight
noncovalent interactions may require washing with 0.2 N NaOH or at an elevated temperature, although this risks partial loss of the truly reactive material. It may also be useful to carry out a prior negative selection procedure, removing RNAs that bind to a matrix that does not contain substrate before allowing the remaining RNAs to react with the matrix-bound substrate.
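The effect of the ~0.02% nonspecific retention on a linking-type selection can be made concrete with a simple estimate of how pure the matrix-bound material is after washing. The Python sketch below is a simplified model, not a description of the authors' analysis: it assumes that every reacted molecule is retained, that unreacted molecules are retained at a fixed background fraction, and it uses a hypothetical function name.

```python
def reacted_fraction_after_selection(reactive_fraction, background_retention=2e-4):
    """Fraction of matrix-bound molecules that genuinely reacted.

    Assumptions (for illustration): all reacted molecules stay bound, and
    unreacted molecules stay bound with probability `background_retention`
    (0.02% by default, the figure quoted in the text).
    """
    reacted = reactive_fraction
    background = (1.0 - reactive_fraction) * background_retention
    return reacted / (reacted + background)


# Example: one molecule in a million is reactive before selection
purity = reacted_fraction_after_selection(1e-6)
enrichment = purity / 1e-6
print(f"purity after one round: {purity:.4f}")
print(f"enrichment: {enrichment:.0f}-fold")
```

In this model the enrichment per round cannot exceed the reciprocal of the background retention (5000-fold for 0.02%), which is one way to see why the specificity of the linking approach is lower than that of selective cDNA synthesis.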
Differential Electrophoretic Mobility

Basic Technique. Reacted ribozymes that differ in size or conformation from unreacted species can be selected based on their differential electrophoretic mobility. Ribozymes are first radiolabeled, either by the incorporation of radiolabeled monomers during in vitro transcription or by 5'-end labeling with T4 polynucleotide kinase and [γ-³²P]ATP. Reacted species are separated from unreacted molecules by gel electrophoresis, visualized by autoradiography, and cut and eluted from the gel. Reacted species then must be returned to their original unreacted form, either before or after amplification. The selectivity of this technique depends on the difference in mobility between reacted and unreacted ribozymes. If the fraction of reacted ribozymes is quite small, then detection and recovery of the reacted molecules become difficult, limiting the sensitivity of the method.

Troubleshooting. The greatest difficulty with selection based on differential electrophoretic mobility is the occurrence of unreacted RNAs that have the same electrophoretic mobility as the reacted species, perhaps due to an insertion or deletion or to some unusual conformation. One possible solution to this problem is to gel purify the reacted ribozymes as well as the regenerated, unreacted form of the molecule.⁷ Ribozymes thus are subjected to a dual selection. If this remedy is impractical or unsuccessful, then it may be necessary to combine this selection technique with one of the other two selection methods.

⁷ T. Pan and O. C. Uhlenbeck, Biochemistry 31, 3887 (1992).

Selection Strategies

In order to obtain rapid enrichment of superior catalysts, the selection criteria must be chosen carefully, based on the proportion of the population that undergoes the target reaction and the level of background activity due to either the uncatalyzed reaction or the lack of specificity in the selection procedure. If the selection criteria are too lenient, allowing a large fraction of the molecules to undergo reaction, then superior catalysts will not enjoy a significant advantage over less active molecules, and improvement of the
catalytic properties of the population as a whole will be very gradual. For example, if 10% of the initial population is allowed to carry out the target reaction, then the enrichment of a new, highly advantageous variant will be, at most, 10-fold. On the other hand, if the selection criteria are too harsh, then the number of reacted molecules may be too low, falling below the limits of sensitivity of the selection scheme. Thus, the ideal selection criteria are those that result in survival of the fewest reacted molecules that can be reliably recovered by the selective amplification procedure. The selection criteria can be adjusted by manipulating various reaction parameters, such as reaction time, substrate concentration, and ambient conditions.

In some cases, the initial pool may not show any detectable activity for the target reaction. One can proceed on faith for several rounds of selective amplification, carrying along putative catalysts with material resulting from the background activity. Catalysts, if present in the initial pool, should in principle be detectable after (log x)/(log y) rounds of selective amplification, where x is the estimated fraction of molecules that have undergone the target reaction and y is the fraction of molecules that are selected as a result of background events. The selection criteria may need to be reevaluated if such molecules are not detected after (log x)/(log y) + ~2 rounds. If it appears that the catalyst of interest is not present in the initial pool, one might attempt to proceed toward the desired catalytic behavior in two steps, first seeking molecules that perform some related, but less challenging reaction, and then continuing on toward the desired activity.

Once an activity is realized, it is important to maintain stringent selection pressure on the evolving population to ensure that the best catalysts continue to enjoy a significant advantage.
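The two rules of thumb given above, the (log x)/(log y) estimate for the number of rounds and the ceiling on enrichment set by the fraction of the population allowed to react, can be sketched as follows. This Python fragment is illustrative only; the function names and the example values of x and y are hypothetical and are not taken from the chapter.

```python
import math


def rounds_to_detect(x, y):
    """Estimated rounds of selective amplification before catalysts become
    detectable: (log x) / (log y), with x and y defined as in the text."""
    if not (0 < x < 1 and 0 < y < 1):
        raise ValueError("x and y must be fractions between 0 and 1")
    return math.log(x) / math.log(y)


def max_enrichment_per_round(fraction_selected):
    """Upper bound on enrichment of a superior variant in one round when a
    given fraction of the population passes the selection."""
    return 1.0 / fraction_selected


# Hypothetical example: reacted fraction x = 1e-12, background y = 1e-3
expected_rounds = rounds_to_detect(1e-12, 1e-3)
print(f"expected rounds: {expected_rounds:.1f}")
print(f"reevaluate after about {expected_rounds + 2:.0f} rounds")

# If 10% of the population is allowed to react, enrichment is at most 10-fold
print(f"max enrichment at 10% survival: {max_enrichment_per_round(0.10):.0f}-fold")
```

The estimate follows from the observation that each round can enrich true catalysts over background by at most a factor of 1/y, so a starting fraction x needs (log x)/(log y) such rounds to approach unity.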
This can be accomplished by progressively increasing the stringency of the selection criteria, adjusting the reaction parameters so that only a small, but reliably recoverable proportion of the population is able to survive. A potential concern is that the number of survivors not be so small that the diversity of the population is lost, which would reduce the opportunity for discovering new, more advantageous variants through mutagenic amplification. In practice, however, most selection schemes have such a high level of background activity that the number of surviving molecules, even under the harshest allowable selection criteria, is still quite substantial.

Amplification

FIG. 4. Isothermal amplification reaction. Primer 1 initiates reverse transcription of RNA to cDNA. Primer 2 initiates second-strand synthesis, which results in the attachment of a T7 promoter element. Approximately 200-fold amplification occurs when the double-stranded DNA is transcribed back to RNA.

There are two basic methods for amplifying the selected ribozymes. The first involves an isothermal RNA amplification procedure that cycles
through a cDNA intermediate.⁶,⁸ The second involves reverse transcription, then PCR amplification of the resulting cDNA, and finally in vitro transcription of the PCR products. We often employ both amplification strategies in a sequential manner. This provides a higher overall level of amplification and allows us to take advantage of certain favorable attributes of each procedure. Isothermal RNA amplification allows selective cDNA synthesis and amplification to occur simultaneously. PCR amplification provides double-stranded DNA for cloning and allows new mutations to be introduced, as described earlier.
⁸ J. C. Guatelli, K. M. Whitfield, D. Y. Kwoh, K. J. Barringer, D. D. Richman, and T. R. Gingeras, Proc. Natl. Acad. Sci. U.S.A. 87, 1874 (1990).

Isothermal Amplification

Isothermal amplification involves two primer-dependent steps: reverse transcription of RNA to cDNA, then attachment of a T7 promoter element. These are followed by transcription of the DNA back to RNA (Fig. 4). To initiate cDNA synthesis, primer 1 binds at a unique site at the extreme 3' end of the ribozyme, preferably with a melting temperature of 37-42° in order to obtain optimal specificity. Primer 2 contains one strand of the T7 promoter and binds at a unique site at the extreme 3' end of the cDNA. Second-strand synthesis proceeds as a consequence of the DNA-dependent
DNA polymerase activity of reverse transcriptase. Because primer 2 only binds to full-length cDNA, incomplete reverse transcripts are excluded from subsequent reaction steps. It is important to include several residues upstream from the T7 promoter sequence, at the extreme 5' end of primer 2, to ensure completion of a fully double-stranded promoter. T7 RNA polymerase produces roughly 100-1000 copies of RNA per copy of DNA template. Importantly, each of these RNAs can enter another round of reverse transcription, second-strand synthesis, and forward transcription. This cycle of events can occur repeatedly in a single test tube at a constant temperature of 37-42°. Depending on the input, one can achieve up to 10⁶-fold amplification in 1 hr.

For ribozyme-catalyzed reactions that result in ligation of an oligonucleotide or oligonucleotide analog to the 3' terminus of the RNA, selection and isothermal amplification can be coupled by relying on a version of primer 1 that hybridizes only to the extended 3' end of reacted ribozymes. The original 3' end of the ribozyme can be restored in a second amplification reaction that is seeded by a portion of the first, using a different form of primer 1 that does not extend beyond the 3' end of the ribozyme.

Our standard protocol for isothermal RNA amplification is as follows: input RNA is added to a solution containing 1 μM of each primer, 2 mM of each NTP, 0.2 mM of each dNTP, 10 mM MgCl₂, 50 mM Tris (pH 7.5), 5 mM DTT, 5 U/μl T7 RNA polymerase, and 1.25 U/μl Moloney murine leukemia virus (MMLV) reverse transcriptase, which is incubated at 37-42° for 1-2 hr. With an RNA input of about 10 fmol, we typically achieve 10³- to 10Z-fold amplification. Amplified RNA can be isolated and purified by polyacrylamide gel electrophoresis.
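Because each DNA template yields roughly 100-1000 RNA copies and every RNA can re-enter the cycle, only a few complete cycles are needed to reach the amplification levels quoted above. The Python sketch below illustrates this with an idealized model; it assumes that every RNA is converted to a template in each cycle, which a real reaction will not achieve, and the function names are hypothetical.

```python
def cycles_needed(target_fold, copies_per_template):
    """Number of complete RNA -> cDNA -> dsDNA -> RNA cycles required to reach
    a target fold amplification, in an idealized model where every RNA
    molecule is converted to a template in each cycle."""
    cycles, fold = 0, 1
    while fold < target_fold:
        fold *= copies_per_template
        cycles += 1
    return cycles


def amplified_amount_fmol(input_fmol, fold):
    """Amount of RNA after amplification, in femtomoles."""
    return input_fmol * fold


# 10^6-fold amplification at the two ends of the 100-1000 copies/template range
print(cycles_needed(10**6, 100))    # cycles needed at 100 copies per template
print(cycles_needed(10**6, 1000))   # cycles needed at 1000 copies per template

# 10 fmol input amplified 10^3-fold
print(amplified_amount_fmol(10, 10**3), "fmol")
```

In practice the yield per cycle falls as primers and nucleotides are consumed, so the cycle counts from this model should be read as lower bounds.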
PCR
Before amplification by PCR, the selected ribozymes must be reverse transcribed to cDNA, employing a primer that is complementary to the 3' end of the ribozyme (see Selective cDNA Synthesis). Our standard PCR protocol requires an input of about 10 fmol of cDNA, which is added to a 100-µl solution containing 1 µM of each primer, 0.2 mM of each dNTP, 1.5 mM MgCl2, 50 mM KCl, 10 mM Tris (pH 8.3), 0.01% gelatin, and 2.5 units Taq polymerase. The mixture is incubated in a thermal cycler for 30 cycles of 92° for 1 min, 45° for 1 min, and 72° for 1 min. (It may be necessary to adjust the annealing temperature, depending on the length and sequence of the primers.) The upstream primer must contain the T7 promoter sequence so that the PCR products can serve as templates in a subsequent in vitro transcription reaction that generates the new population of ribozymes. These PCR conditions are slightly mutagenic, resulting in a mutation rate of about 0.1% per nucleotide position. Higher mutation rates can be obtained by carrying out mutagenic PCR, as described previously.
Prior to gel purification of the RNA transcripts, it is advantageous to degrade the PCR products using DNase I. This helps prevent carryover of DNA from one round of selective amplification to the next. Carryover molecules are undesirable because they need not meet the selection criteria, yet are able to give rise to RNAs in the subsequent population. Thus, the presence of carryover sequences slows the pace of in vitro evolution.
Spurious Amplification Products
After many rounds of in vitro evolution, RNA or DNA sequences may arise that interfere with the amplification of reactive ribozymes. Such sequences may be self-priming, resulting in extra-length amplification products that appear as distinct bands or as a smear of high molecular weight material when analyzed by gel electrophoresis. Alternatively, the sequences may contain regions that are complementary to one or both of the primers, resulting in shorter-length amplification products. In the case of the extra-length products, we find that gel purification of the cDNA, reduction in the amount of input material, or carrying out a lower level of amplification can usually remedy the problem. In the case of the shorter-length products, changing the primer sequence may be helpful. In fact, alternating the primer sequences with successive generations is a good preventative measure, guarding against both small, spurious amplification products and carryover or cross-contamination from previous rounds of selective amplification.
Mutation
The continual introduction of new mutations makes in vitro evolution a powerful method for optimizing ribozyme function. Starting populations, no matter how large, possess a limited amount of diversity and thus are unlikely to contain ribozymes with optimal activity.
However, suboptimal versions of the desired species can be selected from a well-constructed initial pool and, through repeated rounds of selection, amplification, and mutation, can gain new mutations that lead progressively toward the desired phenotype. New mutations can be introduced during PCR amplification, under either highly mutagenic or mildly mutagenic conditions, as previously described. An estimate of the optimal level of mutagenesis is 1/n, where n is the length of the sequence.9 At this level, each amplified molecule will contain, on average, one new mutation. If mutagenesis is carried out at a higher rate, then there may be a runaway accumulation of mutations, with new mutations arising more rapidly than they can be culled by selection. As a result, the information content of the most advantageous sequences quickly degenerates to random noise.
9 M. Eigen, Naturwissenschaften 58, 465 (1971).
A novel technique, dubbed "sexual PCR," has been developed to artificially promote recombination between variants.10,11 The application of this technique to ribozymes will likely occur in the near future. Sexual PCR is carried out at the level of double-stranded DNA, beginning with either multiple copies of a particular sequence or a population of variant sequences. The DNAs are cleaved into short fragments with DNase I, and the fragments are size-fractionated by agarose or nondenaturing polyacrylamide gel electrophoresis. DNAs recovered from the gel are subjected to multiple cycles of PCR amplification in the absence of added primers, allowing the fragments to serve as primers and templates for each other. As a result, they become assembled in various combinations to form larger fragments. Sexual PCR is completed by introducing flanking primers and continuing PCR amplification for additional cycles, restoring the material to full length. The recombination frequency is controlled by adjusting the length of the gel-purified fragments, which determines the number of recombination events that are required to reassemble the entire gene. Because fragments containing fewer than ~20 nucleotides tend to recombine inefficiently and in a less discriminate manner, the sexual PCR technique is best suited for sequences longer than ~200 nucleotides.
After numerous rounds of in vitro evolution, especially if stringent selection criteria are maintained throughout, a limited number of sequences may come to dominate the population.
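To make the 1/n estimate of the optimal mutation level given earlier in this section concrete, the following Python sketch computes the expected number of new mutations per molecule and the probability that a molecule carries exactly k new mutations. It assumes that mutations occur independently and with equal probability at every position, which is a simplification. The 200-nucleotide length is a hypothetical example, and the function names are illustrative only.

```python
from math import comb
from typing import List


def expected_mutations(length: int, rate: float) -> float:
    """Mean number of new mutations per molecule (length x per-position rate)."""
    return length * rate


def mutation_probabilities(length: int, rate: float, max_k: int = 3) -> List[float]:
    """Binomial probability of exactly k new mutations, for k = 0..max_k.

    Assumption: each position mutates independently with the same probability.
    """
    return [
        comb(length, k) * rate**k * (1 - rate) ** (length - k)
        for k in range(max_k + 1)
    ]


if __name__ == "__main__":
    n = 200  # hypothetical ribozyme length in nucleotides
    # Compare the 1/n level with the ~0.1% per position of standard PCR.
    for label, rate in (("1/n", 1 / n), ("0.1% per position", 0.001)):
        mean = expected_mutations(n, rate)
        probs = mutation_probabilities(n, rate)
        formatted = ", ".join(f"P({k})={p:.3f}" for k, p in enumerate(probs))
        print(f"{label}: mean {mean:.2f} new mutations; {formatted}")
```

At a rate of 1/n the mean is one new mutation per molecule, yet a little over one-third of the molecules carry no new mutation at all. At the roughly 0.1% per position of standard PCR, a 200-nucleotide molecule acquires 0.2 new mutations on average.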
This is disadvantageous because it restricts the diversity of novel variants that will be explored in a subsequent evolutionary search. One is faced with two options: the population as a whole can be rerandomized, using either mutagenic or sexual PCR; or a single dominant sequence can be used as a prototype for the construction of a new starting pool of randomized variants. Rerandomizing the entire population is generally preferable because it offers relatively rare species a small, but potentially significant, opportunity to give rise to novel variants with highly desirable properties. On the other hand, in order to rid the population of individuals that give rise to amplification artifacts or simply as a matter of expediency, it may not be unreasonable to start anew.
10 W. P. C. Stemmer, Proc. Natl. Acad. Sci. U.S.A. 91, 10747 (1994).
11 W. P. C. Stemmer, Nature (London) 370, 389 (1994).
In vitro evolution is a powerful technique for obtaining optimized or novel RNA catalysts. This method can be used to increase our basic understanding of RNA catalysis and, on a more practical level, to generate ribozymes with potential therapeutic or industrial value. Such goals will become easier to achieve as in vitro evolution methods improve. In the near future, we expect to see new techniques for handling larger population sizes, more sensitive and specific selective amplification procedures, and better mutagenesis protocols that allow one to specify any desired mutation rate. As such advances occur, the isolation of interesting ribozymes will become even more routine than described in this chapter.
Acknowledgments
The development of these methods was supported by Grant NAGW-3118 from the National Aeronautics and Space Administration and Grant AI-30882 from the National Institutes of Health. We are grateful to Ronald Breaker and Luc Jaeger for helpful discussions.
[24] Peptide Nucleic Acids: A New Dimension to Peptide Libraries and Aptamers
By PETER E. NIELSEN
Introduction
Peptide nucleic acid (PNA) is a DNA analog in which the deoxyribose phosphate backbone of DNA has been replaced by a charge-neutral, achiral pseudopeptide (or amide) backbone composed of N-(2-aminoethyl)glycine units (Fig. 1). PNA was originally designed as a DNA mimic for sequence-specific recognition of double-stranded (ds) DNA via major groove triplex formation.1 Indeed PNA is a very effective structural mimic of DNA,2-5 but binding to duplex DNA takes place via strand displacement due to the formation of an extraordinarily stable internal PNA2/DNA triplex6,7 (Fig. 2). This binding mode is unique to PNA (although it would be predicted
1 P. E. Nielsen, M. Egholm, R. H. Berg, and O. Buchardt, Science 254, 1497 (1991).
2 M. Egholm, O. Buchardt, P. E. Nielsen, and R. H. Berg, J. Am. Chem. Soc. 114, 1895 (1992).
3 M. Egholm, O. Buchardt, P. E. Nielsen, and R. H. Berg, J. Am. Chem. Soc. 114, 9677 (1992).
4 M. Egholm, C. Behrens, L. Christensen, R. H. Berg, P. E. Nielsen, and O. Buchardt, J. Chem. Soc., Chem. Commun., 800 (1993).
5 M. Egholm, O. Buchardt, L. Christensen, C. Behrens, S. M. Freier, D. A. Driver, R. H. Berg, S. K. Kim, B. Norden, and P. E. Nielsen, Nature (London) 365, 556 (1993).
6 D. Y. Cherny, B. P. Belotserkovskii, M. D. Frank-Kamenetskii, M. Egholm, O. Buchardt, R. H. Berg, and P. E. Nielsen, Proc. Natl. Acad. Sci. U.S.A. 90, 1667 (1993).
7 P. E. Nielsen, M. Egholm, and O. Buchardt, J. Mol. Recognition 7, 165 (1994).
METHODS IN ENZYMOLOGY, VOL. 267
Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.