Gene 186 (1997) 29–35
Splicing by overlap extension by PCR using asymmetric amplification: an improved technique for the generation of hybrid proteins of immunological interest
Anthony N. Warrens a,*, Michael D. Jones b, Robert I. Lechler a
a Department of Immunology, Royal Postgraduate Medical School, Hammersmith Hospital, DuCane Road, London W12 0HN, UK
b Department of Virology, Royal Postgraduate Medical School, London W12 0HN, UK
Received 21 May 1996; revised 26 August 1996; accepted 27 August 1996
Abstract
Major histocompatibility complex (MHC) proteins play a central role in the immune recognition of antigen. The generation of hybrid MHC molecules has been of great value in elucidating the structure–function relationships of these key glycoproteins. In this report, the generation of cDNAs coding for seven such hybrid proteins is described. We have used the technique of splicing by overlap extension by the polymerase chain reaction (SOE by PCR) [Horton, R.M., Hunt, H.D., Ho, S.N., Pullen, J.K. and Pease, L.R. (1989) Engineering hybrid genes without the use of restriction enzymes: gene splicing by overlap extension. Gene 77, 61–68] to generate intermediate products of each of the components of the hybrid, tipped with a small sequence of the other, and then mixed these products in a second-stage PCR to produce the final spliced product. Where we were unable to generate final product, we introduced an additional step of asymmetric PCR synthesis to generate an excess of those strands which would anneal in the final PCR and found this to be effective. We noted a significant but manageable mutation rate, to which the tendency of DNA polymerase to add non-templated nucleotides may have contributed [Hu, G. (1993) DNA polymerase-catalyzed addition of nontemplated extra nucleotides to the 3′ end of a DNA fragment. DNA Cell Biol. 12, 763–770]. To avoid this, we modified our protocol to include a stage of blunting our intermediate products with T4 DNA polymerase prior to mixing them in the final PCR. We present this system as an effective mechanism to splice DNA.
Keywords: Major histocompatibility complex; MHC; Gene shuffling; Oligonucleotide; DNA polymerase
1. Introduction
Much has been learned from the use of hybrid protein constructs produced from manipulated DNA. Such manipulation classically relies on the production of ligatable DNA fragments using restriction endonucleases (sometimes involving the generation and subsequent removal of restriction sites). Horton and colleagues (Horton et al., 1989) described the technique of splicing by overlap extension by the polymerase chain reaction (SOE by PCR), a technique that is not limited by the presence of restriction sites at appropriate locations. We have now used this technique to generate seven major histocompatibility complex (MHC) hybrid proteins. The biological questions which these molecules have allowed us to address will be reported elsewhere. In this paper, we describe the use of this technique in our hands, the problems associated with it and our attempts to circumvent them.
* Corresponding author. Tel. +44 181 743 2030/2088; Fax +44 181 743 8602.
Abbreviations: bp, base pair(s); kb, kilobase(s) or 1000 bp; MHC, major histocompatibility complex; PCR, polymerase chain reaction; SOE, splicing by overlap extension.
0378-1119/97/$17.00 Copyright © 1997 Elsevier Science B.V. All rights reserved. PII S0378-1119(96)00674-9
2. Experimental and discussion
2.1. Splicing by overlap extension by the polymerase chain reaction (SOE by PCR)
SOE by PCR, as used in the present studies (summarised in Fig. 1), involves three separate PCRs: the two DNA fragments produced in the first-stage reactions are mixed to form the template for the second stage. Four primers are required for each construct: two flanking primers and two hybrid primers, one of each for each of the two first-stage reactions. The hybrid oligonucleotides are designed from the known nucleotide sequences to generate fragments that will have overlapping sequence. One of the first-stage PCRs produces a DNA fragment with the sequence 5′ to the splice point, and the other a DNA fragment with the sequence 3′ to the splice point ((1)–(2) in Fig. 1). However, since the hybrid oligonucleotides span the splice point, each of the first-stage products is tipped with a short sequence derived from the other ((2) in Fig. 1). For this reason, when the two products are mixed they can partially anneal and, using the original two flanking primers, participate in the second-stage PCR to produce the final product ((3–4) in Fig. 1).
Fig. 1. Splicing by overlap extension by polymerase chain reaction (SOE by PCR). This schematic representation of SOE by PCR is discussed in Section 2.1. cDNAs are represented as bars. One cDNA is represented with solid shading, the other lightly stippled; M13/pUC sequence is represented as a single bold line, and oligonucleotides annealing with it as empty boxes. Oligonucleotides are shown labelled with a lower case letter. Their 5′ to 3′ orientation is left-to-right if shown above the cDNA, and right-to-left if shown below it. Methods: The buffer used in PCRs was as follows (final concentrations): 10 mM Tris-HCl (pH 8.3 at 25°C), 50 mM KCl and 0.01% gelatin (all Sigma or BDH, Poole, UK). Various concentrations of MgCl2 (BDH) were used, ranging from 0.5 mM to 5.0 mM. PCRs were initially performed at MgCl2 concentrations of 1.5–3.0 mM. In no circumstances did variation of the MgCl2 concentration within that range significantly alter the yield of product in the first-stage PCR. Molar equivalents of all four deoxynucleotides (Pharmacia, Uppsala, Sweden) were mixed and used at a final concentration of 200–250 µM each. Reactions were performed in a final volume of either 50 or 100 µl. In all reactions, 500 ng of each oligonucleotide primer was added. Taq polymerase (2.5 U; Stratagene, UK) was used. The PCR protocol used at each stage (unless otherwise indicated) involved thirty cycles as follows: 92°C for 60–90 s, 55°C for 60–90 s, and 72°C for 3–5 min. After this, a final step at 72°C for 15 min was followed by cooling to 4°C. Following the first stage, the PCR products were gel-purified and used in the second-stage reaction, which was identical to the first-stage reaction, save that a wider range of MgCl2 concentrations was necessary (1.5–5.0 mM). 5–15 ng of each of the intermediate products were added in equimolar amounts. The final product was gel-purified and cloned into a eukaryotic expression vector for transfection. Construct cDNA was subcloned from that vector into M13mp18 to check the sequence of the cDNA being transfected, using a standard dideoxynucleotide chain termination sequencing protocol (Amersham International, Aylesbury, UK).
In each of the reactions described, the wild-type cDNAs used as templates (1.1–1.3 kb) were cloned into a vector of either the pUC or M13 series. Since in both of these series the cloning site is within the lacZ gene, the sequences flanking the cloning site are the same, and hence the same ‘universal’ flanking primers (FP and RP in this study) may be used.
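The relationship between the two hybrid primers described in Section 2.1 can be illustrated with a short Python sketch. It assumes only that the two hybrid primers span the same junction on opposite strands, which is how the C1 pair in Table 1 (AW12 and AW2) is constructed; the function names and the toy gene sequences are hypothetical and serve purely as illustration, not as the authors' design procedure.

```python
# Illustrative sketch; function names are hypothetical, not taken from the paper.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def hybrid_primers(upstream_tail: str, downstream_head: str) -> tuple[str, str]:
    """Return (primer_b, primer_c) for one splice point.

    upstream_tail: sense-strand bases immediately 5' of the splice point.
    downstream_head: sense-strand bases immediately 3' of the splice point.
    """
    junction = upstream_tail + downstream_head
    primer_c = junction                      # sense primer, paired with flanking primer d
    primer_b = reverse_complement(junction)  # antisense primer, paired with flanking primer a
    return primer_b, primer_c


def soe_splice(gene_a: str, cut_a: int, gene_b: str, cut_b: int,
               n_up: int, n_down: int) -> str:
    """Join gene_a[:cut_a] to gene_b[cut_b:] via overlapping first-stage products."""
    junction = gene_a[cut_a - n_up:cut_a] + gene_b[cut_b:cut_b + n_down]
    five_prime_product = gene_a[:cut_a] + gene_b[cut_b:cut_b + n_down]
    three_prime_product = gene_a[cut_a - n_up:cut_a] + gene_b[cut_b:]
    # Each product is tipped with sequence from the other, so both contain the junction.
    assert five_prime_product.endswith(junction)
    assert three_prime_product.startswith(junction)
    return five_prime_product + three_prime_product[len(junction):]


# Sense-strand halves of the C1 junction, taken from primer AW2 in Table 1.
primer_b, primer_c = hybrid_primers("ATTTGGACGATTTGCA", "AGCTTTGACCCGCAATTT")
print(primer_b)  # compare with AW12 in Table 1
print(primer_c)  # compare with AW2 in Table 1

# Toy sequences, invented for illustration only.
spliced = soe_splice("ATGAAACCCTTTGGG", 9, "ATGCGTCGTACGTAA", 6, n_up=4, n_down=4)
print(spliced)
```

Because both first-stage products carry the full junction, the length of the overlap available for annealing in the second stage equals the length of the hybrid primer.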
The seven proteins predicted to be generated from the DNA constructs described in this paper are shown schematically in Fig. 2. Table 1 identifies the templates and oligonucleotides used in this study. Using the standard PCR protocol described in detail in the legend to Fig. 1, it was possible to generate six of these seven constructs (C2–7). Fig. 3 shows an example of the intermediate and second-stage products obtained in the generation of RREα.
Fig. 2. The major histocompatibility complex (MHC) constructs generated in this study. (a) A representation of the domain structure of MHC class II molecules (Brown et al., 1993). Each polypeptide is represented as a sequence of three regions: (1) the β-pleated sheet amino-terminal half of the membrane-distal (α1 or β1) domain; (2) the α-helical carboxy-terminal half of that domain; and (3) everything more carboxy-terminal. In the terminology used to describe the constructs, the upper case letters represent the origin of each of these three regions, the lower case letter indicating the α- or β-chain origin of the preceding sequence. (b) The seven constructs generated in this study. For simplicity they are referred to as C1–7 in this paper. The alleles used are indicated in the legend to Table 1. Ag marks the position of antigenic peptide.
Table 1 Oligonucleotides used to generate the constructs in this study
Construct a | First-stage product | Template b | M13 primer c,d | Hybrid splicing primer and sequence c,e,f | Sequence source g
C1 | 5′ | pUC18-DRA | FP (a) | AW12 (b) 5′-AAATTGCGGGTCAAAGCT/TGCAAATCGTCCAAAT-3′ | DQA 180–162 / DRA 257–242
C1 | 3′ | pUC18-DQA | RP (d) | AW2 (c) 5′-ATTTGGACGATTTGCA/AGCTTTGACCCGCAATTT-3′ | DRA 242–257 / DQA 162–180
C2 | 5′ | pUC18-DQA | FP (a) | AW15 (b) 5′-ACCTTGAGCCTCGAAGCT/TCTAAATTGTCTGAGA-3′ | DRA 275–258 / DQA 161–146
C2 | 3′ | pUC18-DRA | RP (d) | AW5 (c) 5′-TCTCAGACAATTTAGA/AGCTTCGAGGCTCAAGGT-3′ | DQA 146–161 / DRA 257–275
C3 | 5′ | pUC18-DRA | FP (a) | AW16 (b) 5′-CACCTCTGGGGCCAC/ATTGGTGATCGGAGT-3′ | Eα 3150–3135 / DRA 339–354
C3 | 3′ | pUC18-Eα | RP (d) | AW6 (c) 5′-ACTCCGATCACCAAT/GTGGCCCCAGAGGTG-3′ | DRA 339–354 / Eα 3135–3150
C4 | 5′ | pUC18-Eα | FP (a) | AW17 (b) 5′-TACCTCTGGAGGTA/CGTTGGCATCTGGAGT-3′ | DRA 368–354 / Eα 2654–2639
C4 | 3′ | pUC18-DRA | RP (d) | AW7 (c) 5′-ACTCCAGATGCCAACG/TACCTCCAGAGGTA-3′ | Eα 2639–2654 / DRA 354–368
C5 | 5′ | M13mp18-Aα* | RP (a) | AW32 (b) 5′-GTTACCTCTGGAGGTAC/ATTGGTAGCTGGGGTGG-3′ | DRA 370–354 / Aα 264–248
C5 | 3′ | pUC18-DRA | RP (d) | AW22 (c) 5′-CCACCCCAGCTACCAAT/GTACCTCCAGAGGTAAC-3′ | Aα 248–264 / DRA 354–370
C6 | 5′ | pUC18-DRA | FP (a) | AW18 (b) 5′-GGCCTTTGGGGAATC/ATTGGTGATCGGAGT-3′ | K 3672–3658 / DRA 353–339
C6 | 3′ | pUC19-neo-K* | FP (d) | AW8 (c) 5′-ACTCCGATCACCAAT/GATTCCCCAAAGGCC-3′ | DRA 339–353 / K 3658–3672
C7 | 5′ | M13mp18-DRB | FP (a) | AW19 (b) 5′-GGCCTTTGGGGAATC/TCGCCGCTGCACTGT-3′ | K 3672–3658 / DRB 282–268
C7 | 3′ | pUC19-neo-K* | FP (d) | AW9 (c) 5′-ACAGTGCAGCGGCGA/GATTCCCCAAAGGCC-3′ | DRB 268–282 / K 3658–3672
a The constructs are described in Fig. 2.
b An asterisk indicates that a cDNA insert was inserted in the antisense orientation.
c The letter in parentheses following each oligonucleotide identifies its role in the schema in Fig. 1.
d The sequences of the universal primers were as follows: FP (forward primer) 5′-TTGTAAAACGACGGCCAGTG-3′ (M13mp18 6308–6289); RP (reverse primer) 5′-GAAACAGCTATGACCATGAT-3′ (M13mp18 6208–6227).
e The position of the splicing point is indicated by a slash. In four oligonucleotides, mutations were introduced which did not alter the amino-acid sequence in order to generate additional restriction sites. These are indicated as bold letters.
f The oligonucleotides were derived from published cDNA sequences as follows (this list also indicates the alleles involved): DRA1*0101 (Lee et al., 1982); DQA1*0501 (Schiffenbauer et al., 1987); DRB1*0101 (Bell et al., 1985); Eαᵏ (Mathis et al., 1983); Aαˢ (Landais et al., 1985); Kᵏ (Arnold et al., 1984); M13 (Messing et al., 1977).
g The numbering is derived from the references listed in footnote f. In the description of the sources of the oligonucleotides, rising numbers indicate sequence homologous with the published coding sequence; falling numbers indicate complementary sequence.
Fig. 3. Example of first- and second-stage products of SOE by PCR on agarose gel electrophoresis, stained with ethidium bromide and transilluminated with ultraviolet light. The gel on the left shows the two first-stage PCR products (0.5 and 0.8 kb, respectively) and the gel on the right shows two aliquots of the same second-stage PCR reaction combining these 5′ and 3′ products to yield C3 (1.3 kb). On both gels, λ doubly digested with HindIII and EcoRI was used as the molecular mass marker.
2.2. The use of asymmetric PCR to generate product
In the production of C1, it proved impossible to generate any second-stage product from mixtures of the 5′ and 3′ products of the first-stage PCRs, despite extensive manipulation of the buffer conditions and PCR protocol. As is illustrated in Fig. 4, when the two large DNA fragments are mixed and denatured in a second-stage PCR, only one combination of a 5′ single strand and a 3′ single strand will anneal to produce a template for the final spliced product. In an attempt to bias DNA production towards those two strands, a period of asymmetric PCR was introduced into the protocol (footnote 1). The introduction of this step resulted in the successful production of a second-stage product with the coding sequence of C1.
Fig. 4. Rationale for use of asymmetric PCR. In the reannealing phase of the second-stage PCR, there are four possible combinations. Only one of these (2) can act as a template for DNA polymerase to generate the complete double-stranded hybrid cDNA, and this constitutes one of the less favoured pairs ((2) and (3)) of these combinations. Production of this species can be biased by relative overproduction of the two participating strands.
Footnote 1: Separately, gel-purified first-stage PCR products (5′, 250 ng; 3′, 200 ng) were cycled 16 times in the presence of 500 ng of the relevant single oligonucleotide (using the terminology of Fig. 1, primer a for the 5′ product and primer d for the 3′ product), using conditions that were otherwise unchanged from the above and including the addition of further Taq polymerase. A final nucleotide concentration of 200 µM and a MgCl2 concentration of 4.5 mM were used for the 3′ product. 5 and 16 µl of the two products, respectively, were mixed, which, assuming a molar equivalence of expansion, would produce stoichiometric equivalence. The standard PCR protocol was used for the second-stage PCR.
2.3. Unwanted mutations introduced by SOE by PCR
The process of SOE by PCR was found to introduce sequence-altering mutations and deletions. The mutations generated by this technique are shown in Table 2. In the case of C1 and C2, SOE by PCR was performed twice and three times, respectively, each producing mutations. The data presented in Table 2 represent the clones with the highest fidelity. For subsequent constructs, it was technically simpler to correct mutations than to repeat the PCR and the cloning steps via the eukaryotic expression vector. All errors predicted to alter amino-acid sequences within the mature protein were reverted to wild-type by site-directed mutagenesis or (in one case) by the shuffling of a gene fragment using restriction enzyme recognition sites.
In C2, a T-to-C sequence-altering mutation was found (in three different SOE by PCRs) in the codon immediately 5′ to the position at which the splicing oligonucleotide would have annealed. One possible explanation for such a mutation relates to the ability of certain DNA polymerases to add an additional non-templated nucleotide at the 3′ end of a DNA fragment (Clark et al., 1987). In the light of this, we decided to remove any overhang from first-stage PCR intermediate products prior to second-stage SOE. The purified intermediate products were rendered blunt-ended by T4 DNA polymerase (Pharmacia) (footnote 2) and the mixture was used without further manipulation in the second-stage PCR. This was undertaken for C3, C4, C6 and C7, and no further mutations in a base adjacent to the annealing site of the oligonucleotide were observed. However, the mutation in C7 was of a similar nature, with a deletion of the nucleotide immediately 3′ to the shuffling oligonucleotide.
Footnote 2: 3 U of T4 DNA polymerase was used in the presence of 50 mM NaCl, 10 mM Tris-HCl (pH 7.9), 10 mM MgCl2, 1 mM dithiothreitol and dNTPs (200 µM each) (all Pharmacia), with incubation at 37°C for 1 h.
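The proposed link between non-templated nucleotide addition and mutations next to the splicing oligonucleotide can be made concrete with a toy model in Python. This is a sketch under stated assumptions, not a description of the enzymology: it assumes that a strand carrying one extra 3′ base is still extended in the second stage, so that the extra base replaces the templated one. All function names and sequences are hypothetical.

```python
# Toy model; names and sequences are hypothetical and chosen for illustration.

def add_nontemplated_base(strand: str, base: str = "A") -> str:
    """Mimic a polymerase adding one non-templated nucleotide to the 3' end."""
    return strand + base


def blunt(strand: str, templated_length: int) -> str:
    """Mimic 3'->5' exonuclease trimming of the strand back to its templated length."""
    return strand[:templated_length]


def second_stage_sense(five_prime_sense: str, templated_length: int,
                       three_prime_sense: str, overlap: int) -> str:
    """Sense strand made when the 5' product primes on the 3' product.

    Simplifying assumption: extension proceeds from the 3' end even when extra
    bases are mismatched, so those bases take the place of templated ones.
    """
    extra = len(five_prime_sense) - templated_length
    return five_prime_sense + three_prime_sense[overlap + extra:]


upstream, junction, downstream = "ATGGCC", "TTGACGGA", "CTGTAA"
five = upstream + junction     # sense strand of the 5' intermediate product
three = junction + downstream  # sense-strand sequence of the 3' intermediate product

clean = second_stage_sense(five, len(five), three, len(junction))
tailed = add_nontemplated_base(five)
mutant = second_stage_sense(tailed, len(five), three, len(junction))
repaired = second_stage_sense(blunt(tailed, len(five)), len(five), three, len(junction))

# Positions (0-based) at which the spliced product differs from the intended sequence.
diffs = [i for i, (x, y) in enumerate(zip(clean, mutant)) if x != y]
print(clean)
print(mutant)
print(diffs)
```

In this model the only altered position is the base immediately beyond the overlap, and trimming the overhang before the second stage restores the intended sequence.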
Table 2 Errors generated by the technique of SOE by PCR
Construct a | Errors in second-stage product b,c,d | Effect on amino-acid structure of mature protein d | Position within coding sequence or cDNA
C1 | DQA 513 C→A | Silent | α2 domain
C1 | DQA 601 A→G | Silent | transmembrane region
C2 | DQA −66 T→A | Silent | leader sequence
C2 | DQA 144 T→C | V→A | α1 domain
C2 | DRA 808 A→G | Silent | 3′ UT
C3 | DRA 93 T→A | Silent | α1 domain
C3 | Eα 3146 G→A | Silent | α2 domain
C3 | Eα 3240 G→C | Silent | α2 domain
C3 | Eα 3256 A→T | R→W | α2 domain
C4 | Eα 2412 G→A | Silent | α1 domain
C4 | Eα 2479 G→C | D→H | α1 domain
C4 | DRA 688 C→T | Silent | transmembrane region
C4 | 1 large deletion e | Silent | 3′ UT
C5 | None | None |
C6 | Point deletion Kᵏ 3834 C | Subsequent missense | α3 domain
C7 | Point deletion Kᵏ 3658 A | Subsequent missense | α3 domain
a Constructs are depicted in Fig. 2.
b The whole of the 3′ UT was not always sequenced.
c Alleles and numbering are as listed in the legend to Table 1.
d Amino acids are represented by the single-letter code.
e 0.4 kb deletion from position 861 of the DRA 3′ UT and position 6230 of M13.
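The errors in Table 2 can be tallied with a few lines of Python. The list below transcribes the table, reading each pair of bases as wild-type followed by mutant (an interpretation consistent with the T-to-C change described in the text for C2). The figure of 5 kb is the approximate aggregate coding region quoted in the Conclusions; the variable names are hypothetical, and the resulting frequency is only a rough illustration.

```python
from collections import Counter

# (error as listed in Table 2, effect on the mature protein)
TABLE2_ERRORS = [
    ("DQA 513 C>A", "Silent"),
    ("DQA 601 A>G", "Silent"),
    ("DQA -66 T>A", "Silent"),
    ("DQA 144 T>C", "V>A"),
    ("DRA 808 A>G", "Silent"),
    ("DRA 93 T>A", "Silent"),
    ("Ea 3146 G>A", "Silent"),
    ("Ea 3240 G>C", "Silent"),
    ("Ea 3256 A>T", "R>W"),
    ("Ea 2412 G>A", "Silent"),
    ("Ea 2479 G>C", "D>H"),
    ("DRA 688 C>T", "Silent"),
    ("1 large deletion", "Silent"),
    ("Point deletion Kk 3834 C", "Subsequent missense"),
    ("Point deletion Kk 3658 A", "Subsequent missense"),
]

# "Approximately 5 kb" of aggregate coding region, as quoted in the Conclusions.
AGGREGATE_CODING_KB = 5.0

tally = Counter(
    "silent" if effect == "Silent" else "altering"
    for _, effect in TABLE2_ERRORS
)
altering_per_kb = tally["altering"] / AGGREGATE_CODING_KB

print(f"silent: {tally['silent']}, sequence-altering: {tally['altering']}")
print(f"sequence-altering errors per kb of coding region: {altering_per_kb:.1f}")
```

The count of sequence-altering errors obtained this way matches the five changes that the Conclusions report as requiring reversal.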
3. Conclusions
(1) SOE by PCR proved to be an effective technique for providing adequate amounts of spliced DNA for cloning purposes, once the appropriate conditions had been established. Using this technique, it is possible to be more demanding in the positioning of splicing sites than if forced to rely on corresponding restriction sites.
(2) Our experience confirms that it is not always possible to achieve easily the appropriate conditions for successful SOE by PCR (Ito et al., 1991), despite the claim of 100% efficiency by Horton et al. (1989). We offer the suggestion of an intermediate step of asymmetric synthesis as an improvement on this technique where difficulties are encountered. Our protocol is much simpler than a previously reported similar technique, which relied on a larger number of reactions to splice smaller pieces of DNA and utilised much larger overlapping sequences (Ward et al., 1993).
(3) The major disadvantage of SOE by PCR compared with conventional shuffling techniques is its propensity to introduce mutations. With the advent of modern mutagenesis regimes, the reversal of these mutations is relatively simple. This also renders conventional splicing by the shuffling of restriction fragments more accessible if dependent on the introduction and removal of restriction sites. However, on the evidence of these seven splicing events, which required only five changes to be reversed across an aggregate coding region of approximately 5 kb, the number of mutageneses required is likely to be greater using the conventional system.
(4) The availability of thermostable DNA polymerases with a much lower tendency to add a non-templated nucleotide to DNA fragments (Hu, 1993) may further decrease the rate of mutation and the need to blunt-end intermediate products. However, our future strategy will use a PCR regime involving a smaller number of cycles: 20 cycles first stage, 15 cycles asymmetric expansion, 20 cycles second stage. Initial experience suggests that this regime works.
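The strand bias produced by the asymmetric step, and the effect of the shorter regime proposed in conclusion (4), can be sketched numerically in Python. The model assumes idealised, fully efficient cycling: with a single primer, only the strand complementary to that primer is copied, so the wanted strand accumulates linearly while its complement does not increase. The function and variable names are hypothetical, and real yields will be lower.

```python
def strands_after_asymmetric(duplexes: int, cycles: int) -> tuple[int, int]:
    """Return (wanted, unwanted) strand counts after single-primer cycling.

    Each cycle copies every original template strand once; the new strands
    cannot anneal to the single primer, so growth is linear, not exponential.
    """
    wanted = duplexes * (1 + cycles)
    unwanted = duplexes
    return wanted, unwanted


# (first-stage cycles, asymmetric cycles, second-stage cycles)
REGIMES = {
    "as used for C1": (30, 16, 30),
    "proposed": (20, 15, 20),
}

summary = {}
for name, (first, asym, second) in REGIMES.items():
    wanted, unwanted = strands_after_asymmetric(1, asym)
    summary[name] = {
        "total_cycles": first + asym + second,
        "strand_bias": wanted / unwanted,   # wanted : unwanted strands
        "ideal_fold_first_stage": 2 ** first,
    }
    print(name, summary[name])
```

Under these assumptions the proposed regime keeps almost the same strand bias while reducing the total number of cycles, which is the quantity expected to influence the accumulation of polymerase errors.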
References
Arnold, B., Burgert, H.-G., Archibald, A.L. and Kvist, S. (1984) Complete nucleotide sequence of H-2Kᵏ gene. Comparison of three H-2K locus alleles. Nucleic Acids Res. 12, 9473–9487.
Bell, J., Estess, P., St John, T., Saiki, R., Watling, D., Erlich, H. and McDevitt, H. (1985) DNA sequence and characterisation of human class II major histocompatibility complex β chain from the DR1 haplotype. Proc. Natl. Acad. Sci. USA 82, 3405–3409.
Brown, J.H., Jardetzky, T.S., Gorga, J.C., Stern, L.J., Urban, R.G., Strominger, J.L. and Wiley, D.C. (1993) Three-dimensional structure of the human class II histocompatibility antigen HLA-DR1. Nature 364, 33–39.
Clark, J., Joyce, C. and Beardsley, G. (1987) Novel blunt-end addition reactions catalysed by DNA polymerases. J. Mol. Biol. 198, 123–127.
Horton, R.M., Hunt, H.D., Ho, S.N., Pullen, J.K. and Pease, L.R. (1989) Engineering hybrid genes without the use of restriction enzymes: gene splicing by overlap extension. Gene 77, 61–68.
Hu, G. (1993) DNA polymerase-catalyzed addition of nontemplated extra nucleotides to the 3′ end of a DNA fragment. DNA Cell Biol. 12, 763–770.
Ito, W., Ishiguro, H. and Kurosawa, Y. (1991) A general method for introducing a series of mutations into cloned DNA using the polymerase chain reaction. Gene 102, 67–70.
Landais, D., Mattes, H., Benoist, C. and Mathis, D. (1985) A molecular basis for the Ia.2 and Ia.19 antigenic determinants. Proc. Natl. Acad. Sci. USA 82, 2930–2934.
Lee, J.S., Trowsdale, J., Travers, P.J., Carey, J., Grosveld, F., Jenkins, J. and Bodmer, W.F. (1982) Sequence of HLA-DRα chain cDNA clone and intron-exon organization of the corresponding gene. Nature 299, 750–752.
Mathis, D., Benoist, C., Williams, V., II, Kanter, M. and McDevitt, H. (1983) The murine Eα immune response gene. Cell 32, 745–754.
Messing, J., Gronenborn, B., Müller-Hill, B. and Hofschneider, P.H. (1977) Filamentous coliphage M13 as a cloning vehicle: insertion of a HindII fragment of the lac regulatory region in M13 replicative form in vitro. Proc. Natl. Acad. Sci. USA 74, 3642–3646.
Schiffenbauer, J., Didier, D., Klearman, M., Rice, K., Shuman, S., Tieber, V., Kittlesen, D. and Schwartz, B. (1987) Complete sequence of the HLA DQα and DQβ cDNA from a DR5/DQw3 cell line. J. Immunol. 139, 228–233.
Ward, R., Hawkins, N., Wakefield, D., Atkinson, K. and Biggs, J. (1993) Production of a functional single-chain Fv fragment from the pan leukocyte antibody WM65 using splicing by asymmetric PCR. Exp. Hematol. 21, 660–664.