19 PCR Optimization for Detection of Transgene Integration
Michael H. Irwin1, Wendy K. Pogozelski2 and Carl A. Pinkert1,3
1Department of Pathobiology, College of Veterinary Medicine, Auburn University, Auburn, AL
2Department of Chemistry, State University of New York at Geneseo, Geneseo, NY
3Department of Biological Sciences, College of Arts and Sciences, The University of Alabama, Tuscaloosa, AL
I. Introduction
This chapter is not intended to serve as a comprehensive review of the use of the polymerase chain reaction (PCR) for the identification of transgenic founder mice or their offspring. Numerous sources of information are available dealing with the theory, performance, and optimization of PCR analysis in a variety of applications (reviewed in Dawson et al., 1996; Nagy et al., 2002; Green and Sambrook, 2012). Rather, this brief overview is intended to address some of the general considerations for PCR analysis and some of the particular concerns when using this technique to detect transgene integration in experimental animals (especially laboratory mice). Amplification of DNA by PCR, first described in a revolutionary paper by Mullis and Faloona (1987), has made in vitro amplification of specific DNA target sequences both rapid and easy to perform. The impact of this technology on diagnostic screening of genomes for specific target sequences would be difficult to overstate. The utility of this technique is now taken for granted in virtually all laboratories involved in molecular cloning and other molecular biological technologies used to study the genomes of experimental animals and humans.
II. Discussion
A. General
For the purpose of detecting the integration of transgene constructs into mouse (or other animal) genomes, as described in Chapter 18, PCR analyses have proven invaluable in providing accurate, cost-effective, and timely data. If the proper safeguards are employed, PCR analyses can be very reliable in the initial identification of founder transgenic animals and in characterizing germ line transmission of genetic modifications to their offspring. This discussion will address various parameters and practical concerns of PCR analysis that relate to its use in the diagnostic testing of experimental animal genomic DNA samples for transgene integration.

Although PCR is indeed exceedingly useful for integration analysis, it does have limitations. For example, PCR detects integration of the transgene only and, in its basic form, provides no information on expression of the transgene. Reverse transcriptase-PCR (RT-PCR) can be very useful in detecting and quantifying gene expression; however, RT-PCR is somewhat more complicated than conventional/basic PCR and poses its own unique set of practical considerations (see Chapter 21). One should also bear in mind that PCR in its basic form is not intended to be a quantitative test, and that PCR results can be misleading when one attempts (wrongly) to extrapolate the number of copies of the transgene present in the genome of an experimental animal without more precise methodology (from quantitative real-time PCR to the various hybridization analyses outlined in Chapter 21). Additionally, when quantitative information is desired, real-time PCR or competitive PCR methods may also be employed, with specific advantages as described in Section II.C.

Transgenic Animal Technology. DOI: http://dx.doi.org/10.1016/B978-0-12-410490-7.00019-0 © 2014 Elsevier Inc. All rights reserved.
B. PCR Considerations
If the target template is not specific in relationship to the strain-specific genomic DNA, the PCR results will be of limited value. False positives resulting from a PCR of questionable specificity will invariably lead to significant expense related to maintaining and breeding animals that have no experimental value. Therefore, time spent optimizing and verifying the specificity of the PCR in the preliminary phases of an experiment will be critical. That said, optimization of any PCR analysis must initially focus on the specificity of the primers to the target sequence one wishes to amplify. Thus, primer design is the most fundamental consideration for the success of the PCR.

Many software packages are available today that simplify the design of primers for use in the PCR, for example, GeneFisher (http://bibiserv.techfak.uni-bielefeld.de/genefisher/); OligoPerfect™ Designer (http://tools.lifetechnologies.com/content.cfm?pageid=9716); Primer 3 (http://bioinfo.ut.ee/primer3-0.4.0/primer3/); and OLIGO (http://www.oligo.net/). Although such programs no doubt greatly simplify primer design, one can easily devise suitable primers without such programs. All of the primers used in our laboratory and national cores over the years were designed without the use of primer-design programs, and in some cases were shown to outperform primers designed using these programs.

For most purposes, the length of primers should be at least 15 base pairs. If a nonrepetitive sequence is used, the probability of an exact duplication of a 15 bp sequence elsewhere in the genome is very small. This probability obviously shrinks as the primer increases in length. However, a point of diminishing returns may be reached with larger primers, where the increased specificity is outweighed by the loss of sensitivity, as larger primers hybridize less efficiently than
smaller primers. As a rule of thumb, we routinely use primers in the range of 15–30 bp.

There are just a few practical considerations to keep in mind when designing primers for PCR without the use of specific software programs. First, the possibility of “primer-dimer” formation must be examined. These are small amplification products that result from hybridization of a given primer with itself or with the opposite direction primer. Although such species are rather easily identified on agarose gels due to their small size and are seldom confused with bona fide amplification products from the target sequence, they decrease the sensitivity of the PCR by diverting primers from the true target. Indeed, some reactions may show absolutely no amplification of the target sequence due to primer-dimer formation. By comparing the sequence of a given primer both to itself and to the opposite direction primer, one can easily identify stretches of the primer that may be prone to primer-dimer formation. In general, we avoid using primers that demonstrate complementarity of four or more bases, and we also reject primers that show complementarity of three bases more than once. Furthermore, even a three-base match is unacceptable if it occurs at the 3′ end of any primer (from which extension proceeds).

Another important consideration is the G + C ratio of each primer. A G + C ratio that is too high will result in more difficulty in strand separation during the denaturation step, while a low G + C ratio will necessitate a lower annealing temperature, which is equivalent to lower stringency, thus increasing the likelihood of nonspecific amplification products. Therefore, practically speaking, it is best to have primers with a G + C ratio in the range of 40–60%. In addition, although a primer may fit this criterion, stretches of continuous Gs and Cs within a region of the primer are also problematic and should be avoided.
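The manual checks described above (primer length, G + C ratio, and complementarity within and between primers) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the example sequences are hypothetical, the thresholds simply restate the rules of thumb given in the text, and the rule about three-base matches occurring more than once is not implemented.

```python
"""Sketch of manual primer checks; all names and sequences are illustrative."""

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in the primer (0.0 to 1.0)."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def longest_complementary_run(a: str, b: str) -> int:
    """Longest contiguous stretch of `a` that can base-pair (antiparallel)
    with `b`; computed as the longest common substring of `a` and the
    reverse complement of `b`."""
    x, y = a.upper(), reverse_complement(b)
    best = 0
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        cur = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best


def three_prime_clash(a: str, b: str, n: int = 3) -> bool:
    """True if the last `n` bases at the 3' end of `a` can pair with `b`."""
    return reverse_complement(a[-n:]) in b.upper()


def check_primer_pair(fwd, rev, min_len=15, max_len=30,
                      gc_min=0.40, gc_max=0.60, max_run=3):
    """Return a list of rule-of-thumb violations (empty list if none found)."""
    primers = {"forward": fwd, "reverse": rev}
    problems = []
    for name, p in primers.items():
        if not min_len <= len(p) <= max_len:
            problems.append(f"{name}: length {len(p)} outside {min_len}-{max_len}")
        gc = gc_fraction(p)
        if not gc_min <= gc <= gc_max:
            problems.append(f"{name}: G+C ratio {gc:.0%} outside 40-60%")
    # Complementarity of each primer with itself and with its partner.
    for n1, n2 in (("forward", "forward"), ("reverse", "reverse"),
                   ("forward", "reverse")):
        run = longest_complementary_run(primers[n1], primers[n2])
        if run > max_run:
            problems.append(f"{n1}/{n2}: {run}-base complementary stretch")
    # A three-base match involving a 3' end is flagged regardless of run length.
    for n1 in primers:
        for n2 in primers:
            if three_prime_clash(primers[n1], primers[n2]):
                problems.append(f"{n1}: 3' end can pair with {n2} primer")
    return problems


if __name__ == "__main__":
    # Hypothetical primer sequences, used only to show the call.
    for issue in check_primer_pair("ATGGCTAGCAAGGAGTTCACC",
                                   "TCAGTGCTTGACCTTGATGGC"):
        print(issue)
```

A screen of this kind is deliberately strict and does not replace an empirical test of the primers; it only automates the sequence comparisons that would otherwise be done by eye.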
Finally, it is also mandatory that both forward and reverse primers have very similar G + C ratios so that optimal reaction conditions for one primer will also be optimal for the opposite direction primer.

The length of the target sequence (the stretch of bases amplified by PCR) is also a practical concern. Amplification products less than about 200 bp sometimes appear diffuse in agarose gel electrophoresis. Conversely, long target sequences are amplified less efficiently. Again, as a general rule, we attempt to identify target sequences in the range of 200–1000 bp. It is a common mistake to attempt amplification of an entire (or nearly complete) transgene sequence, unless the transgene is relatively small. If one wishes to ascertain the complete integration of a large transgene, two separate PCRs may be employed to amplify each end of the transgene. But for accurate verification of the integration of the complete transgene, Southern blotting along with DNA sequence determination provides much more reliable data.

If PCR analysis is to be used to identify transgenic mice, appropriate oligonucleotide primers and specification of proven conditions for the reaction are recommended. Mouse DNA (strain-specific tissue samples; e.g., B6SJL F1 hybrid, C57BL/6, or FVB tissues, as appropriate) is needed for evaluating PCR specificity. One should complete the information below and, if desired, maintain archival photographs (originals or discernible copies) of the control reactions for troubleshooting protocols and results.

5′ primer name (≤12 characters): __________________________________________
5′ primer length (bp), molar conc., total conc.: ____________, ______________, _______________
3′ primer name (≤12 characters): __________________________________________
3′ primer length (bp), molar conc., total conc.: ____________, ______________, _______________
Length of PCR product (bp): _______
Amount of each primer per reaction (μl): 5′ _______ 3′ _______
Denaturation temp. (°C): _____ Denaturation time: _____
Annealing temp. (°C): _____ Annealing time: _____
Extension temp. (°C): _____ Extension time: _____
Cycles: _____
Other reaction conditions: pH ____, MgCl2 conc. ____, Taq conc. ____, dNTPs ___, KCl conc. ___

Reaction conditions (for diagnostic evaluation, primer sets should be tested with the following control amplifications):
1. Normal mouse DNA (designated NM; use the specific mouse strain appropriate to the project)
2. NM + 0.1 to 0.5 gene copy/cell equivalent of the DNA construct
3. NM + 1 gene copy/cell equivalent of the DNA construct
4. NM + 5 (or 10) gene copies/cell equivalent of the DNA construct
5. 1 gene copy/cell equivalent of the DNA construct (no NM DNA)
6. 5 (or 10) gene copies/cell equivalent of the DNA construct (no NM DNA)
7. Appropriate marker DNA
(Genomic conversion for DNA copies per cell: $6 \times 10^{9}$ base pairs per diploid genome.)

Marker DNA (lane 8 above): Type: ________ Concentration: _______ Volume (μL): ________
Reaction volume (μL): __________
Amount of mouse DNA per NM reaction (ng): _______
Amount of DNA used in 1 copy control (pg): _______

Figure 19.1 PCR analysis worksheet.

There are several tests one must perform prior to utilizing PCR analysis for detection of transgenic founder animals. First, the reaction should be tested for efficiency of amplification of the transgene itself, with no other DNA present. This is a good time not only to verify that the proper product is amplified but also to perform adjustments in various parameters such as buffer composition (especially in relation to magnesium concentration), cycling temperatures and times, and number of cycles required for optimal amplification (Figure 19.1). Next, it is imperative
that the reaction be tested using normal genomic DNA from the animal species/strain used in the study, with no transgene DNA added. This important step is included to verify that the reaction does not amplify any erroneous “target” sequences that might either affect the efficiency of the desired reaction or even lead to the generation of false positive signals. After the PCR has been optimized and tested to ensure no discernible reaction with normal genomic DNA, the sensitivity of the reaction must be assessed in a “real-world” situation. This means that the sensitivity and specificity of the PCR must be tested in the presence of normal
genomic DNA to which have been added known quantities (equivalent to 0.1 copy, 1 copy, 10 copies, and 100 copies) of the transgene construct (Figure 19.2). If a given PCR is unable to amplify the target sequence present at the level of one-copy equivalent, it will be useless in detecting experimental animals that have integrated only a single copy of the transgene. And although mosaics may be rare in transgenic animals produced by pronuclear microinjection, using a 0.1- to 0.5-copy control will increase confidence that the sensitivity of the reaction is sufficient to detect all of the founders thus produced. Controls of 10- and 100-copy equivalents are used in case the lower copy number controls do not amplify sufficiently, so that one at least has an idea of how much more optimization is needed to obtain the desired sensitivity.

Calculation of copy number controls (amount of construct-specific DNA) is beneficial in characterizing the sensitivity and specificity of the PCR. If whole vector (e.g., whole plasmid containing a given sequence) is used to spike normal mouse DNA (NMDNA), then one should use a ratio of construct to vector in the final calculation. Additionally, the strain-specific NMDNA used should reflect the host genome. The mouse diploid genome has a mass of approximately $6.42 \times 10^{-12}$ g. The amount of NMDNA used in the assay divided by this number gives the equivalent number of mouse diploid genomes. If, for example, a PCR assay was set up with 100 ng mouse DNA per sample,

$$100\ \text{ng} = 1 \times 10^{-7}\ \text{g}$$

$$\frac{1 \times 10^{-7}\ \text{g}}{6.42 \times 10^{-12}\ \text{g/diploid genome}} = 15{,}576\ \text{diploid genomes}$$

The size of the construct in bp multiplied by $1.07 \times 10^{-21}$ g/bp = the mass of the construct in grams. The mass of the construct multiplied by the number of diploid genome equivalents = the single gene copy equivalent. For example, if the construct is 5000 or 10,000 bp and we are using 100 ng of DNA in our PCR, then the mass of the construct is

$$5000\ \text{bp} \times (1.07 \times 10^{-21}\ \text{g/bp}) = 5.35 \times 10^{-18}\ \text{g}$$

$$10{,}000\ \text{bp} \times (1.07 \times 10^{-21}\ \text{g/bp}) = 10.70 \times 10^{-18}\ \text{g}$$

Therefore, for 100 ng (or $1.5576 \times 10^{4}$ genome equivalents), the single gene copy equivalent is:

For 5000 bp: $(5.35 \times 10^{-18}\ \text{g}) \times (1.5576 \times 10^{4}) = 8.33 \times 10^{-14}\ \text{g}$, or 0.0833 pg
For 10,000 bp: $(10.70 \times 10^{-18}\ \text{g}) \times (1.5576 \times 10^{4}) = 16.67 \times 10^{-14}\ \text{g}$, or 0.1667 pg

Figure 19.2 Calculating gene copy/genome equivalent.
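The arithmetic of Figure 19.2 is easy to script when preparing the 0.1-, 1-, 10-, and 100-copy controls. The following is a minimal sketch that uses the two constants given in the figure; the function names are hypothetical, and the construct-to-vector correction mentioned in the figure for whole-plasmid spikes is not included.

```python
# Constants taken from Figure 19.2.
DIPLOID_GENOME_MASS_G = 6.42e-12   # approximate mass of one mouse diploid genome (g)
MASS_PER_BP_G = 1.07e-21           # mass of one base pair of DNA (g)


def genome_equivalents(mouse_dna_ng: float) -> float:
    """Number of diploid genomes contained in the given mass of mouse DNA."""
    return (mouse_dna_ng * 1e-9) / DIPLOID_GENOME_MASS_G


def spike_mass_pg(construct_bp: int, mouse_dna_ng: float,
                  copies_per_genome: float = 1.0) -> float:
    """Mass of construct (pg) equal to `copies_per_genome` copies per diploid
    genome in `mouse_dna_ng` of normal mouse DNA."""
    construct_mass_g = construct_bp * MASS_PER_BP_G
    total_g = construct_mass_g * genome_equivalents(mouse_dna_ng) * copies_per_genome
    return total_g * 1e12  # grams to picograms


if __name__ == "__main__":
    # Control series from the text for a 5000 bp construct in 100 ng of DNA.
    for copies in (0.1, 1, 10, 100):
        print(f"{copies:>5} copy equivalent: {spike_mass_pg(5000, 100, copies):.4f} pg")
```

With the default of one copy per genome, the function reproduces the two worked values in Figure 19.2 (0.0833 pg for a 5000 bp construct and 0.1667 pg for a 10,000 bp construct in 100 ng of mouse DNA).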
It is wise to always include these controls, including a zero-copy (negative) control both with and without normal genomic DNA present, in every subsequent PCR so that one may have a high level of confidence in the results obtained. One final concern related to the PCR of genomic DNA sequences involves the method of DNA extraction and purification utilized. Most techniques for preparation of genomic DNA involve the use of a protease (e.g., proteinase K) in the initial digestion of tissue samples before DNA extraction. Unfortunately, proteases are very capable of digesting the thermostable polymerase used in the amplification reaction. Therefore, extreme care should be exercised during the DNA extraction procedure to eliminate carryover of proteases. This also includes recognition of proteases normally present in the tissue used as the source of DNA. This problem is of particular concern when using a standard organic extraction procedure (e.g., phenol:chloroform extraction) where pipetting the liquid phase containing DNA can introduce proteins (including proteases) contained in the interphase. It is best when using such protocols to sacrifice a portion of the aqueous DNA-containing phase rather than attempt to recover it all and inadvertently carry over proteases from the interphase. Using one of several widely available DNA purification kits will eliminate this problem.
C. Quantitative (Real-Time) PCR
Quantitative PCR (qPCR) methods are gaining considerable popularity in transgene analyses due to their power as quantitation tools. PCR techniques have readily replaced labor-intensive blotting procedures, and qPCR has become commonplace since the last edition of this chapter. What distinguishes qPCR from conventional methods is that the quantity of amplification product is measured as it is being made (after each extension), rather than at the completion of all the cycles (endpoint determination). Two approaches can be used for quantifying PCR products via qPCR.

The first is TaqMan PCR (Holland et al., 1991; Heid et al., 1996), which relies on a polymerase with exonuclease activity and a sequence-specific probe that anneals to a region between the forward and reverse primer sites. The probe contains a fluorescent dye at the 5′ end to serve as the reporter, and a quencher dye is tagged to the 3′ end. The emission spectrum of the reporter overlaps the absorption spectrum of the quencher. As long as the reporter and quencher are in close proximity (that is, as long as the probe is intact), Förster energy transfer occurs and no fluorescence is detected. As the polymerase begins extension from the primer, it encounters the probe in its path and hydrolyzes the probe (hence, the analogy with the classic video game Pac-Man). Hydrolysis separates the reporter from the quencher, reducing energy transfer and enabling fluorescence to occur. Fluorescence increases proportionally each time the probe is hydrolyzed. Eventually, fluorescence reaches a detectable threshold; the cycle at which this occurs is called the threshold cycle, or Ct. This value, representing the number of amplification cycles required for positive fluorescence, is inversely related to the logarithm of the amount of target in the initial sample: the more target present at the start, the lower the Ct.

One requirement of the TaqMan approach is that the sequence between the forward and reverse primers must be known. When the sequence information is not
available, or when a researcher wishes to prescreen and avoid the expense of labeled probes, the fluorescent dye SYBR Green can be used (Schneeberger et al., 1995). This dye binds to the minor groove of double-stranded DNA; therefore, the assay may not be specific for the target sequence. To counteract this problem, researchers often perform a melting point analysis to determine if the fluorescence data correspond to those from the target sequence.

Most real-time systems are purchased as a unit that contains a thermal cycler attached to a light source and fluorescence reader. The data are fed directly into a computer and analyzed with specialty software; however, this specialized equipment is not strictly necessary. A researcher can obtain real-time data by removing aliquots from a standard thermal cycler at various times during amplification. Of course, lack of automation does risk producing data that are less reproducible, and the specialized systems are far more convenient to use. qPCR systems can be purchased with either lasers or tungsten lamps as light sources. The laser systems are more expensive. Although the bandwidths are wider with the tungsten light sources, the data obtained with the tungsten lamps are, in our experience, as acceptable as those produced by the laser systems.

Quantification assays require reproducibility in setting up the reactions. To minimize pipetting variations, researchers often use master mixes. These generally include nucleotides, the polymerase, cations, and a dye such as ROX (6-carboxy-X-rhodamine) for an internal reference. Labeled probes can be purchased from many manufacturers. Typical fluorescent reporter labels are TET (tetrachloro-6-carboxyfluorescein, λmax(emission) = 538 nm), FAM (6-carboxyfluorescein, λmax = 518 nm), JOE (λmax = 554 nm), and VIC (λmax = 554 nm), while a typical quencher dye is TAMRA (6-carboxytetramethylrhodamine, λmax = 582 nm).
Probe design requires a set of considerations similar to that for primer design. Software programs such as Primer Express (Applied Biosystems, Foster City, CA, http://www.appliedbiosystems.com) can facilitate probe selection. A given probe must be specific and should avoid regions that dimerize or form secondary structures. There are several differences between primers and probes. The probe Tm should be about 10 °C higher than that of the primers (assuming a primer Tm in the region of 58–60 °C) and the sequence should avoid a G at the 5′ end. In addition, probes should be highly purified (e.g., high-performance liquid chromatography (HPLC) purification) to avoid carryover of residual contaminants including quencher dyes.

qPCR is amenable to multiplex analysis; that is, two or more products can be monitored simultaneously in the same sample by using different reporter dyes. The dyes must be selected with care, so that the spectra do not overlap. FAM and JOE or FAM and VIC are good choices in this respect. Multiplex analysis requires extra caution, however. If one target is initially present in a much greater amount, it will deplete reagents, thereby altering the amplification kinetics of the less abundant target. This situation can be avoided by prior optimization and by using conditions in which one set of primers is limiting.

Quantification can be either absolute or relative. For absolute quantification, the researcher generally constructs a set of standard curves for the target whose copy
number is desired and for a control (either external or internal) whose copy number is known. Typical internal controls are the β-actin gene, the 18S rRNA gene, and the glyceraldehyde 3-phosphate dehydrogenase gene. The threshold cycle for a dilution series, plotted against the log of the copy number, will yield a straight line, facilitating extrapolation and interpolation.

It is important that one takes extreme care to avoid contamination of reagents. The technique is so sensitive that trace amounts of template in a reagent can cause large problems. Assays should always include no-template controls to test for reagent contamination. If RT-PCR is used, the procedure should also include a no-amplification control to test for contamination of RNA by DNA.

The high sensitivity of qPCR makes it especially useful in transgene analysis. The technique can identify and quantify insertion events and can indicate the stability of an insert over time (Norris et al., 2000; Ingham et al., 2001). One important use is in determination of copy number and zygosity (whether an organism is heterozygous/hemizygous or homozygous for a targeted allele), particularly with low-copy integrations and differentiation from wild-type offspring when endogenous genomic sequences are introduced into the genome (Becker et al., 1999). qPCR can also be used to assess the amount of viral vector in virus-mediated gene transfer modeling (Kozlowski et al., 2001; Wang et al., 2001). Finally, as outlined in Chapter 21, another use is to combine quantitative methods with RT-PCR of mRNA products to monitor gene expression (Fairman et al., 1999).
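The standard-curve step described in this section (threshold cycle plotted against the log of the copy number, then used for interpolation) can be sketched as an ordinary least-squares fit. The example below is illustrative only: the function names are hypothetical, the Ct values are synthetic numbers invented to show the calculation rather than measurements, and the efficiency formula is the commonly used estimate from the slope, which is not taken from this chapter.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares line Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Interpolate the starting copy number of an unknown from its Ct."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """Commonly used estimate: 1.0 corresponds to perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


if __name__ == "__main__":
    # Synthetic tenfold dilution series (hypothetical values, not real data).
    standards = [1e1, 1e2, 1e3, 1e4, 1e5]
    cts = [35.0, 31.7, 28.4, 25.1, 21.8]
    slope, intercept = fit_standard_curve(standards, cts)
    print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
    print(f"efficiency = {amplification_efficiency(slope):.1%}")
    print(f"unknown with Ct 30.0: {copies_from_ct(30.0, slope, intercept):.0f} copies")
```

The negative slope reflects the inverse relationship between Ct and starting template: each tenfold increase in copy number lowers the Ct by roughly 3.3 cycles when amplification is close to a doubling per cycle.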
III. Summary
In routine nucleic acid analyses, DNA samples are qualitatively and/or quantitatively analyzed first by the PCR, and the results are subsequently verified by Southern hybridization and expression analyses. The PCR procedure can be completed within one day, while the more labor- and cost-intensive procedures may take 3–7 days to complete. The invention of the PCR utilized a common sense approach that relied on some knowledge of how DNA behaves along with the curious existence of a class of DNA polymerases that maintain function at high temperature. Indeed, the basic concept is so straightforward that many of us were left wondering, “Why didn’t I think of that?” It is therefore not surprising that common sense is the most important consideration in designing and optimizing PCRs useful in specific applications and in interpreting data obtained from this ubiquitous and powerful technique.
References
Becker, K., Pan, D., Whitley, C.B., 1999. Real-time quantitative polymerase chain reaction to assess gene transfer. Hum. Gene Ther. 10, 2559–2566.
Dawson, M.T., Powell, A., Gannon, F., 1996. Gene Technology. Bios Scientific Publishers, Oxford.
Fairman, J., Roche, L., Pieslak, I., Lay, M., Corson, S., Fox, E., et al., 1999. Quantitative RT-PCR to evaluate in vivo expression of multiple transgenes using a common intron. Biotechniques 27 (3), 566–570, 572–574.
Green, M.R., Sambrook, J., 2012. Molecular Cloning: A Laboratory Manual, fourth ed. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Heid, C.A., Stevens, J., Livak, K.J., Williams, P.M., 1996. Real time quantitative PCR. Genome Res. 6, 986–994.
Holland, P.M., Abramson, R.D., Watson, R., Gelfand, D.H., 1991. Detection of specific polymerase chain reaction product by utilizing the 5′→3′ exonuclease activity of Thermus aquaticus DNA polymerase. Proc. Natl. Acad. Sci. USA 88, 7276–7280.
Ingham, D.J., Beer, S., Money, S., Hansen, G., 2001. Quantitative real-time PCR assay for determining transgene copy number in transformed plants. Biotechniques 31 (1), 132–134, 136–140.
Kozlowski, D.A., Bremer, E., Redmond Jr., D.E., George, D., Larson, B., Bohn, M.C., 2001. Quantitative analysis of transgene protein, mRNA, and vector DNA following injection of an adenoviral vector harboring glial cell line-derived neurotrophic factor into the primate caudate nucleus. Mol. Ther. 3, 256–261.
Maudru, T., Peden, K.W., 1998. Adaptation of the fluorogenic 5′-nuclease chemistry to a PCR-based reverse transcriptase assay. Biotechniques 25, 972–975.
Mullis, K.B., Faloona, F.A., 1987. Specific synthesis of DNA in vitro via a polymerase-catalyzed chain reaction. Meth. Enzymol. 155, 335–350.
Nagy, A., Gertsenstein, M., Vintersten, K., Behringer, R., 2002. Manipulating the Mouse Embryo: A Laboratory Manual, third ed. Cold Spring Harbor Press, Cold Spring Harbor, NY.
Norris, M.D., Burkhart, C.A., Marshall, G.M., Weiss, W.A., Haber, M., 2000. Expression of N-myc and MRP genes and their relationship to N-myc gene dosage and tumor formation in a murine neuroblastoma model. Med. Pediatr. Oncol. 35, 585–589.
Schneeberger, C., Speiser, P., Kury, F., Zeillinger, R., 1995. Quantitative detection of reverse transcriptase-PCR products by means of a novel and sensitive DNA stain. PCR Methods Appl. 4, 234–238.
Wang, K., Pesnicak, L., Guancial, E., Krause, P.R., Straus, S.E., 2001. The 2.2-kilobase latency-associated transcript of herpes simplex virus type 2 does not modulate viral replication, reactivation, or establishment of latency in transgenic mice. J. Virol. 75, 8166–8172.