ANALYTICAL BIOCHEMISTRY 237, 109–114 (1996)
ARTICLE NO. 0207
Cloning Differentially Expressed Genes by Linker Capture Subtraction
Meiheng Yang and Arthur J. Sytkowski
Laboratory for Cell and Molecular Biology, Division of Hematology and Oncology, New England Deaconess Hospital, Department of Medicine, Harvard Medical School, Boston, Massachusetts 02215
Received December 8, 1995
We have developed a simple and effective method, designated linker capture subtraction (LCS), for cloning differentially expressed genes between two cell types or between cells treated in two different ways. In the first step of the method, two mRNA populations are converted to double-stranded cDNAs, fragmented, and ligated to linkers for PCR amplification. In the second step, the linkered DNA (tester) from one mRNA population is hybridized to an excess of the unlinkered DNA (driver) from the other mRNA population, followed by incubation with mung bean nuclease, which digests single-stranded DNA specifically. This leaves only tester–tester homohybrids to be amplified by PCR in the following step, so as to achieve an enrichment of tester-specific sequences. The amplified PCR products are then used as tester for another round of subtraction. The process of subtraction is carried out three times, and the final PCR products are inserted into a vector for clonal analysis. We have used the strategy to begin to clone and identify the genes expressed differentially between the human prostate cancer cell lines LNCaP and PC-3, which have different tumorigenic and metastatic potentials. We demonstrated strong enrichment of target sequences. We also report the identities of two of the genes expressed differentially in these cell lines. One is prostate-specific antigen (PSA), which is known to be expressed in LNCaP but not in PC-3. The other is vimentin, the differential expression of which has not been reported previously in these prostate cancer cells. © 1996 Academic Press, Inc.
The isolation and identification of differentially expressed genes is of great importance in the study of embryogenesis, cell growth and differentiation, and neoplastic transformation. A variety of methods have been employed to achieve this end. They include differential screening of cDNA libraries with selective
probes, subtractive hybridization utilizing DNA/DNA hybrids or DNA/RNA hybrids, RNA fingerprinting, and differential display (1–5). Recently, PCR-coupled subtractive processes have been reported (6–12). Each of these methods has achieved some success and each has some inherent limitations. Differential display (5) has problems of “false positives,” redundancy, and underrepresentation of certain mRNA species. cDNA representational difference analysis (cDNA–RDA) (12) is a labor-intensive process, and its efficiency remains to be evaluated. We have sought to overcome some of these problems and have now developed a method designated linker capture subtraction (LCS).¹ This method is related operationally to RDA (10), that is, subtraction coupled to PCR amplification. However, it does not rely on a kinetic mechanism of enrichment as does RDA. Rather, it achieves enrichment by specifically preserving PCR-priming sites of target sequences, using mung bean nuclease as the mediator. Also, it is a much less labor-intensive process. We have applied LCS to the human prostate cancer cell lines LNCaP and PC-3, which have different tumorigenic and metastatic potentials. It has resulted in the rapid and effective isolation of genes expressed differentially between the two cell lines.
MATERIALS AND METHODS
Cell Culture and cDNA Preparation
Human prostate cancer cell lines LNCaP and PC-3 (American Type Culture Collection, Rockville, MD) were cultured in RPMI 1640 medium with 10% fetal bovine serum in 95% air/5% CO2 at 37°C. Total RNA was isolated by a guanidinium thiocyanate/phenol method (13). Poly(A)+ RNA was selected with oligo(dT)25–Dynabeads (Dynal Inc., Lake Success, NY). cDNA was synthesized from 2 µg of poly(A)+ RNA using a SuperScript Choice System (GIBCO, Gaithersburg, MD) according to the manufacturer’s instructions. Oligo(dT)12–18 was used to prime first-strand cDNA synthesis.
¹ Abbreviations used: LCS, linker capture subtraction; AP, amplification primer; PSA, prostate-specific antigen.
0003-2697/96 $18.00 Copyright © 1996 by Academic Press, Inc. All rights of reproduction in any form reserved.
Restriction Enzyme Digestion, Linker Ligation, and PCR Amplification
The double-stranded cDNA was digested with AluI and RsaI and then ligated with a double-stranded oligodeoxynucleotide linker, which had a blunt end and a 2-base 3′ protruding end:
  ACTCTTGCTTGGACGAGCTCT
ACTGAGAACGAACCTGCTCGAGA-p
The linker contained an AluI/SacI site near the blunt end (GAGCTC, which includes the AluI site AGCT). The top strand was designated the amplification primer (AP). The bottom strand was phosphorylated at the 5′ end. The linker was prepared by annealing the two strands: an equal mass of each of the two oligodeoxynucleotides was combined, and the mixture was heated to 90°C for 2 min and then allowed to cool to room temperature. The ligation was carried out by mixing 1 µg of cut cDNA, 5 µg of linker, 1× ligation buffer (Stratagene, La Jolla, CA), and 4 Weiss units of T4 DNA ligase (Stratagene) in a volume of 10 µl and incubating at 8°C for 20 h. The reaction mixture was electrophoresed through a 2% low-melt agarose gel to remove the unligated linkers. The linker-ligated cDNA fragments in the size range of 0.1–1.0 kb were collected. Linker-ligated cDNA fragments in agarose were amplified directly by PCR using AP as primer. The reaction (100 µl) contained 10 mM Tris–HCl, pH 8.9, 50 mM KCl, 0.1% Triton X-100, 200 µM dNTPs, 1 µM AP, 2 mM MgCl2, 1 µl of melted agarose, and 5 U Taq polymerase (Promega), and was run for 30 cycles (94°C, 1 min; 55°C, 1 min; 72°C, 1 min). The amplified cDNA fragments were purified using a Gene-Clean kit (Bio101, Vista, CA) and were used as the initial material for subtractive hybridization.
Subtractive Hybridization
Twenty micrograms of PCR-amplified driver DNA was digested with AluI (50 U) at 37°C for 2 h, followed by SacI (50 U) for 1 h, to cleave the linker so that driver DNA could not be amplified later.
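The linker design and the driver digestion described above can be checked with a few lines of code. The following is a minimal sketch, not part of the published protocol. It assumes that the bottom strand is printed 3′ to 5′ (consistent with the 5′ phosphate shown at its right-hand end) and that the cDNA is joined at the blunt end. The variable and function names are illustrative only; the recognition sites used are the standard ones for SacI (GAGCT^C) and AluI (AG^CT).

```python
# Sketch: sanity-check the linker strands exactly as printed in the text.
TOP = "ACTCTTGCTTGGACGAGCTCT"            # amplification primer (AP), read 5'->3'
BOTTOM_3TO5 = "ACTGAGAACGAACCTGCTCGAGA"  # bottom strand as printed (assumed 3'->5')

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq: str) -> str:
    """Return the base-by-base complement of seq without reversing it."""
    return seq.translate(_COMPLEMENT)


# The bottom strand is longer; its extra bases form the 3' protruding end.
overhang_len = len(BOTTOM_3TO5) - len(TOP)
overhang = BOTTOM_3TO5[:overhang_len]
strands_pair = complement(TOP) == BOTTOM_3TO5[overhang_len:]

# Positions (0-based) of the restriction sites on the top strand.
saci_start = TOP.find("GAGCTC")  # SacI cuts GAGCT^C
alui_start = TOP.find("AGCT")    # AluI cuts AG^CT (blunt)

# Top-strand linker bases that stay on the cDNA side after each cut,
# assuming the cDNA is ligated to the blunt (right-hand) end.
left_after_alui = len(TOP) - (alui_start + 2)
left_after_saci = len(TOP) - (saci_start + 5)

if __name__ == "__main__":
    print("strands pair over the AP length:", strands_pair)
    print("3' protruding bases:", overhang, f"({overhang_len} nt)")
    print("SacI site starts at index", saci_start)
    print("AluI site starts at index", alui_start)
    print("linker bases left after AluI:", left_after_alui)
    print("linker bases left after SacI:", left_after_saci)
```

Under the stated reading direction, the two strands pair over the full length of AP with a 2-base overhang, and AluI digestion leaves only 4 linker bases on a driver fragment, too few to act as a priming site for the 21-base AP.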
After digestion, the products were purified using Gene-Clean. The digested driver DNA (2.5 µg) and nondigested tester DNA (0.1 µg) were mixed, vacuum-dried, and redissolved in 4 µl of a buffer containing 15 mM N-(2-hydroxyethyl)piperazine-N′-(3-propanesulfonic acid) (EPPS), pH 8.0, with 1.5 mM EDTA, overlaid with mineral oil, and denatured by heating for 5 min at 100°C. One microliter of 5 M NaCl was added, and the DNA was hybridized for 20 h at 67°C. After hybridization,
20 µl of pH-shift buffer A (1 mM ZnCl2, 10 mM Na acetate, pH 5.0) was added and the solution was divided into five aliquots. These were incubated with 0, 0.85, 1.75, 3.5, or 7 U of mung bean nuclease (Promega), respectively, at 37°C for 30 min. To each sample, 80 µl of pH-shift buffer B (10 mM Tris–HCl, pH 8.9, 50 mM KCl, and 0.1% Triton X-100) was added. The samples were heated (95°C, 5 min) to inactivate the mung bean nuclease. Then 20 µl of enzyme solution (10 mM Tris–HCl, pH 8.9, 50 mM KCl, 0.1% Triton X-100, 1 mM dNTPs, 5 µM AP, 10 mM MgCl2, and 5 U Taq polymerase (Promega)) was added. The PCR was run under the same conditions as above. Each sample was electrophoresed on a 2% agarose gel. The sample with the most abundant products of 0.1–1.0 kb was selected as tester for another round of subtraction. The above process was repeated twice with 2.5 µg of driver DNA and 0.025 µg of tester DNA. To test for enrichment of target sequences, PCR products derived from subtraction cycles 0–3 were electrophoresed on 4% NuSieve agarose (FMC, Rockland, ME), transferred to GeneScreen Plus membrane (Dupont/NEN, Boston, MA), and probed with the random-labeled PCR products (with linkers removed) of the third round of subtraction (see Results).
Construction of Subtractive Library and Clonal Analysis
After three rounds of subtraction, the PCR-amplified products were purified (Gene-Clean), digested with SacI, inserted into dephosphorylated pGEM-7Zf(+) (Promega) at the SacI site, and transformed into competent Escherichia coli JM109 cells. We prepared two subtractive libraries in this way: LNCaP (tester)/PC-3 (driver) = “L-P,” and PC-3 (tester)/LNCaP (driver) = “P-L.” Forty-eight white colonies from each library were picked randomly and inoculated into LB + Amp medium in individual wells of a 96-well plate. Two replica DNA dot-blots were prepared on GeneScreen Plus filters using 25 µl of bacterial cells per well.
The replica dot-blots were processed according to Brown and Knudson (14) and probed with random-labeled driver DNAs from LNCaP and PC-3, respectively. Candidate-positive colonies were boiled for 5 min in 20 µl H2O and centrifuged. DNA in the supernatant was amplified by PCR using universal vector primers T7 and SP6 for 20 cycles of 94°C, 1 min; 55°C, 1 min; 72°C, 1 min. The PCR products were electrophoresed on 2% agarose. The desired bands were excised and purified (Gene-Clean). The products were subjected to direct DNA sequencing (15) and were employed to prepare probes for Northern blot analyses.
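The tester and driver amounts given for the three subtraction rounds suggest how strongly each round could enrich tester-specific sequences. The sketch below is an idealized illustration and not a result from this work. It assumes complete hybridization, random strand pairing, and a common sequence that is equally represented per unit mass in tester and driver. Only tester–tester duplexes are counted as amplifiable, following the selection principle of the method. All function and variable names are hypothetical.

```python
# Idealized upper bound on per-round enrichment, using the masses in the protocol.
ROUNDS = [
    # (tester_ug, driver_ug)
    (0.1, 2.5),    # first round
    (0.025, 2.5),  # second round
    (0.025, 2.5),  # third round
]


def homohybrid_fraction(tester: float, driver: float) -> float:
    """Fraction of tester strands whose partner is another tester strand.

    Assumes random pairing among all complementary strands of one sequence.
    """
    return tester / (tester + driver)


def round_enrichment(tester_ug: float, driver_ug: float) -> float:
    """Enrichment of a tester-only sequence relative to a common sequence."""
    specific = homohybrid_fraction(tester_ug, 0.0)      # no driver copies exist
    common = homohybrid_fraction(tester_ug, driver_ug)  # driver competes
    return specific / common


if __name__ == "__main__":
    cumulative = 1.0
    for number, (tester_ug, driver_ug) in enumerate(ROUNDS, start=1):
        fold = round_enrichment(tester_ug, driver_ug)
        cumulative *= fold
        print(f"round {number}: about {fold:.0f}-fold (cumulative {cumulative:.0f}-fold)")
```

Under these assumptions the 25:1 and 100:1 driver-to-tester mass ratios correspond to roughly 26-fold and 101-fold enrichment per round. Real enrichment will be lower because hybridization is incomplete, PCR amplifies fragments unequally, and some driver may retain linkers.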
RESULTS
Experimental Strategy
This method is designed to isolate genes expressed differentially between two cell types or between cells treated in two different ways (Fig. 1). In the first step, both tester DNA and driver DNA are prepared. This is accomplished by digesting the double-stranded cDNA with restriction enzymes of choice, ligating the fragments to linkers, and carrying out a PCR with the linker sequence as primer. The driver DNA is digested with restriction enzymes to remove the linker sequence. In the second step, the linkered tester DNA is hybridized to an excess of driver DNA (with linkers removed), followed by incubation with mung bean nuclease, which digests single-stranded DNA specifically. This leaves only linkered tester–tester homohybrids and unlinkered homo- and heterohybrids. In the following step, the linkered tester–tester homohybrids are amplified by PCR with the linker sequence as primer to complete the first round of enrichment. The amplified PCR products are then used as tester for another round of subtraction. The process of subtractive hybridization, mung bean nuclease digestion, and PCR amplification is carried out three times. Finally, the PCR products of the third round of subtraction are used to prepare a subtraction library by inserting them into a vector.
FIG. 1. Schematic illustration of linker capture subtraction.
Cloning and Analysis of Differentially Expressed Genes between the Human Prostate Cancer Cell Lines LNCaP and PC-3
We have used the strategy to begin to clone and identify the genes expressed differentially between the human prostate cancer cell lines LNCaP and PC-3, which have different tumorigenic and metastatic potentials. After three cycles of subtraction, the PCR products were cleaved with SacI, inserted into pGEM-7Zf(+), and transformed into E. coli JM109 cells.
Figure 2A shows the electrophoretic analysis of the PCR-amplified DNA derived from subtraction cycles 0–3. The original unsubtracted DNAs from LNCaP (lane L-P, 0) and PC-3 (lane P-L, 0) moved as a smear between 0.1 and 1.0 kb. As subtraction rounds were performed, distinct bands were seen (lanes L-P, 1–3; P-L, 1–3). The intensity and resolution of these bands increased progressively with successive subtraction. When labeled PCR products of the third round of subtraction were electrophoresed on a 6% sequencing gel, 50–60 bands could be seen (not shown). DNA from the agarose gel of Fig. 2A was transferred to GeneScreen Plus membrane and probed with the labeled PCR products of the third round of subtraction, L-P, 3 (Fig. 2B) or P-L, 3 (Fig. 2C). The results indicate strong enrichment of differentially expressed sequences.
After three rounds of subtraction, the PCR-amplified products were inserted into pGEM-7Zf(+) and transformed into E. coli JM109 cells. We randomly picked 48 white colonies from each of the libraries and grew them in LB medium in individual wells of a 96-well plate. Two replica DNA dot-blots were prepared and probed with the labeled driver DNAs from LNCaP (lane P-L, 0) and PC-3 (lane L-P, 0), respectively. A comparison of the hybridization intensity of a clone on the two replica membranes revealed the relative abundance of the transcript in the two cell types. Over two-thirds of the selected clones demonstrated significant differences in abundance. We tested clones further with Northern blot and sequence analyses.
From 78 colonies, 15 distinct clones were identified which correspond to mRNAs expressed differentially between the LNCaP and PC-3 cell lines. The extent of differential expression ranged from severalfold to >100-fold. In addition to five novel genes, the identified genes included some very interesting known genes which are or may be involved in signal transduction, tumor growth, tumor invasion, and metastasis. A Northern blot exemplifying differential expression of two genes is shown in Fig. 3. DNA sequence analyses demonstrated that the LNCaP-specific gene is prostate-specific antigen (PSA), which is known to be expressed in LNCaP but not in PC-3 (16). The PC-3-specific gene was found to be vimentin, the differential expression of which has not been reported previously in these prostate cancer cells.
FIG. 2. Enrichment of specific sequences from LNCaP (L-P) and PC-3 (P-L) cell lines. (A) Ten microliters of PCR reaction mixture (lanes 0–3) was electrophoresed on 4% NuSieve agarose gel. Lane M, 100-bp markers. Lane L, 20 µl of final PCR product L-P,3. Lane P, 20 µl of final PCR product P-L,3. (B) Effect of subtraction cycles on enrichment of LNCaP-specific sequences. DNA shown in A was blotted onto a GeneScreen Plus filter and was probed with radiolabeled PCR product of L-P,3. (C) Effect of subtraction cycles on enrichment of PC-3-specific sequences. Filter shown in B was stripped and reprobed with radiolabeled PCR product of P-L,3.
DISCUSSION
We have described a new method, linker capture subtraction, applicable to the identification and isolation of genes expressed differentially between similar cell types. LCS offers several important features. First, LCS is highly effective. We achieved a strong stepwise enrichment of target sequences as shown in Fig. 2.
When the PCR-amplified products of the third round of subtraction were cloned, 81% of randomly picked clones (78 of 96 colonies) corresponded to mRNAs expressed differentially between the LNCaP and PC-3 cell lines. Such a high efficiency has not been reported by others, and it obviates the need to screen the subtractive library by differential hybridization, an essential step in some other methods. Second, LCS is simple to carry out and contains fewer steps than other methods. In particular, a variety of labor-intensive and potentially error-prone physical partitioning steps, such as biotinylation or repeated phenol extraction/ethanol precipitation, were eliminated. In LCS, all the steps of subtractive hybridization, mung bean nuclease digestion, and PCR amplification can be performed in one PCR tube, which makes the process easy to perform and amenable to automation. Third, LCS is a fast and economical process. The materials required are kept to a minimum. The procedure, from isolation of mRNA to construction of the subtractive library, can be completed within 1 week.
FIG. 3. Linker capture subtraction reveals differential expression of prostate-specific antigen (PSA) and vimentin in LNCaP (L) and PC-3 (P) cells. Northern analysis. Total RNA from either cell line was electrophoresed and probed with radiolabeled cDNA of two differentially expressed clones isolated by LCS. Top, autoradiograms; bottom, ethidium bromide-stained gels.
Unlike RDA (10), which uses a kinetic mechanism of enrichment, LCS achieves enrichment by specifically preserving PCR-priming sites (linkers) of target sequences. Mung bean nuclease plays a central role in the process. It removes the linkers of all linkered sequences except tester–tester homohybrids. The nuclease also digests single-stranded DNA in the hybridization solution, an abundant species that might otherwise cause high background or even failure of enrichment. The use of the enzyme appears to be more reliable and efficient than physical partitioning methods, as indicated by the high enrichment of target sequences and efficient isolation of numerous differentially expressed genes in our experiments. Exonuclease VII, also specific for ssDNA, could be employed in the LCS protocol. It has the added advantage of a pH optimum near those of the subtractive hybridization and PCR, thus eliminating the need for pH-shift buffers in the process. The digestion of double-stranded cDNA with different restriction enzymes gives a representation of the mRNA population. In the example of linker capture subtraction presented here, we employed AluI and RsaI. Obviously, the use of other enzymes would give different representations and may result in the isolation of genes different from those achievable using AluI and RsaI. Also, the use of different PCR conditions such as additional concentrations of magnesium, different annealing temperatures, and the addition of reagents
such as DMSO, formamide, or glycerol would achieve other representations of the mRNA population of the cells under study. Moreover, thermostable DNA polymerases from different vendors may also give different representations because these enzymes appear to have different efficiencies in amplifying large fragments.
We added the same linker to the tester and driver in our experiment. Previously, Balzer and Baumlein (17) reported the addition of different linkers to the tester and driver in order to avoid the necessity of restriction enzyme digestion of driver and to eliminate contamination by residual linkered driver. We found that the addition of different linkers gave an unequal representation of starting mRNAs for tester and driver, probably due to sequence context effects of the primers or so-called PCR “bias,” a tendency to amplify some sequences preferentially. Therefore, we found it advantageous to use the same linker for both tester and driver rather than different linkers. While it is true that the driver linkers cannot be completely removed by a restriction enzyme, we believe that our protocol itself has a mechanism to eliminate the contamination problem of the residual linkered driver. Since the unlinkered driver is present in high excess in the reaction, the residual linkered driver should be driven out by the unlinkered driver. Of course, too high a level of linkered driver would still pose a problem for efficiency of enrichment. Therefore, we designed a linker with both AluI and SacI sites included to ensure maximum removal of linker sequences from driver. Incubation with AluI first and then SacI was used to avoid incorporation of driver DNA into the library, since the SacI site was used for library construction.
How to achieve an enrichment for both abundant and rare target genes is always an issue for methods of cloning differentially expressed genes. For LCS, we think that the hybridization time might be the determining factor.
Kinetically, short hybridization times favor enrichment of more abundant sequences, while longer times allow rare sequences to bind. As long as enough time is given for hybridization, rare target sequences can remain in the reaction. Indeed, moderate to rare genes are included in the genes we identified (data not shown). Moreover, as noted by Hubank and Schatz (12), unwanted dominant sequences (i.e., already identified sequences in the reaction) can be driven out by supplementing driver with unlinkered corresponding sequences. This would allow less dominant species to be isolated. One may also try to normalize tester and driver before subtraction. Here we suggest a procedure based on reassociation kinetics. First, both tester and driver are denatured and hybridized for a short time (e.g., 1 h). Then, the linkers of hybrids (presumably more abundant sequences) are removed by restriction enzymes. Finally, the remaining single-stranded fraction of DNA is amplified by PCR. Proper sampling of target genes that differ in abundance by only a few-fold is another issue. By adjustment of the tester/driver ratios and by more cycles of subtraction, these genes should be isolated.
We suggest that linker capture subtraction will be generally applicable to experiments such as those reported here as well as to studies of differential gene expression in cells incubated in the absence or presence of cytokines, growth factors, or other biologically active molecules. Moreover, the method should also prove useful in finding differences between genomic DNAs. Obviously, this method will not detect important genes critical for biological events whose mRNAs are not expressed differentially. One should also not expect that in one experiment this method will provide the entire array of differentially expressed genes between cell types, although by modification of the conditions described above, it may be possible to achieve this goal.
While this article was in preparation, two different methods which may reach the same endpoint as LCS were reported. Schena et al. (18) developed a robotic system for generating high-density microarrays of complementary DNA clones. By labeling samples with different dyes, differences in gene expression could be quantified. Velculescu et al. (19) developed another technique called serial analysis of gene expression, or SAGE. It relies on the fact that a sequence as short as nine base pairs is sufficient to identify 95% of human genes, provided that the sequence is picked from the same place in all the genes surveyed. Both approaches may allow a broad view of patterns of gene expression simultaneously. In our experience, LCS is at least as effective as these new methods, and it affords the advantages of efficiency, simplicity, and practicality.
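The kinetic argument made in the Discussion, that short hybridization times favor abundant sequences while longer times let rare sequences anneal, can be illustrated with the textbook second-order reassociation model, in which the fraction still single-stranded is 1/(1 + t/t_half). The sketch below is an illustration under stated assumptions and is not drawn from the experiments reported here. The half-times are hypothetical values chosen only to represent an abundant sequence and one 100-fold rarer; the 1 h and 20 h time points are those mentioned in the text.

```python
def fraction_reannealed(t_hours: float, half_time_hours: float) -> float:
    """Ideal second-order reassociation.

    The single-stranded fraction is 1 / (1 + t / t_half), so the
    reannealed fraction is t / (t + t_half).
    """
    return t_hours / (t_hours + half_time_hours)


# Hypothetical half-times: the rare species is taken to be 100-fold less
# concentrated, which makes its half-time 100-fold longer in this model.
ASSUMED_HALF_TIMES_H = {"abundant": 0.5, "rare": 50.0}


def reannealed_table(times_h=(1.0, 20.0)):
    """Reannealed fraction of each assumed species at each time point."""
    return {
        name: [fraction_reannealed(t, half) for t in times_h]
        for name, half in ASSUMED_HALF_TIMES_H.items()
    }


if __name__ == "__main__":
    for name, fractions in reannealed_table().items():
        shown = ", ".join(f"{value:.2f}" for value in fractions)
        print(f"{name}: fraction reannealed at 1 h and 20 h = {shown}")
```

With these assumed values, most of the abundant species has annealed after 1 h while almost none of the rare species has, which is the basis of the normalization idea suggested above. By 20 h a substantial share of the rare species has also formed duplexes.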
ACKNOWLEDGMENTS
This work was supported by an NIH NRSA 1 F32 DK 09364 to M.Y. and by U.S. Navy Grant N00014-93-1-0776, NIH Grant DK38841, and NATO Grant 890509 to A.J.S.
REFERENCES
1. Mather, E. L., Alt, F. W., Bothwell, A. L. M., Baltimore, D., and Koshland, M. E. (1981) Cell 23, 369–378.
2. Hedrick, S. M., Cohen, D. I., Nielsen, E. A., and Davis, M. M. (1984) Nature 308, 149–153.
3. Davis, R. L., Weintraub, H., and Lassar, A. (1987) Cell 51, 987–1000.
4. Welsh, J., Chada, K., Dalal, S. S., Cheng, R., Ralph, D., and McClelland, M. (1992) Nucleic Acids Res. 20, 4965–4970.
5. Liang, P., and Pardee, A. B. (1992) Science 257, 967–971.
6. Straus, D., and Ausubel, F. M. (1990) Proc. Natl. Acad. Sci. USA 87, 1889–1893.
7. Sive, H. L., and John, T. S. (1988) Nucleic Acids Res. 16, 10937.
8. Wieland, I., Bolger, G., Asouline, G., and Wigler, M. (1990) Proc. Natl. Acad. Sci. USA 87, 2720–2724.
9. Wang, Z., and Brown, D. D. (1991) Proc. Natl. Acad. Sci. USA 88, 11505–11509.
10. Lisitsyn, N., Lisitsyn, N., and Wigler, M. (1993) Science 259, 946–951.
11. Zeng, J., Gorski, R. A., and Hamer, D. (1994) Nucleic Acids Res. 22, 4381–4385.
12. Hubank, M., and Schatz, D. G. (1994) Nucleic Acids Res. 22, 5640–5648.
13. Xie, W. Q., and Rothblum, L. I. (1991) BioTechniques 11, 325–327.
14. Brown, S. E., and Knudson, D. L. (1991) BioTechniques 10, 719–722.
15. Winship, P. R. (1989) Nucleic Acids Res. 17, 1266.
16. Blok, L. J., Kumar, M. V., and Tindall, D. J. (1995) Prostate 26, 213–224.
17. Balzer, H. J., and Baumlein (1994) Nucleic Acids Res. 14, 2853–2854.
18. Schena, M., Shalon, D., Davis, R. W., and Brown, P. O. (1995) Science 270, 467–470.
19. Velculescu, V. E., Zhang, L., Vogelstein, B., and Kinzler, K. W. (1995) Science 270, 484–487.