streptavidin-mediated walking using the polymerase chain reaction




journal of biotechnology ELSEVIER

Journal of Biotechnology 35 (1994) 205-215

Extension of incomplete cDNAs (ESTs) by biotin/streptavidin-mediated walking using the polymerase chain reaction

D. Stephen Charnock-Jones a, Matthias Platzer b, André Rosenthal c,d,*

a Department of Obstetrics and Gynaecology, University of Cambridge, The Rosie Maternity Hospital, Cambridge CB2 2SW, UK
b Max-Delbrück Centre of Molecular Medicine, Robert-Rössle-Strasse 10, Berlin-Buch 13125, Germany
c Molecular Genetics Group, Department of Medicine, University of Cambridge, Addenbrookes Hospital, Hills Road, Cambridge CB2 2QQ, UK
d Institute of Molecular Biotechnology, Beutenbergstr. 11, 07745 Jena, Germany

Abstract

In the last 2 years thousands of new partial cDNAs or expressed sequence tags (ESTs) have been identified by single pass sequencing methods. This number is expected to increase further as part of the effort to isolate all human genes. However, the scientific value of partial cDNA fragments is limited unless they are used as tools for isolating and sequencing their full length parent molecules. Conventional library screening methods are tedious and not very effective in achieving this goal. We present a modified PCR technique which allows rapid isolation of the ends of partial cDNA fragments in vitro using a biotin/streptavidin capture procedure. Our method has several advantages over the RACE technique, is very specific, and frequently allows the final product to be sequenced directly without subcloning. We also show that cDNA walks can be obtained from partial sequences as short as 26 bp.

Key words: PCR; cDNA extension; Human genome project; STS; EST

* Corresponding author: Institut für Molekulare Biotechnologie e.V., Beutenbergstrasse 11, 07745 Jena, Germany.

0168-1656/94/$07.00 © 1994 Elsevier Science B.V. All rights reserved
SSDI 0168-1656(94)00040-J

1. Introduction

The human genome project envisages utilizing both genetic mapping and sequencing to produce the definitive map of the genome: the entire DNA sequence and the localization within it of all the genes. For the human genome this is a monumental task, and even for more modest genomes such as that of the nematode Caenorhabditis elegans it is formidable. Initially the sequencing component of the human genome project focused on technological projects to reduce the cost of DNA sequencing and on the sequencing of limited regions of particular interest. However, emphasis has now shifted toward the rapid identification of a large number of genes by partial cDNA sequencing. This was suggested some years ago (Brenner, 1990) and the wisdom of this approach is now increasingly recognized.

In the last 2 years several groups from the USA, Japan, France and the UK have described large numbers of partial cDNA sequences (Adams et al., 1991; Hoog, 1991; Okubo et al., 1992; Khan et al., 1992). At the last cDNA conference, held in Paris in 1992, it was estimated that around 10 000 separate human genes had already been identified (Sikela and Auffray, 1993). This corresponds to approx. 10% of the coding sequences in the genome. This rate will increase even further, and one group claims it will have virtually completed the cDNA sequencing by the middle of 1994 (Macilwain, 1993). Similar projects are under way for other model genomes, for example, Caenorhabditis elegans (Waterston et al., 1992; McCombie et al., 1992) and mouse (Hoog, 1991). There are also projects that use cDNA sequencing and mapping to study more diverse species such as the pig, chicken, Drosophila and Arabidopsis.

The vast majority of these sequences are 300-400 bp, but the different groups have used different strategies and envisage different uses for these partial sequences. For example, Adams et al. (1991) deliberately used randomly primed cDNAs to reduce the frequency of 5' and 3' noncoding sequences in their analysis. They were hoping to identify new genes on the basis of their lack of homology to known proteins. However, another group deliberately chose to sequence the 3' noncoding region from oligo-dT primed cDNA in order to ensure that a single fragmented cDNA was not categorized as more than one new gene (Khan et al., 1992). This strategy also has the advantage that the 3' noncoding regions of cDNA molecules are rarely interrupted by introns and are less well conserved within gene families than the coding sequences. This facilitates the design of gene specific sequence tagged site (STS) primers. Okubo et al. (1992) used large-scale cDNA sequencing to survey tissue gene expression.
This method, while primarily designed to address variations in tissue specific gene expression, does generate new cDNA sequence.

While these methods rapidly generate a partial sequence sufficient for use as an STS marker, they do not yield the entire cDNA sequence and therefore the predicted protein remains unidentified. This is a serious drawback for those who wish to study the function of the new gene(s). Full-length sequence is obviously needed to permit expression of recombinant protein in order to aid understanding of its biology. With the strategies currently employed this is impossible: very few of the clones sequenced (and none from the randomly primed libraries) are full length. Even if this were not a problem, distribution of the original clone to each individual scientist wishing to sequence the remaining portion would become increasingly time-consuming and probably impractical. There is, therefore, an increasing need for an in vitro method to extend partial cDNA clones that does not depend on distribution of biological material, which may be inadequate anyway.

The rapid amplification of cDNA ends (RACE; Frohman et al., 1988) and similar methods (Loh et al., 1989; Loh, 1991) have been developed to allow the extension of partial cDNAs. Both these methods rely on the use of an extended (anchored) oligo-dT primer and a gene specific primer to amplify the 3' region of the cDNA. Similarly, both authors utilize a homopolymer tail added to the first strand cDNA, together with an anchored tail-complementary primer and a gene specific primer, to amplify the 5' end. Both methods are very similar, the major difference being in the choice of nucleotide used for the tailing. A recent advance replaces the homopolymer tailing of the cDNA with the ligation of oligonucleotides that have their 3' ends blocked to prevent self ligation (Dumas et al., 1991; Troutt et al., 1992).

We have already developed a PCR method for walking within total human genomic DNA (Rosenthal and Jones, 1990; Rosenthal et al., 1991) and here we present modifications of this which permit the extension of both 3' and 5' ends of incomplete cDNAs.
The key principle of our method is the extension of a specific oligonucleotide primer which is biotinylated at its 5' end and the selective capture of this product on a solid support coated with streptavidin. Thus, the original highly complex mixture of cDNA fragments is reduced to one of much lower complexity before exponential amplification is carried out, thereby reducing the background of nonspecific amplification. To extend 5' ends the captured DNA is tailed while still immobilized, allowing the reaction to be performed under optimum buffer conditions. The exponential amplification is then carried out and the product can be sequenced directly or cloned if desired.

2. Methods

The method for walking to the 3' and 5' ends of cDNAs is outlined in Figs. 1 and 2, respectively, and a description is given below. The details of buffer composition and other reagents used are given at the end of the paper.

2.1. 3' Extension (Fig. 1)

2.1.1. RNA isolation

A common cause of failure or inadequate results in cDNA isolation is the poor quality of the (m)RNA used. Therefore special care must be taken with this step. The procedure of Chomczynski and Sacchi (1987) or the 'fast track' kit from Invitrogen yield good quality RNA and mRNA, respectively.

2.1.2. First strand cDNA synthesis

There are many kits available for cDNA synthesis and most can be adapted for use at this stage by utilizing an anchored-oligo(dT) primer. The 'anchor' can be any useful sequence of 18-20 bp in length, for example, restriction enzyme recognition sites or the universal M13 (-20) sequencing primer (particularly if the end product is to be sequenced directly with fluorescently labeled primers). The only constraint is that the primer should not be self-complementary. We routinely use the following two primers: (a) 5'-GTTGGATCCGCGGCCGCATCGAT(22)-3', which contains sites for the restriction enzymes BamHI, NotI and ClaI, and (b) 5'-TTGTAAAACGACGGCCAGTGAGCT(20)-3', which contains the M13 (-20) universal primer sequence.

Complementary DNA is synthesized in 30 µl containing 30 pmol tail-dT primer, 20 units human ribonuclease inhibitor and 20 units AMV reverse transcriptase. Prior to the addition of the enzymes the other components are heated to 68°C for 3 min and chilled on ice. The enzymes are then added and the reaction is incubated at 42°C for 1 h. The enzymes are then inactivated at 80°C for 10 min.
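The construction of primer (a) can be illustrated with a short Python sketch. It assumes that T(22) denotes a run of 22 thymidines following the anchor, and that the anchor is the 22-nt sequence quoted for the tail primer in the 3' walk example; the function names are hypothetical and serve only as an illustration.

```python
# Illustrative sketch only: function names are hypothetical, not from the paper.
# Assumption: "T(22)" means 22 consecutive T residues following the anchor.

# Recognition sequences of the three enzymes named for primer (a).
SITES = {"BamHI": "GGATCC", "NotI": "GCGGCCGC", "ClaI": "ATCGAT"}


def anchored_dt_primer(anchor: str, n_t: int) -> str:
    """Join an anchor sequence to a run of n_t thymidines (written 5' to 3')."""
    return anchor.upper() + "T" * n_t


def sites_present(primer: str) -> list:
    """Return the names of the enzymes whose recognition sequence occurs in the primer."""
    return [name for name, site in SITES.items() if site in primer]


ANCHOR_A = "GTTGGATCCGCGGCCGCATCGA"  # anchor portion of primer (a)
primer_a = anchored_dt_primer(ANCHOR_A, 22)

print(len(primer_a))            # total primer length in nucleotides
print(sites_present(primer_a))  # sites found in the full primer
print(sites_present(ANCHOR_A))  # sites found in the anchor alone
```

Under these assumptions the ClaI site (ATCGAT) is only completed by the first T of the oligo(dT) tract, so it is found in the full primer but not in the anchor alone.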

2.1.3. Linear amplification

A fraction of this cDNA (1-10%) is then transferred to the linear amplification step. Here a single gene specific 5'-biotinylated primer ('a' in Fig. 1) is used to amplify the single-stranded cDNA. Since only a single primer is present, the amplification is not exponential (hence 'linear' amplification). This is performed in 50 µl in the same way as standard PCR, except that there is only a single primer present at a concentration of 0.1 µM. The Tm of this primer should ideally be above 60°C and the PCR annealing temperature set accordingly. This eliminates the possibility of nonspecific products arising from extension of any oligo-dT primer remaining after first strand cDNA synthesis. Up to 50 cycles of amplification may be performed.

2.1.4. Solid phase capture

Streptavidin-coated magnetic beads are washed twice with TE/NaCl. Sodium chloride is added to the PCR mixture to a final concentration of 0.5 M followed by 30 µl of washed beads. The NaCl prevents nonspecific binding of DNA to the beads. The DNA and the beads are incubated with occasional mixing at room temperature for 10 min, after which the beads are washed three times in TE/NaCl using a magnet and finally twice in H2O.

2.1.5. Exponential amplification

The immobilized DNA is then used as a template for exponential amplification (20 cycles) using the same primer as for the initial linear amplification but without the biotin present and a tail primer (i.e., the same sequence as in the dT-tail primer but without the T tract present, 'c' in Fig. 1). A small fraction of this material (1 µl of a 1:50 dilution) is then reamplified with a nested gene specific primer (i.e., 'b' in Fig. 1) and the tail primer. 20-40 cycles may be used. The products of this reaction are in solution and may be analyzed by agarose gel electrophoresis prior to cloning and sequencing.
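The difference between the linear and exponential steps can be made concrete with a small idealized model. The sketch below is in Python; it assumes a fixed per-cycle efficiency and ignores plateau effects, primer depletion and template loss, so the numbers are upper bounds for illustration only. The function names are hypothetical.

```python
# Idealized model: an assumption for illustration, not a description of real yields.

def linear_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Product strands after single-primer cycling.

    With one primer, only the original templates are copied in each cycle,
    so product accumulates in proportion to the number of cycles.
    """
    return templates * cycles * efficiency


def exponential_copies(templates: float, cycles: int, efficiency: float = 1.0) -> float:
    """Product after two-primer PCR, where each cycle multiplies the amount
    of template by (1 + efficiency)."""
    return templates * (1.0 + efficiency) ** cycles


# One template molecule, perfect efficiency:
print(linear_copies(1, 50))       # 50 cycles with the biotinylated primer alone
print(exponential_copies(1, 20))  # 20 cycles with gene specific and tail primers
```

In this model 50 linear cycles give at most a 50-fold increase per template, whereas 20 exponential cycles give up to about a million-fold increase. This illustrates why the enrichment by capture is placed before the exponential step.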

2.2. 5' Extension (Fig. 2)

Many of the steps required for the extension to the 5' end of the cDNA are similar to those already described for the 3' end. However, because the extension is in the opposite direction several additional steps are needed.

Fig. 1. 3' extension of incomplete cDNAs by a modified PCR technique using biotin capture. Steps shown: synthesize single-stranded cDNA from mRNA; amplify with biotinylated gene specific primer; capture and wash on solid support; amplify with nested specific primer and tail primer; directly sequence or clone product.

Fig. 2. 5' extension of incomplete cDNAs by a modified PCR technique using biotin capture and tailing. Steps shown: extend biotinylated gene specific primer on mRNA, or extend/amplify with biotinylated gene specific primer on double-stranded cDNA; capture on solid support and wash; add homopolymer tail to immobilised cDNA; anneal and extend oligo dC-tail primer (c), wash immobilized product; amplify with nested gene specific primer and tail primer (d); clone or directly sequence product.

2.2.1. RNA isolation

As already discussed, the quality of the starting material is of utmost importance.

2.2.2. cDNA synthesis

There are two options for cDNA synthesis when walking toward the 5' end of a cDNA. The first is to use a gene specific 5'-biotinylated primer ('a' in Fig. 2) from which to initiate cDNA polymerization. This has the advantage that, in theory at least, it ensures that the sequence at the very 5' end of the mRNA is not lost. However, if there is significant secondary structure in the mRNA this may prevent the passage of the reverse transcriptase and incomplete products will be produced. The use of thermally stable reverse transcriptases and elevated polymerization temperatures may overcome this. The use of such enzymes also offers the possibility of performing multiple rounds of reverse transcription using cycles analogous to those used in PCR amplification.

The second alternative is to use conventional double-stranded cDNA as the template. In this case, any of the kits which are used for cDNA library construction can be used to synthesize the cDNA. If this is the method chosen then it is inevitable that a small amount of sequence will be lost from the extreme 5' end. However, the second method has the advantage that multiple cycles of linear amplification can be carried out using the gene specific 5'-biotinylated primer ('a' in Fig. 2), as already described for the 3' extension.

2.2.3. Solid phase capture

The streptavidin-coated beads are washed and used as before. However, there is an important point to note here. If the cDNA synthesis was specifically primed with the biotinylated primer, then sufficient beads must be used to bind all the biotinylated primer present. This, in practice, means that it is prudent either to use only a fraction of the cDNA product or to synthesize the cDNA using a low concentration of primer (0.1-0.5 µM). The beads are washed and finally resuspended in H2O as already described.
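The requirement that the beads bind all the biotinylated primer can be checked with simple arithmetic. The Python sketch below is a hedged illustration: the 30 µl reaction volume is borrowed from the 3' protocol as an assumption, and the bead binding capacity is a placeholder argument that must be taken from the bead supplier's data sheet, since no capacity is stated here. All function names are hypothetical.

```python
# Illustrative arithmetic only; names and the capacity value are hypothetical.

def pmol_in_reaction(conc_um: float, volume_ul: float) -> float:
    """Amount of primer in pmol: concentration (µM) times volume (µl)."""
    return conc_um * volume_ul


def usable_fraction(primer_pmol: float, bead_volume_ul: float,
                    capacity_pmol_per_ul: float) -> float:
    """Largest fraction of the cDNA reaction whose biotinylated primer
    can be bound completely by the given volume of beads."""
    total_capacity = bead_volume_ul * capacity_pmol_per_ul
    return min(1.0, total_capacity / primer_pmol)


# Primer at the two ends of the recommended range, in an assumed 30 µl reaction.
low = pmol_in_reaction(0.1, 30.0)
high = pmol_in_reaction(0.5, 30.0)
print(low, high)

# PLACEHOLDER capacity (0.2 pmol of biotinylated oligonucleotide per µl of beads):
# a made-up number for demonstration, not a manufacturer's specification.
print(usable_fraction(high, bead_volume_ul=30.0, capacity_pmol_per_ul=0.2))
```

With the placeholder capacity, only part of a reaction primed at the upper end of the concentration range could be captured completely, which is the situation the text advises avoiding by using a fraction of the product or less primer.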

2.2.4. Tailing

Since the cDNA is immobilized on the solid support it can be extensively washed before and after tailing. This allows the various enzymatic steps to be performed in their optimal buffers. Although it has been reported that terminal transferase is active in PCR buffer we have found that the reaction is considerably more reliable when performed in potassium cacodylate buffer. This can readily be removed by washing after the tailing is complete. The nucleotide used for tailing can be either dGTP or dATP and each has advantages. Using dGTP will ensure that the tail is not too long (dG tails are self limiting in length). However, this requires an oligo dC primer for subsequent amplification and such a primer is very 'sticky' and prone to nonspecific amplification. Poly A tails are not self limiting and can extend to 1-200 nucleotides in length which, although this length is reduced during the PCR steps that follow, can lead to quite long poly A tracts in the final product. If dATP is used, then the same tail primer can be used for both 5' and 3' extension.

2.2.5. Single linear extension

After washing the immobilized tailed cDNA a complementary strand is synthesized. This is primed using an anti-tail primer that has an anchor similar to that used for walking to the 3' end of the cDNA. Only a single extension is carried out, which minimizes the possibility of nonspecific amplification by the oligo-dC primer.

2.2.6. Exponential amplification

The now double-stranded cDNA is washed again and then amplified exponentially using a nested gene specific primer ('b' in Fig. 2) and the anchor primer ('d' in Fig. 2). The conditions used are those employed for normal PCR and 35 cycles are usually needed. The protocol often yields distinct bands which can be sequenced directly without subcloning.

2.3. Sequencing of PCR extension products

The following methods are used for template preparation and sequencing.


2.3.1. Template preparation after subcloning

PCR products representing complex mixtures are cloned directly into a commercial TA-vector system, or after end-repairing with T4 DNA polymerase are cloned into the EcoRV site of BluescriptII. Recombinant colonies are picked into a microtitre dish containing TB broth and the appropriate antibiotic. After overnight growth, PCR is performed in a thermostable microtitre dish using a 96-pin hedgehog device (Rosenthal et al., 1993). To identify recombinants, aliquots of PCR products are separated by electrophoresis using a 96-well 1% agarose gel. Sequencing templates can be prepared either by a modified alkaline lysis protocol (Jones and Schofield, 1990) or by direct purification of the PCR product. For the latter, excess primers, nucleotides and small molecular weight products are removed from the PCR products by precipitation with polyethylene glycol (PEG) using a special 26.2% PEG 8000 solution at pH 5.2 containing 6.6 mM MgCl2 and 0.6 M NaOAc (Rosenthal et al., 1993).

2.3.2. Template preparation for direct sequencing

If several distinct products are formed during PCR the mixture is first separated on a 1% agarose gel. DNA is then recovered from the agarose by adsorption to glass beads. For single specific PCR products the aqueous phase is removed from beneath the oil into a new tube. The PCR DNA is recovered by PEG precipitation as described above.

2.3.3. Sequencing

Templates prepared by alkaline lysis are sequenced using [α-35S]dATP and Sequenase 2.0 according to standard protocols. Templates obtained by direct purification of PCR products are cycle sequenced in 0.5-ml test tubes using Taq dye terminator chemistry and the ABI 373A sequencer. Excess dye terminators are removed by gravity chromatography using special perspex blocks of microcolumns scaled down to microtitre format (Rosenthal and Charnock-Jones, 1992).

3. Examples

In the following section we describe results obtained by walking towards both the 3' and 5' ends of cDNA molecules. We also briefly show the result of a cDNA walk when only 26 bp of gene specific sequence was known.

3.1. 5' walk along the phospholipase C (alpha type) cDNA

Double-stranded cDNA was synthesized from 1 µg of human placental mRNA using standard techniques (Gubler and Hoffman, 1983). After ethanol precipitation the cDNA was diluted with 100 µl of sterile H2O. In order to incorporate the biotin prior to solid phase capture, 50 cycles of linear amplification were performed. The reaction (50 µl) contained 1 µl of double-stranded cDNA (10 ng), 5 pmol primer (Bio-J), 100 µM dNTPs, 1× PCR buffer, and 1 unit Taq polymerase. The sample was overlaid with light mineral oil and amplification was carried out using a Techne (Cambridge) PHC-2 programmable heating block. The cycles used were (95°C 30 s, 60°C 30 s, 72°C 30 s) × 50.

Streptavidin-coated magnetic beads (50 µl) were washed as already described and NaCl was added to the amplified cDNA and mixed. After 10 min at room temperature the beads were separated with a magnet and washed three times with TE/NaCl and then three times with water. Tailing of the immobilized cDNA was performed in 50 µl of 1× tailing buffer containing 50 µM dGTP and 20 units terminal transferase for 15 min at 37°C. The enzyme was heat inactivated at 80°C for 10 min and the immobilized tailed cDNA washed with TE/NaCl and water as before.

The tailed cDNA was exponentially amplified using the anti-tail primer 5'-AATTGGATCCCCCCCCCCCCCCC-3' and the nested primer for 35 cycles in a 50 µl reaction (95°C 30 s, 60°C 30 s, 72°C 30 s) × 35. The product was purified by agarose gel electrophoresis and sequenced directly using primer H and the fluorescent dye terminator Taq cycle sequencing kit (ABI). The resulting sequence is shown in Fig. 3. This sequence contains the expected overlap and thus is a true walk and extends to the homopolymeric tail.

3.2. 3' walk along the human cDNA for the neural cell adhesion molecule L1

Single-stranded cDNA was synthesized from 100 ng of human fetal brain mRNA as described above. After first strand synthesis the reverse transcriptase was heat inactivated by incubation at 80°C for 15 min, followed by the addition of sterile water up to 100 µl. In order to obtain the 3' nontranslated region of the human neural cell adhesion molecule L1 (Rosenthal et al., 1991; Hlavin and Lemmon, 1991) we performed 50 linear PCR cycles using the biotinylated primer 5'-bio-GCTGGTTCATCGGCTTTGTGAGTG-3' (primer A: L1CAM 3362-3385). The reaction (20 µl) contained 1 µl (1 ng) human fetal brain cDNA, 10 pmol biotinylated primer A, 100 µM dNTPs, 1× PCR buffer and 2 units of Taq polymerase. The sample was overlaid with light mineral oil and amplification was carried out using a Techne (Cambridge) PHC-2 programmable heating block. The cycles used were (95°C 30 s, 60°C 60 s, 72°C 120 s) × 50.

Washed streptavidin-coated magnetic beads (30 µl) were added and the mixture incubated for 30 min at room temperature. The beads were sedimented using a magnet and the supernatant removed, followed by three washes with 40 µl of 0.1 M NaCl/TE and three washes with water. The beads were finally resuspended in 5 µl.

Exponential PCR was performed in 20 µl and the mixture contained 5 µl of the solid phase DNA template bound to magnetic beads, the L1 specific primer A but without biotin (1 µM), the tail specific primer 5'-GTTGGATCCGCGGCCGCATCGA-3' (1 µM), dNTPs (100 µM), 1× PCR buffer, and 2 units of Taq polymerase. 25 cycles were performed using the same cycle conditions as in the linear amplification step. A second exponential amplification (35 cycles) was performed using an aliquot (1 µl of a 1:50 dilution of the reaction product) and a 3' nested L1 specific primer 5'-GATGAGACCTTCGGCGAGTAC-3' (primer B: L1CAM 3508-3528) as well as the tail primer. PCR conditions were as described above.

An aliquot of the PCR product was analyzed on a 1% agarose gel and two bands of around 1.2 to 1.4 kb in length were observed (Fig. 4). These products were cloned and sequenced on both strands using custom primers. The sequence analysis revealed a correct 1.4 kb long extension of the human neural cell adhesion molecule L1 towards the 3' end, covering more than 1 kb of the 3' nontranslated region.
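For planning purposes, the three amplification rounds of this example can be written down as data and their programmed hold times added up. The Python sketch below is an assumption-laden illustration: the class and variable names are hypothetical, and ramp times between temperatures, which depend on the heating block, are not included.

```python
# Hypothetical representation of the cycling programs used in this example.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # programmed hold time


@dataclass(frozen=True)
class Program:
    name: str
    steps: tuple   # tuple of Step objects making up one cycle
    cycles: int

    def hold_seconds(self) -> int:
        """Total programmed hold time, excluding ramping between steps."""
        return self.cycles * sum(step.seconds for step in self.steps)


# Cycle profile quoted for the L1 walk: 95°C 30 s, 60°C 60 s, 72°C 120 s.
PROFILE = (Step(95, 30), Step(60, 60), Step(72, 120))

programs = [
    Program("linear amplification", PROFILE, 50),
    Program("first exponential amplification", PROFILE, 25),
    Program("second exponential amplification", PROFILE, 35),
]

for program in programs:
    print(program.name, program.hold_seconds() / 60, "min")

total = sum(program.hold_seconds() for program in programs)
print("total", total / 3600, "h")
```

The total covers thermal cycling only; capture, washing and gel analysis add further time.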

Fig. 4. Lane 1, 123 bp ladder; lane 2, 1.2 kb and 1.4 kb PCR products obtained by 3' extension of the human neural cell adhesion molecule L1 (Rosenthal et al., 1991; Hlavin and Lemmon, 1991) using the method outlined in Fig. 1. Both products are correct extension products of the human neural cell adhesion molecule L1 towards the 3' end, covering its 3' nontranslated region.

3.3. Amplification of 3' cDNA ends using primers designed with minimal sequence information

Investigation of distantly related genes (gene families, homologs in different organisms, cellular homologs of viral genes) may require the gene specific primers to be placed within as short a stretch of sequence as possible. We have previously shown that human papillomavirus (HPV) antisense oligonucleotides can inhibit adhesion of human cells to plastic substratum. This effect is restricted to oligonucleotides derived from a 30 bp region of the HPV18 E7 gene (Platzer, manuscript in preparation). To analyse the human cellular genes that may be involved it was necessary to use a pair of overlapping locus specific primers spanning a region of no more than 30 bp. In order to optimize primer selection, pilot experiments were carried out using an in vitro HPV18 immortalized cell line. Total mRNA was prepared, followed by first strand cDNA synthesis using 5'-TTGTAAAACGACGGCCAGTGAGCT(20)-3' as described above. The following amplification scheme was used:

(a) 50 cycles of linear amplification were performed (95°C 30 s, 43°C 60 s, 72°C 120 s) using the following HPV18-specific biotinylated primer A 5'-bio-TGCATTTAGAGCCCC-3' (HPV18 627-641);
(b) 35 cycles of exponential amplification were performed (95°C 30 s, 43°C 60 s, 72°C 120 s) using primer A but without biotin and the M13 (-20) universal forward primer 5'-GTAAAACGACGGCCAGT-3';
(c) another 35 cycles of exponential amplification were performed (95°C 30 s, 44°C 60 s, 72°C 120 s) using a partially nested HPV18-specific primer B 5'-AGCCCCAAAATGAAATT-3' (HPV18 636-652) and the M13 (-20) universal forward primer;
(d) the PCR fragments were sequenced directly or after cloning using a TA-cloning kit (Invitrogen).

The HPV18-specific primers A and B are 15 bp and 17 bp in length and overlap by 6 bp. Sequencing revealed a correctly spliced 925 bp 3' cDNA extension product which contains 274 bp of HPV18 from the E7 gene specific primer to a viral donor splice site (HPV18 655-929) and 651 bp of SV40 from the t/T acceptor splice site to the early polyadenylation site via a vector specific MboI deletion (SV40 4571-4100/2770-2592). This demonstrates that correct splicing takes place in transcripts derived from the transgenic expression unit. We have also isolated several incomplete 3' walks which are the result of priming of the anchor-oligo(dT) primer at A-rich regions in the SV40 sequence during first strand cDNA synthesis.

In summary, we show that biotin walking to the 3' ends of cDNAs can be performed using gene specific overlapping primers spanning a region as short as 26 bp.
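The geometry of the two overlapping HPV18 primers can be verified directly from their sequences and coordinates. The Python sketch below is a hedged illustration with hypothetical function names. The melting temperature estimate uses the Wallace rule of thumb, Tm = 2(A+T) + 4(G+C), which is an assumption introduced here and not necessarily the method used to choose the 43-44°C annealing temperatures.

```python
# Illustrative check of the primer pair described above; names are hypothetical.

PRIMER_A = "TGCATTTAGAGCCCC"    # HPV18 627-641
PRIMER_B = "AGCCCCAAAATGAAATT"  # HPV18 636-652


def suffix_prefix_overlap(first: str, second: str) -> int:
    """Length of the longest suffix of `first` that is also a prefix of `second`."""
    for k in range(min(len(first), len(second)), 0, -1):
        if first[-k:] == second[:k]:
            return k
    return 0


def spanned_region(first: str, second: str) -> str:
    """Merge two overlapping primers into the single stretch they cover."""
    k = suffix_prefix_overlap(first, second)
    return first + second[k:]


def wallace_tm(seq: str) -> int:
    """Rule-of-thumb Tm in °C for short oligonucleotides: 2(A+T) + 4(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


print(suffix_prefix_overlap(PRIMER_A, PRIMER_B))  # overlap in bp
print(len(spanned_region(PRIMER_A, PRIMER_B)))    # total region covered in bp
print(wallace_tm(PRIMER_A), wallace_tm(PRIMER_B)) # rough Tm estimates
```

The sequence-based overlap and span agree with the coordinates given in the text (627-652 is 26 positions), and the rough Tm estimates lie a few degrees above the annealing temperatures used.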

4. Materials

AMV reverse transcriptase. Super RT, HT-Biotech, Cambridge, UK.

Biotinylated primers. These were synthesized on a 380B automated DNA synthesizer (Applied Biosystems Inc., Foster City, CA, USA). This is a two-step procedure using Aminolink-2 (Applied Biosystems Inc., Foster City, CA, USA) to produce an amino-modified oligonucleotide which is subsequently coupled to a biotin-NHS ester. The biotinylated oligonucleotide is precipitated and gel purified. A fuller description of this method is given in Rosenthal et al. (1991). However, phosphoramidites with biotin already coupled are now available and these can be incorporated using the standard automated DNA synthesis procedure.

Human placental ribonuclease inhibitor. Cloned, Promega Biotech, Madison, WI, USA.

PCR buffer. 10 mM Tris-HCl pH 8.3 at 25°C, 50 mM KCl, 1.5 mM MgCl2.

Reverse transcriptase buffer. 50 mM Tris-HCl pH 8.3 at 42°C, 400 mM KCl, 10 mM MgCl2, 4 mM DTT, 1 mM dNTPs (Pharmacia P-L Biochemicals Inc., Milwaukee, WI, USA).

Sequenase version 2.0 DNA sequencing kit. USB, Cleveland, OH, USA.

Streptavidin coated magnetic beads. Dynabeads M-280, Dynal, Oslo, Norway.

Tailing buffer. 100 mM sodium cacodylate pH 7.0, 1 mM CoCl2, 0.1 mM DTT, 20 µM dGTP, 50 µg/ml BSA.

TA cloning kit version 1.3. Invitrogen Corp., San Diego, CA, USA.

Taq dye deoxy terminator cycle sequencing kit. Applied Biosystems Inc., Foster City, CA, USA.

Taq polymerase. Native Taq polymerase (5 U/µl, Perkin Elmer Inc./Applied Biosystems Inc.).

TE/NaCl. 10 mM Tris-HCl pH 7.4, 1 mM EDTA, 0.5 M NaCl.

Terminal transferase. FPLC pure, Pharmacia P-L Biochemicals Inc., Milwaukee, WI, USA.

References

Adams, M.D., Kelley, J.M., Gocayne, J.D., Dubnick, M., Polymeropoulos, M.H., Xiao, H., Merril, C.R., Wu, A., Olde, B., Moreno, R.F., Kerlavage, A.R., McCombie, W.R. and Venter, J.C. (1991) Complementary DNA sequencing: expressed sequence tags and human genome project. Science 252, 1651-1656.

Brenner, S. (1990) The human genome: the nature of the enterprise. Ciba Found. Symp. 149, 6-17.

Chomczynski, P. and Sacchi, N. (1987) Single step method of RNA isolation by guanidine thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162, 156-159.

Dumas, J.B., Edwards, M., Delort, J. and Mallet, J. (1991) Oligodeoxyribonucleotide ligation to single-stranded cDNAs: a new tool for cloning 5' ends of mRNAs and for constructing cDNA libraries by in vitro amplification. Nucleic Acids Res. 19, 5227-5232.

Frohman, M.A., Dush, M.K. and Martin, G.R. (1988) Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci. USA 85, 8998-9002.

Gubler, U. and Hoffman, B.J. (1983) A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269.

Hlavin, M.L. and Lemmon, V.V. (1991) Molecular structure and functional testing of the human L1 cell adhesion molecule: an interspecies comparison. Genomics 11, 416-423.

Hoog, C. (1991) Isolation of a large number of novel mammalian genes by a differential cDNA library screening strategy. Nucleic Acids Res. 19, 6123-6127.

Jones, D.S.C. and Schofield, J.P. (1990) A rapid method for isolating high quality plasmid DNA suitable for DNA sequencing. Nucleic Acids Res. 18, 7463-7464.

Khan, A.S., Wilcox, A.S., Polymeropoulos, M.H., Hopkins, J.A., Stevens, T.J., Robinson, M., Orpana, A.K. and Sikela, J.M. (1992) Single pass sequencing and physical and genetic mapping of human brain cDNAs. Nature Genet. 2, 180-185.

Loh, E.Y. (1991) Anchored PCR: amplification with single-sided specificity. Methods: A Companion to Methods in Enzymology 2, 11-19.

Loh, E.Y., Elliott, J.F., Cwirla, S., Lanier, L.L. and Davis, M.M. (1989) Polymerase chain reaction with single-sided specificity: analysis of T cell receptor δ chain. Science 243, 217-220.

Macilwain, C. (1993) News: Genome project "to be done by 1994". Nature 362, 488.

McCombie, W.R., Adams, M.D., Kelley, J.M., FitzGerald, M.G., Utterback, T.R., Khan, M., Dubnick, M., Kerlavage, A.R., Venter, J.C. and Fields, C. (1992) Caenorhabditis elegans expressed sequence tags identify gene families and potential disease gene homologues. Nature Genet. 1, 124-131.

Okubo, K., Hori, N., Matoba, R., Niiyama, T., Fukushima, A., Kojima, Y. and Matsubara, K. (1992) Large scale cDNA sequencing for analysis of quantitative and qualitative aspects of gene expression. Nature Genet. 2, 173-179.

Rosenthal, A. and Jones, D.S.C. (1990) Genomic walking and sequencing by oligo-cassette mediated polymerase chain reaction. Nucleic Acids Res. 18, 3095-3096.

Rosenthal, A., MacKinnon, R.N. and Jones, D.S.C. (1991) PCR walking from microdissection clone M54 identifies three exons from the human gene for neural cell adhesion molecule L1 (CAM-L1). Nucleic Acids Res. 19, 5395-5401.

Rosenthal, A. and Charnock-Jones, D.S. (1992) New protocols for DNA sequencing with dye terminators. DNA Sequence 3, 61-64.

Rosenthal, A., Coutelle, O. and Craxton, M. (1993) Large-scale production of DNA sequencing templates by microtitre format PCR. Nucleic Acids Res. 21, 173-174.

Sikela, J.M. and Auffray, C. (1993) Finding new genes faster than ever. Nature Genet. 3, 189-191.

Troutt, A.B., McHeyzer-Williams, M.G., Pulendran, B. and Nossal, G.J.V. (1992) Ligation-anchored PCR: a simple amplification technique with single-sided specificity. Proc. Natl. Acad. Sci. USA 89, 9823-9825.

Waterston, R., Martin, C., Craxton, M., Huynh, C., Coulson, A., Hillier, L., Durbin, R., Green, P., Shownkeen, R., Halloran, N., Metzstein, M., Hawkins, T., Wilson, R., Berks, M., Du, Z., Thomas, K., Thierry-Mieg, J. and Sulston, J. (1992) A survey of expressed genes in Caenorhabditis elegans. Nature Genet. 1, 114-123.