Insertional inactivation of the tk locus in a human B lymphoblastoid cell line by a retroviral shuttle vector


Mutation Research, 289 (1993) 297-308 © 1993 Elsevier Science Publishers B.V. All rights reserved 0027-5107/93/$06.00

MUT 05287

Andrew J. Grosovsky a, Adonis Skandalis b,*, Leslie Hasegawa a and Barbara N. Walter a

a Environmental Toxicology Graduate Program, University of California, Riverside, CA 92521, USA and b Department of Biology, York University, North York, Ont. M3J 1P3, Canada

(Received 1 February 1993) (Revision received 18 May 1993) (Accepted 19 May 1993)

Keywords: Insertional inactivation; Retroviral vectors; Thymidine kinase

Summary

Insertional mutagenesis represents an inherent risk in retrovirally mediated gene therapy, but it may be a useful experimental strategy for identification and isolation of novel cellular loci. In this investigation we have established a model system using a heterozygous thymidine kinase (tk) marker locus in a human B lymphoblastoid cell line, and an M-MuLV based shuttle vector. The frequency of TK- mutants in cells carrying 1-2 proviruses per genome is approximately 2 × 10⁻⁵, a 5-fold increase as compared to an uninfected control population. Southern analysis of a set of 13 retrovirus infected TK- mutants revealed a predominance of rearrangements among those mutants which had not undergone loss of heterozygosity. No consistent relationship was found to exist between the occurrence of a rearrangement and tk gene expression as detected by northern analysis. The mechanisms of retroviral shuttle vector insertional mutagenesis were characterized in more detail by focusing on a single TK- mutant, T2. The single proviral insert in T2 was found to lie within tk intron 2, in parallel orientation to the direction of tk transcription. DNA sequence analysis of tk cDNA revealed the presence of an aberrantly spliced product from which exon 4 is excluded. Aberrant splicing could sufficiently account for the low level of functional tk transcript and thus the TK- phenotype in T2, although potential contributions from other mechanisms cannot be excluded.

Correspondence: Dr. Andrew J. Grosovsky, Environmental Toxicology Graduate Program, University of California, Riverside, CA 92521, USA. Tel. (909)787-3193; Fax (909)787-3087; E mail Grosovsk @ucracl.ucr.edu. * Present address: Program in Human Molecular Biology and Genetics, University of Utah, Salt Lake City, UT 84112, USA. Abbreviations: hprt, hypoxanthine-guanine phosphoribosyl transferase; M-MuLV, Moloney murine leukemia virus; tk, thymidine kinase.

Insertional mutagenesis following retroviral infection of mammalian cells has important experimental and clinical implications. Mutagenesis of cellular loci by proviral integration may be exploited for genetic analyses (Soriano et al., 1989), but represents a potential limitation in the use of retroviruses and retroviral shuttle vectors for gene therapy (McLachlin et al., 1990). Retrovirus-mediated gene therapy protocols are generally designed to deliver a functional copy of a critical gene to complement a specific single-gene genetic
defect (Miller, 1992). The high efficiency of retroviral infection, the stability of the proviral insertion, and the inability of retroviral vectors to further infect host cells are important advantages in this strategy. However, despite the inherent potential for collateral insertional mutagenesis, there is very little information which permits a quantitative estimate of the specific-locus mutational risk.

Molecular studies of activated cellular oncogenes have revealed many examples of naturally occurring retroviral insertional mutagenesis (Bishop, 1987). Furthermore, integration of endogenous retroviruses at non-oncogene targets in mice has been shown to be responsible for the hairless (Stoye et al., 1988) and dilute (Jenkins et al., 1981) mutations. Experimentally induced retrovirus insertional mutagenesis of early mouse embryos has been used to identify and clone novel genes which affect development (Jaenisch et al., 1983; Soriano et al., 1987). However, only one report (King et al., 1985) described mutation at a selectable marker locus, the hprt gene in mouse F9 embryonal carcinoma cells. The number of HPRT- clones was found to increase approximately 10-fold following infection with an M-MuLV variant. The magnitude of the increased mutant frequency at the hprt locus remained similar under conditions in which provirus copy number ranged from 5 to 50 per cell. These results seem difficult to reconcile with the theoretical expectation that the potential for insertional mutagenesis at any given cellular locus should be directly related to the provirus copy number.

Transposon tagging in yeast (Roeder and Fink, 1980) and Drosophila (Cooley et al., 1988) has been used for the recovery of novel genes whose inactivation leads to a discernible phenotype.
Retroviruses would seem well suited for this role in mammalian systems since they infect host cells at high efficiency and integrate into the host genome at low copy number, thus facilitating the cloning of mutated loci. Although there are some highly preferred sites of proviral integration, insertions occur at frequent intervals throughout the host DNA (Shih et al., 1988; Kitamura et al., 1992; Pryciak and Varmus, 1992). Retroviral integration specificity would thus appear to be sufficiently diverse for insertional mutagenesis to occur throughout the full complement of target cell genes. Only a limited number of mouse development studies have explored this potential for the dissection of specific interesting phenotypes in mammalian systems (Jaenisch et al., 1983; Soriano et al., 1987). Very little has been done with mammalian cells grown in culture, perhaps because detection of insertional inactivation will largely be limited to functionally heterozygous or hemizygous target loci, and because of the absence of quantitative information that would permit an estimation of the number of infectants which must be screened to identify a mutant of an appropriate phenotype. Furthermore, mutants identified in such a screening protocol are of uncertain origin. Some may represent a spontaneous background, or result from effects on gene expression caused by inserts in introns or flanking sequences (Soriano et al., 1989; Jaenisch, 1988; Breindl et al., 1984; Barker et al., 1991). In such cases, the provirus may not necessarily provide a useful marker for cloning the affected locus. Only one previous study (King et al., 1985) has examined a collection of mutants at a specific cellular locus in order to determine the spectrum of mutation following retroviral infection.

In this study, we have investigated retrovirus shuttle vector insertional mutagenesis at the thymidine kinase (tk) locus in the human B lymphoblastoid cell line, TK6 (Skopek et al., 1978; Liber and Thilly, 1982). Thymidine kinase is a heterozygous, autosomal marker in TK6. This model system has permitted us to estimate a specific-locus mutational risk attributable to retrovirus insertion and to evaluate the usefulness of this approach in somatic-cell genetics. The mechanism of insertional mutagenesis by retroviral shuttle vectors was characterized in a set of 13 TK- mutants and investigated in further detail by focusing on one individual TK- mutant.
We show here that the frequency of TK- mutation increases approximately 5-fold following infection with retroviral shuttle vectors. Molecular characterization of individual mutants reveals that the mutant phenotype may often be attributable to effects on gene expression rather than direct disruption of the coding sequence due to an integration event within the affected locus.

Materials and methods

Cell line, cell culture and selection of mutants
TK6 is a human B lymphoblastoid cell line (Skopek et al., 1978; Liber and Thilly, 1982) which can be grown in logarithmic suspension cultures at densities of up to 10⁶ cells/ml. The cultures were maintained in RPMI 1640 (Cellgro) supplemented with L-glutamine, penicillin/streptomycin and 10% iron-supplemented calf serum (HyClone). Single colonies may be obtained with high cloning efficiency using 96-well dishes. The tk locus resides on chromosome 17 and has been rendered heterozygous in TK6 cells by sequential selection of TK-/- derivatives, and back selection of TK-/+ revertants following treatment with the frameshift mutagen ICR-170 (Liber and Thilly, 1982). The locus comprises 7 exons representing 702 bp of coding sequence distributed over 12.9 kb of genomic DNA (Bradshaw and Deininger, 1984; Flemington et al., 1987). The tk gene encodes a salvage pathway enzyme which is inessential for cell survival under ordinary cell-culture conditions. Protocols for selection of TK- mutants have been published (Liber and Thilly, 1982; Grosovsky and Little, 1985). Briefly, TK- mutants are selected in 2 µg/ml trifluorothymidine. Selections are done in 96-well dishes by seeding 4 × 10⁴ cells/well. At least three 96-well dishes were used for each mutant frequency determination. TK- mutants were selected following co-cultivation of TK6 cells with retroviral shuttle vector producer lines and selection of infectants with G418 (see below). The resultant mutant collection was anticipated to contain siblings due to replication of a single mutant during G418 selection. Sibling mutants collected from this population were unambiguously identified by analysis of proviral insertions (see Results and Table 2) and are grouped together in further analyses.
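
The 96-well selection described above lends itself to a Poisson estimate of mutant frequency. The text does not spell out the calculation, so the following is only a minimal sketch under stated assumptions: the fraction of wells without a colony is taken as the zero class of a Poisson distribution, and the function name, the `cloning_efficiency` parameter and the well counts are hypothetical illustrations, not values from this study.

```python
import math


def mutant_frequency(negative_wells, total_wells, cells_per_well,
                     cloning_efficiency=1.0):
    """Poisson estimate of mutant frequency from a multi-well selection.

    Assumption (not stated in the text): mutants are distributed over
    wells at random, so the fraction of colony-free wells equals
    exp(-mean number of mutants per well).
    """
    if not 0 < negative_wells <= total_wells:
        raise ValueError("need 0 < negative_wells <= total_wells")
    mean_mutants_per_well = -math.log(negative_wells / total_wells)
    # Normalize by the number of viable cells plated in each well.
    return mean_mutants_per_well / (cells_per_well * cloning_efficiency)


# Hypothetical counts: three 96-well dishes (288 wells) seeded at
# 4 x 10^4 cells/well, of which 120 wells show no colony.
mf = mutant_frequency(negative_wells=120, total_wells=288,
                      cells_per_well=4e4)
print(f"estimated mutant frequency: {mf:.2e}")
```

With these illustrative counts the estimate falls near 2 × 10⁻⁵, the order of magnitude reported for the infected population; real data would also need the cloning efficiency measured under non-selective conditions.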

Retrovirus shuttle vector and packaging cell line
ZipNeoSV(X) is a replication-defective retroviral shuttle vector derived from an integrated M-MuLV provirus and pBR322 sequences necessary for the propagation of the vector in E. coli (Cepko et al., 1984). The retroviral sequences encoding the gag, pol and env polypeptides were removed. Sequences retained in the vector include the M-MuLV long terminal repeats (LTRs) necessary for the initiation of viral transcription, the polyadenylation of viral transcripts, and for provirus integration. Retained sequences also include those required in cis for reverse transcription of the viral genome, and for encapsidation of the viral RNA. The neo gene is incorporated in ZipNeoSV(X), and G418-resistant mammalian cells incorporating the ZipNeoSV(X) provirus can therefore be selected. A 3' splice acceptor site is available upstream from the neo gene. Infectious ZipNeoSV(X) virions were obtained by transfection of packaging cell line PA12 (Miller et al., 1985). Functions required in trans for the replication and encapsidation of ZipNeoSV(X) are then provided by an otherwise defective helper provirus integrated into the PA12 genome.

ZipNeoSV(X) infection of TK6 and creation of insertional mutagenesis libraries
For infection with ZipNeoSV(X), TK6 was co-cultivated with an anchorage-dependent, G418R, PA12-derived producer line. The cell density at the start of the co-cultivation was calculated so that approximately 10⁶ cells/ml were present at the conclusion of the infection period. Following 72 h of co-cultivation, the TK6 cells were isolated by centrifugation of the culture medium and resuspended in fresh, sterile medium. After a 2-day expression period, the cells were selected in G418. For determination of infection frequency, an aliquot of the infected cells was seeded in G418-containing medium in 96-well dishes for a clonal assay. The bulk population of infected cells was selected by simply adding G418 to the culture medium and subculturing as necessary until a viable G418R population was established. The population of G418R cells obtained in this manner is called an insertional mutagenesis library (King et al., 1985) due to some analogies with genomic libraries in cloning vectors. The cloning efficiency and growth rate of the infectants were comparable to those of the parental TK6 cells. TK- mutants were selected from an insertional mutagenesis library which initially consisted of 3 × 10⁶ infectants after 72 h of co-cultivation as determined by the clonal assay. Mutants were picked
after 14 days of growth in trifluorothymidine; slow-growth mutants (Yandell et al., 1986) were not analyzed in this study. An uninfected culture was maintained in parallel and used for determination of spontaneous mutant frequencies.

Southern and northern blotting
For Southern analysis, 20 µg of genomic DNA were digested with the appropriate restriction enzyme and electrophoresed on 0.8% agarose-TAE buffer gels. The DNA was transferred onto Gene Screen Plus membranes (NEN Research Products) by capillary blotting, following the manufacturer's directions. Hybridization solutions contained 1% SDS, 1 M NaCl, 10% dextran sulfate, 0.5 mg/ml herring sperm DNA, and 25 ng ³²P-ATP random primer labelled probe. The tk probe was a full length tk cDNA purified as a BamHI-SmaI fragment from plasmid pTK11 (Bradshaw and Deininger, 1985). The ZipNeoSV(X) specific probe was an internal 2.2-kb XhoI fragment encompassing the neo gene as well as the SV40 and pBR322 origins of replication (Cepko et al., 1984). Hybridizations were carried out overnight at 65°C and after stringent washes autoradiography was performed with exposure periods of 2-7 days. For northern analysis, 30 µg of total RNA were electrophoresed on a 1% agarose, 37% formamide-MOPS gel. RNA was transferred to Gene Screen Plus nylon membrane by capillary blotting with 10 × SSC according to the manufacturer's directions and hybridized to the tk cDNA probe. After stringent washes, membranes were autoradiographed with exposures of 2-3 days. The filters were subsequently stripped and reprobed with a neo-specific hybridization probe as a control for the amount of RNA loaded in each lane (data not shown).

Polymerase chain reaction and DNA sequencing
PCR amplification of the insertional mutants was carried out in an Ericomp Thermocycler using standard protocols.
Genomic amplification of a 1600-bp LTR-tk junction fragment was performed using a tk primer annealing within exon 3 (gTK2376R 5'-ATCTGGAAGCGACGGACGCG-3') and an LTR specific primer (LTR551 5'-TCTCCTCTGAGTGATTGACT-3'). The PCR reaction mixture contained 15 mM Tris pH 8.3, 2.5 mM MgCl₂, 60 mM KCl, 200 µM dNTPs, 50 ng primers, and 1 unit of Taq polymerase (Perkin Elmer Cetus). The cycling protocol was 95°C for 2 min, 30 cycles of 95°C for 1 min, 45°C for 1 min, and 72°C for 2 min, and finally 72°C for 10 min. The fragment was separated on a 1% agarose gel, excised, and purified with Qiaex (Qiagen). To analyze tk mRNA, 2 µg of total cellular RNA was reverse transcribed using a specific reverse primer (cTK811R 5'-GCAGCATGCAGGGCAGCGTC-3') at 42°C for 60 min, and then heated to 95°C for 4 min. The resultant first strand cDNA was then PCR amplified with specific internal primers (cTK789R 5'-GTAGGCGGCAGTGGCAGGAA-3'; and cTK61 5'-AGCTGCATFAACCTGCCGAC-3'). The cDNA amplification buffer contained 1 mM MgCl₂, 200 µM dNTPs, 50 mM KCl, 10 mM Tris pH 8.3, and 100 ng of each primer. The cycling protocol was 95°C for 2 min, then 25 cycles of 95°C for 30 sec, 57°C for 30 sec, and 72°C for 1 min, and finally 72°C for 10 min. Numbering for the tk genomic and cDNA sequence followed the scheme established by Deininger and colleagues (Bradshaw and Deininger, 1984; Flemington et al., 1987). Direct, double-stranded sequencing of PCR fragments was done using the fmol sequencing kit (Promega) and ³²P end-labelled primers.

Results

Specific-locus mutation frequency in infected cells
The mutant frequency induced at the tk marker locus was determined following expansion of a G418R insertional mutagenesis library (Table 1). An increase in mutant frequency of approximately 5-fold was observed as compared to a parallel, uninfected control population. Individual TK- mutants from these experiments were collected for further characterization.

Retrovirus shuttle vector provirus copy number
In order to determine the copy number of ZipNeoSV(X) proviruses, genomic DNA from individual mutants was digested with HindIII, a restriction enzyme with a single recognition site

TABLE 1
MUTATION INDUCTION FOLLOWING ZipNeoSV(X) INFECTION OF TK6 CELLS

Infection period (h)    TK- mutants/10⁶ viable cells a
0                       4.7 ± 1.1
72                      22.7 ± 5.8

a Data represent the mean of 3 mutation frequency determinations. Confidence intervals represent the standard error of the mean.

within the ZipNeoSV(X) provirus. Southern analysis was then performed using a probe which hybridizes to only 1 of the 2 ZipNeoSV(X) HindIII fragments. The number of bands is therefore equivalent to the number of ZipNeoSV(X) genomic inserts. The size of the bands will vary widely depending on the position of recognition sites in the host sequences flanking the insertion site. As can be seen in Fig. 1, most of the strains had only one HindIII fragment which hybridized to the ZipNeoSV(X) probe and none had more than two. The provirus copy number for each mutant is listed in Table 2. Mutant strains with apparently identical proviral insertions are likely to be siblings since mutant collection was necessarily performed after expansion of the originally infected population. These mutant lines were therefore grouped together for further analysis.

Rearrangements of the tk locus in TK- mutants
Rearrangements of the tk locus in individual G418R, TK- mutants were analyzed by Southern analysis with a tk specific hybridization probe (Fig. 2). Genomic DNA from TK- mutants was digested with SacI. Prominent 14.8-kb and 8.6-kb bands are a polymorphic pair created by a SacI RFLP at the tk locus (Yandell et al., 1986). The larger band is associated with the functional allele, and the smaller band is associated with the nonfunctional allele. A TK- phenotype attributable to loss of heterozygosity (LOH) is manifested as the disappearance of the 14.8-kb band without coordinate appearance of a new band hybridizing to the tk-specific probe. As can be seen in Fig. 2, several of the mutant strains analyzed have undergone an LOH event that may be

induced by proviral integration. Alternatively, mutants exhibiting LOH may have arisen spontaneously in the group collected following retrovirus shuttle vector infection. Many of the mutants demonstrate a rearrangement consistent with the integration of a ZipNeoSV(X) provirus within the tk locus. Restriction mapping of the insertion position is complicated by the presence of the alternate, unrearranged allele. However, examination of hybridization intensities of aberrant bands (Fig. 2) suggested that several of the insertions appeared to lie within a 2.2-kb SacI restriction fragment which consists primarily of intron 2, and hybridizes only to tk exon 3 in the cDNA probe (Fig. 4). The structural rearrangement in mutant group 12 (Table 2) was only observed following digestion with HindIII and BamHI (data not shown). Almost all of those mutants (7/8) which have not undergone a loss of heterozygosity event demonstrate a rearrangement within the tk locus (Fig. 2, Table 2).

Northern analysis of TK- mutants
We examined tk expression in the mutants by northern analysis (Fig. 3). Since the parental TK6 cell is functionally heterozygous at the tk locus, we first considered whether the presence of an alternate non-functional allele in the parental TK6 cell line would complicate the analysis of gene expression. This can be determined by examining mutants which have undergone loss of heterozygosity and therefore contain only the non-functional allele from TK6. These mutants do not produce a level of transcript observable by northern analysis (Fig. 3) although transcription from the non-functional allele is detectable by PCR amplification (see below). The presence of the alternate allele will not, therefore, interfere with northern analysis of expression from alleles newly inactivated by retroviral insertion. Most of the rearrangement mutants had no detectable level of tk expression (Fig. 3, Table 2), but there was no consistent relationship between gene expression and structural rearrangement of the tk locus. Each lane contained an equivalent amount of RNA as confirmed by re-probing the filters with a neo-specific hybridization probe (data not shown). No altered bands suggesting splicing alterations were observed, although small size differences due to exon skipping or the use of cryptic splice sites might be difficult to detect at this level of analysis.

Fig. 1. Proviral copy number in G418R ZipNeoSV(X) infected TK- mutants. Genomic DNA from each mutant was digested with HindIII, which has a single recognition site within the ZipNeoSV(X) proviral sequence. An internal 2.2-kb XhoI fragment of ZipNeoSV(X) encompassing the neo gene was used as a hybridization probe. Since the XhoI probe is complementary to only one ZipNeoSV(X) HindIII fragment, each provirus is represented by a single band on the autoradiogram. The total number of bands is equivalent to the number of proviruses in the genome of each mutant. Individual mutants are indicated by numbers on the top of each lane.

Fig. 2. Southern analysis of G418R ZipNeoSV(X) infected TK- mutants. Genomic DNA from each mutant was digested with SacI. A cloned human tk cDNA was used as a hybridization probe. The 2 bands at 14.8 and 8.6 kb represent a polymorphic pair associated with the functional and non-functional allele respectively. Loss of the 14.8-kb band without coordinate appearance of a new band is interpreted as a loss of heterozygosity event. The parental TK6 pattern is shown on the far left lane of the autoradiogram, and individual mutants are indicated by numbers on the top of each lane.

Fig. 3. Northern analysis of G418R ZipNeoSV(X) infected TK- mutants. Total cellular RNA was electrophoresed, transferred to a nylon filter and hybridized with a cloned human tk cDNA probe. The filter was re-probed with a neo-specific hybridization probe, which demonstrated that the amount of RNA available for hybridization in each lane was closely equivalent (data not shown). Expression in the parental TK6 cells is shown in the far right lane, and individual mutants are indicated by numbers on the top of each lane.

Sequence of the junction fragment
In order to more specifically characterize insertional mutagenesis at the tk locus we focused on a single TK- mutant, T2. The precise location of the insertion in T2 was determined by PCR amplification of a junction fragment consisting partially of the ZipNeoSV(X) provirus and partially of intron 2 sequence. The orientation of the integrated provirus relative to the tk locus can

Fig. 4. (A) Map of the tk locus and the proviral insertion in T2. The 7 tk exons and the provirus in intron 2 are drawn to scale. The positions of SacI (S) recognition sites are indicated. A polymorphic SacI site at approximately 22 kb, present only in the non-functional tk allele of TK6 cells, is indicated by an asterisk. (B) Proviral insertion in TK- mutant T2. A junction fragment of ZipNeoSV(X) and tk was generated by PCR amplification using primers LTR551 and gTK2376R as indicated. DNA-sequence analysis of the PCR fragment is shown with the LTR sequence boxed and in bold-face type and tk sequence indicated by the appropriate numbers. The provirus has integrated at base pair 862 within tk intron 2 as shown.

TABLE 2
CHARACTERISTICS OF TK- MUTANTS RECOVERED FOLLOWING RETROVIRUS SHUTTLE-VECTOR INFECTION

Mutant     Mutant lines          Provirus       Loss of           Structural
group a                          copy number    heterozygosity    alteration
1          T1                    1              Yes
2          T2                    1              No                Yes
3          T3,T4,T6              1              No                Yes
4          T5                    2              No                Yes
5          T7                    1              ND                No
6          T8,T12,T16,T18,T19    1              No                Yes
7          T9                    2              No                Yes
8          T10                   1              No                Yes
9          T11                   1              Yes
10         T13                   1              Yes
11         T14                   1              Yes
12         T17                   1              No                Yes
13         T20                   1              Yes

Expression level b: the entries of this column survive only as the sequence +/-, +, +, -, - and cannot be assigned to individual mutant groups.

a Mutants which were determined to be identical by Southern analysis of proviral inserts are considered to be siblings and are therefore grouped together.
b As estimated by northern analysis. Some tk transcript can be detected in all of these cell lines by PCR amplification.

also be determined by the specificity of the primers used to generate the PCR product (Fig. 4). DNA sequence analysis of the proviral integration site in T2 (Fig. 4) demonstrates that the insert lies within intron 2 and does not directly disrupt the tk coding sequence. Southern analysis (Fig. 2) suggests that inserts in several of the

Fig. 5. PCR Analysis of tk cDNA. Total cellular RNA was reverse transcribed with a tk-specific primer and the first strand cDNA was then PCR amplified. The wild type cDNA yields a unique 699 base-pair amplification product as seen for the parental TK6 cells, as well as mutants T5 and T10. Novel amplification products are observed in mutants T2 and T3 in addition to the wild-type fragment. A 100 base-pair ladder is shown in the far right lane.

other mutants (groups 3, 4 and 5) appear to be similarly located.

Alternative splicing in TK- insertional mutants
The tk transcripts from T2 and several other TK- mutant strains were analyzed by reverse transcription followed by PCR amplification (Fig. 5). Novel amplification products were observed in

Fig. 6. Aberrant splicing in TK- mutant T2. Splicing of the wild-type transcript is depicted for the parental TK6 cells at the top of the figure. The splicing pattern in the novel 600 base-pair amplification product in T2 is shown at the bottom. Exon 4, shown as an open box, is excluded from the spliced transcript. The tk locus is drawn to scale. The T2 genome contains a single proviral insert, localized within tk intron 2 as shown.

T2 and T3 in addition to the 699 base-pair wild-type cDNA fragment. DNA sequence analysis of T2 cDNA demonstrated that the novel 600-bp amplification product is an aberrantly spliced tk transcript from which exon 4 has been precisely excluded (Fig. 6). The sequences of exons 3 and 5 are completely intact and directly contiguous. No shuttle vector sequences are included in the 600-bp product. The 699 base-pair amplification product observed in T2 may potentially represent residual, appropriately spliced transcript, or the transcript of the alternate, non-functional parental allele. The parental allele contains an inactivating frameshift mutation in exon 4 (Grosovsky et al., 1993). DNA-sequence analysis of the 699 base-pair product identified an identical exon 4 frameshift. We therefore conclude that the 699 base-pair product represents the transcript of the non-functional tk allele in the parental TK6 cells.

Discussion

We have reported here that the frequency of TK- mutants in cells carrying 1-2 shuttle-vector proviruses per genome (Fig. 1) is approximately 2 × 10⁻⁵ (Table 1). This represents a 5-fold increase in mutant frequency as compared to an uninfected control population. We have also found a similar increase in mutant frequency at the hprt and aprt loci in the same insertional mutagenesis library studied here (Grosovsky et al., unpublished observations). Other studies (Jaenisch et al., 1983; Soriano et al., 1987; King et al., 1985; Varmus et al., 1981) of retroviral insertional inactivation have used replication competent retroviruses. The capability of retroviral shuttle vectors to cause mutation at specific cellular loci has not been previously evaluated. Inactivation of gene function attributable to proviral integration does not require direct disruption of the coding sequence (Soriano et al., 1987). Integration within introns or even outside of the locus can still cause a mutant phenotype by interference with gene expression.
The target size for retroviral mutagenesis is thus increased, although mutants isolated following retroviral infection may also include a substantial background of spontaneously occurring events. Varmus et al. (1981) reported that only 2 of 68 morphological revertants examined were attributable to insertional mutagenesis of a target viral src gene present within a single RSV provirus in a transformed rat-cell line. Under conditions which yielded approximately a 10-fold increase in HPRT- mutant frequency (King et al., 1985), 4/14 mutated hprt alleles contained rearrangements observable by Southern analysis, and only 2 of these were shown to be disrupted by a retroviral insertion.

Our results at the tk locus are summarized in Table 2. Mutants attributable to loss of heterozygosity account for 5/13 (38%) of the G418R TK- mutants. Multi-locus deletions or recombination leading to LOH can potentially be induced by proviral integration. Alternatively, some of these LOH mutants may represent a background of spontaneous events. Among the remaining mutant groups, 7/8 (87%) contained a genomic rearrangement detectable with a tk cDNA probe. Retroviral insertions within the tk locus would be reflected in this class of mutants although only one (T2) was confirmed by DNA-sequence analysis.

The percentage of mutants exhibiting LOH in the retrovirus infected mutant collection is substantially lower than previously reported for spontaneous TK- mutants in the same cell line (Yandell et al., 1986; Little et al., 1987). In a set of 51 spontaneous TK- TK6 mutants (Yandell et al., 1986; Little et al., 1987), 71% had undergone LOH and only 33% of those which retained heterozygosity demonstrated genomic rearrangements. Li et al. (1992) examined 31 normal growth, spontaneous TK- TK6 mutants. LOH was observed in 74% and only 1/31 exhibited a structural rearrangement of the functional tk allele. Densitometric analysis indicated that mitotic recombination was the most prevalent pathway for spontaneous LOH. Our own laboratory has found large deletions in 4/36 normal growth, spontaneous TK- mutants (Grosovsky et al., 1993).
In contrast to spontaneous TK- mutants, the retrovirus infected set would thus appear to be dominated by rearrangements. No consistent relationship was observed between gene expression and rearrangement by using northern analysis (Fig. 3, Table 2). The mechanisms of insertional mutagenesis at
the tk locus were characterized in detail in TK- mutant T2. Two distinct effects on tk gene expression were observed: an overall low level of tk transcript as detected by northern analysis, and the existence of an aberrantly spliced transcript associated with the proviral insert. Surprisingly, exon 4 is specifically skipped in the aberrantly spliced transcript although the provirus was localized within intron 2. The sequences of the splice consensus regions surrounding exon 4 were determined to be unaltered (data not shown). We do not presently understand the mechanism by which the provirus affects the splicing of exon 4, but it may not be unique. Other examples of exon skipping have recently been reported in which the local splice donor and consensus sequences remain unaltered (Zhang et al., 1992; Steingrimsdottir et al., 1992).

Analysis of insertional mutagenesis in mutant T2 permits some specific conclusions regarding the mechanisms responsible for the TK- phenotype. Proviral interference with transcriptional initiation may occur by a variety of previously reported mechanisms (Gridley et al., 1987; Jaenisch, 1988; Breindl et al., 1984; Hartung et al., 1986; Barker et al., 1991) including hypermethylation of the sequences surrounding the insert, alterations in chromatin structure, or disruption of cis-acting control elements. Any of these mechanisms could be responsible for the low overall level of tk transcript in T2 but could not account for aberrant splicing. Premature polyadenylation using the polyadenylation signal in the shuttle vector sequence may occur in T2 since the provirus is integrated in parallel to the orientation of tk transcription. A truncated and possibly labile tk transcript produced in this manner could also contribute to the low steady state level of tk transcript. No such transcript was observed, although rapid degradation may make a highly labile transcript difficult to detect.
On the other hand, aberrantly spliced transcripts are also often rapidly degraded, yet they were detected and characterized. Since a large fraction of primary tk transcripts appear to be aberrantly spliced, this alone could sufficiently account for the TK⁻ phenotype in T2. We cannot, however, exclude the possibility that other mechanisms of gene regulation or post-transcriptional processing may be affected and contribute to the low steady-state level of functional tk transcript.

The results presented here can be used to estimate the risk of insertional inactivation as a potential limitation in the use of retroviruses and retroviral shuttle vectors for gene therapy. The mutational risk of retroviral-mediated gene therapy must be considered throughout the genome of infected cells. However, only a subset of cellular loci is at risk of insertional inactivation. These loci must be either hemizygous or functionally heterozygous, but among loci of appropriate zygosity there may be additional parameters which influence the likelihood of mutation. The risk of proviral integration may be much higher in loci which are poised for transcription due to differences in chromatin configuration, a major factor in the selection of proviral integration sites (Kitamura et al., 1992; Pryciak and Varmus, 1992). The number of loci at risk is also restricted to those whose inactivation will result in a viable, although discernibly mutated, cell. If, for example, one supposes as few as $10^{3}$ loci at risk, and that the mutant frequency at tk is representative of other susceptible cellular loci, then the risk of mutation per cell is $(2 \times 10^{-5})(1 \times 10^{3}) = 2 \times 10^{-2}$, or 1 mutation in every 50 infected cells. If the number of loci at risk is as low as $10^{2}$, then the risk is still 1 mutation per 500 cells infected with retrovirus shuttle vector.

The data may also be considered in the context of proviral integration as a potentially useful tool for genetic analyses. The mutant frequency of $2 \times 10^{-5}$ observed at the tk locus suggests that more than $10^{5}$ infectants would need to be screened in order to identify a single clone with an appropriate null mutant phenotype produced by insertional inactivation of a comparable novel locus.
If inactivation of any one of several loci could produce the desired phenotype, then the number of infectants required for screening would be correspondingly smaller. For example, if a phenotype can be affected by any one of at least 100 separate genes, and the phenotype is easily identifiable, then insertional mutagenesis can be a powerful approach. Normal embryonic development is one such phenotype. However, if susceptible loci are restricted to those that are hemizygous or heterozygous, a set of 100 may be reduced to only a small handful. Therefore, the number of infectants which must be screened in order to expect the observation of one mutant remains high, although possibly lower than $10^{5}$.

In summary, this investigation has developed quantitative and qualitative information which can be used to estimate the efficacy of retroviruses in transposon-tagging strategies, and to evaluate the risk of mutagenesis associated with retroviral-mediated gene therapy. Although a modest increase in mutant frequency was observed following retroviral shuttle-vector infection, only some of the mutants contained a provirus within the inactivated tk allele. Further investigation with additional model systems will be required to determine if the results obtained for the tk locus are typical of loci at risk throughout the genome.
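The per-cell risk and screening estimates discussed above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' method. The function names are hypothetical. The per-cell risk uses the same linear product of mutant frequency and number of loci at risk as the text. The screening calculation additionally assumes that each infectant is an independent trial with the tk mutant frequency of 2 × 10⁻⁵, a simple binomial model that the paper itself does not state.

```python
import math

# Mutant frequency quoted in the text for cells carrying 1-2 proviruses.
TK_MUTANT_FREQUENCY = 2e-5


def risk_per_cell(mutant_frequency: float, loci_at_risk: int) -> float:
    """Expected mutations per infected cell: frequency per locus times loci at risk."""
    return mutant_frequency * loci_at_risk


def cells_per_mutation(mutant_frequency: float, loci_at_risk: int) -> float:
    """Number of infected cells per expected mutation (reciprocal of the risk)."""
    return 1.0 / risk_per_cell(mutant_frequency, loci_at_risk)


def prob_at_least_one_mutant(mutant_frequency: float, infectants: int) -> float:
    """P(>= 1 mutant) among `infectants` independent clones (assumed binomial model)."""
    # 1 - (1 - f)^n, written with log1p/expm1 to stay accurate for small f.
    return -math.expm1(infectants * math.log1p(-mutant_frequency))


def infectants_for_confidence(mutant_frequency: float, confidence: float) -> int:
    """Smallest n with P(>= 1 mutant) >= confidence under the same assumption."""
    return math.ceil(math.log1p(-confidence) / math.log1p(-mutant_frequency))


if __name__ == "__main__":
    for loci in (1_000, 100):
        risk = risk_per_cell(TK_MUTANT_FREQUENCY, loci)
        cells = cells_per_mutation(TK_MUTANT_FREQUENCY, loci)
        print(f"{loci} loci at risk: risk per cell = {risk:.0e}, "
              f"about 1 mutation per {cells:.0f} cells")
    n = 100_000
    p = prob_at_least_one_mutant(TK_MUTANT_FREQUENCY, n)
    print(f"P(at least one mutant among {n} infectants) = {p:.2f}")
    print("Infectants for 95% confidence:",
          infectants_for_confidence(TK_MUTANT_FREQUENCY, 0.95))
```

Under these assumptions, 5 × 10⁴ infectants give one expected mutant, whereas roughly 1.5 × 10⁵ are needed for a 95% chance of recovering at least one. This is one way to read the statement that more than 10⁵ infectants would need to be screened.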

Acknowledgements

This work was supported by grant CN-15 from the American Cancer Society, and by grant RO1 CA55659 from the National Institutes of Health. We thank Professor B.W. Glickman for his support during production of the insertional mutagenesis library.

References

Barker, D.D., H. Wu, S. Hartung, M. Breindl and R. Jaenisch (1991) Retrovirus induced insertional mutagenesis: Mechanisms of collagen mutation in Mov 13 mice, Mol. Cell Biol., 11, 5154-5163.
Bishop, J.M. (1987) The molecular genetics of cancer, Science, 235, 305-311.
Bradshaw Jr., H.D. and P.L. Deininger (1984) Human thymidine kinase gene: molecular cloning and nucleotide sequence of a cDNA expressible in mammalian cells, Mol. Cell Biol., 4, 2316-2320.
Breindl, M., K. Harbers and R. Jaenisch (1984) Retrovirus induced lethal mutation in collagen I gene of mice is associated with altered chromatin structure, Cell, 38, 9-16.
Cepko, C.L., B.E. Roberts and R.C. Mulligan (1984) Construction and applications of a highly transmissible murine retrovirus shuttle vector, Cell, 37, 1053-1062.
Cooley, L., R. Kelley and A. Spradling (1988) Insertional mutagenesis of the Drosophila genome with single P elements, Science, 239, 1121-1128.
Flemington, E., H.D. Bradshaw, V. Traina-Dorge, V. Slagel and P.L. Deininger (1987) Sequence, structure and promoter characterization of the human thymidine kinase gene, Gene, 52, 267-277.
Gridley, T., P. Soriano and R. Jaenisch (1987) Insertional mutagenesis in mice, Trends Genet., 3, 162-166.
Grosovsky, A.J. and J.B. Little (1985) Evidence for linear response for the induction of mutations in human cells by x-ray exposures below 10 rads, Proc. Natl. Acad. Sci. (U.S.A.), 82, 2092-2095.
Hartung, S., R. Jaenisch and M. Breindl (1986) Retrovirus insertion inactivates mouse α1(I) collagen gene by blocking initiation of transcription, Nature (London), 320, 365-367.
Jaenisch, R. (1988) Transgenic animals, Science, 240, 1468-1474.
Jaenisch, R., K. Harbers, A. Schnieke, I. Lohler, D. Chumakov, D. Jahner, D. Grotkopp and E. Hoffmann (1983) Germline integration of Moloney murine leukemia virus at the Mov 13 locus leads to recessive lethal mutations and early embryonic death, Cell, 32, 209-216.
Jenkins, N.A., N.G. Copeland, B.A. Taylor and B.K. Lee (1981) Dilute (d) coat color mutation of DBA/2J mice is associated with the site of integration of an ecotropic MuLV genome, Nature (London), 293, 370-374.
King, W., M.D. Patel, L.I. Lobel, S.P. Goff and M.C. Nguyen-Huu (1985) Insertional mutagenesis of embryonal carcinoma cells by retroviruses, Science, 228, 554-558.
Kitamura, Y., Y.M.H. Lee and J.M. Coffin (1992) Nonrandom integration of retroviral DNA in vitro: Effect of CpG methylation, Proc. Natl. Acad. Sci. (U.S.A.), 89, 5532-5536.
Li, C.Y., D.W. Yandell and J.B. Little (1992) Molecular mechanisms of spontaneous and induced loss of heterozygosity in human cells in vitro, Somat. Cell Mol. Genet., 18, 77-87.
Liber, H.L. and W.G. Thilly (1982) Mutation assay at the thymidine kinase locus in diploid human lymphoblasts, Mutation Res., 94, 467-485.
Little, J.B., D.W. Yandell and H.L. Liber (1987) Molecular analysis of mutations at the tk and hprt loci in human cells, in: M.M. Moore, D.M. DeMarini, F.J. de Serres and K.H. Tindall (Eds.), Mammalian Cell Mutagenesis, Banbury Report 28, Cold Spring Harbor Laboratory, Cold Spring Harbor NY, pp. 225-236.
Miller, A.D. (1992) Human gene therapy comes of age, Nature (London), 357, 455-460.
Miller, A.D., M.F. Law and I.M. Verma (1985) Generation of helper-free amphotropic retroviruses that transduce a dominant-acting, methotrexate-resistant dihydrofolate reductase gene, Mol. Cell Biol., 5, 431-437.
Pryciak, P.M. and H.E. Varmus (1992) Nucleosomes, DNA binding proteins, and DNA sequence modulate retroviral integration target site selection, Cell, 69, 769-780.
Roeder, G.S. and G.R. Fink (1980) DNA rearrangements associated with a transposable element in yeast, Cell, 21, 239-249.
Rossi, A.M., A.D. Tates, A.A. van Zeeland and H. Vrieling (1992) Molecular analysis of mutations affecting hprt mRNA splicing in human T-lymphocytes in vivo, Environ. Mol. Mutagen., 19, 7-13.
Shih, C., J.P. Stoye and J.M. Coffin (1988) Highly preferred targets for retrovirus integration, Cell, 53, 531-537.
Skopek, T.R., H.L. Liber, B.W. Penman and W.G. Thilly (1978) Isolation of a human lymphoblastoid line heterozygous at the thymidine kinase locus: possibility for a rapid human cell mutation assay, Biochem. Biophys. Res. Commun., 84, 411-416.
Soriano, P., T. Gridley and R. Jaenisch (1987) Retroviruses and insertional mutagenesis in mice: proviral integration at the Mov 34 locus leads to early embryonic death, Genes Develop., 1, 366-375.
Soriano, P., T. Gridley and R. Jaenisch (1989) Retroviral tagging in mouse development and genetics, in: M.M. Howe and D.E. Berg (Eds.), Mobile DNA, American Society for Microbiology, Washington DC, pp. 927-937.
Steingrimsdottir, H., G. Rowley, G. Dorado, J. Cole and A.R. Lehman (1992) Mutations which alter splicing in the human hypoxanthine-guanine phosphoribosyltransferase gene, Nucleic Acids Res., 20, 1201-1208.
Stoye, J.P., S. Fenner, G.E. Greenoak, C. Moran and J.M. Coffin (1988) Role of endogenous retroviruses as mutagens: the hairless mutation of mice, Cell, 54, 383-391.
Varmus, H.E., N. Quintrell and S. Ortiz (1981) Retroviruses as mutagens: insertion and excision of a nontransforming provirus alter expression of a resident transforming provirus, Cell, 25, 23-36.
Yandell, D.W., T.P. Dryja and J.B. Little (1986) Somatic mutations at a heterozygous autosomal locus in human cells occur more frequently by allele loss than by intragenic structural rearrangements, Somat. Cell Mol. Genet., 12, 255-263.
Zhang, L.-H., H. Vrieling, A.A. van Zeeland and D. Jenssen (1992) Spectrum of spontaneously occurring mutations in the hprt gene of V79 Chinese hamster cells, J. Mol. Biol., 223, 627-635.