Structure and genomic organization of the mouse dihydrofolate reductase gene

Cell, Vol. 19, 355-364, February 1980. Copyright © 1980 by MIT


Jack H. Nunberg,*# Randal J. Kaufman,*† Annie C. Y. Chang,‡§** Stanley N. Cohen§** and Robert T. Schimke*††
Departments of Biological Sciences,* Pharmacology,† Medical Microbiology,‡ Genetics§ and Medicine**
Stanford University
Stanford, California 94305

Summary

The genomic organization of the mouse dihydrofolate reductase gene has been determined by hybridization of specific cDNA sequences to restriction endonuclease-generated fragments of DNA from methotrexate-resistant S-180 cells. The dihydrofolate reductase gene contains a minimum of five intervening sequences (one in the 5’ untranslated region and four in the protein-coding region) and spans a minimum of 42 kilobase pairs on the genome. Genomic sequences at the junction of the intervening sequence and mRNA-coding sequence and at the polyadenylation site have been determined. A similar organization is found in independently isolated methotrexate-resistant cell lines, in the parental sensitive cell line and in several inbred mouse strains, indicating that this organization represents that of the natural gene.

Introduction

Murine dihydrofolate reductase (DHFR, EC 1.5.1.3), a single polypeptide of molecular weight 21,500 (Stone and Phillips, 1977), is involved in the synthesis of tetrahydrofolates, which are required for de novo synthesis of thymidine, purines and glycine. Upon growth of mammalian cells in the presence of progressively increasing concentrations of methotrexate (MTX), a specific inhibitor of DHFR, resistant cells are obtained which contain an increased level of the target enzyme as a result of a corresponding amplification of that gene (Alt et al., 1978). Amplification of the DHFR gene has been observed in a large number of MTX-resistant murine, as well as hamster, cell lines (Alt et al., 1978; Nunberg et al., 1978; Kaufman, Brown and Schimke, 1979; Kellems et al., 1979). Cell lines in which the DHFR gene copy number is stable in the absence of MTX, such as the Chinese hamster ovary (CHO) MK42 (Nunberg et al., 1978) and murine L5178Y (Dolnick et al., 1979), have the amplified genes localized to a homogeneously staining region of a specific chromosome. Cell lines in which DHFR genes are lost in the absence of selection pressure appear to have DHFR genes on acentric, extrachromosomal fragments (double minute chromosomes) (Kaufman et al., 1979). To understand the molecular mechanism(s) of gene amplification, as well as those which determine the intra- or extrachromosomal state of the amplified genes, we have examined the genomic organization both within and adjacent to the DHFR gene(s) of MTX-resistant and -sensitive cells.

†† To whom requests for reprints should be addressed.
# Present address: Cetus Corporation, 600 Bancroft Way, Berkeley, California 94710.

Results

Structure of the DHFR mRNA

The double-stranded cDNA to DHFR mRNA from the MTX-resistant murine S-180 M-50 cell line was cloned in the Pst I site of pBR322 by dG-dC tailing, and a detailed restriction map of the DHFR cDNA insert of the pDHFR7 plasmid has been published (Chang et al., 1978). We have since analyzed the cDNA inserts of additional clones; the additional sequence data obtained enable us to construct a composite picture of the 1600 nucleotide DHFR mRNA molecule (Figure 1). A total of approximately 12 independently isolated DHFR clones have been examined by restriction endonuclease analysis involving 2-5 sites per molecule, and no heterogeneity has been observed so far at any site. Thus even though the cDNA used for molecular cloning had been obtained from mRNA from an uncloned population of MTX-resistant cells containing approximately 200 copies of the DHFR gene per haploid genome, no heterogeneity in the 1600 nucleotide DHFR mRNA has been detected. The coding region of the mRNA (186 amino acids, 564 nucleotides including the initiation codon AUG and a single termination codon UAA) is located near the extreme 5’ end of the molecule. At present, 82 nucleotides are known to be 5’ to the AUG. As is generally the case in eucaryotic mRNAs (Kozak, 1978), translation of DHFR appears to be initiated at the first AUG encountered at the 5’ end of the mRNA. Only weak homologies can be observed between the 3’ terminus of the 18S ribosomal RNA and DHFR mRNA sequences immediately to the 5’ side of the initiating AUG. The 3’ untranslated region includes 950 nucleotides of the 1600 nucleotide mRNA. Although a large 3’ untranslated region is a feature common to several mRNAs [such as hen ovalbumin (McReynolds et al., 1978) and ovomucoid (Buell et al., 1979), and mouse β-globin (Proudfoot and Brownlee, 1976)], little is known about its function.
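The segment lengths quoted above can be checked against each other with a few lines of arithmetic. The sketch below is only a bookkeeping aid, not part of the original analysis; it assumes that the 186 amino acids are counted without the initiator methionine, which is the reading under which the quoted 564 nucleotides come out exactly, and all constant and function names are illustrative.

```python
# Segment lengths (in nucleotides) quoted in the text for the 1600 nucleotide DHFR mRNA.
FIVE_PRIME_UTR_NT = 82     # nucleotides known to lie 5' to the AUG
CODING_REGION_NT = 564     # includes the initiation codon AUG and the termination codon UAA
THREE_PRIME_UTR_NT = 950   # 3' untranslated region
AMINO_ACIDS = 186          # assumed here to exclude the initiator methionine


def coding_length_nt(amino_acids: int) -> int:
    """Nucleotides needed for the codons plus one initiation and one termination codon."""
    return 3 * amino_acids + 3 + 3


def total_length_nt() -> int:
    """Sum of the three quoted segments (the poly(A) tail is not counted)."""
    return FIVE_PRIME_UTR_NT + CODING_REGION_NT + THREE_PRIME_UTR_NT


print(coding_length_nt(AMINO_ACIDS))  # 564
print(total_length_nt())              # 1596
```

Under these assumptions the three segments sum to 1596 nucleotides, which agrees with the approximate figure of 1600 nucleotides used for the mRNA throughout.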
The evidence that this region of the DHFR mRNA is indeed untranslated has been provided by Alt, Kellems and Schimke (1976), who showed that only one protein is overproduced in MTX-resistant cells. Kronenberg, Roberts and Efstratiadis (1979) have shown that the 3’ untranslated region of rabbit β-globin mRNA is not required for in vitro translation. In addition, we have characterized a 750 nucleotide DHFR mRNA species in these cells which is polysomal, polyadenylated and codes for DHFR in translational assays. This mRNA lacks the entire 3’ 850 nucleotides of the 1600 nucleotide mRNA species, whereas the remaining sequences are identical (D. Setzer, J. H. Nunberg, M. McGrogan and R. T. Schimke, manuscript in preparation). Thus we conclude that the majority of the 3’ noncoding sequences are not absolutely required for translation. One known function of the 3’ end of eucaryotic mRNA molecules is to serve as a site for polyadenylation. The poly(A) tail of the DHFR mRNA is found in a cDNA, and is separated by 12 nucleotides from the sequence AAUAAA, which has been found within 20 nucleotides of the polyadenylation site of almost all eucaryotic mRNAs studied [for example, hen ovalbumin (McReynolds et al., 1978), mouse β-globin (Proudfoot and Brownlee, 1976) and hen ovomucoid (Buell et al., 1979)]. It is not known whether this sequence is involved in polyadenylation or transcription termination.

Figure 1. Structure of the Mouse 1600 Nucleotide DHFR mRNA

The 1600 nucleotide DHFR mRNA is represented in the upper figure. The accentuated region between the translation initiating AUG and translation terminating UAA codons represents the protein-coding region of the mRNA. The poly(A) tail of the mRNA is indicated at the extreme 3’ end. The positions of the restriction endonuclease sites are as determined by Chang et al. (1978) and are included for orientation purposes. The 100 nucleotide scale marker is shown at the lower right. DNA sequences corresponding to various regions of the 1600 nucleotide DHFR mRNA were determined using the chemical sequencing technique of Maxam and Gilbert (1977; see Experimental Procedures) and are shown in the lower portion of the figure. Restriction sites are included for orientation and the numbering is approximate. The following fragments were used to obtain these sequences: pBR322 Hpa II*3858-Hae III-a [pDHFR 7, 12 and 26 (Chang et al., 1978)], pBR322 Hpa II3s58-Taq I-a’, Hae III-a-Hpa II’, Hpa II*-Hae III-b, Hinf I-d’-Bgl II-a, Hinf I-d-Bgl II-a*, Bgl II-b-pBR322 Pst I*36, , ,, Bgl II-b-pBR322 Hpa II*3548. [Fragments are written 3’-5’ and are labeled at one 5’ terminus, indicated by *; when pBR322 sites were used, they are indicated by the nucleotide position determined by Sutcliffe (1978).]

Genomic Representation of the 3’ Untranslated Region of the DHFR mRNA

To examine the structure of the DHFR gene encoding the 3’ untranslated region, a 5 kb Bam HI-generated fragment containing this region was cloned from DNA of MTX-resistant S-180 R1A cells into the Pst I site of

pBR322 by dG-dC tailing. A heteroduplex formed between this plasmid (pDHFRg1) and a properly oriented cDNA plasmid (pDHFR11) (Chang et al., 1978), both cut with Eco RI to generate asymmetric plasmid arms, is shown in Figure 2. DHFR sequences form one continuous duplex region of 1010 ± 70 nucleotides, representing the 3’ end of the DHFR mRNA. No internal deletion loops are visible and most, if not all, of the 3’ terminus of the cDNA sequence, excluding the poly(A) and dG-dC tails, appears to be duplexed. Thus sequences encoding the 3’ terminal 1010 nucleotides of the DHFR mRNA and the entire 3’ untranslated region constitute one continuous unit in the genome. (The presence of a short intervening sequence may be missed in heteroduplex and restriction endonuclease analyses and cannot be rigorously excluded at present.) The genomic DNA sequence in this area (Figure 3) contained a Hinf I site not present in the cDNA sequence, located 200 nucleotides to the 5’ side of the Bgl II-a site in the 3’ untranslated region. This site was used in the determination of the DNA sequence at the junction between the intervening sequence and the mRNA sequence. The DNA sequence surrounding the polyadenylation site was also determined. The sequences are shown in Figure 3, with relevant mRNA sequences included for comparison. The polyadenylation site of the DHFR gene is continuous with the rest of the 3’ untranslated region, and is located within the sequence TAAAAT. Although examples are few, a transcribed A has been found at the polyadenylation site of several other mRNAs, such as the mouse β-globin mRNA (Konkel, Tilghman and Leder, 1978) and SV40 early mRNAs (Reddy et al., 1979). At the intervening sequence-mRNA sequence junction, the genomic sequence diverges from the mRNA sequence (underlined) as shown: TCCAGATAC (Figure 3). Although the exact splice point cannot be determined because of the likelihood of terminal redundancy at the intervening sequence junctions (for example see Catterall et al., 1978), the sequences do diverge at exactly the position which Breathnach et al. (1978) have proposed as the splice point; that is, after the dinucleotide AG found at virtually all 5’-3’ intervening-mRNA sequence junctions. Analysis of this junction, as well as that at the polyadenylation site, using the Korn program with default parameters (Korn, Queen and Wegman, 1977), revealed no obviously significant regions of secondary structure.

Figure 2. Heteroduplex between 1600 Nucleotide DHFR cDNA Sequences and 5 kb Genomic Bam HI Fragment Encoding the 3’ Untranslated Region of the DHFR mRNA

Heteroduplex molecules of Eco RI-digested cDNA plasmid pDHFR11 and Eco RI-digested genomic plasmid pDHFRg1 were prepared and visualized as described in Experimental Procedures. The pBR322 arms provide the orientation of the cDNA sequence as well as double-strand molecular weight markers. Single-strand φX174 was included as a single-stranded marker. The DHFR sequences share 1010 ± 70 nucleotides located at the extreme 3’ end of the cDNA sequence.

Organization of the DHFR Gene in MTX-Resistant S-180 Cells

The organization of the DHFR gene in the MTX-resistant S-180 M-50 cell line was examined by hybridization of specific fragments of the DHFR cDNA sequence to genomic DNA (Southern, 1975). Because the M-50 cell line contains approximately 200 copies of the DHFR gene per haploid genome, this analysis is both simple and sensitive. The fragments used in

these experiments and the resulting map of the DHFR gene are shown in Figure 4. Typical data obtained when singly digested M-50 genomic DNA is identified with various fragments are shown in Figures 5a-5f. Two facts become clear from such an analysis: the mouse DHFR gene contains a minimum of five intervening sequences, and it covers a minimum of approximately 42 kb in the genome. Since none of the enzymes used, with the exception of Hae III, cuts within the DHFR mRNA sequence, the presence of a restriction endonuclease cleavage site is indicative of an intervening sequence. Figure 4 (bottom) shows the locations of the restriction sites used to define the intervening sequences. The minimum length of the DHFR gene was estimated by first summing the lengths of the presumed contiguous genomic Bam HI bands (from 3’ to 5’: 3 kb of the 5 kb fragment, 19 kb and 16 kb) and then including an arbitrary 3 kb from the 5’ terminal Pst I bands (2.8 and 1.2 kb from 3’ to 5’). The 42 kb span of the DHFR gene is remarkably large. Among other genes which have been examined, the amount of intervening sequence is not nearly as great; for instance, the 7.7 kb hen ovalbumin gene yields a 1.8 kb mRNA (Gannon et al., 1979) and the 1.6 kb mouse β-globin gene yields a 0.6 kb mRNA (Konkel et al., 1978). By using different restriction endonucleases, including the frequently cutting enzyme Hae III, we believe it probable that we have identified, within the limits of the experiments, most if not all of the intervening sequences within the DHFR gene.

Figure 3. Analysis of the 5 kb Genomic Bam HI Fragment Encoding the 3’ Untranslated Region of the DHFR mRNA (pDHFRg1)

The section of the 5 kb genomic Bam HI fragment encoding the 3’ untranslated region of the DHFR mRNA (pDHFRg1) is represented in the upper figure and compared with the 3’ section of the mRNA. The accentuated region in each represents the protein-coding sequence. Restriction sites are included for comparison. DNA sequences were determined at the intervening-mRNA sequence junction and at the polyadenylation site using the chemical sequencing technique of Maxam and Gilbert (1977), and as described in Experimental Procedures. The following fragments were used: Hinf I*-Bgl II-a, Hinf I-e-Hae III*. mRNA sequences are included for comparison. The boxed sequences represent regions in which genomic and mRNA sequences diverge. The amino acid sequence encoded by the mRNA is shown below the mRNA sequence. Those amino acids not encoded within this genomic sequence are shown in italics.

[Sequence panels of Figure 3, as far as they survive in this copy; numbering is approximate and "..." marks unrecovered stretches.]

Intervening sequence-mRNA junction
genomic (10-60): ...(GAN)TCATGAA TTTTTTTTCT ... CCCAGGCGTC CTCTCTGAGG TCCAGGAGGA AAAA...
mRNA (570-620): TTGGGGAAA TATAAACTTC ... CCCAGGCGTC CTCTCTGAGG TCCAGGAGGA AAAA...
amino acids (153-173): LeuGlyLys TyrLysLeu LeuProGluTyr ProGlyVal LeuSerGlu ValGlnGluGlu Lys

Polyadenylation site
mRNA (1530-1600): ATTGAGAATG... ...GATACTGCTT GAAATGAAAA TTTAATAAGT TAGAAACTAA ACTTTATAAA AATAAAAAAA TGAGCATTAA poly(A)
genomic (10-140): ATTGAGAATG... ...GATACTGCTT GAAATGRAAA TTTAATAAGT TAGAAACTAA ACTTTATAAA AATAAAAAAA TGAGCATTAA AATGGCTTTC CTCATCTCAG... ...CAGGGTTTCA GATCATCAGG TCAGAGAAAG TATTTGTGCC (NGG)

There are, however, several potential pitfalls in the interpretation of data such as those presented above. In addition to the limitations imposed by requiring a restriction site for the identification of an intervening sequence, this method is also limited by the size of fragment that will bind to nitrocellulose as well as by the size of hybrid that will form under filter hybridization conditions. In virtually all the cases analyzed in these experiments, we have been able to localize hybridization of any mRNA sequence to one and only one genomic fragment. This was managed either by probing with fragments that hybridize to one and only one band, either directly or by comparison of overlapping fragments, or by insuring that two adjacent cDNA fragments hybridize to one and only one common band. This insurance protects against many of the pitfalls of this method of gene mapping, such as cross-contamination between fragments and partial digestion products of genomic DNA. It also draws attention to the potential problem that arises if the sequences under study are present in different allelic or multiple-copy states. In the case of the 200 copies of the DHFR gene in these cells, we observed no evidence for heterogeneity. We were not able to localize hybridization of some 5’ terminal cDNA sequences to a unique Bam HI-generated band (see Figures 5a, 5b and 5e). No consistent map could be constructed to include the extra 16 kb Bam HI band. Determination of whether this represents minor heterogeneity in the amplified DHFR genes must await molecular cloning of this region. The locations of the intervening sequences in the DHFR gene are summarized in Figure 4. Four intervening sequences are located within the coding region of the gene. This implies that, on the average, regions of coding sequence 120 nucleotides long are separated by approximately 7 kb of intervening sequence.

Figure 4. Organization of the Mouse DHFR Gene

The organization of the mouse DHFR gene was determined by hybridization of specific fragments of DHFR cDNA sequence to genomic DNA digested with restriction endonucleases, electrophoresed and transferred to nitrocellulose filters (Southern, 1975; and Experimental Procedures). The various DHFR cDNA sequences used to probe the nitrocellulose filters are represented in (a). The endpoints correspond to the restriction sites in the DHFR cDNA shown in the middle. The positions of the intervening sequences deduced from these experiments are represented in (b). The positions are approximate. (c) localizes the restriction sites used to construct the map shown above. With the exception of the Hae III-a and Hae III-b sites shown, all sites occur within intervening sequences. The numbers refer to the molecular weight (in kb) of the genomic fragments in which the indicated cDNA sequences are found. It should not be assumed that the genomic fragments shown are contiguous.
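The minimum-span estimate described in the text is a simple sum, and it can be reproduced as a check. The sketch below is a bookkeeping aid under stated assumptions rather than the authors' calculation: the fragment lengths are those quoted in the text, the count of five coding segments follows from four intervening sequences within the coding region, and all names are illustrative.

```python
# Lengths in kb, as quoted in the text (listed from 3' to 5').
BAM_HI_CONTIGUOUS_KB = [3, 19, 16]  # 3 kb of the 5 kb fragment, then the 19 kb and 16 kb bands
PST_I_ALLOWANCE_KB = 3              # arbitrary allowance taken from the 5' terminal Pst I bands

CODING_REGION_NT = 564              # protein-coding region, including AUG and UAA
INTERVENING_IN_CODING = 4           # intervening sequences located within the coding region


def minimum_span_kb(bam_fragments_kb, five_prime_allowance_kb):
    """Lower bound on the gene span: contiguous Bam HI bands plus the 5' allowance."""
    return sum(bam_fragments_kb) + five_prime_allowance_kb


def mean_coding_segment_nt(coding_nt, intervening_count):
    """Average coding segment length when n intervening sequences split the region into n + 1 parts."""
    return coding_nt / (intervening_count + 1)


print(minimum_span_kb(BAM_HI_CONTIGUOUS_KB, PST_I_ALLOWANCE_KB))        # 41
print(mean_coding_segment_nt(CODING_REGION_NT, INTERVENING_IN_CODING))  # 112.8
```

The quoted pieces add up to 41 kb, in line with the minimum of approximately 42 kb given in the text, and the average coding segment of about 113 nucleotides is of the same order as the figure of 120 nucleotides cited there.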

One intervening sequence is located within the 5’ untranslated region, and the 3’ untranslated region is uninterrupted.

Organization of the DHFR Gene in Other MTX-Resistant Murine Cell Lines

The organization of the DHFR gene was also examined in independently isolated MTX-resistant murine cell lines with differing properties of chromosome structure and MTX-resistance. S-180 R2 and S-180 E92 (a subclone of R1) are cell lines from our laboratory which show, respectively, unstable and stable MTX resistance (Kaufman et al., 1979). Copies of the DHFR gene are present on double minute chromosomes in the R2 cell line (Kaufman et al., 1979). The MTX-resistant L1210 line is a lymphoma-derived, stably resistant line from the laboratory of J. Bertino (Departments of Pharmacology and Medicine, Yale University; Alt et al., 1978). The MTX-resistant L5178Y line is also from the laboratory of J. Bertino and differs from the other murine lines in possessing a stable, nearly diploid karyotype with the DHFR genes localized to a homogeneously staining region on a specific chromosome (Dolnick et al., 1979). The unstably MTX-resistant 3T6 line was developed by Kellems et al. (1979). Eco RI-digested genomic DNA from each cell line was transferred to nitrocellulose and probed using kinetically purified DHFR cDNA prepared according to the method of Alt et al. (1978). Figure 6a shows that all the lines showed essentially the same pattern of hybridization. (The 5’ terminal fragment is not visible in these or other experiments in which total DHFR cDNA sequence was used as probe.) None of the above mentioned properties of these cell lines appears to be reflected in the genomic organization both within and adjacent to the DHFR gene, which appears to be the same as that found in the uncloned S-180 M-50 cell line.

Organization of the DHFR Gene in Sensitive S-180 Cells, and in Inbred Mouse Strains

The DHFR gene organization in MTX-sensitive, parental S-180 cells, as well as that in several inbred mouse strains, was examined using similar techniques. Genomic DNA was digested with Eco RI and the nitrocellulose filters were probed with pDHFR11, a cDNA plasmid containing essentially the entire DHFR mRNA sequence (Figure 6b). Within the resolution of the experiment, no major differences are noted in the organization of the DHFR gene of the inbred mouse strains, the parental MTX-sensitive S-180 S-3 cell line and the MTX-resistant cell lines. In an attempt to improve the sensitivity of the filter hybridization to enable analysis of the unique-copy DHFR gene of MTX-sensitive S-180 S-3 cells, genomic DNA was enriched for DHFR sequences by preparative gel electrophoresis as described in Experimental Procedures. Figure 6c shows that hybridization occurs to the same bands in the DNA of MTX-resistant and MTX-sensitive S-180 cells. (The higher molecular weight band in each lane is of unknown origin.)
Taken together, these results suggest that there has been no change in the organization of the DHFR gene or in sequences in the immediate vicinity of the gene in the course of amplification. Thus the organization observed in MTX-resistant cells does not result from the amplification process. In addition, the finding that the DHFR gene is similarly organized in a variety of inbred mouse strains indicates that the arrangement observed in cultured cells is not the result of growth in culture or viral transformation. The DHFR gene organization that we have examined is that of the natural gene.

Figure 5. Hybridization of Specific DHFR cDNA Sequences to Genomic DNA of the MTX-Resistant S-180 M-50 Cell Line

Genomic DNA from MTX-resistant S-180 M-50 cells was digested to completion with Eco RI (E), Bam HI (B), Pst I (P), Hind III (Hi) or Hae III (Ha), electrophoresed and transferred to nitrocellulose filters (Southern, 1975; and Experimental Procedures). Filters were probed with the DHFR cDNA fragments indicated (a-e) in the middle of the figure (see Experimental Procedures). Only single enzyme digestions probed with several of the fragments used are shown here. Hind III-digested λ DNA served as molecular weight markers.

Figure 6. Hybridization of DHFR cDNA Sequences to Genomic DNA of MTX-Resistant and -Sensitive Murine Cell Lines and Inbred Mouse Strains

Experiments similar to those described in the legend to Figure 5 were performed as follows. (a) Eco RI-digested DNA from various independently isolated MTX-resistant murine cell lines was probed with kinetically purified DHFR cDNA (Experimental Procedures). The cell lines are described in the text. The markers are Eco RI-digested λ DNA. (b) Eco RI-digested DNA from the MTX-sensitive, parental S-180 S-3 cell line and from various inbred mouse strains was probed with pDHFR11. Hind III-digested λ DNA served as markers. (c) Hind III + Pst I-digested DNA from the MTX-sensitive, parental S-3 cell line was enriched for DHFR sequences as described (Experimental Procedures) and probed with pDHFR11. Comparable MTX-resistant M-50 cell DNA is included for comparison. The markers are Hind III-digested λ DNA.

Discussion

We have already argued that the 3’ untranslated region of the mRNA is not an absolute requirement for translation. What then is the origin of the 3’ untranslated region of eucaryotic mRNAs? Several observations hint at a possible explanation. First, we have shown that this region of the DHFR mRNA is represented as one continuous region in the genome, uninterrupted by intervening sequences. Similar results have been obtained in the case of the hen ovalbumin (Dugaiczyk et al., 1978) and mouse and human β-globin (Konkel et al., 1978; Lawn et al., 1978) genes. Second, we and other investigators (Konkel et al., 1978; McReynolds et al., 1978; Buell et al., 1979) have observed an unusual base composition and distribution in the 3’ untranslated region. These regions are typically rich in A and U nucleotides and contain many short homo-oligomeric stretches of these nucleotides. There is a certain similarity between the base composition and distribution in the 3’ untranslated region and that observed in intervening sequences. In both the mouse β-globin gene (Konkel et al., 1978) and the hen ovalbumin gene (Robertson et al., 1979) intervening sequences are AT-rich and contain numerous homo-oligomeric stretches, especially of A and T. [The small segment of intervening sequence which we have determined (Figure 3) is also T-rich.] These DNA segments encoding the 3’ untranslated regions may represent ancestral “intervening sequences” whose excision in RNA processing is generally not required for functional gene expression and which, therefore, persist within mRNAs. Perhaps these sequences result from the transposition of a polyadenylation site domain adjacent to the ancestral sequences encoding the mRNA. An alternative hypothesis is that polyadenylation sites developed at random beyond the translation-termination region of the ancestral gene, thus creating segments of DNA encoding uninterrupted 3’ untranslated regions of polyadenylated mRNAs. This possibility is in keeping with the observation that the truncated, 750 nucleotide DHFR mRNA present in these cells has the sequence AUAA within 25 nucleotides of the polyadenylation site (D. Setzer, M. McGrogan and R. Schimke, manuscript in preparation), suggesting the creation of a secondary polyadenylation site within the 3’ untranslated sequences of the 1600 nucleotide DHFR mRNA.

Organization of the DHFR Gene in MTX-Resistant and -Sensitive Mouse Cells

The DHFR gene spans a minimum distance of 42 kb in the mouse genome, and contains a minimum of five intervening sequences; four intervening sequences occur within the protein-coding region of the gene, and one intervening sequence occurs within the 5’ untranslated region.
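To put the 42 kb span in proportion, the gene and mRNA lengths quoted in the text for DHFR, hen ovalbumin and mouse β-globin can be compared directly. The following sketch is illustrative only: it treats the quoted mRNA length as the total exonic length, which is an approximation, and uses the minimum estimate for DHFR; the dictionary and function names are hypothetical.

```python
# (gene span in kb, mRNA length in kb), as quoted in the text; the DHFR span is a minimum.
GENES_KB = {
    "mouse DHFR": (42.0, 1.6),
    "hen ovalbumin": (7.7, 1.8),
    "mouse beta-globin": (1.6, 0.6),
}


def intervening_fraction(gene_kb: float, mrna_kb: float) -> float:
    """Fraction of the gene span not represented in the mature mRNA (approximation)."""
    return 1.0 - mrna_kb / gene_kb


def gene_to_mrna_ratio(gene_kb: float, mrna_kb: float) -> float:
    """How many times longer the gene is than its mRNA."""
    return gene_kb / mrna_kb


for name, (gene_kb, mrna_kb) in GENES_KB.items():
    ratio = gene_to_mrna_ratio(gene_kb, mrna_kb)
    fraction = intervening_fraction(gene_kb, mrna_kb)
    print(f"{name}: gene/mRNA ratio {ratio:.1f}, intervening fraction {fraction:.2f}")
```

With these figures the DHFR gene is roughly 26 times longer than its mRNA, against ratios of about 4 for ovalbumin and under 3 for β-globin.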


It is remarkable that the DHFR gene spans approximately 42 kb. The amount of intervening sequence has not been so large in any other gene analyzed [for example, hen ovalbumin (Gannon et al., 1979) or mouse β-globin (Konkel et al., 1978)]. It is premature to speculate on whether such long intervening sequences are a property of genes constituting so-called “housekeeping” enzymes, or, more plausibly, simply an example of one end of a continuum of the organization of different genes in higher organisms. While these experiments provide substantial information about the structure and organization of the DHFR gene, they give no indication of which of multiple mechanisms of gene amplification may occur (Kaufman and Schimke, 1980). No differences in terms of hybridization to genomic fragments at the 3’ terminus of the DHFR gene could be detected in DNA from mouse, from sensitive cells and from several independently isolated MTX-resistant cell lines. In addition, in no case have probes for the two termini of the DHFR gene hybridized to the same genomic DNA fragment. These results are not surprising in view of estimates of the size of the amplified unit containing the DHFR gene in MTX-resistant cells: approximately 500-1000 kb in CHO MK42 cells (Nunberg et al., 1978) and in murine L5178Y cells (Dolnick et al., 1979). The elucidation of important information related to the nature of DNA sequences flanking each amplified DNA sequence will probably require use of cosmid cloning vehicles (Collins and Hohn, 1978) and chromosome-walking methods (W. Bender, W. Spierer and D. Hogness, personal communication).

Experimental Procedures

Cell Lines and Mouse Strains

MTX-resistant and -sensitive cell lines are described in the text and elsewhere (Alt et al., 1978; Dolnick et al., 1979; Kaufman et al., 1979; Kellems et al., 1979). Inbred C57, C3H, BALB/c and AKR mice were obtained from the animal colony maintained by the Department of Radiology, Stanford University Medical Center.

Recombinant Plasmid DNAs

DHFR mRNA sequences were introduced into the Pst I site of pBR322 by dG-dC tailing of double-stranded cDNA as described previously (Chang et al., 1978). These plasmids were propagated in E. coli χ1776, χ2282 (Chang et al., 1978) or C600 SR1592 (Kushner, 1978). The 5 kb Bam HI fragment containing the 3’ terminal genomic fragment of the DHFR gene was partially purified from Bam HI-digested MTX-resistant S-180 R1A cell line (Kaufman and Schimke, 1980) DNA by preparative gel electrophoresis (Dugaiczyk, Boyer and Goodman, 1975) and subsequent elution of 5 kb DNA (Thuring, Sanders and Borst, 1975). This material was inserted into the Pst I site of pBR322 by dG-dC tailing, introduced by transformation into E. coli χ1776 and screened by in situ colony hybridization (Grunstein and Hogness, 1975) as described previously (Chang et al., 1978). Two of 1000 transformants contained the desired sequence. pDHFRg1 was used in these studies. Bacterial strains were grown and plasmid DNA was prepared as described previously (Chang et al., 1978). Plasmid DNA from C600 strains was amplified using 100 µg/ml chloramphenicol (Clewell, 1972) prior to isolation. The NIH Guidelines were followed in all work involving recombinant DNA plasmids.

Reetrktion Endonucleaae Mapplng and DNA Sequence Analyeie Restriction endonuclaases ware either purchased from Bethesda Research Labs or New England BioLabs. or obtained as a gifl from the laboratory of C. Yanofsky (Department of Biological Sciences, Stanford University). All enzymes ware used according to the supplier’s recommendations. Restriction maps were determined by polyacrylamide gel electrophoretic analysis (Tris-borate-EDTA, Maniatis. Jeffrey and van de Sande, 1975; Tris-acetate-EDTA, Hayward and Smith, 1972) following simultaneous or sequential digestions of either Intact plasmid or fragments isolated from previous digestions. In some cases, maps were determined using partial restriction digestions of singly 5’-labeled fragments (Wickens et al., 1979). DNA was eluted from polyacrylamide gels electrophoretically (McDonell, Simon and Studier, 1977) and from agarose gels either by the “freeze-squeeze” method CThuring et al., 1975) or by binding to GF/C (Whatman) filters in 5 M NaCIO. (Thomas et al., 1979). DNA sequence analysis was performed by the method of Maxam and Gilbert (1977). Fragments were dephosphorylated (calf intestinal alkaline phosphatase; Boehringer), labeled at the 5’ termini using polynucleotide kinase (Biogenics) and Y-~*P-ATP (ICN. 2000-3000 Ci/mm). and than subjected to secondary restriction endonuclease digestion to generate singly end-labeled fragments. The following chemical reactions were used: A > C, G. C, C + T and either A or A + G; and the products were run on 0.35 mm thick 8 (Sanger and Coulson. 1978) and 20% (Maxam and Gilbert, 1977) polyacrylamide gels. Sequences In the untranslated region of the mRNA and genome were typically confirmed either by comparison between the two or by sequencing the opposite strand. Formation and Vlrualixatlon of Hataroduplexee Heteroduplexes between pDHFRg1 and properly oriented cDNA sequence plasmid. 
both linearized with Eco RI to generate asymmetric plasmid arms for orientation, were prepared by denaturing 5-l 0 ng of each plasmid in 10 ~175% formamide at 80°C for 1 min. NaCl was added to 0.25 M and renaturation was allowed to proceed at room temperature for 20 min. The reaction was then prepared for spreading using the isodenaturing formamide technique of Davis and Hyman (1971). Parlodion-coatad grids were stained with uranyl acetate, rotary-shadowed with Pt-Pd and examined using a Philips EM201 microscope. Single-stranded @Xl 74 (5375 nucleotldes; Sanger et al., 1977) added prior to spreading and the plasmid arms (3611 and 748 bp: Sutcllffe. 1978) sewed as single- and double-stranded markers. Five heteroduplexes were measured using a Hewlitt-Packard 986A Digitizer and 981 OA Calculator, and the mean length of the segments was determlned. Tranafar of Aeetrktion EndonucleaaraDlgeeted Genomlc DNA to Nltrocalluloae and Subeequant Hybrldlxatlon High molecular weight DNA was prepared using the method of either Flamm, Bond and Burr (1966) or Gross-Bellard, Oudet and Chambon (1973). The former method routinely gave shorter DNA (approximately 50 kb) which was easier to handle and yet was adequate for the purposes used. Eco RI-digested mouse liver DNA was a gift from M. McGrogan (Department of Biological Sciences, Stanford University). Restriction endonuclease digestions of genomic DNA (typically 5 cg/l 00 ~0 were monitored for completion by ihcubating an aliquot of the main reaction (containing 0.5-1.0 pg DNA) with 0.5-0.7 w X DNA. This procedure guarantees a minimum 2 fold overdigestion. The main reaction was then stopped with EDTA (to 10 mM). extracted sequentially with phenol and diethyl ether. and ethanol-precipitated. If digestions with a second restriction endonuclease were to be performed, the procedure was repeated. In some cases, digestions were repeated with the same enzyme to further insure completion. In one experiment, presented in Figure 6c. 
sensitive S-180 S-3 DNA was enriched for DHFR sequences as follows. S-180 S-3 DNA was digested to completion with Hind III and subjected to electrophoresis, and regions of the agarose gel corresponding in molecular weight to bands of hybridization in M-50 DNA (3-4 and 12-16 kb) were cut out. The DNA was eluted by the NaClO₄ method. This enriched material was then digested to completion with Pst I and analyzed in parallel with the Hind III-Pst I M-50 DNA.

To insure good transfer of large fragments from the gel, the gel was then irradiated under short wavelength ultraviolet lamps for 10 min per side. Following denaturation and subsequent neutralization, the DNA was transferred from the gel to nitrocellulose filter paper (Schleicher and Schuell, BA-85) according to procedures developed by Southern (1975), with minor modifications. The filters were then marked to indicate the position of the molecular weight markers, rinsed in 2 x SSC (1 x SSC is 0.15 M NaCl, 15 mM sodium citrate) and baked in vacuo (2 hr at 80°C).

For hybridization, filters were pretreated for 12 hr at 65°C in hybridization mix [5 x SSC (pH 6.1), 1 x Denhardt's reagent (Denhardt, 1966), 0.5% sodium dodecylsulfate, 25 mM sodium phosphate (pH 6.8), 1.25 mM sodium pyrophosphate] containing 10 µg/ml denatured E. coli DNA. Hybridizations were carried out in sealable plastic bags (Sears) using 25 µl/cm² hybridization mix containing denatured E. coli DNA and denatured nick-translated probe (approximately 10⁵-10⁷ Cerenkov cpm per lane). Hybridizations were allowed to proceed at 65°C for 2-3 days with frequent massaging of the filters. Filters were then rinsed and washed sequentially in hybridization mix plus E. coli DNA (once, 50-100 ml, 65°C, 1-2 hr), hybridization mix (twice, as above), 5 x SSC (once, as above) and finally 2 x SSC (twice, 100 ml, 20 min at room temperature). Filters were then air-dried, molecular weight markers were marked and autoradiographs were prepared using preflashed (Laskey and Mills, 1975) Kodak XR-5 film with DuPont Lightning Plus Intensifying Screen at -70°C. Hybridization probes were either entire plasmid or specific restriction fragments of DHFR cDNA sequences.
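The buffer strengths and volumes quoted above for filter hybridization reduce to simple proportional arithmetic. The Python sketch below is illustrative only and is not part of the original procedures. The function names, the 20 x SSC stock strength and the example filter dimensions are assumptions introduced here, and 1 x SSC is taken in its standard formulation (0.15 M NaCl, 15 mM sodium citrate).

```python
"""Buffer arithmetic for filter hybridization (illustrative sketch)."""

# 1 x SSC in mM, standard formulation (assumed here).
SSC_1X_MM = {"NaCl": 150.0, "sodium citrate": 15.0}


def ssc_concentrations(fold):
    """Return the component concentrations (mM) of `fold` x SSC."""
    return {name: conc * fold for name, conc in SSC_1X_MM.items()}


def stock_volume_ml(final_fold, final_volume_ml, stock_fold=20.0):
    """Volume of concentrated SSC stock needed, from C1*V1 = C2*V2.

    The 20 x stock strength is a hypothetical default, not from the text.
    """
    if final_fold > stock_fold:
        raise ValueError("cannot dilute to a strength above the stock")
    return final_fold * final_volume_ml / stock_fold


def hybridization_volume_ml(width_cm, height_cm, ul_per_cm2=25.0):
    """Hybridization mix (ml) for a rectangular filter at 25 ul/cm^2."""
    return width_cm * height_cm * ul_per_cm2 / 1000.0


if __name__ == "__main__":
    # 5 x SSC as used in the hybridization mix.
    print(ssc_concentrations(5))
    # Stock needed for one 100 ml wash in 2 x SSC.
    print(stock_volume_ml(2, 100))
    # Mix needed for a hypothetical 10 cm x 20 cm filter.
    print(hybridization_volume_ml(10, 20))
```

Under these assumptions, a 10 cm x 20 cm filter would take 5 ml of hybridization mix, which gives a sense of how little probe volume the sealed-bag format requires.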
Whenever practical, specific fragments were the product of two different restriction endonuclease cleavages; these reactions were performed sequentially with a preparative gel intervening to separate the fragments of interest from other DHFR sequences prior to final digestion and preparative gel electrophoresis. Prior to electrophoresis, samples were deproteinized by sequential extraction with phenol and ether, precipitated with ethanol, redissolved and heated (5 min at 70°C) in gel sample buffer. Care was taken to avoid overloading the gel.

Sequences to be used as probe were labeled by nick translation (Rigby et al., 1977). 20-200 ng DNA in 15 µl 50 mM Tris-HCl (pH 7.4), 5 mM MgCl₂ were nicked with DNAase I (Worthington; calibrated to nick approximately once per 200 nucleotides) and then nick-translated with E. coli DNA polymerase I (New England BioLabs) in approximately 7 µM α-³²P-dCTP or α-³²P-dGTP (Amersham; 2000-3000 Ci/mmole) and 150 µM other triphosphates. After 4-5 hr at 14°C the reaction was stopped by the addition of EDTA (to 12.5 mM) and sodium dodecylsulfate (to 0.1%) and heated to 68°C for 5 min. E. coli tRNA carrier was added and the mixture was desalted by centrifugation through a 400-500 µl P-60 (BioRad; 100-200 mesh) column (Wahl, Padgett and Stark, 1979). 35 µl washes in 10 mM Tris-HCl (pH 7.4), 1 mM EDTA, 10 mM NaCl were collected and monitored, and the void volume was pooled and precipitated with ethanol. In some earlier experiments, kinetically purified DHFR-specific ³²P-cDNA prepared according to the method of Alt et al. (1978) was used as probe.

Nitrocellulose filters containing DNA from MTX-resistant cells were often reused for hybridizations (McGrogan et al., 1979) to allow better comparisons between experiments. Filters were washed in 0.1 N NaOH, 2 x SSC at room temperature for 20 min, rinsed twice in 2 x SSC and then pretreated for hybridization.

Acknowledgments
We are grateful to M. McGrogan, D. R. Setzer, G. F. Crouse and G. M.
Wahl for valuable discussions and for access to results prior to publication. We also thank M. McGrogan for providing mouse liver DNA; W. Bender, M. Thomas and R. Davis for assistance with the electron microscopy; J. Sninsky for assistance with the computer program; and C. Yanofsky for providing several restriction endonucleases. This work was supported by research grants from the NIH and the American Cancer Society.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received August 31, 1979; revised November 1, 1979

References

Alt, F. W., Kellems, R. E. and Schimke, R. T. (1976). J. Biol. Chem. 251, 3063-3074.

Alt, F. W., Kellems, R. E., Bertino, J. R. and Schimke, R. T. (1978). J. Biol. Chem. 253, 1357-1370.

Breathnach, R., Benoist, C., O'Hare, K., Gannon, F. and Chambon, P. (1978). Proc. Nat. Acad. Sci. USA 75, 4853-4857.

Buell, G. N., Wickens, M. P., Carbon, J. and Schimke, R. T. (1979). J. Biol. Chem. 254, 9277-9283.

Catterall, J. F., O'Malley, B. W., Robertson, M. A., Staden, R., Tanaka, Y. and Brownlee, G. G. (1978). Nature 275, 510-513.

Chang, A. C. Y., Nunberg, J. H., Kaufman, R. J., Erlich, H. A., Schimke, R. T. and Cohen, S. N. (1978). Nature 275, 617-624.

Clewell, D. B. (1972). J. Bacteriol. 110, 667-676.

Collins, J. and Hohn, B. (1978). Proc. Nat. Acad. Sci. USA 75, 4242-4246.

Davis, R. W. and Hyman, R. W. (1971). J. Mol. Biol. 62, 287-301.

Denhardt, D. (1966). Biochem. Biophys. Res. Commun. 23, 641-652.

Dolnick, B., Berenson, R., Bertino, J. R., Kaufman, R. J., Nunberg, J. H. and Schimke, R. T. (1979). J. Cell Biol. 83, 394-402.

Dugaiczyk, A., Boyer, H. W. and Goodman, H. M. (1975). J. Mol. Biol. 96, 171-184.

Dugaiczyk, A., Woo, S. L. C., Lai, E. C., Mace, M. L., Jr., McReynolds, L. and O'Malley, B. W. (1978). Nature 274, 328-333.

Flamm, W. G., Bond, H. E. and Burr, H. E. (1966). Biochim. Biophys. Acta 129, 310-319.

Gannon, F., O'Hare, K., Perrin, F., LePennec, J. P., Benoist, C., Cochet, M., Breathnach, R., Royal, A., Garapin, A., Cami, B. and Chambon, P. (1979). Nature 278, 428-434.

Gross-Bellard, M., Oudet, P. and Chambon, P. (1973). Eur. J. Biochem. 36, 32-38.

Grunstein, M. and Hogness, D. S. (1975). Proc. Nat. Acad. Sci. USA 72, 3961-3965.

Hayward, G. S. and Smith, M. A. (1972). J. Mol. Biol. 63, 383-395.

Kaufman, R. J. and Schimke, R. T. (1980). Cell, in press.

Kaufman, R. J., Brown, P. C. and Schimke, R. T. (1979). Proc. Nat. Acad. Sci. USA 76, 5669-5673.

Kellems, R. E., Morhenn, V. B., Pfendt, E. A., Alt, F. W. and Schimke, R. T. (1979). J. Biol. Chem. 254, 309-318.

Konkel, D. A., Tilghman, S. M. and Leder, P. (1978). Cell 15, 1125-1132.

Korn, L. J., Queen, C. L. and Wegman, M. N. (1977). Proc. Nat. Acad. Sci. USA 74, 4401-4405.

Kozak, M. (1978). Cell 15, 1109-1123.

Kronenberg, H. M., Roberts, B. E. and Efstratiadis, A. (1979). Nucl. Acids Res. 6, 153-165.

Kushner, S. R. (1978). In Proc. Int. Symp. on Genetic Engineering, H. W. Boyer and S. Nicosia, eds. (Amsterdam: Elsevier).

Laskey, R. A. and Mills, A. D. (1975). Eur. J. Biochem. 56, 335-341.

Lawn, R. M., Fritsch, E. F., Parker, R. C., Blake, G. and Maniatis, T. (1978). Cell 15, 1157-1174.

McDonell, M. W., Simon, M. N. and Studier, F. W. (1977). J. Mol. Biol. 110, 119-146.

McGrogan, M., Spector, D. J., Goldenberg, C. J., Halbert, D. and Raskas, H. J. (1979). Nucl. Acids Res. 6, 593-607.

McReynolds, L., O'Malley, B. W., Nisbet, A. D., Fothergill, J. E., Givol, D., Fields, S., Robertson, M. and Brownlee, G. G. (1978). Nature 273, 723-728.

Maniatis, T., Jeffrey, A. and van de Sande, H. (1975). Biochemistry 14, 3787-3794.

Maxam, A. M. and Gilbert, W. (1977). Proc. Nat. Acad. Sci. USA 74, 560-564.

Nunberg, J. H., Kaufman, R. J., Schimke, R. T., Urlaub, G. and Chasin, L. A. (1978). Proc. Nat. Acad. Sci. USA 75, 5553-5556.

Proudfoot, N. J. and Brownlee, G. G. (1976). Nature 263, 211-214.

Reddy, V. B., Ghosh, P. K., Lebowitz, P., Piatek, M. and Weissman, S. M. (1979).

Rigby, P. W. J., Dieckmann, M., Rhodes, C. and Berg, P. (1977). J. Mol. Biol. 113, 237-251.

Robertson, M. A., Staden, R., Tanaka, Y., Catterall, J. F., O'Malley, B. W. and Brownlee, G. G. (1979). Nature 278, 370-372.

Sanger, F. and Coulson, A. R. (1978). FEBS Letters 87, 107-110.

Sanger, F., Air, G. M., Barrell, B. G., Brown, N. L., Coulson, A. R., Fiddes, J. C., Hutchison, C. A., III, Slocombe, P. M. and Smith, M. (1977). Nature 265, 687-695.

Southern, E. M. (1975). J. Mol. Biol. 98, 503-517.

Stone, D. and Phillips, A. W. (1977). FEBS Letters 74, 85-87.

Sutcliffe, J. G. (1978). Nucl. Acids Res. 5, 2721-2728.

Thomas, C. A., Jr., Saigo, K., McLeod, E. and Ito, J. (1979). Anal. Biochem. 93, 158-166.

Thuring, R. W. J., Sanders, J. P. M. and Borst, P. (1975). Anal. Biochem. 66, 213-220.

Wahl, G. M., Padgett, R. A. and Stark, G. R. (1979). J. Biol. Chem. 254, 8679-8689.

Wickens, M. P., Buell, G. N., Crouse, G. F., Carbon, J. and Schimke, R. T. (1979). Gene 5, 19-43.

Note Added in Proof

Gray Crouse, Christian Simonsen and R. T. Schimke have recently isolated much of the DHFR gene sequences from methotrexate-resistant S-180 cells using λ Charon 4A. All the general features of the map of the DHFR gene presented above have been confirmed, with the likely presence of an additional intervening sequence between the designated intervening sequences I and II (see Figure 4). The ambiguity in mapping of several Bam HI fragments of the gene, in particular the 16 kb Bam HI band, appears to result from minor heterogeneity in this region. Contrary to what is shown in Figure 4, we now believe that the majority of DHFR genes contain a 28 kb Bam HI band spanning the region between Bam HI sites in intervening sequence III and sequences in the 5' direction from the gene. The 16 kb band arises from a class of genes that contain a Bam HI site within intervening sequence I.