Cell, Vol. 16, 753-761, April 1979, Copyright © 1979 by MIT
Sequence of the Gene for Iso-1-Cytochrome c in Saccharomyces cerevisiae

Michael Smith, David W. Leung, Shirley Gillam and Caroline R. Astell
Department of Biochemistry, Faculty of Medicine, University of British Columbia, 2075 Wesbrook Place, Vancouver, British Columbia, Canada V6T 1W5

Donna L. Montgomery and Benjamin D. Hall
Genetics Department, SK-50, University of Washington, Seattle, Washington 98195

The complete sequence of the iso-1-cytochrome c gene of yeast has been determined. The coding region of the gene contains no intervening sequences. The coding strand of the DNA immediately upstream from the coding region contains many fewer G residues than the rest of the coding strand, both within and beyond the carboxy terminus of the coding region. One consequence of the reduced number of G residues in this region is the absence of the sequence ATG for 122 nucleotides upstream from the initiating ATG. Together with previous studies on the mRNA and the genetics of yeast iso-1-cytochrome c, the sequence supports a model in which translation starts at the first AUG downstream from the 5’ terminus of the mRNA, with no other sequence requirements. It also is evident that iso-1-cytochrome c is synthesized directly and not through an intermediary, longer precursor protein, as is often the case for proteins that interact with membranes. The DNA upstream and downstream from the coding region contains sequences which are potential transcription start and stop signals. The sequence confirms the assignments of nonsense and missense mutations throughout the coding region of the gene and provides a rationale for some mutational characteristics of the gene. Part of the sequence was determined using two new strategies for the Sanger terminator method, both of which obviate the need for restriction fragment isolation and template strand separation.

Introduction
In yeast, as in other eucaryotes, cytochrome c is an essential subunit of the electron transport system (Boyer et al., 1977). While cytochrome c is a component of the mitochondrion, it is specified by a nuclear gene (Sherman et al., 1966) and synthesized on cytoplasmic ribosomes (Clark-Walker and Linnane, 1967). The newly synthesized polypeptide is modified by covalent attachment of heme (Margoliash and Schejter, 1966; Sherman and Stewart, 1971) and by trimethylation of the lysine at position 77 (Delange, Glazer and Smith, 1970), and interacts with components of the C side of the inner mitochondrial membrane (Muscatello and Carafoli, 1969; Boyer et al., 1977). The synthesis of cytochrome c is repressed by glucose; this mechanism involves control of the amount of mRNA (Zitomer and Hall, 1976; Zitomer and Nichols, 1978).

The evidence of procaryote molecular biology indicates that a full understanding of gene structure and function requires both knowledge of the DNA sequence and comprehensive genetic and physiological information. In yeast, the CYC1 locus, which codes for the major cytochrome c, iso-1-cytochrome c, has been the target of extensive genetic analysis (Sherman and Stewart, 1971; Sherman et al., 1974). To perform a similarly detailed molecular analysis of CYC1 DNA structure and function, we have recently isolated this gene from two yeast strains, one wild-type (CYC1+) at this locus and the other a cyc1-9 mutant. A synthetic oligonucleotide (the 13-mer) complementary to part of the gene was used as a hybridization probe and this, together with the genetic characteristics of cyc1-9, identified the DNA; its identity was confirmed by partial sequence determination of the iso-1-cytochrome c coding region (Montgomery et al., 1978). In this paper, we describe and discuss the sequence of a segment of yeast DNA which completely spans and extends to either side of the coding sequence for iso-1-cytochrome c.

Results
Determination of the DNA Sequence
The yeast pBR322 plasmids used in the sequence determination were derived as diagrammed in Figure 1. The clone containing most of the coding sequence, pYeCYC1(0.60) (formerly designated pYeCYC.4), and that of the mutant, pYecyc1-9(5.10) (formerly pYecyc1-9), have been described previously (Montgomery et al., 1978).
The sequence of the yeast DNA is shown in Figure 2, which also indicates the coding sequence for iso-1-cytochrome c and all the sequences corresponding to restriction endonucleases of known specificity (Roberts, 1978). The legend to Figure 1 specifies the experiments which define the various segments of the sequence. The method of Maxam and Gilbert (1977) was used for parts of the sequence, and the enzymatic terminator method (Sanger, Nicklen and Coulson, 1977) was used for other regions as well as for overlapping segments of the DNA. Two new strategies were used with the terminator method which are of general interest because both
Figure 1. Diagrams of the Origins of the Recombinant Plasmids and the Sequencing Strategy
The upper diagram shows the two adjacent Eco RI fragments of yeast DNA which contain the gene for iso-1-cytochrome c (thick line) (Montgomery et al., 1978). The plasmid pYeCYC1(0.60) contains the 0.60 kb Eco RI-Hind III fragment, which contains most of the gene, in pBR322 (between its Eco RI and Hind III sites). The plasmid pYeCYC1(4.50) contains the 4.50 kb Eco RI-Hind III fragment at the left end of the gene in pBR322 (between its Eco RI and Hind III sites). The plasmid pYecyc1-9(5.10) contains the 5.10 kb Hind III fragment, which spans the gene in the mutant cyc1-9, in pBR322 (at its Hind III site). The lower diagram shows the sequencing experiments used to compile the data of Figure 2. Experiments 1 and 3 on pYeCYC1(4.50) and 4 and 9 on pYeCYC1(0.60) were carried out by the method of Maxam and Gilbert (1977) and originated at the indicated restriction sites. The terminator method primed with oligodeoxyribonucleotides was used in experiments 2 [pA2GA2GA4 on the Sma I-Eco RI fragment from pYeCYC1(4.50)], 5 (pA4G3TG2), 6 (pC3A3GA3), 7 (pA5GA3) and 10 (pT8CT), all on pYeCYC1(0.60). In experiment 8, the primer was Hind III-cleaved pBR322 on pYeCYC1(0.60), and in experiment 11, the primer was pT2AGCAGA2C2G2 on pYecyc1-9(5.10) template.
obviate the need for restriction fragment isolation and mapping and for template strand separation. These modifications are significant because fragment isolation and strand separation are major time- and DNA-consuming steps in DNA sequence determinations.

The first strategy is diagrammed in Figure 3. It involves the formation of a primer-template DNA heteroduplex by annealing linear, duplex recombinant DNA (the template) [in this case pYeCYC1(0.60) cleaved at the Eco RI site] and linear, duplex vector DNA (the primer) (pBR322 cleaved at the Hind III site). The heteroduplex contains two potential priming sites for DNA synthesis when it is used as a substrate for a DNA polymerase in the terminator sequencing method, and a potential sequence pattern is created at both sites. Cleavage of the product of the terminator experiment with Hind III, however, releases only the yeast sequences as small fragments appropriate to gel electrophoretic “ladder” sequence determination, as shown in Figure 4. The full potential of this strategy has not yet been explored. It seems probable that this strategy, with priming at both ends of a cloned DNA fragment similar in size to that of pYeCYC1(0.60), would yield its total sequence, particularly when used with high resolution gel electrophoresis (Sanger and Coulson, 1978). The underlying principle of effectively eliminating the product of one priming site when two are present can be achieved in a number of other ways.

The second strategy uses a single-stranded primer which is annealed to denatured duplex template DNA. In principle, single-stranded primers can be obtained by strand separation of restriction fragments of DNA. In the present experiments, however, enzymatically synthesized oligodeoxyribonucleotides of defined sequence were used as primers. The synthesis of specific primers is feasible because simple yet unique sequences are present in DNA at frequent intervals. The oligodeoxyribonucleotides used in the present study were pT8CT, pA5GA3, pA4G3TG2, pC3A3GA3 and pA2GA2GA4. All of these were easily synthesized using E. coli polynucleotide phosphorylase (Gillam, Jahnke and Smith, 1978). The result of the experiment using pT8CT as primer with denatured Eco RI-cleaved pYeCYC1(0.60) as template is shown in Figure 5. When pT2AGCAGA2C2G2 (the 13-mer) was the primer with Hind III-cleaved pYecyc1-9(5.10) as the template, the results confirmed the ochre mutant sequence, which also lacks the Eco RI site (GC→TA at nucleotide 7; Figure 2). The latter experiment demonstrates an important principle: oligonucleotide priming provides a very convenient method for the determination of the sequence of mutants, in this case in the N terminal region of the CYC1 locus. Both of the new strategies will be useful for screening recombinant DNAs for specific sequences.

The sequence determination described in this paper is supported by extensive overlaps either in the same or the complementary strand and by determination of the sequence of difficult regions by the two different methods (chemical and enzymatic; see the legend to Figure 1 for specific details). In regions where overlaps were not obtained, careful analysis of band spacing and intensity in the gel patterns ensures the accuracy of the determination.
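The idea that a short oligonucleotide primer corresponds to a unique position on one strand of the template can be illustrated with a small search routine. The following is a minimal sketch, not the authors' procedure: the function names are hypothetical, the fragment is nucleotides 7-30 of the upper strand as transcribed from Figure 2 in this text, and the 13-mer is written out as TTAGCAGAACCGG, an expansion assumed from the primer position (17-29, lower strand) given in the Figure 2 legend.

```python
# Hypothetical helper names; an illustrative sketch only.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def find_primer_sites(upper_strand, primer, first_position=1):
    """Find every stretch of the duplex whose sequence equals the primer.

    Returns (strand, first, last) tuples. Coordinates follow the upper
    strand, with `first_position` the number of its first base. A hit on
    "lower" means the primer has the sequence of the lower strand there
    (and would therefore anneal to the upper strand).
    """
    upper_strand = upper_strand.upper()
    primer = primer.upper()
    sites = []
    targets = (("upper", primer), ("lower", reverse_complement(primer)))
    for strand, target in targets:
        start = upper_strand.find(target)
        while start != -1:
            first = first_position + start
            sites.append((strand, first, first + len(target) - 1))
            start = upper_strand.find(target, start + 1)
    return sites


# Nucleotides 7-30 of the upper strand (assumed transcription of Figure 2).
fragment = "GAATTCAAGGCCGGTTCTGCTAAG"
thirteen_mer = "TTAGCAGAACCGG"  # assumed expansion of the 13-mer

print(find_primer_sites(fragment, thirteen_mer, first_position=7))
```

Under these assumptions the search reports a single site on the lower strand at nucleotides 17-29, in agreement with the position listed for the 13-mer in the Figure 2 legend. A real template would of course be searched in full, including the vector sequence, to confirm that the priming site is unique.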
In addition, the known amino acid sequence of iso-1-cytochrome c provides an independent check on almost 40% of the sequence.

Restriction Endonuclease Cleavages
The mapping of restriction endonuclease cleavages provides independent confirmation of DNA sequence. The method of Smith and Birnstiel (1976) was used to map and confirm the presence of the sites for the following enzymes: Alu I, Ava II, Fnu DII, Fnu EI, Hae III, Hind III, Hpa II, Mbo II, Mnl I, Taq I and Xho I. The Eco RI or the Hind III site served as the reference point. The resistance of the sequenced region of the DNA to Bam HI, Bgl II, Fnu 4HI, Hha I, Hind II, Hinf I, Pst I, Pvu II, Sal I, Sma I and Xba I also confirms the absence of their recognition sequences. Resistance to Mbo I but not to Fnu EI confirms the fact that in the recombinant DNA, the sequences 5’-GATC-3’ contain 6-methyladenine. The chemical sequence determination confirmed the fact that in the sequences 5’-CC(A/T)GG-3’ (Eco RII site), the second nucleotide contains 5-methylcytosine (Ohmori, Tomizawa and Maxam, 1978).

XhoI 5'-CTCGAGCA 3'-GmCGT TaqI WI-1 Hae1II GATCCGCCAGGCGTGTATATAGCGTGGAT~AGGCAACTTTAGTGCTGACACATACAG CTAGGCGGTCCGCACATATATCGCACCTACCGGTCCGTTGAAATCACGACTGTGTATGTC -*,;ECORI1-20; 1240 -23;) -22; -19;
BCII AvaIII GCATATATATATGTGTGCGACGACACATGATCATATGGCATGCATGTGCTCTGTATGTAT ~GTATATATATACACACGCT~CTGTGTACTAGTATACCGTACGTACA~GAGACATACATA FnuEI HglAI -14; -180 -170 -160 AvaII MboII ATAAAACTCTTGTTTTCTfCTTTTCTCTAAATATTCTTTCCTTATACATTA~TTTG TATTTTGAGAACAAAAGAAGAAAAGAGATTTATAAGAAAGGAATATGTAATCCAGGAAAC _7giGr -11; -lob -9; -8b 1120 TAGCATAAATTACTATACTTCTATAGACACGCAAACACAAATACACACACTAAATTAATA ATCGTATTTAATGATATGAAGATATCTGTGCGTTTGTGTTTATGTGTGTGATTTAATTAT -5b
ISO
Sequence for Coding Region of the CYC1 Locus
The coding sequence, nucleotides 1-330 (as shown in Figure 2), completely confirms the sequence of iso-1-cytochrome c of yeast (Lederer, Simon and Verdière, 1972). In the context of eucaryote gene organization, the most striking feature is the complete lack of nontranslated sequences within the coding sequence. This is analogous to the structure of sea urchin histone genes (Schaffner et al., 1978) and contrasts with the vertebrate genes for β-globin, immunoglobulin and ovalbumin, all of which contain intervening sequences.

The coding strand of the CYC1 gene (the strand with the same polarity as CYC1 mRNA) has a nucleotide composition of A:110, G:78, T:77 and C:62. Not only are purine bases overrepresented in the coding strand, but their distribution within it is nonrandomly clustered. Runs of eight or more consecutive purine residues occur five times in the CYC1 gene (in contrast to 1.6 times for a randomly generated sequence of the same base composition). Two features of the iso-1-cytochrome c protein sequence are responsible for these long purine clusters:
- Compared to a randomly generated codon mixture with the same base composition as the CYC1 coding strand, the gene contains a 2-fold excess of both lysine (AAR) and glycine (GGN) codons.
- The purine-rich codons for lysine, glutamic acid (GAR), asparagine (AAY) and glycine occur next to one another slightly more frequently than would be expected if the codons in CYC1 were arranged randomly.
Each of the long purine clusters contains either a Lys·Lys or Glu·Lys codon doublet. These are augmented in length by the frequent occurrence of glycine or asparagine codons on the -COOH terminal side of the Glu·Lys coding doublets. These sources of nonrandomness in the gene sequence bear an interesting relationship to the three-
(i'let):hr ATG TAC
Figure 2. Sequence of the CYC1 Locus of S. cerevisiae
The amino acid sequence of iso-1-cytochrome c is derived from the DNA sequence and is identical with that determined from the protein. The restriction endonuclease recognition sequences shown were detected by computer search of the sequence. The sites which have been confirmed experimentally are listed in the text. The sequences corresponding to the oligodeoxyribonucleotide primers used in the sequence determinations are pA2GA2GA4 at -108 to -99 (lower strand), pT2AGCAGA2C2G2 (13-mer, used for cyc1-9 DNA) at 17-29 (lower strand), pA4G3TG2 at 80-89 (upper strand), pC3A3GA3 at 228-237 (upper strand), pA5GA3 at 508-516 (lower strand) and pT8CT at 512-521 (upper strand).
-3b
-26
Lys &a Gly Ser Ala Lys :;s Gly Ala Thr Leu WI GAA TTC AAG GCC GGT TCT GCT AAG AAA GGT GCT ACA CTT CTT AAG TTC CGG CCA AGA CGA TTC TTT CCA CGA TGT GAA HpaII lb 3b 4b
20 Leu Gln Cys His Thr ACCI TTC AAG ACT AGA TGT CTA CAA TGC CAC ACC AAG TTC TGA TCT ACA GAT GTT ACG GTG TGG 15 Phe
Lys
Thr
Arg
56 30 Pro cCA ST
Cys
66
70
35 Gly Pro Asn Leu His Gly AvaII CAT AAG GTT GGT CCA AAC TTG CAT GGT GTA TTC YAA WIG"' TTG AAC GTA CC! His
Lys
Val
100 45 Ser
Gly
11;
Gln
14b Asn
Leu
Trp
196 Pro
Lys
Lys
Glu
Asn
AAC CCA AAG AAA TAT TTG GGT TTC TTT ATA 23b 90 LeuM;;;ILys
80 Ile
Phe
Gly
Arg
Asn
Met
13;)
Thr
Lys
Lys
ATC AAG TAG TTC 18b
Glu
TCA AGT
GAG TAC TTG ACT CTC ATG AAC TCA
Tyr
Leu
Thr
22; 85 M;,',,;;
Phe
Gly
Gly
24b
ATT -GGT ACC AAG ATTTT GGT GGG TAA GGA CCA TGG TTC TAC CGG AAA CCA CCC KpnI 266 27;
Lys
95 Asp
Leu
100 Ile
Thr
TTG AAG AAG GAA AAA GAC AGA AAC GAC TTA AAC TTC TTC CTT TTT CTG TCT TTG CTG AAT
ATT TAA
ACC TAC TGG ATG
Glu
Ile
70 Ser
21;)
F';;';;y
His
ATC TTT GGC AGA CAC TAG AAA CCG TCT GTG
176
AAC ATG TTG TAC
206 Tyr
40 Ile
166 65 Asp
AAA AAC GTG TTG TGG GAC GAA AAT TTT TTG CAC AAC ACC CTG CTT TTA 7s Asn
Glu Lys Gly Gly H*II GTG GAA AAG GGT GGC CAC CTT TTC CCA rrCCG fib"1 80
55 Ser Tyr Thr Asp Ala Asn SfaNI TCG TAC ACA GAT GCC AAT AGC ATG TGT CTA CGG TTA
156 Val
25 Val
120
50 Gly
Ala Glu Tyr AluI TCT GGT CAM GAA GGG TAT AGA CCA GTT CGA CTT CCC ATA
60 Lys
-16
GW;ty ACT TGA
28;
Arg
Asn
296
Asp
Tyr
3oi
Leu
Lys
TTG AAA AAC TTT
31;
105 Lys
Ala Cys Glu Och HWI w AAA GCC TGT GAG TAA ACAGGCCCCTTTTCCTTTGTCGATATCATGTAATTAGTTA TTT CGG ACA CTC ATT TGTWGAAAAGGAAACAGCTATAGTACATTAATCAAT ASU134b 326 33; 35; 36;
370
MnlI TGTCACGCTTACATTCACGCCCTCCCCCCACATCCGCTCTAACCGAAAAGGAAGGAGTTA ACAGTGCGAATGTAAGTGCGGGAGGGGGGTGTAGGCGAGATTGGCTTTTCCTTCCTCAAT 38b
Figure
-4;
396
A00
41;
42;
43;
AvaII GACAACCTGAAGTCTAGGTCCCTATTTATTTTTTTATAGTTATGTTAGTATTAAGAACGT CTGTTGGACTTCAGATCCAGGGATAAATAAAAAAATATCAATACAATCATAATTCTTGCA AsuI 46j 44; A70 48;
496
HgaI TATTTATATTTCAAATTTTTCTTTTTTTTCTGTACAGACGCGTGTACGCATGTAACATTA ATAAATATAAAGTTTAAAAAGAAAAAAAAGACATGTCTGCGCACATGCGTACATTGTAAT FnuDII 50; 51; 526 54;
55;
Hgal Hind111 TACTGAAAACCTTGCTTGAGAAGGTTTTGGGACGCTCGAAGGCTTTAATTTGCAAGCTT-3 ATGACTTTTGGAACGAACTCTTCCAAAACCCTGCGAGCTTCCGAAATTAAACGT~A-5 TaqI Alul 56; 586 60; 570
Figure 3. Diagram of the Use of Hind III-Cleaved pBR322 as a Primer for the Terminator Sequencing Method Using Eco RI-Cleaved pYeCYC1(0.60) as Template
The heteroduplex between these two DNAs has two potential sites of DNA synthesis. These are the two 3’ termini in the upper pair of strands in the heteroduplex. After DNA synthesis by the terminator method, however, Hind III releases as small fragments subject to analysis only those fragments derived from the yeast DNA (heavy lines).
dimensional structure and biological function of the cytochrome c protein molecule. Both the content and positioning of glycine residues within eucaryotic cytochromes c are subject to strong evolutionary constraint. This is believed to reflect the essential role of glycine residues in determining the three-dimensional structure (Dickerson et al., 1971). Several of the lysine residues in cytochrome c are implicated in the binding site for cytochrome oxidase and the mitochondrial b·c1 reductase (Ferguson-Miller, Brautigan and Margoliash, 1978). Thus the DNA regions which we have recognized as unusual by their high purine content in the coding strand have an important role in coding for the three-dimensional structure which positions the heme group of cytochrome c and allows the iron within it to undergo oxidation and reduction in the mitochondrion.

While the selection of codons used in the CYC1 gene is nonrandom in several respects, there is no contribution of selective codon usage to the purine clustering described above. Glycine and arginine are the only amino acids whose codons have both a purine in the middle position and the choice of either purine or pyrimidine at position three. In the CYC1 sequence (Figure 2), however, GGY glycine codons occur in 10 of 12 cases, while CGN codons are totally absent.
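The purine clustering discussed above (runs of eight or more consecutive A or G residues in the coding strand) is straightforward to count. The sketch below is illustrative only: the function names are hypothetical, the base counts are those quoted in the text, and the short test fragment is the Ala-Lys-Lys-Gly-Ala region (GCT AAG AAA GGT GCT) as transcribed from Figure 2, trimmed by one base at the 5’ end.

```python
import re


def purine_runs(seq, min_length=8):
    """Return (first, last, run) for each maximal run of A/G residues
    at least `min_length` long, using 1-based inclusive coordinates."""
    pattern = re.compile(r"[AG]{%d,}" % min_length)
    return [(m.start() + 1, m.end(), m.group())
            for m in pattern.finditer(seq.upper())]


def purine_fraction(counts):
    """Fraction of purines given a dict of base counts."""
    return (counts["A"] + counts["G"]) / sum(counts.values())


# Composition of the coding strand as reported in the text.
reported = {"A": 110, "G": 78, "T": 77, "C": 62}
print(round(purine_fraction(reported), 3))

# A Lys-Lys doublet followed by a glycine codon gives an 8-purine run.
print(purine_runs("CTAAGAAAGGTGCT"))
```

With the reported composition the purine fraction is about 0.575, and the example fragment yields one run (AAGAAAGG) of exactly eight purines. Comparing such counts with shuffled sequences of the same composition would be one way to approach the "1.6 times" expectation quoted in the text, although the authors' exact randomization procedure is not stated.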
Figure 4. Gel Electrophoretic (12% Acrylamide) Pattern of the Sequence Obtained by the Experiment Diagrammed in Figure 3
The sequence, top to bottom, corresponds to nucleotides 346-450 (lower strand, Figure 2). Note the sequence AGA8GA5 at nucleotides 506-521 (lower strand).
Regarding the selective use of one group of arginine codons (3 AGR to 0 CGN), the CYC1 gene resembles the globin genes (Kafatos et al., 1977) and the chicken ovalbumin gene (McReynolds et al., 1978). All these genes differ greatly in codon usage pattern from the sea urchin histone genes (Schaffner et al., 1978), in which CGN arginine codons are used preferentially.
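The codon classes compared here (AGR versus CGN for arginine, GGY versus GGR for glycine) can be tallied mechanically once a reading frame is fixed. The following is a small sketch under stated assumptions: the helper names are hypothetical, and the input is only the 18 codons for amino acids 2-19 as transcribed from Figure 2 in this text, so the totals are not the whole-gene figures quoted above.

```python
from collections import Counter


def classify_codon(codon):
    """Assign a codon to one of the classes discussed in the text,
    or return None if it is not an arginine or glycine codon."""
    if codon in ("AGA", "AGG"):
        return "Arg AGR"
    if codon.startswith("CG"):
        return "Arg CGN"
    if codon.startswith("GG"):
        return "Gly GGY" if codon[2] in "CT" else "Gly GGR"
    return None


def tally_codon_classes(seq):
    """Split an in-frame coding sequence into codons and count classes."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3  # ignore any trailing partial codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    return Counter(c for c in map(classify_codon, codons) if c is not None)


# Codons 2-19 (Glu ... Cys), assumed transcription of Figure 2.
fragment = ("GAA TTC AAG GCC GGT TCT GCT AAG AAA GGT GCT ACA CTT "
            "TTC AAG ACT AGA TGT").replace(" ", "")

print(tally_codon_classes(fragment))
```

For this fragment the tally contains two GGY glycine codons, one AGR arginine codon and no CGN codons, which is in line with the bias described for the gene as a whole.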
As mentioned above, the DNA sequence confirms the protein sequence assigned to iso-1-cytochrome c. As expected, it also confirms the sequence of 44 nucleotides predicted from analysis of frame-shift mutants of the N terminus of the gene (Stewart and Sherman, 1974). The total sequence confirms and/or defines the nature of nonsense and missense mutations at other sites throughout the gene, for example, single base changes at amino acids 21 (cyc1-21), 66 (cyc1-72) and 93 (cyc1-140) to produce ochre codons (TAA), at amino acids 64 (cyc1-84) and 71 (cyc1-76) to produce amber codons (TAG), and at amino acid 14 (cyc1-115) to produce a proline (CCT) missense mutant (Sherman et al., 1974, 1975; Lawrence and Christensen, 1978). The direct sequence determination of part of the DNA of the mutant cyc1-9 confirms that an ochre codon is present at amino acid 2. The complete iso-1-cytochrome c sequence will help to define the nature of the many other mutants which map within and around the structural gene (Sherman et al., 1974, 1975).

The sequence will also allow a more detailed understanding of sequence-dependent aspects of mutagenic events. The cyc1-131 and cyc1-115 mutant alleles, unlike all other cyc1 base-pair substitution alleles tested, efficiently yield ultraviolet-induced CYC1+ revertants (Lawrence and Christensen, 1978). The REV1 independence of mutation at these sites suggests that they have special structural features. In the case of cyc1-131, the DNA sequence at the mutant site, 5’-ATAGTG-3’ / 3’-TATCAC-5’, has two special features. First, a transition mutation at either nucleotide of the pyrimidine dimer yields an ATG codon in the iso-1-cytochrome c reading frame. Second, one of these possible transitions is GC to AT, which other evidence suggests is favored in the absence of the REV1 gene function (Lawrence and Christensen, 1978). Reversion at the cyc1-115 site apparently involves a rather complex mechanism which differs from that acting at cyc1-131. Analysis of the sequence-dependent action of other mutagens (Sherman and Stewart, 1974; Sherman et al., 1975) and the effects of sequence on recombination efficiencies (Moore and Sherman, 1975, 1977) in the light of the CYC1 gene sequence would also be of interest.

Sequence Preceding the Coding Region
Figure 5. Gel Electrophoretic (12% Acrylamide) Pattern of the Sequence Obtained by the Terminator Method Using pT8CT as Primer with Eco RI-Cleaved pYeCYC1(0.60) as Template
The sequence corresponds to nucleotides 530-609 and part of pBR322 (upper strand, Figure 2). This gel electrophoretic separation was carried out on a thin gel (0.5 mm) according to the procedure of Sanger and Coulson (1976). The resolution is much greater than that typically obtained with 1.5 mm gels (for example, Figure 4).

This paper describes the sequence of 248 nucleotides upstream from the first nucleotide of the coding sequence. There are several interesting structural features. From -190 to -120, most of the sequence is alternating purine and pyrimidine residues, and from -114 to -78, the upper strand of the DNA duplex is very rich in pyrimidine residues. Such extended regions of unusual structure suggest that neither strand of the DNA in this region has amino acid coding functions.

Size measurements of cDNA complementary to the mRNA for iso-1-cytochrome c suggest that the CYC1 messenger has a leader sequence at least 80 nucleotides long (Szostak et al., 1977). If it is assumed that this leader sequence is not derived from a DNA containing inserted sequences and that the primary transcript initiates at A or G, there are four potential start sites (-88, -90, -91 and -92) between position -80 and the purineless region -93 to -108. Approximately 30 nucleotides before these potential start sites is the heptanucleotide TATAAAA, a sequence which has been found near the beginning of the adenovirus 2 major late transcription unit and the Bombyx mori fibroin gene (Ziff and Evans, 1978; Tsujimoto and Suzuki, 1979). In each of these cases, the DNA region coding for the mature mRNA 5’ terminus (adjoining the capped end) lies 30-31 nucleotides to the right of the initial T of TATAAAA. DNA sequences similar to these (TATAAATA, with slight variations) were discovered by M. Goldberg and D. Hogness (personal communication) within the regions preceding the transcription start for four of the Drosophila histone genes. Pyrimidine-rich sequences have been noted immediately before the start of the adenovirus 2 major late transcript (Ziff and Evans, 1978), and similar sequences are present in the regions adjacent to those coding for the mature mouse immunoglobulin light chains and β-globin (Bernard, Hozumi and Tonegawa, 1978; Konkel, Tilghman and Leder, 1978). To assess the possible significance of TATAAAA and pyrimidine-rich sequences downstream for yeast CYC1 gene expression, it will be of interest to define the first nucleotide in the mRNA.
The availability of cloned CYC1-specific DNA makes it possible to sequence and map physically the CYC1 mRNA and thereby to determine where on the DNA sequence CYC1 mRNA transcription is initiated.

The DNA sequence upstream from the coding region sheds some light on the mechanism of initiation of translation. It has been suggested that eucaryote mRNAs contain a sequence immediately upstream from the initiating AUG which is complementary to part of the 3’ terminus of the 18S ribosomal RNA (Hagenbuchle et al., 1978), and this is possible in the case of the iso-1-cytochrome c mRNA. An extensive analysis of eucaryote mRNA sequences, however, leads to the conclusion that the only common feature is the initiating codon AUG (Baralle and Brownlee, 1978). This supports a model of eucaryote translation initiation whose signal is the AUG nearest the 5’ terminus of the mRNA (Kozak and Shatkin, 1978).
The DNA sequence reported here together with the genetics of the CYC1 locus supports this second model. A particularly striking feature of the sequence is the virtual absence of G residues in the upper strand from position -123 to -1. As a consequence, the first ATG upstream from the initiation codon is at -126. As noted above, the transcript start is probably to the right of residue -120, and hence the initiation codon is probably the AUG nearest the 5’ terminus of the mRNA. It should be noted that the region -120 to -1 contains several translation termination codons in each of the three possible reading frames of the upper DNA strand. The ATG codon at -126 (or those beyond it) is therefore incapable of initiating translation of the CYC1 gene.

The mutant cyc1-131, where the initiation codon is changed to GTG, was discussed above. Apart from revertants to the original sequence, it has two classes of revertants which produce altered but functional proteins (Stewart et al., 1971). In one of these, the protein lacks the first three amino acids; it results from the introduction of a new initiation site at amino acid 4 by the change AAG → ATG (Stewart et al., 1971). The second class of revertants results in a protein which is one amino acid longer than the wild-type. It can be seen from the present DNA sequence that an A to G change at position -1 results in a new initiation codon (ATA → ATG). This large flexibility in the position of the initiation codon is more easily accommodated by the second model of eucaryote translation initiation discussed above.

Another mutant, cyc1-362, results from an A to G change at -18 (J. W. Szostak, J. I. Stiles, F. Sherman and R. Wu, personal communication). It could be argued that this change has disrupted a ribosomal RNA binding site. It is notable, however, that the change introduces an ATG upstream from and in a different reading frame from that of iso-1-cytochrome c.
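The "first AUG" argument can be made concrete with a short scan of the leader. This is a minimal sketch rather than an analysis from the paper: the helper names are hypothetical, the 60-nucleotide leader (positions -60 to -1) is the upper strand row as transcribed from Figure 2 in this text and may carry transcription errors, and only the ATG of the coding region is appended. Positions follow the paper's convention, with no position 0.

```python
# Upper strand, positions -60..-1 (assumed transcription of Figure 2).
LEADER = "TAGCATAAATTACTATACTTCTATAGACACGCAAACACAAATACACACACTAAATTAATA"
WILD_TYPE = LEADER + "ATG"  # initiation codon at positions +1..+3
N = len(LEADER)


def position_to_index(position, leader_length=N):
    """Convert a paper-style position (..., -2, -1, +1, +2, ...) to a
    0-based string index."""
    if position < 0:
        return position + leader_length
    return position + leader_length - 1


def index_to_position(index, leader_length=N):
    """Inverse of position_to_index."""
    if index < leader_length:
        return index - leader_length
    return index - leader_length + 1


def substitute(seq, position, base):
    """Return seq with the base at the given position replaced."""
    i = position_to_index(position)
    return seq[:i] + base + seq[i + 1:]


def first_atg(seq, leader_length=N):
    """Return (position, in_frame) for the 5'-most ATG, or None.
    in_frame tells whether it shares the frame of the codon at +1."""
    i = seq.find("ATG")
    if i == -1:
        return None
    return index_to_position(i), (leader_length - i) % 3 == 0


cyc1_131 = substitute(WILD_TYPE, 1, "G")     # ATG -> GTG at the start
revertant = substitute(cyc1_131, -1, "G")    # ATA -> ATG just upstream
cyc1_362 = substitute(WILD_TYPE, -18, "G")   # A -> G at -18

for name, seq in [("wild type", WILD_TYPE), ("cyc1-131", cyc1_131),
                  ("revertant", revertant), ("cyc1-362", cyc1_362)]:
    print(name, first_atg(seq))
```

Under these assumptions the wild-type window has its first ATG at +1; the cyc1-131 change removes it; the A to G change at -1 creates an in-frame ATG beginning at -3, consistent with a protein one residue longer; and the cyc1-362 change creates an ATG beginning at -20 that is out of frame with the iso-1-cytochrome c codons. The scan covers only this 63-nucleotide window, so it says nothing about ATG triplets farther upstream or downstream.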
It seems probable that initiation at this new AUG in the resultant mRNA precludes initiation at the downstream AUG codon for iso-1-cytochrome c.

A feature of translation initiation in higher eucaryotes is the sequence AUGG in a large fraction of the initiation sites (Kozak and Shatkin, 1978). This corresponds to the anticodon sequence CCAU in the initiator tRNA(Met) (Piper and Clark, 1974; Gillum et al., 1975). The yeast initiator tRNA(Met) has the anticodon sequence UCAU (Simsek and RajBhandary, 1972), and it is of interest that the gene for yeast iso-1-cytochrome c has the corresponding sequence 5’-ATGA-3’. Studies on mutants where the 3’ terminal A is replaced by other nucleotides do not give a clear picture of its relationship to translational efficiency (Stewart et al., 1971). Recent studies on procaryote translation initiation, however, suggest that nucleotides immediately adjacent to the AUG may be of some importance in translation efficiency (Taniguchi and Weissman, 1978; Ganoza, Fraser and Neilson, 1978; Mandersheid, Bertram and Gassen, 1978). Clearly, an analysis of mutations in this region of the yeast iso-1-cytochrome c gene would provide a fruitful source of information on the detailed mechanism of translation initiation.

Many proteins which pass through or which are bound to cell membranes are synthesized with a lipophilic signal peptide (Devillers-Thiery et al., 1975; Habener et al., 1978). Since yeast cytochrome c is synthesized on cytoplasmic polyribosomes and then transported through the outer membrane into the mitochondrion, a procytochrome c begun by such a signal peptide might have been anticipated. The CYC1 gene sequence, however, identifies the ATG codon adjoining N terminal threonine as the translation start. Thus, only the single methionine residue is removed after translation. In this respect, processing of nascent CYC1 polypeptides differs from that of other cytoplasmically synthesized yeast mitochondrial proteins. Significantly higher molecular weight precursor polypeptides are observed in vivo and in vitro for the cytoplasmically produced subunits of cytochrome oxidase, ATPase and the cytochrome b·c1 complex (Schatz, 1978; G. Schatz, M. L. Maccechini and C. CW, personal communication).

Sequence Following the Coding Region
The most notable features of this sequence are pyrimidine-rich tracts in the upper DNA strand and a relative paucity of alternating purine-pyrimidine-containing tracts. Here the sequence is the opposite of that preceding the coding region. The T-rich tracts in the upper strand of the DNA at 455-467 and 506-521 are very similar to those following the structural genes for yeast 5S RNA and some tRNA genes (Goodman, Olson and Hall, 1977; Maxam et al., 1977; Valenzuela et al., 1977, 1978), which may be transcription termination signals for RNA polymerase III. It also is notable that C-rich tracts analogous to that at 390-399 are found in the untranslated 3’ termini of some eucaryote mRNAs (McReynolds et al., 1978).
The translation termination signal is TAA, which is commonly found in eucaryote genes (Proudfoot, 1977; Efstratiadis, Kafatos and Maniatis, 1977; Proudfoot et al., 1977; Seeburg et al., 1977; Shine et al., 1977; Ullrich et al., 1977; McReynolds et al., 1978). A feature of eucaryote mRNAs is the sequence AAUAAA about 20 nucleotides before the 3’ terminal poly(A) (Proudfoot and Brownlee, 1976; Cheng et al., 1976). The sequence AATAAA is not present in the DNA sequence reported here. Thus either AAUAAA is not a feature of poly(A)-containing yeast mRNA or the transcription termination signal is beyond nucleotide 609. This dilemma can best be resolved by characterization of the 3’ terminus of the iso-1-cytochrome c mRNA.

Conclusions
The organization of the iso-1-cytochrome c gene in the CYC1 locus of Saccharomyces cerevisiae has been defined by direct sequence determination of clones of yeast DNA. The completion of this project confirms the feasibility of using a synthetic oligodeoxyribonucleotide as a probe for gene isolation (Gillam et al., 1977; Montgomery et al., 1978) and has involved the development of new strategies for DNA sequence determination. Extension of these studies will involve defining the relationship of the gene to the iso-1-cytochrome c mRNA sequence and defining the sequences between the CYC1 locus and adjacent genetic loci.

Experimental Procedures
Recombinant Yeast DNA
Isolation of the clones in pBR322 from wild-type yeast [pYeCYC1(0.60)] and from the mutant cyc1-9 [pYecyc1-9(5.10)] (see Figure 1) has been described (Montgomery et al., 1978). The clone from wild-type DNA, pYeCYC1(4.50) (see Figure 1), was constructed by cutting a 4.5 kilobase Eco RI-Hind III fragment from A-YeC and inserting it into pBR322 using the same procedures (Montgomery et al., 1978). Samples of all three DNAs and of pBR322 were purified by filtration on agarose followed by centrifugation in CsCl containing ethidium bromide (Elwell et al., 1975).

DNA Sequence Determination
The “terminator” method was used as described by Sanger et al. (1977) using dideoxyribonucleoside-5’ triphosphates (PL Biochemicals). Deoxynucleoside-5’ triphosphates (PL Biochemicals) were purified on DEAE-cellulose (Brown and Smith, 1977). E. coli DNA polymerase I (Klenow) was obtained from Boehringer Mannheim, and α-32P-deoxyribonucleoside-5’ triphosphates (340 Ci/mmole) were from New England Nuclear.

The “terminator” method used Hind III-cleaved pBR322 as primer and Eco RI-cleaved pYeCYC1(0.60) as template. The pBR322 (4 µg) in 60 mM NaCl, 50 mM Tris-HCl (pH 7.5), 7 mM MgCl2 (10 µl, buffer A) was digested with Hind III (5 U) at 37°C for 1 hr and then heated at 100°C for 3 min in a sealed capillary followed by immediate cooling in ice water. The pYeCYC1(0.60) (2 µg) in buffer A (10 µl) was digested with Eco RI (4 U) at 37°C for 1 hr and then heated at 100°C for 3 min in a sealed capillary followed by immediate cooling in ice water. The two solutions were mixed and incubated at 65°C for 1 hr to anneal the primer-template combination. This was used as substrate for the “terminator” method of sequence determination with dideoxynucleotide-5’ triphosphates as terminators, essentially as described by Sanger et al. (1977).
The reaction was completed by digestion of each of the four products with Hind III (1 U) at 37°C for 30 min followed by denaturation and gel electrophoresis as described by Brown and Smith (1977). The results of this experiment are shown in Figure 4.

In priming experiments with synthetic oligodeoxyribonucleotides, the primers used were pT&T (Gillam et al., 1978), pAsGA3 (Gillam, Waterman and Smith, 1975), pA4G3TG2, ~AzGAzGAI and pC3AzGA3 (synthesized using E. coli polynucleotide phosphorylase by the methods of Gillam et al., 1978) and the 13-mer (Gillam et al., 1977). The pAa, pC3 and PAzGAzG used as starting materials were obtained by published procedures (Khorana, Vizsolyi and Ralph, 1962; Astell and Smith, 1971). pYeCYC1(0.60) (2 µg) in buffer A (10 µl) was digested with Hind III (4 U) at 37°C for 1 hr. Oligodeoxyribonucleotide (25-50 ng) was added and the solution, in a sealed capillary, was heated at 100°C for 3 min and immediately cooled in ice water. This primer-template combination was used as the substrate for the “terminator” method, and the products after denaturation were analyzed by gel electrophoresis as described by Brown and Smith (1977) or Sanger and Coulson (1978) (Figure 5). Experiments using pTACT, pA4G3TG2 and pC3A3GA3 as primers were carried out in the same way except that the pYeCYC1(0.60) was cleaved with Eco RI (5 U). The 13-mer was used as primer on pYecyc1-9(5.1) cleaved with Hind III. The
experiment with ~AzGAzGA~ used as template a 400 nucleotide pair fragment of yeast DNA excised from pYeCYC1(4.50) by Eco RI and Sma I.

Sequencing by base-specific partial degradation was carried out by the method of Maxam and Gilbert (1977). The sequence determination at the Eco RI site of pYeCYC1(0.60) has been described (Montgomery et al., 1978) and the same procedure was used for the sequence originating at this site in pYeCYC1(4.50). Sequencing at the Hind III site in pYeCYC1(0.60) and at the Xho I site in pYeCYC1(4.5) was carried out after 3’ terminal labeling using α-32P-deoxyribonucleoside-5’ triphosphates (3000 Ci/mmole; Radiochemical Centre, Amersham) and E. coli DNA polymerase I (Klenow). Thus Xho I-cleaved pYeCYC1(4.5) (5 µg) in 50 mM NaCl, 7 mM MgCl2, 7 mM β-mercaptoethanol, 7 mM Tris-HCl (pH 7.4) (10 µl) containing dCTP (100 pmole), dTTP (100 pmole) and α-32P-dGTP (25 pmole, ~2000 Ci/mmole) and 1 U of E. coli DNA polymerase I (Klenow) were incubated at 20°C for 15 min. After treatment with phenol and ethanol precipitation, the DNA was cleaved with Eco RI and the fragments were separated by electrophoresis in acrylamide (4%). Labeling at the 3’ terminus of the Hind III cleavage was carried out in the same way, but the only deoxyribonucleotide-5’ triphosphate was α-32P-dATP (25 pmole, ~2000 Ci/mmole).

Restriction endonuclease cleavage sites in the yeast fragments of pYeCYC1(0.60) and pYeCYC1(4.50) were mapped by the method of Smith and Birnstiel (1976). Eco RI and Hind III were gifts from W. Schaffner. Fnu AI (= Hinf I), Fnu CI (= Mbo I), Fnu DI (= Hae III), Fnu DII, Fnu DIII (= Hha I) and Fnu EI were gifts from A. Lui (A. Lui, B. C. McBride and M. Smith, unpublished results; A. Lui, B. C. McBride, G. Vovis and M. Smith, unpublished results); Fnu 4HI was prepared by D. W. Leung (D. W. Leung, A. Lui, H. Merilees, B. C. McBride and M. Smith, unpublished results). Alu I, Hae III, Hha I, Mbo I and Mbo II were prepared essentially as described by Roberts et al.
(1975) and Pst I as described by Brown and Smith (1978). Ava II, Bam HI, Bgl II, Hinc II (Hind II), Pvu II, Sal I, Sma I, Taq I, Xba I and Xho I were from New England Biolabs. The cleavage specificities of these enzymes have been summarized by Roberts (1978) [with the exception of Fnu 4HI (GCNGC)]. 1 U of restriction nuclease completely digests 1 µg of substrate DNA (usually lambda DNA) in 1 hr at 37°C.

The sequence of the DNA was recorded, edited, translated and examined for specific sequences (such as restriction enzyme recognition sites) using the computer programs described by McCallum and Smith (1977).

All experiments involving recombinant DNA were carried out according to the guidelines operative in Canada and the United States.

Acknowledgments
We are grateful to Anne Lui and Walter Schaffner for gifts of enzymes, to Patricia Jahnke for pC3 and to Anne Lui for the preparation of figures. D. W. L. is the recipient of a Killam postdoctoral fellowship. We also wish to thank David Hogness for discussions about his comparison of eucaryotic promoter regions. The research was supported by the Medical Research Council of Canada, of which M.S. is a research associate, and by the NIH.

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received November 27, 1978; revised January 19, 1979
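
The search for restriction enzyme recognition sites mentioned in Experimental Procedures can be illustrated with a short sketch. This is not the McCallum and Smith (1977) software, whose implementation is not described here; the function names `find_sites` and `map_sites`, the `SITES` table and the test sequence are all hypothetical and chosen only for illustration. The Fnu 4HI site (GCNGC) is taken from the text; the other recognition sequences are the standard ones for those enzymes.

```python
import re

# Illustrative sketch only; not the programs of McCallum and Smith (1977).
# Map each nucleotide symbol to a regular-expression fragment.
# "N" stands for any base, as in the Fnu 4HI site GCNGC.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}

# Recognition sequences (5' to 3'). GCNGC is given in the text;
# the remaining entries are the well-known sites for these enzymes.
SITES = {
    "Eco RI": "GAATTC",
    "Hind III": "AAGCTT",
    "Xho I": "CTCGAG",
    "Sma I": "CCCGGG",
    "Fnu 4HI": "GCNGC",
}


def find_sites(sequence, recognition):
    """Return 1-based start positions of every match, including overlaps."""
    pattern = "".join(DEGENERATE[base] for base in recognition.upper())
    # A zero-width lookahead reports overlapping occurrences as well.
    regex = re.compile("(?=(" + pattern + "))")
    return [m.start() + 1 for m in regex.finditer(sequence.upper())]


def map_sites(sequence, sites=SITES):
    """Return a dict of enzyme name -> list of 1-based site positions."""
    return {enzyme: find_sites(sequence, site) for enzyme, site in sites.items()}


if __name__ == "__main__":
    # Invented test sequence, not taken from the yeast DNA in this paper.
    demo = "TTGAATTCAAGCAGCTGCGGCAAGCTTCC"
    for enzyme, positions in map_sites(demo).items():
        print(enzyme, positions)
```

All sites in the table are palindromic, so scanning one strand is sufficient in this sketch; a non-palindromic site would also require a search of the reverse complement.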
References
Astell, C. A. and Smith, M. (1971). Thermal elution of complementary sequences from cellulose columns with covalently attached oligonucleotides of known length and sequence. J. Biol. Chem. 246, 1944-1948.
Baralle, F. E. and Brownlee, G. G. (1978). AUG is the only recognizable signal sequence in the 5’ non-coding regions of eukaryote mRNA. Nature 274, 84-87.
Bernard, O., Hozumi, N. and Tonegawa, S. (1978). Sequences of mouse immunoglobulin light chain genes before and after somatic changes. Cell 15, 1133-1144.
Boyer, P. D., Chance, B., Ernster, L., Mitchell, P., Racker, E. and Slater, E. C. (1977). Oxidative phosphorylation and photophosphorylation. Ann. Rev. Biochem. 46, 955-1028.
Brown, N. L. and Smith, M. (1978). The mapping and sequence determination of the single site in φX174am3 replicative form DNA cleaved by restriction endonuclease PstI. FEBS Letters 65, 284-287.
Brown, N. L. and Smith, M. (1977). The sequence of a region of bacteriophage φX174 DNA coding for parts of genes A and B. J. Mol. Biol. 116, 1-28.
Cheng, C. C., Brownlee, G. G., Carey, N. H., Doel, M. T., Gillam, S. and Smith, M. (1976). The 3’ terminal sequence of chicken ovalbumin messenger RNA and its comparison with other messenger RNA molecules. J. Mol. Biol. 107, 527-547.
Clark-Walker, G. D. and Linnane, A. W. (1967). The biogenesis of mitochondria in Saccharomyces cerevisiae. J. Cell Biol. 34, 1-14.
Delange, R. J., Glazer, A. N. and Smith, E. L. (1970). Identification and location of ε-N-trimethyllysine in yeast cytochromes c. J. Biol. Chem. 245, 3325-3327.
Devillers-Thiery, A., Kindt, T., Scheele, G. and Blobel, G. (1975). Homology in amino-terminal sequence of precursors to pancreatic secretory proteins. Proc. Nat. Acad. Sci. USA 72, 5018-5020.
Dickerson, R. E., Takano, T., Eisenberg, D., Kallai, O. B., Samson, L., Cooper, A. and Margoliash, E. (1971). Ferricytochrome c I. General features of the horse and bonito proteins at 2.8 Å resolution. J. Biol. Chem. 246, 1511-1535.
Efstratiadis, A., Kafatos, F. C. and Maniatis, T. (1977). The primary structure of rabbit β-globin mRNA as determined from cloned DNA. Cell 10, 571-585.
Elwell, L. P., de Graaff, J., Seebert, D. and Falkow, S. (1975). Plasmid-linked ampicillin resistance in Haemophilus influenza type b. Infect. Immunol. 12, 404-410.
Ferguson-Miller, S., Brautigan, D. L. and Margoliash, E. (1978). Definition of cytochrome c binding domains by chemical modification. III. Kinetics of reaction of carboxydinitrophenyl cytochromes c with cytochrome c oxidase. J. Biol. Chem. 253, 149-159.
Ganoza, M. C., Fraser, A. R. and Neilson, T. (1978). Nucleotides contiguous to AUG affect translation initiation. Biochemistry 17, 2769-2775.
Gillam, S., Waterman, K. and Smith, M. (1975). Enzymatic synthesis of oligonucleotides of defined sequence. Addition of short blocks of nucleotide residues to oligonucleotide primers. Nucl. Acids Res. 2, 613-624.
Gillam, S., Jahnke, P. and Smith, M. (1978). Enzymatic synthesis of oligodeoxyribonucleotides of defined sequence. J. Biol. Chem. 253, 2532-2539.
Gillam, S., Rottman, F., Jahnke, P. and Smith, M. (1977). Enzymatic synthesis of oligonucleotide of defined sequence; synthesis of a segment of yeast iso-1-cytochrome c gene. Proc. Nat. Acad. Sci. USA 74, 98-100.
Gillum, A., Urquhart, N., Smith, M. and RajBhandary, U. L. (1975). Nucleotide sequence of salmon testes and salmon liver cytoplasmic initiator tRNA. Cell 6, 395-405.
Goodman, H. M., Olson, M. V. and Hall, B. D. (1977). Nucleotide sequence of a mutant eukaryote gene: the yeast tyrosine-inserting ochre suppressor SUP4-o. Proc. Nat. Acad. Sci. USA 74, 5453-5457.
Habener, J. F., Rosenblatt, M., Kemper, B., Kronenberg, H. M., Rich, A. and Potts, J. T., Jr. (1978). Pre-proparathyroid hormone: amino acid sequence, chemical synthesis, and some biological studies of the precursor region. Proc. Nat. Acad. Sci. USA 75, 2616-2620.
Hagenbüchle, O., Santer, M., Steitz, J. A. and Mans, R. J. (1978). Conservation of the primary structure at the 3’ end of 18S rRNA from eucaryotic cells. Cell 13, 551-563.
Kafatos, F. C., Efstratiadis, A., Forget, B. G. and Weissman, S. M. (1977). Molecular evolution of human and rabbit β-globin mRNAs. Proc. Nat. Acad. Sci. USA 74, 5618-5622.
Khorana, H. G., Vizsolyi, J. P. and Ralph, R. K. (1962). Studies on polynucleotides XII. Experiments on the polymerization of nucleotides. A comparison of different polymerizing agents and a general improvement in the isolation of synthetic polynucleotides. J. Am. Chem. Soc. 84, 414-418.
Konkel, D. A., Tilghman, S. M. and Leder, P. (1978). The sequence of the chromosomal mouse β-globin major gene: homologies in capping, splicing and poly(A) sites. Cell 15, 1125-1132.
Kozak, M. and Shatkin, A. J. (1978). Identification of features in 5’ terminal fragments from reovirus mRNA which are important for ribosome binding. Cell 13, 201-212.
Lawrence, C. W. and Christensen, R. B. (1978). Ultraviolet-induced reversion of cyc1 alleles in radiation-sensitive strains of yeast. J. Mol. Biol. 122, 1-21.
Lederer, F., Simon, A. M. and Verdière, J. (1972). Saccharomyces cerevisiae iso-cytochrome c: revision of the amino acid sequence between the cysteine residues. Biochem. Biophys. Res. Commun. 47, 55-58.
McCallum, D. and Smith, M. (1977). Computer processing of DNA sequence data. J. Mol. Biol. 116, 29-30.
McReynolds, L., O’Malley, B. W., Nisbet, A. D., Fothergill, J. E., Givol, D., Fields, S., Robertson, M. and Brownlee, G. G. (1978). Sequence of chicken ovalbumin mRNA. Nature 273, 723-728.
Mandershied, V., Bertram, S. and Gassen, H. G. (1978). Initiator-tRNA recognizes a tetranucleotide codon during the 30S initiation complex formation. FEBS Letters 90, 162-166.
Margoliash, E. and Schejter, A. (1966). Cytochrome c. Adv. Protein Chem. 21, 113-286.
Maxam, A. M. and Gilbert, W. (1977). A new method for sequencing DNA. Proc. Nat. Acad. Sci. USA 74, 560-564.
Maxam, A. M., Tizard, R. D., Skryabin, K. and Gilbert, W. (1977). Promoter region for yeast 5S ribosomal RNA. Nature 267, 643-645.
Montgomery, D. L., Hall, B. D., Gillam, S. and Smith, M. (1978). Identification and isolation of the yeast cytochrome c gene. Cell 14, 673-680.
Moore, C. W. and Sherman, F. (1975). Role of DNA sequences in genetic recombination in the iso-1-cytochrome c gene of yeast I. Discrepancies between physical distances and genetic distances determined by five mapping procedures. Genetics 79, 397-418.
Moore, C. W. and Sherman, F. (1977). Role of DNA sequences in genetic recombination in the iso-1-cytochrome c gene of yeast II. Comparison of mutants altered at the same and nearby base pairs. Genetics 85, 1-22.
Muscatello, V. and Carafoli, E. (1969). The oxidation of exogenous and endogenous cytochrome c in mitochondria. J. Cell Biol. 40, 602-621.
Ohmori, H., Tomizawa, J. and Maxam, A. M. (1978). Detection of 5-methylcytosine in DNA sequences. Nucl. Acids Res. 5, 1479-1486.
Piper, P. W. and Clark, B. F. C. (1974). Primary structure of a mouse myeloma cell initiation transfer RNA. Nature 247, 516-520.
Proudfoot, N. J. (1977). Complete 3’ noncoding region sequences of rabbit and human β-globin messenger RNAs. Cell 10, 559-570.
Proudfoot, N. J., Gillam, S., Smith, M. and Langley, J. I. (1977). Nucleotide sequence of the 3’ terminal region of rabbit α-globin messenger RNA: comparison with human α-globin messenger RNA. Cell 11, 807-818.
Proudfoot, N. J. and Brownlee, G. G. (1976). 3’ noncoding sequences in eukaryote messenger RNA. Nature 263, 211-214.
Roberts, R. J. (1978). Restriction and modification enzymes and their recognition sequences. Gene 4, 183-193.
Roberts, R. J., Breitmeyer, J. B., Tabachnik, N. F. and Myers, P. A. (1975). A second specific endonuclease from Haemophilus aegyptius. J. Mol. Biol. 91, 121-123.
Sanger, F. and Coulson, A. R. (1978). The use of thin acrylamide gels for DNA sequencing. FEBS Letters 87, 107-110.
Sanger, F., Nicklen, S. and Coulson, A. R. (1977). DNA sequencing with chain-terminating inhibitors. Proc. Nat. Acad. Sci. USA 74, 5463-5467.
Schaffner, W., Kunz, G., Daetwler, H., Telford, J., Smith, H. O. and Birnstiel, M. L. (1978). Genes and spacers of cloned sea urchin histone DNA analyzed by sequencing. Cell 14, 655-671.
Schatz, G. (1978). Structure, genetic control and assembly of the mitochondrial inner membrane in yeast. 9th International Conference on Yeast Genetics and Molecular Biology, Abstracts, p. 9.
Seeburg, P. H., Shine, J., Martial, J. A., Baxter, J. D. and Goodman, H. M. (1977). Nucleotide sequence and amplification in bacteria of structural gene for rat growth hormone. Nature 270, 486-494.
Sherman, F. and Stewart, J. W. (1971). Genetics and biosynthesis of cytochrome c. Ann. Rev. Genet. 5, 257-296.
Sherman, F. and Stewart, J. W. (1974). Variation of mutagenic action on nonsense mutants at different sites in the iso-1-cytochrome c gene of yeast. Genetics 78, 97-113.
Sherman, F., Stewart, J. W., Margoliash, E., Parker, J. and Campbell, W. (1966). The structural gene for yeast cytochrome c. Proc. Nat. Acad. Sci. USA 55, 1498-1504.
Sherman, F., Stewart, J. W., Jackson, M., Gilmore, R. A. and Parker, J. H. (1974). Mutants of yeast defective in iso-1-cytochrome c. Genetics 77, 255-284.
Sherman, F., Jackson, M., Liebman, S. W., Schweingruber, M. and Stewart, J. E. (1975). Deletion map of CYC1 mutants and its correspondence to mutationally altered iso-1-cytochrome c of yeast. Genetics 81, 51-73.
Shine, J., Seeburg, P. H., Martial, J. A., Baxter, J. D. and Goodman, H. M. (1977). Construction and analysis of recombinant DNA for human chorionic somatomammotropin. Nature 270, 494-499.
Simsek, M. and RajBhandary, U. L. (1972). The primary structure of yeast initiator transfer ribonucleic acid. Biochem. Biophys. Res. Commun. 49, 508-515.
Smith, H. O. and Birnstiel, M. L. (1976). A simple method for DNA restriction site mapping. Nucl. Acids Res. 3, 2387-2398.
Stewart, J. W. and Sherman, F. (1974). Yeast frameshift mutations identified by sequence changes in iso-1-cytochrome c. In Molecular and Environmental Aspects of Mutagenesis, L. Prakash, F. Sherman, M. W. Miller, C. W. Lawrence and H. W. Taber, eds. (Springfield, Illinois: Charles C Thomas), pp. 102-127.
Stewart, J. W., Sherman, F., Shipman, N. A. and Jackson, M. (1971). Identification and mutational relocation of the AUG codon initiating translation of iso-1-cytochrome c in yeast. J. Biol. Chem. 246, 7429-7445.
Szostak, J. W., Stiles, J. I., Bahl, C. P. and Wu, R. (1977). Specific binding of a synthetic oligodeoxyribonucleotide to yeast cytochrome c mRNA. Nature 265, 61-63.
Taniguchi, T. and Weissman, C. (1978). Site directed mutations in the initiator region of the bacteriophage Qβ coat cistron and their effect on ribosome binding. J. Mol. Biol. 118, 533-565.
Tsujimoto, Y. and Suzuki, Y. (1979). Structural analysis of the fibroin gene at the 5’ end and its surrounding regions. Cell 16, 425-436.
Ullrich, A., Shine, J., Chirgwin, J., Pictet, R., Tischer, E., Rutter, W. J. and Goodman, H. M. (1977). Rat insulin genes: construction of plasmids containing the coding sequences. Science 196, 1313-1319.
Valenzuela, P., Bell, G. I., Masiarz, F. R., DeGennaro, L. J. and Rutter, W. J. (1977). Nucleotide sequence of the yeast 5S ribosomal RNA gene and adjacent putative control regions. Nature 267, 641-643.
Valenzuela, P., Venegas, A., Weinberg, F., Bishop, R. and Rutter, W. J. (1978). Structure of yeast phenylalanine-tRNA genes; an intervening DNA segment within the region coding for the tRNA. Proc. Nat. Acad. Sci. USA 75, 190-194.
Ziff, E. B. and Evans, R. M. (1978). Coincidence of the promoter and capped 5’ terminus of RNA from the adenovirus 2 major late transcription unit. Cell 15, 1463-1475.
Zitomer, R. S. and Hall, B. D. (1976). Yeast cytochrome c messenger RNA. J. Biol. Chem. 251, 6320-6326.
Zitomer, R. S. and Nichols, D. L. (1978). Kinetics and glucose repression of yeast cytochrome c. J. Bacteriol. 135, 39-44.