J. Mol. Biol. (1965) 13, 432-450
Studies on the Bacteriophage MS2
I. Distribution of Purine Sequences in the Viral RNA and in Yeast RNA
W. FIERS, L. LEPOUTRE AND L. VANDENDRIESSCHE
Laboratory of Physiological Chemistry, State University of Ghent, Belgium
(Received 1 March 1965, and in revised form 24 May 1965)
Methods have been worked out for digestion of RNA with pancreatic ribonuclease and for fractionation of the products. In appropriate conditions the hydrolysis seemed to proceed strictly according to the known specificity of the enzyme. The resulting oligonucleotides of general formula (purine nucleotide)ₙ pyrimidine nucleotide were separated on the basis of chain length by chromatography on DEAE-cellulose in the presence of 7 M-urea. Other factors, however, influenced the separation as well. The methodology was worked out by means of yeast RNA and some of these results are presented. The distribution of purines in the bacteriophage MS2 RNA reveals a striking pattern of alternating excess (n = 0, 2, 4, 6, 8) and deficit (n = 1, 3, 5, 7) over the amount calculated on the basis of a random distribution. A similar rhythm is present in yeast RNA. The estimated numbers of the various tracts per MS2 RNA molecule are reported. There occur three octanucleotides, three nonanucleotides and two decanucleotides (n respectively 7, 8 and 9). The minimum molecular weight, based on the decanucleotide content, amounts to 1·06 × 10⁶, which is in complete agreement with the physical value. The implications of the results, especially the rhythmic distribution, for the genetic code are discussed. It may be concluded that the sequence of triplets is not random, but subject to a higher degree of order.
1. Introduction
In 1961, Loeb & Zinder described an RNA-bacteriophage named f2, which was to become the first representative of a new class of viruses. Very similar phages (among others MS2, R17, p, fr and M12) have since been isolated in various parts of the world. They are characterized by a particle weight of 3·6 to 4·2 × 10⁶ daltons, an RNA content of about 31% and a male-specific host range (Escherichia coli, Hfr or F+). Physico-chemical studies by Strauss & Sinsheimer (1963) on MS2, by Mitra, Enger & Kaesberg (1963) and by Gesteland & Boedtker (1964) on R17 and by Marvin & Hoffmann-Berling (1963) on fr, indicated that each viral particle contains a single RNA molecule weighing 1·0 to 1·3 × 10⁶ daltons. Based on the data of the former investigators, the MS2 RNA chain would consist of about 3050 nucleotides. The isolated viral RNA is infectious for E. coli protoplasts, even of female origin (Davis, Strauss & Sinsheimer, 1961). Multiplication of the virus seems to require neither synthesis (Cooper & Zinder, 1962; Hofschneider, 1963) nor transcription (Haywood & Sinsheimer, 1963) of DNA. As no other cellular constituent is known to be endowed with stable genetic information, it may be concluded that the incoming
viral RNA-chain contains all the information for the processes which ultimately lead to a burst of 10³ to 10⁴ progeny particles. Moreover, at least part of this RNA is directly translatable into meaningful polypeptides (Nathans, Notani, Schwartz & Zinder, 1962). In an attempt to gain some insight into the coding of this viral information, we have determined the distribution of purine sequences. This analysis was made possible by the specificity of pancreatic ribonuclease, which only splits phosphodiester links adjacent to pyrimidine residues. The resulting oligonucleotides can be separated, mainly on the basis of net charge, by chromatography in the presence of urea (Bell, Tomlinson & Tener, 1964). The results are almost consistent with a random arrangement, except that they reveal interesting rhythmic deviations. An estimate was made of the minimum molecular weight of the viral RNA molecule, based on the relative content of the longest purine sequence. This value is in good agreement with the physical data. The methodology was worked out with yeast RNA, presumably of ribosomal origin, and distributions of the purine sequences in this material are also included. Preliminary notes have appeared (Lepoutre & Fiers, 1964, 1965).
2. Materials and Methods
(a) Chemicals
Chromatographically purified pancreatic ribonuclease A (Worthington Biochem. Corp.) was used for experiments E-PG, F-PH and L-PI, and ribonuclease D (Mann Research Lab.) for the others (no preference is implied). Freon-11 was bought as "Forane 11" from Electrochimie, Ugine, France. Labelled sodium phosphate [³²P] was obtained as a sterile, carrier-free, isotonic solution, pH 7 (2 mc/ml.) from CEN, Mol-Donk, Belgium. DEAE-cellulose was obtained from Serva Entwicklungslabor, Heidelberg, with a capacity of about 0·75 m-equiv./g. Urea (UCB, pro analysi grade) was used as purchased.
(b) Yeast RNA
Several preparations, made according to Crestfield, Smith & Allen (1955), were analysed. This procedure involves precipitation of the RNA with 1 M-NaCl, under conditions which leave sRNA and DNA in solution. The product obtained was highly polymerized and may be considered as essentially of ribosomal origin. The preparation had been stored for more than a year as a lyophilized powder at -15°C. Further purification was deemed necessary, especially to remove heavy-metal ions which inhibited the pancreatic ribonuclease. An approximately 7% aqueous sodium ribonucleate solution (pH 5 to 6) was treated twice with 0·5 vol. chloroform-n-octanol (5:1). After addition of 0·1 vol. 20% potassium acetate, adjusted to pH 5·0, followed by 2 vol. 95% ethanol, a precipitate formed overnight at -15°C. It was centrifuged down, redissolved in 0·01 M-EDTA (pH 7·0), and dialysed against the same solution for 6 days (changed twice daily). The Visking dialysis tubing had been treated by boiling for 10 min in a 2% (w/v) sodium bicarbonate + 0·005 M-EDTA solution. After precipitation as above, the RNA was taken up in 0·1 M-sodium acetate + 0·001 M-EDTA (pH 6·7) and stored frozen until use. For experiments K and L, the RNA was further extracted with 1 vol. phenol (redistilled and equilibrated with the same buffer), in an attempt to remove suspected contaminating ribonucleases (Rushizky, Greco, Hartley & Sober, 1963). All operations were carried out at 0°C.
(c) Bacteriophage MS2
The host bacterium E. coli C3000 and the bacteriophage MS2 were kindly made available by Dr R. L. Sinsheimer. A reasonably homogeneous laboratory stock of the latter was made by passing twice through a single plaque isolate. Standard methods of bacteria and phage enumeration were used (Adams, 1959). The media and culture conditions were as described by Davis & Sinsheimer (1963) and by Strauss & Sinsheimer (1963).
The host bacteria were grown in 10-l. batches of MS medium to a concentration of 4 × 10⁸/ml. under oxygen aeration. Phages were added at a multiplicity of 5 and stirring was reduced during 15 min. The culture was kept for another 6 hr at 37°C and then cooled to 0°C. Further operations were carried out at the latter temperature unless otherwise indicated. Titres obtained varied from 3 to 7 × 10¹²/ml. Purification steps (i), (ii) and (v) were taken over from Strauss & Sinsheimer (1963) and step (iii) from Gesteland & Boedtker (1964), where full details can be found.
(i) Twenty-five ml. 0·1 M-EDTA (pH 7·0) and 280 g ammonium sulphate were added per l. lysate, the latter over a 2-hr period. After another 2-hr period the precipitate was centrifuged off (30 min at 9000 g) and resuspended in 100 ml. 0·1 M-NaCl + 0·05 M-tris-HCl + 0·01 M-EDTA, adjusted to pH 7·6.
(ii) The suspension was shaken with 1 vol. Freon-11 and the latter centrifuged off and re-extracted with 0·5 vol. of the NaCl-tris-EDTA buffer. The aqueous phases were combined.
(iii) One-third vol. cold methyl alcohol was added over a 150-min period at -12°C. Another 150 min later the precipitate was centrifuged off in a Servall RC2 centrifuge at -12°C. It was redissolved in 50 ml. NaCl-tris-EDTA buffer.
(iv) The virus was spun out in the rotor 30 of the Spinco model E by centrifugation for 150 min at 30,000 rev./min. It was resuspended in 9 ml. buffer and left overnight.
(v) The solution was clarified by low-speed centrifugation (15 min at 12,000 g) and 0·55 g CsCl was added per g. The solution was distributed in 3 tubes of the Spinco SW39 rotor and centrifuged for 30 hr at 37,000 rev./min and 2°C. The main band contained the virus. Drops were collected through a pinhole.
(vi) Tubes containing the virus fractions were pooled and transferred quantitatively to a small dialysis bag, pretreated by boiling in a 2% (w/v) sodium bicarbonate + 0·005 M-EDTA solution. Dialysis was carried out for 30 to 48 hr against 3 changes of about 200 vol. 0·1 M-sodium acetate + 0·01 M-tris acetate + 0·001 M-EDTA, adjusted with acetic acid to pH 6·0.
The spectra of the final solutions were measured in 0·1 M-NaCl + 0·01 M-tris-HCl + 0·001 M-EDTA (pH 7·6). The λmax was at 260 mµ and the λmin at 240 mµ. The ratio Amax/Amin was 1·516 ± 0·022 and the ratio A₂₆₀/A₂₈₀ 1·807 ± 0·017. An absorbancy coefficient at 260 mµ of 8·03 for a 1 mg/ml. solution was assumed (Strauss & Sinsheimer, 1963). Yields varied from 100 to 480 mg. The plaque-forming efficiency was in the range 12 to 22·5%.
(d) ³²P-labelled bacteriophage MS2
Forty-eight ml. of modified TPG medium (Davis & Sinsheimer, 1963) was inoculated with 2 ml. of a C3000-culture in the same medium. Bacteria were grown to a density of 2 × 10⁶ per ml. under aeration. They were infected at a multiplicity of 10, and 5 min later 4 mc of carrier-free sterile sodium phosphate was added. Aeration at 37°C was continued for 4 hr. Often, although not always, complete clearing of the culture was observed. Titres ranged from 7·5 × 10¹¹/ml. (experiment PH) to 1·2 × 10¹²/ml. (experiment PE). In experiment PH, the bacteria were grown in the presence of ³²P (8 mc added to 100 ml. culture), and their growth followed by turbidimetry. This modification, however, did not seem to enhance the final incorporation. The purification was the same as for the cold virus, suitably scaled down, except for step (i). Approximately 2·5 mg carrier MS2 was added prior to the (NH₄)₂SO₄-fractionation. In order to improve the yield, the supernatant fraction was recentrifuged for 150 min at 78,000 g and the resulting precipitate combined with the (NH₄)₂SO₄-precipitate. In experiments PH and PI, the addition of (NH₄)₂SO₄ was omitted, without seemingly affecting the purity of the final preparation. Step (iii) was omitted. Step (iv) involved 150-min centrifugation at 35,000 rev./min in the Spinco rotor 40.
(e) Preparation of bacteriophage RNA
Preparation was carried out in the cold room (2°C). Care was taken to avoid nuclease contamination, e.g. by using dry heat-sterilized glassware. An appropriate amount of unlabelled MS2 solution and the total of each ³²P-labelled virus preparation were combined (respectively 114 mg and 2·7 × 10⁷ cts/min for experiment E-PF, 114 mg and 5·1 × 10⁷ cts/min for experiment E-PG, and 119 mg and 3·2 × 10⁷ cts/min for experiment F-PH). The volume averaged approximately 6 ml. 0·11 vol. 10 M-lithium chloride, 0·025 vol. 20% recrystallized sodium dodecyl sulphate and bentonite to a concentration of 7 mg/ml.
were added. The latter had been purified according to Fraenkel-Conrat, Singer & Tsugita (1961) and was resuspended in the extraction buffer. One vol. redistilled and buffer-equilibrated phenol was next added. The phases were mixed for 1 min on the Vortex (Scientific Industries, Springfield) and separated by centrifugation. The aqueous phase was treated twice more with phenol, and the phenol phases were washed with 0·5 vol. buffer, as described by Strauss & Sinsheimer (1963). The final aqueous solutions were combined and residual phenol removed by two ether extractions. 0·1 vol. 20% potassium acetate buffer (pH 5), and 2 vol. 95% ethanol at -15°C were added, and the mixture left overnight at the latter temperature. The resulting RNA precipitate was centrifuged off for 10 min at 3000 g and redissolved in 10 ml. 0·3 M-sodium acetate + 0·01 M-sodium phosphate + 0·001 M-EDTA (pH 7·0). The RNA was reprecipitated by addition of ethanol and redissolved in 6 ml. 0·3 M-sodium acetate + 0·001 M-EDTA (pH 6·7). The yield was essentially quantitative (less than 0·01% of the radioactive material remained in the ethanolic supernatant fractions). The spectrum was observed in 0·05 M-tris buffer. λmax was at 259 mµ and λmin at 231 mµ, with ratios Amax/Amin between 2·10 and 2·43 and A₂₆₀/A₂₈₀ between 2·04 and 2·14. For experiment PE (which was later combined with the yeast RNA digest K, vide infra), the same procedure was followed, except for the omission of unlabelled virus and the omission of ethanol precipitations. In experiment L-PI, the phenolization was carried out as above on ³²P-labelled MS2 (1·3 × 10⁸ cts/min) to which 122 mg yeast RNA had been added as carrier.
(f) Hydrolysis with ribonuclease
The total yield of the RNA preparation was transferred quantitatively to a dry heat-sterilized, water-jacketed reaction vessel. The washings with the same buffer brought the volume to about 8 ml. A drop of chloroform was added. After attaining the reaction temperature of 37·0°C, the pH was adjusted to 7·4 with 0·05 N-NaOH. The equipment used was the Titrator TTT1a, glass electrode G202B, reference electrode K4312 and stirrer SMP1, all from Radiometer, Copenhagen. The electrodes were reserved exclusively for this purpose, and treated with 0·1 N-HCl after each experiment. The enzymic hydrolysis was started by addition of 0·025 ml. containing 0·060 mg chromatographically pure pancreatic ribonuclease. This resulted in a calculated substrate-to-enzyme ratio of about 500 (Table 1).† The pH was maintained at 7·4 by periodical addition of 0·05 N-NaOH from a microburette, shielded from carbon dioxide uptake. A fresh drop of chloroform was added every 8 to 10 hr. A typical time course is shown in Fig. 1. When the release of secondary phosphate groups
FIG. 1. Hydrolysis of MS2 RNA by pancreatic ribonuclease (experiment E-PG). 35·4 mg MS2 RNA in 8 ml. 0·3 M-sodium acetate + 0·001 M-EDTA was digested with 60 µg pancreatic RNase A at 37°C. The pH was maintained at 7·4 by periodic addition of alkali. At the arrow, another 60 µg enzyme was added. [Abscissa: Time (hr).]
† It was found that minute traces of sodium dodecyl sulphate, which still contaminated the RNA preparations, were responsible for partial inhibition of the enzyme. Hence the actual concentration of active ribonuclease is not known.
dropped to zero, the reaction was stopped either by freezing or by immediate loading on the DEAE-cellulose column. The percentage hydrolysis of potentially susceptible bonds was calculated on the basis of the measured absorbancy at 260 mµ of the RNA, the reported molar extinction coefficient ε(P) (8600 for MS2 RNA), the nucleotide composition (Strauss & Sinsheimer, 1963), an average pK'-value of 5·95 for the secondary phosphate of the digestion products (i.e. an assumption of 96·5% dissociation under the reaction conditions) and the observed alkali consumption. Under these conditions, the percentage hydrolysis given in Table 1 can only be considered as an approximation.

TABLE 1
Conditions of the pancreatic RNase digestion

Experiment     Substrate to     Total reaction   Percentage hydrolysis     Column load
               enzyme ratio     time (hr)        estimated on the basis    (O.D. units)
               (w/w)                             of NaOH consumption

Yeast RNA                                        79·1
  H              130              127                                        663
  J              666              103                                       1139
  Kᵃ             595              100                                       1099
  Lᵇ            1018               54·5                                     1454

MS2 RNA
  E-PF           240               69·5           99·9                       881
  E-PG           590ᶜ              43·5           88·9                      1122
  F-PH           308ᵈ              43·5           93·1                      1358

ᵃ Co-chromatographed with the ³²P-labelled MS2 RNA preparation PE (Fig. 5).
ᵇ Co-digested and co-chromatographed with the ³²P-labelled MS2 RNA preparation PI.
ᶜ After 31 hr the reaction was almost complete (Fig. 1), and another 60 µg enzyme was added, which brought the substrate/enzyme ratio to 295.
ᵈ After a first addition of 60 µg RNase, almost no hydrolysis was observed; 10 min later another 60 µg enzyme was added.
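The two pieces of arithmetic behind the digestion conditions can be sketched as follows. This is a minimal illustration, not the authors' calculation: it assumes the quoted dissociation follows from the Henderson-Hasselbalch relation at the working pH of 7·4 with pK' 5·95, and the function names are hypothetical.

```python
def fraction_dissociated(ph: float, pk: float) -> float:
    """Henderson-Hasselbalch: fraction of an acidic group present in its dissociated form."""
    return 1.0 / (1.0 + 10.0 ** (pk - ph))


def substrate_to_enzyme_ratio(rna_mg: float, enzyme_mg: float) -> float:
    """Weight ratio of RNA substrate to ribonuclease (w/w)."""
    return rna_mg / enzyme_mg


# Secondary phosphate of the digestion products, pK' 5.95, titrated at pH 7.4.
dissociation = fraction_dissociated(ph=7.4, pk=5.95)

# Experiment E-PG: 35.4 mg MS2 RNA, first 0.060 mg enzyme, then a second 0.060 mg.
ratio_first_addition = substrate_to_enzyme_ratio(35.4, 0.060)
ratio_after_second = substrate_to_enzyme_ratio(35.4, 0.120)

print(f"fraction dissociated: {dissociation:.3f}")
print(f"substrate/enzyme: {ratio_first_addition:.0f}, then {ratio_after_second:.0f}")
```

Each bond split therefore consumes slightly less than one equivalent of alkali, which is why the alkali consumption has to be divided by the dissociated fraction before it is compared with the number of susceptible bonds.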
In experiment PE (which was chromatographed with the yeast RNA hydrolysate K), the labelled RNA (1·4 × 10⁷ cts/min) in 5 ml. 0·1 M-sodium acetate + 0·01 M-tris acetate + 0·001 M-EDTA (pH 7·5), was incubated for 48 hr at 37°C with 36 µg ribonuclease and a drop of chloroform. This corresponded to approximately the same enzyme concentration as given above (but of course a much lower substrate-to-enzyme ratio). The enzymic hydrolysis of yeast RNA (experiments H, J and K) was carried out as above, except that the solution contained 0·01 M-sodium acetate + 0·001 M-EDTA. The reaction times (given in Table 1) were considerably longer, due to the diminished ribonuclease activity at this low ionic strength (Edelhoch & Coleman, 1956). Also a higher heavy-metal ion contamination may have inhibited the enzyme, even in the presence of EDTA. It was later suspected that the pancreatic ribonuclease action may be less specific at this low ionic strength.† Consequently, experiment L and the viral RNA digestions described above were carried out in 0·3 M-sodium acetate + 0·001 M-EDTA.
(g) Column chromatography on DEAE-cellulose
Before use, the DEAE-cellulose was thoroughly washed with alkali and acid. Fines were removed by decantation in water and subsequently in 1 M-NaCl. After removal of excess chloride, the adsorbent was converted to the hydroxide form with 1 N-NaOH. It was
† This would be in agreement with the results of Beers (1960) on the hydrolysis of poly A by pancreatic ribonuclease. In the case of digestion of plant virus RNA's, the problem may perhaps be more complex, due to the presence of contaminating plant nucleases, the activity of which is enhanced in high salt concentration solution (Staehelin, 1961; Symons, Rees, Short & Markham, 1963).
washed until almost neutral and resuspended in 20% (w/v) potassium acetate. The acetate concentration was lowered to 0·1 M and the column poured in this form. Fifty-ml. portions of the slurry were added at a time, and pressure was applied either from a small air pump or more recently by means of a hydrostatic pressure head. The DEAE-cellulose was equilibrated with the lowest concentration of the gradient (vide infra). It may be noted that the break-through volume for urea is considerably larger than the hold-up volume. The column dimensions at this point of the procedure were 2·5 cm × 95 cm. The RNA hydrolysate in 0·3 M-sodium acetate + 0·001 M-EDTA was diluted by addition of 2 vol. 7 M-urea in water. This loading solution, containing approximately 1000 O.D. units, was allowed to sink into the column, and was washed in with 50 ml. of the lowest gradient concentration. The linear gradient consisted of 2·5 l. 0·1 M-sodium acetate + 0·01 M-tris acetate + 0·001 M-EDTA + 7 M-urea (pH 7·5) (in experiments F-PH and L-PI the pH was increased to 7·9)† in the mixing vessel, and 2·5 l. 1 M-sodium acetate + 0·01 M-tris acetate + 0·001 M-EDTA + 7 M-urea, same pH, in the reservoir. The effluent was led through the monitors and fractions of 10 ml. were collected, approximately every 20 min. The operations were carried out at room temperature. The yeast RNA hydrolysates in experiments H, J and K were diluted with 1 vol. of the lowest gradient concentration. The column height was 50 to 56 cm, and each gradient solution 2 l.
(h) Quantitative analysis of the column effluent
The column effluent was consecutively led through a radioactivity and an absorbancy monitor. The former consisted of a home-made flow-cell (a 19·5-cm long spiral of PVC-tubing, 0·2 cm outer diameter), arranged under the FDW-l window of a Tracerlab Omniguard Beta Counting system. Print-out was set at 10-min intervals. The efficiency for ³²P of the flow-cell corresponded to plating out 0·12 ml. solution under normal conditions. The ultraviolet absorption was followed with a 4701A Uvicord (LKB, Stockholm). Individual fractions were measured at 260 mµ in a spectrophotometer (Zeiss PMQII), and eventually a spectrum was recorded. The results were corrected for background absorbancy, which was assumed to increase linearly from the region before the mononucleotides (A₂₆₀ ≈ 0·020) to the end of the gradient (A₂₆₀ ≈ 0·060). The radioactivity of the individual fractions was calculated on the basis of the automatic registration, after correction for flow rate and ³²P-decay. Appropriate fractions of each peak were pooled, and samples taken for spectrophotometry and for ³²P-counting. The tubes were rinsed once, and the wash fluid was similarly analysed. The solutions were stored at -15°C until further work-up.
A true evaluation of the optical density results would require a knowledge of the extinction coefficients (ε(P)) of each isoplithic peak. As a first approximation, the results were corrected for the average higher absorbance of the purines, i.e. an ε(P)-value of $\frac{(A_{Pu,w} \times n) + A_{Py,w}}{n+1}$ was assumed for each isoplith of general formula (Pup)ₙ Pyp;‡ $A_{Pu,w}$ is the weighted average molar extinction coefficient of the purines (13,020 for MS2 RNA and 13,100 for yeast RNA) and $A_{Py,w}$ the corresponding value of the pyrimidines (respectively 8830 and 8980). The assumptions are that no hypochromicity occurs, that the extinction values measured in urea are the same as in aqueous solution and that the ratios U/C and A/G are the same in all isopliths as in the total RNA. None of these conditions is rigorously fulfilled. No correction for hypochromicity seemed warranted. Indeed, on one hand it is well known that the hypochromicity of polynucleotides decreases in strong urea solutions (Warner, 1957; Hummel & Kalnitsky, 1959); and on the other hand the chromicity of mononucleotides decreases by 2 to 7% in strong urea solution (Stockx, 1963). The two effects tend to cancel out relative differences, and indeed good agreement was found between the distributions calculated on the basis of optical density and of radioactivity (Table 4).
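The extinction-coefficient approximation described above can be written out as a short calculation. The sketch below assumes the reading of the formula as a per-nucleotide average over n purines and one pyrimidine; the function names and the conversion of O.D. units to micromoles (1-cm light path) are illustrative assumptions, not part of the original method description.

```python
def isoplith_extinction(n_purines: int, eps_purine: float, eps_pyrimidine: float) -> float:
    """Average molar extinction per nucleotide for an isoplith (Pup)n Pyp."""
    return (eps_purine * n_purines + eps_pyrimidine) / (n_purines + 1)


def od_units_to_micromoles(od_units: float, eps_per_nucleotide: float) -> float:
    """Convert O.D. units (absorbancy x ml, 1-cm path) to micromoles of nucleotide."""
    return od_units / eps_per_nucleotide * 1000.0


# Weighted average values quoted in the text for MS2 RNA.
MS2_PURINE = 13020.0
MS2_PYRIMIDINE = 8830.0

eps_mono = isoplith_extinction(0, MS2_PURINE, MS2_PYRIMIDINE)  # pyrimidine mononucleotide
eps_deca = isoplith_extinction(9, MS2_PURINE, MS2_PYRIMIDINE)  # decanucleotide, n = 9

for n in range(10):
    eps = isoplith_extinction(n, MS2_PURINE, MS2_PYRIMIDINE)
    print(f"chain length {n + 1:2d}: extinction per nucleotide {eps:8.0f}")
```

Because the assumed extinction rises with chain length, the same optical density corresponds to fewer nucleotides in the longer isopliths than in the mononucleotide peak.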
† pH values which refer to solutions in 7 M-urea should only be considered as an instrument reading.
‡ Abbreviations used: Pu, purine ribonucleoside; Py, pyrimidine ribonucleoside.
Radioactive samples were plated out on 2·5-cm aluminium planchets and counted in a Tracerlab Omniguard counting system. Creeping of the strong urea was prevented by adding a drop of 10% Tween-80. An absolute efficiency of 28·1% was found for ³²P, with a background of 0·5 cts/min or less. Forty-two mg urea per planchet (i.e. 0·1 ml. 7 M-solution) decreased the relative efficiency to 87%.
(i) Other methods
Nucleotide material could be quantitatively recovered from the strong urea solution by adsorption on Norite. The solution was diluted with 2 vol. water, acidified to less than pH 5·0, and 0·05 vol. of a 20 mg/ml. acid-washed Norite suspension was added. After mixing and centrifuging, the adsorbent was washed 3 times with water, and eluted with 0·15 M-NH₄OH in 47·5% aqueous ethanol. The recovery of mononucleotides was quantitative. Alkaline hydrolysis, paper chromatography and electrophoresis are described in the following paper (Fiers, De Wachter, Lepoutre & Vandendriessche, 1965).
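The counting corrections mentioned in sections (h) and above (background, counting efficiency, the effect of urea on the planchet, and ³²P decay) can be sketched as below. The efficiency figures are those quoted in the text; the half-life of about 14·3 days is a literature value that is not given in the paper, and the function names are hypothetical.

```python
P32_HALF_LIFE_DAYS = 14.3  # assumed literature value; not stated in the paper


def net_counts_to_disintegrations(gross_cpm: float,
                                  background_cpm: float = 0.5,
                                  absolute_efficiency: float = 0.281,
                                  urea_relative_efficiency: float = 0.87) -> float:
    """Convert planchet counts/min to disintegrations/min for a sample plated from 7 M-urea."""
    net = gross_cpm - background_cpm
    return net / (absolute_efficiency * urea_relative_efficiency)


def decay_corrected(cpm: float, days_since_reference: float,
                    half_life_days: float = P32_HALF_LIFE_DAYS) -> float:
    """Scale a count rate back to the reference day, undoing radioactive decay."""
    return cpm * 2.0 ** (days_since_reference / half_life_days)


dpm_example = net_counts_to_disintegrations(1000.5)
one_half_life_back = decay_corrected(100.0, P32_HALF_LIFE_DAYS)

print(f"1000.5 cts/min counted in urea corresponds to about {dpm_example:.0f} disintegrations/min")
print(f"100 cts/min measured one half-life late corresponds to {one_half_life_back:.0f} cts/min")
```

Correcting all fractions to a common reference day matters here because a single column run extends over several days, during which the ³²P activity falls measurably.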
3. Results
(a) Separation of purine sequences by chromatography in the presence of 7 M-urea
Typical separations of yeast RNA are shown in Figs 3 and 5, and of MS2 RNA in Figs 5 and 6. The identification of the first three major peaks (respectively around fractions 52, 82 and 115 in Fig. 3) as the mono-, di- and trinucleotides was unambiguous, as each peak has been separated into its individual components by rechromatography (Fiers et al., 1965). By extrapolation, a chain length of 4, 5, 6, 7, 8, 9 and 10 was assigned to the following major peaks. This seems justified as the elution volume of the maximum of each isoplith increased regularly with chain length (Fig. 2). The reproducibility of the separation was excellent (Fig. 2, experiment E and H). Packing under pressure was essential (compare experiments E or H, and J).
FIG. 2. Separation of isopliths as a function of chain length. Chromatography of RNase digestion products on DEAE-cellulose. The concentration of the eluting buffer as a function of the isoplith length (or net charge) is given. -×-×-, Yeast RNA (experiment J); column 2·5 × 56 cm; gradient, 4 l. (pH 7·5); adsorbent not compressed. -▲-▲-, Yeast RNA (experiment E); same conditions as above but column packed under pressure. -△-△-, Yeast RNA (experiment H); same conditions as experiment E. -●-●-, MS2 RNA (experiment E-PG); column 2·5 × 97 cm; gradient, 5 l. (pH 7·5). -○-○-, MS2 RNA (experiment F-PH); column 2·5 × 92 cm; gradient, 5 l. (pH 7·9). [Abscissa: Chain length; ordinate: concentration of the eluting buffer.]
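The relation between elution volume and eluting concentration that underlies Fig. 2 can be sketched for an ideal linear gradient. This is a simplified illustration: it assumes the two gradient vessels deliver a strictly linear change from 0·1 to 1 M-sodium acetate over the total gradient volume, and it ignores the column hold-up volume, so the numbers are not the values read from the figure. The function names are hypothetical.

```python
def gradient_molarity(volume_ml: float, total_ml: float,
                      start_m: float = 0.1, end_m: float = 1.0) -> float:
    """Sodium acetate molarity delivered after `volume_ml` of an ideal linear gradient."""
    if not 0.0 <= volume_ml <= total_ml:
        raise ValueError("volume lies outside the gradient")
    return start_m + (end_m - start_m) * volume_ml / total_ml


def fraction_molarity(fraction_no: int, total_ml: float, fraction_ml: float = 10.0) -> float:
    """Molarity at the mid-point of a numbered fraction (fractions numbered from 1)."""
    return gradient_molarity((fraction_no - 0.5) * fraction_ml, total_ml)


# 5-l. gradient used for the MS2 RNA digests, collected in 10-ml. fractions.
halfway = gradient_molarity(2500.0, 5000.0)

for fraction in (50, 100, 150, 200, 250):
    print(f"fraction {fraction:3d}: about {fraction_molarity(fraction, 5000.0):.3f} M-sodium acetate")
```

With the 4-l. gradient used for the yeast RNA runs the same fraction number corresponds to a higher concentration, which is one reason the figure plots concentration rather than fraction number against chain length.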
In these experiments maximum resolution between the higher isoplithic peaks was attempted by using large column capacities relative to the load. Under these conditions, partial splitting of the mono- and dinucleotide peaks was often observed, as was an asymmetrical shape of many of the higher oligonucleotide peaks (Fig. 3). The first and second components of the mononucleotide peak were identified as respectively uridylic acid and cytidylic acid on the basis of their spectra and their mobility on paper chromatography and electrophoresis. Each was relatively free of the other. It is obvious that this separation broadened the isoplithic peaks, and as a consequence depressed the resolution as a function of chain length. The phenomenon was further investigated by analysis of a tetranucleotide peak. Appropriate tubes were pooled, so that four fractions of equal volume, and corresponding to increasing elution volume, were obtained. The urea and salt were dialysed away (48 hours against five changes of 0·01 M-thioglycol) and the nucleotide material concentrated, hydrolysed and analysed for constituent mononucleotides (Table 2). Evidently the tetranucleotides containing

TABLE 2
Partial resolution of tetranucleotides upon chromatography in 7 M-urea

Composition      I       II      III     IV

C              0·02    0·33    0·70    0·89
A              1·78    1·62    1·66    1·68
G              1·70    1·49    1·26    1·10
U              0·98    0·67    0·30    0·11

Pu/Py          3·48    3·11    2·92    2·78
A/G            1·04    1·08    1·31    1·52

The tetranucleotide peak (experiment E-PG) was divided in four fractions, I, II, III and IV in order of increasing elution volume. They were separately purified, concentrated and hydrolysed in 0·3 N-KOH for 18 hr at 37°C. Perchloric acid was added to pH 1, the precipitate centrifuged off and washed, and both supernatant fraction and wash solution adjusted to pH 4. Electrophoresis was carried out for 3 hr at pH 3·4 and 26 v/cm on Whatman paper 3 MM. The electropherogram was scanned for ³²P counts, and the results are based on the latter (the sum of the pyrimidine nucleotides is taken as unity).
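The derived rows of Table 2 follow directly from the four base compositions. The sketch below recomputes them from the tabulated values; it is an illustration only, the variable names are hypothetical, and small differences in the last digit from the printed ratios are to be expected because the printed compositions are themselves rounded.

```python
# Mononucleotide composition of the four tetranucleotide sub-fractions (Table 2),
# normalised so that C + U = 1 in each fraction.
FRACTIONS = {
    "I":   {"C": 0.02, "A": 1.78, "G": 1.70, "U": 0.98},
    "II":  {"C": 0.33, "A": 1.62, "G": 1.49, "U": 0.67},
    "III": {"C": 0.70, "A": 1.66, "G": 1.26, "U": 0.30},
    "IV":  {"C": 0.89, "A": 1.68, "G": 1.10, "U": 0.11},
}


def purine_to_pyrimidine(comp: dict) -> float:
    """Ratio of purine (A + G) to pyrimidine (C + U) nucleotides."""
    return (comp["A"] + comp["G"]) / (comp["C"] + comp["U"])


def a_to_g(comp: dict) -> float:
    """Ratio of adenylic to guanylic acid."""
    return comp["A"] / comp["G"]


pu_py = {name: purine_to_pyrimidine(comp) for name, comp in FRACTIONS.items()}
ag = {name: a_to_g(comp) for name, comp in FRACTIONS.items()}

for name in FRACTIONS:
    print(f"fraction {name:>3}: Pu/Py = {pu_py[name]:.2f}, A/G = {ag[name]:.2f}, "
          f"C share of pyrimidine = {FRACTIONS[name]['C']:.2f}")
```

Read across the fractions, the cytidylic acid share and the A/G ratio both rise with elution volume, which is the trend the text goes on to discuss.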
uridylic acid moved in front, whereas those ending in cytidylic acid trailed. As separation into two peaks was not observed, the column behaviour must have been influenced by other factors as well. There was some effect of purine composition, with adenylic acid coming behind guanylic acid (this is in disagreement with Staehelin (1963) who relates a high 290 mµ/260 mµ absorbancy ratio in the later part of the peaks to a higher guanylic acid content). It seems likely, however, that purine sequence also is involved. It may be added here that Bartos, Rushizky & Sober (1963) have observed a partial fractionation of each isoplith according to purine/pyrimidine ratio (which is constant for a pancreatic RNase digest). An attempt was made to reduce the separation of uridylic and cytidylic acid-containing oligonucleotides (and hence sharpen the peaks) by increasing the pH to 7·9. As shown in Fig. 2, this resulted in a definite improvement in the separation of the isoplithic peaks.
(b) Purine sequences in yeast RNA
A typical elution curve is shown in Fig. 3, and the optical density curve of Fig. 5 also refers to yeast RNA. The small peaks coming before the mononucleotides (maximum at, respectively, fractions 19, 28 and 34) have not been rigorously identified, but may be, respectively, nucleosides, phenol and cyclic nucleotides. Spurious peaks appeared between the main isoplithic peaks, and as these were absent in the MS2 digest (except for a very small peak in experiment F-PH, vide infra), it seems likely that they contained odd components, e.g. methylated nucleotides.
FIG. 3. Column chromatography of a yeast RNA digest (experiment H). 663 O.D. units of a yeast RNA digest were loaded on a 2·5 × 50 cm DEAE-cellulose column. Elution was with a 4-l. gradient from 0·1 to 1 M-sodium acetate in 0·01 M-tris-acetate + 0·001 M-EDTA + 7 M-urea (pH 7·5). 10-ml. fractions were collected approximately every 20 min. [Abscissa: Fraction no.]
The distribution of the different isopliths is given in Table 3. At first glance, the results do not show striking deviations from a random distribution. The highest oligonucleotide peak, which is clearly resolved in Figs 3 and 5, consisted of heptanucleotides. A similar pattern for comparable material was reported by Bartos et al. (1963). In experiment L, non-specific nuclease activity was avoided by digestion at higher ionic strength. Under these conditions, an octa- and a nonanucleotide peak were clearly resolved and even a faint decanucleotide peak could be distinguished.
TABLE 3
Distribution of (Pup)ₙ Pyp sequences in yeast RNA

          Percentage                        Percentage foundᵇ
Tract     calculated      Expt Hᶜ  Expt J   Expt Kᵈ  Average           Expt Lᵈ
length    for random         I        I        I     H, J & K      I        II     Average
          distributionᵃ                                                            I & II

  1         22·09          29·2     27·1     25·3     27·2       26·05ᵉ   26·66ᵉ   26·36ᵉ
  2         23·41          23·6     22·8     21·1     22·5       20·95    19·57    20·26
  3         18·61          21·5     21·3     21·6     21·5       21·36    21·37    21·37
  4         13·14          12·5     12·8     13·9     13·0       12·17    12·24    12·21
  5          8·71           7·48     8·74     9·20     8·47       8·69     9·16     8·93
  6          5·54           4·02     4·43     5·06     4·50       5·21     5·14     5·18
  7          3·42           1·27     2·16     2·67     2·03       3·26     3·24     3·25
  8          2·06         } 0·46ᶠ  } 0·71ᶠ  } 1·14ᶠ  } 0·77ᶠ      1·27     1·29     1·28
  9          1·23         }        }        }        }            0·92     0·94     0·93
 10          0·73                                                 0·07     0·34     0·21

                          Expt H   Expt J   Expt K   Average H, J & K   Expt L
% Accounted forᵍ           89·9     92·9     95·2     92·7               94·9
% in 1 M fractionʰ         0·006    0·002    0·14     0·03               0·33
% Recovery                 90·3     84·7     90·0     88·3               98·1

ᵃ Based on the following composition: 25% A, 28% G, 27% U and 20% C.
ᵇ Experimental details are given in Table 1. Results I based on summation of the optical densities of individual fractions. Results II based on optical density of pooled peaks. The assumptions made in recalculation on a molar basis are discussed in the text.
ᶜ Results shown in Fig. 3.
ᵈ Chromatographed together with ³²P-labelled MS2 RNA.
ᵉ This value included 0·68% present in the 2',3' cyclic nucleotide peak.
ᶠ This material was not clearly resolved in discrete peaks, but presumably corresponded to octa- and nonanucleotides.
ᵍ The percentage material (expressed in O.D. units) accounted for by the isoplithic peaks, relative to the total amount eluted from the column (likewise expressed in O.D. units).
ʰ After the run, the column was washed with the highest concentration of the gradient. The difference (5 to 10%) between the optical density material accounted for in the isoplithic peaks plus the 1 M fraction and the total recovered consisted of the peaks which moved before the mononucleotides (partly phenol) and between the lower isoplithic groups.
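The "calculated for random distribution" column of Table 3 can be reproduced approximately with a simple model. The sketch below is an assumption about how such values arise, not a statement of the authors' procedure: if purines occur with probability p and pyrimidines with probability q = 1 - p, independently at every position, then the percentage of all nucleotides lying in tracts of length k (k - 1 purines followed by one pyrimidine) is 100 × k × q² × p^(k-1). With the composition of footnote a (p = 0·53) this agrees with the tabulated column to within about 0·02 percentage points.

```python
def random_tract_percentages(purine_fraction: float, max_length: int = 10) -> list:
    """Percentage of nucleotides in (Pup)n Pyp tracts of length 1..max_length,
    assuming purines and pyrimidines are placed independently at random."""
    p = purine_fraction
    q = 1.0 - p
    return [100.0 * k * q * q * p ** (k - 1) for k in range(1, max_length + 1)]


# Yeast RNA composition from footnote a of Table 3: 25% A + 28% G are purines.
calculated = random_tract_percentages(0.25 + 0.28)

# Values printed in the table, for comparison.
TABLE_3_CALCULATED = [22.09, 23.41, 18.61, 13.14, 8.71, 5.54, 3.42, 2.06, 1.23, 0.73]

for length, (model, printed) in enumerate(zip(calculated, TABLE_3_CALCULATED), start=1):
    print(f"tract length {length:2d}: model {model:6.2f}  table {printed:6.2f}")
```

Summed over all tract lengths the model percentages approach 100, so the ten tabulated classes account for nearly all of a random chain of this composition.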
A finding, which becomes interesting in view of the MS2 RNA results (vide infra), is the relative excess of odd-numbered tracts (i.e. containing an even number of purines). In Fig. 4 the difference between the experimental and the calculated amount is plotted as a function of length of tract. The effect is especially striking up to the tetranucleotide level; but in experiment L, which was carried out under optimal conditions, it can be seen to persist up to a tract length of 10. In several experiments the mono- and trinucleotide peaks have been rechromatographed, and found to consist entirely of normal pyrimidine nucleotides and Pup Pup Pyp sequences respectively. Hence their abundance is a true reflection of the primary structure, and is not caused by contaminating products which were eluted in the same region (e.g. derived from end groups, or formed by non-specific nuclease activity).
FIG. 4. Deviation from random distribution as a function of chain length (yeast RNA). The difference between the amount found and the amount calculated on the basis of a random distribution (Table 3) is plotted as a function of chain length (abscissa, tract length). -×-×-, Average of experiments H, J and K (digested in 0·01 M-sodium acetate); -○-○-, experiment L (digested in 0·3 M-sodium acetate).
The 1 M peak, eluted with the highest concentration of the gradient, was at least partially constituted by impurities, retained from the strong urea solutions. In addition, it may perhaps have contained polynucleotides, which are resistant to pancreatic RNase digestion, such as poly A, detected by Edmonds & Abrams (1963) among others in yeast RNA. However, it is important to note that the total amount found is too low to account for a polynucleotide (n > 10) present in each 80 s ribosome. No further ultraviolet light-absorbing material could be eluted by rinsing the column with 2 M-sodium carbonate solution.
(c) Purine sequences in MS2 RNA
A co-chromatography of a yeast RNA hydrolysate with a 32P-labelled MS2 RNA hydrolysate is recorded in Fig. 5. In general, the absorbancy and radioactivity peaks move together, which is evidence that both contain only normal nucleotides. However, partial resolution of absorbancy and radioactivity certainly took place, e.g. in the tetra- and hexanucleotide peaks (respectively around fractions 122 and 158). This is undoubtedly due to a different composition of the two RNA's, and the partial fractionation within each isoplith, discussed in section 3(a). The conditions of the 32P-labelled RNA digestion were sub-optimal in this experiment (low substrate to enzyme ratio and no means of following the rate; see section 2(e)). As a consequence, some non-specific degradation of the MS2 RNA was noted. The small peak preceding the mononucleotides was found to contain among other substances cyclic adenylic and guanylic acid (the first absorbancy peak does not contain 32P and is due to traces of phenol).
FIG. 5. Column chromatography of a yeast RNA digest and a 32P-labelled MS2 RNA digest (experiment K-PE). 894 O.D. units of a yeast RNA digest and less than 20 O.D. units of a 32P-labelled MS2 RNA digest (1·4 × 10⁷ cts/min) were loaded on a 2·5 × 52 cm DEAE-cellulose column. Same conditions as for Fig. 3. The left ordinate refers to the absorbancy at 260 mμ (-○-○-) as well as to the molarity of the eluting buffer (-··-); the right ordinate refers to the 32P cts/min (-●-●-); the abscissa is the fraction number. The insert shows the higher isopliths on the same abscissa but with a tenfold expanded ordinate scale.
The different elution pattern of the higher isopliths from yeast RNA and from MS2 RNA made the former suspect as carrier for the latter. Hence experiments which contained only MS2 RNA were run (Fig. 6). The absorbancy peak around fraction 30 was again due to traces of phenol. The peak around fraction 41 corresponded to material with a unit charge and consisted mainly of cyclic uridylic acid. Cyclic purine nucleotides were definitely not present. The following series of peaks corresponded respectively to the mono-, di-, tri-, tetra-, penta-, hexa-, hepta-, octa-, nona- and decanucleotides. The small peak around fraction 95 was not observed in experiments E-PG or L-PI, in which the digestion was nearer completion, as judged from the amount of cyclic uridylic acid left. It was tentatively considered a dinucleotide derivative, as the nucleotide composition, after alkaline hydrolysis, was 0·18 cytidylic acid, 0·82 uridylic acid, 0·80 adenylic acid and 0·64 guanylic acid. The cleanness of the profile suggests the absence of impurities, odd nucleotide derivatives, non-specific degradation products, etc. The results (e.g. presence of cyclic uridylic acid but no cyclic cytidylic acid) are in agreement with the findings of Rushizky, Knight & Sober (1961), who report the following sequence of events for the enzymic action: (1) splitting of internucleotide linkages by transphosphorylation; (2) opening of the cyclic terminals of oligonucleotides; (3) opening of cyclic 2',3' cytidylic acid; and (4) opening of cyclic 2',3' uridylic acid. It was felt that a small amount of this cyclic 2',3' uridylic acid was tolerable, as it did not affect the quantitative evaluation of the results and constituted an internal control for the extent of the enzymic digestion.

FIG. 6. Column chromatography of an MS2 RNA digest (experiment F-PH). 1358 O.D. units of a pancreatic RNase digest of MS2 RNA, labelled with 32P (1·1 × 10⁷ cts/min), were loaded on a 2·5 × 92 cm DEAE-cellulose column. Elution was carried out with a 5-l. gradient from 0·1 M to 1 M-sodium acetate in 0·01 M-tris acetate + 0·001 M-EDTA + 7 M-urea (pH 7·9). 10-ml. fractions were collected approximately every 20 min. The left ordinate refers to the absorbancy at 260 mμ (-○-○-) as well as to the molarity of the eluting buffer (-··-); the right ordinate refers to the 32P cts/min (-●-●-); the abscissa is the fraction number. The insert shows the higher isopliths on the same abscissa but with a fivefold expanded ordinate scale.

The results of several experiments are summarized in Table 4. The four different and independent methods of analysis are in good agreement. Nevertheless, as the radioactivity can be measured with much higher sensitivity and as no ambiguous corrections or assumptions are to be made, the results based on radioactivity are considered the most reliable. A summary of these results is presented in Table 5. Experiment L-PI consisted of yeast RNA and 32P-labelled MS2 RNA and for this reason the results are not included. This decision, however, does not significantly affect our conclusions regarding the distribution of the purine sequences.

The molecular weight of MS2 RNA has been determined by physical methods and corresponds to about 3050 nucleotides (Strauss & Sinsheimer, 1963). On this basis, an estimate of the total number of the different isoplithic tracts per single RNA chain has been made (Table 5).

Beyond the decanucleotide peak, no more ultraviolet light-absorbing material or radioactive material can be eluted by the gradient. The peak which appeared with the 1 M buffer solution was too small and inconsistent to represent a polypurine tract. At least a part of this radioactivity had its origin in unknown binding sites in the system, which have a higher affinity for the nucleotide material than the diethylaminoethyl substituents of the cellulose. No more radioactive material was eluted by total stripping of the column with 2 M-sodium carbonate.
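The conversion from percentage of material to number of tracts per chain can be sketched as follows. This is only an illustration in Python. It assumes that the fraction of nucleotides found in the isoplith of tract length k, multiplied by the chain length of 3050 nucleotides and divided by k, gives the number of tracts. The helper name and the rounding to the nearest integer are assumptions, not a procedure stated in the text; the input percentages are the 32P results for experiments E-PG and F-PH as listed in Table 5.

```python
# Percentage of 32P found per tract length (Table 5): (Expt E-PG, Expt F-PH).
PERCENT_FOUND = {
    1: (26.34, 25.62),
    2: (23.45, 23.49),
    3: (19.36, 19.54),
    4: (12.01, 12.05),
    5: (9.02, 9.37),
    6: (4.52, 3.90),
    7: (2.90, 3.24),
    8: (0.92, 0.89),
    9: (0.83, 1.19),
    10: (0.62, 0.69),
}

CHAIN_LENGTH = 3050  # nucleotides per MS2 RNA chain (Strauss & Sinsheimer, 1963)


def tracts_per_molecule(percent_found, chain_length=CHAIN_LENGTH):
    """Hypothetical helper: estimate the number of tracts of each length per chain."""
    counts = {}
    for length, values in percent_found.items():
        mean_percent = sum(values) / len(values)
        nucleotides_in_isoplith = mean_percent / 100.0 * chain_length
        # Each tract of this isoplith holds `length` nucleotides.
        counts[length] = round(nucleotides_in_isoplith / length)
    return counts


counts = tracts_per_molecule(PERCENT_FOUND)
for length in sorted(counts):
    print(length, counts[length])
```

With these inputs the sketch gives two decanucleotides and three each of the octa- and nonanucleotides, and it agrees with the numbers of tracts listed in Table 5 to within one tract.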
TABLE 4
The percentage distribution of (Pup)n Pyp isopliths in MS2 RNA a
Expt F-PH b
Expt E-PG
Expt L-PI c
Tract length I II III IV I II III IV III IV
1 d 1
1·98 25·83
2·71 24·95
1·56 24·04
1·41 25·66
24·93 19·41 12·21 8·00 3·84 2·51 0·93 0·23 0·17
22·94 19·75 12·00 9·41 4·80 3·08 0·91 0·83 0·61
23·95 18·97 12·01 8·62 4·24 2·72 0·92 0·83 0·62
4·72 20·72 0·59 22·74 19·18 11·93 9·77 4·17 3·41 0·86 1·20 0·69
4·39 21·41 0·63 23·02 19·90 12·16 8·97 3·63 3·07 0·92 1·17 0·69
0·60 26·29
25·48 20·07 12·09 8·79 4·10 2·23 0·58 0·55 0·25
4·43 21·92 0·99 24·20 20·01 11·68 8·96 3·55 2·44 0·60 0·90 0·25
0·71 27·20
2 e 2 3 4 5 6 7 8 9 10
4·21 21·86 0·88 23·97 19·41 11·81 8·99 4·00 2·91 0·63 0·81 0·53
22·09 19·05 11·98 9·36 4·20 3·24 0·73 0·83 0·59
22·20 21·38 12·29 8·60 3·67 2·85 0·78 0·76 0·56
99·9 0·09 81·1
97·6 0·48 89·2
% Accounted for 99·1 % in 1 M fraction 0·22 93·1 % Recovery
100 0·02 98·4
99·7 0·3 94·8
a The analysis was based on: I, summation of the optical densities of individual fractions; II, optical densities of pooled peaks; III, summation of the radioactivity as registered with the flow cell; IV, radioactivity of the pooled peaks (planchet counting).
b Results shown in Fig. 6.
c This experiment contained yeast RNA and 32P-labelled MS2 RNA.
d Contained mainly 2',3' cyclic uridylic acid.
e Contained presumably a dinucleotide derivative.
TABLE 5
Distribution of (Pup)n Pyp tracts in MS2 RNA

Tract length   Percentage     Percentage found b         No. of tracts    Percentage
(n + 1)        calculated a   Expt E-PG    Expt F-PH     per molecule c   accounted for
1              25·10          26·34 d      25·62 d       793              26·09
2              25·04          23·45        23·49 e       358              23·56
3              18·74          19·36        19·54         198              19·55
4              12·47          12·01        12·05         92               12·11
5              7·78           9·02         9·37          56               9·21
6              4·66           4·52         3·90          21               4·15
7              2·71           2·90         3·24          13               2·99
8              1·54           0·92         0·89          3 (4)            0·79
9              0·87           0·83         1·19          3 (4)            0·89
10             0·48           0·62         0·69          2                0·66

a Based on random sequence and the following base composition: 22·8% A, 27·1% G, 25·2% U and 24·9% C (Strauss & Sinsheimer, 1963).
b Average of the distributions of 32P activity reported in Table 4.
c A single RNA chain of 3050 nucleotides is assumed.
d The cyclic nucleotides (mainly 2',3' uridylic acid) are included.
e The small preceding peak is included.
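The "percentage calculated" column can be approximated with a short calculation. The sketch below is an assumption about how such a random expectation may be formed, not the authors' stated procedure: if purines occur independently with frequency p and pyrimidines with frequency q = 1 - p, a tract of length k (k - 1 purines closed by a pyrimidine, and preceded by a pyrimidine) starts at a given position with probability q · p^(k-1) · q and contributes k nucleotides. The function name is hypothetical; the base composition is that of footnote a.

```python
def random_tract_percentages(purine_fraction, max_length=10):
    """Hypothetical helper: percent of nucleotides in (Pup)n Pyp tracts of
    length k = n + 1 for a random sequence with the given purine fraction."""
    p = purine_fraction
    q = 1.0 - p
    return {
        k: 100.0 * k * q * q * p ** (k - 1)
        for k in range(1, max_length + 1)
    }


# Purine fraction of MS2 RNA from footnote a: 22.8% A + 27.1% G.
MS2_PURINE_FRACTION = 0.228 + 0.271

# "Percentage calculated" column of Table 5, tract lengths 1 to 10.
TABLE_5_CALCULATED = [25.10, 25.04, 18.74, 12.47, 7.78, 4.66, 2.71, 1.54, 0.87, 0.48]

expected = random_tract_percentages(MS2_PURINE_FRACTION)
for k, tabulated in enumerate(TABLE_5_CALCULATED, start=1):
    print(f"{k:2d}  model {expected[k]:6.2f}  table {tabulated:6.2f}")
```

Under these assumptions the model values fall within about 0·01 to 0·02 of the tabulated ones, which is the size of difference that rounding would produce. The same function could be applied to the yeast RNA composition by changing the purine fraction.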
4. Discussion
(a) Distribution of purine tracts in MS2 RNA
The validity of the results presented in this paper depends entirely on the specific action of pancreatic ribonuclease. The questions which must be asked are: (1) Did the reaction go to completion? (2) Was there any hydrolysis, enzymic or not, of 3'-purine nucleotide esters? As mentioned above, incomplete digestion should have been shown up by the appearance of oligonucleotides ending in a 2',3' cyclic phosphodiester. Although the lower homologues were clearly resolved upon rechromatography (Fiers et al., 1965), they did not show up under normal conditions. Also the decanucleotides were clearly resolved, but there was not a trace of undigested material of longer chain length. Over-digestion is a more serious problem. This often shows up as 2',3' cyclic purine nucleotides (Symons, Rees, Short & Markham, 1963). These compounds, however, were not present in our experiments carried out under optimal conditions (although small amounts were detected in experiment K-PE). Neither was there a trace of degradation products in the mono-, di- and trinucleotide peaks (Fiers et al., 1965). Another strong argument for the validity of the results is their relative independence of experimental variables, such as the amount of enzyme and reaction time (it is obvious that experiments E-PG and L-PI were nearer to complete digestion than experiment F-PH). The key to success is (1) extreme care in preparation and handling of the RNA solutions, (2) the use of the chromatographically pure RNase A or D, (3) digestion at high ionic strength, and (4) stopping the reaction when it approaches the end point. The results, summarized in Table 5, do not reveal a general trend to deviate from the distribution calculated on the basis of a random sequence. This suggests that there is no strong discrimination against triplets containing three purines. 
MS2 RNA is composed of about 1017 triplets, and of these a minimum of 47 and a maximum of 213 are such three-purine triplets (a more accurate estimate of their relative frequency cannot be made as the viral RNA chain is too small to represent a statistically valid sample of all inter-triplet frequencies). The estimated number of three-pyrimidine triplets is in the same range. A remarkable feature of the distribution, however, is that all odd-numbered tract lengths are in excess of random, and all even-numbered in deficit. This aspect is discussed in the next section. The number of octa-, nona- and decanucleotides is still in reasonable agreement with a random distribution. This suggests that a sequence of two adjacent three-purine triplets is neither selected against nor favoured. The estimated number of tracts containing seven or more purines in a row amounts to 8. This can be compared with a value of 35 reported by Sedat & Sinsheimer (1964) for φX DNA, which has less than twice the number of monomers present in MS2 RNA. Each viral RNA molecule contains three or four octanucleotides and three or four nonanucleotides; the lower estimate is probably the better, especially if the results of experiment L-PI are also taken into account. The imperfect agreement is likely to be due to experimental error, although other factors may influence this result. Indeed there is no proof that the RNA molecules are a homogeneous population. Even in a fresh lysate, the number of plaque-forming units falls much short of the number of physical viral particles. Also the plaque-forming efficiency decreases towards the lighter side of a virus band in CsCl (unpublished experiments). This may mean that some chains were encapsulated before synthesis was complete.
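The bounds of 47 and 213 three-purine triplets quoted at the start of this passage are not derived in the text. One reading that reproduces both figures is sketched below, as an assumption: a run of n consecutive purines holds at most floor(n/3) complete three-purine triplets when the reading frame is most favourable, and at least floor((n - 2)/3) when it is least favourable. The function name is hypothetical; the tract counts are the lower estimates of Table 5.

```python
# Number of tracts per molecule by tract length (Table 5, lower estimates).
# A tract of length k contains k - 1 purines followed by one pyrimidine.
TRACTS_PER_MOLECULE = {1: 793, 2: 358, 3: 198, 4: 92, 5: 56,
                       6: 21, 7: 13, 8: 3, 9: 3, 10: 2}


def three_purine_triplet_bounds(tract_counts):
    """Hypothetical helper: (minimum, maximum) number of all-purine triplets,
    taking the least and most favourable reading frame for every tract."""
    minimum = 0
    maximum = 0
    for length, count in tract_counts.items():
        purines = length - 1
        maximum += count * (purines // 3)
        minimum += count * max((purines - 2) // 3, 0)
    return minimum, maximum


bounds = three_purine_triplet_bounds(TRACTS_PER_MOLECULE)
print(bounds)
```

Treating every tract independently in its best or worst frame is a simplification, since in a real chain the frame is fixed once; it is used here only because it matches the quoted minimum and maximum.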
The percentage of decanucleotide material found corresponds closely to the quantity expected if two tracts are present per molecule. It may be argued that the actual number is three, but degraded by 30 or 40%. This seems unlikely as (1) no degradation products were detected, (2) various experiments in which the digestion conditions were different all lead to the same conclusion, and (3) in some experiments complete separation between the two decanucleotides was achieved† and the ratio was near 1:1 (we cannot completely exclude the possibility that only one tract is a decanucleotide and the other an undecanucleotide). Knowing that two different decanucleotides are present, the minimum molecular weight of the RNA chain has been calculated on the basis of the percentage decanucleotide material present. This estimate is 1·12 × 10⁶ and 1·00 × 10⁶ for experiments E-PG and F-PH respectively. It is in excellent agreement with the physically determined molecular weight of the isolated MS2 RNA, namely, 1·05 × 10⁶, and with the RNA content per viral particle (Strauss & Sinsheimer, 1963). These results also prove that the viral RNA chain is not composed of identical subunits.
(b) Is the genetic code rhythmic?
A striking feature of the purine distribution in MS2 RNA is that all odd-numbered tracts (containing an even number of purines) were in excess of a random distribution, and all even-numbered tracts in deficit (Fig. 7). The reproducibility of the experimental results presented in Table 5 is good enough to conclude that their averages are different from the calculated values with a significance of 98% (Student's t-test; this conclusion is reinforced by the independent experiment L-PI, Table 4). The unexpected feature, however, is not that they are different from random, but that they suggest a regular pattern of deviation. We have considered some possible implications of this rhythm, assuming that the observed frequency distribution is valid for an indefinitely long molecule.

FIG. 7. The deviation from random distribution as a function of chain length (MS2 RNA). This plot is similar to Fig. 4. -○-○-, Experimental results (Table 5: average of experiments E-PG and F-PH, minus amount calculated on the basis of a random sequence). ×-·-·×, Hypothesis 1. A code containing 50% Pup letters, and in which all even places have a 60% chance of containing the same letter as the preceding, and all odd places a 60% chance of a different letter (difference from random distribution is plotted). △---△, Hypothesis 2. Same as above, but the preference amounts to 65 and 61·5%, respectively.

† Partial separation is already evident in Fig. 6. In some other experiments, e.g. L-PI, which involved co-chromatography of a carrier yeast RNA digest and a 32P-labelled MS2 RNA digest, the resolution was complete.

What does this pattern mean? For the remainder of this discussion the genetic language may be simplified to a sequence of two letters, Pup and Pyp. First we note that the observed periodicity is two, and not three. As such it is not possible to obtain this rhythmic distribution by merely selecting the relative concentration of the eight possible three-letter code-words. If the genetic code consists of non-overlapping triplets, which seems almost certain (Crick, Barnett, Brenner & Watts-Tobin, 1961), the rhythmic distribution would mean that the sequence of triplets is not random, but subject to a higher degree of order. In other words, the assumption that the likelihood of occurrence of a given triplet is equal to its relative concentration is not warranted for a rhythmic code.

What kind of sequence would give rise to the observed rhythm? Suppose the code is a sequence of even and uneven letters. An alternating excess-deficit would obtain if all even loci are preferably the same as the preceding letter (Pup Pup or Pyp Pyp) and all odd places preferably different (Pup Pyp and Pyp Pup). This amounts to the same as a doublet code with an excess of homo-words (Pup Pup and Pyp Pyp), and in which the first letter of each doublet is preferably different from the last letter of the preceding doublet (again it must be stressed that it is not possible to obtain the rhythm by merely adjusting the concentration of the different doublets). Hypothesis 1 (Fig. 
7) is calculated on this basis, assuming a preference of 60% for the same letter on the even places and a 60% preference for a different letter on the odd places (this amounts to the same as a doublet code containing 30% Pup Pup, 30% Pyp Pyp, 20% Pup Pyp and 20% Pyp Pup, and in which a doublet ending with a Pup or a Pyp respectively has a 60% chance of being followed by a doublet in which a Pyp or a Pup, respectively, is the first letter). If both preferences are equal, they balance out for the mononucleotides, and the amount of Pyp is the same as for a random distribution. The experimental results, however, did show a small excess of mononucleotides. In hypothesis 2 (Fig. 7), this is taken into account by assuming a preference of 65% for same letters and 61·5% for alternate letters (these parameters result in approximately the experimental excess of mono- and pentanucleotide). These calculated distributions based on such simple assumptions fail at longer chain lengths. The experimental rhythm, however, is more uniform and persistent. This suggests that Nature is far-sighted in not only selecting adjacent doublets, but also influencing, albeit perhaps to a lesser degree, the second nearest neighbour, etc.

What does a system of alternating preferences mean in terms of a triplet code? As the rhythm pertains partly to the longer sequences, it is necessary to invoke in a preference system the Pup Pup Pup-triplet (i.e. a given preceding or following triplet is either favoured or discriminated against). However, this would again lead to a periodicity of three instead of two. In other words, it is not possible to build up a language, which has rhythm in agreement with the experimental data, by merely choosing the proper concentration of the eight possible triplets and by placing simple restrictions on nearest neighbours.

Is the rhythmic code a more general phenomenon? Few data in the literature bear on this point. 
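Hypotheses 1 and 2, as described above and in the legend of Fig. 7, can be made concrete with a short calculation. The sketch below is an illustration under stated assumptions, not the authors' own computation: the sequence is treated as a two-letter chain with 50% Pup at every place, the place following a given Pyp is taken to be even or odd with equal probability, and the function and variable names are hypothetical.

```python
def alternating_tract_percentages(same_on_even, differ_on_odd, max_length=10):
    """Hypothetical helper: percent of nucleotides in (Pup)n Pyp tracts of
    length k = n + 1 when even places repeat the preceding letter with
    probability `same_on_even` and odd places differ from it with
    probability `differ_on_odd`."""
    result = {}
    for k in range(1, max_length + 1):
        total = 0.0
        # Parity of the place right after the leading Pyp: 0 = even, 1 = odd.
        for first_parity in (0, 1):
            # P(leading letter is Pyp) = 0.5, P(this parity) = 0.5.
            prob = 0.5 * 0.5
            parity = first_parity
            previous_is_purine = False
            for step in range(k):
                # The first k - 1 letters are purines, the last is a pyrimidine.
                letter_is_purine = step < k - 1
                p_same = same_on_even if parity == 0 else 1.0 - differ_on_odd
                if letter_is_purine == previous_is_purine:
                    prob *= p_same
                else:
                    prob *= 1.0 - p_same
                previous_is_purine = letter_is_purine
                parity ^= 1
            total += prob
        result[k] = 100.0 * k * total
    return result


def random_percentage(k):
    """Random expectation for 50% Pup: 100 * k / 2**(k + 1)."""
    return 100.0 * k / 2 ** (k + 1)


hypothesis_1 = alternating_tract_percentages(0.60, 0.60)
hypothesis_2 = alternating_tract_percentages(0.65, 0.615)

for k in range(1, 11):
    print(k,
          round(hypothesis_1[k] - random_percentage(k), 2),
          round(hypothesis_2[k] - random_percentage(k), 2))
```

Under these assumptions Hypothesis 1 leaves the mononucleotides at the random value and gives a deficit for tract length 2 and an excess for tract length 3, while Hypothesis 2 adds a small excess of mononucleotides, in line with the description in the text.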
A similar periodicity is undoubtedly present in yeast RNA (Fig. 4). Hall & Sinsheimer (1963) found a deficit-excess-deficit for the solitary pyrimidine nucleotides, the dipyrimidine nucleotides and the tripyrimidine nucleotides respectively in the bacteriophage φX174 DNA (these correspond to the di-, tri- and tetranucleotide tracts respectively in the present work). The agreement does not extend to the longer tracts (except perhaps a periodicity with a sign inversion). Similarly, in the same DNA, solitary purine nucleotides show a deficit and dipurine nucleotides an excess (Sedat & Sinsheimer, 1964). No periodicity is revealed in the few reported distributions of pyrimidines and purines in microbial DNA or in DNA of higher organisms (Spencer & Chargaff, 1963; Habermann, Habermannova & Cerhova, 1963; Burton, Lunt, Petersen & Siebke, 1963). The non-random sequence of triplets may place some restrictions on the amino acid distribution. However, as each triplet in the two-letter code stands for eight actual
codons, the former may not be revealed, except by sophisticated mathematical analysis. It may be appropriate to mention here that vectorial analysis has shown striking examples of repeating sequences in the primary structure of pancreatic ribonuclease (Lanni, 1960, 1961).

What is the significance of the rhythmic code? It may be related to phenotypic expression (e.g. if proper folding of the corresponding polypeptides requires a certain repeat), but this seems unlikely as it involves an important part of the genetic material. Neither is a functional role easily conceived (e.g. it is of no help in maintaining the proper reading frame). The periodicity may be the result of mutation pressures, which are dependent on neighbouring nucleotide environment; it may be formed by virtue of an intricate specificity pattern of the polynucleotide-synthesizing machinery, or it may be the remnant of a simpler, primitive code present in earlier forms of life.

Note: The paper of Sinha, Fujimura & Kaesberg (1965), dealing with similar work on the related phage R17, reached us when this manuscript was ready for submission. The results are in general agreement and the alternating rhythm can also be deduced from their results. They find only two nonanucleotides, one decanucleotide and one higher oligonucleotide (dodeca- or trideca-) compared to three, two and zero, respectively, in MS2 RNA. This probably constitutes a real difference in the primary structure of the genetic material of two closely related strains.

This work was supported by grants from the U.S. Public Health Service (GM 11304-01), the Lilly Research Laboratories, the Nationaal Fonds voor Wetenschappelijk Onderzoek, and the Fonds voor Collectief Fundamenteel Wetenschappelijk Onderzoek.

REFERENCES
Adams, M. H. (1959). Bacteriophages. New York: Interscience Publ.
Bartos, E. M., Rushizky, G. W. & Sober, H. A. (1963). Biochemistry, 2, 1179.
Beers, R. F. (1960). J. Biol. Chem. 235, 2393.
Bell, D., Tomlinson, R. V. & Tener, G. (1964). Biochemistry, 3, 317.
Burton, K., Lunt, M. R., Petersen, G. B. & Siebke, J. C. (1963). Cold Spr. Harb. Symp. 28, 27.
Cooper, S. & Zinder, N. D. (1962). Virology, 18, 405.
Crestfield, A. M., Smith, K. C. & Allen, F. W. (1955). J. Biol. Chem. 216, 185.
Crick, F. H. C., Barnett, L., Brenner, S. & Watts-Tobin, R. J. (1961). Nature, 192, 1227.
Davis, J. E. & Sinsheimer, R. L. (1963). J. Mol. Biol. 6, 203.
Davis, J. E., Strauss, J. H. & Sinsheimer, R. L. (1961). Science, 134, 1427.
Edelhoch, H. & Coleman, J. (1956). J. Biol. Chem. 219, 351.
Edmonds, M. & Abrams, R. (1963). J. Biol. Chem. 238, PC 1186.
Fiers, W., De Wachter, R., Lepoutre, L. & Vandendriessche, L. (1965). J. Mol. Biol. 13, 451.
Fraenkel-Conrat, H., Singer, B. & Tsugita, A. (1961). Virology, 14, 54.
Gesteland, R. F. & Boedtker, H. (1964). J. Mol. Biol. 8, 496.
Habermann, V., Habermannova, S. & Cerhova, M. (1963). Biochim. biophys. Acta, 76, 310.
Hall, J. B. & Sinsheimer, R. L. (1963). J. Mol. Biol. 6, 115.
Haywood, A. M. & Sinsheimer, R. L. (1963). J. Mol. Biol. 6, 247.
Hofschneider, P. H. (1963). Z. Naturf. 18b, 205.
Hummel, J. P. & Kalnitsky, G. (1959). J. Biol. Chem. 234, 1517.
Lanni, F. (1960). Proc. Nat. Acad. Sci., Wash. 46, 1563.
Lanni, F. (1961). Proc. Nat. Acad. Sci., Wash. 47, 61.
Lepoutre, L. & Fiers, W. (1964). Arch. intern. Physiol. Biochim. 72, 331.
Lepoutre, L. & Fiers, W. (1965). Arch. intern. Physiol. Biochim. 73, 372.
Loeb, T. & Zinder, N. D. (1961). Proc. Nat. Acad. Sci., Wash. 47, 282.
Marvin, D. A. & Hoffmann-Berling, H. (1963). Z. Naturf. 18b, 884.
Mitra, S., Enger, M. D. & Kaesberg, P. (1963). Proc. Nat. Acad. Sci., Wash. 50, 68.
Nathans, D., Notani, G., Schwartz, J. H. & Zinder, N. D. (1962). Proc. Nat. Acad. Sci., Wash. 48, 1424.
Rushizky, G. W., Greco, A. E., Hartley, R. W. & Sober, H. A. (1963). Biochem. Biophys. Res. Comm. 10, 311.
Rushizky, G. W., Knight, C. A. & Sober, H. A. (1961). J. Biol. Chem. 236, 2732.
Sedat, J. & Sinsheimer, R. L. (1964). J. Mol. Biol. 9, 489.
Sinha, N. K., Fujimura, R. K. & Kaesberg, P. (1965). J. Mol. Biol. 11, 84.
Spencer, J. H. & Chargaff, E. (1963). Biochim. biophys. Acta, 76, 310.
Staehelin, M. (1961). Biochim. biophys. Acta, 49, 27.
Staehelin, M. (1963). In Progress in Nucleic Acid Research, ed. by J. N. Davidson and W. E. Cohn, vol. 2, p. 170. New York: Academic Press.
Stockx, J. (1963). Biochim. biophys. Acta, 68, 535.
Strauss, J. H. & Sinsheimer, R. L. (1963). J. Mol. Biol. 7, 43.
Symons, R. H., Rees, M. W., Short, M. N. & Markham, R. (1963). J. Mol. Biol. 6, 1.
Warner, R. C. (1957). J. Biol. Chem. 229, 711.