On the origin and evolution of the genetic code I. Wobbling and its potential significance


J. theor. Biol. (1977) 67, 85-109

On the Origin and Evolution of the Genetic Code

I. Wobbling and its Potential Significance

NILS AALL BARRICELLI

Department of Mathematics, University of Oslo, Blindern, Norway

(Received 5 November 1975, and in revised form 12 October 1976)

In this paper several properties of the genetic code are interpreted by assuming that wobbling, or some remnant of wobbling, was originally a common phenomenon also in the first nucleotide of each codon, and not only in the third nucleotide. Some of the last steps in the evolution of the genetic code are described on the basis of this interpretation of genetic code features. An attempt to outline some of the earlier steps in the evolution of the genetic code is based on the assumption that at an earlier stage wobbling may also have been common in the central nucleotide of each codon. In the last part of the paper the possibility is considered that the pairing rules which characterize wobbling may have been much more common in the past, not only in codon-anticodon pairing but also in polymer copying. The advantages of a freer purine-pyrimidine pairing, like the one characteristic of wobbling, are stressed for a primitive (or prebiologic) environment in which nucleotide production was not entirely (or not at all) under biologic control. This paper is based exclusively on the "frozen accident" interpretation of the genetic code (Crick, 1968) with a few modifications introduced or implied in the text. No stereochemical codon interpretations and only a minimum of chemical considerations are involved.

1. Introduction

A glance at the genetic code as presented in Table 1 shows that the 64 triplets are not distributed at random among the 20 amino acids. They are associated in groups of triplets which code for the same amino acid, share common nucleotides and occupy contiguous positions in the table. Several rules can easily be deduced from the table (Crick, 1968).

(1) XYU and XYC always code for the same amino acid, as if no distinction were made between the two pyrimidines U and C in the third codon base.

TABLE 1

The genetic code. This table shows the best allocations of the 64 codons. The two codons marked ochre and amber are believed to signal the termination of the polypeptide chain

1st base    2nd base                            3rd base
            U       C       A       G

U           PHE     SER     TYR     CYS         U
            PHE     SER     TYR     CYS         C
            LEU     SER     Ochre   ?           A
            LEU     SER     Amber   TRP         G

C           LEU     PRO     HIS     ARG         U
            LEU     PRO     HIS     ARG         C
            LEU     PRO     GLN     ARG         A
            LEU     PRO     GLN     ARG         G

A           ILE     THR     ASN     SER         U
            ILE     THR     ASN     SER         C
            ILE     THR     LYS     ARG         A
            MET     THR     LYS     ARG         G

G           VAL     ALA     ASP     GLY         U
            VAL     ALA     ASP     GLY         C
            VAL     ALA     GLU     GLY         A
            VAL     ALA     GLU     GLY         G

(2) XYA and XYG usually code for the same amino acid, with two exceptions represented by the rare amino acids methionine and tryptophan, which have only one codon each. Except for these two cases, no distinction is made between the two purines A and G in the third codon base.

(3) In half the cases (8 out of 16) four codons like XYU, XYC, XYA and XYG code for the same amino acid, or in other words the third base in each triplet is disregarded and the amino acid is determined by the first two bases (X and Y) only.

(4) There is one case in which three codons of the type XYU, XYC and XYA code for the same amino acid (isoleucine).

(5) In two cases two codons of the type UXY and CXY code for the same amino acid (leucine). In these two cases no distinction is made between the two pyrimidines U and C in the first codon base.

(6) In two other cases two codons of the type CXY and AXY code for the same amino acid (arginine).
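The six rules can be checked mechanically against Table 1. The following is only a minimal sketch: the names `TABLE_1`, `CODE`, `boxes` and `first_base_pairs` are illustrative, the ochre and amber codons are both written as `TERM`, and the unassigned codon UGA is kept as `?`, as in the table.

```python
from itertools import product

BASES = "UCAG"

# Table 1 read row by row: first base, then second base, then third base,
# each in the order U, C, A, G. "TERM" stands for ochre and amber;
# "?" is the codon UGA, which Table 1 leaves unassigned.
TABLE_1 = """
PHE PHE LEU LEU  SER SER SER SER  TYR TYR TERM TERM  CYS CYS ? TRP
LEU LEU LEU LEU  PRO PRO PRO PRO  HIS HIS GLN GLN    ARG ARG ARG ARG
ILE ILE ILE MET  THR THR THR THR  ASN ASN LYS LYS    SER SER ARG ARG
VAL VAL VAL VAL  ALA ALA ALA ALA  ASP ASP GLU GLU    GLY GLY GLY GLY
""".split()

CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), TABLE_1)}

# One "box" per pair of first two bases XY: the meanings of XYU, XYC, XYA, XYG.
boxes = {x + y: [CODE[x + y + z] for z in BASES] for x in BASES for y in BASES}

# Rule (1): XYU and XYC always agree.
rule1_holds = all(b[0] == b[1] for b in boxes.values())
# Rule (2): boxes in which XYA and XYG disagree (the exceptions).
rule2_exceptions = sorted(xy for xy, b in boxes.items() if b[2] != b[3])
# Rule (3): boxes in which the third base is disregarded altogether.
rule3_boxes = sorted(xy for xy, b in boxes.items() if len(set(b)) == 1)
# Rule (4): boxes in which XYU, XYC and XYA agree but XYG differs.
rule4_boxes = sorted(xy for xy, b in boxes.items()
                     if b[0] == b[1] == b[2] != b[3])


def first_base_pairs():
    """Rules (5) and (6): codons differing only in the first base, same amino acid."""
    found = []
    for y, z in product(BASES, repeat=2):
        for i, x1 in enumerate(BASES):
            for x2 in BASES[i + 1:]:
                a, b = x1 + y + z, x2 + y + z
                if CODE[a] == CODE[b] and CODE[a] not in ("TERM", "?"):
                    found.append((a, b, CODE[a]))
    return found


print(rule1_holds, rule2_exceptions, len(rule3_boxes), rule4_boxes)
print(first_base_pairs())
```

With Table 1 as transcribed here, the exceptions to rule (2) are the AU and UG boxes (methionine and tryptophan), eight boxes satisfy rule (3), the AU box alone satisfies rule (4), and the only first-base coincidences are the leucine pairs UUA/CUA and UUG/CUG and the arginine pairs CGA/AGA and CGG/AGG.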


The first four rules deal with cases in which differences in the third base do not affect the translation of the codon in terms of amino acids. Similarly, the last two rules [(5) and (6)] deal with cases in which differences in the first base of the codon do not affect its translation. As is well known, the first four rules are interpreted by the wobbling phenomenon discovered by Crick (1966). This interpretation, if restricted to the third codon and anticodon base, does not however explain the last two rules [(5) and (6)]. It is our intention to interpret the last two rules by assuming that the wobbling phenomenon may have been more common in the past and may also have involved the first nucleotide in each triplet, or at least in many triplets.

Wobbling in the first codon base may not be entirely a phenomenon of the past. A case resembling first base wobbling (or FBW, as the phenomenon will hereafter be called) is found today in in vitro experiments with the formylmethionine-tRNA, which is responsible for protein chain initiation in E. coli. As is well known, formylmethionine-tRNA will initiate polypeptide chain growth in response to either the methionine codon AUG or the valine codon GUG. If the tRNA anticodon is UAC, as seems likely, this would be a typical example of wobbling in which the first base, instead of the third base, is involved.

Our method of interpreting the properties of the genetic code will, to a large extent, be based on the following assumptions, which together will be designated as the "primordial purine-pyrimidine pairing hypothesis" or, for short, the "Pu-Py hypothesis".

(1) In the earliest versions of the genetic code (primordial code or codes) pairing rules between codon and anticodon triplets were used in which less distinction than in present-day pairing rules was made between some of the nucleotides, particularly between the purines (A, G, I and possibly other primordial purines) and between the pyrimidines (U, C and possibly other primordial pyrimidines).

(2) The pairing rules are assumed to have always (or nearly always) been reciprocal with respect to codon and anticodon nucleotides, as would be expected if the origin of codon-anticodon pairing was a form of annealing between the two RNA segments.

(3) The primordial pairing rules have gradually, or by successive steps, been replaced by pairing rules more in line with those which are presently used when RNA and DNA strands are copied, as far as the first two nucleotides in each triplet are concerned (with an exception, for the first nucleotide, presented by the formylmethionine-tRNA mentioned above). Some of the successive steps will be discussed in the next section.


This Pu-Py hypothesis, together with the genetic code itself and very little additional information, will be the basis for our attempt to interpret various properties of the genetic code, particularly the six characteristics [rules (1) to (6)] enumerated at the beginning of this section.

In order to demonstrate the power of this approach let us assume that we were unaware of, or unfamiliar with, the wobbling hypothesis and were trying to find an interpretation of the first four rules, which deal with the cases in which differences in the third base do not affect the translation of the codons. According to the Pu-Py hypothesis the first two rules are easily interpreted by assuming that, either in the present or some time in the past, in the third codon and anticodon position some of the purines can (or could) pair with more than one pyrimidine, and some pyrimidines can (or could) pair with more than one purine. One may tentatively start with the assumption that each one of the pyrimidines U and C can (or could) pair with both of the purines A and G, and vice versa. However, one should keep in mind that this condition does not have to apply to both of the pyrimidines in order to interpret rules (1) and (2), and one should be prepared to modify this assumption for one of the pyrimidines if required in order to fit the data of Table 1 and the rules (1) to (4). This has been done in Table 2 (alternatives A and B), where parentheses are used to eliminate some of the alternatives as required in order to fit the data of Table 1 and the rules (1) to (4).

TABLE 2

Solutions of the third base pairing problem for the interpretation of rules (1), (2), (3) and (4). Parentheses are used to identify eliminated alternatives. X designates an unknown tRNA nucleotide needed for the interpretation

Third position      Third position codon base
anticodon base      Alternative A               Alternative B

U                   A and G                     (A and) G
C                   (A and) G                   A and G
A                   U (and C)                   (U and) C
G                   U and C                     U and C
X                   A and U or C or both        A and U or C or both

Both alternatives A and B in Table 2 are constructed by starting from the same assumption that each one of the pyrimidines U and C in the third anticodon position can pair with either one of the purines A and G, and vice versa. If this pairing rule were assumed to be something of the past which no longer applies, it might be difficult to make a decisive argument for changing it. But if the intent were to find pairing rules which might have a chance of being valid also in the present, and which may be subject to experimental verification, then the choices are more restricted. In fact there are two exceptions to rule (2), namely tryptophan and methionine, both of which have unique codons of the type XYG. Evidently there must be anticodons with a third nucleotide capable of pairing only with G and not with the other purine. There are two solutions to this problem which do not require the introduction of an extra nucleotide besides the ubiquitous four RNA nucleotides U, C, A and G. The first solution, which is presented in Table 2A, is to assume that the pyrimidine C in the third anticodon position pairs only with G and not with the other purine A. The second solution, presented in Table 2B, is to assume that the other pyrimidine U in the third anticodon position pairs only with G.

Since the pairing rules are assumed to be reciprocal with respect to codon and anticodon nucleotides [see Pu-Py hypothesis, point (2)], if C in the third anticodon base pairs only with G and not with A, as in the first solution (alternative A), it will have to pair only with G and not with A also in the third codon base. That implies that an A in the third anticodon base pairs only with U and not with C, as indicated by the eliminated alternatives surrounded by parentheses in Table 2A. A similar argument is applied to eliminate alternatives in Table 2B.
Each one of the two solutions presented in Table 2 fully identifies the pairing rules of the four nucleotides U, C, A and G, and leaves no possibility of interpreting the remaining rules (3) and (4) presented at the beginning of this section without introducing at least one extra nucleotide X in the third anticodon position of some tRNA molecules. This is no surprise, since we know that tRNA molecules often contain various types of modified nucleotides. The simplest solution is to make use of only one extra nucleotide X, which according to rule (3) must be able to pair with at least one purine and at least one pyrimidine. Moreover, according to rule (4) the purine will have to be A only, whereas the pyrimidine(s) can be either U or C or both (see Table 2, A and B). This is as far as we can go with the information contained in Table 1. As is well known, according to the wobbling hypothesis the first solution (alternative A) is the correct one, as expected, since A seems more likely to pair only with U than to pair only with C, and the same applies to the other pairing rules. The extra nucleotide X, which is identical to inosine, can pair with all of the three nucleotides A, U and C (see Table 3).
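The reciprocity argument that leads from the starting assumption to alternatives A and B of Table 2 can be illustrated with a short sketch. The dictionaries map a third-position anticodon base to the codon bases it may pair with; the names `START`, `ALT_A`, `ALT_B`, `is_reciprocal` and `restrict` are illustrative only, and the extra nucleotide X is left out.

```python
# Third-position pairing: anticodon base -> set of codon bases it may pair with.
START = {"U": {"A", "G"}, "C": {"A", "G"}, "A": {"U", "C"}, "G": {"U", "C"}}
ALT_A = {"U": {"A", "G"}, "C": {"G"}, "A": {"U"}, "G": {"U", "C"}}
ALT_B = {"U": {"G"}, "C": {"A", "G"}, "A": {"C"}, "G": {"U", "C"}}


def is_reciprocal(pairing):
    """Point (2) of the Pu-Py hypothesis: if a pairs with c, then c pairs with a."""
    return all(a in pairing[c]
               for a, partners in pairing.items()
               for c in partners)


def restrict(pairing, anticodon_base, codon_base):
    """Forbid one pairing together with its mirror image, keeping reciprocity."""
    new = {base: set(partners) for base, partners in pairing.items()}
    new[anticodon_base].discard(codon_base)
    new[codon_base].discard(anticodon_base)
    return new


# Forbidding C-A (and hence A-C) gives alternative A;
# forbidding U-A (and hence A-U) gives alternative B.
print(restrict(START, "C", "A") == ALT_A)
print(restrict(START, "U", "A") == ALT_B)

# Restricting C alone, without the mirror restriction on A, is not reciprocal.
one_sided = {"U": {"A", "G"}, "C": {"G"}, "A": {"U", "C"}, "G": {"U", "C"}}
print(is_reciprocal(one_sided))
```

In each alternative exactly one pyrimidine is left pairing with G only, which is what the unique XYG codons of methionine and tryptophan require.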


2. The Assumption of First Base Wobbling (FBW-hypothesis) in Primordial Genetic Codes

Our next task will be to interpret the last two rules [(5) and (6)] presented in the introduction, which deal with the cases in which differences in the first base of the codon do not affect its translation in terms of amino acids. In this case we will of course take into account any useful side-information which may be available, since our purpose is now to discover new ground rather than to give a simple demonstration of the applicability of the method.

We cannot assume that any kind of wobbling or primordial purine-pyrimidine pairing mechanism could still apply to the first nucleotide in a triplet, with the one exception already mentioned, represented by formylmethionine-tRNA. But we will assume that some time in the past some sort of wobbling took place also in the first position of each codon (or at least of several codons), in order to find out whether this assumption can give us a plausible picture of the last steps in the evolution of the genetic code.

TABLE 3

Pairing in the third base according to the wobbling hypothesis

Third position      Third position
anticodon base      codon base

U                   A or G
C                   G
A                   U
G                   U or C
I                   U, C or A
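The pairing rules of Table 3 can be turned into a small lookup that lists the codons a given anticodon reads. This is a sketch under one stated assumption: anticodons are written in the orientation the text appears to use (anticodon UAC for codon AUG), so that anticodon position i pairs with codon position i. The names `WATSON_CRICK`, `WOBBLE` and `codons_read` are illustrative.

```python
# Strict pairing, used here for the first two positions.
WATSON_CRICK = {"U": "A", "C": "G", "A": "U", "G": "C"}

# Table 3: third-position anticodon base -> codon bases it can pair with.
WOBBLE = {"U": "AG", "C": "G", "A": "U", "G": "UC", "I": "UCA"}


def codons_read(anticodon):
    """Codons read by an anticodon when only the third position wobbles.

    Assumes anticodon position i pairs with codon position i.
    """
    a1, a2, a3 = anticodon
    return [WATSON_CRICK[a1] + WATSON_CRICK[a2] + c3 for c3 in WOBBLE[a3]]


print(codons_read("UAI"))  # inosine in the third position
print(codons_read("UAC"))  # C in the third position pairs with G only
print(codons_read("AAG"))  # G in the third position pairs with U or C
```

Under these rules the anticodon UAI reads AUU, AUC and AUA, the three isoleucine codons of rule (4), whereas UAC reads AUG alone, which is how a single-codon amino acid such as methionine escapes rule (2).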

There is no strict requirement that wobbling in the first position, if it ever occurred, would have had to follow exactly the same rules which today apply in the third position. But that would obviously be the simplest assumption to start with. We shall tentatively use this assumption, and we will also assume that wobbling in the first base was eliminated gradually, by successive removal in different tRNA molecules or in different anticodon triplets. It will simplify the presentation of the subject if we assume that the removal of first base wobbling (FBW) was achieved in several steps (without implying that each step occurred at the same time everywhere), of which the last three major steps can be described as follows:


(1) A first step in which FBW was removed from those codon triplets which had a C or an A in the central base, carrying evolution from a general FBW-stage to a (UG)-stage in which FBW was confined to codon triplets with a U or a G in the central base (see Table 4).

(2) A second step in which FBW was removed also from those codon triplets which had a pyrimidine in the third base. This left FBW only in those triplets which had a U or a G in the central base and a purine (Pu) in the third base, carrying evolution from the (UG)-stage to a (UG)Pu-stage.

(3) A third step in which FBW was removed completely (except in the one case represented by the formylmethionine-tRNA), carrying evolution to the present stage.

The successive stages separated by the three steps described above are summarized in Table 4. We shall follow the scheme outlined in this table as far back as proves useful and is capable of giving plausible results. But we will keep in mind that wobbling may also have occurred in the central base and that, if necessary, we may have to intercalate the elimination of wobbling in the central base with any one of the steps recorded in Table 4.

TABLE 4

Subsequent stages in first base wobbling (FBW). Pu = purines, Py = pyrimidines, N = any base

Stages               Codons subject to FBW     Codons excluded from FBW

General FBW-stage    All                       None
(UG)-stage           NUN and NGN               NCN and NAN
(UG)Pu-stage         NUPu and NGPu             NCN, NAN and NNPy
Present stage        None                      All
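The stages of Table 4 can be sketched as a predicate on codons, combined with the tentative assumption that first base wobbling follows the same rules as Table 3. Several things here are assumptions of the sketch rather than statements of the paper: the stage labels are plain strings, anticodon position i is taken to pair with codon position i, a codon not subject to FBW is taken to require strict pairing in its first base, and the function names are illustrative.

```python
WATSON_CRICK = {"U": "A", "C": "G", "A": "U", "G": "C"}
# Table 3 rules, applied here to both the first and the third position.
WOBBLE = {"U": "AG", "C": "G", "A": "U", "G": "UC", "I": "UCA"}


def subject_to_fbw(codon, stage):
    """Table 4: may the first base of this codon wobble at the given stage?"""
    central_u_or_g = codon[1] in "UG"
    third_is_purine = codon[2] in "AG"
    if stage == "general":
        return True
    if stage == "(UG)":
        return central_u_or_g
    if stage == "(UG)Pu":
        return central_u_or_g and third_is_purine
    if stage == "present":
        return False
    raise ValueError(f"unknown stage: {stage}")


def codons_read(anticodon, stage):
    """Codons read by an anticodon at a given stage of first base wobbling."""
    a1, a2, a3 = anticodon
    found = []
    for c1 in WOBBLE[a1]:
        for c3 in WOBBLE[a3]:
            codon = c1 + WATSON_CRICK[a2] + c3
            # Inosine has no strict partner, so .get() yields None for it.
            strict_first_base = c1 == WATSON_CRICK.get(a1)
            if strict_first_base or subject_to_fbw(codon, stage):
                found.append(codon)
    return found


print(codons_read("GAU", "(UG)Pu"))   # first-base G wobbles onto U and C
print(codons_read("GAU", "present"))  # only the strict first-base pairing is left
print(codons_read("ICU", "(UG)Pu"))   # first-base I pairs with U, C and A
print(codons_read("ICU", "present"))  # no codon left for this anticodon
```

In this sketch the anticodon GAU reads UUA, UUG, CUA and CUG in the (UG)Pu-stage but only CUA and CUG in the present stage, and an anticodon with I in the first base reads nothing once FBW is removed. The anticodon UAC reads both AUG and GUG under the (UG)Pu rules, which is the behaviour still reported for formylmethionine-tRNA.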

If we assume that the last stage of first base wobbling before the present was the (UG)Pu-stage described in Table 4, rules (5) and (6) presented in the introduction can be interpreted as a consequence of the wobbling process by assuming minimal changes in the genetic code, as indicated in Table 5(a) and (b). In both parts the common translation of the codons UUPu and CUPu [rule (5)] is ascribed to the still existing leucine-tRNA anticodons GAPy and GAI (with or without the assistance of an earlier, now extinct, tRNA with anticodon IAPy), which according to the FBW rules were able to pair with UUPu codons. Similarly, the common translation of the codons CGPu and AGPu [rule (6)] is ascribed to an earlier arginine-tRNA, now extinct, with anticodon ICPy. The other changes are rigorous consequences of the wobbling rules applied, if we assume that there were no cases in which the same triplet coded for two different amino acids and that the only new tRNA molecules introduced in the transition to the present stage were those (methionine and tryptophan) filling codon positions left empty by the elimination of FBW.

By comparing Table 5(a) and (b) with Table 1 we notice that the FBW-hypothesis requires only moderate changes in the genetic code during the transition from the (UG)Pu-stage to the present stage. The only changes concern the four codons UGA, UGG, AUA and AUG. The first two supposedly coded for arginine in the (UG)Pu-stage, whereas they respectively represent a nonsense triplet and tryptophan in the present stage. The last two, AUA and AUG, are assumed either to code for leucine [Table 5(b)] or to be nonsense and/or start triplets for polypeptide initiation [Table 5(a)] in the (UG)Pu-stage, whereas today AUA codes for isoleucine and AUG for methionine or for the polypeptide initiation amino acid formylmethionine.

TABLE 5

Reconstructed genetic code for the (UG)Pu-stage. Possible tRNA anticodons are indicated in parentheses as an interpretation. Underlined anticodons (marked here with an asterisk) are those belonging to tRNA molecules replaced or removed in the transition to the next stage

Base 1    Base 2                                                                                           Base 3
          U                            C                  A             G

(a)

U         Phe (AAPu)                   Ser (AGPu, AGI)    Tyr (AUPu)    Cys (ACPu)                    Py
          Leu (AAPy, GAPy*, GAI*)      Ser (AGPy, AGI)    Nonsense      Arg (GCPy*, GCI*, ICPy*)      Pu
C         Leu (GAPu, GAI)              Pro (GGPu, GGI)    His (GUPu)    Arg (GCPu, GCI)               Py
          Leu (GAPy, GAI)              Pro (GGPy, GGI)    Gln (GUPy)    Arg (GCPy, GCI, ICPy*)        Pu
A         Ile (UAPu)                   Thr (UGPu, UGI)    Asn (UUPu)    Ser (UCPu)                    Py
          Nons. or Start               Thr (UGPy, UGI)    Lys (UUPy)    Arg (ICPy*)                   Pu
G         Val (CAPu, CAI)              Ala (CGPu, CGI)    Asp (CUPu)    Gly (CCPu, CCI)               Py
          Val (CAPy, CAI)              Ala (CGPy, CGI)    Glu (CUPy)    Gly (CCPy, CCI)               Pu

(b)

U         Phe (AAPu)                   Ser (AGPu, AGI)    Tyr (AUPu)    Cys (ACPu)                    Py
          Leu (AAPy, IAPy*, etc.)      Ser (AGPy, AGI)    Nonsense      Arg (GCPy*, GCI*, ICPy*)      Pu
C         Leu (GAPu, GAI)              Pro (GGPu, GGI)    His (GUPu)    Arg (GCPu, GCI)               Py
          Leu (GAPy, GAI, IAPy*)       Pro (GGPy, GGI)    Gln (GUPy)    Arg (GCPy, GCI, ICPy*)        Pu
A         Ile (UAPu)                   Thr (UGPu, UGI)    Asn (UUPu)    Ser (UCPu)                    Py
          Leu (IAPy*)                  Thr (UGPy, UGI)    Lys (UUPy)    Arg (ICPy*)                   Pu
G         Val (CAPu, CAI)              Ala (CGPu, CGI)    Asp (CUPu)    Gly (CCPu, CCI)               Py
          Val (CAPy, CAI)              Ala (CGPy, CGI)    Glu (CUPy)    Gly (CCPy, CCI)               Pu

In this connection it is worth mentioning that the two rare amino acids tryptophan and methionine are considered by Crick (1968) and others as later additions to the genetic code. These are the only two present amino acids missing in the two versions of the genetic code for the (UG)Pu-stage described in Table 5(a) and (b). Both versions consist of 18 amino acids and a few nonsense and/or start codons.

We shall not attempt, at this stage, to suggest any particular way in which the genetic code and the tRNA molecules involved may have evolved from the (UG)Pu-stage to the present stage. In order to make a meaningful suggestion to this effect one could attempt to investigate the relatively few known cases of organisms which use different interpretations of the genetic code. The fact that there are such organisms using, for example, antibiotic agents which modify the interpretation of the genetic code makes it hard to rule out the possibility that some evolution of the genetic code might still be going on, at least among microorganisms† (Likover & Kurland, 1967; Gorini, 1974; notice the tendency of streptomycin to restore earlier interpretations of the codon UUU, see Tables 6 and 7). We shall instead discuss the possibility of finding some experimental tests for the theory which, if successful, could throw light on the problem (see next section).

† In spite of the general standstill presently reached in the evolution of the genetic code, the universality of this code might not be only the result of a "frozen accident" (Crick, 1968). The exchange between different species of genetic information mediated by episomes and viruses (including genetic information adaptive to various antibiotics affecting the genetic code) might also have contributed to its universality.

3. The Question of Finding Experimental Confirmation of the FBW-theory

Observational predictions which can be made by an evolutionary theory (such as a theory dealing with the past evolution of the genetic code) always relate to properties of living (or earlier living) organisms which have been acquired in the past. Very often it is uncertain whether organisms (or fossils) with the predicted properties still exist. Nevertheless, successful predictions of future discoveries, relating to prehistoric events which were unknown at the time when the prediction was made, have many times been accomplished by evolutionary theories. Such predictions, if successful, are a valid test of the theory, which can be as good as other experimental predictions.


We find no reason for refraining from presenting the following observational predictions of the FBW-theory, even though the probability of finding organisms with the predicted properties is uncertain at best.

(1) If the FBW-theory is correct there is a possibility of finding organisms which do not use tryptophan in any of their proteins. The tryptophan-tRNA, if present at all in the environment in which the organisms develop, may be used for other purposes, but not for carrying tryptophan to the ribosomes which produce their proteins. It might even be possible to find organisms in which the tryptophan codon UGG, and perhaps even the nonsense codon UGA, may still be used as arginine codons [see Table 5(a) and (b)].

(2) Similarly, one might find organisms which do not use methionine (except possibly formylmethionine) in their proteins. It is possible that some of them may still interpret the methionine codon AUG as a leucine or an isoleucine codon.

(3) One may find organisms in which the substitution of tryptophan by arginine and/or the substitution of methionine by leucine or isoleucine in their proteins will never be lethal, even in cases in which most other substitutions of the same amino acids would be lethal.

One of the organisms in which it might be interesting to look for this kind of phenomenon is the Rous Sarcoma Virus. According to Maugh (1974) tryptophan-tRNA is used in this organism as a primer for inverse transcription of RNA into DNA. According to other experiments with the same virus carried out by Bishop's group (Cordell, Stavnezer, Friedrich, Bishop, & Goodman, 1976) the primer is methionine-tRNA (Bishop, personal communication). This unusual function of tryptophan- or methionine-tRNA does not necessarily imply that they could not also be used in a more regular or conventional fashion by the same organism.
But if we want to increase our chances of finding viruses in which these molecules are not used as regular transfer RNA, the cases to look for are those presenting irregularities in the use of tryptophan- and/or methionine-tRNA supplied by a host organism. To discover organisms adapted to a still earlier stage of the genetic code's evolution (see next two sections, Tables 6, 7 and 8), it may be worth looking among organisms producing, or resistant to, streptomycin.

4. Earlier Stages in First Base Wobbling

In the following discussion an attempt will be made to move one more step back in the evolution of the genetic code, to the stage in which wobbling was supposedly extended to all the codon triplets which had a U or a G in the central base, the so-called (UG)-stage (see Table 4). The same wobbling rules which apply in the third nucleotide will again be used in this new extension to the first nucleotide. Of course, the reservation we made in section 2, that this is the simplest assumption but not the only possible one, applies here as well. Likewise, the very separation into individual stages such as the (UG)Pu-stage, the (UG)-stage, etc., is no more than a convenient way of presenting the subject by outlining only some of the main events, and may not reflect the chronological sequence of events the way they actually occurred.

With these reservations, when we apply the wobbling rules the way they are stated, we have relatively few choices concerning the selection of a genetic code for the (UG)-stage which would be consistent with one or the other of the genetic codes for the following (UG)Pu-stage presented in Table 5(a) and (b). It seems hard to avoid the conclusion that the leucine codons CUPy must have been leucine codons also in the (UG)-stage, and that the arginine codons CGPy must have been arginine codons also in the (UG)-stage, since these codons are not among those which would have been left without a meaning by the removals of FBW characterizing the transition from the (UG)-stage to the (UG)Pu-stage. That leaves no choice concerning the UUPy codons, which according to the FBW rules applying in the (UG)-stage would have to be ascribed to the same amino acid (leucine) assigned to the CUPy codons, replacing phenylalanine in Table 5(a) and (b). Likewise, the UGPy codons would have to be ascribed to the same amino acid (arginine) assigned to the CGPy codons, replacing cysteine in Table 5(a) and (b). Moreover, according to the FBW rules, in the (UG)-stage the AUPy codons would either have to be left as nonsense codons or have to be assigned to leucine or valine, replacing isoleucine in Table 5(a) and (b).

Because of the chemical similarity between leucine and isoleucine we have preferred the alternative leucine in Table 6(a) and (b). Similarly, the AGPy codons would either have to be left as nonsense codons or have to be assigned to arginine or glycine, replacing serine in Table 5(a) and (b). We will show only two of these three possibilities in Table 6(a) and (b).

Table 6(a) and (b) shows two possible extensions of the wobbling rule to the first nucleotide of all triplets with a U or a G in the central base. Both of these extensions are based on the code presented in Table 5(b), which is used as a point of departure. Two other similar extensions could have been based on the code presented in Table 5(a), and the following comments could apply also to them.

Three amino acids (isoleucine, phenylalanine and cysteine) which were present in Table 5(a) and (b) are missing in Table 6(a) and (b), and the total number of amino acids is reduced to 15. It is however possible that isoleucine may have coexisted with leucine in some organisms and some triplets, for example in the triplets UUPy, CUPy and AUPy, which in Table 6(a) and (b) are ascribed to leucine. As pointed out by Crick (1968), there is a possibility that early genetic codes may have presented cases of more than one amino acid coded by the same codon. This possibility is particularly relevant to leucine and isoleucine which, because of their chemical similarity, might have been difficult for some of the early tRNA molecules to discriminate.

As expected, first base wobbling leads to genetic codes in which triplets which have a common U or a common G in the central base show a very small repertory of amino acids, for the same reason that wobbling in the third base leads to a low number (maximum 2) of amino acids for any group of triplets which have the first two bases in common.

TABLE 6

Reconstructed genetic code for the (UG)-stage. Only anticodon triplets different from those reported in Table 5 are indicated

Base 1    Base 2                                                   Base 3
          U             C       A        G

1st version

U         Leu (IAI)     Ser     Tyr      Arg              Py
          Leu (IAI)     Ser     Nons.    Arg              Pu
C         Leu (IAI)     Pro     His      Arg              Py
          Leu (IAI)     Pro     Gln      Arg              Pu
A         Leu (IAI)     Thr     Asn      Nons.            Py
          Leu (IAI)     Thr     Lys      Arg              Pu
G         Val           Ala     Asp      Gly              Py
          Val           Ala     Glu      Gly              Pu

2nd version

U         Leu (IAI)     Ser     Tyr      Arg (ICI)        Py
          Leu (IAI)     Ser     Nons.    Arg (ICI)        Pu
C         Leu (IAI)     Pro     His      Arg (ICI)        Py
          Leu (IAI)     Pro     Gln      Arg (ICI)        Pu
A         Leu (IAI)     Thr     Asn      Arg (ICI)        Py
          Leu (IAI)     Thr     Lys      Arg (ICI)        Pu
G         Val           Ala     Asp      Gly              Py
          Val           Ala     Glu      Gly              Pu
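The step from Table 5(b) to the second version of Table 6 can be written out as a short sketch. It encodes one reading of the argument above: with I in the first anticodon position pairing with U, C and A, every codon with a U or a G in the central base and U or A in the first base takes over the meaning of the corresponding codon with C in the first base. The leucine and arginine alternatives are chosen for AUPy and AGPy, and the names `TABLE_5B`, `parse`, `extend_fbw` and `amino_acids` are illustrative.

```python
# Table 5(b), amino acids only. Each row is one first base; the four pairs of
# entries are the second bases U, C, A, G, each given as (Py, Pu) third base.
TABLE_5B = {
    "U": "Phe Leu  Ser Ser  Tyr Nons  Cys Arg",
    "C": "Leu Leu  Pro Pro  His Gln   Arg Arg",
    "A": "Ile Leu  Thr Thr  Asn Lys   Ser Arg",
    "G": "Val Val  Ala Ala  Asp Glu   Gly Gly",
}


def parse(rows):
    """Return {(first, second, third_class): amino acid}."""
    code = {}
    for first, line in rows.items():
        cells = line.split()
        for i, second in enumerate("UCAG"):
            code[first, second, "Py"] = cells[2 * i]
            code[first, second, "Pu"] = cells[2 * i + 1]
    return code


def extend_fbw(code):
    """Move from the (UG)Pu-stage back to the (UG)-stage (2nd version).

    Codons with central U or G and first base U or A are read like the
    codon with first base C, as first-base inosine would require.
    """
    new = dict(code)
    for second in "UG":
        for third in ("Py", "Pu"):
            for first in "UA":
                new[first, second, third] = code["C", second, third]
    return new


def amino_acids(code):
    return set(code.values()) - {"Nons"}


stage_ug_pu = parse(TABLE_5B)
stage_ug = extend_fbw(stage_ug_pu)
print(len(amino_acids(stage_ug_pu)), len(amino_acids(stage_ug)))
print(sorted(amino_acids(stage_ug_pu) - amino_acids(stage_ug)))
```

The count drops from 18 to 15 amino acids, and the three that disappear are cysteine, isoleucine and phenylalanine, in agreement with the text.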

If we were to attempt another reconstruction of the genetic code one more step back in time, before the stage described in Table 6(a) and (b), we would run into major difficulties if we were to use the same straightforward method outlined in Table 4 by re-introducing a general first base wobbling. The number of possible reconstructions would be too large for comfort, and the result would do little to help us understand how the genetic codes presented in Table 6 could have originated. It would appear that a different approach will be required in order to understand the early evolution of the genetic code before the stage described in Table 6. This will be attempted in the next section. For the time being our purpose has been achieved by showing that the hypothesis of first base wobbling in primordial codons is useful as a means to find plausible interpretations of some prominent characteristics of the genetic code and to suggest some of the last steps in its evolution.

5. The Assumption of Central Base Wobbling (CBW-hypothesis) in the Earliest Genetic Codes

Instead of attempting a final extension of first base wobbling to all codons (according to Table 4, general FBW-stage), we shall explore the possibility of extending wobbling to the central base as a means of interpreting the early evolution of the genetic code. This has been done in the following Table 7, keeping all the amino acids and tRNA species present in Table 6(a) and (b). Inevitably this leads to ambiguous codons and overlapping amino acid interpretations, which could have been avoided by assuming that a few amino acids (such as leucine, valine, arginine and glycine) were not present when the elimination of central base wobbling took place. However, the ambiguities appear to be surprisingly harmless for the use of the code, as one would have expected if all or most of the quoted amino acids were already present. Moreover, all codons with a C or an A in the central base (involving 11 amino acids and one nonsense codon) present no ambiguity, for the following reasons. The introduction of wobbling in the central base, without changing either the tRNA molecules available or the amino acids carried by them, would not alter the meaning of codons carrying a C or an A in the central base since, according to the wobbling rules, C would still pair only with G, and A would still pair only with U, in the tRNA central anticodon base. However, codons with a U or a G in the central base would acquire a double meaning since, according to the wobbling rules, U can pair both with A and G, and G can pair both with U and C. If both types of tRNA molecules (A and G in central anticodon base, or U and C in central anticodon base) are present and if they carry different amino acids, codons with a U or a G in the central base can be ambiguous and may have two interpretations.

TABLE 7

Reconstructed genetic code for a primordial stage in which wobbling was general in the second and third base, and was restricted only in the first base to those codons which had a U or a G in the second base

Base 1                     Base 2                        Base 3
          U             C      A        G
U         Ser, Leu      Ser    Tyr      Tyr, Arg         Py
          Ser, Leu      Ser    Nons.    Arg              Pu
C         Pro, (Leu)    Pro    His      His, (Arg)       Py
          Pro, (Leu)    Pro    Gln      Gln, (Arg)       Pu
A         Thr, (Leu)    Thr    Asn      Asn, (Arg)       Py
          Thr, (Leu)    Thr    Lys      Lys, (Arg)       Pu
G         Ala, Val      Ala    Asp      Asp, Gly         Py
          Ala, Val      Ala    Glu      Glu, Gly         Pu

This ambiguity, which is conspicuously present in Table 7, could explain why codons with a U or G in the central base were being avoided, leaving only four different amino acids with such codons in Table 6. In Table 7 a second interpretation for codons with a U or a G in the central base is indicated where it applies. Some of them (indicated in parentheses) are dependent on the existence of anticodons with an I or G in the first base. These anticodons may have been introduced during the transition from the stage recorded in Table 7 to the stage recorded in Table 6(a) and (b) as a means to fill in the gaps left over by the removal of CBW, and may not have been present during the stage described in Table 7. There are three amino acids, leucine, valine and glycine, which are restricted to ambiguous codons. All of the other amino acids have unambiguous codons of their own in addition to the possible ambiguous ones. In most cases the unambiguous codons have a C or an A in the central base. Arginine has its own codons UGPu, which are unambiguous in spite of the G in the central base because there is no tRNA molecule with the anticodons AUPy, as indicated by the presence of the nonsense codons UAPu [see Table 6(a) and (b)]. Some of the ambiguities may not have been present in all species, or may only reflect differences between different species or populations which may have contributed to shape subsequent genetic codes by crossing or symbiotic associations. However, as pointed out by Crick (1968), it is to be expected that primordial genetic codes may have had some ambiguity.

Moreover, the arrangement we find in Table 7 has the appearance of having been expressly designed to make the ambiguity as harmless as possible. All but three of the amino acids can be coded by unambiguous triplets, so that the use of ambiguous triplets can be restricted to those cases in which the ambiguity is harmless, or even useful if both proteins are needed. Two of the remaining three amino acids have several codons which allow different alternatives, thus giving the opportunity of making a choice of the best or least harmful alternative. Each ambiguous triplet used will imply that not one, but two types of protein will be produced, differing in the amino acid coded by that triplet. One of the proteins could be useless, but seldom harmful except for the possible wastefulness implied, and in most cases a single amino acid is not critical for the protein's function. Furthermore, it is to be expected that primordial proteins may usually have been shorter than present ones and might seldom have had more than one of the few amino acids with ambiguous codons in a critical position.

Another property of the genetic code organization at the stage presented in Table 7 is expressed by the following rule, which summarizes the startling number of cases in which anticodon triplets differing only in the third base were used to represent the same amino acid.

Only the first two bases in each anticodon triplet were used to identify the amino acid, except when the central base was a U, in which case a distinction was made between the purines (A, G) and the pyrimidines (U, C) in the third base.
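The way the double meanings arise can be checked mechanically. The following is a minimal sketch, not the author's procedure: it takes the central-base pairing rules quoted above, the strictly needed tRNA species that the paper lists in Table 8 (parenthesized entries left out), and two assumptions made here for illustration, namely strict pairing in the first base and a purine in the third anticodon base reading a pyrimidine in the third codon base. All function and variable names are hypothetical.

```python
# Central-base wobble pairing as stated in the text: codon base -> anticodon bases.
CENTRAL_WOBBLE = {"C": {"G"}, "A": {"U"}, "U": {"A", "G"}, "G": {"U", "C"}}
# Assumption: strict pairing in the first base for the strictly needed tRNAs.
FIRST_STRICT = {"U": "A", "C": "G", "A": "U", "G": "C"}
# Assumption: a purine in the third anticodon base reads a pyrimidine, and vice versa.
OPPOSITE = {"Py": "Pu", "Pu": "Py"}

# Strictly needed tRNA species, keyed by (first, central) anticodon base.
# A dict value means the third anticodon base is read (central U only).
TRNA = {
    ("A", "A"): "Leu", ("A", "G"): "Ser", ("A", "C"): "Arg",
    ("A", "U"): {"Pu": "Tyr"},  # anticodon AUPy is absent
    ("G", "G"): "Pro", ("G", "U"): {"Pu": "His", "Py": "Gln"},
    ("U", "G"): "Thr", ("U", "U"): {"Pu": "Asn", "Py": "Lys"},
    ("C", "A"): "Val", ("C", "G"): "Ala", ("C", "C"): "Gly",
    ("C", "U"): {"Pu": "Asp", "Py": "Glu"},
}


def meanings(first, central, third_class):
    """Amino acids read from a codon given by its first two bases and third-base class."""
    found = set()
    for anti_central in CENTRAL_WOBBLE[central]:
        entry = TRNA.get((FIRST_STRICT[first], anti_central))
        if isinstance(entry, dict):
            entry = entry.get(OPPOSITE[third_class])
        if entry is not None:
            found.add(entry)
    return found


def code_table():
    """All 32 codon classes (first base, central base, Py/Pu third base)."""
    return {(f, c, t): meanings(f, c, t)
            for f in "UCAG" for c in "UCAG" for t in ("Py", "Pu")}


def restricted_to_ambiguous(table):
    """Amino acids that never appear as the only reading of a codon class."""
    everything = set().union(*table.values())
    with_clean_codon = {next(iter(m)) for m in table.values() if len(m) == 1}
    return everything - with_clean_codon


if __name__ == "__main__":
    table = code_table()
    for key in sorted(table):
        print(key, sorted(table[key]) or ["nonsense"])
    print("restricted to ambiguous codons:", sorted(restricted_to_ambiguous(table)))
```

Under these assumptions every codon class with a C or an A in the central base has at most one reading, and the only amino acids left without an unambiguous codon are leucine, valine and glycine, in line with the text.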

This rule is easier to observe in Table 8, which shows the tRNA species required to interpret the genetic code of Table 7. Each tRNA species is identified by its anticodon, ignoring the third base, except in those anticodons which had a U in the central base. We do not know whether this phenomenon was the result of an obligatory inclusion of tRNA specimens with an I (inosine) in the third base among those tRNAs which did not have a U in the central base, or whether the third base was actually ignored in those anticodons during the translation process. The idea that in primordial genetic codes only the first two bases in each codon or anticodon may have been used has already been mentioned by Crick (1968). Our interpretation is consistent with this idea except in the case in which the central anticodon base was a U. Whatever the origin or the cause of these restrictions, it is clear that they must have imposed a severe limitation on the number of codons available for amino acid identification. If we keep in mind these restrictions we find that Table 7 represents a very efficient way of coding for 15 different amino acids plus a nonsense word with a minimum number and severity of overlapping codons, as if the number of codons were in short supply. Indeed it was.

TABLE 8

tRNA-species (anticodon) needed to interpret the genetic code presented in Table 7. The 3rd base is specified only in those anticodons which have a U in the 2nd base. In parentheses are indicated those tRNA which were not strictly needed, and might or might not have been present

Base 1               Base 2                 Base 3 (for central U only)
          A        G      U      C
I         (Leu)    -      -      (Arg)
A         Leu      Ser    Tyr    Arg        Pu
                          -                 Py
G         (Leu)    Pro    His    (Arg)      Pu
                          Gln               Py
U         -        Thr    Asn    -          Pu
                          Lys               Py
C         Val      Ala    Asp    Gly        Pu
                          Glu               Py

With the restrictions imposed by being able to use only the first two bases except when the central anticodon base was a U, and by an almost general wobbling system with the only exception presented by the first base in those anticodons in which the central base was a U or a G, the total number of unambiguous codons available was 12, plus one nonsense codon which could be retrieved by leaving out tRNA molecules with the anticodons AUPy (see Table 8). The omission of this anticodon, which according to the wobbling rules would have paired with both UAPu and UGPu, made it possible to discriminate between the nonsense codons UAPu and the arginine codons UGPu. The additional three amino acids could in no way have been accommodated without overlapping. The very efficiency of the primordial code given in Table 7 is an important argument in support of the theory. Additional support is obtained by looking into the evolutionary aspect of the theory, its interpretation of the selective pressures operating, and its strict adherence to the continuity principle.

In spite of its efficiency, or exactly because of it, the stage of evolution described in Tables 7 and 8 must be considered as a transitory or temporary solution of the problem of representing an increasing number of amino acids with the codons available. The scarcity of codons and the ambiguities it implied must have created a growing selective pressure for an increase in the number of codons. The simplest way to increase the number of codons at the stage described in Tables 7 and 8 would be an elimination of wobbling in the central base. There are two main problems related to the elimination of wobbling in the central base.


(1) Any place in which an ambiguous central base (U or G) is used by a messenger RNA molecule instead of the corresponding unambiguous base (A or C, see Table 2) in order to code for the same amino acid would no longer serve its purpose.

(2) Before the creation of tRNA molecules with I’s and G’s in the first anticodon base, the elimination of central base wobbling could have created several new nonsense codons of the types AUX, CUX, AGX, CGX, thus producing unwanted breaks in the protein molecules.

The first problem would partly have been solved in advance by the selective pressure favouring the use of unambiguous codons in those places in which the use of the proper amino acid could have been critical. The second problem might require the simultaneous or prior introduction of the tRNA molecules with an I in the first anticodon base, IAX (Leu) and ICX (Arg) (see Table 8). To start with, these added tRNA molecules may introduce other ambiguities involving the codons AUX, CUX, AGX, CGX (Table 7). But at the same time their introduction would create a selective pressure for the avoidance of these new ambiguous codons and, most important, it would avoid the introduction of nonsense triplets and the ensuing breaks in the protein molecules even in those cases in which the ambiguous codons are not avoided. There seem to be no other major problems, and the stage described in Tables 7 and 8 (hereafter designated as the “late central base wobbling stage”) appears to be an excellent starting point for the elimination of wobbling in the central base, carrying evolution to the stage described in Table 6(a) and (b).
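The second problem can be illustrated with a short enumeration. This is a sketch under stated assumptions rather than a reconstruction of the author's reasoning: it considers only the first two bases of each codon, takes the strictly needed tRNA species of Table 8, imposes strict pairing in the central base, and assumes that an I in the first anticodon base reads U, C or A in the first codon base (the usual wobble rule for inosine; Table 3 is not reproduced here). The names are hypothetical.

```python
# Strict pairing in the central base after the elimination of central base wobbling.
STRICT = {"U": "A", "C": "G", "A": "U", "G": "C"}
# Codon first bases read by each first anticodon base.
# Assumption: ordinary bases pair strictly, I reads U, C and A.
FIRST_READS = {"A": "U", "G": "C", "U": "A", "C": "G", "I": "UCA"}

# Strictly needed tRNA species of Table 8, written as first + central anticodon base.
BASE_TRNAS = {
    "AA": "Leu", "AG": "Ser", "AU": "Tyr", "AC": "Arg",
    "GG": "Pro", "GU": "His/Gln",
    "UG": "Thr", "UU": "Asn/Lys",
    "CA": "Val", "CG": "Ala", "CU": "Asp/Glu", "CC": "Gly",
}
# The two inosine tRNAs discussed in the text.
INOSINE_TRNAS = {"IA": "Leu", "IC": "Arg"}


def readers(codon, trnas):
    """Amino acids read from the first two codon bases under strict central pairing."""
    first, central = codon
    return {aa for anti, aa in trnas.items()
            if first in FIRST_READS[anti[0]] and anti[1] == STRICT[central]}


def unread_doublets(trnas):
    """Codon families (first two bases) that no tRNA in the set can read."""
    return {f + c for f in "UCAG" for c in "UCAG" if not readers(f + c, trnas)}


if __name__ == "__main__":
    print(sorted(unread_doublets(BASE_TRNAS)))
    print(sorted(unread_doublets({**BASE_TRNAS, **INOSINE_TRNAS})))
```

With the base set alone, the unread families are AUX, CUX, AGX and CGX, the four nonsense types named in problem (2). Adding IAX (Leu) and ICX (Arg) leaves no family unread in this simplified model, which ignores the third base and therefore the UAPu nonsense codons.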

6. Early Central Base Wobbling Stage

Before this late central base wobbling stage there must have been a period in which the number of amino acids in the system was gradually increasing, until the selective pressure for an increase in the number of codons reached the breaking point described above. Some inkling about the last amino acids which might have been introduced in the system can be obtained in part from the structure of Tables 7 and 8, and in part from side information obtained from other sources helping to identify those amino acids which are supposed to be among the first ones adopted by living organisms. It would seem that a sound criterion for the reconstruction of an earlier stage in the evolution of the genetic code, to be designated as the “early central base wobbling stage”, should consist in the elimination of as many as possible of the ambiguities in the code described in Table 7 by removing a minimum of amino acids. Moreover, whenever a choice between two alternatives is possible, one should always prefer the alternative which does not imply the removal of amino acids such as glycine, alanine, serine and aspartic acid, which are supposed to be among the first ones adopted by living organisms (Crick, 1968).

With these criteria we have attempted to reconstruct an earlier genetic code, described in Table 9, and its tRNA anticodons, described in Table 10. It is obvious, of course, that the further we go back in the past, the more uncertain our reconstructions are going to be. But the technique employed in the reconstruction of early genetic codes is expected to be helpful in achieving more accurate reconstructions in the future whenever more information becomes available.

In the code described in Table 9 and Table 10 the amino acids leucine and arginine, which caused the largest number of ambiguities in Table 7, are either not included or restricted to codons where they produce little ambiguity. Moreover, we have excluded the amino acid valine, rather than alanine, and the amino acid glutamic acid, rather than glycine, because alanine and glycine are supposed to be among the first amino acids adopted by living organisms (see above). At this stage the genetic code may have included only 11 or possibly 12 amino acids and few nonsense triplets. In order to avoid ambiguities in the codons of Table 9 it is necessary to assume that tRNA molecules with a central C-base in their anticodons are missing (see Table 10), since the third base is ignored in anticodons with a central C but not in those with a central U, and part of the codons with a central G (which can pair both with C and U) would become ambiguous otherwise.

TABLE 9

Reconstructed genetic code for the early central base wobbling stage. In parentheses a possible or transitory second interpretation of some codons (see Table 10) is indicated

Base 1            Base 2                  Base 3 (for central Pu only)
          Py     A        G
U         Ser    Tyr      Tyr, (Arg)      Py
                 Nons.    (Arg)           Pu
C         Pro    His      His             Py
                 Gln      Gln             Pu
A         Thr    Asn      Asn             Py
                 Lys      Lys             Pu
G         Ala    Asp      Asp             Py
                 Gly      Gly             Pu


This is as far as we shall go in our attempts to reconstruct genetic codes by this method. Earlier genetic codes may have been different in different species or populations, all of which may have contributed to shape subsequent genetic codes by crossing or symbiotic associations. At this stage wobbling was general in the central base and was restricted only in the first base to those codons which had a U or a G in the central base. The first base was the only one in which a discrimination had been introduced between all of the four nucleotides in those codons which had a C or an A in the central base (C and A being the only two bases which, under the wobbling rules, can pair with and select only one anticodon base each, thus making it possible to avoid a premature elimination of first base wobbling in more triplets than wanted).

TABLE 10

tRNA-species (anticodons) needed to interpret the genetic code of Table 9. In order to avoid ambiguities in the genetic code of Table 9 no tRNA-species with a C in the central anticodon base are included, except a possible arginine anticodon of the type ACX, which is indicated in parentheses and may cause the arginine alternatives indicated in parentheses in Table 9

Base 1           Base 2              Base 3 (for central U only)
          Pu     U      C
A         Ser    Tyr    (Arg)        Pu
                 -                   Py
G         Pro    His    -            Pu
                 Gln                 Py
U         Thr    Asn    -            Pu
                 Lys                 Py
C         Ala    Asp    -            Pu
                 Gly                 Py

7. The Earliest Codes

We shall briefly outline some of the major steps in the evolution of the genetic code before the stage described in Tables 9 and 10, without attempting to reconstruct specifically the genetic codes developed in the successive stages in different populations. The earliest genetic codes may have used only the first two bases of each triplet to identify an amino acid. At that time purine-pyrimidine pairing (or wobbling) may have been common for all of the bases in each codon. With this arrangement no more than four unambiguous codons could have been available for amino acid identification, plus one (or several) nonsense codon(s). If more than four amino acids were being used by the same species, some of them must have had overlapping or ambiguous codons. An increase of the number of codons was probably not achieved before a possibility was created for a hereditary fixation of at least one individual nucleotide, for example U in the central anticodon base.

We shall describe one among several possible ways in which an increase of the number of codons may have started. Let us assume that some of the primitive organisms had reached a stage in which they were able to secure, either by synthesizing or by preying on other organisms, a regular supply of uracil and possibly one or several purines, but were not able to secure a regular supply of cytosine. Under these conditions U may have been the only pyrimidine regularly used by these species in their RNA, except by accident. As a result it would have been possible for these organisms to develop a discrimination between tRNA molecules with a U in their central anticodon base and those without a U in that position, in spite of the prevailing wobbling rules which, if other pyrimidines had been available, could have allowed a replacement of U. Once this discrimination was established and had been used to increase the number of codons available, by introducing a system which allowed a distinction between purines and pyrimidines (or U’s) in the third base for those tRNA molecules which had a U in the central anticodon base, there would be a selective pressure preventing the reintroduction of C’s in the central nucleotide, because of the ambiguities that would ensue. This selective pressure may for a long time have prevented the introduction of central C’s in the anticodon triplets (see Table 10).
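The limit of four unambiguous codons, and the gain obtained once a central U in the anticodon can be told apart, can be counted by simple enumeration. This is only an illustrative sketch: the class labels and function names are hypothetical, and the model assumes that each base is read merely as a purine or a pyrimidine unless stated otherwise.

```python
from itertools import product

CLASSES = ("Pu", "Py")


def general_pairing_codons():
    """Codon classes when only the first two bases count and each is read as Pu or Py."""
    return list(product(CLASSES, CLASSES))


def central_u_anticodons():
    """Anticodon classes once a central U is distinguished from a central purine."""
    classes = []
    for first in CLASSES:
        # Central purine: the third base is ignored.
        classes.append((first, "Pu", None))
        # Central U: the third base is read as purine or pyrimidine.
        for third in CLASSES:
            classes.append((first, "U", third))
    return classes


if __name__ == "__main__":
    print(len(general_pairing_codons()), "classes under general Pu-Py pairing")
    print(len(central_u_anticodons()), "classes with central U discrimination")
```

No anticodon class with a central C appears in the second enumeration, which mirrors the selective pressure against central C's described above.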
At this stage, even if C or some other pyrimidines should become available besides U, the number of unambiguous codons which could be used for amino acids would be six plus one or several nonsense triplets, as suggested by the following Table 11(a) and (b).

TABLE 11

Example of hypothetic six amino acids code (a) and anticodon table (b) during Pu-Py pairing (general wobbling) stage. AA1, AA2, etc., designate unspecified amino acids. In parentheses are indicated those triplets which may have been missing (anticodons) or replaced by nonsense triplets (codons)

(a) Codons

Base 1        Base 2           Base 3
          Py       Pu
U         AA1      AA2         Py
                   AA3         Pu
C         (AA1)    (AA2)       Py
                   (AA3)       Pu
A         (AA4)    (AA5)       Py
                   (AA6)       Pu
G         AA4      AA5         Py
                   AA6         Pu

(b) Anticodons

Base 1            Base 2              Base 3
          Pu       U        C
A         AA1      AA2      -         Pu
                   AA3                Py
G         (AA1)    (AA2)    -         Pu
                   (AA3)              Py
U         (AA4)    (AA5)    -         Pu
                   (AA6)              Py
C         AA4      AA5      -         Pu
                   AA6                Py

From a six amino acid stage like the one described in the example of Table 11, the evolution towards the following stages described earlier required the introduction of a system which could not only secure a regular supply of all of the four nucleotides U, C, A and G, but also discriminate between them in the hereditary information. It may be worth while inquiring whether the introduction of DNA as carrier of hereditary information may have served this discriminative function. The use of thymine in the DNA molecule instead of uracil may have avoided the ambiguity resulting from the ability of uracil to pair not only with A but also with G in the wobbling system. The first DNA polymers may have acted as a sort of virus parasites, gradually evolving into DNA virus symbionts of the primitive organisms until they subsequently took over control of the main hereditary apparatus. After these steps had been carried out, the stage was set for a gradual introduction of a similar discrimination between all of the four nucleotides also in the genetic code. Some of the main conditions necessary for a step-by-step evolution from the six amino-acid stage to the 12 amino-acid stage described in Tables 9 and 10 had been fulfilled.

8. Wobbling and the Early Evolution of Nucleotide Pairing

Perhaps the most important implication of the wobbling phenomenon is that the pairing rules between nucleotides that we are used to considering normal today may not always have been so in the past. This is obvious as far as the codon-anticodon pairing is concerned, since wobbling seems to have been common both in the first and third base and may well have been a common phenomenon for all the bases at an early stage. There is, however, a possibility, supported by some lines of evidence, that pairing rules different from those prevailing today and possibly more similar to the wobbling rules may have been applied also in the RNA copying process in the early stages of biologic evolution. Some of the supporting evidence will be presented below by showing the advantages such pairing rules would have at the very beginning of the biologic evolution process. Other supporting evidence is derived from the mutagenic effect of base-analogs, producing copying mistakes (transitions) caused by the tendency of many purines to pair with either one of the two pyrimidines U or C, and the tendency of many pyrimidines to pair with either one of the two purines A or G. This suggests that the low frequency of similar mistakes achieved by the polymerases produced in living organisms today, when the four regular nucleotides are used instead of other bases, may be the result of a specialization obtained in a long evolution process, and may reflect a property of the polymerases rather than a property of the four RNA bases. Other supporting evidence comes from the chemico-physical consideration that forces leading to codon-anticodon pairing may from the beginning have been of the same nature as the forces leading to the copying of a polynucleotide strand by gradual association of individual nucleotides and/or short polynucleotide segments, suggesting that pairing rules would from the start have been similar in both cases. If some kind of wobbling or Pu-Py pairing rules were common for all three nucleotides in codon-anticodon pairing, the same rules may have been applied in RNA copying as well, among the primordial ancestors of present-day organisms. If we look at Table 3 we notice that, with an exception presented by the nucleotide I which is found only in tRNA, the rules which may have been common in the early stages of biological evolution may have allowed each purine to pair with one or several pyrimidines and each pyrimidine to pair with one or several purines. In substance it may have been a kind of Pu-Py pairing rule in which no distinction was made, as far as hereditary information is concerned, between the two purines or between the two pyrimidines.
In fact, if we attempt to copy several times an RNA string containing the four nucleotides U, C, A, G by following the rules in Table 3, we find that U will in the complementary copy be faced either by an A or a G, which when the complementary strand is copied again will produce a U or a C. The same will be the result for C which can produce a U or a C when the complementary strand is copied. An A will produce an A or a G, and a G will produce an A or a G when the complementary strand is copied. As far as hereditary information is concerned we can discriminate only two types of nucleotides, purines and pyrimidines. Which purine or which pyrimidine occupies a particular spot in an RNA molecule is irrelevant as far as hereditary information is concerned and can hardly be considered an hereditary characteristic of the molecule. Only the question whether a particular spot is occupied by a purine or by a pyrimidine defines an hereditary characteristic of the molecule when the Pu-Py pairing rules presented in Table 3 are used. This would lead to a binary (Pu, Py) alphabet instead of the quaternary (U, C, A, G) alphabet used in the hereditary material of living organisms nowadays.
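The two-step copying argument can be traced base by base. The sketch below uses only the pairings stated in the text for the four regular nucleotides (U with A or G, C with G, A with U, G with U or C); the function names are hypothetical.

```python
# Pairing partners of the four regular nucleotides under the wobbling rules.
PAIRS = {"U": {"A", "G"}, "C": {"G"}, "A": {"U"}, "G": {"U", "C"}}


def base_class(base):
    """Return 'Pu' for a purine (A, G) and 'Py' for a pyrimidine (U, C)."""
    return "Pu" if base in "AG" else "Py"


def copy_twice(base):
    """Bases that can occupy a position after the strand is copied and copied back."""
    return set().union(*(PAIRS[partner] for partner in PAIRS[base]))


if __name__ == "__main__":
    for base in "UCAG":
        outcome = copy_twice(base)
        classes = {base_class(b) for b in outcome}
        print(base, "->", sorted(outcome), "class kept:", classes == {base_class(base)})
```

Each position keeps its purine or pyrimidine character through the double copy while the individual base is free to change, which is the binary (Pu, Py) alphabet described above.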


The idea that a binary alphabet may have been used in the early evolution of polynucleotide molecules is not new (cf. Crick, 1968; Orgel, 1968). But the kind of binary alphabet proposed here presents advantages which may have been of fundamental importance for the early polynucleotide chains which started the biological evolution process. The first polynucleotide molecules must have been formed in a watery environment with complex organic solvents, which we may call the Oparin lake (Oparin, 1938), in which some kinds of nucleotides were present, but were not produced and therefore not controlled by living organisms. There is no assurance that the nucleotides originally present in the Oparin lake were the same which are used by living organisms today; but some kinds of purines and some kinds of pyrimidines may have been present. However, since the production of these nucleotides was not under biological control, the abundance of each particular purine or pyrimidine may have varied greatly, and at some times or in some parts of the Oparin lake one or a few of them may have been missing completely. Under these conditions an organism which could replace one purine or pyrimidine with another one, if the former is locally or temporarily missing, would evidently have a great advantage. This is what the Pu-Py pairing rule described above can accomplish for it.

In the following examples we shall, for the sake of simplicity, use only nucleotides which are common today, but similar phenomena may have occurred even if a part or all of the original nucleotides were different from present ones, but followed some kind of purine-pyrimidine pairing rules with similar characteristics. Let us assume that at different times and/or in different parts of the Oparin lake the abundance or shortage of different nucleotides could be drastically different. We would like to find out how the outcome of a polynucleotide segment U.C.A.G.U of RNA, or some precursor, would change when reproducing in different environments where different nucleotides are present or missing. We shall only include cases in which the segment would be able to reproduce with the nucleotides available, but we will also include a few examples in which the deaminated nucleotide I is included, causing mutations by replacement of some purines with pyrimidines or vice versa. Table 12 shows some examples of changes which could have taken place in the outcome of the same polynucleotide segment in four successive generations developed in different environments in which different nucleotides were missing. It is assumed that copying was taking place according to the pairing rules (present wobbling rules) defined in Table 3 in the organisms considered.


TABLE 12

Successive copies of a primordial RNA segment in different environments in which different nucleotide species were missing (using wobbling rules of pairing)

Parental strand: U.C.A.G.U

Nucleotides missing    Complements     Progeny
A, I                   G.G.U.Py.G      Py.Py.G.G.Py
C, I                   Pu.G.U.U.Pu     U.U.Pu.Pu.U
A, C, I                G.G.U.U.G       U.U.G.G.U
A, U, I                G.G.C.C.G       C.C.G.G.C
C, G, I                A.A.U.U.A       U.U.A.A.U
A, U, G                I.I.C.C.I       C.C.I.I.C
C, U, G                I.I.A.A.I       A.A.I.I.A
C, G, I                U.U.A.A.U       A.A.U.U.A

Table 12 shows a number of outcome types derived from the same ancestral polynucleotide, U.C.A.G.U. Most of them are equivalent as far as heredity is concerned and can be considered as different adaptations of the same organism to environments where different purines and/or different pyrimidines are available. With the pairing rules used, the replacement of a purine by another purine, or the replacement of a pyrimidine by another pyrimidine, can hardly be considered an hereditary change. However, the third sequence (last column of Table 12) shows a case in which, by the interference of the nucleotide I, purines and pyrimidines are interchanged in the 4th generation progeny, resulting in a complementary structure with the wrong polarity (arrow to the right instead of left; compare 4th generation of third example with 2nd generation of second example in Table 12). If the I nucleotide, inosine, was present at a time in which reproduction was subject to the pairing rules defined in Table 3, a possibility of deriving a complementary polynucleotide with unchanged polarity from each original chain would have existed. Complementary lines of polynucleotides with the same polarity could have reproduced side by side if both lines were viable and competitive.
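The copying steps of Table 12 can be replayed with a small simulation. This is a hedged sketch, not the procedure used to build the table: the pairings of the four regular nucleotides follow the text, while the pairing of I with U, C and A is an assumption taken from the usual wobble rules, since Table 3 is not reproduced here. Each position is tracked as the set of bases that may occupy it, and all names are hypothetical.

```python
# Symmetric pairing table. The entries involving I are an assumption (see lead-in).
PARTNERS = {
    "U": {"A", "G", "I"},
    "C": {"G", "I"},
    "A": {"U", "I"},
    "G": {"U", "C"},
    "I": {"U", "C", "A"},
}


def strand(text):
    """Turn a string such as 'UCAGU' into a list of one-base possibility sets."""
    return [{base} for base in text]


def copy_strand(template, missing):
    """Possible complement of a template when the nucleotides in `missing` are absent."""
    missing = set(missing)
    result = []
    for options in template:
        # Any partner of any base that may sit at this position, if it is available.
        partners = set().union(*(PARTNERS[b] for b in options)) - missing
        if not partners:
            raise ValueError("segment cannot reproduce in this environment")
        result.append(partners)
    return result


def show(sequence):
    """Write a strand in the notation of Table 12 (Py = U or C, Pu = A or G)."""
    names = {frozenset("UC"): "Py", frozenset("AG"): "Pu"}
    return ".".join(names.get(frozenset(p), "/".join(sorted(p))) for p in sequence)


if __name__ == "__main__":
    complement = copy_strand(strand("UCAGU"), missing="AI")
    progeny = copy_strand(complement, missing="AI")
    print(show(complement), "->", show(progeny))
```

In this model an I in the template can be faced by either U or A when only those two are available; choosing A at such positions is what exchanges purines and pyrimidines in the last step of the third sequence.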


The fact that the replacement of a purine by another purine and the replacement of a pyrimidine by another pyrimidine did not constitute an hereditary change does not imply that such replacements would have had no importance for the organism's survival or competitiveness. Probably different organisms were adapted to different environments with different nucleotide compositions. The problem of outlining some of the main biological aspects of the evolution process which, starting from a stage of self-reproducing polynucleotides unable to produce the bases they needed, may have led to organisms capable of controlling their environment and producing these bases, deserves an investigation of its own, to be presented in a separate paper (Barricelli, unpublished manuscript).

The author is indebted to Professor S. Luria for valuable advice and encouragement; to Professor J. M. Bishop for unpublished information about the role of methionine tRNA as a primer for inverse transcription; and to Dr B. Thorheim and Dr W. Blix Gundersen for calling attention to the similarities between the genetic code interpretation elicited by some antibiotics such as streptomycin and some of the reconstructed earlier genetic codes presented in this paper.

REFERENCES

CORDELL, B., STAVNEZER, E., FRIEDRICH, R., BISHOP, J. M. & GOODMAN, H. M. (1976). J. Virol. 19, 548.
CRICK, F. H. C. (1966). J. molec. Biol. 19, 548.
CRICK, F. H. C. (1968). J. molec. Biol. 38, 367.
GORINI, L. (1974). Ribosomes, pp. 791-863. Cold Spring Harbor Laboratory.
LIKOVER, T. E. & KURLAND, C. G. (1967). Natn. Academy of Sci. 58, 2385.
MAUGH, II, T. H. (1974). Science, N.Y. 185, 41.
OPARIN, A. I. (1938). The Origin of Life. (Translated with annotations by S. Morgulis.) New York: Macmillan.
ORGEL, L. E. (1968). J. molec. Biol. 38, 381.