Gene 435 (2009) 9–12
The close relationship between the biosynthetic families of amino acids and the organisation of the genetic code

Massimo Di Giulio a,⁎, Umberto Amato b

a Laboratory for Molecular Evolution, Institute of Genetics and Biophysics ‘Adriano Buzzati Traverso’, CNR, Via P. Castellino, 111, 80131 Naples, Napoli, Italy
b Istituto per le Applicazioni del Calcolo ‘Mauro Picone’, CNR, Sede di Napoli, Via Pietro Castellino 111, I-80131 Naples, Napoli, Italy
Article info

Article history: Received 20 October 2008; received in revised form 18 December 2008; accepted 23 December 2008; available online 14 January 2009

Keywords: Biosynthetic relationships between amino acids; Precursor–product amino acid pairs; Ancestral stages of the code; Generation of random codes

Abstract

By generating random codes and applying Fisher's exact test, we confirm that the biosynthetic families of amino acids are intimately involved in the organisation of the genetic code. This observation corroborates the coevolution theory of genetic code origin. As the amino acids belonging to the single biosynthetic families have codons that are contiguous in the genetic code, they must have entered the code itself by means of a clustering mechanism, which must clearly have been compatible with the mechanism on which this theory is based because this too envisages the clustering of biosynthetically correlated amino acids within the code. © 2009 Elsevier B.V. All rights reserved.
Abbreviations: CCS, Codon Correlation Score.
⁎ Corresponding author. Fax: +39 081 6132706. E-mail address: [email protected] (M. Di Giulio).
0378-1119/$ – see front matter © 2009 Elsevier B.V. All rights reserved. doi:10.1016/j.gene.2008.12.018

1. Introduction

Immediately after the genetic code was deciphered, numerous authors maintained that its organisation seemed to reflect the biosynthetic relationships between amino acids (Nirenberg et al., 1963; Pelc, 1965). This point of view reached its climax with the formulation of the coevolution theory of genetic code origin in 1975, which suggests that the genetic code is an imprint of the biosynthetic relationships between amino acids (Wong, 1975). That these relationships might have conditioned the origin of the genetic code was convincingly maintained by Taylor and Coates (1989) and other authors (Dillon, 1973; Miseta, 1989; Di Giulio, 1996; Di Giulio and Medugno, 2000) and, more recently, by Davis (2007, 2008). Di Giulio (2001) replied to the criticisms that were voiced against the ‘biosynthetic origin’ of the genetic code. Indeed, some authors have cast doubt on the authenticity of the correlation between the biosynthetic relationships between amino acids and the organisation of the genetic code and, in particular, on whether the amino acid pairs in a precursor–product relationship are significantly reflected in the genetic code (Amirnovin, 1997; Ronneberg et al., 2000). More recently, Rob Knight has expressed doubts about the significance of this relationship between the biosynthetic pathways connecting the amino acids and the organisation of the genetic code
(Di Giulio, 2008). Given the importance that this relationship has in the process of falsification/corroboration of the theories proposed to explain genetic code origin, we have decided to check, once again, whether the relationship between the biosynthetic pathways of amino acids and the organisation of the genetic code is real.

2. Materials and methods

Let the genetic code and the set of amino acids Ai, i = 1,…, 20 be assigned so that each amino acid occupies a specific position within the code. Let T be the set of pairs of these positions within the genetic code such that exchanges between the corresponding amino acids are possible; and let L be the set of the corresponding numbers of possible exchanges, Lj, on the basis of the genetic code structure (only single base changes are considered). Finally, let F be the set of amino acid pairs in a biosynthetic relationship, obtained by considering all the possible pairs of amino acids within the biosynthetic families (Table 1). The sum of the Lj's for a particular code constitutes the Codon Correlation Score (CCS).

Let P = (P1,…, P20) be any one permutation of the genetic code, so that the positions of the synonymous codon blocks are occupied by the amino acids A(Pi), i = 1,…, 20. We assume that histidine always occupies the same position in the genetic code because it is metabolically isolated (Taylor and Coates, 1989). The CCS relative to a permuted genetic code is calculated as the sum of the possible exchanges Lj over all the pairs of amino acids that occupy positions in the code between which changes are possible (i.e. belonging to T) and that are in a biosynthetic relationship (i.e. belonging to F). In addition to the CCS value, the number of biosynthetic relationships that are possible under the permutation P (CCS1) is also considered.
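The definitions above can be illustrated with a minimal Python sketch that counts the exchanges Lj for the unpermuted standard code and sums them into CCS and CCS1. It assumes the five families listed in Table 1, written in one-letter amino acid symbols, and builds the standard code from the usual TCAG-ordered translation string (T standing for U). All names (`CODE`, `FAMILIES`, `exchanges`) are illustrative and are not taken from the authors' Matlab software.

```python
from itertools import combinations

# Standard genetic code in the usual T, C, A, G ordering (T stands for U).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}

# Biosynthetic families as listed in Table 1 (one-letter symbols; His excluded).
FAMILIES = {
    "serine": "SGCW",
    "phosphoenolpyruvate": "FY",
    "pyruvate": "AVL",
    "aspartate": "DNTIMK",
    "glutamate": "EQRP",
}

def exchanges(x, y):
    """Number of codon pairs (one codon of x, one of y) differing at exactly one base."""
    cx = [c for c, aa in CODE.items() if aa == x]
    cy = [c for c, aa in CODE.items() if aa == y]
    return sum(1 for p in cx for q in cy
               if sum(u != v for u, v in zip(p, q)) == 1)

# Set F: all pairs of amino acids within each family.
pairs = [pair for members in FAMILIES.values()
         for pair in combinations(members, 2)]
counts = {pair: exchanges(*pair) for pair in pairs}

ccs = sum(counts.values())                       # Codon Correlation Score
ccs1 = sum(1 for n in counts.values() if n > 0)  # pairs with at least one exchange
print(len(pairs), ccs, ccs1)  # prints: 31 55 23
```

Under these assumptions the per-pair counts coincide with the entries of Table 1, which suggests that each exchange is counted once per unordered pair of codons.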
M. Di Giulio, U. Amato / Gene 435 (2009) 9–12
By generating a large number of permutations of the genetic code, we obtain an accurate estimate of the probability density of CCS, of CCS1 and of their joint distribution under the hypothesis that the code evolved at random. In the same way, we have calculated the probability density of CCS, of CCS1 and of their joint distribution for the set F′ of amino acid pairs for which there is a precursor–product relationship between amino acids (that is to say, the sum of the possible changes Lj over all the amino acid pairs belonging to F′ and occupying positions belonging to T). The simulation was performed by generating 600 million random permutations of the genetic code. The simulation software was written in the Matlab environment and is available on the web page http://www.na.iac.cnr.it/programmi/randomgeco/randomgeco.htm.

3. Results and discussion
Table 1
All the combinations of amino acid pairs relative to the five biosynthetic amino acid families (Di Giulio and Medugno, 2000)

Serine family: Ser-Gly = 2, Ser-Cys = 4, Ser-Trp = 1, Gly-Cys = 2, Gly-Trp = 1, Cys-Trp = 2
Phosphoenolpyruvate family: Phe-Tyr = 2
Pyruvate family: Ala-Val = 4, Ala-Leu = 0, Val-Leu = 6
Aspartate family: Asp-Asn = 2, Asp-Thr = 0, Asp-Ile = 0, Asp-Met = 0, Asp-Lys = 0, Asn-Thr = 2, Asn-Ile = 2, Asn-Met = 0, Asn-Lys = 4, Thr-Ile = 3, Thr-Met = 1, Thr-Lys = 2, Ile-Met = 3, Ile-Lys = 1, Met-Lys = 1
Glutamate family: Glu-Gln = 2, Glu-Arg = 0, Glu-Pro = 0, Gln-Arg = 2, Gln-Pro = 2, Arg-Pro = 4

His does not fall into this classification because it is metabolically isolated (Taylor and Coates, 1989). The numbers indicate the number of times that the amino acids of each pair are interchangeable on the basis of the genetic code structure.
3.1. Estimate of the probability with which the biosynthetic families of amino acids are distributed in the genetic code, by means of simulation

We generated random codes while maintaining invariant the allocations of synonymous codon blocks of amino acids, as in the genetic code, and permuting the 20 amino acids (amino acid permutation codes) (Di Giulio, 1989a, b). For each of these codes, we calculated the Codon Correlation Score (CCS) (Amirnovin, 1997; Di Giulio and Medugno, 2000), i.e. for each pair ij of amino acids we calculated the number of times that the amino acid i transforms into the amino acid j on the basis of the genetic code structure, considering only single base changes (Di Giulio, 1989a; Di Giulio and Medugno, 2000; Archetti, 2004). The sum of all these numbers over all the considered pairs of amino acids in a biosynthetic relationship is the CCS. For instance, for the five biosynthetic families of amino acids defined on the basis of an amino acid precursor or a non amino acid precursor (Table 1) and for the genetic code, we obtain a CCS value of 55 units (Table 1). If the biosynthetic relationships between amino acids had a true significance for the origin of the genetic code, then all the possible combinations of amino acid pairs within a biosynthetic family should be statistically significant, in that these combinations would be made between amino acids that, by definition, should possess codons that are contiguous in the genetic code (Di Giulio and Medugno, 2000). Therefore, if the biosynthetic relationships between amino acids were important in establishing genetic code origin, then this CCS value of 55 units associated with the genetic code should be obtained with a low probability from the set of amino acid permutation codes. As shown in Table 2, the probability of obtaining a CCS value that is greater than or equal to 55 is 0.0050 and is, therefore, statistically significant.
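The simulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: amino acids are permuted uniformly among the synonymous codon blocks of the standard code, histidine and the stop codons stay in place, and 20,000 random codes are drawn instead of the 600 million used in the paper. The names (`T`, `score`, `simulate`) and the seed are hypothetical and do not come from the authors' Matlab software.

```python
import random
from itertools import combinations

BASES = "TCAG"  # T stands for U
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}
ACIDS = sorted(set(AMINO) - {"*"})

# Set F: all within-family pairs of Table 1 (one-letter symbols).
FAMILIES = ["SGCW", "FY", "AVL", "DNTIMK", "EQRP"]
PAIRS = [p for fam in FAMILIES for p in combinations(fam, 2)]

def exchanges(x, y):
    """Codon pairs (one of x, one of y) that differ at exactly one base."""
    cx = [c for c, aa in CODE.items() if aa == x]
    cy = [c for c, aa in CODE.items() if aa == y]
    return sum(1 for p in cx for q in cy
               if sum(u != v for u, v in zip(p, q)) == 1)

# T[(x, y)]: exchanges between the codon blocks that x and y own in the real code.
T = {(x, y): exchanges(x, y) for x in ACIDS for y in ACIDS if x != y}

def score(block_of):
    """Return (CCS, CCS1) when amino acid a sits in the block of block_of[a]."""
    values = [T[block_of[a], block_of[b]] for a, b in PAIRS]
    return sum(values), sum(1 for v in values if v > 0)

def simulate(n_codes, seed=0):
    """Estimate P(CCS >= observed CCS) over random amino acid permutation codes."""
    rng = random.Random(seed)
    movable = [a for a in ACIDS if a != "H"]  # histidine keeps its block
    observed = score({a: a for a in ACIDS})
    hits = 0
    for _ in range(n_codes):
        shuffled = movable[:]
        rng.shuffle(shuffled)
        if score(dict(zip(movable, shuffled)))[0] >= observed[0]:
            hits += 1
    return observed, hits / n_codes

observed, p_ccs = simulate(20000)
print(observed, p_ccs)
```

With so few random codes only the CCS tail can be estimated with any precision; it should fall near the 0.0050 read from Table 2. Probabilities of the order of 10⁻⁵ would need far more samples.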
The CCS defines only one aspect of a code's capability to reflect the biosynthetic relationships between amino acids; a more important aspect is given by the number of amino acid pairs in a biosynthetic relationship that contribute to a given CCS (Di Giulio and Medugno, 2000). This number of amino acid pairs in a biosynthetic relationship, for the five amino acid biosynthetic families defined by the genetic code, is 23 (Table 1). We can therefore ask ourselves with what probability this number of amino acid pairs in a biosynthetic relationship is obtained from the set of amino acid permutation codes. As shown in Table 3, this probability is only 6.8 × 10⁻⁵.

The CCS and the number of amino acid pairs in a biosynthetic relationship, defined by the genetic code considering the five biosynthetic families of amino acids, need to be joined in order to calculate the real probability of extracting the genetic code from the set of amino acid permutation codes (Di Giulio and Medugno, 2000). This is because, considering only (i) the CCS, we count codes that have a CCS value greater than or equal to 55 units but may have fewer than 23 amino acid pairs in a biosynthetic relationship; and, considering only (ii) the number of amino acid pairs in a biosynthetic relationship, we may count codes having a CCS value of less than 55 units. Therefore, the joint distribution of these two variables should provide the real estimate of the probability of randomly extracting the genetic code from the set of amino acid permutation codes. This probability is seen to be 5.8 × 10⁻⁵.

3.2. Estimate of the probability with which the biosynthetic families of amino acids are distributed in the genetic code, using Fisher's exact test

Fisher's exact test seems to be suitable for estimating the statistical significance of the distribution of the biosynthetic families of amino acids within the genetic code (Di Giulio, 2008). As suggested by Rob Knight (Di Giulio, 2008), Fisher's exact test can estimate this probability because if we consider the Asp family, for instance, we observe from the genetic code that: (i) 5 amino acids (Ile, Met, Thr, Asn and Lys) of the Asp biosynthetic family are codified by ANN type codons (= a), while (ii) only 2 amino acids (Ser and Arg) not belonging to this family are codified by codons of the same ANN type (= b). Furthermore, (i) only 1 amino acid (Asp) belonging to the Asp family is codified by non-ANN codons (= c), whereas (ii) 12 amino acids not belonging to this family (Phe, Tyr, Cys, Trp, Leu, Pro, His, Gln, Val, Ala, Gly and Glu) are codified by non-ANN codons (= d). If we use these parameter values (a = 5, b = 2, c = 1, d = 12), Fisher's exact test yields a probability of 0.0072 that the distribution of the amino acids in the Asp biosynthetic family has been generated in the genetic code only by chance. This probability is highly significant. Table 4 reports the estimate of the probabilities for the 5 biosynthetic families (Table 1). Moreover, the calculation of the quantity −2 ln P (Table 4) makes it possible to estimate the probability of the aggregate (Fisher, 1950). Indeed, summing all the −2 ln P quantities gives χ² = 34.81 (df = 10) and P = 1.3 × 10⁻⁴.
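The calculation above can be sketched with the Python standard library alone. The sketch assumes a one-sided test (the hypergeometric tail for counts at least as large as a), which the text does not state explicitly, and it takes the a, b, c, d values of all five families from Table 4. The function names are illustrative.

```python
from math import comb, exp, log

def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the 2x2 table [[a, b], [c, d]] with fixed margins."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    return sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1)) / comb(n, row1)

def chi2_sf_even_df(x, df):
    """Upper tail of the chi-square distribution (closed form, even df only)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return exp(-half) * total

# (a, b, c, d) for each family, as given in Table 4.
TABLE4 = {
    "serine": (4, 4, 0, 12),
    "phosphoenolpyruvate": (2, 4, 0, 14),
    "pyruvate": (3, 7, 0, 10),
    "aspartate": (5, 2, 1, 12),
    "glutamate": (3, 2, 1, 14),
}

p_values = {name: fisher_one_sided(*cells) for name, cells in TABLE4.items()}

# Fisher's method: -2 * sum(ln P) follows chi-square with 2k degrees of freedom.
chi2 = -2.0 * sum(log(p) for p in p_values.values())
df = 2 * len(p_values)
combined_p = chi2_sf_even_df(chi2, df)

for name, p in p_values.items():
    print(f"{name}: P = {p:.4f}, -2 ln P = {-2 * log(p):.2f}")
print(f"chi2 = {chi2:.2f}, df = {df}, P = {combined_p:.1e}")
```

Because the sketch sums unrounded P values, its χ² may differ from the reported 34.81 in the second decimal place.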
It therefore emerges that the codons attributed by the genetic code to the amino acids in the 5 biosynthetic families are not random but reflect these families in that their current distribution within the genetic code would be obtained by chance alone with a very low probability.

3.3. The statistical non-significance in the genetic code of a particular set of amino acid pairs in a biosynthetic precursor–product relationship and its interpretation

These observations clearly point out that the biosynthetic relationships between amino acids were important in genetic code origin. However, the coevolution theory of genetic code origin (Wong, 1975) is based on the notion of amino acids in a precursor–product relationship, and this has been criticised (Amirnovin, 1997; Ronneberg et al., 2000). We have already analysed this controversial point elsewhere (Di Giulio, 1999, 2001; Di Giulio and Medugno, 2000). Here
Table 2
CCS value and relative probability obtained by generating 600 million random codes

CCS  Probability
0    1
1    1.000000e+000
2    1.000000e+000
3    9.999999e−001
4    9.999995e−001
5    9.999981e−001
6    9.999938e−001
7    9.999814e−001
8    9.999522e−001
9    9.998814e−001
10   9.997390e−001
11   9.994425e−001
12   9.989188e−001
13   9.979562e−001
14   9.964247e−001
15   9.938586e−001
16   9.901341e−001
17   9.843973e−001
18   9.767309e−001
19   9.657325e−001
20   9.520826e−001
21   9.336629e−001
22   9.122552e−001
23   8.848599e−001
24   8.548269e−001
25   8.180483e−001
26   7.797487e−001
27   7.347016e−001
28   6.898858e−001
29   6.389989e−001
30   5.904153e−001
31   5.369594e−001
32   4.877780e−001
33   4.352306e−001
34   3.884948e−001
35   3.398588e−001
36   2.979857e−001
37   2.554859e−001
38   2.200197e−001
39   1.848242e−001
40   1.563660e−001
41   1.286806e−001
42   1.069762e−001
43   8.623422e−002
44   7.048268e−002
45   5.565436e−002
46   4.474116e−002
47   3.460283e−002
48   2.737667e−002
49   2.072807e−002
50   1.614853e−002
51   1.196053e−002
52   9.171537e−003
53   6.640698e−003
54   5.015845e−003 *
55   3.543572e−003
56   2.635513e−003
57   1.813780e−003
58   1.330300e−003
59   8.924083e−004
60   6.463233e−004
61   4.224900e−004
62   3.024033e−004
63   1.931633e−004
64   1.373200e−004
65   8.547833e−005
66   6.035500e−005
67   3.674333e−005
68   2.601000e−005
69   1.556000e−005
70   1.086500e−005
71   6.386667e−006
72   4.441667e−006
73   2.505000e−006
74   1.711667e−006
75   8.933333e−007
76   6.133333e−007
77   3.066667e−007
78   2.433333e−007
79   1.150000e−007
80   8.666667e−008
81   3.833333e−008
82   2.666667e−008
83   1.500000e−008
84   1.500000e−008
85   3.333333e−009
86   3.333333e−009
87   0

In order to read the probability of obtaining a value greater than or equal to a specific CCS value, it is necessary to take the probability value relative to CCS − 1. The entry marked with an asterisk (CCS = 54) gives the probability of obtaining a CCS value greater than or equal to 55.
Table 3
Values of the number of amino acid pairs in a biosynthetic relationship per code and relative probability, obtained by generating 600 million random codes

Biosynthetic pair number  Probability
0    1
1    1.000000e+000
2    9.999994e−001
3    9.999902e−001
4    9.998884e−001
5    9.991374e−001
6    9.953792e−001
7    9.820917e−001
8    9.473310e−001
9    8.772985e−001
10   7.651148e−001
11   6.182170e−001
12   4.575374e−001
13   3.079486e−001
14   1.875907e−001
15   1.030945e−001
16   5.095104e−002
17   2.259470e−002
18   8.968927e−003
19   3.175425e−003
20   1.002875e−003
21   2.797617e−004
22   6.790500e−005 *
23   1.422500e−005
24   2.360000e−006
25   3.033333e−007
26   4.500000e−008
27   3.333333e−009
28   0

In order to read the probability of obtaining a value greater than or equal to a specific number of amino acid pairs in a biosynthetic relationship per code (n), it is necessary to take the probability value relative to n − 1. The entry marked with an asterisk (n = 22) gives the probability that a biosynthetic pair number of 23 or more is obtained.

we aim only to discuss a particular set of amino acid pairs in a precursor–product relationship, for which the coevolution theory does not appear to be statistically significant. This set (Table 5) is obtained from the biosynthetic relationships shown in Fig. 1 of Taylor and Coates (1989), considering only the amino acids that are in a certain and undoubted precursor–product relationship. For this set (Table 5), which has a value CCS = 14, we obtain P = 0.34 that this or greater values are obtained by chance from the set of amino acid permutation codes. Likewise, for the number of amino acid pairs in a precursor–product relationship, we observe in the genetic code that, out of 11 expected pairs, only 6 are contiguous (Table 5), which gives P = 0.23. Finally, the combination of CCS with the number of pairs in a precursor–product relationship provides P = 0.20. All these probabilities are non-significant and, strictly speaking, would falsify the coevolution theory because these 6 pairs in a precursor–product
relationship (Table 5) would be easily obtained in the genetic code by chance alone. Clearly, these probabilities partly justify the criticisms of Amirnovin (1997) and Ronneberg et al. (2000). The coevolution theory responds to the non-significance in the genetic code of these pairs of amino acids in a precursor–product relationship (Table 5) by suggesting that the pairs not observed in the genetic code refer only to Asp and Glu because the latter were codified late in the genetic code, respectively by the codons AAY and CAR, which today codify for Asn and Gln. This postulate removes the majority of the non-contiguities between the amino acids in a precursor–product relationship (Wong, 1975). That this postulate is most likely correct is suggested by the existence of the pathways Asp-tRNA^Asn → Asn-tRNA^Asn (Curnow et al., 1996) and Glu-tRNA^Gln → Gln-tRNA^Gln (Schon et al., 1988), which are molecular fossils of the codon concession mechanism from the precursor to the product amino acids, as predicted by the coevolution theory (Wong, 1975, 1976, 2005, 2007; Wachtershauser, 1988; Danchin, 1989; de Duve, 1991; Di Giulio, 2002, 2008). Therefore, the fact that this set of precursor–product pairs gives a non-significant probability is explained by the very behaviour of Asp and Glu, which are also involved in the transformations taking place on tRNAs, for the very reason that the codons AAY and CAR might have codified for Asp and Glu in the late phase of genetic code evolution. This would not only justify the postulate of the coevolution theory but would also appear to be a necessity (Di Giulio, 2008). In short, the existence of these biosynthetic pathways on the tRNAs involving Asp and Glu and the absence of the precursor–product amino acid pairs again involving just these two amino acids is not random.

In conclusion, the coevolution theory can fully justify the statistical non-significance of this particular set of amino acid pairs in a precursor–product relationship, by exploiting the very existence of biosynthetic pathways on tRNAs, which are the most characteristic and corroborative observation regarding this theory (Wong, 1975, 1976; Wachtershauser, 1988; Danchin, 1989; de Duve, 1991; Morowitz, 1992; Edwards, 1996; Di Giulio, 2002, 2008).
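The three probabilities quoted in this section for the precursor–product set of Table 5 (P = 0.34, 0.23 and 0.20) can be approximated with a short Python sketch. It assumes the same permutation model as Section 3.1 (amino acids permuted uniformly among codon blocks, histidine fixed), only 20,000 random codes, and an arbitrary seed; the names are illustrative and the estimates are subject to sampling error.

```python
import random

BASES = "TCAG"  # T stands for U
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}
ACIDS = sorted(set(AMINO) - {"*"})

# Set F': the 11 precursor-product pairs of Table 5 (one-letter symbols).
PAIRS = [("S", "G"), ("S", "C"), ("S", "W"), ("D", "N"), ("D", "T"),
         ("D", "M"), ("D", "K"), ("T", "I"), ("E", "Q"), ("E", "R"),
         ("E", "P")]

def exchanges(x, y):
    """Codon pairs (one of x, one of y) that differ at exactly one base."""
    cx = [c for c, aa in CODE.items() if aa == x]
    cy = [c for c, aa in CODE.items() if aa == y]
    return sum(1 for p in cx for q in cy
               if sum(u != v for u, v in zip(p, q)) == 1)

T = {(x, y): exchanges(x, y) for x in ACIDS for y in ACIDS if x != y}

def score(block_of):
    """Return (CCS, number of contiguous pairs) for a placement of amino acids."""
    values = [T[block_of[a], block_of[b]] for a, b in PAIRS]
    return sum(values), sum(1 for v in values if v > 0)

observed = score({a: a for a in ACIDS})  # expected from Table 5: (14, 6)

rng = random.Random(1)
movable = [a for a in ACIDS if a != "H"]  # histidine keeps its block
n_codes = 20000
ge_ccs = ge_pairs = ge_both = 0
for _ in range(n_codes):
    shuffled = movable[:]
    rng.shuffle(shuffled)
    ccs, n_pairs = score(dict(zip(movable, shuffled)))
    ge_ccs += ccs >= observed[0]
    ge_pairs += n_pairs >= observed[1]
    ge_both += (ccs >= observed[0]) and (n_pairs >= observed[1])

p_ccs, p_pairs, p_joint = (k / n_codes for k in (ge_ccs, ge_pairs, ge_both))
print(observed, p_ccs, p_pairs, p_joint)
```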
4. Conclusions

The results presented here unequivocally indicate that there exists a close relationship between the biosynthetic pathways of amino acids and the organisation of the genetic code. This observation confirms what has been reported in the literature (Nirenberg et al., 1963; Pelc, 1965; Dillon, 1973; Wong, 1975; Jurka and Smith, 1987; Di Giulio, 1996, 1999, 2001; Di Giulio and Medugno, 2000; Davis, 2007, 2008). This is strong evidence in favour of the coevolution theory of genetic code origin because if, as pointed out here, the biosynthetic families of amino acids are reflected in the genetic code, i.e. the amino acids of the single biosynthetic families have codons that are contiguous in the code, then this could have taken place only if the amino acids of the biosynthetic families entered the code through a clustering mechanism. This mechanism must clearly be compatible with the one suggested by the coevolution theory because it too envisages clustering of biosynthetically correlated amino acids in the code. Therefore, although this theory is not significant in a particular set of amino acids in a precursor–product relationship (Table 5), nevertheless it is the general relationship between the biosynthetic families of amino acids and the genetic code which clearly indicates that the interpretation given by the coevolution theory to the absence of several precursor–product amino acid pairs in the code is correct and that, therefore, this theory is considerably corroborated by this. Finally, it is obvious that all these observations are even more compatible with the extended coevolution theory (Di Giulio, 2008).

Table 4
Results of the application of Fisher's exact test to the five biosynthetic families of amino acids

Family                       a   b   c   d    P       −2 ln P
Serine family                4   4   0   12   0.0144  8.48
Phosphoenolpyruvate family   2   4   0   14   0.079   5.08
Pyruvate family              3   7   0   10   0.105   4.51
Aspartate family             5   2   1   12   0.0072  9.86
Glutamate family             3   2   1   14   0.032   6.88

See text for further information.

Table 5
A particular set of amino acid pairs which are in a certain precursor–product amino acid relationship

Ser-Gly = 2, Ser-Cys = 4, Ser-Trp = 1, Asp-Asn = 2, Asp-Thr = 0, Asp-Met = 0, Asp-Lys = 0, Thr-Ile = 3, Glu-Gln = 2, Glu-Arg = 0, Glu-Pro = 0

The numbers indicate the number of times that the two amino acids interchange on the basis of the genetic code. See text for further information.

References

Amirnovin, R., 1997. An analysis of the metabolic theory of the origin of the genetic code. J. Mol. Evol. 44, 473–476.
Archetti, M., 2004. Codon usage bias and mutation constraints reduce the level of error minimization of the genetic code. J. Mol. Evol. 59, 258–266.
Curnow, A.W., Ibba, M., Soll, D., 1996. tRNA-dependent asparagine formation. Nature 382, 589–590.
Danchin, A., 1989. Homeotopic transformation and the origin of translation. Prog. Biophys. Mol. Biol. 54, 81–86.
Davis, B.K., 2007. Making sense of the genetic code with the path-distance model. In: Ostrovsky, M.H. (Ed.), Leading Edge Messenger RNA Research Communication, pp. 1–32.
Davis, B.K., 2008. Imprinting of early tRNA diversification on the genetic code: domains of contiguous codons read by related adaptors for sibling amino acids. In: Takeyama, T. (Ed.), Messenger RNA Research Perspectives, pp. 1–79.
de Duve, C., 1991. Blueprint for a Cell: the Nature and Origin of Life. Neil Patterson Publishers, Carolina Biological Supply Company, Burlington, NC.
Di Giulio, M., 1989a. Some aspects of the organization and evolution of the genetic code. J. Mol. Evol. 29, 191–201.
Di Giulio, M., 1989b. The extension reached by the minimization of polarity distances during the evolution of the genetic code. J. Mol. Evol. 29, 288–293.
Di Giulio, M., 1996. The β sheets of proteins, the biosynthetic relationships between amino acids, and the origin of the genetic code. Origins Life Evol. Biosph. 26, 589–609.
Di Giulio, M., 1999. The coevolution theory of the origin of the genetic code. J. Mol. Evol. 48, 253–254.
Di Giulio, M., 2001. A blind empiricism against the coevolution theory of the genetic code. J. Mol. Evol. 53, 11–17.
Di Giulio, M., 2002. Genetic code origin: are the pathways of the type Glu-tRNA^Gln → Gln-tRNA^Gln molecular fossils or not? J. Mol. Evol. 55, 616–622.
Di Giulio, M., 2008. An extension of the coevolution theory of the origin of the genetic code. Biol. Direct 3, 37.
Di Giulio, M., Medugno, M., 2000. The robust statistical bases of the coevolution theory of the genetic code. J. Mol. Evol. 50, 258–263.
Dillon, L.S., 1973. The origins of the genetic code. Bot. Rev. 39, 301–345.
Edwards, M.R., 1996. Metabolite channeling in the origin of life. J. Theor. Biol. 179, 313–322.
Fisher, R.A., 1950. Statistical Methods for Research Workers, 11th ed. Oliver and Boyd, Edinburgh and London, p. 99.
Jurka, J., Smith, T.F., 1987. β-turns in early evolution: chirality, genetic code, and biosynthetic pathways. Cold Spring Harbor Symp. Quant. Biol. 52, 407–410.
Miseta, A., 1989. The role of protein associated amino acid precursor molecules in the organization of genetic codons. Physiol. Chem. Phys. Med. NMR 21, 237–242.
Morowitz, H.J., 1992. Beginnings of Cellular Life: Metabolism Recapitulates Biogenesis. Yale Univ. Press/Vail-Ballou Press, Binghamton/New York, pp. 160–171.
Nirenberg, M.W., Jones, O.W., Leder, P., Clark, B.F.C., Sly, W.S., Pestka, S., 1963. On the coding of genetic information. Cold Spring Harbor Symp. Quant. Biol. 28, 549–557.
Pelc, S.R., 1965. Correlation between coding-triplets and amino-acids. Nature, London 207, 597–599.
Ronneberg, T.A., Landweber, L.F., Freeland, S.J., 2000. Testing a biosynthetic theory of the genetic code: fact or artifact? Proc. Natl. Acad. Sci. U. S. A. 97, 13690–13695.
Schon, A., Kannagara, C.G., Gough, S., Soll, D., 1988. Protein biosynthesis in organelles requires misaminoacylation of tRNA. Nature 331, 187–190.
Taylor, F.J.R., Coates, D., 1989. The code within the codons. BioSystems 22, 177–187.
Wachtershauser, G., 1988. Before enzymes and templates: theory of surface metabolism. Microbiol. Rev. 52, 452–484.
Wong, J.T., 1975. A co-evolution theory of the genetic code. Proc. Natl. Acad. Sci. U. S. A. 72, 1909–1912.
Wong, J.T., 1976. The evolution of the universal genetic code. Proc. Natl. Acad. Sci. U. S. A. 73, 1000–1003.
Wong, J.T., 2005. Coevolution theory of the genetic code at age thirty. BioEssays 27, 416–425.
Wong, J.T., 2007. Question 6: coevolution theory of the genetic code: a proven theory. Orig. Life Evol. Biosph. 37, 403–408.