J. theor. Biol. (1975) 54, 399-401
A Metaphor for Molecular Evolution

A model represents the isolation of certain features of a complex situation so that their mutual relationships can be seen without the distraction of other features of lesser significance. "Every system is its own best analogue" so that selection for simplicity is essential. A metaphor can be regarded as a representation still less rigorous than a model, more a parable than an analysis.

It is difficult to think intuitively about the evolution of an organism or a protein molecule because the interactions between all the components entail the representation of the possible configurations as points in a many-dimensional manifold or landscape in which we are aliens.

An ingenious proposal by Randić (1974) for the unique numbering of graphs (and thus of molecules) can be used as a metaphor to illustrate some of the properties of systems with many interactive parts. We must imagine simply that a configuration with a smaller Randić numbering is more successful than one with a larger, and is thus selected for in some way. Randić's proposal contains a fatal flaw or fallacy which renders it unfit for its particular purpose, but this "flaw" is an intrinsic property of the systems it represents.

His proposal is as follows. The N nodes of a non-symmetrical graph can be numbered 1 to N in N! different ways. Each of these numberings can be represented as an N × N connectivity (incidence) matrix, but it would be difficult to recognize that each of these connectivity matrices represented the same graph. Randić gives an algorithm for renumbering so that for a given graph a unique numbering (and thus a unique connectivity matrix) can be obtained. The N × N connectivity matrix of the graph is symmetrical, the entries being 0 or 1 with zeros along the diagonal. Reading the matrix from left to right and top to bottom by rows gives a number with N² binary digits which uniquely characterizes the matrix (and from which the matrix could be restored).
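The reading of a connectivity matrix as a single binary number can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `index_number` and `renumber` and the three-node example graph are hypothetical and do not come from Randić's paper.

```python
from itertools import permutations

def index_number(adj):
    """Read a 0/1 matrix row by row, left to right, as one binary integer."""
    bits = "".join(str(b) for row in adj for b in row)
    return int(bits, 2)

def renumber(adj, order):
    """Renumber the nodes: new node i is old node order[i]."""
    return [[adj[a][b] for b in order] for a in order]

# Hypothetical example: a chain of three nodes, 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]

# Index number of every one of the 3! numberings of the chain.
indices = {order: index_number(renumber(path, order))
           for order in permutations(range(3))}
smallest = min(indices.values())
```

Because this small chain is symmetrical, its six numberings coincide in pairs and give only three distinct index numbers; for a non-symmetrical graph all N! would differ, as the text states.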
In fact it is more convenient to work with the N decimal numbers which represent the N rows. Each of the N! different numberings of the graph thus gives a different index number. It is proposed to make that numbering which gives the lowest index number the standard unique numbering of the graph. All questions, such as the identity of two graphs, the enumeration of graphs, the symmetry of a graph, can be answered when this unique numbering is derived. The problem is to reach it and, what is more difficult, to show that it is the numbering required.

Randić gives a procedure for approaching the unique numbering from any arbitrary initial numbering. It consists of three tests which are cycled as subroutines.

(1) The basic test is that a numbering with a lower index number is preferred to one with a higher. Mutations leading to a lower index are fixed. Others are rejected.

(2) The matrix is arranged. Starting from the top, rows are examined to check that the index number of each row is not greater than that of the one immediately below it. If such a case is found then these neighbouring rows are exchanged, and the corresponding columns are also exchanged to keep the matrix symmetrical. The latter exchange may disturb the order of succeeding rows, but it does not affect the order of rows earlier ordered. Examination thus continues in successive passes down the matrix, until no further changes take place. The index number of the arrangement is then noted.

(3) The effects of further exchanges of rows (and corresponding columns) are then tried. Two non-adjacent rows are exchanged and the matrix is then rearranged [as in (2)]. If the index number is lower than before the ordering is retained, otherwise the rows are put back as they were.

When the procedure operates, the numbering is constrained to evolve along a path towards that with the lowest index number possible. The test of the absolute minimum which is used is that no binary exchange of numbers should lower the index number. The pathway to the unique minimum number is thus supposed to be always downhill.
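The three tests can be sketched in Python as below. This is a minimal reading of the description given here, not Randić's published program: the names `arrange`, `swap_nodes` and `descend`, the loop order, and the three-node example are all assumptions made for illustration.

```python
def row_values(adj):
    """Value of each row read as a binary number."""
    return [int("".join(map(str, row)), 2) for row in adj]

def index_number(adj):
    """Whole matrix read row by row as one binary integer."""
    return int("".join(str(b) for row in adj for b in row), 2)

def swap_nodes(adj, i, j):
    """Exchange rows i and j and the corresponding columns."""
    order = list(range(len(adj)))
    order[i], order[j] = order[j], order[i]
    return [[adj[a][b] for b in order] for a in order]

def arrange(adj):
    """Test (2): repeat passes down the matrix, exchanging any adjacent
    pair of rows in which the upper row is greater than the lower."""
    changed = True
    while changed:
        changed = False
        for i in range(len(adj) - 1):
            rows = row_values(adj)
            if rows[i] > rows[i + 1]:
                adj = swap_nodes(adj, i, i + 1)
                changed = True
    return adj

def descend(adj):
    """Tests (1) and (3): try exchanges of non-adjacent rows, rearrange,
    and keep the result only if the index number is lower."""
    adj = arrange(adj)
    best = index_number(adj)
    improved = True
    while improved:
        improved = False
        n = len(adj)
        for i in range(n):
            for j in range(i + 2, n):
                trial = arrange(swap_nodes(adj, i, j))
                value = index_number(trial)
                if value < best:
                    adj, best, improved = trial, value, True
    return adj, best

# Hypothetical example: a chain of three nodes, 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
final, best = descend(path)
```

Every accepted exchange lowers the integer index, so the loops must terminate; whether they terminate at the absolute minimum is a separate question.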
The fallacy in the procedure is that a state may be reached where every bipartite exchange indeed makes the index worse, but it may be one where a cyclic exchange involving three or more points (A to B, B to C, C to A), or some more complicated permutation, may still make the index lower. To examine all such multiple exchanges would render the procedure extremely costly and would simply be equivalent to listing all N! possible index numbers and picking the smallest. In an actual trial with a complex graph (of 17 vertices, each connected to four others) a minimum was found a little above the absolute minimum. The corresponding configurations were very different from each other, so that the two downward paths must have diverged at a col quite high up during the walk. Some initial numberings settled into one configuration and some into the other. In all there must be 17! ≈ 3 × 10¹⁴ ≈ 2⁴⁸ possible numberings to choose from.
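The gap between the local test and the exhaustive search can be made concrete with a short Python sketch. The helper names `is_pairwise_minimum` and `global_minimum` and the small example graph are hypothetical; the code only illustrates the two criteria and the size of the search space, and does not reproduce the 17-vertex trial.

```python
import math
from itertools import combinations, permutations

def index_number(adj):
    """Whole matrix read row by row as one binary integer."""
    return int("".join(str(b) for row in adj for b in row), 2)

def renumber(adj, order):
    """Renumber the nodes: new node i is old node order[i]."""
    return [[adj[a][b] for b in order] for a in order]

def is_pairwise_minimum(adj):
    """Local test: no exchange of two numbers lowers the index."""
    n = len(adj)
    here = index_number(adj)
    for i, j in combinations(range(n), 2):
        order = list(range(n))
        order[i], order[j] = order[j], order[i]
        if index_number(renumber(adj, order)) < here:
            return False
    return True

def global_minimum(adj):
    """Exhaustive test: smallest index over all N! numberings."""
    n = len(adj)
    return min(index_number(renumber(adj, p)) for p in permutations(range(n)))

# Size of the search space for the 17-vertex graph mentioned in the text.
n_numberings = math.factorial(17)    # 355687428096000
bits = math.log2(n_numberings)       # between 48 and 49

# Hypothetical small example: a chain of three nodes, 0 - 1 - 2.
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
lowest = global_minimum(path)
```

The local test costs of the order of N² trial exchanges per state, whereas the exhaustive test grows as N!, which is why only the former is practical for graphs of this size.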
LETTERS TO THE EDITOR
Smaller graphs of molecules all converged to a unique minimum, so that the algorithm is powerful but not perfect. The interest in the metaphor lies in the complex path which the algorithm traces out in such a relatively simple system. The Randić index is valuable in exemplifying a global parameter which depends on all of the N components and the ways they are linked up. It shows that some evolutionary paths, or creodes, can finish, given the types of permissible mutations (here the exchange of the numbers of the points), in a dead end, where less than the optimum development of the system is realized. In this example all mutations are either advantageous or deleterious, but neutral mutations could easily be introduced by neglecting the contribution of one row to the index or by allowing an increase in the index number provided it was less than, for example, the last value but one. At present, computer time limits the value of N to less than 20, since a configuration of this size may take 60 sec of CPU time to order on the CDC 6400 machine, but with more efficient programming it is anticipated that larger graphs could be handled.
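The second suggestion for admitting neutral mutations, an increase allowed provided the new index is below the last value but one, might be expressed as an acceptance rule like the following Python sketch. The function name `accept` and the use of a list of previously accepted index values are assumptions for illustration only.

```python
def accept(candidate, history):
    """Accept a lower index, or a higher one that is still below
    the last value but one in the list of accepted index values."""
    if not history or candidate < history[-1]:
        return True
    return len(history) >= 2 and candidate < history[-2]

# Hypothetical run: accepted index values so far were 10, then 7.
history = [10, 7]
decisions = [accept(5, history), accept(8, history), accept(12, history)]
```

Here a step from 7 up to 8 is tolerated because 8 is still below 10, while a step up to 12 is rejected.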
Department of Crystallography, Birkbeck College, University of London, Malet Street, London, WC1 7HX, England
(Received 10 December 1974)
REFERENCE
RANDIĆ, M. (1974). J. chem. Phys. 60, 3920.
A. L. MACKAY