The implications for molecular evolution of possible mechanisms of primary gene duplication

J. Theoret. Biol. (1968) 20, 227-244

R. L. WATTS
Paediatric Research Unit, Guy’s Hospital Medical School, London, S.E.1, England

AND

D. C. WATTS
Department of Biochemistry, Guy’s Hospital Medical School, London, S.E.1, England

(Received 11 January 1968, and in revised form 6 May 1968)

Primary gene duplication is distinguished from secondary gene duplication which, when the original duplicate genes are suitably juxtaposed, can occur relatively readily by homologous unequal crossing-over. The need to consider homology between genes quantitatively is established. It is concluded upon theoretical grounds and in consideration of the somewhat sparse evidence which at present permits the quantitation of homology that a high level of homology is necessary for crossing-over between genes to occur. This being so, non-homologous unequal crossing-over is rejected as an important mechanism for primary gene duplication. Other mechanisms, based on cytogenetic evidence of chromosome behaviour, are discussed with respect to the probability of their producing viable duplications and to their usefulness in subsequent evolution.

1. Introduction

The existence of duplicate genes was first established upon genetic evidence when Sturtevant (1925) described the behaviour of the Bar eye locus in Drosophila. Genetic techniques enabled Lewis (1951) to develop the concept of pseudoallelism, the close linkage of genes of related function, a probable property of a row of duplicate genes. Biochemical techniques have more recently made it possible to infer the existence of duplicate genes from similarities in structure between (Ingram, 1961) or within (Smithies, Connell & Dixon, 1962; Sanger & Brownlee, 1967) gene products, proteins or RNA.

An appreciation of the evolutionary potential of duplicate genes preceded the development of molecular genetics. Huxley (1942) pointed out the possibilities for divergent specialization of such genes, now seen to be realized in the haemoglobins and the many isoenzyme systems. Lewis (1951) was among the first to state the role of duplicate genes in the evolution of new proteins: that one duplicate might evolve while the other maintained function.

Particular genes may be duplicated as a result either of small changes in the chromosomes or of large changes where whole chromosomes or whole sets of chromosomes come to be present in duplicate. Among the second type, polyploidization has played a large role in the evolution of higher plants (Stebbins, 1950) which are, in the main, hermaphrodite. The various mechanisms for sex-determination appear to be absolutely dependent upon the presence of the sex-determining chromosomes as part of a diploid set. (It is interesting that no way of overcoming this appears to have arisen during evolution although the incompatibility systems of higher plants, with an entirely different genetic basis, serve the same purpose of ensuring cross-fertilization.) The work of Muldal (1952) on one animal group, the oligochaete annelids, has shown that hermaphrodite species are tetraploid by comparison with the bisexual polychaete annelids and that parthenogenetic species show further increases in ploidy. Such changes are coupled to evolutionary stasis (Watts & Watts, 1967, 1968), presumably because the essential process of recombination is absent.

The duplication of genes as a result of small changes in the chromosomes would appear to be of very much greater importance in the evolution of higher organisms in general. This paper is concerned with the different ways in which such changes can occur and their different evolutionary effects.

2. Gene Duplication and Unequal Crossing-over

The recognition of gene duplication at the Bar eye locus in Drosophila (Sturtevant, 1925) was coupled to the postulation of unequal crossing-over to account for the ability of flies carrying the gene to give rise to normal and Double-Bar offspring with a frequency well above that of mutation. Sturtevant suggested that crossing-over occurred between the end of the Bar gene on one homologue and its beginning on the other, thus producing a chromosome with the duplicated form, Double-Bar, and, reciprocally, no-Bar or normal (Fig. 1a). Such crossing-over was not only unequal but also between non-homologous parts of the genetic material. It was shown by Bridges (1936) that the Bar gene was itself a duplication, so that, when Double-Bar arose by unequal crossing-over, this was between homologous regions of genetic material (Fig. 1b).

[Figure 1: diagram of crossing-over in a Bar homozygote giving Double-Bar and normal chromosomes; panels (a) and (b).]

FIG. 1. Gamete formation in Bar homozygote. (a) Non-homologous unequal crossing-over as proposed by Sturtevant (1925). (b) Homologous unequal crossing-over as proposed by Bridges (1936).

Homologous, unequal crossing-over is now a well-authenticated process (Lewis, 1941; Bender, 1967). Evidence for its occurrence may be expected whenever the recombinational behaviour of pseudoallelic clusters is studied. In the biochemical field, it is exemplified by the human haemoglobins Lepore (Baglioni, 1962). Upon amino acid sequence evidence, these are the products of genes formed by unequal crossing-over (unfortunately termed “non-homologous crossing-over” by Baglioni) between the β and δ haemoglobin loci, which show close linkage, and, again upon sequence evidence, are highly homologous.

On the other hand, we have been unable to find any evidence in the literature for non-homologous unequal crossing-over in diploid organisms, except in cases of abnormality such as trisomy where chromosomes were without a homologous partner with which to pair (McClintock, 1941). An investigation by Roberts (1965) into the possibility that homologous unequal crossing-over might induce chromosome misalignment and thus non-homologous unequal crossing-over gave completely negative results. It seems most probable that effective pairing (that where crossing-over occurs) is discontinuous (Pritchard, 1960) and that some kind of looping out between an unequally paired region and the next point of pairing permits both to be homologous.
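The homologous unequal exchange attributed above to Bridges (1936) can be made concrete with a minimal sketch. It assumes a deliberately crude model that is not part of the original argument: a chromosome is a list of gene labels, the labels `x`, `B` and `y` are hypothetical, and the exchange point is taken to fall at the boundary of the paired copies rather than within them.

```python
def unequal_crossover(chrom_a, chrom_b, break_a, break_b):
    """Exchange the right-hand ends of two chromosomes.

    chrom_a and chrom_b are lists of gene labels; break_a and break_b give
    the number of genes retained from the left end of each homologue.
    When break_a != break_b the exchange is unequal.
    """
    recombinant_1 = chrom_a[:break_a] + chrom_b[break_b:]
    recombinant_2 = chrom_b[:break_b] + chrom_a[break_a:]
    return recombinant_1, recombinant_2


def copies(chrom, gene="B"):
    """Number of copies of one gene label carried by a chromosome."""
    return chrom.count(gene)


# Assumed model of the Bar chromosome: segment B present twice in tandem.
bar = ["x", "B", "B", "y"]

# Misaligned but homologous pairing: the second copy on one homologue lies
# opposite the first copy on the other, and the exchange follows that copy.
longer, shorter = unequal_crossover(bar, bar, break_a=3, break_b=2)

print(longer, copies(longer))    # three copies of B (Double-Bar in this model)
print(shorter, copies(shorter))  # one copy of B (normal in this model)
```

The two products together carry the same number of copies as the two parents, which is the sense in which this process is secondary: it redistributes copies that already exist and cannot create the first duplicate.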


3. Quantitation of Homology between Genes

(In order to follow the arguments in this section it is necessary to make reference to the amino acid code (Table 1).)

TABLE 1
The amino acid code†

UUU  Phe              CUU  Leu              AUU  Ile              GUU  Val
UUC  Phe              CUC  Leu              AUC  Ile              GUC  Val
UUA  Leu              CUA  Leu              AUA  Ile              GUA  Val
UUG  Leu              CUG  Leu              AUG  Met              GUG  Val

UCU  Ser              CCU  Pro              ACU  Thr              GCU  Ala
UCC  Ser              CCC  Pro              ACC  Thr              GCC  Ala
UCA  Ser              CCA  Pro              ACA  Thr              GCA  Ala
UCG  Ser              CCG  Pro              ACG  Thr              GCG  Ala

UAU  Tyr              CAU  His              AAU  Asn              GAU  Asp
UAC  Tyr              CAC  His              AAC  Asn              GAC  Asp
UAA  Chain termn.     CAA  Gln              AAA  Lys              GAA  Glu
UAG  Chain termn.     CAG  Gln              AAG  Lys              GAG  Glu

UGU  Cys              CGU  Arg              AGU  Ser              GGU  Gly
UGC  Cys              CGC  Arg              AGC  Ser              GGC  Gly
UGA  Not translated   CGA  Arg              AGA  Arg              GGA  Gly
UGG  Trp              CGG  Arg              AGG  Arg              GGG  Gly

† After Jukes (1967).

It is evident from the foregoing that the concept of homology between genes is not a simple one, except for an idealized pair of homologous chromosomes each consisting of a row of single clearly differentiated genes. If it is imagined that in each chromosome one such gene is duplicated, different kinds of homology clearly exist between the members of the two pairs; all combinations are structurally homologous and thus able to take part in homologous crossing-over, but some are not locationally homologous, so that crossing-over between them would be unequal. Again, the various alleles for a particular locus will be locationally homologous but cannot be 100% structurally homologous. Two alleles may differ only in one base-pair. The question arises: what percentage of the base-pairs can be different without lowering the degree of homology so far as to interfere with the capacity for crossing-over?

The existence of the Lepore haemoglobins is of interest from this point of view. The β and δ haemoglobin each have 146 amino acid residues, of which ten residues are different, six of these occurring in three pairs. All the differences but one are between amino acids having codons differing in only one base; for the remaining difference (Ala or Thr to Gln), the first two bases are changed. The minimum difference on this basis is 2.55% between the base-pairs of the two cistrons, which would thereby be 97.45% homologous. Added to this, because of the positions of the differences, there are three long stretches of DNA (84, 107 and 87 base-pairs long) where there may be complete homology (Fig. 2). The cross-over point, in both types of Lepore haemoglobin where its position is known, is in one of these regions, in Hb Lepore (Hollandia) the first and in Hb Lepore (Boston) the third (Baglioni, 1965).

[Figure 2: diagram of the variant amino acid positions in the β and δ chains, with the inferred base-pair numbers and the cross-over regions for Hb Lepore (Hollandia) and Hb Lepore (Boston).]

FIG. 2. Homology in the formation of haemoglobin Lepore. The positions where different amino acids occur in β and δ haemoglobin chains are shown by squares in the diagram. The upper line of figures shows the numbers of the variant residues and the lower line the numbers of the base-pairs inferred to vary in the DNA. The amino acid sequences of haemoglobin Lepore (Hollandia) and (Boston) show that the cross-over points which gave rise to them were located in the regions indicated by the brackets (sequences from Eck & Dayhoff, 1966).

If the amino acid code is functionally degenerate in vivo in man, there could be many more differences between the two cistrons, or, indeed, between the various alleles for one cistron. The latter might differ from each other in base-sequence only, producing no protein variant through which they could be detected. Assuming complete degeneracy and complete selective neutrality for the various codons whether homo- or heterozygous, synonymous substitutions would accumulate in the population at approximately one quarter of the mutation rate. This is calculated from the proportion of single base changes in the codons for each amino acid which yield other codons for the same amino acid, with the assumption that all such codons and all their changes are equally probable. Because of their slightly different amino acid compositions, the value is 23.6% for β haemoglobin and 23.1% for δ haemoglobin; the value for the original duplicate genes before their divergence may be assumed to be of the same order. Under these circumstances, a large randomly-mating population might be expected to
contain a random mixture of synonymous codons for each triplet. This situation would be interrupted each time a population founded on one or a few breeding pairs arose. Such a population would display a sharply decreased level of degeneracy within individual triplets and thus a sharply increased degree of homology between alleles appearing by mutation. Selection which rapidly increased a particular allele at the expense of others in the population would have the same effect at the relevant locus and those linked with it.

Is it possible to believe that mutation of a codon to its synonym could be selectively neutral? By definition there would be no difference upon which selection could act at the level of protein structure. At the level of protein synthesis, some synonyms might have a selective advantage either because of the relative concentrations of the respective transfer RNA species in the organism or from some more elusive cause such as an effect on stability of the messenger RNA or its ability to bind to ribosomes. At the level of chromosome pairing in meiosis, a single mutation of this kind would not be expected to be significantly disadvantageous. Several mutations would probably have no effect, but, as codon heterozygosity built up towards the theoretical maximum of about one third of the base-pairs, a lowering of fertility must surely start to operate. Even if pairing were possible between “homologous” chromosomes when only two thirds of the base-sequence matched, an organism would be exposed to a high risk of confusion between genes of different function but only moderately different coding properties. In the case of β and δ haemoglobin, for instance, with only 2.55% difference between the bases involved in determining the different structures of the proteins, synonymous alleles for the two loci might arise which had more structural homology with each other than with the locationally homologous genes.
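The estimate that synonymous changes make up roughly one quarter of all single base changes can be checked by direct enumeration. The sketch below is an illustration under stated assumptions, not the authors' own calculation: it uses the code of Table 1 with UGA counted as a terminator, treats every codon of an amino acid and every single base change as equally probable, and its function names are hypothetical. The composition passed to `weighted_fraction` is supplied by the reader; no haemoglobin compositions are reproduced here.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for the 64 codons in UCAG order (third base varying
# fastest); '*' marks the three codons that do not specify an amino acid.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def single_base_changes(codon):
    """Yield the nine codons that differ from `codon` in exactly one base."""
    for pos in range(3):
        for base in BASES:
            if base != codon[pos]:
                yield codon[:pos] + base + codon[pos + 1:]


def synonymous_counts():
    """Map each amino acid to (synonymous changes, all single base changes)."""
    counts = {}
    for codon, aa in CODE.items():
        if aa == "*":
            continue
        syn, total = counts.get(aa, (0, 0))
        for neighbour in single_base_changes(codon):
            total += 1
            syn += CODE[neighbour] == aa
        counts[aa] = (syn, total)
    return counts


def weighted_fraction(composition):
    """Synonymous fraction for a protein, given {amino acid: residue count}."""
    counts = synonymous_counts()
    residues = sum(composition.values())
    return sum(n * counts[aa][0] / counts[aa][1]
               for aa, n in composition.items()) / residues


counts = synonymous_counts()
synonymous = sum(s for s, _ in counts.values())
changes = sum(t for _, t in counts.values())
print(synonymous, changes, round(synonymous / changes, 3))  # 134 549 0.244
```

Taken over all 61 coding triplets the proportion is about 24%, of the same order as the composition-weighted values quoted in the text.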
It is perhaps worth considering that such a process might have contributed to the origin of the Lepore haemoglobins. In this case, the two genes serve the same function in the organism, and the hybrid protein happens to be able still to carry out that function, so that its origin has no obvious selective disadvantage. This is unlikely to be the case for many structurally related proteins.

It seems most probable that mechanisms which protected organisms from drift towards a randomization of synonymous codons would have developed during evolution. These could operate at the level of protein synthesis, as previously mentioned, or directly upon the pairing mechanism.

At the present time, very few experiments which shed any light on the possible extent of codon randomization in vivo have been carried out. Using rabbit reticulocyte ribosomes and two classes of leucyl-sRNA with different codon specificity from Escherichia coli (Weisblum, Gonano, von Ehrenstein &
Benzer, 1965) or of arginyl-sRNA from yeast (Weisblum, Cherayil, Bock & Söll, 1967) it has been shown that both leucine and arginine are represented by different codons at different positions in the rabbit haemoglobin α-chain gene. In order to investigate the possibility of different codons representing the same amino acid at the same position, it might be necessary to perform experiments upon different populations of rabbits. If degeneracy is freely manifest at particular positions, it might have been expected that heterozygosity for some of the triplets would have been revealed, unless the experiments were performed with highly inbred rabbits. The only indication of activity with two classes of amino acyl-sRNA shown by Weisblum and his colleagues was for arginine 31. The apparent secondary activity was much lower than the main one, and the authors suggest it may be explained as an artefact.

TABLE 2
Homologous positions in human haemoglobins where more than one amino acid replacement is known

Position    Replacement    Haemoglobin
α5          Ala → Asp      Hb J Toronto
β6          Glu → Val      Hb S
β6          Glu → Lys      Hb C
γ6          Glu → Lys      Hb F Galveston

β7          Glu → Gly      Hb G San Jose
β7          Glu → Lys      Hb Siriraj

α15         Gly → Asp      Hb J Oxford
β16         Gly → Arg      Hb D Bushman
β16         Gly → Asp      Hb J Baltimore
δ16         Gly → Arg      Hb B2

α54         Gln → Glu      Hb Mexico
α54         Gln → Arg      Hb Shimonoseki

α58         His → Tyr      Hb M Boston
β63         His → Arg      Hb Zurich

α116        Glu → Lys      Hb O Indonesia
β121        Glu → Gln      Hb D Punjab
β121        Glu → Lys      Hb O Arab

In an attempt to find evidence for codon heterogeneity at particular positions in human genes, we have considered 43 variants of the human haemoglobin chains known to have single amino acid replacements. Of these, 25 were unique with respect to their position. Of the six groups with more than one known replacement at the same position (Table 2), two contained
mutations from glutamic acid and one each those from glutamine and histidine. These three amino acids each have two codons which cannot be distinguished on the basis of their mutation behaviour. Another group (α15, β16 and δ16) contained mutations from glycine. Positive evidence for degeneracy at a position containing glycine would be provided by mutations at that position to aspartic acid, serine or cysteine in addition to glutamic acid or tryptophan. In this case, there were two mutations to aspartic acid and two to arginine, which could occur if glycine at this position generally has the same codon. The remaining group comprised mutations from glutamic acid at β6 and γ6, but the α5 position, considered to be homologous, contained alanine, which, in Hb J Toronto, is replaced by aspartic acid. Since the one-step mutations (single base changes) connecting alanine with glutamic acid and aspartic acid involve different alanine codons, it can be deduced that at some stage in the divergence of the α and β chain genes a mutation from one alanine codon to another probably occurred. Of the 78 differences between the human α and β haemoglobin chains, 26 involve changes in more than one base; that some of these occurred through mutations between synonymous codons would be expected.

If the single-step differences between the various haemoglobin chains are examined, it is found, at least for serine, valine, glycine, arginine and threonine, that these demand the use of different codons at different positions. There is therefore evidence for degeneracy in vivo but no evidence so far that this operates within one position in the gene. It seems reasonable to assume for the present that homology between alleles is not greatly reduced by the degeneracy of the amino acid code, because such lessening of homology is either directly or indirectly selected against.

4. Primary and Secondary Gene Duplication
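Before the duplication mechanisms are taken up, the one-step codon argument that closes the preceding section can be checked mechanically. The following is a sketch only, assuming the code of Table 1 with the three non-coding triplets marked `*`; the function names are hypothetical and the amino acids are given in one-letter notation (A = Ala, D = Asp, E = Glu, G = Gly, R = Arg).

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acids for the 64 codons in UCAG order; '*' = no amino acid.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(amino_acid):
    """All codons specifying the given amino acid."""
    return {codon for codon, aa in CODE.items() if aa == amino_acid}


def one_step_sources(source, target):
    """Codons of `source` that reach a codon of `target` by one base change."""
    targets = codons_for(target)
    return {
        codon
        for codon in codons_for(source)
        if any(sum(x != y for x, y in zip(codon, t)) == 1 for t in targets)
    }


# Alanine: the codons allowing Ala -> Glu and Ala -> Asp do not overlap.
ala_to_glu = one_step_sources("A", "E")
ala_to_asp = one_step_sources("A", "D")

# Glycine: Gly -> Asp and Gly -> Arg can both start from the same codon.
gly_shared = one_step_sources("G", "D") & one_step_sources("G", "R")

print(sorted(ala_to_glu), sorted(ala_to_asp), sorted(gly_shared))
```

The disjoint alanine sets correspond to the inference that a change from one alanine codon to another occurred during divergence of the α and β genes, while the shared glycine codons show why the observed Asp and Arg replacements give no evidence of degeneracy at that position.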

If unequal crossing-over is always homologous, the important conclusion emerges that it is solely a mechanism for secondary gene duplication, that is, extra genetic material can be produced in this way only provided that duplicate genes are already present. It is obviously necessary to ask what the primary duplication mechanism may be.

Biochemical geneticists have been reluctant to abandon the possibility of unequal crossing-over. Tatum (1961) invoked “a process analogous to unequal crossing over” as “the most reasonable source of new genetic material”. Smithies, Connell & Dixon (1962) clearly distinguished between homologous and non-homologous unequal crossing-over, but discussed at some length the implications of the latter process as a primary mechanism for the generation of gene duplications and partial duplications. Later Smithies (1964) rejected it as the mechanism by which partial duplication of the haptoglobin α-chain gene occurred. After considering the possibility of internal local homologies in this gene in the light of preliminary amino acid sequence data†, he opted for an origin in “a purely random event, rather than any event analogous to crossing over . . . the occurrence of two chromosomal (or chromatid) breaks in non-homologous positions within the haptoglobin locus followed by rejoining”. Thus, in a sense, the mechanism proposed is the same as non-homologous unequal crossing-over (Smithies refers to it subsequently as “a non-homologous exchange”) but is deemed not to have occurred during that part of meiosis when effective pairing is in operation.

† Note added in proof. Publication of sequences for the three haptoglobin α chains (Black, J. A. & Dixon, G. H. (1968). Nature, Lond. 218, 736) confirms the suggestion of Smithies that the change-point between the α1S and α1F sequences found in α2FS occurs at an alanine residue (α1F71/α1S12). If the two “parental” sequences are lined up on this basis, there is 23% homology between the first two bases of 11 triplets in the region. An alternative scheme proposed by Black and Dixon in which the two sequences are offset by one triplet improves the homology to 55%. Since the α2FS sequence can only be produced from this configuration if the paired triplets for Ala (α1F71) and Asp (α1S13) are both included in the crossover product, we feel that this mechanism is not tenable.

Cytogenetic research has revealed a number of mechanisms whereby primary duplication can occur as a result of aberrations involving accidental breakage of chromosomes. There appears to be no experimental evidence for the chromosome breakage mechanism (Smithies, 1964) which, to distinguish it more clearly from unequal crossing-over, we should prefer to describe as an interchange between two homologous chromosomes, the two breakage positions being slightly different. It is a mechanism of appealing simplicity; there seems no reason why it should not occur equally with interchanges between non-homologous chromosomes, and, it should be noted, it would be extremely difficult to observe cytologically.

An important characteristic of duplicate genes is their position relative to each other, which will vary according to the mechanism by which primary duplication occurs. Further possible changes in modified chromosomes will depend upon the likelihood of secondary duplication by homologous unequal crossing-over. Smithies, Connell & Dixon (1962) have presented evidence for the generation of variants at the haptoglobin α-chain locus by this process.

5. Evolutionary Variation in Proteins

Proteins have a versatility of function which dictates that the structural requirements of the molecule must vary considerably between functional classes, and, likewise, the type of structural changes which may occur during evolution. Enzymes have particular structural requirements which may render generalizations derived from data for other types of protein inapplicable to them. Such data may provide insight into the manner in which genes can vary; the direction and intensity of selective forces must depend upon the class of protein involved. For instance, Smithies’ (1964) mechanism for the generation of variability in antibody proteins is unlikely to be of importance in the evolution of enzymes, and the suggestion (Smithies, Connell & Dixon, 1962) that joining together of parts of different genes could produce evolutionary advance seems very improbable, unless the “parental” proteins were closely related. The latter would hold good for adjacent duplicate genes which had diverged slightly in structure and between which homologous unequal crossing-over might occur, giving an extra possibility for recombination of different mutations into one gene (Wagner & Mitchell, 1964; Horowitz, 1965).

The partially duplicated haptoglobin gene Hp2α, which codes for a polypeptide almost twice the length of the alternative gene products, is an example of what is probably a very unusual phenomenon, even in a serum globulin: the retention of function in a single polypeptide in place of what would normally be two separate subunits not covalently linked. This is presumably only possible because, in the normal association of subunits, the C-terminus of one lies near to the N-terminus of another. A comparable situation has been noted (Watts, 1968) in the enzyme tryptophan synthetase if the postulated evolution (Bonner, de Moss & Mills, 1965) of the A-B system of some fungi from the A + B system of bacteria has occurred.

In general, where the evolution of proteins is concerned, it is clear that ability to perform a new function is unlikely to be associated with great novelty of structure; the latter is more likely to lead to complete loss of function. This applies not only to parts of the structure directly involved in catalysis but to the balance of the amino acids engaged in maintenance of specificity and configuration (Watts & Watts, 1968). The most useful genetic material for evolution, therefore, is that in which is encoded the structure of a useful, functional protein. Slight modifications of such DNA can lead to the origin of isoenzymes of particular functional fitness in certain tissues or at certain times in development. It can give rise to enzymes mediating a similar reaction to that of the original enzyme, but with usefully altered substrate specificity, or to enzymes processing the same substrate but to a slightly altered product. In this way, gene duplication not only protects the organism from the ill-effects of mutation upon genes for essential enzymes, but provides very favourable material for the evolution of new ones. A clear example of this process is provided by the phosphagen kinase systems of polychaete annelids (Watts & Watts, 1968).


6. Mechanisms for Gene Duplication Dependent upon Chromosome Breakage

Table 3 shows eight mechanisms for gene duplication for most of which there is direct cytogenetic evidence. They have been grouped in four classes, according to the relative positions of the duplicate genes within the genome. Where the duplicate genes are within one chromosome, the polarity of the duplicated section may be of importance in the subsequent behaviour of the chromosomes, and has been noted. This will be discussed below.

The fifth column shows whether deletion precedes or accompanies duplication, since the viability of organisms with such chromosomes might be determined by their deficiencies. Mechanisms 4, 5 and 8 are commonly presented in text books of genetics as giving rise to lethally modified chromosomes. However, very small amounts of DNA may be lost, merely enough to provide “open” ends for non-standard rejoining. It is noteworthy that in all cases the DNA is lost from the ends of chromosome arms, a position which, because of its greater vulnerability, might be expected to contain selectively less important material. Thus, all such mechanisms might be considered as means of exchanging DNA coding for proteins no longer of use to the organism for extra DNA of a potentially more useful type. An example of mechanism 5 in which all indispensable genetic material was retained was described by McClintock (1941) in maize and is illustrated in simplified form in Fig. 3. Here a functionally terminal inversion is formed, as one of the two breaks occurs at the junction of a dispensable satellite region with the main body of the chromosome. Apart from the mechanisms involving interchange between homologous chromatids or chromosomes (1-3), only mechanism 6 (Fig. 4) achieves duplication without deletion. Sturtevant & Beadle (1936) reported that Drosophila carrying small duplications generated in this way were of normal viability and in some cases were normal in all respects.
TABLE 3
Mechanisms for gene duplication classified according to the relative positions of the duplicate genes

Mechanism of duplicate gene formation                             Polarity  Deletion  No. of breaks  References

Class I. Same chromosome arm; may be adjacent
1. Chromatid interchange in ring chromosome (reciprocal)          Same      -         2† + 2AB       McClintock (1941)
2. Chromatid interchange in rod chromosome (non-reciprocal)       Same      -         2              Kaufman & Bate (1938)
3. Interchange between homologous chromosomes (non-reciprocal)    Same      -         2              Suggested by Smithies (1964)
4. Breakage-fusion-bridge cycle                                   Reverse   +         1 + 1AB        McClintock (1941)

Class II. Same chromosome arm; not adjacent
5. Single cross-over within an inversion in an inversion heterozygote
   (a) paracentric inversion                                      Reverse   +         2 + 1AB        Sturtevant & Beadle (1936)
   (b) pericentric inversion                                      Reverse   +         2              Newmeyer & Taylor (1967)
   (c) “terminal” inversion (Fig. 3)                              Reverse   +‡        2 + 1AB        McClintock (1941)
6. Single cross-over in the common inverted region of
   heterozygote for partially overlapping inversions (Fig. 4)     Reverse   -         2 + 2          Sturtevant & Beadle (1936)

Class III. Same chromosome, different arm
7. Ring-rod heterozygote (Fig. 5)                                 Same      +         2 + 1AB        -

Class IV. Different chromosomes
8. Adjacent-1 disjunction in translocation heterozygote           -         +         2              Zimmering (1955)

† Or twist in replication.
‡ Of satellite.
AB = breaks in anaphase bridge.

The sixth column shows the number and nature of chromosome breaks involved and so gives some measure of the probability of gene duplication occurring in each way. Where breaks occur after anaphase bridge formation, which must itself lower the probability of a viable cell being produced, their position is important, and in mechanism 1 two breaks in the double anaphase bridge formed must occur at different positions. From this viewpoint, our hypothetical mechanism 7 (Fig. 5) is no less probable than, for instance, 5(c), which has been shown to yield viable progeny. If breakage of both chromatids at two adjacent points (as postulated by Kaufman & Bate (1938) to explain an X-ray induced duplication with reversed polarity and by Slizynska (1963) for duplications either with the same or reversed polarity, induced by chemical mutagens) is a type of chromosome damage of reasonable frequency, it may be a main source not only of direct duplication but of the inversions and re-inversions postulated below to be of evolutionary importance in the origin of duplicate genes.

Provided the new chromosome is capable of synapsis with a normal homologue (which should be possible for chromosomes with small duplications by looping out of the extra material) it might be expected to appear in the more favourable homozygous state within the next few generations of a prolifically inbreeding species. Further changes in genotype as a result of homologous unequal crossing-over can then occur in classes I, II and III.

[Figure 3: diagram with panels: standard chromosome; terminal inversion formed after breakage as shown; pachytene arrangement; anaphase I; gamete (other products: 2 parental chromosomes and fragment).]

FIG. 3. A chromosome with a terminal inversion is formed following breakage at two points. As one breakage point is between the satellite and the chromosome, no genetic material is lost. A single exchange within the inverted portion in the inversion heterozygote gives rise to a dicentric bridge and an acentric fragment at anaphase I. Breakage of the bridge at the point indicated will give rise at anaphase II to a complete chromosome + (d) in addition to a fragment and two normal chromosomes. (↓) Breakage point; (×) crossover point.

[Figure 4: diagram with panels: standard chromosome; inversion A; inversion B; pachytene in heterozygote of A and B; gametes (with 2 parental chromosomes).]

FIG. 4. Two partially overlapping inversions are combined in a heterozygote; in meiosis a single exchange occurs in the common inverted region. Chromosomes deficient for (e) and with (e) duplicated result. (x) Crossover point.
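The outcome described in the legend to Fig. 4 can be reproduced with a short sketch. It rests on assumptions that go beyond what the text states: the chromosome is taken to carry segments a to f in that order, inversion A is assumed to cover c-d and inversion B to cover c-d-e, and a prime marks a segment of reversed polarity. The function names are hypothetical.

```python
def invert(chrom, start, end):
    """Invert chrom[start:end], reversing both order and polarity."""
    segment = [(gene, -polarity) for gene, polarity in reversed(chrom[start:end])]
    return chrom[:start] + segment + chrom[end:]


def break_after(chrom, gene):
    """Index of the break lying immediately after the named segment."""
    names = [name for name, _ in chrom]
    return names.index(gene) + 1


def show(chrom):
    """Render a chromosome; a prime marks a segment of reversed polarity."""
    return " ".join(g if p > 0 else g + "'" for g, p in chrom)


standard = [(gene, +1) for gene in "abcdef"]
inversion_a = invert(standard, 2, 4)  # assumed to cover c-d
inversion_b = invert(standard, 2, 5)  # assumed to cover c-d-e

# Single exchange between d and c, inside the region inverted in both.
cut_a = break_after(inversion_a, "d")
cut_b = break_after(inversion_b, "d")
deficient = inversion_a[:cut_a] + inversion_b[cut_b:]
duplicated = inversion_b[:cut_b] + inversion_a[cut_a:]

print(show(inversion_a))  # a b d' c' e f
print(show(inversion_b))  # a b e' d' c' f
print(show(deficient))    # a b d' c' f
print(show(duplicated))   # a b e' d' c' e f
```

In this model the duplicated chromosome carries (e) twice with opposite polarity and loses nothing, while its reciprocal lacks (e) altogether, in agreement with the legend.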

The structures resulting from mechanisms 1-3 directly permit secondary duplication either of a single gene, if this was the extent of primary duplication, or of a group of duplicated genes to yield mixed pseudoallelic clusters. This also applies after mechanism 7 followed by ring closure and breakage at a different point, bringing the duplicate regions within one chromosome arm. In these cases, where the polarity of the original duplicate genes is the same, only the genes involved in the primary duplication can be increased in number in the secondary process. The structures resulting from mechanisms 4, 5 and 6, in which the duplicate genes have reverse polarity, would require an inversion including one of the duplicate genes to take place before secondary duplication could occur by this process.

However, homologous unequal crossing-over might have other consequences in all cases except 1-3. After duplication by mechanism 7 it could result in restoration of the unduplicated ring chromosome or in formation of a double-length dicentric chromosome which might by breakage produce an extension of the terminal duplication. A similar form of extension might occur following mechanism 4, where the reversed polarity of the duplicate genes could also lead to anaphase bridge formation. In both cases, only genes which might have been duplicated
in the primary event by chromosome breakage at a different point can be added to the duplicated region. Similarly, the terminal duplication with reverse polarity, d (Fig. 3) from mechanism 5(c) is able to bring about increasein the size of the duplicated region, but only up to the original lim it.
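Secondary duplication by homologous unequal crossing-over, for duplicate genes of the same polarity, can be illustrated by a minimal sketch in Python. The representation (a list of gene labels, with "G" standing for each copy of the duplicated gene and "a", "b" for flanking single-copy genes) and the function name `unequal_crossover` are assumptions made for illustration only. The second copy on one chromosome is taken to pair with the first copy on its homologue, with the exchange placed immediately after the paired copies.

```python
def unequal_crossover(chrom_1, chrom_2, copy_1, copy_2, gene="G"):
    """Pair copy number copy_1 of gene on chrom_1 with copy number copy_2
    on chrom_2 (both 0-based) and exchange just after the paired copies.
    Returns the two reciprocal products."""
    pos_1 = [i for i, g in enumerate(chrom_1) if g == gene][copy_1]
    pos_2 = [i for i, g in enumerate(chrom_2) if g == gene][copy_2]
    return (chrom_1[:pos_1 + 1] + chrom_2[pos_2 + 1:],
            chrom_2[:pos_2 + 1] + chrom_1[pos_1 + 1:])


# Hypothetical chromosome carrying a tandem duplication of gene G.
duplicated = ["a", "G", "G", "b"]

# Second copy on one homologue pairs with the first copy on the other.
triple, single = unequal_crossover(duplicated, duplicated, 1, 0)
print(triple)  # ['a', 'G', 'G', 'G', 'b']
print(single)  # ['a', 'G', 'b']
```

In this sketch only the gene already duplicated increases in number; the flanking genes a and b remain single in both products, in keeping with the restriction stated above for duplicates of the same polarity.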

[Figure 5: diagram of the pachytene arrangement and the anaphase I configuration in a ring-rod heterozygote.]

FIG. 5. Following a single exchange in a ring-rod heterozygote, a dicentric configuration is formed at anaphase I. If breakage of the bridge occurs as shown, duplications and deficiencies of (h) occur among the products. ( 4 ) Breakage point; ( x) crossover point.

Mechanism 6, however, results in the sub-terminal duplication with reverse polarity e (Fig. 4) which, through homologous unequal crossing-over, can lead to anaphase bridge formation and possible duplication of all the genes between the original inversion and the centromere also with reverse polarity, accompanied by loss of the terminal region f. The end result is thus the same as that of mechanisms 5(a) and (b), but it is achieved stepwise. In common with stepwise enlargement of the duplicated regions described above, this might have evolutionary advantage in that each small change could separately undergo the test of selection.


7. Conclusions

Genes which appear upon biochemical evidence to be products of duplications may be found by genetic analysis to have any of the four position relationships classified above (Table 3), as may be illustrated by those for the human haemoglobins. The position of such genes, as detected today, may be either that determined by the primary duplication event or resulting from subsequent chromosome rearrangements (Ingram, 1961). It is therefore difficult to deduce the particular primary mechanism of a known duplication. The simplest explanation of the haptoglobin α-chain partial duplication is one of the two offered by Smithies (1964). Nevertheless, this gene has been shown (Bloom, Gerald & Reisman, 1967) to be sited near the end of a chromosome arm and on a chromosome for which ring formation is known. Mechanisms 1 and 7 should therefore be considered. Other mechanisms are improbable, since the genes are known to be adjacent. The same applies to the haemoglobin β and δ loci, where complete gene duplication has occurred. Unless there are more haemoglobin loci than usually recognized, each of the duplications proposed in Ingram’s scheme must have been a primary event, although there appears to be no evidence against the alternative that precursors of the two “non-α” genes “β” and “γ” arose by homologous unequal crossing-over shortly after the adjacent duplication marking the origin of “α” and “non-α” loci, the similarity between the β and γ chains being explained by their functional relationship rather than their more recent divergence. Species distribution rules out a similar origin for the δ gene.

In any case where there is clear evidence for the existence of three or more homologous loci, as there is for the annelid phosphagen kinases (Watts & Watts, 1967, 1968) it is most probable that the primary events were of a type permitting further increase of the number of genes by homologous unequal crossing-over.
For these enzymes, the hypothesis awaits the availability of amino acid sequence data.

Homologous unequal crossing-over is a process with great evolutionary potential since it allows secondary duplication of duplicate genes, can bring about extension of duplicated regions and increases the possibilities of variation by recombination. Because of this, and also in order to examine the possibility that non-homologous crossing-over occurs, homology must be seen not as an all-or-none property of genes, but as one that needs to be quantitated. The Lepore haemoglobins provide the only evidence as to how far structural homology can be relaxed and crossing-over still occur; this value is for unequal crossing-over, which had to occur in face of competition, as it were, from β-β and δ-δ gene synapsis, these being both locationally homologous and probably more highly structurally homologous. Crossing-over between alleles might require less structural homology, although it was not possible to find evidence for a lowering of homology between alleles as a result of the degeneracy of the genetic code.

The degree of homology averaged for a whole gene or a whole chromosome may not be the same as that required for crossing-over to take place at a particular point. Homologous chromosomes might in fact be discontinuously homologous at the minute level of the base-sequence of their DNA. The general finding that cross-over frequency varies in different parts of chromosomes might in part be due to this. One interesting paradox remains. It can be concluded from the evidence of Yanofsky (1963) that recombination can occur within dissimilar codons, indicating that identical bases are not required at the actual cross-over point. Thus, although homology of base-sequence is clearly of importance in the events leading up to crossing-over, the role that it plays in this process is still extremely obscure.

In consideration of the high level of structural homology which appears to be required for crossing-over to take place between genes, it seems best to reject non-homologous unequal crossing-over as a main source of primary gene duplication until such a time as some evidence for its occurrence is forthcoming. Genes may be duplicated by a number of mechanisms for which there is direct or indirect cytogenetic evidence and these provide a stimulating range of evolutionary possibilities.

We should like to thank Professor P. E. Polani for his encouragement and for helpful discussion of cytogenetic concepts. R. L. W. also thanks the Spastics Society for support during the preparation of this paper.

REFERENCES

BAGLIONI, C. (1962). Proc. natn. Acad. Sci. U.S.A. 48, 1880.
BAGLIONI, C. (1965). Biochim. biophys. Acta, 97, 37.
BENDER, H. A. (1967). Genetics, Princeton, 55, 249.
BLOOM, G. E., GERALD, P. S. & REISMAN, L. E. (1967). Science, N.Y. 156, 1746.
BRIDGES, C. B. (1936). Science, N.Y. 83, 210.
BONNER, D. M., DE MOSS, J. A. & MILLS, S. E. (1965). In “Evolving Genes and Proteins” (V. Bryson and H. J. Vogel, eds.), p. 305. New York: Academic Press.
ECK, R. V. & DAYHOFF, M. O. (1966). “Atlas of Protein Sequence and Structure 1966”. Maryland: National Biomedical Research Foundation.
HOROWITZ, N. H. (1965). In “Evolving Genes and Proteins” (V. Bryson and H. J. Vogel, eds.), p. 15. New York: Academic Press.
HUXLEY, J. (1942). “Evolution: the Modern Synthesis”, p. 89. London: George Allen and Unwin Ltd.
INGRAM, V. M. (1961). Nature, Lond. 189, 704.
JUKES, T. H. (1967). Biochem. biophys. Res. Commun. 27, 573.
KAUFMANN, B. P. & BATE, R. C. (1938). Proc. natn. Acad. Sci. U.S.A. 24, 368.
LEWIS, E. B. (1941). Proc. natn. Acad. Sci. U.S.A. 27, 31.
LEWIS, E. B. (1951). Cold Spring Harb. Symp. quant. Biol. 16, 159.
MCCLINTOCK, B. (1941). Cold Spring Harb. Symp. quant. Biol. 9, 72.


MULDAL, S. (1952). Heredity, Lond. 6, 55.
NEWMEYER, D. & TAYLOR, C. W. (1967). Genetics, Princeton, 56, 771.
PRITCHARD, R. H. (1960). Genet. Res. 1, 1.
ROBERTS, P. A. (1965). Genetics, Princeton, 52, 1017.
SANGER, F. & BROWNLEE, G. G. (1967). In Symposium on “Structure and Function of Transfer RNA and 5S RNA”. 4th Meeting Federation of European Biochemical Societies.
SLIZYNSKA, H. (1963). Genet. Res. 4, 154.
SMITHIES, O. (1964). Cold Spring Harb. Symp. quant. Biol. 29, 309.
SMITHIES, O., CONNELL, G. E. & DIXON, G. H. (1962). Nature, Lond. 196, 232.
STEBBINS, G. L. (1950). “Variation and Evolution in Plants”. New York: Columbia University Press.
STURTEVANT, A. H. (1925). Genetics, Princeton, 10, 117.
STURTEVANT, A. H. & BEADLE, T. H. (1936). Genetics, Princeton, 21, 554.
TATUM, E. L. (1961). Proc. 5th Int. Congr. Biochem. (U.S.S.R.), 3, 178. London: Pergamon Press.
WAGNER, R. P. & MITCHELL, H. K. (1964). “Genetics and Metabolism”, p. 355. New York: John Wiley.
WATTS, D. C. (1968). In “Advances in Comparative Physiology and Biochemistry”, Vol. 3. (In press.)
WATTS, R. L. & WATTS, D. C. (1967). Abstracts 4th Meeting Fed. Europ. Biochem. Socs., p. 28.
WATTS, R. L. & WATTS, D. C. (1968). Nature, Lond. 217, 1125.
WEISBLUM, B., CHERAYIL, J. D., BOCK, R. M. & SÖLL, D. (1967). J. molec. Biol. 28, 275.
WEISBLUM, B., GONANO, F., VON EHRENSTEIN, G. & BENZER, S. (1965). Proc. natn. Acad. Sci. U.S.A. 53, 328.
YANOFSKY, C. (1963). In “Cytodifferentiation and Macromolecular Synthesis” (M. Locke, ed.), p. 15. New York: Academic Press.
ZIMMERING, S. (1955). Genetics, Princeton, 40, 809.