Evolution of glycolysis


L. A. FOTHERGILL-GILMORE and P. A. M. MICHELS

2. Quantitative Analysis
3. Glycolytic Complexes
XI. GLYCOLYTIC ENZYMES WITH OTHER FUNCTIONS
XII. CONCLUSIONS
ACKNOWLEDGEMENTS
REFERENCES

I. INTRODUCTION

"Metabolic pathways are boring!" The attitude implicit in this assertion is not uncommon, and probably arises because pathways are frequently considered to be uninteresting collections of individual enzymes which are grouped together only because they catalyze a sequence of chemical reactions. This superficial view misses the point that a pathway is much more than the sum of its individual reactions. A metabolic pathway is an entity in its own right, with its own distinct and fascinating properties. Thus, for example, the subtle and sensitive ways that a pathway can respond to variations in metabolic requirements are functions of the pathway as a whole. These collective properties of a particular pathway can vary from tissue to tissue and from organism to organism.

It is the overall purpose of this review to consider how metabolic pathways may have evolved. Where appropriate, emphasis will be placed on the evolution of the collective properties of pathways.

Glycolysis is the obvious pathway to take as an example, to examine in detail, to test our ideas. There are many reasons why this pathway is particularly suitable for a consideration of the evolution of metabolic pathways. Glycolysis is a central metabolic pathway and is present, at least in part, in all organisms. It is thus possible to compare the pathway from phylogenetically distant organisms. The individual enzymes of the pathway are exceptionally well characterized, both in terms of enzymic properties and in terms of detailed structures. Crystal structures of all the enzymes are available, as are a great many sequences. From this information we know that some of the enzymes have similar three-dimensional structures, and we also know that the sequences of individual enzymes are strongly conserved throughout evolution. In addition, there are several glycolytic enzymes that catalyze similar reactions, and it is therefore possible to examine the proposal that, for example, an ancestral kinase may have diverged to the present four glycolytic kinases. Many enzymes in the pathway bind similar substrate or effector molecules such as nucleotides, and a consideration of the evolution of these binding sites and effector functions is of obvious relevance.

This review will primarily be restricted to a consideration of the 10 mainstream glycolytic reactions and enzymes included in Fig. 1. In addition, mention will be made as appropriate of several other enzymes closely involved with glycolysis such as bisphosphoglycerate mutase, lactate dehydrogenase and alcohol dehydrogenase.

The various aspects of evolution presented in this review will be discussed in the context of the knowledge of the detailed structures of the enzymes. It is therefore appropriate to begin with a summary of the structural information currently available.

II. METHODS AND NOMENCLATURE

1. Methods

This review is based on protein sequence and crystal structure data that were available in 1991-2. Multiple sequence alignments were generated by the CLUSTAL software package (Higgins and Sharp, 1988), with subsequent modifications if necessary to ensure that insertions and deletions do not fall within segments of regular secondary structure as observed in the three-dimensional structure of a representative member of each enzyme family. The elements of secondary structure are shown above the aligned sequences (a = α-helix; b = β-strand). Residues identical in all sequences are indicated by asterisks (*) underneath the alignment, and similar residues are indicated by dots (.). Percentages of identity between sequences were calculated from the number of identical residues between aligned sequences; insertions, deletions and extensions have not been counted. Evolutionary relationships between amino-acid sequences are expressed as "accepted point mutations per 100 residues" (PAMs). This parameter, which was calculated according to Dayhoff (1978), takes into account the back mutations and multiple hits that may have occurred during evolution.

Drawings of three-dimensional structures were made using the FRODO software package from atomic coordinates deposited in the Protein Data Bank at the Brookhaven National Laboratory (U.S.A.), or (in the case of glucosephosphate isomerase, aldolase, triosephosphate isomerase and pyruvate kinase) from coordinates that were kindly made available to us. In the drawings, α-helices are shown in green, β-strands in orange, random coil and other structures in blue, and substrates and effectors in black.

[Reaction scheme: glucose → glucose 6-P (HK) → fructose 6-P (PGI) → fructose 1,6-P2 (PFK) → dihydroxyacetone-P + glyceraldehyde 3-P (ALD; interconverted by TIM) → 1,3-bisphosphoglycerate (GAPDH) → 3-phosphoglycerate (PGK) → 2-phosphoglycerate (PGAM) → phosphoenolpyruvate (ENO) → pyruvate (PYK).]

FIG. 1. Glycolysis. The abbreviations for the enzymes are as follows: HK, hexokinase; PGI, glucosephosphate isomerase; PFK, phosphofructokinase; ALD, aldolase; TIM, triosephosphate isomerase; GAPDH, glyceraldehyde-phosphate dehydrogenase; PGK, phosphoglycerate kinase; PGAM, phosphoglycerate mutase; ENO, enolase; PYK, pyruvate kinase. The letter "P" in the chemical structures represents a phospho group.
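The percent-identity measure described in the Methods can be illustrated with a minimal Python sketch. It assumes that "insertions, deletions and extensions have not been counted" means that any alignment column containing a gap is skipped; the function names, the gap character and the toy sequences are illustrative assumptions, not part of the original analysis. The second function is not Dayhoff's table-based procedure: it substitutes Kimura's (1983) empirical formula, which is often used as a convenient approximation to the PAM correction for back mutations and multiple hits.

```python
import math

GAP = "-"  # assumed gap symbol in the aligned sequences


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over columns in which both sequences carry a residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == GAP or b == GAP:
            continue  # gapped columns (insertions/deletions/extensions) are skipped
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no residue pairs to compare")
    return 100.0 * identical / compared


def approx_pam(identity_percent: float) -> float:
    """Approximate PAM distance using Kimura's formula (an assumption here,
    standing in for the Dayhoff tables used in the review)."""
    p = 1.0 - identity_percent / 100.0      # observed fraction of differences
    arg = 1.0 - p - 0.2 * p * p
    if arg <= 0:
        raise ValueError("sequences too divergent for this approximation")
    return -100.0 * math.log(arg)           # substitutions per 100 residues


if __name__ == "__main__":
    ident = percent_identity("ACDE-G", "ACDFHG")  # toy aligned pair
    print(f"{ident:.1f}% identity, about {approx_pam(ident):.0f} PAMs")
```

Under this definition, the 378 identical residues out of 485 reported for yeast hexokinases A and B correspond to about 78% identity. The corrected distance always exceeds the raw percentage difference, because some positions are assumed to have changed more than once.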

2. Nomenclature

Each enzyme is indicated by an abbreviation of minimally three and maximally seven characters in the figures presenting the sequence alignments and identity matrices. The first three characters denote the organism involved, and the other characters are used to distinguish different isoenzymes, either expressed in different tissues or co-existing within one cell type. The nomenclature and the references reporting the sequences are given in Table 1.

III. STRUCTURES OF GLYCOLYTIC ENZYMES

The early 1970s saw X-ray diffraction techniques beginning to be applied to the solution of the structures of an increasing number of different proteins. Thanks to the far-sightedness of several groups involved in protein crystallography and protein sequencing at that time, it was decided to make a concerted effort to solve the detailed structures of all the enzymes of the glycolytic pathway. This ambitious goal has recently been achieved, and crystal structures and amino acid sequences are now available for all the glycolytic enzymes. Indeed, in many cases, the structures have been solved of a particular enzyme in different conformations (including allosteric R and T states), or in the presence and absence of ligands, or from phylogenetically distant organisms.

One of the most surprising aspects of the structures of the glycolytic enzymes revealed by the early stages of the crystallographic work was the fact that they all have a common pattern of structure. All glycolytic enzymes have a core of mostly parallel β-strands flanked by α-helices, usually arranged in more than one domain. In some cases the structural similarities are extensive and whole domains are essentially indistinguishable in topology. In addition, most enzymes have their active sites and effector sites located between domains. These discoveries of course led to speculation concerning the evolution of these enzymes, and will be discussed in more detail in the section on enzymes with similar domains (see Section III.3).

Recent improvements in the techniques for DNA sequencing and for high-sensitivity protein sequencing have brought about a remarkable increase over the last few years in the number of sequences available. Glycolytic enzymes are no exception to this general observation. In 1985, about 60 complete sequences were determined for glycolytic enzymes (reviewed by Fothergill-Gilmore, 1986). At the time of writing the present review (1991-2), there are about 160 sequences available. Most of the glycolytic enzymes have been sequenced from several different organisms and/or from several different tissues, often at both the nucleotide and protein levels. Glyceraldehyde-phosphate dehydrogenase currently holds the record with 45 sequences determined from a wide range of organisms. This wealth of sequence information, especially when correlated with the X-ray diffraction data, enables us to examine many aspects of the evolution of glycolysis in considerable detail.

1. Sequences and Crystal Structures

(a) Hexokinase

Hexokinase (EC 2.7.1.1) catalyzes the transfer of a phospho group from ATP to glucose by a direct in-line mechanism. Under physiological conditions the reaction is essentially irreversible. The structurally best characterized hexokinase is from yeast (yeast corresponds to Saccharomyces cerevisiae throughout this review unless otherwise specified) where it occurs as two isoenzymes designated A and B. The two isoenzymes have 378 of their 485 amino acids in common (Stachelek et al., 1986), and each enzyme is active as either a monomer or a dimer. Dimer formation is favoured by glucose and MgATP (Shill et al., 1974), whereas monomers are found in the presence of glucose and ADP (Derechin et al., 1972). The activity of yeast hexokinase can be enhanced several-fold by allosteric activators

TABLE 1. ENZYME NOMENCLATURE

Enzyme Hexokinase

Glucosephosphate isomerase

Abbreviation hum1-N mou1-N rat2-N rat3-N hum1-C rat1-C mou1-C rat2-C rat3-C ratglk yeaglk yeaA yeaB hum pig mou yea Kla

human mouse rat rat human rat mouse rat rat rat yeast yeast yeast human pig mouse yeast Kluyveromyces lactis

Pfa TbrGl claCh Eco BstA

Plasmodium falciparum Trypanosoma brucei Clarkia Escherichia coli Bacillus stearothermophilus Bacillus stearothermophilus human muscle rabbit muscle human liver mouse liver yeast yeast human muscle rabbit muscle human liver mouse liver human platelet yeast yeast Escherichia coli

BstB ATP-dependent phosphofructokinase

hummus-N rabmus-N humliv-N mouliv-N yeaA-N yeaB-N hummus-C rabmus-C humliv-C mouliv-C humpla-C yeaA-C yeaB-C Eco Bst

PPi-dependent phosphofructokinase Aldolase Class I

Aldolase Class II

Triosephosphate isomerase

Source

hum mon

Bacillus stearothermophilus Spiroplasma citri potato potato Propionibacterium freudenreichii human rat mouse rabbit human rat chicken human rat Drosophila melanogaster maize rice Plasmodium falciparum Trypanosoma brucei yeast Escherichia coli Corynebacterium glutamicum human monkey

mou rab chi coe maiCy

mouse rabbit chicken coelacanth maize

Anid

Aspergillus nidulans

Sci potA potB Pfr humA ratA mouA rabA humB ratB chiB humC ratC Dme maiCy ricCy Pfa TbrGl yea Eco Cgl

Isoenzyme/Location I N-terminal half I C-terminal half II N-terminal half III N-terminal half I N-terminal half I C-terminal half I C-terminal half II C-terminal half III C-terminal half glucokinase glucokinase A N-terminal half B C-terminal half

Reference

A

Nishi et al., 1988 Arora et al., 1990 Thelen and Wilson, 1991 Schwab and Wilson, 1991 Nishi et al., 1988 Schwab and Wilson, 1988 Arora et al., 1990 Thelen and Wilson, 1991 Schwab and Wilson, 1991 Andreone et al., 1989 Albig and Entian, 1988 Stachelek et al., 1986 Stachelek et al., 1986 Gurney, 1987 Chaput et al., 1988 Gurney et al., 1986 Tekamp-Olsen et al., 1988 Wésolowski-Louvel et al., 1988 Kaslow and Hill, 1990 Marchand et al., 1989 Tait et al., 1988 Froman et al., 1989 Tao et al., 1989

B

Tao et al., 1989

N-terminal half N-terminal half N-terminal half N-terminal half A N-terminal half B N-terminal half C-terminal half C-terminal half C-terminal half C-terminal half C-terminal half A C-terminal half B C-terminal half

Nakajima et al., 1987 Lee et al., 1987 Levanon et al., 1989 Gehnrich et al., 1988 Heinisch et al., 1989 Heinisch et al., 1989 Nakajima et al., 1987 Lee et al., 1987 Levanon et al., 1989 Gehnrich et al., 1988 Simpson, 1991 Heinisch et al., 1989 Heinisch et al., 1989 Shirakihara and Evans, 1988 French and Chang, 1987

glycosome chloroplast

α-subunit β-subunit A A A A B B B C C

glycosome

Chevalier et al., 1990 Carlisle et al., 1990 Carlisle et al., 1990 Ladror et al., 1991 Freemont et al., 1988 Joh et al., 1985 Mestek et al., 1987 Tolan et al., 1984 Rottmann et al., 1984 Tsutsumi et al., 1984 Burgess and Penhoet, 1985 Rottmann et al., 1987 Kukita et al., 1988 Malek et al., 1985 Kelley and Tolan, 1986 Hidaka et al., 1990 Knapp et al., 1990 Marchand et al., 1988 Schwelberger et al., 1989 Alefounder et al., 1989 Von der Osten et al., 1989 Maquat et al., 1985 Old and Mohrenweiser, 1988 Cheng et al., 1990 Corran and Waley, 1975 Straus and Gilbert, 1985 Kolb et al., 1974 Marchionni and Gilbert, 1986 McKnight et al., 1986

TABLE 1 (continued) Enzyme

Abbreviation yea Spo

Glyceraldehyde phosphate

dehydrogenase

Isoenzyme/Location

TbrCy TbrGl TcrGl maiChA peaChA spiChA

yeast Schizosaccharomyces pombe Trypanosoma brucei Escherichia coli human pig rat rat mouse hamster chicken lobster Drosophila melanogaster Drosophila melanogaster Caenorhabditis elegans Schistosoma mansoni ice plant maize mustard Aspergillus nidulans Cryphonectria parasitica Ustilago maydis yeast yeast Kluyveromyces lactis Zygosaccharomyces rouxii Trypanosoma brucei Trypanosoma brucei Trypanosoma cruzi maize pea spinach

tobChA peaChB spiChB

tobacco pea spinach

chloroplast A chloroplast B chloroplast B

tobChB EcoA

tobacco Escherichia coli

chloroplast B A

EcoB

Escherichia coli

B

Taq Tma Zmo Bco Bme Bst

Pwo hum hor

Thermus aquaticus Thermotoga maritima Zymomonas mobilis Bacillus coagulans Bacillus megaterium Bacillus stearothermophilus Bacillus subtilis Methanobacterium bryantii Methanobacterium formicicum Methanothermus fervidus Pyrococcus woesei human horse

rat mou humtes

rat mouse human

moutes Anid

mouse Aspergillus nidulans

Pch Tre Tvi yea

Penicillium chrysogenum Trichoderma reesei Trichoderma viride yeast

Kla

Kluyveromyces lactis

TbrGl Eco hum pig rat1 rat2 mou ham chi lob Dme1 Dme2 Cel Sma iceCy maiCy musCy Anid Cpa Uma yea1 yea2 Kla Zro

Bsu Mbr Mfo Mfe Phosphoglycerate kinase

Source

Reference Alber and Kawasaki, 1982 Russell, 1985

glycosome

Swinkels et al., 1986 Pichersky et al., 1984 Tso et al., 1985a Harris and Perham, 1968 Tso et al., 1985a Fort et al., 1985 Sabath et al., 1990 Vincent and Fort, 1990 Dugaiczyk et al., 1983 Davidson et al., 1967 Tso et al., 1985b Tso et al., 1985b

cytosol cytosol cytosol

1 2

cytosol glycosome glycosome chloroplast A chloroplast A chloroplast A

Yarbrough et al., 1987 Goudot-Crozel et al., 1989 Ostrem et al., 1990 Brinkmann et al., 1987 Martin and Cerff, 1986 Punt et al., 1988 Choi and Nuss, 1990 Smith and Leong, 1990 Holland and Holland, 1979 Holland et al., 1983 Shuster, 1990 Imura et al., 1987 Michels et al., 1991 Michels et al., 1986 Kendall et al., 1990 Brinkmann et al., 1987 Liaud et al., 1990 Brinkmann et al., 1989; Ferri et al., 1990 Shih et al., 1986 Liaud et al., 1990 Brinkmann et al., 1989; Ferri et al., 1990 Shih et al., 1986 Branlant and Branlant, 1985 Alefounder and Perham, 1989 Hecht et al., 1989 Schultes et al., 1990 Conway et al., 1987 Tefay et al., 1989 Schläpfer et al., 1990 Branlant et al., 1989 Viaene and Dhaese, 1989 Fabry et al., 1989 Fabry et al., 1989 Fabry and Hensel, 1988

testis testis

Zwickl et al., 1990 Michelson et al., 1983 Banks et al., 1979; Merrett, 1981 Ciccarese et al., 1989 Mori et al., 1986 Tani et al., 1985; McCarrey and Thomas, 1987 Boer et al., 1987 Clements and Roberts, 1986 Van Solingen et al., 1988 Vanhanen et al., 1989 Goldman et al., 1990 Watson et al., 1982; Perkins et al., 1983 Fournier et al., 1990

TABLE 1 (continued) Enzyme

Abbreviation TbrCy TbrGl CfaCy CfaGl wheCy wheCb Eco Tth Zmo Bme Mbr

Enolase

Pyruvate kinase

Isoenzyme/Location cytosol glycosome cytosol glycosome cytosol chloroplast

hummus humbra

Trypanosoma brucei Trypanosoma brucei Crithidia fasciculata Crithidia fasciculata wheat wheat Escherichia coli Thermus thermophilus Zymomonas mobilis Bacillus megaterium Methanobacterium bryantii Methanothermus fervidus human muscle human brain

humrbc rabrbc mourbc yea

human red blood cell rabbit red blood cell mouse red blood cell yeast

Sco f26bp

Streptomyces coelicolor rat liver

humA ratA ducA XlaA humB ratB chiB humG ratG Dme yea1 yea2 hummus2 ratmus2 ratmus1 catmus1 chimus humliv ratliv

human rat duck Xenopus laevis human rat chicken human rat Drosophila melanogaster yeast yeast human rat rat cat chicken human rat

ratrbc pot Anid Anig yea

rat potato Aspergillus nidulans Aspergillus niger yeast

R

TbrCy1 TbrCy2 Eco Bst

Trypanosoma brucei Trypanosoma brucei Escherichia coli Bacillus stearothermophilus

1 2

Mfe Phosphoglycerate mutase

Source

Reference Osinga et al., 1985 Osinga et al., 1985 Swinkels et al., 1988 Swinkels et al., 1988 Longstaff et al., 1989 Longstaff et al., 1989 Nellemann et al., 1989 Bowen et al., 1988 Conway and Ingram, 1988 Schläpfer et al., 1990 Fabry et al., 1990 Fabry et al., 1990

fructose-2,6-bisphosphatase A A A A B B B G G 1 2 M2 M2 M1 M1 M L L

cytosol cytosol

Shanske et al., 1987 Sakoda et al., 1988; Blouquit et al., 1988 Joulin et al., 1986 Yanagawa et al., 1986 Le Boulch et al., 1988 White and Fothergill-Gilmore, 1988 White et al., 1992 Lively et al., 1988 Giallongo et al., 1986 Sakimura et al., 1985a Wistow et al., 1988 Segil et al., 1988 Cali et al., 1990 Oshima et al., 1989 Russell et al., 1986 McAleese et al., 1988 Sakimura et al., 1985b Bishop and Cortes, 1990 Holland et al., 1981 Holland et al., 1981 Tani et al., 1988a Noguchi et al., 1986 Noguchi et al., 1986 Muirhead et al., 1986 Lonberg and Gilbert, 1983 Tani et al., 1988b Inoue et al., 1986; Lone et al., 1986 Noguchi et al., 1987 Blakeley et al., 1990 De Graaff and Visser, 1988 De Graaff, 1989 Burke et al., 1983; McNally et al., 1989 Allert et al., 1991 Allert et al., 1991 Ohara et al., 1989 Muirhead, 1991

such as ATP (Kosow and Rose, 1971), which probably shift the monomer-dimer equilibrium towards the more active dimer (Steitz et al., 1977).

In mammals hexokinase exists as four different isoenzymes (reviewed by Wilson, 1984), one of which occurs abundantly in liver and is commonly known as glucokinase. Glucokinase has a subunit size of approximately Mr 50,000 and is thus similar to the yeast enzyme. The other isoenzymes (designated Types I, II and III), however, have about twice the subunit size, and have probably arisen by a process of gene duplication followed by fusion. An alignment of the sequences of human, rat, mouse and yeast hexokinases (Fig. 2) gives clear evidence to substantiate the gene doubling mechanism (see Section VI.1 for a more extensive discussion). Pairwise comparisons of the sequences (expressed as percent amino acid identities) are given in Matrix 1. The crystal structures of both the A and B isoenzymes have been solved in the presence and

[FIG. 2, first part: hexokinase sequence alignment. Rows: hum1-N, mou1-N, rat2-N, rat3-N, hum1-C, rat1-C, mou1-C, rat2-C, rat3-C, ratglk, yeaglk, yeaA, yeaB.]

absence of various ligands (Anderson et al., 1978; Bennett and Steitz, 1978, 1980a, b; Steitz et al., 1981). The structure of the monomeric form of the B isoenzyme crystallized in the absence of glucose has been solved and refined at 2.1 Å resolution; the binary complex of the A isoenzyme with glucose has been independently solved at 4.5 Å resolution, and refined at 3.5 Å. Each subunit of hexokinase is divided into two distinct domains, each with a core of strands which form a β-sheet flanked by α-helices (Fig. 3). The active site is located between the two domains. A dramatic discovery of these studies was that the enzyme undergoes a substantial movement of the domains on binding of ligands, as can be seen from a comparison of the apo and holo enzymes shown in Fig. 3. The structure of the B isoenzyme in the unligated, open form is shown on the left in comparison with the closed form of the A isoenzyme. Hexokinase provided the first example of an enzyme with major domain movement, a feature which is now known to occur in many kinases and indeed in other enzymes.
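The gene duplication and fusion argument made above for the double-size mammalian hexokinases rests on aligning the N-terminal half of a sequence against its own C-terminal half. A minimal Python sketch of that comparison follows. It is an illustration under stated assumptions, not the procedure used for Fig. 2 (which was built with CLUSTAL and adjusted by hand): the alignment is a plain Needleman-Wunsch global alignment, the scoring values (+1 match, -1 mismatch, -2 gap) are arbitrary choices, the split is taken at the exact midpoint, and the sequence in the example is a toy string rather than a real hexokinase.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two sequences and return the two gapped strings."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def identity_between_halves(sequence):
    """Percent identity between the N- and C-terminal halves of one sequence."""
    mid = len(sequence) // 2
    aligned_n, aligned_c = needleman_wunsch(sequence[:mid], sequence[mid:])
    # Gapped columns are ignored, as in the percent-identity convention of Section II.
    pairs = [(x, y) for x, y in zip(aligned_n, aligned_c) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no aligned residue pairs")
    identical = sum(1 for x, y in pairs if x == y)
    return 100.0 * identical / len(pairs)


if __name__ == "__main__":
    # Toy "fused" sequence: a 10-residue unit followed by a copy with one substitution.
    print(identity_between_halves("ACDEFGHIKL" + "ACDEYGHIKL"))
```

A high identity between the two halves, comparable to the identity between either half and a single-size enzyme such as glucokinase, is the kind of signal that the duplication-fusion hypothesis predicts. A real analysis would use a substitution matrix and affine gap penalties rather than the simple scores assumed here.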

[FIG. 2, continued: hexokinase sequence alignment.]

(b) Glucosephosphate isomerase

The second step along the pathway to catabolize glucose is the aldose/ketose isomerization of glucose 6-phosphate to fructose 6-phosphate. This equilibrium reaction is catalyzed by glucosephosphate isomerase (EC 5.3.1.9), which promotes the intramolecular transfer of a proton between carbon 2 and carbon 1 (see Fig. 1). The enzyme is active as a dimer with identical subunits of Mr 66,000. The crystal structure at 2.6 Å resolution (Achari et al., 1981) shows that the enzyme subunit folds into two unequal domains, the larger of which has a core of six β-strands surrounded by α-helices (Fig. 4). The presence of a six-stranded β-sheet is reminiscent of the nucleotide-binding domains found in many dehydrogenases, but in the case of glucosephosphate isomerase the connections between the elements of secondary structure are quite different. The smaller domain has a less regular β-sheet, with five strands linked by α-helices and an irregularly folded loop. The active site is

[FIG. 2, continued: hexokinase sequence alignment.]

FIG. 2. Alignment of hexokinase sequences. See Section II for nomenclature and references for the sequences. The numbering is according to the yeast enzyme. The correlation between sequence and secondary structure was not available at the time of writing.

an enclosed pocket formed partly by the slight cleft between the two β-sheets, which point toward each other, and partly by portions of the other subunit. Glucosephosphate isomerase is an intriguing example of an enzyme with more than one function (see Section XI). The sequence of the mouse enzyme was originally reported as that of a lymphokine with neurotrophic activity, termed neuroleukin (Gurney et al., 1986). At that time no glucosephosphate isomerase sequences had been published. Subsequently the sequences of 11 glucosephosphate isomerases have become available (see Fig. 5), and it is now apparent that neuroleukin and glucosephosphate isomerase are one and the same molecule. Pairwise comparisons of the sequences (expressed as per cent amino acid identities for the whole enzymes and for some domains) are given in Matrix 2.
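The per cent identities quoted in the matrices can be illustrated with a minimal Python sketch. It assumes two rows taken from an existing alignment, with `-` or `.` marking gaps, and it counts only the columns in which both sequences have a residue. How gapped columns were treated when the matrices were compiled is not stated here, so that convention is an assumption. The function name `percent_identity` and the two toy sequences are hypothetical and are not taken from the figures.

```python
def percent_identity(seq_a: str, seq_b: str, gap_chars: str = "-.") -> float:
    """Per cent identical residues over columns where neither row has a gap.

    Assumption: gapped columns are skipped entirely; other conventions
    (e.g. dividing by the shorter sequence length) give different values.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in gap_chars or b in gap_chars:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Hypothetical toy rows (not real enzyme sequences).
toy_a = "ACDEFG-HIK"
toy_b = "ACDQFGW-IK"
print(percent_identity(toy_a, toy_b))  # 7 identical out of 8 compared columns
```

Applying the same function to every pair of rows in an alignment would fill a lower-triangular matrix of the kind shown in Matrices 1 to 3.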


FIG. 3. The structures of the open and closed forms of yeast hexokinase. The unligated open form of the B isoenzyme is on the left, and the closed form of the A isoenzyme on the right.

FIG. 4. Glucosephosphate isomerase from pig muscle. Only a single subunit of the dimeric enzyme is shown. The α-carbon coordinates were kindly made available by C. Davies and H. Muirhead.


FIG. 7. Phosphofructokinase from B. stearothermophilus in the R (on the left) and T conformations. In each case two subunits of the tetrameric enzyme are shown. It is likely that this bacterial dimer corresponds to a single subunit of the double-size eukaryotic enzyme.


MATRIX 1. PAIRWISE COMPARISON OF HEXOKINASE SEQUENCES

             h1-N   m1-N   r2-N   r3-N   h1-C   r1-C   m1-C   r2-C   r3-C   rglk yeaglk   yeaA   yeaB
hum1-N        100
mou1-N         95    100
rat2-N         68     68    100
rat3-N         41     41     45    100
hum1-C         52     51     56     43    100
rat1-C         53     52     57     43     90    100
mou1-C         52     51     56     43     90     97    100
rat2-C         55     54     59     46     78     77     77    100
rat3-C         52     51     53     47     63     64     64     67    100
ratglk         49     49     53     42     54     53     53     57     54    100
yeaglk         31     31     36     31     34     35     35     34     35     29    100
yeaA           32     31     33     28     35     36     35     37     35     34     38    100
yeaB           31     30     33     27     36     37     36     35     34     32     39     78    100

See Section II for nomenclature and references for the sequences.
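Because Matrix 1 is printed as a lower triangle, looking up a pair requires putting the two labels in row/column order first. The following Python sketch shows one way to do this. The values are transcribed from Matrix 1, the labels follow the matrix with the isoenzyme number written as a digit, and the names `LABELS`, `LOWER_TRIANGLE` and `identity` are hypothetical helpers introduced only for illustration.

```python
# Sequence labels in the row order of Matrix 1.
LABELS = ["hum1-N", "mou1-N", "rat2-N", "rat3-N", "hum1-C", "rat1-C", "mou1-C",
          "rat2-C", "rat3-C", "ratglk", "yeaglk", "yeaA", "yeaB"]

# Lower triangle of Matrix 1 (per cent identities), one list per row.
LOWER_TRIANGLE = [
    [100],
    [95, 100],
    [68, 68, 100],
    [41, 41, 45, 100],
    [52, 51, 56, 43, 100],
    [53, 52, 57, 43, 90, 100],
    [52, 51, 56, 43, 90, 97, 100],
    [55, 54, 59, 46, 78, 77, 77, 100],
    [52, 51, 53, 47, 63, 64, 64, 67, 100],
    [49, 49, 53, 42, 54, 53, 53, 57, 54, 100],
    [31, 31, 36, 31, 34, 35, 35, 34, 35, 29, 100],
    [32, 31, 33, 28, 35, 36, 35, 37, 35, 34, 38, 100],
    [31, 30, 33, 27, 36, 37, 36, 35, 34, 32, 39, 78, 100],
]


def identity(a: str, b: str) -> int:
    """Symmetric lookup: the matrix stores only entries with row >= column."""
    i, j = LABELS.index(a), LABELS.index(b)
    if i < j:
        i, j = j, i
    return LOWER_TRIANGLE[i][j]


# N-terminal versus C-terminal half of the human type 1 enzyme.
print(identity("hum1-N", "hum1-C"))
# N-terminal half of the human enzyme versus yeast hexokinase A.
print(identity("hum1-N", "yeaA"))
```

With the values above, the two halves of the human enzyme are 52% identical to each other, whereas the N-terminal half is 32% identical to yeast hexokinase A.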

[FIG. 5, alignment panels (GLUCOSEPHOSPHATE ISOMERASE). Sequences shown: hum, pig, mou, yea, Kla, Pfa, Tbr, cla, Eco, BstA, BstB.]

(c) Phosphofructokinase

Phosphofructokinase (EC 2.7.1.11) is an allosterically regulated enzyme that catalyzes the transfer of a phospho group from ATP to fructose 6-phosphate (Fig. 1). Under physiological conditions the reaction is essentially irreversible. A quite distinct enzyme, fructose 1,6-bisphosphatase, is required during gluconeogenesis to catalyze the reverse reaction. The regulation of the activities of these two enzymes can be of particular importance in controlling the flux through the glycolytic and gluconeogenic pathways (see Section X). The metabolic requirements of different organisms and of different tissues within an

FIG. 5. Alignment of glucosephosphate isomerase sequences. See Section II for nomenclature and references for the sequences. Elements of regular secondary structure observed in the pig muscle enzyme are shown. The boundaries between the three domains of the enzyme are indicated by the arrows (large domain = residues 1-102 and 290-512; small domain = residues 103-289; C-terminal domain = residues 513-557). The numbering is according to the pig enzyme.
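The domain boundaries given in the caption can be checked for internal consistency with a short Python sketch: the three domains should cover residues 1 to 557 of the pig enzyme exactly once, even though the large domain is built from two separate stretches of the chain. The residue ranges are those quoted in the caption; the names `GPI_DOMAINS`, `domain_lengths` and `tiles_sequence` are hypothetical and introduced only for this illustration.

```python
# Domain segments (inclusive residue ranges, pig enzyme numbering) from the caption.
GPI_DOMAINS = {
    "large": [(1, 102), (290, 512)],
    "small": [(103, 289)],
    "C-terminal": [(513, 557)],
}


def domain_lengths(domains):
    """Number of residues in each domain, summed over its segments."""
    return {
        name: sum(end - start + 1 for start, end in segments)
        for name, segments in domains.items()
    }


def tiles_sequence(domains, first, last):
    """True if the segments cover first..last with no gaps and no overlaps."""
    segments = sorted(seg for segs in domains.values() for seg in segs)
    expected_start = first
    for start, end in segments:
        if start != expected_start:
            return False
        expected_start = end + 1
    return expected_start == last + 1


print(domain_lengths(GPI_DOMAINS))
print(tiles_sequence(GPI_DOMAINS, 1, 557))
```

On these ranges the large, small and C-terminal domains contain 325, 187 and 45 residues respectively, which together account for all 557 positions.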

properties of phosphofructokinases (reviewed by Bloxham and Lardy, 1973; Hofmann, 1976). The enzyme from bacteria has a relatively limited repertoire of effector molecules, and the protein itself is a tetramer of identical subunits of Mr 33,000. By contrast, the yeast enzyme is an α4β4 octamer formed from substantially larger subunits of Mr 112,000 and 118,000. In mammals, phosphofructokinase is a tetramer of identical subunits of Mr 85,000, which can aggregate into large oligomers. It is clear from a comparison of the sequences of representatives of these three classes of phosphofructokinase (Fig. 6(a) and Matrix 3(a)) that the eukaryotic enzymes have arisen by a gene duplication and fusion process similar to that followed by hexokinase (see Section V.1). Pairwise comparisons of the sequences (expressed as per cent amino acid identities for the whole bacterial enzymes, for the two halves of the eukaryotic enzymes, and for some domains) are given in Matrix 3. The crystal structures of two bacterial phosphofructokinases in both the apo and holo forms have been solved at high resolution. The Bacillus stearothermophilus enzyme has been crystallized in the active R state in the presence of fructose 6-phosphate and MgADP, and the

MATRIX 2(a). PAIRWISE COMPARISON OF GLUCOSEPHOSPHATE ISOMERASE SEQUENCES

           hum   pig   mou   yea   Kla   Pfa   Tbr   cla   Eco  BstA  BstB
hum        100
pig         93   100
mou         89    89   100
yea         58    59    58   100
Kla         57    59    57    86   100
Pfa         38    39    37    39    41   100
TbrG1       57    58    57    53    52    40   100
cla         63    64    64    59    60    41    58   100
Eco         64    65    63    59    59    40    57    88   100
BstA        23    23    23    24    25    24    23    25    24   100
BstB        21    22    21    21    22    23    22    23    22    70   100

MATRIX 2(b). PAIRWISE COMPARISONS OF DOMAINS FROM REPRESENTATIVE SEQUENCES

Large domain (residues 1-102; 290-512)

           hum   mou   yea
hum        100
mou         88   100
yea         51    51   100

Small domain (residues 103-289)

           hum   mou   yea
hum        100
mou         94   100
yea         69    69   100

C-terminal domain (residues 513-557)

           hum   mou   yea
hum        100
mou         78   100
yea         63    58   100

See Section II for nomenclature and references for the sequences.

structure has been determined at 2.4 Å resolution (Evans et al., 1981). Crystals of the same enzyme in the less active T state have also been obtained, and this structure has been solved at 2.5 Å (Evans et al., 1986; Schirmer and Evans, 1990). These two structures are compared in Fig. 7. The enzyme from E. coli in the R state has also been studied by X-ray crystallography, and shown to have a very similar structure to that of the Bacillus enzyme (Shirakihara and Evans, 1988; Rypniewski and Evans, 1989). Like hexokinase and glucosephosphate isomerase, the phosphofructokinase subunit is clearly divided into two domains, each with a core of strands of β-sheet surrounded by α-helices. In contrast to hexokinase, there is little difference between the overall conformations of the apo and holo enzymes. The active sites of the two subunits of the bacterial enzyme are located in the cleft between the domains, and are shown to the top and bottom of the drawing with fructose 6-phosphate and MgATP bound. The effector sites of the bacterial enzymes, shown with the activator MgADP bound in Fig. 7, lie in deep clefts between the two subunits. It is likely that the double-size eukaryotic phosphofructokinases have two classes of effector site: one corresponds to the bacterial effector site, and the other to a mutated form of the active site (Poorman et al., 1984). Thus it is proposed that the bacterial enzymes have four active sites and four MgADP effector sites per tetramer, whereas the eukaryotic enzymes have twice this number of sites per tetramer

[FIG. 6(a), alignment panels (PHOSPHOFRUCTOKINASE-ATP). Sequences shown: hummus-N, rabmus-N, humliv-N, mouliv-N, yeaA-N, yeaB-N, hummus-C, rabmus-C, humliv-C, mouliv-C, humpla-C, yeaA-C, yeaB-C, Eco, Bst, Sci.]

FIG. 6(a). Alignment of ATP-dependent phosphofructokinase sequences. See Section II for nomenclature and references for sequences. Elements of regular secondary structure observed in the B. stearothermophilus enzyme are shown. The boundaries between the two domains are indicated by the arrows (domain 1 = residues 1-145 and 247-303; domain 2 = residues 146-246 and 304-318). The catalytic aspartate residue is shown by the star symbol. The numbering is according to the B. stearothermophilus enzyme.
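Since each domain of the bacterial enzyme is made up of two non-contiguous stretches of the chain, assigning a residue to a domain is a matter of range lookup and not a single cut-off. The Python sketch below uses the boundaries from the caption (B. stearothermophilus numbering) and looks up residue 126, the catalytic aspartate named in the text. The names `PFK_DOMAINS` and `domain_of` are hypothetical and serve only as an illustration.

```python
# Domain segments (inclusive residue ranges) from the Fig. 6(a) caption.
PFK_DOMAINS = {
    "domain 1": [(1, 145), (247, 303)],
    "domain 2": [(146, 246), (304, 318)],
}


def domain_of(residue: int, domains=PFK_DOMAINS) -> str:
    """Return the name of the domain whose segments contain the residue."""
    for name, segments in domains.items():
        for start, end in segments:
            if start <= residue <= end:
                return name
    raise ValueError(f"residue {residue} lies outside the numbered sequence")


print(domain_of(126))  # catalytic aspartate (B. stearothermophilus numbering)
print(domain_of(310))  # a residue in the C-terminal stretch of domain 2
```

By these boundaries aspartate-126 falls within domain 1.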

[FIG. 6(b), alignment panels (PHOSPHOFRUCTOKINASE-PPi). Sequences shown: potA, potB, Pfr.]

FIG. 6(b). Alignment of PPi-dependent phosphofructokinase sequences. See Section II for nomenclature and references for sequences. The numbering is according to the potato α-subunit.

(four active sites, eight nucleotide effector sites, and four effector sites derived from ancestral active sites). This proposal is substantiated by inspection of amino acid residues involved in ligand binding and catalysis. Aspartate-126 (B. stearothermophilus numbering) participates in the catalytic mechanism, and this residue is present in the active sites of the bacterial enzymes and in the corresponding site in the N-terminal half of the mammalian enzymes (Shirakihara and Evans, 1988). However, this residue has mutated to a serine in the homologous site in the C-terminal half (see Fig. 6(a)), and thus the C-terminal site is no longer capable of catalysis. The preceding paragraphs in this section have described the ATP-dependent phosphofructokinase that is present in most organisms. There is in addition a quite distinct type of phosphofructokinase that is dependent upon inorganic pyrophosphate (PPi). This enzyme has a much narrower distribution. It was first discovered in 1974 in the parasitic amoeba Entamoeba histolytica (Reeves et al., 1974) and has subsequently been found in bacteria, other protists and higher plants (reviewed by Mertens, 1991). The enzymes isolated from


MATRIX 3(a). PAIRWISE COMPARISONS OF PHOSPHOFRUCTOKINASE SEQUENCES

           hm-N rm-N hl-N ml-N yA-N yB-N hm-C rm-C hl-C ml-C hp-C yA-C yB-C  Eco  Bst  Sci
hummus-N    100
rabmus-N     94  100
humliv-N     72   73  100
mouliv-N     72   74   93  100
yeaA-N       49   50   50   49  100
yeaB-N       47   49   48   48   50  100
hummus-C     27   27   26   27   26   25  100
rabmus-C     27   27   26   27   26   24   97  100
humliv-C     29   28   27   28   27   25   64   63  100
mouliv-C     28   28   27   28   26   24   64   63   94  100
humpla-C     24   25   24   25   22   20   70   69   63   63  100
yeaA-C       23   22   23   24   22   21   33   32   35   35   33  100
yeaB-C       24   24   24   24   24   23   35   34   39   39   34   48  100
Eco          40   39   39   39   37   35   29   29   28   28   23   22   23  100
Bst          43   43   46   47   42   39   34   33   31   30   31   27   26   54  100
Sci          35   36   36   35   34   34   26   26   26   27   26   22   21   48   46  100

MATRIX 3(b). PAIRWISE COMPARISONS OF DOMAINS FROM REPRESENTATIVE SEQUENCES

Domain 1 (residues 1-145; 247-303)

       humN mouN yeaN
humN    100
mouN     73  100
yeaN     54   53  100

       humC mouC yeaC
humC    100
mouC     63  100
yeaC     35   38  100

Domain 2 (residues 146-246; 304-318)

       humN mouN yeaN
humN    100
mouN     76  100
yeaN     49   49  100

       humC mouC yeaC
humC    100
mouC     73  100
yeaC     33   34  100

MATRIX 3(c). PAIRWISE COMPARISONS OF PPi-DEPENDENT PHOSPHOFRUCTOKINASE SEQUENCES

       potA potB  Pfr
potA    100
potB     43  100
Pfr      19   19  100

See Section II for nomenclature and references for the sequences.


potato (Carlisle et al., 1990) and from Propionibacterium freudenreichii (Ladror et al., 1991) have been most extensively studied structurally. The potato enzyme is a heterotetramer with two α- and two β-subunits (Mr 65,000 and 60,000, respectively). cDNAs encoding these subunits have been isolated and sequenced (Fig. 6(b)). It can be seen that the two types of subunit are homologous, although they are of different lengths. The subunits share 43% sequence identity in the aligned regions (Matrix 3(c)). The gene encoding the P. freudenreichii enzyme has also been sequenced, and is included in the alignment in Fig. 6(b). The sequence of the bacterial enzyme is remarkably dissimilar to those of the plant enzyme subunits, with only 19% identity overall. It is thus by no means certain that the bacterial and plant enzymes have diverged from a common ancestor (see Section III.4). However, the sequences in the region corresponding to residues 180-274 of the potato subunit are strikingly more similar (see Fig. 6(b)). In this region there is 32% identity between the P. freudenreichii and potato β-subunit sequences. This similarity may be an indication of the conservation of residues involved in ligand binding (see Section IX.1).

There is no apparent indication of gene duplication/fusion in the evolution of the PPi-dependent phosphofructokinases. Comparisons of these sequences with those of the ATP-dependent phosphofructokinases indicate a possible distant relationship, but this is by no means certain. Alignments with E. coli phosphofructokinase show, for example, a possible match (22% identity) with residues 39-385 of the β-subunit (Carlisle et al., 1990). See Section IX.1 for a further discussion of the possible relationships between the two types of phosphofructokinase.
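The distinction drawn above between identity "in the aligned regions", identity "overall" and identity within a restricted region can be made concrete with a small sketch. The Python fragment below is only an illustration and is not the procedure used to produce Matrix 3. It assumes that identity is counted over alignment columns in which neither sequence has a gap, and that a region is given as 1-based, inclusive alignment columns (the review itself numbers regions by residues of a reference sequence). The function name `percent_identity`, its parameters and the two toy fragments are hypothetical.

```python
def percent_identity(seq_a, seq_b, start=None, end=None, gap="-"):
    """Percent identity between two rows of an alignment.

    start and end are 1-based, inclusive alignment columns (an assumed
    convention). Columns where either row has a gap are not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    lo = 0 if start is None else start - 1
    hi = len(seq_a) if end is None else end
    matches = 0
    compared = 0
    for x, y in zip(seq_a[lo:hi], seq_b[lo:hi]):
        if x == gap or y == gap:
            continue  # skip columns that are not aligned residue-to-residue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned positions to compare")
    return 100.0 * matches / compared


# Toy aligned fragments, invented for illustration only.
a = "GGDDSNT-ACLLAE"
b = "GGDDTNTTAADLAA"
print(round(percent_identity(a, b), 1))                  # whole alignment
print(round(percent_identity(a, b, start=1, end=7), 1))  # restricted region
```

With such a definition a well-conserved stretch can score much higher than the alignment as a whole, which is the pattern described for residues 180-274 of the potato subunit.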
(d) Aldolase

Fructose 1,6-bisphosphate aldolase (EC 4.1.2.13) catalyzes the reversible aldol cleavage of fructose 1,6-bisphosphate into the triosephosphates glyceraldehyde 3-phosphate and dihydroxyacetone phosphate (Horecker et al., 1972). Two different forms of aldolase have been distinguished on the basis of reaction mechanism: class I and class II. In class I enzymes the catalytic reaction involves the formation of a Schiff-base intermediate between the substrate and a lysine residue, which stabilizes the negatively charged carbanion of dihydroxyacetone phosphate. Enzymes of class II do not form a covalent enzyme-substrate intermediate; instead, the carbanion is stabilized by a bivalent cation, usually Zn2+, that acts as an electron sink. Class II enzymes occur only in prokaryotes and lower eukaryotes such as yeast, fungi and algae, whereas class I aldolases have been found in representatives of all phylogenetic kingdoms: bacteria, protists and metazoa. The class I and class II aldolases show no apparent sequence similarity to each other (see Section IX).

Mammals have three different class I isoenzymes, with a specific tissue distribution. They have been classified as A (embryonic tissues and muscle), B (liver and kidney) and C (brain). It can be inferred from sequence comparisons (Fig. 8(a) and Matrix 4(a)) that the isoenzymes diverged from a common ancestor early in vertebrate evolution (Tolan et al., 1987; Kukita et al., 1988; see also Section V.1(b)).

Class I aldolases are homotetrameric enzymes with a subunit Mr of approximately 40,000. The primary structure has been established for the enzyme from several eukaryotic phyla. All these enzymes are clearly homologous. Three-dimensional structures have been determined for the muscle (class I-A) enzyme of human (Gamblin et al., 1990) and rabbit (Sygusch et al., 1987), at a resolution of 3.0 Å and 2.7 Å, respectively. Each subunit has an eight-stranded α/β barrel arrangement (Fig. 9).
However, aldolase differs from the other glycolytic enzymes with such a barrel structure (triosephosphate isomerase, pyruvate kinase) in the organization of its active site. This site is located in the core of the enzyme and it contains not only hydrophobic residues, but also several charged amino acids. As in the other barrel-enzymes the active site is accessible from the C-terminal side of the barrel. However, access seems to be modulated by the C-terminal region of the polypeptide (shown at the top of Fig. 9), such that this arm-like structure with a flexible elbow may cover the active site. While both the hydrophobic and polar residues lining the active site are totally conserved throughout evolution, the C-terminal region appears highly variable (Matrix 4(b) and Section VI.3). Sequence comparisons and mutagenesis studies suggest that the nature of the C-terminus


FIG. 8(a). Alignment of class I aldolase sequences. See Section II for nomenclature and references for the sequences. The Drosophila sequence is that of the most abundant isoenzyme (see Section VI.3). Elements of regular secondary structure observed in the rabbit muscle enzyme are shown. The boundary between the α/β-barrel domain and the C-terminal domain is indicated by the arrow. The lysine at the active site that forms an imine link with the substrate is shown by the star symbol. The numbering is according to the rabbit muscle enzyme.

FIG. 8(b). Alignment of class II aldolase sequences (yeaII, EcoII, CglII). See Section II for nomenclature and references for sequences. The numbering is according to the yeast enzyme.

MATRIX 4(a). PAIRWISE COMPARISONS OF CLASS I ALDOLASE SEQUENCES

        humA ratA mouA rabA humB ratB chiB humC ratC  Dme  mai  ric  Pfa TbrG
humA     100
ratA      97  100
mouA      98   99  100
rabA      99   97   98  100
humB      69   70   70   69  100
ratB      69   70   70   69   95  100
chiB      72   72   72   72   81   81  100
humC      82   83   83   83   70   71   73  100
ratC      80   80   81   81   69   69   72   96  100
Dme       69   70   70   69   63   63   64   69   68  100
maiCy     60   60   60   60   58   58   60   58   58   59  100
ricCy     58   58   58   58   57   57   58   55   56   56   93  100
Pfa       54   55   54   54   50   50   52   56   54   53   56   56  100
TbrGl     49   49   49   49   48   48   47   48   47   46   50   48   46  100

MATRIX 4(b). PAIRWISE COMPARISONS OF DOMAINS FROM REPRESENTATIVE SEQUENCES

Barrel domain (residues 1-302)

       humA mouA  Dme  mai
humA    100
mouA     98  100
Dme      71   71  100
mai      62   63   61  100

C-terminal domain (residues 303-363)

       humA mouA  Dme  mai
humA    100
mouA     97  100
Dme      61   63  100
mai      48   48   48  100

MATRIX 4(c). PAIRWISE COMPARISON OF CLASS II ALDOLASE SEQUENCES

        yea  Eco  Cgl
yea     100
Eco      48  100
Cgl      37   39  100

See Section II for nomenclature and references for the sequences.

contributes to the specificity of the enzyme towards its substrates (Sygusch et al., 1987; Takahashi et al., 1989). In the tetrameric structure each subunit makes contact with just two of its neighbours. These contacts, which are mainly hydrophobic, are made by the side chains of residues in the α-helices on the surface of the barrel. In contrast to triosephosphate isomerase and glyceraldehyde-phosphate dehydrogenase, the different subunits do not contribute to the formation of each other's active site. Indeed, monomeric aldolase retains approximately 50% of the catalytic activity (Rudolph et al., 1977).

Class II aldolases are dimeric proteins of identical subunits of Mr 40,000. No crystal structures of class II enzymes have been reported yet. The complete primary structures of three class II aldolases have been determined: those of yeast, E. coli and Corynebacterium glutamicum (Fig. 8(b)).

(e) Triosephosphate isomerase

Triosephosphate isomerase (EC 5.3.1.1) is the enzyme that interconverts dihydroxyacetone phosphate and glyceraldehyde 3-phosphate, the two products of the aldolase-catalyzed hexose cleavage, by an intramolecular proton transfer (Noltmann, 1972). Since only glyceraldehyde 3-phosphate is funnelled down to pyruvate, this enzyme ensures that both halves of the hexose are used for ATP production by the glycolytic pathway. The enzyme does not require any cofactor. In all organisms analyzed, triosephosphate isomerase is a dimeric protein with identical subunits of Mr approximately 27,000. High-resolution crystal structures are available for the enzyme from chicken muscle (2.5 Å) (Banner et al., 1975), yeast (1.9 Å) (Alber et al., 1981; Lolis et al., 1990) and Trypanosoma brucei (1.83 Å) (Wierenga et al., 1987, 1991a). The overall structure of each of these enzymes is very well conserved, despite considerable differences (approximately 50%) in the amino-acid sequences. The subunits have a globular shape with two protruding loops (Fig. 10).
The globular part has the classical barrel structure: an eight-fold repeat of strand-turn-helix-turn units assembled into a cylinder of eight parallel β strands surrounded by a layer of eight α-helices. The active site is located at the carboxyl end of the barrel. One of the protruding


FIG. 9. Aldolase from human muscle. Only a single subunit of the tetrameric enzyme is shown. The C-terminus is labelled. The tyrosine residue at the C-terminus is essential for activity. The coordinates were kindly made available by H. C. Watson.

FIG. 10. Triosephosphate isomerase from Trypanosoma brucei. The active dimer is shown. The coordinates were kindly made available by R. Wierenga.


FIG. 12. The structures of the apo and holo forms of glyceraldehyde-phosphate dehydrogenase from B. stearothermophilus. Only single subunits of the two tetrameric enzymes are shown. The apo enzyme is on the left. The S-loop corresponds to the extended chain on the right of each structure.

FIG. 14. Comparison of the structures of yeast and horse phosphoglycerate kinases. The N-terminal domain is at the top of the diagram. The yeast enzyme is on the left.


loops ("interface loop") is at the subunit interface and is important for dimerization; it fits into a complementary pocket of the other subunit, where it extends into the active site (Alber et al., 1981; Wierenga et al., 1987; Lolis et al., 1990). Therefore, only the dimer is fully active (Waley, 1973). The other loop ("flexible loop") projects into the solvent in the unligated enzyme and closes over the active site when substrate binds (Joseph et al., 1990; Wierenga et al., 1991b). By this closure it may prevent unwanted side-reactions.

Amino-acid sequences are available for triosephosphate isomerases from representatives of most phylogenetic groups (Fig. 11). Although the percentage of differences may be quite large (Matrix 5), the majority of substitutions are found at the surface and, surprisingly, in the inter-subunit contact area (Wierenga et al., 1987). These latter differences could affect the stability of the dimer. The conservation of active site residues is consistent with the similarity in kinetics amongst the enzymes from all organisms studied (Lambeir et al., 1987).
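A pairwise comparison matrix of the kind referred to above can be sketched in a few lines. The following Python fragment is an illustration only; how the matrices in this review were actually computed is not stated in this section. It assumes that each entry is the percent identity over alignment columns in which neither sequence has a gap, rounded to the nearest integer, and that only the lower triangle is reported. The function names, the labels `seqA`, `seqB` and `seqC`, and the toy sequences are hypothetical.

```python
def identity(a, b, gap="-"):
    """Percent identity over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no aligned positions to compare")
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def identity_matrix(aligned):
    """Lower-triangular matrix for a dict of name -> aligned sequence."""
    names = list(aligned)
    matrix = {}
    for i, row in enumerate(names):
        for col in names[: i + 1]:
            matrix[(row, col)] = round(identity(aligned[row], aligned[col]))
    return names, matrix


def format_lower_triangle(names, matrix):
    """Render the matrix as plain text, one row per sequence."""
    width = max(len(n) for n in names) + 2
    lines = [" " * width + "".join(f"{n:>6}" for n in names)]
    for i, row in enumerate(names):
        cells = "".join(f"{matrix[(row, col)]:>6d}" for col in names[: i + 1])
        lines.append(f"{row:<{width}}{cells}")
    return "\n".join(lines)


# Toy aligned fragments, invented for illustration only.
toy = {"seqA": "AKWRCVLKIG", "seqB": "AKWRAVLKIS", "seqC": "CKWRNVYKI-"}
names, matrix = identity_matrix(toy)
print(format_lower_triangle(names, matrix))
```

The percentage of differences between two sequences is then simply 100 minus the corresponding entry.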

FIG. 11. Alignment of triosephosphate isomerase sequences. See Section II for nomenclature and references for the sequences. Elements of regular secondary structure observed in the T. brucei enzyme are shown. The numbering is also according to the trypanosome enzyme. The sequence of the B. stearothermophilus enzyme is omitted because of uncertainties in the sequence.

MATRIX 5. PAIRWISE COMPARISONS OF TRIOSEPHOSPHATE ISOMERASE SEQUENCES

         hum  mon  mou  rab  chi  coe  mai Anid  yea  Spo  Tbr  Eco
hum      100
mon       99  100
mou       94   94  100
rab       98   98   94  100
chi       89   88   86   88  100
coe       82   82   80   83   80  100
maiCy     62   62   60   62   62   57  100
Anid      55   54   54   54   57   52   56  100
yea       53   52   52   52   53   50   51   55  100
Spo       52   52   51   52   54   51   52   60   54  100
TbrGl     52   52   50   51   52   47   52   48   49   46  100
Eco       45   45   44   45   45   41   43   42   45   44   41  100

See Section II for nomenclature and references for sequences.

(f) Glyceraldehyde-phosphate dehydrogenase

Glyceraldehyde-phosphate dehydrogenase (EC 1.2.1.12) is responsible for the oxidative phosphorylation of glyceraldehyde 3-phosphate into 1,3-bisphosphoglycerate (Harris and Waters, 1976). This reaction involves both an oxidation and a phosphorylation of the substrate. In a first step, the aldehyde becomes covalently bound to the sulphydryl group of an active site cysteine. Subsequently a hydride ion is removed from the addition compound, and is transferred to the coenzyme NAD+. The free energy of this oxidation is preserved in


the thioester intermediate. Attack of the thioester by orthophosphate results in the formation of an acylphospho intermediate with a high free energy of hydrolysis. The acylphospho product (1,3-bisphosphoglycerate) is subsequently released. Glyceraldehyde-phosphate dehydrogenase is a homo-tetrameric enzyme with subunits of 34,000-38,000. Each subunit can bind the NAD ÷ cofactor, and the binding can occur cooperatively. However, the nature of this cooperativity depends on the source of the enzyme. Crystal structures are available for the holoenzyme of Bacillus stearothermophilus (1.8 A) (Skarzynski et al., 1987), lobster (2.9 A) (Moras et al., 1975), human (2.4/k) (Read et al., 1992) and Trypanosoma brucei (3.2 A) (Vellieux et al., 1992). Moreover, determination of the structure of the apo-glyceraldehyde-phosphate dehydrogenase from B. stearothermophilus (at 2.5/~,) has revealed the conformation changes associated with the binding of the coenzyme (Skarzynski and Wonacott, 1988). The four subunits of the enzyme are structurally almost identical. Each subunit consists essentially of two domains (Fig. 12); the N-terminal domain is involved in NAD ÷-binding, whereas the C-terminal domain contains the residues directly involved in the catalytic process. The structure of the NAD ÷-binding domain is very similar to those of some other NAD ÷-dependent dehydrogenases. The core of the enzyme is made up of the so-called S-loops of the individual subunits. The S-loops interact with the two adjacent subunits and contribute to the formation of their NAD ÷-binding pockets. Amino-acid sequences of glyceraldehyde-phosphate dehydrogenase from many different sources have been determined (Fig. 13(a)). The enzyme of all eukaryotes and eubacteria are clearly homologous. The maximal difference observed is about 60% (Matrix 6(a)), and active-site residues are fully conserved. 
The importance of these residues has been confirmed by site-directed mutagenesis (Mougin et al., 1988; Soukri et al., 1989). Conservation to a lesser extent is seen in the subunit contacts. The S-loop sequence displays features diagnostic of either prokaryotes or eukaryotes (Branlant and Branlant, 1985; Michels et al., 1991). Moreover, the presence of specific residues in the subunit interface and at the surface of the protein has been shown to confer thermostability, required for the growth of thermophilic organisms (Wrba et al., 1990). Plant chloroplasts contain two glyceraldehyde-phosphate dehydrogenase isoenzymes, involved in the carbon-fixation cycle. These isoenzymes are encoded in the nucleus. Sequence comparisons of the chloroplast isoenzymes (Matrix 6(a)) show them to be 80% identical to each other, and to be generally more similar to eubacterial than to eukaryotic sequences. The genes of these enzymes probably therefore have their origin in the prokaryotic endosymbiont that evolved into the chloroplast. In the course of evolution the ancestral gene was then transferred to the plant nucleus. Chloroplast glyceraldehyde-phosphate dehydrogenases differ from their cytosolic counterparts mainly in two respects. First, the enzymes are not NAD ÷-, but NADP+-dependent. This difference in coenzyme specificity is due to some minimal differences in amino-acid sequence (Corbier et al., 1990). Secondly, the main form of chloroplast glyceraldehyde-phosphate dehydrogenase (I) exists as a heterodimer (AEBz), composed of two distinct subunits. The quarternary structure of the minor (II) form is A 4 (Cerff, 1982). Primary structures of glyceraldehyde-phosphate dehydrogenase from four archaebacteria have been reported (Fig. 13(b) and Matrix 6(c)). However, these sequences are vastly different from the sequences of all other organisms; only 16-20% identity was found (see Sections IV.4 and IX.3). Doolittle et al. 
(1990) proposed a different evolutionary origin for archaebacterial glyceraldehyde-phosphate dehydrogenase when they found a slightly higher degree of similarity with a part of the sequence of bovine NAD+/NADPH transhydrogenase. They suggested that an archaebacterial ancestor had pirated another enzyme for use as the equivalent of a glyceraldehyde-phosphate dehydrogenase. Nevertheless, a common origin of the glyceraldehyde-phosphate dehydrogenases of archaebacteria, eubacteria and eukaryotes is supported by the occurrence of common sequence motifs, both in the NAD+-binding domain and in the catalytic domain (Hensel et al., 1989). See Section IX for a general consideration of glycolytic enzymes that catalyze the same reactions, but appear not to have a common ancestor.
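
The percentage identities and differences quoted above from the comparison matrices can be illustrated with a minimal sketch. It assumes that identity is counted only over aligned positions at which neither sequence has a gap; the convention actually used for the matrices is not stated in this section, so this is one plausible definition rather than the authors' method. The function names and the short toy sequences are hypothetical and are not taken from Fig. 13.

```python
from __future__ import annotations


def percent_identity(seq_a: str, seq_b: str, gap_chars: str = "-.") -> float:
    """Percentage of identical residues over the ungapped columns of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Assumption: columns with a gap in either sequence are skipped.
        if a in gap_chars or b in gap_chars:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def identity_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """Lower-triangular matrix of pairwise percent identities, keyed by (row, column) label."""
    names = list(alignment)
    return {
        (row, col): percent_identity(alignment[row], alignment[col])
        for i, row in enumerate(names)
        for col in names[: i + 1]
    }


# Hypothetical toy alignment (not real glyceraldehyde-phosphate dehydrogenase data).
toy_alignment = {
    "seqA": "GFGRIGRLV",
    "seqB": "GFGRIGRIV",
    "seqC": "GYGTIGK-V",
}
toy_matrix = identity_matrix(toy_alignment)
for (row, col), value in toy_matrix.items():
    print(f"{row} vs {col}: {value:.1f}% identical, {100.0 - value:.1f}% different")
```

With this convention the "difference" quoted in the text is simply 100 minus the percent identity, and the diagonal of the matrix is 100 by construction.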

FIG. 13(a). Alignment of glyceraldehyde-phosphate dehydrogenase sequences, with the exception of those from archaebacteria. See Section II for nomenclature and references for the sequences. Elements of regular secondary structure observed in the B. stearothermophilus enzyme are shown, and the S-loop is labelled. The active site cysteine is indicated by the star symbol, and the boundary between the N- and C-terminal domains by the arrow. The numbering is according to the B. stearothermophilus enzyme. Sequences aligned, in order: hum, pig, rat1, rat2, mou, ham, chi, lob, Dme1, Dme2, Cel, Sma, ice, maiCy, musCy, Anid, Cpa, Uma, yea1, yea2, Kla, Zro, TbrCy, TbrGl, TcrGl, maiChA, peaChA, spiChA, tobChA, peaChB, spiChB, tobChB, EcoA, EcoB, Taq, Tma, Zmo, Bco, Bme, Bst, Bsu.

FIG. 13(b). Alignment of archaebacterial glyceraldehyde-phosphate dehydrogenase sequences. See Section II for nomenclature and references for the sequences. The numbering is according to the M. bryantii enzyme. Sequences aligned, in order: Mbr, Mfo, Mfe, Pwo.

(g) Phosphoglycerate kinase

Phosphoglycerate kinase (EC 2.7.2.3) catalyzes the reversible transfer of a phospho group from 1,3-bisphosphoglycerate to ADP, thus forming 3-phosphoglycerate and ATP (Scopes, 1973). The enzyme is a monomer with a Mr of about 44,000. High-resolution crystal structures have been determined for the yeast and horse enzymes (Fig. 14), both at 2.5 Å (Banks et al., 1979; Watson et al., 1982). The structure of the binary complex of pig muscle phosphoglycerate kinase with 3-phosphoglycerate at 2.0 Å has recently been published (Harlos et al., 1992). Phosphoglycerate kinase has a bilobal structure; two domains of about equal size are connected by a narrow hinge region. Each domain consists of a core of six parallel β-strands, surrounded by α-helices. The active site lies in the cleft between the two domains. The nucleotide binds in a depression of the surface of the C-terminal domain, while the triose substrate binds on the opposing surface of the N-terminal domain. Upon binding of the substrates the molecule undergoes a relatively large hinge bending, presumably required for the phospho transfer to occur during catalysis.

A large number of amino-acid sequences, representing all major phylogenetic groups, are now available (Fig. 15). All these enzymes are clearly homologous, including those of the archaebacteria. The eukaryotic and eubacterial phosphoglycerate kinases differ from one another by no more than 60% (Matrix 7). Eukaryotic and eubacterial sequences are less similar to those of archaebacteria, differing by approximately 70%. Most residues important for substrate binding and catalysis are found to be invariant (Watson and Littlechild, 1990). Amino acids presumably involved in maintaining the tertiary fold of the enzyme are either kept invariant, or have been replaced by residues of similar nature. By contrast, those residues in contact with the solvent are very variable.
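
The notion of invariant residues used above can be made concrete with a small sketch that marks the columns of a multiple alignment in which every sequence carries the same residue, in the spirit of the star symbols printed beneath the alignments. It marks strict invariance only; replacement by "residues of similar nature" is not scored, and the marking rules of the original figures are not assumed to be reproduced exactly. The function names and the toy sequences are hypothetical.

```python
from __future__ import annotations


def conservation_line(seqs: list[str], gap_chars: str = "-.") -> str:
    """Return a string with '*' under every strictly invariant, ungapped alignment column."""
    if not seqs:
        raise ValueError("at least one sequence is required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    marks = []
    for column in zip(*seqs):
        residues = {c.upper() for c in column}
        # A column is invariant if it holds a single residue type and no gap.
        invariant = len(residues) == 1 and not (residues & set(gap_chars))
        marks.append("*" if invariant else " ")
    return "".join(marks)


def invariant_fraction(seqs: list[str]) -> float:
    """Fraction of alignment columns that are strictly invariant."""
    line = conservation_line(seqs)
    return line.count("*") / len(line)


# Hypothetical toy alignment (not real phosphoglycerate kinase data).
toy = ["MKVA-G", "MKIA-G", "MRVASG"]
for s in toy:
    print(s)
print(conservation_line(toy))
print(f"invariant columns: {invariant_fraction(toy):.0%}")
```

Mapping the marked columns onto a crystal structure is what allows invariant positions to be related to substrate binding, catalysis or maintenance of the fold.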
Some specific differences can be found between the phosphoglycerate kinase structures of prokaryotes and eukaryotes. Most striking is the shortening of one of the surface loops (called the "nose region") by about 14 residues in the enzyme of prokaryotes. In wheat phosphoglycerate kinase, however, both the pro- and eukaryotic features are found (Longstaff et al., 1989). This plant contains both cytosolic and chloroplast forms of the enzyme. The sequences of both isoenzymes, which are encoded in the

          hum   pig  rat1  rat2   mou   ham   chi   lob  Dme1  Dme2   Cel   Sma iceCy maiCy musCy  Anid   Cpa   Uma  yea1  yea2   Kla   Zro
hum       100
pig        93   100
rat1       94    93   100
rat2       94    93    99   100
mou        94    93    98    97   100
ham        94    93    96    95    96   100
chi        92    92    93    92    93    93   100
lob        ??    73    72    71    72    73    ?4   100
Dme1       76    76    77    76    78    79    78    76   100
Dme2       ?6    75    77    76    77    78    78    77    9?   100
Cel        74    74    76    76    75    75    76    70    73    74   100
Sma        72    7?    74    73    73    74    75    74    76    77    72   100
iceCy      66    66    67    66    67    68    67    69    67    67    68    67   100
maiCy      70    68    69    69    70    71    70    70    68    68    67    67    85   100
musCy      67    67    68    6?    68    68    68    70    67    67    67    67    89    86   100
Anid       68    69    69    69    70    70    69    71    70    70    69    70    71    69    70   100
Cpa        68    69    69    69    69    69    70    69    68    69    70    69    68    69    68    79   100
Uma        70    71    73    72    73    73    73    73    73    74    69    73    69    68    68    72    69   100
yea1       64    66    66    65    66    66    66    66    63    64    67    65    67    66    69    66    65    65   100
yea2       63    64    65    64    64    65    64    65    65    66    67    64    66    66    68    65    65    63    87   100
Kla        64    64    64    63    64    65    65    68    66    65    65    65    68    66    68    64    66    64    82    81   100
Zro        63    64    64    63    64    64    64    66    62    62    64    63    67    65    66    65    64    66    80    78    78   100

(? = digit illegible in the source)

MATRIX 6(a) (continued opposite)

MATRIX 6(a) (continued). Comparison of the sequences hum, pig, rat1, rat2, mou, ham, chi, lob, Dme1, Dme2, Cel, Sma, iceCy, maiCy, musCy, Anid, Cpa, Uma, yea1, yea2, Kla and Zro with the sequences TbrCy, TbrGl, TcrGl, maiChA, peaChA, spiChA, tobChA, peaChB, spiChB, tobChB, EcoA, EcoB, Taq, Tma, Zmo, Bco, Bme, Bst and Bsu. [The values of this part of the matrix are not legibly preserved.]

~r

>

O

t-

tI-

O

.r" .>

Bst

Bme

Boo

Zno

Tma

Taq

EcoB

EcoA

ChB

rob

ChB

spi

ChB

pea

ChA

rob

ChA

spi

ChA

pea

ChA

mai

GI

Tcr

G1

Tbr

Cy

Tbr

100

56

G1

Cy

100

Tbr

Tbr

100

90

54

G1

Tcr

100

45

46

48

ChA

mal

100

90

44

45

47

ChA

pea

100

89

88

45

46

48

ChA

$pl

100

87

90

91

44

45

46

ChA

rob

I00

79

81

80

81

44

44

48

ChB

pea

100

89

78

81

80

80

43

44

47

ChB

spl

100

86

88

80

83

82

82

44

45

47

ChB

rob

100

48

47

48

47

48

49

49

52

55

80

EcoA

86 82

55 73 100

57 52 i00

100

60

100

57

65

63

44

56

59

43

100

55

41 51

45 62

59

59

59

60

57

58

57

59

53

53

58

Bst

100

56

55

57

57

54

57

55

56

51

51

55

Bme

45

53

55

54

55

53

55

54

55

49

49

51

Boo

50

49

52

53

51

50

53

54

51

48

48

48

Zmo

43

51

55

55

56

54

56

55

56

48

48

52

Tma

100

54

54

55

51

53

52

52

49

49

51

Taq

41

41

41

42

43

40

43

43

38

37

39

EcoB

84

93

74

57

61

60

43

56

58

58

59

55

58

56

58

53

53

55

Bsu

o

_~. O

o

m

146

MATRIX 6(b). PAIRWISE COMPARISON OF DOMAINS FROM REPRESENTATIVE SEQUENCES

N-terminal domain (residues 1-147)

        humN  mouN  yeaN
humN    100
mouN     93   100
yeaN     50    54   100

C-terminal domain (residues 148-334)

        humC  mouC  yeaC
humC    100
mouC     95   100
yeaC     75    75   100

MATRIX 6(c). PAIRWISE COMPARISONS OF ARCHAEBACTERIAL GLYCERALDEHYDE-PHOSPHATE DEHYDROGENASE SEQUENCES

        Mbr   Mfo   Mfe   Pwo
Mbr     100
Mfo      95   100
Mfe      71    70   100
Pwo      58    57    56   100
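The lower-triangular layout used in these matrices can be produced from any set of aligned sequences. The following is a minimal sketch, not the procedure used by the authors: the function names, the toy sequences and the convention of skipping columns that contain a gap are all assumptions made for illustration.

```python
def identity(seq_a, seq_b):
    """Per cent identity over alignment columns where neither sequence has a gap.

    Assumption: gapped columns are excluded from the comparison; the review
    does not state which convention was used for its matrices.
    """
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def identity_matrix(aligned):
    """Return (names, rows); rows[i] holds the rounded identities of
    sequence i against sequences 0..i (a lower-triangular matrix)."""
    names = list(aligned)
    rows = [
        [round(identity(aligned[names[i]], aligned[names[j]])) for j in range(i + 1)]
        for i in range(len(names))
    ]
    return names, rows


def format_matrix(names, rows):
    """Lay the matrix out as plain text, one labelled row per sequence."""
    width = max(len(n) for n in names) + 2
    header = " " * width + "".join(n.rjust(width) for n in names)
    body = [
        name.ljust(width) + "".join(str(v).rjust(width) for v in row)
        for name, row in zip(names, rows)
    ]
    return "\n".join([header] + body)


# Hypothetical toy alignment (equal-length strings, not real enzyme sequences).
toy = {
    "seqA": "MKVLAGHTRE",
    "seqB": "MKVLAGHSRE",
    "seqC": "MRILSGH-KE",
}
names, rows = identity_matrix(toy)
print(format_matrix(names, rows))
```

Because identity is symmetric, only the lower triangle needs to be stored or printed, which is why each row is one entry longer than the previous one.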

[Alignment of the phosphoglycerate kinase sequences hum, humtes, hor, rat, mou, moutes, Anid, Pch, Tre, Tvi, yea, Kla, TbrGl, TbrCy, CfaGl, CfaCy, wheCy, wheCh, Eco, Tth, Zmo, Bme, Mbr and Mfe; the residue rows could not be recovered from the extracted text.]

FIG. 15. Alignment of phosphoglycerate kinase sequences. See Section II for nomenclature and references for the sequences. Elements of regular secondary structure observed in the yeast enzyme are shown. The boundaries between the N- and C-terminal domains are indicated by the arrows (N-terminal domain = residues 1-198 and 392-415; C-terminal domain = residues 199-391). The numbering is according to the yeast enzyme.

nucleus, are very similar (82% identity). Both phosphoglycerate kinases lack the nose regions, but nevertheless contain some residues typical of eukaryotic phosphoglycerate kinases. Therefore, the present-day wheat sequences are presumably the result of a recombination between the gene of the prokaryotic symbiont that developed into the chloroplast, and an ancestral nuclear gene.

MATRIX 7(a). PAIRWISE COMPARISONS OF PHOSPHOGLYCERATE KINASE SEQUENCES

[Matrix of per cent identities; the individual values could not be recovered from the extracted text.]

MATRIX 7(b). PAIRWISE COMPARISONS OF DOMAINS FROM REPRESENTATIVE SEQUENCES

N-terminal domain (residues 1-198; 393-415)

        humN  mouN  yeaN
humN    100
mouN     97   100
yeaN     66    66   100

C-terminal domain (residues 199-392)

        humC  mouC  yeaC
humC    100
mouC     98   100
yeaC     65    64   100

See Section II for nomenclature and references for the sequences.
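A domain-wise comparison of this kind amounts to restricting the identity count to selected stretches of the alignment, including a domain made up of two non-contiguous segments. The sketch below is illustrative only: the function name, the `ranges` parameter and the toy sequences are hypothetical, ranges are given as alignment columns rather than as residue numbers of a reference enzyme, and skipping gapped columns is an assumed convention.

```python
def percent_identity(seq_a, seq_b, ranges=None):
    """Per cent identity between two aligned sequences of equal length.

    ranges: optional list of (start, end) alignment columns, 1-based and
    inclusive. Columns where either sequence has a gap ('-') are skipped
    (an assumption; the source does not state its convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    if ranges is None:
        ranges = [(1, len(seq_a))]
    matches = 0
    compared = 0
    for start, end in ranges:
        for a, b in zip(seq_a[start - 1:end], seq_b[start - 1:end]):
            if a == "-" or b == "-":
                continue
            compared += 1
            matches += (a == b)
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical toy alignment (not real enzyme sequences).
seq_1 = "MKVLA-GHTRE"
seq_2 = "MKILAAGHSKE"

whole = percent_identity(seq_1, seq_2)
first_half = percent_identity(seq_1, seq_2, ranges=[(1, 5)])
second_half = percent_identity(seq_1, seq_2, ranges=[(7, 11)])
# A "domain" built from two separate segments, in the spirit of a split domain.
split_domain = percent_identity(seq_1, seq_2, ranges=[(1, 3), (9, 11)])
print(whole, first_half, second_half, split_domain)
```

Applying the function to real data would first require mapping the residue numbers of the reference (yeast) enzyme onto alignment columns, a step that is left out here.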

(h) Phosphoglycerate mutase

Phosphoglycerate mutases comprise a family of enzymes which catalyze reactions involving the transfer of phospho groups among the three carbon atoms of phosphoglycerates (reviewed by Fothergill-Gilmore and Watson, 1989). The formation of a phosphohistidine intermediate is a striking feature of the catalytic mechanism. There are at least four types of phosphoglycerate mutase which are kinetically and structurally distinct but which nevertheless have many features in common. The mutase in the glycolytic pathway (EC 5.4.2.1) catalyzes the inter-conversion of 3-phosphoglycerate and 2-phosphoglycerate (Fig. 1). One type of mutase is dependent upon the cofactor 2,3-bisphosphoglycerate, and is found in vertebrates and in yeast. A second type is independent of bisphosphoglycerate, but requires manganese. This type occurs in Bacillus spp. A third type of mutase is independent of any cofactors, and is found in organisms such as higher plants and invertebrates. The fourth member of the mutase family is a closely related enzyme (EC 5.4.2.4/EC 3.1.3.13) which catalyzes the synthesis of 2,3-bisphosphoglycerate, and hence plays a major role in controlling haemoglobin oxygen affinity. This enzyme is known as bisphosphoglycerate mutase, and can be considered to be an isoenzyme of the vertebrate glycolytic mutase because of sequence similarities (see Fig. 16), and because it can also readily catalyze the inter-conversion of 3- and 2-phosphoglycerates. The evolution of bisphosphoglycerate mutase is inextricably linked to that of haemoglobin, especially its effector properties (see Section VIII).

Cofactor-dependent phosphoglycerate mutases are active as either monomers, dimers or tetramers depending upon the organism from which the enzymes have been isolated. This variation in quaternary structure will be considered in Section VII. The subunit size of the cofactor-dependent enzymes is about 27,000. By contrast, the cofactor-independent phosphoglycerate mutases are monomers of Mr 60,000. It is not yet known whether the cofactor-dependent and -independent mutases may have diverged from a common ancestor. The sequence of the cofactor-independent enzyme from maize has recently become available (Graña et al., 1992). There is no apparent sequence similarity to the cofactor-dependent enzymes.

The structures of the 2,3-bisphosphoglycerate-dependent mutases are well characterized, in marked contrast to the other types, which have proved to be difficult to purify in sufficient quantities for structural studies. The sequences of seven mutases are available, and are aligned in Fig. 16. Quite unexpectedly, it has been discovered that these phosphoglycerate mutases are homologous to an enzyme not previously thought to be related. The determination of the sequence of an active-site phosphohistidine peptide from the

[Alignment of the phosphoglycerate mutase sequences hummus, humbra, humrbc, rabrbc, mourbc, yea and Sco; the residue rows could not be recovered from the extracted text.]

FIG. 16. Alignment of phosphoglycerate mutase sequences. See Section II for nomenclature and references for the sequences. Elements of regular secondary structure observed in the yeast enzyme are shown. The numbering is according to the yeast enzyme.

bifunctional enzyme responsible for catalyzing the synthesis and degradation of fructose 2,6-bisphosphate showed that it has a high degree of sequence conservation with the mutase active site (Pilkis et al., 1987). The phosphatase subunit of this enzyme shares convincing sequence similarities with the mutases (about 25% identity, see Matrix 8) (Lively et al., 1988; Bazan et al., 1989), and the bifunctional enzyme appears to have evolved by the fusion of ancestral kinase and mutase genes (discussed more fully in Section XI). Pairwise comparisons of the sequences (expressed as per cent amino acid identities) are given in Matrix 8. The crystal structure of phosphoglycerate mutase from yeast has been solved at 2.8 Å resolution (Winn et al., 1981; the atomic coordinate data are available from the Brookhaven Data Bank; Watson, 1982), and displays the now familiar features of a central β-sheet


FIG. 17. The yeast phosphoglycerate mutase tetramer. The two sulphate ions at each active site are shown in black. It is assumed that the phospho groups of the ligands occupy the same positions as do the sulphate ions in the unligated enzyme that was crystallized (Fothergill-Gilmore and Watson, 1989). The C-terminal tail extends 14 residues beyond the crystallographic C-termini indicated by the letter "C".

FIG. 19. The yeast enolase subunit. The α/β-barrel domain is located in the upper part of the diagram.


FIG. 21. The subunit of cat muscle pyruvate kinase. The four domains (N, A, B and C) are labelled, and ligand access to the active site is indicated. The likely location for the effector site is at the base of the N-domain. Space-filling models of ATP and pyruvate are shown at the active site. The coordinates were kindly made available by H. Muirhead.


MATRIX 8. PAIRWISE COMPARISONS OF PHOSPHOGLYCERATE MUTASE SEQUENCES

          humm  humb  humr  rabr  mour  yea   Sco   f26bp
hummus    100
humbra     81   100
humrbc     52    54   100
rabrbc     52    55    97   100
mourbc     50    52    92    91   100
yea        51    49    47    46    46   100
Sco        52    53    48    48    46    60   100
f26bp      25    24    25    26    23    25    27   100

See Section II for nomenclature and references for the sequences. Also included is a comparison of these sequences with the sequence of the phosphatase subunit (abbreviation, f26bp) of the bifunctional enzyme 6-phosphofructo-2-kinase/fructose-2,6-bisphosphatase (Lively et al., 1988).

surrounded by ~-helices (Fig. 17). The yeast enzyme is a tetramer. The polypeptide backbone folds into a single domain with a structural motif reminiscent of that found in nucleotidebinding domains of dehydrogenases and kinases (see Section 111.3). The C-terminal 14 residues are not observed in the electron density map, presumably because they constitute a flexible tail in the non-phosphorylated, non-ligated form of the enzyme that was crystallized. These residues are of considerable importance because they are required for activity (Sasaki et al., 1966; Price et al., 1985), and their role is currently being investigated by site-directed mutagenesis (White and Fothergill-Gilmore, manuscript in preparation). (i) Enolase The dehydration of 2-phosphoglycerate to phosphoenolpyruvate is catalyzed by enolase (EC 4.2.1.11), and this reaction is important for the formation of an enol phosphate required for the generation of ATP in the next step of glycolysis (Fig. 1). Enolase is a dimer with identical subunits of 3'/, 45,000, and requires Mg 2÷ for the stability of the dimer and for inducing a conformational change as a prerequisite for substrate binding. Additional Mg 2÷ is needed for catalysis. Enolase is one of the most abundantly expressed cytosolic proteins in many organisms, and appears to have a variety of functions in addition to its role in glycolysis (see Section XI). In vertebrates the enzyme occurs as three isoenzymes: ~ is found in liver and many other tissues, fl predominates in muscle, and ~ is specific for neurones and neuroendocrine tissue. The expression of these isoenzymes is regulated both developmentally and tissue-specifically. Rather surprisingly, the kinetic properties of the isoenzymes are all very similar, and the functional impetus for the evolution of the different forms remains rather enigmatic, but may relate to variations in stability. 
Representative sequences of all three isoenzymes are available from mammals, birds and an amphibian, as are sequences of the enzymes from Drosophila and yeast (Fig. 18). Pairwise comparisons of the sequences (expressed as per cent identities for the whole enzymes and for some of the domains) are given in Matrix 9. Enolase has the dubious distinction of being the last glycolytic enzyme to have its high-resolution (2.25 Å) crystal structure solved (Lebioda and Stec, 1988; with reinterpretation by Lebioda et al., 1989). A notable feature of the structure is the presence of an eight-stranded α/β-barrel that is similar, but not quite identical in topology, to the barrels found in aldolase, triosephosphate isomerase and pyruvate kinase, as well as in many non-glycolytic enzymes (Fig. 19, and see Section III.3 for a fuller discussion). In addition, the enolase subunit has a second domain of helices and β-strands which is smaller than the barrel, and a third region which corresponds to a C-terminal tail of 15 residues. The active site of enolase is located at the C-terminal part of the barrel, as it is in two of the other glycolytic α/β-barrel enzymes, triosephosphate isomerase and pyruvate kinase. The "conformational" Mg²⁺ binds to the active site, where it appears to be in a suitable position to coordinate the hydroxyl group of the 2-phosphoglycerate substrate as well as water molecules. Elucidation of the precise location and role of the "catalytic" Mg²⁺ requires further substrate binding studies.


[FIG. 18: aligned enolase sequences humA, ratA, ducA, XlaA, humB, ratB, chiB, humG, ratG, Dme, yea1 and yea2.]

FIG. 18. Alignment of enolase sequences. See Section II for nomenclature and references for the sequences. Elements of regular secondary structure observed in the yeast enzyme are shown. The boundary between the N-terminal meander domain and the C-terminal barrel domain is indicated by the arrow. The numbering is according to the yeast enzyme.

MATRIX 9(a). PAIRWISE COMPARISON OF ENOLASE SEQUENCES

        humA  ratA  ducA  XlaA  humB  ratB  chiB  humG  ratG  Dme   yea1  yea2
humA     100
ratA      94   100
ducA      92    90   100
XlaA      88    87    90   100
humB      83    83    83    83   100
ratB      83    82    82    82    96   100
chiB      82    82    81    79    85    84   100
humG      83    83    84    82    83    82    81   100
ratG      83    82    84    82    84    82    81    98   100
Dme       73    73    74    76    73    72    70    72    72   100
yea1      64    62    63    63    64    63    61    62    62    65   100
yea2      62    61    62    61    63    62    59    61    61    63    95   100

MATRIX 9(b). PAIRWISE COMPARISONS OF DOMAINS FROM REPRESENTATIVE SEQUENCES

N-terminal meander domain (residues 1-141)

        humAN  humBN  humGN  DmeN   yea1N
humAN    100
humBN     82    100
humGN     82     81    100
DmeN      72     74     71    100
yea1N     59     63     58     65    100

C-terminal barrel domain (residues 142-436)

        humAC  humBC  humGC  DmeC   yea1C
humAC    100
humBC     84    100
humGC     84     84    100
DmeC      73     72     73    100
yea1C     66     64     65     65    100

See Section II for nomenclature and references for the sequences.
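The per cent identities in Matrix 9 can be illustrated with a minimal sketch. It assumes that identity is counted only over alignment columns in which neither sequence has a gap; the review does not state how gaps were treated, so that convention, the function names and the toy sequences below are all hypothetical and are not taken from Fig. 18.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Per cent identity between two aligned sequences.

    Assumption: columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


def identity_matrix(alignment: dict[str, str]) -> dict[tuple[str, str], float]:
    """Lower-triangular matrix of per cent identities, keyed by (row, column)."""
    names = list(alignment)
    matrix = {}
    for i, row in enumerate(names):
        for col in names[: i + 1]:
            matrix[(row, col)] = percent_identity(alignment[row], alignment[col])
    return matrix


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "seq1": "SILKIHAREIFD",
    "seq2": "SILKIHAREILD",
    "seq3": "AVSKVYAR-VYD",
}

for (row, col), value in identity_matrix(toy_alignment).items():
    print(f"{row} vs {col}: {value:.0f}")
```

A domain comparison of the kind shown in Matrix 9(b) would apply the same function to the slice of each aligned sequence that covers the domain in question.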

(j) Pyruvate kinase
Pyruvate kinase (EC 2.7.1.40), like phosphoglycerate kinase, is responsible for substrate-level phosphorylation during glycolysis, and catalyzes the formation of ATP from phosphoenolpyruvate and ADP (Fig. 1). The reaction mechanism involves the addition of a proton, and a direct in-line transfer of the phospho group. The product, pyruvate, is the first non-phosphorylated intermediate in the pathway, and occupies a central role in metabolism. There is a requirement for a monovalent cation (usually K⁺) and two divalent cations (usually Mg²⁺), one of which is bound to the enzyme and the other associated with the nucleotide (reviewed by Muirhead, 1987). Pyruvate kinase can be isolated from mammalian tissues as four different isoenzymes which have different kinetic properties, reflecting the different metabolic requirements of the tissues. The M1 isoenzyme is found in skeletal muscle, and shows predominantly hyperbolic Michaelis-Menten kinetics. The other isoenzymes (M2 in kidney, adipose tissue and lung, L in liver, and R in red blood cells) are all allosterically regulated, and show sigmoidal kinetics with respect to the substrate phosphoenolpyruvate (reviewed by Hall and Cottam, 1978). The enzyme from yeast has properties similar to the M2 isoenzyme (Hunsley and Suelter, 1969). A variety of molecules can act as allosteric effectors, and in addition, the liver isoenzyme can be regulated by phosphorylation. In this isoenzyme, phosphorylation of a serine near the N-terminus leads to a decrease of activity which can be reversed by dephosphorylation. The other types of pyruvate kinase have different N-terminal sequences (see Section III.2(c)), and thus are not susceptible to control by phosphorylation.

The expression of the four pyruvate kinase isoenzymes in mammalian tissues is of particular interest because there are only two genes encoding the four isoenzymes. This topic is discussed more fully in Section VI.3. Sequences of all four of the mammalian isoenzymes are available, as are the sequences of pyruvate kinase from representatives of a phylogenetically wide range of organisms (Fig. 20). The rat M1 and M2 isoenzymes are identical except for a stretch of 56 amino acids (residues 380-435) where about half the amino acids are different (Noguchi et al., 1986). This portion of the pyruvate kinase molecule is encoded by two alternative exons (see Section VI.3) which are differently spliced in different tissues. Analogously, the L and R isoenzymes are also identical except that the N-terminal region of the R isoenzyme is longer. In this case, the use of alternative promoters yields isoenzymes of different length in liver and in red blood cells (Noguchi et al., 1987). Pairwise comparisons of the sequences (expressed as per cent amino acid identities for the whole enzymes and for some domains) are given in Matrix 10.

The pyruvate kinase subunit folds into four distinct domains (designated N, A, B and C; Fig. 21), and is thus more complex than any of the other glycolytic enzymes. The largest of the domains is a beautifully symmetrical eight-stranded α/β-barrel of identical topology to those occurring in aldolase and triosephosphate isomerase. The active site is located in a pocket between domains A and B in a very similar position relative to domain A as the active sites in other enzymes with eight-stranded α/β-barrel structures. It is not possible to be certain of the position of the effector site because only the non-allosteric muscle pyruvate kinase has been crystallized. However, the muscle enzyme does have a secondary nucleotide binding site (Fig. 21), and it is possible that this may correspond to the effector site in the allosteric isoenzymes.

FIG. 20. Alignment of pyruvate kinase sequences. See Section II for nomenclature and references for the sequences. The boundaries between the four domains are indicated by arrows, and the boundaries of the differentially spliced exon in the M1 and M2 isoenzymes are shown by triangles. Domain N corresponds to residues 1-41; domain A to residues 42-115 and 224-387; domain B to residues 116-223; and domain C to residues 388-530. The differentially spliced exon corresponds to residues 380-435. The serine residue that is phosphorylated in the liver isoenzyme is indicated by the star symbol. Elements of regular secondary structure observed in the cat muscle enzyme are shown. The numbering is according to the cat enzyme.

2. Sequence Comparisons The present day structures of glycolytic enzymes are the result of a long evolutionary history. The structures have been shaped under constraints imposed by the metabolic requirements of the organisms. The nature and extent of the constraints must have been different for each part of a protein, dependent on its function. In this section we shall explore how different parts of the proteins have evolved under the constraints encountered. (a) Rates of evolution of entire enzymes The rates by which enzymes evolve can be calculated from comparisons of amino-acid sequences. Figure 22 shows the evolution rate of the glycolytic enzymes, based on the differences of the sequences of a limited set of organisms, for which the time at which they diverged from a common ancestor was calculated by Doolittle et al. (1989). These divergence times are presented in Table 2. For most glycolytic enzymes the evolution rate seems to be fairly constant. However, for some enzymes, most notably glucosephosphate isomerase, phosphoglycerate mutase and pyruvate kinase a biphasic curve was obtained. The majority of the enzymes evolve at a rate of 5-8 PAMs/100 million years (see Table 3). These rates are

TABLE 2. ORGANISM DIVERGENCE TIMES USED TO CALCULATE THE EVOLUTION RATE OF GLYCOLYTIC ENZYMES*

Divergence

Time (million years)

Primate---rodent/ungulate Mammal--bird Mammal--fish Vertebrate--invertebrate Animal fungus--plant Eukaryote---prokaryote

80 290 400 600 1000 1800

*Taken from Doolittle et al. (1989).

Bst

Eco

TbrCy2

TbrCyl

yea

Anig

Anid

pot

ratrbc

ratliv

humliv

chimus

catmusl

ratmusl

ratmus2

hummus2

51 43

42 42 I00

92 99 I00

69 92 i00

69 69 i00

87 88 I00

95

i00

100 67

68

69

69

67

68

69

44

43

43

44

44

i00

51

51

52

52

53

54

48

I00

I00

47

66

I00

47

41

47

47

49

51

50

50

51

51

Tbrl

67

39

50

50

50

49

49

50

51

51

yea

94

43

50

50

51

52

52

53

54

54

Anig

MATRIX 10(a). PAIRWISECOMPARISONSOF PYRUVATEKINASESEQUENCES

69

70

54

Anid

86

44

pot

92

70

ratR

96

70

ratL

I00

71

86

93

humL

chiM

caMl

94

97

I00

raMl

raM2

huM2

Tbr2

i00

99

48

48

47

41

47

47

49

51

50

51

51

51

I00

44

44

44

46

45

47

46

46

47

48

49

49

49

49

Eco

I00

51

41

41

42

46

45

45

45

45

46

48

47

47

47

47

Bst

8_

5

o 5

ml

164

L.A. FOTHERGILL-GILMOREand P. A. M. MICHELS MATRIX 10(b). PAIRWISECOMPARISONSOF DOMAINS FROMREPRESENTATIVESEQUENCES Domain A (residues 42-115; 224-387)

hummus2A ratmus2A yeaA

hummus2A

ratmus2A

yeaA

I00

98 100

62 62 100

hummus2B

ratmus2B

yeaB

100

98 100

46 47 100

hummus2C

ratmus2C

yeaC

100

97 100

41 40 100

Domain B (residues 116-223)

hummus2B ratmus2B yeaB

Domain C (residues 388-530)

hummus2C ratmus2C yeaC

See Section II for nomenclature and references for the sequences.

TABLE3. EVOLUTIONRATESOF GLYCOLYTICENZYMES Enzyme

Evolution rate (PAMs/100 million years)

Hexokinase N-terminal half C-terminal half Glucosephosphate isomerase Phosphofructokinase N-terminal half C-terminal half Aldolase Triosephosphate isomerase Glyceraldehyde-phosphate dehydrogenase Phosphoglycerate kinase Phosphoglycerate mutase Enolase Pyruvate kinase

16 14 6 7 10 6 6 5 6 10 5 8

The evolution rates were calculated from the slopes of the curves in Fig. 22. For the calculation of the values given for giucosephosphate isomerase, phosphoglycerate mutase and pyruvate kinase the bacterial points were not included because of the large variations obtained.

relatively slow. Several h o u s e k e e p i n g enzymes are f o u n d to evolve faster, a n d the e v o l u t i o n rate of l u x u r y p r o t e i n s can be c o n s i d e r a b l y higher (Table 4). H e x o k i n a s e a n d p h o s p h o f r u c t o k i n a s e seem to u n d e r g o a faster e v o l u t i o n t h a n the o t h e r glycolytic enzymes. C o m p a r i s o n of each o f the h o m o l o g o u s halves of the m a m m a l i a n h e x o k i n a s e s (that a r o s e by a gene d u p l i c a t i o n / f u s i o n process) with each of the "half-size" yeast isoenzymes indicates t h a t b o t h halves evolved at a p p r o x i m a t e l y the s a m e high rate: 14--16 P A M s / 1 0 0 million years. In c o n t r a s t , the two halves of p h o s p h o f r u c t o k i n a s e , which also have a similar ancestry, evolved at different rates. T h e N - t e r m i n a l half, t h a t has r e t a i n e d the active site, has evolved m o r e slowly (7 P A M s / 1 0 0 million years) t h a n the C - t e r m i n a l half (10 P A M s / 1 0 0 million years), t h a t d e v e l o p e d into a r e g u l a t o r y unit. This fast e v o l u t i o n rate is c o n s t a n t t h r o u g h o u t

Evolution of glycolysis

300

200-

165

PGI

• HKN * HKC ZOO100

100-

0

B

1000

2000

0

80-

300

1000

2000

1000

2000

ALl)4)

• PFK N PFK C

200-

40 100• 20 o ov-" O. e-

0 1000

O Q. Q) Q;

t,; U

Oi

,o] 0

10¢

120•

.

I 0 0 . TIM

80, E

2000

B

60, ~ 40, 20, 0

IGAPDH

6O 40

y



204

1000

0

120,

~

,

0 20001oo,

I000

PGM

PGK

100.

80

80,

60

60,

2000



40

40, 20 20

0i

0

,

0

1000

60 • ENO a

50, i

2000

0

120

,~



100£

~

/

8o2.

30

60

20-

4o£

lO-

2o£ 0

2000

PYK

4o-~

04

1000 m

O~ 10'00

2000

0

I000

2000

million of years

FIG. 22. Evolution rates of glycolytic enzymes. Each panel shows, for a particular enzyme from selected organisms, the differenees between the amino acid sequences expressed as "accepted point mutations per 100 residues" (Dayhoff, 1978; see Section II) vs the organism divergence times (see Table 2). For glucosephosphate isomerase, phosphoglycerate mutase and pyruvate kinase the lines have not been drawn to the bacterial points, because of the large variations observed. Abbreviations: HK N and HK C, hexokinase, N and C terminal halves; PGI, glucosephosphate isomerase; PFK N and PFK C, phosphofructokinase, N and C terminal halves; ALDO, aldolase; TIM, triosephosphate isomerase; GAPDH, glyceraldehyde-phosphate dehydrogenase; PGK, phosphoglycerate kinase; PGM, phosphoglycerate mutase; ENO, enolase; PYK, pyruvate kinase.

TABLE 4. EVOLUTION RATES OF CONTROL PROTEINS

Enzyme                       Evolution rate (PAMs/100 million years)
Heat shock protein 70         5
Cytochrome c                  9
Alcohol dehydrogenase        11
Dihydrofolate reductase      14
α-Globin                     20
Fibrinopeptide               45

1800 million years. This holds true both when the half-size prokaryotic enzyme is compared with the C-terminal half of the eukaryotic phosphofructokinase and also when the effector-site-containing halves of the eukaryotic enzymes are compared amongst themselves. This may be an indication that this highly regulated enzyme has, throughout evolution, been adapting itself in different ways to the metabolic requirements of different organisms (see also Section X). The large differences between the mammalian and yeast hexokinases could be interpreted in a similar manner.

The observation that some plots of evolutionary rates (Fig. 22) look biphasic could be explained in several ways. For instance, some sequences used in the analysis may not have been representative of a phylogenetic group. This may be because the glycolytic enzyme of a particular organism was subjected to specific metabolic constraints. Thus, when the sequences are used to estimate phylogenetic relationships, apparently aberrant results may be obtained. Phylogenetic trees for glucosephosphate isomerase, glyceraldehyde-phosphate dehydrogenase and pyruvate kinase are presented in Fig. 23. Careful inspection of these trees suggests that, in some cases, horizontal transfer of glycolytic genes between organisms could be invoked to explain the anomalies (see also Section V). The strange branching pattern of the tree based on the sequences of glucosephosphate isomerase (for example, compare the positions of B. stearothermophilus and E. coli) can be explained by horizontal gene transfer as well as by specific metabolic constraints. However, the trees calculated from the sequences of glyceraldehyde-phosphate dehydrogenase and pyruvate kinase show some convincing examples of gene transfer. Examples include the E. coli gapA gene (Doolittle et al., 1990) and the pyruvate kinase gene from potato.
It also seems very likely that the gene encoding one of the glyceraldehyde-phosphate dehydrogenase isoenzymes in trypanosomes was acquired from a different eukaryotic organism, not only because it is very different from that of the other isoenzyme (Michels et al., 1991), but also because it is absent from some related organisms (Wiemer, E. and Michels, P., unpublished). We have tried to exclude the amino-acid sequences encoded by horizontally transferred genes from our calculations of evolutionary rates.

The non-linear plots in Fig. 22 can also be attributed to the likelihood that the evolution rates of proteins are not necessarily identical in different organisms or phylogenetic groups and that the rates are not constant in time (Dover, 1987). Yet another explanation may be that some of the glycolytic enzymes used in our evolutionary analysis have acquired additional, unrelated functions. Different enzymic activities or sometimes even structural roles have been attributed to several glycolytic enzymes or homologous structures. Examples include glucosephosphate isomerase as neuroleukin; glyceraldehyde-phosphate dehydrogenase as a protein with a function in transcription regulation, DNA repair and formation of microtubules and sarcoplasmic reticulum; enolase as a structural lens protein. This will be discussed in more detail in Section XI. Obviously such proteins would have evolved under different selective forces.

[Figure 23: three unrooted trees, labelled PGI, GAPDH (with the groups EUKARYOTES, PROKARYOTES and CHLOROPLASTS) and PYK. Legible leaf labels for PGI: P. falciparum, T. brucei, S. cerevisiae, B. stearothermophilus, Clarkia, pig, E. coli, human. Legible leaf labels for PYK: T. brucei, E. coli, B. stearothermophilus, potato, chicken M, human M2, rat M2, S. cerevisiae, rat L, human L, rat R.]

FIG. 23. Unrooted trees showing the evolutionary relationships inferred from the amino-acid sequences of glucosephosphate isomerase, glyceraldehyde-phosphate dehydrogenase and pyruvate kinase from a number of organisms. The phylogenetic relationships were estimated from the aligned sequences by a distance matrix method (Fitch and Margoliash, 1967), as described by Michels et al. (1991). The aberrant position of E. coli in the phylogenetic tree inferred from glyceraldehyde-phosphate dehydrogenase sequences is presumably due to horizontal transfer of the gapA gene from a eukaryotic cell to this bacterium (Doolittle et al., 1990).

(b) Rates of evolution of domains

Like many other proteins, most glycolytic enzymes are made up of different domains. These are units generally composed of a continuous stretch of a polypeptide chain with some elements of secondary structure, and form a spatially distinguishable part of a three-dimensional structure, usually with a specific catalytic or binding function. Very often these domains, or subdomains, are encoded by separate exons. Domains are considered as the building blocks of proteins. During evolution different domains have been assembled in various combinations by exon shuffling to create proteins with new properties, such as in the case of pyruvate kinase (see Section VI). Since the function of different domains varies, they will be subject to different evolutionary forces.

The evolution rates of the major domains of glycolytic enzymes, calculated from Matrices 2(b)-10(b), are shown in Table 5. For some enzymes the evolution rates of the contributing domains are not significantly different. This is the case for domains 1 and 2 in both the N- and C-terminal halves of phosphofructokinase, for the N- and C-terminal domains of phosphoglycerate kinase, and for enolase. In glucosephosphate isomerase, aldolase, glyceraldehyde-phosphate dehydrogenase and pyruvate kinase the highest degree of conservation is found in the domain that contains the active site or contributes the largest part to it when the active-site pocket is formed by two domains. In glucosephosphate isomerase it is the smaller domain that evolves relatively slowly compared to the large domain. Strikingly extensive variations seem to exist amongst the large domains of mammalian glucosephosphate isomerases. The barrel domain of aldolase, which forms the major part of the enzyme, is rather well conserved. Much more evolution is seen in the

TABLE 5. RATES OF EVOLUTION OF DOMAINS (PAMs/100 million years)

Sequences compared                          Domains

Glucosephosphate isomerase                  Large    Small    C-terminus
  hum-mou                                   16.0     8.0      8.0
  hum-yea                                    5.5     4.1      4.1
  mou-yea                                   32.5     5.2      6.3

Phosphofructokinase, N-terminal half        1        2
  hum-mou                                    4.3     5.8
  hum-yea                                    7.2     8.6
  mou-yea                                    7.5     8.6

Phosphofructokinase, C-terminal half        1        2
  hum-mou                                    6.5     6.5
  hum-yea                                   14.0    15.0
  mou-yea                                   12.5    14.5

Aldolase                                    barrel   C-terminus
  humA-mouA                                  2.5     3.8
  humA-Dme                                   6.3     9.3
  humA-mai                                   5.4     8.9
  mouA-Dme                                   6.3     8.7
  mouA-mai                                   5.2     8.9
  Dme-mai                                    5.6     8.9

Glyceraldehyde-phosphate dehydrogenase      N        C
  hum-mou                                    8.7     6.3
  hum-yea                                    8.3     3.1
  mou-yea                                    7.2     3.1

Phosphoglycerate kinase                     N        C
  hum-mou                                    3.8     2.5
  hum-yea                                    4.6     4.8
  mou-yea                                    4.6     5.0

Enolase                                     N        barrel
  humA-Dme                                   6.0     5.7
  humA-yea                                   6.0     4.6
  Dme-yea                                    4.8     4.8

Pyruvate kinase                             A        B        C
  humM2-ratM2                                2.5     2.5      3.8
  humM2-yea                                  5.4     9.5     11.3
  ratM2-yea                                  5.4     9.5     11.7


C-terminal arm, despite its important role in opening and closing the active pocket during the catalytic process. It could well be that part of the structural differences observed in the C-terminal arm are related to differences in substrate specificity (Berthiaume et al., 1991; Hester et al., 1991).

Each subunit of glyceraldehyde-phosphate dehydrogenase contains two domains. The N-terminal domain, responsible for NAD+ binding, evolves faster than the catalytic C-terminal domain. The C-terminal domain of glyceraldehyde-phosphate dehydrogenase is the most highly conserved of all domains in glycolytic enzymes.

In pyruvate kinase four domains can be distinguished: N, A, B and C. The N-terminal domain is highly variable in length and sequence. Domains A and B, which together constitute the active-site pocket, are rather well conserved. The evolution rate of the C-terminal domain is much higher. This domain makes extensive inter-subunit contacts, and presumably plays an important role in the regulation of the enzyme's activity by various effectors (Muirhead, 1987). Since a lot of variation is seen in the manner by which pyruvate kinase is regulated in different mammalian tissues and in different organisms, it is not surprising that domain C evolves faster than domains A and B.

No detailed analysis has been made of domain evolution in hexokinase, because the boundaries of domains and areas of secondary structure have not been precisely defined for this enzyme. The polypeptide chains of triosephosphate isomerase and phosphoglycerate mutase each fold into only a single domain.

It is also possible to recognize differences in evolution rate at the subdomain level. Domains are relatively large structures in which the residues are not all equally important for structural and functional properties. Amino acids directly involved in substrate binding and the catalytic process are generally well conserved.
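The domain-level comparisons drawn above can be checked against the entries of Table 5. The Python sketch below averages the three pairwise rates listed for each domain of glyceraldehyde-phosphate dehydrogenase and pyruvate kinase. The values are as read from Table 5; the dictionary layout, the function name `mean_rates` and the use of a plain arithmetic mean are illustrative assumptions, not the authors' procedure.

```python
from statistics import mean

# PAMs/100 million years, as read from Table 5
# (three pairwise sequence comparisons per domain)
TABLE_5 = {
    "GAPDH": {"N": [8.7, 8.3, 7.2], "C": [6.3, 3.1, 3.1]},
    "PYK": {"A": [2.5, 5.4, 5.4], "B": [2.5, 9.5, 9.5], "C": [3.8, 11.3, 11.7]},
}


def mean_rates(enzyme):
    """Return {domain: mean rate} for one enzyme in TABLE_5."""
    return {domain: mean(values) for domain, values in TABLE_5[enzyme].items()}


for enzyme in TABLE_5:
    # List domains from slowest to fastest evolving
    ranked = sorted(mean_rates(enzyme).items(), key=lambda item: item[1])
    print(enzyme, [(domain, round(rate, 1)) for domain, rate in ranked])
```

On these numbers the catalytic C-terminal domain of glyceraldehyde-phosphate dehydrogenase averages a lower rate than the NAD+-binding N-terminal domain, and domain C of pyruvate kinase averages a higher rate than domains A and B, in line with the description in the text.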
Based on kinetic considerations Albery and Knowles (1976) have even concluded that triosephosphate isomerase has reached evolutionary perfection as a catalyst. Further changes of the active site would not accelerate the catalytic process, because the reaction rate is diffusion-limited in the association of glyceraldehyde-phosphate with the enzyme. Furthermore, visual inspection of the alignments in Figs 2, 4, 6, 8, 11, 13, 15, 16, 18 and 20 shows that many fewer substitutions occur in segments of internal secondary structure. In contrast, the peptide loops between these structures evolve quite rapidly. Loops at the surface of the protein are particularly highly variable in length and sequence.

(c) Variations at termini

Glycolytic enzymes are generally well conserved. This holds true not only for the nature of the amino acids in the polypeptide chain, but also for the length of the chain. Elongation of the termini is presumably usually prevented by structural constraints. This may be the case if the termini are buried within the three-dimensional structure or if they are present at or near an area of the surface that is involved in contact between subunits or with other proteins. Alternatively an involvement in the catalytic process may preclude changes at the terminus. For instance, the hydroxyl group of the conserved C-terminal tyrosine of aldolases appears to be essential for enhanced catalysis (Berthiaume et al., 1991).

Nevertheless, variations at the termini seem to be permitted in some glycolytic enzymes. Occasionally very large extensions can be observed. The N-terminus of the glucosephosphate isomerase of Trypanosoma brucei contains an additional stretch of 50 amino acids, while the yeast phosphofructokinase has an N-terminal extension of approximately 200 residues. The function of these extended termini has not yet been determined.
In contrast, a regulatory role could be attributed to the larger N-terminal domains found in some mammalian isoenzymes of pyruvate kinase. A serine residue near the N-terminus of the L- and R-type pyruvate kinases can be phosphorylated by a cAMP-dependent protein kinase, leading to a decrease of the enzyme's catalytic activity. Some large variations at the C-terminus are seen in glyceraldehyde-phosphate dehydrogenase and phosphoglycerate kinase. These variations are most likely associated with the compartmentalization of isoenzymes (see also Section V.1(b)). In chloroplasts of all higher plants glyceraldehyde-phosphate dehydrogenase is composed of two types of subunits, A and B, forming a heterotetramer A2B2. The B subunit has a C-terminal extension of 30-50


residues with a high proportion of acidic side chains. This extension is not essential for catalytic function, but is possibly responsible for the association of the enzyme with the chloroplast envelope (Brinkmann et al., 1989). A C-terminal extension of 20-40 amino acids in one of the phosphoglycerate kinase isoenzymes in T. brucei and in the related organism Crithidia fasciculata is responsible for the targeting of the polypeptides from the site of synthesis in the cytosol to the lumen of a microbody-like organelle, called the glycosome (Swinkels et al., 1988; Fung and Clayton, 1991). This topogenic signal is not cleaved after passage through the organellar membrane, but apparently does not interfere with the catalytic function of the enzyme (Misset and Opperdoes, 1987). Other glycolytic enzymes of trypanosomes are also located in this organelle. Two of them, glucosephosphate isomerase and glyceraldehyde-phosphate dehydrogenase, terminate at a tripeptide which corresponds with an established microbody targeting signal of yeast, plants and higher eukaryotes (Gould et al., 1990). This putative targeting signal of these microbody-located glycolytic enzymes in trypanosomes is retained in the mature protein. In contrast, glycolytic enzymes directed to plastids of plants are processed. The signal sequence at the N-terminus is cleaved after transfer (Cerff and Kloppstech, 1982).

(d) Sequence constraints

The evolution of the structures of the glycolytic enzymes has been directed by constraints imposed by metabolic requirements. Such constraints have led to enzymes with appropriate specificity and affinity for their substrates, to a higher glycolytic flux by optimizing the catalytic performance of the individual enzymes and to a more efficient use of the glycolytic substrates by the development of control mechanisms. There is ample evidence that evolution selects enzymes with appropriate specificity and affinity for their substrates.
An example is the occurrence of the mammalian isoenzymes for hexokinase, each with a structure that imposes a different affinity and selectivity for the sugar substrates, according to the metabolic requirements of the tissue (see also Section V). One can also distinguish different aldolases which preferentially bind either fructose 1-phosphate or fructose 1,6-bisphosphate. The sequence of the C-terminal arm seems to be crucial for this specificity. Certain residues in this arm have been shown to be important for maintaining a high apparent affinity for the substrates (Takahashi et al., 1989). Moreover, other residues determine the flexibility in the hinge region of the arm and thus whether the opening of the active site is sufficient or not for the larger of the two substrates (Hester et al., 1991).

Yet another example of how the constraints of metabolic requirements affect the affinity of glycolytic enzymes for their ligands is provided by the comparative analysis of the coenzyme-binding domain of glyceraldehyde-phosphate dehydrogenases. The NAD+-binding pocket is very well conserved throughout evolution. However, the chloroplast enzyme, which is not involved in glycolysis but in photosynthetic assimilation of CO2 through the Calvin cycle, has some specific structural differences which enable it to use NADP+. Two amino-acid substitutions increase the hydrophilicity of the pocket and relieve the steric hindrance by the 2'-phosphate group of this coenzyme (Corbier et al., 1990). The NAD+-binding pocket in one of the glyceraldehyde-phosphate dehydrogenase isoenzymes in trypanosomes also shows some specific alterations, which are conserved in related organisms (Lambeir et al., 1991). A number of substitutions and insertions create additional hydrophobic space, which seems responsible for a Km for NAD+ that is approximately 10-fold higher than in the enzyme from other organisms.
The role of this modified binding pocket in the trypanosome enzyme is not yet well understood; it must be related to the specific metabolic conditions prevailing within the glycosome, the organelle in which the enzyme performs its catalytic function (see also Section V).

An example of optimization of the catalytic performance has already been mentioned in Section III.2(b); triosephosphate isomerase can be considered as one of the most perfectly evolved enzymes of the glycolytic pathway, since its kcat/Km value approaches the diffusion-controlled limits (Albery and Knowles, 1976; Pettersson, 1989, 1990). This has been achieved by the selection of the proper catalytic residues, an active site which provides the proper fit to the substrate and a matrix that positions the functional groups. Moreover, as


discussed in Section III.1(e), a flexible loop has evolved that closes over the active site to prevent undesired side reactions. In some other glycolytic enzymes evolution has built in a different form of flexibility essential for catalysis. Both hexokinase and phosphoglycerate kinase have a bilobal structure. Upon binding of the substrates in the cleft between the domains a major conformational change is induced, bringing the two lobes closer together. Phospho transfer between the bound substrates is only possible in this closed conformation.

The formation of a functional active site results not only from constraints exerted on the parts of the polypeptide chain that constitute its direct environment, but also from constraints on the matrix around it. In several enzymes the polypeptide chain of another subunit is also involved. In triosephosphate isomerase only the dimer is fully active, because the so-called interface loop contributes to the active site of the other subunit. A lot of variation is allowed in the primary structure of this loop, affecting only the stability of the dimer (Casal et al., 1987; Wierenga et al., 1987). However, conservation of its geometry and charge is required to retain catalytic efficiency, indicating its important role in forming the proper active site structure. In glyceraldehyde-phosphate dehydrogenase the subunits act together in NAD+ binding. The coenzyme molecule is situated in a cleft formed by the coenzyme-binding domain, part of the so-called S-loop of the catalytic domain (see Section III.1(f) and Fig. 12) and part of the S-loop from another subunit. Although the S-loop regions of prokaryotes and eukaryotes show some marked differences, they are extremely well conserved within both phyla, indicating their structural and functional importance (Michels et al., 1991). Interactions between subunits also play an important role in the control of enzyme activity.
Nature has developed mechanisms to efficiently up- or down-regulate activity by the cooperative binding of ligands to the individual catalytic units of an oligomeric enzyme. The conformational change which is induced by ligand binding in one unit leads via additional changes in tertiary and quaternary structure to an increase or decrease of affinity for the same or another ligand in the other subunits. This exchange of information between subunits of so-called allosteric enzymes must have been made possible by the evolution of specific structures involving the interfaces.

Allosteric regulation occurs in several glycolytic proteins. In some of them the conformational changes in the subunit interface, triggered by the binding of an allosteric effector, have been studied in detail. Comparison of the crystal structures of phosphofructokinase from B. stearothermophilus with and without the positive effector GDP or the negative effector PEP has revealed the movements of specific interface residues (Schirmer and Evans, 1990). The conservation of the interface structure over long evolutionary distances is apparent; the information obtained from the analysis of the B. stearothermophilus phosphofructokinase was applicable to the enzyme of E. coli. In this latter organism the role of the corresponding residues in allosteric behaviour could be studied by site-directed mutagenesis (Kundrot and Evans, 1991). Most glyceraldehyde-phosphate dehydrogenases show cooperativity in the binding of the cofactor NAD+. The conformational changes in the interface of the B. stearothermophilus enzyme have been described in detail by Skarzynski and Wonacott (1988).

The constraints for the development of allosteric regulation are not only exerted at the interface level, but obviously also in the effector binding sites. This can be illustrated by several glycolytic enzymes. The mammalian hexokinases I, II and III are all allosterically inhibited by glucose 6-phosphate.
In contrast, the half-size mammalian glucokinase and the yeast hexokinases A and B do not show such an allosteric inhibition by the reaction product. This suggests that after the gene duplication/fusion process that resulted in the larger mammalian enzyme, an effector site for glucose 6-phosphate evolved from what was originally a duplicate catalytic site. Phosphofructokinase provides another example in which, due to loss of constraint, a substrate binding site could be turned into a regulatory site. The sites for the allosteric inhibitor ATP and the activator fructose 1,6-bisphosphate, present in the C-terminal half of the mammalian enzymes, have most likely evolved from the ADP activator site and the fructose 6-phosphate binding site, respectively, of the ancestral half protein. The N-terminal half of the eukaryotic enzyme and the half-size bacterial protein still retain the properties of the putative ancestral protein (Poorman et al., 1984; see also Section


VI.1). Another adaptation for allosteric control is found in pyruvate kinase. Most pyruvate kinases are activated by the heterotropic effector fructose 1,6-bisphosphate. However, in trypanosomes pyruvate kinase and phosphofructokinase (that produces the effector) are not present in the same cellular compartment. Pyruvate kinase is located in the cytosol, whereas phosphofructokinase is in the glycosome (Opperdoes and Borst, 1977). The trypanosomal pyruvate kinase has consequently evolved a high-affinity allosteric binding site for fructose 2,6-bisphosphate, a compound that is produced in the cytosolic compartment (Van Schaftingen et al., 1987).

3. Enzymes with Similar Domains

The ribbon representations of the glycolytic enzymes given in this review suggest many striking similarities among the enzymes. Thus it is readily apparent that all the enzymes possess domains with a common pattern consisting of a core of β-strands surrounded by α-helices. (It is perhaps worth pointing out that this folding pattern also occurs frequently among non-glycolytic enzymes. For example, flavodoxin, catalase, aspartate transcarbamylase and carboxypeptidase all have this type of structure (Richardson, 1981).) Is it possible to deduce from the common structural features whether the enzymes have diverged from a small number of ancestral domains, or whether they have converged to protein folds that are stable and well suited to providing active sites for the reactions of glycolysis? In order to answer this question (see Section III.4), it is necessary to examine the domain topologies in detail, and also to exploit the amino acid sequence information.

Among the 10 glycolytic enzymes there are 18 recognizable domains (defining a domain to have at least four elements of regular secondary structure). Of these, three domains have the topology characteristic of nucleotide-binding domains, and four are α/β-barrels.
The remaining 11 domains all differ in detailed topology, although many are superficially very similar.

The detailed structures of three dehydrogenases (glyceraldehyde 3-phosphate, lactate and alcohol) were determined in the early 1970s, and were thus among the first protein structures to be solved. It was a remarkable finding at the time to discover that the enzymes were broadly similar in overall structure. All three enzymes have a coenzyme-binding domain and a catalytic domain, with the active sites located at the interface between the domains. All the domains consist of cores of mainly parallel β-strands flanked by α-helices. Although this structural pattern is now very familiar, at the time of discovery it was quite unexpected because there had been almost no other suggestion of structural similarity among the three enzymes. Even more striking than the broad structural resemblances was the realization that the coenzyme-binding domains were virtually indistinguishable in topology, and that the domains bound the NAD+ cofactor in very similar conformations. These discoveries sparked off several detailed comparisons of the enzymes and speculation about their evolution (e.g. Ohlsson et al., 1974; Rossmann et al., 1975).

A typical dehydrogenase NAD+-binding domain is shown in Fig. 24 and consists of six parallel β-strands and three α-helices. The order of the β-strands is reversed between strands a and d, with the consequence that there are helices on each side of the sheet. The NAD+ molecule is invariably bound to the C-terminal edge of the β-sheet. Brändén (1980) has pointed out that this topology is geometrically favourable for the formation of a cleft, which would be suitable for the binding of cofactors and substrates.
That the edge involved is always the C-terminal one probably relates to the partial positive charge associated with the N-termini of the flanking helices (Hol et al., 1978), which would be important for binding the negatively charged nucleotide. It has been noted that this coenzyme-binding domain consists of two similar units, each associated with a mononucleotide-binding region (Rao and Rossmann, 1973). The two mononucleotide-binding folds in a coenzyme-binding domain are related by an approximate 2-fold axis running parallel to the strands a and d (see Fig. 24). Among the glycolytic enzymes, there are two coenzyme-binding domains and one mononucleotide binding fold. The former occur in the coenzyme-binding domain of glyceraldehyde-phosphate dehydrogenase (Fig. 12), and in the ATP-binding domain of


FIG. 24. A typical NAD+-binding domain in dehydrogenases as present in glyceraldehyde-phosphate dehydrogenase from B. stearothermophilus. The β-strands are labelled a-f.

phosphoglycerate kinase (Fig. 14). The latter is found in domain C of pyruvate kinase (Fig. 21).

The beautifully symmetrical arrangement of eight parallel β-strands and eight α-helices in a cylindrical barrel structure was first described for triosephosphate isomerase (Banner et al., 1975, 1976). It was unexpected and exciting when it was discovered that this extensive structure is also present in pyruvate kinase (Stuart et al., 1979). Subsequently the same domain topology has been found in a remarkable number of different proteins, currently totalling 16 (reviewed by Farber and Petsko, 1990). This list will undoubtedly be added to in the future. There has been considerable interest in the evolutionary implications of the unexpected occurrence of α/β-barrels in enzymes previously thought to be unrelated (discussed in Section III.4).

There are three glycolytic enzymes that have typical α/β-barrel domains: aldolase, triosephosphate isomerase and pyruvate kinase. In addition, enolase has a very similar barrel domain, but in this case one of the β-strands is anti-parallel. The structures of these four domains are compared in Fig. 25. The locations of the active sites in the barrel enzymes are very similar in all but one of the enzymes. The only exception is the glycolytic enzyme aldolase. The active sites usually comprise pockets formed between the carboxy end of the barrel and an adjoining different domain. It appears that the cylindrical cleft surrounded by


FIG. 25. A comparison of the α/β-barrels found in aldolase (ALD), triosephosphate isomerase (TIM), enolase (ENO) and pyruvate kinase (PK). All the barrels have parallel β-strands, with the exception of one anti-parallel strand in enolase that corresponds to residues 167-172, as indicated by the arrows.

the positive charges from the helix dipoles, which is a consequence of the α/β-barrel fold, is a versatile binding site particularly well suited to negatively charged substrates and cofactors. In the case of aldolase, the active site is located in the centre of the barrel, and is accessible to ligands from the C-terminal side of the barrel.

How similar are the other 11 domains found in the glycolytic enzymes? One way to compare them is to consider the number of β-strands within each one. Only one enzyme, phosphofructokinase, has a domain with four parallel β-strands. There are two five-stranded β-sheet domains with one anti-parallel strand, as seen in hexokinase and in domain C of pyruvate kinase. Although the arrangement of strands is the same, the connections are not. There are many domains with six β-strands. These of course include the nucleotide-binding domains which have already been discussed. In addition, glucosephosphate isomerase and phosphoglycerate kinase both have six-stranded parallel β-sheets, but with different topologies. The strand order in nucleotide-binding domains is fedabc (Fig. 24), and in the N-terminal domain of phosphoglycerate kinase is feabdc. In addition, hexokinase and phosphoglycerate mutase both have six-stranded sheets, but two strands in hexokinase and one strand in phosphoglycerate mutase are anti-parallel. Two enzymes have seven-stranded domains (phosphofructokinase and glyceraldehyde-phosphate dehydrogenase) which differ both with respect to positions of anti-parallel strands and with respect to topologies. There are four domains with eight-stranded structures, as has already been discussed in the context of the barrel domains.
Finally, it should be mentioned that there is one domain that does not conform to the typical pattern, i.e. a core of mainly parallel β-strands alternating with α-helices. Domain B of pyruvate kinase has an extensive anti-parallel β-sheet with almost no α-helical content (Fig. 21).
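The two six-stranded strand orders quoted above can be compared mechanically. The following Python sketch treats a sheet's strand order as a string and regards two orders as equivalent if one equals the other or its reversal (the same sheet read from the opposite edge). This is a deliberately crude assumption: it ignores strand direction and the connections between strands, both of which the text treats as part of the topology, and the function name `same_strand_order` is hypothetical.

```python
def same_strand_order(order_a, order_b):
    """True if two strand orders match directly or when one is read from the other edge."""
    return order_a == order_b or order_a == order_b[::-1]


nucleotide_binding = "fedabc"  # strand order in nucleotide-binding domains (Fig. 24)
pgk_n_terminal = "feabdc"      # N-terminal domain of phosphoglycerate kinase

print(same_strand_order(nucleotide_binding, pgk_n_terminal))  # False
print(same_strand_order(nucleotide_binding, "cbadef"))        # True (reversed reading)
```

Even under this permissive comparison the two sheets do not match, which is consistent with the statement that they have different topologies.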


Thus, the domains of glycolytic enzymes, whose structural similarities are so striking at first glance, are revealed to differ in detail. The fact that almost all the domains have a core of mostly parallel β-strands with flanking helices argues compellingly that this is a structure that is particularly well suited to catalyzing the reactions of the glycolytic pathway. It is stable and provides binding pockets for negatively charged ligands.

4. Convergent or Divergent Evolution?

The previous sections of this review have been concerned with descriptions of the sequences and crystal structures of all the glycolytic enzymes. There is obviously a substantial body of structural information now available. What can this tell us about how the pathway has evolved? In most cases, a comparison of the sequences of a particular enzyme shows that they are convincingly similar to each other. In other words, the sequences are homologous; they have diverged from a common ancestor. However, in some cases the evidence is much less convincing. For example, yeast hexokinase is only 30-36% identical to the mammalian hexokinases. Are they homologous? What about glucosephosphate isomerase from B. stearothermophilus? It is only 21-23% identical to the mammalian sequences.

It is important to understand the limits of what a provable match will be. It often seems surprising that two random or totally unrelated amino acid sequences may turn out to be 10-20% identical when aligned by a typical computer alignment program. The reason is, of course, that alignment programs allow gaps to be incorporated in either sequence if, by so doing, the alignment is significantly improved. However, an appropriate gap penalty is always imposed in order that gapping not get out of hand. If gaps were not allowed, then two random sequences would be about 5-6% identical. All of the alignments of the glycolytic enzymes have gaps that have been introduced in this way.
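The baseline for ungapped random sequences can be made concrete. For two independent random sequences, the expected fraction of identical positions is the sum of the squared residue frequencies, which is 1/20 = 5% for 20 equally frequent amino acids and somewhat higher for uneven compositions. The Python sketch below computes this expectation and checks it with a small seeded simulation; the function names and the uniform-composition assumption are illustrative and are not taken from the review.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def expected_identity(frequencies):
    """Expected ungapped identity of two random sequences: sum of squared frequencies."""
    return sum(f * f for f in frequencies)


def simulated_identity(length, seed=0):
    """Fraction of matching positions between two uniform random sequences."""
    rng = random.Random(seed)
    a = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    b = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    return sum(x == y for x, y in zip(a, b)) / length


uniform = [1 / 20] * 20
print(expected_identity(uniform))   # close to 0.05
print(simulated_identity(100_000))  # should also be close to 0.05
```

Allowing gaps lets an alignment program search over many more pairings, which is why the identities it reports for unrelated sequences are higher than this ungapped baseline.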
Moreover, the gaps have been positioned such that they occur outwith regular elements of secondary structure, and are thus likely to be present in surface loops where variations in length and conformation can be relatively readily tolerated. Two diverging sequences take the course of a negative exponential. This follows strictly from classical statistics and the fact that any position can be subject to reverse changes (back mutations) and multiple hits. As such the percent identity for two sequences is not a proportionate indicator of how much change has actually occurred (Fig. 26). Two sequences that are 50% different have actually sustained "hits" amounting to 80 changes per 100 residues. A protein may suffer up to 360 such changes per 100 residues before a point is reached where it is no longer recognizable. Therefore there is no clear-cut answer as to whether proteins which are less than about 20-25% identical have diverged from a common ancestor or not. It is of course inherently difficult to prove an essentially negative argument, and it can always be argued that divergence has occurred so long ago or so rapidly as to be no longer observable. On these considerations, it is quite convincing that the two classes of aldolase are not homologous (see Section IX.2). The situation with some of the bacterial and archaebacterial sequences is much less obvious.

We have seen in the previous section that some of the different glycolytic enzymes have domains with identical topologies. Thus glyceraldehyde-phosphate dehydrogenase, phosphoglycerate kinase and pyruvate kinase all have nucleotide-binding domains. Aldolase, triosephosphate isomerase and pyruvate kinase have α/β-barrels. (It is considered that the domains with similar, but non-identical topologies are unlikely to have diverged from a common ancestor sufficiently recently to have any perceptible sequence similarities. These non-identical domains are thus not included in the discussion here.)
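The arithmetic behind the negative exponential discussed above can be illustrated with a minimal sketch. It assumes the simplest possible model, which is not necessarily the one used for Fig. 26: substitutions hit positions independently (a Poisson process), a position counts as identical only if it has never been hit, and all 20 amino acids are equally frequent. The function names are hypothetical and chosen only for this illustration.

```python
import math


def ungapped_random_identity(frequencies=None):
    """Expected fraction of identical positions between two unrelated,
    ungapped sequences: the sum of squared residue frequencies.
    Assumption: 20 equally frequent amino acids unless frequencies are given."""
    if frequencies is None:
        frequencies = [1.0 / 20] * 20
    return sum(p * p for p in frequencies)


def poisson_identity(changes_per_100):
    """Fraction of positions never hit after a given number of changes
    per 100 residues, under the simple Poisson assumption."""
    return math.exp(-changes_per_100 / 100.0)


def poisson_changes_per_100(identity):
    """Inverse of poisson_identity: changes per 100 residues implied by
    an observed fractional identity (0 < identity <= 1)."""
    if not 0.0 < identity <= 1.0:
        raise ValueError("identity must be in (0, 1]")
    return -100.0 * math.log(identity)


if __name__ == "__main__":
    print(f"random ungapped baseline: {ungapped_random_identity():.1%}")
    print(f"changes per 100 residues at 50% identity: {poisson_changes_per_100(0.5):.0f}")
    print(f"identity remaining after 360 changes per 100: {poisson_identity(360):.1%}")
```

Under these assumptions 50% identity corresponds to roughly 69 changes per 100 residues, somewhat fewer than the 80 quoted in the text, which presumably rests on a more realistic, empirically calibrated substitution model. The sketch also shows why heavily diverged sequences stop being recognizable: after 360 changes per 100 residues the untouched fraction falls below the random baseline of about 5%.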
Do comparisons of the sequences of the nucleotide-binding domains and the barrel domains help to identify any enzymes that may be divergently related? The answer is "perhaps" in the case of glyceraldehyde-phosphate dehydrogenase and phosphoglycerate kinase, and "probably not" for all the others. Alignment of the secondary structural elements of all the nucleotide-binding domains shows that the AMP mononucleotide-binding fold of glyceraldehyde-phosphate dehydrogenase is slightly more similar than random to both of the mononucleotide-binding domains of phosphoglycerate kinase (Fothergill-Gilmore, 1986). Attempts at

[Figure 26: panel (a), x-axis "changes per 100 residues"; panel (b), x-axis "length of alignment", with regions labelled 'safe' homology and 'dubious' homology.]

FIG. 26. (a) Two randomly diverging sequences change in a negatively exponential fashion. After the insertion of gaps to align two random sequences, it can be expected that they will be 10-20% identical. (b) The threshold for homology as a function of alignment length. It is reasonable to assume that two proteins of greater than 25% sequence identity have diverged from a common ancestor. (Diagram (a) was adapted from Doolittle (1986), and diagram (b) from Sander and Schneider (1991).)

analogous alignments of the secondary structure elements from the barrel domains have yielded few convincing similarities. Some stretches of sequence do appear to be somewhat more similar than random (A. W. F. Coulson and L. A. Fothergill-Gilmore, unpublished data), but this would probably be expected for such ordered structures with a requirement for hydrophobic and hydrophilic residues in prescribed positions. For nucleotide-binding domains, therefore, the evidence can be argued both for and against divergent evolution. The essentially negative proof of convergent evolution remains as difficult as ever. The sequence evidence points toward multiple genes encoding mononucleotide-binding folds which have sometimes (possibly in the case of phosphoglycerate kinase) duplicated and then diverged. However, it seems likely that most of the nucleotide-binding domains have converged toward a structure that is stable and is particularly well suited for binding negatively charged molecules such as nucleotides. For barrel domains there is no apparent sequence evidence which can be invoked to indicate divergent evolution of the large number of domains with this structure. It therefore seems likely that these domains have converged onto another type of stable structure which provides an extended and versatile binding site for negatively charged ligands. In support of this suggestion is the fact that it is possible to synthesize a totally artificial α/β-barrel (called octarellin) from eight β-strand/α-helix units modelled on naturally occurring barrels (Goraj et al., 1990). This protein engineering approach confirms that the 8-fold repeat of strand/helix units alone possesses the appropriate characteristics to fold into a stable structure.

IV. EARLY EVOLUTION

Three main theories describing ways that enzymes may have evolved have been proposed from time to time. One theory envisages that consecutive enzymes in biochemical pathways which must bind the same substrate/product molecules have evolved by a series of gene duplication events, with the consequence that all the members of a particular pathway would be related to each other. Another theory suggests that enzymes catalyzing similar classes of reaction (for example, kinases or mutases) have developed as independent groups and have gradually specialized by divergence. A third theory postulates that enzymes with a requirement for binding similar ligands such as nucleotides have evolved from a common ancestor (for example, the nucleotide-binding domains). These theories have been propounded and discussed by a number of authors, including Haldane (1965), Horowitz (1965), Woese (1965), Waley (1969), Ycas (1974) and Kacser and Beeby (1984). All of these proposals assume that the evolution of enzymes and biochemical pathways is essentially divergent. There is of course an alternative possibility: that similarities in enzyme structure and function can result from convergent evolution, and that pathways have evolved from chance associations of independently evolving enzymes. The availability of the wealth of structural information describing the glycolytic enzymes means that we are in as good a position as possible to examine the four evolutionary mechanisms. However, we have already seen that there are very few convincing similarities at the sequence level between different enzymes in the glycolytic pathway (Section III.3). This is presumably a consequence of the fact that the pathway is an ancient one. Therefore most of the examples cited in the remainder of this section will be pointers toward possibilities, rather than unequivocal proofs of one theory or another.

1. Consecutive Enzymes

One consequence of the organization of biochemical reactions into pathways is that consecutive enzymes must interact with the same ligand. Thus, the product of one enzymic reaction is the substrate of the next. It would seem a distinct possibility that one enzyme in a pathway could have evolved after gene duplication into the next enzyme by retaining the ligand-binding features but altering the catalytic residues. Is there any evidence for this type of evolution? A consideration of the domain structures of the glycolytic enzymes shows that there are indeed two examples in which consecutive enzymes have domains with identical topologies. Thus aldolase and triosephosphate isomerase both have α/β-barrel domains, and glyceraldehyde-phosphate dehydrogenase and phosphoglycerate kinase have nucleotide-binding domains. In addition, enolase and pyruvate kinase have barrel domains, but these differ in topology. The other pairs of consecutive enzymes in the main stream of the glycolytic pathway do not have domains with identical topologies. As mentioned in the previous Section (III.4), there is no apparent sequence similarity between any of the barrel domains, and it is thus not possible to ascertain from this information whether the aldolase and triosephosphate isomerase domains are divergently related. Perhaps a comparison of the intron positions within the genes encoding the two enzymes may help to shed some light on possible relationships. This information is available for triosephosphate isomerase (see Section VI.2), but not yet for aldolase. There is one pair of consecutive enzymes closely associated with glycolysis that provides an unequivocal example of divergence from a common ancestor: monophosphoglycerate mutase and bisphosphoglycerate mutase. However, in this case the "younger" of the two enzymes catalyzes an effectively irreversible reaction, and its substrate is not a ligand of the "older" enzyme.
These two enzymes are thus more appropriately discussed in the next section on enzymes catalyzing similar reactions.

2. Enzymes Catalyzing Similar Reactions

Is there any evidence to support the proposal that enzymes catalyzing similar types of reaction have developed as independent groups, and have gradually specialized by

divergence? Glycolytic enzymes can be considered to fall into six groups, roughly according to the types of reaction they catalyze: kinase, isomerase, cleaving enzyme, dehydrogenase, mutase and dehydratase. In general, the sequences of enzymes catalyzing similar reactions (i.e. the kinases and isomerases) are so different that it is not even apparent how to align them. Overwhelmingly the crystallographic and sequence information show that the enzymes within each group are not closely related structurally, and it appears most unlikely that each of these two groups of enzyme has diverged from a common ancestor. If the group of enzymes being considered is widened to include those closely associated with glycolysis (e.g. bisphosphoglycerate mutase, lactate dehydrogenase, alcohol dehydrogenase) then it is possible to cite the definite example of the divergence of two mutases from a common ancestor. In addition, the dehydrogenases may also provide examples of this type, because their coenzyme-binding domains have the same topology, and certain specialized parts of these domains sometimes have sequences slightly more similar than random sequences. These similarities, however, probably indicate the structural constraints inherent in a stable dinucleotide-binding domain but may represent vestiges of an evolutionarily distant ancestor. A comparison of the sequence of the glycolytic phosphoglycerate mutase with bisphosphoglycerate mutase (Section III.1(h) and Fig. 16) suggests a relatively recent gene duplication event followed by limited divergence. During this divergent period two of the amino acid residues at or near the active site have altered (Ser-11 to Gly, and Ala-60 to Ser) so that 1,3-bisphosphoglycerate will bind more readily to bisphosphoglycerate mutase than to the "parent" mutase enzyme.
Bisphosphoglycerate mutase is thus able to catalyze the reactions of the Rapoport-Luebering shunt (see Section VIII) at a much greater rate than can the glycolytic phosphoglycerate mutase. It is interesting to note that bisphosphoglycerate mutase retains the ability to catalyze the phosphoglycerate mutase reaction at quite a high rate. The link between the evolution of phosphoglycerate mutases and the effector properties of haemoglobin is discussed in Section VIII.

3. Enzymes Binding Similar Ligands

Five mainstream glycolytic enzymes catalyze reactions requiring either ATP or the structurally similar coenzyme NAD+. In addition, the closely associated enzymes, lactate dehydrogenase and alcohol dehydrogenase exploit the same NAD+ cofactor. The three dehydrogenases have overall structures that are broadly similar: each has an NAD+-binding domain and a catalytic domain, with the active sites at the interface between the domains. As has already been discussed in the preceding Section, the three enzymes share coenzyme-binding domains with identical topologies, and they bind the NAD+ cofactor in very similar conformations. However, the enzymes have no apparent sequence similarities. Among the four kinases, there are two enzymes which share domains with common folds: phosphoglycerate kinase has a domain with two mononucleotide-binding folds as in the dehydrogenases, and pyruvate kinase has a domain with a single mononucleotide-binding fold. Otherwise the kinases and dehydrogenases do not have any domains with common topologies. Is it possible to say whether the two kinases and the three dehydrogenases with similar domains may have evolved from a common ancestor? Unfortunately the answer is equivocal. There is no strong evidence for divergent evolution from amino acid sequences and, in fact, most comparisons of structurally analogous regions show the sequences to be unhelpfully close to random. Thus the evidence can be argued both for and against divergent evolution and it does not seem unreasonable to suggest that both are relevant to the evolution of nucleotide-binding regions. For example, glyceraldehyde-phosphate dehydrogenase and phosphoglycerate kinase may have diverged from a common mononucleotide-binding fold, whereas the analogous ATP-binding domain in the other kinases may be the result of convergent evolution to provide stable structures suitable for binding the negatively charged nucleotide.

4. Archaebacteria

Amongst the vast array of present-day organisms are those which seem to be relics of the past. These "living fossils" appear to have retained ancient features, perhaps to enable them to inhabit isolated ecological niches which may also be primitive in character. Examples of organisms of this type might include monotreme (egg-laying) mammals, and deep-sea fish such as the coelacanth Latimeria. Evolutionary scientists have been particularly intrigued by "living fossils" because they may give some clues about evolutionary processes that would otherwise be inaccessible. The relatively recent recognition of archaebacteria as a phylogenetically distinct group of organisms (reviewed by Woese, 1987) has revealed yet another group of "living fossils". As their name implies, these bacteria have a number of attributes which appear to be appropriate for survival in the conditions presumed to be extant at the very beginnings of the evolution of living forms. Thus for example, the thermoacidophile group of archaebacteria grow best at temperatures ranging from 60-100°C and in acidic conditions with a pH as low as 2. The relationships of archaebacteria to eukaryotes and eubacteria have been studied by comparisons of the sequences of various macromolecules. In particular, ribosomal RNA molecules have proved to be both convenient to study and informative (see Woese, 1987). It has become apparent from this type of evidence that archaebacteria are as distinct from eubacteria as they are from eukaryotes. The recent determination of the sequences of several metabolic enzymes (glyceraldehyde-phosphate dehydrogenase (see Table 1 for references), phosphoglycerate kinase (references in Table 1), malate dehydrogenase (Honka et al., 1990) and citrate synthase (Sutherland et al., 1990)) has provided additional evidence about phylogenetic relationships. Clearly these data are still too sparse to be able to make sweeping generalizations, but some observations appear to be reasonable. 
The archaebacterial glyceraldehyde-phosphate dehydrogenase sequences are so different from both the eukaryotic and eubacterial sequences (only 16-20% identity overall) that it is not possible to propose a confident alignment. This raises the question of whether or not the enzymes share a common ancestor (see Section IX). As has already been discussed in Section III, it is not possible to distinguish between very distantly diverged sequences on the one hand, and unrelated sequences on the other, when they share only 10-20% identical residues. The archaebacterial malate dehydrogenase and citrate synthase sequences also share very few identities with the corresponding eukaryotic and eubacterial enzymes. The situation is much clearer in the case of phosphoglycerate kinase. Here the archaebacterial sequences are 30-36% identical to the eukaryotic and eubacterial sequences (see Matrix 7(a)). These comparisons show that phosphoglycerate kinases from all three groups have diverged from a common ancestor, and also that the archaebacterial enzyme is essentially equally distant from both the eukaryotes and the eubacteria. The clear divergence of the phosphoglycerate kinase adds weight to the suggestion that the archaebacterial glyceraldehyde-phosphate dehydrogenase may be derived from a non-homologous enzyme that has converged to function as a glyceraldehyde-phosphate dehydrogenase. The fact that glyceraldehyde-phosphate dehydrogenase and phosphoglycerate kinase are evolving at about the same rate (see Section III.2) provides further justification for this hypothesis. Comparisons at the level of individual enzymes confirm that archaebacteria are indeed very distant from the other two present-day kingdoms. In addition, other lines of evidence provide reasonably convincing hints that archaebacteria may represent modern relics of very early life forms.
For example, many of the archaebacteria are adapted to thrive in conditions not very different from those presumed to have been present about 4000 million years ago when life is thought to have begun. Moreover, geochemical characterization of ancient sediments has revealed the presence of long branched-chain alkanes which are thought to be derived from ether-linked lipids present in the cell membranes of methanogens (Ourisson et al., 1984). These lipids are quite distinct from those found in eukaryotes and eubacteria, and are thus excellent indicators for the presence of archaebacteria. Of course these two points only provide circumstantial evidence consistent with the proposed antiquity of archaebacteria. It is certainly defensible to argue that the distinguishing characteristics of archaebacteria are a consequence of adaptation to extreme environments, and are not necessarily genuinely "archaic".

Can anything be learned from comparative studies of metabolic pathways in the three kingdoms of archaebacteria, eukaryotes and eubacteria? Does this sort of approach provide any clues about how the glycolytic and gluconeogenic pathways may have assembled early in evolution? The answer most certainly is "yes", and many intriguing observations have emerged from such comparative studies. In all eukaryotes glucose catabolism is accomplished by the Embden-Meyerhof pathway as illustrated in Fig. 1. However, in eubacteria hexoses can be catabolized by a variety of additional pathways such as the Entner-Doudoroff and pentose phosphate pathways (reviewed by Cooper, 1986). Most of these pathways converge at the level of glyceraldehyde phosphate, which is then converted by a common trunk pathway of enzymes to pyruvate (Fig. 27). This trunk pathway appears to be present in all eubacteria, although some species do in addition possess the alternative methylglyoxal pathway. In this latter pathway dihydroxyacetone phosphate is converted to methylglyoxal, and then to pyruvate, thus bypassing the substrate-level phosphorylation steps of the trunk pathway. The five trunk enzymes, glyceraldehyde-phosphate dehydrogenase, phosphoglycerate kinase, phosphoglycerate mutase, enolase and pyruvate kinase are therefore probably present in all eukaryotes and eubacteria. By striking contrast, the archaebacteria do not all possess the trunk pathway in its entirety. Phenotypically, archaebacteria can be placed into four groups: methanogens, extreme halophiles, thermoacidophiles, and sulphur-dependent thermophiles (reviewed by Danson, 1988, 1989). These groupings usually, but not always, reflect the genotypic relationships indicated by the ribosomal RNA analyses (Woese, 1987).
The only archaebacterial glycolytic enzymes for which complete sequences are currently available are from the methanogen

[Figure 27: the Embden-Meyerhof, Entner-Doudoroff, hexose monophosphate, pentose phosphoketolase and hexose-pentose phosphoketolase routes converge on glyceraldehyde 3-P, which is converted via 1,3-bisP-glycerate (Pi, NAD+ to NADH + H+), 3-P-glycerate (ADP to ATP), 2-P-glycerate and P-enolpyruvate (H2O) to pyruvate (ADP to ATP).]

FIG. 27. Trunk pathway of sugar catabolism. Various routes for glucose breakdown converge at glyceraldehyde 3-phosphate, which is then converted to pyruvate by the trunk pathway reactions.

group. These are for glyceraldehyde-phosphate dehydrogenase and phosphoglycerate kinase as discussed earlier in this section. In addition, preliminary sequence information has recently been obtained for pyruvate kinase from the thermoacidophile, Thermoplasma acidophilum (Potter and Fothergill-Gilmore, 1992). This pyruvate kinase is clearly homologous with the other pyruvate kinase sequences shown in Fig. 20. Hexose metabolism in the methanogens appears to operate mainly in the direction of carbohydrate synthesis. These organisms rely primarily on the formation of methane from carbon dioxide and hydrogen to satisfy their energy requirements, instead of exploiting the catabolism of hexoses. In all methanogens carbon is eventually fixed into acetyl-CoA, and from thence into glucose apparently by a reversal of the Embden-Meyerhof pathway (see Danson, 1988, 1989). Phosphofructokinase has not been found in the methanogens, and it seems likely that these organisms are unable to undertake glycolysis. However, it does seem probable that all the enzymes which can function in both the anabolic and catabolic directions are present, although they have not yet all been characterized individually. The patterns of hexose catabolism in halophiles and thermophiles are different yet again (reviewed by Danson, 1988, 1989). Halophiles possess a modified Entner-Doudoroff pathway in which the intermediates are not phosphorylated until the level of 2-keto-3-deoxy-6-phosphogluconate (KDPG). In this pathway glucose is first oxidized to gluconate, then dehydrated to 2-keto-3-deoxygluconate, and finally phosphorylated to KDPG. The enzyme KDPG-aldolase then cleaves the KDPG into one molecule of pyruvate plus one molecule of glyceraldehyde 3-phosphate, which is in turn further metabolized to pyruvate by the trunk pathway enzymes.
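The halophile route just described, followed by the trunk pathway of Fig. 27, can be written down as an ordered list of steps to make the ATP bookkeeping explicit. This is only an illustrative sketch: the step names for the upper part of the pathway are descriptive labels rather than enzyme names from the text, and the use of ATP as the phosphoryl donor in the formation of KDPG is an assumption made for the purpose of the tally.

```python
# Each step: (label, substrate, product, ATP change per molecule converted).

# Upper part of the modified Entner-Doudoroff route described for halophiles.
HALOPHILE_UPPER = [
    ("oxidation", "glucose", "gluconate", 0),
    ("dehydration", "gluconate", "2-keto-3-deoxygluconate", 0),
    # Assumption: the phosphoryl group of KDPG is donated by ATP.
    ("phosphorylation", "2-keto-3-deoxygluconate", "KDPG", -1),
    ("KDPG-aldolase", "KDPG", "pyruvate + glyceraldehyde 3-P", 0),
]

# Trunk pathway as drawn in Fig. 27 (ADP -> ATP at the two kinase steps).
TRUNK = [
    ("glyceraldehyde-phosphate dehydrogenase", "glyceraldehyde 3-P", "1,3-bisP-glycerate", 0),
    ("phosphoglycerate kinase", "1,3-bisP-glycerate", "3-P-glycerate", +1),
    ("phosphoglycerate mutase", "3-P-glycerate", "2-P-glycerate", 0),
    ("enolase", "2-P-glycerate", "P-enolpyruvate", 0),
    ("pyruvate kinase", "P-enolpyruvate", "pyruvate", +1),
]


def is_connected(steps):
    """True if the product of every step is the substrate of the next one."""
    return all(prev[2] == nxt[1] for prev, nxt in zip(steps, steps[1:]))


def net_atp(steps):
    """Sum of the ATP changes over a list of steps."""
    return sum(step[3] for step in steps)


if __name__ == "__main__":
    # One glucose gives one glyceraldehyde 3-P in this route, so the trunk
    # pathway is traversed once per glucose.
    print("trunk pathway connected:", is_connected(TRUNK))
    print("net ATP per glucose:", net_atp(HALOPHILE_UPPER) + net_atp(TRUNK))
```

Under the stated assumption the tally comes to one ATP consumed and two generated per glucose, because only one of the two three-carbon fragments passes through the trunk pathway.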
The halophiles thus apparently do not possess hexokinase, glucosephosphate isomerase, phosphofructokinase or fructose 1,6-bisphosphate aldolase, although they do have the remaining glycolytic enzymes. The thermophiles are unique in that their pathway of glucose catabolism proceeds mostly with non-phosphorylated intermediates. From the metabolic point of view, the thermoacidophiles and the sulphur-dependent thermophiles appear to be very similar, and can be considered as a single group. These organisms have the same modified Entner-Doudoroff pathway as the halophiles, but here the 2-keto-3-deoxygluconate is cleaved into pyruvate and glyceraldehyde without any phosphorylation step. The glyceraldehyde is converted into glycerate which is in turn phosphorylated to 2-phosphoglycerate by glycerate kinase. Enolase and pyruvate kinase complete the pathway. These two enzymes are the only glycolytic enzymes present in the thermophiles. If archaebacteria do indeed retain features of the earliest forms of life, then it is likely that glycolysis did not begin by a stepwise assembly of the individual glycolytic enzymes to form the pathway that exists in most organisms today. One possible scenario is that a very early form of glucose catabolism corresponded to a mainly non-phosphorylated Entner-Doudoroff sequence that yielded only a single molecule of ATP for every glucose degraded. More phosphorylated steps were then added such that further molecules of ATP could be generated. Meantime gluconeogenic processes were exploiting the reversal of many of the same steps in order to achieve the synthesis of glucose. Ultimately the evolution of phosphofructokinase enabled the glycolytic sequence to take place. This highly speculative scheme would envisage that pyruvate kinase and enolase are the most ancient glycolytic enzymes, followed by phosphoglycerate mutase, phosphoglycerate kinase and glyceraldehyde-phosphate dehydrogenase.

V. ISOENZYMES

For most glycolytic enzymes two or more isoenzymes can be found within one organism. The isoenzymes can be encoded by separate genes or by one gene. In the latter case the different enzymes are the result of post-transcriptional or post-translational processes. Different isoenzymes can be expressed at different stages of an organism's development, under different hormonal or growth conditions or in different tissues. Such isoenzymes are often characterized by specific kinetic properties, corresponding to the cells' metabolic requirements in each situation. However, isoenzymes may also coexist within the same cell.

This can be the case when similar (but sometimes opposite) reactions occur simultaneously in the cell as part of different metabolic pathways. For example, similar reactions are catalyzed in glycolysis, gluconeogenesis, the pentose phosphate pathway and the Calvin cycle. The cell can exclude interference between the pathways by specific regulation of key enzymes in each pathway (see also Section X.1), or by spatial separation of the different pathways. In the latter case isoenzymes are found in different cell compartments. It should be mentioned here that several cases in which the occurrence of glycolytic isoenzymes has been described, particularly in the older literature, should probably be ascribed to different explanations. For example, multiple forms could result from proteolytic processes during isolation procedures, or could represent slightly different polypeptides encoded by two alleles of one gene locus.

1. Specific Expression of Isoenzymes

(a) In development

Housekeeping proteins, such as glycolytic enzymes, are generally required throughout the cell cycle and during all stages of development. However, to adjust the metabolism to the cells' requirements under the different conditions, the level of expression is usually varied. Many reports deal with this regulation of expression of glycolytic enzymes. A few of them describe the differential expression of isoenzymes during development. For example, the expression of the mammalian muscle isoenzyme of phosphoglycerate mutase is specifically increased during embryological and postnatal stages (Grisolia et al., 1970). Further, for pyruvate kinase a significant shift in the expression of the isoenzymes occurs during vertebrate embryogenesis: the M2 type is the only form detected in early foetal tissues, while in later stages all four isoenzymes are expressed in a tissue-specific manner (Imamura and Tanaka, 1982).
In Caenorhabditis elegans two isoenzymes for glyceraldehyde-phosphate dehydrogenase have been detected. The subunits of these isoenzymes are encoded by four non-allelic genes (gpd 1-4). The polypeptides encoded by gpd 1 and 4 form the minor enzymes, present in all cells of this nematode, while the polypeptides encoded by the other genes constitute the major isoenzyme that is mainly found in the body wall muscle. gpd 1 and 4 are preferentially expressed in embryos, whereas the expression of the tandemly-arranged gpd 2 and 3 genes increases during postembryonic development (Huang et al., 1989). The developmental expression of glycolytic isoenzymes can also be observed in parasitic protists like T. brucei, which undergo a cyclic transmission between different hosts. Trypanosomes contain two phosphoglycerate kinase isoenzymes; one is present in the glycosome, the other in the cytosol (see also below). When the parasite lives in the bloodstream of the mammalian host, only the glycosomal enzyme is detectable, whereas the cytosolic form is exclusively present when the trypanosome is in the insect vector, the tsetse fly (Misset and Opperdoes, 1987; Swinkels et al., 1992). This regulation occurs at the posttranscriptional level (Gibson et al., 1988). Growth conditions can also have dramatic effects on the differential expression of glycolytic isoenzymes. In mammals the dietary state has both rapid and long-term effects on the fluxes through the glycolytic and gluconeogenic pathways via the action of hormones (for a review, see Pilkis et al., 1988). The long-term control of glycolysis is mainly exerted by glucagon and insulin, which, via cAMP, mediate alterations in the mRNA levels of key enzymes like pyruvate kinase and glucokinase. High carbohydrate diet and insulin stimulate the preferential expression of L-type pyruvate kinase and glucokinase in the liver.
In plants various growth conditions can cause a differential expression of certain glyceraldehyde-phosphate dehydrogenase isoenzymes. When tobacco plants were transferred from dark to light, transcription of all glyceraldehyde-phosphate dehydrogenase genes was stimulated. However, the steady-state mRNA levels transcribed from the nuclear genes encoding the two polypeptides of the chloroplast enzyme increased at least 30 to 50-fold, whereas the level of the transcript from the gene for the cytosolic isoenzyme increased only 10-fold (Shih and Goodman, 1988). Martinez et al. (1989) have shown that maize contains three functional genes for cytosolic glyceraldehyde-phosphate dehydrogenase which are

differentially regulated. The total level of mRNA for the cytosolic isoenzyme increased during anaerobiosis, although the transcription of at least one gene was decreased, as was the transcription of both genes encoding the subunits of the chloroplast isoenzyme. These observations correspond with the report by Sachs et al. (1980) that the glycolytic capacity of higher plants increases when they are exposed to low oxygen pressure. Also in micro-organisms a significant differential regulation of glycolytic isoenzymes can be observed in response to the carbon source. In yeast several reactions of the glycolytic pathway can be catalyzed by two or more isoenzymes. For one set of isoenzymes, enolase, the ratio changes drastically depending on the carbon source. Whereas the presence of enolase I is characteristic for growth under non-fermentative conditions, enolase II could only be detected when glucose is the growth substrate (Entian et al., 1984, 1987). In contrast, the ratios between the three glyceraldehyde-phosphate dehydrogenases and the three sugar phosphorylating enzymes (hexokinase A and B and glucokinase) do not change (McAlister and Holland, 1985; Moore et al., 1991). These enzymes seem to be constitutively expressed, despite the fact that they are not functionally equivalent and each is required under particular growth conditions. The different growth characteristics of mutants affected in each of the glyceraldehyde-phosphate dehydrogenases support the notion that the different isoenzymes have different functions (McAlister and Holland, 1985). Furthermore, one of the hexokinase isoenzymes in yeast is involved in preventing interference between different pathways of carbon metabolism. Hexokinase B is capable of exerting catabolite repression, whereas hexokinase A and glucokinase are not.
In the presence of its substrates glucose, fructose or mannose the enzyme represses the synthesis of the enzymes involved in gluconeogenesis, the Krebs cycle, the glyoxylate cycle and the catabolism of exogenously supplied sugars such as maltose, saccharose and galactose (reviewed by Entian, 1986). The precise mechanism by which hexokinase B performs this regulatory function has not yet been established (Entian and Fröhlich, 1984; Ma et al., 1989). Glycolytic isoenzymes can also be found in bacteria. For example, in E. coli, which has been studied most intensively, pairs of proteins catalyzing similar reactions have been found for phosphofructokinase, aldolase, glyceraldehyde-phosphate dehydrogenase and pyruvate kinase. The two enzymes of each pair are very different. The two phosphofructokinases are clearly not homologous; the role of the minor enzyme is still uncertain (Hellinga and Evans, 1985). One of the glyceraldehyde-phosphate dehydrogenase genes was acquired by horizontal transfer from a eukaryote, the other one has the typical prokaryotic features (Alefounder and Perham, 1989). The role of each of these enzymes in the metabolism and their level of expression remain to be determined. The two aldolase enzymes of E. coli belong to the two non-homologous classes. The class I enzyme is only synthesized during growth on gluconeogenic substrates, whereas the class II protein seems to be constitutively expressed, although in variable amount (Stribling and Perham, 1973; Baldwin and Perham, 1978). Also the two pyruvate kinase isoenzymes of E. coli are differentially expressed: the type I enzyme, which is allosterically activated by fructose 1,6-bisphosphate, is inducible by glucose, whereas the AMP-activated pyruvate kinase is constitutively present (Malcovati and Kornberg, 1969).

(b) By tissues and organelles

Glycolytic enzymes are often expressed in a tissue-dependent manner.
For example, the different types of mammalian hexokinase, each with distinct kinetic properties, are not equally distributed over the body. Brain and kidney express mainly type I, skeletal muscle type II, heart and intestine both types I and II. In kidney, spleen and intestine type III is also present, whereas all four types are expressed in the liver (Katzen and Schimke, 1965; Reyes and Cardenas, 1984). Type IV, glucokinase, is only expressed in liver and pancreatic β cells (Ureta, 1982). For mammalian phosphofructokinase three different subunits have been described: the muscle (M), liver (L) and platelet (P) type. These subunits, which differ in both size and physico-chemical properties, are expressed in various amounts in different tissues. For example, in rat liver the ratio L:M:P is 75:21:4 (Dunaway and Kasten, 1986, 1987). Because


the subunits can associate randomly, each tissue contains not only homotetrameric enzymes, but also various types of heterotetramers. These different assemblies of subunits result in complex isoenzymic populations with a wide variety of kinetic properties (Dunaway, 1983). The three aldolase class I isoenzymes A, B and C in vertebrates have also a tissue-specific distribution (Horecker et al., 1972). Type A enzyme, which is the most efficient in glycolysis, is the major form present in embryonic tissues and muscle. Type B, which is more adapted to participate in gluconeogenesis, is only detectable in liver and kidney, where it is the predominant form. Aldolase C, with intermediate catalytic properties, is found in brain. In tissues where more than one aldolase isoenzyme is expressed, hybrid forms composed of different subunits can also be found to be present. It has been inferred from sequence comparisons that the three aldolase isoenzymes evolved from a single protein during vertebrate evolution. A first divergence, very early in vertebrate history, resulted in the separate development of the B-type isoenzyme, whereas much later the isoenzymes A and C evolved from a common ancestral protein (Fig. 28(a)) (Tolan et al., 1987; Kukita et al., 1988). The chromosomal location of the genes encoding the three human aldolase isoenzymes, as well as the location of an aldolase pseudogene, have been determined (Lebo et al., 1985; Tolan et al., 1987). These four genes are located on two pairs of chromosomes (9 and 10, 16 and 17) which may themselves be homologous. Based on measurements of DNA content and chromosome numbers, Ohno (1970, 1973) postulated that two tetraploidization events occurred during vertebrate evolution. A first tetraploidization may have happened in primitive chordates, with a second event in the ancestral fish before vertebrates left the sea to live on land. 
This hypothesis is supported by karyotype analysis (Comings, 1972; Sawyer and Hozier, 1986) and gene mapping studies (reviewed by Tolan et al., 1987). The location of the genes encoding the four aldolase isoenzymes on chromosomes which may be homologous, suggests that the isoenzymes evolved as a result of two chromosome duplications, followed by divergence of the gene copies (Fig. 28(b)). In humans and mice two different, but functionally similar isoenzymes for phosphoglycerate kinase have been detected. One isoenzyme occurs in all somatic cells and in premeiotic germ cells. The other isoenzyme is only found in sperm cells. The gene for the major isoenzyme (pgk-1) is X-linked. Expression of this gene coincides with overall activity of the X-chromosome. Its transcription is thus constitutive, regardless of the cell type, when the chromosome is active. When spermatogenic cells enter meiosis the X-chromosome is inactivated and the second phosphoglycerate gene (pgk-2), which is autosomal, is expressed (Kramer and Erickson, 1981 ). It has been shown that the pgk-2 gene which does not contain any introns in contrast to pgk-1, must have evolved from the pgk-1 gene by retroposition (Tani et al., 1985; Boer et al., 1987; McCarrey and Thomas, 1987). Phylogenetic analysis suggests that this must have happened early in mammalian evolution. For the cofactor-dependent phosphoglycerate mutase three distinct but homologous forms can be found in all vertebrates. These isoenzymes are distributed in a tissue-specific manner and have been classified as muscle-, brain- and erythrocyte types. The molecular mass of the isoenzymes is very similar, but differences can be observed in some kinetic properties (reviewed by Fothergill-Gilmore and Watson, 1989). 
The sequence and kinetic properties of the muscle and brain isoenzymes are more similar to each other than to the erythrocyte enzyme, indicating that the gene duplication event giving rise to the erythrocyte form predates the divergence of the muscle and brain isoenzymes (but see Section VIII). In some tissues more than one gene is active, resulting in multiple isoenzymes composed of homo- and heterodimers (Pons et al., 1985). For enolase, too, three different but homologous isoenzymes occur in mammalian tissues: α, β and γ (Rider and Taylor, 1975a, b). The active enzyme is usually a homodimer, but heterodimers have been observed as well (Shimizu et al., 1983). The α form is present in many tissues, β predominates in muscle and γ is only found in neurons and neuroendocrine tissue. Four different tissue-specific isoenzymes for pyruvate kinase, each with distinct kinetic properties, are found in vertebrates (Muirhead, 1987). The M1 type is the major form in tissues where glycolysis predominates such as skeletal muscle, heart and brain. This enzyme shows hyperbolic Michaelis-Menten kinetics. The L-type is the most abundant form in liver


[Figure 28 not reproduced. Panel (a): unrooted tree of aldolase sequences; legible labels include maize, rice, D. melanogaster, T. brucei, human A, human B, chicken B, mouse A, rat A and rabbit A. Panel (b): chromosome tree running from primitive chordates through fish, showing chromosomes 9, 10, 16 and 17 against a time scale in Myr × 10^-2.]

FIG. 28. (a) Unrooted tree showing the evolutionary relationship between fructose 1,6-bisphosphate aldolases from various organisms. The phylogenetic relationships were estimated from the aligned amino acid sequences by a distance matrix method (Fitch and Margoliash, 1967) as described by Michels et al. (1991). (b) Evolutionary tree illustrating the proposed evolution of chromosomes containing the aldolase loci. The numbers refer to the chromosomal location of the genes for the human aldolase isoenzymes; ψ stands for an aldolase pseudogene located on chromosome 10. The arrows point to tetraploidization events as described by Ohno (1973). The scale is in hundreds of millions of years. Modified from Tolan et al. (1987).


where gluconeogenesis plays an important role. Its activity is allosterically regulated by a number of effectors and shows sigmoidal kinetics with respect to the substrate phosphoenolpyruvate. Moreover, it is regulated by reversible protein kinase-mediated phosphorylation. The other isoenzymes M2 (the major form in all early foetal and most adult tissues) and R (only expressed in erythrocytes) are also allosterically controlled. The occurrence of glycolytic isoenzymes in different cell compartments has been described for two groups of organisms: for photosynthetic organisms and for protists belonging to the Order Kinetoplastida. Although in plants and other eukaryotic phototrophic organisms like algae and euglenoid flagellates glycolysis is essentially a cytosolic process, glycolytic isoenzymes are also found in plastids. In chloroplasts the enzymes phosphoglycerate kinase, NADPH-dependent glyceraldehyde-phosphate dehydrogenase, triosephosphate isomerase and fructose 1,6-bisphosphate aldolase participate in the Calvin cycle. However, each of the other glycolytic enzymes may be detected in chloroplasts or non-photosynthetic plastids from various sources. These enzymes are involved in the supply of carbon precursors and cofactors for the various biosynthetic activities of plastids (for a review, see Dennis and Miernyk, 1982). The cytosolic and plastid isoenzymes can be nearly identical, except for a difference in charge, or there may be profound physico-chemical and kinetic differences between the isoenzymes. The plastid enzymes are usually also encoded in the nucleus, and after synthesis on ribosomes in the cytosol, are imported into the organelles with concomitant cleavage of the N-terminal signal sequence (Cerff and Kloppstech, 1982). Some nuclear encoded glycolytic enzymes of plastids appear to have typical prokaryotic features (see also Section III.1(f)). It is assumed that these enzymes originated from the prokaryotic symbiont that gave rise to the plastid.
The genes of the organelle's ancestor must have been transferred to the nucleus of the primitive eukaryotic host (Quigley et al., 1988). In Kinetoplastida, the Order to which trypanosomes belong, the mainstream of glycolysis occurs in an organelle. The first seven enzymes of the pathway, from hexokinase to phosphoglycerate kinase, are sequestered in microbody-like organelles called glycosomes, whereas the last three, from phosphoglycerate mutase to pyruvate kinase, are exclusively found in the soluble part of the cell (Opperdoes and Borst, 1977). All these enzymes are nuclear encoded; microbodies do not contain any DNA. The glycosomal proteins are synthesized in the cytosol and post-translationally imported into the organelle without detectable processing. Although the evolutionary origin of the glycosomes remains a matter of conjecture (Michels and Opperdoes, 1991), the apparent advantage of this compact sequestration of glycolytic enzymes is the extremely high rate at which glucose can be metabolized; higher than in any other cell type studied to date (see also Section X.3). Two of the glycolytic enzymes in trypanosomes have a dual localization. For glyceraldehyde-phosphate dehydrogenase and phosphoglycerate kinase glycosomal and cytosolic isoenzymes have been described (Misset and Opperdoes, 1987; Misset et al., 1987). Both phosphoglycerate kinase isoenzymes participate in the mainstream of the glycolytic pathway; the glycosomal enzyme when the trypanosome lives in the bloodstream of the mammalian host and the cytosolic enzyme when the parasite is present in the tsetse fly (Misset et al., 1987; see also Section V.1(a)). The glycosomal glyceraldehyde-phosphate dehydrogenase also participates in glycolysis. However, the role of its cytosolic counterpart is not yet clear. It may have a regulatory function rather than an involvement in glycolysis. The evolutionary history of the two pairs of isoenzymes in the trypanosomes is very different.
The two phosphoglycerate kinases are kinetically and structurally very similar, differing only in mass and charge. This similarity and the fact that their genes are tandemly linked indicate that they are derived from a common ancestral gene by duplication. The two glyceraldehyde-phosphate dehydrogenases are, however, very different and encoded by genes located on different chromosomes (Gottesdiener et al., 1990; Michels et al., 1991). It seems likely that the gene for the cytosolic isoenzyme was acquired by the trypanosome's ancestor via horizontal transfer (Michels et al., 1991; Wiemer, E. and Michels, P., unpublished results).

2. Rates of Evolution of Different Isoenzymes

The isoenzyme forms of glycolytic enzymes have been extensively studied, and much is


known of their different properties and differential expression as discussed in the preceding section. Is there enough sequence information to be able to make any generalizations concerning the relative rates of evolution of the different isoenzymes? A perusal of the list of enzymes in Table 1 reveals that in some cases the sequences of tissue-specific isoenzymes have been determined from two different mammals. This is true of phosphofructokinase, aldolase, phosphoglycerate kinase, enolase and pyruvate kinase. These data are summarized in Table 6, and enable some observations about evolution rates to be made. However, it must be stressed that the data are obviously sparse, and so any generalizations made now may turn out to be incorrect in the light of additional sequence information.

TABLE 6. EVOLUTION OF MAMMALIAN ISOENZYMES

                               Isoenzyme type
Enzyme                Muscle   Liver   Brain   Testis
PFK (human/rabbit)      96       -       -       -
PFK (human/mouse)        -      93       -       -
ALD (human/rat)         97      95      96       -
PGK (human/mouse)       98       -       -      86
ENO (human/rat)         96      94      98       -
PYK (human/rat)         97      92       -       -

The relationships are expressed as percent identities between the pairs of sequences indicated. The symbol (-) means that the relevant sequences have not been determined. Abbreviations: PFK, phosphofructokinase; ALD, aldolase; PGK, phosphoglycerate kinase; ENO, enolase; PYK, pyruvate kinase.
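The trend discussed for Table 6 can be checked with a short calculation. The following Python sketch averages the percent identities per isoenzyme type; it assumes that the flattened columns of the table are read in row order (so that, for example, the value 86 belongs to the testis-specific phosphoglycerate kinase), and the names `TABLE_6` and `tissue_means` are illustrative only.

```python
from statistics import mean

# Percent identities transcribed from Table 6 (None = sequence not determined).
# Assumption: column values are assigned to rows in the order they are listed.
TABLE_6 = {
    ("PFK", "human/rabbit"): {"muscle": 96, "liver": None, "brain": None, "testis": None},
    ("PFK", "human/mouse"): {"muscle": None, "liver": 93, "brain": None, "testis": None},
    ("ALD", "human/rat"): {"muscle": 97, "liver": 95, "brain": 96, "testis": None},
    ("PGK", "human/mouse"): {"muscle": 98, "liver": None, "brain": None, "testis": 86},
    ("ENO", "human/rat"): {"muscle": 96, "liver": 94, "brain": 98, "testis": None},
    ("PYK", "human/rat"): {"muscle": 97, "liver": 92, "brain": None, "testis": None},
}


def tissue_means(table):
    """Return the mean percent identity for each isoenzyme type, skipping gaps."""
    by_tissue = {}
    for row in table.values():
        for tissue, value in row.items():
            if value is not None:
                by_tissue.setdefault(tissue, []).append(value)
    return {tissue: mean(values) for tissue, values in by_tissue.items()}


if __name__ == "__main__":
    for tissue, value in tissue_means(TABLE_6).items():
        print(f"{tissue:7s} {value:.1f}")
```

With so few entries, and with different species pairs in some rows, these averages are only a rough summary and carry no statistical weight.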

A consideration of the muscle and liver isoenzymes shows a consistent trend of greater conservation of sequence among the muscle isoenzymes. The brain isoenzymes also appear to be highly conserved. The testis-specific phosphoglycerate kinase is strikingly less well conserved. A possible explanation for the greater conservation of muscle and brain isoenzymes may be found in the observations that many of these isoenzymes interact strongly with actin-containing thin filaments and cytoskeletal elements (Arnold and Pette, 1970; Brady and Lasek, 1981; Masters, 1984; reviewed by Clarke et al., 1985; Hofer et al., 1987). Thus these enzymes may be highly constrained not only to maintain catalytic function, but also to retain binding sites for actin and other structural proteins. In contrast, these types of interaction with structural proteins have not yet been demonstrated for liver isoenzymes. Moreover, liver is a tissue which can change relatively rapidly in response to changes in environment. It is perhaps not surprising that liver isoenzymes evolve more quickly, both because they are responding to a range of effector molecules, and also because they have fewer other constraints. Pyruvate kinase is an archetypical liver isoenzyme in this respect. Its ability to respond to effector molecules avoids futile cycling between glycolysis and gluconeogenesis. It is probably unwise to generalize on the basis of the two available testis-specific isoenzymes of phosphoglycerate kinase. These isoenzymes represent a rather special case because of their unusual evolutionary origin by retroposition (McCarrey and Thomas, 1987; see previous section), a process that normally gives rise to non-functional pseudogenes. It is tempting to speculate that their relatively high rate of evolution may be a consequence of the fact that their genes are without introns.
Perhaps these genes are in some way subject to a higher than normal mutation rate, possibly during the processes of mitosis and meiosis.

VI. GENE STRUCTURE

The previous sections of this review on the evolution of glycolysis have emphasized the enzymes themselves. Their structures and activities have been described and compared as an approach to understanding their evolution, and to unravelling the evolution of the pathway as a whole. The present section will now change emphasis to consider the genes encoding the enzymes. How have structures of the genes changed throughout evolution? Is there conservation of overall size, and of the sizes and positions of introns and exons? Is it possible to understand the mechanisms of developmental- and tissue-specific expression of certain


isoenzymes? The answers to these questions will be explored in the remaining three parts of this section.

1. Gene Duplication and Gene Fusion

The single most important factor in enzyme evolution is probably gene (or exon) duplication. Once a gene has duplicated, then one copy is released from functional constraints and can subsequently acquire a new repertoire of properties by mutations in its DNA sequence. The process of gene duplication has occurred frequently in evolution, and has given rise to the great numbers of isoenzymes found in the glycolytic pathway (see Section V) and elsewhere. The molecular basis of gene duplication is not well understood, but probably involves unequal crossovers between sister chromosomes during meiosis. Almost always the observed duplications are of complete genes or complete exons, with the crossovers having occurred in non-coding DNA sequences such as introns or intergenic regions. Presumably partial duplications are usually without function and thus do not persist. Occasionally, duplicated genes or exons fuse to give rise to proteins with new properties. In the glycolytic pathway, hexokinase and phosphofructokinase have evolved in this way. Pyruvate kinase provides examples of genes with duplicated exons. In a quite distinct category, some genes have arisen from the fusion of two or more non-homologous genes. Thus, ancestral kinase and phosphoglycerate mutase genes have fused to give rise to the bifunctional enzyme which synthesizes and degrades fructose 2,6-bisphosphate (see Section XI). A consideration of the subunit sizes of hexokinase shows a general progression from Mr 25,000 in bacteria to Mr 50,000 in fungi, invertebrates and plants to Mr 100,000 in vertebrates (reviewed by Ureta et al., 1987). It appears likely that this enzyme has evolved by two duplication and fusion events. Compelling evidence for the more recent gene doubling comes from an examination of the currently available sequences from mammals and yeast.
It is quite unambiguous that the yeast enzyme of 486 residues is homologous to both the N- and C-terminal halves of the mammalian enzymes (30-36% identity, see Fig. 2 and Matrix 1). Moreover, the two halves of the mammalian enzymes are quite similar to each other, with about half of the residues identical. Searches for sequence similarities to support the occurrence of the earlier duplication-fusion event have proved to be unfruitful (Ureta et al., 1987), presumably because the doubling happened so long ago that any sequence similarities have become obscured by the accumulation of subsequent alterations. A further observation from the overall sequence comparisons in Matrix 2 is that yeast hexokinase is slightly more similar to the C-terminal halves of the mammalian enzymes than it is to the N-terminal halves. This is entirely consistent with the results of affinity labelling experiments by J. E. Wilson and his colleagues (Nemat-Gorgani and Wilson, 1986; Schirch and Wilson, 1987) which show that the C-terminal half of rat brain hexokinase contains the catalytic site. The N-terminal half of the enzyme has similarly been shown to have the allosteric site for glucose 6-phosphate (White and Wilson, 1987). Taken together these observations demonstrate that the C-terminal half of hexokinase has been constrained to retain the catalytic function, whereas the region corresponding to the active site in the N-terminal half has been released to evolve into an effector site. Thus the vertebrate hexokinases have evolved to acquire an extended repertoire of control properties by a process of gene duplication and fusion. A comparable situation pertains to phosphofructokinase. The enzyme from bacteria is about half the size of the enzyme from lower and higher eukaryotes, including fungi, nematodes and mammals. It is likely in this case that the duplication and fusion events took place sometime early in evolution, not long after the divergence of prokaryotes and eukaryotes. 
The more distant time for the doubling is consistent with the fact that the two halves of the mammalian phosphofructokinase are much less similar to each other (2?-28% identity, see Matrix 3) than are the two halves of mammalian hexokinases (50-52% identity). The activity of phosphofructokinase is regulated by a number of different molecules, and it is striking that the bacterial enzyme responds to a much smaller range of effector molecules


than does the enzyme from mammalian tissues, especially liver. Bacterial phosphofructokinase has one type of regulatory site and the enzyme can be activated by ADP or GDP, or inactivated by phosphoenolpyruvate. Phosphofructokinase in mammalian liver is sensitive to a number of additional effectors, especially fructose 2,6-bisphosphate as an activator, and citrate as an inhibitor. The mammalian enzyme thus appears to have more than one type of effector site, and this can be understood in terms of the structures of the prokaryotic and eukaryotic phosphofructokinases as indicated in Fig. 29. The crystal structure of bacterial phosphofructokinase shows that each subunit has two domains with the active site located in a depression between the domains (Fig. 7). The effector sites are shared between two adjacent subunits in the tetramer, and are identical (see Section X.1 (b) for a discussion of the evolution of the control of phosphofructokinase activity).
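The comparison between the two halves of a doubled enzyme rests on percent identity over an alignment (50-52% for the halves of mammalian hexokinase, and a markedly lower value for the halves of mammalian phosphofructokinase). A minimal Python sketch of that measure is given below. It assumes the two halves have already been aligned, counts only columns without gaps, and uses made-up toy sequences rather than real enzyme sequences; other conventions for the denominator exist and give slightly different values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    # Hypothetical aligned fragments standing in for the N- and C-terminal halves.
    n_half = "MKTAYIAKQR"
    c_half = "MKSAYLAKQ-"
    print(f"{percent_identity(n_half, c_half):.1f}% identical")
```

On this measure, a lower identity between the two halves is read as an older duplication, which is the reasoning applied to phosphofructokinase in the text.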

[Figure 29 not reproduced. The diagram shows a subunit of bacterial PFK and a subunit of mammalian PFK, with the N- and C-termini and the effector sites (ES) labelled.]

FIG. 29. Evolution of phosphofructokinase by gene duplication and fusion. The diagrams represent the domain structure of the enzyme with the active site (AS) located in a pocket between domains. Phosphofructokinase is active as a tetramer in both bacteria and mammals, although for simplicity only a single subunit is represented. The effector site (ES) in the bacterial enzyme is shared between two adjacent subunits in an arrangement similar to that proposed for the subunit of the mammalian enzyme. The effector site shown at the bottom of the mammalian subunit is homologous to the active site.

As in the case of hexokinase, a main consequence of the gene duplication-fusion process was to release one of the duplicated active sites to develop into a novel class of effector site. In phosphofructokinase, the active site has been retained in the N-terminal half, with the new type of effector site in the C-terminal half. Evidence for this statement comes from a consideration of the amino acid sequences of the mammalian enzymes where it can be seen that the residue corresponding to the catalytic aspartate-126 has changed into a serine (see Fig. 6). Chemical modification studies indicate that citrate binds to a completely different effector site that corresponds to the putative ancestral effector site which is now present in the bacterial enzyme (Kemp et al., 1987). In Section V.1(b) we discussed how mammalian cells had acquired a second gene for phosphoglycerate kinase, only expressed in the testis, by retroposition. Such retroposition also happened with the mammalian gene for glyceraldehyde-phosphate dehydrogenase. In this latter case, however, many more new gene copies were created, most of which, if not all, seem to be pseudogenes (Piechaczyk et al., 1984; Hanauer and Mandel, 1984; Benham et al., 1984; Fort et al., 1985; Tso et al., 1985a). Some of these pseudogenes are transcribed, presumably because they have been inserted near a functional but unrelated promoter. The identification of these retroposed genes as pseudogenes, not encoding active enzymes, is based on sequence determination of some copies and on genetic and transcription analyses which have not provided evidence for the occurrence of multiple functional genes for glyceraldehyde-phosphate dehydrogenase in mammals (Bruns and Gerald, 1976; Tso et al.,


1985a). Large variations were observed in the number of glyceraldehyde-phosphate dehydrogenase gene copies in different species: a unique representation in chicken, a relatively low reiteration (10-30 copies) in man, hare, guinea-pig and hamster and a high reiteration (at least 200 copies) in mouse and rat (Piechaczyk et al., 1984). It was, therefore, hypothesized that the multiplicity must have been generated in two steps: first the creation of several pseudogenes by retroposition early in mammalian evolution, followed by various rounds of amplification of some of those pseudogenes in the lineage leading to mouse and rat (Piechaczyk et al., 1984; Hanauer and Mandel, 1984). However, so far no evidence has been found for the amplification, because no tandemly arranged pseudogenes could be detected in the genomic clones analyzed (Tso et al., 1985a).

2. Introns and Exons

The surprising discovery in 1977 that the mouse β-globin gene is divided into alternating coding and non-coding regions (Tilghman et al., 1978) has excited considerable discussion about the function of this type of gene structure. The fact that the non-coding regions (introns) are not present in bacteria and are rare in yeast raises intriguing questions concerning their evolution. Did introns arise very early in the evolution of living organisms, only to be lost from bacterial and yeast genomes as they have become more streamlined for rapid growth? Or are introns more recent entities which have been inserted into eukaryotic genomes? Some of the best evidence that introns are primitive features which have probably been present since the very early stages of evolution comes from examination of the structures of glycolytic genes. For example, the genes encoding maize and chicken triosephosphate isomerase (Marchionni and Gilbert, 1986) are shown aligned in Fig. 30. The seven exons of the chicken gene correspond very closely with the nine plant exons. It can be seen that the two terminal introns in the chicken gene have been lost, and there is a small difference in position at the boundary of one of the introns. The close correspondence of the two genes is excellent evidence that the two genes have evolved from a common ancestral gene containing at least nine exons. This observation shows that the presence of introns predates the divergence of plants and animals which occurred about 1000 million years ago.

FIG. 30. Comparison of exon positions in triosephosphate isomerase. The exons are indicated by numbered boxes; the two introns which are present in the plant gene but not in the animal gene are shown by asterisks. The sizes of the exons and introns are not drawn to scale. Abbreviation: TIM, triosephosphate isomerase.
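The kind of comparison summarized in Fig. 30 can be expressed as a matching of intron positions between two genes. The Python sketch below pairs introns that fall within a small tolerance of one another and reports those found in only one gene. The codon positions are invented for illustration and are not the published coordinates of the maize or chicken genes; only the counts (eight introns against six, with one slightly shifted boundary) follow the description in the text, and the placement of the two unmatched introns is arbitrary.

```python
def compare_introns(reference, other, tolerance=3):
    """Pair intron positions (in codons) that lie within `tolerance` of each other.

    Returns (shared_pairs, reference_only, other_only).
    """
    unmatched = list(other)
    shared, reference_only = [], []
    for position in reference:
        hit = next((o for o in unmatched if abs(o - position) <= tolerance), None)
        if hit is None:
            reference_only.append(position)
        else:
            shared.append((position, hit))
            unmatched.remove(hit)
    return shared, reference_only, unmatched


# Hypothetical intron positions, chosen only to mirror the counts in the text.
MAIZE_INTRONS = [14, 38, 79, 107, 152, 181, 210, 239]   # 8 introns, 9 exons
CHICKEN_INTRONS = [38, 79, 107, 152, 183, 210]          # 6 introns, 7 exons

if __name__ == "__main__":
    shared, maize_only, chicken_only = compare_introns(MAIZE_INTRONS, CHICKEN_INTRONS)
    print("shared:", shared)
    print("maize only:", maize_only)
    print("chicken only:", chicken_only)
```

A high proportion of shared positions, as in this toy case, is the pattern taken in the text as evidence for a common ancestral gene that already contained introns.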

Evidence that introns are even older, and predate the divergence of prokaryotes and eukaryotes comes from comparisons of glyceraldehyde-phosphate dehydrogenase genes from plants, birds, nematodes and bacteria (Quigley et al., 1988). Two forms of glyceraldehyde-phosphate dehydrogenase are found in maize: a cytosolic form that is active in glycolysis, and a chloroplast form that is involved in photosynthesis. The gene encoding the chloroplast enzyme is located in the nucleus, and has been isolated and sequenced (Quigley et al., 1988). The gene has three introns, two of which are located within the sequence specifying the transit peptide that is required for targeting of the protein to the chloroplast. The third intron is located at codon 166, and is in the boundary between the coenzyme and catalytic domains. This intron is in precisely the same position as intron 1 of glyceraldehyde-phosphate dehydrogenase from the nematode Caenorhabditis elegans (Yarbrough et al., 1987). A comparison of the sequences of the two maize enzymes with other

Evolution of glycolysis

191

glyceraldehyde-phosphate dehydrogenase sequences shows that the cytosolic enzyme clearly is grouped with other eukaryotic sequences, whereas the chloroplast enzyme is more similar to prokaryotic sequences (see Fig. 13). For example, the maize cytosolic enzyme is 67% identical to the nematode enzyme, whereas the chloroplast enzyme is only 46% identical. It is therefore likely that the chloroplast gene is of prokaryotic origin, and its present nuclear location reflects a gene transfer from the endosymbiotic plastid ancestor into the nucleus. It thus appears that the common ancestor of the nematode and maize chloroplast enzymes existed at the time of prokaryote-eukaryote divergence. It seems likely that this ancestral gene had at least one intron. A consideration of the structure of the gene encoding rabbit phosphofructokinase is also of interest from the point of view of the evolution of introns and exons. This gene is very large indeed, and is split into 22 exons distributed over 17 kilobases of DNA sequence (Lee et al., 1987). The gene encodes 780 amino acid residues which are divided into two clearly homologous halves which arose by gene duplication and fusion as discussed in the preceding section. However, the organization of the DNA encoding the two halves is strikingly different (Fig. 31). The N-terminal half of the enzyme is encoded by 12 exons spread over 13 kilobases, whereas the C-terminal half is encoded by 10 exons included within only 4 kilobases. The introns thus vary in length considerably from about 50 bases to about 3000 bases. The sizes of the exons are much less variable, and they usually contain compact secondary structural units (see next section). A more detailed examination of the intron/exon boundaries, however, reveals that there is little conservation of these boundaries. This situation is in marked contrast to those observed for the triosephosphate isomerase and glyceraldehyde-phosphate dehydrogenase genes.
It would seem that the two halves of the phosphofructokinase gene have experienced very different evolutionary constraints. The reasons for this seem completely elusive at this stage.
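The contrast between the two halves of the rabbit phosphofructokinase gene can be made concrete with a little arithmetic on the figures quoted above (780 residues, 22 exons, 17 kilobases, split 12 exons in 13 kilobases and 10 exons in 4 kilobases). The Python sketch below makes two simplifying assumptions that are not in the source: the protein is divided equally between the halves (390 residues each), and untranslated regions are ignored, so that exon length is approximated by coding length.

```python
CODON_LENGTH = 3
TOTAL_RESIDUES = 780
GENE_SPAN_BP = 17_000

# Exon counts and genomic spans as quoted in the text.
HALVES = {
    "N-terminal": {"exons": 12, "span_bp": 13_000},
    "C-terminal": {"exons": 10, "span_bp": 4_000},
}


def summarize(halves, residues_per_half):
    """Estimate coding fraction and mean coding length per exon for each half."""
    coding_bp = residues_per_half * CODON_LENGTH
    summary = {}
    for name, half in halves.items():
        summary[name] = {
            "coding_bp": coding_bp,
            "coding_fraction": coding_bp / half["span_bp"],
            "mean_exon_bp": coding_bp / half["exons"],
        }
    return summary


if __name__ == "__main__":
    total_coding_bp = TOTAL_RESIDUES * CODON_LENGTH
    print(f"whole gene: {total_coding_bp} coding bp, "
          f"{100 * total_coding_bp / GENE_SPAN_BP:.1f}% of the span")
    # Assumption: each half encodes exactly half of the residues.
    for name, stats in summarize(HALVES, TOTAL_RESIDUES // 2).items():
        print(f"{name}: {100 * stats['coding_fraction']:.1f}% coding, "
              f"mean exon about {stats['mean_exon_bp']:.0f} bp")
```

Under these assumptions the exons are of similar average size in both halves, and the difference in span comes almost entirely from the introns.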

[Figure 31 not reproduced. The diagram shows the exon maps of the N-terminal half (exons 1-12) and the C-terminal half (exons 13-22).]

FIG. 31. Organization of introns and exons in the gene encoding rabbit phosphofructokinase. The thick vertical lines represent exons. The sizes of the introns and exons are drawn approximately to scale. The drawing is adapted from Lee et al. (1987).

Are there any clues to help decide whether the large-intron or the small-intron type of organization is the more ancient one? From the point of view of enzyme structure and function it is clear that the N-terminal half of phosphofructokinase is older. Thus the N-terminal half possesses a fully functional active site, whereas the C-terminal half does not. Moreover, the sequence of the N-terminal half is more similar to the bacterial sequences (36-43% identical) than is the C-terminal half (26-33% identical) (see Matrix 3(a)). Both lines of evidence suggest that the N-terminal half is more closely related to the ancestral phosphofructokinase, and it is therefore tempting to speculate that the ancestral gene contained large introns which have gradually been trimmed in the duplicated half during evolution. Presumably a small intron has essentially the same function as a large one, and requires less anabolic input. Of course this hypothesis is highly speculative at this stage, and further knowledge of the structures of other doubled genes is required. Moreover this speculation does nothing to help understand why the two halves of the duplicated gene should have apparently quite different evolutionary constraints.

3. Differential Splicing

Introns seem to have been present since the beginning of evolution as discussed in the preceding section. What is their function? Why have they been so well conserved in


eukaryotes, but lost from prokaryotes? It has been noted that exons have a restricted size range, encoding on average about 40-45 amino acid residues (Blake, 1983). Very frequently these blocks of amino acids correspond to structural units in proteins (Go, 1981). It seems very likely that proteins have arisen and evolved by the assembly and reassortment of exons. The presence of introns with their non-coding nucleotide sequences enables the exons they flank to be shuffled around without loss of function. Any mutations arising in the introns would not be manifested in the encoded protein. Another possible selective advantage of the presence of exons is that they enable a family of related proteins to be synthesized from a single gene by alternative RNA splicing. In fact, this turns out to be quite a common mechanism, and has been described for systems as diverse as antibodies (IgM), muscle proteins (myosin), cell surface proteins (fibronectin), blood coagulation factors (kininogen), neurotransmitters (substance P), and enzymes (pyruvate kinase) (reviewed by Breitbart et al., 1987). The reason why bacteria and yeasts have not maintained introns in their genomes is not entirely clear. It may be that an over-riding factor in the evolution of these organisms has been optimization for rapid growth, with the consequent selective pressure to lose all non-essential DNA. Moreover, bacteria and yeasts have alternative mechanisms for DNA rearrangements which can give rise to novel combinations of coding sequences. Pyruvate kinase provides an excellent example of the roles that differential splicing can play in the evolution of enzyme function. The enzyme can be isolated from mammalian tissues as four different isoenzymes which have different properties, reflecting the metabolic requirements of the tissues.
Three of the isoenzymes (M2 in kidney, L in liver, and R in erythrocytes) are allosterically regulated such that the pyruvate kinase activity is inhibited when the tissues are carrying out gluconeogenesis. This control is necessary to prevent energetically wasteful futile cycling between glycolysis and gluconeogenesis. The M1 isoenzyme is found in skeletal muscle (a non-gluconeogenic tissue), and in contrast to the other isoenzymes is not subject to allosteric control. It was originally expected that the four pyruvate kinase isoenzymes would be encoded by four different genes. However, it is now known that there are only two different genes: one encoding the M1 and M2 isoenzymes, and one for the L and R isoenzymes. These two genes have been isolated and sequenced (Noguchi et al., 1987; Takenaka et al., 1989). It is clear that the M1 and M2 isoenzymes arise by differential RNA splicing, whereas the L and R isoenzymes are produced by the use of different promoters. Figure 32 shows how the M gene can yield two different types of mRNA molecule by the alternative use of exons 9 and 10. These two exons are homologous, and have probably arisen from the duplication of an ancestral exon followed by sequence divergence, such that 21 of the 55 amino acids encoded by the exons 9 and 10 now differ. These are the only sequence differences between the M1 and M2 isoenzymes, which are otherwise identical over the remaining 509 residues. An indication

FIG. 32. Alternative splicing, in different tissues, of RNA transcribed from the pyruvate kinase M gene. The gene has 12 exons, of which exon 9 is specific to the M1 isoenzyme and exon 10 is specific to the M2 isoenzyme. The sizes of the exons and introns are not drawn to scale.


of how these relatively small structural variations can give such an important difference in properties comes from a consideration of the three-dimensional structure of the enzyme (see Fig. 21). The amino acids encoded by the alternative exons fold into two α-helical regions in domain C that constitute a major inter-subunit contact. Apparently, the contacts across this inter-subunit boundary determine whether or not an isoenzyme can be allosterically regulated.

A second well-documented example of isoenzymes that arise by differential splicing relates to aldolase from Drosophila melanogaster (Shaw-Lee et al., 1992). Isolation and characterization of the gene encoding Drosophila aldolase revealed that it contained three alternative versions of the fourth exon (Fig. 33). The fourth exon specifies the amino acid sequence of the C-terminal region of the enzyme, and also has non-coding downstream flanking sequences. The lengths of the non-coding regions vary among the three alternative exons, but the coding portions are nearly constant. A comparison of the deduced sequences of the exon 4 encoded residues with corresponding regions from other aldolases is shown in Fig. 34. It can be seen that the functionally important C-terminal tyrosine residue is conserved in all the sequences. In addition, five other residues are invariant. However, overall these sequences show considerable variation, and the alternative Drosophila sequences have only about 50% identity among themselves. The Drosophila C isoenzyme is most like the mammalian enzymes, and of these, is most similar to the muscle isoenzyme (59% identity). It seems likely that the alternative exons have arisen by separate exon duplication events. What can be said about the functional significance of the Drosophila isoenzymes? At the protein level, only the C isoenzyme has been studied and there is thus no direct information about relative catalytic properties.
However, mRNA blotting experiments have demonstrated that the isoenzymes are expressed in a developmentally specific manner (summarized

FIG. 33. Structure of the gene encoding the three aldolase isoenzymes from D. melanogaster. The exons are indicated by boxes, with the non-coding regions hatched. Alternative splicing of exons 4A, 4B and 4C gives rise to the three different isoenzymes. Expression, as labelled in the figure: isoenzyme A, adult (low abundance); isoenzyme B, embryo, larvae, eclosion, adult; isoenzyme C, embryo, larvae, pupae, adult (major form).

Dme A  ANGEAACGNYTAGSVKGFAGKDTLHVDDHRY
Dme B  ANSQACQGIYVPGSIPSFAGNANLFVAQHKY
Dme C  ANGDAAQGKYVAGS--AGAGSGSLFVANHAY
hum A  ANSLACQGKYTPSGQAGAAASESLFVSNHAY
hum B  ANCQAAKGQYVHTGSSGAASTQSLFTACYTY
hum C  VNGLAAQGKYEGSGEDGGAAAQSLYIANHAY
ric    ANSEATLGTYKGDAVLGEGASESLHVKDYKY
Pfa    ANSLATYGKYKGGAG-GENAGASLYEKKYVY
Tbr    MNSLAQLGKY--NRADDDKDSQSLYVAGNTY
        *  *  * *             *      *

FIG. 34. Comparison of the alternative C-terminal sequences of D. melanogaster aldolase with the corresponding C-terminal regions of representative class I aldolase sequences. The Drosophila sequences are from Shaw-Lee et al. (1992). See Section II for nomenclature and references for the other sequences. The arrow indicates the beginning of the sequence encoded by exon 4 of the Drosophila aldolase gene.


in Fig. 33). The A isoenzyme appears to be restricted to expression in adults and is in very low abundance. The C isoenzyme is the major form in all developmental stages and the B isoenzyme is particularly abundant at eclosion. It would seem likely that the isoenzymes differ in enzymic properties and it is known from extensive studies on the mammalian isoenzymes that the C-terminal region plays a major role in discriminating between the two ligands, fructose 1,6-bisphosphate and fructose 1-phosphate. The C-terminal region modulates access to the active site (see Fig. 9) and in this way can help to influence whether the enzyme is more active in the glycolytic or gluconeogenic direction (see Section V. 1(b)). A third example of differential splicing has been observed, this time for human muscle phosphofructokinase. Three different cases have been described, and two of these apparently produced non-functional products as a consequence of incorrect splicing in exon 9 (Sharma et al., 1990) or in exon 13 (Nakajima et al., 1990a). In neither of these cases is the exon in question present in a duplicated form in the gene, and the differential splicing thus leads to deletions and inactive enzymes. The third case relates to the occurrence of alternative splicing and the use of alternative promoters in the upstream untranslated region (Nakajima et al., 1990b). The functional significance of these observations is unclear at this stage.
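The identity figures quoted above for the aldolase C-terminal sequences can be explored with a short calculation. The following is a minimal sketch, not the procedure used by the original authors: the rows are transcribed from Fig. 34 with the spacing removed, the function names `percent_identity` and `invariant_columns` are hypothetical, and identity is assumed here to mean identical residues as a percentage of the columns in which neither sequence has a gap. Because Fig. 34 extends slightly upstream of exon 4 and the gap convention is an assumption, the values obtained need not reproduce the percentages cited in the text.

```python
from itertools import combinations

# Rows transcribed from Fig. 34 (inter-letter spaces removed).
# '-' marks an alignment gap; labels follow the figure.
ALIGNMENT = {
    "Dme A": "ANGEAACGNYTAGSVKGFAGKDTLHVDDHRY",
    "Dme B": "ANSQACQGIYVPGSIPSFAGNANLFVAQHKY",
    "Dme C": "ANGDAAQGKYVAGS--AGAGSGSLFVANHAY",
    "hum A": "ANSLACQGKYTPSGQAGAAASESLFVSNHAY",
    "hum B": "ANCQAAKGQYVHTGSSGAASTQSLFTACYTY",
    "hum C": "VNGLAAQGKYEGSGEDGGAAAQSLYIANHAY",
    "ric": "ANSEATLGTYKGDAVLGEGASESLHVKDYKY",
    "Pfa": "ANSLATYGKYKGGAG-GENAGASLYEKKYVY",
    "Tbr": "MNSLAQLGKY--NRADDDKDSQSLYVAGNTY",
}


def percent_identity(a, b):
    """Identical residues as a percentage of gap-free aligned columns.

    Assumption: columns where either row has a gap are excluded.
    """
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no gap-free columns to compare")
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def invariant_columns(rows):
    """1-based positions where every row carries the same residue."""
    return [
        i
        for i, column in enumerate(zip(*rows), start=1)
        if len(set(column)) == 1 and column[0] != "-"
    ]


if __name__ == "__main__":
    # Pairwise identities among the three alternative Drosophila sequences.
    drosophila = ["Dme A", "Dme B", "Dme C"]
    for first, second in combinations(drosophila, 2):
        value = percent_identity(ALIGNMENT[first], ALIGNMENT[second])
        print(f"{first} vs {second}: {value:.0f}%")

    # Drosophila C isoenzyme against each human isoenzyme.
    for human in ["hum A", "hum B", "hum C"]:
        value = percent_identity(ALIGNMENT["Dme C"], ALIGNMENT[human])
        print(f"Dme C vs {human}: {value:.0f}%")

    # Columns conserved in all nine sequences.
    print("invariant columns:", invariant_columns(list(ALIGNMENT.values())))
```

Excluding gapped columns is only one of several common conventions; dividing by the full alignment length instead would lower the values for any pair that contains a gap.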

VII. QUATERNARY STRUCTURE

The quaternary structure of most glycolytic enzymes has been well conserved during evolution. Monomeric forms are unusual, and the only enzyme that is invariably a monomer is phosphoglycerate kinase. In many cases inter-subunit contacts are critically important for function.

Hexokinase is one of the three glycolytic enzymes that can be active as either a monomer or as an oligomeric enzyme (the other two are phosphoglycerate mutase and pyruvate kinase, and will be discussed later in this section). In fact, the quaternary structure of hexokinase is of particular interest because of the gene duplications and fusion events that have occurred during its evolution (see Section VI.1). Hexokinase in fungi, plants and invertebrates has a subunit Mr of 50,000, and is active as either a monomer or a dimer. It is known that the dimeric form is favoured by MgATP and glucose (Shill et al., 1974), and is more active than the monomer (Steitz et al., 1977; see Section III. 1(a)). The enzyme in mammals has a subunit twice as large (Mr 100,000), and is invariably active as a monomer. Thus the doubling process has ensured that the mammalian enzyme is always present as what is in effect a covalently joined dimer. A major consequence of the doubling has been to confer additional control properties on the larger enzyme in the form of allosteric inhibition by glucose 6-phosphate. Yeast hexokinase and mammalian glucokinase (Mr 50,000) are not subject to this type of allosteric inhibition. It is clear that in the larger enzymes one of the ancestral active sites has ceased to function as a catalytic site, and has evolved to become an effector site (Nishi et al., 1988; Schwab and Wilson, 1988).

The crystal structure of glucosephosphate isomerase (see Section III. 1(b)) shows that the active site is formed partly by portions of two subunits (Achari et al., 1981). It is thus not surprising that the enzyme is found to be active as a dimer.
No higher oligomers have been reported. Glucosephosphate isomerase in some circumstances has been shown rather surprisingly to be active as a lymphokine termed neuroleukin (Gurney et al., 1986; see Section XI). It is not clear whether neuroleukin is active as a monomer or as an oligomer. Several other glycolytic enzymes have similarly unexpected non-enzymic functions, and in many cases the unusual properties correlate with a monomeric, and presumably inactive form of the enzyme. Examples include glyceraldehyde-phosphate dehydrogenase as a parasite surface antigen (Goudot-Crozel et al., 1989) and pyruvate kinase as a thyroid hormone binding protein (Kato et al., 1989). The quaternary structure of phosphofructokinase is complex. One reason is that it has evolved by gene duplication and fusion like hexokinase (see Sections III. 1(c) and VI. 1). In addition, phosphofructokinase occurs in a variety of oligomeric forms from dimer to tetramer to octamer and even larger forms. The only crystal structures of phosphofructokinase currently available are the bacterial enzymes (Section III. 1(c)). These enzymes


are allosterically regulated homotetramers with a subunit size of 33,000. The structures show clearly that both the active sites and the effector sites are shared between adjacent subunits, and it is thus not surprising that the enzymes are active as tetramers. The gene doubling process during evolution means that a eukaryotic dimer would correspond in size and number of binding sites to the prokaryotic tetramer. It is perhaps therefore surprising that the dimeric form of eukaryotic phosphofructokinase is not active. The mammalian enzyme, however, is active as a tetramer (corresponding to a bacterial octamer), and also undergoes effector-mediated aggregation to even larger forms that are apparently less sensitive to ATP inhibition (Hofer et al., 1987). The situation is even more complicated in yeast where there are two distinct types of chain, α and β (with Mr of 112,000 and 118,000), which associate to form α4β4 octamers. It appears that the gene doubling process occurred very soon after the separation of the prokaryotic and eukaryotic lineages, whereas the duplication to α and β chains occurred later in the fungal lineage (Heinisch et al., 1989). The significance of the complicated quaternary structure of phosphofructokinase is not entirely clear, but probably relates to the requirement for specific and responsive control properties for this enzyme. A wide range of effector molecules has been described (reviewed by Bloxham and Lardy, 1973; Hofmann, 1976; Sols, 1981), and some forms of the enzyme can be regulated by phosphorylation (Kulkarni et al., 1987; Huse et al., 1988).

Class I aldolases from nearly all organisms are composed of four identical subunits with a molecular mass of about 40,000 (Horecker et al., 1972). Some reports, however, described a different number of subunits and a different subunit mass in various bacterial aldolases (Gotz et al., 1980; Krishnan and Altekar, 1991).
Because crystal structures of only two mammalian aldolases are known, detailed information about the evolutionary conservation of the quaternary structure is limited. In the mammalian aldolase each subunit makes contact with two of its neighbours (Sygusch et al., 1987; Gamblin et al., 1990). The subunits are rather independent, in accordance with the absence of cooperativity in kinetic measurements. The subunit contacts, made by α-helices flanking the barrel, are mainly hydrophobic in nature. The class II aldolases studied so far are all homodimeric enzymes. However, no information is available as yet about their three-dimensional structure.

All triosephosphate isomerases that have been analyzed are homodimers. The crystal structures have been determined for the enzymes from three distantly related eukaryotes (chicken, yeast and trypanosome) (Banner et al., 1975; Lolis et al., 1990; Wierenga et al., 1987, 1991a). The essential features of the quaternary structure of these enzymes are the same, despite the mere 53-54% sequence identity. The monomer-dimer equilibrium constant is quite high. The interface is rather hydrophobic, and the major contacts between the two subunits are made by inter-digitating loops which extend into the active site of the other subunit. The quaternary structure does not confer cooperative binding of ligands on the enzyme.

All glyceraldehyde-phosphate dehydrogenases are tetrameric enzymes with a subunit mass of approximately 36,000 (Harris and Waters, 1976). The crystal structures have been determined for the enzymes of a eubacterium (Skarzynski et al., 1987) and several eukaryotes (Moras et al., 1975; Read et al., 1992; Vellieux et al., 1992). The various enzymes reveal similar three-dimensional structures, despite differences of 45-50% in the primary structure.
The four subunits of the glyceraldehyde-phosphate dehydrogenases are structurally and functionally equivalent, although initially a pairwise symmetry was reported for the lobster enzyme (Moras et al., 1975). The contacts over the three orthogonal inter-subunit axes of the tetramer, designated the P, Q and R axes (Rossmann et al., 1973; Buehner et al., 1974), differ in nature and strength (Fig. 35). The contacts in the P-interface are made by residues of the catalytic domain; they are both hydrophobic and hydrophilic and quite extensive. Interactions across the R-axis involve the S-loop, which extends into the NAD+ binding pocket of the other subunit. Contacts across the Q-axis, which also involve residues of the coenzyme binding domain, are fewer. The amino acids involved in P-axis interactions are highly conserved throughout evolution; the S-loop residues in the R-interface are conserved within the eubacterial and eukaryotic lineages, but differ significantly between these two groups. The contacts across the Q-axis are less conserved. The binding of NAD+ to the

FIG. 35. Diagrammatic representation of the subunit association of some dehydrogenase molecules. Indicated are the orthogonal inter-subunit axes P, Q and R, and the relative positions of the NAD+ binding domains in glyceraldehyde-phosphate dehydrogenase (GAPDH) and lactate dehydrogenase (LDH) (modified from Buehner et al., 1974).

enzyme leads to a conformational change in the coenzyme-binding domain. This results in small changes in the Q-axis and R-axis inter-subunit contacts, and thus forms the basis of the cooperativity in NAD+ binding, which may be positive or negative depending on the source of the enzyme (Skarzynski and Wonacott, 1988).

Phosphoglycerate kinase only occurs as a monomeric enzyme.

Phosphoglycerate mutase can be grouped into two main types: cofactor-dependent and cofactor-independent (reviewed by Fothergill-Gilmore and Watson, 1989; see Section III.1(h)). These two types are probably not homologous. The cofactor-independent form occurs in plants and certain invertebrates, and is invariably monomeric with a monomer size of 60,000. The cofactor-dependent form of phosphoglycerate mutase is found in vertebrates and yeasts, and is active as a monomer, homodimer or homotetramer, depending upon the organism from which it is isolated. It is thus the only glycolytic enzyme that exists in its most active form in a variety of quaternary structures. (The different quaternary forms of hexokinase, phosphofructokinase and pyruvate kinase have varying activities, and seem to relate to mechanisms for the control of activity.) Monomeric phosphoglycerate mutase (Mr 23,000) is found in the fission yeast, Schizosaccharomyces pombe. The dimeric form occurs in vertebrates, and the tetrameric form in the budding yeast, Saccharomyces cerevisiae. The subunit sizes in these cases are 27,000, somewhat larger than the S. pombe enzyme monomer. The crystal structure of the tetrameric enzyme has been solved (Winn et al., 1981; Watson, 1982), and a drawing of the tetramer is given in Fig. 17. It can be seen that there are two types of subunit contact, neither of which is very extensive. The four active sites are quite separate, as is consistent with its lack of cooperative properties. It is thus not too surprising that it can be active in different quaternary forms.
However, it is not at all obvious what the selective pressure in favour of oligomers might be. It is perhaps relevant that partial dissociation of the S. cerevisiae enzyme renders it much more susceptible to proteolytic degradation (Johnson and Price, 1986).

Enolase in the absence of Mg2+ is a monomer (Mr 45,000) and is inactive (Brewer and Weber, 1968). Binding of a single Mg2+ ion per subunit provokes a conformational change and dimer formation. This form of the enzyme can bind substrates or competitive inhibitors, but has little activity (Faller and Johnson, 1974). Binding of a second Mg2+ ion per subunit enables catalysis to occur (Faller et al., 1977). The crystal structure of the dimeric form of yeast enolase has been solved (Lebioda et al., 1989). Figure 19 illustrates the tertiary structure of an enolase monomer; the dimer interface involves contacts along two of the β-strands at the bottom of the diagram. The contacts are primarily ionic, and are not very extensive, as expected for a dimer that can readily dissociate in the absence of Mg2+. In the case of enolase, it appears that oligomer formation is a means of controlling enzyme activity.

Pyruvate kinase is most usually active as a homotetramer. However, certain exceptions


have been reported: the enzymes from Zymomonas mobilis and S. pombe are homodimers (Pawluk et al., 1986; Duncan et al., 1989). In addition, the cytosolic isoenzyme of plant pyruvate kinase is a heterotetramer of closely related 56,000 and 57,000 subunits (Blakeley et al., 1990). Interestingly, monomeric forms of human M1 and M2 pyruvate kinase have also been reported (Kato et al., 1989; Parkison et al., 1991). The conversion between the monomer and tetramer forms is regulated by thyroid hormone and fructose 1,6-bisphosphate (Ashizawa et al., 1991a, b), and constitutes a potentially important control mechanism, because the monomeric form has reduced enzymic activity (see Section X. 1(c)). The subunit contacts in the tetrameric forms of pyruvate kinase are extensive, and involve approximately 100 residues (Clayden, 1989). Most (90% in the M1 isoenzyme) of the contacts are hydrophobic. Of particular interest are the contacts of the N-terminal domain and of two of the helices in the C domain (Fig. 21) that are located near the centre of the tetramer. These contacts play major roles in the cooperative properties of pyruvate kinase. Isoenzyme-specific sequence differences in the N-terminal domain of the L isoenzyme confer susceptibility to allosteric regulation by phosphorylation. The M1 (non-allosteric) and M2 (allosteric) isoenzymes differ in sequence only in a single region of 56 residues that includes the inter-subunit helices in the C domain.

It is apparent that a great deal of information is available concerning the quaternary structure of glycolytic enzymes. Several general points can be made:

(1) There is a trend during evolution to larger double-size subunits in the effector-responsive enzymes hexokinase and phosphofructokinase. These larger enzymes are able to respond to a wider range of effector molecules than the smaller ancestral forms.
Presumably the possession of these properties within a single polypeptide chain instead of two distinct chains obviates the need for coordinate expression of two separate genes. Potential problems for the coordinate assembly of an oligomeric protein with different polypeptide chains are also avoided. Moreover, the double-size subunits (that are equivalent to covalently joined dimers) may have enhanced stability in vivo. Free subunits of oligomeric proteins tend to have a relatively high turnover because of their greater susceptibility to clearance mechanisms. It is interesting to note that recombinant haemoglobin with covalently joined α-globin chains has an increased half-life in vivo (Looker et al., 1992).

(2) Several regulatory mechanisms exist to control the quaternary structure of certain glycolytic enzymes (hexokinase, phosphofructokinase, enolase and pyruvate kinase). In all cases the smaller forms of the enzyme have reduced activity, and these assembly mechanisms can provide important control points.

(3) There is a general trend to higher oligomeric forms; active monomers are relatively rare. The reason(s) for this is not very obvious, but may relate to resistance to proteolysis within the cell. A tetrameric enzyme for example has a smaller surface to interior ratio than the corresponding monomer, and thus has a greater potential for burying peptide bonds that might be especially susceptible to proteolysis.

VIII. HAEMOGLOBIN AND PHOSPHOGLYCERATE MUTASE EVOLUTION

At first glance the connection between the evolution of haemoglobin and the evolution of a glycolytic enzyme, phosphoglycerate mutase, is far from obvious. However, these two molecules are inextricably linked at the level of haemoglobin effector function. In most mammals and in amphibia, haemoglobin oxygen affinity is controlled by 2,3-bisphosphoglycerate (reviewed by Coates, 1975).
This effector molecule is the product of the bisphosphoglycerate mutase isoenzyme present in erythrocytes. Among the non-mammalian vertebrates, other effector molecules are used to control haemoglobin oxygen affinity: ATP (in the case of fish, amphibia, reptiles and birds) and inositol pentaphosphate (birds). A consideration of the evolution of haemoglobin effector function provides clues about the likely evolution of phosphoglycerate mutase. The regulation of oxygen binding to haemoglobin and its release to the peripheral tissues is of crucial importance to all vertebrates. This regulation is a complex interplay of many factors, including haemoglobin type, the concentration of organic effector molecules, and the


flux through carbohydrate metabolic pathways (reviewed by Brewer and Eaton, 1971). In mammals 2,3-bisphosphoglycerate and hence bisphosphoglycerate mutase have a major role to play. The crucial role played by 2,3-bisphosphoglycerate has been particularly well studied in humans. In normal adults, the high concentration of bisphosphoglycerate in the erythrocytes (about 5 mM) ensures that oxygen is released from haemoglobin in the tissue capillaries even at the relatively high oxygen tension encountered. In the absence of bisphosphoglycerate, no oxygen would be released under these conditions. It is now well established that the concentration of bisphosphoglycerate rises in conditions of hypoxic stress (such as altitude sickness, low cardiac output, lung disease, etc.) in order to facilitate oxygen release from haemoglobin.

What is the mechanism by which the enzyme bisphosphoglycerate mutase responds to conditions of hypoxic stress? The answer seems to relate primarily to glycolytic flux and the inhibition of hexokinase by 2,3-bisphosphoglycerate (Brewer, 1969; Ponce et al., 1971; Brewer et al., 1974; Duhm, 1975). In conditions requiring maximal release of oxygen, there is a relatively low concentration of free bisphosphoglycerate, and glycolysis can proceed unhindered by inhibition at the level of hexokinase. This, in turn, leads to optimum synthesis of bisphosphoglycerate by flux through the Rapoport-Luebering shunt (Fig. 36). In due course when the concentration of free bisphosphoglycerate rises, it acts as a feedback control for glycolysis primarily by inhibiting hexokinase. Of course, this is an oversimplification, and there can be many other factors involved, such as the concentration of ATP or Mg2+ or H+.

So far the role of 2,3-bisphosphoglycerate has been discussed only as it relates to adult

Compounds shown in the scheme, in pathway order: glucose; 1,3-bisphosphoglycerate; 2,3-bisphosphoglycerate; 3-phosphoglycerate; 2-phosphoglycerate; pyruvate.

FIG. 36. Reactions catalyzed by phosphoglycerate mutases. Reaction 1 is the major reaction catalyzed by the glycolytic monophosphoglycerate mutase. This enzyme can also catalyze reactions 2 and 3, but at relatively low rates. Bisphosphoglycerate mutase catalyzes reaction 2 at the highest rate, but is also relatively efficient at catalyzing reactions 1 and 3. Reactions 2 and 3 result in the formation and breakdown of 2,3-bisphosphoglycerate, and are known as the "Rapoport-Luebering shunt".


humans. It is now of interest to consider the special requirements of the human foetus. It is obvious that the haemoglobin of the foetus must have a higher affinity for oxygen than the adult haemoglobin, or there would be no transfer of oxygen from the placenta to the foetus. Is this achieved by adjusting the bisphosphoglycerate concentration in the foetal erythrocytes such that it is lower than the maternal concentration? In this case the difference in affinity does not relate to differences in bisphosphoglycerate concentration, but is achieved by the presence of a different type of haemoglobin. Foetal haemoglobin has a lower affinity for bisphosphoglycerate than does the adult form, and consequently binds oxygen more tightly in the presence of the high concentrations of bisphosphoglycerate found in both foetal and adult erythrocytes (Tyuma and Shimizu, 1970). Most mammals show the same pattern of regulation of haemoglobin oxygen affinity as do humans. However, there are exceptions, and certain mammals (e.g. some members of the order Carnivora such as civets, hyenas and lions, and some artiodactyls such as cattle, goats and sheep) have relatively very low concentrations of 2,3-bisphosphoglycerate (Rapoport and Guest, 1941; Bartlett, 1970; Bunn et al., 1974). How do these animals regulate haemoglobin oxygen affinity, and how do they cope with hypoxic stress? Measurements of the oxygen affinities of haemoglobins from cats, sheep, cattle and goats have shown that they are inherently very low, and in fact are even lower than those observed for other mammalian haemoglobins in the presence of bisphosphoglycerate (Bunn, 1971). Clearly, the carnivore and artiodactyl haemoglobins do not require effector molecules to promote release of oxygen to the peripheral tissues. These animals appear to be able to respond to hypoxic stress by synthesizing an alternative species of haemoglobin with higher oxygen affinity and also with the capability of binding 2,3-bisphosphoglycerate. 
Thus the switch to the high affinity haemoglobin is accompanied by an increase in the synthesis of 2,3-bisphosphoglycerate, presumably because of the relief of feedback inhibition. This mechanism has been observed in cats (Taketa, 1974; Mauk et al., 1974), sheep (Van Vliet and Huisman, 1964) and goats (Huisman et al., 1967, 1968).

It is now relevant to consider the evolutionary relationships between phosphoglycerate mutases and haemoglobins (and their effector molecules). Mammalian phosphoglycerate mutases are rather versatile and can catalyze three different reactions (Fig. 36). It was initially supposed that each of these reactions was catalyzed by a different enzyme and it was quite surprising when it was realized that the glycolytic phosphoglycerate mutase isoenzymes from brain or muscle and bisphosphoglycerate mutase from erythrocytes could each catalyze all three of the reactions, albeit at substantially different relative rates (Table 7; reviewed by Rose, 1980). It is apparent that monophosphoglycerate mutase is much more active as a mutase than as a synthase by a factor of about 10,000. In contrast, bisphosphoglycerate mutase has only about a seven-fold preference for the synthase reaction.

It is probable that the three phosphoglycerate mutase isoenzymes found in present-day mammals have evolved from a common ancestor by two separate gene duplication events. What was the order of appearance of the muscle, brain and erythrocyte isoenzymes? Is it possible to deduce likely times for these gene duplications? Answers to these questions can be proposed from two lines of evidence: (1) distribution of isoenzymes and haemoglobin effector molecules, and (2) sequence comparisons among the phosphoglycerate mutases.

TABLE 7. CATALYTIC CONSTANTS FOR MONOPHOSPHOGLYCERATE MUTASE AND BISPHOSPHOGLYCERATE MUTASE

Reaction              Name          kcat (sec⁻¹)
                                    MPGAM     BPGAM
3PGA ⇌ 2PGA           mutase        1330      1.7
1,3-BPG → 2,3-BPG     synthase      0.4       12.5
2,3-BPG → PGA + Pi    phosphatase   2.8       2.6

The values are adapted from Rose (1980). Monophosphoglycerate mutase (MPGAM) was from chicken muscle, and bisphosphoglycerate mutase (BPGAM) was from horse erythrocytes. The phosphatase reaction was activated by 2-phosphoglycerate.


Fish and all higher vertebrates with the exception of birds have two monophosphoglycerate mutase isoenzymes that can be distinguished by electrophoretic mobility, heat stability and sensitivity to thiol reagents (Mezquita et al., 1981). In humans these two isoenzymes share 81% sequence identity (Matrix 8). The B isoenzyme is present in most tissues excluding skeletal muscle, whereas the M isoenzyme occurs only in cardiac and skeletal muscle. Heterodimers can be found in tissues such as cardiac muscle where both isoenzymes are expressed (Bartrons and Carreras, 1982). Birds appear to possess only the B isoenzyme (Mezquita and Carreras, 1981). Which isoenzyme is likely to be the ancestral form? The wider distribution of the B isoenzyme among the vertebrates and among different tissues tends to argue in favour of it being more ancient. This suggestion is substantiated by the fact that analysis of human genomic DNA has shown that several copies of the B gene exist, possibly in the form of processed pseudogenes (Sakoda et al., 1988). The likely time for the gene duplication giving rise to the M isoenzyme is before mammals and fish diverged (about 400 million years ago). Presumably the M isoenzyme gene in the avian line has been lost or is no longer expressed. This subsequent event must have occurred after birds and reptiles diverged (145 million years ago) because both isoenzymes are present in reptiles. What was the order of appearance of the gene encoding the E isoenzyme (bisphosphoglycerate mutase) that is so important for the synthesis of 2,3-bisphosphoglycerate? Two possible schemes are illustrated in Fig. 37. Scheme A is consistent with the distribution of isoenzymes in present-day organisms. Scheme B seems to be justified from a comparison of the sequences of the M, B and E isoenzymes. The first gene duplication event to give rise to the M isoenzyme probably occurred before 400 million years ago as discussed above. 
The second gene duplication can be positioned from a consideration of the distribution of 2,3-bisphosphoglycerate as an effector molecule for haemoglobin oxygen affinity. This function probably arose in amphibia, in good agreement with the presence of the E isoenzyme in amphibia and higher vertebrates. Thus the second gene duplication probably occurred after the divergence of mammals and fish, but before the divergence of mammals and amphibia about 350 million years ago. In contrast, scheme B appears to be favoured from a comparison of the isoenzyme sequences, and suggests that the more recent gene duplication gave rise to the present-day M

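[Fig. 37, tree diagrams. Scheme A: the mutase ancestor gives rise to the M isoenzyme and a B/E ancestor, which in turn gives rise to the E and B isoenzymes. Scheme B: the mutase ancestor gives rise to the E isoenzyme and a B/M ancestor, which in turn gives rise to the M and B isoenzymes.]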

FIG. 37. Alternative schemes for the order of appearance of the three phosphoglycerate mutase isoenzymes present in vertebrates. It is assumed that the brain (B) isoenzyme is the earliest form in both schemes. In scheme A, the muscle (M) isoenzyme predates the erythrocyte (E) isoenzyme. It is envisaged that the origin of the M isoenzyme is before the divergence of mammals and fish about 400 million years ago. The origin of the E isoenzyme is likely to be before the divergence of mammals and amphibians about 350 million years ago. In scheme B, the E isoenzyme predates the M isoenzyme.


and B isoenzymes. These two enzymes share 81% sequence identity, whereas the E isoenzyme is only about 50% identical to the other two. If it is assumed that the rate of evolution of these isoenzymes has been more or less constant, then scheme B seems correct. In this scheme, with the origin of the E isoenzyme predating the M and B divergence, it would be necessary to invoke the loss of expression of the E isoenzyme in the fish and the reptile/bird lines. It should be possible to check this scheme by taking into account the rate of evolution of phosphoglycerate mutase. Unfortunately only four glycolytic phosphoglycerate mutase sequences are currently available and the plot of PAMs vs divergence time does not give a straight line (Fig. 22). From this plot, the rate of evolution is between 3.3 and 9.6 PAMs per 100 million years. Thus for a pair of enzymes with 19% sequence difference (roughly 21 PAMs), the range of divergence times is 220-640 million years ago. This is clearly an unhelpfully large range that frustratingly encompasses most of the relevant vertebrate divergence times! Moreover, according to the same argument, the E isoenzyme would have diverged from the others in the range 800-2400 million years ago. This seems most unlikely, and is itself strong evidence in favour of a non-constant rate of evolution. Presumably the E isoenzyme underwent a period of rapid change after the gene duplication event giving rise to it, in order to acquire its distinct catalytic properties.

It is therefore not possible with current information to make an unequivocal choice between the two possible schemes. However, scheme A is favoured on grounds of parsimony. This scheme would require the loss of the M isoenzyme in the bird line. In contrast, scheme B would necessitate the loss of the E isoenzyme in both the fish and the reptile/bird lines.

IX. HETEROLOGOUS ENZYMES CATALYZING THE SAME REACTION

Apparently unrelated enzymes that catalyze the same reaction are intriguing both with regard to their evolution and to their enzymology. Many questions come to mind. Have the enzymes converged to provide the same function, or did they diverge so long ago as to be no longer perceptibly related? Are the limited sequence similarities that can sometimes be shown a consequence of the requirement to catalyze the same reaction and bind the same ligands, or evidence in favour of divergence from a common ancestor? Do the enzymes have the same mechanism of action? Is the architecture of their active sites the same?

Among the glycolytic enzymes, there are five that are particularly relevant to consider in this section: phosphofructokinase, aldolase, glyceraldehyde-phosphate dehydrogenase, phosphoglycerate mutase and pyruvate kinase. Of these, the class I and class II aldolases seem to be convincingly heterologous, as are pyruvate kinase and pyruvate phosphate dikinase. The cofactor-dependent and cofactor-independent phosphoglycerate mutases are also likely to be unrelated to each other. Phosphofructokinase and glyceraldehyde-phosphate dehydrogenase belong to a "not proven" category where the evidence can be argued in favour of both heterology and homology.

Comparisons of crystal structures and of amino acid sequences can be among the best ways of deducing the evolutionary histories of proteins, as has been amply demonstrated in other sections of this review. Unfortunately in the case of the five heterologous glycolytic enzymes, the data are very sparse. There are no crystal structures available for any of the less common versions of the five enzymes. Moreover, there are few sequences: three PPi-dependent phosphofructokinases, three class II aldolases, four archaebacterial glyceraldehyde-phosphate dehydrogenases and two pyruvate phosphate dikinases. There are no cofactor-independent phosphoglycerate mutase sequences available. Any generalizations and interpretations must therefore be made with considerable caution.
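The divergence-time ranges quoted for the phosphoglycerate mutase isoenzymes at the end of the preceding section follow from simple arithmetic, which can be sketched as below. This is only an illustration under stated assumptions: the conversion from observed sequence difference to PAMs uses Kimura's empirical approximation to the Dayhoff correction, which is not necessarily the conversion used for Fig. 22, and the rates (3.3 and 9.6 PAMs per 100 million years) are read as pairwise distance accumulated per unit of time since divergence. The function names are hypothetical.

```python
import math


def pam_distance(fraction_different: float) -> float:
    """Approximate PAM distance for an observed fractional sequence difference.

    Assumption: Kimura's empirical formula, K = -ln(1 - D - D^2/5),
    stands in for the Dayhoff correction table.
    """
    inner = 1.0 - fraction_different - 0.2 * fraction_different ** 2
    if inner <= 0.0:
        raise ValueError("sequence difference too large for this correction")
    return -100.0 * math.log(inner)


def divergence_time_range(fraction_different: float,
                          slow_rate: float = 3.3,
                          fast_rate: float = 9.6) -> tuple:
    """Return (youngest, oldest) divergence time in millions of years.

    Rates are in PAMs per 100 million years; the fast rate gives the
    youngest estimate and the slow rate the oldest.
    """
    pams = pam_distance(fraction_different)
    youngest = pams / fast_rate * 100.0
    oldest = pams / slow_rate * 100.0
    return youngest, oldest


if __name__ == "__main__":
    for label, diff in (("M vs B (19% different)", 0.19),
                        ("E vs M/B (50% different)", 0.50)):
        young, old = divergence_time_range(diff)
        print(f"{label}: {young:.0f}-{old:.0f} million years")
```

With this particular correction the two ranges come out at roughly 230-670 and 830-2400 million years, close to but not identical with the 220-640 and 800-2400 million years given in the text, which presumably rest on a slightly different conversion. Either way the conclusion is unchanged: the first range is too broad to discriminate between the schemes, and the second is implausibly old.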

1. Phosphofructokinase

The most commonly occurring phosphofructokinase uses ATP as phospho donor (see Section III.1(c)). This enzyme is found in all major kingdoms, with the probable exceptions of archaebacteria, and certain protists and eubacteria. An alternative, relatively rare form of phosphofructokinase that uses inorganic pyrophosphate as phospho donor is also found in some organisms, and can occur together with the ATP-phosphofructokinase (e.g. in plants and Euglena gracilis) or as apparently the sole form of phosphofructokinase (e.g. in parasitic protists such as Entamoeba histolytica, Giardia lamblia and Trichomonas vaginalis) (reviewed by Mertens, 1991; see Section III.1(c)). In the latter group of organisms, the PPi-phosphofructokinase is not sensitive to fructose 2,6-bisphosphate regulation, and as it catalyzes a readily reversible reaction, can replace both ATP-phosphofructokinase and fructose 1,6-bisphosphatase. However, the situation in plants and Euglena is much more complicated because the PPi-phosphofructokinase coexists with ATP-phosphofructokinase and fructose 1,6-bisphosphatase. In these organisms, the PPi-dependent enzyme appears to play a major role in anaerobic metabolism when the ATP/ADP ratio is low (see Mertens, 1991) but there remain many aspects still to be defined. See Section X for a discussion of the evolution of the control aspects of phosphofructokinase.

The evolution of PPi-phosphofructokinase is puzzling because of its phylogenetic distribution. It does not cluster on any major lineage, but occurs in phylogenetically distant groups such as plants, Euglena, parasitic protists and eubacteria. The correlation appears to be more with organisms adapted to an anaerobic life style. This has the evolutionary consequence that the enzyme must either have (1) arisen in several independent events, or (2) been lost independently in several different lineages, or (3) been subject to lateral gene transfer. Additional sequence information will be required to resolve these alternatives.

Three full-length PPi-phosphofructokinase sequences are currently available. These are the enzyme from Propionibacterium freudenreichii and the two different subunits of the potato enzyme, and are shown in Fig. 6(b). The potato α-subunit has 615 amino acid residues and the β-subunit 512 residues (Carlisle et al., 1990). The two subunits are homologous (43% identity) and have apparently evolved by gene duplication. The P. freudenreichii sequence has 403 residues (Ladror et al., 1991) and is quite different in sequence from the potato enzyme subunits. There is no evidence of gene doubling as has occurred in the ATP-phosphofructokinase.

Comparison of these sequences with the ATP-dependent enzymes is not straightforward because of the low degree of similarity and because of the different subunit sizes. A plausible alignment of the sequences with the ATP-phosphofructokinase from B. stearothermophilus (318 residues) is given in Fig. 38. It is apparent that there are few identical residues. The Bacillus sequence is 15% identical to P. freudenreichii, 22% to the α-subunit and 23% to the β-subunit. As has already been discussed (Section III.4), these values are unhelpfully close to those of unrelated sequences and belong to the "grey area" of possible homology (Fig. 26). However, some of the conserved residues (especially in the region aligned with residues 93-170 of the Bacillus enzyme) correspond to the fructose 6-phosphate and ATP binding sites, and there is generally good correspondence between the known secondary structure of the bacterial enzyme and the predicted secondary structures of the plant subunits (Carlisle et al., 1990). Moreover, the catalytic aspartate residue (Asp-126) of the Bacillus enzyme is conserved in the β-subunit that is thought to be catalytic, but not in the α-subunit that may play a regulatory role (Yan and Tao, 1984).

The lack of sequence information for PPi-phosphofructokinases from other organisms means that it is not possible to establish how closely related these enzymes are to each other. Variations in subunit size and quaternary structure give an indication that they may in fact be a rather disparate group. Thus the enzymes from Propionibacterium spp., Isotricha prostoma and Giardia lamblia are homodimers with subunits of Mr 48,000-55,000; the enzyme from Trichomonas vaginalis is a homotetramer with subunits of Mr 45,000; the Euglena gracilis enzyme is a monomer of Mr 110,000 and the potato enzyme is a heterotetramer with subunits of Mr 60,000 and 65,000 (O'Brien et al., 1975; Mertens et al., 1989; Mertens, 1990; Miyatake et al., 1986; Carlisle et al., 1990). This variability would be consistent with independent origins of these enzymes along separate phylogenetic lineages.

On balance, the present rather sparse data appear to add up in favour of divergence of ATP- and PPi-phosphofructokinases from a common ancestor. But the evidence is not strong and should probably be treated with some caution until more sequences are known. There remains the problem of the enigmatic phylogenetic distribution of the PPi-phosphofructokinase.

[Fig. 38: sequence alignment of Bst (318 residues), potA (615), potB (512) and Pfr (403), with secondary-structure elements marked above the alignment.]

FIG. 38. Alignment of ATP- and PPi-dependent phosphofructokinase sequences. See Section II for nomenclature and references for the sequences. The catalytic aspartate in the B. stearothermophilus enzyme is indicated by the star symbol. Residue numbers for each of the sequences are given at the end of the lines.
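Percent identities of the kind quoted for this alignment (15%, 22% and 23%) are obtained by counting matching residues in aligned sequence pairs. A minimal sketch is given below. The sequences are short invented strings, not data from Fig. 38, the function names are hypothetical, and the choice of denominator (positions where neither sequence has a gap) is an assumption, since the convention used for the matrices in this review is not stated here.

```python
from itertools import combinations

GAP = "-"


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two rows taken from the same alignment.

    Assumption: positions where either row has a gap are ignored, so the
    denominator is the number of positions with a residue in both rows.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(seq_a, seq_b):
        if res_a == GAP or res_b == GAP:
            continue
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def identity_matrix(alignment: dict) -> dict:
    """Pairwise percent identities keyed by (name_1, name_2)."""
    return {
        (name_1, name_2): percent_identity(alignment[name_1], alignment[name_2])
        for name_1, name_2 in combinations(alignment, 2)
    }


if __name__ == "__main__":
    # Invented toy alignment, for illustration only.
    toy_alignment = {
        "seqA": "MKVLAG-TE",
        "seqB": "MRVLSGATE",
        "seqC": "MKV-AGATD",
    }
    for pair, value in identity_matrix(toy_alignment).items():
        print(pair, f"{value:.1f}%")
```

Other denominators (the full alignment length, or the length of the shorter sequence) give lower values for gapped alignments, which matters when the result falls in the "grey area" discussed in the text.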

2. Aldolase

It has long been realized that the reversible aldol cleavage of fructose 1,6-bisphosphate can be accomplished by two quite distinct types of enzyme (reviewed by Rutter, 1964). The main distinguishing feature of the class I and class II aldolases at that time was catalytic mechanism. The class I enzymes involve a covalent imine link between the substrate and an active-site lysine residue in order to stabilize the intermediate carbanion. In contrast, the class II enzymes have a bivalent cation (usually Zn2+ or Fe2+) at the active site to provide the same stabilization function.

Sequence information reported in the last few years has confirmed the heterologous nature of the two aldolase classes. Sequences of three class II enzymes have shown them to comprise a homologous group (Fig. 8(b)) that appears to have no sequence similarities to the class I enzymes (Fig. 8(a)). A comparison of the sequences of the class II aldolases with representative class I aldolase sequences is given in Fig. 39. This alignment was generated by the CLUSTAL program, with subsequent editing in four places to remove gaps in regions of the class I enzymes known to have regular secondary structure. It can be seen that the two classes of aldolase are very dissimilar and the matrix of percent identities (Matrix 11) confirms their lack of homology. This conclusion is at variance with that of Von der Osten et al. (1989) who detect conservation of sequence between class I and class II aldolases. Their alignment shows that 18% of the residues are similar or identical. This value and those given in Matrix 11 (11-15% identity), however, are sufficiently low that it is unlikely that the sequences have diverged from a common ancestor (see Section III.4).

What can be said about the evolutionary history of the two aldolase classes? Probably the most informative line of evidence is the phylogenetic distribution of the two aldolase classes (reviewed by Marsh and Lebherz, 1992). Examples of organisms expressing one or both of the aldolase classes are given in Fig. 40. It can be seen that both types of aldolase are present in all three of the main phylogenetic groups and that in some organisms both classes may be expressed. This is compelling evidence that both aldolase classes arose very early in evolution. It seems likely that many phylogenetic lineages have lost the expression of one or the other aldolase gene, perhaps because there was no selective advantage to express both. However, it is not easy to explain why some organisms have retained both classes. Patterns of expression in eubacteria grown under different conditions have led to the suggestion that the class I enzyme operates preferentially in gluconeogenesis, whereas the class II enzyme is more important for glycolysis (Stribling and Perham, 1973; Scamuffa and Caprioli, 1980; Fischer et al., 1982). However, this pattern of expression is not the general rule, and it appears that the two classes of aldolase can function equally well in vivo under different metabolic conditions (see Marsh and Lebherz, 1992). There would seem to be little selective advantage for or against the retention of two different aldolases within a single organism.

3. Glyceraldehyde-phosphate Dehydrogenase

Glyceraldehyde-phosphate dehydrogenase is the most highly conserved of all glycolytic enzymes (see Tables 3 and 5). The rate of evolution of the catalytic domain, for example, is only about 3 PAMs per 100 million years. Thus these domains in eukaryotic and eubacterial enzymes retain more than 60% of their residues as identical. The NAD+-binding domain is somewhat less well conserved (7-8 PAMs per 100 million years). Overall, eukaryotic and eubacterial glyceraldehyde-phosphate dehydrogenases have about 50% of their residues in common.

Glyceraldehyde-phosphate dehydrogenase is also the best characterized of the glycolytic enzymes from the structural point of view. Crystal structures are available for enzymes from four phylogenetically distant organisms (human, lobster, trypanosome and Bacillus), and a total of 45 sequences have been determined. Of these, 32 are from eukaryotes, 9 from eubacteria and 4 from archaebacteria.

The relationship of the archaebacterial glyceraldehyde-phosphate dehydrogenases to those from other organisms turns out to be surprisingly difficult to define. It is by no means

[Fig. 39: sequence alignment of the class I aldolases (humA, humB, humC, Dme, mai, Pfa, TbrGl) and the class II aldolases (yeaII, EcoII, CglII).]

FIG. 39. Comparison of class I and class II aldolases. See Section II for nomenclature and references for the sequences. Elements of regular secondary structure in the human muscle enzyme are shown, and the class I active-site lysine is indicated by the star symbol. The numbering is according to the human muscle and yeast enzymes.

MATRIX 11. PAIRWISE COMPARISON OF THE ALDOLASE SEQUENCES ALIGNED IN FIG. 39 (percent identity)

         humA  humB  humC  Dme   mai   Pfa   TbrGl yeaII EcoII CglII
humA     100
humB      70   100
humC      82    71   100
Dme       69    64    69   100
mai       61    59    59    60   100
Pfa       55    51    56    54    57   100
TbrGl     49    47    48    46    50    47   100
yeaII     14    15    13    13    11    12    12   100
EcoII     14    14    13    13    12    14    12    48   100
CglII     12    11    12    13    11    12    14    37    39   100

clear whether they share a common ancestor, and the evidence has been argued both for and against homology (Hensel et al., 1989; Doolittle et al., 1990). The sequences are much more different than would be expected from a consideration of the evolution rate, and overall they are actually no more similar than random sequences. In fact, the archaebacterial sequences are so different as to make alignment with any of the other sequences difficult, and they were

                 ARCHAEBACTERIA                EUBACTERIA                    EUKARYOTES
Class I only     Halobacterium saccharovorum   Mycobacterium smegmatis       animals, plants
Class II only    Halobacterium halobium        Rhodopseudomonas spheroides   yeast
Both I and II    no examples reported          Escherichia coli              Euglena gracilis

FIG. 40. Phylogenetic distribution of class I and class II aldolases. The examples are taken from Marsh and Lebherz (1992).

not included in the main alignment in Fig. 13(a) for this reason. This situation is in contrast with that for phosphoglycerate kinase, the only other glycolytic enzyme for which archaebacterial sequences are available. In this case, the archaebacterial sequences are undoubtedly homologous to the eukaryotic and eubacterial sequences (31-36% identity, Matrix 7(a)). It should be noted in this context that the archaebacterial phosphoglycerate kinase sequences are from Methanobacterium bryantii and Methanothermus fervidus, the same organisms as two of the glyceraldehyde-phosphate dehydrogenase sequences. The other two sequences are from Methanobacterium formicicum and Pyrococcus woesei, and all four organisms belong to the methanogen group within the archaebacteria.

Among themselves the four available archaebacterial glyceraldehyde-phosphate dehydrogenase sequences are a clearly homologous group and share substantial sequence identity (Fig. 13(b) and Matrix 6(b)). A possible alignment of these sequences with representative eukaryotic and eubacterial sequences is shown in Fig. 41. As far as possible, gaps in the eukaryotic and eubacterial enzymes were constrained to regions known to lack regular secondary structure. Pairwise comparisons of sequence identities for the sequences given in this alignment are shown in Matrix 12. Several points can be made from these comparisons: (1) the sequences are all about the same length; (2) the overall sequence identities between the archaebacterial sequences and the others are only 16-20%; (3) only the glycines are conserved in the NAD+-binding sequence ...GFGRIGR... (residues 8-14) that, by contrast, is completely conserved in all eukaryotic and eubacterial glyceraldehyde-phosphate dehydrogenases; (4) aspartate-33, which is involved in binding the ribose moiety of the NAD+ in the eukaryotic and eubacterial enzymes, is replaced by a lysine in the archaebacterial enzymes, consistent with their different NADP+ cofactor; (5) the active-site cysteine (residue 151) is embedded within a small cluster of residues conserved in all the sequences; (6) the S-loop (residues 180-202, see Fig. 12) that is well conserved in the eukaryotic and eubacterial enzymes is considerably different in the archaebacterial enzymes, both with respect to sequence and length.

It is clear from a consideration of these points that the archaebacterial enzymes are likely to be typical dehydrogenases, possessing a characteristic Rossmann fold in the N-terminal half. Moreover, they have a well-conserved region containing a putative active-site cysteine residue. These seem to be reasonably convincing indications that the eukaryotic, eubacterial and archaebacterial enzymes have diverged from a common ancestor. However, the archaebacterial enzymes are much less similar in overall sequence than would be expected from known evolution rates, and they also appear to be very different in the region of the S-loop that is

[Fig. 41: sequence alignment of hum, lob, Bst, Mbr, Mfo, Mfe and Pwo, with secondary-structure elements and the S-loop marked above the alignment.]

FIG. 41. Comparison of eukaryotic, eubacterial and archaebacterial glyceraldehyde-phosphate dehydrogenases. See Section II for nomenclature and references for the sequences. Elements of regular secondary structure observed in the B. stearothermophilus enzyme are shown, and the S-loop is labelled. The active-site cysteine is indicated by the star symbol and the boundary between the N- and C-terminal domains by the arrow. The numbering is according to the B. stearothermophilus enzyme.


MATRIX 12. PAIRWISE COMPARISON OF THE GLYCERALDEHYDE-PHOSPHATE DEHYDROGENASE SEQUENCES ALIGNED IN FIG. 41

       hum   lob   Bst   Mbr   Mfo   Mfe   Pwo
hum    100
lob     71   100
Bst     54    54   100
Mbr     20    19    20   100
Mfo     20    19    20    96   100
Mfe     18    18    20    71    70   100
Pwo     16    16    18    58    57    56   100
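The entries of a comparison matrix such as Matrix 12 are pairwise scores derived from aligned sequences. As a minimal sketch, assuming the score is percent identity counted over the alignment columns in which neither sequence has a gap (the exact convention used for the matrix is not stated in this section), the calculation could look as follows. The function names and the short toy sequences are hypothetical and are not taken from Fig. 41.

```python
from itertools import combinations


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def identity_matrix(aligned: dict[str, str]) -> dict[tuple[str, str], float]:
    """One score per unordered pair of sequence names."""
    return {
        (n1, n2): percent_identity(aligned[n1], aligned[n2])
        for n1, n2 in combinations(aligned, 2)
    }


# Toy aligned fragments, invented for illustration only.
toy = {
    "seqA": "SNASCTTNCL",
    "seqB": "SNASCTTNCL",
    "seqC": "SCNTTG-LCR",
}
scores = identity_matrix(toy)
for pair, score in scores.items():
    print(pair, round(score, 1))
```

Whether gapped columns are skipped or counted as mismatches changes the percentages, so values from different sources are comparable only when the same convention is used.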

functionally so important in the eukaryotic and eubacterial enzymes. These latter points provide compelling evidence that the archaebacterial enzymes are unlikely to share a common ancestor with those of the eukaryotes and eubacteria. A possible scenario is that an existing dinucleotide-binding enzyme in the archaebacterial ancestral line underwent a gene duplication event and that one of the duplicated forms acquired glyceraldehyde-phosphate dehydrogenase activity, primarily by changes involving the region containing the active-site cysteine. The archaebacterial enzymes thus show convergence to the eukaryotic and eubacterial enzymes in this region. Is there any evidence in favour of this scheme? Doolittle et al. (1990) noted that the archaebacterial glyceraldehyde-phosphate dehydrogenase sequences show some similarity to a portion of the sequence of bovine NAD+/NADPH transhydrogenase. A possible alignment is given in Fig. 42 and shows overall identity to be about 15%. This provides a hint that the ancestral form of this transhydrogenase may be the ancestor of both enzymes.

It is also relevant to note that archaebacteria have quite distinctive patterns of hexose catabolism (see Section IV.4). Phenotypically the archaebacteria can be placed into four groups: methanogens, extreme halophiles, thermoacidophiles and sulphur-dependent thermophiles (reviewed by Danson, 1988, 1989). The latter two groups employ a mostly unphosphorylated pathway for glucose catabolism and possess only enolase and pyruvate kinase in the glycolytic pathway. The methanogens and halophiles have the trunk pathway of enzymes including glyceraldehyde-phosphate dehydrogenase (Fig. 27), but apparently lack phosphofructokinase. The methanogens appear to use their "glycolytic" enzymes primarily for carbohydrate synthesis.

4. Phosphoglycerate Mutase

Phosphoglycerate mutases can be grouped into two main classes that are distinguished by their requirement for the cofactor 2,3-bisphosphoglycerate (reviewed by Fothergill-Gilmore and Watson, 1989). The cofactor-dependent enzymes have been extensively characterized, and a crystal structure and several sequences have been reported (see Section III.1(h)). By contrast, relatively little is known of the structures of the cofactor-independent enzymes and no sequence information is currently available. What evidence is there to indicate whether the two main classes of phosphoglycerate mutase may have diverged from a common ancestor? The subunit size of the cofactor-independent enzyme is about twice that of the other phosphoglycerate mutases, and it is possible that the cofactor-independent enzymes evolved by a gene doubling process, as experienced by hexokinase and phosphofructokinase (see Section VI.1). Mechanistically the two classes of phosphoglycerate mutase appear to be quite distinct. Most cofactor-dependent enzymes have an absolute requirement for 2,3-bisphosphoglycerate, which is known to donate a phospho group to histidine-8 in an initial priming reaction. It is the phosphoenzyme that participates in catalysis (see Fothergill-Gilmore and Watson, 1989 for a detailed consideration of the catalytic mechanism). The cofactor-independent enzymes have no requirement for 2,3-bisphosphoglycerate. However, these enzymes also apparently involve a phosphohistidine intermediate, as shown by induced transport tests, isotope labelling experiments and examination of the stereochemistry of phospho transfer (Britton et al., 1971; Breathnach and Knowles, 1977; Blättler and Knowles, 1980). Presumably the monophosphoglycerate substrates serve as phospho donors for the cofactor-independent enzymes.

[Fig. 42: sequence alignment; rows bovTr, Mbr, Mfo, Mfe and Pwo. The residue-level alignment is not recoverable from the scan.]

FIG. 42. Alignment of the C-terminal portion of bovine NAD+/NADPH transhydrogenase with archaebacterial glyceraldehyde-phosphate dehydrogenases. The transhydrogenase sequence (bovTr) is from Yamaguchi et al. (1988) and the references for the other sequences are in Section II. The numbering of the transhydrogenase and the Methanobacterium bryantii sequences is given.

An examination of the phylogenetic distribution of the cofactor-independent mutase is relevant to a consideration of its evolution. Carreras et al. (1982) and Price et al. (1983) have surveyed a wide range of eukaryotes to ascertain the distribution of the two types of mutase. All vertebrates have the cofactor-dependent enzyme and all plants have the cofactor-independent mutase. This apparent simplicity of distribution becomes considerably more complex when invertebrates, fungi and protists are surveyed. Among the invertebrates, the cofactor-dependent enzyme is found in platyhelminths, molluscs, annelids, crustaceans and insects, whereas the cofactor-independent enzyme is present in sponges, coelenterates, myriapods, arachnids and echinoderms. Studies of the fungi have revealed that four (including S. pombe and S. cerevisiae) were cofactor-dependent and that seven (e.g. N. crassa and A. nidulans) had mutases independent of added cofactor. Only two protists have been characterized and both of these have cofactor-dependent mutases. Very few bacterial enzymes have been characterized with respect to cofactor requirements. Gram-positive bacteria can be either cofactor-dependent (Streptomyces) or -independent (Bacillus). The gram-negative bacterium (E. coli) has a cofactor-dependent enzyme.

The observed distribution of the cofactor-dependent and -independent mutases is consistent with the notion that both types of gene were present early in evolution and that the animal lineage has lost the cofactor-independent gene at some stage prior to the radiation of the vertebrates. The plant lineage, on the other hand, appears to have lost the cofactor-dependent gene, probably when the early forms diverged from the primitive unicellular organisms. An unequivocal definition of whether the two classes of phosphoglycerate mutase are homologous or heterologous must await sequence information from the cofactor-independent enzymes.

5. Pyruvate Kinase

The final step of the glycolytic pathway is the conversion of phosphoenolpyruvate to pyruvate with the concomitant formation of ATP. In most organisms this reaction is catalyzed by pyruvate kinase with a mechanism involving an essentially irreversible transfer of a phospho group from phosphoenolpyruvate to ADP. However, in certain protists (e.g. Entamoeba histolytica) and bacteria (e.g. Bacteroides symbiosus) the usual pyruvate kinase is apparently absent, and the reaction is catalyzed by pyruvate phosphate dikinase (Reeves, 1968; Pocalyko et al., 1990). This enzyme catalyzes the reversible transformation of phosphoenolpyruvate to pyruvate in a reaction involving a phosphoenzyme intermediate. In this case, phosphoenolpyruvate and inorganic pyrophosphate each donate a phospho group to AMP to yield ATP. Pyruvate phosphate dikinase is also present in the chloroplasts of C4 and crassulacean acid metabolism plants where it is responsible for phosphoenolpyruvate production during photosynthesis (Hatch and Slack, 1968; Kluge and Osmond, 1971). Two pyruvate phosphate dikinase sequences have been determined at the DNA level: those from maize (Matsuoka et al., 1988) and from Bacteroides symbiosus (Pocalyko et al., 1990). The mature plant enzyme has 876 amino acid residues, and the cDNA sequence revealed that 71 additional residues at the N-terminus were essential for specifying the transport of the precursor protein into the chloroplasts. The bacterial enzyme has 840 residues. Overall the two sequences are 53% identical, and have thus clearly diverged from a common ancestor. There is no apparent sequence similarity to the usual pyruvate kinase (Pocalyko et al., 1990).

X. EVOLUTION OF CONTROL

Glycolysis serves various functions in cellular metabolism. During the stepwise enzymic breakdown of sugar molecules the favourable free energy of some reactions is harnessed to drive other cellular processes.
Reducing equivalents are made available to the cell, and building blocks are provided for the synthesis of amino acids, fatty acids and sterols. Moreover, as discussed in Section V, the glycolytic pathway shares enzymes with other metabolic systems such as gluconeogenesis, the pentose phosphate pathway and the Calvin cycle. To control this interplay of reactions and to adjust the glycolytic flux to the cellular needs for energy and precursors for anabolic processes, a number of control mechanisms have evolved. These control mechanisms have been identified by two different approaches. The first approach involves looking for answers to qualitative questions concerning the mechanisms by which the activity of an enzyme catalyzing a specific reaction is controlled. This kind of analysis has provided insight into the regulation of enzymes by feed-back inhibitors, allosteric effectors, regulatory proteins and covalent modification. The second approach is to study essentially quantitative aspects: how is the overall flux through the pathway determined by the activity and concentration of each enzyme, by the levels of metabolites and by the concentrations of the effectors of enzyme activity?

1. Qualitative Analysis

Three major control points have been identified in the glycolytic pathway by the qualitative approach: the enzymes that catalyze reactions which are virtually irreversible, i.e. hexokinase, phosphofructokinase and pyruvate kinase. The activity of these enzymes can be directly influenced by various glycolytic intermediates and by specific effector molecules, so that the glycolytic rate is adjusted within seconds to the overall cellular metabolism. Moreover, in multicellular organisms the activities of phosphofructokinase and pyruvate kinase, as well as those of the gluconeogenic enzymes catalyzing the reverse reactions, can be controlled on a time scale of minutes through the action of hormones such as glucagon, insulin and catecholamines. This latter level of regulation couples the glycolytic and gluconeogenic fluxes in the cells of various tissues such as liver, heart and skeletal muscle, to the overall nutritional and activity state of the organism. These hormones can, in addition to regulating the activity of key enzymes, also influence the rate of their biosynthesis. This latter kind of regulation of glycolysis and gluconeogenesis is relatively slow: hours to days.

From the preceding sections it is evident that the core structures of the glycolytic enzymes and their catalytic mechanisms have been extremely well conserved during evolution. However, the mechanisms by which glycolysis is controlled can vary substantially among the different major phylogenetic groups. This indicates that many of these control systems arose during later stages of evolution. We shall discuss the various molecular mechanisms by which the activity of the three key enzymes can be regulated and trace their evolutionary origins.

(a) Hexokinase

The major regulator of the activity of the mammalian hexokinases I, II and III is the reaction product glucose 6-phosphate. Increasing concentrations of this compound allosterically inhibit these enzymes.
This regulation presumably evolved after the duplication-fusion of the ancestral enzyme; subsequently the catalytic site in the N-terminal half of the enzyme was altered to become an allosteric effector site for glucose 6-phosphate (see Section VI.1). No control by physiological concentrations of the reaction product is observed for mammalian glucokinase and yeast hexokinase, which are only half-size enzymes. The activity of the mammalian type I enzyme is stimulated by citrate and phosphate (Kosow and Rose, 1971). The physiological significance of the citrate stimulation could be to signal that there is an adequate supply of ATP in the cell. By simultaneously inhibiting phosphofructokinase (see below) and stimulating hexokinase, the citrate would promote the utilization of glucose and thence glucose 6-phosphate for glycogen and fat synthesis. Inhibition by glucose 6-phosphate has also been observed for the glucokinase of the bacterium Zymomonas mobilis (Doelle, 1982). The gene of this enzyme has been cloned and sequenced (Barnell et al., 1990). It codes for a polypeptide of Mr 35,000, considerably smaller than the mammalian and yeast polypeptides. Nevertheless, its primary structure shows some identity with the eukaryotic enzymes. The molecular basis of how product inhibition is exerted in this bacterial glucokinase remains to be determined. Different mechanisms have evolved to regulate the activity of the glucose-phosphorylating enzymes that are not controlled by glucose 6-phosphate. Yeast hexokinase can be active as either a monomer or a dimer. The formation of the more active dimer is favoured by glucose and MgATP (see Sections III.1(a) and VII). However, yeast hexokinase B is strongly inhibited by high physiological concentrations of ATP (Kopetzki and Entian, 1985). At low, non-inhibitory concentrations of ATP a pronounced activation was observed by various metabolites such as 3-phosphoglycerate, malate, citrate and phosphate (Kosow and Rose, 1971).
The strong ATP inhibition occurred similarly in isoenzyme A (Kopetzki and Entian, 1985), which otherwise has not been studied in great detail. A second ATP-binding site, distinct from the active site, as detected in the crystal structure of the yeast enzyme (Steitz et al., 1977), may be responsible for the regulatory effect. No information is available about the regulation of yeast glucokinase.

For the control of the activity of mammalian glucokinase, a completely different mechanism has evolved. This liver enzyme can form a complex with a regulatory protein of Mr 60,000 (Van Schaftingen, 1989; Vandercammen and Van Schaftingen, 1990, 1991). This association, which results in inhibition of the enzyme by lowering its apparent affinity for glucose, is promoted by fructose 6-phosphate and antagonized by fructose 1-phosphate or agents that cause, in vivo, formation of this compound, such as fructose, sorbitol or D-glyceraldehyde (Fig. 43). The physiological relevance is that, in hepatocytes, fructose 6-phosphate is always in equilibrium with glucose 6-phosphate, because of the high glucosephosphate isomerase activity. As a result, glucokinase normally works at only 30% of its potential activity, unless fructose 1-phosphate is present. The formation of the latter intermediate during fructose metabolism in the liver will stimulate the enzyme, thus resulting in an increased conversion of glucose to both glycogen and lactic acid (Fig. 44) (Youn et al., 1986; Parniak and Kalant, 1988).

[Fig. 43: pathway diagram linking glycogen, glucose, glucose 6-phosphate, fructose 6-phosphate, fructose 1,6-bisphosphate, fructose, fructose 1-phosphate, sorbitol, D-glyceraldehyde and pyruvate, with glucokinase, its regulatory protein and glucose-6-phosphatase marked. The connecting arrows are not recoverable from the scan.]

FIG. 43. Metabolism of glucose, fructose and D-glyceraldehyde in mammalian liver (modified from Van Schaftingen et al., 1992a, b).

[Fig. 44: diagram showing active and inactive states of glucokinase, labelled with Fru-1-P and Fru-6-P. The drawing is not recoverable from the scan.]

FIG. 44. Model for the regulation of glucokinase activity in mammalian liver. The regulatory protein (R) may exist in two conformations, only one of which can form an inactive complex with glucokinase (GK). This conformation is promoted by the binding of fructose 6-phosphate. Fructose 1-phosphate, on the other hand, prevents inhibition of glucokinase by trapping the regulatory protein in the other conformation (modified from Van Schaftingen et al., 1992a, b).
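The qualitative behaviour of the model in Fig. 44 can be illustrated numerically. The following is a deliberately simplified sketch, not the published kinetic scheme: it assumes that fructose 6-phosphate and fructose 1-phosphate compete for the regulatory protein, that only the fructose 6-phosphate-bound form inhibits, that inhibition acts purely by raising the apparent K for glucose, and that the glucose dependence is hyperbolic. The function names and every constant are hypothetical placeholder values in arbitrary units.

```python
def inhibitory_fraction(f6p: float, f1p: float, k_f6p: float, k_f1p: float) -> float:
    """Fraction of regulatory protein in the glucokinase-binding form.

    Assumption: Fru-6-P and Fru-1-P compete for the regulatory protein and
    only the Fru-6-P-bound form can bind (and inhibit) glucokinase.
    """
    a = f6p / k_f6p
    b = f1p / k_f1p
    return a / (1.0 + a + b)


def glucokinase_rate(glucose, f6p, f1p, *, vmax=1.0, k_glc=8.0,
                     reg_total=1.0, k_i=0.2, k_f6p=0.05, k_f1p=0.001):
    """Relative rate when the regulator raises the apparent K for glucose.

    All default constants are hypothetical, chosen only for illustration.
    """
    active_regulator = reg_total * inhibitory_fraction(f6p, f1p, k_f6p, k_f1p)
    k_apparent = k_glc * (1.0 + active_regulator / k_i)
    return vmax * glucose / (glucose + k_apparent)


without_f1p = glucokinase_rate(5.0, f6p=0.05, f1p=0.0)
with_f1p = glucokinase_rate(5.0, f6p=0.05, f1p=0.05)
no_regulator = glucokinase_rate(5.0, f6p=0.05, f1p=0.0, reg_total=0.0)
print(without_f1p, with_f1p, no_regulator)
```

With these placeholder values the rate in the presence of fructose 1-phosphate lies between the inhibited rate and the rate with no regulatory protein, which is the direction of effect described in the text; the magnitudes carry no experimental meaning.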


The regulation of mammalian hexokinases by 2,3-bisphosphoglycerate and its physiological importance have been extensively discussed in Section VIII.

(b) Phosphofructokinase

In the glycolytic pathway phosphofructokinase is the enzyme with the most sophisticated regulatory mechanisms. An increase in sophistication occurred during evolution, as is evident when the control properties of the bacterial, yeast and mammalian enzymes are compared. This increase should, to a large extent, be attributed to the duplication-fusion event that must have happened early in the evolution of eukaryotes (see Section VI.1). The enzymes from the bacteria E. coli and B. stearothermophilus show very similar kinetics and allosteric control; both enzymes display cooperative binding to one substrate, fructose 6-phosphate, but not to the other substrate, ATP. ADP and GDP are allosteric activators; phosphoenolpyruvate is an allosteric inhibitor. Although both the yeast and mammalian phosphofructokinases are composed of polypeptides which are doublets of the prokaryotic polypeptides, their quaternary structures are quite different. Nevertheless their kinetics and control mechanisms differ only in detail. The mammalian enzymes are homotetrameric proteins, like their prokaryotic counterparts and like most other eukaryotic phosphofructokinases analyzed so far. Substrate and regulatory molecules can bind to each subunit. In contrast, the yeast enzyme is an octamer composed of four α and four β subunits which are clearly homologous. Their amino-acid sequences are 55% identical (Heinisch et al., 1989). The homology between these subunits suggests that a second gene duplication event occurred after the ancestors of Saccharomyces branched off from the main eukaryotic lineage.
Genetic (Lobo and Maitra, 1982; Clifton and Fraenkel, 1982) and biochemical (Laurent et al., 1978; Tijane et al., 1980, 1982) evidence indicates that the α subunits play a regulatory role, whereas the β subunits are responsible for the catalytic activity. Both the yeast and mammalian enzymes are regulated by a large variety of effectors, which are very similar for both enzymes (Sols, 1981). Among the new regulatory mechanisms that have evolved are the allosteric inhibition by ATP and the stimulation by AMP, which enable the cell to adapt the glycolytic flux to the ATP/AMP ratio in the cell. Fructose 1,6-bisphosphate has also become an allosteric activator, but only in the mammalian enzymes; it does not have any effect on the activity of the yeast enzymes (Vinuela et al., 1963; Bartrons et al., 1982). It has been proposed that, during mammalian evolution, the binding sites for the substrate fructose 6-phosphate and the activator ADP in the C-terminal half of the enzyme were converted into allosteric sites for fructose 1,6-bisphosphate and ATP (Poorman et al., 1984; see also Section VI.1). The development of regulatory sites from substrate binding sites has followed a somewhat different path during the evolution of the yeast enzyme: allosteric binding sites, for which the opposing effectors ATP and AMP compete, have only been located on subunit α (Laurent et al., 1978); a single substrate binding site for fructose 6-phosphate and a binding site for ATP as phospho donor were exclusively found in each subunit β (Tijane et al., 1980). Other important regulatory molecules, for both the mammalian and yeast phosphofructokinases, are the inhibitor citrate, allowing feedback from the citric acid cycle, and the activators AMP and phosphate. Many more molecules have been reported to affect the activity of phosphofructokinase. In his review of 1981, Sols listed 23 effectors of the mammalian enzyme.
A very potent allosteric stimulator of phosphofructokinase in many eukaryotes is fructose 2,6-bisphosphate (for reviews, see Hue and Rider, 1987; Van Schaftingen, 1987; Kemp and Marcus, 1990). It acts in synergism with AMP, and is able to relieve the inhibition that under physiological conditions usually would be exerted by ATP. Fructose 2,6-bisphosphate is active at micromolar concentrations. In the case of the mammalian enzyme, it binds at the same site as fructose 1,6-bisphosphate (another positive effector), but the enzyme has a higher affinity for the former compound. The different mammalian isoenzymes have different affinities for fructose 2,6-bisphosphate. Whereas the affinity of the liver (L-type) isoenzyme for fructose 2,6-bisphosphate is at least 100-fold greater than for fructose 1,6-bisphosphate, the difference is only 10-fold in the muscle (M-type) enzyme. Therefore, when the concentration of fructose 1,6-bisphosphate in muscle cells is high, e.g. during contraction, the effect of fructose 1,6-bisphosphate would prevail over that of fructose 2,6-bisphosphate. In addition to its effect on phosphofructokinase, fructose 2,6-bisphosphate is an inhibitor of the enzyme that catalyzes the opposite reaction, fructose 1,6-bisphosphatase. It thus controls both glycolysis and gluconeogenesis, and is able to prevent waste of energy by futile cycling of metabolites. However, in livers of fed rats some cycling may occur, because of the different sensitivity of the two enzymes to fructose 2,6-bisphosphate, and the fact that at high substrate concentration the inhibition of the phosphatase is overcome. This is presumably due to steric hindrance when the two sugar-bisphosphates bind to their partially overlapping binding sites.

Mechanisms have evolved in multicellular organisms to adjust the glycolytic and gluconeogenic fluxes to the needs of the whole organism. These mechanisms act at the level of phosphofructokinase and fructose 1,6-bisphosphatase, via fructose 2,6-bisphosphate, as well as at the pyruvate kinase level (Fig. 45; also see below). This control has been studied in most detail in liver cells (for reviews see Van Schaftingen, 1987; Pilkis et al., 1988). A rise in blood glucagon results, via the membrane-bound adenylate cyclase, in increased intracellular cAMP levels, activating the cAMP-dependent protein kinase in the hepatocytes. This kinase controls the steady-state concentration of fructose 2,6-bisphosphate by phosphorylation of the bifunctional enzyme fructose 6-phosphate-2-kinase/fructose 2,6-bisphosphatase. Phosphorylation inactivates the kinase activity, responsible for synthesis of fructose 2,6-bisphosphate, and stimulates its hydrolysis via an increased phosphatase activity. Insulin counteracts the effect of glucagon by decreasing the intracellular cAMP levels. Control of glycolysis by a variety of hormones and other extracellular factors, all acting via fructose 2,6-bisphosphate, has also been described for various other tissues.
However, the intermediate steps between the recognition of the extracellular signal and the change in the fructose 2,6-bisphosphate concentration are not always the same. Different cascade mechanisms and different tissue-specific isoenzymes of 6-phosphofructo-2-kinase/fructose 2,6-bisphosphatase are involved (for a review, see Hue and Rider, 1987).

[Fig. 45: scheme running from glucose through fructose 6-phosphate and PEP to pyruvate. The regulatory detail is not recoverable from the scan.]

FIG. 45. Diagrammatic representation of the coordinate regulation of the fluxes through the glycolytic and gluconeogenic pathways in mammalian liver. Abbreviations: 6-PF-1K, 6-phosphofructo-1-kinase; 6-PF-2K, 6-phosphofructo-2-kinase; F-1,6-Pase, fructose 1,6-bisphosphatase; F-2,6-Pase, fructose 2,6-bisphosphatase; PEP, phosphoenolpyruvate (modified from Pilkis et al., 1988).

It has also been reported that the glycolytic enzyme 6-phosphofructo-1-kinase from various mammalian tissues is a substrate for both the cAMP-dependent protein kinase and the Ca2+-activated, phospholipid-dependent protein kinase (protein kinase C). It would seem likely that phosphorylation would affect the kinetic properties of the enzyme. However, the evidence for a physiologically relevant regulation of the activity of this glycolytic enzyme by these protein kinases, triggered by α- and β-adrenergic hormones, is controversial.

Fructose 2,6-bisphosphate also plays a crucial role in the control of plant and yeast carbon metabolism, by allosteric activation of phosphofructokinase and inhibition of fructose 1,6-bisphosphatase (for reviews see Van Schaftingen, 1987; Macdonald and Buchanan, 1990; Holzer, 1990). In the cytosol of plants and in the phototrophic protist Euglena gracilis two heterologous phosphofructokinases are present: a PPi- and an ATP-dependent enzyme. Interestingly, it is the former enzyme that is activated by fructose 2,6-bisphosphate. When during photosynthesis the concentration of fructose 2,6-bisphosphate drops, as a result of the inhibition of its synthesizing enzyme by intermediates of photosynthetic carbon metabolism, the flux of glycolysis is reduced, and the flux through fructose 1,6-bisphosphatase is enhanced. In yeast, as in mammalian liver, the steady-state concentration of fructose 2,6-bisphosphate is dependent on the activity of 6-phosphofructo-2-kinase, which is controlled by a cAMP-dependent protein kinase. An increase in the cellular concentration of cAMP results therefore in an enhanced glycolytic flux. This is one aspect of the role of cAMP as messenger in the phenomenon of catabolite repression in yeast (François et al., 1984; Van Schaftingen, 1987).
In addition, the rise in cAMP level, triggered by the activity of hexokinase B, leads also to many other metabolic changes (see Section V.1(a)). Fructose 2,6-bisphosphate has been detected in most eukaryotes in which it was sought: animals, plants, fungi and several protists (Van Schaftingen et al., 1990). However, it does not appear to be present in some distantly related anaerobic protists (Fig. 46) (Mertens et al., 1989; Mertens, 1990). Furthermore, it has never been found in prokaryotes. In whatever organism the compound has been found, it seems to be involved in the regulation of glycolysis and/or gluconeogenesis. These data suggest that fructose 2,6-bisphosphate was already present as a regulatory molecule of carbohydrate metabolism in a common ancestor of most eukaryotes, when the lineages of the Trypanosomatids and Euglenoids diverged from the main eukaryotic branch of the phylogenetic tree (Sogin et al., 1986). The absence of fructose 2,6-bisphosphate from anaerobic protists, such as Entamoeba and Isotricha, but maybe also Giardia and Trichomonas, representatives of the earliest branches of the eukaryotic evolutionary tree, is presumably due to secondary loss in the course of evolution. One can argue that the most primitive function of fructose 2,6-bisphosphate was not the control of glycolysis via phosphofructokinase, but the regulation of gluconeogenesis by inhibition of fructose 1,6-bisphosphatase, since this is the most constant property in the present-day organisms. In Dictyostelium discoideum this remains the sole function of the molecule. In organisms belonging to other evolutionary branches fructose 2,6-bisphosphate may have acquired an additional effector function, stimulation of glycolysis, by acting on either ATP- or PPi-dependent phosphofructokinase or on pyruvate kinase.

[Fig. 46: phylogenetic tree with the branches animals, plants, Saccharomyces, Isotricha, Dictyostelium, Entamoeba, Naegleria, Trypanosoma, Euglena, Trichomonas, Giardia and Escherichia, annotated with the columns "Obligatory anaerobic glycolysis", "Presence of Fru-2,6-P2" and "Enzymes affected by Fru-2,6-P2" (stimulation, inhibition). The per-organism entries are not recoverable from the scan.]

FIG. 46. Occurrence and physiological role of fructose 2,6-bisphosphate throughout Nature. A phylogenetic tree, based on small subunit rRNA sequences, has been redrawn in simplified form from Sogin et al. (1986). Notes: (1) ATP-PFK and pyruvate kinase are unaffected; PPi-PFK is absent; (2) not determined; (3) FBPase I is absent. Abbreviations: F-2,6-P2, fructose 2,6-bisphosphate; PFK, phosphofructokinase; FBPase I, fructose 1,6-bisphosphatase (modified from Van Schaftingen et al., 1990).

(c) Pyruvate kinase

Pyruvate kinase is the major regulatory enzyme of the section of the glycolytic pathway catalyzing the flux from fructose 1,6-bisphosphate to pyruvate. In almost all organisms this enzyme is a tetramer composed of identical subunits, although some different structures have been described as well: a heterotetramer in castor oil seeds (Podesta and Plaxton, 1991), a homodimer in the bacterium Zymomonas mobilis (Pawluk et al., 1986) and a monomer in the green alga Selenastrum minutum (Knowles et al., 1989).
Nearly all tetrameric pyruvate kinases show positive cooperativity in binding the substrate phosphoenolpyruvate. In addition, the activity is often heterotropically regulated by a number of allosteric effectors. Two major patterns by which the enzyme activity is controlled can be distinguished among the prokaryotes. The enteric bacteria E. coli and Salmonella typhimurium each have two differently regulated isoenzymes (Malcovati and Kornberg, 1969; Garcia-Olalla and Garrido-Pertierra, 1987). These isoenzymes are encoded by different genes, and have different chemical and molecular properties (Garrido-Pertierra and Cooper, 1977, 1983; Valentini et al., 1979). The type I isoenzyme is allosterically activated by fructose 1,6-bisphosphate and other sugars bearing two phospho groups at roughly the same distance apart (Speranza et al., 1990). Interaction of these feed-forward activators with the enzyme results in an increased affinity for phosphoenolpyruvate and an increased velocity. Furthermore, this enzyme is susceptible to a synergistic cooperative feedback inhibition by succinyl-CoA and ATP, and to inhibition by phosphate. The type II isoenzyme is stimulated by AMP and by some sugar-monophosphates such as ribose 5-phosphate and glucose 6-phosphate. The physiological reason for the presence of two pyruvate kinases and their different regulation may be as follows. When cells are grown on a glycolytic substrate, the intracellular concentration of fructose 1,6-bisphosphate is high (Lowry et al., 1971), thus stimulating the activity of isoenzyme I. Under conditions where gluconeogenesis is predominant, the fructose 1,6-bisphosphate concentration is at a level at which the enzyme is virtually inactive, whatever the concentration of phosphoenolpyruvate. However, the phosphoenolpyruvate level measured under these latter conditions is such that isoenzyme II can catalyze the reaction at 30-40% of the maximum rate (Garcia-Olalla and Garrido-Pertierra, 1987).
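Positive cooperativity toward phosphoenolpyruvate, and activation that raises both the apparent affinity and the velocity, are commonly summarised with the Hill equation. The following is an illustrative sketch only: the choice of the Hill form, the function name and all numbers are assumptions made for demonstration, not measured parameters for any of the isoenzymes discussed here.

```python
def hill_rate(s: float, vmax: float, s_half: float, n: float) -> float:
    """Hill equation: v = vmax * s**n / (s_half**n + s**n).

    s_half is the substrate concentration giving half-maximal rate;
    n > 1 corresponds to positive cooperativity.
    """
    if s < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * s**n / (s_half**n + s**n)


# Hypothetical values: a feed-forward activator is modelled as lowering
# s_half (higher apparent affinity) and raising vmax.
pep = 0.2
basal = hill_rate(pep, vmax=1.0, s_half=1.0, n=2.0)
activated = hill_rate(pep, vmax=1.2, s_half=0.1, n=2.0)
print(f"basal={basal:.3f} activated={activated:.3f}")
```

At a substrate level well below the basal half-saturation point, a shift in `s_half` changes the rate far more than the change in `vmax` does, which is one way to picture why an enzyme of this kind can be nearly inactive without its activator.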
A decrease of the cellular energy charge would then, via the effector AMP, activate the enzyme and increase the ATP concentration. It should also be noted that in many bacteria the conversion of phosphoenolpyruvate into pyruvate is coupled to the uptake of glycolytic substrates, via the phosphoenolpyruvate-dependent sugar-phosphotransferase system. The phospho group of one of the phosphoenolpyruvate molecules derived from a glycolytic substrate is transferred via a number of phosphocarrier proteins to a sugar during its translocation across the cell membrane (Fig. 47). The high free energy of hydrolysis of phosphoenolpyruvate is used to drive this process.

FIG. 47. Schematic representation of the role of pyruvate kinase in the control of glycolytic flux and sugar uptake via the phosphoenolpyruvate-dependent sugar-phosphotransferase system in bacteria. Abbreviations: F-1,6-P2, fructose 1,6-bisphosphate; PEP, phosphoenolpyruvate.

Control of the pyruvate kinase activity should, therefore, also be important for sugar uptake. Indeed, analysis of glycolysis by NMR spectroscopy in a different bacterium, Streptococcus lactis, which contains only a type I-like pyruvate kinase, suggests that a major control of both the glycolytic flux and sugar transport is exerted by the energy status of the cell, via the concentration of the antagonistic effectors of pyruvate kinase, fructose 1,6-bisphosphate and phosphate (Thompson and Torchia, 1984). Many reports have been published describing the kinetic analysis of pyruvate kinase in other bacteria. In each species analyzed usually one enzyme has been detected. Some species contain an enzyme with kinetic characteristics similar to those of the type I isoenzyme of the Enterobacteriaceae, such as Streptococcus lactis, Streptococcus sanguis and Thermus thermophilus (Collins and Thomas, 1974; Crow and Pritchard, 1976; Yamada and Carlsson, 1975; Yoshizaki and Imahori, 1979). Other species contain a pyruvate kinase that shares its major kinetic properties with the type II isoenzyme, e.g. Azotobacter vinelandii, Bacillus stearothermophilus, Bacillus licheniformis, Streptococcus mutans, Mycobacterium smegmatis, Pseudomonas citronellolis and Halobacterium cutirubrum (Liao and Atkinson, 1971; Sakai et al., 1986; Tuominen and Bernlohr, 1971; Abbe and Yamada, 1982; Kapoor and Venkitasubramanian, 1981; Chuang and Utter, 1979; De Medicis et al., 1982). However, within each group of enzymes differences may be found with respect to, for example, the nature of the sugar phosphate that acts with the highest efficacy, and the susceptibility toward various additional effectors. Amino-acid sequences are available for the type I isoenzyme of E. coli and for the pyruvate kinase of B.
stearothermophilus, which has type II-like properties (Fig. 20; Section III.1(j)). These sequences are clearly homologous, suggesting that the two basic mechanisms by which the activity of pyruvate kinase can be controlled must have a long evolutionary history. The homotetrameric pyruvate kinases of eukaryotes exhibit, in general, cooperative binding of their substrate phosphoenolpyruvate, and allosteric activation by fructose 1,6-bisphosphate at micromolar concentrations. This has been observed for the enzyme of protists like Euglena gracilis (Ohrmann, 1969), fungi such as S. cerevisiae and Neurospora crassa (Hunsley and Suelter, 1969; Yun et al., 1976; Kapoor and Tronsgaard, 1972) and in
invertebrate and vertebrate animals (reviewed by Kayne, 1973; Imamura and Tanaka, 1982; Muirhead, 1990). The only presently known eukaryotic pyruvate kinase which shows hyperbolic Michaelis-Menten kinetics without any allosteric activation is the mammalian muscle (type M1) enzyme. This should, however, not be considered as an evolutionarily primitive trait, since this isoenzyme and its allosterically controlled counterpart M2, present in most tissues, are encoded by the same gene and are each synthesized after differential splicing of the precursor RNA (Noguchi et al., 1986; see Section VI.3). A somewhat different kinetic behaviour is also found in the trypanosome pyruvate kinase, where millimolar concentrations of fructose 1,6-bisphosphate are required to increase the affinity of the enzyme for its substrate (Flynn and Bowman, 1981). However, this enzyme appeared to be activated by fructose 2,6-bisphosphate at a 4000-fold lower concentration than that required for fructose 1,6-bisphosphate (Van Schaftingen et al., 1985; see also previous section). This unique form of control may be an adaptation to the compartmentalization of the glycolytic pathway in the protists belonging to the order Kinetoplastida (see Section V.1(b)). Feed-forward activation of pyruvate kinase by fructose 1,6-bisphosphate cannot play a physiological role in these organisms, because the enzymes phosphofructokinase and pyruvate kinase are not present in the same cell compartment; the former is within an organelle, called the glycosome, the latter in the cytosol. Since 6-phosphofructo-2-kinase and fructose 2,6-bisphosphatase are localized in the cytosol (Van Schaftingen et al., 1987), regulation of phosphofructokinase by fructose 2,6-bisphosphate is equally not feasible. Instead, the latter molecule appears to have taken over the role of controlling the activity of pyruvate kinase.
In addition to the activity-control mechanisms described above, many other specific kinetic features can be distinguished in pyruvate kinases of eukaryotes. ATP and high concentrations of phosphate are commonly found to inhibit the enzyme. Depending upon the source of the enzyme, various amino acids and intermediates of glycolysis and the Krebs cycle have been reported to modulate the enzyme activity. These effectors control, via pyruvate kinase, the fluxes through the different pathways of carbohydrate metabolism. The most sophisticated form of control has been developed in higher organisms. In mammals four isoenzymes have evolved. A gene duplication or a tetraploidization event (Comings, 1972; Ohno, 1973), early in mammalian evolution, must have resulted in the unlinked ancestral genes for the L/R- and M1/M2-type enzymes. Much more recently a duplication of a single exon occurred within both genes, so that in present-day organisms four different pyruvate kinase polypeptides can be synthesized (Noguchi et al., 1986, 1987). Expression of the pyruvate kinase genes and processing of their transcripts are tissue specific. Each of the four pyruvate kinase isoenzymes has kinetic properties optimally adapted to the role of the enzyme in the cells of the tissue where it is expressed. The M1 isoenzyme functions in tissues such as skeletal muscle, heart and brain where glycolysis plays a major role, and gluconeogenesis does not occur. Therefore, no elaborate regulation of the enzyme's activity is required. Evolution has abandoned several of the usual regulatory mechanisms in this particular enzyme, like cooperative binding of the substrate, and allosteric activation and inhibition by effector molecules. The most regulated isoenzyme is the L-type, which is predominant in tissues where gluconeogenesis can be very important, such as liver and kidney.
The enzyme shows sigmoidal kinetics with respect to its substrate, and is allosterically activated by fructose 1,6-bisphosphate and low pH, indicating that binding of this effector and of protons stabilizes the conformational state in which the enzyme has a high affinity for its substrate. The low-affinity conformation is promoted by ATP, alkaline pH and gluconeogenic amino acids such as alanine. Furthermore, the activity of the L-type isoenzyme can be down-regulated by phosphorylation, under control of hormones. Two types of hormonal control of the liver enzyme have been described (for a review, see Pilkis et al., 1988). First, increased levels of glucagon in the blood lead to phosphorylation of the enzyme via the cAMP-dependent protein kinase, similar to the modification of phosphofructokinase described above. Second, catecholamines, vasopressin and angiotensin II also promote phosphorylation of the enzyme, presumably via the Ca2+/calmodulin-dependent protein kinase. Phosphorylation by both kinases occurs at the serine residue in the
N-terminal extension specific to the L-type enzyme (indicated by a star symbol in Fig. 20). The phosphorylation results in inhibition of the enzyme's activity, by decreasing its affinity for phosphoenolpyruvate. Moreover, the enzyme becomes less susceptible to activation by fructose 1,6-bisphosphate and more sensitive to inhibition by alanine and ATP. Via its inhibitory effect on phosphofructokinase, glucagon lowers the cellular fructose 1,6-bisphosphate level as well, resulting in an additional decrease of the pyruvate kinase activity (Pilkis et al., 1988). The overall result of these hormonal effects is inhibition of glycolysis and an enhanced flux through the gluconeogenic pathway. The inhibitory action of glucagon is counteracted by insulin. Moreover, fructose 1,6-bisphosphate is able to inhibit the rate of the phosphorylation of pyruvate kinase catalyzed by the cAMP-dependent protein kinase (Claus et al., 1979). In addition, the activity of the M2 isoenzyme, which is found in many tissues, appears to be regulated by monomer-tetramer interconversion (Ashizawa et al., 1991a,b). This interconversion responds to changes in the extracellular concentration of glucose. In the physiological range of glucose levels, the majority of the enzyme exists in tetrameric form. Fructose 1,6-bisphosphate mediates the tetramer-monomer interconversion. The intracellular concentration of this compound drops concomitantly with a reduction of the extracellular glucose. Whether this mode of regulation also operates for the L- and R-type isoenzymes of pyruvate kinase remains to be determined. In vitro studies have also suggested that thyroid hormone T3 can affect the monomer-tetramer equilibrium of both M1- and M2-type pyruvate kinases, by specific association with the monomeric form (Kato et al., 1989; Parkison et al., 1991).
Whether this is a physiological function of the hormone is questionable, since no change in the oligomeric state of the enzyme was found in cultured cells when the hormone was added extracellularly.

2. Quantitative Analysis

The classical point of view about control of metabolic systems implied that a particular enzyme was rate-limiting and, therefore, had full control over the flux through a pathway. The identification of such rate-limiting steps, also called "bottlenecks" or "control points", is mainly based on qualitative information obtained by in vitro examination of purified enzymes. The control point of a pathway was inferred from the kinetic properties of the various isolated enzymes and the measured in vivo concentrations of the enzymes, substrates, intermediates and effectors. However, during the last 20 years the awareness has grown that such extrapolations are not sufficient to fully understand metabolic control. Quantitative analysis of the flux through the pathway of in vivo systems is also required. A theoretical framework for such quantitative studies was developed by Kacser and Burns (1973) and Heinrich and Rapoport (1974). The value of their concepts of control analysis (reviewed by Westerhoff et al., 1984; Kacser and Porteous, 1987) has been clearly demonstrated in experimental studies on various pathways, such as mitochondrial respiration (Groen et al., 1982; Tager et al., 1983), membrane-dependent energy transduction (Westerhoff and Arents, 1984), amino acid metabolism (Flint et al., 1981; Salter et al., 1986) and glycolysis (Rapoport et al., 1974, 1976; Poolman et al., 1987a). These studies showed unequivocally that control is not exerted by a single enzyme, but is shared by all steps in a pathway. Mathematically it was shown that the contribution of any individual step in the regulation of the flux through a pathway (i.e.
the so-called "flux control coefficient" of an enzyme) can be determined by measuring the fractional change in the flux induced by a fractional change in the concentration of the active enzyme concerned. The sum of the flux control coefficients of all enzymes in a pathway is equal to 1. Experimentally the concentration of active enzyme in an intact system can be modulated by specific inhibitors (Groen et al., 1982; Poolman et al., 1987a), by alteration of the physiological conditions (Salter et al., 1986; Poolman et al., 1987a, b) or by variation of the level of enzyme expression by recombinant DNA technology (Walsh and Koshland, 1985; Schaaff et al., 1989). When such control analysis was applied to the glycolytic pathway it appeared that the contribution of the various enzymes to the overall flux control was highly dependent on the organism, nature of the tissue and physiological conditions. For example, the glycolytic flux
in erythrocytes under normal, quasi-steady-state conditions is mainly controlled by the enzymes hexokinase, phosphofructokinase and ATPase (Rapoport et al., 1974, 1976; Heinrich and Rapoport, 1983). In contrast, in the bacteria Lactococcus lactis, Streptococcus sanguis and Streptococcus cremoris a major role in the control of glycolysis can be played by glyceraldehyde-phosphate dehydrogenase. This was shown by several experimental approaches. First, when glucose became the growth-limiting nutrient for continuously cultured S. sanguis cells, an enhanced glycolytic activity was observed which could be attributed to an increased synthesis of glyceraldehyde-phosphate dehydrogenase and the proteins of the phosphoenolpyruvate-dependent sugar-phosphotransferase system (Iwami and Yamada, 1985). Furthermore, when S. cremoris cells were starved for lactose, the capacity of the glycolytic pathway decreased rapidly. This reduction was accompanied by a decrease in the activities of glyceraldehyde-phosphate dehydrogenase and phosphoglycerate mutase, whereas the level of all other glycolytic enzymes remained essentially unchanged (Poolman et al., 1987b). In subsequent experiments with such starved, non-growing S. cremoris and L. lactis cells, the glycolytic activity was gradually inhibited by iodoacetate, an inhibitor of glyceraldehyde-phosphate dehydrogenase, to quantitate the contribution of this enzyme to the flux control in these cells; its flux control coefficient was estimated to be 0.9 under these conditions (Poolman et al., 1987a). From experiments with fermenting yeast it can also be concluded that the so-called key enzymes (hexokinase, phosphofructokinase and pyruvate kinase), with all their regulatory properties, do not necessarily dominate control of glycolysis. Schaaff et al. (1989) manipulated the amount of these three, and five other enzymes involved in glycolysis and alcohol fermentation in S. cerevisiae, by introduction of additional gene copies.
Increase (3.7- to 13.9-fold) of the in vivo specific activities of these enzymes, either alone or in pairs, did not result in an enhanced flux or in a change of the level of glycolytic intermediates. These results suggest that control is either shared by several of these enzymes, or that it is almost entirely exerted at the level of a step that has not been examined in these experiments, the transport of glucose. Although no quantitative data are available as yet about the contribution of substrate transport to the overall control of the glycolytic flux, it seems very likely that such transport plays a major regulatory role in many cell types, under many different conditions. This is inferred from numerous qualitative studies on the kinetic properties and energy-coupling mechanisms of the transporter proteins and the regulation of their biosynthesis. Moreover, in control analysis of a different pathway, i.e. the aromatic amino acid metabolism in rat liver, Salter et al. (1986) convincingly demonstrated that transport plays a major role in controlling the flux. A detailed discussion of these studies is outside the scope of this review. It should be stressed that the data from the above-cited quantitative studies on control of the glycolytic flux only hold for those cells under specific physiological conditions. Changes in these conditions can have profound effects on the biosynthesis and activity of the glycolytic enzymes, on the importance of glycolysis in the cell's metabolism and on its interactions with other pathways. It is thus obvious that changes in the environment of a cell will result in a redistribution of the control that each enzyme exerts on the glycolytic flux. All data presented above suggest that, with regard to control of glycolysis, evolution has been towards flexibility.
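The definition of the flux control coefficient and the summation property quoted in this section can be made concrete with a small numerical sketch. The pathway model below is a deliberately simple assumption and is not taken from any of the studies cited: the steady-state flux is treated as a driving force divided by a sum of per-step "resistances", each inversely proportional to the amount of active enzyme. The function names, the three-step pathway and all numbers are hypothetical. Each coefficient is estimated as the fractional change in flux divided by a small fractional change in one enzyme.

```python
def flux(enzymes, resistances, driving_force=1.0):
    """Steady-state flux of a toy linear pathway (illustrative assumption).

    Each step contributes a resistance r_i / e_i, so adding more of an
    enzyme lowers the resistance of its own step only.
    """
    return driving_force / sum(r / e for r, e in zip(resistances, enzymes))


def flux_control_coefficient(index, enzymes, resistances, rel_step=1e-4):
    """Fractional change in flux per fractional change in one enzyme."""
    base = flux(enzymes, resistances)
    perturbed = list(enzymes)
    perturbed[index] *= 1.0 + rel_step
    return ((flux(perturbed, resistances) - base) / base) / rel_step


# Hypothetical three-step pathway in which the middle step dominates.
enzymes = [1.0, 1.0, 1.0]
resistances = [0.5, 9.0, 0.5]

coefficients = [
    flux_control_coefficient(i, enzymes, resistances)
    for i in range(len(enzymes))
]
total = sum(coefficients)

# Doubling an enzyme with a small coefficient barely changes the flux.
flux_before = flux(enzymes, resistances)
flux_after = flux([2.0, 1.0, 1.0], resistances)
```

In this toy case the middle step carries a coefficient close to 0.9 and the three coefficients add up to approximately 1, while doubling the first enzyme raises the flux by only a few per cent. The numbers were picked to echo the kind of result described above and do not reproduce any measured data.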
In all phylogenetic lineages, evolutionary forces must have led to the development of a set of enzymes that by their expression patterns, kinetic properties and regulatory couplings enabled each cell to optimally adjust its glycolytic flux to its metabolic requirements under the physiological conditions it encounters.

3. Glycolytic Complexes

Association of glycolytic enzymes into multiprotein assemblies has also been proposed as a mechanism to control glycolytic flux (reviewed by Clarke et al., 1985; Srivastava and Bernhard, 1985; Srere, 1987). Such associations would be formed either as soluble complexes in the cytosol, or as assemblies on subcellular structural components such as membranes, microtubules or actin filaments. Basically two hypotheses have been put forward to invoke such enzyme-enzyme interactions for the regulation of the glycolytic flux. First,
(multi)enzyme complexes would promote the direct channelling of glycolytic intermediates and coenzymes between the enzymes of the pathway, without equilibration of these compounds with the bulk cytosolic compartment (Srivastava and Bernhard, 1984; Srere, 1987). Secondly, Ovadi (1988) proposed that subunits of sequential glycolytic enzymes are involved in a dynamic process of alternately forming homologous and heterologous oligomeric structures. The nature of these associations would depend on the metabolic state of the cell. Heterologous associations would favour channelling of intermediates. Moreover, the differently composed oligomers would differ in enzymic activity and consequently contribute to the control of the flux. The formation of complexes between sequential glycolytic enzymes, and their assembly on subcellular structural components, have been described for several cell types (reviewed by Clarke et al., 1985; Srivastava and Bernhard, 1985; Srere, 1987). In particular, the possible roles of actin in muscle cells and the plasma membrane in erythrocytes in the assembly of the glycolytic enzymes have received much attention (Salhany and Gaines, 1981; Hofer et al., 1987). The in vivo occurrence of such complexes was inferred from observed associations during purification procedures and from in vitro binding studies. Moreover, the results of in vitro kinetic experiments have been interpreted as an indication that glycolytic intermediates and the coenzyme NADH were directly transferred between the associated enzymes without prior release to the reaction medium (Ovadi and Keleti, 1978; Srivastava and Bernhard, 1984; Srivastava et al., 1989). However, serious doubts have been raised about the existence of glycolytic multienzyme complexes in vivo, under the physiological cellular conditions of high ionic strength and high enzyme concentrations (Maretzki et al., 1989; Brooks and Storey, 1991).
Moreover, the kinetic data could also be explained without invoking channelling of intermediates between the enzymes (Kvassman et al., 1988; Kvassman and Pettersson, 1989a, b; Wu et al., 1991). Theoretical studies of metabolic control also indicated that the overall glycolytic flux would remain essentially unaffected if metabolic channelling occurred (Kvassman and Pettersson, 1989b; Maretzki et al., 1989). Therefore, it should be concluded that no compelling evidence exists for control of glycolytic flux by the formation of a multienzyme complex and direct metabolite transfer, although such interactions have convincingly been demonstrated for other metabolic processes. Nevertheless, the reversible formation of binary enzyme complexes or the reversible association of individual glycolytic enzymes with structural proteins may affect the activity of the enzymes (Brooks and Storey, 1991). For example, the activity of muscle phosphofructokinase is increased upon binding to actin. This activation resulted partly from a shift of the inactive dimeric form into the active tetrameric species, partly from an allosteric stimulation of the enzyme when bound to actin (Brooks and Storey, 1991 ). The association of a glycolytic enzyme with another protein could also be a mechanism to control the flux by holding the enzyme in a less active form. This is exemplified by the mammalian M1- and M2-type pyruvate kinases, which are inhibited as a result of the binding of thyroid hormone to their monomeric form (Kato et al., 1989; Parkison et al., 1991; Ashizawa et al., 1991a; but see discussion in Section X.1 (c)). A possible form of homologous association that may affect the glycolytic rate has been described for the mammalian phosphofructokinase. This tetrameric enzyme can undergo an effector-mediated aggregation to even larger forms that are less sensitive to ATP inhibition (Hofer et al., 1987). 
A common strategy of many organisms to sustain a high glycolytic flux is to increase the synthesis of their glycolytic enzymes. This is most clearly illustrated by yeast; the glycolytic enzymes can comprise over 30% of its soluble cell protein; glyceraldehyde-phosphate dehydrogenases can make up approximately 5% of the cell dry weight (Wills, 1990). A more efficient manner to increase the glycolytic flux is compartmentalization. This is observed in trypanosomes, where the seven enzymes responsible for the conversion of glucose into phosphoglycerate are sequestered in an organelle, the glycosome (Opperdoes and Borst, 1977; see also Section V.1(b)). When these organisms live as parasites in the mammalian bloodstream, they perform glycolysis at an extremely high rate: twice the maximal value reported for yeast. However, glycosomes do not contain more than 3.7% of the total cellular protein,
although the glycolytic enzymes comprise 90% of the organelle's protein content (Misset et al., 1986). The enzymes within the organelle do not form a multienzyme complex and channelling of glycolytic intermediates could not be observed (Visser and Opperdoes, 1980; Visser et al., 1981; Aman et al., 1985; Hammond et al., 1985; Opperdoes, 1987). The high flux is presumably possible because (1) the key enzymes hexokinase and phosphofructokinase lack some of the usual regulatory properties (Nwagwu and Opperdoes, 1982), (2) several of the usual feedback circuits are precluded by the compartmentalization and (3) the high specific activities of the enzymes within the organelle may reduce the transient time of the system (Misset et al., 1986).

XI. GLYCOLYTIC ENZYMES WITH OTHER FUNCTIONS

Several examples can be found in the literature of proteins with functions not related to glycolysis, which after purification or molecular cloning, turned out to be identical or highly similar to glycolytic enzymes. In some cases the evidence is rather convincing that glycolytic enzymes can have additional metabolic or even structural functions in the cell. However, for other cases, it would still be desirable to have a more rigorous proof by testing the properties of recombinant proteins or by mutational analysis. Neuroleukin is a neurotrophic growth factor found in large amounts in mammalian muscle, brain, heart and kidney. Gurney et al. (1986) described the purification of mouse neuroleukin and the molecular cloning, sequencing and expression of its cDNA in cultured monkey cells. The serum-free supernatant of the transfected cells was shown to have the same biological properties as the protein purified from mouse salivary gland. The sequence of this growth factor turned out to be identical to the sequence of mouse glucosephosphate isomerase (see Fig. 5) (Chaput et al., 1988; Faik et al., 1988).
Several glycolytic enzymes have surprisingly been identified as parasite surface antigens that confer protective immunity against the parasite. Schistosomes are snail-transmitted worms that spend a major part of their life cycle in the mammalian bloodstream. Because the parasite lives in the blood and uses glucose as its sole energy source, glycolysis is essential for its survival. Independent research groups have characterized a number of apparently protective antigens and identified three of them as fructose 1,6-bisphosphate aldolase, glyceraldehyde-phosphate dehydrogenase and triosephosphate isomerase. The best documented example concerns glyceraldehyde-phosphate dehydrogenase (Goudot-Crozel et al., 1989). Sera from subjects with low susceptibility to infection reacted with a major surface antigen with a molecular mass of 37,000, against which sera from susceptible individuals show little reactivity. The antigen was purified, and its cDNA cloned and sequenced; it has a strong similarity with glyceraldehyde-phosphate dehydrogenase from other organisms (e.g. 72% identity with the human enzyme; Fig. 13(a) and Matrix 6(a)). Antibodies against the recombinant protein identified the surface antigen of the worm. Similarly, in humans that have been infected with the malaria parasite Plasmodium falciparum, antibodies directed against an antigen with a molecular mass of 41,000 can be detected. This antigen plays a role in protective immunity and is probably expressed at the parasite's surface. The antigen has been affinity-purified and its gene has been isolated from an expression library. Both gene sequencing and activity measurements of the purified protein identified the antigen as aldolase (Certa et al., 1988; Fig. 8(a)). Some proteins with the capacity to bind DNA and RNA and to activate transcription have been found to have structural and enzymic properties of glyceraldehyde-phosphate dehydrogenase (e.g. see Perucho et al., 1977; Ryazanov, 1985; Morgenegg et al., 1986).
The use of impure protein preparations could well explain some of these observations. Glyceraldehyde-phosphate dehydrogenase is, in most cell types, a very abundant protein with affinity for acidic proteins and single-stranded nucleic acids (Caswell and Corbett, 1985; Perucho et al., 1980). It is presumably the NAD+-binding site of the enzyme that is responsible for this affinity. Meyer-Siegler et al. (1991) reported that the sequence of a cDNA clone for human uracil DNA glycosylase, an enzyme involved in DNA repair, is completely identical to that of glyceraldehyde-phosphate dehydrogenase. When expressed in E. coli, this cDNA directed the synthesis of a protein with the glycosylase
activity. Studies with purified protein revealed that the glycosylase activity is specific for monomeric glyceraldehyde-phosphate dehydrogenase. No such activity is found with the tetrameric form characteristic of the glycolytic enzyme. Other activities claimed for glyceraldehyde-phosphate dehydrogenase are catalysis and regulation of the assembly and maintenance of specific subcellular structures. For example, in vitro experiments with pure tubulin from brain and pure enzyme suggest that tetrameric glyceraldehyde-phosphate dehydrogenase catalyzes the bundling of microtubules, whereas the ATP-dependent dissociation of the enzyme into dimers results in unbundling (Huitorel and Pantaloni, 1985). Further, various reports have described a specific but variable interaction of glyceraldehyde-phosphate dehydrogenase with the plasma membrane of erythrocytes and other haemopoietic cells (e.g. Allen et al., 1987). It has been suggested that the enzyme mediates the anchoring of cytoskeleton elements to the membrane (Huitorel and Pantaloni, 1985). In skeletal muscle the enzyme is autophosphorylated by ATP. The phosphoenzyme behaves as a kinase; the phospho group is transferred to specific membrane proteins involved in the formation of triad junctions from transverse tubules and terminal cisternae (Kawamoto and Caswell, 1986). Searches of sequence databases frequently turn up surprising relationships, and phosphoglycerate mutase and fructose 2,6-bisphosphatase provide a clear example of two proteins that unexpectedly share a common ancestor. The isolation and sequence determination of an active site phosphohistidine peptide from the bifunctional enzyme that catalyzes the synthesis and degradation of fructose 2,6-bisphosphate revealed striking sequence similarities to an active site peptide from phosphoglycerate mutase (Pilkis et al., 1987).
More extensive comparisons gave convincing evidence that the C-terminal bisphosphatase portion of the bifunctional enzyme (residues 250-470) could be aligned with phosphoglycerate mutase (Bazan et al., 1989). A CLUSTAL alignment of the bisphosphatase with the mutase sequences (not shown) demonstrates that about one quarter of its residues are identical to the mutase (Matrix 8). Examination of the N-terminal kinase portion of the bifunctional enzyme (residues 1-249) reveals that it possesses a typical nucleotide-binding motif. However, its sequence is not similar enough to other kinases, including a glycolytic phosphofructokinase, to be able to deduce possible evolutionary divergence. The sequence information therefore provides compelling evidence that the bifunctional enzyme that synthesizes and degrades fructose 2,6-bisphosphate evolved by fusion of a phosphoglycerate mutase gene and a kinase gene. Further evidence comes from functional studies that show that the bisphosphatase can be phosphorylated by 1,3-bisphosphoglycerate (Tauler et al., 1987). However, the bisphosphatase is sufficiently different that it has no phosphoglycerate mutase, 2,3-bisphosphoglycerate synthase or 2,3-bisphosphoglycerate phosphatase activities. Not surprisingly, phosphoglycerate mutase cannot be phosphorylated by the relatively large fructose 2,6-bisphosphate molecule, and is therefore unable to act as a phosphatase toward it. The homology between the mutase and the bisphosphatase suggested that it might be interesting to ascertain if any other phosphohistidine enzymes show sequence similarities. The sequence of yeast phosphoglycerate mutase was compared with a number of enzymes, and some similarities were revealed in the cases of pyrophosphatase and acid phosphatase (Fothergill-Gilmore and Watson, 1989). It seems likely that these enzymes have diverged from a common ancestor after gene duplication events. 
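Statements such as "about one quarter of its residues are identical" rest on counting matching positions in an alignment. A minimal sketch of that count is given below. It is not the procedure used by CLUSTAL or by the authors: the function name is hypothetical, the two short aligned strings are invented for illustration and are not real mutase or bisphosphatase sequences, and the convention assumed here (columns containing a gap in either sequence are ignored) is only one of several denominators in common use.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over gap-free aligned columns.

    Both arguments must be rows of the same alignment, i.e. strings of
    equal length in which gaps are marked with the `gap` character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            identical += 1

    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Invented toy alignment, for illustration only.
toy_a = "ACDE-FGH"
toy_b = "ACDQWF-H"
toy_identity = percent_identity(toy_a, toy_b)
```

For the toy alignment, six columns are gap-free and five of them match, giving an identity of about 83%. Because the choice of denominator changes the figure, identity values from different studies are only comparable when the same convention was used.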
Two functions have been attributed to enolase in addition to its normal catalytic activity. First, enolase plays a structural role in the eye lens. A major lens protein of lampreys, some fishes, reptiles and birds is τ-crystallin. This protein and α-enolase appear to be the products of the same gene (Wistow et al., 1988). The crystallin possesses some enolase activity, but this is greatly reduced, probably because of age-related post-translational modification. The recruitment of existing proteins for use as crystallins in eye lenses seems to be a general evolutionary strategy of Nature. Other proteins used are, for example, a heat shock protein, argininosuccinate lyase and lactate dehydrogenase (for a review see De Jong et al., 1989). The other additional role that enolase may fulfil is the acquisition of thermal tolerance in yeast (Iida and Yahara, 1985). One of the proteins (hsp 48), induced in Saccharomyces cerevisiae at
elevated temperatures and upon entry into the resting state (Go) of the cell cycle, and constitutively present in a heat-shock resistant mutant, is identical to enolase I. The level of this enolase isoenzyme had previously been shown to be significantly increased in cells grown to stationary phase, which also have increased resistance to heat (McAlister and Holland, 1982). How enolase contributes to the increased thermotolerence is not yet clear. The synthesis of other glycolytic enzymes is not significantly altered under stress conditions. It may be relevant in this respect that the N-terminal sequence of enolases shows extensive sequence similarity to a region conserved amongst all eukaryotic hsp 70 proteins and the functionally similar E. coli dnaK protein (Iida and Yahara, 1985). Nature seems sometimes to employ a single protein for different functions. For glycolytic enzymes with multiple functions it seems reasonable to assume that the non-glycolytic function was acquired secondarily. Glycolysis is a very ancient, housekeeping process, catalyzed by enzymes which in part were already present in the common evolutionary ancestor of all major Phyla (see Section IV). In contrast, each of the non-glycolyticfunctions of these enzymes has only been described for certain organisms or certain cell types. Moreover, they can usually be considered as so-called luxury functions. Why highly specialized proteins such as glycolytic enzymes have sometimes acquired additional functions remains a matter of speculation. It is unlikely that a mere drive to economic use of the genome should be invoked. A glycolytic enzyme that suddenly turned out to have additional properties, thereby accidentally endowing a selective advantage to cells living under certain conditions, seems more probable. Such an advantage may also have become important to a certain cell type after minor mutations, leaving intact the glycolytic enzyme's activity. 
Alternatively, an advantage may have been conferred when the glycolytic enzyme was produced at a different subceUular location, in different amounts or at a different time. Whatever the origin of the multiple functions of glycolytic enzymes, it is obvious that the further evolution of these proteins must have been influenced by selective forces that acted on each function (see Section III.2(a)). Some of the examples presented above indicate that, in some cases, selection has led to really bifunctional glycolytic proteins. Other options used by evolution are (1) proteins of which different oligomeric states exert different roles, (2) proteins in which post-translational modification contribute to specific functions and (3) differently specialized isoproteins encoded by separate but homologous genes.

XII. CONCLUSIONS

Our aims when we embarked on writing this review were twofold. The glycolytic pathway is unique in terms of the wealth of structural information available, and we wished to gather this into a single document. We then planned to use the structural information as a basis for considering how the individual enzymes and the pathway as a whole may have evolved. This final section is a summary of the main points that have emerged from the extensive comparisons that are now possible.

The tertiary structures of all 10 of the enzymes of the glycolytic pathway display a superficial similarity. They are all variations on a common theme of a core of mostly parallel β-strands surrounded by α-helices. However, a detailed examination of the topologies and the sequences gives little indication that the pathway may have arisen by divergence of the present-day enzymes from a smaller number of primitive unspecialized ancestors. There is thus no convincing evidence for an ancestral kinase or nucleotide-binding enzyme, although of course the divergence events may have been sufficiently long ago as to be no longer perceptible at the sequence level. The striking structural similarities such as occur in the eight-strand β-barrel domains appear to be a consequence of convergence to stable structures that are particularly well suited to binding the negatively charged ligands of the glycolytic pathway.

The phylogenetic distribution of the enzymes of the glycolytic pathway confirms that it is indeed an ancient metabolic pathway, and was present (at least in part) when eukaryotes and prokaryotes diverged about 1800 million years ago. A reasonably convincing scenario for how the pathway may have assembled comes from a consideration of the distribution of glycolytic enzymes in eubacteria and archaebacteria. Many eubacteria do not have complete glycolytic pathways and possess only the "trunk pathway" from glyceraldehyde-phosphate dehydrogenase to pyruvate kinase. In these organisms glucose is degraded to glyceraldehyde-phosphate by several alternative pathways. In some archaebacteria even less of the glycolytic pathway is present; the thermoacidophiles, for example, have only enolase and pyruvate kinase. It thus seems plausible that glycolysis may have assembled from the bottom up, essentially as a reversal of gluconeogenesis. This evolutionary mechanism would have given rise to glucose catabolic pathways with greater and greater net yields of ATP molecules. The evolution of phosphofructokinase was a critical and relatively late step in enabling the complete present-day glycolytic sequence to take place.

Further evidence that glycolysis has arisen by a chance assembly of independently evolving enzymes comes from the fact that many of the glycolytic enzymes exist in alternative non-homologous forms in different (or even the same) organisms. It would seem likely in these cases that ancestral organisms possessed two quite different enzymes that could catalyze the same reaction and that in some cases one or the other form was subsequently lost. The plant and protist kingdoms appear to be particularly rich in alternative forms and further investigation may well be rewarding. It is not difficult to envisage the development of specific drugs against pathogenic protists that exploit the differences between the alternative non-homologous enzymes.

Undoubtedly the most common way that individual glycolytic enzymes have evolved is by gene duplication. Once duplication has taken place, one copy is released from functional constraints and can acquire a new repertoire of properties by mutations in its DNA sequence. Isoenzymes occur widely, and sequences are available for different isoenzyme forms of all of the glycolytic enzymes, with the exception of triosephosphate isomerase. In animals it seems likely that many of the isoenzymes arose at the two tetraploidization events postulated to have taken place during vertebrate evolution. In the cases of hexokinase and phosphofructokinase, gene duplication was followed by fusion to give rise to double-size enzymes. Pyruvate kinase provides a good example of the duplication of a single exon to give rise to different isoenzymes by tissue-specific alternative RNA splicing. It is clear that a duplicated phosphoglycerate mutase gene was fused with a kinase gene to create the bifunctional enzyme which synthesizes and degrades fructose 2,6-bisphosphate.

Genes encoding glycolytic enzymes provide some of the best evidence that introns are primitive features which have been present since the very early stages in evolution. In particular, the gene for a plant chloroplast glyceraldehyde-phosphate dehydrogenase gives compelling testimony that introns were present in prokaryotes at the time of the endosymbiotic event that gave rise to chloroplasts.

Glycolytic enzymes are among the most highly conserved enzymes known. In general, about 5% of the residues of a glycolytic enzyme change every 100 million years. The most conservative enzyme is glyceraldehyde-phosphate dehydrogenase, with only 3% of its catalytic domain changing in the same period of time. These values can be compared with that of another highly conserved enzyme, dihydrofolate reductase, which is changing at the rate of about 13% every 100 million years. Hexokinase and phosphofructokinase are strikingly more variable than the other glycolytic enzymes, and this undoubtedly relates to the trend in the evolution of glycolysis toward increasingly elaborate control properties. In vertebrates it appears that muscle- and brain-specific isoenzymes are more highly conserved than liver isoenzymes.
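To put the rates quoted in this paragraph on a common footing, the sketch below converts a percentage change per 100 million years into the fraction of residues expected to have changed over a longer interval. It is an illustration only, under a deliberately simple assumption: each 100-million-year interval changes the stated fraction of the residues that are still unchanged, with no back-mutation and no rate variation along a single lineage. The function name `fraction_changed` and the use of the 1800-million-year prokaryote-eukaryote divergence time mentioned earlier in this section are choices made for the example, not a calculation from the original analysis.

```python
def fraction_changed(rate_per_100_my: float, elapsed_my: float) -> float:
    """Expected fraction of residues changed after `elapsed_my` million years.

    Assumed model: in every 100-million-year interval a fixed fraction
    `rate_per_100_my` of the still-unchanged residues changes.
    """
    if not 0.0 <= rate_per_100_my <= 1.0:
        raise ValueError("rate must be a fraction between 0 and 1")
    if elapsed_my < 0:
        raise ValueError("elapsed time must be non-negative")
    intervals = elapsed_my / 100.0
    return 1.0 - (1.0 - rate_per_100_my) ** intervals


# Rates taken from the text; the labels are shorthand for this example.
RATES = {
    "typical glycolytic enzyme": 0.05,
    "glyceraldehyde-phosphate dehydrogenase (catalytic domain)": 0.03,
    "dihydrofolate reductase": 0.13,
}
ELAPSED_MY = 1800.0  # divergence time quoted earlier in this section

for name, rate in RATES.items():
    print(f"{name}: {fraction_changed(rate, ELAPSED_MY):.0%} of residues changed")
```

Under this assumed model the ranking of the three enzymes is preserved over any interval, but the absolute fractions depend strongly on the no-back-mutation simplification and should not be read as measured divergences.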
It is possible that the muscle and brain enzymes are constrained to maintain both catalytic function and cytoskeletal binding sites, whereas the liver enzymes must be able to respond to a metabolically much more variable environment.

A major aspect of the evolution of the control of glycolysis resides with hexokinase and phosphofructokinase. Both enzymes have undergone gene duplication/fusion events that have given rise to double-size enzymes that have an increased repertoire of effector molecules. It seems clear that the evolutionary drive has been toward metabolic flexibility and responsiveness. In other words, these are convincing examples of evolution toward evolvability. By these means a wide range of signals can control flux, can switch metabolism between catabolism and anabolism, and can prevent futile cycling. Typical signals include feedback inhibition, allosteric effectors, regulatory proteins and covalent modification. In general, the exact mechanisms do not correlate particularly well with phylogeny. It is possible that many of the control mechanisms are of relatively recent origin and have arisen independently in different lineages.

The structural studies on glycolytic enzymes undoubtedly constitute a remarkable achievement. This impressive body of information has enabled our understanding of many aspects of enzyme structure, activity and evolution to stride forward. A number of the gaps in our knowledge are now readily apparent, and it is to be hoped that the near future will see progress in the characterization of plant, protist and archaebacterial enzymes, and in the quantitative aspects of the control of glycolysis.

ACKNOWLEDGEMENTS

We would like to take this opportunity to thank the many members of our research groups who have contributed over the years either directly or indirectly to this review. In particular we thank F. R. Opperdoes, S. Allert and N. Chevalier for help with the sequence alignments, matrices and phylogenetic trees, and D. Rigden for help with the graphics of aldolase. We are grateful to a number of colleagues for atomic coordinates prior to publication: C. Davies, H. Muirhead, H. C. Watson and R. K. Wierenga. We thank F. R. Opperdoes and E. Van Schaftingen for critical reading of the manuscript. We are pleased to acknowledge financial support for our research on glycolytic enzymes, principally from the Science and Engineering Research Council, the Wellcome Trust, the UNDP/World Bank/WHO Special Programme for Research and Training in Tropical Diseases, and the Science and Technology for Development programme from the Commission of the EEC. Our grateful thanks to F. Van de Calseyde-Mylle for secretarial assistance.

REFERENCES

ABBE, K. and YAMADA, T. (1982) J. Bacteriol. 149, 299-305.

ACHARI, A., MARSHALL, S. E., MUIRHEAD, H., PALMIERI, R. H. and NOLTMANN, E. A. (1981) Phil. Trans. R. Soc. Lond. B 293, 145-157.
ALBER, T. and KAWASAKI, G. (1982) J. molec. Appl. Genet. 1, 419-434.
ALBER, T., BANNER, D. W., BLOOMER, A. C., PETSKO, G. A., PHILLIPS, D. C., RIVERS, P. S. and WILSON, I. A. (1981) Phil. Trans. R. Soc. Lond. B 293, 159-171.
ALBERY, W. J. and KNOWLES, J. R. (1976) Biochemistry 15, 5631-5640.
ALBIG, W. and ENTIAN, K.-D. (1988) Gene 73, 141-152.
ALEFOUNDER, P. R. and PERHAM, R. N. (1989) Molec. Microbiol. 3, 723-732.
ALEFOUNDER, P. R., BALDWIN, S. A., PERHAM, R. N. and SHORTS, N. J. (1989) Biochem. J. 257, 529-534.
ALLEN, R. W., TRACH, K. A. and HOCK, J. A. (1987) J. biol. Chem. 262, 649-653.
ALLERT, S., ERNEST, I., POLISZCZAK, A., OPPERDOES, F. R. and MICHELS, P. A. M. (1991) Eur. J. Biochem. 200, 19-27.
AMAN, R. A., KENYON, G. L. and WANG, C. C. (1985) J. biol. Chem. 260, 6966-6973.
ANDERSON, C. M., STENKAMP, R. E. and STEITZ, T. A. (1978) J. molec. Biol. 123, 15-33.
ANDREONE, T. L., PRINTZ, R. L., PILKIS, S. J., MAGNUSON, M. A. and GRANNER, D. K. (1989) J. biol. Chem. 264, 363-369.
ARNOLD, H. and PETTE, D. (1970) Eur. J. Biochem. 15, 360-366.
ARORA, K. K., FANCIULLI, M. and PEDERSEN, P. L. (1990) J. biol. Chem. 265, 6481-6488.
ASHIZAWA, K., McPHIE, P., LIN, K.-H. and CHENG, S.-Y. (1991a) Biochemistry 30, 7105-7111.
ASHIZAWA, K., WILLINGHAM, M. C., LIANG, C.-M. and CHENG, S.-Y. (1991b) J. biol. Chem. 266, 16842-16846.
BALDWIN, S. A. and PERHAM, R. N. (1978) Biochem. J. 169, 643-652.
BANKS, R. D., BLAKE, C. C. F., EVANS, P. R., HASER, R., RICE, D. W., HARDY, G. W., MERRETT, M. and PHILLIPS, A. W. (1979) Nature 279, 773-777.
BANNER, D. W., BLOOMER, A. C., PETSKO, G. A., PHILLIPS, D. C., POGSON, C. I., WILSON, I. A., CORRAN, P. H., FURTH, A. J., MILMAN, J. D., OFFORD, R. E., PRIDDLE, J. D. and WALEY, S. G. (1975) Nature 255, 609-614.
BANNER, D. W., BLOOMER, A. C., PETSKO, G. A., PHILLIPS, D. C. and WILSON, I. A. (1976) Biochem. biophys. Res. Commun. 72, 145-155.
BARNELL, W. O., CHLOE YI, K. and CONWAY, T. (1990) J. Bacteriol. 172, 7227-7240.
BARTLETT, G. R. (1970) Adv. exp. Med. Biol. 6, 245-255.
BARTRONS, R. and CARRERAS, J. (1982) Biochim. biophys. Acta 708, 167-177.
BARTRONS, R., VAN SCHAFTINGEN, E., VISSERS, S. and HERS, H.-G. (1982) FEBS Lett. 143, 137-140.
BAZAN, J. F., FLETTERICK, R. J. and PILKIS, S. J. (1989) Proc. natn. Acad. Sci. U.S.A. 86, 9642-9646.
BENHAM, F. J., HODGKINSON, S. and DAVIES, K. E. (1984) EMBO J. 3, 2635-2640.
BENNETT, W. S. and STEITZ, T. A. (1978) Proc. natn. Acad. Sci. U.S.A. 75, 4848-4852.

BENNETT, W. S. and STEITZ, T. A. (1980a) J. molec. Biol. 140, 183-209.
BENNETT, W. S. and STEITZ, T. A. (1980b) J. molec. Biol. 140, 211-230.
BERNSTEIN, F. C., KOETZLE, T. F., WILLIAMS, G. J. B., MEYER, E. F., BRICE, M. D., RODGERS, J. R., KENNARD, O., SHIMANOUCHI, T. and TASUMI, M. (1977) J. molec. Biol. 112, 535-542.
BERTHIAUME, L., LOISEL, T. P. and SYGUSCH, J. (1991) J. biol. Chem. 266, 17099-17150.
BISHOP, J. G. and CORCES, V. G. (1990) Nucleic Acids Res. 18, 191.
BLAKE, C. (1983) Nature 306, 535-537.
BLAKELEY, S. D., PLAXTON, W. C. and DENNIS, D. T. (1990) Plant molec. Biol. 15, 665-669.
BLÄTTLER, W. A. and KNOWLES, J. R. (1980) Biochemistry 19, 738-743.
BLOUQUIT, Y., CALVIN, M.-C., ROSA, R., PROMÉ, D., PROMÉ, J.-C., PRATBERNOU, F., COHEN-SOLAL, M. and ROSA, J. (1988) J. biol. Chem. 263, 16906-16910.
BLOXHAM, D. P. and LARDY, H. A. (1973) The Enzymes, Vol. 8, pp. 239-278 (ed. P. D. BOYER). Academic Press, New York.
BOER, P. H., ADRA, C. N., LAU, Y.-F. and McBURNEY, M. W. (1987) Molec. Cell. Biol. 7, 3107-3112.
BOWEN, D., LITTLECHILD, J. A., FOTHERGILL, J. E., WATSON, H. C. and HALL, L. (1988) Biochem. J. 254, 509-517.
BRADY, S. T. and LASEK, R. J. (1981) Cell 23, 515-523.
BRANLANT, G. and BRANLANT, C. (1985) Eur. J. Biochem. 150, 61-66.
BRANLANT, G., OSTER, I. and BRANLANT, C. (1989) Gene 75, 145-155.
BRÄNDÉN, C.-I. (1980) Q. Rev. Biophys. 13, 317-338.
BREATHNACH, R. and KNOWLES, J. R. (1977) Biochemistry 16, 3054-3060.
BREITBART, R. E., ANDREADIS, A. and NADAL-GINARD, B. (1987) A. Rev. Biochem. 56, 467-495.
BREWER, G. J. (1969) Biochim. biophys. Acta 192, 157-161.
BREWER, G. J. and EATON, J. W. (1971) Science 171, 1205-1211.
BREWER, G. J. and WEBER, G. (1968) Proc. natn. Acad. Sci. U.S.A. 59, 216-223.
BREWER, G. J., OELSHLEGEL, F. J., MOORE, L. G. and NOBLE, N. A. (1974) Ann. N.Y. Acad. Sci. 241, 513-523.
BRINKMAN, H., MARTINEZ, P., QUIGLEY, F., MARTIN, W. and CERFF, R. (1987) J. molec. Evol. 26, 320-328.
BRINKMAN, H., CERFF, R., SALOMON, M. and SOLL, J. (1989) Plant molec. Biol. 13, 81-94.
BRITTON, H. G., CARRERAS, J. and GRISOLIA, S. (1971) Biochemistry 10, 4522-4533.
BROOKS, S. P. J. and STOREY, K. B. (1991) FEBS Lett. 278, 135-138.
BRUNS, G. A. and GERALD, P. S. (1976) Science 192, 54-56.
BUEHNER, M., FORD, G. C., MORAS, D., OLSEN, K. W. and ROSSMANN, M. G. (1974) J. molec. Biol. 90, 25-49.
BUNN, H. F. (1971) Science 172, 1049-1050.
BUNN, H. F., SEAL, U. S. and SCOTT, A. F. (1974) Ann. N.Y. Acad. Sci. 241, 498-512.
BURGESS, D. G. and PENHOET, E. E. (1985) J. biol. Chem. 260, 4604-4614.
BURKE, R. L., TEKAMP-OLSON, P. and NAJARIAN, R. (1983) J. biol. Chem. 258, 2193-2201.
CALI, L., FEO, S., OLIVA, D. and GIALLONGO, A. (1990) Nucleic Acids Res. 18, 1893.
CARLISLE, S. M., BLAKELEY, S. D., HEMMINGSEN, S. M., TREVANION, S. J., HIYOSHI, T., KRUGER, N. J. and DENNIS, D. T. (1990) J. biol. Chem. 265, 18366-18371.
CARRERAS, J., MEZQUITA, J., BOSCH, J., BARTRONS, R. and PONS, G. (1982) Comp. Biochem. Physiol. 71B, 591-597.
CASAL, J. I., AHERN, T. J., DAVENPORT, R. C., PETSKO, G. A. and KLIBANOV, A. M. (1987) Biochemistry 26, 1258-1264.
CASWELL, A. H. and CORBETT, A. M. (1985) J. biol. Chem. 260, 6892-6898.
CERFF, R. (1982) Methods in Chloroplast Molecular Biology, pp. 683-694 (eds M. EDELMAN, R. B. HALLICK and N. K. CHUA). Elsevier/North Holland, Amsterdam.
CERFF, R. and KLOPPSTECH, K. (1982) Proc. natn. Acad. Sci. U.S.A. 79, 7624-7628.
CERTA, U., GHERSA, P., DOBELI, H., MATILE, H., KOCHER, H. P., SHRIVASTAVA, I. K., SHAW, A. R. and PERRIN, L. H. (1988) Science 240, 1036-1038.
CHAPUT, M., CLAES, V., PORTETELLE, D., CLUDTS, I., CRAVADOR, A., BURNY, A., GRAS, H. and TARTAR, A. (1988) Nature 332, 454-455.
CHENG, J., MIELNICKI, L. M., PRUITT, S. C. and MAQUAT, L. E. (1990) Nucleic Acids Res. 18, 4261.
CHEVALIER, C., SAILLARD, C. and BOVE, J. M. (1990) J. Bacteriol. 172, 2693-2703.
CHOI, G. H. and NUSS, D. L. (1990) Nucleic Acids Res. 18, 5566.
CHUANG, D. T. and UTTER, M. F. (1979) J. biol. Chem. 254, 8434-8441.
CICCARESE, S., TOMMASI, S. and VONGHIA, G. (1989) Biochem. biophys. Res. Commun. 165, 1337-1344.
CLARKE, F., STEPHAN, P., MORTON, D. and WEIDEMANN, J. (1985) Regulation of Carbohydrate Metabolism, pp. 1-31 (ed. R. BEITNER). CRC Press, Boca Raton, U.S.A.
CLAUS, T. H., EL-MAGHRABI, M. R. and PILKIS, S. J. (1979) J. biol. Chem. 254, 7855-7864.
CLAYDEN, D. A. (1989) Ph.D. Thesis, University of Bristol.
CLEMENTS, J. M. and ROBERTS, C. F. (1986) Gene 44, 97-105.
CLIFTON, D. and FRAENKEL, D. G. (1982) Biochemistry 21, 1935-1942.
COATES, J. L. (1975) J. molec. Evol. 6, 285-307.
COLLINS, L. B. and THOMAS, T. D. (1974) J. Bacteriol. 120, 52-58.
COMINGS, D. E. (1972) Nature 238, 455-457.
CONWAY, T. and INGRAM, L. O. (1988) J. Bacteriol. 170, 1926-1933.
CONWAY, T., SEWELL, G. W. and INGRAM, L. O. (1987) J. Bacteriol. 169, 5653-5662.
COOPER, R. A. (1986) Carbohydrate Metabolism in Cultured Cells, pp. 461-491 (ed. M. J. MORGAN). Plenum Press, New York.
CORBIER, C., CLERMONT, S., BILLARD, P., SKARZYNSKI, T., BRANLANT, C., WONACOTT, A. and BRANLANT, G. (1990) Biochemistry 29, 7101-7106.
CORRAN, P. H. and WALEY, S. G. (1975) Biochem. J. 145, 335-344.
CROW, V. L. and PRITCHARD, G. G. (1976) Biochim. biophys. Acta 438, 90-101.
DANSON, M. J. (1988) Adv. Microbiol. Physiol. 29, 165-231.
DANSON, M. J. (1989) Can. J. Microbiol. 35, 58-64.

DAVIDSON, B. E., SAJGO, M., NOLLER, H. F. and HARRIS, J. I. (1967) Nature 216, 1181-1185.
DAYHOFF, M. O. (1978) Atlas of Protein Sequence and Structure, Vol. 5, Suppl. 3, National Biomedical Research Foundation, Silver Spring, MD, U.S.A.
DE GRAAFF, L. H. (1989) Ph.D. Thesis, University of Wageningen.
DE GRAAFF, L. H. and VISSER, J. (1988) Curr. Genet. 14, 553-560.
DE JONG, W. W., HENDRIKS, W., MULDERS, J. W. M. and BLOEMENDAL, H. (1989) Trends Biochem. Sci. 14, 365-368.
DE MEDICIS, E., LALIBERTÉ, J.-F. and VASS-MARENGO, J. (1982) Biochim. biophys. Acta 708, 57-67.
DENNIS, D. T. and MIERNYK, J. A. (1982) A. Rev. Plant Physiol. 33, 27-50.
DERECHIN, M., RUSTUM, Y. M. and BARNARD, E. A. (1972) Biochemistry 11, 1793-1797.
DOELLE, H. W. (1982) Eur. J. Appl. Microbiol. Biotechnol. 14, 241-246.
DOOLITTLE, R. F. (1986) Of URFs and ORFs: A Primer on How to Analyze Derived Amino Acid Sequences, pp. 103, University Science Books, Mill Valley, California.
DOOLITTLE, R. F., ANDERSON, K. L. and FENG, D. F. (1989) The Hierarchy of Life, pp. 73-85 (eds B. FERNHOLM, K. BREMER and H. JORNVALL). Elsevier Science Publishers BV, Amsterdam.
DOOLITTLE, R. F., FENG, D. F., ANDERSON, K. L. and ALBERRO, M. R. (1990) J. molec. Evol. 31, 383-388.
DOVER, G. A. (1987) J. molec. Evol. 26, 47-58.
DUGAICZYK, A., HARON, J. A., STONE, E. M., DENNISON, O. E., ROTHBLUM, K. N. and SCHWARTZ, R. J. (1983) Biochemistry 22, 1605-1613.
DUHM, J. (1975) Biochim. biophys. Acta 385, 68-80.
DUNAWAY, G. A. (1983) Molec. Cell. Biochem. 52, 75-91.
DUNAWAY, G. A. and KASTEN, T. P. (1986) Biochem. J. 237, 157-161.
DUNAWAY, G. A. and KASTEN, T. P. (1987) Biochem. J. 242, 667-671.
DUNCAN, D., McGOWAN, J. and PRICE, N. C. (1989) Biochem. Soc. Trans. 17, 560-561.
ENTIAN, K.-D. (1986) Microbiol. Sci. 3, 366-371.
ENTIAN, K.-D. and FRÖHLICH, K.-U. (1984) J. Bacteriol. 158, 29-35.
ENTIAN, K.-D., FRÖHLICH, K.-U. and MECKE, D. (1984) Biochim. biophys. Acta 799, 181-186.
ENTIAN, K.-D., MEURER, B., KÖHLER, H., MANN, K.-H. and MECKE, D. (1987) Biochim. biophys. Acta 923, 214-221.
EVANS, P. R., FARRANTS, G. W. and HUDSON, P. J. (1981) Phil. Trans. R. Soc. Lond. B 293, 53-62.
EVANS, P. R., FARRANTS, G. W. and LAWRENCE, M. C. (1986) J. molec. Biol. 191, 713-720.
FABRY, S. and HENSEL, R. (1988) Gene 64, 189-197.
FABRY, S., LANG, J., NIERMANN, T., VINGRON, M. and HENSEL, R. (1989) Eur. J. Biochem. 179, 405-413.
FABRY, S., HEPPNER, P., DIETMAIER, W. and HENSEL, R. (1990) Gene 91, 19-25.
FAIK, P., WALKER, J. I. H., REDMILL, A. A. M. and MORGAN, M. J. (1988) Nature 332, 455-457.
FALLER, L. D. and JOHNSON, A. M. (1974) Proc. natn. Acad. Sci. U.S.A. 71, 1083-1087.
FALLER, L. D., BAROUDY, B. M., JOHNSON, A. M. and EWALL, R. X. (1977) Biochemistry 16, 3864-3869.
FARBER, G. K. and PETSKO, G. A. (1990) Trends Biochem. Sci. 15, 228-234.
FERRI, G., STOPPANI, M., MELONI, M. L., ZAPPONI, M. C. and IADAROLA, P. (1990) Biochim. biophys. Acta 1041, 36-42.
FISCHER, S., LUCZAK, H. and SCHLEIFER, K. H. (1982) FEMS Microbiol. Lett. 15, 103-108.
FITCH, W. M. and MARGOLIASH, E. (1967) Science 155, 279-284.
FLINT, H. J., TATESON, R. W., BARTELMESS, I. B., PORTEOUS, D. J., DONACHIE, W. D. and KACSER, H. (1981) Biochem. J. 200, 231-246.
FLYNN, I. W. and BOWMAN, I. B. R. (1981) Molec. Biochem. Parasitol. 4, 95-106.
FORT, P., MARTY, L., PIECHACZYK, M., EL SABROUTY, S., DANI, C., JEANTEUR, P. and BLANCHARD, J. M. (1985) Nucleic Acids Res. 13, 1431-1442.
FOTHERGILL-GILMORE, L. A. (1986) Multidomain Proteins: Structure and Evolution, pp. 85-174 (eds D. G. HARDIE and J. R. COGGINS). Elsevier Science Publishers BV, Amsterdam.
FOTHERGILL-GILMORE, L. A. and WATSON, H. C. (1989) Adv. Enzymol. Related Areas Mol. Biol. 62, 227-313.
FOURNIER, A., FLEER, R., YEH, P. and MAYAUX, J. F. (1990) Nucleic Acids Res. 18, 365.
FRANCOIS, J., VAN SCHAFTINGEN, E. and HERS, H.-G. (1984) Eur. J. Biochem. 145, 187-193.
FREEMONT, P. S., DUNBAR, B. and FOTHERGILL-GILMORE, L. A. (1988) Biochem. J. 249, 779-793.
FRENCH, B. A. and CHANG, S. H. (1987) Gene 54, 65-71.
FROMAN, B. E., TAIT, R. C. and GOTTLIEB, L. D. (1989) Molec. Gen. Genet. 217, 126-131.
FUNG, K. and CLAYTON, C. (1991) Molec. Biochem. Parasitol. 45, 261-264.
GAMBLIN, S. J., COOPER, B., MILLAR, J. R., DAVIES, G. J., LITTLECHILD, J. A. and WATSON, H. C. (1990) FEBS Lett. 262, 282-286.
GARCIA-OLALLA, C. and GARRIDO-PERTIERRA, A. (1987) Biochem. J. 241, 573-581.
GARRIDO-PERTIERRA, A. and COOPER, R. A. (1977) J. Bacteriol. 129, 1208-1214.
GARRIDO-PERTIERRA, A. and COOPER, R. A. (1983) FEBS Lett. 162, 420-422.
GEHNRICH, S. C., GEKAKIS, N. and SUL, H. S. (1988) J. biol. Chem. 263, 11755-11759.
GIALLONGO, A., FEO, S., MOORE, R., CROCE, C. M. and SHOWE, L. C. (1986) Proc. natn. Acad. Sci. U.S.A. 83, 6741-6745.
GIBSON, W. C., SWINKELS, B. W. and BORST, P. (1988) J. molec. Biol. 201, 315-325.
GO, M. (1981) Nature 291, 90-92.
GOLDMAN, G. H., VILLAROEL, R., VAN MONTAGU, M. and HERRERO-ESTRELLA, A. (1990) Nucleic Acids Res. 18, 6717.
GORAJ, K., RENARD, A. and MARTIAL, J. A. (1990) Protein Engng 3, 259-266.
GOTTESDIENER, K., GARCIA-ANOVEROS, J., GWO-SHU LEE, M. and VAN DER PLOEG, L. H. T. (1990) Molec. Cell. Biol. 10, 6079-6083.
GOTZ, F., FISCHER, S. and SCHLEIFER, K. H. (1980) Eur. J. Biochem. 108, 295-301.
GOUDOT-CROZEL, V., CAILLOL, D., DJABALI, M. and DESSEIN, A. J. (1989) J. exp. Med. 170, 2065-2080.
GOULD, S. J., KELLER, G.-A., SCHNEIDER, M., HOWELL, S. H., GARRARD, L. J., GOODMAN, J. M., DISTEL, B., TABAK, H. and SUBRAMANI, S. (1990) EMBO J. 9, 85-90.

GRISOLIA, J., DIEDERICH, D. and GRISOLIA, S. (1970) Biochem. biophys. Res. Commun. 41, 1238-1243.
GROEN, R. K., WANDERS, R. J. A., WESTERHOFF, H. V., VAN DER MEER, R. and TAGER, J. M. (1982) J. biol. Chem. 257, 2754-2757.
GURNEY, M. (1987) EMBL/GenBank/DDBJ Nucleotide Sequences Databases, accession number K03515.
GURNEY, M. E., HEINRICH, S. P., LEE, M. R. and YIN, H. S. (1986) Science 234, 566-574.
HALDANE, J. B. S. (1965) Origins of Pre-biological Systems and of Their Molecular Matrices, pp. 11-23 (ed. S. W. FOX). Academic Press, New York.
HALL, E. R. and COTTAM, G. L. (1978) Int. J. Biochem. 9, 785-793.
HAMMOND, D. J., AMAN, R. A. and WANG, C. C. (1985) J. biol. Chem. 260, 15646-15654.
HANAUER, A. and MANDEL, J. L. (1984) EMBO J. 3, 2627-2633.
HARLOS, K., VAS, M. and BLAKE, C. F. (1992) Proteins: Structure, Function Gen. 12, 133-144.
HARRIS, J. I. and PERHAM, R. N. (1968) Nature 219, 1025-1028.
HARRIS, J. I. and WATERS, M. (1976) The Enzymes, Vol. 13, pp. 1-49 (ed. P. D. BOYER). Academic Press, New York.
HATCH, M. D. and SLACK, C. R. (1968) Biochem. J. 106, 141-146.
HECHT, R. M., GARZA, A., LEE, Y.-H., MILLER, M. D. and PISEGNA, M. A. (1989) Nucleic Acids Res. 17, 10213.
HEINISCH, J., RITZEL, R. G., VON BORSTEL, R. C., AQUILERA, A., RODICIO, R. and ZIMMERMANN, F. K. (1989) Gene 78, 309-321.
HEINRICH, R. and RAPOPORT, S. M. (1983) Biochem. Soc. Trans. 11, 31-35.
HEINRICH, R. and RAPOPORT, T. A. (1974) Eur. J. Biochem. 42, 89-95.
HELLINGA, H. W. and EVANS, P. R. (1985) Eur. J. Biochem. 149, 363-373.
HENSEL, R., ZWICKL, P., FABRY, S., LANG, J. and PALM, P. (1989) Can. J. Microbiol. 35, 81-85.
HESTER, G., BRENNER-HOLZACH, O., ROSSI, F. A., STRUCK-DONATZ, M., WINTERHALTER, K. H., SMIT, J. D. G. and PIONTEK, K. (1991) FEBS Lett. 292, 237-242.
HIDAKA, S., KADOWAKI, K., TSUTSUMI, K. and ISHIKAWA, K. (1990) Nucleic Acids Res. 18, 3991.
HIGGINS, D. G. and SHARP, P. M. (1988) Gene 73, 237-244.
HOFER, H. W., GERLACH, G., BIRKEL, G. and KIRSCHENLOHR, H. L. (1987) Biochem. Soc. Trans. 15, 982-984.
HOFMANN, E. (1976) Rev. Physiol. Biochem. Pharmac. 75, 1-68.
HOL, W. G. J., VAN DUIJNEN, P. T. and BERENDSEN, H. J. C. (1978) Nature 273, 443-446.
HOLLAND, J. P. and HOLLAND, M. J. (1979) J. biol. Chem. 254, 9839-9845.
HOLLAND, J. P., LABIENIEC, L., SWIMMER, C. and HOLLAND, M. J. (1983) J. biol. Chem. 258, 5291-5299.
HOLLAND, M. J., HOLLAND, J. P., THILL, G. P. and JACKSON, K. A. (1981) J. biol. Chem. 256, 1385-1395.
HOLZER, H. (1990) Fructose 2,6-bisphosphate, pp. 219-227 (ed. S. J. PILKIS). CRC Press, Boca Raton, U.S.A.
HONKA, E., FABRY, S., NIERMANN, T., PALM, P. and HENSEL, R. (1990) Eur. J. Biochem. 188, 623-632.
HORECKER, B. L., TSOLAS, O. and LAI, C. Y. (1972) The Enzymes, Vol. 7, pp. 213-258 (ed. P. D. BOYER). Academic Press, New York.
HOROWITZ, N. H. (1965) Evolving Genes and Proteins, pp. 15-26 (eds V. BRYSON and H. J. VOGEL). Academic Press, New York.
HUANG, X.-Y., BARRIOS, L. A. M., VONKHORPORN, P., HONDA, S., ALBERTSON, D. G. and HECHT, R. M. (1989) J. molec. Biol. 206, 411-424.
HUE, L. and RIDER, M. H. (1987) Biochem. J. 245, 313-324.
HUISMAN, T. H. J., ADAMS, H. R., DIMMOCK, M. O., EDWARDS, W. E. and WILSON, J. B. (1967) J. biol. Chem. 242, 2534-2541.
HUISMAN, T. H. J., BRANDT, G. and WILSON, J. B. (1968) J. biol. Chem. 243, 3675-3686.
HUITOREL, P. and PANTALONI, D. (1985) Eur. J. Biochem. 150, 265-269.
HUNSLEY, J. R. and SUELTER, C. H. (1969) J. biol. Chem. 244, 4815-4818.
HUSE, K., JERGIL, B., SCHWIDOP, W. D. and KOPPERSCHLAGER, G. (1988) FEBS Lett. 234, 185-188.
IIDA, H. and YAHARA, I. (1985) Nature 315, 688-690.
IMAMURA, K. and TANAKA, T. (1982) Meth. Enzymol. 90, 150-165.
IMURA, T., UTATSU, I. and TOH-E, A. (1987) Agric. biol. Chem. Tokyo 51, 1641-1647.
INOUE, H., NOGUCHI, T. and TANAKA, T. (1986) Eur. J. Biochem. 154, 465-469.
IWAMI, Y. and YAMADA, T. (1985) Infect. Immunol. 50, 378-381.
JOH, K., MUKAI, T., YATSUKI, H. and HORI, K. (1985) Gene 39, 17-24.
JOHNSON, C. M. and PRICE, N. C. (1986) Biochem. J. 236, 617-620.
JOSEPH, D., PETSKO, G. A. and KARPLUS, M. (1990) Science 249, 1425-1428.
JOULIN, V., PEDUZZI, J., ROMEO, P.-H., ROSA, R., VALENTIN, C., DUBART, A., LAPEYRE, B., BLOUQUIT, Y., GAREL, M.-C., GOOSSENS, M., ROSA, J. and COHEN-SOLAL, M. (1986) EMBO J. 5, 2275-2283.
KACSER, H. and BEEBY, R. (1984) J. molec. Evol. 20, 38-51.
KACSER, H. and BURNS, J. A. (1983) Symp. Soc. exp. Biol. 27, 65-107.
KACSER, H. and PORTEOUS, J. W. (1987) Trends Biochem. Sci. 12, 5-14.
KAPOOR, M. and TRONSGAARD, T. M. (1972) Can. J. Microbiol. 18, 805-815.
KAPOOR, R. and VENKITASUBRUMANIAN, T. A. (1981) Biochem. J. 193, 435-440.
KASLOW, D. C. and HILL, S. (1990) J. biol. Chem. 265, 12337-12341.
KATO, H., FUKUDA, T., PARKISON, C., McPHIE, P. and CHENG, S.-Y. (1989) Proc. natn. Acad. Sci. U.S.A. 86, 7861-7865.
KATZEN, H. M. and SCHIMKE, R. T. (1965) Proc. natn. Acad. Sci. U.S.A. 54, 1218-1220.
KAWAMOTO, R. M. and CASWELL, A. H. (1986) Biochemistry 25, 656-661.
KAYNE, F. J. (1973) The Enzymes, Vol. 8, pp. 353-385 (eds P. D. BOYER and E. G. KREBS). Academic Press, New York.
KELLEY, P. M. and TOLAN, D. R. (1986) Plant Physiol. 82, 1076-1080.
KEMP, R. B., FOX, R. W. and LATSHAW, S. P. (1987) Biochemistry 26, 3443-3446.
KEMP, R. G. and MARCUS, F. (1990) Fructose 2,6-bisphosphate, pp. 17-37 (ed. S. J. PILKIS). CRC Press, Boca Raton, U.S.A.
KENDALL, G., WILDERSPIN, A. W. F., ASHALL, F., MILES, M. A. and KELLY, J. M. (1990) EMBO J. 9, 2751-2758.

KLUGE, M. and OSMOND,C. B. (1971) Naturwissenschaften 58, 414-415. KNAPP, B., HUNDT, E. and KOPPER, H. A. (1990) Molec. Biochem. Parasitol. 40, 1-12. KNOWLES,V. L., DENNIS,O. T. and PLAXTON,W. C. (1989) FEBS Lett. 259, 130--132. KOLB, P., HARRIS,J. I. and BRIDGES,J. (1974) Biochem. J. 137, 185-197. KOPETZKI, E. and ENTIAN,K.-D. (1985) Fur. J. Biochem. 146, 657-662. KOSow, D. P. and ROSE,I. A. (1971) J. biol. Chem. 246, 2618-2625. KRAMER,J. M. and ERICKSON,R. P. (1981) Dev. Biol. 87, 37-45. KRISHNAN,G. and ALTEKAR,W. (1991) Fur. J. Biochem. 195, 343-350. KULKARN1,G., JAGANNATHARAO,G. S., SRINIVASAN,B. G., HOFER,H. W., YUAN,P. M. and HARRIS,B. G. (1987) J. biol. Chem. 262, 32-34. KUKITA, A., MUKAI,T., MIYATA,T. and HORI, K. (1988) Eur. J. Biochem. 171, 471-478. KUNDROT,C. E. and EVANS,P. R. (1991) Biochemistry 30, 1478-1484. KVASSMAN,J. and PETTERSSON,G. (1989a) Fur. J. Biochem. 186, 261-264. KVASSMAN,J. and PETTERSSON,G. (1989b) Eur. J. Biochem. 186, 265-272. KVASSMAN,J., PETTERSSON,G. and RYDE-PETTERSSON,U. (1988) Eur. J. Biochem. 172, 427-431. LADROR, O. S., GOLLAPUDI,L., TRIPATHI, R. L., LATSHAW,S. P. and KEMP, R. G. (1991) J. biol. Chem. 266, 16550-16555. LAMBEIR,A.-M., OPPERDOES,F. R. and WIE~ENGA,R. K. (1987) Eur. J. Biochem. 168, 69-74. LAMBEIR,A.-M., LOISEAU,A. M., KUNTZ, D. A., VELLn~UX,F. M., MIcrmLS, P. A. M. and OPPERDOES,F. R. (1991) Fur. J. Biochem. 198, 429-435. LAURENT, M., CHAFFOTTE,A. F., TENU, J.-P., Roucous, C. and SEYDOUX,F. J. (1978) Biochem. biophys. Res. Commun. 80, 646~52. LEBIOOA,L. and SrEC, B. (1988) Nature 333, 683-686. LEBIODA,J., STEC, B. and BREWER,J. M. (1989) J. biol. Chem. 264, 3685-3693. LEaO, R. V., TOLAN,D. R., BRUCE,B. D., CI-mUNG,M.-C. and KAN, Y. W. (1985) Cytometry 6, 478-483. LE BOULCH,P., JOULIN,V., GAREL,M.-C., ROSA,J. and COHEN-SOLAL,M. (1988) Biochem. biophys. Res. Commun. 156, 874-881. LEE, C.-P., KAO, M.-C., FRENCH,B. A., PUTNEY,S. D. and CHANG,S. H. (1987) J. biol. Chem. 262, 4195--4199. 
LEVANON,D., DANClCER,E., DAENI,N., BERNSTEIN,Y., ELSON, A., MOENS,W., BRANDEIS,M. and GROr~R, Y. (1989) DNA 8, 733-743. LIAO, C.-L. and ATKINSON,D. E. (1971) J. Bacteriol. 1116, 37-44. LIAUD, M.-F., ZHANG,D. X. and CERFE,R. (1990) Proc. natn. Acad. Sci. U.S.A. 87, 8918-8922. LIVELY,M. O., EL-MAGHRAB1,M. R., PILKIS,J., D'ANGELO,G., COLOS1A,A. D., CIAVOLA,J.-A., FRASER,B. A. and PILKIS, S. J. (1988) J. biol. Chem. 263, 839-849. LOBO, Z. and MAITRA,P. K. (1982) FEBS Lett. 139, 93-96. LOLLS,E., DAVENPORT,R. C., ROSE,D., HARTMAN,F. C. and PETSKO,G. A. (1990) Biochemistry 29, 6609-6618. LONBERG,N. and GILBERT,W. (1983) Proc. nam. Acad. Sci. U.S.A. 80, 3661-3665. LONE, Y.-C., SIMON,M.-P., KAHN, A. and MARtE, J. (1986) FEBS Lett. 195, 97-100. LONGSTAEE,M., RAINES,C. A., McMORROW,E. M., BRADBEER,J. W. and DYER,T. A. (1989) Nucleic Acids Res. 17, 6569-6580. LOOKER,D., ABBOTT-BROWN,D., COZART,P., DURFEE,S., HOFFMAN,S., MATTHEWS,A. J., MILLER-ROEHmCH,J., SHOEMAKER,S., TRIMBLE,S., FERMI,G., KOMYAMA,N. H., NAGAI,K. and STETTLER,G. L. (1992) Nature 356, 258-260. LOWRY, O. H., CARTER,J., WARD, J. B. and GLASER,L. (1971) J. biol. Chem. 246, 6511-6521. MA, H., BLOOM,L. M., WALSH,C. T. and BOTSTEIN,D. (1989) Molec. Cell. Biol. 9, 5643-5649. MACDONALD,F. D. and BUCHANAN,B. B. (1990) Fructose 2,6-bisphosphate, pp. 193-210 (ed. S. J. PILKIS). CRC Press, Boca Raton, U.S.A. MALCOVATI,M. and KORNBERG,H. L. (1969) Biochim. biophys. Acta 178, 420-423. MALEK, A. A., SUTER,F. X., FRANK,G. and BRENNER-HOLZACH,O. (1985) Biochem. biophys. Res. Commun. 126, 199-205. MARCHAND, M., POLISZCZAK,A., GIBSON, W. C., WIERENGA,R. K., OPPERDOES,F. R. and MICHELS, P. A. M. (1988) Molec. Biochem. Parasitol. 29, 65-76. MARCHAND, M., KOOYSTRA,U., WIERENGA,R. K., LAMBEIR,A.-M., VAN BEEUMEN,J., OPPERDOES, F. R. and MICHELS, P. A. M. (1989) Fur. J. Biochem. 184, 455-464. MARCHIONNI,M. and GILBERT,W. (1986) Cell 46, 133-141. MARETZKI,n., RE1MANN,B. and RAPOPORT,S. i . (1989) Trends Biochem. Sci. 14, 93-96. 
MARSH, J. J. and LEBHERZ, H. G. (1992) Trends Biochem. Sci. 17, 110-113.
MARTIN, W. and CERFF, R. (1986) Eur. J. Biochem. 159, 323-331.
MARTINEZ, P., MARTIN, W. and CERFF, R. (1989) J. molec. Biol. 208, 551-565.
MAQUAT, L. E., CHILCOTE, R. and RYAN, P. M. (1985) J. biol. Chem. 260, 3748-3753.
MASTERS, C. (1984) J. Cell. Biol. 99, 222s-225s.
MATSUOKA, M., OZEKI, Y., YAMAMOTO, N., HIRANO, H., KANO-MURAKAMI, Y. and TANAKA, Y. (1988) J. biol. Chem. 263, 11080-11083.
MAUK, A. G., WHELAN, H. T., PUTZ, G. R. and TAKETA, F. (1974) Science 185, 447-449.
McALEESE, S. M., DUNBAR, B., FOTHERGILL, J. E., HINKS, L. J. and DAY, I. N. M. (1988) Eur. J. Biochem. 178, 413-417.
McALISTER, L. and HOLLAND, M. J. (1982) J. biol. Chem. 257, 7181-7188.
McALISTER, L. and HOLLAND, M. J. (1985) J. biol. Chem. 260, 15019-15027.
McCARREY, J. R. and THOMAS, K. (1987) Nature 326, 501-505.
McKNIGHT, G. L., O'HARA, P. J. and PARKER, M. L. (1986) Cell 46, 143-147.
McNALLY, T., PURVIS, I. J., FOTHERGILL-GILMORE, L. A. and BROWN, A. J. P. (1989) FEBS Lett. 247, 312-316.
MERRETT, M. (1981) J. biol. Chem. 256, 10293-10305.
MERTENS, E. (1990) Molec. Biochem. Parasitol. 40, 147-150.

MERTENS, E. (1991) FEBS Lett. 285, 1-5.
MERTENS, E., VAN SCHAFTINGEN, E. and MÜLLER, M. (1989) Molec. Biochem. Parasitol. 37, 183-190.
MESTEK, A., STAUFFER, J., TOLAN, D. R. and CIEJEK-BAEZ, E. (1987) Nucleic Acids Res. 15, 10595.
MEYER-SIEGLER, K., MAURO, D. J., SEAL, G., WURZER, J., DERIEL, J. K. and SIROVER, M. A. (1991) Proc. natn. Acad. Sci. U.S.A. 88, 8460-8464.
MEZQUITA, J. and CARRERAS, J. (1981) Comp. Biochem. Physiol. 70B, 237-245.
MEZQUITA, J., BARTRONS, R., PONS, G. and CARRERAS, J. (1981) Comp. Biochem. Physiol. 70B, 247-255.
MICHELS, P. A. M. and OPPERDOES, F. R. (1991) Parasitol. Today 7, 105-109.
MICHELS, P. A. M., POLISZCZAK, A., OSINGA, K. A., MISSET, O., VAN BEEUMEN, J., WIERENGA, R. K., BORST, P. and OPPERDOES, F. R. (1986) EMBO J. 5, 1049-1056.
MICHELS, P. A. M., MARCHAND, M., KOHL, L., ALLERT, S., WIERENGA, R. K. and OPPERDOES, F. R. (1991) Eur. J. Biochem. 198, 421-428.
MICHELSON, A. M., MARKHAM, A. F. and ORKIN, S. H. (1983) Proc. natn. Acad. Sci. U.S.A. 80, 472-476.
MISSET, O. and OPPERDOES, F. R. (1987) Eur. J. Biochem. 162, 493-500.
MISSET, O., BOS, O. J. M. and OPPERDOES, F. R. (1986) Eur. J. Biochem. 157, 441-453.
MISSET, O., VAN BEEUMEN, J., LAMBEIR, A.-M., VAN DER MEER, R. and OPPERDOES, F. R. (1987) Eur. J. Biochem. 162, 501-507.
MIYATAKE, K., ENOMOTO, T. and KITAOKA, S. (1986) Agric. biol. Chem. 50, 2417-2418.
MOORE, P. A., SAGLIOCCO, F. A., WOOD, R. M. C. and BROWN, A. J. P. (1991) Molec. Cell Biol. 11, 5330-5337.
MORAS, D., OLSEN, K. W., SABESAN, M. N., BUEHNER, M., FORD, G. C. and ROSSMANN, M. G. (1975) J. biol. Chem. 250, 9137-9162.
MORI, N., SINGER-SAM, J., LEE, C.-Y. and RIGGS, A. D. (1986) Gene 45, 275-280.
MORGENEGG, G., WINCKLER, G. C., HÜBSCHER, U., HEIZMANN, C. W., MOUS, J. and KUENZLE, C. C. (1986) J. Neurochem. 47, 54-62.
MOUGIN, A., CORBIER, C., SOUKRI, A., WONACOTT, A., BRANLANT, C. and BRANLANT, G. (1988) Protein Engng 2, 45-48.
MUIRHEAD, H. (1987) Biological Macromolecules and Assemblies, Vol. 3, pp. 114-186 (eds F. A. JURNAK and A. McPHERSON). John Wiley.
MUIRHEAD, H. (1990) Biochem. Soc. Trans. 18, 193-196.
MUIRHEAD, H. (1991) EMBL/GenBank/DDBJ Nucleotide Sequences Databases, accession number X57859.
MUIRHEAD, H., CLAYDEN, D. A., BARFORD, D., LORIMER, C. G., FOTHERGILL-GILMORE, L. A., SCHILTZ, E. and SCHMITT, W. (1986) EMBO J. 5, 475-481.
NAKAJIMA, H., NOGUCHI, R., YAMASAKI, Y., KONO, N., TANAKA, T. and TARUI, S. (1987) FEBS Lett. 223, 113-116.
NAKAJIMA, H., KONO, N., YAMASAKI, T., HOTTA, K., KAWACHI, M., KUWAJIMA, M., NOGUCHI, T., TANAKA, T. and TARUI, S. (1990a) J. biol. Chem. 265, 9392-9395.
NAKAJIMA, H., YAMASAKI, T., NOGUCHI, T., TANAKA, T., KONO, N. and TARUI, S. (1990b) Biochem. biophys. Res. Commun. 166, 637-641.
NELLEMANN, L. J., HOLM, F., ATLUNG, T. and HANSEN, F. G. (1989) Gene 77, 185-191.
NEMAT-GORGANI, M. and WILSON, J. E. (1986) Arch. Biochem. Biophys. 251, 97-103.
NISHI, S., SEINO, S. and BELL, G. I. (1988) Biochem. biophys. Res. Commun. 157, 937-943.
NOGUCHI, T., INOUE, H. and TANAKA, T. (1986) J. biol. Chem. 261, 13807-13812.
NOGUCHI, T., YAMADA, K., INOUE, H., MATSUDA, T. and TANAKA, T. (1987) J. biol. Chem. 262, 14366-14371.
NOLTMANN, E. A. (1972) The Enzymes, Vol. 6, pp. 326-340 (ed. P. D. BOYER). Academic Press, New York.
NWAGWU, M. and OPPERDOES, F. R. (1982) Acta Trop. 39, 61-72.
O'BRIEN, W. E., BOWIEN, S. and WOOD, H. G. (1975) J. biol. Chem. 250, 8690-8695.
OHARA, O., DORIT, R. L. and GILBERT, W. (1989) Proc. natn. Acad. Sci. U.S.A. 86, 6883-6887.
OHLSSON, I., NORDSTRÖM, B. and BRÄNDÉN, C. I. (1974) J. molec. Biol. 89, 339-354.
OHNO, S. (1970) Evolution by Gene Duplication, Springer-Verlag, Heidelberg.
OHNO, S. (1973) Nature 244, 259-262.
OHRMANN, E. (1969) Arch. Mikrobiol. 67, 273.
OLD, S. E. and MOHRENWEISER, H. W. (1988) Nucleic Acids Res. 16, 9055.
OPPERDOES, F. R. (1987) A. Rev. Microbiol. 41, 127-151.
OPPERDOES, F. R. and BORST, P. (1977) FEBS Lett. 80, 360-364.
OSHIMA, Y., MITSUI, H., TAKAYAMA, Y., KUSHIYA, E., SAKIMURA, K. and TAKAHASHI, Y. (1989) FEBS Lett. 242, 425-430.
OSINGA, K. A., SWINKELS, B. W., GIBSON, W. C., BORST, P., VEENEMAN, G. H., VAN BOOM, J. H., MICHELS, P. A. M. and OPPERDOES, F. R. (1985) EMBO J. 4, 3811-3817.
OSTREM, J. A., VERNON, D. M. and BOHNERT, H. J. (1990) J. biol. Chem. 265, 3497-3502.
OURISSON, G., ALBRECHT, P. and ROHMER, M. (1984) Sci. Am. 251 (2), 34-41.
OVADI, J. (1988) Trends Biochem. Sci. 13, 486-490.
OVADI, J. and KELETI, T. (1978) Eur. J. Biochem. 85, 157-161.
PARKISON, C., ASHIZAWA, K., McPHIE, P., LIN, K.-H. and CHENG, S.-Y. (1991) Biochem. biophys. Res. Commun. 179, 668-674.
PARNIAK, M. A. and KALANT, N. (1988) Biochem. J. 251, 795-802.
PAWLUK, A., SCOPES, R. K. and GRIFFITHS-SMITH, K. (1986) Biochem. J. 238, 275-281.
PERKINS, R. E., CONROY, S. C., DUNBAR, B., FOTHERGILL, L. A., TUITE, M. F., DOBSON, M. J., KINGSMAN, S. M. and KINGSMAN, A. J. (1983) Biochem. J. 211, 199-218.
PERUCHO, M., SALAS, J. and SALAS, M. L. (1977) Eur. J. Biochem. 81, 557-562.
PERUCHO, M., SALAS, J. and SALAS, M. L. (1980) Biochim. biophys. Acta 606, 181-195.
PETTERSSON, G. (1989) Eur. J. Biochem. 184, 561-566.
PETTERSSON, G. (1990) Eur. J. Biochem. 194, 141-146.
PICHERSKY, E., GOTTLIEB, L. D. and HESS, J. F. (1984) Molec. Gen. Genet. 195, 314-320.
PIECHACZYK, M., BLANCHARD, J. M., RIAAD-ELSABOUTY, S., DANI, C., MARTY, L. and JEANTEUR, P. (1984) Nature 312, 469-471.

PILKIS, S. J., LIVELY, M. O. and EL-MAGHRABI, M. R. (1987) J. biol. Chem. 262, 12672-12675.
PILKIS, S. J., EL-MAGHRABI, M. R. and CLAUS, T. H. (1988) A. Rev. Biochem. 57, 755-783.
POCALYKO, D. J., CARROLL, L. J., MARTIN, B. M., BABBITT, P. C. and DUNAWAY-MARIANO, D. (1990) Biochemistry 29, 10757-10765.
PODESTA, F. E. and PLAXTON, W. C. (1991) Biochem. J. 279, 495-501.
PONCE, J., ROTH, S. and HARKNESS, D. R. (1971) Biochim. biophys. Acta 250, 63-74.
PONS, G., BARTRONS, R. and CARRERAS, J. (1985) Biochem. biophys. Res. Commun. 129, 658-663.
POOLMAN, B., BOSMAN, B., KIERS, J. and KONINGS, W. N. (1987a) J. Bacteriol. 169, 5887-5890.
POOLMAN, B., SMID, E. J., VELDKAMP, H. and KONINGS, W. N. (1987b) J. Bacteriol. 169, 1460-1468.
POORMAN, R. A., RANDOLPH, A., KEMP, R. G. and HEINRIKSON, R. L. (1984) Nature 309, 467-469.
POTTER, S. and FOTHERGILL-GILMORE, L. A. (1992) FEMS Lett. 94, 235-240.
PRICE, N. C., STEVENS, E. and ROGERS, P. M. (1983) FEMS Microbiol. Lett. 19, 257-259.
PRICE, N. C., DUNCAN, D. and McALISTER, J. W. (1985) Biochem. J. 229, 167-171.
PUNT, P., DINGEMANSE, M. A., JACOBS-MEIJSING, B. J. M., POUWELS, P. H. and VAN DEN HONDEL, C. A. M. J. J. (1988) Gene 69, 49-57.
QUIGLEY, F., MARTIN, W. F. and CERFF, R. (1988) Proc. natn. Acad. Sci. U.S.A. 85, 2672-2676.
RAO, S. T. and ROSSMANN, M. G. (1973) J. molec. Biol. 76, 241-256.
RAPOPORT, S. and GUEST, G. M. (1941) J. biol. Chem. 138, 269-282.
RAPOPORT, T. A., HEINRICH, R., JACOBASCH, G. and RAPOPORT, S. (1974) Eur. J. Biochem. 42, 107-120.
RAPOPORT, T. A., HEINRICH, R. and RAPOPORT, S. (1976) Biochem. J. 154, 449-469.
READ, R. J., KALK, K. H., LITTLECHILD, J. A., WATSON, H. C. and HOL, W. G. J. (1992) in preparation.
REEVES, R. E. (1968) J. biol. Chem. 243, 3202-3204.
REEVES, R. E., SOUTH, D. J., BLYTT, H. J. and WARREN, L. G. (1974) J. biol. Chem. 249, 7737-7741.
REYES, A. and CARDENAS, M. L. (1984) Biochem. J. 221, 303-309.
RICHARDSON, J. S. (1981) Adv. Prot. Chem. 34, 167-339.
RIDER, C. C. and TAYLOR, C. B. (1975a) Biochem. biophys. Res. Commun. 66, 814-820.
RIDER, C. C. and TAYLOR, C. B. (1975b) Biochim. biophys. Acta 405, 175-187.
ROSE, Z. B. (1980) Adv. Enzymol. Relat. Areas Mol. Biol. 51, 211-253.
ROSSMANN, M. G., ADAMS, M. J., BUEHNER, M., FORD, G. C., HACKERT, M. L., LILJAS, A., RAO, S. T., BANASZAK, L. J., HILL, E., TSERNOGLOU, D. and WEBB, L. (1973) J. molec. Biol. 76, 533-537.
ROSSMANN, M. G., LILJAS, A., BRÄNDÉN, C.-I. and BANASZAK, L. J. (1975) The Enzymes, Vol. 11, pp. 61-102 (ed. P. D. BOYER). Academic Press, New York.
ROTTMANN, W. H., TOLAN, D. R. and PENHOET, E. E. (1984) Proc. natn. Acad. Sci. U.S.A. 81, 2738-2742.
ROTTMANN, W. H., DESELMS, K. R., NICLAS, J., CAMERATO, T., HOLMAN, P. S., GREEN, C. J. and TOLAN, D. R. (1987) Biochimie 69, 137-145.
RUDOLPH, R., WESTHOF, E. and JAENICKE, R. (1977) FEBS Lett. 73, 204-206.
RUSSELL, P. R. (1985) Gene 40, 125-130.
RUSSELL, G. A., DUNBAR, B. and FOTHERGILL-GILMORE, L. A. (1986) Biochem. J. 236, 115-126.
RUTTER, W. J. (1964) Fed. Proc. 23, 1248-1257.
RYAZANOV, A. G. (1985) FEBS Lett. 192, 131-134.
RYPNIEWSKI, W. R. and EVANS, P. R. (1989) J. molec. Biol. 207, 805-821.
SABATH, D. E., BROOME, H. E. and PRYSTOWSKY, M. B. (1990) Gene 91, 185-191.
SACHS, M., FREELING, M. and OKIMOTO, R. (1980) Cell 20, 761-767.
SAKAI, H., SUZUKI, K. and IMAHORI, K. (1986) J. Biochem. 99, 1157-1167.
SAKIMURA, K., KUSHIYA, E., OBINATA, M. and TAKAHASHI, Y. (1985a) Nucleic Acids Res. 13, 4365-4378.
SAKIMURA, K., KUSHIYA, E., OBINATA, M., ODANI, S. and TAKAHASHI, Y. (1985b) Proc. natn. Acad. Sci. U.S.A. 82, 7453-7457.
SAKODA, S., SHANSKE, S., DIMAURO, S. and SCHON, E. A. (1988) J. biol. Chem. 263, 16899-16905.
SALHANY, J. M. and GAINES, K. C. (1981) Trends Biochem. Sci. 7, 13-15.
SALTER, M., KNOWLES, R. G. and POGSON, C. I. (1986) Biochem. J. 234, 635-647.
SANDER, C. and SCHNEIDER, R. (1991) Proteins: Structure, Function Gen. 9, 56-58.
SASAKI, R., SUGIMOTO, E. and CHIBA, H. (1966) Arch. Biochem. Biophys. 115, 53-61.
SAWYER, J. R. and HOZIER, J. C. (1986) Science 232, 1632-1635.
SCAMUFFA, M. D. and CAPRIOLI, R. M. (1980) Biochim. biophys. Acta 614, 583-590.
SCHAAFF, I., HEINISCH, J. and ZIMMERMANN, F. K. (1989) Yeast 5, 285-290.
SCHIRCH, D. M. and WILSON, J. E. (1987) Arch. Biochem. Biophys. 254, 385-396.
SCHIRMER, T. and EVANS, P. R. (1990) Nature 343, 140-145.
SCHLÄPFER, B. S., PORTMANN, W., BRANLANT, C., BRANLANT, G. and ZUBER, H. (1990) Nucleic Acids Res. 18, 6422.
SCHULTES, V., DEUTZMANN, R. and JAENICKE, R. (1990) Eur. J. Biochem. 192, 25-31.
SCHWAB, D. A. and WILSON, J. E. (1988) J. biol. Chem. 263, 3220-3224.
SCHWAB, D. A. and WILSON, J. E. (1991) Arch. Biochem. Biophys. 285, 365-370.
SCHWELBERGER, H. G., KOHLWEIN, S. D. and PALTAUF, F. (1989) Eur. J. Biochem. 180, 301-308.
SCOPES, R. K. (1973) The Enzymes, Vol. 8, pp. 335-352 (ed. P. D. BOYER). Academic Press, New York.
SEGIL, N., SHRUTKOWSKI, A., DWORKIN, M. B. and DWORKIN-RASTL, E. (1988) Biochem. J. 251, 31-39.
SHANSKE, S., SAKODA, S., HERMODSON, M. A., DIMAURO, S. and SCHON, E. A. (1987) J. biol. Chem. 262, 14612-14617.
SHARMA, P. M., REDDY, G. R., BABIOR, B. M. and McLACHLAN, A. (1990) J. biol. Chem. 265, 9006-9010.
SHAW-LEE, R., LISSEMORE, J. L., SULLIVAN, D. T. and TOLAN, D. R. (1992) J. biol. Chem. 267, 3959-3967.
SHIH, M.-C. and GOODMAN, H. M. (1988) EMBO J. 7, 893-898.
SHIH, M.-C., LAZAR, G. and GOODMAN, H. M. (1986) Cell 47, 73-80.
SHILL, J. P., PETERS, B. A. and NEET, K. E. (1974) Biochemistry 13, 3864-3871.
SHIMIZU, A., SUZUKI, F. and KATO, K. (1983) Biochim. biophys. Acta 748, 278-284.
SHIRAKIHARA, Y. and EVANS, P. R. (1988) J. molec. Biol. 204, 973-994.
SHUNTER, J. R. (1990) Nucleic Acids Res. 18, 4271.

SIMPSON, C. J. (1991) Ph.D. Thesis, University of Edinburgh.
SKARZYNSKI, T. and WONACOTT, A. J. (1988) J. molec. Biol. 203, 1097-1118.
SKARZYNSKI, T., MOODY, P. C. E. and WONACOTT, A. J. (1987) J. molec. Biol. 193, 171-187.
SMITH, T. L. and LEONG, S. A. (1990) Gene 93, 111-117.
SOGIN, M. L., ELWOOD, H. J. and GUNDERSON, J. H. (1986) Proc. natn. Acad. Sci. U.S.A. 83, 1383-1387.
SOLS, A. (1981) Curr. Top. Cell. Regul. 19, 77-101.
SOUKRI, A., MOUGIN, A., CORBIER, C., WONACOTT, A., BRANLANT, C. and BRANLANT, G. (1989) Biochemistry 28, 2586-2592.
SPERANZA, M. L., VALENTINI, G. and MALCOVATI, M. (1990) Eur. J. Biochem. 191, 701-704.
SRERE, P. A. (1987) A. Rev. Biochem. 56, 89-124.
SRIVASTAVA, D. K. and BERNHARD, S. A. (1984) Biochemistry 23, 4538-4545.
SRIVASTAVA, D. K. and BERNHARD, S. A. (1985) Curr. Top. Cell Regul. 28, 1-109.
SRIVASTAVA, D. K., SMOLEN, P., BETTS, G. F., FUKUSHIMA, T., SPIVEY, H. O. and BERNHARD, S. A. (1989) Proc. natn. Acad. Sci. U.S.A. 86, 6464-6468.
STACHELEK, C., STACHELEK, J., SWAN, J., BOTSTEIN, D. and KONIGSBERG, W. (1986) Nucleic Acids Res. 14, 945-963.
STEITZ, T. A., ANDERSON, W. E., FLETTERICH, R. J. and ANDERSON, C. M. (1977) J. biol. Chem. 252, 4494-4500.
STEITZ, T. A., SHOHAM, M. and BENNETT, W. S. (1981) Phil. Trans. R. Soc. Lond. B 293, 43-52.
STRAUS, D. and GILBERT, W. (1985) Molec. Cell. Biol. 5, 3497-3506.
STRIBLING, D. and PERHAM, R. N. (1973) Biochem. J. 131, 833-841.
STUART, D. I., LEVINE, M., MUIRHEAD, H. and STAMMERS, D. K. (1979) J. molec. Biol. 134, 109-142.
SUTHERLAND, K. J., HENNEKE, C. M., TOWNER, P., HOUGH, D. W. and DANSON, M. J. (1990) Eur. J. Biochem. 194, 839-844.
SWINKELS, B. W., GIBSON, W. C., OSINGA, K. A., KRAMER, R., VEENEMAN, G. H., VAN BOOM, J. H. and BORST, P. (1986) EMBO J. 5, 1291-1298.
SWINKELS, B. W., EVERS, R. and BORST, P. (1988) EMBO J. 7, 1159-1165.
SWINKELS, B. W., LOISEAU, A., OPPERDOES, F. R. and BORST, P. (1992) Molec. Biochem. Parasitol. 50, 69-78.
SYGUSCH, J., BEAUDRY, D. and ALLAIRE, M. (1987) Proc. natn. Acad. Sci. U.S.A. 84, 7846-7850.
TAGER, J. M., GROEN, A. K., WANDERS, R. J. A., DUSZYNSKI, J., WESTERHOFF, H. V. and VERNOORN, R. C. (1983) Biochem. Soc. Trans. 11, 40-43.
TAIT, R. C., FROMAN, B. E., LAUDENCIA-CHINGCUANCO, D. L. and GOTTLIEB, L. D. (1988) Plant molec. Biol. 11, 381-388.
TAKAHASHI, I., TAKASAKI, Y. and HORI, K. (1989) J. Biochem. 105, 281-286.
TAKENAKA, M., NOGUCHI, T., INOUE, H., YAMADA, K., MATSUDA, T. and TANAKA, T. (1989) J. biol. Chem. 264, 2362-2367.
TAKETA, F. (1974) Ann. N.Y. Acad. Sci. 241, 524-537.
TANI, K., SINGER-SAM, J., MUNNS, M. and YOSHIDA, A. (1985) Gene 35, 11-18.
TANI, K., YOSHIDA, M. C., SATOH, H., MITAMURA, K., NOGUCHI, T., TANAKA, T., FUJII, H. and MIWA, S. (1988a) Gene 73, 509-516.
TANI, K., FUJII, H., NAGATA, S. and MIWA, S. (1988b) Proc. natn. Acad. Sci. U.S.A. 85, 1792-1795.
TAO, W., WANG, L., SHEN, R. and SHENG, Z. (1989) EMBL/GenBank/DDBJ Nucleotide Sequences Databases, accession numbers X16639 and X16640.
TAULER, A., EL-MAGHRABI, M. R. and PILKIS, S. J. (1987) J. biol. Chem. 262, 16808-16815.
TEFAY, H. S., AMELUNXEN, R. E. and GOLDBERG, I. D. (1989) Gene 82, 237-248.
TEKAMP-OLSON, P., NAJARIAN, R. and BURKE, R. L. (1988) Gene 73, 153-161.
THELEN, A. P. and WILSON, J. E. (1991) Arch. Biochem. Biophys. 286, 645-651.
THOMPSON, J. and TORCHIA, D. A. (1984) J. Bacteriol. 158, 791-800.
TIJANE, M. N., CHAFFOTTE, A. F., SEYDOUX, F. J., ROUCOUS, C. and LAURENT, M. (1980) J. biol. Chem. 255, 10188-10193.
TIJANE, M. N., CHAFFOTTE, A. F., YON, J. M. and LAURENT, M. (1982) FEBS Lett. 148, 267-270.
TILGHMAN, S. M., TIEMEIER, D. C., SEIDMAN, J. G., PETERLIN, B. M., SULLIVAN, M., MAIZEL, J. V. and LEDER, P. (1978) Proc. natn. Acad. Sci. U.S.A. 75, 725-729.
TOLAN, D. R., AMSDEN, A. B., PUTNEY, S. D., URDEA, M. S. and PENHOET, E. E. (1984) J. biol. Chem. 259, 1127-1131.
TOLAN, D. R., NICLAS, J., BRUCE, B. D. and LEBO, R. V. (1987) Am. J. Hum. Genet. 41, 907-924.
TSO, J. Y., SUN, X.-H., KAO, T.-H., REECE, K. S. and WU, R. (1985a) Nucleic Acids Res. 13, 2485-2502.
TSO, J. Y., SUN, X.-H. and WU, R. (1985b) J. biol. Chem. 260, 8220-8228.
TSUTSUMI, K., MUKAI, T., TSUTSUMI, R., MORI, M., DAIMON, M., TANAKA, T., YATSUKI, H., HORI, K. and ISHIKAWA, K. (1984) J. biol. Chem. 259, 14572-14575.
TUOMINEN, F. W. and BERNLOHR, R. W. (1971) J. biol. Chem. 246, 1746-1755.
TYUMA, I. and SHIMIZU, K. (1970) Fed. Proc. 29, 1112-1114.
URETA, T. (1982) Comp. Biochem. Physiol. 71B, 549-555.
URETA, T., MEDINA, C. and PRELLER, A. (1987) Arch. Biol. med. Exp. 20, 343-357.
VALENTINI, G., IADAROLA, P., SOMANI, B. L. and MALCOVATI, M. (1979) Biochim. biophys. Acta 570, 248-258.
VANDERCAMMEN, A. and VAN SCHAFTINGEN, E. (1990) Eur. J. Biochem. 191, 483-489.
VANDERCAMMEN, A. and VAN SCHAFTINGEN, E. (1991) Eur. J. Biochem. 200, 545-551.
VANHANEN, S., PENTTILAE, M., LEHTOVAARA, P. and KNOWLES, J. (1989) Curr. Genet. 15, 181-186.
VAN SCHAFTINGEN, E. (1987) Adv. Enzymol. Relat. Areas molec. Biol. 59, 315-395.
VAN SCHAFTINGEN, E. (1989) Eur. J. Biochem. 179, 179-184.
VAN SCHAFTINGEN, E., OPPERDOES, F. R. and HERS, H.-G. (1985) Eur. J. Biochem. 153, 403-406.
VAN SCHAFTINGEN, E., OPPERDOES, F. R. and HERS, H.-G. (1987) Eur. J. Biochem. 166, 653-661.
VAN SCHAFTINGEN, E., MERTENS, E. and OPPERDOES, F. R. (1990) Fructose 2,6-bisphosphate, pp. 229-244 (ed. S. J. PILKIS). CRC Press, Boca Raton, U.S.A.
VAN SCHAFTINGEN, E., VANDERCAMMEN, A. and DETHEUX, M. (1992a) Méd./Sci. 8, 46-52.
VAN SCHAFTINGEN, E., VANDERCAMMEN, A., DETHEUX, M. and DAVIES, D. R. (1992b) Adv. Enzyme Regul. 32, 133-148.

VAN SOLINGEN, P., MUURLING, H., KOEKMAN, B. and VAN DEN BERG, J. (1988) Nucleic Acids Res. 16, 11823.
VAN VLIET, G. and HUISMAN, T. H. J. (1964) Biochem. J. 93, 401-409.
VELLIEUX, F. M. D., HAJDU, J., VERLINDE, C. L. M. J., GROENDIJK, H., READ, R. J., GREENHOUGH, T. J., CAMPBELL, J. W., KALK, K. H., LITTLECHILD, J. A., WATSON, H. C. and HOL, W. G. J. (1992) Proc. natn. Acad. Sci. U.S.A., in press.
VIAENE, A. and DHAESE, P. (1989) Nucleic Acids Res. 17, 1251.
VINCENT, S. and FORT, P. (1990) Nucleic Acids Res. 18, 3054.
VINUELA, E., SALAS, M. L. and SOLS, A. (1963) Biochem. biophys. Res. Commun. 12, 140-145.
VISSER, N. and OPPERDOES, F. R. (1980) Eur. J. Biochem. 103, 623-632.
VISSER, N., OPPERDOES, F. R. and BORST, P. (1981) Eur. J. Biochem. 118, 521-526.
VON DER OSTEN, C. H., BARBAS, C. F., WONG, C.-H. and SINSKEY, A. J. (1989) Molec. Microbiol. 3, 1625-1637.
WALEY, S. G. (1969) Comp. Biochem. Physiol. 30, 1-11.
WALEY, S. G. (1973) Biochem. J. 135, 165-172.
WALSH, K. and KOSHLAND, D. E., JR (1985) Proc. natn. Acad. Sci. U.S.A. 82, 3577-3581.
WATSON, H. C. (1982) Protein Data Base Entry: Phosphoglycerate mutase (yeast), Brookhaven, New York (see Bernstein et al., 1977).
WATSON, H. C. and LITTLECHILD, J. A. (1990) Biochem. Soc. Trans. 18, 187-190.
WATSON, H. C., WALKER, N. P. C., SHAW, P. J., BRYANT, T. N., WENDELL, P. L., FOTHERGILL, L. A., PERKINS, R. E., CONROY, S. C., DOBSON, M. J., TUITE, M. F., KINGSMAN, A. J. and KINGSMAN, S. M. (1982) EMBO J. 1, 1635-1640.
WESOLOWSKI-LOUVEL, M., GOFFRINI, P. and FERRERO, I. (1988) Nucleic Acids Res. 16, 8714.
WESTERHOFF, H. V. and ARENTS, J. C. V. (1984) Biosci. Rep. 4, 23-31.
WESTERHOFF, H. V., GROEN, A. K. and WANDERS, R. J. A. (1984) Biosci. Rep. 4, 1-22.
WHITE, M. F. and FOTHERGILL-GILMORE, L. A. (1988) FEBS Lett. 229, 383-387.
WHITE, P. J., NAIRN, J., PRICE, N. C., NIMMO, H. G., HUNTER, I. S. and COGGINS, J. R. (1992) J. Bacteriol. 174, 434-440.
WHITE, T. K. and WILSON, J. E. (1987) Arch. Biochem. Biophys. 259, 402-411.
WIERENGA, R. K., KALK, K. H. and HOL, W. G. J. (1987) J. molec. Biol. 198, 109-121.
WIERENGA, R. K., NOBLE, M. E. M., VRIEND, G., NAUCHE, S. and HOL, W. G. J. (1991a) J. molec. Biol. 220, 995-1015.
WIERENGA, R. K., NOBLE, M. E. M., POSTMA, J. P. M., GROENDIJK, H., KALK, K. H., HOL, W. G. J. and OPPERDOES, F. R. (1991b) Proteins: Structure, Function Gen. 10, 33-49.
WILLS, C. (1990) Crit. Rev. Biochem. molec. Biol. 25, 245-280.
WILSON, J. E. (1984) Regulation of Carbohydrate Metabolism, pp. 45-85 (ed. R. BEITNER). CRC Press, Boca Raton, U.S.A.
WINN, S. I., WATSON, H. C., HARKINS, R. N. and FOTHERGILL, L. A. (1981) Phil. Trans. R. Soc. Lond. B 293, 121-130.
WISTOW, G. J., LIETMAN, T., WILLIAMS, L. A., STAPEL, S. O., DE JONG, W. W., HORWITZ, J. and PIATIGORSKY, J. (1988) J. Cell. Biol. 107, 2729-2736.
WOESE, C. R. (1965) Proc. natn. Acad. Sci. U.S.A. 54, 1546-1550.
WOESE, C. R. (1987) Microbiol. Rev. 51, 221-271.
WRBA, A., SCHWEIGER, A., SCHULTES, V., JAENICKE, R. and ZAVODSKY, P. (1990) Biochemistry 29, 7584-7592.
WU, X., GUTFREUND, H., LAKATOS, S. and CHOCK, P. B. (1991) Proc. natn. Acad. Sci. U.S.A. 88, 497-501.
YAMADA, T. and CARLSSON, J. (1975) J. Bacteriol. 124, 562-563.
YAMAGUCHI, M., HATEFI, Y., TRACH, K. and HOCH, J. A. (1988) J. biol. Chem. 263, 2761-2767.
YAN, T.-F. and TAO, M. (1984) J. biol. Chem. 259, 5087-5092.
YANAGAWA, S., HITOMI, K., SASAKI, R. and CHIBA, H. (1986) Gene 44, 185-191.
YARBROUGH, P. O., HAYDEN, M. A., DUNN, L. A., VERMERSCH, P. S., KLASS, M. R. and HECHT, R. M. (1987) Biochim. biophys. Acta 908, 21-33.
YCAS, M. (1974) J. Theor. Biol. 44, 145-160.
YOSHIZAKI, F. and IMAHORI, K. (1979) Agric. Biol. Chem. 43, 537-545.
YOUN, J. H., YOUN, M. S. and BERGMAN, R. N. (1986) J. biol. Chem. 261, 15960-15969.
YUN, S.-L., AUST, A. E. and SUELTER, C. H. (1976) J. biol. Chem. 251, 124-128.
ZWICKL, P., FABRY, S., BOGEDAIN, C., HAAS, A. and HENSEL, R. (1990) J. Bacteriol. 172, 4329-4338.