doi:10.1016/S0022-2836(03)00264-X
J. Mol. Biol. (2003) 328, 505–515
Crystal Structure of Class I Acetohydroxy Acid Isomeroreductase from Pseudomonas aeruginosa
Hyung Jun Ahn1, Su Jung Eom1, Hye-Jin Yoon1, Byung Il Lee1, Hyeongjin Cho2 and Se Won Suh1*
1 Structural Proteomics Laboratory, School of Chemistry and Molecular Engineering, College of Natural Sciences, Seoul National University, Seoul 151-742, South Korea
2 Department of Chemistry, Inha University, Incheon 402-751, South Korea
Acetohydroxy acid isomeroreductase (AHIR) is a key enzyme in the biosynthesis of branched-chain amino acids. We have determined the first crystal structure of a class I AHIR from Pseudomonas aeruginosa at 2.0 Å resolution. Its dodecameric architecture of 23 point group symmetry is assembled from six dimeric units, and dimerization is essential for the formation of the active site. The dimeric unit of P. aeruginosa AHIR partially superimposes with a three-domain monomer of spinach AHIR, a class II enzyme. This demonstrates that the so-called plant-specific insert in the middle of spinach AHIR is structurally and functionally equivalent to the C-terminal α-helical domain of P. aeruginosa AHIR, and that the C-terminal α-helical domain was duplicated during evolution from the shorter, class I AHIRs to the longer, class II AHIRs. The dimeric unit of P. aeruginosa AHIR possesses a deep figure-of-eight knot, essentially identical with that in the spinach AHIR monomer. Thus, our work lowers the likelihood of the previous proposal that “domain duplication followed by exchange of a secondary structure element can be a source of such a knot in the protein structure” being correct.
© 2003 Elsevier Science Ltd. All rights reserved
*Corresponding author
Keywords: acetohydroxy acid isomeroreductase; dodecamer; domain duplication; figure-of-eight knot; ketol-acid reductoisomerase
Abbreviations used: AHIR, acetohydroxy acid isomeroreductase; MAD, multiwavelength anomalous diffraction; Pa, Pseudomonas aeruginosa; SeMet, selenomethionine.
E-mail address of the corresponding author: [email protected]

Introduction
Acetohydroxy acid isomeroreductase (AHIR; also known as ketol-acid reductoisomerase; EC 1.1.1.86) catalyzes the conversion of acetohydroxy acids into dihydroxy valerates in the second step of the biosynthetic pathway for the branched-chain amino acids valine, leucine, and isoleucine.1 The substrates are either 2-acetolactate or 2-aceto-2-hydroxybutyrate, and the two alternative products act as precursors of valine and leucine, or isoleucine, respectively. This catalysis is a two-step reaction composed of an alkyl migration followed by an NADPH-dependent reduction. The metal ion requirement for the rearrangement step is stringent and requires Mg2+, whereas the reduction step is less specific and can utilize Mg2+, Mn2+, or Co2+. These two dissimilar reactions occur in sequence, in which Mg2+ and NADPH bind first and independently, followed by acetohydroxy acid substrate binding.2,3 Because AHIR activity has been observed in plants, fungi, and bacteria, but not in mammals, the enzyme is a potential target for developing herbicides and antimicrobial agents.4
A BLAST search of the non-redundant SWISS-PROT protein sequence database indicates that the AHIR proteins (currently counting over 70) can be grouped into two classes in terms of chain length and domain organization. Approximately 85% of them fall into the group of shorter sequences and the rest fall into the other group of longer sequences. Bacterial AHIRs that belong to the first group typically have between 330 and 340 residues, while those belonging to the second group have ~490 residues. Fungal AHIRs belong to the first group, while plant AHIRs belong to the second group. They are longer than bacterial members in each group, however, because they have an organelle-targeting peptide sequence at their N termini. Sequence similarity among members within each group is significantly higher than that between the two groups. Here, we propose the
0022-2836/03/$ - see front matter © 2003 Elsevier Science Ltd. All rights reserved
Class I AHIR Structure
Table 1. Data collection, phasing, and refinement statistics

A. Data collection and phasing
Space group: P2₁3
Unit cell parameters: a = b = c = 184.48 Å; α = β = γ = 90°
X-ray source: Photon Factory BL-18B

Data set                        SeMet λ1 (remote)     SeMet λ2 (peak)       SeMet λ3 (edge)
Wavelength (Å)                  0.9500                0.9789                0.9790
Resolution range (Å)            20–2.8                20–2.8                20–2.8
Total/unique reflections        1,335,955/51,503      1,349,427/51,506      1,349,756/51,506
Completeness (%)a               99.8 (99.8)           99.8 (99.8)           99.8 (99.8)
Rsym (%)b                       11.6 (43.1)           11.3 (42.2)           10.7 (40.0)
Riso (%)c                       –                     6.4 (11.6)            7.1 (11.8)
f′/f″ (e−)                      −3.47/2.91            −9.49/4.85            −11.19/2.40
RCullis (iso/ano)d              –/0.73                0.31/0.56             0.42/0.75
Phasing power (iso/ano)e        –/0.95                2.74/1.34             2.11/0.93
Figure of merit f for MAD phasing (20–2.8 Å): 0.63/0.72 (before/after density modification)

B. Refinement
Data set                                    Native
X-ray wavelength (Å)                        1.0000 (Pohang Light Source, BL-6B)
Unit cell parameters, a = b = c (Å)         184.38
Resolution range (Å)                        20.0–2.00
Completeness (%)                            97.1
No. reflections (working/free set)g         122,230/13,545
Rwork/Rfree g (%)                           21.2/23.6
No. amino acid residues (B-factor, Å²)      327 × 4 (30.8)
No. water molecules (B-factor, Å²)          460 (30.6)
r.m.s. deviation h
  Bond lengths (Å)                          0.0054
  Bond angles (deg.)                        1.16
Ramachandran analysis                       100% in allowed regions

a Completeness for I/σ(I) > 1.0; high-resolution (2.95–2.80 Å) shell in parentheses.
b $R_{sym} = \sum_h \sum_i |I(h)_i - \langle I(h) \rangle| / \sum_h \sum_i I(h)_i$, where I(h) is the intensity of reflection h, $\sum_h$ is the sum over all reflections, and $\sum_i$ is the sum over i measurements of reflection h. Numbers in parentheses reflect statistics for the last shell (2.95–2.80 Å).
c $R_{iso} = \sum ||F_{PH}| - |F_P|| / \sum |F_P|$, where FPH and FP are the derivative (λ2 or λ3) and native (λ1) structure factors, respectively. Numbers in parentheses are for the last shell (2.89–2.80 Å).
d $R_{Cullis}(iso) = \sum ||F_{PH} - F_P| - |F_{PH(calc)}|| / \sum |F_{PH} - F_P|$, where FPH and FPH(calc) are the observed and calculated structure factors of a heavy-atom derivative.
e Phasing power (iso) is the mean value of the heavy-atom structure factor amplitude divided by the residual lack-of-closure error.
f Figure of merit $= \langle | \sum P(\alpha) e^{i\alpha} / \sum P(\alpha) | \rangle$, where α is the phase and P(α) is the phase probability distribution.
g $R = \sum ||F_{obs}| - |F_{calc}|| / \sum |F_{obs}|$, where Rfree24 is calculated for a randomly chosen 10% of reflections, which were not used for structure refinement, and Rwork is calculated for the remaining reflections.
h Deviations from ideal bond lengths and angles.
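The residual definitions in footnotes b and g of Table 1 can be illustrated with a minimal Python sketch. The function names and the toy reflection data are hypothetical and are not taken from the processing programs used in this work. Only the working/free reflection counts are from Table 1.

```python
from statistics import mean


def r_sym(observations):
    """R_sym = sum_h sum_i |I(h)_i - <I(h)>| / sum_h sum_i I(h)_i.

    `observations` maps a reflection index h to the list of its
    repeated intensity measurements.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        average = mean(intensities)
        numerator += sum(abs(i - average) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum ||F_obs| - |F_calc|| / sum |F_obs| (used for R_work and R_free)."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


def free_fraction(n_work, n_free):
    """Fraction of reflections set aside for R_free."""
    return n_free / (n_work + n_free)


# Hypothetical toy data, for illustration only.
toy_observations = {(1, 0, 0): [100.0, 110.0], (0, 1, 0): [50.0, 50.0]}
print(round(r_sym(toy_observations), 4))                 # 10/310 -> 0.0323
print(round(r_factor([10.0, 20.0], [9.0, 22.0]), 3))     # 3/30 -> 0.1

# Reflection counts from Table 1: the free set is close to the stated 10%.
print(round(free_fraction(122230, 13545), 4))            # 0.0998
```

The last line is a consistency check on the table: 13,545 of the 135,775 reflections in the native data set were reserved for Rfree, in agreement with footnote g.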
following nomenclature: class I AHIR for the group of shorter sequences and class II AHIR for the other group of longer sequences. To date, no three-dimensional structure of a class I AHIR has been reported. Structural information on AHIR is limited to the mature form of spinach AHIR, a class II enzyme.5 The mature form of spinach AHIR is a 524 residue protein (residues M72–A595). The N-terminal 71 residues constitute the organelle-targeting peptide that is not found in bacterial AHIRs. It was suggested that an insert of ~140 residues in the middle of spinach AHIR, as compared to class I AHIRs, is unique to the plant enzymes.5,6 Spinach AHIR is the only protein that has been found to possess a deep figure-of-eight knot.7 In order to gain insight into the evolution of AHIRs through a structural comparison and to characterize the origin and the role of the “inserted” extra ~140 residues in the class II AHIRs such as spinach AHIR, we have determined the first crystal structure of a class I AHIR, dodecameric AHIR from Pseudomonas aeruginosa (Pa), at 2.0 Å resolution. The structure provides insights into the evolution of class II AHIRs via domain duplication and the source of the deep figure-of-eight knot.
Results and discussion
Overall subunit structure and the dimeric unit
The structure of Pa AHIR has been determined by using the multiwavelength anomalous diffraction (MAD) data collected from a crystal of the selenomethionine (SeMet)-substituted enzyme and the native model has been refined to 2.0 Å
Figure 1. Electron density and monomer fold. (a) Stereo 2Fo − Fc electron density map, superimposed on the refined model. Primed residues belong to the second monomer. (b) Ribbon diagram of a Pa AHIR monomer. The Figure was drawn with MOLSCRIPT25 and Raster3D.26 Secondary structure elements were defined by PROCHECK.27
resolution (Table 1). The refined model of the native enzyme includes 1308 residues of four independent monomers in a crystallographic asymmetric unit (residues M1–I327 in each monomer) and 460 water molecules. The electron density is clear for most of the residues (Figure 1(a)) but is missing for the C-terminal 11 residues A328–N338 as well as the C-terminal eight residue tag, presumably because they are disordered in the crystal. The structures of the four monomers are highly similar and we arbitrarily chose monomer A, which has the lowest mean B-factor, for structural descriptions and comparisons. Each monomer is composed of two domains (Figure 1(b)): the larger, N-terminal α/β domain (residues M1–T181) and the smaller, C-terminal α-helical domain (residues T182–N338). The N-terminal domain consists of a nine-stranded, mixed β-sheet (β1–β9) with flanking α-helices on both sides of the β-sheet. The C-terminal domain consists of eight α-helices (α10–α17) and the loops that connect these helices, and it plays an important role in dimerization. In the crystal, two identical monomers that are related by either crystallographic or non-crystallographic 2-fold rotational symmetry form a tight dimer. Dimer formation is mediated by extensive interactions between the two C-terminal α-helical domains of both monomers (Figure 2). Many of the residues at the intradimer interface are well conserved among class I AHIRs. The dimeric unit of Pa AHIR possesses a deep figure-of-eight knot, essentially identical with that in a monomer of spinach AHIR,7 except that there is one more cut in the case of the Pa AHIR dimer (Figure 3). The subunit interface of the Pa AHIR dimer buries 5550 Å² of the solvent-accessible surface area per
Figure 2. Dimeric unit of Pa AHIR. (a) Domains are drawn in different colors: blue and cyan are the N and C-terminal domains of one monomer, while red and purple are the N and C-terminal domains of the other monomer. One monomer is drawn in tubes, while the other monomer is drawn in surface representation. The two C-terminal domains are intertwined. The Figure was drawn with GRASP.28 (b) This view is obtained by rotating (a) by 180° around the vertical axis.
monomer (~32% of the monomer surface area). This is one of the largest buried surface areas among homodimers.8 The large area and the highly conserved nature of the intradimeric interface suggest that this dimerization pattern is likely to be conserved in other members of class I AHIR. The dimeric unit is a building block for the assembly of a dodecamer and dimerization is required for the formation of the active site, as discussed in further detail below.
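The dodecamer assembled from this dimeric building block has 23 (tetrahedral) point group symmetry. As a minimal sketch of what that means, the 12 rotations of the group can be generated from one 2-fold and one 3-fold axis. The choice of axes below (2-fold along z, 3-fold along the body diagonal) is a conventional setting assumed for illustration and is not the orientation of the dodecamer in the crystal. The function names are hypothetical.

```python
from collections import Counter

import numpy as np


def rotation_matrix(axis, angle_deg):
    """Rotation about `axis` by `angle_deg`, using Rodrigues' formula."""
    axis = np.asarray(axis, dtype=float)
    axis = axis / np.linalg.norm(axis)
    theta = np.radians(angle_deg)
    k = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(theta) * k + (1.0 - np.cos(theta)) * (k @ k)


def generate_group(generators, tol=1e-8):
    """Close a set of rotation matrices under multiplication."""
    elements = [np.eye(3)]
    frontier = [np.eye(3)]
    while frontier:
        new_elements = []
        for g in frontier:
            for gen in generators:
                candidate = gen @ g
                if not any(np.allclose(candidate, e, atol=tol) for e in elements):
                    elements.append(candidate)
                    new_elements.append(candidate)
        frontier = new_elements
    return elements


def rotation_order(rotation, max_order=6):
    """Smallest n such that rotation**n is the identity."""
    power = np.eye(3)
    for n in range(1, max_order + 1):
        power = power @ rotation
        if np.allclose(power, np.eye(3), atol=1e-8):
            return n
    return None


# Assumed setting: 2-fold along z, 3-fold along the (1, 1, 1) diagonal.
two_fold = rotation_matrix([0, 0, 1], 180.0)
three_fold = rotation_matrix([1, 1, 1], 120.0)
point_group_23 = generate_group([two_fold, three_fold])

print(len(point_group_23))  # 12 rotations, one per monomer of the dodecamer
print(sorted(Counter(rotation_order(r) for r in point_group_23).items()))
```

The group contains the identity, three 2-fold rotations, and eight 3-fold rotations (two about each of four 3-fold axes). Each 2-fold axis passes through the midpoints of two opposite edges of a tetrahedron, which is consistent with six dimers sitting on the six edges.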
Dodecameric architecture
Twelve identical monomers assemble into a dodecamer of 23 point group symmetry in the crystal. Gel-filtration and dynamic light-scattering analyses indicate that Pa AHIR exists as a dodecamer in solution. The recombinant AHIR from Helicobacter pylori (a class I enzyme) fused with a C-terminal eight residue tag (LEHHHHHH) was also found to exist as a dodecamer by dynamic light-scattering measurements. This suggests that the dodecameric architecture of the class I AHIRs is a conserved feature, although its biological role is not clear. The trimeric interaction (discussed below) as well as the dimeric interaction in the Pa AHIR dodecamer involves highly conserved residues. The dodecamer of Pa AHIR measures ~135 Å in diameter (Figure 4) and has a hollow central core of ~35 Å in diameter. It is built of six dimeric units, with dimers positioned on the edges of a tetrahedron. Only two kinds of
Figure 3. Figure-of-eight knot. (a) A drawing of the core of the C-terminal domain of spinach AHIR possessing a deep figure-of-eight knot.7 The helical segments A1 (residues T307–S341), A2 (S344–K367), A3 (S377–Y393, P394–G409), B1 (Y461–K482), B2 (S485–G509), and B3 (S518–Y534, I535–N546) all belong to the same monomer. The black continuous line represents the pseudo 2-fold symmetry axis. (b) A drawing of the core of two C-terminal domains in a dimeric unit of Pa AHIR in a similar orientation as in (a). The helical segments A1 (α10), A2 (α11), and A3 (α13, α14) are from the first monomer, and the helical segments B1 (α10), B2 (α11), and B3 (α13, α14) are from the second monomer of the dimeric unit of Pa AHIR. The black continuous line represents the 2-fold symmetry axis. If the two C-terminal domains of Pa AHIR are connected by the red dotted line, the figure-of-eight knot in the dimeric unit of Pa AHIR is identical with that in spinach AHIR. The Figure was drawn with MOLSCRIPT25 and Raster3D.26 (c) A deep figure-of-eight knot.7
monomer–monomer interaction are observed, i.e. a dimeric interaction on the edge around the 2-fold axis and a trimeric interaction at the vertex (or equivalently on the face) around the 3-fold axis (Figure 4). In the Pa AHIR dodecamer, two kinds of pores are formed around the 3-fold axis: a larger pore on the face of the tetrahedron and a smaller pore at the vertex (Figure 4). As far as we are aware, this kind of dodecameric architecture of 23 symmetry that we have found in Pa AHIR has been observed among protein structures only rarely. Another related example is found in
Figure 4. Dodecameric architecture of Pa AHIR. Monomers are colored differently. (a) A view down the 3-fold axis toward the face of a tetrahedron. (b) A view down the 3-fold axis toward the vertex of a tetrahedron.
Figure 5. Structural superposition. (a) Different patterns of dimerization in Pa AHIR (thick lines) and spinach AHIR (thin lines). Domain colors of Pa AHIR are the same as in Figure 2, and the orientation of the dimeric unit of Pa AHIR is similar to that in Figure 2(b). Two monomers of spinach AHIR are colored in yellow and green, respectively. (b) Stereo Cα superposition of a dimeric unit of Pa AHIR and a spinach AHIR monomer. Domain colors of the dimeric unit of Pa AHIR are the same as in Figure 2. The monomer of spinach AHIR is colored in yellow. A pseudo 2-fold symmetry axis in the C-terminal domain of spinach AHIR and the corresponding 2-fold symmetry axis of the dimeric unit of Pa AHIR are nearly perpendicular to the plane of the Figure.
vanadium-dependent bromoperoxidase from the red algae Corallina officinalis.9 However, three kinds of monomer–monomer interface are involved in dodecamer assembly in this case. A different type of dodecamer with 23 symmetry is more usual and has been observed in a number of structures, such as ornithine carbamoyltransferase from Pyrococcus furiosus,10 peptidyl-cysteine decarboxylase EpiD,11 and DNA protection protein Dps from Escherichia coli.12 Their building blocks are best described as trimers, with trimers positioned on the faces or at the vertices of a tetrahedron.
Structural comparison with class II AHIR suggests domain duplication
When we compare the overall structures of Pa AHIR and spinach AHIR, the most obvious difference is in their oligomeric states: Pa AHIR is dodecameric, whereas spinach AHIR is dimeric. It is worth emphasizing that the subunit arrangement in the dimeric unit of Pa AHIR is markedly different from that in the spinach AHIR dimer (Figure 5(a)), as a consequence of the presence of an extended sequence at the C terminus of spinach
AHIR. A sequence alignment of Pa AHIR and spinach AHIR (the mature form, residues M72–A595) is shown in Figure 6. In this alignment, we extend the Pa AHIR sequence by appending an additional C-terminal domain (residues V197–N338) to one monomer sequence for a reason that is explained below. The sequence identity is ~27% when comparing residues M1–T181 of Pa AHIR against residues M72–T306 of spinach AHIR, ~17% when comparing residues T182–N338 of Pa AHIR against residues T307–P462 of spinach AHIR, and ~17% when comparing residues V197–N338 of Pa AHIR against residues F463–A595 of spinach AHIR. When we superimpose the structures of Pa AHIR and spinach AHIR monomers (both with the lowest mean B-factor values among four independent monomers in the asymmetric unit), the r.m.s. deviation is 1.53 Å for 327 common Cα atoms (residues M1–I327 of Pa AHIR and residues A83–P462 of spinach AHIR). The residues M72–S82 are missing from the spinach AHIR model and the residues A328–N338 from the Pa AHIR model. In this superposition, the first half (residues T307–P462) of the C-terminal domain of spinach AHIR, which corresponds to the so-called ~140
Figure 6. Structure-based alignment of Pa AHIR and spinach AHIR amino acid sequences. Secondary structure elements are denoted as cylinders for α-helices or 3₁₀-helices and as arrows for β-strands above the sequence of Pa AHIR and below the sequence of spinach AHIR. In the case of Pa AHIR, the T182–N338 sequence (C-terminal domain of the first monomer of the dimeric unit of Pa AHIR) is boxed in blue and the V197–N338 sequence (corresponding to the C-terminal domain of the second monomer of the dimeric unit of Pa AHIR), appended to the C terminus of one monomer sequence, is boxed in red. The blue and red boxes correspond to domain 2 and domain 3 of spinach AHIR, respectively. The residues of spinach AHIR that are involved in binding metal ions, the inhibitor, and NADPH are marked by red triangles. I, I′, and II–V indicate the conserved regions.14 Regions IV and V that contribute to the formation of the active site in a dimeric unit of Pa AHIR come from the second monomer. Corresponding regions in the first monomer of Pa AHIR are denoted by [IV] and [V]. The assignment of secondary structure elements is the same as in Figure 1(b). This Figure was produced with ALSCRIPT.29
residue plant-specific insert,5,6 overlaps spatially with the C-terminal domain (residues T182–I327) of Pa AHIR. But the second half (residues F463–A595) of the C-terminal domain of spinach AHIR has no counterpart in the monomer model of Pa AHIR. When we superimpose the dimeric unit of Pa AHIR with the monomer model of spinach AHIR, an interesting observation is made. The second half (residues F463–A595) of the C-terminal domain of spinach AHIR overlaps spatially with the C-terminal domain (residues V197–I327) of the second monomer in the dimeric unit of Pa AHIR (Figure 5(b)). Thus, we might consider spinach AHIR to be composed of three domains: domain 1 (residues A83–T306), domain 2 (residues T307–P462), and domain 3 (residues F463–A595). Domain 1 corresponds to the N-terminal NADPH-binding domain, and domains 2 and 3 together constitute the C-terminal all-helical domain, as defined by Biou et al.5 The spinach domain 1 alone (residues A83–T306) is superimposed with the N-terminal domain (residues M1–T181) of Pa AHIR with an r.m.s. deviation of 1.15 Å for 181 common Cα atoms. The spinach domain 2 alone (residues T307–P462) is superimposed with the C-terminal domain (residues T182–I327) of Pa AHIR with an r.m.s. deviation of 1.31 Å for 146 common Cα atoms. The spinach domain 3 alone (residues F463–A595) is superimposed with the C-terminal domain (residues V197–I327) of Pa AHIR, which derives from the second subunit of the dimeric unit, with an r.m.s. deviation of 1.34 Å for 131 common Cα atoms. When we extend the sequence of Pa AHIR by repeating the sequence of residues V197–I327 and superimpose a hypothetical structure corresponding to the extended sequence (residues M1–I327 plus residues V197–I327) with the spinach monomer (residues A83–A595), the r.m.s. deviation is 1.74 Å for 458 common Cα atoms. The results of these structural comparisons suggest that domain 3 of spinach AHIR evolved by duplication of domain 2. This suggestion is also consistent with the presence of an approximate 2-fold symmetry in the spatial arrangement of α-helices in the
Figure 7. Active site and conformational changes. (a) Predicted conformational changes upon binding of Mg2+, NADPH, and the inhibitor in the dimeric unit of Pa AHIR. Domain colors are the same as in Figure 2. One N-terminal domain (red in Figure 2) is not shown in this Figure and only one active site is shown. NADPH and the herbicidal inhibitor are shown as stick models. The open conformation is drawn in dark colors and the predicted closed conformation in light colors. The closed conformation was generated by a superposition of individual domains of Pa AHIR to the corresponding domains of spinach AHIR. (b) The location of the active site in the dimeric unit of Pa AHIR and the spinach AHIR monomer. The two monomers of Pa AHIR are colored in green and purple, respectively, while the spinach AHIR monomer is colored in gray. The boxed area is magnified below. (c) A magnified view in stereo of the superposition of the active site residues. The 11 residues that are involved in the binding of Mg2+, NADPH, and the inhibitor are drawn as stick models: spinach AHIR in black and Pa AHIR in red or blue. Red-colored and blue-colored residues of the dimeric unit of Pa AHIR belong to different monomers.
C-terminal domain (T307–A595) of spinach AHIR, which was first observed by Taylor.7 Therefore, we conclude that the notion that the extra ~140 residue insert in the middle of spinach AHIR is plant-specific5,6 is not justified.
Conformational changes upon binding of Mg2+, NADPH, and inhibitor
The structure of spinach AHIR, a class II enzyme, was determined in complex with Mg2+, NADPH, and a herbicidal competitive inhibitor (PDB code 1YVE),5 and in complex with the reaction product dihydroxymethylvalerate, Mn2+, and (phospho)-ADP-ribose (PDB code 1QMG).13 Since these two structures are virtually identical, our comparison is limited to the inhibitor complex only. Crystallization of the spinach enzyme in the unliganded state was not successful.5,13 In this study, we have determined the structure of Pa AHIR, a class I enzyme, in the unliganded state.
Thus, our structure makes it possible to infer possible conformational changes in AHIRs upon binding of Mg2+, NADPH, and the inhibitor (see below). The smaller differences for comparing individual domains suggest a conformational change corresponding to domain closure upon binding of Mg2+, NADPH, and the inhibitor in the spinach structure. Indeed, when the N-terminal domains of both AHIRs are superimposed, the Cα r.m.s. deviation between the C-terminal domain (residues T182–I327) of Pa AHIR and the spinach domain 2 (residues T307–P462) is increased from 1.31 Å to 5.45 Å, and that between the Pa AHIR C-terminal domain (residues V197–I327) of the second subunit in the dimer and the spinach domain 3 (residues F463–A595) is increased from 1.34 Å to 14.8 Å. The conformational difference corresponds roughly to an opening of the C-terminal domain of Pa AHIR relative to the N-terminal domain by about 7° around the residue T181 (Figure 7(a)).
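The logic of this comparison, fitting on one domain and then measuring the r.m.s. deviation of another, can be sketched with a least-squares (Kabsch) superposition. This is a minimal illustration on synthetic coordinates, not the procedure or software used in this work. The function names, the random "domains", the hinge position, and the overall rigid-body motion are all assumptions made for the example.

```python
import numpy as np


def kabsch(mobile, target):
    """Return (R, t) so that mobile @ R.T + t is the least-squares fit to target."""
    mobile = np.asarray(mobile, dtype=float)
    target = np.asarray(target, dtype=float)
    mobile_center = mobile.mean(axis=0)
    target_center = target.mean(axis=0)
    covariance = (mobile - mobile_center).T @ (target - target_center)
    u, _, vt = np.linalg.svd(covariance)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) > 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = target_center - rotation @ mobile_center
    return rotation, translation


def rmsd(a, b):
    """Root-mean-square deviation between paired coordinate sets."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def rmsd_after_fit(fit_mobile, fit_target, eval_mobile, eval_target):
    """Superimpose on the `fit` atoms, then report the RMSD of the `eval` atoms."""
    rotation, translation = kabsch(fit_mobile, fit_target)
    moved = np.asarray(eval_mobile, dtype=float) @ rotation.T + translation
    return rmsd(moved, eval_target)


def rotation_z(angle_deg):
    theta = np.radians(angle_deg)
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Synthetic two-domain "closed" model (hypothetical coordinates).
rng = np.random.default_rng(0)
closed_n = rng.normal(size=(30, 3)) * 10.0
closed_c = rng.normal(size=(30, 3)) * 10.0 + np.array([30.0, 0.0, 0.0])

# "Open" model: rotate the C-terminal domain by 7 degrees about a hinge,
# then move the whole molecule as a rigid body.
pivot = np.array([15.0, 0.0, 0.0])
hinged_c = (closed_c - pivot) @ rotation_z(7.0).T + pivot
overall = rotation_z(40.0)
shift = np.array([5.0, -3.0, 12.0])
open_n = closed_n @ overall.T + shift
open_c = hinged_c @ overall.T + shift

print(rmsd_after_fit(open_c, closed_c, open_c, closed_c))  # domain fitted on itself: ~0
print(rmsd_after_fit(open_n, closed_n, open_c, closed_c))  # fitted on the N domain: large
```

In this toy model each domain superimposes on its counterpart almost perfectly, while fitting on the N-terminal domain alone leaves the C-terminal domain displaced. This is the signature of a rigid-body domain movement rather than a change within the domains.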
Figure 8. Sequence alignment of AHIRs. Class I: Pa, P. aeruginosa (SWISS-PROT accession code: Q9HVA2); Nm, Neisseria meningitidis (Q9JTI3); Mj, Methanococcus jannaschii (Q58938); Hp, H. pylori (O25097); Dr, Deinococcus radiodurans (Q9RU74); Mt, Mycobacterium tuberculosis (O53248); Tm, Thermotoga maritima (Q9WZ20). Class II (mature form): Pm, Pasteurella multocida (Q9CLF1); Ec, Escherichia coli (P05793); St, Salmonella typhimurium (P05989); Vc, Vibrio cholerae (Q9KVI4); Ba, Buchnera aphidicola (Q9RQ51); Hi, Haemophilus influenzae (P44822). Secondary structure elements of Pa AHIR are indicated above the sequence. The blue and red boxes, red triangles, and region designations are the same as in Figure 6. For class I AHIRs, the sequence of an additional C-terminal domain (residues V197–N338 in the red box) is appended to the C terminus of each sequence, as in Figure 6.
The active site and conserved regions
The active site of spinach AHIR has been well characterized by previous structural studies of complexes.5,13 It is formed within a single monomer of spinach AHIR. The residues of spinach AHIR that are involved in the binding of NADPH, Mg2+, and the herbicidal inhibitor are located in highly similar positions in the structure of the dimeric unit of Pa AHIR (Figure 7(b) and (c)). This similarity makes it possible to define the active site of Pa AHIR, which is contributed by both
N-terminal and C-terminal domains of the first monomer and the C-terminal domain of the second monomer of the dimeric unit. Therefore, dimerization is essential for the formation of the active site in Pa AHIR, and the C-terminal domain plays dual roles in forming the active site. A previous alignment of class II AHIR sequences disclosed that 24 strictly or highly conserved residues are dispersed in six sequence regions I, I′, and II–V.14 The conserved residues were found to cluster around the active site and their functions were assigned.5 The Pa AHIR sequence (as a
dimeric unit) has all of the six conserved regions like class II enzymes, with only two residues being different from spinach AHIR among 24 highly conserved residues.14 Met222 (region IV) and Asn253 (region V) of Pa AHIR replace Glu488 and Thr520 of spinach AHIR, respectively, which were suggested to play a structural role.5 Each of the two active sites in the dimeric unit of Pa AHIR is formed by the conserved regions I, I′, and II (from the N-terminal domain of the first monomer), the conserved region III (from the C-terminal domain of the first monomer), and the conserved regions IV and V (from the C-terminal domain of the second monomer). That is, each C-terminal domain of Pa AHIR provides both the conserved region III for one active site of a dimeric unit and the conserved regions IV–V for the other active site. The regions I (Gly23, Gly25, and Gln27) and I′ (Phe109) are important for NADPH binding. The regions I′ (His107), III (Asp190, Glu194), and IV (Glu226, Glu230) are important for binding metal ions. Glu230 is highly conserved among both class I and class II enzymes (Figure 8, red box) but its equivalent residue Glu496 in spinach AHIR was not listed as one of the 24 conserved residues.5 The region II plays a structural role.
Evolutionary pathway and a figure-of-eight knot in AHIR structures
N-terminal domains of both Pa AHIR and spinach AHIR possess a typical Rossmann fold for binding the cofactor NADP(H) but their C-terminal domains are unique in sequence and fold. We suspect that a primordial class I AHIR probably evolved by fusion of the N-terminal NADP(H)-binding domain and the C-terminal domain, and the fusion protein was active as a dimer. Our structural comparisons argue strongly that duplication of the C-terminal domain of the ancestral class I AHIR occurred later during evolution, resulting in class II AHIRs (Figure 8).
The proposed evolutionary pathway is consistent with the presence of an approximate 2-fold symmetry in the spatial arrangement of α-helices in the C-terminal domain (T307–A595) of spinach AHIR (Figure 3(a)).7 What could be possible advantages of evolution of class I AHIR into class II by domain duplication? One is the possibility of greatly enhanced affinity between two interacting domains when they are on the same polypeptide chain.15 In the dimeric unit of Pa AHIR, the interaction between two monomers is already extensive (Figure 2(a) and (b)) and involves a deep figure-of-eight knot (Figure 3(b)). Thus the evolutionary pressure for enhanced affinity between the C-terminal domains in the dimeric unit of class I AHIRs would not have been high, if there were any. Another conceivable advantage would be relaxation of the constraints for evolution of the two C-terminal half-domains (that is, domain 2 and domain 3) in the case of class II AHIRs. This is because, in a monomer of
class II AHIRs, one of the two C-terminal half-domains (domain 2) contributes to the formation of the active site by providing the conserved region III and the other C-terminal half-domain (domain 3) contributes by providing the conserved regions IV and V. Domain duplication would have allowed the two C-terminal half-domains (domain 2 and domain 3) to undergo evolutionary changes more or less independently to suit the different requirements. It appears that the two C-terminal half-domains of class II AHIR have diverged sufficiently so that their sequences are not readily aligned against each other. In contrast, in class I AHIRs, the C-terminal domain of the first monomer of the dimeric unit and another identical C-terminal domain from the second monomer, together with the N-terminal domain of the first monomer, form each active site. That is, each C-terminal domain must play dual roles in the formation of two active sites in the dimeric unit; each C-terminal domain has to provide simultaneously both the conserved region III for forming one of the two active sites and the conserved regions IV and V for forming the other active site. Thus, evolution of the C-terminal domain in class I AHIRs would be more constrained than in class II. In a careful analysis of the knots in protein structures, Taylor discovered a deep figure-of-eight knot in the C-terminal domain (residues T307–A595) of spinach AHIR.7 He noticed an approximate 2-fold symmetry in the spatial arrangement of α-helices in this domain (Figure 3(a)), and he proposed that swapping of secondary structure elements between duplicated domains could possibly provide a source of knotted proteins. However, we have shown in this study that the dimeric unit of Pa AHIR possesses a deep figure-of-eight knot, essentially identical with that in the monomer of spinach AHIR (Figure 3(a) and (b)), and that the knot has been kept in class II AHIRs after evolution via domain duplication.
Thus, our work does not support the proposal that domain duplication followed by exchange of a secondary structure element can be a source of a deep figure-of-eight knot in the protein structure.7
Conclusions
We have determined the first crystal structure of a class I AHIR. It reveals interesting features, such as unique dodecameric assembly, dual roles of each C-terminal domain in forming the active sites of a dimeric unit, and the presence of a deep figure-of-eight knot. Dimerization is essential for forming the active site. The so-called plant-specific insert in spinach AHIR (class II) is structurally and functionally equivalent to the C-terminal domain of Pa AHIR (class I). The present structure argues strongly that domain duplication occurred during evolution from class I AHIRs to class II AHIRs. The structure lowers the likelihood of the previous proposal about the source of a deep
figure-of-eight knot in the protein structure being correct.
Materials and Methods
Protein production and crystallization
Pa AHIR was overexpressed in E. coli B834(DE3) cells and crystallized as described.16 The SeMet-substituted protein was overexpressed in E. coli B834(DE3) cells, using the M9 cell culture medium containing extra amino acids. Dithiothreitol (10 mM) was added during purification steps. The SeMet-substituted protein was crystallized into cubic crystals using a reservoir solution containing 100 mM sodium cacodylate (pH 6.5) and 1.4 M sodium acetate. This crystallization condition is different from that used for the native protein.16 Crystals of the SeMet-substituted protein grew up to approximate dimensions of 0.4 mm × 0.4 mm × 0.4 mm within a week.
Data collection and structure solution
A crystal of the SeMet-substituted protein was frozen using a cryoprotectant solution containing 25% (v/v) glycerol in the crystallization mother liquor. X-ray diffraction data were collected at 100 K on an ADSC Quantum 4R CCD detector at the BL-18B experimental station of Photon Factory, Japan. Raw data were processed and scaled using the programs MOSFLM†17 and SCALA.18 The SeMet-substituted crystal belongs to the cubic space group P2₁3, with unit cell parameters of a = b = c = 184.48 Å. Data collection from a native crystal was reported previously.16 Table 1 summarizes MAD data collection statistics. All of the 44 expected selenium atoms in four independent monomers in a crystallographic asymmetric unit were located with the program SOLVE19 and the selenium sites were used to calculate the phases with SHARP.20 We subsequently improved the phases by non-crystallographic symmetry (NCS) averaging, solvent flattening, and histogram matching with the program DM.18 Phasing statistics are summarized in Table 1.
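As a rough plausibility check on four monomers per asymmetric unit in this cubic cell, the crystal packing density can be estimated with the standard Matthews coefficient. The sketch below is a worked estimate under stated assumptions and is not a value reported in this work. The average residue mass of 110 Da and the protein partial-volume factor of 1.23 are conventional approximations. The chain length (338 residues plus the eight residue tag), the cell edge, and the four monomers per asymmetric unit are taken from the text, and 12 is the number of symmetry operations in space group P2₁3.

```python
def matthews_coefficient(cell_volume, n_sym_ops, n_mol_per_asu, mol_mass_da):
    """Crystal volume per unit of protein mass, V_M, in cubic angstroms per dalton."""
    return cell_volume / (n_sym_ops * n_mol_per_asu * mol_mass_da)


def solvent_fraction(v_m, protein_factor=1.23):
    """Approximate solvent fraction from V_M (conventional 1.23 factor, an assumption)."""
    return 1.0 - protein_factor / v_m


cell_edge = 184.48                 # angstroms, SeMet crystal (from the text)
cell_volume = cell_edge ** 3       # cubic cell
N_SYM_OPS_P213 = 12                # general positions in P2(1)3
N_MONOMERS_PER_ASU = 4             # from the text

# Assumptions: 338 residues + 8-residue tag, ~110 Da per residue on average.
ASSUMED_RESIDUES = 338 + 8
ASSUMED_MASS_PER_RESIDUE = 110.0
monomer_mass = ASSUMED_RESIDUES * ASSUMED_MASS_PER_RESIDUE

v_m = matthews_coefficient(cell_volume, N_SYM_OPS_P213, N_MONOMERS_PER_ASU, monomer_mass)
solvent = solvent_fraction(v_m)
print(f"V_M ~ {v_m:.2f} A^3/Da, solvent ~ {solvent:.0%}")
```

Under these assumptions the estimate is about 3.4 Å³/Da, or roughly 64% solvent, which is within the range commonly seen for protein crystals.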
The resulting electron density map was interpreted by the automatic model-building program MAID21 to give an initial model that accounted for ~75% of the backbone of the polypeptide chain with much of the sequence assigned. Subsequent manual model building was done using the program O.22 The model was refined with the program CNS,23 including the bulk solvent correction. Subsequent rounds of model building, simulated annealing, positional refinement, and individual thermal factor refinement were then performed. NCS restraints were relaxed in successive rounds of refinement. The refined model of the SeMet-substituted AHIR, accounting for 1308 residues of four independent monomers in each asymmetric unit, gave Rwork and Rfree of 22.3% and 26.1%, respectively.24 Subsequently, this model was used to solve the structure of the native crystal (Table 1). Initially, the model of the native protein was refined with strict NCS restraints to Rwork and Rfree of 27.5% and 28.4% for 20.0–2.00 Å data, respectively. Further rounds of refinement including addition of water molecules, B-factor refinement, and relaxation of NCS restraints reduced Rwork and Rfree to 21.2% and 23.6%, respectively.

† (1) ftp://ftp.mrc-lmb.cam.ac.uk; (2) logon as user “anonymous” password <anything>; (3) cd pub/mosflm; (4) get the README and all other files in this directory.

Protein data bank accession code The atomic coordinates and the structure factors have been deposited in the Protein Data Bank under accession code 1NP3.
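For readers less familiar with the Rwork and Rfree values quoted for the refinement, the sketch below shows the conventional definition, R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, evaluated separately over the working reflections and over the test reflections set aside from refinement.24 It is an illustration only, not the CNS implementation: the function names are hypothetical, the amplitudes are made-up toy numbers, and Fcalc is assumed to be already on the same scale as Fobs.

```python
# Minimal sketch of the crystallographic R-factor; toy data, not from the paper.

def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    Assumes f_calc has already been scaled to f_obs.
    """
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


def work_and_free_r(f_obs, f_calc, is_test):
    """Return (R_work, R_free) given a per-reflection test-set flag."""
    work = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if not t]
    free = [(o, c) for o, c, t in zip(f_obs, f_calc, is_test) if t]
    r_work = r_factor([o for o, _ in work], [c for _, c in work])
    r_free = r_factor([o for o, _ in free], [c for _, c in free])
    return r_work, r_free


# Hypothetical structure-factor amplitudes for five reflections.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0]
f_calc = [90.0, 84.0, 66.0, 38.0, 25.0]
is_test = [False, False, False, True, True]  # last two kept out of refinement

r_work, r_free = work_and_free_r(f_obs, f_calc, is_test)
print(f"R_work = {r_work:.3f}, R_free = {r_free:.3f}")
```

Because the test reflections are never used to adjust the model, a gap between the two values that stays small, as in the final native refinement reported above, is commonly read as a sign that the model has not been overfitted.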
Acknowledgements We thank Professor N. Sakabe and his staff for assistance during data collection at the Photon Factory, beamline BL-18B. We acknowledge the assistance of the staff at the Pohang Light Source, beamline 6B. This work was supported by a grant from the Korea Ministry of Science and Technology (NRL-2001, grant no. M1-0104-00-0132). H.J.A., S.J.E., H.J.Y., and B.I.L. are recipients of the BK21 fellowship.
References
1. Umbarger, H. E. (1996). Biosynthesis of the branched-chain amino acids. In Escherichia coli and Salmonella (Neidhardt, F. C., ed.), vol. 1, pp. 442–457, ASM Press, Washington, DC.
2. Chunduru, S. K., Mrachko, G. T. & Calvo, K. C. (1989). Mechanism of ketol acid reductoisomerase. Steady-state analysis and metal ion requirement. Biochemistry, 28, 486–493.
3. Dumas, R., Job, D., Ortholand, J.-Y., Emeric, G., Greiner, A. & Douce, R. (1992). Isolation and kinetic properties of acetohydroxy acid isomeroreductase from spinach (Spinacia oleracea) chloroplasts overexpressed in Escherichia coli. Biochem. J. 288, 865–874.
4. Singh, B. K. & Shaner, D. L. (1995). Biosynthesis of branched chain amino acids: from test tube to field. Plant Cell, 7, 935–944.
5. Biou, V., Dumas, R., Cohen-Addad, C., Douce, R., Job, D. & Pebay-Peyroula, E. (1997). The crystal structure of plant acetohydroxy acid isomeroreductase complexed with NADPH, two magnesium ions and a herbicidal transition state analog determined at 1.65 Å resolution. EMBO J. 16, 3405–3415.
6. Dumas, R., Biou, V., Halgand, F., Douce, R. & Duggleby, R. G. (2001). Enzymology, structure, and dynamics of acetohydroxy acid isomeroreductase. Accts Chem. Res. 34, 399–408.
7. Taylor, W. R. (2000). A deeply knotted protein structure and how it might fold. Nature, 406, 916–919.
8. Jones, S. & Thornton, J. M. (1996). Principles of protein–protein interactions. Proc. Natl Acad. Sci. USA, 93, 13–20.
9. Isupov, M. N., Dalby, A. R., Brindley, A. A., Izumi, Y., Tanabe, T., Murshudov, G. N. & Littlechild, J. A. (2000). Crystal structure of dodecameric vanadium-dependent bromoperoxidase from the red algae Corallina officinalis. J. Mol. Biol. 299, 1035–1049.
10. Villeret, V., Clantin, B., Tricot, C., Legrain, C., Roovers, M., Stalon, V. et al. (1998). The crystal structure of Pyrococcus furiosus ornithine carbamoyltransferase reveals a key role for oligomerization in enzyme stability at extremely high temperatures. Proc. Natl Acad. Sci. USA, 95, 2801–2806.
11. Blaesse, M., Kupke, T., Huber, R. & Steinbacher, S. (2000). Crystal structure of the peptidyl-cysteine decarboxylase EpiD complexed with a pentapeptide substrate. EMBO J. 19, 6299–6310.
12. Grant, R. A., Filman, D. J., Finkel, S. E., Kolter, R. & Hogle, J. M. (1998). The crystal structure of Dps, a ferritin homolog that binds and protects DNA. Nature Struct. Biol. 5, 294–303.
13. Thomazeau, K., Dumas, R., Halgand, F., Forest, E., Douce, R. & Biou, V. (2000). Structure of spinach acetohydroxy acid isomeroreductase complexed with its reaction product dihydroxymethylvalerate, manganese and (phospho)-ADP-ribose. Acta Crystallog. sect. D, 56, 389–397.
14. Dumas, R., Butikofer, M.-C., Job, D. & Douce, R. (1995). Evidence for two catalytically different magnesium-binding sites in acetohydroxy acid isomeroreductase by site-directed mutagenesis. Biochemistry, 34, 6026–6036.
15. Marcotte, E. M., Pellegrini, M., Ng, H.-L., Rice, D. W., Yeates, T. O. & Eisenberg, D. (1999). Detecting protein function and protein–protein interactions from genome sequence. Science, 285, 751–753.
16. Eom, S. J., Ahn, H. J., Yoon, H.-J., Lee, B. I., Bae, S. H., Baek, S. H. & Suh, S. W. (2002). Crystallization and preliminary X-ray crystallographic analysis of acetohydroxy acid isomeroreductase from Pseudomonas aeruginosa. Acta Crystallog. sect. D, 58, 2145–2146.
17. Leslie, A. G. W. (1997). User Guide, MOSFLM Version 5.50, MRC Laboratory of Molecular Biology, Cambridge, England.
18. Collaborative Computational Project, Number 4 (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallog. sect. D, 50, 760–763.
19. Terwilliger, T. C. & Berendzen, J. (1999). Automated MAD and MIR structure solution. Acta Crystallog. sect. D, 55, 849–861.
20. de la Fortelle, E. & Bricogne, G. (1997). Maximum-likelihood heavy-atom parameter refinement for multiple isomorphous replacement and multiwavelength anomalous diffraction methods. Methods Enzymol. 276, 472–494.
21. Levitt, D. G. (2000). A new software routine that automates the fitting of protein X-ray crystallographic electron density maps. Acta Crystallog. sect. D, 57, 1013–1019.
22. Jones, T. A., Zou, J.-Y., Cowan, S. W. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallog. sect. A, 47, 110–119.
23. Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W. et al. (1998). Crystallography and NMR system: a new software suite for macromolecular structure determination. Acta Crystallog. sect. D, 54, 905–921.
24. Brünger, A. T. (1992). The free R-value: a novel statistical quantity for assessing the accuracy of crystal structures. Nature, 355, 472–474.
25. Kraulis, P. J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallog. 24, 946–950.
26. Merritt, E. A. & Bacon, D. J. (1997). Raster3D: photorealistic molecular graphics. Methods Enzymol. 277, 505–524.
27. Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallog. 26, 283–291.
28. Nicholls, A. & Honig, B. (1991). A finite difference algorithm, utilizing successive over relaxation to solve the Poisson–Boltzmann equation. J. Comput. Chem. 12, 435–445.
29. Barton, G. J. (1993). ALSCRIPT: a tool to format multiple sequence alignments. Protein Eng. 6, 37–40.
Edited by D. Rees (Received 14 November 2002; received in revised form 12 February 2003; accepted 17 February 2003)