Structure of SAICAR synthetase from Pyrococcus horikoshii OT3: Insights into thermal stability


International Journal of Biological Macromolecules 53 (2013) 7–19

Contents lists available at SciVerse ScienceDirect

International Journal of Biological Macromolecules journal homepage: www.elsevier.com/locate/ijbiomac

Kavyashree Manjunath a, Shankar Prasad Kanaujia b, Surekha Kanagaraj c, Jeyaraman Jeyakanthan c, Kanagaraj Sekar a,∗

a Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560 012, India
b Department of Biotechnology, Indian Institute of Technology, Guwahati 781 039, India
c Department of Bioinformatics, Alagappa University, Karaikudi 630 003, Tamilnadu, India

Article info

Article history: Received 29 August 2012; received in revised form 25 October 2012; accepted 26 October 2012; available online 5 November 2012

Keywords: SAICAR synthetase; Purine de novo biosynthesis; Pyrococcus horikoshii OT3; Hyperthermophile; Thermostable proteins

Abstract

The first native crystal structure of phosphoribosylaminoimidazole-succinocarboxamide synthetase (SAICAR synthetase) from the hyperthermophilic organism Pyrococcus horikoshii OT3 was determined in two space groups, H3 (Type-1: resolution 2.35 Å) and C222₁ (Type-2: resolution 1.9 Å). Both are dimeric, but the Type-1 structure exhibited a hexameric arrangement due to the presence of cadmium ions. The sequences and structures of all SAICAR synthetases were compared to better understand the differences between mesophilic, thermophilic and hyperthermophilic SAICAR synthetases. These SAICAR synthetases are reasonably similar in sequence and three-dimensional structure; differences were visible only in the subtler details of percentage composition of the sequences, salt bridge interactions and non-polar contact areas. © 2012 Elsevier B.V. All rights reserved.

1. Introduction

Pyrococcus horikoshii OT3 is a hyperthermophilic anaerobic archaeon that was isolated from hydrothermal fluid samples of the Okinawa Trough vents at a depth of 1395 m [1]. These organisms grow at an optimal temperature of 98 °C, but are capable of surviving at 105 °C, over a pH range of 5–8 (optimal at pH 7) and an NaCl concentration of 1–5% (optimal value of 2.4%) [1]. The complete genome sequence of this organism has been determined [2]. These extremophiles have evolved highly robust mechanisms, including exceptionally stable proteins, to adapt to extreme conditions [3,4]. A recent review [5] describes the different strategies they adopt to survive at extremely high temperatures. To mention a few, these organisms have histones to facilitate DNA compaction, high concentrations of linear polyamines (spermines and spermidines) to stabilize DNA and branched-chain polyamines to stabilize tRNA. Reverse gyrase, which is present only in hyperthermophiles, introduces positive supercoiling into DNA, stabilizing it further. In addition, they exhibit various differences in protein sequences and structures. Many studies have been carried out to reveal the possible evolutionary strategies of such proteins [6–15].

∗ Corresponding author. Tel.: +91 80 22933059/22933060; fax: +91 80 23600683. E-mail addresses: [email protected], [email protected] (K. Sekar). 0141-8130/$ – see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.ijbiomac.2012.10.028

Extensive experimental and theoretical studies have explored the sequence, structure and dynamic features that distinguish mesophilic, thermophilic and hyperthermophilic proteins. Statistical studies on a large number of protein sequences from mesophiles, thermophiles and hyperthermophiles have yielded several observations unique to thermostable proteins. A higher Ala content, a preference for Lys to Arg, a higher percentage of charged residues, fewer polar uncharged residues and more hydrophobic residues are some of the important observations [9]. Structural studies carried out on highly thermostable proteins from the hyperthermophilic bacterium Thermotoga maritima [16] concluded that proteins adopt different strategies for thermostability, mainly involving hydrophobic and ionic interactions. Thermostable proteins exhibit various features such as an enhanced hydrophobic core [17] (with some exceptions [18]), an increased number of salt bridges [13,19], more aromatic and cation-pi interactions [9] and shorter loop regions [20]. They also exhibit increased hydrogen bonding interactions [21] (except in a few cases [22]), disulfide bonds [23], higher oligomerization states [24] and fewer cavities [22]. Further, some evidence suggests that at ordinary temperatures hyperthermophilic proteins are less flexible than their mesophilic homologues [25]; however, some studies disagree with this observation [26]. The present work is based on the enzyme SAICAR synthetase (238 residues; 27,436 Da) from a hyperthermophilic organism, Pyrococcus horikoshii OT3. This enzyme is involved in the de novo purine


K. Manjunath et al. / International Journal of Biological Macromolecules 53 (2013) 7–19

Fig. 1. De novo purine biosynthesis pathway in Pyrococcus horikoshii.

biosynthesis pathway. Nucleotides are biosynthesized either by the salvage pathway, which joins already available bases with ribose sugar units, or by the de novo pathway, in which nucleotide bases are assembled from simpler compounds. Purine nucleotide biosynthesis was first described by Buchanan [27]. De novo purine biosynthesis [28] consists of 11 steps in bacteria and fungi, and only ten steps in some archaea and higher eukaryotes including humans [29]. The difference arises during the conversion of 5-aminoimidazole ribonucleotide (AIR) to carboxyaminoimidazole ribonucleotide (CAIR) [30,31]. The organism Pyrococcus horikoshii OT3 utilizes a single-step catalysis by AIR carboxylase (PurE class II), as illustrated in Fig. 1. Interestingly, the enzymes involved in the de novo purine biosynthesis pathway can be important drug targets [32,33]. The enzyme SAICAR synthetase (E.C. 6.3.2.6) catalyzes the formation of N-succinyl-5-aminoimidazole-4-carboxamide ribonucleotide (SAICAR) from CAIR [34] and aspartic acid in the presence of ATP. In archaea (purC), bacteria (purC), fungi (ADE1) and plants (pur7), this reaction is catalyzed by a mono-functional enzyme, but in higher eukaryotes it is catalyzed by a bifunctional enzyme, PAICS, which has both AIR carboxylase and SAICAR synthetase activity. A total of 16 three-dimensional crystal structures of SAICAR synthetase from different organisms have been deposited in the Protein Data Bank (PDB). The first crystal structure was solved from S. cerevisiae (PDB-id: 1a48 [35]) at a resolution of 1.9 Å. Subsequently, several crystal structures of the native enzyme and its complexes from S. cerevisiae (PDB-ids: 1obg, 1obd, 2cnu, 2cnv and 2cnq), T. maritima (PDB-id: 1kut [36]), E. coli (PDB-ids: 2gqs and 2gqr [37]), G. kaustophilus (PDB-id: 2ywv), M. jannaschii (PDB-ids: 2z02 and 2yzl), E. chaffeensis (PDB-id: 3kre), C. perfringens (PDB-id: 3nua), H. sapiens (PDB-id: 2h31 [38]) and M. abscessus (PDB-id: 3r9r) have been reported. According to the published reports, the enzyme is a monomer in S. cerevisiae, a covalent dimer in T. maritima and a non-covalent dimer in E. coli, while the bifunctional enzyme PAICS in human is an octamer. In the present

work, we report the crystal structure of the native SAICAR synthetase from P. horikoshii OT3 in two different space groups. It is noteworthy that this is the first uncomplexed SAICAR synthetase structure from a hyperthermophilic organism. The sequences and structures of all reported SAICAR synthetases have been examined to distinguish between mesophilic, thermophilic and hyperthermophilic proteins. In the following text, the SAICAR synthetases from C. perfringens, E. coli, E. chaffeensis, G. kaustophilus, H. sapiens, M. abscessus, M. jannaschii, P. horikoshii, S. cerevisiae and T. maritima are abbreviated as CpSS, EcSS, EhSS, GkSS, HsSS, MaSS, MjSS, PhSS, ScSS and TmSS, respectively.

2. Materials and methods

2.1. Protein purification

The protein SAICAR synthetase was purified according to the protocol described in our previous work [39], with slight modifications. The clone of gene PH0239 in pET11a was transformed into E. coli BL21-CodonPlus (DE3)-RIL cells. The transformed colonies were grown at 37 °C in LB medium containing 50 µg/ml ampicillin and 34 µg/ml chloramphenicol. After 4 h of post-induction growth (0.05 mM IPTG), the cells were pelleted, re-suspended in lysis solution and lysed by sonication. After heat treatment and centrifugation of the lysate, the solution was desalted using a Sephadex G-25 (GE Healthcare) desalting column. The desalted protein solution was loaded onto an anion exchange column, Sepharose Q (GE Healthcare), and eluted with a linear gradient of 0–0.5 M NaCl in buffer A (20 mM Tris–HCl, pH 8.0). The fractions containing protein were concentrated and loaded onto a Sephacryl S200 (GE Healthcare) gel filtration column pre-equilibrated with buffer A containing 0.2 M NaCl. Fractions containing pure protein were pooled and concentrated to 10–14 mg/ml, as determined by measuring the absorbance at 280 nm. The purity of the protein was confirmed by SDS–PAGE.


Unless otherwise mentioned, all the columns used in purification were pre-equilibrated in buffer A.

2.2. Crystallization

The purified protein sample was screened for crystallization conditions using Hampton crystal screen kits. Crystals were obtained in two different space groups, H3 (Type-1) and C222₁ (Type-2). The crystallization condition for Type-1 has been described previously [39]. Type-2 crystals were obtained in a different condition using the under-oil method. The crystallization drop contained 1 µl of protein (∼10 mg/ml) and 1 µl of condition number 13 of crystal screen kit II, containing 0.2 M ammonium sulfate, 0.1 M sodium acetate trihydrate, pH 4.6 and 30% (w/v) PEG monomethyl ether 2000. Well-shaped crystals of good diffraction quality appeared within a week.

2.3. Data collection, structure solution and refinement of Type-1 crystal

For the Type-1 crystal, the MAD data (collected previously using synchrotron radiation) had some problems during refinement in the P3₁ space group with a hexamer in the asymmetric unit. A second data set, from a SeMet protein crystal, was collected at 100 K at the home source using Cu Kα radiation on a MAR345 image plate detector with a Bruker Microstar Ultra rotating anode X-ray generator available at the Molecular Biophysics Unit, Indian Institute of Science, Bangalore. The crystal diffracted up to 2.35 Å resolution and the data were indexed and processed in the space group H3 (space group no. 146) using IMOSFLM [40]. The crystallization condition contained cadmium and initial scaling indicated an overall RMS correlation ratio (RCR) greater than one (1.227); thus, the anomalous pairs were kept separate during merging in SCALA. The data had a completeness of 100% and an anomalous completeness of 100%. The overall Rmerge for all reflections was 8.2% with an average mosaicity of 0.57°. Data collection and processing statistics are given in Table 1.
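As a reading aid for the Rmerge values quoted in this section, the following minimal sketch computes Rmerge = Σ|I_i(hkl) − ⟨I(hkl)⟩| / ΣI_i(hkl) from repeated intensity measurements. The function name, the input layout and the toy intensities are illustrative assumptions and are not taken from the deposited data. For simplicity an unweighted mean is used for ⟨I(hkl)⟩, whereas the Table 1 footnote specifies a weighted average, and singly measured reflections are skipped, which is a common convention rather than something stated in the text.

```python
from collections import defaultdict


def r_merge(observations):
    """Return Rmerge for an iterable of ((h, k, l), intensity) pairs.

    Symmetry-equivalent measurements are assumed to be already mapped
    onto the same (h, k, l) key.
    """
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        if len(intensities) < 2:
            # A single measurement carries no agreement information.
            continue
        mean_i = sum(intensities) / len(intensities)  # unweighted mean (assumption)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += sum(intensities)

    if denominator == 0:
        raise ValueError("no multiply measured reflections")
    return numerator / denominator


# Hypothetical toy measurements, for illustration only.
toy = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((0, 2, 1), 50.0), ((0, 2, 1), 40.0), ((0, 2, 1), 45.0),
]
print(f"Rmerge = {100 * r_merge(toy):.1f}%")
```

In practice this statistic is reported by the scaling program (SCALA in this work), overall and per resolution shell.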
The program SFCHECK [41] indicated the presence of twinning, with a twinning fraction of 0.064, which was later supported by the H-test, L-test and Britton analysis, confirming mild partial hemihedral twinning along (k, h, −l). It also indicated the tentative presence of a pseudotranslation along the vector (0.667, 0.333, 0.000), with a peak at 21.2% of the origin peak height, higher than the other off-origin peaks. Preliminary studies suggested the presence of two chains in the asymmetric unit with a total cell volume of 1172271.5 Å³, a Matthews coefficient (Vm) [42] of 2.37 Å³/Da and 48.2% solvent content. The structure was solved by molecular replacement with PHASER [43], using the three-dimensional atomic coordinates of SAICAR synthetase from M. jannaschii (PDB-id: 2yzl; sequence identity of 49.6%) as the search model. Refinement was carried out using REFMAC [44] with intermediate rounds of model building using COOT [45]. Of the total reflections, 5% were set aside for calculating the Rfree value and the refinement was carried out without twinning, as the twinning fraction was found to be negligible. Non-crystallographic symmetry restraints were applied to both chains in the asymmetric unit and were maintained until the final refinement. Cadmium ions were located by an anomalous peak search using the CAD and FFT programs available in the CCP4 package. The Rwork and Rfree of the final refined model were 23.5% and 28.6%, respectively. The structure was validated using the ADIT server available at RCSB. The final structure had 89.5% of the residues in the most favored, 10% in additionally allowed and the remaining 0.5% in generously allowed regions of the Ramachandran plot [46]. The final refinement statistics are summarized in Table 1. The final refined model (Fig. 2a) consisted of two chains in the asymmetric unit with 3529 protein atoms, 165 water oxygen atoms, 12 cadmium ions, four sulfate ions, one acetate ion and one 1,4-butanediol molecule. The atomic


Table 1
Data collection and refinement statistics for Type-1 and Type-2 crystals.

                                              Type-1                       Type-2
Data collection and processing statistics
  Wavelength (Å)                              1.5418                       1.5418
  Temperature (K)                             100                          100
  Crystal-to-detector distance (mm)           200                          150
  Space group                                 H3                           C222₁
  Unit cell parameters (Å)                    a = b = 95.42; c = 148.63    a = 44.10; b = 155.39; c = 78.35
  Resolution range (Å)                        55.26–2.35 (2.48–2.35)       42.42–1.90 (2.0–1.9)
  Observed reflections                        86,788                       113,532
  Unique reflections                          21,057 (3112)                21,450 (3041)
  Completeness (%)                            100.0 (100.0)                98.9 (97.5)
  Rmerge (%)†                                 8.2 (40.0)                   4.2 (19.1)
  I/σ(I)                                      13.2 (3.1)                   24.1 (8.0)
  Multiplicity                                4.1 (4.0)                    5.3 (5.2)
  Matthews coefficient (Å³ Da⁻¹)              2.37                         2.45
  Solvent content (%)                         48.2                         49.8
  Z                                           2                            1

Refinement statistics
  Rwork (%)                                   23.5                         18.4
  Rfree (%)                                   28.6                         22.9

Protein model
  Protein atoms                               3529                         1806
  Water oxygen atoms                          165                          211
  Metal ions (Cd2+)                           12                           –
  Others (BU1, SO4, ACT)                      1 BU1, 4 SO4, 1 ACT          2 SO4, 1 ACT

RMS deviations from ideal geometries
  Bond lengths (Å)                            0.012                        0.007
  Bond angles (°)                             1.29                         0.99

Average temperature factors (Å²)
  Protein atoms                               25.38                        22.36
  Water molecules                             25.37                        29.75
  Metals                                      52.69                        –
  Others                                      43.98                        35.01

Ramachandran statistics (%)
  Most favored                                89.5                         91.0
  Additionally allowed                        10.0                         8.5
  Generously allowed region                   0.5                          0.5

† $R_{\mathrm{merge}} = \sum_{hkl}\sum_{i} \left| I_i(hkl) - \langle I(hkl) \rangle \right| \big/ \sum_{hkl}\sum_{i} I_i(hkl)$, where $I_i(hkl)$ is the ith observed intensity and $\langle I(hkl) \rangle$ is the weighted average intensity for multiple measurements. Values within parentheses correspond to the outermost resolution shell.
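The Matthews coefficient and solvent content rows of Table 1 can be checked with a short calculation. The sketch below is a minimal illustration under stated assumptions: the cell volumes and the molecular mass (27,436 Da) are the values quoted in the text, whereas the numbers of symmetry operators (9 for H3 in the hexagonal setting, 8 for C222₁) and the constant 1.23 in the usual solvent-fraction approximation are standard crystallographic values that the paper does not state explicitly. The function names are hypothetical.

```python
def matthews_coefficient(cell_volume_a3, n_symops, n_chains_asu, mw_da):
    """Vm in Å^3/Da: cell volume divided by the protein mass in the unit cell."""
    return cell_volume_a3 / (n_symops * n_chains_asu * mw_da)


def solvent_fraction(vm):
    """Approximate solvent fraction, 1 - 1.23/Vm (standard approximation)."""
    return 1.0 - 1.23 / vm


MW_DA = 27436.0  # molecular mass of PhSS quoted in the Introduction

# Type-1: space group H3 (9 operators assumed), two chains per asymmetric unit.
vm_type1 = matthews_coefficient(1172271.5, 9, 2, MW_DA)
# Type-2: space group C222(1) (8 operators assumed), one chain per asymmetric unit.
vm_type2 = matthews_coefficient(536909.3, 8, 1, MW_DA)

for label, vm in (("Type-1", vm_type1), ("Type-2", vm_type2)):
    print(f"{label}: Vm = {vm:.2f} A^3/Da, solvent = {100 * solvent_fraction(vm):.1f}%")
```

With these inputs the Vm values round to the tabulated 2.37 and 2.45 Å³/Da; the computed solvent content can differ from the tabulated value in the last decimal place depending on whether the rounded or unrounded Vm is used.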

coordinates and structure factors (PDB-id: 3U54) have been deposited in the PDB.

2.4. Data collection, structure solution and refinement of Type-2 crystal

Type-2 crystal data were collected at 100 K using the home source. The crystal diffracted up to a resolution of 1.9 Å and the data were indexed and processed in the C222₁ space group using IMOSFLM. The data had an overall completeness of 98.9% with an overall Rmerge of 4.2% (mosaicity = 0.94°). The data collection and processing statistics are given in Table 1. The cell content analysis indicated a single polypeptide chain in the asymmetric unit with a total cell volume of 536909.3 Å³ (Vm = 2.45 Å³/Da and solvent content = 49.8%). The Type-1 (chain A) structure was used as a search model to obtain the solution for the Type-2 data using the program PHASER. Refinement and model building were carried out using REFMAC [44] and COOT [45], respectively. A total of 5% of the reflections were set aside for calculating the Rfree value. At the end of refinement, after several rounds of model building, the Rwork and Rfree were reduced to 18.4% and 22.9%, respectively. The structure was validated using the ADIT server. The final refined model had 91% of the residues in the most favored, 8.5% in the additionally allowed and the remaining 0.5% in generously allowed regions of the Ramachandran plot. The final refinement statistics are summarized in Table 1. The refined model (Fig. 2b) has one chain in the asymmetric unit with 1806 non-hydrogen protein atoms, 211 water oxygen atoms, two sulfate ions and an acetate ion. The atomic coordinates and structure factors (PDB-id: 3U55) have been deposited in the PDB.

Fig. 2. (a) The overall three-dimensional structure of the Type-1 crystal is shown together with the secondary structural elements. The cadmium ions of the two chains are labeled and colored in violet and green, respectively. The sulfate ions, acetate ion and butanediol are also labeled. (b) The overall three-dimensional structure of the Type-2 crystal is shown along with the sulfate and acetate ions. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

2.5. Sequence and structure analysis

The sequences were retrieved from UNIPROT; the program ClustalW [47] was used for multiple sequence alignment and the alignment was rendered using ESPript [48]. The three-dimensional structures were downloaded from the PDB. Incomplete structures were modeled using the server SWISS-MODEL [49–51] and COOT was used for building the missing residues. Energy minimization of the partially modeled structures was done using GROMACS v4.5.3 [52], with the OPLS-AA (optimized potentials for liquid simulations all atom) force field [53] and the TIP4P water model [54], using the conjugate gradient method with a convergence criterion of 1 kJ mol⁻¹ nm⁻¹. A dodecahedron box was chosen with a distance of 1.2 nm between the protein and the box wall. The charges of the system were neutralized by replacing water molecules with either Na+ or Cl− ions. The long-range electrostatic interactions were treated with the Particle Mesh Ewald (PME) method [55] with a Fourier spacing of 0.12 nm combined with fourth-order interpolation. The cut-offs for the short-range neighbor list, Coulombic and van der Waals (vdW) interactions were 1.4 nm, 1.4 nm and 1.0 nm, respectively. A switch potential was applied from 0.9 nm onwards for treating vdW forces. The bond lengths were constrained with the LINCS algorithm [56]. Structural alignments were carried out using MUSTANG [57]. The PISA (http://www.ebi.ac.uk/pdbe/prot_int/pistart.html) web server [58] was used to analyze the interfaces and quaternary structures. The in-house server PDB Goodies [59] was used for some calculations on the PDB files. The figures were generated using PyMOL (DeLano Scientific; http://www.pymol.org). The plugin DSSP [60] was used to generate secondary structures. Volume calculations were done using the server 3V: Voss Volume Voxelator [61], and the radius of gyration of each protein structure was calculated using the HYDROPRO utility [62]. Secondary structure contents were calculated using the 2Struc server [63]. HBPLUS [64] was used for calculating hydrogen bonds (with a donor–acceptor cut-off distance of 3.5 Å and a donor–hydrogen–acceptor angle of at least 90°) and salt bridges were detected using the WHATIF server [65]. The atomic accessibility was deduced using the program NACCESS [66]. NACCESS provides the absolute residue surface accessibility (RSA) value of each residue, which is the sum of the atomic accessibilities of that residue. The non-polar contact areas in the proteins were calculated using pdb_np_cont and clustering was done using the pdb_np_clus program [67]. In addition, locally generated PERL scripts were also used in the structure analysis.

3. Results and discussion

3.1. Structure description

3.1.1. Overview

Among the nine known enzymes involved in the purine biosynthetic pathway in P. horikoshii, six utilize ATP for catalysis [GAR synthetase (PurD), FGAR synthetase (PurT), FGAM synthetase II (PurLQS), AIR synthetase (PurM), SAICAR synthetase (PurC) and FAICAR synthetase (PurP)]. The two domains of SAICAR synthetase (PurC) correspond to two domains (the ‘C’ substrate-binding and ‘B’ ATP-binding domains) of the ATP-grasp family; however, the enzyme lacks a counterpart of the ‘A’ domain present in the ATP-grasp family [68]. According to the SCOP classification, SAICAR synthetase from P. horikoshii OT3 belongs to the α + β architecture. The protein crystallized in two different space groups, namely H3 (Type-1, Fig. 2a) with two chains in the asymmetric unit and C222₁ (Type-2, Fig. 2b) with a single chain in the asymmetric unit. The single chain has approximate dimensions of 50 Å × 50 Å × 40 Å and two domains (small and large). The small domain ‘A’ (residues 12–81) includes six β-strands (β1–β6) and an α-helix (α1). The large domain ‘B’ (residues 82–238) consists of eight β-strands (β7–β14), five helices (α2–α6) and a 3₁₀ helix. The electron density is clearly visible only from the 12th residue onwards. Mass spectrometric analysis of the protein indicated (data not shown) that almost 90% of the species in the sample have a mass corresponding to the full-length protein, but the SeMet-derivatized protein showed multiple peaks corresponding to the full-length and fragmented proteins. Thus, it is difficult to say whether the missing region (the first eleven residues) is due to cleavage or disorder. The Type-1 refined model contains two chains in the asymmetric unit (A and B) (Fig. 2a) with 3529 non-hydrogen protein atoms, 165 water oxygen atoms, 12 cadmium (Cd2+) ions, four sulfate ions, an acetate ion and a 1,4-butanediol molecule. The


overall RMSD (root mean square deviation) between the two chains is 0.13 Å and no significant deviations were observed between the chains except near the N-terminal region. The cadmium sites were confirmed by the presence of anomalous peaks calculated using CCP4 tools. Among the 12 Cd2+ ions in the refined model, three [CD400(A), CD401(A) and CD401(B)] are strongly coordinated by residues from both chains, leading to a pseudo-interface (different from the true interface) within the asymmetric unit (Fig. 2a). The residues Glu158(A,B) and Asp162(A,B) coordinate CD400(A); Glu111(B) and His63(A) coordinate CD401(A); and Glu111(A), His63(B) and a 1,4-butanediol molecule coordinate CD401(B). The ions CD402(A,B), CD403(A,B) and CD404(A,B) occupy identical positions in their respective chains. The residues Glu89(A,B) and Asp128(A,B) coordinate CD402(A,B), His127(A,B) and Asp35(A,B) coordinate CD403(A,B), and Glu81(A,B) and Asp181(A,B) coordinate CD404(A,B). In addition, there are water molecules in the coordination shells of the cadmium ions, but they are not uniformly visible in both chains. The remaining three cadmium ions are present in positions unique to each chain. For CD405(A), the nearest acidic residues, Glu125 and Asp124, are 4.07 Å and 4.31 Å away, respectively, which is relatively far for coordination. The cadmium ion CD405(B) is coordinated by Glu125(B), the symmetry-equivalent residues of chain A [Asp21(A), Asp19(A) and Lys22(A)] and a water molecule, HOH315(A). These four residues are essential for the coordination, but the corresponding position in chain A does not have a cadmium ion, which may be due to the lack of a similar combination of the four coordinating residues. The cadmium ion CD406(B) is coordinated by the residues Glu226(B) and Glu229(B).
The sulfate ions SO4 420(A,B), designated as the first sulfate ion, have ionic interactions with the guanidinium groups of Arg93 and Arg198 and the backbone nitrogen of Ser99 in their respective chains. The other sulfate ions, SO4 421(A,B), designated as the second sulfate ion, coordinate with Lys210, the backbone nitrogen of Phe34 and Arg214 in their corresponding chains. The PISA server predicted that the interface between chain B and a symmetry-equivalent (−y, x − y, z) molecule of chain A forms the most probable or true interface, with an interfacial area of 1111.5 Å² (and a predicted solvation free energy gain upon formation of the interface, ΔiG, of −18.4 kcal/mol). The dimeric orientation in the asymmetric unit does not represent the true orientation of the dimer in solution, because its interfacial area is only 241.5 Å² (ΔiG of −1.1 kcal/mol). The server predicted two types of quaternary arrangements, the first being a hexamer (Fig. 3a) with a buried surface area of 16,350 Å² (with a solvation free energy gain upon formation of the assembly, ΔintG, of −432.4 kcal/mol) and the second being a dimer (Fig. 3b) with a buried surface area of 4530 Å² (ΔintG of −132.2 kcal/mol). In the dimeric assembly, the true interface consists of 32 residues (Leu100 to Leu106, Tyr110, Leu112 to Leu119, Tyr121, Asn123, Leu126, Pro129 to His135, Lys137 to Leu139, Lys147, Glu150 and Leu154) from each chain (A and B). A total of four residues from each chain form hydrogen bonds at the interface [Asn132(B) with Val117(A), Tyr134(B) with Glu150(A), Val117(B) with Asn132(A) and Glu150(B) with Tyr134(A)]. Further, three residues from each chain form salt bridges at the interface [His135(B) with Glu118(A), Glu118(B) with His135(A) and Glu118(B) with His135(A)].
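The symmetry operator (−y, x − y, z) quoted above acts on fractional coordinates. The following minimal sketch shows how such an operator generates a symmetry mate and confirms that it behaves as a threefold rotation about the c axis. The cell lengths are those of the Type-1 crystal in Table 1; the 120° γ angle of the hexagonal setting, the function names and the example point are assumptions made for illustration and do not come from the deposited coordinates.

```python
import math

# Type-1 unit cell lengths (Å) from Table 1; gamma = 120° is assumed
# for the hexagonal setting and is not quoted in the text.
A, B, C = 95.42, 95.42, 148.63
GAMMA = math.radians(120.0)


def apply_symop(frac):
    """Apply the operator (-y, x - y, z) to fractional coordinates."""
    x, y, z = frac
    return (-y, x - y, z)


def frac_to_cart(frac):
    """Convert fractional to Cartesian coordinates (alpha = beta = 90°)."""
    u, v, w = frac
    return (A * u + B * math.cos(GAMMA) * v, B * math.sin(GAMMA) * v, C * w)


def distance_from_c_axis(frac):
    """Distance of a point from the c axis, in Å."""
    x, y, _ = frac_to_cart(frac)
    return math.hypot(x, y)


point = (0.10, 0.25, 0.40)   # hypothetical atom position
mate = apply_symop(point)    # its symmetry-equivalent position

# Applying the operator three times returns the starting position.
back = apply_symop(apply_symop(mate))
print(mate, back)
```

Because the operator is a pure rotation about c, the distance from the axis and the z coordinate are unchanged, which is a quick sanity check when generating the symmetry-related chain for interface analysis.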
In the hexameric assembly, one of the interfaces (interface 1) is the same as the true interface observed in the dimeric assembly, and the other (interface 2) is the pseudo-interface, which has 12 residues from each chain, ten of which (His63, Glu111, Pro113, Glu114, Lys155, Glu158, Lys161, Asp162, Ala165 and Lys166) are common to chains A and B. The remaining two residues are Glu62 and Ile159 (chain A) and Leu112 and Ile170 (chain B). The Type-2 refined model has a single polypeptide chain (Fig. 2b) in the asymmetric unit with 1806 non-hydrogen protein atoms, 211 water molecules, two sulfate ions and an acetate ion. The


Fig. 3. (a) Hexameric state of the Type-1 crystal. The small domain is colored in red, the large domain in gold and the cadmium ions in violet; the true dimeric interface (interface-1) is colored blue and the pseudo-interface (interface-2) is colored green. (b) The identical true dimeric interface observed in the Type-1 and Type-2 forms. The interface is colored in blue and the small and large domains are colored in red and gold, respectively. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

sulfate ions are coordinated to the same residues as found in the Type-1 structure. The acetate ion is found interacting with the guanidinium group of Arg103, with Met130 and with a water molecule. The quaternary structure search using the PISA server predicted only a dimeric assembly with an interface area of 1159 Å² (ΔiG = −17.7 kcal/mol), which is similar to the true interface (interface-1) of the Type-1 structure. In contrast to the Type-1 structure, a hexamer was not predicted in this case, and all the residues identified at the true dimeric interface of Type-1 (except Lys147) are also present at the interface in this structure. In addition to the hydrogen bonds and salt bridges found at the interface of the Type-1 structure, two more hydrogen bonds [Arg103(A) with Met130(A′) and Met130(A) with Arg103(A′)] and an additional salt bridge [between His135(A) and Glu118(A′)] are found (the symmetry-equivalent molecule is denoted A′).
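The interface hydrogen bonds listed in this section were identified with HBPLUS using the two geometric criteria given in the Methods: a donor–acceptor distance of at most 3.5 Å and a donor–hydrogen–acceptor angle of at least 90°. The sketch below applies only those two criteria to three atom positions. It is an illustration and not a reimplementation of HBPLUS, which applies further checks; the function names and the coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _norm(v):
    return math.sqrt(sum(x * x for x in v))


def dha_angle(donor, hydrogen, acceptor):
    """Donor-hydrogen-acceptor angle in degrees, measured at the hydrogen."""
    v1 = _sub(donor, hydrogen)
    v2 = _sub(acceptor, hydrogen)
    cos_t = sum(a * b for a, b in zip(v1, v2)) / (_norm(v1) * _norm(v2))
    cos_t = max(-1.0, min(1.0, cos_t))  # guard against rounding error
    return math.degrees(math.acos(cos_t))


def is_hydrogen_bond(donor, hydrogen, acceptor, max_da=3.5, min_angle=90.0):
    """Apply the two criteria quoted in the Methods (distances in Å)."""
    da_distance = _norm(_sub(acceptor, donor))
    return da_distance <= max_da and dha_angle(donor, hydrogen, acceptor) >= min_angle


# Hypothetical coordinates (Å): a linear D-H...A arrangement.
donor, hydrogen = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
print(is_hydrogen_bond(donor, hydrogen, (2.9, 0.0, 0.0)))  # short and linear
print(is_hydrogen_bond(donor, hydrogen, (4.0, 0.0, 0.0)))  # donor-acceptor too far
```

Hydrogen positions are usually absent from crystal structures at these resolutions, so a program such as HBPLUS first places them geometrically before such a test can be applied.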


3.1.2. Comparison of Type-1 and Type-2 structures

Minor differences are found when the topologies of the Type-1 and Type-2 structures are examined. A short 3₁₀ helix between β3 and β4 is found in both Type-1 and Type-2 structures, but a second 3₁₀ helix between the strands β9 and β10 is present only in the chains of the Type-1 structure, while it is a turn in the Type-2 structure. The Type-1 (chain B) and Type-2 structures superpose with an overall RMSD of 0.62 Å. The residues deviating by more than 0.5 Å (nine regions) are labeled in Fig. 4. Two of the most deviating regions (Asp29 to Gly44 and Lys122 to Asp128) are in the vicinity of the cadmium ions CD402(B), CD403(B) and CD405(B) (Type-1). Two other regions (Val96 to Ile116 and Arg214 to Lys217) deviate substantially (especially the former) even though they are not coordinated to any cadmium ions. However, the region Val96 to Ile116 is crucial as it contributes to the true interface of the dimer. Upon comparing the two structures (Type-1 and Type-2), it is quite clear that when the two regions Asp29 to Gly44 and Lys122 to Asp128 move toward each other, as seen in Type-1, there is a drastic conformational change in the region Val96 to Ile116, located at the true dimeric interface. These structural deviations may be analogous to a situation in which the binding of the ligands ATP or CAIR at the active site influences the deviations at the dimeric interface. The Type-1 structure more closely resembles, although not completely, other SAICAR synthetase complex structures. Thus, it may be concluded that cadmium binding induces a structural deviation that is significant only near the active site and provides a possible clue to the allosteric signal between the active site and the true interface.

3.1.3. Substrate binding sites in SAICAR synthetase

The PhSS structures (Type-1 and Type-2) were compared with EcSS and ScSS to identify a probable substrate binding region and the residues in the active site of PhSS. The structure superposition with EcSS shows an RMSD of 1.35 Å for 223 residues. The residues Lys11, Ala12 and Lys13, which stabilize the α and β phosphate groups of ADP in EcSS, are conserved in PhSS, although the electron density of Lys11 is not visible in PhSS. The residues Leu24 and Met86 in EcSS, which interact with the adenine ring of ADP, correspond to Leu23 and Met85 in PhSS. From the above, it is clear that these residues (Lys11, Ala12, Lys13, Leu23 and Met85) are essential for the binding of ATP. Thus, the N-terminal residues along with seven β-strands (β12, β13, β1, β2, β6, β5 and β7) and a loop connecting β13 and β14 form the binding pocket for ATP. The conserved residues Arg93, Ser99 and Arg198 of PhSS are probably the CAIR binding residues, which correspond to Arg94, Ser100 and Arg199 in EcSS. The position of the first sulfate ion, in both Type-1 and Type-2, corresponds to the position of the phosphate group of CAIR in the EcSS structure. Further, the position of the second sulfate ion corresponds to the carboxylic group of the SAICAR moiety in ScSS or the hetero-atom position in the aspartic acid (Asp1308) bound structure of ScSS (PDB-id: 2cnu). The platform of the cleft for the binding of CAIR is formed by the β-strands β8, β9, β11 and β14, while the 3₁₀ helix, α4 and the loops between β9–β10 and β3–β4 form a supporting ridge-like structure for the binding of CAIR. By comparison with the structure of ScSS (PDB-id: 2cnu), the most probable binding site of aspartic acid is believed to be near the turn between β3 and β4, where the second sulfate ion was found in the present structures.

Fig. 4. The superposition of Type-1 and Type-2 structures. Secondary structures, α-helices (purple), β-sheets (orange), 3₁₀ helices (green) and turns (cyan), are assigned based on DSSP and colored accordingly in both structures. The Type-1 structure is colored in a light shade while the Type-2 structure is colored in a dark shade. It can be observed that only the Type-1 structure has a second 3₁₀ helix, while it is a turn in the Type-2 structure. All the deviating regions (>0.5 Å) are labeled and the cadmium ions are colored yellow. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)

3.1.4. Dimeric interface
The Type-1 structure was predicted to be a hexamer (Fig. 3a) by the PISA server. This hexamer is a dimer of trimers placed one above the other with a 60° rotation with respect to each other. In this arrangement, two different dimeric interfaces are observed. One of them is the crystallographic interface (pseudo-interface/interface-2), held by strong coordinating interactions with the cadmium ions. The other is the true dimeric interface (interface-1), as predicted in the Type-2 structure. The absence of a hexameric arrangement in the Type-2 structure indicates that the hexamer may be a crystallographic artifact. As described earlier, the interfacial residues of the true dimeric assembly of both Type-1 and Type-2 structures (Fig. 3b) come from a 3₁₀ helix (which is common to Type-1 and Type-2), β9, β10 and α2. Examining the oligomeric assembly of SAICAR synthetases from different organisms shows that the structures of EcSS, EhSS, GkSS, CpSS, MjSS and TmSS are dimeric, while the structures of ScSS and MaSS exist as monomers and the bifunctional HsSS is an octamer. It was previously reported [69] that EcSS exists as a trimer in solution; however, the crystal structure revealed that it is a dimer [37]. The presence of an additional β-turn between a 3₁₀ helix and the strand β9 (according to PhSS) probably prevents dimerization in the structures of MaSS and ScSS. In the case of the octameric assembly of HsSS, it is clear that the SAICAR synthetase domain from one chain has a weak interaction with the corresponding domain of the other chain, as observed in the dimers of other structures.
The dimeric interfacial areas of SAICAR synthetases from different organisms, EcSS (996.5 Å²), EhSS (942.1 Å²), GkSS (858.9 Å²), HsSS (451.1 Å²), CpSS (1118.2 Å²), MjSS (1053 Å²), TmSS (943.9 Å²) and PhSS (1111.5 Å² for Type-1 and 1159 Å² for Type-2), are found to be similar except for HsSS, whose interfacial area is significantly smaller than that of all other dimers.

3.1.5. Comparison with other structures
Multiple sequence alignment of all SAICAR synthetase sequences (Fig. 5a) shows the conservation of the two signature motifs and other additional residues (Gly10, Lys11, Lys122, Asp190, Arg198, Arg214, Asp209, Lys210, Ala32, Asp195, Lys45 and Leu60 of PhSS). A careful examination of the PhSS structure shows that, among these conserved residues (other than the two signature motifs), the first four residues are near the ATP binding region while the next four residues are present near the CAIR and aspartate binding sites. The next three residues (Ala32, Asp195 and Lys45) are present between the ATP and CAIR binding sites and the residue Leu60 lies in the position of the helix α1. In the case of the ScSS sequence, there are three major insertions consisting of 21

K. Manjunath et al. / International Journal of Biological Macromolecules 53 (2013) 7–19


Fig. 5. (a) Multiple sequence alignment of PhSS, MjSS, GkSS, CpSS, EhSS, TmSS, HsSS, EcSS, ScSS and MaSS using the program ClustalW. The signature motifs 1 and 2 are highlighted in green box. The residues highlighted in red box are conserved in all the sequences and those outlined in blue box are semi conserved. (b) A structure based sequence alignment obtained from Mustang (numbering is according to the residues aligned). (c) Superposition of the structures of MjSS (brown), GkSS (brown), CpSS (brown), EhSS (brown), EcSS (brown), TmSS (red), HsSS (green), ScSS (cyan), MaSS (cyan) with PhSS (dark blue). It clearly shows that the structures in brown are very similar to PhSS, while TmSS and HsSS are slightly deviated. Monomeric SAICARs (ScSS and MaSS) are significantly deviated. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)


(Asp71 to Ala91), 13 (Tyr133 to Pro145) and 18 (Asp265 to Gln282) residues and one minor insertion of six residues (Ala247 to Gly252). Similar insertions are also observed in MaSS. Thus, it is concluded that the two signature motifs and the additional conserved residues lay the platform for the binding of CAIR and ATP. To look for changes in the structure, a pairwise structural alignment of all SAICAR synthetases has been carried out against PhSS. It shows an RMSD of less than 1.36 Å with CpSS, EcSS, EhSS, GkSS and MjSS and an RMSD of more than 1.86 Å with TmSS, HsSS, MaSS and ScSS. The first five structures are therefore more similar to PhSS than the last four. In all the superposed structures, deviations are more pronounced at the N- and C-terminal regions. Structure-based sequence alignment of all SAICAR structures against PhSS (Fig. 5b) shows the conservation of all the active site residues near the ATP or CAIR binding sites. Fig. 5c shows the structural superposition of all SAICAR synthetases against PhSS, and it is clear that the monomeric SAICARs are somewhat different from the dimeric SAICARs. From the sequence and the structural alignments, it is difficult to delineate the mesophilic, thermophilic and hyperthermophilic SAICAR synthetases.

3.2. Thermal stability analysis

3.2.1. Amino acid composition
The amino acid composition is examined to look for features that could delineate mesophilic, thermophilic and hyperthermophilic SAICAR synthetases. Among them (EcSS, EhSS, CpSS, GkSS, ScSS, MaSS, HsSS, MjSS, TmSS and PhSS), two (ScSS and MaSS) are monomeric, one (HsSS) is a bifunctional octamer and another (TmSS) is a dimer with a disulfide bond between the monomers. The remaining six are non-covalent dimers. The bifunctional enzyme (HsSS) is excluded from the composition analysis. Due to the lack of experimental results, the melting temperature Tm is assumed to correlate with the optimal temperature To of the organisms.
The percentages mentioned in the rest of this section are calculated by averaging the composition of the amino acid residues separately for mesophilic, thermophilic and hyperthermophilic proteins. The percentage composition of each residue (Table 2; higher percentage compositions are in bold and lower percentage compositions are shaded) highlights that only two residues (Lys and Gln) delineate hyperthermophiles from mesophiles. For instance, Lys is ∼4% more abundant in hyperthermophiles, while Gln is marginally less abundant, by ∼1.6%. On examining further, in hyperthermophiles the charged residues (D + E + H + R + K) together are ∼6% more and the polar uncharged residues (S + T + N + Q) are ∼7% less than in mesophiles. The value of (D + E + H + R + K)/Q is observed to be extraordinarily high in hyperthermophiles compared with the corresponding values for thermophiles and mesophiles. This is the first report to distinguish hyperthermophiles from mesophiles based on the (D + E + H + R + K)/Q ratio. This could be further confirmed with a larger data set of hyperthermophilic proteins. However, no conclusions could be drawn about the uniqueness of the hydrophobic residues in hyperthermophiles. Further, the thermophilic protein (GkSS) has fewer Lys residues but more Arg than hyperthermophiles, while its Ala and Leu contents are higher and its Met content is lower than in mesophiles and hyperthermophiles. Surprisingly, in GkSS (a thermophile with To of 60 °C), the compositions of the charged residues (D + E + H + R + K) and polar uncharged residues (S + T + N + Q) are comparable to those of mesophiles, which contradicts the general observation that polar residues are fewer and charged residues are more abundant in thermophiles than in mesophiles. However, GkSS has a high composition of aliphatic residues (A + I + L + V), which is ∼6.1% higher than in mesophiles and 4.7% higher than in hyperthermophilic proteins.

Monomeric enzymes have some unique features. The percentage composition of Trp is ∼2.24 times greater than in the rest of the SAICARs, while Pro is ∼3.5% greater than in mesophilic and ∼2.24% greater than in thermophilic and hyperthermophilic SAICARs, contributing to conformational rigidity. The composition of polar uncharged residues (N + Q) is closer to that of hyperthermophilic or thermophilic enzymes than to that of mesophilic enzymes, specifically for the residue Asn.
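The composition measures used in this section, namely the percentage of each residue, the grouped percentages of charged (D + E + H + R + K) and polar uncharged (S + T + N + Q) residues, and the (D + E + H + R + K)/Q ratio, can be computed as in the following minimal sketch. The helper names are hypothetical and the sequence is an invented toy string, not a SAICAR synthetase sequence.

```python
from collections import Counter

CHARGED = "DEHRK"          # charged residues as grouped in the text
POLAR_UNCHARGED = "STNQ"   # polar uncharged residues as grouped in the text

def composition(sequence):
    """Percentage composition of each residue in a one-letter sequence."""
    sequence = sequence.upper()
    counts = Counter(sequence)
    return {aa: 100.0 * n / len(sequence) for aa, n in counts.items()}

def group_percent(comp, group):
    """Summed percentage of the residues listed in `group`."""
    return sum(comp.get(aa, 0.0) for aa in group)

def charged_to_gln_ratio(comp):
    """(D + E + H + R + K)/Q; infinite when the sequence contains no Gln."""
    gln = comp.get("Q", 0.0)
    if gln == 0.0:
        return float("inf")
    return group_percent(comp, CHARGED) / gln

# Toy sequence of 20 residues (hypothetical, for illustration only).
toy = "MKKDEEHRQSTNAALLVVII"
comp = composition(toy)
charged = group_percent(comp, CHARGED)
polar = group_percent(comp, POLAR_UNCHARGED)
ratio = charged_to_gln_ratio(comp)
print(f"charged = {charged:.1f}%, polar uncharged = {polar:.1f}%, ratio = {ratio:.1f}")
```

To reproduce group averages like those described above, the per-protein percentages would be averaged separately over the mesophilic, thermophilic and hyperthermophilic sequences.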


Table 2. Percentage composition of amino acids of SAICAR synthetase structures.
a (M) Mesophile; (T) Thermophile; (H) Hyperthermophile.

These monomeric enzymes notably have more short-chain polar residues (S + T) and fewer charged residues (D + E + H + K + R) compared to other SAICARs. Finally, highly hydrophobic residues (L + I + F + W + V + M) are fewer in monomeric SAICARs. Thus, it may be concluded that the monomeric SAICARs resemble mesophilic dimeric SAICARs in terms of charged and polar (S + T) residue content. On the contrary, they also resemble thermophilic and hyperthermophilic SAICARs in terms of N + Q content. It is concluded that, in the case of SAICAR synthetase, the monomers and dimers have to be analyzed separately for thermal stability.

Table 3. Radius of gyration and secondary structures.

| Source (group)^a | Rg^b (nm) | Reff^c (Å) | B^d | E | G | I | H | S | T | Coil | % Loop (S + B + I) |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CpSS (M) | 1.81 | 10.29 | 0.8 | 35.3 | 1.3 | 0 | 29.4 | 6.3 | 13.0 | 13.9 | 7.1 |
| EcSS (M) | 1.81 | 10.51 | 0.8 | 31.6 | 1.3 | 0 | 30.0 | 6.3 | 11.8 | 18.1 | 7.1 |
| EhSS (M) | 1.86 | 10.25 | 0.8 | 33.1 | 1.2 | 0 | 25.2 | 7.4 | 16.5 | 15.7 | 8.2 |
| GkSS (T) | 1.85 | 10.12 | 0.8 | 33.1 | 1.2 | 0 | 30.2 | 8.7 | 9.1 | 16.9 | 9.5 |
| MjSS (H) | 1.89 | 10.16 | 1.2 | 30.2 | 1.2 | 0 | 29.8 | 7.9 | 14.0 | 15.7 | 9.1 |
| PhSS (H) | 1.86 | 9.98 | 1.3 | 31.1 | 2.5 | 0 | 27.7 | 7.6 | 14.3 | 15.5 | 8.9 |
| TmSS (H) | 1.85 | 9.33 | 0.4 | 30.9 | 0 | 0 | 33.0 | 6.1 | 11.3 | 18.3 | 6.5 |

a (M) Mesophilic; (T) Thermophilic; (H) Hyperthermophilic.
b Radius of gyration.
c Effective radius – radius of a sphere whose surface area to volume ratio is same as the object in question. Reff in thermophiles and hyperthermophiles are highlighted.
d Percentage of secondary structures, assigned by DSSP: B, beta-bridge; E, strand; G, 3₁₀ helix; I, π helix; H, alpha helix; S, bend; T, turn; Coil, random coil.
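The two size measures reported in Table 3 can be written down compactly. The radius of gyration is the mass-weighted root-mean-square distance of the atoms from their centre of mass. For the effective radius, a sphere of radius R has a surface-area-to-volume ratio of 3/R, so matching that ratio to a protein of volume V and surface area A gives Reff = 3V/A. The sketch below is a minimal illustration under these definitions. It is not the GROMACS or volume-calculation code used in the study, and the function names are hypothetical.

```python
import math

def radius_of_gyration(coords, masses=None):
    """Mass-weighted RMS distance of atoms from their centre of mass.

    `coords` is a list of (x, y, z) tuples; `masses` defaults to unit masses.
    The result is in the same length unit as the coordinates.
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    centre = [sum(m * c[i] for m, c in zip(masses, coords)) / total
              for i in range(3)]
    weighted_sq = sum(m * sum((c[i] - centre[i]) ** 2 for i in range(3))
                      for m, c in zip(masses, coords))
    return math.sqrt(weighted_sq / total)

def effective_radius(volume, surface_area):
    """Radius of the sphere with the same surface-area-to-volume ratio."""
    return 3.0 * volume / surface_area

# Sanity checks on simple shapes (toy inputs, not protein data).
cube = [(x, y, z) for x in (-1.0, 1.0) for y in (-1.0, 1.0) for z in (-1.0, 1.0)]
rg_cube = radius_of_gyration(cube)

sphere_volume = 4.0 / 3.0 * math.pi * 10.0 ** 3
sphere_area = 4.0 * math.pi * 10.0 ** 2
reff_sphere = effective_radius(sphere_volume, sphere_area)
print(f"Rg(cube corners) = {rg_cube:.3f}; Reff(sphere, R = 10) = {reff_sphere:.1f}")
```

Because Reff depends on the ratio V/A, a more compact fold with less exposed surface per unit volume gives a larger Reff, whereas Rg depends only on how the mass is distributed around the centre.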


Table 4. Total number of hydrogen bonds and number of buried hydrogen bonds.

| Source | No. of HB^a | MM^b | SS^c | MS/SM^d | Buried MM | Buried SS | Buried SM/MS |
|---|---|---|---|---|---|---|---|
| CpSS^e | 238 | 148 | 44 | 46 | 102 | 2 | 16 |
| EcSS^e | 224 | 148 | 40 | 36 | 98 | 4 | 11 |
| EhSS^e | 220 | 133 | 42 | 45 | 93 | 2 | 10 |
| GkSS^f | 232 | 134 | 55 | 43 | 94 | 6 | 12 |
| MjSS^g | 232 | 155 | 37 | 40 | 116 | 2 | 10 |
| PhSS^g | 220 | 140 | 45 | 35 | 94 | 2 | 10 |
| TmSS^g | 204 | 127 | 44 | 33 | 93 | 5 | 11 |

a Hydrogen bond.
b Main chain – main chain hydrogen bond.
c Side chain – side chain hydrogen bonds.
d Main chain – side chain/side chain – main chain hydrogen bonds.
e Mesophiles.
f Thermophiles.
g Hyperthermophiles.
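The buried fraction of hydrogen bonds can be derived directly from the counts in Table 4. The sketch below assumes that the percentage of buried hydrogen bonds is the sum of the three buried categories divided by the total number of hydrogen bonds. This definition is an assumption made for illustration, since the exact formula is not spelled out in the text. The dictionary and function names are hypothetical.

```python
# Rows copied from Table 4:
# (total HB, MM, SS, MS/SM, buried MM, buried SS, buried SM/MS)
TABLE4 = {
    "CpSS": (238, 148, 44, 46, 102, 2, 16),
    "EcSS": (224, 148, 40, 36, 98, 4, 11),
    "EhSS": (220, 133, 42, 45, 93, 2, 10),
    "GkSS": (232, 134, 55, 43, 94, 6, 12),
    "MjSS": (232, 155, 37, 40, 116, 2, 10),
    "PhSS": (220, 140, 45, 35, 94, 2, 10),
    "TmSS": (204, 127, 44, 33, 93, 5, 11),
}

def buried_percent(row):
    """Buried hydrogen bonds as a percentage of all hydrogen bonds (assumed definition)."""
    total, _mm, _ss, _ms, buried_mm, buried_ss, buried_ms = row
    return 100.0 * (buried_mm + buried_ss + buried_ms) / total

# Rank the structures from the highest to the lowest buried percentage.
ranked = sorted(TABLE4, key=lambda name: buried_percent(TABLE4[name]), reverse=True)
for name in ranked:
    print(f"{name}: {buried_percent(TABLE4[name]):.1f}% buried")
```

Under this assumed definition, MjSS and TmSS rank highest, which agrees with the statement in the text that the buried percentage is higher for these two structures but not for GkSS or PhSS.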

3.2.2. Overall structural features
The structures of CpSS (3nua), EcSS (2gqr), EhSS (3kre), GkSS (2ywv), MjSS (2z02), TmSS (1kut) and PhSS (Type-2) are analyzed. Missing residues in the SAICAR synthetase structures are modeled and missing side chains are built. Finally, the structures are energy minimized using the OPLS-AA force field in GROMACS before being subjected to interaction analysis. It is reported that the compactness of the protein increases from mesophiles to hyperthermophiles [14]. The radius of gyration (Rg), which is a measure of the compactness of

Table 5. Classification of salt-bridges based on RASB values.
a Total number of salt bridges. Greater number of SBs in hyperthermophiles is in bold.
b RASB = (Sum of absolute accessibility)/(Sum of standard accessibility) × 100. This value is calculated for the residues involved in SB formation. Relatively higher percentages of SBs in hyperthermophiles compared to mesophiles (in each RASB bin) are indicated in bold with a lighter background. Relatively lower percentages of SBs in hyperthermophiles compared to mesophiles (in each RASB bin) are indicated in bold with a darker background.
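The salt-bridge strength classes and the RASB value used for Table 5 can be expressed as two small functions. This is a minimal sketch under the definitions given in the text: distances of 4 Å or less are strong, 4–6 Å weak, and 6 Å or more weaker, and RASB is the summed absolute accessibility of the two residues divided by their summed standard accessibility, times 100. The function names and the accessibility numbers in the usage lines are hypothetical placeholders, not values from the study.

```python
def classify_salt_bridge(distance):
    """Strength class of a salt bridge from its distance in angstroms."""
    if distance <= 4.0:
        return "strong"
    if distance < 6.0:
        return "weak"
    return "weaker"

def rasb(absolute_accessibility, standard_accessibility):
    """Relative accessibility of a salt bridge, in percent.

    Both arguments are pairs holding one value per residue of the salt bridge,
    in the same area units.
    """
    return 100.0 * sum(absolute_accessibility) / sum(standard_accessibility)

# Placeholder accessibilities for one residue pair (hypothetical numbers).
example_rasb = rasb((60.0, 30.0), (200.0, 100.0))
classes = [classify_salt_bridge(d) for d in (3.2, 4.0, 5.2, 6.0, 6.9)]
print(f"RASB = {example_rasb:.1f}; classes = {classes}")
```

A higher RASB indicates a relatively exposed salt bridge, and a value near zero a buried one.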


Table 6. Clustering of non-polar contact areas in the SAICAR synthetase structures.

Columns: Cut-off^a, CpSS, EcSS, EhSS, GkSS, MjSS, PhSS, TmSS.
Cut-off rows: Nil, 5, 10, 15, 20, 25, 30, 31, 32, 33, 34, 35, 36, 37, 38, 41.

Each data series below corresponds to one structure (one column of the table) and is listed in cut-off order, starting with the total non-polar contact area at cut-off Nil. Later entries give the cluster sizes, with the contact area of the clusters in parentheses. The assignment of the series to the individual structures is not recoverable from the flattened layout.

Series 1: 7255.60; 224 (7236.10); 206 (7080.55); 185-4 (6853.33); 4-175 (6678.63); 8-6-121-10 (5745.11); 87-3 (3997.37); 3-76 (3589.88); 23-3-4 (1171.23); 3 (77.03); 3 (77.03); 3 (77.03); 3 (77.03); –
Series 2: 7110.20; 216 (7076.15); 204 (6993.95); 182-3 (6706.11); 3-5-150 (6114.05); 11-6-113-4 (5449.87); 76-6-4 (3867.34); 6-75-4 (3836.54); 74-4 (3615.63); 71 (3410.17); 59 (2924.83); –
Series 3: 7291.50; 227 (7243.90); 209 (7061.15); 188-4 (6857.46); 175-3 (6590.46); 123-6-15 (5649.36); 3-6-12-61 (3494.58); 3-6-9-61 (3395.00); 3-56 (2742.75); 3 (85.60); 3 (85.60); 3 (85.60); 3 (85.60); 3 (85.60); 3 (85.60); 3 (85.60)
Series 4: 7849.25; 225-3 (7810.25); 206-4 (7647.12); 192-3 (7461.39); 179-3 (7231.79); 17-6-130-3 (6505.61); 91-9 (4701.32); 90-9 (4670.93); 88 (4250.69); 70 (3513.83); 70 (3513.83); 69 (6957.72); 19 (875.20); –
Series 5: 7675.05; 225 (7626.05); 214 (7541.65); 193 (7238.25); 174 (6882.40); 12-4-7-130 (6180.67); 5-8-80-5-6 (4555.05); 8-60-9-5-4-6 (4078.53); 8-57-8-6 (3675.21); 7-19 (1161.27); 7-19 (1161.27); 7 (260.11); 7 (260.11); 7 (260.11); –
Series 6: 7820.65; 228 (7775.75); 211 (7599.85); 193-4 (7182.85); 178 (7051.75); 140-9 (6250.78); 10-73 (3776.12); 7-73 (3671.44); 72-7 (3640.22); 72-6 (3607.90); –
Series 7: 7072.05; 219 (7037.55); 199 (6873.70); 184-3-4 (6733.65); 173 (6455.30); 8-6-121-3 (5423.40); 3-6-67 (3371.91); 3-3-6 (335.17); 3-3 (145.78); 3-3 (145.78); 3 (76.21); 3 (76.21); 3 (76.21); 3 (76.21); –

a Cut-off for the non-polar interactions. The first data line is the non-polar contact area with no cut-off. Subsequent data represent the number of residues in each cluster at each cut-off. The non-polar contact area of each cluster is indicated in bold font below the clusters.
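The clustering summarized in Table 6 can be pictured as a graph problem: residues are nodes, two residues are linked when their pairwise non-polar contact area reaches the cut-off, and linked groups with at least three members are reported as clusters. The sketch below implements this idea with a simple connected-components search. It is an illustrative assumption about the procedure and does not reproduce the algorithm of the clustering tool used in the study. The function names and the contact values are hypothetical.

```python
from collections import defaultdict

def cluster_contacts(contacts, cutoff, min_size=3):
    """Connected groups of residues linked by contacts of at least `cutoff`.

    `contacts` maps (residue_i, residue_j) -> pairwise non-polar contact area.
    Only groups with at least `min_size` members are returned.
    """
    graph = defaultdict(set)
    for (i, j), area in contacts.items():
        if area >= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    seen, clusters = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:                      # depth-first walk over linked residues
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        if len(members) >= min_size:
            clusters.append(sorted(members))
    return clusters

def cluster_area(contacts, cluster, cutoff):
    """Summed area of the contacts above the cut-off inside one cluster."""
    members = set(cluster)
    return sum(area for (i, j), area in contacts.items()
               if area >= cutoff and i in members and j in members)

# Toy pairwise contact areas (hypothetical numbers).
toy_contacts = {(1, 2): 40.0, (2, 3): 35.0, (3, 4): 12.0,
                (5, 6): 50.0, (6, 7): 8.0, (7, 8): 9.0}

for cutoff in (10, 30, 45):
    clusters = cluster_contacts(toy_contacts, cutoff)
    sizes = "-".join(str(len(c)) for c in clusters) or "none"
    area = sum(cluster_area(toy_contacts, c, cutoff) for c in clusters)
    print(f"cut-off {cutoff}: cluster sizes {sizes}, area {area:.1f}")
```

Raising the cut-off removes weaker links, so large clusters fragment and eventually disappear, which is the qualitative behaviour seen down the rows of Table 6.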

protein, did not show any distinction between the mesophilic and hyperthermophilic/thermophilic proteins (Table 3). However, the effective radius (Reff), which is the radius of a sphere that has the same surface area to volume ratio as the protein in question, is observed to be marginally smaller in thermophilic and hyperthermophilic proteins (Table 3). It has been reported that the loop content is comparatively lower in thermostable proteins [20]. A close examination of the content of the secondary structural elements (Table 3; calculated using 2Struc according to DSSP and considering the ‘loop’ as the sum of bend, β-bridge and π-helix) shows that the thermophilic (GkSS, 9.5%) and hyperthermophilic proteins (MjSS, 9.1%; PhSS, 8.9%) in fact have a higher percentage of loops than the mesophilic proteins (CpSS, 7.1%; EcSS, 7.1%; EhSS, 8.2%), with the exception of TmSS (6.5%). The atomic packing and the packing density of the proteins have also been investigated (data not shown), but they did not yield any correlation with temperature.

3.2.3. Interaction analysis
Intra-molecular interactions such as hydrogen bonds and salt bridges in the protein structures are analyzed to decipher the differences in thermostability. Hydrogen bonding interactions (Table 4) revealed that the number of hydrogen bonds is similar in mesophilic, thermophilic and hyperthermophilic SAICARs. The percentage of buried hydrogen bonds (those for which the sum of the accessibility of both atoms involved in the hydrogen bond is zero) is higher in the case of MjSS and TmSS but not in the case of GkSS or PhSS. Thus, among the set of proteins considered in this study, an increased number of hydrogen bonds compared to mesophilic proteins is not observed in thermophilic or hyperthermophilic proteins, contrary to the general opinion that the number of hydrogen bonds increases with the thermophilicity of the protein. The salt-bridge (SB) is a long-range interaction compared to the hydrogen bonding interaction.
Salt-bridge interaction distances (distances between the positively charged residues Arg, Lys and His and the negatively charged residues Asp and Glu) that are less than or equal to 4 Å are considered strong [70], those of 4–6 Å are weak and those of 6 Å or more are considered weaker. The thermophilic/hyperthermophilic proteins did not have a greater number of strong SBs (within cut-off distances of 4 Å and 5 Å) compared to mesophiles, except for PhSS. However, the higher cut-off distances (6 Å and 7 Å) revealed a positive correlation between the number of SBs and the Tm of the protein. Thus, hyperthermophilic proteins exhibited a higher number of weaker SBs than mesophilic proteins. In the case of PhSS, the SBs are high in number (at all distance cut-offs) compared to all other proteins, contributing a dominant feature for stability. In GkSS, as mentioned above, the percentage composition of the charged residues is comparable to that of mesophilic proteins; as a result, the number of its SBs is also closer to that of mesophilic proteins. In order to calculate the relative accessibility of the salt-bridges, the absolute accessibility values (RSA) of the residue pairs involved in SB formation are added and the sum is divided by the sum of the standard accessibility of these residues. The resulting ratio is multiplied by 100. This value is designated as RASB (relative accessibility of SBs). Within each distance cut-off (4–7 Å) of SBs, the percentages of SBs with certain values of RASB are clustered together (Table 5). Table 5 shows the number of SBs in each structure at various cut-offs (4 Å, 5 Å, 6 Å and 7 Å) and the percentage of SBs in the corresponding RASB bins. The values in bold font (light background) indicate a higher percentage of (relatively exposed) SBs in hyperthermophiles compared to mesophiles. Further, the values shown in bold font with dark background indicate a lower percentage of (relatively buried) SBs in hyperthermophiles compared to mesophiles. It can be inferred from Table 5 that the percentage of SBs having higher RASBs is greater in hyperthermophiles, especially at the higher cut-offs (6 Å and 7 Å). However, the enzyme GkSS does not show such a trend. To conclude, most of the SBs (weak or strong) tend to reside


on the surface of the hyperthermophilic compared to mesophilic proteins. Hydrophobic interactions are long-range [71] interactions known to play a significant role in protein folding and stability. The hydrophobic contribution to the thermal stability of SAICAR synthetase is investigated by studying the non-polar contact areas in the protein structures using the tool pdb_np_cont, and these interactions are clustered using the tool pdb_np_clus. The total non-polar surface area is calculated for the whole protein and for different contact area cut-offs (5, 10, 15, etc.) (Table 6). The contacts are then clustered with a minimum of three members in each cluster. The cut-off indicates the minimum pairwise contact area between the residues for clustering. The first row of data shows the total non-polar contact area and the subsequent rows show the number of residues in each cluster, separated by a hyphen (‘-’). The total contact area of all the clusters at a particular cut-off is in bold. The enzyme GkSS has a cluster of three residues even at a cut-off of 41 Å², and among the mesophilic proteins, the enzyme EhSS appears to have relatively higher stability in terms of non-polar contacts. The hyperthermophilic proteins have a higher total non-polar contact area than the mesophilic proteins. At higher contact area cut-offs (≥33 Å²), the total non-polar contact area of hyperthermophiles is greater than that of mesophilic proteins by at least 1000 Å², which probably provides stability for the protein at higher temperatures. However, these theoretical observations need to be ascertained experimentally by measuring the melting temperatures of these proteins.

4. Conclusion
The first native crystal structure of SAICAR synthetase from a hyperthermophilic organism has been reported. The Type-1 structure of PhSS resembles the complex-bound form of EcSS due to the bound cadmium ions near the active site, which induce significant deviation at the dimeric interface.
These cadmium ions also give rise to a pseudo-interface leading to a hexameric form. PhSS, being a hyperthermophilic protein, has a sequence and three-dimensional structure very similar to those of all other mesophilic dimeric SAICARs. The amino acid composition analysis revealed that the hyperthermophilic SAICARs, in general, have a higher percentage of D + E + H + R + K and a lower percentage of S + T + N + Q compared to others. Further, the ratio (D + E + H + R + K)/Q is found to be exceptionally high in hyperthermophiles. The thermophilic enzyme GkSS exhibited percentages of D + E + H + R + K and S + T + N + Q comparable to those of the mesophilic SAICARs. However, a very high percentage of aliphatic residues (A + L + I + V) is found in it. The monomeric SAICARs exhibited a unique composition with a large number of Pro and Trp residues. Hyperthermophiles have a greater number of weak salt bridges than mesophiles. Hyperthermophilic SAICARs have a higher percentage of SBs with higher RASB values, which means that a higher percentage of SBs (both weak and strong) tend to reside on the surface of the hyperthermophilic SAICARs compared to mesophilic SAICARs. The total non-polar contact area is observed to be the highest in hyperthermophiles at higher contact area cut-offs.

Author’s contribution
KM purified, crystallized, collected the data, solved, refined and analyzed the structures. SPK and SK assisted in the purification process. JJ provided the plasmid. KS supervised the project and critically read the manuscript.

Acknowledgements
The authors gratefully acknowledge the facilities offered by the Interactive graphics facility and the Supercomputer Education and

Research Centre. The authors acknowledge the X-ray data collection facility at the Molecular Biophysics Unit. One of the authors (KM) thanks Eleanor Dodson for her valuable suggestions while solving the structure. The authors thank the Department of Science and Technology (DST) for financial support. The authors thank the SPring-8 beam line BL44XU (proposal number 2011B6653).

References
[1] J.M. González, Y. Masuchi, F.T. Robb, J.W. Ammerman, D.L. Maeder, M. Yanagibayashi, J. Tamaoka, C. Kato, Extremophiles 2 (1998) 123–130. [2] Y. Kawarabayasi, M. Sawada, H. Horikawa, Y. Haikawa, Y. Hino, S. Yamamoto, M. Sekine, S. Baba, H. Kosugi, A. Hosoyama, Y. Nagai, M. Sakai, K. Ogura, R. Otsuka, H. Nakazawa, M. Takamiya, Y. Ohfuku, T. Funahashi, T. Tanaka, Y. Kudoh, J. Yamazaki, N. Kushida, A. Oguchi, K. Aoki, T. Yoshizawa, Y. Nakamura, F.T. Robb, K. Horikoshi, Y. Masuchi, H. Shizuya, H. Kikuchi, DNA Research 5 (1998) 55–76. [3] K.O. Stetter, Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences 361 (2006) 1837–1843. [4] M.W.W. Adams, Annual Review of Microbiology 47 (1993) 627–658. [5] T. Imanaka, Proceedings of the Japan Academy Series B 87 (2011) 587–602. [6] G.J. Feller, Journal of Physics: Condensed Matter 22 (2010) 1–17. [7] L.D. Unsworth, J.V.D. Oost, S. Koutsopoulos, FEBS Journal 274 (2007) 4044–4056. [8] I.N. Berezovsky, E.I. Shakhnovich, Proceedings of the National Academy of Sciences of the United States of America 102 (2005) 12742–12747. [9] C. Vieille, G.J. Zeikus, Microbiology and Molecular Biology Reviews 65 (2001) 1–43. [10] R. Sterner, W. Liebl, Critical Reviews in Biochemistry and Molecular Biology 36 (2001) 39–106. [11] S. Kumar, C.-J. Tsai, R. Nussinov, Protein Engineering 13 (2000) 179–191. [12] W.F. Li, X.X. Zhou, P. Lu, Biotechnology Advances 23 (2005) 271–281. [13] C. Vetriani, D.L. Maeder, N. Tolliday, K.S.-P. Yip, T.J. Stillman, K.L. Britton, D.W. Rice, H.H. Klump, F.T. 
Robb, Proceedings of the National Academy of Sciences of the United States of America 95 (1998) 12300–12305. [14] R. Scandurra, V. Consalvi, R. Chiaraluce, L. Politi, P.C. Engel, Biochimie 80 (1998) 933–941. [15] R. Landenstein, G. Antranikian, Advances in Biochemical Engineering/Biotechnology 61 (1998) 38–85. [16] R. Jaenicke, H. Schurig, N. Beaucamp, R. Ostendorp, Advances in Protein Chemistry 48 (1996) 181–269. [17] X. Fang, Q. Cui, Y. Tong, Y. Feng, L. Shan, L. Huang, J. Wang, Biochemistry 47 (2008) 11212–11221. [18] J. Chen, W.E. Stites, Journal of Molecular Biology 344 (2004) 271–280. [19] C.H. Chan, T.H. Yu, K.B. Wong, PLoS One 6 (2011) 1–8. [20] M.J. Thompson, D. Eisenberg, Journal of Molecular Biology 290 (1999) 595–604. [21] G. Vogt, P. Argos, Folding and Design 2 (1997) S40–S46. [22] A. Szilágyi, P. Závodszky, Structure 8 (2000) 493–504. [23] M. Beeby, B.D. O’Connor, C. Ryttersgaard, D.R. Boutz, L.J. Perry, T.O. Yeates, PLoS Biology 3 (2005) 1549–1558. [24] R. Thoma, M. Hennig, R. Sterner, K. Kirschner, Structure 8 (2000) 265–276. [25] T.B. Fitzpatrick, P. Killer, R.M. Thomas, I. Jelesarov, N. Amrhein, P.J. Macheroux, Journal of Biological Chemistry 276 (2001) 18052–18059. [26] G. Hernandez, F.E. Jenney Jr., M.W. Adams, D.M. LeMaster, Proceedings of the National Academy of Sciences of the United States of America 97 (2000) 3166–3170. [27] J.M. Buchanan, Journal of Cellular Physiology. Supplement 38 (1951) 143–171. [28] H. Zalkin, J.E. Dixon, Progress in Nucleic Acid Research and Molecular Biology 42 (1992) 259–287. [29] Z.D. Chen, J.E. Dixon, H. Zalkin, Proceedings of the National Academy of Sciences of the United States of America 87 (1990) 3097–3101. [30] W. Watanabe, G. Sampei, A. Aiba, K. Mizobuchi, Journal of Bacteriology 171 (1989) 198–204. [31] E.J. Mueller, E. Meyer, J. Rudolph, V.J. Davisson, J. Stubbe, Biochemistry 33 (1994) 2269–2278. [32] R.I. Christopherson, S.D. Lyons, P.K. Wilson, Accounts of Chemical Research 35 (2002) 961–971. [33] T. 
Dervieux, L.T. Brenner, Y.Y. Hon, Y. Zhou, M.L. Hancock, J.T. Sandlund, G.K. Rivera, R.C. Ribeiro, J.M. Boyett, C.-H. Pui, M.V. Relling, W.E. Evans, Blood 100 (2002) 1240–1247. [34] L.N. Lukens, J.M. Buchanan, Journal of Biological Chemistry 234 (1959) 1791–1798. [35] V.M. Levdikov, V.V. Barynin, A.I. Grebenko, W.R. Melik-Adamyan, V.S. Lamzin, K.S. Wilson, Structure 6 (1998) 363–376. [36] R. Zhang, T. Skarina, E. Evdokimova, A. Edwards, A. Savchenko, R. Laskowski, M.E. Cuff, A. Joachimiak, Acta Crystallographica F 62 (2006) 335–339. [37] N.D. Ginder, D.J. Binkowski, H.J. Fromm, R.B. Honzatko, Journal of Biological Chemistry 281 (2006) 20680–20688. [38] S.X. Li, Y.P. Tong, X.C. Xie, Q.H. Wang, H.N. Zhou, Y. Han, Z.Y. Zhang, W. Gao, S.G. Li, X.C. Zhang, R.C. Bi, Journal of Molecular Biology 366 (2007) 1603–1614. [39] K. Manjunath, J. Jeyakanthan, N. Nakagawa, A. Shinkai, M. Yoshimura, S. Kuramitsu, S. Yokoyama, Acta Crystallographica F 66 (2010) 180–183. [40] I. Steller, R. Bolotovsky, M.G. Rossman, Journal of Applied Crystallography 30 (1997) 1036–1040.

[41] A.A. Vaguine, J. Richelle, S.J. Wodak, Acta Crystallographica D 55 (1999) 191–205. [42] B.W. Matthews, Journal of Molecular Biology 33 (1968) 491–497. [43] A.J. McCoy, R.W. Grosse-Kunstleve, P.D. Adams, M.D. Winn, L.C. Storoni, R.J. Read, Journal of Applied Crystallography 40 (2007) 658–674. [44] G.N. Murshudov, A.A. Vagin, E.J. Dodson, Acta Crystallographica D 53 (1997) 240–255. [45] P. Emsley, K. Cowtan, Acta Crystallographica D 60 (2004) 2126–2132. [46] R.A. Laskowski, M.W. MacArthur, D.S. Moss, J.M. Thornton, Journal of Applied Crystallography 26 (1993) 283–291. [47] M.A. Larkin, G. Blackshields, N.P. Brown, R. Chenna, P.A. McGettigan, H. McWilliam, F. Valentin, I.M. Wallace, A. Wilm, R. Lopez, Bioinformatics 23 (2007) 2947–2948. [48] P. Gouet, E. Courcelle, D.I. Stuart, F. Metoz, Bioinformatics 15 (1999) 305–308. [49] K. Arnold, L. Bordoli, J. Kopp, T. Schwede, Bioinformatics 22 (2006) 195–201. [50] F. Kiefer, K. Arnold, M. Künzli, L. Bordoli, T. Schwede, Nucleic Acids Research 37 (2009) D387–D392. [51] M.C. Peitsch, Nature Biotechnology 13 (1995) 658–660. [52] B. Hess, C. Kutzner, D. Van der Spoel, E.J. Lindahl, Journal of Chemical Theory and Computation 4 (2008) 435–447. [53] W.L. Jorgensen, J. Tirado-Rives, Journal of the American Chemical Society 110 (1988) 1657–1666. [54] W.L. Jorgensen, J. Chandrasekhar, J.D. Madura, R.W. Impey, M.L. Klein, Journal of Chemical Physics 79 (1983) 926–935.


[55] T. Darden, D. York, L. Pedersen, Journal of Chemical Physics 98 (1993) 10089–10092. [56] B. Hess, H. Bekker, H.J.C. Berendsen, J.G.E.M. Fraaije, Journal of Computational Chemistry 18 (1997) 1463–1472. [57] A.S. Konagurthu, J.C. Whisstock, P.J. Stuckey, A.M. Lesk, Proteins 64 (2006) 559–574. [58] E. Krissinel, K. Henrick, Journal of Molecular Biology 372 (2007) 774–797. [59] A.S.Z. Hussain, V. Shanthi, S.S. Sheik, J. Jeyakanthan, P. Selvarani, K. Sekar, Acta Crystallographica D 58 (2002) 1385–1386. [60] W. Kabsch, C. Sander, Biopolymers 22 (1983) 2577–2637. [61] N.R. Voss, M. Gerstein, Nucleic Acids Research 38 (2010) 555–562. [62] A. Ortega, D. Amorós, J. García de la Torre, Biophysical Journal 101 (2011) 892–898. [63] D.P. Klose, B.A. Wallace, R.W. Janes, Bioinformatics 26 (2010) 2624–2625. [64] I.K. McDonald, J.M. Thornton, Journal of Molecular Biology 238 (1994) 777–793. [65] G. Vriend, Journal of Molecular Graphics 8 (1990) 52–56. [66] B. Lee, F.M. Richards, Journal of Molecular Biology 55 (1971) 379–400. [67] F. Drabløs, Bioinformatics 15 (1999) 501–509. [68] Y. Zhang, M. Morar, S.E. Ealick, Cellular and Molecular Life Sciences 65 (2008) 3699–3724. [69] S.W. Nelson, D.J. Binkowski, R.B. Honzatko, H.J. Fromm, Biochemistry 44 (2005) 766–774. [70] D.J. Barlow, J.M. Thornton, Journal of Molecular Biology 168 (1983) 867–885. [71] J. Israelachvili, R. Pashley, Nature 300 (1982) 341–342.