International Journal of Biological Macromolecules 35 (2005) 211–220
Importance of main-chain hydrophobic free energy to the stability of thermophilic proteins
K. Saraboji a, M. Michael Gromiha b, M.N. Ponnuswamy a,∗
a Department of Crystallography and Biophysics, University of Madras, Guindy Campus, Chennai 600025, India
b Computational Biology Research Center (CBRC), National Institute of Advanced Industrial Science and Technology (AIST), AIST Tokyo Waterfront Bio-IT Research Building, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
Received 1 December 2004; received in revised form 26 January 2005; accepted 15 February 2005
Abstract

Living organisms are found in the most unexpected places, including hot springs and deep-sea vents at 100 °C and several hundred bars of pressure. The proteins of thermophilic species are much more stable than their mesophilic counterparts, yet there is no obvious reason why one should be more stable than the other. Even examination of the amino acid composition and comparison of the structural features of thermophiles and mesophiles do not give a satisfactory explanation for the thermal stability of such proteins. To bring out the information hidden behind the thermal stabilization of these proteins in terms of energy factors and their combinations, good-resolution structures of thermophilic proteins and their mesophilic homologues from 23 different families were analysed. From the structural coordinates, the free energy contributions due to hydrophobic, electrostatic, hydrogen bonding, disulfide bonding and van der Waals interactions were computed. In this analysis, a vast majority of the thermophilic proteins show a slightly lower free energy contribution in each energy term than their mesophilic counterparts. The major observation of this study is the lower hydrophobic free energy contribution due to carbon atoms and main-chain nitrogen atoms in all the thermophilic proteins. The possible combinations of the different free energy terms show that the majority of the thermophilic proteins have lower free energy than their mesophilic homologues. The derived results show that the hydrophobic free energy due to carbon and nitrogen atoms, and such combinations of free energy components, play a vital role in the thermostabilization of these proteins. © 2005 Elsevier B.V. All rights reserved.

Keywords: Protein stability; Hydrophobic free energy; Three-dimensional structure
∗ Corresponding author. Tel.: +91 44 22351367; fax: +91 44 22300122. E-mail address: [email protected] (M.N. Ponnuswamy).
0141-8130/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.ijbiomac.2005.02.003

1. Introduction

Several organisms, mainly archaea, live under extreme environmental temperature conditions. Proteins from thermophilic organisms usually exhibit substantially higher intrinsic thermal stabilities than their counterparts from mesophilic organisms. Identifying and understanding the factors that contribute to the stability of proteins from organisms living under extreme conditions is a longstanding problem. Although the molecular bases of protein thermostabilization have been the focus of many theoretical and experimental research efforts, the subject is only partially understood. Studies of thermostability can be divided into three categories: (i) examination of a single thermophilic protein and comparison of its structure at the atomic level with one or more mesophilic homologues, (ii) systematic analysis based on sequence and structural information for a group of proteins in order to reach general conclusions and (iii) large-scale comparison between thermophilic and mesophilic genome sequences. A number of examples can be quoted for the comparison of structures with mesophilic homologues, but systematic studies are very limited. The recent progress in genome sequencing projects enables a comparative study of thermophilic and mesophilic organisms [1]. There has been a growing interest in understanding the mechanism of stabilization of thermophilic proteins from these organisms. Understanding the physicochemical principles of thermostability will, no doubt, aid in the comprehension of protein folding and protein interaction mechanisms.

Theoretical and experimental approaches have been undertaken to examine the stability of proteins. Comparison of the sequences and tertiary structures of homologous proteins from thermophiles, mesophiles and thermophobes has formed the basis of theoretical efforts [2,3]. Indeed, one review listed many different physical and chemical factors, such as hydrogen bonding, hydrophobic packing, secondary structure propensity and helix dipole stabilization, for which researchers have reported enhanced thermostabilization [4]. In recent years, several theoretical and experimental works have tried to trace the secrets of thermostabilization through mutation studies [5–9] and through the analysis of amino acid composition. Fukuchi and Nishikawa [10] compared the amino acid composition of the protein surface and interior in thermophilic and mesophilic bacteria. They observed fewer charged residues and more polar residues in mesophilic bacteria and concluded that the bias in the amino acid composition of thermophilic proteins is due to the residues on protein surfaces, which may reflect the extreme environment. Akke and Forsen [11] showed that the electrostatic interactions between charges on the surface of a protein could have significant effects on protein stability. With regard to helix stabilizing factors and the stabilization of thermophilic proteins, Facchiano et al. [12] analysed 13 thermophilic proteins and showed that the helices of thermophilic proteins are more stable than those of their mesophilic homologues. Gromiha et al. [13] studied the relationship between stability changes caused by buried mutations and changes in 48 amino acid properties, which revealed the correlation of hydrophobicity with the stability of proteins. The intramolecular interactions, namely hydrophobic, electrostatic, van der Waals and hydrogen bonding interactions, play an important role in the stability of protein structures [14–16,24].
Several investigations have been carried out to understand the mechanism of the thermostability of proteins. Gromiha et al. [17] made a comparative analysis of the relation between thermostability and amino acid properties for a family of mesophilic and thermophilic proteins and found that the Gibbs free energy change of hydration and shape play a dominant role in the thermostability of proteins. The mutational study by Hasegawa et al. [18] agreed with the results of Gromiha et al.; they analysed the increased stability of mesophilic cytochrome c through five substitutions and observed that $-G_{hN}$ may contribute to the stability. Szilagyi and Zavodsky [19] made a systematic study of 25 protein families consisting of 64 mesophilic and 29 thermophilic proteins and concluded that different protein families adapt to higher temperatures by utilizing different sets of structural devices, and that the number of ion pairs increases with growth temperature. Querol et al. [4] related thermal stability to the conformational characteristics of proteins. The thermostability of 16 different families of mesophilic and thermophilic proteins was examined by Vogt et al. [20], who found a good correlation between the thermostability of the family members and the number of hydrogen bonds, as well as the fractional polar surface. The statistical analysis of 18 families of thermophilic and mesophilic proteins by Kumar et al. [21] showed an increase in salt bridges and side-chain–side-chain hydrogen bonds in the majority of the thermophilic proteins; the residues Arg and Tyr also occur more frequently in thermophilic proteins. Kumar and Nussinov [22] analysed fluctuations, ion pair contributions and stabilities in NMR conformer ensembles and found that the overall stabilizing contribution of an ion pair depends on the conformer population. Recently, Gromiha [23] analysed the medium- and long-range contacts in mesophilic and thermophilic proteins of 16 different families and reported that, apart from hydrophobic contacts, thermophiles prefer contacts between residues through hydrogen bonds, and also contacts between polar and non-polar residues, more than mesophiles do. Ponnuswamy and Gromiha [24] investigated the conformational stability of folded proteins, in which the hydrophobic force drives the polypeptide chain to the folded state, overcoming the entropic factor, while the other factors, especially hydrogen bonds and van der Waals attraction, define the shape and keep it from falling apart. Recently, Yano and Poulos [25] compiled the factors that are reported to be important for increased protein stability. It has been mentioned that electrostatic interactions, cation–pi interactions, aromatic and hydrophobic interactions and other factors would enhance the stability [26,27]. From this diverse collection of studies, it is difficult to come to a general conclusion about the structural features underlying the increased thermal stability of proteins from thermophilic microorganisms. The contradictions and the limited understanding are consequences of the limited data available and the non-uniform approaches of the contributing researchers.
Though proteins can be engineered, or engineer themselves in vivo, to achieve greater stability by utilizing one or more of these strategies, it is clear that no single preferred mode has yet been established. The aim of the present work is to combine the different free energy components of a set of thermophilic and mesophilic proteins in order to assess the contributions of the different stability factors in a unified model. We compute the major free energy components, namely those of the hydrophobic, electrostatic, hydrogen bonding, van der Waals and disulfide bonding interactions of the folded state of proteins, and also the conformational entropy of the unfolded state of the corresponding proteins. An in-depth statistical analysis of these parameters was made to investigate the importance of each interaction for protein thermostability.

2. Materials and methods

2.1. Data set

Recently, Kumar et al. [21] constructed a data set of 36 thermophilic and mesophilic proteins from 18 different families; Szilagyi and Zavodsky [19] built a data set representing 25 families and Vogt et al. [20] collected a set of 56 globular proteins from 16 different families. In this work, we construct a non-redundant data set of 23 families, each consisting of a thermophilic protein and its mesophilic counterpart, from the previous studies [19–21]. These families span an entire spectrum, containing proteins from moderately thermophilic organisms and their mesophilic homologues. We select one pair from each family. The three-dimensional structures of all these proteins have been taken from the Protein Data Bank [28]. For a given protein, the PDB file contains coordinates for the structure observed in a crystallographic asymmetric unit, which may not reflect the true biochemically relevant oligomeric state. We choose only one subunit from subunits having identical amino acid sequences, and mutations are ignored. In addition, the structurally most similar thermophile–mesophile pair having the best resolution was chosen, so that the observed differences can be expected to be mostly due to thermostability. The PDB codes of all the proteins, along with the resolution, average living temperature ($T_L$) and rms deviation for each family, are given in Table 1. It has been reported that the average living temperature has a direct relationship with the melting temperature ($T_m$) of proteins [17,29]. As more data are available for $T_L$ than for $T_m$, and $T_L$ is widely used to understand the stability of thermophilic proteins, we have used $T_L$ in the present study.

2.2. Hydrophobic free energy (HFE)

The hydrophobic free energy ($G_{hy}$) of each protein was evaluated using the method of Eisenberg and McLachlan [30]. In this method, the change in free energy for the transfer of an amino acid residue to water is given by

$$G_R = \sum_i \sigma_i A_i \qquad (1)$$

where the sum is taken over all atoms $i$, $A_i$ are the accessible surface areas and $\sigma_i$ are the atomic solvation parameters for the five classes of atoms, namely carbon, neutral nitrogen and oxygen, charged nitrogen, charged oxygen and sulfur, which are determined by a least-squares fit of Eq. (1) based on the method of Ponnuswamy and Gromiha [24]. The $\sigma$ values are C: 12.02, N/O: −5.86, N+: −19.46, O−: −34.98 and S: 35.51 cal/(mol Å²). These atomic solvation parameters perform better and explain protein stability better than other values available in the literature [31]. The hydrophobic free energy of the folded protein was expressed as

$$G_{hy} = \sum_i \sigma_i \left[ A_i(\mathrm{folded}) - A_i(\mathrm{unfolded}) \right] \qquad (2)$$

where $A_i(\mathrm{folded})$ and $A_i(\mathrm{unfolded})$ represent, respectively, the accessible surface areas (ASA) of each atom in the folded and unfolded states of the protein. The accessible surface areas of each atom in the folded state were computed using the program NACCESS [32]. According to Shrake and Rupley [33], the ASA of amino acid residue X in the extended state is computed as the average ASA of the residues present in the sequence Gly–X–Gly. The hydrophobic free energy of each protein was calculated separately for the five classes of atoms and for the side-chain and main-chain atom contributions.

2.3. Electrostatic free energy (EFE)

In our present approach, the method of Ponnuswamy and Gromiha [24] was adopted to compute the contribution from the electrostatic free energy ($G_{el}$). The actual electrostatic free energy of the folded state is taken as the sum of the energy due to ion pairs ($E_1$) and charge helix dipoles ($E_2$):

$$G_{el} = E_1 + E_2 \qquad (3)$$
The ion pairs were defined using a simple distance criterion: two oppositely charged residues were considered an ion pair if their closest oppositely charged atoms were closer to each other than a predefined cutoff distance. The cutoff distance was chosen as ≤4 Å [34]. Each ion pair on the surface of the protein contributes about 1 kcal/mol to stability [2,35], whereas a buried ion pair contributes around 3 kcal/mol [36]. Combining these experimental results, we use the following expression for ion pairs:

$$E_1 = 3N_{bi} + 1N_{ex} \qquad (4)$$

where $N_{bi}$ and $N_{ex}$ are the numbers of buried and exposed ion pairs of a protein. The charge helix dipole interactions are obtained by finding the positively charged residues within 4 Å distance of the C cap−1 and the negatively charged residues within 4 Å distance of the N cap+1 [37]. Based on site-directed mutational studies [37,38], a charge helix dipole interaction could contribute about 1.6 kcal/mol to the stability of the protein and hence, we compute the charge helix dipole interactions in a protein with the expression

$$E_2 = 1.6N_{ch} \qquad (5)$$

where $N_{ch}$ is the number of charge helix dipole interactions of a protein.

Table 1 Data set used in this study showing the 23 protein families and their thermophilic and mesophilic counterparts

| S. no. | Protein family | Thermophilic organism: PDB id (resolution, Å) | No. of residues; $T_L$ (°C) a | Mesophilic organism: PDB id (resolution, Å) | No. of residues; $T_L$ (°C) | rms deviation (Å) b |
|---|---|---|---|---|---|---|
| 1 | Citrate synthase | Pyrococcus furiosus: 1AJ8 (1.9) | 376; 100 | Chicken heart: 1CSH (1.6) | 435; 37 | 1.68 |
| 2 | Malate dehydrogenase | Thermus flavus: 1BDM (2.5) | 332; 72.5 | Porcine: 4MDH (2.5) | 333; 37 | 0.94 |
| 3 | Rubredoxin | Pyrococcus furiosus: 1CAA (1.8) | 53; 100 | Desulfovibrio vulgaris: 8RXN (1.0) | 52; 35.5 | 0.69 |
| 4 | Cyclodextrin glucanotransferase | Thermoanaerobacterium thermosulfurigenes: 1CIU (2.3) | 683; 60 | Bacillus circulans: 1CDG (2.0) | 686; 35 | 0.7 |
| 5 | EF–TU and EF–TU–TS complex | Thermus aquaticus: 1EFT (2.5) | 405; 71 | E. coli: 1EFU(C) (2.5) | 385; 37 | 1.5 |
| 6 | Glutamate dehydrogenase | Pyrococcus furiosus: 1GTM (2.2) | 419; 77.5 | Clostridium symbiosum: 1HRD (1.96) | 449; 33.5 | 1.38 |
| 7 | Lactate dehydrogenase | Bacillus stearothermophilus: 1LDN (2.5) | 316; 52.5 | Plasmodium falciparum: 1LDG (1.74) | 316; 37 | 1.25 |
| 8 | Thermolysin and neutral protease | Bacillus thermoproteolyticus: 1LNF (1.7) | 316; 52.5 | Bacillus cereus: 1NPC (2.0) | 317; 30 | 0.86 |
| 9 | 3-Phosphoglycerate kinase (PGK) | Bacillus stearothermophilus: 1PHP (1.65) | 394; 52.5 | Saccharomyces cerevisiae: 1QPG (2.4) | 415; 27.5 | 1.28 |
| 10 | EF–TS and EF–TU–TS complex | Thermus thermophilus: 1TFE (1.7) | 145; 72.5 | E. coli: 1EFU(B) (2.5) | 282; 37 | 1.24 |
| 11 | CheY | Thermotoga maritima: 1TMY (1.9) | 118; 90 | E. coli: 3CHY (1.66) | 128; 37 | 1.39 |
| 12 | Methionine aminopeptidase | Pyrococcus furiosus: 1XGS (1.75) | 295; 100 | E. coli: 1MAT (2.4) | 263; 37 | 1.39 |
| 13 | Endo-1,4-β-xylanase | Thermomyces lanuginosus: 1YNA (1.55) | 193; 50 | Bacillus circulans: 1XNB (1.49) | 185; 35 | 1.14 |
| 14 | Adenylate kinase | Bacillus stearothermophilus: 1ZIN (1.65) | 217; 52.5 | Saccharomyces cerevisiae: 1AKY (1.63) | 218; 27.5 | 1.22 |
| 15 | Ferredoxin | Bacillus thermoproteolyticus: 2FXB (2.3) | 81; 52.5 | Clostridium acidurici: 1FCA (1.8) | 55; 28 | 1.27 |
| 16 | Inorganic pyrophosphatase | Thermus thermophilus: 2PRD (2.0) | 174; 72.5 | E. coli: 1INO (2.2) | 175; 37 | 1.10 |
| 17 | Manganese superoxide dismutase | Thermus thermophilus: 3MDS (1.8) | 203; 72.5 | Homo sapiens: 1QNM (2.3) | 198; 37 | 1.17 |
| 18 | Phosphofructokinase | Bacillus stearothermophilus: 3PFK (2.4) | 319; 52.5 | E. coli: 2PFK (2.4) | 300; 37 | 0.87 |
| 19 | Reductase | Bacillus stearothermophilus: 1EBD (2.6) | 455; 52.5 | Pseudomonas putida: 1LVL (2.45) | 458; 27.5 | 0.88 |
| 20 | Subtilisin | Thermoactinomyces vulgaris: 1THM (1.37) | 279; 60 | Bacillus amyloliquefaciens: 1SUP (1.60) | 275; 35 | 1.9 |
| 21 | Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) | Thermotoga maritima: 1HDG (2.5) | 332; 82.5 | E. coli: 1GAD (1.80) | 330; 37 | 1.24 |
| 22 | Triose phosphate isomerase | Bacillus stearothermophilus: 1BTM (2.8) | 251; 52.5 | Homo sapiens: 1HTI (2.8) | 248; 37 | 1.24 |
| 23 | Glycosyltransferase B (β-glycanase) | Clostridium thermocellum: 1XYZ (1.4) | 320; 60 | Cellulomonas fimi: 2EXO (1.8) | 312; 30 | 1.38 |

a $T_L$, the average living temperature.
b rms deviation, the root mean square deviation for Cα between the two protein structures of a family.

2.4. Hydrogen bond free energy (HBFE)

The hydrogen bond is one of the most important interatomic interactions in protein folding. The hydrogen-bonding free energy ($G_{hb}$) has been computed from the number of hydrogen bonds in a protein. The interactions that qualify as hydrogen bonds must be between the listed donor and acceptor atoms, and have acceptable geometries [39]. The number of hydrogen bonds present in each protein was calculated using the HBPLUS routine [39,40] with the following default parameters (D refers to the donor atom; A, the acceptor; H, the hydrogen atom; and AA, the atom covalently bound to A): maximum distance for D–A, 3.9 Å and for H–A, 2.5 Å; minimum angle for D–H···A and for D···A–AA, 90.0°. Baker and Hubbard [41] have recommended these values after extensive analysis. However, since the interactions between charged residues have already been counted as ion pairs, the possible hydrogen bonds between these residues have to be excluded from the total number of hydrogen bonds in a protein ($N_{hb}$). Accordingly, the actual number of hydrogen bonds to be included in the free energy computation is

$$N_{HB} = N_{hb} - (N_{bi} + N_{ex}) \qquad (6)$$

It has been reported that the free energy due to a hydrogen bond is approximately 1 kcal/mol [42], and hence $G_{hb}$ is taken to be

$$G_{hb} = 1N_{HB} \qquad (7)$$
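The bookkeeping in Eqs. (3)–(7) is simple arithmetic on interaction counts and can be sketched in a few lines of Python. This is only an illustrative sketch: the `InteractionCounts` container, the function names and the example counts are hypothetical and are not taken from the original work, which derived the counts from the structures. The functions return the magnitudes given by the equations in kcal/mol, whereas Table 2 lists the corresponding per-residue values with a negative sign.

```python
from dataclasses import dataclass


@dataclass
class InteractionCounts:
    """Interaction counts for one protein (hypothetical container)."""
    n_buried_ion_pairs: int       # N_bi
    n_exposed_ion_pairs: int      # N_ex
    n_charge_helix_dipoles: int   # N_ch
    n_hydrogen_bonds: int         # N_hb, total from a hydrogen-bond search


def electrostatic_free_energy(c: InteractionCounts) -> float:
    """G_el = E1 + E2 in kcal/mol, following Eqs. (3)-(5)."""
    e1 = 3.0 * c.n_buried_ion_pairs + 1.0 * c.n_exposed_ion_pairs  # Eq. (4)
    e2 = 1.6 * c.n_charge_helix_dipoles                             # Eq. (5)
    return e1 + e2                                                   # Eq. (3)


def hydrogen_bond_free_energy(c: InteractionCounts) -> float:
    """G_hb in kcal/mol, following Eqs. (6)-(7)."""
    # Eq. (6): hydrogen bonds already counted as ion pairs are excluded.
    n_hb_net = c.n_hydrogen_bonds - (c.n_buried_ion_pairs + c.n_exposed_ion_pairs)
    return 1.0 * n_hb_net                                            # Eq. (7)


# Made-up counts, for illustration only.
example = InteractionCounts(
    n_buried_ion_pairs=2,
    n_exposed_ion_pairs=5,
    n_charge_helix_dipoles=3,
    n_hydrogen_bonds=300,
)
g_el = electrostatic_free_energy(example)   # 3*2 + 1*5 + 1.6*3
g_hb = hydrogen_bond_free_energy(example)   # 300 - (2 + 5)
```

Keeping the counts in one container makes the ion-pair correction of Eq. (6) explicit: the same $N_{bi}$ and $N_{ex}$ that enter the electrostatic term are subtracted from the hydrogen-bond count.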
2.5. Disulfide bond free energy (SSFE)

After analyzing the characteristics of disulfide bonds in a set of proteins, Thornton [43] suggested an approximate value of 2.3 kcal/mol as the probable free energy contribution of a disulfide bond ($G_{ss}$). We use this value to compute the free energy contribution from disulfide bonds. The number of disulfide bonds present in a protein was calculated using the program HBPLUS [39] with the distance criterion S–S, 3.0 Å.

2.6. van der Waals free energy (VDWFE)

van der Waals interactions are calculated between atoms separated by at least three bonds (1–4 interactions). The van der Waals free energy ($G_{vw}$) was calculated as the sum of Lennard–Jones potentials over all 1–4 interactions with the AMBER [44] library files.
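The disulfide term of Section 2.5 can be illustrated with a short Python sketch. It is a stand-in for the HBPLUS-based count used in the study, not a reproduction of it: the simple pairwise distance test, the function names and the example coordinates are assumptions made for illustration, while the 3.0 Å criterion and the 2.3 kcal/mol value come from the text. Dividing by the chain length gives a per-residue value like those reported in Table 2.

```python
import math
from itertools import combinations

SS_CUTOFF = 3.0        # Å, S–S distance criterion quoted in the text
G_SS_PER_BOND = 2.3    # kcal/mol per disulfide bond [43]


def count_disulfides(sg_coords, cutoff=SS_CUTOFF):
    """Count pairs of cysteine sulfur atoms lying within the cutoff distance.

    sg_coords: list of (x, y, z) tuples in Å (hypothetical input format).
    """
    return sum(
        1 for a, b in combinations(sg_coords, 2) if math.dist(a, b) <= cutoff
    )


def disulfide_free_energy_per_residue(n_bonds, n_residues):
    """Magnitude of the disulfide contribution in kcal/mol per residue."""
    return G_SS_PER_BOND * n_bonds / n_residues


# Made-up sulfur coordinates: the first two atoms are 2.0 Å apart.
sulfurs = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0), (15.0, 4.0, 1.0)]
n_ss = count_disulfides(sulfurs)
ssfe = disulfide_free_energy_per_residue(n_ss, n_residues=145)
```

For one bond in a 145-residue chain the sketch gives about 0.016 kcal/mol per residue, which is the magnitude listed in Table 2 for the 145-residue protein 1TFE.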
We have normalized the hydrophobic and all other free energy terms by the number of amino acid residues in each protein.
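Eq. (2), together with the per-residue normalization, can be sketched as follows. The atomic solvation parameters are those quoted in Section 2.2; everything else is an assumption for illustration: the input format (one tuple per atom with its class and its folded and unfolded ASA), the function name, the example values, and the conversion from cal to kcal, which is applied here because Table 2 reports kcal/mol. In the study itself the folded-state ASA values came from NACCESS and the unfolded-state values from Gly–X–Gly references.

```python
from collections import defaultdict

# Atomic solvation parameters from Section 2.2, in cal/(mol Å^2).
SIGMA = {"C": 12.02, "N/O": -5.86, "N+": -19.46, "O-": -34.98, "S": 35.51}


def hydrophobic_free_energy(atoms, n_residues):
    """Evaluate Eq. (2) and normalize by chain length.

    atoms: iterable of (atom_class, asa_folded, asa_unfolded), ASA in Å^2
           (hypothetical input format).
    Returns (total, per_class), both in kcal/mol per residue.
    """
    by_class = defaultdict(float)
    for atom_class, asa_folded, asa_unfolded in atoms:
        # sigma_i * [A_i(folded) - A_i(unfolded)], accumulated per atom class
        by_class[atom_class] += SIGMA[atom_class] * (asa_folded - asa_unfolded)
    scale = 1.0 / (1000.0 * n_residues)   # cal -> kcal, then per residue
    per_class = {cls: value * scale for cls, value in by_class.items()}
    return sum(per_class.values()), per_class


# Made-up ASA values for three atoms of a two-residue fragment.
atoms = [("C", 5.0, 40.0), ("N/O", 10.0, 30.0), ("S", 0.0, 20.0)]
total, per_class = hydrophobic_free_energy(atoms, n_residues=2)
```

Returning the per-class breakdown alongside the total mirrors the way the hydrophobic term is reported separately for each atom class; a main-chain versus side-chain split could be obtained the same way by adding that label to the class key.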
3. Results and discussion

The values of the various average free energy contributions calculated in this study are given in Table 2.

Table 2 Contribution of free energies to the thermophilic proteins and their mesophilic counterparts a. Average free energy [kcal/mol]; Thermo, thermophilic proteins; Meso, mesophilic proteins; Diff, difference (thermophilic − mesophilic)

| No. | Thermo PDB | Thermo HFE | Thermo EFE | Thermo HBFE | Thermo SSFE | Thermo Total | Meso PDB | Meso HFE | Meso EFE | Meso HBFE | Meso SSFE | Meso Total | Diff HFE | Diff EFE | Diff HBFE | Diff SSFE | Diff Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1AJ8 | −0.659 | −0.047 | −0.912 | 0 | −1.618 | 1CSH | −0.651 | −0.03 | −0.966 | 0 | −1.647 | −0.008 | −0.017 | 0.054 | 0 | 0.029 |
| 2 | 1BDM | −0.643 | −0.037 | −0.858 | 0 | −1.538 | 4MDH | −0.626 | −0.025 | −0.805 | 0 | −1.456 | −0.017 | −0.012 | −0.053 | 0 | −0.082 |
| 3 | 1CAA | −0.579 | −0.057 | −0.679 | 0 | −1.315 | 8RXN | −0.583 | 0 | −0.635 | 0 | −1.218 | 0.004 | −0.057 | −0.044 | 0 | −0.097 |
| 4 | 1CIU | −0.688 | −0.033 | −0.958 | 0 | −1.679 | 1CDG | −0.66 | −0.028 | −0.926 | −0.003 | −1.617 | −0.028 | −0.005 | −0.032 | 0.003 | −0.062 |
| 5 | 1EFT | −0.62 | −0.017 | −0.852 | 0 | −1.489 | 1EFU | −0.595 | −0.06 | −0.794 | 0 | −1.449 | −0.025 | 0.043 | −0.058 | 0 | −0.04 |
| 6 | 1GTM | −0.671 | −0.059 | −1.021 | 0 | −1.751 | 1HRD | −0.654 | −0.032 | −0.958 | 0 | −1.644 | −0.017 | −0.027 | −0.063 | 0 | −0.107 |
| 7 | 1LDN | −0.652 | −0.013 | −0.835 | 0 | −1.5 | 1LDG | −0.661 | −0.044 | −0.968 | 0 | −1.673 | 0.009 | 0.031 | 0.133 | 0 | 0.173 |
| 8 | 1LNF | −0.604 | −0.034 | −1.003 | 0 | −1.641 | 1NPC | −0.564 | −0.026 | −0.997 | 0 | −1.587 | −0.04 | −0.008 | −0.006 | 0 | −0.054 |
| 9 | 1PHP | −0.642 | −0.036 | −0.876 | 0 | −1.554 | 1QPG | −0.651 | −0.048 | −0.822 | 0 | −1.521 | 0.009 | 0.012 | −0.054 | 0 | −0.033 |
| 10 | 1TFE | −0.538 | −0.028 | −0.917 | −0.016 | −1.499 | 1EFU | −0.506 | −0.078 | −0.957 | 0 | −1.541 | −0.032 | 0.05 | 0.04 | −0.016 | 0.042 |
| 11 | 1TMY | −0.696 | −0.039 | −0.856 | 0 | −1.591 | 3CHY | −0.702 | −0.053 | −0.93 | 0 | −1.685 | 0.006 | 0.014 | 0.074 | 0 | 0.094 |
| 12 | 1XGS | −0.704 | −0.04 | −0.847 | 0 | −1.591 | 1MAT | −0.611 | −0.014 | −0.859 | 0 | −1.484 | −0.093 | −0.026 | 0.012 | 0 | −0.107 |
| 13 | 1YNA | −0.612 | −0.031 | −1.005 | −0.012 | −1.66 | 1XNB | −0.631 | −0.011 | −0.962 | 0 | −1.604 | 0.019 | −0.02 | −0.043 | −0.012 | −0.056 |
| 14 | 1ZIN | −0.584 | −0.049 | −0.912 | 0 | −1.545 | 1AKY | −0.55 | −0.045 | −0.839 | 0 | −1.434 | −0.034 | −0.004 | −0.073 | 0 | −0.111 |
| 15 | 2FXB | −0.625 | −0.025 | −0.506 | 0 | −1.156 | 1FCA | −0.543 | 0 | −0.491 | 0 | −1.034 | −0.082 | −0.025 | −0.015 | 0 | −0.122 |
| 16 | 2PRD | −0.629 | −0.045 | −0.799 | 0 | −1.473 | 1INO | −0.623 | −0.017 | −0.697 | 0 | −1.337 | −0.006 | −0.028 | −0.102 | 0 | −0.136 |
| 17 | 3MDS | −0.691 | −0.054 | −0.887 | 0 | −1.632 | 1QNM | −0.65 | −0.047 | −0.874 | 0 | −1.571 | −0.041 | −0.007 | −0.013 | 0 | −0.061 |
| 18 | 3PFK | −0.621 | −0.011 | −0.912 | 0 | −1.544 | 2PFK | −0.638 | −0.028 | −0.93 | 0 | −1.596 | 0.017 | 0.017 | 0.018 | 0 | 0.052 |
| 19 | 1EBD | −0.613 | −0.057 | −0.868 | −0.005 | −1.543 | 1LVL | −0.593 | −0.025 | −0.841 | −0.005 | −1.464 | −0.02 | −0.032 | −0.027 | 0 | −0.079 |
| 20 | 1THM | −0.637 | −0.024 | −0.982 | 0 | −1.643 | 1SUP | −0.639 | −0.019 | −1.076 | 0 | −1.734 | 0.002 | −0.005 | 0.094 | 0 | 0.091 |
| 21 | 1HDG | −0.599 | −0.038 | −0.991 | 0 | −1.628 | 1GAD | −0.598 | −0.036 | −0.939 | 0 | −1.574 | −0.001 | −0.002 | −0.052 | 0 | −0.054 |
| 22 | 1BTM | −0.598 | −0.027 | −0.912 | 0 | −1.537 | 1HTI | −0.596 | −0.02 | −0.863 | 0 | −1.479 | −0.002 | −0.007 | −0.049 | 0 | −0.058 |
| 23 | 1XYZ | −0.685 | −0.053 | −1 | 0 | −1.738 | 2EXO | −0.575 | −0.022 | −0.99 | −0.023 | −1.61 | −0.11 | −0.031 | −0.01 | 0.023 | −0.128 |

a PDB, protein data bank code; HFE, hydrophobic free energy; EFE, electrostatic free energy; HBFE, hydrogen bond free energy; SSFE, disulphide bond free energy.

3.1. Hydrophobic free energy

The hydrophobic free energy contribution of each protein molecule was calculated for the different atom types (C, N/O, N+, O− and S) and for the main- and side-chain atoms. Most of the thermophilic proteins have lower free energy than their mesophilic counterparts, except in the case of the sulphur atoms. Within this spectrum, the main- and side-chain carbon atoms and the main-chain nitrogen atoms consistently show a lower energy state in all the thermophilic proteins. The average hydrophobic free energy due to the main- and side-chain carbon atoms is plotted in Fig. 1, and the hydrophobic free energy due to the main-chain nitrogen atoms is illustrated in Fig. 2. An interesting feature is that, for these atoms, the thermophilic protein shows lower free energy than its mesophilic pair in all 23 families. The hydrophobic profile of the carbon and main-chain nitrogen atoms thus helps to distinguish thermophilic from mesophilic proteins, and hence to characterize thermal stability. Further, the data presented in Table 2 show that the difference in hydrophobic free energy is particularly large for the families Methionine aminopeptidase, Ferredoxin and Glycosyltransferase. From the analysis of the structures of the mesophilic and thermophilic proteins in these families, we observed that the main- and side-chain carbon atoms tend to move to the interior of the protein in thermophiles. Hence, the burial of carbon atoms in the interior of thermophilic proteins accounts for the subtle difference in hydrophobic free energy. Fig. 3 shows the average hydrophobic free energy of each family, where 16 thermophilic and 7 mesophilic proteins have lower free energies. This study reveals that most of the thermally stable proteins possess lower hydrophobic free energies than their mesophilic counterparts.

Fig. 1. Contributions of side- and main-chain carbon atoms to hydrophobic free energy.

Fig. 2. Contributions of main-chain nitrogen atoms to hydrophobic free energy.

Fig. 3. Normalized hydrophobic free energy contribution for each family.

3.2. Electrostatic free energy

The number of ion pairs was calculated using a distance limit of 4 Å; the thermophilic proteins show more ion pairs than the mesophilic ones, in agreement with the results of Szilagyi and Zavodsky [19]. The normalized electrostatic free energy arising from ion pairs and charge helix dipoles is lower for 17 thermophilic families (Fig. 4). In the six exceptions, the mesophilic protein is in the lower energy state because of its excess helical content, so that the charge helix dipoles contribute the lower electrostatic free energy.

Fig. 4. Electrostatic free energy due to ion pairs and charge helix dipoles per residue.

3.3. Hydrogen bond free energy

The normalized hydrogen bond free energy is observed to be as dominant as the hydrophobic term in all cases. In 16 thermophilic families, the hydrogen bond free energy is lower than that of the mesophilic counterpart. Although seven mesophilic proteins possess lower energy values, the difference from the thermophilic counterpart is minimal.

3.4. Free energy due to disulfide bonds and 1–4 van der Waals interactions

In our data set, most of the proteins, both thermophilic and mesophilic, do not possess disulfide bonds. Out of the 23 families, three thermophilic and two mesophilic proteins have one disulfide bond and one mesophilic protein contains two disulfide bonds. A study of the average disulfide bond free energy shows that two thermophilic and two mesophilic proteins have lower energies. In only one family do both the thermophilic protein and its mesophilic counterpart have a disulfide bond, and there the average disulfide bond free energy is the same. The free energies due to the 1–4 van der Waals interactions show a trend similar to the other cases. The average 1–4 van der Waals free energy was lower for 17 thermophilic and 6 mesophilic proteins. The 1–4 van der Waals free energy is linearly related to the number of residues; the correlation of the 1–4 van der Waals free energy with the number of residues is 0.9703 for thermophilic and 0.9794 for mesophilic proteins.

3.5. Combination of energy terms

The calculated energy terms were combined in different combinations. Since the VDWFE values are relatively low with respect to the other four energy terms and are directly proportional to the total number of residues, the contribution due to VDWFE is excluded and the other energy terms are combined.

3.5.1. Two-energy factor

The possible combinations of two energy factors indicate a lower energy state for the thermophilic families in most cases. Table 3 shows which protein has the lower energy state in each family for the different possible combinations of two energy terms. In almost every combination, about 70% of the lower energy states belong to the thermophilic proteins. Considering all six possible combinations, the following 11 thermophilic proteins have a lower energetic contribution than their mesophilic counterparts: Malate dehydrogenase, Cyclodextrin glucanotransferase, Glutamate dehydrogenase, Thermolysin and neutral protease, Adenylate kinase, Ferredoxin, Inorganic pyrophosphatase, Manganese superoxide dismutase, Reductase, Glyceraldehyde-3-phosphate dehydrogenase and Triose phosphate isomerase. Only three mesophilic proteins have the lower energy state in all six combinations.

Table 3 Comparative table showing the possible combinations of two free energy components for all 23 families a

| Family | HFE + EFE | EFE + HBFE | HBFE + SSFE | HFE + SSFE | HFE + HBFE | EFE + SSFE |
|---|---|---|---|---|---|---|
| 1 | T | M | M | T | M | T |
| 2 | T | T | T | T | T | T |
| 3 | T | T | T | M | T | T |
| 4 | T | T | T | T | T | T |
| 5 | T | T | T | T | T | M |
| 6 | T | T | T | T | T | T |
| 7 | M | M | M | M | M | M |
| 8 | T | T | T | T | T | T |
| 9 | M | T | T | M | T | M |
| 10 | M | M | M | T | M | M |
| 11 | M | M | M | M | M | M |
| 12 | T | T | M | T | T | T |
| 13 | T | T | T | M | T | T |
| 14 | T | T | T | T | T | T |
| 15 | T | T | T | T | T | T |
| 16 | T | T | T | T | T | T |
| 17 | T | T | T | T | T | T |
| 18 | M | M | M | M | M | M |
| 19 | T | T | T | T | T | T |
| 20 | M | M | M | M | M | T |
| 21 | T | T | T | T | T | T |
| 22 | T | T | T | T | T | T |
| 23 | T | T | M | T | T | T |

a T indicates that the thermophilic protein has the lower energy state and M indicates that the mesophilic protein has the lower energy state.

3.5.2. Three-energy factor

When the energetic combinations are made with three energy factors, the above trend of higher stability in the thermophilic families is confirmed. Table 4 illustrates the three-term combinations; in each combination 74% of the thermophilic proteins are in the lower energy state. Considering the four possible combinations, in addition to the families showing higher stability in the two-energy factor case, the thermophilic proteins from five other families, viz. Rubredoxin, EF–TU and EF–TU–TS complex, Methionine aminopeptidase, Endo-1,4-β-xylanase and Glycosyltransferase B, also show higher stability in all combinations. Along with the three mesophilic proteins showing the lower energy state in the two-term combinations, two other mesophilic proteins are observed here in the lower energy state.

Table 4 Comparative table showing the possible combinations of three and all four free energy components for all 23 families a

| Family | HFE + EFE + HBFE | HFE + EFE + SSFE | HFE + HBFE + SSFE | EFE + HBFE + SSFE | HFE + EFE + HBFE + SSFE |
|---|---|---|---|---|---|
| 1 | M | T | M | M | M |
| 2 | T | T | T | T | T |
| 3 | T | T | T | T | T |
| 4 | T | T | T | T | T |
| 5 | T | T | T | T | T |
| 6 | T | T | T | T | T |
| 7 | M | M | M | M | M |
| 8 | T | T | T | T | T |
| 9 | T | M | T | T | T |
| 10 | M | M | M | M | M |
| 11 | M | M | M | M | M |
| 12 | T | T | T | T | T |
| 13 | T | T | T | T | T |
| 14 | T | T | T | T | T |
| 15 | T | T | T | T | T |
| 16 | T | T | T | T | T |
| 17 | T | T | T | T | T |
| 18 | M | M | M | M | M |
| 19 | T | T | T | T | T |
| 20 | M | M | M | M | M |
| 21 | T | T | T | T | T |
| 22 | T | T | T | T | T |
| 23 | T | T | T | T | T |

a See Table 3 footnotes.

3.5.3. Four-energy factor

When all four energy terms are combined, the tendency seen in the three-energy term combinations is maintained for both the thermophilic and the mesophilic proteins.

3.6. Comparison with other studies

Vogt et al. [20] reported that an increase in hydrogen bonding and fractional polar surface increases the stability of thermophilic proteins. Xiao and Honig [45] found that electrostatic interactions are more favorable in thermophiles; Kumar et al. [21] showed that salt bridges and side-chain–side-chain hydrogen bonds increase the stability of most of the thermophilic proteins. Recently, Yano and Poulos [25] compiled the factors that are reported to be important for increased protein stability. It has been mentioned that electrostatic interactions, cation–pi interactions, aromatic and hydrophobic interactions, etc. would enhance the stability [26,27]. In this work, we found that the hydrophobic free energy due to main-chain carbon and nitrogen atoms increases the stability of thermophilic proteins.

4. Conclusions

The number of experimentally determined three-dimensional structures of thermostable proteins is still small, and this lack of structural information makes it difficult to choose a mesophilic counterpart from the same family as a thermophilic protein of known structure. These difficulties are a barrier to extensive statistical surveys; nevertheless, a large number of families have been examined in the present work, and the data set chosen is comparatively larger than those of previous efforts [17,20,21]. The comparative analysis of the thermophilic proteins and their mesophilic counterparts in terms of free energy revealed important factors for the stability of thermophilic proteins. We found that the hydrophobic free energy of the carbon and main-chain nitrogen atoms plays an important role in thermostabilization, and that combinations of the different free energies are lower for the thermophilic proteins than for their mesophilic counterparts.

Acknowledgement

K.S. acknowledges the Council of Scientific and Industrial Research (CSIR), Govt. of India, for the award of a Senior Research Fellowship.
References

[1] S. Chakravarty, R. Varadarajan, Biochemistry 41 (2002) 8152.
[2] M. Perutz, H. Raidt, Nature 255 (1975) 256.
[3] P. Argos, M. Rossmann, U. Grau, H. Zuber, G. Frank, J. Tratschin, Biochemistry 25 (1979) 5698.
[4] E. Querol, J.A. Perez-Pons, A. Mozo-Villarias, Protein Eng. 9 (1996) 265.
[5] M.A. Arnott, R.A. Michael, C.R. Thompson, D.W. Hough, M.J. Danson, J. Mol. Biol. 304 (2000) 657.
[6] J. Chen, Z. Lu, J. Sakon, W.E. Stites, J. Mol. Biol. 303 (2000) 125.
[7] J. Hasegawa, S. Uchiyama, Y. Tanimoto, M. Mizutani, Y. Kobayashi, Y. Sambongi, Y. Igarashi, J. Biol. Chem. 275 (2000) 37824.
[8] R.R. Biekofsky, S.R. Martin, J.E. McCormick, L. Masino, S. Fefeu, P.M. Bayley, J. Feeney, Biochemistry 41 (2002) 6850.
[9] H. Takeshita, T. Yasuda, T. Nakajima, K. Mogi, Y. Kaneko, R. Iida, K. Kishi, Eur. J. Biochem. 270 (2003) 307.
[10] S. Fukuchi, K. Nishikawa, J. Mol. Biol. 309 (2001) 835.
[11] M. Akke, S. Forsen, Proteins 8 (1990) 23.
[12] A.M. Facchiano, G. Colonna, R. Ragone, Protein Eng. 11 (1998) 753.
[13] M.M. Gromiha, M. Oobatake, H. Kono, H. Uedaira, A. Sarai, J. Prot. Chem. 18 (1999) 565.
[14] K.A. Dill, Biochemistry 29 (1990) 7133.
[15] G.D. Rose, R. Wolfenden, Ann. Rev. Biophys. Biomol. Struct. 22 (1993) 381.
[16] C.N. Pace, B.A. Shirley, M.A. McNutt, K. Gajiwala, FASEB J. 10 (1996) 75.
[17] M.M. Gromiha, M. Oobatake, A. Sarai, Biophys. Chem. 82 (1999) 51.
[18] J. Hasegawa, S. Uchiyama, Y. Tanimoto, M. Mizutani, Y. Kobayashi, Y. Sambongi, Y. Igarashi, J. Biol. Chem. 275 (2000) 37824.
[19] A. Szilagyi, P. Zavodsky, Struct. Fold Des. 8 (2000) 493.
[20] G. Vogt, S. Woell, P. Argos, J. Mol. Biol. 269 (1997) 631.
[21] S. Kumar, C.J. Tsai, R. Nussinov, Protein Eng. 13 (2000) 179.
[22] S. Kumar, R. Nussinov, Proteins 43 (2001) 433.
[23] M.M. Gromiha, Biophys. Chem. 91 (2001) 71.
[24] P.K. Ponnuswamy, M.M. Gromiha, J. Theor. Biol. 166 (1994) 63.
[25] J.K. Yano, T.L. Poulos, Curr. Opin. Biotechnol. 14 (2003) 360.
[26] M.M. Gromiha, S. Thomas, C. Santhosh, Prep. Biochem. Biotechnol. 32 (2002) 355.
[27] A. Paiardini, G. Gianese, F. Bossa, S. Pascarella, Proteins 50 (2003) 122.
[28] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne, Nucleic Acids Res. 28 (2000) 235.
[29] E.A. Gaucher, J.M. Thomson, M.F. Burgan, S.A. Benner, Nature 425 (2003) 285.
[30] D. Eisenberg, A.D. McLachlan, Nature 319 (1986) 199.
[31] D. Eisenberg, M. Wesson, M. Yamashita, Chem. Scr. 29A (1989) 217.
[32] S.J. Hubbard, J.M. Thornton, 'NACCESS', Computer Program, Department of Biochemistry and Molecular Biology, University College, London, 1993, available from: http://wolf.bi.umist.ac.uk/unix/naccess.html.
[33] A. Shrake, J. Rupley, J. Mol. Biol. 79 (1973) 351.
[34] D.J. Barlow, J.M. Thornton, J. Mol. Biol. 168 (1983) 867.
[35] L.R. Brown, A. DeMarco, R. Richarz, G. Wagner, K. Wuthrich, Eur. J. Biochem. 88 (1978) 87.
[36] A.R. Fersht, J. Mol. Biol. 64 (1972) 497.
[37] H. Nicholson, W.J. Becktel, B.W. Matthews, Nature 336 (1988) 651.
[38] D. Sali, M. Bycroft, A.R. Fersht, Nature 335 (1988) 740.
[39] I.K. McDonald, J.M. Thornton, J. Mol. Biol. 238 (1994) 777.
[40] I.K. McDonald, HBPLUS, London, UCL, 1992.
[41] E. Baker, R. Hubbard, Prog. Biophys. Mol. Biol. 25 (1994) 97.
[42] A. Ben-Naim, J. Phys. Chem. 95 (1991) 1437.
[43] J.M. Thornton, J. Mol. Biol. 151 (1981) 261.
[44] W.D. Cornell, P. Cieplak, C.I. Bayly, I.R. Gould, K.M. Merz, D.M. Ferguson, D.C. Spellmeyer, T. Fox, J.W. Caldwell, P.A. Kollman, J. Am. Chem. Soc. 117 (1995) 5179.
[45] L. Xiao, B. Honig, J. Mol. Biol. 289 (1999) 1435.