Accepted Manuscript
Title: Homology modeling, molecular docking and molecular dynamics studies of the catalytic domain of chitin deacetylase from Cryptococcus laurentii strain RY1
Authors: Soumyadev Sarkar, Suchetana Gupta, Writachit Chakraborty, Sanjib Senapati, Ratan Gachhui
PII: S0141-8130(16)31622-1
DOI: http://dx.doi.org/doi:10.1016/j.ijbiomac.2017.03.057
Reference: BIOMAC 7223
To appear in: International Journal of Biological Macromolecules
Received date: 14-9-2016
Revised date: 13-2-2017
Accepted date: 11-3-2017
Please cite this article as: Soumyadev Sarkar, Suchetana Gupta, Writachit Chakraborty, Sanjib Senapati, Ratan Gachhui, Homology modeling, molecular docking and molecular dynamics studies of the catalytic domain of chitin deacetylase from Cryptococcus laurentii strain RY1, International Journal of Biological Macromolecules http://dx.doi.org/10.1016/j.ijbiomac.2017.03.057
Homology modeling, molecular docking and molecular dynamics studies of the catalytic domain of chitin deacetylase from Cryptococcus laurentii strain RY1
Soumyadev Sarkar a, Suchetana Gupta b, Writachit Chakraborty a, Sanjib Senapati b, Ratan Gachhui a,*
a Department of Life Science & Biotechnology, Jadavpur University, India
b Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, India
* Address for correspondence:
Ratan Gachhui, Ph. D. Professor Department of Life Science & Biotechnology, Jadavpur University, 188, Raja S.C. Mallick Road, Kolkata – 700032, India Tel: +913324572189 Fax: +913324137121 Email address:
[email protected]
Abstract
This study provides structural insights into chitin deacetylase, which is overexpressed under nitrogen-limiting conditions in Cryptococcus laurentii strain RY1. The enzyme converts chitin, the second most abundant natural biopolymer, to chitosan, which has applications in diverse fields. To elucidate the structure-function relationship of this biologically and industrially important enzyme, a homology model of the catalytic domain was constructed. The stability of the structure was assessed by molecular dynamics simulation. Upon docking, tryptophan 151 of the domain was found to form a hydrogen bond and a stacking interaction with chitin. In silico substitution of tryptophan (W) with alanine (A), phenylalanine (F) and aspartate (D) corroborated the importance of the tryptophan residue in the interaction with the substrate. This is the first report on the structural characteristics of a chitin deacetylase from Cryptococcus and on how the enzyme approaches its substrate. Our results should help in designing experimental validations and in applying quantum mechanics / molecular mechanics techniques to determine the detailed catalytic mechanism and enhance the industrial potential of the enzyme.
Keywords: chitin deacetylase, Cryptococcus laurentii strain RY1, molecular dynamics.
1. Introduction
The enzyme chitin deacetylase (CDA) (EC 3.5.1.41) catalyzes the hydrolysis of the acetamido group of N-acetylglucosamine (GlcNAc) in chitin to produce the deacetylated form, chitosan. Chitin, a long-chain polymer of N-acetylglucosamine, has been known since ancient times and is one of the most abundant natural polymers. However, because of its crystalline structure and its insolubility in water and most organic solvents, chitin has only limited applications. Chitosan, on the other hand, is readily soluble in dilute acidic solutions [1] and offers great possibilities in different fields owing to its biodegradability, biorenewability, biocompatibility, physiological inertness and hydrophilicity. Chitosan currently has wide applications ranging from biomedicine to water treatment. In addition, chitosan has been reported to exhibit some intriguing physiological properties, such as antitumor and antimicrobial activities and elicitor activity in plants [2].
Most biological activities of chitosan are determined by its degree of polymerization, degree of acetylation and pattern of acetylation, which defines the distribution of GlcNAc and glucosamine (GlcN) moieties in the chitosan chain. Commercial chitosan is generally produced by a harsh thermochemical process that is environmentally unsafe and poorly controlled and that results in significant variation of the end products, especially in their pattern of acetylation [3]. This introduces variation into studies of the biological activities of chitosan. In recent times, CDA has been gaining attention for chitosan production, as it could potentially eliminate these drawbacks through an environmentally friendly, well-controlled and economical process.
The presence of CDA has been reported in several diverse organisms: Mucor rouxii [4, 5], Saccharomyces cerevisiae [6, 7], Aspergillus nidulans [8], Absidia coerulea [9], Colletotrichum lindemuthianum [10], Cryptococcus laurentii strain RY1 [11], Cnaphalocrocis medinalis [12], Colletotrichum gloeosporioides [13], Penaeus monodon [14], Aspergillus flavus [15], Cryptococcus neoformans [16], Helicoverpa armigera [17], etc.
Yet most environmental sources remain unexplored. CDAs from different sources may differ in activity, specificity, efficiency and stability, which can help to identify more suitable conditions for bioprospecting uses. Exploring CDAs from new sources may also reveal novel biological roles of CDA of which we remain unaware. Cryptococcus CDAs are biologically significant as well. CDA from Cryptococcus neoformans was reported to be responsible for cell wall integrity during vegetative growth [16]. We, too, established that the CDA from Cryptococcus laurentii strain RY1 is overexpressed under nitrogen depletion, and we speculated that this up-regulation of the CDA gene might play a role in the organism's growth under nitrogen limitation [11]. Thus, the enzyme is of considerable biological and industrial significance.
The CAZY database defines CDA as a member of the carbohydrate esterase family CE-4 (http://afmb.cnrs-mrs.fr/~cazy/CAZY) [18]. According to the Henrissat classification, the members share a conserved region of the primary amino acid structure named the NodB homology domain or polysaccharide deacetylase domain [19]. These enzymes adopt a (β/α)8 fold comprising a single catalytic domain that contains most of the conserved residues and bears a substrate-binding groove [20, 21]. The presence of divalent metal ions such as Zn2+, Ca2+ and Co2+ was reported to enhance the catalytic efficiency of CDA across various microorganisms, indicating that it is a metalloprotein [22, 23, 24]. The existence of a His-His-Asp zinc-binding triad was shown in the crystal structure of the CDA of Colletotrichum lindemuthianum [10].
In the present study, homology modeling, molecular docking, molecular dynamics simulation and in silico mutagenesis were used to understand the structure of Cryptococcus laurentii strain RY1 CDA and its substrate binding. The three-dimensional model of the catalytic domain of the protein was generated using the crystal structure of chain A of Aspergillus nidulans CDA. Binding between the enzyme and the ligand was analyzed by interaction energy calculations. The stability of the enzyme and of the enzyme-ligand complexes was validated by molecular dynamics simulation. In the absence of a crystal structure of a CDA from Cryptococcus, this study primarily investigates the structural features of CDA. Protein – carbohydrate interactions typically occur through stacking interactions, hydrogen bonding and the participation of water molecules [25]. We predicted that residue W151 of the catalytic domain interacts with the substrate through (1) stacking interactions between its aromatic ring and chitin and (2) hydrogen bonding contributed by the NH of the indole ring of Trp to chitin. The investigation may pave the way to a better understanding of the catalytic site and the structure-function relationship of the enzyme.
2. Materials and methods
2.1. Nucleotide sequence
The partial nucleotide sequence of CDA was first identified by mRNA differential display as a band overexpressed on a gel under nitrogen-limiting conditions [11]. The band was then sequenced (GenBank accession no. KP244450) and searched against the whole-genome assembly of Cryptococcus laurentii strain RY1 (GenBank accession no. JDSR00000000) to obtain the complete cDNA sequence. The complete CDS of CDA was deposited under GenBank accession no. KX756630.
2.2. Phylogenetic analysis
The protein sequence was predicted from the cDNA by GeneMark-ES [26]. It was then aligned to reference sequences obtained from the NCBI Protein database using ClustalW [27, 28] within SeaView [29]. A phylogenetic tree was prepared from this alignment by the protein maximum likelihood method (promlk) of PHYLIP 3.695 [30].
2.3. Sequence analysis
Physico-chemical parameters of CDA were computed with the ProtParam tool of ExPASy (http://web.expasy.org/protparam/). These parameters included molecular weight, amino acid composition, isoelectric point (pI), total number of negatively and positively charged residues, instability index, aliphatic index and grand average of hydropathy (GRAVY). The protein domain of CDA was identified using the web servers Pfam [31] and SMART [32]. Secondary structure was predicted with SOPMA [33]. Disulphide connectivity within the sequence was predicted by DISULFIND [34]. The SignalP 4.1 server [35] was used to detect whether a signal peptide was present in the protein. Potential N- and O-glycosylation sites in the protein were predicted by the NetNGlyc 1.0 server [36] and the NetOGlyc 4.0 server [37], respectively. The presence of a GPI anchor site and of transmembrane regions in CDA was assessed with the PredGPI predictor [38] and the TMPred server [39], respectively.
2.4. Homology modeling and validation
Three-dimensional structures of CDA were constructed by homology modeling using Geno3D [40], PHYRE 2 [41], (PS)2-v2: Protein Structure Prediction Server [42, 43] and MODELLER 9.15 [44, 45]. Sequence alignment was carried out with ESPript3.0 [46]. The quality of the models was evaluated using RAMPAGE [47], PROCHECK [48], ERRAT [49] and QMEAN [50, 51, 52, 53]. The model was visualized with Chimera [54] and PyMOL (The PyMOL Molecular Graphics System, Version 1.8 Schrödinger, LLC). The tertiary structure of CDA was deposited in the Protein Model Data Base (PMDB; http://mi.caspur.it/PMDB/) and was assigned the identifier PM0080572.
2.5. Molecular dynamics simulation of protein in water
The homology model was taken as the starting structure. Missing atoms and hydrogen atoms were added using the LEAP module of AMBER12 [55, 56, 57, 58], and the structure was energy minimized for 2000 steps using the steepest descent and conjugate gradient algorithms. The resulting structure was then placed in a solvent box of water molecules extending 9 Å on each side. The TIP3P model was chosen to describe the water molecules. The charge was neutralized by adding the requisite number of counter-ions. The hydrated protein was energy minimized to remove the initial bad contacts created by the addition of water and hydrogen atoms. The system was then equilibrated in the NVT ensemble at 300 K for 500 ps, and the system density was equilibrated in the NPT ensemble at 1 atm pressure for 1 ns. The simulation time step was 2 fs. After the density and energy had converged, a 70 ns production run was generated using AMBER12 to obtain a stable structure. The simulation results were analyzed with the ptraj module and visualized using VMD [59].
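The stage lengths and the 2 fs time step given above translate directly into integration step counts, which is how run lengths are usually specified in simulation input. The following Python sketch shows only that arithmetic; the stage names are illustrative labels chosen here, not AMBER keywords, and nothing in it reproduces the actual input files used in this study.

```python
# Convert the simulation stages described in the text into step counts.
# Stage names are illustrative labels (an assumption), not AMBER keywords.

TIME_STEP_FS = 2  # integration time step stated in the text, in femtoseconds

# Stage durations from the text, expressed in femtoseconds.
STAGE_DURATION_FS = {
    "nvt_equilibration": 500 * 1_000,      # 500 ps
    "npt_equilibration": 1 * 1_000_000,    # 1 ns
    "production": 70 * 1_000_000,          # 70 ns
}


def steps_for(duration_fs: int, time_step_fs: int = TIME_STEP_FS) -> int:
    """Return the number of integration steps needed to cover duration_fs."""
    if duration_fs % time_step_fs != 0:
        raise ValueError("duration is not a whole number of time steps")
    return duration_fs // time_step_fs


STEP_COUNTS = {name: steps_for(d) for name, d in STAGE_DURATION_FS.items()}

if __name__ == "__main__":
    for name, steps in STEP_COUNTS.items():
        print(f"{name}: {steps:,} steps")
```

Under these figures the production run corresponds to 35 million steps, which gives a sense of the computational cost behind the 70 ns trajectory.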
2.6. Molecular docking with chitin
The stabilized apo structure of CDA was used to dock the metal ion (Cobalt (II) ion) and the ligand (chitin) to the binding site using AutoDock4 [60]. The grid box was built carefully, taking into account the known interacting residues from similar enzymes. The structure of chitin was optimized using Gaussian [61], and the charges, calculated with B3LYP (basis set 6-311G++), were used in the simulation input files.
2.7. Molecular dynamics simulation of the CDA – chitin complex
The docked structure was energy minimized in the gas phase for 500 steps using the steepest descent and conjugate gradient methods. The energy-minimized structure was then hydrated in explicit water using a TIP3P water box (box size 9 Å on each side). The system was energy minimized and equilibrated for temperature and pressure with the same parameters as described in sub-section 2.5. Molecular dynamics simulation was then run for 60 ns using the pmemd.cuda executable of AMBER12. Restraints were placed on the Cobalt (II) ion and the chitin structure to prevent them from moving away from the binding site. The trajectory was visualized using VMD, and analyses were done using the ptraj module of AMBER12.
2.8. In silico mutagenesis of the interacting tryptophan
The W151 residue of the model (W314 with respect to the complete protein sequence) was substituted with alanine, phenylalanine and aspartate using Swiss PDB Viewer [62] to generate the mutated systems. Each mutated system was equilibrated and a production run was generated as described in sub-section 2.7.
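At the sequence level, the three substitutions amount to replacing one letter at position 151 of the modeled domain. The Python sketch below illustrates that bookkeeping only; it does not replace the structural mutagenesis performed in Swiss PDB Viewer. The function name is hypothetical, and the sequence is a placeholder of the right length that contains only the D-T-T-D-W motif at positions 147 to 151, not the real CDA sequence.

```python
import re


def apply_point_mutation(sequence: str, mutation: str) -> str:
    """Apply a substitution such as 'W151A' (1-based position) to a sequence."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    wild_type = match.group(1)
    position = int(match.group(2))
    replacement = match.group(3)
    index = position - 1  # convert to 0-based indexing
    if not 0 <= index < len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[index]}"
        )
    return sequence[:index] + replacement + sequence[index + 1:]


# Placeholder domain (assumption): 229 residues, alanine everywhere except
# the D-T-T-D-W motif at positions 147-151. This is NOT the real sequence.
PLACEHOLDER_DOMAIN = "A" * 146 + "DTTDW" + "A" * 78

MUTANTS = {
    label: apply_point_mutation(PLACEHOLDER_DOMAIN, label)
    for label in ("W151A", "W151F", "W151D")
}

if __name__ == "__main__":
    for label, seq in MUTANTS.items():
        print(label, seq[146:151])
```

Checking the wild-type letter before substituting guards against off-by-one numbering errors, which are easy to make when a model and the full-length protein use different residue numbers.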
2.9. Binding free energy analysis
Energy calculations were performed using the MM-GBSA (molecular mechanics generalized Born surface area) suite of AMBER12 [63]. The binding free energy was calculated as:
$$\Delta G_{\mathrm{bind}} = \Delta E_{\mathrm{MM}} + \Delta G_{\mathrm{solv}} - T\Delta S$$
where $\Delta E_{\mathrm{MM}}$, $\Delta G_{\mathrm{solv}}$ and $T\Delta S$ represent the molecular mechanics energy change, the solvation free energy change and the conformational entropy change upon binding, respectively. The entropy contribution was neglected because the states being compared have similar entropy.
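The reasoning behind neglecting the entropy term can be made concrete with a small calculation: when two systems have the same entropic contribution, it cancels in the difference of their binding free energies. The Python sketch below implements the expression above with the entropy term defaulting to zero. The class and function names are hypothetical and all numbers are invented for illustration; none of them are results from this study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BindingTerms:
    """Energy terms of the binding free energy expression, in kcal/mol."""
    delta_e_mm: float        # molecular mechanics energy change
    delta_g_solv: float      # solvation free energy change
    t_delta_s: float = 0.0   # entropic term; 0.0 mirrors neglecting it


def binding_free_energy(terms: BindingTerms) -> float:
    """Return dG_bind = dE_MM + dG_solv - T*dS."""
    return terms.delta_e_mm + terms.delta_g_solv - terms.t_delta_s


def relative_binding(mutant: BindingTerms, wild_type: BindingTerms) -> float:
    """Return dG_bind(mutant) - dG_bind(wild type)."""
    return binding_free_energy(mutant) - binding_free_energy(wild_type)


# Invented example values (assumption), not data from the paper.
WILD_TYPE = BindingTerms(delta_e_mm=-40.0, delta_g_solv=25.0)
MUTANT = BindingTerms(delta_e_mm=-30.0, delta_g_solv=20.0)

# Same systems with an identical, non-zero entropic term added to both.
WILD_TYPE_S = BindingTerms(-40.0, 25.0, t_delta_s=3.0)
MUTANT_S = BindingTerms(-30.0, 20.0, t_delta_s=3.0)

if __name__ == "__main__":
    print(relative_binding(MUTANT, WILD_TYPE))
    print(relative_binding(MUTANT_S, WILD_TYPE_S))
```

Both comparisons give the same relative value, which is the sense in which the entropic term can be dropped when only differences between similar states are of interest.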
3. Results and discussion
3.1. Nucleotide sequence
The sequence of the band found by mRNA differential display to be overexpressed under nitrogen-limiting conditions showed 74% identity to the CDA mRNA sequence of Cryptococcus gattii WM276 (GenBank accession no. XM_003194806). A search against the whole-genome assembly of Cryptococcus laurentii strain RY1 showed that the cDNA portion belonged to contig 420. GeneMark-ES further predicted a protein 479 amino acid residues long corresponding to the overexpressed sequence.
3.2. Phylogenetic analysis
Phylogenetic analysis was based on the protein sequence of CDA from Cryptococcus laurentii strain RY1 (Fig. 1). The tree showed that the CDA from Cryptococcus laurentii strain RY1 clustered with the CDA proteins from Cryptococcus neoformans and Cryptococcus gattii. Although this protein grouped with similar proteins from the genus, the bootstrap value at its node was higher than the threshold (>55%). The separation is therefore statistically supported and indicates a phylogeny distinct from that of similar proteins within the genus.
3.3. Sequence analysis
The predicted protein contained 479 amino acid residues (Fig. 2) and had a molecular mass of 49.7 kDa. Threonine (11.7%), serine (10.6%) and alanine (10.4%) were the most frequent amino acids in the sequence. The isoelectric point (pI) was calculated to be 4.68, which indicated that the protein is acidic. The instability index of 28.40, below the threshold of 40, classified the protein as stable. The aliphatic index of a protein, defined as the relative volume occupied by the aliphatic side chains of alanine, valine, isoleucine and leucine, may be regarded as a positive factor for the thermal stability of globular proteins [64]. The high aliphatic index (82.13) indicated that the protein is stable over a broad range of temperatures. The GRAVY value was negative (-0.010), which suggested favorable interaction of this protein with the surrounding water molecules. Pfam identified a polysaccharide deacetylase domain, with the HMM alignment spanning residues 174 to 292. The SMART server confirmed the Pfam result, suggesting that the polysaccharide deacetylase domain spanned residues 171 to 294. The NodB homology domain characteristic of the CE4 superfamily enzymes was also identified in the sequence, as highlighted in Fig. 2. Secondary structure prediction by SOPMA suggested a high proportion of random coil (49.48%), followed by alpha helix (27.97%) and extended strand (16.28%). Although seven cysteine residues were present in the protein, the DISULFIND server did not predict any disulfide connectivity. The SignalP 4.1 server predicted a signal peptide spanning residues 1 to 17, with a cleavage site between the 17th and 18th residues. The presence of the signal peptide at the extreme N-terminal end was further corroborated by a transmembrane helix predicted by the TMpred server at residues 1-17.
The server also predicted two more transmembrane helices, at residues 284-309 and 457-479. The 284-309 helix was in the inside-outside orientation, and the C-terminal helix was in the outside-inside orientation. This indicated that the protein is membrane bound, with a long intracellular domain spanning residues 18-283 and a long extracellular domain spanning residues 310-456. A single omega site was predicted at residue 454, suggesting the presence of a GPI anchor. The protein contained two predicted N-glycosylation sites, at N119 and N305, and was also predicted to be O-glycosylated at several sites. Therefore, the protein might be both N- and O-glycosylated.
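Two of the physico-chemical descriptors reported in this section, GRAVY and the aliphatic index, are simple functions of amino acid composition. The Python sketch below shows how they are commonly defined: GRAVY as the mean Kyte-Doolittle hydropathy per residue, and the aliphatic index as a weighted sum of the mole percentages of Ala, Val, Ile and Leu with coefficients 2.9 and 3.9. It is an illustration under those standard definitions, not the ProtParam implementation, and the test sequences are toy strings rather than the CDA sequence.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _validate(sequence: str) -> None:
    """Reject empty sequences and non-standard residue letters."""
    if not sequence:
        raise ValueError("sequence is empty")
    unknown = set(sequence) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"non-standard residues: {sorted(unknown)}")


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean hydropathy value per residue."""
    _validate(sequence)
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence: str) -> float:
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    _validate(sequence)
    n = len(sequence)
    x_ala = 100.0 * sequence.count("A") / n
    x_val = 100.0 * sequence.count("V") / n
    x_ile = 100.0 * sequence.count("I") / n
    x_leu = 100.0 * sequence.count("L") / n
    return x_ala + 2.9 * x_val + 3.9 * (x_ile + x_leu)


if __name__ == "__main__":
    toy = "AVIL"  # toy sequence (assumption), not the CDA sequence
    print(gravy(toy), aliphatic_index(toy))
```

A negative GRAVY means that hydrophilic residues outweigh hydrophobic ones on average; a value as close to zero as -0.010 indicates that the two are nearly balanced.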
3.4. Homology modeling and validation
A three-dimensional structure of CDA was constructed by homology modeling based on the crystal structure of chain A of Aspergillus nidulans CDA (resolution: 1.99 Å; PDB: 2Y8U chain A). As no homology was available between the query and the subject for the N and C termini, only the 229 residues corresponding to the catalytic domain (amino acids 164 – 392 of the complete protein) were modeled. When the catalytic domain of CDA from Cryptococcus laurentii strain RY1 was aligned with that of Aspergillus nidulans, it showed 31% identity, 39% sequence similarity and 93% query cover (Fig. 3). A total of seventeen models were generated: five by MODELLER, one by the (PS)2-v2: Protein Structure Prediction Server, one by PHYRE 2 and ten by Geno3D. The models generated by MODELLER were ranked according to the discrete optimized protein energy (DOPE) scoring function [65]. The final model selected was the one with the maximum number of residues in the favored regions and the minimum number of residues in the disallowed regions according to Ramachandran plot analysis by RAMPAGE, together with the best overall stereochemical quality according to PROCHECK (Table 1).
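The identity figure quoted in this section is, in essence, the fraction of aligned positions at which query and template carry the same residue. The Python sketch below shows one common way to compute such a percentage from a gapped pairwise alignment. The function name, the use of `-` as the gap character and the toy sequences are assumptions made for illustration; alignment tools differ in which positions they count in the denominator, so this is not necessarily the convention behind the 31% value.

```python
def percent_identity(aligned_query: str, aligned_subject: str) -> float:
    """Percent identity over alignment columns where neither row has a gap."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences have a residue.
    pairs = [
        (q, s)
        for q, s in zip(aligned_query, aligned_subject)
        if q != "-" and s != "-"
    ]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for q, s in pairs if q == s)
    return 100.0 * matches / len(pairs)


if __name__ == "__main__":
    # Toy alignment (assumption), not the real query/template alignment.
    print(percent_identity("HTWSHP", "HSW-HP"))
```

In the toy alignment, five columns are ungapped and four of them match, so the function reports 80% identity.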
RAMPAGE analysis of the selected model (Model 1) revealed that 93.4% of the residues were in favored regions, 4.4% in allowed regions and only 2.2% in disallowed regions (Fig. 4a). Five residues were identified as outliers, but none of them was among the residues predicted to be critical for ligand interaction in previous studies of similar enzymes [10]. ERRAT assesses protein structures against statistics derived from crystallographic structures. The ERRAT score for the model was 53.394%. QMEAN, on the other hand, estimates the global (whole-structure) and local (per-residue) error. It assigns the model a global score that scales its reliability from zero to one. The QMEAN score was calculated to be 0.550. The QMEAN Z-score uses reference structures solved by X-ray crystallography to estimate the absolute quality of a model. For this model, the Z-score was estimated to be -2.37 (Fig. 4b). Taken together, the QMEAN score, the Z-score and the ERRAT analysis indicated that the model was of acceptable quality.
3.5. Molecular dynamics simulation of protein in water
A 70 ns simulation was performed to refine the structure generated by homology modeling. The Ramachandran plot of the refined structure showed no residues in the disallowed regions (Fig. 4c), and the overall quality factor from ERRAT analysis increased to 73.262%, indicating an improved structure (Fig. 4d). The RMSD for the production run was computed and plotted using XMGRACE. As shown in Fig. 5a, the trajectory has an overall stable RMSD of around 5 Å that flattens from 10 ns to 70 ns. The radius of gyration was also calculated for the trajectory and stabilized at an average value of 3.44 nm over the course of equilibration (Fig. 5b). The average energy was -79900 kcal/mol, and the energy of the system showed convergence to a stable state (Fig. 5c). The B-factor plot also revealed that residue-level fluctuations were minimal except for the terminal residues (Fig. 5d). These results indicated that the simulation was stable and that the generated structure had attained stability. On the basis of the molecular dynamics results and the quality checks from ERRAT and the Ramachandran plot, the refined structure was considered fit for docking studies.
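RMSD and radius of gyration, the two structural measures used above to judge stability, have short definitions that can be written out directly. The Python sketch below computes both for plain lists of Cartesian coordinates. It is a simplified illustration: the function names are hypothetical, the coordinates are toy values, and the RMSD is computed without first superimposing the frame on the reference, a step that trajectory analysis tools normally perform and that is omitted here.

```python
import math
from typing import Optional, Sequence, Tuple

Coord = Tuple[float, float, float]


def rmsd(reference: Sequence[Coord], frame: Sequence[Coord]) -> float:
    """Root-mean-square deviation between matched atoms (no fitting)."""
    if len(reference) != len(frame) or not reference:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(reference, frame)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(reference))


def radius_of_gyration(
    coords: Sequence[Coord],
    masses: Optional[Sequence[float]] = None,
) -> float:
    """Mass-weighted RMS distance of atoms from their center of mass."""
    if not coords:
        raise ValueError("no coordinates given")
    if masses is None:
        masses = [1.0] * len(coords)  # equal weights if masses are unknown
    if len(masses) != len(coords):
        raise ValueError("one mass per atom is required")
    total = sum(masses)
    center = tuple(
        sum(m * c[i] for m, c in zip(masses, coords)) / total
        for i in range(3)
    )
    squared = sum(
        m * sum((c[i] - center[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return math.sqrt(squared / total)


if __name__ == "__main__":
    ref = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]    # toy coordinates
    moved = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)]  # rigid shift along z
    print(rmsd(ref, moved), radius_of_gyration(ref))
```

A plateau in RMSD over time shows that the structure has stopped drifting from its reference, while a steady radius of gyration shows that its overall compactness is no longer changing.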
3.6. Molecular docking with chitin
The active site of CDA was identified by sequence alignment. Five CE-4 motifs and several conserved residues were identified across the CDAs from Cryptococcus laurentii strain RY1, Aspergillus nidulans and Colletotrichum lindemuthianum [10]. Motif 1, 19(S-Y-D-D)22, contains two consecutive aspartic acid residues; the first was previously reported to bind the zinc or cobalt ion and the second to bind the acetate released from the substrate. Motif 2, 71(H-T-W-S-H-P)76, includes two histidine residues that form part of the metal-binding triad. The serine or threonine has been reported to form a hydrogen bond with the second histidine, which contributes to loop stabilization. Motif 3, 109(R-P-P-Y)112, forms one side of the active site groove, while motif 4, 147(D-T-T-D-W)151, occupies the other side. Motif 3 is multifunctional, forming associations with metals, acetate, etc. The tryptophan of the fourth motif is considered to be the most important residue. The fifth motif, 181(G-F-I-V-L-Q-H)187, contributes by binding acetate [10, 21]. Therefore, the chitin molecule was docked against residues including D21, R109, D150 and H187, and Cobalt (II) against D22, H71 and H75, in such a way that the known interactions with these residues were conserved. Docking was performed using the genetic algorithm scheme in AutoDock Tools, and the best-docked structure was chosen as the one with the most favorable binding energy (-4.5 kcal/mol). The docked structure predicted interactions with the neighboring residues Y20, T43, F44, H71, R109, I131, W151, I183, V184, L185 and Q186 (corresponding to Y183, T206, F207, H234, R272, I294, W314, I346, V347, L348 and Q349 of the complete protein). R109 and W151 are among the residues known from experiments on similar enzymes, in good agreement with our predicted interactions (Fig. 6a, Fig. 6b). Among the interacting residues, Y20 showed an error value above 99% in the ERRAT plot. The Y20 residue was therefore re-analyzed with two more independent web servers. Both servers, Verify 3D [66, 67] and the ProSA web server [68, 69], yielded satisfactory scores for this residue: Verify3D returned a score of 0.59, and ProSA yielded negative values for Y20, which supported the reliability of the residue.
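Because the model covers residues 164 to 392 of the complete protein, every residue has two numbers, and the model number is converted to the full-length number by adding a fixed offset of 163. The Python sketch below encodes that conversion and applies it to the interacting residues listed above; the function and constant names are hypothetical.

```python
MODEL_START_IN_FULL_PROTEIN = 164  # first modeled residue, as stated in the text
MODEL_LENGTH = 229                 # number of modeled residues (164-392)


def model_to_full(label: str) -> str:
    """Convert a model residue label such as 'W151' to full-length numbering."""
    residue, position = label[0], int(label[1:])
    if not 1 <= position <= MODEL_LENGTH:
        raise ValueError(f"{label} is outside the modeled domain")
    return f"{residue}{position + MODEL_START_IN_FULL_PROTEIN - 1}"


# Interacting residues in model numbering, as listed in the text.
INTERACTING_MODEL = [
    "Y20", "T43", "F44", "H71", "R109", "I131",
    "W151", "I183", "V184", "L185", "Q186",
]

INTERACTING_FULL = [model_to_full(label) for label in INTERACTING_MODEL]

if __name__ == "__main__":
    for model_label, full_label in zip(INTERACTING_MODEL, INTERACTING_FULL):
        print(f"{model_label} -> {full_label}")
```

The converted labels match the full-length numbering given in the text, including W151 mapping to W314.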
3.7. Molecular dynamics simulation of the CDA – chitin complex
The docked complex was simulated for 60 ns and the resulting trajectory was analyzed for stability. An RMSD of 2.5 Å over a considerable part of the trajectory indicated that the simulated system was stable (Fig. 7a). The B-factor plot for the system showed stability, with only the terminal residues showing marginal fluctuations (Fig. 7b). The radius of gyration of the simulated protein was stable at about 3.605 nm over the entire duration of the simulation (Fig. 7c). Further evidence for the stability of the docked structure came from the total energy of the system over the 60 ns, which averaged about -95250 kcal/mol (Fig. 7d). The energy contribution of each amino acid in the protein was checked using the MM-GBSA suite. Of all the predicted interacting residues, W151 showed an average value of -2.7 kcal/mol, indicating its importance in the protein-ligand complex (Table 2).
3.8. In silico mutagenesis of the interacting tryptophan
The simulation studies and energy calculations pointed to the involvement of W151 in substrate interaction. To confirm the contribution of this residue, mutation studies were performed. Since, to the best of our knowledge, no similar mutations have been reported, a few general substitutions were tested. W151A was chosen because alanine is a hydrophobic amino acid with a very small side chain (a single methyl group), which helps to determine the effect of the side chain on the interactions with chitin. W151F was chosen because phenylalanine, like tryptophan, has a hydrophobic aromatic side chain but lacks the NH moiety and therefore cannot donate a hydrogen as tryptophan does. W151D was used as a further test case because the aspartate side chain differs in nature from that of tryptophan, making it a suitable candidate to emphasize the effect of the side chain on the interactions with chitin. The average energy contribution of F151 was about -1 kcal/mol, a minor difference from the value for W151. This small difference can be explained by the fact that the phenyl ring of Phe can still interact with chitin through aromatic stacking but can no longer donate a hydrogen like the indole ring of Trp, as is evident from our hydrogen bond analysis. For the W151A and W151D mutations, however, the values were higher (~ +0.025 kcal/mol). The inability of the non-aromatic residues to form stacking interactions, together with the loss of the hydrogen bond, would account for the rise in the energy values. The results therefore indicate that W151 participates in a two-way protein – carbohydrate interaction, stacking and direct hydrogen bonding, which confirms the significance of W151.
4. Conclusions
CDAs have significant biological implications. Moreover, the enzyme plays a key role in the production of chitosan and thereby occupies a pivotal position industrially. In this computational study, a model of the CDA catalytic domain was predicted by molecular modeling. Molecular dynamics simulation and the associated analyses showed that the apo structure is stable. From the docking results, the key residue responsible for chitin binding through hydrogen bonding and stacking interaction was identified. The importance of residue W151 was further shown by in silico site-directed mutagenesis. Further exploration may be possible by validating the simulation results with experimental data. In addition, a QM/MM study of the enzyme reaction mechanism is one possible future direction of this work, which would help to explain the functioning of the enzyme as a whole and its interactions with possible substrates. Our computational investigation should therefore facilitate the future modification of Cryptococcus laurentii strain RY1 CDA for biotechnological purposes.
Acknowledgements
We are thankful to HPCE, IIT Madras for providing us with the super-computing facilities. We are grateful to Dr. Ashutosh Mukherjee (Vivekananda College, West Bengal, India) for helping us with the construction of the homology model and Dr. Somnath Chakravorty (Department of Biochemistry and Molecular Biophysics, Kansas State University, USA) for designing the phylogenetic tree. We would also like to thank Dr. Semantee Bhattacharya, Debanjana Bhattacharya and Avishek Mukherjee for their help in language editing. Soumyadev Sarkar is the recipient of the State Government Fellowship, Govt. of West Bengal. We thank Department of Biotechnology, Govt. of West Bengal, India for financial support [BT 57 dated 26.11.2014]. The authors would like to thank one anonymous reviewer for providing significant suggestions that contributed to the improvement of the manuscript.
References [1] C.K.S. Pillai, W. Paul, C.P. Sharma, Chitin and chitosan polymers: Chemistry, solubility and fiber formation, Prog. Polym. Sci. 34 (2009) 641-678. [2] I.Tsigos, A. Martinou, D. Kafetzopoulos, V. Bouriotis, Chitin deacetylases: new, versatile tools in biotechnology, Trends Biotechnol. 18 (2000) 305-312. [3] Y. Zhao, W.T. Ju, G.H. Jo, W.J. Jung, R.D. Park, Perspectives of Chitin Deacetylase Research, Biotechnology of Biopolymers. (2011) 131-144. [4] Y. Araki, E. Ito, A pathway of chitosan formation in Mucor rouxii: Enzymatic deacetylation of chitin, Eur. J. Biochem. 55 (1975) 71-78. [5] D. Kafetzopoulos, A. Martinou, V. Bouriotis, Bioconversion of chitin to chitosan: purification and characterization of chitin deacetylase from Mucor rouxii, Proc. Natl. Acad. Sci. U.S.A. 90 (1993) 2564-2568. [6] A. Christodoulidou, V. Bouriotis, G. Thireos, Two sporulation-specific chitin deacetylaseencoding genes are required for the ascospore wall rigidity of Saccharomyces cerevisiae, J. Biol. Chem. 271 (1996) 31420-31425. [7] A. Christodoulidou, P. Briza, A. Ellinger, V. Bouriotis, Yeast ascospore wall assembly requires two chitin deacetylase isozymes, FEBS Lett. 460 (1999) 275-279.
[8] C. Alfonso, O.M. Nuero, F. Santamaria, F. Reyes, Purification of a heat-stable chitin deacetylase from Aspergillus nidulans and its role in cell wall degradation, Curr. Microbiol. 30 (1995) 49-54. [9] X.D. Gao, T. Katsumoto, K. Onodera, Purification and characterization of chitin deacetylase from Absidia coerulea, J. Biochem. Tokyo. 117 (1995) 257-263. [10] D.E. Blair, O. Hekmat, A.W. Schuttelkopf, B. Shrestha, K. Tokuyasu, S.G. Withers, D.M.F. van Aalten, Structure and Mechanism of Chitin Deacetylase from the Fungal Pathogen Colletotrichum lindemuthianum, Biochemistry-US. 45 (2006) 9416-9426. [11] W. Chakraborty, S. Sarkar, S. Chakravorty, S. Bhattacharya, D. Bhattacharya, R. Gachhui, Expression of a chitin deacetylase gene, up-regulated in Cryptococcus laurentii strain RY1, under nitrogen limitation, J. Basic Microb. 56 (2016) 576-579. [12] H.Z. Yu, M.H. Liu, X.Y. Wang, X. Yang, W.L. Wang, L. Geng, D. Yu, X.L. Liu, G.Y. Liu, J.P. Xu, Identification and expression profiles of chitin deacetylase genes in the rice leaf folder, Cnaphalocrocis medinalis, J. Asia Pac. Entomol. 19 (2016) 691–696. [13] N. Pacheco, S. Trombotto, L. David, K. Shirai, Activity of chitin deacetylase from Colletotrichum gloeosporioides on chitinous substrates, Carbohydr. Polym. 96 (2013) 227-32. [14] K.P. Sarmiento, V.A. Panes, M.D. Santos, Molecular cloning and expression of chitin deacetylase 1 gene from the gills of Penaeus monodon (black tiger shrimp), Fish Shellfish Immunol. 55 (2016) 484-489. [15] K. Narayanan, B. Parameswaran, A. Pandey, Production of chitin deacetylase by Aspergillus flavus in submerged conditions, Prep. Biochem. Biotechnol. 46 (2016) 501-508.
[16] L.G. Baker, C.A. Specht, M.J. Donlin, J.K. Lodge, Chitosan, the Deacetylated Form of Chitin, Is Necessary for Cell Wall Integrity in Cryptococcus neoformans, Eukaryot. Cell. 6 (2007) 855-867.
[17] G. Han, X. Li, T. Zhang, X. Zhu, J. Li, Cloning and Tissue-Specific Expression of a Chitin Deacetylase Gene from Helicoverpa armigera (Lepidoptera: Noctuidae) and Its Response to Bacillus thuringiensis, J. Insect Sci. 15 (2015) 95.
[18] V. Lombard, H.G. Ramulu, E. Drula, P.M. Coutinho, B. Henrissat, The Carbohydrate-active enzymes database (CAZy) in 2013, Nucleic Acids Res. 42 (2014) D490-D495.
[19] F. Caufrier, A. Martinou, C. Dupont, V. Bouriotis, Carbohydrate esterase family 4 enzymes: substrate specificity, Carbohydr. Res. 338 (2003) 687-92.
[20] D.E. Blair, D.M. van Aalten, Structures of Bacillus subtilis PdaA, a family 4 carbohydrate esterase, and a complex with N-acetyl-glucosamine, FEBS Lett. 570 (2004) 13-9.
[21] D.E. Blair, A.W. Schüttelkopf, J.I. MacRae, D.M. van Aalten, Structure and metal-dependent mechanism of peptidoglycan deacetylase, a streptococcal virulence factor, Proc. Natl. Acad. Sci. U.S.A. 102 (2005) 15429-34.
[22] Y.J. Kim, Y. Zhao, K.T. Oh, V.N. Nguyen, R.D. Park, Enzymatic deacetylation of chitin by extracellular chitin deacetylase from a newly screened Mortierella sp. DY-52, J. Microbiol. Biotechnol. 18 (2008) 759-766.
[23] K. Tokuyasu, M. Ohnishi-Kameyama, K. Hayashi, Purification and characterization of extracellular chitin deacetylase from Colletotrichum lindemuthianum, Biosci. Biotech. Biochem. 60 (1996) 1598-1603.
[24] M. Yamada, M. Kurano, S. Inatomi, G. Taguchi, M. Okazaki, M. Shimosaka, Isolation and characterization of a gene coding for chitin deacetylase specifically expressed during fruiting body development in the basidiomycete Flammulina velutipes and its expression in the yeast Pichia pastoris, FEMS Microbiol. Lett. 298 (2008) 130-137.
[25] N.K. Vyas, Atomic features of protein-carbohydrate interactions, Curr. Opin. Struc. Biol. 1 (1991) 732-740.
[26] V. Ter-Hovhannisyan, A. Lomsadze, Y.O. Chernoff, M. Borodovsky, Gene prediction in novel fungal genomes using an ab initio algorithm with unsupervised training, Genome Res. 18 (2008) 1979-1990.
[27] J.D. Thompson, D.G. Higgins, T.J. Gibson, CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice, Nucleic Acids Res. 22 (1994) 4673-4680.
[28] M.A. Larkin, G. Blackshields, N.P. Brown, R. Chenna, P.A. McGettigan, H. McWilliam, F. Valentine, I.M. Wallace, A. Wilm, R. Lopez, J.D. Thompson, T.J. Gibson, D.G. Higgins, Clustal W and Clustal X version 2.0, Bioinformatics. 23 (2007) 2947-2948.
[29] M. Gouy, S. Guindon, O. Gascuel, SeaView Version 4: A Multiplatform Graphical User Interface for Sequence Alignment and Phylogenetic Tree Building, Mol. Biol. Evol. 27 (2010) 221-224.
[30] J. Felsenstein, Using the quantitative genetic threshold model for inferences between and within species, Philos. Trans. R. Soc. Lond. B. Biol. Sci. 360 (2005) 1427-1434.
[31] R.D. Finn, A. Bateman, J. Clements, P. Coggill, R.Y. Eberhardt, S.R. Eddy, A. Heger, K. Hetherington, L. Holm, J. Mistry, E.L.L. Sonnhammer, J. Tate, M. Punta, The Pfam protein families database, Nucleic Acids Res. 42 (2014) D222-D230.
[32] I. Letunic, T. Doerks, P. Bork, SMART: recent updates, new developments and status in 2015, Nucleic Acids Res. 43 (2015) D257-D260.
[33] C. Combet, C. Blanchet, C. Geourjon, G. Deléage, NPS@: Network Protein Sequence Analysis, Trends Biochem. Sci. 25 (2000) 147-150.
[34] A. Ceroni, A. Passerini, A. Vullo, P. Frasconi, DISULFIND: a Disulfide Bonding State and Cysteine Connectivity Prediction Server, Nucleic Acids Res. 34 (2006) W177-W181.
[35] T.N. Petersen, S. Brunak, G. von Heijne, H. Nielsen, SignalP 4.0: discriminating signal peptides from transmembrane regions, Nat. Methods 8 (2011) 785-786.
[36] R. Gupta, E. Jung, S. Brunak, Prediction of N-glycosylation sites in human proteins, In preparation. (2004).
[37] C. Steentoft, S.Y. Vakhrushev, H.J. Joshi, Y. Kong, M.B. Vester-Christensen, K.T. Schjoldager, K. Lavrsen, S. Dabelsteen, N.B. Pedersen, L. Marcos-Silva, R. Gupta, E.P. Bennett, U. Mandel, S. Brunak, H.H. Wandall, S.B. Levery, H. Clausen, Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology, EMBO J. 32 (2013) 1478-1488.
[38] A. Pierleoni, P.L. Martelli, R. Casadio, PredGPI: a GPI anchor predictor, BMC Bioinformatics. 9 (2008) 392.
[39] K. Hofmann, W. Stoffel, TMbase - A database of membrane spanning protein segments, Biol. Chem. Hoppe-Seyler. 374 (1993) 166.
[40] C. Combet, M. Jambon, G. Deléage, C. Geourjon, Geno3D: automatic comparative molecular modelling of protein, Bioinformatics. 18 (2002) 213-214.
[41] L.A. Kelley, S. Mezulis, C.M. Yates, M.N. Wass, M.J.E. Sternberg, The Phyre2 web portal for protein modeling, prediction and analysis, Nat. Protoc. 10 (2015) 845-858.
[42] C.C. Chen, J.K. Hwang, J.M. Yang, (PS)2: protein structure prediction server, Nucleic Acids Res. 34 (2006) W152-W157.
[43] C.C. Chen, J.K. Hwang, J.M. Yang, (PS)2-v2: template-based protein structure prediction server, BMC Bioinformatics. 10 (2009) 366.
[44] B. Webb, A. Sali, Comparative Protein Structure Modeling Using Modeller, Curr. Protoc. Bioinformatics. 47 (2014) 5.6.1-5.6.32.
[45] M.A. Marti-Renom, A. Stuart, A. Fiser, R. Sánchez, F. Melo, A. Sali, Comparative protein structure modeling of genes and genomes, Annu. Rev. Biophys. Biomol. Struct. 29 (2000) 291-325.
[46] X. Robert, P. Gouet, Deciphering key features in protein structures with the new ENDscript server, Nucleic Acids Res. 42 (2014) W320-W324.
[47] S.C. Lovell, I.W. Davis, W.B. Arendall III, P.I.W. de Bakker, J.M. Word, M.G. Prisant, J.S. Richardson, D.C. Richardson, Structure validation by Calpha geometry: phi, psi and Cbeta deviation, Proteins. 50 (2002) 437-450.
[48] R.A. Laskowski, M.W. MacArthur, D.S. Moss, J.M. Thornton, PROCHECK: a program to check the stereochemical quality of protein structures, J. Appl. Cryst. 26 (1993) 283-291.
[49] C. Colovos, T.O. Yeates, Verification of protein structures: patterns of nonbonded atomic interactions, Protein Sci. 2 (1993) 1511-1519.
[50] P. Benkert, S.C.E. Tosatto, D. Schomburg, QMEAN: A comprehensive scoring function for model quality assessment, Proteins. 71 (2008) 261-277.
[51] P. Benkert, M. Biasini, T. Schwede, Toward the estimation of the absolute quality of individual protein structure models, Bioinformatics. 27 (2011) 343-350.
[52] P. Benkert, T. Schwede, S.C.E. Tosatto, QMEANclust: Estimation of protein model quality by combining a composite scoring function with structural density information, BMC Struct. Biol. 9 (2009) 35.
[53] P. Benkert, M. Künzli, T. Schwede, QMEAN Server for Protein Model Quality Estimation, Nucleic Acids Res. 37 (Web Server issue) (2009) W510-4.
[54] E.F. Pettersen, T.D. Goddard, C.C. Huang, G.S. Couch, D.M. Greenblatt, E.C. Meng, T.E. Ferrin, UCSF Chimera--a visualization system for exploratory research and analysis, J. Comput. Chem. 25 (2004) 1605-12.
[55] D.A. Case, T.A. Darden, T.E. Cheatham III, C.L. Simmerling, J. Wang, R.E. Duke, R. Luo, R.C. Walker, W. Zhang, K.M. Merz, B. Roberts, S. Hayik, A. Roitberg, G. Seabra, J. Swails, A.W. Götz, I. Kolossváry, K.F. Wong, F. Paesani, J. Vanicek, R.M. Wolf, J. Liu, X. Wu, S.R. Brozell, T. Steinbrecher, H. Gohlke, Q. Cai, X. Ye, J. Wang, M.J. Hsieh, G. Cui, D.R. Roe, D.H. Mathews, M.G. Seetin, R. Salomon-Ferrer, C. Sagui, V. Babin, T. Luchko, S. Gusarov, A. Kovalenko, P.A. Kollman, AMBER 12, University of California, San Francisco. (2012).
[56] D.A. Pearlman, D.A. Case, J.W. Caldwell, W.S. Ross, T.E. Cheatham III, S. De Bolt, D. Ferguson, G. Seibel, P. Kollman, AMBER, a package of computer programs for applying molecular mechanics, normal mode analysis, molecular dynamics and free energy calculations to simulate the structural and energetic properties of molecules, Comp. Phys. Commun. 91 (1995) 1-41.
[57] D.A. Case, T. Cheatham, T. Darden, H. Gohlke, R. Luo, K.M. Merz Jr., A. Onufriev, C. Simmerling, B. Wang, R. Woods, The Amber biomolecular simulation programs, J. Comput. Chem. 26 (2005) 1668-1688.
[58] R. Salomon-Ferrer, D.A. Case, R.C. Walker, An overview of the Amber Biomolecular Simulation Package, WIREs Comput. Mol. Sci. In press. (2012).
[59] W. Humphrey, A. Dalke, K. Schulten, VMD - Visual Molecular Dynamics, J. Molec. Graphics. 14 (1996) 33-38.
[60] G.M. Morris, R. Huey, W. Lindstrom, M.F. Sanner, R.K. Belew, D.S. Goodsell, A.J. Olson, Autodock4 and AutoDockTools4: automated docking with selective receptor flexibility, J. Comput. Chem. 30 (2009) 2785-91.
[61] M.J. Frisch, G.W. Trucks, H.B. Schlegel, G.E. Scuseria, M.A. Robb, J.R. Cheeseman, G. Scalmani, V. Barone, B. Mennucci, G.A. Petersson, H. Nakatsuji, M. Caricato, X. Li, H.P. Hratchian, A.F. Izmaylov, J. Bloino, G. Zheng, J.L. Sonnenberg, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T. Vreven, J.A. Montgomery, Jr., J.E. Peralta, F. Ogliaro, M. Bearpark, J.J. Heyd, E. Brothers, K.N. Kudin, V.N. Staroverov, R. Kobayashi, J. Normand, K. Raghavachari, A. Rendell, J.C. Burant, S.S. Iyengar, J. Tomasi, M. Cossi, N. Rega, J.M. Millam, M. Klene, J.E. Knox, J.B. Cross, V. Bakken, C. Adamo, J. Jaramillo, R. Gomperts, R.E. Stratmann, O. Yazyev, A.J. Austin, R. Cammi, C. Pomelli, J.W. Ochterski, R.L. Martin, K. Morokuma, V.G. Zakrzewski, G.A. Voth, P. Salvador, J.J. Dannenberg, S. Dapprich, A.D. Daniels, Ö. Farkas, J.B. Foresman, J.V. Ortiz, J. Cioslowski, D.J. Fox, Gaussian 09, Revision E.01, Gaussian, Inc., Wallingford CT. (2009).
[62] N. Guex, M.C. Peitsch, SWISS-MODEL and the Swiss-PdbViewer: An environment for comparative protein modeling, Electrophoresis. 18 (1997) 2714-2723.
[63] B.R. Miller, T.D. McGee, J.M. Swails, N. Homeyer, H. Gohlke, A.E. Roitberg, MMPBSA.py: An Efficient Program for End-State Free Energy Calculations, J. Chem. Theory Comput. 8 (2012) 3314-3321.
[64] A. Ikai, Thermostability and aliphatic index of globular proteins, J. Biochem. 88 (1980) 1895-1898.
[65] M.Y. Shen, A. Sali, Statistical potential for assessment and prediction of protein structures, Protein Sci. 15 (2006) 2507-2524.
[66] M. Wiederstein, M.J. Sippl, ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins, Nucleic Acids Res. 35 (2007) W407-W410.
[67] M.J. Sippl, Recognition of Errors in Three-Dimensional Structures of Proteins, Proteins. 17 (1993) 355-362.
[68] J.U. Bowie, R. Lüthy, D. Eisenberg, A method to identify protein sequences that fold into a known three-dimensional structure, Science. 253 (1991) 164-70.
[69] R. Lüthy, J.U. Bowie, D. Eisenberg, Assessment of protein models with three-dimensional profiles, Nature. 356 (1992) 83-5.
Fig. 1: Phylogenetic tree of CDA from Cryptococcus laurentii strain RY1.
Fig. 2: The full-length protein sequence of CDA deduced from the gene sequence. The predicted signal peptide is underlined. The NodB homology domain is shown in bold type. The GPI anchor site is shown in italic type and the omega site is indicated in grey. N-glycosylation sites are shown in brown.
Fig. 3: The alignment of the catalytic domain of Cryptococcus laurentii strain RY1 CDA (cdaclry1) with that of Aspergillus nidulans CDA (cdanidulans) and Colletotrichum lindemuthianum CDA (cdalindemuthianum). Five CE-4 motifs were identified and marked as MT (1-5). Residues highlighted in red indicate conserved residues in CDA across the three organisms.
Fig. 4: Model Validation. Fig. 4a: Ramachandran Plot of CDA as analyzed by RAMPAGE. Fig. 4b: Z-scores of the CDA model as obtained from QMEAN. Fig. 4c: Ramachandran Plot of refined CDA model as analyzed by RAMPAGE. Fig. 4d: ERRAT Analysis of refined CDA model.
Fig. 5: Simulation behavior of CDA. Fig. 5a: Root Mean Square Deviation. Fig. 5b: Radius of Gyration. Fig. 5c: Total Energy Analysis. Fig. 5d: B Factor.
Fig. 6: Docked structure of CDA with chitin. Fig. 6a: Active site of CDA. The chitin molecule is shown in Bond representation and the interacting residues are shown in CPK representation. Fig. 6b: The entire CDA model with the docked chitin molecule.
Fig. 7: Simulation behavior of CDA Complex with chitin. Fig. 7a: Root Mean Square Deviation. Fig. 7b: B Factor. Fig. 7c: Radius of Gyration. Fig. 7d: Total Energy Analysis.
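For readers less familiar with the quantities plotted in Figs. 5 and 7, the conventional definitions of the root mean square deviation and the radius of gyration are sketched below. This is the standard textbook form, not a statement of the exact protocol used here: the atom selection (for example backbone or C-alpha atoms), the reference structure and the superposition step are assumptions, since the captions do not specify them.

```latex
% Assumed notation: N atoms in the selection, r_i(t) the position of atom i at time t,
% r_i^{ref} its position in the reference structure (after optimal superposition),
% m_i its mass, and r_cm(t) the centre of mass of the selection.
\[
\mathrm{RMSD}(t) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}
  \left\lVert \mathbf{r}_i(t) - \mathbf{r}_i^{\mathrm{ref}} \right\rVert^{2}}
\]
\[
R_g(t) = \sqrt{\frac{\sum_{i=1}^{N} m_i
  \left\lVert \mathbf{r}_i(t) - \mathbf{r}_{\mathrm{cm}}(t) \right\rVert^{2}}
  {\sum_{i=1}^{N} m_i}},
\qquad
\mathbf{r}_{\mathrm{cm}}(t) = \frac{\sum_{i=1}^{N} m_i\,\mathbf{r}_i(t)}{\sum_{i=1}^{N} m_i}
\]
```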
Table 1: Ramachandran plot quality of the models generated by MODELLER, (PS)2-v2: Protein Structure Prediction Server, PHYRE 2 and Geno3D.
Model     Program                                          Ramachandran plot quality (%)
                                                           Favored    Allowed    Disallowed
Model 1   MODELLER                                         93.4       4.4        2.2
Model 2   (PS)2-v2: Protein Structure Prediction Server    93.0       4.8        2.2
Model 3   PHYRE 2                                          88.1       8.8        3.1
Model 4   Geno3D                                           81.9       15.4       2.7
Table 2: Energy contributions of some key residues interacting with chitin.
Residue   Energy (kcal/mol)
Y20       0
T43       0.0091
F44       0.007
H71       -0.02
R109      0.401
I131      0.004
W151      -2.7
I183      0.014
V184      0.0581
L185      0.0545
Q186      0.0055
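The per-residue values in Table 2 can be ranked to see which residues dominate the interaction with chitin. The short Python sketch below is only illustrative: the values are transcribed from the table, the function name `rank_by_contribution` is hypothetical, and treating a negative value as a favorable contribution is an assumption based on the usual sign convention of end-state free energy decompositions.

```python
# Per-residue energy contributions (kcal/mol), transcribed from Table 2.
contributions = {
    "Y20": 0.0, "T43": 0.0091, "F44": 0.007, "H71": -0.02,
    "R109": 0.401, "I131": 0.004, "W151": -2.7, "I183": 0.014,
    "V184": 0.0581, "L185": 0.0545, "Q186": 0.0055,
}


def rank_by_contribution(values):
    """Return (residue, energy) pairs sorted from most negative to most positive."""
    return sorted(values.items(), key=lambda item: item[1])


ranked = rank_by_contribution(contributions)

# Assumption: negative energies are favorable (stabilizing) contributions.
favorable = [name for name, energy in ranked if energy < 0]

for name, energy in ranked:
    print(f"{name:>5} {energy:8.4f}")
```

Under that assumed convention, W151 stands out as the dominant favorable contributor, with H71 a distant second, while R109 carries the largest positive value.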