Substrate-binding specificity of chitinase and chitosanase as revealed by active-site architecture analysis

Accepted Manuscript

Title: Substrate-binding specificity of chitinase and chitosanase as revealed by active-site architecture analysis
Author: Shijia Liu, Shangjin Shao, Linlin Li, Zhi Cheng, Li Tian, Peiji Gao, Lushan Wang
PII: S0008-6215(15)00305-5
DOI: http://dx.doi.org/doi:10.1016/j.carres.2015.10.002
Reference: CAR 7076
To appear in: Carbohydrate Research
Received date: 10-7-2015
Revised date: 3-10-2015
Accepted date: 6-10-2015

Please cite this article as: Shijia Liu, Shangjin Shao, Linlin Li, Zhi Cheng, Li Tian, Peiji Gao, Lushan Wang, Substrate-binding specificity of chitinase and chitosanase as revealed by active-site architecture analysis, Carbohydrate Research (2015), http://dx.doi.org/doi:10.1016/j.carres.2015.10.002. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Substrate-binding specificity of chitinase and chitosanase as revealed by active-site architecture analysis†

Shijia Liu (a), Shangjin Shao (a), Linlin Li (a), Zhi Cheng (a), Li Tian (b), Peiji Gao (b), Lushan Wang (b,*)

(a) Taishan College, Shandong University, Jinan 250100, P. R. China.
(b) The State Key Laboratory of Microbial Technology, Shandong University, Jinan 250100, P. R. China.

* Corresponding author. Tel: +86-131-5615-5829; fax: +86-531-8836-6202. E-mail address: [email protected] (L. Wang).

† Supplementary data is enclosed.

Highlights

• The sequence profiles of the chitosanase and chitinase active sites are constructed.
• Substrate recognition is supported by hydrogen bonds with C2 functional groups.
• CH-π interactions contribute to tighter binding and processivity.

Graphical Abstract

Abstract

Chitinases and chitosanases, collectively referred to as chitinolytic enzymes, are two important categories of glycoside hydrolases (GH) that play key roles in degrading chitin and chitosan, two naturally abundant polysaccharides. Here, we investigate the active-site architecture of the major chitosanase (GH8, GH46) and chitinase (GH18, GH19) families. Both charged (Glu, His, Arg, Asp) and aromatic (Tyr, Trp, Phe) amino acids are observed with higher frequency within chitinolytic active sites than elsewhere in the enzyme structure, indicating significant roles in enzyme function. Hydrogen bonds between chitinolytic enzymes and the substrate C2 functional groups, i.e. amino groups and N-acetyl groups, drive substrate recognition, while non-specific CH-π interactions between aromatic residues and the substrate mainly contribute to the tighter binding and enhanced processivity evident in GH8 and GH18 enzymes. The number, type, and position of the substrate atoms bound in the active site vary among the families of chitinolytic enzymes, resulting in different substrate-binding specificities. The data presented here explain the synergistic action of multiple enzyme families at the molecular level and provide a more reasonable method for functional annotation, which can be further applied to the practical engineering of chitinases and chitosanases.

Keywords: Chitinase; Chitosanase; Active-site architecture; Substrate-binding specificity.

1. Introduction

Advances in high-throughput sequencing have exponentially increased the number of available sequences; however, few of these sequences have been functionally annotated, making efficient functional annotation an urgent task. To classify and systematize sequence information, protein databases such as the CAZy database (http://www.cazy.org/) [1,2] group enzymes into families according to sequence similarity.

Chitin is a polymer of N-acetylglucosamine (GlcNAc) linked by β-1,4 glycosidic bonds. Both chitin and its partially deacetylated, water-soluble form, chitosan [3], are highly abundant and widely distributed in nature [4-6]. Hydrolyzed chitin and chitosan products and related materials are used in multiple fields, including agriculture [7], medicine [8], and industry [9,10]. Chitinolytic enzymes (chitinases and chitosanases) efficiently hydrolyze chitin and chitosan: chitinases (EC 3.2.1.14) mainly cleave GlcNAc-GlcNAc and GlcNAc-GlcN bonds [11] (GlcN, glucosamine), whereas chitosanases (EC 3.2.1.132) break GlcNAc-GlcN, GlcN-GlcNAc, and GlcN-GlcN bonds [12]. The catalytic mechanisms of glycosidic-bond hydrolysis have been intensively studied; however, improving enzymatic efficiency calls for further research into substrate-binding specificity.

In glycoside hydrolases, the active-site architecture is the part of the enzyme that interacts directly with the glycoside substrate and carries out substrate recognition and glycosidic-bond cleavage [13]. It is shared by all members of a protein family, which have over 30% sequence identity [1,2]. The active-site architecture constitutes approximately 2-3% of the total enzyme volume and is considerably influenced by substrate length: an enzyme with a longer substrate has more interacting residues and thus a larger active site [14].

Using structural bioinformatics and statistical analysis, we are able to reveal the roles of key active-site residues involved in substrate binding. Similar approaches have been applied to other glycoside hydrolases (GH), including the identification of spatial-position conservation of key active-site amino acids in GH13 enzymes [15-17] and the exploration of structure-function relationships in GH11 xylanases [18]. Here, we analyze specific interactions between chitinolytic enzymes and their substrates using structure-guided bioinformatics analysis to reveal distinct interactions among different families and to propose practical guidance for protein engineering. Through similar types of interaction, such as hydrogen bonds and CH-π interactions, different families recognize different areas of the substrate, particularly the C2 functional group. The findings presented here provide insight into how different enzyme families are able to perform similar functions and offer guidance for increasing the accuracy of functional annotation.

2. Experimental

2.1 Data selection

Chitinase and chitosanase family information was obtained from the CAZy database [1,2]. To make the alignments more accurate and convincing, families were selected according to the following criteria: a family selected for sequence alignment must contain over 20 available sequences; a family selected for structure alignment must contain over 10 available PDB structures, with an RMSD between members not exceeding 3 Å; and any structure or sequence containing known mutations must be corrected before alignment.

Apart from the two major chitosanase families, GH8 and GH46, the GH3, GH5, GH7, GH75, and GH80 families also include members with chitosanase activity. However, among these five families only GH7 has a solved structure (one), and only 1, 2, 3, 16, and 4 characterized sequences, respectively, were available as of October 2015, so these families were omitted from further analysis. GH8 and GH46 had 1 and 5 known structures, respectively, which is not adequate for a reliable structural alignment, whereas 23 sequences were available for each family as of October 2015; thus, only sequence alignments were performed for the GH8 and GH46 families. For chitinases, the two major families, GH18 and GH19, contained 411 sequences and 42 structures and 173 sequences and 13 structures, respectively, as of October 2015 (Table S1). Both met the above criteria and were therefore subjected to both sequence and structure alignment. GH23, another family containing chitinases, was not chosen because it had only one characterized member.

All sequences downloaded from NCBI [19] and all structures obtained from the Protein Data Bank (PDB) [20] with the EC number of the specific family were used as the research sample. One PDB structure with a full-length bound ligand was chosen as the alignment template. If no ligand-bound structure was available within a family, ligands from other enzyme structures within that family were inserted into the reference structure by molecular docking. The templates used for GH8, GH46, GH18, and GH19 were ChoK [21] (PDB: 1V5D) from Bacillus sp. strain K17 (docked with the ligand from PDB: 1KWF [22]), OU01 [23] (PDB: 4OLT) from Microbacterium sp., ChiA [24] (PDB: 1EHN) from Serratia marcescens, and BcChi-A [25] (PDB: 3WH1) from Bryum coronatum, respectively.
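The selection criteria in Section 2.1 can be illustrated with a minimal Python sketch. It is not the authors' software: the function names are hypothetical, the RMSD and mutation-correction criteria are left out, and the GH7 sequence count assumes that the counts listed for GH3, GH5, GH7, GH75, and GH80 are given in that order.

```python
# Counts as reported in Section 2.1 (October 2015).
# The GH7 sequence count (3) assumes the listed counts follow the family order.
FAMILIES = {
    "GH8": {"sequences": 23, "structures": 1},
    "GH46": {"sequences": 23, "structures": 5},
    "GH18": {"sequences": 411, "structures": 42},
    "GH19": {"sequences": 173, "structures": 13},
    "GH7": {"sequences": 3, "structures": 1},
}

MIN_SEQUENCES = 20   # "over 20 available sequences"
MIN_STRUCTURES = 10  # "over 10 available PDB structures"


def classify_family(sequences, structures):
    """Decide which alignments a family qualifies for (RMSD check omitted)."""
    if sequences <= MIN_SEQUENCES:
        return "excluded"
    if structures > MIN_STRUCTURES:
        return "sequence+structure"
    return "sequence only"


def classify_all(families):
    """Apply classify_family to every family in a {name: counts} mapping."""
    return {
        name: classify_family(counts["sequences"], counts["structures"])
        for name, counts in families.items()
    }


for family, decision in classify_all(FAMILIES).items():
    print(family, decision)
```

Under these thresholds the sketch reproduces the choices described in the text: sequence alignment alone for GH8 and GH46, and both alignments for GH18 and GH19.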

2.2 Statistical analysis

InterPro [26] was used to detect and delete any non-catalytic domains. The selected model PDB structure was opened in PyMOL (The PyMOL Molecular Graphics System, Version 1.7.4, Schrödinger, LLC); waters were removed and ligands were displayed. The substrate was oriented with the non-reducing end on the left and the reducing end on the right. Subsite numbers increased from the non-reducing end to the reducing end, with the cleavage site in the middle. Residues located within 5 Å of the substrate were identified, assigned to the subsite with which they displayed the highest likelihood of binding, and recorded in the order of the corresponding subsites. These residues make up the active site as defined in the Introduction.

Multiple-sequence alignment and structure alignment were performed using ClustalW (gap open = 10.0, gap extend = 0.5) [27,28] and the "Stamp Structural Alignment" tool of VMD [29], respectively. The columns containing the active-site amino acids were extracted from the alignment results and listed under the corresponding substrate subsites. Both sequence and structure alignments were performed for GH18 and GH19. Although fewer structures than sequences were available, the results from structure alignment contained more information and better reflected the characteristics of each family. Therefore, only the structure-alignment results are analyzed in detail in the text, while the sequence-alignment results are shown in Fig. S2.

Alignment results were used to create an active-site architecture sequence profile using WebLogo [30,31], showing the residues characterized within the different subsites. To assess sequence-profile accuracy, the score of each column was computed using WebLogo and compared with the conservation scores obtained from the ConSurf Server [32-34] and Jalview [35]. Finally, conserved residues were chosen for further analysis.
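To make the column score concrete, the following Python sketch computes the information content (in bits) that underlies a sequence logo, using the Shannon-entropy definition for a 20-letter alphabet. It is a simplified stand-in rather than WebLogo itself: the small-sample correction is omitted, gaps are simply ignored, and the function names and toy alignment are assumptions made for illustration.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content of one alignment column in bits.

    Computed as log2(alphabet_size) minus the Shannon entropy of the
    observed residues. Gap characters ('-') are ignored and no
    small-sample correction is applied.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def profile_information(aligned_sequences):
    """Score every column of equal-length aligned sequences."""
    return [column_information(col) for col in zip(*aligned_sequences)]


# Toy active-site columns (hypothetical residues, not taken from the paper).
toy_alignment = ["DEW", "DEY", "DQW", "DQF"]
print([round(score, 2) for score in profile_information(toy_alignment)])
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits, and the score falls as the residues in a column become more varied.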

The processes involving data selection, active-site acquisition, multiple-sequence alignment, and WebLogo creation are implemented in software available at https://github.com/Stephen8554/MyUsefulTool.
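The active-site acquisition step of Section 2.2 (residues within 5 Å of the substrate, grouped by subsite) can be sketched in plain Python as follows. This is not the code from the repository above. Assigning a residue to the subsite of its nearest ligand atom is an assumption standing in for "highest likelihood of binding", and the residue labels and coordinates are invented for illustration.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=5.0):
    """Group residues lying within `cutoff` angstroms of the ligand by subsite.

    protein_atoms: iterable of (residue_id, (x, y, z)), one entry per atom.
    ligand_atoms:  iterable of (subsite, (x, y, z)), one entry per atom.
    Each residue is assigned to the subsite of its nearest ligand atom,
    which is an assumption made for this sketch.
    """
    ligand_atoms = list(ligand_atoms)
    nearest = {}  # residue_id -> (distance, subsite)
    for residue_id, atom_xyz in protein_atoms:
        for subsite, ligand_xyz in ligand_atoms:
            distance = math.dist(atom_xyz, ligand_xyz)
            if distance > cutoff:
                continue
            if residue_id not in nearest or distance < nearest[residue_id][0]:
                nearest[residue_id] = (distance, subsite)
    by_subsite = {}
    for residue_id, (_, subsite) in nearest.items():
        by_subsite.setdefault(subsite, []).append(residue_id)
    return {subsite: sorted(ids) for subsite, ids in sorted(by_subsite.items())}


# Hypothetical coordinates: two sugar rings about 5.2 angstroms apart.
ligand = [(-1, (0.0, 0.0, 0.0)), (1, (5.2, 0.0, 0.0))]
protein = [
    ("res1", (0.0, 3.0, 0.0)),
    ("res2", (5.2, 2.0, 0.0)),
    ("res3", (0.0, 20.0, 0.0)),
    ("res4", (2.0, 1.0, 0.0)),
]
print(residues_near_ligand(protein, ligand))
```

In practice the atom lists would come from a parsed PDB file with waters removed, as described in the text.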

3. Results and Discussion

3.1 Amino acid preference in the active site architecture

The frequencies of the 20 standard amino acids in a general protein (calculated from the NCBI protein database [19]), in the selected chitinolytic enzymes, and in chitinolytic enzyme active sites were calculated using BioEdit (Fig. 1A). The relative fold change of residue frequency within chitinolytic enzyme active sites compared with the holoenzyme was then calculated (Fig. 1B). The residue frequency distribution of a general protein was similar to that of the chitinolytic enzymes (r > 0.75), whereas the residue frequencies of the chitinolytic enzyme active sites were much less similar (r < 0.35). These data indicate that overall residue frequency differs little between chitinolytic enzymes and other proteins, while active-site composition differs significantly. This implies that the amino acids found most frequently within the active site are likely involved in catalysis.

The residues Tyr, Glu, Trp, His, Arg, Asp, and Phe occurred with increased frequency within the chitinolytic enzyme active site, whereas nonpolar amino acids such as Val, Leu, Pro, and Ile occurred with lower frequency than elsewhere in the enzyme. Protein-ligand interactions primarily involve residues located on the protein surface, which may explain the lower occurrence of nonpolar residues within the active site [36]. However, the hydrophobic aromatic residues Trp, Phe, and Tyr occurred with high frequency within the active site, possibly signifying important roles in enzyme function.
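The frequency comparison in Section 3.1 can be sketched in Python as follows. This substitutes plain Python for BioEdit. Treating the fold change as a simple ratio of active-site to whole-enzyme frequency is an assumption, since the exact definition is not given in the text, and the sequences are toy strings, not data from the study.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_frequencies(sequences):
    """Relative frequency of each standard amino acid over all sequences."""
    counts = Counter(r for seq in sequences for r in seq if r in AMINO_ACIDS)
    total = sum(counts.values())
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def fold_change(active_site_freq, whole_enzyme_freq):
    """Ratio of active-site to whole-enzyme frequency (assumed definition)."""
    return {
        aa: active_site_freq[aa] / whole_enzyme_freq[aa]
        for aa in AMINO_ACIDS
        if whole_enzyme_freq[aa] > 0
    }


def pearson_r(freq_a, freq_b):
    """Pearson correlation between two amino-acid frequency distributions."""
    xs = [freq_a[aa] for aa in AMINO_ACIDS]
    ys = [freq_b[aa] for aa in AMINO_ACIDS]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant distribution")
    return cov / math.sqrt(var_x * var_y)


# Toy sequences for illustration only.
general = residue_frequencies(["AAAALLLLGGSSEEKKDV"])
enzymes = residue_frequencies(["AAALLLLGGGSSEEKDDVW"])
active_sites = residue_frequencies(["WWYYEEDDRH"])
print(round(pearson_r(general, enzymes), 2), round(pearson_r(general, active_sites), 2))
```

A fold change above 1 marks a residue enriched in the active site relative to the whole enzyme, which is the pattern the text reports for the charged and aromatic residues.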

3.2 The number of residues bound to substrate subsites

Members of a protein family exhibit over 30% sequence identity and therefore share similar 3D backbone structures. The overall backbone of a protein determines the length of its catalytic cleft, which in turn determines the maximum number of sugar rings that it can bind, since the length of a pyranose ring is approximately 5.2 Å [37-39]. During evolution, different subsites are conserved to different degrees, and subsites far from the cleavage site tend to be less conserved than those near it (Fig. 2). Although members of an enzyme family have varying sets of subsites, each family has a potential maximum set of subsites, and this maximum number can be inferred from the models that contain the longest substrate. The selected examples can therefore represent the whole enzyme family to a large extent, because they contain the longest substrates.

Following the procedures described in Section 2.2, the residues in the model structure of each selected chitinolytic enzyme family that interact with substrate subsites were determined. Calculating the average number of residues bound to each subsite in the different families, and comparing families with different maximum sets of subsites, allows several conclusions to be drawn. Families with fewer substrate subsites displayed larger average numbers of residues bound to each subsite, increasing overall enzyme:substrate binding interactions (Table 1). The GH19 family has four substrate subsites with an average of 6.5 residues bound to each, while the GH18 family with eight subsites has an average of 6.5 residues bound to each. The substrates associated with the GH8 and GH46 families both comprised six sugar rings, yet GH46 displayed more residues bound to subsites. This may be a consequence of space restrictions within the GH46 structure, resulting in a larger interaction surface that promotes additional substrate-binding opportunities.

The smallest structural unit necessary for binding by each GH family was also determined (Table 1). GH18 and GH19 chitinases exhibited residues bound to the -2 and +2 substrate subsites, indicating their importance to the catalytic mechanism and highlighting a tetrasaccharide as the smallest structural unit necessary for enzyme:substrate binding. Similarly, GH8 enzymes displayed residues bound to the -2, -1, and +2 substrate subsites, while GH46 enzyme residues bound the -3 and +1 substrate subsites, indicating that the smallest structural unit was a tetrasaccharide spanning subsites (-2) to (+2) or (-3) to (+1), respectively. GH8 and GH46 enzymes displayed few residues bound to the +3 substrate subsite, suggesting weaker substrate binding and possible involvement in product release. Variations in the number of residues bound to each subsite were greater in GH46 enzymes than in GH8 enzymes, which may be a consequence of differences in substrate composition. Unlike GH8 enzymes, GH46 enzymes recognize protruding acetyl groups at the -1 and +1 substrate subsites. Additionally, GH18 chitinases exhibited a particularly long binding cleft, which may enhance exoenzyme processivity [24,40]. Accordingly, they were also observed to bind more substrate subsites than the other chitinolytic enzyme families.
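The two quantities discussed in Section 3.2 can be worked through with a short Python sketch. Dividing the cleft length by the 5.2 Å ring length and rounding down is an assumed simplification of how cleft length limits the number of bound sugar rings, and the cleft lengths and residue labels below are hypothetical, not values from Table 1.

```python
import math

PYRANOSE_RING_LENGTH = 5.2  # angstroms, the approximate value quoted in the text


def max_sugar_rings(cleft_length):
    """Upper bound on sugar rings that fit in a cleft of the given length.

    Floor division by the ring length is an assumption made for this sketch.
    """
    return math.floor(cleft_length / PYRANOSE_RING_LENGTH)


def mean_residues_per_subsite(residues_by_subsite):
    """Average number of residues bound per subsite for one model structure."""
    total = sum(len(residues) for residues in residues_by_subsite.values())
    return total / len(residues_by_subsite)


# Hypothetical four-subsite model with invented residue labels.
model = {
    -2: ["r1", "r2", "r3", "r4", "r5", "r6", "r7"],
    -1: ["r8", "r9", "r10", "r11", "r12", "r13"],
    1: ["r14", "r15", "r16", "r17", "r18", "r19", "r20"],
    2: ["r21", "r22", "r23", "r24", "r25", "r26"],
}
print(max_sugar_rings(21.0), mean_residues_per_subsite(model))
```

Comparing this average across families with different numbers of subsites is the calculation summarized in Table 1.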

3.3 Analysis of chitinolytic enzyme active-site architecture

GH8, GH19, and GH46 enzymes employ an inverting catalytic mechanism [21,23,25], while GH18 chitinases use a substrate-assisted retaining mechanism [41]. Sequence alignment of the GH8 and GH46 chitosanases and structure alignment of the GH18 and GH19 chitinases were performed, and the resulting sequence profiles are shown in Fig. 2. The overlap ratios of the top 30% of conserved residues verified sequence-profile accuracy, with the majority above 80% (Supplementary File S1).

The conserved residues within chitinolytic enzyme active sites participate in multiple enzymatic processes, including substrate recognition, catalysis, and intermediate stabilization through electrostatic and hydrophobic interactions. The sequence profiles (Fig. 2) display several common features in agreement with these roles. Two acidic residues, Asp and Glu, are located at the cleavage site in the sequence profile and constitute the catalytic residues. At the -1 or +1 substrate subsite, several residues augment catalysis, including Asn319 in ChoK, Thr48 in OU01, and Ser102 in BcChi-A, which are positioned to form hydrogen bonds with, and assist the spatial orientation of, the nucleophilic water molecule. Many conserved residues, particularly aromatic residues, interact with the -2 substrate subsite and potentially recognize and stabilize the substrate via intermolecular forces and hydrogen bonds. Conserved aromatic residues are similarly distributed to allow interactions with the +1 and +2 substrate subsites.
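The overlap-ratio check mentioned in Section 3.3 can be illustrated with a small Python sketch: take the top 30% of columns under two scoring methods and measure how many they share. The exact procedure used in Supplementary File S1 is not described here, so the function names, the rounding rule for the number of top columns, and the example scores are assumptions.

```python
def top_fraction_indices(scores, fraction=0.3):
    """Indices of the highest-scoring columns, keeping at least one column."""
    keep = max(1, round(len(scores) * fraction))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:keep])


def overlap_ratio(scores_a, scores_b, fraction=0.3):
    """Share of top-ranked columns common to two conservation scorings."""
    top_a = top_fraction_indices(scores_a, fraction)
    top_b = top_fraction_indices(scores_b, fraction)
    return len(top_a & top_b) / len(top_a)


# Hypothetical per-column scores from two methods (for example, a logo
# information score and a separate conservation score).
logo_scores = [9, 8, 7, 1, 2, 3, 4, 5, 6, 0]
other_scores = [9, 8, 1, 7, 2, 3, 4, 5, 6, 0]
print(overlap_ratio(logo_scores, other_scores))
```

A ratio near 1 means the two methods agree on which active-site columns are most conserved.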

3.3.1 GH8 and GH46 chitosanase sequence-profile characteristics

Apart from the conserved catalytic residues, GH8 and GH46 chitosanases exhibited more conserved residues (Asp, Tyr, His, and Arg) forming enzyme:substrate hydrogen bonds than chitinases did. Specifically, an Arg residue could potentially form additional hydrogen bonds through dissociation of side-chain hydrogen atoms and regulate interactions between side-chain carbonyl and amino groups from the polysaccharide substrate to assist binding and catalysis.

Several differences between GH8 and GH46 enzymes were noted, including fewer hydrogen bonds and more interactions with the protruding hydroxyl oxygen on C6 of the substrate (hereafter O6) in GH8 enzymes. GH46 enzymes displayed higher frequencies of Asp and Glu residues. The catalytic Asp and Glu were positioned to enable hydrogen-bond formation with substrate amino groups, potentially increasing overall binding stability. Additionally, GH46 enzymes contained more conserved nonpolar residues (Gly and Ala), potentially allowing more space for substrate acetyl groups (Table S2) and neighboring residues within the narrow active site.

Another difference concerns the frequency of CH-π interactions. These interactions primarily involve the aromatic residues Trp, Tyr, and Phe and could potentially enhance enzyme:substrate binding affinity (Table S3). CH-π interactions play significant roles in catalysis, such as stabilizing distorted conformations of the carbohydrate chain [42]. For example, in the GH8 chitosanase ChoK from Bacillus sp. strain K17, Trp166 interacts with the -2 substrate subsite through a CH-π interaction, which helps increase overall catalytic efficiency [21]. Fewer CH-π interactions were observed in GH46 enzymes, primarily because of the additional enzyme:substrate interactions within the narrow active site (Fig. S1): the formation of additional hydrogen bonds could reduce the need for further CH-π interactions to reach a relatively constant total binding affinity.

The majority of functional annotations associated with GH8 chitosanases originated from different subtypes of the same strain, with those from different strains tending to possess higher levels of conservation. This indicates that GH8 chitosanases constitute a relatively independent branch of the molecular evolutionary tree. Sequence profiles for the entire GH8 family and for GH8 cellulases specifically were constructed and compared with that of the GH8 chitosanases (Fig. S3). In all three sequence profiles, the conserved residues were distributed in the binding areas associated with the -1 and +1 substrate subsites. Additionally, the conserved residues Glu122 and Asp183 in ChoK are required for catalysis, Arg322 and Ser246 stabilize the substrate, and Gly184 may reduce steric hindrance. In contrast, the residues conserved only in the chitosanase sequence profile appeared to be related to its functional specificity. The catalytic Glu and Asn residues and the aromatic residues forming CH-π interactions, including Trp235, Phe413, and Tyr318 in ChoK, were not conserved in other GH8 enzymes [43,44]. Specifically, the catalytic Asp residue in GH8 cellulases and xylanases corresponded to the Glu residue in GH8 chitosanases.

As sequencing techniques develop, the available sequence and structure information for other chitosanase families, such as GH3, GH5, GH75, and GH80, continues to grow. In the near future, we hope to carry out the above analysis on those families and draw a more comprehensive picture of the substrate-binding specificities of these chitosanases.

3.3.2 GH18 and GH19 chitinase sequence-profile characteristics

Sequence and structure alignments were performed for both GH18 and GH19 chitinases (Fig. 2 and Fig. S2). Given that structure information often reveals more detail than sequence information, the following analysis is primarily based on the sequence profile from structure alignment [45-47]; however, the main findings of the two alignments agreed. Two conserved acidic residues, Asp-Glu for GH18 [41] and Glu-Glu for GH19 [25], constitute the catalytic residues. Similar to GH46 enzymes, these catalytic residues also interact with substrate N-acetyl groups (Table S4). Other conserved active-site residues included aromatic residues and nonpolar residues such as Gly, Ala, Ile, and Leu. The conserved GH19 chitinase residues interacting with the substrate were more diverse and included Asn/Gln, Ser/Thr, and Arg; these residues occurred less often and were not conserved in GH18 chitinases. Additionally, conserved GH18 active-site residues primarily interacted with the substrate O7 atom (the acetyl oxygen), while their counterparts in the GH19 active site showed no preference between substrate nitrogen and oxygen atoms.

Evident CH-π interactions were observed in GH18 enzymes; however, the residues involved displayed little conservation, while those forming hydrogen bonds with substrate N-acetyl groups were conserved. This indicates that interactions with N-acetyl groups may constitute the major stabilizing force, while interplanar CH-π interactions primarily contribute to enzyme sliding and processivity [48-50]. In contrast, fewer CH-π interactions and additional acetyl amine-group interactions were observed in GH19 enzymes. These results resemble those observed for GH46 enzymes, which belong to the same lysozyme superfamily.

3.4 Specific binding between chitinolytic enzymes and substrates

The catalytic and substrate-stabilizing mechanisms of chitinases and chitosanases are comparable to those observed in other GH enzymes. In terms of substrate interactions, the nitrogen atom of an amino group is more capable of hydrogen bonding than an imino nitrogen, whereas the oxygen atom of an acetyl amine group is more capable of forming contacts with neighboring residues. Therefore, the number of hydrogen bonds that can be formed with substrate amino or N-acetyl groups showed no significant difference, ensuring the possibility of stable substrate binding. Furthermore, given the nitrogen-containing functional group in the polysaccharide substrates, residues in chitinolytic enzymes were observed to bind specifically to substrate nitrogen atoms.

The four sugar rings shown in Fig. 3 represent the smallest structural unit capable of being bound (Section 3.2). Because the distribution of these units differs between the two chitinase families, common and specific binding units are shown in different shades of gray, cartoon representations of the atoms highlight specific interaction sites, and different colors represent the different families. Chitinolytic enzymes displayed numerous interactions with the substrate C2 functional group and the protruding O6 atom, while few interactions with the O3, O4, or O5 atoms (the hydroxyl oxygens on C3, C4, and C5) were observed. The number of specific substrate interactions observed for chitosanases was greater than that observed for chitinases, revealing more functional similarity between GH8 and GH46 enzymes. In terms of the C2 functional groups, chitinases bound the substrate imino nitrogen and acetyl oxygen simultaneously, while chitosanases primarily bound the amino nitrogen (hereafter N2). Therefore, it is inferred that most chitosanases are capable of binding chitin through interactions with its nitrogen atoms, while chitinases possess lower chitosanase activity because chitosan lacks the acetyl oxygen atom (hereafter O7) [23].

Of the two chitosanase families, GH8 enzymes displayed evident CH-π interactions, while GH46 enzymes displayed none. However, GH46 residues frequently interacted with different areas of the substrate. These areas were equally distributed along the substrate backbone and side chains; in the case of GH46 enzymes, however, interactions were mostly with the side-chain N2 and O6 atoms. Given that the bond between the substrate O6 atom and its backbone is a rotatable σ bond, the more flexible O6 atom may contribute less to recognition than the N2 atom. This implies that GH46 enzymes exhibit more substrate specificity than GH8 enzymes, which displayed fewer N2-related interactions. GH19 enzymes displayed significant interactions at various substrate positions without a clear preference for nitrogen or oxygen, while GH18 enzymes tended to interact specifically with the oxygen and exhibited more CH-π interactions, similar to GH8 enzymes. The specific binding patterns of each enzyme family also revealed the synergistic actions required to achieve the highest catalytic efficiency.
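The per-atom comparison in Section 3.4 amounts to tallying enzyme:substrate contacts by substrate atom. A minimal Python sketch of such a tally is given below. The contact records are invented for illustration and do not reproduce the data behind Fig. 3, and the choice of N2, O6, and O7 as the side-chain atoms follows the atom names used in the text.

```python
from collections import Counter


def contacts_per_atom(contacts):
    """Count contacts by substrate atom name.

    contacts: iterable of (residue_id, subsite, substrate_atom) records.
    """
    return Counter(atom for _, _, atom in contacts)


def side_chain_fraction(contacts, side_chain_atoms=("N2", "O6", "O7")):
    """Fraction of contacts made with side-chain atoms rather than the ring."""
    counts = contacts_per_atom(contacts)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum(counts[atom] for atom in side_chain_atoms) / total


# Hypothetical contact records (residue label, subsite, substrate atom).
example_contacts = [
    ("r1", -2, "N2"),
    ("r2", -2, "O6"),
    ("r3", -1, "N2"),
    ("r4", 1, "N2"),
    ("r5", 1, "O3"),
    ("r6", 2, "O6"),
]
print(contacts_per_atom(example_contacts))
print(side_chain_fraction(example_contacts))
```

Running the same tally for each family would show at a glance whether contacts concentrate on the C2 group and O6 or are spread along the backbone.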

4. Conclusion

Using structural bioinformatics methods, sequence profiles of each GH family were constructed and enzyme:substrate binding mechanisms were proposed. These data suggest that hydrogen bonds between the enzyme and the substrate C2 functional group play a key role in substrate binding and that CH-π interactions may enhance enzyme processivity. Although enzymes from different families have similar enzyme:substrate interfaces, the specific atomic interactions underlying binding result in considerably different substrate-binding specificities, including differences in the smallest structural unit capable of being bound, the types and locations of conserved residues, and the substrate atoms involved in enzyme interactions. This variability may explain the existence of different enzyme families capable of catalyzing reactions involving the same substrate and may help classify known protein sequences. The analysis and interaction patterns presented here offer a guide for possible conversion among chitinolytic enzymes and between monofunctional and multifunctional enzymes, and may also be applied to the rational design of other GH families.

Acknowledgements

This work was supported by a grant from the National Natural Science Foundation of China (31370111/31170071) and by a grant from the Scientific Training Project of Taishan College, Shandong University. We thank the anonymous reviewer whose extensive comments were considerably helpful in shaping and clarifying the manuscript.

Supplementary Data

Supplementary figures and tables and the evaluation of the sequence profiles are provided in the Supplementary Data and Supplementary File S1, respectively, for online publication. Software implementing the overall process of sequence-based WebLogo creation for chitinolytic enzyme families can be found at https://github.com/Stephen8554/MyUsefulTool.

References 1.

Lombard, V.; Ramulu, H. G.; Drula, E.; Coutinho, P. M.; Henrissat, B. Nucleic Acids Res. 2014, 42, 490-495.

2.

Cantarel, B. L.; Coutinho, P. M.; Rancurel, C.; Bernard, T.; Lombard, V.; Henrissat, B. Nucleic Acids Res. 2009, 37, 233-238.

3.

Merzendorfer, H. Eur. J. Cell Biol. 2011, 90, 759-769.

4.

Hartl, L.; Zach, S.; Seidl-Seiboth, V. Appl. Microbiol. Biotechnol. 2012, 93, 533-543.

Page 19 of 25

5.

Bhattacharya, D.; Nagpure, A.; Gupta, R. K. Crit. Rev. Biotechnol. 2007, 27, 21-28.

6.

Kasprzewska, A. Cell. Mol. Biol. Lett. 2003, 8, 809-824.

7.

Arakane, Y.; Taira, T.; Ohnuma, T.; Fukamizo, T. Curr. Drug Targets 2012, 13, 442-470.

8.

Cheung, R. C.; Ng, T. B.; Wong, J. H.; Chan, W. Y. Mar. Drugs 2015, 13, 5156-5186.

9.

Yong, S. K.; Shrivastava, M.; Srivastava, P.; Kunhikrishnan, A.; Bolan, N. Rev. Environ. Contam. Toxicol. 2015, 233, 1-43.

10.

Chavan, S. B.; Deshpande, M. V. Biotechnol. Prog. 2013, 29, 833-846.

11.

Varum, K. M.; Anthonsen, M. W.; Grasdalen, H.; Smidsrod, O. Carbohydr. Res. 1991, 211, 17-23.

12.

Fukamizo, T.; Ohkawa, T.; Ikeda, Y.; Goto, S. Biochim. Biophys. Acta 1994, 1205, 183-188.

13.

Himmel, M. E.; Ding, S. Y.; Johnson, D. K.; Adney, W. S.; Nimlos, M. R.; Brady, J. W.; Foust, T. D. Science 2007, 315, 804-807.

14.

de Melo-Minardi, R. C.; Bastard, K.; Artiguenave, F. Bioinformatics 2010, 26, 3075-3082.

15.

Kumar, V. Carbohydr. Res. 2010, 345, 1564-1569.

16.

Kumar, V. Carbohydr. Res. 2010, 345, 893-898.

17.

Kumar, V. Bioinformation 2011, 6, 61-63.

18.

Paes, G.; Berrin, J. G.; Beaugrand, J. Biotechnol. Adv. 2012, 30, 564-592.

Page 20 of 25

19.

Benson, D. A.; Cavanaugh, M.; Clark, K.; Karsch-Mizrachi, I.; Lipman, D. J.; Ostell, J.; Sayers, E. W. Nucleic Acids Res. 2013, 41, 36-42.

20.

Berman, H. M.; Westbrook, J.; Feng, Z.; Gilliland, G.; Bhat, T. N.; Weissig, H.; Shindyalov, I. N.; Bourne, P. E. Nucleic Acids Res. 2000, 28, 235-242.

21.

Adachi, W.; Sakihama, Y.; Shimizu, S.; Sunami, T.; Fukazawa, T.; Suzuki, M.; Yatsunami, R.; Nakamura, S.; Takenaka, A. J. Mol. Biol. 2004, 343, 785-795.

22.

Guerin, D. M.; Lascombe, M. B.; Costabel, M.; Souchon, H.; Lamzin, V.; Beguin, P.; Alzari, P. M. J. Mol. Biol. 2002, 316, 1061-1069.

23.

Lyu, Q.; Wang, S.; Xu, W.; Han, B.; Liu, W.; Jones, D. N.; Liu, W. Biochem. J. 2014, 461, 335-345.

24.

Papanikolau, Y.; Prag, G.; Tavlas, G.; Vorgias, C. E.; Oppenheim, A. B.; Petratos, K. Biochemistry 2001, 40, 11338-11343.

25.

Ohnuma, T.; Umemoto, N.; Nagata, T.; Shinya, S.; Numata, T.; Taira, T.; Fukamizo, T. Biochim. Biophys. Acta 2014, 1844, 793-802.

26.

Hunter, S.; Apweiler, R.; Attwood, T. K.; Bairoch, A.; Bateman, A.; Binns, D.; Bork, P.; Das, U.; Daugherty, L.; Duquenne, L.; Finn, R. D.; Gough, J.; Haft, D.; Hulo, N.; Kahn, D.; Kelly, E.; Laugraud, A.; Letunic, I.; Lonsdale, D.; Lopez, R.; Madera, M.; Maslen, J.; McAnulla, C.; McDowall, J.; Mistry, J.; Mitchell, A.; Mulder, N.; Natale, D.; Orengo, C.; Quinn, A. F.; Selengut, J. D.; Sigrist, C. J.; Thimma, M.; Thomas, P. D.; Valentin, F.; Wilson, D.; Wu, C. H.; Yeats, C. Nucleic Acids Res. 2009, 37, 211-215.

27.

Thompson, J. D.; Higgins, D. G.; Gibson, T. J. Nucleic Acids Res. 1994, 22,

Page 21 of 25

4673-4680. 28.

Larkin, M. A.; Blackshields, G.; Brown, N. P.; Chenna, R.; McGettigan, P. A.; McWilliam, H.; Valentin, F.; Wallace, I. M.; Wilm, A.; Lopez, R.; Thompson, J. D.; Gibson, T. J.; Higgins, D. G. Bioinformatics 2007, 23, 2947-2948.

29.

Humphrey, W.; Dalke, A.; Schulten, K. J. Mol. Graphics 1996, 14, 33-38, 27-28.

30.

Crooks, G. E.; Hon, G.; Chandonia, J. M.; Brenner, S. E. Genome Res. 2004, 14, 1188-1190.

31.

Schneider, T. D.; Stephens, R. M. Nucleic Acids Res. 1990, 18, 6097-6100.

32.

Glaser, F.; Pupko, T.; Paz, I.; Bell, R. E.; Bechor-Shental, D.; Martz, E.; Ben-Tal, N. Bioinformatics 2003, 19, 163-164.

33.

Ashkenazy, H.; Erez, E.; Martz, E.; Pupko, T.; Ben-Tal, N. Nucleic Acids Res. 2010, 38, 529-533.

34.

Landau, M.; Mayrose, I.; Rosenberg, Y.; Glaser, F.; Martz, E.; Pupko, T.; Ben-Tal, N. Nucleic Acids Res. 2005, 33, 299-302.

35.

Waterhouse, A. M.; Procter, J. B.; Martin, D. M.; Clamp, M.; Barton, G. J. Bioinformatics 2009, 25, 1189-1191.

36.

Pace, C. N.; Fu, H.; Fryar, K. L.; Landua, J.; Trevino, S. R.; Shirley, B. A.; Hendricks, M. M.; Iimura, S.; Gajiwala, K.; Scholtz, J. M.; Grimsley, G. R. J. Mol. Biol. 2011, 408, 514-528.

37.

Zhang, Y. H.; Lynd, L. R. Biotechnol. Bioeng. 2004, 88, 797-824.

38.

Nishiyama, Y.; Langan, P.; Chanzy, H. J. Am. Chem. Soc. 2002, 124, 9074-9082.

39.

Nishiyama, Y.; Sugiyama, J.; Chanzy, H.; Langan, P. J. Am. Chem. Soc. 2003, 125, 14300-14306.

40.

Payne, C. M.; Baban, J.; Horn, S. J.; Backe, P. H.; Arvai, A. S.; Dalhus, B.; Bjoras, M.; Eijsink, V. G.; Sorlie, M.; Beckham, G. T.; Vaaje-Kolstad, G. J. Biol. Chem. 2012, 287, 36322-36330.

41.

van Aalten, D. M.; Komander, D.; Synstad, B.; Gaseidnes, S.; Peter, M. G.; Eijsink, V. G. Proc. Natl. Acad. Sci. U. S. A. 2001, 98, 8979-8984.

42.

Isogawa, D.; Morisaka, H.; Kuroda, K.; Kusaoke, H.; Kimoto, H.; Suye, S.; Ueda, M. Biosci., Biotechnol., Biochem. 2014, 78, 1177-1182.

43.

Van Petegem, F.; Collins, T.; Meuwis, M. A.; Gerday, C.; Feller, G.; Van Beeumen, J. J. Biol. Chem. 2003, 278, 7531-7539.

44.

Guerin, D. M.; Lascombe, M. B.; Costabel, M.; Souchon, H.; Lamzin, V.; Beguin, P.; Alzari, P. M. J. Mol. Biol. 2002, 316, 1061-1069.

45.

Zhang, Z.; Wang, Y.; Wang, L.; Gao, P. PLoS One 2010, 5, e14316.

46.

Jiang, H.; Blouin, C. BMC Bioinf. 2007, 8, 444.

47.

Zhang, Z.; Huang, J.; Wang, Z.; Wang, L.; Gao, P. Mol. Biol. Evol. 2011, 28, 291-301.

48.

Horn, S. J.; Sikorski, P.; Cederkvist, J. B.; Vaaje-Kolstad, G.; Sørlie, M.; Synstad, B.; Vriend, G.; Vårum, K. M.; Eijsink, V. G. Proc. Natl. Acad. Sci. U. S. A. 2006, 103, 18089-18094.

49.

Zakariassen, H.; Aam, B. B.; Horn, S. J.; Vårum, K. M.; Sørlie, M.; Eijsink, V. G. J. Biol. Chem. 2009, 284, 10610-10617.

50.

Watanabe, T.; Ariga, Y.; Sato, U.; Toratani, T.; Hashimoto, M.; Nikaidou, N.; Kezuka, Y.; Nonaka, T.; Sugiyama, J. Biochem. J. 2003, 376, 237-244.

Figure Legends

Figure 1. Chitinolytic enzyme and active-site residue frequency.

Figure 2. Sequence logos of chitinolytic enzyme families.

Figure 3. Chitinolytic enzyme and substrate interaction pattern diagram.

Table Legends

Table 1. Numbers of residues interacting with substrate subsites. For each enzyme, residues located within 5 Å of a substrate subsite were considered to interact with that subsite, and their numbers were counted and are listed below. The binding strength of each enzyme at each subsite and the smallest structural unit necessary for binding can then be inferred from Table 1.
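The 5 Å criterion in the Table 1 legend can be illustrated with a minimal sketch. This is not the authors' pipeline: the input layout, the function name `count_residues_per_subsite`, and the toy coordinates are all hypothetical, and the sketch only applies the distance cutoff, without classifying contacts as hydrogen bonds or CH-π interactions.

```python
import math
from collections import defaultdict

CUTOFF = 5.0  # Å, the distance threshold stated in the Table 1 legend


def count_residues_per_subsite(protein_atoms, subsite_atoms, cutoff=CUTOFF):
    """Count distinct residues with at least one atom within `cutoff` Å of each subsite.

    protein_atoms: iterable of (residue_id, (x, y, z)) tuples (hypothetical layout).
    subsite_atoms: dict mapping a subsite label such as "-1" to a list of
                   (x, y, z) coordinates of the sugar unit bound at that subsite.
    """
    protein_atoms = list(protein_atoms)
    contacts = defaultdict(set)
    for subsite, sugar_coords in subsite_atoms.items():
        for residue_id, atom_xyz in protein_atoms:
            # A residue counts once, however many of its atoms are in range.
            if any(math.dist(atom_xyz, s) <= cutoff for s in sugar_coords):
                contacts[subsite].add(residue_id)
    return {subsite: len(contacts[subsite]) for subsite in subsite_atoms}


# Toy coordinates, invented purely for illustration.
protein = [
    ("TRP10", (0.0, 0.0, 0.0)),
    ("TRP10", (1.0, 0.0, 0.0)),
    ("ASP20", (4.0, 0.0, 0.0)),
    ("GLU30", (20.0, 0.0, 0.0)),
]
subsites = {"-1": [(2.0, 0.0, 0.0)], "+1": [(8.0, 0.0, 0.0)]}

print(count_residues_per_subsite(protein, subsites))  # {'-1': 2, '+1': 1}
```

In practice the coordinates would come from a crystal structure of the enzyme-substrate complex, with each sugar unit assigned to its subsite beforehand.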

Table 1. Numbers of residues interacting with substrate subsites [a].

GH        Numbers of residues having interactions with subsites     Total number                  Average
Family    -6     -5    -4    -3     -2    -1    +1    +2    +3      (Hydrogen bonding + CH-π)     number

GH8       -[b]               3[c]   6     6     3     5             26 (22+4)                     4.3
GH46                         8      5     6     8     4     1       32 (31+1)                     5.3
GH18      3      3     5     4      8     4     4     9     -       40 (35+5)                     5
GH19      -      -     -     -      9     4     6     7     -       26 (26+0)                     6.5

[a] The interactions counted here are polar interactions, mainly hydrogen bonds, and CH-π interactions between the sugar chain and aromatic residues.

[b] “-”: Unidentified.

[c] Underlined numbers indicate the existence of a CH-π interaction at this subsite.
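The GH19 row of Table 1 offers a quick check on how the last two columns relate to the per-subsite counts. The sketch below assumes that "Average number" is the total divided by the number of identified subsites. This relation is inferred from the tabulated values rather than stated in the text, and the helper name `summarize` is hypothetical.

```python
# Per-subsite counts for the GH19 row of Table 1.
# Subsites marked "-" (unidentified) are stored as None.
gh19 = {
    "-6": None, "-5": None, "-4": None, "-3": None,
    "-2": 9, "-1": 4, "+1": 6, "+2": 7, "+3": None,
}


def summarize(row):
    """Return (total, average over identified subsites) for one table row."""
    counts = [n for n in row.values() if n is not None]
    total = sum(counts)
    return total, total / len(counts)


total, average = summarize(gh19)
print(total, average)  # 26 6.5
```

Both values match the GH19 entries in the "Total number" and "Average number" columns.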
