Crystal structure of alkaline cellulase K: insight into the alkaline adaptation of an industrial enzyme


doi:10.1006/jmbi.2001.4835, available online at http://www.idealibrary.com

J. Mol. Biol. (2001) 310, 1079-1087

Tsuyoshi Shirai1*, Hirokazu Ishida1, Jun-ichi Noda1, Takashi Yamane1, Katsuya Ozaki2, Yoshihiro Hakamada2 and Susumu Ito2

1 Department of Biotechnology and Biomaterial Chemistry, Graduate School of Engineering, Nagoya University, Chikusa-Ku, Nagoya, 464-8603, Japan

2 Tochigi Research Laboratories of Kao Corporation, 2606 Akabane, Ichikai, Haga, Tochigi 321-3497, Japan

The crystal structure of the catalytic domain of alkaline cellulase K was determined at 1.9 Å resolution. Because of its highly alkaliphilic nature and its highest activity at pH 9.5, the enzyme is used commercially in laundry detergents. An analysis of the structural bases of the alkaliphilic character of the enzyme suggested a mechanism similar to that previously proposed for alkaline proteases, that is, an increase in the number of Arg, His, and Gln residues, and a decrease in Asp and Lys residues. Some ion pairs were formed by the gained Arg residues, which is similar to what has been found in the alkaline proteases. Lys-Asp ion pairs are disfavored and partly replaced with Arg-Asp ion pairs. The alkaline adaptation appeared to be a remodeling of ion pairs so that the charge balance is kept in the high pH range.

© 2001 Academic Press

*Corresponding author

Keywords: X-ray crystallography; glucanase; commercial enzymes; alkaliphilic protein; protein evolution

Introduction

Hydrolases are representative of the industrial enzymes used as laundry-detergent additives. They have been the "best-seller" enzymes for more than 30 years.1 The application of hydrolases to laundry detergent was initiated with a protease that was used to digest proteinaceous stains such as keratin or elastin on clothing.2-4 Other hydrolases such as amylases, lipases, and cellulases have also been applied to laundry use since the successful application of the protease.1,5 The introduction of cellulases was a turning point in the history of laundry enzymes because these enzymes did not digest stains as the other hydrolases did.6 Instead, a cellulase digests frays (microfibrils) of cotton fabric to help release stain particles.7 Microfibril digestion also helps to maintain softness and vividness of textiles. The most common cellulases currently used for detergent additives are the endo-1,4-β-glucanases (EC 3.2.1.4). Being inactive against cellulose in crystalline states, most of the cellulases do not digest sound fabric.

Stability in detergents at alkaline pH is required for a successful laundry enzyme. In most cases, naturally existing enzymes have been exploited for an enzyme with the desired characters. The first protease used as a laundry enzyme was extracted from Bacillus licheniformis, which showed optimum activity at pH 10.9.2 The search for a more efficient enzyme has continued since then, and one of the current laundry proteases, M-protease from alkaliphilic Bacillus sp. strain KSM-K16, shows maximum activity at pH 12.3.5,8,9

The strategy used for proteases was also used for cellulases, and an alkaline cellulase was found in the Bacillus sp. strain KSM-635, which showed an optimum pH of 9.5 at 45 °C and high durability in detergents.10 The gene of the enzyme, named EG, was cloned, and was truncated into a fragment that encoded only the catalytic domain of the enzyme.11 The truncated EG expressed in Bacillus subtilis consisted of 364 amino acid residues and showed nearly the same properties as the native enzyme, which make it suitable for a laundry detergent additive. Since the truncated EG also showed a resistance to protease digestion, it has been used in a compact laundry detergent along with M-protease, and was given the trademark name cellulase K (CelK in the following sections).

Although the truncated product from the natural enzyme is useful as a laundry enzyme, room for improvement by protein engineering still remains, such as thermostabilization or enhancement of catalytic activity. To obtain the information essential to engineer the protein, the structure of CelK was determined by X-ray crystallography at a resolution of 1.9 Å (1 Å = 0.1 nm). The structure and an analysis of the structural bases for alkaline adaptation of CelK are reported here.

Results and Discussion

Overall and substrate binding structures

CelK is categorized as a family 5 cellulase.12,13 Of the family 5 cellulases, the 3D structures of endoglucanase E1cd from Acidothermus cellulolyticus,14 Cel5A from Bacillus agaradhaerens,15-17 EGZ from Erwinia chrysanthemi (PDB code 1EGZ), CelC from Clostridium thermocellum,18,19 and CelCCA from C. cellulolyticum20 are known. Showing 47 % identity in amino acid sequence of the catalytic domains, Cel5A is the closest relative of CelK among these cellulases (Figure 1(a)). The superimposition of the structures of CelK and Cel5A is shown in Figure 1(b). CelK conserved an α/β-barrel structure close to that of Cel5A, especially at the core region of the barrel. The root mean square deviation of 142 Cα atoms in the barrel core region is 0.7 Å.

CelK has three long insertions (insertions 1-3 in Figure 1). Additionally, both termini of CelK are extended beyond those of Cel5A (terminal extensions in Figure 1). Because of these additional fragments, the number of residues of CelK is 20 % larger than that of Cel5A. The three insertions comprise a peripheral section of the substrate binding cleft, suggesting that some of the insertions are used in substrate binding (Figure 1(b)). The terminus of the C-terminal extension is close in space to insertion 3. The C-terminal extension links the catalytic domain and a C-terminal domain in the native protein.21 Insertion 3 might be the interface between the two domains.

The active center of CelK in complex with cellobiose is shown in Figure 2(a). Glu373 and Glu485 work as acid/base and nucleophile in the catalytic process, respectively.22,23 The cellobiose molecule binds to the −3 and −2 sugar-binding sites.24 The position and conformation of the cellobiose molecule are similar to those observed in the Cel5A-cellobiose complex.17 At the −3 site, the glucose makes a stacking interaction with the side-chain of Trp269 and forms two direct hydrogen bonds, O5-Lys524-Nζ and O6-Lys524-Nζ, with the protein molecule. At the −2 site, eight direct hydrogen bonds are possible, namely, O2-Trp519-Nε1, O2-Glu526-Oε2, O3-Glu526-Oε2, O3-Lys524-Nζ, O3-His265-Nε2, O5-Tyr296-Oη, O6-Glu299-Oε2, and O6-Tyr296-Oη (Figure 2(a)).

An induced fit of the structures around the catalytic site on substrate binding was reported for Cel5A.16,17 The induced fit involves dislocation of the loop Ala233-Gly239 (Gln490-Gly495 of CelK) and the side-chain of Tyr202 (Tyr444 of CelK) of Cel5A (Figure 2(b)). When the −1 sugar-binding site is occupied by a glucose moiety, the side-chain of Tyr202 moves to form a hydrogen bond to glucose-O5 or Glu228-Oε2, and the loop moves toward the catalytic sites to enclose the reaction site (compare blue or green models and gray model in Figure 2(b)). Tyr444 is thought to work in orientation and activation of the Glu485 nucleophile.16,17 One residue in the loop, Ala234-O (Ala491 of CelK), forms a hydrogen bond to glucose-O6 in the closed conformation.

In the CelK structure, however, the loop Gln490-Gly495 and the side-chain of Tyr444 were found at the positions that correspond to the closed conformation of Cel5A, even though no ligand was bound to the protein (red model in Figure 2(b)). When CelK was complexed with cellobiose, the loop moved slightly, while the side-chain of Tyr444 was fixed in the closed conformation (orange model in Figure 2(b)). It forms a Tyr444-Oη-Glu485-Oε2 hydrogen bond. The conformation of the loop and the side-chain of Tyr444 suggest that CelK adopts an active conformation prior to substrate binding. The induced-fit mechanism of Cel5A therefore does not appear to be shared by CelK, although the possibility cannot be ruled out that this conformation is an artefact produced by a cadmium ion that was found at the catalytic center (Figure 2(a)).

Figure 1. (a) Amino acid sequences of CelK and Cel5A. The terminal extensions and the insertions of CelK are labeled and differently colored. The residues in lowercase have not been modeled. The secondary structure elements are indicated by underlines and labeled as β-1 to β8 for β-strands and α1 to α8 for α-helices. The conserved residues are indicated by lines between the sequences. The asterisks indicate the catalytic residues, Glu373 and Glu485. (b) Stereo-pair of superimposed CelK and Cel5A 3D structures. Cel5A is shown in gray. CelK is colored according to the amino acid sequence in (a). The side-chains of the catalytic residues, Glu373 and Glu485, are shown. The bound cellobiose is shown as a red and white stick model. The INSIGHT II program (MSI/Ryoka Systems) was used to prepare the Figures of molecular models.

Figure 2. (a) The active center structure of the CelK-cellobiose complex. Side-chains of the active site residues are shown in orange. The red sphere is the cadmium ion found at the catalytic site. Hydrogen bonds are shown in yellow. The cellobiose molecule is shown in green. The Fo − Fc electron density map is superimposed on the model (contoured at the 2.5σ level; the atoms of cellobiose were excluded from the phase calculation). (b) The active center and carbohydrate ligand structures of CelK and Cel5A in different ligand states. Only the main-chain atoms are presented for the loops Ala233-Gly239 of Cel5A and Gln490-Gly495 of CelK. The bound sugars are labeled according to the binding sites. Hydrogen bonds are shown in yellow. The superimposed structures are Cel5A without ligand (open conformation, green), complexed with cellobiose (open conformation, blue) and covalent-intermediate (closed conformation, gray), and CelK without ligand (closed conformation, red) and cellobiose complex (closed conformation, orange).
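The root mean square deviation quoted above for the 142 Cα atoms of the barrel core can be illustrated with a short calculation. The following is only a minimal sketch: it assumes the two coordinate sets are already superimposed and paired atom by atom (the superposition step itself is not shown), and the function name `rmsd` and the toy coordinates are hypothetical, not taken from the deposited structures.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired coordinate lists.

    Each list holds (x, y, z) tuples in the same units (e.g. Å) and in the
    same atom order. Assumption: the structures are already superimposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))

# Hypothetical toy example: two atoms, each displaced by 1 unit along z.
toy_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
toy_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(toy_a, toy_b))
```

In practice the paired atoms would be the equivalent Cα positions identified from the sequence alignment in Figure 1(a).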

Balance of amino acid composition in alkaline adaptation

The most useful information about the CelK structure would be the mechanism of alkaline adaptation of the enzyme. An analysis of alkaline M-protease suggested that the alkaline adaptation involves increases in the numbers of Arg, Asn, His, and Gln residues, and decreases in Asp, Glu, and Lys residues.25 In addition, some ionic interactions were formed by using the gained Arg residues. The cellulases were analyzed by the same method as the proteases to see if the alkaline adaptation mechanism is shared by these enzymes.

Figure 3 shows the molecular phylogeny of the selected subfamily 5-2 cellulases.22 The cellulases in this tree show at least 30 % amino acid sequence identity to CelK in their catalytic domains. The phylogeny implies that the alkaline species of cellulase are divided into two clusters, and CelK and Cel5A belong to different clusters. This tree was used to deduce the ancestral sequences of the cellulases by the same method previously used for proteases.25 To reduce uncertainty, branches with less than 90 % bootstrap probability were avoided in this analysis; that is, the tree was truncated by using only the branches shown in blue in Figure 3. The truncated tree was used to deduce the amino acid sequences at nodes n1 and a1, which might be the neutral and alkaline ancestors of cluster 1, respectively, and n2 and a2, which might be those of cluster 2.

Shown in Figure 4(a) is the balance of amino acid compositions in the alkaline adaptation processes of the cellulases, derived from the ancestral sequences. The corresponding values for the proteases were taken from previous work.25 Although some differences exist among the three cases, they shared a common tendency for hydrophilic amino acids except for Glu and Asn. In this consensus, Lys and Asp residues decreased, and Arg, His, and Gln residues increased in the adaptation process, as has been proposed before. Ala and Val residues were also found to increase. The results of the two-sided t-test are also presented in Figure 4(a). Within the consensus, the increases in Arg and His residues and the decrease in Lys residues appear to be statistically significant (p < 0.05).

Observed preferences in amino acid substitution patterns for Arg, His and Lys residues are shown in Figure 4(b). The major sources of Arg are Gln and gaps. Asp is frequently converted into His. Lys is converted into Gln, Pro or Met.

The modification in the proportion of hydrophilic residues seems to be relevant to alkaline adaptation of the enzymes. The pKa values of Arg are generally higher than that of Lys. Since an excess of negative charges may disturb the local charge balance at higher pH, introducing Arg might help to maintain the balance. However, a rise in the isoelectric point of the whole domain is not likely to be a necessary consequence of the modification because the total number of positive charges did not surpass that of negative charges in the catalytic domain of cellulases. If a gain of Arg or Lys is simply counted as +1, and a gain of Glu or Asp is counted as −1, the suggested charge balance in the adaptation process is −1.2, −5.4, and +4.7 for cellulase clusters 1, 2, and protease, respectively. Alkaline adaptation has brought about a rise in isoelectric point in the case of proteases. In the case of cellulases, however, the isoelectric points did not rise because Glu residues were gained in both cases (Figure 4(a)). The isoelectric point of CelK was experimentally determined to be 4.5.11 The consequence of the observed amino acid substitutions would not be a modification of the isoelectric point. This suggests that local interactions introduced by the substitutions should be examined to understand the adaptation mechanism.

Figure 3. Molecular phylogeny of subfamily 5-2 cellulases. The numbers associated with the branches are the evolutionary distances, and the numbers in parentheses are the percent reproduction of each branch in 1000 bootstrap reconstructions of the tree. Enzymes are identified by the accession codes of the Swiss-Prot database. The tree, which is composed only of the branches shown in blue, was used for the ancestral type deduction. Note that the truncated tree does not contain branches with <90 % bootstrap probability. The nodes labeled a1, n1, a2, and n2 were used for analysis of the alkaline adaptation process.

Local interactions introduced in the alkaline adaptation process

The deduced sequences of ancestors n1 and a1 were used to identify the substitutions that occurred during the adaptation process. The sites that are assumed to be substituted with Arg or His or substituted from Lys in the n1-a1 branch (in Figure 3) are shown on the CelK structure (Figure 5).

In the previous study of alkaline protease, the residues that were substituted during alkaline adaptation did not appear to significantly modify the catalytic center of the enzyme.25 In the case of CelK, however, His333 was found close to the catalytic center (Figure 5). His333 is in contact with the cellobiose molecule in the complex structure and exists in proximity to the catalytic residues Glu373 and Glu485 (Figure 2(a)). Since the Nε2 atom of His333 exists 4.5 and 6.8 Å away from the Oε2 atoms of Glu373 and Glu485, respectively, the protonation status of the imidazole may affect the catalytic sites. The side-chain of His333 seems to be stabilized by an ionized hydrogen bond His333-Nδ1-Glu299-Oε2 (Figure 5). The counterparts of His333 in Cel5A (cluster 2) and the other close relatives are Leu, which cannot be ionized. This site is occupied by a histidine in thermophilic endoglucanase E1cd of A. cellulolyticus.14

Four of the Arg residues, which were introduced during the adaptation process, are conserved between CelK and ancestor a1 (Figure 3). Arg423 and Arg464 form an ion pair network with Asp425 and Asp429, and the network is buried in the interior of the protein molecule (Figure 5). A similar network that involves two acquired Arg residues and one Glu residue was observed in alkaline protease.25 Another acquired residue, Arg567, also forms an ion pair with Asp503 (Figure 5). Arg538 exists at the tip of insertion 3 and does not interact with the other part of the catalytic domain. As already mentioned, this insertion might be used for interdomain interaction. It is possible that Arg538 is used for an interaction with the C-terminal domain.

The numbers of ion pairs in family 5 cellulases are shown in Table 1. The numbers of Lys-Asp ion pairs in CelK and Cel5A are fewer than in other cellulases. Possibly, the loss of Lys serves to evade Lys-Asp ion pairs, which will not maintain a charge balance in the alkaline range. The suggested evasion of Lys-Asp ion pairs was supported by the spatial distribution of the eliminated Lys residues. The three sites that are supposed to be substituted from Lys are found close to Asp residues. The deleted ion pairs are Thr255(Lys)-Asp253, Val280(Lys)-Asn277(Asp) and Asn470(Lys)-Asp429 (in parentheses is the amino acid before adaptation). The last one was found close in space to the acquired ion pair network Arg423-Asp429-Arg464-Asp425 (Figure 5).

Although the gained Arg residues are used for ion pairs, a net increase in the number of ion pairs was not observed because the gain was compensated by the loss of Lys-Asp pairs (Table 1). Alkaline adaptation might not require a net increase of ionic interactions, which is often observed for thermophilic26-29 or halophilic adaptation.30

These observations suggest that the alkaline adaptation consists of several remodelings of ionic interactions, mainly from Lys-Asp pairs into Arg-Asp pairs, in order to keep the local charge balance in the high pH range. The detected remodeling might be one of the general features of the alkaline adaptation process because it was found in two different types of enzyme analyzed by using statistical methods.

Figure 4. (a) Balance of amino acid compositions in the alkaline adaptation processes and the t-test results. (left graph) Filled, striped, and open bars represent the balances of cluster 1 and cluster 2 adaptation processes of cellulases, and that of proteases, respectively. The positive values mean a gain in the process. One category represents deletion; the negative value of this category means insertion to alkaline species. Consensus in increment or decrement is indicated by a filled or open arrowhead, respectively. (right graph) The two-sided t-values between the balances of amino acid in alkaline adaptation and non-adaptation processes. The significance levels (p < 0.05) are shown by the thick vertical lines. (b) Favored substitution patterns in the alkaline adaptation process for Arg, His and Lys. The numbers associated with the arrows are the t-values between alkaline adaptation and non-adaptation processes.

Figure 5. Spatial distribution of the acquired Arg and His residues (red) and eliminated Lys residues (gray) that appeared to be responsible for the ion pair remodeling. Shown in blue are the negatively charged residues that might form an ion pair with the Arg, His or Lys residues. The cellobiose molecule is shown in green. Hydrogen bonds are shown in yellow.

Table 1. Numbers of ion pairs in family 5 cellulases

Name^a    Lys-Asp  Lys-Glu  Arg-Asp  Arg-Glu  Total
CelK         1        2        7        4       14
Cel5A        1        6        5        4       16
CelC         5        3        6        7       21
E1cd         4        0        7        3       14
CelCCA       3        1        3        4       11
EGZ          3        7        1        5       16

Numbers of residue pairs in which charged atoms exist within 3.5 Å.
^a Sources: CelK, Bacillus sp. KSM-635 (this work); Cel5A, B. agaradhaerens;15-17 CelC, C. thermocellum;18,19 E1cd, A. cellulolyticus;14 CelCCA, C. cellulolyticum;20 EGZ, Erwinia chrysanthemi (PDB code 1EGZ).

Materials and Methods

Crystallographic analysis

CelK was produced and purified as described in a previous report.11 The truncated protein consists of residues from Ala228 to Leu584 of the native protein, and the extension of seven residues (H2N-GRPAGMQ) to the N terminus that was introduced in the construction of the expression vector.

A total of four different crystal forms of CelK have been obtained. The crystallization and characterization of form 1 and 2 crystals were reported previously.31 Concurrent with the attempt to solve the form 2 crystal (space group R3, a = 111.9 Å and c = 207.1 Å) with the multiple isomorphous replacement method, the search for a new crystal form continued, and form 3 (space group R3, a = 101.2 Å and c = 221.8 Å) and form 4 crystals were obtained (see Table 2 for parameters). The form 4 crystal showed the best quality among these crystals.

The form 4 crystal was grown with the hanging-drop vapor-diffusion method by using Crystal Screen II reagent (Hampton Research). The crystal was obtained by using 8 µl of 1 % (w/v) protein, 20 mM cadmium sulfate hydrate and 0.5 M sodium acetate in 0.1 M Hepes buffer (pH 7.5) for a drop, and 1 ml of 40 mM cadmium sulfate hydrate and 1.0 M sodium acetate in 0.1 M Hepes buffer (pH 7.5) for a reservoir as a starting condition. Form 4 crystals grew in four to five days at 18 °C. The cellobiose complex of the protein was prepared by incubating the protein in a solution containing 16 mM cellobiose (Seikagaku Co.) overnight at room temperature before the precipitant solution was added to the protein solution.

X-ray diffraction experiments of the form 4 crystals were done by using an R-AXIS IV imaging plate detector with a mirror-monochromator mounted on a Rigaku RU-300 copper-rotating anode X-ray generator (λ = 1.54 Å) (Rigaku). The crystals were flash frozen in liquid nitrogen and kept in a 100 K nitrogen stream from a cryo-system (Oxford Cryosystems) during the data collection. As an anti-freeze reagent, 2 µl of glycerol was added to the 8 µl hanging drop about ten minutes before flash-freezing the crystals. Diffraction images were processed with the DENZO and SCALEPACK programs.32 The statistics for data collection are summarized in Table 2.

The structure of the form 4 crystal was solved by the molecular replacement method using Cel5A from B. agaradhaerens (PDB code 7A3H) as a search model. The program AMoRe was used for the calculation of rotation and translation functions.33 The crystallographic refinement was executed by recurrently applying conjugate-gradient and B-factor refinements with X-PLOR34 and manual model building with TURBO-FRODO.35 A total of 357 amino acid residues were built into the electron density. Electron densities for the two N-terminal residues (H2N-GR) and the five C-terminal residues (KFTKL-COOH) were not observed. Several very high electron densities of spherical shape were interpreted as cadmium ions, and Y-shaped electron densities associated with the cadmium ions were interpreted as acetates.

A model of the cellobiose complex was constructed from the cellobiose-free model. After a few cycles of refinement, a cellobiose molecule was introduced into the model because a well-defined density for the molecule at the −2 and −3 carbohydrate binding sites was observed. The refinement statistics for both models are shown in Table 2.

The qualities of the final models of CelK and its complex with cellobiose were examined by using the X-PLOR and PROCHECK programs.36 The models showed similar stereochemical qualities to one another. Most of the residues were found within the additional allowed region of the Ramachandran plots (not shown). The exceptions were His333 and Thr539 in the CelK model and His333 and Asp387 in the complex model. They were found within the generously allowed region. As observed in the structure of Cel5A from B. agaradhaerens, the peptide bond between Trp519 and Gly520 was found in cis conformation. Additionally, the peptide bonds between Ala334-Pro335 and Gly496-Pro497 were found in cis conformation in CelK.

Table 2. Crystallographic parameters, data collection, and refinement statistics

Data set                                          CelK                      CelK-cellobiose complex
Crystallographic parameters
  Space group                                     P3221                     P3221
  Cell constants (Å)                              a = 98.2, c = 122.0       a = 97.9, c = 121.6
Data collection statistics^a
  Resolution limits (Å)                           40.00-1.90 (1.97-1.90)    40.00-1.90 (1.97-1.90)
  No. of unique reflections (F > 0)               49,176 (4,496)            47,889 (3,826)
  Mean I/σ(I)                                     20.1 (5.9)                20.1 (4.9)
  Rmerge (%)                                      7.7 (20.2)                9.0 (18.7)
  Completeness (%)                                90.9 (83.8)               89.2 (72.4)
Refinement statistics^a
  Resolution limits (Å)                           8.00-1.90 (1.97-1.90)     8.00-1.90 (1.97-1.90)
  No. of refs used for refinement (F > 0)         48,446 (4,493)            46,786 (3,807)
  Rcryst                                          0.199 (0.264)             0.184 (0.240)
  No. of refs for Rfree statistics (5 % of total) 2,395 (214)               2,330 (191)
  Rfree                                           0.230 (0.274)             0.206 (0.242)
Model contents
  Amino acid residues                             357                       358
  H2O                                             465                       474
  Cd2+                                            10                        10
  CH3COO−                                         5                         4
  Cellobiose                                      -                         1
RMSD from ideal geometry
  Bond length (Å)                                 0.008                     0.010
  Bond angle (deg.)                               2.4                       2.5
  Dihedral angle (deg.)                           23.0                      23.5
  Improper angle (deg.)                           1.0                       2.0

^a In parentheses are the values for the final shell.
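Table 1 counts a residue pair as an ion pair when charged atoms lie within 3.5 Å of each other. The sketch below shows one way such a count could be made from atomic coordinates. It is illustrative only: which atoms are treated as "charged" (Lys NZ; Arg NE, NH1, NH2; Asp OD1, OD2; Glu OE1, OE2) is an assumption made here, since the text does not list them, and the data layout, function names, and toy coordinates are hypothetical.

```python
import math
from collections import Counter

# Assumption: side-chain atoms regarded as charged, using PDB atom names.
CHARGED_ATOMS = {
    "LYS": ("NZ",),
    "ARG": ("NE", "NH1", "NH2"),
    "ASP": ("OD1", "OD2"),
    "GLU": ("OE1", "OE2"),
}
POSITIVE = ("LYS", "ARG")
NEGATIVE = ("ASP", "GLU")

def is_ion_pair(res_pos, res_neg, cutoff=3.5):
    """True if any charged atom pair of the two residues lies within cutoff (Å)."""
    for name_p in CHARGED_ATOMS[res_pos["name"]]:
        for name_n in CHARGED_ATOMS[res_neg["name"]]:
            p = res_pos["atoms"].get(name_p)
            n = res_neg["atoms"].get(name_n)
            if p is not None and n is not None and math.dist(p, n) <= cutoff:
                return True
    return False

def count_ion_pairs(residues, cutoff=3.5):
    """Count ion pairs by type, e.g. 'Arg-Asp', over all positive/negative residue pairs."""
    counts = Counter()
    positives = [r for r in residues if r["name"] in POSITIVE]
    negatives = [r for r in residues if r["name"] in NEGATIVE]
    for pos in positives:
        for neg in negatives:
            if is_ion_pair(pos, neg, cutoff):
                counts[f"{pos['name'].title()}-{neg['name'].title()}"] += 1
    return counts

# Hypothetical toy residues (coordinates in Å); not taken from the CelK model.
toy = [
    {"name": "ARG", "atoms": {"NH1": (0.0, 0.0, 0.0)}},
    {"name": "ASP", "atoms": {"OD1": (3.0, 0.0, 0.0)}},
    {"name": "LYS", "atoms": {"NZ": (20.0, 0.0, 0.0)}},
    {"name": "GLU", "atoms": {"OE2": (20.0, 5.0, 0.0)}},
]
print(dict(count_ion_pairs(toy)))
```

Each residue pair is counted once regardless of how many atom pairs fall inside the cutoff, which matches the table's description of counting residue pairs rather than atom contacts.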

Phylogenetic analysis

The sequences of cellulases that were more than 30 % identical in amino acid sequence to the CelK catalytic domain were retrieved from the SwissProt database.37 A phylogenetic tree was constructed using the MOLPHY system.38 The evolutionary distances were calculated with the maximum-likelihood method,39 and the tree topology was deduced with the neighbor-joining method.40 The same methods were also used for bootstrap reconstruction of the tree. Ancestral sequence deduction was done with the method previously used for the analysis of alkaline protease, which is based on an exhaustive search for the most parsimonious substitution pattern on the tree topology.25

Analysis of the amino acid balance and substitution pattern on the tree was done as follows. The two-sided paired-sample t-value was calculated as

$$ t = \frac{m_1 - m_2}{\left( \dfrac{h\,\{[h_1 - 1]V_1 + [h_2 - 1]V_2\}}{h_1 h_2 \{h - 2\}} \right)^{1/2}} $$

Here m1 and V1 are the mean and the variance of amino acid balance per 100 substitutions on the branch for the alkaline adaptation processes, respectively (Figure 4(a)), and h1 is the number of branches for adaptation processes (= 3). For the substitution pattern analysis, m1 and V1 are the mean and variance of the number of certain substitutions per 100 substitutions, respectively (Figure 4(b)). The corresponding values for the non-adaptive processes are m2, V2 and h2 (= 29), respectively, and h = h1 + h2. The 95 % confidence limit for 30 degrees of freedom is |t| < 2.042.41
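The t-value defined above has the form of a pooled-variance two-sample statistic, and it can be evaluated directly from the means, variances, and branch counts. The following is a minimal sketch: the function names are hypothetical, the argument names follow the symbols in the text (m1, V1, h1, m2, V2, h2), and the input values in the example are invented for illustration rather than taken from Figure 4.

```python
import math

# Critical value quoted in the text for 30 degrees of freedom (95 % limit).
T_CRIT_DF30 = 2.042

def t_value(m1, v1, h1, m2, v2, h2):
    """t = (m1 - m2) / sqrt(h * [(h1-1)*V1 + (h2-1)*V2] / (h1 * h2 * (h - 2)))."""
    h = h1 + h2
    denom_sq = h * ((h1 - 1) * v1 + (h2 - 1) * v2) / (h1 * h2 * (h - 2))
    return (m1 - m2) / math.sqrt(denom_sq)

def is_significant(t, t_crit=T_CRIT_DF30):
    """Two-sided check against the critical value."""
    return abs(t) >= t_crit

# Hypothetical inputs: 3 adaptive branches versus 29 non-adaptive branches.
t = t_value(m1=2.0, v1=1.0, h1=3, m2=0.0, v2=1.0, h2=29)
print(round(t, 3), is_significant(t))
```

With h1 = 3 and h2 = 29 the degrees of freedom are h − 2 = 30, which is why the critical value for 30 degrees of freedom applies.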

Atomic Coordinates

The coordinates of CelK and its complex with cellobiose were deposited with the Protein Data Bank. The accession codes are 1G01 and 1G0C, respectively.


Acknowledgments

This work was partly supported by Grants-in-Aid for Scientific Research (B)(1) (11556014) and Encouragement of Young Scientists (12780491) from the Ministry of Education, Science, Sports and Culture of Japan. Some of the Figures were prepared by using the Computer-Aided Design facility of Nagoya University Venture Business Laboratory. We also thank Dr S. S. S. Raj for critical reading of the manuscript.

References

1. Riisgaard, S. (1990). The enzyme industry and modern biotechnology. Genet. Eng. Biotechnol. 10, 11-14.
2. Guntelberg, A. V. & Ottesen, M. (1954). Purification of the proteolytic enzyme from Bacillus subtilis. Compt.-rend. Lab. Carlsberg, 29, 36-48.
3. Dambmann, C., Holm, P., Jenssen, V. & Nielsen, M. H. (1971). How enzymes got into detergents. Dev. Ind. Microbiol. 12, 11-23.
4. Aunstrup, K., Ottrup, H., Andersen, O. & Dambmann, C. (1972). Proteases from alkalophilic Bacillus species. Ferment. Technol. Today, 4, 299-305.
5. Ito, S., Kobayashi, T., Ara, K., Ozaki, K., Kawai, S. & Hatada, Y. (1998). Alkaline detergent enzymes from alkaliphiles: enzymatic properties, genetics, and structures. Extremophiles, 2, 185-190.
6. Hoshino, E. & Ito, S. (1997). Application of alkaline cellulases that contribute to soil removal in detergents. In Enzymes in Detergency (van Ee, J. H., Misset, O. & Baas, E. J., eds), pp. 149-174, Marcel Dekker, Inc., New York.
7. Murata, M., Hoshino, E., Yokosuka, M. & Suzuki, A. (1991). New detergent mechanism with use of novel alkaline cellulase. J. Am. Oil Chem. Soc. 68, 553-558.
8. Kobayashi, T., Hakamada, Y., Hitomi, J., Koike, K., Kawai, S. & Ito, S. (1995). Purification and properties of an alkaline protease from alkalophilic Bacillus sp. KSM-K16. Appl. Microbiol. Biotechnol. 43, 473-481.
9. Hakamada, Y., Kobayashi, T., Hitomi, J., Kawai, S. & Ito, S. (1994). Molecular cloning and nucleotide sequence of the gene for an alkaline protease from the alkalophilic Bacillus sp. KSM-K16. J. Ferment. Bioeng. 78, 105-108.
10. Yoshimatsu, T., Ozaki, K., Shikata, S., Ohta, Y., Koike, K., Kawai, S. & Ito, S. (1990). Purification and characterization of alkaline endo-1,4-β-glucanases from alkalophilic Bacillus sp. KSM-635. J. Gen. Microbiol. 136, 1973-1979.
11. Ozaki, K., Hayashi, Y., Sumitomo, N., Kawai, S. & Ito, S. (1995). Construction, purification, and properties of a truncated alkaline endoglucanase from Bacillus sp. KSM-635. Biosci. Biotechnol. Biochem. 59, 1613-1618.
12. Henrissat, B., Claeyssens, M., Tomme, P., Lemesle, L. & Mornon, J.-P. (1989). Cellulase families revealed by hydrophobic cluster analysis. Gene, 81, 83-95.
13. Henrissat, B. & Bairoch, A. (1993). New families in the classification of glycosyl hydrolases based on amino acid sequence similarities. Biochem. J. 293, 781-788.
14. Sakon, J., Adney, W. S., Himmel, M. E., Thomas, S. R. & Karplus, A. (1996). Crystal structure of thermostable family 5 endocellulase E1 from Acidothermus cellulolyticus in complex with cellotetraose. Biochemistry, 35, 10648-10660.
15. Davies, G. J., Dauter, M., Brzozowski, M., Bjornvad, M. E., Andersen, K. V. & Schülein, M. (1998). Structure of the Bacillus agaradherans family 5 endoglucanase at 1.6 Å and its cellobiose complex at 2.0 Å resolution. Biochemistry, 37, 1926-1932.
16. Davies, G. J., Mackenzie, L., Varrot, A., Dauter, M., Brzozowski, A. M., Schülein, M. & Withers, S. G. (1998). Snapshots along an enzymatic reaction coordinate: analysis of a retaining β-glycoside hydrolase. Biochemistry, 37, 11707-11713.
17. Varrot, A., Schülein, M. & Davies, G. J. (2000). Insight into ligand-induced conformational change in Cel5A from Bacillus agaradhaerens revealed by a catalytically active crystal form. J. Mol. Biol. 297, 819-828.
18. Dominguez, R., Souchon, H., Lascombe, M.-B. & Alzari, P. M. (1996). The crystal structure of a family 5 endoglucanase mutant in complexed and uncomplexed forms reveals an induced fit active mechanism. J. Mol. Biol. 257, 1042-1051.
19. Dominguez, R., Souchon, H., Spinelli, S., Dauter, Z., Wilson, K. S. & Chauvaux, S. et al. (1995). A common protein fold and similar active site in two distinct families of β-glycanases. Nature Struct. Biol. 2, 569-576.
20. Ducros, V., Czjzek, M., Belaich, A., Gaudin, C., Fierobe, H.-P. & Belaich, J.-P. et al. (1995). Crystal structure of the catalytic domain of a bacterial cellulase belonging to family 5. Structure, 3, 939-949.
21. Ozaki, K., Shikata, S., Kawai, S., Ito, S. & Okamoto, K. (1990). Molecular cloning and nucleotide sequence of a gene for alkaline cellulase from Bacillus sp. KSM-635. J. Gen. Microbiol. 136, 1327-1334.
22. Wang, Q., Tull, D., Meinke, A., Gilkes, N. R., Warren, R. A. J., Aebersold, R. & Withers, S. G. (1993). Glu280 is the nucleophile in the active site of Clostridium thermocellum CelC, a family A endo-β-1,4-glucanase. J. Biol. Chem. 268, 14096-14102.
23. McCarter, J. D. & Withers, S. G. (1994). Mechanisms of enzymatic glycoside hydrolysis. Curr. Opin. Struct. Biol. 4, 885-892.
24. Davies, G. J., Tolley, S. P., Henrissat, B., Hjort, C. & Schülein, M. (1995). Structures of oligosaccharide-bound forms of the endoglucanase V from Humicola insolens at 1.9 Å resolution. Biochemistry, 34, 16210-16220.
25. Shirai, T., Suzuki, A., Yamane, T., Ashida, T., Kobayashi, T., Hitomi, J. & Ito, S. (1997). High-resolution crystal structure of M-protease: phylogeny aided analysis of the high-alkaline adaptation mechanism. Protein Eng. 10, 627-634.
26. Day, M. W., Hsu, B.-T., Joshua-Tor, L., Park, J. B., Zhou, Z. H., Adams, M. W. & Rees, D. C. (1992). X-ray crystal structure of the oxidized and reduced forms of the rubredoxin from the marine hyperthermophilic archaebacterium Pyrococcus furiosus. Protein Sci. 1, 1494-1507.
27. Ishikawa, K., Okumura, M., Katayanagi, K., Kimura, S., Kanaya, S., Nakamura, H. & Morikawa, K. (1993). Crystal structure of ribonuclease H from Thermus thermophilus HB8 refined at 2.8 Å resolution. J. Mol. Biol. 230, 529-542.
28. Yip, K. S. P., Stillman, T. J., Britton, K. L., Artymiuk, P. J., Baker, P. J., Sedelnikova, S. E., et al. (1995). The structure of Pyrococcus furiosus glutamate dehydrogenase reveals a key role for ion-pair networks in maintaining enzyme stability at extreme temperatures. Structure, 3, 1147-1158.
29. Korndörfer, I., Steipe, B., Huber, R., Tomschy, A. & Jaenicke, R. (1995). The crystal structure of holo-glyceraldehyde-3-phosphate dehydrogenase from the hyperthermophilic bacterium Thermotoga maritima at 2.5 Å resolution. J. Mol. Biol. 246, 511-521.
30. Dym, O., Mevarech, M. & Sussman, J. L. (1995). Structural features that stabilize halophilic malate dehydrogenase from an archaebacterium. Science, 267, 1344-1346.
31. Shirai, T., Yamane, T., Hidaka, T., Kuyama, K., Suzuki, A., Ashida, T., Ozaki, K. & Ito, S. (1997). Crystallization and preliminary X-ray analysis of a truncated family A alkaline endoglucanase isolated from Bacillus sp. KSM-635. J. Biochem. 122, 683-685.
32. Otwinowski, Z. (1993). Oscillation data reduction program. In Proceedings of the CCP4 Study Weekend: Data Collection and Processing (Sawyer, L., Isaacs, N. & Bailey, S., eds), pp. 56-62, SERC Daresbury Laboratory, UK.
33. Navaza, J. (1994). AMoRe: an automated package for molecular replacement. Acta Crystallog. sect. A, 50, 157-163.
34. Brünger, A. T. (1992). X-PLOR Version 3.1. A System for X-ray Crystallography and NMR, Yale University Press, New Haven and London.
35. Cambillau, C. (1992). Turbo-FRODO, Molecular Graphics Program for Silicon Graphics IRIS 4D Series, Version 3.0, Bio-Graphics, Marseille, France.
36. Laskowski, R. A. (1993). PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallog. 26, 283-291.
37. Bairoch, A. & Apweiler, R. (2000). The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucl. Acids Res. 28, 45-48.
38. Adachi, J. & Hasegawa, M. (1992). Molphy: programs for molecular phylogenetics, I. PROTML: maximum likelihood inference of protein phylogeny, Computer Science Monographs, No. 27, Institute of Statistical Mathematics, Tokyo.
39. Kishino, H., Miyata, T. & Hasegawa, M. (1990). Maximum likelihood inference of protein phylogeny and the origin of chloroplasts. J. Mol. Evol. 31, 151-160.
40. Saitou, N. & Nei, M. (1987). The neighbor joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406-425.
41. Campbell, R. C. (1974). Statistics for Biologists, 2nd edit., Cambridge University Press, England.

Edited by R. Huber (Received 9 January 2001; received in revised form 5 June 2001; accepted 5 June 2001)