Nucleotide sequence of gene celM encoding a new endoglucanase (CelM) of Clostridium thermocellum and purification of the enzyme



JOURNAL OF FERMENTATION AND BIOENGINEERING Vol. 76, No. 4, 251-256. 1993

TOHRU KOBAYASHI,1§ MAREK P. M. ROMANIEC,1,2† PATRICK J. BARKER,3 ULF T. GERNGROSS,1 AND ARNOLD L. DEMAIN1*

Fermentation Microbiology Laboratory, Department of Biology, Massachusetts Institute of Technology, Cambridge, MA 02139, USA,1 Biological Laboratory, University of Kent, Canterbury, Kent, CT2 7NJ, UK,2 and Institute of Animal Physiology and Genetics Research, Babraham Hall, Babraham, Cambridge, CB2 4AT, UK3

Received 12 April 1993/Accepted 30 June 1993

The nucleotide sequence of a new endoglucanase gene (celM) of Clostridium thermocellum was determined. The structural gene contains an open reading frame of 978 bp starting with an ATG codon and terminating in a TGA stop codon. The deduced amino acid sequence of endoglucanase M (CelM) contains 325 amino acids (MW 35,186) and its N-terminal amino acid sequence is identical to that of the purified enzyme. The codon usage of the gene is similar to that of other endoglucanase genes in the same organism. The endoglucanase was purified from Escherichia coli with a 200-fold increase in specific activity. The optimum pH and temperature of the enzyme for carboxymethylcellulose (CMC) degradation are 5.5 to 6.5 and 60°C, respectively.

The thermophilic, strictly anaerobic, cellulolytic bacterium Clostridium thermocellum produces an extracellular cellulase-hemicellulase complex which is termed the cellulosome (1). The cellulosome is composed of at least 14 polypeptide subunits (1, 2) and is known to be denatured by SDS. Reflecting the large number of subunits, many genes encoding cellulolytic and hemicellulolytic enzymes from C. thermocellum have been cloned, i.e. at least 22 endoglucanase genes (3, 4), four xylanase genes (5, 6), two lichenase genes (7, 8), two exoglucanase genes (9), and two β-glucosidase genes (3, 10-12). Among them, seven endoglucanase genes, celA to celF and celH (13-19), one xylanase gene, xynZ (5), two lichenase genes, laml (7) and licB (8), and two β-glucosidase genes, bglA (20) and bglB (21), have been sequenced, and some of the corresponding proteins have been purified from Escherichia coli (15, 17, 22-26). In the present work, we describe the nucleotide sequence of a new endoglucanase gene of C. thermocellum and purification of the enzyme from E. coli. The gene encoding the endoglucanase was isolated from a genomic library of C. thermocellum DNA in a lambda gt11 phage vector.

MATERIALS AND METHODS

Bacterial strain, growth conditions and vectors

E. coli JM101 (27) was used as the host strain and was grown aerobically in LB broth at 37°C. When antibiotic was necessary, ampicillin (100 μg/ml) was added to the media. 2×YT medium and H-agar medium (28) were used for the experiments involving DNA sequence determination. The vectors used were pUC8 (29) for cloning the gene, and M13mp18 and M13mp19 phages (30) for the DNA sequence determination. Plasmid pSC3/8 is a derivative of pUC8 and contains a 3.2 kbp EcoRI fragment of C. thermocellum DNA. The recombinant protein from pSC3/8 reacts with antibody to the J3 component of the subcellulosome (31, 32) due to a non-specific cross-reaction (see Results and Discussion). Plasmid pSC3.01/8, which was subcloned from pSC3/8, contains a 2.1 kbp PstI-EcoRI fragment. The preparation of DNA, construction of a genomic library, preparation of the polyclonal antibodies to the J3 subunit and screening of the clones have already been described (32).

Nucleotide sequence analysis

The nucleotide sequence was determined by the dideoxy chain termination method (33). The PstI-EcoRI fragment was subcloned in M13mp18 and M13mp19 phages using restriction fragments produced by restriction endonucleases BamHI, SmaI, EcoRI and PstI. Subclones were also generated using the Cyclone I Biosystem kit with T4 DNA polymerase (International Biotechnologies Inc., New Haven, CT, USA); single-stranded DNAs were then purified. The DNA synthesis reaction with single-stranded DNA was performed using the T7Sequencing™ kit with T7 DNA polymerase (Pharmacia LKB Biotechnology, Piscataway, NJ, USA) and α-35S-ddATP (Amersham Co., Arlington Heights, IL, USA) as recommended by the suppliers. The entire nucleotide sequence was determined in both directions. The sequence has been submitted to GenBank and has been assigned the accession number L13461.

* Corresponding author.
§ Permanent address: Tochigi Research Laboratories, Kao Corporation, 2606 Akabane, Ichikai, Haga, Tochigi 321-34, Japan.
† The other authors dedicate this paper to the memory of Marek Romaniec, who died at the untimely age of 31 on Nov. 13, 1992. The gene is named celM and the enzyme endoglucanase M (CelM) in Marek's honor. His entire scientific life was devoted to the study of the cellulase complex of C. thermocellum.

Purification of recombinant endoglucanase CelM

Unless otherwise stated, operations were performed at 4°C. E. coli JM101, harbouring plasmid pSC3.01/8, was grown aerobically at 37°C for 16 h in 200 ml of LB medium supplemented with 100 μg of ampicillin per ml. Cells were collected by centrifugation (5,000 x g for 5 min) and washed with chilled saline (0.85%, w/v). Harvested cells (wet weight 1.2 g) were suspended in 10 ml of 10 mM Tris-HCl buffer (pH 7.0) containing 1 mM each of CaCl2 and 2-mercaptoethanol (basal buffer). The cell suspension was sonicated on ice for four intervals of one minute each with a Sonifier cell disruptor 200 (Branson Sonic Power Co., Danbury, CT, USA) at 40% output. After removing the cell debris (14,000 x g for 15 min), the supernatant fluid was incubated at 60°C for 10 min. Heat-denatured protein was removed by centrifugation (14,000 x g for 15 min), and 5.6 g of ammonium sulfate was added to the supernatant fluid (80% saturation). The precipitate was collected by centrifugation (14,000 x g for 15 min) and dissolved in a small amount of basal buffer. The concentrate was dialyzed against one liter of basal buffer overnight.

The retentate (5 ml) was applied to a DEAE-Bio-Gel A column (1 x 8 cm) which had been equilibrated with basal buffer. The column was washed with basal buffer and protein was eluted with a linear gradient of KCl up to 0.15 M. Fractions (2.0 ml) were collected and examined by Western dot blot analysis using J3 antibody (32). Fractions containing recombinant protein were pooled and concentrated to 4.5 ml by ultrafiltration (Nova cell 10K, Pharmacia).

The concentrate was applied to a hydroxyapatite column (1.5 x 7.5 cm) which had been equilibrated with 1 mM phosphate buffer (pH 7.0) containing 1 mM 2-mercaptoethanol. The column was washed with equilibration buffer, followed by washing with 100 mM phosphate buffer containing 1 mM 2-mercaptoethanol. The protein was eluted with a linear gradient of phosphate buffer from 100 mM up to 400 mM containing 1 mM 2-mercaptoethanol. Fractions (2.0 ml) were collected and examined by Western dot blot analysis using J3 antibody. Recombinant protein-containing fractions were pooled, concentrated by ultrafiltration, and sequentially diluted and ultrafiltered several times with 1 mM phosphate buffer containing 1 mM 2-mercaptoethanol. The concentrate (7.5 ml) was applied to a second hydroxyapatite column (1.5 x 7.5 cm). The conditions of hydroxyapatite chromatography were the same as above. Pooling and concentration were repeated, and the concentrate was sequentially diluted and ultrafiltered several times with basal buffer. Glycerol was added to the concentrate (5.1 ml) to 10% (v/v) and the preparation was stored at -20°C.

Determination of NH2-terminal amino acid sequence

The NH2-terminal amino acid sequence of the endoglucanase from E. coli (pSC3.01/8) was determined by Edman degradation with a 470 gas phase sequencer (Applied Biosystems Inc., Foster City, CA, USA) (34). For the fractionation and detection of phenylthiohydantoin (PTH) derivatives of amino acids, a 120A PTH-amino acid analyser (Applied Biosystems) was used.

Enzyme assays

CMCase activity was measured by the method of Wu et al. (35) except that the incubation time was 5 h at 60°C. One mU of CMCase activity is defined as the amount of enzyme that liberates reducing sugar equivalent to 1 nmol of cellobiose per minute under the assay conditions. CMCase activity was also measured by a viscometric assay according to the method of Fauth et al. (36). Xylanase activity and activities hydrolyzing p-nitrophenyl-β-D-glucoside and p-nitrophenyl-β-D-cellobioside were measured as described previously (31, 32), except that the incubation time was 5 h at 60°C. Avicelase activity was measured by the turbidity method at 60°C (37).

Protein determination

Protein was determined by the method of Bradford (38) with bovine serum albumin as standard.
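The unit definition given for the CMCase assay can be written out as a small calculation. The following is a minimal sketch, assuming the reducing sugar released has already been converted to nmol of cellobiose equivalents; the function names and the sample reading of 600 nmol are hypothetical and do not come from the paper.

```python
"""Sketch of the CMCase unit definition; helper names and sample values are hypothetical."""

# Assay incubation time stated in the text: 5 h, expressed in minutes.
INCUBATION_MIN = 5 * 60


def cmcase_activity_mu(nmol_cellobiose_equiv: float, minutes: float = INCUBATION_MIN) -> float:
    """Return activity in mU: nmol of cellobiose equivalents liberated per minute."""
    if minutes <= 0:
        raise ValueError("incubation time must be positive")
    return nmol_cellobiose_equiv / minutes


def specific_activity(activity_mu: float, protein_mg: float) -> float:
    """Return specific activity in mU per mg of protein."""
    if protein_mg <= 0:
        raise ValueError("protein amount must be positive")
    return activity_mu / protein_mg


if __name__ == "__main__":
    released_nmol = 600.0  # hypothetical reading, not a value from the paper
    activity = cmcase_activity_mu(released_nmol)
    print(f"activity: {activity:.2f} mU")
    print(f"specific activity: {specific_activity(activity, 0.5):.2f} mU/mg")
```

Dividing by the full 300 min of incubation assumes that sugar release is linear over the whole assay period, which is an assumption of this sketch rather than something the text states.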


RESULTS AND DISCUSSION

Nucleotide sequence of celM

The nucleotide sequence of the 2.12 kbp PstI-EcoRI fragment containing one open reading frame and the deduced amino acid sequence are shown in Fig. 1. ATG codons in this open reading frame occur at nucleotide positions (n.p.) -15 and +1. The latter ATG is preceded by a strong potential Shine-Dalgarno (SD) sequence (39). The start codon ATG and the reading frame were verified by determining the N-terminal amino acid sequence of the endoglucanase isolated from E. coli, i.e. Met-Phe-Asp-Leu-Leu-Lys-Lys-Phe-Thr-Gly-Ile-Val-Gly-Val-Ser-Gly, which is identical to that of the deduced amino acid sequence. While the N-formylmethionine residue of the protein is deformylated in E. coli, removal of the N-terminal amino acid does not take place. This phenomenon was also observed in the case of recombinant endoglucanase C of C. thermocellum when it was expressed in E. coli (15). It is known that the methionine aminopeptidase of E. coli cannot cleave methionine from the polypeptide when it is followed by Phe, Lys, Arg, Leu, Ile, or Asn (40).

The open reading frame of celM contained 978 bp, which started from the ATG codon (n.p. +1) and ended with a TGA codon (n.p. +976). This indicates that the endoglucanase consists of 325 amino acids and has a molecular weight of 35,186. Comparison of the deduced amino acid sequence of endoglucanase M with other proteins, including the other C. thermocellum enzymes, in the GenBank data base (Release 75), EMBL (European Molecular Biology Laboratory) gene data base (Release 32) and Swiss-Prot data base (Release 23) showed that CelM has some homology to CelH (19) and a short region of similarity to CelC (15) of C. thermocellum. The sequence of amino acids 108 to 211 of CelM exhibited 22.0% homology to that of residues 658 to 762 of CelH. The sequence of amino acids 174 to 186 of CelM exhibited similarity to that of 310 to 323 of CelC; seven amino acids in this region are identical. Although both CelH and CelC belong to cellulase Family A (41, 42), there is no homology between CelM and the other Family A cellulases, such as CelB of C. thermocellum (14) and the endoglucanases of C. acetobutylicum (43), C. cellulolyticum (44), Butyrivibrio fibrisolvens (45) and Trichoderma reesei (46). The greatest degree of homology between CelM and other proteins was found in the sequence of amino acids 59 to 107 of CelM and that of 340 to 388 of cyclodextrin glucanotransferase from Bacillus macerans (47, 48). Fifteen amino acids in this region are identical.

The N-terminal amino acid sequence of CelM contains two adjacent lysine residues (amino acids 6 and 7) followed by a hydrophobic amino acid sequence (amino acids 8 to 14) which could be a signal peptide. It would be unlikely for E. coli to cleave the leader peptide of a C. thermocellum protein (49). Comparison with other signal peptide sequences and their cleavage sites (15 to 40 amino acids in length with a positively charged N-terminal region) (50, 51) suggests that cleavage of the CelM signal peptide in C. thermocellum could occur to the right of serine, amino acid number 15. However, it is difficult to determine the cleavage site of the signal peptide because there are at least two possibilities. One is cleavage to the right of serine (amino acid 15). In this case (Gly-X-Ser↓), the cleavage would be the same as that of the signal peptide of bacteriorhodopsin from Halobacterium halobium, or elastase I from rat (51).
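The arithmetic linking the 978 bp open reading frame to 325 amino acids, and the positional statements about the N-terminal sequence, can be checked with a short script. This is a minimal sketch using only values quoted in the text; the function and constant names are illustrative, and the set of residues that block methionine removal is simply the list the text attributes to reference 40.

```python
"""Sketch: checks on the celM ORF length and the reported N-terminal sequence."""

ORF_LENGTH_BP = 978  # ATG at n.p. +1 through the end of the TGA stop codon

# N-terminal sequence of the purified enzyme, as reported in the text.
N_TERMINAL = "Met-Phe-Asp-Leu-Leu-Lys-Lys-Phe-Thr-Gly-Ile-Val-Gly-Val-Ser-Gly".split("-")

# Second residues after which E. coli methionine aminopeptidase does not
# remove the initiator Met, as listed in the text (40).
MET_RETAINED_BEFORE = {"Phe", "Lys", "Arg", "Leu", "Ile", "Asn"}


def orf_summary(orf_length_bp: int) -> dict:
    """Codon count, residue count (stop excluded) and first base of the stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_length_bp // 3
    return {
        "codons": codons,
        "amino_acids": codons - 1,              # the stop codon encodes no residue
        "stop_codon_start": orf_length_bp - 2,  # 1-based position of the T in TGA
    }


def initiator_met_retained(residues: list) -> bool:
    """True if the sequence starts with Met followed by a blocking residue."""
    return residues[0] == "Met" and residues[1] in MET_RETAINED_BEFORE


def positions_of(residues: list, name: str) -> list:
    """1-based positions at which a residue occurs."""
    return [i for i, r in enumerate(residues, start=1) if r == name]


if __name__ == "__main__":
    print(orf_summary(ORF_LENGTH_BP))
    print("Met retained:", initiator_met_retained(N_TERMINAL))
    print("Lys positions:", positions_of(N_TERMINAL, "Lys"))
    print("Residues 13-15:", N_TERMINAL[12:15])
```

The script reproduces the 325-residue count, the stop codon beginning at n.p. +976, the lysines at positions 6 and 7, and the Gly-X-Ser motif ending at serine 15.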

TABLE 1. Summary of purification of the endoglucanase

Step                  Total protein (mg)   Total CMCase activity (mU)   Specific activity (mU/mg)   Fold
Crude extract         118                  2.1                          0.18                        1.0
Heat treatment        40                   2.1                          0.52                        2.9
DEAE-Bio-Gel A        6.1                  12.2                         2.0                         11.1
1st hydroxyapatite    3.1                  18.9                         5.1                         28.3
2nd hydroxyapatite    1.9                  70.0                         36.8                        204.4
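The fold-purification column of Table 1 follows from dividing each step's specific activity by that of the crude extract. The sketch below recomputes it from the specific-activity values as printed in the table (they are taken as given, not recalculated from the protein and activity columns); the tuple layout and function names are illustrative.

```python
"""Sketch: recomputing the fold-purification column of Table 1 from printed values."""

# (step, total protein in mg, total CMCase activity in mU, specific activity in mU/mg)
STEPS = [
    ("Crude extract", 118.0, 2.1, 0.18),
    ("Heat treatment", 40.0, 2.1, 0.52),
    ("DEAE-Bio-Gel A", 6.1, 12.2, 2.0),
    ("1st hydroxyapatite", 3.1, 18.9, 5.1),
    ("2nd hydroxyapatite", 1.9, 70.0, 36.8),
]


def fold_purification(steps: list) -> list:
    """Specific activity of each step relative to the first (crude) step."""
    crude_specific = steps[0][3]
    return [round(step[3] / crude_specific, 1) for step in steps]


def total_activity_ratio(steps: list) -> float:
    """Total activity of the final step relative to the crude extract."""
    return steps[-1][2] / steps[0][2]


if __name__ == "__main__":
    for (name, *_), fold in zip(STEPS, fold_purification(STEPS)):
        print(f"{name:<20} {fold:>6.1f}-fold")
    print(f"final/crude total activity: {total_activity_ratio(STEPS):.1f}")
```

The ratio of final to crude total activity evaluates to about 33, matching the increase in total activity described in the text.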

murine γ-interferon (51).

Putative promoter sequences TAAAAT and TTGCAC, similar to those of the Bacillus subtilis σ* promoter (52) and the E. coli σ70 promoter -10 and -35 regions, are at n.p. -34 to -59 with an intervening distance between the regions of 19 bp. Other possibilities are TAAAAT (n.p. -44) and TTGAAA (n.p. -73), or TTAAAT (n.p. -64) and TTGACA (n.p. -94). The distances between the -10 and -35 sequences of these possibilities are 23 and 24 bp, respectively. These distances are longer than that of the first putative promoter sequence (19 bp) and those of E. coli (15 to 21 bp, 53) and B. subtilis (16 to 19 bp, 52). However, the distances between the promoter sequences of the C. thermocellum celA gene are 22 or 25 bp (54). S1 mapping will be required for a definitive location of the promoter sequence.

The codon usage in the translation of celM mRNA is similar to that reported previously for other C. thermocellum endoglucanase genes (14, 16, 17). The preferred codon usage of celM is different from that of genes encoding E. coli proteins, where AGA (Arg), AGG (Arg), AUA (Ile) and GGA (Gly) are only rarely used (55).

Endoglucanase M contains seven cysteine residues, which leaves at least one free SH group. It does not contain tryptophan at all. Some proteins, such as penicillinase (56) and protein A (57) from Staphylococcus aureus, do not contain tryptophan. However, the amino acid sequences of the other endoglucanases of C. thermocellum do contain tryptophan residues (13-19), and tryptophan is conserved in bacterial cellulose-binding domains (41). Furthermore, many of the other clostridial endoglucanases have a reiterated region in their amino acid sequence (13, 14, 16-19), but CelM does not. Endoglucanases A, B, E and F are rich in proline, serine and threonine near the C-terminus of the proteins (13, 14, 17, 18). A proline- and serine-rich region is present near the C-terminus of CelM (encoded by n.p. 829 to 933), but threonine residues are not frequent in this region. CelM does not contain repeated sequences or putative linkers and is similar to CelC of C. thermocellum (15) and an endoglucanase of Cellulomonas uda (58) in this respect. We conclude that CelM is a new type of clostridial endoglucanase.

Purification and some properties of endoglucanase M

A summary of the CelM purification is shown in Table 1. About two thirds of the crude protein was denatured during heat treatment. CMCase activity was difficult to measure accurately in both the crude cell-free extract and the heat-treated sample due to extremely high blank values. However, after ion-exchange chromatography, the activity could be clearly detected by the reducing sugar assay. This phenomenon was also observed when the CMCase activity was measured by a viscometric assay. Total activity increased in each chromatography step, finally reaching a level 33 times higher than that of the crude cell-free extract. Some proteins or nucleic acids of E. coli might bind the endoglucanase and interfere with its CMCase activity. These polymers may be removed during purification, as the apparent CMCase specific activity also increased by about 200-fold. The specific activity of CelM is at least about 200-fold lower than the activities of the other purified enzymes of C. thermocellum (22-25), but 5- to 80-fold higher than those of crude extracts of some recombinant enzymes (4, 11). The degree of purification of CelM is at the same level as that of other purified C. thermocellum enzymes (22-25). Figure 2 shows SDS-PAGE of the purified endoglucanase, whose molecular weight is about 38,000, as compared to the figure of 35,186 from the deduced amino acid sequence.

The optimum pH of CelM is between 5.5 and 6.5 and its optimum temperature is 60°C. Since endoglucanases decrease the degree of polymerization of CMC, causing a drop in viscosity, the activity of CelM was also measured by viscometric assay. CelM (6 μg) reduced the viscosity of CMC by 50% in 1 h at 60°C. The endoglucanase was unable to degrade xylan, p-nitrophenyl-β-D-glucoside, p-nitrophenyl-β-D-cellobioside or Avicel. CaCl2 (5 mM), MgCl2 (5 mM) and NaCl (100 mM) showed no effect, whereas the sulfhydryl reagents monoiodoacetate (5 mM), N-ethylmaleimide (5 mM) and p-chloromercuribenzoate (0.1 mM) showed inhibition values of 10%, 32% and 73%, respectively. These results suggest that a cysteine residue is located in the active site of the enzyme. This is supported by the deduced amino acid sequence of the endoglucanase, which leaves at least one free SH group.

The cloning and purification procedures were based on the immunological reaction of the endoglucanase and the J3 antibody; however, we know that gene celM is different from that which encodes the J3 component of the subcellulosome. We believe the reaction was due to a non-specific immunological reaction. Our reasoning is as follows. (i) We noted that the N-terminal amino acid sequence, molecular weight and enzymatic properties of endoglucanase M differ from those of J3. The N-terminal amino acid sequence of J3 is Gly-Pro-Thr-Lys-Ala-Pro-Thr-Lys (data not shown), its molecular weight is about 84 kDa (36) and its endoglucanase activity is strongly inhibited by MgCl2 and NaCl. (ii) Although the J3 subunit after SDS-PAGE specifically reacted with J3 antibody (36), the recombinant endoglucanase reacted with J3 antibody when a Western dot blot analysis was performed but not after SDS-PAGE. (iii) We consider the J3 component of the subcellulosome to be equivalent to the Ss component of the cellulosome (31), and the recent work of Wang et al. (59) on the sequence of Ss shows that it has no homology to the sequence of our endoglucanase gene. Thus a non-specific cross-reaction led to the isolation and sequencing of a new endoglucanase gene, not that of Ss/J3.

FIG. 2. SDS-PAGE of the purified endoglucanase. Electrophoresis was performed in a 10% polyacrylamide gel in the presence of 0.1% SDS according to the method of Laemmli (60). Lane 1, 5 μg of the endoglucanase; lane 2, molecular weight markers: phosphorylase B, 94 kDa; bovine serum albumin, 67 kDa; ovalbumin, 43 kDa; carbonic anhydrase, 30 kDa. Insert: A, Western dot blot of crude extract of E. coli (pUC8); B, Western dot blot of purified endoglucanase.

ACKNOWLEDGMENTS

This work was funded by National Science Foundation Grant BBS-8711725 and U.S. Army Research Office Contract DAAL0388-K-0064. Support for TK was provided by the Kao Corporation. MPMR was a Merck Postdoctoral Fellow. UTG was a Fellow of the Austrian Ministry for Science & Research. We thank Dr. Daniel Liberman for encouragement and advice. We also thank Drs. Susumu Ito and Katsuya Ozaki for useful discussions.

REFERENCES

1. Lamed, R., Setter, E., Kenig, R., and Bayer, E. A.: The cellulosome - a discrete cell surface organelle of Clostridium thermocellum which exhibits separate antigenic, cellulose-binding and various cellulolytic activities. Biotechnol. Bioeng. Symp., No. 13, 163-181 (1983).
2. Mayer, F., Coughlan, M. P., Mori, Y., and Ljungdahl, L. G.: Macromolecular organization of the cellulolytic enzyme complex of Clostridium thermocellum as revealed by electron microscopy. Appl. Environ. Microbiol., 53, 2785-2792 (1987).
3. Hazlewood, G. P., Romaniec, M. P. M., Davidson, K., Grepinet, O., Beguin, P., Millet, J., Raynaud, O., and Aubert, J.-P.: A catalogue of Clostridium thermocellum endoglucanase, β-glucosidase, and xylanase genes cloned in Escherichia coli. FEMS Microbiol. Lett., 51, 231-236 (1988).
4. Sakka, K., Furuse, S., and Shimada, K.: Cloning and expression in Escherichia coli of thermophilic Clostridium sp. F1 genes related to cellulose hydrolysis. Agric. Biol. Chem., 53, 905-910 (1989).
5. Grepinet, O., Chebrou, M.-C., and Beguin, P.: Nucleotide sequence and deletion analysis of the xylanase gene (xynZ) of Clostridium thermocellum. J. Bacteriol., 170, 4582-4588 (1988).
6. MacKenzie, C. R., Yang, R. C. A., Patel, G. B., Bilous, D., and Narang, S. A.: Identification of three distinct Clostridium thermocellum xylanase genes by molecular cloning. Arch. Microbiol., 152, 377-381 (1989).
7. Zverlov, V. V., Laptev, D. A., Tishkov, V. I., and Velikodvorskaya, G. A.: Nucleotide sequence of the Clostridium thermocellum laminarinase gene. Biochem. Biophys. Res. Commun., 181, 507-512 (1991).
8. Schimming, S., Schwarz, W. H., and Staudenbauer, W. L.: Structure of the Clostridium thermocellum gene licB and the encoded β-1,3-1,4-glucanase; a catalytic region homologous to Bacillus lichenase joined to the reiterated domain of clostridial cellulases. Eur. J. Biochem., 204, 13-19 (1992).
9. Tuka, K., Zverlov, V. V., Bumazkin, B. K., Velikodvorskaya, G. A., and Strongin, A. Y.: Cloning and expression of Clostridium thermocellum genes coding for thermostable exoglucanases (cellobiohydrolases) in Escherichia coli cells. Biochem. Biophys. Res. Commun., 169, 1055-1060 (1990).
10. Grabnitz, F. and Staudenbauer, W. L.: Characterization of two β-glucosidase genes from Clostridium thermocellum. Biotechnol. Lett., 10, 73-78 (1988).
11. Romaniec, M. P. M., Clarke, S. G., and Hazlewood, G. P.: Molecular cloning of Clostridium thermocellum DNA and the expression of further novel endo-β-1,4-glucanase genes in Escherichia coli. J. Gen. Microbiol., 133, 1297-1307 (1987).
12. Kadam, S., Demain, A. L., Millet, J., Beguin, P., and Aubert, J.-P.: Molecular cloning of a gene for a thermostable β-glucosidase from Clostridium thermocellum into Escherichia coli. Enzyme Microb. Technol., 10, 9-13 (1988).
13. Beguin, P., Cornet, P., and Aubert, J.-P.: Sequence of a cellulase gene of the thermophilic bacterium Clostridium thermocellum. J. Bacteriol., 162, 102-105 (1985).
14. Grepinet, O. and Beguin, P.: Sequence of the cellulase gene of Clostridium thermocellum coding for endoglucanase B. Nucleic Acids Res., 14, 1791-1799 (1986).
15. Schwarz, W. H., Schimming, S., Rucknagel, K. P., Burgschwaiger, S., Kreil, G., and Staudenbauer, W. L.: Nucleotide sequence of the celC gene encoding endoglucanase C of Clostridium thermocellum. Gene, 63, 23-30 (1988).
16. Joliff, G., Beguin, P., and Aubert, J.-P.: Nucleotide sequence of the cellulase gene celD encoding endoglucanase D of Clostridium thermocellum. Nucleic Acids Res., 14, 8605-8613 (1986).
17. Hall, J., Hazlewood, G. P., Barker, P. J., and Gilbert, H. J.: Conserved reiterated domains in Clostridium thermocellum endoglucanases are not essential for catalytic activity. Gene, 69, 29-38 (1988).
18. Navarro, A., Chebrou, M.-C., Beguin, P., and Aubert, J.-P.: Nucleotide sequence of the cellulase gene celF of Clostridium thermocellum. Res. Microbiol., 142, 927-936 (1991).
19. Yague, E., Beguin, P., and Aubert, J.-P.: Nucleotide sequence and deletion analysis of the cellulase-encoding gene celH of Clostridium thermocellum. Gene, 89, 61-67 (1990).
20. Grabnitz, F., Seiss, M., Rucknagel, K. P., and Staudenbauer, W. L.: Structure of the β-glucosidase gene bglA of Clostridium thermocellum. Eur. J. Biochem., 200, 301-309 (1991).
21. Grabnitz, F., Rucknagel, K. P., Seiss, M., and Staudenbauer, W. L.: Nucleotide sequence of the Clostridium thermocellum bglB gene encoding thermostable β-glucosidase B: homology to fungal β-glucosidase. Mol. Gen. Genet., 217, 70-76 (1989).
22. Schwarz, W. H., Grabnitz, F., and Staudenbauer, W. L.: Properties of a Clostridium thermocellum endoglucanase produced in Escherichia coli. Appl. Environ. Microbiol., 51, 1293-1299 (1986).
23. Beguin, P., Cornet, P., and Millet, J.: Identification of the endoglucanase encoded by celB gene of Clostridium thermocellum. Biochimie, 65, 495-500 (1983).
24. Petre, D., Millet, J., Longin, R., Beguin, P., Girard, H., and Aubert, J.-P.: Purification and properties of the endoglucanase C of Clostridium thermocellum produced in Escherichia coli. Biochimie, 68, 687-695 (1986).
25. Joliff, G., Beguin, P., Juy, M., Millet, J., Ryter, A., Poljak, R., and Aubert, J.-P.: Isolation, crystallization and properties of a new cellulase of Clostridium thermocellum overproduced in Escherichia coli. Bio/Technology, 4, 896-900 (1986).
26. Grepinet, O., Chebrou, M.-C., and Beguin, P.: Purification of Clostridium thermocellum xylanase Z expressed in Escherichia coli and identification of the corresponding product in the culture medium of C. thermocellum. J. Bacteriol., 170, 4576-4581 (1988).
27. Yanisch-Perron, C., Vieira, J., and Messing, J.: Improved M13 phage cloning vectors and host strains: nucleotide sequences of the M13mp18 and pUC19 vectors. Gene, 33, 103-119 (1985).
28. Miller, J. H.: Experiments in molecular genetics, p. 433. Cold Spring Harbor Laboratory, Cold Spring Harbor, New York (1972).
29. Vieira, J. and Messing, J.: The pUC plasmids, an M13mp7-derived system for insertion mutagenesis and sequencing with synthetic universal primers. Gene, 19, 259-268 (1982).
30. Norrander, J., Kempe, T., and Messing, J.: Construction of improved M13 vectors using oligodeoxynucleotide-directed mutagenesis. Gene, 26, 101-106 (1983).
31. Kobayashi, T., Romaniec, M. P. M., Fauth, U., and Demain, A. L.: Subcellulosome preparation with high cellulase activity from Clostridium thermocellum. Appl. Environ. Microbiol., 56, 3040-3046 (1990).
32. Kobayashi, T., Romaniec, M. P. M., Fauth, U., Barker, P. J., and Demain, A. L.: Cloning and expression in Escherichia coli of Clostridium thermocellum DNA encoding subcellulosomal proteins. Enzyme Microb. Technol., 14, 447-453 (1992).
33. Sanger, F., Nicklen, S., and Coulson, A. R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA, 74, 5463-5467 (1977).
34. Hunkapiller, M. W., Hewick, R. M., Dreyer, W. J., and Hood, L. E.: High-sensitivity sequencing with a gas-phase sequenator, p. 399-413. In Hirs, C. H. W. and Timasheff, S. N. (ed.), Methods in enzymology, vol. 91. Academic Press, New York (1983).
35. Wu, J. H. D., Orme-Johnson, W. H., and Demain, A. L.: Two components of an extracellular protein aggregate of Clostridium thermocellum together degrade crystalline cellulose. Biochemistry, 27, 1703-1709 (1988).
36. Fauth, U., Romaniec, M. P. M., Kobayashi, T., and Demain, A. L.: Purification and characterization of endoglucanase Ss from Clostridium thermocellum. Biochem. J., 279, 67-73 (1991).
37. Johnson, E. A. and Demain, A. L.: Probable involvement of sulfhydryl groups and a metal as essential components of the cellulase of Clostridium thermocellum. Arch. Microbiol., 137, 135-138 (1984).
38. Bradford, M. M.: A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein-dye binding. Anal. Biochem., 72, 248-254 (1976).
39. Band, L. and Henner, D. J.: Bacillus subtilis requires a "stringent" Shine-Dalgarno region for gene expression. DNA, 3, 17-21 (1984).
40. Ben-Bassat, A. and Bauer, K.: Amino-terminal processing of proteins. Nature, 326, 315 (1987).
41. Beguin, P.: Molecular biology of cellulose degradation, 44, p. 219-248. In Ornston, L. N., Balows, A., and Greenberg, E. P. (ed.), Annu. Rev. Microbiol. Annual Reviews Inc., California (1990).
42. Gilkes, N. R., Henrissat, B., Kilburn, D. G., Miller, R. C. Jr., and Warren, R. A. J.: Domains in microbial β-1,4-glycanases: sequence conservation, function, and enzyme families. Microbiol. Rev., 55, 303-315 (1991).
43. Zappe, H., Jones, W. A., Jones, D. T., and Woods, D. R.: Structure of an endo-β-1,4-glucanase gene from Clostridium acetobutylicum P262 showing homology with endoglucanase genes from Bacillus spp. Appl. Environ. Microbiol., 54, 1289-1292 (1988).
44. Faure, E., Belaich, A., Bagnara, C., Guadin, C., and Belaich, J.-P.: Sequence analysis of the Clostridium cellulolyticum celCCA endoglucanase. Gene, 65, 51-58 (1990).
45. Berger, E., Jones, W. A., Jones, D. T., and Woods, D. R.: Cloning and sequencing of an endoglucanase (end1) gene from Butyrivibrio fibrisolvens H17c. Mol. Gen. Genet., 219, 193-198 (1989).
46. Saloheimo, M., Lehtovaara, P., Penttila, M., Teeri, T. T., Stahlberg, J., Johansson, G., Pettersson, G., Claeyssens, M., Tomme, P., and Knowles, J. C.: EGIII, a new endoglucanase from Trichoderma reesei: the characterization of both gene and enzyme. Gene, 63, 11-21 (1988).
47. Takano, T., Fukuda, M., Monma, M., Kobayashi, S., Kainuma, K., and Yamane, K.: Molecular cloning, DNA nucleotide sequencing, and expression in Bacillus subtilis cells of the Bacillus macerans cyclodextrin glucanotransferase gene. J. Bacteriol., 166, 1118-1122 (1986).
48. Kimura, E., Kataoka, S., Ishi, Y., Takano, T., and Yamane, K.: Nucleotide sequence of the β-cyclodextrin glucanotransferase gene of alkalophilic Bacillus sp. strain 1011 and similarity of its amino acid sequence to those of α-amylase. J. Bacteriol., 169, 4399-4402 (1987).
49. Schwarz, W. H., Schimming, S., and Staudenbauer, W. L.: Degradation of barley β-glucan by endoglucanase C of Clostridium thermocellum. Appl. Microbiol. Biotechnol., 29, 25-31 (1988).
50. Perlman, D. and Halvorson, H. O.: A putative signal peptidase recognition site and sequence in eukaryotic and prokaryotic signal peptides. J. Mol. Biol., 167, 391-409 (1983).
51. Watson, M. E. E.: Compilation of published signal sequences. Nucleic Acids Res., 12, 5145-5164 (1984).
52. Losick, R. and Pero, J.: Cascades of sigma factors. Cell, 25, 582-584 (1981).
53. Hawley, D. K. and McClure, W. R.: Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res., 11, 2237-2249 (1983).
54. Beguin, P., Rocancourt, M., Chebrou, M.-C., and Aubert, J.-P.: Mapping of mRNA encoding endoglucanase A from Clostridium thermocellum. Mol. Gen. Genet., 202, 251-254 (1986).
55. Konigsberg, W. and Godson, G. N.: Evidence for use of rare codons in the dnaG gene and other regulatory genes of Escherichia coli. Proc. Natl. Acad. Sci. USA, 80, 687-691 (1983).
56. Ambler, R. P.: The amino acid sequence of Staphylococcus aureus penicillinase. Biochem. J., 151, 197-218 (1975).
57. Uhlen, M., Guss, B., Nilsson, B., Gatenbeck, S., Philipson, L., and Lindberg, M.: Complete sequence of the staphylococcal gene encoding protein A. J. Biol. Chem., 259, 1695-1702 (1984).
58. Nakamura, K., Misawa, N., and Kitamura, K.: Sequence of a cellulase gene of Cellulomonas uda CB4. J. Biotechnol., 4, 247-254 (1986).
59. Wang, W. K., Kruus, K., and Wu, J. H. D.: Cloning and DNA sequence for the gene coding for Clostridium thermocellum cellulase Ss (CelS) a major cellulosome component. J. Bacteriol., 175, 1293-1302 (1993).
60. Laemmli, U. K.: Cleavage of structural proteins during the assembly of the head of bacteriophage T4. Nature, 227, 680-685 (1970).