Gene, 63 (1988) 23-30 Elsevier GEN 02302
Nucleotide sequence of the celC gene encoding endoglucanase C of Clostridium thermocellum

(Recombinant DNA; cellulase; endo-1,4-β-glucanase; leader sequence; signal peptide; reiterated domain; phage λ vectors)
Wolfgang H. Schwarz a, Silke Schimming a, Karl P. Rücknagel b, Sylvia Burgschwaiger c, Günther Kreil c and Walter L. Staudenbauer a

a Institute for Microbiology, Technical University of Munich, D-8000 München 2 (F.R.G.), b Abteilung Braunitzer, Max Planck Institute for Biochemistry, D-8033 Martinsried (F.R.G.) Tel. (089)8578-2489, and c Institute for Molecular Biology, Austrian Academy of Science, A-5020 Salzburg (Austria) Tel. (06222)249-6121

Received 25 August 1987
Accepted 16 November 1987
Received by publisher 1 December 1987
SUMMARY
The nucleotide sequence of the cellulase gene celC, encoding endoglucanase C of Clostridium thermocellum, has been determined. The coding region of 1032 bp was identified by comparison with the N-terminal amino acid (aa) sequence of endoglucanase C purified from Escherichia coli. The ATG start codon is preceded by an AGGAGG sequence typical of ribosome-binding sites in Gram-positive bacteria. The derived amino acid sequence corresponds to a protein of Mr 40439. Amino acid analysis and apparent Mr of endoglucanase C are consistent with the amino acid sequence as derived from the DNA sequencing data. A proposed N-terminal 21-aa residue leader (signal) sequence differs from other prokaryotic signal peptides and is non-functional in E. coli. Most of the protein bears no resemblance to the endoglucanases A, B, and D of the same organism. However, a short region of homology between endoglucanases A and C was identified, which is similar to the established active sites of lysozymes and to related sequences of fungal cellulases.
INTRODUCTION
The thermophilic anaerobic bacterium C. thermocellum encodes a variety of extracellular endoglucanases involved in the degradation of cellulose (Millet et al., 1985; Schwarz et al., 1985; Romaniec et al., 1987). Besides the major endoglucanase A (Cornet et al., 1983; Schwarz et al., 1986), three additional enzymes, designated endoglucanase B (Beguin et al., 1983), C (Petre et al., 1986), and D (Joliff et al., 1986a), have been characterized by cloning and expressing the corresponding genes (celA, celB, celC, celD) in E. coli. The nucleotide sequences of the genes celA (Beguin et al., 1985), celB (Grepinet and Beguin, 1986), and celD (Joliff et al., 1986b) have been determined. Comparison of the deduced amino acid sequences revealed the presence of N-terminal leader (signal) sequences required for protein export. Furthermore, it was found that the C-terminal regions of all three enzymes shared a highly conserved reiterated domain (Joliff et al., 1986b).

Recently, we reported on the high-level expression of the C. thermocellum celA and celC genes in E. coli (Schwarz et al., 1987b). Endoglucanase A was partially exported into the periplasmic space, whereas endoglucanase C remained in the cytoplasm. Overexpression of celA resulted in decreased cell viability concomitant with the accumulation of endoglucanase A in the membrane fraction. In contrast, overproduced endoglucanase C accumulated as a soluble enzyme in the cytoplasm without adverse effects on the host cell.

To elucidate the molecular basis for this difference in enzyme localization, we have determined the nucleotide sequence of the celC gene. It was found that the amino acid sequence of the presumptive signal peptide region at the N terminus differed significantly from the corresponding sequences of other pre-proteins from Gram-positive bacteria. Moreover, the C-terminal amino acid sequence of endoglucanase C shared no homology with the other C. thermocellum endoglucanases. Only a limited region of sequence homology between endoglucanases A and C was identified, which resembles the proposed active sites of other cellulases.

Correspondence to: Dr. W.L. Staudenbauer, Institute for Microbiology, Technical University of Munich, D-8000 München 2 (F.R.G.) Tel. (089)2105-2372.

Abbreviations: aa, amino acid(s); bp, base pair(s); cel, gene coding for cellulase component; dd, dideoxy; Δ, deletion; kb, 1000 bp; nt, nucleotide(s); ORF, open reading frame; PAGE, polyacrylamide gel electrophoresis; SDS, sodium dodecyl sulfate; Sm, streptomycin; [ ], designates plasmid-carrier state; ( ), designates prophage state.

0378-1119/88/$03.50 © 1988 Elsevier Science Publishers B.V. (Biomedical Division)
MATERIALS AND METHODS

(a) Bacterial strains and plasmids

E. coli M72 SmR lacZ Δbio-uvrB ΔtrpEA2 (λNam7Nam53cI857ΔH1) was obtained from E. Remaut. Plasmid pSU1, a derivative of pPLc236 (Remaut et al., 1981) carrying the multiple cloning site of pUC8, was provided by W. Lubitz (Schüller et al., 1985). Phage λLIC7 was isolated from a gene bank constructed by cloning a partial Sau3A digest of C. thermocellum DNA in the phage λ vector 1059 (Schwarz et al., 1985).

(b) Nucleotide sequence analysis

The nucleotide sequence was determined according to the procedures of Maxam and Gilbert (1980). Restriction fragments were end-labelled with [α-32P]ddATP (EcoRI, EcoRV, KpnI) or a suitable [α-32P]dNTP (HindIII, XbaI) employing either reverse transcriptase (EcoRI, HindIII, XbaI) or terminal transferase (EcoRV, KpnI). The products of the chemical cleavage reactions were electrophoresed on 0.2 mm thick 6% or 18% polyacrylamide gels. Sequence data were analysed with the Programs for Rapid Biosequence Similarity Analysis of D.J. Lipman and W.J. Wilbur (NIH, Bethesda).

(c) Enzyme purification

Endoglucanase C was overproduced upon thermal induction of E. coli M72[pWS1257] (Schwarz et al., 1987b). Cell extracts were prepared by freeze-thaw lysis of lysozyme-treated cells and E. coli proteins were precipitated by heating for 10 min at 60°C. The enzyme was then purified to homogeneity by anion exchange chromatography on a Mono Q HR 5/5 column and subsequent gel filtration on a Superose 12 HR 10/30 column employing a Pharmacia FPLC system.

(d) Enzyme characterization

Enzyme activity was determined by assaying the release of reducing sugars from barley β-glucan (Schwarz et al., 1986). SDS-PAGE was carried out in 10% polyacrylamide slab gels in the presence of 0.1% SDS. Staining for β-glucanase activity was performed in polyacrylamide gels containing 0.1% barley β-glucan (Schwarz et al., 1987a). For determination of aa composition, purified endoglucanase C was hydrolyzed in 5.7 N HCl at 110°C for 20 h and analyzed on a Biotronic LC 5000 analyzer. Tryptophan was determined after addition of thioglycolic acid. Cysteine and methionine were estimated after oxidation with performic acid.
RESULTS AND DISCUSSION

(a) Nucleotide sequence of the celC gene

The celC gene was localized within a 1.6-kb PvuII fragment of genomic C. thermocellum DNA (Fig. 1). Constitutive synthesis of endoglucanase C was independent of the orientation of the cloned DNA fragment, suggesting the presence of a clostridial nucleotide sequence which can function as a promoter in E. coli. Enhanced gene expression following induction of the λ pL promoter present on the vector plasmid indicated that transcription occurs from right to left on the physical map shown in Fig. 1.

Fig. 2 shows the complete nucleotide sequence of the celC structural gene along with its flanking regions. There is only one long ORF in this strand and its complement, which begins at nt position -67. The first start codons in this ORF are the AUG codon at nt position +1 and the GUG at nt position +3. These codons are preceded by a strong potential ribosome-binding site allowing perfect base pairing between the mRNA and the 3'-end of most bacterial 16S rRNA (Van Knippenberg et al., 1984). For this interaction a free energy of -18.8 kcal/mol (-78.7 kJ/mol) can be calculated (Tinoco et al., 1973). This value is in accord with those found for other Gram-positive genes including the celA and celB genes of C. thermocellum. The ORF ends with the stop codon UGA at nt position +1030. Another in-frame stop codon (UAA) follows at nt position +1063.

The 5'-noncoding region of the celC gene is enriched in AT residues. The A + T content within the 200 nt immediately preceding the translated sequence is 74%, but 62% within the coding region. An exceptionally high A + T content of upstream regulatory regions appears to be characteristic for genes of bacilli and related bacteria (Moran et al., 1982). No typical promoter sequences could be identified in this region by sequence inspection. Biochemical analysis of celC mRNA would therefore be required to localize the transcription start point(s).

Fig. 1. Subcloning and sequencing strategy for the celC gene. A 4.0-kb EcoRI fragment of phage λLIC7 DNA (Schwarz et al., 1985) was subcloned into plasmid pBR322 yielding pWS1251. A 1.6-kb PvuII fragment expressing endoglucanase C activity was blunt-end ligated into the SmaI site of pSU1 to yield pWS1257. Induction of pL-promoted transcription indicated that mRNA synthesis from celC proceeds from right to left. The arrows represent the direction and extent of sequence determinations, with the sites of 3'-end labelling indicated by vertical lines. More than 300 nt were read from one experiment by using a set of runs in 6% and 18% polyacrylamide gels. B, BamHI; E, EcoRI; E5, EcoRV; H, HindIII; K, KpnI; P, PstI; Pv1, PvuI; Pv2, PvuII; S, Sau3A; Sm, SmaI; X, XbaI. Symbols B/S and Pv2/Sm indicate hybrid sites generated by inserting a Sau3A fragment into a BamHI cloning site (B/S) and a PvuII fragment into a SmaI cloning site (Pv2/Sm), respectively.
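The two numerical statements about the ribosome-binding site in section (a) are easy to check by hand. The following minimal sketch converts the quoted pairing energy from kcal/mol to kJ/mol and derives the RNA stretch complementary to the AGGAGG sequence. The function names and the conversion constant (1 kcal = 4.184 kJ) are illustrative choices, not taken from the paper, and the nearest-neighbour energy calculation of Tinoco et al. (1973) is not reproduced here.

```python
# Illustrative helpers; names are hypothetical and not from the paper.
KCAL_TO_KJ = 4.184  # thermochemical calorie, assumed conversion factor


def kcal_to_kj(value_kcal):
    """Convert an energy in kcal/mol to kJ/mol."""
    return value_kcal * KCAL_TO_KJ


def reverse_complement_rna(seq):
    """Return the RNA strand that pairs antiparallel with a DNA or RNA sequence."""
    pairs = {"A": "U", "C": "G", "G": "C", "T": "A", "U": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def at_percent(seq):
    """Percentage of A and T (or U) residues in a nucleotide sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "ATU" for base in seq) / len(seq)


SD_SITE = "AGGAGG"                          # sequence preceding the ATG (Summary)
ANTI_SD = reverse_complement_rna(SD_SITE)   # stretch expected near the 16S rRNA 3'-end
ENERGY_KJ = round(kcal_to_kj(-18.8), 1)     # value quoted in the text, converted
```

Applying `at_percent` to the 200 nt upstream of the start codon and to the coding region would give the two A + T percentages quoted above, provided an error-free copy of the sequence in Fig. 2 is available.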
-191 -181 -161 -151 -171 CAATAAAAAC TGAACACAGA AGAAGAAAAC GTGATATAAT TAAATTAGAA CGAACGCGCG -141TACATTi$AATAACCCAG -121TGTTAAATGG -111TTTCAG~~~~ -91 -81 -71 -61 -51 -41 -31 -21 -11 -1 CGATTCCAAA TGTTTATATC CAATTTACAT TTAAAAACAT ACAAAACATC AAAAGTATTT AATACCAATA TTTAAAACAC AATATTTCAG GAGGAAAAAA 15 3C 45 6C 75 90 ATGGTGAGTTTTAAAGCAGGTATAAATTTACCCCGATGGATATCACAATATCAAGTTTTCAGCAAAGAGCATTTCGATACATTCATTACG METValSeKPheLvsAlaGlvIleAsnLeuGlrr GlvTm IleSerGinTyrGinValPheSerLysGluHisPhe Asp Thr Phe IIeThr 1c5 12c 135 150 165 180 GAGAAGGACATTCIAACTATTGCAGAAGCAGGGTTTGACCATGTCAGACTGCCTTTTCATTATCCAATTATCGAGTCTCATGAC AATGTG GluLysAspIleGluThrIleAlaGIuAlaGlyPheAspHisVaIArgLeuProPheAspTyrProIleIleGluSerAspAspAsnVal
195 21c 225 240 270 255 CGACIATATAAAGAACATGGGCTTTCTTATATTGACCCCTGCCTTGAGTGGTGTAAAAA11 TACAATTTGGGGCTTGTGTTGCATATGCAT GlyGluTyrLysGIuAspGlyLeuSerTyrIleAspArgCysLeuGluTrpCysLysLysTyrAm LeuGlyLeuValLeuAspNetEis 300 315 360 285 33C 345 GCTCCCGGGTACCCCTTTCAAGATTTTAAGACAAGCACCTTGTTTGAAGATCCCAACCAGCAAAAGAGATTTGTTGAC ATATGGAGA HisAlaProGlyTyrArgPheGinAspPheLysThrSerThrLeuPheGluAspProAm GinGinLysArgPheValAspIleTrpArg
CAC
450 39c 420 435 375 405 TTTTTACCCAAGCGTTACATAAATGAACGGGAACATATTGCCTTTGAACTGTTAAATGAAGTTGTTGAGCCTGACACTACCCCCTGGAAC PheLeuAlaLysArgTyrIleAsnGluArgGluHisIleAlaPheGluLeuLeuAm GluValValGluProAspSerThrArgTrpAsn 540 480 51c 525 465 495 TTGATGCTTGAGTATATAAAAGCAATCAGGGAAATTCATTCCACCATGTGGCTTTACATTCCCCCCAATAACTATAACAGTCCTCAT LysLeuMetLeuGluTyrIleLysAlaIleArgGluIleAspSerThrKetTrpLeuTyrIleGlyGlyAsnAm TyrAsnSerProAsp
AAG
630 57c 600 615 555 585 GAGCTTAAAAACCTTGCACATATTCATCATCATTACATAGTTTACAATTTCCATTTTTACAATCCTTTTTTCTTTACGCATCAGAAAGCC GluteuLysAm LeuAlaAspIleAspAspAspTyrIleValTyrAsnPheEisPheTyrAsnProPhePhePheThrEisGinLysAla 720 690 705 660 645 675 TGGTCGGAAACTCCCATGCCCTACAACAGGACTGTAAAATATCCGCGACAATATGAGCGAATTGAA GAG TTTGTGAl AATAATCCT HisTrpSerGluSerAlaMetAIaTyrAm ArgThrValLysTyrProGlyGinTyrGluGlyIleGluGluPheValLysAsnAm Pro CAC
810 780 795 75c 735 765 AAGTACACTTTTATGATGGAATTGAATAACCTGAAGCTGAATAAAGAGCTTTTGCGCAI GATTTAAAA CCAGCAATTGAGTTCAGGGAA LysTyrSerPheMetNetGluLeuAsnAsnLeuLysLeuAsnLysGluLeuLeuArgLysAspLeuLysProAlaIleGluPheArgGlu 900 885 87C 840 825 855 AAGAAAAAATGCAAACTATATTGCCCCGAGTTTGGCGTAATTCCCATTGCTGACTTGGAGTCTAGGATAAAATGGCATGAACATTATATA LysLysLysCysLysLeuTyrCysGlyGluPheGlyValIleAlaIleAlaAspLeuGluSerArgIleLysTrpHisGluAspTyrIle 990 960 975 930 915 945 AGTCTTCT11 GAGGAGTATCATATCCCCGGCCCCGTGTGGAAC TACAAAAAAATGCATTTTGAAATTTATAATGAGCATAGAAA11 CCTGTC SerLeuLeuGluGluTyrAspIleGlyGlyAIaValTrpAsnTyrLysLysNetAspPheGIuIleTyrAm GluAspArgLysProVal 1CSC 1060 1070 1050 1005 1040 1020 TCGCIAGAATTGGTAAATATACTGCCCAGAAGAAAA ACTTGATTATTAAA ACTACATTTT TGCAAAAGTT TGTAATTTAA AAAATACAAC SerGinGluLeuValAsnIleLeuAlaArgArgLysThr*** Fig. 2. Nucleotide Shine-Dalgarno determined
sequence
by automated
of the coding
of the C. thermocellum
(SD) ribosome-binding sequence.
gas-phase The indicated
ceZC gene and deduced
amino acid sequence
site before the start codon is underlined sequencing. nucleotides
The numbers are aligned
of endoglucanase
C. The presumptive
with a double line. The underlined
refer to the nucleotide
position,
with numbering
with the last digit of each number.
amino acids were
starting
at the first nt
(b) Protein structure and codon usage

The start codon AUG and the reading frame were verified by determining the N-terminal amino acid sequence, which is in full agreement with the amino acid sequence deduced from the nucleotide sequence. While the N-formyl methionine residue of endoglucanase C is deformylated in E. coli, removal of N-terminal amino acids did not take place. The large ORF encodes a protein of 343 aa residues with a predicted Mr of 40439. The size of the protein agrees well with the apparent Mr of approx. 38000 determined for the purified enzyme produced in E. coli by SDS-PAGE (Petre et al., 1986; Schwarz et al., 1987b). Furthermore, the amino acid composition determined experimentally is in good agreement with the composition deduced from the nucleotide sequence (Table I). These data taken together indicate that the complete gene has been cloned and that the enzyme is expressed in E. coli without further proteolytic processing.
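The size figures quoted in this section can be cross-checked with simple arithmetic. The sketch below assumes that the 1032-bp coding region given in the Summary includes the stop codon; the variable names are illustrative and not from the paper.

```python
# Figures quoted in the text; the variable names are hypothetical.
CODING_BP = 1032      # coding region length (Summary), assumed to include the stop codon
RESIDUES = 343        # amino acid residues encoded by the ORF
PREDICTED_MR = 40439  # Mr derived from the deduced sequence
APPARENT_MR = 38000   # approximate Mr from SDS-PAGE

# 343 sense codons plus one stop codon should account for the whole coding region.
TOTAL_CODONS = CODING_BP // 3
CODONS_MATCH = (CODING_BP % 3 == 0) and (TOTAL_CODONS == RESIDUES + 1)

# Average mass contribution per residue of the predicted protein.
MEAN_RESIDUE_MR = round(PREDICTED_MR / RESIDUES, 1)

# Relative difference between predicted and apparent Mr, in percent.
MR_DIFFERENCE_PERCENT = round(100.0 * (PREDICTED_MR - APPARENT_MR) / PREDICTED_MR, 1)
```

A difference of a few percent between the predicted Mr and the SDS-PAGE estimate is within what gel electrophoresis usually resolves, which fits the statement that the enzyme is not processed further in E. coli.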
TABLE I
Amino acid composition of endoglucanase C

Amino acid   Determined by amino    Deduced from the
             acid analysis a        nucleotide sequence b

Ala          15.33                  16
Arg          16.44                  16
Asx c        47.46                  47
Cys           4.84                   4
Glx c        41.49                  41
Gly          16.38                  16
His           7.84                   9
Ile          22.56                  26
Leu          28.09                  28
Lys          29.17                  29
Met           7.13                   8
Phe          21.47                  22
Pro          11.06                  11
Ser          15.51                  15
Thr           9.94                  10
Trp           5.74                   8
Tyr          20.80                  22
Val          14.08                  15

Total       335.93                 343

a Analysis was carried out as described in MATERIALS AND METHODS, section d.
b Sequence data shown in Fig. 2.
c Asx = Asn + Asp; Glx = Gln + Glu (see Table II).
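The agreement between the two columns of Table I can be summarized residue by residue. The sketch below is illustrative only: the values are transcribed from Table I and the names are hypothetical. It lists the absolute deviation for each amino acid and picks out the largest one.

```python
# Values transcribed from Table I; treat them as illustrative input data.
DETERMINED = {
    "Ala": 15.33, "Arg": 16.44, "Asx": 47.46, "Cys": 4.84, "Glx": 41.49,
    "Gly": 16.38, "His": 7.84, "Ile": 22.56, "Leu": 28.09, "Lys": 29.17,
    "Met": 7.13, "Phe": 21.47, "Pro": 11.06, "Ser": 15.51, "Thr": 9.94,
    "Trp": 5.74, "Tyr": 20.80, "Val": 14.08,
}
DEDUCED = {
    "Ala": 16, "Arg": 16, "Asx": 47, "Cys": 4, "Glx": 41, "Gly": 16,
    "His": 9, "Ile": 26, "Leu": 28, "Lys": 29, "Met": 8, "Phe": 22,
    "Pro": 11, "Ser": 15, "Thr": 10, "Trp": 8, "Tyr": 22, "Val": 15,
}


def deviations(determined, deduced):
    """Absolute difference (residues per molecule) for each amino acid."""
    return {aa: round(abs(determined[aa] - deduced[aa]), 2) for aa in deduced}


DEVIATIONS = deviations(DETERMINED, DEDUCED)
LARGEST = max(DEVIATIONS, key=DEVIATIONS.get)  # residue with the poorest agreement
DEDUCED_TOTAL = sum(DEDUCED.values())          # should equal the ORF length in residues
```

The largest deviations fall on isoleucine and tryptophan. Both are commonly underestimated by acid hydrolysis, although the paper does not comment on this.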
A summary of the codons used in the translation of celC mRNA is presented in Table II. The codon usage is similar to that reported previously for other C. thermocellum cellulase genes (Grepinet and Beguin, 1986; Joliff et al., 1986). There appears to be a bias for selection of codons ending in A or U. This preference reflects the comparatively low G + C content (38%) of C. thermocellum DNA. Clostridial codon usage is therefore not optimal for the E. coli translation machinery and might limit the efficiency of gene expression in the heterologous host (Garnier and Cole, 1986).

TABLE II
Codon utilization in the celC gene of Clostridium thermocellum

Codon  Amino  No. b   Codon  Amino  No. b   Codon  Amino  No. b   Codon  Amino  No. b
       acid                  acid                  acid                  acid

UUU    Phe    16      UCU    Ser     3      UAU    Tyr    12      UGU    Cys     1
UUC a  Phe     6      UCC    Ser     1      UAC a  Tyr    10      UGC    Cys     3
UUA    Leu     4      UCA    Ser     1      UAA    End c   0      UGA    End c   1
UUG    Leu     8      UCG    Ser     2      UAG    End c   0      UGG    Trp     8

CUU    Leu     9      CCU    Pro     6      CAU    His     7      CGU a  Arg     1
CUC    Leu     0      CCC    Pro     1      CAC    His     2      CGC a  Arg     3
CUA    Leu     2      CCA a  Pro     2      CAA    Gln     6      CGA    Arg     0
CUG a  Leu     5      CCG a  Pro     2      CAG a  Gln     2      CGG    Arg     2

AUU    Ile    14      ACU a  Thr     3      AAU    Asn    14      AGU    Ser     6
AUC a  Ile     3      ACC a  Thr     3      AAC a  Asn     8      AGC    Ser     2
AUA    Ile     9      ACA    Thr     2      AAA a  Lys    21      AGA    Arg     6
AUG    Met     8      ACG    Thr     2      AAG    Lys     8      AGG    Arg     4

GUU a  Val     5      GCU a  Ala     2      GAU    Asp    18      GGU a  Gly     1
GUC    Val     2      GCC    Ala     5      GAC    Asp     7      GGC a  Gly     5
GUA a  Val     3      GCA a  Ala     6      GAA a  Glu    17      GGA    Gly     4
GUG    Val     5      GCG a  Ala     3      GAG    Glu    16      GGG    Gly     6

a Major E. coli tRNA species (Ikemura, 1981).
b Number of codons in the entire celC coding region (see Fig. 2).
c Stop codons.

(c) Signal sequence and protein localization

Evidence has been provided that endoglucanase C is secreted into the culture medium of C. thermocellum grown on cellulose (Petre et al., 1986). A common feature of bacterial protein export is the requirement for an N-terminal leader (signal) sequence, which is removed upon translocation of the pre-protein across the plasma membrane. Leader peptides are 15-40 aa residues in length with a positively charged N-terminal region, an apolar hydrophobic core and characteristic sequences adjacent to the cleavage site (Kreil, 1981; Oliver, 1985).

Some features of the N-terminal amino acid sequence of endoglucanase C, such as the presence of a positively charged lysine near the N terminus followed by a somewhat hydrophobic domain, would be in accord with the presence of a leader peptide. The N end of mature endoglucanase C is not known; application of the algorithm developed by von Heijne (1983; 1986) reveals the presence of a potential cleavage site for leader peptidase after serine-21.

However, this putative signal peptide of endoglucanase C is quite atypical in several respects. The leader peptides of celA, celB, and celD (Fig. 3) as well as those of several other pre-proteins of Gram-positive bacteria (Pugsley and Schwartz, 1985; MacKay et al., 1986; O'Neill et al., 1986) are longer and more basic at the N terminus than the one for celC. Moreover, the core region of the celC leader contains several neutral-polar amino acids (asparagine, glutamine) and lacks a cluster of strongly hydrophobic amino acid residues. Lastly, almost all bacterial signal peptides terminate with alanine and prokaryotic leader peptidase only rarely cleaves after serine residues. These unusual features may be the reason why the celC leader sequence does not function in E. coli and the protein thus remains in the cytoplasm (Schwarz et al., 1987b).
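The atypical features of the celC leader described in section (c) can be made concrete with a few counts. The sketch below is a rough illustration, not the von Heijne cleavage-site algorithm. The choice of residues counted as strongly hydrophobic (A, V, L, I, F, M) and all names are assumptions made here. The celC leader is read from the first 21 residues of Fig. 2, and the second sequence is the hydrophobic stretch shown for endoglucanase A in Fig. 3.

```python
# Residue classes are an illustrative assumption, not taken from the paper.
HYDROPHOBIC = set("AVLIFM")
BASIC = set("KR")

CELC_LEADER = "MVSFKAGINLGGWISQYQVFS"    # residues 1-21 deduced from Fig. 2
CELA_CORE = "VGVVLLILAVLGVYMLAMPANTVSA"  # stretch shown for CelA in Fig. 3


def longest_hydrophobic_run(seq):
    """Length of the longest uninterrupted run of hydrophobic residues."""
    best = run = 0
    for residue in seq:
        run = run + 1 if residue in HYDROPHOBIC else 0
        best = max(best, run)
    return best


def basic_residues(seq, window=7):
    """Number of Lys/Arg residues within the first `window` positions."""
    return sum(residue in BASIC for residue in seq[:window])


CELC_RUN = longest_hydrophobic_run(CELC_LEADER)
CELA_RUN = longest_hydrophobic_run(CELA_CORE)
CELC_BASIC = basic_residues(CELC_LEADER)
CELC_LAST = CELC_LEADER[-1]  # residue preceding the predicted cleavage site
```

Under these assumptions the celC leader has a single basic residue near its N terminus, no hydrophobic run longer than two residues, and ends in serine, whereas the CelA stretch contains an uninterrupted run of nine hydrophobic residues and ends in alanine.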
CelA  MKN...VGVVLLILAVLGVYMLAMPANTVSA
CelB  ...FLVLLIALIMIATLLVVPGVQTSA
CelC  MVSFKAGINLGGWISQYQVFS
CelD  MS...TLK...VLSLLIAVVFLSLTGVFPSGLI...VSA

Fig. 3. Comparison of signal peptides of C. thermocellum endoglucanases. The signal peptidase cleavage site of endoglucanase A (CelA) has been determined by N-terminal sequencing of the extracellular protein. Cleavage sites (shown as gaps) for endoglucanases B, C, and D (CelB, C, D) are those predicted by cleavage-site recognition rules. The symbols (+) and (-) denote basic and acidic amino acid residues, respectively.

(d) Sequence comparison to functionally related enzymes

Alignment of the deduced endoglucanase C sequence with the amino acid sequences of endoglucanases A, B, and D yielded no regions of extensive homology. Furthermore, comparison with the sequences of various cellulolytic enzymes (Knowles et al., 1987) revealed no significant similarities. In particular, endoglucanase C is lacking the C-terminal direct repeat of 24 aa shared by the other three C. thermocellum endoglucanases. This conserved sequence has been implicated in the binding of these enzymes to two adjacent glucose residues of the cellulose substrates (Beguin et al., 1985). It should be pointed out that endoglucanase C has an unusual substrate range and displays features common to cellobiohydrolases by being able to cleave the agluconic bond of aryl-β-glucosides.

Comparison of endoglucanase C with endoglucanase A revealed a short region of similarity (Fig. 4). Interestingly, this homologous sequence includes the motif Glu-Xaa7-Asn-Xaa5/7-Asp (where Xaa is any amino acid and subscript 5/7 indicates a number of 5 or 7 aa residues, respectively). A similar arrangement of amino acids is present in the catalytic site of lysozymes (Canfield, 1963). Limited sequence homology to the active site of lysozymes has been found in several fungal endoglucanases and cellobiohydrolases (Yaguchi et al., 1983; Paice et al., 1984; Teeri et al., 1987). These findings support the notion that cellulases might act like lysozymes by an acid catalysis mechanism (Clarke and Yaguchi, 1985).

Fig. 4. Alignment of homologous regions of endoglucanase C (CelC) and endoglucanase A (CelA). Amino acids common to both sequences are boxed. The hen egg-white lysozyme (HEWL) sequence shows the region of the established active site residues Glu-35 and Asp-52.
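A spacing motif of the kind discussed in section (d) can be searched for with a regular expression. The sketch below assumes that the motif reads Glu-Xaa7-Asn-Xaa5/7-Asp; the first subscript is poorly legible in the source and is an assumption here. The test sequence is a stretch of the deduced endoglucanase C sequence as read from Fig. 2, and the pattern and variable names are illustrative.

```python
import re

# Assumed reading of the motif: Glu, 7 residues, Asn, 5 or 7 residues, Asp.
MOTIF = re.compile(r"E.{7}N(?:.{5}|.{7})D")

# Stretch of the deduced CelC sequence as read from Fig. 2 (illustrative input).
CELC_SEGMENT = "NEREHIAFELLNEVVEPDSTRWN"


def find_motif(sequence):
    """Return (start, matched text) for the first motif occurrence, or None."""
    match = MOTIF.search(sequence)
    if match is None:
        return None
    return match.start(), match.group(0)


HIT = find_motif(CELC_SEGMENT)
```

With this reading, the celC stretch contains one occurrence with a five-residue spacer between Asn and Asp. A pattern this short is expected to occur by chance in many proteins, so a match is only suggestive without the alignment context of Fig. 4.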
ACKNOWLEDGEMENTS
We are grateful to Dr. P. Beguin for communicating his unpublished sequence data. We thank Drs. W. Lubitz and E. Remaut for providing us with bacterial strains and plasmids. This work was supported by grants from the Deutsche Forschungsgemeinschaft (SFB 145), from the Fonds der Chemischen Industrie, and from the Dr. Otto Rohm Gedachtnisstiftung GmbH.
REFERENCES

Beguin, P., Cornet, P. and Aubert, J.P.: Sequence of a cellulase gene of the thermophilic bacterium Clostridium thermocellum. J. Bacteriol. 162 (1985) 102-105.
Beguin, P., Cornet, P. and Millet, J.: Identification of the endoglucanase encoded by the celB gene of Clostridium thermocellum. Biochimie 65 (1983) 495-500.
Canfield, R.E.: The amino-acid sequence of egg-white lysozyme. J. Biol. Chem. 238 (1963) 2698-2707.
Clarke, A.J. and Yaguchi, M.: The role of carboxyl groups in the function of endo-β-1,4-glucanase from Schizophyllum commune. Eur. J. Biochem. 149 (1985) 233-238.
Garnier, T. and Cole, S.T.: Characterization of a bacteriocinogenic plasmid from Clostridium perfringens and molecular genetic analysis of the bacteriocin-encoding gene. J. Bacteriol. 168 (1986) 1189-1196.
Grepinet, O. and Beguin, P.: Sequence of the cellulase gene of Clostridium thermocellum coding for endoglucanase B. Nucl. Acids Res. 14 (1986) 1791-1799.
Ikemura, T.: Correlation between the abundance of E. coli transfer RNA and the occurrence of the respective codons in its protein genes. J. Mol. Biol. 151 (1981) 389-409.
Joliff, G., Beguin, P., Juy, M., Millet, J., Ryter, A., Poljak, R. and Aubert, J.P.: Isolation, crystallization and properties of a new cellulase of Clostridium thermocellum overproduced in Escherichia coli. Bio/Technology 4 (1986a) 896-900.
Joliff, G., Beguin, P. and Aubert, J.P.: Nucleotide sequence of the cellulase gene celD encoding endoglucanase D of Clostridium thermocellum. Nucl. Acids Res. 14 (1986b) 8605-8613.
Knowles, J., Lehtovaara, P. and Teeri, T.: Cellulase families and their genes. Trends Biotechnol. 5 (1987) 255-261.
Kreil, G.: Transfer of proteins across membranes. Annu. Rev. Biochem. 50 (1981) 317-348.
MacKay, R.M., Lo, A., Willick, G., Zuker, M., Baird, S., Dove, M., Moranelli, F. and Seligy, V.: Structure of a Bacillus subtilis endo-β-1,4-glucanase gene. Nucl. Acids Res. 14 (1986) 9159-9170.
Maxam, A.M. and Gilbert, W.: Sequencing end-labeled DNA with base-specific chemical cleavages. Methods Enzymol. 65 (1980) 499-560.
Moran, C.P., Lang, N., LeGrice, S.F.J., Lee, G., Stephens, M., Sonenshein, A.L., Pero, J. and Losick, R.: Nucleotide sequences that signal the initiation of transcription and translation in Bacillus subtilis. Mol. Gen. Genet. 186 (1982) 339-346.
O'Neill, G.P., Warren, R.A.J., Kilburn, D.G. and Miller, R.C.: Secretion of a Cellulomonas fimi exoglucanase by Escherichia coli. Gene 44 (1986) 331-336.
Paice, M.G., Desrochers, M., Rho, D., Jurasek, L., Rollin, C.F., De Miguel, E. and Yaguchi, M.: Two forms of endoglucanase from the basidiomycete Schizophyllum commune and their relationship to other β-1,4-glycoside hydrolases. Bio/Technology 2 (1984) 535-539.
Petre, D., Millet, J., Longin, R., Beguin, P., Girard, H. and Aubert, J.P.: Purification and properties of the endoglucanase C of Clostridium thermocellum produced in Escherichia coli. Biochimie 68 (1986) 687-695.
Pugsley, A.P. and Schwartz, M.: Export and secretion of proteins by bacteria. FEMS Microbiol. Rev. 32 (1985) 3-38.
Remaut, E., Stanssens, P. and Fiers, W.: Plasmid vectors for high-efficiency expression controlled by the pL promoter of coliphage lambda. Gene 15 (1981) 81-93.
Romaniec, M.P.M., Clarke, N.G. and Hazlewood, G.P.: Molecular cloning of Clostridium thermocellum DNA and the expression of further novel endo-β-1,4-glucanase genes in Escherichia coli. J. Gen. Microbiol. 133 (1987) 1297-1307.
Schüller, A., Harkness, R.E., Rüther, U. and Lubitz, W.: Deletion of C-terminal amino acid codons of phiX174 gene E: effect on its lysis inducing properties. Nucl. Acids Res. 13 (1985) 4143-4152.
Schwarz, W., Bronnenmeier, K. and Staudenbauer, W.L.: Molecular cloning of Clostridium thermocellum genes involved in β-glucan degradation in bacteriophage lambda. Biotechnol. Lett. 7 (1985) 859-864.
Schwarz, W.H., Gräbnitz, F. and Staudenbauer, W.L.: Properties of Clostridium thermocellum endoglucanase produced in Escherichia coli. Appl. Environ. Microbiol. 51 (1986) 1293-1299.
Schwarz, W.H., Bronnenmeier, K., Gräbnitz, F. and Staudenbauer, W.L.: Activity staining of cellulases in polyacrylamide gels containing mixed-linkage β-glucans. Anal. Biochem. 164 (1987a) 72-77.
Schwarz, W.H., Schimming, S. and Staudenbauer, W.L.: High-level expression of Clostridium thermocellum cellulase genes in Escherichia coli. Appl. Microbiol. Biotechnol. 27 (1987b) 50-56.
Teeri, T.T., Lehtovaara, P., Kauppinen, S., Salovuori, I. and Knowles, J.: Homologous domains in Trichoderma reesei cellulolytic enzymes: gene sequence and expression of cellobiohydrolase II. Gene 51 (1987) 43-52.
Tinoco, I., Borer, P.N., Dengler, B., Levine, M.D., Uhlenbeck, O.C., Crothers, D.M. and Gralla, J.: Improved estimation of secondary structure in ribonucleic acids. Nature New Biol. 246 (1973) 40-41.
Van Knippenberg, P.H., Van Kimmenade, J.M.A. and Heus, H.A.: Phylogeny of the conserved 3' terminal structure of the RNA of small ribosomal subunits. Nucl. Acids Res. 12 (1984) 2595-2603.
von Heijne, G.: Patterns of amino acids near signal sequence cleavage sites. Eur. J. Biochem. 133 (1983) 17-21.
von Heijne, G.: A new method for predicting signal sequence cleavage sites. Nucl. Acids Res. 14 (1986) 4683-4690.
Yaguchi, M., Roy, C., Rollin, C.F., Paice, M.G. and Jurasek, L.: A fungal cellulase shows sequence homology with the active site of hen egg-white lysozyme. Biochem. Biophys. Res. Commun. 116 (1983) 408-411.

Communicated by J. Knowles.