An expanded family of fungalysin extracellular metallopeptidases of Coprinopsis cinerea

mycological research 112 (2008) 389–398

journal homepage: www.elsevier.com/locate/mycres

Walt W. LILLY a,*, Jason E. STAJICH b,†, Patricia J. PUKKILA c, Sarah K. WILKE a,‡, Noriko INOGUCHI a, Allen C. GATHMAN a

a Department of Biology, Southeast Missouri State University, Cape Girardeau, MO 63701, USA
b Department of Molecular Genetics and Molecular Biology, Duke University, Durham, NC 27708, USA
c Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA

article info

Article history:
Received 9 September 2007
Received in revised form 14 November 2007
Accepted 29 November 2007
Corresponding Editor: Nicholas P. Money

Keywords:
Basidiomycota
Coprinus cinereus
FTP
Fungalysin
Peptidases
Proteases
Thermolysin

abstract

Proteolytic enzymes, particularly secreted proteases of fungal origin, are among the most important of industrial enzymes, yet the biochemical properties and substrate specificities of these proteins have been difficult to characterize. Genomic sequencing offers a powerful tool to identify potentially novel proteases. The genome of the model basidiomycete Coprinopsis cinerea was found to have an unusually high number of metalloproteases that closely match the M36 peptidase family known as fungalysins. The eight predicted C. cinerea fungalysins divide into two groups upon comparison with fungalysins from other fungi. One member, CcMEP1, is most similar to the single representative fungalysins from the basidiomycetes Phanerochaete chrysosporium, Cryptococcus neoformans, and Ustilago maydis, and to the fungalysin type-protein from Aspergillus fumigatus. The remaining seven C. cinerea predicted fungalysins form a group with similarity to three predicted M36 peptidases of Laccaria bicolor. All eight of the C. cinerea enzymes contain both the signature M36 Pfam domain and the FTP propeptide domain. All contain large propeptides with considerable sequence conservation near a proposed cleavage site. The predicted mature enzymes range in size from 37 to 46 kDa and have isoelectric points that are mildly acidic to neutral. The proximity of these genes to telomeres and/or to transposable elements may have contributed to the expansion of this gene family in C. cinerea.

© 2007 The British Mycological Society. Published by Elsevier Ltd. All rights reserved.

* Corresponding author. E-mail address: [email protected]
† Current address: Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.
‡ Current address: Department of Biology, University of North Carolina, Chapel Hill, NC 27599, USA.
0953-7562/$ – see front matter © 2007 The British Mycological Society. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.mycres.2007.11.013

Introduction

External digestion of macromolecular nutrients is a general feature of fungal growth and is dependent upon the secretion of hydrolytic enzymes. Among the most important of these hydrolases are those with proteolytic activity. Proteolytic enzymes are mechanistically diverse and include both those requiring highly specific peptide bond substrates, and others that can hydrolyse a broad spectrum of peptides. They are divided into seven major mechanistic classes: aspartyl proteases, cysteine proteases, glutamyl proteases, metalloproteases, serine proteases, threonine proteases, and those of unknown mechanism. Multiple studies have identified proteolytic activities, generally attributable to serine proteases and metalloproteases, in the growth medium supporting actively growing basidiomycetes. The model basidiomycetes
Schizophyllum commune (Hummel et al. 1998), Ustilago maydis (Hellmich & Schauz 1988), Agaricus bisporus (Burton et al. 1997), and Coprinopsis cinerea (Kalisz et al. 1987) all show secreted protease activity, and in most cases, the nitrogen status of the nutrient base affects the amounts (and perhaps types) of protease activity secreted. These older biochemical studies suffered from a lack of understanding of the number of protease genes present in fungal organisms, and often focused on measuring the total activity of an entire class of proteases, rather than the individual contributions of specific enzymes. The lack of resolution of these biochemical analyses also tended to underestimate the total number of enzymes contributing to extracellular proteolysis. In the only published biochemical study of C. cinerea proteases, Kalisz et al. (1987) identified only five proteases of all classes in culture filtrates using gelatin-containing polyacrylamide gel electrophoresis. We encountered similar resolution difficulties in S. commune (Hummel et al. 1998) and in C. cinerea (unpubl.). Such poor understanding of the extracellular enzyme activities has made it virtually impossible to gain insight into substrate specificity and regulation of individual enzymes. Recently, genome sequencing of fungi has revealed a large number of predicted fungal extracellular proteases and has provided a basis for investigating the function and specificity of individual enzymes. For example, genomic analysis in the white-rot basidiomycete Phanerochaete chrysosporium predicted 52 extracellular peptidases, of which 31 were subsequently found by proteomic methods in ligninolytic cultures (Vanden Wymelenberg et al. 2006). In this study, we report the discovery of a large, expanded family of extracellular fungalysin (peptidase M36 family) metalloproteases, based on genome-wide predictions for the mushroom-producing basidiomycete C. cinerea. 
Fungalysins are metalloproteases that are closely related to the bacterial thermolysin family. They were first discovered in Aspergillus fumigatus (Markaryan et al. 1994), but their role in non-pathogenic fungi is unknown. Here we report the sequence conservation, expression, and genomic location of this gene family.

Materials and methods

Discovery and annotation of the M36 peptidase family

Multiple gene models have been developed for Coprinopsis cinerea. A reference set of gene predictions produced using Augustus, GeneZilla, and SNAP predictions has been deposited in GenBank by the Broad Institute. Additional training of GLEAN and GeneZilla software, combined with setting an intron length upper limit of 300 nt, has provided our group with what appears to be a more reliable working set of gene predictions. These are referred to as ‘GLEAN_GZ2_Jan06max300’ models (subsequently abbreviated Jan06m300). The latter set are downloadable from http://fungal.genome.duke.edu, and most of the available gene models, including the Broad predictions, can be viewed in their genomic context on GBrowse at http://genome.semo.edu/. We screened the C. cinerea Jan06m300 predictions by performing a BLASTP search against the MEROPS peptidase database (http://merops.sanger.ac.uk/). Predicted cellular location of these was determined using
SignalP 3.0 (Bendtsen et al. 2004), and WoLFPSORT (Horton et al. 2007). Manual annotation of the M36 peptidase genes was performed by multiple alignment analysis of the Broad predictions, the Jan06m300 models, and available EST data using ClustalW. Where ambiguity of intron splicing was present in the predictions, and EST data were available, the ESTs were taken as correct. In two instances (CcMEP1 introns 2 and 3) conflicting intron predictions were resolved by PCR analysis. The individual M36 predictions were further compared by alignment to each other and to known M36 sequences from other fungi (particularly, Aspergillus fumigatus). The resulting manually annotated genes were then named CcMEP1 through CcMEP8.
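The screening logic described above (keep genes with a sufficiently strong database hit, then ask which of them either localisation predictor calls extracellular) can be illustrated with a minimal Python sketch. The gene identifiers, hit values, and predictor calls below are hypothetical toy data, the E-value cutoff is an assumed parameter, and the function names are illustrative only; this is not the authors' pipeline.

```python
# Minimal sketch with hypothetical toy data; not the authors' pipeline.
E_VALUE_CUTOFF = 1e-10  # assumed cutoff for accepting a database hit


def screen_hits(hits, cutoff=E_VALUE_CUTOFF):
    """Return the set of query genes with at least one hit at or below the cutoff."""
    return {query for query, _family, e_value in hits if e_value <= cutoff}


def combine_predictions(candidates, signalp_positive, wolfpsort_positive):
    """Group candidate genes by which predictor calls them extracellular."""
    sp = candidates & signalp_positive
    wp = candidates & wolfpsort_positive
    return {
        "both": sp & wp,
        "signalp_only": sp - wp,
        "wolfpsort_only": wp - sp,
        "either": sp | wp,
    }


# (query gene, matched family, E value): all values are made up for illustration.
hits = [
    ("gene_a", "M36", 1e-120),
    ("gene_a", "M4", 1e-15),   # second hit for the same gene; counted once
    ("gene_b", "S8", 3e-40),
    ("gene_c", "A1", 0.5),     # too weak; rejected
    ("gene_d", "M28", 1e-10),  # exactly at the cutoff; accepted
]

candidates = screen_hits(hits)
calls = combine_predictions(
    candidates,
    signalp_positive={"gene_a", "gene_b"},
    wolfpsort_positive={"gene_a", "gene_c", "gene_d"},
)
print(sorted(candidates))
print({name: sorted(genes) for name, genes in calls.items()})
```

Using sets keeps each gene counted once however many database entries it matches, and makes the "both", "only one", and "either" tallies direct set operations.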

Phylogenetic methods

Amino acid sequences were aligned to the Pfam profile HMM of the Peptidase_M36 domain using the hmmalign program of the HMMER package (version 2.3.2; http://hmmer.janelia.org). A Bayesian consensus phylogenetic tree of the sequences was constructed using the MrBayes software package (version 3.1.2) (Ronquist & Huelsenbeck 2003) with the mixed amino acid model (aamodel), three runs, and four MCMC chains that converged after 800 iterations. Maximum likelihood bootstrap (ML BS) values were calculated for the Bayesian consensus tree with RAxML (version 2.2.3) (Stamatakis 2006) using PROTMIXBLOSUM62 (as BLOSUM was the optimal amino acid model found by MrBayes) and 100 replicates.

Voucher material

Coprinopsis cinerea (syn. Coprinus cinereus) str 130, Okayama 7, is available from Patricia Pukkila, Department of Biology, University of North Carolina-Chapel Hill.

Results

Multiple members of family M36 peptidases in Coprinus cinereus

The BLASTP search of Coprinopsis cinerea Jan06m300 translated gene predictions against the MEROPS protease database identified 301 unduplicated genes potentially encoding proteolytic enzymes, or non-enzymatic homologues of proteases (E values of 1 × 10⁻¹⁰ or less; data on other proteases will be presented elsewhere). The cellular locations of the predicted protease gene products were then analysed using two methods of location prediction: SignalP 3.0 and WoLFPSORT. These methods provide tests for extracellularity based on presence of a signal sequence (in SignalP 3.0) or the combined features of signal sequence and full protein amino acid composition (in WoLFPSORT). These programs predicted a total of 105 proteases to be extracellular (100 predicted by both programs, three predicted only by SignalP, and two predicted only by WoLFPSORT). Of these 105 putative extracellular proteases, 50 are metalloproteases, 40 are serine proteases, ten are aspartic proteases, three are cysteine proteases, and one is a threonine protease. Members of 21 different MEROPS families are represented; however, six families (M28, M36, M43, S8, S9, and A1) comprise two-thirds of the total predicted extracellular proteases. Organisms, including other basidiomycetes, commonly have multiple members of most of these families. We performed a similar BLASTP search with predicted proteins from the available basidiomycete genomes, including Phanerochaete chrysosporium, Ustilago maydis, Cryptococcus neoformans, and the preliminary predictions of the recently released Laccaria bicolor genome. Multiple members of each of these six families exist in all five of the basidiomycete genomes, with the exception of family M36. Among the five sequenced basidiomycetes C. cinerea appears unique in having a large number of genes for M36 proteases.

The MEROPS database BLAST search of the Jan06m300 predictions identified eight unique genes potentially encoding M36 family proteases in C. cinereus (Table 1). Three other basidiomycetes, P. chrysosporium, U. maydis, and C. neoformans, each have only one gene for an M36 peptidase, based on BLASTP searches of their predicted genes against MEROPS. In addition, BLASTP searches of their databases with all eight of the predicted C. cinerea M36 peptidases also identified only one gene with similarity. The ectomycorrhizal basidiomycete Laccaria bicolor is the only basidiomycete other than C. cinereus to have more than one putative M36 peptidase gene. A BLASTP search of its recently released genome with the eight C. cinereus M36 peptidase genes indicates that L. bicolor may have as many as four M36 protease genes. The genomes of ascomycete dermatophytes Microsporum canis (Brouta et al. 2002) and Trichophyton rubrum (Jousson et al. 2004) each encode five members of the family. There are three genes for M36 proteases that provide the ‘type’ M36 peptidase in the ascomycete Aspergillus fumigatus, whereas Saccharomyces cerevisiae, Schizosaccharomyces pombe, and Neurospora crassa have no M36 homologues (MEROPS Database: http://merops.sanger.ac.uk/). Therefore, to the best of our knowledge C. cinereus has the most identified M36 genes of any organism.

Characterization of the M36 genes

The eight predicted M36 peptidase genes are spread over six chromosomes, with only one cluster (Table 1). CcMEP6, CcMEP7 and CcMEP8 map together on chromosome XI in the Southeast Assembly (Table 1). Six of the eight genes map within 200 kb of the nearest telomere. This distance represents 1–11 % of the total chromosome length (Table 1). The CcMEP coding sequences range in length from 2260 to 2524 nt, including introns.

Comparison of the two principal gene models used showed identical predictions for four of the CcMEP genes. In the other four genes there were eight discrepancies among the 35 intron calls. Four of these discrepancies were the result of skipped or added introns in one of the models; the other four were the result of alternate 5' (one case) or 3' splice junction predictions. Three of the eight discrepancies were resolved with EST data, and two of the eight were resolved by PCR analysis. For the remaining three, the model that provided the best alignment with the other CcMEP gene products was used. The eight genes possess from four to 11 introns, which range in length from 52 to 192 nt with an average length of 65 nt (Fig 1). This is consistent with average intron length and frequency found in Coprinopsis cinerea and other basidiomycetes (Seitz et al. 1996; J.E.S. unpubl.). Although there appears to be some conservation of intron position near the 5' end of several of the genes, the introns rarely begin at exactly the same position, and show variability in their lengths at a given position. This is in contrast to some other families of genes in C. cinereus, including the laccases, many of which show striking positional conservation of introns (Hoegger et al. 2004).

Promoter analysis was performed on the 1000 nt upstream of the translation start. Multiple alignment of these upstream sequences failed to find significant sequence similarity. Consensus TATA boxes were found for only five of the eight genes, and their positions ranged from 16 nt to 740 nt (Table 2). Multiple CAAT boxes were found in these upstream sequences as well. All but CcMEP4 and CcMEP6 showed consensus stress response elements, and four genes showed xenobiotic response elements. In addition, a tandem repeat sequence (TTCACGACA) recurred three times between 33 nt and 59 nt of CcMEP6. CcMEP7 and CcMEP8 had multiple instances (20 and 15, respectively) of 4 nt or longer stretches of the same nucleotide.

Table 1 – Predicted genes encoding M36 family metalloproteases in Coprinopsis cinerea

Gene    Broad gene (a)  Annotation  Length (nt) (b)  Chromosome (c)  Location (d)  Strand  E value M36 (e)  E value FTP (f)
CcMEP1  EAU83617        Hand        2510             XIII            Internal      -       1.8E-142         5.7E-04
CcMEP2  EAU91122        Hand        2470             III             Internal      +       1.1E-45          6.0E-02
CcMEP3  EAU82511        Hand        2455             V               50 (1)        -       2.2E-91          No match
CcMEP4  EAU86365        Broad       2524             VIII            100 (5)       -       3.5E-54          2.3E-02
CcMEP5  EAU86463        Broad       2369             XII             40 (2)        +       2.7E-95          0.63
CcMEP6  EAU82955        Broad       2406             XI              200 (9)       +       1.8E-56          3.5E-02
CcMEP7  EAU82952        Hand        2393             XI              200 (9)       +       1.4E-14          0.18
CcMEP8  EAU82961        Broad       2260             XI              200 (9)       -       2.2E-99          No match

a Broad gene indicated by GenBank accession number.
b Gene characteristics based on annotations as indicated. Hand annotation data are in Supplementary Material Table S1.
c Chromosomes in the Southeast Assembly viewable at http://genome.semo.edu/.
d Distance to nearest telomere is shown in kb, and percent of chromosome length represented by that distance is shown in parentheses.
e E value for Blastp match of the predicted protein to the M36 Pfam domain.
f E value for Blastx match of predicted protein to the FTP Pfam domain.
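As a worked check of the positional summary drawn from Table 1, the following Python sketch counts the genes lying within 200 kb of a telomere and lists chromosomes carrying more than one CcMEP gene. The distances and chromosome assignments are transcribed from the Location and Chromosome columns of Table 1; the variable and function names are illustrative only.

```python
from collections import defaultdict

# Distance to the nearest telomere in kb (None = "Internal"), from Table 1.
telomere_distance_kb = {
    "CcMEP1": None, "CcMEP2": None, "CcMEP3": 50, "CcMEP4": 100,
    "CcMEP5": 40, "CcMEP6": 200, "CcMEP7": 200, "CcMEP8": 200,
}

# Chromosome assignment, from Table 1.
chromosome = {
    "CcMEP1": "XIII", "CcMEP2": "III", "CcMEP3": "V", "CcMEP4": "VIII",
    "CcMEP5": "XII", "CcMEP6": "XI", "CcMEP7": "XI", "CcMEP8": "XI",
}


def near_telomere(distances, max_kb=200):
    """Genes whose listed telomere distance is at most max_kb."""
    return sorted(g for g, d in distances.items() if d is not None and d <= max_kb)


def clusters(assignments):
    """Chromosomes that carry more than one gene, with their genes."""
    by_chromosome = defaultdict(list)
    for gene, chrom in sorted(assignments.items()):
        by_chromosome[chrom].append(gene)
    return {c: genes for c, genes in by_chromosome.items() if len(genes) > 1}


near = near_telomere(telomere_distance_kb)
clustered = clusters(chromosome)
print(len(near), "of", len(telomere_distance_kb), "genes within 200 kb of a telomere")
print(len(set(chromosome.values())), "chromosomes; clusters:", clustered)
```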


Fig 1 – Upper frame: intron positions of the putative M36 peptidase gene family of Coprinopsis cinerea.

Characterization of the predicted M36 peptidases

The predicted proteins encoded by the M36 peptidase genes range in length from 577 to 781 amino acids. Pairwise sequence identities range from a high of 69 % between CcMEP3 and CcMEP5 to a low of 30 % between CcMEP1 and CcMEP7, with a mean of 42 % for all eight of the predicted proteins. A 1000× bootstrap analysis based on ClustalW alignments of the predicted amino acid sequences of the known M36 peptidases from fungi parses the enzymes into two major groups (Fig 2). One of the groups has all of the previously described M36 peptidases from the Ascomycota, along with a separate, but distantly related branch containing the single-representative M36 peptidases from the basidiomycetes Cryptococcus neoformans, Ustilago maydis, and Phanerochaete chrysosporium. A single member of the M36 peptidase family from Coprinus cinereus (CcMEP1) and one from Laccaria bicolor (named by us as LbMEP1) also share more similarity with these basidiomycete proteins. The similarity of these basidiomycete M36 peptidases is high. Sequence alignment of the CcMEP1 predicted protein to the Aspergillus fumigatus enzyme shows that 74 % of the amino acids are identical or conservatively substituted.
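Pairwise identity values of the kind quoted above can be derived from aligned sequences as in the following Python sketch. It assumes one common convention (identical residues divided by the number of columns in which neither sequence has a gap); the text does not state which convention was used, and the three short aligned peptides are hypothetical toy sequences, not CcMEP data.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned toy peptides (equal length, '-' marks a gap).
aligned = {
    "p1": "MKT-AYHE",
    "p2": "MKTLAYHE",
    "p3": "MRT-SYHD",
}

pairwise = {
    (n1, n2): percent_identity(aligned[n1], aligned[n2])
    for n1, n2 in combinations(sorted(aligned), 2)
}
mean_identity = sum(pairwise.values()) / len(pairwise)

for pair, value in pairwise.items():
    print(pair, round(value, 1))
print("mean", round(mean_identity, 1))
```

Other denominators (alignment length, or length of the shorter sequence) give different percentages, so the convention should be fixed before comparing values across studies.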

The remaining seven C. cinerea enzymes, along with three putative M36 peptidases of L. bicolor, cluster together on a separate branch and are more distantly related to the A. fumigatus M36 fungalysins. It is typical for secreted proteases to be synthesized as proenzymes (Baker et al. 1993). Previously studied members of the M36 family (and the related M4 family from bacteria) have been demonstrated to possess a large propeptide domain, which may account for up to 50 % of the amino acid sequence encoded by the gene. Propeptide domains are involved in conformational stabilization and inhibition of the enzyme (Tang et al. 2003; Kubota et al. 2005). There is substantial evidence that propeptides are removed by proteolytic processing. This is catalysed either by specific enzymes (Tang et al. 2003) or by autolytic cleavage (Marie-Claire et al. 1998). Markaryan et al. (1996) demonstrated that in A. fumigatus the propeptide inhibited the activity of the mature fungalysin; however, nothing is known about the in vivo processing of M36 family enzymes in fungi. In the case of the related thermolysin family of bacterial metalloproteases, autocatalytic cleavage of the propeptide has been clearly demonstrated. The eukaryotic M36 and bacterial M4 families of metalloproteases share a conserved Pfam domain in their propeptides called FTP (fungalysin/thermolysin propeptide). CcMEP1 most closely matches the FTP domain of A. fumigatus (Fig 3A). Five of the seven other C. cinereus M36s show significant matches to FTP domains in their propeptides. The two that do not show significant matches to FTP (CcMEP6 and CcMEP7), nonetheless, show considerable overall similarity to the others in this region of their propeptides (Fig 3B). It is reasonable to consider that the peptide sequence near the cleavage point would be conserved, whether propeptide cleavage is autolytic or carried out by a separate modifying enzyme. 
This is certainly true for the more closely related M36 peptidase families in the ascomycota. However, the greater sequence dissimilarity in the CcMEPs provides a more stringent test of the hypothesis. Alignment of the putative propeptide cleavage point in CcMEP1 and its orthologues in

Table 2 – Potential promoter elements for the Coprinopsis cinerea M36 peptidase genes TATA Box tataa

CAAT Box caat

SRE cccct or agggg

CcMEP1 CcMEP2

740 (gataa) 72

955, 962 372, 950

CcMEP3

69

CcMEP4

137, 550

CcMEP5 CcMEP6

109 (tttaa) 115 (gataa) 187 (tattt)

CcMEP7

16

785, 947 372, 464 875, 915 114, 266 791 240, 669 750, 908 287, 487 616 237, 334 354, 626 254, 581

CcMEP8

500, 864

380, 480 540

477

XRE cacgct

MRE tgcgcac 313

136 614

119

183, 492

147, 153 604, 820 341, 593

114, 245

The upstream 1000 nt from the translation start point (= 0) for each gene were examined. SRE, stress response element; XRE, xenobiotic response element; MRE, metal response element.
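A motif scan of the kind summarised in Table 2 can be sketched in Python as follows. The consensus strings are those given in the Table 2 column headers; the coordinate convention (distance of the motif's first base upstream of the translation start), the single-strand search, and the short toy sequence are assumptions made for illustration and may differ from the procedure actually used.

```python
import re

# Consensus motifs as listed in the Table 2 column headers.
MOTIFS = {
    "TATA": ["tataa"],
    "CAAT": ["caat"],
    "SRE": ["cccct", "agggg"],
    "XRE": ["cacgct"],
    "MRE": ["tgcgcac"],
}


def scan_upstream(upstream):
    """Report motif positions as distance (nt) upstream of the translation start.

    `upstream` is the sequence immediately 5' of the start codon, so its last
    base is at distance 1 (assumed convention). Only the given strand is searched.
    """
    seq = upstream.lower()
    found = {}
    for name, patterns in MOTIFS.items():
        positions = []
        for pattern in patterns:
            # A lookahead lets overlapping occurrences be reported as well.
            for match in re.finditer("(?=" + pattern + ")", seq):
                positions.append(len(seq) - match.start())
        found[name] = sorted(positions)
    return found


# Hypothetical 26-nt toy upstream region, not a CcMEP promoter.
toy_upstream = "gcaatcgtataagcacgctcccctga"
result = scan_upstream(toy_upstream)
print(result)
```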

195

Fig 2 – Bayesian tree prepared from amino acid sequence alignments of most known and predicted fungal M36 peptidases (fungalysins) aligned to the profile HMM of Pfam domain Peptidase_M36. Af, Aspergillus fumigatus P46075; Ag, Arthroderma benhamiae (MEP1 = AAQ21069, MEP2 = AAQ21100, MEP3 = AAQ21099, MEP4 = AAQ21101, MEP5 = AAQ21096); Tr, Trichophyton rubrum (MEP1 = AAN03636, MEP2 = AAN03638, MEP3 = AAQ21094, MEP4 = AAN03642, MEP5 = AAN03640; Ag and Tr are anamorphs); Ao, Aspergillus oryzae (AAT68480); Cc, Coprinopsis cinerea (this study, highlighted in bold); Ci, Coccidioides immitis (ABA38725); Cn, Cryptococcus neoformans (AAW45825); Lb, Laccaria bicolor (MEP1 = JGI prediction GWW1.42.8.1, MEP2 = GWW1.7.211.1, MEP3 = GWW1.15.43.1, MEP4 = GWW1.50.26.1); Mc, Microsporum canis (MEP1 = CAD35289, MEP2 = CAD35290, MEP3 = CAD35288, MEP4 = AAQ21098, MEP5 = AAQ21095); Pc, Phanerochaete chrysosporium (MEP1 = JGI prediction GWW2.10.104.1); and Um, Ustilago maydis (XP_762245). The tree was rooted with the bacterial metalloprotease from Vibrio angustum labelled Va_ZP01234320 (ZP_01234320). Af is from the Protein Database; Ag, Ao, Ci, Cn, Mc, Tr, and Um are predicted proteins taken directly from GenBank; and Cc, Lb, and Pc are derived from gene prediction models at the Broad Institute (Cc) or JGI (Lb and Pc), which have been manually annotated by us. Numbers above and below lines indicate ML BS and Bayesian posterior support, respectively. Broad lines indicate significant Bayesian and ML support (> 0.95 Bayesian posterior and > 70 % ML BS) and lines of intermediate thickness indicate significant support from only ML or Bayesian methods.
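The line-thickness rule stated in the Fig 2 caption amounts to a two-threshold classification, sketched below in Python. The cutoffs (> 70 % ML bootstrap, > 0.95 posterior) come from the caption; the label "thin" for branches supported by neither method is an assumed name, and the example support pairs are illustrative values only.

```python
def branch_support(ml_bootstrap, posterior, bs_cutoff=70.0, pp_cutoff=0.95):
    """Classify a branch by the Fig 2 line-thickness rule.

    ml_bootstrap: ML bootstrap support in percent.
    posterior: Bayesian posterior probability (0 to 1).
    """
    ml_significant = ml_bootstrap > bs_cutoff
    bayes_significant = posterior > pp_cutoff
    if ml_significant and bayes_significant:
        return "broad"
    if ml_significant or bayes_significant:
        return "intermediate"
    return "thin"  # assumed label: significant under neither method


# Illustrative (ML BS %, posterior) pairs.
examples = [(98, 1.0), (60, 1.0), (38, 0.90), (16, 0.58)]
labels = [branch_support(bs, pp) for bs, pp in examples]
for (bs, pp), label in zip(examples, labels):
    print(bs, pp, label)
```

Both comparisons are strict, so a branch with exactly 70 % bootstrap or exactly 0.95 posterior does not count as significant under this reading of the caption.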

P. chrysosporium and L. bicolor with the known propeptide cleavage region of A. fumigatus fungalysin is shown in Fig 4A. The propeptides of the remaining seven CcMEPs and A. fumigatus fungalysin have sufficient similarity to produce a robust alignment (Fig 4B). We have used these alignments to predict the end of the propeptide and to predict the properties of the mature enzymes (Table 3). CcMEP1 is the largest proenzyme, and the largest mature M36 protein at 46 kDa, despite having a proportionally larger propeptide than the other seven. CcMEP1 also has the lowest predicted pI of the family. The remaining mature CcMEPs range in size from 37 to 43 kDa, and have widely variant isoelectric points. Multiple alignments of the predicted mature peptides with A. fumigatus fungalysin show several areas of sequence conservation, including the metalloprotease active site motif HEXXH (Fig 5). Seven of the mature proteases (all but CcMEP2) have potential N-glycosylation sites, and all show multiple potential O-β-GlcNAc glycosylation sites. The importance of these potential sites is unclear, as little is known about glycosylation processes in filamentous fungi, and few empirical data exist concerning glycosylation of extracellular proteins from basidiomycetes.

Fig 3 – (A) Sequence alignment of the proposed FTP domain of M36 peptidase CcMEP1 from Coprinopsis cinerea with the FTP domain of Aspergillus fumigatus. (B) Multiple sequence alignment of the proposed FTP domains in the expanded family of M36 peptidases of C. cinereus with the FTP domain of A. fumigatus.
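Locating the HEXXH active site motif mentioned above is a simple pattern search, illustrated by the following Python sketch. The peptide is a hypothetical toy sequence, not a CcMEP protein, and the function name is illustrative.

```python
import re

# H, E, any two residues, H. The lookahead also reports overlapping matches.
HEXXH = re.compile(r"(?=(HE..H))")


def find_hexxh(protein):
    """Return (1-based start position, matched residues) for each HEXXH motif."""
    return [(m.start() + 1, m.group(1)) for m in HEXXH.finditer(protein.upper())]


toy_peptide = "MKAVHEYTHALTNGS"  # hypothetical sequence for illustration
motif_hits = find_hexxh(toy_peptide)
print(motif_hits)
```

A match alone does not establish a zinc metallopeptidase active site; it only flags candidate positions to inspect in an alignment such as the one in Fig 5.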

Discussion

Expanded gene families are common in the basidiomycetes, particularly in Coprinopsis cinerea. C. cinerea shows major expansions of the hydrophobins, the laccase gene family (Kilaru et al. 2006) and the cytochrome P450 family (J.E.S. unpubl.), as well as expansions of other protease families (Lilly et al. unpubl.). In the basidiomycetes for which genome data are available, only C. cinerea and Laccaria bicolor appear to have more than one gene for the M36 fungalysin family, and the entire family itself has received no attention in basidiomycetes. Indeed, in the published study of the Phanerochaete chrysosporium secretome (Vanden Wymelenberg et al. 2006), the single fungalysin gene was omitted because the v2.1 gene model was not complete to the real N-terminus of the protein, thus missing the signal sequence. The gene we have designated CcMEP1 is the most similar to the single-representative fungalysins of the basidiomycetes P. chrysosporium, Cryptococcus neoformans, and Ustilago maydis,

Fig 4 – (A) Sequence alignment of the proposed propeptide cleavage site for Coprinopsis cinerea M36 peptidase CcMEP1 with the known cleavage site for Aspergillus fumigatus fungalysins. (B) Multiple sequence alignment of the proposed propeptide cleavage sites for the expanded family of M36 peptidases of C. cinereus with the known cleavage site for A. fumigatus fungalysins. Arrows indicate the point of cleavage of the propeptide from the mature enzyme.


Table 3 – Properties of the Coprinopsis cinerea M36 family of metalloproteases

        Proenzyme                                               Propeptide                       Mature peptide
Gene    No. of amino acids  Molecular weight  Isoelectric point  Propeptide ends  Signal peptide  No. of amino acids  Molecular weight  Isoelectric point
CcMEP1  777                 85750             5.34               345              1–20            432                 47680             5.27
CcMEP2  632                 69888             5.41               249              1–21            383                 42861             6.48
CcMEP3  602                 65662             4.81               224              1–23            378                 41117             4.68
CcMEP4  580                 63018             5.05               222              1–23            357                 39453             5.44
CcMEP5  599                 65360             6.03               229              1–25            370                 39988             7.25
CcMEP6  616                 66907             5.61               232              1–24            384                 41799             5.40
CcMEP7  577                 62868             5.12               233              1–23            344                 37086             5.12
CcMEP8  601                 66244             5.89               229              1–26            372                 41247             5.83
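The share of each proenzyme taken up by the region preceding the mature enzyme can be computed directly from Table 3, as in this Python sketch. It assumes that "Propeptide ends" gives the residue number of the last propeptide residue, so the fraction includes the signal peptide; names are illustrative only.

```python
# (proenzyme length in residues, residue at which the propeptide ends), from Table 3.
table3 = {
    "CcMEP1": (777, 345), "CcMEP2": (632, 249), "CcMEP3": (602, 224),
    "CcMEP4": (580, 222), "CcMEP5": (599, 229), "CcMEP6": (616, 232),
    "CcMEP7": (577, 233), "CcMEP8": (601, 229),
}


def propeptide_percent(proenzyme_length, propeptide_end):
    """Percent of the proenzyme that precedes the mature enzyme."""
    return 100.0 * propeptide_end / proenzyme_length


percentages = {
    gene: round(propeptide_percent(length, end), 1)
    for gene, (length, end) in table3.items()
}
largest = max(percentages, key=percentages.get)

for gene, value in percentages.items():
    print(gene, value)
print("largest share:", largest)
```

Under this assumption every value falls below 50 %, and CcMEP1 has the proportionally largest pre-mature region.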

and to the type fungalysin of Aspergillus fumigatus. The structure of the NJ tree for the CcMEP1 gene product and the closely related M36s from the other basidiomycetes (Fig 2) is identical to the trees created by combined gene analysis from those organisms (Fitzpatrick et al. 2006) (NB L. bicolor was not included in that study). The other M36 gene family members of C. cinerea and L. bicolor group together on a separate branch, suggesting that they may have arisen as the result of an initial duplication event that occurred within the common ancestor of the Agaricales and, thus, could be limited to that clade. This is partially supported by the fact that the recently published results of the ‘Deep Hypha’ project place P. chrysosporium outside the Agaricales, based on multi-locus sequence analysis (Matheny et al. 2007; Hibbett 2007; Larsson et al. 2007). The pending completion of the genomes of the agarics Schizophyllum commune, Pleurotus ostreatus, and Agaricus bisporus will give us insight into the prevalence of M36 gene expansion in the clade. Additional genome projects in non-agarics, such as Heterobasidion annosum, will help us understand whether the expansion of this gene family is limited to the Agaricales clade.

In our current assembly, 98.7 % of the genome has been included in one of the 13 chromosomes of C. cinereus. It was striking to observe that although CcMEP1 (with similarities to genes in other fungi) maps internally in chromosome XIII, six of the remaining seven C. cinereus MEP genes are located near the ends of four different chromosomes. It is known that transposable elements are concentrated near the telomeres of C. cinerea chromosomes (Stajich et al. 2006), so it was not surprising to observe such repeated elements near seven of the eight CcMEP genes. It will be of interest to determine whether other gene expansions in C. cinerea show similar associations.

The substantial amino acid sequence difference between C. cinerea fungalysins CcMEP2–CcMEP8 and CcMEP1 necessitates justification of the claim that the group is actually a set of M36 peptidases. Members of the peptidase family M36 generally match two Pfam domains, M36 and FTP. Each gene had significant matches to the M36 Pfam, with CcMEP7 (E value = 10⁻¹⁴) being the weakest. Multiple alignments found three regions longer than 20 amino acids in the mature peptides with a high degree of sequence similarity: the active site, a region about 100 amino acids toward the N-terminus from the active site, and a C-terminal domain. Alignment of these gene products to the FTP domain does not provide the same low E

values, partly because the FTP domain is much shorter (ca 50 amino acids), and partly because the FTP domain itself is not particularly well conserved between organisms. Nonetheless, multiple alignments of these gene products to the A. fumigatus FTP domain reveal extensive similarity, suggesting these regions in the C. cinereus putative M36 peptidases are indeed FTP domains. Another hallmark of fungalysin and thermolysin families is the existence of a large propeptide. Again, multiple alignments between members of the C. cinerea putative M36 peptidases show considerable sequence conservation with the experimentally documented A. fumigatus propeptide cleavage site. The resultant CcMEP propeptides are similar in size to each other and to the A. fumigatus propeptide. Taken together these data suggest that these genes comprise a family of M36 peptidases, indicating there is real substrate differentiation among the multiple M36 peptidases of C. cinerea or differential environmental or developmental regulation. In fungi, extracellular proteases have been traditionally ascribed two roles: general proteolysis for providing nutrients, and general proteolysis for softening tissues of host organisms for mycelial expansion (Kalisz et al. 1987; Markaryan et al. 1994). Both are well-documented activities; however, specific roles and substrates of individual proteases in these processes have seldom been established. The indispensability of any given protease in these processes has almost never been demonstrated, although Hoffman & Breuil (2004) did show that disruption of an extracellular serine protease gene resulted in slower growth of the organism on a protein substrate, and in its natural substrate, wood. In animal systems extracellular proteases serve a wide array of functions. 
For example, matrix metalloproteases of mammals, which degrade proteins in the extracellular matrix, are involved in a multitude of developmental processes (Vu & Werb 2000), and are thus implicated in numerous human disease conditions (Overall 2004; Vihinen et al. 2005). In fungi there are tantalizing data that suggest potential roles for extracellular proteolysis beyond generalized degradation. These processes include fungal wall remodelling during both vegetative growth and sexual development (Glass et al. 2004), modification of proteins involved in cell–cell recognition (Walser et al. 2003), evasion of host detection (Hung et al. 2005), and specific modification of the activities of extracellular enzymes. The last of these roles has been recently confirmed

with the discovery of a class of extracellular serine proteases in the basidiomycete Pleurotus ostreatus that specifically modify laccases (Faraco et al. 2005). As virtually nothing is known about the functions, specificity, and regulation of fungal extracellular proteases in general, it is difficult to surmise the roles of the individual M36 fungalysin proteases of C. cinereus. However, the large expansion of this gene family provides an excellent model system for understanding the roles of individual closely related peptidases in the extracellular environment. Application of microarrays to explore differential regulation of the M36 genes in the context of the entire genome, currently being undertaken by our group, offers one avenue of investigation that might indicate differential function. Isolation and biochemical characterization of the individual enzymes, which we are also pursuing, can provide answers regarding substrate specificity, and by inference functions.

Fig 5 – Multiple sequence alignment of the active site regions of the eight-member family of M36 peptidases of Coprinopsis cinerea with the metalloprotease active site region of Aspergillus fumigatus fungalysins.

Acknowledgements

This work was supported by NSF grant number 0412016 to P.J.P., W.W.L. and A.C.G. Additional support was provided to W.W.L. and A.C.G. from the Grants and Research Funding Committee of Southeast Missouri State University. J.E.S. was supported by an NSF graduate fellowship.

Supplementary material

Supplementary material associated with this article can be found in the online version, at doi:10.1016/j.mycres.2007.11.013.

references

Baker D, Shiau AK, Agard DA, 1993. The role of pro regions in protein folding. Current Opinion in Cell Biology 5: 966–970.
Burton KS, Smith JF, Wood DA, Thurston CF, 1997. Extracellular proteinases from the mycelium of the cultivated mushroom Agaricus bisporus. Mycological Research 101: 1341–1347.
Brouta F, Descamps F, Monod M, Vermout S, Losson B, Mignon B, 2002. Secreted metalloprotease gene family of Microsporum canis. Infection and Immunity 70: 5676–5683.
Bendtsen JD, Nielsen H, von Heijne G, Brunak S, 2004. Improved prediction of signal peptides: SignalP 3.0. Journal of Molecular Biology 340: 783–795.
Faraco V, Palmieri G, Festa G, Monti M, Sannia G, Giardina P, 2005. A new subfamily of fungal subtilases: structural and functional analysis of a Pleurotus ostreatus member. Microbiology 151: 457–466.
Fitzpatrick DA, Logue ME, Stajich JE, Butler G, 2006. A fungal phylogeny based on 42 complete genomes derived from supertree and combined gene analysis. BMC Evolutionary Biology 6: 99–104.
Glass NL, Rasmussen C, Roca MG, Reed ND, 2004. Hyphal homing, fusion and mycelial interconnectedness. Trends in Microbiology 12: 135–141.


Hellmich S, Schauz K, 1988. Production of extracellular alkaline and neutral proteases of Ustilago maydis. Experimental Mycology 12: 223–232.
Hibbett DS, 2007 [‘2006’]. A phylogenetic overview of the Agaricomycotina. Mycologia 98: 917–925.
Hoegger PJ, Navarro-Gonzalez M, Kilaru S, Hoffmann M, Westbrook ED, Kues U, 2004. The laccase gene family in Coprinopsis cinerea (Coprinus cinereus). Current Genetics 45: 9–18.
Hoffman B, Breuil C, 2004. Disruption of the Subtilase gene, albin1, in Ophiostoma piliferum. Applied and Environmental Microbiology 70: 3898–3903.
Horton P, Park K-J, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai K, 2007. WoLF PSORT: Protein localization Predictor. Nucleic Acids Research 35 (Suppl 2): W585–W587.
Hummel KM, Inselman AL, Ramos ER, Gathman AC, Lilly WW, 1998. Extracellular protease production by submerged cultures of Schizophyllum commune. Mycologia 90: 883–889.
Hung CY, Seshan KR, Yu JJ, Schaller R, Xue J, Basrur V, Gardner MJ, Cole GT, 2005. A metalloproteinase of Coccidioides posadasii contributes to evasion of host detection. Infection and Immunity 73: 6689–6703.
Jousson O, Lechenne B, Bontems O, Capoccia S, Mignon B, Barblan J, Quadroni M, Monod M, 2004. Multiplication of an ancestral gene encoding secreted fungalysin preceded species differentiation in the dermatophytes Trichophyton and Microsporum. Microbiology 150: 301–310.
Kalisz HM, Wood DA, Moore D, 1987. Production, regulation and release of extracellular proteinase activity in basidiomycete fungi. Transactions of the British Mycological Society 88: 221–227.
Kilaru S, Hoegger PJ, Kües U, 2006. The laccase multi-gene family in Coprinopsis cinerea has seventeen different members that divide into two distinct subfamilies. Current Genetics 50: 45–60.
Kubota K, Nishii W, Kojima M, Takahashi K, 2005. Specific inhibition and stabilization of aspergilloglutamic peptidase by the propeptide. Identification of critical sequences and residues in the propeptide. Journal of Biological Chemistry 280: 999–1006.
Larsson KH, Parmasto E, Fischer M, Langer E, Nakasone KK, Redhead SA, 2007 [‘2006’]. Hymenochaetales: a molecular phylogeny for the hymenochaetoid clade. Mycologia 98: 926–936.
Markaryan A, Lee JD, Sirakova TD, Kolattukudy PE, 1996. Specific inhibition of mature fungal serine proteinases and metalloproteinases by their propeptides. Journal of Bacteriology 178: 2211–2215.
Marie-Claire C, Roques BP, Beaumont A, 1998. Intramolecular processing of prothermolysin. Journal of Biological Chemistry 273: 5697–5701.
Markaryan A, Morozova I, Yu H, Kolattukudy PE, 1994. Purification and characterization of an elastinolytic metalloprotease from Aspergillus fumigatus and immunoelectron microscopic evidence of secretion of this enzyme by the fungus invading the murine lung. Infection and Immunity 62: 2149–2157.
Matheny PB, Curtis JM, Hofstetter V, Aime MC, Moncalvo JM, Ge ZW, Slot JC, Ammirati JF, Baroni TJ, Bougher NL, Hughes KW, Lodge DJ, Kerrigan RW, Seidl MT, Aanen DK, DeNitis M, Daniele GM, Desjardin DE, Kropp BR, Norvell LL, Parker A, Vellinga EC, Vilgalys R, Hibbett DS, 2007 [‘2006’]. Major clades of Agaricales: a multilocus phylogenetic overview. Mycologia 98: 982–995.
Overall CM, 2004. Dilating the degradome: matrix metalloproteinase 2 (MMP-2) cuts to the heart of the matter. Biochemical Journal 383: E5–E7.
Ronquist F, Huelsenbeck JP, 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19: 1572–1574.
Seitz LC, Tank KL, Cummings WJ, Zolan MJ, 1996. The rad9 gene of Coprinus cinereus encodes a proline-rich protein required for meiotic chromosome condensation and synapsis. Genetics 142: 1105–1117.


Stamatakis A, 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22: 2688–2690.
Stajich JE, Birren B, Burns C, Casselton LA, Dietrich F, Fargo DC, Gathman AC, James TY, Kamada T, Lilly WW, Ma L-J, Muraguchi H, Palmerini H, Rehmeyer C, Wilke S, Zolan M, Pukkila PJ, 2006. Genomic analysis of Coprinus cinereus. In: Proceedings of the International Symposium on Mushroom Science. Akita Prefectural University, pp. 59–74.
Tang B, Nirasawa S, Kitaoka M, Marie-Claire C, Hayashi K, 2003. General function of N-terminal propeptide on assisting protein folding and inhibiting catalytic activity based on observations with a chimeric thermolysin-like protease. Biochemical and Biophysical Research Communications 301: 1093–1098.


Vanden Wymelenberg A, Minges P, Sabat G, Martinez D, Aerts A, Salamov A, Grigoriev I, Shapiro H, Putman N, Belinky P, Dorsoretz C, Gaskell J, Kersten P, Cullen D, 2006. Computational analysis of the Phanerochaete chrysosporium v2.0 genome database and mass spectrometry identification of peptides in ligninolytic cultures reveal complex mixtures of secreted proteins. Fungal Genetics and Biology 43: 343–356.
Vihinen P, Ala-aho R, Kahari VM, 2005. Matrix metalloproteinases as therapeutic targets in cancer. Current Cancer Drug Targets 5: 203–220.
Vu TH, Werb Z, 2000. Matrix metalloproteinases: effectors of development and normal physiology. Genes and Development 14: 2123–2133.
Walser PJ, Velagapudi R, Aebi M, Kues U, 2003. Extracellular matrix proteins in mushroom development. Recent Research in Developmental Microbiology 7: 381–415.