MANSC: a seven-cysteine-containing domain present in animal membrane and extracellular proteins


Update


TRENDS in Biochemical Sciences


Vol.29 No.4 April 2004


Jinhu Guo1, Shuai Chen1, Chaoqun Huang1, Li Chen1, David J. Studholme2, Shouyuan Zhao1 and Long Yu1

1 State Key Laboratory of Genetic Engineering, School of Life Sciences, Fudan University, Handan Road 220, Shanghai 200433, China
2 The Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, CB10 1SA, UK

MANSC (motif at N terminus with seven cysteines) is a novel domain with a well-conserved seven-cysteine motif that is present at the N terminus of membrane and extracellular proteins, including low-density lipoprotein receptor-related protein 11 (LRP-11), hepatocyte growth factor activator inhibitor 1 (HAI-1) and some uncharacterized proteins encoded by multicellular animals from Mollusca to Chordata. We postulate that the MANSC domain in HAI-1 might function through binding with hepatocyte growth factor activator and matriptase.

Low-density lipoprotein receptor-related protein 11 (LRP-11) belongs to the large family of low-density lipoprotein receptor-related proteins (LRPs) [1,2]. Hepatocyte growth factor (HGF) activator inhibitor type 1 [HAI-1; also named Homo sapiens serine protease inhibitor Kunitz type 1 (SPINT1)] is a specific inhibitor of HGF activator [3]. HAI-1 and its cognate, matriptase, comprise a newly characterized extracellular matrix-degrading protease system that functions as an epithelial membrane activator for other proteases and latent growth factors [4]. We report a new domain that is present in LRP-11, HAI-1 and some other uncharacterized animal membrane and extracellular proteins.

Corresponding author: Long Yu ([email protected]).

Identification and characterization of the MANSC domain

We have found that the N terminus of the protein LRP-11 (amino acid residues 98–184 in BAB55257) comprises a cysteine-rich region that, according to searches against the Pfam [2] and SMART [5] databases, does not match any previously identified domain. A PSI-BLAST [6] search against the non-redundant protein database (http://www.ncbi.nlm.nih.gov/blast/), using an inclusion threshold of 0.005, revealed homology to several proteins. The first iteration retrieved LRP-11 (e.g. BAB55257 with E value 2e-42) and HAI-1 proteins (e.g. Q9R097 with E value 3e-10). The second iteration retrieved additional proteins with significant E values, including mouse protein 9130403P13Rik (AAH39930 with E value 2e-5) and Macaca fascicularis hypothetical protein (BAB46892 with E value 0.001). Human unnamed protein BAA91526 (E value 2e-17) was retrieved in the third iteration, and on subsequent iterations the results converged. Further PSI-BLAST searches using AAH32998.1 as the probe retrieved several more proteins with the same conserved seven-cysteine pattern, although the E values were lower than the threshold (i.e. <0.005), for example, AAQ22567.1, NP055624.1, XP341528.1 and BAC65528.1. In total, 16 distinct proteins were identified from these searches and aligned using ClustalX [7] with manual editing. We named this domain MANSC, for motif at N terminus with seven cysteines, because of its composition and location


[Figure 1 image not reproduced. Proteins drawn: LRP-11 (BAB55257.1), unknown protein (XP317605), HAI-1B (AAP44001.1), unknown protein (AAH32998.1), CG7565-P (NP648171.1) and KIAA0319 (NP055624). Key: signal peptide, MANSC, PKD, LDLa, KU, EGF, EGF-like, Defensin_2, transmembrane. Scale bar: 100 aa.]

Figure 1. Domain architecture of the MANSC (motif at N terminus with seven cysteines) domain-containing proteins according to searches against the SMART and Pfam databases [2,5]. Domains shown are: defensin-2, arthropod defensin family (Pfam: PF01097) and transmembrane region; EGF, epidermal growth factor-like domain (Smart: SM00181); EGF-like, EGF domain unclassified subfamily (Smart: SM00001); KU, BPTI/Kunitz family of serine protease inhibitors (Smart: SM00131); LDLa, low-density lipoprotein receptor domain class A (Smart: SM00192); MANSC; PKD, repeats in polycystic kidney disease 1 and other proteins (Smart: SM00089). The MANSC domain has been deposited in the Pfam database (accession number PF07502).

in proteins (Figure 1). The MANSC domain appears to be restricted to animals; no significant matches with microbial or plant genomes were found using TBLAST (http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi/). However, TBLASTN searches (BLOSUM62) against the expressed sequence tags (ESTs) division indicated that the


domain is conserved in a range of higher multicellular animals, including Mollusca, Arthropoda and Chordata. Five classes of protein contain single copies of the MANSC domain (Figure 2): (i) proteins with unknown function containing a signal peptide and transmembrane region; (ii) LRP-11 and similar proteins with signal

JPRED               ------EEE--HHHH---------------HHHHHH--------------EEEEE---------------EEEEE---------------------
AAH32998.1_Ho   33  -SLEDVVIDIQSSLSKGIR-GNEPIYTSTQEDCINSCCSTKNISGDKACNLMIFDTRKTARQPN------CYLFFC--PNEEACPLKPAKGLMSY-  117
BAB46892.1_Ma   33  -SLEDVVIDIQSSLSKGIR-GNEPIYTSTQEDCINSCCSTKIISGDKACNFMIFDTRKIARRPN------CYLFFC--PNEEACPLKPAKGLRSY-  117
AAH39930.1_Mu   32  -SLEDVVIDIQSSLSKGIR-GNEPIHLATQEDCIGACCSTKDIAGDKACNLMIFDTRKTDRQPN------CYLFFC--PSEDACPLKPAKGLVTY-  116
BAC33492.1_Mu   98  -AVPDTIIRTQDSIAAGASFLRAPGSVRGWRQCVTACCS------EPSCSVAVVQLPRGPSVPAPMPAPRCYLFNCTARGRSVCKFAPLRGYRTY-  184
XP238072.2_Ra  154  -AVPDTIIRTQDSIAAGASFLRAPGTVRGWRQCVAACCS------EPSCSVAVVQLPRGPSVPSPVPAPRCYLFNCTARGRSVCKFAPLRGYRTY-  241
BAB55257.1_Ho   98  SAMPDAIIRTKDSLAAGASFLRAPAAVRGWRQCVAACCS------EPRCSVAVVELPRRPAPPAAVLG--CYLFNCTARGRNVCKFALHSGYSSY-  184
PSIPRED             ----CCCCCCCHHHHHHHHHCCC-CCCCCHHHHHHHHCC------CCCCCEEEEEC-CCCCE--------EEEEECCCCCCCEEEEEECCCCCC--
PROF                ----CEEEEECCCCCCCCCCCCC-CCCCCHHHHHCCCCC------CCCCCEEEEEC-CCCCE--------EEEEECCCCCCCCCEEEECCCCCC--
XP317605.1_An   76  ----NTIIRTEESRSMGARFLDD-ADLNSREQCLRLCCE------TENCDVFVFEE-KSPGT--------CFLFQCGPPENFRCKFTRHSNYTS--  149
NP609451.1_Dr   56  ----NTIIRTGESQAIGGKYLQG-IELDTIEECERLCCE------TDACDVYIFER-KAGGY--------CYLFECGPPEDFRCKFTRHANYTS--  129
Q9R097_Mu       50  SGVPSFVLDTEASVSNGATFLGSPTARRGW-DCVRSCCT------TQNCNLALVELQPDRGEDAIS---ACFLMNCLYEQNFVCKFAPKEGFINY-  134
XP230470.2_Ra   50  SGVPAFVLDTEASVSNGATFLGSPTVHRGW-DCVRACCT------TQNCNLALVELQPDGGEDAIS---ACFLMNCLYEQNFVCKFAPKDGFINY-  134
AAP44001.1_Ho   56  AGVPGFVLDTNASVSNGATFLESPTVRRGW-DCVRACCT------TQNCNLALVELQPDRGEDAIA---ACFLINCLYEQNFVCKFAPREGFINY-  140
AAH53239.1_Da   40  -GKEDFVLNTDESVKEGATFLGSPQVSKPE-DCVMACCN------DPNCNLALMEHRED--PKSIN---TCFIINCLYKQKKVCHFVRKKGFTNY-  121
XP341528.1_Ra   30  -TYSDAIISPN---LESIRIMRVSHTFSVG-DCTAACCD------LPSCDLAWWFE---GS---------CYLVNCMRPEN--CEPRTTGPIRSYL  101
BAC65528.1_Mu   32  -TYSDAIISPN---PETIRIMRVSQTFSVG-DCTAACCD------LLTCDLAWWFE---GS---------CYLVKCMRSEN--CEPRTTGPIRSYL  103
NP055624_Hu     41  -ETTRIMRVSHTFPV---------------VDCTAACCD------LSSCDLAWWFE---GR---------CYLVSCPHKEN--CEPKKMGPIRSYL  101
AAQ22567.1_Dr   84  PPPDAVEPLEEEAYL---------------WNCLQACCEKPR-NGSSACNVVLVFK---AK---------CYHIRC--QSNEACLPKLRVRMPNEK  149
Consensus/80%       .s..shllp.p.t.s.thp.h......ps..pCh.tCCs......p.sCslhlhbb..............CaLbpC...pp..Cb.....sb.sY.

Figure 2. Representative alignment of MANSC (motif at N terminus with seven cysteines) domains. Sequences were aligned with ClustalX [7] and edited manually. Five classes of protein contain single copies of the MANSC domain: (i) proteins with unknown function containing a signal peptide and transmembrane region; (ii) low-density lipoprotein receptor-related protein 11 (LRP-11) and similar proteins with signal peptide, polycystic kidney disease (PKD), low-density lipoprotein receptor domain class A (LDLa) and transmembrane region; (iii) uncharacterized proteins with signal peptide, LDLa and transmembrane region; (iv) hepatocyte growth factor activator inhibitor 1 (HAI-1) proteins with signal peptide, two tandem repeats of Kunitz domains, LDLa and transmembrane region; (v) other uncharacterized proteins with epidermal growth factor-like domain (EGF; Smart: SM00181), epidermal growth factor domain unclassified family (EGF-like; Smart: SM00001), defensin-2, arthropod defensin family (Pfam: PF01097), PKD, signal peptide and transmembrane region. Sequences shown (species name followed by protein name) are: AAH32998.1_Ho, Homo sapiens, hypothetical protein FLJ10298; BAB46892.1_Ma, Macaca fascicularis, hypothetical protein; AAH39930.1_Mu, Mus musculus, 9130403P13Rik protein; BAC33492.1_Mu, M. musculus, unnamed protein; XP238072.2_Ra, Rattus norvegicus, similar to Kunitz-type protease inhibitor 1 precursor; BAB55257.1_Ho, H. sapiens, unnamed protein; XP317605.1_An, Anopheles gambiae, ENSANGP00000009941; NP609451.1_Dr, Drosophila melanogaster, CG6495-PA; Q9R097_Mu, M. musculus, Kunitz-type protease inhibitor 1; XP230470.2_Ra, R. norvegicus, similar to serine protease inhibitor Kunitz type 1; AAP44001.1_Ho, H. sapiens, hepatocyte growth factor activator inhibitor 1B; AAH53239.1_Da, Danio rerio, unknown; XP341528.1_Ra, R. norvegicus, similar to mKIAA0319 protein; BAC65528.1_Mu, M. musculus, mKIAA0319 protein; NP055624_Hu, H. sapiens, KIAA0319 gene product; AAQ22567.1_Dr, D. melanogaster, GH22222p. The alignment was colored and the 80% consensus sequence of the domain calculated using the CHROMA tool [20]. Capital letters represent amino acids. Lower-case letters: a, aromatic; b, big; h, hydrophobic; l, aliphatic; p, polar; s, small; t, tiny. A secondary structure of this alignment profile was predicted using Jpred [21], and the secondary structure of XP317605.1 was predicted by both PSIPRED [22] and PROF [23]. The consensus motif deduced for the MANSC domain can be summarized as follows: x4[D/x]x[I/V][I/x][D/x]x3[S/x]x[S/x]x[G/I]x1–5[P/x]x6–7[D/x]C[V/x]x[S/A]CC[S/x][T/x]x1–7C[N/x][L/V][A/x]x6–18C[Y/F]L[F/x][N/x]Cx2[E/x]x1–4Cx3[P/x]x2[G/x]x2[S/x]Yx, where 'x' is any residue with range indicated. This multiple sequence alignment (alignment number ALIGN_000638) has been deposited with the European Bioinformatics Institute (ftp://ftp.ebi.ac.uk/pub/databases/embl/align/).
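The residue numbering used in the main text can be cross-checked against the alignment. The following is a minimal Python sketch and not part of the original analysis. It strips the gap characters from two rows of Figure 2, converts the cysteine positions of AAH32998.1 into residue numbers using the start coordinate 33, and compares them with the DIpro pairs reported in the main text. The function names (`ungapped`, `residue_numbers`, `end_coordinate`) are illustrative, and the two rows are transcribed by hand on the assumption that their final alignment column is a gap.

```python
from typing import List


def ungapped(aligned_row: str) -> str:
    """Remove alignment gap characters ('-') from one alignment row."""
    return aligned_row.replace("-", "")


def residue_numbers(aligned_row: str, start: int, residue: str = "C") -> List[int]:
    """Residue numbers of every `residue` in the row.

    `start` is the residue number of the first non-gap character.
    """
    return [start + i for i, aa in enumerate(ungapped(aligned_row)) if aa == residue]


def end_coordinate(aligned_row: str, start: int) -> int:
    """Residue number of the last non-gap character in the row."""
    return start + len(ungapped(aligned_row)) - 1


# Rows transcribed from Figure 2 (assumption: the last alignment column is a gap).
AAH32998_ROW = (
    "-SLEDVVIDIQSSLSKGIR-GNEPIYTSTQEDCINSCCSTKNISGDKACNLMIFDTRKTARQPN"
    "------CYLFFC--PNEEACPLKPAKGLMSY"
)
BAB55257_ROW = (
    "SAMPDAIIRTKDSLAAGASFLRAPAAVRGWRQCVAACCS------EPRCSVAVVELPRRPAPPAAVLG"
    "--CYLFNCTARGRNVCKFALHSGYSSY"
)

# Cysteine residue numbers in AAH32998.1 (row starts at residue 33).
cys = residue_numbers(AAH32998_ROW, start=33)
print(cys)  # [63, 67, 68, 79, 95, 100, 106]

# Disulfide pairs as reported from the DIpro analysis in the main text.
dipro_pairs = [(100, 106), (63, 79), (68, 95)]
paired = {pos for pair in dipro_pairs for pos in pair}
unpaired = [pos for pos in cys if pos not in paired]
print(unpaired)  # [67]

# End coordinates implied by the start coordinates and ungapped lengths.
print(end_coordinate(AAH32998_ROW, 33), end_coordinate(BAB55257_ROW, 98))  # 117 184
```

Under these assumptions the seven cysteines of AAH32998.1 fall at residues 63, 67, 68, 79, 95, 100 and 106, so the three predicted bridges leave only Cys67 unpaired. For these two rows, the computed end coordinates (117 and 184) agree with those shown beside the alignment.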


peptide, repeats in polycystic kidney disease 1 (PKD1) and other proteins (Smart: SM00089), low-density lipoprotein receptor domain class A (LDLa; Smart: SM00192) and transmembrane region; (iii) uncharacterized proteins with signal peptide, LDLa and transmembrane region; (iv) HAI-1 proteins with signal peptide, two tandem repeats of Kunitz domains, LDLa and transmembrane region; (v) other uncharacterized proteins with epidermal growth factor-like domain (EGF; Smart: SM00181), epidermal growth factor-like domain unclassified subfamily (EGF-like; Smart: SM00001), defensin-2, arthropod defensin family (Pfam: PF01097), PKD, signal peptide and transmembrane region. All of the MANSC-containing proteins contain predicted transmembrane regions and signal peptides. LRP-11 and HAI-1 are integral membrane proteins, although HAI-1 is also secreted in milk [8].

It is possible that some of the cysteine residues in the MANSC domain form structurally important disulfide bridges. Analysis with DIpro (http://contact.ics.uci.edu/bridge.html) indicated a good probability of intramolecular disulfide bridges between residues 100 and 106, 63 and 79, and 68 and 95 in AAH32998.1, with an accuracy of ~83%. These three pairs account for six of the seven conserved cysteines, which suggests that the remaining one, Cys67, might not be involved in disulfide bond formation (Figure 2).

Functional postulations

HGF is a pleiotropic factor that functions as a mitogen, motogen and/or morphogen for various cells. It is secreted as an inactive precursor that requires proteolytic activation for its role in the HGF-induced signaling pathway. HGF activator (HGFA) is a factor XII-like serine proteinase that is essential for the activation of HGF [9]. HAI-1 contains two extracellular Kunitz domains that are known to inhibit trypsin-like serine proteases, including HGFA and matriptase. After proteolytic cleavage, MANSC is located in the mature and secreted HAI-1 peptide [3]. HAI-2, an analog of HAI-1, was also identified and found to inhibit HGFA in vitro.
In contrast to HAI-1, HAI-2 lacks the MANSC and LDLa domains [10]. HAI-1 can form a complex with both active HGFA and matriptase [9,11]. Denda et al. [12] have demonstrated that the Kunitz domains in HAI-1, especially the N-terminal Kunitz domain, are mainly responsible for inhibition of both HGFA and trypsin in vitro. Their results also show that a deletion construct of HAI-1 that contains only MANSC inhibits HGFA, albeit at a low level [12]. Therefore, we propose that MANSC and/or LDLa might also play a part in complex formation. Both HGFA and matriptase contain trypsin-like serine protease domains (Tryp_SPc; Smart: SM00020), which could be the potential binding regions for MANSC. Furthermore, the weak inhibitory effect of the region suggests that MANSC could carry out a different role.

Concluding remarks

Based on the functions of HGF and HAI-1, MANSC might be involved in development [13], regeneration after tissue injury [14,15], Alzheimer's disease [16,17], and tumor invasion and metastasis [4,18,19]. Most other MANSC-containing proteins, including LRP-11, are poorly characterized. Understanding of the MANSC domain will further the characterization of these proteins.


Acknowledgements

We thank Qing Yi (University of Arkansas for Medical Sciences, Arkansas, USA), Dianqing Wu (University of Connecticut Health Center, Connecticut, USA) and anonymous referees for critical reviewing of the article. This work was supported by the National 973 Program, National Natural Science Foundation and Graduate Innovation Foundation of Fudan University (No. CQH1322014), China.

References

1 Strausberg, R.L. et al. (2002) Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proc. Natl. Acad. Sci. U. S. A. 99, 16899–16903
2 Bateman, A. et al. (2002) The Pfam protein families database. Nucleic Acids Res. 30, 276–280
3 Shimomura, T. et al. (1997) Hepatocyte growth factor activator inhibitor, a novel Kunitz-type serine protease inhibitor. J. Biol. Chem. 272, 6370–6376
4 Oberst, M. et al. (2001) Matriptase and HAI-1 are expressed by normal and malignant epithelial cells in vitro and in vivo. Am. J. Pathol. 158, 1301–1311
5 Letunic, I. et al. (2002) Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res. 30, 242–244
6 Altschul, S.F. et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402
7 Higgins, D.G. et al. (1988) CLUSTAL: a package for performing multiple sequence alignment on a microcomputer. Gene 73, 237–244
8 Lin, C.Y. et al. (1999) Purification and characterization of a complex containing matriptase and a Kunitz-type serine protease inhibitor from human milk. J. Biol. Chem. 274, 18237–18242
9 Kataoka, H. et al. (2000) Hepatocyte growth factor activator inhibitor type 1 is a specific cell surface binding protein of hepatocyte growth factor activator (HGFA) and regulates HGFA activity in the pericellular microenvironment. J. Biol. Chem. 275, 40453–40462
10 Kawaguchi, T. et al. (1997) Purification and cloning of hepatocyte growth factor activator inhibitor type 2, a Kunitz-type serine protease inhibitor. J. Biol. Chem. 272, 27558–27564
11 Benaud, C.M. et al. (2002) Deregulated activation of matriptase in breast cancer cells. Clin. Exp. Metastasis 19, 639–649
12 Denda, K. et al. (2002) Functional characterization of Kunitz domains in hepatocyte growth factor activator inhibitor type 1. J. Biol. Chem. 277, 14053–14059
13 Yamauchi, M. et al. (2002) Expression of hepatocyte growth factor activator inhibitor type 2 (HAI-2) in human testis: identification of a distinct transcription start site for the HAI-2 gene in testis. Biol. Chem. 383, 1953–1957
14 Kataoka, H. et al. (2000) Localization of hepatocyte growth factor activator inhibitor type 1 in Langhans cells of human placenta. Histochem. Cell Biol. 114, 469–475
15 Douglas, D. et al. (2002) Increase in the beta chain of hepatocyte growth factor (HGFβ) precedes c-met expression after bleomycin-induced lung injury in the rat. Exp. Lung Res. 28, 301–314
16 Yamada, T. et al. (1998) White matter astrocytes produce hepatocyte growth factor activator inhibitor in human brain tissues. Exp. Neurol. 153, 60–64
17 Ma, S.L. et al. (2002) Low-density lipoprotein receptor-related protein 8 (apolipoprotein E receptor 2) gene polymorphisms in Alzheimer's disease. Neurosci. Lett. 332, 216–218
18 Kobayashi, H. et al. (2003) The protease inhibitor bikunin, a novel anti-metastatic agent. Biol. Chem. 384, 749–754
19 Kobayashi, H. et al. (2003) Kunitz-type protease inhibitor, bikunin, inhibits ovarian cancer cell invasion by blocking the calcium-dependent transforming growth factor-β1 signaling cascade. J. Biol. Chem. 278, 7790–7799
20 Goodstadt, L. et al. (2001) CHROMA: consensus-based colouring of multiple alignments for publication. Bioinformatics 17, 845–846
21 Cuff, J.A. et al. (1998) Jpred: a consensus secondary structure prediction server. Bioinformatics 14, 892–893
22 McGuffin, L.J. et al. (2000) The PSIPRED protein structure prediction server. Bioinformatics 16, 404–405
23 Ouali, M. et al. (2000) Cascaded multiple classifiers for secondary structure prediction. Protein Sci. 9, 1162–1176

0968-0004/$ - see front matter © 2004 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibs.2004.02.007