Crystal Structure of Cucumisin, a Subtilisin-Like Endoprotease from Cucumis melo L.


Journal of Molecular Biology (2012) 423, 386–396. http://dx.doi.org/10.1016/j.jmb.2012.07.013


Kazutaka Murayama 1,2, Miyuki Kato-Murayama 2, Toshiaki Hosaka 2, Ami Sotokawauchi 3, Shigeyuki Yokoyama 2, Kazunari Arima 3* and Mikako Shirouzu 2*

1 Division of Biomedical Measurements and Diagnostics, Graduate School of Biomedical Engineering, Tohoku University, Sendai 980-8575, Japan
2 RIKEN Systems and Structural Biology Center, Yokohama Institute, Yokohama 230-0045, Japan
3 Department of Chemistry and Bioscience, Faculty of Science, Kagoshima University Graduate School of Science and Engineering, Kagoshima 890-0065, Japan

Received 24 May 2012; received in revised form 13 July 2012; accepted 17 July 2012
Available online 24 July 2012
Edited by R. Huber

Keywords: plant serine protease; subtilisin-like fold; multidomain protein; thermostability

Cucumisin is a plant serine protease, isolated as an extracellular glycoprotein from the melon fruit Cucumis melo L. var. Prince. Cucumisin is composed of multiple domain modules, including catalytic, protease-associated, and fibronectin-III-like domains. The crystal structure of cucumisin was determined by the multiwavelength anomalous dispersion method and refined at 2.75 Å resolution. A structural homology search indicated that the catalytic domain of cucumisin shares structural similarity with subtilisin and subtilisin-like fold enzymes. According to the Z-score, the highest structural similarity is with tomato subtilase 3 (SBT3), with an rmsd of 3.5 Å for the entire region. Dimer formation mediated by the protease-associated domain is a distinctive structural characteristic of SBT3; in contrast, analytical ultracentrifugation indicated that cucumisin is mainly monomeric in solution. Although the locations of the amino acid residues composing the catalytic triad are well conserved between cucumisin and SBT3, a disulfide bond is uniquely located near the active site of cucumisin. The steric environment of the active site with this disulfide bond is distinct from that of SBT3, and it contributes to the substrate preference of cucumisin, especially at the P2 position. Among the plant serine proteases, the thermostability of cucumisin is higher than that of its structural homologue SBT3, as determined by their melting points. A structural comparison between cucumisin and SBT3 revealed that cucumisin possesses less surface area and shortened loop regions. Consequently, the higher thermostability of cucumisin is attributed to its more compact structure.

© 2012 Elsevier Ltd. All rights reserved.

*Corresponding authors. M. Shirouzu is to be contacted at RIKEN Systems and Structural Biology Center, 1-7-22 Suehirocho, Tsurumi, Yokohama 230-0045, Japan; K. Arima, Department of Chemistry and Bioscience, Faculty of Science, Kagoshima University Graduate School of Science and Engineering, 1-21-35 Korimoto, Kagoshima 890-0065, Japan.

Abbreviations used: DFP, diisopropyl fluorophosphate; MAD, multiwavelength anomalous dispersion; PDB, Protein Data Bank.

Structure of Plant Serine Protease


Introduction

Cucumisin was discovered in melon fruit, Cucumis melo L. var. Prince, 1 as an extracellular glycoprotein and was the first serine protease isolated from plants. Most of the proteases from natural plant sources are classified as cysteine-type endopeptidases, which require reductants and chelating reagents to maintain their activity. Serine proteases do not require these reagents and, thus, are useful in the food industry. As a member of the serine protease family, cucumisin was inhibited strongly by diisopropyl fluorophosphate (DFP), and the residues in the catalytic triad were identified. 2,3 Analyses of its hydrolysis rates for various lengths of peptides revealed that the subsite length of cucumisin is S5–S3′. 4 The substrate specificity of cucumisin was investigated using synthetic oligopeptides. Cucumisin showed broad specificity at the P1 position; however, it did not cleave the C-terminal side of Gly, Ile, and Pro and preferred Leu, Asn, Gln, Thr, and especially Met. Substrates with Pro at the P2 position are preferentially cleaved. 4 Moreover, cucumisin does not cleave the N-terminal side of Val but prefers Gly, Ala, Lys, and Ser.

A BLAST search 5 for similar sequences deposited in the Protein Data Bank (PDB), using the catalytic domain of cucumisin, revealed many enzymes with subtilisin-like folds. The highest similarity was assigned to a subtilase from tomato (Solanum lycopersicum) (SBT3), which possesses a subtilisin-like catalytic domain, a protease-associated (PA) domain as an insertion of the catalytic domain, and a C-terminal fibronectin (Fn)-III-like domain (Fig. 1). Cucumisin was expected to adopt the same domain arrangement as SBT3. An analysis of the substrate specificity of SBT3 revealed a preference for Gln and Lys in the P1 and P2 positions, respectively, for oligopeptide substrates. 6 The substrate specificity at the P1 position thus appears to be shared, whereas that at the P2 position differs between cucumisin and SBT3.

These two plant serine proteases (cucumisin and SBT3) have similar characteristics, such as significant stability at elevated temperatures and under alkaline conditions. 7 A structural analysis of SBT3 revealed that the PA domain plays a critical role in homodimerization and that the thermostability is independent of calcium binding. 8 On the other hand, the molecular bases of cucumisin's functions are poorly understood. Structural information is required to discuss the molecular bases of the substrate specificity and thermostability of cucumisin, the first isolated plant serine protease. In the present study, we determined the crystal structure of cucumisin by the multiwavelength anomalous dispersion (MAD) method at 2.75 Å resolution. The crystal structure of cucumisin revealed profound differences in the substrate recognition site structure and the proline locations, as compared with SBT3, in spite of the similarity in their overall folding. The substrate specificity and thermostability of cucumisin are discussed, based on the crystal structure.

Fig. 1. Domain organizations and architectures of cucumisin, SBT3 (tomato subtilase), and BPN′ (subtilisin BPN′).

Results

Overall structure

The crystal structure of cucumisin was determined by the MAD method and refined at 2.75 Å resolution. The crystal structure contains two molecules in the asymmetric unit, and their molecular structures are almost identical (Fig. 2a). The rmsd value obtained by superimposing the common Cα atoms between mol-A and mol-B is 0.671 Å. The mature form of cucumisin begins at Thr111. The amino acid residues of cucumisin were traced from Thr111 to Leu730 in the electron density maps for both molecules, except for the extreme C-terminal residue, Val731, in both molecules and the loop regions (599–627 for mol-A and 602–627 for mol-B), which had unclear electron densities due to disorder and partial degradation (see Protein purification). The degraded region has been assigned to the autocleavage site. 9 The cucumisin structure consists of three domains: the catalytic domain, the PA domain inserted into the catalytic domain, and the C-terminal Fn-III-like domain (Fig. 2b). The catalytic domain adopts a subtilisin-like fold, consisting of a highly twisted seven-stranded parallel β-sheet,

Fig. 2. Ribbon representations of the molecular structure of cucumisin. The catalytic, PA, and Fn-III-like domains are colored green, orange, and blue, respectively. (a) Molecules in the crystallographic asymmetric unit (mol-A and mol-B). (b) Monomer structure of cucumisin. DFP is depicted by a stick model.

flanked by nine α-helices. The PA domain is inserted between Met330 and Leu465 and includes three α-helices and nine β-strands. The third domain, the Fn-III-like domain, is located adjacent to α10 of the catalytic domain with a flexible linker region. The Fn-III-like domain of cucumisin forms an eight-stranded β-sandwich structure. The interactions between the catalytic domain and the Fn-III-like domain involve many hydrogen bonds and hydrophobic contacts with residues on β24, β25, and β31 of the Fn-III-like domain. The buried surface area between the catalytic domain and the Fn-III-like domain is 1480 Å².

Among the nine cysteines in the sequence of the mature enzyme, the crystal structure revealed that six cysteines form three disulfide bonds: Cys166–Cys174, Cys245–Cys250, and Cys380–Cys397. An unidentified loop in the electron density maps also includes two cysteines. To identify the disulfide bond formation between these cysteines, we conducted SDS-polyacrylamide gel electrophoresis in the presence and absence of a reducing reagent in the SDS sample buffer (see Materials and Methods). As a result, the bands migrated at different positions. Without a reducing reagent, the main band position was estimated to be 66 kDa, corresponding to full-length cucumisin. On the other hand, the main band appeared at 53 kDa with a reducing reagent, suggesting degradation of the protein. It can be concluded that cucumisin forms a disulfide bond between Cys602 and Cys621.

The prosequence of cucumisin is processed, after synthesis, as an inactive precursor. 9 After processing, the new N-terminal residue is Thr111, which is buried in the protein core. In addition to the hydrophobic contacts between the

methyl groups of Thr111 and Thr112, the N-terminal region is stabilized by three hydrogen bonds [Thr111(Oγ)–Arg231(O), Thr112(Oγ)–Gly226(O), and Thr112(Oγ)–Arg332(Nη1)] and a salt bridge between Arg113 and Asp331. As posttranslational modifications, N-linked oligosaccharides are bound to Asn466 and Asn652, although the electron density of the oligosaccharide on Asn652 of mol-A could not be identified due to disorder. These oligosaccharide sites are consistent with those previously reported. 10

Active-site structure

The serine protease family possesses a catalytic triad that generally consists of serine, histidine, and aspartic/glutamic acid. 11 These residues correspond to Ser525, His204, and Asp140 in cucumisin (Fig. 3). To prevent self-degradation, we included DFP during the purification process. In the crystal structure, DFP is bound to Ser525 with a covalent bond between the serine oxygen and the phosphorus atom of DFP, in a tetrahedral geometry. Due to this unnatural ligand, a hydrogen bond between His204 and Ser525 in the active site is alternatively formed between DFP and Ser525 in mol-B, whereas the hydrogen bond between Ser525 and His204 in mol-A is maintained. Therefore, the steric influence of DFP binding is considered to be limited to this local area.

Structure similarity search

A structural homology search with the Dali server 12 indicated that cucumisin shares structural

Fig. 3. Active-site environment. (upper panel) The amino acid residues of the catalytic triad are shown in cyan (Asp140, His204, and Ser525). The disulfide bond between Cys245 and Cys250 is colored yellow. DFP, covalently bonded to Ser525, is shown in magenta. (lower panel) Stereo representation of the 2Fo−Fc electron density, in the same view as the upper panel. The electron density is contoured at 1.0 σ. Labeled residues: Asp140, Thr141, Asn202, His204, Cys245, Asn247, Asp248, Cys250, Ser251, Asp252, Ser273, Val274, Gly275, Gly276, Ala277, Asn278, Asn307, Gly308, Thr524, and Ser525.

similarity with subtilisin and enzymes with subtilisin-like folds. The highest similarity was assigned to tomato subtilase 3 (SBT3) (PDB ID: 3I74), with a Z-score of 50.9 and an rmsd of 3.5 Å for the entire region. Subtilisin BPN′ (1LW6) also shares high structural similarity (Z-score = 36.3 and rmsd = 1.9 Å) in the catalytic domain. Figure 4 shows the structure-based sequence alignment between cucumisin, SBT3, and subtilisin BPN′. The sequence identities of these enzymes with the catalytic domain of cucumisin are 40% (for SBT3) and 30% (for subtilisin BPN′). SBT3 possesses the same domain architecture as cucumisin, and each domain superimposed with low rmsd values for the common Cα atoms: the catalytic domain (1.43 Å), the PA domain (2.08 Å), and the Fn-III-like domain (1.14 Å). However, the relative position of the PA domain is rearranged toward the active site (Fig. 5), with a 12-Å distance between the Cα atoms of the corresponding amino acids on the end of α8 [Ser450 (cucumisin) and Asn458 (SBT3)]. The structural analysis of SBT3 suggested that this PA domain plays an important role for inter-domain interactions in homodimer formation. In addition, a β-hairpin structure flanking the PA domain also contributes to dimerization stability. 8 In the crystal structure of cucumisin, the PA domain does not participate in inter-domain interactions, although the crystal contains two molecules in the asymmetric unit. In addition, the loop region between Gly510 and Arg513, which

Secondary structure elements in this block: α1, α2, β1
Cucu  TTRSWDFLGFPLT--VPRRSQVESNIVVGVLDTGIWPESPSFDDEGFSPPPPKWKGTCETSNN---FRC 174
SBT3  TTHTSDFLKLNPSSGLWPASGLGQDVIVAVLDSGIWPESASFQDDGMPEIPKRWKGICKPGTQFNASMC 181
BPN′  VPYGVSQIKAPAL---HSQGYCGSNVKVAVIDSGIDSSHPDLKVA------------------------ 45

Secondary structure elements in this block: β2, α3, β3, β4
Cucu  NRKIIGARSYHIGR----PISPGDVNGPRDTNGHGTHTASTAAGGLVSQANLYGLGLGTARGGVPLARI 239
SBT3  NRKLIGANYFNKGILANDPTVNITMNSARDTDGHGTHCASITAGNFAKGVSHFGYAPGTARGVAPRARL 250
BPN′  ---------------GGASMVPSETNPFQDNNSHGTHVAGTVAALNN---------SIGVLGVAPCASL 90

Secondary structure elements in this block: β5, β6, β7, β8, α4, β9, α5
Cucu  AAYKVCW-NDGCSDTDILAAYDDAIADGVDIISLSVGGANPRHYFVDAIAIGSFHAVERGILTSNSAGN 307
SBT3  AVYKFSF-NEGTFTSDLIAAMDQAVADGVDMISISYGYRFIPLY-EDAISIASFGAMMKGVLVSASAGN 317
BPN′  YAVKVLGADGSGQYSWIINGIEWAIANNMDVINMSLGGPSG----SAALKAAVDKAVASGVVVVAAAGN 155

Secondary structure elements in this block: β10, β11, β12, β13, β14
Cucu  GGPNFFTTAS----LSPWLLSVAASTMDRKFVTQVQIGNGQSFQGVSINTF--DNQYYPLVSGRDIPNT 370
SBT3  RGPGIGSLNN----GSPWILCVASGHTDRTFAGTLTLGNGLKIRGWSLFPARAFVRDSPVIYNK----- 377
BPN′  EGTSGSSSTVGYPAKYPSVIAVGAVDS------------------------------------------ 182

Secondary structure elements in this block: β15, α6, β16, α7, β17
Cucu  GFDKSTSRFCTDKSVNPNLLKGKIVVCE-ASFGPHEFFKSLDG-AAGVLMTSN--TRDYADSYPLPSSV 435
SBT3  ---TLSDCSSEELLSQVENPENTIVICDDNGDFSDQMRIITRARLKAAIFISEDPGVFRSATFPNPGVV 443
BPN′  ---------------------------------------------------------------------

Secondary structure elements in this block: α8, β18, β19, β20, β21
Cucu  LDPNDLLATLRYIYSIRSPGATIFKSTTIL-NASAPVVVSFSSRGPNRATKDVIKPDISGPGVEILAAW 503
SBT3  VNKKEGKQVINYVKNSVTPTATITFQETYLDTKPAPVVAASSARGPSRSYLGISKPDILAPGVLILAAY 512
BPN′  ---------------------------------SNQRASFSSVGP--------ELDVMAPGVSIQSTLP 210

Secondary structure elements in this block: β22, α9, α10
Cucu  PSVAPVGGI----RRNTLFNIISGTSMSCPHITGIATYVKTYNPTWSPAAIKSALMTTASPMNAR---- 564
SBT3  PPNVFATSIGTNILLSTDYILESGTSMAAPHAAGIAAMLKAAHPEWSPSAIRSAMMTTADPLDNTRKPI 581
BPN′  ---------------GNKYGAYSGTSMASPHVAGAAALILSKHPNWTNTQVRSSLENTTTKL------- 257

Secondary structure elements in this block: α11, α12, β23
Cucu  -----FNPQAEFAYGSGHVNPLKAVRPGLVYDANESDYVKFLCGQGYNTQAVRRITGDYSACTSGNTGR 628
SBT3  KDSDNNKAATPLDMGAGHVDPNRALDPGLVYDATPQDYV-NLLCSLNFTEEQFKTIARSSASHNCSNPS 649
BPN′  --------GDSFYYGKGLINVQAAAQ 275

Secondary structure elements in this block: β24, β25, β26, β27, β28
Cucu  VWDLNYPSFGLSVS---PSQTFNQYFNRTLTSVAPQASTYRAMISAPQGLTISVNPNVLSFNGLGDRKS 694
SBT3  A-DLNYPSFIALYSIEGNFTLLEQKFKRTVTNVGKGAATYKAKLKAPKNSTISVSPQILVFKNKNEKQS 717

Secondary structure elements in this block: β29, β30, β31
Cucu  FTLTVRGSI--KGFVVSASLVWSD--GVHYVRSPITITSLV 731
SBT3  YTLTIRYIGDEGQSRNVGSITWVEQNGNHSVRSPIVTSPIIEVW 761

Fig. 4. Sequence alignment of cucumisin (Cucu), SBT3, and subtilisin BPN′. Secondary structure elements are shown by green ellipsoids (α-helices) and blue bars (β-strands). The PA domains of cucumisin and SBT3 are colored orange. Cysteines forming a disulfide bond are linked by thin lines. Amino acids in the catalytic triad are colored red. Sugar binding residues are highlighted with a green background.
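Several residue numbers quoted in the text can be cross-checked directly against the cucumisin rows of Fig. 4. The following is only a sketch under stated assumptions: the rows are transcribed from the alignment as printed, numbering is assumed to start at Thr111 of the mature enzyme, the helper names are hypothetical, and the sequon rule (Asn-X-Ser/Thr with X not Pro) is the general consensus rather than a method described by the authors.

```python
import re

# Cucumisin rows transcribed from Fig. 4; alignment gaps ("-") are stripped below.
CUCUMISIN_ROWS = [
    "TTRSWDFLGFPLT--VPRRSQVESNIVVGVLDTGIWPESPSFDDEGFSPPPPKWKGTCETSNN---FRC",
    "NRKIIGARSYHIGR----PISPGDVNGPRDTNGHGTHTASTAAGGLVSQANLYGLGLGTARGGVPLARI",
    "AAYKVCW-NDGCSDTDILAAYDDAIADGVDIISLSVGGANPRHYFVDAIAIGSFHAVERGILTSNSAGN",
    "GGPNFFTTAS----LSPWLLSVAASTMDRKFVTQVQIGNGQSFQGVSINTF--DNQYYPLVSGRDIPNT",
    "GFDKSTSRFCTDKSVNPNLLKGKIVVCE-ASFGPHEFFKSLDG-AAGVLMTSN--TRDYADSYPLPSSV",
    "LDPNDLLATLRYIYSIRSPGATIFKSTTIL-NASAPVVVSFSSRGPNRATKDVIKPDISGPGVEILAAW",
    "PSVAPVGGI----RRNTLFNIISGTSMSCPHITGIATYVKTYNPTWSPAAIKSALMTTASPMNAR----",
    "-----FNPQAEFAYGSGHVNPLKAVRPGLVYDANESDYVKFLCGQGYNTQAVRRITGDYSACTSGNTGR",
    "VWDLNYPSFGLSVS---PSQTFNQYFNRTLTSVAPQASTYRAMISAPQGLTISVNPNVLSFNGLGDRKS",
    "FTLTVRGSI--KGFVVSASLVWSD--GVHYVRSPITITSLV",
]

FIRST_RESIDUE = 111  # assumption: mature enzyme numbering starts at Thr111
SEQ = "".join(CUCUMISIN_ROWS).replace("-", "")


def residue(number):
    """One-letter code at a given residue number (mature-enzyme numbering)."""
    return SEQ[number - FIRST_RESIDUE]


def positions(aa):
    """Residue numbers of every occurrence of one amino acid."""
    return [i + FIRST_RESIDUE for i, c in enumerate(SEQ) if c == aa]


def sequons():
    """Asn positions of putative N-glycosylation sequons, Asn-X-Ser/Thr (X != Pro)."""
    return [m.start() + FIRST_RESIDUE for m in re.finditer(r"N(?=[^P][ST])", SEQ)]


if __name__ == "__main__":
    print("length:", len(SEQ))
    print("catalytic triad:", residue(140), residue(204), residue(525))
    print("cysteines:", positions("C"))
    print("prolines:", SEQ.count("P"))
    print("sequons:", sequons())
```

Under these assumptions, the transcribed sequence reproduces the length of the mature enzyme, the catalytic triad positions, the number of cysteines and prolines, and the three putative N-glycosylation sites mentioned in the text, which gives some confidence that the alignment rows were recovered intact.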

corresponds to the β-hairpin structure in SBT3, does not form a special secondary structure element. The Fn‐III-like domain interacts similarly with the catalytic domain in the structures of both cucumisin and SBT3, while the domain linker region is significantly different. In the structure of cucumisin, this domain linker is highly flexible and remained unidentified in the electron density map, as mentioned above. Secondary structure prediction by PSIPRED 13 indicated that only a short helix and a strand were assigned for six amino acids of cucumisin in this region. The corresponding region in SBT3 forms two α-helices and a disulfide bond, Cys624– Cys645. This structural difference between cucumisin and SBT3 might be caused by a packing effect. Of the three disulfide bonds in the cucumisin structure, two of them (Cys166–Cys174 and

Cys380–Cys397) exist in corresponding positions in the SBT3 structure. The disulfide bond between Cys602 and Cys621 in cucumisin (corresponding to the Cys624–Cys645 disulfide bond in SBT3) may also exist, although it was not identified in the crystal structure. On the other hand, the Cys245– Cys250 disulfide bond is found only in cucumisin. Intriguingly, this disulfide bond is located in the active‐site region (Fig. 3). As expressed from a natural plant source, both enzymes have posttranslational modifications with N-linked oligosaccharides. There are three putative N-glycosylation sites (Asn-X-Ser/Thr) in cucumisin and eight in SBT3. In cucumisin, two glycosylation sites were identified from the crystal structure. On the other hand, five glycosylation sites were identified from the crystal structure and mass spectrometry in

SBT3: Asn177, Asn203, Asn376, Asn697, and Asn745. 6 The glycosylation sites of the two enzymes do not overlap.

Multimerization in solution

In order to determine the oligomerization state of cucumisin in solution, we performed analytical ultracentrifugation experiments. The sedimentation equilibrium analysis revealed that the measured molecular mass of cucumisin was 86.1 kDa. The calculated value for the cucumisin monomer is 66.3 kDa. Assuming monomer/dimer equilibrium in solution, we interpret this result as equilibrium between 70% monomer and 30% dimer, suggesting that the main component in solution is the monomer. Given this 30% dimer fraction, the association of the two molecules observed in the crystal, which are related by a non-crystallographic 2-fold axis, may reasonably be regarded as the dimer form. The contact area between these two molecules in the asymmetric unit is approximately 1200 Å². The structural analysis of SBT3 revealed that the main component in solution is the dimer, 8 and the PA domain and the β-hairpin play a critical role in dimerization. In cucumisin, the dimer is a minor form, and the manner in which it is formed may be very different from that of SBT3.

Thermostability measurement

The thermostability of cucumisin was measured by differential scanning fluorimetry. 14,15 The melting point of cucumisin was Tm = 81 °C (Fig. 6). Although metal ions are one of the common factors for thermostability in many thermozymes, 16 no calcium binding site was found in the crystal structure of cucumisin. In addition, the enzymatic activity of cucumisin does not require calcium ions. 1,17 SBT3 shows calcium-independent thermostability, and its melting point is Tm = 71.5 °C. 8 Thus, cucumisin has a higher melting point than its structural homologue, SBT3. Dimerization is a unique characteristic of SBT3, as a member of the subtilisin-like fold enzyme family, and it may contribute to the thermostability. On the other hand, the thermostability of cucumisin may be accomplished in a different manner.
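The monomer/dimer interpretation given under Multimerization in solution can be reproduced with a short calculation. This is a minimal sketch, assuming that the mass from sedimentation equilibrium behaves as a weight-average over an ideal monomer/dimer mixture (an assumption made here for illustration; the fitting model is not detailed in the text), and the function names are hypothetical.

```python
def dimer_weight_fraction(m_observed, m_monomer):
    """Weight fraction of dimer in an ideal monomer/dimer mixture.

    For weight fractions w_m + w_d = 1 the weight-average mass is
    M_w = w_m * M1 + w_d * 2 * M1, hence w_d = M_w / M1 - 1.
    """
    return m_observed / m_monomer - 1.0


def weight_average_mass(m_monomer, w_dimer):
    """Weight-average mass for a given dimer weight fraction."""
    return (1.0 - w_dimer) * m_monomer + w_dimer * 2.0 * m_monomer


M_OBSERVED = 86.1  # kDa, sedimentation equilibrium (from the text)
M_MONOMER = 66.3   # kDa, calculated monomer mass (from the text)

w_dimer = dimer_weight_fraction(M_OBSERVED, M_MONOMER)
back_calculated = weight_average_mass(M_MONOMER, 0.30)

if __name__ == "__main__":
    print(f"dimer weight fraction: {w_dimer:.3f}")
    print(f"mass expected for 70% monomer / 30% dimer: {back_calculated:.2f} kDa")
```

With the values quoted above, the dimer weight fraction comes out close to 0.30, in line with the stated 70% monomer and 30% dimer.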

Discussion

Substrate specificity

To evaluate the substrate binding sites, we built a model by superimposing the SBT3 inhibitor (Ac-Phe-Glu-Lys-Ala-cmk) onto cucumisin (Fig. 7). The model structure indicated the putative subsites (S4–S1 sites) for cucumisin from the inhibitor–protein contacts. The S1 site accommodates the alanine side chain of the inhibitor and retains additional volume to accept a larger amino acid residue. This is consistent with the P1 position preferences of cucumisin determined

Fig. 5. Overlaid structures of cucumisin and SBT3. The structures were superimposed on their catalytic domains. Cucumisin and SBT3 are colored blue and light brown, respectively. Ribbon models of the PA domains are enlarged on the right.

Fig. 6. The plot of differential scanning fluorimetry measurements. The melting curves of samples with the protein (red) and without the protein (blue) are shown. Axes: fluorescence intensity (a.u.) versus temperature (°C; ticks at 50, 75, and 100). a.u., arbitrary unit.

with synthetic oligopeptides in the previous study, which indicated that cucumisin prefers Leu, Asn, Gln, Thr, and Met for the P1 position. 4 The P2 position of the SBT3 inhibitor is Lys. The putative S2 site of cucumisin is distinct from that of SBT3. The S2 site of cucumisin is composed of Val274, Cys245, and Cys250. These cysteines form a disulfide bond, and no corresponding bond is present in either SBT3 or subtilisin BPN′. Cucumisin prefers proline for the S2 site. 4 The steric circumstances of the P2 site seem to be reasonable to accommodate the cyclic structure of proline by van der Waals contacts. On the other hand, the position corresponding to Val274 in

cucumisin is Tyr285, which occupies a large volume in the S2 site in SBT3. SBT3 prefers lysine for the S2 site, and its side chain is directed toward the exterior of the substrate binding site. Comparing the steric circumstances between cucumisin and SBT3, cucumisin might be able to accept another amino acid at the S2 position, but SBT3 could not accommodate proline because of the steric hindrance from tyrosine. The S2 pocket structure of cucumisin is unique, in that it can accommodate proline at the P2 position. The subsite length of cucumisin was determined as S5–S3′, from the hydrolysis rates of synthetic peptides. 4 Although the SBT3 inhibitor covers only the P1–P4 positions, the putative subsite S3–S5 may include Gly249–Asp252, Gly275–Ala277, and Val284–Ala286. These regions contain many small amino acid residues, such as glycine and alanine, allowing the formation of a wide-spaced cleft. These sites seem to be less specific for substrates because of their structural environments. However, two aspartic acids, Asp252 and Asp285, provide an overall negative charge to the binding region. In fact, the calculated electrostatic potential indicated an area of negative charges in the S3–S5 sites (Fig. 8). Thus, it is quite possible that cucumisin has a preference for basic amino acids at these positions. The preferences of the cucumisin subsites are currently being investigated. Although the S1′– S3′ sites presently cannot be correctly assigned due to the lack of a model structure, the active‐site channel is restrained by the approximately 90° kink in the PA domain, 8 which is located adjacent to the active site. The movement of the PA domain and the

Fig. 7. Modeled structure between cucumisin and SBT3 inhibitor. The inhibitor (Ac-Phe-Glu-Lys-Ala-cmk) is colored pink. Labeled residues: Asp140, His204, Cys245, Gly249, Cys250, Asp252, Val274, Gly275, Ala277, Val284, Asp285, Ala286, and Ser525; inhibitor positions P1–P4 are indicated.

Fig. 8. Electrostatic potential distribution of cucumisin (surface model). The molecular surface is colored blue (positively charged) to red (negatively charged). The location of the active site is indicated by the circle. The catalytic, PA, and Fn-III-like domains and residues Asp252 and Asp285 are labeled.

β-hairpin might play an important role for inactivation. 8 Indeed, the shift of the PA domain toward the active site is observed in cucumisin (Fig. 5). In addition, in cucumisin, the region including Trp503–Leu517, corresponding to a β-hairpin in SBT3, is not a β-hairpin but a loop structure. However, in cucumisin, these structural components seem to be involved in forming the active-site channel rather than in interfering with substrate access to the binding site. Consequently, the substrate preference of cucumisin is achieved by the structures of the subsite pockets, especially the P2 site, and the conformational change of the PA domain.

Protein autolysis stability

Cucumisin has a lower autolysis rate, as compared with trypsin and papain. 18 Cucumisin shows broad substrate specificity, but proline is preferred at the P2 position, as described above. For synthetic oligopeptides, the hydrolysis rate for Suc-Ala-Pro-Ala-pNa is 30× greater than that for Suc-Ala-Ala-Ala-pNa. 7 The mature form of cucumisin contains 42 prolines. A calculation of the surface-exposed areas for these prolines revealed that the most exposed proline is Pro192, with a calculated surface-exposed area of 118.3 Å². In fact, Pro192 exists in a loop region and is fully exposed to the solvent in the crystal structure; also, it has a relatively high temperature factor (50.2 Å² for the Cα atom). Prolines with greater than half of the maximum exposed area are candidate sites for autolysis. Among the 42 prolines in cucumisin, 8 may actually be autolysis candidates. Moreover, cucumisin does not digest the Pro–Pro bond. Excluding this proline,

the total number of well-exposed prolines is seven. This result suggested that the preferential autolysis sites (proline at P2 position and surface exposed) in cucumisin are very limited. This limitation may be the reason for its lower autolysis rate.

Thermostability

A previous report mentioned that calcium ions do not contribute to the thermostability of SBT3, 8 although SBT3 shares structural similarity with subtilisin BPN′, which exhibits calcium-ion-dependent thermostability. 19 As plant serine protease homologues, neither cucumisin nor SBT3 possesses a calcium binding site. The melting points of these two plant serine proteases are notably high and metal independent. In particular, the melting point of cucumisin is approximately 10 °C higher than that of SBT3. It is not easy to identify the specific factors contributing to thermostability because numerous influences are possible. 16,20,21 However, several factors may contribute to the thermostability of cucumisin, as compared with SBT3.

One of the important factors for the thermostability of cucumisin is the short loops. The amino acid lengths of the mature enzymes are 621 residues for cucumisin and 649 residues for SBT3. A structure-based sequence alignment (Fig. 4) suggested that the difference in the C-terminal end is only three amino acids, beginning at a common N-terminal position. Consequently, relative to SBT3, cucumisin has 13 gap regions in the alignment (36 residues in total). In contrast, SBT3 has only four gap regions (11 residues) relative to cucumisin (Fig. 4). Most of these gap regions exist on loop regions. In addition, the total surface-exposed area of these proteins with respect to the monomer state is 22,950 Å² for cucumisin and 24,990 Å² for SBT3, calculated as the average value of the two molecules in the asymmetric unit. These results indicated that cucumisin is more compact than SBT3. In general, thermophilic proteins are more compact and have shorter loops, as compared to their mesophilic homologues. 22 Shortened loops can contribute to the lower entropy of an unfolded state and, thus, stabilize the folded state. 23

Furthermore, an analysis of the proline locations revealed the different distribution of prolines between cucumisin and SBT3. Of the 42 prolines in cucumisin, 3 are located in secondary structure elements (α-helices and β-strands), except for the N-terminal positions of these elements, while in SBT3, 7 of the 41 prolines are located in secondary structure elements. Proline residues in secondary structure elements disrupt hydrogen bond networks and destabilize local structures. 24 On the other hand, there are 35 and 30 prolines in cucumisin and SBT3, respectively, which are not involved in secondary structure elements. These prolines are included in loop regions and restrict loop mobility. They may also reduce the entropy of the unfolded state of proteins. 25 Consequently, the enhanced thermostability of cucumisin relative to SBT3 is accomplished through an entropy effect, by the shortened and proline-containing loops.
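The numbers used in the compactness argument above can be checked against each other with simple arithmetic. The sketch below is illustrative only: the values are those quoted in the text, the first residue of mature SBT3 (113) is an assumption taken from the residue numbers in Figs. 1 and 4, and the dictionary keys are hypothetical names.

```python
# Values quoted in the text; "first" for SBT3 is an assumption (see lead-in).
CUCUMISIN = {"first": 111, "last": 731, "gap_residues": 36,
             "surface_area": 22950.0, "pro_total": 42, "pro_in_sse": 3}
SBT3 = {"first": 113, "last": 761, "gap_residues": 11,
        "surface_area": 24990.0, "pro_total": 41, "pro_in_sse": 7}


def mature_length(protein):
    """Number of residues from the first to the last residue, inclusive."""
    return protein["last"] - protein["first"] + 1


len_cucu = mature_length(CUCUMISIN)
len_sbt3 = mature_length(SBT3)

# Residues "missing" in cucumisin according to the alignment gaps alone.
gap_balance = CUCUMISIN["gap_residues"] - SBT3["gap_residues"]
# What remains should be the C-terminal difference mentioned in the text.
c_terminal_difference = (len_sbt3 - len_cucu) - gap_balance

# Relative reduction in surface-exposed area of cucumisin versus SBT3.
area_reduction = 1.0 - CUCUMISIN["surface_area"] / SBT3["surface_area"]

# Fraction of prolines located inside secondary structure elements.
pro_sse_cucu = CUCUMISIN["pro_in_sse"] / CUCUMISIN["pro_total"]
pro_sse_sbt3 = SBT3["pro_in_sse"] / SBT3["pro_total"]

if __name__ == "__main__":
    print(f"mature lengths: {len_cucu} vs {len_sbt3}")
    print(f"gap balance: {gap_balance}, C-terminal difference: {c_terminal_difference}")
    print(f"surface area reduction: {100 * area_reduction:.1f}%")
    print(f"prolines in secondary structure: {100 * pro_sse_cucu:.1f}% vs {100 * pro_sse_sbt3:.1f}%")
```

Under these assumptions, the gap counts and the three-residue C-terminal difference account exactly for the 28-residue difference in length, and the surface-exposed area of cucumisin is roughly 8% smaller than that of SBT3.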

Materials and Methods

Protein purification

Cucumisin was purified by the method reported by Uchikoba et al. 7 The N-terminus of the mature enzyme starts at Thr111, consistent with previously reported data. 9 The lyophilized enzyme powder was dissolved in 50 mM Hepes buffer (pH 7.0), containing 0.05 M NaCl, 1 mM dithiothreitol (DTT), and a 10-fold molar excess of DFP. The protein solution was loaded onto a HiTrap SP column (GE Healthcare) and eluted by a linear gradient of 0.05–0.4 M NaCl. Fractions containing the protein were pooled and applied to a Superdex 200 HR column (GE Healthcare) equilibrated with 50 mM Hepes buffer (pH 7.0) containing 150 mM NaCl and 1 mM DTT. The peak fractions were collected, and the solution was concentrated to 20 mg/ml with a Centricon filter (Millipore). The sample assessed by SDS-polyacrylamide gel electrophoresis indicated three bands [66 kDa (full length), 53 kDa, and 13 kDa] despite the DFP treatment. In the electrophoresis without reducing reagent in the SDS sample buffer, the band intensity of the full-length protein increased significantly, indicating the existence of an intramolecular disulfide bond (between the 53- and 13-kDa fragments) and partial degradation.

Crystallization and data collection

Cucumisin was crystallized by the sitting-drop vapor-diffusion method. The drops were composed of 1.0 μl of protein solution and 1.0 μl of reservoir solution [0.1 M imidazole buffer (pH 6.5) containing 0.8 M sodium acetate]. The crystals grew to approximate dimensions of 0.2 mm × 0.05 mm × 0.05 mm in 6 weeks. To prepare the heavy-atom derivative, we soaked the crystals in a solution containing 1 mM K2PtCl6 for 1 day. Data collection was performed using crystals that had been transferred into Paratone-N as a cryoprotectant for 1 min, before flash-cooling in a nitrogen stream at 110 K. The reflection data sets were collected at three wavelengths [1.06978 Å (peak), 1.07195 Å (edge), and 1.05375 Å (high remote)] to 3.1 Å resolution on the beamline NW12 at the Photon Factory (Tsukuba, Japan). The native data sets were collected on beamlines NW12 at the Photon Factory and BL32XU at SPring-8 (Harima, Japan). The final refinement data were used to 2.75 Å resolution. All diffraction data sets were integrated and scaled with the HKL2000 program suite. 26

Structure determination and refinement

The crystal structure of cucumisin was determined by the MAD method. The determination of the platinum sites and calculation of the MAD phases were accomplished with the program SOLVE 27 to 3.5 Å resolution, and four platinum positions were identified. The resulting electron density map (figure of merit: 0.42) was considerably improved by density modification and phase extension to 3.1 Å resolution with the program RESOLVE 27 (figure of merit: 0.66). Two models (mol-A and mol-B) in an asymmetric unit were built manually into the electron density map, using the program O. 28 The structure was refined with the program CNS, 29 using native data at 2.75 Å resolution. Since the last residue of the C-terminus (Val731) could not be identified in the electron density map for both mol-A and mol-B, these residues were excluded from the coordinates. This yielded the final structure, including the carbohydrate moieties (N-acetyl glucosamine) modifying residues Asn466 and Asn652 in mol-B.
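As a side note on the three MAD wavelengths quoted under Crystallization and data collection, they can be converted to photon energies to see how they are placed around the platinum absorption edge. This is a minimal sketch using the standard relation E = hc/λ with hc ≈ 12.3984 keV·Å; the names are hypothetical, and the comparison with the Pt L-III edge (tabulated near 11.56 keV) is a general remark rather than a statement from the authors.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant times speed of light, in keV * Å

# Wavelengths (Å) as quoted in the text for the Pt MAD experiment.
MAD_WAVELENGTHS = {
    "peak": 1.06978,
    "edge": 1.07195,
    "high remote": 1.05375,
}


def photon_energy_kev(wavelength_angstrom):
    """Photon energy in keV for an X-ray wavelength given in Å."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


energies = {name: photon_energy_kev(wl) for name, wl in MAD_WAVELENGTHS.items()}

if __name__ == "__main__":
    for name, energy in sorted(energies.items(), key=lambda item: item[1]):
        print(f"{name:12s} {energy:.3f} keV")
```

The resulting order (edge below peak below high remote in energy) is the arrangement expected for a three-wavelength MAD data set collected around a single absorption edge.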
The final model was assessed by PROCHECK in the CCP4 suite. 30 For mol-A, the Ramachandran plot revealed that 80.7% of the residues are in the most favored regions, with 18.7% in the additionally allowed regions and 0.8% in the generously allowed regions. The current model yielded Rwork and Rfree values of 0.214 and 0.262, respectively. The data collection and refinement statistics are summarized in Table 1. The ribbon representations in the figures were rendered by PyMOL, 31 and the molecular surface in Fig. 8 was created with MolFeat (FiatLux, Japan).

Ultracentrifugation

The protein samples were purified as described above, and the buffer was exchanged by gel-filtration chromatography to 50 mM Hepes buffer (pH 7.0), 150 mM NaCl, and 1 mM tris(2-carboxyethyl)phosphine. Sedimentation equilibrium experiments were performed in a Beckman Optima XL-I centrifuge with six-channel centerpieces, using a Beckman An-50Ti rotor. The protein was loaded into the cells at concentrations of 0.56, 0.28, and 0.19 mg/ml. Equilibrium distributions were analyzed after 16 h of centrifugation at each of 7000, 8000, and 9000 rpm, at 4 °C. For the molecular weight


Table 1. Crystal parameters, data collection and refinement statistics

Crystal characteristics
  Space group                       R3
  Unit cell parameters (Å)          a=149.483, b=149.483, c=218.035
  Molecules per asymmetric unit     2

                                    Pt MAD data                                               Native
                                    Peak               Edge               Remote (high)
  Wavelength (Å)                    1.06978            1.07195            1.0375             0.9000
  Resolution range (Å)              50.00–3.00         50.00–3.00         50.00–3.00         50.00–2.75
  Redundancy                        4.5                4.5                4.5                4.6
  Unique reflections                36,596             36,597             36,370             46,877
  Completeness (%)                  100.0 (100.0) a    100.0 (100.0)      100.0 (100.0)      100.0 (100.0)
  I/σ(I)                            14.0 (3.2)         13.9 (2.5)         13.0 (2.2)         12.1 (2.9)
  Rsym b                            0.106 (0.496)      0.111 (0.605)      0.116 (0.736)      0.158 (0.591)
  Figure of merit                   0.42 (after solvent modification)

Refinement statistics
  Resolution range (Å)              44.64–2.75
  Unique reflections                46,860
  R-factor/Rfree c                  0.214/0.262
  No. of protein atoms              8938
  No. of water molecules            206
  rmsd from ideal geometry
    Bond lengths (Å)                0.007
    Bond angles (°)                 1.40
  Average isotropic B-value (Å2)    49.3

a Numbers in parentheses refer to the highest-resolution shell.
b Rsym = ∑_h ∑_i |I_i(h) − 〈I(h)〉| / ∑_h ∑_i I_i(h).
c R-factor = ∑_h ||Fo| − |Fc|| / ∑_h |Fo|. Free R-factor was calculated using 5% of reflections omitted from refinement.
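The footnote definitions of Rsym and the R-factor can be illustrated with a short Python sketch. This is only a minimal illustration of the two formulas, not the procedure implemented in HKL2000 or CNS. The function names `r_sym` and `r_factor` and the toy numbers are hypothetical, and the reflections are assumed to have been mapped to unique indices h beforehand.

```python
from collections import defaultdict


def r_sym(observations):
    """Rsym = sum_h sum_i |I_i(h) - <I(h)>| / sum_h sum_i I_i(h).

    `observations` is an iterable of (h, intensity) pairs, where h is a
    hashable unique-reflection index (assumed already symmetry-reduced).
    """
    groups = defaultdict(list)
    for h, intensity in observations:
        groups[h].append(intensity)

    numerator = 0.0
    denominator = 0.0
    for intensities in groups.values():
        mean_intensity = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def r_factor(f_obs, f_calc):
    """R = sum_h ||Fo| - |Fc|| / sum_h |Fo| over paired amplitudes."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Toy values for illustration only (not data from this study).
toy_observations = [
    ((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
    ((0, 1, 0), 50.0), ((0, 1, 0), 50.0),
]
toy_r_sym = r_sym(toy_observations)
toy_r = r_factor([10.0, 20.0, 30.0], [9.0, 22.0, 30.0])
```

Applying `r_factor` only to the 5% of reflections withheld from refinement would give the free R-factor in the same way.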

analysis, a partial specific volume of 0.723 cm³/g and a solution density value of 1.01009 g/cm³ were used.

Differential scanning fluorimetry

The thermal stability of cucumisin was investigated by thermal denaturation in the presence of SYPRO orange (Invitrogen), using a 7300 Fast Real-Time PCR system (Applied Biosystems). Reactions were performed in 96-well plates, in a 50-μl volume containing cucumisin (2 μM), 50 mM Hepes buffer (pH 7.5), 150 mM NaCl, 1 mM DTT, and freshly diluted SYPRO orange (1:1000, v/v). The fluorescence emission was collected using a ROX filter, with an excitation wavelength of 488 nm. The fluorescence intensity was monitored while the temperature was increased in 1 °C increments from 20 to 99 °C (80 min). The melting temperature (Tm) was identified from the midpoint of the melting curve.

Accession numbers

Coordinates and structure factors have been deposited in the PDB with accession number 3VTA.
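As a rough illustration of how a midpoint Tm could be read from a fluorescence melting curve such as the one described in the fluorimetry section, the Python sketch below finds the temperature at which the signal crosses halfway between its pre-transition minimum and its maximum. The text does not specify how the midpoint was computed, so the half-height crossing with linear interpolation is an assumption here; the function name `melting_temperature` and the synthetic curve are hypothetical and are not data from this study.

```python
import math


def melting_temperature(temps, fluorescence):
    """Estimate Tm as the half-height crossing of the unfolding transition.

    Assumption: the transition runs from the lowest signal recorded before
    the fluorescence maximum up to that maximum; Tm is linearly interpolated
    between the two readings that bracket the half-height value.
    """
    if len(temps) != len(fluorescence) or len(temps) < 2:
        raise ValueError("temps and fluorescence must have equal length >= 2")

    # Index of the fluorescence maximum (end of the unfolding transition).
    i_max = max(range(len(fluorescence)), key=fluorescence.__getitem__)
    # Index of the lowest signal at or before the maximum (baseline).
    i_min = min(range(i_max + 1), key=fluorescence.__getitem__)
    half = 0.5 * (fluorescence[i_min] + fluorescence[i_max])

    for i in range(i_min, i_max):
        f0, f1 = fluorescence[i], fluorescence[i + 1]
        if f1 > f0 and f0 <= half <= f1:
            fraction = (half - f0) / (f1 - f0)
            return temps[i] + fraction * (temps[i + 1] - temps[i])
    raise ValueError("no rising transition found in the melting curve")


# Synthetic two-state curve sampled in 1 degree steps from 20 to 99 degrees C,
# mirroring the temperature ramp in the text (illustrative values only).
temps = [float(t) for t in range(20, 100)]
signal = [1.0 / (1.0 + math.exp(-(t - 70.0) / 2.0)) for t in temps]
tm_example = melting_temperature(temps, signal)
```

Real curves often show a sloping baseline and a post-peak decline, so fitting a sigmoidal model or taking the maximum of the first derivative may be preferable in practice.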

Acknowledgements

This work was supported by the Targeted Proteins Research Program from the Ministry of Education, Culture, Sports, Science and Technology, Japan. The synchrotron radiation experiments were performed at SPring-8 and the Photon Factory (Proposal: 2011S2-005). We would like to thank the beamline staff at BL32XU of SPring-8 and NW12 of the Photon Factory Advanced Ring for assistance during data collection. We also thank Naomi Ohbayashi (Iwaki Meisei University) for assistance with the differential scanning fluorimetry measurements, and Ryogo Akasaka for assistance with the ultracentrifugation measurements. We also acknowledge the support of the Biomedical Research Core of Tohoku University Graduate School of Medicine.

References

1. Kaneda, M. & Tominaga, N. (1975). Isolation and characterization of a proteinase from the sarcocarp of melon fruit. J. Biochem. 78, 1287–1296.
2. Kaneda, M., Ohmine, H., Yonezawa, H. & Tominaga, N. (1984). Amino acid sequence around the reactive serine of cucumisin from melon fruit. J. Biochem. 95, 825–829.
3. Yonezawa, H., Uchikoba, T. & Kaneda, M. (1995). Identification of the reactive histidine of cucumisin, a plant serine protease: modification with peptidyl chloromethyl ketone derivative of peptide substrate. J. Biochem. 118, 917–920.
4. Yonezawa, H., Kaizuka, H., Uchikoba, T., Arima, K. & Kaneda, M. (2000). Substrate specificity of cucumisin on synthetic peptides. Biosci., Biotechnol., Biochem. 64, 2104–2108.
5. Altschul, S. F., Madden, T. L., Schaffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997). Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402.
6. Cedzich, A., Huttenlocher, F., Kuhn, B. M., Pfannstiel, J., Gabler, L., Stintzi, A. & Schaller, A. (2009). The protease-associated domain and C-terminal extension are required for zymogen processing, sorting within the secretory pathway, and activity of tomato subtilase 3 (SlSBT3). J. Biol. Chem. 284, 14068–14078.
7. Uchikoba, T., Yonezawa, H. & Kaneda, M. (1995). Cleavage specificity of cucumisin, a plant serine protease. J. Biochem. 117, 1126–1130.
8. Ottmann, C., Rose, R., Huttenlocher, F., Cedzich, A., Hauske, P., Kaiser, M. et al. (2009). Structural basis for Ca2+-independence and activation by homodimerization of tomato subtilase 3. Proc. Natl Acad. Sci. USA, 106, 17223–17228.
9. Yamagata, H., Masuzawa, T., Nagaoka, Y., Ohnishi, T. & Iwasaki, T. (1994). Cucumisin, a serine protease from melon fruits, shares structural homology with subtilisin and is generated from a large precursor. J. Biol. Chem. 269, 32725–32731.
10. Kaneda, M., Kamikubo, Y. & Tominaga, N. (1986). Amino acid sequences of glycopeptides from cucumisin. Agric. Biol. Chem. 50, 2413–2414.
11. Hedstrom, L. (2002). Serine protease mechanism and specificity. Chem. Rev. 102, 4501–4524.
12. Holm, L. & Rosenstrom, P. (2010). Dali server: conservation mapping in 3D. Nucleic Acids Res. 38, W545–W549.
13. Bryson, K., McGuffin, L. J., Marsden, R. L., Ward, J. J., Sodhi, J. S. & Jones, D. T. (2005). Protein structure prediction servers at University College London. Nucleic Acids Res. 33, W36–W38.
14. Vedadi, M., Niesen, F. H., Allali-Hassani, A., Fedorov, O. Y., Finerty, P. J., Jr., Wasney, G. A. et al. (2006). Chemical screening methods to identify ligands that promote protein stability, protein crystallization, and structure determination. Proc. Natl Acad. Sci. USA, 103, 15835–15840.
15. Ericsson, U. B., Hallberg, B. M., Detitta, G. T., Dekker, N. & Nordlund, P. (2006). Thermofluor-based high-throughput stability optimization of proteins for structural studies. Anal. Biochem. 357, 289–298.
16. Li, W. F., Zhou, X. X. & Lu, P. (2005). Structural features of thermozymes. Biotechnol. Adv. 23, 271–281.
17. Yamagata, H., Ueno, S. & Iwasaki, T. (1989). Isolation and characterization of a possible native cucumisin from developing melon fruits and its limited autolysis to cucumisin. Agric. Biol. Chem. 53, 1009–1017.
18. Kaneda, M., Yonezawa, H. & Uchikoba, T. (1995). Improved isolation, stability and substrate specificity of cucumisin, a plant serine endopeptidase. Biotechnol. Appl. Biochem. 22, 215–222.
19. Smith, C. A., Toogood, H. S., Baker, H. M., Daniel, R. M. & Baker, E. N. (1999). Calcium-mediated thermostability in the subtilisin superfamily: the crystal structure of Bacillus Ak.1 protease at 1.8 Å resolution. J. Mol. Biol. 294, 1027–1040.
20. Sadeghi, M., Naderi-Manesh, H., Zarrabi, M. & Ranjbar, B. (2006). Effective factors in thermostability of thermophilic proteins. Biophys. Chem. 119, 256–270.
21. Kumar, S., Tsai, C. J. & Nussinov, R. (2000). Factors enhancing protein thermostability. Protein Eng. 13, 179–191.
22. Chakravarty, S. & Varadarajan, R. (2000). Elucidation of determinants of protein stability through genome sequence analysis. FEBS Lett. 470, 65–69.
23. Thompson, M. J. & Eisenberg, D. (1999). Transproteomic evidence of a loop-deletion mechanism for enhancing protein thermostability. J. Mol. Biol. 290, 595–604.
24. Prajapati, R. S., Das, M., Sreeramulu, S., Sirajuddin, M., Srinivasan, S., Krishnamurthy, V. et al. (2007). Thermodynamic effects of proline introduction on protein stability. Proteins, 66, 480–491.
25. Vieille, C. & Zeikus, G. J. (2001). Hyperthermophilic enzymes: sources, uses, and molecular mechanisms for thermostability. Microbiol. Mol. Biol. Rev. 65, 1–43.
26. Otwinowski, Z. & Minor, W. (1997). Processing of X-ray diffraction data collected in oscillation mode. Methods Enzymol. 276, 307–326.
27. Terwilliger, T. C. & Berendzen, J. (1999). Automated MAD and MIR structure solution. Acta Crystallogr., Sect. D: Biol. Crystallogr. 55, 849–861.
28. Jones, T. A., Zou, J. Y., Cowan, S. W. & Kjeldgaard, M. (1991). Improved methods for building protein models in electron density maps and the location of errors in these models. Acta Crystallogr., Sect. A: Found. Crystallogr. 47, 110–119.
29. Brunger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W. et al. (1998). Crystallography & NMR system: a new software suite for macromolecular structure determination. Acta Crystallogr., Sect. D: Biol. Crystallogr. 54, 905–921.
30. Collaborative Computational Project, Number 4. (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallogr., Sect. D: Biol. Crystallogr. 50, 760–763.
31. DeLano, W. L. (2002). The PyMOL Molecular Graphics System, http://www.pymol.org.