Structure of full-length porcine synovial collagenase reveals a C-terminal domain containing a calcium-linked, four-bladed β-propeller
J Li1†, P Brick1*, MC O'Hare2, T Skarzynski1‡, LF Lloyd1, VA Curry2, IM Clark2, HF Bigg2, BL Hazleman2, TE Cawston2 and DM Blow1
1Blackett Laboratory, Imperial College, London SW7 2BZ, UK and 2Rheumatology Research Unit, Addenbrooke's Hospital, Hills Road, Cambridge CB2 2QQ, UK
Background: The collagenases are members of the family of zinc-dependent enzymes known as the matrix metalloproteinases (MMPs). They are the only proteinases that specifically cleave the collagen triple helix, and are important in a large number of physiological and pathological processes. Structures are known for the N-terminal 'catalytic' domain of collagenases MMP-1 and MMP-8 and of stromelysin (MMP-3). This catalytic domain alone, which comprises about 150 amino acids, has no activity against collagen. A second domain, of 200 amino acids, is homologous to haemopexin, a haem-binding glycoprotein.
Results: The crystal structure of full-length MMP-1 at 2.5 Å resolution gives an R-factor of 21.7%. Two domains are connected by an exposed proline-rich linker of 17 amino acids, which is probably flexible and has no secondary structure. The catalytic domain resembles those previously observed, and contains three calcium-binding sites. The haemopexin-like domain contains four units of four-stranded antiparallel β-sheet stabilized on its fourfold axis by a cation, which is probably calcium. The domain constitutes a four-bladed β-propeller structure in which the blades are scarcely twisted.
Conclusions: The exposed linker accounts for the difficulty in purifying full-length collagenase. The C-terminal domain provides a structural model for haemopexin and its homologues. It controls the specificity of MMPs, affecting both substrate and inhibitor binding, although its role remains obscure. These structural results should aid the design of site-specific mutants which will reveal further details of the specificity mechanism.
Structure 15 June 1995, 3:541-549
Key words: β-propeller, collagenase, haemopexin, matrix metalloproteinase-1 (MMP-1)
Introduction
The matrix metalloproteinases (MMPs) are a family of proteinases acting at neutral pH that between them can degrade all the components of the extracellular matrix. A local breakdown of the matrix occurs in a number of important biological processes such as cell migration, angiogenesis and wound healing; moreover, severe pathological conditions such as rheumatoid arthritis and tumour growth and metastasis result from excessive uncontrolled breakdown of the matrix [1]. The study of the MMPs is therefore of major importance. The MMPs (Table 1) represent the last major class of proteolytic enzymes whose complete structures are unknown. Collagenases (MMPs 1, 8 and 13 [2]) are the only enzymes with significant activity against native triple-stranded collagen types I, II, III, VII and X, which they degrade by cleavage of a specific peptide bond in all three strands. The unique cleavage point is at bond 775-776 of mature triple-stranded collagen, producing fragments that are one-quarter and three-quarters the size of the whole collagen molecule. All MMPs are zinc-dependent enzymes, and all contain a sequence similar to the zinc-binding helix of thermolysin
[3,4]. It may be assumed that they will share a similar catalytic mechanism with thermolysin [5,6]. They also require calcium for stability. They possess distinctive substrate specificity, directed towards extracellular matrix macromolecules. After cleavage of a signal peptide of about 20 amino acids, the MMPs are secreted as inactive pro-enzymes. Upon activation, about 80 N-terminal residues are removed by proteolysis to leave the full-length active form of the MMP. The MMPs share a common domain structure, being composed of a catalytic domain, a linker peptide, and a domain with sequence similarity to haemopexin (Table 1). They are all inhibited by physiological inhibitors, the tissue inhibitors of metalloproteinases (TIMPs). The linker unit is of variable length and composition amongst the MMPs, but is always rich in proline residues. Bode et al. [7] determined the crystal structure of astacin, a digestive zinc endopeptidase from the crayfish Astacus astacus, which is highly homologous to the N-terminal part of the MMPs. Early in 1994, five crystal structures were published of N-terminal 'catalytic' domains of MMPs 1, 3 and 8, with strong similarity to each other
*Corresponding author. Present addresses: †Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China and ‡Glaxo Research and Development, Gunnels Wood Rd, Stevenage, Herts SG1 2NY, UK.
© Current Biology Ltd ISSN 0969-2126
Structure 1995, Vol 3 No 6
Table 1. Domain sizes for some matrix metalloproteinases (MMPs).

                                                Number of amino acids in domain
Enzyme                                   Catalytic   Linker   Haemopexin   Terminal
MMP-1    Fibroblast collagenase             161        17        189          3
MMP-8    Neutrophil collagenase             163        17        186          3
MMP-13   Collagenase 3                      161        18        188          0
MMP-2    Gelatinase (72 kDa)                336        23        192          0
MMP-9    Gelatinase (92 kDa)                337        63        189          3
MMP-3    Stromelysin                        164        26        188          0
MMP-10   Stromelysin 2                      164        26        188          0
MMP-11   Stromelysin 3                      162        36        187          8
MMP-7    Matrilysin                         164        -         -           (9)
MMP-12   Macrophage metalloelastase         163        19        189          0
MMP-14   Membrane metalloproteinase         174        35        190         34
All sequence data were taken from the SWISSPROT or EMBL databases. Domain boundaries of MMP-1 were determined from the crystal structure; the others were allocated by homology. Haemopexin-like domains all begin and end at cysteine.
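The MMP-1 row of Table 1 can be cross-checked against the residue ranges quoted elsewhere in the paper for the porcine enzyme (catalytic domain 100-260, linker 261-277, haemopexin-like domain 278-466, and the three C-terminal residues 467-469 that are not seen in the electron density). The following Python lines are a minimal sketch of that arithmetic; the dictionary and function names are illustrative only and are not part of the original work.

```python
# Residue ranges for porcine MMP-1, numbered from the start of the signal
# sequence, as quoted in the text and in the caption to Figure 2.
DOMAINS = {
    "catalytic": (100, 260),
    "linker": (261, 277),
    "haemopexin": (278, 466),
    "terminal": (467, 469),
}


def domain_length(first, last):
    """Number of residues in the inclusive range first..last."""
    return last - first + 1


# Length of each domain and of the whole activated enzyme.
lengths = {name: domain_length(*limits) for name, limits in DOMAINS.items()}
total = sum(lengths.values())

for name, n in lengths.items():
    print(f"{name:<12}{n:>4}")
print(f"{'total':<12}{total:>4}")
```

The four lengths reproduce the MMP-1 entries of Table 1, and their sum matches the 370 residues quoted for the full-length molecule in Materials and methods.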
and to astacin [6,8-12]. An NMR structure for MMP-3 was determined independently [13]. These closely similar structures confirmed the anticipated resemblance to astacin and the significant analogy to the zinc-binding helix of thermolysin. With one exception [11], the crystal structures were all determined from crystals containing peptide-like inhibitors. Preliminary crystallographic data have been presented for MMP-7 [14]. The N-terminal domain of collagenase has no activity against its normal substrate, collagen, although it retains some nonspecific activity against casein and gelatin. The N-terminal residues of active collagenase are subject to proteolytic degradation, with the loss of one or two amino acids, and significant reduction in enzyme activity. All of the original collagenase structures [6,8-10,12] were of forms of the enzyme that had lost the N-terminal residue of the full-length molecule. Reinemer et al. [15] determined the crystal structure of the catalytic domain of MMP-8 with the additional N-terminal phenylalanine, and found it was isomorphous with the original structure [8]. The MMP-1 structure presented below includes a homologous N-terminal phenylalanine residue. The C-terminal domain of the MMPs, which is absent from all the above structures, has significant sequence homology to haemopexin (a glycoprotein involved in haem transport) and to vitronectin (a plasma secretory protein) [16]. Haemopexin is composed of two domains, each homologous to the C-terminal domain of collagenase. The C-terminal domain of haemopexin has been crystallized after deglycosylation [17], and its crystal structure has recently been determined [18]. Pea seed albumin 2 appears to be similar to one domain of vitronectin [19]. Our attempts to obtain crystals of full-length human synovial collagenase (MMP-1) were unsuccessful as autolytic cleavage separates the two domains. Studies with porcine
MMP-1 showed this enzyme to be less susceptible to autolysis [20,21]. (Porcine collagenase is 87.6% identical with the human enzyme at the amino acid sequence level.) A crystallization protocol and preliminary X-ray diffraction data were published in collaboration with colleagues from SmithKline Beecham [22]. Throughout the study, the key difficulty has been the preparation of pure undegraded samples of the full-length molecule. Additional problems were caused by high levels of disorder in most of the crystals. This paper presents the refined crystal structure of the full-length activated porcine collagenase molecule and reveals the fold of the haemopexin-like C-terminal domain.
Results
Three important advances contributed to the structure determination reported here. Firstly, production of recombinant porcine collagenase [23] and the use of affinity chromatography [24] enhanced its availability and allowed correctly folded enzyme to be rapidly purified. Secondly, a specific metalloproteinase inhibitor, N-[3-(N-hydroxycarboxamido)-2-(2-methylpropyl)-propanoyl]-O-methyl-L-tyrosine-N-methylamide (CIC; Fig. 1), was added to the purified protein to prevent autolytic cleavage. Thirdly, flash freezing of the crystals provided a spectacular improvement in the attainable resolution (Table 2).
Fig. 1. Chemical structure of the CIC inhibitor. The material used was a racemic mixture of R- and S-configurations at the Cα of the leucine moiety.
Crystallography
The crystals obtained from recombinant enzyme exhibited approximately the same degree of order at room temperature as the original crystals from tissue-culture material. It was not possible, even at the synchrotron, to obtain useful diffraction beyond 2.9 Å, and disorder significantly reduced resolution along the crystallographic c* axis. Flash-frozen crystals, maintained at 100 K, provided measurable diffraction to 2.5 Å (Table 2).
Structure
The amino acid backbone of full-length porcine collagenase MMP-1 is shown in Figure 2. No electron density is observed for the C-terminal three residues (467-469,
Full-length porcine collagenase Li et al.
Table 2. Crystallographic data.

Derivative                    Native 1*   Native 2†   HgAc2     PHgAc
Heavy atom conc.              -           -           1 mM      saturated
Soak time                     -           -           24 h      24 h
Resolution                    2.9 Å       2.5 Å       3.4 Å     3.5 Å
Temperature                   25°C        -173°C      25°C      25°C
Measurements                  41 544      72 655      21 714    23 878
Unique reflections            13 860      22 212      8545      8183
Data completeness             91.9%       94.2%       92.2%     93.7%
  outer shell                 82.3%       66.7%       65.2%     94.8%
Rmerge                        0.084       0.086       0.092     0.116
Rmerge outer shell            0.269       0.230       0.303     0.311

MIR analysis to 3.5 Å
Mean fractional isomorphous difference                0.199        0.144
Number of sites                                       2            1
Rcullis (reflections)                                 0.68 (670)   0.75 (772)
Mean figure of merit (reflections)    0.44 (7346)

*Cell dimensions for Native 1 are a=161.16 Å, c=52.00 Å. The estimated solvent content is 70.5%. †Cell dimensions for Native 2 are a=161.14 Å, c=52.22 Å. Heavy-atom derivatives: HgAc2, mercuric acetate; PHgAc, phenylmercury acetate.
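The solvent content quoted in the footnote to Table 2 can be approximated from the cell dimensions with a Matthews-type estimate. The sketch below is illustrative only and rests on assumptions that are not stated in the paper: one molecule per asymmetric unit, a mean residue mass of 110 Da for the 370-residue enzyme, and the conventional constant 1.23 in the solvent-fraction formula. The value of eight asymmetric units per cell follows from the body-centred tetragonal space group given in Materials and methods.

```python
# Native 1 cell dimensions from Table 2 (tetragonal, so a = b).
A_CELL = 161.16  # Å
C_CELL = 52.00   # Å

# A body-centred cell with a fourfold screw axis holds 4 x 2 = 8 asymmetric units.
ASU_PER_CELL = 8

# Assumptions (not from the paper): one molecule per asymmetric unit and an
# average residue mass of 110 Da for the 370-residue activated enzyme.
MOLECULES_PER_ASU = 1
N_RESIDUES = 370
MEAN_RESIDUE_MASS = 110.0  # Da


def matthews_coefficient(a, c, n_molecules, mass_da):
    """Cell volume per dalton of protein (Å^3/Da) for a tetragonal cell."""
    return (a * a * c) / (n_molecules * mass_da)


def solvent_fraction(vm):
    """Approximate solvent fraction from the Matthews coefficient."""
    return 1.0 - 1.23 / vm


mass = N_RESIDUES * MEAN_RESIDUE_MASS
vm = matthews_coefficient(A_CELL, C_CELL, ASU_PER_CELL * MOLECULES_PER_ASU, mass)
solvent = solvent_fraction(vm)
print(f"Vm = {vm:.2f} Å^3/Da, solvent fraction = {solvent:.2f}")
```

Under these assumptions the estimate comes out at roughly 70%, in line with the 70.5% given in the table footnote; the exact figure depends on the molecular mass used.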
numbering from the start of the signal sequence) so these are not included in the present model. The full-length collagenase molecule may be envisaged as two tightly folded domains with a flexible linkage. The N-terminal catalytic domain is linked to the C-terminal haemopexin-like domain by a highly exposed linker peptide, residues 261-277, with no secondary structure, whose precise conformation may well depend on the crystal
packing. The backbone of the catalytic domain is extremely similar to the structures recently published for human MMP-1 and MMP-8 [6,8-12,15]. One of the published structures of the catalytic domain [15] was used to guide interpretation of this part of the molecule. Collagenase requires the presence of both zinc and calcium ions for full activity. In human MMP-1, the catalytic domain contains a 'catalytic' zinc ion, an additional 'structural' zinc-binding site, and one [6,10], two [9] or three [11] calcium-binding sites. Human MMP-8 has the same zinc sites, and two calcium sites [8]. The porcine collagenase MMP-1 possesses a third calcium-binding site in the catalytic domain, identical to that found in crystals of the human catalytic domain lacking an inhibitor [11]. The ligands to the third calcium ion, Asp124 and Glu199, are conserved in human MMP-1, but neither is present in MMP-8. The human MMP-1 crystals [11], which also exhibited the third calcium site, contained calcium at a concentration of 1 mM, comparable to the physiological extracellular concentration of this ion in humans. This observation suggests that this third calcium site is normally occupied in vivo. A cation at this site has no obvious catalytic role, and presumably has a structural function. The CIC inhibitor is bound to the active-site zinc and is stabilized by interactions along the active-site cleft, in analogous fashion to the binding of inhibitors in isolated catalytic domain structures. CIC is almost identical to the inhibitor used by the Sterling Winthrop team [6,12] (the difference being the presence of a methoxy group, which changes the amino acid at P2' from phenylalanine to
Fig. 2. Orthogonal views of the collagenase molecule in complex with the CIC inhibitor. (a) View down the fourfold axis of the haemopexin-like domain. The N-terminal catalytic domain (residues 100-260) is at the top of the figure and includes the inhibitor displayed with yellow carbon atoms. The three histidine residues (His218, His222 and His228) that coordinate the catalytic zinc ion are drawn with green carbon atoms and both the catalytic and the structural zinc ions are shown as magenta spheres. The catalytic domain also contains three calcium ions (red spheres). The haemopexin-like domain (residues 278-466) contains one calcium ion on the fourfold axis at the N-terminal end of the β-strands. The outermost (fourth) β-strand is interrupted in sheets 2 and 3 (numbering sheets from 1 at the top left clockwise to 4 at the bottom left) of the β-propeller, and is shown as two separate sections of β-structure. A helical turn directly follows this fourth β-strand in the first three of the four sheets which make up this domain. The disulphide bond, shown in yellow, links the two ends of the haemopexin-like domain. (b) A view of the molecule perpendicular to the fourfold axis showing the exposed proline-rich peptide linking the two domains. (Figures generated using the programs MOLSCRIPT [39] and Raster3D [40].)
O-methyltyrosine), and it makes the same interactions with the protein. In their structures of MMP-1 [6] and MMP-8 [12] a difference in the orientation of the ring of this phenylalanine is noted. The conformation of the phenylalanine in the MMP-8 structure closely resembles the O-methyltyrosine conformation observed in the present MMP-1 structure (χ1 = -74°), lending support to the suggestion that this conformational difference may depend upon crystal packing. In porcine collagenase, the O-methyltyrosine aromatic ring is stacked against the corresponding ring from an adjacent molecule. The C-terminal domain has approximate fourfold symmetry, the four units each containing a sheet of four antiparallel β-strands, forming a four-bladed structure like the flights of a dart, and similar to β-propeller structures observed in some other proteins [25] except that here there are only four blades, and they are scarcely twisted (Figs 2,3). Short peptide loops link one sheet to the next; Cys466, near the C terminus, forms a disulphide bridge with Cys278 at the start of the first sheet. This neatly completes a cyclic structure stabilizing the whole domain. Cys278 and Cys466 (or their equivalents) are conserved in all MMPs (with the exception of MMP-7, matrilysin, which lacks the whole C-terminal domain). Haemopexin contains eight repeats of similar structural units in which cysteines are similarly placed [16], so that each group of four units can also be completed by a disulphide bridge. Analogous cysteines exist in the haemopexin-like domain 2 of vitronectin (domain 1 is truncated) [16]. The secondary and tertiary interactions made by the four sheets are remarkably similar (Fig. 4). Along the fourfold axis of the C-terminal domain, at two positions, four peptide carbonyl groups point inwards, one from the innermost strand of each sheet, and positive density is observed on the axis.
At the point where the strands converge, the size and density are consistent with a calcium ion, coordinating the four carbonyls (Fig. 5). At the second site, the strands have diverged somewhat and the ligand is probably a water molecule or a hydronium ion. Three of the four amino acids whose carbonyls
coordinate the probable calcium ion are aspartates. The aspartate chains are exposed to solvent, but may have a role in maintaining local electrical neutrality.
Discussion
Cleavage of the full-length molecule
The linker peptide, residues 261-277, is rich in proline (5 residues out of 17) but is probably flexible and is an obvious target for hydrolytic cleavage. The isolated catalytic domain is a less specific proteinase than the complete molecule and very similar in its activity to MMP-7. It cannot bind to collagen [26,27] and has no activity against it, but it retains the ability to cleave casein and gelatin, and gains the capacity to superactivate collagenase [20] where the propeptide is removed to leave an N-terminal phenylalanine, as also occurs with stromelysin, with a corresponding increase in specific activity of collagenase. Rapid purification of porcine collagenase and the presence of inhibitor were necessary to prevent cleavage of the linker peptide, especially at the high enzyme concentration required for crystallization.
Crystal and molecular structure
The crystals have a high solvent content (Table 2), consistent with the relatively high disorder in the structure. In the crystals, adjacent catalytic domains interact closely along the twofold axes of the I4₁ cell. Their contact is strengthened by stacking of the O-methyltyrosine moieties of the inhibitor, as described in the Results section. The lack of secondary structure in the 17-residue linker peptide, taken together with the tight packing of the folded domains, suggests that although the crystal structure may provide a good model of the structure of each domain in solution, the overall molecular structure in solution may be quite different, and is probably flexible.
The haemopexin-like domain
Figure 6 shows how the four units of the C-terminal domain have significant homology with each other, and with the eight units in the two similar domains of haemopexin. Alignment of the sequences of each unit to
Fig. 3. Stereo diagram showing a Cα trace of the C-terminal haemopexin-like domain (residues 260-466) with numbering at approximately 10-residue intervals.
Fig. 4. Schematic representation of the antiparallel β-sheets of the haemopexin-like domain. Each part of the figure represents a pair of sheets which are approximately in the same plane. (a) Sheets 1 and 3. (b) Sheets 2 and 4. The approximate fourfold symmetry axis of the domain is vertical. In each sheet, the parts of the structure which conform strictly to the Kabsch and Sander [41] definition of β-sheet structure are outlined. Parts (a) and (b) represent roughly perpendicular views.
Fig. 5. Section of the electron-density map of the haemopexin-like domain comparable to the schematic diagram in Figure 4a. The map is contoured at 1σ and was calculated using coefficients (3Fo-2Fc) with phases obtained from the final model.
maximize these homologies, with only minor adjustment of the original alignment of Hunt et al. [16], places the β-sheet residues in precise register. The strands form rather regular regions of β-sheet (perhaps because the sheets have little twist) and the pattern of hydrogen bonds between the strands is remarkably conserved in the four sheets (Fig. 4).
Murzin [28] suggested that at least six β-sheets are needed for a propeller structure. In the four-bladed propeller of the haemopexin-like domain, adjacent sheets are perpendicular, and not stacked on each other, as they are in propellers containing six, seven or eight sheets. As expected from Murzin's analysis, the innermost strands are closely parallel to the propeller axis, and are in
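The consensus rule used for Figure 6 (a residue enters the consensus if it occurs in 6 or more of the 10 MMP sequences, and is marked specially if it occurs in all 10) can be sketched in a few lines of Python. This is an illustrative reading of the rule stated in the figure caption, not the authors' own procedure; the function name, the gap characters and the toy alignment columns are assumptions made for the example and are not taken from the real alignment.

```python
from collections import Counter

GAP_CHARS = "-."  # assumed gap symbols in the alignment


def consensus(column, threshold=6):
    """Consensus call for one alignment column.

    `column` is a string holding one residue (or gap) per sequence.
    Returns (residue, in_all_sequences), or None when no residue reaches
    the threshold count.
    """
    counts = Counter(res for res in column if res not in GAP_CHARS)
    if not counts:
        return None
    residue, n = counts.most_common(1)[0]
    if n < threshold:
        return None
    return residue, n == len(column)


# Toy columns for ten hypothetical sequences (not real MMP data).
print(consensus("FFFFFFFYYL"))  # F occurs 7 times: in the consensus
print(consensus("DDDDDDDDDD"))  # D occurs in all 10: fully conserved
print(consensus("AGSTAGSTAG"))  # no residue reaches 6: no consensus
```

With a threshold of 6 out of 10, at most one residue can qualify in any column, so ties never need to be resolved.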
Fig. 6. Comparison of sequences in the haemopexin-like domains of MMPs and human haemopexin. Each horizontal block (set of four sequences) represents one sheet of the domain. The sequence of porcine MMP-1, and the sequences of both corresponding domains of human haemopexin are given. Underlining indicates the strands of β-sheet in porcine MMP-1, assigned according to the definition of Kabsch and Sander [41]. A consensus sequence derived by comparing 10 MMP types is given above the MMP-1 sequence. An amino acid is part of the consensus if it exists in 6 or more of the 10 MMP sequences and is shown in italics if present in all 10. Following Hunt et al. [16], the four sheets have been aligned to show the significant homologies between them, and boxes indicate where the MMP consensus agrees with both domains of human haemopexin.

register along it, but are at a radius from the axis of about 3.5 Å at the closest point near the calcium ion, which is much less than that observed for the larger propellers. The peptide carbonyls and amide groups facing inwards are brought into close proximity. The innermost strands diverge from the axis as they run towards the bottom of the domain with the radius increasing to about 4.7 Å, relieving the packing somewhat at the bottom of the propeller axis. The closest peptide carbonyls are 2.4 Å from the calcium ion, and the next closest are grouped at about 2.7 Å around density for a water molecule on the axis. The calcium ion probably binds to two water molecules, one on the propeller axis below it, and one in the solvent above. Two other water molecules are observed on the propeller axis. The innermost strands are tightly packed and totally buried. The closest α-carbons, those of Ala286, Ala330, Ala378 and Ala428, are about 4.6 Å apart. Figure 7 shows how the Cα and Cβ atoms of these alanines form a tightly packed ring at this point, with all the Cβ atoms projecting anticlockwise (viewed from the top). The propeller sheets are less twisted than normal β-sheets, with an average twist of about -10° between strands 1 and 2, and 2 and 3. The twist of the fourth strand is less regular. The second and third strands are progressively less constrained, and they must contain larger side chains to fill the available volume: strand 3 has many tryptophan, tyrosine and phenylalanine side chains. These strands also have progressively more polar character and increasing numbers of charged amino acids.
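The distances quoted for the innermost strands can be related to each other with simple ring geometry. As a rough sketch, assume four equivalent atoms lying in one plane on an exact fourfold axis, which is an idealization that the paper does not claim; the distance between neighbouring atoms is then the radius multiplied by the square root of two. The function names below are illustrative.

```python
import math


def neighbour_distance(radius, n_fold=4):
    """Distance between adjacent points of a regular n-fold ring."""
    return 2.0 * radius * math.sin(math.pi / n_fold)


def radius_from_neighbour(distance, n_fold=4):
    """Ring radius that gives the stated distance between adjacent points."""
    return distance / (2.0 * math.sin(math.pi / n_fold))


# Radius of about 3.5 Å quoted for the innermost strands near the calcium ion.
print(f"neighbours at r = 3.5 Å: {neighbour_distance(3.5):.2f} Å apart")

# Alanine alpha-carbons quoted as about 4.6 Å apart.
print(f"radius for 4.6 Å neighbours: {radius_from_neighbour(4.6):.2f} Å")
```

In this idealized picture a 4.6 Å spacing between neighbouring α-carbons corresponds to a ring radius of a little over 3 Å, the same order as the roughly 3.5 Å radius quoted for the strands. The text does not specify which atoms define that radius, so close agreement should not be expected.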
Neither of these trends continues in the outermost strands, which are the least constrained by packing. These outer strands and their associated long loops form the surface of the domain, and must be mostly responsible for its binding properties. The outer strands of sheets 2 and 3 are interrupted by a loop containing a
Gly-Tyr-Pro sequence, which is conserved in most MMPs, and in sheet 2 of both domains of haemopexin.
Enzyme activity
We have attempted to model the binding of triple-stranded collagen between the active site and the haemopexin-like domain of an isolated collagenase molecule in the conformation seen in crystals. It is not difficult to accommodate collagen, but no mode of binding appeared especially favourable. In particular, it is difficult to place a peptide bond of the regular triple helix into the observed active site, as noted by others [8]. The collagenases (MMPs 1, 8 and 13) are unique amongst the MMPs and all neutral proteinases for their ability to cleave triple-helical collagen at a specific site three-quarters of the way from the N terminus. As partially cleaved collagen molecules are not detected, it appears that all three strands are cleaved after binding to the enzyme [29]. The haemopexin-like domains of MMP-1 and MMP-3 are known to bind to collagen [26,27], although MMP-3 cannot cleave it. The catalytically inactive proenzyme, pro-MMP-1, and a chimeric protein comprising the catalytic domain of pro-MMP-3 and the haemopexin-like domain of MMP-1 are unable to bind collagen, suggesting that the propeptide chain interferes with the binding of the linker peptide or the haemopexin-like domain to the collagen substrate [30,31]. The isolated catalytic domain of collagenase does not bind to collagen and cannot cleave it [27]. Chimeric proteins have been constructed that demonstrate a particular role for the haemopexin-like domains in determining enzyme specificity. Chimeras of the catalytic domain of collagenase (MMP-1 or MMP-8) and
Fig. 7. Section of the electron-density map viewed down the fourfold axis of the haemopexin-like domain. This view includes a water molecule on the axis, the four alanine residues that pack together tightly in strand 1 and the Phe-Phe sequence present in strand 2 of each of the four sheets. The map is contoured at 1σ and was calculated using coefficients (3Fo-2Fc) with phases obtained from the final model.

the haemopexin-like domain of MMP-3 have no more than 0.2% activity against collagen, whether the linker peptide is derived from collagenase or from MMP-3 [30,31]. Replacement of the linker peptide of MMP-8 by the 26-amino acid linker of MMP-3 abolishes activity against collagen [30]. The structural integrity of the haemopexin-like domain is important: breaking the disulphide bond in the haemopexin-like domain reduces MMP-8 activity by 62% [30]. A molecule with the sequence of MMP-8 up to haemopexin sheet 1, but with the MMP-3 sheets 2, 3 and 4, has 16% of the activity of MMP-8 against collagen [30]. Similar results are obtained with MMP-1 [31]. There are few points in the haemopexin-like domain where the collagenases (MMPs 1, 8 and 13) all have side chains different from any other MMP, and in none of these cases do the three agree in sequence. Most of the differences are in strand 4 of each sheet and their associated long loops, which form most of the surface of the domain. It seems probable that specific binding to collagen depends upon the conformations of these surface loops as specified by the total sequence, and not on small surface changes brought about by individual amino acid substitutions.
Biological implications Collagen is the major structural protein of vertebrates and the destruction of this protein often leads to irreversible damage. Collagenase is a highly specific proteinase and is the only enzyme that can specifically cleave the triple helix of the fibrillar collagens. Consequently an understanding
of its mechanism of action is very important to a whole range of diseases where tissue damage occurs through the destruction of collagen. Analysis of the enzymatic properties of different matrix metalloproteinases (MMPs), of which three (MMPs 1, 8 and 13) are classified as collagenases, and of chimeras between them, demonstrates that features of the two folded domains [i.e. the catalytic (N-terminal) domain and the haemopexin-like (C-terminal) domain] and the linker peptide that connects them are all essential for collagen binding and hydrolysis, and are absent in the other highly homologous MMPs. The probable flexibility of the 17-residue peptide linker suggests that the active conformation (when the enzyme binds to the collagen triple helix) may be somewhat different from that seen in the crystal structure. Collagenase cleaves a specific peptide bond of all three strands of collagen, in collagen types I (chains α1 and α2), II, III, VII and X. In the sequence around the labile bond, the essential feature of triple-helical collagen (namely, that every third amino acid is a glycine) is maintained. In other ways the sequence is unusual: for example, there is a sequence of 12 amino acids without proline. It seems likely that collagenase recognizes an unusual conformational feature at this point on the triple helix, rather than detecting a series of particular side chains in a completely regular helix [32]. The C-terminal domain, a four-bladed β-propeller structure seen here for the first time, provides a
structural model for haemopexin, which contains two such four-bladed propellers, and other homologous proteins, such as vitronectin. Haemopexin and vitronectin have highly specific binding properties: haemopexin mediates the transport of haem to the liver [33]; vitronectin recognizes integrin receptors, controls cell adhesion and also binds specifically to heparin [34]. The haemopexin skeleton may be used by these molecules as the support for recognition surfaces of great specificity, analogous to the common skeleton of the immunoglobulins. In particular, the long surface loops associated with strand 4 of each β-sheet specify most of the surface of the haemopexin-like domains, and may have versatile conformational and recognition properties. It has been suggested [20] that the observed cleavage of collagenases in the linker region might be an artefact of preparation techniques, induced particularly by the high enzyme concentrations needed for crystallization. The highly exposed nature of the linker peptide now suggests more probably that autolysis or cleavage in this region might be a physiological mechanism to prevent the unlimited destruction of collagen, acting as an additional control to that mediated by the tissue inhibitors of metalloproteinases. The structure of full-length MMP-1 presented here provides a model for the structures of the other MMPs, all of which are highly homologous.
Materials and methods
Expression, purification and crystallization
Expression and purification were as described by O'Hare et al. [23]. Briefly, cDNA for active collagenase was inserted into E. coli as a fusion protein with β-galactosidase and a Factor Xa cleavage site. Induced expression of this gene produced pelleted fractions, which were suspended in 8 M urea. After dialysis and incubation with Factor Xa, the protein was loaded onto a PHA-Sepharose column, which binds only correctly refolded collagenase. Enzyme fractions eluted from this column were inhibited with 10⁻⁵ M CIC (a gift from SmithKline Beecham). Further purification using a Mono-S column (Pharmacia) was often needed to remove a fragment of Mr 20 000. The enzyme was concentrated in a Centricon centrifuge tube and crystallized in space group I4₁ as described previously [22] in hanging drops at pH 7.3, using 5% (w/v) polyethylene glycol as precipitant, in the presence of 10 mM CaCl₂. Crystals from recombinant material were better formed than those derived from tissue culture, but were of similar size and exhibited a similar degree of disorder.
X-ray diffraction data
Typical dimensions of crystals used for X-ray diffraction were 0.2 mm x 0.2 mm x 0.1 mm. Mercuric acetate (HgAc₂) and phenylmercury acetate (PHgAc) crystal derivatives were prepared by soaking. All X-ray diffraction data for the final analysis were obtained at the Synchrotron Radiation Source, Daresbury, UK, using an image plate detector (MarResearch) (Table 2). The crystals diffract weakly and decay rapidly in the X-ray beam. Greatly improved order and crystal lifetime were achieved using the flash-freezing technique, followed by data collection at 100 K. Diffraction intensities were merged and scaled using the CCP4 program suite [35].
Structure interpretation
Two HgAc₂ sites and a PHgAc site were derived from difference Patterson maps. One site is common. An electron-density map using isomorphous phases to 3.5 Å with phases extended to 3.4 Å using solvent flattening [36,37] showed recognizable density for the catalytic domain, to which the coordinates of Reinemer et al. [15] could be fitted, and revealed several of the β-strands of the C-terminal domain. Further cautious interpretation of this map revealed the architecture of the C-terminal domain as four similar sheets. Using the program X-PLOR [38], the calculated phases were improved and the resolution gradually extended. Data from flash-frozen crystals were satisfactorily isomorphous, allowing the resolution to be extended to 2.5 Å and the whole structure to be interpreted. The structural model presented here includes the backbone atoms for 367 of the 370 amino acid residues in the full-length molecule, complete side chains for 363 residues, together with two zinc ions, four calcium ions and the CIC inhibitor, as well as 295 water molecules. Based on this model the crystallographic R-factor is 21.7% to 2.5 Å (Table 3).
Table 3. Model refinement statistics.
Resolution range                  2.5-9.0 Å
Number of reflections             21 539
Number of atoms in model          3294
R-factor                          21.7%
R-factor in range 2.5-2.61 Å      36.6%
Rms deviation in bond lengths     0.015 Å
Rms deviation in bond angles      1.92°
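For readers less familiar with the R-factor quoted for the refined model, it is conventionally defined as R = Σ| |Fobs| − |Fcalc| | / Σ|Fobs|, summed over the measured reflections. The short Python sketch below only illustrates that definition. It is not the X-PLOR refinement used in this work: the function name `r_factor` and the amplitudes are hypothetical, and scaling between observed and calculated amplitudes is assumed to have been applied already.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor.

    R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|)
    Both inputs are sequences of structure-factor amplitudes,
    one entry per reflection, assumed to be on a common scale.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


# Hypothetical amplitudes for four reflections (illustration only,
# not data from this structure).
example_f_obs = [100.0, 50.0, 25.0, 25.0]
example_f_calc = [90.0, 60.0, 20.0, 30.0]

print(f"R = {r_factor(example_f_obs, example_f_calc):.1%}")  # prints "R = 15.0%"
```

Restricting the sums to reflections in the outermost resolution shell gives the kind of shell-specific value reported alongside the overall R-factor in the refinement statistics.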
The atomic coordinates and the structure factors have been deposited with the Brookhaven Protein Data Bank.
Acknowledgements: We thank Wolfram Bode and Robert Huber for making available the coordinates of their fragment of human MMP-8. This work was supported originally by Beecham Pharmaceuticals (later SmithKline Beecham) and has been supported throughout by the Medical Research Council. We thank the Arthritis Research Council for support for the preparation of pure, full-length collagenase.
References
1. Woessner, J.F.J. (1991). Matrix metalloproteinases and their inhibitors in connective tissue remodeling. FASEB J. 5, 2145-2154.
2. Freije, J.M.P., et al., & López-Otín, C. (1994). Molecular cloning and expression of collagenase-3, a novel matrix metalloproteinase produced by breast carcinomas. J. Biol. Chem. 269, 16766-16773.
3. Matthews, B.W., Jansonius, J.N., Colman, P.M., Schoenborn, B.P. & Dupourque, D. (1972). Three dimensional structure of thermolysin. Nat. New Biol. 238, 37-41.
4. Matthews, B.W., Colman, P.M., Jansonius, J.N., Titani, K., Walsh, K.A. & Neurath, H. (1972). Structure of thermolysin. Nat. New Biol. 238, 41-43.
5. Matthews, B.W. (1988). Structural basis of the action of thermolysin and related zinc peptidases. Accounts Chem. Res. 21, 333-340.
6. Spurlino, J.C., et al., & Smith, D.L. (1994). 1.56 Å structure of mature truncated human fibroblast collagenase. Proteins 19, 98-109.
7. Bode, W., Gomis-Rüth, F.X., Huber, R., Zwilling, R. & Stöcker, W. (1992). Structure of astacin and implications for activation of astacins and zinc-ligation of collagenases. Nature 358, 164-167.
8. Bode, W. (1994). The X-ray crystal structure of the catalytic domain of human neutrophil collagenase inhibited by a substrate analogue reveals the essentials for catalysis and specificity. EMBO J. 13, 1263-1269.
9. Borkakoti, N., et al., & Murray, E.J. (1994). Structure of the catalytic domain of human fibroblast collagenase complexed with an inhibitor. Nat. Struct. Biol. 1, 106-110.
10. Lovejoy, B., et al., & Jordan, S.R. (1994). Structure of the catalytic domain of fibroblast collagenase complexed with an inhibitor. Science 263, 375-377.
11. Lovejoy, B., Hassell, A.M., Luther, M.A., Weigl, D. & Jordan, S.R. (1994). Crystal structures of recombinant 19-kDa human fibroblast collagenase complexed to itself. Biochemistry 33, 8207-8217.
12. Stams, T., et al., & Rubin, B. (1994). Structure of human neutrophil collagenase reveals large S1' specificity pocket. Nat. Struct. Biol. 1, 119-123.
13. Gooley, P.R., et al., & Johnson, B.A. (1994). The NMR structure of the inhibited catalytic domain of human stromelysin-1. Nat. Struct. Biol. 1, 111-118.
14. Browner, M.F. (1994). High resolution crystallographic studies of human matrilysin. J. Cell. Biochem. Suppl. 18D, 120.
15. Reinemer, P., et al., & Bode, W. (1994). Structural implications for the role of the N terminus in the 'superactivation' of collagenases. FEBS Lett. 338, 227-233.
16. Hunt, L.T., Barker, W.C. & Chen, H.R. (1987). A domain structure common to hemopexin, vitronectin, interstitial collagenase, and a collagenase homolog. Prot. Seq. Data Anal. 1, 21-26.
17. Baker, H.M., Norris, G.E., Morgan, W.T., Smith, A. & Baker, E.N. (1993). Crystallization of the C-terminal domain of rabbit serum hemopexin. J. Mol. Biol. 229, 251-252.
18. Faber, H.R., Groom, C.R., Baker, H.M., Morgan, W.T., Smith, A. & Baker, E.N. (1995). 1.8 Å crystal structure of the C-terminal domain of rabbit serum haemopexin. Structure 3, 551-559.
19. Jenne, D. (1991). Homology of placental protein 11 and pea seed albumin 2 with vitronectin. Biochem. Biophys. Res. Commun. 176, 1000-1006.
20. Clark, I.M. & Cawston, T.E. (1989). Fragments of human fibroblast collagenase. Purification and characterization. Biochem. J. 263, 201-206.
21. Clark, I.M., Mitchell, R.E., Powell, L.K., Bigg, H.F., Cawston, T.E. & O'Hare, M.C. (1995). Recombinant porcine collagenase: purification and autolysis. Arch. Biochem. Biophys. 316, 123-127.
22. Lloyd, L.F., et al., & Harper, G.P. (1989). Crystallization and preliminary X-ray analysis of porcine synovial collagenase. J. Mol. Biol. 210, 237-238.
23. O'Hare, M.C., Clarke, N.J. & Cawston, T.E. (1992). Production in E. coli of porcine type I collagenase as a fusion protein with β-galactosidase. Gene 111, 245-258.
24. Moore, W.M. & Spilburg, C.A. (1986). Purification of human collagenases with a hydroxamic acid affinity column. Biochemistry 25, 5189-5195.
25. Chothia, C. & Murzin, A.G. (1993). New folds for all-β proteins. Structure 1, 217-222.
26. Allan, J.A., Hembry, R.M., Angal, S., Reynolds, J.J. & Murphy, G. (1991). Binding of latent and high Mr active forms of stromelysin is mediated by the C-terminal domain. J. Cell Sci. 99, 789-795.
27. Bigg, H.F., Clark, I.M. & Cawston, T.E. (1994). Fragments of human fibroblast collagenase - interaction with metalloproteinase inhibitors and substrates. Biochim. Biophys. Acta 1208, 157-165.
28. Murzin, A.G. (1992). Structural principles for the propeller assembly of beta-sheets: the preference for seven-fold symmetry. Proteins 14, 191-201.
29. Welgus, H.G., Jeffrey, J.J., Stricklin, G.P., Roswit, W.T. & Eisen, A.Z. (1980). Characteristics of the action of human skin fibroblast collagenase on fibrillar collagen. J. Biol. Chem. 255, 6806-6813.
30. Hirose, T., Patterson, C., Pourmotabbed, T., Mainardi, C.L. & Hasty, K.A. (1993). Structure-function relationship of human neutrophil collagenase - identification of regions responsible for substrate specificity and general proteinase activity. Proc. Natl. Acad. Sci. USA 90, 2569-2573.
31. Murphy, G., Allan, J.A., Willenbrock, F., Cockett, M.I., O'Connell, J.P. & Docherty, A.J.P. (1992). The role of the C-terminal domain in collagenase and stromelysin specificity. J. Biol. Chem. 267, 9612-9618.
32. Fields, G.B., Van Wart, H.E. & Birkedal-Hansen, H. (1987). Sequence specificity of human skin fibroblast collagenase. Evidence for the role of collagen structure in determining the collagenase cleavage site. J. Biol. Chem. 262, 6221-6226.
33. Smith, A. & Hunt, R.C. (1990). Hemopexin joins transferrin as representative members of a distinct class of receptor-mediated endocytic transport systems. Eur. J. Cell Biol. 53, 234-245.
34. Preissner, K.T. (1991). Structure and biological role of vitronectin. Annu. Rev. Cell Biol. 7, 275-310.
35. Collaborative Computational Project Number 4. (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallogr. D 50, 760-763.
36. Leslie, A.G.W. (1987). A reciprocal-space method for calculating a molecular envelope using the algorithm of B.C. Wang. Acta Crystallogr. A 43, 134-136.
37. Wang, B.C. (1985). Resolution of phase ambiguity in macromolecular crystallography. Methods Enzymol. 115, 90-112.
38. Brünger, A.T., Kuriyan, J. & Karplus, M. (1987). Crystallographic R-factor refinement by molecular-dynamics. Science 235, 458-460.
39. Kraulis, P.J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24, 946-950.
40. Bacon, D.J. & Anderson, W.F. (1988). A fast algorithm for rendering space-filling molecular pictures. J. Mol. Graphics 6, 219-220.
41. Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structures. Biopolymers 22, 2577-2637.
Received: 6 Mar 1995; revisions requested: 29 Mar 1995; revisions received: 6 Apr 1995. Accepted: 7 Apr 1995.