Staggered molecular packing in crystals of a collagen-like peptide with a single charged pair


doi:10.1006/jmbi.2000.4017, available online at http://www.idealibrary.com
J. Mol. Biol. (2000) 301, 1191-1205



Rachel Z. Kramer¹, Manju G. Venugopal², Jordi Bella¹, Patricia Mayville¹, Barbara Brodsky² and Helen M. Berman¹,³*

¹ Department of Chemistry, Rutgers University, 610 Taylor Rd, Piscataway, NJ 08854-8087, USA
² Department of Biochemistry, Robert Wood Johnson Medical School, Piscataway, NJ 08855, USA
³ Waksman Institute, Piscataway, NJ 08855, USA

The crystal structure of the triple-helical peptide (Pro-Hyp-Gly)4-Glu-Lys-Gly-(Pro-Hyp-Gly)5 has been determined to 1.75 Å resolution. This peptide was designed to examine the effect of a pair of adjacent, oppositely charged residues on collagen triple-helical conformation and intermolecular interactions. The molecular conformation (a 7₅ triple helix) and hydrogen bonding schemes are similar to those previously reported for collagen triple helices, and the structure provides a second instance of water-mediated N-H···O=C interchain hydrogen bonds for the amide group of the residue following Gly. Although stereochemically capable of forming intramolecular or intermolecular ion pairs, the lysine and glutamic acid side-chains instead display direct interactions with carbonyl groups and hydroxyproline hydroxyl groups or interactions mediated by water molecules. Solution studies on the EKG peptide indicate stabilization at neutral pH values, where both Glu and Lys are ionized, but suggest that this occurs because of the effects of ionization on the individual residues, rather than ion pair formation. The EKG structure suggests a molecular mechanism for such stabilization through indirect hydrogen bonding. The molecular packing in the crystal includes an axial stagger between molecules, reminiscent of that observed in D-periodic collagen fibrils. The presence of a Glu-Lys-Gly triplet in the middle of the sequence appears to mediate this staggered molecular packing through its indirect water-mediated interactions with backbone C=O groups and side-chains. © 2000 Academic Press

*Corresponding author

Keywords: collagen; triple helix; charged residues; staggered packing; hydration

R.Z.K. and M.G.V. contributed equally to this work.

Present addresses: M. G. Venugopal, Roche Diagnostics, 235 Hembree Park Drive, Roswell GA 30076.1447, USA; J. Bella, Wellcome Trust Centre for Cell-Matrix Research, School of Biological Sciences, University of Manchester, Stopford Building, Oxford Road, Manchester M13 9PT, UK.

Abbreviations used: rms, root-mean-square; PPG 2, polymer-like structure of the triple-helical peptide (Pro-Pro-Gly)10 determined to 1.75 Å resolution (Kramer et al., 1998); Gly→Ala, the triple-helical peptide (Pro-Hyp-Gly)4-Pro-Hyp-Ala-(Pro-Hyp-Gly)5 determined by Bella et al. (1994); Hyp, hydroxyproline; EKG, the triple-helical peptide (Pro-Hyp-Gly)4-Glu-Lys-Gly-(Pro-Hyp-Gly)5; T3-785, the triple-helical peptide (Pro-Hyp-Gly)3-Ile-Thr-Gly-Ala-Arg-Gly-Pro-Hyp-Gly-(Pro-Hyp-Gly)3.

E-mail address of the corresponding author: [email protected]

0022-2836/00/051191-15 $35.00/0

Introduction

The triple helix is the principal structural element of collagen and is also an integral feature of a variety of host defense proteins such as the serum complement protein (C1q), mannose-binding protein, and the macrophage scavenger receptor (Hoppe & Reid, 1994). The essential structural features of the triple helix were initially elucidated through fiber diffraction studies of native collagen and synthetic collagen-like polypeptides (Fraser et al., 1979; Rich & Crick, 1961; Yonath & Traub, 1969). The triple helix of collagen is a super-coil formed by the interwinding of three left-handed polyproline II helices in a right-handed manner around a common helical axis. A set of interchain hydrogen bonds in the Rich and Crick collagen II (Rich & Crick, 1961) pattern connects the three chains with a one-residue stagger between neighboring chains. Triple helices are characterized by repetitions of the triplet X-Y-Gly, since the compact triple helix requires the sterically small glycine in every third position near the central axis. The imino acids proline and hydroxyproline frequently occupy the X and Y positions, respectively, making Pro-Hyp-Gly the most common triplet in collagens. Hydroxyproline is formed post-translationally from proline by prolyl hydroxylase and is known to be crucial for collagen stability either through interactions with water (reviewed by Fraser & MacRae, 1973; Privalov, 1982) or inductive effects (Holmgren et al., 1998).

Collagens are a family of 19 distinct extracellular matrix molecules with a triple-helical domain as their common structural feature. Collagens self-associate to form distinct supramolecular arrangements that are their final functional form in tissues. The best characterized and most common collagen form is the 67 nm (D) periodic fibril, observed as the major structural component in tendon, skin, bone and most other connective tissues. Type I, II, III, V, and XI collagens self-associate in these characteristic fibrils, where adjacent molecules are staggered axially by 67 nm. Computational and experimental studies suggest that the axial staggering is due to electrostatic and hydrophobic interactions between neighboring molecules (Hulmes et al., 1973; Li et al., 1975; Trus & Piez, 1976), and that non-helical telopeptides at the termini of the triple helix may play a critical role (Prockop & Fertala, 1998). A particularly high proportion of X-Y-Gly triplets in which X and Y are oppositely charged residues is observed, and these triplets are suggested to play a role in determining the axial stagger (Doyle et al., 1974a).

In addition to self-association, the collagen triple helix is an important binding motif, capable of interacting with many different kinds of molecules. Charged residues in the triple helix have been implicated in the recognition and binding by collagen, as well as by the macrophage scavenger receptor and the serum complement protein C1q (Acton et al., 1993; Doi et al., 1993; Hoppe & Reid, 1994).

The importance of charged residues in the triple helix for self-association and ligand binding has provided an impetus for characterization of their structural features. About 15-20% of all residues in collagen are Lys, Arg, Glu, or Asp, and about 40% of all X-Y-Gly triplets contain at least one of these ionizable residues. Charged residues are asymmetrically distributed along the triple helix. In fibril-forming collagens, clusters of charged residues are interspersed with imino acid-rich regions in a pattern that is well conserved among a wide distribution of phyla (Bruns & Gross, 1973; Doyle et al., 1974b). Most basic residues are situated within two to three residues of an acidic residue (Jones & Miller, 1991; Salem & Traub, 1975; Traub & Fietzek, 1976) with a bias for Glu to occupy the X position, and Arg and Lys the Y position (Fietzek & Kühn, 1975; Salem & Traub, 1975). For example, in type III collagen, 96% of the Glu residues are in the X position, and 83% of the Arg residues are in the Y position (Ala-Kokko et al., 1989). Glu-Lys-Gly triplets are common in type IV collagen (constituting about 6% of all triplets) and one is found at the ligand-binding site of the macrophage scavenger receptor (Doi et al., 1993). Glu-Arg-Gly triplets are common in type I collagen and one appears in the integrin-binding site of collagen (Knight et al., 2000). Molecular modeling studies have demonstrated that the triple-helical structure offers the stereochemical possibility for sequentially adjacent charged residues to participate in interchain or interhelical ion pairs (Katz & David, 1990, 1992; Trus & Piez, 1976).

Here, we report the X-ray crystal structure determination of a peptide with the sequence (Pro-Hyp-Gly)4-Glu-Lys-Gly-(Pro-Hyp-Gly)5, denoted as EKG. The triple-helical EKG peptide is a (Pro-Hyp-Gly)10 homologue with the central residues in the X and Y positions replaced by glutamic acid and lysine, respectively. The EKG peptide forms a stable triple helix in aqueous solution with a melting temperature of 46 °C (Venugopal et al., 1994). The crystal structure of this peptide allows examination of the effect of a pair of adjacent charged residues on triple-helical conformation and molecular packing, and permits direct observation of the interactions of ionized Glu and Lys residues with solvent, backbone atoms, and with each other within a triple-helical context.

† The 7₅ notation is used to indicate the handedness of the helix and provide consistency with crystallographic screw symmetry. A (7-fold) 7₅ helix is equivalent to a left-handed 7/2 helix.

Results

Structural description

The EKG triple helix is overall a rod-shaped structure with no significant bend or untwisting (Figure 1). This is true along the entire molecule, including the central region where the glutamic acid and lysine residues lie. The molecule is approximately 88 Å long and 10 Å in diameter, and bears the hallmark signature of collagen triple-helical conformation: three polyproline II chains oriented in parallel fashion with a one-residue stagger between each chain. Main-chain conformational angles (Table 1), helical and superhelical parameters (Table 2), and interchain hydrogen bonding (Table 3) are consistent with the 7₅† triple-helical conformation observed in imino acid-rich regions of collagen triple-helical crystal structures (Bella et al., 1994; Kramer et al., 1998, 1999; Nagarajan et al., 1999; Okuyama et al., 1981). As this structure is the highest-resolution example to date, it presents a good opportunity to review and compare the main architectural elements of collagen triple helices as seen in the crystal structures of model peptides.


Figure 1. Stereoview of the EKG triple-helical asymmetric unit, colored by residue type (Pro, red; Hyp, orange; Gly, purple; Glu, green; Lys, aqua; three water molecules bound to glutamic acid N-H groups, turquoise). Gly90 was excluded from refinement due to disorder. Glu73 and Lys74 could not be accurately defined in the electron-density maps and were thus modeled only up to Cβ. The Figure was generated with MOLSCRIPT (Kraulis, 1991).
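The residue numbers used in the caption and throughout the text (Glu13, Lys44, Gly90 and so on) can be related to the peptide sequence with a short sketch. It assumes that the three chains are numbered consecutively as 1-30, 31-60 and 61-90, which is inferred from the residue numbers quoted in the text rather than stated explicitly; the names `CHAIN` and `describe` are illustrative only.

```python
# One EKG chain: (Pro-Hyp-Gly)4-Glu-Lys-Gly-(Pro-Hyp-Gly)5.
CHAIN = ["Pro", "Hyp", "Gly"] * 4 + ["Glu", "Lys", "Gly"] + ["Pro", "Hyp", "Gly"] * 5
CHAIN_LENGTH = len(CHAIN)  # 30 residues per chain


def describe(residue_number):
    """Map a residue number (1-90) to (chain, residue name, triplet position).

    Assumption: chains are numbered consecutively as 1-30, 31-60 and 61-90.
    """
    if not 1 <= residue_number <= 3 * CHAIN_LENGTH:
        raise ValueError("residue number must be between 1 and 90")
    chain_index, offset = divmod(residue_number - 1, CHAIN_LENGTH)
    position = ("X", "Y", "Gly")[offset % 3]
    return chain_index + 1, CHAIN[offset], position


if __name__ == "__main__":
    for number in (13, 14, 37, 43, 44, 71, 73, 74, 75, 90):
        chain, name, position = describe(number)
        print(f"{name}{number}: chain {chain}, {position} position")
```

Under this numbering the charged pair falls at residues 13/14, 43/44 and 73/74, and residue 90 is the C-terminal glycine of the third chain, in agreement with the residue labels used in the text.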

The rms deviation between the final coordinates and the idealized 7-fold structure is 0.44 Å for all backbone atoms. Even if the calculation is extended to the imino acid side-chains (with alanine residues at the glutamic acid and lysine positions) the rms deviation is only 0.51 Å. This contrasts with the variable helical symmetry observed in the crystal structure of the T3-785 peptide† (Kramer et al., 1999). This difference between the two peptides indicates that the inclusion of just one triplet with no imino acid residue in an otherwise "Pro-Hyp-Gly" environment is insufficient to induce observable changes in supercoiling. The central part of the molecule, containing the three Glu-Lys-Gly triplets, introduces almost no change into the helical and conformational parameters. Nevertheless, the temperature factor distribution indicates that the molecule has higher thermal disorder at both ends and in the central portion of the molecule.

† The T3-785 peptide, (Pro-Hyp-Gly)3-Ile-Thr-Gly-Ala-Arg-Gly-Pro-Hyp-Gly-(Pro-Hyp-Gly)3, contains a stretch of three triplets per chain with no imino acid.

The pattern of interchain hydrogen bonds in the Pro-Hyp-Gly regions is identical with those previously reported for other triple helix structures. The N-H···O=C hydrogen bonds follow the Rich and Crick II pattern (Rich & Crick, 1961), accompanied by a weaker set of Cα-H···O=C hydrogen bonds (Bella & Berman, 1996). The central region also shows water-mediated hydrogen bonds. Collagen triple helices containing residues other than Pro or Hyp in the X and Y positions have additional amide groups available for hydrogen bonding such that a second set of interchain hydrogen bonds, N-H (X position)···O=C (Gly), is possible (Ramachandran & Kartha, 1955). To prevent a distorted triple helix, this interaction must be mediated by a water molecule (Ramachandran & Chandrasekharan, 1968). This was observed previously in the crystal structure of the T3-785 peptide (Kramer et al., 1999), corroborating biochemical evidence indicating that two of the three amide groups per triplet are involved in hydrogen bonds.
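As a rough consistency check on the N-H···O=C geometry summarized in Table 3, the N-H···O angle can be estimated from the averaged distances with the law of cosines. The sketch below reads the first distance row of Table 3 as the H···O distance, and it assumes an N-H bond length of 0.98 Å; that value is not given in the text, which states only that hydrogen atoms were placed with X-PLOR default parameters. Combining averaged distances is also only an approximation to an averaged angle.

```python
import math

N_H_BOND = 0.98  # angstroms; assumed value, not stated in the paper


def angle_at_hydrogen(d_nh, d_ho, d_no):
    """N-H...O angle in degrees from three distances, via the law of cosines."""
    cos_angle = (d_nh ** 2 + d_ho ** 2 - d_no ** 2) / (2.0 * d_nh * d_ho)
    return math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    # Averaged H...O and N...O distances transcribed from Table 3.
    ekg = angle_at_hydrogen(N_H_BOND, 1.98, 2.92)
    gly_ala = angle_at_hydrogen(N_H_BOND, 2.06, 2.94)
    print(f"EKG estimate:      {ekg:.1f} deg (tabulated 160)")
    print(f"Gly->Ala estimate: {gly_ala:.1f} deg (tabulated 150)")
```

Under these assumptions the EKG estimate comes out close to the tabulated 160°, and the Gly→Ala estimate falls within the quoted standard deviation of the tabulated 150°.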


Table 1. Averaged values of EKG main-chain dihedral angles

Torsion angle  Position  EKG           PPG 2ᵃ         Gly→Alaᵇ'ᶜ     Collagen fiberᵈ
ω              X         179.7 (0.3)   177.8 (0.7)    179.9 (1.8)    180.0
φ              X         -73.7 (3.0)   -75.0 (2.3)    -72.6 (7.6)    -72.1
ψ              X         160.5 (5.9)   161.4 (2.9)    163.8 (8.8)    164.3
ω              Y         179.4 (0.7)   176.7 (1.9)    178.5 (1.5)    180.0
φ              Y         -59.7 (2.4)   -61.0 (0.9)    -59.6 (7.3)    -75.0
ψ              Y         151.3 (4.0)   153.3 (2.0)    149.8 (8.8)    155.8
ω              Gly       180.0 (0.1)   -179.9 (0.1)   177.3 (3.1)    180.0
φ              Gly       -72.4 (3.7)   -75.8 (2.0)    -71.9 (9.6)    -67.6
ψ              Gly       175.1 (4.2)   179.5 (3.0)    174.1 (11.9)   151.4

The values are compared with those of other triple-helical structures. Standard deviations are given in parentheses.
ᵃ Values for Hyp correspond to proline in the Y position.
ᵇ Residues at the ends of the molecule without hydrogen bonding mates (1-3, 31-33, 61, 30, 58-60, and 88-90) are excluded from the analysis.
ᶜ Bella et al. (1994).
ᵈ Fraser et al. (1979).
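When comparing the torsion angles in Table 1, values near ±180° need to be treated as periodic: 180.0° and -179.9° differ by 0.1°, not by 359.9°. The following is a minimal sketch of such a comparison between the EKG column and the collagen fiber column; the function names are illustrative and the numbers are transcribed from the table.

```python
# Mean main-chain torsion angles (degrees) transcribed from Table 1.
EKG = {
    ("omega", "X"): 179.7, ("phi", "X"): -73.7, ("psi", "X"): 160.5,
    ("omega", "Y"): 179.4, ("phi", "Y"): -59.7, ("psi", "Y"): 151.3,
    ("omega", "Gly"): 180.0, ("phi", "Gly"): -72.4, ("psi", "Gly"): 175.1,
}
FIBER = {
    ("omega", "X"): 180.0, ("phi", "X"): -72.1, ("psi", "X"): 164.3,
    ("omega", "Y"): 180.0, ("phi", "Y"): -75.0, ("psi", "Y"): 155.8,
    ("omega", "Gly"): 180.0, ("phi", "Gly"): -67.6, ("psi", "Gly"): 151.4,
}


def angular_difference(a, b):
    """Signed difference a - b in degrees, wrapped into [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def largest_deviation(model, reference):
    """Return (key, difference) for the angle that differs most in magnitude."""
    diffs = {key: angular_difference(model[key], reference[key]) for key in model}
    worst = max(diffs, key=lambda key: abs(diffs[key]))
    return worst, diffs[worst]


if __name__ == "__main__":
    # 180.0 and -179.9 are only 0.1 degrees apart once wrapping is handled.
    print(round(angular_difference(180.0, -179.9), 1))
    key, diff = largest_deviation(EKG, FIBER)
    print(key, round(diff, 1))
```

With these numbers the largest difference between the EKG crystal structure and the fiber model is in ψ at the Gly position, followed by φ at the Y position.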

In the EKG peptide, non-imino acid residues occupy one X position in each of the three chains. The amide groups of each of the three glutamic acids participate in interchain hydrogen bonds, mediated by one water molecule, with the glycine carbonyl group of an adjacent chain (Figure 2) (Kramer & Berman, 1999). This additional set of hydrogen bonds is oriented in the direction opposite from that of the Rich and Crick II hydrogen bonds. Nine such water-mediated bonds were observed in the T3-785 crystal structure, one for each available X-position amide group. The N-H groups of the lysine residues in the Y position are not similarly involved in an ordered set of hydration interactions.

The three water molecules involved in water-mediated hydrogen bonds in the EKG structure are at an average distance of 2.85 Å from the glutamic acid amide nitrogen atom and 3.04 Å from the glycine carbonyl oxygen atom. The average temperature factor of these water molecules is 25.2 Å², which is comparable with the average temperature factor of all solvent atoms, 22.6 Å². In one case (Figure 2), the mediating water molecule makes an additional contact (2.85 Å) with the hydroxyl group of the next Hyp residue in the C-terminal direction from the glycine residue (Kramer & Berman, 1999). This type of interaction was also proposed by Ramachandran et al. (1973) and reported for the T3-785 structure (Kramer et al., 1999). The amide groups of the Y position lysine residues are positioned more towards the solvent than are those in the X position and do not participate in repetitive hydrogen bonding interactions.

Imino acid ring puckering generally follows the tendency previously reported for collagen-like peptides (Bella et al., 1994; Kramer et al., 1998) with the ring in the X (proline) position puckered in a downward manner and the ring in the Y (hydroxyproline) position puckered in an upward manner. Momany et al. (1975) have described the geometries for these two conformations. There is one exception to this pattern, at position 37. Here, the proline ring puckers in the up rather than the down conformation. Electron density maps confirm this observation.

† The Gly→Ala peptide, (Pro-Hyp-Gly)4-Pro-Hyp-Ala-(Pro-Hyp-Gly)5, is a (Pro-Hyp-Gly)10 homologue with a single Gly to Ala substitution.

Crystal packing

The essentially straight EKG triple-helical molecules are packed in parallel. Laterally, the molecules are arranged in a quasi-hexagonal manner, similar to what was observed for the Gly→Ala† structure (Bella et al., 1994), with four of the six neighboring positions separated by about 14 Å and the remaining two by about 16 Å. Molecules lie in planes perpendicular to the crystallographic b-axis, with the molecular axis aligned parallel with the crystallographic ( 102) direction (Figure 3(a)). The positioning of the EKG molecule relative to the 2-fold screw axes creates a gap between the imino end of one molecule and the carboxyl end of the next along the molecular axis. It also generates an arrangement in which two of the nearest neighboring molecules (those related by the 2-fold screw at 1/2, y, 1/2) are in register, while the rest are staggered such that the Pro-Hyp-Gly portions of the molecules overlap. This staggered arrangement is reminiscent of that observed in D-periodic collagen fibrils.

Table 2. Average helical and superhelical parameters for EKG (standard deviations are given in parentheses)

Parameter                   EKG              Idealized 7₅ helix
Helical height (Å)          8.5 (0.1)        8.6
Superhelical height (Å)     2.8 (0.1)        2.9
Helical twist (deg)         51.7 (7.4)       51.4
Superhelical twist (deg)    -103.1 (7.9)ᵃ    -102.9

ᵃ The superhelical twist displays less deviation from the average in the central region than in the Pro-Hyp-Gly end regions.
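The idealized values in Table 2 follow from the 7-fold screw symmetry alone. The sketch below reproduces them under one interpretive assumption that is inferred from the numbers rather than stated in the text: the "superhelical" parameters describe the step from one residue to the next, and the "helical" parameters describe the step from one Gly-X-Y triplet to the next within a chain. The function name and dictionary keys are illustrative.

```python
def ideal_7_2_helix(triplet_height):
    """Idealized parameters of a left-handed 7/2 helix (equivalently a 7_5 screw).

    triplet_height: axial rise per triplet in angstroms (8.6 in Table 2).
    """
    residue_twist = -2 * 360.0 / 7            # 7 residues in 2 left-handed turns
    screw_twist = 5 * 360.0 / 7 - 360.0       # the same rotation written as a 7_5 screw
    triplet_twist = (3 * residue_twist) % 360.0  # three residue steps, reduced to 0-360
    return {
        "superhelical_twist": residue_twist,
        "screw_twist": screw_twist,
        "helical_twist": triplet_twist,
        "superhelical_height": triplet_height / 3.0,
        "helical_height": triplet_height,
    }


if __name__ == "__main__":
    for name, value in ideal_7_2_helix(8.6).items():
        print(f"{name}: {value:.1f}")
```

Rounded to one decimal place, these values match the idealized column of Table 2, and the 7₅ and left-handed 7/2 descriptions give the same rotation per residue, as stated in the footnote on the 7₅ notation.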


The combination of the molecular stagger and the gap between ends of helices produces a hole of about 10 Å × 12 Å extending along the crystallographic b direction (Kramer & Berman, 1998) and centered on the 2-fold screw axis located between molecular ends. Along the borders of this hole, the central portions of in-register molecules are coincident with the ends of neighboring staggered molecules (Figure 3(b)). As the glutamic acid and lysine side-chains and the N and C-terminal groups are ionized, a charged tunnel through the crystal is created. This open "hole" area, which is somewhat analogous to the gap regions of D-periodic fibrils (Figure 3(c)), is characterized by increased disorder as evidenced by elevated B-factors in comparison with the rest of the model, and by less well-defined hydration positions.

The EKG structure displays several direct interhelical contacts (Figure 4) between hydroxyproline side-chains (Kramer & Berman, 1998). In several places along the EKG molecule the hydroxyproline hydroxyl group is within hydrogen-bonding distance (average Oδ···Oδ distance is 2.75 Å) and geometry from a hydroxyproline hydroxyl group of a symmetry-related triple helix. This interaction occurs through one of the two potential hydration sites (Bella et al., 1995) of the hydroxyproline hydroxyl group. The direct interhelical interactions occur between staggered neighboring triple helices.

Figure 2. Hydrogen bonding topology in the EKG structure (Kramer & Berman, 1999). Rich and Crick II (Rich & Crick, 1961) interchain hydrogen bonds are shown with red broken lines. Water-mediated hydrogen bonds are created by three water molecules (WR) connecting glycine C=O groups from one chain with N-H groups from Glu residues in the X position of an adjacent chain (blue broken lines). One of these water molecules makes an additional contact to the hydroxyproline hydroxyl group in the same chain as the glycine carbonyl group.

Interactions of polar side-chains

There are direct interchain interactions involving ionized side-chains and backbone carbonyl or hydroxyproline hydroxyl groups (Figure 5). In two of the three such cases, lysine side-chains make direct interactions with a Y position carbonyl group of an adjacent chain. Lys14 Nζ makes a direct contact with the carbonyl group of Lys44. Similarly, Lys44 Nζ makes a direct contact with the carbonyl group of Lys74. If the side-chain of Lys74 could have been modeled, it is likely that it would make the same type of interaction with the carbonyl group of Hyp17. This type of interaction is analogous to those predicted for Y position Arg residues in conformational calculations of bovine type I skin collagen (Vitagliano et al., 1993). In the third direct contact, Glu13 Oε1 interacts with the hydroxyl group of Hyp71.

Additional interactions of the polar side-chains occur through water molecules (Figure 5). Lys14 forms a two-water molecule bridge with the hydroxyl group of Hyp2 of a symmetry-related helix and a two-water molecule bridge with Oε1 of Glu43. If Glu73 had been modeled, presumably it could make analogous interactions with Lys44. Finally, Lys44 participates in a two-water molecule bridge with the carbonyl group of Gly75.

No direct ion pair was observed between Glu and Lys side-chains, even though modeling studies report they are sterically possible. For example, given the one-residue stagger between adjacent chains, Lys14 of chain 1 is at the same axial level and in close proximity to Glu43 in chain 2. In addition, since X and Y residues are located on the surface of the molecule, interhelical interactions would also be possible.

Hydration analysis

First hydration shell water molecules tend to cluster in well-defined hydration positions; a proposed nomenclature for these positions was developed for the Gly→Ala structure (Bella et al., 1995).
Table 3. Average selected hydrogen-bonding parameters for EKG compared with Gly→Ala (Bella & Berman, 1996; Bella et al., 1994)

Interatomic distances (Å)             EKG            Gly→Ala
N-H···O=C hydrogen bonds
  N-H Gly···O X position              1.98 (0.08)    2.06 (0.07)
  N Gly···O X position                2.92 (0.08)    2.94 (0.08)
Cα-H···O=C hydrogen bonds
  Hα1 Gly···O Gly                     2.61 (0.11)    2.63 (0.20)
  Hα2 Gly···O Gly                     2.73 (0.09)    2.79 (0.16)
  Cα Gly···O Gly                      3.10 (0.08)    3.15 (0.15)
  Hα1 Gly···O X position              2.39 (0.08)    2.41 (0.18)
  Cα Gly···O X position               3.45 (0.08)    3.46 (0.18)
  Hα Y position···O X position        2.50 (0.09)    2.52 (0.19)
  Cα Y position···O X position        3.39 (0.07)    3.41 (0.16)

Interatomic angles (deg)              EKG            Gly→Ala
N-H···O=C hydrogen bonds
  N-H Gly···O X position              160 (8)        150 (4)
  H Gly···O=C X position              162 (5)        154 (5)
  N Gly···O=C X position              167 (3)        163 (5)
Cα-H···O=C hydrogen bonds
  Cα-Hα1 Gly···O Gly                  107 (4)        109 (7)
  Cα-Hα2 Gly···O Gly                  100 (5)        100 (8)
  Hα1 Gly···O=C Gly                   93 (4)         91 (5)
  Hα2 Gly···O=C Gly                   115 (3)        110 (6)
  Cα Gly···O=C Gly                    101 (3)        99 (5)
  Cα-Hα1 Gly···O X position           168 (5)        165 (6)
  Hα1 Gly···O=C X position            112 (4)        113 (8)
  Cα Gly···O=C X position             115 (3)        117 (8)
  Cα-Hα Y position···O X position     140 (4)        140 (5)
  Hα Y position···O=C X position      129 (2)        126 (5)
  Cα Y position···O=C X position      138 (3)        136 (5)

Hydrogen atoms have been placed based on the crystal coordinates of the heavier atoms, using X-PLOR default parameters (Brünger, 1992). Standard deviations are shown in parentheses.

For the most part, the pattern of the first hydration shell (water molecules bound directly to the peptide chain) follows that previously reported for several other triple-helical peptides (Bella et al., 1995; Kramer et al., 1998). The major difference is that glycine carbonyl groups in the EKG structure have two possible hydration sites (Figure 6(a)) rather than one. One of these positions (WN) is similar to the sole position observed in the earlier structures of Gly→Ala (Bella et al., 1995) and (Pro-Pro-Gly)10 (Kramer et al., 1998). A second position (WRamachandran or WR) is occupied by water molecules that are additionally bound to each of the three amide nitrogen atoms of the non-imino acid residues in the X positions, creating the repetitive series of interchain hydrogen bonds mentioned above (Kramer & Berman, 1999). The Gly→Ala structure displays a secondary hydration site on hydroxyproline hydroxyl groups near Cδ (WD1), which is not observed in the EKG structure. In the EKG structure, not every potential hydration site is occupied, often as a result of proximity to disordered areas of packing or occlusion by glutamic acid or lysine side-chains.

In all, 90 water molecules (within 3.2 Å) interact with backbone carbonyl and hydroxyproline hydroxyl groups, with the majority of these polar groups making at least one contact with a water molecule (Table 4). This is in agreement with the structures of (Pro-Pro-Gly)10 (Kramer et al., 1998) and Gly→Ala (Bella et al., 1995). The result, however, contrasts with the smaller number of water molecules reported for (Pro-Hyp-Gly)10 by Nagarajan et al. (1999). In particular, 21 out of 29 hydroxyl groups in the EKG structure participate in hydrogen bonding interactions with water, whereas only three out of seven have been reported to do so in the (Pro-Hyp-Gly)10 structure. Not every possible hydration position in EKG is occupied, as many of the hydroxyproline carbonyl and hydroxyl groups have only one of the two possible positions filled.

Second hydration shell water molecules (i.e. those bound to the water molecules of the first hydration shell) form intrahelical repetitive bridges (Figure 7) (Kramer & Berman, 1998). These bridges are analogous to the α, β, γ, δ bridges described for the Gly→Ala structure (Bella et al., 1995). Interhelical ω bridges are observed as well. Three water molecules, bound to second shell water molecules, can be characterized as occupying a third hydration shell. For the most part, however, water molecules bound to second hydration shell water molecules exist in the first or second hydration shell of a symmetry-related helix. Second hydration shell water molecules may also become first hydration shell water molecules from a neighboring helix. In general, the majority of bound water molecules are found in the Pro-Hyp-Gly portions of the peptide and the hydration pattern in the central portion of the molecule, which contains the polar residues, is somewhat disrupted (Kramer &

Berman, 1998). Although the polar side-chains do make contacts involving water, they tend to disrupt the typical hydration patterns found along the chains (Figure 7), especially for α and β bridges, which generally incorporate three or four water molecules, in contrast to the one or two water molecules usually included in γ and δ bridges. This disruption is analogous to the situation observed for double-helical DNA, where the hydration around polar phosphate groups was found to be less well organized than that around the non-polar bases (Schneider et al., 1998). It may be argued that the network of water molecules in the hydrophobic Pro-Hyp-Gly regions responds to inherent hydrogen bonding characteristics of water structure, connecting to the peptide where possible, whereas the charged groups in the polar central region impose a local structure in the water to improve ion solvation. In the EKG peptide, the disruption of the hydration in the central charged region is exacerbated by its location adjacent to open and disordered areas of the packing.

In the three cases where lysine or glutamic acid side-chains interact with the peptide, the interacting atom occupies a potential hydration site of either the carbonyl groups or the hydroxyl group. Both Lys14 Nζ and Lys44 Nζ occupy the WN hydration position of the Y position carbonyl group with which they are interacting (Figure 6(b)) and Glu13 Oε1 occupies the WB hydration site of the hydroxyl group of Hyp71. Through this interaction, Lys44 Nζ participates in the formation of an α3 water bridge between the carbonyl groups of Lys74 and Gly75 (Figures 5 and 7(a)).

Figure 3. (a) Projection of the crystal packing of EKG viewed along the b-axis. The darker molecules are in the y = 0 plane and the lighter molecules are in the y = b/2 plane. Central charged regions are depicted in blue and green. (b) Close-up view of the crystal packing shown in (a) focusing on the charged portions of the molecules. While the Glu43 side-chain appears to be close to its symmetry-related mate across the charged channel, because of the shift imposed by the 2-fold screw, the two are separated by 12.1 Å. Similarly, the distance between Lys14 and the nearest C-terminal carbonyl group is 9.0 Å. (c) Schematic illustration of the suggested arrangement of individual collagen molecules in the formation of D-periodic collagen fibrils.

Figure 4. Direct intermolecular contacts in the EKG structure (Kramer & Berman, 1998). Intermolecular interactions between hydroxyprolines (shown in light gray) occur among neighboring helices related translationally along the crystallographic c-axis. At several hydroxyproline hydroxyl groups along the EKG molecule, a hydroxyproline hydroxyl group from a neighboring molecule occupies one of the two possible ordered water positions (Bella et al., 1995), creating direct interhelical contacts with good hydrogen bonding geometry. The Figure was generated with MOLSCRIPT (Kraulis, 1991).

Discussion and Conclusions

The EKG structure provides the first observation of staggered packing in a crystal structure of a collagen-like peptide and offers insights into the relationship between sequence, molecular conformation, and intermolecular organization.

Molecular conformation

Inclusion of two charged residues per chain has little effect on the molecular conformation of the triple-helical structure, which shows the same 7-fold helical symmetry observed for other imino acid-rich collagen-like peptides. The EKG structure provides a second observation of water-mediated interchain hydrogen bonds in regions with non-imino acid residues in X and Y positions. The additional observation of this pattern in a completely different packing environment (Kramer & Berman, 1998) strengthens the notion that it is an integral feature of collagen conformation and further clarifies hydrogen bonding and hydration patterns in imino acid-poor regions.

The melting temperature of the EKG peptide (tm = 46 °C) is lower than that observed for (Pro-Hyp-Gly)10 (58 °C), consistent with Pro-Hyp-Gly being the most stabilizing tripeptide for the triple-helical conformation (Venugopal et al., 1994). This, however, does not preclude the possibility of electrostatic interactions contributing to the stabilization of the EKG peptide. Previous solution studies over a range of pH values suggested that this peptide was most stable at neutral pH, when both the lysine and the glutamic acid were ionized. This indicated the possibility of a stabilizing involvement of intramolecular ion pairs (Venugopal et al., 1994), such as those predicted from modeling (Katz & David, 1990, 1992). Further solution investigations of a host-guest set of Pro-Hyp-Gly analogue peptides with a central triplet substituted by Pro-Lys-Gly, Glu-Hyp-Gly, and Glu-Lys-Gly, respectively, suggested that the apparent stability could be accounted for by effects of pH on the Glu residue alone (Chan et al., 1997). The crystal structure does not show ion pairs involving Glu and Lys residues. However, the observed structure suggests the ionizable side-chains may be stabilizing as a result of direct interactions of the polar side-chains with carbonyl groups and hydroxyl groups of hydroxyproline, or through water molecules. The interaction of Lys side-chains with the carbonyl group of the Y position on adjacent chains may help to explain the preference for positively charged residues in the Y positions of various types of collagen.

Axial stagger related to the EKG triplet

Figure 5. The glutamic acid and lysine side-chains are involved in a variety of different interactions. Several of these occur through water molecules. These interactions are intrachain, interchain, or interhelical. Although direct contacts between lysine and glutamic acid side-chains are stereochemically possible, none is observed. Direct contacts with carbonyl and hydroxyl groups are observed. Because of the flexibility of the side-chains and their proximity to the open/disordered portion of the packing, water-bridge interactions up to 3.5 Å were considered when the geometry and the appearance of the bridge were reasonable. Since only water bridges containing one or two water molecules were included, more extensive contacts can be discerned if additional water molecules and longer bridges are considered.

Even though the inclusion of the EKG sequence did not signi®cantly alter the 75 triple-helical conformation, the molecules pack in a novel staggered arrangement in the crystal. The presence of the polar side-chains is important for the generation of the staggered packing, as it is not observed in other triple-helical peptide structures and suggests that there is a sequence-dependence to intermolecular interactions even in the absence of conformational variability. Other packing schemes have been reported for the crystal structures of three triple-helical peptides. The structure of (Pro-ProGly)10 was reported by Okuyama in 1981 (Okuyama et al., 1981), then further re®ned (Kramer et al., 1998; Nagarajan et al., 1998). Structures of (Pro-Hyp-Gly)10 (Nagarajan et al., 1999) and a homologous peptide with a single Gly to Ala substitution (Gly ! Ala peptide) (Bella et al., 1994) have also been reported. The lateral packing of EKG molecules is very similar to that in both the (Pro-Hyp-Gly)10 and the Gly ! Ala structures. Both (Pro-Pro-Gly)10 and (Pro-Hyp-Gly)10 form quasi-in®nite helices, precluding a distinct axial packing arrangement. In the Gly ! Ala peptide, which differs only in one tripeptide from the EKG peptide, adjacent molecules are in axial register. These results strongly suggest that in some way the EKG triplet is generating the axial stagger not seen in these other peptides with no charged groups, while not affecting the lateral packing. Since the EKG peptide crystallized under different conditions than the other peptides, no conclusion can be drawn about the effects of crystallization conditions on packing. The basis of axial packing lies in electrostatic interactions related to the presence of the Glu-LysGly triplet, but does not require the formation of ion pairs, since none was observed. 
The fact that the channel observed in the crystal is lined by both positive and negative charges, with a neutral net balance, indicates that some form of long-range electrostatic interactions may be important here. Direct interchain interactions that were observed include bonds of side-chains with backbone carbonyl groups and Hyp hydroxyl groups, as well as water-mediated connections involving the polar side-chains. Therefore, it appears that polar sidechains, but not direct interactions between them, are necessary for staggered packing.

taken into account. Chain 1 is shown in dark gray, chain 2 in medium gray and chain 3 in light gray. The Figure was generated with MOLSCRIPT (Kraulis, 1991).


Figure 6. Water distribution diagrams around the carbonyl groups of the EKG structure. Water molecules were selected using a 3.25 Å cutoff from the carbonyl or hydroxyl groups. The method of Schneider et al. (1993) was used to calculate three-dimensional contours. (a) The glycine carbonyl group is surrounded by two ordered hydration positions. The WA position was seen in the Gly→Ala peptide. The three water molecules participating in the interchain hydrogen bonds form the WR position. (b) The Y position (hydroxyproline and lysine) carbonyl groups are surrounded by two ordered hydration sites, both of which were observed in the Gly→Ala peptide. Here, the side-chain of Lys14 is shown folding back into the electron density cloud of the WN hydration site. The Figure was generated with CHAIN (Sack, 1988).
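The distance-based selection of hydration waters mentioned in the Figure 6 legend can be illustrated with a short sketch. Only the 3.25 Å cutoff is taken from the legend; the coordinates, the atom labels, and the function name `first_shell_waters` are hypothetical, and the sketch tests distance alone, whereas the procedure described in Materials and Methods also considered geometry.

```python
import math

CUTOFF = 3.25  # Å, cutoff quoted in the Figure 6 legend

def first_shell_waters(polar_atoms, waters, cutoff=CUTOFF):
    """Map each polar atom label to the waters lying within `cutoff` Å of it.

    polar_atoms and waters are dicts {label: (x, y, z)} with coordinates in Å.
    Only distance is tested; hydrogen-bond geometry is ignored in this sketch.
    """
    shell = {}
    for atom_label, atom_xyz in polar_atoms.items():
        close = []
        for water_label, water_xyz in waters.items():
            d = math.dist(atom_xyz, water_xyz)
            if d <= cutoff:
                close.append((water_label, d))
        # Nearest water first.
        shell[atom_label] = sorted(close, key=lambda item: item[1])
    return shell

# Hypothetical coordinates (Å), invented for illustration only.
polar_atoms = {"Gly O": (0.0, 0.0, 0.0), "Hyp OD1": (5.5, 0.0, 0.0)}
waters = {"W1": (2.8, 0.0, 0.0), "W2": (0.0, 3.0, 0.0), "W3": (9.0, 9.0, 9.0)}

shell = first_shell_waters(polar_atoms, waters)
for atom, found in shell.items():
    print(atom, [(label, round(d, 2)) for label, d in found])
```

In this toy input the water W1 lies within the cutoff of both polar atoms, which is the kind of shared contact that has to be counted once when first-shell waters are tallied.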

The variability of the polar side-chains, as evidenced by side-chain disorder and partial occupancy (see Materials and Methods), indicates that the set of interactions described here may be an interchangeable subset of an even larger group. The variability and flexibility of the polar side-chains leave them available for potential intermolecular interactions without competition from direct, specific interactions. The lack of direct, specific side-chain/side-chain interactions in the EKG structure supports results from 13C magnetic resonance studies, which demonstrated that collagen molecules in fibrils experienced a large degree of rotation about the helical axis (Torchia & Vanderhart, 1976). The authors suggested that stabilizing intermolecular contacts would not therefore comprise a single group of interactions between side-chains. Given the interactions in EKG, helices could potentially rotate in a gear-like manner and maintain equivalent interhelical interactions.

Table 4. Statistics for first hydration shell water molecules

Polar atom                                               Hyp Oδ         Gly O          Hyp O
Total water molecules attached (a)                       29             24             38
No. residues with waters/number residues available       21/29          24/27          23/29
No. residues having both hydration positions occupied    8              N.A.           14
% Possible hydrated                                      73             89             79
% Possible hydrated including dual positions             50             N.A.           40
Average distance W···O (Å)                               2.76 (0.12)    2.83 (0.13)    2.82 (0.16)

(a) The total number of water molecules attached is 90 rather than 91, as one water molecule contacts both a Gly carbonyl and a Hyp hydroxyl group. N.A., not applicable as Gly O has only one primary hydration position.

Direct interactions between hydroxyproline residues

In the EKG structure, direct interactions between two Hyp groups in adjacent molecules were observed in six cases per triple helix. This contrasts with the structure of the Gly→Ala peptide, where no Hyp-Hyp interaction was observed and where only water molecules bridge neighboring peptide molecules. Direct intermolecular interactions between hydroxyproline residues may play a role in helping to keep axial packing in register, but given the axial disorder and polymer-like stacking of the molecules observed in the (Pro-Hyp-Gly)10 structure (Nagarajan et al., 1999), where such interactions are possible but are not reported, potential Hyp-Hyp interactions alone are not sufficient to generate regularly aligned molecules.

In the 1950s, Gustavson observed that the shrinkage temperature of collagen fibers correlated with their hydroxyproline content, and suggested that hydroxyproline stabilized the collagen fibril structure through intermolecular hydrogen bonds of the type Hyp OH···O=C (amide carbonyl) (Gustavson, 1955). However, determination of the nature of collagen triple-helical structure, and the observation that the melting temperature of collagens correlated with hydroxyproline, led researchers to focus on the role of Hyp in molecular stability, rather than in fibril structure. Although direct hydrogen bonding of Hyp to backbone carbonyl groups within the same molecule is not possible, it was suggested that stabilization occurred through water-mediated hydrogen bonding to hydroxyproline (Privalov, 1982). The EKG structure shows novel arrangements of hydrogen bonds involving two hydroxyproline hydroxyl groups, rather than involvement of one hydroxyproline with amide carbonyl groups as previously suggested (Gustavson, 1955). Attention has focused on charged interactions and hydrophobic interactions as stabilizing forces for characteristic 67 nm periodic fibrils, but the observation of Hyp-Hyp interactions between different molecules in this crystal structure raises the alternative possibility that Hyp residues could be important to intermolecular interactions stabilizing collagen fibril structure as well as in intramolecular stabilization. These Hyp-Hyp interactions could play a role in axial registration of molecules, but are not essential (since they are not seen in the Gly→Ala peptide structure). Such interactions could stabilize interacting molecules once other interactions line the molecules in-register. The alternation of clusters of imino acid-rich regions with charged regions in fibril-forming collagens, positioning the imino acid-rich regions of neighboring staggered molecules in proximity to one another, would provide many cases where a Hyp is opposite to another Hyp and therefore capable of bridging molecules.
It is conceivable that once the collagen molecules are positioned, final adjustments could be made by direct contacts between hydroxyproline residues in different molecules or by water-mediated interactions. These results expand our concept of the unique role of hydroxyproline in collagen interactions, showing that it may mediate direct contacts between molecules as in a fibril, as well as playing a key role in molecular stability.

Hydration

A significant number of water molecules were localized in the EKG structure and found to form patterns and occupy positions similar to those previously reported (Bella et al., 1995; Kramer et al., 1998). The EKG structure is the highest-resolution example yet of a collagen triple-helical structure, and the observation of an extended, repetitive water network in this case reinforces the notion that it is a general feature of collagen.

An important function of collagen is binding and recognition, both to other collagen molecules in fibril formation and to other types of molecules including extra-cellular matrix components and integrins. It has been suggested (Bella et al., 1995) that fibril formation in collagen may be aided through recognition of disrupted areas of hydration patterns. Perhaps the disruption of the regular hydration pattern that is observed here in the central, charged region of the EKG molecule is important in this respect, in particular to the generation of a staggered arrangement. Further, the direct interactions of polar side-chains in hydration positions indicate that the hydration structure in collagen can be considered to include markers for potential intermolecular or binding interactions in which water molecules would be displaced. Similar observations have been made for various nucleotide-binding proteins. In structures of protein-DNA complexes such as the catabolite activator protein (CAP), trp repressor, and lambda repressor, hydration structure surrounding their respective unbound DNA structures served as markers for interactions with polar side-chains (Woda et al., 1998).

Materials and Methods

Crystal growth and data collection

The EKG peptide was synthesized by Venugopal et al. as described (Venugopal et al., 1994). Crystallization conditions were found using the hanging drop vapor diffusion method. Crystals of the EKG peptide were grown at 4 °C using 4 µl drops containing 10 mg/ml peptide, 15 % PEG 4000, 0.1 M Tris-HCl (pH 8.5), 0.3 M Li2SO4. The drops were placed above a reservoir containing 0.1 M Tris buffer, 0.6 M Li2SO4, 30 % PEG 4000. Crystals generally adopted a square or needle-like habit. A single crystal of size 0.2 mm × 0.3 mm × 0.1 mm was used for data collection.

Data were collected at 4 °C with CuKα radiation on a rotating-anode Enraf-Nonius CAD4 diffractometer. Crystals were mounted in a capillary and 7292 unique reflections were measured up to 1.75 Å resolution. Friedel pairs were collected for strong reflections. Radiation decay was monitored by periodically measuring the intensities of four reflections and was found to be negligible within the experimental error. Strong (5, 0, 15) reflections with a 2.8 Å Bragg spacing were observed, corresponding to the rise per tripeptide distance. Unique reflections were corrected for Lorentz polarization and absorption with the software package MOLEN (Fair, 1990). The space group is P2₁ with unit cell dimensions a = 29.24 Å, b = 26.57 Å, c = 45.89 Å, β = 96.04°. Considering a molecular mass of 8214 Da per triple helix, and the cell volume of 35,450 Å³, the asymmetric unit is one triple helix with a Matthews coefficient (Matthews, 1968) of 2.2 Å³/Da. Data collection information is summarized in Table 5.

Table 5. Data collection parameters and refinement statistics

A. Data collection
Data collection device                                       CAD4 diffractometer
Data collection temperature (°C)                             4
High resolution limit (Å)                                    1.75
Number of unique reflections collected                       7292
Overall completeness (%)                                     100
Completeness (top shell) % (1.81-1.75 Å, 718 reflections)    99.9
Number of reflections F > 2σ                                 5745
Space group                                                  P2₁
Unit cell dimensions (Å, °)                                  a = 29.24, b = 26.57, c = 45.89, β = 96.04

B. Refinement
Resolution (Å)                                               20-1.75
No. reflections (F > 2σF)                                    5481
Rcryst/Rfree (%)                                             18.0/22.7
Peptide non-hydrogen atoms/water sites                       569/153
rms deviation from standard geometries
  Bonds (Å)                                                  0.006
  Angles (deg.)                                              1.089
  Impropers (deg.)                                           1.241
Average temperature factors (Å²)
  All atoms                                                  13.3
  Peptide atoms                                              10.8
  Solvent                                                    22.6

Structure determination and refinement

Molecular replacement was performed using MERLOT (Fitzgerald, 1991) and a 7-fold symmetric idealized triple helix with the sequence (Pro-Hyp-Gly)4-Ala-Ala-Gly-(Pro-Hyp-Gly)5. A rotation search produced seven equivalent solutions, related by 51° around the helical axis. A translational search was then performed on each of these seven solutions. Using several resolution ranges with data up to 3.5 Å, one of the rotation solutions gave significantly better results in the translation search. The top peak from this search produced a model that displayed good crystal packing; 10 % of the reflections were set aside to be used for cross-validation (free R-factor). Several cycles of rigid body refinement using X-PLOR (Brünger, 1992) reduced the R value to 37.1 % (R-free 42.8 %) for the 859 reflections between 10 and 3.5 Å. X-PLOR parameters for the non-standard amino acid hydroxyproline were used as indicated for the Gly→Ala structure (Bella et al., 1994, 1995). The coordinates were then subjected to iterative rounds of positional refinement, simulated annealing, restrained individual temperature factor refinement, and manual model building using CHAIN (Sack, 1988). When the R-factor reached 31.4 % (R-free 39.7 %) for data between 8 and 3.0 Å, the electron density maps displayed good continuity and the need for further manual adjustments was not apparent.

At this point, the addition of water molecules was begun. The strong hydration of the triple helix necessitates early addition of water molecules, as they constitute a significant portion of the structure and are required to adjust the molecule to the final conformation. Water molecules of the first hydration shell were chosen by distance and geometry from main-chain carbonyl groups and then from hydroxyl groups, amino groups, and polar side-chains, and were investigated for position in spherical density, reasonable temperature factors, and improvement of R-values. Water molecules with B-factors significantly higher than the average were excluded. While the electron density of the peptide appeared good from the beginning of water addition, adding water molecules improved the phasing dramatically. After the addition of about 50 first shell water molecules, water molecules corresponding to more extended shells were incorporated. Again, distance and hydrogen bonding geometry with respect to water molecules already present were considered. Omit maps were also investigated. Certain areas of the packing possessed more ordered and well-defined water molecules than others. The less well-defined areas occurred in open regions of the packing as described above. In general, water molecules were not positionally restrained during refinement.

During the process of refinement, the resolution of the data was gradually extended to 1.75 Å. Refinement was completed with CNS (Brünger et al., 1998) using the maximum-likelihood method (Pannu & Read, 1996) and extending the low-resolution data to 20 Å. Overall anisotropic B-factor and bulk solvent corrections were utilized as well. One terminal glycine residue (Gly90) was excluded from the final refinements as it could not be seen in the electron density maps and displayed very high temperature factors.
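The R-factor and free R-factor quoted during refinement can be illustrated with a minimal sketch. It assumes the conventional definition, R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|), with both amplitude sets already on a common scale. The amplitudes and the helper names `r_factor` and `split_free_set` are hypothetical, and the random 10 % selection only mirrors the fraction stated in the text, not the reflections actually flagged by the authors.

```python
import random

def r_factor(f_obs, f_calc):
    """Conventional crystallographic R: sum(| |Fo| - |Fc| |) / sum(|Fo|).

    Assumes both lists are already on the same scale (no scale factor is fitted).
    """
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator

def split_free_set(n_reflections, fraction=0.10, seed=0):
    """Return (work, free) index lists; `free` holds about `fraction` of the data."""
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * fraction)
    return sorted(indices[n_free:]), sorted(indices[:n_free])

# Hypothetical structure-factor amplitudes, invented for illustration only.
f_obs = [100.0, 80.0, 60.0, 40.0, 20.0, 90.0, 70.0, 50.0, 30.0, 10.0]
f_calc = [90.0, 85.0, 55.0, 45.0, 25.0, 80.0, 75.0, 45.0, 35.0, 12.0]

work, free = split_free_set(len(f_obs))
r_all = r_factor(f_obs, f_calc)
r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
print(f"R(all) = {r_all:.3f}, R(work) = {r_work:.3f}, R(free) = {r_free:.3f}")
```

Because the free reflections are never used to adjust the model, their R value serves as a cross-validation check on the R value of the working set.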
The appearance of electron density for the polar side-chains in the center of the molecule was much better for the lysine residues than for the glutamic acid residues. Four of the six polar side-chains were traceable in density and were modeled. The two glutamic acid residues that were modeled were each given partial occupancy beyond the Cβ atom. Glu13 (0.4 occupancy) has an average temperature factor of 13.5 Å² for the entire residue and 14.3 Å² for the side-chain atoms only. Glu43 (0.5 occupancy) has an average temperature factor of 14.9 Å² for the entire residue and 21.5 Å² for the side-chain atoms only. Lys14 (full occupancy) has an average temperature factor of 14.9 Å² for the entire residue and 17.1 Å² for the side-chain atoms only. Lys44 (full occupancy) has an average temperature factor of 16.8 Å² for the entire residue and 19.4 Å² for the side-chain atoms only. The remaining two (Glu73 and Lys74) are disordered and were modeled only up to the Cβ atom. This is presumably because of the proximity to open areas of packing.

The final model (Figure 1), refined to a resolution of 1.75 Å, contains 569 non-hydrogen peptide atoms and 153 water molecules. The majority of these water molecules participate in extensive hydrogen bonding with peptide atoms or with other water molecules in a manner similar to that observed for the Gly→Ala peptide (Bella et al., 1995). Of the 153 water molecules added, 90 are directly hydrogen bonded to peptide atoms and can be considered to occupy a first hydration shell around the peptide. The R-factor for this model is 18.0 % (R-free 22.7 %) for 5481 reflections between 20 and 1.75 Å (F > 2σ). A summary of refinement information is given in Table 5.

Protein Data Bank accession number

Coordinates and structure factors have been deposited in the RCSB Protein Data Bank under accession code 1QSU.

Figure 7. Water bridging patterns observed in EKG. A cylindrical projection is shown with the first chain repeated on the right-hand side. In general, additional bridges can be made if less stringent distance criteria are used and the shape and appearance of the bridge are considered instead. Frequently a bridge can be made except for one water position that is generally absent due to proximity to an open, disordered region of the packing. In many of these cases there is evidence for the missing water molecule in the electron density maps but the water molecule did not meet the refinement and/or σ level criterion. (a) α and β bridges. Intrachain α bridges are generally pentagonal in shape and occur between Hyp C=O groups and Gly C=O groups. These bridges utilize either two (α2 bridges) or three (α3 bridges) water molecules to form the bridge. α2 bridges are formed when a symmetry-related peptide chain occludes the position of the bridging water. Like α bridges, β bridges are essentially pentagonal in shape. Interchain β bridges connect Hyp C=O groups and Gly C=O groups forming either three water (β3 bridges) or four water (β4 bridges) bridges. β4 bridges can often be viewed as extended β3 bridges that are expanded due to one or two long bridging connections. (c) γ and δ bridges. Intrachain γ bridges connect hydroxyproline OH groups with Gly C=O groups utilizing one, two, or three water molecules to form γ1, γ2, and γ3 bridges, respectively. In two cases, a hydroxyproline hydroxyl group from a symmetry-related helix participates in γ bridges through a direct interhelical contact. Interchain δ bridges connect hydroxyproline OH groups with glycine using either two (δ2 bridges) or three (δ3 bridges) water molecules. In two cases, a hydroxyproline residue from a neighboring triple helix occurs in δ bridges.
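As a cross-check on the crystal data given under Crystal growth and data collection, the sketch below recomputes the monoclinic cell volume, the Matthews coefficient, and a Bragg spacing from the quoted cell constants. It assumes Z = 2 triple helices per unit cell (two asymmetric units in P2₁, one triple helix each) and uses the standard monoclinic d-spacing formula. The function names are illustrative and are not taken from any software used by the authors.

```python
import math

# Cell constants and molecular mass as quoted in the text.
A, B, C = 29.24, 26.57, 45.89      # Å
BETA = math.radians(96.04)         # monoclinic angle
MASS_DA = 8214.0                   # Da per triple helix
Z = 2                              # assumption: two asymmetric units per P2(1) cell

def monoclinic_volume(a, b, c, beta):
    """Cell volume in Å^3 for a monoclinic cell (alpha = gamma = 90 degrees)."""
    return a * b * c * math.sin(beta)

def matthews_coefficient(volume, mass_da, z):
    """Matthews coefficient V_M = V / (Z * M), in Å^3/Da."""
    return volume / (z * mass_da)

def monoclinic_d_spacing(h, k, l, a, b, c, beta):
    """Bragg spacing d(hkl) in Å for a monoclinic cell."""
    inv_d2 = (
        (h * h / (a * a) + l * l / (c * c)
         - 2.0 * h * l * math.cos(beta) / (a * c)) / math.sin(beta) ** 2
        + k * k / (b * b)
    )
    return 1.0 / math.sqrt(inv_d2)

volume = monoclinic_volume(A, B, C, BETA)
v_m = matthews_coefficient(volume, MASS_DA, Z)
d_same_sign = monoclinic_d_spacing(5, 0, 15, A, B, C, BETA)
d_opposite_sign = monoclinic_d_spacing(5, 0, -15, A, B, C, BETA)

print(f"V = {volume:.0f} Å^3, V_M = {v_m:.2f} Å^3/Da")
print(f"d(5,0,15) = {d_same_sign:.2f} Å, d(5,0,-15) = {d_opposite_sign:.2f} Å")
```

With these inputs the volume comes out close to the quoted 35,450 Å³ and V_M rounds to the quoted 2.2 Å³/Da. For the Bragg spacing, the formula gives about 2.82 Å when h and l have opposite signs and about 2.59 Å when they have the same sign; which index convention applies to the reflection quoted in the text is not stated there, so this comparison is offered only as a consistency check.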

Acknowledgments

Overall support for this project was received from grants GM 21589 to H.M.B. and AR 19626 to B.B. from the National Institutes of Health, as well as a grant from the Pittsburgh Supercomputing Center. The research of R.Z.K. has been supported by the National Institutes of Health Molecular Biophysics Training Grant and the Department of Education's Graduate Assistance in Areas of National Need Grant. We are grateful to Tom Emge for his assistance with data collection.

References

Acton, S., Resnick, D., Freeman, M., Ekkel, Y., Ashkenas, J. & Krieger, M. (1993). The collagenous domains of macrophage scavenger receptors and complement component C1q mediate their similar, but not identical, binding specificities for polyanionic ligands. J. Biol. Chem. 268, 3530-3537.
Ala-Kokko, L., Kontusaari, S., Baldwin, C. T., Kuivaniemi, H. & Prockop, D. J. (1989). Structure of cDNA clones coding for the entire preproα1(III) chain of human type III procollagen. Biochem. J. 260, 509-516.
Bella, J. & Berman, H. M. (1996). Crystallographic evidence for Cα-H···O=C hydrogen bonds in a collagen triple helix. J. Mol. Biol. 264, 734-742.
Bella, J., Eaton, M., Brodsky, B. & Berman, H. M. (1994). Crystal and molecular structure of a collagen-like peptide at 1.9 Å resolution. Science, 266, 75-81.
Bella, J., Brodsky, B. & Berman, H. M. (1995). Hydration structure of a collagen peptide. Structure, 3, 893-906.
Brünger, A. T. (1992). X-PLOR, Version 3.1, A System for X-ray Crystallography and NMR, Yale University Press, New Haven, CT.
Brünger, A. T., Adams, P. D., Clore, G. M., DeLano, W. L., Gros, P., Grosse-Kunstleve, R. W., Jiang, J.-S., Kuszewski, J., Nilges, M., Pannu, N. S., Read, R. J., Rice, L. M., Simonson, T. & Warren, G. L. (1998). Crystallographic and NMR system: a new software suite for macromolecular structure determination. Acta Crystallog. sect. D, 54, 905-921.
Bruns, R. R. & Gross, J. (1973). Band pattern of the segment-long-spacing form of collagen. Its use in the analysis of primary structure. Biochemistry, 12, 808-815.
Chan, V., Brodsky, B., Beck, K., Kirkpatrick, A. & Ramshaw, J. (1997). Positional preferences of ionizable residues in Gly-Pro-Hyp in the collagen triple helix of host-guest peptides. J. Biol. Chem. 272, 31441-31446.
Doi, T., Higashino, K.-I., Kurihara, Y., Wada, Y., Miyazaki, T., Nakamura, H., Uesugi, S., Imanishi, T., Kawabe, Y. & Itakura, H. (1993). Charged collagen structure mediates the recognition of negatively charged macromolecules by macrophage scavenger receptors. J. Biol. Chem. 268, 2126-2133.
Doyle, B. B., Hukins, D. W. L., Hulmes, D. J. S., Miller, A., Rattew, C. J. & Woodhead-Galloway, J. (1974a). Origins and implications of the D stagger in collagen. Biochem. Biophys. Res. Commun. 60, 858-864.
Doyle, B. B., Hulmes, D. J. S., Miller, A., Parry, D. A. D., Piez, K. A. & Woodhead-Galloway, J. (1974b). Axially projected collagen structures. Proc. Roy. Soc. ser. B, 187, 37-46.
Fair, C. K. (1990). MOLEN: an interactive structure solution procedure, Enraf-Nonius, Delft, Netherlands.
Fietzek, P. P. & Kühn, K. (1975). Information contained in the amino acid sequence of the α1(I)-chain of collagen and its consequences upon the formation of the triple helix, of fibrils and crosslinks. Mol. Cell. Biochem. 8, 141-157.
Fitzgerald, P. M. D. (1991). MERLOT - An Integrated Package of Computer Programs for the Determination of Crystal Structures by Molecular Replacement - Version 2.4, Merck Sharp & Dohme Research Laboratories, Rahway, NJ.
Fraser, R. D. B. & MacRae, T. P. (1973). Conformation in fibrous proteins. In Molecular Biology (Horecker, B., Kaplan, N. O., Marmur, J. & Scheraga, H. A., eds), p. 628, Academic Press, New York.
Fraser, R. D. B., MacRae, T. P. & Suzuki, E. (1979). Chain conformation in the collagen molecule. J. Mol. Biol. 129, 463-481.
Gustavson, K. H. (1955). The function of hydroxyproline in collagens. Nature, 175, 70-74.
Holmgren, S. K., Taylor, K. M., Bretcher, L. E. & Raines, R. T. (1998). Code for collagen's stability deciphered. Nature, 392, 666-667.
Hoppe, H.-J. & Reid, K. B. M. (1994). Collectins - soluble proteins containing collagenous regions and lectin domains - and their roles in innate immunity. Protein Sci. 3, 1143-1158.
Hulmes, D. J. S., Miller, A., Parry, D. A. D., Piez, K. A. & Woodhead-Galloway, J. (1973). Analysis of the primary structure of collagen for the origins of molecular packing. J. Mol. Biol. 79, 137-148.
Jones, E. Y. & Miller, A. (1991). Analysis of structural design features in collagen. J. Mol. Biol. 218, 209-219.
Katz, E. P. & David, C. W. (1990). Energetics of intrachain salt-linkage formation in collagen. Biopolymers, 29, 791-798.
Katz, E. P. & David, C. W. (1992). Unique side-chain conformation encoding for chirality and azimuthal orientation in the molecular packing of skin collagen. J. Mol. Biol. 228, 963-969.
Knight, C. G., Morton, L. F., Peachey, A. R., Tuckwell, D. S., Farndale, R. W. & Barnes, M. J. (2000). The collagen-binding A-domains of integrins α(1)β(1) and α(2)β(1) recognize the same specific amino acid sequence, GFOGER, in native (triple-helical) collagens. J. Biol. Chem. 275, 35-40.
Kramer, R. Z. & Berman, H. M. (1998). Patterns of hydration in crystalline collagen peptides. J. Biomol. Struct. Dynam. 16, 367-380.
Kramer, R. & Berman, H. (1999). Water-mediation of hydrogen bonds in collagen triple-helical structure. In Perspectives in Structural Biology (Vijayan, M., Yathindra, N. & Kolaskar, A., eds), pp. 169-178, Universities Press Limited, India.
Kramer, R. Z., Vitagliano, L., Bella, J., Berisio, R., Mazzarella, L., Brodsky, B., Zagari, A. & Berman, H. M. (1998). X-ray crystallographic determination of a collagen-like peptide with the repeating sequence (Pro-Pro-Gly). J. Mol. Biol. 280, 623-638.
Kramer, R., Bella, J., Mayville, P., Brodsky, B. & Berman, H. M. (1999). Sequence dependent conformational variations of collagen triple-helical structure. Nature Struct. Biol. 6, 454-457.
Kraulis, P. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallog. 24, 946-950.
Li, S.-T., Golub, E. & Katz, E. P. (1975). Electrostatic side-chain complementarity in collagen fibrils. J. Mol. Biol. 98, 835-839.
Matthews, B. W. (1968). Solvent content of protein crystals. J. Mol. Biol. 33, 491-497.
Momany, F. A., McGuire, R. F., Burgess, A. W. & Scheraga, H. A. (1975). Energy parameters in polypeptides. VII. Geometric parameters, partial atomic charges, nonbonded interactions, hydrogen bond interactions, and intrinsic torsional potentials for the naturally occurring amino acids. J. Phys. Chem. 79, 2361-2381.
Nagarajan, V., Kamitori, S. & Okuyama, K. (1998). Crystal structure analysis of collagen model peptide (Pro-Pro-Gly)10. J. Biochem. (Tokyo), 124, 1117-1123.
Nagarajan, V., Kamitori, S. & Okuyama, K. (1999). Structure analysis of a collagen-model peptide with a (Pro-Hyp-Gly) sequence repeat. J. Biochem. (Tokyo), 125, 310-318.
Okuyama, K., Okuyama, K., Arnott, S., Takayanagi, M. & Kakudo, M. (1981). Crystal and molecular structure of a collagen-like polypeptide (Pro-Pro-Gly)10. J. Mol. Biol. 152, 427-443.
Pannu, N. & Read, R. (1996). Improved structure refinement through maximum likelihood. Acta Crystallog. sect. A, 52, 659-668.

Privalov, P. L. (1982). Stability of proteins. Proteins which do not present a single cooperative system. Advan. Protein Chem. 35, 1-104.
Prockop, D. J. & Fertala, A. (1998). Inhibition of the self-assembly of collagen I into fibrils with synthetic peptides. Demonstration that assembly is driven by specific binding sites on the monomers. J. Biol. Chem. 273, 15598-15604.
Ramachandran, G. N. & Chandrasekharan, R. (1968). Interchain hydrogen bonds via bound water molecules in the collagen triple helix. Biopolymers, 6, 1649-1658.
Ramachandran, G. N. & Kartha, G. (1955). Structure of collagen. Nature, 176, 593-595.
Ramachandran, G. N., Bansal, M. & Bhatnagar, R. S. (1973). A hypothesis on the role of hydroxyproline in stabilizing collagen structure. Biochim. Biophys. Acta, 322, 166-171.
Rich, A. & Crick, F. H. C. (1961). The molecular structure of collagen. J. Mol. Biol. 3, 483-506.
Sack, J. S. (1988). CHAIN - a crystallographic modeling program. J. Mol. Graph. 6, 224-225.
Salem, G. & Traub, W. (1975). Conformational implications of amino acid sequence regularities in collagen. FEBS Letters, 51, 94-99.
Schneider, B., Cohen, D. M., Schleifer, L., Srinivasan, A. R., Olson, W. K. & Berman, H. M. (1993). A systematic method for studying the spatial distribution of water molecules around nucleic acid bases. Biophys. J. 65, 2291-2303.
Schneider, B., Patel, K. & Berman, H. M. (1998). Hydration of the phosphate group in double helical DNA. Biophys. J. 75, 2422-2434.
Torchia, D. A. & Vanderhart, D. L. (1976). 13C magnetic resonance evidence for anisotropic molecular motion in collagen fibrils. J. Mol. Biol. 104, 315-321.
Traub, W. & Fietzek, P. P. (1976). Contribution of the α2 chain to the molecular stability of collagen. FEBS Letters, 68, 245-249.
Trus, B. L. & Piez, K. A. (1976). Molecular packing of collagen: three-dimensional analysis of electrostatic interactions. J. Mol. Biol. 108, 705-732.
Venugopal, M. G., Ramshaw, J. A., Braswell, E., Zhu, D. & Brodsky, B. (1994). Electrostatic interactions in collagen-like triple helical peptides. Biochemistry, 33, 7948.
Vitagliano, L., Némethy, G., Zagari, A. & Scheraga, H. (1993). Stabilization of the triple-helical structure of natural collagen by side-chain interactions. Biochemistry, 32, 7354-7359.
Woda, J., Schneider, B., Patel, K., Mistry, K. & Berman, H. M. (1998). An analysis of the relationship between hydration and protein-DNA interactions. Biophys. J. 75, 2170-2177.
Yonath, A. & Traub, W. (1969). Polymers of tripeptides as collagen models. J. Mol. Biol. 43, 461-477.

Edited by I. A. Wilson (Received 5 April 2000; received in revised form 23 June 2000; accepted 5 July 2000)