Hydrogen bonds with π-acceptors in proteins: frequencies and role in stabilizing local 3D structures


doi:10.1006/jmbi.2000.4301 available online at http://www.idealibrary.com on

J. Mol. Biol. (2001) 305, 535–557

Hydrogen Bonds with π-Acceptors in Proteins: Frequencies and Role in Stabilizing Local 3D Structures

Thomas Steiner* and Gertraud Koellner

Institut für Chemie Kristallographie, Freie Universität Berlin, Takustraße 6, D-14195 Berlin, Germany

A comprehensive structural analysis of X–H···π hydrogen bonding in proteins is performed based on 592 published high-resolution crystal structures (≤1.6 Å). All potential donors and acceptors are considered, including acidic C–H groups. The sample contains 1311 putative X–H···π hydrogen bonds with N–H, O–H or S–H donors, that is about one per 10.8 aromatic residues. By far the most efficient π-acceptor is the side-chain of Trp, which accepts one X–H···π hydrogen bond per 5.7 residues. The focus of the analysis is on recurrent structural patterns involving regular secondary structure elements. Numerous examples are found where peptide X–H···π interactions are functional in stabilization of helix termini, strand ends, strand edges, β-bulges and regular turns. Side-chain X–H···π hydrogen bonds are formed in considerable numbers in α-helices and β-sheets. Geometrical data on various types of X–H···π hydrogen bonds are given.

© 2001 Academic Press

*Corresponding author

Keywords: hydrogen bonding; aromatic hydrogen bonding; weak polar interactions; secondary structure; protein structure

Introduction

Hydrogen bonding plays a key role in the structure and function of proteins, including features such as overall folding, local architecture, protein-ligand recognition, enzymatic activity, protein hydration, and molecular dynamics (Jeffrey & Saenger, 1991). For a long time, research has concentrated on hydrogen bonds X–H···A in which X and A both are very electronegative atoms (mainly N and O). The structural aspects of these "conventional" hydrogen bonds in proteins are well investigated, and have been surveyed repeatedly (e.g. by Baker & Hubbard (1984) in their classical review). However, hydrogen bonding is a very broad phenomenon that is not restricted to N and O, but may involve less electronegative atoms. In fact, a great variety of hydrogen bonds are known, and their energies cover the whole range from over 30 kcal/mol for the strongest to less than 0.5 kcal/mol for the weakest species. In structural biology, some of the "non-conventional" hydrogen bonds have recently been shown to be of greater importance, in particular the variants C–H···O (e.g. Steiner & Saenger, 1993; Derewenda et al., 1995; Auffinger & Westhof, 1996; Fabiola et al., 1997; Ghosh & Bansal, 2000), and N/O–H···π(Ph) (see the literature cited below). Surveys of the earlier structural biology literature have been given for the special cases N–H···π(Ph) by Perutz (1993) and C–H···O by Wahl & Sundaralingam (1997), for "weakly polar interactions" in a wide sense by Burley & Petsko (1988) and for the whole field of non-conventional hydrogen bonds by Desiraju & Steiner (1999).

Hydrogen bonds between donors X–H and the π-electron cloud of an aromatic moiety (called aromatic, or π-facial, or X–H···Ph, or X–H···π(Ph) hydrogen bonds) were discovered by Wulf et al. (1936) and are today well documented for organic structural chemistry (e.g. Malone et al., 1997). In this hydrogen bond type, the donor group X–H is placed roughly above the center of an aromatic ring, and the X–H vector points at it. In the literature, different sets of parameters are used to describe the geometry; here, we use the set given in Figure 1. The geometry of X–H···π hydrogen bonds is very soft, even softer than that of conventional hydrogen bonds, allowing large lateral displacements of the donor, and strong bending of the hydrogen bond angle without much of a change in energy (e.g. see Levitt & Perutz, 1988; Rodham et al., 1993; Worth & Wade, 1995). In fact,


Aromatic Hydrogen Bonds

Figure 1. Geometric parameters in X–H···π hydrogen bonds. M is the ring midpoint, ω(X) is the angle between the X···M line and the ring normal (an angle ω(H) can be defined analogously, but is not analyzed here).

X–H may point at the aromatic center, at particular C–C bonds, or even at an individual C-atom. Energies are smaller than for O/N–H···O hydrogen bonds, but are still of a comparable magnitude. In uncharged systems, typical energy values are 2 to 4 kcal/mol. With charged donors, there is a strong (possibly dominant) contribution of the electrostatic cation-π interaction, and total energies are substantially larger (reviewed by Ma & Dougherty, 1997). Normal distances of X to the aromatic midpoint M are 3.2-3.8 Å, but the long-distance end is diffuse as is usual for hydrogen bonding in general, and shorter distances than 3.2 Å are occasionally found (e.g. 3.02 Å in ammonium tetraphenylborate, Steiner & Mason, 2000).

Probably the first example of an X–H···π(Ph) hydrogen bond in a peptide crystal structure was reported by McPhail & Sim (1965), but it had little impact in structural science at that time. Much later, N–H···π(Ph) hydrogen bonds in proteins attracted greater attention following the observation of such interactions in bovine pancreatic trypsin inhibitor (BPTI) by Huber and co-workers (Wlodawer et al., 1984), and in hemoglobin protein-ligand interactions by Perutz et al. (1986). The example in BPTI is shown in Figure 2 (geometrical details are given in the legend). It has become a classical model system, and was further investigated with NMR (Tüchsen & Woodward, 1987). Since then, a great number of X–H···π hydrogen bonds in proteins have been found in single case studies, involved in a wide variety of functions such as secondary structure stabilization (e.g. Armstrong et al., 1993), drug recognition (e.g. Kryger et al., 1999), DNA recognition (Parkinson et al., 1996), enzymatic action (e.g. Liu et al., 1993) and interactions of structural water molecules (Engh et al., 1996; Deacon et al., 1997; Koellner et al., 2000). The large surface makes π(Ph)-acceptors a "target that is easy to hit" (Suzuki et al., 1992). In consequence, they are good reserve acceptors for hydrogen bonds, and if conventional acceptors are locally lacking, X–H···π(Ph) bonds are often

Figure 2. The π-hydrogen bonds accepted by the side-chain of Tyr35 in bovine pancreatic trypsin inhibitor (BPTI). The H···M distances are 2.6 and 2.7 Å, the N···M distances 3.5 and 3.6 Å, and the angles ω(N) 4° and 18° for the interactions donated by Gly37 N and Asn44 Nδ2, respectively (1.0 Å structure, Wlodawer et al., 1984; PDB 5pti). A phosphate ion bonded to Tyr35 Oη is shown.

used to fill hydrogen bond coordination shells (Steiner et al., 1998). This allows conventional hydrogen bonds to be missed out without leading to completely unsatisfied hydrogen bond potentials (Worth & Wade, 1995).

X–H···π hydrogen bonds can be formed with many acceptors other than phenyl rings, such as various heterocycles, C=C, C≡C, and other π-bonded moieties. Of these, only the imidazole group of the histidine side-chain is of greater relevance in proteins, and N–H···π(Im) hydrogen bonding has been reported, e.g. by Vásquez et al. (1998). X–H···π(Im) hydrogen bonds can be formed only if the side-chain is in its neutral form, whereas cationic imidazole is unsuitable as a hydrogen bond acceptor.

Polar C–H groups can also form weak hydrogen bonds with π-acceptors (e.g. Steiner et al., 1995), and the importance of such interactions in proteins has been suggested (Nishio et al., 1995). However, with the exception of His Cε1–H, the polarity of C–H groups in proteins is relatively moderate compared to the more acidic C–H types known in organic chemistry (terminal alkynes, chloroform, HCN, etc.). Typical C–H···π interactions in proteins should be much weaker than N/O–H···π hydrogen bonds, but should still not be neglected a priori.


In peptides and proteins, there is a second kind of short and advantageous contact between π-faces and amines or amides, that is stacked interactions, in which N···M distances can also be around 3.5 Å (Figure 3) (e.g. Flocco & Mowbray, 1994; also see Burley & Petsko, 1986; Singh & Thornton, 1990). Competition between the two interaction types has been discussed in greater detail by Mitchell et al. (1994), who find a dominance of the stacked over the hydrogen bonded geometry by a factor of about 2.5. This can be readily explained by energetic considerations. For example, an Asn side-chain in π-stacked geometry can donate two N–H···O hydrogen bonds and enjoy the stacking itself, whereas in the perpendicular geometry, only one N–H···O and an N–H···π hydrogen bond can be formed, without the possibility of additional stacking. A similar preference of stacked over hydrogen bonded N···Ph contacts was found for peptide groups (Worth & Wade, 1995; Worth et al., 1998).

Even though there is already a voluminous literature on X–H···π bonds in proteins, a comprehensive description, and in particular an analysis of their positions in secondary structure, is lacking. Here, we attempt such an analysis by investigating X–H···π interactions in a large set of published high-resolution protein crystal structures. We consider all potential donors, including acidic C–H groups (Cα–H, Trp Cδ1–H, His Cδ2–H, His Cε1–H), and all potential π-acceptors. The hydrogen bond criteria are based mainly on experience with organic small molecule crystal structures, where these phenomena are much better investigated. The focus is on recurrent X–H···π hydrogen bonds closely connected with regular elements of secondary structure, whereas those in loop regions are only marginally touched. Numerical data on frequencies of occurrence and hydrogen bond geometries are also given. Protein-ligand X–H···π hydrogen bonds are too specific to the individual systems to be included here.

Database analysis of hydrogen bonding in macromolecular structures carries problems due to the constrained refinement, and to limited resolutions. Numerical values of distances and angles

Figure 3. Hydrogen bonded (left) and stacked (right) NH-over-π interactions shown for an Asn/Gln side-chain interacting with Phe.

are often biased, in particular if interaction types are analysed that have not been fully considered when choosing refinement constraints (short non-conventional hydrogen bonds are often mistaken by programs as "bad contacts" that must be avoided). Since the bias depends on the individual refinement strategy, it differs from structure to structure even at constant resolution. Fortunately, the observation of interaction patterns, which is central to the present study, is robust and reliable at resolutions around and below 2.0 Å. The resolution limit of 1.6 Å used in this study is quite cautious in this sense. On the other hand, analyzing fine details of interaction geometries, and absolute values of contact distances, clearly is not adequate and is not performed here.

Results

General

The structural data set consists of 593 high-resolution (≤1.6 Å) protein crystal structures with a total of 145,922 amino acid residues (Table 1). Because of resolution and refinement problems with the histidine side-chain, it was considered as a potential π-acceptor only in combination with N–H and acidic C–H donors having defined H-atom positions, and disregarded in combination with O–H and S–H groups. Full details of data retrieval, selection of geometric cutoff definitions and problems encountered in this respect are given in Methodology. The hydrogen bond definitions used in the following are given in Table 2.

Geometry of π-hydrogen bonds

The general features of π-hydrogen bond geometry are described in the cited literature, and are not discussed again. We give some data that are helpful for the further discussion. In the analysis, one has to separate X-over-π contacts of the two occurring overall geometries, that is hydrogen bonded

Table 1. The structural data set

Protein structures              593
Amino acid residues (a)     145,922
  Arg                          5916
  Asn                          7242
  Free Cys                      778
  Gln                          5126
  His                          3535
  Lys                          8356
  Phe                          5920
  Ser                        10,392
  Thr                          9629
  Trp                          2516
  Tyr                          5767
Peptide N–H groups          138,811
Cα–H groups                 156,422
Water molecules             141,733

(a) Only residues with coordinates given for all side-chain atoms are counted.


Table 2. Hydrogen bond criteria used (see Methodology)

Donor D–H: N–H with defined H-atoms; –NH3+; C–O–H; S–H; polar C–H; H2O
D···M (Å): <4.3; <3.8; <3.8; <4.0; <4.3; <3.8
D–H···M (deg.): >120; >120; -
ω(D) (deg.): <25; <25; <25; <25; <25; <25
Additional: -; a; b; -

a No carbonyl or carboxylate O atom, and not more than two O atoms of any kind within 3.0 Å from the donor.
b As a, with a distance limit of 3.8 Å.

and stacked (Figure 3). One way is to analyze interplanar angles (stacked = close to parallel, hydrogen bonded = close to perpendicular planes). Alternatively, one can focus on the orientation of the X–H vector. This is demonstrated for an important example, the abundant peptide n → (n − 2) π-hydrogen bond. If we define an N-over-π contact as one with ω(N) < 25°, and consider all peptide-over-π contacts with N···M < 5.0 Å as potentially interesting, a correlation plot of N–H···M angles against N···M distances is obtained as shown in Figure 4. The data cluster at relatively linear angles represents hydrogen bonds, whereas that at angles around and below 90° originates from stacked contacts. A cutoff value of 120° separates the clusters. Note that the population between stacked and hydrogen bonded arrangements, representing intermediate cases, does not fall to zero. With side-chain donors, the relative population of intermediate cases is substantially higher.

If one accepts ω(N) < 25° and N–H···M > 120° as adequate angular criteria, distributions of N···M distances are obtained as shown for the different example of side-chain to π contacts in Figure 5. There is a clear maximum at distances around 3.5 Å, and the shortest contacts are about 3.1 Å. In the following, we use a cutoff distance of 4.3 Å, which is rather long at first sight but is in accord with experimental and theoretical studies on small organic molecules (Desiraju & Steiner, 1999). If one wishes to consider N···C rather than N···M distances, one can find relations between the two listed for several geometries in Table 3.

In Table 4A, mean X···M distances are listed for several donors X–H. For roughly linear interactions, H···M distances are about 1.0 Å shorter. As expected, the interaction distances are shortest for the charged donor Arg. The relatively poor performance of peptide N–H is possibly a consequence of the more severe steric hindrances compared to side-chain donors. Note the relatively short C···M distances that can be achieved by acidic C–H groups.

For donors without a defined H-atom position (Lys Nζ, C–OH, C–SH, H2O), X–H···M angles cannot be determined, and an analysis as above is not possible. Simply retrieving all contacts with ω(X) < 25° and X···M < 4.3 Å would include a vast number of arrangements where X–H does not point at M, but elsewhere. In consequence, more rigid cutoff criteria have to be used and, as far as possible, additional considerations have to be performed (see Methodology). The mean geometries with a maximum distance of 3.8 Å for "putative π-hydrogen bonds" are given in Table 4B.
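The angular and distance criteria described above for donors with defined H-atom positions can be sketched in code. The following is a minimal illustration only, assuming Cartesian coordinates in Å for the donor atom X, its H atom, and the ring atoms listed in ring order. The function names and the three text labels are hypothetical and not taken from the paper, and estimating the ring normal from summed cross products is just one reasonable choice.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def _angle_deg(a, b):
    # Angle between two vectors in degrees; cosine clamped against rounding noise.
    c = _dot(a, b) / (_norm(a) * _norm(b))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

def ring_midpoint_and_normal(ring):
    """Midpoint M and an (unnormalized) normal of a ring given in ring order."""
    n = len(ring)
    m = tuple(sum(atom[i] for atom in ring) / n for i in range(3))
    normal = (0.0, 0.0, 0.0)
    for i in range(n):
        c = _cross(_sub(ring[i], m), _sub(ring[(i + 1) % n], m))
        normal = tuple(p + q for p, q in zip(normal, c))
    return m, normal

def classify_contact(x, h, ring, max_dist=4.3, max_omega=25.0, min_angle=120.0):
    """Return (X...M distance, omega(X), X-H...M angle, label).

    Defaults follow the cutoffs quoted in the text for N-H donors with
    defined H atoms: X...M < 4.3 A, omega(X) < 25 deg, X-H...M > 120 deg.
    """
    m, normal = ring_midpoint_and_normal(ring)
    m_to_x = _sub(x, m)
    dist = _norm(m_to_x)
    omega = _angle_deg(m_to_x, normal)
    omega = min(omega, 180.0 - omega)            # either ring face counts
    angle = _angle_deg(_sub(x, h), _sub(m, h))   # angle at H between X and M
    if dist >= max_dist or omega >= max_omega:
        label = "no X-over-pi contact"
    elif angle > min_angle:
        label = "hydrogen bonded"
    else:
        label = "stacked"
    return dist, omega, angle, label

if __name__ == "__main__":
    # Idealized planar six-membered ring, used only for demonstration.
    ring = [(1.39 * math.cos(math.radians(60 * k)),
             1.39 * math.sin(math.radians(60 * k)), 0.0) for k in range(6)]
    print(classify_contact((0.0, 0.0, 3.5), (0.0, 0.0, 2.5), ring))
```

Contacts labelled "stacked" here include the intermediate cases mentioned in the text, since a single 120° cutoff cannot separate them.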

Figure 4. Scatterplot of N–H···M angles against N···M distances in NH-over-π contacts of peptide groups and π-acceptors two residues back in the sequence (Phe and Tyr side-chains).

Figure 5. Distribution of X···M distances in side-chain N–H···π interactions with ω(N) < 25° and N–H···M angles > 120° (only side-chains with defined H-atom positions).


Table 3. Relation of X···M and X···C distances in X–H···π hydrogen bonds for different angles ω(X)

             X···C range (Å) for:
X···M (Å)    ω(X) = 0°    ω(X) = 10°    ω(X) = 20°    ω(X) = 30°
3.0          3.31         3.12-3.56     3.01-3.89     3.02-4.33
3.2          3.49         3.30-3.75     3.21-4.09     3.23-4.55
3.4          3.67         3.49-3.94     3.40-4.30     3.45-4.78
3.6          3.86         3.68-4.13     3.60-4.50     3.67-5.00
3.8          4.05         3.87-4.32     3.80-4.70     3.88-5.22
4.0          4.23         4.06-4.52     4.00-4.91     4.10-5.45
4.2          4.42         4.25-4.71     4.20-5.11     4.32-5.67

Calculated for the special case that X is placed "over" a C-M line.
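As a rough consistency check on the ω(X) = 0° column of Table 3: when X sits on the ring normal through M, the X···C distance follows from Pythagoras, X···C = sqrt((X···M)² + r²), where r is the M···C distance. The sketch below assumes r = 1.39 Å for a benzene-type ring, a value that is not stated in the text, and the function name is illustrative. The ranges listed for ω(X) > 0° depend on further geometric details and are not attempted here.

```python
import math

# Assumed M...C distance (ring radius) in Å; not given in the source text.
RING_RADIUS = 1.39

def x_to_c_distance_on_axis(x_to_m, ring_radius=RING_RADIUS):
    """X...C distance (Å) when X lies on the ring normal through M, i.e. omega(X) = 0."""
    return math.hypot(x_to_m, ring_radius)

if __name__ == "__main__":
    for d in (3.0, 3.2, 3.4, 3.6, 3.8, 4.0, 4.2):
        print(f"X...M = {d:.1f} Å  ->  X...C = {x_to_c_distance_on_axis(d):.2f} Å")
```

With this assumed radius the computed values agree with the first data column of Table 3 to two decimals.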

Numbers of π-hydrogen bonds

Based on the cutoff definitions given in Table 2, the total number of π-hydrogen bonds with the classical donors N–H, O–H and S–H in the structural sample is 1311 (Table 5). This corresponds to about one per 10.8 aromatic side-chains (without His). In addition, there is an even larger number of related interactions from Cα–H and acidic C–H donors (one per 8.7 aromatic side-chains). To give a more detailed view, Table 5 lists the numbers of π-hydrogen bonds for all relevant donor-acceptor combinations separately. It is seen that for practically all combinations that can form π-hydrogen bonds in principle, such hydrogen bonds are actually found in crystal structures.

An interesting but rare variant is the S–H···π hydrogen bond, which is occasionally formed by free cysteine side-chains. Since, due to their small absolute number, they will not appear in the following sections on recurrent patterns, a single example of this normally overlooked interaction is shown in Figure 6 (for details, see the Figure legend).

The numbers in Table 5 necessarily depend on the cutoff criteria used. This circumstance is discussed in more detail in Methodology.

Acceptor efficiencies

By far the most efficient π-acceptor in proteins is the indole group of tryptophan. About 17.6 % of all Trp side-chains accept π-hydrogen bonds from N–H, O–H or S–H donors (Table 6). For Tyr, this fraction is only half as large, 8.8 %, and for Phe it is only 5.8 %. The imidazole ring of histidine was considered as a potential acceptor only in combination with N–H donors (see Methodology), and with these, the frequency of π-hydrogen bonds is about a factor of 10 smaller than for N–H···π(Trp). The low frequency of N–H···π(His) bonds is not surprising, because, in principle, His may accept such an interaction only in neutral form. It is seen in the different lines of Table 6 that the hierarchy of acceptor efficiency is invariant for the different donor types, Trp ≫ Tyr > Phe ≫ His.

The higher acceptor efficiency of the Trp side-chain compared to Tyr and Phe is possibly a rather trivial consequence of its larger aromatic surface. In fact, both rings of the indole group may act as π-acceptors, and both have very similar acceptor efficiencies (of the 440 π-hydrogen bonds accepted by Trp, 196 are directed at the 6-ring, 196 at the 5-ring and 48 at both rings simultaneously). Apart from the purely geometric advantage, the conjugated nature of the two rings might also increase the π-acceptor strength compared to Tyr and Phe.

Some side-chains accept π-hydrogen bonds at both faces at the same time, such as Tyr35 in BPTI (Figure 2). In the present sample, 49 such cases are found (29 with Trp, 13 with Tyr, 7 with Phe). The relative frequencies allow a rough estimate of cooperativity. For Trp acceptors, for example, the chance of π-hydrogen bond formation is about 17.6 % (Table 6). If the two faces acted as acceptors independently, the frequency of double acceptors should be about 0.176²,

Table 4. Mean geometry of hydrogen bonds

Donor type                        ⟨X···M⟩ (Å)   ⟨ω(X)⟩ (deg.)   ⟨X–H···M⟩ (deg.)

A. Mean geometry of X–H···π hydrogen bonds involving X–H with defined H-atom position (X···M < 4.3 Å)
Arg Nε, Arg Nη                    3.44(6)       13.6(8)         134(2)
His Nε2                           3.54(6)       8.0(2)          145(5)
Asn Nδ2, Gln Nε2                  3.57(2)       14.7(5)         140(1)
Peptide                           3.71(2)       13.9(4)         146.7(8)
Trp Cδ1, His Cδ2, His Cε1         3.63(2)       12.7(5)         146(1)
Cα                                3.82(1)       15.6(2)         139.8(4)

B. Mean geometry of putative X–H···π hydrogen bonds of X–H without defined H-atom position (X···M < 3.8 Å)
Water                             3.503(8)      14.4(2)
Ser Oγ, Thr Oγ1, Tyr Oη           3.51(4)       15.1(9)
Lys Nζ                            3.53(3)       15.5(9)
Cys Sγ                            3.79(5)       10(2)


Table 5. Numbers of aromatic hydrogen bonds with N–H, O–H, S–H and acidic C–H donors

Donors                     All accept.   Phe acc.   Tyr acc.   Trp acc.   His acc.
Any N–H, O–H, S–H          1311          341        505        440        25a
Peptide N–H                239           55         92         81         11
Side-chain N–H, all        280           58         93         115        14
  Arg                      60            11         16         25         8
  Asn                      98            16         47         32         3
  Gln                      32            2          8          21         1
  His (Nε–H)               10            4          3          1          2
  Lys                      59            14         12         33         -
  Trp                      21            11         7          3          -
Side-chain O–H, all        44            11         21         12         -a
  Ser                      20a           2          12         6          -b
  Thr                      13a           4          3          6          -b
  Tyr                      11a           5          6          -          -b
Water                      735a          209        299        227        -b
Cysteine S–H               9a            4          -          5          -b
Acidic C–H
  Cα–H                     1452          445        478        358        171
  His Cδ–H                 101           47         21         15         18
  His Cε–H                 48            9          22         15         2
  Trp Cδ–H                 37            11         13         11         2

a O–H and S–H not considered with π(His) acceptors.
b The 11 Ser Oγ···π(His), 11 Thr Oγ···π(His), 11 Tyr Oη···π(His), 25 OWat···π(His) and six Cys Sγ···π(His) contacts formally fulfil the hydrogen bond criteria applied.

that is 3.1 %. The observed frequency, 1.1 %, is smaller by about a factor of 3, indicating appreciable anticooperative behaviour (that is, if an aromatic group accepts a π-hydrogen bond, formation of a second one is disfavoured).

Donor efficiencies

When counting the numbers of π-hydrogen bonds per donor, one arrives at rather low frequencies around and below 1 % (Table 7). In particular for peptide N–H, the relative frequency is only 0.17 %, reflecting a strong preference for stronger conventional hydrogen bonding. For side-chain N–H donors, the relative frequencies are slightly higher. The very low rates of π-hydrogen bonding of O–H donors are to some degree a consequence of the more rigid cutoff criteria that had to be used (see Methodology), but might also be connected with the ability to rotate and the resulting better chance to find a partner for a conventional hydrogen bond. Cα–H and acidic side-chain C–H groups are likewise involved in C–H···π interactions only at frequencies around 1 %.

The peptide N–H donor

Figure 6. Face-on S···Ph contact in avian sarcoma virus integrase, with S···M = 3.6 Å and ω(S) = 3° (1.42 Å structure, Lubkowski et al., 1999; PDB 1cxu). Since there is no conventional hydrogen bond partner for S–H within 4.5 Å, the arrangement is suggestive of S–H···π(Ph) hydrogen bonding. The geometry compares well with S–H···Ph hydrogen bonds in organic crystal structures, where H-atoms have been located and refined (e.g. Rozenberg et al., 1999).

Hydrogen bonds from peptide N–H groups are a major constituent of the π-hydrogen bonds in the sample (Table 5). The chance of the different side-chains to be involved in such an interaction ranges between 3.2 % for Trp and 0.3 % for His (Table 6). In the following, we generally assign residue n to the donor. A hydrogen bond n → (n − 2), for example, is therefore an interaction of an X–H group with a residue two places back in the sequence. This parallels the usual nomenclature for conventional hydrogen bonds, but contrasts earlier studies on π-hydrogen bonding in peptides and proteins (e.g. Armstrong et al., 1993; Worth & Wade, 1995; Worth et al., 1998; Steiner, 1998), including some of


Table 6. Acceptor efficiencies: percentage of the π-acceptors involved in hydrogen bonds

                                 Phe     Tyr     Trp     His
With any N–H, O–H, S–H           5.8     8.8     17.6    0.7a
With peptide N–H                 0.9     1.6     3.2     0.3
With side-chain N–H              1.0     1.6     4.5     0.4
With side-chain O–H              0.2     0.4     0.6     -a
With water O–H (a)               3.5     5.2     9.0     -a
With two O–H/N–H                 0.1     0.2     1.1     -
With Cys Sγ–H                    0.1     -       0.2     -a
With Cα–H                        7.5     8.3     14.2    4.8
With acidic side-chain C–H       1.1     1.0     1.6     0.6

a π(His) acceptors are not considered with O–H and S–H donors.
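The independence argument used for double acceptors (two ring faces accepting independently would give a double-acceptor frequency of p²) can be written out as a short calculation. This is a sketch under the assumptions stated in the comments: the single-acceptor fraction is the 17.6 % quoted for Trp, the observed fraction is taken as the 29 Trp double acceptors divided by the 2516 Trp residues of Table 1, and the variable and function names are illustrative.

```python
def expected_double_fraction(p_single):
    """Expected fraction of side-chains accepting at both faces if the faces act independently."""
    return p_single ** 2

# Values quoted in the text and tables for Trp acceptors.
P_TRP = 0.176        # fraction of Trp accepting a pi-hydrogen bond (Table 6)
N_TRP = 2516         # Trp residues in the data set (Table 1)
N_DOUBLE_TRP = 29    # Trp side-chains accepting at both faces (text)

expected = expected_double_fraction(P_TRP)
# Assumption: the observed frequency is simply the double-acceptor count per Trp residue.
observed = N_DOUBLE_TRP / N_TRP
ratio = expected / observed

if __name__ == "__main__":
    print(f"expected if independent: {100 * expected:.1f} %")
    print(f"observed:                {100 * observed:.2f} %")
    print(f"expected / observed:     {ratio:.1f}")
```

Under these assumptions the observed fraction comes out close to the quoted 1.1 %, and the ratio is a little below 3, in line with the "factor of about 3" used to argue for anticooperativity.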

our own work. Earlier studies have often put the π-acceptor into the focus of interest, assigned it residue number n, and/or named it first ("aromatic/amine interactions"). This has led to assignments like n(n + 2) to the arrangement that is in this work called n → (n − 2).

Using our definition, a distribution of sequence distances between donor and acceptor is obtained as given in Table 8 (left column). The uneven appearance of the distribution indicates strong structural specificities, and these must be elucidated. By far the most frequent peptide-to-π hydrogen bond is the kind n → (n − 2), which occurs 132 times in the sample. The second most frequent, n → (n + 3), is already much rarer (nine examples). All other cases with donor and acceptor close in sequence are either very rare or are not observed. Cases with donor and acceptor distant in sequence occur in significant numbers, but even all together are rarer than the arrangement n → (n − 2) alone (84 for absolute distance > 6).

Side-chains accepting a peptide π-hydrogen bond often form, in addition, a conventional hydrogen bond with a partner close in sequence to n. Most frequent are hydrogen bonds with the carbonyl O or the side-chain of residue (n + 1). A full list of such interactions found in the data sample is given in Table 9, and a representative example is

Table 7. Donor efficiencies: percentage of the potential donors involved in aromatic hydrogen bonds

Peptide N–H                  0.17
Side-chains
  Arg                        1.0
  Asn                        1.4
  Cys (a,b)                  1.1
  Gln                        0.6
  His (only Nε–H)            0.3
  Lys (b)                    0.7
  Ser (a,b)                  0.2
  Thr (a,b)                  0.1
  Trp                        0.8
  Tyr (a,b)                  0.2
Water (a,b)                  0.52
Cα–H                         0.93
Side-chain acidic C–H        1.9

a Note that more restrictive hydrogen bond criteria were used compared to N–H with defined H-atom position, Table 2.
b Without possible aromatic hydrogen bonds to π(His).
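The donor efficiencies in Table 7 appear to be simple ratios of the π-hydrogen bond counts (Table 5, "All accept." column) to the numbers of potential donor groups (Table 1). The sketch below recomputes a few of them under that assumption; the dictionary layout and function name are illustrative, and only rows whose inputs are unambiguous in the tables are included.

```python
# (number of pi-hydrogen bonds from Table 5, number of potential donors from Table 1)
DONOR_COUNTS = {
    "Peptide N-H": (239, 138_811),
    "Arg": (60, 5916),
    "Asn": (98, 7242),
    "Gln": (32, 5126),
    "Lys": (59, 8356),
    "Trp": (21, 2516),
    "Water": (735, 141_733),
    "C-alpha-H": (1452, 156_422),
}

def donor_efficiency_percent(n_bonds, n_donors):
    """Percentage of potential donors involved in aromatic hydrogen bonds.

    Assumes each counted bond comes from a distinct donor group.
    """
    return 100.0 * n_bonds / n_donors

if __name__ == "__main__":
    for name, (n_bonds, n_donors) in DONOR_COUNTS.items():
        print(f"{name:12s} {donor_efficiency_percent(n_bonds, n_donors):.2f} %")
```

For the rows listed, the recomputed percentages round to the values given in Table 7, which supports reading the table as bonds per potential donor.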

shown in Figure 7(a) (in B28Asp insulin, Whittingham et al., 1998; a further example is seen in Figure 11 shown below). All these pairs of conventional and π-hydrogen bonds constitute strong local arrays that should stabilize the side-chains tightly in their positions. Note that for Tyr in particular, such patterns are formed by one-third of all side-chains involved in peptide N–H···π hydrogen bonds.

The Trp side-chain, because of its large surface, may form another interesting interaction pattern, that is pairs of π-hydrogen bonded and stacked interactions with successive peptide groups. In this array, the six-membered ring may accept the hydrogen bond and the five-membered ring be stacked, or vice versa. Seven decent examples of this kind are found in the sample, with a typical one shown in Figure 7(b) (in cholesterol esterase, Chen et al., 1998).

When considering the role of peptide π-hydrogen bonds in secondary structure elements, it is obvious that they cannot be formed in central parts of regular β-sheets, α-helices and 3₁₀-helices because there, N–H is involved in main-chain hydrogen bonding. However, this does not mean that peptide-to-π hydrogen bonds can be formed only in loops. On the contrary, free peptide donors are available at the ends and edges of sheets and the N-termini of helices, and at structural irregularities within strands and helices. Upon closer inspection, it is seen that many of these peptide groups are actually involved in π-hydrogen bonds, and that these interactions often seem to be operative in edge and terminus stabilization.

The peptide n → (n − 2) π-hydrogen bond. General

Because of its frequent occurrence, the n → (n − 2) π-hydrogen bond deserves particular attention. In Figure 2, the overall geometry is shown for the classical example in BPTI (Wlodawer et al., 1984). The arrangement represents a circular array with a relatively small number of atoms, and its geometry must be affected by stereochemical restrictions (if the π-acceptor is counted as a single "quasi-atom", the cycle consists of only nine "atoms"). Within the cycle, the adjustable torsion angles are ψ(n − 1), φ(n − 1), ψ(n − 2), χ1(n − 2) and χ2(n − 2). According to the


Figure 8. Ramachandran diagram of peptide n → (n − 2) π-hydrogen bonded arrays. Small circles are from residues (n − 1), and large circles from residues (n − 2). The conformation of residue n is not confined.

Figure 7. Side-chains forming pairs of peptide π-hydrogen bonds and conventional interactions with closely neighboring partners. (a) Example with an Oγ–H···O hydrogen bond; (b) example with an N–H···π stacked interaction. (a) In B28Asp insulin, 1.6 Å structure, Whittingham et al., 1998 (PDB 1zeg), H···M = 2.7 Å, N···M = 3.5 Å, N–H···M = 137°, ω(N) = 11°, Tyr Oη···Asp Oδ1 = 2.7 Å. (b) In pancreatic cholesterol esterase, 1.6 Å structure, Chen et al., 1998 (PDB 2bce), H-bond: H···M = 2.3 Å, N···M = 3.2 Å, N–H···M = 165°, ω(N) = 2°; stack: N···M = 3.6 Å, ω(N) = 10°.

Ramachandran diagram of the relevant residues (n − 1) and (n − 2) (Figure 8), one particular backbone conformation is adopted in most cases. Residue (n − 2), that is the one carrying the acceptor side-chain, is typically in β-conformation, whereas the conformation of residue (n − 1) is αR. For both, the scatter of data points is considerable, indicating substantial conformational variability of the array. For proper positioning of the acceptor group, the side-chain torsion angles have values of χ1 around 180° and χ2 around 60° or −120° (only for Trp and His side-chains, χ2 of 60° and −120° represent different conformations; both occur). Apart from this normal overall conformation, Figure 8 shows a few outliers, as will be discussed in the last section on peptide n → (n − 2) π-hydrogen bonds. Because of the steric restrictions, the N–H···M angles are not linear but have typical values around 150° (Figure 4). The N–H vector often points in the direction of Cγ rather than M, but there are many exceptions (see Figures shown below).

The amino acid sequence characteristics of peptide n → (n − 2) π-hydrogen bonds have been analyzed in detail by Worth & Wade (1995), and it is sufficient to note that their results are confirmed here. In particular, it is confirmed that the donor residue (n) is strongly preferred to be Gly (87 cases out of 132), whereas such a preference is not present for the central residue (n − 1).

The peptide n → (n − 2) π-hydrogen bond and secondary structure

In peptide n → (n − 2) π-hydrogen bonds, the typical backbone conformation is β at (n − 2) and αR at (n − 1), whereas the conformation of n is not confined. The N–H donor (n) is blocked from forming a conventional hydrogen bond, whereas all other hydrogen bond functionalities are "free". This


Table 8. Sequence distance between donor and acceptor

Type: n → (n < (n − 6)); n → (n − 6); n → (n − 5); n → (n − 4); n → (n − 3); n → (n − 2); n → (n − 1); n → (n + 1); n → (n + 2); n → (n + 3); n → (n + 4); n → (n + 5); n → (n + 6); n → (n > (n + 6))

Peptide N–H: 49 1 1 3 132 3 9 1 4a 1 35
Side-chain N/O–H: 127 2 22 19 5 17 1 8 4 9 119
Cα–H: 533 35 41 43 5 33 27 2 38 83 10 43 11 548
Side-chain C–H: 58 21 2 10 18 2 2 3 71

a All from one crystal structure with four symmetry-independent molecules.

allows incorporation in several secondary structure elements that will now be discussed. The corresponding frequencies of occurrence are listed in Table 10.

By far the most frequent position in regular secondary structure elements is at the carbonyl end of β-strands (34 cases in Table 10). In this position, residue (n − 2) forms the last regular residue of the strand, and the peptide donor of residue (n − 1) may form the last regular main-chain hydrogen bond. The strand is then disrupted by the αR-conformation of residue (n − 1), which rotates the peptide donor of n out of the strand plane. The side-chain of residue (n − 2) can be nicely folded over that donor, and the resulting n → (n − 2) π-hydrogen bond leads to a reasonably well satisfied hydrogen bond situation in the loop region at the end of the strand. An example at the end of a strongly twisted antiparallel β-ribbon has already been shown for BPTI in Figure 2. A typical example in a mixed β-sheet is shown in Figure 9(a) (in thermolysin, structure published by Holland et al., 1992; note also stacking of the Trp 6-ring

Table 9. Side-chains accepting a π-hydrogen bond from peptide N–H (n), and forming an additional conventional hydrogen bond with a residue near n

                            Tyr Oη   Trp Nε   His Nδ   His Nε
With O(n − 2)               4        2        -        1
With O(n + 1)               10       1        1        1
With O(n + 2)               2        3        -        -
With N(n + 2)               4        -        -        -
With side-chain (n − 1)     2        -        -        -
With side-chain (n)         5        -        -        1
With side-chain (n + 1)     3        7        2        -
With side-chain (n + 2)     -        1        -        -

Table 10. Positioning of the peptide n ! (n 2) phydrogen bond motif in secondary structure elements Donor ! acceptor

n

Loop ! loop Loop ! strand Loop ! a-helix Loop ! poly(Pro)helix a-Helix ! loop Strand ! strand

83 34 4 1 7 3

Example shown in Figure 13 Figures 2, 9(a) Figure 10(a) Figure 12 Figure 11 Figure 9(b)

against N ÐH(n ‡ 1)). Related cases occur in parallel b-sheets. Since the conformation of residue n is free and may be of the b-form, one may imagine n being the ®rst residue of a b-strand; however, such an arrangement does not occur in the structure sample. A further way to incorporate peptide n ! (n 2) p-hydrogen bonds in b-strands is by structural irregularities. An example in a parallel b-sheet occurs in benzoylformate decarboxylase (Hasson et al., 1998) and is shown in Figure 9(b). The strand at the ``right'' edge of the sheet has a b-bulge (Richardson, 1981) by insertion of an extra peptide moiety, which is rotated almost perpendicularly out of the regular array, and satis®es its donor potential by an n ! (n 2) p-hydrogen bond (three cases contributing to Table 10). This is a nice example where a p-hydrogen bond apparently helps to smooth out disadvantages resulting from local structural irregularities. In a-helices, peptide n ! (n 2) p-hydrogen bonded groups may be incorporated at either of the ends. If residue n is in aR-conformation, (n 1) and n may form the initial two residues of the helix, and even (n 2) may participate as acceptor of the ®rst regular hydrogen bond (seven cases in Table 10). The peptide N ÐH of n belongs to the rim of the helix, and by formation of the n ! (n 2) p-hydrogen bond, its donor potential at this critical position can be satis®ed. This parallels exactly the recurrent local stabilization of a-helix N termini by conventional hydrogen bonds n ! (n 2) with side-chain acceptors (Baker & Hubbard, 1984). Examples of this impressive hydrogen bond motif are shown for a p- and an O-acceptor, respectively, in Figure 10(a) and (b) (both in cystathionine g-synthetase from Escherichia coli; Clausen et al, 1998); it is obvious that the N ÐH   p and the NÐ H   O hydrogen bonds both play the same role in helix-end stabilization, and can therefore be considered as isofunctional. At the carbonyl end of a-helices, peptide n ! 
(n 2) p-hydrogen bonded arrays may occur only in the relatively uninteresting function where N ÐH of (n 2) donates the last hydrogen bond of the helix, whereas no other part of the n ! (n 2) p-bonded array is involved (four examples in Table 10). An example of this arrangement is shown in Figure 11, which is of some

544

Aromatic Hydrogen Bonds

Figure 9. Peptide n ! (n 2) p-hydrogen bonds in b-sheets. (a) At the end of a regular strand; (b) involving a bÊ structure, Holland et al., 1992 (PDB 8tln), H    M ˆ 2.4 A Ê , N   M ˆ 3.4 A Ê , NÐ bulge. (a) In thermolysine, 1.6 A Ê structure, Hasson et al., 1998 (PDB 1bfd), H    M ˆ 150  , o(N) ˆ 19  ; (b) in benzoylformate decarboxylase, 1.6 A Ê , N    M ˆ 3.5 A Ê , N Ð H    M ˆ 135  , o(N) ˆ 15  . H    M ˆ 2.7 A

interest because it involves the rare p(His) acceptor (in sul®te reductase; Crane et al, 1995). Peptide n ! (n 2) p-hydrogen bonded groups may also be incorporated in the less common elements of regular secondary structure, as long as the limitations of the main-chain conformation are obeyed. In the present data sample, only one such example is found. In the polyproline helix of avian pancreatic polypeptide (Blundell et al., 1981), the last two residues are involved in an n ! (n 2) phydrogen bonded array, with the donor group no longer part of the helix (Figure 12). As a matter of fact, peptide n ! (n 2) p-hydrogen bonded groups are readily incorporated in loops (83 cases in Table 10), where thay may help satisfy hydrogen bond potentials in the often awkward steric conditions. The many irregular loop sections are not the subject of the present study, but regular turns are clearly interesting. In the structure sample, nine cases of type I b-turns are found where the residue accepting the main-chain hydrogen bond also accepts a p-hydrogen bond from the central peptide of the turn. An example is shown in Figure 13 (occurring in carbonic

anhydrase; Iverson et al., unpublished results). The backbone conformation of most other types of 310-turns (I0 , II, II0 , III0 ; Richardson, 1981) is not compatible with formation of peptide n ! (n 2) p-hydrogen bonds. Only in type III turns, may this interaction be formed in principle (N termini of a-helices are often somewhat irregular, represent formally such a turn and may indeed be engaged in a peptide n ! (n 2) p-hydrogen bond; see Figure 10(a)). Unusual cases of the peptide n ! (n p-hydrogen bond

2)

Most peptide n → (n − 2) π-hydrogen bonds occur with backbone conformation αR(n − 1), β(n − 2). However, the Ramachandran diagram in Figure 8 shows a few outliers corresponding to other possible conformations (both residues αR or both β, or (n − 1) irregular Gly). This suggests that, apart from the recurrent arrangements discussed above, a whole variety of other ways to incorporate peptide n → (n − 2) π-hydrogen bonds in regular elements of secondary structure should be allowed. Since these are only exceptional cases, it is sufficient here to merely point at this possibility. We mention only that there is a single example of a side-chain not oriented with χ1 ≈ 180°, χ2 ≈ 60° (−120°), but with χ1 ≈ 68°, χ2 ≈ 5° (Trp402 in adenovirus fibre; van Ravaij et al., 1999).

Figure 11. Peptide n → (n − 2) π-hydrogen bond involving the C=O-terminal residue of an α-helix, as found in sulfite reductase (1.6 Å structure, Crane et al., 1995; PDB 1aop). H···M = 2.2 Å, N···M = 3.3 Å, N–H···M = 142°, ω(N) = 10°. Note also the conventional hydrogen bond formed by the His side-chain, His386 Nε2···Asp389 O = 2.9 Å.

The peptide n → (n + 3) π-hydrogen bond

A relatively frequent peptide-to-π hydrogen bond is the kind n → (n + 3). All nine examples in Table 8 involve the same structural motif, that is a 3₁₀-turn associated with a main-chain N–H···O=C hydrogen bond (n + 3) → n. In other words, residues n and (n + 3) form a pair of mutual hydrogen bonds, the regular 3₁₀ hydrogen bond constituting the turn, and the additional peptide n → (n + 3) π-hydrogen bond. A typical example occurs in horseradish peroxidase C1A (Hendriksen et al., 1999), and is shown in Figure 14(a). A 3₁₀-turn can represent a single turn of a 3₁₀-helix, and remarkably, there is an example where an n → (n + 3) π-hydrogen bonded array actually forms the first turn of a six residue 3₁₀-helix, as shown in Figure 14(b) (in AraC gene regulatory protein; Soisson et al., 1997).

Figure 10. Isofunctional peptide-to-π and peptide-to-O n → (n − 2) hydrogen bonds at the N termini of α-helices, operative in helix-end stabilization. (a) π-Hydrogen bond in cystathionine γ-synthetase from Escherichia coli (1.5 Å structure, Clausen et al., 1998; PDB 1cs1). H···M = 2.7 Å, N···M = 3.7 Å, N–H···M = 159°, ω(N) = 23°. (b) Conventional hydrogen bond in the same structure, N···O = 2.9 Å, N–H···O = 155°.

Figure 13. Typical peptide n → (n − 2) π-hydrogen bond in a type I β-turn, as found in carbonic anhydrase (1.55 Å structure, Iverson et al., unpublished; PDB 1qrf). H···M = 2.7 Å, N···M = 3.7 Å, N–H···M = 160°, ω(N) = 9°.

The peptide n → (n − 1) π-hydrogen bond

Possible peptide n → (n − 1) π-hydrogen bonds have attracted some attention, both in proteins (Worth & Wade, 1995) and small peptides (Steiner, 1998). The problem here is that the steric restrictions are so severe that it is unclear whether hydrogen bonding geometries can be adopted at all. In small molecule structures, however, hydrogen bonding in structurally related arrangements is well established (Desiraju & Steiner, 1999). There is a very large number of short N(n)-over-π(n − 1) contacts in proteins, but almost all are stacked and only very few satisfy our hydrogen bond criteria. A single example of the latter is shown in Figure 15 (in aldose reductase; Wilson et al., 1992). Fully in line with the study by Worth & Wade (1995), it must be concluded that the peptide n → (n − 1) π-interaction is almost exclusively stacked, and only exotic (though possible) in hydrogen bonded geometry.

Figure 12. Peptide n → (n − 2) π-hydrogen bond involving the last two residues of a poly(Pro)helix, as found in avian pancreatic polypeptide (1.4 Å structure, Blundell et al., 1981; PDB 1PPT). H···M = 2.8 Å, N···M = 3.7 Å, N–H···M = 155°, ω(N) = 21°.

Other peptide π-hydrogen bonds

There are many peptide-to-π hydrogen bonds with large sequence distance between donor and acceptor (Table 8). However, in the context of recurrent patterns, only those with the peptide donor part of a helix or strand are of greater interest. Such configurations are relatively rare, but still do occur with clearly recognizable functions. Free peptide donors are found at the edges of β-sheets and the N terminus of α-helices, and can satisfy their donor potential with N–H···π hydrogen bonds (six relevant cases with β-sheets and four with α-helices in the sample). An example with a β-sheet is shown in Figure 16(a) (in cambialistic superoxide dismutase; Schmidt, 1999), and one with an α-helix in Figure 16(b) (in soyabean lipoxygenase L-1; Minor et al., 1996). For the α-helix, functional similarity with the peptide n → (n − 2) π-hydrogen bond shown in Figure 10(a) is obvious.

Figure 15. One of the rare putative peptide n → (n − 1) π-hydrogen bonds. In human aldose reductase, 1.65 Å structure, Wilson et al., 1992 (PDB 1ads), H···M = 2.5 Å, N···M = 3.2 Å, N–H···M = 127°, ω(N) = 23°.

Side-chain donors

Side-chain π-hydrogen bonds are slightly more frequent than those with peptide donors (Table 5), and typically have shorter X···M distances (Table 4). With N/O/S–H donors, they involve about 2.2 % of the aromatic acceptors, and with acidic C–H, another 1.2 % (Trp Cδ1–H, His Cδ2–H, His Cε1–H). In analogy to conventional hydrogen bonding in proteins (Baker & Hubbard, 1984), side-chain π-hydrogen bonds have a less systematic appearance than those formed by peptide groups. The distribution of sequence distance between donor and acceptor is smoother than for peptide donors, and there is clearly more weight on interactions between groups far apart in sequence. Nevertheless, there are recurrent patterns occurring in regular secondary structure, as will now be discussed.

Side-chain π-hydrogen bonds in α-helices

Figure 14. Peptide n → (n + 3) π-hydrogen bonds: all examples in the data sample occur in 3₁₀-turns; (a) a single turn; (b) a six residue 3₁₀-helix. (a) In horseradish peroxidase C1A (1.47 Å structure, Hendriksen et al., 1999 (PDB 7atj)), H···M = 2.5 Å, N···M = 3.5 Å, N–H···M = 172°, ω(N) = 3°; (b) in the AraC gene regulatory protein (1.6 Å structure, Soisson et al., 1997 (PDB 2aac)); for 6-ring acceptor H···M = 2.8 Å, N···M = 3.7 Å, N–H···M = 171°, ω(N) = 23°; for 5-ring acceptor H···M = 3.0 Å, N···M = 3.8 Å, N–H···M = 142°, ω(N) = 24°.

It is obvious that in α-helices, side-chain π-hydrogen bonds can be formed only between residues that are immediately successive, or separated by about one turn of the helix. Because one turn is constituted by 3.6 residues, the latter configuration requires sequence distances of 3 or 4. In the data sample, numerous intra-helix π-hydrogen bonds of these types could indeed be identified, with the corresponding frequencies of occurrence given in Table 11. The possible cases n → (n − 1) and n → (n + 3) are missing here, but might well be present in a larger data sample. For the type n → (n − 4), Armstrong et al. (1993) have shown by mutation experiments that they stabilize the helix significantly (helical peptides in solution), and there is no reason to believe that this should be different for the other cases. Of the four configurations found here, the three bridging two turns of the α-helix are closely related. The cases n → (n − 4) and n → (n + 4), in particular, differ only by donor and acceptor positions being reversed. Therefore, it is sufficient to show only one example representative for all (Figure 17(a)) (in Mef2a core; Santelli & Richmond, 2000). The frequently occurring configuration n → (n + 1) is different because it involves two residues that are successive in one turn, as is shown for a typical example in Figure 17(b) (in human lysozyme; Song et al., 1994). Side-chain flexibility allows formation of all these hydrogen bond motifs with various combinations of long and short side-chains, and different overall geometries are possible. The acceptor side-chain may be oriented parallel with or perpendicular to the helix axis, and a flexible donor side-chain like that of Lys or Arg can form a π-hydrogen bond in either case.

Side-chain π-hydrogen bonds in β-sheets

In β-sheets, side-chains of neighboring strands may readily form hydrogen bonds between each other, and in the data sample, there are 49 π-hydrogen bonds found of this kind (39 with N/O–H, 10 with acidic C–H donors). Side-chain flexibility, and the different chain lengths, allow several different overall geometries. For example, the residues involved may be directly opposing in the sheet, or they may be displaced by one or even two residues in either strand direction. The acceptor plane may be perpendicular to the local plane of the sheet, or parallel with it. It is sufficient to show as a single example the remarkable side-chain of Trp56 in a mixed β-sheet of human rac1 (Hirshberg et al., 1997) (Figure 18). It accepts two N–H···π bonds in a sandwich fashion, one donated by an Asn side-chain of a directly opposing residue, and one donated by a lysine residue displaced two residues against the strand direction.

A further way of side-chain π-hydrogen bonding in β-sheets is within one strand. Obviously, this is possible only between residues separated by two positions in sequence, n → (n + 2) and n → (n − 2). Both arrangements are very rare, with n → (n + 2) occurring three times and n → (n − 2) once in the sample (including acidic C–H donors). As an example, such a hydrogen bond in bovine angiogenin (Acharya et al., 1995) is shown in Figure 19.

The Cα–H donor

Figure 16. Functional peptide N–H···π hydrogen bonds with partners far apart in sequence. (a) Stabilization of a sheet edge; (b) stabilization of an α-helix N terminus. (a) In cambialistic superoxide dismutase (1.55 Å structure, Schmidt, 1999; PDB 1bs3). H···M = 2.9 Å, N···M = 3.9 Å, N–H···M = 158°, ω(N) = 25°. (b) In soyabean lipoxygenase L-1 (1.4 Å structure, Minor et al., 1996; PDB 1yge). H···M = 2.9 Å, N···M = 3.8 Å, N–H···M = 147°, ω(N) = 19°.

By far the most important polar C–H group in proteins is Cα–H. Possible involvement of Cα–H in π-hydrogen bonds was investigated using the same cutoff criteria as with N–H donors (Table 2), yielding the large number of 1452 relevant contacts (Table 5). The shortest of these contacts are fairly linear, and have H···M and Cα···M distances of about 2.4 and 3.4 Å, respectively. For N–H···π contacts, this would be considered an excellent hydrogen bond geometry, and for Cα–H, hydrogen bond nature is then very likely. However, most of the Cα–H···π contacts are longer, and the directionality is more diffuse than for N–H···π bonds. This means that the fraction of "good" and "borderline" geometries is poorer for Cα–H than for N–H donors.

Figure 17. Side-chain π-hydrogen bonds in α-helices. (a) n → (n − 4) π-hydrogen bond bridging one turn of the helix; note also the N–H···O hydrogen bond donated to the same acceptor residue. (b) n → (n + 1) π-hydrogen bond within one turn; note also the short n → (n − 4) C–H···O=C hydrogen bond donated by Trp34 Cδ2. (a) In Mef2a core, 1.5 Å structure, Santelli & Richmond, 2000 (PDB 1egw), H···M = 2.4 Å, N···M = 3.1 Å, N–H···M = 134°, ω(N) = 6°. (b) In human lysozyme, 1.5 Å structure, Song et al., 1994 (PDB 1lzr), N···M = 3.4 Å, ω(N) = 19°; the C–H···O interaction has a C···O distance of only 3.1 Å.

The sequence distances between donors and acceptors (Table 8) are much more evenly distributed than for N/O–H···π hydrogen bonds. There is no prominent peak in the distribution, with the strongest one at n → (n + 3) being hardly twice as high as the population of most other distances, comprising just 5 % of all putative Cα–H···π bonds (with peptide π-hydrogen bonds, for comparison, the prominent peak at n → (n − 2) comprises 55 % of all data). This lack of clear sequence distance specificity already indicates a lack of structural specificity in general, and suggests a much more casual nature of the interaction compared to N–H···π hydrogen bonds.

Nevertheless, there are recurrent patterns involving Cα–H···π interactions that should be mentioned. In α- and 3₁₀-helices, the Cα–H bonds are oriented radially away from the helix axes, and are readily available for hydrogen bonding. Most interesting are hydrogen bonds to side-chains of the same helix, and this is nicely possible with residues at position (n + 3), which can fold their side-chains in appropriate orientation. This pattern is found 55 times in the sample, and contributes the larger part of the peak n → (n + 3) in the sequence distance distribution (Table 8). Examples are shown for an α-helix in Figure 20(a) and for a 3₁₀-helix in Figure 20(b) (α-helix in spinach rubisco, Andersson, 1996; 3₁₀-helix in γB-crystallin; Njamudin et al., 1993). Related interactions occur also in the opposite helix direction as n → (n − 4), but are clearly rarer (18 of the entries in Table 8).

Table 11. Side-chain π-hydrogen bonds in α-helices

Type           N–H    OH/SH    Acidic C–H    n(all)
n → (n − 4)    8      –        3             11
n → (n − 3)    13     –        –             13
n → (n + 1)    6      5        10            21
n → (n + 4)    2      –        –             2

Figure 18. Pair of side-chain π-hydrogen bonds in a mixed β-sheet. In human rac1, 1.38 Å structure, Hirshberg et al., 1997 (PDB 1mh1); Asn donor H···M = 2.4 Å, N···M = 3.3 Å, N–H···M = 145°, ω(N) = 17°; Lys donor N···M = 3.6 Å, ω(N) = 6°.

In the central part of β-sheets, the Cα–H groups are oriented in-plane, participate in systematic C–H···O hydrogen bonding (Derewenda et al., 1995; Fabiola et al., 1997), and are not available for Cα–H···π interactions. However, for glycine residues in β-sheets, the second Cα–H bond is oriented out of the sheet plane, and can donate π-hydrogen bonds in principle. Indeed, there are cases where a side-chain of a neighboring strand folds over a Gly residue, and accepts a Cα–H···π interaction. An example occurring in Bacillus pasteurii urease (Benini et al., 2000) is shown in Figure 21. Further motifs of Cα–H···π interactions involving Cα–H of strands are obviously possible at edges and irregularities, but are not discussed here further.

Finally, we point at the possibility of Cα–H to form a pair of π-hydrogen bonds together with a neighboring peptide N–H donor, directed at the large face of a Trp side-chain. The partner peptide can be of the same residue (13 cases in the sample), or of residue (n + 1) (11 cases). An example of this recurrent motif is shown in Figure 22 (in endoglucanase Cel5A; Varrot et al., 2000).

Figure 19. Intrastrand side-chain π-hydrogen bond in a β-sheet; in bovine angiogenin, 1.5 Å structure, Acharya et al., 1995 (PDB 1agi), H···M = 3.1 Å, N···M = 3.7 Å, N–H···M = 123°, ω(N) = 21°.

Figure 20. n → (n + 3) Cα–H···π hydrogen bonds in helices. (a) In an α-helix; (b) in a 3₁₀-helix. (a) In spinach rubisco, 1.6 Å structure, Andersson, 1996 (PDB 8ruc), H···M = 2.9 Å, Cα···M = 3.8 Å, Cα–H···M = 165°, ω(Cα) = 18°. (b) In γB-crystallin, 1.47 Å structure, Njamudin et al., 1993 (PDB 4gcr), H···M = 2.5 Å, Cα···M = 3.5 Å, Cα–H···M = 163°, ω(Cα) = 11°.

Water molecules

There are no less than 735 water-to-π contacts in the sample that are classified as "putative hydrogen bonds" (Table 5). Of all donor types studied here, however, water is the least well studied in the present context, and adequate geometric hydrogen bond criteria have not yet been derived from small molecule data. Therefore, it is debatable which fraction of the "possible" hydrogen bonds indeed represent true hydrogen bonds. Furthermore, since water-to-π hydrogen bonding is possible with side-chain but not with main-chain groups, the role in stabilization of regular secondary structure is necessarily limited. For these reasons, OW–H···π hydrogen bonds are not described here further. Nevertheless, it is stressed that the existence of this interaction has been definitely shown in structures of small peptides (Steiner et al., 1998), and functional roles have been proposed for buried water molecules (Koellner et al., 2000) and water molecules in binding pockets (Engh et al., 1996; Deacon et al., 1997). A main role will be in protein-solvent interactions of the first hydration shell, where OW–H···π hydrogen bonds should allow water molecules to adopt relatively stable positions opposing aromatic groups, a location that is, in conventional terms, considered as very unfavourable. Further elucidation of all these matters would be appreciated, but first requires more work in the small molecule field and/or determination of more protein crystal structures at truly atomic resolutions <1.0 Å.

Figure 21. Cα–H···π hydrogen bond of a Gly donor in a β-sheet, as found in Bacillus pasteurii urease, 1.55 Å structure, Benini et al., 2000 (PDB 4ubp), H···M = 2.7 Å, Cα···M = 3.6 Å, Cα–H···M = 155°, ω(Cα) = 12°.

Figure 22. Pair of Cα–H···π and peptide N–H···π hydrogen bonds with a Trp side-chain, as found in endoglucanase Cel5A, 0.97 Å structure, Varrot et al., 1999 (PDB 8a3h), Hα···M = 2.7 Å, Cα···M = 3.5 Å, Cα–H···M = 142°, ω(Cα) = 12°; HN···M = 2.3 Å, N···M = 3.2 Å, N–H···M = 154°, ω(N) = 5°.

Summary and Conclusions

The present analysis allows us to draw a number of important conclusions. One concerns the mere number of π-hydrogen bonds in proteins. When using relatively cautious geometric criteria, as we do, about one out of 11 aromatic side-chains is found accepting a π-hydrogen bond from an O/N/S–H donor, and for the most efficient acceptor, Trp, it is even more than one out of six. Since aromatic residues constitute about 10 % of a typical protein, one may expect to find in a new protein crystal structure about one π-hydrogen bond per 100 residues.

X–H···π hydrogen bonds are formed in irregular loop regions, but occur also in recurrent patterns that are observed many times. Most important is the peptide n → (n − 2) π-hydrogen bond, which is found in a variety of structure-stabilizing functions. This motif is operative in stabilization of helix and strand ends, of the irregular peptide group of β-bulges, and of regular turns. The peptide n → (n + 3) π-hydrogen bond is systematically operative in stabilization of β-turns, and peptide-to-π bonds between partners far apart in sequence stabilize strand edges and helix termini. Such functions parallel exactly conventional peptide to main-chain hydrogen bonds in equivalent structural situations. An explicit example of equivalent N–H···π and N–H···O hydrogen bonds at an α-helix N terminus is shown in Figure 10. This underlines the function of aromatic groups as "reserve" hydrogen bond acceptors that are able to take the place of conventional acceptors if the latter are lacking.

Recurrent patterns are observed for side-chain π-hydrogen bonds. In α-helices as well as in β-sheets, systematic π-hydrogen bonding of different configurations is observed, which very likely contributes to the stability of these elements.

X–H···π hydrogen bonds occur with acidic C–H donors in similar numbers as with N/O/S–H groups, but have a clearly less systematic appearance. However, several recurrent patterns could be identified also for C–H donors, indicating that they should not be generally neglected in this context.

All this shows that X–H···π interactions are typically an integral part of hydrogen bonding in proteins. Although numerically far behind conventional hydrogen bonds, the possibility of playing isofunctional roles gives them great potential importance. In consequence, we suggest that in geometric analyses of new crystal structures this kind of interaction should be fully considered as a proper hydrogen bond, and its possible function be analyzed with similar effort as is usual for conventional hydrogen bonding.
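The frequency estimate quoted in this summary (roughly one π-hydrogen bond per 100 residues) follows from two figures given in the text: an aromatic content of about 10 %, and about one N/O/S–H···π hydrogen bond per 10.8 aromatic residues. The short Python sketch below merely repeats that arithmetic; the function name and its default arguments are illustrative choices, not part of the original analysis.

```python
def expected_pi_hbonds(n_residues, aromatic_fraction=0.10, aromatic_per_bond=10.8):
    """Rough expected number of N/O/S-H...pi hydrogen bonds in a protein.

    aromatic_fraction: share of aromatic residues ("about 10 %" in the text).
    aromatic_per_bond: aromatic residues per putative pi-hydrogen bond
                       (about 10.8 in the structure sample).
    """
    return n_residues * aromatic_fraction / aromatic_per_bond


if __name__ == "__main__":
    per_100 = expected_pi_hbonds(100)
    # Slightly below one bond per 100 residues, i.e. about one per 108 residues
    print(f"{per_100:.2f} pi-hydrogen bonds per 100 residues")
    print(f"one pi-hydrogen bond per {100 / per_100:.0f} residues")
```

With these inputs the estimate comes out slightly below one bond per 100 residues, consistent with the rounded figure given above.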

Methodology

General

For primary data acquisition, atomic coordinates of all protein crystal structures with resolutions ≤1.6 Å released until March 31, 2000, were extracted from the Protein Data Bank (PDB; Berman et al., 2000). Structures without residues carrying aromatic side-chains, and structures without published water coordinates were excluded. Since effects are studied that might appear in multiply determined structures in different ways, such structures were not removed, and of crystal structures with more than one molecule per asymmetric unit, all molecules were considered. The number of crystal structures used and the numbers of relevant residues are given in Table 1. Theoretical positions of hydrogen atoms (X–H = 1.0 Å) were then added with program HGEN of the CCP4 suite (Collaborative Computational Project No. 4, 1994), but only those at positions allowing this calculation unambiguously were used further. Employing a program written in-house, all interatomic contacts of the potential π-acceptors were selected that satisfy adjustable hydrogen bond criteria, such as those given in Table 2 (justification of the criteria given below). This yielded a total of 1311 putative π-hydrogen bonds with N/O/S–H donors, 1452 with Cα–H donors, and 186 with acidic side-chain C–H (Table 5). Secondary structure elements were assigned on the basis of the information given in the header of the PDB files. In some cases, closer inspection was performed using program PROMOTIF (Hutchinson & Thornton, 1996), which was also used to calculate the ψ and φ angles in Figure 8. Conventional hydrogen bonding contacts were identified using program CONTACT of the CCP4 suite. Structure plots were prepared with MOLSCRIPT (Kraulis, 1991).

Hydrogen bond definitions

For N–H···π hydrogen bonds, small molecule structure analysis firmly shows that the hydrogen bond geometry is very flexible, and large distortions from the ideal centered and perpendicular geometry are allowed (e.g. Desiraju & Steiner, 1999). There is a gradual transition between hydrogen bonding geometries and other kinds of contacts, and many borderline cases do occur. This makes definition of hydrogen bonds quite troublesome and, in practice, geometric cutoff criteria based on compromise values must be used. The basic requirement is that X is positioned "over" the π-face of the acceptor. As a geometric definition of X-over-π contacts, we found it advisable to extend the cutoff used by Mitchell et al. (1994), ω(X) < 20°. The reason is that in small molecule structures, numerous well-defined π-hydrogen bonds with ω(X) > 20° have been found, with extreme values up to 35° or more (Steiner et al., 1997; Malone et al., 1997; Desiraju & Steiner, 1999). However, extending the angular cutoff to the extreme observed value would include an unacceptable number of dubious cases, so that we restrict the study to contacts with ω(X) < 25°. One should be aware that this is still a relatively cautious value that excludes a large number of borderline cases and even some bona fide hydrogen bonds.

To derive cutoff values for X–H···M angles and X···M distances, series of angle-distance scatterplots and histograms were examined, such as shown in Figure 4 for the special case of peptide n → (n − 2) π-contacts. A cutoff angle X–H···M > 120° seems to be adequate, and again, this is a cautious value that probably excludes more hydrogen bonds than it includes dubious cases. As a distance cutoff, the limit X···M < 4.3 Å was selected (Figure 5), which is relatively permissive. As a matter of fact, all numerical results given in Tables 4 to 11 depend strongly on the cutoff definitions used. To illustrate this circumstance, the number of peptide N–H···π(Phe) "hydrogen bonds" identified by several sets of cutoff criteria are listed in Table 12.

For donors with H-atom positions that cannot be calculated from the non-H atom positions, this procedure cannot be followed. The definition of X-over-π contacts, ω(X) < 25°, can be maintained, but in order to reduce the contribution of dubious arrangements, the X···M distance cutoff is reduced to a somewhat arbitrary value of 3.8 Å. Even then, a large number of contacts are obtained that very probably do not represent hydrogen bonds. As an interesting example, an OH-over-π contact of a Ser Oγ in an α-helix is shown in Figure 23, which has an Oγ···M distance of 3.5 Å and an angle ω(Oγ) of 18° (in hRFX1; Gajiwala et al., 2000). Because of this geometry, and the obvious structural relation with the side-chain N–H···π bond shown in Figure 17(a), one may be tempted to interpret the arrangement as a side-chain n → (n − 4) O–H···π hydrogen bond. However, there is also a contact of Ser Oγ with a main-chain carbonyl acceptor, 2.9 Å, which is very strongly suggestive of conventional hydrogen bonding Oγ–H···O=C. Then, the Oγ–H vector is not free to point at the π-acceptor, but must be oriented roughly parallel with it, and the interaction is stacked. To exclude such cases, the conventional hydrogen bond contacts of all C–OH···π contacts satisfying the cutoff criteria of X···M and ω(X) were individually inspected. All examples were excluded where (a) a carbonyl or carboxylate acceptor binding the H-atom is closer to OH than 3.0 Å, or (b) two O atoms of any kind are closer than 3.0 Å. The latter case can, in principle, be compatible with O–H···π bonding (e.g. if both O-atoms are from water molecules), but formation of this interaction seems unlikely under such unfavourable competitive conditions. This filtering reduced the 106 C–OH···π contacts found with O···M < 3.8 Å and ω(O) < 25° to only 39 putative π-hydrogen bonds. For C–SH donors, analogous inspection reduced the 22 contacts with S···M < 4.0 Å and ω(S) < 25° to nine putative hydrogen bonds.

With water donors, individual sorting as for C–OH groups is not realistic. The reason is that the double donor–double acceptor function leads to a much more complex hydrogen bond coordination compared to hydroxyl groups (e.g. see Jeffrey & Saenger, 1991). Only if there are short contacts with two carbonyl or carboxylate O atoms may one conclude that no H atom is left free for OW–H···π hydrogen bonding. For the majority of water molecules, no such easy conclusion is possible, so that they all enter into Table 5 if OW···π < 3.8 Å and ω(OW) < 25° is satisfied. This results in an unknown and possibly relatively high fraction of water molecules in the Table that actually do not form a π-hydrogen bond, and is a primary reason why water molecules are not analyzed in more detail in this study.

Greater problems were encountered with the histidine side-chain as possible π-acceptor. In particular, for water and hydroxyl donors, there is a disquietingly high number of unrealistic contact geometries, with X···C distances down to 2.2 Å. This can be explained only by major deficiencies of the structural models, and has forced us to generally exclude X–H···π(His) interactions with X = O and S.
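The geometric criteria described in this section (X···M < 4.3 Å, X–H···M > 120°, ω(X) < 25°) can be sketched in a few lines of Python. This is only an illustrative sketch, not the in-house program used for the analysis: it assumes that M is the centroid of the ring atoms and that ω(X) is the angle between the ring normal and the M···X line, and the function names are hypothetical.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _norm(a):
    return math.sqrt(_dot(a, a))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _angle_deg(a, b):
    c = _dot(a, b) / (_norm(a) * _norm(b))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))  # clamp rounding noise


def pi_contact_geometry(ring, x, h):
    """Return (X...M distance, X-H...M angle, omega) for one contact.

    ring: coordinates of the ring atoms, listed in bonded order.
    x, h: coordinates of the donor atom X and of its hydrogen atom.
    Assumption: M is the ring centroid and omega is measured between the
    ring normal and the M...X line (0 deg = X exactly above the centroid).
    """
    n = len(ring)
    m = tuple(sum(p[i] for p in ring) / n for i in range(3))
    # Ring normal: sum of cross products of successive centroid-to-atom vectors
    normal = (0.0, 0.0, 0.0)
    for i in range(n):
        c = _cross(_sub(ring[i], m), _sub(ring[(i + 1) % n], m))
        normal = tuple(a + b for a, b in zip(normal, c))
    mx = _sub(x, m)
    omega = _angle_deg(normal, mx)
    omega = min(omega, 180.0 - omega)          # either face of the ring counts
    angle = _angle_deg(_sub(x, h), _sub(m, h))  # angle at H between X and M
    return _norm(mx), angle, omega


def is_putative_pi_hbond(dist, angle, omega,
                         max_dist=4.3, min_angle=120.0, max_omega=25.0):
    """Apply the cutoffs used in this work for donors with calculated H positions."""
    return dist < max_dist and angle > min_angle and omega < max_omega


if __name__ == "__main__":
    # Idealized benzene-like ring in the xy-plane (hypothetical coordinates)
    ring = [(1.39 * math.cos(k * math.pi / 3), 1.39 * math.sin(k * math.pi / 3), 0.0)
            for k in range(6)]
    d, a, w = pi_contact_geometry(ring, x=(0.0, 0.0, 3.4), h=(0.0, 0.0, 2.4))
    print(d, a, w, is_putative_pi_hbond(d, a, w))
```

For donors whose H-atom positions cannot be calculated, the text uses only the ω(X) criterion together with a shorter X···M cutoff; in this sketch that would correspond to ignoring the angle and passing a different `max_dist`.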

Table 12. Number of π-hydrogen bonds with peptide donors and Phe acceptors for different cutoff definitions

N···M (Å)   N–H···M (deg.)   ω(N) (deg.)   n     Spirit
<3.8        >130             <20           23    Overly cautious
<4.0        >120             <25           43    Cautious
<4.3        >120             <25           55    Intermediate (this work)
<4.3        >120             <30           62    Relaxed
<4.3        >110             <35           110   Overly relaxed
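The cutoff sets of Table 12 can be applied to atomic coordinates along the following lines. This is only a minimal sketch, not the procedure actually used in this work. It assumes that M is the centroid of the aromatic ring atoms and that ω(X) is the angle between the ring normal and the line M···X; neither definition is restated in this section. It also assumes that the H-atom position has already been calculated from the non-H atoms. All function and variable names are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a: Vec) -> float:
    return math.sqrt(dot(a, a))


def angle_deg(a: Vec, b: Vec) -> float:
    """Angle between two vectors in degrees (cosine clamped against rounding)."""
    c = dot(a, b) / (norm(a) * norm(b))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def centroid(points: Sequence[Vec]) -> Vec:
    n = len(points)
    return (sum(p[0] for p in points) / n,
            sum(p[1] for p in points) / n,
            sum(p[2] for p in points) / n)


def ring_normal(ring: Sequence[Vec], m: Vec) -> Vec:
    """Unnormalized ring normal: sum of cross products of consecutive
    centroid-to-atom vectors (ring atoms must be listed in bonded order)."""
    total = (0.0, 0.0, 0.0)
    for i in range(len(ring)):
        c = cross(sub(ring[i], m), sub(ring[(i + 1) % len(ring)], m))
        total = (total[0] + c[0], total[1] + c[1], total[2] + c[2])
    return total


@dataclass(frozen=True)
class Cutoffs:
    max_dist: float   # X...M distance, in angstrom
    min_angle: float  # X-H...M angle, in degrees
    max_omega: float  # omega(X), in degrees


# The five cutoff sets listed in Table 12.
TABLE_12 = {
    "overly cautious": Cutoffs(3.8, 130.0, 20.0),
    "cautious": Cutoffs(4.0, 120.0, 25.0),
    "intermediate (this work)": Cutoffs(4.3, 120.0, 25.0),
    "relaxed": Cutoffs(4.3, 120.0, 30.0),
    "overly relaxed": Cutoffs(4.3, 110.0, 35.0),
}


def pi_contact_geometry(x: Vec, h: Vec, ring: Sequence[Vec]) -> Tuple[float, float, float]:
    """Return (X...M distance, X-H...M angle, omega(X)) for donor atom X,
    its H atom, and the ring atoms of the acceptor."""
    m = centroid(ring)
    mx = sub(x, m)
    omega = angle_deg(ring_normal(ring, m), mx)
    omega = min(omega, 180.0 - omega)      # the donor may sit over either face
    xhm = angle_deg(sub(x, h), sub(m, h))  # angle at H between H->X and H->M
    return norm(mx), xhm, omega


def is_pi_hbond(x: Vec, h: Vec, ring: Sequence[Vec], cut: Cutoffs) -> bool:
    dist, xhm, omega = pi_contact_geometry(x, h, ring)
    return dist < cut.max_dist and xhm > cut.min_angle and omega < cut.max_omega
```

Calling `is_pi_hbond` for the same contact with each entry of `TABLE_12` shows directly which sets of criteria accept it, which is the kind of cutoff dependence that Table 12 documents for the whole sample.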


Figure 23. Example of a short face-on C–OH···π contact, which is probably not associated with an O–H···π hydrogen bond because the hydroxyl H-atom is involved in a conventional hydrogen bond Ser30 Oγ–H···O26=C (in hRFX1; Gajiwala et al., 2000; PDB 1dp7). Oγ···M = 3.50 Å, ω(Oγ) = 18°, Oγ···O = 2.9 Å. The arrangement could be mistaken for a side-chain π-hydrogen bond of the kind n → (n − 4) that frequently occurs in α-helices (Table 11, Figure 17(a)).

Finally, we mention that we did not consider in any way multifurcated (often called multi-center; Jeffrey & Saenger, 1991) hydrogen bonding, that is, the possibility of an X–H donor forming an X–H···π hydrogen bond and, in addition, a hydrogen bond with another acceptor (be it conventional or non-conventional). Such arrangements have been observed with various π-acceptors in small molecule structures (Desiraju & Steiner, 1999), and must be expected to occur in proteins too. However, they will involve mainly X–H···π hydrogen bonds in poor or borderline geometries, making a detailed analysis questionable from the beginning. Merely to point out that bifurcated hydrogen bonds exist, we note that we found a number of N–H donors that apparently bond simultaneously to a π-acceptor and to the main-chain carbonyl group of the same residue. An example with a peptide n → (n − 3) bifurcated π/O-hydrogen bond is shown in Figure 24 (in sulfur-substituted rhodanese; Gliubich et al., 1998). In this matter, as in many others touched upon, a more elaborate analysis based on high-precision structural data is proposed. Such data are not yet available in sufficient quantities, but will be in the foreseeable future.


Figure 24. Example of a possibly bifurcated hydrogen bond with a peptide donor and a π(Tyr) and a carbonyl acceptor. In sulfur-substituted rhodanese (Gliubich et al., 1998; PDB 1rhs). H···M = 2.4 Å, N···M = 3.2 Å, N–H···M = 133°, ω(N) = 6°, N···O = 3.1 Å, N–H···O = 116°.

Acknowledgments

This study has its roots in observations the authors made in experimental work on purine nucleoside phosphorylase, PNP (supported by the Deutsche Forschungsgemeinschaft, Ko 1477/2-1) and statistical investigations of acetylcholinesterase hydration (Koellner et al., 2000; supported by Minerva Foundation, Munich). Part of the data acquisition was performed during a stay of both authors at the Weizmann Institute of Science, graciously hosted by Professor Joel L. Sussman and Professor Israel Silman (supported by the Forschungszentrum Jülich).

References

Acharya, K. R., Shapiro, R., Riordan, J. F. & Vallee, B. L. (1995). Crystal structure of bovine angiogenin at 1.5 Å resolution. Proc. Natl Acad. Sci. USA, 92, 2949-2953.
Andersson, I. (1996). Large structures at high resolution: the 1.6 Å crystal structure of spinach ribulose 1,5-bisphosphate carboxylase/oxygenase complexed with 2-carboxyarabinitol bisphosphate. J. Mol. Biol. 259, 160-174.
Armstrong, K. M., Fairman, R. & Baldwin, R. L. (1993). The (i, i + 4) Phe-His interaction studied in an alanine-based α-helix. J. Mol. Biol. 230, 284-291.

Auffinger, P. & Westhof, E. (1996). H-bond stability in the tRNA(Asp) anticodon hairpin. 3 ns of multiple molecular dynamics simulation. Biophys. J. 71, 940-954.
Baker, E. N. & Hubbard, R. E. (1984). Hydrogen bonding in globular proteins. Prog. Biophys. Mol. Biol. 44, 97-179.
Benini, S., Rypniewski, W. R., Wilson, K. S., Miletti, S., Ciurli, S. & Mangani, S. (2000). The complex of Bacillus pasteurii urease with acetohydroxamate anion from X-ray data at 1.55 Å resolution. J. Biol. Inorg. Chem. 5, 110-118.
Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N. & Bourne, P. E. (2000). The Protein Data Bank. Nucl. Acids Res. 28, 235-242.
Blundell, T. L., Pitts, J. E., Tickle, I. J., Wood, S. P. & Wu, C.-W. (1981). X-ray analysis (1.4 Å resolution) of avian pancreatic polypeptide. Small globular protein hormone. Proc. Natl Acad. Sci. USA, 78, 4175-4179.
Burley, S. K. & Petsko, G. A. (1986). Amino-aromatic interactions in proteins. FEBS Letters, 203, 139-143.
Burley, S. K. & Petsko, G. A. (1988). Weakly polar interactions in proteins. Advan. Protein Chem. 39, 125-189.
Chen, J. C., Miercke, L. J., Krucinski, J., Starr, J. R., Saenz, G., Wang, X., Spilburg, C. A., Lange, L. G., Ellsworth, J. L. & Stroud, R. M. (1998). Structure of bovine pancreatic cholesterol esterase at 1.6 Å: novel structural features involved in lipase activation. Biochemistry, 37, 5107-5117.
Clausen, T., Huber, R., Prade, L., Wahl, M. C. & Messerschmidt, A. (1998). Crystal structure of Escherichia coli cystathionine γ-synthetase at 1.5 Å resolution. EMBO J. 17, 6827-6838.
Collaborative Computational Project No. 4 (1994). The CCP4 suite: programs for protein crystallography. Acta Crystallog. sect. D, 50, 760-763.
Crane, B. R., Siegel, L. & Getzoff, E. D. (1995). Sulfite reductase structure at 1.6 Å: evolution and catalysis for reduction of inorganic anions. Science, 270, 59-67.
Deacon, A., Gleichmann, T., Kalb (Gilboa), A. J., Price, H., Raftery, J., Bradbrook, G., Yariv, J. & Helliwell, J. R. (1997). The structure of concanavalin A and its bound solvent determined with small-molecule accuracy at 0.94 Å resolution. J. Chem. Soc. Faraday Trans. 93, 4305-4312.
Derewenda, Z. S., Lee, L. & Derewenda, U. (1995). The occurrence of C–H···O hydrogen bonds in proteins. J. Mol. Biol. 252, 248-262.
Desiraju, G. R. & Steiner, T. (1999). The Weak Hydrogen Bond in Structural Chemistry and Biology, Oxford University Press, Oxford.
Engh, R. A., Brandstetter, H., Sucher, G., Eichinger, A., Baumann, I., Bode, W., Huber, R., Poll, T., Rudolph, R. & Saal, W. V. D. (1996). Enzyme flexibility, solvent and ``weak'' interactions characterize thrombin-ligand interactions: implications for drug design. Structure, 4, 1353-1362.
Fabiola, G. F., Krishnaswamy, S., Nagarajan, V. & Pattabhi, V. (1997). C–H···O hydrogen bonds in β-sheets. Acta Crystallog. sect. D, 53, 316-320.
Flocco, M. M. & Mowbray, S. L. (1994). Planar stacking interactions of arginine and aromatic side-chains in proteins. J. Mol. Biol. 235, 709-717.
Gajiwala, K. S., Chen, H., Cornille, F., Roques, B. P., Reith, W., Mach, B. & Burley, S. K. (2000). Structure of the winged-helix protein hRFX1 reveals a new mode of DNA binding. Nature, 403, 916-921.
Ghosh, A. & Bansal, M. (1999). Three-centre C–H···O hydrogen bonds in the DNA minor groove: analysis of oligonucleotide crystal structures. Acta Crystallog. sect. D, 55, 2005-2012.
Gliubich, F., Berni, R., Colapietro, M., Barba, L. & Zanotti, G. (1998). Structure of sulfur-substituted rhodanese at 1.36 Å resolution. Acta Crystallog. sect. D, 54, 481-486.
Hasson, M. S., Muscate, A., McLeish, M. J., Polovnikova, L. S., Gerlt, J. A., Kenyon, G. L., Petsko, G. A. & Ringe, D. (1998). The crystal structure of benzoylformate decarboxylase at 1.6 Å resolution: diversity of catalytic residues in thiamin diphosphate-dependent enzymes. Biochemistry, 37, 9918-9930.
Hendriksen, A., Smith, A. T. & Gajhede, M. (1999). The structures of the horseradish peroxidase C-ferulic acid complex and the ternary complex with cyanide suggest how peroxidases oxidize small phenolic substrates. J. Biol. Chem. 274, 35005-35011.
Hirshberg, M., Stockley, R. W., Dodson, G. & Webb, M. R. (1997). The crystal structure of human rac1, a member of the rho-family complexed with a GTP analogue. Nature Struct. Biol. 4, 147-152.
Holland, D. R., Tronrud, D. E., Pleyk, H. W., Flaherty, M., Stark, W., Jansonius, J. N., McKay, D. B. & Matthews, B. W. (1992). Structural comparison suggests that thermolysin and related neutral proteases undergo hinge-bending motion during catalysis. Biochemistry, 31, 11310-11316.
Hutchinson, E. G. & Thornton, J. M. (1996). Promotif: a program to identify and analyze structural motifs in proteins. Protein Sci. 5, 212-220.
Jeffrey, G. A. & Saenger, W. (1991). Hydrogen Bonding in Biological Structures, Springer-Verlag, Berlin.
Koellner, G., Kryger, G., Millard, C. B., Silman, I., Sussman, J. L. & Steiner, T. (2000). Active-site and buried water molecules in crystal structures of acetylcholinesterase from Torpedo californica. J. Mol. Biol. 296, 713-735.
Kraulis, P. J. (1991). MOLSCRIPT: a programme to produce both detailed and schematic plots of protein structures. J. Appl. Crystallog. 24, 946-950.
Kryger, G., Silman, I. & Sussman, J. L. (1999). Structure of acetylcholinesterase complexed with E2020 (Aricept®): implications for the design of new anti-Alzheimer drugs. Structure, 7, 297-307.
Levitt, M. & Perutz, M. F. (1988). Aromatic rings act as hydrogen bond acceptors. J. Mol. Biol. 201, 751-754.
Liu, S., Ji, X., Gilliland, G. L., Stevens, W. J. & Armstrong, R. N. (1993). Second sphere electrostatic effects in the active site of glutathione S-transferase. Observation of an on-face hydrogen bond between the side-chains of threonine 13 and the π-cloud of tyrosine 6 and its influence on catalysis. J. Am. Chem. Soc. 115, 7910-7911.
Lubkowski, J., Dauter, Z., Yang, F., Alexandratos, J., Merkel, G., Skalka, A. M. & Wlodawer, A. (1999). Atomic resolution structures of the core domain of avian sarcoma virus integrase and its D64N mutant. Biochemistry, 38, 13512-13522.
Ma, J. C. & Dougherty, D. A. (1997). The cation-π interaction. Chem. Rev. 97, 1303-1324.
Malone, J. F., Murray, C. M., Charlton, M. H., Docherty, R. & Lavery, A. J. (1997). X–H···π(phenyl) interactions. Theoretical and crystallographic observations. J. Chem. Soc. Faraday Trans. 93, 3429-3436.

McPhail, A. T. & Sim, G. A. (1965). Hydroxyl-benzene hydrogen bonding. An X-ray study. Chem. Commun. 00, 124-125.
Minor, W., Steczko, J., Stec, B., Otwinowski, Z., Bolin, J. T., Walter, R. & Axelrod, B. (1996). Crystal structure of soyabean lipoxygenase L-1 at 1.4 Å resolution. Biochemistry, 35, 1067-1071.
Mitchell, J. B. O., Nandi, C. L., McDonald, I. K., Thornton, J. M. & Price, S. L. (1994). Amino/aromatic interactions in proteins. Is the evidence stacked against hydrogen bonding? J. Mol. Biol. 239, 315-331.
Nishio, M., Umezawa, Y., Hirota, M. & Takeuchi, Y. (1995). The CH/π interaction. Significance in molecular recognition. Tetrahedron, 51, 8665-8671.
Njamudin, S., Nalini, V., Driessen, H. P. C., Slingsby, C., Blundell, T. L., Moss, D. S. & Lindley, P. F. (1993). Structure of the bovine eye lens protein γB(γII)-crystallin at 1.47 Å. Acta Crystallog. sect. D, 49, 223-233.
Parkinson, G., Gunasekera, A., Vojtechovsky, J., Zhang, X., Kunkel, T. A., Berman, H. & Ebright, R. H. (1996). Aromatic hydrogen bond in sequence-specific protein DNA recognition. Nature Struct. Biol. 3, 837-841.
Perutz, M. F. (1993). The role of aromatic rings as hydrogen-bond acceptors in molecular recognition. Phil. Trans. Roy. Soc. ser. A, 345, 105-112.
Perutz, M. F., Fermi, G., Abraham, D. J., Poyart, C. & Bursaux, E. (1986). Hemoglobin as a receptor of drugs and peptides. X-ray studies of the stereochemistry of binding. J. Am. Chem. Soc. 108, 1064-1078.
Richardson, J. S. (1981). The anatomy and taxonomy of protein structure. Advan. Protein Chem. 34, 167-339.
Rodham, D. A., Suzuki, S., Suenram, R. D., Lovas, F. J., Dasgupta, S., Goddard, W. A., III & Blake, G. A. (1993). Hydrogen bonding in the benzene-ammonia dimer. Nature, 362, 735-737.
Rozenberg, M., Nishio, T. & Steiner, T. (1999). Structural and IR-spectroscopic evidence of S–H···Ph hydrogen bonding in the solid state. New J. Chem. 23, 585-586.
Santelli, E. & Richmond, T. J. (2000). Crystal structure of Mef2a Core bound to DNA at 1.5 Å resolution. J. Mol. Biol. 297, 437-449.
Schmidt, M. (1999). Manipulating the coordination number of the ferric iron within the cambialistic superoxide dismutase of Propionibacterium shermanii by changing the pH value. A crystallographic analysis. Eur. J. Biochem. 262, 117-126.
Singh, J. & Thornton, J. M. (1990). SIRIUS. An automated method for the analysis of the preferred packing arrangements between protein groups. J. Mol. Biol. 211, 595-615.
Song, H., Inaka, K., Maenaka, K. & Matsushima, M. (1994). Structural changes of the active site cleft and different saccharide binding modes in human lysozyme co-crystallized with hexa-N-acetyl-chitohexaose at pH 4.0. J. Mol. Biol. 244, 522-540.
Soisson, S. M., MacDougall-Shackleton, B., Schleif, R. & Wolberger, C. (1997). The 1.6 Å structure of the AraC sugar-binding and dimerization domain complexed with D-fucose. J. Mol. Biol. 273, 226-237.
Steiner, T. (1998). Structural evidence for the aromatic(i + 1)-amine hydrogen bond in peptides: L-Tyr-L-Tyr-L-Leu monohydrate. Acta Crystallog. sect. D, 54, 584-588.

Steiner, T. & Mason, S. A. (2000). Short N+–H···Ph hydrogen bonds in ammonium tetraphenylborate characterized by neutron diffraction. Acta Crystallog. sect. B, 56, 254-260.
Steiner, T. & Saenger, W. (1993). Role of C–H···O hydrogen bonds in the coordination of water molecules. Analysis of neutron diffraction data. J. Am. Chem. Soc. 115, 4540-4547.
Steiner, T., Starikov, E. B., Amado, A. M. & Teixeira-Dias, J. J. C. (1995). Weak hydrogen bonding. Part 2. The hydrogen bond nature of short C–H···π contacts: crystallographic, spectroscopic and quantum mechanic studies of some terminal alkynes. J. Chem. Soc. Perkin Trans. 2, 1321-1326.
Steiner, T., Mason, S. A. & Tamm, M. (1997). Neutron diffraction study of aromatic hydrogen bonds. 5-Ethynyl-5H-dibenzo(a,d)-cyclohepten-5-ol at 20 K. Acta Crystallog. sect. B, 53, 843-848.
Steiner, T., Schreurs, A. M. M., Kanters, J. A. & Kroon, J. (1998). Water molecules hydrogen bonding to aromatic acceptors of amino acids. The structure of Tyr-Tyr-Phe dihydrate and a crystallographic database study on peptides. Acta Crystallog. sect. D, 54, 25-31.
Suzuki, S., Green, P. G., Bumgarner, R. E., Dasgupta, S., Goddard, W. A., III & Blake, G. A. (1992). Benzene forms hydrogen bonds with water. Science, 257, 942-945.
Tüchsen, E. & Woodward, C. (1987). Assignment of asparagine-44 side-chain primary amide 1H NMR resonances and the peptide amide N1H resonance of glycine-37 in basic pancreatic trypsin inhibitor. Biochemistry, 26, 1918-1925.
van Raaij, M. J., Louis, N., Chroboczek, J. & Cusack, S. (1999). Structure of the human adenovirus serotype 2 fiber head domain at 1.5 Å resolution. Virology, 262, 333-343.
Varrot, A., Schulein, M., Pipelier, M., Vasella, A. & Davies, G. J. (1999). Lateral protonation of a glycoside inhibitor. Structure of the Bacillus agaradhaerens Cel5a in complex with a cellobiose-derived imidazole at 0.97 Å resolution. J. Am. Chem. Soc. 121, 2621-2622.
Vásquez, G. B., Ji, X., Fronticelli, C. & Gilliland, G. L. (1998). Human carboxyhemoglobin at 2.2 Å resolution: structure and solvent comparisons of R-state, R2-state and T-state hemoglobins. Acta Crystallog. sect. D, 54, 355-366.
Wahl, M. & Sundaralingam, M. (1997). C–H···O hydrogen bonding in biology. Trends Biochem. Sci. 22, 97-102.
Whittingham, J. L., Edwards, E. J., Antson, A. A., Clarkson, J. M. & Dodson, G. G. (1998). The interactions of phenol and m-cresol in the insulin hexamer, and their effect on the association properties of B28 Pro → Asp insulin analogues. Biochemistry, 37, 11516-11523.
Wilson, D. K., Bohren, K. M., Gabbay, K. H. & Quiocho, F. A. (1992). An unlikely sugar binding site in the 1.65 Å structure of the human aldose reductase holoenzyme implicated in diabetic complications. Science, 257, 81-84.
Wlodawer, A., Walter, J., Huber, R. & Sjölin, L. (1984). Structure of bovine pancreatic trypsin inhibitor. Results of joint neutron and X-ray refinement of crystal form II. J. Mol. Biol. 180, 301-329.
Worth, G. A. & Wade, R. C. (1995). The aromatic (i + 2) amine interaction in peptides. J. Phys. Chem. 99, 17473-17482.

Worth, G. A., Nardi, F. & Wade, R. C. (1998). Use of multiple molecular dynamics trajectories to study biomolecules in solution: the YTGP peptide. J. Phys. Chem. B, 102, 6260-6272.

Wulf, O. R., Liddel, U. & Hendricks, S. B. (1936). The effect of ortho substitution on the absorption of the OH group of phenol in the infrared. J. Am. Chem. Soc. 58, 2287-2293.

Edited by R. Huber (Received 14 August 2000; received in revised form 27 October 2000; accepted 27 October 2000)