doi:10.1016/j.jmb.2004.03.072
J. Mol. Biol. (2004) 339, 471–479
Snorkeling Preferences Foster an Amino Acid Composition Bias in Transmembrane Helices
Aaron K. Chamberlain, Yohan Lee, Sanguk Kim and James U. Bowie*
Department of Chemistry and Biochemistry, UCLA-DOE Center for Genomics and Proteomics, Molecular Biology Institute, Boyer Hall, 611 Charles E. Young Drive E, Los Angeles, CA 90095-1570, USA
By analyzing transmembrane (TM) helices in known structures, we find that some polar amino acids are more frequent at the N terminus than at the C terminus. We propose the asymmetry occurs because most polar amino acids are better able to snorkel their polar atoms away from the membrane core at the N terminus than at the C terminus. Two findings lead us to this proposition: (1) side-chain conformations are influenced strongly by the N or C-terminal position of the amino acid in the bilayer, and (2) the favored snorkeling direction of an amino acid correlates well with its N to C-terminal composition bias. Our results suggest that TM helix predictions should incorporate an N to C-terminal composition bias, that rotamer preferences of TM side-chains are position-dependent, and that the ability to snorkel influences the evolutionary selection of amino acids for the helix N and C termini. © 2004 Elsevier Ltd. All rights reserved.
*Corresponding author
Keywords: membrane; protein; side-chain; polarity; rotamer
Introduction
Membrane proteins represent approximately 25% of an average genome,1 but because of experimental difficulties, we are only beginning to understand the nature of the forces that define their structure.2–8 Recent advances in structure determination9–15 and model protein mutagenesis studies16–25 have begun to illuminate the forces that drive membrane protein folding, but we need a detailed understanding of the characteristics of membrane proteins before reliable predictions of their structures will be possible.
Prior analyses of the amino acid composition of transmembrane (TM) helices have focused on the differences between the center of the helices and the termini or between the termini in different cellular compartments. For example, aromatic residues and polar residues tend to reside toward the helix ends, where they presumably interact favorably with the interfacial layer of the membrane.26–31 Basic residues are more abundant on the intracellular side of helices, resulting in the positive-inside rule that has been useful in defining the topology of transmembrane helices.32–34 Here, we describe a previously unnoticed composition difference between the N and C termini of TM helices.
Although the effect has not been examined in detail, polar side-chains in TM helices are expected to “snorkel”, i.e. orient themselves to allow polar atoms to partially escape from the hydrophobic membrane core toward the interfacial or aqueous regions.28,35,36 Snorkeling of Lys residues has been seen in molecular dynamics simulations,37 and has been used to explain effects of mutations on lipid phase transitions.38 Lys snorkeling was invoked to explain the insertion depth of some amphipathic helices in the membrane.39,40 On the other hand, Cushley and co-workers find little evidence for Lys snorkeling in another amphipathic helix.41 Here, we analyze known membrane protein structures and demonstrate that polar side-chains in TM helices do indeed snorkel. We demonstrate that the ability of amino acids to snorkel in the appropriate direction is correlated with their N or C-terminal preferences. These results demonstrate that the membrane influences side-chain positions and suggest that the ability to snorkel constrains the evolutionary selection of TM amino acids.
Abbreviations used: SnD, snorkeling distance; SnP, snorkeling propensity; TM, transmembrane; POPC, palmitoyloleoylphosphatidylcholine.
E-mail address of the corresponding author: [email protected]
0022-2836/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.
Results
Amino acid biases for the N-terminal and C-terminal membrane regions
We defined the TM regions of 14 non-redundant proteins as the most hydrophobic 30 Å thick slab. We divided the 30 Å membrane into three 10 Å regions representing the membrane portions at the N-terminal, middle, and C-terminal sides of the helices. In the most hydrophobic 30 Å slab, the 156 TM helices contained 2924 helical residues. In general, the abundance of each amino acid in the helices follows the trend expected of TM regions,42 with apolar amino acids being more prevalent than polar amino acids. Leu is by far the most prevalent amino acid, with 497 counts (17.0%). Ile, Val, Ala, Phe, and Gly are also well populated, having at least 243 counts (8.3%). Lys and Asp are the least populated, with 24 (0.82%) and 21 (0.72%) counts, respectively. Here, we refer to Tyr and Trp as hydrophilic or polar amino acids, because their snorkeling behavior in our tests resembles that of the long-chain polar amino acids Arg, Lys, Asp, and Gln (see below).
Many hydrophilic amino acids are more frequent in the N-terminal third of the membrane than in the C-terminal third (see Figure 1). Because the location of the membrane on a given protein structure is not defined precisely,43 we tested whether the population biases we observed were dependent on the positioning of the membrane, using four different placements of the membrane. The population biases observed with our optimally placed 30 Å membrane are shown as black bars. We show the population biases determined with a 26 Å membrane (gray bars) and with the membrane shifted 4 Å toward the intracellular side (hatched bars) and extracellular side (open bars) of the protein. An asterisk (*) marks the amino acids that are consistently biased for one side of the membrane using all four membrane positions (Figure 1). The N-terminal side of the membrane contains more Asp, Lys, Arg, Gln, Pro, His, Trp, and Met, whereas the C-terminal side contains more Ala, Val, Ile, Gly, and Tyr. Thus, the amino acids that are more frequent in the N-terminal third are generally polar, whereas those more frequent in the C-terminal third are generally small, aliphatic residues. The frequencies of Arg and Lys in the C-terminal membrane third are only about 60% of their frequencies in the N-terminal third (C/N population bias ≈ 0.6). The hydrophobic residues Ile and Val favor the C terminus modestly, by ~15% and ~11%, respectively. Tyr is unusual among the polar residues, because it has the strongest preference for the C terminus, a ~65% increase in frequency over the N-terminal frequency.
Side-chains snorkel
Because side-chains in a helix tend to extend toward the N terminus, we hypothesized that the overall N-terminal preference for the polar amino acids may occur because it is easier for the polar
Table 1. Amino acid snorkeling atoms and propensities
Amino acid   Overall frequency   Snorkeling atom   Snorkeling propensity (Å)
L            0.170               Cδ1, Cδ2          −0.27
A            0.118               –                 –
F            0.100               Cζ                0.56
V            0.099               Cγ1, Cγ2          −0.03
I            0.094               Cδ                −1.58
G            0.083               –                 –
T            0.051               Oγ                −0.93
S            0.046               Oγ                −0.33
M            0.045               Sδ                −0.60
W            0.036               Nε                0.10
Y            0.032               Oη                1.07
H            0.028               Nδ, Nε            −0.01
P            0.022               –                 –
N            0.013               Oδ, Nδ            −0.64
E            0.011               Oε1, Oε2          −0.79
Q            0.011               Oε, Nε            −0.62
C            0.010               Sγ                −0.66
R            0.0092              Nε, Nη1, Nη2      −1.30
K            0.0082              Nζ                −0.97
D            0.0071              Oδ1, Oδ2          −0.81
Figure 1. The population bias of each amino acid for the N or C-terminal third of the membrane. We divided a 30 Å membrane into three 10 Å sections (N-terminal, middle, and C-terminal sections) and show the frequency of each amino acid in the C-terminal membrane section divided by its frequency in the N-terminal section. Hydrophobic amino acids generally occur more frequently in the C terminus (bias > 1.0) and hydrophilic amino acids occur more frequently in the N terminus (bias < 1.0). The results are shown for our ideally placed 30 Å membrane (black bars) and three other membrane positions. We narrowed the membrane to 26 Å (gray bars) or shifted it 4 Å toward the intracellular (hatched bars) or extracellular (open bars) side of the membrane. Asterisks (*) mark the amino acids with a consistent bias toward one side of the membrane.
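The population bias plotted in Figure 1 is a ratio of two frequencies, and a minimal Python sketch may make the arithmetic concrete. It assumes that "frequency" means the residue count divided by the total number of residues in that membrane third; the function name and the per-residue counts below are hypothetical, and only the region totals (956 and 925) come from the Materials and Methods.

```python
def population_bias(count_c, count_n, total_c, total_n):
    """Return the C/N population bias of one amino acid.

    Assumption: frequency = count in a membrane third / all residues in that
    third. A value below 1.0 means the residue is enriched in the N-terminal
    third; a value above 1.0 means it is enriched in the C-terminal third.
    """
    if count_n == 0 or total_c == 0 or total_n == 0:
        raise ValueError("counts and totals must be non-zero")
    freq_c = count_c / total_c
    freq_n = count_n / total_n
    return freq_c / freq_n


# Residue totals for the N-terminal and C-terminal thirds (centered 30 A slab).
TOTAL_N, TOTAL_C = 956, 925

# Hypothetical counts for one polar amino acid, for illustration only.
example_bias = population_bias(count_c=12, count_n=20,
                               total_c=TOTAL_C, total_n=TOTAL_N)
print(f"C/N population bias: {example_bias:.2f}")
```

With these illustrative counts the ratio falls near 0.6, the range the text reports for Arg and Lys; the real counts for each amino acid are not given in the paper.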
atoms to snorkel out of the membrane core. To test this idea, we first wanted to establish whether the membrane polarity gradient does indeed influence side-chain rotamer choices. We define a potential snorkeling atom, or set of atoms, for each amino acid (Table 1) and measure its displacement along the membrane normal from the Cβ atom of the same amino acid. This displacement is the snorkeling distance, SnD. For example, Figure 2 shows three rotamers of Tyr and their snorkeling distances. Positive SnD values occur when the snorkeling atom extends toward the C-terminal side of the membrane, as in the Tyr rotamer χ1 = 180°, χ2 = 90°. An SnD value of zero means the potential snorkeling atom is at the same depth in the membrane as the Cβ atom. Negative SnD values place the snorkeling atom toward the N terminus, as in the Tyr rotamers with χ1 = −60° and +60°.
We found that the average SnD of each amino acid differs in the N and C-terminal thirds of the membrane (Figure 3A). Polar residues tend to point their side-chains toward the aqueous or interfacial regions. This result is seen as a positive value in Figure 3. Polar residues in the C-terminal third extend their snorkeling atom more toward the C terminus than polar residues in the N-terminal third. In particular, Lys residues snorkel the furthest, on average extending the terminal N atom 4.1 Å further toward the C terminus when they reside in the C-terminal third than when they reside in the N-terminal third. In the C-terminal third, Lys residues extend an average of 1.6 Å toward the C-terminal side of the membrane. In the N-terminal third, however, Lys residues extend an average of 2.5 Å toward the
Figure 2. Snorkeling distances (SnD) of tyrosine rotamers. The snorkeling distances are the displacements of the snorkeling atom, the Oη atom, into or out of the membrane from the Cβ atom. We show three rotamers of Tyr (χ1 = +60°, −60° and 180°, χ2 = 90°) on a helix whose axis is parallel with the membrane normal (bold line, marked N and C). Rotamers extending toward the C-terminal side of the membrane have positive snorkeling distances and those extending toward the N-terminal side have negative snorkeling distances. Potential snorkeling atoms of all the amino acids are shown in Table 1.
Figure 3. The difference between the average snorkeling distances of the amino acids in the N and C-terminal regions of the membrane. Most polar amino acids have positive values, indicating a preference for their side-chains to extend out of the membrane. The hydrophobic amino acids, Leu, Phe, and Ile, have negative values, indicating a preference for their side-chains to extend into the membrane. We show (A) the overall results, and (B) the separate results for buried residues (black bars) and surface-exposed residues (gray bars). Error bars show the standard error of the difference between the C-terminal and N-terminal averages.
N-terminal side of the membrane. Trp, Tyr, Asn, Glu, Gln, Arg, and Asp also snorkel, extending their polar atoms toward the aqueous region by at least 1 Å.
The bias for the polar amino acids to extend more toward the C terminus when they reside in the C terminus is caused by an abundance of individual rotamers pointing in that direction. For example, in the C terminus, most Tyr residues (67% or 22/33) have χ1 = 180° rotamers, which snorkel toward the C terminus by +3.3 Å (Figure 2). Fewer Tyr residues (30% or 10/33) have χ1 = −60° rotamers, which snorkel toward the N terminus by −3.1 Å. (The χ2 angle of Tyr does not affect the snorkeling distance, because it does not affect the position of the Oη atom.) In the N terminus, however, only 41% (9/22) of Tyr residues have χ1 = 180°, whereas 55% (13/22) have χ1 = −60° rotamers. So, on average, Tyr residues in the N terminus point toward the N terminus, whereas Tyr residues in the C terminus point toward the C terminus.
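The Tyr example can be checked with a short calculation. The Python sketch below takes the rotamer counts and the two snorkeling distances exactly as quoted in the text and computes a count-weighted mean SnD for each terminus. It is an illustration under a stated simplification: only the χ1 = 180° and χ1 = −60° rotamers are included, because the SnD of the remaining χ1 = +60° rotamer is not given. The function and dictionary names are hypothetical.

```python
def mean_snd(rotamer_counts, rotamer_snd):
    """Count-weighted mean snorkeling distance (in angstroms)."""
    total = sum(rotamer_counts.values())
    if total == 0:
        raise ValueError("no residues counted")
    weighted = sum(n * rotamer_snd[name] for name, n in rotamer_counts.items())
    return weighted / total


# Snorkeling distances of the two dominant Tyr rotamers, as quoted in the text.
TYR_SND = {"chi1_180": 3.3, "chi1_-60": -3.1}

# Tyr rotamer counts quoted in the text (chi1 = +60 rotamers omitted).
c_terminal_counts = {"chi1_180": 22, "chi1_-60": 10}
n_terminal_counts = {"chi1_180": 9, "chi1_-60": 13}

mean_c = mean_snd(c_terminal_counts, TYR_SND)
mean_n = mean_snd(n_terminal_counts, TYR_SND)
difference = mean_c - mean_n  # analogous to the quantity plotted in Figure 3

print(f"C-terminal mean SnD: {mean_c:+.2f} A")
print(f"N-terminal mean SnD: {mean_n:+.2f} A")
print(f"difference (C - N):  {difference:+.2f} A")
```

Under these simplifications the C-terminal mean is positive and the N-terminal mean is negative, which matches the qualitative statement that Tyr points toward whichever terminus it resides in.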
Unlike the polar amino acids, the hydrophobic amino acids generally extend their side-chains slightly further into the membrane, which we describe as “anti-snorkeling” behavior. Leu, Phe, and Ile place their side-chains more toward the N terminus if they reside in the C-terminal membrane third than if they reside in the N-terminal third. These amino acids have a negative value for the difference in average snorkeling distance (Figure 3A). This anti-snorkeling behavior is much weaker than the snorkeling behavior of the polar amino acids. Phe shows the strongest anti-snorkeling bias. On average, the Cζ atoms of C-terminal Phe residues are 0.69 Å more toward the N terminus than those of the N-terminal Phe residues.
To the extent that snorkeling preferences are dependent on the bilayer environment, exposed residues might be expected to show larger snorkeling biases than the buried residues. As shown in Figure 3B, the exposed residues in the structures (gray bars) do generally show stronger snorkeling trends than the buried residues (black bars). The primary exception is Arg, which shows a strong bias when buried, but not when exposed. The reasons for this difference are unclear, but may, at least partly, reflect a statistical anomaly given the small number of exposed Arg residues (13 in total). Taken together, the results indicate that side-chain orientations, particularly for the polar amino acids, are influenced strongly by their local environment.
Snorkeling propensity correlates with population bias
The results above indicate that polar atoms in side-chains experience a force pushing them away from the membrane core. The optimum placement of these polar atoms for snorkeling cannot necessarily be obtained, however, because steric clashes strongly disfavor certain χ angles. As a result of these steric restrictions, some amino acids may be able to snorkel more effectively in one direction than another.
To investigate this possibility, we defined a value, called the snorkeling propensity (SnP), which is a measure of the inherent preference of each amino acid to extend its side-chain toward the N or C terminus of a helix in energetically permissible rotamers. The SnP of an amino acid is the average extension of the side-chain along its helix axis in water-soluble proteins where side-chain placement is not influenced by the membrane polarity gradient. A positive SnP value indicates that the side-chain prefers to extend toward the C terminus. While this model accounts for the energy of some interactions, like steric clashes, which are common to both TM and soluble proteins, other interactions, like the increased hydrogen bond potential, are probably not as well represented by the soluble-protein rotamer frequencies. The overall preference of some polar amino
Figure 4. Snorkeling propensities and population biases. The polar amino acids that show a consistent population bias in all four membrane positions (asterisks, Figure 1) are shown for (A) buried + exposed residues, (B) buried residues, and (C) surface-exposed residues. The snorkeling propensity (SnP) of an amino acid is the average extension of its side-chain along the helix axis toward the C terminus as found in the rotamer distributions of soluble proteins. The hydrophilic amino acids show a positive correlation, implying that their tendency to extend toward one helix terminus favors their placement in that terminus.
acids to reside in one helix terminus correlates with their SnP (Figure 4A). Polar amino acids with a propensity to point toward the N terminus (SnP < 0 Å) tend to overpopulate the N terminus. In particular, Arg, Lys, Asp and Gln have a population bias of less than 0.7 and SnP less than −0.5 Å. The overall correlation between SnP and the population bias occurs despite the large variety of polar amino acids in the plot.
Tyrosine is a strongly snorkeling amino acid and a notable exception among polar residues in its overall preference for the C terminus. The
configuration of its side-chain is ideal for snorkeling, because the hydroxyl group is located at the end of a long, hydrophobic side-chain. This configuration allows Tyr to move the hydroxyl group large distances in the membrane polarity gradient by changing rotamers. The two Tyr χ1 = 180° rotamers have large, positive snorkeling distances and favor the C terminus, whereas the χ1 = −60° rotamers have large, negative snorkeling distances and favor the N terminus. In soluble proteins, which lack the asymmetric polarity gradient, 64.5% of Tyr have χ1 = 180°.44 The abundance of these rotamers leads to the large, positive SnP (1.06 Å, Figure 4A, abscissa) and can explain the population bias of Tyr for the C terminus (Figure 4A, ordinate).
Both buried (Figure 4B) and exposed (Figure 4C) polar residues show this correlation between the snorkeling propensity and the population bias. As with the average side-chain extension distances shown in Figure 3, the exposed residues show a stronger trend than the buried residues. In the case of Figure 4, the exposed residues show a larger slope than the buried residues, consistent with the idea that the population bias reflects the membrane environment.
Discussion
In helices, the Cα–Cβ bond points more toward the N terminus, resulting in negative snorkeling propensities for most amino acids. These N-terminal extending amino acids bias the N terminus to contain more polar amino acids. Thus, we suggest that the ability of residues to snorkel influences the evolutionary selection of amino acids for the N and C-terminal regions of TM helices.
Because the location of the membrane on the structures is imprecise and the membrane itself does not have discrete boundaries, we investigated the results of different membrane positions on our amino acid frequency ratios (Figure 1). Upon shrinking the membrane 4 Å or shifting the membrane 4 Å, the population biases changed, especially in the rarer, polar amino acids, but the biases were robust for many amino acids. This variation is unavoidable, given the small number of membrane protein structures, and can likely be reduced as more structures become available.
Another limitation of the data is that most of the structures were solved in detergent solutions and not in actual lipid bilayers. Many membrane proteins have lower activities in detergent solutions, but the effects of detergents on side-chain conformations are not known. Our values of population biases and snorkeling propensities are not determined from the observed side-chain conformations, however, and should not be influenced by the detergent environment. The observed snorkeling distances of the exposed residues (Figure 3) may have been affected. The buried residues
should be shielded from the detergents by the protein, however, and they nevertheless show similar, albeit weaker, trends.
The actual snorkeling forces are likely to be stronger than our results indicate. The properties of the membrane change continuously with bilayer depth. Our division of the membrane into discrete thirds (an N-terminal interfacial region, a middle core region, and a C-terminal interfacial region) is clearly an oversimplification. These divisions and the placement of the membrane are necessary to determine which amino acids are grouped together in calculating the average amino acid frequencies and snorkeling distances. The polarity gradient is weaker in the core than in the interfacial regions,45 creating different snorkeling forces at different bilayer depths. We somewhat arbitrarily used a 30 Å thick slab and 10 Å divisions to indicate the membrane without regard to the strength of the polarity gradient. Thus, we combined residues that experience greater and lesser polarity gradients, potentially diminishing the apparent magnitude of the snorkeling effect.
Our snorkeling model explains many aspects of amino acid positioning and rotamer selection, even though we ignore the absolute charge, polarity and detailed atomic configuration of the side-chains. For example, our model does not consider that the positive charges of Lys and Arg may interact more favorably with the negatively charged phospholipids than the negative charges of Asp and Glu. However, the charge may not be as critical as the polarity. von Heijne and co-workers found that Arg and Lys affect the position of a model TM helix in the membrane in a manner different from that of Glu and Asp,46 but ascribed the differences to the different side-chain lengths, not the change in charge. The Trp side-chain geometry is also poorly described by our snorkeling model, because the polar Nε atom is in the center of the side-chain, not at the end.
The Trp side-chain must seek a compromise between the Nε atom and the more hydrophobic six-membered ring. In addition to these intra-residue effects, the helix tilt and rotation angles in the membrane will change the distance a polar atom extends through the polarity gradient.
Yau et al. point out other factors that influence amino acids in the interfacial regions.47 Using solid-state NMR, they investigated the location of indole rings in palmitoyloleoylphosphatidylcholine (POPC) bilayers. As expected, the indole rings preferred the interfacial region. Interestingly, three indole analogs, including indene, which has a carbon atom in place of the N atom, all preferred the same bilayer region. That is, the indene did not protrude further into the membrane core and interact with more hydrophobic POPC components. They conclude that the dipolar interactions cannot explain the preference for Trp to reside in the interfacial region over the core region. Instead they suggest Trp’s aromaticity and shape are responsible.
Here, we address a different question, namely, “Is Trp’s dipole sufficient to cause a difference in its orientation when it resides in N-terminal or C-terminal interfacial regions?” We found that Trp, and other amino acids, have different side-chain conformations in the N and C-terminal regions. Furthermore, many polar residues have different frequencies in the N and C-terminal interfacial regions, which can be explained by their ability to align their dipole with the membrane dipole.
Surprisingly, buried residues that interact with other protein components behaved similarly to exposed residues in our analysis. Some polar residues in both sets have a preference to extend their side-chains out of the membrane and prefer to reside on the side of the membrane that favors snorkeling. The trends for buried residues were weaker than those of lipid-exposed residues, presumably because the exposed residues face a more uniform polarity gradient extending out of the membrane.
Our observation that buried residues have different side-chain conformations on different sides of the membrane has important implications for the design and structure prediction of membrane proteins. In predicting the packing of two or more TM helices, the side-chain conformations in the protein/protein interface may differ, depending on the region of the membrane in which they reside. Viewed in terms of the two-state model of TM protein structure formation,6 our results suggest that the side-chains in TM helices may undergo somewhat limited conformational changes during the packing of isolated helices. A complete analysis of buried and exposed side-chain rotamers would help determine the extent of these conformational changes.
The results reported here demonstrate the influence of the membrane on protein structure and evolution. The findings should impact various aspects of membrane protein structure analysis and prediction.
First, sequence-based methods for TM region identification should incorporate the N-terminal polar residue bias. Second, rotamer libraries of TM amino acids need to consider the location of the residue in the membrane and not just the secondary structures or backbone phi/psi angles. Third, matching the polarity of the helices and their side-chains to the membrane polarity may help restrict models of TM structures. As more structures become available, these principles will likely become better illustrated and more easily applied to the prediction of TM protein structures.
Materials and Methods
Selection of helical membrane proteins
We screened an initial list of helical membrane protein structures from the Max Planck Institute†. Crystal structures were retained that were determined to a resolution of 3.0 Å or better. We discarded proteins until each pair had less than 30% sequence identity as judged by BLAST 2.0‡. The final 14 PDB codes used were 1C3W, 1EHK, 1EUL, 1EYS, 1EZV, 1FX8, 1H2S, 1J4N, 1JB0, 1KQF, 1KZU, 1L9H, 1QLA, and 2OCC.
† http://www.mpibp-frankfurt.mpg.de/michel/public/memprotstruct.html
‡ http://www.ncbi.nlm.nih.gov/BLAST/
Identification of transmembrane residues
For each protein, we identified the transmembrane helices by eye and used DSSP to find the TM helix boundaries.48 Each TM helix was defined to be a continuous stretch of helical residues. We calculated two helix axis points adjacent to the fifth and 15th residues in each helix with a weighted average of the fourth, fifth and sixth and the 14th, 15th and 16th Cα atom positions, assuming a 100° helical turn per residue. For each protein, a vector normal to the membrane was defined by averaging its helix axes, which are the difference vectors between the two axis points. In homo-multimeric proteins, all subunits were used in determining the normal vector and TM residues, although only one monomer per structure was used in other calculations. We identified the residues in the membrane by aligning the normal of a 30 Å thick slab with the membrane normal and sliding the slab along the membrane normal in 1 Å increments until the average hydrophobicity of the residues within the slab was at a maximum. The data presented here used the hydrophobicity scale of Fauchere et al.,49 although different hydrophobicity scales had very little impact on the membrane placement. Using bacteriorhodopsin (PDB code 1C3W) as a test case, we placed the membrane using six different hydrophobicity scales: Kyte & Doolittle,50 Wimley & White,28 Wimley & White octanol scale,51 Fauchere & Pliska,49 Eisenberg et al.52 and Goldman et al.53 Five of the six hydrophobicity scales predicted the bacteriorhodopsin membrane center in the same location. The Goldman et al. scale predicted the membrane to be 2 Å further toward the intracellular side of the protein.
Given the location of the center of the membrane using the Fauchere et al. scale, the membrane position on all proteins was shifted 4 Å toward the intracellular side and 4 Å toward the extracellular side. We also narrowed the membrane to 26 Å thickness, leaving it centered on the original location. These different placements cause the changes in the population biases (frequency ratios) presented in Figure 1. Using the ideal 30 Å thick membrane positioning, the 156 helices contained 2924 residues, or 18.7 residues on average in the membrane.
We subdivided the membrane into three slabs of equal thickness, thereby subdividing the TM helices into N-terminal, middle and C-terminal regions, and assigned residues to each region based on their Cα positions. The total counts were 956, 1043, and 925 residues in the N terminus, middle and C terminus, respectively. The residues were divided into buried or exposed classes using the Environments program.54 Residues with greater than 75% surface area covered by protein were classified as buried and those with less than 25% surface area covered were classified as lipid-facing or exposed. Using the first helix as a reference helix for each structure, 52.8% (75/142) of the remaining helices are parallel with the reference helix and 47.2% (67/142) are anti-parallel to the reference helix. Therefore, shifting the membrane toward
the N terminus of one helix will shift it toward the C terminus in another helix. Slightly fewer residues are counted in the shifted positions, because of the requirement that residues are helical. Total residue counts were 2924 (30 Å, centered), 2628 (26 Å, centered), 2805 (30 Å, membrane shifted extracellularly), and 2856 (30 Å, membrane shifted intracellularly).
Counting tyrosine side-chain rotamers
Side-chain χ angles were measured with InsightII (Biosym, Inc.). The amino acid rotamers were divided into bins according to the scheme described by Dunbrack†.44,55
Amino acid snorkeling distances
For each amino acid, we choose a potential snorkeling atom, or set of atoms, as shown in Table 1. The potential snorkeling atom is the most terminal atom or the polar atom of the side-chain. If more than one atom is listed, we use the average coordinates of all snorkeling atoms in our calculations. We use the polar atoms because they should experience the largest energetic benefit by extending away from the membrane core. In aliphatic side-chains, the atom chosen is simply a means to determine whether the side-chain extends toward the N or C-terminal side of the membrane. Although we refer to these chosen atoms as snorkeling atoms, we do not mean that all of these side-chains necessarily follow snorkeling behavior. Our data demonstrate that certain polar amino acids snorkel, some hydrophobic amino acids have a slight opposite trend, and some amino acids do not show obvious biases in the current data. The snorkeling distance, SnD, of each residue is the displacement of the snorkeling atom along the membrane normal from the Cβ atom (Figure 2). A positive SnD value indicates the snorkeling atom extends toward the C-terminal side of the membrane and a negative value indicates the snorkeling atom extends toward the N-terminal side of the membrane.
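The definition of SnD above amounts to projecting the Cβ-to-snorkeling-atom vector onto the membrane normal. The following Python sketch illustrates that projection under two assumptions that the text does not spell out: coordinates are plain (x, y, z) tuples in ångströms, and the sign is fixed by flipping the membrane normal so that it points along the helix N-to-C direction. All function and parameter names are hypothetical.

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-vectors."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def _centroid(points):
    """Average coordinates of one or more atoms."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def snorkeling_distance(cb, snorkel_atoms, membrane_normal, helix_n_to_c):
    """Displacement of the snorkeling atom(s) from C-beta along the normal.

    Positive values point toward the C-terminal side of the membrane for
    this helix, negative values toward the N-terminal side.
    """
    length = math.sqrt(_dot(membrane_normal, membrane_normal))
    if length == 0 or not snorkel_atoms:
        raise ValueError("need a non-zero normal and at least one atom")
    unit = tuple(c / length for c in membrane_normal)
    # Assumed sign convention: orient the normal along the helix direction.
    if _dot(unit, helix_n_to_c) < 0:
        unit = tuple(-c for c in unit)
    # Several listed atoms (e.g. Asp O-delta-1 and O-delta-2) are averaged.
    tip = _centroid(snorkel_atoms)
    return _dot(_sub(tip, cb), unit)
```

For example, with Cβ at the origin, two snorkeling atoms at (1, 0, 3) and (−1, 0, 3), and a normal along z, the sketch returns +3.0 for a helix running along +z and −3.0 for an anti-parallel helix.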
For the residues located in each membrane third, we calculated an average snorkeling distance of each amino acid, $\langle \mathrm{SnD} \rangle$, to determine if their side-chains generally point toward one terminus. We present the difference between the averages in the N and C-terminal sections for each amino acid ($\langle \mathrm{SnD} \rangle_{\text{C terminus}} - \langle \mathrm{SnD} \rangle_{\text{N terminus}}$). A positive value of this difference indicates that, on average, the side-chain stretches out of the membrane from the Cβ atom. That is, a positive value indicates that the side-chain extends farther toward the C terminus when the amino acid is located in the C terminus than when it is located in the N terminus. A negative value indicates that, on average, the side-chain extends toward the membrane core. The snorkeling distance is not defined using the membrane boundaries we identified. The membrane boundaries determine only which residues are included in determining the average snorkeling distance in a given region.
Amino acid snorkeling propensities
To determine the direction each side-chain favors without the influence of the membrane, we determined a snorkeling propensity, SnP, for each amino acid. The SnP is the sum of the snorkeling distances of all rotamers weighted by the frequency of each rotamer in soluble proteins:
$$\mathrm{SnP} = \sum_i n_i^{\mathrm{DUN}} \, \mathrm{SnD}_i$$
where $n_i^{\mathrm{DUN}}$ is the frequency of rotamer $i$ taken from Dunbrack’s soluble-protein rotamer library with backbone φ and ψ angles of −60° and −50°,44,55 and $\mathrm{SnD}_i$ is the snorkeling distance of rotamer $i$. We measured the snorkeling distance of each rotamer in an ideal helix parallel with the membrane normal. The rotamer frequencies reflect the average stability of each rotamer in soluble proteins. Thus, the SnP values represent a measure of the inherent bias of the amino acid to extend its snorkeling atom toward the N or C terminus without the snorkeling effects caused by the membrane polarity gradient. Table 1 shows the snorkeling atoms and propensities of each amino acid. If more than one atom is listed, we use the average coordinates of all snorkeling atoms to calculate the distances and propensity.
† http://www.fccc.edu/research/labs/dunbrack/bbdep.html
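The SnP definition is a frequency-weighted sum, which the Python sketch below reproduces for Tyr as a rough plausibility check. It relies on a loudly stated assumption: the paper gives the soluble-protein frequency of χ1 = 180° rotamers (64.5%) and SnD values of +3.3 Å and −3.1 Å for the χ1 = 180° and χ1 = −60° rotamers, but not the remaining frequencies or the SnD of the χ1 = +60° rotamer, so the remaining 35.5% is assigned here to −3.1 Å purely for illustration. The function name is hypothetical.

```python
def snorkeling_propensity(rotamers):
    """Frequency-weighted sum of rotamer snorkeling distances (angstroms).

    `rotamers` is an iterable of (frequency, snd) pairs whose frequencies
    are expected to sum to 1.
    """
    rotamers = list(rotamers)
    total_frequency = sum(freq for freq, _ in rotamers)
    if abs(total_frequency - 1.0) > 1e-6:
        raise ValueError("rotamer frequencies should sum to 1")
    return sum(freq * snd for freq, snd in rotamers)


# Tyr: 64.5% chi1 = 180 (SnD +3.3 A) is taken from the text; lumping the
# remaining 35.5% at -3.1 A is an assumption made only for this sketch.
tyr_rotamers = [(0.645, 3.3), (0.355, -3.1)]
tyr_snp = snorkeling_propensity(tyr_rotamers)
print(f"approximate Tyr SnP: {tyr_snp:+.2f} A")
```

Under this assumption the estimate comes out close to +1.0 Å, in line with the positive Tyr propensity listed in Table 1; the published value was derived from the full rotamer library rather than from this two-bin approximation.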
Acknowledgements

We thank Dr Ronald Dunbrack (Fox Chase Cancer Center) for assistance in binning rotamers and Salem Faham for comments on the work. This work was supported by NIH grant number RO1 GM63919. J.U.B. is a Leukemia and Lymphoma Society Scholar.
References

1. Wallin, E. & von Heijne, G. (1998). Genome-wide analysis of integral membrane proteins from eubacterial, archaean, and eukaryotic organisms. Protein Sci. 7, 1029–1038.
2. White, S. H. & Wimley, W. C. (1998). Hydrophobic interactions of peptides with membrane interfaces. Biochim. Biophys. Acta, 1376, 339–352.
3. White, S. H. & Wimley, W. C. (1999). Membrane protein folding and stability: physical principles. Annu. Rev. Biophys. Biomol. Struct. 28, 319–365.
4. Booth, P. J. (2000). Unravelling the folding of bacteriorhodopsin. Biochim. Biophys. Acta, 1460, 4–14.
5. Krebs, M. P. & Isenbarger, T. A. (2000). Structural determinants of purple membrane assembly. Biochim. Biophys. Acta, 1460, 15–26.
6. Popot, J. L. & Engelman, D. M. (2000). Helical membrane protein folding, stability, and evolution. Annu. Rev. Biochem. 69, 881–922.
7. Chamberlain, A. K., Faham, S., Yohannan, S. & Bowie, J. U. (2003). Construction of helix-bundle membrane proteins. Advan. Protein Chem. 63, 19–46.
8. DeGrado, W. F., Gratkowski, H. & Lear, J. D. (2003). How do helix–helix interactions help determine the folds of membrane proteins? Perspectives from the study of homo-oligomeric helical bundles. Protein Sci. 12, 647–665.
9. Bass, R. B., Strop, P., Barclay, M. & Rees, D. C. (2002). Crystal structure of Escherichia coli MscS, a voltage-modulated and mechanosensitive channel. Science, 298, 1582–1587.
10. Ferguson, A. D., Chakraborty, R., Smith, B. S., Esser, L., van der Helm, D. & Deisenhofer, J. (2002). Structural basis of gating by the outer membrane transporter FecA. Science, 295, 1715–1719.
11. Locher, K. P., Lee, A. T. & Rees, D. C. (2002). The E. coli BtuCD structure: a framework for ABC transporter architecture and mechanism. Science, 296, 1091–1098.
12. Murakami, S., Nakashima, R., Yamashita, E. & Yamaguchi, A. (2002). Crystal structure of bacterial multidrug efflux transporter AcrB. Nature, 419, 587–593.
13. Toyoshima, C. & Nomura, H. (2002). Structural changes in the calcium pump accompanying the dissociation of calcium. Nature, 418, 605–611.
14. Jiang, Y., Lee, A., Chen, J., Ruta, V., Cadene, M., Chait, B. T. & MacKinnon, R. (2003). X-ray structure of a voltage-dependent K+ channel. Nature, 423, 33–41.
15. Yu, E. W., McDermott, G., Zgurskaya, H. I., Nikaido, H. & Koshland, D. E., Jr (2003). Structural basis of multiple drug-binding capacity of the AcrB multidrug efflux pump. Science, 300, 976–980.
16. Isenbarger, T. A. & Krebs, M. P. (1999). Role of helix–helix interactions in assembly of the bacteriorhodopsin lattice. Biochemistry, 38, 9023–9030.
17. Choma, C., Gratkowski, H., Lear, J. D. & DeGrado, W. F. (2000). Asparagine-mediated self-association of a model transmembrane helix. Nature Struct. Biol. 7, 161–166.
18. Zhou, F. X., Cocco, M. J., Russ, W. P., Brunger, A. T. & Engelman, D. M. (2000). Interhelical hydrogen bonding drives strong interactions in membrane proteins. Nature Struct. Biol. 7, 154–160.
19. Bowie, J. U. (2001). Stabilizing membrane proteins. Curr. Opin. Struct. Biol. 11, 397–402.
20. Fleming, K. G. & Engelman, D. M. (2001). Specificity in transmembrane helix–helix interactions can define a hierarchy of stability for sequence variants. Proc. Natl Acad. Sci. USA, 98, 14340–14344.
21. Isenbarger, T. A. & Krebs, M. P. (2001). Thermodynamic stability of the bacteriorhodopsin lattice as measured by lipid dilution. Biochemistry, 40, 11923–11931.
22. Zhou, F. X., Merianos, H. J., Brunger, A. T. & Engelman, D. M. (2001). Polar residues drive association of polyleucine transmembrane helices. Proc. Natl Acad. Sci. USA, 98, 2250–2255.
23. Gratkowski, H., Dai, Q. H., Wand, A. J., DeGrado, W. F. & Lear, J. D. (2002). Cooperativity and specificity of association of a designed transmembrane peptide. Biophys. J. 83, 1613–1619.
24. Howard, K. P., Lear, J. D. & DeGrado, W. F. (2002). Sequence determinants of the energetics of folding of a transmembrane four-helix-bundle protein. Proc. Natl Acad. Sci. USA, 99, 8568–8572.
25. Lear, J. D., Gratkowski, H., Adamian, L., Liang, J. & DeGrado, W. F. (2003). Position-dependence of stabilizing polar interactions of asparagine in transmembrane helical bundles. Biochemistry, 42, 6400–6407.
26. Weiss, M. S., Kreusch, A., Schiltz, E., Nestel, U., Welte, W., Weckesser, J. & Schulz, G. E. (1991). The structure of porin from Rhodobacter capsulatus at 1.8 Å resolution. FEBS Letters, 280, 379–382.
27. Landolt-Marticorena, C., Williams, K. A., Deber, C. M. & Reithmeier, R. A. (1993). Non-random distribution of amino acids in the transmembrane segments of human type I single span membrane proteins. J. Mol. Biol. 229, 602–608.
28. Wimley, W. C. & White, S. H. (1996). Experimentally determined hydrophobicity scale for proteins at membrane interfaces. Nature Struct. Biol. 3, 842–848.
29. Arkin, I. T. & Brunger, A. T. (1998). Statistical analysis of predicted transmembrane alpha-helices. Biochim. Biophys. Acta, 1429, 113–128.
30. Seshadri, K., Garemyr, R., Wallin, E., von Heijne, G. & Elofsson, A. (1998). Architecture of beta-barrel membrane proteins: analysis of trimeric porins. Protein Sci. 7, 2026–2032.
31. Ulmschneider, M. B. & Sansom, M. S. (2001). Amino acid distributions in integral membrane protein structures. Biochim. Biophys. Acta, 1512, 1–14.
32. Sipos, L. & von Heijne, G. (1993). Predicting the topology of eukaryotic membrane proteins. Eur. J. Biochem. 213, 1333–1340.
33. Andersson, H. & von Heijne, G. (1994). Membrane protein topology: effects of delta mu H+ on the translocation of charged residues explain the “positive inside” rule. EMBO J. 13, 2267–2272.
34. van Klompenburg, W., Nilsson, I., von Heijne, G. & de Kruijff, B. (1997). Anionic phospholipids are determinants of membrane protein topology. EMBO J. 16, 4261–4266.
35. Tanford, C. & Reynolds, J. A. (1976). Characterization of membrane proteins in detergent solutions. Biochim. Biophys. Acta, 457, 133–170.
36. Segrest, J. P., Jones, M. K., De Loof, H., Brouillette, C. G., Venkatachalapathi, Y. V. & Anantharamaiah, G. M. (1992). The amphipathic helix in the exchangeable apolipoproteins: a review of secondary structure and function. J. Lipid Res. 33, 141–166.
37. Shrivastava, I. H., Capener, C. E., Forrest, L. R. & Sansom, M. S. (2000). Structure and dynamics of K channel pore-lining helices: a comparative simulation study. Biophys. J. 78, 79–92.
38. Strandberg, E., Morein, S., Rijkers, D. T., Liskamp, R. M., van der Wel, P. C. & Killian, J. A. (2002). Lipid dependence of membrane anchoring properties and snorkeling behavior of aromatic and charged residues in transmembrane peptides. Biochemistry, 41, 7190–7198.
39. Mishra, V. K., Palgunachari, M. N., Segrest, J. P. & Anantharamaiah, G. M. (1994). Interactions of synthetic peptide analogs of the class A amphipathic helix with lipids. Evidence for the snorkel hypothesis. J. Biol. Chem. 269, 7185–7191.
40. Mishra, V. K. & Palgunachari, M. N. (1996). Interaction of model class A1, class A2, and class Y amphipathic helical peptides with membranes. Biochemistry, 35, 11210–11220.
41. Buchko, G. W., Rozek, A., Kanda, P., Kennedy, M. A. & Cushley, R. J. (2000). Structural studies of a baboon (Papio sp.) plasma protein inhibitor of cholesteryl ester transferase. Protein Sci. 9, 1548–1558.
42. Senes, A., Gerstein, M. & Engelman, D. M. (2000). Statistical analysis of amino acid patterns in transmembrane helices: the GxxxG motif occurs frequently and in association with beta-branched residues at neighboring positions. J. Mol. Biol. 296, 921–936.
43. Wiener, M. C. & White, S. H. (1992). Structure of a fluid dioleoylphosphatidylcholine bilayer determined by joint refinement of x-ray and neutron diffraction data. III. Complete structure. Biophys. J. 61, 437–447.
44. Dunbrack, R. L., Jr & Karplus, M. (1993). Backbone-dependent rotamer library for proteins. Application to side-chain prediction. J. Mol. Biol. 230, 543–574.
45. White, S. H., Ladokhin, A. S., Jayasinghe, S. & Hristova, K. (2001). How membranes shape protein structure. J. Biol. Chem. 276, 32395–33238.
46. Monne, M., Nilsson, I., Johansson, M., Elmhed, N. & von Heijne, G. (1998). Positively and negatively charged residues have different effects on the position in the membrane of a model transmembrane helix. J. Mol. Biol. 284, 1177–1183.
47. Yau, W. M., Wimley, W. C., Gawrisch, K. & White, S. H. (1998). The preference of tryptophan for membrane interfaces. Biochemistry, 37, 14713–14718.
48. Kabsch, W. & Sander, C. (1983). Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577–2637.
49. Fauchere, J. L. & Pliska, V. (1983). Hydrophobic parameters of amino acid side-chains from the partitioning of N-acetyl-amino acid amides. Eur. J. Med. Chem-Chim. Ther. 18, 369–375.
50. Kyte, J. & Doolittle, R. F. (1982). A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105–132.
51. Wimley, W. C., Creamer, T. P. & White, S. H. (1996). Solvation energies of amino acid side-chains and backbone in a family of host–guest pentapeptides. Biochemistry, 35, 5109–5124.
52. Eisenberg, D., Schwarz, E., Komaromy, M. & Wall, R. (1984). Analysis of membrane and surface protein sequences with the hydrophobic moment plot. J. Mol. Biol. 179, 125–142.
53. Engelman, D. M., Steitz, T. A. & Goldman, A. (1986). Identifying nonpolar transbilayer helices in amino acid sequences of membrane proteins. Annu. Rev. Biophys. Biophys. Chem. 15, 321–353.
54. Bowie, J. U., Luthy, R. & Eisenberg, D. (1991). A method to identify protein sequences that fold into a known three-dimensional structure. Science, 253, 164–170.
55. Dunbrack, R. L., Jr & Cohen, F. E. (1997). Bayesian statistical analysis of protein side-chain rotamer preferences. Protein Sci. 6, 1661–1681.
Edited by G. von Heijne (Received 20 October 2003; accepted 21 March 2004)