Biophysical Journal Volume 108 January 2015 107–115
107
Article Quantitative Mapping of Protein Structure by Hydroxyl Radical Footprinting-Mediated Structural Mass Spectrometry: A Protection Factor Analysis Wei Huang,1 Krishnakumar M. Ravikumar,1 Mark R. Chance,1 and Sichun Yang1,2,* 1
Center for Proteomics and Bioinformatics and 2Department of Pharmacology, Case Western Reserve University, Cleveland, Ohio
ABSTRACT Measurements from hydroxyl radical footprinting (HRF) provide rich information about the solvent accessibility of amino acid side chains of a protein. Traditional HRF data analyses focus on comparing the difference in the modification/ footprinting rate of a specific site to infer structural changes across two protein states, e.g., between a free and ligand-bound state. However, the rate information itself is not fully used for the purpose of comparing different protein sites within a protein on an absolute scale. To provide such a cross-site comparison, we present a new, to our knowledge, data analysis algorithm to convert the measured footprinting rate constant to a protection factor (PF) by taking into account the known intrinsic reactivity of amino acid side chain. To examine the extent to which PFs can be used for structural interpretation, this PF analysis is applied to three model systems where radiolytic footprinting data are reported in the literature. By visualizing structures colored with the PF values for individual peptides, a rational view of the structural features of various protein sites regarding their solvent accessibility is revealed, where high-PF regions are buried and low-PF regions are more exposed to the solvent. Furthermore, a detailed analysis correlating solvent accessibility and local structural contacts for gelsolin shows a statistically significant agreement between PF values and various structure measures, demonstrating that the PFs derived from this PF analysis readily explain fundamental HRF rate measurements. We also tested this PF analysis on alternative, chemical-based HRF data, showing improved correlations of structural properties of a model protein barstar compared to examining HRF rate data alone. Together, this PF analysis not only permits a novel, to our knowledge, approach of mapping protein structures by using footprinting data, but also elevates the use of HRF measurements from a qualitative, cross-state comparison to a quantitative, cross-site assessment of protein structures in the context of individual conformational states of interest.
INTRODUCTION Structure features of a protein can be comprehensively probed by hydroxyl radical footprinting (HRF) mediated structural mass spectrometry (MS), also known as protein footprinting. Information about solvent accessibility of the side chains of amino acids is encoded in a rate constant of each site describing the extent to which an amino acid is oxidatively modified in an HRF experiment (1–6). Previous use of this rate information has invariably focused on the comparison of a singular site to infer changes in that specific site across different conformational states of a protein, including applications to problems of protein folding (7), ligand-induced dynamics (8), and biomolecular complex assembly (9,10). For example, a significant reduction in the rate constant from a ligand-free to a ligand-bound conformation indicates that the site probed by HRF is in close proximity to a ligand-binding interface. Although it is highly informative for a relative comparison, i.e., between two different conformational states, this site-specific analysis does not provide meaningful measures for different sites that are within each conformational state, as opposed
Submitted May 19, 2014, and accepted for publication November 10, 2014. *Correspondence:
[email protected]
to that site across conformations. To date, it remains unclear whether the rate information can be used to allow absolute comparisons of structural and dynamic features of multiple sites of a protein. The rate quantification in an experimental HRF measurement can be briefly described as follows. For a well-folded protein (schematically shown in Fig. 1 A), formation of its protein topology and topography is due to packing a one-dimensional amino acid sequence into a three-dimensional structure. The packing results in the protein fold where some residues are buried, whereas others are more exposed and solvent-accessible on the protein surface. This accessibility information can be monitored by HRF, where hydroxyl radicals that are produced isotropically in solution can covalently modify side chains of amino acids that are solvent exposed (Fig. 1 B). Note that these radicals can be generated by various means, e.g., from irradiation of water by x-rays or electron beams (11,12), via chemical reactions such as the Fenton reaction (13), or by photolysis of hydrogen peroxide (14). In each case, the reaction products and fundamental methodological approaches for assessing structure are very similar. In a next step, the oxidized protein samples are analyzed via proteolysis using a specific protease, cleaving a whole protein into a large set
Editor: Nathan Baker. Ó 2015 by the Biophysical Society 0006-3495/15/01/0107/9 $2.00
http://dx.doi.org/10.1016/j.bpj.2014.11.013
108
Huang et al.
A
OH
B
C
footprinting
proteolysis
mass spec identification
k fp= 2 s -1
rate constant k fp= 20 s -1
30 ms
m un
od
ifie
d m
i od
fie
d
D y 4 + 16
peak quantification
20 ms
y 5 + 16
y3 y2
b 4 + 16
E
b 3 + 16
Unmodified (%)
F
Abundance
k fp
b 5 + 16
10 ms
Time (ms)
of peptide segments (Fig. 1 C). Because these peptides are covalently and irreversibly labeled by hydroxyl radicals, it is feasible to identify the labeled sites by MS (Fig. 1 D). For each site identified, a subsequent MS analysis is used to quantify the level of modification (or footprinting) by hydroxyl radicals (Fig. 1 E). This quantification step can be conducted at a single point of exposure, or can be repeated at various time points of x-ray irradiation or hydroxyl radical dose. As a consequence, a dose-response curve of footprinting can be determined for each individual peptide segment (Fig. 1 F). By performing a first-order kinetic analysis on the dose-response curve, it is established that this process of hydroxyl radical modification can be well characterized by a rate constant (11), termed here as the footprinting rate kfp. Note that the footprinting itself occurs when the protein is still intact in a well-folded conformation, thereby providing structural characterization at the native, physiological condition (15). As a result, one can obtain a set of footprinting rate constants, effectively describing the kinetic properties of different sites of a protein probed by HRF. However, it is still unclear how this footprinting rate kfp can be used for specific structural interpretation, despite the presumed notion that information on the solvent accessibility of each site should be encoded in this rate constant (16). In addition, as previously mentioned, most existing data analyses rely on a simple comparison of a singular site crossing different conformational states, as opposed to an absolute comparison between different sites that are probed simultaneously within an intact, native protein. There are two contributing factors to the footprinting rate constant kfp. One is related to the chemical character of each amino acid type, which largely depends on its side-chain’s intrinsic reactivity to hydroxyl radicals (1,17,18). The other is the solvent accessibility of each site probed by HRF reflects its local structural environment; this concept for understanding footprinting experiments of many kinds was established in a series of landmark works by Galas and Schmitz (16). However, unlike the similarity of reactivity Biophysical Journal 108(1) 107–115
m/z
FIGURE 1 Schematic of an HRF experiment for rate determination. (A) Illustration of a protein fold, where residues on a protein surface (highlighted in blue) are exposed to solvent and more prone to HRF, whereas other residues (e.g., in red) are buried and less exposed due to tight packing and contact formation. (B) Covalent-labeling of protein sites by hydroxyl radicals (each marked by a green dot) that are generated from radiolysis of water due to synchrotron irradiation. (C) The protein is broken into a set of peptide segments, cleaved by a specific protease. (D and E) Sequence and site of modified peptides are identified and the amount of modification is quantified, based on tandem mass spectroscopy analyses. (F) A characteristic footprinting rate constant kfp is determined for each peptide/residue segment based on a dose-response curve as a function of exposure time. To see this figure in color, go online.
among different modification/cleavage sites characterized by nucleic acids footprinting chemistry, a dynamic range of reactivity of the amino acid side chains to hydroxyl radicals can exceed 1000-fold. Thus, characterization of solvent accessibility of different sites using kfp needs to account for the intrinsic reactivity of the sites as an addition to the Galas and Schmitz concept. To explore the potential use of footprinting data for comprehensive structural characterization, here we introduce a new, to our knowledge, metric that converts the rate constant kfp reflecting a kinetic property to a structure-orientated protection factor (PF). We demonstrate on three model proteins that a simple normalization of kfp by the intrinsic reactivity enables a logical view of protein structures with regard to solvent accessibility using synchrotron-based HRF measurements. Furthermore, in a proof-ofprinciple study, we apply this PF analysis to chemical-based HRF data reported in the literature. Thus, despite its simplicity, this protection factor analysis is poised as a promising analytic tool to enable the general application of HRF for quantitative structural mapping of proteins. MATERIALS AND METHODS The protection factor analysis (see Eq. 1) provides a corrected measure of reactivity that is compared to protein crystal structures. Based on these crystal structures, structure ensembles generated from molecular dynamics (MD) simulations are examined to reveal dynamic fluctuations around a native conformation that may more accurately represent the structure ensemble in solution. Two different types of MD simulations were used: one with a simplified coarse-grained (CG) representation and the other with atomistic details. Details of these two MD simulations are described, respectively, as follows.
-like simulation CG Go Simplified CG simulations via a Go-like model were performed based on the crystal structure of a model protein gelsolin (Protein Data Bank (PDB) entry: 3FFN (19)), where each residue is represented as a single bead at its Ca position with its energy function, as previously explained
Protection Factor Analysis for Footprinting
109
P
in detail (20). Briefly, the bond, angle, and dihedral angle energies are represented as
Ebond ¼
X
2
kb ðr r0 Þ ; Eangle ¼
bonds
Edih ¼
X
Xn ¼ 1;3
ðnÞ k ½1 dihedrals f
2
kq ðq q0 Þ ; and
angles
þ cosðnðf f0 ÞÞ;
˚ 2), kq ¼ 20 kcal/(mol$rad2), using the force constants kb ¼ 100 kcal/(mol$A ð1Þ ð3Þ 2 kf ¼ 1:0 kcal/(mol$rad ), and kf ¼ 0:5 kcal/(mol$rad2). The symbols r, q, and f are the instantaneous bond distances, angles, and dihedral angles, respectively, and r0, q0, and f0 are the corresponding values in the crystal structure. The contact energy for a pair of contact-forming residues is given by
Enative
PF ¼
. 10 X . 12 0 ¼ ε0 5 sij rij 6 s0ij rij ; i;j
where s0ij is the native distance between residue i and j. Based on this CG energy model, Langevin dynamics simulations were performed at 300 K, where a friction coefficient of 50 ps1 was used for each residue. The CG simulations were implemented using a modified version of the package CHARMM (21).
All-atom simulation Atomistic simulations via an Amber ff99sb force field (22) were performed ˚ 93 A ˚ 122 A ˚ was using the package NAMD (23). A water box of 113 A ˚ away from the protein. used so that the boundary of the box is at least 12 A Naþ and Cl ions were added to achieve a salt concentration of 50 mM as used in footprinting measurements. Subsequent MD simulations were performed under the NPT ensemble (24,25), where Langevin dynamics was used for temperature control with a coupling coefficient of 5 ps at a target temperature 300 K. The target pressure was set at 1 atm using a piston period of 100 fs and a decay time of 50 fs. A time step of 2 fs was used together with the SHAKE algorithm constraining hydrogen atoms (26). For both all-atom and CG simulations, the total simulation time was 50 ns with coordinates saved every 100 ps. Postcontact analyses were performed using the CSU software (27).
RESULTS AND DISCUSSION Here, we first describe the PF analysis for HRF and demonstrate its application on three model systems (gelsolin, a-antitrypsin, and yeast cofilin) whose footprinting data are reported in the literature. This PF analysis is then further extended from synchrotron-based to chemical-based HRF measurements. Introduction of the PF analysis A protection factor analysis is directly inspired by the protection factor introduced in hydrogen-deuterium exchange (HDX) experiments (28–30). In HDX, normalization of measured reactivity using intrinsic exchange rates of unfolded peptides permits a comparison of multiple peptides within the same protein on an absolute scale. In a similar spirit, one can define a footprinting-based protection factor on the basis of the intrinsic reactivity of each amino acid specifically measured for HRF by
i Ri ; kfp
(1)
where Ri is the residue-type specific intrinsic reactivity of residue i to hydroxyl radicals (17), and kfp is the footprinting rate constant measured for each peptide segment, based on a first-order estimation from its dose-response measurements. The values of Ri for 20 amino acids are listed in Table 1, which are compiled from radiolysis data reported previously in the literature (17). Although conceptually similar to HDX, HRF provides a different level of information about each site under consideration. The main differences between HDX and HRF are twofold. First, HDX is a reversible covalent labeling technique, whereas HRF is irreversible. Second, HDX labels the residue backbones specifically measuring secondary and tertiary structure stability, whereas HRF probes specifically at the side-chain level, providing a pure measure of solvent accessibility of side chains but relating to tertiary/quaternary structure. In practice, up to 18 of all 20 amino acids have been effectively employed as HRF probes. Another important difference is that HDX can measure a wide range of conformational changes, ranging from large-scale folding/unfolding to local, smallscale conformational opening/closing, whereas HRF can be sensitive to a rather limited degree of conformational freedom (e.g., translations and rotations of side chains) occurring in a well-folded, native conformation. In this scenario, it implies substantial differences in both structural and temporal features probed by these two methods. To examine the accuracy and potential use of this approach, we applied the PF analysis to three model systems whose HRF data (in these cases from synchrotron radiolysis experiments) and crystallographic data are available in the literature. These include the calcium-free (apo) form of human gelsolin (8,19,31,32), a-antitrypsin (33,34), and yeast cofilin (35,36). Based on these data, we were able to examine a total of 24 peptide segments with a range of 5–30 amino acids within each peptide: 13 for gelsolin (Table 2), 8 for antitrypsin (Table S1 in the Supporting Material), and 3 for cofilin (Table S2). When two or more residues are modified within a peptide, only those well separated, e.g., by two residues apart, were used for the PF analysis to avoid interresidue interference. We note TABLE 1
Relative intrinsic reactivity (Ri) of 20 amino acids
Cysa
Metb
Trp
Tyr
Phe
His
Leuc
Ilec
Arg
Lys
29.2 Val 1.9
20.5 Thr 1.6
17.4 Ser 1.4
12.0 Pro 1.0
11.2 Glu 0.69
10.0 Gln 0.66
9.3 Asn 0.44
4.4 Asp 0.42
2.9 Ala 0.14
2.2 Gly 0.04
These values are on a relative scale to proline, compiled from the literature (17). a Derived from peptide GCG. b Derived from peptide GMG. c Based on measurements that use phenylalanine as an external reference (17), but scaled to the level of proline. Biophysical Journal 108(1) 107–115
110 TABLE 2
1 2 3 4 5 6 7 8 9 10 11 12 13
Huang et al. A list of peptides from human gelsolin with kfp, logPF, and S values Peptides
kfp (s.d.) (unit: s1)
logPF (s.d.)
S CG (s.d.)
S allatom (s.d.)
E38PGLQIWR45 F49DLVPVPTNLYGDFFTGDAYVILK72 Y87WLGNECSQDESGAAAIFTVQLDDYLNGR115 E121VQGFESATFLGYFK135 G143GVASGFK150 H151VVPNEVVVQR161 P251ALPAGTEDTAK262 D371PDQTDGLGLSYLSSH386 R424IEGSNKVPVDPATY438 V431PVDPATYGQFYGGDSYIILYNYR454 T571PSAAYLWVGTGASEAEK588 A600QPVQVAEGSEPDGFWEALGGK621 Q722GFEPPSFVGWFLGWDDDYWSVDPLDR748
0.44 (0.09) 1.47 (0.09) 1.86 (0.13) 0.69 (0.05) 0.48 (0.03) 0.80 (0.06) 0.58 (0.05) 0.68 (0.07) 0.78 (0.09) 1.05 (0.10) 0.84 (0.09) 1.17 (0.04) 1.91 (0.15)
4.27 (0.20) 4.12 (0.06) 4.12 (0.07) 4.46 (0.07) 3.57 (0.06) 3.42 (0.08) 3.27 (0.09) 4.16 (0.10) 3.72 (0.12) 4.54 (0.10) 4.03 (0.11) 3.69 (0.03) 4.16 (0.08)
9.0 (0.5) 7.6 (0.2) 6.7 (0.4) 6.6 (0.3) 4.7 (0.7) 2.7 (0.2) 2.8 (0.3) 6.4 (0.2) 6.7 (0.4) 8.9 (0.2) 9.4 (0.3) 6.8 (0.5) 6.9 (0.3)
10.1 (0.5) 8.7 (0.3) 8.3 (0.3) 7.9 (0.2) 6.6 (0.7) 4.9 (0.6) 3.2 (0.3) 7.5 (0.5) 6.8 (0.4) 9.6 (0.3) 9.8 (0.4) 7.2 (0.3) 8.2 (0.3)
Underlined are those residues that are modified by hydroxyl radicals and further identified from tandem MS analyses.
that this restriction is mostly due to the rate quantification, which is reported at the whole peptide level, although this restriction could be eliminated if a single-residue MS analysis was available for rate quantification. Using Eq. 1, we converted the footprinting rate constants to PFs (listed in Table 2 as well as Tables S1 and S2). Following a convention used in HDX, we expressed the PF values on a logarithmic scale (i.e., logPF) to color the corresponding crystal structures of these three proteins. This use of logPF (throughout this work), as opposed to the PF itself, is because we hypothesize that the footprinting rate is related to the activation free energy barrier associated with the accessibility of the protein side chains to hydroxyl radicals and the initial chemical step of hydrogen abstraction or ring attack. As such, it is different from that used in HDX (28–30), which can be associated with the equilibrium free energy, e.g., between open and closed states involved in hydrogen-bond breakage. As shown in Fig. 2, it is clear that the buried peptide segments have higher logPF values (in red), whereas those on the surface have lower logPF values (in blue). In other words, this projection enables a logical topographical view of structural features of various protein sites; namely, high-logPF regions are more buried and low-logPF regions are more exposed and preferentially located on the solvent accessible surface. Thus, the PF analysis is able to map out local accessibility of different protein regions using HRF measurements.
accessibility (8,31,37–39). A similar correlation was also observed in other measurements with hydroxyl radicals generated from either electrical discharge (40) or lasers (14,41,42). From a quantitative perspective, however, such a comparison is still insufficient, given that only identical segments could be compared across different conformational states. Here, we examined the 13 peptides from gelsolin listed in Table 2 as an example. Fig. 3, A and B, show the comparison of the total SASA (a surface area of all modified residues within a peptide) with kfp and logPF, respectively. In each case, we calculated a Pearson correlation coefficient r and determined r ¼ 0.42 (with p-value ¼ 0.16) for the case of kfp and r ¼ 0.33 (with p-value ¼ 0.28) for logPF, where the p-value was calculated against the null hypothesis
Strong correlation between HRF-based PFs and structural properties To examine the extent to which experimental rate data can be used for a quantitative structural understanding via this PF analysis, we then examined the relation of the PF values with several widely used structural parameters including the solvent accessible surface area (SASA) and structural contacts. It has been previously demonstrated that the footprinting rate kfp determined from synchrotron radiolysis experiments has a qualitative correlation with local solvent Biophysical Journal 108(1) 107–115
FIGURE 2 Structure mapping based on logarithmic PF values on three model proteins. (A) Human gelsolin (PDB entry 3FFN (19)), where a missing loop was modeled based on its homology (PDB entry 1D0N (32)) (B) a-antitrypsin (PDB entry 1QLP (33)). (C) Yeast cofilin (PDB entry 1CFY (35)). Bottom is a color bar used in the color mapping where lowlogPF regions are colored in blue and high-logPF regions are in red. To see this figure in color, go online.
Protection Factor Analysis for Footprinting
A
B
C
D
111
(Fig. 3 D). These data indicate that a quantitative structural interpretation of footprinting data is enabled by accounting for the chemical reactivity differences of the amino acids in the PF analysis (Eq. 1). To further explore the extent to which the PFs capture local structural features, a commonly used structural parameter, namely, the number of contacts one residue can make with its neighboring residues, is examined for its correlation with footprinting data. Such a contact parameter is previously known to be correlated with local solvent accessibility (43). Here, given the differential contribution of each residue type as just demonstrated in the previous hSASAi analysis, we used a structural contact parameter S, which also takes into account the differential contribution of each amino acid within a peptide such as S ¼
E
F
FIGURE 3 The PF analysis on footprinting data improves correlation with structural properties of solvent accessible surface area and structural contact of gelsolin. (A and B) Scatter plots of the rate constant kfp and the logarithmic PF with the total SASA value of modified residue(s) within each peptide. (C and D) Scatter plots of kfp and logPF with the weighted SASA, hSASAi. (E and F) Scatter plots of kfp and logPF with respect to the structural contact S crystal , which was calculated using the crystal structure (PDB entry 3FFN) as a reference.
r ¼ 0 for the entire set of 13 peptides. As the correlation is not statistically significant, this suggests that total SASA alone is not sufficient to explain the HRF kinetic observable. Given the distinct character of each amino acid in terms of its intrinsic reactivity, a weighted solvent accessibility hSASAi was then used to account for the contribution from each amino acid by P i Ri , SASAi P ; (2) hSASAi ¼ i Ri where SASAi is the solvent accessible surface area of residue i and Ri is the intrinsic chemical reactivity as listed in Table 1. Although no significant correlation is observed when kfp is plotted vs. hSASAi (r ¼ 0.23 and p-value ¼ 0.46; see Fig. 3 C), the correlation is significant between hSASAi and logPF with r ¼ 0.69 and p-value ¼ 0.007
P i Ri , Qi P ; i Ri
(3)
where Qi is the number of contacts that residue i makes with its neighboring residues, calculated using the package CSU (27). As for the kfp alone, its correlation with the structural contact parameter S is not significant (r ¼ 0.12; see Fig. 3 E), but the correlation is significantly increased to r ¼ 0.77 and p-value ¼ 0.001 (Fig. 3 F), when logPF values are used and compared with S values. Thus, protein topology and topography are seen to be highly significant determinants of the kinetic observables in HRF experiments. To account for the effect of local structural fluctuation on the structural correlation with PFs, we also performed Langevin dynamics simulations using both residue-simplified CG and atomic-level representations to model the solution structure ensemble. The former was built based on a Go-type potential defined mostly by attractive native interactions that stabilize a native conformation (20,44) (see details in Methods as well). In parallel, the latter allatom simulations were performed starting from a crystal structure (see Methods). For both CG and all-atom simulations, averaged S values were calculated and listed in Table 2 from a total of 50-ns of simulation data, within which the simulations have reached equilibrium as assessed by the commonly used Ca root mean-square deviations (with reference to the crystal structure) as well as the root mean-square fluctuation for each residue (Fig. 4 A and Fig. 4 C). Clearly, these simulation-based S values are highly correlated with S crystal calculated based on the crystal structure (Fig. 5), indicating that both simulations closely sample the local structural fluctuation near the crystal structure, as opposed to a set of structures having large conformational changes. Nonetheless, Fig. 6 shows that both the S values are very significantly correlated with logPF with r ¼ 0.79 and r ¼ 0.85, respectively. However, the results also indicate that there is only a modest improvement in the correlation between S and logPF when dynamic Biophysical Journal 108(1) 107–115
112
Huang et al.
A
B
C
FIGURE 4 Local structural fluctuation of gelsolin accessed by MD simulations. (A and C) Shown are trajectories of Ca root mean-square deviations (with reference to the crystal structure) for CG and all-atom simulations, respectively. (B and D) The root mean-squared fluctuation of each residue is also shown, where positions of the 13 peptides of gelsolin (reported in Table 2) are marked at the bottom of each panel.
D
fluctuations are included. For example, S crystal (Fig. 3 F) and S all-atom (Fig. 6) are correlated to logPF with very close r values (r ¼ 0.77 and r ¼ 0.85, respectively), suggesting that both static and dynamic structures are equivalently positioned to provide an effective interpretation of HRF measurements. Accuracy of the PF analysis The PF analysis was assessed by cross-validation of the logPF-S correlation calculations. This is performed by a leave-one-out jackknife approach by using all but one of the 13 peptides of gelsolin. Based on data from the other 12 peptides, the logPF value of the one that was left out was predicted via a linear regression. Fig. 7 shows the comparison between the predicted and observed logPF values from this jackknife test, where close agreements and very statistically significant correlations are observed for both CG and all-atom simulations. Although an overall agreement is demonstrated for the observed and jackknife-predicted logPF values, it should be noted that there are also potential outliers identified from this test. For example, the peptide 11 (listed in Table 2) is overestimated in terms of its predicted
logPF value compared to a prefect regression line. A detailed analysis shows that it makes quite a few additional contacts, presumably due to the formation of a b-strand, despite being exposed on the protein surface. Another outlier is peptide 4 (see Table 2), where F125 appears to be the only residue modified within this peptide (31). It is likely that the observed footprinting rate merely comes only from F125. Based on this consideration, we applied the PF analysis to this residue alone (using Eq. 1 but in the manner of a single-residue PF analysis), which gives a modestly improved correlation coefficient r ¼ 0.82, where the p-value is improved threefold (with p-value ¼ 0.0003). This improvement, due to the better resolution of a single-residue level measurement, suggests the potential value of single-residue HRF measurements (as opposed to the peptide-level assessments) that have been recently made possible thanks to technical advances in chromatographic methods and mass spectrometry using targeted quantification (5,45). Nonetheless, this cross-validation analysis provides an overall assessment on the effectiveness of our newly introduced PF analysis and further identifies potential outliers that would require a more careful detailed examination on a case-by-case basis. A
A
B
B
FIGURE 5 Correlation between S crystal and simulation-based structural contacts S CG /S all-atom of gelsolin. Values of S CG and S all-atom from CG and all-atom simulations are listed in Table 2, respectively. Biophysical Journal 108(1) 107–115
FIGURE 6 (A and B) Strong correlation between protection factors and structural contacts of gelsolin. Based on the PF and S values listed in Table 2, a Pearson correlation coefficient of r ¼ 0.79 (with p-value ¼ 0.0007) is observed for the CG-based S CG values, and r ¼ 0.85 (with p-value ¼ 0.00007) for the all-atom based S all-atom values.
Protection Factor Analysis for Footprinting
A
B
FIGURE 7 A leave-one-out jackknife test on the protection factor analysis of HRF data of gelsolin. This jackknife analysis shows a correlation coefficient of r ¼ 0.71 (with p-value ¼ 0.005) and r ¼ 0.80 (with p-value ¼ 0.0005) for the results of CG and all-atom simulations (shown in Fig. 6), respectively. To see this figure in color, go online.
113
A
B
C
D
Extending the PF analysis to chemical-based HRF measurements To examine whether the PF analysis can be generalized for analyzing other HRF experimental data, here we applied it to an alternative HRF approach using a fast photochemical oxidation (46), where the amount of modification is quantified at a single dose. Using the modification value (i.e., zero) at the initial time point and at this single dose point, the labeling yield becomes equivalent to a footprinting rate that can be used and compared across different peptides. Thus, a PF analysis can also be performed in this case by taking into account the intrinsic chemical reactivity of each amino acid as used in Eq. 1. We performed a PF analysis on the labeling data of barstar reported in (46). The resulting PFs are further compared with two structural parameters of SASA and S crystal that are calculated for each single residue based on its crystal structure (PDB entry: 1A19 (47)). When comparing the modification yield itself with either SASA (Fig. 8 A) or S crystal (Fig. 8 C), no correlation is observed between the yield and the structural parameters. However, when the logPF values derived from the labeling yields is used, Fig. 8 shows a notable increase in the correlation coefficient from r ¼ 0.01 to r ¼ 0.38 for the case of SASA and from r ¼ 0.07 to r ¼ 0.42 for the case of S crystal (Fig. 8 B and Fig. 8 D). Although these correlation coefficients themselves are of modest statistical significance, compared to the synchrotron-based HRF results (Fig. 3 F and Fig. 6), it is clear that the relative increase in correlation suggests this PF analysis—based on the consideration of the intrinsic chemical reactivity of individual amino acid types—can be applied to using HRF measurements for structural interpretation in general. CONCLUSIONS In summary, we have introduced a PF analysis that enables a quantitative use of HRF measurements for unbiased struc-
FIGURE 8 Extending the protection factor analysis to chemical-based HRF measurements of barstar. (A and B) Scatter plots of the modification yield and logPF with the SASA of each modified residue of a barstar model protein. (C and D) Scatter plots of the modification yield and logPF with S crystal , the number of structural contacts for each residue using the crystal structure of barstar (PDB entry 1A19 (47)) as a reference.
tural characterization of proteins. Based on three model protein systems, solvent accessibility and local structural contacts exhibit a quantitative agreement with calculated PFs, thereby permitting a topographical prediction of protein structures. Retrospectively, structural analyses based on these derived PFs show that protein topography governs the kinetic aspects of HRF. As such, this opens up the possibility of explicit structure modeling by using HRF measurements either at the peptide level or even at a single-residue level. SUPPORTING MATERIAL Two tables are available at http://www.biophysj.org/biophysj/supplemental/ S0006-3495(14)01197-7.
ACKNOWLEDGMENTS We thank Hua Xu, Masaru Miyagi, Jean-Eudes Dazard, and Janna Kiselar for discussions and proofreading. Computational support was provided in part by the Ohio Supercomputer Center and the CWRU high-performance computing cluster. This work was supported in part by National Institutes of Health (NIH) grants P30EB-09998 and R01-EB-09688 (M.R.C.).
REFERENCES 1. Xu, G., and M. R. Chance. 2007. Hydroxyl radical-mediated modification of proteins as probes for structural proteomics. Chem. Rev. 107:3514–3543. Biophysical Journal 108(1) 107–115
114 2. Gupta, S., M. Sullivan, ., M. R. Chance. 2007. The Beamline X28C of the Center for Synchrotron Biosciences: a national resource for biomolecular structure and dynamics experiments using synchrotron footprinting. J. Synchrotron Radiat. 14:233–243. 3. Gupta, S., R. Celestre, ., C. Ralston. 2014. Development of a microsecond X-ray protein footprinting facility at the Advanced Light Source. J. Synchrotron Radiat. 21:690–699. 4. Maleknia, S. D., and K. M. Downard. 2014. Advances in radical probe mass spectrometry for protein footprinting in chemical biology applications. Chem. Soc. Rev. 43:3244–3258. 5. Kiselar, J. G., and M. R. Chance. 2010. Future directions of structural mass spectrometry using hydroxyl radical footprinting. J. Mass Spectrom. 45:1373–1382. 6. Wang, L., and M. R. Chance. 2011. Structural mass spectrometry of proteins using hydroxyl radical based protein footprinting. Anal. Chem. 83:7234–7241.
Huang et al. 22. Hornak, V., R. Abel, ., C. Simmerling. 2006. Comparison of multiple Amber force fields and development of improved protein backbone parameters. Proteins. 65:712–725. 23. Phillips, J. C., R. Braun, ., K. Schulten. 2005. Scalable molecular dynamics with NAMD. J. Comput. Chem. 26:1781–1802. 24. Martyna, G. J., D. J. Tobias, and M. L. Klein. 1994. Constant pressure molecular dynamics algorithms. J. Chem. Phys. 101:4177. 25. Feller, S. E., Y. Zhang, ., B. R. Brooks. 1995. Constant pressure molecular dynamics simulation: the Langevin piston method. J. Chem. Phys. 103:4613. 26. Ryckaert, J.-P., G. Ciccotti, and H. J. Berendsen. 1977. Numerical integration of the cartesian equations of motion of a system with constraints: molecular dynamics of n-alkanes. J. Comput. Phys. 23: 327–341. 27. Sobolev, V., A. Sorokine, ., M. Edelman. 1999. Automated analysis of interatomic contacts in proteins. Bioinformatics. 15:327–332.
7. Chance, M. R. 2001. Unfolding of apomyoglobin examined by synchrotron footprinting. Biochem. Biophys. Res. Commun. 287: 614–621.
28. Bai, Y., J. S. Milne, ., S. W. Englander. 1993. Primary structure effects on peptide group hydrogen exchange. Proteins. 17:75–86.
8. Kiselar, J. G., P. A. Janmey, ., M. R. Chance. 2003. Visualizing the Ca2þ-dependent activation of gelsolin by using synchrotron footprinting. Proc. Natl. Acad. Sci. USA. 100:3942–3947.
29. Craig, P. O., J. La¨tzer, ., P. G. Wolynes. 2011. Prediction of nativestate hydrogen exchange from perfectly funneled energy landscapes. J. Am. Chem. Soc. 133:17463–17472.
9. Kiselar, J. G., R. Mahaffy, ., M. R. Chance. 2007. Visualizing Arp2/ 3 complex activation mediated by binding of ATP and WASp using structural mass spectrometry. Proc. Natl. Acad. Sci. USA. 104:1552– 1557.
30. Bai, Y., T. R. Sosnick, ., S. W. Englander. 1995. Protein folding intermediates: native-state hydrogen exchange. Science. 269:192–197.
10. Gupta, S., H. Cheng, ., M. Brenowitz. 2007. DNA and protein footprinting analysis of the modulation of DNA binding by the N-terminal domain of the Saccharomyces cerevisiae TATA binding protein. Biochemistry. 46:9886–9898. 11. Maleknia, S. D., M. Brenowitz, and M. R. Chance. 1999. Millisecond radiolytic modification of peptides by synchrotron X-rays identified by mass spectrometry. Anal. Chem. 71:3965–3973. 12. Maleknia, S. D., M. R. Chance, and K. M. Downard. 1999. Electrospray-assisted modification of proteins: a radical probe of protein structure. Rapid Commun. Mass Spectrom. 13:2352–2358. 13. Tullius, T. D., and B. A. Dombroski. 1986. Hydroxyl radical ‘‘footprinting’’: high-resolution information about DNA-protein contacts and application to lambda repressor and Cro protein. Proc. Natl. Acad. Sci. USA. 83:5469–5473. 14. Hambly, D. M., and M. L. Gross. 2005. Laser flash photolysis of hydrogen peroxide to oxidize protein solvent-accessible residues on the microsecond timescale. J. Am. Soc. Mass Spectrom. 16:2057– 2063. 15. Downard, K. M., S. D. Maleknia, and S. Akashi. 2012. Impact of limited oxidation on protein ion mobility and structure of importance to footprinting by radical probe mass spectrometry. Rapid Commun. Mass Spectrom. 26:226–230. 16. Galas, D. J., and A. Schmitz. 1978. DNAse footprinting: a simple method for the detection of protein-DNA binding specificity. Nucleic Acids Res. 5:3157–3170. 17. Xu, G., and M. R. Chance. 2005. Radiolytic modification and reactivity of amino acid residues serving as structural probes for protein footprinting. Anal. Chem. 77:4549–4555. 18. Xu, G., and M. R. Chance. 2005. Radiolytic modification of sulfurcontaining amino acid residues in model peptides: fundamental studies for protein footprinting. Anal. Chem. 77:2437–2449. 19. Nag, S., Q. Ma, ., R. C. Robinson. 2009. Ca2þ binding by domain 2 plays a critical role in the activation and stabilization of gelsolin. Proc. Natl. Acad. Sci. USA. 106:13713–13718.
31. Kiselar, J. G., P. A. Janmey, ., M. R. Chance. 2003. Structural analysis of gelsolin using synchrotron protein footprinting. Mol. Cell. Proteomics. 2:1120–1132. 32. Burtnick, L. D., E. K. Koepf, ., R. C. Robinson. 1997. The crystal structure of plasma gelsolin: implications for actin severing, capping, and nucleation. Cell. 90:661–670. 33. Elliott, P. R., X. Y. Pei, ., D. A. Lomas. 2000. Topography of a 2.0 A structure of alpha1-antitrypsin reveals targets for rational drug design to prevent conformational disease. Protein Sci. 9:1274–1281. 34. Zheng, X., P. L. Wintrode, and M. R. Chance. 2008. Complementary structural mass spectrometry techniques reveal local dynamics in functionally important regions of a metastable serpin. Structure. 16:38–51. 35. Fedorov, A. A., P. Lappalainen, ., S. C. Almo. 1997. Structure determination of yeast cofilin. Nat. Struct. Biol. 4:366–369. 36. Guan, J.-Q., S. Vorobiev, ., M. R. Chance. 2002. Mapping the G-actin binding surface of cofilin using synchrotron protein footprinting. Biochemistry. 41:5765–5775. 37. Kiselar, J. G., S. D. Maleknia, ., M. R. Chance. 2002. Hydroxyl radical probe of protein surfaces using synchrotron X-ray radiolysis and mass spectrometry. Int. J. Radiat. Biol. 78:101–114. 38. Guan, J.-Q., S. C. Almo, ., M. R. Chance. 2003. Structural reorganization of proteins revealed by radiolysis and mass spectrometry: G-actin solution structure is divalent cation dependent. Biochemistry. 42:11992–12000. 39. Guan, J.-Q., K. Takamoto, ., M. R. Chance. 2005. Structure and dynamics of the actin filament. Biochemistry. 44:3166–3175. 40. Maleknia, S. D., and K. M. Downard. 2012. On-plate deposition of oxidized proteins to facilitate protein footprinting studies by radical probe mass spectrometry. Rapid Commun. Mass Spectrom. 26:2311– 2318. 41. Sharp, J. S., J. M. Becker, and R. L. Hettich. 2004. Analysis of protein solvent accessible surfaces by photochemical oxidation and mass spectrometry. Anal. Chem. 76:672–683.
20. Yang, S., J. N. Onuchic, and H. Levine. 2006. Effective stochastic dynamics on a protein folding energy landscape. J. Chem. Phys. 125: 054910.
42. Charva´tova´, O., B. L. Foley, ., R. J. Woods. 2008. Quantifying protein interface footprinting by hydroxyl radical oxidation and molecular dynamics simulation: application to galectin-1. J. Am. Soc. Mass Spectrom. 19:1692–1705.
21. Brooks, B. R., C. L. Brooks, 3rd, ., M. Karplus. 2009. CHARMM: the biomolecular simulation program. J. Comput. Chem. 30:1545– 1614.
43. Eyal, E., R. Najmanovich, ., V. Sobolev. 2004. Importance of solvent accessibility and contact surfaces in modeling side-chain conformations in proteins. J. Comput. Chem. 25:712–724.
Biophysical Journal 108(1) 107–115
Protection Factor Analysis for Footprinting
115
44. Onuchic, J. N., and P. G. Wolynes. 2004. Theory of protein folding. Curr. Opin. Struct. Biol. 14:70–75.
nying protein folding: data processing and application to barstar. Biochim. Biophys. Acta. 1834:1230–1238.
45. Gupta, S., R. D’Mello, and M. R. Chance. 2012. Structure and dynamics of protein waters revealed by radiolysis and mass spectrometry. Proc. Natl. Acad. Sci. USA. 109:14882–14887.
47. Ratnaparkhi, G. S., S. Ramachandran, ., R. Varadarajan. 1998. Discrepancies between the NMR and X-ray structures of uncomplexed barstar: analysis suggests that packing densities of protein structures determined by NMR are unreliable. Biochemistry. 37: 6958–6966.
46. Gau, B. C., J. Chen, and M. L. Gross. 2013. Fast photochemical oxidation of proteins for comparing solvent-accessibility changes accompa-
Biophysical Journal 108(1) 107–115