Analytical Biochemistry 280, 46–57 (2000)
doi:10.1006/abio.2000.4483, available online at http://www.idealibrary.com
Enhanced Prediction Accuracy of Protein Secondary Structure Using Hydrogen Exchange Fourier Transform Infrared Spectroscopy
Bernoli I. Baello, Petr Pancoska, and Timothy A. Keiderling¹
Department of Chemistry, University of Illinois at Chicago, 845 W. Taylor Street (M/C 111), Chicago, Illinois 60607-7061
Received July 7, 1999
A novel equilibrium hydrogen exchange Fourier transform IR (HX-FTIR) spectroscopy method for predicting secondary structure content was employed using spectra obtained for a training set of 23 globular proteins. The IR bandshape and frequency changes resulting from controlled levels of H–D exchange were observed to be protein-dependent, and their analysis revealed these variations to be partly correlated with secondary structure. For each protein, a set of six spectra was measured with a systematic variation of the solvent H–D ratio and was subjected to factor analysis. The most significant component spectra for each protein, representing independent aspects of the spectral response to deuteration, were each subjected to a second factor analysis over the entire training set. Restricted multiple regression (RMR) analysis using the loadings of the principal components from 19 of these H–D analyses revealed an improvement in prediction accuracy compared with conventional bandshape-based analyses of FTIR data. Nearly a factor of 2 reduction in error for prediction of helix fractions was found using s_1, the average spectral response for the H–D set. In some cases, significant error reduction for prediction of minor components was found using higher factors. Using the same analytical methods, prediction errors with this new deuteration–response–FTIR method were shown to be even lower than those obtained from electronic circular dichroism (ECD) data for helix predictions and significantly lower than the ECD-based errors for sheet predictions, making these the best secondary structure predictions obtained with the RMR method. Tests of a limited variable selection scheme showed further improvements, consistent with previous results of this approach using ECD data. © 2000 Academic Press
1 To whom correspondence should be addressed. Fax: (312) 9960431. E-mail:
[email protected].
Optical spectra, including FTIR,² Raman, and electronic and vibrational CD (ECD and VCD), have been widely used as a basis for secondary structure analyses of proteins. A variety of mathematical tools have been developed for extracting quantitative estimates of the fractions of helix, sheet, and other components from these spectral data (1–29). Bandshape-based analyses have dominated ECD and VCD methods (30, 31), while bandshape- and frequency-based approaches (20, 32) have been employed for IR and Raman analyses. These techniques involve relatively rapid measurements with an intrinsically fast time scale, and their analyses provide structural insight either by themselves or in combination with data from other techniques. Combined spectral studies, for instance, FTIR with ECD (13, 19, 24) or VCD with ECD (23, 24), often provide structural details and precision not available from one technique alone.

Similarly, hydrogen exchange is a broadly used technique for protein structure studies, particularly for folding analyses. Its chemical and physical mechanisms have been studied extensively (33–40), and it has proven most valuable when combined with deuteration-sensitive spectroscopic methods such as IR (41–51), Raman (52), NMR (37, 38, 53–62), mass spectrometry (63–68), and neutron diffraction (69). In particular, FTIR and VCD measurements have often been carried out on proteins in D2O-based solutions because of interference from H2O absorbances. The H–O–H deformation band (~1650 cm⁻¹) overlaps the amide I (primarily C=O stretch) frequency region, which provides the most structurally informative IR spectral changes. Using D2O, the amide I′ band (N-deuterated) can be measured without solvent interference; furthermore, lower protein concentrations and longer pathlengths can be used to avoid aggregation effects. Although data of good signal-to-noise ratio (S/N) can be obtained either way (70, 71), the use of D2O drastically modifies the amide II′ and III′ bands and, perhaps more importantly, involves an uncertainty as to the extent of deuteration of the protein (degree of H–D exchange). Thus, one faces a trade-off between solvent interference with the most important band (in H2O) and ambiguity in the degree of deuteration (in D2O), which has real consequences for frequency and bandshape. However, the variability in deuteration also has a structural origin. Those components of the structure that are most fluxional or are on the surface (such as loops and turns) are expected to exchange more readily and consequently become fully deuterated in D2O. Those residues in the interior that are more strongly internally hydrogen-bonded, and thus more protected, will tend to retain protonated amides. Hence, an analysis that couples deuteration response to spectral characteristics may give added insight into secondary and possibly tertiary structure.

The amide I band has proven to be the FTIR feature most sensitive to secondary structure, which has led to band assignments of its components both in H2O (72) and in D2O (73). The overall maximum of the amide I band shifts about 5–10 cm⁻¹ to lower wavenumber upon N–H deuteration (amide I′). By contrast, the amide II band (primarily N–H deformation and C–N stretch), which is less sensitive to protein secondary structure but more sensitive to deuteration, shifts by about 100 cm⁻¹ on N–D exchange (amide II′). In situations of partial deuteration, the number of component bands can, in principle, increase substantially, so we will use a bandshape-based analysis method to provide some separation of these deuteration effects from the structural variations in proteins.

² Abbreviations used: FTIR, Fourier transform infrared; ECD, electronic circular dichroism; VCD, vibrational CD; PC/FA, principal component method of factor analysis; RMR, restricted multiple regression; FC, fractional secondary structure composition.
0003-2697/00 $35.00. Copyright © 2000 by Academic Press. All rights of reproduction in any form reserved.
The principal component method of factor analysis (PC/FA) (23, 24, 26) provides a means of determining the most common response and the major variance of a system to a perturbation. By a study of the spectral response to systematic H–D exchange, we have the potential of selectively removing the more tertiary structure-sensitive components of the spectra (major variance with deuteration) from the common response. The latter should emphasize the commonality of the secondary structure components while the former might reveal tertiary effects. In this paper, we will discuss the impact of these perturbations on secondary structure prediction, while in a subsequent work, with a totally different algorithm, we hope to explore tertiary effects in more detail than is possible with our current, secondary structure-based descriptor methods.
Here we combine H–D exchange (in an equilibrium sense) with FTIR spectra of a training set of proteins to improve both the accuracy and extent of prediction of fractional secondary structure in these proteins. Using a consistent bandshape analysis scheme based on PC/FA and our previously detailed (23, 24, 26) restricted multiple regression (RMR) method of correlation with secondary structure, we compare this H–D exchange enhanced FTIR approach with results from ordinary FTIR in H2O and D2O as well as with ECD-based analyses for the same set of proteins.

MATERIALS AND METHODS
Proteins. The set of 23 proteins studied is listed in Table 1 with the species and source. Nineteen of them are treated as "knowns," or the training set, for the purposes of the RMR predictions. The FC values of these 19 are also given, as determined from the Protein Data Bank (PDB) file listed in column 5 using the Kabsch and Sander DSSP program (74), as previously described (23). The proteins were used as purchased without further purification. For each protein, a set of six unbuffered aqueous solutions was prepared at a concentration of about 100 mg/mL with varying ratios of H2O:D2O (0, 20, 40, 60, 80, and 100%, v/v D2O). This provides a solvent environment for the protein samples that is consistent with our previous FTIR- and VCD-based structural analyses on these same reference proteins (24, 26), except for the effects of deuteration. Aliquots of about 15 µL were placed in a commercial refillable cell (Graseby-Specac PN 1012) consisting of CaF2 windows separated by a 6-µm Mylar spacer. Deuterated water was obtained from Cambridge Isotope Laboratories.

Spectroscopy. All FTIR measurements were performed at room temperature using a Bio-Rad FTS-60 spectrometer with a DTGS detector, running under Win-IR software. All spectra were measured with a consistent time delay of about 15 min between the time each solution was made and the first scan. Each spectrum was an average of 256 scans accumulated at a nominal resolution of 2 cm⁻¹, which took approximately an additional 30 min. Interferograms were apodized with a triangular function and then Fourier-transformed. Each protein spectrum was obtained by subtracting the solvent spectrum from the spectrum of the protein solution measured under the same conditions. Subsequent data processing was carried out using GRAMS/32 (Galactic Industries, Salem, NH). For comparison, ECD spectral data for the 23 proteins used in this study were obtained from our spectral database (23, 24).

FA/RMR calculation.
Our method of using factor analysis results coupled with a restricted multiple regression (FA/RMR) has been fully detailed separately (23, 24), and only a brief description is provided here. Full details of the algorithm and the software are available from the authors. As an overview, FA is a model-free method (in contrast to deconvolution and assignment of "resolved" bands to secondary structural types (16, 20)) for determining components of the bandshape based on their commonality over a training set of spectra. The loadings of each component spectrum are normally fitted to the fractional secondary structure composition (FC) using some regression relation. The fitting relationship can then be used to predict secondary structure content for unknown proteins. The RMR approach selectively chooses those loadings that yield the most reliable predictions for each structural type independently. With this method, the helix determination, for example, has no impact on and is independent of the sheet determination. There are no model constraints that exclude negative or >100% determinations of total structure; hence, these characteristics provide an independent test of the results.

TABLE 1
Reference Set of Proteins with Their Source, Species, X-Ray PDB Entry Code, and FC Values

Protein name                Code  Species                Source              PDB     Helix  Sheet  Turns  Bends  Other
Alcohol dehydrogenase       ADH   Horse liver            Fluka 05648         8ADH     24.9   20.6   14.7   13.6   26.2
Carbonic anhydrase          CAH   Bovine erythrocytes    Sigma C-3934        1CA2     16.1   28.9   12.9   15.2   27.0
α-Chymotrypsinogen A        CGN   Bovine pancreas        Sigma C-4879        2CGA     14.3   32.2   14.3   12.7   26.5
α-Chymotrypsin type II      CHT   Bovine pancreas        Sigma C-4129        5CHA     11.8   32.1   11.4   14.4   30.4
Concanavalin A              CAN   Jack bean              Sigma C-2010        3CNA      0.0   40.5    9.3   19.8   30.4
Cytochrome c                CYT   Bovine heart           Sigma C-3131        3CYT     42.7    0.0   15.5    8.7   33.0
Glutathione reductase       GRS   Wheat germ             Sigma G-6004        3GRS     29.3   18.7   10.4   19.3   22.3
Hemoglobin                  HEM   Human                  Sigma H-7379        1HCO     62.7    0.0   18.8    6.6   11.9
-Immunoglobulin             REI   Human                  Fluka 56834         1REI      2.8   47.7   14.0   11.2   24.3
Lactate dehydrogenase       LDH   Rabbit                 Calbiochem 427217   6LDH     36.8   11.3   14.3   13.1   24.6
Lysozyme                    LYS   Hen egg white          Sigma L-6876        7LYZ     38.8    7.8   20.9   16.3   16.3
Myoglobin                   MYO   Horse skeletal muscle  Sigma M-0682        1MBN     77.1    0.0    9.8    1.9   11.1
Ribonuclease A              RNA   Bovine                 Sigma R-5125        3RN3     21.0   34.7   11.3   14.5   18.6
Ribonuclease S              RNS   Bovine pancreas        Sigma R-6000        2RNS     20.8   35.2    7.2   14.4   22.4
Subtilisin BPN′             SBT   Bacterial              Sigma P-8038        1SBT     30.2   17.8   15.3   12.0   24.7
Superoxide dismutase        SOD   Bovine erythrocytes    Fluka 86200         2SOD      2.0   38.4   14.6   20.5   24.5
Triose phosphate isomerase  TPI   Yeast                  Sigma T-2507        1TIM     45.8   17.0    7.3    8.9   21.1
Trypsin                     TRP   Bovine pancreas        Sigma T-8253        3PTN      9.9   32.3   14.8   17.9   25.1
Trypsin inhibitor           STI   Soybean                Sigma T-9003        1AVU      1.7   38.5   10.9    9.8   39.1
Albumin                           Bovine serum           Sigma A-0281        -
α-Casein                          Bovine milk            Sigma C-7891        -
Lactoferrin                       Human milk             Sigma L-0520        -
β-Lactoglobulin A                 Bovine milk            Sigma L-7880        -

Note. The codes in the second column are used to refer to the proteins in the figures and Table 7.

Overall absorbance intensity is potentially a significant component in the FA/RMR calculation but is problematic for protein FTIR spectra because pathlength and concentration are not known precisely. In principle, absorbance in the UV bands of aromatic residues can be used to establish concentration, but reliable absorbance coefficients (ε) would be needed for all proteins in the set. In this study, each protein was measured for a set of independently prepared solutions with different percentages of D2O. Each of these spectra has bands that vary in intensity with the degree of deuteration. The spectra were obtained by careful subtraction of the solvent spectra, which contain interfering bands originating from various amounts of H2O, D2O, and HOD. These difficulties suggest that a scaling process should be applied to all the spectra within each set before any analysis. In this work we present results using the method of scaling employed in our previous studies (23, 24), whereby the absorbance of the amide I–I′ band maximum was scaled to 1. Another method, in which the spectra were scaled according to our best estimate of the experimental concentration (as obtained from the weight of protein sample dissolved) and otherwise analyzed identically, did not work well and will be discussed only as a point of comparison later in the paper.

The spectra, P_i, for the deuteration series of a given protein k were truncated to 1800–1350 cm⁻¹, a region that encompasses the amide I, I′, II, and II′ bands. Each set of spectra, P_i, was analyzed with the PC/FA method to yield a set of orthogonal spectral components, s_j. These spectral components will henceforth be denoted as H/D components in the text. The components are rank-ordered according to their significance in reconstructing the experimental spectra as their linear combination. Succinctly, the set of H/D experimental spectra for protein k can be expressed as
$$[P_i]_k = [s_j]_k\,[C_{ij}]_k, \qquad [1]$$

where s_j is the jth H/D component spectrum of protein k and C_ij is its loading in the experimental spectrum P_i. Each loading can be further divided into its intensity and bandshape contributions as

$$C_{ij} = N_i\,c_{ij}, \qquad [2]$$

where c_ij is the bandshape contribution to the C_ij loading and

$$N_i = \sqrt{\int P_i^{*}\,P_i \, d\nu}. \qquad [3]$$
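The first-stage decomposition in Eqs. [1]–[3] can be sketched numerically as follows. This is a minimal illustration and not the authors' PC/FA software: a singular value decomposition is swapped in as a stand-in for the factor analysis, the function name `pcfa_series` is hypothetical, the 1600–1700 cm⁻¹ window used to locate the amide I–I′ maximum is an assumption, and the integral in Eq. [3] is approximated by a sum on a uniformly spaced wavenumber grid.

```python
import numpy as np

def pcfa_series(wavenumbers, spectra, lo=1350.0, hi=1800.0):
    """Decompose one protein's deuteration series (rows = solvent %D).

    Returns the truncated axis, the scaled spectra P, the orthonormal
    component spectra s (rows s_j), loadings C (C[i, j]), norms N_i,
    and reduced (bandshape) loadings c = C / N.
    """
    nu = np.asarray(wavenumbers, dtype=float)
    P = np.asarray(spectra, dtype=float)

    # Truncate to the 1800-1350 cm^-1 region used in the text.
    keep = (nu >= lo) & (nu <= hi)
    nu, P = nu[keep], P[:, keep]

    # Scale each spectrum so the amide I-I' maximum equals 1
    # (the 1600-1700 cm^-1 search window is an assumption).
    amide1 = (nu >= 1600.0) & (nu <= 1700.0)
    P = P / P[:, amide1].max(axis=1, keepdims=True)

    # SVD stand-in for PC/FA: P = (U * w) @ Vt, i.e. Eq. [1] with
    # component spectra s_j as rows of Vt, ordered by significance.
    U, w, Vt = np.linalg.svd(P, full_matrices=False)
    s = Vt
    C = U * w

    # Eq. [3] on a uniform grid: N_i = sqrt(sum(P_i^2) * d_nu).
    d_nu = abs(nu[1] - nu[0])
    N = np.sqrt(np.sum(P * P, axis=1) * d_nu)

    # Eq. [2]: C_ij = N_i * c_ij.
    c = C / N[:, None]
    return nu, P, s, C, N, c
```

Calling `pcfa_series(nu, spectra)` on a 6 × n array (one row per H2O:D2O ratio) returns six rank-ordered components; under the description in the text, only the first two or three would be carried forward.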
This first stage of the calculation thus produced 23 sets of H/D component spectra, s_j, each ordered according to their significance for each protein in the training set. Next, the jth H/D components, s_j (the same one in order from each protein in the training set), were grouped separately for j = 1, 2, 3 to form the basis for a second series of PC/FA computations. A new set of spectral subcomponents S_m then results for this reduced spectral set, where m is the index of the subcomponent:

$$[s_j]_k = [S_m]_k\,[L_{jm}]_k. \qquad [4]$$
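The second decomposition in Eq. [4] can be sketched in the same way. This is again only an illustration under stated assumptions: SVD stands in for PC/FA, the function name `second_stage_fa` and the `n_keep` parameter are hypothetical, and the input rows (the jth component of each protein) are assumed to share a common wavenumber grid and a consistent sign convention.

```python
import numpy as np

def second_stage_fa(components, n_keep=None):
    """Factor-analyze the j-th H/D component collected over all proteins.

    components: array (n_proteins, n_points); row k is s_j of protein k.
    Returns subcomponent spectra S (rows S_m) and loadings L (L[k, m]),
    so that components ~= L @ S (exact when all subcomponents are kept).
    """
    X = np.asarray(components, dtype=float)
    U, w, Vt = np.linalg.svd(X, full_matrices=False)
    S = Vt[:n_keep]            # training-set subcomponent spectra S_m
    L = (U * w)[:, :n_keep]    # "reduced spectral set loadings" L_jm
    return S, L
```

The columns of `L` are the quantities that would then be regressed against the FC values; truncating with `n_keep` keeps only the most significant subcomponents.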
This procedure is possible because each spectral component s_j describes one aspect of the response of a given protein spectrum to H–D exchange. Another set of loadings, L_jm, is generated from this second principal component decomposition on the s_1, s_2, and s_3 datasets. It is these L_jm, the "reduced spectral set loadings," that we have correlated with the variation in structure for the proteins in the training set. This method is similar to the initial steps in previous approaches (8, 9, 14, 19, 24) to FTIR bandshape analysis, with the exception that for each set, s_j, we are utilizing only part of the absorption bandshape. For example, the s_1 set has the components common to both the H and D forms of the protein, while the s_2 set has the components most sensitive to change in deuteration. Multiple linear regressions, testing all possible combinations of the m subcomponent loadings for the 19 "known" proteins, were calculated using the loadings, L_jm, obtained from the second factor analysis of various sets of H/D components with each of the FC values. Using our RMR approach (22–24), the prediction abilities of these various regressions were tested with the "one-left-out" method, in which one protein is systematically removed before developing a regression relation between the loadings, L_jm, and the fractional components, FC_ζ, where ζ corresponds to helix, sheet, etc.
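The one-left-out test over all combinations of loadings can be sketched as below. This is a hedged illustration rather than the authors' RMR code: the function names, the inclusion of an intercept term, the cap `max_terms` on the number of loadings per regression, and the use of the root-mean-square prediction error as the selection criterion are all assumptions made for the example.

```python
import itertools
import numpy as np

def loo_error(L_sub, fc):
    """RMS one-left-out prediction error for a linear fit of fc on L_sub."""
    n = len(fc)
    errors = np.empty(n)
    for k in range(n):
        train = np.arange(n) != k                    # drop protein k
        A = np.column_stack([np.ones(train.sum()), L_sub[train]])
        coef, *_ = np.linalg.lstsq(A, fc[train], rcond=None)
        predicted = np.concatenate([[1.0], L_sub[k]]) @ coef
        errors[k] = predicted - fc[k]                # predicted - X-ray
    return np.sqrt(np.mean(errors ** 2))

def best_loading_subset(L, fc, max_terms=3):
    """Search every combination of up to max_terms loading columns."""
    L = np.asarray(L, dtype=float)
    fc = np.asarray(fc, dtype=float)
    best_err, best_cols = np.inf, ()
    for r in range(1, max_terms + 1):
        for cols in itertools.combinations(range(L.shape[1]), r):
            err = loo_error(L[:, list(cols)], fc)
            if err < best_err:
                best_err, best_cols = err, cols
    return best_err, best_cols
```

Running `best_loading_subset` once per structural type (helix, sheet, and so on) mirrors the independence of the determinations described in the text: each type receives its own optimal set of loadings.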
The predicted FC values for each "left-out" protein are calculated, and the standard deviations of all such predictions from the Kabsch–Sander (74) X-ray-derived FC values are used as the judgment criteria. These one-left-out tests were restricted to the 19 proteins for which good X-ray crystal structure analyses were available. The optimal combination of spectral components for the prediction of FC values was selected by a complete search over all possible combinations and, beginning with the same set of spectral components, was carried out independently for all secondary structures considered. It is this extensive test with the training set data that then results in a relatively simple algorithm for future application to structure determination of unknown proteins. It should be clear that each analysis (e.g., helix using s_1, sheet using s_2, or any other combination) is independent. Thus, their comparison can identify which H/D spectral components are most useful for determination of each structural component.

In addition, as a test of the method, multiple linear regression predictions of FC values were also undertaken with a protein elimination method that can be seen to be equivalent to the initial step of the successful variable selection method of Johnson and co-workers (3). An H/D component, s_q1 (q = 1, 2, ..., 19), of one protein with known X-ray structure was systematically removed only during the RMR analysis step (i.e., it remained part of the FA step), and the impact on prediction error was analyzed. This was undertaken to determine whether any proteins in the data set significantly degrade prediction accuracy for the rest of the data set. In this manner, one could possibly delineate artifacts in the measurements and calculations or identify errors contributed by disagreement between the spectral data sets and between the optical spectra and the X-ray data.

RESULTS
Spectral data. The effects of the perturbation due to the exchange of the labile hydrogens of the protein with the deuterium in the solvent are manifested as changes in the band frequencies, shapes, and intensities. By way of example, Fig. 1a shows a set of hemoglobin spectra illustrating the mid-IR absorbance response to variable D2O content (actually to the H:D ratio in the solvent). A slight monotonic shift to lower wavenumber could be observed in the amide I–I′ band as the amount of deuterium increased. In comparison, the amide II to II′ shift was about 100 cm⁻¹, from 1550 to 1450 cm⁻¹, upon deuteration. The amide II band intensity decreased as the percentage of deuterium increased, while the amide II′ band intensity increased. The difference spectra shown in Fig. 1b highlight the frequency shifts in this amide I–I′ band and the systematic intensity changes of amide II and II′. These were calculated by subtracting the protein spectrum obtained in H2O from each of the other spectra in the set.

FIG. 1. (a) FTIR and (b) DFTIR absorbance spectra of hemoglobin at different D:H ratios. The arrows indicate the direction of the change in the intensity of the peaks as the ratio increases.

Analysis. Despite the variation due to hydrogen exchange, a high degree of correlation between the absorbance spectra for each protein at different H:D ratios persisted. The correlation matrix (overlap integrals) used to carry out the PC/FA (75, 76) over the H–D exchange set for a given protein reflects this, having values close to 0.9. From the first factor analysis, the eigenvalue corresponding to the most dominant H/D component indicated that it contributed from 65 to 97% of the spectral bandshape for the 23 different protein H/D sets of spectra analyzed. On the other hand, only a 3–35% contribution to the spectra came from the second component, which represents the major H–D exchange-correlated variation in bandshape, and the third component hardly added to the variation (usually less than 1%). The H/D components for the hemoglobin spectra from Fig. 1a are shown in Fig. 2. The first H/D component, s_1, consequently looks like a typical absorbance spectrum of a partially exchanged protein, having observable intensity at both the amide II and II′ bands. For the second H/D component, s_2, derivative shapes centered on frequencies corresponding to the amide bands appear, with that in the amide II–II′ region being the most prominent since that is the largest intensity change on deuteration. The third H/D component, s_3, had few well-defined features that could be assigned to particular bands, compared with the first two component sets, and was also less consistent in shape among the 23 proteins.

Plots of the loadings, C_ij, for hemoglobin are shown in Fig. 3 to demonstrate their variation with the fraction of bulk deuterium. Since only the first three components were significant, the discussion of the response with respect to the bulk deuterium (%D) will be confined to them. The loadings for s_1, C_i1, change little with deuteration, typically varying within 10% of the mean value, while the other two loadings undergo a full sign reversal and show a systematic change with %D. For C_i2 and C_i3, the trends with %D are very similar for all the proteins. If the reduced loadings c_ij are plotted against %D (not shown), the plots are virtually unchanged other than a scale shift, since the norms, N_i, for hemoglobin (shown in Fig. 4 and compared to two other proteins with different responses) vary by only about 10%. However, this correction removes the effects of overall intensity variations (encoded in N_i) and emphasizes the quantitative description of the bandshape changes. The variations in the norms, N_i, are due largely to errors in concentration, which are systematic (going through a minimum for some H:D ratio of the solvent) when the spectra are scaled to A = 1 at the amide I–I′ band (Fig. 4b).
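The overlap-integral correlation matrix and the per-component contributions mentioned in the Analysis paragraph can be illustrated with a short sketch. The function names are hypothetical, the normalization of the overlaps to unit diagonal is an assumption, and expressing each component's contribution as its eigenvalue divided by the eigenvalue sum is one plausible reading of the text rather than a confirmed definition.

```python
import numpy as np

def overlap_matrix(P):
    """Normalized overlap integrals between spectra (rows of P)."""
    P = np.asarray(P, dtype=float)
    G = P @ P.T                      # unnormalized overlaps
    d = np.sqrt(np.diag(G))
    return G / np.outer(d, d)        # unit diagonal

def eigenvalue_fractions(P):
    """Fraction of the eigenvalue sum carried by each component."""
    vals = np.linalg.eigvalsh(overlap_matrix(P))[::-1]  # descending
    return vals / vals.sum()
```

For a deuteration series whose spectra are nearly identical in shape, the off-diagonal overlaps approach 1 and the first fraction dominates, which is the qualitative behavior reported for s_1.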
FIG. 2. H/D component spectra s_1 (most dominant), s_2, and s_3 from FTIR absorbance spectra of hemoglobin as calculated using PC/FA. Note the large overall intensity differences between them.
FIG. 3. The different loadings, C_ij, corresponding to the H/D component spectra of hemoglobin shown in Fig. 2, plotted against solvent D:H ratios.
However, if the spectra are normalized to concentrations determined by weight and analyzed with the same PC/FA method separating the norm, N_i, from the bandshape contribution, the N_i values show a larger degree of random error that is corrected out from the reduced loadings, c_ij. In either case, the higher loadings are variances from the average; hence, c_i2 and c_i3 follow the same trend as C_i2 and C_i3, showing that N_i had minimal impact on them. The C_i2 loadings appear to be simple linear functions of the %D in Fig. 3, but for other proteins this dependence is more complex, having a variable degree of nonlinear component, as shown for the five examples in Fig. 5.

FIG. 4. Comparison of the variation of the norms, N_i, with respect to solvent D:H ratios for some protein sets (ADH, alcohol dehydrogenase; SOD, superoxide dismutase; HEM, hemoglobin) when the protein spectra are scaled according to (a) concentration and (b) amide I–I′ band intensity.
FIG. 5. Comparison of the behavior of the loadings, C_i2, of different proteins (CAN, concanavalin A; CGN, chymotrypsinogen; PTI, trypsin inhibitor; RNA, ribonuclease A; MYO, myoglobin) with the change in solvent D:H ratio.

The s_1 for different proteins are still similar with respect to bandshape, but the second factor analysis identifies the subtle differences in them. The same approach was undertaken for s_2 and s_3, which had larger variation between proteins. We use these differences, expressed as loadings, L_jm, of the new (training set) component spectra, S_m, as input to an RMR scheme to find the optimal correlation with protein secondary structure as represented by the FC values. However, before proceeding, each set of components s_j, j = 1, 2, 3, was phase corrected (multiplied by ±1) to be consistent in sign pattern, since the phases of the FA spectral components are arbitrary in the H–D spectral decomposition. To this end, consistency of the bandshape features and dependence of the loadings on H:D ratio were used as judgment criteria. For example, all s_2 spectral components were phased as in Fig. 2 so that the amide II is positive and the amide II′ is negative.

RMR computations were then used to predict secondary structures for each protein in the set using a one-left-out procedure with varying combinations of PC/FA loadings, L_jm (23, 24). The optimal predictions were determined for the set of loadings that gave the lowest error in prediction for the entire set. Table 2 shows the errors in prediction with the RMR method on the set of s_1 H/D components for each protein in the set, as well as the deviation of the sum of predictions from 100% total. The standard deviations obtained for the set of helix, sheet, etc. predictions for this training set of proteins are given in Table 3 (those corresponding to Table 2 are designated as s_1), together with results of similar FA/RMR calculations based on other types of spectral data sets. Surveying the results in Table 2, it is clear that most of the error arises from a few proteins (for helix, superoxide dismutase, myoglobin, concanavalin, lysozyme, and carbonic anhydrase; for sheet, -immunoglobulin, alcohol dehydrogenase, trypsin inhibitor, superoxide dismutase, cytochrome c, and α-chymotrypsin). Such a pattern suggests that a variable selection method may be useful for these H/D exchange data (3, 21), as discussed below. Overall, the average secondary structure prediction error is within ~5% for all except superoxide dismutase, myoglobin, lysozyme, and -immunoglobulin; the largest deviations from 100% total are for concanavalin A, lysozyme, trypsin inhibitor, and myoglobin, with the remainder being within 7%.

Comparison of the Table 3 results for s_1, s_2, and s_3 shows that the individual H/D components, when used for the FA/RMR, have decreasing predictive ability (higher standard deviation) for helix and sheet as one proceeds from s_1 to s_3. On the other hand, the second and third components do relatively better for the minor structures, such as turns and bends.

We compared the results of these calculations, in terms of standard deviation of predictions, with similarly obtained standard deviations using instead spectra from FTIR and DFTIR in H2O and in D2O (Table 3) as the basis for the RMR analysis. As before (23, 24), the DFTIR spectra were calculated as differences between the average absorbance spectrum within the training set and that of each protein. For comparison, ECD spectra from our previous dataset (23) were also used to make secondary structure predictions with the same FA/RMR method on the same training set. A convenient means of comparing the datasets in terms of the quality of their prediction of the FC values is to scale the average prediction error of each spectral set to that of the technique with the minimum error for each secondary structure component, as follows:

$$\text{relative error} = \frac{\text{data set error} - \text{minimum error}}{\text{minimum error}} \times 100. \qquad [5]$$
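As a small worked illustration of Eq. [5], the sketch below scales a set of prediction errors to the smallest one. The function name and the numerical values are placeholders chosen for the arithmetic only; they are not taken from Table 3.

```python
def relative_errors(errors):
    """Apply Eq. [5] to a dict of {data set name: prediction error}."""
    minimum = min(errors.values())
    return {name: (err - minimum) / minimum * 100.0
            for name, err in errors.items()}

# Placeholder values, not measured results.
example = relative_errors({"set A": 4.0, "set B": 6.0, "set C": 8.0})
```

With these placeholder inputs, the best-performing set scores 0 and a set with twice the minimum error scores 100, so the scale reads directly as "percent worse than the best technique" for that structural type.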
These relative errors are presented in Table 4. For all the methods compared, where each dataset is analyzed with the same FA/RMR method, we found the lowest prediction error for helix from the first component, s_1, of the H/D exchange FTIR spectral set presented here. While it is significantly better than the standard deviation obtained with ECD spectra, which is a first in terms of the comparative predictability of FTIR and ECD spectra for secondary structure using the same algorithm, the helix predictions using H/D spectral components are even better (>75% lower error by Eq. [5]) than any other FTIR-based predictions. The various datasets measured using FTIR spectroscopy predict sheet content with about the same error levels. Here, the H/D method has no great advantage over FTIR, but these predictions are always lower in error than the sheet predictions obtained with ECD spectra. For sheet, the s_1-based predictions are almost as good as the FTIR ones, but the s_2- and s_3-based predictions are worse. As noted above, minor structures (turns and bends), on the other hand, are generally somewhat better described by s_2 and s_3 from the H/D exchange FTIR dataset.

A previous analysis (24) utilizing protein FTIR spectra in H2O focused on just the amide I and II bands for secondary structure predictions. To test the effects of the wider frequency region encompassed in our H/D spectral components (and the effect of inclusion or removal of the amide II′ band), we truncated the spectra to 1700–1485 cm⁻¹. The results are tabulated in Table 5. Truncation of the FTIR spectra measured in aqueous solution significantly improved that dataset's predictive ability (lower error), but only for the helical fraction, and slightly degraded the sheet predictions. In contrast, truncating the s_1 H/D component spectra increased the error for helical prediction. This clearly demonstrates that the advantage of the H/D component spectra arises from their sensing the relative accessibility of protein secondary structure segments to H–D exchange, as is most profoundly evidenced in the relative amide II and II′ intensities, whereas the FTIR (H2O) dataset (which has no amide II′ component) is negatively impacted by inclusion of non-amide data in the FA/RMR.
TABLE 2
Errors in Predicted FC Values of Individual Proteins for Best Predicting Models Using the Coefficients L_1m

Protein name                   Helix a   Sheet a   Turns a   Bends a "Other" a   Total b  Average error c
Alcohol dehydrogenase              1.9      10.1      −2.2      −3.3      −7.0      −0.4              4.9
Carbonic anhydrase                −7.6       4.8      −0.6       1.7       0.5      −1.1              3.1
α-Chymotrypsinogen A              −1.3      −0.5      −2.2       3.1       0.1      −0.9              1.4
α-Chymotrypsin type II            −0.2       8.0       2.0       0.7      −3.8       6.7              2.9
Concanavalin A                    −8.2      −2.2       1.8      −2.6      −8.8     −15.6              4.7
Cytochrome c                       4.1       8.3      −0.3       3.4      −8.1      −0.9              4.8
Glutathione reductase             −4.1       2.4       3.5      −5.1       0.2      −3.1              3.1
Hemoglobin                         3.4       2.0      −4.2      −0.1      −1.2       0.0              2.2
-Immunoglobulin                    3.6     −12.5      −2.8       5.1       3.9      −2.6              5.6
Lactate dehydrogenase              5.8      −2.4      −2.1      −1.5      −1.1      −1.2              2.6
Lysozyme                           8.1      −6.8      −6.1      −6.2      −2.2     −13.1              5.9
Myoglobin                        −10.0      −3.2       8.0       4.0       9.2       8.0              6.9
Ribonuclease A                    −4.8      −3.9       2.0       0.2       8.3       1.9              3.8
Ribonuclease S                     1.8      −5.0       6.5      −1.2      −0.5       1.6              3.0
Subtilisin BPN′                   −1.6       2.8      −0.7       8.8      −3.6       5.6              3.5
Superoxide dismutase              11.1      −9.7      −2.0      −5.2       9.4       3.7              7.5
Triose phosphate isomerase         1.0       4.9      −0.2      −6.5       4.4       3.7              3.4
Trypsin                            1.0       2.2      −2.2      −1.8       0.1      −0.8              1.5
Trypsin inhibitor                  0.3       9.9       1.6       5.6      −6.7      10.8              4.8

a Secondary structure prediction error is the difference between the prediction and the X-ray-derived FC value: FC (predicted) − FC (X-ray).
b The total represents the deviation from the ideal 100%.
c Average error of FC values over the 5 secondary structure types considered: (1/5) Σ_j |FC_j (predicted) − FC_j (X-ray)|.
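As a worked check of footnote c, the short sketch below computes the average absolute error for the alcohol dehydrogenase row of Table 2. It also sums the signed errors, on the assumption that the X-ray FC values add up to 100%, so that the sum of the errors equals the deviation of the predicted total from 100% (footnote b). The function names are illustrative only.

```python
def average_abs_error(errors):
    """Footnote c: mean of |FC_j(predicted) - FC_j(X-ray)| over the structure types."""
    return sum(abs(e) for e in errors) / len(errors)


def total_deviation(errors):
    """Sum of signed errors; equals the deviation of the predicted total from
    100% if the X-ray fractions themselves sum to 100% (assumption)."""
    return sum(errors)


# Alcohol dehydrogenase row of Table 2: helix, sheet, turns, bends, "other".
adh_errors = [1.9, 10.1, -2.2, -3.3, -7.0]

adh_average = average_abs_error(adh_errors)
adh_total = total_deviation(adh_errors)
print(round(adh_average, 1), round(adh_total, 1))
```

The average reproduces the tabulated 4.9. The sum of the rounded entries is −0.5, whereas the table lists −0.4; the small difference presumably comes from rounding of the individual errors before tabulation.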
As an additional test reflecting our earlier FTIR-based analyses (24), we analyzed the difference FTIR spectra in the same manner as the absorbance spectra. The set of difference spectra exemplified by Fig. 1b, after the first PC/FA, results in the modified H/D component spectra d1 and d2 for each protein, which were then subjected to a second PC/FA over the whole training set. An RMR analysis with regard to FC values was again carried out using the one-left-out approach to generate standard deviations of prediction error, which are summarized in Table 6. The prediction errors of d1 are close to those obtained with s2 for helix and sheet. In contrast, d2 has much lower predictive ability for the major secondary structure fractions compared with the other datasets (even worse than s3) considered in this paper. Hence, for the H–D exchange dataset, a differential representation for the training set has no advantage. In essence, the best predictions were from s1, and that component of the dataset is suppressed in the difference representation, d1 and d2.

Finally, a protein elimination procedure (3) was performed for the analyses using the FTIR in H2O, the H/D exchange s1 and d1 datasets, and the ECD spectral set. The results are summarized in Table 7. The smallest prediction error for each secondary structure content is tabulated together with the protein that was selected for removal to arrive at that value. With such a selective elimination approach, error reduction for all techniques is seen, with the H/D exchange, s1, still yielding markedly lower errors than any other spectral set.

TABLE 3
Standard Deviations of Prediction of FC Values by Different Spectral Data Sets a

Data set       Helix        Sheet        Turns        Bends        "Other"
FTIR, H2O      11.07 (4)    5.74 (6)     3.35 (2)     3.65 (6)     5.51 (4)
DFTIR, H2O     10.45 (5)    6.53 (7)     3.55 (1)     4.35 (4)     4.88 (4)
FTIR, D2O      9.60 (8)     6.21 (6)     3.36 (1)     3.15 (7)     6.05 (6)
DFTIR, D2O     10.04 (6)    7.38 (5)     3.50 (4)     2.88 (7)     6.80 (4)
ECD            7.20 (1)     9.82 (1)     3.88 (1)     3.95 (2)     4.51 (3)
s1             5.49 (5)     6.48 (6)     3.49 (3)     4.30 (4)     5.53 (5)
s2             8.75 (6)     7.90 (5)     3.05 (3)     3.68 (4)     6.21 (4)
s3             11.54 (4)    10.22 (5)    2.68 (4)     2.93 (5)     5.48 (5)

a The numbers in parentheses denote the number of loadings that yielded these predictions. The same for Tables 5 and 6.

TABLE 4
Relative Errors of Prediction of FC Values by Different Spectral Datasets

Data set       Helix        Sheet        Turns        Bends        "Other"
FTIR, H2O      101.6        -            25.0         26.7         22.2
DFTIR, H2O     90.3         13.8         32.5         51.0         8.2
FTIR, D2O      74.9         8.2          25.4         9.4          34.1
DFTIR, D2O     82.9         28.6         30.6         -            50.8
ECD            31.1         71.1         44.8         37.2         -
s1             -            12.9         30.2         49.3         22.6
s2             59.4         37.6         13.8         27.8         37.7
s3             110.2        78.0         -            1.7          21.5

DISCUSSION
We have shown that measuring protein FTIR spectra as a function of deuteration can yield a dataset with more predictive ability for secondary structure (especially helix fraction) than FTIR spectra of proteins in H2O or D2O alone, if such data are systematically analyzed. In fact, these helix predictions even surpass those obtained with ECD in terms of accuracy and are well beyond ECD for β-sheet predictions.

We assume here that on deuteration only minor changes, if any, occur in the native state protein structures and, therefore, that any such effects would not present any impediment to the analysis and interpretation. It has been suspected that structural complications might arise due to D2O, but, in fact, such isotopically induced structural changes in proteins have not been determined. Goto et al. (77) did not find any noticeable deuteration effects on the protein folding of apomyoglobin, while Hildebrandt et al. (52) did find an association of the spectral changes with structural modifications in the heme pocket of cytochrome c. However, these latter variations are unlikely to be equilibrium secondary structure changes. Effects due to pH have been noted before, especially on exchange rates (36, 78, 79); but as mentioned above, in this study we have maintained a consistency with previously published work for the sake of comparative analysis, so that pH effects were not considered.

The goal of this study was to determine the spectral effects of H/D exchange for the slowly exchanging portions of the proteins with the hope that they might distinguish between various structural components. While such slow exchange might have a minimal impact in kinetic studies, it could be expected to have a manifest effect in this equilibrium method.

A major problem with use of FTIR spectra in general for protein secondary structure analyses is that all the spectra are very similar, since they lack the differential sensitivity characteristic of CD or some other differential type of spectra (24). The success of the FTIR method in previous applications was in large part dependent on its exquisite S/N, which allows reliable detection of small spectral variations. H/D exchange offers a means of differentiation between proteins, yet the s1 components are still quite similar. Here the variation of the other components is larger. Hence, one might ask: what has been gained by all these added data and processing if, in fact, the s1 gives the best predictions of secondary structure (Tables 2 and 3)? In essence, we think that this process removes spectral "noise," not noise from the instrumentation but spectral changes resulting from structural variation, between helical segments in particular, over the whole training set. This separation has another, perhaps more important, effect: that of separating the H/D-sensitive spectral components from the commonality in spectral response.
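The separation of a common spectral response from the H/D-sensitive parts can be illustrated with a minimal sketch. It assumes that the first factor analysis can be approximated by an uncentered singular value decomposition of the six spectra measured for one protein; whether this matches the authors' PC/FA procedure in detail is an assumption. The spectra are synthetic Gaussian bands with arbitrary positions, widths and intensities, not measured data, and `hd_factor_analysis` is an illustrative name.

```python
import numpy as np


def gaussian(x, center, width=15.0):
    """Gaussian band used to build synthetic spectra (not real data)."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)


def hd_factor_analysis(spectra):
    """Uncentered SVD of an (n_levels, n_points) set of spectra.

    Returns orthonormal component spectra (rows) and the loadings of each
    component in each measured spectrum.
    """
    u, s, vt = np.linalg.svd(spectra, full_matrices=False)
    return vt, u * s


wavenumber = np.arange(1400.0, 1751.0)   # cm^-1, hypothetical grid
deuteration = np.linspace(0.0, 1.0, 6)   # six H-D levels, hypothetical

# Synthetic set: a fixed amide I-like band, an amide II-like band that
# decays with deuteration, and an amide II'-like band that grows.
spectra = np.array([
    gaussian(wavenumber, 1650.0)
    + 0.6 * (1.0 - 0.7 * d) * gaussian(wavenumber, 1550.0)
    + 0.35 * d * gaussian(wavenumber, 1450.0)
    for d in deuteration
])

components, loadings = hd_factor_analysis(spectra)

# Cosine similarity between the first component and the mean spectrum.
mean_spectrum = spectra.mean(axis=0)
cosine = abs(components[0] @ mean_spectrum) / np.linalg.norm(mean_spectrum)
print(f"cosine(first component, mean spectrum) = {cosine:.5f}")
```

In this toy set the first component is nearly identical in shape to the average spectrum over the deuteration levels, while the second component carries the deuteration-dependent change, which mirrors the roles assigned to s1 and the higher components in the text.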
Thus, H/D FA may reduce the variation in the spectral components most common among the set, i.e., s1, for spectral contributions from different helical types (exposed and buried) in the training set, separate out as s2 and s3 contributions from the most H/D-sensitive parts, which are normally surface turns and loops, and consequently lead to the improved helix fraction prediction we have indeed found (Table 3).

TABLE 5
Standard Deviations of Prediction of FC Values by Different Spectral Datasets Showing the Effects of Truncation of the Spectra a

Data set                      Helix        Sheet        Turns        Bends        "Other"
FTIR, H2O                     11.07 (4)    5.74 (6)     3.35 (2)     3.65 (6)     5.51 (4)
FTIR, H2O (1750–1485 cm⁻¹)    6.84 (6)     6.66 (4)     3.54 (3)     2.89 (6)     4.92 (4)
s1                            5.49 (5)     6.48 (6)     3.49 (3)     4.30 (4)     5.53 (5)
s1 (1750–1485 cm⁻¹)           8.24 (6)     5.94 (4)     3.69 (1)     2.04 (7)     5.41 (4)

a The number in parentheses as in Table 3.

TABLE 6
Standard Deviations of Prediction of FC Values of the HX-DFTIR a

Data set       Helix        Sheet        Turns        Bends        "Other"
d1             8.73 (4)     7.98 (2)     3.57 (3)     2.51 (5)     5.31 (6)
d2             17.05 (4)    10.61 (3)    3.34 (3)     4.33 (4)     4.82 (5)

a The number in parentheses as in Table 3.

We have pursued the effects of partial deuteration on proteins and its implications in the spectral analysis under controlled conditions. But we had to first address the concentration-determination problems of samples for IR measurements. Our normal approach (24) is to scale the spectra according to the absorbance of amide I–I′. This same approach, to correct for concentration-pathlength variations between samples, has been employed by most groups for previous development of FTIR-based secondary structure prediction methods. In our analyses, the concentration information is convoluted in the spectral band areas, N_i, together with other factors that characterize the overall spectral intensity. Figure 4 provides examples of the changes in N_i when the set of spectra is scaled according to the amide I–I′ band intensity and according to the concentration as derived from the mass of the protein sample used to make each solution. Because of fluctuations seen in these values, N_i serves mainly to correct the coefficients c_i1 from intensity errors that do impact C_i1, while it hardly affects the coefficients of the less dominant H/D components. The second factor analysis, which precedes the structural correlation step of RMR, however, only depends on the s_j component spectra. These functions are inherently normalized since the intensity dependence is fully in the C_ij loadings. Thus, our H/D component-based analysis emphasizes only the bandshape changes accompanying hydrogen–deuterium exchange and not the intensity variations. Hence, this analysis is intrinsically immune to intensity error, at least to a first approximation.

The H/D components, s1, have retained the FTIR sensitivity to sheet structures as demonstrated by their ability to predict this secondary structure type as precisely as did the FTIR and DFTIR datasets. However, the benefit of hydrogen exchange is that, at the same time, we also achieved an improvement in helix prediction to a level higher than ECD, but utilized only one technique (FTIR) and one set of concentration and solution conditions.

H/D components resulting from the first FA, that of the spectra of each protein at different deuteration levels, encode the equilibrium response of the protein spectrum to deuterium content. In principle, controlled deuteration should highlight regions in the protein that respond more slowly to exchange, such as the buried parts of the protein, which are often the helices. By contrast, the turns and bends are often on surfaces and would readily exchange hydrogen with the solvent. The helix content information is carried in s1, as implied by the increase in the helix prediction error shown in Table 5 when a portion of the component spectrum that responded to the degree of H/D exchange, i.e., the amide II′ band, is deleted. Thus, it is the relative degree of deuteration, encompassed in the amide II–II′ intensity ratio, folded together with the known secondary structure sensitive amide I–I′ shape that leads to lower helix prediction errors for an improvement in the overall structure prediction. It is this coupling of information, much as we have shown previously for coupling amide I′ and II VCD (22, 23) and for coupling VCD and ECD (23, 24), that leads to significant improvements in structure prediction.

As noted earlier, the H/D component spectra tend to separate the commonality of bandshape from the H–D-dependent (see Fig. 5) parts. It is conceivable that recoupling these with the independent weighting possible with our RMR could even further improve predictions, just as coupling FTIR and ECD data for such analyses resulted in improved predictions (13, 19, 24).

Finally, H/D exchange spectra for four additional proteins were run through the initial PC/FA to develop component spectra that were not used in our RMR to get FC predictions.
These proteins had less well-determined crystal structures at the initiation of our study. However, general aspects of their structure are, in fact, known. Lactoferrin is well-predicted by s1 to have 38% helix and 18% sheet, which are well within our error compared to crystal structure reports (80). Lactoglobulin is predicted to have 6% helix and 32% sheet, which would be fairly low in both cases (81), but the α–β balance is correct. Albumin is known to be very highly helical (82), but we predict 52% which, though high, is not high enough. Similarly, casein, which is widely thought to not have a well-defined secondary structure, is predicted by this method to contain 23% helix and 23% sheet. While this may represent local structure elements highlighted by the through-bond sensitivity of vibrational spectra (83, 84), it is more conservatively seen to be evidence of failure associated with using the method beyond its training set limitations. Thus, as usual, our methods seem to work best for mixed structures and have larger errors for extreme structures, be they highly helical or deviant from the norm in globular folds.

TABLE 7
Lowest Error Levels from the Variable Selection Step with the Protein Eliminated from the Set Indicated in Parentheses a

Dataset      Helix                       Sheet                       Turns                       Bends                       Other
FTIR, H2O    7.47–9.25 (LDH, LYS, MYO)   3.91–5.65 (SOD, CHT, REI)   2.76–3.07 (MYO, RNS, GRS)   2.02–3.21 (SOD, LYS, CYT)   3.88–4.37 (TPI, RNA, CYT)
ECD          6.27–6.86 (SOD, CYT, CAN)   7.98–9.54 (CYT, MYO, LYS)   3.46–3.65 (LYS, CYT, HEM)   3.29–3.71 (REI, CAN, GRS)   3.77–4.35 (STI, LDH, CYT)
d1           7.47–8.23 (SOD, LYS, TPI)   6.36–7.14 (REI, TPI, CAN)   2.78–3.19 (RNS, LYS, LDH)   1.76–2.35 (TRP, RNS, CAN)   4.01–4.52 (STI, RNA, CYT)
s1           4.41–4.83 (SOD, LYS, MYO)   5.31–5.62 (REI, SOD, STI)   2.86–3.15 (MYO, LYS, CAN)   2.35–2.56 (LYS, SBT, CYT)   4.77–4.84 (SOD, TPI, CYT)

a The names of the proteins follow the code in column 2 of Table 1.

The protein elimination method used here was intended to test if there are proteins whose impact on the RMR is statistically different from the rest of the set. We have observed prediction improvements using this method for all the datasets used in this paper. In Table 7 are listed the specific proteins whose elimination from the training set had the biggest impact on improving prediction error. For helix, superoxide dismutase shows up in almost all techniques except for FTIR in H2O, while for sheet, immunoglobulin is also important except for ECD. Cytochrome c appears in ECD for almost all the structure types and in all the datasets for "other." These discrepancies between spectral and X-ray-derived FC values and between spectral datasets, which show up as a decrease in error levels upon application of the limited-variable-selection-like step, might have arisen due to various factors, which could include prosthetic group effects, protein aggregation problems, and structural differences between solution and X-ray structures.
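The logic of the elimination test can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' RMR implementation: an ordinary least-squares fit on a fixed set of loadings stands in for the restricted regression, the root-mean-square leave-one-out error stands in for the standard deviation of prediction, and only single-protein removal is tried. All data are synthetic, and the names `loo_rms_error` and `best_single_elimination` are illustrative.

```python
import numpy as np


def loo_rms_error(loadings, fc):
    """Root-mean-square leave-one-out prediction error of a linear fit
    fc ~ intercept + loadings (stand-in for the one-left-out test)."""
    n = len(fc)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        design = np.column_stack([np.ones(n - 1), loadings[keep]])
        coef, *_ = np.linalg.lstsq(design, fc[keep], rcond=None)
        errors[i] = coef[0] + loadings[i] @ coef[1:] - fc[i]
    return float(np.sqrt(np.mean(errors ** 2)))


def best_single_elimination(loadings, fc, names):
    """Remove each protein in turn and report the removal that gives the
    lowest leave-one-out error on the remaining set."""
    scores = {}
    for j, name in enumerate(names):
        keep = np.arange(len(fc)) != j
        scores[name] = loo_rms_error(loadings[keep], fc[keep])
    best = min(scores, key=scores.get)
    return best, scores[best]


# Synthetic training set: 19 hypothetical proteins, 3 loadings each.
rng = np.random.default_rng(0)
names = [f"P{k:02d}" for k in range(19)]
loadings = rng.normal(size=(19, 3))
fc = 30.0 + loadings @ np.array([8.0, -4.0, 2.0]) + rng.normal(scale=1.0, size=19)
fc[7] += 30.0  # one deliberately inconsistent entry

full_error = loo_rms_error(loadings, fc)
removed, reduced_error = best_single_elimination(loadings, fc, names)
print(removed, round(full_error, 2), round(reduced_error, 2))
```

In this toy case the deliberately inconsistent entry is the one whose removal lowers the error most, which is the kind of statistically distinct protein the elimination test is meant to flag.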
Our basic analysis (reported in Tables 2–4), however, retained all the proteins, as our method selects out those components of the spectra that are structurally sensitive (23) and, using RMR, chooses the optimal set of loadings of those components to best characterize the structure fractions.

ACKNOWLEDGMENTS

Parts of this work were previously supported by a grant from NIH (GM30147) and by funds from the UIC Campus Research Board. We wish to thank a referee who noted a mismatch of PDB file and one protein used for our data; the correction improved the predictions.
REFERENCES

1. Williams, R. W., and Dunker, A. K. (1981) J. Mol. Biol. 152, 783–813.
2. Hennessey, J. P., and Johnson, W. C., Jr. (1981) Biochemistry 20, 1085–1094.
3. Manavalan, P., and Johnson, W. C., Jr. (1987) Anal. Biochem. 167, 76–85.
4. Berjot, M., Marx, J., and Alix, A. J. P. (1987) J. Raman Spectrosc. 18, 289–300.
5. Yada, R. Y., Jackman, R. L., and Nakai, S. (1988) Int. J. Peptide Protein Res. 31, 98–108.
6. Bussian, B. M., and Sander, C. (1989) Biochemistry 28, 4271–4277.
7. Van Stokkum, I. H. M., Spoelder, H. J. W., Bloemendal, M., Van Grondelle, R., and Groen, F. C. A. (1990) Anal. Biochem. 191, 110–118.
8. Lee, D. C., Haris, P. I., Chapman, D., and Mitchell, R. C. (1990) Biochemistry 29, 9185–9193.
9. Dousseau, F., and Pezolet, M. (1990) Biochemistry 29, 8771–8779.
10. Venyaminov, S. Y., and Kalnin, N. N. (1990) Biopolymers 30, 1259–1271.
11. Pancoska, P., and Keiderling, T. A. (1991) Biochemistry 30, 6885–6895.
12. Perczel, A., Hollosi, M., Tusnady, G., and Fasman, G. D. (1991) Protein Eng. 4, 669–679.
13. Sarver, R. W., and Kruger, W. C. (1991) Anal. Biochem. 199, 61–67.
14. Sarver, R. W., and Kruger, W. C. (1991) Anal. Biochem. 194, 89–100.
15. Pancoska, P., Blazek, M., and Keiderling, T. A. (1992) Biochemistry 31, 10250–10257.
16. Arrondo, J. L. R., Muga, A., Castresana, J., and Goni, F. M. (1992) Prog. Biophys. Mol. Biol. 59, 23–56.
17. Toumadje, A., Alcorn, S. W., and Johnson, W. C., Jr. (1992) Anal. Biochem. 200, 321–331.
18. Sreerama, N., and Woody, R. W. (1993) Anal. Biochem. 209, 32–44.
19. Pribic, R., Van Stokkum, I. H. M., Chapman, D., Haris, P. I., and Bloemendal, M. (1993) Anal. Biochem. 214, 366–378.
20. Surewicz, W., Mantsch, H. H., and Chapman, D. (1993) Biochemistry 32, 389–394.
21. Sreerama, N., and Woody, R. W. (1994) J. Mol. Biol. 242, 497–507.
22. Pancoska, P., Bitto, E., Janota, V., and Keiderling, T. A. (1994) Faraday Discuss. 99, 287–310.
23. Pancoska, P., Bitto, E., Janota, V., Urbanova, M., Gupta, V. P., and Keiderling, T. A. (1995) Protein Sci. 4, 1384–1401.
24. Baumruk, V., Pancoska, P., and Keiderling, T. A. (1996) J. Mol. Biol. 259, 774–791.
25. Rahmelow, K., and Hubner, W. (1996) Anal. Biochem. 241, 5–13.
26. Baello, B. I., Pancoska, P., and Keiderling, T. A. (1997) Anal. Biochem. 250, 212–221.
27. Forato, L. A., Bernardes, R., and Colnago, L. A. (1998) Quim. Nova 21, 146–150.
28. Johnson, W. C., Jr. (1999) Proteins 35, 307–312.
29. Sane, S. U., Cramer, S. M., and Przybycien, T. M. (1999) Anal. Biochem. 269, 255–272.
30. Keiderling, T. A. (1996) in Circular Dichroism and the Conformational Analysis of Biomolecules (Fasman, G. D., Ed.), pp. 555–598, Plenum, New York.
31. Johnson, W. C., Jr. (1985) Methods Biochem. Anal. 31, 61–163.
32. Jackson, M., and Mantsch, H. H. (1995) Crit. Rev. Biochem. Mol. Biol. 30, 95–120.
33. Connelly, G. P., Bai, Y., Jeng, M. F., and Englander, S. W. (1993) Proteins 17, 87–92.
34. Englander, S. W. (1967) in Poly-α-Amino Acids: Protein Models for Conformational Studies (Fasman, G. D., Ed.), pp. 339–367, Dekker, New York.
35. Englander, S. W. (1972) Annu. Rev. Biochem. 41, 903–924.
36. Englander, S. W., and Kallenbach, N. R. (1984) Q. Rev. Biophys. 16, 521–655.
37. Englander, S. W., and Mayne, L. (1992) Annu. Rev. Biophys. Biomol. Struct. 21, 243–265.
38. Englander, S. W., Sosnick, T. R., Englander, J. J., and Mayne, L. (1996) Curr. Opin. Struct. Biol. 6, 18–23.
39. Englander, S. W., Mayne, L., Bai, Y., and Sosnick, T. R. (1997) Protein Sci. 6, 1101–1109.
40. Hvidt, A., and Nielsen, S. O. (1966) Adv. Protein Chem. 21, 287–386.
41. Heimburg, T., and Marsh, D. (1993) Biophys. J. 65, 2408–2417.
42. Goormaghtigh, E., Vigneron, L., Scarborough, G. A., and Ruysschaert, J. M. (1994) J. Biol. Chem. 269, 27409–27413.
43. Faizullin, I. A., Stupishina, E. A., Krushelnitskii, A. G., and Fedotov, V. D. (1995) J. Mol. Biol. 29, 326–331.
44. Methot, N., Demers, C. N., and Baenziger, J. E. (1995) Biochemistry 34, 15142–15149.
45. Glandieres, J. M., Calmettes, P., Martel, P., Zentz, C., Massat, A., Ramstein, J., and Alpert, B. (1995) Eur. J. Biochem. 227, 241–248.
46. Baenziger, J. E., and Methot, N. (1995) J. Biol. Chem. 270, 29129–29137.
47. Haris, P. I., Chapman, D., and Benga, G. (1995) Eur. J. Biochem. 233, 659–664.
48. Backmann, J., Schultz, C., Fabian, H., Hahn, U., Saenger, W., and Naumann, D. (1996) Proteins 24, 379–387.
49. Ludlam, C. F. C., Arkin, I. T., Liu, X. M., Rothman, M. S., Rath, P., Aimoto, S., Smith, S. O., Engelman, D. M., and Rothschild, K. J. (1996) Biophys. J. 70, 1728–1736.
50. De Jongh, H. H. J., Goormaghtigh, E., and Ruysschaert, J. M. (1997) Biochemistry 36, 13593–13602.
51. De Jongh, H. H. J., Goormaghtigh, E., and Ruysschaert, J. M. (1997) Biochemistry 36, 13603–13610.
52. Hildebrandt, P., Vanhecke, F., Heibel, G., and Mauk, A. G. (1993) Biochemistry 32, 14158–14164.
53. Englander, S., Roder, H., and Wand, A. (1985) in Protein Structure. Molecular and Electronic Reactivity (Austin, R., Ed.), pp. 139–153, Springer-Verlag, Philadelphia, PA.
54. Roder, H., Elove, G. A., and Englander, S. W. (1988) Nature 1988, 700–704.
55. Roder, H. (1989) Methods Enzymol. 176, 446–473.
56. Mayne, L., Paterson, Y., Cerasoli, D., and Englander, S. W. (1992) Biochemistry 31, 10678–10685.
57. Theriault, Y., Pochapsky, T. C., Dalvit, C., Chiu, M. L., Sligar, S. G., and Wright, P. E. (1994) J. Biomol. NMR 4, 491–504.
58. Roder, H. (1995) Nat. Struct. Biol. 2, 817–820.
59. Zhang, J., Peng, X. D., Jonas, A., and Jonas, J. (1995) Biochemistry 34, 8631–8641.
60. Kuwajima, K. (1996) FASEB J. 10, 102–109.
61. Schonbrunner, N., Wey, J., Engels, J., Georg, H., and Kiefhaber, T. (1996) J. Mol. Biol. 260, 432–445.
62. Chung, E. W., Nettleton, E. J., Morgan, C. J., Gross, M., Miranker, A., Radford, S. E., Dobson, C. M., and Robinson, C. V. (1997) Protein Sci. 6, 1316–1324.
63. Ehring, H. (1999) Anal. Biochem. 267, 252–259.
64. Mandell, J. G., Falick, A. M., and Komives, E. A. (1998) Anal. Chem. 70, 3987–3995.
65. Smith, D. L. (1998) Biochemistry (Moscow) 63, 285–293.
66. Smith, D. L., Deng, Y., and Zhang, Z. (1997) J. Mass Spectrom. 32, 135–146.
67. Smith, D. L., Zhang, Z. Q., and Liu, Y. Q. (1994) Pure Appl. Chem. 66, 89–94.
68. Resing, K. A., and Ahn, N. G. (1999) Prog. Biophys. Mol. Biol. 71, 501–523.
69. Wlodawer, A., and Sjolin, L. (1982) Proc. Natl. Acad. Sci. USA 79, 1418–1422.
70. Dong, A., Huang, P., and Caughey, W. S. (1992) Biochemistry 31, 182–189.
71. Baumruk, V., and Keiderling, T. A. (1993) J. Am. Chem. Soc. 115, 6939–6942.
72. Venyaminov, S. Y., and Kalnin, N. N. (1990) Biopolymers 30, 1259–1271.
73. Byler, D. M., and Susi, H. (1986) Biopolymers 25, 469–487.
74. Kabsch, W., and Sander, C. (1983) Biopolymers 22, 2577–2637.
75. Pancoska, P., Fric, I., and Blaha, K. (1979) Collect. Czech. Chem. Commun. 44, 1296–1312.
76. Malinowski, E. R., and Howery, D. G. (1980) Factor Analysis in Chemistry, Wiley, New York.
77. Goto, Y., Hagihara, Y., Hamada, D., Hoshino, M., and Nishii, I. (1993) Biochemistry 32, 11878–11885.
78. Parker, F. S. (1971) Applications of Infrared Spectroscopy in Biochemistry, Biology and Medicine, Plenum Press, New York.
79. Bai, Y., Milne, J. S., Mayne, L., and Englander, S. W. (1993) Proteins 17, 75–86.
80. Norris, G. E., Anderson, B. F., and Baker, E. N. (1991) Acta Crystallogr. B 47, 998–1004.
81. Brownlow, S., Cabral, J. H. M., Cooper, R., Flower, D. R., Yewdall, S. J., Polikarpov, I., North, A. C. T., and Sawyer, L. (1997) Structure 5, 481–495.
82. Geisow, M. J. (1992) Trends Biotechnol. 10, 335–337.
83. Keiderling, T. A., and Pancoska, P. (1993) in Biomolecular Spectroscopy, Part B (Hester, R. E., and Clarke, R. J. H., Eds.), Vol. 21, pp. 267–315, Wiley, Chichester.
84. Keiderling, T. A. (2000) in Circular Dichroism Principles and Applications for Chemists and Biologists (Nakanishi, K., Berova, N., and Woody, R. W., Eds.), pp. 621–666, Wiley, New York, in press.