ANALYTICAL BIOCHEMISTRY Analytical Biochemistry 321 (2003) 183–187 www.elsevier.com/locate/yabio
Accuracy of protein secondary structure determination from circular dichroism spectra based on immunoglobulin examples
Sergey Y. Tetin,a,* Franklyn G. Prendergast,b and Sergei Yu. Venyaminov b
a Abbott Laboratories, Diagnostics Division, Core R&D Biotechnology, Abbott Park, IL 60064-6016, USA
b Department of Biochemistry and Molecular Biology, Mayo Clinic and Foundation, Rochester, MN 55905, USA
Received 18 March 2003
Abstract
Strong contribution of the aromatic amino acid side chain chromophores to the far-UV circular dichroism (CD) spectra substantially distorts a relatively weak CD signal originating from β sheet, the main type of immunoglobulin secondary structure. In this study we compared the secondary structure calculated from the far-UV CD spectra with the X-ray data for three antibody Fab fragments. Calculations were performed with three different algorithms, using two sets of reference proteins. Low standard deviations between all six estimates indicate stable mathematical solutions. Despite pronounced differences in the shape and amplitude of the CD spectra, we found a strong correlation between CD and X-ray data in the secondary structure for every protein studied. The number and average length of the secondary structure elements estimated from the CD spectra closely resemble those of the X-ray data. Agreement between spectroscopic and crystallographic results demonstrates that modern methods of secondary structure calculation are resilient to distortions of the far-UV CD spectra of immunoglobulins caused by aromatic side chain chromophores.
© 2003 Elsevier Inc. All rights reserved.
Keywords: Monoclonal antibody; Secondary structure; Comparison of circular dichroism and X-ray analysis
* Corresponding author. Fax: 1-847-935-6498. E-mail address: [email protected] (S.Y. Tetin).
¹ Abbreviations used: CD, circular dichroism; PDB, Protein Data Bank.
0003-2697/$ - see front matter © 2003 Elsevier Inc. All rights reserved. doi:10.1016/S0003-2697(03)00458-5
Circular dichroism (CD)¹ spectroscopy is the most frequently used technique for evaluation of protein conformation in solution. The method has been proven to be sufficiently simple, reliable, and, in many situations, invaluable for rapid determination of protein structure or monitoring conformational changes. Typically, the CD spectra of proteins are recorded in the far-UV region (180–250 nm), the near-UV region (250–320 nm), and in the regions of ligand absorption when studying protein–ligand complexes or proteins with prosthetic groups. The far-UV CD spectrum is directly related to the protein secondary structure, due to asymmetrical packing of intrinsically achiral (planar) peptide groups. In the near-UV region a CD signal originates from specific interaction between aromatic amino acid side chains and/or their interaction(s) with peptide group(s) and, therefore,
is characteristic of the tertiary protein structure. Some bound achiral ligands and prosthetic groups demonstrate induced CD signals in the near-UV and visible spectral regions when interacting with the asymmetric protein template. All aforementioned CD effects can be observed in antibodies [1,2]. Immunoglobulin molecules contain a large number of aromatic amino acid residues and disulfide bonds. Numerous aromatic amino acids are localized in the variable domains, in and around the antibody binding site. In addition to strong absorption in the near-UV region, aromatic amino acids have even stronger absorption at shorter wavelengths. There are L_a electronic transitions of tyrosine and phenylalanine side chains around 220 nm in addition to B_b and B_a transitions of the indole chromophore of tryptophan at 225 and 200 nm. These transitions can contribute to the far-UV CD spectra of antibodies and other proteins but do not correlate with protein secondary structure. The contribution of aromatic side chains and disulfide bonds to the CD signal originating from the secondary structure affects the shape
and amplitude of immunoglobulin far-UV CD spectra. Consequently, an accurate calculation of the secondary structure of an antibody or other immunoglobulin-like protein from its CD spectrum may appear complex. Attempts to quantify the secondary structure of these proteins are rather limited [3]. Antibodies, T cell receptors, MHC molecules, and all other members of the immunoglobulin superfamily are biologically important proteins with applications in diagnostics and therapeutics.
Several methods for calculating protein secondary structure from far-UV CD spectra exist. These methods are based on different mathematical algorithms and use various sets of reference proteins with known crystallographic structure. The criteria for assigning secondary structure to reference proteins from their X-ray data also often differ. Venyaminov and Yang [4] and Greenfield [5] reviewed these methods in detail.
In this study we evaluated the accuracy of calculations of immunoglobulin secondary structure from the CD spectra. Using three antibody Fab fragments with known three-dimensional structure, we compared the secondary structures calculated independently from X-ray data and from related CD spectra in the far-UV region. The CDPro suite of programs [6] includes different mathematical algorithms and a large set of reference proteins designed for calculations of protein secondary structure from the CD data. Our study demonstrates good agreement between spectroscopic and crystallographic results for the fraction of amino acid residues included in different types of secondary structure, for the number of the secondary structure segments, and for their average length.
Materials and methods
The far- and near-UV CD spectra of the Fab fragments derived from monoclonal IgG antibodies NC6.8, NC10.14, and 4-4-20 were previously reported [2]. The CD spectra were presented in molar ellipticity units, per mole of residue (far-UV region) or per mole of protein (near-UV region).
The high-resolution X-ray structures of the Fab fragments 4-4-20 [7,8], NC6.8 [9], and NC10.14 [10] were downloaded from the Protein Data Bank (PDB) [11]; the respective PDB codes are 4FAB, 2CGR, and 1ETZ. Secondary structure assignments were performed by the Kabsch and Sander [12] method using the DSSP program. The DSSP assignments were grouped according to Sreerama et al. [13] in the same manner as for the proteins included in the reference protein sets.² Namely, all DSSP results were combined into six secondary structure classes: Hregular, Hdistorted, Sregular, Sdistorted, turns, and unordered. The classes Hregular and Sregular are the fractions of residues in the central part of all α- and 3₁₀-helical segments or strands; Hdistorted and Sdistorted are the fractions of terminal residues in helices (two at each end of the helix; total of four per helical segment) or β strands (one residue at each end of the strand; total of two per strand). The class turns combines all turns and bends that contain more than one amino acid residue; the class unordered includes residues unassigned to any defined class of secondary structure and all single residues assigned to a structure. Additionally, we summed the fractions of regular and distorted helices or β strands as Htotal and Stotal.
² Other ways to combine DSSP outputs may produce different results in calculations of the secondary structure fractions. We are thankful to the reviewer who underlined this fact.
Secondary structure was calculated from the CD data using the CDPro suite of programs [6]. The CDPro suite contains modified versions of three methods: SELCON3 [13], CONTIN/LL-CONTIN method [14] in locally linearized approximation [15], and CDSSTR [16]. All methods are based on comparison of the far-UV CD spectrum of the protein under study with CD spectra of reference proteins with known three-dimensional structure. Two sets of reference proteins were used. The first includes all 48 proteins that are typically used for such calculations [17]. The other set, selected by the CLUSTER program, contains only 11 all-β proteins, since immunoglobulins belong to this tertiary structure class. Using three methods and two sets of reference proteins, we obtained and compared six values of the calculated secondary structure for each Fab fragment. Reported results (Tables 1–3) represent the mean of the six calculated values and their standard deviations (SD). The CLUSTER program, which assigns the tertiary structure class, is based on the method of Venyaminov and Vassilenko [18] and is included in the CDPro package.
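The six-class grouping of DSSP assignments described in this section can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure actually used by the authors or by CDPro: the function name `group_dssp` is hypothetical, the input is assumed to be a per-residue string of standard DSSP codes, adjacent H and G residues are assumed to form a single helical segment, and a segment shorter than its nominal number of terminal residues is assumed to be counted as entirely distorted.

```python
from itertools import groupby

# DSSP one-letter codes: H = alpha helix, G = 3-10 helix, E = extended strand,
# T = turn, S = bend. Everything else (B, I, blank, '-') is treated as unordered.
HELIX, STRAND, TURN = {"H", "G"}, {"E"}, {"T", "S"}
CLASSES = ["H_regular", "H_distorted", "S_regular", "S_distorted", "turns", "unordered"]


def _kind(code):
    """Map a DSSP code to a coarse type: H(elix), S(trand), T(urn), U(nordered)."""
    if code in HELIX:
        return "H"
    if code in STRAND:
        return "S"
    if code in TURN:
        return "T"
    return "U"


def group_dssp(ss):
    """Return the fraction of residues in each of the six classes."""
    if not ss:
        raise ValueError("empty DSSP string")
    counts = dict.fromkeys(CLASSES, 0)
    for kind, run in groupby(ss, key=_kind):
        n = len(list(run))
        if kind == "U" or n == 1:
            # Unassigned residues and single residues assigned to a structure.
            counts["unordered"] += n
        elif kind == "T":
            # Turns and bends spanning more than one residue.
            counts["turns"] += n
        else:
            # Two terminal residues per helix end (four per helix),
            # one per strand end (two per strand); assumption: capped at n.
            ends = 4 if kind == "H" else 2
            distorted = min(n, ends)
            counts[kind + "_distorted"] += distorted
            counts[kind + "_regular"] += n - distorted
    return {name: c / len(ss) for name, c in counts.items()}


if __name__ == "__main__":
    # Toy 20-residue string, not a real protein.
    print(group_dssp("EEEEEETTHHHHHHHH--S-"))
```

As the footnote in this section notes, other ways of combining DSSP outputs give different fractions, so the choices flagged as assumptions above directly affect the result.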
Results and discussion
The far-UV CD spectra of Fab fragments derived from the monoclonal antibodies NC6.8, NC10.14, and 4-4-20 are presented in Fig. 1a. Although all three spectra exhibit shapes typical for proteins with a high content of β structure, the individual spectra differ in amplitude, contain pronounced subpeaks or spectral shoulders, and intercept the baseline at considerably different wavelengths. For example, the amplitude of the far-UV CD spectrum of Fab NC10.14 is almost double that of the other two proteins, and the spectrum has a pronounced positive peak at 233 nm. As follows from the X-ray data (Tables 1–3), such differences in amplitude cannot be explained solely by variations in secondary structure. The Fab fragments NC6.8, 4-4-20, and
Table 1
Secondary structure of Fab NC6.8 derived from the far-UV CD spectrum and X-ray analysis

Class                         X-ray    CD mean    CD S.D.
Hregular                      0.002    0.010      0.006
Hdistorted                    0.042    0.030      0.013
Htotal                        0.044    0.028      0.009
Sregular                      0.231    0.296      0.027
Sdistorted                    0.162    0.175      0.023
Stotal                        0.393    0.470      0.049
turns                         0.203    0.205      0.022
unordered                     0.360    0.295      0.034

Parameter                     X-ray    CD, mean (S.D.)
Number of helices             5        2.3 (1.1)
Average length of helix       3.8      3.6 (2.1)
Number of β strands           35       37.6 (5.2)
Average length of β strand    4.9      5.4 (0.2)
Tertiary structure class      all-β    all-β

Hregular or Sregular, fraction of residues in the central part of helical segments or strands; Hdistorted or Sdistorted, fraction of terminal residues in helices (two at each end of the helix; four per helical segment) or β strands (one residue at each end of the strand; two per strand); Htotal or Stotal, combined fractions of regular and distorted helices or strands; average length of helix or β strand, average number of residues per helix or β strand.
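The class definitions in the legend tie the fractions to the segment statistics: each helix contributes four distorted residues and each strand two, so the number of segments follows from the distorted fraction and the average length from the total fraction. The sketch below is a minimal illustration of that arithmetic, not the CDPro implementation. The function name `segment_stats` and the chain length `n_residues` are hypothetical (the tables do not list the number of residues); the average length does not depend on the chain length chosen.

```python
HELIX_ENDS = 4   # distorted residues per helical segment (two at each end)
STRAND_ENDS = 2  # distorted residues per beta strand (one at each end)


def segment_stats(distorted_fraction, total_fraction, n_residues, ends_per_segment):
    """Return (number of segments, average segment length in residues).

    distorted_fraction: fraction of residues in the distorted class
    total_fraction:     regular + distorted fraction for the same structure type
    n_residues:         chain length (illustrative input, an assumption here)
    """
    if n_residues <= 0 or ends_per_segment <= 0:
        raise ValueError("n_residues and ends_per_segment must be positive")
    n_segments = distorted_fraction * n_residues / ends_per_segment
    if n_segments == 0:
        return 0.0, 0.0
    avg_length = total_fraction * n_residues / n_segments
    return n_segments, avg_length


if __name__ == "__main__":
    # X-ray strand fractions of Fab NC6.8 from Table 1; 432 residues is an
    # illustrative chain length, not a value taken from the paper.
    n_strands, strand_length = segment_stats(0.162, 0.393, 432, STRAND_ENDS)
    print(round(n_strands, 1), round(strand_length, 2))
```

With the X-ray strand fractions of Fab NC6.8 (Sdistorted 0.162, Stotal 0.393), the average strand length evaluates to about 4.85 residues, in line with the tabulated value of 4.9.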
Table 2
Secondary structure of Fab NC10.14 derived from the far-UV CD spectrum and X-ray analysis

Class                         X-ray    CD mean    CD S.D.
Hregular                      0.002    0.011      0.009
Hdistorted                    0.045    0.026      0.017
Htotal                        0.047    0.037      0.022
Sregular                      0.323    0.309      0.037
Sdistorted                    0.189    0.167      0.029
Stotal                        0.512    0.476      0.064
turns                         0.214    0.198      0.031
unordered                     0.208    0.264      0.060

Parameter                     X-ray    CD, mean (S.D.)
Number of helices             6.0      3.0 (1.8)
Average length of helix       3.5      5.1 (1.9)
Number of β strands           39       37.0 (6.9)
Average length of β strand    6.0      5.8 (0.3)
Tertiary structure class      all-β    all-β
Table 3
Secondary structure of Fab 4-4-20 derived from the far-UV CD spectrum and X-ray analysis

Class                         X-ray    CD mean    CD S.D.
Hregular                      0.007    0.001      0.005
Hdistorted                    0.074    0.015      0.019
Htotal                        0.080    0.016      0.019
Sregular                      0.241    0.271      0.011
Sdistorted                    0.156    0.171      0.010
Stotal                        0.398    0.438      0.016
turns                         0.175    0.228      0.011
unordered                     0.347    0.315      0.019

Parameter                     X-ray    CD, mean (S.D.)
Number of helices             9.0      1.7 (1.3)
Average length of helix       3.9      4.6 (3.6)
Number of β strands           34       37.5 (1.9)
Average length of β strand    5.1      5.2 (0.2)
Tertiary structure class      all-β    all-β
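The agreement between the X-ray and CD columns of Tables 1–3 can be checked directly by taking differences. The following is a minimal sketch with the Htotal and Stotal values transcribed from the tables; the names `TABLES`, `cd_minus_xray`, and `largest_deviation` are illustrative and are not part of CDPro or of the authors' analysis.

```python
# (X-ray value, CD mean, CD S.D.) for total helix and total strand fractions,
# transcribed from Tables 1-3.
TABLES = {
    "NC6.8":   {"Htotal": (0.044, 0.028, 0.009), "Stotal": (0.393, 0.470, 0.049)},
    "NC10.14": {"Htotal": (0.047, 0.037, 0.022), "Stotal": (0.512, 0.476, 0.064)},
    "4-4-20":  {"Htotal": (0.080, 0.016, 0.019), "Stotal": (0.398, 0.438, 0.016)},
}


def cd_minus_xray(tables):
    """Signed difference (CD mean minus X-ray) per Fab and structure class."""
    return {
        fab: {cls: round(cd - xray, 3) for cls, (xray, cd, _sd) in classes.items()}
        for fab, classes in tables.items()
    }


def largest_deviation(tables, cls):
    """Return (Fab name, signed difference) with the largest |CD - X-ray| for cls."""
    diffs = cd_minus_xray(tables)
    fab = max(diffs, key=lambda name: abs(diffs[name][cls]))
    return fab, diffs[fab][cls]


if __name__ == "__main__":
    for fab, diff in cd_minus_xray(TABLES).items():
        print(fab, diff)
    print(largest_deviation(TABLES, "Htotal"))
```

A signed difference keeps the direction of the deviation visible: a negative value means the CD estimate is below the crystallographic fraction.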
NC10.14 contain similar fractions (0.39, 0.40, and 0.51, respectively) of total β structure, the main type of secondary structure in immunoglobulins. Since early studies of antibodies and Bence–Jones proteins by optical rotatory dispersion and circular dichroism [19–21], differences in spectra of individual immunoglobulins have been traditionally attributed to the aromatic side chain chromophores in an asymmetric microenvironment [1,2]. This interpretation is supported by pronounced differences in the near-UV CD spectra shown in Fig. 1b. The near-UV CD spectra of free antibodies originate only from the aromatic residue side chains and, to some extent, from disulfide bridges. Evidently, these chromophores may also produce different signals in the far-UV range and contribute to the far-UV CD signal.
Results of secondary structure calculations are presented in Tables 1–3. As indicated earlier, each CD
spectrum was analyzed by three different methods, using two sets of reference proteins. Reported are the grand means of six individual calculations for each Fab fragment. The associated standard deviations do not represent the true errors of the estimated secondary structure values but, rather, the scatter among the results of the three calculation methods with two sets of reference proteins. The fairly small standard deviations of the calculated values indicate that all combinations of algorithms and sets of reference proteins lead to similar and stable solutions. For example, the standard deviations associated with calculations of the total β structure, which forms about half of the immunoglobulin secondary structure, range from 0.016 to 0.064 for the different proteins. We found a good agreement in the secondary structure fractions calculated from the CD data and
determined by X-ray analysis for all three proteins. Calculations of the helical fraction in the Fab fragments NC6.8 and NC10.14 may also be accepted as reasonable, considering the very small content of helices in immunoglobulin molecules. The largest difference, 0.064, between the total fraction of helices calculated from the CD spectra and that from crystallographic data was found for Fab 4-4-20. In addition to the secondary structure content, programs in the CDPro package also calculate the number and average length of the secondary structure segments. The number of segments is calculated by dividing the number of residues in the distorted helical structure by four and the number in the distorted β structure by two. Dividing the total number of residues in helices or β strands (regular plus distorted) by the number of segments gives the average length of the segment. Results of these calculations are also presented in Tables 1–3. The calculated number and average length of the β strands in the Fab fragments NC6.8, NC10.14, and 4-4-20 closely match the X-ray data. We believe that this information is useful for modeling protein tertiary structure, especially when X-ray or NMR data are not available. It is important to note that the class of tertiary structure was correctly identified for all three proteins. The results obtained in this study demonstrate that an accurate estimation of the secondary structure from the far-UV CD spectra can be performed for immunoglobulins and perhaps for other proteins from the immunoglobulin superfamily.
Fig. 1. CD spectra of Fab fragments NC6.8, NC10.14, and 4-4-20 in the far-UV (a) and the near-UV (b) regions.
Conclusion
Modern methods of protein secondary structure determination using the far-UV CD spectra are generally reliable. Even in the rather complicated case of immunoglobulins, where the far-UV CD spectra are distorted by a significant contribution of the aromatic side chain and/or disulfide bond chromophores, these methods give an accurate estimate of protein secondary structure. As a rule, contributions of nonpeptide chromophores in far-UV CD spectra are assumed to be negligible in all algorithms for secondary structure calculations [4]. Several semiempirical attempts to include contributions of aromatic side chain chromophores in CD analysis have failed despite the fact that the contemporary level of CD theory allows estimation of such contributions a priori [22]. However, there are many experimental examples in which the protein far-UV CD spectra are affected by a CD signal originating from the aromatic side chains, disulfide bonds, or ligands bound to the protein molecules. As demonstrated in this study, algorithms for determining protein secondary structure from CD spectra can effectively eliminate such contributions in the far-UV CD spectra of immunoglobulins. Possibly, since nonpeptide chromophore spectra are not defined in the CD spectra of reference proteins but are inherited as spectral variations, algorithms abolish them as a kind of noise. Studies of other proteins with unusual features in far-UV CD spectra will show the versatility of such a filter.
Acknowledgments
F.G.P. and S.Y.V. were supported by National Institutes of Health Research Grant GM34847.
References
[1] S.Y. Tetin, W.W. Mantulin, L.K. Denzin, K.L. Weidner, E.W. Voss Jr., Comparative circular dichroism studies of an antifluorescein monoclonal antibody (Mab-4-4-20) and its derivatives, Biochemistry 31 (1992) 12029–12034.
[2] S.Y. Tetin, D.S. Linthicum, Circular dichroism spectroscopy of monoclonal antibodies that bind a superpotent guanidinium sweetener ligand, Biochemistry 35 (1996) 1258–1264.
[3] W.P. Vermeer, W. Norde, The thermal stability of immunoglobulin: unfolding and aggregation of a multi-domain protein, Biophys. J. 78 (2000) 394–404.
[4] S.Yu. Venyaminov, J.T. Yang, Determination of protein secondary structure, in: G.D. Fasman (Ed.), Circular Dichroism and the Conformational Analysis of Biomolecules, Plenum Press, New York, 1996, pp. 69–107.
[5] N.J. Greenfield, Methods to estimate the conformation of proteins and polypeptides from circular dichroism data, Anal. Biochem. 235 (1996) 1–10.
[6] N. Sreerama, R.W. Woody, Estimation of protein secondary structure from circular dichroism spectra: comparison of CONTIN, SELCON, and CDSSTR methods with an expanded reference set, Anal. Biochem. 287 (2000) 252–260.
[7] J.N. Herron, X.M. He, M.L. Mason, E.W. Voss Jr., A.B. Edmundson, Three-dimensional structure of a fluorescein-Fab complex crystallized in 2-methyl-2,4-pentanediol, Proteins 5 (1989) 271–280.
[8] J.N. Herron, A.H. Terry, S. Johnston, X.M. He, L.W. Guddat, E.W. Voss Jr., A.B. Edmundson, High resolution structures of the 4-4-20 Fab-fluorescein complex in two solvent systems: effects of solvent on structure and antigen-binding affinity, Biophys. J. 67 (1994) 2167–2183.
[9] L.W. Guddat, L. Shan, J.M. Anchin, D.S. Linthicum, A.B. Edmundson, Local and transmitted conformational changes on complexation of an anti-sweetener Fab, J. Mol. Biol. 236 (1994) 247–274.
[10] L.W. Guddat, L. Shan, C. Broomell, P.A. Ramsland, Z. Fan, J.M. Anchin, D.S. Linthicum, A.B. Edmundson, The three-dimensional structure of a complex of a murine Fab (NC10.14) with a potent sweetener (NC174): an illustration of structural diversity in antigen recognition by immunoglobulins, J. Mol. Biol. 302 (2000) 853–872.
[11] H.M. Berman, J. Westbrook, Z. Feng, G. Gilliland, T.N. Bhat, H. Weissig, I.N. Shindyalov, P.E. Bourne, The Protein Data Bank, Nucleic Acids Res. 28 (2000) 235–242.
[12] W. Kabsch, C. Sander, Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features, Biopolymers 22 (1983) 2577–2637.
[13] N. Sreerama, S.Yu. Venyaminov, R.W. Woody, Estimation of the number of α-helical and β-strand segments in proteins using circular dichroism spectroscopy, Protein Sci. 8 (1999) 370–380.
[14] S.W. Provencher, J. Glöckner, Estimation of globular protein secondary structure from circular dichroism, Biochemistry 20 (1981) 33–37.
[15] I.H.M. van Stokkum, H.J.W. Spoelder, M. Bloemendal, R. van Grondelle, F.C.A. Goren, Estimation of protein secondary structure and error analysis from circular dichroism spectra, Anal. Biochem. 191 (1990) 110–118.
[16] W.C. Johnson Jr., Analyzing protein circular dichroism spectra for accurate secondary structure, Proteins Struct. Funct. Genet. 35 (1999) 307–312.
[17] N. Sreerama, S.Yu. Venyaminov, R.W. Woody, Estimation of protein secondary structure from circular dichroism spectra: inclusion of denatured proteins with native proteins in the analysis, Anal. Biochem. 287 (2000) 243–251.
[18] S.Yu. Venyaminov, K.S. Vassilenko, Determination of protein tertiary structure class from circular dichroism spectra, Anal. Biochem. 222 (1994) 176–184.
[19] D.L. Ross, B. Jirgensons, The far ultraviolet optical rotatory dispersion, circular dichroism, and absorption spectra of a myeloma immunoglobulin, immunoglobulin G, J. Biol. Chem. 243 (1968) 2829–2830.
[20] K.J. Dorrington, C. Tanford, Molecular size and conformation of immunoglobulins, Adv. Immunol. 12 (1970) 333–381.
[21] T. Azuma, K. Hamaguchi, S. Migita, Denaturation of Bence Jones proteins by guanidine hydrochloride, J. Biochem. (Tokyo) 72 (1972) 1457–1467.
[22] R.W. Woody, A.K. Dunker, Aromatic and cystine side-chain circular dichroism in proteins, in: G.D. Fasman (Ed.), Circular Dichroism and the Conformational Analysis of Biomolecules, Plenum Press, New York, 1996, pp. 109–157.