Raman Optical Activity: A Tool for Protein Structure Analysis


Structure, Vol. 13, 1409–1419, October, 2005, ©2005 Elsevier Ltd All rights reserved. DOI 10.1016/j.str.2005.07.009 Raman Optical Activity: A Tool fo...



Fujiang Zhu, Neil W. Isaacs, Lutz Hecht, and Laurence D. Barron*
WestCHEM, Department of Chemistry, University of Glasgow, Glasgow G12 8QQ, United Kingdom

Summary

On account of its sensitivity to chirality, Raman optical activity (ROA), measured here as the intensity of a small, circularly polarized component in the scattered light using unpolarized incident light, is a powerful probe of protein structure and behavior. Protein ROA spectra provide information on secondary and tertiary structures of polypeptide backbones, backbone hydration, and side chain conformations, and on structural elements present in unfolded states. This article describes the ROA technique and presents ROA spectra, recorded with a commercial instrument of novel design, of a selection of proteins to demonstrate how ROA may be used to readily distinguish between the main classes of protein structure. A principal component analysis illustrates how the many structure-sensitive bands in protein ROA spectra are favorable for applying pattern recognition techniques to determine structural relationships between different proteins.

Introduction

The determination of protein structure and function remains at the forefront of biomolecular science in the postgenomic era. Although X-ray crystallography, supplemented by multidimensional NMR for smaller structures in aqueous solution, is the technique of choice in this enterprise, the majority of proteins specified by a genome are currently inaccessible to these techniques (Pusey et al., 2005; Sue et al., 2005). According to recent statistics collated from structural genomics centers worldwide (www.mcsg.anl.gov), of ~50,000 proteins cloned, to date only ~2,000 have yielded X-ray or NMR structures. A major impediment to the application of X-ray crystallography is the lack of suitable crystals. This can be due to a number of factors, the most fundamental being that the protein lacks a compact tertiary fold in its native state.
Such proteins are variously named “natively unfolded,” “intrinsically unstructured,” or “intrinsically disordered” and are now recognized as constituting an important structural class that has a variety of important functions (Uversky, 2002; Dunker et al., 2002; Dyson and Wright, 2005). There is an urgent need for techniques that can provide structural information for the enormous number of proteins, be they folded, unfolded, or partially unfolded, that are inaccessible to X-ray and NMR methods. Even for those proteins that do crystallize, it would be valuable to have techniques that provide fold information, albeit not at atomic resolution, since such data could, among other things, expedite the crystallization and structure solution process.

Techniques such as vibrational spectroscopy measured via infrared (Barth and Zscherp, 2002) and Raman (Carey, 1982; Miura and Thomas, 1995; Tuma, 2005) methods, electronic circular dichroism measured in the ultraviolet (UVCD) (Fasman, 1996) and extended into the vacuum ultraviolet (VUVCD) using synchrotron radiation (Wallace et al., 2004), and vibrational circular dichroism (VCD) (Keiderling, 2000) have all been promoted as aids to protein structure analysis. Here, we focus on the technique of Raman optical activity (ROA), which, like VCD, combines the extra sensitivity to three-dimensional structure of chiroptical methods such as UVCD with the advantages of vibrational spectroscopy and so reports on chirality associated with the 3N−6 fundamental molecular vibrational transitions, where N is the number of atoms in the molecule.

ROA measures a small difference in the intensity of vibrational Raman scattering from chiral molecules in right- and left-circularly polarized incident light or, equivalently, the intensity of a small, circularly polarized component in the scattered light using incident light of fixed polarization (Hug, 2002; Barron, 2004; Barron et al., 2004) (Figure 1). The first and second experiments are called incident circular polarization (ICP) and scattered circular polarization (SCP) ROA, respectively. ROA, using the ICP strategy, was first observed by Barron et al. (1973) and has been developed to the point where it is now an incisive probe of the structure and behavior of biomolecules in aqueous solution (Barron et al., 2000, 2004).

Raman spectroscopy provides molecular vibrational spectra by means of the inelastic scattering of visible light (Long, 2002).
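To make the inelastic shift concrete, the following minimal Python sketch converts a Raman shift in wavenumbers into the wavelength of the Stokes-scattered light, and counts the 3N−6 fundamental vibrations of a nonlinear molecule. The function names are hypothetical, and the 532 nm excitation wavelength is taken from the Instrumentation section of this article; the conversion itself is standard spectroscopy, not a procedure described by the authors.

```python
def stokes_wavelength_nm(excitation_nm: float, shift_cm1: float) -> float:
    """Wavelength (nm) of Stokes-scattered light for a given Raman shift (cm^-1)."""
    excitation_cm1 = 1.0e7 / excitation_nm      # absolute wavenumber of the laser line
    scattered_cm1 = excitation_cm1 - shift_cm1  # Stokes: the photon loses one vibrational quantum
    return 1.0e7 / scattered_cm1


def n_fundamental_vibrations(n_atoms: int) -> int:
    """Number of fundamental vibrations, 3N - 6, of a nonlinear molecule with N atoms."""
    return 3 * n_atoms - 6


if __name__ == "__main__":
    # An amide I band near 1650 cm^-1 excited at 532 nm (illustrative values).
    print(round(stokes_wavelength_nm(532.0, 1650.0), 1), "nm")
    print(n_fundamental_vibrations(3), "modes for a bent triatomic")
```

With visible excitation, the whole vibrational spectrum therefore falls within a few tens of nanometers to the red of the laser line, which is why a single visible spectrograph suffices.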
During the Stokes Raman scattering event, the interaction of the molecule with the incident visible photon of energy $\hbar\omega$, where $\omega$ is its angular frequency, can leave the molecule in an excited vibrational state of energy $\hbar\omega_v$, with a corresponding energy loss and, hence, a shift to lower angular frequency $\omega - \omega_v$, of the scattered photon. Therefore, by analyzing the scattered light with a visible spectrometer, a complete vibrational spectrum may be obtained. Conventional Raman spectroscopy has several favorable characteristics that have led to many applications in biochemistry (Carey, 1982). In particular, the complete vibrational spectrum from ~100 to 4000 cm⁻¹ is accessible on one simple instrument, and both H2O and D2O are excellent solvents for Raman studies. Since ROA is sensitive to chirality, it is able to build on these advantages by adding to Raman spectroscopy an extra sensitivity to the details of the three-dimensional structure. The ability to study aqueous solutions, with no restrictions on the size of the biomolecules, makes ROA ideal for protein structure work. The routine application of ROA in protein science in

laboratories other than our own in Glasgow has been held back by the delicate nature of the measurements. This situation has now changed with the introduction of a commercial ROA instrument, the ChiralRAMAN from BioTools, Inc., which employs the SCP strategy based on a novel design by W. Hug (Hug and Hangartner, 1999; Hug, 2002, 2003). Further development of this instrument in our laboratory to optimize it for protein samples has now reached the point at which rapid measurements of protein ROA spectra may be accomplished in nonspecialized laboratories. In this article, we present a selection of protein ROA spectra acquired with the ChiralRAMAN instrument and illustrate how this powerful new tool can be used for protein structure analysis in its own right and as a valuable complement to protein X-ray crystallography.

Figure 1. Two Equivalent ROA Experiments in Transparent Stokes Vibrational Raman Scattering at Angular Frequency $\omega - \omega_v$ in Incident Light of Angular Frequency $\omega$
(A) ICP ROA measures $(I^R - I^L)$, where $I^R$ and $I^L$ are the scattered intensities (shown here as unpolarized) in right- and left-circularly polarized incident light, respectively.
(B) SCP ROA measures $(I_R - I_L)$, where $I_R$ and $I_L$ are the intensities of the right- and left-circularly polarized components, respectively, of the scattered light using incident light of fixed polarization (shown here as unpolarized). A positive value of $(I_R - I_L)$ corresponds to a small degree of right-circular polarization in the scattered light.

Theory

The ROA Observables

The fundamental scattering mechanism responsible for ROA was discovered by Atkins and Barron (1969), who showed that interference between the light waves scattered via the molecular polarizability and optical activity tensors of the molecule yields a dependence of the scattered intensity on the degree of circular polarization of the incident light and to a circular component in the scattered light. Barron and Buckingham (1971) developed a more definitive version of the theory and introduced, as an appropriate experimental quantity, the following definition of the dimensionless circular intensity difference (CID), where $I^R$ and $I^L$ are the scattered intensities in right- and left-circularly polarized incident light, respectively:

$$\Delta = \frac{I^R - I^L}{I^R + I^L}. \tag{1}$$

In terms of the electric dipole-electric dipole molecular polarizability tensor $\alpha_{\alpha\beta}$ and the electric dipole-magnetic dipole and electric dipole-electric quadrupole optical activity tensors $G'_{\alpha\beta}$ and $A_{\alpha\beta\gamma}$, the CIDs for forward (0°) and backward (180°) scattering from an isotropic sample for incident transparent wavelengths much greater than the molecular dimensions are (Barron, 2004):

$$\Delta(0^\circ) = \frac{4\left[45\alpha G' + \beta(G')^2 - \beta(A)^2\right]}{c\left[45\alpha^2 + 7\beta(\alpha)^2\right]}, \tag{2a}$$

$$\Delta(180^\circ) = \frac{24\left[\beta(G')^2 + \tfrac{1}{3}\beta(A)^2\right]}{c\left[45\alpha^2 + 7\beta(\alpha)^2\right]}, \tag{2b}$$

where the isotropic invariants are defined as

$$\alpha = \tfrac{1}{3}\alpha_{\alpha\alpha}, \qquad G' = \tfrac{1}{3}G'_{\alpha\alpha},$$

and the anisotropic invariants are defined as

$$\beta(\alpha)^2 = \tfrac{1}{2}\left(3\alpha_{\alpha\beta}\alpha_{\alpha\beta} - \alpha_{\alpha\alpha}\alpha_{\beta\beta}\right),$$

$$\beta(G')^2 = \tfrac{1}{2}\left(3\alpha_{\alpha\beta}G'_{\alpha\beta} - \alpha_{\alpha\alpha}G'_{\beta\beta}\right),$$

$$\beta(A)^2 = \tfrac{1}{2}\,\omega\,\alpha_{\alpha\beta}\,\varepsilon_{\alpha\gamma\delta}A_{\gamma\delta\beta}.$$

We are using a Cartesian tensor notation, in which a repeated Greek suffix denotes summation over the three components, and $\varepsilon_{\alpha\beta\gamma}$ is the third-rank unit antisymmetric tensor. For the case of a molecule composed entirely of idealized axially symmetric bonds, for which $\beta(G')^2 = \beta(A)^2$ and $\alpha G' = 0$, a simple bond polarizability theory shows that ROA is generated exclusively by anisotropic scattering, and the CID expressions reduce to (Barron, 2004)

$$\Delta(0^\circ) = 0, \tag{3a}$$

$$\Delta(180^\circ) = \frac{32\,\beta(G')^2}{c\left[45\alpha^2 + 7\beta(\alpha)^2\right]}. \tag{3b}$$
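As a numerical check on the reduction from Equations 2a and 2b to Equations 3a and 3b, the short Python sketch below evaluates both CIDs from the five tensor invariants. The function and argument names are hypothetical, and the invariant values in the usage example are arbitrary illustrative numbers, not measured or computed molecular properties.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def cid_forward(alpha, g_iso, beta_alpha2, beta_g2, beta_a2, c=C_LIGHT):
    """Forward-scattering CID, Equation 2a."""
    numerator = 4.0 * (45.0 * alpha * g_iso + beta_g2 - beta_a2)
    denominator = c * (45.0 * alpha**2 + 7.0 * beta_alpha2)
    return numerator / denominator


def cid_backward(alpha, beta_alpha2, beta_g2, beta_a2, c=C_LIGHT):
    """Backscattering CID, Equation 2b (no isotropic alpha*G' contribution)."""
    numerator = 24.0 * (beta_g2 + beta_a2 / 3.0)
    denominator = c * (45.0 * alpha**2 + 7.0 * beta_alpha2)
    return numerator / denominator


if __name__ == "__main__":
    # Bond polarizability limit: beta(G')^2 == beta(A)^2 and alpha*G' == 0.
    # All values are arbitrary; c is set to 1 to keep the numbers readable.
    alpha, beta_alpha2, beta = 1.0, 1.0, 0.52
    print(cid_forward(alpha, 0.0, beta_alpha2, beta, beta, c=1.0))   # Equation 3a
    print(cid_backward(alpha, beta_alpha2, beta, beta, c=1.0))       # Equation 3b
    print(32.0 * beta / (45.0 * alpha**2 + 7.0 * beta_alpha2))       # closed form of 3b
```

In this limit the two anisotropic terms cancel in the forward direction and add in the backward direction, since 24(1 + 1/3) = 32.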

Unlike conventional Raman scattering intensities, which are the same in the forward and backward directions, ROA intensity is maximized in backscattering and is zero in forward scattering. This result leads to the conclusion that backscattering boosts the ROA signal relative to the background Raman intensity and is therefore the best experimental strategy for most ROA studies of biomolecules in aqueous solution (Hecht et al., 1989; Hecht and Barron, 1990). It was mentioned in the Introduction that, as well as the circular intensity difference, ROA is also manifest as a small, circularly polarized component in the scattered beam using incident light of fixed polarization (including unpolarized). Within the far-from-resonance approximation, measurement of this circular component (SCP ROA) as (IR − IL)/(IR + IL), where IR and IL denote the


intensities of the right- and left-circularly polarized components, respectively, of the scattered light, provides equivalent information to that from the CID measurement (ICP ROA) (Barron, 2004). Furthermore, the simultaneous measurement of both ICP and SCP ROA, called dual circular polarization (DCP) ROA, can be advantageous (Nafie and Freedman, 1989).

All of these results apply specifically to Rayleigh (elastic) scattering. For Raman (inelastic) scattering, the same basic CID expressions apply, but with the molecular property tensors replaced by corresponding vibrational Raman transition tensors between the initial and final vibrational states $n_v$ and $m_v$. Thus, $\alpha_{\alpha\beta}$, etc. are replaced by $\langle m_v|\alpha_{\alpha\beta}(Q)|n_v\rangle$, etc., where $\alpha_{\alpha\beta}(Q)$, etc. are effective polarizability and optical activity operators that depend parametrically on the normal vibrational coordinates $Q$ so that, within the Placzek polarizability theory of the Raman effect (Long, 2002), the ROA intensity depends on products such as $(\partial\alpha_{\alpha\beta}/\partial Q)_0\,(\partial G'_{\alpha\beta}/\partial Q)_0$ and $(\partial\alpha_{\alpha\beta}/\partial Q)_0\,\varepsilon_{\alpha\gamma\delta}\,(\partial A_{\gamma\delta\beta}/\partial Q)_0$. Although ab initio calculations of ROA spectra, which are usually based on the Placzek approximation, are becoming increasingly successful for small chiral molecules (Polavarapu, 1990; Ruud et al., 2002; Zuber and Hug, 2004) and may readily be performed with a recent version of the Gaussian 03 software package, they are hopelessly inadequate for ROA calculations on structures with the size and complexity of proteins.

Enhanced Sensitivity of ROA to Structure and Dynamics of Chiral Biomolecules

The normal vibrational modes of biopolymers can be complex, with contributions from local vibrational coordinates within both the backbone and the amino acid side chains. ROA cuts through the complexity of the corresponding vibrational spectra since the largest signals are often associated with vibrational coordinates that sample the most rigid and chiral parts of the structure.
These are usually within the backbone and often give rise to ROA band patterns characteristic of the backbone conformation. Polypeptides in the standard conformations defined by characteristic Ramachandran φ,ψ angles found in secondary, loop, and turn structures within proteins are particularly favorable in this respect since signals from the peptide backbone usually dominate the ROA spectrum. In contrast, the parent conventional Raman spectrum of a protein is dominated by bands arising from the amino acid side chains that often obscure the peptide backbone bands.

The time scale of the Raman scattering event (~10⁻¹⁴ s) is much shorter than that of the fastest conformational fluctuations. The ROA spectrum is therefore a superposition of “snapshot” spectra from all of the distinct conformations present in the sample at equilibrium. Since ROA observables depend on absolute chirality with respect to the arrangement of bonds, there tends to be a cancellation of contributions from structures with “opposite” chirality. The cancellation arises as a mobile structure explores the range of accessible conformations about single bonds. These factors result in ROA exhibiting an enhanced sensitivity to the dynamic aspects of protein structure. In contrast, observables that are “blind” to this chirality, such as

Figure 2. Optical Design of the Scattered Circular Polarization BioTools ChiralRAMAN Backscattering ROA Instrument The lenses are represented by double-headed arrows. (Adapted from Hug and Hangartner, 1999.)

conventional Raman band intensities, are generally additive and therefore less sensitive to conformational mobility.

Instrumentation

Since ROA intensity is maximized in backscattering, a backscattering geometry is essential for the routine measurement of ROA spectra of biomolecules in aqueous solution. Backscattering ROA spectra may be acquired by using both the ICP and SCP measurement strategies, although the designs of the corresponding instruments are completely different. A backscattering ICP ROA instrument (Hecht et al., 1999) has been used in our Glasgow laboratory for some years and has provided a large number of protein ROA spectra. Although this ICP ROA instrument design has served to establish the value of ROA in protein science and will remain useful, a completely new design of an ROA instrument with significant advantages accruing from the use of the SCP strategy has recently been developed by W. Hug (Hug and Hangartner, 1999; Hug, 2002, 2003). In particular, “flicker noise” arising from dust particles, density fluctuations, laser power fluctuations, etc. is eliminated since the intensity difference measurements required to extract the circularly polarized components of the scattered beam are taken between two orthogonal components of the scattered light measured during the same acquisition period. The flicker noise consequently cancels out, resulting in greatly superior signal-to-noise characteristics.

The basic design is illustrated in Figure 2. This corresponds to Hug’s original implementation of the SCP strategy (Hug and Hangartner, 1999); some of the details are changed in later implementations, but the basic principle remains the same. The incident visible laser beam at 532 nm from a frequency-doubled Nd/YAG laser, the initial


linear polarization of which is “scrambled” by a fast rotation of the azimuth, is deflected into the sample cell by using a very small prism. The cone of backscattered light is collimated onto a liquid crystal retarder set to convert right- and left-circular polarization states into linear polarization states with azimuths perpendicular and parallel, respectively, to the plane of the instrument, followed by an edge filter to remove the intense Rayleigh line. A beam-splitting cube then diverts the perpendicular component at 90° to the propagation direction of the parallel component, which passes through undiverted. In this way, the right- and left-circularly polarized components of the backscattered light are separated and collected into the ends of two fiber optics. Each fiber optic converts the cross-section from circular at the input end to a linear configuration at the output end that matches the entrance slit of the fast imaging spectrograph (Kaiser Holospec) containing a highly efficient single volume-holographic transmission grating, thereby enabling separate Raman spectra for the right- and left-circularly polarized components of the scattered light to be dispersed simultaneously one above the other on the chip of a backthinned CCD detector. Subtraction then provides the required SCP ROA spectrum corresponding to tiny, circularly polarized components in the Raman bands (for an achiral sample, the intensities of the right- and left-circularly polarized components of the scattered light would be identical). Small differences in the two detection channels are compensated by interconverting their function through the switching of the liquid crystal retarder from the −λ/4 to the +λ/4 state.
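The benefit of interconverting the two detection channels can be illustrated with a toy model. The Python sketch below assumes that each channel differs only by a constant multiplicative gain (an assumption made here for illustration; the real instrument and its artifact-suppression protocol are considerably more elaborate), and shows that averaging over the two retarder states removes the gain mismatch from the measured ratio. All function names and numbers are hypothetical.

```python
def detect(i_right, i_left, gain_1, gain_2, swapped):
    """Signals (channel_1, channel_2) for one retarder state.

    In the unswapped state channel 1 receives the right-circular component
    and channel 2 the left-circular one; switching the retarder swaps them.
    """
    if swapped:
        return gain_1 * i_left, gain_2 * i_right
    return gain_1 * i_right, gain_2 * i_left


def roa_and_raman(state_a, state_b):
    """Combine both retarder states into ROA (difference) and Raman (sum) signals."""
    ch1_a, ch2_a = state_a
    ch1_b, ch2_b = state_b
    roa = 0.5 * ((ch1_a - ch2_a) + (ch2_b - ch1_b))    # proportional to I_R - I_L
    raman = 0.5 * (ch1_a + ch2_a + ch1_b + ch2_b)      # proportional to I_R + I_L
    return roa, raman


if __name__ == "__main__":
    i_right, i_left = 1000.5, 999.5          # true ratio (I_R - I_L)/(I_R + I_L) = 5e-4
    gain_1, gain_2 = 1.00, 0.97              # 3% channel mismatch (illustrative)
    a = detect(i_right, i_left, gain_1, gain_2, swapped=False)
    b = detect(i_right, i_left, gain_1, gain_2, swapped=True)
    roa, raman = roa_and_raman(a, b)
    print("single state:", (a[0] - a[1]) / (a[0] + a[1]))   # dominated by the gain mismatch
    print("both states: ", roa / raman)                     # recovers the true ratio
```

In this model both combined signals carry the same factor (gain_1 + gain_2)/2, so it cancels in the ratio, whereas a single-state subtraction is swamped by a mismatch far larger than the ROA signal itself.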
A commercial instrument based on this new design that also incorporates a sophisticated artifact-suppression protocol, based on a “virtual enantiomers” approach that greatly facilitates the routine acquisition of reliable ROA spectra (Hug, 2003), has recently become available (the ChiralRAMAN from BioTools, Inc.). This instrument provides high-quality protein ROA spectra in ~2–5 hr, which is around five times faster than our home-built ICP ROA instruments, using a sample volume of ~30 μl and ~500 mW of focused laser power at the sample. Furthermore, the ChiralRAMAN instrument extends protein ROA data acquisition routinely to the low-wavenumber region from ~250–600 cm⁻¹. All the ROA spectra shown in this article were measured on the first production model of the ChiralRAMAN instrument, which has been optimized in our laboratory for protein samples.

Results and Discussion

Protein ROA Spectra Characteristic of the Basic Structural Classes

Rather than dwelling on ROA bands characteristic of secondary structure elements and how the corresponding percentages may be extracted, as is commonly done for most other spectroscopic methods, we shall go immediately to the ROA spectra of proteins belonging to the different basic structural types within the SCOP classification of protein structures (http://scop.mrc-lmb.cam.ac.uk/scop/data/scop.b.html) to illustrate how these may often be recognized immediately. Figure

3 displays the backscattered SCP ROA spectra of human serum albumin (SCOP class: all α; fold: serum albumin-like), human immunoglobulin G (all β; immunoglobulin-like), bovine ribonuclease A (α + β; RNase A-like), subtilisin Carlsberg (α/β; subtilisin-like), and the natively unfolded protein bovine β-casein, all measured in aqueous solution. MOLSCRIPT diagrams (Kraulis, 1991) of the first four are also displayed for convenience. It is immediately apparent that all five ROA spectra are quite distinct, much more so than the parent Raman spectra also shown.

As explained above, protein ROA spectra are often dominated by bands originating in the peptide backbone that directly reflect the solution conformation. Vibrations of the backbone in polypeptides and proteins are usually associated with three main regions of the Raman spectrum (Tu, 1986; Miura and Thomas, 1995). These regions comprise the backbone skeletal stretch region, ~870–1150 cm⁻¹, originating in mainly Cα–C, Cα–Cβ and C–N stretch coordinates; the amide III region, ~1230–1310 cm⁻¹, often assigned mostly to the in-phase combination of the in-plane N–H deformation with the C–N stretch; and the amide I region, ~1630–1700 cm⁻¹, that arises mostly from the C=O stretch. However, the amide III region involves much more mixing between the N–H and Cα–H deformations than previously thought, and should be extended to at least 1340 cm⁻¹ (Diem, 1993; Schweitzer-Stenner et al., 2002). This extended amide III region is particularly important for ROA studies because the coupling between N–H and Cα–H deformations is very sensitive to geometry and generates a rich and informative ROA band structure. Amide II vibrations, which occur in the region of ~1510–1570 cm⁻¹ and are assigned to the out-of-phase combination of the in-plane N–H deformation with the C–N stretch, are either very weak or are not observed at all in the conventional (nonresonance) Raman spectra of proteins (Carey, 1982), but can be prominent in the ROA spectra.
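The backbone regions listed above can be collected into a small lookup, sketched here in Python. The region boundaries are the approximate values quoted in the text; the 10 cm⁻¹ edge tolerance and all names are assumptions introduced for illustration, since the quoted limits are approximate and the amide III region is said to extend to "at least" 1340 cm⁻¹.

```python
# Approximate backbone regions (cm^-1) as quoted in the text.
BACKBONE_REGIONS = [
    ("backbone skeletal stretch", 870.0, 1150.0),
    ("extended amide III", 1230.0, 1340.0),
    ("amide II", 1510.0, 1570.0),
    ("amide I", 1630.0, 1700.0),
]


def backbone_region(wavenumber_cm1, tol_cm1=10.0):
    """Return the backbone region containing a band position, or None.

    tol_cm1 softens the region edges; it is a hypothetical parameter,
    not a value given in the article.
    """
    for name, low, high in BACKBONE_REGIONS:
        if low - tol_cm1 <= wavenumber_cm1 <= high + tol_cm1:
            return name
    return None


if __name__ == "__main__":
    for band in (1000.0, 1318.0, 1345.0, 1450.0, 1662.0):
        print(band, "->", backbone_region(band))
```

A band that falls outside every region, such as one near 1450 cm⁻¹, returns None, consistent with its assignment to side chain rather than backbone vibrations later in the article.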
Side chain vibrations also generate many characteristic Raman bands (Tu, 1986; Miura and Thomas, 1995): although these are usually less prominent in ROA spectra due to some conformational freedom that can suppress the ROA intensities (vide supra), a few side chain vibrations do give rise to useful ROA signals. The ROA band pattern for human serum albumin is very similar to that for poly(L-lysine) in a model α-helical conformation (McColl et al., 2004a), reflecting the large amount of the extended α helix contained within its fold (69.2% α helix and 0.0% β strand in PDB structure 1ao6). The ROA band pattern for human immunoglobulin G has similarities with that for poly(L-lysine) in a model β sheet conformation (McColl et al., 2003), which accords with the large amount of antiparallel β sheets within each of its 12 β sandwich fold domains based on the Greek key motif (43.0% β strand and 3.0% α helix in PDB structure 1hzh). However, there are some important differences of detail that reflect the fact that the β sheet in the model polypeptide structure is more flat and uniform than that found in typical proteins (McColl et al., 2003) as well as being fully hydrated. The ROA spectra of bovine ribonuclease A and subtilisin Carlsberg contain bands characteristic of both α helices and β sheets: the former contains 17.7% α helix


Figure 3. Backscattered SCP Raman, $I_R + I_L$, and ROA, $I_R - I_L$, Spectra
(A–E) Backscattered SCP Raman ($I_R + I_L$) and ROA ($I_R - I_L$) spectra of (A) human serum albumin, (B) human immunoglobulin G, (C) bovine ribonuclease A, (D) subtilisin Carlsberg, and (E) bovine β-casein, all in aqueous solution and recorded on the ChiralRAMAN instrument.

and 34.7% β strand according to PDB structure 1rph, and the latter contains 29.6% α helix and 17.2% β strand within its Rossmann fold according to PDB structure 1sca. The ROA spectrum of bovine β-casein, the natively unfolded character of which is well characterized (Holt and Sawyer, 1993; Syme et al., 2002), looks very similar to that of disordered poly(L-lysine), which is now thought to be made up largely of poly(L-proline) II (PPII) sequences rather than “random coil” (Shi et al., 2002a). This is reinforced by the close similarity with


the ROA spectrum of the water soluble, alanine-rich peptide AcOO(Ala)7OONH2 (McColl et al., 2004b) shown definitively by NMR and UVCD to be composed largely of PPII (Shi et al., 2002b).

There are of course a number of bands that may be assigned unequivocally to secondary structure. Examples include a couplet in the amide I region, negative on the low-wavenumber side and positive on the high, shown by both α helices and β sheets, but shifted by ~10 cm⁻¹ to higher wavenumber in the latter. Clear examples may be seen in the ROA spectra of human serum albumin (negative and positive components at ~1638 and 1662 cm⁻¹, respectively) and human immunoglobulin G (negative and positive components at ~1648 and 1677 cm⁻¹, respectively) in Figure 3. Likewise, the positive, extended amide III bands at ~1345 cm⁻¹ in human serum albumin and subtilisin Carlsberg originate in α helices, and the negative bands at ~1247 cm⁻¹ in human immunoglobulin G, bovine ribonuclease A, and subtilisin Carlsberg originate in β structure. Also, the strong-positive, extended amide III ROA band at ~1318 cm⁻¹ and the weak-positive amide I band at ~1673 cm⁻¹ in bovine β-casein are characteristic of the PPII helix.

One advantage ROA has over other spectroscopies for protein structure analysis is that resolved signatures of loops and turns appear in addition to those of secondary structure, which is why ROA band patterns often provide motif or even fold information. Examples are the positive band at ~1296 cm⁻¹ and the two negative bands at ~1347 and 1376 cm⁻¹ in the ROA spectrum of human immunoglobulin G that are characteristic of β turns (McColl et al., 2003).
Similar bands are absent from the ROA spectrum of the α/β protein subtilisin Carlsberg because the ends of the parallel β strands within its Rossmann fold are connected by extended α helix sequences rather than by β turns, whereas a clear negative band at ~1374 cm⁻¹ assigned to β turns is present in the ROA spectrum of the α + β protein bovine ribonuclease A, in which the β sheet is antiparallel. This last observation suggests a simple method for distinguishing between α/β and α + β proteins. However, it is not infallible for distinguishing between parallel and antiparallel β sheets in all-β proteins: for example, in the β helix fold, parallel strands are connected by β turns, and, indeed, large turn signatures are present in the ROA spectrum of the β helix protein P.69 pertactin (McColl et al., 2003).

Hydrated α Helices and β Sheets

ROA is unique among spectroscopic methods in having the ability to distinguish between hydrated and unhydrated α helices. A strong, sharp-positive ROA band at ~1340 cm⁻¹ in α-helical proteins such as human serum albumin has been assigned to a hydrated form of the α helix, and a positive band at ~1300 cm⁻¹ has been assigned to the unhydrated α helix (McColl et al., 2004a). One piece of evidence is that the ~1340 cm⁻¹ band, but not the ~1300 cm⁻¹ band, disappears immediately when the protein is dissolved in D2O. This indicates first that N–H deformations of the peptide backbone make a significant contribution to the generation of the ~1340 cm⁻¹ ROA band (because the corresponding N–D deformations contribute to normal modes in a spectral region several hundred wavenumbers lower) and, second, that, in proteins and viruses, the corresponding sequences are exposed to solvent, rather than being buried in hydrophobic regions where amide protons can take months or even years to exchange.

Guided by studies of water molecules in high-resolution protein X-ray crystal structures, it has been suggested that the positive ~1300 and 1340 cm⁻¹ ROA bands assigned to unhydrated and hydrated α helices may correspond, respectively, to the canonical form of α helices and to a more open variant, in which the C=Oᵢ group, already engaged in intrachain helix hydrogen bonding to N–Hᵢ₊₄, forms a hydrogen bond with an external water molecule (McColl et al., 2004a). This external type of backbone hydration is present on the hydrophilic side of the amphipathic α helix, which has a distinct hydrophobic side protected from water and a distinct exposed hydrophilic side, and leads to helix bending due to slightly different Ramachandran φ,ψ angles on the two sides (Blundell et al., 1983). The ability of ROA to distinguish hydrated from unhydrated α helices is valuable in studies relating structure to behavior. For example, the hydrated α helix has a greatly enhanced susceptibility to unfolding (Sundaralingam and Sekharudu, 1989), which ROA studies have suggested may be primarily to PPII structure rather than the random coil, and which may have implications for amyloid fibril formation in certain situations (Blanch et al., 2000; Barron et al., 2002).

There are hints that ROA may similarly be able to discriminate between hydrated and unhydrated β sheets via negative bands at ~1220 and 1240 cm⁻¹, respectively (McColl et al., 2003). For example, β sheet poly(L-lysine) in aqueous solution, which is expected to be fully hydrated, shows a strong-negative ROA band at ~1218 cm⁻¹, but no negative band at ~1240 cm⁻¹.
Low-Wavenumber ROA

Thanks to the ChiralRAMAN instrument, we have been able to acquire protein ROA spectral data in the range of ~250–600 cm⁻¹, to our knowledge for the first time. This covers the region in which modes such as helix breathing, torsions, and skeletal deformations occur (Tu, 1986). We have so far observed only weak (but reproducible) ROA signals in the S–S stretch region, ~500–550 cm⁻¹, from disulphide bridges. This may be because any such ROA is generated mainly through isotropic scattering, rather than through the anisotropic scattering that dominates ROA in backscattering (Equation 2b). If this is the case, then S–S stretch ROA might be accessed best through measurements in the forward direction, which is dominated by isotropic scattering (Equation 2a). Much remains to be explored through this low-wavenumber ROA window on protein structure.

Side Chains

Although bands from side chains are usually not very prominent in the ROA spectra of polypeptides and proteins, there are several distinct regions in which side chain vibrations appear to be largely responsible for the


observed ROA features. In particular, ROA bands in the range of ~1400–1480 cm⁻¹ originate in CH2 and CH3 side chain deformations and also in tryptophan vibrations; ROA bands in the range of ~1545–1560 cm⁻¹ originate in tryptophan vibrations; and some ROA bands in the range of ~1600–1630 cm⁻¹ originate in vibrations of aromatic side chains, especially tyrosine (Barron et al., 2000). Also, the ring breathing mode of the aromatic ring in phenylalanine, which generates a strong band in the conventional Raman spectrum at ~1000 cm⁻¹ (Tu, 1986) and a strong ROA band in small, chiral molecules such as 1-phenylethanol (Barron et al., 2004; Macleod et al., 2005), may be associated with ROA bands observed in this region in some proteins.

The absolute stereochemistry of the tryptophan conformation, in terms of the sign and magnitude of the torsion angle χ2,1 around the bond connecting the indole ring to the Cβ atom, may be obtained from the ~1545–1560 cm⁻¹ tryptophan ROA band, assigned to a W3-type vibration of the indole ring (Blanch et al., 2001). This was discovered from observations of W3 tryptophan ROA bands with similar magnitudes but opposite signs in two different filamentous bacterial viruses with coat protein subunits containing a single tryptophan, which suggested that the tryptophans adopt quasienantiomeric conformations in the two viruses. Since the magnitude of the angle χ2,1 may be deduced from the W3 Raman band wavenumbers (Miura and Thomas, 1995), both the sign and magnitude of this angle may be obtained from the ROA spectrum, something usually only available from high-resolution X-ray protein crystal structures. The W3 ROA band may also be used as a probe of conformational heterogeneity among a set of tryptophans in disordered regions within a protein, since cancellation from ROA contributions with opposite signs will result in a loss of ROA intensity, as observed, for example, in a molten globule state of human lysozyme (Blanch et al., 2000).
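The cancellation argument can be made concrete with a toy population average, sketched below in Python. Each conformer is assigned a positive Raman intensity and a signed ROA intensity; the observed spectrum is taken as the population-weighted sum of these "snapshot" contributions. The function name and the numerical values are hypothetical and serve only to illustrate why ROA, but not Raman, intensity is lost when conformers of opposite sign coexist.

```python
def averaged_intensities(conformers):
    """Population-weighted Raman and ROA intensities for one band.

    conformers: list of (population, raman_intensity, roa_intensity) tuples,
    where roa_intensity carries a sign and raman_intensity is positive.
    """
    total = sum(population for population, _, _ in conformers)
    raman = sum(p * r for p, r, _ in conformers) / total
    roa = sum(p * s for p, _, s in conformers) / total
    return raman, roa


if __name__ == "__main__":
    # One rigid conformer: full ROA intensity.
    print(averaged_intensities([(1.0, 1.0, 1.0e-3)]))
    # Two equally populated quasi-enantiomeric conformers: ROA cancels, Raman does not.
    print(averaged_intensities([(0.5, 1.0, 1.0e-3), (0.5, 1.0, -1.0e-3)]))
    # Unequal populations: partial loss of ROA intensity.
    print(averaged_intensities([(0.75, 1.0, 1.0e-3), (0.25, 1.0, -1.0e-3)]))
```

The additive Raman intensity is the same in all three cases, while the ROA intensity falls from its full value to zero as the populations of the two signs become equal.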
Glycoproteins
Intact glycoproteins often provide ROA spectra containing bands originating in both the polypeptide and carbohydrate components (Barron et al., 2000; Zhu et al., 2005), from which information about the structure of both components may be deduced. This should be especially valuable since glycoproteins are of central importance in biochemistry, but are difficult to study by using conventional techniques. Carbohydrates themselves in aqueous solution give rich and informative band structure over a wide range of the vibrational spectrum (Barron et al., 2000). ROA spectra of monosaccharides contain information on sugar ring conformation, relative disposition of OH groups around the ring, the absolute configuration and axial or equatorial orientation of groups attached to the anomeric carbon, and the exocyclic CH2OH conformation; those of di- and oligosaccharides contain information on the conformation of the C–O–C glycosidic links; and those of polysaccharides provide information about whether the structure is disordered or has extended order, such as helical. In a preliminary study, the ROA spectrum of bovine α1-acid glycoprotein (AGP) was measured and compared with those of β-lactoglobulin and N,N′-diacetylchitobiose, which revealed features consistent with previous suggestions that the polypeptide component of AGP has a structure based on the lipocalin fold, as adopted by β-lactoglobulin, and that the first two glycosidic links after the N-links to asparagine in the pentasaccharide core are of the β(1-4) type (Zhu et al., 2005).

Polyproline (II) Helix and Unfolded Proteins
The PPII helix can be supported by amino acid sequences other than those based on L-proline and has been recognized as a common structural motif within the longer loops in the X-ray crystal structures of many proteins (Adzhubei and Sternberg, 1993). It consists of a left-handed helix with 3-fold rotational symmetry (n = −3), in which the φ,ψ angles of the constituent residues are restricted to values around −78°, +146°, corresponding to a region of the Ramachandran surface adjacent to the β region. The extended nature of the PPII helix precludes intrachain hydrogen bonds, the structure being stabilized instead by main chain hydrogen bonding with water molecules and side chains (Creamer and Campbell, 2002). PPII currently attracts much interest as a major conformational element of disordered polypeptides and unfolded proteins in aqueous solution (Shi et al., 2002a; Bochicchio and Tamburro, 2002). It can be distinguished from random coil in polypeptides by using UVCD, VCD, IR, and Raman, but these techniques have difficulty in identifying it when other conformational elements are present, as in proteins. However, it is readily identified even in proteins by using ROA, which has proved valuable for studying PPII in unfolded and partially folded proteins (Barron et al., 2002) and its possible role in amyloid fibril formation in certain protein misfolding diseases (Blanch et al., 2000, 2004; Syme et al., 2002).
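The geometric definition of PPII given above (φ,ψ values around −78°, +146°) can be turned into a simple test on backbone dihedral angles, for example when comparing ROA results with angles taken from a crystal structure. The following is a minimal sketch under stated assumptions: the 30° tolerance and the function names are hypothetical and do not come from this article; only the two reference angles are taken from the text.

```python
def angular_difference(a, b):
    """Smallest signed difference a - b in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0

# Reference PPII dihedral angles quoted in the text.
PPII_PHI, PPII_PSI = -78.0, 146.0

def is_near_ppii(phi, psi, tolerance=30.0):
    """True if (phi, psi) lies within `tolerance` degrees of the PPII values.

    The tolerance is an illustrative cutoff, not a value from the article.
    """
    return (abs(angular_difference(phi, PPII_PHI)) <= tolerance
            and abs(angular_difference(psi, PPII_PSI)) <= tolerance)

def ppii_fraction(dihedrals):
    """Fraction of (phi, psi) pairs that fall in the assumed PPII window."""
    if not dihedrals:
        return 0.0
    hits = sum(is_near_ppii(phi, psi) for phi, psi in dihedrals)
    return hits / len(dihedrals)

# Example: one PPII-like residue and one alpha-helical residue.
residues = [(-78.0, 146.0), (-60.0, -45.0)]
print(ppii_fraction(residues))
```

The wrap-around in `angular_difference` matters because ψ values near +180° and −180° are adjacent on the Ramachandran surface; a plain subtraction would treat them as far apart.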
The ROA spectrum of the natively unfolded protein β-casein, displayed in Figure 3, is characteristic of a PPII-rich structure, the main features being the strong positive, extended amide III ROA band at ~1318 cm−1 together with the weak positive amide I ROA band at ~1673 cm−1.

Viruses
Knowledge of the structure of viruses at the molecular level is essential for understanding their modus operandi. However, the application of X-ray crystallography or fiber diffraction is often hampered by practical difficulties. Conventional Raman is valuable in studies of intact viruses at the molecular level, as it is able to simultaneously probe both the protein and nucleic acid constituents (Miura and Thomas, 1995; Thomas, 1999). The additional incisiveness of ROA, which may be applied to most types of virus, including filamentous, helical, rod-shaped, and icosahedral (Blanch et al., 2002a), further enhances the value of Raman spectroscopy in structural virology. The first virus ROA spectra were reported for filamentous bacteriophages (Blanch et al., 1999). The data proved valuable for the identification of ROA bands associated with unhydrated and hydrated α helices since large amounts of both types are present in the overlapping extended helical coat proteins in the intact viruses. ROA also proved useful in the comparison of


the helix bundle coat protein structure of the rod-shaped virus potato virus X (PVX) with that of tobacco mosaic virus (TMV) (Blanch et al., 2002b). Both show ROA spectra characteristic of helix bundle coat proteins. However, the positive ~1340 cm−1 ROA band in PVX is significantly more intense than that in TMV. The implied increased hydration of the α helices in the coat proteins of PVX is consistent with the characteristics of the PVX virus particles, which are flexuous filaments, compared with those of TMV, which are rigid rods. It is also consistent with the greater solvent exposure of the PVX coat proteins implied by the deep grooves observed in the viral surface by X-ray fiber diffraction (Parker et al., 2002), which have no counterpart in TMV. An example of the potential of ROA in structural virology is provided by a study of cowpea mosaic virus (CPMV) (Blanch et al., 2002a), in which the ROA spectrum of the intact virus was separated into three spectra characteristic of the jelly roll β sandwich fold of the identical coat proteins making up the icosahedral capsid and of the two distinct nucleic acid genomes, RNA1 and RNA2. This provided information on the nucleic acid structure that is, to our knowledge, new and that is inaccessible to X-ray crystallography.

Pattern Recognition in Protein ROA Spectra
Although some individual band assignments may be uncertain, or not even valid due to the extensive vibrational coupling often involved in the generation of large ROA signals, overall band patterns can be characteristic of certain structural elements, motifs, and sometimes folds. This, together with the plethora of structure-sensitive ROA bands, means that pattern recognition methods will be especially important in the analysis of protein ROA spectra. We are currently constructing a data set of protein ROA spectra that covers as much of the fold space as possible.
The SCOP database, release 1.67, lists 887 different folds within which there are 1447 superfamilies, with the most populous folds containing the largest number of superfamilies. The four most common structure classes (all α, all β, α/β, and α + β) contain 85% of the known folds and 87% of the superfamilies. Within these four classes, there are 88 folds with 2 or more superfamilies, giving rise to 471 different superfamilies. A judicious selection of targets from these 88 folds will provide representative ROA spectra from all folds populated with more than 1 superfamily; thus, a high degree of coverage (44%) of all known superfamilies will be achieved.

Our work thus far has focused on the pattern recognition method of principal component analysis (PCA) (Malinowski, 2002), implemented for ROA by our collaborator K. Nielsen. This has proved valuable for providing an initial representation of the structural relationships among polypeptide and protein states based on their ROA spectra (McColl et al., 2003). From the experimental ROA spectral data, PCA calculates a set of subspectra that serves as basis functions, the algebraic combination of which, with appropriate expansion coefficients, can be used to reconstruct any member of the original set of ROA spectra. The ROA spectra are specified by intensities for the same set of wavenumbers, and are normalized by scaling each spectrum so that the sum of the squared intensities is unity. Most of the polypeptides and proteins in our set of ROA spectra may be divided into two sets, one containing α helices and β sheets in various amounts, the other containing mainly disordered or irregular structures. This is reflected in the coefficients of the two most important basis functions. For the first and most important basis function, large coefficients are associated with large amounts of α helices or β sheets. Since the α helix and β sheet contents are inversely correlated (the larger the amount of one, the smaller the amount of the other), the coefficients associated with α helices and β sheets have opposite signs; a negative sign is given here for α helices, and a positive sign is given for β sheets. The coefficients for the second basis function reflect the amount of disordered (mainly PPII) structure; large positive coefficients are associated with disordered polypeptides and proteins. Proteins with small coefficients, positive or negative, for the first basis function, together with small coefficients for the second basis function, have similar significant amounts of α helices and β sheets. The raw basis functions do not have any direct physical interpretation beyond this.

Figure 4 shows a plot of the coefficients for our current set of 83 polypeptide, protein, and virus ROA spectra for the 2 most important basis functions. The spectra separate into clusters corresponding to different types of structure, with increasing α helix content to the left, increasing β sheet content to the right, and increasing disordered or irregular structure from bottom to top. The protein positions are color-coded with respect to the seven different structural types listed in the figure that provide a useful initial classification that follows naturally from the PCA clustering.
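The normalization and decomposition steps described above can be sketched numerically. The example below is an illustration only and is not the implementation used for Figure 4, whose details are not given here. It assumes the basis functions can be obtained from a singular value decomposition of the normalized spectra without mean-centering (so that the basis functions alone reconstruct each spectrum); the function names and the synthetic two-band data are hypothetical.

```python
import numpy as np

def normalize_spectra(spectra):
    """Scale each spectrum (row) so its sum of squared intensities is 1."""
    spectra = np.asarray(spectra, dtype=float)
    norms = np.sqrt((spectra ** 2).sum(axis=1, keepdims=True))
    return spectra / norms

def pca_basis(spectra, n_components=2):
    """Return (coefficients, basis) from an uncentered SVD.

    basis[k] is the k-th subspectrum (basis function); coefficients[i, k]
    is the expansion coefficient of spectrum i on that basis function.
    Omitting mean-centering is an assumption of this sketch.
    """
    x = normalize_spectra(spectra)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    coefficients = u[:, :n_components] * s[:n_components]
    basis = vt[:n_components]
    return coefficients, basis

# Synthetic stand-in data: 10 "spectra" sampled at the same 200
# wavenumbers, each a mixture of two made-up bands of opposite sign.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(600.0, 1800.0, 200)
band_a = np.exp(-((wavenumbers - 1340.0) / 15.0) ** 2)
band_b = -np.exp(-((wavenumbers - 1245.0) / 15.0) ** 2)
weights = rng.uniform(0.0, 1.0, size=(10, 2))
spectra = weights @ np.vstack([band_a, band_b])

coefficients, basis = pca_basis(spectra, n_components=2)

# Each normalized spectrum is recovered as a linear combination of the
# basis functions weighted by its expansion coefficients.
reconstructed = coefficients @ basis
```

Plotting `coefficients[:, 0]` against `coefficients[:, 1]` gives the kind of two-dimensional map described for Figure 4. The overall sign of each basis function returned by an SVD is arbitrary, so a sign convention (such as negative coefficients for α helices) has to be imposed afterwards.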
This PCA classification is a little different from the SCOP classification in that the SCOP all α and all β classes are further refined into all α and mainly α, and all β and mainly β; furthermore, disordered structure is now clearly recognized. However, the α + β and α/β classes are not differentiated and are collected together within the mainly α, αβ, and mainly β classes. The positions of poly(L-lysine) in model conformations (McColl et al., 2003), together with those of the five proteins whose ROA spectra are displayed in Figure 3, are indicated. Poly(L-lysine) in α-helical, β sheet, and disordered states lies in the lower left, lower right, and upper center of the plot, respectively, as expected. Human serum albumin lies in the lower left region within the all α region, and human immunoglobulin G lies close to the all β region, but both are somewhat higher than α-helical and β sheet poly(L-lysine) due to the significant amount of disordered or irregular structures in the form of loops and turns present in the native folds. Bovine ribonuclease A and subtilisin Carlsberg both lie within the αβ region, but the latter lies higher due to the significant amount of disordered or irregular structures it contains in its many long loops: this is consistent with the structural differences between the corresponding α + β and α/β classes, although the first two basis functions do not appear to discriminate directly between α + β and α/β. The natively unfolded protein bovine β-casein lies in the top central region, adjacent to disordered poly(L-lysine), but lower than the model PPII structure AcOO(Ala)7OONH2. It is clear that even a first simple level of PCA analysis provides consistent and informative relationships between different proteins that correspond to the distinct structural types specified in Figure 4. The PCA analysis could be extended to extract quantitative estimates of various structural elements (via orthogonal transformations of the basis functions to functions that relate directly to elements such as helix, sheet, loops, turns, etc.), and could also be further developed in order to separate polypeptide, carbohydrate, and nucleic acid contributions to composite ROA spectra.

Figure 4. Plot of the PCA Coefficients for the Two Most Important Basis Functions for a Set of 83 Polypeptide, Protein, and Virus ROA Spectra
The examples discussed in this article are labeled as follows: α-helical (1a), β sheet (1b), and disordered (1c) poly(L-lysine); human serum albumin (2); human immunoglobulin G (3); bovine ribonuclease A (4); subtilisin Carlsberg (5); bovine β-casein (6); AcOO(Ala)7OONH2 (7). More complete definitions of the structural types are: all α, >~60% α helix, with little or no other secondary structure; mainly α, >~35% α helix and a small amount of β sheet (~5%–15%); αβ, similar significant amounts of α helices and β sheets; mainly β, >~35% β sheet and a small amount of α helix (~5%–15%); all β, >~45% β sheet, with little or no other secondary structure; mainly disordered/irregular, little secondary structure; all disordered/irregular, no secondary structure.

ROA as a Complement to Protein and Virus Crystal and Fiber X-Ray Diffraction
As well as providing new structural information rapidly for the many proteins inaccessible to X-ray or NMR methods, ROA will be a useful adjunct to these methods. For example, ROA could be valuable for solving X-ray crystal structures by molecular replacement methods because ROA data will identify those proteins with the most structural similarity to the protein under study. These proteins, either alone or in a suitably weighted combination, will provide target structures for use as search molecules in molecular replacement. This will be especially valuable in cases in which structural homology is high but sequence homology is low, as is often the case with the coat protein subunits of

viral capsids. ROA could also be useful for identifying proteins that are completely or partially unfolded, and hence unlikely to form crystals, before fruitless efforts are expended on crystallization attempts. Since most of the many thousands of known viruses are likely to be inaccessible to high-resolution X-ray crystal or fiber diffraction methods, structural virology is a particularly fruitful area for ROA on account of the ease with which the folds of the major coat proteins of intact viruses in aqueous solution may be “read off” from their ROA spectra and differences of detail identified between coat proteins of different viruses having the same basic fold, and also due to its ability to provide information about the structure of the nucleic acid core and protein-nucleic acid interactions. It may also prove possible to obtain information about the carbohydrate structure of viral envelope glycoproteins. All this will provide a valuable complement to low-resolution X-ray crystal and fiber diffraction, as well as to electron cryomicroscopy, which currently has a best resolution limit of ~8 Å for samples such as icosahedral viruses, which, due to their high symmetry and rigidity, can be especially favorable (Lee and Johnson, 2003; Orlova and Saibil, 2004).

Experimental Procedures
Bovine serum albumin, human immunoglobulin G, bovine ribonuclease A, and subtilisin Carlsberg were purchased from the


Sigma-Aldrich Company Ltd. The bovine β-casein was prepared from whole acid casein by the urea fractionation method of Aschaffenburg (1963) in the laboratory of Dr. C. Holt at the Hannah Research Institute, Ayr, UK. SCP ROA spectra were acquired by using the ChiralRAMAN instrument (BioTools, Inc.) described above. They are displayed in analog-to-digital counter units as a function of the Stokes wavenumber shift with respect to the exciting laser wavenumber, and they are presented as circular polarization intensity differences (IR − IL), where IR and IL denote the intensities of the right- and left-circularly polarized components, respectively, of the scattered light. The parent Raman spectra are presented as corresponding circular polarization intensity sums (IR + IL). The protein solutions were studied at concentrations of ~50 mg/ml in aqueous buffers (100 mM acetate buffer [pH 6.4] for the serum albumin, immunoglobulin G, ribonuclease A, and subtilisin Carlsberg; 50 mM phosphate buffer [pH 7.0] for the β-casein) at ambient temperature (~20°C). The solutions were filtered through 0.22 μm Millipore filters into quartz microfluorescence cells and then centrifuged. Visible fluorescence from traces of impurities, which can give large backgrounds in Raman spectra, was quenched by leaving the samples to equilibrate in the laser beam before ROA data were acquired. The experimental conditions were as follows: laser wavelength, 532 nm; laser power at the sample, ~500 mW; spectral resolution, ~10 cm−1; acquisition times, ~3 hr.

Acknowledgments
We thank the United Kingdom Biotechnology and Biological Sciences Research Council for a research grant, and Dr. C. Holt for the sample of bovine β-casein.

Received: May 12, 2005
Revised: July 5, 2005
Accepted: July 7, 2005
Published: October 11, 2005

References
Adzhubei, A.A., and Sternberg, M.J.E. (1993). Left-handed polyproline II helices commonly occur in globular proteins. J. Mol. Biol. 229, 472–493.
Aschaffenburg, R. (1963). Preparation of β-casein by a modified urea fractionation method. J. Dairy Res. 30, 259–260.
Atkins, P.W., and Barron, L.D. (1969). Rayleigh scattering of polarized photons by molecules. Mol. Phys. 16, 453–466.

Barron, L.D. (2004). Molecular Light Scattering and Optical Activity, Second Edition (Cambridge, UK: Cambridge University Press).
Barron, L.D., and Buckingham, A.D. (1971). Rayleigh and Raman scattering from optically active molecules. Mol. Phys. 20, 1111–1119.
Barron, L.D., Bogaard, M.P., and Buckingham, A.D. (1973). Raman scattering of circularly polarized light by optically active molecules. J. Am. Chem. Soc. 95, 603–605.
Barron, L.D., Hecht, L., Blanch, E.W., and Bell, A.F. (2000). Solution structure and dynamics of biomolecules from Raman optical activity. Prog. Biophys. Mol. Biol. 73, 1–49.
Barron, L.D., Blanch, E.W., and Hecht, L. (2002). Unfolded proteins studied by Raman optical activity. Adv. Protein Chem. 62, 51–90.
Barron, L.D., Hecht, L., Blanch, E.W., and McColl, I.H. (2004). Raman optical activity comes of age. Mol. Phys. 102, 731–744.
Barth, A., and Zscherp, C. (2002). What vibrations tell us about proteins. Q. Rev. Biophys. 35, 369–430.
Blanch, E.W., Bell, A.F., Hecht, L., Day, L.A., and Barron, L.D. (1999). Raman optical activity of filamentous bacteriophages: hydration of α-helices. J. Mol. Biol. 290, 1–7.
Blanch, E.W., Morozova-Roche, L.A., Cochran, D.A.E., Doig, A.J., Hecht, L., and Barron, L.D. (2000). Is polyproline II helix the killer conformation? A Raman optical activity study of the amyloidogenic prefibrillar intermediate of human lysozyme. J. Mol. Biol. 301, 553–563.
Blanch, E.W., Hecht, L., Day, L.A., Pederson, D.M., and Barron, L.D. (2001). Tryptophan absolute stereochemistry in viral coat proteins from Raman optical activity. J. Am. Chem. Soc. 123, 4863–4864.
Blanch, E.W., Hecht, L., Syme, C.D., Volpetti, V., Lomonossoff, G.P., Nielsen, K., and Barron, L.D. (2002a). Molecular structures of viruses from Raman optical activity. J. Gen. Virol. 83, 2593–2600.
Blanch, E.W., Robinson, D.J., Hecht, L., Syme, C.D., Nielsen, K., and Barron, L.D. (2002b). Solution structures of potato virus X and narcissus mosaic virus from Raman optical activity. J. Gen. Virol. 83, 241–246.
Blanch, E.W., Gill, A.C., Rhie, A.G.O., Hope, J., Hecht, L., Nielsen, K., and Barron, L.D. (2004). Raman optical activity demonstrates poly(L-proline) II helix in the N-terminal region of the ovine prion protein: implications for function and misfunction. J. Mol. Biol. 343, 467–476.
Blundell, T., Barlow, D., Borkakoti, N., and Thornton, J. (1983). Solvent-induced distortions and the curvature of α-helices. Nature 306, 281–283.
Bochicchio, B., and Tamburro, A.M. (2002). Polyproline II structure in proteins: identification by chiroptical spectroscopies, stability and functions. Chirality 14, 782–792.
Carey, P.R. (1982). Biochemical Applications of Raman and Resonance Raman Spectroscopies (New York: Academic Press).
Creamer, T.P., and Campbell, M.N. (2002). Determinants of the polyproline II helix from modelling studies. Adv. Protein Chem. 62, 263–282.
Diem, M. (1993). Introduction to (New York: John Wiley & Sons, Inc.).
Dunker, A.K., Brown, C.J., Lawson, J.D., Iakoucheva, L.M., and Obradovic, Z. (2002). Intrinsic disorder and protein function. Biochemistry 41, 6573–6582.
Dyson, H.J., and Wright, P.E. (2005). Intrinsically unstructured proteins and their functions. Nat. Rev. Mol. Cell Biol. 6, 197–208.
Fasman, G.D. (1996). Circular Dichroism and the Conformational Analysis of Biomolecules (New York: Plenum Press).
Hecht, L., and Barron, L.D. (1990). An analysis of modulation experiments for Raman optical activity. Appl. Spectrosc. 44, 483–491.
Hecht, L., Barron, L.D., and Hug, W. (1989). Vibrational Raman optical activity in backscattering. Chem. Phys. Lett. 158, 341–344.
Hecht, L., Barron, L.D., Blanch, E.W., Bell, A.F., and Day, L.A. (1999). Raman optical activity instrument for studies of biopolymer structure and dynamics. J. Raman Spectrosc. 30, 815–825.
Holt, C., and Sawyer, L. (1993). Caseins as rheomorphic proteins: interpretation of primary and secondary structures of the αS1-, β- and κ-caseins. J. Chem. Soc., Faraday Trans. 89, 2683–2692.
Hug, W. (2002). Raman optical activity. In Handbook of Vibrational Spectroscopy, Volume 1, J.M. Chalmers and P.R. Griffiths, eds. (Chichester: John Wiley & Sons, Inc.), pp. 745–758.
Hug, W. (2003). Virtual enantiomers as the solution of optical activity’s deterministic offset problem. Appl. Spectrosc. 57, 1–13.
Hug, W., and Hangartner, G. (1999). A novel high-throughput Raman spectrometer for polarization difference measurements. J. Raman Spectrosc. 30, 841–852.
Keiderling, T.A. (2000). Peptide and protein conformational studies with vibrational circular dichroism and related spectroscopies. In Circular Dichroism. Principles and Applications, N. Berova, K. Nakanishi, and R.W. Woody, eds. (New York: Wiley-VCH), pp. 621–666.
Kraulis, P.J. (1991). MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr. 24, 946–950.
Lee, K.K., and Johnson, J.E. (2003). Complementary approaches to structure determination of icosahedral viruses. Curr. Opin. Struct. Biol. 13, 558–569.
Long, D.A. (2002). The Raman Effect (Chichester: John Wiley & Sons, Inc.).
Macleod, N.A., Butz, P., Simons, J.P., Grant, G.H., Baker, C.M., and Tranter, G.E. (2005). Structure, electronic circular dichroism and Raman optical activity in the gas phase and in solution: a computational and experimental investigation. Phys. Chem. Chem. Phys. 7, 1432–1440.
McColl, I.H., Blanch, E.W., Gill, A.C., Rhie, A.G.O., Ritchie, M.A., Hecht, L., Nielsen, K., and Barron, L.D. (2003). A new perspective on β-sheet structures using vibrational Raman optical activity: from poly(L-lysine) to the prion protein. J. Am. Chem. Soc. 125, 10019–10026.
McColl, I.H., Blanch, E.W., Hecht, L., and Barron, L.D. (2004a). A study of α-helix hydration in polypeptides, proteins and viruses using vibrational Raman optical activity. J. Am. Chem. Soc. 126, 8181–8188.
McColl, I.H., Blanch, E.W., Hecht, L., Kallenbach, N.R., and Barron, L.D. (2004b). Vibrational Raman optical activity characterization of poly(L-proline) II helix in alanine oligopeptides. J. Am. Chem. Soc. 126, 5076–5077.
Malinowski, E.R. (2002). Factor Analysis in Chemistry, Third Edition (New York: John Wiley & Sons, Inc.).
Miura, T., and Thomas, G.J., Jr. (1995). Raman spectroscopy of proteins and their assemblies. In Subcellular Biochemistry. Volume 24: Proteins: Structure, Function and Engineering, B.B. Biswas and S. Roy, eds. (New York: Plenum Press), pp. 55–99.
Nafie, L.A., and Freedman, T.B. (1989). Dual circular polarization Raman optical activity. Chem. Phys. Lett. 154, 260–266.
Orlova, E.N., and Saibil, H.R. (2004). Structure determination of macromolecular assemblies by single-particle analysis of cryoelectron micrographs. Curr. Opin. Struct. Biol. 14, 584–590.
Parker, L., Kendall, A., and Stubbs, G. (2002). Surface features of potato virus X from fiber diffraction. Virology 300, 291–295.
Polavarapu, P.L. (1990). Ab initio vibrational Raman and Raman optical activity spectra. J. Phys. Chem. 94, 8106–8112.
Pusey, M.L., Liu, Z.J., Tempel, W., Praissman, J., Lin, D., Wang, B.C., Gavira, J.A., and Ng, J.D. (2005). Life in the fast lane for protein crystallization and X-ray crystallography. Prog. Biophys. Mol. Biol. 88, 359–386.
Ruud, K., Helgaker, T., and Bour, P. (2002). Gauge-origin independent density-functional theory calculations of vibrational Raman optical activity. J. Phys. Chem. A 106, 7448–7455.
Schweitzer-Stenner, R., Eker, F., Huang, Q., Griebenow, K., Mroz, P.A., and Kozlowski, P.M. (2002). Structure analysis of dipeptides in water by exploring and utilizing the structural sensitivity of amide III by polarized visible Raman, FTIR-spectroscopy and DFT based normal coordinate analysis. J. Phys. Chem. B 106, 4294–4304.
Shi, Z., Woody, R.W., and Kallenbach, N. (2002a). Is polyproline II a major backbone conformation in unfolded proteins? Adv. Protein Chem. 62, 163–240.
Shi, Z., Olson, C.A., Rose, G.D., Baldwin, R.L., and Kallenbach, N.R. (2002b). Polyproline II structure in a sequence of seven alanine residues. Proc. Natl. Acad. Sci. USA 99, 9190–9195.
Sue, S.C., Chang, C.F., Huang, Y.T., Chou, C.Y., and Huang, T.H. (2005). Challenges in NMR-based structural genomics. Physica A (Amsterdam) 350, 12–27.
Sundaralingham, M., and Sekharudu, Y.C. (1989). Water inserted α-helical segments implicate reverse turns as folding intermediates. Science 244, 1333–1337.
Syme, C.D., Blanch, E.W., Holt, C., Jakes, R., Goedert, M., Hecht, L., and Barron, L.D. (2002). A Raman optical activity study of rheomorphism in caseins, synucleins and tau protein: implications for fibrillogenic propensity. Eur. J. Biochem. 269, 148–156.
Thomas, G.J. (1999). Raman spectroscopy of protein and nucleic acid assemblies. Annu. Rev. Biophys. Biomol. Struct. 28, 1–27.
Tu, A.T. (1986). Peptide backbone conformation and microenvironment of protein side chains. Adv. Spectrosc. 13, 47–112.
Tuma, R. (2005). Raman spectroscopy of proteins: from peptides to large assemblies. J. Raman Spectrosc. 36, 307–319.
Uversky, V.N. (2002). Natively unfolded proteins: a point where biology waits for physics. Protein Sci. 11, 739–756.
Wallace, B.A., Wien, F., Miles, A.J., Lees, J.G., Hoffmann, S.V., Evans, P., Wistow, G.J., and Slingsby, C. (2004). Biomedical applications of synchrotron radiation circular dichroism spectroscopy: identification of mutant proteins associated with disease and development of a reference database for fold motifs. Faraday Discuss. 126, 237–243.
Zhu, F., Isaacs, N.W., Hecht, L., and Barron, L.D. (2005). Polypeptide and carbohydrate structure of an intact glycoprotein from Raman optical activity. J. Am. Chem. Soc. 127, 6142–6143.
Zuber, G., and Hug, W. (2004). Rarified basis sets for the calculation of optical tensors. 1. The importance of gradients on hydrogen atoms for the Raman scattering tensor. J. Phys. Chem. A 108, 2108–2118.