Biopharmaceutical applications of protein characterisation by circular dichroism spectroscopy


CHAPTER 6

A.J. Miles, B.A. Wallace
Institute of Structural and Molecular Biology, Birkbeck College, University of London, London, United Kingdom

6.1 Introduction

Circular dichroism (CD) is an optical spectroscopic method that exploits the differential absorption of left- and right-handed circularly polarized light by optically active molecules. CD spectroscopy in the far-ultraviolet (UV) wavelength range of the electromagnetic spectrum (260–170 nm) can be used to characterize and quantify protein secondary structural contents in terms of α-helical, β-strand, and unordered structure. It can also be used to identify changes that occur due to interactions with ligands or other moieties, or due to environmental factors such as changes in pH, temperature, or ionic strength. In addition, the CD signals from the aromatic residues tryptophan, tyrosine, and phenylalanine present in proteins are detectable in the near-UV wavelength range, between 260 and 300 nm, and can be used to monitor changes in the environment of these moieties, which can reflect protein tertiary structure. There are several monographs [1,2] devoted to circular dichroism spectroscopy (and its related methods, synchrotron radiation circular dichroism (SRCD) and linear dichroism (LD)) which describe the fundamental principles and instrumentation in detail. This chapter provides a brief introduction to them, but is primarily focused on methods of data collection and analysis, with specific attention to applications relevant to biopharmaceutical characterisation.

Biophysical Characterization of Proteins in Developing Biopharmaceuticals, Second Edition https://doi.org/10.1016/B978-0-444-64173-1.00006-8


Copyright © 2020 Elsevier B.V. All rights reserved.


FIG. 6.1 Linearly polarized electromagnetic radiation. The electric field (E) and magnetic field (B) are perpendicular to the direction of propagation and to each other. λ is the wavelength of the light.

6.1.1 Theory

6.1.1.1 The physical origins of CD signals

Electromagnetic (EM) radiation comprises an electric field and a magnetic field that oscillate in perpendicular planes, which are, in turn, perpendicular to the direction of light propagation (Fig. 6.1). A light source generally gives rise to wave trains with the fields oriented isotropically; linear polarization selects light with the electric field oscillating in a single plane. In circularly polarized light (CPL), the electric vector rotates about the direction of propagation, undergoing one revolution per wavelength where, looking toward the light source, right-handed CPL will rotate in a clockwise direction and left-handed CPL anticlockwise. Linearly polarized light can be described as the superposition of two oppositely handed circularly polarized components of equal amplitude and phase. As long as the amplitudes and phases of the circular components remain the same, the resultant electric vector (E) will lie in a plane and oscillate in magnitude as shown on the left of Fig. 6.2. When the light passes through an optically active sample, there will be differential absorbance of the two circularly polarized components and the resulting radiation is said to be elliptically polarized, with an electric vector tracing an ellipse as shown on the right of Fig. 6.2. The CD spectrum is derived from the difference in the absorption of left- and right-CPL emerging from the optically active sample (Fig. 6.3). A chromophore is optically active if it is chiral, covalently attached to a chiral center, or situated in a chiral environment courtesy of the three-dimensional structure of

FIG. 6.2 (Left): View along the direction of light propagation, looking toward the source of the plane polarized light. The electric vector (dashed arrow) is the sum of the left and right circularly polarized components (solid arrows). (Right): After passing through an optically active sample the left circularly polarized component is absorbed more than the right and the electric field traces an ellipse.


FIG. 6.3 The relationship between absorption (Top) and CD (Bottom). For CD, "A" indicates when left-circularly polarised light is absorbed more than right, which produces a positive CD peak; "B" is when the right-circularly polarised light is absorbed more than left, which produces a negative CD peak; and "C" is when right- and left-circularly polarised light are absorbed equally (i.e., the sample is achiral), which produces no CD signal.

the molecule. The latter situation pertains to the peptide backbone group, which is the chromophore of principal interest for proteins.

6.1.1.1.1 Far-UV absorption of the peptide bond

When electromagnetic (EM) radiation of a specific energy (in the UV or visible parts of the spectrum) strikes a chromophore, absorption occurs if electrons are transiently promoted from a ground state to a higher energy state (Fig. 6.4 left). The peptide bond has two such transitions in the far-UV: an n → π* transition at ~220 nm and a π → π* transition at ~190 nm. The characteristics of the CD signal from a single chromophore depend upon the local torsion angles of the peptide backbone, i.e., the secondary structure, and the far-UV CD spectrum (~190–240 nm) represents the summation of signals from all peptide chromophores in the sample. In Fig. 6.5 the solid line depicts the CD spectrum of the α-helical protein, myoglobin. There is a negative peak at 222 nm corresponding to the n → π* transition and, moving to lower wavelengths, a negative peak at 208 nm followed by a positive peak at 192 nm. The latter two arise from the π → π* transition, which generates two peaks where the higher energy level is split (Fig. 6.4 right) due to coupling of adjacent chromophores along sections of secondary structure [5]. There is a further intra-amide charge transfer transition [6] that manifests as a shoulder at about 175 nm in α-helical spectra, although these data are normally only obtainable by using a highly intense synchrotron light source, rather than the Xenon arc lamps typically found in lab-based CD instruments. Fig. 6.5 also shows

FIG. 6.4 (Left): Absorption involves the promotion of an electron to a higher energy state. (Right): The excited state becomes delocalised over the coupled chromophores and splits into two or more states of slightly different energies.


FIG. 6.5 CD spectrum of a mostly α-helical protein (myoglobin) [solid line], a mostly β-sheet protein (lentil lectin) [dotted line], and a disordered protein (MEG-14) [dashed line]. The myoglobin spectrum clearly shows the n → π* and π → π* transition peaks. The shoulder between 180 nm and 170 nm is thought to be due to an intra-amide π → π* charge transfer transition. The lowest wavelength of 172 nm was achieved by using a synchrotron light source. [Plot axes: CD (Δε) versus wavelength (nm), 170–250 nm.] Data from Lees JG, Miles AJ, Wien F, Wallace BA. A reference database for circular dichroism spectroscopy covering fold and secondary structure space. Bioinformatics 2006;22:1955–62; Lopes JLS, Orcia D, Araujo APU, DeMarco R, Wallace BA. Folding factors and partners for the intrinsically disordered protein micro-exon gene 14 (MEG-14). Biophys J 2013;105:2512–20.

the CD spectra of the mostly β-sheet protein (lentil lectin) and of a designed disordered protein for comparison. The spectra of predominantly β-sheet proteins have π → π* transitions with magnitudes that are typically three to five times smaller than those generated by α-helical proteins. β-Sheet structures also show variations in sheet-twist and the relative direction of the individual β-strands (parallel or antiparallel); consequently, there are large variations between different β-sheet proteins in the positions and intensities of the coupled π → π* transition. Predominantly disordered proteins are largely characterized by a single negative peak at ~200 nm. Most proteins contain a mixture of secondary structure types, and hence their spectra are linear combinations of the characteristic spectra, weighted by the proportion of each type of structure present.

6.1.1.1.2 Near-UV CD

The aromatic residues in a protein (phenylalanine, tyrosine, and tryptophan) have characteristic π → π* side chain absorptions between 300 and 250 nm in the near-UV region. These residues also produce signals in the shorter wavelength far-UV region, where the peptide backbone transitions are found, but their per-molecule contributions in the far-UV are very small compared to the peptide absorption in this region, and consequently can generally be ignored. The near-UV absorptions of these residues are affected by the local environment and are therefore sensitive to the protein tertiary structure. Disulfide bonds have two n → σ* transitions that also give rise to a broad low-magnitude peak in this range, or separate peaks depending on the dihedral angle of the disulfide [6].

6.1.1.1.3 Units and equations

The magnitude of a CD spectrum depends upon the concentration of the sample and the cell pathlength. For meaningful spectral comparisons or quantitative analyses of secondary structure to be carried out, spectra should be normalized to units that remove these factors from the equation. This section will discuss the derivation of these from the raw CD signal.


Just as in absorption spectroscopy, the signal follows the Beer–Lambert law:

$$A_{\lambda} = \varepsilon_{\lambda}\, c\, l \tag{6.1}$$

where A_λ is the absorbance at a given wavelength, ε_λ is the extinction coefficient at that wavelength (M⁻¹ cm⁻¹), c is the molar concentration, and l is the optical cell pathlength in cm. However, since we are dealing with the difference between the absorbance of left and right CPL, the CD equivalent is:

$$A_{L}(\lambda) - A_{R}(\lambda) = \Delta A = \left(\varepsilon_{L}(\lambda) - \varepsilon_{R}(\lambda)\right) c\, l = \Delta\varepsilon\, c\, l \tag{6.2}$$

where L and R refer to left- and right-circularly polarized light. For historical reasons, data from the instrument will normally be presented as ellipticity (θ) in units of millidegrees, a quantity that was originally associated with optical rotatory dispersion measurements and is related to the difference in absorption by:

$$\Delta A = \theta / 32\,980 \tag{6.3}$$

with θ in millidegrees (equivalently, θ in degrees is 32.98 ΔA). In terms of the difference in extinction coefficients, we have molar ellipticity [θ], which has units of deg cm² dmol⁻¹:

$$[\theta] = 3298\, \Delta\varepsilon \tag{6.4}$$

When measuring protein spectra, it is convenient to convert the molar concentration to mg/mL because the CD signal is dependent upon the number of peptide chromophores rather than on the number of protein molecules. For the same reason, molar ellipticity can be replaced by mean residue ellipticity [θ]MRE, which has units of deg cm² dmol⁻¹ residue⁻¹. In the literature, spectra will usually be reported either in [θ]MRE, with peaks in the order of 10³–10⁴, or (as in the spectra shown in this chapter) in units of Δε per residue, which give values that are a factor of 3298 smaller. Conversion from machine units of mdeg to [θ]MRE is as follows:

$$[\theta]_{MRE} = \frac{\theta \times MRW}{c\, l} = \Delta\varepsilon \times 3298 \tag{6.5}$$

where θ is the measured ellipticity in mdeg, MRW is the mean residue weight (molecular weight of the protein/number of peptide bonds in the protein), c is the concentration in mg/mL, and l is the optical pathlength in mm. Note that the number of peptide bonds = (number of residues − 1).
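The conversion in Eq. (6.5) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and the protein in the usage note (11,000 Da, 101 residues) is an invented example chosen for round numbers, not a measured sample.

```python
# Minimal sketch of Eq. (6.5); function names are illustrative assumptions.

ELLIPTICITY_FACTOR = 3298.0  # factor relating [theta] and delta-epsilon (Eq. 6.4)


def mean_residue_weight(molecular_weight_da, n_residues):
    """MRW = molecular weight / number of peptide bonds (residues - 1)."""
    if n_residues < 2:
        raise ValueError("a polypeptide needs at least two residues")
    return molecular_weight_da / (n_residues - 1)


def mdeg_to_mre(theta_mdeg, mrw, conc_mg_per_ml, path_mm):
    """Mean residue ellipticity from machine units (Eq. 6.5).

    theta in mdeg, concentration in mg/mL, pathlength in mm.
    """
    return theta_mdeg * mrw / (conc_mg_per_ml * path_mm)


def mre_to_delta_epsilon(mre):
    """Delta-epsilon per residue is a factor of 3298 smaller than [theta]MRE."""
    return mre / ELLIPTICITY_FACTOR


# Hypothetical example: 11,000 Da protein with 101 residues,
# -10 mdeg measured at 1.0 mg/mL in a 0.1 mm cell.
mrw = mean_residue_weight(11000.0, 101)
mre = mdeg_to_mre(-10.0, mrw, 1.0, 0.1)
delta_eps = mre_to_delta_epsilon(mre)
```

Keeping the pathlength in mm and the concentration in mg/mL, as in Eq. (6.5), avoids a separate factor of 10 that appears when the pathlength is expressed in cm.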

6.2 Instrumentation

There are a number of bench-top CD instruments on the market from various manufacturers (see Sections Technology Availability and Further Reading). They present unique specifications to compete for commercial advantage and provide different data formats, but all are capable of producing both CD and high-tension (HT) - sometimes called


FIG. 6.6 (Top): Schematic diagram of the light path in a CD spectrometer after the light has passed through the monochromator. Linearly polarized light is decomposed into left- and right-circularly polarized light (CPL) components (LCPL and RCPL) by the photoelastic modulator (PEM). CPL is differentially absorbed by the optically active sample and then the transmitted light is incident upon the detector. The signal is decomposed into two signals: VAC is the difference between the intensities of LCPL and RCPL at the detector and VDC is the average light intensity over time. CD = (VAC/VDC). (Bottom): Showing that as absorption increases, VDC (dashed line) decreases. This is offset by an increase in the high-tension (HT) or dynode voltage (solid line).

high-voltage or dynode voltage - measurements. The basic instrument designs of all are essentially the same, with the main components being a light source, a polarizer, a modulator, a sample chamber, and a detector (Fig. 6.6 top).

6.2.1 General setup of a bench-top instrument

The standard light source is a Xenon arc lamp, which emits a high and relatively constant light flux from the infrared, through the visible, to the UV wavelengths. In the case of a synchrotron radiation circular dichroism (SRCD) beamline, the design is very similar except that the intense light flux from the synchrotron is used as the light source [7]. The system is purged with nitrogen to prevent UV-generated ozone from damaging the optics, and this also allows measurements below 200 nm, which would otherwise be prevented by the absorption of O2. The CD spectrum is measured across the wavelength range incorporating the transitions of interest; a monochromator selects a narrow bandwidth (<1 nm) in user-defined step sizes, which are typically 1 nm for protein samples. The "monochromatic" light is then linearly polarized in the vertical or horizontal plane before passing through a photoelastic modulator (PEM). This is a piezoelectric device with an optical element made from a material with an isotropic refractive index, usually amorphous quartz, which is alternately stretched and compressed at a frequency of 50 kHz. When fully stretched, the refractive index parallel to the direction of stress is altered relative to the refractive index perpendicular to the direction of stress, such that the optical element acts as a quarter-wave retarder transmitting CPL of one sign. When compressed, the optical element transmits CPL of opposite sign and, although all


variations of elliptical polarizations in between are produced, a lock-in amplifier tuned to the frequency of the PEM ensures that only the CPL is monitored downstream. The CPL passes through the optically active sample under investigation before striking the photomultiplier tube (PMT), which converts the incident photons into a cascade of electrons, thereby producing a measurable current. The signal from the PMT contains two components: a direct current (DC) component that represents the average total photon flux over time (i.e., over a number of oscillations of the PEM) and an alternating current (AC) component (about 10⁴ times smaller) with an amplitude that is proportional to the difference between the intensities of the left- and right-CPL that emerge from the sample. The CD signal is derived from the signal thus:

$$CD = \left(V_{AC} / V_{DC}\right) G \tag{6.6}$$

where G is an instrument-dependent constant. The ratio between the two components needs to be held constant as the scan proceeds to maintain the scale of the CD signal. This presents a problem because the total absorption of the sample changes and inevitably increases toward lower wavelengths. To maintain a steady DC current, an external voltage that exactly counteracts the changes in photon flux at the PMT is applied (Fig. 6.6 bottom). This is the HT signal, and monitoring this (as discussed in Section 6.4.3) is important for quality control, since its value is proportional to the sample absorbance and it can be used to indicate the validity of the CD signal. When the HT approaches a predefined maximum level, which is instrument-dependent (usually 500–600 mV), the CD signal becomes progressively noisy and distorted as there is an insufficient number of photons reaching the detector.

Modern CD spectropolarimeters are equipped with thermoelectrically controlled sample chambers as standard and most are able to accommodate both round and rectangular sample cells. Round cells can be important if the sample is a scattering one, as they permit transmission of light isotropically as it exits the cell window. Accessories such as stopped flow, titration, and multiple sampling systems are available from commercial vendors as optional extras, along with facilities for performing fluorescence and CD simultaneously or fluorescence-detected CD measurements.
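As an illustration of how the HT signal can be used as a validity check, the sketch below trims a scan at the point where the HT first reaches a user-supplied ceiling. It is a simplified assumption of how such a check might be coded, not a vendor procedure: the function names are hypothetical and the ceiling value must be taken from the specific instrument.

```python
# Minimal sketch: discard CD points recorded once the HT has reached its ceiling.
# `ht_max` is instrument-dependent and must be supplied by the user.

def low_wavelength_cutoff(wavelengths_nm, ht_values, ht_max):
    """Return the lowest wavelength reached before the HT first hits ht_max.

    The scan is examined from long to short wavelength, because absorbance
    (and hence HT) generally rises toward lower wavelengths.
    Returns None if even the longest wavelength is at or above the ceiling.
    """
    cutoff = None
    for wl, ht in sorted(zip(wavelengths_nm, ht_values), reverse=True):
        if ht >= ht_max:
            break
        cutoff = wl
    return cutoff


def trim_spectrum(wavelengths_nm, cd_values, ht_values, ht_max):
    """Keep only (wavelength, CD) pairs at or above the HT-derived cutoff."""
    cutoff = low_wavelength_cutoff(wavelengths_nm, ht_values, ht_max)
    if cutoff is None:
        return []
    return [(wl, cd) for wl, cd in zip(wavelengths_nm, cd_values) if wl >= cutoff]


# Hypothetical scan (values invented for illustration only).
wl = [200, 195, 190, 185, 180]
cd = [-5.0, -2.0, 4.0, 6.0, 3.0]
ht = [300, 350, 450, 580, 700]
valid = trim_spectrum(wl, cd, ht, ht_max=550)
```

Stopping at the first point that reaches the ceiling, rather than filtering point by point, avoids keeping isolated low-wavelength values that happen to fall below the threshold after the signal has already become unreliable.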

6.3 Data generated

6.3.1 Types of data generated

Compared to techniques such as NMR and X-ray crystallography, CD spectra have only modest information content. They provide structural information at the level of the net secondary structure types present, generally without reference to the topology or quaternary structure of the protein, although in some cases they can provide information regarding the tertiary structure. However, CD has a number of advantages: measurements are rapid, only small amounts of material are used, and proteins of any size can be examined in solution under ambient and physiologically relevant conditions rather than in the extreme conditions and high concentrations required for crystallization or NMR spectroscopy. Using rapid flow methodologies, the dynamics of protein folding or kinetics can also be monitored.


Near-UV CD provides information on the environment of aromatic side chains and disulfide bonds, and this can give some insight into the tertiary structure, for example, distinguishing between native and molten globule states, or folded and unfolded forms, but the signal tends to be qualitative rather than quantitative. The far-UV spectrum reflects the secondary structure composition and can be used to quantify the secondary structural elements (such as helix, sheet, turns, and disordered residues) present. Mutations, ligand binding, and environmental changes may cause structural changes detectable by CD spectroscopy. Stopped-flow measurements with a dead time of a few milliseconds, or continuous-flow measurements, can be used to monitor changes in tertiary and quaternary structures at appropriate wavelengths in the near-UV. Changes in secondary structure can sometimes be followed in the far-UV wavelength region if the buffer and sample conditions permit measurements [8]. If the difference between the native spectrum and the mutant or ligand-bound spectrum is too small to resolve, it may be possible to detect changes in the protein stability as a function of either heat or chemical denaturing conditions leading to the unfolding of the protein. Denaturation studies can be carried out by either measuring entire spectra or monitoring the unfolding at selected wavelengths [9].

6.3.2 Quantitative assessment of information content (as a function of wavelength)

Secondary structure content can be quantified from the far-UV CD spectrum of a protein. The amount of information that can be accurately extracted depends upon the number of electronic transitions that the spectrum incorporates and hence on the lowest wavelength included. Using principal component analysis [10] it was demonstrated that spectra which reach no lower than 200 nm can be described by two eigenvectors, allowing an estimation of helix content only. This increases to three or four eigenvectors when the spectrum is extended to 190 nm and six to eight by 170 nm [3], with correspondingly more secondary structural types emerging from the analysis, including α-helices, antiparallel β-sheets, parallel β-sheets, β-turns of all types, and aperiodic (disordered) structures [11]. To reap the most benefit from far-UV CD, the objective is, therefore, to obtain data to the lowest wavelength possible; methods for achieving this without compromising the quality of the sample or data are discussed in the next section.

6.4 Guide to collecting good data

6.4.1 Amount of time required to make measurements

With most biophysical techniques, sample preparation is far more time-consuming than data collection, and CD spectroscopy is no exception. A number of quick test runs may be required to find the optimum sample cell pathlength and instrument parameters, and then collection of a single spectrum over the wavelength range from 280 to 180 nm, comprising three repeat scans (or six accumulations, depending on the type of instrument), will typically


take less than 30 min. Time spent in cleaning and reloading the cell may add another 5 min, and therefore an assay of 10 different conditions would take ~8–10 h, assuming that 10 different baselines are required, which is not necessarily the case if more than one sample has the same buffer and additive components. Once a spectrum and baseline have been collected, there is the opportunity to process and analyze the data while the next sample is being measured, so at the end of one day a full set of results can be produced, ready for interpretation.

CD spectroscopy has been a low-throughput and manual method because the high quality of sample cells and the strict conditions required to collect comparable data generally make the use of microplate-reading techniques unreliable. However, low-birefringence flow-through cells combined with robotic liquid handling systems that improve the quality of automated data collection are currently being developed (see Ref. [12] for example). Samples are transferred from 96-well plates to the flow-through cell, which is flushed with cleaning solution between measurements. In another study [13] (albeit using a synchrotron light source), direct measurements of the reproducibility of repeat measurements from a 96-well plate show potential for future high-throughput studies. These developments will help to automate the technique and make it more suitable for pharmaceutical discovery and quality control purposes.

6.4.2 Calibration schedule

All bench-top instruments should have an annual maintenance schedule; if they are relatively new this will be part of the warranty and thus will be carried out by a qualified instrument technician. However, lamps that have a lifetime of up to ~1000 h are easily replaced by users. In addition, there are two calibration procedures that should regularly be undertaken by users to ensure that spectra are both internally consistent from day to day and are also compatible with those taken on any instrument [14,15]. Such calibrations are also essential to enable the user to compare spectra with standard spectra available in the Protein Circular Dichroism Data Bank (PCDDB) [16–18].

6.4.2.1 Wavelength calibration

Misalignment of the monochromator can produce small shifts in the wavelength scale, which are manifested as shifts in CD-peak position and can have a significant impact on secondary structure analyses [19]. There are a number of commercially available standards that produce sharp absorption peaks across the appropriate wavelength range, including holmium oxide liquid or glass filters, both traceable, with useful peaks between 630 and 219 nm. If the instrument is equipped to provide a bandwidth of 0.1 nm, another accurate calibration compound, cheap and available at most facilities, is benzene. One drop placed in the bottom of a stoppered 1 cm quartz cell will generate vapor that produces an HT spectrum similar to that shown in Fig. 6.7. If the absorption peaks are more than 0.5 nm outside of the accepted value, the wavelength axis should be adjusted in the instrument; failing that, all spectra could be shifted appropriately (and the shift noted) using dedicated spectroscopy software or spreadsheets until the instrument is recalibrated.
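The wavelength check described above amounts to comparing measured peak positions of a standard with their accepted positions. The sketch below is one hypothetical way to code it; the function names are assumptions, and the peak positions in the usage example are placeholders rather than certified values for any standard.

```python
# Minimal sketch of a wavelength-calibration check against a standard.
# Accepted peak positions must come from the certificate of the standard used.

def peak_deviations(measured_peaks_nm, accepted_peaks_nm):
    """Per-peak deviation (measured minus accepted), in nm."""
    if not measured_peaks_nm or len(measured_peaks_nm) != len(accepted_peaks_nm):
        raise ValueError("peak lists must be non-empty and of equal length")
    return [m - a for m, a in zip(measured_peaks_nm, accepted_peaks_nm)]


def needs_recalibration(deviations_nm, tolerance_nm=0.5):
    """True if any peak lies more than the tolerance from its accepted value."""
    return any(abs(d) > tolerance_nm for d in deviations_nm)


def shift_wavelengths(wavelengths_nm, offset_nm):
    """Interim correction: subtract a constant offset from the wavelength axis."""
    return [wl - offset_nm for wl in wavelengths_nm]


# Placeholder numbers for illustration only (not real reference peaks).
accepted = [250.0, 300.0]
measured = [250.8, 300.6]
deviations = peak_deviations(measured, accepted)
mean_offset = sum(deviations) / len(deviations)
flag = needs_recalibration(deviations)
corrected_axis = shift_wavelengths([250.8, 300.6], mean_offset)
```

Applying a single mean offset assumes the wavelength error is constant across the range; if the per-peak deviations differ markedly, instrument recalibration is the safer course.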


FIG. 6.7 Absorption spectrum of benzene vapor (measured on station UV1 at ISA, Aarhus University, Denmark). The signal is monitored using the HT signal [7].

6.4.2.2 Calibration for magnitude and polarization

An even more important calibration procedure involves correcting for spectral magnitude and adjusting for incorrect light polarization. On commercial instruments, a fresh calibration spectrum should be obtained every few months or following any optical adjustments, such as lamp replacement. There are two compounds in general use: d-10-camphorsulfonic acid (CSA), which is available from many chemical distributors, and ammonium d-10-camphorsulfonate (ACS), which is harder to obtain but less hygroscopic and thus easier to weigh out accurately. Solutions of CSA and ACS are light sensitive and should be stored at 4 °C in the dark for no longer than a few months. Both dissolve to give the same ion, which generates two peaks, one positive at ~290.5 nm in the near-UV and the other, negative, at ~192.5 nm in the far-UV (Fig. 6.8). The Δε of the 290.5 nm peak has been determined as 2.36 M⁻¹ cm⁻¹ [20] and can be used to adjust the magnitude of either the instrument or the spectra obtained on the instrument. The absolute value for the second peak is not accurately defined since it lies at the lower wavelength limit of most, and especially older, commercial instruments. It must also be measured in short pathlength cells. The ratio between the magnitudes of the 192.5 nm and the 290.5 nm peaks should be between 1.9 and 2.2, and it follows that protein spectra measured on instruments with different ratios will show corresponding differences in shape. One remedy for this is to assume that the 192.5 nm peak has twice the magnitude of the 290.5 nm peak and use the data to create a

FIG. 6.8 SRCD spectrum of the standard compound CSA (measured on beamline CD1, ISA, Aarhus University, Denmark). The solid line is the CD signal and the dashed line is the HT signal. [Plot axes: CD (mdeg) and HT versus wavelength (nm), 170–320 nm.]


FIG. 6.9 Spectrum of lentil lectin produced by an uncalibrated instrument before (solid line) and after (dashed line) calibration using a CSA standard. [Plot axes: CD (mdeg) versus wavelength (nm), 170–270 nm.]

calibration curve with which to adjust the measured protein spectra [21]. The difference between uncalibrated and calibrated spectra may be minimal or, as in Fig. 6.9, quite significant.

The calibration protocol requires a solution of approximately 6 mg/mL with a 0.1 mm pathlength optical cell; the concentration is obtained from the UV absorbance at 285 nm, where the extinction coefficient is 34.6 M⁻¹ cm⁻¹ [22]. A CD spectrum is measured from 350 to 185 nm with a step size of 0.5 nm. The expected magnitude of the peak at 290.5 nm is calculated from the following equation:

$$\theta = \frac{\Delta\varepsilon \times 3298 \times c \times l}{MW} \tag{6.7}$$

where θ is in mdeg, Δε = 2.36 M⁻¹ cm⁻¹, c is the concentration in mg/mL, l is the pathlength in mm, and MW is the molecular weight of CSA, which is 232.3 g/mol. Hence a solution of 6 mg/mL will give a peak of 20.1 mdeg at 290.5 nm and of around −38 to −44 mdeg at 192.5 nm.

Although CSA and ACS have been used as the "gold standard" for more than 30 years, there is no source of certified reference material and the existing reference values are derived from a literature consensus, thus making these materials untraceable. CSA is hygroscopic, so gravimetric measurements of the solid are unreliable, although the extinction coefficient at 285 nm has been verified [22]. Additionally, only one enantiomer is available, so there is no way to determine the symmetry of polarization around zero. Other standards such as D-pantolactone have also been proposed; however, they suffer from the fact that their concentration, and hence the magnitude of their CD signal, cannot be established from the absorption spectrum [21,23]. There is a candidate compound under development that will address all of these limitations, Na[Co(EDDS)]·H2O (EDDS = N,N′-ethylenediaminedisuccinic acid), which can be produced as two enantiomers that give rise to nine Gaussian peaks between 750 and 178 nm. Inter-laboratory comparisons have been carried out and the compound should be commercially available in the future.
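The arithmetic behind Eq. (6.7) can be checked with a short script. This is a sketch under stated assumptions: the function names are hypothetical, the absorbance reading in the example is invented, and the A285 measurement is assumed to be made in a 1 cm cell (the text does not specify the cell used for the absorbance reading).

```python
# Minimal sketch of the CSA magnitude calculation (Eq. 6.7).
# Constants are the values quoted in the text.

CSA_MW = 232.3             # g/mol
CSA_DELTA_EPS_290 = 2.36   # M^-1 cm^-1 at 290.5 nm
CSA_EPS_285 = 34.6         # M^-1 cm^-1, absorbance extinction coefficient
ELLIPTICITY_FACTOR = 3298.0


def csa_conc_mg_per_ml(a285, path_cm=1.0):
    """CSA concentration from the absorbance at 285 nm (Beer-Lambert law)."""
    molar = a285 / (CSA_EPS_285 * path_cm)
    return molar * CSA_MW  # g/L is numerically equal to mg/mL


def expected_peak_mdeg(conc_mg_per_ml, path_mm):
    """Expected ellipticity of the 290.5 nm peak in mdeg (Eq. 6.7)."""
    return (CSA_DELTA_EPS_290 * ELLIPTICITY_FACTOR
            * conc_mg_per_ml * path_mm / CSA_MW)


def magnitude_correction(measured_mdeg, expected_mdeg):
    """Scale factor by which measured spectra would be multiplied."""
    return expected_mdeg / measured_mdeg


# The worked case from the text: 6 mg/mL in a 0.1 mm cell.
expected = expected_peak_mdeg(6.0, 0.1)
# Hypothetical absorbance reading of 0.894 in a 1 cm cell.
conc = csa_conc_mg_per_ml(0.894)
```

A single multiplicative factor corrects magnitude at 290.5 nm only; it does not address an incorrect ratio between the two CSA peaks, which is handled by the calibration-curve approach described in the text.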

6.4.3 Data collection protocols

6.4.3.1 Protein concentration

With well-calibrated CD instruments, one of the main sources of error arises from the protein concentration measurements. There are a number of methods commonly used such as


the biuret, Lowry, bicinchoninic acid assay (BCA), and Coomassie blue binding; they rely on the absorption of exogenous chromophores bound to the protein, either via the peptide group or certain amino acid side chains. The biuret method gives a uniform response across all proteins but may use up to 5 mg of protein. Although the other methods are much more sensitive, requiring <100 µg, the reactions can vary between proteins depending on their amino acid composition and binding affinities, and there is also the possibility of interference from cofactors and buffer constituents [24]. In general, these methods are not accurate enough to produce accurate spectral magnitudes, although they may be sufficient for producing reproducible spectra for a specific protein.

A more accurate method is to measure the combined absorptions of the protein at 280 nm and then calculate the concentration from the extinction coefficients based on the tryptophan, tyrosine, and disulfide bond content using the following formula:

$$A_{280}\,(1\ \mathrm{mg\,mL^{-1}},\ 1\ \mathrm{cm\ pathlength}) = \frac{5690\,n_w + 1280\,n_y + 60\,n_c}{MW} \tag{6.8}$$

where n_w, n_y, and n_c are the numbers of tryptophans, tyrosines, and cystines, respectively, in the polypeptide and MW is its molecular weight in Da. These values are for the denatured protein in 6 M guanidinium HCl [25]. Alternatively, the amino acid sequence can be used with the ProtParam website [26] to calculate the extinction coefficient of the native protein (with and without disulfide bridges) in water.
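Eq. (6.8) can be evaluated with a short script, sketched below. The function names are hypothetical and the protein composition in the example is invented for illustration; the coefficients default to those quoted in Eq. (6.8) but are passed as an argument so that a different published set can be substituted.

```python
# Minimal sketch of the A280 concentration calculation (Eq. 6.8).

def a280_of_1_mg_per_ml(n_trp, n_tyr, n_cystine, mw_da,
                        coeffs=(5690.0, 1280.0, 60.0)):
    """Absorbance at 280 nm of a 1 mg/mL solution in a 1 cm cell.

    coeffs are the per-chromophore molar extinction coefficients
    (Trp, Tyr, cystine) in M^-1 cm^-1; defaults follow Eq. (6.8).
    """
    eps_trp, eps_tyr, eps_cys = coeffs
    molar_extinction = eps_trp * n_trp + eps_tyr * n_tyr + eps_cys * n_cystine
    return molar_extinction / mw_da


def conc_mg_per_ml(a280_measured, a280_1_mg_per_ml, path_cm=1.0):
    """Protein concentration from a measured A280 (Beer-Lambert law)."""
    return a280_measured / (a280_1_mg_per_ml * path_cm)


# Hypothetical protein: 2 Trp, 5 Tyr, 1 cystine, 20,000 Da.
a280_unit = a280_of_1_mg_per_ml(2, 5, 1, 20000.0)
conc = conc_mg_per_ml(0.446, a280_unit)
```

Dividing the molar extinction coefficient by the molecular weight works because a 1 mg/mL solution is numerically 1/MW molar.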
For the native protein in water, the extinction coefficients are slightly different:

$$A_{280}\,(1\ \mathrm{mg\,mL^{-1}},\ 1\ \mathrm{cm\ pathlength}) = \frac{5500\,n_w + 1490\,n_y + 125\,n_c}{MW} \tag{6.9}$$

The A280 method is convenient and nondestructive. Where sample is scarce, small-volume cells for use with conventional UV/visible spectrophotometers are commercially available; for samples that are particularly precious or in small abundance, microspectrophotometer instruments specifically designed to measure volumes of 1–2 µL (such as the Nanodrop) can be used. However, for any of these methods, there are a number of potential limitations. Firstly, the protein must contain one or more tryptophans in order to produce accurate results. The values in Eq. (6.8) were calibrated against the measured extinction coefficients from 80 proteins, giving a standard deviation of 3.2% for proteins containing tryptophan, compared to a standard deviation of >10% for proteins lacking tryptophan [27]. Secondly, the contributions from other chromophores must be taken into account, including the A280/A260 ratio for nucleic acids. Thirdly, samples that exhibit light scattering (such as membranes and fibers) can cause a wavelength-dependent increase in the baseline absorbance. This may also be due to undissolved, suspended particles that can sometimes be removed by centrifugation or filtering. Phospholipid vesicles used in studies of membrane proteins cause scattering that can be eliminated by adding 10% of the ionic detergent sodium dodecylsulfate (SDS) to dissolve the vesicles, although this renders the sample irretrievable for conformational studies.

If the light intensity at the CD instrument detector is known, then the absorption of the protein at 205 nm can be used in a similar vein. The advantages of this method are that firstly, it does not rely on the presence of tryptophans in the primary sequence, and secondly, the absorption at 205 nm is compatible with the cell pathlengths used for measuring CD spectra

II. The selected biophysical tools in the biopharmaceutical industry



and most modern benchtop instruments enable the measurement of absorption and CD spectra simultaneously. The following formula [28] can be used to derive the absorption from the HT signal and the synchrotron ring current at SRCD beamlines:

$$A = \log\left(RC_s/RC_{bl}\right) + \log\left[\left(HT_s^{\,a} + b\right)/\left(HT_{bl}^{\,a} + b\right)\right]$$ (6.10)

where $RC_{bl}$ is the value of the synchrotron ring current when the baseline is measured and $RC_s$ is the value of the synchrotron ring current when the sample is measured; $HT_s$ and $HT_{bl}$ are the corresponding HT signals. The parameters $a$ and $b$ are detector dependent and can be obtained from the beamline scientists in charge. The extinction coefficient of the protein is calculated from the values derived in Ref. [29]. This method will be subject to the same problems as the A280 method, such as light scattering artifacts. The baseline must contain the same constituents as the sample minus the protein, the HT must not be near saturation at 205 nm, and it is assumed that the absorption of the baseline-subtracted spectrum is zero at wavelengths above 250 nm. The most accurate method for protein concentration determination is quantitative amino acid analysis, in which the protein is hydrolyzed in 6 M HCl and the amounts of stable and abundant amino acids present are measured against an internal amino acid standard. This is time-consuming and expensive, especially if there are no in-house facilities; however, it can be used to standardize the colorimetric methods for an individual protein.

6.4.3.2 Optical cells

The highest quality quartz cells are made of Suprasil. In general, they are either rectangular or round, and consist of either a single piece with a loading port in the top or two demountable plates, one of which has a beveled edge producing a thin well. Cells of pathlengths <0.01 cm are almost always of the demountable type. This design facilitates cleaning, which is problematic for single-piece cells with short pathlengths.
A typical cleaning protocol is as follows: cells are rinsed in sample buffer (in which the protein is soluble) followed by detergent, ethanol (not acetone!), and deionized water. They are then blown dry with nitrogen gas or polished with lint-free lens tissue. Following high-temperature experiments, protein that has precipitated onto the walls of the cell may require extensive washing in detergents and/or concentrated nitric or chromic acid.

6.4.3.2.1 Pathlength

CD sample cells can be obtained with pathlengths ranging from 0.0005 to 10 cm, accommodating protein concentrations of 20–0.001 mg/mL; however, as the pathlength increases so does the absorption of the sample medium, along with the low-wavelength cutoff of the CD spectrum. The total absorption of the system, i.e., sample, buffer, and optical cell, should not exceed 1.0 within the wavelength range of interest or the photon flux at the detector will be too low to accurately measure the CD signal. This is discussed further in Section 6.4.3.5.2 and in detail in Refs. [7,8]. Furthermore, for good secondary structure analysis, far-UV CD spectra should be measurable to 190 nm or beyond and this requires pathlengths not in excess of 0.2 mm, and much shorter if possible. Thus, the protein concentration should be above 0.05 mg/mL, which is optimal for a 0.2 mm pathlength cell, and given that highly absorbing buffers are the norm, concentrations of 1–10 mg/mL (optimal concentrations for 0.01 and 0.001 cm pathlength cells, respectively) are recommended. Monitoring the aromatic CD signal in the near-UV requires that the pathlength for a given concentration should be between 200 and 1000 times longer, i.e., 2–10 mg/mL in a 1 cm cell. Table 6.1 shows a selection of cells of increasing pathlength and recommended protein concentrations. Shorter pathlength cells require smaller volumes so, whereas a 0.1 cm pathlength quartz cell may require 300 µL of sample at 0.1 mg/mL, a 0.001 cm pathlength cell requires 14–20 µL (depending on whether it is cylindrical or rectangular) at 3–10 mg/mL. In order to achieve very low wavelength data (normally with SRCD spectroscopy), special short pathlength cells made of calcium fluoride have been designed [30].

TABLE 6.1 Recommended relationships between pathlength (PL) and protein concentration for CD measurements. These are values for far-UV CD. To obtain a suitable signal in the near-UV, all concentrations or pathlengths must be >100-fold higher.

Cell PL (cm)            0.001    0.002    0.005    0.01     0.1
Protein conc (mg/mL)    3–10     2–5      1–2      0.3–1    0.03–0.1

6.4.3.2.2 Loading optical cells

Cells should be entirely clean before use, since any residual protein will produce a small background CD spectrum and any other components may simply add to the nonchiral background noise. Loading of single-piece cells can be straightforward, but care must be taken to prevent the introduction of bubbles into the optical path, and the cells should always be placed with the same surface closest to the detector for reproducibility. Demountable cells must be handled more carefully. Firstly, new users are advised to practice cell assembly with water until they are confident that they can load the sample without incorporating air bubbles. When the two halves are assembled, they should always have the same orientation with respect to each other and the cell holder, and they should be subjected to the same strain. This will ensure that orientation- and strain-dependent birefringent effects are identical in both sample and baseline spectra and will be eliminated when the baseline is subtracted. A specially designed cell holder [30] has been developed for demountable cylindrical cells to ensure that this occurs. Lastly, and obviously, baselines (containing all components present in the sample except the protein) should always be measured in the same cell as the sample spectrum, and as close in time as possible to the corresponding sample spectrum. A video demonstrating cleaning and loading of optical cells is available on YouTube (see Further Reading Section).

6.4.3.2.3 Calibration of optical cells

Although optical cell pathlengths are provided by the manufacturer, cells with pathlengths of less than 0.01 cm should be calibrated and checked prior to initial use, as they can vary substantially from the listed pathlength, which will of course greatly affect the calculated ellipticity values. The pathlength can be determined from an interference pattern produced by the internal reflection of light in an empty cell [19]. A transmission spectrum of the empty cell is measured from ~800 to 400 nm on a UV/visible spectrophotometer and, unless the cell is misaligned or distorted, an interference fringe pattern similar to Fig. 6.10 will be produced. The wavelength values for two fringes separated by 10–20 fringes are used with the following equation to derive the pathlength in µm:

$$\left(\frac{n\,(w_1 \times w_2)}{2\,(w_1 - w_2)}\right)/1000$$ (6.11)

where $w_1$ is the selected high-wavelength fringe (in nm), $w_2$ is the low-wavelength fringe (in nm), and $n$ is the number of fringes between them, counting $w_1$ as 0 and $w_2$ as $n$. For example, fringes at 800 and 500 nm that are 15 fringes apart give (15 × 800 × 500/(2 × 300))/1000 = 10 µm, i.e., 0.001 cm. It is advisable to obtain fringe data with a number of cell orientations to estimate the error of this measurement; to this effect, cylindrical cells can be rotated in the direction perpendicular to the light path whereas rectangular cells can be inverted. In general, as cell pathlengths decrease, the errors in reported values tend to increase, going from 1% to 2% for pathlengths of 0.01 cm to as much as 50% for pathlengths of <0.002 cm [19]. Errors in the pathlength of cells >0.01 cm are usually minimal but it is always good practice to calibrate these using appropriate concentrations of potassium chromate in 0.01 M NaOH ($\varepsilon_{372\,\mathrm{nm}} = 4830\ \mathrm{M^{-1}\,cm^{-1}}$) [31].

FIG. 6.10 Interference fringe pattern obtained by scanning an empty sample cell (with a nominal pathlength of 0.001 cm) with a UV/VIS spectrometer in transmission mode. The pathlength of this cell, calculated as described in the text, is actually 0.0015 ± 0.0003 cm [19].

6.4.3.3 Choice of sample conditions

Another factor affecting the overall absorbance of the system is the choice of sample medium. Buffers based on carboxyl or sulfonate groups such as citrate, acetate, HEPES, and MES absorb strongly below 195 nm and concentrations should generally be less than 20 mM in order to obtain low wavelength cutoff spectra. Buffers based on phosphates and borates can be used in the pH range of 6.5–9.5, although phosphate-buffered saline (PBS) should be avoided since chloride ions also absorb strongly in the far-UV wavelength range. Similarly, Tris HCl should be used at low concentrations (~20 mM), or sulfate used instead of HCl for acidification. Use of salts such as NaF or Na2SO4 rather than NaCl when possible (i.e., when the protein is compatible with these) will also improve the spectra. If substitution or


FIG. 6.11 HT signal (proportional to absorbance) of some buffers and salts measured in a 0.01 cm pathlength optical cell. The cutoff above which the CD signal can no longer be measured on this instrument is 5 HT units. The HT signal with a cutoff above 200 nm is due to the presence of 6 M guanidinium HCl.

low concentrations are not practical, the highest possible protein concentration should be achieved (without causing aggregation or precipitation) to reduce the pathlength required for obtaining a spectrum with a good signal-to-noise level. Fig. 6.11 shows the HT signal produced by selected buffers and salts at 50 mM concentration in a 0.01 cm cell. Six molar guanidinium HCl is included since it is commonly used in unfolding assays. Both guanidinium and urea absorb strongly below 210 nm and it is only possible to monitor high-wavelength regions of the spectra in such experiments. Other additives such as detergents and ligands should be tested in the buffer prior to preparing a sample with these present, in order to determine their compatibility with CD measurements.

6.4.3.4 Instrument settings

Far-UV protein CD spectra should be measured starting at a high wavelength of 280 nm (or in some cases, such as when the protein contains a large number of aromatic amino acids, 300 nm), which is well above the wavelengths where the peptide transition peaks occur: this produces a long region where the protein spectrum and its baseline should exactly overlay (see Section 6.5). Since the aromatic CD signals in this region are several orders of magnitude smaller than the peptide signals, they should not contribute a detectable signal at these wavelengths. If the sample and baseline spectra do not match in this region, this is an indication of potential problems that should be investigated. The lowest wavelength selected should ideally be above that indicated by the saturation point of the HT signal. For accurate secondary structure analyses, data down to at least 190 nm are generally required. Most instruments either run step scans or have the option of step-scan or continuous-scan mode. In step-scan mode, there are four main adjustable parameters. (1) The bandwidth, which determines the precision with which the monochromator selects wavelengths. For routine scans 1 nm is sufficient but this can be reduced to resolve smaller spectral features or increased to 2 nm to allow more light to fall on the sample when the overall absorbance is high; on some instruments the bandwidth automatically increases as the sample absorbance increases. (2) The dwell or averaging time, which is the length of time that data are collected at each wavelength, usually around 1–2 s, with longer periods producing smoother data. Once again, some instruments can be programmed to automatically increase the dwell time in response to sample absorbance, thereby increasing the signal-to-noise ratio as the signal deteriorates. (3) The step size, which refers to the wavelength difference between successive data points; normally 1 nm is adequate to resolve protein CD peaks but this can be reduced along with the bandwidth to resolve narrower features. (4) The number of repeat scans. At least three repeats of both baseline and sample should be measured to eliminate the possibility of artifacts in the data. During processing the replicate scans are averaged before subtraction of the baseline, and the instrumental variation (error bars) can be determined from the standard deviations between the measurements at each data point. Parameters available in continuous-scan mode include bandwidth, repeat scans (usually referred to as accumulations), and time constant or response time, which is analogous to the dwell time. However, step size is replaced by scan rate in nm/min. The rate at which data are collected should relate to the other parameters according to Eq. (6.12) to prevent spectral distortion.

$$\text{scan rate} \times \text{time constant} < \text{bandwidth} < \text{width of spectral feature}/10$$ (6.12)
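The two inequalities in Eq. (6.12) can be checked mechanically before a continuous scan is started. The sketch below is illustrative only: the function name is hypothetical, and it assumes the scan rate is entered in nm/min and the time constant in seconds, so the scan rate is converted to nm/s before the product is compared with the bandwidth.

```python
def continuous_scan_settings_ok(scan_rate_nm_per_min, time_constant_s,
                                bandwidth_nm, feature_width_nm):
    """Return True if the settings satisfy both inequalities of Eq. (6.12).

    Assumption: the scan rate is converted from nm/min to nm/s so that
    (scan rate x time constant) is a wavelength interval in nm.
    """
    nm_per_time_constant = (scan_rate_nm_per_min / 60.0) * time_constant_s
    rate_ok = nm_per_time_constant < bandwidth_nm
    bandwidth_ok = bandwidth_nm < feature_width_nm / 10.0
    return rate_ok and bandwidth_ok


# A 10 nm wide feature scanned at 10 nm/min with a 1 s time constant
ok_narrow = continuous_scan_settings_ok(10.0, 1.0, 0.5, 10.0)
# The same scan with a 2 nm bandwidth violates the second inequality
ok_wide = continuous_scan_settings_ok(10.0, 1.0, 2.0, 10.0)
```

Returning a single boolean keeps the check simple; a fuller version might report which of the two conditions failed.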

Hence, resolving a spectral feature with a width of 10 nm requires a bandwidth of less than 1 nm, and if a 1 s time constant is used the scan rate should not exceed 10 nm/min. Another important issue with both step and continuous scans is whether to save the average of the scans (which is the default on many instruments) or the individual files. Saving individual files is crucial for estimating the error levels (see above), and especially for monitoring any spectral changes that occur with time. For example, if a protein aggregates with time, or bubbles develop in the cell, there will be systematic differences between the initial and final CD and HT spectra obtained on that sample. Should such differences be evident, the data should be discarded and the experiment repeated.

6.4.3.5 Sources of interference/troubleshooting

6.4.3.5.1 Signal-to-noise

It is not always possible to optimize the protein concentration for use with the sample cell pathlengths available in the laboratory. For example, highly absorbing buffers or salts may dictate the use of pathlengths shorter than desired, or a limited protein concentration may not be compatible with the available cells, in which case using a shorter-than-optimum cell that still produces a CD signal, albeit a small one, may be the best strategy. A CD signal with a maximum of 3–4 mdeg will be very noisy but salvageable if the averaging time and/or number of repeat scans are increased. The signal-to-noise ratio (S/N) is related to the square root of the acquisition time thus:

$$S/N = (kt)^{1/2}$$ (6.13)

where $k$ is the noise generated by the instrument components and $t$ is time. Therefore, to double the signal-to-noise ratio, the number of repeat scans or the averaging time must be quadrupled, i.e., 12 repeats instead of three, or an 8 s averaging time instead of 2 s. This also applies to the baseline, which can have an exaggerated effect on a small spectrum.
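Because Eq. (6.13) makes S/N proportional to the square root of the acquisition time, the extra time needed for a given improvement grows as the square of the desired gain. The short sketch below makes that arithmetic explicit; the function names are hypothetical, and "time" here means the total acquisition time per data point (averaging time multiplied by the number of repeat scans).

```python
import math


def relative_signal_to_noise(total_time_s, reference_time_s):
    """S/N relative to a reference acquisition, using S/N proportional to sqrt(t)."""
    return math.sqrt(total_time_s / reference_time_s)


def time_factor_for_gain(sn_gain):
    """Factor by which the acquisition time must be multiplied for a given S/N gain."""
    return sn_gain ** 2


# Doubling S/N requires four times the acquisition time
factor = time_factor_for_gain(2.0)
# 12 repeats of 2 s versus 3 repeats of 2 s
gain = relative_signal_to_noise(12 * 2.0, 3 * 2.0)
```

With these definitions, `factor` is 4.0 and `gain` is 2.0, so a tenfold improvement in S/N would need a hundredfold longer acquisition, which is rarely practical.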


FIG. 6.12 CD spectra (black lines) and HT spectra (gray lines) of the same sample, concentration 15 mg/mL, measured in a 0.0020 cm pathlength cell (dashed lines) and a 0.0006 cm pathlength cell (solid lines), and normalized to the same value at 224 nm, illustrating that when the HT is too high, the high intensity peaks can be truncated, producing a distorted and erroneous spectrum.

6.4.3.5.2 HT level

Monitoring the HT signal is essential, not only to determine the low wavelength cutoff of the spectrum but also to ensure that there is no distortion of spectral peaks. In Fig. 6.12 two spectra of the same sample are compared. One is measured in a 0.0006 cm pathlength cell appropriate to the concentration of 15 mg/mL; the other is measured in a cell with a 0.0020 cm pathlength. Both spectra have been scaled to units of Δε and it is obvious that the peaks of the spectrum measured in the long pathlength cell are truncated at wavelengths below 210 nm. This is an artifact arising from the steep increase in the sample absorbance as indicated by the HT signal. The HT cutoff for this instrument is 600 mV and the HT of the truncated spectrum is around 800 mV at 190 nm. In this example, the problem with the distorted spectrum may not be apparent if the HT is ignored. It also serves to illustrate that spectral distortion occurs well below the HT saturation point, which is around 1300 mV on this particular instrument.

6.4.3.6 Record/log keeping: essential details to note

Regular users can rapidly amass a large amount of data; even a single assay of 10 samples and one baseline will comprise 33 raw spectra, several test runs, and possibly a CSA spectrum and its baseline. At SRCD beamlines, data accumulation can be intense, and a 3-day session may produce more than 1000 raw data files. Therefore, good record keeping is absolutely essential and, although instrument software records some settings automatically in the file header and there is usually a provision for adding information before or after measurement, it is advisable to keep separate log sheets, especially with regard to sample identities and conditions.
The following information is the minimum that should be recorded so that the experiment can be repeated if necessary, for efficient and accurate data retrieval, and for archiving, regulatory, and publication purposes: the operator name (so that any queries at a later date can be addressed to the appropriate person); the date and file name; the protein name along with details of any mutations and expression tags; the protein concentration; the buffer constituents and concentrations; the cell ID and pathlength; the instrument settings, including wavelength range, step size, dwell time, and temperature (although these will often be captured by the instrument software); and the relevant CSA calibration file. With regard to file names, the obvious procedure of naming a file after the sample and condition (HBA10, pH10; HBA10, pH 9.5) is not the most useful when collating and retrieving files from a large database. A more efficient system is to number the files produced by all users consecutively, or in sequential order for a single user with their own header (A001, A002, ...; B001, B002, ...) or with a date code (such as 15Jan 1, 8001); these should be cross-referenced to the log sheet.

6.4.3.7 Sources of errors in data collection

Following the protocols presented above should enable the collection of reliable data. Below is a summary of the main potential sources of errors, roughly in order of decreasing effect:

1. Cell pathlength, especially for cells of pathlength <0.01 cm.
2. Protein concentration measurements.
3. Distortion of the spectrum due to high absorption caused by inappropriate matching of cell pathlength and concentration, or the use of high concentrations of opaque buffers and salts.
4. Misalignment of cells when measuring baseline or sample spectra, causing a discrepancy in where the spectrum crosses the zero value for the CD signal.
5. Distortion of the spectrum due to using inappropriate instrument parameters.
6. Calibration errors (or lack of calibration), causing problems when comparing spectra between instruments, or data collected at different times.
7. Bad record keeping. All details should be logged contemporaneously with data collection so that there is no confusion when large amounts of data are collected and analyzed. For quality control and public archiving, traceable data trails are required.
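The minimum log-sheet contents described in Section 6.4.3.6 map naturally onto a small structured record, which also makes it easy to flag entries that were left blank. The sketch below is one possible layout, not a prescribed format: the class name, field names, and all example values are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass
class CDLogEntry:
    """One row of a CD log sheet (field names are illustrative)."""
    operator: str
    date: str
    file_name: str
    protein: str                 # include mutations and expression tags
    concentration_mg_ml: float
    buffer: str                  # constituents and concentrations
    cell_id: str
    pathlength_cm: float
    wavelength_range_nm: tuple   # (start, end)
    step_size_nm: float
    dwell_time_s: float
    temperature_c: float
    csa_calibration_file: str

    def missing_fields(self):
        """Names of fields left empty, to be completed before archiving."""
        return [name for name, value in asdict(self).items()
                if value is None or value == ""]


# Hypothetical entry with the calibration file not yet recorded
entry = CDLogEntry(
    operator="A. User", date="2020-01-15", file_name="A001",
    protein="example protein, His-tagged", concentration_mg_ml=1.0,
    buffer="10 mM sodium phosphate, pH 7.0", cell_id="cell-07",
    pathlength_cm=0.01, wavelength_range_nm=(280, 180),
    step_size_nm=1.0, dwell_time_s=2.0, temperature_c=20.0,
    csa_calibration_file="",
)
incomplete = entry.missing_fields()
```

Here `incomplete` lists only the calibration-file field, which is the kind of gap that is easy to fix on the day of measurement and hard to reconstruct later.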
6.4.3.8 Special considerations for membrane proteins

Membrane proteins are usually sequestered into micelles, liposomes, nanodiscs, or bicelles, which do not form homogeneous isotropic solutions. This can lead to potential spectral artifacts such as scattering, absorption flattening, and spectral shifts, requiring careful experimental and analytical consideration. Practical aspects of CD spectroscopy of membrane proteins are discussed in Ref. [32].

6.5 Data processing and analyses

6.5.1 Data processing software: elements and procedures and identifying sources of error in the data

Data processing software may be included with the instrument software but there are also specialist software packages such as CDtoolX [33], developed specifically for processing, displaying, and archiving protein CD spectra, which facilitate the procedures. In addition, data may be processed using standard spreadsheet software, although this can be less convenient as it is not specifically tailored for the purpose.


The basic data processing procedures are as follows: (1) average the replicate scans of both sample and baseline spectra (the instrument software may do this automatically by default, but this setting should be turned off since the raw data not only reveal the presence of artifacts but can also be used to quantify the noise level); (2) subtract the averaged baseline from the averaged sample spectrum; (3) if necessary, zero the subtracted spectrum at wavelengths (usually >260 nm) where the sample does not generate a signal; (4) calibrate the spectrum using the CSA-derived calibration curve for the instrument (optional); and (5) scale to units of [θ]MRE or Δε. Each of these procedures may reveal a different source of error. Firstly, a noticeable discrepancy between replicate spectra above the background noise level can result from changes in the optical properties of the sample; this may be due to bubble formation or cell leakage, and in such cases, the sample should be reloaded and remeasured. Alternatively, this could indicate changes in the protein structure during data acquisition, which could be a result of the temperature not being equilibrated before commencing data acquisition, or of protein stability issues. Secondly, the averaged baseline and averaged sample spectrum should overlay between 250 and 280 nm, where there is no detectable CD signal from the protein. Misalignment of the optical cells could account for such differences, in which case the measurement should be repeated. However, if the discrepancy is minimal (<0.5 mdeg), zeroing (see procedure 3 above) after subtraction will rectify the problem. Thirdly, once the spectrum has been scaled to units of [θ]MRE or Δε, errors in the concentration or cell pathlength measurement become apparent if spectral peaks are outside a range of expected magnitudes. The positive peak around 190 nm in the spectrum of an α-helical protein would not usually exceed 20 Δε ([θ]MRE of 66,000).
A mostly β-sheet protein should produce a spectrum with a positive maximum magnitude within the range of ~2–8 Δε ([θ]MRE 6500–26,000). Spectra that are too large indicate that either the pathlength or concentration values used in the calculations were too small, whereas very small spectra indicate that one or both of these values are too large.
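The processing steps above (averaging, baseline subtraction, zeroing, and scaling) can be sketched in a few lines. This is a minimal illustration, not the procedure used by any particular software package: the function names and toy numbers are hypothetical, the optional CSA calibration step is omitted, and the scaling uses the standard mean residue ellipticity relation, [θ]MRE = 0.1 × θ(mdeg) × MRW/(c × l) with c in mg/mL and l in cm, stated here as an assumption because the chapter's own scaling equations lie outside this section. The factor of 3298 between [θ]MRE and Δε is consistent with the paired values quoted in the text.

```python
from statistics import mean, stdev

MRE_PER_DELTA_EPSILON = 3298.0  # [theta]MRE = 3298 x delta-epsilon


def average_scans(scans):
    """Point-by-point mean and standard deviation of replicate scans.

    Needs at least two scans, since a standard deviation is computed.
    """
    columns = list(zip(*scans))
    return [mean(c) for c in columns], [stdev(c) for c in columns]


def process_spectrum(wavelengths_nm, sample_scans, baseline_scans,
                     conc_mg_ml, pathlength_cm, mean_residue_weight,
                     zero_above_nm=260.0):
    """Average, subtract baseline, zero, and scale raw scans given in mdeg."""
    sample_avg, sample_sd = average_scans(sample_scans)
    baseline_avg, _ = average_scans(baseline_scans)
    net = [s - b for s, b in zip(sample_avg, baseline_avg)]
    # Zero the spectrum using the region where the protein gives no signal
    flat = [v for w, v in zip(wavelengths_nm, net) if w > zero_above_nm]
    offset = mean(flat) if flat else 0.0
    net = [v - offset for v in net]
    scale = 0.1 * mean_residue_weight / (conc_mg_ml * pathlength_cm)
    mre = [v * scale for v in net]
    delta_epsilon = [v / MRE_PER_DELTA_EPSILON for v in mre]
    return mre, delta_epsilon, sample_sd


# Toy data: three wavelengths, three replicate scans each (values in mdeg)
wl = [270.0, 265.0, 222.0]
sample = [[0.2, 0.2, -9.8], [0.2, 0.2, -10.0], [0.2, 0.2, -10.2]]
baseline = [[0.1, 0.1, 0.0], [0.1, 0.1, 0.0], [0.1, 0.1, 0.0]]
mre, de, sd = process_spectrum(wl, sample, baseline,
                               conc_mg_ml=1.0, pathlength_cm=0.01,
                               mean_residue_weight=110.0)
```

Keeping the per-point standard deviations alongside the scaled spectrum preserves the noise estimate that is lost if only instrument-averaged files are saved.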

6.5.2 Secondary structure analyses

A number of methods exist for determining protein secondary structure content based on empirical analyses of CD spectra [34]. All utilize the information from datasets of CD spectra of proteins with known crystal structures to calculate the structure of the protein of interest (query protein) from its CD spectrum. They are based on the assumption that the structures of the reference proteins in crystals and in solution are the same and that the contributions to the CD spectrum from individual secondary structures are additive such that:

$$C_\lambda = \sum_i f_i B_{i\lambda} + \text{noise}$$ (6.14)

where $C_\lambda$ is the CD spectrum of a protein as a function of wavelength, $f_i$ is the fraction of secondary structure of a given type, and $B_{i\lambda}$ is the ellipticity of each secondary structure type at each wavelength; $i$ is the type of secondary structure (helix, sheet, turn, disorder, etc.). The noise term also includes the contributions due to aromatic side chains [35].
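The additivity assumption of Eq. (6.14) can be illustrated with a forward calculation: given fractions and basis spectra, the model spectrum is their weighted sum, and the residual against a measured spectrum plays the role of the noise term. The sketch below uses made-up basis values at three arbitrary wavelength points purely to show the arithmetic; they are not real reference spectra, and the function names are hypothetical. It does not implement any of the published analysis algorithms.

```python
import math


def model_spectrum(fractions, basis_spectra):
    """C_lambda = sum over i of f_i * B_i,lambda (Eq. 6.14 without noise)."""
    if abs(sum(fractions.values()) - 1.0) > 1e-6:
        raise ValueError("secondary structure fractions should sum to 1")
    n_points = len(next(iter(basis_spectra.values())))
    return [sum(fractions[k] * basis_spectra[k][j] for k in fractions)
            for j in range(n_points)]


def rmsd(observed, calculated):
    """Root mean square difference between two spectra of equal length."""
    squared = [(o - c) ** 2 for o, c in zip(observed, calculated)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical basis spectra at three wavelength points (not real data)
basis = {
    "helix": [10.0, -5.0, -5.0],
    "sheet": [4.0, -1.0, -3.0],
    "other": [-3.0, 0.5, 0.0],
}
fractions = {"helix": 0.5, "sheet": 0.3, "other": 0.2}
calculated = model_spectrum(fractions, basis)
misfit = rmsd([5.6, -2.7, -3.4], calculated)
```

The analysis methods discussed in this section effectively run this calculation in reverse, searching for the fractions that minimize the misfit against a reference dataset.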


As a number of the algorithms were developed more than 40 years ago, some of the software languages in which these were written have fallen into obscurity due to the swift advance of computer technology. Those still available for download or use online include SELCON3 and CDSSTR [36], which are based on singular value decomposition (SVD) methods in conjunction with variable selection methods [37], CONTINLL, which uses ridge regression with a modified version of variable selection [38,39], and neural networks [40]. All are sensitive to the magnitude of the spectrum [41], so accurate analyses depend upon having well calibrated data, a broad wavelength coverage in the spectrum (see Section 6.3.2 and [1]), and reference datasets with protein structural characteristics similar to those of the query protein [3,42,43].

6.5.3 Accuracy: spectral data versus derived results (i.e., what types of samples cannot be accurately analyzed, but can be measured for reproducibility)

Data that cannot be easily quantified include near-UV CD data, far-UV spectra that cut off at minimum wavelengths higher than 190 nm, spectra that include scattering artifacts (such as those of fibers and membrane samples), and spectra of proteins that have very different spectral characteristics from any of the proteins in the reference datasets. Hence, a large amount of data collected in the biopharmaceutical industry is not analyzed for secondary structure, but only in terms of spectral comparability, for example, in batch-to-batch quality control. Such comparisons can be done by overlaying spectra and judging their similarity by eye; however, this is not only subjective but, where there are large numbers of spectra, virtually impossible. One solution is to use a multivariate statistical approach such as principal component analysis (PCA) to compress the dataset into two or three principal components that explain the variation present. The number of principal components indicates the number of spectral types in the dataset; Soft Independent Modeling of Class Analogy (SIMCA) can determine which spectrum in the dataset belongs to each class with a quantifiable certainty [15]. Alternatively, the variability between batches can be compared with the variation between spectra of the same sample by examining the overlap between error bars (1–2 standard deviations) for different samples.
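The error-bar comparison mentioned at the end of this section can be sketched as follows. This illustrates only that simpler alternative, not PCA or SIMCA; the function names and the toy replicate scans are hypothetical, and the choice of 2 standard deviations is one of the options the text allows.

```python
from statistics import mean, stdev


def error_band(scans, n_sd=2.0):
    """Per-wavelength (low, high) band: mean +/- n_sd standard deviations."""
    band = []
    for column in zip(*scans):
        centre, spread = mean(column), n_sd * stdev(column)
        band.append((centre - spread, centre + spread))
    return band


def fraction_overlapping(scans_a, scans_b, n_sd=2.0):
    """Fraction of wavelength points at which the two error bands overlap."""
    band_a = error_band(scans_a, n_sd)
    band_b = error_band(scans_b, n_sd)
    hits = sum(1 for (lo_a, hi_a), (lo_b, hi_b) in zip(band_a, band_b)
               if lo_a <= hi_b and lo_b <= hi_a)
    return hits / len(band_a)


# Toy replicate scans (two wavelength points, three repeats per batch)
batch_a = [[1.0, -5.0], [1.2, -5.2], [0.8, -4.8]]
batch_b = [[1.1, -5.1], [0.9, -4.9], [1.0, -5.0]]
batch_c = [[3.0, -5.0], [3.2, -5.2], [2.8, -4.8]]
same = fraction_overlapping(batch_a, batch_b)
different = fraction_overlapping(batch_a, batch_c)
```

In this toy case batches A and B overlap at every point, whereas A and C overlap at only one of the two points; what fraction counts as acceptable would need to be set from replicate measurements of a single batch.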

6.5.4 Public repository of protein CD data (PCDDB): availability and uses

The Protein Circular Dichroism Data Bank (PCDDB) [16,17] is a freely accessible public resource for the deposition, archiving, and access to protein CD and SRCD data and metadata, analogous to other structural biology data banks, such as the Protein Data Bank (PDB) [44] for crystallographic and NMR structures and data. It provides an open access biophysical catalog of information on folded proteins, a resource for bioinformatics studies, and standards for comparison and quality assurance. The integrity and quality of the deposited spectra are checked by a suite of tools collectively known as ValiDichro [45]. Each entry in the PCDDB includes information about the protein, including the sequence, and links to the cognate crystal structure in the PDB and the sequence entry in Uniprot [46] where available. Sample conditions such as the buffer contents, protein concentration, cell pathlength, temperature, and instrument parameters are included along with calibration data, all essential information for meaningful spectral comparisons. Other information present in an entry includes references to the literature with citations of relevant publications and, if they are open access, links to pdf reprints, and the identity of the depositor. The processed spectrum is always deposited, but in many cases the raw sample and baseline spectra are also included so the user can see the quality of the data. The data bank includes several sets of “gold standard” spectra, SP175 and SMP180 [3,43], for soluble proteins and a mixture of soluble and membrane proteins, respectively, that have been carefully and extensively tested and calibrated; these serve not only as reference database spectra for analyses but are also particularly valuable for comparative studies and new software development. The PCDDB web server is accessed at http://pcddb.cryst.bbk.ac.uk, where there is a link to a YouTube tutorial video describing the way to input and access data. In its first 5 years of operation, more than three million datasets were downloaded, an indication of the popularity of, interest in, and importance of CD data. The data bank is gaining in value for all users as more researchers deposit their spectra, giving it a broader base of information content. This feature may be of particular value in the pharmaceutical industry, as a traceable record of spectral information.

6.5.5 Data validation

ValiDichro [45] is the first software developed for evaluating the quality and validity of circular dichroism spectra and metadata; its purpose is comparable to that of the validation programs developed for crystallographic and NMR data, such as PROCHECK [47], MolProbity [48], and WHAT_IF [49]. It includes ~20 tests, which check for completeness, quality, and consistency of the data. It was initially developed as a guide for users and depositors of PCDDB data, providing quality control for maintaining the integrity of the database, but has now been made available to any user (including those not wanting to deposit data) as a standalone server for checking data. It serves both as a test of data quality and as a guide for improvement of data collection parameters [45], making it a particularly valuable asset for users in the pharmaceutical community. It provides not only a “traffic light” evaluation of the data (Pass, Flag, Fail) but also suggestions on how to improve Flagged and Failed data. However, it should be noted that in some cases where features of submitted data appear to lie outside the acceptable range but are not necessarily incorrect, a Flag status may just be an indication of a novel or interesting feature. Other tests reveal real errors in the data, such as those noted previously in this chapter.

6.6 Role in the research industry 6.6.1 CD data in the protein biopharmaceutical development process CD spectroscopy can play important and diverse roles in many aspects of product development, from examining purified proteins for structural integrity and proper folding, to

II. The selected biophysical tools in the biopharmaceutical industry

6.6 Role in the research industry

145

examining ligand binding and determining the affinity of biotherapeutic candidates, to the study of relative conformational stabilities in response to changing pH, ionic strength, and other variables at the preformulation phase of product development. It can also serve as a guide and aid to crystallization procedures. During production of biological products, CD spectroscopy can provide a noninvasive and non-destructive means of monitoring stability and checking batch-to-batch consistency of the product including its shelf life integrity [15]. 6.6.1.1 Biosimilars Another area where CD spectroscopy can play a valuable role is in the analysis of biosimilars, which are alternative drug versions of approved biopharmaceuticals whose patent has expired. The characteristics of biopharmaceuticals are highly dependent on the cell lines and recombinant technology used as well as the purification protocols, production processes, and the formulation of the drug. In recognition of this, regulatory bodies require that the generic versions must be highly similar, although not identical, to the original product but they must be as close as possible in terms of structure and clinical effects. This characterization is often referred to as “Higher Order Structure”. CD spectroscopy is one of the most useful and convenient techniques for structural comparison and this has been recognized in the regulatory guidelines (cf. section 6.6.2). It can be useful for both demonstrating similarity (for a biosimilar producer to support the adequate biosimilarity of their biosimilar to obtain its regulatory approval) or dissimilarity (for original patent holders who are trying to defend their innovative drug against the approval of a biosimilar copy). 6.6.1.2 Glycosylation An approach to optimizing protein-based drugs is to alter their surface properties by engineered glycosylation. 
This can be useful not only for increasing stability during processing, but also for improving the physicochemical and pharmacological properties, which are often suboptimal in the native form (for a review, see Ref. [50]). Although the glycans have little effect on the secondary structure or conformation of the protein, some of the common monosaccharides that form the glycan core, such as N-acetylglucosamine and N-acetylneuraminic acid, give rise to a small but significant CD signal in the far-UV [51]. A spectral comparison could therefore provide a quick way of determining the success of glycosylation and deglycosylation reactions, and of checking whether a biosimilar has the same level of glycosylation as the original drug.
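A spectral comparison of the kind described in this section (biosimilar versus originator, or glycosylated versus deglycosylated protein) can be sketched numerically. The Python sketch below is only an illustration: the function name `compare_spectra`, the toy numbers, and the criterion used (a difference larger than `k` times the combined standard deviation of replicate scans) are assumptions made for this example, not a regulatory or published acceptance criterion.

```python
import numpy as np

def compare_spectra(reps_a, reps_b, k=2.0):
    """Compare two sets of replicate CD spectra on a common wavelength grid.

    reps_a, reps_b: arrays of shape (n_replicates, n_wavelengths), in the
    same units. Returns the difference of the mean spectra and a boolean
    mask marking wavelengths where |difference| exceeds k times the
    combined replicate standard deviation (k is an illustrative choice).
    """
    a = np.asarray(reps_a, dtype=float)
    b = np.asarray(reps_b, dtype=float)
    if a.shape[1] != b.shape[1]:
        raise ValueError("spectra must share the same wavelength grid")
    diff = a.mean(axis=0) - b.mean(axis=0)
    # Sample standard deviation (ddof=1) across replicates at each wavelength
    combined_sd = np.sqrt(a.std(axis=0, ddof=1) ** 2 + b.std(axis=0, ddof=1) ** 2)
    return diff, np.abs(diff) > k * combined_sd

# Toy data (hypothetical values): three replicates at three wavelengths.
originator = [[1.0, 2.0, 3.0], [1.2, 2.2, 3.2], [0.8, 1.8, 2.8]]
candidate = [[1.0, 2.0, 8.0], [1.2, 2.2, 8.2], [0.8, 1.8, 7.8]]
difference, flagged = compare_spectra(originator, candidate)
print(flagged.tolist())  # prints [False, False, True]
```

In practice the spectra would first be converted to concentration-independent units, since an error in protein concentration changes the magnitude of a spectrum without changing its shape.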

6.6.2 CD spectroscopy and international regulatory bodies

Regulatory requirements stipulate that the higher order structure of biomolecules should be monitored and strictly controlled, since misfolded or aggregated structures may lack potency and may also generate detrimental immunological responses. CD spectroscopy in the near-UV wavelength region provides a tertiary structure fingerprint, including information on disulfide bridges, whilst the far-UV spectrum can establish whether a polypeptide is folded as expected. The International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) is an international project bringing together regulatory authorities and experts from the pharmaceutical industry in the USA, Japan, and Europe to discuss issues of pharmaceutical product registration and to provide technical guidelines and requirements for product registration. Guideline Q6B, which covers specifications related to “Test Procedures and Acceptance Criteria for Biotechnological/Biological Products”, recommends circular dichroism spectroscopy as one of the methods used to profile the physicochemical properties of new biopharmaceuticals and to determine the presence of product-related impurities, including degradation products [52].

6.7 Technology availability

6.7.1 Software

All commercial instruments include software not only for data collection but also for data processing; however, they use different algorithms for operations such as smoothing and baseline correction, so this needs to be considered when comparing data from different types of instruments. All enable data to be saved either in their internal formats or in ASCII formats that can be ported to other software (including spreadsheets) for processing.

CDtoolX [33] is a new, freely downloadable software package designed specifically for processing, analysis and display of CD data (both CD and HT signals) across a wide range of instrument platforms and output formats. It facilitates comparisons between data collected on different types of instruments, including data collected at SRCD beamlines. Data processing is accomplished through an easy toggling system; processed spectra can be converted to [θ]MRE or Δε units, and error bars can be calculated and displayed to indicate reproducibility levels. Spectra can be scaled at specific wavelengths [19] to enable comparisons of spectral shapes of different sample types. Calibration to a spectrum of CSA or ACS according to the procedure outlined in Section 6.4.2.2 can be carried out. All of these procedures can be performed simultaneously on multiple datasets. Metadata in the file headers can be edited, and both raw and processed spectra can be archived in a user-defined MySQL database, from which they can be retrieved either calibrated or uncalibrated and in units of mdeg or Δε. Alternatively, data can be saved as ASCII files, making them accessible to most spreadsheet applications. CDtoolX also includes a facility for performing singular value decomposition analyses (for identifying component contributions, for example, in thermal melt experiments) [9].

There are many software packages that have been developed for analyses of protein secondary structure based on CD spectra.
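Before turning to those packages, the unit conversions mentioned above can be sketched in a few lines. This is a minimal illustration, not the CDtoolX implementation: the function names and the example values are hypothetical, and it assumes the commonly used relations [θ]MRE = θ(mdeg) × MRW / (10 × c × l), with c in mg/mL and l in cm, and Δε = [θ]MRE / 3298.2, where MRW is the mean residue weight of the protein.

```python
def mdeg_to_mre(theta_mdeg, mean_residue_weight, conc_mg_per_ml, pathlength_cm):
    """Convert measured ellipticity (mdeg) to mean residue ellipticity.

    Result is in deg cm^2 dmol^-1. mean_residue_weight is in Da per residue,
    conc_mg_per_ml is the protein concentration in mg/mL, and pathlength_cm
    is the optical pathlength of the cell in cm.
    """
    return theta_mdeg * mean_residue_weight / (10.0 * conc_mg_per_ml * pathlength_cm)

def mre_to_delta_epsilon(mre):
    """Convert mean residue ellipticity to delta epsilon (M^-1 cm^-1)."""
    return mre / 3298.2

# Hypothetical example: -20 mdeg measured at 0.1 mg/mL in a 0.1 cm cell,
# for a protein with a mean residue weight of 110 Da.
mre = mdeg_to_mre(-20.0, 110.0, 0.1, 0.1)
delta_epsilon = mre_to_delta_epsilon(mre)
print(round(mre))               # prints -22000
print(round(delta_epsilon, 2))  # prints -6.67
```

Because both the concentration and the pathlength appear in the denominator, an error in either scales the whole converted spectrum by the same factor, which is why accurate values for both matter in quantitative analyses.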
DichroWeb [34,53] is a user-friendly web-based server that includes all of the most popular algorithms: SELCON3, CDSSTR [36,37], CONTINLL [38,39], VARSLC and K2D [40]. DichroWeb was developed to accept data formats produced by any conventional CD or SRCD instrument, and calculates a single fit parameter, the normalized root mean square deviation (NRMSD) [34,54], which enables simple comparisons between methods and datasets. DichroWeb also has a function for applying a magnitude-scaling factor to spectra, allowing the user to compensate for possible errors in protein concentration or cell pathlength [41]. One of the most important issues in quantitative analyses of CD spectra is the suitability of the reference dataset used in the analyses. DichroWeb includes the widest range of such datasets (10 in all), including three composed of SRCD data, one of which includes membrane proteins. Two of the reference datasets have been specifically designed using bioinformatics techniques to cover secondary structure and fold space. These are SP175 [3], which contains 71 soluble protein spectra with a low wavelength cutoff of 175 nm, and SP175t, a version of this with a low wavelength cutoff of 190 nm. There is also SMP180 [43], which was specifically created for analyzing membrane protein spectra but can also be used with soluble protein data, and indeed is particularly good for analyzing beta-sheet-rich proteins. SMP180 has a low wavelength cutoff of 180 nm and incorporates 30 membrane protein spectra plus all the SP175 proteins and another 27 soluble proteins, making it the broadest-based reference dataset available to date.

CDPro [36] is a downloadable data analysis package that includes the CONTINLL, SELCON3, and CDSSTR algorithms, and a program to determine tertiary structure class, CLUSTER. It accesses 10 protein CD datasets, including SMP50 and SMP56, which comprise 50 and 56 protein spectra, respectively, 13 of which are from membrane proteins, and one dataset specifically for use with CLUSTER. CDPro also facilitates the use of user-defined datasets.

Other web-based analysis tools include BestSel [55] and Capito [56]. The former can quantify the fractions of parallel and antiparallel beta sheet and further subdivide the antiparallel sheet into left-hand twisted, relaxed, and right-hand twisted forms. The latter is designed to deal with complete datasets, such as those produced in a thermal melt. Other algorithms include K2D3 [57] and CCA [58]. The “Further Reading” section of this chapter lists websites where various servers and software are currently available.

Comparisons of the spectral shapes of different protein samples, which can aid in identifying proteins with similar structures while removing the effect of variations in magnitude, can be done using the DichroMatch server [18]. In the reverse situation, PDB2CD [59,60] generates CD spectra from protein atomic coordinates, thereby facilitating structure comparisons when one protein has a crystal structure and only a CD spectrum is available for the other.
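The NRMSD fit parameter and the scaling of spectra described in this section can be illustrated with a short sketch. It assumes the usual definition of the NRMSD as the square root of the sum of squared differences between the experimental and back-calculated spectra divided by the sum of squared experimental values; the function names and toy values are hypothetical, and this is not the code used by DichroWeb, CDtoolX or DichroMatch.

```python
import numpy as np

def nrmsd(experimental, calculated):
    """Normalized root mean square deviation between two spectra.

    Both spectra must be sampled at the same wavelengths and be in the
    same units; 0 means a perfect match.
    """
    exp = np.asarray(experimental, dtype=float)
    calc = np.asarray(calculated, dtype=float)
    return float(np.sqrt(np.sum((exp - calc) ** 2) / np.sum(exp ** 2)))

def scale_at_wavelength(wavelengths, spectrum, reference, wavelength_nm):
    """Scale `spectrum` so that it equals `reference` at one wavelength.

    Uses the grid point closest to wavelength_nm. Returns the scaled
    spectrum and the scaling factor applied.
    """
    wl = np.asarray(wavelengths, dtype=float)
    spec = np.asarray(spectrum, dtype=float)
    ref = np.asarray(reference, dtype=float)
    idx = int(np.argmin(np.abs(wl - wavelength_nm)))
    factor = ref[idx] / spec[idx]
    return spec * factor, float(factor)

# Toy example: the second spectrum has the same shape at twice the magnitude,
# as might result from a concentration or pathlength error.
wl = [200.0, 210.0, 220.0]
reference = [1.0, 2.0, 3.0]
measured = [2.0, 4.0, 6.0]
scaled, factor = scale_at_wavelength(wl, measured, reference, 220.0)
print(factor)                    # prints 0.5
print(nrmsd(reference, scaled))  # prints 0.0
```

Scaling before comparison separates differences in spectral shape, which reflect structure, from differences in magnitude, which may only reflect concentration or pathlength errors.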

6.7.2 Commercial vendors and suppliers of accessories/key supplies

6.7.2.1 Instruments

Four companies have dominated the CD market in recent years. They are: Aviv Biomedical Inc. of Lakewood, NJ, USA; Jasco Analytical Instruments, whose CD instruments are manufactured in Hachioji, Tokyo, but which has headquarters worldwide, including in the USA, Europe, and UK; Olis, Inc., based in Bogart, GA, USA, which has a number of models that employ a dual-beam system; and Applied Photophysics Ltd., based in Leatherhead, Surrey, UK, which currently produces the Chirascan. All models come with thermoelectrically controlled sample chambers as standard and have optional facilities to perform fluorescence and CD simultaneously or fluorescence-detected CD. Accessories such as stopped-flow apparatus, titration, and multiple sampling systems are also available. Websites for suppliers are noted in the “Further Reading” section.

6.7.2.2 Cells/calibration standards

Hellma Analytics (with local vendors in many countries) provides a wide range of cell types and pathlengths. For CD, the standard cells are made from Suprasil, with a choice of round or rectangular shapes. Round cells have distinct advantages for scattering samples [32]. Most commercial instrument cell holders can be modified to accept round cells. Demountable cells with standard pathlengths of 0.0003, 0.001, 0.005, 0.01, and 0.05 cm are available, as are larger one-piece cuvettes (both “bottle” type and jacketed cells for use with circulating water temperature control). Hellma also makes calcium fluoride cells of very short pathlength based on the design described in Ref. [30]. These can be used for applications where low wavelength measurements are required (usually SRCD), since CaF2 is more optically transparent than silica in the far-UV. Accessories such as cell holders, and Hellmanex, a strongly alkaline surfactant especially formulated to clean glass and quartz, are also available from this supplier.

CSA (d-10-camphorsulfonic acid or (1S)-(+)-10-camphorsulfonic acid), a standard used for optical rotation/magnitude [21], can be obtained from Sigma-Aldrich in sizes of 5, 100 and 500 g with 99% purity. This material is both hygroscopic and light-sensitive and should be stored accordingly. ACS (ammonium d-10-camphorsulfonate or (1S)-(+)-10-camphorsulfonic acid, ammonium salt) is available from Katayama Chemical, or can be ordered directly from Jasco Instruments. Wavelength calibration can be done using high-purity benzene from any supplier. Hellma Analytics also offers wavelength standards, including traceable holmium oxide liquid and glass filters prepared in accordance with NIST (National Institute of Standards and Technology, USA), ASTM (American Society for Testing and Materials), or European Pharmacopoeia requirements. Alternatively, wavelength standards (and cells) can often be purchased from the spectrophotometer manufacturers.

6.8 Future developments

6.8.1 SRCD spectroscopy

Synchrotron radiation circular dichroism spectroscopy is an advanced technique that extends the amount and types of data collectable and enhances the utility of protein CD spectroscopy [61]. Although the xenon arc lamps used in most commercial CD instruments have light fluxes of >10^10 photons/sec in the wavelength range of 300–250 nm, over the range from 250 to 180 nm the flux decreases by two orders of magnitude [7], effectively limiting measurements to >185 nm. Synchrotron light sources generate fluxes ranging from 10^10 to >10^13 photons/sec at 200 nm, which remain constant well into the vacuum UV [62,63], allowing measurements with high signal-to-noise down to 170 nm (or below for non-aqueous solutions). This permits the detection of subtle conformational changes, which may, for example, prove useful for high-throughput screening of drug binding. The low-wavelength data (<185 nm) allow access to a π → π* intra-amide charge transfer transition (see Fig. 6.5). This not only increases the information content available for secondary structure analysis but may also provide information on tertiary interactions [7,62]. The high sensitivity also decreases the measurement time for fast kinetic studies and, to this end, stopped-flow and continuous-flow devices, along with advanced detection and high-throughput techniques, are under development at a number of SRCD beamlines [62,63].


Acknowledgements

This work was supported by grants from the Bioinformatics and Biological Resources Fund of the UK Biotechnology and Biological Sciences Research Council (BBSRC) to BAW. It was derived and evolved from lectures that were presented by BAW to pharmaceutical scientists at the CASSS meetings on Higher Order Structures held in 2011 and 2017, as well as at teaching workshops run by BAW and Dr. Robert W. Janes (Queen Mary University of London) at the 2017 Sao Paulo FAPESP school in Sao Paulo, Brazil; the annual Warwick University CD summer schools; the European Biophysical Societies Association (EBSA) summer schools in Montpellier, France in 2014, 2016, and 2018; the International Union of Pure and Applied Biophysics (IUPAB)/EBSA 2017 workshop in Edinburgh, Scotland; several BioCD Workshops held at Brookhaven National Labs in the USA and at the Chinese Academy of Sciences in Beijing, China; and CD training workshops held in Sao Carlos, Sao Paulo and Ribeirão Preto, Brazil over the past several years. SRCD studies were undertaken at the ISA (Aarhus, Denmark), Soleil (France), ANKA, now KARA (Karlsruhe, Germany), BSRF (Beijing, China), SRS Daresbury (UK), and NSLS (Brookhaven, USA) synchrotrons and were enabled by beamtime grants to BAW. We thank members of the Wallace group (Birkbeck College, University of London) and Dr. Robert W. Janes and his group (Queen Mary University of London) for helpful discussions.

References

[1] Wallace BA, Janes RW, editors. Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism. IOS Press; 2011.
[2] Norden B, Rodger A, Dafforn T. Linear Dichroism and Circular Dichroism: A Textbook on Polarized Light Spectroscopy. Royal Society of Chemistry Press; 2010.
[3] Lees JG, Miles AJ, Wien F, Wallace BA. A reference database for circular dichroism spectroscopy covering fold and secondary structure space. Bioinformatics 2006;22:1955–62.
[4] Lopes JLS, Orcia D, Araujo APU, DeMarco R, Wallace BA. Folding factors and partners for the intrinsically disordered protein micro-exon gene 14 (MEG-14). Biophys J 2013;105:2512–20.
[5] Gilbert ATB, Hirst JD. Charge-transfer transitions in protein circular dichroism spectra. J Mol Struct Theochem 2004;675:53–60.
[6] Woody RW. Circular dichroism. Methods Enzymol 1995;246:34–71.
[7] Miles AJ, Wallace BA. Synchrotron radiation circular dichroism spectroscopy of proteins and applications in structural and functional genomics. Chem Soc Rev 2006;35:39–51.
[8] Kelly SM, Jess TJ, Price NC. How to study proteins by circular dichroism. Biochim Biophys Acta 2005;1751:119–39.
[9] Ireland SM, Sula A, Wallace BA. Thermal melt circular dichroism spectroscopic studies for identifying stabilising amphipathic molecules for the voltage-gated sodium channel NavMs. Biopolymers 2018;109:23067.
[10] Toumadji A, Alcorn SW, Johnson Jr WC. Extending CD spectra to 168 nm improves the analysis for secondary structures. Anal Biochem 1992;200:321–31.
[11] Johnson Jr WC. Analyzing protein circular dichroism spectra for accurate secondary structures. Protein Struct Funct Genet 1999;25:307–12.
[12] Fiedler S, Cole L, Keller S. Automated circular dichroism spectroscopy for medium-throughput analysis of protein conformation. Anal Chem 2013;85:1868–72.
[13] Hussain R, Javorfi T, Rudd TR, Siligardi G. High-throughput SRCD using multi-well plates and its applications. Sci Rep 2017;6:38028.
[14] Jones C, Schiffmann D, Knight A, Windsor S. Val-CiD Best Practice Guide: CD Spectroscopy for the Quality Control of Biopharmaceuticals. National Physical Lab Report, DQL-AS 008; 2004.
[15] Ravi J, Hills AE, Knight AE. Reproducible circular dichroism measurements for biopharmaceutical applications. In: Wallace BA, Janes RW, editors. Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism. IOS Press. p. 125–40.
[16] Whitmore L, Woollett B, Miles AJ, Klose DP, Janes RW, Wallace BA. PCDDB: the protein circular dichroism data bank, a repository for circular dichroism spectral and metadata. Nucleic Acids Res 2011;39:D480–6.


[17] Whitmore L, Miles AJ, Mavridis L, Janes RW, Wallace BA. PCDDB: new developments at the protein circular dichroism data bank. Nucleic Acids Res 2017;45:303–7.
[18] Whitmore L, Mavridis L, Janes RW, Wallace BA. DichroMatch at the Protein Circular Dichroism Data Bank (DM@PCDDB): a web-based tool for identifying protein nearest neighbors using circular dichroism spectroscopy. Protein Sci 2018;27:10–3.
[19] Miles AJ, Wien F, Lees JG, Wallace BA. Calibration and standardisation of synchrotron radiation circular dichroism and conventional circular dichroism spectrophotometers. Part 2: factors affecting wavelength and ellipticity measurements. Spectroscopy 2005;19:43–51.
[20] Schippers PH, Dekkers HPJM. Direct determination of absolute circular dichroism data and calibration of commercial instruments. Anal Chem 1981;5:778–82.
[21] Miles AJ, Wien F, Lees JG, Rodger A, Janes RW, Wallace BA. Calibration and standardisation of synchrotron radiation circular dichroism and conventional circular dichroism spectrophotometers. Spectroscopy 2003;17:653–61.
[22] Miles AJ, Wien F, Wallace BA. Redetermination of the extinction coefficient of camphor-b-sulfonic acid, a calibration standard for circular dichroism spectroscopy. Anal Biochem 2004;335:338–9.
[23] Damianoglou A, Crust EJ, Hicks MR, et al. A new reference material for UV-visible circular dichroism spectroscopy. Chirality 2008;20:1029–38.
[24] Price NC. The determination of protein concentration. In: Engel PC, editor. Enzymology labfax. Oxford: Bios Scientific Publishers; 1996. p. 34–41.
[25] Gill SC, von Hippel PH. Calculation of protein extinction coefficients from amino acid data. Anal Biochem 1989;182:319–26.
[26] Gasteiger E, Hoogland C, Gattiker A, et al. Protein identification and analysis tools on the ExPASy server. In: Walker JM, editor. The Proteomics Protocols Handbook. Humana Press; 2005. p. 571–607.
[27] Pace CN, Vajdos F, Fee L, et al. How to measure and predict the molar absorption coefficient of a protein. Protein Sci 1995;11:2411–23.
[28] Sutherland J. In: Fasman G, editor. Circular Dichroism and The Conformational Analysis of Biomolecules. New York, London: Plenum Press; 1996. p. 616–8.
[29] Anthis NJ, Clore GM. Sequence-specific determination of protein and peptide concentrations by absorbance at 205 nm. Protein Sci 2013;22:851–8.
[30] Wien F, Wallace BA. Calcium fluoride micro cells for synchrotron radiation circular dichroism spectroscopy. Appl Spectrosc 2005;59:1109–13.
[31] Haupt GW. An alkaline solution of potassium chromate as a transmittancy standard in the ultraviolet. J Res Natl Bur Stand 1952;48:414–23.
[32] Miles AJ, Wallace BA. Circular dichroism spectroscopy of membrane proteins. Chem Soc Rev 2016;45:4859–72.
[33] Miles AJ, Wallace BA. CDtoolX, a downloadable software package for processing and analyses of circular dichroism spectroscopic data. Protein Sci 2018;27:1717–22.
[34] Whitmore L, Wallace BA. Protein secondary structure analyses from circular dichroism spectroscopy: methods and reference databases. Biopolymers 2008;89:392–400.
[35] Greenfield NJ. Methods to estimate the conformation of proteins and polypeptides from circular dichroism data. Anal Biochem 1996;235:1–10.
[36] Sreerama N, Woody RW. Estimation of protein secondary structure from circular dichroism spectra, comparison of CONTIN, SELCON and CDSSTR methods with an extended reference set. Anal Biochem 2000;287:252–60.
[37] Manavalan P, Johnson Jr WC. Variable selection method improves the prediction of protein secondary structure from circular dichroism spectra. Anal Biochem 1987;167:76–85.
[38] Provencher SW, Glockner J. Estimation of globular protein secondary structure from circular dichroism. Biochemistry 1981;20:33–7.
[39] van Stokkum HM, Spoelder HJW, Bloemendal M, et al. Estimation of protein secondary structure and error analysis from CD spectra. Anal Biochem 1990;19:110–8.
[40] Andrade MW, Chacon P, Merelo JJ, et al. Evaluation of secondary structure of proteins from UV circular dichroism spectra using an unsupervised learning neural-network. Protein Eng 1993;6:383–90.
[41] Miles AJ, Whitmore L, Wallace BA. Spectral magnitude effects on the analyses of secondary structure from circular dichroism spectroscopic data. Protein Sci 2005;14:368–74.


[42] Janes RW. Reference datasets for protein circular dichroism and synchrotron radiation circular dichroism spectroscopic analyses. In: Wallace BA, Janes RW, editors. Modern Techniques for Circular Dichroism and Synchrotron Radiation Circular Dichroism. IOS Press; 2009. p. 183–201.
[43] Abdul-Gader A, Miles AJ, Wallace BA. A reference dataset for the analysis of membrane protein secondary structures and transmembrane residues using circular dichroism. Bioinformatics 2011;12:1630–6.
[44] Berman HM, Westbrook J, Feng Z, Gilliland G, et al. The protein data bank. Nucleic Acids Res 2000;28:235–42.
[45] Woollett B, Whitmore L, Janes RW, Wallace BA. ValiDichro: a website for validating and quality control of protein circular dichroism spectra. Nucleic Acids Res 2013;41:W417–421.
[46] Magrane M, UniProt Consortium. UniProt Knowledgebase: a hub of integrated protein data. Database 2011;2011:bar009.
[47] Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: a program to check the stereochemical quality of protein structures. J Appl Crystallogr 1993;26:283–91.
[48] Chen VB, Arendall III WB, Headd JJ, et al. MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr 2010;D66:12–21.
[49] Vriend G. WHAT IF: a molecular modeling and drug design program. J Mol Graph 1990;8:52–6.
[50] Solá RJ, Griebenow K. Glycosylation of therapeutic proteins: an effective strategy to optimize efficacy. BioDrugs 2010;24:9–21.
[51] Cronin NB, O’Reilly A, Duclohier H, Wallace BA. Effects of deglycosylation of sodium channels on their structure and function. Biochemistry 2005;44:441–9.
[52] ICH, Q6B. Test procedures and acceptance criteria for biotechnological/biological products (Q6B). In: International conference on harmonization of technical requirements for registration of pharmaceuticals for human use. FDA Register 64FR; 1999. p. 44928.
[53] Whitmore L, Wallace BA. DICHROWEB, an online server for protein secondary structure analyses from circular dichroism spectroscopic data. Nucleic Acids Res 2004;32:W668–73.
[54] Mao D, Wachter E, Wallace BA. Folding of the mitochondrial proton adenosine triphosphatase proteolipid channel in phospholipid vesicles. Biochemistry 1982;21:4960–8.
[55] Micsonai A, Wien F, Kernya L, Lee YH, Goto Y, Réfrégiers M, Kardos J. Accurate secondary structure prediction and fold recognition for circular dichroism spectroscopy. Proc Natl Acad Sci USA 2015;112:E3095–E3103.
[56] Wiedemann C, Bellstedt P, Görlach M. CAPITO - a web server based analysis and plotting tool for circular dichroism data. Bioinformatics 2013;29:1750–7.
[57] Louis-Jeune C, Andrade-Navarro MA, Perez-Iratxeta C. Prediction of protein secondary structure from circular dichroism using theoretically derived spectra. Proteins 2012;80:374–81.
[58] Perczel A, Park K, Fasman GD. Analysis of the circular dichroism of proteins using the convex constraint algorithm: a practical guide. Anal Biochem 1992;203:83–93.
[59] Mavridis L, Janes RW. PDB2CD: a web-based application for the generation of circular dichroism spectra from protein atomic coordinates. Bioinformatics 2017;33:56–63.
[60] Janes RW. PDB2CD visualises dynamics within protein structures. Eur Biophys J 2017;46:607–16.
[61] Wallace BA, Gekko K, Hoffmann SV, Lin Y-H, Sutherland JC, Tao Y, Wien F, Janes RW. Synchrotron radiation circular dichroism (SRCD) spectroscopy - an emerging method in structural biology for examining protein conformations and protein interactions. Nucl Instrum Methods Phys Res A 2011;649:177–8.
[62] Wallace BA. Protein characterisation by synchrotron radiation circular dichroism spectroscopy. Q Rev Biophys 2009;42:317–70.
[63] Wallace BA. The role of circular dichroism spectroscopy in the era of integrative structural biology. Curr Opin Struct Biol 2019;52:1–6.

Further reading

Books
[1] Fasman GD, editor. Circular Dichroism and The Conformational Analysis of Biomolecules. New York: Plenum Press; 1996.
[2] Rodger A, Norden B, editors. Circular Dichroism and Linear Dichroism. Oxford University Press; 1998.


Websites
[3] CDtoolX: Spectral Processing Program, http://www.cdtools.cryst.bbk.ac.uk/.
[4] PCDDB: Data Bank Repository of CD Spectra, http://pcddb.cryst.bbk.ac.uk/home.php.
[5] DichroWeb: Secondary Structure Analysis Website, http://dichroweb.cryst.bbk.ac.uk/.
[6] CDPRO: Secondary Structure Analysis Program, http://sites.bmb.colostate.edu/sreeram/CDPro/.
[7] K2D2: Secondary Structure Analysis Website, http://cbdm-01.zdy.uni-mainz.de/~andrade/k2d2/.
[8] DichroMatch: Spectral Comparison Website, http://pcddb.cryst.bbk.ac.uk/dichromatch.php.
[9] 2Struc: Calculates Secondary Structure From PDB Codes Website, http://2struc.cryst.bbk.ac.uk/twostruc.
[10] ValiDichro: Spectral Validation Website, http://valispec.cryst.bbk.ac.uk/circularDichroism/ValiDichro/upload.html.
[11] PDB2CD Website: http://pdb2cd.cryst.bbk.ac.uk/.
[12] BestSel Website: http://bestsel.elte.hu/index.php.
[13] Capito Website: http://capito.nmr.leibniz-fli.de/.
[14] ProtParam Website: http://web.expasy.org/protparam/.

General introductions to CD and secondary structure analyses
[15] http://en.wikipedia.org/wiki/Circular_dichroism.
[16] http://www.cryst.bbk.ac.uk/PPS2/course/section8//ss-960531_21.html.
[17] http://dichroweb.cryst.bbk.ac.uk/html/userguide_dichroweb.shtml.
[18] http://www.photophysics.com/circular-dichroism/chriascan-technology/cd-and-hos-of-biomolecules.

Instruments, accessories and standards suppliers
[19] http://www.hellma-analytics.com/en/laboratory-supplies/.
[20] http://www.sigmaaldrich.com/.
[21] http://www.jasco.co.uk/.
[22] http://www.photophysics.com/.
[23] http://www.olisweb.com/.
[24] http://www.bio-logic.net/products/stopped-flow/.

Online videos
[25] Defining Circular Polarisation, http://www.youtube.com/watch?v=Fu-aYnRkUgg.
[26] Measuring a CSA Calibration Spectrum, http://www.youtube.com/watch?v=PEIDelWvSsg.
[27] Calibration of CD Spectra with CDtool Software, http://www.youtube.com/watch?v=ovY6yVxw-tI.
[28] Cleaning and Loading CD Cells, http://www.youtube.com/watch?v=OhD50eiLzWI.
[29] PCDDB Deposition Tutorial, http://www.youtube.com/watch?v=NTblyIhwjog.
[30] Accurate Measurement of the True Pathlength of Optical CD Cells, https://www.youtube.com/watch?v=fCN7qWDmRLc.
[31] Analyzing Data Using Dichroweb: https://www.youtube.com/watch?v=QZat_Wr2NGM.
