DNA Binding and Bending by HMG Boxes: Energetic Determinants of Specificity




doi:10.1016/j.jmb.2004.08.035

J. Mol. Biol. (2004) 343, 371–393

Anatoly I. Dragan¹, Christopher M. Read², Elena N. Makeyeva¹, Ekaterina I. Milgotina¹, Mair E. A. Churchill³, Colyn Crane-Robinson² and Peter L. Privalov¹*

¹ Department of Biology, Johns Hopkins University, Baltimore, MD 21218, USA

² School of Biological Sciences, University of Portsmouth, PO1 2DT, UK

³ Department of Pharmacology, University of Colorado Health Sciences Center, Denver, CO, USA

To clarify the physical basis of DNA binding specificity, the thermodynamic properties and DNA binding and bending abilities of the DNA binding domains (DBDs) of sequence-specific (SS) and non-sequence-specific (NSS) HMG box proteins were studied with various DNA recognition sequences using micro-calorimetric and optical methods. Temperature-induced unfolding of the free DBDs showed that their structure does not represent a single cooperative unit but is subdivided into two (in the case of NSS DBDs) or three (in the case of SS DBDs) subdomains, which differ in stability. Both types of HMG box, most particularly SS, are partially unfolded even at room temperature but association with DNA results in stabilization and cooperation of all the subdomains. Binding and bending measurements using fluorescence spectroscopy over a range of ionic strengths, combined with calorimetric data, allowed separation of the electrostatic and non-electrostatic components of the Gibbs energies of DNA binding, yielding their enthalpic and entropic terms and an estimate of their contributions to DNA binding and bending. In all cases electrostatic interactions dominate non-electrostatic in the association of a DBD with DNA. The main difference between SS and NSS complexes is that SS are formed with an enthalpy close to zero and a negative heat capacity effect, while NSS are formed with a very positive enthalpy and a positive heat capacity effect. This indicates that formation of SS HMG box–DNA complexes is specified by extensive van der Waals contacts between apolar groups, i.e. a more tightly packed interface forms than in NSS complexes. The other principal difference is that DNA bending by the NSS DBDs is driven almost entirely by the electrostatic component of the binding energy, while DNA bending by SS DBDs is driven mainly by the non-electrostatic component.
The basic extensions of both categories of HMG box play a similar role in DNA binding and bending, making solely electrostatic interactions with the DNA. © 2004 Elsevier Ltd. All rights reserved.

*Corresponding author

Keywords: HMG-Protein; DNA-binding; DNA-bending; specificity; thermodynamics

Abbreviations used: DBD, DNA binding domain; Lef79, the DNA binding domain of mouse LEF-1 protein comprising the HMG box of 79 residues, i.e. lacking eight C-terminal residues of Lef86 but containing one additional N-terminal residue; Lef86, the DNA binding domain of mouse LEF-1 protein consisting of 86 residues, i.e. Lef79 plus eight C-terminal residues but one less N-terminal residue; Sox, the DNA binding domain (HMG box) of mouse Sox-5 protein comprising 81 residues; Sry, the DNA binding domain (HMG box) of human SRY protein comprising 81 residues; Box-B′, the DNA binding domain (HMG box) of human HMGB1 protein comprising 101 residues; D74, the DNA binding domain of Drosophila melanogaster HMG-D protein comprising the HMG box of 74 residues, i.e. without the basic tail; D100, the DNA binding domain of Drosophila melanogaster HMG-D protein consisting of 100 residues, i.e. D74 and the 26 residue C-terminal basic tail; C_p^F, the heat capacity of the folded protein; C_p^U, the heat capacity of the unfolded protein; FAM, 5,6-carboxyfluorescein; TAMRA, 5-carboxytetramethylrhodamine; FRET, fluorescence resonance energy transfer; AFE, asymptotic FRET effect; σ, standard deviation; SS, sequence-specific; NSS, non-sequence-specific.

E-mail address of the corresponding author: [email protected]

0022-2836/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.


Introduction

Studies over the past 30 years have shown that DNA binding domains (DBDs) recognize DNA in a wide variety of ways. Some DBDs, such as the bacterial repressors, are fully folded before they bind to unique recognition sites, exhibiting sequence-specific DNA binding with very high affinity and long residence times, using a lock-and-key mechanism [1]. In eukaryotes the situation is more complex since the number of genes to be regulated far exceeds the number of transcription factors (TFs), so the operational units are combinatorial complexes of TFs and a single transcription factor binds to multiple sites in the genome [2]. In consequence, there is often relaxation in the specificity of DNA sequence recognition arising from a degree of conformational adaptability [3]. In some cases this adaptability may be assisted by a high level of "induced fit" on the part of the protein resulting in the induction of structure, as in the case of the bZIP [4] or AT-hook [5] proteins, for which the DBDs are fully disordered in free solution but have highly defined conformations in their complex with the DNA. In other cases, the result is DNA rather than protein distortion, typically achieved through binding to the minor groove, as with the TATA box binding protein (TBP) [6,7], integration host factor (IHF) [8] and the HMG box [9]. The HMG box represents a particularly interesting example of a versatile eukaryotic DBD in that within the context of an essentially conserved L-shaped fold, two distinct categories exist: sequence-specific (SS) boxes, which exhibit strong DNA sequence preference (rather than an absolute sequence requirement), and non-sequence-specific (NSS) boxes, which appear to bind to a very wide range of sequences [10,11]. Despite these differences, both categories are capable of binding to B-form DNA with high affinity [12–14] and inducing a large bend [15–17] to form complexes of rather similar structure.
Both types of HMG box proteins participate in a wide variety of complexes that bind to promoters/enhancers (commonly termed enhanceosomes), and part, or sometimes all, of their function is architectural. For example, the NSS HMGB1/2 proteins act as co-regulators with nuclear hormone receptors, such as the progesterone receptor [18], to facilitate their binding to target DNA sequences. A similar situation holds for the SS Sox2 protein, which participates in complexes with POU domain proteins such as Oct1 and Oct4 to facilitate transcription during early embryonic development [19]. Despite these similarities in overall function, the DNA sequence selectivity of the two categories of HMG box is widely different [20]. Two features have been shown to distinguish the complexes formed by the two categories [21–25]: while SS complexes contain just one intercalation site and several base-specific contacts, NSS complexes contain two intercalation sites. Previous studies of the free SS Sox-5 [21,26] and Sox-4 [27] in solution have shown that

HMG Box–DNA Complexes

their minor wings are unstable, but it is unclear how unstable the other SS DBDs are. It is notable, however, that their structures have been obtained only in complexes with DNA [9,22,24], in contrast to NSS DBDs [23,25]. The task is therefore to understand how these differences are expressed in the forces driving the DNA binding of SS and NSS HMG boxes. This can be achieved only by a detailed comparative study of the thermal properties of the DBDs representing the SS and NSS classes and the thermodynamics of their association with various DNA sequences. This requires determination not only of the binding constants, i.e. the Gibbs energy of binding, but of both components of the Gibbs energy, the enthalpic and entropic, under different conditions of temperature and salt concentration, since it is these components that carry information on the nature of the forces involved. In the case of HMG boxes, obtaining this information, their "thermodynamic signatures", is not a straightforward matter due to their rather low stability. The present study shows how to overcome these difficulties and obtain a full thermodynamic characterization of the interactions responsible for the formation of HMG box complexes of both types.

Here we consider the thermodynamic properties of several HMG box DBDs, both SS and NSS, and the energetics of their association with various DNA sequences, so as to define the factors responsible for DNA specificity and bending ability. The DBDs of mammalian HMGB1, of Drosophila melanogaster HMG-D and yeast NHP6A are NSS binders [23,25,28–35], while others, such as the DBDs of mLEF-1 [12,15,22,36], hSRY [9,24,37] and mSox-5 [38,39,26], are SS binders. Some of these proteins, namely the SS DBD of mSox-5 [40,21] and the NSS DBD of HMG-D [14], have been investigated thermodynamically before, but these studies are now significantly extended.

Results

The objects

The proteins and DNA duplexes used in this study are presented in Figure 1(a) and (b). The proteins are the DNA binding domains of: (1) hSRY, denoted here as Sry; (2) mLEF-1, consisting of 86 residues, denoted here as Lef86 and (3) its truncated form lacking seven residues, denoted here as Lef79; (4) mSox-5, denoted here as Sox; (5) yNHP6A, denoted here as NHP; and (6) Box 2 (Box B) of HMGB1, denoted here as Box-B′. Figure 1 also shows the sequences of the previously studied HMG-D100, denoted simply as D100, and HMG-D74, its truncated form, denoted as D74. The parts of the protein sequences forming the globular fold are boxed to distinguish them from the basic C and N-terminal extensions. Positively charged residues are shown in blue and negatively charged residues in red. The overall charges are summarized in the first column of Table 1, showing that all the DBDs carry a high net positive charge.


Figure 1. (a) Sequences of the considered HMG box DBDs. Residues that form the globular domains of the proteins are enclosed in boxes to distinguish them from the basic extensions. Positively charged residues are shown in blue and negatively charged residues in red. The Lef86 DBD is exactly the same as that used in the NMR structure determination of its DNA complex [22]. The Sry DBD (81 residues, the same fragment as used by Dragan et al. [14]) has the same N terminus as that used in the structure determination by Murphy et al. [24] but is shorter at the C terminus by four residues. These four residues were found to be disordered in the structure presented by Murphy et al. [24]. (b) The DNA duplexes used in this study. Unlabeled DNA duplexes were used in calorimetric experiments; the 16 bp DNA duplexes labeled at the 5′ and 3′ ends with FAM and TAMRA were used in fluorescence anisotropy and FRET experiments.

The DNA duplexes used are: for Lef86 and Lef79 (DNALef), the LEF-1 cognate binding site TTCAAA [12,22]; for Sry (DNASry), the DNA sequence CACAAA, originally identified by Haqq et al. [41] and subsequently used in the structure determination of its DNA complex [24]; and for Sox (DNASox), the sequence AACAAT [38]. DNAAT is the DNA duplex used in studying the structure of the NSS complex


Table 1. The log of association constants of the HMG DBDs with the different DNA sequences in 10 mM potassium phosphate (pH 6.0), 100 mM KCl; their extrapolated value at 1 M KCl, log(Ka)^0, and the slopes of the log(Ka) versus log[KCl] plots, all at 20 °C

DBD (charges)               Property              DNALef        DNASry        DNASox        DNAAT
Sry (22+ − 9− = 13+)        log(Ka)               10.3 ± 0.3    10.8 ± 0.2    12.4 ± 0.2    9.4 ± 0.2
                            log(Ka)^0             1.8 ± 0.08    2.4 ± 0.04    3.8 ± 0.05    0.8 ± 0.08
                            −∂log Ka/∂log[KCl]    8.5 ± 0.17    8.4 ± 0.11    8.6 ± 0.15    8.6 ± 0.12
                            Z                     13.2 ± 0.3    13.1 ± 0.2    13.4 ± 0.3    13.4 ± 0.2
Lef86 (22+ − 10− = 12+)     log(Ka)               10.2 ± 0.2    8.0 ± 0.5     8.7 ± 0.3     –
                            log(Ka)^0             3.4 ± 0.07    1.4 ± 0.2     2.1 ± 0.1     –
                            −∂log Ka/∂log[KCl]    6.8 ± 0.13    6.6 ± 0.3     6.7 ± 0.2     –
                            Z                     10.6 ± 0.2    10.3 ± 0.5    10.5 ± 0.3    –
Lef79 (14+ − 8− = 6+)       log(Ka)               6.7 ± 0.2     5.0 ± 0.5     –             –
                            log(Ka)^0             3.3 ± 0.07    1.7 ± 0.5     –             –
                            −∂log Ka/∂log[KCl]    3.4 ± 0.1     3.3 ± 0.6     –             –
                            Z                     5.3 ± 0.1     5.1 ± 0.9     –             –
Sox (19+ − 9− = 10+)        log(Ka)               7.9 ± 0.2     8.1 ± 0.2     8.6 ± 0.2     7.7 ± 0.2
                            log(Ka)^0             2.4 ± 0.1     2.5 ± 0.1     3.4 ± 0.1     2.0 ± 0.1
                            −∂log Ka/∂log[KCl]    5.5 ± 0.1     5.6 ± 0.1     5.2 ± 0.1     5.7 ± 0.1
                            Z                     8.6 ± 0.2     8.7 ± 0.1     8.1 ± 0.2     8.9 ± 0.1
NHP (22+ − 14− = 8+)        log(Ka)               8.6 ± 0.2     8.4 ± 0.2     –             –
                            log(Ka)^0             2.1 ± 0.1     2.0 ± 0.1     –             –
                            −∂log Ka/∂log[KCl]    6.4 ± 0.2     6.4 ± 0.2     –             –
                            Z                     9.2 ± 0.3     10.0 ± 0.3    –             –
Box-B′ (30+ − 14− = 16+)    log(Ka)               8.0 ± 0.3     –             8.4 ± 0.3     –
                            log(Ka)^0             0.8 ± 0.1     –             1.3 ± 0.1     –
                            −∂log Ka/∂log[KCl]    7.2 ± 0.3     –             7.1 ± 0.3     –
                            Z                     11.2 ± 0.5    –             11.0 ± 0.5    –
D100 (27+ − 13− = 14+)      log(Ka)               8.6 ± 0.3     –             8.6 ± 0.2     8.7 ± 0.2
                            log(Ka)^0             2.8 ± 0.2     –             2.6 ± 0.1     2.8 ± 0.1
                            −∂log Ka/∂log[KCl]    5.8 ± 0.3     –             6.0 ± 0.2     5.9 ± 0.1
                            Z                     9.1 ± 0.5     –             9.4 ± 0.3     9.2 ± 0.2
D74 (17+ − 13− = 4+)        log(Ka)               5.0 ± 0.1     –             5.2 ± 0.1     5.7 ± 0.1
                            log(Ka)^0             2.5 ± 0.02    –             2.6 ± 0.07    3.0 ± 0.08
                            −∂log Ka/∂log[KCl]    2.5 ± 0.03    –             2.6 ± 0.08    2.7 ± 0.06
                            Z                     3.9 ± 0.1     –             4.1 ± 0.1     4.2 ± 0.1

Z represents the number of ionic contacts formed between a DBD and the phosphate groups of DNA.

of D74 [23], which contains the sequence ATAT at its center. For studying DNA binding and bending by FRET and fluorescence anisotropy, DNA duplexes labeled with the fluorophores FAM and TAMRA at the 5′ ends were used.

Thermodynamic properties of the free proteins

Figure 2(a) and (b) show the temperature dependence of the molar ellipticity at 222 nm of the SS and NSS DBDs. All these DBDs differ somewhat in helicity at low temperatures and upon heating the helicity of all the DBDs decreases sigmoidally over the range from 30 °C to 60 °C, dropping to a value corresponding to a completely unfolded state. In addition, the ellipticity also changes at temperatures below 30 °C: these changes are especially pronounced in the case of SS DBDs and proceed more steeply than expected for fully folded stable proteins [42], indicating some temperature-induced conformational change. The partial specific heat capacities of the DBDs measured by DSC over a broad temperature range paint the same picture (Figure 2(c) and (d)). The peak of the heat capacity function coincides with the sigmoidal change in ellipticity and shows that protein unfolding proceeds with significant heat absorption, i.e. enthalpy change. After this extensive heat absorption peak the partial specific heat capacity drops to the level expected for completely unfolded proteins, C_p^U, calculated by summing up the heat capacities of the constituent amino acid residues (see Materials and Methods). These C_p^U functions differ little among the DBDs and cannot be distinguished on the scale used. An important observation is that at low temperatures the partial specific heat capacities of all the SS DBDs are significantly higher than expected for compact globular proteins, C_p^F, whilst for the NSS DBDs, in particular Box-B′, they drop to this level at about 0 °C. The C_p^F function was obtained by averaging the specific heat capacity functions of BPTI and lysozyme, two globular proteins of comparable size to the HMG box DBDs that are highly stable over the entire temperature range considered here (see Materials and Methods). The higher heat capacities of the SS HMG box DBDs show that even at the lowest temperatures these proteins are not fully folded, in contrast to the NSS DBDs. It is also seen that melting of NSS DBDs proceeds with a much more pronounced heat


Figure 2. (a) and (b) Temperature dependencies of the molar ellipticities at 222 nm of the SS and NSS DBDs, respectively. (c) and (d) The partial specific heat capacity functions of the SS and NSS DBDs, respectively. The C_p^F function represents the expected partial specific heat capacity of fully folded protein; the C_p^U function shows the partial specific heat capacity of the completely unfolded polypeptide chains (see Materials and Methods).

absorption peak than in the case of SS DBDs, the excess heat absorption of which is more diffuse.

Thermal properties of the DBD/DNA complexes

Figure 3 shows the observed partial molar heat capacity functions for free Lef86 and Sry proteins (green), the free cognate DNA duplexes (red), their complexes (black) and for each case the calculated sum of the heat capacities of the protein and DNA (blue dashed-dotted lines). Also shown are the heat capacity functions expected for the fully folded proteins, C_p^F (black broken lines), and the sum of the heat capacity functions of DNA, C_p^DNA, and C_p^F (purple broken lines). One can see that both complexes are more thermostable than the free proteins and that the heat capacity functions for the complexes are lower than the sum of the functions of free DNA and free protein, i.e. complex formation results in a decrease of the heat capacity. It is also notable that the sum of the heat capacity functions of DNA and fully folded protein almost coincides with the heat capacity function of the complex at low temperatures. The summed function is, however, a little higher, since association of folded protein with DNA also results in a small heat capacity decrement, termed the heat capacity effect of binding, ΔC_p^a. Comparison of the summed and measured functions shows that association with DNA prevents unfolding of the proteins at temperatures below the cooperative unfolding/dissociation of the whole complex, i.e. association of protein with DNA results in refolding of protein.
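The heat capacity effect of binding described above can be written compactly. The following is a sketch in our own notation: the symbol for the heat capacity of the complex, C_p^{complex}, is an assumption introduced here for illustration, while the other symbols follow the text.

```latex
% Heat capacity effect of binding for the fully folded protein:
% measured complex minus the sum of free DNA and folded protein.
\Delta C_p^{a} \;=\; C_p^{\mathrm{complex}} \;-\; \left( C_p^{\mathrm{DNA}} + C_p^{F} \right)
```

Because the summed function lies slightly above the measured function of the complex for Lef86 and Sry, this quantity is negative for those two SS complexes.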

DNA binding

Since the association constants of the HMG box DBDs with the DNAs are rather high and need to be determined under a variety of solvent conditions and temperatures, they can be measured efficiently only by optical methods, in particular by fluorescence anisotropy and FRET titration. An advantage of using FRET is that formation of the DBD–DNA complexes results in considerable bending of DNA and a corresponding change in fluorescence resonance energy transfer between donor and acceptor fluorophores located at the 5′ ends. From the observed changes in the FRET effect upon titrating DNA with HMG box DBDs, one can therefore derive protein-binding isotherms using low DNA concentrations, which is important for defining the binding function when the affinities are high. Furthermore, by exciting the fluorescence of the same DNA duplex at the maximum of the acceptor absorption (i.e. 560 nm for TAMRA), one can observe the anisotropy of its fluorescence, which reflects the change in DNA tumbling rate caused by protein binding, thereby yielding the binding isotherm (see Materials and Methods). Both methods gave very similar binding constants. Fluorescence anisotropy and FRET titrations carried out at different temperatures and salt concentrations showed that binding isotherms are very specific for the given DBD–DNA complex (Figure 4(a)), and for any given complex they depend on the KCl concentration (Figure 4(b)). It was also found that the binding constants derived


Figure 3. The partial molar heat capacity functions of free Lef86 and Sry (green), their free cognate DNA duplexes (red, C_p^DNA) and their complexes (black). The dot-dashed lines (blue) show the sums of the heat capacities of the free proteins and the free DNAs; the black broken lines, C_p^F, show the heat capacity function expected for the fully folded proteins (see equation (5)). The purple broken lines represent the sums of the heat capacity functions of the DNA and the fully folded proteins in the low temperature region (below 30 °C) linearly extrapolated to higher temperatures. All results obtained in 10 mM potassium phosphate (pH 6.0), 100 mM KCl.

from these isotherms for various complexes do not depend on temperature below 30 °C: e.g. for the Lef–DNALef complex log(Ka) was found to be (9.9 ± 0.2) at 5 °C and (10.2 ± 0.2) at 20 °C; for the Sry–DNASry (10.2 ± 0.3) and (10.8 ± 0.3) and for the Box-B′–DNALef (8.2 ± 0.3) and (8.0 ± 0.3) at these temperatures, respectively. Binding to different DNA sequences does not alter the slope of the log(Ka) versus log[KCl] plot but gives a parallel shift and therefore results in different extrapolated values at log[KCl] = 0, i.e. log(Ka)^0 (Figure 4(c)). A similar linear dependence of log(Ka) on log[KCl] was found for the other DBDs. For example, Lef86 and its truncated form Lef79 binding to their cognate DNALef (Figure 4(d), filled squares) both show a linear dependence, but with different slopes that extrapolate to the same point at log[KCl] = 0. This extrapolation shows that both Lef DBDs exhibit the same non-electrostatic interactions with their cognate DNA, thereby demonstrating that the eight C-terminal residues of Lef86 make no non-electrostatic interactions with the DNA. However, the association of Lef86 and Lef79 with the non-cognate DNASry (Figure 4(d), open squares) is significantly weaker and correspondingly their log(Ka) functions are shifted downwards to give lines parallel with those found for the association with the cognate DNA; these likewise converge at log[KCl] = 0. Table 1 lists the association constants (as log(Ka)) of the considered DBDs with the different DNAs at 20 °C in 100 mM KCl, the dependence of log(Ka) on log[KCl], i.e. ∂log(Ka)/∂log[KCl], and the asymptotic value of log(Ka)^0 at log[KCl] = 0, i.e. at [KCl] = 1 M. Sry has the highest affinity not for the DNASry sequence used in the NMR study of the Sry–DNA complex [24], but for the DNASox sequence (AACAAT), previously found to be optimal for binding hSRY [41,43,44]. The binding constant for Box-B′, 1.1×10^8 M^-1, is significantly higher than that reported earlier [34], 0.5×10^6 M^-1, probably because the DBD used here has longer N and C-terminal extensions that bear extra positive charges. The binding constant for NHP, both with DNALef and DNASry, is 3.3(±1.0)×10^8 M^-1, slightly higher than the reported value [7] of 1.0×10^8 M^-1, determined by gel shift experiments.

Bending of the DNA duplexes

The asymptotic value of the FRET effect (AFE) reflects the extent of deformation of the DNA upon 100% protein binding (Figure 4(a)). Binding of the DBDs to DNA of the same length (16 bp) results in different values of the AFE, showing that the DNA deformations induced are not the same. It is also seen that the AFE depends significantly on the KCl concentration (Figure 4(b)). Since the AFE depends on the distance between the donor and acceptor fluorophores, Rda, one can estimate the distance between the DNA ends in the complexes (see Materials and Methods). The AFE data show that in solutions containing 100 mM KCl, the distance between the ends of the free 16 bp DNA (plus the length of the label connectors) is 60(±0.1) Å, whilst in the DBD–DNA complexes this distance is significantly reduced (see Table 2). Since all the FRET experiments used DNA duplexes of the same 16 bp size, the observed change in the distance between its ends, Rda, upon binding of different proteins can be regarded as a


Figure 4. (a) Change of the FRET effect upon titration of the 16 bp double-labeled DNAs with their cognate DBDs at 20 °C in 10 mM potassium phosphate (pH 6.0), 100 mM KCl. The broken lines correspond to the asymptotic values of the FRET effect (AFE). (b) Isotherms of D100 titrated into DNALef at four different KCl concentrations in 10 mM potassium phosphate (pH 6.0). For 100 mM KCl, [DNA] = 50 nM; for the three higher concentrations of KCl, [DNA] = 500 nM. The broken lines correspond to the asymptotic values of the FRET effect (AFE). (c) Dependence of the logs of the Sry association constants with the four different DNAs on the KCl concentration, at 20 °C in 10 mM potassium phosphate (pH 6.0). (d) Dependence of the log values of the association constants of Lef86 and Lef79 with their cognate and non-cognate DNAs on the KCl concentration, obtained from changes in the intrinsic tryptophan fluorescence of the proteins, at 20 °C in 10 mM potassium phosphate (pH 6.0).

measure of the protein-induced bending of the DNA. However, the change in Rda value is a relative parameter of the DNA deformation that depends on the length of the DNA duplex used: a more useful parameter is the bend angle. In the case of these DBDs, which are all similar in structure and smoothly bend DNA over several base-pairs, and using DNAs of identical size, the deformation expressed in Rda can be transformed into a bend angle using as standards the known structures of the SS complexes, Lef86–DNALef and Sry–DNASry [22,24], and free DNA (Figure 5(a)) [14]. Table 2 shows that the maximal bending is seen for Lef86 binding to its cognate DNALef. Removal of the eight residue C-terminal extension (to Lef79) decreases the bending angle by 29°, a reduction close to that determined previously by the circular permutation assay [45], consistent with the previous studies of others [22]. The ability of Sry to bend the DNASox duplex (as well as DNASry) is the lowest among the sequence-specific HMG boxes, although its affinity for DNASox is the highest observed. Determination of bend angles in the case of NSS DBDs might be problematic due to lack of a defined binding site [46]. However, the calibrated FRET measurements show that NHP bends DNALef and DNASry by 62(±4)°, and this value is close to the value of 70° found for the NMR structure of the NHP–DNASry complex [25], and is close to that determined earlier by circular permutation assays [47]. For the D74–DNAAT complex, FRET gave 92(±9)° for the bend angle [14], while according to the crystal structure [23] it is of the order of 110°. The observed decrease of the AFE, and hence


Table 2. The asymptotic FRET effect (AFE), the distance between the DNA ends (Rda) and the DNA bend angle in the complexes of the HMG box DBDs in 10 mM potassium phosphate (pH 6.0), 100 mM KCl

DBD        Property      DNALef          DNASry        DNASox
Sry        AFE           0.248           0.244         0.261
           Rda           52.8            53.3          51.3
           Bend angle    59 ± 4          54 ± 3 (a)    75 ± 6
Lef86      AFE           0.293           0.233         0.240
           Rda           47.9            54.6          53.8
           Bend angle    117 ± 10 (b)    39 ± 4        49 ± 5
Lef79      AFE           0.270           –             –
           Rda           50.4            –             –
           Bend angle    88 ± 8          –             –
Sox        AFE           0.228           0.231         0.280
           Rda           55.2            55.0          49.3
           Bend angle    36 ± 4          40 ± 4        101 ± 9
NHP        AFE           0.25            0.25          –
           Rda           52.6            52.5          –
           Bend angle    61 ± 4          62 ± 4        –
Box-B′     AFE           0.231           –             0.235
           Rda           54.9            –             0.377
           Bend angle    39 ± 5          –             44 ± 6
D100 (c)   AFE           0.296           –             0.285
           Rda           47.6            –             48.8
           Bend angle    121 ± 11        –             106 ± 9
D74 (c)    AFE           0.274           –             0.271
           Rda           47.6            –             50.2
           Bend angle    94 ± 8          –             90 ± 8

Rda is given in ångström units and bend angles in degrees.
(a) Bend angles determined from the NMR structures of the Sry–DNA [24] complexes.
(b) Bend angles determined from the NMR structures of the Lef86–DNA [22] complexes.
(c) Data from Dragan et al. [14].

DNA bend angle, with increasing KCl concentration (Figure 4(b)) is a general feature of the studied DBD–DNA complexes, but is substantially steeper for the NSS complexes (Figure 5(b)). Figure 5(b) also shows that increase of salt concentration does not change the bend angle of the free double-labeled DNA duplex, i.e. salt does not change the optical properties of the fluorophores.

The heat effects of association

The ITC-measured enthalpies of association for the SS DBDs with their cognate DNAs and for the NSS DBDs with DNALef and DNASox, shown in Figure 6(a) and (b), differ considerably. The binding enthalpies of the SS DBDs depend noticeably on temperature, whereas the enthalpies of binding Box-B′ and D100 depend little on temperature and are substantially more positive. The enthalpies for Box-B′ differ somewhat from the values reported for association of this DBD with GC and AT duplexes (16 and 35 kJ mol^-1 at 15 °C, respectively) [34] but lie between these two values, as expected for a DNA duplex containing both A-T and G-C base-pairs. The uncorrected binding enthalpies are not linear functions of temperature. The reason for this deviation from linearity is that all the free DBDs, especially the SS HMG boxes, start to unfold upon heating from very low temperatures, absorbing excess heat. Since association with DNA results in refolding of the DBDs (Figure 3), the heat of refolding is included in the calorimetrically measured heat effect of association. Thus, to determine the net enthalpy of association of the fully folded proteins with DNA, i.e. in the state

Figure 5. (a) Dependence of the DNA bend angle on the distance between the ends of the 16 bp duplexes, Rda. Green circles: Lef86 with DNALef (1), DNASox (8), DNASry (9). Red square: Lef79 with DNALef (3). Black squares: Sox with DNASox (2), DNASry (10), DNALef (12). Blue circles: Sry with DNASox (4), DNALef (6), DNASry (7). Wine star: NHP with DNASry (5). Violet star: Box-B′ with DNALef (11). Light blue circles: free DNA (13). Arrows indicate the calibration points. (b) Dependence of the induced DNA bend angles on the KCl concentration in 10 mM potassium phosphate (pH 6.0). Note that the conformation of the free DNA detected by FRET does not change upon increasing KCl concentration, meaning that the FRET effect itself is not sensitive to change of KCl concentration.
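The calibration idea behind Figure 5(a), converting an end-to-end distance Rda into a bend angle by reference to standards, can be illustrated with a rough sketch. The code below is not the authors' calibration curve: it assumes simple piecewise-linear interpolation between three points, the Lef86–DNALef and Sry–DNASry standards as listed in Table 2 and free DNA at 60 Å, for which a bend angle of 0° is assumed here. The names `CALIBRATION` and `bend_angle_from_rda` are hypothetical.

```python
from bisect import bisect_right

# Calibration points (Rda in angstroms, bend angle in degrees).
# The first two are the NMR-based standards listed in Table 2; the third is
# free DNA, for which a straight duplex (0 degrees) is ASSUMED in this sketch.
CALIBRATION = [(47.9, 117.0), (53.3, 54.0), (60.0, 0.0)]


def bend_angle_from_rda(rda):
    """Estimate a bend angle by piecewise-linear interpolation.

    This is an illustrative stand-in for the calibration curve of
    Figure 5(a); values outside the calibrated range are extrapolated
    linearly from the nearest segment.
    """
    points = sorted(CALIBRATION)
    distances = [p[0] for p in points]
    if rda <= distances[0]:
        i = 0
    elif rda >= distances[-1]:
        i = len(points) - 2
    else:
        i = bisect_right(distances, rda) - 1
    (x0, y0), (x1, y1) = points[i], points[i + 1]
    return y0 + (y1 - y0) * (rda - x0) / (x1 - x0)


# Example: Rda measured for Lef79 with DNALef (Table 2 lists 88 +/- 8 degrees).
lef79_angle = bend_angle_from_rda(50.4)
print(f"Lef79-DNALef bend angle estimate: {lef79_angle:.1f} degrees")
```

A linear interpolation between standards is only a first approximation, so estimates from this sketch agree with the tabulated angles within their stated uncertainties for some complexes but need not reproduce them exactly.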


Figure 6. (a) and (b) The ITC-measured binding enthalpies of the SS DBDs with their optimal DNAs and NSS DBDs with DNALef and DNASox, plotted as functions of temperature. The inset in (a) shows Sry binding enthalpies with DNASox measured at three different KCl concentrations and at two different temperatures in 10 mM potassium phosphate (pH 6.0). (c) and (d) The binding enthalpies shown in (a) and (b) corrected for protein refolding.

adopted at temperatures below those at which they start to unfold, which can be regarded as a standard state, this heat effect must be excluded from the calorimetrically measured heat of protein–DNA association. The procedure for the correction of the measured enthalpy of association is described in Materials and Methods. Figure 6(c) and (d) show that the corrections for these heats of refolding are substantial, particularly for the SS DBDs, and significantly change the association enthalpy values: their dependence on temperature now becomes linear. ITC measurements of the heat effects of binding the SS DBDs with the non-cognate DNAs were also made. Table 3 lists the enthalpies of association of the DBDs with the various DNAs, corrected for refolding, determined at 20 °C in the standard buffer, and the heat capacity effects of binding. This Table also includes the previously published enthalpies of association of the SS Sox [21], and the NSS D74 and D100 DBDs [14].

Since D74 and D100 do not include histidine residues, while Lef86 and Lef79 have three and Sry has one, it is possible that the observed differences in the enthalpies of binding at pH 6.0 result from protonation/deprotonation of histidine residues, a reaction having a large enthalpy. Although these histidine residues do not directly contact DNA in the complexes [22,24], proximity to DNA might indirectly affect their pK values. The heat of DNA binding by Lef86 was therefore measured at three different pH values (pH 5.0, 6.0 and 7.0) at 20 °C and the binding enthalpies were found to be very similar, deviating by less than 2 kJ mol^-1 (data not shown). The fact that these effects are within the experimental error suggests that there is no enthalpic effect from the protonation of histidine residues in this DBD–DNA association. It was also found that the binding enthalpies do not depend on the KCl concentration, as illustrated for Sry–DNASox at 5 °C and 20 °C in the inset to Figure 6(a).
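The results above (a linear dependence of log(Ka) on log[KCl], and binding enthalpies that do not depend on KCl concentration) suggest how the binding energetics can be split into electrostatic and non-electrostatic parts. The following is a minimal sketch of that bookkeeping, not the authors' analysis code. It assumes that the value extrapolated to 1 M KCl, log(Ka)^0, represents the non-electrostatic part, and that the salt-independent enthalpy means the electrostatic part is purely entropic. The function name `decompose_binding` and its parameter names are hypothetical; the example inputs for Sry with DNASox are taken from Table 1 and Table 3.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ K^-1 mol^-1


def decompose_binding(log_ka0, slope, delta_h, kcl_molar=0.1, temp_c=20.0):
    """Split binding energetics into electrostatic and non-electrostatic parts.

    log_ka0   : log(Ka) extrapolated to 1 M KCl (log[KCl] = 0)
    slope     : -dlog(Ka)/dlog[KCl], given as a positive number
    delta_h   : binding enthalpy corrected for refolding, kJ/mol
    kcl_molar : KCl concentration at which the total values are evaluated
    """
    temp_k = temp_c + 273.15
    kj_per_log_unit = math.log(10) * R_KJ * temp_k  # 2.303*R*T

    # Linear salt dependence: log Ka = log Ka0 - slope * log[KCl]
    log_ka = log_ka0 - slope * math.log10(kcl_molar)

    dg_total = -kj_per_log_unit * log_ka
    dg_nel = -kj_per_log_unit * log_ka0   # salt-independent part
    dg_el = dg_total - dg_nel             # salt-dependent part

    return {
        "log_ka": log_ka,
        "dG": dg_total,
        "dG_el": dg_el,
        "dG_nel": dg_nel,
        "TdS": delta_h - dg_total,
        # ASSUMPTION: zero electrostatic enthalpy, so T*dS_el = -dG_el
        "TdS_el": -dg_el,
        "TdS_nel": delta_h - dg_nel,
    }


# Sry with DNASox: log(Ka)^0 = 3.8, slope = 8.6 (Table 1), dH = 27 kJ/mol (Table 3)
sry_sox = decompose_binding(log_ka0=3.8, slope=8.6, delta_h=27.0)
for name, value in sry_sox.items():
    print(f"{name:8s} {value:8.1f}")
```

With these inputs the sketch gives values within about 1 kJ mol^-1 of the corresponding Table 3 entries for Sry with DNASox, which is within the rounding of the tabulated inputs.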

Discussion

Stability of the DBD structures

Values of the mean residue ellipticities and partial specific heat capacities of the free DBDs make clear that even at the lowest temperatures they are not completely folded, especially in the


Table 3. The total enthalpy, heat capacity, Gibbs energy, entropy factor, and the electrostatic and non-electrostatic components of binding the HMG box DBDs with various DNAs at 20 °C in 10 mM potassium phosphate (pH 6.0), 100 mM KCl, corrected for protein refolding

Columns 3 to 6: total; column 7: electrostatic; columns 8 and 9: non-electrostatic.

| DBD | DNA | ΔH^a | ΔC_p^a | −ΔG^a | TΔS^a | TΔS^a_el = −ΔG^a_el | −ΔG^a_nel | TΔS^a_nel |
|---|---|---|---|---|---|---|---|---|
| Sry | Sox | 27 | −2.6 | 69 | 96 | 48 | 21 | 48 |
| Sry | Sry | 35 | −1.6 | 62 | 97 | 47 | 14 | 49 |
| Sry | Lef | 38 | −1.6 | 56 | 94 | 48 | 8 | 46 |
| Sox (a) | Sox | 6 | −1.4 | 48 | 54 | 29 | 19 | 25 |
| Sox | Sry | 9 | −0.5 | 45 | 54 | 31 | 14 | 23 |
| Sox | Lef | 17 | −0.1 | 44 | 61 | 31 | 14 | 31 |
| Lef86 | Lef | 12 | −1.2 | 56 | 68 | 37 | 19 | 31 |
| Lef86 | Sox | 20 | −0.8 | 49 | 69 | 37 | 12 | 32 |
| Lef86 | Sry | 43 | −0.3 | 45 | 88 | 37 | 8 | 51 |
| Lef79 | Lef | 13 | −2.0 | 38 | 51 | 19 | 19 | 32 |
| Lef79 | Sry | – | – | 28 | – | 18 | 13 | – |
| NHP | Lef | 40 | +2.8 | 47 | 87 | 36 | 11 | 51 |
| NHP | Sry | 42 | +2.7 | 48 | 90 | 36 | 12 | 54 |
| Box-B′ | Lef | 32 | +1.1 | 44 | 76 | 39 | 5 | 37 |
| Box-B′ | Sox | 38 | +1.1 | 46 | 84 | 41 | 5 | 43 |
| D100 (b) | Lef | 64 | +1.0 | 46 | 110.0 | 30 | 16 | 80 |
| D74 (b) | Lef | 70 | −1.0 | 27 | 92 | 14 | 13 | 83 |
| σ | | ±3.0 | ±0.2 | ±0.7 | ±4.0 | ±1.0 | ±1.0 | ±5 |

Units: ΔH^a, ΔG^a and TΔS^a in kJ mol⁻¹; ΔC_p in kJ K⁻¹ mol⁻¹.
(a) Data for Sox with DNASox taken from Privalov et al.21 but re-analyzed to take account of transition 1′.
(b) Data from Dragan et al.14
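The columns of Table 3 are linked by simple relations used throughout the Discussion: the total Gibbs energy is the sum of its electrostatic and non-electrostatic parts, the electrostatic part is treated as purely entropic, and each entropy factor follows from TΔS = ΔH − ΔG. A minimal Python sketch of this bookkeeping, applied to the Sry–DNASox entries of Table 3, is given below; the function name and dictionary keys are illustrative assumptions and not part of the original analysis.

```python
def decompose(dH, neg_dG, neg_dG_el):
    """Derive the remaining Table 3 columns (all values in kJ/mol).

    dH        -- total enthalpy of association, corrected for refolding
    neg_dG    -- magnitude of the total Gibbs energy of association (-dG)
    neg_dG_el -- magnitude of the electrostatic Gibbs energy (-dG_el)
    """
    neg_dG_nel = neg_dG - neg_dG_el      # dG = dG_el + dG_nel
    return {
        "neg_dG_nel": neg_dG_nel,
        "TdS": dH + neg_dG,              # T*dS = dH - dG
        "TdS_el": neg_dG_el,             # electrostatic enthalpy taken as zero
        "TdS_nel": dH + neg_dG_nel,      # T*dS_nel = dH - dG_nel
    }

# Sry with DNASox (values read from Table 3)
row = decompose(dH=27.0, neg_dG=69.0, neg_dG_el=48.0)
print(row)
```

For this row the derived entries coincide with the tabulated ones (21, 96 and 48 kJ mol⁻¹); other rows agree within the quoted errors or a few kJ mol⁻¹.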

case of the SS DBDs (Figure 2). From structural studies it is probable that a significant part of the polypeptide chain of these DBDs is indeed unfolded: in the case of Lef86, the heavily charged 19 residue C-terminal segment after Pro67, comprising 22% of its residues, is largely in an extended conformation in the major groove of the complex with DNA,22 while in the case of Sry the extended C-terminal segment in the complex with DNA amounts to 17% of its residues.24 It is to be expected that these segments are unfolded in the free proteins, which would significantly raise the absolute partial heat capacities. Using the known heat capacity effects of protein unfolding,48,49 one can calculate that unfolding 22% of Lef86 increases the heat capacity by about 7%. This, however, cannot fully explain the excess heat capacity of these DBDs at low temperatures, nor its steep increase with heating that develops into a peak of heat absorption and ends with heat capacity values expected for fully unfolded polypeptide chains. Deconvolution analysis of the heat capacity functions (Figure 7) shows that unfolding of the SS DBDs starts from very low temperatures, about −10 °C, in contrast to the NSS DBDs, which start to unfold from considerably higher temperatures. Furthermore, in contrast to the NSS DBDs Box-B′, D100/D74 and NHP, unfolding of which proceeds in two stages, unfolding of the SS DBDs Lef79, Lef86 and Sry proceeds in three stages. The Sox DBD is somewhat in between: the first transition (transition 1′) occurs with a rather small enthalpy (and was therefore missed in our previous study),40 and the last transition (transition 2) proceeds with much

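larger enthalpy than for Lef and Sry, although with a significantly smaller enthalpy than transition 2 in the NSS DBDs (Table 4). The Gibbs energies of stabilization of the cooperative sub-domains in Table 4 were determined for the standard temperature T0 = 20 °C = 293.1 K by the equation:

$$\Delta G(T_0) = \Delta H_t\,\frac{T_t - T_0}{T_t} - \Delta C_p\,(T_t - T_0) + T_0\,\Delta C_p \ln\frac{T_t}{T_0} \qquad (1)$$

where T_t is the transition temperature, ΔH_t the enthalpy of the transition at T_t and ΔC_p the heat capacity increment of the transition.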
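Equation (1) is straightforward to evaluate numerically. The following Python sketch is one way to do so; the function name, argument names and the example inputs are assumptions chosen for illustration (the inputs are of the magnitude listed in Table 4 but are not taken from it).

```python
import math

def stabilization_gibbs_energy(tt_celsius, dh_t, dcp, t0_celsius=20.0):
    """Gibbs energy of stabilization at T0 according to equation (1).

    tt_celsius -- transition temperature Tt in degrees Celsius
    dh_t       -- transition enthalpy at Tt, kJ/mol
    dcp        -- heat capacity increment, kJ/(K mol), assumed temperature independent
    """
    tt = tt_celsius + 273.15
    t0 = t0_celsius + 273.15
    return (dh_t * (tt - t0) / tt
            - dcp * (tt - t0)
            + t0 * dcp * math.log(tt / t0))

# Hypothetical sub-domain melting above the standard temperature: stable at 20 C
dg_stable = stabilization_gibbs_energy(tt_celsius=40.0, dh_t=150.0, dcp=2.0)
# Hypothetical sub-domain melting below the standard temperature: unfolded at 20 C
dg_unstable = stabilization_gibbs_energy(tt_celsius=7.0, dh_t=80.0, dcp=0.7)
print(f"{dg_stable:.1f} kJ/mol, {dg_unstable:.1f} kJ/mol")
```

A positive result means the sub-domain is folded at the standard temperature; a negative result, as obtained whenever Tt lies below T0, means it is predominantly unfolded there.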

Since a two-state transition usually corresponds to unfolding of a cooperative domain organized around a hydrophobic core,50 one can conclude that the L-shaped structure of the NSS HMG boxes is subdivided into two cooperative sub-domains, while the SS HMG boxes are subdivided into three cooperative sub-domains, one of which is highly unstable. The fact that transition 2 in Lef79, Lef86 and Sry proceeds with substantially lower enthalpy than in D100/D74, NHP and Box-B′ suggests that the additional transition 1′ in Lef79, Lef86 and Sry appears as a result of destabilization of some part of the sub-domain responsible for transition 2 in these DBDs. In the case of Sox this destabilized part appears to be much smaller. The sums of the enthalpies of the low temperature transition 1′ and the high temperature transition 2 for the SS DBDs are very similar (180–200 kJ mol⁻¹) and close to the enthalpy of the major transition 2 in the NSS DBDs. This supports the assignment of transition 1′ to some part of the major wing, since transition 2 in Sox was directly shown by NMR to correspond to the major wing.40 The structure responsible for transition 1′ is almost fully unfolded at 20 °C and


since it refolds in the presence of DNA it should be involved in binding: a possible assignment would be to loop 1, the region between helix 1 and helix 2 that participates directly in DNA binding. Analysis of the packing densities of the structures of the free solution HMG box DBDs indeed shows that the interior of these domains is not homogeneous (Figure 8). Among the DBDs whose structures have been determined without DNA, only Sox belongs to the SS class, while the other three represent the NSS DBD class. Nevertheless one can see clear-cut differences between the packing density distributions in these NSS DBDs and those in Sox: all the NSS DBDs have a tightly packed major wing, in contrast to Sox for which the densely packed cluster in the major wing is of much more modest dimensions. Correspondingly, the main heat absorption peak for Sox is significantly less pronounced than for the NSS DBDs (Figure 7). Furthermore, the excess heat absorption profiles of the other SS DBDs are even more diffuse than that of Sox, implying that their structures are even less tightly packed, i.e. are more flexible and loose.

Association with DNA

Since association of the DBDs with DNA results in refolding of the protein, the heat effect of refolding must be subtracted from the ITC-measured enthalpies to obtain the net enthalpy of binding the fully

folded protein to DNA (see Materials and Methods). Correction for refolding changes considerably the association enthalpy values and their dependence on temperature, which becomes linear (Figure 6(c) and (d)). The linearity of these functions means that the heat capacity of association, ΔC_p^a, of the fully folded DBDs with their cognate DNAs does not change with temperature and is negative for all the SS DBDs, in contrast to the NSS Box-B′, NHP and D100. This difference in the sign of ΔC_p^a shows that in forming SS complexes it is the apolar contacts that dominate, while in forming NSS complexes polar contacts dominate; this follows from the knowledge that the heat capacity effect of dehydrating polar and charged groups is positive, in contrast to the dehydration of apolar groups, for which the heat capacity effect is negative.51,52 It is notable that ΔC_p^a correlates with the Gibbs energy of association and the enthalpy (see Table 3): as the Gibbs energy of binding decreases in magnitude on altering the DNA sequence away from the optimal, the enthalpy of binding becomes more positive and the heat capacity change decreases in magnitude. A distinguishing feature of the corrected enthalpies of DNA association for the NSS DBDs is that their magnitudes are significantly larger than for the SS and are positive over the whole temperature range considered (Table 3 and Figure 6(c) and (d)). Since a positive enthalpy acts against association, it follows that DNA binding of all the folded DBDs is

principally an entropy-driven process and this is especially true for the NSS DBDs. The entropy of DBD association with DNA can be determined by combining the enthalpy with the Gibbs energies obtained from the fluorimetrically determined binding constants, ΔS^a(T) = {ΔH^a(T) + RT ln[K^a(T)]}/T. Since at the lowest temperature for which the association was studied (5 °C = 278 K) the uncertainty in the refolding correction is minimal, the most reliable value of the entropy factor can be calculated at that temperature. The values of the entropy factors at 5 °C can then be extrapolated to higher temperatures using the heat capacity effects of association, ΔC_p^a (Table 3): ΔS^a(T) = ΔS^a(278) + ΔC_p^a ln(T/278). With these entropy functions and the corrected enthalpies, one can calculate the Gibbs energy functions of association of the fully folded DBDs with the DNA as ΔG^a(T) = ΔH^a(T) − TΔS^a(T). As noted previously, these Gibbs energy values are in good correspondence with the apparent Gibbs energy values determined directly from the association constants without any correction for protein refolding, showing that the enthalpy and entropy of protein refolding efficiently compensate each other.14 This is not surprising since the considered temperature region is close to the transition temperatures of the sub-domains in the free proteins, where the Gibbs energies of unfolding/refolding are close to zero.

Figure 7. Deconvolution of the partial molar heat capacity functions of four free DBDs under standard solvent conditions: 10 mM potassium phosphate (pH 6.0), 100 mM KCl, determined over the temperature range from −10 °C (i.e. super-cooled solution) to 75 °C. The broken line shows the baseline representing the expected partial molar heat capacity of folded protein, C_p^F, including a contribution from unfolded segments (see Materials and Methods).

Table 4. Thermodynamic parameters of temperature-induced transitions of the HMG box DBDs in 10 mM potassium phosphate (pH 6.0), 100 mM KCl

Transitions are labelled 1′, 1 and 2 in the column headings.

| DBD | Tt (1′) | ΔHt (1′) | ΔCp (1′) | ΔG(20 °C) (1′) | Tt (1) | ΔHt (1) | ΔCp (1) | ΔG(20 °C) (1) | Tt (2) | ΔHt (2) | ΔCp (2) | ΔG(20 °C) (2) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sry | 7.0 | 81 | 0.7 | −3.3 | 28.0 | 86 | 1.1 | −3.3 | 38.0 | 119 | 2.9 | 8.9 |
| Lef86 | 6.0 | 76 | 0.9 | −3.5 | 29.6 | 83 | 1.2 | −3.5 | 44.5 | 106 | 2.2 | 9.9 |
| Lef79 | 8.0 | 70 | 0.8 | −2.9 | 32.0 | 79 | 1.2 | −2.9 | 44.9 | 101 | 2.2 | 10.4 |
| Sox | 7.0 | 28 | 0.3 | −1.2 | 33.5 | 83 | 1.3 | −1.2 | 45.9 | 153 | 2.1 | 14.8 |
| NHP | – | – | – | – | 22.2 | 68 | 1.0 | −0.8 | 38.6 | 190 | 2.2 | 11.2 |
| Box-B′ | – | – | – | – | 38.0 | 92 | 1.8 | 5.0 | 45.9 | 199 | 2.5 | 13.2 |
| D100 (a) | – | – | – | – | 22.5 | 86 | 1.5 | 0.8 | 41.8 | 198 | 2.5 | 15.7 |
| D74 (a) | – | – | – | – | 18.0 | 69 | 1.1 | −0.5 | 41.6 | 188 | 2.5 | 15.4 |
| σ | ±0.2 | ±7.0 | | ±1.0 | ±0.2 | ±6.0 | ±0.2 | ±0.8 | ±0.2 | ±5 | ±0.2 | ±2.0 |

Units: Tt in °C; ΔHt in kJ mol⁻¹; ΔG in kJ mol⁻¹; ΔCp in kJ K⁻¹ mol⁻¹.
(a) Data from Dragan et al.14

The components of the association energy


A linear dependence between the logarithm of the binding constant of a protein with DNA and the logarithm of the KCl concentration (Figure 4(c) and (d)) is usually regarded as a manifestation of the electrostatic interactions in this process.53–55 Formation of ion pairs between the cationic amino acid residues of the protein and the DNA polyanion results in the release of counterions, the mixing of which with the ions in bulk solution produces a significant entropy increase.55 The observation that the enthalpy of binding is independent of the KCl concentration (Figure 6(a), inset) shows that the effects of salt addition are purely entropic. At relatively low concentrations of salt in aqueous solution, when the activity of water is less affected by the presence of salt, the entropy of water release upon protein binding is independent of the salt concentration and the entropy effect of counterion release is simply proportional to the number of released counterions,56 mostly from the DNA, since the low density of charges on protein surfaces does not result in a tight coat of counterions.57 Correspondingly, the logarithm of the association constant of protein with DNA can be presented as just two terms:

$$\log(K^a) = \log(K^a_{nel}) - Z\psi \log[\mathrm{KCl}] \qquad (2)$$

where Z is the number of DNA phosphates that interact with the protein and ψ is the number of cations (K⁺) per phosphate group released upon protein binding.53,54 The first term on the right hand side of the equation results from non-electrostatic (nel) interactions between DNA and protein and the second results from electrostatic effects associated with release of counterions; thus the slope of the plot corresponds to Zψ, i.e. is a measure of the number of ions released upon protein–DNA association. In the case of short DNA duplexes, ψ is about 0.64.58 From the slope of the log(K^a) function one can calculate Z, the number of phosphate groups which release their counterions upon protein binding to DNA, i.e. the number of ionic contacts between protein and DNA. These numbers, given in Table 1, are closer to the net charges of the DBDs than to their total positive charges: this indicates the presence of a substantial number of internal salt links on the surface of the DBDs, especially on helix 3, which is remote from the DNA.29,22–24 There is also good correspondence with the results of structure determinations, e.g. from the present measurements, Z for Sry binding is 13 and according to the structure of the Sry–DNASry complex,24 there are 12 ionic contacts of DNA phosphate groups with Arg and Lys.

Figure 8. Packing density analyses of the structures of several HMG box DBDs in free solution. All structures were determined by NMR. Among them only Sox represents the SS DBD class,26 while NHP,25 HMG1 Box-B,29 and D74,84 represent the NSS class. The red clusters are regions with packing density higher than 0.68 (see Materials and Methods).

When the salt concentration approaches 1 M, i.e. log[KCl] = 0, the electrostatic term in equation (2) vanishes and ΔG^a = −2.3RT log(K^a) approaches the non-electrostatic part of the Gibbs energy of complex formation, ΔG^a_nel.54 This permits splitting the observed Gibbs energies of binding into two components, the non-electrostatic and the electrostatic: ΔG^a = ΔG^a_nel + ΔG^a_el. Since the association enthalpies do not depend on the salt concentration (Figure 6(a), inset), it follows that the enthalpy of the electrostatic component of the protein–DNA interactions is close to zero.54,59 The measured electrostatic energy, ΔG^a_el, can therefore be assigned entirely to the entropy term of the Gibbs energy. The entropy factor of the non-electrostatic component, TΔS^a_nel, can then be derived by subtracting the non-electrostatic Gibbs energy (ΔG^a_nel) from the total

enthalpy of association, ΔH^a (Table 3). Figure 9 shows the contributions of the electrostatic and non-electrostatic components of the binding energies for the complexes of the DBDs with the various DNAs.

The electrostatic component of the binding energy

The electrostatic component of the binding energy dominates the non-electrostatic (Figure 9) for all the HMG boxes. Interestingly, the electrostatic component does not depend on the target DNA sequence for any of the DBDs, since it simply reflects the ionic and polar interactions with the DNA phosphate groups. In contrast, the non-electrostatic component depends on the DNA sequence for the SS DBDs but has little dependence on DNA sequence for the NSS DBDs. Thus, the specificity of binding of the SS DBDs is totally determined by this non-electrostatic component. The electrostatic component is significantly larger in the case of Lef86 than Lef79, whilst their non-electrostatic components are almost identical with the same DNA. This shows that the contribution of the basic tail of Lef86 is purely electrostatic. Furthermore, comparison of Lef86 with Lef79 in Table 3 shows that the basic tail in Lef86 contributes positively to the heat capacity effect of binding, similar to what was observed for the basic tail of D100.14 This is just what is expected from the dehydration of charged and polar groups.49,51,52 Interestingly, Lef86 forms Z = 10.6 ionic contacts with the phosphate groups of its cognate DNA and Lef79 forms Z = 5.3 contacts; thus the eight-residue tail in Lef86 makes 5.3 contacts with phosphate groups (Table 1). Since the electrostatic Gibbs energy of binding for Lef86 is −37 kJ mol⁻¹ and for Lef79 is −19.1 kJ mol⁻¹, the tail contributes −17.9 kJ mol⁻¹ to binding (Table 1). Dividing this energy value by

the number of contacts gives an estimate of the contribution of a single ionic contact in the tail of Lef86 of −3.4 kJ mol⁻¹. A similar value was found for the electrostatic contributions of the tail in D100:14 in that case Z = 3.8 for the tail and its electrostatic energy of binding is −16 kJ mol⁻¹, which gives −4.2 kJ mol⁻¹ for the contribution of a single electrostatic contact. These values are remarkably close to −4.0 kJ mol⁻¹, the value obtained by titrating DNA with pentalysine.60

The non-electrostatic component of the binding energy

Although the non-electrostatic component of the Gibbs binding energy is not the dominating factor in binding the DBDs to the charged DNA, it is of special interest because it is the factor responsible for the specificity of binding.61–66 The non-electrostatic binding energy itself consists of two components, the enthalpic and the entropic: ΔG^a_nel = ΔH^a − TΔS^a_nel. The values of all three parameters for the interaction of the DBDs with their cognate and non-cognate DNAs at 20 °C are given in Table 3; Figure 10 shows the non-electrostatic contributions in decreasing order of the Gibbs energies of binding (red bars). The greatest magnitude of the Gibbs energy for Sox and Lef86/79 is for the corresponding cognate DNA, although for Sry it is DNASox, which must be regarded as its cognate sequence.41,43,44 It is notable that the decrease in magnitude of the non-electrostatic Gibbs energy (ΔG^a_nel, in red; Figure 10) is associated with an increase in the positive enthalpy of binding (ΔH^a, in green) and in the negative non-electrostatic entropy factor (−TΔS^a_nel, in blue), i.e. an increase in the positive non-electrostatic entropy, ΔS^a_nel. The other notable fact is that the interactions of the NSS DBDs are characterized by a much larger positive enthalpy and entropy of binding than those of the SS DBDs. Furthermore, all three thermodynamic parameters correlate with the heat capacity effect of binding (Table 3). However, the thermodynamic signature of the NSS DBDs does not differ qualitatively from that of the SS DBDs with their least optimal DNA sequences, implying that reduced complementarity at the DNA–protein interface renders a SS HMG box similar to a NSS HMG box. These observations raise several questions. (a) Why are the non-electrostatic enthalpies and entropies of

Figure 9. The electrostatic (in blue) and non-electrostatic (in yellow) components of the total Gibbs free energy of binding the DBDs with various DNAs. The numbers above the bars indicate the induced DNA bend angles measured in standard buffer.
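The split into electrostatic and non-electrostatic components plotted in Figure 9 follows from the salt dependence of the association constant, log K^a = log K^a_nel − Zψ log[KCl]. The Python sketch below illustrates the procedure on a synthetic titration series; the function names and the input data are hypothetical, and only ψ = 0.64 and the 1 M KCl reference state are taken from the text.

```python
import math

R = 8.314e-3   # gas constant, kJ K^-1 mol^-1
PSI = 0.64     # cations released per phosphate for short duplexes (value quoted in the text)

def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def split_binding_energy(kcl_molar, log_ka, temp_k=293.15, salt_molar=0.1):
    """Return Z, dG_nel and dG_el (kJ/mol) from a log Ka versus log[KCl] series."""
    slope, intercept = fit_line([math.log10(c) for c in kcl_molar], log_ka)
    z = -slope / PSI                           # the slope equals -Z*psi
    rt_ln10 = math.log(10) * R * temp_k
    dg_nel = -rt_ln10 * intercept              # log Ka extrapolated to 1 M KCl
    dg_total = -rt_ln10 * (intercept + slope * math.log10(salt_molar))
    return z, dg_nel, dg_total - dg_nel

# Synthetic series generated from Z = 10 and log K_nel = 3.0 (not measured data)
kcl = [0.10, 0.15, 0.20, 0.30]
log_ka = [3.0 - 10 * PSI * math.log10(c) for c in kcl]
z, dg_nel, dg_el = split_binding_energy(kcl, log_ka)
print(f"Z = {z:.1f}, dG_nel = {dg_nel:.1f} kJ/mol, dG_el = {dg_el:.1f} kJ/mol")
```

Because the electrostatic enthalpy is taken as zero, the electrostatic entropy factor is simply TΔS^a_el = −ΔG^a_el at the chosen salt concentration.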


Figure 10. The enthalpic and entropic contributions to the non-electrostatic Gibbs energy of binding of the considered DBDs with the various DNAs at 20 °C. The non-electrostatic Gibbs energies of binding are shown by red bars, the enthalpies by green bars and the entropic factors by blue bars.

binding positive in all cases? (b) Why are they larger for the NSS DBDs? (c) Why do the enthalpies of binding correlate so well with the Gibbs energies for the SS DBDs and with the heat capacity effect of binding? One reason for the positive binding enthalpy might be that the association results in DNA bending, which requires work. This offers an explanation for the difference in enthalpies of binding proteins to the minor and major grooves of DNA: in contrast to HMG boxes, which interact with the minor groove and bend DNA, homeodomains interact with the major groove, do not bend DNA substantially, and the enthalpy of their binding is negative.67–70 However, on binding to cognate DNA, Sry induces a smaller bend than do Lef or Sox but its binding enthalpy is significantly more positive, so the contribution of bending to the total enthalpy cannot be dominant. The other source of positive enthalpy in complex formation is the dehydration of groups forming the interface, and it has been argued that this effect is particularly large for binding to the minor groove, where water is in a highly ordered state at AT-rich sequences.71 This suggestion is supported by the observation that

binding of the NSS Box-B′ to an AT duplex results in a larger positive enthalpy than does its binding to a GC duplex.34 Why then is the binding enthalpy more positive for the NSS DBDs than for the SS bound to cognate DNA? Removal of water of hydration requires work, but it is unlikely that dehydration of the interface formed by the NSS DBDs is more complete than for the SS, since a better complementarity of the two surfaces is expected for the SS DBDs. Indeed, the structures of the complexes formed by representatives of these two groups (Figure 11) differ significantly in their packing densities. The complexes of the SS DBDs have more densely packed interfaces with a greater number of van der Waals contacts and hydrogen bonds, which would provide negative enthalpy to counterbalance the effects of dehydration, thereby decreasing the overall positive enthalpy of binding. The negative sign of the heat capacity effect and its correlation with the enthalpy of binding show that SS binding results mainly in the dehydration of apolar groups, i.e. in the formation of van der Waals contacts at the interface of SS complexes. The binding enthalpies of


Figure 11. Packing densities at the interfaces of the SS complexes of Lef86,22 Sry,24 and of the NSS complexes of NHP6A,25 and D74.23 The red clusters are regions with packing density higher than 0.68 (see Materials and Methods). To help visualize the packing at the interface, only clusters that include groups of both protein and DNA are shown. Residues intercalating into the DNA are shown in pale blue. Although the Lef, Sry and NHP complex structures were obtained by NMR, and that of D74 by crystallography, the striking differences seen in the interfacial packing densities are too large to be explained solely by differences in the constraints applied during the structure determinations.

NSS DBDs are thus more positive because their poorer interfacial complementarity provides a smaller contribution from van der Waals and other contacts than is present in the SS complexes. The large positive binding entropies, when electrostatic effects are excluded, can derive only from the release of water on complex formation, i.e. dehydration of the groups at the interface, and this must be the dominant contribution to the binding entropy. A reason for the larger positive entropy upon forming NSS complexes could be a smaller loss of conformational freedom as a result of poor complementarity at the interface, i.e. a reduced negative conformational entropy contribution. This conclusion is supported by the fact that the NSS complexes appear less densely packed at the interface than the SS complexes (Figure 11). Another reason might derive from the

differences in intercalation of these two groups of DBDs: in the SS DBDs there is only one large intercalating wedge (Met13 in Lef and Sox and Ile13 in Sry). However, the complexes of D74 and NHP have an additional intercalating wedge: Val32 and Thr33 for D74 and Phe48 for NHP.23,32,72,25 Dehydration of large protruding hydrophobic wedges will yield a substantial positive entropy.

Energetics of the HMG box induced DNA bending

The numbers above the energy bars in Figure 9 give the DNA bend angles induced by binding the indicated DBDs (Table 2). For a given DBD, the electrostatic component of the Gibbs energy of binding is, as expected, independent of the DNA


Figure 12. (a) The dependence of the bend angle on the non-electrostatic Gibbs energy for the three SS DBDs with three different DNAs in the standard buffer at 20 °C. Points 1 represent DNASox; points 2 represent DNALef and points 3 represent DNASry. (b) The dependence of the bend angles of SS DBD complexes on the electrostatic Gibbs energy of binding. Bend angles measured from AFE values obtained in individual titrations of protein into DNA in 10 mM potassium phosphate (pH 6.0) at 20 °C at several concentrations of KCl. (c) The dependence of the bend angles of NSS DBD complexes on the electrostatic Gibbs energy of binding obtained as for SS DBD complexes in (b).

sequence. The non-electrostatic component is also independent of the DNA for a given DBD of the NSS type but for SS DBDs the non-electrostatic component varies considerably with the bound sequence and these variations correlate with the DNA bend angle in the complex: the bend angle decreases with reduction of the non-electrostatic Gibbs energy and increase of the positive binding enthalpy. For the different DNA sequences, the dependence of the bend angle on the non-electrostatic Gibbs energy of binding is specific for a given SS DBD (Figure 12(a)). The bend angle of any given DBD–DNA complex depends also on the electrostatic component of the Gibbs binding energy (Figure 12(b) and (c)). The decrease of bend angle with decrease of the electrostatic component of the binding energy is rather modest and similar for the SS DBDs: for Sox the bend angle drops by about 30° as the electrostatic interactions are lost; for Sry it drops by 26° and for Lef86 by 29°. Removal of the C-terminal basic extension from Lef86 (to Lef79) results in a similar decrease of the induced bend angle: by 29° (Table 2). It follows that for SS DBDs the observed decrease in the bend angle upon elimination of electrostatic interactions is solely due to the basic extensions. The similarity in the bending capability of the tails of Lef86 and Sry is unexpected, bearing in mind that whilst that of Lef86 passes through the major groove on the inside of the bent DNA,22 the tail of Sry continues along the minor groove.24 In marked contrast, the bend angle induced by all the NSS DBDs drops almost to zero on loss of the

electrostatic component. Removal of the basic extension from D100 (to D74) results in a decrease of bend angle by 27° (Table 2), which, while similar to the contribution from the tails of the SS DBDs, represents only a small part of the whole DNA bending effect associated with its electrostatic interactions. It follows therefore that bending by the globular domains of the NSS DBDs also depends on electrostatic interactions. The DNA bending by the basic extensions of the DBDs can best be explained by asymmetric charge neutralization.73–76 The present data provide a quantitative assessment of this effect on real protein–DNA complexes of known structures: the numbers of ionic contacts formed by the tails of Lef86 and D100 with the DNA phosphate groups (see Z values in Table 1) are 5.3 and 3.7, respectively. Dividing the DNA bend angle induced by the tails of Lef86 and D100 by the number of ionic contacts formed, we find that one such ionic contact induces a DNA bend of 5–7°, which, from our estimates of the energy of a single ionic contact, is associated with a Gibbs energy of about −3.4 kJ mol⁻¹. What then is the mechanism of bending caused by the globular parts of the two types of HMG boxes? In the case of the SS DBDs, loss of all electrostatic interactions seems to have the same effect as removal of the C-terminal tail, so the bending induced by the globular domains must come entirely from non-electrostatic interactions (Figure 10). These non-electrostatic interactions are substantial for the cognate DNA, as seen from the more favorable binding energies, while the less

favorable contacts with non-cognate DNA result in a more positive binding enthalpy and a reduced binding energy and bend angle. The situation for the NSS DBDs is quite different: although the electrostatically induced bending by the C-terminal tails is similar to that for the SS DBDs, the bending induced by the globular domains (which is substantial, 94° in the case of D74) is almost totally abolished in high salt, i.e. it is almost entirely proportional to the electrostatic component of the binding energy. The question is then why the DNA bending induced by the globular part of the DBDs depends directly on the binding energy: non-electrostatic in the case of SS DBDs and electrostatic in the case of NSS DBDs. Structural and mutagenic analyses of the considered complexes suggest that a significant part in DNA bending is played by the intercalation of specific side-chains between the DNA bases, resulting in defined roll angles.22–25,72 However, an important point, sometimes overlooked in considering intercalation, is that insertion of an amino acid side-chain between the DNA bases is a thermodynamically unfavorable process: isolated amino acid side-chains do not readily intercalate into DNA, in contrast to the typical intercalators, which are planar, polycyclic aromatic cations.77,78 Thus, the side-chain must be forced between the bases by the cumulative action of various protein–DNA interactions. We see that these interactions can be electrostatic or non-electrostatic, depending on the interface formed. As discussed above, the complexes of SS DBDs with their cognate DNAs are specified by a tighter interface, i.e. proceed with the formation of more extensive enthalpic contacts. In the case of NSS complexes, however, the lower flexibility of these DBDs does not permit formation of extensive enthalpic contacts and insertion of the intercalators into DNA is then driven mostly by long-range electrostatic interactions. The indifference of electrostatic (i.e. entropic) interactions to the DNA sequence explains the low sequence specificity of NSS DBDs for DNA binding and bending.
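The per-contact estimates quoted in the Discussion for the basic tails are simple quotients and can be checked with a few lines of Python. The sketch below uses only numbers stated in the text; the helper name is an illustrative assumption. Note that the text quotes 3.8 tail contacts for D100 in the energy estimate and 3.7 in the bending estimate, and each value is used here where it was quoted.

```python
def per_contact(total, n_contacts):
    """Average contribution of one ionic contact to a tail property."""
    return total / n_contacts

# Electrostatic Gibbs energy of the tails (kJ/mol)
lef_tail_dg = -37.0 - (-19.1)                        # Lef86 minus Lef79
lef_dg_per_contact = round(per_contact(lef_tail_dg, 5.3), 1)
d100_dg_per_contact = round(per_contact(-16.0, 3.8), 1)

# DNA bend angle attributed to the tails (degrees)
lef_bend_per_contact = round(per_contact(29.0, 5.3), 1)
d100_bend_per_contact = round(per_contact(27.0, 3.7), 1)

print(lef_dg_per_contact, d100_dg_per_contact)       # kJ/mol per contact
print(lef_bend_per_contact, d100_bend_per_contact)   # degrees per contact
```

The results, about −3.4 and −4.2 kJ mol⁻¹ and about 5.5° and 7.3° per contact, match the values and the 5–7° range given in the text.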

Conclusions

The structures of the free HMG box DBDs do not represent a single cooperative unit but are subdivided into three or two sub-domains, i.e. are rather loose and unstable, especially in the case of SS DBDs. However, interaction with DNA stabilizes and integrates these sub-domains with the DNA into a single cooperative complex and the more flexible SS DBDs form complexes with a tightly packed interface having extensive enthalpic van der Waals contacts, which determine the sequence specificity of DNA binding. In contrast, in the case of the more rigid NSS DBDs, an intimate complementarity is not achieved and the major driving force for binding comes from entropic interactions, which are indifferent to the DNA sequence. The non-electrostatic interactions of SS HMG boxes and


the electrostatic interactions of NSS HMG boxes drive the bending of DNA by forcing the intercalating wedges of their globular domains between DNA bases. For both types of HMG boxes, further bending is induced by the asymmetric neutralization of phosphate groups by their basic extensions.

Materials and Methods

Proteins and DNA duplexes

The proteins and the DNA duplexes used are presented in Figure 1. Preparation of Sox and D100/74 was described in previous papers.14,21 The two Lef DBDs were expressed from a human cDNA that codes for an amino acid sequence identical to that of mouse LEF-1 in this domain. The Sry protein was expressed from a human clone24 and Sox from a clone of mouse Sox-5.40 The second HMG box of rat HMGB1 (B′), comprising amino acid residues 84 to 184 and cloned into plasmid pT7-7,79 was expressed in Escherichia coli strain BL21(DE3)-gold. The expressed B′ domain was thus 101 residues long plus an N-terminal methionine. Expression and purification of B′ followed the protocol developed for Sry, with DTT added to the solutions.14 Electrospray mass spectrometry of the purified B′ domain yielded a product with a molecular mass of 11,364 Da. This molecular mass is consistent with the mass calculated from the B′ protein sequence but lacking the N-terminal methionine residue. Concentrations of B′ were determined using an extinction coefficient ε280 nm = 10,810 M⁻¹ cm⁻¹. The purification of NHP6A protein was performed as described.30 Electrospray mass spectrometry of the purified NHP6A gave a mass of 10,671 Da, i.e. very close to the calculated value. The purity of all proteins used in the experiments was checked by chromatography and mass spectrometry and found to be better than 98%. All studies were carried out in the standard buffer: 10 mM potassium phosphate (pH 6.0) in the presence of various salt concentrations from 100 to 300 mM KCl.

Spectropolarimetry

CD measurements were carried out using a Jasco-710 spectropolarimeter equipped with the Peltier temperature controller PTC-3481. Measurements of the ellipticity of the samples were done in 100 mM KCl, 10 mM potassium phosphate (pH 6.0) using a 1 mm Suprasil quartz cell.
Fluorescence measurements

Fluorescence measurements were carried out on a SPEX FluoroMax-3 spectrofluorimeter equipped with an accessory for steady-state anisotropy measurements, a thermostated cell holder with stirrer and a software-controlled water bath. A 0.4 cm pathlength quartz Suprasil cell was used. The experimental data were analyzed using the DataMax software (version 2.10) of the FluoroMax-3 instrument.

FRET, fluorescence anisotropy titration and intrinsic tryptophan fluorescence

The methodology of fluorescence resonance energy transfer (FRET) experiments, fluorescence anisotropy titration and intrinsic protein fluorescence changes for

389

HMG Box–DNA Complexes

monitoring binding to DNA, together with analysis of the data obtained, were described in detail.14 The only difference here was that both FRET and fluorescence anisotropy were studied using the same 16 bp duplexes, double-labeled with FAM and TAMRA (Figure 1). In FRET experiments, FAM and TAMRA were excited at 490 nm and 560 nm, respectively, whilst for fluorescence anisotropy experiments excitation was at 560 nm, the absorption maximum of TAMRA. Possible direct effects of protein binding on the fluorescence quantum yield of the fluorophores was checked in control experiments: proteins were titrated into solutions of 16 bp DNA duplexes singly-labeled with FAM and having the same sequences as the double-labeled DNAs, monitoring fluorescence excitation at 490 nm. To check the effect of protein binding on the fluorescence of TAMRA, proteins were titrated into solutions of the double-labeled DNAs, monitoring fluorescence excitation at 560 nm, where only TAMRA absorbs. In both circumstances the change in the fluorescence of the labels upon protein binding was small. FRET efficiency (E) for the double-labeled duplexes was determined from sensitization of the acceptor (TAMRA) fluorescence. This procedure normalizes the FRET signal for the quantum yield of TAMRA, for the concentration of the duplex molecule and for any error in the effectiveness of acceptor labeling.80 Fluorescence spectra of doublelabeled DNA, Fs(490), excited at 490 nm, were fitted using two standard spectra: of singly FAM-labeled DNA, Fd(490), (excited at 490 nm) and of TAMRA, obtained by excitation of the double-labeled DNA, Fa(560), at 560 nm: Fs ð490Þ Z A Fd ð490Þ C FE Fa ð560Þ

(3)

where FE (FRET effect) is the fitted weighting factor of the TAMRA-spectra (FEZFa(490)/Fa(560)). FE is linearly proportional to the FRET efficiency: FE Z E½3d ð490Þ=3a ð560Þ C 3a ð490Þ=3a ð560Þ

(4)
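As a rough numerical illustration of equation (4) and of the Förster relation used in this section, the sketch below converts a fitted FRET effect into an efficiency and a donor–acceptor distance. The function names are hypothetical; the default ratios (0.313 and 0.117) and the Förster radius (50 Å) are the values quoted in this section. It is not the analysis software used in the study.

```python
def fret_efficiency(fe, ratio_d_a=0.313, ratio_a_a=0.117):
    """Invert equation (4): FE = E * ratio_d_a + ratio_a_a.

    ratio_d_a is eps_d(490)/eps_a(560); ratio_a_a is eps_a(490)/eps_a(560).
    Both defaults are the values quoted in the text.
    """
    return (fe - ratio_a_a) / ratio_d_a


def donor_acceptor_distance(e, r0=50.0):
    """Distance (in angstroms) from E = 1 / (1 + (R/R0)**6)."""
    if not 0.0 < e < 1.0:
        raise ValueError("FRET efficiency must lie strictly between 0 and 1")
    return r0 * ((1.0 - e) / e) ** (1.0 / 6.0)


if __name__ == "__main__":
    # FE = 0.19 is the value reported for the free 16 bp duplexes.
    e = fret_efficiency(0.19)
    r = donor_acceptor_distance(e)
    print(f"E = {e:.3f}, R_da = {r:.1f} A")
```

For FE = 0.19 this gives an efficiency of about 0.23 and a distance of about 61 Å, in line with the value reported for the free duplexes.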

The ratio $\varepsilon_a(490)/\varepsilon_a(560) = 0.117$ was determined from the excitation spectrum of a 5′ TAMRA 16 bp DNA sample and the ratio of absorption coefficients $\varepsilon_d(490)/\varepsilon_a(560) = 0.313$ was determined from the absorbance spectrum of the double-labeled DNA (5′ FAM-16 bp DNA 5′ TAMRA). The FRET efficiency, E, depends on the sixth power of the separation between the donor (FAM) and acceptor (TAMRA), normalized to $R_0$, the characteristic Förster distance for 50% energy transfer efficiency: $E = [1 + (R_{da}/R_0)^6]^{-1}$. Accordingly, the distance between the donor–acceptor pair was calculated as $R_{da} = R_0\,[(1 - E)/E]^{1/6}$. For the FAM/TAMRA donor/acceptor pair, a Förster radius of $R_0 = 50$ Å was used. For all four 16 bp DNAs used in the experiments a FRET effect (FE) of 0.19 was observed, corresponding to an $R_{da}$ distance between fluorophores of 60.9 Å. For these free DNAs the bend angle was taken as zero and used as the third calibration point for transforming $R_{da}$ values into bend angles. A best-fit quadratic function was drawn between the three calibration points, allowing even the smaller bend angles to be derived by interpolation (Figure 5(a)).

Experiments showed that the FRET effect does not depend on the KCl concentration in the range from 0.1 M to 0.5 M. This is important, since it permits investigation of the dependence of DNA bending by the bound protein on the ionic strength of the solution, i.e. investigation of the effect of electrostatic interactions on DNA bending.

Differential scanning calorimetry

DSC was performed on a Nano-DSC calorimeter from Calorimetry Sciences Corporation (Utah, USA) with a cell volume of 0.328 ml. Details of the performance of this instrument and the experimental procedures are given elsewhere.81 Solutions for the calorimetric experiments were extensively dialyzed against solvent for 12 hours at 5 °C with three replacements of dialyzate, using protein concentrations of 1–3 mg/ml (~0.2 mM). The dialyzate was used as reference. The heat capacity was measured in both heating and cooling regimes of the scanning calorimeter and in all cases the results were very similar, showing that unfolding and refolding of the proteins and their complexes is highly reversible. The advantage of measuring in the cooling regime is that the heat capacity function can be determined down to 0 °C, or even lower by supercooling the aqueous solution. For measurements down to −10 °C, dust was removed from the solutions by filtering through a 0.22 μm Millipore filter, since dust can initiate freezing of a supercooled aqueous solution. Measurements at low temperatures were important for observing pre-denaturational changes in the proteins. Results were analyzed using the CpCALC program supplied with the Nano-DSC. As a standard representing fully folded native protein, $C_p^{F}$, the averaged heat capacity of bovine pancreatic trypsin inhibitor (BPTI) and hen egg white lysozyme was taken, both being compact, highly stable proteins of approximately the same size as the HMG boxes:49

$$C_p^{F}(T) = [1.27 + (0.0061 \times T)]\ \mathrm{J\,K^{-1}\,g^{-1}} \tag{5}$$
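Equation (5) can be evaluated directly when constructing a folded-state reference line. The following is a minimal sketch, assuming that T is expressed in °C (the text does not state the unit explicitly) and using a hypothetical helper to convert the specific heat capacity into a molar value for a protein of known mass.

```python
def folded_heat_capacity(t_celsius):
    """Specific heat capacity of the folded-state standard from equation (5).

    Returns J K^-1 g^-1. Treating T as degrees Celsius is an assumption.
    """
    return 1.27 + 0.0061 * t_celsius


def molar_folded_heat_capacity(t_celsius, molar_mass_da):
    """Convert the specific value to kJ K^-1 mol^-1 for a given molar mass (Da)."""
    return folded_heat_capacity(t_celsius) * molar_mass_da / 1000.0


if __name__ == "__main__":
    # 11,364 Da is the mass reported in the text for the purified B' domain.
    for t in (0.0, 25.0, 50.0):
        cp = molar_folded_heat_capacity(t, 11364.0)
        print(f"T = {t:5.1f} C  Cp_F = {cp:.2f} kJ/(K mol)")
```

The measured partial heat capacity can then be compared with this line to judge how far a protein departs from the fully folded standard at a given temperature.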

As a baseline for deconvolution analysis, the heat capacity of fully folded protein was increased to allow for the presence of unfolded segments of the protein chain, which for Lef86 amount to 22% and for SRY to 17% of the total chain length (see Figure 1). The additional contribution of the unfolded segments to the partial heat capacity is of the order of 7%, so uncertainties in the length of these unfolded segments introduce little error into the determination of $C_p^{N}$ and the resulting deconvolution of the heat capacity function. The partial heat capacity of fully unfolded proteins, $C_p^{U}$, was determined by summing the known tabulated heat capacities of individual amino acid residues.49

Isothermal titration calorimetry

ITC was performed on a Nano-ITC series III from Calorimetry Sciences Corporation (CSC, Utah, USA) with a cell volume of 1.25 ml. DNA solutions were placed in the cell and concentrated protein solutions in the syringe. The protein solution was titrated in 5 μl increments at 200 second intervals into the DNA solution, using DNA concentrations of ~0.15 mg/ml (~16 μM) in the cell and protein at ~2 mg/ml (~200 μM) in the syringe. Samples of DNA and protein were prepared with the same batch of buffer in order to minimize artifacts due to minor differences in buffer composition. In separate experiments the heats of dilution of the protein into the solvent were measured and corrections made. The results of titration experiments were analyzed using the Bindwork program supplied with the instrument. Linear regression analysis of the experimental data presented in Figure 6(a) showed that the error in determination of the enthalpy of DBD–DNA association is ±3.0 kJ mol⁻¹.

Correction of the ITC-measured enthalpies of association

The correction of the ITC-measured heat effect of association corresponds to the area included between the heat capacity function for the complex and the summed heat capacity functions of the protein and DNA components. Assuming that at some temperature, $T_0$, the protein is in its fully folded state, i.e. the protein correction at this temperature is zero, for temperature T the correction will be expressed by the equation:21

$$\Delta H(T)^{a} = \int_{T_0}^{T} \left\{ [C_p(T) - C_p(T_0)]^{\mathrm{compl}} - [C_p(T) - C_p(T_0)]^{\mathrm{pr}} - [C_p(T) - C_p(T_0)]^{\mathrm{DNA}} \right\} dT + \Delta C_p(T_0)^{a}\,(T - T_0) \tag{6}$$

where $\Delta C_p(T_0)^{a} = C_p(T_0)^{\mathrm{compl}} - C_p(T_0)^{\mathrm{pr}} - C_p(T_0)^{\mathrm{DNA}}$. When the length of the DNA duplex exceeds the protein-binding site and protein binding does not affect the fraying of the DNA ends, the sum of the heat capacity of DNA and the fully folded protein, $C_p^{F}$, changes with temperature in parallel with the heat capacity of the complex. In that case one can exclude from consideration the heat capacity functions of the DNA and the complex. This situation holds true for all of the 16 bp DNA–DBD complexes: the heat capacity of the complex differs from the heat capacity of the DNA by the heat capacity of the folded protein, $C_p^{F}$, plus the temperature-independent heat capacity effect of binding (see Figure 3). The corrections for protein refolding were therefore determined simply by integration of the excess heat effect of protein unfolding shown in Figure 7.
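The final integration step can be sketched numerically. The example below is only an illustration: it assumes that the excess heat capacity of the free protein (measured heat capacity minus the folded-state baseline) is available on an ascending temperature grid, and it uses a hypothetical Gaussian-shaped profile in place of real DSC data. The function name, the profile and its parameters are assumptions and are not taken from the CpCALC or Bindwork programs.

```python
import math


def refolding_correction(temps, excess_cp, t0, t):
    """Trapezoid-rule integral of excess heat capacity between t0 and t.

    temps: ascending temperatures; excess_cp: excess heat capacity at each
    temperature (e.g. kJ K^-1 mol^-1). Only grid intervals lying fully
    inside [t0, t] are included. The result has units of excess_cp times
    temperature (e.g. kJ mol^-1).
    """
    total = 0.0
    for i in range(1, len(temps)):
        lo, hi = temps[i - 1], temps[i]
        if lo >= t0 and hi <= t:
            total += 0.5 * (excess_cp[i - 1] + excess_cp[i]) * (hi - lo)
    return total


if __name__ == "__main__":
    # Hypothetical excess heat capacity peak centred at 45 C (not real data).
    temps = [i / 10 for i in range(801)]  # 0.0 ... 80.0 C in 0.1 C steps
    excess = [2.0 * math.exp(-(((x - 45.0) / 8.0) ** 2)) for x in temps]
    for t in (20.0, 37.0, 80.0):
        dh = refolding_correction(temps, excess, 0.0, t)
        print(f"T = {t:4.1f} C  integral from T0 = 0 C: {dh:.2f} kJ/mol")
```

Choosing $T_0$ where the protein is fully folded makes the integral vanish at the lower limit, which matches the assumption stated before equation (6).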


Packing density determination

The program MOLE (from Molecular Graphics and Computation, Applied Thermodynamics, LLC) was used to determine packing densities in the interior of the protein–DNA complexes. The calculated packing density is the ratio of the volume occupied by the van der Waals envelope of a group of atoms to the volume of space that they actually occupy and is thus a dimensionless quantity.82 For details of the program determining the packing densities in proteins see Privalov.83


Acknowledgements

The financial support of NIH GM48036-12 (to P.L.P.) and NIH GM59456 (to M.E.A.C.) is acknowledged. C.C.-R. and C.M.R. acknowledge the support of the Wellcome Trust, UK. We also acknowledge the kind gifts of the human SRY HMG box clone from the Marius Clore laboratory (NIDDK, NIH, Bethesda, MD), the clone of NHP6A from Reid Johnson (UCLA School of Medicine, Los Angeles, CA) and the rat HMGB1-B′ clone from Jean Thomas (Department of Biochemistry, University of Cambridge, UK).

References

1. von Hippel, P. H. & Berg, O. G. (1989). Facilitated target location in biological systems. J. Biol. Chem. 264, 675–678.
2. Li, Z., Van Calcar, S., Qu, C., Cavenee, W. K., Zhang, M. Q. & Ren, B. (2003). A global transcriptional regulatory role for c-Myc in Burkitt’s lymphoma cells. Proc. Natl Acad. Sci. USA, 100, 8164–8169.
3. Wright, P. E. & Dyson, H. J. (1999). Intrinsically unstructured proteins: re-assessing the protein structure–function paradigm. J. Mol. Biol. 293, 321–331.
4. Vinson, C. R., Sigler, P. B. & McKnight, S. L. (1989). Scissors-grip model for DNA recognition by a family of leucine zipper proteins. Science, 246, 911–916.
5. Huth, J. R., Bewley, C. A., Nissen, M. S., Evans, J. N., Reeves, R., Gronenborn, A. M. & Clore, G. M. (1997). The solution structure of an HMG-I(Y)–DNA complex defines a new architectural minor groove binding motif. Nature Struct. Biol. 8, 657–665.
6. Kim, J. L., Nikolov, D. B. & Burley, S. K. (1993). Co-crystal structure of TBP recognizing the minor groove of a TATA element. Nature, 365, 520–527.
7. Kim, Y., Geiger, J. H., Hahn, S. & Sigler, P. B. (1993). Crystal structure of a yeast TBP/TATA-box complex. Nature, 365, 512–520.
8. Rice, P. A., Yang, S., Mizuuchi, K. & Nash, H. A. (1996). Crystal structure of an IHF-DNA complex: a protein-induced DNA U-turn. Cell, 87, 1295–1306.
9. Werner, M. H., Huth, J. R., Gronenborn, A. M. & Clore, G. M. (1995). Molecular basis of human 46X,Y sex reversal revealed from the three-dimensional solution structure of the human SRY–DNA complex. Cell, 81, 705–714.
10. Grosschedl, R., Giese, K. & Pagel, J. (1994). HMG domain proteins: architectural elements in the assembly of nucleoprotein structures. Trends Genet. 10, 94–100.
11. Bustin, M. & Reeves, R. (1996). High-mobility-group chromosomal proteins. Architectural components that facilitate chromatin function. Prog. Nucl. Acid Res. Mol. Biol. 54, 35–100.
12. Giese, K., Amsterdam, A. & Grosschedl, R. (1991). DNA-binding properties of the HMG domain of the lymphoid-specific transcriptional regulator LEF-1. Genes Dev. 5, 2567–2578.
13. Nasrin, N., Buggs, C., Kong, X. F., Carnazza, J., Goebl, M. & Alexander-Bridges, M. (1991). DNA-binding properties of the product of the testis-determining gene and a related protein. Nature, 354, 317–320.
14. Dragan, A. I., Klass, J., Read, C., Churchill, M. E., Crane-Robinson, C. & Privalov, P. L. (2003). DNA binding of a non-sequence-specific HMG-D protein is entropy driven with a substantial non-electrostatic contribution. J. Mol. Biol. 331, 795–813.
15. Giese, K., Cox, J. & Grosschedl, R. (1992). The HMG domain of lymphoid enhancer factor 1 bends DNA and facilitates assembly of functional nucleoprotein structures. Cell, 69, 185–195.
16. Ferrari, S., Harley, V. R., Pontiggia, A., Goodfellow, P. N., Lovell-Badge, R. & Bianchi, M. E. (1992). SRY, like HMG1, recognizes sharp angles in DNA. EMBO J. 11, 4497–4506.
17. Paull, T. T., Haykinson, M. J. & Johnson, R. C. (1993). The non-specific DNA-binding and bending proteins HMG1 and HMG2 promote the assembly of complex nucleoprotein structures. Genes Dev. 7, 1521–1534.
18. Onate, S. A., Prendergast, P., Wagner, J. P., Nissen, M., Reeves, R., Pettijohn, D. E. & Edwards, D. P. (1994). The DNA-bending protein HMG-1 enhances progesterone receptor binding to its target DNA sequences. Mol. Cell. Biol. 14, 3376–3391.
19. Dailey, L. & Basilico, C. (2001). Co-evolution of HMG domains and homeodomains and the generation of transcriptional regulation by Sox/POU complexes. J. Cell. Physiol. 186, 315–328.
20. Murphy, F. V., IV & Churchill, M. E. (2000). Non-sequence-specific DNA recognition: a structural perspective. Struct. Fold Des. 8, R83–R89.
21. Privalov, P. L., Jelesarov, I., Read, C. M., Dragan, A. I. & Crane-Robinson, C. (1999). The energetics of HMG box interactions with DNA. Thermodynamics of the DNA binding of the HMG box from mouse Sox-5. J. Mol. Biol. 294, 997–1013.
22. Love, J. J., Li, X., Case, D. A., Giese, K., Grosschedl, R. & Wright, P. E. (1995). Structural basis for DNA bending by the architectural transcription factor LEF-1. Nature, 376, 791–795.
23. Murphy, F. V., IV, Sweet, R. M. & Churchill, M. E. A. (1999). The structure of a chromosomal high mobility group protein–DNA complex reveals sequence-neutral mechanisms important for non-sequence-specific DNA recognition. EMBO J. 18, 6610–6618.
24. Murphy, E. C., Zhurkin, V. B., Louis, J. M., Cornilescu, G. & Clore, G. M. (2001). Structural basis for SRY-dependent 46-X,Y sex reversal: modulation of DNA bending by a naturally occurring point mutation. J. Mol. Biol. 312, 481–499.
25. Masse, J. E., Wong, B., Yen, Y.-M., Allain, F. H. T., Johnson, R. C. & Feigon, J. (2002). The S. cerevisiae architectural HMGB protein NHP6A complexed with DNA; DNA and protein conformation changes upon binding. J. Mol. Biol. 323, 263–294.
26. Cary, P. D., Read, C. M., Davis, B., Driscoll, P. C. & Crane-Robinson, C. (2001). Solution structure and backbone dynamics of the DNA-binding domain of mouse Sox-5. Protein Sci. 10, 83–98.
27. van Houte, L. P., Chuprina, V. P., van der Wetering, M., Boelens, R., Kaptein, R. & Clevers, H. (1995). Solution structure of the sequence-specific HMG box of the lymphocyte transcriptional activator Sox-4. J. Biol. Chem. 270, 30516–30524.
28. Read, C. M., Cary, P. D., Crane-Robinson, C., Driscoll, P. C. & Norman, C. M. (1993). Solution structure of a DNA-binding domain from HMG1. Nucl. Acids Res. 21, 3427–3436.
29. Weir, H. M., Kraulis, P. J., Hill, C. S., Raine, A. R., Laue, E. D. & Thomas, J. O. (1993). Structure of the HMG box motif in the B-domain of HMG1. EMBO J. 12, 1311–1319.
30. Yen, Y.-M., Wong, B. & Johnson, R. C. (1998). Determinants of DNA binding and bending by the Saccharomyces cerevisiae high mobility group protein NHP6A that are important for its biological activities. J. Biol. Chem. 273, 4424–4435.
31. Allain, F. H.-T., Yen, Y.-M., Masse, J. E., Schultze, P., Dieckmann, T., Johnson, R. C. & Feigon, J. (1999). Solution structure of the HMG protein NHP6A and its interaction with DNA reveals the structural determinants for non-sequence-specific binding. EMBO J. 18, 2563–2579.
32. Churchill, M. E. A., Changela, A., Dow, L. K. & Kreig, A. J. (1999). Interactions of high mobility group proteins with DNA and chromatin. Methods Enzymol. 304, 99–133.
33. Dow, L. K., Jones, D. N. M., Wolfe, S. A., Verdine, G. L. & Churchill, M. E. A. (2000). Structural studies of the high mobility group globular domain and basic tail of HMG-D bound to disulfide cross-linked DNA. Biochemistry, 39, 9725–9736.

34. Muller, S., Bianchi, M. E. & Knapp, S. (2001). Thermodynamics of HMGB1 interaction with duplex DNA. Biochemistry, 40, 10254–10261.
35. Agresti, A. & Bianchi, M. E. (2003). HMGB proteins and gene expression. Curr. Opin. Genet. Dev. 13, 170–176.
36. Travis, A., Amsterdam, A., Belenger, C. & Grosschedl, R. (1991). LEF-1, a gene encoding a lymphoid-specific protein with an HMG domain, regulates T-cell receptor alpha enhancer function. Genes Dev. 5, 880–894.
37. Gubbay, J., Collignon, J., Koopman, P., Capel, B., Economou, A., Munsterberg, A. et al. (1990). A gene mapping to the sex-determining region of the mouse Y chromosome is a member of a novel family of embryonically expressed genes. Nature, 346, 245–250.
38. Connor, F., Cary, P. D., Read, C. M., Preston, N. S., Driscoll, P. C., Denny, P. et al. (1994). DNA binding and bending properties of the post-meiotically expressed Sry-related protein Sox-5. Nucl. Acids Res. 22, 3339–3346.
39. Read, C. M., Cary, P. D., Crane-Robinson, C., Driscoll, P. C., Carillo, M. O. M. & Norman, D. G. (1995). The structure of the HMG box and its interaction with DNA. In Nucleic Acids and Molecular Biology (Eckstein, F. & Lilley, D. M. J., eds), vol. 9, pp. 222–249, Springer, Berlin.
40. Crane-Robinson, C., Read, C. M., Cary, P. D., Driscoll, P. C., Dragan, A. I. & Privalov, P. L. (1998). The energetics of HMG box interactions with DNA. Thermodynamic description of the box from mouse Sox-5. J. Mol. Biol. 281, 705–717.
41. Haqq, C. M., King, C., Donahoe, P. K. & Weiss, M. A. (1993). SRY recognizes conserved DNA sites in sex-specific promoters. Proc. Natl Acad. Sci. USA, 90, 1097–1101.
42. Privalov, P. L., Tiktopulo, E. I., Venyaminov, S. Y., Griko, Y. V., Makhatadze, G. I. & Khechinashvili, N. N. (1989). Heat capacity and conformation of proteins in the denatured state. J. Mol. Biol. 205, 737–750.
43. King, C. & Weiss, M. A. (1993). The SRY high-mobility-group box recognizes DNA by partial intercalation in the minor groove: a topological mechanism of sequence specificity. Proc. Natl Acad. Sci. USA, 90, 11990–11994.
44. Giese, K. D., Pagel, J. & Grosschedl, R. (1994). Distinct DNA-binding properties of the high mobility group domain of murine and human SRY sex-determining factors. Proc. Natl Acad. Sci. USA, 91, 3368–3372.
45. Lnenicek-Allen, M., Read, C. M. & Crane-Robinson, C. (1996). The DNA bend angle and binding affinity of an HMG box increased by the presence of short terminal arms. Nucl. Acids Res. 24, 1047–1051.
46. Lorenz, M., Hillisch, A., Payet, D., Buttinelli, M., Travers, A. & Diekmann, S. (1999). DNA bending induced by high mobility group proteins studied by fluorescence resonance energy transfer. Biochemistry, 38, 12150–12158.
47. Tang, L., Li, J. L., Katz, D. S. & Feng, J. (2000). Determining the DNA angle induced by non-specific high mobility group-1 (HMG-1) proteins: a novel method. Biochemistry, 39, 3052–3060.
48. Privalov, P. L. & Makhatadze, G. I. (1990). Heat capacity of proteins. 2. Partial molar heat capacity of the unfolded polypeptide chain of proteins: protein unfolding effect. J. Mol. Biol. 213, 385–391.
49. Makhatadze, G. I. & Privalov, P. L. (1995). Energetics of protein structure. Advan. Protein Chem. 47, 307–425.

50. Privalov, P. L. (1982). Stability of proteins. Proteins which do not present a single cooperative system. Advan. Protein Chem. 35, 1–104.
51. Privalov, P. L. & Makhatadze, G. I. (1992). Contribution of hydration and non-covalent interactions to the heat capacity effect on protein unfolding. J. Mol. Biol. 224, 715–723.
52. Spolar, R. S., Livingstone, J. R. & Record, M. T., Jr (1992). Use of liquid hydrocarbons and amide transfer data to estimate contributions to thermodynamic functions of protein folding from the removal of nonpolar and polar surface from water. Biochemistry, 31, 3947–3955.
53. Record, M. T., Jr, Anderson, C. F. & Lohman, T. M. (1978). Thermodynamic analysis of ion effects on the binding and conformational equilibria of proteins and nucleic acids: the roles of ion association or release, screening, and ion effects on water activity. Quart. Rev. Biophys. 11, 103–178.
54. Record, M. T., Jr, Zhang, W. & Anderson, C. F. (1998). Analysis of effects of salts and uncharged solutes on proteins and nucleic acid equilibria and processes: a practical guide to recognizing and interpreting polyelectrolyte effects, Hofmeister effects, and osmotic effects of salts. Advan. Protein Chem. 51, 281–353.
55. Manning, G. S. (1978). The molecular theory of polyelectrolyte solutions with applications to the electrostatic properties of polynucleotides. Quart. Rev. Biophys. 11, 179–246.
56. Ha, J. H., Capp, M. W., Hohenwalter, M. D., Baskerville, M. & Record, M. T., Jr (1992). Thermodynamic stoichiometries of participation of water, cations and anions in specific and non-specific binding of lac repressor to DNA. Possible thermodynamic origins of the glutamate effect on protein–DNA interactions. J. Mol. Biol. 228, 252–264.
57. Manning, G. S. (2003). Is a small number of charge neutralizations sufficient to bend nucleosome core DNA onto its superhelical ramp? J. Am. Chem. Soc. 125, 15087–15092.
58. Olmsted, M. C., Bond, J. P., Anderson, C. F. & Record, M. T., Jr (1995). Grand canonical Monte Carlo molecular and thermodynamic predictions of ion effects on binding of an oligocation (L8C) to the center of DNA oligomers. Biophys. J. 68, 634–647.
59. Record, M. T., Jr, Ha, J. H. & Fisher, M. A. (1991). Analysis of equilibrium and kinetic measurements to determine thermodynamic origins of stability and specificity and mechanism of formation of site-specific complexes between proteins and helical DNA. Methods Enzymol. 208, 291–343.
60. Lohman, T. M., deHaseth, P. L. & Record, M. T., Jr (1980). Pentalysine-deoxyribonucleic acid interactions: a model for the general effects of ion concentrations on the interactions of proteins with nucleic acids. Biochemistry, 19, 3522–3530.
61. Revzin, A. & von Hippel, P. H. (1977). Direct measurement of association constants for the binding of Escherichia coli lac repressor to non-operator DNA. Biochemistry, 16, 4769–4776.
62. DeHaseth, P. L., Lohman, T. M. & Record, M. T., Jr (1977). Non-specific interaction of lac repressor with DNA: an association reaction driven by counterion release. Biochemistry, 16, 4783–4790.
63. Boschelli, F. (1982). Lambda phage cro repressor. Nonspecific DNA binding. J. Mol. Biol. 162, 267–287.
64. Matthew, J. B. & Ohlendorf, D. H. (1985). Electrostatic deformation of DNA by a DNA-binding protein. J. Biol. Chem. 260, 5860–5862.

65. Takeda, Y., Ross, P. D. & Mudd, C. P. (1992). Thermodynamics of Cro–DNA interactions. Proc. Natl Acad. Sci. USA, 89, 8180–8184.
66. Ha, J. H., Spolar, R. S. & Record, M. T., Jr (1989). Role of the hydrophobic effect in stability of site-specific protein–DNA complexes. J. Mol. Biol. 209, 801–816.
67. Ladbury, J. E., Wright, J. G., Sturtevant, J. M. & Sigler, P. B. (1994). A thermodynamic study of the trp repressor–operator interaction. J. Mol. Biol. 238, 669–681.
68. Lundback, T. & Hard, T. (1996). Sequence-specific DNA-binding dominated by dehydration. Proc. Natl Acad. Sci. USA, 93, 4754–4759.
69. Carra, J. H. & Privalov, P. L. (1997). Energetics of folding and DNA binding of the MAT alpha 2 homeodomain. Biochemistry, 36, 526–535.
70. Gonzalez, M., Weiter, S., Ferretti, J. A. & Ginsburg, A. (2001). The vnd/NK-2 homeodomain: thermodynamics of reversible unfolding and DNA binding for wild-type and with residue replacements H52R and H52R/T56W in helix III. Biochemistry, 40, 4923–4931.
71. Dragan, A. I., Liggins, J. R., Crane-Robinson, C. & Privalov, P. L. (2003). The energetics of specific binding of AT-hooks from HMGA1 to target DNA. J. Mol. Biol. 327, 393–411.
72. Klass, J., Murphy, F. V., IV, Fouts, S., Serenil, M., Changela, A., Siple, J. & Churchill, M. E. (2003). The role of intercalating residues in chromosomal high-mobility-group protein DNA binding, bending and specificity. Nucl. Acids Res. 31, 2852–2864.
73. Mirzabekov, A. D. & Rich, A. (1979). Asymmetric lateral distribution of unshielded phosphate groups in nucleosomal DNA and its role in DNA bending. Proc. Natl Acad. Sci. USA, 76, 1118–1121.
74. Manning, G., Ebralidse, K. K., Mirzabekov, A. D. & Rich, A. (1989). An estimate of the extent of folding of nucleosomal DNA by laterally asymmetric neutralization of phosphate groups. J. Biomolec. Struct. Dynam. 6, 877–889.
75. Strauss, J. K. & Maher, L. J., III (1994). DNA bending by asymmetric phosphate neutralization. Science, 266, 1829–1834.
76. Kosikov, K. M., Gorin, A. A., Lu, X.-J., Olson, W. K. & Manning, G. S. (2002). Binding of DNA by asymmetric charge neutralization: all-atom energy simulations. J. Am. Chem. Soc. 124, 4838–4847.
77. Blackburn, G. M. & Gait, J. M., eds (1996). Nucleic Acids in Chemistry and Biology, Oxford University Press, Oxford.
78. Chaires, J. B. (1997). Energetics of drug–DNA interactions. Biopolymers, 44, 201–215.
79. Teo, S.-H., Grasser, K. D. & Thomas, J. O. (1995). Differences in the DNA-binding properties of the HMG-box domains of HMG1 and the sex-determining factor SRY. Eur. J. Biochem. 230, 943–950.
80. Clegg, R. M. (1992). Fluorescence resonance energy transfer and nucleic acids. Methods Enzymol. 211, 353–388.
81. Privalov, G. P., Kavina, V., Freire, E. & Privalov, P. L. (1995). Precise scanning calorimeter for studying thermal properties of biological macromolecules in dilute solutions. Anal. Biochem. 232, 79–85.
82. Richards, F. M. (1977). Areas, volumes, packing and protein structure. Annu. Rev. Biophys. Bioeng. 6, 151–176.
83. Privalov, G. P. (1995). Packing of protein interiors: structure and distribution of packing density. PhD dissertation, Johns Hopkins University, Baltimore, USA.
84. Jones, D. N., Searles, M. A., Shaw, G. L., Churchill, M. E., Ner, S. S., Keeler, J. et al. (1994). The solution structure and dynamics of the DNA-binding domain of HMG-D from Drosophila melanogaster. Structure, 2, 609–627.

Edited by P. Wright (Received 16 April 2004; received in revised form 23 June 2004; accepted 6 August 2004)