The stability of helical polynucleotides: Base contributions


J. Mol. Biol. (1962) 4, 500-517

HOWARD DEVOE† AND IGNACIO TINOCO, JR.

Department of Chemistry, University of California, Berkeley, California, U.S.A.

(Received 13 December 1961, and in revised form 28 February 1962)

Estimates have been made of the main contributions of the heterocyclic bases of DNA to the free energies of the two configurations in solution, the two-stranded helix and the single-strand random coil. The magnitude and directions of the dipole moments of the adenine, thymine, guanine and cytosine base groups have been calculated by a semi-empirical molecular orbital treatment. Dipole-dipole, dipole-induced-dipole and London force interactions among the bases in the helix are large, and make the free energy of the helix depend on the base composition and sequence. The helix stability (helix negative free energy) is proportional to the guanine + cytosine content. A comparison of calculated base-pair interactions with the nearest-neighbor base frequency data of Josse, Kaiser & Kornberg (1961) suggests that the base sequence in natural DNA may be influenced by the free energy. The free energy of the coil is difficult to estimate; it tends to be more negative from base-solvent interactions for a greater guanine-cytosine content. The choice of solvent determines the magnitude of these interactions and thus determines how the melting temperature of the helix will depend on the base composition. In a solvent such as water, in which significant base-base interactions exist in the coil, the coil free energy depends on the base sequence and London forces may cause stacking of the bases into ordered arrays. The hydrogen bond energy of the bases and the strain energy in the helix probably contribute little to the enthalpy change of the helix-coil transition, although they ensure specific base-pairing in the helix. The net contribution of configurational and solvent entropy changes to the free energy change of the helix-coil transition is also probably small.

† Present address: Section on Physical Chemistry, National Institute of Mental Health, Bethesda 14, Maryland, U.S.A.

1. Introduction

Knowledge of the forces which govern the configurations of polynucleotides is necessary to our understanding and prediction of the properties of nucleic acids in solution. It would be very difficult to assess experimentally the various contributions to the stability in solution of a double-stranded helical polynucleotide relative to the two constituent single strands in a random-coil configuration. We have therefore attempted to do this theoretically for the main contributions of the purine and pyrimidine heterocyclic bases. Figure 1 shows how in imagination the separated nucleotide residues are brought together in a vacuum to form either a helix or two coils, which are then introduced into a solvent. The free energy changes of the individual steps can be added to give the free energies of the solvated helix and coils, and the free energy change of the helix-coil transition. We omit free energy changes for forming covalent bonds between residues, as these are the same for helix and coils.

[Figure 1: thermodynamic cycle. Separated residues in vacuum are taken by a formation step (ΔF) either to the helix in vacuum or to the coils in vacuum; each of these is taken by a solvation step (ΔF) to the helix in solvent or the coils in solvent. The helix-coil transition (ΔF) connects helix and coils both in vacuum and in solvent.]

FIG. 1. A schematic diagram showing how the free energy change of a helix-coil transition can be calculated from a series of simple steps.

The molar free energy of formation of helices from coils in solution, ΔF, which will depend on the concentrations, determines the standard free energy of formation ΔF° (which governs the equilibrium concentrations of the two configurations):

$$\Delta F^{\circ} = \Delta F - RT \ln \frac{(c\gamma)_{\text{helix}}}{(c\gamma)^{2}_{\text{coil}}} \tag{1}$$

where c is the concentration in units appropriate to the chosen standard state and γ is the activity coefficient. The free energy change for any step can be written:

$$\Delta F = \Delta F_{ES} + \Delta F_{L} + \Delta H_{HB} + \Delta H_{S} - T\,\Delta S \tag{2}$$
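The bookkeeping of Fig. 1 and equation (1) can be sketched numerically. The following is a minimal illustration, not part of the original paper: the function names and the step values are hypothetical, and the activities are passed in directly as the products cγ.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mole K)


def helix_formation_in_solvent(helix_formation_vac, helix_solvation,
                               coil_formation_vac, coil_solvation):
    """Free energy of forming the helix from the coils in solvent.

    Each argument is the free energy (kcal/mole) of one step of the cycle
    in Fig. 1, starting from the separated residues in a vacuum.
    """
    f_helix = helix_formation_vac + helix_solvation
    f_coils = coil_formation_vac + coil_solvation
    return f_helix - f_coils


def standard_free_energy(delta_f, a_helix, a_coil, temperature=298.15):
    """Equation (1): dF0 = dF - RT ln[(c*gamma)_helix / (c*gamma)_coil^2]."""
    return delta_f - R_KCAL * temperature * math.log(a_helix / a_coil ** 2)


# Hypothetical step values, chosen only to exercise the arithmetic.
delta_f = helix_formation_in_solvent(-10.0, -2.0, -3.0, -8.0)
print(delta_f)
print(standard_free_energy(delta_f, a_helix=1e-4, a_coil=1e-2))
```

Because two coils combine into one helix, the coil activity enters squared; with the hypothetical activities above the logarithm vanishes and the standard value coincides with ΔF.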


In equation (2), the terms are:

$F_{ES}$ = electrostatic free energy from interactions among charges, permanent dipoles and induced dipoles.
$F_{L}$ = London force energy from fluctuation dipole-induced-dipole interactions.
$H_{HB}$ = hydrogen bond enthalpy.
$H_{S}$ = short-range strain energy.
$\Delta S$ = entropy change, including solvent entropy changes and configurational entropy changes.

We can split up the electrostatic free energy as follows:

$$F_{ES} = F_{\rho\rho} + F_{\rho\mu} + F_{\rho\alpha} + F_{\mu\mu} + F_{\mu\alpha} \tag{3}$$

The subscripts in equation (3) indicate the kind of electrostatic interaction: among charges (ρ), permanent dipoles (μ) and induced dipoles arising from polarizabilities (α). The electrostatic free energy of any distribution of charges, dipoles and polarizable groups, relative to infinite separation, is equal to the work of reversibly discharging the infinitely separated charges and dipoles and reversibly charging them in the final distribution. This work is given by equation (4), which writes each component of $F_{ES}$ as a sum over pairs of groups $i$ and $j \neq i$.

[Equation (4): the printed expressions are not legible in this copy.] (4)

Here, $\rho_i$ and $\mu_i$ are the charge and permanent dipole of group $i$; $\epsilon_{ij}$ and $R_{ij}$ are the effective dielectric constant and distance between groups $i$ and $j$; $\alpha_{jl}$ is the component of the polarizability of group $j$ along the principal polarization axis $l$; $G_{i,j}$ and $G_{i,jl}$ are geometric factors involving unit vectors $\mathbf{e}_i$ and $\mathbf{e}_j$ which lie along the group dipoles $\boldsymbol{\mu}_i$ and $\boldsymbol{\mu}_j$ ($G_{i,jl}$ involves unit vectors along $\boldsymbol{\mu}_i$ and $\boldsymbol{\alpha}_{jl}$, etc.). The London energy can be written

$$F_{L} = -\frac{h}{4} \sum_{i} \sum_{j>i} \sum_{l=1}^{3} \sum_{m=1}^{3} \frac{\nu_i \nu_j}{\nu_i + \nu_j}\, \frac{\alpha_{il}\,\alpha_{jm}\, G^{2}_{il,jm}}{\epsilon_{ij}} \tag{5}$$

where $\nu_i$ is the characteristic frequency determined from the dispersion of the refractive index of group $i$.
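The kind of pairwise term that enters these sums can be illustrated for two point groups. The sketch below is an assumption-laden illustration rather than the authors' program: the definition of the geometric factor G is not legible in this copy, so the usual point-dipole form G = [e_i·e_j - 3(e_i·r)(e_j·r)]/R³ is assumed, and the London term is the textbook isotropic expression. All function names are hypothetical.

```python
import math

KCAL_PER_D2_A3 = 14.393  # 1 Debye^2 / Angstrom^3 expressed in kcal/mole


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def geometric_factor(e_i, e_j, r_ij):
    """Assumed point-dipole factor [e_i.e_j - 3 (e_i.r)(e_j.r)] / R^3, in A^-3.

    e_i and e_j are unit vectors along the two dipoles; r_ij is the
    separation vector in Angstrom.
    """
    dist = math.sqrt(_dot(r_ij, r_ij))
    r_hat = [c / dist for c in r_ij]
    return (_dot(e_i, e_j) - 3.0 * _dot(e_i, r_hat) * _dot(e_j, r_hat)) / dist ** 3


def dipole_dipole_energy(mu_i, mu_j, e_i, e_j, r_ij, eps=1.0):
    """Interaction of two permanent point dipoles (Debye), in kcal/mole."""
    return KCAL_PER_D2_A3 * mu_i * mu_j * geometric_factor(e_i, e_j, r_ij) / eps


def london_energy_isotropic(alpha_i, alpha_j, hnu_i, hnu_j, dist):
    """Textbook London energy for isotropic polarizabilities (A^3, kcal/mole, A)."""
    return -1.5 * alpha_i * alpha_j / dist ** 6 * hnu_i * hnu_j / (hnu_i + hnu_j)


# Two parallel 1 Debye dipoles, 1 Angstrom apart: side by side, then head to tail.
print(dipole_dipole_energy(1.0, 1.0, [1, 0, 0], [1, 0, 0], [0, 0, 1]))
print(dipole_dipole_energy(1.0, 1.0, [0, 0, 1], [0, 0, 1], [0, 0, 1]))
# Two groups with alpha = 14 A^3 and h*nu = 200 kcal/mole, 3.36 Angstrom apart.
print(london_energy_isotropic(14.0, 14.0, 200.0, 200.0, 3.36))
```

Side-by-side parallel dipoles repel and head-to-tail dipoles attract, which is the sign behaviour the text relies on when it contrasts attracting and repelling base-pair dipoles. The numbers from this sketch are not expected to match the tabulated helix energies, which depend on the actual positions of the groups.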


Our model for the helix is the rigid orientation of base groups determined by X-ray crystallography for the B form of fibrous DNA lithium salt (Langridge et al., 1960), applicable to undenatured natural and synthetic DNA in solution and probably also (except for a slight change in the diameter) to some two-stranded complexes of polyribonucleotides (Rich & Davies, 1956). For the helix free energy calculations we have approximated the charge distributions and polarizabilities of the bases by point dipoles. The free energy values depend on the overall base composition of the helix and the base sequence along each strand. The single-strand coil form presents a more difficult problem, because the geometry is not known. We have attempted free energy calculations only for coils in a very good solvent, in which we assume the bases do not interact appreciably with one another; then the free energy of each strand depends only on the base composition. We have tried two approximations for the interaction of each base with the surrounding solvent, one based on independent mobile solvent molecules and the other on the macroscopic dielectric constant of the solvent. The calculations require knowledge of the magnitude and orientation of the permanent dipole moment of each base in a polynucleotide at neutral pH. Most purine and pyrimidine bases are too insoluble in non-polar solvents for direct measurement of their dipole moments. Instead, we have calculated the moments of N-methyl derivatives of the uncharged bases in the correct tautomeric forms. The customary vectorial addition of empirical bond moments is untrustworthy for these compounds, since most of the total moment may arise from nitrogen and oxygen lone-pair electrons and be dependent on the particular hybridization of the orbitals. 
We chose to calculate the separate contributions of σ electrons and π electrons to the total dipole moment of each base by molecular orbital (MO) methods: a localized MO treatment for the moment of each σ bond and a Hückel MO treatment for the moment of the π-electron system of each base. The reliability of the results was confirmed by comparison with dipole moment values from a vectorial group and bond moment method and from experimental measurements on four sufficiently soluble base derivatives. Details are given in the Appendix.

2. Methods and Results

(a) Atom positions

We assumed for the dipole moment calculations that the bases adenine, thymine, guanine and cytosine (hereafter abbreviated A, T, G and C) in DNA have the bond lengths and bond angles deduced by Spencer (1959) from crystal diffraction data on analogous compounds. Within the DNA helix, the glycoside bonds were given the positions, and the base rings were given the orientations with respect to the helix and dyad axes, determined by Langridge et al. (1960) (Model 3) from X-ray diffraction studies on fibrous DNA. Each glycoside bond is distorted by 4° from forming equal angles with the two adjacent base-ring bonds. The planes of the bases were assumed to be perpendicular to the helix axis. Figure 2 shows the dimensions and orientations of the AT and GC base-pairs. An angle θ, which will be used to describe the directions of the calculated moments, is defined for the purines and for the pyrimidines on the diagrams of A and T, respectively.


[Figure 2: diagram of the AT and GC base-pairs with a distance scale from 0 to 5 Å.]

FIG. 2. Atom positions in the AT and GC base-pairs of DNA. The figure is in the plane of the bases and of the dyad axis (- - -) and is perpendicular to the helix axis. The calculated dipole moment of each base is drawn to scale, with the positive end located at the position of the point dipole used in the energy calculations. The dipole orientation angle θ is defined. Separate atom symbols mark nitrogen, oxygen, hydrogen and carbon.

(b) Helix energy

The contribution of the bases to the helix energy can be divided into base-base, base-sugar, base-phosphate and base-solvent interactions. Each base has five nearest-neighbor bases which probably provide the main interaction energy. Base-base interactions are not only the most important, they are also the only interactions which depend on the sequence. We have evaluated the contributions from base-base interactions to the free energy of the helix (relative to infinitely separated bases) from the expressions of equations (4) and (5). Table 1 lists the values for each base group $i$ of the dipole moment magnitude, $\mu_i$, the dipole moment angle, $\theta_i$, the average group polarizability, $\alpha_i$, and the characteristic energy, $h\nu_i$. The components of the molecular polarizability were taken as $\alpha_{i1} = \alpha_{i2}$ = 1·2

the dyad axes of adjacent base-pairs is a translation of 3·36 Å and a right-handed rotation of 36°. The computation of the geometric factor G for any two vectors located within any two bases in the helix was performed by an IBM 704 digital computer program written with the assistance of Mr. Robert W. Woody.

TABLE 1

Base group properties used for energy calculations

Base group       μ_i (Debyes)   θ_i (°)   α_i (Å³) c   hν_i (kcal/mole)
Adenine          2·8 a          88 a      14           200 d
Thymine          3·5 b          33 b      11           240 e
5-bromouracil    4·5 b          10 b
Guanine          6·9 a          324 a     14           200 d
Cytosine         8·0 b          108 b     11           240 e

a Calculated for the 9-methyl derivative; see Appendix.
b Calculated for the 3-methyl derivative; see Appendix.
c Estimated from atomic refractions (Fajans, 1949).
d Experimental value from dispersion of quinoline.
e Experimental value from dispersion of benzene and pyridine.

The computed values for DNA of the interactions between bases in the same base-pair, and of the interactions between non-paired bases for the 10 possible combinations of adjacent base-pairs, are listed in Table 2 in order of decreasing stability. We notice that the main interaction between the coplanar bases in a base-pair comes from the permanent dipole moments: the GC dipoles attract while the AT dipoles repel. The dipole interaction (mainly repulsive) is also large for adjacent non-paired bases, but now the attractive London force becomes the largest. The total electrostatic and London free energy for two adjacent base-pairs is given in the last column of Table 2. This clearly shows the much greater stability of the sequence G-3'-p-5'-C compared to T-3'-p-5'-A.

These data have significance for information theory and DNA, as every chemical restraint on the sequence decreases the possible information that the sequence can contain. If there are no chemical restraints, then the possible information is at a maximum, while if the sequence is completely determined by the free energy, no message is carried. Nearest-neighbor sequence data have been obtained by Josse, Kaiser & Kornberg (1961). For each of the 12 nucleic acids they studied, the sequence A-3'-p-5'-T occurred more often than T-3'-p-5'-A, while G-3'-p-5'-C occurred more often than C-3'-p-5'-G ten out of twelve times. The more frequent sequences are those which we calculate to have the more negative free energies.

The variation of the helix energy with base composition in DNA with a random sequence is shown in Fig. 3 (lines). It is seen that the dipole-dipole and dipole-induced-dipole energies both increase (less stability) as the A + T content of DNA increases, while the London energy remains constant. Fig. 3 (dashed line) also shows the result of including more than nearest-neighbor interactions (base-pairs up to 33·6 Å apart) in the calculation of dipole-dipole energy. The energy differs from the nearest-neighbor values by only about 0·5 kcal/mole of base-pairs. The discrepancy for dipole-induced-dipole and London energies is even less.
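The remark on information theory above can be made concrete with a small calculation. The sketch below is an illustration under stated assumptions, not an analysis from the paper: it measures the information per base as the Shannon conditional entropy of a base given its predecessor, and the two frequency tables are hypothetical limiting cases rather than measured nearest-neighbor data.

```python
import math


def conditional_information_bits(pair_freq):
    """Information (bits per base) left in a base once its predecessor is known.

    pair_freq[x][y] is the frequency (any consistent units) with which
    base x is followed by base y along a strand.
    """
    total = sum(sum(row.values()) for row in pair_freq.values())
    bits = 0.0
    for row in pair_freq.values():
        row_total = sum(row.values())
        for freq in row.values():
            if freq > 0:
                bits -= (freq / total) * math.log2(freq / row_total)
    return bits


# No restraint: every nearest-neighbor pair equally frequent.
unrestrained = {x: {y: 1 for y in "ATGC"} for x in "ATGC"}
# Complete restraint: each base fixes its successor.
determined = {"A": {"T": 1}, "T": {"G": 1}, "G": {"C": 1}, "C": {"A": 1}}

print(conditional_information_bits(unrestrained))
print(conditional_information_bits(determined))
```

The unrestrained table gives the maximum of 2 bits per base for a four-letter alphabet, and the fully determined table gives zero, matching the two limits described in the text. Measured nearest-neighbor frequencies would fall between them.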

TABLE 2

Nearest-neighbor base-base interactions in the DNA helix in a vacuum

Interaction energy between 2 bases in a base-pair (kcal/2 moles of base)

Base-pair a   F_μμ (helix)   F_μα (helix)   F_L (helix)   Sum
GC            -3·1           -0·3           -0·5          -3·9
AT             0·8           -0·1           -0·5           0·2

One-half the sum of the 4 interactions between non-paired bases in 2 adjacent base-pairs (kcal/2 moles of base)

The rows are in order of decreasing stability. The diagrams identifying the adjacent base-pairs of each row are not legible in this copy, so the rows are numbered instead.

Adjacent base-pairs a   F_μμ (helix)   F_μα (helix)   F_L (helix)   Total b
1                       -5·8           -4·1           -6·0          -19·8
2                        2·1           -3·3           -6·8          -11·9
3                       -0·5           -2·0           -6·8          -11·2
4                       -0·9           -1·4           -3·6           -7·8
5                        3·3           -2·0           -6·8           -7·4
6                        3·1           -2·4           -6·0           -7·2
7                        4·2           -2·4           -3·6           -5·7
8                        1·3           -0·7           -6·0           -5·2
9                        2·2           -0·6           -6·8           -5·0
10                       2·2           -0·4           -3·6           -1·6

Negative free energy values correspond to an attractive force between bases.

a Base symbols on the same line represent a base-pair. The arrows which designate the direction of the chain point from the 3' carbon on one sugar to the 5' carbon on the adjacent sugar. T→A represents T-sugar-3'-phosphate-5'-sugar-A.

b The column gives the sum of free energies shown plus the average contribution per 2 moles of base from the appropriate base-pair energies for AT and GC given above.
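Footnote b of Table 2 can be checked arithmetically: each total should equal the three listed terms plus the average of the two base-pair sums involved. The sketch below is a consistency check, not part of the paper. It assumes the three possible averages are -3·9 (two GC pairs), 0·2 (two AT pairs) and their mean (one of each), and the class labels are inferred from the arithmetic alone, since the row labels themselves are not legible.

```python
BASE_PAIR_SUM = {"GC": -3.9, "AT": 0.2}  # "Sum" column, upper part of Table 2

# (F_mumu, F_mualpha, F_L, Total) for rows 1-10 of the lower part of Table 2.
ROWS = [
    (-5.8, -4.1, -6.0, -19.8),
    (2.1, -3.3, -6.8, -11.9),
    (-0.5, -2.0, -6.8, -11.2),
    (-0.9, -1.4, -3.6, -7.8),
    (3.3, -2.0, -6.8, -7.4),
    (3.1, -2.4, -6.0, -7.2),
    (4.2, -2.4, -3.6, -5.7),
    (1.3, -0.7, -6.0, -5.2),
    (2.2, -0.6, -6.8, -5.0),
    (2.2, -0.4, -3.6, -1.6),
]

# Average base-pair contribution for each possible composition of the two pairs.
CANDIDATES = {
    "GC/GC": BASE_PAIR_SUM["GC"],
    "GC/AT": 0.5 * (BASE_PAIR_SUM["GC"] + BASE_PAIR_SUM["AT"]),
    "AT/AT": BASE_PAIR_SUM["AT"],
}


def classify(row, tolerance=0.06):
    """Return the compositions whose base-pair average reproduces the total."""
    f_mumu, f_mualpha, f_london, total = row
    residual = total - (f_mumu + f_mualpha + f_london)
    return [name for name, value in CANDIDATES.items()
            if abs(residual - value) <= tolerance]


for number, row in enumerate(ROWS, start=1):
    print(number, classify(row))
```

Every row matches exactly one composition to within the rounding of the table, which supports the column order used in the reconstruction above.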

Several non-random base sequences are included in Fig. 3. The energy values for DNA from nine viral, bacterial and animal sources (circles) were calculated from the nearest-neighbor base frequencies determined by Josse et al. (1961) for biosynthetic DNA primed by the natural DNA. If there is compositional heterogeneity among the different DNA molecules from each source, the calculated energy is an average for all molecules. Although the base sequences in the natural DNAs are not random (Josse et al., 1961), the energy values are seen to be close to those calculated for a random sequence.


Calculated energies are shown (triangles) for three kinds of polynucleotide having a regular alternating sequence (two kinds of base alternating with one another along each strand) or a non-alternating sequence (only one kind of base in each strand). These are poly AT (alternating sequence) and poly GC (non-alternating sequence) which have been synthesized as polydeoxyribonucleotides with the enzyme polymerase (Josse et al., 1961), and poly AT (non-alternating sequence) which probably closely corresponds in base-base energy to 1: 1 helical complexes which have been made using polyriboadenylic acid and either polyribouridylic acid (Rich & Davies, 1956) or polydeoxyribothymidylic acid (Rich, 1960).

[Figure 3: plot of the calculated free energy contributions against the mole fraction of AT base-pairs.]

FIG. 3. Calculated contributions of base-base interactions to the free energy of helical polynucleotides of various base compositions. The values are per mole of base-pairs relative to infinitely separated bases. End effects are neglected. Only nearest-neighbor interactions are included, except the dashed line includes interactions between bases up to 33·6 Å apart. Lines: calculated for a random sequence of bases. Triangles: calculated for (1) poly GC, non-alternating sequence, (2) poly AT, alternating sequence, (3) poly AT, non-alternating sequence. Circles: calculated for DNAs with the nearest-neighbor frequencies found by Josse et al. (1961); the DNAs come from (4) M. lysodeikticus, (5) M. phlei, (6) A. aerogenes, (7) E. coli, λ+ phage and λdg phage, (8) calf thymus and B. subtilis, (9) H. influenzae.

Using the calculated dipole moment of the 5-bromouracil group (BrU) listed in Table 1, we calculate for the nearest-neighbor dipole-dipole energy of poly A-BrU (alternating sequence) the value $F_{\mu\mu}$(helix) = 4·0 kcal/mole of base-pairs, compared to 4·3 kcal/mole for poly AT (alternating sequence). Since 5-bromouracil has a greater polarizability than thymine, the $F_{\mu\mu}$, $F_{\mu\alpha}$ and $F_{L}$ contributions all tend to give helical poly A-BrU a greater stability than poly AT.

The base contribution to the free energy of helix solvation can be considered in two parts: the direct base-solvent interaction (which is small and is ignored here), and the indirect effect of the solvent on the effective dielectric constant for base-base interaction. The indirect effect can be treated by the methods of Kirkwood (1934), Kirkwood & Westheimer (1938) and Hill (1944, 1955) to calculate the work necessary to produce an arbitrary charge distribution in a body immersed in a medium of different dielectric constant. For each type of interaction (charge-charge, charge-dipole, etc.) an effective dielectric constant is obtained which depends on the shape and dielectric constant of the body, the dielectric constant of the medium, and the position of the charges or dipoles in the body. For two extreme values of the parameters and a dielectric constant for the medium of about 100, $\epsilon_{ij}$ is between 2 and 5 for nearest-neighbor dipole-dipole interaction. It should be smaller for dipole-induced-dipole interactions and be equal to 1 for the London interaction. Placing the helix into a solvent therefore increases the importance of the London force relative to the electrostatic interactions between bases. It does not change the general conclusions about the relative stability of different helix sequences discussed earlier.

(c) Coil energy

As mentioned earlier, we have only attempted to calculate the free energy for a coil in a good solvent in which there is negligible base-base interaction; that is, in which the free energy of coil solvation is much greater than the free energy of coil formation in a vacuum. To evaluate the free energy change upon introducing a base group of a coil into the solvent, we need to know the change in the average orientation of the solvent molecules. If we neglect the correlation between solvent molecules and consider only the interactions between each base group and the surrounding solvent, we have in general for the orientation-averaged geometric factor G of equation (4)

$$\overline{G} = \frac{\int G\, e^{-I_{ij}/kT}\, d\Omega}{\int e^{-I_{ij}/kT}\, d\Omega} \tag{6}$$

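where $I_{ij}$ is the interaction energy between base group $i$ and solvent molecule $j$, the exponentials are classical Boltzmann factors and the integrations are over all mutual orientations. If there is no base-solvent correlation, the averages simplify (the printed intermediate expressions, including that for $F_{\mu\alpha}$(coils), are not legible in this copy), giving for the London term

$$F_{L}(\text{coils}) = -\frac{3 h \nu_s \alpha_s}{2 \epsilon_s} \sum_{i} \frac{n_i\, \nu_i\, \alpha_i}{R_{is}^{6}\,(\nu_i + \nu_s)} \tag{7}$$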

where the subscript $s$ refers to the solvent, $n_i$ is the number of solvent molecules surrounding base $i$ in the nearest coordination shell of average radius $R_{is}$, and the sums are over all bases in the coils. Actually there must be some base-solvent correlation; if it comes entirely from the dipole-dipole interaction we have

$$F_{\mu\mu}(\text{coils}) = -\frac{2\mu_s^{2}}{3\epsilon_s^{2}\, kT} \sum_{i} \frac{n_i\, \mu_i^{2}}{R_{is}^{6}} \tag{8}$$

The expressions of equation (7) for $F_{\mu\alpha}$(coils) and $F_{L}$(coils), which use average molecular polarizabilities $\alpha_i$ and $\alpha_s$, are unaffected by base-solvent correlation (or solvent-solvent correlation) if the polarizabilities are isotropic. The correct value for $F_{\mu\mu}$ is somewhere between 0 and the upper value given by equation (8).


A solvent in which a coil exhibits little or no base-base interaction, the assumption of equations (7) and (8), is formamide (Helmkamp & Ts'o, 1961). With the values μ (formamide) = 3·7 Debye, $\epsilon_s$ (effective) = 1, α (formamide) = 4·1 Å³, hν (formamide) = 265 kcal/mole and $R_{is}$ = 5 Å, the results can be expressed (in kcal/mole of base-pairs):

$$\begin{aligned}
F_{\mu\mu}(\text{coils}) &= -(1{\cdot}6\,n_A + 2{\cdot}5\,n_T)\,X_{AT} - (9{\cdot}7\,n_G + 13{\cdot}0\,n_C)\,X_{GC} \quad (\text{at } 300^{\circ}\text{K}) \\
F_{\mu\alpha}(\text{coils}) &= -(0{\cdot}21\,n_A + 0{\cdot}18\,n_T)\,X_{AT} - (0{\cdot}27\,n_G + 0{\cdot}38\,n_C)\,X_{GC} \\
F_{L}(\text{coils}) &= -0{\cdot}31\,(n_A + n_G) - 0{\cdot}27\,(n_T + n_C)
\end{aligned} \tag{9}$$

where $X_{AT}$ and $X_{GC}$ are the mole fractions of AT and GC base-pairs in the helix from which the coils came and $n_A$, $n_T$, $n_G$, $n_C$ are the respective numbers of solvent molecules around each base. Equation (9) with reasonable values for $n_A$, etc., shows an overwhelming stability for the coils relative to the helix, and a coil energy which is strongly dependent on the base composition. For example, with eight solvent molecules around each base the coils from poly GC would be more stable than the coils from poly AT by 150 kcal/mole of base-pairs.

An alternative model for base-solvent interactions, used for instance by Wada (1954), is to have each base in a cavity of radius $a$ interact with the Onsager reaction field, which depends on the macroscopic solvent dielectric constant:

[Equation (10): the printed expression is not legible in this copy.] (10)
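The coil estimates can be checked numerically from the moments and polarizabilities of Table 1. The sketch below is a hedged illustration and not the authors' calculation. The first function evaluates equation (8) for a single solvent molecule with the formamide parameters quoted above; the second uses the printed coefficients of equation (9). The third is an assumption: since equation (10) is not legible, it uses the textbook Onsager reaction-field energy of a polarizable point dipole in a spherical cavity, with a = 3 Å and a dielectric constant of 110 for formamide. All function names are hypothetical.

```python
KCAL_PER_D2_A3 = 14.393      # 1 Debye^2 / Angstrom^3 expressed in kcal/mole
RT_300K = 1.987e-3 * 300.0   # kcal/mole

MU = {"A": 2.8, "T": 3.5, "G": 6.9, "C": 8.0}         # Debye, Table 1
ALPHA = {"A": 14.0, "T": 11.0, "G": 14.0, "C": 11.0}  # Angstrom^3, Table 1


def f_mumu_per_solvent(base, mu_s=3.7, r_is=5.0, eps_s=1.0):
    """Equation (8) for one base and one solvent molecule, in kcal/mole."""
    pair = KCAL_PER_D2_A3 * MU[base] * mu_s / r_is ** 3
    return -2.0 * pair ** 2 / (3.0 * eps_s ** 2 * RT_300K)


def coil_energy_eq9(x_at, n=8):
    """Sum of the three printed terms of equation (9), n solvent molecules per base."""
    x_gc = 1.0 - x_at
    f_mumu = -(1.6 + 2.5) * n * x_at - (9.7 + 13.0) * n * x_gc
    f_mualpha = -(0.21 + 0.18) * n * x_at - (0.27 + 0.38) * n * x_gc
    f_london = -0.31 * (n + n) - 0.27 * (n + n)
    return f_mumu + f_mualpha + f_london


def reaction_field_energy(base, a=3.0, eps_s=110.0):
    """Assumed Onsager form: -mu^2 R / [2 (1 - alpha R)], R = 2(eps-1)/[(2 eps+1) a^3]."""
    r_field = 2.0 * (eps_s - 1.0) / ((2.0 * eps_s + 1.0) * a ** 3)
    return (-0.5 * KCAL_PER_D2_A3 * MU[base] ** 2 * r_field
            / (1.0 - ALPHA[base] * r_field))


for base in "ATGC":
    print(base, round(f_mumu_per_solvent(base), 2))
print(coil_energy_eq9(1.0) - coil_energy_eq9(0.0))
print(reaction_field_energy("A") + reaction_field_energy("T")
      - reaction_field_energy("G") - reaction_field_energy("C"))
```

Rounded to one decimal place, the per-solvent values agree with the coefficients 1·6, 2·5, 9·7 and 13·0 of equation (9), and the difference between poly AT and poly GC coils with eight solvent molecules per base comes out close to the 150 kcal/mole quoted in the text. Under the assumed reaction-field form the corresponding difference is about 44 kcal/mole of base-pairs.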

If $\epsilon_s$ is large, as for formamide at room temperature ($\epsilon_s$ = 110) and water up to its boiling point ($\epsilon_s \geq 56$), equation (10) (with $a$ = 3 Å) gives

[Equation (11): the printed expression is not legible in this copy.] (11)

kcal/mole of base-pairs. Now the coils from poly GC are more stable than the coils from poly AT by 44 kcal/mole of base-pairs. Although the energy differences calculated from equations (9) or (11) are too large because solvent-solvent interactions have been neglected, the qualitative conclusion is probably valid that in formamide an AT helix is more stable (relative to the coils) than is a GC helix.

With water as a solvent, we know from optical absorption and optical rotation that there remain significant base-base interactions in the coil (Michelson, 1959; Helmkamp & Ts'o, 1961). Water is probably not as good a solvent as formamide because of its lower dipole. It is also unique in the large amount of crystalline (ice-like) structure that remains in the liquid state. The calculation of coil energies in water is therefore extremely difficult. Our equations imply that in water an AT helix is more stable relative to the coils than a GC helix, whereas experimentally the AT helix has a lower transition temperature (melting temperature) in water than the GC helix (Marmur & Doty, 1959; Schildkraut, Marmur, Fresco & Doty, 1961). In water, base-base interactions in the coils which we have omitted are probably very important.

A relatively minor energy contribution which we will estimate because it depends on the bases is the interaction between base groups of the coils and ions in the solution. Using an appropriate expression (Kirkwood, 1943) for the activity coefficient, $\gamma_{\rho\mu}$, of a base group gives

[Equation (12): the printed expression is not legible in this copy.] (12)

where $I$ is the ionic strength and $a$ is the minimum center-to-center distance between ions and base groups. Taking $a$ = 4 Å gives

$$F_{\rho\mu}(\text{coils}) = -(0{\cdot}04\,X_{AT} + 0{\cdot}21\,X_{GC})\, I \tag{13}$$

kcal/mole of base-pairs (in water at 25°C).

(d) Hydrogen bond enthalpy

We are interested in the total enthalpy change, $\Delta H_{HB}$, when base-solvent hydrogen bonds in the coils are broken and base-base hydrogen bonds in the helix and solvent-solvent hydrogen bonds are formed. Donohue (1956) has pointed out that the base groups have more potential hydrogen-bonding sites of either the donor or acceptor type than participate in the intramolecular helix base-pairing, as many as six in the guanine group. All of the sites in the coils are likely to form hydrogen bonds with the solvent, but in the helix those sites not involved in the base-pairing may alter their bonds to the solvent because of their altered accessibility. It is obvious that $\Delta H_{HB}$ is difficult to estimate, but since it is the sum of differential hydrogen bond strengths for the bases and solvent it may often be small; thus the differential hydrogen bond strengths for N-H···O, O-H···N and O-H···O are 0 ± 1·5 kcal (Pimentel & McClellan, 1960). Even if small in magnitude, $\Delta H_{HB}$ (together with strain energy) must be important in ensuring specific AT and GC base pairing in the helix. It should be pointed out that a more detailed calculation of $F_{\mu\mu}$ and $F_{\mu\alpha}$, in which a more realistic charge distribution replaced the point dipoles used here, would essentially include $H_{HB}$.

(e) Strain energy

Short-range steric repulsion energy is not expected to be large unless the helix configuration corresponds to strained bond angles, rotational conformations or interatomic distances which are relieved in the coils. Langridge et al. (1960) discuss the strain energy of their X-ray diffraction model of the DNA helix and estimate that it is only a few tenths of a kcal/mole of base groups at most.

(f) Entropy

The entropy change, ΔS, for formation of a helix from the coils includes the entropy of both the solvent and the DNA. If the solvent is water, ordering of the solvent around the bases in the coils may lead to a hydrophobic bond with a contribution to ΔS of about +28 e.u./mole of base-pairs as judged from benzene in water (Kauzmann, 1959).


The loss of internal rotation on forming the helix leads to a decrease in configurational entropy which we will estimate. In the coil, there are six bonds per nucleotide residue about which free rotation is possible:

    [ -P-O-C-3'         C-4'-C-5'-O- ]n
              (sugar) C-1'
                       |
                     N(base)

From a study of the dependence of interatomic distances upon rotational configurations, Morgan (1958) concluded that about each of the two C-O bonds and the C-4'-C-5' bond there is one range of stable configuration, and that a nucleotide residue has only five positions of stable configuration about the two P-O bonds. From model studies, Donohue & Trueblood (1960) concluded that each base group has two ranges of minimum potential energy about the C-1'-N glycosidic bond. If we take these values as the number of rotational states in the coil and assume no rotational freedom in the helix, we get ΔS = -R ln 5 - R ln 2 = -4·6 e.u. per mole of base groups. Another estimate of the entropy change, probably an upper limit, comes from allowing three rotational configurations about each of the six bonds: ΔS = -6R ln 3 = -13·1 e.u. per mole of base groups.

The solvent and configurational entropy changes have opposite signs. As each e.u. contributes only 0·3 kcal to the free energy at room temperature, the base entropy contribution to the free energy may be relatively small.
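The two configurational entropy estimates follow directly from counting rotational states, as the short check below shows. It is a sketch with hypothetical function names, taking R = 1·987 cal/(mole K) and 298 K for room temperature.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mole K), i.e. e.u.


def configurational_entropy_change(states):
    """dS (e.u. per mole of base groups) on freezing the listed rotational states.

    `states` holds the number of accessible states for each independent
    rotation in the coil; the helix is assumed to have one state for each.
    """
    return -R_CAL * sum(math.log(n) for n in states)


def free_energy_per_eu(temperature=298.0):
    """Magnitude of the T*dS contribution of one e.u., in kcal/mole."""
    return temperature * 1.0 / 1000.0


# Five joint positions about the two P-O bonds and two about the glycosidic bond.
lower = configurational_entropy_change([5, 2])
# Three configurations about each of the six bonds.
upper = configurational_entropy_change([3] * 6)

print(round(lower, 1), round(upper, 1), round(free_energy_per_eu(), 1))
```

The rounded values reproduce the -4·6 and -13·1 e.u. quoted in the text and the figure of 0·3 kcal per e.u. at room temperature.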

3. Discussion

We have discussed the main base contributions to the free energy (relative to separated residues) of the helix and coil configurations of DNA. The contributions are many, but the London and electrostatic energies seem to dominate. The energy of the helix is thus seen to depend on the base composition and sequence. A sequence of particularly low stability in a localized region of a helix could explain the hotspots for mutation found by Benzer (1961) in his genetic maps. The coil energy is strongly dependent on base composition because of base-solvent dipole interaction, so that by a proper choice of solvent polarity it should be possible to change relative melting points of AT and GC polymers. The coil energy is not affected by base sequence in a good solvent; but in a poor solvent the London attraction between bases may dominate, thus providing stacked base dimers or polymers in the coil and introducing a base-sequence dependence of the coil energy.

We cannot, as yet, attempt to predict melting temperatures and enthalpies and entropies of melting. Instead we will discuss the experimental values for the melting parameters in water as solvent.

To estimate an enthalpy of melting at neutral pH, we take the calorimetric measurement of Sturtevant, Rice & Geiduschek (1958) for the denaturation of salmon sperm DNA by lowering the pH to 2·5 (ΔH = 4·8 kcal/mole of base-pairs at 25°C and ionic strength I = 0·1). By assuming the heats of ionization of base groups in a polynucleotide coil are the same as those of deoxyribonucleotides measured by Rawitscher & Sturtevant (1960), we calculate an enthalpy change for titrating the base groups of the coils from pH 2·5 back to pH 7 of about 3·2 kcal/mole of base-pairs. The enthalpy change on forming the helix at neutral pH is therefore

ΔH = H(helix) - H(coils) = -8·0 kcal/mole of base-pairs.

To calculate the entropy of melting for salmon sperm DNA, we correct the melting temperature measured by Marmur & Doty (1959), Tm = 87°C at a counter-ion concentration [Na+] = 0·18 M, to the value Tm = 83°C at [Na+] = 0·1 M, estimated from the behavior of calf thymus DNA (Doty, Boedtker, Fresco, Haselkorn & Litt, 1959). Combining this value with the enthalpy change estimated above (assumed to be independent of the temperature), we obtain

ΔS = S(helix) - S(coils) = ΔH/(Tm + 273) = -22·5 e.u./mole of base-pairs

for the entropy of helix formation at [Na+] = 0·1 M. Similar values of ΔH and ΔS for the 1:1 helical complex of polyriboadenylic acid and polyribouridylic acid have been calculated by Warner & Breslow (1958) from titration and Tm data; at pH 7 and [K+] = 0·2 M the values are ΔH = -9·3 kcal and ΔS = -28 e.u./mole of base-pairs. Recent calorimetric measurements of Steiner & Kitzinger (1962) have suggested ΔH is about -6 kcal/mole of base-pairs for this complex at neutral pH.

These data allow us to calculate ΔH as a function of base composition for various DNAs whose Tm values at [Na+] = 0·18 M are known, according to the expression ΔH = ΔS(Tm + 273) = -22·5(Tm + 273), in cal/mole of base-pairs. We are assuming that ΔS, while it may depend on the ionic strength, does not vary with temperature or base composition. The results are plotted in Fig. 4. The ΔH values for the DNAs from biological sources are seen to vary linearly with the mole fraction of AT base-pairs, reflecting the linear dependence of the Tm values on base composition observed by Marmur & Doty (1959). The extrapolated straight line varies from ΔH = -8·6 to -7·7 kcal/mole of base-pairs at the extremes of $X_{AT}$ = 0 and $X_{AT}$ = 1, a difference of 0·9 kcal. This small difference could be accounted for by any of the enthalpy effects mentioned, including the electrostatic, London and hydrogen bond energies.

Since the composition dependence of ΔH (Fig. 4) is in the same direction as the composition dependence of $F_{\mu\mu}$(helix) + $F_{\mu\alpha}$(helix) (Fig. 3), the magnitude of the electrostatic interactions in the helix may be what primarily determines the helix stability relative to the coils in water. This assumes that ΔS does not depend on base composition and that hydrogen bond enthalpies play a minor role. Some additional support for this hypothesis comes from the fact that at low ionic strength synthetic polydeoxyribo A-BrU (alternating sequence) has a melting temperature 9°C higher than that of polydeoxyribo AT (alternating sequence) (Inman & Baldwin, 1962); both polymers should have the same hydrogen bonding but as mentioned above the poly A-BrU helix is calculated to have the greater stability from base-base interactions.
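The arithmetic behind these melting parameters is short enough to retrace. The sketch below uses hypothetical function names and only the numbers quoted in the text: it combines the two calorimetric enthalpies, derives the entropy from Tm = 83°C, and converts between ΔH and Tm with the fixed entropy of -22·5 e.u.

```python
def helix_formation_enthalpy(dh_acid_denaturation=4.8, dh_titration=3.2):
    """dH (kcal/mole of base-pairs) for coils -> helix at neutral pH."""
    return -(dh_acid_denaturation + dh_titration)


def entropy_from_melting(dh_kcal, tm_celsius):
    """dS = dH / (Tm + 273), returned in e.u. per mole of base-pairs."""
    return 1000.0 * dh_kcal / (tm_celsius + 273.0)


def enthalpy_from_tm(tm_celsius, ds_eu=-22.5):
    """dH = dS (Tm + 273), returned in kcal per mole of base-pairs."""
    return ds_eu * (tm_celsius + 273.0) / 1000.0


def tm_from_enthalpy(dh_kcal, ds_eu=-22.5):
    """Melting temperature (deg C) implied by a given dH at fixed dS."""
    return 1000.0 * dh_kcal / ds_eu - 273.0


dh = helix_formation_enthalpy()
print(dh, round(entropy_from_melting(dh, 83.0), 1))
print(round(enthalpy_from_tm(87.0), 1))
print(round(tm_from_enthalpy(-8.6)), round(tm_from_enthalpy(-7.7)))
```

With these inputs the entropy rounds to -22·5 e.u., and the extrapolated enthalpies of -8·6 and -7·7 kcal/mole correspond to melting temperatures of roughly 109°C and 69°C at the two composition extremes.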

The calculations presented here show the large interaction energies which exist between the bases in the helix. The numerical values could obviously be improved by using bond dipoles and bond polarizabilities for the σ electrons and point monopoles and polarizabilities for the π electrons. However, from the present results we can draw such qualitative conclusions as the following. (1) Electrostatic and London forces may influence the sequence of bases in DNA and thus decrease the possible information content of a sequence. (2) The relative melting temperatures of GC and AT polymers will depend on the solvent. (3) In a poor solvent large London forces between parallel bases may cause stacking of the bases into ordered arrays in single polynucleotide strands.

[Figure 4: enthalpy of helix formation (ordinate, scale 7·8 to 8·4) plotted against mole fraction of AT base-pairs (abscissa). Labelled points include M. phlei, Serratia, E. coli, salmon sperm, calf thymus, D. pneumoniae and yeast.]

FIG. 4. The enthalpy of helix formation for natural and synthetic DNAs of various base compositions. The enthalpy is calculated from ΔH = ($T_m$ + 273) ΔS, where ΔS is taken as -22·5 e.u. The values of $T_m$ were measured by Marmur & Doty (1959) and by Schildkraut et al. (1961) for [Na+] = 0·18 M.

APPENDIX

Calculation of Dipole Moments

(a) Sigma moments

Our treatment is similar to those of Gibbs (1955) and Hameka & Liquori (1958). For a σ bond between atoms I and J we form a localized, completely covalent MO $\psi_{IJ} = \phi_I + \phi_J$, where $\phi_I$ and $\phi_J$ are normalized atomic orbitals (AOs). The AOs of a given atom are constructed by hybridization from 2s and 2p orbitals (a 1s orbital for hydrogen) to lie along the bonds and to be mutually orthogonal; a more exact treatment would make the MOs mutually orthogonal. The bond dipole moment is

where $x_I$, $x_J$ and $x$ are distances along the bond from nucleus I, nucleus J and the bond midpoint, respectively, and $e$ is the charge of an electron. The first two terms are atom hybrid moment contributions and the third is the bond homopolar moment.

For the planar ring compounds considered here, it was convenient to calculate for each carbon and nitrogen ring atom the total hybrid moment contributed by the atom's AOs. Since the ring bonds IJ and IK formed by ring atom I have approximately equal overlap integrals ($S_{IJ} \approx S_{IK}$) and we use equivalent AOs of atom I for these bonds, the total hybrid moment lies along the lone-pair orbital or extra-annular bond of atom I, which is assumed to bisect the IJ-IK angle, $\alpha_I$ (except for the ring junction atoms C-4 and C-5 of purines, the hybrid moments of which are taken for simplicity as zero). The total hybrid moment points away from the ring with magnitude

$$\mu_I(\mathrm{hybrid}) = \frac{2 C_I \left[-2(1+\cos\alpha_I)\cos\alpha_I\right]^{1/2} (S' - \bar{S})}{1-\cos\alpha_I}, \qquad C_I = e\int (2s)\, x\, (2p\sigma)\, d\tau,$$

where $S'$ is equal to 2 if atom I has a lone-pair σ orbital or to $(1 + \int \phi_I \phi_L\, d\tau)^{-1}$ for an extra-annular bond IL, $\bar{S}$ is the mean of $S_{IJ}$ and $S_{IK}$, and $C_I$ is equal to 2·31 Debye for carbon and to 1·91 Debye for nitrogen (Moffitt, 1950). Overlap integrals were evaluated from tables for Slater atomic functions given by Mulliken, Rieke, Orloff & Orloff (1949). The total hybrid moment of nitrogen in an extra-annular amino group was similarly calculated as 0·04 Debye directed away from the ring.

A carbonyl oxygen is a special problem, there being no bond angle to indicate the hybridization of the colinear bonding and lone pair AOs. The "magic formula" expression of Mulliken (1952) for the dissociation energy of the C($sp^2$)-O bond was maximized with respect to the oxygen AO hybridization, using a fixed C-O distance of 1·22 Å, Slater function overlap integrals, non-bonding attraction integral values from approximate values for CO2 (Mulligan, 1951), and values of other parameters from Mulliken's paper. This treatment yielded the normalized oxygen bonding AO $\phi_O$ = 0·238(2s) + 0·972(2pσ), a result which is insensitive to the exact hybridization of the carbon bonding AO. The hybridization moment of the carbonyl oxygen, including the lone pair AO, is then 1·04 Debye directed from C toward O.

Homopolar moments of various bonds in Debye units, calculated from Slater function overlap integrals, are: C($sp^2$)-H, 1·1; C($sp^3$)-H, 1·1; C($\alpha_C$ = 108°)-H, 0·9; N($sp^2$)-H, 0·8; and N($\alpha_N$ = 105°)-H, 0·5, all directed away from the ring. The homopolar moment of C-N was taken as 0·1, directed from C toward N (Hameka & Liquori, 1958); the homopolar moment of a C-O bond with $\alpha_C$ = 116° and the O bonding AO hybridized as above was calculated from Slater function $A_n$ and $B_n$ integral values (Kotani, Amemiya & Simose, 1938) to be 0·15 directed from O toward C.

The total σ moment of a molecule is then the vector sum of the hybrid moments of the ring atoms and extra-annular nitrogen and oxygen atoms (hydrogen atoms and methyl radicals are assumed to have zero hybrid moments) and of the homopolar moments of the bonds. Table 3 lists the calculated values of the magnitude, $\mu_\sigma$, and the angle relative to the base, $\theta_\sigma$, for the base derivatives and seven model compounds.
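Every moment in this Appendix lies in the molecular plane, so the vector sums reduce to adding components obtained from a magnitude and an angle. The sketch below shows that bookkeeping in Python; the function names are illustrative and not from the original work. As a check it combines the σ and π entries of Table 3 for 9-methyladenine (1·3 Debye at 65° and 1·67 Debye at 106°) and for 9-methylguanine (0·6 Debye at 236° and 6·88 Debye at 329°), assuming both angles in each pair are measured from the same reference axis.

```python
import math


def to_components(magnitude, angle_deg):
    """Convert a planar moment (Debye, degrees) to x and y components."""
    a = math.radians(angle_deg)
    return magnitude * math.cos(a), magnitude * math.sin(a)


def vector_sum(moments):
    """Add planar moments given as (magnitude, angle in degrees) pairs.

    Returns the resultant as (magnitude, angle in degrees within [0, 360)).
    """
    x = y = 0.0
    for magnitude, angle_deg in moments:
        dx, dy = to_components(magnitude, angle_deg)
        x += dx
        y += dy
    return math.hypot(x, y), math.degrees(math.atan2(y, x)) % 360.0


if __name__ == "__main__":
    # Sigma and pi moments as listed in Table 3.
    for name, parts in (
        ("9-methyladenine", [(1.3, 65.0), (1.67, 106.0)]),
        ("9-methylguanine", [(0.6, 236.0), (6.88, 329.0)]),
    ):
        magnitude, angle = vector_sum(parts)
        print(name, round(magnitude, 1), round(angle))
```

Rounded to the precision of Table 3, the resultants agree with the tabulated MO totals of 2·8 Debye at 88° and 6·9 Debye at 324°.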

TABLE 3

Calculated and experimental dipole moments of model compounds, purines and pyrimidines

                          MO calculations                          Group and bond  Experimental
Compound                  μσ    θσ    μπ      θπ    μtot   θtot    μtot   θtot     μtot          Solvent   Reference
Pyridine                  1·4   -     0·84ᵃ   -     2·2ᵃ   -       -      -        2·23          benzene   e
Pyrimidine                1·7   -     0·84    -     2·5    -       -      -        2·42          dioxane   e
Pyrrole                   0·3   -     1·47ᵃ   -     1·8ᵃ   -       -      -        1·80          benzene   e
Aniline                   0·1   -     1·45ᵃ   -     1·6ᵃ   -       -      -        1·53          benzene   e
4-aminopyridine           1·6   -     2·43    -     4·0    -       -      -        3·79          benzene   e
N-methyl-2-pyridone       0·2   -     4·18ᵃ   -     4·2ᵃ   -       -      -        4·15          benzene   f
N-methyl-4-pyridone       0·1   -     6·65    -     6·6    -       -      -        6·9           benzene   f
9-methyladenine           1·3   65    1·67    106   2·8    88      2·3    47       3·0 ± 0·2ᵇ    CCl4      g
3-methylthymine           0·4   254   3·86    37    3·5    33      4·2    35       -             -         -
1,3-dimethyluracil        0·2   270   3·86    37    3·7    35      4·2    30       3·9 ± 0·1     dioxane   g
3-methyl-5-bromouracil    -     -     -       -     -      -       4·5    10       4·5 ± 0·3ᶜ    dioxane   g
9-methylguanine           0·6   236   6·88    329   6·9    324     7·2    331      -             -         -
3-methylcytosine          1·7   63    6·90    118   8·0    108     6·1    94       -             -         -
9-methylpurine            1·3   65    2·35    51    3·6    56      3·5    29       4·3 ± 0·2ᵈ    dioxane   g

Values of μ in Debyes; values of θ in degrees. The italicized values were used for the energy calculations.
a Value fitted to agree with experimental value.
b 9-n-butyl derivative; because of low solubility, this value is based on a dielectric constant measurement at only one concentration.
c 5-bromouracil. Hydrogen bonding to the solvent may make this value incorrect (see Smyth, 1955, p. 329).
d 9-n-butyl derivative.
e Smyth, 1955.
f Albert & Phillips, 1956.
g Present work.

(b) Pi moments

The π moment of each molecule was calculated from the atomic position and π electron density at each atom containing a conjugated π AO. These densities are sums of squares of AO coefficients in a Hückel MO calculation without overlap, and were computed by an IBM 701 computer program kindly given to us by Professor A. Streitwieser, Jr. By-products were the AO coefficients and eigenvalues of excited states which were used in the following paper on DNA hypochromism. The program requires values of the parameters k for each bond and h for each atom, such that the bond resonance integral equals kβ and the atom Coulomb integral equals α + hβ (where β and α are the respective integrals for benzene). The assumed values of the k parameters were Streitwieser's (1961) recommended values $k_{C-C}$ = 0·9, $k_{C=C}$ = 1·1, and $k_{C-N}$ = 0·8; and values from experimental bond energies $k_{C-N}$ (aromatic) = 1 and $k_{C=O}$ = 2. To base the calculated moments on experimental ones, the values of the h parameters were adjusted until the π moments calculated for seven model compounds agreed with those predicted from the calculated σ moments and experimental total moments (Table 3). The values adopted were: $h_N$ (one π electron) = 0·35, $h_N$ (two π electrons) = 1·8, $h_N$ (amino) = 1·1, and $h_O$ (carbonyl) = 1·7. For carbon, $h_C$ was taken as one-tenth the sum of neighboring non-carbon h values (inductive effect).
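The procedure just described can be sketched for a single model compound, pyridine. The Python below is an illustration under stated assumptions and not the IBM 701 program. A small Jacobi diagonalization stands in for that program. The ring is taken as a regular hexagon of radius 1·40 Å, the aromatic C-C bonds are given k = 1, and the conversion factor 4·803 Debye per electron-Ångström is used; none of these three choices is stated in the text. The values $h_N$ = 0·35, $k_{C-N}$ (aromatic) = 1 and the one-tenth inductive rule for carbon are taken from the text. All function and variable names are hypothetical.

```python
import math


def jacobi_eigh(matrix, sweeps=100, tol=1e-22):
    """Eigenvalues and eigenvectors (columns of v) of a real symmetric matrix."""
    n = len(matrix)
    a = [list(row) for row in matrix]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                theta = math.pi / 4 if diff == 0.0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # A <- A J (columns p and q)
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A (rows p and q)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J (accumulate eigenvectors)
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def pi_densities(h, bonds, n_electrons):
    """Pi electron density on each atom from a Hueckel calculation without overlap."""
    n = len(h)
    m = [[0.0] * n for _ in range(n)]
    for i, h_i in enumerate(h):
        m[i][i] = h_i
    for (i, j), k in bonds.items():
        m[i][j] = m[j][i] = k
    x, v = jacobi_eigh(m)
    # E = alpha + x*beta with beta < 0, so the largest x are the lowest levels.
    occupied = sorted(range(n), key=lambda col: x[col], reverse=True)[: n_electrons // 2]
    return [sum(2.0 * v[i][col] ** 2 for col in occupied) for i in range(n)]


def pi_moment_debye(q, radius=1.40):
    """Magnitude of the pi moment for atoms on a regular hexagon (assumed geometry)."""
    mx = my = 0.0
    for i, q_i in enumerate(q):
        angle = math.radians(60.0 * i)
        net = 1.0 - q_i  # each ring atom supplies one pi electron in this model
        mx += net * radius * math.cos(angle)
        my += net * radius * math.sin(angle)
    return 4.803 * math.hypot(mx, my)


# Pyridine: atom 0 is N, atoms 1 to 5 are C; carbons next to N get h_N / 10.
H_N = 0.35
PYRIDINE_H = [H_N, 0.1 * H_N, 0.0, 0.0, 0.0, 0.1 * H_N]
PYRIDINE_BONDS = {(i, (i + 1) % 6): 1.0 for i in range(6)}

if __name__ == "__main__":
    densities = pi_densities(PYRIDINE_H, PYRIDINE_BONDS, 6)
    print([round(d, 3) for d in densities])
    print(round(pi_moment_debye(densities), 2))
```

The computed moment depends on the assumed geometry, so it should be read as a qualitative illustration of how a positive $h_N$ draws π density onto nitrogen rather than as a reproduction of the tabulated value.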

The calculated magnitudes and directions of the total moments (Tables 1 and 3), obtained by vectorial addition of the theoretical σ and π moments, were used for the energy calculations involving base dipole moments. The analogues for the bases in DNA were 9-methyladenine for A, 3-methylthymine for T, 9-methylguanine for G and 3-methylcytosine for C.

(c) Group and bond moment method

The total dipole moments of the N-methyl base derivatives were recalculated by vector addition of appropriate molecule and bond moment values, as a check on the MO calculations and to obtain the moment of 3-methyl-5-bromouracil. The calculations were based on the experimental dipole moments (Smyth, 1955) of pyridine, aniline, N-methyl-2-pyridone, 1-methylpyrrole (1·9 Debye), toluene (0·4 Debye), and bromobenzene (1·5 Debye), a C-H bond moment of 0·4 Debye directed from H toward C, and a C-N-7 (purine) bond moment of 1·8 Debye. Table 3 lists the resulting $\mu_{tot}$ and $\theta_{tot}$ values, and they are seen to be fairly close to the MO results.

(d) Dipole moment measurements

The dipole moments of four purines and pyrimidines in purified dioxane or carbon tetrachloride were measured by the method of Guggenheim (1955). The 9-n-butyladenine (BuA) and 9-n-butylpurine (BuP) were kindly supplied by Dr. John A. Montgomery, Southern Research Institute; the BuA was recrystallized twice from benzene (m.p. 135°C) and the BuP was stated to melt about 35°C. The 1,3-dimethyluracil (MeU) and 5-bromouracil (BrU) were California Corporation for Biochemical Research C grade compounds; the MeU was recrystallized from CHCl3 and ethanol-ether (m.p. 123·0 to 123·5°C) and the BrU was recrystallized from water. Other purine and pyrimidine derivatives, including 9-methyladenine and 3-methylcytosine, were insufficiently soluble in nonpolar solvents, presumably because of strong crystal hydrogen bonding (Albert & Brown, 1954). The experimental moments (Table 3) are seen to agree well with both the MO and group and bond moment theoretical values for analogous methyl derivatives, except in the case of 9-butyl (methyl) purine. The moments calculated by the MO procedure are closest to the experimental values.

We wish to thank Professor A. Streitwieser, Jr., Mr. John Bush and Mr. Robert W. Woody of this Department for IBM computer programs and assistance in using them; Dr. John A. Montgomery, Southern Research Institute, Birmingham, Alabama, for purine derivatives; and Professor C. T. O'Konski of this Department for the loan of dielectric constant apparatus. Professor B. Zimm, University of California, La Jolla, and Professor R. L. Baldwin, Stanford University, made very useful suggestions. This work was supported in part by the National Institute of Arthritis and Metabolic Diseases, Public Health Service, and by an unrestricted grant from Research Corporation.

REFERENCES

Albert, A. & Brown, D. J. (1954). J. Chem. Soc. 2060.
Albert, A. & Phillips, J. (1956). J. Chem. Soc. 1294.
Benzer, S. (1961). Proc. Nat. Acad. Sci., Wash. 47, 403.
Donohue, J. (1956). Proc. Nat. Acad. Sci., Wash. 42, 60.
Donohue, J. & Trueblood, K. N. (1960). J. Mol. Biol. 2, 363.
Doty, P., Boedtker, H., Fresco, J. R., Haselkorn, R. & Litt, M. (1959). Proc. Nat. Acad. Sci., Wash. 45, 482.

Fajans, K. (1949). In Technique of Organic Chemistry, Vol. 1, Part 2, p. 1164. New York: Interscience Publishers.
Gibbs, J. H. (1955). J. Phys. Chem. 59, 644.
Guggenheim, E. A. (1955). Proc. Phys. Soc. London, B68, 186.
Hameka, H. F. & Liquori, A. M. (1958). Mol. Physics, 1, 9.
Helmkamp, G. K. & Ts'o, P. O. P. (1961). J. Amer. Chem. Soc. 83, 138.
Hill, T. L. (1944). J. Chem. Phys. 12, 147.
Hill, T. L. (1955). Arch. Biochem. Biophys. 57, 229.
Inman, R. B. & Baldwin, R. L. (1962). In the press.
Josse, J., Kaiser, A. D. & Kornberg, A. (1961). J. Biol. Chem. 236, 864.
Kauzmann, W. (1959). Advanc. Protein Chem. 14, 1.
Kirkwood, J. G. (1934). J. Chem. Phys. 2, 351.
Kirkwood, J. G. (1943). In Proteins, Amino Acids and Peptides, ed. by E. J. Cohn and J. T. Edsall, Chapter 12. New York: Reinhold.
Kirkwood, J. G. & Westheimer, F. H. (1938). J. Chem. Phys. 6, 506.
Kotani, M., Amemiya, A. & Simose, T. (1938). Proc. Phys.-Math. Soc. Japan, 20, Extra Number 1.
Langridge, R., Marvin, D. A., Seeds, W. E., Wilson, H. R., Hooper, C. W. & Wilkins, M. H. F. (1960). J. Mol. Biol. 2, 38.
LeFevre, C. G. & LeFevre, R. J. W. (1955). Revs. Pure and Appl. Chem. (Australia), 5, 261.
Marmur, J. & Doty, P. (1959). Nature, 183, 1427.
Michelson, A. M. (1959). J. Chem. Soc. 1371.
Moffitt, W. (1950). Proc. Roy. Soc. A, 202, 548.
Morgan, R. S. (1958). Disc. Faraday Soc. 25, 193.
Mulligan, J. F. (1951). J. Chem. Phys. 19, 347.
Mulliken, R. S. (1952). J. Phys. Chem. 56, 295.
Mulliken, R. S., Rieke, C. A., Orloff, D. & Orloff, H. (1949). J. Chem. Phys. 17, 1248.
Pimentel, G. C. & McClellan, A. L. (1960). The Hydrogen Bond. San Francisco: W. H. Freeman.
Rawitscher, M. & Sturtevant, J. M. (1960). J. Amer. Chem. Soc. 82, 3739.
Rich, A. (1960). Proc. Nat. Acad. Sci., Wash. 46, 1044.
Rich, A. & Davies, D. R. (1956). J. Amer. Chem. Soc. 78, 3548.
Schildkraut, C. L., Marmur, J., Fresco, J. R. & Doty, P. (1961). J. Biol. Chem. 236, PC2.
Smyth, C. P. (1955). Dielectric Behavior and Structure. New York: McGraw-Hill.
Spencer, M. (1959). Acta Cryst. 12, 59.
Steiner, R. & Kitzinger, C. (1962). Abstracts of the Sixth Annual Meeting of the Biophysical Society. Washington.
Streitwieser, A., Jr. (1961). Molecular Orbital Theory for Organic Chemists. New York: John Wiley & Sons.
Sturtevant, J. M., Rice, S. A. & Geiduschek, E. P. (1958). Disc. Faraday Soc. 25, 138.
Wada, A. (1954). J. Chem. Phys. 22, 198.
Warner, R. C. & Breslow, E. (1958). In Proceedings of the Fourth International Congress of Biochemistry, Vol. 9, p. 157. New York: Pergamon Press.
