Molecular structure of a complete turn of A-DNA


J. Mol. Biol. (1991) 221, 623-635

Nuria Verdaguer1, Joan Aymamí1, Dolors Fernández-Forner2, Ignacio Fita1, Miquel Coll1, Tam Huynh-Dinh2, Jean Igolen2 and Juan A. Subirana1

1Departament d'Enginyeria Química, Universitat Politècnica de Catalunya, Diagonal 647, 08028 Barcelona, Spain

2Unité de Chimie Organique, Institut Pasteur, 28 R. du Dr. Roux, 75724 Paris, France

(Received 26 November 1990; accepted 24 May 1991)

We have determined the crystal structure of the dodecamer d(CCCCCGCGGGGG), showing for the first time a complete turn of A-DNA. It has average structural parameters similar to those determined in fibres. Nevertheless, it shows a considerable local variation in structure, which is in part associated with the presence of a bound spermine molecule. We conclude that the local DNA conformation does not depend only on the base sequence, but may be strongly modified upon interaction with other molecules. In particular, the CpG sequence, which is found in hypersensitive regions of the genome, appears to be able to change its conformation easily under external influences.

Keywords: DNA structure; X-ray diffraction; spermine; A-DNA; CpG sequence

1. Introduction

We have solved the molecular structure of the autocomplementary DNA dodecamer d(CCCCCGCGGGGG) by X-ray diffraction analysis. We chose to study this compound because sequences containing only C·G base-pairs show a striking conformational versatility: continuous sequences favour the A form (Arnott & Selsing, 1974) and alternating sequences favour the Z form (Wang et al., 1979). This oligonucleotide contains both types of sequences. Furthermore, all the dodecamers crystallized up to now have the B form (for a review, see Kennard & Hunter, 1989), so that the compound chosen by us might show any of these conformations or mixtures of them. In fact we found that it adopts an A conformation quite similar, on average, to that found in fibres, in contrast to what is found in octamer crystals, which deviate significantly from the standard fibre conformation (Wang et al., 1982; Shakked et al., 1983; McCall et al., 1985; Heinemann et al., 1987; Haran et al., 1987; Rabinovich et al., 1988; Jain & Sundaralingam, 1989; Jain et al., 1989; Hunter et al., 1989; Takusagawa, 1990). It is the longest oligonucleotide that has been crystallized in the A form, showing a complete helical turn.

2. Materials and Methods

(a) Synthesis and crystallization

The deoxyoligonucleotide was synthesized by the solid-phase phosphoramidite method in an automatic synthesizer and purified by reverse-phase high-pressure liquid chromatography. Crystals were grown in about 2 weeks from a solution containing 1 mM-dodecamer, 30 mM-sodium cacodylate (pH 7), 5 mM-calcium chloride, 1.25 mM-spermine tetrachloride and 5% (v/v) 2-methyl-2,4-pentanediol (MPD), vapour-diffused at room temperature against a reservoir of 20% MPD. The crystals obtained were prismatic rods measuring 0.25 mm x 0.3 mm x 0.8 mm.

(b) Structure resolution and refinement

X-ray data were collected on an Enraf Nonius CAD4 diffractometer at 20°C. Over 6000 independent reflections were collected in the resolution range 25 to 1.9 Å (1 Å = 0.1 nm), and the intensities were corrected for Lorentz and polarization effects and time-dependent decay. The accuracy of the data was verified by collecting about 200 symmetry-related reflections, and no significant discrepancy was found (Rsym = 3.9%). Among the collected reflections, 2333 were considered to be observable at the 2σ(F) level. Data collection statistics are given in Table 1. Unit cell parameters are a = b = 45.2 Å, c = 65.0 Å, α = β = 90°, γ = 120°, space group P3₂21. The structure was solved by a rotation/translation-search method using the multidimensional real-space search programs ULTIMA-LOOP (Rabinovich & Shakked, 1984). The trial model was an A-DNA structure constructed from fibre diffraction data. The rotation search was performed for Eulerian angles φ = 0-180°, ψ = 0-360° and θ = 0-90° in 5 deg. steps, simultaneously with a translation search in the range x = 0 to 1, y = 0 to 1 and z = 0 to 0.5 in 0.02 unit cell fractional steps. Initially only the low resolution reflections in the range 20 to 10 Å were used in the search.

© 1991 Academic Press Limited
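The unit cell parameters quoted above fix the cell volume, and together with the 6 molecules per unit cell stated in the Figure 6 caption they give a rough volume per base-pair. The following is a minimal sketch of that arithmetic; the function name is hypothetical and the calculation is illustrative, not part of the original analysis.

```python
import math


def hexagonal_cell_volume(a, c):
    """Volume of a cell with a = b, alpha = beta = 90 deg, gamma = 120 deg."""
    return a * a * c * math.sin(math.radians(120.0))


# Cell edges in Angstrom, as quoted in the text.
volume = hexagonal_cell_volume(45.2, 65.0)

# Assumptions for the estimate: 6 duplexes per cell (Figure 6 caption)
# and 12 base-pairs per duplex (dodecamer).
molecules_per_cell = 6
base_pairs_per_molecule = 12
volume_per_base_pair = volume / (molecules_per_cell * base_pairs_per_molecule)

print(f"cell volume: {volume:.0f} A^3")
print(f"volume per base-pair: {volume_per_base_pair:.0f} A^3")
```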


N. Verdaguer et al.

Table 1
Data collection statistics

Resolution range (Å)   Number of possible reflections   Number of observed reflections   % Observed
∞-5.0                  398                              358                              89.9
5.0-3.5                696                              571                              82.0
3.5-2.5                1789                             739                              41.0
2.5-2.0                2624                             580                              22.0
2.0-1.9                889                              125                              14.1
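The "% Observed" column of Table 1 is the ratio of observed to possible reflections in each resolution shell. A short sketch of that calculation follows, using the counts as they appear in the table; the variable and function names are illustrative only.

```python
# (shell label, possible reflections, observed reflections), read from Table 1.
shells = [
    ("inf-5.0", 398, 358),
    ("5.0-3.5", 696, 571),
    ("3.5-2.5", 1789, 739),
    ("2.5-2.0", 2624, 580),
    ("2.0-1.9", 889, 125),
]


def percent_observed(possible, observed):
    """Percentage of the possible reflections that were observed."""
    return 100.0 * observed / possible


for label, possible, observed in shells:
    print(f"{label:>8}  {percent_observed(possible, observed):5.1f}%")
```

The sharp drop beyond 3.5 Å is why the refinement described in the text proceeds in steps of increasing resolution.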

The best R-factors were 41 to 45% for P3₁21 and 43 to 46% for P3₂21; thus it was not possible to discriminate between the two space groups. The resolution was subsequently increased in steps from the 20 to 9 Å range (73 reflections) up to the 16 to 6 Å range (222 reflections). At this point the R-factor was 50% for P3₁21 and 46% for P3₂21. Further rigid-body refinement was continued with space group P3₂21. The final rigid-body refinement was done for the resolution shell 10 to 4 Å (638 reflections), giving an R-factor of 48%. This model was then refined using the Hendrickson-Konnert (1981) restrained least-squares procedure in steps of increasing resolution to include data to 2.5 Å resolution, with an R-factor of 27%. At this stage, the thermal parameters were refined and the solvent molecules located from the difference Fourier maps were gradually included using the program FRODO (
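The R-factors quoted in this section measure the disagreement between observed and calculated structure-factor amplitudes. The sketch below uses the conventional definition R = Σ| |Fo| - |Fc| | / Σ|Fo|; it assumes the two sets are already on a common scale, and the amplitudes in the example are invented for illustration, not data from this structure.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor for amplitudes on a common scale."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical amplitudes, used only to exercise the function.
example_obs = [100.0, 50.0, 25.0]
example_calc = [90.0, 55.0, 20.0]
example_r = r_factor(example_obs, example_calc)
print(f"R = {100.0 * example_r:.1f}%")
```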
3. Results

(a) General description of the structure

Several views of the structure are shown in Figure 1. A cylindrical projection, including the numbering of the bases and other features, is presented in Figure 2. The best helical axis was determined using the program HELIX (R. E. Dickerson, personal communication) applied to the whole molecule. Practically the same axis was found when either end or the centre of the molecule was used, so that no indications of bending or kinks were present. The main structural parameters are compared with other structures in Tables 2 and 3. The most significant feature that stands out upon this comparison is that the dodecamer has average helical parameters which are very close to those calculated for fibres (Chandrasekaran et al., 1989), as found in a decamer (Frederick et al., 1989). It appears that longer sequences of DNA favour a closer fit to the structure found in fibres of high molecular weight DNA. The main difference between our structure and the fibre parameters is found in the propeller twist of the base-pairs, which both in our case and in the decamer (Frederick et al., 1989) is about 5° larger. We also find that the average rise per residue (2.39 Å) is smaller than in A-DNA fibres (2.55 Å), whereas in the octamers the rise per residue is larger and has values intermediate between those of A and B DNAs.

Another interesting aspect of this structure is that it shows that oligomer length alone does not determine the form adopted by oligonucleotides when crystallized. Thus, while all dodecanucleotides solved up to now prefer the B form upon crystallization, the structure we have found is an exception in which the A form is adopted, confirming that C·G-rich sequences favour the A form, as found by Arnott & Selsing (1974). This tendency can be suppressed by certain counterions, such as arginine (Campos & Subirana, 1987). On the other hand, most octanucleotides have been crystallized in the A form, as reviewed by Kennard & Hunter (1989), even though their composition ranges between 50 and 100% G·C pairs. Only some octanucleotides with an alternating sequence have been found in the Z form (Fujii et al., 1985). It appears that octanucleotides when crystallized never adopt the B form, independently of their sequence. These considerations, together with the finding that the helical parameters (Table 2) are closer to those of fibres, tend to demonstrate that longer oligonucleotides give better information on the structure of high molecular weight DNA.

Additional features of interest may be determined from further inspection of Tables 2 and 3. It is apparent, for example, that in spite of the intrinsic molecular symmetry of the dodecamer, equivalent base-pairs show significant differences. This is obvious, for example, in the propeller twist at the two ends of the molecule. Further examples will be discussed below. These differences clearly demonstrate the versatility of the DNA molecule to adapt itself to the environment, in this case to the crystal packing constraints.

(b) Conformational parameters

The backbone torsion angles are given in Table 3. They are compared with two other representative structures. All average values are similar, except that some of the corresponding standard deviations


Figure 1. Stereoscopic views of the dodecamer d(CCCCCGCGGGGG) with an associated spermine molecule that is placed approximately in the centre. The 2 lateral views shown are rotated by 90°. One strand is shown with open bonds and the other strand and the spermine molecule are shown with filled bonds. The hydrogen bonds of the spermine molecule with the dodecamer are indicated as broken lines in the middle. The G24·C1 base-pair is at the top.


Figure 2. Cylindrical projection of the dodecamer structure. The vertical axis represents the helical axis and the horizontal axis is the viewing angle seen from within the centre of the helix. The numbering of the bases is shown. Some representative distances between the phosphate groups at the opening of the deep groove are indicated. They correspond to the phosphate-phosphate distance minus the phosphate van der Waals' radius (5.8 Å). The points where the α and γ main-chain torsion angles are in the trans conformation are represented by the letter t. The shadowed areas represent approximately regions of the shallow groove on which the terminal base-pair of 2 neighbouring molecules interact through hydrogen bonds and van der Waals' interactions. The spermine molecule is schematically represented by a continuous heavy line and 4 filled circles correspond to nitrogen atoms. The arrows indicate the hydrogen bonds formed with the dodecamer. In this Figure the spermine molecule is only shown schematically since it runs very close to the axis of the dodecamer and its accurate projection gives a very misleading view.

are larger in our case. Large standard deviations are also found in the B-DNA dodecamer (Fratini et al., 1982). The variations found might be related to the lack of crystallographic 2-fold symmetry of our structure. An inspection of Table 3 reveals some features of interest. (1) The angle δ, related to the sugar pucker, has a low standard deviation and an average value that is practically identical to that found in all the A form crystals studied to date. This average value of δ may be considered as diagnostic of the A form and corresponds to a sugar pucker in the C-3'-endo region. (2) The glycosidic angle χ has an average value smaller (χav = -153°) in the first strand than in the second strand (χav = -173°), another feature that substantiates the lack of symmetry of the dodecamer. (3) Some of the conformational angles show weak correlations (ε-χ; ε-ζ), which are probably related to an optimization of the intramolecular interactions. (4) As found in other A form structures, the α and γ angles are usually found in the g-/g+ region, but in a few cases they are t/t, a feature that has an influence on the helical parameters, which will be discussed below.

The helical parameters, given in Table 2, show a considerable local variation, except for the inclination of the base-pairs, which is rather constant. The changes in propeller twist and in rise per residue, although common in other A form crystals (Haran et al., 1987), are not easy to interpret, except that the high propeller twist of the first base-pair is due to its interaction with a neighbouring molecule in the crystal. The large value of the twist angle at the end of the molecule had been found (Haran et al., 1987) in another nucleotide, d(CCCCGGGG), with a similar terminal sequence. The variation of the main helical parameters is presented in Figure 3. It is apparent that the local twist has a clear inverse correlation with the slide of each base-pair. On the other hand, the rise per residue is not correlated with either of the other two variables represented. In fact the rise per residue as calculated by the NEWHELIX program (R. E. Dickerson, personal communication) is influenced by the propeller twist of the neighbouring bases and cannot be easily correlated with the other helical parameters. The inverse correlation found between slide and twist indicates that the base-pairs adjust their position so that a large local twist ω corresponds to a small value of the slide. Thus, the local variations in structure are corrected, so that the overlap of consecutive bases does not change much even when the twist (ω) and slide have quite different values. This


Table 2
Helical parameters of the dodecamer

Base-pair         Rise (Å)   Twist ω (deg.)   ω' (deg.)   Tilt (deg.)   Slide (Å)   Propeller twist (deg.)   References
C1-G24            2.17       40.9             34.7        14.2          1.62        42.6
C2-G23            2.68       29.1             29.5        16.9          1.71        16.6
C3-G22            1.98       31.7             32.2        19.6          1.79        17.8
C4-G21            2.21       35.9             30.3        22.0          1.51        12.4
C5-G20            2.71       28.7             31.1        18.3          1.97        11.3
G6-C19            2.66       25.7             30.9        19.6          2.37        12.2
C7-G18            2.50       37.6             32.5        20.2          1.41        15.9
G8-C17            2.16       29.1             30.8        19.5          1.94        16.4
G9-C16            2.19       29.7             30.4        19.8          2.19        9.3
G10-C15           2.72       35.3             29.5        17.9          1.69        23.7
G11-C14           2.40       35.0             30.3        18.0          1.60        11.4
G12-C13                                                   15.7                      18.8

Mean value        2.39       32.6             31.1        19.2†         1.75        14.6†
S.D.              0.25       4.4              1.5         1.4           0.35        4.0

Octamers          2.9-3.3    30.9-33.5                    7.0-13.3                  6+1P.i                   a
Decamer           2.64       33.0                         18.2                      15.2                     b
Fibre (A form)    2.55       32.7                         22.6                      10.5                     c
Fibre (B form)    3.4        36.0                         2.0                       13.0                     d

Rise, twist ω, ω' and slide refer to the step between a base-pair and the following one and are listed on the row of the first base-pair of each step. Helical parameters were calculated with the program NEWHELIX (distributed by R. E. Dickerson). The twist angle ω is given by the program and measures the local twist of a base-pair against its neighbour. The twist angle ω' gives the twist angle of the C-1' atoms measured from the helix axis. It is the average of the T(C-1') values given by the NEWHELIX program for both strands. a, Wang et al. (1982); Shakked et al. (1983); McCall et al. (1985); Heinemann et al. (1987); Haran et al. (1987); Rabinovich et al. (1988); Lauble et al. (1988); Jain et al. (1989); Hunter et al. (1989). b, Frederick et al. (1989). c, Chandrasekaran et al. (1989). d, Park et al. (1987). S.D., standard deviation.
† The 2 terminal base-pairs have not been included.
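The inverse correlation between twist and slide described in the text, and the relation between mean twist and residues per turn, can be checked directly from the step values of Table 2. The sketch below takes the eleven twist and slide values as read from the extracted table (an assumption, since the source table is damaged) and computes a Pearson coefficient; it is an illustration, not the authors' analysis.

```python
from math import sqrt
from statistics import mean, pstdev

# Step values (C1/C2 step first, G11/G12 step last), as read from Table 2.
twist = [40.9, 29.1, 31.7, 35.9, 28.7, 25.7, 37.6, 29.1, 29.7, 35.3, 35.0]
slide = [1.62, 1.71, 1.79, 1.51, 1.97, 2.37, 1.41, 1.94, 2.19, 1.69, 1.60]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


mean_twist = mean(twist)
sd_twist = pstdev(twist)               # population S.D., which matches the table
r_twist_slide = pearson(twist, slide)
residues_per_turn = 360.0 / mean_twist

print(f"mean twist = {mean_twist:.1f} deg, S.D. = {sd_twist:.1f} deg")
print(f"twist/slide correlation r = {r_twist_slide:.2f}")
print(f"residues per turn = {residues_per_turn:.1f}")
```

A clearly negative coefficient is what the text means by an inverse correlation: steps with a large local twist have a small slide.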

Table 3 Backbone torsion angles a

P

- 109

(4 ( ‘5 (:6 (“7 (3 c:9

(iI0

c:I1 (:I:! (‘13 (‘14 (‘15 (‘16 (‘Ii (:1x (‘19 t:20 G21 (:22 (423 (:‘4 Mean d(A( ?$1$,>,? ,C(,(,O( (,(,T)t Fi bresf

- 223 -215 -178 -177 -190 -170 -149 -212 -175 -153 -173

114 88 75 84 64 93 84 65 91 74 82 80 96 79 74 83 74 80 72 81 64 75 76 79

- 178 (22) 166 (6) 175

53 (14) 51 (14) 42

80 (8) 79

-80 -58 -64 -76 -177 -78 -74 -59 -86 -184 -94

-169 -198 -181 -170 -209 -166 - 185 -167 - 150 -132 -177

-60 -76 -84 -97 -37 -92 -172 -61 -69 -199 -103 -75 (16) -65 (23) - .52

6

39 63 52 67 160 48 40 30 47 151 48 158 67 60 67 76 33 76 130 59 4.5 144 35

(‘1

(‘2 (“3

*J

79 (8)

&

;

-175 -145 -160 -174 - 154 -176 -171 -198 -198 -196 -157

-89

-140 -115 - 161 -166 -162 -149 -218 -113 -168 -166 - 239

- 1oti -101 -65 -61 - 105 -77 -48 -79 -54 - 80 -26

- 168 (28) -157 (8) -148

-71 (20) -66 (11) -75

-78 - 60 -95 -48 -52 -56 -47 -35 -84 -66

x -119 - 143 - 164 - 147 -171 - 162 -155 -145 -151 - 134 - 180 -168 -155 -192 -198 -172 - PO2 -152 -152 -175 -173 -167 -189 - 154 - -163 (26) - 159 (9) -157

Standard deviations are given in parentheses. The α and γ angles of G6, G11, G20, G23 and the γ angle of C1 and C13 are in the trans region and have not been included in the averages. The ζ angle of G23 has also been excluded, since it has a very low value (-25°). † Frederick et al. (1989). ‡ Arnott & Selsing (1974).
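The footnote separates torsion angles in the trans region from those in the usual gauche regions. One common convention, assumed here, assigns g+ to 0-120°, t to 120-240° and g- to 240-360°. The sketch applies it to an eleven-value run from the extracted Table 3 that is taken to be the α angles of residues C2 to G12; that column assignment is an assumption, chosen because it reproduces the trans residues named in the footnote.

```python
def torsion_region(angle_deg):
    """Classify a torsion angle as 'g+', 't' or 'g-' (120-degree sectors)."""
    a = angle_deg % 360.0  # map e.g. -80 to 280
    if a < 120.0:
        return "g+"
    if a < 240.0:
        return "t"
    return "g-"


# Assumed alpha angles (deg.) for the first strand, C2 to G12.
alpha_strand1 = {
    "C2": -80, "C3": -58, "C4": -64, "C5": -76, "G6": -177, "C7": -78,
    "G8": -74, "G9": -59, "G10": -86, "G11": -184, "G12": -94,
}

trans_residues = [res for res, a in alpha_strand1.items()
                  if torsion_region(a) == "t"]
print("alpha in trans:", trans_residues)
```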


Figure 3. Plot of (a) rise, (b) slide and (c) twist of the dodecamer. In order to show the inverse correlation between the last 2 variables, the ordinate of slide is presented in an inverted sense.

is apparent in Figure 4, when the overlap of the two CpC steps is compared: the bases have a very similar relative position, although the main-chain is considerably modified. In order to describe the conservation of the helical structure in spite of the local distortions due to the changes in twist (ω) and slide, it is useful to introduce a new twist parameter (ω'), which measures the angle rotated by consecutive base-pairs as measured from the helical axis. It is determined as the average of the rotation of the two C-1' atoms of the deoxyribose ring, which are attached to the base-pairs. As is apparent from Table 2, this parameter is rather constant: the standard deviation for ω' is 1.5°, whereas it is 4.4° for ω. Thus, ω determines local changes in the arrangement of a base-pair in relation to its neighbours, whereas the new parameter ω' will detect overall local distortions of helical geometry. It is especially useful in A-DNA, in which the helical axis is at a considerable distance from the bases.

(c) Spermine binding and CpG steps

The chemical symmetry of the dodecamer is not fully reflected in the structure, as becomes obvious when the geometry of the C5-G6 and C7-G8 steps, which should be identical, is compared and the binding of spermine analysed. The twist angles (ω) of both steps are quite different (28.7° and 37.6°) and much larger than commonly found in other nucleotides, in which this step always has low values (in the range 16° to 25.1°). The average angles are given in Table 4. The small value found in some octamers has been attributed to a tendency to preserve some stacking of the two purine residues, but as shown in Figure 4 a similar degree of stacking can be obtained with larger angles, as found here. Furthermore, we should point out that the low twist angle of the CpG step is not found in oligonucleotides crystallized in the B form, and in fact it shows large variations in different structures (Yanagi et al., 1991).

The two CpG steps also differ in the conformation of the phosphodiester chain. The C5-G6 step has a trans conformation in the α-γ angles, whereas the C7-G8 step, which is symmetrical to C5-G6 from the molecular point of view, maintains the standard g-/g+ conformation for these two bonds. It is interesting to note that a similar change in the conformation of these bonds is observed in some octamers when they are crystallized in either the tetragonal or the hexagonal system (Jain & Sundaralingam, 1989; Shakked et al., 1989).

The anomaly we have just described might be related to the way a spermine molecule is bound, as shown in Figures 1 and 2. A closer view of the binding region is given in Figure 5. Whereas the two nitrogen atoms in one half of the spermine molecule are bound inside the deep groove, the two other nitrogen atoms are bound to two phosphate groups across the borders of the deep groove of the DNA molecule. The spermine molecule is thus placed in a very asymmetric way. It is not clear why it does not adopt a more symmetrical position. This could be easily achieved by allowing the two central amino groups of spermine to interact with the bases and the two end groups to interact with the phosphate groups, or vice versa. In either case very similar interactions of the dodecamer with the spermine molecule would be found. We did investigate such possibilities in the electron density maps, but they were incompatible with the electron density distribution. In any case the interactions with the dodecamer occur at points which are almost symmetric: G6 and G18, for example, are corresponding groups of the molecule. However, the symmetry is broken because the hydrogen bonds with the phosphate groups are not equivalent and additional interactions of one spermine nitrogen are


Figure 4. Stereoscopic stacking diagrams of the base-pairs in the central part of the molecule. The view is perpendicular to the mean plane through the upper base-pair in each case. The upper and lower frames are practically identical among themselves and with the other CpC steps of the molecule (not shown). The central frame corresponds to the unique GpC step of the dodecamer and the other 2 frames represent the 2 CpG steps. It can be seen that the base stacking is rather similar in the 2 latter cases, whereas the main-chain conformational angles α and γ are trans in the C5-G6 step and gauche in the C7-G8 step, as discussed in the text.


Figure 5. Omit difference map (Fo - Fc) showing the electron density distribution of the spermine molecule. It was calculated after removal of the spermine molecule from the structure and 5 cycles of refinement. The bases which interact with spermine are also shown. Hydrogen bonds are indicated by dotted lines.

found with C7 and C17, which do not occur in the other equivalent nitrogen atom. It is most likely that the different conformations of the two CpG steps are related to the way the spermine molecule is bound to the dodecamer. This behaviour might be interpreted in two ways: the spermine molecule recognizes differences in the conformation of the two CpG steps and binds asymmetrically; or, alternatively, upon binding it induces changes in the conformation of these two base steps. In any case it is clear that we are faced with a most simple model of a DNA interaction in which the conformational features of both interacting substances adapt themselves to each other.

(d) Influence of sequence on conformation

The twist angle ω between consecutive base-pairs (listed in Table 2) is adequate for an overall comparison, since it corresponds to the number of residues per helical pitch. A comparison of this parameter as a function of sequence is given in Table 4, both for this and other structures. It is found that the CpC sequence usually maintains a value very close to 32.7°, which corresponds to 11 base-pairs per turn, as found in fibres of A form DNA. Thus, stretches of cytosine residues will show a very regular structure with little conformational variation. On the other hand, when the CpC sequence is found at the 5' end of the molecule, a much larger twist angle is found. This may be a peculiarity of the crystalline state, since all oligonucleotides which have been crystallized in the A form interact with their terminal base-pairs with the shallow groove of neighbouring molecules, as discussed in detail below. A striking deviation from the structural regularity of the CpC sequence is found in the case of the GpCpCpG sequence, which has a very large twist angle, both in an octamer (Wang et al., 1982) and in the decamer (Frederick et al., 1989). It appears that this is an intrinsic feature of this sequence, the reasons for which deserve further analysis. Finally, the central alternating sequence, which has a potential to adopt the Z-DNA conformation, is in fact found in the A conformation. Its parameters deviate from those found in other crystal structures, as discussed in section (c), above, showing that this sequence has a potential to vary significantly in conformation. The differences found may be due in part to the binding of the spermine molecule we discussed above, but we should also note that it is the first case in which this tetranucleotide alternating sequence (CpGpCpG) has been crystallized in the A form, so that an intrinsic tendency to adopt the conformation we observed cannot be excluded.

Table 4
Average twist angles of the dodecamer compared with A-DNA octamer sequences

Sequence: CpCpCpS, SpCpCpC, CEpCpC, CpCpCE, GpCpCpG, CpG, GpC
Octamers, average (ω°): 32.4 (1.5), 38.9 (1.4), 30.7 (2.7), 44.2/44.2, 22.2 (3.2), 31.0 (2.6)
Octamers, n: 20, 4, 6, 2
C5GCG5, average (ω°): 31.8 (2.8), 35.0/40.9, 28.7/37.6, 25.7
C5GCG5, n: 6, 2, 2, 1

The values in the first column are the average of those previously published for octamers crystallized in the A form (references given in Table 2). They correspond to the twist angle of the base step in bold type of the sequences given. S represents either C or G and CE represents a cytosine at the end of the molecule. n is the number of values available. When there are only 2 values, no average has been taken. The complementary bases are not indicated.

Figure 6. Skeletal packing diagram in the crystal lattice. Twenty-four molecules projected onto the ab plane are shown. Each unit cell contains 6 molecules. The molecule at the bottom of the diagram is shown at the left, with the numbers corresponding to the terminal base-pairs.

(e) Packing

The way the molecules are packed is illustrated in Figures 2, 6 and 7. Each molecule has its two


terminal base-pairs tucked into the shallow groove of two neighbouring molecules. In turn, two neighbouring molecules have one of their terminal base-pairs tucked into its shallow groove, so that each dodecamer molecule has strong direct interactions with four neighbouring molecules. About four base-pairs of the shallow groove are covered by neighbouring molecules, as approximately indicated by the shadowed areas in Figure 2. These interactions involve hydrogen bonds and van der Waals' contacts with the base-pairs and main-chain oxygen atoms, which occur in approximately symmetric regions of the molecule. The C1·G24 terminal base-pair forms hydrogen bonds with G8, G9 and G18, whereas the G12·C13 base-pair forms such bonds with G20 and G21. The hydrogen bonds involved are listed in Table 5. A complex set of interactions is apparent, so that the C1 base, for example, is not only hydrogen-bonded with its corresponding base G24, but also with three other guanine residues (G8, G9 and G18). van der Waals' contacts between C1 and the G8 deoxyribose ring are also extensive. Due to all these interactions, the C1 base is significantly displaced from its standard position, as is apparent in Figures 1 and 2 and Tables 2 and 3. A similar situation is found at the other end of the dodecamer, but the interactions and distortions are not so pronounced.

Due to the pronounced distortion of the C1 base and a short contact between N-4 and O-5' of residue G9 of a symmetry-related molecule, we decided to refine the final co-ordinates again using the program X-PLOR (Brünger et al., 1987), which includes intermolecular interactions in the refinement. No significant change in the structure was found and, in particular, the C1 base maintained its interactions and peculiar conformation, although the short contacts were eliminated.

The appearance of the molecules packed in the unit cell is shown in Figures 6 and 7. Several channels of different sizes are apparent, in which solvent molecules and counterions are located. The amount of solvent found in the unit cell is similar to that found in other A form crystals, but in most structures larger channels of a single type are found. Here the unit cell has a much more complicated structure. Each molecule lies with its helical axis forming an angle of 54° with the c axis. The crystals contain three types of solvent channels. Channels A and C (shown in Fig. 6) are very narrow and are covered by phosphate groups, so that most counterions are probably located inside them. However, inspection of the electron density within them did not show any evidence of localized counterions. Only some water molecules were apparent. Type B channels are wider and surrounded by the deep groove of DNA. The spermine molecules we located are found in the latter channels.

Finally, we should comment that the space group does not use the chemical symmetry of the dodecamer, as also observed for octamers which crystallize in the hexagonal system (McCall et al., 1985; Shakked et al., 1983; Lauble et al., 1988). At first sight packing requirements might be the determining factor for the loss of symmetry of the molecule upon crystallization. However, in this case an almost identical packing of symmetrical dodecamers could be achieved using space group P6,22, which would utilize the 2-fold molecular axis, but with 6, hexagonal axes at the four corners of the unit cell shown in Figure 6. Thus, the lack of symmetry should originate from other features of the structure, such as intermolecular interactions or counterion binding. In summary, the situation found in crystals indicates that the sequence, the oligomer length and the crystallization solvent may all have a strong influence on the DNA form adopted by oligonucleotides upon crystallization.

Figure 7. Packing interactions of the dodecamer. The Figure shows a van der Waals' stereo view of one molecule (1) that is identical to the isolated molecule shown in Fig. 6. The dodecamer interacts through its 2 ends with the shallow grooves of 2 neighbouring molecules (2 and 3). The phosphorus atoms are shaded. Two additional molecules (not shown) are tucked into the shallow groove of molecule 1, so that in fact every dodecamer is in close contact with another 4 molecules. An additional molecule occupies the space between 2 and 3, which are not in contact. Even though not all molecules are represented, it can be appreciated that the central solvent channel (type C in Fig. 6) is very narrow.

(f) Hydration

As stated in Materials and Methods, 108 water molecules have been located. Although the resolution of the diffraction pattern does not allow a detailed analysis, some general features are apparent from an inspection of their co-ordinates. A significant number of water molecules are located in the wider channels A and B shown in Figure 6, but only three molecules per asymmetric unit are found in the narrow channel C. As shown in Table 5, only a relatively small number of water molecules (36) are found to be in contact with the dodecamer and none with spermine. Among them only nine are found to form intramolecular bridges, an observation that does not appear to support the "economy of hydration" hypothesis of Saenger et al. (1986). However, according to the calculations of Vovelle & Goodfellow (1990), about 60 bridging sites should be present. Since we have found so few water molecules, it is possible that either they are considerably disordered in this case or that upon increasing the resolution additional water molecules might be located.

An inspection of the water molecules in contact with the dodecamer, given in Table 5, shows two striking features. First, the region from G18 to G24 appears to be dehydrated, since it is associated with only three water molecules. Although this region is in contact with another oligonucleotide unit in the crystal, it is not clear why we have found so few water molecules. In fact the equivalent region G6-G12 is considerably more hydrated. Another striking feature is the large hydration of C16 and C17. It is possible that some of the latter water molecules correspond to counterions. In fact two of them are so close to each other that it is difficult to establish at the present resolution whether they should be replaced by a bulkier counterion.

4. Discussion

(a) Factors which influence DNA form

As noted by Kennard & Hunter (1989), the crystallization of oligonucleotides is always achieved under similar ionic conditions, with about 30% 2-methyl-2,4-pentanediol (MPD) in the final equilibrated drops used. Depending on the oligonucleotide, either A, B or Z forms are obtained. In one case, the A and B forms were found in the same crystal (Doucet et al., 1989). All these observations indicate that the energy difference between the various forms of DNA is very small in the MPD-containing solvent used for the crystallization of oligonucleotides. The size and sequence may influence the form of DNA obtained, but it appears that end effects have a large influence too. In all crystals of the A form an interaction of the terminal base-pairs with the shallow groove of neighbouring molecules is found, whereas in the B form crystals the oligonucleotides interact either end-to-end or side-by-side. This peculiar end to shallow groove interaction appears to be related to the A conformation. Under these packing conditions, large solvent channels are always found in A form oligonucleotide crystals, as shown for example in Figure 6. Thus, crystals of the A form contain more solvent than those of the B form (Kennard & Hunter, 1989), whereas in solutions of high molecular weight DNA the A form only appears upon dehydration of the B form. In this context it should be noted that oligonucleotides crystallized in the B form very often start with the sequence CpG. This starting sequence has not been found in oligonucleotides which crystallize in the A form, so it may prevent the typical end-shallow groove interaction just mentioned. It would be of interest to crystallize an octamer with a starting CpG sequence in order to substantiate this interpretation.

In this discussion we consider that the A and B forms can be clearly distinguished. However, Heinemann et al. (1987) noted that in octamer


Table 5. Intermolecular contacts and hydration

Residue C1

(‘2

( ‘3 (‘4

water O-5’ N-4 N-4 N-4 o-2 G-2

34 31-G8

O-l P

26

Neighbour

Residue G24

3.4 3.7 29 29

G9 (O-5’) G9 (O-4’) G9 (N-2) G18 (N-2)

25 3.3 3.1 3.2 37

G9 (N-2) G8 (N-3) G18 (N-3) G8 (N-2)

‘32

N-2

3.0 (~12 (N-2)

G21

N-2 N-2 x-3

3.1 (~12 (N-3) 3.0 G12 (N-2) 2.5 G12 (N-2)

36 2.6 3.0.C4 3.5.G6 29Gl2

G20

N-2

3.1 Cl3 (O-2)

o-2P o-2P o-JP N-7 N-i

30 25C5 2.9 32

Cl9

o-1P

o-2P N-4

3.8

G18

N-2 N-3 O-6 N-7

Cl7

o-JP o-1P o-JP o-2P o-2P o-2P N-4 N-4

2.4 2.5 3.3 2.4 33 3.5 29Gl8

Cl6

o-JP o-JP O-4’ O-4’ o-2

3.4 3.5 32 32 30

Cl5

o-2P o-JP O-4’

33.Cl4 27

3.1 3.1 -C5

(‘5

O-1P o-2P O-5’ O-3’ o-4

G6

(‘7

(:X

3.6

2.7 S (N-5) 2.8 S (N-l)

29Cl 31 GL4 (N-2) 37 G24 (N-3)

(:J(J

O-3’ O-3’ N-2 N-2 N-3

Neighbour

G23 2.8 8 (N-10)

G-IF o-2P O-3’

t:9

Water

o-2P O-5’ O-4’ N-2 N-2 N-2

3.4

O-6

2.6-Gl l/Cl4

3.4 37 2.9 33

Cl (N-4) Cl (N-4) Cl (O-2) G24 (O-3’)

2.9 Cl (O-2) 3.2 G24 (N-2) 2.8 S (X-l) 34-Cl7

29 s (N-l)

33 3.2 S ‘(N-14)

c:11

O-3’ O-4’ O-6

29Gl2 31 25C14/GlO

Cl4

o-JP o-2P O-5’ N-4

3.0 3.7~Cl3 3.2~Cl5 35GlO/Gll

(il2

0-2P O-4’ O-3’ s-2 N-2 N-3

3.3-Gll 2.5~C5

Cl3

O-3’ O-2

35.Cl4

30 2.5 3.0 3.1

G22 G21 G21 G21

34 G20 (S-2)

(N-2) (N-3) (N-2j (N-2)

The residues are listed as consecutive base-pairs in the structure. In the first column are the distances to the water molecules which were located in the electron density map and that interact with the dodecamer. A maximum distance of 3.7 Å was allowed. When a water molecule was also making contact with other bases, its sequence identification is also given. In the second column are the distances to neighbouring molecules (either another dodecamer or spermine).
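The 3.7 Å selection rule stated in the legend can be illustrated with a minimal sketch. The coordinates, the atom and water labels, and the function name `contacts_within` are hypothetical and only show how such a cutoff might be applied; they are not taken from the deposited structure.

```python
import math

CUTOFF = 3.7  # maximum contact distance in angstroms, as stated in the Table 5 legend


def contacts_within(dna_atoms, water_atoms, cutoff=CUTOFF):
    """Return (dna_label, water_label, distance) for every pair within the cutoff."""
    found = []
    for d_label, d_xyz in dna_atoms.items():
        for w_label, w_xyz in water_atoms.items():
            dist = math.dist(d_xyz, w_xyz)  # Euclidean distance (Python 3.8+)
            if dist <= cutoff:
                found.append((d_label, w_label, round(dist, 2)))
    return found


# Hypothetical coordinates in angstroms, for illustration only.
dna = {"C16 O-1P": (0.0, 0.0, 0.0), "C17 N-4": (10.0, 0.0, 0.0)}
waters = {"W1": (3.0, 0.0, 0.0), "W2": (10.0, 4.0, 0.0)}

for row in contacts_within(dna, waters):
    print(row)
```

With these made-up coordinates only the pair at 3.0 Å is reported; the pair at 4.0 Å falls outside the cutoff.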

crystals there was a correlated change in base-pair tilt and rise per base-pair which may approach the typical values of B form DNA. Thus, Kennard & Hunter (1989) considered that a continuum of right-handed DNA conformations may occur between the A and B forms. However, if the main parameters to define the A conformation are the average twist per residue, torsion angle δ (related to the sugar pucker) and glycosidic angle χ, then all the octamers should be considered to be in the A form, rather than A-B intermediates, as indicated by the average values given in Tables 2 and 3, which are practically identical for all these structures. The average rise per residue and tilt of the base-pairs would be secondary parameters, which in the A form may vary in a correlated fashion, at least in octamer crystals.

(b) The local variability of DNA conformation

It appears that packing conditions have a strong influence on the average parameters of the helix, since the same octamer crystallized in the hexagonal system has values of rise and tilt which differ significantly from those found in the tetragonal system (Jain & Sundaralingam, 1989; Shakked et al., 1989). The unit cell environment thus has a strong influence on the conformation found in oligonucleotides. In fact, Heinemann (1991) has shown that there is a simple relation between the packing density and the global helical structure of short DNA duplexes. On the other hand it is striking to note that the phosphodiester chain torsion angles, the glycosidic angle (all given in Table 3) and the average twist angle (given in Table 2), in spite of large local variations, have average values that are almost identical to those found in fibres. Individual twist angles may vary from 16° to 44°, but the averages of every individual structure all fall in the range 30° to 33.5°, which corresponds to 12 to 10.7 base-pairs per turn. It appears that interatomic forces stabilize the overall DNA conformation, so that when there is a strong local variation in structure, it is compensated elsewhere in the molecule by variations with the opposite sign. In this sense it is interesting to note that those molecules which have the most extreme twist angles present a strict alternation of large-small values, as found for example in d(GGCCGGCC) (Wang et al., 1982), d(CCCCGGGG) (Haran et al., 1987) and d(CTCTAGAG) (Hunter et al., 1989). Such structural compensations are a natural consequence of the tendency to maintain optimal stacking and backbone interactions. The individual parameters used to characterize the local variations in the structure are very useful for this purpose, but obscure the tendency of the oligonucleotides to optimize the intermolecular forces.
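The correspondence between average twist and base-pairs per turn quoted above is simply 360° divided by the twist angle. A minimal sketch of that arithmetic follows; the function name `base_pairs_per_turn` is introduced here for illustration only.

```python
def base_pairs_per_turn(twist_deg):
    """Number of base-pairs in one 360-degree helical turn for a given average twist."""
    if twist_deg <= 0:
        raise ValueError("twist must be positive")
    return 360.0 / twist_deg


# Limits of the range of average twist values quoted in the text.
for twist in (30.0, 33.5):
    print(f"{twist:.1f} deg -> {base_pairs_per_turn(twist):.1f} base-pairs per turn")
```

For 30° and 33.5° this gives 12.0 and 10.7 base-pairs per turn, matching the range stated in the text.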
The parameter ω′ that we introduced in Table 2 gives an averaged view of the local distortions and thus shows very little fluctuation along the dodecamer sequence.

(c) Comparison with the B form

It is interesting to compare the behaviour of the parameters listed in Tables 2 and 3, and represented in Figure 3, with those found in the B form. In the latter case there is a strong inverse correlation between rise and twist, as shown by Yanagi et al. (1991). This is due to the fact that the base-pairs are perpendicular to the helix axis, and since the phosphodiester chain has strong constraints and cannot vary much in length, a large twist is compensated by a small rise. Changes in both variables are compensated locally by changes in other parameters of the bases (mainly buckle and roll) in order to maintain an optimal stacking of the bases close to 3.4 Å. For the same reason the changes in twist average out to 36° over short regions (3 to 4 base-pairs) of the helix. In the A form, since the bases are inclined and placed away from the helix axis, the rise parameter measured at the C-1′ atoms varies in a complex manner as a result of the local changes in orientation of the bases: twist, roll and buckle all vary locally. The regularity of the A form structure manifests itself in the constancy of the twist ω′ measured from the helix axis. The sugar pucker is also very constant in the C-3′-endo region.

(d) DNA interactions

The conformational parameters indicate that the sequence has a strong effect on them, as shown for example by the constancy of the twist angle in the CpC sequences shown in Table 4. On the other hand there are other sequences such as CpG that vary strongly in different cases, as also found in the B form (Yanagi et al., 1991). This behaviour is probably due to small differences in energy between conformations in such sequences. Thus, they may easily vary depending on the environment (solvent, counterions, local electric field in crystals, etc.). It is interesting to note that the CpG sequence has a very high tendency to mutate (Cooper & Youssoufian, 1988), a feature which might be related to the conformational variability of this dinucleotide. In general, the local conformation of DNA upon interaction will be determined not only by intramolecular forces, but also by the environment. A protein-DNA interaction cannot be considered as an interaction between rigid bodies, but should show features of an induced fit, so that the local DNA conformation may vary upon interaction. This has, in fact, been found in several complexes of DNA with proteins such as EcoRI endonuclease and in other cases discussed by McClarin et al. (1986).

This work was supported in part by grant BT117-009 of the Plan National de Biotecnologia, Spain. We are thankful to Dr A. DiGabriele for sending us co-ordinates, to Dr R. Dickerson for sending us preprints and the NEWHELIX program, and to Dr J. Belle for his help in the use of plotting programs. N.V. acknowledges a fellowship from the Ministerio de Education y Ciencia.

Atomic co-ordinates have been deposited with the Brookhaven Protein Data Bank.

References

Arnott, S. & Selsing, E. (1974). The structure of polydeoxyguanylic acid . polydeoxycytidylic acid. J. Mol. Biol. 88, 551-552.

Brunger, A. T., Kuriyan, J. & Karplus, M. (1987). Crystallographic R factor refinement by molecular dynamics. Science, 235, 458-469.

Cambillau, C. & Horjales, E. (1987). TOM: a FRODO subpackage for protein-ligand fitting with interactive energy minimization. J. Mol. Graph. 5, 174-194.

Campos, J. L. & Subirana, J. A. (1987). The complex of poly(dG) . poly(dC) with arginine: stabilization of the B-form and transition to multistranded structures. J. Biomol. Struct. Dynam. 5, 15-19.

Chandrasekaran, R., Wang, M., Tang, M.-K., He, R.-G., Puigjaner, L. C., Byler, M. A., Millane, R. P. & Arnott, S. (1989). A re-examination of the crystal structure of A-DNA using fiber diffraction data. J. Biomol. Struct. Dynam. 6, 1189-1202.

Cooper, D. N. & Youssoufian, H. (1988). The CpG dinucleotide and human genetic disease. Human Genet. 78, 151-155.

Doucet, J., Benoit, J.-P., Cruse, W. B. T., Prange, T. & Kennard, O. (1989). Coexistence of A and B-form DNA in a single crystal lattice. Nature (London), 337, 190-192.

Fratini, A. V., Kopka, M. L., Drew, H. R. & Dickerson, R. E. (1982). Reversible bending and helix geometry in a B-DNA dodecamer: CGCGAATTBrCGCG. J. Biol. Chem. 257, 14686-14707.

Frederick, C. A., Quigley, G. J., Tang, M. K., Coll, M., Van der Marel, G. A., Van Boom, J. H., Rich, A. & Wang, A. H.-J. (1989). Molecular structure of an A-DNA decamer d(ACCGGCCGGT). Eur. J. Biochem. 181, 295-307.

Fujii, S., Wang, A. H.-J., Quigley, G. J., Westerink, H., Van der Marel, G., Van Boom, J. H. & Rich, A. (1985). The octamers d(CGCGCGCG) and d(CGCATGCG) both crystallize as Z-DNA in the same hexagonal lattice. Biopolymers, 24, 243-250.

Haran, T. E., Shakked, Z., Wang, A. H.-J. & Rich, A. (1987). The crystal structure of d(CCCCGGGG). A new A-form variant with an extended backbone conformation. J. Biomol. Struct. Dynam. 5, 199-217.

Heinemann, U. (1991). A note on crystal packing and global helix structure in short A-DNA duplexes. J. Biomol. Struct. Dynam. 8, 801-811.

Heinemann, U., Lauble, H., Frank, R. & Blöcker, H. (1987). Crystal structure analysis of an A-DNA fragment at 1.8 Å resolution: d(GCCCGGGC). Nucl. Acids Res. 15, 9531-9550.

Hendrickson, W. A. & Konnert, J. (1981). In Biomolecular Structure, Conformation, Function and Evolution (Srinivasan, R., ed.), pp. 43-57, Pergamon Press, Oxford.

Hunter, W. N., Langlois, B., D’Estaintot, B. L. & Kennard, O. (1989). Structural variations in d(CTCTAGAG). Implications for protein-DNA interactions. Biochemistry, 28, 2444-2451.

Jain, S. & Sundaralingam, M. J. (1989). Effect of crystal packing environment on conformation of the DNA duplex. J. Biol. Chem. 264, 12780-12784.

Jain, S., Zon, G. & Sundaralingam, M. (1989). Base only binding of spermine in the deep groove of the A-DNA octamer d(GTGTACAC). Biochemistry, 28, 2360-2364.

Jones, T. A. (1985). Interactive computer graphics: FRODO. Methods Enzymol. 115, 157-171.

Kennard, O. & Hunter, W. N. (1989). Oligonucleotide structure: a decade of results from single crystal X-ray diffraction studies. Quart. Rev. Biophys. 22, 327-379.

Lauble, H., Frank, R., Blöcker, H. & Heinemann, U. (1988). Three-dimensional structure of d(GGGATCCC) in the crystalline state. Nucl. Acids Res. 16, 7799-7816.

McCall, M., Brown, T. & Kennard, O. (1985). The crystal structure of d(GGGGCCCC). A model for poly(dG) . poly(dC). J. Mol. Biol. 183, 385-396.

McClarin, J. A., Frederick, C. A., Wang, B.-C., Greene, P., Boyer, H. W., Grable, J. & Rosenberg, J. M. (1986). Structure of the DNA-EcoRI endonuclease recognition complex at 3 Å resolution. Science, 234, 1526-1540.

Park, H., Arnott, S., Chandrasekaran, R., Millane, R. P. & Campagnari, F. (1987). Structure of the α-form of poly(dA) . poly(dT) and related polynucleotide duplexes. J. Mol. Biol. 197, 513-523.

Rabinovich, D. & Shakked, Z. (1984). A new approach to structure determination of large molecules by multidimensional search methods. Acta Crystallogr. sect. A, 40, 195-200.

Rabinovich, D., Haran, T., Eisenstein, M. & Shakked, Z. (1988). Structures of the mismatched duplex d(GGGTGCCC) and one of its Watson-Crick analogues d(GGGCGCCC). J. Mol. Biol. 200, 151-161.

Saenger, W., Hunter, W. N. & Kennard, O. (1986). DNA conformation is determined by the economics in the hydration of phosphate groups. Nature (London), 324, 385-388.

Shakked, Z., Rabinovich, D., Kennard, O., Cruse, W. B. T., Salisbury, J. A. & Viswamitra, M. A. (1983). Sequence-dependent conformation of an A-DNA double helix. The crystal structure of the octamer d(GGTATACC). J. Mol. Biol. 166, 183-201.

Shakked, Z., Guerstein-Guzikevich, D., Eisenstein, M., Frolow, F. & Rabinovich, D. (1989). The conformation of the DNA double helix in the crystals is dependent of its environment. Nature (London), 342, 456-460.

Takusagawa, F. (1990). The crystal structure of d(GTACGTAC) at 2.25 Å resolution: are the A-DNAs always unwound approximately 10° at the C-G steps? J. Biomol. Struct. Dynam. 7, 795-809.

Vovelle, F. & Goodfellow, J. M. (1990). Sequence dependent hydration of DNA. Int. J. Biol. Macromol. 12, 369-373.

Wang, A. H.-J., Quigley, G. J., Kolpak, F. J., Crawford, J. L., Van Boom, J. H., Van der Marel, G. & Rich, A. (1979). Molecular structure of a left-handed double helical DNA fragment at atomic resolution. Nature (London), 282, 680-686.

Wang, A. H.-J., Fujii, S., Van Boom, J. H. & Rich, A. (1982). Molecular structure of the octamer d(GGCCGGCC): modified A-DNA. Proc. Nat. Acad. Sci., U.S.A. 79, 3968-3972.

Yanagi, K., Privé, G. G. & Dickerson, R. E. (1991). An analysis of local helix geometry in three B-DNA decamers and eight dodecamers. J. Mol. Biol. 217, 201-214.

Edited by A. Klug