J. Mol. Biol. (1965) 11, 391-402

Fourier Synthesis Studies of Lithium DNA
Part III: Hoogsteen Models

STRUTHER ARNOTT AND M. H. F. WILKINS
Medical Research Council Biophysics Research Unit, King's College, 26-29 Drury Lane, London, W.C.2

L. D. HAMILTON
Medical Department, Brookhaven National Laboratory, Associated Universities Inc., Upton, Long Island, New York, U.S.A.

AND

R. LANGRIDGE
Children's Cancer Research Foundation, Children's Hospital Medical Center and Harvard Medical School, Boston, Mass., U.S.A.

(Received 11 November 1964)

Molecular model building and X-ray diffraction techniques have been used to study the proposition that the base-pairing in double-stranded DNA is not of the Watson-Crick kind, but is like that found in a number of crystalline complexes of simple derivatives of adenine and thymine (or uracil). Successive approximations to the electron density distribution in lithium DNA have been made, assuming the proposition to be true. In spite of this bias in the analyses, it could be concluded that only a model of the Watson-Crick kind was compatible with the diffraction data.
1. Introduction

The first direct determination of the structure of a purine-pyrimidine base-pair was made by Hoogsteen (1959) in a single-crystal study of a mixed crystal of 3-N-methylthymine and 9-N-methyladenine obtained from an aqueous solution of equimolar quantities of these substances. In this pair the bases have a different arrangement (see Fig. 1(a)) from that proposed for DNA by Watson & Crick (1953) (Fig. 1(b)), in that the NH ... N thymine-adenine hydrogen bond is between N1 of thymine and N1 of adenine in the Watson-Crick pair, but between N1 of thymine and N7 of adenine in the Hoogsteen pair. The same imidazole N7 of adenine, rather than N1, is involved in the hydrogen bonding found in crystals of adenine hydrochloride (Cochran, 1951), and in the 1:1 complex of 9-N-ethyladenine and 3-N-methyluracil (Matthews & Rich, 1964), which has a Hoogsteen-type base-pairing scheme although the complex was grown from solution in dimethyl sulphoxide. It is noteworthy that the purine and pyrimidine models available to Donohue in 1956 for his extensive review of base-pairing schemes for duplex polynucleotides were not sufficiently accurate to enable him to predict Hoogsteen's pair.
FIG. 1. Adenine-thymine hydrogen-bonded pairs: (a) found by Hoogsteen in a 1:1 complex of 3-N-methylthymine and 9-N-methyladenine; (b) postulated by Watson & Crick (current version refined by Arnott) for part of the structure of DNA. Guanine-cytosine pairs: (c) Hoogsteen type; (d) Watson-Crick type.
In no adenine-pyrimidine complex so far studied has a Watson-Crick base-pairing been discovered. This is in contrast to the complexes of guanine and cytosine derivatives (O'Brien, 1963; Sobell, Tomita & Rich, 1963) where Watson-Crick pairing is the only one found, probably because three hydrogen bonds can form between guanine and cytosine. The formation of other pairs in adenine-pyrimidine single-crystal systems might be determined by the additional hydrogen bonds linking these pairs in layers in the crystals. Although the environment of the organic bases in such systems is very different from that in polynucleotides, it was possible that these base-pairs might exist in DNA. The present study of the base-pairing in Li DNA was undertaken in order to determine precisely the organisation of the heterocyclic rings in DNA, it being assumed that the previous studies had reduced the choice to the two schemes described.

The alternative DNA structure
Two helical, antiparallel phosphate-ester chains joined by purine-pyrimidine base-pairs were considered to be necessary features of any DNA model. The Hoogsteen and Watson-Crick schemes both satisfy the main requirements for base-pairing in DNA. Both imply the specific pairing, adenine with thymine, or guanine with cytosine, demanded by the analytical data on DNA (Chargaff, 1950). One of each set of pairs has been shown to exist in simpler crystal systems. Both possess the same symmetry element relating their N-C glycosidic bonds, namely a dyad axis in the plane of the bases, which in a DNA molecule would allow the sugar and phosphate groups of two antiparallel chains to have similar conformations. Corresponding to each of the observed pairs, it is possible to construct from the other DNA purine and pyrimidine bases a pair with closely similar dimensions (Fig. 1(c) and (d)) and with which the observed pair could be exchanged in the DNA molecule without disturbing the regularity of the sugar-phosphate exterior. It may be significant (cf. Fig. 1(c) and (d)) that, for Watson-Crick pairs, precise equivalence of glycosidic bonds can be achieved while maintaining hydrogen bonds of normal length and with a high degree of linearity; but, with Hoogsteen pairs, equivalence appears to be achieved only when the degree of linearity of the hydrogen bonds is less.
Structural formulae (I) and (II): the imino and amino forms of cytosine, respectively.
The possibility that, in DNA, adenine and thymine occur in Hoogsteen pairs, and guanine and cytosine as Watson-Crick pairs, is ruled out by the regularity of the deoxyribosyl phosphate chains implied by the high degree of order in Li DNA crystals. Such regularity is unlikely to be achieved by a mixture of pairing schemes, since the interglycosidic link distance (C1' ... C1') in the former (~8·80 Å) is at least 2 Å shorter than the corresponding distance (~10·85 Å) in the latter.

A possible objection to acceptance of a Hoogsteen system was that a guanine-cytosine pair of this type would require cytosine to exist in the imino form (I, R = H) and not the amino form (II, R = H) expected from numerous studies on prototropy in the amino-pyrimidines (Ulbricht, 1963). Gatlin & Davies (1962) claimed that their nuclear magnetic resonance study of 2'-deoxycytidine (I, R = deoxyribosyl) showed it to have the imino form; but this has since been refuted by the work of Miles (1963), whose nuclear magnetic resonance study of deoxycytidine in dimethyl sulphoxide, and infrared study in deuterium oxide, both support the amino form. Nevertheless, it could not be said that the existence, in DNA itself, of this tautomer was thereby precluded. A direct evaluation of the competing hypotheses by X-ray methods was therefore undertaken.
2. The X-ray Analysis

Structure factors were calculated for a Li DNA crystal model incorporating a Hoogsteen-type molecular model H I similar to the preliminary model of Langridge & Rich (1960). The conventional measure of the acceptability of a structure,

$$R = \frac{\sum_{\mathbf{H}} \bigl|\, |F_{\mathbf{H}}^{obs}| - |F_{\mathbf{H}}^{calc}| \,\bigr|}{\sum_{\mathbf{H}} |F_{\mathbf{H}}^{obs}|},$$

was 0·65, which was not unsatisfactory for an unrefined model, since the Watson-Crick DNA model gave rise to a value 0·85 which was reduced, on subsequent refinement, to 0·38 (Arnott, Wilkins & Hamilton, 1964). The successive Fourier syntheses of electron density which were made were of two kinds:

$$\rho_o = \sum_{\mathbf{H}} |F_{\mathbf{H}}^{obs}| \cos(2\pi \mathbf{H}\cdot\mathbf{r} - \alpha_{\mathbf{H}})$$

where the amplitudes, $|F_{\mathbf{H}}^{obs}|$, were derived from the observed diffraction; and

$$\Delta\rho = \sum_{\mathbf{H}} \bigl(|F_{\mathbf{H}}^{obs}| - |F_{\mathbf{H}}^{calc}|\bigr) \cos(2\pi \mathbf{H}\cdot\mathbf{r} - \alpha_{\mathbf{H}})$$

where the amplitudes, $(|F_{\mathbf{H}}^{obs}| - |F_{\mathbf{H}}^{calc}|)$, were the differences between the observed structure amplitudes and those calculated from the model crystal structure. In both types of synthesis the phases, $\alpha_{\mathbf{H}}$, were necessarily calculated from the crystal model parameters. From the results, contour maps of electron density were prepared, that in $\rho_o$ being an approximation to the electron density in the crystal, but biased towards that in the structure assumed because of the choice of phases. The $\Delta\rho$ maps show the
FIG. 2. Approximate electron density in the mean plane (z = 0·086c) of a typical base-pair in H I. (a) $\rho_o$, contours at equal arbitrary intervals, with negative contours dashed; (b) and (c) $\Delta\rho$, with contours at twice the interval in $\rho_o$. The atoms of the phasing model H I which lie within 0·5 Å of the plane are shown as open circles on (a) and (b). The appropriate atoms of B III are superposed as filled circles on (c).
difference between the electron density in $\rho_o$ and that in the crystal model at the same resolution. They contain positive and negative regions indicating respectively the parts of the crystal unit where too little or too much scattering matter has been placed. These maps indicate, therefore, the site and extent of refinement required in the model. The evident need of extensive refinement was confirmed when the Fourier syntheses $\rho_o$(H I) and $\Delta\rho$(H I) were examined. Features of the electron density in the region of the bases (see Fig. 2(a) and (b)) suggested that: (a) the base-pairs should have been sited about 2 Å nearer the helix axis; (b) the interglycosidic link distance had to be substantially increased; (c) the components of the base-pairs had to be moved much closer together. The shift of about 1·5 Å clearly indicated in $\Delta\rho$(H I) for the phosphate group (Fig. 3) was entirely compatible with the first two refinements. The changes deduced for the base and phosphate positions would have placed these groups in positions
FIG. 3. A bounded projection (z = 0·072c to 0·180c) parallel to 001 in $\Delta\rho$(H I) showing features in the phosphate group region. The open circles show the PO4 position in H I, the filled circles the position in B III.
similar to those in B II and B III (Arnott et al., 1964) and were a further confirmation of the essential correctness of the positions determined for the base-pairs and phosphate groups in the previous studies. It was noted that the refinements (b) and (c) were incompatible with each other and with the maintenance of normal hydrogen-bonded distances. That a refined model (B III) of the Watson-Crick kind would appear immediately to satisfy all the structural deficiencies of H I is shown in Fig. 2(c), where the model B III has been superposed on an appropriate section of $\Delta\rho$(H I).

Improved versions of the Hoogsteen purine-pyrimidine pairs were made which took account of recent single-crystal studies of derivatives of the bases (Jeffrey & Kinoshita, 1963; Hoogsteen, 1963; Marsh et al., 1962; Gerdil, 1961; Kraut & Jensen, 1963; Iball & Wilson, 1963) and of the hydrogen-bond lengths to be expected (Fuller, 1959, and private communication). An attempt was made to implement refinements in H I but not necessarily in a stereochemically acceptable way: a Fourier synthesis was made with a molecular phasing model H II in which the new Hoogsteen bases
FIG. 4. Electron density in z = 0·086c in (a) $\rho_o$(H II); (b) $\Delta\rho$(H II). Phasing model atoms in or near (< 0·5 Å) the plane are shown as open circles. Contours as Fig. 2.

FIG. 5. Electron density in z = 0·086c in (a) $\rho_o$(H III); (b) $\Delta\rho$(H III). Phasing model atoms in or near (< 0·5 Å) the plane are shown as open circles. Contours as Fig. 2.
were placed in the positions indicated by all previous Li DNA Fourier analyses, but were not linked to the other nucleotide components, which were assumed to be in precisely the same sites as in B III (Arnott et al., 1964). In $\Delta\rho$(H II), most of the extreme topographical features of $\Delta\rho$(H I) did not appear, except the large maximum in the centre of the base-pair region (Fig. 4(b)), which indicated a deficiency of electron density which could not be satisfied by a Hoogsteen base-pair but, as Fig. 2(c) clearly shows, would be by a Watson-Crick one.

A further molecular model, H III, was made (co-ordinates in Table 1) in which the Hoogsteen bases and the phosphate group had positions like those in H II but in which the latter was so oriented that the deoxyribosyl groups could link the polynucleotide components together. In this model, bond lengths were within 0·01 Å and bond angles within 1° or 2° of the accepted values (Sundaralingam & Jensen, 1964, American Crystallographic Association Abstracts, p. 51), except that there were bond angle distortions of 5° at two carbon atoms of the sugar group. Structure factors were calculated and R found to be 0·41 (as it was for H II) (Fig. 5). The observed and difference Fourier syntheses showed (Fig. 5(a) and (b)) that the C1' ... C1' separation had to be greater than could reasonably be achieved in any Hoogsteen model, and the central maximum in the base region persisted. In contrast, these unacceptable features are not present in the $\Delta\rho$ map (Fig. 6) for the refined Watson-Crick model B III (Arnott et al., 1964). The residual topography in this case is due mainly to inadequate allowance for water between the DNA molecules.
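The residual R quoted for the successive models (0·65 for H I, 0·41 for H II and H III, 0·38 for the refined Watson-Crick model) is the conventional ratio of the summed amplitude discrepancies to the summed observed amplitudes. As a minimal sketch, assuming that standard definition, it can be evaluated as follows. The function name and the amplitude values are invented for illustration and are not data from this study.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual: sum(| |Fo| - |Fc| |) / sum(|Fo|)."""
    if len(f_obs) != len(f_calc):
        raise ValueError("observed and calculated lists must have equal length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Invented amplitudes in arbitrary units, NOT measurements from the paper.
observed = [10.0, 8.0, 6.0, 4.0]
calculated = [9.0, 10.0, 5.0, 4.0]
print(round(r_factor(observed, calculated), 3))
```

A perfect model gives R = 0, and larger values indicate poorer agreement, which is why the fall from 0·65 to 0·41 between H I and H III still left the Hoogsteen models above the 0·38 of the refined Watson-Crick model.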
FIG. 6. Electron density in z = 0·086c in $\Delta\rho$(B III). The Watson-Crick phasing model atoms in or near the plane are shown as open circles. Contours as Fig. 2.
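The way a difference synthesis points to missing scattering matter, even though its phases come from the incomplete model, can be illustrated with a one-dimensional toy calculation. The sketch below is an assumption-laden illustration only: the point atoms, their weights, the number of reflections and all function names are hypothetical and bear no relation to the Li DNA data. It evaluates the sum of (|F_obs| - |F_calc|) cos(2πHx - α_H) with model phases α_H and reports where the largest positive feature lies.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """F_H = sum_j w_j * exp(2*pi*i*H*x_j) for H = 1..h_max (1-D point atoms)."""
    return {
        h: sum(w * cmath.exp(2j * math.pi * h * x) for x, w in atoms)
        for h in range(1, h_max + 1)
    }


def difference_synthesis(obs_amplitudes, model_f, n_grid=200):
    """Sum over H of (|F_obs| - |F_calc|) * cos(2*pi*H*x - alpha_H).

    The phases alpha_H are taken from the model structure factors only.
    """
    xs = [i / n_grid for i in range(n_grid)]
    values = []
    for x in xs:
        total = 0.0
        for h, f_calc in model_f.items():
            alpha = cmath.phase(f_calc)
            total += (obs_amplitudes[h] - abs(f_calc)) * math.cos(
                2 * math.pi * h * x - alpha
            )
        values.append(total)
    return xs, values


# Hypothetical toy structure: five atoms in the model, one extra in the "crystal".
model_atoms = [(0.10, 6.0), (0.27, 6.0), (0.41, 6.0), (0.73, 6.0), (0.90, 6.0)]
missing_atom = (0.55, 3.0)
H_MAX = 60

model_f = structure_factors(model_atoms, H_MAX)
true_f = structure_factors(model_atoms + [missing_atom], H_MAX)
obs_amplitudes = {h: abs(f) for h, f in true_f.items()}  # amplitudes only

xs, delta_rho = difference_synthesis(obs_amplitudes, model_f)
peak_x = xs[max(range(len(xs)), key=lambda i: delta_rho[i])]
print(f"largest positive difference density at x = {peak_x:.3f}")
```

In this toy case the strongest positive feature is expected to fall close to the omitted atom, which mirrors the argument in the text: a persistent maximum in a difference map marks scattering matter that the phasing model has failed to supply.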
TABLE 1
Cylindrical polar co-ordinates for lithium DNA molecular models with Hoogsteen base-pairs

                                          H I                          H III
Group        Atom                 r (Å)  φ (deg)  z (Å)       r (Å)  φ (deg)  z (Å)

Phosphate    P                     9·12     44    -1·80        9·18    52·5    -1·46
             O1                    8·84     43     1·84        8·36    60·9    -1·94
             O2                   10·44     41     2·68       10·35    51·1    -2·37
             O3                    9·40     42     0·34        9·47    53·0     0·00
             O4                    8·04     37     2·42        8·34    43·75   -1·71

Sugar        C1'                   6·00     49     0·00        5·39    54·5     0·00
             C2'                   6·52     53    -1·32        5·84    56·6    -1·45
             C3'                   7·94     58    -1·08        7·11    64·1    -1·31
             C4'                   8·08     58     0·48        7·31    67·1     0·16
             C5'                   8·60     66     0·92        7·73    78·3     0·32
             O1'                   6·60     55     0·92        6·13    63·3     0·87

Purine       O6, (NH2)6            1·80    129     0·00        2·83   143·9     0·00
             C6                    2·52    102     0·00        3·05   119·4     0·00
             C5                    2·68     71     0·00        2·61    92·0     0·00
             N7                    2·42     41     0·00        4·76    63·0     0·00
             C8                    3·72     36     0·00        2·92    46·2     0·00
             N9                    4·50     51     0·00        4·01    60·9     0·00
             C4                    4·02     68     0·00        3·85    81·0     0·00
             N3                    4·92     82     0·00        5·08    88·9     0·00
             C2                    4·86     97     0·00        5·22   103·4     0·00
             N1                    3·80    106     0·00        4·42   117·1     0·00
             (NH2)2 (guanine)      6·22    103     0·00        6·49   107·0     0·00

Pyrimidine   O6, (NH2)6            1·56    114     0·00        1·59   141·2     0·00
             C6                    2·12     76     0·00        1·85    97·1     0·00
             N1                    2·18     41     0·00        1·70    51·9     0·00
             C2                    3·54     37     0·00        3·03    44·8     0·00
             O2                    4·16     22     0·00        3·67    26·8     0·00
             N3                    4·50     51     0·00        4·01    60·9     0·00
             C4                    4·40     68     0·00        4·08    80·3     0·00
             C5                    3·44     82     0·00        3·27    97·2     0·00
             (CH3) (thymine)       4·24    102     0·00        4·18   116·0     0·00
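Cylindrical polar co-ordinates of the kind listed in Table 1 are easily converted to Cartesian form for checking bond lengths. The following is a minimal sketch, not the authors' procedure. It assumes that the first sugar row is C1' and that the purine atom at r = 4·01 Å, φ = 60·9° in H III is the glycosidic nitrogen N9; both labels are readings of the table rather than certainties, and the helper names are hypothetical.

```python
import math


def to_cartesian(r, phi_deg, z):
    """Convert (r in A, phi in degrees, z in A) to Cartesian (x, y, z)."""
    phi = math.radians(phi_deg)
    return (r * math.cos(phi), r * math.sin(phi), z)


def separation(atom_a, atom_b):
    """Distance in A between two atoms given as (r, phi_deg, z) tuples."""
    return math.dist(to_cartesian(*atom_a), to_cartesian(*atom_b))


# H III values as read from Table 1; the row labels are an assumption here.
sugar_c1 = (5.39, 54.5, 0.00)   # first sugar row, taken to be C1'
purine_n9 = (4.01, 60.9, 0.00)  # purine ring nitrogen taken to be N9

glycosidic = separation(sugar_c1, purine_n9)
print(f"C1'-N9 separation: {glycosidic:.2f} A")
```

Under these assumptions the separation comes out close to 1·47 Å, a reasonable length for a glycosidic C-N bond, which is consistent with the statement that bond lengths in H III were kept near accepted values.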
3. Stereochemical Considerations

This Fourier study and the previous one (Arnott et al., 1964) have clearly shown that the positions of the two main scattering groups in DNA, namely the bases and phosphates, can be located with fair precision. Clearly this limits the conformation which can be adopted by the phosphate-ester chains, especially since the main helical parameters have also been established with certainty. As Langridge & Rich (1960) pointed out, there are stereochemical difficulties in building DNA models with Hoogsteen pairs, and neither H I nor H III was free from objectionable stereochemical features in the form of abnormally short non-bonded interatomic distances. Attention will be confined to H III, which represented the best attempt at reconciling the stereochemical requirements of Hoogsteen-type DNA molecules with the observed X-ray diffraction: Table 2 provides a list of the short contacts in H III. These unacceptable non-bonded distances involve in the main two of the deoxyribosyl carbon atoms and part of the six-membered ring of the purine moiety. Such short contacts seem inevitable in models where the bases are arranged in the Hoogsteen manner, since the short interglycosidic distance of 8·8 Å, the turn angle of 36° between residues, and bases stacked 3·4 Å apart appear to impose a stereochemically impossible environment on the sugar groups.

TABLE 2
The abnormally short interatomic distances in H III

Phosphate-ester purine contacts (Å): 2·55, 2·10, 2·27, 2·82, 2·30, 2·86, 2·82, 2·69
Sugar-pyrimidine contacts (Å): 2·81, 2·67

The asterisk indicates a base atom of the next residue; (NH2)g is the additional amino-group of guanine; the dashes indicate sugar atoms.
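The origin of such short contacts can be explored by generating a neighbouring residue with the helical screw operation (a turn of 36° and a rise of 3·4 Å) and measuring distances to it. The sketch below is illustrative only. It uses H III co-ordinates as read from Table 1, assumes that the second sugar row is C2', and assumes that the relevant neighbour is the one reached by a step of -36° and -3·4 Å; the screw sense, the atom labels and the helper names are inferences of this sketch, not statements from the paper.

```python
import math

TURN_DEG = 36.0  # turn angle between residues, from the text
RISE = 3.4       # stacking distance in A, from the text


def to_cartesian(r, phi_deg, z):
    """Convert (r in A, phi in degrees, z in A) to Cartesian (x, y, z)."""
    phi = math.radians(phi_deg)
    return (r * math.cos(phi), r * math.sin(phi), z)


def neighbour(atom, steps=-1):
    """Image of an atom (r, phi_deg, z) after `steps` screw operations."""
    r, phi_deg, z = atom
    return (r, phi_deg + steps * TURN_DEG, z + steps * RISE)


def separation(atom_a, atom_b):
    """Distance in A between two atoms given as (r, phi_deg, z) tuples."""
    return math.dist(to_cartesian(*atom_a), to_cartesian(*atom_b))


# H III co-ordinates as read from Table 1 (row labels assumed).
sugar_c2 = (5.84, 56.6, -1.45)     # second sugar row, taken to be C2'
purine_n3 = (5.08, 88.9, 0.00)     # purine N3
guanine_nh2 = (6.49, 107.0, 0.00)  # additional amino-group of guanine

d_n3 = separation(sugar_c2, neighbour(purine_n3))
d_nh2 = separation(sugar_c2, neighbour(guanine_nh2))
print(f"C2' ... N3 of neighbouring residue:     {d_n3:.2f} A")
print(f"C2' ... (NH2)g of neighbouring residue: {d_nh2:.2f} A")
```

With these assumptions both separations fall well below 3 Å (roughly 2·1 and 2·6 Å), which is of the same order as the shortest contacts listed in Table 2 and illustrates how the small helical radius of the Hoogsteen bases crowds the sugar against the purine ring of the adjacent residue.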
4. Conclusion

This study has confirmed the value of Fourier methods for refining fibre structures for which the data are, characteristically, few and of low resolution. When suitable data are available, the method can provide a tool superior to the "trial-and-error" methods hitherto used in this field, especially when it is not a question merely of small refinements of group parameters, but of deciding between competing structural hypotheses. Since Fourier syntheses are inevitably biased towards the structural model from which the phases were derived, evidence from them that a proposed model is unsuitable is compelling. For this reason, even without the strong support of the stereochemical arguments derived from the molecular-model building study, it can be concluded that base-pairing which is not of the Watson-Crick kind does not participate to any great extent in the structure of double helical DNA's.

The work at Brookhaven was supported by the U.S. Atomic Energy Commission and that at Boston by grants from the U.S. Public Health Service, National Institutes of Health (HD-01267), and National Science Foundation (GB-2330). We are indebted to Professor Sir John Randall, F.R.S., Drs V. P. Bond and S. Farber for encouragement and to Drs M. Spencer and W. Fuller for discussion. We would also like to thank the University of London Institute of Computer Science and the Shell Oil Co. Ltd. for computing facilities.
REFERENCES

Arnott, S., Wilkins, M. H. F. & Hamilton, L. D. (1964). Acta Cryst., in the press.
Chargaff, E. (1950). Experientia, 6, 201.
Cochran, W. (1951). Acta Cryst. 4, 81.
Donohue, J. (1956). Proc. Nat. Acad. Sci., Wash. 42, 60.
Fuller, W. (1959). J. Phys. Chem. 63, 1705.
Gatlin, L. & Davies, J. C. (1962). J. Amer. Chem. Soc. 84, 4464.
Gerdil, R. (1961). Acta Cryst. 14, 333.
Hoogsteen, K. (1959). Acta Cryst. 12, 822.
Hoogsteen, K. (1963). Acta Cryst. 16, 28.
Iball, J. & Wilson, H. R. (1963). Nature, 198, 1193.
Jeffrey, G. A. & Kinoshita, Y. (1963). Acta Cryst. 16, 20.
Kraut, J. & Jensen, L. H. (1963). Acta Cryst. 16, 69.
Langridge, R. & Rich, A. (1960). Acta Cryst. 13, 1052.
Marsh, R. E., Bierstedt, R. & Eichhorn, E. J. (1962). Acta Cryst. 15, 310.
Matthews, F. S. & Rich, A. (1964). J. Mol. Biol. 8, 89.
Miles, H. T. (1963). J. Amer. Chem. Soc. 85, 1007.
O'Brien, E. (1963). J. Mol. Biol. 7, 107.
Sobell, H. M., Tomita, K. & Rich, A. (1963). Proc. Nat. Acad. Sci., Wash. 49, 885.
Ulbricht, T. L. V. (1963). Tetrahedron Letters No. 16, 1027.
Watson, J. D. & Crick, F. H. C. (1953). Nature, 171, 737.