J. Mol. Biol. (1965) 13, 914-929
Stereochemistry of Nucleic Acid Constituents
I. Refinement of the Structure of Cytidylic Acid b‡
M. SUNDARALINGAM† AND L. H. JENSEN
Department of Biological Structure, University of Washington, Seattle, Washington 98105, U.S.A.

(Received 22 December 1964, and in revised form 26 April 1965)

The crystal structure of cytidylic acid b, cytidine-3'-phosphate, has been refined by the method of full matrix least-squares to an R factor of 0·045, $R = \sum \big| |F_o| - |F_c| \big| / \sum |F_o|$, where $|F_o|$ and $|F_c|$ are respectively the observed and calculated diffraction amplitudes. The nucleotide is found to exist as a zwitterion. The cytosine ring is slightly nonplanar, with N(3) protonated. The ribose ring is puckered with C(2')-endo, C(2') being displaced (on the same side as C(5')) about 0·6 Å from the plane of the remaining ring atoms. These latter appear to deviate slightly from a plane. Because the base in nucleic acids need not necessarily be planar and because C(1') can be significantly displaced from the least-squares plane of the base, we suggest that the definition of the torsion angle φCN be modified.
1. Introduction

The structure of cytidylic acid b, Fig. 1,§ was derived in projection by Alver & Furberg (1957, 1959) using two-dimensional visually estimated X-ray intensities. We are interested in the solid state structures of nucleic acid components and have studies under way of the mononucleotides based on X-ray diffraction data. As one part of this work, we have collected three-dimensional photometrically integrated intensities for cytidylic acid b. Accurate prior knowledge of the constituents of nucleic acids will be of considerable aid in structural studies of nucleic acids. It was with the view of supplying such information for cytidylic acid b that the present refinement was undertaken.

† Present address: Children's Cancer Research Foundation, Children's Hospital Medical Center, Boston, Massachusetts 02115, U.S.A.
‡ Presented at the American Crystallographic Association Meeting, Bozeman, Montana, July 26 to 31, 1964. An independent refinement has been completed by Mez & Donohue, using Alver & Furberg's data (1959).
§ The numbering that has been adopted for the pyrimidine ring is in accordance with Ring Index, 2nd ed., Amer. Chem. Soc., 1960, p. 32.
2. Experimental Procedure

Crystals of cytidylic acid are orthorhombic, space group P2₁2₁2. The unit cell dimensions as determined using a diffractometer are as follows: a = 8·778 ± 0·001, b = 21·649 ± 0·003 and c = 6·847 ± 0·001 Å (Cu Kα = 1·5418 Å). Unidimensionally integrated Weissenberg photographs for hk0 through hk6 were collected using a multiple film technique. The crystal used was a rectangular tablet with cross-sectional area estimated to be 0·01 mm². In general, the crystals were lath-shaped and elongated in the c axial direction. Two sets of intensities were collected, a short (about 17 hr) and a long (about 140 hr) exposure for l = 0, 1, 2 and 3, and a single exposure (about 160 hr) for the remaining reflections. Intensities were determined by scanning the films at right angles to the direction of integration by the camera with a recording microdensitometer. Within the linear response range of the film, the area above background under each trace is proportional to the integrated intensity. Altogether, 1207 independent reflections were measured, comprising 82% of the total available with copper radiation, or 96% of those accessible under the conditions of the experiment. A few dozen additional reflections were observed but were too weak to photometer. Together with the unobserved, the latter were not included in the refinement.
Refinement

The refinement was carried through on an IBM 709 computer using the full matrix least-squares program of Busing & Levy (1959). Essentially the Hughes (1941) weighting scheme was used, where √w = 1·0 for Fo ≤ 33·6 and √w = 33·6/Fo for Fo > 33·6. Eleven intense reflections suffering appreciable secondary extinction were given zero weight. In all, 189 nonhydrogen atom and 56 hydrogen atom parameters together with 7 level scale factors were varied. In the initial stage of the refinement, the 14 hydrogen atoms were fixed at the positions listed by Alver & Furberg (1959) and only the nonhydrogen atom parameters and scale factors were varied. In the first three cycles, individual atom isotropic thermal parameters were used. It was necessary to break the computation into two parts when individual atom anisotropic thermal parameters were refined, because only a maximum of about 175 parameters can be varied in one refinement cycle. Different sets of nonhydrogen atoms were refined in each cycle but always with common atoms in successive cycles to allow for interaction among parameters. After several cycles the nonhydrogen atoms were fixed and the hydrogen atom positional and isotropic thermal parameters were refined. It is interesting to note that the hydrogen atom which Alver & Furberg (1959) "attached" to O(6) of the phosphate group (HN(3) in this work) moved a distance of about 0·7 Å in three least-squares cycles, and refined to a position at a covalent bond distance from N(3) of a screw-axis related molecule. After the first refinement of the hydrogen atoms, the nonhydrogen atom parameters were further refined, followed by a final refinement of the hydrogen atom parameters.
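The weighting scheme and the parameter count quoted above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the minimised quantity Σw(|Fo| − k|Fc|)² is the usual least-squares form, assumed here rather than taken from the text. The atom counts (21 nonhydrogen atoms with 3 positional and 6 anisotropic thermal parameters, 14 hydrogen atoms with 3 positional and 1 isotropic thermal parameter) are read from Tables 2 and 3.

```python
def sqrt_weight(f_obs, threshold=33.6):
    """Return sqrt(w) for one reflection: 1.0 up to the threshold, threshold/Fo above it."""
    if f_obs <= threshold:
        return 1.0
    return threshold / f_obs


def weighted_residual_sum(f_obs, f_calc, scale=1.0):
    """Assumed least-squares target: sum of w * (|Fo| - scale * |Fc|)**2."""
    total = 0.0
    for fo, fc in zip(f_obs, f_calc):
        w = sqrt_weight(abs(fo)) ** 2
        total += w * (abs(fo) - scale * abs(fc)) ** 2
    return total


# Parameter bookkeeping: 21 * 9 nonhydrogen + 14 * 4 hydrogen + 7 scale factors.
n_nonhydrogen = 21 * 9
n_hydrogen = 14 * 4
n_parameters = n_nonhydrogen + n_hydrogen + 7
print(n_nonhydrogen, n_hydrogen, n_parameters)
```

The total exceeds the limit of about 175 parameters per cycle, which is consistent with the need to split the anisotropic refinement into two parts.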
The final residual $R = \sum \big| |F_o| - |F_c| \big| / \sum |F_o|$, where $|F_o|$ and $|F_c|$ are respectively the observed and calculated diffraction amplitudes, for the 1196 reflections included in the refinement is 4·5%. Values of R for different reflection classes, with and without hydrogen atom contribution, are shown in Table 1. It is seen that the inclusion of the hydrogen atoms has improved the over-all value of R by 1·5%. The scattering factors for neutral C, N and O were from Berghuis et al. (1955), H from McWeeny (1951) and P from Freeman & Watson (1961). The atomic positional and temperature parameters with their standard deviations are listed in Tables 2 and 3. Bond lengths and bond angles calculated from the parameters in Tables 2 and 3 are shown in Tables 4 and 5. Figure 3 is a view of the molecule in the c axial direction, and Fig. 4 is a view along the c-axis.
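For readers who want to reproduce a residual of this kind, a minimal sketch follows. It uses the conventional definition R = Σ||Fo| − |Fc|| / Σ|Fo|; the amplitudes in the example are invented purely for illustration and are not data from this structure.

```python
def r_factor(f_obs, f_calc):
    """Conventional residual: sum of | |Fo| - |Fc| | divided by sum of |Fo|."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    return numerator / denominator


# Hypothetical amplitudes, chosen only to exercise the function.
example_f_obs = [10.0, 20.0, 30.0, 40.0]
example_f_calc = [9.5, 21.0, 29.0, 40.5]
print(f"R = {100 * r_factor(example_f_obs, example_f_calc):.1f} %")
```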
TABLE 1

Group l   No. of reflections   R† (all atoms) %   R† (hydrogens omitted) %   ΔR %
0         157                  3·9                6·7                        2·8
1         205                  4·1                6·1                        2·0
2         187                  4·3                5·7                        1·4
3         200                  4·2                5·7                        1·5
4         177                  4·8                5·9                        1·1
5         158                  5·3                5·8                        0·5
6         112                  6·4                6·4                        0·0
Total     1196                 4·5                6·0                        1·5

† Eleven extinct reflections excluded.
TABLE 2

(a) Positional co-ordinates of nonhydrogen atoms and their estimated standard deviations in Å

Atom     x/a        y/b        z/c        σx       σy       σz
P        0·06124    0·24612    0·11564    0·0011   0·0011   0·0012
O(6)     0·19560    0·20607    0·06524    0·0035   0·0032   0·0040
O(7)    -0·03692    0·25015   -0·07236    0·0030   0·0036   0·0031
O(8)    -0·02640    0·22765    0·29130    0·0038   0·0034   0·0039
O(3')    0·11926    0·31637    0·13753    0·0036   0·0030   0·0037
O(2')    0·38205    0·39188    0·08896    0·0034   0·0034   0·0036
O(1')    0·06387    0·42731    0·40293    0·0034   0·0030   0·0039
O(5')    0·28315    0·37445    0·70685    0·0039   0·0036   0·0039
O(2)     0·18955    0·53474   -0·02972    0·0049   0·0039   0·0041
N(1)     0·22500    0·50558    0·29079    0·0042   0·0037   0·0039
N(3)     0·28111    0·60506    0·18441    0·0040   0·0038   0·0042
N(4)     0·38438    0·67854    0·38896    0·0049   0·0040   0·0047
C(3')    0·19705    0·33674    0·30956    0·0048   0·0043   0·0051
C(2')    0·29050    0·39419    0·25659    0·0045   0·0043   0·0051
C(1')    0·16687    0·44288    0·25119    0·0050   0·0042   0·0051
C(4')    0·08213    0·36267    0·45772    0·0050   0·0046   0·0056
C(5')    0·12721    0·35787    0·67012    0·0062   0·0057   0·0064
C(6)     0·26336    0·52248    0·47406    0·0053   0·0046   0·0054
C(5)     0·31405    0·57971    0·51841    0·0058   0·0049   0·0051
C(4)     0·32719    0·62262    0·36245    0·0047   0·0045   0·0051
C(2)     0·22813    0·54711    0·13389    0·0050   0·0044   0·0053
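Because the cell is orthorhombic, the fractional co-ordinates of Table 2(a) convert to Å simply by scaling each component with the corresponding cell edge, and interatomic distances follow directly. The sketch below illustrates this for the glycosidic bond C(1')-N(1), using the cell dimensions from Section 2; the helper names are hypothetical and no thermal-motion correction is attempted.

```python
import math

# Cell edges a, b, c in angstroms (orthorhombic, all angles 90 degrees).
CELL = (8.778, 21.649, 6.847)


def to_cartesian(frac, cell=CELL):
    """Scale fractional co-ordinates by the cell edges to get angstroms."""
    return tuple(f * edge for f, edge in zip(frac, cell))


def distance(frac_1, frac_2, cell=CELL):
    """Distance in angstroms between two atoms given in fractional co-ordinates."""
    p = to_cartesian(frac_1, cell)
    q = to_cartesian(frac_2, cell)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(p, q)))


n1 = (0.22500, 0.50558, 0.29079)    # N(1), Table 2(a)
c1p = (0.16687, 0.44288, 0.25119)   # C(1'), Table 2(a)
print(f"C(1')-N(1) = {distance(c1p, n1):.3f} A")
```

The value obtained this way can be compared with the C(1')-N(1) entry of Table 4(a).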
Table 2 continued.

(b) Anisotropic thermal parameters† of nonhydrogen atoms and their estimated standard deviations in parentheses

Atom    β11                β22                β33                β12                 β13                 β23
P       0·00467 (0·00012)  0·00054 (0·00002)  0·00704 (0·00026)  -0·00026 (0·00005)  -0·00004 (0·00016)  -0·00020 (0·00007)
O(6)    0·00620 (0·00043)  0·00085 (0·00006)  0·01716 (0·00096)   0·00011 (0·00014)  -0·00002 (0·00056)  -0·00100 (0·00020)
O(7)    0·00583 (0·00041)  0·00114 (0·00006)  0·00824 (0·00080)  -0·00046 (0·00016)  -0·00095 (0·00042)  -0·00130 (0·00022)
O(8)    0·00818 (0·00052)  0·00117 (0·00007)  0·01134 (0·00084)  -0·00050 (0·00015)   0·00171 (0·00054)   0·00039 (0·00020)
O(3')   0·00806 (0·00043)  0·00063 (0·00006)  0·00949 (0·00080)  -0·00033 (0·00014)  -0·00147 (0·00052)  -0·00008 (0·00018)
O(2')   0·00610 (0·00040)  0·00130 (0·00007)  0·00934 (0·00088)   0·00012 (0·00014)   0·00162 (0·00050)  -0·00017 (0·00020)
O(1')   0·00640 (0·00040)  0·00081 (0·00006)  0·01630 (0·00094)   0·00015 (0·00014)   0·00203 (0·00059)   0·00084 (0·00020)
O(5')   0·00940 (0·00054)  0·00148 (0·00008)  0·01178 (0·00088)   0·00023 (0·00017)  -0·00129 (0·00059)  -0·00010 (0·00022)
O(2)    0·01643 (0·00070)  0·00165 (0·00010)  0·01020 (0·00094)  -0·00120 (0·00023)  -0·00383 (0·00073)  -0·00036 (0·00022)
N(1)    0·00700 (0·00053)  0·00074 (0·00007)  0·00764 (0·00084)  -0·00018 (0·00016)  -0·00050 (0·00058)   0·00000 (0·00022)
N(3)    0·00648 (0·00052)  0·00082 (0·00008)  0·01046 (0·00103)  -0·00052 (0·00016)  -0·00108 (0·00059)   0·00029 (0·00022)
N(4)    0·01114 (0·00063)  0·00105 (0·00008)  0·00950 (0·00106)  -0·00035 (0·00019)  -0·00010 (0·00072)  -0·00061 (0·00026)
C(3')   0·00603 (0·00059)  0·00073 (0·00008)  0·01052 (0·00114)   0·00006 (0·00018)  -0·00125 (0·00070)  -0·00045 (0·00024)
C(2')   0·00427 (0·00056)  0·00080 (0·00008)  0·01292 (0·00133)   0·00005 (0·00018)   0·00096 (0·00064)   0·00019 (0·00026)
C(1')   0·00794 (0·00066)  0·00054 (0·00008)  0·01072 (0·00132)  -0·00016 (0·00020)  -0·00066 (0·00069)  -0·00009 (0·00024)
C(4')   0·00643 (0·00066)  0·00099 (0·00009)  0·01277 (0·00124)  -0·00063 (0·00020)   0·00074 (0·00074)  -0·00004 (0·00026)
C(5')   0·00967 (0·00078)  0·00150 (0·00012)  0·01576 (0·00170)  -0·00036 (0·00026)   0·00117 (0·00092)   0·00103 (0·00034)
C(6)    0·00770 (0·00066)  0·00088 (0·00008)  0·01242 (0·00126)   0·00000 (0·00021)  -0·00182 (0·00076)   0·00078 (0·00026)
C(5)    0·01035 (0·00072)  0·00098 (0·00009)  0·00841 (0·00122)  -0·00028 (0·00024)   0·00068 (0·00076)   0·00004 (0·00026)
C(4)    0·00582 (0·00054)  0·00094 (0·00008)  0·00724 (0·00110)  -0·00013 (0·00018)   0·00086 (0·00070)  -0·00037 (0·00025)
C(2)    0·00800 (0·00064)  0·00085 (0·00008)  0·01012 (0·00119)  -0·00025 (0·00020)  -0·00036 (0·00076)   0·00070 (0·00026)

† β as given here is defined by: $T = \exp\{-(h^2\beta_{11} + k^2\beta_{22} + l^2\beta_{33} + 2hk\beta_{12} + 2hl\beta_{13} + 2kl\beta_{23})\}$.
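The temperature-factor expression in the footnote to Table 2(b) can be evaluated directly for any reflection. The short sketch below does so for the phosphorus atom; the dictionary layout and the choice of the reflection (2, 4, 1) are illustrative assumptions, not part of the original work.

```python
import math

# Anisotropic thermal parameters of P from Table 2(b).
BETA_P = {
    "b11": 0.00467, "b22": 0.00054, "b33": 0.00704,
    "b12": -0.00026, "b13": -0.00004, "b23": -0.00020,
}


def temperature_factor(h, k, l, beta):
    """T = exp{-(h^2 b11 + k^2 b22 + l^2 b33 + 2hk b12 + 2hl b13 + 2kl b23)}."""
    exponent = (
        h * h * beta["b11"] + k * k * beta["b22"] + l * l * beta["b33"]
        + 2 * h * k * beta["b12"] + 2 * h * l * beta["b13"] + 2 * k * l * beta["b23"]
    )
    return math.exp(-exponent)


print(f"T(2, 4, 1) for P = {temperature_factor(2, 4, 1, BETA_P):.4f}")
```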
TABLE 3

Atomic positional and thermal parameters of hydrogens from least-squares refinement, peak heights from difference Fourier synthesis†

Atom      x/a       y/b       z/c       B (Å²)   Peak height (el. Å⁻³)
HN(3)     0·2946    0·6428    0·0762     2·6     0·51
HO(7)    -0·1508    0·2725   -0·0567    11·5     0·30
H(3')     0·2472    0·3072    0·3468     0·1     0·53
H(2')     0·3495    0·4118    0·3683     1·3     0·45
HO(2')    0·3406    0·3803   -0·0475     3·4     0·38
H(1')     0·1300    0·4488    0·1280    -0·2     0·46
H(4')     0·0139    0·3354    0·4342    -0·1     0·47
H(5')     0·0441    0·3933    0·7484     3·1     0·35
H'(5')    0·0922    0·3125    0·7356     2·0     0·41
HO(5')    0·3191    0·3314    0·7340     0·4     0·42
H(6)      0·2523    0·4957    0·5778     0·1     0·49
H(5)      0·3528    0·5852    0·6496     1·5     0·38
HN(4)     0·4153    0·6861    0·5140     0·1     0·56
HN(4')    0·3975    0·6990    0·2573     0·6     0·51

† Average estimated standard deviations of positional and thermal parameters are 0·06 Å and 1·50 Å², respectively.
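As a quick arithmetic check on the hydrogen thermal parameters of Table 3, the sketch below averages the isotropic B values with HO(7) left out, which is the average referred to in the discussion of the hydrogen atoms. The dictionary is simply a transcription of the B column of the table.

```python
from statistics import mean

# Isotropic B values (square angstroms) of the 14 hydrogen atoms, Table 3.
B_HYDROGEN = {
    "HN(3)": 2.6, "HO(7)": 11.5, "H(3')": 0.1, "H(2')": 1.3, "HO(2')": 3.4,
    "H(1')": -0.2, "H(4')": -0.1, "H(5')": 3.1, "H'(5')": 2.0, "HO(5')": 0.4,
    "H(6)": 0.1, "H(5)": 1.5, "HN(4)": 0.1, "HN(4')": 0.6,
}

# Average over all hydrogens except HO(7), whose B is anomalously large.
b_without_ho7 = [b for name, b in B_HYDROGEN.items() if name != "HO(7)"]
mean_b = mean(b_without_ho7)
n_negative = sum(1 for b in B_HYDROGEN.values() if b < 0)
print(f"mean B without HO(7) = {mean_b:.2f} A^2, negative values: {n_negative}")
```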
TABLE 4

(a) Bond lengths not involving hydrogen atoms and their estimated standard deviations

Bond           Length (Å)
P-O(6)         1·504 ± 0·004
P-O(7)         1·551 ± 0·003
P-O(8)         1·483 ± 0·004
P-O(3')        1·611 ± 0·003
O(3')-C(3')    1·431 ± 0·006
C(3')-C(2')    1·533 ± 0·006
C(2')-O(2')    1·402 ± 0·006
C(2')-C(1')    1·513 ± 0·006
C(1')-O(1')    1·418 ± 0·006
O(1')-C(4')    1·458 ± 0·006
C(4')-C(3')    1·537 ± 0·006
C(4')-C(5')    1·511 ± 0·008
C(5')-O(5')    1·437 ± 0·007
C(1')-N(1)     1·475 ± 0·006
N(1)-C(2)      1·401 ± 0·006
C(2)-O(2)      1·201 ± 0·007
C(2)-N(3)      1·382 ± 0·006
N(3)-C(4)      1·339 ± 0·007
C(4)-N(4)      1·323 ± 0·006
C(4)-C(5)      1·420 ± 0·007
C(5)-C(6)      1·351 ± 0·007
C(6)-N(1)      1·350 ± 0·007

Table 4 continued.

(b) Bond angles not involving hydrogen atoms and their estimated standard deviations

Angle                 Value (°)
O(8)-P-O(3')          110·1 ± 0·3
O(6)-P-O(8)           116·0 ± 0·3
O(6)-P-O(7)           106·1 ± 0·2
O(7)-P-O(3')          101·5 ± 0·2
O(6)-P-O(3')          108·5 ± 0·2
O(8)-P-O(7)           113·6 ± 0·3
P-O(3')-C(3')         121·2 ± 0·4
O(3')-C(3')-C(2')     108·1 ± 0·4
O(3')-C(3')-C(4')     110·0 ± 0·5
C(4')-C(3')-C(2')     102·2 ± 0·4
C(3')-C(2')-O(2')     118·1 ± 0·5
C(1')-C(2')-O(2')     114·6 ± 0·5
C(3')-C(2')-C(1')     100·8 ± 0·4
C(2')-C(1')-O(1')     105·9 ± 0·4
C(2')-C(1')-N(1)      112·9 ± 0·5
O(1')-C(1')-N(1)      107·7 ± 0·4
C(4')-O(1')-C(1')     110·3 ± 0·5
O(1')-C(4')-C(3')     104·7 ± 0·4
C(3')-C(4')-C(5')     116·0 ± 0·6
C(5')-C(4')-O(1')     110·0 ± 0·5
C(4')-C(5')-O(5')     113·6 ± 0·6
C(1')-N(1)-C(2)       117·1 ± 0·5
C(1')-N(1)-C(6)       120·4 ± 0·6
N(1)-C(2)-O(2)        124·5 ± 0·6
N(3)-C(2)-O(2)        122·1 ± 0·6
N(1)-C(2)-N(3)        113·4 ± 0·5
C(2)-N(3)-C(4)        126·0 ± 0·6
N(3)-C(4)-C(5)        118·3 ± 0·6
N(3)-C(4)-N(4)        119·9 ± 0·6
C(5)-C(4)-N(4)        121·8 ± 0·6
C(4)-C(5)-C(6)        117·2 ± 0·6
C(5)-C(6)-N(1)        122·7 ± 0·6
C(6)-N(1)-C(2)        122·3 ± 0·6
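Bond angles such as those of Table 4(b) follow from the positional parameters of Table 2(a) by the usual dot-product formula. The sketch below, with hypothetical helper names, recomputes two of the angles at phosphorus from the fractional co-ordinates and the orthorhombic cell; agreement with the tabulated values is expected only to within rounding.

```python
import math

CELL = (8.778, 21.649, 6.847)  # a, b, c in angstroms

# Fractional co-ordinates from Table 2(a).
ATOMS = {
    "P": (0.06124, 0.24612, 0.11564),
    "O(6)": (0.19560, 0.20607, 0.06524),
    "O(7)": (-0.03692, 0.25015, -0.07236),
    "O(8)": (-0.02640, 0.22765, 0.29130),
    "O(3')": (0.11926, 0.31637, 0.13753),
}


def bond_vector(origin, target, cell=CELL):
    """Vector in angstroms from atom `origin` to atom `target`."""
    return tuple((t - o) * edge for o, t, edge in zip(ATOMS[origin], ATOMS[target], cell))


def angle_deg(atom_1, vertex, atom_2):
    """Angle atom_1-vertex-atom_2 in degrees."""
    u = bond_vector(vertex, atom_1)
    v = bond_vector(vertex, atom_2)
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / norm))


print(f"O(6)-P-O(8)  = {angle_deg('O(6)', 'P', 'O(8)'):.1f} deg")
print(f"O(7)-P-O(3') = {angle_deg('O(7)', 'P', chr(79) + chr(40) + '3' + chr(39) + chr(41)):.1f} deg")
```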
TABLE 5

Bond lengths and bond angles involving covalently bonded hydrogens†

Bond             Length (Å)
N(3)-HN(3)       1·112
O(7)-HO(7)       1·116
C(3')-H(3')      0·817
C(2')-H(2')      0·999
O(2')-HO(2')     1·033
C(1')-H(1')      0·943
C(4')-H(4')      1·042
C(5')-H(5')      1·186
C(5')-H'(5')     1·122
O(5')-HO(5')     1·001
C(6)-H(6)        0·922
C(5)-H(5)        0·967
N(4)-HN(4)       0·913
N(4)-HN(4')      1·011

Bond             Average (Å)
C(sp³)-H         1·018
C(sp²)-H         0·945
N-H              1·013
O-H              1·050

Angle                    Value (°)
P-O(7)-HO(7)             116·3
O(3')-C(3')-H(3')        105·9
C(2')-C(3')-H(3')        114·8
C(4')-C(3')-H(3')        115·7
C(3')-C(2')-H(2')        113·9
C(1')-C(2')-H(2')         97·2
O(2')-C(2')-H(2')        110·0
C(2')-O(2')-HO(2')       123·2
C(2')-C(1')-H(1')        115·0
O(1')-C(1')-H(1')        114·5
N(1)-C(1')-H(1')         100·8
C(3')-C(4')-H(4')        102·8
O(1')-C(4')-H(4')        114·6
C(5')-C(4')-H(4')        108·8
C(4')-C(5')-H(5')        103·3
O(5')-C(5')-H(5')        110·2
C(4')-C(5')-H'(5')       111·9
O(5')-C(5')-H'(5')       114·2
H(5')-C(5')-H'(5')       102·5
C(5')-O(5')-HO(5')        95·8
N(1)-C(6)-H(6)           121·3
C(5)-C(6)-H(6)           116·0
C(6)-C(5)-H(5)           115·9
C(4)-C(5)-H(5)           126·1
C(4)-N(4)-HN(4)          114·0
C(4)-N(4)-HN(4')         108·7
HN(4)-N(4)-HN(4')        136·4
C(2)-N(3)-HN(3)          122·6
C(4)-N(3)-HN(3)          111·4

† Average estimated standard deviations in C-H bond length and angles are 0·06 Å and 4° respectively.
3. Discussion

(a) Molecular configuration

The angle between the plane of the base and that of the sugar is 60·5°. The torsion angle φCN as defined by Donohue & Trueblood (1960) "is the angle formed by the trace of the plane of the base with the projection of the C(1')-O(1') bond of the furanose ring when viewed along the C(1')-N bond. This angle is taken as zero when O(1') is anti-planar to C(2) of the pyrimidine or purine ring, and positive angles are taken as those measured in a clockwise direction when viewing from C(1') to N." Because the base in nucleic acid constituents need not necessarily be planar and because C(1') can be significantly displaced from the least-squares plane of the base, we suggest the definition of φCN be modified as follows. The torsion angle φCN of the bond C(1')-N is the angle formed by the projection of C(1')-O(1') relative to the projection of N(1)-C(6) (in pyrimidine) and N(9)-C(8) (in purine) when viewed along C(1')-N. This angle is taken as zero when O(1') is anti-planar to C(2) of the pyrimidine or C(4) of the purine ring, and positive angles are taken as those measured in a clockwise direction when viewing from C(1') to N. It is seen that φCN in both these definitions will be the same if the atoms comprising the base are coplanar and C(1') lies in this plane. The torsion angle φCN, according to the new definition, for cytidylic acid is -42·1° and the conformation therefore is anti, as found in several other known natural and synthetic nucleic acid constituents.

The plane determined by P-O(3')-C(3') makes an angle of 71·9° with that of the sugar and 13·5° with that of the base. The planes O(1')-C(1')-C(2') and C(6)-N(1)-C(2) are at an angle of 76·0°.
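The modified definition of φCN is a torsion angle about the C(1')-N(1) bond, defined by the four atoms O(1'), C(1'), N(1) and C(6). A minimal sketch of such a calculation from the co-ordinates of Table 2(a) is given below. It uses the common atan2 form of the dihedral angle with an IUPAC-style sign, which is an assumption of this sketch: with that sign the angle comes out positive and close in magnitude to the value quoted above, whereas the sign quoted in the text follows the clockwise convention of the definition, which the sketch does not attempt to reproduce.

```python
import math

CELL = (8.778, 21.649, 6.847)  # a, b, c in angstroms


def to_cartesian(frac, cell=CELL):
    return tuple(f * edge for f, edge in zip(frac, cell))


def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees (range -180 to 180)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Fractional co-ordinates from Table 2(a).
o1p = to_cartesian((0.06387, 0.42731, 0.40293))   # O(1')
c1p = to_cartesian((0.16687, 0.44288, 0.25119))   # C(1')
n1 = to_cartesian((0.22500, 0.50558, 0.29079))    # N(1)
c6 = to_cartesian((0.26336, 0.52248, 0.47406))    # C(6)

torsion = dihedral_deg(o1p, c1p, n1, c6)
print(f"O(1')-C(1')-N(1)-C(6) dihedral = {torsion:.1f} deg")
```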
(b) Cytosine cation

The least-squares equation of the plane formed by the nonhydrogen atoms of the cytosine cation is:

$0.4559x - 0.1538y - 0.0968z + 1 = 0$

The departure from planarity is seen from the pronounced out-of-plane deviations of the atoms N(1), N(3), N(4) and C(5) as shown in Table 6. The equation of the least-squares plane defined by the six ring atoms is:

$0.5273x - 0.1656y - 0.1075z + 1 = 0$
TABLE 6

Deviations of the atoms from the cytosine plane and the pyrimidine ring

Atom     Displacement from          Displacement from
         pyrimidine ring plane (Å)  cytosine plane (Å)
N(1)      0·0223                     0·0459
C(2)     -0·0119                     0·0010
N(3)     -0·0014                    -0·0283
C(4)      0·0239                    -0·0110
C(5)     -0·0141                    -0·0370
C(6)     -0·0089                    -0·0027
O(2)     -0·0360†                   -0·0084
N(4)      0·1037†                    0·0405
C(1')    -0·0044†                    0·0508†
HN(3)     0·000†
H(5)      0·098†
H(6)     -0·065†

† Atoms excluded in the calculation of the least-squares plane.
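The displacements in Table 6 are signed distances from a plane written as Ax + By + Cz + 1 = 0. A minimal sketch of that calculation follows, assuming that x, y and z in the plane equations are co-ordinates in Å along a, b and c (an assumption that is consistent with the tabulated numbers). Because the published coefficients are rounded to four figures, agreement with Table 6 can only be expected to a few thousandths of an Å.

```python
import math

CELL = (8.778, 21.649, 6.847)                    # a, b, c in angstroms
RING_PLANE = (0.5273, -0.1656, -0.1075, 1.0)     # A, B, C, D of the six-atom ring plane


def signed_distance(frac, plane=RING_PLANE, cell=CELL):
    """Signed distance (angstroms) of an atom from the plane Ax + By + Cz + D = 0."""
    x, y, z = (f * edge for f, edge in zip(frac, cell))
    coef_a, coef_b, coef_c, coef_d = plane
    norm = math.sqrt(coef_a ** 2 + coef_b ** 2 + coef_c ** 2)
    return (coef_a * x + coef_b * y + coef_c * z + coef_d) / norm


n4 = (0.38438, 0.67854, 0.38896)   # N(4), Table 2(a)
o2 = (0.18955, 0.53474, -0.02972)  # O(2), Table 2(a)
print(f"N(4): {signed_distance(n4):+.3f} A")
print(f"O(2): {signed_distance(o2):+.3f} A")
```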
The deviations of N(1) and C(4) from this plane, though small, are significant and in the same direction, while O(2) and N(4) are very significantly displaced from the plane and in the opposite direction. The pyrimidine ring, therefore, has a shallow boat configuration. Atom C(1'), within the error of the determination, is in the plane of the ring, in contrast to the results obtained for some other nucleic acid constituents. The bonds C(4)-N(4) and C(2)-O(2) make angles of 4·5° and 1·6°, respectively, with the plane of the pyrimidine ring. During the refinement of the structure, the migration of HN(3) from O(6) to N(3) provides unequivocal evidence that the cytosine moiety is protonated. Angles within the pyrimidine ring are in the range 113 to 126°, markedly distorted from those of a regular hexagon. Molecular dimensions in the cytosine cation and the neutral base are compared in the following paper (Sundaralingam & Jensen, 1965). Surprisingly, the exocyclic C(4)-N(4) "single bond" is as short as the C(4)-N(3) "double bond" in the ring. It is noteworthy that the amino C-N linkage in the benzenoid derivative 2,5-dichloroaniline is 1·407 Å (Sakurai, Sundaralingam & Jeffrey, 1963). The shortening observed in the C(4)-N(4) bond length in the cytosine may be attributed primarily to two factors: first, the strong electron-attracting property of the heterocyclic ring, and second, the electron-releasing property of the NH₂ group attached to the base. The above factors seem to endow the C(4)-N(4) bond with double-bond character in excess of that found in a C-N bond adjacent to a double bond. In this context it may be noted that the C-N bond length in a peptide group is 1·32 Å and is also less than the expected value for a C-N linkage adjacent to a double bond (Pauling, 1960).
(c) Ribose

Spencer (1959) proposed that, in the furanose ring of the deoxyribose derivative, either C(2') or C(3') is displaced from the plane formed by the remaining four ring atoms. Recently, Sundaralingam (1965) has reviewed the conformations of the furanose ring in nucleic acids. Since there are few precise determinations of ribofuranose structures, it has not been possible to say with certainty that four atoms in the furanose ring lie in a plane. This refinement leads to results which indicate quite clearly that no four atoms in the ribofuranose ring lie within experimental error of a plane. This puckering probably alleviates further the nonbonded interactions between the ring substituents and the lone electron pairs on oxygen, which tend toward a staggered conformation about the ring bonds (Fig. 2). In Table 7 are listed the deviations of the atoms from the least-squares planes formed by all combinations of four atoms. The atoms marked with an asterisk were not used to define the plane. In cytidylic acid, C(2') is displaced 0·609 Å from the best least-squares plane of the remaining atoms, the root mean square (r.m.s.) displacement of these atoms being 0·026 Å.† Deviations of the individual atoms from the best four-atom plane range from about four to nine times their mean σ values, which indicates that the deviations are significant.

† The mode of puckering is called C(2')-endo, meaning that C(2') is displaced on the same side of the best 4-atom plane as C(5').

FIG. 2. Displacement of the ribose ring atoms and the substituents from the least-squares plane formed by the atoms C(1'), C(3'), C(4'), O(1').

TABLE 7

Displacement of atoms (Å) from least-squares planes formed by all combinations of four atoms in the D-ribose of cytidylic acid

                           Atom omitted from the plane
Atom      Mean σ (Å)   C(1')     C(2')     C(3')     C(4')     O(1')     Δ/σ (best plane)
C(1')     0·0047        0·435*   -0·020     0·105     0·212     0·140    4·3
C(2')     0·0047       -0·104    -0·609*   -0·061    -0·203    -0·232
C(3')     0·0047        0·163     0·018     0·558*    0·126     0·239    3·8
C(4')     0·0054       -0·166    -0·030     0·065     0·298*   -0·147    6·4
O(1')     0·0034        0·107     0·032    -0·109    -0·135    -0·103*   9·4
r.m.s. deviation of
the in-plane atoms      0·138     0·026     0·088     0·173     0·195

FIG. 3. Covalent bond lengths (Å) in cytidylic acid b.

Brown & Levy (1963 and personal communication) have described the puckering of the furanose ring in sucrose in terms of a twist angle about each ring single bond. Adopting their notation, the conformation angle φC(1')→C(2') of the bond C(1')→C(2') is the angle measured counterclockwise, formed by the projection of C(1')→O(1') relative to the projection of C(2')→C(3'), when viewed along C(1')→C(2'). Thus the conformational angles of the ribofuranose ring bonds in cytidylic acid are: φC(1')→O(1') = -19·8°, φC(2')→C(1') = 36·6°, φC(3')→C(2') = -38·8°, φC(4')→C(3') = 28·2° and φO(1')→C(4') = -5·7°.

The average ring C-C single bond length is 1·528 Å; this is close to the value 1·533 ± 0·003 Å observed by Bartell (1959) for a "normal" C-C single bond. But the exocyclic C(4')-C(5') bond length, 1·511 Å, is considerably less than the normal value. This may be due in part to a distortion of the angles around C(4') from the tetrahedral value, with a consequent change in the hybridization state of this carbon atom. The ring C-O bonds are unequal. C(4')-O(1') is 1·458 Å, which is significantly longer than the usually accepted standard value of 1·425 ± 0·007 Å (Venkateswarlu & Gordy, 1955) and is 0·04 Å longer than the C(1')-O(1') bond length of 1·418 Å. The C(2')-O(2') bond length of 1·402 Å associated with the puckered (endo) carbon atom has shortened by about 0·03 Å from the C(3')-O(3') bond, 1·431 Å. The latter bond is phosphorylated at the O(3') atom. The C(5')-O(5') bond is part of a primary alcohol group and is only slightly longer than the normal value. The angle at the ring oxygen atom is 110·3°, the average internal angle involving the carbon atoms adjacent to O(1') is 105·3°, and that of the carbon atoms at the meta position to O(1') is 101·5°. Exocyclic bond angles involving the C(2')-endo carbon atom, Fig. 4, are severely distorted from the tetrahedral value and are markedly greater than those involving C(3'). Further details of the furanose residue and the phosphate group will be discussed in the subsequent paper (Sundaralingam & Jensen, 1965).
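The averages quoted for the ribose ring can be checked by simple arithmetic on the entries of Table 4. The sketch below does this and also sums the five internal ring angles; the comparison with 540°, the angle sum of a planar pentagon, is added here only as an illustration of the puckering and is not a calculation made in the original text.

```python
from statistics import mean

# Ring C-C bond lengths (angstroms), Table 4(a).
ring_cc = {"C(3')-C(2')": 1.533, "C(2')-C(1')": 1.513, "C(4')-C(3')": 1.537}

# Internal ring angles (degrees), Table 4(b).
ring_angles = {
    "C(4')-O(1')-C(1')": 110.3,   # at the ring oxygen
    "C(2')-C(1')-O(1')": 105.9,   # carbons adjacent to O(1')
    "O(1')-C(4')-C(3')": 104.7,
    "C(4')-C(3')-C(2')": 102.2,   # carbons meta to O(1')
    "C(3')-C(2')-C(1')": 100.8,
}

mean_cc = mean(ring_cc.values())
mean_adjacent = mean([ring_angles["C(2')-C(1')-O(1')"], ring_angles["O(1')-C(4')-C(3')"]])
mean_meta = mean([ring_angles["C(4')-C(3')-C(2')"], ring_angles["C(3')-C(2')-C(1')"]])
angle_sum = sum(ring_angles.values())

print(f"mean ring C-C          = {mean_cc:.3f} A")
print(f"mean angle, adjacent C = {mean_adjacent:.1f} deg")
print(f"mean angle, meta C     = {mean_meta:.1f} deg")
print(f"sum of ring angles     = {angle_sum:.1f} deg (planar pentagon: 540)")
```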
FIG. 4. Bond angles (°) in cytidylic acid b.
(d) Hydrogen atoms

The bond lengths and angles involving covalently bonded hydrogen atoms, though less precise than for simpler structures, are, nevertheless, quite reasonable for a structure of this complexity. The average estimated standard deviations in the C-H bond lengths and the C-C-H bond angles are 0·06 Å and 4°, respectively. The angles C(2)-N(3)-HN(3), C(4)-N(3)-HN(3), C(5)-C(6)-H(6) and N(1)-C(6)-H(6) of 123°, 111°, 116° and 121° respectively are in surprisingly good agreement with the values 124°, 109°, 116° and 121° found for the corresponding angles in the 1-methylthymine : 9-methyladenine complex (Hoogsteen, 1963). The hydrogen atoms as they appear in a composite difference Fourier synthesis are shown in Fig. 5. This synthesis was computed including all observed reflections except those suffering appreciable extinction, which were given zero weight.
FIG. 5. Composite difference Fourier using all observed reflections, extinguished reflections omitted. Contours at intervals of 0·05 el. Å⁻³, beginning 0·1 el. Å⁻³.
It is to be noted that all thermal parameters for hydrogen atoms except that of HO(7) are relatively small and two are slightly negative. These have been listed in Table 3 only to indicate the behavior of such refinement in a molecule of this complexity. None of the thermal parameters, except that for HO(7), differs significantly from the average B = 1·1 Å². Neglecting HO(7), this value is less than the mean value of the nonhydrogen atom thermal parameter in this structure. Similar results have been found for a number of other compounds (Sundaralingam & Jensen, 1963, communication to 6th Int. Congr. Crystallography, p. A61). The low temperature factors for the hydrogen atoms are due to the use of scattering factors which were calculated for the free atom (McWeeny, 1951). It has now been shown that the electron density in the bonded atom is more localized than that in the free atom, and consequently the use of McWeeny scattering factors for hydrogen leads to the anomalously low values for the temperature factors (Jensen & Sundaralingam, 1964).

(e) Hydrogen bonds
The molecules of cytidylic acid in the crystal lattice are held together quite firmly by a three-dimensional network of hydrogen bonds (Fig. 6). There are three O-H···O and three N-H···O hydrogen bonds, all of them less than 3·0 Å (Table 8(a)).
FIG. 6.
The O-H···O bonds are between phosphate-phosphate, phosphate-ribose and ribose-ribose oxygen atoms, while the N-H···O bonds involve phosphate oxygens and base nitrogens. The shortest hydrogen bond in this structure is 2·532 Å and is between two phosphate oxygens, P-O-H···O-P. A similar distance, 2·525 Å, has been observed in adenylic acid, P-O-H···O-P. In general, hydrogen bonds between phosphoric acid derivatives tend to be shorter than those between carboxylate groups (Calleri & Speakman, 1963, communication to 6th Int. Congr. Crystallography, p. A61). The phosphate ester oxygen O(3'), the sugar ring oxygen O(1'), and the carbonyl oxygen

TABLE 8

(a) Hydrogen bond lengths (Å)
N-H ... O
H ... O
N(3)
0(6}(3)
2·783
H O (7 )
0(6}(2)
1·427
N(4}
0(7}(3)
2·983
H N(3)
N(4}
0(8}(10)
2·734
H N (4' )
0(6,(3) 0(7}(3)
2·080
H O (2 ' ) 0(6}
0(5'}(7)
1·761
H o(S')(9)
1·872
H N(4)
0(8}(IO)
1·881
O-H ... O Om
0(6)(2)
2·532
0(2'} 0(6}
0(5'}(7)
2·782
0(5')(9)
2·771
1·679
(b) The shorter van der Waals contacts
<
H ... H
3·oA
<
0 ... H
3·oA
HoW)
H N(4)(I)
2·802
0(2')
H N(3)(I)
2·945
H O (7 )
H(3·)(2)
0(2')
H O (7 )
HNWP) 0(3·)(2)
2·842
H O (7 }
H N(3)(2) H'(5'}(7)
2'779 2·404 2·707
H O (2 ' )
H(5',(7)
2·967
H O (2 ' )
1·843
H N (4' ) 0(2)
H(1')(5)
2·954
H(l')
HowP) H(5',(7)
2·942
0(7}
H'(5',(7)
2·199
H N ( 4' )
H'(5'}(IO)
2·461
0(3'}
H'(5',(7)
2·763
0(2'} 0(2)
Ho(S') (7) H(6)(7)
2·815
0(2)
H(5}(7)
2·840
0(2}
H(5',(10)
2·991
C ... H
<
3·3A
C(5) C(4)
H(2.}(I) H(2.)(I)
2·932
H O (7 ) C(5}
C(3')(2)
3·220
H(4')(5)
3·264
C(4)
H(4')(5)
2·938
C(2)
H u·}(5)
3·066
H O (2 ' )
0(5',(7)
2·736
(I)
3·132
I - x, 1 - y, z
(2) -1/2 + x, 1/2 - Y, -z (3) 1/2 - x, 1/2 + Y, -z (5) -x, 1 - y, z
H O (7 ) 0(7)
2·990
H(3·}(2)
2·943
0(8)(3)
2·359
<
0 .•• 0
2·870
3·3A
0(2.}(2)
Om
0 ... N
<
3·158 3·3A
0(2'}
N(3}(I)
3·029
0(2'}
N(4}(I) (I)
3·278
N(4)
0(6)(3)
3·243
°u'}
N(l}(5)
3·022
(7) (10)
+z + x, 1/2
x, y, 1
(9) -1/2
1/2 - z, 1/2
(II) -x, 1 - y, 1
- y, I - z
+ y,
+z
1 - z
O(2) are the only three oxygen atoms not participating in hydrogen bonding. The remaining five oxygen atoms are involved in two hydrogen bonds each, except O(2'), which forms only one donor hydrogen bond. The two phosphate oxygen atoms, O(6) and O(8), form two acceptor hydrogen bonds, while O(7) and O(5') atoms form an acceptor and a donor hydrogen bond.
(f) Intermolecular approaches

The shortest intermolecular contacts are shown in Table 8(b). The H···H interactions are all greater than 2·4 Å except HO(2')···HO(5') = 1·843 Å, which involves the hydrogen atom on O(2') and the hydrogen atom on O(5') of a neighboring molecule. This latter hydrogen atom is hydrogen-bonded to O(2'). The other approaches involving hydrogen atoms are reasonable considering the standard deviations in positional parameters of the hydrogen atoms. The dense molecular packing of the compound is exemplified by the O···O and O···N intermolecular interactions also listed in the above Table. In Table 9 are listed the values of Bi (r.m.s. displacements) and direction cosines Cia, Cib and Cic of the principal axes of the thermal vibration ellipsoids. The thermal motions of the atoms in the base moiety are greatest normal to the plane of the base and in the a axial direction. These values are about twice the thermal motion in the plane of the base. On an average, the vibrations along the length of the molecule, b axial direction, are smaller. Thermal anisotropy is most severe in the carbonyl oxygen.
TABLE 9

Magnitude and direction cosines of the principal axes of thermal vibration ellipsoid

Atom    Axis i   Bi      Cia       Cib       Cic
P       1        0·91    0·3374    0·9026    0·2673
        2        1·53    0·9026   -0·3909    0·1805
        3        1·35    0·2674    0·1804   -0·9466
O(6)    1        3·41   -0·0220   -0·3096    0·9506
        2        1·40   -0·1473    0·9415    0·3032
        3        1·93   -0·9888   -0·1333   -0·0663
O(7)    1        0·86    0·3827    0·5493    0·7429
        2        2·71    0·1936   -0·8339    0·5168
        3        1·93    0·9034   -0·0540   -0·4255
O(8)    1        1·60   -0·5139   -0·5680    0·6429
        2        2·87    0·8574   -0·3646    0·3632
        3        2·40   -0·0281   -0·7378   -0·6744
O(3')   1        2·67   -0·9223    0·1448    0·3584
        2        1·12    0·2233    0·9564    0·1881
        3        1·67   -0·3156    0·2535   -0·9144
O(2')   1        1·41   -0·6394    0·1299    0·7578
        2        2·47   -0·0989   -0·9913    0·0865
        3        2·22   -0·7625    0·0196   -0·6467
O(1')   1        3·39    0·3362    0·2631    0·9043
        2        1·38   -0·0624   -0·9418    0·3002
        3        1·81    0·9397   -0·1574   -0·3036
O(5')   1        2·09    0·3570   -0·0103    0·9341
        2        3·13   -0·8233   -0·4760    0·3094
        3        2·69    0·4414   -0·8794   -0·1784
O(2)    1        5·69   -0·9067    0·3454    0·2420
        2        1·66    0·2662    0·0236    0·9636
        3        2·76    0·3272    0·9381   -0·1134
N(1)    1        2·21   -0·9736    0·1682    0·1546
        2        1·38    0·2227    0·8498    0·4777
        3        1·43   -0·0510    0·4996   -0·8648
N(3)    1        2·45   -0·7204    0·4276    0·5461
        2        1·31   -0·4925   -0·8697    0·0312
        3        1·76   -0·4884    0·2465   -0·8371
N(4)    1        3·50    0·9837   -0·1783    0·0225
        2        1·49    0·0956    0·6250    0·7784
        3        2·23    0·1522    0·7600   -0·6318
C(3')   1        2·29   -0·5807   -0·2553    0·7731
        2        1·26    0·1295    0·9085    0·3973
        3        1·66   -0·8038    0·3308   -0·4945
C(2')   1        2·49    0·1975    0·1200    0·9729
        2        1·28   -0·9793    0·0686    0·1904
        3        1·49    0·0438    0·9904   -0·1310
C(1')   1        1·00    0·0907    0·9936    0·0669
        2        2·52   -0·9536    0·0672    0·2934
        3        1·97   -0·2870    0·0904   -0·9537
C(4')   1        1·43   -0·6676   -0·7371    0·1053
        2        2·56    0·5786   -0·4246    0·6964
        3        2·26    0·4685   -0·5258   -0·7099
C(5')   1        2·10   -0·4031   -0·6760    0·6169
        2        3·51    0·0674    0·6503    0·7567
        3        3·17   -0·9127    0·3466   -0·2166
C(6)    1        2·89   -0·6322    0·2732    0·7251
        2        1·38    0·2201   -0·8339    0·5061
        3        2·11   -0·7429   -0·4795   -0·4671
C(5)    1        3·25    0·9845   -0·1480    0·0942
        2        1·55   -0·1188   -0·1671    0·9788
        3        1·82   -0·1291   -0·9748   -0·1821
C(4)    1        1·22   -0·2704    0·3153    0·9097
        2        2·02    0·6666   -0·6204    0·4132
        3        1·69    0·6947    0·7181   -0·0424
C(2)    1        1·30   -0·0925   -0·8249    0·5577
        2        2·57   -0·9006    0·3082    0·3065
        3        2·11   -0·4247   -0·4739   -0·7714
Although this is probably primarily due to its terminal position, it may also be due in part to an inadequate scattering factor for oxygen. It is estimated that the thermal motion correction to the C(2)-O(2) bond is about 0·015 Å. Figure 7 shows an Fo synthesis calculated using the phases from the last refinement cycle. In Table 10 the observed and calculated structure factors appear with the final phase angles in millicycles.†
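The principal-axis description of Table 9 and the β values of Table 2(b) describe the same thermal ellipsoids, and the two can be cross-checked. For an orthorhombic cell the diagonal terms are related by B_jj = 4β_jj a_j², and the same quantity is recovered from the principal axes as the sum over i of B_i C_ij². The sketch below applies this to O(6); the relation is the standard one for orthogonal axes and is an assumption of the sketch rather than a formula given in the text, and agreement is limited by the two-decimal rounding of B_i.

```python
CELL = (8.778, 21.649, 6.847)                 # a, b, c in angstroms

# O(6): diagonal beta terms from Table 2(b) and principal axes from Table 9.
BETA_DIAG_O6 = (0.00620, 0.00085, 0.01716)    # beta11, beta22, beta33
PRINCIPAL_B_O6 = (3.41, 1.40, 1.93)           # B1, B2, B3
COSINES_O6 = (                                # (Cia, Cib, Cic) for axes i = 1, 2, 3
    (-0.0220, -0.3096, 0.9506),
    (-0.1473, 0.9415, 0.3032),
    (-0.9888, -0.1333, -0.0663),
)


def b_from_beta(beta_jj, axis_length):
    """Diagonal B term along a cell axis from the dimensionless beta term."""
    return 4.0 * beta_jj * axis_length ** 2


def b_from_principal(principal_b, cosines, j):
    """Diagonal B term along cell axis j rebuilt from the principal axes."""
    return sum(b_i * c_i[j] ** 2 for b_i, c_i in zip(principal_b, cosines))


for j, label in enumerate(("a", "b", "c")):
    from_beta = b_from_beta(BETA_DIAG_O6[j], CELL[j])
    from_axes = b_from_principal(PRINCIPAL_B_O6, COSINES_O6, j)
    print(f"B along {label}: {from_beta:.2f} (from beta), {from_axes:.2f} (from principal axes)")
```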
FIG. 7. Composite Fourier synthesis. Contours at every 1 el. Å⁻³. Phosphorus contours at every 5 el. Å⁻³, beginning 5 el. Å⁻³.

This investigation was supported by grant AM-3288 of the United States Public Health Service. We wish to thank the University of Washington Research Computer Laboratory for grants of unsupported computer time, and Drs J. Kraut, J. Stewart, D. High and R. Chastain for computer programs.

REFERENCES

Alver, E. & Furberg, S. (1957). Acta Chem. Scand. 11, 188.
Alver, E. & Furberg, S. (1959). Acta Chem. Scand. 13, 910.
Bartell, L. S. (1959). J. Amer. Chem. Soc. 81, 3497.
Berghuis, J., Haanappel, IJ. M., Potters, M., Loopstra, B. O., MacGillavry, C. H. & Veenendaal, A. L. (1955). Acta Cryst. 8, 478.
Brown, G. M. & Levy, H. A. (1963). Science, 141, 921.
Busing, W. R. & Levy, H. A. (1959). A Crystallographic Least Squares Refinement Program for the IBM-709. U.S. Atomic Energy Commission Publication ORNL 59-4-37.
Donohue, J. & Trueblood, K. N. (1960). J. Mol. Biol. 2, 363.
Freeman, A. J. & Watson, R. E. (1961). International Tables, vol. 3, p. 202. Birmingham: Kynoch Press.
Hoogsteen, K. (1963). Acta Cryst. 16, 907.
Hughes, E. W. (1941). J. Amer. Chem. Soc. 63, 1737.
Jensen, L. H. & Sundaralingam, M. (1964). Science, 145, 1185.
McWeeny, R. (1951). Acta Cryst. 4, 513.
Pauling, L. (1960). Nature of the Chemical Bond, 3rd ed., p. 282. Ithaca: Cornell University Press.
Sakurai, T., Sundaralingam, M. & Jeffrey, G. A. (1963). Acta Cryst. 16, 354.
Spencer, M. (1959). Acta Cryst. 12, 59.
Sundaralingam, M. (1965). J. Amer. Chem. Soc. 87, 599.
Sundaralingam, M. & Jensen, L. H. (1965). J. Mol. Biol. 13, 930.
Venkateswarlu, P. & Gordy, W. (1955). J. Chem. Phys. 23, 1200.

† EDITOR'S NOTE: This Table has been omitted to save space. The authors will make these data available on request.