Biochimica et Biophysica Acta, 1173(1993) 39-48
© 1992 Elsevier Science Publishers B.V. All rights reserved 0167-4781/92/$06.00
BBAEXP 92477
Structure and conformation of N4-hydroxycytosine and N4-hydroxy-5-fluorocytosine. A theoretical ab initio study
Andrzej Leś a,b, Ludwik Adamowicz b and Wojciech Rode c
a Department of Chemistry, University of Warsaw, Warsaw (Poland), b Department of Chemistry, The University of Arizona, Tucson (USA) and c Nencki Institute of Experimental Biology, Polish Academy of Sciences, Warsaw (Poland)
(Received 1 June 1992)
Key words: Ab initio calculation; Tautomerism; N4-Hydroxycytosine; Nucleic acid base analog; Base-pairing; Thymidylate synthase; Hydroxylamine mutagenesis

Optimal molecular geometries and molecular energies were obtained for N4-hydroxycytosine and its 5-fluoro congener by theoretical ab initio quantum mechanical calculations using the Self Consistent Field method corrected for electron correlation effects by second-order Many Body Perturbation Theory (SCF + MBPT(2)). The 6-31G** Gaussian basis set was employed. Several tautomeric and rotameric forms were considered. For N4-hydroxycytosine and N4-hydroxy-5-fluorocytosine the imino tautomer (in the conformation syn relative to the N3-nitrogen atom) appeared to be the most stable form. The imino tautomer of N4-hydroxycytosine in the anti rotameric form is less stable than the imino-syn form by 12.8 kJ mol-1. The 5-fluoro substituent raises the energy difference between the syn and anti rotamers to 38.5 kJ mol-1. The potential energy barrier for the syn-anti rotation in the imino form of N4-hydroxycytosine is estimated to be about 180 kJ mol-1. The results presented in this paper suggest that the syn-imino and anti-imino forms can be treated as two structural isomers that do not interconvert at temperatures relevant to biochemical conditions. The theoretical results also show that the amino tautomeric forms do not compete with the imino forms in the gas phase or in non-polar and weakly polar environments. In a polar environment (e.g., in aqueous solution), however, one may expect an increased population of the amino forms. Qualitatively, the results of the present study agree well with the available experimental and theoretical data for N4-hydroxycytosine and some of its derivatives. The implications of the present study are discussed in relation to the molecular mechanisms of mutagenesis caused by NH2OH and of enzyme (thymidylate synthase) inhibition by N4-hydroxydeoxycytidine monophosphate.
Correspondence to: L. Adamowicz, Department of Chemistry, The University of Arizona, Tucson, AZ 85721, USA.

Introduction

Analogues of the normal pyrimidine and purine nucleic bases have attracted considerable attention in the biochemistry of DNA and RNA [1]. Particularly interesting is the biochemical response to modified nucleosides that contain various structural changes at the C5-position of the pyrimidine ring. There exists a large family of potent drugs related to 5-fluorodeoxyuridylate (FdUMP) used in the treatment of advanced solid tumors (e.g., breast, gastrointestinal, gynecological cancers) [2]. The interest in another pyrimidine nucleic base, N4-hydroxycytosine (oh4C), stems from the extraordinary properties of its mono- and triphosphate nucleotides in chemical mutagenesis and in the inhibition of nucleic acid biosynthesis. It is well known that a potent mutagenic agent, hydroxylamine (NH2OH), reacts specifically with the cytosine residues of DNA and RNA [3]. The product of this reaction, the oh4C nucleotide, behaves either as a cytosine (C) or as a uracil (U) residue, leading to a C → U (T or thymine) transition. In principle, such a change of the nucleotide sequence may lead to a modification of the genetic information, i.e., to a mutation. A detailed molecular mechanism of NH2OH mutagenesis is not known. Early concepts, following the Watson-Crick suggestion [4], were based on the assumption that mismatches are formed under the action of chemical mutagens. Mismatch formation does not necessarily lead to a mutation. More recent work emphasizes the role of the restriction enzymes in controlling the mutation level. In some extreme cases the errors introduced by chemical mutagens can be completely removed by the specialized SOS systems [5]. oh4C has also been used in a modern technique of bioengineering, site-directed mutagenesis. Flavell et al. [6] introduced a mutation at a specific place in the bacteriophage genome and subsequently cloned genes coding for rabbit β-globin. By means of NH2OH or, more precisely, by means of oh4C, they were able, in a sequence of experiments, to synthesize a DNA fragment that differed from the normal fragment by only one nucleotide.

Very recently Stone et al. [7] suggested a likely molecular mechanism of mismatch formation in self-complementary modified octanucleotides whose structures were consistent with the regular B-DNA duplex. Modified nucleic bases, N4-methoxycytosine (M or om4C) and the bicyclic dihydropyrimido[4,5-c][1,2]oxazin-7-one (P), were introduced in the position of a single thymine residue in the octanucleotide. An interesting feature of these analogs is that in M the conformation of the N4-methoxy residue is predominantly syn (Z-configuration) in free coils, while in P the anti form (E-configuration) is fixed. The formation of duplexes was monitored at different temperatures by means of two-dimensional spectroscopic techniques, such as double quantum filtered correlation spectroscopy (DQF-COSY) and phase-sensitive nuclear Overhauser effect spectroscopy (NOESY). It was shown that both oh4C analogues, M and P, are capable of forming Watson-Crick base-pairs with adenine (A), and that the A-P base-pair is isostructural with the A-T base-pair. This provided direct evidence that the N4-O residue of cytosine is present exclusively in the anti form in the complementary coils forming the duplex. We may then conclude that the anti form of oh4C is biologically active, at least in chemically induced mutagenesis. The transformation of free coils containing M-residues in the syn conformation into a duplex, where M-residues appear in the anti conformation, is a slow isomerization process proceeding via several intermediate steps. Stone et al. [7] suggested that the rotation about the C(4)-N4 bond in M-residues from the syn to the anti form is the rate-determining step in duplex formation.
It is worth noting that in another known promutagenic agent, N6-methoxyadenosine, the conformation of the N6-methoxy residue is syn relative to the imino N(1) ring nitrogen atom. Such a conformation facilitates the formation of a complementary base-pair with cytidine [8].

A close structural similarity of N4-hydroxycytosine to uracil is probably the driving force of the inhibitory action of N4OH-dCMP on thymidylate synthase (TS or EC 2.1.1.45). Inhibition is competitive with respect to dUMP and is time-dependent. It has been demonstrated that the inhibitory process is mediated by N5,10-methylenetetrahydrofolate (CH2FH4) [9-11], probably via the formation of the ternary complex N4OHdCMP-CH2FH4-enzyme [10]. Several analogs of N4OHdCMP exhibit slow-binding inhibitory action on TS enzymes extracted from different sources and characterized by different sensitivities towards 5-fluoro-dUMP inhibition. In a series of comparative studies with N4OH and N4-methoxy analogues of dCMP, 5-fluoro-dCMP and 5-methyl-dCMP, Rode et al. [11] pointed out a special role of the conformation of the N4OH residue in the molecular mechanism of inhibition. For example, the 5-methyl substituent, introduced into N4OHdCMP or into om4dCMP residues, caused a dramatic diminution of the inhibitory potency (10^4-fold). This effect was explained at the molecular level by a change in the conformation of the N4OH group, which is forced (by the 5-methyl substituent) to flip from the anti position (in the presumably active species) to the syn position. In further analogues, N4OH-5-fluoro-dCMP and N4OCH3-5-fluoro-dCMP, an increase of the inhibitory activity was observed (about 20-fold). In addition, comparative studies of the inhibition by N4OHdCMP and its 5-fluoro congener of TS forms differing in sensitivity towards FdUMP showed that lower sensitivity to FdUMP was associated with lower sensitivity to N4OHdCMP but not to N4OH-5-fluoro-dCMP inhibition. This observation strongly suggests an interplay between the substituents at the C(4) and C(5) positions in N4OH-5-fluoro-dCMP. To explain the latter enzymatic data, it was suggested that at the molecular level there exist interactions, via hydrogen bonds, between the N4OH and C(5)-F groups that shift the thermodynamic equilibrium between the syn and anti rotameric forms in such a way that the anti form is stabilized.

A first step towards understanding the biochemical behaviour of oh4C is an attempt to recognize its molecular structure, i.e., the topography of its chemical bonds. oh4C exists mainly in the imino tautomeric form (Fig. 1), which can be written as the oxime of uracil. The imino form was found in the crystalline state [12]. The results of simplified theoretical quantum mechanical calculations, corresponding to the gas phase, also pointed to a dominant imino form [13,14]. What remained unclear, however, was the conformation of the N4OH residue as well as the amount of the amino tautomeric form that can, in principle, be present. In this work we present the results of state-of-the-art quantum mechanical calculations on oh4C and oh4f5C in different tautomeric and rotameric forms. The implications of the results obtained here are discussed in relation to the molecular mechanisms of chemical mutagenesis caused by NH2OH and of the inhibition of the enzyme thymidylate synthase by N4OHdCMP.

Fig. 1. N4-Hydroxycytosine (oh4C) in the syn (Z-configuration) and anti (E-configuration) forms. [Structural formulas with atom labels: syn-amino, syn-imino, anti-amino and anti-imino N4-hydroxycytosine.]

Theoretical calculations
The aim of our calculations was to determine the total energies, i.e., the sums of the electronic energy, the nuclear repulsion energy and the zero-point energy of the vibrational nuclear motion, of the amino and imino tautomeric forms of oh4C and their 5-fluoro congeners. The 2-hydroxy tautomer (H13 located at O7, Fig. 1) has not been considered in the present work because our attention was focused on model compounds that simulate nucleotides with the N(1) nitrogen atom engaged in the glycosidic bond. Another 2-hydroxy tautomer, which may originate from the syn-imino form after transfer of the hydrogen atom H10 from the N3 to the O7 position, does not seem to compete with the syn-imino form. One may eliminate this 2-hydroxy tautomer by comparing the relative total energies of structurally similar uracil derivatives. For example, the syn-imino form of oh4C (Fig. 1) is structurally similar to the 2-oxo,4-oxo form of uracil, and the 2-hydroxy (H10 at O7) form of syn-imino oh4C is similar to the 2-hydroxy,4-oxo form of uracil. There exist experimental and theoretical studies [15,16] showing that the 2-hydroxy,4-oxo form of uracil is considerably less stable, by some 40-50 kJ/mol, than the normal 2-oxo,4-oxo form of uracil. We may then expect that its structural analogs will behave in a similar way. In order to validate this simple reasoning we performed an ab initio SCF geometry optimization with the standard 3-21G basis set for the 2-hydroxy (H10 at O7) syn-imino form of oh4C. This form is indeed unfavourable, by 117 kJ/mol, with respect to the syn-imino form (H10 at N3).

The search for the energy-optimal molecular structure was performed in two steps. In the first step, we performed the structure optimization by means of the SCF method with the standard 3-21G basis set. Next, the molecular structures were re-optimized with the SCF procedure and the 6-31G** basis set, which includes polarization functions on all the atoms.
The calculations were performed with the GAUSSIAN 90 program [17]. The geometries of the imino forms were assumed planar. No geometrical constraints were imposed on the amino forms because of their significant non-planarity, especially in the region of the N4OH group. The molecular geometry obtained at the SCF/6-31G** level for imino-syn oh4C was compared with the experimental values from the crystalline state of 1,5-dimethyl-N4-hydroxycytosine [12]. The bond lengths and bond angles corresponding to the pyrimidine ring and to the N4OH moiety are reproduced quite well (see Table A1 for the details). Although experimental data for the other congeners are not currently available, we believe that we obtained reasonable structures for them as well. In the next step we performed SCF/6-31G** calculations of the harmonic vibrational frequencies and, finally, single-point SCF + MBPT(2) calculations using the geometries optimized at the SCF/6-31G** level.

The above procedure may suffer from several small errors originating from various sources (e.g., approximate treatment of the molecular wavefunction, structure and nuclear vibrations). Not all of these errors can be precisely estimated, although the present approach is among the most advanced in the field of computational quantum chemistry. For example, one can obtain slightly different molecular structures in an SCF optimization and in an optimization performed with a higher-order method that accounts for electron correlation effects (e.g., SCF + MBPT(2)). This in turn may affect the calculated energy differences between the isomeric forms. In order to estimate the error resulting from using molecular geometries obtained at the non-correlated level, we performed a study on a tautomerizing model system, formamide (F)-formamidic acid (FA) (see Table A4 in the Appendix).
The molecular structures were optimized with the SCF/6-31G** and SCF + MBPT(2)/6-31G** methods (using analytical second derivatives with the MBPT method implemented in Gaussian 90). The SCF energies of the obtained molecular structures differ by 5.63 kJ/mol and 6.46 kJ/mol for formamide and formamidic acid, respectively. The corresponding SCF + MBPT(2) values are 5.28 kJ/mol and 5.91 kJ/mol. Fortunately, the relative tautomerization energy was much less affected: the SCF energy difference between formamide and formamidic acid is 53.13 kJ/mol and 53.97 kJ/mol for the SCF- and SCF + MBPT(2)-optimized geometries, respectively. For comparison, the relative tautomerization energy obtained with the SCF + MBPT(2) method is 52.80 kJ/mol (SCF geometry) and 52.17 kJ/mol (SCF + MBPT(2) geometry). From these results one can conclude that, although relying on the geometry optimization at the SCF level rather than at the more accurate SCF + MBPT(2) level may cause an error of a few kJ/mol in the individual energies, the error in the relative tautomerization energy is considerably smaller, of the order of tenths of a kJ/mol.

TABLE I
Theoretical ab initio calculations for N4-hydroxycytosine (oh4C) in the T1, T3, T5 and T7 tautomeric forms (see Fig. 2)

                                         T1            T3            T5            T7
Energy (a.u.)
  SCF/3-21G a                       -464.813924   -464.810601   -464.790623   -464.791833
  SCF/6-31G** b                     -467.437426   -467.433300   -467.421148   -467.423744
  MBPT(2) c                           -1.353234     -1.352349     -1.353498     -1.351360
  ZPE d                                0.100885      0.100739      0.100503      0.101134
  Total energy e                    -468.689774   -468.684910   -468.674143   -468.673970
Relative total energy f (kJ/mol)        0.00         12.77         41.05         41.49
Frequencies g (intensities)
  νOH                                3791 (171)    3788 (148)    3612 (118)    3742 (97)
  νN(1)H                             3519 (137)    3518 (143)    3499 (118)    3499 (117)
  νN(3)H                             3498 (103)    3490 (87)         -             -
  νN4H                                   -             -         3480 (67)     3447 (52)
Dipole moments h                        3.36          6.98          6.39          3.41

a SCF/3-21G//3-21G calculations.
b SCF/6-31G**//6-31G** calculations.
c MBPT(2)/6-31G**//6-31G** calculations.
d A common factor of 0.9 was used to scale down all harmonic frequencies obtained with the SCF/6-31G**//6-31G** method.
e SCF(6-31G**) + MBPT(2)(6-31G**) + 0.9*ZPE(6-31G**).
f 1 a.u. of energy = 2625.5 kJ/mol.
g Frequencies (scaled by 0.9) in cm-1, intensities in km mol-1 [38]; 1 a.u. of energy = 219476.45 cm-1.
h The dipole moment values, in Debyes, were obtained with the MP2 option available in the Gaussian 90 program [17].

Results
Total energies

The results of the ab initio calculations are presented in Table I (N4-hydroxycytosine) and in Table II (N4-hydroxy-5-fluorocytosine). The relative total energies of the various tautomeric and rotameric forms are also shown in Fig. 2. The most stable form of both N4-hydroxycytosine and N4-hydroxy-5-fluorocytosine is the imino-syn form. The next most stable form of N4-hydroxycytosine is the imino-anti form, which lies 12.77 kJ mol-1 above the imino-syn form. It is worth noting that the relative energies are already predicted qualitatively quite well by the SCF/3-21G values.
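To give a feeling for what the energy gaps in Table I would mean for populations, the following sketch converts a relative energy into a Boltzmann ratio. It is only an illustration under stated assumptions, not part of the original calculations: it assumes thermal equilibrium in the gas phase at 298.15 K and treats the relative total energies as free-energy differences (no entropic or thermal corrections beyond the ZPE). The function name `boltzmann_ratio` is introduced here for illustration.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def boltzmann_ratio(delta_e_kj_mol, temperature_k=298.15):
    """Equilibrium population of a higher-energy form relative to the reference.

    Assumption: the energy difference is used in place of a free-energy difference.
    """
    return math.exp(-delta_e_kj_mol / (R_KJ * temperature_k))


# Relative total energies from Table I (kJ/mol), with imino-syn (T1) as reference.
relative_energies = {
    "T3 (imino-anti)": 12.77,
    "T5 (amino-syn)": 41.05,
    "T7 (amino-anti)": 41.49,
}

ratios = {name: boltzmann_ratio(de) for name, de in relative_energies.items()}
for name, ratio in ratios.items():
    print(f"{name}: population relative to T1 = {ratio:.1e}")
```

Under these assumptions the amino forms would be populated at well below one part per million relative to the imino-syn form. The T3 entry is hypothetical in a further sense: it presumes that the syn and anti rotamers can equilibrate, which the barrier estimate reported in this paper argues against.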
TABLE II
Theoretical ab initio calculations for N4-hydroxy-5-fluorocytosine (oh4f5C) in the T2, T4, T6 and T8 tautomeric forms (see Fig. 2)

                                         T2            T4            T6            T8
Energy (a.u.)
  SCF/3-21G a                       -563.122552   -563.105322   -563.102126   -563.099148
  SCF/6-31G** b                     -566.271677   -566.254868   -566.256150   -566.254941
  MBPT(2) c                           -1.515117     -1.516875     -1.516380     -1.515006
  ZPE d                                0.093133      0.092743      0.092573      0.093352
  Total energy e                    -567.693661   -567.679000   -567.679957   -567.676595
Relative total energy f (kJ/mol)        0.00         38.49         35.98         44.81
Frequencies g (intensities)
  νOH                                3786 (190)    3745 (65)     3626 (113)    3751 (104)
  νN(1)H                             3522 (147)    3520 (161)    3503 (55)     3500 (127)
  νN(3)H                             3495 (106)    3485 (101)        -             -
  νN4H                                   -             -         3501 (181)    3442 (51)
Dipole moments h                        2.07          5.22          5.61          5.12

a-h See footnotes to Table I.
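The bookkeeping behind the "Total energy" and "Relative total energy" rows of Tables I and II can be retraced with a few lines of code. The sketch below is a consistency check, not the authors' procedure: it assumes that the tabulated ZPE values are already scaled (the plain sum SCF + MBPT(2) + ZPE reproduces the tabulated totals) and uses the conversion factor 2625.5 kJ/mol per a.u. quoted in footnote f. The names `components`, `total_energy` and `relative_energies` are introduced here for illustration.

```python
HARTREE_TO_KJ_MOL = 2625.5  # conversion factor quoted in footnote f of Table I

# (SCF/6-31G**, MBPT(2), ZPE) in a.u., copied from Tables I and II.
components = {
    "T1": (-467.437426, -1.353234, 0.100885),
    "T3": (-467.433300, -1.352349, 0.100739),
    "T5": (-467.421148, -1.353498, 0.100503),
    "T7": (-467.423744, -1.351360, 0.101134),
    "T2": (-566.271677, -1.515117, 0.093133),
    "T4": (-566.254868, -1.516875, 0.092743),
    "T6": (-566.256150, -1.516380, 0.092573),
    "T8": (-566.254941, -1.515006, 0.093352),
}


def total_energy(scf, mbpt2, zpe):
    """Total energy in a.u.; the ZPE is assumed to be already scaled."""
    return scf + mbpt2 + zpe


def relative_energies(names, reference):
    """Energies in kJ/mol relative to the reference form."""
    e_ref = total_energy(*components[reference])
    return {
        name: (total_energy(*components[name]) - e_ref) * HARTREE_TO_KJ_MOL
        for name in names
    }


oh4c = relative_energies(["T1", "T3", "T5", "T7"], reference="T1")
oh4f5c = relative_energies(["T2", "T4", "T6", "T8"], reference="T2")

for name, value in {**oh4c, **oh4f5c}.items():
    print(f"{name}: {value:6.2f} kJ/mol")
```

The recomputed values agree with the tabulated relative energies to within the rounding of the six-decimal entries.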
In order to estimate the barrier height for the rotation of the OH residue about the C(4)=N4 double bond, we first performed a geometry optimization with the SCF/3-21G method, fixing only the dihedral N(3)-C(4)-N(8)-O(9) angle at 90 degrees. When the optimization was completed we calculated the harmonic frequencies. One of the frequencies was imaginary (reported as negative), indicating that the structure corresponds to a transition state (Fig. 3). Subsequently, a single-point SCF + MBPT(2) calculation was performed with the 6-31G** basis set on the transition structure. The energy of the transition state is 184.5 kJ/mol higher than the energy of the non-rotated syn-imino (T1) form of oh4C. We may then conclude that the rotation barrier is high enough to prevent the syn-anti flip at any temperature relevant to biochemical processes.
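The conclusion that a barrier of this height blocks the syn-anti flip can be illustrated with a rough rate estimate. The sketch below applies the Eyring expression k = (kB T / h) exp(-ΔE / RT) at 310 K. This is an assumption made here for illustration only: the paper reports an energy barrier, not an activation free energy, and no rate constants; entropic and tunnelling effects are ignored. The function name `eyring_rate` is hypothetical.

```python
import math

K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J s
R = 8.314462618       # gas constant, J/(mol K)
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def eyring_rate(barrier_kj_mol, temperature_k):
    """First-order rate constant (s^-1), using the barrier as the activation free energy."""
    prefactor = K_B * temperature_k / H
    return prefactor * math.exp(-barrier_kj_mol * 1000.0 / (R * temperature_k))


barrier_kj_mol = 184.5   # transition-state energy relative to T1, from the text above
temperature_k = 310.0    # roughly physiological temperature

rate = eyring_rate(barrier_kj_mol, temperature_k)
half_life_years = math.log(2) / rate / SECONDS_PER_YEAR

print(f"rate constant: {rate:.1e} s^-1")
print(f"half-life:     {half_life_years:.1e} years")
```

Within this crude model the half-life for thermal syn-anti interconversion exceeds any biologically relevant time scale by many orders of magnitude, which is consistent with treating the two rotamers as separate isomers.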
[Figure 2: energy-level diagrams with structural formulas of the tautomeric and rotameric forms; vertical axis ΔE in kJ mol-1.]

Fig. 2. Total energy differences for various tautomeric forms of N4-hydroxycytosine (upper panel) and N4-hydroxy-5-fluorocytosine (lower panel). Energies of the T1 and T2 forms are taken as the reference values (see Tables I and II).
Saturated C(5)-C(6) bond

We have also considered whether saturation of the C(5)-C(6) bond can reduce the energy difference between the syn-imino and syn-amino forms of oh4C. Such a bond saturation may occur in the ternary complex TS-(N4-hydroxy-dCMP)-CH2FH4 [10], whose molecular structure should be similar to the structure recently suggested for the TS-dUMP-CH2FH4 complex [18]. The present ab initio SCF + MBPT(2)/6-31G**//6-31G** calculations have shown that the imino-syn/amino-syn total energy difference in oh4C with a saturated (single) C(5)-C(6) bond increases to 74.4 kJ/mol. We therefore conclude that saturation of the C(5)-C(6) bond will not facilitate an imino → amino transformation.
Fig. 3. Stereo-view of the N4-hydroxycytosine (oh4C) transition state. The chemical bonds are marked by the thick solid lines. Thin lines are added to make the transition structure more transparent. Circles of increasing diameter denote hydrogen (smallest), carbon, nitrogen and oxygen (largest) atoms.
Intramolecular hydrogen bond

The appearance of an intramolecular hydrogen bond in the imino-syn form was postulated a long time ago [12]. The interatomic H10···O9 distance of 2.17 Å appeared to be smaller than the sum of the atomic van der Waals radii (2.4 Å). In the present work we show that the H10···O9 distance indeed equals 2.178 Å, but two other criteria of hydrogen bond formation are not fulfilled. In particular, we do not observe any significant changes either in the net charges (not shown, but available upon request from the authors) or in the frequencies of the ν(N3-H) and ν(O9-H) vibrational modes. On the other hand, based on the difference between the ν(O9-H) frequencies of the T5 and T7 forms presented in Table I, and on the relatively short calculated interatomic H14···N3 distance of 2.016 Å, we argue that an internal (intramolecular) hydrogen bond can be formed in the amino-syn forms.
The role of the F5 substituent

At first sight the results of the calculations for the F5-congener are rather unexpected. Particularly surprising is the low stability of the imino-anti form in comparison with the imino-syn form. In order to shed more light on this problem we compared the atomization energies of all the compounds. The atomization energy is defined as the difference between the total energy of a given molecule (see Tables I and II, entries corresponding to the SCF/6-31G** calculations) and the sum of the atomic energies:

Atomization energy = E(molecule) - ΣE(atoms)

This difference represents the internal bonding effect upon molecule formation. Using the SCF energies of the H, C, N, O and F atoms (-0.500, -37.689, -54.401, -74.809 and -99.409 a.u., respectively [16]) we found that the atomization energy of oh4C is more favourable than that of oh4f5C by about 200 kJ mol-1. In other words, the fluorine substituent at C5 weakens the internal bonding effect in oh4C. Most of the atomization energy difference originates from the replacement of the C5-H bond by the weaker C5-F bond. All the remaining geometrical parameters (i.e., bond lengths and bond angles) are also somewhat affected by this replacement (see Table A2 in the Appendix for the details). This, in turn, leads to considerable changes in the calculated IR spectra of oh4C and oh4f5C (see Table A3 in the Appendix). Incidentally, the frequency of each normal mode in oh4C is larger than that of the corresponding mode in oh4f5C, except for three modes (denoted as 5, 34 and 35, Table A2), which are nearly equal. This is reflected in the ZPE, which is smaller (less positive) in oh4f5C than in the non-substituted oh4C. As a consequence of the geometry modifications due to the C5-substituent one obtains significant changes in the relative total energies of the various tautomers and rotamers. The stabilization brought by the internal hydrogen bond in the imino-anti form of oh4f5C is by far smaller than the loss of internal bonding energy due to the F5-substituent.

In principle, the fluorine atom can attract nucleophilic agents. Therefore, it has been suggested [11] that the fluorine atom can be involved not only in intermolecular hydrogen bonds but can also interact with some of the hydrogen atoms present in the parent molecule (internal hydrogen bonds). From the present ab initio calculations it appears that internal hydrogen bonds involving fluorine are not formed. Clear evidence of this is presented in Fig. 2, where the T11 tautomeric form (no internal hydrogen bonding involving fluorine) is shown to have a total energy more favourable by 13 kJ/mol than that of the T4 tautomer (a putative internal hydrogen bond is marked by a dotted line in Fig. 2). The problem of fluorine intermolecular hydrogen bonds deserves further study. It is worth noting that in several 2'-deoxy-2'-fluoronucleosides the fluorine atoms also do not participate in any hydrogen bonding interactions [20].

Another C(5)-substituent, the methyl group, may have an effect opposite to that of the fluorine atom. When comparing the atomization energies of cytosine [21] and 5-methylcytosine [22] one finds that the internal bonding energy of cytosine is higher (less stable) than that of 5-methylcytosine by about 100 kJ/mol. Methylated cytosine thus appears to be more tightly bound than cytosine itself.
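The atomization-energy comparison described above can be retraced numerically. The sketch below uses only numbers quoted in the text and tables: the SCF/6-31G** energies of the imino-syn forms (T1 and T2) and the atomic SCF energies listed above. The molecular compositions (C4H5N3O2 for oh4C and C4H4FN3O2 for oh4f5C) follow from the structures in Fig. 1; the function and variable names are introduced here for illustration.

```python
HARTREE_TO_KJ_MOL = 2625.5  # conversion factor quoted in footnote f of Table I

# Atomic SCF energies (a.u.) as quoted in the text.
ATOM_ENERGY = {"H": -0.500, "C": -37.689, "N": -54.401, "O": -74.809, "F": -99.409}

# Composition and SCF/6-31G** energy (a.u.) of the imino-syn forms, Tables I and II.
molecules = {
    "oh4C (T1)":   ({"C": 4, "H": 5, "N": 3, "O": 2}, -467.437426),
    "oh4f5C (T2)": ({"C": 4, "H": 4, "N": 3, "O": 2, "F": 1}, -566.271677),
}


def atomization_energy(composition, e_molecule):
    """E(molecule) minus the sum of atomic energies, in a.u. (negative means bound)."""
    e_atoms = sum(ATOM_ENERGY[element] * count for element, count in composition.items())
    return e_molecule - e_atoms


atomization = {name: atomization_energy(*data) for name, data in molecules.items()}
difference_kj_mol = (
    atomization["oh4C (T1)"] - atomization["oh4f5C (T2)"]
) * HARTREE_TO_KJ_MOL

for name, value in atomization.items():
    print(f"{name}: {value:.6f} a.u.")
print(f"oh4C minus oh4f5C: {difference_kj_mol:.1f} kJ/mol")
```

The difference comes out close to -196 kJ/mol, i.e., the atomization energy of oh4C is more favourable by roughly 200 kJ/mol, in line with the value stated above.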
The solvent effect

It is clear from Tables I and II that in a non-polar or weakly polar environment oh4C and oh4f5C will appear in the syn-imino form. In a polar environment, however, the situation may become more complicated. The amino forms have relatively high dipole moments, which suggests the possibility of strong interactions with the polar molecules of the surrounding medium. At present it seems rather difficult to calculate the solvation energy quantitatively. A more detailed molecular simulation study (e.g., Molecular Dynamics or Monte Carlo) would be necessary to obtain a reasonable estimate of the solvent effect on the molecular structure adopted by oh4C and its derivatives. However, based on the calculated dipole moments (see Tables I and II) we may expect that a polar environment should stabilize the amino forms more strongly than the imino forms.

Discussion
In the present work we argue that the imino-syn form does not flip spontaneously into the imino-anti form because of a high potential barrier. A similar conclusion was reached earlier on the basis of combined spectroscopic (UV, CD, proton NMR) and semiempirical quantum mechanical CNDO/S studies [23,24] on some methylated derivatives of oh4C. In order to explain the general patterns of the experimental spectra, it was assumed that the samples investigated contain two components corresponding to two geometrical isomers (syn, anti). Since the spontaneous flip of the syn form into the anti form was excluded, Morozov et al. [23,24] argued that such an isomerization can occur via a cationic (protonated) intermediate, especially at low pH. In neutral solutions (pH ~ 7), however, the formation of the putative cationic intermediate does not seem to be possible. Moreover, the rotation of the amino-like N4(OH)(H) group from the syn to the anti position must be strongly hindered, because the apparently single C(4)-N4 bond is, in fact, conjugated with the ring system. Based on the calculated values of the Mulliken overlap populations (a by-product of the present SCF calculations) we may say that the 'single' C(4)-N4 bond in the T5 tautomer (amino-syn) is stronger than the N(1)-C(2) or N(1)-C(6) bonds in the aromatic ring. An additional argument supporting the hypothesis of hindered rotation about the C(4)-N4 bond in the cationic form comes from the comparison of the bond lengths in 1-methyl-N4-hydroxycytosine hydrochloride crystals [25]. In the protonated oh4C residues the C(4)-N4 bond length (1.30 Å) is bracketed by the bond lengths corresponding to the double bond in the imino form (1.29 Å) [12] and to the single bond, as in cytosine (1.32-1.33 Å) [26,27]. It is worth noting that the syn-anti transformation can occur efficiently upon UV irradiation (λ = 313 nm) via the lowest-lying triplet excited state [23].
A natural consequence of our hypothesis that in the ground state the syn form does not flip into the anti form is the possibility of a chemical separation of the two forms. Here we suggest a modified scheme for the synthesis of oh4C leading to the imino-anti form. Starting with 4-thiouracil, we suggest first protecting the N(3)H group with an easily removable substituent. Reaction with hydroxylamine then gives the N(3)-substituted derivative of oh4C. Finally, the imino-anti form can be obtained by removing the N(3)-substituent.

Our theoretical prediction that the amino tautomer does not compete with the imino form of oh4C may lead to a re-interpretation of the contemporary molecular models of guanine-oh4C base pairing. It has been known since 1974 [6] that oh4C pairs either with guanine or with adenine. Recently, Anand et al. [28] have shown that om4C is the only base analogue attaining truly degenerate pairing with either guanine or adenine. Anand et al. have also shown that the stability of the (17-mer) duplex (reflected by the duplex-to-coil transition temperatures, Tm) decreases in the order G-C, G-om4C, G-om4m5C [28]. Such an ambivalent behaviour of oh4C played a crucial role in the explanation, at the molecular level, of the mutagenic activity of oh4C. The amino-imino tautomeric rearrangement was then the essence of the molecular model of oh4C mutagenicity. In the present work we argue that the ambivalent behaviour of oh4C, resembling either uracil or cytosine, should not necessarily lead to the conclusion that oh4C must appear in two tautomeric forms, i.e., in the imino form when pairing with adenine and in the amino form when pairing with guanine. In view of the results gathered in the present work the appearance of oh4C in the amino form is questionable. Moreover, one may argue that guanine can form a complementary base pair with oh4C in the imino form. At least two possibilities can be taken into account. Guanine and oh4C may form a wobble base pair in the way suggested by Brown et al. [7]. In this case only two hydrogen bonds are formed instead of three. Another interesting possibility is a potential tautomerism of the guanine residue, i.e., a proton transfer from the N(1) atom to the O(6) atom induced by the electric field of the complementary oh4C nucleic base. Such a tautomerism of guanine has been observed experimentally and confirmed by theoretical ab initio calculations [29-33] as well as by semiempirical calculations [34]. In particular, it was found that the oxo and hydroxy tautomeric forms of guanine have nearly equal energies. The path length for the proton movement is rather short, and this geometrical factor will certainly promote the suggested tautomeric rearrangement of guanine. A direct consequence of the appearance of oh4C in the imino form in modified oligonucleotides forming duplexes should be the presence of imino proton resonances (at about 12-14 ppm) corresponding to the N(3)-H protons of the oh4C residue that is paired with guanine.
In fact, imino proton NMR resonances have recently been recorded in the 8-mer GM (including a G-om4C base pair) [35]. The suggested interpretation [35] of the imino proton spectra does not strictly coincide with our results. Nedderman et al. [35] have argued that a slow exchange takes place between Watson-Crick (amino form of om4C assumed) and wobble (imino form of om4C assumed) forms. A similar imino proton resonance has been attributed to the N(3)-H residue in the 8-mer AM and 8-mer AP duplexes by Stone et al. [7]. In our opinion, the suggested geometry of the G(hydroxy)···oh4C (imino-anti) base-pair can be verified spectroscopically, e.g., by two-dimensional NOESY, and we would like to suggest such experiments here. There is also the possibility of using nitrogen magnetic resonance spectroscopy. Our preliminary ab initio SCF calculations [36,37] of the chemical shielding have shown that the nitrogen shielding of the N4OH residue is particularly sensitive to the configuration (syn, anti) adopted by the oh4C molecule.

The low stability of the imino-anti form of oh4f5C that results from the present ab initio study can be used in the re-interpretation of the experimental results [11] showing a potentiation of thymidylate synthase inhibition by N4OHdCMP due to the 5-fluoro substituent. Formerly, Rode et al. suggested that this potentiation can be explained in terms of the hydrogen bond F5···HO-N4 stabilizing the imino-anti rotamer in comparison to the imino-syn form. The present ab initio study clearly shows that the imino-anti form is by far less stable than the imino-syn form. The intramolecular hydrogen bond present in the imino-anti form seems to be too weak to improve the internal stability of oh4f5C significantly. The evident interplay between the C(4)- and C(5)-substituents observed experimentally should, therefore, be explained by assuming a slightly different molecular mechanism, most probably one taking into account a strong relaxation of the inhibitor structure by the 5-fluoro and 5-methyl substituents. In view of our hypothesis, the role of F5-substitution is slightly different from what was formerly assumed [11]. Instead of the expected stabilization of the anti form via the intramolecular hydrogen bond towards the N4OH residue, the fluorine substitution probably increases the population of the anti form during the synthesis of oh4f5C, i.e., before any enzymatic reaction takes place.

Conclusions
Several conclusions can be drawn from the present paper: N4-Hydroxycytosine appears in the imino tautomeric form. The amino tautomeric form does not compete with the imino form. The N4OH-residue of the imino tautomeric form can appear in two isomeric forms, syn and anti, relative to the N(3)-nitrogen atom. The imino-syn form is more stable than the imino-anti form. The rotation of the N4OH residue around the C(4)=N4 double bond should not occur. The biochemical role of F5-substitution in N4-hydroxycytosine is likely to be different from that formerly assumed. The F5-substituent does not stabilize the anti form of N4-hydroxycytosine.
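The statement that rotation around the C(4)=N4 double bond should not occur can be made more tangible with a rough rate estimate. The Python sketch below applies the Eyring equation to the barrier of about 180 kJ/mol estimated in this paper. It rests on several assumptions that are not part of the original work: the potential energy barrier is used in place of a free energy of activation, the transmission coefficient is taken as 1, and 310 K is chosen as a representative physiological temperature. The function and variable names are illustrative only.

```python
import math

# Physical constants (SI units)
K_B = 1.380649e-23   # Boltzmann constant, J/K
H = 6.62607015e-34   # Planck constant, J s
R = 8.314462618      # gas constant, J/(mol K)

def eyring_rate(barrier_kj_per_mol, temperature_k):
    """First-order rate constant (s^-1) from the Eyring equation.

    Assumes a transmission coefficient of 1 and treats the supplied
    barrier as the free energy of activation (an approximation).
    """
    exponent = -barrier_kj_per_mol * 1000.0 / (R * temperature_k)
    return (K_B * temperature_k / H) * math.exp(exponent)

BARRIER_KJ_PER_MOL = 180.0   # estimated syn-anti barrier quoted in this paper
TEMPERATURE_K = 310.0        # assumed temperature, not taken from the paper

rate = eyring_rate(BARRIER_KJ_PER_MOL, TEMPERATURE_K)
seconds_per_year = 365.25 * 24 * 3600
half_life_years = math.log(2) / rate / seconds_per_year

print(f"k = {rate:.2e} s^-1")
print(f"half-life = {half_life_years:.2e} years")
```

Under these assumptions the estimated half-life for the uncatalysed syn-anti interconversion is far longer than any biochemically relevant time scale, which is consistent with treating the two rotamers as distinct isomers.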
Acknowledgements
We would like to thank Prof. David Shugar for vigorous discussions, for calling our attention to numerous problems of the spectroscopy of oh4C and related molecules, and for his continued interest in this work. This work was supported in part by the American Cancer Society through a Junior Faculty Research Award to Ludwik Adamowicz. Andrzej Leś was partly supported by the Polish Committee for Scientific Research within the project KBN CHEM-BST 412/23.
Appendix
TABLE A1
Comparison of the theoretical ab initio SCF/6-31G** geometry of N4-hydroxycytosine in the T1 tautomeric form (see Figs. 1 and 2) with the crystal structure of 1,5-dimethyl-N4-hydroxycytosine [12]

                      ab initio   crystal [12]
Bond lengths (Å)
1-2                   1.370       1.362
2-3                   1.368       1.367
3-4                   1.384       1.387
4-5                   1.458       1.446
5-6                   1.324       1.330
1-6                   1.381       1.381
2-7                   1.195       1.222
4-8                   1.262       1.288
8-9                   1.386       1.416
9-14                  0.942       0.81
9···10                2.178       2.17
3···9                 2.528       2.484
Bond angles (degrees)
1-2-3                 114.4       115.6
2-3-4                 126.2       125.7
3-4-5                 115.7       116.2
4-5-6                 118.6       117.1
1-2-7                 122.6       122.1
3-4-8                 123.7       122.4
5-4-8                 120.7       121.4
4-8-9                 110.9       108.4
TABLE A2
Comparison of the theoretical ab initio SCF/6-31G** geometries of N4-hydroxycytosine (T1 form) and N4-hydroxy-5-fluorocytosine (T2 form) in the syn-imino tautomeric form (see Figs. 1 and 2)

                      T1          T2
Bond lengths (Å)
1-2                   1.3703      1.3667
1-6                   1.3806      1.3866
1-13                  0.9932      0.9930
2-3                   1.3680      1.3720
2-7                   1.1953      1.1947
3-4                   1.3842      1.3829
3-10                  0.9954      0.9957
4-5                   1.4585      1.4599
4-8                   1.2618      1.2581
5-6                   1.3244      1.3173
5-11                  1.0700      1.3210
6-12                  1.0728      1.0715
8-9                   1.3864      1.3809
9-14                  0.9418      0.9423
Bond angles (degrees)
1-2-3                 114.38      114.33
2-3-4                 126.20      126.74
3-4-5                 115.66      114.25
4-5-6                 118.63      120.49
1-2-7                 122.62      122.99
3-2-7                 123.00      122.68
3-4-8                 123.67      124.71
5-4-8                 120.66      121.04
4-8-9                 110.92      110.80
2-3-10                115.86      115.76
4-3-10                117.94      117.49
4-5-11                118.81      117.47
6-5-11                122.56      122.03
5-6-12                122.78      122.21
2-1-13                115.42      115.70
8-9-14                104.17      104.26

TABLE A3
Comparison of the theoretical ab initio SCF/6-31G** frequencies (ν, not scaled, cm-1) and infra-red intensities (A, km mol-1) of N4-hydroxycytosine (T1 form) and N4-hydroxy-5-fluorocytosine (T2 form) in the syn-imino tautomeric form (see Fig. 2)

Normal mode a    T1 ν     T1 A     T2 ν     T2 A
 1               130      0.4      107      1.0
 2               162      0.7      143      1.2
 3               242      3.4      222      1.3
 4               276      41.0     249      0.6
 5               301      134.8    319      148.2
 6               473      29.8     340      2.5
 7               493      37.7     368      49.6
 8               549      51.7     474      29.8
 9               556      5.7      510      45.5
10               613      5.7      525      30.6
11               664      17.4     526      7.2
12               700      109.9    622      2.9
13               764      0.1      667      3.5
14               845      1.6      684      105.3
15               853      156.3    755      24.6
16               870      20.8     843      3.4
17               1065     36.5     849      92.4
18               1077     4.3      889      49.8
19               1094     0.0      999      37.1
20               1126     228.5    1076     5.6
21               1178     20.9     1135     272.2
22               1307     71.7     1243     31.5
23               1404     25.5     1347     191.0
24               1455     206.6    1390     160.3
25               1544     70.1     1463     120.1
26               1563     14.9     1521     146.0
27               1606     1.5      1564     1.9
28               1640     100.1    1612     37.1
29               1859     57.2     1661     11.5
30               1958     440.1    1921     51.4
31               2000     915.3    1975     234.9
32               3397     4.8      2001     936.1
33               3431     3.2      3419     4.6
34               3886     102.6    3884     106.6
35               3910     137.4    3913     147.4
36               4213     171.1    4206     189.7

a For comparison with the experimental values the theoretical frequencies should be scaled down by a factor of 0.9, which corrects in an approximate way for the deficiencies of the SCF calculations originating from the anharmonicity of nuclear vibrations and the lack of electron correlation effects [38].
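The scaling described in footnote a amounts to a single multiplication per mode. The short Python sketch below applies the 0.9 factor to a few of the unscaled T1 frequencies listed in Table A3. The selection of modes and all names are illustrative choices, not part of the original work.

```python
# Uniform scaling of unscaled SCF/6-31G** harmonic frequencies,
# following the 0.9 factor quoted in footnote a of Table A3.
SCALE_FACTOR = 0.9

# A few unscaled T1 frequencies (cm-1) copied from Table A3,
# keyed by normal mode number.
t1_unscaled = {29: 1859, 30: 1958, 31: 2000, 36: 4213}

def scale_frequencies(frequencies, factor=SCALE_FACTOR):
    """Return a new dict with every frequency multiplied by `factor`."""
    return {mode: round(nu * factor, 1) for mode, nu in frequencies.items()}

t1_scaled = scale_frequencies(t1_unscaled)

for mode in sorted(t1_scaled):
    print(f"mode {mode:2d}: {t1_unscaled[mode]:5d} -> {t1_scaled[mode]:7.1f} cm-1")
```

A single uniform factor is only an approximate correction; the sketch makes no attempt at mode-specific scaling.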
TABLE A4
Theoretical ab initio calculations for formamide (F) and formamidic acid (FA)

Total energy in a.u.
No.   Method                                   F              FA
1     SCF/(SCF-geom.)                          -168.940492    -168.920255
2     SCF + MBPT(2)/(SCF-geom.)                -169.430571    -169.410460
3     SCF/(SCF + MBPT(2)-geom.)                -168.938349    -168.917793
4     SCF + MBPT(2)/(SCF + MBPT(2)-geom.)      -169.432582    -169.412712

Total energy differences a in kJ/mol
Difference       Method            Value
F(3)-F(1)        SCF               5.63
FA(3)-FA(1)      SCF               6.46
F(2)-F(4)        SCF + MBPT(2)     5.28
FA(2)-FA(4)      SCF + MBPT(2)     5.91
FA(1)-F(1)       SCF               53.13
FA(3)-F(3)       SCF               53.97
FA(2)-F(2)       SCF + MBPT(2)     52.80
FA(4)-F(4)       SCF + MBPT(2)     52.17
Comparison values: 51.0 b, 49.4 b

a F(x), FA(x) denote the total energy of formamide and formamidic acid, respectively, calculated with the method x (x = 1, 2, 3, 4) described in the upper part of this table.
b For comparison, the results of a recent ab initio quantum mechanical study are given [39]. Wang et al. [39] optimized the molecular structures with the SCF method using a different Gaussian basis set ('correlation consistent' polarized valence double-zeta) and a different optimization procedure.
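The energy differences in the lower part of Table A4 follow directly from the total energies in the upper part. The Python sketch below recomputes them. The conversion factor of 2625.4996 kJ/mol per atomic unit of energy is a standard value assumed here, since the paper does not state which factor was used, and the data layout and function names are illustrative.

```python
# Recompute the Table A4 energy differences from the tabulated total energies.
HARTREE_TO_KJ_PER_MOL = 2625.4996  # assumed conversion factor (not stated in the paper)

# Total energies in a.u., keyed by species and method number (1-4) as in Table A4.
energies = {
    "F":  {1: -168.940492, 2: -169.430571, 3: -168.938349, 4: -169.432582},
    "FA": {1: -168.920255, 2: -169.410460, 3: -168.917793, 4: -169.412712},
}

def difference_kj(species_a, method_a, species_b, method_b):
    """Return E(species_a, method_a) - E(species_b, method_b) in kJ/mol."""
    delta = energies[species_a][method_a] - energies[species_b][method_b]
    return delta * HARTREE_TO_KJ_PER_MOL

# The eight differences, in the order listed in Table A4.
pairs = [
    ("F", 3, "F", 1), ("FA", 3, "FA", 1),
    ("F", 2, "F", 4), ("FA", 2, "FA", 4),
    ("FA", 1, "F", 1), ("FA", 3, "F", 3),
    ("FA", 2, "F", 2), ("FA", 4, "F", 4),
]

differences = {
    f"{a}({m})-{b}({n})": round(difference_kj(a, m, b, n), 2)
    for a, m, b, n in pairs
}

for label, value in differences.items():
    print(f"{label:12s} {value:6.2f} kJ/mol")
```

The first four entries measure the effect of the geometry used (SCF versus SCF + MBPT(2) optimized) at a fixed level of theory, and the last four give the formamidic acid minus formamide energy gap at each level.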
References
1 Shugar, D. and Psoda, A. (1990) in Landolt-Börnstein New Series: Biophysics of Nucleic Acids (Saenger, W., ed.), Vol. 7, pp. 308-348, Springer-Verlag, Berlin.
2 Novotny, L., Faghali, H., Janku, I. and Beranek, J. (1989) Cancer Chemother. Pharmacol. 24, 238.
3 Singer, B. and Kusmierek, J.T. (1982) Annu. Rev. Biochem. 52, 655.
4 Watson, J.D. and Crick, F.H.C. (1953) Nature 171, 737.
5 Little, J.W. and Mount, D.W. (1982) Cell 29, 11.
6 Flavell, R.A., Sabo, D.L., Bandle, E.F. and Weissman, C. (1974) J. Mol. Biol. 89, 255.
7 Stone, M.J., Nedderman, A.N.R., Williams, D.H., Lin, P.K.T. and Brown, D.M. (1991) J. Mol. Biol. 222, 711.
8 Stolarski, R., Kierdaszuk, B., Hagberg, C.E. and Shugar, D. (1987) Biochemistry 26, 4332.
9 Santi, D.V. and Danenberg, P.V. (1984) in Folates and Pterines (Blakley, R.L. and Benkovic, S.J., eds.), Vol. 1, pp. 345-398, Wiley, New York.
10 Goldstein, S., Pogolotti, Jr., A.L., Garvey, E.P. and Santi, D. (1984) J. Med. Chem. 27, 1259.
11 Rode, W., Zielinski, Z., Dzik, J.M., Kulikowski, T., Bretner, M., Kierdaszuk, B., Cieśla, J. and Shugar, D. (1990) Biochemistry 29, 10835.
12 Shugar, D., Huber, C.P. and Birnbaum, G.I. (1976) Biochim. Biophys. Acta 447, 274.
13 Kwiatkowski, J.S., Lesyng, B., Palmer, M.H. and Saenger, W. (1982) Z. Naturforsch. 37, 937.
14 Palmer, M.H., Wheeler, J.R., Kwiatkowski, J.S. and Lesyng, B. (1983) J. Mol. Struct. 92, 283.
15 Les, A. and Adamowicz, L. (1989) J. Phys. Chem. 93, 7078.
16 Leszczynski, J. (1992) J. Phys. Chem. 96, 1649.
17 GAUSSIAN 90, Revision I, M.J. Frisch, M. Head-Gordon, G.W. Trucks, J.B. Foresman, H.B. Schlegel, K. Raghavachari, M. Robb, J.S. Binkley, C. Gonzalez, D.J. Defrees, D.J. Fox, R.A. Whiteside, R. Seeger, C.F. Melius, J. Baker, R.L. Martin, L.R. Kahn, J.J.P. Stewart, S. Topiol and J.A. Pople, Gaussian, Inc., Pittsburgh PA, 1990.
18 Finer-Moore, J.S., Montfort, W.R. and Stroud, R.M. (1990) Biochemistry 29, 6977.
19 Clementi, E. and Roetti, C. (1974) Atomic Data Nucl. Data Tabl. 14, 177-478.
20 Marck, Ch., Lesyng, B. and Saenger, W. (1982) J. Mol. Struct. 82, 77.
21 Les, A., Adamowicz, L. and Bartlett, R.J. (1989) J. Phys. Chem. 93, 4001.
22 Les, A. and Adamowicz, L. (1990) J. Mol. Struct. 221, 209.
23 Morozov, Yu. V., Savin, F.A., Chekhov, V.O., Budowsky, E.I. and Yakovlev, D. Yu. (1982) J. Photochem. 20, 229.
24 Yakovlev, D. Yu., Simukova, N.A., Budowsky, E.I., Chekhov, V.O., Savin, F.A. and Morozov, Yu. V. (1982) J. Photochem. 20, 253.
25 Birnbaum, G.I., Kulikowski, T. and Shugar, D. (1979) Can. J. Biochem. 57, 308.
26 Barker, D.L. and Marsh, R.E. (1964) Acta Cryst. 17, 1581-1587.
27 McClure, R.J. and Craven, B.M. (1973) Acta Cryst. 16, 20-28.
28 Anand, N.N., Brown, D.M. and Salisbury, S.A. (1987) Nucleic Acids Res. 15, 8187.
29 Sheina, G.G., Stepanian, S.G., Radchenko, E.D. and Blagoi, Yu. P. (1987) J. Mol. Struct. 158, 275.
30 Szczepaniak, K. and Szczesniak, M. (1987) J. Mol. Struct. 156, 29.
31 Person, W.B., Szczepaniak, K., Szczesniak, M., Kwiatkowski, J.S., Hernandez, L. and Czerminski, R. (1989) J. Mol. Struct. 194, 239.
32 Le Breton, P.R., Yang, X., Urano, S., Fetzer, S., Min, Y., Leonard, N.J. and Kumar, S. (1990) J. Am. Chem. Soc. 112, 2138.
33 Szczepaniak, K., Szczesniak, M., Szajda, W., Leszczynski, J. and Person, W.B. (1991) Can. J. Chem. 69, 1707.
34 Fabian, W.M.F. (1991) J. Comput. Chem. 12, 17.
35 Nedderman, A.N.R., Stone, M.J., Kong Thoo Lin, P., Brown, D.M. and Williams, D.H. (1991) J. Chem. Soc. Chem. Commun. 1357.
36 Ditchfield, R. (1974) Mol. Phys. 27, 789.
37 Wolinski, K., Hinton, J.F. and Pulay, P. (1990) J. Am. Chem. Soc. 112, 8251.
38 Hess, Jr., B.A., Schaad, L.J., Carsky, P. and Zahradnik, R. (1986) Chem. Rev. 86, 709.
39 Wang, X.C., Nichols, J., Feyereisen, M., Gutowski, M., Boatz, J., Haymet, A.D.J. and Simons, J. (1991) J. Phys. Chem. 95, 10419.