Folding Topology of a Bimolecular DNA Quadruplex Containing a Stable Mini-hairpin Motif within the Diagonal Loop




doi:10.1016/j.jmb.2008.11.050

J. Mol. Biol. (2009) 385, 1600–1615

Available online at www.sciencedirect.com

Graham D. Balkwill, Thomas P. Garner, Huw E. L. Williams and Mark S. Searle⁎

Centre for Biomolecular Sciences, School of Chemistry, University Park, Nottingham NG7 2RD, UK

Received 19 September 2008; received in revised form 12 November 2008; accepted 20 November 2008

Available online 3 December 2008

We describe the NMR structural characterisation of a bimolecular antiparallel DNA quadruplex d(G3ACGTAGTG3)2 containing an autonomously stable mini-hairpin motif inserted within the diagonal loop. A folding topology is identified that is different from that observed for the analogous d(G3T4G3)2 dimer with the two structures differing in the relative orientation of the diagonal loops. This appears to reflect specific base stacking interactions at the quadruplex–duplex interface that are not present in the structure with the T4-loop sequence. A truncated version of the bimolecular quadruplex d(G2ACGTAGTG2)2, with only two core G-tetrads, is less stable and forms a heterogeneous mixture of three 2-fold symmetric quadruplexes with different loop arrangements. We demonstrate that the nature of the loop sequence, its ability to form autonomously stable structure, the relative stabilities of the hairpin loop and core quadruplex, and the ability to form favourable stacking interactions between these two motifs are important factors in controlling DNA G-quadruplex topology. © 2008 Elsevier Ltd. All rights reserved.

Edited by D. E. Draper

Keywords: DNA hairpins; NMR spectroscopy; bimolecular DNA quadruplex; structural topology

Introduction

The multi-stranded guanine-rich quadruplex structure demonstrates a diversity of structural topologies and has attracted considerable interest in recent years.1,2 The G-quadruplex structure comprises a core of hydrogen bonded guanine quartet motifs formed through either intramolecular folding of a single guanine-rich strand or via the intermolecular association of two or four individual strands. The connecting sequences in the intramolecular or dimer structures fold to form extrahelical loops; however, the precise role of these loop sequences and their length in modulating quadruplex stability, structural topology and recognition is only now becoming apparent.

*Corresponding author. E-mail address: [email protected]. Abbreviations used: NOE, nuclear Overhauser effect; NOESY, nuclear Overhauser effect spectroscopy; TOCSY, total correlated spectroscopy; DQF-COSY, double quantum filtered correlated spectroscopy.

The guanine-rich telomeric sequences found at the ends of eukaryotic chromosomes have been shown to form quadruplex structures in vitro.3 The formation of these structures has been demonstrated to inhibit telomerase,4 which is present in ∼85% of cancers,5 leading to possible therapeutic targets.6 A number of proteins have been identified that interact specifically with quadruplex structures, including a nuclease and helicases capable of unwinding the structure.7,8 Convincing evidence for the in vivo formation of G-quadruplexes via the interaction of a quadruplex-specific antibody has been reported,9 and so has the formation of G-loops.10 Furthermore, genome-wide studies suggest that as many as 376,000 potential quadruplexes could exist within the human genome.11,12 There has been considerable interest in quadruplex structures formed in oncogenic promoter regions, including the nuclease hypersensitive element of the c-myc promoter,13 and within the proto-oncogene c-kit, which encodes a receptor tyrosine kinase.15,16 Small molecule DNA quadruplex recognition and ligand-induced quadruplex formation have been shown to downregulate levels of c-myc14 and c-kit17 expression, suggesting that quadruplex formation may control

0022-2836/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.


gene regulation and offer a promising therapeutic target. Guanine-rich sequences have also been identified in mammalian gene-promoter regions of TGF-β,18 Hif-1α,19 c-Ki-Ras,20 VEGF21 and bcl2,22 although detailed investigations have not been reported in all cases.

Our current knowledge of bimolecular quadruplexes shows that this family possesses a high degree of conformational plasticity; indeed, subtle changes within a common nucleotide sequence can yield quite different folded conformations. Studies on the d(G4T4G4) hairpin, derived from 1.5 repeats of the Oxytricha nova telomeric sequence, demonstrated that it can form a dimeric quadruplex with four G-quartets and two T4 loops spanning the diagonal of each outer quartet.23,24 The analogous sequence with shorter G-tracts, d(G3T4G3), also folds to form the same diagonal conformer, essentially yielding an identical structure but with one of the central quartets removed.25 However, mutating two guanines in d(G4T4G4) to cytosines, d(G3CT4G3C), now yields a sequence predisposed to form an edge-looped dimer structure with mixed tetrads containing Watson-Crick base pairs.26 The presence of unequal G-tract lengths can also have a dramatic effect on dimer folding. Studies on the truncated d(G4T4G3)2 and d(G3T4G4)2 sequences showed that they folded into novel dimer conformations notably different from the corresponding d(G4T4G4)2 and d(G3T4G3)2 structures.27,28 Loop length also influences quadruplex folding, with the crystal structure of the d(G4T3G4) sequence affording an edge-loop dimer in contrast to the diagonally looped arrangement generated by the equivalent T4 loop.29 A single hairpin sequence may also generate multiple conformations in solution. The crystal structure of a double human telomeric repeat d(TAG3TTAG3T) proved to be a landmark parallel structure with propeller loops,30 whereas it was found to interconvert with an additional edge-looped structure in solution.31

The frequent and systematic occurrence of G-rich sequences throughout the genome has shown that a substantial variability exists in loop sequence and length, with the implication that the latter also reflects biological function. However, these studies have focused mainly on telomeric all thymine or TTA nucleotide loop sequences. As such, the role of alternative loop sequences and specific loop–loop interactions in quadruplex folding is not fully understood.

A number of mini-hairpin sequences containing a GNA trinucleotide loop (N is any nucleotide) have been reported to form with remarkably high levels of stability.32,33 The short GNA loop folds to form a sheared G-A base pair with the unpaired nucleotide N stacking on the guanine base (Fig. 1). The high level of stability of mini-hairpins (as short as seven nucleotides; d(GCGAAGC), Tm ∼345 K) appears to

Fig. 1. (a) Structure of a sheared G-A base pair. (b) Folding of the mini-hairpin motif of d(ACGTAGT) around the GTA loop. (c) Sequence of the GGG and (d) GG bimolecular quadruplexes with the d(ACGTAGT) mini-hairpin inserted in the loop between the G-rich tails. (e) Possible models for the formation of a bimolecular quadruplex with three G-tetrads formed from mixed parallel and anti-parallel strands, and diagonal loops (left) or edgewise loops. Residues are numbered 1 to n on one strand, and 1⁎ to n⁎ on the other strand.

correlate with a CG base pair in the flanking part of the hairpin stem region resulting in very favourable C-G on G-A base pair stacking within the cGNAg motif (c and g represent the first base pair in the double-stranded stem). These hairpins occur in nature within the replication origin of ϕX174 and herpes simplex virus, and have been identified within the promoter region of phage N4 double-stranded DNA.34,35 The stabilising effect of this short sequence and its ability to nucleate hairpin formation has been exploited in the design of hairpin sequences to study drug recognition of bulged bases and mismatched base pairs.36–38

In this work we have introduced a mini-hairpin motif into the loop region of a G-rich sequence capable of forming a bimolecular quadruplex through hairpin dimerisation. Within the sequence d(G3ACGTAGTG3), the structured hairpin motif containing a GTA loop was inserted to investigate the potential role that this constrained and highly structured loop sequence has in controlling and directing the folding of the quadruplex dimer.11,12,31,39 The relative stabilities of the hairpin and quadruplex motifs could be controlled by changing the number of guanine bases. We present NMR structural studies of the d(G3ACGTAGTG3) quadruplex dimer and a truncated version d(G2ACGTAGTG2) depleted by one G-tetrad. We demonstrate that the nature of the loop sequence, its ability to form autonomously stable structure, and the relative stability of the hairpin loop and core quadruplex are all important factors in controlling stability and structural topology.


Results

Evidence for quadruplex formation obtained by NMR, CD and UV absorbance spectroscopy

Two hairpins can dimerise in a number of different ways, giving rise to a range of possible structures with diagonal loops, edge loops (Fig. 1), propeller loops or even with a mixture of different loops, and G-tetrads with combinations of syn and anti glycosidic bond angles.40,41 In turn, specific loop arrangements can give rise to different strand orientations and polarities. For each particular model, the individual tetrads can form in a clockwise or anticlockwise manner, increasing the number of permutations for each model, each with intrinsically different degrees of symmetry that will be manifested in the NMR spectrum in terms of the number of resonances. The hydrogen bonded imino protons present in a G-quartet give rise to sharp resonances in the 10.5–12.0 ppm range of the 1D 1H NMR spectrum. The spectrum of d(G3ACGTAGTG3) exhibited characteristic imino proton resonances at low temperatures (Fig. 2). The presence of 12 signals indicates that the quadruplex is an asymmetric dimer, with each tetrad guanine giving rise to a unique imino resonance. This is consistent with the intrinsic asymmetry associated with an odd number of quartets.25 In addition, we observe two overlapping resonances at 12.5 ppm characteristic of a Watson-Crick G-C base pair as expected for C5-G9 in the hairpin loop region of the quadruplex dimer. These resonances are highly temperature-sensitive and are fully exchange broadened above 298 K, but sharp at 278 K.

We confirmed the presence of a quadruplex motif using far UV CD and UV absorption methods with samples of 4 μM d(G3ACGTAGTG3) in 100 mM KCl, 10 mM potassium phosphate buffer adjusted to pH 7.0. The CD spectrum gives a broad maximum between 270 nm and 290 nm with a weak minimum ellipticity at ∼240 nm (Fig. 3a). This is consistent with two overlapping spectra with contributions from an anti-parallel quadruplex structure with a maximum at around 285–290 nm, and from a hairpin stem–loop structure. Earlier CD studies on hairpin motifs with GTA and GAA loop sequences (data not shown) indicated a maximum at 275–280 nm, consistent with a double-stranded hairpin contributing to the spectrum of the quadruplex dimer shown in Fig. 3a. Spectra recorded between 278 K and 358 K showed that the broad maximum decreases to zero over this range. A CD melting curve constructed from changes in ellipticity at 280 nm shows a transition mid-point at 322 ± 1 K, with a small hysteresis of 2–3 K in the re-annealing curve (dotted line in Fig. 3b), indicative of slow refolding kinetics.42

Fig. 2. The 1D 1H NMR spectra of the exchangeable NH resonances between 9.0 ppm and 12.5 ppm in the spectrum of d(G3ACGTAGTG3)2 showing clear evidence for an asymmetric bimolecular structure. The NH groups from the 12 non-equivalent guanines in the three G-tetrads occur between 10.7 ppm and 11.8 ppm. The resonance at 12.5 ppm corresponds to the overlapping G9 and G9⁎ NH groups from within the mini-hairpin loops. The spectra at 600 MHz are shown at 278 K, 288 K and 298 K with the G9/G9⁎ resonances exchanging rapidly in a pre-melting transition ahead of disruption of the quadruplex structure. The sample was 2.0 mM quadruplex DNA, 10 mM KCl (pH 7.85) in 10% (v/v) 2H2O.
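The resonance counting behind the 12 imino signals can be written out as a short sketch. The function name and its arguments are hypothetical and chosen here for illustration only; the sketch assumes that each G-tetrad contributes four guanine NH protons and that an exact 2-fold axis, if present, would make the guanines equivalent in pairs.

```python
# Illustrative bookkeeping only; the function name and arguments are hypothetical.
def expected_imino_signals(n_tetrads: int, twofold_symmetric: bool) -> int:
    """Number of distinct guanine imino (NH) resonances expected for a G-tetrad core."""
    guanines = 4 * n_tetrads  # four guanine NH protons per G-tetrad
    # An exact 2-fold axis would make the guanines equivalent in pairs.
    return guanines // 2 if twofold_symmetric else guanines


# Three tetrads, asymmetric dimer: every tetrad guanine is unique.
print(expected_imino_signals(3, twofold_symmetric=False))
# Two tetrads with 2-fold symmetry, as described for the truncated sequence.
print(expected_imino_signals(2, twofold_symmetric=True))
```

Under these assumptions the three-tetrad asymmetric dimer gives 12 guanine NH signals, which is the count reported between 10.7 ppm and 11.8 ppm.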


Fig. 3. (a) Far UV CD spectra of d(G3ACGTAGTG3)2 (4 μM in 100 mM KCl, 10 mM potassium phosphate buffer pH 7.0) at temperatures between 278 K and 358 K (10 K increments). (b) CD melting curves determined from the change in ellipticity at 280 nm (Tm 322 ± 1 K); the re-annealing curve (dotted line) shows a 2–3 K hysteresis, indicative of slow refolding kinetics. (c) UV absorption spectra recorded at 278 K and at 358 K showing a small hyperchromic shift in intensity at 295 nm. (d) The UV difference spectrum illustrates this as a small negative differential. The single broad maximum is indicative of the unfolding of a largely anti-parallel quadruplex structure.
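A transition mid-point such as the Tm quoted in the Fig. 3 caption can be estimated from a melting curve in several ways; the paper does not state which procedure was used. The following is a minimal sketch of one simple option, assuming a single monotonic two-state transition with flat baselines. The function name `melting_midpoint` is hypothetical, and the curve is synthetic, generated here for illustration, and is not data from the paper.

```python
import math


def melting_midpoint(temps_k, signal):
    """Temperature at which the signal is halfway between its first and last values.

    Assumes one monotonic two-state transition with flat baselines,
    sampled in order of increasing temperature.
    """
    half = 0.5 * (signal[0] + signal[-1])
    for i in range(len(signal) - 1):
        s0, s1 = signal[i], signal[i + 1]
        if s0 != s1 and (s0 - half) * (s1 - half) <= 0:
            t0, t1 = temps_k[i], temps_k[i + 1]
            # Linear interpolation between the two bracketing points.
            return t0 + (half - s0) * (t1 - t0) / (s1 - s0)
    raise ValueError("signal never crosses its half-way value")


# Synthetic curve (NOT data from the paper): two-state sigmoid centred on 322 K,
# sampled every 2 K between 278 K and 358 K.
temps = [278 + 2 * i for i in range(41)]
curve = [1.0 / (1.0 + math.exp((t - 322.0) / 4.0)) for t in temps]
tm = melting_midpoint(temps, curve)
print(f"estimated Tm = {tm:.1f} K")
```

Applying the same function to the heating and re-annealing curves separately would give two mid-points, and their difference is one way to express a hysteresis like the 2–3 K reported here.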

The UV spectra recorded at 278 K and at 358 K show an increase in absorbance at 260 nm in the unfolded form, and the characteristic hyperchromic shift at 295 nm that is the hallmark of quadruplex melting (Fig. 3c). The UV difference spectrum shows this as a small negative differential (Fig. 3d). The single broad maximum in the difference spectrum at 260 nm is typical of an antiparallel structure, whereas parallel-stranded alignments give rise to a well defined double maximum in the difference spectrum with an intervening minimum close to zero or even negative.16

NMR evidence for the formation of hairpin loops

We characterised the structure of the bimolecular quadruplex by NMR at 278 K, where we were able to obtain a nearly complete 1H assignment. In the asymmetric dimer, the two DNA strands are non-equivalent and are numbered G1–G13 on strand 1, and G1⁎–G13⁎ on strand 2. Quantitative analysis of H1′-H8 NOE cross-peak intensities in 2D nuclear Overhauser effect spectroscopy (NOESY) spectra at short mixing times (<100 ms) shows clear evidence

for different guanine bases adopting anti and syn orientations within a given tetrad with six syn guanines readily apparent (Fig. 4). Each hairpin, therefore, has three syn guanine nucleotides participating in tetrad formation, which is characteristic of a foldback bimolecular quadruplex structure. The intrinsic asymmetry in the core tetrads is partially propagated to the loop conformations; however, the chemical shift differences between residues present in the two loops are typically small, indicating that they are probably nearly symmetric in structure. We and others have shown that mini-hairpin loop formation is associated with characteristic NMR perturbations.33,36–38 The H4′ of the thymine in the GTA loop is upfield-shifted ∼2 ppm through its unusual proximity to the adenine base; these shifts are clearly visible for both hairpins. Missing NOEs for T7 H2′ to A8 H8 and the very weak T7 H2″ to A8 H8 interaction (and equivalent interactions on strand 2) are both indicators that the loop has "folded back" between these two nucleotides, creating a novel backbone geometry. NOE intensities throughout the hairpin part of the sequence are consistent with bases adopting the anti glycosidic conformation. The C5 H2′ and C5⁎ H2′ resonances are shifted upfield by >0.5 ppm, which shows the cytosine base is located under a sheared G-A pair. All of these features agree with those reported earlier and are consistent with the formation of a stabilised trinucleotide GTA mini-hairpin loop at each end of the core quadruplex. The data highlight the pre-loop C5-G9 and C5⁎-G9⁎ pairs as the only base pairs forming stable hydrogen bonds. Additional peaks were anticipated from A4-T10 and A4⁎-T10⁎ base pairs in the hairpin stems immediately adjacent to either end of the quadruplex. The absence of a downfield thymine NH resonance appears to preclude the formation of stable hydrogen bonded A-T pairs. This provides some initial evidence that the two loops adopt a diagonal arrangement, which has implications for the required separation between the two strands. In the diagonal arrangement the increased distance would disrupt the hydrogen bonding lower down the stem region, precluding A-T base pair formation. The sensitivity to temperature of the G9 and G9⁎ NH exchange rates (Fig. 2) suggests that disorder within the hairpin loop is readily propagated, with the hairpin undergoing a pre-melting transition ahead of the disruption of the core quadruplex.

Fig. 4. Portions of the NOESY spectra of d(G3ACGTAGTG3)2 showing (a) base H6/H8 to deoxyribose H1′ NOE connectivities in 100 ms data. Six strong intramolecular H8-H1′ peaks are indicated with an asterisk and correspond to guanines in a syn glycosidic conformation. (b) Base H6/H8 to base H6/H8 NOEs between adjacent stacked bases showing clear sequential connectivities for assignment purposes. The T7 H6-A8 H8 NOE in the loop is missing. The resolved parts of the sequential pathway for strand 1 (G1–G13) are highlighted above the diagonal with connecting lines; the equivalent connectivities for strand 2 (G1⁎–G13⁎) are labelled below the diagonal. The H6/8-H6/8 interactions that are missing are either too close to the diagonal, or have low intensity as a result of the tight loop turn, or correspond to the 5′-anti-syn-3′ steps.

Assigning the G-tetrad core

The G-tetrad core shows clear evidence, from NOE intensities, for bases in both the anti and syn conformation. The variation in glycosidic bond angles between pairs of adjacent nucleotides gives rise to "missing" or "reversed" sequential NOEs compared with the "normal" standard connectivity pathways evident in B-DNA double helices (Fig. 4a).43 These unusual patterns of connectivities allow clusters of adjacent guanine residues to be identified, and their subsequent interactions with the loop residues permitted their location within the sequence to be determined. The presence of an anti guanine preceding the loop and the lack of any 5′-anti-syn-3′ steps facilitated identification of the 5′-GGG tract in both hairpins. In contrast, the remaining two GGG runs could not be connected to the 3′ end of the loop, corroborating the presence of a post-loop syn guanine. These guanine tracts were assigned to the correct hairpin by the geometrical requirement that a reversal of strand polarity within a hairpin requires an alteration in the guanine glycosidic angle. Additional H8–H8 interactions confirmed the nucleotides' positions within the strands (Fig. 4b). The assignment of the clusters was confirmed when the exchangeable resonances were subsequently identified. The bimolecular quadruplex was identified as having the same syn/anti arrangement as the d(G3T4G3) quadruplex.25

Evidence for a novel diagonally looped structure

In addition to the lack of stable A-T base pairing in the loop, which appears to be consistent with a diagonal conformation, several unusual cross-strand NOEs are also evident. Interactions were observed from the first 5′ guanine H8 to sugar and methyl protons on the tenth thymine in the other hairpin (1–10⁎ and 1⁎–10). This set of NOEs can occur only if the two non-sequential bases are stacked together, as the thymine nucleotide stretches across the diagonal of a quartet to form intermolecular interactions within the dimer. The methyl


groups also give interactions to some of the imino protons on the top tetrad; namely, G1-H1 or G1⁎-H1. The NMR structure of the d(G4T4G4)2 symmetrical quadruplex also showed analogous cross-strand interactions from the loop nucleotides of one hairpin to the quartet guanine of the other hairpin.23 Thus, while the stacking continues from G9 to T10 to G1⁎ (and vice versa on the other tetrad face), there is no interaction with the syn G11 base. The H2′/2″ sugar protons of both G11 and G11⁎ have been ring current shifted by over 0.5 ppm to 3.5 ppm (G11⁎) and 3.8 ppm (G11), a feature retained in the d(G3T4G3)2 and d(G4T4G4)2 structures, suggesting it may be common to a diagonal loop.23 The A4 base on the opposite side of the hairpin stem appears to be stacking well with the G3 base. As both bases are anti, they adopt a more B-DNA-like stacking arrangement, with a clear H8–H8 NOE interaction occurring. The data indicate that both the A and T form good stacking interactions with the tetrad, as the strands orient across the diagonal of the tetrad. As a consequence they experience base pair opening (σ) that optimises the interactions with the tetrad (see below). Analysis of the restrained molecular dynamics simulations suggests, in agreement with the NMR data, that the N1-NH3 distance precludes the formation of a stable hydrogen bond. In contrast, the NH6-O4 distance has a hydrogen bond occupancy of 84% over the last 500 ps of the simulation, suggesting the A-T pair is partially disrupted and maintains favourable stacking with both the tetrad and the loop.

The strand polarity was confirmed via the complete assignment of the exchangeable resonances. Depending on whether the tetrads form in an anticlockwise or clockwise arrangement, the NH can have a number of possible non-exchangeable (base H8) neighbours. The H8 proton assignment enables each guanine NH1-H8 interaction to be identified for the four NH resonances for each tetrad (Fig. 5a). The assignment of these resonances was made easier by the presence of a strong NOE from the G3 NH to the A4 H2 (and 3⁎-4⁎ respectively), a direct consequence of their stacking

Fig. 5. Portions of the NOESY spectrum of d(G3ACGTAGTG3)2 showing NOEs from guanine NH groups that define the relative orientation of guanine bases within each tetrad as well as proximity and stacking with adjacent G-tetrads. (a) Guanine NH to base H8 NOEs within the tetrad are highlighted in boxes. (b) Guanine NH to NH NOEs between stacked tetrads. (c) A representation of the relative orientation of guanines within each tetrad. The anticlockwise and clockwise bonding arrangements are shown in each tetrad with syn bases in black and anti bases in white. The position of the base H8 and NH are indicated with white and black circles, with the circle at the head of each arrow representing the NH. The NOEs that are identified in a for H8-NH are indicated by grey arrows and confirm the relative hydrogen bonding geometry within each G-tetrad. The NH-NH NOEs in b confirm the relative stacking orientation between tetrads.

arrangement. It was then possible to assess whether an anti or syn H8 interacted with this G3 NH. A number of different diagonal and edge-looped models were assessed, but only a single diagonal conformation could account for both the exchangeable and non-exchangeable inter-hairpin contacts. The pattern of NOEs confirmed the relative orientations of the bases within each tetrad as illustrated in Fig. 5c. The complicated set of core NH-NH connectivities could then be fully rationalised (Fig. 5b). Typically, very weak NH–NH NOE interactions are observed around the quartet and stronger interactions are found between adjacent quartets. The NH–NH interactions can be intra- or inter-strand depending on the anti/syn arrangement for each 5′–3′ step. Finally, NH exchange rates could be used to substantiate the assignment. Resonances from those protons buried in the core of the quadruplex can have extremely long NH/ND exchange times.43 The assignment was validated by the observation that the four resonances in the central quartet showed the slowest exchange times, with G2/G2⁎ and G12/G12⁎ still evident after 6 h. All peaks had essentially disappeared after 24 h. The final assignment of the imino protons validated the non-exchangeable assignments and clearly identified a unique structural model for the bimolecular quadruplex. The notable difference between the structures of d(G3T4G3)2 and


d(G3ACGTAGTG3)2 is the relative polarity of the strands, which is defined unambiguously by the NOE data. There are two ways of comparing the structures: either by aligning the position of the loops, or by aligning the syn/anti conformations of the bases. Two orientations of d(G3ACGTAGTG3)2 are shown in Fig. 6a and are related by a 90° rotation clockwise about the vertical axis. Comparing Fig. 6a (left) with the structure of d(G3T4G3)2 in Fig. 6b reveals that one of the hairpin components of the bimolecular quadruplex is essentially rotated 180° with respect to the other. It is possible to align the bases in their syn/anti conformations such that there is perfect overlap between the three tetrads in Fig. 6a (right) and b. However, now the loops are rotated by 90° with respect to each other. As a result, the bases in each tetrad are associated with different loops.

Structural refinement of the d(GGGACGTAGTGGG) bimolecular quadruplex

On the basis of a full 2D NMR assignment of data collected at 278 K, we were able to generate distance and hydrogen bonding restraints as a basis for calculating an ensemble of structures using restrained molecular dynamics simulations (Fig. 7).36,38 Following reported equilibration

Fig. 6. (a) A representation of the structural model of d(G3ACGTAGTG3)2 showing the strand alignment and syn and anti arrangement of the glycosidic torsion angles within the core G-quadruplex (two orientations rotated by 90° with respect to each other about the vertical axis). The anticlockwise and clockwise bonding arrangements are shown in each tetrad with syn bases in black and anti bases in white. The overall structure has a mix of parallel and anti-parallel strands with diagonal loops giving rise to a syn-anti-anti-loop-syn-syn-anti arrangement in one half of the structure and syn-syn-anti-loop-syn-anti-anti in the other half. This arrangement is reminiscent of the d(G3T4G3)2 bimolecular quadruplex in terms of the quartet arrangement and stacking (b). The notable and unique difference is that the relative polarity of the strands of d(G3T4G3)2 and d(G3ACGTAGTG3)2 is different, with one of the hairpin components of the bimolecular quadruplex essentially rotated 180° with respect to the other. The polarity of the strands is shown by the crossed circles and dotted circles, the former representing the 5′-end of the chain and the latter the 3′-end.
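The two glycosidic patterns given in the Fig. 6 caption can be checked against statements made elsewhere in the text: six syn guanines in total, an anti guanine before each loop, a syn guanine after each loop, and no 5′-anti-syn-3′ steps inside a G-tract. The sketch below does this tally. The pattern strings are taken from the caption with the hyphenation written out in full; the dictionary keys and function names are hypothetical labels chosen here for illustration.

```python
# Glycosidic patterns from the Fig. 6 caption, written 5' to 3'.
# Keys and function names are illustrative labels, not from the paper.
PATTERNS = {
    "half 1": "syn-anti-anti-loop-syn-syn-anti",
    "half 2": "syn-syn-anti-loop-syn-anti-anti",
}


def split_tracts(pattern):
    """Return the G-tract before the loop and the G-tract after it."""
    tokens = pattern.split("-")
    i = tokens.index("loop")
    return tokens[:i], tokens[i + 1:]


def syn_count(pattern):
    """Number of syn guanines in one hairpin."""
    return pattern.split("-").count("syn")


def anti_syn_steps(pattern):
    """Count 5'-anti-syn-3' steps inside the G-tracts (not across the loop)."""
    return sum(
        1
        for tract in split_tracts(pattern)
        for a, b in zip(tract, tract[1:])
        if (a, b) == ("anti", "syn")
    )


for label, pattern in PATTERNS.items():
    before, after = split_tracts(pattern)
    print(label, syn_count(pattern), anti_syn_steps(pattern), before[-1], after[0])
```

Each hairpin comes out with three syn guanines, giving the six strong intramolecular H8-H1′ peaks marked in Fig. 4a.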


Fig. 7. (a) Ensemble of ten NMR structures for the G3 bimolecular quadruplex taken from the final 300 ps of the 2.1 ns restrained dynamics simulation (RMSD from the mean 1.7 Å). (b) Ribbon representation of the average NMR-derived structure of the bimolecular quadruplex d(G3ACGTAGTG3)2 showing the disposition of the loops with respect to the core G-quadruplex, viewed through a medium groove. (c) Overlay of the top two G-tetrads (red and blue) showing the variation in groove widths, as well as the helical twist of the two G-tetrads. (d) A representation of the upper G-tetrad (shown in red in c) illustrating the arrangement of syn and anti G-bases that leads to differences in groove widths for the bimolecular quadruplex (anti bases shaded grey).
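The restrained dynamics trajectory behind this ensemble was also analysed for hydrogen bond occupancy: the text reports 84% for the NH6-O4 contact of the A-T pair over the last 500 ps. The paper does not give the geometric criterion used, so the sketch below assumes a simple distance cutoff of 3.5 Å and ignores any angle criterion. The function name and the frame data are hypothetical and synthetic.

```python
def hbond_occupancy(distances_angstrom, cutoff=3.5):
    """Fraction of frames whose donor-acceptor distance is within the cutoff.

    The 3.5 A default is an assumed criterion, not one stated in the paper.
    """
    if not distances_angstrom:
        raise ValueError("no frames supplied")
    within = sum(1 for d in distances_angstrom if d <= cutoff)
    return within / len(distances_angstrom)


# Synthetic distances (NOT taken from the paper's trajectory):
# 84 frames with the contact formed, 16 frames with it broken.
frames = [3.0] * 84 + [4.2] * 16
occupancy = hbond_occupancy(frames)
print(f"occupancy = {occupancy:.0%}")
```

In practice the distance series would be extracted from the trajectory with the analysis tools of the simulation package, and the result depends on the cutoff chosen.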

protocols, the quadruplex was subjected to 2.1 ns of fully restrained dynamics in explicit solvent using the AMBER suite of programmes. The final ensemble of structures and the mean structure show a regular right-handed twisted bimolecular quadruplex with a diagonal loop that forms a stable hydrogen bonded CGTAG mini-hairpin motif at each end of the quadruplex dimer (Fig. 7). Each hairpin contains a hydrogen bonded sheared G-A base pair that stacks favourably on the adjacent C-G Watson-Crick pair. The separation of the A and T at the base of the stem is too great to permit complete hydrogen bonded base pairing; however, both bases form specific

stacking interactions with the G-tetrad (Fig. 8a), as evident from the clear pattern of sequential NOEs. Overall, the structure has an apparent 2-fold symmetry; however, this is broken as the centre of the molecule resides in the plane of the central quartet. This quartet has two distinct faces and interacts differently with the end quartets on either side of it. The quartet core displays different groove widths (two medium size grooves and a narrow and wide groove) arising as a consequence of the alternation of syn and anti guanines within the tetrads (Fig. 7c and d). The hairpin loops at either end of the structure are essentially symmetry-related,


Fig. 8. (a) Close up of the conformation of one of the mini-hairpin loops illustrating stabilising interactions around the GTA turn involving the G-A sheared base pair and stacking with the terminal G-tetrad. (b) Stacking interactions at the structural interface between A4 and T10 in the hairpin loop and the terminal G-tetrads. (c) Corresponding loop of the d(G3T4G3)2 bimolecular quadruplex showing hydrogen bonding interactions across the loop (structure derived from a Na+-containing solution; PDB ID 1FQP) with thymine bases shown in red. (d) View from above showing base stacking of the thymine bases in c with the terminal G-tetrad.

giving the dimer a propeller-like topology. The overall structure has a mix of parallel and antiparallel strands with diagonal loops giving rise to a syn-anti-anti-loop-syn-syn-anti arrangement in one half of the structure and syn-syn-anti-loop-syn-anti-anti in the other half (see Fig. 6), as evident in the quartet arrangement and stacking within the structure of the d(G3T4G3)2 bimolecular quadruplex.26 The unique difference between the structures of d(G3T4G3)2 and d(G3ACGTAGTG3)2 is that one hairpin component of the dimer is rotated 180° with respect to the other (Fig. 6). The identifiable differences between the two bimolecular structures with different loop orientations relate to the stacking interactions between the loop and the terminal G-tetrad (Fig. 8). Both structures satisfy the emerging observation that the pre-loop guanine is in an anti orientation that maximises stacking interactions with the first residue in the loop, which also has an anti glycosidic conformation (Figs. 7 and 8). In this case, this is a stable purine–purine (5′-GpA) interaction, demonstrating one of

the first examples of a diagonal loop beginning with an adenine residue. In the T4 loop, this stacking interaction corresponds to a 5′-GpT step. The intrinsically greater flexibility of the T4 loop has been suggested to allow the third T (T6) to adjust its conformation to hydrogen bond across the loop to a second thymine (see Fig. 8d). Thus, specific interactions within the T4 loop may favour this particular diagonal conformation. The systematic replacement of each thymine of d(G3T4G3) with cytosine highlighted the importance of the third thymine in the loop, with substitution leading to a loss of dimer stability due to the lack of a hydrogen bonding imino proton of cytosine at neutral pH.44 Thermodynamic studies of G-quadruplex formation have established that stability and structure are cation-dependent phenomena.45 Structural polymorphisms within T4 loops have been reported to be dependent on whether Na+ or K+ or NH4+ ions are bound to the G-tetrads. Co-ordination between the O2 of the third thymine in the loop and the cation of the terminal tetrad is apparent in the case of Na+;46


however, the lack of co-ordination with K+ or NH4+ results in a different loop structure with greater dynamics.45 Minimising the number of 5′-G(anti)-G(syn) base stacking interactions at each 5′-GpG step appears to provide a thermodynamic driving force for formation of a diagonal loop that could be satisfied in either of the arrangements shown in Fig. 6. Thus, in the case of d(G3XnG3)2 where n is ≥ 4 nucleotides, it is highly likely that the sequence will adopt a diagonally looped dimer structure. Which diagonal conformer prevails, on the other hand, is under the control of the specific structure of the loop and the subsequent interactions it makes with the tetrad core. The observation of the alternative conformer for d(G3ACGTAGTG3)2 reflects specific contacts at the interface between the G-tetrad and hairpin stem. We asked which specific interactions favour the alternative diagonal loop structure observed for d(GGGACGTAGTGGG)2. We examined the free and restrained dynamics trajectories from the molecular dynamics simulations to characterise the dynamics of the hairpin loop and identify persistent base stacking interactions, hydrogen bonding and potential backbone phosphate repulsions that could influence the observed conformational preference. The pre-organisation of the hairpin loop results in significant stabilisation of base stacking both within the loop and with the terminal tetrads. The AT pair in the stem undergoes base pair opening (σ), to maximise quartet stacking, leaving only the N6H···O4 pair capable of making a hydrogen bond. The opening of this base pair permits A4 to stack effectively with G3 in the same molecule and affords T10 the ability to stack with the 5′-syn guanine on the other molecule (G101). The rotation of one hairpin 180° with respect to the other, generating the other diagonally looped conformer, alternates the position of the two terminal anti and syn guanine bases on the outer quartet. 
This, in turn, affects whether the loop sequence thymine interacts with a 3′-anti or 5′-syn guanine. In order to maintain the geometry of the tetrad, the positions of the guanine bases remain the same, but the change in glycosidic conformation alters the orientation of the sugar-phosphate backbone. The presence of a 5′ or 3′ residue coupled with the glycosidic conformation has implications on the spatial proximity of the functional groups from both the incoming loop base and the tetrad guanine nucleotide. In the experimentally observed conformer, the loop forms across a medium size groove and the thymine base is able to approach and stack on the 5′-syn guanine unhindered. In contrast, the unobserved second conformer has a loop that forms over a narrow groove, bringing the sugar-phosphate backbones into close proximity. In addition, the approach of the thymine base methyl group is hindered by the H2′ on the 3′-anti sugar. We have modelled this alternative diagonally looped conformer and identified marked differences in the location of the thymine base with the hindered approach leading to an inability of the thymine to effectively overlap with the guanine

base. Thus, the simulations suggest a rationale for the formation of an energetically preferred conformer. The conformer which is able to afford the most effective loop–tetrad stacking interaction, while maintaining the integrity of the loop conformation and the tetrad stack, will be more thermodynamically favoured and thus account for the single species that is observed in the spectra. It is likely that this rationale may also explain the presence of a single conformer in the d(G3T4G3)2 spectra, as T6 folds out to also stack on the unhindered 5′-syn guanine. In agreement with our sequence, the disfavoured conformer also appears to display steric repulsion between the thymine methyl group and H2′ on the 3′-anti sugar. Thus, the pre-disposition of the loop sequence for a particular conformation and its subsequent ability to form optimised stacking interactions with the quartet are likely to afford a single thermodynamic conformer. The implications of this rationale on potential folding pathways and its applicability to other diagonally looped structures will require further studies on these bimolecular quadruplex systems.

Relative stability of quadruplex versus hairpin loop sequence: influence on structural topology
The structural model of d(G3ACGTAGTG3)2 demonstrates that the distance required to span the diagonal of the terminal G-tetrad is incompatible with formation of a stable Watson-Crick hydrogen bonded A-T base pair; thus, we observe an open AT pair that is still able to form favourable base stacking interactions with the G-tetrad. The distortion of the first base pair in the hairpin stem region is not propagated further to the neighbouring C5-G9 (or C5⁎-G9⁎) pair. Both C-G pairs form a stable Watson-Crick arrangement, as evident from the sharp overlapping imino proton resonances for G9 and G9⁎ at 12.5 ppm. However, these resonances are rapidly exchange broadened as the temperature is increased from 278 K to 298 K. 
The 2D NMR analysis of the bimolecular quadruplex reveals that at 298 K the GTA loops are now significantly more disordered, serving as more flexible diagonal connectors between strands. Earlier, we reported the results of structural and stability studies of the isolated d(ACGNAGT) hairpin (N is A or T) and have shown that this mini-hairpin melts with a Tm ∼321 K.33 Thus, inserting the hairpin into the loop sequence of the bimolecular quadruplex appears to destabilise the hairpin by >20 K as a consequence of conformational strain in the stem region. Although the hairpin motif appears to be less well structured above 298 K, we do not observe interconversion to the alternative diagonal loop topology observed for the d(G3T4G3)2 dimer. However, we have the flexibility to modulate the relative stabilities of the two structural components (quadruplex versus mini-hairpin) by destabilising the core G-quadruplex. In the truncated sequence d(G2ACGTAGTG2) the bimolecular quadruplex is stabilised by a core of only two G-tetrads.


Fig. 9. (a) The 1D 1H NMR spectra (600 MHz) of the exchangeable NH resonances (10–14 ppm) of the G2 bimolecular quadruplex d(G2ACGTAGTG2)2 showing evidence for a 2-fold symmetric bimolecular structure with other minor species present. The NH groups from the four non-equivalent environments of the guanines in the two G-tetrads occur between 11 ppm and 13 ppm, and are labelled G1, G2, G10 and G11 in the spectrum at 280 K. The resonance at 12.8 ppm corresponds to G8 within the stable hairpin loop of the quadruplex (G8H). Three quadruplex species (Q1, Q2 and Q3) are resolved in the 1D spectrum with relative populations of 70%, 20% and 10%, respectively, and are specifically highlighted for G2 (box). The weak sharp signal at 13.7 ppm is characteristic of a Watson-Crick A-T pair and is tentatively assigned to T9 NH in Q3, suggesting that Q3 is the only structure with edgewise loops (rather than diagonal loops) that permit formation of a stable A-T pair. It is evident also from the resonance at 12.8 ppm (labelled G8H) that the monomeric hairpin is significantly populated. Spectra at temperatures of 280 K, 288 K and 298 K show that the signals for the three quadruplex species disappear with an increase in the intensity of the G8H signal at 12.8 ppm for the monomeric hairpin. (b) Temperature-dependence of the equilibrium constant for the three quadruplex structures Q1, Q2 and Q3 (calculated assuming a two-state equilibrium between quadruplex and the monomeric hairpin structure) measured from NMR peak intensities for the resolved cluster of G2 peaks around 11.4 ppm and from the G8H signal at 12.8 ppm in a. The derived enthalpies of dissociation are similar (114–120 kJ mol⁻¹) and indicative of structurally related species with a single stacking interaction between two G-tetrads. (c) Models of possible structures for the three quadruplexes. Q1 is the dominant species and its structure has been determined by NMR. 
Q2 plausibly has a diagonal loop orientation analogous to the T4 loop, although we cannot rule out the possibility of a propeller loop structure. Q3 is likely to possess edge loops that form over wide grooves but allow formation of an A3-T9 Watson-Crick base pair within the hairpin motif.

Structural heterogeneity in the assembly of d(GGACGTAGTGG)2
The NMR spectra of d(GGACGTAGTGG)2 at 280 K reveal a bimolecular quadruplex structure with a 2-fold symmetry with well resolved resonances between 11.0 ppm and 13.0 ppm. The presence of four distinct NH resonances from G-tetrads, plus the resonance at 12.6 ppm for the Watson-Crick hydrogen bonded C4-G8 pair from the hairpin stem (Fig. 9a), shows that one major quadruplex

conformer is present, and provides evidence for a number of minor species. The spectroscopic properties of the major species are very similar to the longer G3 quadruplex sequence, albeit with the 2-fold symmetry reducing the number of resonances present. There is also clear evidence for a significant population of the stable monomeric hairpin in equilibrium with other quadruplex conformers. In contrast, only one species could be detected in the spectra of the G3 bimolecular quadruplex.


The data for d(GGACGTAGTGG)2 reveal three quadruplex species (Q1, Q2 and Q3), each of which appears to have the same 2-fold symmetry. The relative populations at 280 K are ∼70%, 20% and 10% (Fig. 9a). The resonance at 12.8 ppm corresponds to the monomeric hairpin which, when integrated, suggests that 65% of the species exist in this form. The thermal unfolding transition also reveals some significant differences. The stability of the hairpin loop now exceeds that of the core G2 quadruplex. The resonances for the quadruplex structure disappear over the temperature range 280–298 K with the population of the folded hairpin increasing, which is evident from the increase in the intensity of the resonance at 12.8 ppm (Fig. 9a). At temperatures above 298 K, the predominant species is the monomeric hairpin, which forms a stable hydrogen bonded structure at this temperature, consistent with our earlier analysis of the isolated d(ACGTAGT) mini-hairpin.38 A broad hump on the baseline at around 13.5 ppm, most clearly evident at 282–288 K, appears to correspond to a thymine NH consistent with a marginally stable A-T base pair in the monomeric hairpin stem region. The temperature-dependent change in the relative population of the quadruplex and hairpin species enabled us to determine the difference in thermodynamic stability, assuming a simple two-state model in which Q1, Q2 and Q3 are formed by the association of two monomeric hairpins. The relative populations of the three quadruplex species are insensitive to temperature, such that a van't Hoff plot of ln K versus 1/T for each species produces a linear correlation with a similar slope (Fig. 9b). The derived values for the enthalpies of dissociation lie in the narrow range of 114–122 kJ mol⁻¹ and suggest that we are observing unfolding transitions for structurally similar species in solution with comparable numbers of base stacking interactions. 
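The van't Hoff treatment described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis used in this work: the function names `vant_hoff_fit` and `synthetic_k` are hypothetical, and the equilibrium constants are synthetic values generated from an arbitrary enthalpy and entropy rather than from the NMR peak integrals. The fit assumes ln K = −ΔH/(RT) + ΔS/R with a temperature-independent ΔH.

```python
import math

R_KJ = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1


def vant_hoff_fit(temps_k, k_values):
    """Least-squares line through (1/T, ln K).

    Returns (delta_h, delta_s) for the reaction that K describes,
    in kJ mol^-1 and kJ mol^-1 K^-1, using slope = -dH/R and
    intercept = dS/R.
    """
    x = [1.0 / t for t in temps_k]
    y = [math.log(k) for k in k_values]
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return -slope * R_KJ, intercept * R_KJ


def synthetic_k(temp_k, delta_h, delta_s):
    """Hypothetical helper: K at temp_k for a chosen dH and dS (not measured data)."""
    return math.exp(-delta_h / (R_KJ * temp_k) + delta_s / R_KJ)


# Synthetic association constants for 2 hairpin <-> quadruplex.
# The enthalpy (-118 kJ/mol) and entropy (-0.35 kJ/mol/K) are
# illustrative placeholders, not values reported in the paper.
temps = [280.0, 284.0, 288.0, 292.0, 298.0]
ks = [synthetic_k(t, -118.0, -0.35) for t in temps]

dh_assoc, ds_assoc = vant_hoff_fit(temps, ks)
dh_dissociation = -dh_assoc  # dissociation is the reverse reaction
print(f"Enthalpy of dissociation: {dh_dissociation:.1f} kJ/mol")
```

Because the relative populations of Q1, Q2 and Q3 are reported to be insensitive to temperature, their ln K values differ only by a constant offset, which changes the intercept of such a fit but not its slope.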
Studies of other intermolecular quadruplexes (for example, d(TG3T)4 and d(TTAG3T)4, both in K+ solutions) have suggested that each stacking interaction between G-tetrads contributes >100 kJ mol⁻¹ in enthalpy;47,48 however, there is a wide variation according to context and the nature of the stabilising monovalent cation.49 More generally, enthalpies per quartet lie between 60 kJ mol⁻¹ and 100 kJ mol⁻¹, with a number of thrombin aptamers with only a single stack of two G-tetrads lying at the lower end of this range. In the context of the Q1, Q2 and Q3 structures of d(GGACGTAGTGG)2, enthalpies of dissociation in the range 114–122 kJ mol⁻¹ suggest significant contributions from both the core quadruplex structure and the hairpin stem and loop. The low population of the minor species Q2 and Q3 precludes a detailed structural analysis by NMR, but there is limited evidence to suggest that for Q3 the loops are positioned at the edges between adjacent strands rather than in a diagonal arrangement. At equilibrium, Q3 is the least populated species with its exchangeable resonances assigned on the basis of intensity. A small sharp peak at 13.70 ppm has the characteristics of a hydrogen bonded

thymine in a stable Watson-Crick A-T base pair. On the basis of the analysis presented above, we tentatively suggest that this is most likely to arise when the hairpin is accommodated in an edge position over a wide groove where the distance across the loop is compatible with stable base pairing between A3 and T9. This base pair remains stable over a wide range of temperatures and melts in a co-operative two-state fashion with the core of the Q3 G-quadruplex (Fig. 9a). In contrast, there is no resonance attributable to an A-T base pair for either Q1 or Q2, suggesting that Q2 may form a structure with diagonal loops, propeller loops or even a tetrameric assembly (Fig. 9c). It was possible, however, to characterise the structure of the major species, Q1, in detail by NMR at 278 K. We generated an ensemble of structures (using the methodology outlined above), which reveals a 2-fold symmetric structure with the same folding topology identified for the G3 bimolecular quadruplex (Fig. 10). The two structures are related by removing the central G-tetrad from the G3 quadruplex structure in Fig. 7. All other glycosidic torsion angles, and the orientation and nature of the stacking interactions within the hairpin loops and with the G-tetrads, are also completely conserved. However, the lower stability of the G2 structure results in a two-step thermal unfolding in which the predominant species (>90%) in solution at 298 K is the monomeric hairpin. The hairpin subsequently undergoes thermal unfolding to the disordered single strand at temperatures above 315 K (data not shown). The strong resonance at 12.8 ppm in the 298 K data in Fig. 9a shows the presence of a stable C4-G8 base pair. The 2D NMR analysis confirmed that a compact mini-hairpin structure was formed, nucleated around the central loop sequence d(G2ACGTAGTG2), although the 5′- and 3′-termini containing the guanine residues remained disordered. 
Under these conditions, the stability of the hairpin exceeds the stability of the G2 quadruplex core structure such that we observe evidence of the hairpin stem loop influencing the conformational equilibrium between quadruplex structures with different diagonal or edgewise loop arrangements.

Discussion

Influence of loop sequence on the structural topology of bimolecular quadruplexes
The simplest folding pathway for the formation of a bimolecular quadruplex requires two preformed hairpins in the correct orientation to coalesce to form an edge-looped structure (Fig. 1). In practice, this simple association does not lead to the most thermodynamically stable arrangement of G-tetrads, at least for G3-loop-G3 based bimolecular structures. It has been suggested that quartet formation minimises the number of 5′-G(anti)-G(syn) sequential base steps, which results in poor π–π interactions


Fig. 10. (a) Ensemble of ten NMR structures for the G2 bimolecular quadruplex taken from the final 300 ps of the 2.1 ns restrained dynamics simulation (RMSD from the mean 1.67 Å). (b) Ribbon representation of the average NMR-derived structure of the bimolecular quadruplex d(GGACGTAGTGG)2 showing the disposition of the loops with respect to the core G-quadruplex, viewed through a medium groove.

and high energy structures.50 In the current context, the anti/syn arrangement for this sequence has been minimised to zero, showing that this is a strong thermodynamic driving force irrespective of the nature of the connecting loops. This conformational preference appears to drive formation of the diagonally looped quadruplex, which overrides any loop preference for an edge location. In the structure of the G2 bimolecular quadruplexes, where the energetic driving force for hairpin formation versus specific G-tetrad formation is more evenly balanced, we see multiple minor conformations. We speculate that at least one of these (the minor form Q3) has an edge loop, which may reflect the conformational preferences of the loop for forming a stable mini-hairpin with the maximum number of Watson-Crick base pairs and base pair stacking interactions. We see no evidence for this minor conformer in the spectra of the G3-containing sequence. It is established that the structures of G-quadruplexes and their folding topology vary with different lengths of loop.11,12,31,39 Intramolecular assemblies with loop lengths ranging from T1 to T7 have been characterised by CD to show the influence of loop length in stabilising parallel and anti-parallel strand arrangements. In these cases, short loops appear to favour the parallel structures on steric grounds, with the loops on the outside in the grooves rather than in the diagonal or edge arrangements.39 Longer loops appear to favour anti-parallel structures analogous to those reported

here. In general, longer disordered loops appear to decrease the stability. The effects of base substitutions within the loops are subtle and can produce shifts in equilibria between parallel and anti-parallel alignments on the basis of a single base change. The work of Phan & Patel31 with the two-repeat human telomeric sequence d(TAG3TTAG3T) has shown the formation of bimolecular quadruplexes in both parallel and anti-parallel arrangements in K+ solution that coexist and interconvert. The single T→U point mutation to give d(TAG3UTAG3T) results in a predominantly parallel dimeric structure. Double substitution to d(UAG2GBrUTAG3T) results in a predominantly anti-parallel fold. In the case of d(TAG3TTAG3T), the anti-parallel form with edgewise loops persists at low temperature (<323 K), whereas the parallel form is populated at higher temperatures, showing differences in the thermodynamics and kinetics that result in temperature-dependent population changes. The four-repeat form of the human sequence d[AG3(TTAG3)3] has been shown to adopt completely different G-quadruplex topologies in Na+ versus K+ solutions and crystals, or multiple forms in K+ in solution.3,30 In short, the evidence for loop-dependent folding of G-quadruplexes is significant and suggests a high level of sensitivity to point mutations and loop-specific interactions that lead to a complex relationship between fold and sequence. This is underpinned by an intrinsically high degree of structural plasticity,40,51 and ample scope for the design of quadruplex-specific ligands with anti-cancer activity.52


Materials and Methods

DNA synthesis and purification
The oligonucleotides d(GGGACGTAGTGGG) and d(GGACGTAGTGG) were synthesised on a 10 μmol scale on a commercial DNA synthesiser using standard solid-phase phosphoramidite chemistry with the 4,4′-dimethoxytrityl (DMT) protecting groups left in place to aid purification by reverse-phase HPLC. The tritylated sequences were separated from the nontritylated failure sequences in TEAA buffer (0.1 M triethylammonium acetate, pH 7) using an acetonitrile gradient. The HPLC purification was performed on an Agilent 1100 series system using a Hypersil C18 column. The fractions were combined and the acetonitrile was removed by rotary evaporation. Aqueous acetic acid (10 ml, 50%, v/v) was added and the mixture was stirred for 30 min at 308 K. The sample was then extracted with ether (3 × 200 ml) and the aqueous layer was retained. The sample was buffered to pH 7 using 10 mM KH2PO4 and had KCl added (100 mM final concentration) before lyophilisation. The dried sample was suspended in 5 ml of water before a desalting step using FPLC on an Amersham Pharmacia Biotech AKTA prime FPLC machine and a Hi-trap desalting column washed (5 × 5 ml) with filtered Milli-Q water. The DNA fractions were combined to a volume of ∼50 ml and buffered with phosphate buffer (100 mM KH2PO4, pH 7) before lyophilisation.

CD and UV spectroscopy
Samples for UV and CD spectroscopy were prepared at a concentration of 4 μM of single strands in 100 mM KCl, 10 mM potassium phosphate buffer at pH 7.0 using Milli-Q prepared water. UV spectra were collected in a 1 cm3 quartz cuvette on a Pharmacia Biotech Ultraspec 2000 UV/VIS spectrophotometer coupled to a temperature-controlled water-bath. CD spectra were recorded on an Applied Photophysics pi-star 180 CD/fluorescence spectrophotometer fitted with a Peltier heating device. CD spectra were also recorded in a 1 cm3 quartz cuvette over the wavelength range 215–340 nm and at temperatures between 283 K and 363 K with 10 K increments. The sample was equilibrated for 20 min at each temperature and each spectrum was the average of three scans.

NMR sample preparation
The d(GGACGTAGTGG) sample was prepared to give 3.2 mM DNA in 100 mM KCl, 10 mM KH2PO4/K2HPO4 buffer (pH 7). Attempts to duplicate the conditions for d(GGGACGTAGTGGG) resulted in only broad spectra. Instead, optimised conditions of 2 mM DNA in 10 mM KCl at pH 7.85 (adjusted using small amounts of KOH) were used to yield a sample amenable to NMR analysis. The samples were annealed for 12 h. NMR samples, for the observation of exchangeable resonances, were prepared in 600 μl of 10% 2H2O/90% H2O (v/v). The samples were lyophilised and dissolved in 600 μl of 2H2O solution for the observation of non-exchangeable resonances.

NMR experiments
NMR data were collected at 600 MHz on a Bruker Avance spectrometer (fitted with a TXI triple resonance probe with z-axis gradient). Standard phase-sensitive 2D NMR pulse sequences were used throughout, including NOESY, total correlated spectroscopy (TOCSY) and double quantum filtered correlated spectroscopy (DQF-COSY) experiments. Water suppression was achieved using the 1-1 jump-return or WATERGATE pulse sequence for the H2O samples and a presaturation pulse for the 2H2O samples. Phase-sensitive DQF-COSY, TOCSY, and NOESY experiments were performed, collecting either 1024 or 2048 points in t2, and between 400 and 512 points in t1. Data were collected with spectral widths of 20.03 ppm, equating to an acquisition time of 85 ms in t2, with a further relaxation delay of 1.5 s between cycles. NOESY data were acquired at mixing times ranging from 70 ms to 300 ms. All TOCSY experiments employed a spin-locking field of 7 kHz over a mixing time of 68 ms. All data were processed using Bruker XWIN-NMR™ processing software. The 2D data were zero-filled to 2048 × 1024 points and optimised with a shifted sine squared function in both dimensions before Fourier transformation, followed by automatic baseline correction. The spectra were assigned using the CCPNMR analysis programs. The temperature-dependent change in the relative populations of the d(GGACGTAGTGG) quadruplexes and hairpin species enabled us to determine differences in thermodynamic stability, assuming a simple two-state model in which Q1, Q2 and Q3 are each in equilibrium with the monomeric hairpin [H]. Direct interconversion of the quadruplexes without first dissociating to the monomeric hairpin was considered unlikely. We estimated the relative population of hairpin and the three quadruplex species by integrating the NMR signals for the monomeric hairpin G8H (at 12.8 ppm) and those for the clearly resolved signals of G2 from Q1, Q2 and Q3 at around 11.4 ppm (see Fig. 9a). The equilibrium constant was taken as:

$K_{Q_n} = [Q_n]\,c / [H]^2$

where Qn is either Q1, Q2 or Q3, c is the concentration of single strands (2.0 × 10⁻³ M) and [H] is the concentration of the monomeric hairpin. The relative populations of the three quadruplex species are insensitive to temperature, such that a van't Hoff plot of ln KQn versus 1/T for each species produces a linear correlation with a similar slope (Fig. 9b).

Structure refinement and molecular modelling
The structure of the G3 dimer was modelled using a total of 467 restraints (385 non-exchangeable NOE distances, 52 exchangeable NOE distances and 30 hydrogen bonding restraints). Simulations on the G2 dimer utilised 326 restraints (140 × 2 non-exchangeable NOE distances, 24 exchangeable NOE distances and 22 hydrogen bonding restraints). In the case of the G2 dimer, the structure has 2-fold symmetry. Consequently, the restraints were duplicated and applied to both diagonal loops in the model. Derivation of the interproton NOE distance restraints has been described.36 Hydrogen bonding restraints between bases were included when imino resonances showed slow exchange rates. Molecular dynamics simulations were performed using the AMBER 8.0 suite of programs† and explicit solvation models, as described.36,38

† http://amber.scripps.edu/doc8/index.html

The starting structures were constructed using a central G-quartet fragment and two identical mini-hairpin loop fragments. The G3 core fragment was obtained from the deposited d(G3T4G3)2 bimolecular quadruplex structure (PDB code 1FQP) following the transposition of two strands to yield the appropriate syn/anti arrangement. The initial model used loops obtained from previous mini-hairpin studies.53 Two potassium ions were located between tetrad layers. The G2 sequence was constructed in an analogous manner but with the central tetrad removed. The structures were subsequently edited using the LEaP module within AMBER and each one was allowed to fully equilibrate before restrained molecular dynamics simulations were done. The two structural models, G2 and G3, were simulated using the same procedure, with restrained molecular dynamics, incorporating all restraints, for 2.1 ns. Snapshots taken at 30 ps intervals during the last 300 ps of restrained dynamics were used to generate an ensemble of hairpin conformations displaying a pairwise RMSD of 1.73 Å (G3) and 1.67 Å (G2) for all heavy atoms. The final energy-minimised G3 structure violated only four restraints by >0.5 Å (but less than 0.8 Å), all of which were within the loop. Similarly, the final G2 structure violated two restraints by >0.3 Å (but less than 0.6 Å), both of which were within the loop. Structural analysis and generation of the hydrogen bond occupancy was performed using the PTRAJ module of AMBER 8.0 and the structures were displayed using MOLMOL.54

Data bank accession codes
The structure of the d(G3ACGTAGTG3)2 dimer has been deposited with RCSB ID code rscb100896 and the PDB ID code 2kaz.
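The snapshot selection and pairwise RMSD described in the structure refinement procedure can be illustrated with a short Python sketch. It is a hedged example, not the PTRAJ workflow used in this work: the function names are hypothetical, the frames are assumed to be already superimposed (no fitting step is performed), the final snapshot is assumed to coincide with the last frame of the run, and the coordinates are toy values rather than heavy-atom positions from the simulations.

```python
import math
from itertools import combinations


def snapshot_times(total_ps, window_ps, step_ps):
    """Times (ps) of snapshots spaced step_ps apart over the final window_ps.

    Assumes the last snapshot is taken at the end of the run.
    """
    start = total_ps - window_ps + step_ps
    return list(range(start, total_ps + 1, step_ps))


def rmsd(frame_a, frame_b):
    """RMSD between two equal-length lists of (x, y, z) tuples.

    No superposition is applied; frames are assumed pre-aligned.
    """
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(frame_a, frame_b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(frame_a))


def mean_pairwise_rmsd(frames):
    """Average RMSD over all unique pairs of frames."""
    pairs = list(combinations(frames, 2))
    return sum(rmsd(a, b) for a, b in pairs) / len(pairs)


# A 2.1 ns run sampled every 30 ps over its last 300 ps.
times = snapshot_times(2100, 300, 30)

# Toy two-atom "frames" displaced along z, purely for illustration.
toy_frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
    [(0.0, 0.0, 2.0), (1.0, 0.0, 2.0)],
]
toy_mean = mean_pairwise_rmsd(toy_frames)
```

With these settings the selection yields ten snapshots, matching the ensemble size shown in Fig. 10. In practice a least-squares superposition would normally precede the RMSD calculation.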

Acknowledgements
G.D.B. and T.P.G. were supported by the Engineering and Physical Sciences Research Council of the UK and the University of Nottingham.

References
1. Bates, P., Mergny, J. & Yang, D. (2007). Quartets in G-major. The first international meeting on quadruplex DNA. EMBO Rep. 8, 1003–1010. 2. Huppert, J. (2008). Four-stranded nucleic acids: structure, function and targeting of G-quadruplexes. Chem. Soc. Rev. 37, 1375–1384. 3. Wang, Y. & Patel, D. (1993). Solution structure of the human telomeric repeat d[AG3(TTAG3)3] G-tetraplex. Structure, 1, 263–282. 4. Fletcher, T., Sun, D., Salazar, M. & Hurley, L. (1998). Effect of DNA secondary structure on human telomerase activity. Biochemistry, 37, 5536–5541. 5. Mergny, J., Riou, J., Mailliet, P., Teulade-Fichou, M. & Gilson, E. (2002). Natural and pharmacological regulation of telomerase. Nucleic Acids Res. 30, 839–865. 6. Neidle, S. & Parkinson, G. (2002). Telomere maintenance as a target for anticancer drug discovery. Nat. Rev., Drug Discov. 1, 383–393.


7. Sun, H., Karow, J., Hickson, I. & Maizels, N. (1998). The Bloom's syndrome helicase unwinds G4 DNA. J. Biol. Chem. 273, 27587–27592. 8. Sun, H., Yabuki, A. & Maizels, N. A. (2001). Human Nuclease Specific for G4 DNA. Proc. Natl Acad. Sci. USA, 98, 12444–12449. 9. Paeschke, K., Simonsson, T., Postberg, J., Rhodes, D. & Lipps, H. (2005). Telomere end-binding proteins control the formation of G-quadruplex DNA structures in vivo. Nature Struct. Mol. Biol. 12, 847–854. 10. Duquette, M., Handa, P., Vincent, J., Taylor, A. & Maizels, N. (2004). Intracellular transcription of G-rich DNAs induces formation of G-loops, novel structures containing G4 DNA. Genes Dev. 18, 1618–1629. 11. Huppert, J. & Balasubramanian, S. (2005). Prevalence of quadruplexes in the human genome. Nucleic Acids Res. 33, 2908–2916. 12. Todd, A., Johnston, M. & Neidle, S. (2005). Highly prevalent putative quadruplex sequence motifs in human DNA. Nucleic Acids Res. 33, 2901–2907. 13. Seenisamy, J., Rezler, E., Powel, I. T., Tye, D., Gokhale, V., Joshi, C. et al. (2004). The dynamic character of the G-quadruplex element in the c-MYC promoter and modification by TMPyP4. J. Am. Chem. Soc. 126, 8702–8709. 14. Grand, C., Han, H., Munoz, R., Weitman, S., Von Hoff, D., Hurley, L. & Berass, D. J. (2002). The cationic porphyrin TMPyP4 down-regulates c-MYC and human telomerase reverse transcriptase expression and inhibits tumor growth in vivo. Mol. Cancer The. 1, 565–573. 15. Rankin, S., Reszka, A. P., Huppert, J., Zlot, M., Parkinson, G. N., Todd, A. K. et al. (2005). Putative DNA quadruplex within the human c-kit oncogene. J. Am. Chem. Soc. 127, 10584–10589. 16. Fernando, H., Reszka, A. P., Huppert, J., Ladame, S., Rankin, S., Venkitaraman, A. R. et al. (2006). A conserved quadruplex motif located in a transcription activation site of the human c-kit oncogene. Biochemistry, 45, 7854–7860. 17. Bejugam, M., Sewitz, S., Shirude, P. S., Rodriguez, R., Shahid, R. & Balasubramanian, S. (2007). 
Trisubstituted isoalloxazines as a new class of G-quadruplex binding ligands: small molecule regulation of c-kit oncogene expression. J. Am. Chem. Soc. 129, 12926–12927. 18. Lafyatis, R., Denhez, F., Williams, T., Sporn, M. & Roberts, A. (1991). Sequence specific protein binding to and activation of the TGF-β3 promoter through a repeated TCCC motif. Nucleic Acids Res. 19, 6419–6425. 19. De Armond, R., Wood, S., Sun, D., Hurley, L. H. & Ebinghaus, S. W. (2005). Evidence for the presence of a guanine quadruplex forming region within a polypurine tract of the hypoxia inducible factor 1α promoter. Biochemistry, 44, 16341–16350. 20. Pestov, D. G., Dayn, A., Siyanova, E., George, D. L. & Mirkin, S. M. (1991). H-DNA and Z-DNA in the mouse c-Ki-ras promoter. Nucleic Acids Res. 19, 6527–6532. 21. Sun, D., Guo, K., Rusche, J. J. & Hurley, L. H. (2005). Facilitation of a structural transition in the polypurine/polypyrimidine tract within the proximal promoter region of the human VEGF gene by the presence of potassium and G-quadruplex-interactive agents. Nucleic Acids Res. 33, 6070–6080. 22. Dai, J., Dexheimer, T. S., Chen, D., Carbver, M., Ambrus, A., Jones, R. A. & Yang, D. (2006). An intramolecular G-quadruplex structure with mixed parallel/antiparallel G-strands formed in the human

bcl-2 promoter region in solution. J. Am. Chem. Soc. 128, 1096–1098. 23. Smith, F. & Feigon, J. (1993). Strand Orientation in the DNA Quadruplex Formed from the Oxytricha Telomere Repeat Oligonucleotide d(G4T4G4) in Solution. Biochemistry, 32, 8682–8692. 24. Haider, S., Parkinson, G. & Neidle, S. (2002). Crystal structure of the potassium form of an Oxytricha nova G-quadruplex. J. Mol. Biol. 320, 189–200. 25. Smith, F., Lau, F. & Feigon, J. (1994). d(G3T4G3) Forms an asymmetric diagonally looped dimeric quadruplex with guanosine 5′-syn-syn-anti and 5′-syn-anti-anti N-glycosidic conformations. Proc. Natl Acad. Sci. USA, 91, 10546–10550. 26. Kettani, A., Bouaziz, S., Gorin, A., Zhao, H., Jones, R. & Patel, D. (1998). Solution structure of a Na cation stabilized DNA quadruplex containing G·G·G·G and G·C·G·C tetrads formed by G-G-G-C repeats observed in adeno-associated viral DNA. J. Mol. Biol. 282, 619–636. 27. Crnugelj, M., Hud, N. & Plavec, J. (2002). The solution structure of d(G4T4G3)2: a bimolecular G-quadruplex with a novel fold. J. Mol. Biol. 320, 911–924. 28. Crnugelj, M., Sket, P. & Plavec, J. (2003). Small change in a G-rich sequence, a dramatic change in topology: new dimeric G-quadruplex folding motif with unique loop orientations. J. Am. Chem. Soc. 125, 7866–7871. 29. Hazel, P., Parkinson, G. & Neidle, S. (2006). Topology variation and loop structural homology in crystal and simulated structures of a bimolecular DNA quadruplex. J. Am. Chem. Soc. 128, 5480–5487. 30. Parkinson, G., Lee, M. & Neidle, S. (2002). Crystal structure of parallel quadruplexes from human telomeric DNA. Nature, 417, 876–880. 31. Phan, A. & Patel, D. (2003). Two-repeat human telomeric d(TAGGGTTAGGGT) sequence forms interconverting parallel and antiparallel G-quadruplexes in solution: distinct topologies, thermodynamic properties, and folding/unfolding kinetics. J. Am. Chem. Soc. 125, 15021–15027. 32. Hirao, I., Kawai, G., Yoshizawa, S., Nishimura, Y., Ishido, Y., Watanabe, K. & Miura, K. (1994). Most compact hairpin-turn structure exerted by a short DNA fragment, d(GCGAAGC) in solution – an extraordinarily stable structure resistant to nucleases and heat. Nucleic Acids Res. 22, 576–582. 33. Yoshizawa, S., Kawai, G., Watanabe, K., Miura, K. & Hirao, I. (1997). GNA trinucleotide loop sequences producing extraordinarily stable DNA minihairpins. Biochemistry, 36, 4761–4767. 34. Arai, K., Low, R., Kobori, J., Shlomai, J. & Kornberg, A. (1981). Mechanism of DnaB protein action. J. Biol. Chem. 256, 5273–5280. 35. Glucksman, M., Markiewicz, P., Malone, C. & Rothman-Denes, L. B. (1992). Specific sequences and a hairpin structure in the template strand are required for N4-virion RNA-polymerase promoter recognition. Cell, 70, 491–500. 36. Colgrave, M., Williams, H. & Searle, M. S. (2002). Structure of a drug-induced DNA T-bulge: Implications for DNA frameshift mutations. Angew. Chem. Int. Edit. 41, 4754–4756. 37. Gallagher, C. & Searle, M. S. (2003). Drug-induced stabilisation of a mismatched C-T base pair in a DNA hairpin. Chem. Commun. 1814–1815. 38. Williams, H., Colgrave, M. & Searle, M. S. (2002). Drug recognition of a DNA single strand break Nogalamycin intercalation between coaxially stacked hairpins. Eur. J. Biochem. 269, 1726–1733.

39. Hazel, P., Huppert, J., Balasubramanian, S. & Neidle, S. (2004). Loop-length-dependent folding of G-quadruplexes. J. Am. Chem. Soc. 126, 16405–16415.
40. Webba da Silva, M. (2007). Geometric formalism of DNA quadruplex folding. Chem. Eur. J. 13, 9738–9745.
41. Parkinson, G. N. (2007). Fundamentals of quadruplex structure. In Quadruplex Nucleic Acids (Neidle, S. & Balasubramanian, S., eds), pp. 1–30, Royal Society of Chemistry Publishing, Cambridge, UK.
42. Rachwal, P. A., Findlow, I. S., Werner, J. M., Brown, T. & Fox, K. R. (2007). Intramolecular DNA quadruplexes with different arrangements of short and long loops. Nucleic Acids Res. 35, 4214–4222.
43. Feigon, J., Koshlap, K. M. & Smith, F. (1995). 1H NMR spectroscopy of DNA triplexes and quadruplexes. Methods Enzymol. 261, 225–255.
44. Keniry, M., Owen, E. & Shafer, R. (1997). The contribution of thymine-thymine interactions to the stability of folded dimeric quadruplexes. Nucleic Acids Res. 25, 4389–4392.
45. Schultze, P., Hud, N. V., Smith, F. W. & Feigon, J. (1999). The effect of sodium, potassium and ammonium ions on the conformation of the dimeric quadruplex formed by the Oxytricha nova telomere repeat oligonucleotide d(G4T4G4). Nucleic Acids Res. 27, 3018–3028.
46. Hud, N. V. & Plavec, J. (2007). Role of cations in determining quadruplex structure and stability. In Quadruplex Nucleic Acids (Neidle, S. & Balasubramanian, S., eds), pp. 100–130, Royal Society of Chemistry Publishing, Cambridge, UK.
47. Jin, R., Gaffney, B., Wang, C., Jones, R. & Breslauer, K. (1992). Thermodynamics and structure of a DNA tetraplex – a spectroscopic and calorimetric study of the tetramolecular complexes of d(TG3T) and d(TG3T2G3T). Proc. Natl Acad. Sci. USA, 89, 8832–8836.
48. Gavathiotis, E. & Searle, M. S. (2003). Structure of the parallel-stranded DNA quadruplex d(TTAGGGT)4 containing the human telomeric repeat: evidence for A-tetrad formation from NMR and molecular dynamics simulations. Org. Biomol. Chem. 1, 1650–1656.
49. Mergny, J.-L., Gros, J., De Cian, A., Bourdoncle, A., Rosu, F., Sacca, B. et al. (2007). Energetics, kinetics and dynamics of quadruplex folding. In Quadruplex Nucleic Acids (Neidle, S. & Balasubramanian, S., eds), pp. 31–80, Royal Society of Chemistry Publishing, Cambridge, UK.
50. Keniry, M. (2000). Quadruplex structures in nucleic acids. Biopolymers, 56, 123–146.
51. Phan, A. T., Kuryavyi, V., Ngoc, K. & Patel, D. J. (2007). Structural diversity of G-quadruplex scaffolds. In Quadruplex Nucleic Acids (Neidle, S. & Balasubramanian, S., eds), pp. 81–99, Royal Society of Chemistry Publishing, Cambridge, UK.
52. Balkwill, G. & Searle, M. S. (2007). DNA quadruplex-ligand recognition: structure and dynamics. In Quadruplex Nucleic Acids (Neidle, S. & Balasubramanian, S., eds), pp. 131–153, Royal Society of Chemistry Publishing, Cambridge, UK.
53. Balkwill, G., Williams, H. & Searle, M. S. (2007). Structure and folding dynamics of a DNA hairpin with a stabilising d(GNA) trinucleotide loop: influence of base pair mis-matches and point mutations on conformational equilibria. Org. Biomol. Chem. 5, 832–839.
54. Koradi, R., Billeter, M. & Wuthrich, K. (1996). MOLMOL: a program for display and analysis of macromolecular structures. J. Mol. Graph. 14, 51–55.