Sequence specificity of 125I-labelled Hoechst 33258 damage in six closely related DNA sequences


J. Mol. Biol. (1988) 203, 63-73

Vincent Murray and Roger F. Martin

Molecular Science Group, Peter MacCallum Cancer Institute, 481 Little Lonsdale Street, Melbourne, Victoria 3000, Australia

(Received 10 February 1988)

The sequence selectivity of [125I]Hoechst 33258 in six 340 base-pair DNA sequences has been investigated. [125I]Hoechst 33258, which is a bis-benzimidazole and binds to the minor groove of B-DNA, preferentially binds to A+T-rich regions of DNA. Six out of nine strong binding sites contained four or more consecutive A·T base-pairs, while the other three strong binding sites were AATGATT, TATAGAAA (the peak of damage was in the run of 3 A residues) and AAA. One of the six weak binding sites had five consecutive A·T base-pairs, two of the weak binding sites had three, and three did not have any. In addition to genomic 340 base-pair αRI-DNA (which is a tandem repeat in human cells), five 340 base-pair αRI-DNA clones were generated that differed from the genomic "consensus" sequence by a number of random base alterations. The effect of these base changes on the sequence specificity of [125I]Hoechst 33258 damage indicated that, of the base changes that interrupted 14 binding sites, six decreased and eight did not change the extent of damage, while two sites changed position. Of the base alterations that augmented 17 binding sites, five increased, two decreased and ten did not alter the degree of cleavage, while ten sites changed position. It was concluded from the data that, while runs of consecutive A·T base-pairs were the most important parameter determining [125I]Hoechst 33258 binding, other factors including position in the DNA sequence, nearest-neighbour and long-range interactions were also important.

0022-2836/88/170063-11 $03.00/0 © 1988 Academic Press Limited

1. Introduction

The bis-benzimidazole Hoechst 33258 binds to the minor groove of DNA. Martin & Holmes (1983) examined the binding sites of Hoechst 33258 in a defined DNA sequence from pBR322. They found that four consecutive A·T base-pairs were necessary for strong binding. However, they demonstrated that the sequence in the four consecutive A·T base-pairs was not the sole determinant of the degree of Hoechst 33258 binding, but that neighbouring sequences have a marked influence on the ligand binding.

There is now strong evidence that the structure of DNA is not a uniform Watson-Crick helix but can adopt various minor alterations in structure. The most direct evidence for the microheterogeneity comes from X-ray crystallography of oligonucleotides coupled with studies of DNase I treatment of the same oligonucleotide (Drew et al., 1981; Dickerson & Drew, 1981; Lomonossoff et al., 1981; Calladine & Drew, 1984). Other evidence for the microheterogeneity of DNA comes from the study of the sequence specificity of DNA-damaging agents (Jessee et al., 1982; Cartwright & Elgin, 1982; Martin & Holmes, 1983; Boles & Hogan, 1984; Murray & Martin, 1985a). The DNA sequence is thought to cause these minor structural alterations in DNA (Shakked & Rabinovich, 1986) and in this work we examined the influence of base substitutions on the degree of binding of Hoechst 33258.

In order to identify the sites of Hoechst 33258 binding to DNA, 125I was covalently attached to Hoechst 33258. 125I decays by electron capture so that, where the Hoechst 33258 binds to DNA, the intense emission of low-energy (Auger) electrons that accompanies 125I decay causes a double-strand break in DNA at the exact position of ligand binding (Martin & Holmes, 1983).

We used six DNA sequences derived from human αRI-DNA as our target species. Human αRI-DNA is a middle repetitive sequence, 340 bp† in length, that exists mainly as tandem repeats at the centromeres of chromosomes (Wu & Manuelidis, 1980; Darling et al., 1982). The 340 bp EcoRI repeat is sufficiently homogeneous to give a unique DNA sequence using the Maxam-Gilbert technique. However, the 50,000 copies that are present in the human haploid genome are not all the same but contain random base alterations. Sequencing data (Jorgensen et al., 1986; Murray & Martin, 1987) indicate that αRI-DNA sequences differ by an average of 9% from the consensus sequence. We used this natural heterogeneity to obtain five 340 bp α-DNA clones that differ randomly from the "consensus" sequence in a number of base changes. We then used these five αRI-DNA clones plus the genomic 340 bp αRI-DNA to examine the intensity and position of [125I]Hoechst 33258 damage in each DNA species.

2. Materials and Methods

(a) Materials

The chemicals and reagents used were all analytical reagent grade. Klenow DNA polymerase was purchased from BRESA; [32P]dATP was from Du Pont; carrier-free 125I was obtained from Amersham; and all other enzymes and vectors were from Bethesda Research Laboratories. [125I]Hoechst 33258 was synthesized with carrier-free 125I as described previously (Martin & Pardee, 1985).

(b) The 340 bp αRI-DNA sequences

Five 340 bp αRI-DNA clones were constructed and sequenced by conventional methods (Murray & Martin, 1985b, 1987). Clones α6, α22, α82 and α32 were sequenced on both strands using the M13-dideoxy sequencing procedure (Sanger et al., 1977, 1980) and clone αB3 was sequenced by the Maxam & Gilbert (1980) method (see Murray & Martin, 1985b). The supercoiled plasmids were purified by ethidium bromide/CsCl density centrifugation. The 340 bp αRI-DNA inserts were then isolated after digestion with EcoRI and elution from a 1.5% agarose/ethidium bromide gel using the procedure of Dretzen et al. (1981). Genomic 340 bp αRI-DNA was prepared as described (Murray & Martin, 1985c).

(c) [125I]Hoechst 33258 incubation conditions

Approximately 10 ng of 340 bp αRI-DNA was incubated with 3 × 10⁷ to 6 × 10⁷ cts/min of [125I]Hoechst 33258 (0.24 to 0.48 µM) in 200 mM-Tris·HCl (pH 7.5), 500 mM-NaCl, 5 mM-EDTA, 5 mM-KI at 4°C for 30 to 50 days in a final volume of 50 µl. The molar binding ratio for [125I]Hoechst 33258 per base-pair of DNA was between 1.4 and 2.9. The controls contained no [125I]Hoechst 33258. After precipitation with ethanol, the samples were labelled with [32P]dATP at the EcoRI ends (Maniatis et al., 1982) and subjected to the M13 hybridization procedure (see below). This protocol reduced the quantity of 125I to approximately background levels, and so treatment of the samples with NaOH in methanol as described by Martin & Holmes (1983) was unnecessary. The samples were then electrophoresed on urea/polyacrylamide DNA sequencing gels and autoradiographed without an intensifying screen. For each strand in each experiment, an 8% gel was run to analyse smaller fragments and a 5% gel to examine larger fragments.

† Abbreviation used: bp, base-pair(s).

(d) Densitometer analysis

The autoradiographs were scanned with an LKB 2202 laser densitometer and the peak heights were determined. The measurements in the corresponding control were subtracted from those of the damaged samples, and the results were normalized with reference to base-pair 96. The normalized densitometer readings are shown in Tables 2 and 3 in the columns marked D. In three separate experiments, analysis was carried out on both strands, and an additional experiment was analysed only on the minus strand. The values in Tables 2 and 3 are averages of these three or four experiments.
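The arithmetic described above (control subtraction, normalization to base-pair 96, averaging over experiments) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the peak heights are invented for the example and are not data from this study, and the densitometer software actually used is not described in the text.

```python
from statistics import mean

REFERENCE_BP = 96  # binding site used as the normalization reference in the text


def normalize_lane(damaged, control, reference_bp=REFERENCE_BP):
    """Subtract control peak heights, then scale so the reference site equals 1.0.

    `damaged` and `control` map binding-site centre (bp) -> peak height.
    Sites missing from the control are treated as having zero background
    (an assumption made for this sketch).
    """
    corrected = {bp: height - control.get(bp, 0.0) for bp, height in damaged.items()}
    reference = corrected[reference_bp]
    if reference <= 0:
        raise ValueError("reference site has no signal above the control")
    return {bp: value / reference for bp, value in corrected.items()}


def average_experiments(experiments):
    """Average normalized values site by site over the experiments that measured each site."""
    sites = sorted({bp for exp in experiments for bp in exp})
    return {bp: mean(exp[bp] for exp in experiments if bp in exp) for bp in sites}


# Hypothetical peak heights in arbitrary densitometer units (not from the paper).
control = {86: 2.0, 96: 2.0, 145: 4.0}
exp1 = normalize_lane({86: 30.0, 96: 52.0, 145: 60.0}, control)
exp2 = normalize_lane({86: 26.0, 96: 42.0, 145: 44.0}, control)
averaged = average_experiments([exp1, exp2])
```

Normalizing each experiment before averaging keeps the reference site at exactly 1.0 in the averaged values, as it is in Tables 2 and 3.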

Densitometer values were obtained for both strands between base-pairs 80 and 285. Weak hotspots are defined as having an average densitometer value for both strands (relative to the binding site at base-pair 96) of less than 0.70, while strong hotspots are greater than 0.70. On comparison of the densitometer values at a particular binding site on each strand for the six sequences examined, it can be seen that the values are reasonably similar from base-pairs 86 to 220. From base-pairs 220 to 270, however, the minus strand values are high and the plus strand values are low.

(e) M13 hybridization procedure

This protocol (Murray & Martin, 1985b) separates the DNA strands, resulting in a single-stranded DNA molecule with the 32P label at one end, and also removes non-αRI-DNA from the samples. Briefly, the procedure involves hybridization of 32P-labelled αRI-DNA to a single-stranded M13 clone containing an αRI-DNA insert. The insert can be in two orientations; the plus strand insert will hybridize minus strands, and the minus strand insert will hybridize plus strands. The M13 αRI-DNA hybrids can be purified easily on a non-denaturing polyacrylamide gel. The samples were then loaded onto thin urea/polyacrylamide DNA sequencing gels to assess cleavage. The only alteration from the published procedure was that hybridization was at 55°C instead of 70°C because the highly divergent clones α6 and α22 did not hybridize efficiently at the higher temperature.

3. Results

(a) The sequence specificity of [125I]Hoechst 33258

The DNA sequences of the six 340 bp αRI-DNAs used in these experiments are shown in Table 1. The "consensus" sequence is that determined for genomic 340 bp αRI-DNA (Wu & Manuelidis, 1980). The five cloned DNAs examined fall into two classes, based on the number of differences in nucleotide sequence compared to the consensus sequence. The first class consists of three clones with a low proportion of base substitutions: αB3, 3.2%; α82, 1.5%; and α32, 3.2%. The second class has two clones with a high proportion of base alterations: α22, 15.0%; and α6, 18.9% divergence.

Figures 1 and 2 illustrate autoradiographs of urea/polyacrylamide sequencing gels analysing [125I]Hoechst 33258-damaged 340 bp αRI-DNAs. It can be seen that the damaged DNAs (lanes 6 to 12) have hotspots that consist of four to ten cleaved bands with more intense cleavage at the centre. This type of damage is characteristic of 125I-induced cleavage of DNA (Martin & Haseltine, 1981; Martin & Holmes, 1983). The intensity of damage varies considerably from hotspot to hotspot. With reference to the Maxam-Gilbert G+A sequencing tracks (lanes 5 and 13), the exact position of the damaged DNA can be determined to the nearest base-pair. Data have been obtained for six αRI-DNAs from base-pairs 30 to 290 on the plus strand and from base-pairs 80 to 330 on the minus strand. So, data have been collected for both strands between base-pairs 80 and 290.

Figure 1. Autoradiograph of an 8% polyacrylamide DNA sequencing gel with six 340 bp αRI-DNA sequences damaged by [125I]Hoechst 33258; the plus strand damage is shown. Lanes 1 to 4 and 14 to 16 are controls that were not incubated with [125I]Hoechst 33258. Lanes 6 to 12 were incubated with [125I]Hoechst 33258. Lanes 5 and 13 are the Maxam-Gilbert G+A sequencing tracks. Lanes 1 and 6, genomic 340 bp αRI-DNA; lanes 2 and 7, clone αB3; lanes 4 and 9, clone α6; lanes 10 and 14, clone α22; lanes 11 and 15, clone α32; lanes 12 and 16, clone α82. Lane 8 is a cloned sequence that is not discussed in this paper.

Figure 2. Autoradiograph of an 8% polyacrylamide DNA sequencing gel with six 340 bp αRI-DNA sequences damaged by [125I]Hoechst 33258; the minus strand damage is shown. Lanes 1 to 4 and 14 to 16 are controls that were not incubated with [125I]Hoechst 33258. Lanes 6 to 12 were incubated with [125I]Hoechst 33258. Lanes 5 and 13 are the Maxam-Gilbert G+A sequencing tracks. Lanes 1 and 6, genomic 340 bp αRI-DNA; lanes 2 and 7, clone αB3; lanes 4 and 9, clone α6; lanes 10 and 14, clone α22; lanes 11 and 15, clone α32; lanes 12 and 16, clone α82. Lane 8 is a cloned sequence that is not discussed in this paper.

The hotspots for genomic 340 bp αRI-DNA are shown with reference to the DNA sequence in Figure 3. For clones αB3, α82 and α32, the position of the damage hotspots is exactly the same as in the genomic sequence except in the region from base-pairs 60 to 75. With clone αB3, each particular hotspot has the same strong or weak damage as the genomic sequence, but with clones α82 and α32 the intensity of the hotspots is slightly different. Clones α6 and α22 showed numerous differences in position and intensity of damage.

One striking feature of the damage hotspots is that, on comparing damage on both strands, the hotspots are staggered around an A+T-rich site. The hotspot is always biased towards the 3' end, which is labelled with 32P. This feature is explained in the Discussion.

On examining the genomic 340 bp αRI-DNA sequences damaged by [125I]Hoechst 33258, it is apparent that the centre of the staggered hotspots is usually an A+T-rich region. For the hotspot centred on base-pair 145, six consecutive A·T base-pairs are present; for the hotspots at base-pairs 86 and 96, there are five; for the hotspots at base-pairs 109, 137, 228 and 261, there are four; there is a two-peaked hotspot with peaks at base-pairs 171 and 174 at the sequence AATGAT (the dot indicates base-pair 170, and in subsequent DNA sequences the dot is on the base-pair that is a multiple of ten. Also, runs of 3 or more consecutive A·T base-pairs are underlined). For the hotspots at base-pairs 158 (although this is part of the sequence TATAGAAA), 182, 189 and 249, there are three consecutive A·T base-pairs; for the hotspots at base-pairs 200, 210 and 237, there are none. There are four or more consecutive A·T base-pairs present in seven out of 15 hotspots but there are nine out of 15 if the sequences AATGATT and TATAGAAA are included.

Weak hotspots are defined as having an average densitometer value (relative to the binding site at base-pair 96) of less than 0.70, while strong hotspots are greater than 0.70. If the weak hotspots are excluded, six out of nine hotspots have four or more consecutive A·T base-pairs, and eight out of nine if AATGATT and TATAGAAA are included. The binding site at base-pair 249, AAA, has an average densitometer reading of 0.82 with a run of only three A residues present. The sequence AAA occurs four times between base-pairs 80 and 290, and results in one strong binding site at base-pair 249. GGAAA

(+)   1 AATTCTCAGTAACTTCCTTGTGTTGTGTGTATTCAACTCACAGAGTTGAA  50
(-)     TTAAGAGTCATTGAAGGAACACAACACACATAAGTTGAGTGTCTCAACTT

(+)  51 CGATCCTTTACACAGAGCAGACTTGAAACACTCTTTTTGTGGAATTTGCA 100
(-)     GCTAGGAAATGTGTCTCGTCTGAACTTTGTGAGAAAAACACCTTAAACGT

(+) 101 AGTGGAGATTTCAGCCGCTTTGAGGTCAATGGTAGAATAGGAAATATCTT 150
(-)     TCACCTCTAAAGTCGGCGAAACTCCAGTTACCATCTTATCCTTTATAGAA

(+) 151 CCTATAGAAACTAGACAGAATGATTCTCAGAAACTCCTTTGTGATGTGTG 200
(-)     GGATATCTTTGATCTGTCTTACTAAGAGTCTTTGAGGAAACACTACACAC

(+) 201 CGTTCAACTCACAGAGTTTAACCTTTCTTTTCATAGAGCAGTTAGGAAAC 250
(-)     GCAAGTTGAGTGTCTCAAATTGGAAAGAAAAGTATCTCGTCAATCCTTTG

(+) 251 ACTCTGTTTGTAAAGTCTGCAAGTGGATATTCAGACCTCTTTGAGGCCTT 300
(-)     TGAGACAAACATTTCAGACGTTCACCTATAAGTCTGGAGAAACTCCGGAA

(+) 301 CGTTGGAAACGGGATTTCTTCATATTATGCTAGACAGAAGAATT 344
(-)     GCAACCTTTGCCCTAAAGAAGTATAATACGATCTGTCTTCTTAA

Figure 3. The sequence specificity of [125I]Hoechst 33258 damage to genomic 340 bp αRI-DNA. The consensus sequence of 340 bp αRI-DNA is depicted (Wu & Manuelidis, 1980). Above the DNA sequence, the damage on the plus strand is shown with minus strand damage below. The extent of damage is represented by a single line. Strong damage is indicated by a large triangle and weak damage by a small triangle. The positions of the triangles correspond to the peak of damage intensity. (The damage markers of the original figure are not shown here.)
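The two criteria used in this section, runs of consecutive A·T base-pairs and the 0.70 cut-off between weak and strong hotspots, can be illustrated with a short Python sketch. It is not part of the original analysis. The function names are hypothetical, and treating a value of exactly 0.70 as weak is an assumption, since the text defines only "less than" and "greater than" 0.70. The example fragment is base-pairs 151 to 164 of the consensus plus strand, which contains the TATAGAAA site.

```python
import re

STRONG_THRESHOLD = 0.70  # cut-off quoted in the text, relative to the site at base-pair 96


def at_runs(sequence, min_length=3, offset=1):
    """Return (start, end, run) for every run of >= min_length consecutive A or T.

    Positions are 1-based and inclusive; `offset` is the position of sequence[0].
    """
    pattern = re.compile(r"[AT]{%d,}" % min_length)
    return [
        (m.start() + offset, m.end() + offset - 1, m.group())
        for m in pattern.finditer(sequence.upper())
    ]


def classify_hotspot(mean_value, threshold=STRONG_THRESHOLD):
    """Label a hotspot from its averaged, normalized densitometer value.

    Assumption: a value exactly equal to the threshold is labelled weak.
    """
    return "strong" if mean_value > threshold else "weak"


# Base-pairs 151 to 164 of the consensus plus strand.
runs = at_runs("CCTATAGAAACTAG", min_length=3, offset=151)
# The base-pair 249 site has an average reading of 0.82 in the text.
label_249 = classify_hotspot(0.82)
```

Here `runs` holds the TATA run at base-pairs 153 to 156 and the AAA run at base-pairs 158 to 160, the latter being the run in which the peak of damage at TATAGAAA was observed.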
Table 2
Densitometer-derived intensity of [125I]Hoechst damage with six 340 bp αRI-DNA sequences on the plus strand

Column headings: Centre of binding site (bp); genomic, αB3, α82, α32, α22 and α6 DNA, with columns D and BS.

[The body of the table could not be recovered from the source.]

D, Densitometer measurement relative to measurement at base-pair 96. BS, Base substitution relative to the consensus sequence of genomic αRI-DNA, with the position in base-pairs in parentheses. When a lower case a (adenine), g (guanine), c (cytosine), t (thymine) or d (deletion) is used, the base substitution does not interrupt or augment a run of consecutive A·T base-pairs. When the upper case G, C or D is shown, this indicates that a run of A·T base-pairs is interrupted. For upper case A or T, this indicates that a run of consecutive A·T base-pairs is augmented or created (for 3 or more consecutive A·T base-pairs).
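The case convention of the BS columns can be expressed as a small parser. The following Python sketch is an illustration only: the class and function names are hypothetical, and the mapping simply restates the footnote (lower case, no effect on a run of A·T base-pairs; upper case G, C or D, run interrupted; upper case A or T, run augmented or created).

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    symbol: str    # a, c, g, t or d (deletion), in the case used in the table
    position: int  # base-pair position in the consensus numbering
    effect: str    # "interrupts", "augments" or "neutral"


# One table entry: a single letter followed by the position in parentheses.
TOKEN = re.compile(r"([ACGTDacgtd])\((\d+)\)")


def parse_substitution(token):
    """Parse an entry such as 'T(92)' or 'g(55)' and classify it by letter case."""
    match = TOKEN.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"not a substitution entry: {token!r}")
    symbol, position = match.group(1), int(match.group(2))
    if symbol.islower():
        effect = "neutral"      # does not interrupt or augment an A·T run
    elif symbol in "GCD":
        effect = "interrupts"   # breaks a run of A·T base-pairs
    else:                       # upper-case A or T
        effect = "augments"     # extends or creates a run of A·T base-pairs
    return Substitution(symbol, position, effect)


examples = [parse_substitution(t) for t in ("T(92)", "g(55)", "G(59)", "D(262)")]
```

The entries used in the example are ones that appear in the text and tables of this paper.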

sequence at base-pair 219 that has five consecutive A·T base-pairs (TTTAA) but is not damaged at a detectable level (Fig. 2). The weak binding sites at base-pairs 200, 210 and 237 do not contain any runs of consecutive A·T base-pairs but have the sequences GTGCG, CTCAC and AGAGC.

Of the six 340 bp αRI-DNAs used in these experiments (Table 1), the clones from the first class, with a low percentage of base substitutions, were very similar with respect to position of damage and extent of damage. But clones from the second class, with a high percentage of base alterations, were quite different with respect to the position of damage and extent of damage at many binding sites. In general, there are advantages in comparing the effects of base substitutions at damage hotspots in separate cloned sequences rather than at an altered sequence in another part of the same DNA. This is principally because neighbouring hotspots provide a common reference point.

Table 3
Densitometer-derived intensity of [125I]Hoechst 33258 damage with six 340 bp αRI-DNA sequences on the minus strand

Column headings: Centre of binding site (bp); genomic, αB3, α82, α32, α22 and α6 DNA, with columns D and BS.

[The body of the table could not be recovered from the source.]

Column headings D and BS have the same meanings as in Table 2.

The autoradiographs were subjected to a densitometer analysis and the values at each damage site are shown in Tables 2 and 3 in the columns marked D. These values are an average of three separate experiments. In the columns marked BS, the base alterations are indicated: upper case letters indicate base changes that will alter a potential [125I]Hoechst 33258 binding site, which is considered to be three or more consecutive A·T base-pairs; lower case letters indicate changes that do not affect potential binding sites. There are 53 base alterations at 41 potential [125I]Hoechst 33258 binding sites. Of these 41 potential binding sites, 31 have detectable damage associated with them. Out of these 31 binding sites, 14 have alterations that interrupt runs of A·T base-pairs, while 17 sites are altered in such a way as to augment the runs of A·T base-pairs. Six of the 14 interrupted binding sites have a decreased intensity of damage and eight are unaltered in damage intensity; in addition, two binding sites have moved relative to the DNA sequence. Five of the 17 augmented binding sites have an increased degree of cleavage, two have decreased and ten are the same; ten binding sites have changed position.

(b) Nucleotide alterations that interrupt potential [125I]Hoechst 33258 binding sites

Of the 21 potential [125I]Hoechst 33258 binding sites with base alterations that interrupt runs of A·T base-pairs, eight consist of three consecutive A·T base-pairs, which are reduced to two or less. Seven are not binding sites: clone αB3 C(128); α82 C(120); α32 C(290); α22 C(121), G(129); α6 D(121), C(290). In clone α6, the G(189) base substitution interrupts the binding site at base-pair 189, and the densitometer measurement is decreased by 50% on one strand but not significantly on the other strand at this site. There are three occasions where a run of four A·T base-pairs is reduced to three. In clone α22, G(136) reduces the intensity of damage at the binding site at base-pair 137 four- to fivefold. For clone α6, C(30) and G(60) did not alter the densitometer measurement. A run of four A·T base-pairs is reduced to two or less at three sites.
In clone α32 G(59) and in clone α6 C(229), the intensity of damage is reduced. However, in clone α22 G(59), which is exactly the same substitution as in clone α32 G(59), no change in the intensity of damage can be detected. For clone α22, there are additional flanking base substitutions that could compensate for the G(59) substitution. For clone α6, G(85) G(86) reduce a run of five A·T base-pairs to two or less. The densitometer measurement is reduced by two- to 15-fold at binding site base-pair 86. There are two occasions where a run of seven A·T base-pairs is reduced to five. For clones α82 C(327) and αB3 C(327), no alteration in densitometer intensity is apparent. In clone α6 G(324) C(328), the run of seven A·T base-pairs is reduced to three. The intensity of cleavage at the base-pair 325 binding site is reduced by sixfold. The deletion at base-pair 262 in clone α22 changes the sequence from TTTGTAAA to TTTGAAA. This alteration does not consistently alter the intensity of damage but the position of the hotspot is moved from base-pair 261 to 260. The two substitutions at binding site base-pairs 169 to 174 for clone α6, C(171) a(174), change AATGATT to AACGAAT without affecting the position or intensity of damage. The complex changes at the base-pair 263 binding site in clone α6, T(256) C(258) G(263) A(265), alter the sequence from TGTTTGTAAAGT to TTTCTGTAGAAT. There is no consistent alteration in the intensity of damage but the binding site moves from base-pair 261 to 263. This indicates that the new binding site is TAGAA.

(c) Nucleotide alterations that augment potential [125I]Hoechst 33258 binding sites

Out of 20 potential [125I]Hoechst 33258 binding sites with base alterations that augment runs of A·T base-pairs, four result in a run of three. For clone α22 T(55), A(114) A(115) and clone α6 T(63), no binding site is detected. However, for clone α6 A(202), the base-pair 200 binding site is moved in the expected direction to base-pair 202 without any change in intensity of cleavage. There are five positions where three A·T base-pairs are converted to four. With clone αB3 A(310), no change in intensity is observed. For clone α22 A(310), there is an unexpected decrease in intensity of cleavage. The base substitution A(184) in clone α22 changes the position of binding from base-pair 182 to 184; the intensity of cleavage is increased on one strand but not significantly on the other strand. The change T(245) in clone α6 is similar, in that the intensity of damage is increased on one strand but not on the other strand, and the position of binding has moved from base-pair 237 to 242. This large change in position of 5 bp is caused by the creation of four A·T base-pairs where previously three were not a binding site. For clone α22 T(250), no change in degree of cleavage occurs but the binding site moves from base-pair 249 to 251.
The base substitution T(107) in clone α6 changes a run of four A·T base-pairs to five. This results in a twofold increase in degree of cleavage on both strands at the base-pair 109 binding site. There are three occasions where a run of five A·T base-pairs is increased to six. For clone α32 T(92) and clone α6 T(282), no change in intensity of binding occurred. With clone α22 A(310), an unexpected decrease in intensity of cleavage was observed. The alteration of T(72) d(71) in clone α6 changes the sequence GACTTGAA to GTTTGAAA. There is a fourfold increase in intensity of damage and the binding site moves from base-pair 73 to 71. The changes of T(166) and A(168) in clone KU, which alter the sequence from ACAGAATGAT to ATAAAATGAT, produce no significant change in the intensity of cleavage but the position of binding moves from base-pairs 169 to 174 to 169 to 173.
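The sequence changes quoted in this section can be checked mechanically by applying the listed substitutions to the consensus fragment and comparing the longest run of A·T base-pairs before and after. The Python sketch below is an illustration under stated assumptions: the function names are hypothetical, a deletion is written as "d" following the table convention, and the ninth base of the GACTTGAA example (the A at base-pair 78) is taken from the consensus sequence of Figure 3.

```python
import re


def apply_substitutions(sequence, start, changes):
    """Apply {position: base} changes to `sequence`, whose first base is at `start`.

    A value of "d" deletes the base at that position.
    """
    bases = list(sequence.upper())
    for position, base in changes.items():
        # An empty string keeps list indices valid while removing the base.
        bases[position - start] = "" if base.lower() == "d" else base.upper()
    return "".join(bases)


def longest_at_run(sequence):
    """Length of the longest run of consecutive A or T bases (0 if there is none)."""
    return max((len(run) for run in re.findall(r"[AT]+", sequence.upper())), default=0)


# TATAGAAA occupies base-pairs 153 to 160 of the consensus sequence.
variant_157 = apply_substitutions("TATAGAAA", 153, {157: "A"})
variant_a22 = apply_substitutions("TATAGAAA", 153, {156: "T", 157: "A", 159: "C"})
# GACTTGAA occupies base-pairs 70 to 77; base-pair 78 (A) is appended so that
# the window still covers eight bases after the deletion at base-pair 71.
variant_a6 = apply_substitutions("GACTTGAAA", 70, {71: "d", 72: "T"})
```

With these inputs the A(157) change gives TATAAAAA, lengthening the longest A·T run from four to eight, while the t(156) A(157) C(159) changes give TATTAACA and the T(72) d(71) changes give GTTTGAAA, as quoted in the text.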

The base substitutions A(232) C(235) in clone α22 change the sequence TTTTCATA to TTTTAATC. These changes do not affect the position of binding or the degree of cleavage. The alteration of A(157) in the sequence TATAGAAA to TATAAAAA in clone αB3 does not change the position of binding or extent of damage. The base substitutions G(219) A(223) C(229) g(232) in clone α6 greatly change binding. The DNA sequence changes from TTTAACCTTTCTTTTC to TTCAACATTTCTCTTC. In the consensus sequence, the runs of five and three A·T base-pairs are not binding sites, whereas the run of four T residues is a binding site. The base substitutions eliminate the runs of five and four A·T base-pairs and create a run of four from the run of three. This new run of four A·T base-pairs is now the base-pair 224 binding site and not the original site at base-pair 228. The degree of cleavage is increased on one strand by twofold but not significantly on the other strand. On alteration of t(156) A(157) C(159) in clone α22, the DNA sequence changes from TATAGAAA to TATTAACA. The degree of cleavage is not altered but the binding site moves from base-pair 158 to 156 as expected. In clone α6, the base substitutions C(153) A(157) C(160) change the DNA sequence TATAGAAA to CATAAAAC. These alterations do not affect the extent of damage but the base-pair 158 binding site is moved to 157.

The binding site at base-pair 210 provides evidence of a long-distance effect of a base substitution. With clone α22, the intensity of damage is reduced by about fivefold on one strand and, on the other, below the limit of detection (probably reduced greater than 10-fold) at this site. The nearest base substitution is t(201), which is 7 bp from the binding site for clone α22 and is therefore assumed to exert its influence over this distance in the DNA. At this binding site there is an a(209) base substitution in clone α6 that does not affect the intensity of damage on either strand.

4. Discussion

The general aim of this study was to define the criteria for binding of [125I]Hoechst 33258 in terms of nucleotide sequence. Compared to earlier studies with [125I]Hoechst 33258 (Martin & Holmes, 1983) and unlabelled Hoechst 33258 (Harshman & Dervan, 1985), we have examined a much more extensive set of binding sites; however, the important feature of our approach is the comparison of closely related sequences. This has enabled two sorts of analysis. Firstly, by examining a particular binding site that is common to two or more closely related sequences (i.e. a particular binding site sequence at the same location in the 340 bp α-DNA), one can observe the influence of differences in neighbouring nucleotide sequences. Secondly, there are cases where the effects of a single base-pair difference within a particular binding site can be studied against a more or less constant background of neighbouring sequences. We have been able to study single base-pair differences that create, modify or abolish [125I]Hoechst 33258 binding.

Martin & Holmes (1983) found that four or more consecutive A·T base-pairs were necessary for strong [125I]Hoechst 33258 binding to a defined plasmid DNA sequence. In this study, we found that, for genomic and αB3 340 bp αRI-DNA, six out of nine strong hotspots conformed to their observation but the sequences AAA, AATGATT and TATAGAAA (the peak of damage was in the run of 3 As) were also found to be strong hotspots. These runs of A·T base-pairs with a single G·C base-pair interruption appear to bind the ligand when they occur in the DNA. Martin & Holmes (1983) found that the sequence TAAGAAA was a very weak hotspot.

There are seven positions in the genomic consensus sequence where a run of three A·T base-pairs occurs and is more than 2 bp from a longer run of A·T base-pairs. Of these seven positions, four were a site of significant binding. An investigation of the nearest neighbours to the three consecutive A·T base-pairs revealed that on two occasions with genomic αRI-DNA, the same sequence was present but cleaved to various extents at different positions in the sequence. The sequence GAAAC is strongly cleaved at base-pair 248, weakly cleaved at base-pair 180, but not significantly cleaved at base-pair 77. Likewise, the binding site at base-pair 189 is CTTTG and the same sequence is found at base-pair 120 but not cleaved. There were also three DNA sequences that were weak hotspots and did not contain runs of consecutive A·T base-pairs: GTGCG, CTCAC and AGAGC. The sequence TTTAA is not a site of even weak damage.
The main aim of this study was to define more rigorously the binding site for [125I]Hoechst 33258 using random base substitutions. For the base substitutions t
binding sites that had a run of three A·T base-pairs that was increased to four by base substitution, two had increased damage, one was decreased and two were the same. There were nine occasions where runs of four or more A·T base-pairs were increased to five or more; only one had increased damage, one decreased and seven were the same. So, extension of the run of consecutive A·T base-pairs at an existing binding site does not generally increase binding.

To summarize, in these experiments we discovered that the binding site for [125I]Hoechst 33258 is not simply a run of consecutive A·T base-pairs, although this is the most important parameter. Other factors are important. (1) The position in the DNA sequence seems to have considerable impact because, in the cases where an existing binding site containing a run of A·T base-pairs is reduced to two or less, significant binding can still occur. This could be due to the structure of the DNA being similar, even though the DNA sequence has been changed. The αRI-DNA clones, since they are naturally occurring in human DNA, could have a particular structure for their function in the cell. (2) Nearest-neighbour base substitutions are important. In the case of the G(59) change in clones α22 and α32, a completely different effect on extent of damage occurred. In this and other examples, nearest-neighbour base substitutions could affect binding. (3) The long-range effects of base substitutions on the extent of damage were investigated. The binding site at base-pair 210 with clone α22 revealed that a base substitution 7 bp away significantly affected the extent of damage. This change could affect the microstructure of DNA at the binding site, which in turn affects binding of a ligand (Shakked & Rabinovich, 1986).

There were 12 binding sites that moved after base alterations changed the binding site. They generally moved in the manner that would be expected; for example, toward the new centre of a run of A·T base-pairs.
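The idea that a base substitution can lengthen a run of A·T base-pairs and shift its centre can be made concrete with a small Python sketch. The helper names (`at_run_around`, `substitute`, `run_centre`) and the 8-base example sequence are hypothetical and are not taken from the αRI-DNA clones. The sketch describes only the geometry of the run; it does not model the extent of damage, which the results above show depends on other factors as well.

```python
def at_run_around(seq, pos):
    """Return (start, end), 1-based inclusive, of the A/T run containing
    position `pos`, or None if the base at `pos` is G or C."""
    s = seq.upper()
    i = pos - 1
    if s[i] not in "AT":
        return None
    lo = i
    while lo > 0 and s[lo - 1] in "AT":
        lo -= 1
    hi = i
    while hi < len(s) - 1 and s[hi + 1] in "AT":
        hi += 1
    return (lo + 1, hi + 1)


def substitute(seq, pos, base):
    """Return a copy of `seq` with the base at 1-based `pos` replaced."""
    return seq[:pos - 1] + base + seq[pos:]


def run_centre(run):
    """Midpoint of a (start, end) run, in base-pair coordinates."""
    start, end = run
    return (start + end) / 2


before = "CGAAAGCC"                    # made-up sequence with a run of three
after = substitute(before, 6, "T")     # G to T at position 6 extends the run

run_before = at_run_around(before, 4)  # (3, 5)
run_after = at_run_around(after, 4)    # (3, 6)
print(run_centre(run_before), run_centre(run_after))  # 4.0 4.5
```

In this made-up case the run grows from three to four base-pairs and its centre moves by half a base-pair toward the substituted position.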
The only difference in position of hotspots between genomic DNA and clones αB3, α82 and α32 is that the genomic hotspot at base-pair 69 is changed to two hotspots at base-pairs 65 and 73 for clones αB3, α82 and α32. This could be caused by microheterogeneity in the genomic sequence (Jorgensen et al., 1986; Murray & Martin, 1987).

A striking feature of the [125I]Hoechst 33258 damage to the 340 bp αRI-DNA sequences is the "staggered" nature of the hotspots around the binding site. The hotspots are always biased towards the 3'-labelled end of the DNA sequence (see Fig. 3). Martin & Holmes (1983) only looked at one strand around the binding site but predicted that staggered cutting would occur. A rationale for staggered cutting is as follows: during the decay of a 125I atom, approximately 22 low-energy electrons are emitted next to the DNA. This results in a series of overlapping single and double-stranded breaks. Since the 32P label, which is detected by autoradiography, is at one end, molecules that have been cleaved twice or more will appear only as the shortest molecule present. Hence the bias towards the labelled end and the staggered hotspots on both strands.

It is important to realize that an observed hotspot at a particular site could be an average result if a number of different ligand-DNA interactions occur. For example, at the site GATTTC at base-pair 109, the ligand could be bound in either or both polarities and there might be subpopulations of binding configurations for each. The averaged outcome of the different binding modes is a distribution of strand cleavage over several base-pairs, more or less centred over the ATTT, which is then subject to the bias effect discussed above.

In African green monkey cells, a specific high-mobility group nuclear protein (called α-protein) binds to α-DNA (Strauss & Varshavsky, 1984). The properties of this protein are relevant to point (1) above, where the position in the DNA sequence is important in determining a [125I]Hoechst 33258 damage site. The α-protein binds to A+T-rich regions of DNA. A similar protein is thought to exist in human cells. In human 340 bp αRI-DNA there are six potential binding sites for α-protein, and these DNA sequences are conserved in αRI-DNA clones (Jorgensen et al., 1986; Murray & Martin, 1987). These potential binding sites correspond to the [125I]Hoechst 33258 binding sites at base-pairs 32, 109, 145, 200, 280 and 315. The site at base-pair 200 is interesting because its sequence GT E)CG does not contain a run of A·T base-pairs. Since both α-protein and [125I]Hoechst 33258 bind to the sequence, it must be similar in conformation to a run of A·T base-pairs. The most obvious feature of the sequence is a run of ten alternating purine·pyrimidine base-pairs.

The structure of Hoechst 33258 complexed with the dodecamer oligonucleotide CGCGAATTCGCG has been determined to an accuracy of 2.2 Å (1 Å = 0.1 nm) using X-ray crystallography (Pjura et al., 1987).
The structure was elucidated using non-iodinated Hoechst 33258. However, the sequence specificities of the iodinated and non-iodinated Hoechst 33258 are similar (Harshman & Dervan, 1985). The X-ray structure was surprising in that, instead of binding to the four consecutive A·T base-pairs, the Hoechst 33258 bound to three A·T base-pairs and the neighbouring G·C base-pair. This happens because the piperazine end of the ligand is more bulky than the rest of the molecule and therefore prefers the slightly wider minor groove of the G·C base-pair. The X-ray data provide an explanation as to why DNA sequences with three A·T base-pairs can be binding sites.

The particular microstructure of the potential binding site is important in determining the extent of binding. The necessary parameters appear to be a 3 bp narrow minor groove of dimensions usually found for A·T base-pairs and a slightly wider minor groove at the end as usually found for a G·C base-pair. Because of structural variations, not all sequences of three A·T base-pairs followed by a G·C base-pair will have the preferred DNA microstructure and be binding sites. Other sequences could adopt the correct microstructure and be Hoechst 33258 binding sites; e.g. base-pairs 200 GT&?G, 210 CTCAC, and 237 AGAGC. The differing extents of ligand binding can be explained by how closely the DNA structure resembles the preferred DNA microstructure. At a more dynamic level the Hoechst 33258 could, on binding, induce a change in the microstructure of DNA to permit more efficient binding, although the X-ray structure reveals that Hoechst 33258 causes only small changes in DNA microstructure on binding.

Although it is not possible to explain all our observations in molecular terms, our studies indicate the complementary nature of X-ray crystallography and techniques that analyse information, albeit indirectly, in longer DNA molecules. The affinity cleavage method using [125I]Hoechst 33258 has the additional advantage that it can be used in experiments with intact cells (Murray & Martin, 1988).

This project was supported by the Australian Research Grants Scheme and Research funds from the Peter MacCallum Cancer Institute.

References

Boles, T. C. & Hogan, M. E. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 5623-5627.
Calladine, C. R. & Drew, H. R. (1984). J. Mol. Biol. 178, 773-782.
Cartwright, I. L. & Elgin, S. C. R. (1982). Nucl. Acids Res. 10, 5835-5852.
Darling, S. M., Crampton, J. M. & Williamson, R. (1982). J. Mol. Biol. 154, 51-63.
Dickerson, R. E. & Drew, H. R. (1981). J. Mol. Biol. 149, 761-786.
Dretzen, G., Bellard, M., Sassone-Corsi, P. & Chambon, P. (1981). Anal. Biochem. 112, 295-298.
Drew, H. R., Wing, R. M., Takano, T., Broka, C., Tanaka, S., Itakura, K. & Dickerson, R. E. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 2179-2183.
Harshman, K. D. & Dervan, P. B. (1985). Nucl. Acids Res. 13, 4825-4835.
Jessee, B., Gargiulo, G., Razvi, F. & Worcel, A. (1982). Nucl. Acids Res. 10, 5823-5834.
Jorgensen, A. L., Bostock, C. J. & Bak, A. L. (1986). J. Mol. Biol. 187, 185-196.
Lomonossoff, G. P., Butler, P. J. G. & Klug, A. (1981). J. Mol. Biol. 149, 745-760.
Maniatis, T., Fritsch, E. F. & Sambrook, J. (1982). Editors of Molecular Cloning: A Laboratory Manual, p. 380, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Martin, R. F. & Haseltine, W. A. (1981). Science, 213, 896-898.
Martin, R. F. & Holmes, N. (1983). Nature (London), 302, 452-454.
Martin, R. F. & Pardee, M. (1985). Int. J. Appl. Radiat. Isot. 36, 745-747.
Maxam, A. M. & Gilbert, W. (1980). Methods Enzymol. 65, 499-560.
Murray, V. & Martin, R. F. (1985a). Nucl. Acids Res. 13, 1467-1481.
Murray, V. & Martin, R. F. (1985b). Gene Anal. Techn. 2, 95-99.
Murray, V. & Martin, R. F. (1985c). J. Biol. Chem. 260, 10389-10391.
Murray, V. & Martin, R. F. (1987). Gene, 57, 255-259.
Murray, V. & Martin, R. F. (1988). J. Mol. Biol. 201, 437-442.
Pjura, P. E., Grzeskowiak, K. & Dickerson, R. E. (1987). J. Mol. Biol. 197, 251-271.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 5463-5467.
Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J. H. & Roe, B. A. (1980). J. Mol. Biol. 143, 161-178.
Shakked, Z. & Rabinovich, D. (1986). Prog. Biophys. Mol. Biol. 47, 159-195.
Strauss, F. & Varshavsky, A. (1984). Cell, 37, 889-901.
Wu, J. C. & Manuelidis, L. (1980). J. Mol. Biol. 142, 363-386.

Edited by P. von Hippel